System monitoring

dsr at tao.merseine.nu
Thu Apr 22 16:09:48 EDT 2004


On Thu, Apr 22, 2004 at 03:39:28PM -0400, Cole Tuininga wrote:
> On Thu, 2004-04-22 at 15:21, dsr at tao.merseine.nu wrote:
> > Have you evaluated mon?
> 
> Nope - do you use it?  If so, what do you like about it?

I've used two dozen systems; mon is the one I keep returning to. It may
be masochism.

The good news: mon works pretty well as a primarily SNMP-based poller.
It requests values, evaluates conditions against them, and takes
actions when those conditions are met. In practice, this comes down to:

mon: ping productionhost
 (OK, it's up and on the network)
mon: run ssh.mon productionhost
 (OK, I can ssh to it)
mon: snmp get diskspace / productionhost
 (OK, it hasn't run out of space on /)
mon: snmp get process named oracle productionhost
 (OK, oracle is running)
mon: run simple-oracle-query productionhost
 (blast, that failed)
 (if it fails again within 1 minute, send pages to...)
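As a rough illustration of that flow (this is not mon's configuration
syntax or its code), here is a minimal Python sketch. The names
first_failure and RepeatFailureAlerter are hypothetical, the probes are
stand-in callables rather than real ping/ssh/SNMP checks, and stopping
at the first failing check is an assumption made for the sketch. It only
shows the idea of ordered checks plus "page if it fails again within a
minute".

```python
from typing import Callable, Dict, List, Optional, Tuple

# A check is a (name, probe) pair; the probe returns True when healthy.
# Real probes (ping, ssh, SNMP gets, a DB query) are left out on purpose.
Check = Tuple[str, Callable[[], bool]]


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run checks in order; return the first failing name, or None."""
    for name, probe in checks:
        if not probe():
            return name
    return None


class RepeatFailureAlerter:
    """Signal a page only when a check fails twice within `window` seconds."""

    def __init__(self, window: float = 60.0) -> None:
        self.window = window
        # check name -> timestamp of its most recent failure
        self.last_failure: Dict[str, float] = {}

    def record_failure(self, name: str, now: float) -> bool:
        """Record a failure at time `now`; return True if a page is due."""
        previous = self.last_failure.get(name)
        self.last_failure[name] = now
        return previous is not None and (now - previous) <= self.window
```

In use, a polling loop would call first_failure() on each pass and hand
any failing name to record_failure() with the current time (for example
time.monotonic()), sending pages only when it returns True.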

If that's the level of functionality you want, this is a good bet. You
can add in web integration of various sorts, but basically, when mon
isn't sending alerts, everything should be good.

-dsr-


