UNIX process monitor
Tom Metro
blu at vl.com
Mon Nov 21 21:46:06 EST 2005
Over the weekend I received some unusual-looking email from one of the
monitoring tools I run on my mail server, and while investigating it I
discovered that a bunch of instances of a program I use to download
email from a Yahoo! account were stuck in endless loops and filling up
my process table (due to a data-provoked bug). (The alert email I
received had nothing directly to do with the hung processes.)
That made me think that I probably should be running a program to
monitor the process list and spit out a warning when something looks
unusual, given that this is a lightly used system, and I rarely have
occasion to look at the process list myself.
A Freshmeat.Net search did turn up a couple of tools:
Process Change Detection System
http://doornenburg.homelinux.net/scripts/pcds/
Procwatch
http://freshmeat.net/projects/procwatch/
but they don't quite do what I want.
Procwatch notes all changes (process start/stop) and outputs that data
in a format suitable for logging. If you then ran a sophisticated log
monitor tool, you could probably get it to trigger alerts only when
things looked strange. (And given that a proper security setup should
include such a log monitor anyway, maybe this is the way to go.)
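Conceptually that amounts to diffing successive snapshots of the
process table. A rough Python sketch of the idea (not procwatch's
actual code; the 60-second interval is arbitrary):

    # Rough sketch of the procwatch concept (not its actual code): diff
    # successive snapshots of the process table and report starts/stops.
    import os, time

    def snapshot():
        # Map pid -> command name, parsed from "ps -eo pid,comm".
        procs = {}
        for line in os.popen("ps -eo pid,comm").readlines()[1:]:
            pid, comm = line.split(None, 1)
            procs[int(pid)] = comm.strip()
        return procs

    prev = snapshot()
    while True:
        time.sleep(60)                     # sampling interval
        cur = snapshot()
        for pid in set(cur) - set(prev):
            print("started: %s %s" % (pid, cur[pid]))
        for pid in set(prev) - set(cur):
            print("stopped: %s %s" % (pid, prev[pid]))
        prev = cur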
PCDS is a bit closer in concept to what I envisioned. You run it once to
establish a baseline for your system. It generates a file with process
names and the count for each. You can then manually edit it to change
the numbers to ranges, if you wish (e.g. httpd 5-10).
Then on subsequent runs, it generates a report of how things differ from
the baseline. But as you might expect, this would lead to a lot of
unnecessary noise.
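To make the idea concrete, here's a rough Python paraphrase of that
kind of baseline check (not PCDS's code; the baseline.txt file name is
just for illustration):

    # My paraphrase of the PCDS idea (not its code): compare current
    # per-process counts against a baseline file of "name count" or
    # "name min-max" lines, e.g. "httpd 5-10".
    import os

    def current_counts():
        counts = {}
        for line in os.popen("ps -eo comm").readlines()[1:]:
            name = line.strip()
            counts[name] = counts.get(name, 0) + 1
        return counts

    def load_baseline(path):
        baseline = {}
        for line in open(path):
            name, spec = line.split()
            if "-" in spec:
                lo, hi = spec.split("-")
            else:
                lo = hi = spec
            baseline[name] = (int(lo), int(hi))
        return baseline

    counts = current_counts()
    baseline = load_baseline("baseline.txt")   # file name just for illustration
    for name in baseline:
        lo, hi = baseline[name]
        n = counts.get(name, 0)
        if n < lo or n > hi:
            print("%s: %d running, expected %d-%d" % (name, n, lo, hi))
    for name in set(counts) - set(baseline):
        print("%s: %d running, not in baseline" % (name, counts[name]))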
So I started down the path of coding up my own tool (borrowing ideas
from both of the scripts above) which would do things like trigger an
alert when the total number of processes changes by more than X percent
between sampling periods, or when the quantity of a single process type
(e.g. httpd) changes by more than X percent between sampling periods,
as well as checking for hard limits on the maximum number of processes
and process types.
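In rough Python terms, I was picturing checks along these lines (the
thresholds are placeholders I made up):

    # Sketch of the checks described above; the thresholds are made up.
    # Alert on a large percentage swing in the total or in any one
    # process type between samples, and on hard maximums.
    MAX_TOTAL = 300         # hard limit on total processes
    MAX_PER_TYPE = 50       # hard limit on any single process type
    PCT_THRESHOLD = 25.0    # percent change between samples worth an alert

    def pct_change(old, new):
        if old == 0:
            if new == 0:
                return 0.0
            return 100.0
        return abs(new - old) * 100.0 / old

    def check(prev_counts, cur_counts):
        # Both arguments are dicts of process name -> instance count,
        # e.g. as built by current_counts() in the earlier sketch.
        alerts = []
        prev_total = sum(prev_counts.values())
        cur_total = sum(cur_counts.values())
        if cur_total > MAX_TOTAL:
            alerts.append("total %d over limit %d" % (cur_total, MAX_TOTAL))
        if pct_change(prev_total, cur_total) > PCT_THRESHOLD:
            alerts.append("total changed %d -> %d" % (prev_total, cur_total))
        for name in set(prev_counts) | set(cur_counts):
            old = prev_counts.get(name, 0)
            new = cur_counts.get(name, 0)
            if new > MAX_PER_TYPE:
                alerts.append("%s: %d over limit %d" % (name, new, MAX_PER_TYPE))
            if pct_change(old, new) > PCT_THRESHOLD:
                alerts.append("%s changed %d -> %d" % (name, old, new))
        return alerts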
But percentage change doesn't work so hot for a small group of processes
that are all the same type. Going from 4 instances of smbd to 6 is a big
percentage change, but in reality not all that noteworthy.
It really needs to be smarter. What I'd really like is a program that
runs for a week or so in learning mode, develops a database of what is
"normal", and then sends alerts when it notices unusual behavior.
Does anyone know of a tool that does this? I'm sure there are intrusion
detection tools that incorporate this, but following the UNIX
philosophy, I'd rather use a tool that specifically addresses this need.
Alternatively, do you have thoughts on what data should be recorded by
such a tool? For example, would it be useful to track how often process
X is seen (and how many instances) during a day, and then at the end of
the day calculate an average for it? Maybe do the same for week and
month periods. Then you could say that it is "normal" for process X to
appear Y times over Z period, and thus be able to detect when things
are abnormal.
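In code, I imagine the bookkeeping looking something like this (just a
sketch; the 3-sigma test and the 10-sample minimum are arbitrary):

    # Sketch of the "learning mode" statistics: accumulate per-process
    # counts over many samples, then flag counts that fall well outside
    # the observed mean. Counts come from ps as in the earlier sketches.
    import math

    class Profile:
        def __init__(self):
            self.samples = {}   # process name -> list of observed counts

        def record(self, counts):
            # Record one sample; absent processes are recorded as 0.
            for name in set(self.samples) | set(counts):
                self.samples.setdefault(name, []).append(counts.get(name, 0))

        def unusual(self, counts, nsigma=3.0):
            alerts = []
            for name in counts:
                history = self.samples.get(name, [])
                if len(history) < 10:      # arbitrary warm-up threshold
                    alerts.append("%s: never (or rarely) seen before" % name)
                    continue
                mean = sum(history) / float(len(history))
                var = sum((x - mean) ** 2 for x in history) / len(history)
                sigma = math.sqrt(var)
                # The 0.5 floor keeps a zero-variance history from
                # alerting on every one-process wiggle.
                if abs(counts[name] - mean) > nsigma * max(sigma, 0.5):
                    alerts.append("%s: %d running, normally about %.1f"
                                  % (name, counts[name], mean))
            return alerts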
What I'd like to avoid is triggering an alert when I happen to be doing
some infrequent maintenance work, while still quickly catching that
there are too many fetchyahoo processes running (the program that had
the bug), or that there's an unexpected sshd running (possible
backdoor). (Though I don't see this tool as being focused on intrusion
detection.)
-Tom
--
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/