
BLU Discuss list archive



Technical issues on Linux



I'm following up on the three questions I posted two weeks ago:
> > 1)  The TCP connect() system call apparently never times out in current
> >   versions of Linux.  This caused mail server crashes here during the AOL
> >   outage Tuesday (hundreds of sendmail processes building up, waiting
> >   indefinitely for the AOL server to come back up).
> > 2)  Packet fragmentation problems lead to frequent complaints that the
> >   web pages can't be viewed without glitches (this includes www.bcs.org).
> > 3)  I notice syslogd taking up an inordinate amount of processor time,
> >   causing slowdowns in POP3 email access and logins (ftp, etc).  Sometimes
> >   it takes upwards of a minute for an item to get logged, if a lot of
> >   system activity is taking place, during which the syslogd process (on a
> >   P-133, mind you) is racking up 80% or more of CPU usage.

Someone commented that the Linux SIG may not be the best place to post
these technical issues.  In response, I'll point out that the worldwide
Linux user community has grown at least 1000-fold since I first started
using Linux in December 1992, and it's no longer practical to use the main
mailing lists and newsgroups to reach the 'right people' to handle
operational issues like these.  The software writers are absolutely
swamped, and can't tell the difference between an old-timer with a
legitimate issue and a newcomer who hasn't researched the issue.  Smaller,
focused forums of people who have a reason to work with one another on
diverse problems are a better place to get answers.  The BCS Linux SIG
remains a valuable resource for both newcomers and old-timers.

Regarding my issues above:

Item 1...  Don't really have more info.  The sendmail processes were filling
up IP connection slots all day.  On BSD systems, there is a 60-second timeout
on the connect() system call.

Item 2...  Packet fragmentation seemed to get better when I moved up from
2.0.0 to 2.0.12.  That rev crashed twice in a couple of days, so I moved up
to patch level 2.0.13, and the system has been stable since.  I've not
specifically tried reproducing the packet fragmentation problem myself, but
there are no longer any complaints about the issue.  This is life on the
bleeding edge.

Item 3...  About a year ago, the author of syslogd under Linux inserted an
fsync() system call after every log entry.  Thanks to my posting on this
mailing list, I was alerted to this (the response here suggested putting a
dash before the filename in syslog.conf, but that didn't work so I got to
reading source code).  If syslogd's log files get bigger than a few megabytes,
the overhead of this fsync() on a busy system causes major CPU thrashing.
For now, I'm periodically trimming the log files; those of you who plan to
run Linux in a production environment with a lot of email traffic may want
to consider rebuilding syslogd with the fsync() disabled.
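
To show where the cost comes from, here is a minimal sketch of a logger
that makes the per-entry fsync() optional.  This is not the actual syslogd
source; the function name log_line and the sync_each flag are invented for
illustration.  With sync_each set, every entry forces the file out to disk
before the call returns; with it clear, the kernel is left to flush the
data on its own schedule.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the real syslogd code: append one line to an
 * already-open log file.  fsync() is called only if sync_each is nonzero.
 * Returns 0 on success, -1 on a failed or short write or a failed fsync(). */
int log_line(int fd, const char *msg, int sync_each)
{
    size_t len = strlen(msg);

    if (write(fd, msg, len) != (ssize_t)len)
        return -1;
    if (write(fd, "\n", 1) != 1)
        return -1;

    /* This is the expensive step: force the file's data to disk now. */
    if (sync_each && fsync(fd) < 0)
        return -1;

    return 0;
}
```

The tradeoff is the usual one: skipping the fsync() is much cheaper on a
busy system, but the last few entries can be lost if the machine crashes
before the kernel writes them out.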

System wizard.pn.com is now humming along nicely, though showing signs of
wanting another RAM upgrade (from its current 96MB).

-rich




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org