Technical issues on Linux

Hsin-Yu Sidney Li LIH at cliffy.polaroid.com
Fri Aug 9 07:38:07 EDT 1996


Richard Braun <richb at pioneer.ci.net> wrote:

> 
> I'm looking for hints on fixing any of three technical problems associated
> with managing high-volume Linux servers...
> 
> 1)  The TCP connect() system call apparently never times out in current
>   versions of Linux.  This caused mail server crashes here during the AOL
>   outage Tuesday (hundreds of sendmail processes building up, waiting
>   indefinitely for the AOL server to come back up).
> 2)  Packet fragmentation problems lead to frequent complaints that the
>   web pages can't be viewed without glitches (this includes www.bcs.org).
> 3)  I notice syslogd taking up an inordinate amount of processor time,
>   causing slowdowns in POP3 email access and logins (ftp, etc).  Sometimes
>   it takes upwards of a minute for an item to get logged, if a lot of
>   system activity is taking place, during which the syslogd process (on a
>   P-133, mind you) is racking up 80% or more of CPU usage.
> 
> -rich
> 

I don't have answers to these questions, but I would like to throw in
something that might be related (and is a problem for me now).  I have
a couple of Linux boxes set up in our area, and they (and some
non-Linux PCs and Macintoshes) share an Apple LaserWriter network
(PostScript) printer.  They all worked fine up to the 1.2.13 kernel,
but then started to have problems with the 1.3.x kernels.  I have
upgraded one of them to 2.0.0, and then 2.0.11 recently, and still
have the same problem.  The system is running ELF, and I have also
upgraded libc (5.3.12), libm (5.0.6), and lpr (I tried 5.9-11 and then
5.9-12, from Debian).  The symptoms are: lpr submits a print job and
the printer starts blinking, but nothing ever comes out.  Small jobs
(not sure how small) get printed, but larger jobs (say around 1K or
4K) get stuck.  Even worse, after I remove the print job (with lprm),
the printer stays stuck, and I have to turn it off and restart it.

Netstat reports that the connection is ESTABLISHED, but Send-Q (on one
occasion) was stuck at 4189 and never changed.  I've also sometimes
noticed Recv-Q at 2 or 3.  The work-around so far has been to keep
one of the machines at kernel 1.2.13, and use it as a print server.
Curiously enough, there are no problems sending print jobs to this other
Linux machine (which in turn sends the job to the network printer).
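
For anyone who wants to watch for the same symptom without eyeballing
netstat output, here is a rough Python sketch that pulls the per-socket
queue counters out of /proc/net/tcp and lists ESTABLISHED connections
whose Send-Q is not empty.  It assumes the column layout that current
kernels use for that file (state in the 4th field, tx_queue:rx_queue in
hex in the 5th) and a little-endian machine for decoding addresses;
other kernels may differ.  The function names are made up for this
example.

```python
# Sketch only: find ESTABLISHED TCP sockets with data sitting in Send-Q.
# Assumption: /proc/net/tcp has the fields
#   sl local_address rem_address st tx_queue:rx_queue ...
# with all numbers in hexadecimal.

TCP_ESTABLISHED = 0x01  # state code for ESTABLISHED in /proc/net/tcp


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into 'a.b.c.d:port' (IPv4, little-endian host assumed)."""
    ip_hex, port_hex = field.split(":")
    # The address is printed as one 32-bit hex number, so on a
    # little-endian host the octets come out in reverse order.
    octets = [str(int(ip_hex[i:i + 2], 16)) for i in (6, 4, 2, 0)]
    return "%s:%d" % (".".join(octets), int(port_hex, 16))


def stuck_candidates(lines, min_send_q=1):
    """Return (local, remote, send_q, recv_q) tuples for ESTABLISHED
    sockets whose send queue holds at least min_send_q bytes."""
    found = []
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        state = int(fields[3], 16)
        tx_hex, rx_hex = fields[4].split(":")
        send_q, recv_q = int(tx_hex, 16), int(rx_hex, 16)
        if state == TCP_ESTABLISHED and send_q >= min_send_q:
            found.append((decode_addr(fields[1]), decode_addr(fields[2]),
                          send_q, recv_q))
    return found


def report(path="/proc/net/tcp"):
    """Print one line per candidate socket found in the given file."""
    with open(path) as f:
        for row in stuck_candidates(f.readlines()):
            print("%s -> %s  Send-Q=%d  Recv-Q=%d" % row)
```

Calling report() a few times, some seconds apart, should show whether
a given connection's Send-Q stays at the same value (as the 4189 above
did) or drains normally.  A single non-zero reading by itself does not
prove the connection is stuck.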

Another thing that happened after switching to 2.0.0 and 2.0.11 has
been that VersaTerm Pro on the Macintosh is no longer able to ftp to
my machine (although Sun workstations, SGI machines, and other Linux
boxes have no problem).

So it seems that something is broken.  This is the *third* time I've
complained about the printer problem here, as some of you might
remember.  Anyway, thanks for listening, and I hope somebody will fix
the problem soon.  I'd like to learn more about network and socket
programming to help with this problem myself, but it is hard to find
any time nowadays.

Best Regards,

Sidney Li

lih at polaroid.com


