Ping puzzle

Wed Mar 26 16:26:40 EST 2003

OK; I can see that people totally missed the point.

David Kramer writes:
| Are you sure?  On my system, ping  Destination Host Unreachable messages come
| on stdout.

In fact, it varies from system to system.  But I'm not trying to
figure out whether ping is writing to stdout or stderr.  I thought
that would be obvious from my samples:

| > $ /bin/ping -n -i 10 64.28.81.46 2>&1 | more
| > $ /bin/ping -n -i 10 64.28.81.46 2>&1 | tee ping.out
| > $ /bin/ping -n -i 10 64.28.81.46 2>&1 | cat

Note that in every one, I used 2>&1 to merge stderr with stdout.  The
point  is  that  this  is  a  subprocess  whose output is being piped
somewhere.  I want the reader to get all of ping's output, no  matter
whether  it is stdout or stderr on the current system.  In the normal
case, ping's success messages get through.  But ping's error messages
don't  get  through.   They appear on my screen, but a parent process
doesn't see them.  These pipe commands demonstrate that it's not just
my poor programming; a simple pipe also loses the error messages.

| OK, how about forgetting the output and checking $?, which will be set to 0 if
| the host was reachable, or 1 if it was not.

That won't work, because the above ping commands don't exit, which is
what I want, of course.  The idea is a parent process that wants to
know whether host foo.bar is alive.  The obvious suggestion is to
simply fire up a ping subprocess and read its output.  As long as the
child keeps reporting success, the remote system is alive.  But ping
also reports the reasons for ICMP failures, and the parent wants to
know about those, too, so that it can sensibly diagnose failures.
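
To make the intent concrete, here is a rough shell sketch of the kind
of reader I have in mind.  The names classify_ping_line and watch_host
are made up for illustration, and the patterns are only guesses at
common ping wording, which differs from system to system.

```shell
#!/bin/sh
# Sketch only: classify_ping_line and watch_host are hypothetical names,
# and the patterns below are assumptions about typical ping wording.

# Print "alive", "error", or "other" for one line of ping output.
classify_ping_line() {
    case "$1" in
        *"bytes from"*)
            echo alive ;;
        *[Uu]nreachable*|*exceeded*)
            echo error ;;
        *)
            echo other ;;
    esac
}

# Read the merged stdout+stderr of a long-running ping, line by line,
# and tag each line with its classification.
watch_host() {
    /bin/ping -n -i 10 "$1" 2>&1 |
    while IFS= read -r line; do
        printf '%s: %s\n' "$(classify_ping_line "$line")" "$line"
    done
}

# Usage: watch_host 64.28.81.46
```

The loop can only classify lines that actually reach it, which is
where the trouble starts.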

In practice this doesn't work, because the parent process doesn't see
ping's error messages.  If the remote system is responding, then the
parent does see the "normal" output.  But if the remote system isn't
responding, ping produces error messages that appear on my screen
only when I run it by hand.  When the output goes to another process,
the error messages don't arrive.

If you try the above commands with a  responsive  host,  you'll  find
that they work.  It's only with a non-responsive host that they fail.
And they only fail on some systems. (As I said, they seem to work for
all hosts on this FreeBSD system.)

The guess that they're being buffered is likely correct.  Maybe if  I
waited  long  enough,  the  parent would get those messages.  But the
buffer is apparently pretty big,  because  tests  running  10  or  20
minutes don't see the messages.  If this is the problem, is there any
way to persuade ping to not do this (assuming that it's actually ping
that's doing it)?

One bizarre thing about this conjecture is that it implies that  ping
is  using  fflush  on  its  success  messages  but not on its failure
messages.  This is the opposite of what most programs do.
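
Some background on why a pipe would matter at all: C stdio normally
chooses its default buffering from what stdout is attached to, with
line buffering for a terminal and block buffering for a pipe or file,
while stderr is normally unbuffered.  If that is what's going on, then
in a pipe only explicitly flushed output arrives promptly, and the
lost messages are probably being written to stdout.  The sketch below
only shows what a process sees on its stdout in the two cases;
stdout_kind is an illustrative name, not a standard tool.

```shell
#!/bin/sh
# stdout_kind is a hypothetical helper.  It reports whether this
# process's stdout is a terminal, which is the same question stdio
# asks (via isatty) when it picks a default buffering mode.
stdout_kind() {
    if [ -t 1 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

stdout_kind          # result depends on where the script's stdout goes
stdout_kind | cat    # stdout is a pipe here: prints "not a tty"
```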

One possibility is to find the source to ping, add some fflush calls,
and recompile it.  I've done this in the past.  But it has one major
gotcha: the rebuilt ping only works if it is installed setuid root;
otherwise the kernel refuses to allow the raw socket I/O.  So I can't
use this approach on a machine where I don't have root access.

Also, I've found that ping's source is highly non-portable.  I have a
number of ping.c files from various systems. They hardly ever compile
on a new system, and when they do compile,  they  usually  don't  run
sanely. A better approach on a new system is to figure out how to use
the native ping, which usually does work on that system. But then you
run afoul of the buffering (if that's what is really happening here).

Maybe I'm just gonna have to install expect everywhere, as a  wrapper
around ping to defeat its buffering. Then the app will be even slower
than it is now.
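
The reason expect helps is that it runs the child on a pseudo-terminal,
so the child's stdio believes it is writing to a screen.  Where the
script utility is installed, it might give the same effect with less
overhead.  This is an untested sketch: run_on_pty is a made-up name,
and the two branches assume the usual util-linux and BSD command-line
syntaxes for script.

```shell
#!/bin/sh
# Sketch only; run_on_pty is a hypothetical helper name.
# Runs a command with a pseudo-terminal as its stdout, so its stdio
# should line-buffer, and passes the output on to whatever reads us.
run_on_pty() {
    if script --version 2>/dev/null | grep -q util-linux; then
        # util-linux script takes the command as one string after -c,
        # so arguments containing spaces would need extra quoting.
        script -qfc "$*" /dev/null
    else
        # BSD-style script: transcript file first, then the command.
        script -q /dev/null "$@"
    fi
}

# Usage: run_on_pty /bin/ping -n -i 10 64.28.81.46 | tee ping.out
```

Lines that pass through a pty usually end in CR LF, so the reader may
need to strip the trailing carriage return.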

I wonder if there's a perl ping module? ;-)