[Discuss] L1-A in Linux, 10GBE woes

Tom Metro tmetro+blu at gmail.com
Fri Aug 24 17:32:07 EDT 2012


Daniel Feenberg wrote:
> ...our Linux boxes (which boot from the network) will crash if
> deprived of / for that long. Some of them have long running jobs I
> was hoping not to interrupt. So I was hoping that there might be a
> way to "pause" them while the switch got updated.

It seems like some of the mechanisms developed for laptops to suspend
the OS would be applicable here, though it would be high risk to try
that out on a production machine, unless it was your last option.

If your long-running jobs are confined to a small number of known
processes, you can try sending them a SIGSTOP signal, doing your work,
and then sending a SIGCONT to resume. Unlike suspending the whole OS,
this is less likely to run into problems with video hardware and the
like not getting reinitialized on resumption. On the other hand, you
won't be halting the kernel, so some other process may still do
something that calls upon the inaccessible network file system.
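
As a rough illustration of the SIGSTOP/SIGCONT approach, here is a
minimal Python sketch. It assumes you already know the PIDs of the jobs
and have permission to signal them; the function names are made up for
this example and are not from any existing tool.

```python
import os
import signal


def pause_processes(pids):
    """Send SIGSTOP to each PID; return the PIDs that were signalled."""
    stopped = []
    for pid in pids:
        try:
            # SIGSTOP cannot be caught or ignored by the target process.
            os.kill(pid, signal.SIGSTOP)
            stopped.append(pid)
        except ProcessLookupError:
            # The process exited before we could stop it.
            pass
    return stopped


def resume_processes(pids):
    """Send SIGCONT to each PID, skipping any that have since exited."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGCONT)
        except ProcessLookupError:
            pass
```

You would call pause_processes() with the job PIDs before the switch
update and pass the list it returns to resume_processes() afterward.
Signalling a process owned by another user raises PermissionError, which
this sketch deliberately does not hide. It also only stops the processes
you name, so the caveat above about other processes touching the network
file system still applies.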

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/


