
BLU Discuss list archive



Kernel version 2.6 -- RAID performance woes?



My experience with replacing a 100% reliable 6-year-old Linux file server has,
so far, made me consider reverting to a 1999-vintage kernel running on the
old hardware.

The thing I didn't bargain for with this upgrade was the bloated, buggy
inefficiency of 2005-vintage Linux kernel code.  Wow, is this not your
grandfather's Oldsmobile (er, Linus' terminal emulator)!  Even after running a
3-hour compile following a 15-minute 50-meg download from ftp.funet.fi, I get
weird syslog messages like the following running the 2.6.14 kernel:

Nov 15 16:33:18 cumbre kernel: hda: DMA timeout retry
Nov 15 16:33:18 cumbre kernel: hda: timeout waiting for DMA
Nov 15 16:33:18 cumbre kernel: hda: status timeout: status=0xd0 { Busy }
Nov 15 16:33:18 cumbre kernel: ide: failed opcode was: unknown
Nov 15 16:33:18 cumbre kernel: hda: no DRQ after issuing MULTWRITE_EXT
Nov 15 16:33:19 cumbre kernel: ide0: reset: success
...
Nov 15 17:18:02 cumbre kernel: BUG: soft lockup detected on CPU#0!
Nov 15 17:18:02 cumbre kernel: Pid: 1178, comm:            md0_raid1
Nov 15 17:18:02 cumbre kernel: EIP: 0060:[<cf83966f>] CPU: 0
Nov 15 17:18:02 cumbre kernel: EIP is at ide_intr+0x8f/0x120 [ide_core]

The latter one (soft lockup) has been seen only since upgrading from 2.6.13;
the former is frequently seen in syslog with both my new motherboard and the
other one I used to preconfigure the system.
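
To put numbers on how often each message shows up under a given kernel, a
rough tally of the syslog lines might look like the sketch below.  The label
names and the choice of substrings are assumptions made for illustration (the
substrings are copied from the excerpt above), and the regular expression
assumes the plain "Mon DD HH:MM:SS host kernel:" syslog layout shown there.

```python
import re
from collections import Counter

# Assumes the classic syslog layout: "Mon DD HH:MM:SS host kernel: message"
KERNEL_LINE = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+(.*)$")

# Hypothetical labels mapped to substrings taken from the log excerpt.
PATTERNS = {
    "dma_timeout": "timeout waiting for DMA",
    "soft_lockup": "soft lockup detected",
    "ide_reset": "reset: success",
}


def tally_kernel_messages(lines):
    """Count how many kernel log lines contain each known substring."""
    counts = Counter()
    for line in lines:
        match = KERNEL_LINE.match(line)
        if not match:
            continue  # not a kernel line (or not in the assumed layout)
        message = match.group(1)
        for label, needle in PATTERNS.items():
            if needle in message:
                counts[label] += 1
    return counts
```

Feeding it the lines of /var/log/messages (or wherever syslog writes on a
given box) from before and after a kernel change would give a crude
before/after comparison.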

But my main gripe about 2.6 is software RAID performance.  It's stunningly
worse than under 2.4.  On version 2.4, you see a process named raid1d that
never racks up any runtime, and a couple of related ones (bdflush,
mdrecoveryd) that have mere seconds of runtime after 3 weeks of uptime.  On
version 2.6, a process called md0_raid1 sucks up so much runtime (at nice
level minus-5) during file creation that the system is brought to its knees.
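
For anyone who wants to compare the runtime of the RAID thread across kernels,
one way is to read the thread's accumulated CPU time from /proc.  The
following is only a sketch: it assumes the /proc/<pid>/stat field layout
described in proc(5) (utime and stime as fields 14 and 15, nice as field 19),
and the function names are made up for this example.

```python
import os


def parse_stat_line(stat_line):
    """Return (comm, utime_ticks, stime_ticks, nice) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the last ')' rather than on whitespace.
    left, _, rest = stat_line.rpartition(")")
    comm = left.split("(", 1)[1]
    fields = rest.split()
    # fields[0] is the state (field 3 in proc(5)), so field N is fields[N - 3].
    utime = int(fields[11])   # field 14: user-mode clock ticks
    stime = int(fields[12])   # field 15: kernel-mode clock ticks
    nice = int(fields[16])    # field 19: nice value
    return comm, utime, stime, nice


def cpu_seconds(pid):
    """Return (comm, total CPU seconds, nice) for a running process or kernel thread."""
    with open(f"/proc/{pid}/stat") as stat_file:
        comm, utime, stime, nice = parse_stat_line(stat_file.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return comm, (utime + stime) / ticks_per_second, nice
```

Sampling cpu_seconds() for the md0_raid1 pid before and after a ripping run
would show how much CPU the thread actually consumed during that run.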

I use this server mainly for MP3 playback.  When I created the files 3 or 4
years ago, I set up four CD readers and ripped a 1000-disc collection with 4
simultaneous streams.  Today if I run a single stream of CD ripping, the
server is so sluggish that if I am playing back an MP3 file, Samba gets
CPU-starved and playback is choppy.  This is on a 667-MHz processor which is
more than ample to handle a few streams of 100-megabit file uploading.

I'm *amazed* at how Linux has evolved!

So my question is what to do about this:  is there a version 2.6 kernel with
stable software RAID performance that I could revert to, or do I have to go
all the way back to 2.4?  Where do I look online these days for technical
discussions of software RAID in the 2.6 kernel?

-rich




