
BLU Discuss list archive



Findings Re: Kernel version 2.6 -- RAID performance woes?



Thanks to all who posted or replied to me directly about my challenging 2.6
software-RAID installation.  It *appears* that I have solved my problem, and
it was pretty obscure so I'll post my findings and some other Suse-10
installation notes here for the benefit of posterity.

* When I first brought up Suse-10's 2.6 kernel on a Shuttle motherboard with
an AMD 2400+, the system would go into thermal shutdown at random intervals. 
DIAGNOSIS: a bug in the ACPI module which yields wild random numbers for
current temp.  WORKAROUND: add acpi=oldboot to the kernel's boot param in grub
menu.lst.

* As noted in my first post, I was getting DMA/IDE errors in syslog, and
sluggish performance.  A process md0_raid1 was chewing up CPU cycles whenever
I generated disk I/O.  (Google my previous post for specific error message
text.)  To assess whether DMA had been deactivated, I got a tip via email to
run the command 'hdparm /dev/hda'.  Sure enough, the using_dma parameter was
set to off.

* Given the first observation above, I (and a couple of my correspondents)
thought to try rebooting with acpi=off.  No change to the symptoms.

* Some suggested a hardware failure of some sort, but the mainboard and disks
are new, I'm using 80-conductor IDE cables taken from the previous known-good
system, and each disk is configured as master on a dedicated IDE bus.  The key
is this:  whenever I got my syslog errors, I'd get not one but two error
reports in rapid succession:  /dev/hda and /dev/hdc would both get a similar
(not necessarily identical) error.  That pretty much ruled out any physical
problem with the drives and cables.

* One thing I noticed is that my CPU was operating in dynamic frequency mode: 
if you do a cat /proc/cpuinfo, you can see it operating sometimes at 533MHz
and at other times at 667MHz.  This I apparently configured during
installation via a yast2 setting System->Power Management->AC
Powered->Powersave.  One of my beefs about yast2 is that you can modify
settings and not remember (or be able to go back in and consult a log) what
you changed or when.  Whenever I make a manual change to a file on my system,
I make a personal habit of recording the change using RCS, which creates a
complete record.  In any case, this was set to dynamic, and I changed it to
performance, which forces the CPU to operate at maximum frequency instead of
dropping during idle time.
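
For anyone who wants to watch for this without eyeballing repeated cat
commands, here is a minimal Python sketch that pulls the "cpu MHz" values out
of /proc/cpuinfo text and reports whether a series of readings varies.  The
function names and the 1 MHz tolerance are assumptions made for illustration,
and the sample text simply reuses the two speeds mentioned above; it is not a
capture from the machine.

```python
import re
from typing import List

# Illustrative text only, shaped like a fragment of /proc/cpuinfo.
SAMPLE_CPUINFO = "processor\t: 0\ncpu MHz\t\t: 533.000\n"


def cpu_mhz(cpuinfo_text: str) -> List[float]:
    """Return every 'cpu MHz' value found in /proc/cpuinfo-style text."""
    found = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(value) for value in found]


def is_scaling(readings: List[float], tolerance_mhz: float = 1.0) -> bool:
    """True if the readings differ by more than tolerance_mhz (assumed value)."""
    return bool(readings) and (max(readings) - min(readings)) > tolerance_mhz


print(cpu_mhz(SAMPLE_CPUINFO))
```

To use it on a live system, read /proc/cpuinfo every few seconds, collect the
first value from cpu_mhz() each time, and pass the collected list to
is_scaling().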

I have now been running for an hour or so since making that final change and
have not seen any IDE disk errors; both IDE buses (hardware is VIA vt8235)
stay in DMA mode, and overall system performance of this software RAID1 system
is better than the hardware RAID1 that it replaced.
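
Checking that both drives stay in DMA mode can be scripted rather than done by
hand.  The following is a rough Python sketch that reads the using_dma line
from saved hdparm output for each device.  The sample text is illustrative
only: it imitates the usual layout of 'hdparm /dev/hda' output and is not
copied from this system, and the function names are made up for the example.

```python
import re
from typing import Dict, Optional

# Illustrative sample imitating `hdparm /dev/hda` output (not a real capture).
SAMPLE_OFF = """\
/dev/hda:
 multcount    = 16 (on)
 using_dma    =  0 (off)
 readahead    = 256 (on)
"""
SAMPLE_ON = SAMPLE_OFF.replace("0 (off)", "1 (on)")


def dma_enabled(hdparm_output: str) -> Optional[bool]:
    """Return the using_dma state, or None if the line is missing."""
    match = re.search(r"^\s*using_dma\s*=\s*(\d+)", hdparm_output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) != "0"


def all_in_dma(outputs: Dict[str, str]) -> bool:
    """True only if every device's hdparm output reports using_dma on."""
    return all(dma_enabled(text) is True for text in outputs.values())


print(all_in_dma({"/dev/hda": SAMPLE_ON, "/dev/hdc": SAMPLE_OFF}))
```

In practice the dictionary would be filled from the real output of hdparm for
/dev/hda and /dev/hdc, which normally has to be run as root.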

I'll therefore stay with kernel 2.6 instead of fooling around with 2.4.31 as I
was planning to do next.  This is a big kernel to compile, and it's difficult
to diagnose problems or to figure out where to find solutions if you run into
a weird one like this.  I have been a Suse user since about release 6, and am
wondering if I should stay with this distro or not:  it seems to have laid a
few traps for me with this installation.  And of course it does everything
with yast2, a program that won't win too many awards.

The minor mysteries that I still need to resolve on this setup are:  figure
out why the ifup script fails with my gigE card (I can bring it up manually
but...); figure out a couple more Suse-Firewall logging issues, if I want to
bother with iptables at all; figure out why the kernel build didn't produce
modules for thermal and processor (which give modprobe errors at boot).

Maybe Red Hat does a better job of installation and hardware verification?

Thanks again to all who put up with/responded to my tale of woe.

-rich





BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.
