Thanks to all who posted or replied to me directly about my challenging 2.6 software-RAID installation. It *appears* that I have solved my problem, and it was pretty obscure, so I'll post my findings and some other Suse-10 installation notes here for the benefit of posterity.

* When I first brought up Suse-10's 2.6 kernel on a Shuttle motherboard with an AMD 2400+, the system would go into thermal shutdown at random intervals. DIAGNOSIS: a bug in the ACPI module which yields wild random numbers for the current temperature. WORKAROUND: add acpi=oldboot to the kernel's boot parameters in grub's menu.lst.

* As noted in my first post, I was getting DMA/IDE errors in syslog, and sluggish performance. A process, md0_raid1, was chewing up CPU cycles whenever I generated disk I/O. (Google my previous post for specific error message text.) To assess whether DMA had been deactivated, I got a tip via email to give the command 'hdparm /dev/hda'. Sure enough, the using_dma parameter was set to off.

* Given the first observation above, I (and a couple of my correspondents) thought to try rebooting with acpi=off. No change to the symptoms.

* Some suggested a hardware failure of some sort, but the mainboard and disks are new, I'm using 80-conductor IDE cables taken from the previous known-good system, and each disk is configured as master on a dedicated IDE bus. The key is this: whenever I got my syslog errors, I'd get not one but two error reports in rapid succession: /dev/hda and /dev/hdc would both get a similar (not necessarily identical) error. That pretty much ruled out any physical problem with the drives and cables.

* One thing I noticed is that my CPU was operating in dynamic frequency mode: if you do a cat /proc/cpuinfo, you can see it sometimes operating at 533MHz and at other times at 667MHz. I apparently configured this during installation via the yast2 setting System->Power Management->AC Powered->Powersave.
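For anyone who wants to keep an eye on the same two symptoms, the checks described in the notes above (is using_dma still on, and what clock speed does /proc/cpuinfo report) can be scripted. What follows is only a rough sketch, not something from the original troubleshooting: the function names are made up for illustration, and the sample text is an assumed approximation of what 'hdparm /dev/hda' and 'cat /proc/cpuinfo' print, so the patterns may need adjusting for your versions.

```python
import re

def dma_enabled(hdparm_output):
    """Return True/False from the using_dma line, or None if no such line."""
    match = re.search(r"using_dma\s*=\s*(\d+)", hdparm_output)
    if match is None:
        return None
    return match.group(1) != "0"

def cpu_mhz(cpuinfo_text):
    """Return the value of every 'cpu MHz' line as a list of floats."""
    values = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(v) for v in values]

# Illustrative sample text only (assumed layout, not captured from the real box).
SAMPLE_HDPARM = "/dev/hda:\n using_dma    =  0 (off)\n readonly     =  0 (off)\n"
SAMPLE_CPUINFO = "processor\t: 0\ncpu MHz\t\t: 533.000\n"

print("DMA on:", dma_enabled(SAMPLE_HDPARM))
print("CPU MHz:", cpu_mhz(SAMPLE_CPUINFO))
```

On a live system you would feed these functions the real text, for example the contents of /proc/cpuinfo and the captured output of hdparm run as root against each drive (/dev/hda and /dev/hdc in my case).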
One of my beefs about yast2 is that you can modify settings and not remember (or be able to go back in and consult a log of) what you changed or when. Whenever I make a manual change to a file on my system, I make a habit of recording the change using RCS, which creates a complete record. In any case, the power management scheme was set to dynamic and I changed it to performance, which forces the CPU to operate at maximum frequency instead of dropping during idle time.

I have now been running for an hour or so since making that final change and have not seen any IDE disk errors; both IDE buses (hardware is VIA vt8235) stay in DMA mode, and overall performance of this software RAID1 system is better than that of the hardware RAID1 it replaced. I'll therefore stay with kernel 2.6 instead of fooling around with 2.4.31 as I was planning to do next. This is a big kernel to compile, and it's difficult to diagnose problems or to figure out where to find solutions if you run into a weird one like this.

I have been a Suse user since about release 6, and am wondering if I should stay with this distro or not: it seems to have laid a few traps for me with this installation. And of course it does everything with yast2, a program that won't win too many awards.

The minor mysteries that I still need to resolve on this setup are:

* figure out why the ifup script fails with my gigE card (I can bring it up manually but...);

* figure out a couple more Suse-Firewall logging issues, if I want to bother with iptables at all;

* figure out why the kernel build didn't produce modules for thermal and processor (which give modprobe errors at boot).

Maybe Red Hat does a better job of installation and hardware verification?

Thanks again to all who put up with/responded to my tale of woe.

-rich