LVM/RAID recovery question

Rich Braun richb-RBmg6HWzfGThzJAekONQAQ at public.gmane.org
Wed Sep 24 22:20:48 EDT 2008


> Once I resolve this I'll try to post again.

I looked at some more man pages, googled a bit and found resolutions to this
problem.  It definitely wasn't obvious.  Here's my hypothesis:  when I
regenerated the system volume, I set it up with a different device name
/dev/md5.  At boot time, as part of the logic contained in /boot/initrd,
'vgchange -a y' is invoked; if it doesn't find the root volume on the exact
same RAID1 device that existed when the initrd was generated (by the openSUSE
or other distro's installer), then you're hosed.

To get un-hosed, it's apparently not enough to simply rename the system
volume's RAID device (which itself isn't an obvious operation).  Here's the
command to rename the array, which I found buried in a Red Hat knowledge-base
article:

 mdadm -A /dev/md[new] -m[old] --update=super-minor /dev/sda2 /dev/sdb2

In my case [new] is 1 and [old] is 5.  After invoking this, I still couldn't
boot off the system disk but when I brought up the rescue CD, the device now
showed up as /dev/md1 instead of /dev/md5.  Now came the hard part.  The first
thing I ran into is that the next step MUST be done with a rescue kernel of
the same architecture (preferably the same kernel revision) as the root volume
you're trying to recover.  In my case, the root volume contained a 64-bit
distro and my first attempt at recovery was with a 32-bit rescue kernel (which
most are).  Obviously, your rescue system needs to have the LVM and RAID tools
built in (and some don't).
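
A quick way to catch that mismatch (a sketch only; it assumes the root LV has
already been activated and mounted as in the steps below, and uses bash merely
as a convenient example binary to inspect):

   uname -m                   # architecture of the running rescue kernel
   file /mnt/system/bin/bash  # should report x86-64 for a 64-bit root volume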

The one I mentioned earlier, SystemRescueCD, does include a 'rescue64' kernel
image.  With that up and running, here are the commands I typed:

* To see which RAID volumes are running:
   cat /proc/mdstat
* To see if the LVM is running:
   lvdisplay -C
* Since the LVM *wasn't* running, I needed to type *both* of the following:
   vgscan
   vgchange -a y
* Since my /boot partition is not on the LVM and didn't come up:
   mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
* Now mount the root and boot volumes:
   mount /dev/system/volroot /mnt/system
   mount /dev/md0 /mnt/system/boot
* Now follow the directions in the mkinitrd man page:
   mount --bind /dev /mnt/system/dev
   chroot /mnt/system
   mount /proc
   mount /sys
   mkinitrd
* Just for good measure, invoke grub-install (that probably wasn't needed)
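
For reference, the grub-install step from inside the chroot looked roughly
like this; the disk names are my RAID1 members and will differ on other
setups:

   grub-install /dev/sda      # reinstall GRUB on both mirror members
   grub-install /dev/sdb      # (probably redundant, but harmless)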

At the next boot, my system was back up with the root volume on its original
physical device /dev/md1.
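
A couple of quick post-reboot checks confirm everything landed where it
should (the VG and LV names are the ones used above):

   cat /proc/mdstat           # /dev/md1 is assembled and holds the root PV
   lvdisplay -C               # the 'system' VG and its volroot LV are active
   df -h /                    # root is mounted from the volroot logical volume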

Why am I posting this here?  I've gotten into the habit of entering error
messages and commands used to fix them into the trouble-ticket system at work.
 After a year of doing that, I now have a searchable knowledge-base that my
team relies on to fix routine problems.  As wonderful as Google is, the
information it pulls up mixes too much bad with the good.  The BLU discussion
archive has a lot of useful stuff like this, so consider this one of my
contributions:  maybe it will save someone else the two hours of frustration
this cost me (gee, I missed Bush's sky-is-falling speech, so I guess it
couldn't have been *all* bad!).

-rich





