Last week I added a mirrored pair of disks to my mail server, then extended the LVM volume group to include the pair. I extended the /home partition to use about 14 GB of it, as /home was just about out of space. Everything tested fine at that point.

Sometime during the night, probably during backups, the mail server hung. It appeared to be hanging on disk I/O; I was able to telnet to the ssh and imap ports, and nagios of course didn't see any problem with it. But I couldn't actually log in via ssh or complete an imap connection. I ended up driving in to undo the volume group extension, which involved booting single-user and moving enough data off /home to make it fit completely on the old physical volume, so I could reduce its size and remove the new pair from the volume group.

One thing I noticed was in a screen(1) session on the server's serial port console. At home I had connected to it and hit Enter a few times, and there was no response at all. When I got to the office about an hour later, it showed two instances of the login prompt, as if I had hit Enter twice; a few minutes later, the third instance finally displayed.

I'm wondering what could have gone wrong, and I worry that the same thing might happen again if I add the pair back into the volume group. Any ideas what might have caused this, and how I can avoid it next time?

The server is an HP Netserver lp2000r, running CentOS 4. I've kept it up to date with yum, so it's at CentOS 4.4 now. It had a pair of 36 GB SCSI disks, each partitioned for /boot, swap, and LVM, with the /boot and LVM partitions mirrored via Linux's software RAID. I installed an additional pair of 72 GB SCSI disks, then shut down and rebooted the server so kudzu would recognize the disks.
I formatted them both identically as follows:

    Disk /dev/sdc: 73.4 GB, 73407868928 bytes
    255 heads, 63 sectors/track, 8924 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot    Start       End      Blocks   Id  System
    /dev/sdc1             1      8924   71681998+   fd  Linux raid autodetect

    Disk /dev/sdd: 73.4 GB, 73407868928 bytes
    255 heads, 63 sectors/track, 8924 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot    Start       End      Blocks   Id  System
    /dev/sdd1             1      8924   71681998+   fd  Linux raid autodetect

I used mdadm --query to verify the next free md device. I found that /dev/md0 was /boot and /dev/md1 was the existing sda3/sdb3 mirror that backed the volume group; /dev/md2 didn't exist. I created the new mirror and added it to the volume group:

    * mdadm --create /dev/md2 --level=mirror \
          --raid-devices=2 /dev/sdc1 /dev/sdd1
    * pvcreate /dev/md2
    * vgextend BrodieVG /dev/md2

and then added space to /home:

    * lvextend --size +12G /dev/BrodieVG/home
    * ext2online /dev/BrodieVG/home

which brought /home from 99% full to 68% full.

After the failure, I had to kill the server by powering it off, after which I booted single-user. I moved enough data out of /home to increase the free space to 14 GB, and then:

    * e2fsck -f /dev/BrodieVG/home
    * resize2fs -p /dev/BrodieVG/home 24G
    * lvreduce --size -14G /dev/BrodieVG/home
    * pvmove -v /dev/md2 /dev/md1
    * vgreduce BrodieVG /dev/md2

--
John Abreau
IT Manager, Zuken USA
238 Littleton Rd., Suite 100, Westford, MA 01886
T: 978-392-1777  F: 978-692-4725  M: 978-764-8934
E: John.Abreau at zuken.com
W: www.zuken.com
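One detail the commands above don't show: a freshly created RAID-1 array starts an initial resync in the background, which on a 72 GB pair can run for hours and compete with other disk I/O. Whether an array is still syncing is visible in /proc/mdstat. A minimal sketch of checking for that (it parses a canned sample excerpt here, since live output varies by system; on the server itself you'd read /proc/mdstat directly):

    # Canned /proc/mdstat excerpt for illustration only; the figures are
    # made up. On a live system, read /proc/mdstat itself.
    mdstat_sample='md2 : active raid1 sdd1[1] sdc1[0]
          71681920 blocks [2/2] [UU]
          [==>..................]  resync = 12.5% (8960240/71681920) finish=104.2min'

    # A still-resyncing array shows a "resync" progress line.
    if printf '%s\n' "$mdstat_sample" | grep -q 'resync'; then
        state=resyncing
    else
        state=idle
    fi
    echo "md2: $state"    # prints "md2: resyncing"

Waiting until the array goes idle (or checking `mdadm --detail /dev/md2` for a clean state) before running pvmove or heavy backups would at least rule out resync contention as a factor.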