How to get my Dell Ubuntu server to boot from a different drive?
April 28, 2011 6:58 PM

I have an Ubuntu server. It's a Dell PowerEdge with a PERC 6 RAID controller. I'm trying to get it to boot to a different drive, and grub doesn't seem to be cooperating - it appears to attempt to boot to the correct volume and partition and then bonks out when mounting /proc - what gives?

Here's a photo of the screen when it dies.

I'm using the UUIDs of each volume in my grub configuration, and the volume I want to boot from has separate /boot and root partitions. I've been hand-editing my various grub files: device.map, menu.lst, etc., and have what appear to be the right UUIDs in the right places: the boot partition under "uuid", and the *root* partition on the kernel options. I've also got what I think are the right entries in the fstab, and I can get into the grub shell at boot time and poke around enough to confirm that my (hd0,0) is where I think it is and that the UUIDs all appear correct. I can provide all these actual files if it would be useful.

The existing and working boot drive is a RAID5 while the new one is a single physical volume that I'd like to use temporarily while shuffling around some other drives. Once I get this working, my plan is to do it a few more times in order to shuffle all the various files onto their intended disks.
posted by migurski to Computers & Internet (21 answers total) 2 users marked this as a favorite
 
Doesn't look like a grub problem to me; looks like an initrd issue. How did you go about getting your system onto the new disk, and was update-initramfs -u -k all involved at any stage?
posted by flabdablet at 7:25 PM on April 28, 2011


Response by poster: It wasn't - I rsynced the old system onto the new drive, and then used grub and hand-editing of config files to enter all the new volume locations. I'll do some reading about initrd, thank you!
posted by migurski at 9:44 PM on April 28, 2011


Best answer: Got a little more time now and can expand a little.

If I were going to use a working Debian-derived installation to clone itself onto a new disk drive, I'd get a root shell on it and do these steps:

1. Partition the new disk with cfdisk.

2. Create empty filesystems in the new partitions with mkfs and mkswap.

3. Mount the partition that is to hold the clone's root filesystem on /mnt.

4. cp -ax / /mnt to copy the working root filesystem only to /mnt. -a recurses while preserving existing metadata, -x stops the recursion from descending into non-root filesystems like /dev and /proc and /mnt itself.

5. Mount all the clone's subsidiary filesystem partitions to appropriate spots inside /mnt (e.g. /mnt/home, /mnt/usr or whatever).

6. Use more cp -ax commands to copy each of the source system's mounted filesystems (except /mnt itself) to their corresponding locations under /mnt.

7. for d in dev proc sys; do mount --bind /$d /mnt/$d; done so that when I chroot into the clone's filesystem I'll still be working with the original system's processes, devices and whatnot.

8. chroot /mnt to start making necessary changes inside the clone.

9. Hand-edit /etc/fstab and /boot/grub/menu.lst (now that we're inside a chroot, these are the clone's copies), and change all the UUIDs to make them match the clone's partitions rather than the original's.

10. update-initramfs -u -k all - this is the step I suspect you missed. It makes sure that the version of /etc/fstab inside the clone's initial (boot-time) RAMdisk filesystem matches what will replace it once the real root partition gets mounted.

11. grub-install /dev/sdX where sdX is the device name for the whole clone disk (not any of its partitions).

12. Ctrl-D out of the chroot.

13. umount /mnt/dev/ /mnt/proc /mnt/sys /mnt and the new disk is ready to try booting from.

On preview: I think all you'll probably need are steps 7, 8, 10, 12 and 13; looks like you've already done the equivalent of the rest.
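For anyone who wants steps 7, 8, 10, 12 and 13 in one place, here is a minimal sketch. It assumes the clone's root is already mounted (on /mnt by default); the names CLONE_ROOT, DRY_RUN, run and refresh_clone_initrd are made up for illustration and are not from any tool. It defaults to a dry run that only prints the commands, and it leaves the final umount of the clone root itself to you.

```shell
#!/bin/sh
# Sketch of steps 7, 8, 10, 12 and 13 from the list above.
# CLONE_ROOT, DRY_RUN, run and refresh_clone_initrd are hypothetical names.
# With DRY_RUN=1 (the default here) commands are only printed, not executed.

CLONE_ROOT="${CLONE_ROOT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_clone_initrd() {
    root="$1"
    # Step 7: make the running system's dev, proc and sys visible in the clone.
    for d in dev proc sys; do
        run mount --bind "/$d" "$root/$d" || return 1
    done
    # Steps 8, 10 and 12: run update-initramfs inside the chroot; the chroot
    # ends as soon as that single command finishes.
    run chroot "$root" update-initramfs -u -k all
    rc=$?
    # Step 13, minus the clone root itself: undo the bind mounts in reverse.
    for d in sys proc dev; do
        run umount "$root/$d"
    done
    return "$rc"
}
```

Calling refresh_clone_initrd "$CLONE_ROOT" as a normal user shows the command list; running it as root with DRY_RUN=0 set beforehand would actually execute it, so read the printed list first.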
posted by flabdablet at 9:56 PM on April 28, 2011 [3 favorites]


Urk. Steps 4 and 5 are in the wrong order, and there's a missing step where I make stub directories under /mnt to make the places to mount the /mnt/home, /mnt/usr etc. filesystems; as they stand, my instructions will break if the clone doesn't have the same mount tree as the original.
posted by flabdablet at 10:00 PM on April 28, 2011


Response by poster: Thanks for the detail! Steps 1-6 are a close match to what I've been up to. Steps 7 & 8 are where we diverge and the part I'll need to learn more about. I'm pretty new to chroot and completely new to initrd - I'll learn more about them and try your suggestion!
posted by migurski at 10:06 PM on April 28, 2011


Best answer: When you rsync'd it probably skipped copying /proc /sys /dev because they're not normal filesystems. So your new /root is simply missing an empty directory named proc that needs to be there for the real /proc to get mounted on via fstab.
# /etc/fstab
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    nodev,noexec,nosuid 0       0
Boot to a rescue CD or your old install, mount your new root under /mnt or somewhere and make sure there's an empty proc, sys and dev directory there.
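A minimal sketch of that check, assuming the clone's root is mounted somewhere you can write to. The function name ensure_mountpoints is made up for illustration; it only creates the three empty directories if they are missing and touches nothing else.

```shell
# ensure_mountpoints CLONE_ROOT
# Create empty dev, proc and sys directories under CLONE_ROOT if missing.
# (ensure_mountpoints is a hypothetical helper name, not an existing tool.)
ensure_mountpoints() {
    root="$1"
    for d in dev proc sys; do
        if [ ! -d "$root/$d" ]; then
            echo "creating missing $root/$d"
            mkdir -p "$root/$d" || return 1
        fi
    done
}
```

With the new root mounted on /mnt, that would be called as ensure_mountpoints /mnt.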
posted by zengargoyle at 4:27 AM on April 29, 2011


from that error screen alone, zengargoyle's answer seems spot on.
posted by gjc at 6:59 AM on April 29, 2011


Is there any reason why you would need to do a file-level copy of the system rather than just using dd to do a full sector-by-sector clone? That would avoid all of the issues around making sure you copy the entire system, and it's a one-liner (dd if=/dev/sdx of=/dev/sdy)
posted by burnmp3s at 7:57 AM on April 29, 2011


Response by poster: burnmp3s: I did a bit of reading about dd, and I read some suggestions that it might not be a good fit for us. It felt brittle, and I'm not quite doing a perfect mirror but rather splitting off home and switching from a single-partition volume to multiple partitions.

zengargoyle & gjc: that seems so simple but not unlikely; I may not have created a /proc dir because I wasn't sure if the system auto-made those or what. When I try rebooting later today in the office your suggestion might be all that's needed!

flabdablet: I've done the parts of your suggestion that are possible remotely, and in the chroot environment I found this weird thing:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             1.4T   16G  1.3T   2% /
udev                  4.0G  212K  4.0G   1% /dev
none                  4.0G  212K  4.0G   1% /dev/pts
none                  4.0G  212K  4.0G   1% /dev/shm
none                  1.4T   16G  1.3T   2% /var/run
none                  1.4T   16G  1.3T   2% /var/lock
none                  1.4T   16G  1.3T   2% /lib/init/rw
/dev/sda1 is actually /dev/sdc3 as you can see from this corresponding call to df -h outside of chroot. I tried to grub-install /dev/sdc and it didn't work ("Could not find device for /boot: Not found or not a block device."); I'm a bit squeamish about doing it to /dev/sda while I'm confused about which actual volume that is.
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             448G  242G  185G  57% /
udev                  4.0G  212K  4.0G   1% /dev
none                  4.0G     0  4.0G   0% /dev/shm
none                  4.0G  116K  4.0G   1% /var/run
none                  4.0G  4.0K  4.0G   1% /var/lock
none                  4.0G     0  4.0G   0% /lib/init/rw
/dev/sdb1             1.8T  1.3T  471G  73% /mnt/donk
/dev/sdc1             259M   51M  195M  21% /mnt/april-boot
/dev/sdc3             1.4T   16G  1.3T   2% /mnt/april
Maybe making a /proc dir is all it's going to take.
posted by migurski at 8:11 AM on April 29, 2011


dd is a bad idea unless your partitions are the same size, or your destination is larger and you don't mind not using all the available space. Even cp has some problems, though I've forgotten what they are. Best bet for copying is probably tar, and explicitly choosing directories to copy.
mount /dev/sdc3 /mnt/root
mkdir /mnt/root/boot
mount /dev/sdc1 /mnt/root/boot
mkdir /mnt/root/home
mount /dev/sdc4 /mnt/root/home
cd /
tar cf - bin boot cdrom etc home lib opt root sbin src srv usr var | tar xvpf - -C /mnt/root
# then make some directories
cd /mnt/root
mkdir dev mnt proc sys tmp
# check permissions against /tmp for tmp
Then continue on. You also want to remove /etc/mtab in your clone! That's probably the source of your df confusion, it's looking at the mtab that you originally copied over. I would probably unmount boot, home from the clone before chroot and then mount them from inside the chroot.

I'd forgo messing with grub just yet and boot with the current grub, get the grub prompt and edit to the new clone. Or maybe add an entry to the existing menu.lst after exiting the chroot.

Actually the error message 'mounting /proc on /root/proc' is a bit odd, I would think it would just be 'mounting /proc on /proc', but I haven't seen that before and Ubuntu is just plain weird in places.

Are you planning on eventually swapping drives or changing BIOS boot order to boot from sda3?
posted by zengargoyle at 11:19 AM on April 29, 2011


dd is a bad idea unless your partitions are the same size, or your destination is larger and you don't mind not using all the available space.

If the destination is larger you can very easily adjust the destination partition size using something like GParted. If the destination is smaller but still large enough to hold everything, you can use GParted (probably off of a Live CD) to shrink the source file system to the correct size, clone it, then resize it back, although that is more of a pain.
posted by burnmp3s at 12:12 PM on April 29, 2011


cp is indeed deprecated in general for Unix cloning, but as far as I know the GNU cp that comes with Linux is perfectly well-behaved; there's no particular need to jump through tar or cpio hoops to work around deficiencies it simply doesn't have.

Using dd (or, better, ddrescue) is a good idea when (a) the system you're cloning is mostly full - otherwise you waste a lot of time copying empty space (b) the clone's mount tree is the same shape as the source's and (c) you'll be physically disconnecting the source drives before attempting to boot the clone - dd clones all the filesystem UUIDs along with everything else.

It probably is a bit rude trying to do grub-install from inside a chroot with a /etc/mtab that tells lies, but whatever you've done seems to have worked in any case; you're way past grub in the boot sequence when you strike your mount problem, which in my opinion is almost certainly caused by booting up with a copy of your original installation's initrd. This in turn will contain a copy of your original installation's /etc/fstab, including all the original UUIDs, which will make it impossible for init to find your clone's real root filesystem, which makes that impossible to mount, which makes all its mount points unavailable, which makes everything else impossible to mount.

Running update-initramfs from inside the chroot should copy the clone's /etc/fstab into the clone's initrd image, and as long as that fstab contains a correct reference to the clone's root filesystem's UUID, all should then be well.

When you rsync'd it probably skipped copying /proc /sys /dev because they're not normal filesystems. So your new /root is simply missing an empty directory named proc that needs to be there for the real /proc to get mounted on via fstab.

I have experimental results that show that cp -ax does not cause this trouble. If you've got a second filesystem mounted on /foo/bar and you cp -ax /foo /qux you will end up with an empty directory at /qux/foo/bar (which makes sense; -x means "stay on one filesystem" and in the original /foo filesystem there is an empty mount-point directory at /foo/bar).
posted by flabdablet at 6:38 PM on April 29, 2011 [1 favorite]


Also: if there's any doubt at all about the accuracy of /etc/mtab (which there will be, both in a chroot of a clone and when running a maintenance shell inside an initrd) then you should not rely on tools like mount and df to tell you what's mounted where. Use cat /proc/mounts instead.
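If you'd rather ask "what is really mounted at this path?" than eyeball the whole file, a small sketch like this does it. The function name mounted_device is made up; it assumes the usual /proc/mounts layout (device, mount point, type, options, dump, pass) and mount points without spaces, which the kernel would otherwise escape as \040. The optional second argument exists only so the function can be pointed at a saved copy of the file.

```shell
# mounted_device MOUNTPOINT [MOUNTS_FILE]
# Print the device mounted at MOUNTPOINT according to MOUNTS_FILE
# (default /proc/mounts). If several entries share the mount point,
# the last one listed is printed, since later mounts cover earlier ones.
# (mounted_device is a hypothetical helper name, not an existing tool.)
mounted_device() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        $2 == mp { dev = $1 }
        END { if (dev != "") print dev }
    ' "$mounts_file"
}
```

For example, mounted_device / run inside the chroot should name the clone's real root partition, whatever df claims.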
posted by flabdablet at 6:45 PM on April 29, 2011


Also also: if you get back to the screen you took the shot of and then do cat /etc/fstab I'll bet you a dollar that the fstab you will see contains all your original UUIDs.
posted by flabdablet at 6:47 PM on April 29, 2011


Response by poster: So good, thanks everyone - flabdablet and zengargoyle I think your advice was exactly right!

I ran update-initramfs and I also made sure that /proc exists ahead of time on the new system. It got well past the /proc stage, running into problems only when one of the other volumes on the system (the one where I moved /home to) needed to be fscked. I let that run for a bit and then switched back to the original drives to do a bit more maintenance. I'm going to complete the switch sometime this weekend.
posted by migurski at 7:56 PM on April 29, 2011


Response by poster: cat /proc/mounts is really valuable, btw - didn't know about it.
posted by migurski at 7:58 PM on April 29, 2011


flabdablet, I really don't think fstab has anything to do with initrd. At least not unless Ubuntu Server is different than Ubuntu Desktop and CentOS and Gentoo and just plain old generic kernel. The initrd just holds kernel modules needed to get the root filesystem mounted (esoteric fstype modules like reiser/crypt) and on the fru-fru systems a fancy graphics screen to hide the boot messages. It simply mounts root, rebinds proc and sys that it mounted in its fakeroot so it could do all the fru-fru stuff, finds a suitable /sbin/init, /bin/init, /init, /bin/sh and then pivots the filesystem over and execs init.

I looked at the initrd for my Ubuntu Desktop and that is what it does. I've mixed and matched various distros' kernel/initrd combos across different roots and even different distros and bare kernels with no modules with no problems. The initrd, if any, just needs to match the kernel for module loading.
posted by zengargoyle at 8:50 PM on April 29, 2011


It simply mounts root

How?
posted by flabdablet at 5:00 AM on April 30, 2011


So now I've found out more or less how, and my assumption that the initrd contains an /etc/fstab is indeed dead wrong. Sorry for the mislead.

For the record: the init script inside the Debian initrd finds out where to look for the real root filesystem by parsing the root= command line option passed to the kernel, which it extracts from /proc/cmdline after doing mount -t proc -o nodev,noexec,nosuid none /proc very early on.
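To make that concrete, here is a rough sketch of the same idea in plain shell. It is not the Debian init script itself; the function name root_from_cmdline is made up, and taking the last root= option when several are present is an assumption on my part.

```shell
# root_from_cmdline "KERNEL COMMAND LINE"
# Print the value of the last root= option found in the given string.
# (root_from_cmdline is a hypothetical helper, not part of any initrd.)
# The body runs in a subshell so that "set -f" (no filename globbing
# while the string is split into words) does not leak into the caller.
root_from_cmdline() (
    set -f
    root_value=""
    for arg in $1; do    # unquoted on purpose: split on whitespace
        case "$arg" in
            root=*) root_value="${arg#root=}" ;;
        esac
    done
    [ -n "$root_value" ] && printf '%s\n' "$root_value"
)
```

On a running system, root_from_cmdline "$(cat /proc/cmdline)" shows what the kernel was told to use as its root, which is a quick way to confirm that the UUID in your menu.lst kernel line is the one that was actually passed.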

However, the fact remains that I have often needed to do a chrooted update-initramfs before a cloned system would boot properly for me. I'd always assumed it was fstab. Now I'm assuming it's actually down to some difference in the modules required for startup.

The initrd images are just gzip-compressed cpio archives, so it's pretty easy to find out what your chrooted update-initramfs actually did, if you're interested (I am!). After you've exited the chroot but before unmounting /mnt, you can do

cd /tmp
mkdir initrd-orig initrd-clone
cd initrd-orig
zcat /boot/initrd.img* | cpio -i
cd ../initrd-clone
zcat /mnt/boot/initrd.img* | cpio -i
cd ..
diff -r initrd-orig initrd-clone | less

If you've got more than one initrd.img* file in /boot, put in the whole name of the one you've actually been working with (tab-completion fails for me, probably because zcat wants its argument filenames to end in .gz).
posted by flabdablet at 6:08 AM on April 30, 2011


How?
I don't think you can do this anymore, but it's easier to think about the old days when you could boot a Linux system from a single floppy disk. The kernel is built with support compiled in for a simple extfs filesystem. Parts of it are compressed and a block of code with the compressed size and the uncompressor/loader code is stuck on the front. You dd that out to a floppy and say it takes up the first 500 sectors. Then you build an extfs filesystem starting at block 501 to the end.

The computer boots, loads the first sector and executes it. The boot code uncompresses and loads the kernel into memory, then loads the root inode of the following filesystem into the kernel's / inode space and passes execution off to the loaded kernel. The kernel then looks for an init or failing that a sh in the only filesystem it knows about and presto! running Linux.

Later on in the HD age, the kernel has a couple of set offset places that hold the device information for the / device, and you have an MBR-based boot loader. You don't bother with compression. You build your kernel with the filesystem support for your / filesystem, then (ls -l /dev/sda1 => brw-rw---- 1 root disk 8, 1 2011-04-30 02:09 /dev/sda1) you patch your kernel to give it the 8,1. Then you patch the MBR loader with the raw disk block location of the kernel. The computer boots, the MBR runs, it loads the kernel into memory, the kernel looks for device 8,1 and loads the / inode from there and continues with the init hunting process.

Now with things like GRUB and initrd, grub will load the kernel, then uncompress initrd into memory and then patch the kernel with the device info for (brw-rw---- 1 root disk 1, 0 2011-04-30 02:09 /dev/ram0) IIRC. Anyway, the kernel gets the / inode of an in-memory filesystem with bunches of modules and fru-fru, and loads the init from that. The init on the initrd has access to the libraries and programs needed to probe the machine to determine which special modules to load (raid, encryption, lvm), then processes the arguments passed in by the bootloader, letting you dynamically control which / filesystem you want to use among other things. Then it does a 'pivot' to possibly remove the initrd filesystem, and chains on to the real root's init.

By now, Ubuntu's initrd has so much fluff in it that it is practically its own OS.
posted by zengargoyle at 8:48 AM on April 30, 2011


I was clearly labouring under the misapprehension that it was even fluffier still! Thanks for the clear exposition.
posted by flabdablet at 10:46 PM on April 30, 2011

