How to enable degraded software (mdadm) RAID arrays to boot under Ubuntu

Update: Please do not follow the advice in this article. I had hoped that I had found a solution but in fact the steps outlined here, while preventing the problem of systems hanging at boot during a drive failure, introduce new problems. The best advice I can give right now is to wait for the fixes for Hardy to be released and hope for no drive failures on remotely administered systems. Yes, I know that doesn’t inspire confidence in Ubuntu on the server; nor should it. End Update

I’m not entirely sure why the default behavior with regard to RAID would be to fail on startup, since the point of RAID is for a system to survive a drive failure. However, the default behavior for Ubuntu is for the system to fail to boot if a RAID array is degraded at boot time. Not only does it fail to boot, it also gives no useful information and spends several minutes before dropping the user into a recovery console. A far more useful behavior would be to warn for a few seconds that an array is degraded and then continue to boot (with an option to reverse this behavior). In any case, this is not a difficult situation to remedy, though finding the right information can be.

This bug has been discussed for some time as Bug 120375: cannot boot raid1 with only one disk. Several solutions have been posted there; I’m sure some work and some do not. The simplest one I found (which I tried to test thoroughly in a test environment) was a permutation of the solutions offered by Plnt and encbladexp.

This solution requires opening the file /etc/udev/rules.d/85-mdadm.rules as root in an editor of your choice, for example:

  • sudo nano /etc/udev/rules.d/85-mdadm.rules

Then find this text, which should be the only uncommented text:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
       RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"

Replace the --no-degraded option with the --run option like so:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
       RUN+="watershed /sbin/mdadm --assemble --scan --run"
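For anyone who would rather script the change on a test machine than edit by hand, here is a minimal sketch of the same substitution. The function name swap_no_degraded and the .bak backup suffix are my own choices for illustration and are not part of the bug report or of mdadm. It only rewrites lines that call mdadm --assemble, and it keeps a copy of the original file so the rule can be restored.

```shell
#!/bin/sh
# Sketch only: swap --no-degraded for --run in an mdadm udev rules file.
# swap_no_degraded and the .bak suffix are illustrative names.
swap_no_degraded() {
    rules_file="$1"

    # Keep a backup so the original rule can be put back later.
    cp -p "$rules_file" "$rules_file.bak" || return 1

    # Read from the backup and overwrite the original in place.
    # Only lines that invoke "mdadm --assemble" are changed.
    sed '/mdadm --assemble/ s/--no-degraded/--run/' \
        "$rules_file.bak" > "$rules_file"
}

# Example (as root, on a test system):
#   swap_no_degraded /etc/udev/rules.d/85-mdadm.rules
```

To undo the change, copy the .bak file back over the rules file.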

Then propagate the changes to every initrd with the following command:

  • sudo update-initramfs -u -k all

At that point all kernels should be bootable even if a RAID array is degraded.
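After a reboot it is worth checking which arrays actually came up degraded. The following is a rough sketch that scans /proc/mdstat for a status field such as [U_], where an underscore marks a missing member. The function name degraded_arrays is hypothetical, and the parsing assumes the usual mdstat layout in which the status line follows the array's "mdN :" line.

```shell
#!/bin/sh
# Sketch only: print the names of md arrays that have a missing member.
# degraded_arrays is an illustrative name; the optional argument lets
# the function read a saved copy instead of /proc/mdstat.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 ..."
        /^md[^ ]* :/ { name = $1 }
        # A status field such as [U_] or [_UU] has "_" for a missing member.
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}

# Example:
#   degraded_arrays
```

No output means no array reported a missing member at the time of the check.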

Maybe for RAID1, but watch out if using RAID5 + spare

I tried this workaround for the horrible mdadm bug in Hardy. Every reboot after applying it yielded a stopped md array because a random number of members were missing from my /dev/md1.
It may work well with RAID1, but it messed up my 4-member RAID5 array. I had a spare drive which, after I rebooted two or three times, kicked in and started to rebuild the array.

I decided to scrap everything and install Intrepid Ibex, as this server was not in production. As mentioned in bug #120375 (https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/120375), it has been resolved in Intrepid but not yet backported to Hardy, which I hope will happen because this is a real deal breaker for a server LTS release. I've been using the exact same setup on Debian Etch for more than 6 months, and it has worked right from day one (booting from a degraded RAID5 array).

Thanks for posting this on your website, and keep up your great articles about virtualisation under Ubuntu. I'm still dreaming of being able to easily use Linux-VServer and OpenVZ with Hardy, but I feel I'll have to stick with Debian Etch for this purpose.

Thanks for reminding me

I've been meaning to update this article for a while because it is bad advice. It does work OK for RAID 1, but only OK, because as with other RAID setups it randomly drops drives from the array on reboot. I'm disappointed not only with the failure to fix this bug but also that no easy workarounds were ever found. I thought this was one, only to find out that it introduces its own problems. While I'm glad to see effort going into fixing this, and even into improvements such as being able to grub-install on md devices, I still worry that this work isn't being overseen by anyone with sufficient sysadmin experience. I have yet to hear a good reason why the default behavior for Ubuntu should be different from Debian and every other Linux distro.

Creative Commons License Except where otherwise noted, content on this site is licensed under a Creative Commons by-nc-sa 3.0 License