It can happen at any time: suddenly you get the message "DegradedArray event on /dev/md1".
What now?
In most cases, the message indicates a failed disk in a RAID array, and it is time to identify the affected disk and take appropriate action.
Here is an example of the message you may get when a RAID disk has failed:
This is an automatically generated mail message from mdadm
running on server1
A DegradedArray event had been detected on md device /dev/md/1.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid0] [raid1] [linear] [multipath] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
1023936 blocks super 1.0 [2/2] [UU]
md1 : active raid1 sdb3[2](F) sda3[0]
1948488512 blocks super 1.0 [2/1] [U_]
unused devices: <none>
From the message you can see which RAID array is affected; here it is /dev/md/1.
Now let's take a closer look at the content of the md1 block.
md1 is active and uses RAID1. The array is built from sdb3 and sda3.
[U_] means that one of the drives is not in sync. In this case sdb3 is the failed one, because it is marked with (F).
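If you just want to spot failed devices quickly, you can also filter /proc/mdstat for the (F) marker or check the state lines of the array. A small sketch; the device and array names are taken from this example and may differ on your system:
root@server:~# grep '(F)' /proc/mdstat
root@server:~# mdadm --detail /dev/md1 | grep -E 'State :|Failed Devices'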
If you like, you can get more details about the degraded array with this command:
root@server:~# mdadm --detail /dev/md1
/dev/md1:
Version : 1.0
Creation Time : Tue Aug 5 11:11:40 2016
Raid Level : raid1
Array Size : 1948488512 (1858.22 GiB 1995.25 GB)
Used Dev Size : 1948488512 (1858.22 GiB 1995.25 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Thu Aug 11 10:20:10 2016
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Name : h2328146:1
UUID : 90e3e3fa:9ef283af:0fd043f0:ee01cc22
Events : 558815
Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 0 0 1 removed
2 8 19 - faulty spare /dev/sdb3
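In this example the kernel has already marked sdb3 as faulty, as shown by the (F) in /proc/mdstat and the "faulty" state above. If a drive is misbehaving but has not yet been marked as failed, it can be set faulty manually before the removal. A short sketch, assuming the same device names as in this example:
root@server:~# mdadm --manage /dev/md1 --fail /dev/sdb3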
Regardless of whether the disk sdb3 is to be replaced or simply re-enabled, it must first be removed from the array md1.
For this, the following command is executed:
root@server:~# mdadm --remove /dev/md1 /dev/sdb3
mdadm: hot removed /dev/sdb3 from /dev/md1
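Before pulling the disk out of the server, it helps to identify the physical drive, for example by its serial number. One possible way, assuming the smartmontools package is installed:
root@server:~# smartctl -i /dev/sdb
The serial number shown there can then be matched against the label on the drive.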
Now the failed disk can be replaced, or the existing one can simply be re-enabled. If a brand-new disk is installed, it first needs the same partition layout as the remaining disk; see the sketch below.
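A minimal sketch for copying the partition layout from the healthy disk sda to the new disk sdb. This assumes classic MBR partition tables, as used in this example; for GPT disks the sgdisk tool offers a comparable replicate function. Double-check the device names, as the command overwrites the partition table of the second disk:
root@server:~# sfdisk -d /dev/sda | sfdisk /dev/sdb
Whether the disk was replaced or only re-enabled, the partition is then re-added to the array with the following command.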
root@server:~# mdadm --add /dev/md1 /dev/sdb3
mdadm: re-added /dev/sdb3
After the disk has been re-added, the synchronization starts immediately. You can watch the status of the synchronization by executing the following command:
root@server:~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb3[2] sda3[0]
1948488512 blocks [2/1] [_U]
[>....................] recovery = 0.1% (2849024/1948488512) finish=455.9min speed=127898K/sec
md0 : active raid1 sdb1[1] sda1[0]
1023936 blocks super 1.0 [2/2] [UU]
unused devices: <none>
You can see the synchronization progress and an estimate of how long it will take to finish.
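If you want to keep an eye on the rebuild, you can refresh the output periodically with watch. The rebuild speed is also limited by the kernel; raising the minimum limit can speed up the recovery at the cost of normal I/O performance. The value below is only an example:
root@server:~# watch -n 10 cat /proc/mdstat
root@server:~# sysctl -w dev.raid.speed_limit_min=50000
Once the recovery has finished, mdadm --detail /dev/md1 should report the array state as clean again.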