It can happen at any time: suddenly you get the message "DegradedArray event on /dev/md1".
What now?
In most cases, the message indicates a failed disk in a RAID array. The next step is to identify the affected disk and take appropriate action.
Here is an example of the mail you might receive when a RAID disk has failed:
    This is an automatically generated mail message from mdadm
    running on server1

    A DegradedArray event had been detected on md device /dev/md/1.

    Faithfully yours, etc.

    P.S. The /proc/mdstat file currently contains the following:

    Personalities : [raid0] [raid1] [linear] [multipath] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid1 sdb1[1] sda1[0]
          1023936 blocks super 1.0 [2/2] [UU]

    md1 : active raid1 sdb3[2](F) sda3[0]
          1948488512 blocks super 1.0 [2/1] [U_]

    unused devices: <none>
The message tells you which RAID array is affected. Here it is /dev/md/1.
Now take a closer look at the md1 block.
md1 is active and uses RAID level 1 (raid1). The array is built from the partitions sdb3 and sda3.
[2/1] means that only one of the two expected members is active. In [U_], each U stands for a working member and the underscore for a missing or failed one. Here sdb3 is the failed member, because it is marked with (F).
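The check described above can also be scripted. The following is a minimal sketch, not part of mdadm: the function name list_failed_members is made up for this example, and it assumes that array lines in /proc/mdstat look like the ones shown in the mail.

```shell
#!/bin/sh
# list_failed_members (hypothetical helper): reads /proc/mdstat-style text
# on stdin and prints "<array> <member>" for every member flagged (F).
list_failed_members() {
    awk '
        # Array lines look like: md1 : active raid1 sdb3[2](F) sda3[0]
        $2 == ":" && $1 ~ /^md/ {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    member = $i
                    # Strip the slot number and the flag, e.g. "[2](F)".
                    sub(/\[[0-9]+\]\(F\)$/, "", member)
                    print $1, member
                }
            }
        }
    '
}

# On a real system:
#   list_failed_members < /proc/mdstat
```

For the mdstat content from the mail, this prints "md1 sdb3".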
For more details about the degraded array, use this command:
    root@server:~# mdadm --detail /dev/md1
    /dev/md1:
            Version : 1.0
      Creation Time : Tue Aug 5 11:11:40 2016
         Raid Level : raid1
         Array Size : 1948488512 (1858.22 GiB 1995.25 GB)
      Used Dev Size : 1948488512 (1858.22 GiB 1995.25 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent

        Update Time : Thu Aug 11 10:20:10 2016
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0

               Name : h2328146:1
               UUID : 90e3e3fa:9ef283af:0fd043f0:ee01cc22
             Events : 558815

        Number   Major   Minor   RaidDevice State
           0       8        3        0      active sync   /dev/sda3
           1       0        0        1      removed

           2       8       19        -      faulty spare   /dev/sdb3
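If you want to check the State line from a script, a sketch like the following may help. The helper names array_state and is_degraded are invented for this example, and the parsing assumes the "Label : value" layout shown in the output above.

```shell
#!/bin/sh
# array_state (hypothetical helper): reads `mdadm --detail` output on stdin
# and prints the value of the "State :" line.
array_state() {
    awk -F ' : ' '$1 ~ /^[ \t]*State$/ { print $2 }'
}

# is_degraded (hypothetical helper): succeeds if that state mentions "degraded".
is_degraded() {
    case "$(array_state)" in
        *degraded*) return 0 ;;
        *)          return 1 ;;
    esac
}

# On a real system (as root):
#   mdadm --detail /dev/md1 | is_degraded && echo "md1 is degraded"
```

The mdadm man page also describes a --test option for --detail that reports the array status through the exit code, which may be a simpler choice where it is available.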
Whether sdb3 is going to be replaced or simply re-enabled, it must first be removed from the array md1.
To do this, run the following command:
    root@server:~# mdadm --remove /dev/md1 /dev/sdb3
    mdadm: hot removed /dev/sdb3 from /dev/md1
Now the disk can be replaced. A new disk first needs a partition at least as large as the old sdb3. The partition is then added back to the array with the following command:
    root@server:~# mdadm --add /dev/md1 /dev/sdb3
    mdadm: re-added /dev/sdb3
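The remove and add steps above can be collected into a small dry-run helper that only prints the commands, so they can be reviewed before anything is executed. This is a sketch: the function name print_swap_commands is made up, and the extra --fail step is an addition to the procedure above. It is only needed when the member is not already marked (F), because a member that is still active cannot be removed.

```shell
#!/bin/sh
# print_swap_commands ARRAY MEMBER (hypothetical helper)
# Dry run: prints the mdadm commands for swapping one member of an array.
# Nothing is executed.
print_swap_commands() {
    array=$1
    member=$2
    if [ -z "$array" ] || [ -z "$member" ]; then
        echo "usage: print_swap_commands ARRAY MEMBER" >&2
        return 1
    fi
    # Only needed when the member is not already flagged (F).
    echo "mdadm --fail $array $member"
    echo "mdadm --remove $array $member"
    # Run this one after the disk has been replaced or re-enabled.
    echo "mdadm --add $array $member"
}

print_swap_commands /dev/md1 /dev/sdb3
```

Printing instead of executing is a deliberate choice here, because removing the wrong member from a degraded RAID 1 array would take the array offline.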
After the disk is re-added, synchronization starts immediately. You can watch its progress with the following command:
    root@server:~# cat /proc/mdstat
    Personalities : [raid1]
    md1 : active raid1 sdb3[2] sda3[0]
          1948488512 blocks [2/1] [_U]
          [>....................]  recovery =  0.1% (2849024/1948488512) finish=455.9min speed=127898K/sec

    md0 : active raid1 sdb1[1] sda1[0]
          1023936 blocks super 1.0 [2/2] [UU]

    unused devices: <none>
The recovery line shows how far the synchronization has progressed (0.1% here), the estimated time remaining (finish=455.9min) and the current speed.
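To pull just the progress figures out of /proc/mdstat, for example for a status mail or a monitoring check, something like the following sketch could be used. The function name recovery_status is invented for this example, and the parsing assumes the recovery line has the layout shown above.

```shell
#!/bin/sh
# recovery_status (hypothetical helper): reads /proc/mdstat-style text on
# stdin and prints "<array> <percent> <time left>" for each rebuilding array.
recovery_status() {
    awk '
        # Remember the array that the following detail lines belong to.
        $2 == ":" && $1 ~ /^md/ { array = $1 }
        /recovery =/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) { eta = $i; sub(/^finish=/, "", eta) }
            }
            print array, pct, eta
        }
    '
}

# On a real system:
#   recovery_status < /proc/mdstat
```

For the output shown above, this prints "md1 0.1% 455.9min". On most Linux systems you can also wrap the plain command in watch, for example watch -n 5 cat /proc/mdstat, to refresh the view every five seconds.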