This includes the mdadm command, scripts and tools for integrating RAID arrays into the rest of the system, and the monitoring system.
the sdb disk (4 GB) is entirely available;
the sdc disk (4 GB) is entirely available;
the sdd disk has only the sdd2 partition (about 4 GB) available;
the sde disk (4 GB) is entirely available.
# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
# mdadm --query /dev/md0
/dev/md0: 7.99GiB raid0 2 devices, 0 spares. Use mdadm --detail for more detail.
# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Mon Feb 28 01:54:24 2022
        Raid Level : raid0
        Array Size : 8378368 (7.99 GiB 8.58 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent
       Update Time : Mon Feb 28 01:54:24 2022
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0
            Layout : -unknown-
        Chunk Size : 512K
Consistency Policy : none
              Name : debian:0  (local to host debian)
              UUID : a75ac628:b384c441:157137ac:c04cd98c
            Events : 0
    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sdb
       1       8       16        1      active sync   /dev/sdc
# mkfs.ext4 /dev/md0
mke2fs 1.46.2 (28-Feb-2021)
Discarding device blocks: done
Creating filesystem with 2094592 4k blocks and 524288 inodes
Filesystem UUID: ef077204-c477-4430-bf01-52288237bea0
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
# mkdir /srv/raid-0
# mount /dev/md0 /srv/raid-0
# df -h /srv/raid-0
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        7.8G   24K  7.4G   1% /srv/raid-0
The mdadm --create command requires several parameters: the name of the volume to create (/dev/md*, with MD standing for Multiple Device), the RAID level, the number of disks (which is compulsory despite being mostly meaningful only with RAID-1 and above), and the physical drives to use. Once the device is created, we can use it like a normal partition: create a filesystem on it, mount that filesystem, and so on. Note that creating our RAID-0 volume on md0 is nothing but a coincidence; the numbering of the array does not need to be correlated with the chosen amount of redundancy. It is also possible to create named RAID arrays, by giving mdadm parameters such as /dev/md/linear instead of /dev/md0.
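As a rough illustration of how these parameters fit together, the sketch below only prints an mdadm --create command line; it never runs it. The md_create_cmd function is a hypothetical helper invented for this example, not part of mdadm, and it derives --raid-devices from the number of devices listed.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of mdadm): print the command line
# that would create an array, without touching any device.
# Usage: md_create_cmd ARRAY LEVEL DEVICE...
md_create_cmd() {
    array=$1
    level=$2
    shift 2
    # --raid-devices is derived from the number of member devices given.
    echo "mdadm --create $array --level=$level --raid-devices=$# $*"
}

# The RAID-0 array from the text.
md_create_cmd /dev/md0 0 /dev/sdb /dev/sdc
# A named array; the same member devices are reused for illustration only.
md_create_cmd /dev/md/linear linear /dev/sdb /dev/sdc
```

Printing the command first makes it easy to review the device list before running anything as root.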
# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdd2 /dev/sde
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: largest drive (/dev/sdc2) exceeds size (4189184K) by more than 1%
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
# mdadm --query /dev/md1
/dev/md1: 4.00GiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Mon Feb 28 02:07:48 2022
        Raid Level : raid1
        Array Size : 4189184 (4.00 GiB 4.29 GB)
     Used Dev Size : 4189184 (4.00 GiB 4.29 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent
       Update Time : Mon Feb 28 02:08:09 2022
             State : clean, resync
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0
Consistency Policy : resync
    Rebuild Status : 13% complete
              Name : debian:1  (local to host debian)
              UUID : 2dfb7fd5:e09e0527:0b5a905a:8334adb8
            Events : 17
    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdd2
       1       8       48        1      active sync   /dev/sde
# mdadm --detail /dev/md1
/dev/md1:
[...]
             State : clean
[...]
mdadm points out that the physical devices have different sizes; since this means part of the larger device will be unusable, it asks for confirmation.
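The reason for the warning can be sketched with a little arithmetic: a mirror can hold no more than its smallest member. The mirror_size function below is a hypothetical helper; 4189184 is the size (in KiB) quoted in the warning, while 5242880 is a made-up size for a larger member, chosen purely for illustration.

```shell
#!/bin/sh
# Sketch: the usable size of a mirror is bounded by its smallest member.
# Usage: mirror_size SIZE_KIB...
mirror_size() {
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}

# 4189184 comes from the warning; 5242880 is an assumed larger member.
mirror_size 4189184 5242880
```

Whatever the larger member offers beyond that figure is simply left unused by the array.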
Note that /dev/md1 can be used: a filesystem can be created on it, and data can be copied onto it.
The --fail option of mdadm makes it possible to simulate a disk failure:
# mdadm /dev/md1 --fail /dev/sde
mdadm: set /dev/sde faulty in /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Mon Feb 28 02:07:48 2022
        Raid Level : raid1
        Array Size : 4189184 (4.00 GiB 4.29 GB)
     Used Dev Size : 4189184 (4.00 GiB 4.29 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent
       Update Time : Mon Feb 28 02:15:34 2022
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0
Consistency Policy : resync
              Name : debian:1  (local to host debian)
              UUID : 2dfb7fd5:e09e0527:0b5a905a:8334adb8
            Events : 19
    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdd2
       -       0        0        1      removed
       1       8       48        -      faulty   /dev/sde
Should the sdd disk fail as well, the data would be lost. To avoid that risk, we replace the failed disk with a new one, sdf:
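Since the degraded condition shows up on the State line of the mdadm --detail output, a script can watch for it. The following is a minimal sketch, assuming the output is piped in on standard input; md_state and md_is_degraded are hypothetical helper names, not mdadm features.

```shell
#!/bin/sh
# Sketch: extract the "State :" field from `mdadm --detail` output on stdin.
# The device table header also contains the word "State", but without the
# " :" separator, so it is not matched.
md_state() {
    sed -n 's/^ *State : *//p' | sed 's/ *$//'
}

# Succeeds when the state read on stdin mentions "degraded".
md_is_degraded() {
    md_state | grep -q degraded
}

# Intended use on a real system (requires root, hence left as a comment):
#   mdadm --detail /dev/md1 | md_is_degraded && echo "md1 needs attention"
```

The monitoring system shipped with mdadm is the more robust way to be alerted; this only shows where the information lives.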
# mdadm /dev/md1 --add /dev/sdf
mdadm: added /dev/sdf
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Mon Feb 28 02:07:48 2022
        Raid Level : raid1
        Array Size : 4189184 (4.00 GiB 4.29 GB)
     Used Dev Size : 4189184 (4.00 GiB 4.29 GB)
      Raid Devices : 2
     Total Devices : 3
       Persistence : Superblock is persistent
       Update Time : Mon Feb 28 02:25:34 2022
             State : clean, degraded, recovering
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 1
     Spare Devices : 1
Consistency Policy : resync
    Rebuild Status : 47% complete
              Name : debian:1  (local to host debian)
              UUID : 2dfb7fd5:e09e0527:0b5a905a:8334adb8
            Events : 39
    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdd2
       2       8       64        1      spare rebuilding   /dev/sdf
       1       8       48        -      faulty   /dev/sde
# [...]
[...]
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Mon Feb 28 02:07:48 2022
        Raid Level : raid1
        Array Size : 4189184 (4.00 GiB 4.29 GB)
     Used Dev Size : 4189184 (4.00 GiB 4.29 GB)
      Raid Devices : 2
     Total Devices : 3
       Persistence : Superblock is persistent
       Update Time : Mon Feb 28 02:25:34 2022
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 1
     Spare Devices : 0
Consistency Policy : resync
              Name : debian:1  (local to host debian)
              UUID : 2dfb7fd5:e09e0527:0b5a905a:8334adb8
            Events : 41
    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdd2
       2       8       64        1      active sync   /dev/sdf
       1       8       48        -      faulty   /dev/sde
The system can then be told that the sde disk is to be removed from the array, which leaves a classic RAID mirror on two disks:
# mdadm /dev/md1 --remove /dev/sde
mdadm: hot removed /dev/sde from /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdd2
       2       8       64        1      active sync   /dev/sdf
If the sde disk failure had been real (instead of simulated) and the system had been restarted without removing this sde disk, this disk could start working again due to having been probed during the reboot. The kernel would then have three physical elements, each claiming to contain half of the same RAID volume. In practice, this leads to the RAID starting from the individual disks alternately, and distributing the data alternately as well, depending on which disk started the RAID in degraded mode.

Another source of confusion can arise when RAID volumes from two servers are consolidated onto a single server. If these arrays were running normally before the disks were moved, the kernel would be able to detect and reassemble the pairs properly; but if the moved disks had been aggregated into an md1 on the old server, and the new server already has an md1, one of the mirrors would be renamed.
The way to do this is to edit the /etc/mdadm/mdadm.conf file. An example follows.
Example 12.1. mdadm configuration file
# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE /dev/sd*

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=a75ac628:b384c441:157137ac:c04cd98c name=debian:0
ARRAY /dev/md/1 metadata=1.2 UUID=2dfb7fd5:e09e0527:0b5a905a:8334adb8 name=debian:1
# This configuration was auto-generated on Mon, 28 Feb 2022 01:53:48 +0100 by mkconf
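There is the DEVICE option, which lists the devices where the system automatically looks for components of RAID volumes at start-up time. In the example above, we replaced the default value, partitions containers, with an explicit list of device files, because we chose to use entire disks, and not only partitions, as volume members.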
[...] is not sufficient for that purpose (confirming that it matches the /dev/md* device name).
# mdadm --misc --detail --brief /dev/md?
ARRAY /dev/md/0 metadata=1.2 UUID=a75ac628:b384c441:157137ac:c04cd98c name=debian:0
ARRAY /dev/md/1 metadata=1.2 UUID=2dfb7fd5:e09e0527:0b5a905a:8334adb8 name=debian:1
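One possible way to carry such generated lines into the configuration file is sketched below. It assumes the output of the command above has been saved to a file; refresh_arrays is a hypothetical helper that only prints the merged result, so the real mdadm.conf is never modified by the sketch itself.

```shell
#!/bin/sh
# Sketch: print CONF with its ARRAY lines replaced by those found in SCAN.
# Usage: refresh_arrays CONF SCAN
#   CONF: a copy of mdadm.conf
#   SCAN: saved output of `mdadm --misc --detail --brief /dev/md?`
refresh_arrays() {
    # Keep everything except the old array definitions...
    grep -v '^ARRAY ' "$1"
    # ...then append the freshly generated ones.
    grep '^ARRAY ' "$2"
}

# Intended use (file names are assumptions, left as a comment):
#   refresh_arrays /etc/mdadm/mdadm.conf scan.txt > mdadm.conf.new
```

After reviewing and installing the new file, update-initramfs -u should be run, as the comment at the top of the configuration file requests.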
/dev hierarchy, so there is no risk of using them directly.
LVs appear in /dev and can be handled like any other physical partition (most commonly, to hold a filesystem or swap space).
on the sdb disk, the sdb2 partition (4 GB);
on the sdc disk, the sdc3 partition (3 GB);
the sdd disk (4 GB) is entirely available;
on the sdf disk, the sdf1 partition (4 GB) and the sdf2 partition (5 GB).
Let's assume that sdb and sdf are faster than the other two disks.
We create the PVs with pvcreate:
# pvcreate /dev/sdb2
  Physical volume "/dev/sdb2" successfully created.
# pvdisplay
  "/dev/sdb2" is a new physical volume of "4.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/sdb2
  VG Name
  PV Size               4.00 GiB
  Allocatable           NO
  PE Size               0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               yK0K6K-clbc-wt6e-qk9o-aUh9-oQqC-k1T71B
# for i in sdc3 sdd sdf1 sdf2 ; do pvcreate /dev/$i ; done
  Physical volume "/dev/sdc3" successfully created.
  Physical volume "/dev/sdd" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
# pvdisplay -C
  PV         VG Fmt  Attr PSize PFree
  /dev/sdb2     lvm2 ---  4.00g 4.00g
  /dev/sdc3     lvm2 ---  3.00g 3.00g
  /dev/sdd      lvm2 ---  4.00g 4.00g
  /dev/sdf1     lvm2 ---  4.00g 4.00g
  /dev/sdf2     lvm2 ---  5.00g 5.00g
The pvdisplay command lists the existing PVs, with two possible output formats.
Let's now assemble these PVs into VGs using vgcreate. We gather only PVs from the fast disks into a vg_critical VG; the other VG, vg_normal, also includes PVs from the slower disks.
# vgcreate vg_critical /dev/sdb2 /dev/sdf1
  Volume group "vg_critical" successfully created
# vgdisplay
  --- Volume group ---
  VG Name               vg_critical
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               7.99 GiB
  PE Size               4.00 MiB
  Total PE              2046
  Alloc PE / Size       0 / 0
  Free PE / Size        2046 / 7.99 GiB
  VG UUID               JgFWU3-emKg-9QA1-stPj-FkGX-mGFb-4kzy1G
# vgcreate vg_normal /dev/sdc3 /dev/sdd /dev/sdf2
  Volume group "vg_normal" successfully created
# vgdisplay -C
  VG          #PV #LV #SN Attr   VSize   VFree
  vg_critical   2   0   0 wz--n-   7.99g   7.99g
  vg_normal     3   0   0 wz--n- <11.99g <11.99g
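Here again, the commands are rather straightforward (and vgdisplay offers two output formats). Note that it is quite possible to put two PVs located on the same physical disk into two different VGs. Note also that we used a vg_ prefix to name our VGs, but this is nothing more than a convention.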
lvcreate command, and a slightly more complex syntax:
# lvdisplay
# lvcreate -n lv_files -L 5G vg_critical
  Logical volume "lv_files" created.
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg_critical/lv_files
  LV Name                lv_files
  VG Name                vg_critical
  LV UUID                Nr62xe-Zu7d-0u3z-Yyyp-7Cj1-Ej2t-gw04Xd
  LV Write Access        read/write
  LV Creation host, time debian, 2022-03-01 00:17:46 +0100
  LV Status              available
  # open                 0
  LV Size                5.00 GiB
  Current LE             1280
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
# lvcreate -n lv_base -L 1G vg_critical
  Logical volume "lv_base" created.
# lvcreate -n lv_backups -L 11.98G vg_normal
  Rounding up size to full physical extent 11.98 GiB
  Rounding up size to full physical extent 11.98 GiB
  Logical volume "lv_backups" created.
# lvdisplay -C
  LV         VG          Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_base    vg_critical -wi-a-----  1.00g
  lv_files   vg_critical -wi-a-----  5.00g
  lv_backups vg_normal   -wi-a----- 11.98g
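The parameters are passed to lvcreate as options. The name of the LV to create is given with the -n option, and its size with the -L option. The command also needs to be told which VG to operate on, which is of course the last parameter on the command line.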
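The size given with -L is allocated in whole extents, which is why lvcreate may report that it is rounding a size up. The arithmetic can be sketched as follows; extents_for is a hypothetical helper, and the 4 MiB extent size is the PE Size reported by vgdisplay for vg_critical.

```shell
#!/bin/sh
# Sketch: number of extents needed for a size, rounding up to a full extent.
# Usage: extents_for SIZE_MIB PE_SIZE_MIB
extents_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

extents_for 5120 4    # 5 GiB with 4 MiB extents
extents_for 1025 4    # not a multiple of 4 MiB, so it is rounded up
```

The first call prints 1280, the Current LE value that lvdisplay reports for lv_files; the second prints 257, one extent more than an exact 1 GiB would need.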
Once created, the LVs appear in /dev/mapper/:
# ls -l /dev/mapper
total 0
crw------- 1 root root 10, 236 Mar  1 00:17 control
lrwxrwxrwx 1 root root       7 Mar  1 00:19 vg_critical-lv_base -> ../dm-1
lrwxrwxrwx 1 root root       7 Mar  1 00:17 vg_critical-lv_files -> ../dm-0
lrwxrwxrwx 1 root root       7 Mar  1 00:19 vg_normal-lv_backups -> ../dm-2
# ls -l /dev/dm-*
brw-rw---- 1 root disk 253, 0 Mar  1 00:17 /dev/dm-0
brw-rw---- 1 root disk 253, 1 Mar  1 00:19 /dev/dm-1
brw-rw---- 1 root disk 253, 2 Mar  1 00:19 /dev/dm-2

# ls -l /dev/vg_critical
total 0
lrwxrwxrwx 1 root root 7 Mar  1 00:19 lv_base -> ../dm-1
lrwxrwxrwx 1 root root 7 Mar  1 00:17 lv_files -> ../dm-0
# ls -l /dev/vg_normal
total 0
lrwxrwxrwx 1 root root 7 Mar  1 00:19 lv_backups -> ../dm-2
# mkfs.ext4 /dev/vg_normal/lv_backups
mke2fs 1.46.2 (28-Feb-2021)
Discarding device blocks: done
Creating filesystem with 3140608 4k blocks and 786432 inodes
Filesystem UUID: 7eaf0340-b740-421e-96b2-942cdbf29cb3
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
# mkdir /srv/backups
# mount /dev/vg_normal/lv_backups /srv/backups
# df -h /srv/backups
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_normal-lv_backups   12G   24K   12G   1% /srv/backups
# [...]
[...]
# cat /etc/fstab
[...]
/dev/vg_critical/lv_base    /srv/base       ext4 defaults 0 2
/dev/vg_critical/lv_files   /srv/files      ext4 defaults 0 2
/dev/vg_normal/lv_backups   /srv/backups    ext4 defaults 0 2
Since not all of the space available in vg_critical has been allocated yet, lv_files can be grown. We use the lvresize command to grow the LV, then resize2fs to adapt the filesystem to the new LV size:
# df -h /srv/files/
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_files  4.9G  4.2G  485M  90% /srv/files
# lvdisplay -C vg_critical/lv_files
  LV       VG          Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_files vg_critical -wi-ao---- 5.00g
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize VFree
  vg_critical   2   2   0 wz--n- 7.99g 1.99g
# lvresize -L 6G vg_critical/lv_files
  Size of logical volume vg_critical/lv_files changed from 5.00 GiB (1280 extents) to 6.00 GiB (1536 extents).
  Logical volume vg_critical/lv_files successfully resized.
# lvdisplay -C vg_critical/lv_files
  LV       VG          Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_files vg_critical -wi-ao---- 6.00g
# resize2fs /dev/vg_critical/lv_files
resize2fs 1.46.2 (28-Feb-2021)
Filesystem at /dev/vg_critical/lv_files is mounted on /srv/files; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/vg_critical/lv_files is now 1572864 (4k) blocks long.
# df -h /srv/files/
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_files  5.9G  4.2G  1.5G  75% /srv/files
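The order of the two steps matters when growing: the LV first, then the filesystem. A minimal dry-run sketch is shown below; grow_lv_cmds is a hypothetical helper that only prints the commands used in the session above, so that they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the two commands used to grow an
# ext4 filesystem living on an LV, in the order they must run.
# Usage: grow_lv_cmds VG LV NEW_SIZE
grow_lv_cmds() {
    echo "lvresize -L $3 $1/$2"        # 1. enlarge the logical volume
    echo "resize2fs /dev/$1/$2"        # 2. let the filesystem use the new space
}

grow_lv_cmds vg_critical lv_files 6G
```

lvresize(8) also documents a -r (--resizefs) option intended to handle the filesystem step in the same run; it is worth checking the manual page of the installed version before relying on it.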
We now want to grow lv_base. As shown below, the space available in vg_critical, the VG that lv_base is carved from, is already almost used up:
# df -h /srv/base/
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_base  974M  883M   25M  98% /srv/base
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize VFree
  vg_critical   2   2   0 wz--n- 7.99g 1016.00m
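The figures above can be turned into a quick check of how much space is missing. The sketch below assumes lv_base is to grow from 1 GiB to 2 GiB (the target used later in this section) and takes the 1016 MiB of free space reported by vgdisplay; shortfall_mib is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: how much extra VG space (in MiB) a resize would still need.
# Usage: shortfall_mib CURRENT_MIB TARGET_MIB VG_FREE_MIB
shortfall_mib() {
    needed=$(( $2 - $1 ))
    if [ "$needed" -gt "$3" ]; then
        echo $(( needed - $3 ))
    else
        echo 0
    fi
}

# lv_base: 1024 MiB now, 2048 MiB wanted, 1016 MiB free in vg_critical.
shortfall_mib 1024 2048 1016
```

The call prints 8: the VG is 8 MiB short, so it has to be extended before lv_base can grow to 2 GiB.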
The sdb3 partition, which was so far used outside of LVM, only contained archives that could be moved to lv_backups. We can now recycle it and integrate it into the volume group, thereby reclaiming some available space. This is the purpose of the vgextend command. Of course, the partition must be prepared as a physical volume beforehand. Once the VG has been extended, we can use commands similar to the previous ones to grow the logical volume, then the filesystem:
# pvcreate /dev/sdb3
  Physical volume "/dev/sdb3" successfully created.
# vgextend vg_critical /dev/sdb3
  Volume group "vg_critical" successfully extended
# vgdisplay -C vg_critical
  VG          #PV #LV #SN Attr   VSize   VFree
  vg_critical   3   2   0 wz--n- <12.99g <5.99g
# lvresize -L 2G vg_critical/lv_base
[...]
# resize2fs /dev/vg_critical/lv_base
[...]
# df -h /srv/base/
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_critical-lv_base  2.0G  886M  991M  48% /srv/base
The disks appear as sda and sdc. Both are partitioned according to the scheme shown below:
# sfdisk -l /dev/sda
Disk /dev/sda: 894.25 GiB, 960197124096 bytes, 1875385008 sectors
Disk model: SAMSUNG MZ7LM960
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: BB14C130-9E9A-9A44-9462-6226349CA012

Device          Start        End   Sectors   Size Type
/dev/sda1        2048       4095      2048     1M BIOS boot
/dev/sda2        4096  100667391 100663296    48G Linux RAID
/dev/sda3   100667392  134221823  33554432    16G Linux RAID
/dev/sda4   134221824  763367423 629145600   300G Linux RAID
/dev/sda5   763367424 1392513023 629145600   300G Linux RAID
/dev/sda6  1392513024 1875384974 482871951 230.3G Linux LVM
sda2 and sdc2 (about 48 GB) are assembled into a RAID-1 volume, md0. This mirror is directly used to store the root filesystem.
The sda3 and sdc3 partitions are assembled into a RAID-0 volume, md1, and used as swap space, providing a total of 32 GB of swap. Modern systems can provide plenty of RAM and our system won't need hibernation, so with this amount added, our system is unlikely to run out of memory.
The sda4 and sdc4 partitions, as well as sda5 and sdc5, are assembled into two new RAID-1 volumes of about 300 GB each, md2 and md3. Both these mirrors are initialized as physical volumes for LVM, and assigned to the vg_raid volume group. This VG thus contains about 600 GB of safe space.
sda6 and sdc6 are used directly as physical volumes, and assigned to another VG called vg_bulk, which therefore ends up with roughly 460 GB of space.
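The totals quoted in this list follow directly from the partition sizes in the sfdisk listing. The sketch below recomputes them, assuming both disks share the same layout; the variable names are illustrative, and tenths of a GB are used so that plain integer shell arithmetic suffices.

```shell
#!/bin/sh
# Sketch: recompute the capacities quoted in the list above.
# Sizes come from the sfdisk listing; both disks share the same layout.

swap=$(( 16 + 16 ))          # RAID-0 over sda3 + sdc3: member sizes add up
md2=300                      # RAID-1 over sda4 + sdc4: one copy's worth
md3=300                      # RAID-1 over sda5 + sdc5: one copy's worth
vg_raid=$(( md2 + md3 ))     # both mirrors pooled into vg_raid

# sda6 + sdc6 are 230.3 GB each, used without redundancy (in tenths of GB).
bulk_tenths=$(( 2 * 2303 ))
vg_bulk="$(( bulk_tenths / 10 )).$(( bulk_tenths % 10 ))"

echo "swap=${swap}G vg_raid=${vg_raid}G vg_bulk=${vg_bulk}G"
```

The mirrors contribute only the size of one member each, whereas the RAID-0 volume and the unmirrored VG add their members together.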
Keep in mind that LVs created in vg_raid will survive the failure of one disk, whereas LVs created in vg_bulk will not. On the other hand, vg_bulk is allocated across both disks, which allows higher read and write speeds for large files stored on its LVs.
The lv_var and lv_home LVs are created on vg_raid, to host the matching filesystems; another large LV, lv_movies, will be used to host the definitive versions of movies after editing. The other VG will be split into a large lv_rushes, for data straight out of the digital video cameras, and an lv_tmp for temporary files. The location of the work area is a less straightforward choice to make: while good performance is needed for that volume, is it worth risking the loss of work if a disk fails during an editing session? Depending on the answer to that question, the relevant LV will be created on one VG or the other.