Wondering what I could adjust to get my RAID 6 software RAID to resync quicker. Currently it peaks at about 64 MB/s and averages around 25 MB/s. I'm hoping to get it to 200 MB/s:
[root@localhost mnt]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdh[5] sdg[4] sdf[3] sde[2] sdd[1] sdc[0]
8000935168 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
[====>................] resync = 21.9% (439396416/2000233792) finish=1148.8min speed=22641K/sec
bitmap: 12/15 pages [48KB], 65536KB chunk
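As a sanity check on those numbers (the progress figures in the resync line are in 1 KiB blocks), the finish estimate follows directly from the remaining blocks divided by the reported speed:

```python
# Figures from the mdstat output above (units are KiB).
done_kib = 439_396_416
total_kib = 2_000_233_792
speed_kib_s = 22_641

remaining_kib = total_kib - done_kib
eta_min = remaining_kib / speed_kib_s / 60
print(f"{eta_min:.1f} min")  # ~1149 min, consistent with finish=1148.8min
```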
I've checked with iostat to see what could be the bottleneck, and I see %rrqm and %wrqm hovering near 100 but never crossing it:
10/08/23 20:39:35
avg-cpu: %user %nice %system %iowait %steal %idle
0.01 0.00 0.36 1.03 0.00 98.60
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.40 26.80 0.00 0.00 1.75 67.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.08
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme1n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme2n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.40 26.80 0.00 0.00 2.25 67.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.08
sdc 25.40 17932.00 3895.60 99.35 411.14 705.98 49.00 5638.45 1334.20 96.46 16.54 115.07 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 11.25 34.47
sdd 23.80 17932.00 3897.20 99.39 905.93 753.45 50.00 5824.05 1381.10 96.51 79.93 116.48 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 25.56 44.14
sde 26.20 17932.00 3894.60 99.33 417.60 684.43 53.80 5632.05 1330.90 96.11 31.82 104.68 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 12.65 42.79
sdf 24.90 24414.80 3901.40 99.37 2737.51 980.51 61.10 7334.45 1382.70 95.77 2166.86 120.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 200.56 84.72
sdg 413.70 15671.20 3504.10 89.44 1.37 37.88 50.70 5580.85 1332.00 96.33 1.61 110.08 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.65 10.96
sdh 215.20 15671.20 3702.60 94.51 2.91 72.82 49.50 5772.85 1381.20 96.54 1.37 116.62 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.69 11.74
What I've set:
/sys/block/md127/md/stripe_cache_size to 32768
dev.raid.speed_limit_min = 500000 (also tried 10000)
dev.raid.speed_limit_max = 5000000 (also tried 200000)
/sys/block/sd{c,d,e,f,g,h}/queue/max_sectors_kb to 1024
Queue depth for each device (/sys/block/$diskdrive/device/queue_depth) is set to 10 and can't be changed.
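Pulling those together, this is roughly how they were applied (a sketch with my device letters; adjust values and devices to your setup):

```shell
# Bigger stripe cache for the RAID 5/6 personality
# (costs RAM: entries * 4 KiB page * number of disks).
echo 32768 > /sys/block/md127/md/stripe_cache_size

# Raise the kernel's resync throttle (values are KiB/s).
sysctl -w dev.raid.speed_limit_min=500000
sysctl -w dev.raid.speed_limit_max=5000000

# Allow larger requests to each member device.
for d in sdc sdd sde sdf sdg sdh; do
    echo 1024 > /sys/block/$d/queue/max_sectors_kb
done
```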
[root@localhost mnt]# sysctl -a |grep -Ei vm.dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
[root@localhost mnt]#
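With 256 GB of RAM, those ratio-based settings translate into very large absolute dirty thresholds; a rough back-of-the-envelope (the kernel actually applies the ratios to available memory, so this slightly overstates it):

```python
ram_gib = 256  # approximate total RAM in this box

# vm.dirty_background_ratio = 10 -> background writeback kicks in here
dirty_background_gib = ram_gib * 10 / 100
# vm.dirty_ratio = 20 -> writers get throttled here
dirty_gib = ram_gib * 20 / 100

print(dirty_background_gib, dirty_gib)  # 25.6 51.2 (GiB)
```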
The definitions of rrqm and wrqm are well documented:
rrqm/s
The number of read requests merged per second that were queued to the device.
wrqm/s
The number of write requests merged per second that were queued to the device.
%rrqm, %wrqm: The percentage of read/write requests merged by the I/O scheduler before being sent to the device.
However, that definition doesn't quite match the behaviour I'm seeing in iostat, where the value is highlighted in red and reaches, but never crosses, 100%. To me, a value close to or at 100% typically indicates saturation of some kind.
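As far as I can tell, %rrqm is a merge ratio rather than a utilization figure: merged requests as a fraction of all requests handled (merged + issued), so it can approach but never actually reach 100% while any requests are being issued. Reproducing the sdc figure from the iostat output above:

```python
# sdc figures from the iostat sample above.
r_per_s = 25.40       # read requests actually issued to the device
rrqm_per_s = 3895.60  # read requests merged before issue

pct_rrqm = 100 * rrqm_per_s / (rrqm_per_s + r_per_s)
print(f"{pct_rrqm:.2f}")  # 99.35, matching the %rrqm column
```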
The drives in this system are 6 x 2 TB SSDs (Patriot P210, reported as ATA Patriot P210 204). They are connected to the P440AR in this HP box.
So I'm curious what else I could try to increase the resync speed?
Oct 9th
I'll tackle each item in turn, though that does read quite like a ChatGPT answer. (Btw, I did try ChatGPT, but after a 2-hour chat most of the parameters it suggested had no real effect.) ;) Current average speed is 9 MB/s.
CPU: 2x Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz. Lots of free and idle cores. At most I see 3 cores near 100%; most of the time iowait on 1-2 cores hovers below 75%.
Tried it; no effect, so I reverted. No discernible difference between bfq and mq-deadline on the drives.
Kernel 6.5.2 parameters: GRUB_CMDLINE_LINUX="crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/almalinux-swap rd.lvm.lv=almalinux/root rd.lvm.lv=almalinux/swap elevator=mq-deadline nvme.poll_queue=8"
No filesystem yet. Just messing with the drives right now. Total test box, hence it gives me lots of options.
256GB
AlmaLinux 9.2
Lined up the mdadm chunk size and the controller's strip size at 64K.
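For reference, the software array as described here could be created with something like this (a sketch; the device letters shift between rebuilds on this box, so adjust accordingly):

```shell
# 6-disk RAID 6 with a 64 KiB chunk, matching the controller's 64 KiB strip size.
mdadm --create /dev/md127 --level=6 --raid-devices=6 --chunk=64 \
      /dev/sd{c,d,e,f,g,h}
```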
See above.
[root@localhost mnt]# cat /sys/block/md127/md/sync_force_parallel
1
(See https://unix.stackexchange.com/questions/734715/multiple-mdadm-raid-rebuild-in-parallel)
This is a net new array build.
Scheduler settings:
[root@localhost mnt]# cat /sys/block/*/queue/scheduler
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
[root@localhost mnt]# ls -altri /sys/block/*/queue/scheduler
62995 -rw-r--r--. 1 root root 4096 Oct 8 14:28 /sys/block/sdb/queue/scheduler
61092 -rw-r--r--. 1 root root 4096 Oct 8 14:28 /sys/block/nvme0n1/queue/scheduler
61775 -rw-r--r--. 1 root root 4096 Oct 8 14:28 /sys/block/nvme1n1/queue/scheduler
60866 -rw-r--r--. 1 root root 4096 Oct 8 14:28 /sys/block/nvme2n1/queue/scheduler
78483 -rw-r--r--. 1 root root 4096 Oct 8 22:11 /sys/block/sdd/queue/scheduler
78799 -rw-r--r--. 1 root root 4096 Oct 8 22:12 /sys/block/sde/queue/scheduler
79130 -rw-r--r--. 1 root root 4096 Oct 8 22:12 /sys/block/sdf/queue/scheduler
79461 -rw-r--r--. 1 root root 4096 Oct 8 22:12 /sys/block/sdg/queue/scheduler
79792 -rw-r--r--. 1 root root 4096 Oct 8 22:12 /sys/block/sdh/queue/scheduler
80123 -rw-r--r--. 1 root root 4096 Oct 8 22:12 /sys/block/sdi/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sdd/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sde/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sdf/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sdg/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sdh/queue/scheduler
[root@localhost mnt]# echo bfq > /sys/block/sdi/queue/scheduler
[root@localhost mnt]# cat /sys/block/*/queue/scheduler
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none [mq-deadline] kyber bfq
none mq-deadline kyber [bfq]
none mq-deadline kyber [bfq]
none mq-deadline kyber [bfq]
none mq-deadline kyber [bfq]
none mq-deadline kyber [bfq]
none mq-deadline kyber [bfq]
[root@localhost mnt]#
Tried the P440AR's hardware RAID 6 as suggested in the comments. It was slower by 25-30%. I just finished testing it, blew it away, and am recreating the software RAID. The HW RAID 6 was heavily blocked on IO: after filling the cache through some performance testing (copying a large file to the array over and over under different file names), hdparm -tT /dev/sdX would sit totally blocked for 30+ minutes after I cancelled the copy. One error from while it was blocked:
Oct 7 22:17:47 localhost kernel: INFO: task hdparm:9331 blocked for more than 1228 seconds.
Oct 7 22:17:47 localhost kernel: Not tainted 6.5.2 #3
Oct 7 22:17:47 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 7 22:17:47 localhost kernel: task:hdparm state:D stack:0 pid:9331 ppid:6254 flags:0x00004006
Oct 7 22:17:47 localhost kernel: Call Trace:
Oct 7 22:17:47 localhost kernel: <TASK>
Oct 7 22:17:47 localhost kernel: __schedule+0x211/0x660
Oct 7 22:17:47 localhost kernel: schedule+0x5a/0xd0
Oct 7 22:17:47 localhost kernel: wb_wait_for_completion+0x56/0x80
Oct 7 22:17:47 localhost kernel: ? __pfx_autoremove_wake_function+0x10/0x10
Oct 7 22:17:47 localhost kernel: sync_inodes_sb+0xc0/0x100
Oct 7 22:17:47 localhost kernel: ? __pfx_sync_inodes_one_sb+0x10/0x10
Oct 7 22:17:47 localhost kernel: iterate_supers+0x88/0xf0
Oct 7 22:17:47 localhost kernel: ksys_sync+0x40/0xa0
Oct 7 22:17:47 localhost kernel: __do_sys_sync+0xa/0x20
Oct 7 22:17:47 localhost kernel: do_syscall_64+0x5c/0x90
Oct 7 22:17:47 localhost kernel: ? syscall_exit_work+0x103/0x130
Oct 7 22:17:47 localhost kernel: ? syscall_exit_to_user_mode+0x22/0x40
Oct 7 22:17:47 localhost kernel: ? do_syscall_64+0x69/0x90
Oct 7 22:17:47 localhost kernel: ? do_user_addr_fault+0x22b/0x660
Oct 7 22:17:47 localhost kernel: ? exc_page_fault+0x65/0x150
Oct 7 22:17:47 localhost kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Oct 7 22:17:47 localhost kernel: RIP: 0033:0x7f081163ed5b
Oct 7 22:17:47 localhost kernel: RSP: 002b:00007ffe797ccc28 EFLAGS: 00000217 ORIG_RAX: 00000000000000a2
Oct 7 22:17:47 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f081163ed5b
Oct 7 22:17:47 localhost kernel: RDX: 00007f0811600000 RSI: 0000000000200000 RDI: 0000000000000003
Oct 7 22:17:47 localhost kernel: RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
Oct 7 22:17:47 localhost kernel: R10: 0000000000000022 R11: 0000000000000217 R12: 00007f0811400000
Oct 7 22:17:47 localhost kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Oct 7 22:17:47 localhost kernel: </TASK>
Oct 7 22:17:47 localhost kernel: Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
Also tried this as suggested in the comments:
"echo 8 > /sys/block/md127/md/group_thread_cnt"
but with little to no discernible effect that I could see. Those resync speeds fluctuate rather wildly.
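Since the speed fluctuates so much, single readings aren't very meaningful; a small sketch (assuming the /proc/mdstat format shown above) to pull out the instantaneous speed so it can be sampled in a loop:

```python
import re

def mdstat_speed_kib(mdstat_text):
    """Return the current resync speed in KiB/s, or None if no resync is running."""
    m = re.search(r"speed=(\d+)K/sec", mdstat_text)
    return int(m.group(1)) if m else None

# Example with the resync line from above; in practice pass open('/proc/mdstat').read().
sample = "[====>................]  resync = 21.9% (439396416/2000233792) finish=1148.8min speed=22641K/sec"
print(mdstat_speed_kib(sample))  # 22641
```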
Some ssacli numbers. Instead of HW RAID 6 via the P440AR, I've wrapped each drive in its own RAID 0 to still make use of some of the controller's capabilities.
=> ctrl slot=3 show
Smart Array P440 in Slot 3
Bus Interface: PCI
Slot: 3
Serial Number: ABC12345678
Cache Serial Number: ABC12345678
RAID 6 (ADG) Status: Enabled
Controller Status: OK
Hardware Revision: B
Firmware Version: 4.02-0
Firmware Supports Online Firmware Activation: False
Rebuild Priority: High
Expand Priority: Medium
Surface Scan Delay: 3 secs
Surface Scan Mode: Idle
Parallel Surface Scan Supported: Yes
Current Parallel Surface Scan Count: 1
Max Parallel Surface Scan Count: 16
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator Sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 15 secs
Cache Board Present: True
Cache Status: OK
Cache Ratio: 75% Read / 25% Write
Drive Write Cache: Enabled
Total Cache Size: 4.0
Total Cache Memory Available: 3.8
No-Battery Write Cache: Disabled
SSD Caching RAID5 WriteBack Enabled: True
SSD Caching Version: 2
Cache Backup Power Source: Batteries
Battery/Capacitor Count: 1
Battery/Capacitor Status: OK
SATA NCQ Supported: True
Spare Activation Mode: Activate on physical drive failure (default)
Controller Temperature (C): 58
Cache Module Temperature (C): 51
Number of Ports: 1 Internal only
Encryption: Not Set
Express Local Encryption: False
Driver Name: hpsa
Driver Version: 3.4.20
Driver Supports SSD Smart Path: True
PCI Address (Domain:Bus:Device.Function): 0000:08:00.0
Negotiated PCIe Data Rate: PCIe 3.0 x8 (7880 MB/s)
Controller Mode: RAID
Pending Controller Mode: RAID
Port Max Phy Rate Limiting Supported: False
Latency Scheduler Setting: Disabled
Current Power Mode: MaxPerformance
Survival Mode: Enabled
Host Serial Number: ABC12345678
Sanitize Erase Supported: True
Primary Boot Volume: None
Secondary Boot Volume: None
=>
One of the drives (example):
=> ctrl slot=3 ld 1 show
Smart Array P440 in Slot 3
Array A
Logical Drive: 1
Size: 1.86 TB
Fault Tolerance: 0
Heads: 255
Sectors Per Track: 32
Cylinders: 65535
Strip Size: 64 KB
Full Stripe Size: 64 KB
Status: OK
Caching: Enabled
Unique Identifier: UNIQUEIDENTIFIER
Disk Name: /dev/sdd
Mount Points: None
Logical Drive Label: DRIVELABEL
Drive Type: Data
LD Acceleration Method: Controller Cache
=>
During a brief burst early in the resync it did reach the number below, which is more what I would expect of such an array, but it then dropped drastically:
[root@localhost mnt]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdi[5] sdh[4] sdg[3] sdf[2] sde[1] sdd[0]
8000935168 blocks super 1.2 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
[=>...................] resync = 5.2% (105114044/2000233792) finish=121.2min speed=260430K/sec
bitmap: 15/15 pages [60KB], 65536KB chunk
....................
[root@localhost mnt]#
Not sure of the impact, but one of the IO boards, or the controller, is at 75C. Based on dmidecode output it looks like it's the Smart Array P440 controller.
My top output. More cores are now in the IO wait state. Not sure if that's due to group_thread_cnt, but if it is, it's a clue that something else is the bottleneck: more threads are simply waiting while the resync still hovers at low speeds:
top - 13:04:07 up 22:35, 5 users, load average: 9.26, 9.27, 9.21
Tasks: 717 total, 1 running, 716 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 0.1 sy, 0.0 ni, 99.0 id, 0.7 wa, 0.0 hi, 0.2 si, 0.0 st
%Cpu1 : 0.0 us, 2.1 sy, 0.0 ni, 0.7 id, 97.0 wa, 0.1 hi, 0.1 si, 0.0 st
%Cpu2 : 0.0 us, 1.4 sy, 0.0 ni, 1.0 id, 97.4 wa, 0.1 hi, 0.1 si, 0.0 st
%Cpu3 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 0.7 sy, 0.0 ni, 0.9 id, 98.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.3 sy, 0.0 ni, 99.6 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 0.0 us, 0.1 sy, 0.0 ni, 98.6 id, 1.2 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu9 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 0.0 us, 0.6 sy, 0.0 ni, 3.3 id, 96.0 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu11 : 0.0 us, 0.2 sy, 0.0 ni, 0.7 id, 99.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 0.0 us, 0.3 sy, 0.0 ni, 99.6 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 0.0 us, 0.7 sy, 0.0 ni, 97.9 id, 1.2 wa, 0.1 hi, 0.1 si, 0.0 st
%Cpu15 : 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 0.0 us, 0.1 sy, 0.0 ni, 99.7 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu17 : 0.0 us, 0.2 sy, 0.0 ni, 0.6 id, 99.1 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu18 : 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu24 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu25 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu26 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu27 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu28 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu29 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu30 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu31 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu32 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu33 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu34 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu35 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu36 : 0.0 us, 1.4 sy, 0.0 ni, 0.7 id, 97.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu37 : 0.0 us, 0.4 sy, 0.0 ni, 99.5 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu38 : 0.0 us, 0.2 sy, 0.0 ni, 0.7 id, 99.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu39 : 0.0 us, 0.4 sy, 0.0 ni, 99.4 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu40 : 0.0 us, 0.4 sy, 0.0 ni, 99.5 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu41 : 0.0 us, 0.2 sy, 0.0 ni, 99.7 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu42 : 0.0 us, 0.4 sy, 0.0 ni, 99.5 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu43 : 0.0 us, 0.3 sy, 0.0 ni, 99.6 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu44 : 0.0 us, 0.3 sy, 0.0 ni, 99.6 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu45 : 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu46 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu47 : 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu48 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu49 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu50 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu51 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu52 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu53 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu54 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu55 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu56 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu57 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu58 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu59 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu60 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu61 : 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu62 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu63 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu64 : 0.1 us, 0.1 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu65 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu66 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu67 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu68 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu69 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu70 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu71 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 257264.0 total, 253288.3 free, 4062.6 used, 1491.2 buff/cache
MiB Swap: 4096.0 total, 4096.0 free, 0.0 used. 253201.3 avail Mem
My next exercise will be to avoid any RAID 6 or per-drive RAID 0 setup via the controller entirely, and just let it present the drives as-is.
Oct 9th - Update 2
My other array, made up of NVMe drives, also seems impacted. Both arrays should be clocking near 2 GB/s once the resync is done, but (md127 = the SATA SSD array above; md0 = an NVMe mdadm software RAID 6):
[root@localhost queue]# hdparm -tT /dev/md127
/dev/md127:
Timing cached reads: 2 MB in 4.54 seconds = 451.26 kB/sec
Timing buffered disk reads: 2 MB in 3.50 seconds = 585.55 kB/sec
[root@localhost queue]#
[root@localhost queue]# hdparm -tT /dev/md127
/dev/md127:
Timing cached reads: 2 MB in 8.77 seconds = 233.61 kB/sec
Timing buffered disk reads: 2 MB in 5.87 seconds = 349.05 kB/sec
[root@localhost queue]#
[root@localhost queue]# hdparm -tT /dev/md0
/dev/md0:
Timing cached reads: 17256 MB in 1.99 seconds = 8667.23 MB/sec
Timing buffered disk reads: 48 MB in 3.17 seconds = 15.13 MB/sec
[root@localhost queue]#
Without md0 (NVMe) RAID 6 array:
Cheers,