One of the RAID cards was reporting a multi-bit ECC failure on its RAM module, which is why I am moving 60 TB of data from one server to another.
I am getting 60-100 MB/s read speeds from three MegaRAID SAS RAID 1 arrays (8 + 8 + 6 x 10 TB disks), spanned with RAID 0 using mdadm, when compressing a folder with tar and pigz to a RAID 5 folder on the network or on the same server. I rarely beat 100 MB/s, and then only for a couple of seconds in an hour (up to 250 MB/s).
However, when I copy a single huge file (8 GB), it copies over 10G Ethernet at 987 MB/s. Also, hdparm -tT /dev/md0p1 gives 1.5 GB/s.
The cluster size of the disks and of the RAID array is 4096 k throughout. 75% of the files are >30 MB, and there are GB-sized files too. How can I tar.gz reliably and as fast as possible? What is wrong with this machine?
Since one of the RAID cards was reporting a multi-bit ECC RAM module failure, can that be the culprit behind the slow speeds, or is it because the disks are mechanical? Keep in mind that the file size distribution looks like this:
Bytes Number of Files:
0 14
16 24
32 21
64 603
128 207
256 1677
512 2361
1024 45
2048 90
4096 112
8192 358
16384 315
32768 235
65536 309
131072 296
262144 2275
524288 1148
1048576 2187
2097152 3204
4194304 2708
8388608 2148
16777216 703
33554432 1585
67108864 906
134217728 259
268435456 71
536870912 42
1073741824 33
2147483648 38
4294967296 16
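To make the distribution above easier to reason about, here is a minimal Python sketch that totals the histogram and reports what share of the files falls into the small buckets. The counts are copied from the listing. Whether each label is the upper or lower bound of its bucket is not stated, so the code simply treats the label as the bucket name; the thresholds and the names HISTOGRAM and share_at_or_below are only illustrative.

```python
# File-size histogram copied from the listing above:
# bucket label in bytes -> number of files in that bucket.
HISTOGRAM = {
    0: 14, 16: 24, 32: 21, 64: 603, 128: 207, 256: 1677, 512: 2361,
    1024: 45, 2048: 90, 4096: 112, 8192: 358, 16384: 315, 32768: 235,
    65536: 309, 131072: 296, 262144: 2275, 524288: 1148, 1048576: 2187,
    2097152: 3204, 4194304: 2708, 8388608: 2148, 16777216: 703,
    33554432: 1585, 67108864: 906, 134217728: 259, 268435456: 71,
    536870912: 42, 1073741824: 33, 2147483648: 38, 4294967296: 16,
}


def share_at_or_below(histogram, threshold):
    """Return (count, fraction) of files in buckets labelled <= threshold."""
    total = sum(histogram.values())
    count = sum(n for size, n in histogram.items() if size <= threshold)
    return count, count / total


if __name__ == "__main__":
    total = sum(HISTOGRAM.values())
    # Thresholds are illustrative: one 4 KiB block and 1 MiB.
    for threshold in (4096, 1048576):
        count, fraction = share_at_or_below(HISTOGRAM, threshold)
        print(f"<= {threshold} bytes: {count} of {total} files ({fraction:.1%})")
```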
Here is a slightly more human-readable version: https://docs.google.com/spreadsheets/d/15J3LsU5G_km70mW0yE6ehiK4oHZmOLuUxRpBCM25-r0/edit?usp=sharing
Is there any practical workaround to copy all the data before I replace the faulty RAID card?
I have figured out that small files, especially those close to the cluster size of the RAID array, are hell to copy from mechanical disks. Throughput can drop very low, to less than 250 KB/s while copying small files.
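A rough back-of-the-envelope model shows why this happens. If every file costs one positioning delay (seek plus rotational latency) before its data streams at the sequential rate, the effective throughput is size / (positioning time + size / sequential rate). The 10 ms positioning time and 150 MB/s sequential rate below are assumed ballpark figures for a single mechanical disk, not measurements from this server, and the model ignores metadata lookups, caching, and RAID striping.

```python
def effective_throughput(file_size, positioning_s=0.010, sequential_bps=150e6):
    """Bytes per second when each file costs one positioning delay plus transfer time.

    positioning_s and sequential_bps are assumed ballpark values, not measurements.
    """
    if file_size <= 0:
        return 0.0
    transfer_s = file_size / sequential_bps
    return file_size / (positioning_s + transfer_s)


if __name__ == "__main__":
    # From one 4 KiB block up to a single 8 GiB file.
    for size in (4096, 65536, 1048576, 8 * 2**30):
        rate = effective_throughput(size)
        print(f"{size:>12} bytes -> {rate / 1e6:8.2f} MB/s")
```

With these assumed numbers a 4 KiB file comes out at roughly 0.4 MB/s, while an 8 GiB file gets close to the full sequential rate, which is the same pattern as the slow small-file copies and the fast single-file copy.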
There are two workarounds. The first is enabling the RAID card cache, which is 512 MB and plenty for streaming small files when it gets the chance. The better one is to use an NVMe or SATA SSD as a cache through bcache on Linux, which gives 256 MB/s to 1 GB/s when copying files of 512 KB to 1 MB. bcache can be tuned through sysfs. There is a sequential_cutoff parameter: when you set it to 1 MB, files smaller than 1 MB go through the SATA SSD or NVMe cache, while larger ones bypass it and are written normally (strictly speaking, the cutoff applies to sequential I/O streams rather than to file sizes). The default sequential_cutoff is 4 MB.
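As a sketch of the sysfs tuning, the following Python helper writes a new sequential_cutoff value. It assumes the bcache device is named bcache0 and exposes its settings under /sys/block/bcache0/bcache/, which is the usual layout but may differ on your system. The function names and the sysfs_root parameter are my own; the latter is only there so the helper can be tried against a scratch directory. Writing to the real sysfs file requires root.

```python
from pathlib import Path


def _cutoff_path(device, sysfs_root):
    # Assumed layout: <sysfs_root>/<device>/bcache/sequential_cutoff
    return Path(sysfs_root) / device / "bcache" / "sequential_cutoff"


def set_sequential_cutoff(cutoff="1M", device="bcache0", sysfs_root="/sys/block"):
    """Write `cutoff` (for example "1M" or "4M") and return the path written."""
    path = _cutoff_path(device, sysfs_root)
    if not path.exists():
        raise FileNotFoundError(f"{path} not found; is {device} a bcache device?")
    path.write_text(f"{cutoff}\n")
    return path


def get_sequential_cutoff(device="bcache0", sysfs_root="/sys/block"):
    """Read the current cutoff back as a string."""
    return _cutoff_path(device, sysfs_root).read_text().strip()
```

Run as root, set_sequential_cutoff("1M") lowers the cutoff from the 4 MB default, and get_sequential_cutoff() shows what the kernel reports afterwards.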
It also turns out that we had unplugged the server case fans because of the incredible noise and forgot to plug them back in. That was overheating the RAID cards and making them report single-bit and multi-bit ECC errors. The card sitting between the other two was the one that gave most of the errors. I guess that is because the cards are passively cooled and the one in the middle gets hottest. Nevertheless, the slow copy speed had nothing to do with it.
The final answer is: very slow speeds are normal for mechanical-disk RAID setups when copying files near the cluster size or smaller, so nothing is wrong with the machine.
You didn't even mention the filesystem, or whether it has been tuned for the RAID layout. Yes, it could be. Secondly, it is no surprise that small files are likely to be scattered across the disk, while huge ones typically occupy contiguous regions. Rotational storage has a hard time with that.