Question

Mr.Boon

Asked: 2019-10-14 21:56:11 +0800 CST

RAID10 array check slow, and slowing

I run a new CentOS 7 machine. Linux runs on 2x SSD setup, and I also have 4x SAS drives setup in software RAID10. The RAID10 array is large, 4x 12TB drives, so 24TB usable.

File system is: ext4

Now I finished copying some files to it, and I'm doing a raid check (very first one).

Every 2.0s: cat /proc/mdstat    Mon Oct 14 06:28:38 2019

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md127 : active raid10 sdf1[3] sdd1[1] sde1[2] sdc1[0]
      23437503488 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [======>..............]  check = 32.6% (7649123136/23437503488) finish=3402.6min speed=77333K/sec
      bitmap: 0/175 pages [0KB], 65536KB chunk

md2 : active raid1 sdb2[1] sda2[0]
      20478912 blocks [2/2] [UU]

md3 : active raid1 sdb3[1] sda3[0]
      447318976 blocks [2/2] [UU]
      bitmap: 3/4 pages [12KB], 65536KB chunk

unused devices: <none>

It started at around 250,000K/sec, but it keeps getting slower and is now at around 75,000K/sec.
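For reference, the finish estimate in the output above is simply the remaining blocks divided by the current speed. A minimal sketch of that arithmetic, assuming the block counts in /proc/mdstat are in 1 KiB units and the speed is in KiB per second; the helper name `md_eta_minutes` is made up for illustration:

```shell
# md_eta_minutes DONE_BLOCKS TOTAL_BLOCKS SPEED_K_PER_SEC
# Prints the estimated minutes remaining, rounded to one decimal place.
# Assumption: blocks are 1 KiB each and speed is in KiB per second.
md_eta_minutes() {
    awk -v d="$1" -v total="$2" -v speed="$3" \
        'BEGIN { printf "%.1f\n", (total - d) / speed / 60 }'
}

# Figures taken from the check line of md127 above.
md_eta_minutes 7649123136 23437503488 77333
```

With these figures the result is roughly 3,400 minutes, in line with the reported finish=3402.6min, so the estimate is only as good as the current speed.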

The drives in the RAID10 array are not being used by anything else at the moment.

I already tweaked the speed limit settings.

dev.raid.speed_limit_min = 100000

dev.raid.speed_limit_max = 1000000
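To compare those limits with what the array is actually doing, the values can be read side by side. This is only a sketch: `md_check_status` is a hypothetical helper, and the default paths assume the array is md127 and that the md sysfs files `sync_action` and `sync_speed` are present, as they are on stock CentOS 7 kernels.

```shell
# md_check_status [MD_SYSFS_DIR] [RAID_PROC_DIR]
# Prints the current sync action, the current speed in K/sec, and the
# configured speed limits. Both directories can be overridden; the
# defaults assume the array is md127.
md_check_status() {
    md_dir="${1:-/sys/block/md127/md}"
    raid_dir="${2:-/proc/sys/dev/raid}"
    printf 'action=%s\n' "$(cat "$md_dir/sync_action")"
    printf 'speed=%s\n'  "$(cat "$md_dir/sync_speed")"
    printf 'min=%s\n'    "$(cat "$raid_dir/speed_limit_min")"
    printf 'max=%s\n'    "$(cat "$raid_dir/speed_limit_max")"
}
```

Running `md_check_status` with no arguments reads the live values. If the reported speed sits below speed_limit_min, as 77333K/sec does against a minimum of 100000, that suggests the limits are not what is holding the check back.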

CPU usage is at about 2%, there is plenty of free RAM, and the 4 drives in the RAID array each report about 25% utilization, so they are not being pushed hard by the check.

My question:

What can I do to speed this up?
And what could be causing it to slow down?

1 Answer

shodanshok · Answer 1 · 2019-10-17T00:42:01+08:00

Best Answer

Your messages file shows exactly what I expected: a disk/enclosure continuously aborting commands and resetting. The affected disk always seems to be sdc, so it is probably the culprit.
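A quick way to see which disk the log keeps pointing at is to count the device tags on the error lines. This is a rough sketch, not a diagnostic tool: `count_disk_mentions` is a made-up helper, and it assumes the kernel lines carry the usual `[sdX]` tag from the SCSI disk driver. Controller-level reset messages may not include that tag, and the exact wording varies by driver.

```shell
# count_disk_mentions: reads kernel log lines on stdin and prints
# "<count> <device>" for every [sdX] tag found, most frequent first.
count_disk_mentions() {
    grep -o '\[sd[a-z]*\]' | tr -d '[]' | sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Example (the search pattern is an assumption; adjust it to your log):
#   grep -iE 'abort|reset|timeout' /var/log/messages | count_disk_mentions
```

If one device dominates the output, that matches the picture of a single disk, cable, or slot misbehaving.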

The obvious action to solve the problem is to replace it. However, I would first try to:

reseat your drive and power/data cables;
swap sdc with another disk (to change SAS cable/power cord) and check whether the errors follow the drive or stay bound to the very same slot/port;
optionally, read directly from the disk via dd if=/dev/sdc of=/dev/null bs=1M iflag=direct to gain additional debug data.

If, for some reason, you can't replace the drive, you can try forcing bad block reallocation by completely rewriting the device via dd if=/dev/zero of=/dev/sdc bs=1M oflag=direct. BIG WARNING: this will completely and irreversibly destroy all data on sdc. Try it only if you really can't replace the drive.
