Setup looks as follows:
- HP ProLiant DL380 G7
- 6 x 3 TB SATA drives (surveillance grade, Seagate SV35), configured as hardware RAID 1+0 on the onboard SATA controller
- 192 GB RAM
- VMware ESXi 6.0
- One VM guest running CentOS 6.7 (kernel 2.6.32-573)
The datastore is made up of all the disk space remaining after the ESXi installation (a little less than 8 TB):
- 1 VMDK file for the system partition at 100GB
- 1 VMDK file for the data partition at around 7.7TB
On the guest CentOS, the system partition is LVM ext4.
The data partition is also LVM ext4, with a single PV, VG and LV.
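For completeness, the LVM layout can be checked with the standard listing commands:
pvs   # physical volumes backing each volume group
vgs   # volume groups and their sizes
lvs   # logical volumes within each volume group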
Now the problem I have is that data transfer speeds on the disk are extremely slow. Copying a semi-large file (10-30 GB) from one place on the LVM volume to another starts out at a transfer rate of around 240 MB/s, which is the speed I'd expect, but after a few seconds (usually around 30) it drops to 1-4 MB/s, and iotop shows a process called flush-253:2 start running, which seems to slow everything down.
I've been using rsync --progress to get a better picture of the transfer speed in real time, but I see the same result with a plain cp.
When it finally finishes, I have tried performing the same procedure again, with the same file to the same location. The second time, the transfer speed indicated by rsync holds steady at around 240 MB/s throughout the whole transfer, but when rsync indicates the transfer is complete, it hangs in that state for about as long as the first copy took. I can see the flush-253:2 process working just as hard during both runs.
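For reference, the kind of test I'm running looks roughly like this (the file and directory names are just placeholders):
rsync --progress /datavolume/bigfile.bin /datavolume/otherdir/   # watch the reported rate
iotop -o                                                         # in another shell: only show processes actually doing I/O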
Now I know the setup isn't optimal, and I would have preferred a separate disk for the ESXi system, but I don't feel that should be the cause of such extremely slow transfer rates.
I've searched for information about the flush process, and as far as I can tell it basically writes data from memory onto the actual disks, but I haven't found anyone reporting transfer rates this slow. The system is not in production yet, the CPU is barely doing anything, and there is around 100 GB of free memory while the copy procedures run.
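If it helps anyone reproduce this, the writeback can be watched in numbers while the copy runs, using only the standard /proc and sysctl interfaces:
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'   # dirty data waiting to be written back
sysctl vm.dirty_background_ratio vm.dirty_ratio           # thresholds that control when flushing starts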
Does anyone have any idea what to try? I've seen similar results on a different system which is set up basically the same way, except on completely different (somewhat lesser) hardware. I also have a third system running CentOS 5 with ext3 on LVM, which does not have any issues like this.
EDIT 1: I realize now that I had remembered incorrectly: the system partition is also LVM, but still a separate volume from the data partition.
[root@server /]# mount
/dev/mapper/vg1-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg1-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/mapper/vg_8tb-lv_8tb on /datavolume type ext4 (rw,nobarrier)
[root@server /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_1-lv_root
50G 9.7G 37G 21% /
tmpfs 91G 0 91G 0% /dev/shm
/dev/sda1 477M 52M 400M 12% /boot
/dev/mapper/vg_1-lv_home
45G 52M 43G 1% /home
/dev/mapper/vg_8tb-lv_8tb
7.9T 439G 7.1T 6% /datavolume
Update 1: I have tried increasing dirty_ratio all the way up to 90 and still saw no improvement. I also tried mounting the volume with -o nobarrier, with the same result.
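Concretely, the changes from this update were along these lines (applied at runtime; the mount point is my data volume):
sysctl -w vm.dirty_ratio=90                  # raise the dirty page threshold from the EL6 default
mount -o remount,nobarrier /datavolume       # remount the data volume without write barriers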
Update 2: Sorry for the confusion to everyone trying to help; now that I've had a look myself, the hardware is actually an HP ProLiant 380 G7. I don't know if that makes any difference.
I have also had a look at the RAID configuration myself, and it seems we're using a P410 RAID controller. When I boot into the RAID management, it says
HP Smart Array (I think) P410 "SOMETHING", with 0 MB in parentheses.
I'm guessing that might mean we have 0 MB of write cache?
I'm a little out of my depth here when it comes to hardware. Can you add a write cache module(?) to this RAID controller if one isn't already there, or do you need a new controller or a move to a SAN? And how can I tell whether it has a write cache but the battery is dead?
Update 3: Thanks to your suggestions and some further research, I'm now going to try to install the HP Smart Array VIB in ESXi and hopefully get a clearer picture of what I have. I also found the option in the system BIOS to enable the drive cache, so I have a last resort in case it turns out there is no write cache on the controller.
Update 4 (solved): Thanks to everyone who suggested solutions; it turned out there was indeed no cache module present on the disk controller.
To anyone having similar problems: I installed the hpssacli utility VIB for ESXi and could confirm, with the following output (note "Cache Board Present: False"), what had been suggested in the replies.
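For reference, the install and query were along these lines (treat the VIB filename and the hpssacli path as examples; they depend on the version you download):
esxcli software vib install -v /tmp/hpssacli-x.xx.vib        # install the utility VIB on the ESXi host
/opt/hp/hpssacli/bin/hpssacli ctrl all show config detail    # full controller details, including cache
/opt/hp/hpssacli/bin/hpssacli ctrl all show status           # quick cache/battery status check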
Smart Array P410i in Slot 0 (Embedded)
Bus Interface: PCI
Slot: 0
Serial Number:
Controller Status: OK
Hardware Revision: C
Firmware Version: 6.62
Rebuild Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Parallel Surface Scan Supported: No
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache Board Present: False
Drive Write Cache: Disabled
Total Cache Size: 0 MB
SATA NCQ Supported: True
Number of Ports: 2 Internal only
Driver Name: HP HPSA
Driver Version: 5.5.0
PCI Address (Domain:Bus:Device.Function): 0000:05:00.0
Host Serial Number:
Sanitize Erase Supported: False
Primary Boot Volume: logicaldrive 1
Secondary Boot Volume: None
It appears that your RAID controller has no cache. The main problem is that hardware RAID cards tend to disable, by default, the disks' private DRAM cache.
In short, this means that when the dirty pagecache is flushed to disk after some seconds (~30, to be precise), a ton of random I/O requests start hammering your (slow) mechanical disks, killing throughput.
Re-enable your disks' private DRAM cache (it is often a RAID controller option) and performance should go way up. For even faster writes you can turn off write barriers (with the nobarrier mount option), but unfortunately, without a BBU-protected cache, turning them off will affect your data reliability in case of a system crash or power outage.
EDIT: have a look here for more information.
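On this controller family the setting can also be changed from the hpssacli/ssacli utility; a minimal sketch, assuming slot 0 as reported above (the exact keyword may differ between utility versions, and enabling it without a battery/flash-backed cache is unsafe on power loss):
hpssacli ctrl slot=0 modify dwc=enable                    # dwc = drive write cache
hpssacli ctrl slot=0 show | grep -i "drive write cache"   # verify the new setting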
It doesn't appear as though you have any write cache.
Please confirm the generation and model of your server. If you don't have a flash-backed write cache (FBWC) module on the controller your disks are attached to, your VMware performance will suffer.
The other issue here is LVM and some of the defaults that appeared in RHEL 6 a few years ago. You'll want to try this with write barriers disabled. LVM can be an issue because it leads people to avoid partitioning their volumes, and that impacts the ability of tools like tuned-adm to do their job.
I asked for the output of mount. Can you please post it?
Try mounting your volumes with the nobarrier flag. Write barriers are the default for EL6 on ext4, so that's the biggest problem you're running into.
This seems a duplicate of: Flush-0:n processes causing massive bottleneck
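A minimal sketch of those barrier suggestions, using the data volume from the question (the fstab line and the tuned profile are illustrations, not something from the original post):
# /etc/fstab entry mounting the data volume without write barriers
/dev/mapper/vg_8tb-lv_8tb  /datavolume  ext4  defaults,nobarrier  0 0
# or let tuned handle it; the stock EL6 enterprise-storage profile remounts filesystems with nobarrier, among other tweaks
tuned-adm profile enterprise-storage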
Indeed, you should check dirty_ratio. What happens is that the first writes go into RAM, so you see a very fast I/O rate in the beginning. Later, when the amount of dirty data in RAM reaches dirty_ratio, the kernel starts flushing to disk.
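As a rough back-of-the-envelope, assuming the EL6 defaults (vm.dirty_background_ratio=10, vm.dirty_ratio=20) and the ~180 GB of RAM the guest appears to have (tmpfs shows 91 GB, i.e. half): about 18 GB of dirty data can pile up before background writeback even starts, and about 36 GB before writers are throttled. The default vm.dirty_expire_centisecs of 3000 (30 seconds) also lines up with the point where the copy slows down, since that is when the flush thread starts forcing the oldest dirty pages out to the disks.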
Some Questions:
and two personal notes:
- I don't think you will ever really reach a constant 240 MB/s with six slow 7.2K SATA drives in a RAID 10.