I have a server with an LSI MegaRAID SAS 9260-4i controller and a RAID-5 array of 3 x 2 TB disks. I did some performance testing (with iozone3), and the numbers clearly show that the write cache policy affects the read performance as well. If I set the policy to WriteBack, I get about 2x the read performance compared with WriteThrough. How could the write cache affect the read performance?
Here are the details of the setup:
megacli -LDInfo -L0 -a0
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
Size : 3.637 TB
Is VD emulated : Yes
Parity Size : 1.818 TB
State : Optimal
Strip Size : 512 KB
Number Of Drives : 3
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disabled
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: No
With WriteBack enabled (everything else is unchanged):
Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
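For reference, the policy switch can be done from the command line with megacli (assuming the usual -LDSetProp syntax for adapter 0 / logical drive 0; the exact command used isn't shown above):
megacli -LDSetProp WT -L0 -a0   # WriteThrough
megacli -LDSetProp WB -L0 -a0   # WriteBack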
Some numbers from iozone3:
WriteThrough:
random random
KB reclen write rewrite read reread read write
2033120 64 91963 38146 144980 139122 11795 21564
2033120 128 83039 90746 118660 118147 21193 33686
2033120 256 78933 40359 113611 114327 31493 51838
2033120 512 71133 39453 131113 143323 28712 60946
2033120 1024 91233 76601 141257 142820 35869 45331
2033120 2048 58507 48419 136078 135220 51200 54548
2033120 4096 98426 70490 119342 134319 80883 57843
2033120 8192 70302 63047 132495 144537 101882 57984
2033120 16384 79594 29208 148972 135650 124207 79281
WriteBack:
random random
KB reclen write rewrite read reread read write
2033120 64 347208 302472 331824 302075 12923 31795
2033120 128 354489 343420 292668 322294 24018 45813
2033120 256 379546 343659 320315 302126 37747 71769
2033120 512 381603 352871 280553 322664 33192 116522
2033120 1024 374790 349123 289219 290284 43154 232669
2033120 2048 364758 342957 297345 320794 73880 264555
2033120 4096 368939 339926 303161 324334 128764 281280
2033120 8192 374004 346851 303138 326100 186427 324315
2033120 16384 379416 340577 284131 289762 254757 356530
Some details about the system:
- Ubuntu 12.04
- 64 bit
- Kernel 3.2.0 (3.2.0-58-generic)
- Memory was limited to 1 GB for the test
- iozone3 version 397-2
- Partition used for the test:
/dev/sda4 /var ext4 rw,relatime,user_xattr,barrier=1,data=ordered 0 0
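The iozone invocation was along these lines (reconstructed from the output format above; the test file path is just an example):
iozone -a -s 2033120k -y 64k -q 16m -i 0 -i 1 -i 2 -f /var/iozone.tmp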
By using a writeback cache, you are saving disk IOPS. The controller can batch up smaller writes into one big write.
Thus, there are more IOPS available for reads.
This assumes that reads and writes happen concurrently; if a given test does only reads or only writes, this won't matter.
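A quick way to watch this from the host side is iostat from the sysstat package (assuming the array shows up as sda, as in the question); with WriteBack enabled the per-request write latency (await) drops sharply because the controller acknowledges writes from its cache, which frees up the queue for reads:
iostat -x sda 1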
What filesystem is this test run on?
What comes to mind is atime. If your filesystem is mounted with atime enabled (i.e. without the noatime or relatime mount options), every read also generates a write to update the file's last-access time.
It might be helpful if you post the output of mount
and specify on which device you did the tests.
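To rule atime out, check the current mount options and, if necessary, remount with noatime before re-running the test:
mount | grep /var
mount -o remount,noatime /var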
The write-back policy does have an effect on read performance when using tests like iozone, because those benchmark tools measure read performance by reading back data they have written previously. So when iozone starts its read tests, some of that data still sits in the controller cache, making the read throughput a lot higher. This is regardless of file size, since the RAID adapter has no knowledge of files or even filesystems; all it sees are IOs and blocks. Keep in mind that iozone is a filesystem benchmark tool and thus totally abstracts the hardware. Maybe by using -J/-Y you could mitigate the effects of the write-back policy and get an idea of your real read performance, or use a true HDD benchmark tool (hdparm?).
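For example, a raw sequential read that bypasses the filesystem (and the host page cache, though not necessarily the controller's read-ahead) can be done with:
hdparm -t /dev/sda
iozone also has a -I option to use O_DIRECT, which takes the host page cache out of the picture, although the controller cache is still in the path.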
The most obvious reason is that the reads are coming from the cache rather than from the disks themselves. Remember that with WriteBack, the written data is held in the cache until the RAID controller gets a chance to (or decides to) write it to disk. It therefore makes sense that if the same data is read back (or anything else it still holds in its cache), it serves the request from the cache rather than performing relatively expensive disk reads.
It is also likely that, as the file is being written, it is laid down as one continuous block on the disk.
One thing the previous solutions don't take into account is the difference between how write-back and write-through commands are handled on the controller.
Write-Back means that when the controller receives a command, it immediately tells the OS handler that it was "ok" and to send the next one. Write-Through waits for each individual command to report success before processing the next one.
As a result, commands are queued faster, and the Read-Ahead setting of the array then starts populating the cache with a continuous stream of data.
You can see that Read-Ahead combined with the faster command queuing is helping quite a bit if you look at the random read and random write columns, which mostly defeat the Read-Ahead boost: there the percentage differences between the two policies are much smaller, especially at the smaller record sizes.
Another thing that can affect performance is the stripe size and block size, which determine how many different disks (and hence heads) are involved in each read or write operation.
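For the array in the question that works out roughly as follows (assuming the 512 KB strip size reported by the controller): a 3-disk RAID-5 stripe holds two data strips plus one parity strip, so a full stripe carries 2 x 512 KB = 1 MB of data. Aligned writes of at least 1 MB can be written as full stripes with the parity computed on the fly, while smaller writes force a read-modify-write of the parity, which is exactly the kind of extra IO a write-back cache can hide or coalesce.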