Setup: SSD encrypted with LUKS (aes-xts with a 512-bit key, i.e. 256-bit AES), ext4 filesystem
dd write performance of 138 MB/s, CPU usage 97-100 %
dd if=/dev/zero of=testfile status=progress bs=32M count=128
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 31 s, 138 MB/s
128+0 records in
128+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 31,0463 s, 138 MB/s
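One caveat with the write test above: without a sync option, dd can report completion while part of the data is still in the page cache, so the figure may be somewhat optimistic. A minimal sketch of the same test with GNU dd's `conv=fdatasync`, which flushes the file data before the final rate is printed. The helper name `write_test` is hypothetical and not part of the original test.

```shell
#!/bin/sh
# write_test FILE BLOCKSIZE COUNT
# Hypothetical helper: same write test as above, but dd calls fdatasync()
# on the output file before it prints the final transfer rate.
write_test() {
    dd if=/dev/zero of="$1" bs="$2" count="$3" conv=fdatasync 2>&1 | tail -n 1
}

# Usage matching the test above (writes 4 GiB into the current directory):
#   write_test testfile 32M 128
```

If the rate reported this way is close to the 138 MB/s above, caching did not distort the original measurement much.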
dd read performance of 110 MB/s. CPU usage starts at > 90 %, then falls to about 50-60 %, and rises to > 90 % again towards the end of the file.
# drop the page cache first
sudo sh -c "echo 1 > /proc/sys/vm/drop_caches"
Now do the test:
dd if=testfile of=/dev/null status=progress bs=32M
4261412864 bytes (4,3 GB, 4,0 GiB) copied, 39 s, 109 MB/s
128+0 records in
128+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 39,1345 s, 110 MB/s
Now let's take a closer look at Samba:
Writing a 1 GB file to a Samba share on the disk benchmarked above gives about 73 MB/s; CPU usage is only about 70 %.
Reading a 1 GB file from the Samba share gives only about 64 MB/s; CPU usage is about 55 %. Also note the transfer graph: it starts slowly, and then the speed goes up and down, forming a kind of wave.
Copying the file again immediately, while it is still in the cache, runs at 112 MB/s, i.e. full Gigabit Ethernet speed, as it should.
Compare to unencrypted drive:
dd write speed 133 MB/s
dd read speed 207 MB/s
Samba write: 112 MB/s
Samba read: 112 MB/s
So LUKS encryption alone gives sufficient speed, and Samba alone also gives sufficient speed. In combination there is a huge performance drop, even though plenty of CPU resources are still available, resources that do get used when only dd is running.
What is wrong here? Why isn't the CPU fully used for Samba operations when it is for dd? What can be done to get full performance / CPU usage with SMB and LUKS encryption as well?
I've dug deeper into it.
It seems like a wrong display of CPU usage.
Reading via Samba from the unencrypted drive at 112 MB/s requires about 38 % CPU usage across the whole system.
CPU usage fluctuates while reading from the unencrypted drive: it goes as low as 29 % and occasionally spikes briefly to 94 %.
Now taking the encrypted read performance of 110 MB/s and reducing it by 38 % gives 68,2 MB/s. That's quite close to the measured 64 MB/s.
So from a logical point of view: Samba itself requires relatively much CPU and in combination with encryption the resulting speed seems to make sense now.
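The estimate above can be reproduced as a one-line calculation. This is only a rough plausibility check under the assumption that Samba's CPU share simply comes off the raw encrypted throughput; it is not a performance model. The function name `estimate` is made up for this sketch.

```shell
#!/bin/sh
# estimate RAW_MBPS CPU_SHARE_PERCENT
# Prints RAW_MBPS * (1 - CPU_SHARE_PERCENT / 100) with one decimal place.
# LC_ALL=C forces a decimal point regardless of the system locale.
estimate() {
    LC_ALL=C awk -v raw="$1" -v pct="$2" \
        'BEGIN { printf "%.1f\n", raw * (1 - pct / 100) }'
}

# 110 MB/s encrypted dd read, 38 % CPU used by Samba alone
estimate 110 38
```

With the numbers from above this prints 68.2, against the 64 MB/s actually measured.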
BTW: The system these tests were run on is a Raspberry Pi 400 with a 4-core ARM CPU at the default clock of 1,8 GHz.
cryptsetup benchmark
reports for aes-xts with a 512-bit key (so 256-bit AES encryption) 77 MB/s for encryption and 66,9 MB/s for decryption. However, cryptsetup runs these tests on only one CPU core, so I guess power management clocks the CPU down; that would be why real encryption and decryption perform much better, as dd shows.
I've also done some other performance tests.
I've also increased the read-ahead size on both /dev/mapper/sdd_crypt and /dev/sdd from 256 to 65536 (blockdev counts this in 512-byte sectors) via
sudo blockdev --setra 65536 /dev/sdd
and
sudo blockdev --setra 65536 /dev/mapper/sdd_crypt
however, these did not make any noticeable difference.
Digging still deeper into it, I found this very interesting article: https://blog.cloudflare.com/speeding-up-linux-disk-encryption/
Their research led to
no_read_workqueue
and
no_write_workqueue
beginning with kernel version 5.9. Luckily the current Raspberry Pi OS is on 5.10.11-v7l+, so dm-crypt supports these options.
However, the latest cryptsetup version on Raspberry Pi OS Buster, 2.1.0, doesn't support these options. So I've compiled cryptsetup 2.3.4 to use no_read_workqueue and no_write_workqueue (see https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/dm-crypt.html) and mounted via
However, performance was massively reduced on this particular setup, which reads from a real device and not a RAM disk.
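The exact command used is not shown above. As a sketch only: cryptsetup 2.3.4 exposes the two dm-crypt options as `--perf-no_read_workqueue` and `--perf-no_write_workqueue`. The device and mapping names below are taken from the read-ahead commands (/dev/sdd, sdd_crypt) and are an assumption here, as is the helper name `open_no_workqueue`. By default the helper only prints the command; it runs it only when RUN=1 is set.

```shell
#!/bin/sh
# open_no_workqueue DEVICE NAME
# Hypothetical helper: builds the cryptsetup call that opens a LUKS device
# with both workqueues bypassed. Prints the command unless RUN=1 is set,
# because actually running it needs root and the real device.
open_no_workqueue() {
    dev="$1"
    name="$2"
    set -- cryptsetup open \
        --perf-no_read_workqueue --perf-no_write_workqueue "$dev" "$name"
    if [ "${RUN:-0}" = 1 ]; then
        sudo "$@"
    else
        echo "$*"
    fi
}

# Assumed device and mapping name, see lead-in
open_no_workqueue /dev/sdd sdd_crypt
```

After opening, `sudo dmsetup table sdd_crypt` should list both options at the end of the table line if they took effect.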
In conclusion: since the resulting speeds are plausible, this seems to be a misleading display of CPU usage.
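One way to check the CPU display is to look at each core separately instead of the system-wide average, since an average over four cores can hide a single saturated core. A minimal sketch that computes per-core busy time from two /proc/stat samples; the helper names `busy_pct` and `per_core_usage` are made up for this example, and iowait is counted as idle time here, which is a choice of this sketch.

```shell
#!/bin/sh
# busy_pct LINE_BEFORE LINE_AFTER
# Takes two "cpuN user nice system idle iowait irq softirq steal" lines
# from /proc/stat and prints the busy percentage between the two samples.
busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i
            idle = $5 + $6            # idle + iowait
            if (NR == 1) { t0 = total; i0 = idle }
            else         { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt > 0) printf "%.0f\n", 100 * (dt - (i1 - i0)) / dt
            else print 0
        }'
}

# per_core_usage [SECONDS]
# Samples /proc/stat twice and prints one busy percentage per core.
per_core_usage() {
    before=$(grep '^cpu[0-9]' /proc/stat)
    sleep "${1:-1}"
    after=$(grep '^cpu[0-9]' /proc/stat)
    printf '%s\n' "$before" | while read -r line; do
        core=${line%% *}
        next=$(printf '%s\n' "$after" | grep "^$core ")
        echo "$core $(busy_pct "$line" "$next")%"
    done
}

# Usage, while a Samba transfer is running:
#   per_core_usage 5
```

If one core sits near 100 % during a Samba read while the others are mostly idle, the system-wide figure of about 55 % would be consistent with a per-core bottleneck.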