Since Ubuntu 18.04, two machines I use, one desktop and one notebook, have both started to show an occasional really slow boot, followed by really slow performance in everything after that boot. Both use a small SSD and a bigger HD through bcache.
Except for those occasions, they are fast. No noticeable difference from other PCs with SSD only. Bcache is great. And usually a reboot after that slow boot makes things go back to normal. That's why I took so long to investigate it deeper.
Following instructions I probably found here on Ask Ubuntu, I used systemd-analyze to discover that fstrim was causing it.
$ sudo systemd-analyze blame
2min 9.448s fstrim.service
...
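When the blame list is long, a small filter can help pick out the outliers. This is only a rough sketch: slow_units is a made-up helper name, and the parsing assumes duration tokens shaped like the ones in the output above ("2min", "9.448s", "523ms").

```shell
#!/bin/sh
# slow_units is a hypothetical helper name, not part of systemd.
# Reads `systemd-analyze blame` output on stdin and prints only the units
# whose startup time is at least the given number of seconds (default 30).
slow_units() {
    awk -v min="${1:-30}" '
    {
        total = 0
        # Every field except the last is a duration token: "2min", "9.448s", "523ms".
        for (i = 1; i < NF; i++) {
            t = $i
            if      (t ~ /ms$/)  { sub(/ms$/, "", t);  total += t / 1000 }
            else if (t ~ /min$/) { sub(/min$/, "", t); total += t * 60 }
            else if (t ~ /s$/)   { sub(/s$/, "", t);   total += t }
        }
        if (total >= min) printf "%.1f s  %s\n", total, $NF
    }'
}

# Demo on the line quoted above.
slow_units 60 <<'EOF'
2min 9.448s fstrim.service
EOF
```

On a real system the same function would be fed from the tool itself: systemd-analyze blame | slow_units 60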
The package was found using this:
$ dpkg -S fstrim
util-linux: /sbin/fstrim
util-linux: /usr/share/man/man8/fstrim.8.gz
util-linux: /lib/systemd/system/fstrim.service
util-linux: /lib/systemd/system/fstrim.timer
util-linux: /usr/share/bash-completion/completions/fstrim
My guess is that this fstrim ruins bcache's performance. It is scheduled to run once a week, which is consistent with the observed behaviour. It probably thinks the bcache device is a huge SSD and trims it, making the boot super slow. That also messes with the cache, so every access after that is a cache miss.
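One way to check the "huge SSD" part of that guess is to look at what each block device advertises in sysfs. The sketch below reads queue/discard_max_bytes, which is the value lsblk --discard shows as DISC-MAX. The helper name supports_discard, the SYSFS_ROOT variable and the device names are assumptions for illustration only.

```shell
#!/bin/sh
# supports_discard is a hypothetical helper. SYSFS_ROOT exists only so the
# function can be tried against a fake directory tree; it defaults to /sys.
supports_discard() {
    f="${SYSFS_ROOT:-/sys}/block/$1/queue/discard_max_bytes"
    # A missing file or a value of 0 is treated as "no discard support".
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# Device names are examples; substitute the ones on your machine.
for dev in bcache0 sda sdb; do
    if supports_discard "$dev"; then
        echo "$dev: advertises discard"
    else
        echo "$dev: no discard (or device not present)"
    fi
done
```

If bcache0 shows up as advertising discard while the backing HD does not, that would be consistent with fstrim treating the whole bcache device as trimmable.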
It's kind of fixed on my machines, since I disabled fstrim and its timer following the instructions here, and the slow boot hasn't occurred again.
rm /var/lib/systemd/timers/stamp-fstrim.timer
systemctl stop fstrim.service fstrim.timer
systemctl disable fstrim.service fstrim.timer
systemctl mask fstrim.service fstrim.timer
But there are probably better solutions to this. For example, there should be a way to disable fstrim for only one partition by editing fstab.
There is, maybe... I have just found it reading the Arch Linux wiki and a link to kernel.org from there. You just add nodiscard to the line of that bcache filesystem in fstab. I haven't tested it yet. In my case it would be:
...
# /home was on /dev/bcache0 during installation
UUID=0880deae-1eeb-4c07-af01-a3db9d2d6282 /home ext4 defaults,nodiscard 0 2
...
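In case the mount option turns out not to affect fstrim (it is documented as controlling online discard at mount time), another route might be a systemd drop-in that replaces the service's command so that it only trims the mount points you name. The sketch below just prints such a drop-in. fstrim_override is a made-up helper name, "/" is only a stand-in for a filesystem that is not on bcache, and one ExecStart line is emitted per mount point because fstrim takes a single mount point per call.

```shell
#!/bin/sh
# fstrim_override is a hypothetical helper: it prints a drop-in for
# fstrim.service that trims only the mount points given as arguments.
fstrim_override() {
    printf '[Service]\n'
    # An empty ExecStart= clears the packaged command
    # (check yours with: systemctl cat fstrim.service).
    printf 'ExecStart=\n'
    for mp in "$@"; do
        printf 'ExecStart=/sbin/fstrim -v %s\n' "$mp"
    done
}

# "/" is an example; list only filesystems that are NOT on bcache.
fstrim_override /
```

To try it, the output would go into /etc/systemd/system/fstrim.service.d/override.conf, followed by sudo systemctl daemon-reload. This assumes the units are unmasked and the timer is enabled again, and I have not tested it against bcache.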
Even better would be for bcache to report itself as not supporting trim to lsblk --discard, or for fstrim to recognize a bcache partition and avoid it.
Any suggestions? Should I file a bug? Where?