I want to test disaster recovery of an RDBMS after power loss under high load.
My idea is to mount the data directory under a new mount point, execute umount -f
while the load is running, and then investigate the outcome and the state of the files.
My expectation is that with a non-durable configuration the data will end up inconsistent, and with a durable configuration it will stay consistent.
Does anybody think this is a good idea? Related hints are welcome too, e.g. which filesystem is better to use, or, if my expectation is wrong, why.
Presumably you are actually removing the power.
umount -f is not nearly impolite enough to simulate many failures. On Linux, umount(2) explains that force is only supported for networked file systems.
Here are some more ideas regarding how to do very nasty things to a database system:
Physically unplug all power supplies to the host. Any processes and shared memory will go away very ungracefully.
Overcommit the storage with thin provisioning and fill it to 100%. Even if the storage does something sane in this scenario, the DBMS might be unhappy if its volumes go read-only in the middle of a write.
Unplug all paths to the SAN, to simulate that "non disruptive" storage maintenance that isn't.
Find a process that does writes and send it a SIGKILL signal or equivalent.
Crash the OS. For example, on Linux:
echo 'c' > /proc/sysrq-trigger
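The SIGKILL item above can be sketched in a few lines of Python. This is only an illustration, assuming a POSIX host: the writer script, the record format and the function name kill_writer_mid_load are hypothetical stand-ins for a real DBMS and its load generator. Note that SIGKILL leaves the kernel page cache intact, so writes that were never fsynced will usually still reach disk. It exercises the application's recovery path, not a power loss.

```python
import signal
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a DBMS writer: append a record, fsync it, and
# only then acknowledge it on stdout.
WRITER = r'''
import os, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
n = 0
while True:
    os.write(fd, b"record %08d\n" % n)
    os.fsync(fd)                    # the "durable" configuration
    sys.stdout.write("%d\n" % n)    # acknowledge only after fsync returned
    sys.stdout.flush()
    n += 1
'''

RECORD_LEN = len(b"record 00000000")  # length of one record without newline


def kill_writer_mid_load(path, acks_before_kill=50):
    """Run the writer, SIGKILL it after some acknowledgements, inspect the file.

    Returns (last_acked, complete_records_on_disk, writer_return_code).
    """
    proc = subprocess.Popen([sys.executable, "-c", WRITER, str(path)],
                            stdout=subprocess.PIPE, text=True)
    last_acked = -1
    for line in proc.stdout:
        last_acked = int(line)
        if last_acked + 1 >= acks_before_kill:
            break
    proc.send_signal(signal.SIGKILL)  # the writer gets no chance to clean up
    proc.wait()
    # Pick up acknowledgements that were already in the pipe when the kill landed.
    leftover = proc.stdout.read().split()
    proc.stdout.close()
    if leftover:
        last_acked = int(leftover[-1])
    data = Path(path).read_bytes()
    complete = [r for r in data.split(b"\n") if len(r) == RECORD_LEN]
    return last_acked, len(complete), proc.returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        acked, on_disk, rc = kill_writer_mid_load(Path(tmp) / "data.log")
        print(f"last acked={acked} records on disk={on_disk} exit={rc}")
        # With fsync-before-ack, every acknowledged record must be on disk.
        assert on_disk >= acked + 1
```

The invariant worth checking after any of the failures listed above is the same: nothing that was acknowledged to the client may be missing afterwards. Removing the fsync call from the writer gives the non-durable variant, but the difference only shows up when the OS or the power goes away, not after a plain SIGKILL.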
The state of the data remaining after the test depends on the storage and the DBMS. Either may have a journal it can replay, or it may not. You probably want to run fsck or an equivalent on the file system. If the database can recover to a consistent point in time, from logs or whatever, you may want to do that. If you have an integrity checker for the DBMS, use it as a sanity check.
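To make "recover to a consistent point" and "integrity checker" concrete, here is a toy sketch in Python. It is not how any particular DBMS stores its log. The record layout (length, CRC32, payload) and the names encode_record and scan_log are assumptions made up for this example. The idea is that a checker walks the log and stops at the first torn or corrupted record, so everything before that offset is the last consistent state.

```python
import struct
import zlib

# Hypothetical on-disk record layout: 4-byte payload length, 4-byte CRC32 of
# the payload (both little-endian), then the payload itself.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame one payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def scan_log(data: bytes):
    """Return (records, valid_bytes) for the intact prefix of the log.

    Scanning stops at the first record that is truncated or whose checksum
    does not match, which is what a torn write after a crash looks like.
    """
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupted record: last consistent point reached
        records.append(payload)
        offset = start + length
    return records, offset


if __name__ == "__main__":
    log = b"".join(encode_record(p) for p in (b"a", b"bb", b"ccc"))
    torn = log[:-2]  # simulate a crash in the middle of the last write
    records, valid = scan_log(torn)
    print(f"{len(records)} intact records, {valid} of {len(torn)} bytes valid")
```

A real DBMS ships its own tools for this and they check far more than a checksum per record, so treat the above only as a picture of what "consistent up to a point" means.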
Hopefully you have already done a restore test of your backups, just in case. Do not assume that just because something claims crash recovery, it works in all situations.