Question

noonex

Asked: 2017-10-24 19:35:28 +0800 CST

Simulate power loss with force unmount?

I want to test disaster recovery of an RDBMS after power loss under high load.

My idea is to mount the data directory under a new mount point, execute umount -f while the load is running, and then investigate the outcome and the state of the files.

My expectation is that with a non-durable configuration the data should be inconsistent, and consistent otherwise.

Does anybody think this is a good idea? Any related hints are welcome (e.g. which filesystem is better to use, or, if my expectation is irrelevant, why).

1 Answer

John Mahowald · Answer 1 · 2017-10-24T22:12:13+08:00

Presumably you are actually removing the power. umount -f is not nearly impolite enough to simulate many failures.

On Linux, umount(2) explains that force is only supported for networked file systems.

   MNT_FORCE (since Linux 2.1.116)
          Ask the filesystem to abort pending requests before attempting
          the unmount.  This may allow the unmount to complete without
          waiting for an inaccessible server, but could cause data loss.
          If, after aborting requests, some processes still have active
          references to the filesystem, the unmount will still fail.  As
          at Linux 4.12, MNT_FORCE is supported only on the following
          filesystems: 9p (since Linux 2.6.16), ceph (since Linux
          2.6.34), cifs (since Linux 2.6.12), fuse (since Linux 2.6.16),
          lustre (since Linux 3.11), and NFS (since Linux 2.1.116).

Here are some more ideas regarding how to do very nasty things to a database system:

Physically unplug all power supplies to the host. Any processes and shared memory will go away very ungracefully.
Overcommit the storage with thin provisioning and run it to 100%. Even if the storage did something sane in this scenario, the DBMS might be unhappy if its volumes went read only in the middle of a write.
Unplug all paths to the SAN, to simulate that "non disruptive" storage maintenance that isn't.
Find a process that does writes and send it SIGKILL or an equivalent signal.
Crash the OS. For example, on Linux: echo c > /proc/sysrq-trigger
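
The SIGKILL idea above is the easiest one to try without touching hardware. The following is a minimal sketch, assuming a POSIX system and Python 3; the writer script, the record format, and the name kill_writer_and_inspect are hypothetical stand-ins for a real DBMS and its data files. Note that SIGKILL only removes the process: data already handed to the kernel with write() stays in the page cache and still reaches disk, so this exercises crash handling at the application level, not the loss of unsynced data that a real power cut causes.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time

RECORD_LEN = 9  # eight digits plus a newline

# Hypothetical stand-in for a database writer: appends fixed-size records forever.
WRITER_SOURCE = r"""
import os, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
i = 0
while True:
    os.write(fd, b"%08d\n" % i)
    os.fsync(fd)  # flush each record, as a durable configuration would
    i += 1
"""


def kill_writer_and_inspect(run_seconds=0.2, startup_timeout=10.0):
    """Start the writer, SIGKILL it under load, and return (exit code, records)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.log")
        proc = subprocess.Popen([sys.executable, "-c", WRITER_SOURCE, path])
        try:
            # Wait until at least one full record exists, then let it run a bit.
            deadline = time.monotonic() + startup_timeout
            while not (os.path.exists(path) and os.path.getsize(path) >= RECORD_LEN):
                if time.monotonic() > deadline:
                    raise RuntimeError("writer never produced a record")
                time.sleep(0.01)
            time.sleep(run_seconds)
        finally:
            proc.kill()  # sends SIGKILL on POSIX; the writer gets no chance to clean up
            proc.wait()
        with open(path, "rb") as f:
            data = f.read()
    # Ignore a possibly torn final record and parse only the complete ones.
    complete = data[: len(data) - len(data) % RECORD_LEN]
    records = [int(line) for line in complete.split(b"\n") if line]
    return proc.returncode, records


if __name__ == "__main__":
    code, records = kill_writer_and_inspect()
    print("writer exit code:", code)
    print("complete records found:", len(records))
    print("gap-free sequence:", records == list(range(len(records))))
```

Against a real database, the inspection step would be replaced by starting the DBMS again and letting its own recovery run.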

The state of the data remaining after the test depends on the storage and the DBMS. Either may have a journal it can replay, or it may not. You probably want to run fsck or an equivalent on the file system. If the database can recover to a consistent point in time, from logs or otherwise, you may want to do that. If you have an integrity checker for the DBMS, use it as a sanity check.
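
One additional sanity check, beyond fsck and the DBMS's own integrity checker, is to compare what the database acknowledged before the crash with what it contains after recovery. This is a rough sketch, assuming the load generator logs the ID of every commit the DBMS acknowledged somewhere that survives the crash (for example on another host); the function name and the ID lists are hypothetical.

```python
def compare_after_crash(acknowledged_ids, recovered_ids):
    """Compare commits acknowledged before the crash with rows present after recovery.

    Returns (lost, unacknowledged):
      lost           - acknowledged commits missing after recovery; with a durable
                       configuration this list is expected to be empty
      unacknowledged - rows present although the client never saw the acknowledgement;
                       usually harmless, since the crash can land between the
                       commit becoming durable and the reply reaching the client
    """
    acked = set(acknowledged_ids)
    recovered = set(recovered_ids)
    lost = sorted(acked - recovered)
    unacknowledged = sorted(recovered - acked)
    return lost, unacknowledged


if __name__ == "__main__":
    # Illustrative data only: commit 3 was acknowledged but is gone after recovery.
    lost, unacknowledged = compare_after_crash([1, 2, 3, 4], [1, 2, 4, 5])
    print("lost:", lost)
    print("unacknowledged:", unacknowledged)
```

A non-empty "lost" list under a configuration that claims durability is the outcome this kind of test is meant to surface.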

Hopefully you have already done a restore test of your backup, just in case. Do not assume that just because something claims crash recovery, it works in all situations.
