I have been using a suite of filesystem testing tools to benchmark and abuse a GlusterFS volume. The volume is a replica 3 volume spread out over 6 hosts.
Fio, iozone, and Bonnie all indicate that Gluster is working just fine: throughput is roughly equal to that of the client and server network adapters, so performance can't really be improved. Most of my test cases operated on 32 GB files, apart from iozone and Bonnie.
I have gotten reports of split brain occurring for certain files which are being concurrently written to by multiple clients. All of the documentation I have read seems to indicate that split brain largely occurs when network partitions happen, and this is clearly not the case, judging from the logs.
Unfortunately, this split brain seems to occur only when using a certain hosted service, and I have zero introspection into how that service operates, what version of Gluster client it has, etc. The servers are running the latest 4.0 release.
Judging from the failure case I have been presented with ("split brain happens when two containers are writing to the same file at the same time"), I need a test that will reproduce a similar situation.
I could definitely write my own test case in C or Rust, but is there something out there which will test this exact case without having to write anything?
I do have access to (but no introspection into) this hosted service, so I will probably test through it as well. I'm also scratching my head at the actual problem: what is the desired outcome when two programs write different data to the same file at the same time?
EDIT: The servers are running the latest CentOS 7 release, as is my test client. The underlying filesystem is XFS.
Is there a specific test case that I can use to try to recreate the problem?
Sounds like you have a PHP app whose error log is getting corrupted. The most realistic test, then, would be to spawn multiple PHP processes that call error_log() in parallel.
You could trace the app while it writes the error log, or read its source code, to find out exactly how it opens the file. Particularly interesting is whether it opens the log in append mode (O_APPEND). Append has race conditions on NFS, so append mode does not necessarily avoid the problem on network filesystems.
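If you can attach to one of the running PHP processes (the PID below is just a placeholder), a quick way to see the open flags and write pattern is something like:

    # follow forks and show how the log file is opened and written to
    strace -f -e trace=open,openat,write -p 12345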
Consider switching error_log to syslog and letting your local syslogd forward to a central syslog server instead. That reduces the log to a single file writer. Or forward to a log analytics platform such as Graylog, ELK, or Splunk, which have proper databases behind them.
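A minimal sketch of that change, assuming PHP reads a standard php.ini and the servers run rsyslog (the central host name is a placeholder):

    ; php.ini: send PHP errors to the local syslog daemon instead of a file
    error_log = syslog

    # /etc/rsyslog.conf: forward all messages to a central collector over UDP
    *.* @loghost.example.com:514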
Just create two separate fio jobs that do direct I/O to the same file, which is controlled by the filename parameter. Make the size of the file somewhat small, have one or both jobs write randomly, and perhaps set each job to use a different block size. Bonus points for using fio's client/server mode so the jobs come from different machines. Use runtime and time_based to keep fio looping.
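A sketch of such a job file, with the filename, size, and block sizes as illustrative guesses for a Gluster mount:

    ; two-writers.fio: two jobs writing the same small file with different patterns
    [global]
    filename=/mnt/gluster/contended.dat
    size=64m
    direct=1
    time_based=1
    runtime=300

    [seq-writer]
    rw=write
    bs=4k

    [rand-writer]
    rw=randwrite
    bs=64k

To get writers on two different machines, the simplest approach is to run a copy of the job file on each client mount at the same time; fio's --server/--client mode can also drive the jobs remotely if you prefer to coordinate everything from one box.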