CentOS 5.x
My question seemed similar to this one, but I wasn't sure...
I have two servers (completely isolated from each other), each with a directory and sub-directories that should have the same exact contents.
For example the directory layout could be something like:
SERVER A -
/opt/foo/foob/1092380298309128301283/123.txt
/opt/foo/foob/5094380298309128301283/456.txt
/opt/foo/foob/5092380298309128301283/789.txt
/opt/foo/foob/1592380298309128301283/abc.txt
SERVER B -
/opt/foo/foob/1092380298309128301283/123.txt
/opt/foo/foob/5094380298309128301283/456.txt
/opt/foo/foob/5092380298309128301283/789.txt
/opt/foo/foob/1592380298309128301283/abc.txt
Ideally I'd like a way to do a recursive check and have something confirm that everything matches.
I also want to avoid using any third-party tools.
Any ideas?
One good way is to use md5sums on every file in the tree:
Run this on server1:
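Something along these lines should work (a sketch; the output filename is illustrative):
find /opt/foo -type f -exec md5sum {} + | sort -k 2 > /tmp/hashes-a.txt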
Run this on server2:
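Again a sketch, same assumptions:
find /opt/foo -type f -exec md5sum {} + | sort -k 2 > /tmp/hashes-b.txt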
Then just compare the two files (using diff) or whatever you like.
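For example, using the illustrative filenames from above:
diff /tmp/hashes-a.txt /tmp/hashes-b.txt
Sorting by the filename field beforehand keeps the diff stable even if find walks the two trees in different orders.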
Is that along the lines of what you're looking for?
Of course, you can use SSH to just execute the command remotely if you want.
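For instance, to collect the second set of hashes from servera in one go (hostname illustrative):
ssh serverb 'find /opt/foo -type f -exec md5sum {} +' | sort -k 2 > /tmp/hashes-b.txt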
If you don't necessarily care about what changed, just that something has changed, rsync is still really good for that. Try running this command and take a gander at the output, assuming this is run from 'servera'.
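A dry-run along these lines, for example (hostname and paths illustrative):
rsync -avn /opt/foo/ serverb:/opt/foo/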
The resulting list will be those files that would have been modified if you actually ran the sync process. Keep in mind that files will show up in the list even if only the timestamp changed while the contents stayed the same. Since we added the -n flag, no actions will actually be performed, only reported.

While you could hack together a quick script that calculates individual MD5 hashes for individual files in a directory, the better way to do it would be to use a tool called md5deep, which will recursively calculate the hashes of all files in a directory and then output them to a file. That file can then be used on another directory, taking the first hash file as an input and providing you with a list of files that are different between the two directories. So, taking your example, you would follow this process:
Calculate hashes of the required directory on Server A:
md5deep -r /opt/foo/ > file_hashes.txt
Copy the file_hashes.txt file onto Server B for comparison (an scp example follows below). Calculate hashes of the required directory on Server B, taking the file hashes from Server A as an input file and using the -x flag to only show files that are different:
md5deep -x file_hashes.txt -r /opt/foo/
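For the copy step, scp works fine (hostname illustrative):
scp file_hashes.txt serverb: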
The md5deep set of tools is packaged by most distros, and the great thing is that it supports a number of different hashing algorithms, not just MD5. So if you're paranoid about collisions, you have a number of alternatives available. The following tools form part of the md5deep suite, each providing an alternative hashing algorithm: sha1deep, sha256deep, tigerdeep, and whirlpooldeep.
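Swapping algorithms is just a matter of using the corresponding tool; the flags are the same, so the process above becomes, for example:
sha256deep -r /opt/foo/ > file_hashes.txt
sha256deep -x file_hashes.txt -r /opt/foo/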
I used a technique similar to @scott-pack's (command sketched below). This will give you two-way diffing. Everything that starts with "deleting" is a file that is on the remote server but not the local server. Every directory listed without any file contents is one that has no changes. Every file that is listed either doesn't exist on the remote server, or the local version is "newer".
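Given that output, it was presumably a verbose dry-run with deletions reported, something like (hostname illustrative):
rsync -avn --delete /opt/foo/ serverb:/opt/foo/
The --delete flag is what surfaces the remote-only files as "deleting" lines, while the normal transfer list covers files that are local-only or changed.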