CentOS 5.x
My question seemed similar to this one, but I wasn't sure...
I have two servers (completely isolated from each other), each with a directory and sub-directories that should have the same exact contents.
For example the directory layout could be something like:
SERVER A -
/opt/foo/foob/1092380298309128301283/123.txt
/opt/foo/foob/5094380298309128301283/456.txt
/opt/foo/foob/5092380298309128301283/789.txt
/opt/foo/foob/1592380298309128301283/abc.txt
SERVER B -
/opt/foo/foob/1092380298309128301283/123.txt
/opt/foo/foob/5094380298309128301283/456.txt
/opt/foo/foob/5092380298309128301283/789.txt
/opt/foo/foob/1592380298309128301283/abc.txt
Ideally I'd like a way to do a recursive check and have something confirm that everything matches.
I also want to avoid using any third-party tools.
Any ideas?
One good way is to use md5sums on every file in the tree:
Run this on server1:
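Something along these lines should work (a sketch; the output filename is illustrative):
find /opt/foo -type f -exec md5sum {} + | sort -k 2 > /tmp/hashes-a.txt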
Run this on server2:
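Again a sketch, same assumptions:
find /opt/foo -type f -exec md5sum {} + | sort -k 2 > /tmp/hashes-b.txt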
Then just compare the two files (using diff) or whatever you like.
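For example, using the illustrative filenames from above:
diff /tmp/hashes-a.txt /tmp/hashes-b.txt
Sorting by the filename field beforehand keeps the diff stable even if find walks the two trees in different orders.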
Is that along the lines of what you're looking for?
Of course, you can use SSH to just execute the command remotely if you want.
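For instance, to collect the second set of hashes from servera in one go (hostname illustrative):
ssh serverb 'find /opt/foo -type f -exec md5sum {} +' | sort -k 2 > /tmp/hashes-b.txt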
If you don't necessarily care about what changed, just that something has changed, rsync is still really good for that. Try running this command and take a gander at the output, assuming this is run from 'servera'.
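A dry-run along these lines, for example (hostname and paths illustrative):
rsync -avn /opt/foo/ serverb:/opt/foo/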
The resulting list will be those files that would have been modified if you actually ran the sync process. Keep in mind that files will show up in the list even if only the timestamp changed while the contents stayed the same. Since we added the -n flag, no actions will actually be performed, only reported.

While you could hack together a quick script that calculates individual MD5 hashes for individual files in a directory, the better way to do it would be to use a tool called md5deep, which will recursively calculate the hashes of all files in a directory and then output them to a file. That file can then be used on another directory, taking the first hash file as an input and providing you with a list of files that are different between the two directories. So, taking your example, you would follow this process:
Calculate hashes of the required directory on Server A:
md5deep -r /opt/foo/ > file_hashes.txt
Copy the file_hashes.txt file onto Server B for comparison (an scp example follows below). Calculate hashes of the required directory on Server B, taking the file hashes from Server A as an input file and using the -x flag to only show files that are different:
md5deep -x file_hashes.txt -r /opt/foo/
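For the copy step, scp works fine (hostname illustrative):
scp file_hashes.txt serverb: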
The md5deep set of tools is packaged by most distros, and the great thing is that it supports a number of different hashing algorithms, not just MD5. So if you're paranoid about collisions, you have a number of alternatives available. The following tools form part of the md5deep suite, each providing an alternative hashing algorithm: sha1deep, sha256deep, tigerdeep, and whirlpooldeep.
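Swapping algorithms is just a matter of using the corresponding tool; the flags are the same, so the process above becomes, for example:
sha256deep -r /opt/foo/ > file_hashes.txt
sha256deep -x file_hashes.txt -r /opt/foo/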
I used a technique similar to @scott-pack's (command sketched below). This will give you two-way diffing. Everything that starts with "deleting" is a file that is on the remote server but not the local server. Every directory listed without any file contents is one that has no changes. Every file that is listed either doesn't exist on the remote server, or the local version is "newer".
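Given that output, it was presumably a verbose dry-run with deletions reported, something like (hostname illustrative):
rsync -avn --delete /opt/foo/ serverb:/opt/foo/
The --delete flag is what surfaces the remote-only files as "deleting" lines, while the normal transfer list covers files that are local-only or changed.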