I have two text files and want to find the differences between them using Windows PowerShell. Is there something similar to the Unix diff tool available? Or is there another way I haven't considered?
I've tried compare-object, but get this cryptic output:
PS C:\> compare-object one.txt two.txt
InputObject SideIndicator
----------- -------------
two.txt     =>
one.txt     <=
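Figured it out myself. Because PowerShell works with .NET objects rather than text, compare-object was comparing the two file-name strings themselves, which is why the output only lists the names. You need get-content to expose the contents of the text files. So to do what I was attempting in the question, pass compare-object the output of get-content for each file instead of the bare file names.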
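A minimal sketch of that fix. The file names are the ones from the question; the temporary folder and the sample contents are made up for illustration.

```powershell
# Create two small sample files in a temporary folder (contents are illustrative).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'ps-diff-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$one = Join-Path $dir 'one.txt'
$two = Join-Path $dir 'two.txt'
Set-Content -Path $one -Value 'apple', 'banana', 'cherry'
Set-Content -Path $two -Value 'apple', 'blueberry', 'cherry'

# Get-Content turns each file into an array of lines, so Compare-Object
# compares the lines instead of the two file-name strings.
$differences = Compare-Object (Get-Content $one) (Get-Content $two)
$differences
```

In the result, <= marks a line found only in the first (reference) file and => a line found only in the second.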
A simpler way of doing it is to use the built-in aliases and write diff (gc file1) (gc file2).
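A sketch of the short form against two throwaway files (contents made up). It assumes Windows PowerShell, where diff is an alias for Compare-Object and gc is an alias for Get-Content:

```powershell
# Throwaway files with made-up contents (New-TemporaryFile needs PowerShell 5.0+).
$file1 = (New-TemporaryFile).FullName
$file2 = (New-TemporaryFile).FullName
Set-Content -Path $file1 -Value 'red', 'green'
Set-Content -Path $file2 -Value 'red', 'blue'

# Long form: Compare-Object (Get-Content $file1) (Get-Content $file2)
$short = diff (gc $file1) (gc $file2)
$short
```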
Or you could use the DOS fc command (this shows the output of both files, so you will have to scan for the differences). In PowerShell, fc is an alias for the Format-Custom cmdlet, so be sure to enter the command as fc.exe. Please note that many DOS utilities don't handle UTF-8 encoding.
You can also spawn a CMD process and run fc within it. This instructs PowerShell to start a process with the 'cmd' program using the parameters in quotes. Inside the quotes is the '/c' cmd option, which runs the command and then terminates. The actual command run by cmd in the process is fc filea.txt fileb.txt, redirecting the output to the file diff.txt.
You can use the DOS fc.exe from within powershell.
diff on *nix is not part of the shell, but a separate application.
Is there any reason you can't just use diff.exe under PowerShell?
You can download a version from the UnxUtils package (http://unxutils.sourceforge.net/)
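Either external tool can be called from a PowerShell prompt like any other program. A minimal sketch, assuming fc.exe is available (it ships with Windows) and using made-up sample files; the exit-code meanings in the comments are the ones reported elsewhere in this thread.

```powershell
# Made-up sample files in the temp folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'ps-fc-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$fileA = Join-Path $dir 'filea.txt'
$fileB = Join-Path $dir 'fileb.txt'
Set-Content -Path $fileA -Value 'first', 'second', 'third' -Encoding Ascii
Set-Content -Path $fileB -Value 'first', 'changed', 'third' -Encoding Ascii

# Use the .exe suffix: plain "fc" resolves to the Format-Custom alias.
if (Get-Command fc.exe -CommandType Application -ErrorAction SilentlyContinue) {
    fc.exe /N $fileA $fileB      # /N prefixes each line with its line number
    $status = $LASTEXITCODE      # 0 = files match, 1 = files differ
    "fc.exe exit code: $status"
}

# A Unix-style diff.exe on the PATH (for example from UnxUtils) is called the same way:
#   diff.exe $fileA $fileB
```

Checking $LASTEXITCODE rather than parsing the text output is the simpler way to script a "same or different" decision.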
compare-object (aka diff alias) is pathetic if you expect it to behave something like a unix diff. I tried the diff (gc file1) (gc file2), and if a line is too long, I can't see the actual diff and more importantly, I can't tell which line number the diff is on.
When I try adding -passthru, I now can see the difference, but I lose which file the difference is in, and I still don't get a line number.
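Some of that information can be recovered, with caveats. As far as I know, the strings emitted by Get-Content carry a ReadCount note property holding the line number, and Compare-Object -PassThru tags each object it passes through with a SideIndicator note property. A sketch that relies on both behaviours, with made-up sample files:

```powershell
# Throwaway files with made-up contents (New-TemporaryFile needs PowerShell 5.0+).
$ref = (New-TemporaryFile).FullName
$dif = (New-TemporaryFile).FullName
Set-Content -Path $ref -Value 'alpha', 'beta', 'gamma'
Set-Content -Path $dif -Value 'alpha', 'delta', 'gamma'

# Format each differing line as: side indicator, line number, full text.
$report = Compare-Object (Get-Content $ref) (Get-Content $dif) -PassThru |
    ForEach-Object { '{0} line {1}: {2}' -f $_.SideIndicator, $_.ReadCount, $_ }
$report

Remove-Item $ref, $dif
```

Because each entry is a plain string built with the format operator, long lines are not cut off by the table formatter.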
My advice, don't use powershell to find differences in files. As someone else noted, fc works, and works a little better than compare-object, and even better is downloading and using real tools like the unix emulator that Mikeage mentioned.
WinMerge is another good GUI-based diff tool.
As others have noted, if you were expecting a unix-y diff output, using the powershell diff alias would let you down hard. For one thing, you have to hold its hand in actually reading files (with gc / get-content). For another, the difference indicator is on the right, far from the content -- it's a readability nightmare.
The solution for anyone looking for sane output is to install a real diff (I use the one from GnuWin32) and add a line to your PowerShell profile that removes the built-in diff alias with -force.
The -force argument is required because Powershell is quite precious about this particular inbuilt alias. If anyone is interested, having GnuWin32 installed, I also include the following in my powershell profile:
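A sketch of what such profile lines might look like. Which aliases to drop besides diff is a matter of taste; rm and ls here are only examples, and the sketch assumes the GnuWin32 bin folder is already on the PATH.

```powershell
# Lines for the PowerShell profile script ($PROFILE).
# Remove the built-in aliases so the names fall through to the executables on the PATH.
foreach ($name in 'diff', 'rm', 'ls') {
    if (Test-Path "Alias:$name") {
        # -Force is the switch this answer says is required for the diff alias.
        Remove-Item "Alias:$name" -Force
    }
}
```

After this runs, typing diff starts whatever diff.exe is found on the PATH instead of Compare-Object.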
Mainly because Powershell doesn't understand arguments which are run together and typing, for example "rm -Force -Recurse" is a lot more effort than "rm -rf".
Powershell has some nice features, but there are some things it should just not try to do for me.
fc.exe is better for text comparing since it is designed to work like *nix diff, i.e. it compares lines sequentially, showing the actual differences and trying to re-synchronise (if the differing sections have different lengths). It also has some useful control options (text/binary, case sensitivity, line numbers, resynchronisation length, mismatch buffer size) and provides an exit status (-1 bad syntax, 0 files same, 1 files differ, 2 file missing).
Being a (very) old DOS utility, it does have a few limitations. Most notably, it does not automatically work with Unicode, treating the 0 MSB of ASCII characters as a line terminator so the file becomes a sequence of 1 character lines (@kennycoc: use the /U option to specify BOTH files are Unicode, WinXP onwards). It also has a hard line buffer size of 128 characters (128 bytes ASCII, 256 bytes Unicode), so long lines get split up and compared separately.
compare-object is designed to determine if 2 objects are member-wise identical. If the objects are collections then they are treated as SETS (see help compare-object), i.e. UNORDERED collections without duplicates. 2 sets are equal if they have the same member items irrespective of order or duplications. This severely limits its usefulness for comparing text files for differences.
Firstly, the default behaviour collects the differences until the entire object (file = array of strings) has been checked, thus losing the information regarding the position of the differences and obscuring which differences are paired (and there is no concept of line number for a SET of strings). Using -syncwindow 0 will cause the differences to be emitted as they occur, but stops it from trying to re-synchronise, so if one file has an extra line then subsequent line comparisons can fail even though the files are otherwise identical (until there is a compensatory extra line in the other file, thereby realigning the matching lines).
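The effect of the synchronisation window can be seen with two small in-memory arrays (made-up values) that hold the same items in a different order:

```powershell
$first  = 'a', 'b', 'c'
$second = 'c', 'b', 'a'

# Default window: every item finds a match somewhere in the other array,
# so no differences are reported even though the order differs.
$default = @(Compare-Object $first $second)

# -SyncWindow 0: items are only matched against the same position,
# so the reordered items are reported as differences.
$strict = @(Compare-Object $first $second -SyncWindow 0)

"Default window: $($default.Count) differences"
"SyncWindow 0:   $($strict.Count) differences"
```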
However, powershell is extremely versatile and a useful file compare can be done by utilising this functionality, albeit at the cost of substantial complexity and with some restrictions upon the content of the files. If you need to compare text files with long (> 127 character) lines and where the lines mostly match 1:1 (some changes in lines between files but no duplications within a file such as a text listing of database records having a key field) then by adding information to each line indicating in which file it is, its position within that file and then ignoring the added information during comparison (but including it in the output) you can get a *nix diff like output as follows (alias abbreviations used):
where xx is the length of the longest line + 9
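A sketch of that pipeline, pieced together from the explanation in this answer and written with full cmdlet names instead of aliases. The function name and the sample file contents are hypothetical, the counter is incremented in a separate statement rather than inline, and both files are assumed to be non-empty.

```powershell
# Hypothetical helper: tags every line with its number and a file marker,
# compares while ignoring the tag, then sorts the differences back into line order.
function Compare-NumberedLines {
    param(
        [Parameter(Mandatory)] [string] $Path1,
        [Parameter(Mandatory)] [string] $Path2
    )
    # The prefix is 9 characters: a 6-character line number plus '<<:' or '>>:'.
    $left  = Get-Content $Path1 |
        ForEach-Object -Begin { $ln = 0 } -Process { $ln++; '{0,6}<<:{1}' -f $ln, $_ }
    $right = Get-Content $Path2 |
        ForEach-Object -Begin { $ln = 0 } -Process { $ln++; '{0,6}>>:{1}' -f $ln, $_ }

    # Compare on the text after the prefix, but pass the prefixed strings through.
    Compare-Object @($left) @($right) -Property { $_.Substring(9) } -PassThru |
        Sort-Object
}

# Usage with two throwaway files (made-up contents).
$f1 = (New-TemporaryFile).FullName
$f2 = (New-TemporaryFile).FullName
Set-Content -Path $f1 -Value 'alpha', 'beta', 'gamma'
Set-Content -Path $f2 -Value 'alpha', 'delta', 'gamma'

$result = Compare-NumberedLines $f1 $f2
$result | Out-String -Width 200   # 200 stands in for xx
```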
Explanation
(gc file | % -begin { $ln=0 } -process { '{0,6}<<:{1}' -f ++$ln,$_ }) gets the content of the file and prepends the line number and file indicator (<< or >>) to each line (using the format string operator) before passing it to diff.
-property { $_.substring(9) } tells diff to compare each pair of objects (strings) ignoring the first 9 characters (which are the line number and file indicator). This utilises the ability to specify a calculated property (the value of a script block) instead of the name of a property.
-passthru causes diff to output the differing input objects (which include the line number and file indicator) instead of the differing compared objects (which don't).
sort-object then puts all the lines back into sequence.
out-string stops the default truncation of the output to fit the screen width (as noted by Marc Towersap) by specifying a width big enough to avoid truncation. Normally, this output would be put into a file which is then viewed using a scrolling editor (e.g. notepad).
Note
The line number format {0,6} gives a right justified, space padded 6 character line number (for sorting). If the files have more than 999,999 lines then simply change the format to be wider. This also requires altering the $_.substring parameter (3 more than the line number width) and the out-string xx value (maximum line length + $_.substring parameter).
There's also Windiff which provides a GUI diff interface (great for use with GUI based CVS/SVN programs)
Powershell is awkward at best and a sad replacement for diff -y. I came here looking for how it could work, and ended up opening the files in notepad++. Exactly what I needed.