comm might do what you want. From its man page:
Compare sorted files FILE1 and FILE2 line by line.
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
These columns are suppressible with -1, -2 and -3 respectively.
Example:
[root@dev ~]# cat a
common
shared
unique
[root@dev ~]# cat b
common
individual
shared
[root@dev ~]# comm -3 a b
	individual
unique
And if you just want the unique lines and don't care which file they're in:
[root@dev ~]# comm -3 a b | sed 's/^\t//'
individual
unique
As the man page says, the files must be sorted beforehand.
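As a small sketch of how the column options combine, the snippet below recreates the two example files in a scratch directory and prints only the lines common to both. The variable names and the use of mktemp are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Recreate the two (already sorted) example files from the answer.
printf 'common\nshared\nunique\n' > a
printf 'common\nindividual\nshared\n' > b

# With no options all three columns are printed; columns two and three
# are indented with one and two TABs respectively.
comm a b

# -1 and -2 together suppress both "unique" columns,
# leaving only the lines present in both files.
common=$(comm -12 a b)
printf '%s\n' "$common"
```

The same pattern works for any pair of option digits: whatever columns are not suppressed are what remains.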
Visual comparison tools fit two files together so that a segment with the same number of lines but differing content will be considered a changed segment. Completely new lines between matching segments are considered added segments.
This is also how the sdiff command-line tool works, which shows a side-by-side comparison of two files in a terminal. Changed lines are separated by a | character. If a line exists only in file A, < is used as the separator character. If a line exists only in file B, > is used as the separator. If you don't have < and > characters in the files, you can use this to show only added lines:
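A minimal sketch of the kind of filter described here, assuming the separator characters are matched with grep. The file names A and B and their contents are made up for illustration, and the exact command in the original answer may have differed.

```shell
#!/bin/sh
# Scratch directory with two small hypothetical files.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a\nb\nc\n' > A
printf 'a\nc\nd\n' > B

# Keep only the rows whose separator marks a line as present in one file.
# This relies on the file contents themselves not containing < or >.
sdiff A B | grep '[<>]'

# Keep only rows for lines that exist solely in B (the added lines).
sdiff A B | grep '>'
```

Rows with the | separator (changed lines) are dropped by both filters, so this only reports pure additions and deletions.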
No, diff doesn't actually show the differences between two files in the way one might think. It produces a sequence of editing commands for a tool like patch to use to change one file into another.
The difficulty for any attempt at doing what you're looking for is how to define what constitutes a line that has changed versus a deleted one followed by an added one. Also what to do when lines are added, deleted and changed adjacent to each other.
Thanks senarvi, your solution (not voted for) actually gave me EXACTLY what I wanted after looking for ages on a ton of pages.
Using your answer, here is what I came up with to get the list of things changed/added/deleted. The example uses 2 versions of the /etc/passwd file and prints out the username for the relevant records.
diff --changed-group-format='-%<+%>' --unchanged-group-format='' f g
Example:
printf 'a\nb\nc\nd\ne\nf\ng\n' > f
printf 'a\nB\nC\nd\nE\nF\ng\n' > g
diff --old-line-format=$'-%l\n' \
--new-line-format=$'+%l\n' \
--unchanged-line-format='' \
f g
Output:
-b
-c
+B
+C
-e
-f
+E
+F
So it shows old lines with - followed immediately by the corresponding new line with +.
If c is missing from f, so that C in g is a pure addition:
printf 'a\nb\nd\ne\nf\ng\n' > f
printf 'a\nB\nC\nd\nE\nF\ng\n' > g
diff --old-line-format=$'-%l\n' \
--new-line-format=$'+%l\n' \
--unchanged-line-format='' \
f g
it looks like this:
-b
+B
+C
-e
-f
+E
+F
The format is documented at man diff:
--line-format=LFMT
format all input lines with LFMT
and:
LTYPE is 'old', 'new', or 'unchanged'.
GTYPE is LTYPE or 'changed'.
and:
LFMT (only) may contain:
%L contents of line
%l contents of line, excluding any trailing newline
[...]
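Since %L already includes the trailing newline, the same result can be sketched without the $'...' quoting, which not every shell supports. This assumes GNU diff; the scratch directory and the variable name are illustrative.

```shell
#!/bin/sh
# Scratch directory with the same f and g as in the example above.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a\nb\nc\nd\ne\nf\ng\n' > f
printf 'a\nB\nC\nd\nE\nF\ng\n' > g

# %L is the line contents including its newline, so plain single quotes suffice.
# diff exits with status 1 when the files differ, hence the "|| true".
with_L=$(diff --old-line-format='-%L' \
              --new-line-format='+%L' \
              --unchanged-line-format='' \
              f g || true)
printf '%s\n' "$with_L"
```

One difference to keep in mind: if the last line of a file lacks a newline, %L prints it without one, whereas the %l form always appends the newline given in the format.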
Try comm
Another way to look at it:
Show lines that only exist in file a (i.e. what was deleted from a): comm -23 a b
Show lines that only exist in file b (i.e. what was added to b): comm -13 a b
Show lines that only exist in one file or the other (but not both): comm -3 a b | sed 's/^\t//'
(Warning: If file a has lines that start with TAB, the first TAB will be removed from the output.)
Sorted files only
NOTE: Both files need to be sorted for comm to work properly. If they aren't already sorted, you should sort them first. If the files are extremely long, this may be quite a burden as it requires an extra copy and therefore twice as much disk space.
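The three cases and the sorting step can be sketched end to end as follows. The file contents, the .sorted names and the use of LC_ALL=C are assumptions for illustration; the point of the locale setting is only that sort and comm must agree on the ordering.

```shell
#!/bin/sh
# Scratch directory with two unsorted hypothetical inputs.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'unique\ncommon\nshared\n' > a
printf 'shared\nindividual\ncommon\n' > b

# Sort into separate copies, using the same collation comm will use.
LC_ALL=C sort a > a.sorted
LC_ALL=C sort b > b.sorted

# Lines only in a (deleted from a).
only_a=$(LC_ALL=C comm -23 a.sorted b.sorted)
# Lines only in b (added to b).
only_b=$(LC_ALL=C comm -13 a.sorted b.sorted)
# Lines in exactly one of the files; strip the TAB that marks column two.
tab=$(printf '\t')
either=$(LC_ALL=C comm -3 a.sorted b.sorted | sed "s/^$tab//")

printf 'only in a: %s\n' "$only_a"
printf 'only in b: %s\n' "$only_b"
printf '%s\n' "$either"
```

In bash, process substitution such as comm -3 <(sort a) <(sort b) avoids keeping the sorted copies around, although sort may still use temporary space for very large inputs.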
To show additions and deletions without context, line numbers, +, -, <, >, ! etc., you can use diff like this:
For example, given two files:
a.txt
b.txt
The following command will show lines either removed from a or added to b:
output:
This slightly different command will show lines removed from a.txt:
output:
Finally, this command will show lines added to b.txt:
output:
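One way to get this behaviour with GNU diff is through its group-format options, where %< stands for the lines from the first file and %> for the lines from the second. The sketch below is an assumption about the kind of command meant here, and the contents of a.txt and b.txt are invented for illustration.

```shell
#!/bin/sh
# Scratch directory with two small hypothetical files.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a\nb\nc\nd\n' > a.txt
printf 'a\nc\nd\ne\n' > b.txt

# diff exits with status 1 when the files differ, hence the "|| true".

# Lines either removed from a.txt or added to b.txt, with no markers.
both=$(diff --changed-group-format='%<%>' --unchanged-group-format='' a.txt b.txt || true)

# Only lines removed from a.txt.
removed=$(diff --changed-group-format='%<' --unchanged-group-format='' a.txt b.txt || true)

# Only lines added to b.txt.
added=$(diff --changed-group-format='%>' --unchanged-group-format='' a.txt b.txt || true)

printf 'both:\n%s\nremoved:\n%s\nadded:\n%s\n' "$both" "$removed" "$added"
```

When only --changed-group-format is given, GNU diff also uses it for groups that exist in just one file, which is why a single option covers pure additions and deletions too.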
That's what diff does by default... Maybe you need to add some flags to ignore whitespace?
should ignore blank lines and different numbers of spaces.
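The flags are not shown above; a plausible reading, offered here only as an assumption, is diff's -b (ignore changes in the amount of whitespace) and -B (ignore changes whose lines are all blank). A small sketch with invented file contents:

```shell
#!/bin/sh
# Scratch directory with two files that differ only in spacing and a blank line.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'foo  bar\n\nbaz\n' > a
printf 'foo bar\nbaz\n' > b

# Plain diff reports the spacing and blank-line differences.
plain=$(diff a b || true)

# With -b and -B those differences are ignored and nothing is printed.
relaxed=$(diff -b -B a b || true)

printf 'plain diff output:\n%s\n' "$plain"
printf 'diff -b -B output:\n%s\n' "$relaxed"
```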
I find the diff line-format form shown earlier (the example with files f and g) often useful. It shows old lines with - followed immediately by the corresponding new lines with +, and the format is documented in man diff.
Related question: https://stackoverflow.com/questions/15384818/how-to-get-the-difference-only-additions-between-two-files-in-linux
Tested in Ubuntu 18.04.
We can combine diff and sed to achieve what you want. Let's take the same example from https://serverfault.com/a/68717/947477
To show added lines with + and deleted lines with - we can use
Here, -u is for printing unified content and sed will filter only outputs with - or + at the beginning.
A more straightforward answer is
File1:
File2:
Use:
This shows two columns, one for each file.
Output: