I am looking for an application that can compare two C++ sources and find the code-meaningful differences (to compare versions which may have been reformatted differently). At the very minimum, it should be able to ignore changes in spaces, tabs and newlines which do not affect the functionality of the source (note that whether a newline counts as whitespace is language-dependent; in C and C++ it does). Ideally, it would identify exactly the code-meaningful differences and nothing else. I am on Ubuntu.
As per diff --help | grep ignore, I expected diff -bBwZ to do the job reasonably well (I expected to get some false negatives, to be dealt with later).
Nevertheless, it doesn't.
If I have the following files with snippets:
test_diff1.txt
else if (prop == "P1") { return 0; }
and test_diff2.txt
else if (prop == "P1") {
return 0;
}
then
$ diff -bBwZ test_diff1.txt test_diff2.txt
1c1,3
< else if (prop == "P1") { return 0; }
---
> else if (prop == "P1") {
> return 0;
> }
instead of empty results.
Using a code formatter as a "filter" on both inputs may filter out these differences, but then the resulting output would have to be tied back to the original inputs, so that the final report of differences keeps the actual text and line numbers. So the objective should be attainable without needing a full compiler... I do not know whether such a tool is available, though.
Can the objective be attained with diff?
Otherwise, is there an alternative (preferably, for command line)?
You can use dwdiff. From man dwdiff:
The program is very clever - see dwdiff --help:
Test it with:
Then launch the comparison:
Please note 100% common above.
I doubt this is something that diff can do. If there are space changes within a line, then it will work (or other similar programs like kompare). At worst, you can do a search-and-replace and collapse tab characters, etc. But what you're asking for is whitespace changes beyond a line...
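To make the line-based versus word-based distinction concrete, here is a minimal Python sketch (not how diff or dwdiff are implemented; the function name word_diff is made up for illustration). It splits both texts on any run of whitespace, newlines included, and compares the resulting word lists with the standard difflib module.

```python
import difflib


def word_diff(old_text, new_text):
    """Compare two texts word by word.

    Any run of whitespace (spaces, tabs, newlines) counts as one
    separator, so moving a line break does not create a difference.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


# The two snippets from the question.
one_line = 'else if (prop == "P1") { return 0; }\n'
three_lines = 'else if (prop == "P1") {\nreturn 0;\n}\n'

print(word_diff(one_line, three_lines))                 # []
print(word_diff(one_line, one_line.replace("0", "1")))  # [('replace', ['0;'], ['1;'])]
```

The limits of this approach show up quickly: {return 0;} and { return 0; } are reported as different because the words are glued together differently, and whitespace inside a string literal is wrongly ignored.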
You would need a program that understands the C++ language. Note that all languages are different and Python, in particular, uses whitespace to define code blocks. As such, I doubt any general diff-like program would work with "any" (or a specific) programming language.
You might consider some kind of parser to go through the two source files and then compare the outputs of this parser.
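As a rough illustration of the "parse both files, then compare the parser outputs" idea, here is a small Python sketch. It assumes a deliberately crude regular-expression tokenizer, not a real C++ lexer, and the names TOKEN, tokens_with_lines and token_diff are hypothetical. Each token remembers the line it came from, so a reported difference can still point back into the original, unformatted files.

```python
import difflib
import re

# Crude, illustrative token pattern (an assumption, not a full C++ lexer).
TOKEN = re.compile(r'''
    //[^\n]*            |   # line comment
    /\*.*?\*/           |   # block comment
    "(?:\\.|[^"\\])*"   |   # string literal
    '(?:\\.|[^'\\])*'   |   # character literal
    [A-Za-z_]\w*        |   # identifier or keyword
    \d[\w.]*            |   # number (very rough)
    \S                      # any other single character
''', re.VERBOSE | re.DOTALL)


def tokens_with_lines(source):
    """Return (token, line_number) pairs, skipping comments."""
    result = []
    for match in TOKEN.finditer(source):
        text = match.group()
        if text.startswith("//") or text.startswith("/*"):
            continue  # comments do not change what the code does
        line = source.count("\n", 0, match.start()) + 1
        result.append((text, line))
    return result


def token_diff(old_source, new_source):
    """List token-level differences together with original line numbers."""
    old = tokens_with_lines(old_source)
    new = tokens_with_lines(new_source)
    matcher = difflib.SequenceMatcher(
        a=[tok for tok, _ in old], b=[tok for tok, _ in new], autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        report.append({
            "change": tag,
            "old_lines": sorted({line for _, line in old[i1:i2]}),
            "old_tokens": [tok for tok, _ in old[i1:i2]],
            "new_lines": sorted({line for _, line in new[j1:j2]}),
            "new_tokens": [tok for tok, _ in new[j1:j2]],
        })
    return report


old_src = 'else if (prop == "P1") { return 0; }\n'
reformatted = 'else if (prop == "P1") {\n    return 0; // done\n}\n'
changed = 'else if (prop == "P1") {\n    return 1;\n}\n'

print(token_diff(old_src, reformatted))  # []
print(token_diff(old_src, changed))
```

Because every operator character becomes its own token, this sketch cannot tell a+++b from a+ ++b, and it knows nothing about the preprocessor or raw string literals. A real lexer, for example one generated with Lex, would be needed for a trustworthy result.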
This is beyond my background, but I suggest you look into Lex and Yacc. These are Wikipedia pages; you might want to take a look at this page which gives a concise explanation and an example.
In a similar situation, when I needed to compare two git branches in a code-formatting-agnostic way, I did this:
created temporary branches:
formatted both branches using clang-format:
did the actual comparison:
(-w -b allows you to ignore whitespace differences, just in case).
You may prefer uncrustify over clang-format (uncrustify's mod_full_brace_if may be used to enforce insertion/removal of curly braces around a single-line if's body).
Also, if GNU parallel isn't installed, use xargs - it does the same, but takes a little longer.