I need to compare the contents of text files as a whole (not line-wise like the diff command) and print out the missing text. Is there a command to do so? Thanks in advance.
EDIT: As an example, say file1 has:
1 2 3
4 5
file2 has:
1 5
2 3 4 6
I want to compare these files and print as output:
6
The diff command compares the text files line by line, in which case almost the entire file will be printed out. (My actual files are more complicated and lengthy, so I'm giving a simple example.)
Since the order does not matter, you can use awk to print the unique "lines", treating any whitespace as the line separator:
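A sketch of such a command, matching the explanation below (GNU awk is assumed, since a regular expression as the record separator is a gawk extension; mawk 1.3.4+ accepts it too):

```shell
# Count tokens from file1, discount tokens from file2,
# then print tokens whose counts do not cancel out.
awk -v RS='[[:space:]]+' \
    'FNR == NR {a[$0]++; next} {a[$0]--} END {for (i in a) if (a[i] != 0) print i}' \
    file1 file2
```

With the sample files from the question this prints 6.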
Here:

- -v RS='[[:space:]]+' sets the record separator (RS) to any run of whitespace, so each "line" (record) is separated by any whitespace, including newlines.
- FNR == NR: FNR is the record number (or line number, if you will) within the current file, and NR is the overall record number across all input files. The two are equal only while reading the first file.
- {a[$0]++; next} sets and increments the count of appearances of the current "line", then moves on to the next record without processing any more rules. Because of the preceding condition, this block only runs for the first file; the next block applies to all other files.
- {a[$0]--} decrements the count of appearances of the current "line".
- END {for (i in a) if (a[i] != 0) print i}: at the END of all input, for each entry in the array a, print that entry if its count of appearances is not 0. So, any "line" seen an equal number of times in both files is skipped.

This is another solution:
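A sketch of that pipeline, assuming bash for the process substitution:

```shell
# Split each file into one token per line and sort it, then keep
# only the lines not common to both; sort -u removes duplicates.
comm -3 <(tr ' \t' '\n' < file1 | sort) <(tr ' \t' '\n' < file2 | sort) | sort -u
```

Note that comm indents lines unique to the second file with a leading tab; pipe through tr -d '\t' if you want them flush left.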
where:
where:

- tr ' \t' '\n' | sort replaces spaces and tabs with newlines and reorders the result.
- comm compares the sorted token streams of file1 and file2 line by line; with the -3 option it suppresses lines that appear in both files.
- sort -u, at last, removes duplicate lines; this is necessary in case of duplicate tokens.

In this case the output of tr ' \t' '\n' | sort is used as standard input for the comm command.

I'll assume the following: your files consist of data separated by spaces or newlines, AND you do not care about knowing where the data is missing (or you know it is always, e.g., in file2).
What we will do is simple: Replace every space with a newline in both files, concatenate them, then search for single (unique) entries only:
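A minimal sketch of those steps, using uniq -u to keep only entries that occur exactly once across both files:

```shell
# One token per line, both files concatenated, then keep
# only the entries that appear exactly once overall.
cat file1 file2 | tr ' ' '\n' | sort | uniq -u
```

For the sample files from the question this prints 6.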
Preparation of the files
If I understand correctly, you want to print the unique numbers or words in the comparison. I would convert the files so that each number/word is on its own line, sort them, remove blank lines and duplicates, and after that compare the files.

I assume that space characters separate the numbers or words. For each file:
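A sketch of the conversion for one file (filex and x here are placeholders for the input and output file names):

```shell
# One token per line, sorted, with duplicates and blank lines removed.
tr ' ' '\n' < filex | sort | uniq | grep -v '^$' > x
```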
See man tr for details.

If you want to sort numerically, you can add the option -n or -h, depending on the numeric format. See man sort for details.

In the example of the original question, I would use -n, where x can be a and b for the two files to compare, so for example:
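For instance, with the sample files from the question:

```shell
# Convert each file to one sorted number per line,
# writing the results to a and b.
tr ' ' '\n' < file1 | sort -n | uniq | grep -v '^$' > a
tr ' ' '\n' < file2 | sort -n | uniq | grep -v '^$' > b
```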
You can inspect these converted and sorted files, if you wish.
Finally, compare the files with the following command line:
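Presumably a plain diff of the two converted files a and b:

```shell
# Compare the converted, sorted files line by line.
diff a b
```

With the sample input from the question, the converted files hold 1-5 and 1-6 respectively, so this prints "5a6" followed by "> 6".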
This identifies the number that is only found in the second file of your sample input from the original question. See man diff for details, if you want a modified output from diff.

My example
This is an example with a unique number in each of the files.
This identifies the unique number in each of the files.
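For illustration, a hypothetical pair of files, each containing one number missing from the other (these sample values are mine, not from the original post):

```shell
# Two made-up files: 1 occurs only in file1, 5 only in file2.
printf '1 2 3 4\n' > file1
printf '2 3 4 5\n' > file2
tr ' ' '\n' < file1 | sort -n | uniq | grep -v '^$' > a
tr ' ' '\n' < file2 | sort -n | uniq | grep -v '^$' > b
diff a b
```

Here diff prints "< 1" and "> 5", the numbers unique to the first and second file respectively.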
The example in Lety's comment
In this case my method shows a different result compared to the methods of @Lety and @muru. Let us wait for the OP, @samhitha, to tell us what the desired output of the comparison is.
Another solution (python):
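One way to sketch this in Python, treating each file's contents as a multiset of whitespace-separated tokens via collections.Counter, so that repeated tokens are handled like in the awk approach (the function name missing_tokens is my own):

```python
from collections import Counter

def missing_tokens(text1, text2):
    """Return tokens whose multiset counts differ between the two texts."""
    c1, c2 = Counter(text1.split()), Counter(text2.split())
    # Counter subtraction keeps only positive counts, so summing both
    # directions gives the symmetric difference of the two multisets.
    return sorted(((c1 - c2) + (c2 - c1)).elements())

# Sample input from the question:
print(missing_tokens("1 2 3\n4 5", "1 5\n2 3 4 6"))  # prints ['6']
```

To run it on actual files, read each one with open(path).read() and pass the two strings in.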