Without uniq:
amin@ubuntu:~/Desktop$ cut -f 1 info.log | tail -n +2 | head -n -1 | sort
Abol
Abol
Ahmad
Akbar
Arash
Hadi
Hamed
Mahmood
Maryam
Maryam
Mohsen
NIma
Rasool
Sadegh
Sepide
Sepide
With uniq:
amin@ubuntu:~/Desktop$ cut -f 1 info.log | tail -n +2 | head -n -1 | sort | uniq
Abol
Abol
Ahmad
Akbar
Arash
Hadi
Hamed
Mahmood
Maryam
Mohsen
NIma
Rasool
Sadegh
Sepide
Sepide
As you can see, the results are the same in both. Why?
TL;DR: The lines have different whitespace (possibly spaces) at the end.
This happens when you have lines that look the same but are actually different due to characters that don't display in your terminal, usually at the end. Often these are trailing spaces (as fkraiem suggested) or inconsistent line terminators.
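One way to confirm this kind of problem is to make the invisible characters visible. The sketch below uses a made-up two-line sample (the `sample` function is only an illustration, not your actual `info.log`) and assumes GNU `cat` for the `-A` option; `sed -n l` is a POSIX alternative.

```shell
# Hypothetical sample: two names that look identical in a terminal,
# but the second one ends in a space.
sample() { printf 'Abol\nAbol \n'; }

# GNU cat -A marks every line end with $ and shows tabs as ^I,
# so a trailing space shows up as a gap before the $.
sample | cat -A

# sed -n l gives a similar unambiguous view and is specified by POSIX.
sample | sed -n l
```

On your own data, the equivalent check would be to run the pipeline up to `sort` and pipe it into `cat -A` instead of `uniq`.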
You might expect that starting the pipeline, as you do, with `cut` would prevent this. It doesn't, though. `cut` uses a tab as its default delimiter, so a space before the first tab stays part of the first field, and a line that contains no tab at all is printed unchanged. (Readers who wish to verify this behavior--and its relevance to having unexpected duplicate lines after `uniq`--can try `cut -f 1 <<<$'foo\nfoo ' | uniq`, which prints two lines.)
The solution in your case is probably to use something other than `cut -f 1` to select the fields. In particular, if the fields are separated by spaces instead of tabs--whether by a single space or multiple spaces, and even if the number of spaces is different in different records--then you can use `cut -d' ' -f 1` instead, specifying space as the delimiter character. Or you might not want to use `cut` at all, but instead use `awk '{ print $1 }'`, which prints the first field, taking any sequence of consecutive spaces and tabs as the delimiter.
You could alternatively strip the trailing whitespace, though this makes your command even more complicated. One way to do that would be by piping your text through `sed -E 's/[[:space:]]+$//'` before it goes to `uniq`.
As a side note, if whatever command you ultimately use still ends up piping the output of `sort` directly to `uniq`, you might consider just using `sort -u` for that instead.
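To compare these approaches side by side, here is a small sketch on invented data. The `sample` function and its contents are assumptions made for illustration (a tab-separated file where one name carries a trailing space); your real `info.log` may be laid out differently.

```shell
# Hypothetical tab-separated sample; the second "Abol" has a trailing space.
sample() { printf 'Abol\t1\nAbol \t2\nMaryam\t3\n'; }

# Original approach: the trailing space survives cut, so uniq still
# sees "Abol" and "Abol " as different lines (3 lines remain).
sample | cut -f 1 | sort | uniq | wc -l

# awk splits on runs of spaces and tabs, so the stray space is dropped
# before the names reach sort -u.
sample | awk '{ print $1 }' | sort -u

# Alternatively, keep cut and strip trailing whitespace before deduplicating.
sample | cut -f 1 | sed -E 's/[[:space:]]+$//' | sort -u
```

The last two commands each print `Abol` and `Maryam` once. In your own pipeline, the `tail` and `head` stages would stay where they are; only the field selection and the final `sort | uniq` change.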