I have this code in a shell script:
sort input | uniq -c | sort -nr > output
The input file has no leading whitespace, but the output does. How do I fix this? This is in bash.
Source : https://www.thelinuxrain.com/articles/tweaking-uniq-c (Wayback Machine)
Remove the leading spaces with sed:
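A minimal sketch of the sed approach, assuming the same file names (input and output) as in the question. The pattern strips any run of leading spaces or tabs from each line. The function name count_trimmed is only an illustrative wrapper, not something from the original answer.

```shell
# Count duplicate lines, most frequent first, without the padding uniq -c adds.
# count_trimmed is a hypothetical helper name used for illustration.
count_trimmed() {
    sort "$1" | uniq -c | sort -nr | sed 's/^[[:space:]]*//'
}

# Usage, with the file names from the question:
# count_trimmed input > output
```

Only whitespace at the start of each line is removed, so the single space between the count and the line content is kept.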
uniq -c adds leading whitespace. You could add a command at the end of the pipeline to remove it.
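To illustrate: uniq -c right-aligns the count, so each output line begins with padding (the exact width differs between implementations). One way to drop it, sketched here with awk as an assumption rather than taken from the original answer, is to delete the leading blanks with sub(). The name strip_leading is hypothetical.

```shell
# strip_leading: hypothetical filter that removes leading spaces and tabs
# from every line on stdin while leaving the rest of the line untouched.
strip_leading() {
    awk '{ sub(/^[ \t]+/, ""); print }'
}

# Appended to the pipeline from the question:
# sort input | uniq -c | sort -nr | strip_leading > output
```

Because sub() only touches the match anchored at the start of the line, repeated spaces inside the line content are preserved.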
FWIW you can use a different sorting tool for more flexibility. Python is one such tool.
In theory this would even be faster than the sort tool for large inputs, since the Python program uses a hash table to identify duplicate lines instead of a sorted list. (Alas, it places lines of identical count in an arbitrary rather than a natural order; this can be amended and still be faster than two sort invocations.)

Output Format
If you want more flexibility on the output format, you can look into the print() and format() built-in functions.

For instance, if you want to print the count number in octal with up to 7 leading zeros, followed by a tab instead of a space character, with a NUL line terminator, replace the last line with:
Usage
Store the script in a file, say sort_count.py, and invoke it with Python.

Squeeze the leading whitespace into a single space with tr -s, then print the output from the 2nd character onward with cut -c.
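A sketch of that tr and cut pipeline, assuming the file names from the question. Two caveats apply: tr -s ' ' squeezes every run of spaces, not only the leading one, so lines whose content contains consecutive spaces are altered; and cut -c 2- relies on each line starting with at least one space. The name squeeze_and_cut is hypothetical.

```shell
# squeeze_and_cut: hypothetical helper name used for illustration.
# tr -s ' '  collapses each run of spaces into a single space,
# cut -c 2-  then drops the one remaining leading space.
squeeze_and_cut() {
    sort "$1" | uniq -c | sort -nr | tr -s ' ' | cut -c 2-
}

# Usage, with the file names from the question:
# squeeze_and_cut input > output
```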