Question

John Baptist

Asked: 2019-10-04 18:58:03 +0800 CST

Count the number of different names in a file

I want to count the number of different names in a text file laid out like this:

2008 girl Avah
2009 girl Avah
2008 girl Carleigh
2011 girl Kenley
2012 boy Joseph
2013 boy Joseph
2014 boy Isaac
2014 boy Brandon

So basically I want to skip the duplicates and get 6 as the answer. I tried with awk to access only the third column but I can't get it to print the number of lines.

4 Answers

αғsнιη · Answer 1 · 2019-10-04T20:37:37+08:00

With awk:

<fileName awk '!nameSeen[$3]++{ count++ } END{ print count }'

If a new name is found (!nameSeen[$3]++ is true only the first time a given name appears), the counter is incremented with count++, and the END block prints the counter value.
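To see the idiom at work, here is a minimal sketch that recreates the question's data in a temporary file and runs the same awk test on it. The variable names (sample, count, names) are only illustrative, and the `+ 0` in the END block is an addition of mine so that an empty file prints 0 instead of a blank line.

```shell
# Hypothetical sample file holding the data from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2008 girl Avah
2009 girl Avah
2008 girl Carleigh
2011 girl Kenley
2012 boy Joseph
2013 boy Joseph
2014 boy Isaac
2014 boy Brandon
EOF

# nameSeen[$3]++ evaluates to 0 the first time a name appears, so the
# negation is true exactly once per distinct name in column 3.
count=$(awk '!nameSeen[$3]++ { count++ } END { print count + 0 }' "$sample")
echo "$count"

# Same test, but printing each distinct name on its first appearance.
names=$(awk '!nameSeen[$3]++ { print $3 }' "$sample")
echo "$names"

rm -f "$sample"
```

On this data the first command prints 6, and the second lists Avah, Carleigh, Kenley, Joseph, Isaac and Brandon in order of first appearance.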

6

steeldriver · Answer 2 · 2019-10-05T03:45:59+08:00

Since your file appears to be pre-sorted on the name column, you could use uniq with the -f (--skip-fields) option to output only the first line of each name, and count lines:

uniq -f2 FileName | wc -l

or

uniq --skip-fields=2 FileName | wc -l

If your data are not pre-sorted, you can combine sort -u with a -k field specification to achieve the same thing (although it's not clearly documented in the GNU sort man page):

sort -uk3 FileName | wc -l
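The difference between the two approaches shows up as soon as the duplicates are not adjacent. The sketch below uses a shuffled copy of the question's data (the ordering and the variable names are my own assumption, not part of the original file) and runs both pipelines on it.

```shell
# Hypothetical unsorted variant of the question's data: the two "Avah"
# lines and the two "Joseph" lines are no longer next to each other.
unsorted=$(mktemp)
cat > "$unsorted" <<'EOF'
2008 girl Avah
2012 boy Joseph
2008 girl Carleigh
2009 girl Avah
2011 girl Kenley
2013 boy Joseph
2014 boy Isaac
2014 boy Brandon
EOF

# uniq only collapses adjacent duplicates, so it overcounts here.
uniq_count=$(uniq -f2 "$unsorted" | wc -l)

# sort -u with a key sorts first and keeps one line per distinct key.
sort_count=$(sort -u -k3 "$unsorted" | wc -l)

echo "uniq -f2:    $uniq_count"
echo "sort -u -k3: $sort_count"

rm -f "$unsorted"
```

With this ordering the uniq pipeline reports 8 lines, while the sort pipeline still reports 6.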

It's overkill for this task, but you could also use GNU Datamash:

datamash -W countunique 3 < FileName

6

Raffa · Answer 3 · 2019-10-04T20:46:41+08:00

A rather simple quick way that explains itself:

cat FileName | sed 's/[0-9]*//g' | sed 's/\<boy\>//g' | sed 's/\<girl\>//g' | sort -u | wc -l

Or to satisfy @αғsнιη's concern about UUoC (useless use of cat):

<FileName sed 's/[0-9]*//g' | sed 's/\<boy\>//g' | sed 's/\<girl\>//g' | sort -u | wc -l

Or another UUoC-compliant command:

sed 's/[0-9]*//g' <FileName | sed 's/\<boy\>//g' | sed 's/\<girl\>//g' | sort -u | wc -l

A note to @Rebi Khalifa:

@αғsнιη rightly wrote in the comments below:

or <fileName cut -d' ' -f3 |sort -u |wc -l; cat filename | ... is UUoC

@steeldriver rightly wrote in the comments below:

I'd suggest using cut rather than all those sed commands - you should at least combine them into a single invocation ex. sed -E -e 's/^[0-9]+//' -e 's/\b(boy|girl)\b//'

They both used a field selection approach, which is the same approach you were trying to implement, based on what you wrote in your question:

I tried with awk to access only the third column but I can't get it to print the number of lines.

One does not need to be sophisticated to get things done in Ubuntu! Things can be done in many different ways.

One way that follows the KISS principle is to pipe | simple commands one into the next until the mission is accomplished:

Read the content of the file with cat FileName -->
Pipe it | -->
Remove number groups with sed 's/[0-9]*//g' -->
Pipe it | -->
Remove the word boy with sed 's/\<boy\>//g' -->
Pipe it | -->
Remove the word girl with sed 's/\<girl\>//g' -->
Only names are left now... -->
Pipe it | -->
Sort the names and remove duplicates with sort -u -->
Only unique names are left now -->
Pipe it | -->
Count the lines with wc -l -->
Done
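For comparison, the two shorter variants quoted from the comments can be run on the same data. This is only a sketch: the temporary file and variable names are illustrative, the cut version assumes exactly one space between fields, and the single sed call assumes GNU sed (for -E together with \b).

```shell
# Hypothetical sample file holding the data from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2008 girl Avah
2009 girl Avah
2008 girl Carleigh
2011 girl Kenley
2012 boy Joseph
2013 boy Joseph
2014 boy Isaac
2014 boy Brandon
EOF

# Field selection: keep the third space-separated field, deduplicate, count.
cut_count=$(cut -d' ' -f3 "$sample" | sort -u | wc -l)

# One sed invocation instead of three: strip the leading year, then the
# word boy or girl, leaving only the (space-padded) name on each line.
sed_count=$(sed -E -e 's/^[0-9]+//' -e 's/\b(boy|girl)\b//' "$sample" | sort -u | wc -l)

echo "cut: $cut_count"
echo "sed: $sed_count"

rm -f "$sample"
```

Both pipelines count 6 distinct names here, the same result as the longer chain of sed commands.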

aborruso · Answer 4 · 2019-10-05T06:16:08+08:00

A really short and easy way, using Miller (https://github.com/johnkerl/miller)

mlr --nidx uniq -g 3 -n input.txt
