I have a folder with CSV files whose file names are dates, viz.: January-01-2018.csv, January-02-2018.csv, ..., April-30-2018.csv.
Using Bash preferably, I want to extract the number of lines from each CSV file, but in ascending order of date. I.e., I wish to extract the number of lines in January-01-2018.csv, then in January-02-2018.csv, ..., and then in April-30-2018.csv, and so on.
At the moment, all I have is:
for filename in $(ls *.csv); do cat $filename | wc -l >> by_day.dat; done
But this does not process the files in ascending order of date. Any suggestions on how I might accomplish this? I would like to do it in Bash.
You can do this by combining a few common tools:

- find to list all .csv files (unordered) and execute a command for each
- basename to extract the file name from the path, without the .csv extension
- date to interpret the date specification in the file name and convert it to an easily sortable number, like seconds since 1970
- echo to print the calculated number and the real file path on one line for each file
- sort to sort the file paths according to this converted date number
- cut to extract only the file paths again from the combined list
- xargs cat to construct a command that passes all file names, in order, to the cat command for concatenating them

The complete line looks like this, if all the files we want to process are located in a folder named datecsv:
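A sketch of that complete line, assuming GNU date and GNU xargs; the sample files and their contents here are hypothetical stand-ins created only so the pipeline has input:

```shell
# Hypothetical sample files, just for demonstration:
mkdir -p datecsv
printf 'jan data\n' > datecsv/January-01-2018.csv
printf 'apr data\n' > datecsv/April-30-2018.csv

# Tag each path with its date as seconds since 1970, sort numerically,
# strip the tag again, and cat the files in date order:
find datecsv -name '*.csv' |
while read -r path; do
    name=$(basename "$path" .csv | tr '-' ' ')  # e.g. "January 01 2018"
    echo "$(date -d "$name" +%s) $path"         # epoch-seconds prefix
done |
sort -n |          # numeric sort on the prefix = ascending date order
cut -d ' ' -f2- |  # drop the prefix, keep only the path
xargs cat          # concatenate the files in that order
# prints: jan data
#         apr data
```

Replacing the hyphens with spaces via tr before calling date just makes the names easier for date's parser to digest.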
As you only want the line count of each file, the command for that looks almost the same. The only change is the last part, where we use xargs -n1 wc -l instead of xargs cat as above.

Some notes: the approach above relies on your file names being in a format that date can parse. This is the case for the example names you provided, but it might break if the format changes. It also requires the file names to end with a lowercase .csv. I am not sure whether some special characters in file names might break things (spaces should probably be safe; newlines will surely break it).
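A sketch of that line-count variant, again assuming GNU date and GNU xargs, with hypothetical sample files created here for demonstration:

```shell
# Hypothetical sample files with differing line counts:
mkdir -p datecsv
printf 'a\nb\n' > datecsv/January-01-2018.csv
printf 'x\n'   > datecsv/April-30-2018.csv

# Same pipeline, but "xargs -n1 wc -l" runs wc once per file,
# printing "linecount path" for each file in date order:
find datecsv -name '*.csv' |
while read -r path; do
    name=$(basename "$path" .csv | tr '-' ' ')
    echo "$(date -d "$name" +%s) $path"
done |
sort -n |
cut -d ' ' -f2- |
xargs -n1 wc -l
# prints: 2 datecsv/January-01-2018.csv
#         1 datecsv/April-30-2018.csv
```

The -n1 matters: without it, wc would receive all files at once and append a "total" line instead of one clean count per file.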