I have a CSV file it includes some columns, in that I have the start time and end time. My requirement is to get the difference between the time and add the difference to a new column.
I'm getting the difference between time but not able to add it into new column properly for each line.
This is my sample csv.
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1 3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1 2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1 1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1
This is the script.
#!/bin/bash
cat a.csv | while read line
do
created_at=$(date -d $(echo $line | awk -F "," '{print $3}') +%s)
merged_at=$(date -d $(echo $line | awk -F "," '{print $6}') +%s)
echo $created_at $merged_at
diff=$(( $merged_at - $created_at ))
h=`expr $diff / 3600`
m=`expr $diff % 3600 / 60`
s=`expr $diff % 60`
diff=$(printf "%02d:%02d:%02d\n" $h $m $s)
echo $diff
awk -v v1="$diff" -F"," 'BEGIN { OFS = "," } {$9=v1; print}' a.csv >> b.csv
done
I'm getting the output something like this.
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:22
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:22
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:22
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:22
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:30
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:30
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:30
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:30
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:02:20
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:02:20
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:02:20
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:02:20
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:33
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:33
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:00:33
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:33
Which is append the difference to all the line.
But my requirement to get the difference of time only for that line. The output should be like this.
4,ganesh-28,2019-09-26T16:56:40Z,closed,harshavardhanc,2019-09-26T16:57:02Z,1,1,00:00:22
3,ganesh-28,2019-09-26T16:54:25Z,closed,harshavardhanc,2019-09-26T16:54:55Z,1,1,00:00:30
2,ganesh-28,2019-09-26T16:52:59Z,closed,harshavardhanc,2019-09-26T16:55:19Z,1,1,00:02:20
1,ganesh-28,2019-09-26T16:46:52Z,closed,harshavardhanc,2019-09-26T16:47:25Z,1,1,00:00:33
Please someone help me to achieve this.
The last awk command inside your loop
processes the whole file
a.csv
at every loop iteration, and appends the whole result tob.csv
each time.Presumably what you intended was to apply the command only to the current contents of the
$line
variable - inbash
you can do that with a here stringHowever, processing CSV files line-by-line in a shell loop is not generally recommended - you might want to consider using a utility that provides datetime processing natively (Perl, Python, GNU Awk) or Miller ex.
(remove
--implicit-csv-header --headerless-csv-output
if your CSV file does actually have headers).See also