I have a file that consists of two whitespace-separated fields. The first field is a timestamp in "%FT%T" format.
Sample data:
2019-01-01T00:00:00 4.8
2019-01-01T01:00:00 5.1
2019-01-01T02:00:00 5.4
2019-01-01T03:00:00 5.7
2019-01-01T04:00:00 5.8
2019-01-01T05:00:00 5.4
2019-01-01T06:00:00 5
2019-01-01T07:00:00 4.4
2019-01-01T08:00:00 3.8
2019-01-01T09:00:00 3.7
2019-01-01T10:00:00 3.8
2019-01-01T11:00:00 4.1
2019-01-01T12:00:00 5
2019-01-01T13:00:00 6.7
2019-01-01T14:00:00 8.4
2019-01-01T15:00:00 9.1
2019-01-01T16:00:00 8.6
2019-01-01T17:00:00 8.5
2019-01-01T18:00:00 8.6
2019-01-01T19:00:00 8.1
2019-01-01T20:00:00 8
2019-01-01T21:00:00 6.9
2019-01-01T22:00:00 5.6
2019-01-01T23:00:00 5.2
2019-01-02T00:00:00 5.2
2019-01-02T01:00:00 5.3
2019-01-02T02:00:00 5.8
2019-01-02T03:00:00 6
2019-01-02T04:00:00 5.7
2019-01-02T05:00:00 5.4
2019-01-02T06:00:00 5.7
2019-01-02T07:00:00 5.3
2019-01-02T08:00:00 4.8
2019-01-02T09:00:00 4.3
2019-01-02T10:00:00 3.6
2019-01-02T11:00:00 2.8
2019-01-02T12:00:00 3.2
2019-01-02T13:00:00 4.2
2019-01-02T14:00:00 4.9
2019-01-02T15:00:00 5.4
2019-01-02T16:00:00 5.9
2019-01-02T17:00:00 6.5
2019-01-02T18:00:00 6.7
2019-01-02T19:00:00 7.1
2019-01-02T20:00:00 5.7
2019-01-02T21:00:00 4.4
2019-01-02T22:00:00 4.1
2019-01-02T23:00:00 3.8
2019-01-03T00:00:00 4
2019-01-03T01:00:00 3.5
2019-01-03T02:00:00 3.6
2019-01-03T03:00:00 4
2019-01-03T04:00:00 4.2
2019-01-03T05:00:00 3.9
2019-01-03T06:00:00 3.7
2019-01-03T07:00:00 3.8
2019-01-03T08:00:00 3.7
2019-01-03T09:00:00 3.7
2019-01-03T10:00:00 4
2019-01-03T11:00:00 4.7
2019-01-03T12:00:00 5.4
2019-01-03T13:00:00 6.5
2019-01-03T14:00:00 7.6
2019-01-03T15:00:00 7.7
2019-01-03T16:00:00 7.3
2019-01-03T17:00:00 7.4
2019-01-03T18:00:00 8
2019-01-03T19:00:00 8.5
2019-01-03T20:00:00 8.1
2019-01-03T21:00:00 6.5
2019-01-03T22:00:00 5.6
2019-01-03T23:00:00 5.6
I want to calculate the daily average of the 2nd column.
The output should be as follows:
01-01-2019 6.1
02-01-2019 5.1
03-01-2019 5.5
An awk approach:
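A minimal sketch of what such an awk approach could look like, assuming the input is sorted by time and every line has exactly the two fields shown in the question. The function name daily_avg is hypothetical and only used here for illustration. The script takes the first 10 characters of the timestamp as the day, accumulates a sum and a count per day, and prints the days in the order they first appear, rearranged as DD-MM-YYYY:

```shell
# daily_avg is a hypothetical helper name. It reads the file(s) given as
# arguments, or standard input if none are given.
daily_avg() {
    awk '
    {
        day = substr($1, 1, 10)          # "YYYY-MM-DD" part of the timestamp
        if (!(day in count))
            order[++n] = day             # remember first-seen order of days
        sum[day] += $2
        count[day]++
    }
    END {
        for (i = 1; i <= n; i++) {
            day = order[i]
            split(day, p, "-")           # p[1]=YYYY, p[2]=MM, p[3]=DD
            printf "%s-%s-%s %.1f\n", p[3], p[2], p[1], sum[day] / count[day]
        }
    }' "$@"
}
```

Called as `daily_avg file` on the sample data, this should reproduce the three lines of the desired output. Keeping an explicit order array avoids relying on the traversal order of `for (day in sum)`, which awk does not guarantee.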
Using Miller:
Formatting the results seems to be an area in which Miller is somewhat lacking, so if you need that, I suggest piping the results through numfmt.
Alternatively, with a sufficiently recent version of GNU awk, using mktime to index the sum and count arrays with the epoch time of the date:
Another alternative, using csvsql/csvformat from the Python-based csvkit: