Awk: how to change timestamp format of a column

Recently i had a task to convert a csv file with timestamps in one column from a format to another! i didn't want to use Python for this…

Awk: how to change timestamp format of a column
Photo by Bernard Hermant on Unsplash
Join Medium with my referral link - Konstantinos Patronas
As a Medium member, a portion of your membership fee goes to writers you read, and you get full access to every story…

Recently i had a task to convert a csv file with timestamps in one column from a format to another! i didn't want to use Python for this since i think it is an overkill nor universal in terms of portability, i tried to do bash manipulation using reading the file in an array using cut to get the desired field and use date to change the format, it worked but was ugly and underperforming! The best solution was awk! lets see how!.

Scenario

In my case what exactly i had to do is to modify a human readable timestamp to an epoch timestamp of nanosecond accuracy.

2023-06-24T18:59:42.803Z ---> 1687633182803000000 

Create the following file and save it as test.csv

2023-06-24T18:59:42.803Z,ford,focus 
2023-06-24T19:10:47.626Z,mazda,mx5 
2023-06-24T22:18:50.391Z,toyota,yaris 
2023-06-23T15:59:41.591Z,toyota,corolla 
2023-06-23T18:52:42.346Z,nissan,micra

The first column is the column that we want to transform to epoch.

Working with awk

Lets try something basic first, lets see how to parse only the first column which is the timestamp column and convert it to epoch.

$ awk -F "," '{system("date -d "$1" +%s%9N")}' test.csv 
1687633182803000000 
1687633847626000000 
1687645130391000000 
1687535981591000000 
1687546362346000000
  • We invoked awk to parse file test.csv, -F “,” option said to awk that the file is comma delimited
  • then we used the system command of awk to execute the system date command in my Linux box for each element of of the first column (the $1 variable)
  • date command using the -d parameter and the +%s%9N formatting string parsed dates and converted them to epoch seconds with nano second accuracy

You can check more about the date command and its formating options here https://man7.org/linux/man-pages/man1/date.1.html

Now lets try the complete thing! to print the rest of the columns along with the converted timestamp

$ awk -F "," '{"date -d "$1" +%s%9N" | getline $1;printf("%s,%s,%s\n",$1,$2,$3)}' test.csv 
1687633182803000000,ford,focus 
1687633847626000000,mazda,mx5 
1687645130391000000,toyota,yaris 
1687535981591000000,toyota,corolla 
1687546362346000000,nissan,micra

What we did here is

  • for each value of $1 executed the date command to convert the timestamp and with the | getline we passed the value back to $1
  • Then using printf we printed the column values with the selected format

But what if we wanted to keep the old time format and the same time convert the timestamp?

$ awk -F "," '{"date -d "$1" +%s%9N" | getline ts;printf("%s,%s,%s,%s\n",ts,$1,$2,$3)}' test.csv 
1687633182803000000,2023-06-24T18:59:42.803Z,ford,focus 
1687633847626000000,2023-06-24T19:10:47.626Z,mazda,mx5 
1687645130391000000,2023-06-24T22:18:50.391Z,toyota,yaris 
1687535981591000000,2023-06-23T15:59:41.591Z,toyota,corolla 
1687546362346000000,2023-06-23T18:52:42.346Z,nissan,micra

What we did now is very similiar to the previous example but with some slight differences

  • | getline ts we stored the converted timestamp to a variable named ts
  • using printf we print all three existing column values plus the value of the converted timestamp (ts)

Conclusion

Awk is a complete language but can be used also as a command line tool in order to make useful oneliners that can help us to do quick wins instead of using a language like Python! it seems that old tools still have their value in our modern Linux echo system and are not going to replaced any time soon!

Join Medium with my referral link - Konstantinos Patronas
As a Medium member, a portion of your membership fee goes to writers you read, and you get full access to every story…