Shell snippets

Introduction:

Shell tools like Awk,Sed,Grep,etc. are great tools for data manipulation, especialy for CSV files ! But I don't use them every day so I have always to re-read man page and find how to use them. The tasks I do with awk or sed are often similars. So why don't put them somewhere and try to amend them when I've found a better answer ?

So be careful: those snippets do not pretend to be THE solution to all of your problems...

Add a columns to a bunch of CSV files

At work today, I had to add a new empty column to a lot of CSV files. Time to use sedagain:

find ./ -name '*.csv' -exec sed -i -e '1 s/$/,newcolumn/' -e '2,$ s/$/,/' {} \;

Explanations:

  • sed -i : modify file in place.
  • -e '1 s/$/,newcolumn/': modify only the first line (add the column title in the header).
  • -e '2,$ s/$/,/': add an empty cell for each line (from second one to the last).

Concatenate n columns in one (awk+sed)

Here is a CSV file:

 BUREAU,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS
 1005,ABC,,,,,,,
 1007,DEF,,,,,,,
 1008,GHFsdf,,,,,,,
 1009,sdfsdfsdf,,,,,,,
 1073,Borsdfsdf,sdfsdfsdf,,,,,,
 1078,zeopdfigop,Dzerzerzer,,,,,,
 1079,zeoiuozeituopezrit,,,,,,,
 1080,xcklcb,,,,,,,

You want to extract only the first column and a concatenation of all of the others (like: "Borsdfsdf,sdfsdfsdf") and delete each null column.

Here is the awk (+sed) snippet to do it:

     sed -e 's/,/;/' -e 's/,,//g' -e 's/,$//g' ./file.csv | awk -F ";" '{print "\""$1"\",\""$2"\""}'