Shell snippets
Introduction
Shell tools like Awk, Sed and Grep are great for data manipulation, especially on CSV files! But I don't use them every day, so I always have to re-read the man pages to remember how they work. The tasks I do with awk or sed are often similar. So why not put them somewhere and update them when I find a better answer?
Be careful: these snippets do not claim to be THE solution to all of your problems…
Add a column to a bunch of CSV files
At work today, I had to add a new empty column to a lot of CSV files. Time to use sed again:
find ./ -name '*.csv' -exec sed -i -e '1 s/$/,newcolumn/' -e '2,$ s/$/,/' {} \;
Explanations:
sed -i: modify the files in place.
-e '1 s/$/,newcolumn/': modify only the first line (append the column title to the header).
-e '2,$ s/$/,/': append an empty cell to every line, from the second one to the last.
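Since sed -i rewrites files without asking, it can be worth trying the command on a throwaway copy first. Here is a minimal sketch, assuming GNU sed (BSD/macOS sed wants an argument after -i, as in -i ''); the scratch directory, file name and contents are invented for the demo.

```shell
# Scratch directory and file: both are invented for this demo.
tmpdir=$(mktemp -d)
printf 'id,name\n1,foo\n2,bar\n' > "$tmpdir/sample.csv"

# Same sed expressions as above, limited to the scratch directory (GNU sed assumed).
find "$tmpdir" -name '*.csv' -exec sed -i -e '1 s/$/,newcolumn/' -e '2,$ s/$/,/' {} \;

# Keep the result, clean up, then show it.
result=$(cat "$tmpdir/sample.csv")
rm -rf "$tmpdir"
printf '%s\n' "$result"
# id,name,newcolumn
# 1,foo,
# 2,bar,
```

Once the output looks right, the same find command can be pointed at the real directory.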
Concatenate n columns in one (awk+sed)
Here is a CSV file:
BUREAU,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS,AGENTS
1005,ABC,,,,,,,
1007,DEF,,,,,,,
1008,GHFsdf,,,,,,,
1009,sdfsdfsdf,,,,,,,
1073,Borsdfsdf,sdfsdfsdf,,,,,,
1078,zeopdfigop,Dzerzerzer,,,,,,
1079,zeoiuozeituopezrit,,,,,,,
1080,xcklcb,,,,,,,
You want to keep only the first column plus a concatenation of all the others (like: "Borsdfsdf,sdfsdfsdf"), and drop every empty column.
Here is the awk (+sed) snippet to do it:
sed -e 's/,/;/' -e 's/,,//g' -e 's/,$//g' ./file.csv | awk -F ";" '{print "\""$1"\",\""$2"\""}'
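To see what each expression does, the pipeline can be wrapped in a small function that reads standard input. This is only a sketch: the function name concat_columns is made up, and the two input rows are taken from the sample file above.

```shell
# Hypothetical helper: same pipeline as above, reading stdin instead of ./file.csv.
concat_columns() {
  # 's/,/;/'   turns the first comma into ';' so awk can split the key from the rest.
  # 's/,,//g'  removes every pair of adjacent commas (the empty cells).
  # 's/,$//g'  drops the single comma that may be left at the end of the line.
  sed -e 's/,/;/' -e 's/,,//g' -e 's/,$//g' |
    awk -F ";" '{print "\""$1"\",\""$2"\""}'
}

printf '%s\n' '1005,ABC,,,,,,,' '1073,Borsdfsdf,sdfsdfsdf,,,,,,' | concat_columns
# "1005","ABC"
# "1073","Borsdfsdf,sdfsdfsdf"
```

One limit to keep in mind: because 's/,,//g' deletes comma pairs anywhere in the line, an empty cell sitting between two filled cells (a,,b) would glue the two values together (ab). The snippet fits files like this one, where the empty cells are all at the end.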
Multiline sed
Here, I want to insert an empty line before a line starting with --- when the previous line is not already empty. From this:
this is a line:
---sh
blah blah blah blah
To this:
this is a line:

---sh
blah blah blah blah
Sed works one line at a time by default, so you have to re-read the manual to handle several lines at once.
But here is a snippet that does the job:
sed -E -e '{N;s/^([^\n]+)\n---([^\n]+)/\1\n\n---\2/;ty;P;D;:y}' text.md
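The one-liner is dense, so here is a sketch of the same commands spread over several lines with a comment on each. It assumes GNU sed (for -E and for \n inside a bracket expression); the function name add_blank_before_fence is invented, and it reads standard input instead of text.md.

```shell
# Hypothetical helper: same sed commands as above, one per line (GNU sed assumed).
add_blank_before_fence() {
  sed -E '
    # Append the next input line to the pattern space (two lines in the window).
    N
    # A non-empty line followed by a line starting with ---: put an empty line between them.
    s/^([^\n]+)\n---([^\n]+)/\1\n\n---\2/
    # If the substitution happened, jump to label y: the whole window gets printed.
    ty
    # Otherwise print the first line of the window, delete it,
    # and restart the script with the remaining line.
    P
    D
    :y
  '
}

printf 'this is a line:\n---sh\nblah blah\n' | add_blank_before_fence
# this is a line:
#
# ---sh
# blah blah
```

When the line before --- is already empty, the first group ([^\n]+) cannot match, so nothing is inserted and the text passes through unchanged.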