Add quotes to strings between commas - shell

I have a string like
1,2,A,N,53,3,R,R,^A,-C,-T,2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
Now I am trying to achieve the string below:
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
So I am trying to wrap everything after the 8th occurrence of a comma (,) from the start and before the 12th occurrence of a comma from the end in quotes.
I tried some awk options but was unable to achieve it. Is there any way to get this done?
Thanks in advance.

try:
awk -v s1="\"" -F, '{$9=s1 $9;$(NF-12)=$(NF-12) s1} 1' OFS=, Input_file
So here I am making a variable s1 which holds a " and setting the field separator to a comma. Then I re-create the 9th field as per your requirement by prepending s1 to $9. Then I re-create the 13th field from the last (point to be noted: no hardcoded absolute field number here, so the line may have any number of fields) by appending s1 to its current value. The 1 then prints the line, and OFS (the output field separator) is set to a comma too.
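For example, with the sample line fed in directly (any POSIX awk):
$ echo '1,2,A,N,53,3,R,R,^A,-C,-T,2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P' | awk -v s1="\"" -F, '{$9=s1 $9;$(NF-12)=$(NF-12) s1} 1' OFS=,
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P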

x='1,2,A,N,53,3,R,R,^A,-C,-T,2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P'
awk -F, -v OFS=, -v q='"' '{$9=q $9;$11=$11 q}1' <<< "$x"
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
Explanation: Here FS and OFS are set to comma, as the input stream is CSV. A double quote is stored in a variable named q. Then the values of the desired columns are altered to get the desired result. You can change which columns are modified to get other results.
For files:
awk -F, -v OFS=, -v q='"' '{$9=q $9;$11=$11 q}1' inputfile

$ awk -v FS=',' -v OFS=',' '{$9="\"" $9;$11=$11"\""; print}' your_file
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P

This might work for you (GNU sed):
sed 's/,/&"/8;s/,/"&/11' file
Insert " after and before ' eight and eleven.

awk '{sub(/\^A,-C,-T/,"\42^A,-C,-T\42")}1' file
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
The fine point here is to escape the caret (\42 is the octal escape for the double quote).
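To see why the escape matters: an unescaped ^ anchors the regex to the start of the line, so the pattern would not match mid-line. A quick check:
$ echo '1,^A,2' | awk '{sub(/^A/,"X")}1'
1,^A,2
$ echo '1,^A,2' | awk '{sub(/\^A/,"X")}1'
1,X,2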

Related

Shell - How to remove a string "EES" from a record after 7th occurrence of colon(:)

How do we remove EES from the input file below?
{"last_name":"Kiran","first_name":"kumar","sno":"1234","effective_date":"11/01/2011","cancel_date":"12/31/9999","alt_ein_id_indicator":"Y","alt_ein_id_employer_number":"V3EES"}
Expecting the file after transformation to look like this:
{"last_name":"Kiran","first_name":"kumar","sno":"1234","effective_date":"11/01/2011","cancel_date":"12/31/9999","alt_ein_id_indicator":"Y","alt_ein_id_employer_number":"V3"}
TIA
Use jq for parsing JSON data
jq -c '.alt_ein_id_employer_number |= sub("EES";"")' file.json
{"last_name":"Kiran","first_name":"kumar","sno":"1234","effective_date":"11/01/2011","cancel_date":"12/31/9999","alt_ein_id_indicator":"Y","alt_ein_id_employer_number":"V3"}
The following awk should remove the EES string from the 8th field, i.e. after the 7th colon.
awk -F':' '{sub("EES","",$8)} 1' OFS=":" Input_file
Explanation:
awk -F':' Sets the field separator. By default awk's field separator is whitespace, so I am setting it to a colon; awk will then break each line into fields on colons only.
{sub("EES","",$8)} Uses awk's substitute function, which has the form sub(regex_to_be_substituted, new_value, target). Here the string EES is substituted with NULL ("") in $8, the 8th field of the line (which is what comes after the 7th colon, as you mentioned).
1 awk works on a condition-then-action model; writing 1 makes the condition TRUE, and since no action is given, the default action, printing the line, happens.
OFS=":" Sets the output field separator. By default OFS is a space, so to match your Input_file I am setting it to :.
Input_file Simply the name of the input file.
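Running the command on the line above prints:
{"last_name":"Kiran","first_name":"kumar","sno":"1234","effective_date":"11/01/2011","cancel_date":"12/31/9999","alt_ein_id_indicator":"Y","alt_ein_id_employer_number":"V3"}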
If you want to save output into same Input_file then following may help you.
awk -F':' '{sub("EES","",$8)} 1' OFS=":" Input_file > temp_file && mv temp_file Input_file

awk command: adding a prefix to a csv file

I am trying to add a prefix to my csv file. Below is the source csv:
A,B
121ABC,London
2212ABC,Paris
312ABC,Tokyo
I am using the following awk command:
$ awk -F=',' -vOFS=',' '{$2="AC_"$2; print}' t.csv >t1.csv
But the output instead adds another column to the csv file:
A,B,AC_
121ABC,London,AC_
2212ABC,Paris,AC_
312ABC,Tokyo,AC_
Any pointers as to where the error is?
You're setting FS to =, instead of ,. Use -F',' or -v FS=',' but not -F=','.
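A quick way to see the effect: with -F=',' FS becomes the two-character string =, which never occurs in the data, so the whole line stays in $1 and the assignment to $2 appends a new field:
$ echo '121ABC,London' | awk -F=',' '{print NF}'
1
$ echo '121ABC,London' | awk -F',' '{print NF}'
2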
Since you require , for both input and output field separators you should be setting them together to that value in one place rather than setting them both separately to the same value:
awk 'BEGIN{FS=OFS=","} {$2="AC_"$2; print}' t.csv >t1.csv
You can use this awk:
awk 'BEGIN{FS=OFS=","} {$2 = "AC_" $2} 1' file
A,AC_B
121ABC,AC_London
2212ABC,AC_Paris
312ABC,AC_Tokyo
Perhaps simpler with sed:
$ sed 's/,/&AC_/' file
A,AC_B
121ABC,AC_London
2212ABC,AC_Paris
312ABC,AC_Tokyo

Using a multi-character field separator in awk on Solaris

I wish to use a string (Birch) as a field delimiter in awk to print the second field. I am trying the following command:
cat tmp.log|awk -FBirch '{ print $2}'
The output below is getting printed:
irch2014/06/23,04:36:45,3,1401503,xml-harlan,P12345-1,temp,0a653356353635635,temp,L,Success
Desired output:
2014/06/23,04:36:45,3,1401503,xml-harlan,P12345-1,temp,0a653356353635635,temp,L,Success
Contents of tmp.log file.
-bash-3.2# cat tmp.log
Dec 05 13:49:23 [x.x.x.x.180.100] business-log-dev/int [TEST][0x80000001][business-log][info] mpgw(Test): trans(8497187)[request][10.x.x.x]:
Birch2014/06/23,04:36:45,3,1401503,xml-harlan,P12345-1,temp,0a653356353635635,temp,L,Success
Am I doing something wrong?
OS: Solaris10
Shell: Bash
I tried the command below, suggested in one of the answers. I am getting the desired output, but with an extra empty line at the top. How can this be eliminated from the output?
-bash-3.2# /usr/xpg4/bin/awk -FBirch '{print $2}' tmp.log
2014/06/23,04:36:45,3,1401503,xml-harlan,P12345-1,temp,0a653356353635635,temp,L,Success
Originally, I suggested putting quotes around "Birch" (-F'Birch') but actually, I don't think that should make any difference.
I'm not at all experienced working with Solaris but you may want to also try using nawk ("new awk") instead of awk.
nawk -FBirch '{print $2}' file
If this works, you may want to consider creating an alias so that you always use the newer version of awk with more features.
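For example (one possible line for your shell startup file; hypothetical, adjust the name or path for your system):
alias awk=nawk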
You may also want to try using the version of awk in the /usr/xpg4/bin directory, which is a POSIX compliant implementation so should support multi-character FS:
/usr/xpg4/bin/awk -FBirch '{print $2}' file
If you only want to print lines which have more than one field, you can add a condition:
/usr/xpg4/bin/awk -FBirch 'NF>1{print $2}' file
This only prints the second field when there is more than one field.
From the man page of the default awk on Solaris (/usr/bin/awk):
-Fc Uses the character c as the field separator (FS) character. See the discussion of FS below.
As you can see, Solaris /usr/bin/awk only takes a single character as the field separator.
Also in the man page is split:
split(s, a, fs)
Split the string s into array elements a[1], a[2], ...
a[n], and returns n. The separation is done with the
regular expression fs or with the field separator FS if
fs is not given.
As you can see, here the separator is a regular expression, so we can use:
awk 'split($0,a,"Birch")>1{print a[2]}' file
to print the second piece of each line that actually contains Birch (split returns the number of pieces, so requiring more than 1 also avoids the extra empty line mentioned in the update above).
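With the tmp.log shown above, this should print only the data line, even under the default /usr/bin/awk, since its split() accepts a regular expression:
$ /usr/bin/awk 'split($0,a,"Birch")>1{print a[2]}' tmp.log
2014/06/23,04:36:45,3,1401503,xml-harlan,P12345-1,temp,0a653356353635635,temp,L,Success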

Add blank column using awk or sed

I have a file with the following structure (comma delimited)
116,1,89458180,17,FFFF,0403254F98
I want to add a blank column on the 4th field such that it becomes
116,1,89458180,,17,FFFF,0403254F98
Any inputs as to how to do this using awk or sed, if possible?
Thank you.
Assuming that none of the fields contain embedded commas, you can restate the task as replacing the third comma with two commas. This is just:
sed 's/,/,,/3'
With the example line from the file:
$ echo "116,1,89458180,17,FFFF,0403254F98" | sed 's/,/,,/3'
116,1,89458180,,17,FFFF,0403254F98
You can use this awk,
awk -F, '$4="," $4' OFS=, yourfile
(OR)
awk -F, '$4=FS$4' OFS=, yourfile
If you want blank columns in more than one place at once (this inserts them before the original 1st, 4th, and 6th fields):
awk -F, '{$4=FS$4; $1=FS$1; $6=FS$6}1' OFS=, yourfile
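For example:
$ echo '116,1,89458180,17,FFFF,0403254F98' | awk -F, '{$4=FS$4; $1=FS$1; $6=FS$6}1' OFS=,
,116,1,89458180,,17,FFFF,,0403254F98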
Through awk
$ echo '116,1,89458180,17,FFFF,0403254F98' | awk -F, -v OFS="," '{print $1,$2,$3,","$4,$5,$6}'
116,1,89458180,,17,FFFF,0403254F98
It prints an extra , after the third comma-delimited field.
Through GNU sed
$ echo 116,1,89458180,17,FFFF,0403254F98| sed -r 's/^([^,]*,[^,]*,[^,]*)(.*)$/\1,\2/'
116,1,89458180,,17,FFFF,0403254F98
It captures all the characters up to the third comma and stores them in a group. The characters from the third , through the end are stored in another group. In the replacement part, we just add a , between these two captured groups.
Through Basic sed
$ echo 116,1,89458180,17,FFFF,0403254F98| sed 's/^\([^,]*,[^,]*,[^,]*\)\(.*\)$/\1,\2/'
116,1,89458180,,17,FFFF,0403254F98
echo 116,1,89458180,17,FFFF,0403254F98|awk -F',' '{print $1","$2","$3",,"$4","$5","$6}'
Non-awk
t="116,1,89458180,17,FFFF,0403254F98"
echo $(echo $t|cut -d, -f1-3),,$(echo $t|cut -d, -f4-)
You can use the awk command below to achieve that. Replace the $3 with whatever column you want to make blank.
awk -F, '{$3="" FS $3;}1' OFS=, filename
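For example, with $4 to get the output asked for above:
$ echo '116,1,89458180,17,FFFF,0403254F98' | awk -F, '{$4="" FS $4;}1' OFS=,
116,1,89458180,,17,FFFF,0403254F98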
sed -e 's/\([^,]*,\)\{3\}/&,/' YourFile
This replaces the sequence of 3 [content (non-comma) then a comma] with itself followed by a comma; with 4 repetitions the blank column would land one position too far to the right.
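For example:
$ echo '116,1,89458180,17,FFFF,0403254F98' | sed -e 's/\([^,]*,\)\{3\}/&,/'
116,1,89458180,,17,FFFF,0403254F98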

How does value matching in awk work?

I have got a csv file with various columns, one of which is a Date column. For a given date, I want to fetch all of the records from the csv. The reason I can't use grep is that the date might appear in some other column, and I don't want those matches.
So far this is what I have got (the dates start with the year, e.g. 2012):
sed 's/\"//g' kk.csv | awk -F ',' '{print $4}' '$4 ~ /^2012.*/'
First I remove all of the " quotes, then I specify the separator ',', then I apply the condition on the 4th column of the file, which is the date column, but it is not working. I am doing exactly what the book says.
Can anyone point out what I am doing wrong? Is it a quote-related issue?
awk only accepts one script, while you're giving it two ({print $4} and $4 ~ /^2012.*/). Join them into one awk script using awk's condition { commands } syntax:
sed 's/\"//g' kk.csv | awk -F ',' '$4 ~ /^2012.*/ {print $4}'
No need to use sed; awk alone is enough to search:
awk -F, '$4 ~ "^\"2012"' kk.csv
This will print all lines whose 4th column starts with "2012 (the opening double quote is still part of the field, since the quotes were never stripped).
If you want formatted output (removing the quotes, etc.), awk can do that too.
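For example (just a sketch, assuming every field is individually quoted; gsub strips all double quotes from matching lines before printing):
awk -F, '$4 ~ /^"2012/ {gsub(/"/,""); print}' kk.csv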
