awk command: adding a prefix to a csv file - bash

I am trying to add a prefix to my csv file. Below is the source csv:
A,B
121ABC,London
2212ABC,Paris
312ABC,Tokyo
I am using the following awk command
$ awk -F=',' -vOFS=',' '{$2="AC_"$2; print}' t.csv >t1.csv
But the output adds another column to the csv file instead:
A,B,AC_
121ABC,London,AC_
2212ABC,Paris,AC_
312ABC,Tokyo,AC_
Any pointers as to where the error is?

You're setting FS to =, instead of ,. Use -F',' or -v FS=',' but not -F=','.
Since you require , for both the input and output field separators, set them together in one place rather than setting each separately to the same value:
awk 'BEGIN{FS=OFS=","} {$2="AC_"$2; print}' t.csv >t1.csv
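For illustration, this is what went wrong originally: with -F=',' the field separator becomes the two-character string =, which never occurs in the data, so $1 holds the whole line and $2 is empty. A quick check on one sample line:
$ echo '121ABC,London' | awk -F=',' '{print NF}'
1
Since NF is 1, the assignment $2="AC_"$2 creates a brand-new second field containing only AC_, and OFS="," then joins it on, producing the extra column shown above.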

You can use this awk:
awk 'BEGIN{FS=OFS=","} {$2 = "AC_" $2} 1' file
A,AC_B
121ABC,AC_London
2212ABC,AC_Paris
312ABC,AC_Tokyo

Perhaps simpler with sed:
$ sed 's/,/&AC_/' file
A,AC_B
121ABC,AC_London
2212ABC,AC_Paris
312ABC,AC_Tokyo
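In the replacement, & stands for whatever the pattern matched, so the first comma is replaced by itself followed by AC_. Should you need to prefix a later column in a wider file, sed's numeric flag picks the Nth occurrence instead (a minimal sketch on a made-up three-column line):
$ echo 'a,b,c' | sed 's/,/&AC_/2'
a,b,AC_c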

Related

Replacing new line with comma separator

I have a text file with records in the following format. Please note that there are no empty fields within the Name, ID and Rank sections.
"NAME","STUDENT1"
"ID","123"
"RANK","10"
"NAME","STUDENT2"
"ID","124"
"RANK","11"
I have to convert the above file to the below format
"STUDENT1","123","10"
"STUDENT2","124","11"
I understand that this can be achieved using a shell script by reading the records and writing them to another output file. But can this be done using awk or sed?
$ awk -F, '{ORS=(NR%3?FS:RS); print $2}' file
"STUDENT1","123","10"
"STUDENT2","124","11"
With awk:
awk -F, '$1=="\"RANK\""{print $2;next}{printf "%s,",$2}' file
With awk, printing a newline every 3 lines:
awk -F, '{printf "%s",$2; if (NR%3){printf ","} else {print ""}}' file
The following awk may also help:
awk -F, '{ORS=$0~/^"RANK/?"\n":FS;print $NF}' Input_file
With sed:
sed -E 'N;N;y/\n/ /;s/([^,]*)(,[^ ]*)/\2/g;s/,//' infile
Here N;N pulls each group of three lines into the pattern space, y/\n/ / turns the embedded newlines into spaces, the global substitution keeps only the second field of each comma-separated pair, and the final s/,// strips the leading comma.

Add quotes to strings between commas - shell

I have a string like
1,2,A,N,53,3,R,R,^A,-C,-T,2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
Now, I am trying to achieve below string
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
So, I am trying to wrap everything after the 8th occurrence of comma (,) from the start and before the 12th occurrence of comma (,) from the end in quotes.
I tried some awk options but was unable to achieve it. Any way to get this done?
Thanks in advance.
Try:
awk -v s1="\"" -F, '{$9=s1 $9;$(NF-12)=$(NF-12) s1} 1' OFS=, Input_file
Here I create a variable s1 holding a double quote and set the field separator to a comma. I then re-create the 9th field by prepending s1 to $9, and re-create the 13th field from the end (note: the field number is not hardcoded, so the line may have any number of fields) by appending s1 to its current value. The lone 1 prints the line, and OFS (the output field separator) is set to a comma as well.
x='1,2,A,N,53,3,R,R,^A,-C,-T,2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P'
awk -F, -v OFS=, -v q='"' '{$9=q $9;$11=$11 q}1' <<< "$x"
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
Explanation: FS and OFS are set to comma since the input stream is CSV. A double quote is stored in a variable named q, and the values of the desired columns are then altered to get the desired result. You can change the column numbers to produce other results.
For files:
awk -F, -v OFS=, -v q='"' '{$9=q $9;$11=$11 q}1' inputfile
$ awk -v FS=',' -v OFS=',' '{$9="\"" $9;$11=$11"\""; print}' your_file
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
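A parameterized variant may also be handy if the field positions change; s and e below are hypothetical variables marking the first and last field to wrap (a sketch, not from the answers above):
$ awk -F, -v OFS=, -v s=9 -v e=11 '{$s="\"" $s; $e=$e "\""} 1' <<< "$x"
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P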
This might work for you (GNU sed):
sed 's/,/&"/8;s/,/"&/11' file
Insert " after the eighth comma and before the eleventh.
awk '{sub(/\^A,-C,-T/,"\42^A,-C,-T\42")}1' file
1,2,A,N,53,3,R,R,"^A,-C,-T",2,S,9,l,8,8Z,sl,138Z,l,Y,75680,P
The fine point here is to escape the caret: unescaped, ^ would anchor the pattern to the start of the line instead of matching a literal ^. (\42 is the octal escape for the double-quote character.)

Shell script copying all columns of text file instead of specified ones

I am trying to copy 3 columns from one text file and paste them into a new text file. However, whenever I execute this script, all of the columns in the original text file get copied. Here is the code I used:
cut -f 1,2,6 PROFILES.1.0.profile > compiledfile.txt
paste compiledfile.txt > myNewFile
Any suggestions as to what I'm doing wrong? Also, is there a simpler way to do this? Thanks!
Let's suppose that the input is comma-separated:
$ cat File
1,2,3,4,5,6,7
a,b,c,d,e,f,g
We can extract columns 1, 2, and 6 using cut:
$ cut -d, -f 1,2,6 File
1,2,6
a,b,f
Note the use of option -d, to specify that the column separator is a comma.
By default, cut uses a tab as the column separator. If the separator in your file is anything else, you must use the -d option.
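This is almost certainly why all columns were copied: PROFILES.1.0.profile presumably contains no tabs, so cut -f sees one big field per line and passes each line through unchanged. A quick demonstration on a comma-separated line:
$ printf '1,2,3,4,5,6,7\n' | cut -f 1,2,6
1,2,3,4,5,6,7
(The paste compiledfile.txt step is redundant as well; given a single file, paste just copies it.)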
Using awk
awk -vFS=your_delimiter_here -vOFS=your_delimiter_here '{print $1,$2,$6}' PROFILES.1.0.profile > compiledfile.txt
should do it.
For comma separated fields the solution would be
awk -vFS=, -vOFS=, '{print $1,$2,$6}' PROFILES.1.0.profile > compiledfile.txt
FS is an awk builtin variable which stands for field-separator.
Similarly OFS stands for output-field-separator.
And the handy -v option with awk helps you assign a value to variable.
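If the file turns out to be tab-separated (cut's default), the same command works with a tab delimiter (a sketch, assuming tab-separated input):
awk -v FS='\t' -v OFS='\t' '{print $1,$2,$6}' PROFILES.1.0.profile > compiledfile.txt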
You could use awk to do this:
awk -F "delimiter" '
{
print $1,$2,$3   # where $1, $2 and so on are column numbers
}' filename > newfile

awk command to select exact word in any field

I have input file as
ab,1,3,qqq,bbc
b,445,jj,abc
abcqwe,234,23,123
abc,12,bb,88
uirabc,33,99,66
I have to select the rows in which some field is exactly 'abc'. Note that abc can appear in any column. Please help me achieve this using awk.
Output:
b,445,jj,abc
abc,12,bb,88
You could also use plain grep (with -E for extended regular expressions):
grep -E "(^|,)abc(,|$)" file
Or if you have to use awk
awk '/(^|,)abc(,|$)/' file
Using awk
awk 'gsub(/(^|,)abc(,|$)/,"&")' file
b,445,jj,abc
abc,12,bb,88
Based on Beny23's regex, it looks for abc starting either at the beginning of the line (^) or right after a comma, and ending either at a comma or at the end of the line ($).
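For contrast, a bare /abc/ would also match lines where abc is merely a substring of a longer word:
$ awk '/abc/' file
b,445,jj,abc
abcqwe,234,23,123
abc,12,bb,88
uirabc,33,99,66
The (^|,) and (,|$) anchors are what filter out abcqwe and uirabc.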
Another one using beny23's regex:
awk 'NF>1' FS="(^|,)abc(,|$)" infile
Not asked, but if you need to filter just the lines with exactly one occurrence:
$ cat infile
ab,1,3,qqq,bbc
b,445,jj,abc
abcqwe,234,23,123
abc,12,bb,88
abc,12,bb,abc
uirabc,33,99,66
This will be handy:
$ awk 'NF==2' FS="(^|,)abc(,|$)" infile
b,445,jj,abc
abc,12,bb,88
Also possible using Jotne's solution:
$ awk 'gsub(/(^|,)abc(,|$)/,"&")==1' infile
Through awk,
$ awk -F, '{for(i=1;i<=NF;i++){if($i=="abc") print $0;}}' file | uniq
b,445,jj,abc
abc,12,bb,88
OR
$ awk -F, '{for(i=1;i<=NF;i++){if($i=="abc") {print; next}}}' file
b,445,jj,abc
abc,12,bb,88
In the above awk commands the field separator variable is set to ,. awk parses the input file line by line, and the for loop traverses all the fields in a line. If the value of a particular field is exactly abc, the whole line is printed. Note that $i=="abc" is an exact string comparison; a regex test such as $i~/abc/ would wrongly match abcqwe and uirabc as well.

Awk adding constant values

I have data in a text file like val1,val2 with multiple lines,
and I want to change it to 1,val1,val2,0,0,1.
I tried adding the constants with a print statement in awk (Solaris) but it didn't work.
What is the correct way to do it?
(From the comments) This is what I tried
awk -F, '{print "%s","1,"$1","$2"0,0,1"}' test.txt
Based on the command you posted, a little change fixes it:
$ awk -F, 'BEGIN{OFS=FS} {print 1,$1,$2,0,0,1}' file
1,val1,val2,0,0,1
OR using printf (I prefer print):
$ awk -F, '{printf "1,%s,%s,0,0,1\n", $1, $2}' file
1,val1,val2,0,0,1
To prepend every line with the constant 1 and append with 0,0,1 simply do:
$ awk '{print 1,$0,0,0,1}' OFS=, file
1,val1,val2,0,0,1
An idiomatic way would be:
$ awk '$0="1,"$0",0,0,1"' file
1,val1,val2,0,0,1
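This works because the value of the assignment expression is the new $0, a non-empty string that awk treats as true, so the default action (printing the line) fires. It is equivalent to the more explicit:
$ awk '{$0="1," $0 ",0,0,1"} 1' file
1,val1,val2,0,0,1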
Using sed:
sed 's/.*/1,&,0,0,1/' inputfile
Example:
$ echo val1,val2 | sed 's/.*/1,&,0,0,1/'
1,val1,val2,0,0,1
