Changing column value in CSV file - bash

In a CSV file I have the columns below, and I try to double the second column's value with
awk -F ',' -v OFS=',' '$1 { $2=$2*2; print}' path/file.csv > output.csv
But it returns zero and removes the double quotes.
file.csv
"sku","0.47","supplierName"
"sku","3.14","supplierName"
"sku","3.56","supplierName"
"sku","4.20","supplierName"
output.csv
"sku",0,"supplierName"
"sku",0,"supplierName"
"sku",0,"supplierName"
"sku",0,"supplierName"

You may specify more than one character in the FS value. Splitting on "," keeps the outer quotes attached to the first and last fields, so $2 is the bare number; in your version $2 was "0.47" including the quotes, which awk treats as 0 in arithmetic.
$ awk -v FS="\",\"" -v OFS="\",\"" '{$2=$2*2}1' file
"sku","0.94","supplierName"
"sku","6.28","supplierName"
"sku","7.12","supplierName"
"sku","8.4","supplierName"
Try this if you want to round to two decimal places:
$ awk -v FS="\",\"" -v OFS="\",\"" '{$2=sprintf("%.2f",$2*2)}1' file
"sku","0.94","supplierName"
"sku","6.28","supplierName"
"sku","7.12","supplierName"
"sku","8.40","supplierName"

Related

Feed literal string bash variable to awk and gsub

I want to edit a column in a text file by feeding a bash variable containing a literal string to awk and gsub.
I have tried various versions of the command below. It works for a variable that does not contain any special characters, but not for one that needs to be interpreted as a literal string.
#create intial file
echo -e "SOD1:c.112G>A(p.[G38R])"'\t'"SOD1:c.112G>A(p.[G38R]);NA" > testfile
#set variable
var="SOD1:c.112G>A(p.[G38R])"
#test awk
more testfile | awk -F '\t' -v OFS='\t' -v var="${var}" '{gsub(var,"",$2)}1'
I want to delete the variable's value only in the second column, not in the first.
Thanks in advance for your help.
You can escape the regex metacharacters in your var definition, so gsub treats them literally, and put the definition and the awk command on one line like this:
var='SOD1:c.112G>A\(p.\[G38R\]\)'; awk -F '\t' -v OFS='\t' -v var="$var" '{gsub(var,"",$2)}1' testfile
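If you'd rather not escape anything, awk's index() and substr() functions operate on plain strings, so the variable never goes through the regex engine at all. A minimal sketch of that alternative (unlike gsub, it removes only the first occurrence):
var='SOD1:c.112G>A(p.[G38R])'
awk -F '\t' -v OFS='\t' -v var="$var" '{
  i = index($2, var)                # literal string search, no regex
  if (i) $2 = substr($2, 1, i - 1) substr($2, i + length(var))
} 1' testfile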

replace and store a value in the same file, shell

How can I replace the value of a column and store the output in the same file?
Example input, file:
#car|year|model
toyota|1998|corrola
toyota|2006|yaris
opel|2001|corsa
replace "corrola" with "corrolacoupe"
and store it to the input file
#car|year|model
toyota|1998|corrolacoupe
toyota|2006|yaris
opel|2001|corsa
I have tried this
awk -F '|' -v col=$column -v val=$value '/^[^#]/ FNR==NR {print $col = val }' OFS='|' $FILE >> $FILE
To simply replace the value in (row,col) with a new value:
$ awk -F'|' -v OFS='|' -v row=2 -v col=3 -v val=corollacoupe 'NR==row {$col=val} 1' file
#car|year|model
toyota|1998|corollacoupe
toyota|2006|yaris
opel|2001|corsa
This will set the value of input field col to val, but only in input record row. The 1 at the end ensures every record is printed by default. Input and output field separators are set via the -F option and the OFS variable.
If you need to make these changes in-place, create a temporary output file and then copy it over the original:
$ awk ... file >file.tmp && cp file{.tmp,}
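If you prefer not to hard-code the temporary name, the same pattern can be sketched with mktemp:
tmp=$(mktemp) &&
awk -F'|' -v OFS='|' -v row=2 -v col=3 -v val=corollacoupe 'NR==row {$col=val} 1' file > "$tmp" &&
cp "$tmp" file && rm -f "$tmp"   # cp (rather than mv) keeps the original file's inode and permissions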
Alternatively, in GNU awk 4.1 or later, you can use the inplace library via the -i inplace option:
$ awk -i inplace -F'|' -v OFS='|' -v row=2 -v col=3 -v val=corollacoupe 'NR==row {$col=val} 1' file
If you wish to skip the comments, and count only non-comment rows:
$ awk -F'|' -v OFS='|' -v row=1 -v col=3 -v val=x '/^[^#]/ {nr++} nr==row {$col=val} 1' file
#car|year|model
toyota|1998|x
toyota|2006|yaris
opel|2001|corsa
An ed solution that modifies the file in-place without any temporary files could be something like:
ed "$FILE" <<< $',s/|corrola$/|corrolacoupe/g\nw'
which uses an ANSI-C quoted string so that \n is passed to ed as a newline. The command matches |corrola at the end of any line and replaces it with |corrolacoupe; we then issue the w command to have ed write the file back out.
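The same commands can also be spelled out in a quoted here-document, which some find easier to read (-s merely suppresses ed's byte-count output):
ed -s "$FILE" <<'EOF'
,s/|corrola$/|corrolacoupe/g
w
EOF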
A really simple solution.
darby@Debian:~/Scrivania$ cat file
#car|year|model
toyota|1998|corrola
toyota|2006|yaris
opel|2001|corsa
darby@Debian:~/Scrivania$ sed -ri 's#^(.+)\|(.+)\|corrola$#\1|\2|corrolacoupe#' file
darby@Debian:~/Scrivania$ cat file
#car|year|model
toyota|1998|corrolacoupe
toyota|2006|yaris
opel|2001|corsa
darby@Debian:~/Scrivania$

How to print a csv file excluding the first column using awk

I have a csv file with a dynamic number of columns, and I want to print everything except the first column.
I've tried awk -F , 'NF>1' resul1.txt, but it still prints all columns.
Since the number of columns varies, it's quite difficult to list the fields explicitly, as in print $2, $3, and so on.
Try this awk command:
awk -F, '{$1=""}1' input.txt | awk -vOFS=, '{$1=$1}1' > output.txt
Make the 1st field empty (this rebuilds the line with awk's default output separator, a space)
Re-join the remaining fields with commas and print the entire line again
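To see why the second pass is needed, here is the intermediate result on a hypothetical three-field line; the first awk leaves space-separated fields and a leading blank:
$ echo 'a,b,c' | awk -F, '{$1=""}1'
 b c
$ echo 'a,b,c' | awk -F, '{$1=""}1' | awk -vOFS=, '{$1=$1}1'
b,c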
Try the substr function:
substr(string, start [, length])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character number one. For example, substr("washington", 5, 3) returns "ing".
awk -F, '{print substr($0,length($1)+1+length(FS))}' file
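For example, on a hypothetical three-column line:
$ echo 'a,b,c' | awk -F, '{print substr($0,length($1)+1+length(FS))}'
b,c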
You can use cut:
cut -d',' -f2- yourfile.csv > output.csv
Explanation:
-d - set the delimiter to ,
-f - select the fields to print
2- - from the 2nd field to the end of the line
With awk:
awk -F, '{sub(/[^,]+,/,"",$0);}1' OFS=, yourfile.csv > output.csv
With sed:
sed -i.bak 's/^[^,]\+,//g' yourfile.csv
-i.bak - edit in place, keeping a backup of the original with a .bak suffix

How to include a variable in the output filename using awk

There is a command which prints a range of values from a CSV file to an output file:
date1var="mm/dd/yyyy hh:mm:ss"
date2var="mm/dd/yyyy hh:mm:ss"
awk -F, -v d1var="$date1var" -v d2var="$date2var" '$1 > d1var && $1 <= d2var {print $0}' OFS=, plot_data.csv > graph1.csv
Is it possible to include my variables in the output filename?
The final name of the file should be similar to:
graph_d1var-d2var.csv
Any ideas?
You can redirect the output of the print command to a file name, like:
awk -F, -v d1var="$date1_var" -v d2var="$date2var" '
$1 > d1var && $1 <= d2var {
print > ("graph_" d1var "-" d2var ".csv")
}'
OFS=, plot_data.csv
This uses the values of d1var and d2var to build the name of the output file. If you want the literal variable names in the filename instead, surround the whole name in double quotes.
Let the shell handle it; you're starting with shell variables, after all:
date1var="mm/dd/yyyy hh:mm:ss"
date2var="mm/dd/yyyy hh:mm:ss"
awk -F, -v OFS=, -v d1var="$date1var" \
-v d2var="$date2var" \
'
# awk script is unchanged
' plot_data.csv > "graph1_${date1var}-${date2var}.csv"
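One caveat with either approach: a / cannot appear in a Unix filename, so date values in mm/dd/yyyy form have to be transformed before they are used in the name. A hedged sketch using bash parameter expansion:
# replace slashes, spaces and colons with dashes before building the filename
safe1=${date1var//[\/ :]/-}
safe2=${date2var//[\/ :]/-}
awk -F, -v OFS=, -v d1var="$date1var" -v d2var="$date2var" \
    '$1 > d1var && $1 <= d2var' plot_data.csv > "graph1_${safe1}-${safe2}.csv"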
#!/bin/bash
date1var="1234"
date2var="5678"
awk -F, -v d1="$date1var" -v d2="$date2var" '{print > ("graph" d1 "-" d2 ".txt")}' OFS=, plot_data.csv
Note that you can't compare date strings in awk like you are trying to do. You also have a typo, in that you have written date1_var with an underscore whereas you have used date1var without an underscore further on.
I guess the short answer is that you can print to a named file with print > "filename", and that you can concatenate (join) strings by placing them beside each other, like this: string2 = string1 "and" string3;
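A minimal self-contained illustration of both points, with hypothetical values:
$ awk 'BEGIN { d1 = "0101"; d2 = "0202"; print "demo line" > ("graph_" d1 "-" d2 ".csv") }'
$ cat graph_0101-0202.csv
demo line
Note the parentheses around the concatenation on the right of the > redirection; without them, some awks parse the expression differently.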

awk: csv split works, but ignores the last field in the row

I have a sample file that looks like:
Sample.csv
Data_1,0,289,292,293,300,306
Data_2,0,294,3,306
Data_3,0,294,305,306
Data_4,0,294,305,306
And I'm running awk on it:
scr.sh:
awk -F ',' -v tId="$1" '{for(i=3; i<NF; i++){if($i==tId) print}}' $2
By calling
./scr.sh 300 Sample.csv
That works fine and returns me exactly one row that matches.
Data_1,0,289,292,293,300,306
Original problem statement: from the 3rd column onwards, if any column's data matches the given number, the line should be printed.
But if I call:
./scr.sh 306 Sample.csv
That returns me NOTHING!
I've double checked the lines in Sample.csv and confirmed that there are NO trailing spaces on any of the lines.
Any clues? Thanks.
This awk will do what you're looking for:
awk -F ',' -v tId="$1" '$0 ~ "(^|,)" tId "(,|$)"' file
Alternatively this egrep will also do the job:
egrep '(^|,)306(,|$)' file
UPDATE: Based on your comments below you can use:
awk -v tId="$1" 'BEGIN{FS=OFS=","} {p=$0; $1=$2=""} $0 ~ "(^|,)" tId "(,|$)"{print p}' file
Here is a simple solution to your problem.
Let's say your argument is stored in a variable named var,
i.e. var=$1;
Then run the following command to find the occurrences in your file:
grep -E "^${var},|,${var},|,${var}$" yourfilename
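For completeness: the reason 306 matched nothing in the first place is the loop condition i<NF, which stops one field short of the last column. Fixing the off-by-one in the original script also works (next prevents printing a line twice when the value repeats):
awk -F ',' -v tId="$1" '{for (i=3; i<=NF; i++) if ($i == tId) {print; next}}' "$2"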
