I'm trying to modify a groups.tsv file (I'm on repl.it so path to file is fine).
Each line in the file looks like this:
groupname \t amountofpeople \t lastadded
and I'm trying to count the occurrences of both groupname ($nomgrp) and a login ($login), and change lastadded to login.
varcol2=$(grep "$nomgrp" groups | cut "-d " -f2- | awk -F"\t" '{print $2}' )
((varcol21=varcol2+1));
varcol3=$(awk -F"\t" '{print $3}' groups)
sed -i "s|${nomgrp}\t${varcol2}\t$varcol3|${nomgrp}\t${varcol21}\t${login}|" groups
However, I'm getting the error message:
sed : -e expression #1, char 27: unterminated 's' command
The groups file has lines such as " sudo 2 user1" (delimited with a tab): a user inputs "user" which is stored in $login, then "sudo" which is stored in $nomgrp.
What am I doing wrong?
Sorry if this has been answered/super easy to fix, I'm quite the newbie here...
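One likely cause of the sed error, assuming the groups file has more than one line: the awk call that fills varcol3 prints the third field of every line, so the variable holds several lines, and a literal newline inside the sed script ends the s command early. A minimal sketch to check this (the three-line sample file is made up for illustration):

```shell
# Hypothetical three-line groups file (tab-separated), for illustration only.
groups_file=$(mktemp)
printf 'sudo\t2\tuser1\nwheel\t1\tuser2\nstaff\t4\tuser3\n' > "$groups_file"

# Same extraction as in the question: prints field 3 of EVERY line.
varcol3=$(awk -F"\t" '{print $3}' "$groups_file")

# Count the lines held in the variable; more than 1 means newlines reach sed.
varcol3_lines=$(printf '%s\n' "$varcol3" | wc -l | tr -d ' ')
echo "lines in varcol3: $varcol3_lines"

# Restricting the awk program to the matching group yields a single value.
nomgrp=sudo
varcol3_one=$(awk -F"\t" -v group="$nomgrp" '$1 == group {print $3}' "$groups_file")
echo "third field for $nomgrp: $varcol3_one"
rm -f "$groups_file"
```

The grep in the varcol2 line has a related weakness: it matches the group name anywhere in a line, so it can also return more than one value.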
If I understand what you are trying to do correctly and if you have GNU awk, you could do
gawk -i inplace -F '\t' -v group="$nomgrp" -v login="$login" -v OFS='\t' '$1 == group { $2 = $2 + 1; $3 = login; } { print }' groups.tsv
Example:
$ cat groups.tsv
wheel 1000 2019-12-10
staff 1234 2019-12-11
users 9001 2019-12-12
$ gawk -i inplace -F '\t' -v group=wheel -v login=2019-12-12 -v OFS='\t' '$1 == group { $2 = $2 + 1; $3 = login; } 1' groups.tsv
$ cat groups.tsv
wheel 1001 2019-12-12
staff 1234 2019-12-11
users 9001 2019-12-12
This works as follows:
-i inplace is a GNU awk extension that allows you to change a file in place,
-F '\t' sets the input field separator to a tab so that the input is interpreted as TSV and fields with spaces in them are not split apart,
-v variable=name sets an awk variable for use in awk's code,
specifically, -v OFS='\t' sets the output field separator variable to a tab, so that the output is again a TSV
So we set variables group, login to your shell variables and ensure that awk outputs a TSV. The code then works as follows:
$1 == group { # If the first field in a line is equal to the group variable
$2 = $2 + 1; # add 1 to the second field
$3 = login; # and overwrite the third with the login variable
}
{ # in all lines:
print # print
}
{ print } could also be abbreviated as 1, as I'm sure someone will point out, but I find this way easier to explain.
If you do not have GNU awk, you could achieve the same with a temporary file, e.g.
awk -F '\t' -v group="$nomgrp" -v login="$login" -v OFS='\t' '$1 == group { $2 = $2 + 1; $3 = login; } { print }' groups.tsv > groups.tsv.new
mv groups.tsv.new groups.tsv
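The temporary-file variant can be wrapped a little more defensively. The following is only a sketch: it assumes mktemp is available, the function name bump_group is hypothetical, and the sample data is made up.

```shell
# bump_group FILE GROUP LOGIN  (hypothetical helper name, not from the answer above)
# Increments field 2 and sets field 3 to LOGIN on lines whose field 1 equals GROUP.
bump_group() {
    file=$1 group=$2 login=$3
    tmp=$(mktemp) || return 1
    if awk -F '\t' -v OFS='\t' -v group="$group" -v login="$login" \
        '$1 == group { $2 = $2 + 1; $3 = login } { print }' "$file" > "$tmp"
    then
        mv "$tmp" "$file"      # replace the original only if awk succeeded
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a throwaway file (sample data is made up for illustration).
demo=$(mktemp)
printf 'sudo\t2\tuser1\nwheel\t1\tuser2\n' > "$demo"
bump_group "$demo" sudo user
cat "$demo"
```

Note that mv replaces the file, so the result carries the permissions of the temporary file; writing back with cat "$tmp" > "$file" would keep the original ones.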
I have a txt file that looks like this.
this-is-name-1
this-is-name-2
...
I am trying to add a ,0 at the end of a certain line using gawk,
gawk -i inplace -v n=',0' -v s='this-is-name-1' '$1 == s { $2 = n } 1' file
But as you can see there is a space in-between.
this-is-name-1 ,0
this-is-name-2
...
What's the right gawk syntax so that there is no space, like this instead:
this-is-name-1,0
this-is-name-2
...
Use comma as the output field separator:
gawk -i inplace -v OFS=',' -v n='0' -v s='this-is-name-1' '$1 == s { $2 = n } 1' file
# ..............^^^^^^^^^^.......^ (no comma here)
Change $2 = n to $0 = $0 n. Your current code is adding a 2nd field so awk has to add a separator between the fields.
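A minimal sketch of the $0 = $0 n suggestion, using plain awk without in-place editing so the result is easy to inspect (the sample file is made up to mirror the question):

```shell
# Sample input mirroring the question.
names_file=$(mktemp)
printf 'this-is-name-1\nthis-is-name-2\n' > "$names_file"

# Assigning to $0 appends the suffix to the whole record, so no OFS is inserted.
result=$(awk -v n=',0' -v s='this-is-name-1' '$1 == s { $0 = $0 n } 1' "$names_file")
printf '%s\n' "$result"
rm -f "$names_file"
```

Only the matching line gains the ,0 suffix; the other line is printed unchanged.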
I have multiple .txt with info like this:
"commercial_name":"THE OUTBACK","contact_name":"JEFF","contact_person":"MANAGER","working_place"
There is a lot of garbage before and after the given sentence.
I want to get results like this:
THE OUTBACK,JEFF,MANAGER
All on the same line for each .txt file, with a new line for the next .txt.
I am doing this with 3 different sed commands
sed -n 's:.*"commercial_name"\(.*\)"contact_name".*:\1:p' *.txt
sed -n 's:.*"contact_name"\(.*\)"contact_person".*:\1:p' *.txt
sed -n 's:.*"contact_person"\(.*\)"working_place".*:\1:p' *.txt
even if I combine these 3, the result is:
:"THE OUTBACK",
-all commercial names 1 line for each .txt
:"JEFF",
-all contact names 1 line for each .txt
:"MANAGER",
-all contact person 1 line for each .txt
I want to extract all the info in the same line:
THE OUTBACK,JEFF,MANAGER
then the info for the next .txt in the next line
and so on.
You may use this awk:
awk 'BEGIN {
FS=OFS=","
}
{
gsub(/"/, "")
for(i=1; i<=NF; ++i) {
if (split($i, entry, ":") == 2)
map[entry[1]] = entry[2]
}
print map["commercial_name"], map["contact_name"], map["contact_person"]
}' file
THE OUTBACK,JEFF,MANAGER
With awk
we set FS and OFS separately:
awk -v FS=',|:' -v OFS=',' '{print $2,$4,$6}' file
"THE OUTBACK","JEFF","MANAGER"
and gsub for removing double quotes:
awk -v FS=',|:' -v OFS=',' '{gsub(/"/, "")} {print $2,$4,$6}' file
THE OUTBACK,JEFF,MANAGER
Why print $2,$4,$6?
Ed Morton gives a detailed explanation here:
converting regex to sed or grep regex
Using Ed's code, you can see the fields with a for loop:
awk -v FS=',|:' -v OFS=',' '{gsub(/"/, "")} {for (i=1; i<=NF;i++) print "Record", NR, "Field", i, ": " $i;}{print RT}' file
Record,1,Field,1,: commercial_name
Record,1,Field,2,: THE OUTBACK
Record,1,Field,3,: contact_name
Record,1,Field,4,: JEFF
Record,1,Field,5,: contact_person
Record,1,Field,6,: MANAGER
Record,1,Field,7,: working_place
In this case, we are interested in fields 2, 4 and 6:
{print $2,$4,$6}
--
I'm trying to edit 3 columns in a file if the value in column 1 equals a specific string. This is my current attempt:
cp file file.copy
awk -F':' 'OFS=":" { if ($1 == "root1") $2="test"; print}' file.copy>file
rm file.copy
I've only been able to get the awk command working with one column being changed, but I want to be able to edit $3 and $8 as well. Is this possible in the same command? Or is it only possible with separate awk commands or with a different command altogether?
Edit note: In the real command I'll be passing variables to the columns, i.e. $2=$var
It'll be used to edit the /etc/passwd file, sample input/output:
root:$6$fR7Vrjyp$irnF38R/htMSuk0efLSnAten/epf.5v7gfs0q.NcjKcFPeJmB/4TnnmgaAoTUE9.n4p4UyWOgFwB1guJau8AL.:17976::::::
You can create multiple statements for the if condition with a block {}.
awk -F':' 'OFS=":" { if ($1 == "root1") {$2="test"; $3="test2";} print}' file.copy>file
You can also improve your command by using awk's default "workflow": condition{commands}. For this you need to pass OFS as an awk variable (-v flag). Keep print in its own block so that lines that do not match are still written:
awk -F':' -v OFS=":" '$1=="root1"{$2="test"; $3="test2"} {print}' file.copy>file
You may use
# Fake sample values
v1=pass1
v2=pass2
awk -v var1="$v1" -v var2="$v2" 'BEGIN{FS=OFS=":"} $1 == "root1" { $2 = var1; $3 = var2}1' file > tmp && mv tmp file
See the online awk demo:
s="root1:xxxx:yyyy
root11:xxxx:yyyy
root1:zzzz:cccc"
v1=pass1
v2=pass2
awk -v var1="$v1" -v var2="$v2" 'BEGIN{FS=OFS=":"} $1 == "root1" { $2 = var1; $3 = var2}1' <<< "$s"
Output:
root1:pass1:pass2
root11:xxxx:yyyy
root1:pass1:pass2
Note:
-v var1="$v1" -v var2="$v2" pass the variables you need to use in the awk command
BEGIN{FS=OFS=":"} set the field separator
$1 == "root1" check if Field 1 is equal to some value
{ $2 = var1; $3 = var2 } set Field 2 and 3 values
1 calls the default print command
file > tmp && mv tmp file helps you "shrink" the "replace-inplace-like" code.
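The same pattern extends to the $3 and $8 fields mentioned in the question. This is a sketch on made-up data with nine colon-separated fields; the values v2, v3 and v8 are hypothetical placeholders.

```shell
# Made-up passwd-like sample with nine colon-separated fields.
sample='root1:a:b:c:d:e:f:g:h
root11:a:b:c:d:e:f:g:h'

# Hypothetical replacement values.
v2=pass1 v3=17976 v8=never

# One -v per shell value; all three fields are set in a single action block.
result=$(printf '%s\n' "$sample" |
    awk -v var2="$v2" -v var3="$v3" -v var8="$v8" \
        'BEGIN { FS = OFS = ":" } $1 == "root1" { $2 = var2; $3 = var3; $8 = var8 } 1')
printf '%s\n' "$result"
```

One caveat: awk processes backslash escape sequences in -v values, so a value containing backslashes would need extra care.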
I'm trying to take the last value in the third column of a CSV file and then replace the whole third column with this value.
I've been trying this:
var=$(tail -n 1 math_ready.csv | awk -F"," '{print $3}'); awk -F, '{$3="$var";}1' OFS=, math_ready.csv > math1.csv
But it's not working and I don't understand why...
Please help!
awk '
BEGIN { ARGV[2]=ARGV[1]; ARGC++; FS=OFS="," }
NR==FNR { last = $3; next }
{ $3 = last; print }
' math_ready.csv > math1.csv
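The ARGV assignment in the BEGIN block above appends a second copy of the file name to the argument list, so awk reads the file twice: once to find the last value and once to print. The sketch below, on a made-up three-line sample, compares it with simply naming the file twice on the command line.

```shell
csv=$(mktemp)
printf '1,2,1\n1,2,2\n1,2,3\n' > "$csv"

# ARGV duplication inside BEGIN, as in the script above.
via_argv=$(awk 'BEGIN { ARGV[2] = ARGV[1]; ARGC++; FS = OFS = "," }
                NR == FNR { last = $3; next }
                { $3 = last; print }' "$csv")

# Naming the file twice on the command line.
via_twice=$(awk 'BEGIN { FS = OFS = "," }
                 NR == FNR { last = $3; next }
                 { $3 = last; print }' "$csv" "$csv")

printf '%s\n' "$via_argv"
[ "$via_argv" = "$via_twice" ] && echo "both forms agree"
rm -f "$csv"
```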
The main problem with your script was trying to access a shell variable ($var) inside your awk script. Awk is not shell, it is a completely separate language/tool with its own namespace and variables. You cannot directly access a shell variable in awk, just like you couldn't access it in C. To access the VALUE of a shell variable you'd do:
shellvar=27
awk -v awkvar="$shellvar" 'BEGIN{ print awkvar }'
Some additional cleanup:
When FS and OFS have the same value, don't assign them each to that value separately, use BEGIN{ FS=OFS="," } instead for clarity and maintainability.
Do not initialize variables AFTER the script that uses those variables unless you have a very specific reason to do so. Use awk -F... -v OFS=... 'script' to init those variables to separate values, not awk -F... 'script' OFS=..., as it's very unnatural to init variables in the code segment AFTER you've used them, and variables inited in the args list at the end are not initialized when the BEGIN section is executed, which can cause bugs.
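The BEGIN timing issue can be seen directly. In this small sketch the only difference between the two commands is where OFS is assigned; stdin is redirected from /dev/null purely as a precaution.

```shell
# Assignment placed after the script: not yet applied when BEGIN runs,
# so OFS still has its default value (a single space).
late=$(awk 'BEGIN { print "[" OFS "]" }' OFS=, </dev/null)

# Assignment via -v: applied before BEGIN runs.
early=$(awk -v OFS=, 'BEGIN { print "[" OFS "]" }' </dev/null)

echo "after script: $late"
echo "with -v:      $early"
```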
A shell variable is not expandable internally in awk. You can do this instead:
awk -F, -v var="$var" '{ $3 = var } 1' OFS=, math_ready.csv > math1.csv
And you probably can simplify your code with this:
awk -F, 'NR == FNR { r = $3; next } { $3 = r } 1' OFS=, math_ready.csv math_ready.csv > math1.csv
Example input:
1,2,1
1,2,2
1,2,3
1,2,4
1,2,5
Output:
1,2,5
1,2,5
1,2,5
1,2,5
1,2,5
Try this one liner. It doesn't depend on the column count
var=`tail -1 sample.csv | perl -ne 'm/([^,]+)$/; print "$1";'`; cat sample.csv | while read line; do echo $line | perl -ne "s/[^,]*$/$var\n/; print $_;"; done
cat sample.csv
24,1,2,30,12
33,4,5,61,3333
66,7,8,91111,1
76,10,11,32,678
Out:
24,1,2,30,678
33,4,5,61,678
66,7,8,91111,678
76,10,11,32,678
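For comparison, the same column-count-independent replacement can be sketched in awk alone by using $NF (the last field) instead of $3. This is an alternative to the perl loop above, not a translation of it, and it reads the file twice.

```shell
csv=$(mktemp)
printf '24,1,2,30,12\n33,4,5,61,3333\n66,7,8,91111,1\n76,10,11,32,678\n' > "$csv"

# First pass remembers the last field of the final line;
# second pass overwrites the last field of every line with it.
result=$(awk 'BEGIN { FS = OFS = "," }
              NR == FNR { last = $NF; next }
              { $NF = last } 1' "$csv" "$csv")
printf '%s\n' "$result"
rm -f "$csv"
```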
I have two CSV files and I want to compare them using AWK and generate a new file.
file1.csv:
"no","loc"
"abc121","C:/pro/in"
"abc122","C:/pro/abc"
"abc123","C:/pro/xyz"
"abc124","C:/pro/in"
file2.csv:
"no","loc"
"abc121","C:/pro/in"
"abc122","C:/pro/abc"
"abc125","C:/pro/xyz"
"abc126","C:/pro/in"
output.csv:
"file1","file2","Diff"
"abc121","abc121","Match"
"abc122","abc122","Match"
"abc123","","Unmatch"
"abc124","","Unmatch"
"","abc125","Unmatch"
"","abc126","Unmatch"
One way with awk:
script.awk:
BEGIN {
FS = ","
}
NR>1 && NR==FNR {
a[$1] = $2
next
}
FNR>1 {
print ($1 in a) ? $1 FS $1 FS "Match" : "\"\"" FS $1 FS "Unmatch"
delete a[$1]
}
END {
for (x in a) {
print x FS "\"\"" FS "Unmatch"
}
}
Output:
$ awk -f script.awk file1.csv file2.csv
"abc121","abc121",Match
"abc122","abc122",Match
"","abc125",Unmatch
"","abc126",Unmatch
"abc124","",Unmatch
"abc123","",Unmatch
I didn't use awk alone, but if I understood the gist of what you're asking correctly, I think this long one-liner should do it...
join -t, -a 1 -a 2 -o 1.1 2.1 1.2 2.2 file1.csv file2.csv | awk -F, '{ if ( $3 == $4 ) var = "\"Match\""; else var = "\"Unmatch\"" ; print $1","$2","var }' | sed -e '1d' -e 's/^,/"",/' -e 's/,$/,""/' -e 's/,,/,"",/g'
Description:
The join portion takes the two CSV files, joins them on the first column (default behavior of join) and outputs all four fields (-o 1.1 2.1 1.2 2.2), making sure to include rows that are unmatched for both files (-a 1 -a 2).
The awk portion takes that output and replaces the combination of the 3rd and 4th columns with either "Match" or "Unmatch", depending on whether they do in fact match. I had to make an assumption about this behavior based on your example.
The sed portion deletes the "no","loc" header from the output (-e '1d') and replaces empty fields with open-close quote marks (-e 's/^,/"",/' -e 's/,$/,""/' -e 's/,,/,"",/g'). This last part might not be necessary for you.
EDIT:
As tripleee points out, the above fails if the two initial files are unsorted. Here's an updated command to fix that. It punts the header line and sorts each file before passing them to join...
join -t, -a 1 -a 2 -o 1.1 2.1 1.2 2.2 <( sed 1d file1.csv | sort ) <( sed 1d file2.csv | sort ) | awk -F, '{ if ( $3 == $4 ) var = "\"Match\""; else var = "\"Unmatch\"" ; print $1","$2","var }' | sed -e 's/^,/"",/' -e 's/,$/,""/' -e 's/,,/,"",/g'
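As an alternative to the join pipeline, a single awk pass can also emit the header row and the quoted "Match"/"Unmatch" values shown in the question. This is only a sketch: it compares the first column alone, the variable names are my own, and rows that exist only in file1.csv come out in unspecified order.

```shell
dir=$(mktemp -d)
printf '%s\n' '"no","loc"' '"abc121","C:/pro/in"' '"abc122","C:/pro/abc"' \
    '"abc123","C:/pro/xyz"' '"abc124","C:/pro/in"' > "$dir/file1.csv"
printf '%s\n' '"no","loc"' '"abc121","C:/pro/in"' '"abc122","C:/pro/abc"' \
    '"abc125","C:/pro/xyz"' '"abc126","C:/pro/in"' > "$dir/file2.csv"

result=$(awk '
BEGIN {
    FS = OFS = ","
    empty = "\"\""
    print "\"file1\"", "\"file2\"", "\"Diff\""
}
FNR == 1  { next }              # skip the header line of each file
NR == FNR { seen[$1]; next }    # first file: remember each key
($1 in seen) { print $1, $1, "\"Match\""; delete seen[$1]; next }
{ print empty, $1, "\"Unmatch\"" }
END {
    # keys left over exist only in the first file; for-in order is unspecified
    for (key in seen) print key, empty, "\"Unmatch\""
}
' "$dir/file1.csv" "$dir/file2.csv")
printf '%s\n' "$result"
rm -rf "$dir"
```

Piping the body through sort would give a stable row order if that matters.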