Updating a specific field with sed - shell

I'm trying to update a specific field on a specific line with the sed command in Bourne shell.
Let's say I have a file TopScorer.txt:
Player:Games:Goals:Assists
Salah:9:9:3
Kane:10:8:4
I need to update the 3rd column (Goals) of a player. I tried the command below, and it works unless Games and Goals have the same value, in which case it updates the first one:
player="Salah"
NewGoals="10"
OldGoals=$(awk -F':' '$1=="'$player'"' TopScorer.txt | cut -d':' -f3)
sed -i '/^'$player'/ s/'$OldGoals'/'$NewGoals'/' TopScorer.txt
Output: Salah:10:9:3 instead of Salah:9:10:3
Is there any solution? Should I use delimiters and $3==... to specify that field?
I also tried the option /2 for second occurrence but it's not very convenient in my case.

You can do this with awk alone, without sed. Note also that awk has its own syntax (-v) for importing variables from the shell, so your code becomes
awk -F: -v pl="$player" -v goals="$NewGoals" \
'BEGIN { OFS = FS } $1 == pl { $3 = goals } 1' TopScorer.txt
The -F: sets the input delimiter to :, and each -v imports a shell variable into awk. BEGIN { OFS = FS } sets the output field separator to the same value as the input separator. We then match on the imported player name and set $3 to the required value; the trailing 1 prints every line.
To make the modifications in place, use a temporary file
awk -F: -v pl="$player" -v goals="$NewGoals" \
'BEGIN { OFS = FS } $1 == pl { $3 = goals } 1' TopScorer.txt > tmpfile && mv tmpfile TopScorer.txt
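As a rough sketch, the temporary-file approach could be wrapped in a small function. The name update_goals and the use of mktemp are my own assumptions, not part of the answer above; the awk program is the same one.

```shell
# Hypothetical helper: update_goals FILE PLAYER GOALS
# Rewrites field 3 for the matching player via a temporary file.
update_goals() {
    file=$1 player=$2 goals=$3
    tmp=$(mktemp) || return 1
    if awk -F: -v pl="$player" -v goals="$goals" \
        'BEGIN { OFS = FS } $1 == pl { $3 = goals } 1' "$file" > "$tmp"
    then
        mv "$tmp" "$file"      # replace the original only if awk succeeded
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a scratch copy of the sample data
demo=$(mktemp)
printf '%s\n' 'Player:Games:Goals:Assists' 'Salah:9:9:3' 'Kane:10:8:4' > "$demo"
update_goals "$demo" Salah 10
cat "$demo"
```

Note that mktemp creates its file with restrictive permissions, so after the mv the updated file may have different mode bits than the original.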

This might work for you (GNU sed):
(player=Salah;newGoals=10;sed -i "/^$player:/s/[^:]*/$newGoals/3" /tmp/file)
The (...) runs the commands in a subshell so the variables do not pollute the current shell. sed matches the first field of each record against the variable player and replaces the third field of the matching record with the contents of newGoals.
P.S. If the variables are needed in further processes the sub shell is not necessary i.e. remove the ( and )
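A small demonstration of why it helps to anchor the address with a trailing colon. The second player name, Salahuddin, is made up for illustration, and the commands print to stdout instead of editing in place.

```shell
player=Salah
newGoals=10
data='Salah:9:9:3
Salahuddin:7:9:1'

# Without the colon the address also matches any name that starts with "Salah"
loose=$(printf '%s\n' "$data" | sed "/^$player/s/[^:]*/$newGoals/3")

# With the colon only the exact first field matches
strict=$(printf '%s\n' "$data" | sed "/^$player:/s/[^:]*/$newGoals/3")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The player value is still interpreted as a regular expression here, so names containing characters such as . or * would need escaping.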

You can try it with Perl
$ player="Salah"
$ NewGoals="10"
$ perl -F: -lane "\$F[2]=$NewGoals if ( \$F[0] eq '$player' ) ; print join(':',@F) " TopScorer.txt
Player:Games:Goals:Assists
Salah:9:10:3
Kane:10:8:4
$
or export them and call the Perl one-liner within single quotes
$ export NewGoals="10"
$ export player="Salah"
$ perl -F: -lane '$F[2]=$ENV{NewGoals} if $F[0] eq $ENV{player} ; print join(":",@F) ' TopScorer.txt
Player:Games:Goals:Assists
Salah:9:10:3
Kane:10:8:4
$
Note that Perl has an -i switch, so you can do the replacement in place (here keeping a .bak backup):
$ perl -i.bak -F: -lane '$F[2]=$ENV{NewGoals} if $F[0] eq $ENV{player} ; print join(":",@F) ' TopScorer.txt
$ cat TopScorer.txt
Player:Games:Goals:Assists
Salah:9:10:3
Kane:10:8:4
$

This will work.
In the first part of the sed expression, I match the full line for the player and capture the fields I want to keep with \( ... \).
In the second part, I rebuild the line from constants plus the captured values \1 and \2.
player="Salah"
NewGoals="10"
sed "s/^$player:\([^:]*\):[^:]*:\([^:]*\)\$/$player:\1:$NewGoals:\2/" TopScorer.txt

Could you please try the following. The advantage of this approach is that the Goals field is not hard-coded: the program looks up the Goals column in the header (it could be the 4th or 5th field, for example) and changes only that column.
1st solution: to change every line that matches the player name, use the following.
NewGoals=10
awk -v newgoals="$NewGoals" 'BEGIN{FS=OFS=":"} FNR==1{for(i=1;i<=NF;i++){if($i=="Goals"){field=i}}} FNR>1{if($1=="Salah"){$field=newgoals}} 1' Input_file
2nd solution: to change a specific player's goals only on a specific row, try the following.
NewGoals=10
awk -v newgoals="$NewGoals" 'BEGIN{FS=OFS=":"} FNR==1{for(i=1;i<=NF;i++){if($i=="Goals"){field=i}}} FNR>1{if($1=="Salah" && FNR==2){$field=newgoals}} 1' Input_file
The above changes only row 2; adjust FNR==2 in the second condition to target another row (FNR is the line number within the current file in awk). To save the output back into Input_file, append > temp_file && mv temp_file Input_file to either command.
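The header lookup can be generalised so that the player and the column name both come from shell variables. This is only a sketch; the function name set_field and its argument order are assumptions of mine.

```shell
# Hypothetical helper: set_field FILE PLAYER COLUMN VALUE (prints the result to stdout)
set_field() {
    awk -v pl="$2" -v col="$3" -v val="$4" '
        BEGIN { FS = OFS = ":" }
        FNR == 1 {                       # locate the column by its header name
            for (i = 1; i <= NF; i++) if ($i == col) field = i
        }
        FNR > 1 && field && $1 == pl { $field = val }   # skipped if the header was not found
        1
    ' "$1"
}

scores=$(mktemp)
printf '%s\n' 'Player:Games:Goals:Assists' 'Salah:9:9:3' 'Kane:10:8:4' > "$scores"
result=$(set_field "$scores" Kane Assists 5)
printf '%s\n' "$result"
rm -f "$scores"
```

The guard on field matters: if the header name were not found, field would be 0 and assigning to $0 would overwrite the whole line.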

Related

Using sed command in shell script for substring and replace position to need

I'm working with data in a text file and can't find a way with sed to select a substring at a fixed position and replace it.
This is what I have:
X|001200000000000000000098765432|1234567890|TQ
This is what I need:
'X','00000098765432','1234567890','TQ'
The following sed code handles the quoting, but it does not cut the second field down to the substring I need (00000098765432):
echo " X|001200000000000000000098765432|1234567890|TQ" | sed "s/ *//g;s/|/','/g;s/^/'/;s/$/'/"
Could you help me?
Rather than sed, I would use awk for this.
echo "X|001200000000000000000098765432|1234567890|TQ" | awk 'BEGIN {FS="|";OFS=","} {print $1,substr($2,17,14),$3,$4}'
Gives output:
X,00000098765432,1234567890,TQ
Here is how it works:
FS = Field separator (in the input)
OFS = Output field separator (the way you want output to be delimited)
BEGIN -> think of it as the place where configurations are set. It runs only one time. So you are saying you want output to be comma delimited and input is pipe delimited.
substr($2,17,14) -> Take $2 (the second field; awk counts from 1) and apply substring to it: 17 is the starting character position and 14 is the number of characters to take from there.
In my opinion, this is much more readable and maintainable than sed version you have.
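To make the substr arithmetic easy to check by eye, here is the same pattern on a short made-up record (the field contents are hypothetical, chosen so the positions are obvious).

```shell
# substr(s, 4, 3) takes 3 characters starting at position 4 (positions count from 1)
out=$(printf '%s\n' 'ID|abcdefghij|tail' |
    awk 'BEGIN { FS = "|"; OFS = "," } { print $1, substr($2, 4, 3), $3 }')
printf '%s\n' "$out"    # ID,def,tail
```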
If you want to put the quotes in, I'd still use awk.
$: awk -F'|' 'BEGIN{q="\047"} {print q $1 q","q substr($2,17,14) q","q $3 q","q $4 q"\n"}' <<< "X|001200000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
If you just want to use sed, note that you say above you want to remove 16 characters, but you are actually only removing 14.
$: sed -E "s/^(.)[|].{14}([^|]+)[|]([^|]+)[|]([^|]+)/'\1','\2','\3','\4'/" <<< "X|0012000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
Using sed
$ sed "s/|\(0[0-9]\{15\}\)\?/','/g;s/^\|$/'/g" input_file
'X','00000098765432','1234567890','TQ'
Using any POSIX awk:
$ echo 'X|001200000000000000000098765432|1234567890|TQ' |
awk -F'|' -v OFS="','" -v q="'" '{sub(/.{16}/,"",$2); print q $0 q}'
'X','00000098765432','1234567890','TQ'
not as elegant as I hoped for, but it gets the job done :
'X','00000098765432','1234567890','TQ'
# gawk profile, created Mon May 9 21:19:17 2022
# BEGIN rule(s)
'BEGIN {
1 _ = sprintf("%*s", (__ = +2)^++__+--__*++__,__--)
1 gsub(".", "[0-9]", _)
1 sub("$", "$", _)
1 FS = "[|]"
1 OFS = "\47,\47"
}
# Rule(s)
1 (NF *= NF == __*__) * sub(_, "|&", $__) * \
sub("^.*[|]", "", $__) * sub(".+", "\47&\47") }'
Tested and confirmed working on gnu gawk 5.1.1, mawk 1.3.4, mawk 1.9.9.6, and macosx nawk
— The 4Chan Teller
awk -v del1="\047" \
-v del2="," \
-v start="3" \
-v len="17" \
'{
gsub(substr($0,start+1,len),"");
gsub(/[\|]/,del1 del2 del1);
print del1$0del1
}' input_file
'X',00000098765432','1234567890','TQ'

change numerical value in file to characters via awk

I'm looking to replace the numerical values in a file with a new value that I provide. The number can appear anywhere in the line; it is often in the third position, but not always. I also want to save the result as a new version of the file.
original format
A:fdg:user@server:r
A:g:1234:xtcy
A:d:1111:xtcy
modified format
A:fdg:user@server:rxtTncC
A:g:replaced_value:xtcy
A:d:replaced_value:xtcy
bash line command with awk:
awk -v newValue="newVALUE" 'BEGIN{FS=OFS=":"} /:.:.*:/ && $3~/^[0-9]+$/{$3=newValue} 1' original_file.txt > replaced_file.txt
You can simply use sed instead of awk:
sed -E 's/\b[0-9]+\b/replaced_value/g' /path/to/infile > /path/to/outfile
Here is an awk that asks you for replacement values for each numerical value it meets:
$ awk '
BEGIN {
FS=OFS=":" # delimiters
}
{
for(i=1;i<=NF;i++) # loop all fields
if($i~/^[0-9]+$/) { # if numerical value found
printf "Provide replacement value for %d: ",$i > "/dev/stderr"
getline $i < "/dev/stdin" # ask for a replacement
}
}1' file_in > file_out # write output to a new file
I would use GNU AWK for this task in the following way. Let the content of file.txt be
A:fdg:user@server:rxtTncC
A:g:1234:xtcy
A:d:1111:xtcy
then
awk 'BEGIN{newvalue="replacement"}{gsub(/[[:digit:]]+/,newvalue);print}' file.txt
output
A:fdg:user@server:rxtTncC
A:g:replacement:xtcy
A:d:replacement:xtcy
Explanation: replace every run of one or more digits with newvalue. Disclaimer: I assumed a numeric value is something consisting solely of digits.
(tested in gawk 4.2.1)
How about
awk -F : -v OFS=: '$3 ~ /^[0-9]+$/ { $3 = "new value"} {print}' original_file >replaced_file
?
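Since the number is not always in the third position, one possible variation is to loop over all fields and replace only those that consist entirely of digits. The sample lines below are made up for the demo.

```shell
out=$(printf '%s\n' 'A:fdg:srv01:r' 'A:g:1234:xtcy' '77:d:1111:x9' |
    awk -v newValue="replaced_value" '
        BEGIN { FS = OFS = ":" }
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) $i = newValue   # the whole field must be digits
        }
        1')
printf '%s\n' "$out"
```

Fields such as srv01 and x9 are left alone because the regex is anchored at both ends, unlike the gsub approach, which replaces digits wherever they occur.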

how to replace a string at a specific position in a csv file using bash

I have several .csv files and each csv file has lines which look like this.
AA,1,CC,1,EE
AA,FF,6,7,8,9
BB,6,7,8,99,AA
I am reading through each line of each csv file and then trying to replace the 4th position of each line beginning with AA with "ZZ"
Expected output
AA,1,CC,ZZ,EE
AA,FF,6,ZZ,8,9
BB,6,7,8,99,AA
However the variable "y" does contain the 4th variable "1" and "7" respectively, but when I use sed command it replaces the first occurrence of "1" with "ZZ".
How do I modify my code to replace only the 4th position of each line irrespective of what value it holds?
My code looks like this
$file = "name of file which contains list of all csv files"
for i in `cat file`
while IFS = read -r line;
do
if [[ $line == AA* ]] ; then
y=$(echo "$line" | cut -d',' -f 4)
sed -i "s/${y}/ZZ/" $i
fi
done < $i
Using sed, you can also direct that only the 4th field of a comma separated values file be changed to "ZZ" for lines beginning "AA" with:
sed -i '/^AA/s/[^,][^,]*/ZZ/4' file
Explanation
sed -i call sed to edit file in place;
general form /find/s/match/replace/occurrence; where
find is /^AA/ line beginning with "AA";
match [^,][^,]* a character not a comma followed by any number of non-commas;
replace /ZZ/4 the 4th occurrence of match with "ZZ".
Note, both awk and sed provide good solutions in this case, so see the answers by @perreal and @RavinderSingh13
Example Input File
$ cat file
AA,1,CC,1,EE
AA,FF,6,7,8,9
BB,6,7,8,99,AA
Example Use/Output
(note: -i not used below so the changes are simply output to stdout)
$ sed '/^AA/s/[^,][^,]*/ZZ/4' file
AA,1,CC,ZZ,EE
AA,FF,6,ZZ,8,9
BB,6,7,8,99,AA
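One caveat worth knowing, shown here on a made-up record with an empty second field: the pattern [^,][^,]* only matches non-empty fields, so the occurrence count drifts when a field is empty. The awk line is included for comparison.

```shell
line='AA,,CC,1,EE'      # hypothetical record whose 2nd field is empty

# sed counts non-empty runs: AA, CC, 1, EE, so the "4th" match is really field 5
sed_out=$(printf '%s\n' "$line" | sed '/^AA/s/[^,][^,]*/ZZ/4')

# awk counts separators, so $4 is still the 4th field
awk_out=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "," } $1 == "AA" { $4 = "ZZ" } 1')

printf 'sed: %s\nawk: %s\n' "$sed_out" "$awk_out"
```

If the data never contains empty fields, the sed form behaves as described above.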
To robustly do this is just:
$ awk 'BEGIN{FS=OFS=","} $1=="AA"{$4="ZZ"} 1' csv
AA,1,CC,ZZ,EE
AA,FF,6,ZZ,8,9
BB,6,7,8,99,AA
Note that the above is doing a literal string comparison and a literal string replacement so unlike the other solutions posted so far it won't fail if the target string (AA in this example) contains regexp metachars like . or *, nor if it can be part of another string like AAX, nor if the replacement string (ZZ in this example) contains backreferences like & or \1.
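A quick way to see the difference between the literal comparison and a regexp match, using a made-up target string A.A that contains a metacharacter:

```shell
data='A.A,1,2,3
AXA,1,2,3
A.AB,1,2,3'

# Literal comparison: only the line whose first field is exactly A.A changes
literal=$(printf '%s\n' "$data" | awk 'BEGIN { FS = OFS = "," } $1 == "A.A" { $4 = "ZZ" } 1')

# Regexp match: the dot matches any character and the pattern is only a prefix
regex=$(printf '%s\n' "$data" | awk 'BEGIN { FS = OFS = "," } /^A.A/ { $4 = "ZZ" } 1')

printf 'literal:\n%s\nregex:\n%s\n' "$literal" "$regex"
```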
If you want to map multiple strings in one pass:
$ awk 'BEGIN{FS=OFS=","; m["AA"]="ZZ"; m["BB"]="FOO"} $1 in m{$4=m[$1]} 1' csv
AA,1,CC,ZZ,EE
AA,FF,6,ZZ,8,9
BB,6,7,FOO,99,AA
and just like GNU sed has -i for "inplace" editing, GNU awk has -i inplace, so you can discard the shell loop and just do:
awk -i inplace '
BEGIN { FS=OFS="," }
(NR==FNR) { ARGV[ARGC++]=$0 }
(NR!=FNR) && ($1=="AA") { $4="ZZ" }
{ print }
' file
and it'll operate on all of the files named in file in one call to awk. "file" in that last case is your file containing a list of other CSV file names.
EDIT1: Since the OP has changed the requirement a bit, adding the following now.
awk 'BEGIN{FS=OFS=","} /^AA/||/^BB/{$4="ZZ"} /^CC/||/^DD/{$5="NEW_VALUE"} 1' Input_file > temp_file && mv temp_file Input_file
Could you please try the following.
awk -F, '/^AA/{$4="ZZ"} 1' OFS=, Input_file > temp_file && mv temp_file Input_file
OR
awk 'BEGIN{FS=OFS=","} /^AA/{$4="ZZ"} 1' Input_file > temp_file && mv temp_file Input_file
Explanation: an annotated version of the code above.
awk '
BEGIN{ ##Starting BEGIN section of awk which will be executed before reading Input_file.
FS=OFS="," ##Setting field separator and output field separator as comma here for all lines of Input_file.
} ##Closing block for BEGIN section of this program.
/^AA/{ ##Checking condition if a line starts from string AA then do following.
$4="ZZ" ##Setting 4th field as ZZ string as per OP.
} ##Closing this condition block here.
1 ##By mentioning 1 we are asking awk to print edited or non-edited line of Input_file.
' Input_file ##Mentioning Input_file name here.
Using sed:
sed -i 's/\(^AA,[^,]*,[^,]*,\)[^,]*/\1ZZ/' input_file

Using awk to search for a line that starts with but also contains a string

I have a file that has multiple lines that start with a keyword. I only want to modify one of them, and it's easy to distinguish the two: I want the one under the [dbinfo] section. The domain name is static, so I know that won't change.
awk -F '=' '$1 ~ /^dbhost/ {print $NF};' myfile.txt
myfile.txt
[ual]
path=/web/
dbhost=ez098sf
[dbinfo]
dbhost=ec0001.us-east-1.localdomain
dbname=ez098sf_default
dbpass=XXXXXX
You can use this awk command to first check for presence of [dbinfo] section and then modify dbhost parameter:
awk -v h='newhost' 'BEGIN{FS=OFS="="}
$0 == "[dbinfo]" {sec=1} sec && $1 == "dbhost"{$2 = h; sec=0} 1' file
[ual]
path=/web/
dbhost=ez098sf
[dbinfo]
dbhost=newhost
dbname=ez098sf_default
dbpass=XXXXXX
You want to utilize a little bit of a state machine here:
awk -F '=' '
$0 ~ /^\[.*\]/ {in_db_info=($0=="[dbinfo]")}
$0 ~ /^dbhost/{if (in_db_info) print $2;}' myfile.txt
You can also do it with sed:
sed '/\[dbinfo\]/,/\[/s/\(^dbhost=\).*/\1domain.com/' myfile.txt
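Combining the ideas above, here is a sketch of a state machine that rewrites the value instead of printing it and resets its flag at every section header. The [other] section and the newhost value are made up for the demo.

```shell
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[ual]
path=/web/
dbhost=ez098sf
[dbinfo]
dbhost=ec0001.us-east-1.localdomain
dbname=ez098sf_default
[other]
dbhost=keep-me
EOF

out=$(awk -v h='newhost' '
    BEGIN { FS = OFS = "=" }
    /^\[.*\]$/ { in_db = ($0 == "[dbinfo]") }   # every section header resets the flag
    in_db && $1 == "dbhost" { $2 = h }          # only rewrite inside [dbinfo]
    1
' "$cfg")
printf '%s\n' "$out"
rm -f "$cfg"
```

Resetting the flag on each header means a dbhost line in a later section stays untouched even if [dbinfo] itself has no dbhost entry.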

Explode to Array

I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns I want and append them to a new file
It works but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array. Using command line arguments doesn't seem like the way to go. ANY COMMENTS ARE WELCOME.
# Takes :: separated file as 1st parameters
SOURCE=$1
# create csv target file
TARGET=${SOURCE/dat/csv}
touch $TARGET
echo #userId,itemId > $TARGET
IFS=","
while read LINE
do
# Replaces all matches of :: with a ,
CSV_LINE=${LINE//::/,}
set -- $CSV_LINE
echo "$1,$2" >> $TARGET
done < $SOURCE
Instead of set, you can use an array:
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
The following would print columns 1 and 2 from infile.dat. Replace $1, $2 with a comma-separated list of the numbered columns you do want.
awk 'BEGIN { FS="::"; OFS="," } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a 1 liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of awk and sed:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, -v OFS=, '{print $1, $2}'
# Or to avoid a UUOC award (and prolong the life of your keyboard by 3 characters)
sed -e 's/::/,/g' inputfile | awk -F, -v OFS=, '{print $1, $2}'
awk is indeed the right tool for the job here, it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$
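For the original task (two columns plus a header line), a single awk call might be enough. This is a sketch on made-up data; the header text userId,itemId is taken from the question's script without its leading #, which is an assumption on my part.

```shell
src=$(mktemp)
printf '%s\n' '1::10::5::111' '2::20::3::222' > "$src"

# Print the header once, then the first two "::"-separated columns as CSV
out=$(awk 'BEGIN { FS = "::"; OFS = ","; print "userId,itemId" } { print $1, $2 }' "$src")
printf '%s\n' "$out"
rm -f "$src"
```

Redirecting the awk output to the target file would replace the touch, echo and while loop from the question.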

Resources