Replace a word stored in a variable - bash

#!/bin/bash
IFS=''
replace="XXX"
for line in `cat test.csv`;do
name=`echo $line|cut -d"|" -f1`
date=`echo $line|cut -d"|" -f3`
echo $name
echo $date
sed -n "s/$name/XXX/gpw" output' test.csv
done
I need to replace the value of $name with XXX, but it's not working.
CSV file contains:
38880|update|20121227|customerXXXX|CXXX|Credit|Comp any|channel:XXX|XXX|XXX|0|Active|N|N|2012-12-31 17:37:46|Y|2012-12-31 17:37:46
And error is:
sed: -e expression #1, char 8: unterminated `s' command

I think you are doing too many things just to change the first column.
This should do it:
awk -v subs="XXX" 'BEGIN{FS=OFS="|"} {$1=subs}1' file
For your given input it returns:
XXX|update|20121227|customerXXXX|CXXX|Credit|Comp any|channel:XXX|XXX|XXX|0|Active|N|N|2012-12-31 17:37:46|Y|2012-12-31 17:37:46
To update your file, use:
awk -v subs="XXX" 'BEGIN{FS=OFS="|"} {$1=subs}1' file > new_file && mv new_file file
If you needed to use bash, you can make use of IFS and read like this:
while IFS="|" read -r name _ date _
do
echo "$name"
echo "$date"
sed -i.bak "s/$name/XXX/g" another_file #do the replacement
done < file
Note $name gets the value of the first field and $date the value of the third, based on | as delimiter.
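One caveat with putting $name straight into a sed pattern is that sed treats it as a regular expression, so characters such as . or * in the value change what is matched. A minimal sketch of one way to guard against that, assuming basic regular expressions and / as the delimiter (escape_for_sed is a hypothetical helper name, not part of any tool):

```shell
#!/usr/bin/env bash
# Hypothetical helper: backslash-escape the characters that are special
# in a basic regular expression, plus the "/" delimiter used below.
escape_for_sed() {
    printf '%s\n' "$1" | sed 's,[][\\/.^$*],\\&,g'
}

name='a.c'                      # unescaped, "." would also match the "b" in "abc"
line='abc|update|a.c'

escaped=$(escape_for_sed "$name")
result=$(printf '%s\n' "$line" | sed "s/$escaped/XXX/g")
printf '%s\n' "$result"         # abc|update|XXX
```

Without the escaping step the same substitution would also rewrite the first field, because a.c matches abc as a regular expression.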

First of all, you can simplify things greatly by using while instead of for. You also don't need the sed. Bash, and other shells, have some quite extensive string manipulation abilities. So, a working version of your script could be:
$ while IFS='|' read -r name rest; do
printf "XXX|%s\n" "$rest"
done < test.csv > new.csv
XXX|update|20121227|customerXXXX|CXXX|Credit|Comp any|channel:XXX|XXX|XXX|0|Active|N|N|2012-12-31 17:37:46|Y|2012-12-31 17:37:46
That will write the new lines into the new.csv file. If you also want to echo the values of $name and $date to the terminal but only save the changed file, use this instead:
$ > new.csv; while IFS='|' read -r name f2 date rest; do
printf "%s\n%s\n" "$date" "$name"
printf "XXX|%s|%s|%s\n" "$f2" "$date" "$rest" >> new.csv
done < test.csv
You seem to want to do more manipulations as well. If so, you can read each field into a variable:
while IFS='|' read -r f{1..17}; do ... ; done < test.csv
The fields will be available as $f1 through $f17. Alternatively, you could use arrays:
$ while IFS='|' read -r -a fields; do
for((i=0; i<${#fields[@]}; i++)); do
echo "Field $i : ${fields[$i]}"
done
done < test.csv
Field 0 : 38880
Field 1 : update
Field 2 : 20121227
Field 3 : customerXXXX
Field 4 : CXXX
Field 5 : Credit
Field 6 : Comp any
Field 7 : channel:XXX
Field 8 : XXX
Field 9 : XXX
Field 10 : 0
Field 11 : Active
Field 12 : N
Field 13 : N
Field 14 : 2012-12-31 17:37:46
Field 15 : Y
Field 16 : 2012-12-31 17:37:46

sed 'h;s/^\([^|]*\)|[^|]*|\([^|]*\)|.*/\1\
\2/p
g
:a
s/^\([^|]*\)\(|.*\)\1/\1\2XXX/
t a
s/^\([^|]*\)|/XXX|/' test.csv
Try this. The script:
extracts and prints name + date
recursively replaces name in the rest of the line
replaces the first occurrence of name

Related

Iterate over a csv and change the values of a column that meets a condition

I have to use bash to iterate over a CSV file and replace the values of a column that meets a condition. Finally, the results have to be stored in an output file.
I have written this code, which reads the file and stores the content in an array. On iterating over the file, if the value at column 13 is equal to "NULL" then the value of this record has to be replaced by "0". Once the file is reviewed the output with the replaced values is stored at file_b.
#!/bin/bash
file="./2022_Accidentalidad.csv"
while IFS=; read -ra array
do
if [[ ${array[13]} == "NULL" ]]; then
echo "${array[13]}" | sed -n 's/NULL/0/g'
fi
done < $file > file_b.csv
The problem is that file_b is empty. Nothing is written there.
How could I do this?
I cannot use awk; I have to use either a for or a while loop to iterate over the file.
Sample input:
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;NULL;NULL;443359,226;4472082,272;N;NULL
Expected output
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL
Thanks a lot in advance.
You don't need sed (and sed -n prints nothing without the p flag, which is why file_b.csv was empty). Just replace ${array[13]} with 0, then print the entire array with ; separators between the fields.
( # in a subshell
IFS=';' # set IFS, that affects `read` and `"${array[*]}"`
while read -ra array
do
if [[ ${array[13]} == "NULL" ]]; then
array[13]=0
fi
echo "${array[*]}"
done
) < "$file" > file_b.csv
"${array[*]}" joins the array elements using the first character of $IFS as the separator.
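The split, replace and join steps can be tried on a single line. This is a small sketch with a shortened, made-up record (five fields instead of the real nineteen), so the index 2 is an assumption of the example rather than the column used in the question:

```shell
#!/usr/bin/env bash
line='2022S000001;Mujer;NULL;NULL;N'

# Split on ";" into an array; this IFS assignment applies only to read.
IFS=';' read -r -a fields <<< "$line"

# Replace the third field if it is the literal string NULL.
[[ ${fields[2]} == NULL ]] && fields[2]=0

# "${fields[*]}" joins the elements with the first character of IFS.
# Doing it inside a command substitution keeps the IFS change local.
joined=$(IFS=';'; printf '%s' "${fields[*]}")
printf '%s\n' "$joined"         # 2022S000001;Mujer;0;NULL;N
```

Only the third field changes; the fourth NULL is left alone because the test looks at one array element, not at the whole line.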
If awk is also an option:
awk 'BEGIN{FS=OFS=";"} NR>1 && $14=="NULL"{$14=0} {print}' "$file" > file_b.csv
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
One idea using a regex and the BASH_REMATCH array:
regex='(([^;]*;){13})(NULL)(;.*)'
while read -r line
do
[[ "${line}" =~ $regex ]] &&
line="${BASH_REMATCH[1]}0${BASH_REMATCH[4]}"
# uncomment following line to display contents of BASH_REMATCH[] array
# declare -p BASH_REMATCH
echo "${line}"
done < file.csv > file_b.csv
This generates:
$ cat file_b.csv
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL

Split a string in bash based on delimiter

I have a file log_file which has contents such as
CCO O-MR1 Sync:No:3:No:346:Yes
CCO P Sync:No:1:No:106:Yes
CCO P Checkout:Yes:1:No:10:No
CCO O-MR1 Checkout(2.2):Yes:1:No:10:No
I am trying to obtain the 4 fields based on ":" delimiter
The script that I have is
#!/bin/bash
log_file=$1
for i in `cat $log_file` ; do
echo $i
field_a=`echo $i | awk -F '[:]' '{print $1}'`
echo $field_a
field_b=`echo $i | awk -F '[:]' '{print $2}'`
echo $lfield_b
...
done
but the value that this code gives for field_a is wrong, it splits the line based on " " delimiter.
echo $i also prints wrong value.
What else can I use to correct this?
This is covered in detail in BashFAQ #1. To summarize, use a while read loop with IFS set to contain (only) the characters that should be used to split fields.
while IFS=: read -r field_a field_b other_fields; do
echo "field_a is $field_a"
echo "field_b is $field_b"
echo "Remaining fields are $other_fields"
done <"$log_file"
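To see that the spaces inside the first field survive, the loop can be run against two of the sample lines. A short sketch, assuming a temporary file stands in for log_file and using rest as a catch-all variable for the remaining fields:

```shell
#!/usr/bin/env bash
log_file=$(mktemp)
printf '%s\n' \
    'CCO O-MR1 Sync:No:3:No:346:Yes' \
    'CCO P Checkout:Yes:1:No:10:No' > "$log_file"

first_fields=''
# IFS=: applies only to read, so each line is split on ":" and nothing else.
while IFS=: read -r field_a rest; do
    first_fields="${first_fields}[${field_a}]"
done < "$log_file"
rm -f "$log_file"

printf '%s\n' "$first_fields"   # [CCO O-MR1 Sync][CCO P Checkout]
```

The for loop over `cat $log_file` in the question splits on whitespace before awk ever sees the line, which is why field_a came out wrong there.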

How to browse a line from a file?

I have a file that contains 10 lines with this sort of content:
aaaa,bbb,132,a.g.n.
I want to walk through every line, char by char, and write the data found before each "," to an output file.
if [ $# -eq 2 ] && [ -f $1 ]
then
echo "Read nr of fields to be saved or nr of commas."
read n
nrLines=$(wc -l < $1)
while $nrLines!="1" read -r line || [[ -n "$line" ]]; do
do
for (( i=1; i<=$n; ++i ))
do
while [ read -r -n1 temp ]
do
if [ temp != "," ]
then
echo $temp > $(result$i)
else
fi
done
paste -d"\n" $2 $(result$i)
done
nrLines=$($nrLines-1)
done
else
echo "File not found!"
fi
}
In parameter $2 I have an empty file in which I will store the data from file $1 after I extract it without the " , " and add a couple of comments.
Example:
My input_file contains:
a.b.c.d,aabb,comp,dddd
My output_file is empty.
I call my script: ./script.sh input_file output_file
After execution the output_file contains:
First line info: a.b.c.d
Second line info: aabb
Third line info: comp
(yes, without the 4th line info)
You can do what you want very simply with parameter-expansion and substring-removal using bash alone. For example, take an example file:
$ cat dat/10lines.txt
aaaa,bbb,132,a.g.n.
aaaa,bbb,133,a.g.n.
aaaa,bbb,134,a.g.n.
aaaa,bbb,135,a.g.n.
aaaa,bbb,136,a.g.n.
aaaa,bbb,137,a.g.n.
aaaa,bbb,138,a.g.n.
aaaa,bbb,139,a.g.n.
aaaa,bbb,140,a.g.n.
aaaa,bbb,141,a.g.n.
A simple one-liner using native bash string handling gives the following results:
$ while read -r line; do echo ${line%,*}; done <dat/10lines.txt
aaaa,bbb,132
aaaa,bbb,133
aaaa,bbb,134
aaaa,bbb,135
aaaa,bbb,136
aaaa,bbb,137
aaaa,bbb,138
aaaa,bbb,139
aaaa,bbb,140
aaaa,bbb,141
Parameter expansion with substring removal works as follows:
var=aaaa,bbb,132,a.g.n.
Beginning at the left and removing up to, and including, the first ',' is:
${var#*,} # bbb,132,a.g.n.
Beginning at the left and removing up to, and including, the last ',' is:
${var##*,} # a.g.n.
Beginning at the right and removing up to, and including, the first ',' is:
${var%,*} # aaaa,bbb,132
Beginning at the right and removing up to, and including, the last ',' is:
${var%%,*} # aaaa
Note: the text to remove above is represented with a wildcard '*', but wildcard use is not required. It can be any allowable text. For example, to remove only ,a.g.n. where the preceding number is 136, you can anchor a pattern substitution at the end of the string:
${var/%,136,*/,136} # aaaa,bbb,136 (all others unchanged)
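The four removal forms can be checked side by side. A small sketch using the sample value from above (the variable names on the left are only for illustration):

```shell
#!/usr/bin/env bash
var='aaaa,bbb,132,a.g.n.'

after_first=${var#*,}     # shortest match removed from the left
after_last=${var##*,}     # longest match removed from the left
before_last=${var%,*}     # shortest match removed from the right
before_first=${var%%,*}   # longest match removed from the right

printf '%s\n' "$after_first" "$after_last" "$before_last" "$before_first"
# bbb,132,a.g.n.
# a.g.n.
# aaaa,bbb,132
# aaaa
```

A single # or % removes as little as possible and a doubled one removes as much as possible, which is the only difference between each pair.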
To print the 2016th line of a file named file.txt, run:
sed -n '2016p' < file.txt
More examples:
sed -n '2p' < file.txt
prints the 2nd line,
sed -n '2011p' < file.txt
the 2011th line,
sed -n '10,33p' < file.txt
lines 10 through 33, and
sed -n '1p;3p' < file.txt
the 1st and 3rd lines.
And so on.
For more detail, please have a look at this tutorial and this answer.
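For large files it can help to stop reading once the wanted line has been printed. A minimal sketch, assuming a wrapper function (print_line is a hypothetical name) around sed's p and q commands:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print line $1 of file $2, then quit so the
# rest of the file is not read.
print_line() {
    sed -n "${1}{p;q;}" "$2"
}

tmp=$(mktemp)
printf '%s\n' first second third > "$tmp"
second_line=$(print_line 2 "$tmp")
rm -f "$tmp"

printf '%s\n' "$second_line"    # second
```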
In native bash the following should do what you want, assuming you replace the contents of your script.sh with the below:
#!/bin/bash
IN_FILE=${1}
OUT_FILE=${2}
IFS=\,
while read line; do
set -- ${line}
for ((i=1; i<=${#}; i++)); do
((${i}==4)) && continue
((n+=1))
printf '%s\n' "Line ${n} info: ${!i}"
done
done < ${IN_FILE} > ${OUT_FILE}
This prints every field except the 4th of each input line, each on its own line in the output file (I assume this is your requirement as per your comment?).
[wspace@wspace sandbox]$ awk -F"," 'BEGIN{OFS="\n"}{for(i=1; i<=NF-1; i++){print "line Info: "$i}}' data.txt
line Info: a.b.c.d
line Info: aabb
line Info: comp
This little snippet can ignore the last field.
updated:
#!/usr/bin/env bash
if [ ! -f "$1" -o $# -ne 2 ];then
echo "Usage: $(basename $0) input_file out_file"
exit 127
fi
input_file=$1
output_file=$2
: > $output_file
if [ "$(wc -l < $1)" -ne 0 ];then
while true
do
read -r -n1 char
if [ "$char" == "" ];then
break
elif [ "$char" != "," ];then
temp=$temp$char
else
echo "line info: $temp" >> $output_file
temp=""
fi
done < $input_file
else
echo "file $1 is empty"
fi
Maybe this is what you want
Did you try
sed "s|,|\n|g" $1 | head -n -1 > $2
I assume that only the last word would not have a comma on its right.
Try this (tested with your sample line):
#!/bin/bash
# script.sh
echo "Number of fields to save ?"
read nf
while IFS=',' read -r -a arr; do
newarr=("${arr[@]:0:nf}")
done < "$1"
for i in "${newarr[@]}"; do
printf "%s\n" "$i"
done > "$2"
Execute script with :
$ ./script.sh inputfile outputfile
Number of fields to save ?
3
$ cat outputfile
a.b.c.d
aabb
comp
All comma-separated words of a line are stored in the array $arr.
The temporary array $newarr keeps only the first $nf elements ($nf is the value read from the user).
The final loop iterates over $newarr and prints the result to $2, the output file.

Reading a file in a shell script and selecting a section of the line

This is probably pretty basic: I want to read in an occurrence file.
Then the program should find all occurrences of "CallTilEdb" in the file Hendelse.logg:
CallTilEdb 8
CallCustomer 9
CallTilEdb 4
CustomerChk 10
CustomerChk 15
CallTilEdb 16
and sum up the right column. For this case it would be 8 + 4 + 16, so the output I would want would be 28.
I'm not sure how to do this, and this is as far as I have gotten with vistid.sh:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r line
do
if [ "$occurance" = $(cut -f1 line) ] #line 10
then
sumTime+=$(cut -f2 line)
fi
done < "$filename"
so the execution in terminal would be
vistid.sh CallTilEdb
but the error I get now is:
/home/user/bin/vistid.sh: line 10: [: unary operator expected
You have a nice approach, but you could use awk to do the same thing much faster:
$ awk -v par="CallTilEdb" '$1==par {sum+=$2} END {print sum+0}' hendelse.logg
28
It may look a bit weird if you haven't used awk so far, but here is what it does:
-v par="CallTilEdb" provide an argument to awk, so that we can use par as a variable in the script. You could also do -v par="$1" if you want to use a variable provided to the script as parameter.
$1==par {sum+=$2} this means: if the first field is the same as the content of the variable par, then add the second column's value into the counter sum.
END {print sum+0} this means: once you are done from processing the file, print the content of sum. The +0 makes awk print 0 in case sum was not set... that is, if nothing was found.
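The three pieces described above fit together as follows. This is a sketch that wraps the same awk program in a shell function (sum_event is a hypothetical name) and feeds it the sample data through a temporary file:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: $1 = event name, $2 = log file.
sum_event() {
    awk -v par="$1" '$1 == par { sum += $2 } END { print sum + 0 }' "$2"
}

log=$(mktemp)
cat > "$log" <<'EOF'
CallTilEdb 8
CallCustomer 9
CallTilEdb 4
CustomerChk 10
CustomerChk 15
CallTilEdb 16
EOF

total=$(sum_event CallTilEdb "$log")
missing=$(sum_event NoSuchEvent "$log")
rm -f "$log"

printf '%s %s\n' "$total" "$missing"   # 28 0
```

The second call shows the effect of the +0: an event that never occurs still prints 0 rather than an empty line.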
In case you really want to make it with bash, you can use read with two parameters, so that you don't have to make use of cut to handle the values, together with some arithmetic operations to sum the values:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r name value # read both values with -r for safety
do
if [ "$occurance" == "$name" ]; then # string comparison
((sumTime+=$value)) # sum
fi
done < "$filename"
echo "sum: $sumTime"
So that it works like this:
$ ./vistid.sh CallTilEdb
sum: 28
$ ./vistid.sh CustomerChk
sum: 25
First of all, you need to change the way you call cut, since cut -f1 line reads a file named line rather than the variable:
$( echo "$line" | cut -f1 )
In line 10 the same fix applies:
if [ "$occurance" = "$( echo "$line" | cut -f1 )" ]
You can then sum by doing:
sumTime=$(( sumTime + $( echo "$line" | cut -f2 ) ))
But you can also use a different approach and put the line values in an array, the final script will look like:
#!/bin/bash
declare -t filename=prova
declare -t occurance="$1"
declare -i sumTime=0
while read -r -a line
do
if [ "$occurance" = "${line[0]}" ]
then
sumTime=$(( sumTime + ${line[1]} ))
fi
done < "$filename"
echo $sumTime
For reference,
id="CallTilEdb"
file="Hendelse.logg"
sum=$(echo "0 $(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1 +/p" < "$file") p" | dc)
echo SUM: $sum
prints
SUM: 28
sed extracts the numbers from the lines starting with the given id, such as CallTilEdb, and prints them in the format number +
echo prepares a string such as 0 8 + 4 + 16 + p, which is the calculation in RPN format
dc does the calculation
Another variant:
sum=$(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1/p" < "$file" | paste -sd+ - | bc)
#or
sum=$(grep -oP "^$id\D*\K\d+" < "$file" | paste -sd+ - | bc)
sed (or grep) extracts and prints only the numbers
paste builds a string like number+number+number (-d+ sets + as the delimiter)
bc does the calculation
Or with perl:
sum=$(perl -slanE '$s+=$F[1] if /^$id/}{say $s' -- -id="$id" "$file")
sum=$(ID="CallTilEdb" perl -lanE '$s+=$F[1] if /^$ENV{ID}/}{say $s' "$file")
Awk translation to script:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
sumTime=$(awk -v entry="$occurance" '
$1==entry{time+=$NF+0}
END{print time+0}' "$filename")

Read words in a specific line in a text file using shell script

In my Bash shell script, I would like to read a specific line from a file that is delimited by : and assign each field to a variable for processing later.
For example I want to read the words found on line 2. The text file:
abc:01APR91:1:50
Jim:02DEC99:2:3
banana:today:three:0
Once I have "read" line 2, I should be able to echo the values as something like this:
echo "$name";
echo "$date";
echo "$number";
echo "$age";
The output would be:
Jim
02DEC99
2
3
For echoing a single line of a file, I quite like sed:
$ IFS=: read name date number age < <(sed -n 2p data)
$ echo $name
Jim
$ echo $date
02DEC99
$ echo $number
2
$ echo $age
3
$
This uses process substitution to get the output of sed to the read command. The sed command uses the -n option so it does not print each line (as it does by default); the 2p means 'when it is line 2, print the line'; data is simply the name of the file.
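Process substitution is a bash feature. Where it is not available, a here-document can carry the output of sed to read instead. A sketch of that variant, assuming a temporary file in place of data and a line_no variable that is not in the original:

```shell
#!/usr/bin/env bash
data_file=$(mktemp)
printf '%s\n' 'abc:01APR91:1:50' 'Jim:02DEC99:2:3' 'banana:today:three:0' > "$data_file"

line_no=2
# The here-document body is the single line printed by sed;
# IFS=: applies only to this read.
IFS=: read -r name date number age <<EOF
$(sed -n "${line_no}p" "$data_file")
EOF
rm -f "$data_file"

printf '%s %s %s %s\n' "$name" "$date" "$number" "$age"   # Jim 02DEC99 2 3
```

As with the process substitution, read runs in the current shell here, so the four variables remain set after the command.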
You can use this:
read name date number age <<< $(awk -F: 'NR==2{printf("%s %s %s %s\n", $1, $2, $3, $4)}' inFile)
echo "$name"
echo "$date"
echo "$number"
echo "$age"

Resources