How do I replace information in a CSV file? - bash

I have the following program
#!/bin/bash
exec 3< lista.csv
read -u 3 header
declare -i id_nou
echo "ID: "
read id_nou
while IFS=, && read -u 3 -r id nume prenume seria grupa nota
do
if [ "$id_nou" -eq "$id" ]
then
echo "Nota noua: "
read nota_noua
nota=$nota_noua
print > lista.csv
fi
done
My csv file looks something like this:
id,nume,prenume,grupa,seria,nota
1,Ion,Andrada,1003,A,8
2,Simion,Raluca,1005,A,7
3,Gheorghita,Mihail,1009,B,5
4,Mihailescu,Georgina,1002,A,6
What I'm trying to do is replace the nota value for the corresponding id with a value entered from the keyboard, but this doesn't seem to work.
The error message is
line 14: print: command not found

Here's one in awk:
awk 'BEGIN {
FS=OFS="," # comma field separators
printf "id: " # ask for id
if((getline id < "/dev/stdin")<=0) # store to a variable
exit 1
printf "nota: " # ...
if((getline nota < "/dev/stdin")<=0)
exit 1
}
$1==id { # if first field matches
$NF=nota # replace last field value
}
1' file # output
Output:
id: 1
nota: THIS IS NEW VALUE
id,nume,prenume,grupa,seria,nota
1,Ion,Andrada,1003,A,THIS IS NEW VALUE
2,Simion,Raluca,1005,A,7
3,Gheorghita,Mihail,1009,B,5
4,Mihailescu,Georgina,1002,A,6
Here is some info on saving the changes.
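The awk command above only prints the modified CSV. One common way to save it, sketched here on the assumption that the file has the simple comma-separated layout shown in the question, is to write to a temporary file and move it over the original once awk has succeeded. The function name update_nota and its argument order are hypothetical.

```shell
# Hypothetical helper: set the last field (nota) of the row whose first field is ID.
# Usage: update_nota FILE ID NEW_NOTA
update_nota() {
    local file=$1 id=$2 nota=$3 tmp
    tmp=$(mktemp) || return 1
    # Write the modified CSV to a temporary file first and replace the
    # original only if awk succeeded, so a failure never truncates it.
    if awk -v id="$id" -v nota="$nota" '
        BEGIN { FS = OFS = "," }
        NR > 1 && $1 == id { $NF = nota }   # NR > 1 skips the header row
        { print }
    ' "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

For example, update_nota lista.csv 2 10 would change the nota of id 2 to 10. Note that the replaced file takes on the restrictive permissions that mktemp gives the temporary file.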

Related

Count occurrences in a csv with Bash

I have to create a script that, given a country and a sport, reports the number of medalists and medals won after reading a CSV file.
The CSV file is called "athletes.csv" and has this header:
id|name|nationality|sex|date_of_birth|height|weight|sport|gold|silver|bronze|info
When you call the script, you pass the nationality and the sport as parameters.
The script I have created is this one:
#!/bin/bash
participants=0
medals=0
while IFS=, read -ra array
do
if [[ "${array[2]}" == $1 && "${array[7]}" == $2 ]]
then
participants=$participants++
medals=$(($medals+${array[8]}+${array[9]}+${array[10]))
fi
done < athletes.csv
echo $participants
echo $medals
where array[2] is the nationality, array[7] is the sport and array[8] to array[10] are the numbers of medals won.
When I run the script with the correct parameters I get 0 participants and 0 medals.
Could you help me to understand what I'm doing wrong?
Note: I cannot use awk or grep.
Thanks in advance.
Try this:
#! /bin/bash -p
nation_arg=$1
sport_arg=$2
declare -i participants=0
declare -i medals=0
declare -i line_num=0
while IFS=, read -r _ _ nation _ _ _ _ sport ngold nsilver nbronze _; do
(( ++line_num == 1 )) && continue # Skip the header
[[ $nation == "$nation_arg" && $sport == "$sport_arg" ]] || continue
participants+=1
medals+=ngold+nsilver+nbronze
done <athletes.csv
declare -p participants
declare -p medals
The code uses named variables instead of numbered positional parameters and array indexes to try to improve readability and maintainability.
Using declare -i means that strings assigned to the declared variables are treated as arithmetic expressions. That reduces clutter by avoiding the need for $(( ... )).
The code assumes that the field separator in the CSV file is a comma (,), not | as shown in the header. If the separator is really |, replace IFS=, with IFS='|'.
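To illustrate the declare -i behaviour mentioned above, here is a minimal sketch, assuming bash; the variable names are made up for the demonstration.

```shell
declare -i total=0
gold=2 silver=1 bronze=3

# With the integer attribute, the right-hand side is evaluated as an
# arithmetic expression, so variable names are resolved to their values.
total+=gold+silver+bronze

# Without the attribute, += simply appends the text to the string.
plain=0
plain+=gold+silver

printf '%s %s\n' "$total" "$plain"
```

This prints "6 0gold+silver": the integer variable was summed, the plain one was concatenated.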
I'm assuming that the field delimiter of your CSV file is a comma but you can set it to whatever character you need.
Here's a fixed version of your code:
#!/bin/bash
participants=0
medals=0
{
# skip the header
read
# process the records
while IFS=',' read -ra array
do
if [[ "${array[2]}" == $1 && "${array[7]}" == $2 ]]
then
(( participants++ ))
medals=$(( medals + array[8] + array[9] + array[10] ))
fi
done
} < athletes.csv
echo "$participants" "$medals"
remark: As $1 and $2 are left unquoted they are subject to glob matching (right side of [[ ... == ... ]]). For example you'll be able to show the total number of medals won by the US with:
./script.sh 'US' '*'
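A minimal sketch of the difference between the quoted and unquoted forms, assuming bash; the function names matches and equals are hypothetical and only wrap the two variants of the test:

```shell
# Unquoted right-hand side: $2 is treated as a glob pattern.
matches() { [[ $1 == $2 ]]; }

# Quoted right-hand side: $2 is compared literally.
equals() { [[ $1 == "$2" ]]; }

matches swimming '*' && echo "pattern * matches swimming"
equals swimming '*' || echo "literal * does not equal swimming"
```

Quoting "$1" and "$2" in the script therefore turns the comparison into an exact match, at the cost of losing the wildcard trick.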
But I have to say, doing text processing with pure shell isn't considered good practice; dedicated tools exist for that. Here's an example with awk:
awk -v FS=',' -v country="$1" -v sport="$2" '
BEGIN {
participants = medals = 0
}
NR == 1 { next }
$3 == country && $8 == sport {
participants++
medals += $9 + $10 + $11
}
END { print participants, medals }
' athletes.csv
There's also a potential problem remaining: the CSV format might need a real CSV parser to be read accurately. There exist a few awk libraries for that, but IMHO it's simpler to use a CSV-aware tool that provides the functionality you need.
Here's an example with Miller:
mlr --icsv --ifs=',' filter -s country="$1" -s sport="$2" '
begin {
@participants = 0;
@medals = 0;
}
$nationality == @country && $sport == @sport {
@participants += 1;
@medals += $gold + $silver + $bronze;
}
false;
end { print @participants, @medals; }
' athletes.csv

Is there a way to change/clean a variable inside a for loop in a shell script?

RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$i
With this command I create 3 apps named 01-SM1, 02-SM1Touch and 03-Data, but inside the for loop I also need to pass part of each name to another parameter: SM1, SM1Touch and Data, without the number and the "-" before the app name, like below:
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$i --webappconf=$WEBAPPNAMEWITHOUTNUMBERANDMINUSBEFORE
You should use an array for RD_OPTION_AZWEBAPPNAME instead of a string.
Then you can iterate over it instead of parsing it with sed.
Something like this :
RD_OPTIONS=(
"SM1"
"SM1Touch"
"Data"
)
for number in `seq -f "%02g" 1 ${#RD_OPTIONS[@]}`
do
name=${RD_OPTIONS[10#$number-1]} # 10# forces base 10, so 08 and 09 are not read as octal
full="$number-$name"
echo "number: $number"
echo "name: $name"
echo "full: $full"
done
will print
number: 01
name: SM1
full: 01-SM1
number: 02
name: SM1Touch
full: 02-SM1Touch
number: 03
name: Data
full: 03-Data
So you could do this :
RD_OPTIONS=(
"SM1"
"SM1Touch"
"Data"
)
for number in `seq -f "%02g" 1 ${#RD_OPTIONS[@]}`
do
name=${RD_OPTIONS[10#$number-1]} # 10# forces base 10, so 08 and 09 are not read as octal
full="$number-$name"
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$full --webappconf=$name
done
Consider this
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
arr1=( ${RD_OPTION_AZWEBAPPNAME//,/' '} ) # convert your variable to an array
arr2=( ${arr1[@]//*-/} ) # create a second array with the names SM1, SM1Touch, Data
arr3=( ${arr1[@]} ${arr2[@]} ) # create one combined (mega) array
for name in ${arr3[@]}; { your_code; } # loop through the mega array with your code
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
echo $i
export AZWEBAPPNAMENONUMBER=$(echo "$i" | cut -c 4-) # drop the first three characters, e.g. "01-"
done
This is the approach I settled on.
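The prefix can also be removed without sed or cut by using shell parameter expansion. This is a minimal sketch, assuming every entry has the form NN-name and contains no glob characters; the function name split_app_names is hypothetical.

```shell
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"

# Hypothetical helper: print "full-name short-name" pairs, one per line.
split_app_names() {
    local list=$1 full
    local IFS=,                  # split on commas, only inside this function
    for full in $list; do
        # ${full#*-} removes the shortest leading part that ends in "-"
        printf '%s %s\n' "$full" "${full#*-}"
    done
}

split_app_names "$RD_OPTION_AZWEBAPPNAME"
```

Inside the original loop, "$full" would go to --RD_OPTION_AZWEBAPPNAME and "${full#*-}" to --webappconf. Unlike cut -c 4-, this does not depend on the numeric prefix being exactly two digits long.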

Count a specific character in each line of a text file and remove it at a specific position until it occurs a specific number of times

Hello, I need help with a script that runs on a Solaris system.
I will explain the script in detail.
I have these files:
i)
cat /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt
201807300000000004
201807300000000005
201807300000000006
201807300000000007
201807300000000008
201807200002056422
201807230003099849
201807230003958306
201806290003097219
201806080001062012
201806110001633519
201806110001675603
ii)
cat /tmp/BadTransactions/test_data_for_validation_script.txt
20180720|201807200002056422||57413620344272|030341-213T |580463|WIRE||EUR|EUR|20180720|20180720|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||574000|129|||||||||||||||||||||||||31313001103712|BFNJKL|K| I P P BONNIER PUBLICATIO|||FI|PERS7
20180723|201807230003099849||57100440165173|140197-216U|593619|WIRE||EUR|EUR|20180723|20180723|||||||00000000000060.00|00000000000060.00|Debit||||||||||MPA|||571004|106|||||||||||||||||||||||||57108320141339|Ura Basket / UraNaiset|||-div|||FI|PERS
20180723|201807230003958306||57206820079775|210489-0788|593619|WIRE||EUR|EUR|20180721|20180723|||||||00000000000046.00|00000000000046.00|Debit||||||||||MPA|||578800|106|||||||||||||||||||||||||18053000009026|IC Kodit||| c/o Newsec Asset Manag|||FI|PERS
20180629|201806290003097219||57206820079775|210489-0788|593619|WIRE||EUR|EUR|20180628|20180629|||||||00000000000856.00|00000000000856.00|Debit||||||||||MPA|||578800|106|||||||||||||||||||||||||18053000009018|IC Kodit||| c/o Newsec Asset Manag|||FI|PERS
20180608|201806080001062012||57206820079441|140197-216S|580463|WIRE||EUR|EUR|20180608|20180608|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||541002|129|||||||||||||||||||||||||57108320141339|N FN|K| IKI I P BONNIER PUBLICATION|||FI|PERS7
20180611|201806110001633519||57206820079525|140197-216B|593619|WIRE||EUR|EUR|20180611|20180611|||||||00000000000242.10|00000000000242.10|Debit||||||||||MPA|||535806|106|||||||||||||||||||||||||57108320141339|As Oy Haikkoonsilta|| mannerheimin|||FI|PERS9
20180611|201806110001675603||57206820079092|140197-216Z|580463|WIRE||EUR|EUR|20180611|20180611|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||536501|129|||||||||||||||||||||||||57108320141339|N ^NLKL|K| I P NJ BONNIER PUBLICAT|||FI|PERS7
The script has to check each line of /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt and, if the string is found in /tmp/BadTransactions/test_data_for_validation_script.txt, write the matching line to a new file, /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt.
Then, for each line of this new file, it has to count the "|" characters and, if there are more than 64, delete the "|" in the 61st position of the line. This is repeated until the line has exactly 64 pipes.
For example, if a line has 67 "|", the script deletes the 61st one, checks again, finds 66, deletes the 61st again, and so on until it reaches 64 pipes. In the end every line must have 64 "|".
Here is my code. So far I have only managed to delete the 61st pipe once per line; I cannot write the loop that keeps checking each line until it is down to 64 pipes.
I would appreciate it if you could help me.
#!/bin/bash
PATH=/usr/xpg4/bin:/bin:/usr/bin
while read line
do
grep "$line" /tmp/BadTransactions/test_data_for_validation_script.txt
awk 'NR==FNR { K[$1]; next } ($2 in K)' /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt FS="|" /opt/NorkomConfigS2/inbox/TRANSACTIONS_DAILY_20180730.txt > /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt
sed '/\([^|]*[|]\)\{65\}/ s/|//61' /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt
done < /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt > /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt
OK, this problem breaks down into several steps:
Read a file line by line
Check each line against another file
Count the occurrences of "|" in the matching line
Repeatedly delete the 61st "|" until only 64 of them remain
You could do something like this:
#!/bin/bash
count() { ### We will use this to count how many pipes are there
string="${1}"; shift
char="${1}"
printf "%s" "${string}" | grep -o -e "${char}" | grep -c .
}
file1="/tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt" ### File to read
file2="/tmp/BadTransactions/test_data_for_validation_script.txt" ### File to check for duplicates
file3="/tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt" ### File where to save our final work
printf "" > "${file3}" ### Delete (eventual) history
exec 3<"${file1}" ### Put our data in file descriptor 3
while read -r line <&3; do ### read each line and put it in var "$line"
string="$(grep -e "${line}" "${file2}")" ### Check the line against second file
while [ "$(count "${string}" "|")" -gt 64 ]; do ### While we have more than 64 "|"
string="$(printf "%s" "${string}" | sed -e "s/|//61")" ### Delete the 61st occurrence
done
[ -n "${string}" ] && printf "%s\n" "${string}" >> "${file3}" ### Save the corrected line (if any), one per line, in the third file
done
exec 3>&- ### Clean file descriptor 3
This is not tested, but it should work.
N.B. I am taking for granted that grep returns only one match from the second file.
If that is not the case, you have to check each match individually with something like:
grep -e "${line}" "${file2}" | while IFS= read -r value; do
...
done
EDIT:
For systems like Solaris or others that don't have GNU grep installed, you can replace the count function as follows:
count() {
string="${1}"; shift
char="${1}"
printf "%s" "${string}" | awk -F"${char}" '{print NF-1}'
}
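The counting and deleting steps can also be isolated in one small function, which makes the loop easy to try on short strings. This is a minimal sketch that combines the awk-based count with the same sed expression; the function name trim_pipes and its parameters are hypothetical.

```shell
# Hypothetical helper: delete the POS-th "|" of LINE until at most MAX remain.
# Usage: trim_pipes LINE MAX POS
trim_pipes() {
    local line=$1 max=$2 pos=$3 n
    while :; do
        # Number of pipes = number of "|"-separated fields minus one
        n=$(printf '%s\n' "$line" | awk -F'|' '{ print NF - 1 }')
        # Stop when the line is short enough, or when there is no
        # POS-th pipe left to delete (this avoids an endless loop)
        if [ "$n" -le "$max" ] || [ "$n" -lt "$pos" ]; then
            break
        fi
        line=$(printf '%s\n' "$line" | sed "s/|//$pos")
    done
    printf '%s\n' "$line"
}

trim_pipes 'a|b|c|d|e' 2 2
```

The demo call prints a|bcd|e. For the files in the question the call would be trim_pipes "$string" 64 61.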

bash routine to return the page number of a given line number from text file

Consider a plain text file containing page-breaking ASCII control character "Form Feed" ($'\f'):
alpha\n
beta\n
gamma\n\f
one\n
two\n
three\n
four\n
five\n\f
earth\n
wind\n
fire\n
water\n\f
Note that each page has a random number of lines.
I need a bash routine that returns the page number of a given line number from a text file containing page-breaking ASCII control characters.
After a long time researching the solution I finally came across this piece of code:
function get_page_from_line
{
local nline="$1"
local input_file="$2"
local npag=0
local ln=0
local total=0
while IFS= read -d $'\f' -r page; do
npag=$(( ++npag ))
ln=$(echo -n "$page" | wc -l)
total=$(( total + ln ))
if [ $total -ge $nline ]; then
echo "${npag}"
return
fi
done < "$input_file"
echo "0"
return
}
But, unfortunately, this solution proved to be very slow in some cases.
Is there a better solution?
Thanks!
The idea to use read -d $'\f' and then to count the lines is good.
This version might not look elegant: if nline is less than or equal to the number of lines in the file, the file is read twice (once in full by wc, and once up to line nline by head).
Give it a try, because it is super fast:
function get_page_from_line ()
{
local nline="${1}"
local input_file="${2}"
if [[ $(wc -l "${input_file}" | awk '{print $1}') -lt nline ]] ; then
printf "0\n"
else
printf "%d\n" $(( $(head -n ${nline} "${input_file}" | grep -c "^"$'\f') + 1 ))
fi
}
Performance of awk is better than the above bash version. awk was created for such text processing.
Give this tested version a try:
function get_page_from_line ()
{
awk -v nline="${1}" '
BEGIN {
npag=1;
}
{
if (index($0,"\f")>0) {
npag++;
}
if (NR==nline) {
print npag;
linefound=1;
exit;
}
}
END {
if (!linefound) {
print 0;
}
}' "${2}"
}
When \f is encountered, the page number is increased.
NR is the current line number.
----
For the record, here is another bash version.
This version uses only built-in commands to count the lines in the current page.
The speedtest.sh that you provided in the comments showed it to be slightly ahead (by about 20 seconds), which makes it roughly equivalent to your version:
function get_page_from_line ()
{
local nline="$1"
local input_file="$2"
local npag=0
local total=0
while IFS= read -d $'\f' -r page; do
npag=$(( npag + 1 ))
IFS=$'\n'
for line in ${page}
do
total=$(( total + 1 ))
if [[ total -eq nline ]] ; then
printf "%d\n" ${npag}
unset IFS
return
fi
done
unset IFS
done < "$input_file"
printf "0\n"
return
}
awk to the rescue!
awk -v RS='\f' -v n=09 '$0~"^"n"." || $0~"\n"n"." {print NR}' file
3
Updated the anchoring as suggested in the comments.
$ for i in $(seq -w 12); do awk -v RS='\f' -v n="$i" '$0~"^"n"." || $0~"\n"n"." {print n,"->",NR}' file; done
01 -> 1
02 -> 1
03 -> 1
04 -> 2
05 -> 2
06 -> 2
07 -> 2
08 -> 2
09 -> 3
10 -> 3
11 -> 3
12 -> 3
A script of similar length can be written in bash itself to locate and respond to the embedded <form-feed>s in a file. (It will work in a POSIX shell as well, with a substitute for the string indexing and expr for the math.) For example:
#!/bin/bash
declare -i ln=1 ## line count
declare -i pg=1 ## page count
fname="${1:-/dev/stdin}" ## read from file or stdin
printf "\nln:pg text\n" ## print header
while IFS= read -r l; do ## read each line
if [ "${l:0:1}" = $'\f' ]; then ## if form-feed found (quoted so empty lines do not break the test)
((pg++))
printf "<ff>\n%2s:%2s '%s'\n" "$ln" "$pg" "${l:1}"
else
printf "%2s:%2s '%s'\n" "$ln" "$pg" "$l"
fi
((ln++))
done < "$fname"
Example Input File
The simple input file with embedded <form-feed>s was created with:
$ echo -e "a\nb\nc\n\fd\ne\nf\ng\nh\n\fi\nj\nk\nl" > dat/affex.txt
Which when output gives:
$ cat dat/affex.txt
a
b
c
d
e
f
g
h
i
j
k
l
Example Use/Output
$ bash affex.sh <dat/affex.txt
ln:pg text
1: 1 'a'
2: 1 'b'
3: 1 'c'
<ff>
4: 2 'd'
5: 2 'e'
6: 2 'f'
7: 2 'g'
8: 2 'h'
<ff>
9: 3 'i'
10: 3 'j'
11: 3 'k'
12: 3 'l'
With Awk, you can set RS (the record separator, default newline) to form feed (\f) and FS (the input field separator, default any sequence of horizontal whitespace) to newline (\n) and obtain the number of lines as the number of "fields" in a "record" which is a "page".
The placement of form feeds in your data will produce some empty lines within a page so the counts are off where that happens.
awk -F '\n' -v RS='\f' '{ print NF }' file
You could reduce the number by one if $NF == "", and perhaps pass in the number of the desired page as a variable:
awk -F '\n' -v RS='\f' -v p="2" 'NR==p { print NF - ($NF == "") }' file
To obtain the page number for a particular line, just feed head -n number to the script, or loop over the numbers until you have accrued the sum of lines.
line=1
page=1
for count in $(awk -F '\n' -v RS='\f' '{ print NF - ($NF == "") }' file); do
old=$line
((line += count))
echo "Lines $old through $((line - 1)) are on page $page"
((page++))
done
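The same per-page counts can be accumulated inside awk itself to answer the original question directly. This is a minimal sketch, assuming each form feed sits at the start of the first line of a new page, as in the sample data; the function name page_of_line is hypothetical.

```shell
# Hypothetical helper: print the page that holds line NLINE, or 0 if the
# file has fewer lines than that.
# Usage: page_of_line NLINE FILE
page_of_line() {
    awk -F '\n' -v RS='\f' -v nline="$1" '
        {
            # Lines on this page: fields, minus the empty one left by the
            # newline that precedes the form feed
            total += NF - (NF > 0 && $NF == "")
            if (total >= nline) { print NR; found = 1; exit }
        }
        END { if (!found) print 0 }
    ' "$2"
}
```

With the sample file from the question, page_of_line 4 file reports page 2, because the first page holds lines 1 to 3. awk stops reading as soon as the requested line has been reached.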
This GNU awk script prints the "page" for the line number given as a command-line argument:
BEGIN { ffcount=1;
search = ARGV[2]
delete ARGV[2]
if (!search ) {
print "Please provide linenumber as argument"
exit(1);
}
}
$1 ~ search { printf( "line %s is on page %d\n", search, ffcount) }
/[\f]/ { ffcount++ }
Use it like awk -f formfeeds.awk formfeeds.txt 05 where formfeeds.awk is the script, formfeeds.txt is the file and '05' is a linenumber.
The BEGIN rule deals mostly with the command line argument. The other rules are simple rules:
$1 ~ search applies when the first field matches the commandline argument stored in search
/[\f]/ applies when there is a formfeed

Bash script, command - output to array, then print to file

I need advice on how to achieve this output:
myoutputfile.txt
Tom Hagen 1892
State: Canada
Hank Moody 1555
State: Cuba
J.Lo 156
State: France
output of mycommand:
/usr/bin/mycommand
Tom Hagen
1892
Canada
Hank Moody
1555
Cuba
J.Lo
156
France
I'm trying to achieve this with the following shell script:
IFS=$'\r\n' GLOBIGNORE='*' :; names=( $(/usr/bin/mycommand) )
for name in ${names[@]}
do
#echo $name
echo ${name[0]}
#echo ${name:0}
done
Thanks
Assuming you can always rely on the command to output groups of 3 lines, one option might be
/usr/bin/mycommand |
while read name;
read year;
read state; do
echo "$name $year"
echo "State: $state"
done
An array isn't really necessary here.
One improvement could be to exit the loop if you don't get all three required lines:
while read name && read year && read state; do
# Guaranteed that name, year, and state are all set
...
done
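Filled in, the loop could look like the following minimal sketch. The function name format_records is hypothetical, and the printf line only stands in for the output of /usr/bin/mycommand.

```shell
# Hypothetical helper: read groups of three lines (name, year, state)
# from standard input and print them in the requested layout.
format_records() {
    local name year state
    # The loop body runs only when all three reads succeed, so an
    # incomplete trailing group is silently dropped.
    while IFS= read -r name && IFS= read -r year && IFS= read -r state; do
        printf '%s %s\nState: %s\n' "$name" "$year" "$state"
    done
}

# Sample input standing in for the output of /usr/bin/mycommand
printf '%s\n' 'Tom Hagen' 1892 Canada 'Hank Moody' 1555 Cuba | format_records
```

In the real script this would be /usr/bin/mycommand | format_records > myoutputfile.txt.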
An easy one-liner (not tuned for performance):
/usr/bin/mycommand | xargs -d '\n' -L3 printf "%s %s\nState: %s\n"
It reads 3 lines at a time from the pipe and then passes them to a new instance of printf which is used to format the output.
If you have whitespace at the beginning (it looks like that in your example output), you may need to use something like this:
/usr/bin/mycommand | sed -e 's/^\s*//g' | xargs -d '\n' -L3 printf "%s %s\nState: %s\n"
#!/bin/bash
COUNTER=0
/usr/bin/mycommand | while read LINE
do
if [ $COUNTER = 0 ]; then
NAME="$LINE"
COUNTER=$(($COUNTER + 1))
elif [ $COUNTER = 1 ]; then
YEAR="$LINE"
COUNTER=$(($COUNTER + 1))
elif [ $COUNTER = 2 ]; then
STATE="$LINE"
COUNTER=0
echo "$NAME $YEAR"
echo "State: $STATE"
fi
done
chepner's pure bash solution is simple and elegant, but slow with large input files (loops in bash are slow).
Michael Jaros' solution is even simpler, if you have GNU xargs (verify with xargs --version), but also does not perform well with large input files (external utility printf is called once for every 3 input lines).
If performance matters, try the following awk solution:
/usr/bin/mycommand | awk '
{ ORS = (NR % 3 == 1 ? " " : "\n")
gsub("^[[:blank:]]+|[[:blank:]]*\r?$", "") }
{ print (NR % 3 == 0 ? "State: " : "") $0 }
' > myoutputfile.txt
NR % 3 yields the position of each input line within its group of 3 consecutive lines: 1 for the 1st line, 2 for the 2nd, and 0(!) for the 3rd.
{ ORS = (NR % 3 == 1 ? " " : "\n") determines ORS, the output-record separator, based on that index: a space for line 1, and a newline for lines 2 and 3; the space ensures that line 2 is appended to line 1 with a space when using print.
gsub("^[[:blank:]]+|[[:blank:]]*\r?$", "") strips leading and trailing whitespace from the line - including, if present, a trailing \r, which your input seems to have.
{ print (NR % 3 == 0 ? "State: " : "") $0 } prints the trimmed input line, prefixed by "State: " only for every 3rd input line, and implicitly followed by ORS (due to use of print).

Resources