Bash: rename duplicates for generated values in a CSV file

A bash script has to create email addresses in the following format: the first letter of the first name followed by the full surname, all lowercase, plus #example.com.
The csv file:
id,location,name,email
1,1,John Smith,ab#dc.com
2,2,Paul Robinson,
3,3,Fidel Guererro,qw#er.com
4,4,John Smith,
...
The name column can contain duplicates. In that case the script should append 1 to the email address (e.g. jsmith#example.com for id=1 and jsmith1#example.com for id=4).
I tried the following script:
#!/bin/bash
while IFS=, read -r col1 col2 col3 col4
do
if [ "$col1" == Id ]; then
echo "${col1},${col2},${col3},${col4}"
continue
fi
firstinitial=${col3:0:1}
surname=$(echo $col3 | cut -d' ' -f2)
if [[ $col4 == $col4 ]]; then
col4=${firstinitial,}${surname,,}1#example.com
else
col4=${firstinitial,}${surname,,}#example.com
fi
echo "${col1},${col2},${col3},${col4}"
done < acc.csv
Every address comes out with the 1 suffix.
How can I change this script?
Expected output:
id,location,name,email
1,1,John Smith,jsmith#example.com
2,2,Paul Robinson,probinson#example.com
3,3,Fidel Guererro,fguererro#example.com
4,4,John Smith,jsmith1#example.com
...
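In the script above, `[[ $col4 == $col4 ]]` compares a variable with itself, so it is always true and every address gets the 1. The header test also looks for `Id` while the file has `id`. One possible fix is to remember which base addresses have already been handed out. The following is a minimal sketch, assuming bash 4 or later (for associative arrays and `${var,,}`) and names of the form "First Last"; the function name `make_emails` and the rule that later repeats get 2, 3, and so on are assumptions, not part of the question.

```bash
#!/bin/bash
# Hypothetical helper: reads the CSV on stdin, writes the rewritten CSV to stdout.
make_emails() {
    local -A seen=()    # base address -> how many times it has been used so far
    local id location name email first surname base
    while IFS=, read -r id location name email || [[ -n $id ]]; do
        if [[ $id == id ]]; then             # pass the header through unchanged
            printf '%s,%s,%s,%s\n' "$id" "$location" "$name" "$email"
            continue
        fi
        first=${name:0:1}                    # first letter of the first name
        surname=${name##* }                  # last word of the name
        base=${first,,}${surname,,}          # e.g. jsmith
        if [[ -n ${seen[$base]:-} ]]; then
            # Assumption: the Nth repeat of a base address gets the suffix N.
            email=${base}${seen[$base]}#example.com
            seen[$base]=$(( ${seen[$base]} + 1 ))
        else
            email=${base}#example.com
            seen[$base]=1
        fi
        printf '%s,%s,%s,%s\n' "$id" "$location" "$name" "$email"
    done
}
```

It could be called as `make_emails < acc.csv > acc_new.csv`. Like the original script, it overwrites any address already present in the email column.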

Related

How do I replace a value in a CSV file?

I have the following program
#!/bin/bash
exec 3< lista.csv
read -u 3 header
declare -i id_nou
echo "ID: "
read id_nou
while IFS=, && read -u 3 -r id nume prenume seria grupa nota
do
if [ "$id_nou" -eq "$id" ]
then
echo "Nota noua: "
read nota_noua
nota=$nota_noua
print > lista.csv
fi
done
My csv file looks something like this:
id,nume,prenume,grupa,seria,nota
1,Ion,Andrada,1003,A,8
2,Simion,Raluca,1005,A,7
3,Gheorghita,Mihail,1009,B,5
4,Mihailescu,Georgina,1002,A,6
What I'm trying to do is replace the nota value in the row with the matching id with a value entered from the keyboard, but this doesn't seem to work.
The error message is
line 14: print: command not found
Here's one in awk:
awk 'BEGIN {
FS=OFS="," # comma field separators
printf "id: " # ask for id
if((getline id < "/dev/stdin")<=0) # store to a variable
exit 1
printf "nota: " # ...
if((getline nota < "/dev/stdin")<=0)
exit 1
}
$1==id { # if first field matches
$NF=nota # replace last field value
}
1' file # output
Output:
id: 1
nota: THIS IS NEW VALUE
id,nume,prenume,grupa,seria,nota
1,Ion,Andrada,1003,A,THIS IS NEW VALUE
2,Simion,Raluca,1005,A,7
3,Gheorghita,Mihail,1009,B,5
4,Mihailescu,Georgina,1002,A,6
Here is some info on saving the changes.
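A common way to save the changes is to write the awk output to a temporary file and then move it over the original. The sketch below assumes that approach; the function name `set_nota` and its argument order are hypothetical, and it takes the id and the new value as arguments instead of prompting for them.

```bash
#!/bin/bash
# Hypothetical helper: set_nota FILE ID NEW_VALUE
set_nota() {
    local file=$1 id=$2 nota=$3 tmp
    tmp=$(mktemp) || return 1
    # -v passes the shell values into awk; NR > 1 leaves the header line alone.
    if awk -v id="$id" -v nota="$nota" '
        BEGIN { FS = OFS = "," }
        NR > 1 && $1 == id { $NF = nota }   # replace the last field of the matching row
        1                                   # print every line
    ' "$file" > "$tmp"; then
        mv "$tmp" "$file"                   # replace the original only if awk succeeded
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For example, `set_nota lista.csv 2 10` would set the nota of id 2 to 10. Note that `mv` gives the file the permissions of the temporary file, and that `awk -v` interprets backslash escapes in the value.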

UNIX Pattern Sequence

The following scenario is a pattern search in a UNIX shell: the lines between two marker strings need to be found, and then a new column holding a sequence number needs to be added.
Input Data
1|AB|1|2
2|BC|1|2
ID CLOSED
3|AB|1|2
4|BC|1|2
ID CLOSED
Query
As per the data above, we need to add a SEQ column after UN. It should hold 1 for the first block, 2 for the second block, and so on until the end.
Expected Output
1|AB|1|2|1
2|BC|1|2|1
3|AB|1|2|2
4|BC|1|2|2
I tried the following as a first step, but it isn't giving the correct output:
sed -n '/^ID/,/^ID CLOSED/{p;/^pattern2/q}'
Any particular reason you want to use sed for this? It seems like a better fit for awk:
awk -v{,O}FS='|' '
BEGIN { seq = 1 }
/CLOSED/ { seq++ }
!/^ID/ { $5=seq; print }'
Output:
1|AB|1|2|1
2|BC|1|2|1
3|AB|1|2|2
4|BC|1|2|2
Maybe something like this:
(
seq=1
echo "ID NAME ID1 ID2 ID3 UN SEQ"
while read id name id1 id2 id3 un; do
[ "$id $name" = "ID NAME" ] && continue
[ "$id $name" = "ID CLOSED" ] && { let "seq+=1"; continue; }
echo "$id $name $id1 $id2 $id3 $un $seq"
done < /path/to/the/datafile
echo "ID CLOSED"
) | column -t -s' '
Doing this with just a sed command is probably not impossible, but it is much harder ;)
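For the pipe-delimited input shown in the question, the same counting idea can also be written as a plain bash loop. This is only a sketch under the assumption that every block ends with a line containing CLOSED and that any other line starting with ID is a marker to drop; the function name `add_seq` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: reads the pipe-delimited blocks on stdin.
add_seq() {
    local seq=1 line
    while IFS= read -r line || [[ -n $line ]]; do
        case $line in
            '')       ;;                                  # skip empty lines
            *CLOSED*) seq=$((seq + 1)) ;;                 # end of a block: bump the counter
            ID*)      ;;                                  # drop any other ID marker line
            *)        printf '%s|%s\n' "$line" "$seq" ;;  # data line: append the counter
        esac
    done
}
```

It would be used as `add_seq < /path/to/the/datafile`.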

Is there a way to change/clean a variable inside a for loop in a shell script?

RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$i
With this command I create 3 apps named 01-SM1, 02-SM1Touch and 03-Data. Inside the for loop I also need to pass part of each value to another parameter: SM1, SM1Touch and Data, without the number and the "-" before the app name, like below:
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$i --webappconf=$WEBAPPNAMEWITHOUTNUMBERANDMINUSBEFORE
You should use an array for RD_OPTION_AZWEBAPPNAME instead of a string.
Then you can iterate over it instead of parsing it with sed.
Something like this :
RD_OPTIONS=(
"SM1"
"SM1Touch"
"Data"
)
for number in $(seq -f "%02g" 1 ${#RD_OPTIONS[@]})
do
name=${RD_OPTIONS[10#$number-1]} # 10# forces base 10, so 08 and 09 are not read as octal
full="$number-$name"
echo "number: $number"
echo "name: $name"
echo "full: $full"
done
will print
number: 01
name: SM1
full: 01-SM1
number: 02
name: SM1Touch
full: 02-SM1Touch
number: 03
name: Data
full: 03-Data
So you could do this :
RD_OPTIONS=(
"SM1"
"SM1Touch"
"Data"
)
for number in $(seq -f "%02g" 1 ${#RD_OPTIONS[@]})
do
name=${RD_OPTIONS[10#$number-1]} # 10# forces base 10, so 08 and 09 are not read as octal
full="$number-$name"
/bin/az group deployment create --name Template2020 --RD_OPTION_AZWEBAPPNAME=$full --webappconf=$name
done
Consider this
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"
arr1=( ${RD_OPTION_AZWEBAPPNAME//,/' '} ) # convert your var to an array
arr2=( ${arr1[@]//*-/} ) # create a second array with the names SM1, SM1Touch, Data
arr3=( ${arr1[@]} ${arr2[@]} ) # create a combined array
for name in ${arr3[@]}; { your_code; } # loop through the combined array with your code
for i in $(echo $RD_OPTION_AZWEBAPPNAME | sed "s/,/ /g");
do
echo $i
export AZWEBAPPNAMENONUMBER=$(echo "$i" | cut -c 4-) # drop the first three characters, e.g. "01-"
done
This is the approach I settled on.
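The prefix can also be removed with shell parameter expansion, without calling `sed` or `cut`. The following is a minimal sketch, assuming every value contains a "-" after the number; `echo` stands in for the real `az` call so the example is safe to run, and the array name `apps` is made up for the illustration.

```bash
#!/bin/bash
RD_OPTION_AZWEBAPPNAME="01-SM1,02-SM1Touch,03-Data"

# Split the string on commas into an array.
IFS=, read -r -a apps <<< "$RD_OPTION_AZWEBAPPNAME"

for i in "${apps[@]}"; do
    name=${i#*-}    # remove the shortest prefix ending in "-": 01-SM1 becomes SM1
    # echo only prints the command line; drop it to run the real command.
    echo /bin/az group deployment create --name Template2020 \
        --RD_OPTION_AZWEBAPPNAME="$i" --webappconf="$name"
done
```

Unlike `cut -c 4-`, `${i#*-}` does not depend on the number being exactly two digits long.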

Concatenate string in .csv after x commas using shell/bash

I have several .csv files containing data. The data vendor created the files with the years listed once in the first line, with missing values in between, and the variable names in the second line. The data follow from the third line onwards.
"year 1", , , "year 2", , ,"year 2", , ,
"Var1", "Var2", "Var3", "Var1", "Var2", "Var3", "Var1", "Var2", "Var3"
"ABC" , 1234 , 4567 , "DEF" , 789 , "ABC" , 1234 , 4567 , "DEF"
I am new to shell programming, but it shouldn't be too complicated to write a script that outputs the following:
"Var1_year1", "Var2_year1", "Var3_year1", "Var1_year2", "Var2_year2", "Var3_year2", "Var1_year3", "Var2_year3", "Var3_year3"
"ABC" , 1234 , 4567 , "DEF" , 789 , "ABC" , 1234 , 4567 , "DEF"
Something like
#!/bin/bash
FILES=/Users/pathTo.csvfiles/*.csv
for f in $FILES
do
echo "Processing $f file..."
# 1. Replace the second line with 'Varname_YearX' where YearX comes from the first line
cat ????
# 2. Delete first line
sed -i '' 1d $f
done
echo "Processing complete."
Update: The .csv files vary in their amount of lines. Only the first two lines need to be edited, the following lines are data.
If you want to merge the first and the second line of each CSV, try this.
# No point in using a variable for the wildcard
for f in /Users/pathTo.csvfiles/*.csv
do
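awk 'BEGIN { FS = OFS = "," } # Keep commas when a field is reassigned
NR==1 { # Collect first line
# Squash quotes and spaces so that empty fields are really empty
gsub(/[" ]/, "")
for(i=1;i<=NF;++i)
y[i] = ($i != "" ? $i : y[i-1]) # Carry the last seen year forward
next # Do not fall through to print
}
NR==2 { # Combine collected with current
gsub(/[" ]/, "")
for(i=1;i<=NF;++i)
$i = $i "_" y[i]
}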
# Print everything (except first)
1' "$f" > "$f.tmp"
mv "$f.tmp" "$f"
done
The first loop simply copies the previous field's value to y[i] if the i:th field is empty.
Ugly code using csvtool, various standard tools, and bash:
i=file.csv
paste -d_ <(head -n 2 "$i" | tail -n 1 | csvtool transpose -) \
<(head -n 1 "$i" | csvtool transpose - |
sed '$d;s/ //;/^$/{g;b};h') |
csvtool transpose - | sed 's/[^,]*/"&"/g' | cat - <(tail -n +3 "$i")
Output:
"Var1_year1","Var2_year1","Var3_year1","Var1_year2","Var2_year2","Var3_year2","Var1_year2","Var2_year2","Var3_year2"
"ABC" , 1234 , 4567 , "DEF" , 789 , "ABC" , 1234 , 4567 , "DEF"

How to use a shell script to convert rows to columns or a nice table

I got the following script from Stack Overflow:
#!/bin/sh
in_file=temp2.txt # Input file
params=6 # Parameters count
res_file=$(mktemp) # Temporary file
sep=' ' # Separator character
# Print header
cnt=0
for i in $(cat $in_file | head -$((params*2))); do
if [ $((cnt % 2)) -eq 0 ]; then
echo $i
fi
cnt=$((cnt+1))
done | sed ":a;N;\$!ba;s/\n/$sep/g" >>$res_file
# Parse and print values
cnt=0
for i in $(cat $in_file); do
# Print values, skip param names
if [ $((cnt % 2)) -eq 1 ]; then
echo -n $i >>$res_file
fi
if [ $(((cnt+1) % (params*2))) -eq 0 ]; then
# Values line is finished, print newline
echo >>$res_file
elif [ $((cnt % 2)) -eq 1 ]; then
# More values expected to be printed on this line
echo -n "$sep" >>$res_file
fi
cnt=$((cnt+1))
done
# Make nice table format
cat $res_file | column -t
#rm -f $res_file
But I have about 100+ lines in the input and I'm getting the error "column: line too long", as below:
column: line too long GigabitEthernet0/0 GigabitEthernet1/0/3 GigabitEthernet1/0/5 GigabitEthernet1/0/10
GigabitEthernet1/0/19 GigabitEthernet1/0/33 GigabitEthernet1/0/2
GigabitEthernet1/0/4 GigabitEthernet1/0/7 GigabitEthernet1/0/18
GigabitEthernet1/0/30 GigabitEthernet1/0/44 GigabitEthernet1/0/46
GigabitEthernet1/1/3 GigabitEthernet2/0/1 GigabitEthernet2/0/5
GigabitEthernet2/0/9 GigabitEthernet2/0/14 GigabitEthernet2/0/18
GigabitEthernet2/0/31 GigabitEthernet2/0/34 GigabitEthernet2/0/36
GigabitEthernet2/0/40 GigabitEthernet2/1/3 GigabitEthernet3/0/12
GigabitEthernet3/0/30 GigabitEthernet3/0/32 GigabitEthernet3/0/34
GigabitEthernet3/0/36 GigabitEthernet3/0/38 GigabitEthernet3/0/40
GigabitEthernet3/0/42 GigabitEthernet3/0/44 GigabitEthernet3/0/46
GigabitEthernet3/0/48 GigabitEthernet3/1/2
Any solutions would be welcome; I could not find the author of this script again to ask how this can be avoided.
The input file looks something like this:
{
GigabitEthernet0/0
GigabitEthernet1/0/2
GigabitEthernet1/0/3
GigabitEthernet1/0/4
GigabitEthernet1/0/5
GigabitEthernet1/0/7
GigabitEthernet1/0/10
GigabitEthernet1/0/18
GigabitEthernet1/0/19
GigabitEthernet1/0/30
GigabitEthernet1/0/33
GigabitEthernet1/0/44
GigabitEthernet1/0/45
GigabitEthernet1/0/46
GigabitEthernet1/1/2
GigabitEthernet1/1/3
GigabitEthernet1/1/4
GigabitEthernet2/0/1
GigabitEthernet2/0/2
GigabitEthernet2/0/5
GigabitEthernet2/0/8
GigabitEthernet2/0/9
GigabitEthernet2/0/12
GigabitEthernet2/0/14
GigabitEthernet2/0/15
GigabitEthernet2/0/18
GigabitEthernet2/0/22
GigabitEthernet2/0/31
GigabitEthernet2/0/33
GigabitEthernet2/0/34
GigabitEthernet2/0/35
GigabitEthernet2/0/36
GigabitEthernet2/0/38
GigabitEthernet2/0/40
GigabitEthernet2/1/2
GigabitEthernet2/1/3
GigabitEthernet2/1/4
GigabitEthernet3/0/12
GigabitEthernet3/0/23
GigabitEthernet3/0/30
GigabitEthernet3/0/31
GigabitEthernet3/0/32
GigabitEthernet3/0/33
GigabitEthernet3/0/34
GigabitEthernet3/0/35
GigabitEthernet3/0/36
GigabitEthernet3/0/37
GigabitEthernet3/0/38
GigabitEthernet3/0/39
GigabitEthernet3/0/40
GigabitEthernet3/0/41
GigabitEthernet3/0/42
GigabitEthernet3/0/43
GigabitEthernet3/0/44
GigabitEthernet3/0/45
GigabitEthernet3/0/46
GigabitEthernet3/0/47
GigabitEthernet3/0/48
GigabitEthernet3/1/1
GigabitEthernet3/1/2
GigabitEthernet3/1/3
GigabitEthernet3/1/4
}
The output I need is something like this:
{
GigabitEthernet0/0 | GigabitEthernet1/0/33 |
GigabitEthernet1/0/2 | GigabitEthernet1/0/44 |
GigabitEthernet1/0/3 | GigabitEthernet1/0/43 |
GigabitEthernet1/0/4 | GigabitEthernet1/0/46 |
GigabitEthernet1/0/5 | GigabitEthernet1/1/2 |
GigabitEthernet1/0/7 | GigabitEthernet1/1/3 |
GigabitEthernet1/0/10| GigabitEthernet1/1/4 |
GigabitEthernet1/0/18| GigabitEthernet2/0/1 |
GigabitEthernet1/0/19| GigabitEthernet2/0/2 |
GigabitEthernet1/0/30| GigabitEthernet2/0/5 |
}
I got this resolved using the column command.
It simply prints the output like this:
column vacant_temp4.txt
GigabitEthernet1/0/1 GigabitEthernet1/0/48 GigabitEthernet2/0/34 GigabitEthernet3/0/44 GigabitEthernet4/0/28
GigabitEthernet1/0/2 GigabitEthernet1/1/2 GigabitEthernet2/0/35 GigabitEthernet3/1/1 GigabitEthernet4/0/29
GigabitEthernet1/0/3 GigabitEthernet1/1/3 GigabitEthernet2/0/36 GigabitEthernet3/1/2 GigabitEthernet4/0/30
GigabitEthernet1/0/5 GigabitEthernet1/1/4 GigabitEthernet2/0/38 GigabitEthernet3/1/3 GigabitEthernet4/0/31
GigabitEthernet1/0/7 GigabitEthernet2/0/1 GigabitEthernet2/0/45 GigabitEthernet3/1/4 GigabitEthernet4/0/32
GigabitEthernet1/0/8 GigabitEthernet2/0/5 GigabitEthernet2/1/2 GigabitEthernet4/0/1 GigabitEthernet4/0/33
GigabitEthernet1/0/9 GigabitEthernet2/0/8 GigabitEthernet2/1/3 GigabitEthernet4/0/5 GigabitEthernet4/0/34
GigabitEthernet1/0/14 GigabitEthernet2/0/9 GigabitEthernet2/1/4 GigabitEthernet4/0/6 GigabitEthernet4/0/35
GigabitEthernet1/0/16 GigabitEthernet2/0/10 GigabitEthernet3/0/2 GigabitEthernet4/0/9 GigabitEthernet4/0/36
GigabitEthernet1/0/19 GigabitEthernet2/0/13 GigabitEthernet3/0/5 GigabitEthernet4/0/12 GigabitEthernet4/0/37
GigabitEthernet1/0/20 GigabitEthernet2/0/14 GigabitEthernet3/0/7 GigabitEthernet4/0/13 GigabitEthernet4/0/38
GigabitEthernet1/0/26 GigabitEthernet2/0/16 GigabitEthernet3/0/13 GigabitEthernet4/0/16 GigabitEthernet4/0/39
GigabitEthernet1/0/27 GigabitEthernet2/0/20 GigabitEthernet3/0/16 GigabitEthernet4/0/17 GigabitEthernet4/0/40
GigabitEthernet1/0/28 GigabitEthernet2/0/21 GigabitEthernet3/0/19 GigabitEthernet4/0/18 GigabitEthernet4/0/41
GigabitEthernet1/0/30 GigabitEthernet2/0/25 GigabitEthernet3/0/22 GigabitEthernet4/0/19 GigabitEthernet4/0/42
GigabitEthernet1/0/31 GigabitEthernet2/0/26 GigabitEthernet3/0/25 GigabitEthernet4/0/20 GigabitEthernet4/1/1
GigabitEthernet1/0/35 GigabitEthernet2/0/27 GigabitEthernet3/0/26 GigabitEthernet4/0/21 GigabitEthernet4/1/2
GigabitEthernet1/0/36 GigabitEthernet2/0/28 GigabitEthernet3/0/27 GigabitEthernet4/0/22 GigabitEthernet4/1/3
GigabitEthernet1/0/37 GigabitEthernet2/0/29 GigabitEthernet3/0/37 GigabitEthernet4/0/23 GigabitEthernet4/1/4
GigabitEthernet1/0/40 GigabitEthernet2/0/30 GigabitEthernet3/0/39 GigabitEthernet4/0/24
GigabitEthernet1/0/45 GigabitEthernet2/0/31 GigabitEthernet3/0/41 GigabitEthernet4/0/26
GigabitEthernet1/0/46 GigabitEthernet2/0/32 GigabitEthernet3/0/42 GigabitEthernet4/0/27
It looked better on my PuTTY screen than what you see above.
Thank you oliv, tripleee, paul & all.
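If the "|" separators from the requested output are wanted as well, the column-wise layout can be built by hand. This is a rough sketch, assuming one interface name per line on stdin; the function name `to_columns` and its single argument (the number of columns) are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: to_columns NCOLS < file
to_columns() {
    local cols=$1 n rows r c idx width=0 item
    local -a items=()
    while IFS= read -r item || [[ -n $item ]]; do
        [[ -n $item ]] && items+=("$item")       # ignore empty lines
    done
    n=${#items[@]}
    (( n > 0 )) || return 0
    rows=$(( (n + cols - 1) / cols ))            # ceiling of n / cols
    for item in "${items[@]}"; do                # the widest entry sets the cell width
        (( ${#item} > width )) && width=${#item}
    done
    for (( r = 0; r < rows; r++ )); do
        for (( c = 0; c < cols; c++ )); do
            idx=$(( c * rows + r ))              # fill each column downwards first
            if (( idx < n )); then
                (( c > 0 )) && printf ' '
                printf '%-*s |' "$width" "${items[idx]}"
            fi
        done
        printf '\n'
    done
}
```

For example, `to_columns 2 < vacant_temp4.txt` prints two columns, each cell padded to the longest name and followed by " |".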
