Unable to separate semi-colon separated line awk - bash

I am trying to do the following:
Read a file line by line.
Each line has the following structure: field1;field2;field3
Use awk to separate each of these fields and then process each of these fields further
The snippet of code I have is:
while read l
do
n=`echo ${l} | awk --field-separator=";" '{print NF}'`
field1=`echo ${l} | awk --field-separator=";" '{print $1}'`
field2=`echo ${l} | awk --field-separator=";" '{print $2}'`
field3=`echo ${l} | awk --field-separator=";" '{print $3}'`
echo ${n} ${field1} ${field2} ${field3}
done < temp
Where temp contains only the following line:
xx;yy;zz
The answer I get on the command line is:
1 xx;yy;zz
I am not sure I understand this output; any explanation would be nice, given that it does work for other files. I am working on a Mac, and this code uses awk within a bash script.

Why awk when you can do it in pure bash?
while IFS=';' read -r field1 field2 field3; do
echo "Field1: $field1"
echo "Field2: $field2"
echo "Field3: $field3"
done < file.txt
Or if you don't know the field count:
while IFS=';' read -ra fields; do
echo "Number of fields: ${#fields[#]}"
echo "Field1 ${fields[0]}"
done < file.txt
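With the temp file from the question (a single line xx;yy;zz), the first loop should print:
Field1: xx
Field2: yy
Field3: zz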

Your awk has no idea what --field-separator=";" means, so when you do this:
awk --field-separator=";" '{print $1}'
your awk is still using the default FS of a space, and so $1 contains your whole input line while $2 and $3 are empty. Use -F';' to set the FS.
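For example, a quick check you can run directly (same data as the question's temp file):
echo 'xx;yy;zz' | awk -F';' '{print NF, $1, $2, $3}'
which prints 3 xx yy zz.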
You are WAY, WAY off the mark in how to write the script you want. If you tell us more about what "process each field" is, we can help you.

It's probably that your awk doesn't support that long option, rather than a bug. Try other formats like these:
while read l
do
n=`echo "${l}" | awk -F\; '{print NF}'`
field1=`echo "${l}" | awk -F\; '{print $1}'`
field2=`echo "${l}" | awk -F\; '{print $2}'`
field3=`echo "${l}" | awk -F\; '{print $3}'`
echo "${n} ${field1} ${field2} ${field3}"
done < temp
Or
while read l
do
n=`echo "${l}" | awk -v 'FS=;' '{print NF}'`
field1=`echo "${l}" | awk -v 'FS=;' '{print $1}'`
field2=`echo "${l}" | awk -v 'FS=;' '{print $2}'`
field3=`echo "${l}" | awk -v 'FS=;' '{print $3}'`
echo "${n} ${field1} ${field2} ${field3}"
done < temp
Or
while read l
do
n=`echo "${l}" | awk 'BEGIN{FS=";"}{print NF}'`
field1=`echo "${l}" | awk 'BEGIN{FS=";"}{print $1}'`
field2=`echo "${l}" | awk 'BEGIN{FS=";"}{print $2}'`
field3=`echo "${l}" | awk 'BEGIN{FS=";"}{print $3}'`
echo "${n} ${field1} ${field2} ${field3}"
done < temp
Try other awks like mawk or nawk as well.
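If you are not sure which awk you actually have (macOS ships the BSD/one-true-awk at /usr/bin/awk, not GNU awk, which is why GNU-only long options such as --field-separator fail), something like this usually tells you:
awk --version 2>/dev/null || awk -W version 2>/dev/null
command -v awk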

Final step - run this script for all files in directory

With 5 hours of learning and a lot of help from really smart people I have a script running perfectly, but I need to scale it up. Currently I key in the filename of a single file on the third line as a variable, save the script and run it. The script processes with no problem. The file is uploaded to Google Cloud Storage, Firebase is written to, and all links work. Everything is great except the manual entry of the filename.
My question is how do I make this same script run for all flac files found in the directory?
#!/bin/bash
cd /var/www/html/library/422980-2560-WIN
file="Date-2019-07-10__Time-16:36:50.flac"
echo $file | awk -F'-' '{print $2, $3, $4, $5}' | awk -F':' '{print $1, $2, $3, $4}' | awk -F'__' '{print $1, $2, $3}' | awk -F'.' '{print $1}' | awk -F'Time' '{print $1 $2}' > awkresults.txt
year=`awk -F' ' '{print $1}' awkresults.txt`
month=`awk -F' ' '{print $2}' awkresults.txt`
date=`awk -F' ' '{print $3}' awkresults.txt`
hour=`awk -F' ' '{print $4}' awkresults.txt`
minute=`awk -F' ' '{print $5}' awkresults.txt`
second=`awk -F' ' '{print $6}' awkresults.txt`
sudo gcloud ml speech recognize /var/www/html/library/422980-2560-WIN/$file --language-code='en-US' >STT.txt
STT=`grep -Po '"transcript": *\K"[^"]*"' STT.txt | cut -d '"' -f2`
sudo gsutil cp /var/www/html/library/422980-2560-WIN/$file gs://422980
sudo /usr/local/fuego --credentials /home/repeater/medialunaauth01-280236ff5e5f.json add 422980 '
{
"bucketObjecturl": "https://storage.googleapis.com/422980/'"$file"'",
"fileDate":"'"$date"'",
"fileMonth":"'"$month"'",
"fileName": "filenametest33",
"fileHour":"'"$hour"'",
"fileMinute":"'"$minute"'",
"fileSecond":"'"$second"'",
"fileYear":"'"$year"'",
"liveOnline": "0",
"qCChecked": "0",
"speechToText":"'"$STT"'",
"transcribedData": ""
}'
sleep 1
rm $file
Noted: I understand that for proper creation of error-free JSON I should be using jq; I will learn it next - I promise.
Change the script to get the filename from a command line argument:
file=$1
Then loop over all the files in the directory:
for file in *.flac
do
/path/to/your/script "$file"
done
Or you could put the loop in your script, and use the wildcard when running the script.
Your script:
#!/bin/bash
cd /var/www/html/library/422980-2560-WIN
for file in "$#"; do
echo $file | awk -F'-' '{print $2, $3, $4, $5}' | awk -F':' '{print $1, $2, $3, $4}' | awk -F'__' '{print $1, $2, $3}' | awk -F'.' '{print $1}' | awk -F'Time' '{print $1 $2}' > awkresults.txt
month=`awk -F' ' '{print $2}' awkresults.txt`
month=`awk -F' ' '{print $2}' awkresults.txt`
date=`awk -F' ' '{print $3}' awkresults.txt`
hour=`awk -F' ' '{print $4}' awkresults.txt`
minute=`awk -F' ' '{print $5}' awkresults.txt`
second=`awk -F' ' '{print $6}' awkresults.txt`
sudo gcloud ml speech recognize /var/www/html/library/422980-2560-WIN/$file --language-code='en-US' >STT.txt
STT=`grep -Po '"transcript": *\K"[^"]*"' STT.txt | cut -d '"' -f2`
sudo gsutil cp /var/www/html/library/422980-2560-WIN/$file gs://422980
sudo /usr/local/fuego --credentials /home/repeater/medialunaauth01-280236ff5e5f.json add 422980 '
{
"bucketObjecturl": "https://storage.googleapis.com/422980/'"$file"'",
"fileDate":"'"$date"'",
"fileMonth":"'"$month"'",
"fileName": "filenametest33",
"fileHour":"'"$hour"'",
"fileMinute":"'"$minute"'",
"fileSecond":"'"$second"'",
"fileYear":"'"$year"'",
"liveOnline": "0",
"qCChecked": "0",
"speechToText":"'"$STT"'",
"transcribedData": ""
}'
sleep 1
rm $file
done
Then run the script as:
/path/to/your/script *.flac
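If you prefer the second option (the loop inside the script, no arguments), a minimal sketch of the skeleton could look like this; the loop body would be the same commands as in the script above:
#!/bin/bash
cd /var/www/html/library/422980-2560-WIN || exit 1
shopt -s nullglob          # skip the loop entirely if there are no .flac files
for file in *.flac; do
    echo "Processing: $file"
    # ... same processing/upload commands as above ...
done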

Bash awk: parsing variable string into another variable? [duplicate]

This question already has answers here:
Linux bash: Multiple variable assignment
(6 answers)
Closed 4 years ago.
I would like to extract all the values contained in $line and put them into variables var1, var2, ... varn. Previously I used this to extract the vars from a file (in.txt):
var1=$(awk '{print $1}' < in.txt)
var2=$(awk '{print $2}' < in.txt)
....
varn=$(awk '{print $n}' < in.txt)
How should I change my awk call so as to use $line instead of in.txt?
I tried these for example
echo $line | var2=$(awk '{print $2}')
or
var2=$(echo $line | awk '{print $2}')
but without success...
========== DETAIL==============
----- calling file:
.....
name=Matrix
line=$(sed -n '/^\[T\]/ {n;p}' in.txt)
echo 'line: ' $line
L1=$(./procline_matrix_vars.sh $line 30 $name)
echo 'L1: ' $L1
------- procline_matrix_vars.sh:
#!/bin/bash
line=$1
choice=$2
var1=$(echo $line | awk '{print $1}')
var2=$(echo $line | awk '{print $2}')
var3=$(echo $line | awk '{print $3}')
var4=$(echo $line | awk '{print $4}')
if [ $choice == 30 ]; then
L1=$(printf '\n\n\n%s = [ %s %s %s %s \n' "$3" "$var1" "$var2" "$var3" "$var4")
fi
echo "${L1%.}"
a possible way:
line="aaa bbb ccc"
var=( $line )
echo "${var[1]}"
echo "my array has ${#var[#]} elements"
output
bbb
my array has 3 elements
maybe shortcut
var=( $( awk '{print $1, $2, $10}' file ) )
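Another option, assuming the fields in $line are whitespace-separated as in the example above, is to let read split the line directly into named variables (a sketch):
line="aaa bbb ccc"
read -r var1 var2 var3 <<< "$line"   # the here-string feeds $line to read
echo "$var2"                         # prints: bbb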

Bad substitution using awk

I am trying to open the files listed in awk's output; the command is:
grep "formatDate\s=" "js/components/" | awk '{print $1}' | awk -F ":" '/1/ {print $1}'
and it (seems to) work correctly.
If I try to open that output as vim's tabs, like this:
vim -p ${ grep "formatDate\s=" "js/components/" | awk '{print $1}' | awk -F ":" '/1/ {print $1}' }
then I get:
-bash: ${ grep "formatDate\s=" "js/components/" | awk '{print $1}' | awk -F ":" '/1/ {print $1}' }: bad substitution
Any help? Thanks.
The way to execute a command is $(), whereas you are using ${}.
Hence, this should work:
vim -p $(grep "formatDate\s=" "js/components/" | awk '{print $1}' | awk -F ":" '/1/ {print $1}')
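A quick illustration of the difference (the names here are only for demonstration):
name="world"
echo "${name}"      # ${...} is parameter expansion: prints world
echo "$(date +%Y)"  # $(...) is command substitution: runs date and prints the current year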

how can I change the order of the strings in the data file by shellscript

The data is saved in 'test_id.fileids' and it's aligned as shown below:
mdlr1/mdlr1-si1299
mdlr1/mdlr1-sa2
mdlr1/mdlr1-si1929
mhxl0/mhxl0-sx242
mhxl0/mhxl0-sa1
fcrz0/fcrz0-si2053
fcrz0/fcrz0-sx343
mgak0/mgak0-sx136
mjjm0/mjjm0-sx107
mjjm0/mjjm0-si1251
...
How could I change them to the following?
mdlr1/si1299-mdlr1
mdlr1/sa2-mdlr1
mdlr1/si1929-mdlr1
mhxl0/sx242-mhxl0
mhxl0/sa1-mhxl0
fcrz0/si2053-fcrz0
fcrz0/sx343-fcrz0
mgak0/sx136-mgak0
mjjm0/sx107-mjjm0
mjjm0/si1251-mjjm0
...
Here's an example
echo "mdlr1/mdlr1-si1299" | awk -F'/' '{split($2,tmpArr,"-"); print $1"/" tmpArr[2]"-"tmpArr[1]}'
output
mdlr1/si1299-mdlr1
You can skip the echo ... | and just put the filename after the awk command, redirect the output to a temp file, and then move that temp file back over your original file (or you can skip the && mv ... and just keep a new and an old version of your file).
awk -F'/' '{split($2,tmpArr,"-"); print $1"/" tmpArr[2]"-"tmpArr[1]}' yourFile > FixedFile && mv FixedFile yourFile
IHTH
Okay, try the following code:
paste -d / <(awk -F'/' '{print $1}' test_id.fileids) <(awk -F'/' '{print $2}' test_id.fileids | awk -F'-' '{print $2 "-" $1}')
Output is:
mdlr1/si1299-mdlr1
mdlr1/sa2-mdlr1
mdlr1/si1929-mdlr1
mhxl0/sx242-mhxl0
mhxl0/sa1-mhxl0
fcrz0/si2053-fcrz0
fcrz0/sx343-fcrz0
mgak0/sx136-mgak0
mjjm0/sx107-mjjm0
mjjm0/si1251-mjjm0
Then you can store to a file such as:
paste -d / <(awk -F'/' '{print $1}' test_id.fileids) <(awk -F'/' '{print $2}' test_id.fileids | awk -F'-' '{print $2 "-" $1}') > output.txt
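If you'd rather avoid reading the file twice, a single sed substitution should also do the swap (a sketch, not part of the paste approach above):
sed -E 's|^([^/]+)/([^-]+)-(.+)$|\1/\3-\2|' test_id.fileids > output.txt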

Having trouble with awk

I am trying to assign the output of an awk statement to a variable, and I am getting an error. Here is the code:
for i in `checksums.txt` do
md=`echo $i|awk -F'|' '{print $1}'`
file=`echo $i|awk -F'|' '{print $2}'`
done
Thanks
for i in `checksums.txt` do
This will try to execute checksums.txt, which is very probably not what you want. If you want the contents of that file do:
for i in $(<checksums.txt) ; do
md=$(echo $i|awk -F'|' '{print $1}')
file=$(echo $i|awk -F'|' '{print $2}')
# ...
done
(This is not optimal, and will not do what you want if the file has lines with spaces in them, but at least it should get you started.)
You don't need external programs for this:
while IFS=\| read m f; do
printf 'md is %s, filename is %s\n' "$m" "$f"
done < checksums.txt
Edited as per new requirement.
Given the file is already sorted, you could use uniq (assuming GNU uniq and md hash length of 33 characters):
uniq -Dw33 checksums.txt
If GNU uniq is not available, you can use awk
(this version doesn't require a sorted input):
awk 'END {
for (M in m)
if (m[M] > 1)
print M, "==>", f[M]
}
{
m[$1]++
f[$1] = f[$1] ? f[$1] FS $2 : $2
}' checksums.txt
while read line
do
set -- `echo $line | tr '|' ' '`
echo md is $1, file is $2
done < checksums.txt
