Having trouble with awk - bash

I am trying to assign a variable to an awk statement. I am getting an error. Here is the code:
for i in `checksums.txt` do
md=`echo $i|awk -F'|' '{print $1}'`
file=`echo $i|awk -F'|' '{print $2}'`

for i in `checksums.txt` do
This will try to execute checksums.txt, which is very probably not what you want. If you want the contents of that file do:
for i in $(<checksums.txt) ; do
md=$(echo $i|awk -F'|' '{print $1}')
file=$(echo $i|awk -F'|' '{print $2}')
# ...
(This is not optimal, and will not do what you want if the file has lines with spaces in them, but at least it should get you started.)

You don't need external programs for this:
while IFS=\| read m f; do
printf 'md is %s, filename is %s\n' "$m" "$f"
done < checksums.txt
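A quick way to check this loop is to feed it a small sample. The sketch below assumes a made-up file in hash|filename form (the hashes and names are hypothetical) and adds -r so that backslashes are kept literally.

```shell
#!/usr/bin/env bash
# Hypothetical sample in "hash|filename" form; note the space in the second name.
sample=$(mktemp)
printf '%s\n' \
    'aaa111|report.txt' \
    'bbb222|my notes.txt' > "$sample"

# IFS='|' applies only to this read; -r keeps backslashes literal.
out=$(while IFS='|' read -r m f; do
    printf 'md is %s, filename is %s\n' "$m" "$f"
done < "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

Because read splits only on |, the file name with a space survives intact, which the for loop over $(<checksums.txt) would not manage.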
Edited as per new requirement.
Given the file is already sorted, you could use uniq (assuming GNU uniq and md hash length of 33 characters):
uniq -Dw33 checksums.txt
If GNU uniq is not available, you can use awk
(this version doesn't require a sorted input):
awk -F'|' 'END {
    for (M in m)
        if (m[M] > 1)
            print M, "==>", f[M]
}
{
    m[$1]++
    f[$1] = f[$1] ? f[$1] FS $2 : $2
}' checksums.txt
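To see what grouping duplicates by hash produces, here is a minimal sketch on made-up data. The array names count and names, and the sample hashes, are illustrative assumptions rather than part of the answer above.

```shell
#!/usr/bin/env bash
# Made-up sample: two files share the hash aaa111.
sample=$(mktemp)
printf '%s\n' 'aaa111|a.txt' 'bbb222|b.txt' 'aaa111|copy of a.txt' > "$sample"

# count[h] is the number of lines per hash; names[h] joins the file names with "|".
dups=$(awk -F'|' '
    {
        count[$1]++
        if (count[$1] == 1) names[$1] = $2
        else names[$1] = names[$1] FS $2
    }
    END { for (h in count) if (count[h] > 1) print h, "==>", names[h] }
' "$sample")

printf '%s\n' "$dups"
rm -f "$sample"
```

Only hashes seen more than once are printed, and the input does not need to be sorted.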

while read line; do
set -- `echo $line | tr '|' ' '`
echo md is $1, file is $2
done < checksums.txt


Filter lines based on certain string and then print only some attributes greater

I have a big text file with millions of log lines.
I would like to filter all the lines that satisfy the following criteria:
url should be url=/v2/testB
totalTime value should be greater than 500
I have tried chaining several awk calls and then adding an if block. Can it be done more quickly? Thanks!
while IFS= read -r line; do
value=`echo $line|grep "url=/v2/testB" | awk -F"totaltime=" '{ print $2}'| awk -F"|" '{ print $1}'`
if (( $value > 500 )); then
echo $line
fi
done < file.log
You may use this awk:
awk -F '|' -v OFS=, '$NF == "url=/v2/testB" {v=$3; sub(/^totaltime=/, "", v); if (v+0 > 500) print $2, $3}' file
To make it more readable:
awk -F '|' -v OFS=, '
$NF == "url=/v2/testB" {
v = $3
sub(/^totaltime=/, "", v)
if (v+0 > 500)
print $2, $3
}' file
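The readable version can be tried on a small sample. The log lines below are hypothetical: they assume five pipe-separated fields with the id in field 2, totaltime= in field 3, and the url last, which is what the field numbers in the command imply.

```shell
#!/usr/bin/env bash
# Hypothetical log lines: time|id|totaltime=N|status|url=...
log=$(mktemp)
printf '%s\n' \
    '10:00|id1|totaltime=750|ok|url=/v2/testB' \
    '10:01|id2|totaltime=120|ok|url=/v2/testB' \
    '10:02|id3|totaltime=900|ok|url=/v2/testA' \
    '10:03|id4|totaltime=1000|ok|url=/v2/testB' > "$log"

matches=$(awk -F '|' -v OFS=, '
    $NF == "url=/v2/testB" {
        v = $3
        sub(/^totaltime=/, "", v)      # strip the label, keep the digits
        if (v + 0 > 500)               # force a numeric comparison
            print $2, $3
    }' "$log")

printf '%s\n' "$matches"
rm -f "$log"
```

With this sample only id1 and id4 pass both tests; id2 is too fast and id3 has a different url.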
If you have gnu-awk then it can be reduced to:
awk -F '|' -v OFS=, '$NF == "url=/v2/testB" &&
gensub(/^totaltime=/, "", "1", $3)+0 > 500 {print $2, $3}' file
v+0 is shorthand in awk to convert a string value to a number.
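Why the conversion matters can be shown with a value that sorts differently as text than as a number. This is a small sketch with an invented value of 1000; the variable names as_string and as_number are only for illustration.

```shell
#!/usr/bin/env bash
result=$(awk 'BEGIN {
    v = "totaltime=1000"
    sub(/^totaltime=/, "", v)   # v now holds the string "1000"
    as_string = (v > 500)       # string comparison: "1000" sorts before "500"
    as_number = (v + 0 > 500)   # numeric comparison: 1000 > 500
    print as_string, as_number
}')
printf '%s\n' "$result"
```

The first comparison yields 0 and the second 1, so without +0 a line with totaltime=1000 would be dropped.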
$ awk -F'|' -v OFS=',' '{split($3,t,/=/)} $5=="url=/v2/testB" && t[2]>500{print $2, $3}' file
You seem to be in luck:
awk -F'|' 'BEGIN{FS="|"; OFS=","}
{ url = substr($NF,index($NF,"=")+1)
totaltime = substr($3,index($3,"=")+1)
}
(url == "/v2/testB") && (totaltime+0 > 500) { print $2,$3 }
' file
With your shown samples, please try following awk program.
awk -F'\\||totaltime=' '$NF=="url=/v2/testB" && $4>500{print $2",totaltime="$4}' Input_file
Explanation of the code above:
The -F option sets the field separator to either | or the string totaltime=, for every line of Input_file.
In the main program, two conditions are checked:
a- $NF (the last field) is equal to url=/v2/testB, AND
b- the 4th field is greater than 500.
If both hold, print the 2nd field of the current line, followed by the string ,totaltime= and the 4th field, which is the output the OP asked for.
All the awk solutions are great; if awk is an option, use one of them.
If you want to fix your Bash attempt instead, you can do:
while IFS='|' read -r id ti; do
[[ "${ti#*=}" -gt 500 ]] && printf "%s,%s\n" "$id" "$ti"
done < <(grep 'url=/v2/testB$' file | cut -d '|' -f 2,3)
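The test in that loop relies on the ${ti#*=} parameter expansion, which removes everything up to and including the first = sign. A minimal sketch, with a hypothetical field value and an illustrative verdict variable:

```shell
#!/usr/bin/env bash
ti='totaltime=750'          # hypothetical field value
value=${ti#*=}              # drop the shortest leading match of "*=", leaving 750

# -gt inside [[ ]] compares integers, not strings.
if [[ $value -gt 500 ]]; then
    verdict=slow
else
    verdict=fast
fi
printf '%s %s\n' "$value" "$verdict"
```

No external command is needed to extract the number, which is why the loop only pays for one grep and one cut in total.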
Alternatively, you can eliminate cut and keep all five fields:
while IFS='|' read -r c1 c2 c3 c4 c5; do
[[ "${c3#*=}" -gt 500 ]] && printf "%s,%s\n" "$c2" "$c3"
done < <(grep 'url=/v2/testB$' file)
Either prints:

Iterating through Comma Separated rows in loop in Shell

Failed,2021-12-07 22:30 EST,Scheduled Backup,abc,/clients/FORD_1030PM_EST_Windows2008,Windows File System
Failed,2021-12-07 22:00 EST,Scheduled Backup,def,/clients/FORD_10PM_EST_Windows2008,Windows File System
I want to iterate through these rows instead of column
Expected Output
I tried this
while read line ; do
group=$(awk -F',' '{print $4}')
client=$(awk -F',' '{print $5}')
echo $group
echo $client
done < Final
It's not working, but when I run this on its own
cat Final | awk -F',' '{print $4}'
it gives me the expected output. It only fails when I try it in the loop.
With GNU awk:
awk -F ',' 'BEGINFILE{f++}
f==1{print "client=" $4}
f==2{print "group=" $5}
' Final Final
One-pass awk solution, storing field $5 in an array for printing at the end:
$ awk -F, '{print $4; groups[NR]=$5} END {for (i=1;i<=NR;i++) print groups[i]}' Final.txt
Two-pass awk that eliminates need to store field $5 in an array:
$ awk -F, 'FNR==NR {print $4;next} {print $5}' Final.txt Final.txt
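The two-pass idiom can be checked against the two rows from the question. In this sketch the temporary file stands in for Final.txt; passing the same file twice makes FNR==NR true only during the first read.

```shell
#!/usr/bin/env bash
final=$(mktemp)
cat > "$final" <<'EOF'
Failed,2021-12-07 22:30 EST,Scheduled Backup,abc,/clients/FORD_1030PM_EST_Windows2008,Windows File System
Failed,2021-12-07 22:00 EST,Scheduled Backup,def,/clients/FORD_10PM_EST_Windows2008,Windows File System
EOF

# First pass (FNR==NR): print field 4. Second pass: print field 5.
columns=$(awk -F, 'FNR==NR {print $4; next} {print $5}' "$final" "$final")
printf '%s\n' "$columns"
rm -f "$final"
```

All values of field 4 come out first, then all values of field 5, without storing anything in an array.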
with bash
declare -a client group
while IFS=, read -ra fields; do
    client+=("${fields[3]}")   # 4th column
    group+=("${fields[4]}")    # 5th column
done < Final
printf 'client=%s\n' "${client[@]}"
printf 'group=%s\n' "${group[@]}"
Using miller, it's no different really from a single-pass awk
solution, collecting the values in arrays:
mlr --icsv --implicit-csv-header put '
  @client[NR] = $4;
  @group[NR] = $5;
  filter false;
  end {
    emit @client, "client";
    emit @group, "group";
  }
' Final
The equivalent of the above, as more readable (IMO) awk code:
awk -F, '
{client[NR] = $4; group[NR] = $5}
END {
    for (i=1; i<=NR; i++) print "client=" client[i]
    for (i=1; i<=NR; i++) print "group=" group[i]
}
' Final
Using csvtool is nice because it has a transpose function,
but it still needs help getting to the desired output
csvtool col 4,5 Final \
| csvtool cat <(echo "client,group") - \
| csvtool transpose - \
| awk -F, -v OFS="=" '{for (i=2; i<=NF; i++) print $1, $i}'
LOG="Failed,2021-12-07 22:30 EST,Scheduled Backup,abc,/clients/FORD_1030PM_EST_Windows2008,Windows File System
Failed,2021-12-07 22:00 EST,Scheduled Backup,def,/clients/FORD_10PM_EST_Windows2008,Windows File System"
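# Function wrappers reconstructed from the calls below: $1 is the log text, $2 the column number.
getColumnTextSed() {
    local log="$1" column="$2"
    [ -n "$log" -a -n "$column" ] && sed -E 's/([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),(.*)/\'$column'/mg;t;d' <<<"$log"
}
getColumnTextAWK() {
    local log="$1" column="$2"
    [ -n "$log" -a -n "$column" ] && awk -F',' -v "c=$column" '{print $(c)}' <<<"$log"
}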
echo "# with sed and column number"
echo "-------------------------------"
getColumnTextSed "$LOG" 4
getColumnTextSed "$LOG" 5
echo "-------------------------------"
echo "# with awk and column number"
echo "-------------------------------"
getColumnTextAWK "$LOG" 4
getColumnTextAWK "$LOG" 5
# with sed
# with awk

How to grab fields in inverted commas

I have a text file which contains the following lines:
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab the two fields jeffrey and 90 days from inside the inverted commas and save them in variables?
If awk is an option, you could save an array and then save the elements as individual variables.
$ IFS="\"" read -ra var <<< $(awk -F, '/jeffrey/{ print $1, $NF }' input_file)
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
while read -r line; do # read in line by line
name=$(echo $line | awk -F, ' { print $1} ' | sed 's/"//g') # grab first col and strip "
expire=$(echo $line | awk -F, ' { print $3} '| sed 's/"//g') # grab third col and strip "
echo "$name" "$expire" # do your business
done < yourfile.txt
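IFS=, read -ra arr < <(head -2 txt | tail -1 | cut -d, -f 1,3 | tr -d '"')
echo "${arr[0]}"
echo "${arr[1]}"
The result is stored in an array, so you can access the elements by index. Splitting on the comma rather than on whitespace keeps a value such as 90 days in a single element.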
Maybe the method below, using the sed and awk commands, will help:
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}')
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}')
echo "$expires_in"
Output :
90 days
Note:
This method works only if each username appears once in the file.
As far as I know, usernames are not duplicated.
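The approaches above can also be combined into one awk call that strips the quotes, plus a read that splits on the comma. This is a sketch under the assumption that the fields never contain commas or double quotes themselves; the variable names name and expires are illustrative.

```shell
#!/usr/bin/env bash
input=$(mktemp)
cat > "$input" <<'EOF'
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
EOF

# awk selects the row whose first field is "jeffrey", removes every double
# quote, and prints the first and last fields joined by a comma.
# read then splits that single line on the comma into two variables.
IFS=, read -r name expires < <(
    awk -F, '$1 == "\"jeffrey\"" { gsub(/"/, ""); print $1 "," $NF }' "$input"
)

printf '%s\n%s\n' "$name" "$expires"
rm -f "$input"
```

Matching on the first field, instead of searching the whole line for jeffrey, avoids picking up rows where the name merely appears somewhere else.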

How can I specify a row in awk in for loop?

I'm using the following awk command:
my_command | awk -F "[[:space:]]{2,}+" 'NR>1 {print $2}' | egrep "^[[:alnum:]]"
which successfully returns my data like this:
file Name 1
file Nameone
f i l e Name 1
So as you can see some file names have spaces. This is fine as I'm just trying to echo the file name (nothing special). The problem is calling that specific row within a loop. I'm trying to do it this way:
for num in $rows
do
fileName=$(my_command | awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]])"
echo "$num $fileName"
done
But my output is always null
I've also tried using awk -v record=$i and then printing $record but I get the below results.
f i l e Name 1
Sorry for the confusion: rows is a variable that lists ids like this: 11 12 13
and each one of those ids ties to a file name. My command without doing any parsing looks like this:
id File Info OS
11 File Name1 OS1
12 Fi leNa me2 OS2
13 FileName 3 OS3
I can only use the id field to run the command that I need, but I want to use the File Info field to notify the user of the actual file that the command is being executed against.
I think your $i does not expand as expected. You should quote your arguments this way:
fileName=$(my_command | awk -F "[[:space:]]{2,}+" "NR==$i {print \$2}" | egrep "^[[:alnum:]]")
And you forgot the other ).
As an update to your requirement, you could just pass the rows to a single awk command instead of running a repetitive one inside a loop:
ROWS=(11 12)
function my_command {
    # This function just emulates my_command and should be removed later.
    echo " id    File Info      OS
 11    File Name1     OS1
 12    Fi leNa me2    OS2
 13    FileName 3     OS3"
}
awk -- '
    BEGIN {
        input = ARGV[1]
        while (getline line < input) {
            sub(/^ +/, "", line)
            split(line, a, /  +/)    # split on two or more spaces
            for (i = 2; i < ARGC; ++i) {
                if (a[1] == ARGV[i]) {
                    printf "%s %s\n", a[1], a[2]
                    break
                }
            }
        }
        exit
    }
' <(my_command) "${ROWS[@]}"
That awk command could be condensed to one line as:
awk -- 'BEGIN { input = ARGV[1]; while (getline line < input) { sub(/^ +/, "", line); split(line, a, /  +/); for (i = 2; i < ARGC; ++i) { if (a[1] == ARGV[i]) { printf "%s %s\n", a[1], a[2]; break; }; }; }; exit; }' <(my_command) "${ROWS[@]}"
Or better yet just use Bash instead as a whole:
shopt -s extglob    # needed for the +( ) pattern below
ROWS=(11 12)
while IFS=$' ' read -r LINE; do
    IFS='|' read -ra FIELDS <<< "${LINE// +( )/|}"
    for R in "${ROWS[@]}"; do
        if [[ ${FIELDS[0]} == "$R" ]]; then
            echo "${R} ${FIELDS[1]}"
            break
        fi
    done
done < <(my_command)
It should give an output like:
11 File Name1
12 Fi leNa me2
Shell variables aren't expanded inside single-quoted strings. Use the -v option to set an awk variable to the shell variable:
fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]")
This method avoids having to escape all the $ characters in the awk script, as required in konsolebox's answer.
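The difference between the two quoting styles is easy to reproduce. The sketch below uses three made-up input lines; inside single quotes awk receives the literal text $i, and since awk's own i is unset, $i means $0, so NR is compared with the whole line and nothing matches.

```shell
#!/usr/bin/env bash
i=2   # row number held in a shell variable

# Single quotes: the shell does not expand $i, so awk compares NR with $0.
quoted=$(printf '%s\n' alpha beta gamma | awk 'NR==$i {print}')

# -v copies the shell value into an awk variable named i.
passed=$(printf '%s\n' alpha beta gamma | awk -v i="$i" 'NR==i {print}')

printf 'quoted: [%s]\npassed: [%s]\n' "$quoted" "$passed"
```

The first command prints nothing, which matches the "always null" symptom in the question; the second prints beta.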
As you already heard, you need to populate an awk variable from your shell variable to be able to use the desired value within the awk script, so this:
awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]]"
should be this:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
Also, though, you don't need awk AND grep, since awk can do anything grep can do, so you can change this part of your script:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
to this:
awk -v i="$i" -F "[[:space:]]{2,}+" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
and you don't need a + after a numeric range so you can change {2,}+ to just {2,}:
awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
Most importantly, though, instead of invoking awk once for every invocation of my_command, you can just invoke it once for all of them, i.e. instead of this (assuming this does what you want):
for num in $rows
do
    fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}')
    echo "$num $fileName"
done
you can do something more like this:
for num in $rows
do
    my_command
done |
awk -F '[[:space:]]{2,}' '$2~/^[[:alnum:]]/{print NR, $2}'
I say "something like" because you don't tell us what "my_command", "rows" or "num" are so I can't be precise but hopefully you see the pattern. If you give us more info we can provide a better answer.
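With the details the question later supplied (rows holds ids, and the id is the first column), the single-invocation idea could look like the sketch below. The my_command function here is only a stand-in that prints the table from the question, and the names ids, list, and want are illustrative.

```shell
#!/usr/bin/env bash
rows="11 12"

# Stand-in for the real my_command, using the table from the question.
my_command() {
    printf '%s\n' \
        'id    File Info      OS' \
        '11    File Name1     OS1' \
        '12    Fi leNa me2    OS2' \
        '13    FileName 3     OS3'
}

# One awk run: remember the wanted ids, then print id and File Info for matches.
# The separator is two or more spaces, so single spaces inside a name survive.
result=$(my_command | awk -v ids="$rows" -F '  +' '
    BEGIN { n = split(ids, list, " "); for (k = 1; k <= n; k++) want[list[k]] = 1 }
    NR > 1 && ($1 in want) { print $1, $2 }')

printf '%s\n' "$result"
```

my_command and awk each run once, however many ids are in rows.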
It's pretty inefficient to rerun my_command (and awk) every time through the loop just to extract one line from its output. Especially when all you're doing is printing out part of each line in order. (I'm assuming that my_command really is exactly the same command and produces the same output every time through your loop.)
If that's the case, this one-liner should do the trick:
paste -d' ' <(printf '%s\n' $rows) <(my_command |
awk -F '[[:space:]]{2,}+' '($2 ~ /^[[:alnum:]]/) {print $2}')

Splitting CSVs into files named for one of the columns

I have CSVs like this:
How can I get it to place all of the items from the left column into files named with the items in the right column? E.g. file.txt would contain this list:
So far, I have this:
while read line; do
firstcolumn=$(echo $line | awk -F ",*" '{print $1}')
secondcolumn=$(echo $line | awk -F ",*" '{print $2}')
done < Text/selection.csv
One way using awk:
awk 'BEGIN { FS = "," } { print $1 >> $2 }' infile
This should work -
awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
[jaypal:~/Temp] cat file
[jaypal:~/Temp] awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
[jaypal:~/Temp] ls file*
file file1.txt file2.txt
[jaypal:~/Temp] cat file1.txt
[jaypal:~/Temp] cat file2.txt
You can also do something like this -
awk -F, '{print $1 > $2}' INPUT_FILE
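Unlike the shell, awk's > truncates a file only the first time it is opened during a run; later prints to the same name are appended. A minimal sketch in a scratch directory, with hypothetical item,file rows since the original sample is not shown:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical "item,target file" rows.
printf '%s\n' 'apple,file1.txt' 'pear,file2.txt' 'plum,file1.txt' > "$workdir/selection.csv"

# awk opens each target once per run; later prints to the same name append.
( cd "$workdir" && awk -F, '{ print $1 > $2 }' selection.csv )

file1=$(cat "$workdir/file1.txt")
file2=$(cat "$workdir/file2.txt")
printf 'file1.txt:\n%s\nfile2.txt:\n%s\n' "$file1" "$file2"
rm -rf "$workdir"
```

Both apple and plum end up in file1.txt. Running the command a second time would overwrite the files rather than append, which is the practical difference from the >> variant.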
Pure Bash and under the assumption that all target files are empty or non-existing:
while IFS=',' read item file ; do
echo "$item" >> "$file"
done < "$infile"
sed loves this stuff...
sed "s%\(.*\),\(.*\)%echo \1 >> \2 %" inputfile.txt | sh
