How to use awk in a `for` loop with two loop variables - shell

I have a zsh shell scripting problem :-(
I have a file containing a list with 4 columns:
NAME SURNAME OLD TOWN
DOE John 30 London
CALAS Maria 50 Athens
...
I want to process only some "elements" of each line. I don't know if this is possible, but it should look something like:
for user,livesIn in `cat MyFile | awk '{print $2 $4}'`
echo "My friend $user lives in $livesIn"
done
Of course this code is wrong, and I couldn't find out how to write it correctly.
Does anyone know if that's possible?
Thanks in advance for your help.

When you want to loop through the output of awk, you can try
awk '{print $2, $4}' MyFile | while read -r user livesIn; do
echo "${user} was last seen in ${livesIn}."
done
In this case awk is not needed:
while read -r field1 user field3 livesIn; do
echo "${user} was last seen in ${livesIn}."
done < MyFile
The constructions above will fail when some field has a space, like New York.
Take a good look at the specification of MyFile and how the fields are separated. Fixed width? TAB character? With a TAB you are lucky:
while IFS=$'\t' read -r field1 user field3 livesIn; do
echo "${user} was last seen in ${livesIn}."
done < MyFile
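As a minimal sketch of the TAB-separated case, assuming a hypothetical sample file built with printf (the names and towns are made up for illustration), the loop below keeps New York as a single field:

```shell
# Hypothetical sample data: four TAB-separated columns, one town containing a space.
myfile=$(mktemp)
printf 'DOE\tJohn\t30\tLondon\n' > "$myfile"
printf 'SMITH\tAnna\t41\tNew York\n' >> "$myfile"

tab=$(printf '\t')   # portable way to obtain a literal TAB character
last_town=""
while IFS="$tab" read -r field1 user field3 livesIn; do
    echo "${user} was last seen in ${livesIn}."
    last_town=$livesIn   # remember the last town read, to show the field stayed whole
done < "$myfile"
rm -f "$myfile"
```

Note that TAB counts as whitespace in IFS, so in bash and other POSIX-style shells two TABs in a row (an empty field) are treated as one separator. If the file can contain empty fields, splitting with awk -F'\t' is probably the safer choice.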

awk processes each line one at a time, so no need for a for loop. Also no need for the cat as awk takes the file name as an argument. Try this:
awk '{print "My friend "$2" lives in "$4}' MyFile

If you are already using awk, why don't you process everything with awk?
For example:
$ awk '{print "My friend", $2, "lives in", $4}' MyFile
That gets the output you are looking for. Unless there is something else not stated in the question.
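If the extra work is mostly tests on the fields, it may still fit inside awk. The sketch below assumes the header line should be skipped and uses a made-up rule (treating John differently) purely for illustration:

```shell
# Sample data taken from the question.
myfile=$(mktemp)
cat > "$myfile" <<'EOF'
NAME SURNAME OLD TOWN
DOE John 30 London
CALAS Maria 50 Athens
EOF

# NR > 1 skips the header; the if/else is an illustrative test on field 2.
output=$(awk 'NR > 1 {
    if ($2 == "John") {
        print $2, "is not my friend, but lives in", $4
    } else {
        print "My friend", $2, "lives in", $4
    }
}' "$myfile")
printf '%s\n' "$output"
rm -f "$myfile"
```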

It is usually simpler and more efficient to do both iterating and processing in AWK rather than trying to divide the task by iterating in the shell and processing in AWK. AWK is well designed for iterating through input, and it also has its own loop structures (e.g. for). If at all possible, do all your processing in AWK.
That said, it seems that your problem also requires access to the input fields in the shell, and so full processing in AWK may not be possible in your case. (It is worth noting that AWK can also execute shell commands, but this may be just another level of complication.)
Other answers use AWK to iterate through the file and print something with columns 2 and 4, like
$ awk '{print "My friend", $2, "lives in", $4}' MyFile
which is fine if you can iterate and process with AWK like this. As an addition to this type of solution, you might want to skip the first line (which seems to have column headers instead of actual data) with
$ awk 'NR>1{print "My friend", $2, "lives in", $4}' MyFile
Your comment
In fact my treatment is a little bit more complicated than a print. I need to make some tests and assignments in the "for" loop.
suggests that what you really want is access to the fields in your shell. You can get this by using AWK to pick out the fields (as before), but piping the values into the shell:
awk 'NR>1{print $2,$4}' MyFile | while read user livesIn; do
echo "My friend $user lives in $livesIn"
done
This gives you $user and $livesIn in the shell, and so you can do more complicated shell processing with it. For example:
awk 'NR>1{print $2,$4}' MyFile | while read user livesIn; do
if [[ "$user" == "John" ]]; then
echo "$user is not my friend, but lives in $livesIn"
else
echo "My friend $user lives in $livesIn"
echo "$user" >> friends.txt
fi
done
Be careful with the format of your input file since AWK is splitting on white space.
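One caveat about assignments inside such a loop: in bash (without the lastpipe option) and in dash, the while loop on the right-hand side of a pipe runs in a subshell, so variables set inside it are gone after done. zsh runs the last part of a pipeline in the current shell, so there the assignments survive. A portable sketch, assuming the sample data from the question and a made-up counter named friends, feeds the loop from a here-document instead of a pipe:

```shell
# Sample data taken from the question.
myfile=$(mktemp)
cat > "$myfile" <<'EOF'
NAME SURNAME OLD TOWN
DOE John 30 London
CALAS Maria 50 Athens
EOF

friends=0
while read -r user livesIn; do
    if [ "$user" != "John" ]; then
        friends=$((friends + 1))   # this assignment survives the loop
    fi
done <<EOF
$(awk 'NR>1 {print $2, $4}' "$myfile")
EOF

echo "Number of friends: $friends"
rm -f "$myfile"
```

In bash and zsh, process substitution (done < <(awk ...)) is a common alternative to the here-document.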

Related

Why does the while loop in that bash script stop after the first line

I was stepping through a list of files with a bash script.
But: it always stopped after the first loop although the list in "$tempdat" was 10 lines long.
while IFS= read -r zeile; do
# Zielsteuerung
quelle=$(awk -F\/ '{print $2}')
if [[ "$quelle" == "foo" ]]; then
do that
else
do s.th. else
fi
rsync somefiles
done < "$tempdat"
After some searching I found the error: the awk call was not correct. It should have been:
quelle=$(echo "$zeile"|awk -F\/ '{print $2}')
But: why did that mistake prevent the loop from finishing? Maybe someone with more bash insight could be so nice to enlighten me. :-)
The awk is consuming all of the data intended for the read in the condition of the loop. Instead of using awk to try to parse the line, it seems like you intend to do:
while IFS=/ read -r a quelle b; do
and read the second column of the file into the variable quelle.
You can also write:
quelle=$(echo "$zeile" | awk -F\/ '{print $2}')
but it is generally best practice to let read split the fields for you.
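A small sketch of the read-based version, assuming a hypothetical three-line list of slash-separated paths in place of the real "$tempdat":

```shell
# Hypothetical stand-in for the original list file.
tempdat=$(mktemp)
printf '%s\n' 'src/foo/a.txt' 'src/bar/b.txt' 'src/foo/c.txt' > "$tempdat"

lines=0
foo_count=0
while IFS=/ read -r first quelle rest; do
    lines=$((lines + 1))
    if [ "$quelle" = "foo" ]; then
        foo_count=$((foo_count + 1))
    fi
    # Any command in the body that reads stdin would swallow the remaining
    # lines; redirecting its input from /dev/null keeps the loop intact.
    cat < /dev/null
done < "$tempdat"
rm -f "$tempdat"

echo "$lines lines read, $foo_count of them from foo"
```

Here cat only stands in for a command that would otherwise read from the loop's input, which is what the bare awk did in the question.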

How to get '2f8b547d..eb94967a' string from the log 'Updating 2f8b547d..eb94967a Fast-forward....' in shell?

I am building a shell script.
The script gets git log such as:
"Updating 2f8b547d..eb94967a Fast-forward...."
but I want to get 2f8b547d..eb94967a snippet.
I am new to shell scripting, so thanks for your help.
Update:
Furthermore, I want to use the snippet as a parameter, because I will execute
git log 2f8b547d..eb94967a
You can pipe it to awk like so:
echo "Updating 2f8b547d..eb94967a Fast-forward...." | awk '{print $2}'
Your result will be 2f8b547d..eb94967a.
If it is a script, say, abc.sh that had such output, then you can run:
$> ./abc.sh | awk '{print $2}'
awk takes the output and splits the information by space. Updating is represented with $1. 2f8b547d..eb94967a is $2 and so on. In the above example, we ask awk to print out the 2nd item in the output.
As an alternative to awk (don't get me wrong, awk is super for this job as well), you can simply use cut with a space delimiter to extract the second field, e.g.
cut -d' ' -f2 yourgit.log
You can also pipe output to cut, or redirect the input file to it using <. It does essentially the same as the awk command; it is just a different choice of utility.
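Here is another alternative, using the shell's read builtin:
read -r u hash rest <<< "Updating 2f8b547d..eb94967a Fast-forward...."
After this, the string you are looking for is stored in the variable hash:
echo "$hash"
The here-string (<<<) is used rather than piping echo into read because in bash each part of a pipeline runs in a subshell by default, so a variable set by a piped read is lost when the pipeline ends.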
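To use the extracted range as a parameter, as the update to the question asks, the value can be captured with command substitution. This is a sketch only: log_line and range are illustrative variable names, and the git call is left as a comment because it needs a real repository:

```shell
# The log text from the question, stored in an illustrative variable.
log_line='Updating 2f8b547d..eb94967a Fast-forward....'

# Capture the second whitespace-separated field.
range=$(printf '%s\n' "$log_line" | awk '{print $2}')
echo "$range"

# Inside a repository, the value could then be passed on:
# git log "$range"
```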

Swap in shell from file

Through cut -d":" -f1,3 I made a new file which looks like this:
username1:number1
username2:number2
username3:number3
But what I want is for my file to look like this:
number1:username1
number2:username2
number3:username3
I tried cut -d":" -f3,1, but it still gives me username1:number1, even though I want the 3rd column to be printed first and the 1st column last. Any help with that?
cut -f3,1 will print the same as cut -f1,3. Use awk:
awk -F: '{print $3 FS $1}' file
I like awk for this sort of thing. You've already got an awk answer though.
To do this in pure bash, you'd do the following:
while IFS=: read -r one two; do printf '%s:%s\n' "$two" "$one"; done < input.txt
The IFS variable is the field separator used to slice up the input into separate variables for read, and I'm using printf to give us predictably formatted output.
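If the two-column file from the question already exists, the swap can also be sketched in awk by setting the output separator with OFS. The temporary file below is a hypothetical stand-in for that file:

```shell
# Hypothetical stand-in for the two-column file produced by cut.
input=$(mktemp)
printf '%s\n' 'username1:number1' 'username2:number2' 'username3:number3' > "$input"

# -F: splits on colons; OFS=: joins the printed fields with a colon again.
swapped=$(awk -F: -v OFS=: '{print $2, $1}' "$input")
printf '%s\n' "$swapped"
rm -f "$input"
```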

CSV Two Column List With Spaces. Need everything before or everything after in two separate variables

I have a CSV list that is two columns (col1 is Share Name, col2 is file system path). I need two variables for either everything BEFORE the comma, or everything AFTER the comma. My issue is that either column potentially has spaces, and even though these are quoted in the output, my script isn't handling them properly.
CSV:
ShareName,/path/to/sharename
"Share with spaces",/path/to/sharewithspaces
ShareWithSpace,"/path/to/share with spaces"
I was using this awk statement to get either field 1 or field 2:
echo $line | awk -F "\"*,\"*" '{print $2}'
BUT, I soon realized that it wasn't handling the spaces properly, even when passing that command to a variable and quoting the variable.
So, then after googling my brain out, I was trying this:
echo $line | cut -d, -f2
Which works, EXCEPT when echoing the variable $line. If I echo the string, it works perfectly, but unfortunately I'm using this in a while/read/do.
I am fairly certain my issue is having to define fields and having whitespace, but I really only need before or after a comma.
Here's the stripped down version so there's no sensitive data.
#!/usr/bin/bash
ssh <ip> <command> > "2_shares.txt"
<command> > "1_shares.txt"
file1="1_shares.txt"
file2="2_shares.txt"
while read -r line
do
share=`echo "$line" | awk -F "\"*,\"*" '{print $1}'`
path=`echo "$line" | awk -F "\"*,\"*" '{print $2}'`
if grep "$path" $file2 > /dev/null;
then
:
else
echo "SHARE NEEDS CREATED FOR $line"
case $path in
*)
blah blah blah
;;
esac
fi
done < "$file1"
You could simply do this:
awk -F',' '{print $2}' file
To skip the first line:
awk -F',' 'NR>1{print $2}' file
Your issue is simply that you aren't quoting your shell variables. ALWAYS quote shell variables unless you have a very specific reason not to and are fully aware of all of the consequences.
I strongly suspect the rest of your script is completely wrong in its approach, since you apparently didn't know to quote variables and are talking about shell loops and echoing one line at a time to awk, so please do post a follow-up question if you'd like help.
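As a sketch of what quoting plus letting read do the splitting might look like for this CSV: it assumes that no field contains an embedded comma and that quotes only ever wrap a whole field. The result variable exists only to make the outcome easy to inspect:

```shell
# The three sample rows from the question.
csv=$(mktemp)
cat > "$csv" <<'EOF'
ShareName,/path/to/sharename
"Share with spaces",/path/to/sharewithspaces
ShareWithSpace,"/path/to/share with spaces"
EOF

result=""
while IFS=, read -r share path; do
    # Strip one leading and one trailing double quote, if present.
    share=${share#\"}; share=${share%\"}
    path=${path#\"};   path=${path%\"}
    printf 'share=%s path=%s\n' "$share" "$path"
    result="${result}[${share}|${path}]"
done < "$csv"
rm -f "$csv"
```

When the path is later handed to grep, quoting it and matching it as a fixed string (grep -qF -- "$path" "$file2") avoids both word splitting and accidental regex matches.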

Using the output of awk as the list of names in a for loop

How can I pass the output of awk to a for file in loop?
for file in awk '{print $2}' my_file; do echo $file done;
my_file contains the name of the files whose name should be displayed (echoed).
I get just a
>
instead of my normal prompt.
Use backticks or $(...) to substitute the output of a command:
for file in $(awk '{print $2}' my_file)
do
echo "$file"
done
for file in $(awk '{print $2}' my_file); do echo "$file"; done
The notation to use is $(...), known as command substitution.
for file in $(awk '{print $2}' my_file)
do
echo $file
done
Where I assume that you do more in the body of the loop than just echo since you could then leave the loop out altogether:
awk '{print $2}' my_file
Or, if you miss typing semicolons and don't like to spread code over multiple lines for readability, then you can use:
for file in $(awk '{print $2}' my_file); do echo $file; done
You will also find in (mostly older) code the backticks used:
for file in `awk '{print $2}' my_file`
do
echo $file
done
Quite apart from being difficult to use in the Markdown used to format comments (and questions and answers) on Stack Overflow, the backticks are not as friendly, especially when nested, so you should recognize them and understand them but not use them.
Incidentally, the reason you got the > prompt is that this command line:
for file in awk '{print $2}' my_file; do echo $file done;
is missing a semicolon before the done. The shell was still waiting for the done. Had you typed done and return, you would have seen the output:
awk done
{print $2} done
my_file done
Using backticks or $(awk ...) for command substitution is an acceptable solution for a small number of files. However, consider using xargs for single commands, or a pipe into a simple while read ... loop for more complex tasks (it works for simple ones too):
awk '...' | while IFS= read -r FILENAME; do
# do work with each file here using "$FILENAME"
done
This lets each filename be processed as soon as awk prints it, instead of waiting for the whole awk script to complete and holding the entire list in memory, as for x in ...; do must. It also keeps names containing spaces intact, which the unquoted command substitution in a for loop would split into separate words. This can speed up scripts that handle large lists, and it allows the same kinds of operations you would get in a for ... in loop.
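A runnable sketch of that pattern, assuming a hypothetical my_file whose second column holds the names (the file names are made up, and the loop body only prints them):

```shell
# Hypothetical list: an index in column 1, a file name in column 2.
my_file=$(mktemp)
printf '%s\n' '1 alpha.txt' '2 beta.txt' '3 gamma.txt' > "$my_file"

# The whole pipeline is captured only so the result can be inspected afterwards.
listing=$(awk '{print $2}' "$my_file" | while IFS= read -r file; do
    printf 'processing %s\n' "$file"
done)
printf '%s\n' "$listing"
rm -f "$my_file"
```

For a single command per name, awk '{print $2}' my_file | xargs -n 1 some_command has the same streaming behaviour, where some_command is a placeholder for whatever needs to run.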
