Using the output of awk as the list of names in a for loop - bash

How can I pass the output of awk to a for file in loop?
for file in awk '{print $2}' my_file; do echo $file done;
my_file contains the name of the files whose name should be displayed (echoed).
I get just a
>
instead of my normal prompt.

Use backticks or $(...) to substitute the output of a command:
for file in $(awk '{print $2}' my_file)
do
echo "$file"
done

for file in $(awk '{print $2}' my_file); do echo "$file"; done

The notation to use is $(...) or Command Substitution.
for file in $(awk '{print $2}' my_file)
do
echo $file
done
Where I assume that you do more in the body of the loop than just echo since you could then leave the loop out altogether:
awk '{print $2}' my_file
Or, if you miss typing semicolons and don't like to spread code over multiple lines for readability, then you can use:
for file in $(awk '{print $2}' my_file); do echo $file; done
You will also find in (mostly older) code the backticks used:
for file in `awk '{print $2}' my_file`
do
echo $file
done
Quite apart from being difficult to use in the Markdown used to format comments (and questions and answers) on Stack Overflow, the backticks are not as friendly, especially when nested, so you should recognize them and understand them but not use them.
Incidentally, the reason you got the > prompt is that this command line:
for file in awk '{print $2}' my_file; do echo $file done;
is missing a semicolon before the done. The shell was still waiting for the done. Had you typed done and return, you would have seen the output:
awk done
{print $2} done
my_file done

Using backticks or $(awk ...) for command substitution is an acceptable solution for a small number of files; however, consider using xargs for single commands or pipes or a simple while read ... for more complex tasks (but it will work for simple ones too)
awk '...' |while read FILENAME; do
#do work with each file here using $FILENAME
done
This will allow processing to be done as each filename is processed instead of having to wait for the whole awk script to complete and allow for a larger set of filenames (you can only give so many args to a for x in ...; do) This will typically speed up your scripts and allow the same kinds of operations you would get in a for in loop without its limitations.

Related

give a file without changing the name in script [duplicate]

This question already has answers here:
How to pass parameters to a Bash script?
(4 answers)
Closed 1 year ago.
At the beginning I have a file.txt, which contains several informations that I will take using the grep command as you see in the script.
What I want is to give the script the file I want instead of file.txt but without changing the file name each time in the script for example if the file is named Me.txt I don’t want to go into the script and write Me.txt in each grep command especially if I have dozens of orders.
Is there a way to do this?
#!/bin/bash
grep teste file.txt > testline.txt
awk '{print $2}' testline.txt > test.txt
echo '#'
echo '#'
grep remote file.txt > remoteline.txt
awk '{print $3}' remoteline.txt > remote.txt
echo '#'
echo '#'
grep adresse file.txt > adresseline.txt
awk '{print $2}' adresseline.txt > adresse.txt
Using a parameter, as many contributors here suggested, is of course the obvious approach, and the one which is usually taken in such case, so I want to extend this idea:
If you do it naively as
filename=$1
you have to supply the name on every invocation. You can improve on this by providing a default value for the case the parameter is missing:
filename=${1:-file.txt}
But sometimes you are in a situation, where for some time (working on a specific task), you always need the same filename over and over, and the default value happens to be not the one you need. Another possibility to pass information to a program is via the environment. If you set the filename by
filename=${MOOFOO:-file.txt}
it means that - assuming your script is called myscript.sh - if you invoke your script by
MOOFOO=myfile.txt myscript.sh
it uses myfile.txt, while if you call it by
myscript.sh
it uses the default file.txt. You can also set MOOFOO in your shell, as
export MOOFOO=myfile.txt
and then, even a lone execution of
myscript.sh
with use myfile.txt instead of the default file.txt
The most flexible approach is to combine both, and this is what I often do in such a situation. If you do in your script a
filename=${1:-${MOOFOO:-file.txt}}
it takes the name from the 1st parameter, but if there is no parameter, takes it from the variable MOOFOO, and if this variable is also undefined, uses file.txt as the last fallback.
You should pass the filename as a command line parameter so that you can call your script like so:
script <filename>
Inside the script, you can access the command line parameters in the variables $1, $2,.... The variable $# contains the number of command line parameters passed to the script, and the variable $0 contains the path of the script itself.
As with all variables, you can choose to put the variable name in curly brackets which has advantages sometimes: ${1}, ${2}, ...
#!/bin/bash
if [ $# = 1 ]; then
filename=${1}
else
echo "USAGE: $(basename ${0}) <filename>"
exit 1
fi
grep teste "${filename}" > testline.txt
awk '{print $2}' testline.txt > test.txt
echo '#'
echo '#'
grep remote "${filename}" > remoteline.txt
awk '{print $3}' remoteline.txt > remote.txt
echo '#'
echo '#'
grep adresse "${filename}" > adresseline.txt
awk '{print $2}' adresseline.txt > adresse.txt
By the way, you don't need two different files to achieve what you want, you can just pipe the output of grep straight into awk, e.g.:
grep teste "${filename}" | awk '{print $2}' > test.txt
but then again, awk can do the regex match itself, reducing it all to just one command:
awk '/teste/ {print $2}' "${filename}" > test.txt

Why does the while loop in that bash script stop after the first line

I was stepping through a list of files with a bash script.
But: it always stopped after the first loop although the list in "$tempdat" was 10 lines long.
while IFS= read -r zeile; do
# Zielsteuerung
quelle=$(awk -F\/ '{print $2}')
if [[ "$quelle" == "foo" ]]; then
do that
else
do s.th. else
fi
rsync somefiles
done < "$tempdat"
After some searching I found the error, the awk was not correct
quelle=$(echo "$zeile"|awk -F\/ '{print $2}')
But: why did that mistake prevent the loop from finishing? Maybe someone with more bash insight could be so nice to enlighten me. :-)
The awk is consuming all of the data intended for the read in the condition of the loop. Instead of using awk to try to parse the line, it seems like you intend to do:
while IFS=/ read -r a quelle b; do
and read the second column of the file into the variable quelle.
You can also write:
quelle=$(echo "$zeile" | awk -F\/ '{print $2}')
but it is generally best practice to let read split the fields for you.

Bash awk append to same line

There are numerous posts about removing leading white space and appending an entry to a single existing line in a file using awk. None of my attempts work - just three examples here of the many I have tried.
Say I have a file called $log with a single line
a:b:c
and I want to add a fourth entry,
awk '{ print $4"d" }' $log | tee -a $log
output seems to be a newline
`a:b:c:
d`
whereas, I want all on the same line;
a:b:c:d
try
BEGIN { FS = ":" } ; awk '{ print $4"d" }' $log | tee -a $log
or, this - avoid a new line
awk 'BEGIN { ORS=":" }; { print $4"d" }' $log | tee -a $log
no change
`a:b:c:
d`
awk is placing a space after c: and then writing d to the next line.
EDIT: | tee -a $log appears to be necessary to write the additional string to the file.
$log contains 39 variables and was generated using awk without | tee -a
odd...
The actual command to write $40 to the single line entries
awk '{ print $40"'$imagedir'" }' $log
output
+ awk '{ print $40"/home/geoland/Asterism-DEVEL/DSO" }'
/home/geoland/.asterism/log
but this does not write to the $log file.
How should I append d to the same line without leading white space using awk - also looking at sed xargs and other alternatives.
Using awk:
awk '{ print $0":d" }' file
Using sed:
sed 's/$/:d/' file
Using only bash:
while IFS= read -r line; do
echo "$line:d"
done < file
Using sed:
$ echo a:b:c | sed 's,\(^.*$\),\1:d,'
a:b:c:d
Thanks all... This is the solution I went with. I also needed to write the entire line to a perpetual log file because the log file is overwritten at each new process instance.
I will further investigate an awk solution.
logname=$imagedir/log_$name
while IFS=: read -r line; do
echo "$line$imagedir"
done < $log | tee $logname
This places $imagedir directly behind the last IFS ':' separator
There is probably room for refinement.
I too am not entirely sure what you're trying to do here.
Your command line, awk '{ print $4"d" }' $log | tee -a $log is problematic in a number of ways.
First, your awk script tries to print the 4th field, which is empty. Unless you say otherwise, fields are separated by whitespace, and the string a:b:c has no whitespace. So .. awk prints "d". And tee -a appends to your existing logfile, so what you're seeing is the original data, along with the d printed by awk. That's totally expected.
Second, it appears to have tee appending to the same file that awk is in the process of reading. This won't make an endless loop, as awk should stop reading the input file after whatever was the last byte when the file was opened, but it does mean you may have repeated data there.
Your other attempts, aside from some syntactical errors, all suffer from the same assumption that $4 means something that it does not.
The following awk snippet sets the input and output field separators to :, then sets the 4th field to "d", then prints the line.
$ echo "a:b:c" | awk 'BEGIN{FS=OFS=":"} {$4="d"} 1'
a:b:c:d
Is that what you want?
If you really do need to append this data to an existing log file, you can do so with tee -a or simple >> redirection. Just bear in mind that awk will only see the content of the file as of the time it was run, and by appending, you are not replacing lines.
One other thing. If you are actually hoping to use the content of the shell variable $imagedir inside awk, you should pass the variable in rather than exiting your quotes. For example:
$ echo "a:b:c" | awk -v d="foo/bar" 'BEGIN{FS=OFS=":"} {$4=d} 1'
a:b:c:foo/bar
sed "s|$|$imagedir|" file | tee newfile
This does the trick. Read 'file' and write the contents of 'file' with the substitution to a 'new file', so as to read the image directory when using a secondary standalone process.
Because the variable is a directory with several / these need to be escaped, so as not to interpret as sed delimiters. I had difficulty with this using a variable.
A neater option was to use an alternative delimiter. Not to be confused with the pipe that follows.

CSV Two Column List With Spaces. Need everything before or everything after in two separate variables

I have a CSV list that is two columns (col1 is Share Name, col2 is file system path). I need two variables for either everything BEFORE the comma, or everything AFTER the column. My issue is that either column potentially has spaces, and even though these are quoted in the output, my script isn't handling them properly.
CSV:
ShareName,/path/to/sharename
"Share with spaces",/path/to/sharewithspaces
ShareWithSpace,"/path/to/share with spaces"
I was using this awk statement to get either field 1 or field 2:
echo $line | awk -F "\"*,\"*" '{print $2}'
BUT, I soon realized that it wasn't handling the spaces properly, even when passing that command to a variable and quoting the variable.
So, then after googling my brain out, I was trying this:
echo $line | cut -d, -f2
Which works, EXCEPT when echoing the variable $line. If I echo the string, it works perfectly, but unfortunately I'm using this in a while/read/do.
I am fairly certain my issue is having to define fields and having whitespace, but I really only need before or after a comma.
Here's the stripped down version so there's no sensitive data.
#!/usr/bin/bash
ssh <ip> <command> > "2_shares.txt"
<command> > "1_shares.txt"
file1="1_shares.txt"
file2="2_shares.txt"
while read -r line
do
share=`echo "$line" | awk -F "\"*,\"*" '{print $1}'`
path=`echo "$line" | awk -F "\"*,\"*" '{print $2}'`
if grep "$path" $file2 > /dev/null;
then
:
else
echo "SHARE NEEDS CREATED FOR $line"
case $path in
*)
blah blah blah
;;
esac
fi
done < "$file1"
You could simply do like this,
awk -F',' '{print $2}' file
To skip the first line.
awk -F',' 'NR>1{print $2}' file
Your issue is simply that you aren't quoting your shell variables. ALWAYS quote shell variables unless you have a very specific reason not to and are fully aware of all of the consequences.
I strongly suspect the rest of your script is completely wrong in it's approach since you apparently didn't know to quote variables and are talking about shell loops and echoing one line at time to awk so please do post a followup question if you'd like help.

Bash - Convert sub.site.com in to com_site_sub?

I am writing a script that will do some automated things, and it requires to be put in the format <tld>_<site>_<sub> for now.
I will basically provoke it as such: ./add.sh about.site.com
Which will add the corresponding entries once the name is extracted
How could I write this?
You can mess with $IFS to change how things like read parse text:
hostname="foo.bar.com"
IFS=. read sub site tld <<< "$hostname"
echo ${tld}_${site}_${sub}
Or awk (a little cleaner than sed):
echo $1 | awk -F"." '{print $3 "_" $2 "_" $1}'
You could also use sed:
echo $1 | sed 's/\([^.]*\)\.\([^.]*\)\.\([^.]*\)/\3.\2.\1/'
Lots of backslashes, and it involves two processes, but it could have advantages if you need to handle many such substitutions in a single invocation of the script.

Resources