bash search for string in each line of file - bash

I'm trying what seems like a very simple task: use bash to search a file for strings, and if they exist, output those to another file. It could be jetlag, but this should work:
#!/bin/bash
cnty=CNTRY
for line in $(cat wheatvrice.csv); do
if [[ $line = *$cnty* ]]
then
echo $line >> wr_imp.csv
fi
done
I also tried this for completeness:
#!/bin/bash
cnty=CNTRY
for line in $(cat wheatvrice.csv); do
case $line in
*"$cnty"*) echo $line >> wr_imp.csv;;
*) echo "no";;
esac
done
Both output everything, regardless of whether the line contains CNTRY or not. I'm copy/pasting from seemingly reliable sources, so apparently there's something simple about bash-ness that I'm missing?

Don't use bash, use grep.
grep -F "$cnty" wheatvrice.csv >> wr_imp.csv
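As a rough illustration of why -F is a sensible default here: it makes grep treat the search term as a literal string rather than a regular expression. The sample file name and contents below are made up, since the real wheatvrice.csv is not shown.

```bash
#!/bin/bash
# Hypothetical sample data; the real wheatvrice.csv is not shown in the question.
tmpdir=$(mktemp -d)
printf '%s\n' 'CNTRY,wheat,rice' 'a.b,1,2' 'axb,3,4' > "$tmpdir/sample.csv"

# Without -F, the dot is a regex metacharacter and matches any character.
regex_hits=$(grep -c 'a.b' "$tmpdir/sample.csv")

# With -F, the pattern is a fixed string, so only the literal "a.b" matches.
fixed_hits=$(grep -cF 'a.b' "$tmpdir/sample.csv")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -r "$tmpdir"
```

If the search term comes from a variable, as $cnty does, -F avoids surprises when it happens to contain characters such as . or *.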

While I would also suggest simply using grep, the question of why your approach didn't work is still open. Here is a self-referential modification of your second approach, using the keyword 'bash' so that the script matches itself:
#!/bin/bash
cnty=bash
while read -r line
do
case $line in
*"${cnty}"*)
echo "$line yes" >> bashgrep.log
;;
*)
echo "no"
;;
esac
done < bashgrep.sh
The key point is while read -r line ... < FILE. Your command with cat relies on word splitting: the unquoted $(cat ...) is split on whitespace, so the loop processes every single word, not every line.
The same problem affects your first example.
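To make the difference concrete, here is a minimal sketch that counts loop iterations for both styles. The two sample lines are invented for the demonstration.

```bash
#!/bin/bash
# Hypothetical two-line sample; each line contains spaces.
tmpfile=$(mktemp)
printf '%s\n' 'United States,wheat 10' 'CNTRY name,rice 20' > "$tmpfile"

# for + $(cat ...): the unquoted expansion is split on spaces, tabs and newlines.
for_count=0
for item in $(cat "$tmpfile"); do
    for_count=$((for_count + 1))
done

# while + read: one iteration per line, whitespace preserved.
read_count=0
while IFS= read -r line; do
    read_count=$((read_count + 1))
done < "$tmpfile"

echo "for saw $for_count items, read saw $read_count lines"
rm -f "$tmpfile"
```

With this input the for loop iterates six times (once per whitespace-separated word), while the read loop iterates twice (once per line).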


Bash script trying to list files unsuccessfully

I'm reading some file paths and names from a text file and trying to test if file exists. I'm not sure what I'm doing wrong but first echo returns filepath and file name whilst the echo inside the if statement doesn't. Any ideas?
#!/bin/bash
while read line; do
echo $line
if [ -f "$line" ]; then
echo "found: $line"
fi
done < /mbackup/temp/images.txt
The only change is adding the -r option to read. That option is documented as:
Backslash does not act as an escape character. The backslash is considered to be part of the line. In particular, a backslash-newline pair may not then be used as a line continuation.
This helps prevent special characters in file names from interfering with your script.
I tested this with file names containing special characters and it works as you expected.
#!/bin/bash
while read -r line; do
echo $line
if [ -f "$line" ]; then
echo "found: $line"
fi
done < /mbackup/temp/images.txt
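A small sketch of what -r changes, assuming a made-up file name that contains a backslash:

```bash
#!/bin/bash
# A made-up path containing a backslash, as it might appear in images.txt.
input='dir\name.jpg'

# Without -r, read treats the backslash as an escape character and drops it.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# With -r, the line is stored exactly as written.
raw=$(printf '%s\n' "$input" | { read -r line; printf '%s' "$line"; })

echo "read:    $plain"
echo "read -r: $raw"
```

Here the plain read yields dirname.jpg, so a test such as [ -f "$line" ] would look for the wrong file, while read -r keeps dir\name.jpg intact.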

Why does my variable set in a do loop disappear? (unix shell)

This part of my script is comparing each line of a file to find a preset string. If the string does NOT exist as a line in the file, it should append it to the end of the file.
STRING=foobar
cat "$FILE" | while read LINE
do
if [ "$STRING" == "$LINE" ]; then
export ISLINEINFILE="yes"
fi
done
if [ ! "$ISLINEINFILE" == yes ]; then
echo "$LINE" >> "$FILE"
fi
However, it appears that both $LINE and $ISLINEINFILE are cleared when the do loop finishes. How can I avoid this?
Using shell
If we want to make just the minimal change to your code to get it working, all we need to do is switch the input redirection:
string=foobar
while read line
do
if [ "$string" == "$line" ]; then
islineinfile="yes"
fi
done <"$file"
if [ ! "$islineinfile" == yes ]; then
echo "$string" >> "$file"
fi
In the above, we changed cat "$file" | while do ...done to while do...done<"$file". With this one change, the while loop is no longer in a subshell and, consequently, shell variables created in the loop live on after the loop completes.
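The effect is easy to reproduce in isolation. The following sketch uses a throwaway sample file and assumes bash with its default options (no lastpipe):

```bash
#!/bin/bash
# Hypothetical sample file with the target line present.
tmpfile=$(mktemp)
printf '%s\n' 'alpha' 'foobar' > "$tmpfile"

# Pipeline: in bash's default configuration the loop runs in a subshell,
# so the assignment is lost when the loop ends.
piped=no
cat "$tmpfile" | while read -r line; do
    [ "$line" = foobar ] && piped=yes
done

# Redirection: the loop runs in the current shell, so the assignment survives.
redirected=no
while read -r line; do
    [ "$line" = foobar ] && redirected=yes
done < "$tmpfile"

echo "piped=$piped redirected=$redirected"
rm -f "$tmpfile"
```

Under those assumptions this prints piped=no redirected=yes, even though both loops see the same line.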
Using sed
I believe that the whole of your script can be replaced with:
sed -i.bak '/^foobar$/H; ${x;s/././;x;t; s/$/\nfoobar/}' file*
The above adds line foobar to the end of each file that doesn't already have a line that matches ^foobar$.
The above shows file* as the final argument to sed. This will apply the change to all files matching the glob. You could list specific files individually if you prefer.
The above was tested on GNU sed (Linux). Minor modifications may be needed for BSD/OSX sed.
Using GNU awk (gawk)
awk -i inplace -v s="foobar" '$0==s{f=1} {print} ENDFILE{if (f==0) print s; f=0}' file*
Like the sed command, this can tackle multiple files all in one command.
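For a single file, another option that is not among the approaches above is to let grep do the whole-line comparison and append only when it finds nothing. This is a sketch; append_if_missing is a hypothetical helper name and the sample file is invented.

```bash
#!/bin/bash
# append_if_missing is a hypothetical helper name, not from the answers above.
append_if_missing() {
    local string=$1 file=$2
    # -q: quiet, -x: match the whole line, -F: treat the string literally.
    grep -qxF -- "$string" "$file" || printf '%s\n' "$string" >> "$file"
}

tmpfile=$(mktemp)
printf '%s\n' 'alpha' 'foobar baz' > "$tmpfile"

append_if_missing foobar "$tmpfile"   # appended: no line is exactly "foobar"
append_if_missing foobar "$tmpfile"   # no-op: the line now exists

foobar_lines=$(grep -cxF -- foobar "$tmpfile")
echo "exact foobar lines: $foobar_lines"
rm -f "$tmpfile"
```

Note that grep also returns a non-zero status when the file cannot be read, in which case this sketch would still try to append.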
Why does my variable set in a do loop disappear?
It disappears because it is set in a shell pipeline component. Most shells run each part of a pipeline in a subshell. By Unix design, variables set in a subshell cannot affect their parent or any other already running shell.
How can I avoid this?
There are several ways:
The simplest is to use a shell that doesn't run the last component of a pipeline in a subshell. This is the default behavior of ksh, so one option is to use this shebang:
#!/bin/ksh
Bash (4.2 or later) behaves the same way when the lastpipe option is set and job control is off, as it is in a non-interactive script:
shopt -s lastpipe
You might use the variable in the same subshell that set it. Note that your original script indentation is wrong and might lead to the incorrect assumption that the if block is inside the pipeline, which isn't the case. Enclosing the whole block with parentheses will rectify that and would be the minimal change (two extra characters) to make it work:
STRING=foobar
cat "$FILE" | ( while read LINE
do
if [ "$STRING" == "$LINE" ]; then
export ISLINEINFILE="yes"
fi
done
if [ ! "$ISLINEINFILE" == yes ]; then
echo "$STRING" >> "$FILE"
fi
)
The variable would still be lost after that block though.
You might simply avoid the pipeline, which is straightforward in your case since the cat is unnecessary:
STRING=foobar
while read LINE
do
if [ "$STRING" == "$LINE" ]; then
export ISLINEINFILE="yes"
fi
done < "$FILE"
if [ ! "$ISLINEINFILE" == yes ]; then
echo "$STRING" >> "$FILE"
fi
You might use another algorithmic approach, such as sed or gawk, as suggested by John1024.
See also https://unix.stackexchange.com/a/144137/2594 for standard compliance details.
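As a sketch of the lastpipe option from the list above, assuming bash 4.2 or later running as a non-interactive script (the input lines are invented):

```bash
#!/bin/bash
# Requires bash 4.2 or later; lastpipe only takes effect when job control
# is off, which is the normal state for a non-interactive script.
shopt -s lastpipe

islineinfile=no
printf '%s\n' 'alpha' 'foobar' | while read -r line; do
    if [ "$line" = foobar ]; then
        islineinfile=yes
    fi
done

# The loop ran in the current shell, so the assignment is still visible.
echo "islineinfile=$islineinfile"
```

Typed at an interactive prompt, where job control is normally on, the same pipeline would still lose the variable.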

grep, else print message for no matches

In a bash script, I have a list of lines in a file I wish to grep and then display on standard out, which is easiest done with a while read:
grep "regex" "filepath" | while read line; do
printf "$line\n"
done
However, I would like to inform the user if no lines were matched by the grep. I know that one can do this by updating a variable inside the loop but it seems like a much more elegant approach (if possible) would be to try to read a line in an until loop, and if there were no output, an error message could be displayed.
This was my first attempt:
grep "regex" "filepath" | until [[ -z ${read line} ]]; do
if [[ -z $input ]]; then
printf "No matches found\n"
break
fi
printf "$line\n"
done
But in this instance the read command is malformed, and I wasn't sure of another way to phrase the query. Is this approach possible, and if not, is there a more suitable solution to the problem?
You don't need a loop at all if you simply want to display a message when there's no match. Instead you can use grep's return code. A simple if statement will suffice:
if ! grep "regex" "filepath"; then
echo "no match" >&2
fi
This will display the results of grep matches (since that's grep's default behavior), and will display the error message if there are none.
A popular alternative to if ! is to use the || operator. foo || bar can be read as "do foo or else do bar", or "if not foo then bar".
grep "regex" "filepath" || echo "no match" >&2
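One refinement worth sketching: grep exits with 0 when something matched, 1 when nothing matched, and a value greater than 1 when an error occurred (for example an unreadable file), so the two non-zero cases can be told apart. report_matches is a hypothetical helper name and the sample data is made up.

```bash
#!/bin/bash
tmpfile=$(mktemp)
printf '%s\n' 'apple' 'banana' > "$tmpfile"

# report_matches is a hypothetical helper name used only for this sketch.
report_matches() {
    grep -- "$1" "$2"
    case $? in
        0) ;;                                   # at least one line matched
        1) echo "no match" >&2 ;;               # grep ran fine, nothing matched
        *) echo "grep failed" >&2; return 2 ;;  # e.g. unreadable file
    esac
}

report_matches 'an' "$tmpfile"        # prints: banana
report_matches 'zzz' "$tmpfile"       # prints "no match" on stderr
rm -f "$tmpfile"
```

With the plain if ! grep or || forms, a missing file would also be reported as "no match", which may or may not be what you want.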
John Kugelman's answer is the correct and succinct one and you should accept it. I am addressing your question about syntax here just for completeness.
You cannot use ${read line} to execute read -- the brace syntax actually means (vaguely) that you want the value of a variable whose name contains a space. Perhaps you were shooting for $(read line) but really, the proper way to write your until loop would be more along the lines of
grep "regex" "filepath" | until read line; [[ -z "$line" ]]; do
... but of course, when there is no output, the pipeline will receive no lines, so while and until are both wrong here.
It is worth emphasizing that the reason you need a separate do is that you can have multiple commands in there. Even something like
while output=$(grep "regex" "filepath"); echo "grep done, please wait ...";
count=$(echo "$output" | wc -l); [[ $count -gt 0 ]]
do ...
although again, that is much more arcane than you would ever really need. (And in this particular case, you would probably actually want if, not while.)
As others already noted, there is no reason to use a loop like that here, but I wanted to sort out the question about how to write a loop like this for whenever you actually do want one.
As mentioned by @jordanm, there is no need for a loop in the use case you mentioned.
output=$(grep "regex" "file")
if [[ -n $output ]]; then
echo "$output"
else
echo "Sorry, no results..."
fi
If you need to iterate over the results for processing (rather than just displaying to stdout) then you can do something like this:
output=$(grep "regex" "file")
if [[ -n $output ]]; then
while IFS= read -r line; do
# do something with $line
done <<< "$output"
else
echo "Sorry, no results..."
fi
This method avoids using a pipeline or subshell so that any variable assignments made within the loop will be available to the rest of the script.
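A minimal sketch of that last point, using an invented sample file and a counter that is updated inside the loop and read afterwards:

```bash
#!/bin/bash
# Hypothetical sample data standing in for "file".
tmpfile=$(mktemp)
printf '%s\n' 'error: disk' 'ok' 'error: net' > "$tmpfile"

count=0
output=$(grep 'error' "$tmpfile")
if [[ -n $output ]]; then
    while IFS= read -r line; do
        count=$((count + 1))
    done <<< "$output"
fi

# count was updated in the current shell, so it is usable here.
echo "matched $count lines"
rm -f "$tmpfile"
```

Had the loop been fed through a pipe instead of the here-string, count would still be 0 at the final echo in bash's default configuration.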
Also, I'm not sure if this relates to what you are trying to do at all, but grep does have the ability to load patterns from a file (one per line). It is invoked as follows, where search_target is the file to search:
grep -f pattern_file.txt search_target
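For illustration, a short sketch of -f with invented pattern and data files:

```bash
#!/bin/bash
tmpdir=$(mktemp -d)
# Hypothetical file names and contents.
printf '%s\n' 'wheat' 'rice' > "$tmpdir/patterns.txt"
printf '%s\n' 'wheat 10' 'corn 5' 'rice 7' > "$tmpdir/data.txt"

# Each line of patterns.txt is a pattern; a data line matching any of them is printed.
matches=$(grep -f "$tmpdir/patterns.txt" "$tmpdir/data.txt")
printf '%s\n' "$matches"
rm -r "$tmpdir"
```

This prints the wheat and rice lines and skips the corn line. Be careful with blank lines in the pattern file: an empty pattern matches every line.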

shell parsing a line to look for a certain tag

I am planning to create a simple script to edit a file based on values stored within a properties file.
So essentially I am planning to loop through each line in the original file, when it comes across a certain tag within a line say "/#" it will get the text following the tag i.e. certs and then implement a function to parse through the properties file to get certain values and add them to the original file.
So for example the file would have the following line:
"/#certs"
I am not sure how best to search for the tag, I was planning to have an if to find the /# and then split the remaining text to get the string.
while read line
do
#need to parse line to look for tag
echo line >> ${NEW_FILE}
done < ${OLD_FILE}
Any help would be greatly appreciated
=====================================
EDIT:
My explanation was a bit poor; apologies. I am merely trying to get the text following the /# - i.e. I just want to get the string value that follows it. I can then call a function based on what the text is.
You can use BASH regex capabilities:
while read line
do
if [[ "$line" =~ ^.*/#certs(.*)$ ]]; then
# do your processing here
# ${BASH_REMATCH[1]} is the part after /#certs
echo "${BASH_REMATCH[1]}" >> "${NEW_FILE}"
fi
done < "${OLD_FILE}"
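If what you want is the tag name itself (the text right after /#), a variation along these lines might be closer. It is only a sketch: extract_tag is a hypothetical helper name, and the assumption that a tag consists of letters, digits and underscores is mine.

```bash
#!/bin/bash
# extract_tag is a hypothetical helper; the regex is stored in a variable
# so that the shell does not try to interpret its special characters.
extract_tag() {
    local re='/#([[:alnum:]_]+)'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_tag 'path="/#certs"'   # prints: certs
extract_tag 'no tag here' || echo "no tag"
```

The function's exit status tells the caller whether a tag was found, so it can be used directly in an if or with ||.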
This is portable to any POSIX shell (sh) and thus, of course, ksh and Bash.
case $line in
'/#'* ) tag="${line#/\#}" ;;
esac
To put it into some sort of context, here is a more realistic example of how you might use it:
while read line; do
case $line in
'/#'* ) tag="${line#/\#}" ;;
*) continue ;; # skip remainder of loop for lines without a tag
esac
echo "$tag"
# Or maybe do something more complex, such as
case $tag in
cert)
echo 'We have a cert!' >&2 ;;
bingo)
echo 'You are the winner.' >&2
break # terminate loop
;;
esac
done <$OLD_FILE >$NEW_FILE
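If the lines in your file really do contain the surrounding double quotes, as the sample in the question suggests, the same parameter expansions can be adapted. This sketch assumes the quotes are literal characters in the file:

```bash
#!/bin/bash
# Sample line assumed to include literal double quotes, as in the question.
line='"/#certs"'

case $line in
    *'/#'*)
        tag=${line#*/\#}    # drop the shortest prefix ending in /#
        tag=${tag%\"}       # drop a trailing double quote, if present
        ;;
    *)
        tag=
        ;;
esac

echo "$tag"
```

For the sample line this leaves tag set to certs, and it needs nothing beyond standard parameter expansion.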
For instance, you can search for strings and manipulate them in one step using sed, the stream editor.
echo "$line" | sed -rn 's:^.*/#(certs.+):\1:p'
This will print only the relevant parts after the /# of the relevant lines.
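sed can also process the whole input at once instead of being called once per line. The sketch below uses a basic regular expression (so it does not depend on the -r option) and captures everything after /#; the two input lines are invented.

```bash
#!/bin/bash
# Hypothetical input; only the second line carries a /# tag.
extracted=$(printf '%s\n' 'plain line' 'see /#certs' | sed -n 's:^.*/#\(.*\)$:\1:p')
echo "$extracted"
```

Only the tagged line produces output, here certs, because -n suppresses automatic printing and the p flag prints lines where the substitution succeeded.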

Modifying data using awk

In a long file I'm searching for something like this:
c 0.5p_f
10 px 2
I need to modify the third column of the line after the 'c 0.5p_f' marker.
This is part of a bash script, and I would like to avoid awk scripts and use only bash commands.
Why not use awk? It's perfect.
do_modify{$3="modify";do_modify=0}/c 0\.5p_f/{do_modify=1}1
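Called from a bash script, the program above might be used like this. The input lines are invented around the sample from the question.

```bash
#!/bin/bash
# Hypothetical input built from the sample in the question.
result=$(printf '%s\n' 'a b c' 'c 0.5p_f' '10 px 2' '11 py 3' |
    awk 'do_modify { $3 = "modify"; do_modify = 0 }
         /c 0\.5p_f/ { do_modify = 1 }
         1')
printf '%s\n' "$result"
```

Only the line directly after the marker is changed (to 10 px modify); all other lines pass through untouched. Assigning to $3 makes awk rebuild the line with single spaces between fields.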
If you can use sed scripts,
/c 0\.5p_f/{n;s/\([^[:space:]]*[[:space:]]\+[^[:space:]]*[[:space:]]\+\)\S*/\1modify/}
would do. Not that pure Bash is hard either, though.
do_modify=
while read -r line; do
if [[ -n ${do_modify} ]]; then
columns=(${line})
columns[2]=modified
line=${columns[*]}
do_modify=
fi
printf '%s\n' "${line}"
if [[ ${line} = *'c 0.5p_f'* ]]; then
do_modify=1
fi
done
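The loop above reads standard input, so it needs to be fed a file or a pipe. Here is a sketch of one way to package it, with read -a used for the split so that a field such as * is not expanded as a glob; modify_after_marker is a hypothetical function name.

```bash
#!/bin/bash
# modify_after_marker is a hypothetical wrapper around the same idea,
# using read -a to split each line into an array without globbing.
modify_after_marker() {
    local do_modify= line
    local -a columns
    while IFS= read -r line; do
        if [[ -n $do_modify ]]; then
            read -r -a columns <<< "$line"
            columns[2]=modified
            line=${columns[*]}
            do_modify=
        fi
        printf '%s\n' "$line"
        if [[ $line = *'c 0.5p_f'* ]]; then
            do_modify=1
        fi
    done
}

result=$(printf '%s\n' 'c 0.5p_f' '10 px *' | modify_after_marker)
printf '%s\n' "$result"
```

In a script you would typically call it as modify_after_marker < input_file > output_file, where both file names are placeholders.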
