Running `rg` in a while loop breaks after the first iteration - bash

My script is simple:
while read -r key; do
rg --glob='!some_dir' --fixed-strings --quiet "$key" || echo "$key"
done < <(grep 'some_pattern' some/file | cut -d'"' -f2)
I hoped to use this bash script to print keys that aren't used. This loop, however, breaks after the first iteration at every run. Why and how to fix? Thank you :D

This is the classic symptom of a command inside a while read loop that also reads from standard input. You expected the output of grep to be consumed by the loop one line per iteration, but rg is reading from that same stream and swallows the rest of it.
Redirect its standard input from /dev/null:
rg --glob='!some_dir' --fixed-strings --quiet "$key" < /dev/null || echo "$key"
or have the loop read its input from a different file descriptor:
while read -r -u 3 key; do
rg --glob='!some_dir' --fixed-strings --quiet "$key" || echo "$key"
done 3< <(grep 'some_pattern' some/file | cut -d'"' -f2)
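The effect can be reproduced without rg. In the sketch below, cat is only a stand-in (an assumption for illustration) for any command that reads standard input, and the keys a, b, c are made up. The first loop loses its remaining input after one iteration; the second detaches the inner command from the loop's input.

```bash
#!/usr/bin/env bash
# Stand-in demo: 'cat' plays the role of a command that reads stdin.

broken=0
while read -r key; do
  cat >/dev/null              # drains the rest of the loop's input
  broken=$((broken + 1))
done <<'EOF'
a
b
c
EOF

fixed=0
while read -r key; do
  cat </dev/null >/dev/null   # stdin detached, loop input left untouched
  fixed=$((fixed + 1))
done <<'EOF'
a
b
c
EOF

echo "broken loop ran $broken time(s), fixed loop ran $fixed time(s)"
# prints: broken loop ran 1 time(s), fixed loop ran 3 time(s)
```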

Related

Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So now this needs to be an inner for loop inside an outer for loop that holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file under a new name. If the goal is instead to rename the original files listed in stapel_old, change cp to mv.
The -n and -i flags of sed were omitted; include them if needed.
The script also assumes that stapel_old contains no empty or blank lines. If it might, add a test right after the line with do:
[[ -n $files ]] || continue
It also assumes that every entry in stapel_old is an existing file. Just in case, add another test.
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
Remove the echos once you're satisfied with the output, so that the script actually copies/renames and edits the files.
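As a self-contained sketch of the task from the question, the following runs in a scratch directory. It assumes the *_old_sync files already exist and edits them directly rather than copying first. The notes.txt entry is hypothetical, added only to show that lines not ending in .md are left alone. Writing to a temporary file and moving it back is used here as a portable substitute for sed -i.

```bash
#!/usr/bin/env bash
# Scratch-directory demo; file names mirror the question, contents are sample data.
workdir=$(mktemp -d)
printf '%s\n' english math > "$workdir/stapel_old"
printf '%s\n' basic_grammar.md spelling.md > "$workdir/english_old_sync"
printf '%s\n' geometry.md notes.txt > "$workdir/math_old_sync"

while IFS= read -r name; do
  [ -n "$name" ] || continue            # skip blank lines in stapel_old
  target="$workdir/${name}_old_sync"
  [ -e "$target" ] || continue          # skip entries without a matching file
  # Append the marker only to lines that end in .md, then replace the file.
  sed '/\.md$/s/$/:::#a1/' "$target" > "$target.tmp" && mv "$target.tmp" "$target"
done < "$workdir/stapel_old"

english_result=$(cat "$workdir/english_old_sync")
math_result=$(cat "$workdir/math_old_sync")
printf '%s\n' "$english_result" "$math_result"
rm -rf "$workdir"
```

stapel_old is only read, so it stays unchanged. Running the loop a second time would not append the marker again, because the edited lines no longer end in .md.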

Intermittent piping failure in bash

I have a code snippet that looks like this
while grep "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#"; do
grep -n "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#" | head -n1 | while read -r line ; do
lineno=$(echo $line | cut -d':' -f1)
spaces=$(sed "${lineno}!d" /tmp/kubernetes/$basefile | awk -F'[^ \t]' '{print length($1)}')
spaces=$((spaces-1))
# Delete line that had {{SECRETS}}
sed -i -e "${lineno}d" /tmp/kubernetes/$basefile
while IFS='' read -r secretline || [[ -n "$secretline" ]]; do
newline=$(printf "%*s%s" $spaces "" "$secretline")
sed -i "${lineno}i\ ${newline}" /tmp/kubernetes/$basefile
lineno=$((lineno+1))
done < "/tmp/secrets.yaml"
done
done
in /tmp/kubernetes/$basefile, the string {{SECRETS}} appears twice 100% of the time.
Almost every single time, this completes fine. However, very infrequently, the script errors on its second loop through the file. like so, according to set -x
...
IFS=
+ read -r secretline
+ [[ -n '' ]]
+ read -r line
exit code 1
When it works, the set -x output looks like this, and the script continues processing the file correctly.
...
+ IFS=
+ read -r secretline
+ [[ -n '' ]]
+ read -r line
+ grep '{{SECRETS}}' /tmp/kubernetes/deployment.yaml
+ grep -v '#'
I have no answer for how this can only happen occasionally, so I think there's something about bash piping's parallelism I don't understand. Is there something in grep -n "{{SECRETS}}" /tmp/kubernetes/$basefile | grep -v "#" | head -n1 | while read -r line ; do that could lead to out-of-order execution somehow? Based on the error, it seems like it's trying to read a line, but can't because previous commands didn't work. But there's no indication of that in the set -x output.
A likely cause of the problem is that the pipeline containing the inner loop both reads and writes the "basefile" at the same time. See How to make reading and writing the same file in the same pipeline always “fail”?.
One way to fix the problem is do a full read of the file before trying to update it. Try:
basepath=/tmp/kubernetes/$basefile
secretspath=/tmp/secrets.yaml
while
line=$(grep -n "{{SECRETS}}" "$basepath" | grep -v "#" | head -n1)
[[ -n $line ]]
do
lineno=$(echo "$line" | cut -d':' -f1)
spaces=$(sed "${lineno}!d" "$basepath" \
| awk -F'[^ \t]' '{print length($1)}')
spaces=$((spaces-1))
# Delete line that had {{SECRETS}}
sed -i -e "${lineno}d" "$basepath"
while IFS='' read -r secretline || [[ -n "$secretline" ]]; do
newline=$(printf "%*s%s" $spaces "" "$secretline")
sed -i "${lineno}i\ ${newline}" "$basepath"
lineno=$((lineno+1))
done < "$secretspath"
done
(I introduced the variables basepath and secretspath to make the code easier to test.)
As an aside, it's also possible to do this with pure Bash code:
basepath=/tmp/kubernetes/$basefile
secretspath=/tmp/secrets.yaml
updated_lines=()
is_updated=0
while IFS= read -r line || [[ -n $line ]] ; do
if [[ $line == *'{{SECRETS}}'* && $line != *'#'* ]] ; then
spaces=${line%%[^[:space:]]*}
while IFS= read -r secretline || [[ -n $secretline ]]; do
updated_lines+=( "${spaces}${secretline}" )
done < "$secretspath"
is_updated=1
else
updated_lines+=( "$line" )
fi
done <"$basepath"
(( is_updated )) && printf '%s\n' "${updated_lines[@]}" >"$basepath"
The whole updated file is stored in memory (in the updated_lines array), but that shouldn't be a problem, because any file that's too big to store in memory will almost certainly be too big to process line by line with Bash. Bash is generally extremely slow.
In this code spaces holds the actual space characters for indentation, not the number of them.
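The indentation capture can be tried in isolation. This is a small sketch; the sample line and the secretline value are made up for illustration.

```bash
#!/usr/bin/env bash
line='    - {{SECRETS}}'          # sample input line with four leading spaces

# Remove the longest suffix that begins with a non-whitespace character,
# which leaves only the leading whitespace.
spaces=${line%%[^[:space:]]*}

secretline='API_KEY: abc'         # hypothetical line from the secrets file
indented="${spaces}${secretline}"
printf '[%s]\n' "$indented"
# prints: [    API_KEY: abc]
```

Because spaces holds the whitespace itself, tabs in the original indentation are carried over as tabs rather than being converted to a count of spaces.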

Inline array substitution

I have file with a few lines:
x 1
y 2
z 3 t
I need to pass each line as a parameter to some program:
$ program "x 1" "y 2" "z 3 t"
I know how to do it with two commands:
$ readarray -t a < file
$ program "${a[@]}"
How can I do it with one command? Something like this:
$ program ??? file ???
The (default) options of your readarray command indicate that your file items are separated by newlines.
So in order to achieve what you want in one command, you can take advantage of the special IFS variable to use word splitting w.r.t. newlines (see e.g. this doc) and call your program with a non-quoted command substitution:
IFS=$'\n'; program $(cat file)
As suggested by @CharlesDuffy:
you may want to disable globbing by running beforehand set -f, and if you want to keep these modifications local, you can enclose the whole in a subshell:
( set -f; IFS=$'\n'; program $(cat file) )
to avoid the performance penalty of the parens and of the /bin/cat process, you can write instead:
( set -f; IFS=$'\n'; exec program $(<file) )
where $(<file) is a Bash equivalent to $(cat file) (faster as it doesn't require forking /bin/cat), and exec consumes the subshell created by the parens.
However, note that the exec trick won't work and should be removed if program is not a real program in the PATH (that is, you'll get exec: program: not found if program is just a function defined in your script).
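A quick way to check the splitting is to count the arguments that arrive. In this sketch, count_args is a hypothetical function standing in for program (so exec is left out, as noted above), and the third sample line ends in * instead of t to show why set -f matters.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for "program": it prints its argument count.
count_args() { printf '%s\n' "$#"; }

file=$(mktemp)
printf '%s\n' 'x 1' 'y 2' 'z 3 *' > "$file"

# The command substitution runs in a subshell, so set -f and IFS stay local.
n_split=$( set -f; IFS=$'\n'; count_args $(<"$file") )

# For comparison: default IFS also splits on spaces (globbing still disabled).
n_default=$( set -f; count_args $(<"$file") )

echo "newline split: $n_split, default split: $n_default"
# prints: newline split: 3, default split: 7
rm -f "$file"
```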
Passing a set of parameters should be more organized:
In this example I'm looking for a file containing chk_disk_issue=something etc., so I set the values by reading a config file whose name I pass in as a parameter.
# -- read specific variables from the config file (if found) --
if [ -f "${file}" ] ;then
while IFS= read -r line ;do
if ! [[ $line = *"#"* ]]; then
var="$(echo $line | cut -d'=' -f1)"
case "$var" in
chk_disk_issue)
chk_disk_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
chk_mem_issue)
chk_mem_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
chk_cpu_issue)
chk_cpu_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
esac
fi
done < "${file}"
fi
If these are not parameters, then find a way for your script to read them as data inside the script and pass in the file name.

process every line from command output in bash

From every line of nmap network scan output, I want to store the hosts and their IPs in variables (and, for further use, additionally the "Host is up" string):
The nmap output to be processed looks like:
Nmap scan report for samplehostname.mynetwork (192.168.1.45)
Host is up (0.00047s latency).
That's my script so far:
#!/bin/bash
while IFS='' read -r line
do
host=$(grep report|cut -f5 -d' ')
ip=$(grep report|sed 's/^.*(//;s/)$//')
printf "Host:$host - IP:$ip"
done < <(nmap -sP 192.168.1.1/24)
The output does something I do not understand. It puts "Host:" at the very beginning and "IP:" at the very end, while completely omitting the value of $ip.
The generated output of my script is:
Host:samplehostname1.mynetwork
samplehostname2.mynetwork
samplehostname3.mynetwork
samplehostname4.mynetwork
samplehostname5.mynetwork - IP:
Separately, the extraction of $host and $ip basically works (although there is surely a better solution). I can printf either $host or $ip alone.
What's wrong with my script? Thanks!
Your two grep commands are reading from standard input, which they inherit from the loop, so they also read from nmap. read gets one line, the first grep consumes the rest, and the second grep exits immediately because standard input is already at end-of-file. I suspect you meant to grep the contents of $line:
while IFS='' read -r line
do
host=$(grep report <<< "$line" |cut -f5 -d' ')
ip=$(grep report <<< "$line" |sed 's/^.*(//;s/)$//')
printf "Host:$host - IP:$ip"
done < <(nmap -sP 192.168.1.1/24)
However, this is inefficient and unnecessary. You can use bash's built-in regular expression support to extract the fields you want.
regex='Nmap scan report for (.*) \((.*)\)'
while IFS='' read -r line
do
[[ $line =~ $regex ]] || continue
host=${BASH_REMATCH[1]}
ip=${BASH_REMATCH[2]}
printf "Host:%s - IP:%s\n" "$host" "$ip"
done < <(nmap -sP 192.168.1.1/24)
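The regex loop can be tried without running nmap. This sketch feeds it the two sample lines from the question through a here-document and collects the formatted results in an array (the results array is an addition for illustration).

```bash
#!/usr/bin/env bash
regex='Nmap scan report for (.*) \((.*)\)'
results=()

while IFS= read -r line; do
  [[ $line =~ $regex ]] || continue     # skip lines such as "Host is up ..."
  # BASH_REMATCH[1] is the host name, BASH_REMATCH[2] the address in parentheses.
  results+=( "Host:${BASH_REMATCH[1]} - IP:${BASH_REMATCH[2]}" )
done <<'EOF'
Nmap scan report for samplehostname.mynetwork (192.168.1.45)
Host is up (0.00047s latency).
EOF

printf '%s\n' "${results[@]}"
# prints: Host:samplehostname.mynetwork - IP:192.168.1.45
```

No external process is started inside the loop, so nothing competes with read for the loop's standard input.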
Try this:
#!/bin/bash
while IFS='' read -r line
do
if [[ $(echo $line | grep report) ]];then
host=$(echo $line | cut -f5 -d' ')
ip=$(echo $line | sed 's/^.*(//;s/)$//')
echo "Host:$host - IP:$ip"
fi
done < <(nmap -sP it-50)
Output:
Host:it-50 - IP:10.0.0.10
I added an if clause to skip unwanted lines.

A script to find all the users who are executing a specific program

I've written a bash script (searchuser) that should display all the users who are executing a specific program or script (at least a bash script). But searching for scripts fails, because the command the OS is actually executing is something like bash scriptname.
The script works by parsing the output of ps: it searches for all occurrences of the specified program name, extracts the user and the program name, checks whether the program name is the one we're searching for, and if so displays the relevant information (in this case the user name and the program name; it might be better to output the PID as well, but that is quite simple). The check rejects all lines whose program name merely contains the name we are searching for: if we're searching for gedit, we don't want to find sgedit or gedits.
Other issues I have:
I would like to avoid the use of a tmp file.
I would like to be not tied to GNU extensions.
The script has to be executed as:
root# searchuser programname <Enter>
The script searchuser is the following:
#!/bin/bash
i=0
search=$1
tmp=`mktemp`
ps -aux | tr -s ' ' | grep "$search" > $tmp
while read fileline
do
user=`echo "$fileline" | cut -f1 -d' '`
prg=`echo "$fileline" | cut -f11 -d' '`
prg=`basename "$prg"`
if [ "$prg" = "$search" ]; then
echo "$user - $prg"
i=`expr $i + 1`
fi
done < $tmp
if [ $i = 0 ]; then
echo "No users are executing $search"
fi
rm $tmp
exit $i
Do you have any suggestions on how to solve these issues?
One approach might look like this:
IFS=$'\n' read -r -d '' -a pids < <(pgrep -x -- "$1"; printf '\0')
if (( ! ${#pids[@]} )); then
echo "No users are executing $1"
fi
for pid in "${pids[@]}"; do
# build a more accurate command line than the one ps emits
args=( )
while IFS= read -r -d '' arg; do
args+=( "$arg" )
done </proc/"$pid"/cmdline
(( ${#args[@]} )) || continue # exited while we were running
printf -v cmdline_str '%q ' "${args[@]}"
user=$(stat --format=%U /proc/"$pid") || continue # exited while we were running
printf '%q - %s\n' "$user" "${cmdline_str% }"
done
Unlike the output from ps, which doesn't distinguish between ./command "some argument" and ./command "some" "argument", this will emit output which correctly shows the arguments run by each user, with quoting which will re-run the given command correctly.
What about:
ps -e -o user,comm | egrep "^[^ ]+ +$1$" | cut -d' ' -f1 | sort -u
* Addendum *
This statement:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*$1$" | while read a b; do echo $a; done | sort | uniq -c
or this one:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*sleep$" | xargs -L1 echo | cut -d ' ' -f1 | sort | uniq -c
shows the number of process instances by user.
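The exact-match idea behind these pipelines can be checked on canned data. The sketch below swaps egrep for a whole-field comparison in awk, which is a different technique from the one above, chosen because it needs no regex escaping of the program name. The sample lines are made up and stand in for the output of ps -e -o user,comm.

```bash
#!/usr/bin/env bash
# Made-up "user comm" lines standing in for `ps -e -o user,comm` output.
sample='alice gedit
bob sgedit
carol gedits
alice gedit
dave gedit'

search=gedit

# Compare the command field as a whole string, so sgedit and gedits never match.
users=$(printf '%s\n' "$sample" | awk -v prog="$search" '$2 == prog { print $1 }' | sort -u)
printf '%s\n' "$users"
# prints:
# alice
# dave
```

On a live system the here-string data would be replaced by the ps pipeline itself. Command names that contain spaces would need different field handling, which this sketch does not cover.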
