How to display progress of another command - bash

I need help with a bash script I run like this:
do_something > file.txt (I'm using the third line of this file.txt in another echo output)
Now I need to get the number of characters on the second line of file.txt.
(The line contains only dots - ".")
I can get the number of characters with this command:
progress=$(awk 'NR==2' file.txt | grep -o \. | wc -w)
But the problem is that the second line of file.txt is a "progress bar", so the number of characters on it changes over time from 0 to XY (e.g. 100).
I want to use it to show progress as a percentage: echo -ne "$progress % \\r"
How could I do that in a loop? do_something > file.txt must start just once. Over the next ~5-20 seconds it prints dots to the second line, and I need to take this number, updated every second, into my output echo "XY %".
How can I read from file.txt every second and find the new/updated count of characters there?
edit:
It's a real-time process. My do_something > file.txt is "printing" dots to this file and I want to print the result saved in $progress in real time. So the first command prints dots to the file, and I count them every second and print how many percent is done, from 0-100 %.

What you want to do is run do_something > file.txt in the background and then monitor it. You can use kill with the special signal 0 (which only tests whether the process is still alive) to do this.
do_something > file.txt &
PID=$!
while kill -0 $PID 2> /dev/null
do
    [calculate percent complete]
    [display percent complete]
    sleep 5
done
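A minimal sketch with those bracketed steps filled in, assuming (as in the question) that the second line holds up to 100 dots, so the dot count is already a percentage:
do_something > file.txt &
PID=$!
while kill -0 $PID 2> /dev/null
do
    # count the dots on the second line; an empty or one-line file simply yields 0
    progress=$(awk 'NR==2' file.txt | grep -o '\.' | wc -l)
    # \r returns the cursor, so the same terminal line is overwritten each pass
    echo -ne "$progress %\r"
    sleep 1
done
echo "100 %"   # the job has exited, so assume it finished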

First, you should run your command in the background:
do_something > file.txt &
Then you can watch the changes in the output file. This will print the second line of file.txt every second, forever.
while true; do sed -n '2p' < file.txt; sleep 1; done
If you want to print only the number of characters on the second line, you can do this (note that wc -m also counts the trailing newline):
while true; do sed -n '2p' < file.txt | wc -m; sleep 1; done
If you want to stop when there are 100 characters on the second line, you can do this:
MAX="100"
CUR="0"
while [ $CUR -lt $MAX ]; do CUR=$(sed -n '2p' < file.txt | wc -m); echo $CUR; sleep 1; done
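To turn that count into the percentage display from the question (a small sketch, again assuming 100 dots means 100 %):
MAX=100
CUR=0
while [ $CUR -lt $MAX ]; do
    CUR=$(awk 'NR==2' file.txt | grep -o '\.' | wc -l)
    echo -ne "$CUR %\r"
    sleep 1
done
echo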

Related

Counting all the 5 from a specific range in Bash

I want to count how many times the digit "5" appears in the range 1 to 4321. For example, in the number 5 it appears once, in the number 555 it appears 3 times, etc.
Here is my code so far; however, the result is 0, and it is supposed to be 1262.
#!/bin/bash
typeset -i count5=0
for n in {1..4321}; do
echo ${n}
done | \
while read -n1 digit ; do
if [ `echo "${digit}" | grep 5` ] ; then
count5=count5+1
fi
done | echo "${count5}"
P.s. I am looking to fix my code so it can print the right output. I do not want a completely different solution or a shortcut.
What about something like this
seq 4321 | tr -Cd 5 | wc -c
1262
Creates the sequence, deletes everything but 5s, and counts the characters.
The main problem here is http://mywiki.wooledge.org/BashFAQ/024 (variables set in a pipeline subshell do not survive it). With minimal changes, your code could be refactored to
#!/bin/bash
typeset -i count5=0
for n in {1..4321}; do
    echo $n # braces around ${n} provide no benefit
done | # no backslash required here; fix weird indentation
while read -n1 digit ; do
    # prefer modern command substitution syntax over backticks
    if [ $(echo "${digit}" | grep 5) ] ; then
        count5=count5+1
    fi
    echo "${count5}" # variable will not persist outside the subprocess
done | tail -n 1 # so print the running count inside and just keep the last one
With some common antipatterns removed, this reduces to
#!/bin/bash
printf '%s\n' {1..4321} |
grep -o 5 |
wc -l
Here grep -o prints each match on its own line, so wc -l counts occurrences of the digit 5; a plain grep 5 | wc -l (or the shorter grep -c 5) would count the numbers containing a 5 and miss the repeats in values like 55 or 555.
One primary issue:
Each time results are sent to a pipe, said pipe starts a new subshell; in bash, any variables set in the subshell are 'lost' when the subshell exits. The net result is that even if you're correctly incrementing count5 within the subshell, you'll still end up with 0 (the starting value) when you exit from the subshell.
Making minimal changes to OP's current code:
#!/bin/bash
typeset -i count5=0
while read -n1 digit ; do
    if [ `echo "${digit}" | grep 5` ]; then
        count5=count5+1
    fi
done < <(for n in {1..4321}; do echo ${n}; done)
echo "${count5}"
NOTE: there are a couple of performance-related issues with this method of coding, but since the OP has explicitly asked to a) 'fix' the current code and b) not be given any shortcuts ... we'll leave the performance fixes for another day ...
A simpler way to get the number for a certain n would be
nx=${n//[^5]/} # Remove all non-5 characters
count5=${#nx} # Calculate the length of what is left
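For example, dropping that into the loop (a small sketch; the expected total of 1262 comes from the question):
count5=0
for n in {1..4321}; do
    nx=${n//[^5]/}               # remove all non-5 characters
    count5=$((count5 + ${#nx}))  # add the number of 5s found in this n
done
echo "$count5"                   # 1262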
A simpler method in pure bash could be:
printf -v seq '%s' {1..4321} # print the sequence into the variable seq
fives=${seq//[!5]} # delete all characters but 5s
count5=${#fives} # length of the string is the count of 5s
echo $count5 # print it
Or, using standard utilities tr and wc
printf '%s' {1..4321} | tr -dc 5 | wc -c
Or using awk:
awk 'BEGIN { for(i=1;i<=4321;i++) {$0=i; x=x+gsub("5",""); } print x} '

Unix bash script grep loop counter (for)

I am looping over a grep result. The result contains 10 lines (every line has different content), so the stuff in the loop gets executed 10 times.
I need to get the index, 0-9, on each pass so I can do actions based on the index.
ABC=(cat test.log | grep "stuff")
counter=0
for x in $ABC
do
echo $x
((counter++))
echo "COUNTER $counter"
done
Currently the counter won't really change.
Output:
51209
120049
148480
1211441
373948
0
0
0
728304
0
COUNTER: 1
If your requirement is only to print a counter (which is all the shown samples do), you could use awk (if you are OK with it). This can be done with a single awk, without creating a variable and then using grep as you are doing currently; awk can perform both the search and the counter printing in a single shot.
awk -v counter=0 '/stuff/{print "counter=" counter++}' Input_file
Replace stuff above with the actual string you are looking for, and put your actual file name in place of Input_file.
This should print like:
counter=0
counter=1
........and so on (zero-based, matching the 0-9 index you asked for)
Your shell script contains what should be an obvious syntax error.
ABC=(cat test.log | grep "stuff")
This fails with
-bash: syntax error near unexpected token `|'
There is no need to save the output in a variable if you only want to process the lines one at a time (and obviously no need for the useless cat).
grep "stuff" test.log | nl
gets you numbered lines, though the index will be 1-based, not zero-based.
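With the numbers from your sample output, that would look something like:
     1  51209
     2  120049
     3  148480
...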
If you absolutely need zero-based, refactoring to Awk should solve it easily:
awk '/stuff/ { print n++, $0 }' test.log
If you want to loop over this and do something more with this information,
awk '/stuff/ { print n++, $0 }' test.log |
while read -r index output; do
    echo index is "$index"
    echo output is "$output"
done
Because the while loop executes in a subshell the value of index will not be visible outside of the loop. (I guess that's what your real code did with the counter as well. I don't think that part of the code you posted will repro either.)
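If you do need the index after the loop, one bash-specific sketch is to feed the loop with process substitution instead of a pipe, so the while runs in the current shell and its variables survive:
while read -r index output; do
    echo "index is $index"
    echo "output is $output"
done < <(awk '/stuff/ { print n++, $0 }' test.log)
echo "final index was $index"   # still visible: the loop did not run in a subshell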
Do not store the result of grep in a scalar variable like $ABC. If a line of the log file contains whitespace, the variable $x is split on it due to bash word splitting. (BTW, the statement ABC=(cat test.log | grep "stuff") causes a syntax error.)
Please try something like:
readarray -t abc < <(grep "stuff" test.log)
for x in "${abc[@]}"
do
    echo "$x"
    echo "COUNTER $((++counter))"
done
or
readarray -t abc < <(grep "stuff" test.log)
for i in "${!abc[@]}"
do
    echo "${abc[i]}"
    echo "COUNTER $((i + 1))"
done
You can use the increment statement below:
counter=$(( $counter + 1));

bash - print a line every X seconds (like sed every X lines)

I know that with sed you can pipe the output of a command into it and print every Xth line.
make all | sed -n '2~5p'
Is there an equivalent command to print a line every X seconds?
make all | print_line_every_sec '5'
With a 5-second timeout, read one line and discard anything else:
while
# timeout 5 seconds
! timeout 5 sh -c '
# read one line
if IFS= read -r line; then
# output the line
printf "%s\n" "$line"
# discard the input for the rest of 5 seconds
cat >/dev/null
fi
# will get here only, if there is nothing to read
'
# that means that `timeout` will always return 124 if stdin is still open
# and it will return 0 exit status only if there is nothing to read
# so we loop on nonzero exit status of timeout.
do :; done
and as a one-liner:
while ! timeout 0.5 sh -c 'IFS= read -r line && printf "%s\n" "$line" && cat >/dev/null'; do :; done
But maybe something simpler - just print a line, then discard the input for 5 seconds:
while IFS= read -r line; do
printf "%s\n" "$line"
timeout 5 cat >/dev/null
done
or
while IFS= read -r line &&
printf "%s\n" "$line" &&
! timeout 5 cat >/dev/null
do :; done
If you want the most recent message every 5 seconds, here is an attempt:
make all | {
    display(){
        if (( $SECONDS >= 5)); then
            if test -n "${last_line+x}"; then
                # print only if there is a message in the last 5 seconds
                echo $last_line; unset last_line
            fi
            SECONDS=0
        fi
    }
    SECONDS=0
    while true; do
        while IFS= read -t 0.001 line; do
            last_line=$line
            display
        done
        display
    done
}
Even if the proposed solutions are interesting and beautiful, the most elegant solution IMHO is an awk one. If you want to issue
make all | print_line_every_sec 5
then you have to create the script print_line_every_sec as follows, including a test to avoid an infinite loop:
#!/bin/bash
if [ $1 -le 0 ] ; then echo $(basename $0): invalid argument \'$1\'; exit 1; fi
awk -v delay=$1 'BEGIN {t = systime ()}
{if (systime() >= t) {print $0 ; t += delay}}'
This might work for you (GNU sed):
sed 'e sleep 1' file
This prints a line every n seconds (in the above example, 1).
To print 5 lines every 2 seconds, use:
sed '1~5e sleep 2' file
You can do it with the watch command.
If you only need to print your output every X seconds, you could use something like this:
watch -n X "Your CMD"
If you want changes in the output to be highlighted, the -d switch is useful:
watch -n X -d "Your CMD"
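For example (a sketch; build.log is just an assumed file name, since watch runs a command itself rather than reading from a pipe):
make all > build.log 2>&1 &
watch -n 5 -d "tail -n 20 build.log"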

Counting number of delimiters of special character bash shell script Performance improvement

Hi, I have a script that counts the number of records in a file and finds the expected number of delimiters per record by dividing the delimiter count (rs_count) by the total record count. It works fine, but it is a little slow on large files. I was wondering if there is a way to improve performance. The delimiter RS is the special character octal \246. I am using a bash shell script.
Some additional info:
A line is a record.
The file will always have the same number of delimiters.
The purpose of the script is to check if the file has the expected number of fields. After calculating it, the script just echos it out.
for file in $SOURCE; do
    echo "executing File -"$file
    if (( $total_record_count != 0 ));then
        filename=$(basename "$file")
        total_record_count=$(wc -l < $file)
        rs_count=$(sed -n 'l' $file | grep -o $RS | wc -l)
        Delimiter_per_record=$((rs_count/total_record_count))
    fi
done
Counting the delimiters (not total records) in a file
On a file with 50,000 lines, I see around a 10-fold speedup by consolidating the sed, grep, and wc pipeline into a single awk process:
awk -v RS='Delimiter' 'END{print NR -1}' input_file
Dealing with wc when there's no trailing line break
If you count the instances of ^ (start of line), you will get a true count of lines. Using grep:
grep -co "^" input_file
(Thankfully, even though ^ is a regex, the performance of this is on par with wc)
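A quick illustration of the difference (hypothetical snippet, not from the original answer):
printf 'a\nb\nc' | wc -l          # prints 2: wc -l only counts newline characters
printf 'a\nb\nc' | grep -co "^"   # prints 3: every line start is counted, even the last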
Incorporating these two modifications into a trivial test based on your supplied code:
#!/usr/bin/env bash
SOURCE="$1"
RS=$'\246'
for file in $SOURCE; do
    echo "executing File -"$file
    if [[ $total_record_count != 0 ]];then
        filename=$(basename "$file")
        total_record_count=$(grep -oc "^" $file)
        rs_count="$(awk -v RS=$'\246' 'END{print NR -1}' $file)"
        Delimiter_per_record=$((rs_count/total_record_count))
    fi
done
echo -e "\$rs_count:\t${rs_count}\n\$Delimiter_per_record:\t${Delimiter_per_record}\n\$total_record_count:\t${total_record_count}" | column -t
Running this on a file with 50,000 lines on my macbook:
time ./recordtest.sh /tmp/randshort
executing File -/tmp/randshort
$rs_count: 186885
$Delimiter_per_record: 3
$total_record_count: 50000
real 0m0.064s
user 0m0.038s
sys 0m0.012s
Unit test one-liner
(creates /tmp/recordtest, chmod +x's it, creates /tmp/testfile with 10 lines of random characters including octal \246, and then runs the script file on the testfile)
echo $'#!/usr/bin/env bash\n\nSOURCE="$1"\nRS=$\'\\246\'\n\nfor file in $SOURCE; do\n echo "executing File -"$file\n if [[ $total_record_count != 0 ]];then\n filename=$(basename "$file")\n total_record_count=$(grep -oc "^" $file)\n rs_count="$(awk -v RS=$\'\\246\' \'END{print NR -1}\' $file)"\n Delimiter_per_record=$((rs_count/total_record_count))\n fi\ndone\n\necho -e "\\$rs_count:\\t${rs_count}\\n\\$Delimiter_per_record:\\t${Delimiter_per_record}\\n\\$total_record_count:\\t${total_record_count}" | column -t' > /tmp/recordtest ; echo $'\246459ca4f23bafff1c8fc017864aa3930c4a7f2918b\246753f00e5a9278375b\nb\246a3\246fc074b0e415f960e7099651abf369\246a6f\246f70263973e176572\2467355\n1590f285e076797aa83b2ee537c7f99\24666990bb60419b8aa\246bb5b6b\2467053\n89b938a5\246560a54f2826250a2c026c320302529331229255\246ef79fbb52c2\n9042\246bb\246b942408a22f912268ffc78f08c\2462798b0c05a75439\246245be2ea5\n0ef03170413f90e\246e0\246b1b2515c4\2466bf0a1bb\246ee28b78ccce70432e6b\24653\n51229e7ab228b4518404360b31a\2463673261e3242985bf24e59bc657\246999a\n9964\246b08\24640e63fae788ea\246a1777\2460e94f89af8b571e\246e1b53e6332\246c3\246e\n90\246ae12895f\24689885e\246e736f942080f267a275132a348ec1e837b99efe94\n2895e91\246\246f506f\246c1b986a63444b4258\246bc1b39182\24630\24696be' > /tmp/testfile ; chmod +x /tmp/recordtest ; /tmp/./recordtest /tmp/testfile
Which produces this result:
$rs_count: 39
$Delimiter_per_record: 3
$total_record_count: 10
Though there are a number of solutions for counting instances of characters in files, quite a few come undone when trying to process special characters like octal \246. awk seems to handle it reliably and quickly.

Using a pipe to read a file, run script and write to the same file

I need to write a one-line script that takes a file and appends to the end of each line the number of words on that line, but only if the word "word" appears in it. I can use another script that does whatever I want.
My problem is that after I run the script, the file I passed to it is empty.
This is the one-line script:
#!/bin/bash
cat $1 | ./words_num word | cat $1
words_num
#!/bin/bash
while read line; do
    temp=`echo $line | grep $1 | wc -l`
    if (($temp==1)); then
        word_cnt=`echo $line | wc -w`
        echo "$line $word_cnt"
    else
        echo "$line"
    fi
done
For example, before the file is:
bla bla blaa word
words blaa
bla bla
after file:
bla bla blaa word 4
words blaa 2
bla bla
Can you help?
The one-liner:
cat $1 | ./words_num word | cat $1
is peculiar. It is approximately equivalent to:
cat $1 | ./words_num word >/dev/null; cat $1
which is unlikely to be the intended result. It is also a candidate for a UUOC (Useless Use of cat) award.
If the intention is to overwrite the original file with the amended version, then you should probably write:
./words_num word < $1 > tmp.$$; mv tmp.$$ $1
If you want to see the results on the screen as well, then:
./words_num word < $1 | tee tmp.$$; mv tmp.$$ $1
Both of these will leave a temporary file behind if interrupted. You can avoid that with:
#!/bin/bash
trap "rm -f tmp.$$; exit 1" 0 1 2 3 13 15
./words_num word < $1 | tee tmp.$$
mv tmp.$$ $1
trap 0
The first trap sets handlers for EXIT, HUP, INT, QUIT, PIPE and TERM (0 1 2 3 13 15) that remove the temporary file (if it exists) and exit with a failure status. The trap 0 at the end cancels the EXIT handler so the command exits successfully.
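A variant of the same idea using mktemp for the temporary file (mktemp is not part of the original answer, just a common alternative):
#!/bin/bash
tmp=$(mktemp) || exit 1                    # mktemp picks a safe, unique name
trap 'rm -f "$tmp"; exit 1' 0 1 2 3 13 15
./words_num word < "$1" | tee "$tmp"
mv "$tmp" "$1"
trap 0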
As for the words_num script, that seems to call for awk rather than shell:
#!/bin/bash
[ $# == 0 ] && { echo "Usage: $0 word [file ...]" >&2; exit 1; }
word=$1
shift
awk "/$word/"' { print $0, NF; next } { print }' "$#"
You can reduce that if you're into code golfing your awk scripts, but I prefer clarify to sub-par code. It looks for lines containing the word, prints the line along with the number of fields in the line, and moves to the next line. If the line doesn't match, it is simply printed. The assignment and shift mean that "$#" contains all the other arguments to words_num, and awk will automatically cycle through the named files, or read standard input if no files are named.
The script should check that the given word does not contain any slashes as that will mess up the regex (it would be OK to replace each one that appears with [/], a character class containing only a slash). That level of bullet-proofing is left for the interested user.
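For example, running it on the sample input from the question (the file name input.txt is just an assumption here):
./words_num word input.txt
bla bla blaa word 4
words blaa 2
bla bla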
cat $1 | ./words_num word | tee $1
(Beware that this is racy: tee truncates $1 as soon as the pipeline starts, so cat may read an already-empty file. Writing to a temporary file and moving it back, as above, is safer.)
