Bash piping command output into sum loop - bash

I'm getting into bash and I love it, but there seem to be lots of subtleties that end up making a big difference in functionality. Here is my question.
I know this works:
total=0
for i in $(grep number some.txt | cut -d " " -f 1); do
(( total+=i ))
done
But why doesn't this?:
grep number some.txt | cut -d " " -f 1 | while read i; do (( total+=i )); done
some.txt:
1 number
2 number
50 number
Both the for and the while loop receive 1, 2, and 50 separately, but the for loop leaves the total variable at 53 in the end, while with the while loop it just stays at zero. I know there's some fundamental knowledge I'm lacking here; please help me.
I also don't get the differences in piping, for example
If I run
grep number some.txt | cut -d " " -f 1 | while read i; do echo "-> $i"; done
I get the expected output
-> 1
-> 2
-> 50
But if I run it like so
while read i; do echo "-> $i"; done <<< $(grep number some.txt | cut -d " " -f 1)
then the output changes to
-> 1 2 50
This seems weird to me since grep outputs the result on separate lines. To add to the confusion, if I had a file with only the numbers 1, 2, 3 on separate lines, and I ran
while read i; do echo "-> $i"; done < someother.txt
then echo would print the output on separate lines, which is what I expected in the previous example. I know < is for files and <<< for command outputs, but why does that difference in line handling exist?
Anyways, I was hoping someone could shed some light on the matter, thank you for your time!

grep number some.txt | cut -d " " -f 1 | while read i; do (( total+=i )); done
Each command in a pipeline is run in a subshell. That means when you put the while read loop in a pipeline any variable assignments are lost.
See: BashFAQ 024 - "I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?"
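One common way around the subshell, sketched here on the assumption that the script runs under bash rather than plain sh, is to feed the loop from a process substitution so that the loop itself stays in the current shell. The temporary file is only a stand-in for some.txt from the question.

```bash
#!/usr/bin/env bash
# Stand-in for some.txt from the question (temporary file, an assumption).
input=$(mktemp)
printf '%s\n' '1 number' '2 number' '50 number' > "$input"

total=0
# The loop reads from a process substitution instead of a pipe, so it runs
# in the current shell and its changes to total survive the loop.
while read -r i; do
    (( total += i ))
done < <(grep number "$input" | cut -d ' ' -f 1)

echo "$total"   # prints 53
rm -f "$input"
```

The pipeline still exists, but it now lives inside the <( ... ) part, so only grep and cut run in subshells.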
while read i; do echo "-> $i"; done <<< "$(grep number some.txt | cut -d " " -f 1)"
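To preserve grep's newlines, add double quotes. Otherwise the result of $(...) is subject to word splitting, which collapses all the whitespace into single spaces. (Bash 4.4 and later no longer split an unquoted here-string, but quoting gives the intended result in every version.)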
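As a small illustration of the quoted form, the sketch below feeds three lines through a here-string and counts how many times read succeeds. The printf call is only a stand-in for the grep | cut pipeline, and the lines counter is a name introduced for this example.

```bash
#!/usr/bin/env bash
lines=0
# The quoted command substitution keeps its embedded newlines, so read sees
# one value per line. A here-string is a redirection, not a pipe, so the
# loop runs in the current shell and lines keeps its value afterwards.
while IFS= read -r i; do
    echo "-> $i"
    lines=$(( lines + 1 ))
done <<< "$(printf '%s\n' 1 2 50)"

echo "read $lines lines"   # prints: read 3 lines
```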

Related

Counting all the 5 from a specific range in Bash

I want to count how many times the digit "5" appears in the range 1 to 4321. For example, the number 5 contains it once, and the number 555 contains it three times.
Here is my code so far, however, the results are 0, and they are supposed to be 1262.
#!/bin/bash
typeset -i count5=0
for n in {1..4321}; do
echo ${n}
done | \
while read -n1 digit ; do
if [ `echo "${digit}" | grep 5` ] ; then
count5=count5+1
fi
done | echo "${count5}"
P.s. I am looking to fix my code so it can print the right output. I do not want a completely different solution or a shortcut.
What about something like this
seq 4321 | tr -Cd 5 | wc -c
1262
This creates the sequence, deletes everything but the 5s, and counts the remaining characters.
The main problem here is http://mywiki.wooledge.org/BashFAQ/024. With minimal changes, your code could be refactored to
#!/bin/bash
typeset -i count5=0
for n in {1..4321}; do
echo $n # braces around ${n} provide no benefit
done | # no backslash required here; fix weird indentation
while read -n1 digit ; do
# prefer modern command substitution syntax over backticks
if [ $(echo "${digit}" | grep 5) ] ; then
count5=count5+1
fi
echo "${count5}" # variable will not persist outside subprocess
done | tail -n 1 # so instead just print the last one after the loop
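With some common antipatterns removed, this reduces to
#!/bin/bash
printf '%s\n' {1..4321} |
grep -o 5 |
wc -l
The -o option (available in GNU and BSD grep) prints each match on its own line, so wc -l counts occurrences of the digit.
A plain grep -c 5 would be shorter, but it counts matching lines (numbers that contain at least one 5) rather than occurrences, so it gives a different answer.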
One primary issue:
Each command in a pipeline runs in its own subshell, and in bash any variables set in a subshell are 'lost' when the subshell exits. The net result is that even if you're correctly incrementing count5 within the subshell, you'll still end up with 0 (the starting value) once the subshell exits.
Making minimal changes to OP's current code:
while read -n1 digit ; do
if [ `echo "${digit}" | grep 5` ]; then
count5=count5+1
fi
done < <(for n in {1..4321}; do echo ${n}; done)
echo "${count5}"
NOTE: there are a couple performance related issues with this method of coding but since OP has explicitly asked to a) 'fix' the current code and b) not provide any shortcuts ... we'll leave the performance fixes for another day ...
A simpler way to get the number for a certain n would be
nx=${n//[^5]/} # Remove all non-5 characters
count5=${#nx} # Calculate the length of what is left
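Dropped into the original loop, those two lines could look like the sketch below. It assumes bash, since both brace expansion and the ${n//pattern/} substitution are bash features. No pipe is involved, so count5 keeps its value after the loop.

```bash
#!/usr/bin/env bash
count5=0
for n in {1..4321}; do
    nx=${n//[^5]/}                    # remove all non-5 characters
    count5=$(( count5 + ${#nx} ))     # add the number of 5s left in n
done
echo "$count5"   # prints 1262
```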
A simpler method in pure bash could be:
printf -v seq '%s' {1..4321} # print the sequence into the variable seq
fives=${seq//[!5]} # delete all characters but 5s
count5=${#fives} # length of the string is the count of 5s
echo $count5 # print it
Or, using standard utilities tr and wc
printf '%s' {1..4321} | tr -dc 5 | wc -c
Or using awk:
awk 'BEGIN { for(i=1;i<=4321;i++) {$0=i; x=x+gsub("5",""); } print x} '

Unix bash script grep loop counter (for)

I am looping over a grep result. The result contains 10 lines (every line has different content), so the body of the loop gets executed 10 times.
I need to get the index, 0-9, of each iteration so I can do actions based on the index.
ABC=(cat test.log | grep "stuff")
counter=0
for x in $ABC
do
echo $x
((counter++))
echo "COUNTER $counter"
done
Currently the counter won't really change.
Output:
51209
120049
148480
1211441
373948
0
0
0
728304
0
COUNTER: 1
If your requirement is only to print the counter (as in the samples shown), you could do it with a single awk command, if you are OK with awk. There is no need to create a variable and then use grep as you are doing currently; awk can perform both the search and the counter printing in one pass.
awk -v counter=0 '/stuff/{print "counter=" counter++}' Input_file
Replace the string stuff above with the actual string you are looking for, and Input_file with your actual file name.
Because counter++ is a post-increment, the first match prints 0, so the output looks like:
counter=0
counter=1
........and so on
Your shell script contains what should be an obvious syntax error.
ABC=(cat test.log | grep "stuff")
This fails with
-bash: syntax error near unexpected token `|'
There is no need to save the output in a variable if you only want to process one at a time (and obviously no need for the useless cat).
grep "stuff" test.log | nl
gets you numbered lines, though the index will be 1-based, not zero-based.
If you absolutely need zero-based, refactoring to Awk should solve it easily:
awk '/stuff/ { print n++, $0 }' test.log
If you want to loop over this and do something more with this information,
awk '/stuff/ { print n++, $0 }' test.log |
while read -r index output; do
echo index is "$index"
echo output is "$output"
done
Because the while loop executes in a subshell the value of index will not be visible outside of the loop. (I guess that's what your real code did with the counter as well. I don't think that part of the code you posted will repro either.)
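If the counter does need to survive the loop, one option is the lastpipe shell option, sketched below. This assumes bash 4.2 or later running a script (the option only takes effect when job control is off, which is the default for non-interactive shells). The log contents are made up for the illustration.

```bash
#!/usr/bin/env bash
# Run the last element of a pipeline in the current shell (bash 4.2+).
shopt -s lastpipe

# Hypothetical stand-in for test.log.
log=$(mktemp)
printf '%s\n' 'stuff one' 'other' 'stuff two' 'stuff three' > "$log"

counter=0
grep 'stuff' "$log" | while IFS= read -r line; do
    echo "index $counter: $line"
    counter=$(( counter + 1 ))
done

# With lastpipe the loop ran in this shell, so counter is still set here.
echo "matched lines: $counter"   # prints: matched lines: 3
rm -f "$log"
```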
Do not store the result of grep in a scalar variable $ABC.
If the line of the log file contains whitespaces, the variable $x
is split on them due to the word splitting of bash.
(BTW the statement ABC=(cat test.log | grep "stuff") causes a syntax error.)
Please try something like:
readarray -t abc < <(grep "stuff" test.log)
for x in "${abc[@]}"
do
echo "$x"
echo "COUNTER $((++counter))"
done
or
readarray -t abc < <(grep "stuff" test.log)
for i in "${!abc[@]}"
do
echo "${abc[i]}"
echo "COUNTER $((i + 1))"
done
You can use the following increment statement:
counter=$(( counter + 1 ))

displaying command output in stdout then save to file with transformation?

I have a long-running command which produces output periodically. To demonstrate, let's assume it is:
function my_cmd()
{
for i in {1..9}; do
echo -n $i
for j in {1..$i}
echo -n " "
echo $i
sleep 1
done
}
the output will be:
1 1
2  2
3   3
4    4
5     5
6      6
7       7
8        8
9         9
I want to display the command output and save it to a file at the same time.
This can be done with my_cmd | tee -a res.txt.
Now I want to display the output on the terminal as-is but save a transformed version to the file, say with sed "s/ //g".
So res.txt becomes:
11
22
33
44
55
66
77
88
99
How can I do this transformation on the fly, without waiting for the command to exit and then reading the file again?
Note that in your original code, {1..$i} is an error because sequences can't contain variables. I've replaced it with seq. Also, you're missing a do and a done for the inner for loop.
At any rate, I would use process substitution.
#!/usr/bin/env bash
function my_cmd {
for i in {1..9}; do
printf '%d' "$i"
for j in $(seq 1 $i); do
printf ' '
done
printf '%d\n' "$i"
sleep 1
done
}
my_cmd | tee >(tr -d ' ' >> res.txt)
Process substitution usually causes bash to create an entry in /dev/fd which is fed to the command in question. The contents of the substitution run asynchronously, so it doesn't block the process sending data to it.
Note that the process substitution isn't a REAL file, so the -a option for tee is meaningless. If you really want to append to your output file, >> within the substitution is the way to go.
If you don't like process substitution, another option would be to redirect to alternate file descriptors. For example, instead of the last line in the script above, you could use:
exec 5>&1
my_cmd | tee /dev/fd/5 | tr -d ' ' > res.txt
exec 5>&-
This creates a file descriptor, /dev/fd/5, which redirects to your real stdout, the terminal. It then tells tee to write to this, allowing the normal stdout from tee to be processed by additional pipe elements before final redirection to your log file.
The method you choose is up to you. I find process substitution clearer.
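A third possibility, not taken from the answers above, is to do the split in a plain read loop: print each line unchanged, then append a copy with the spaces removed. The sketch below assumes bash (for the ${line// /} substitution). It uses a shortened, hypothetical my_cmd without the sleep, and a temporary file in place of res.txt.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the long-running command: i, i spaces, i.
my_cmd() {
    local i
    for i in 1 2 3; do
        printf '%d%*s%d\n' "$i" "$i" '' "$i"
    done
}

res=$(mktemp)   # plays the role of res.txt; left in place for inspection

my_cmd | while IFS= read -r line; do
    printf '%s\n' "$line"                    # terminal: line as-is
    printf '%s\n' "${line// /}" >> "$res"    # file: all spaces removed
done

cat "$res"   # prints 11, 22 and 33 on separate lines
```

Each line is handled as soon as my_cmd emits it, at the cost of running the transformation in the shell rather than in tr or sed.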
You need to modify a few things in your function. You can also use tee inside the for loop to print and write to the file at the same time. The following script should give the result you want.
#!/bin/bash
filename="a.txt"
[ -f $filename ] && rm $filename
for i in {1..9}; do
echo -n $i | tee -a $filename
for((j=1;j<=$i;j++)); do
echo -n " "
done
echo $i | tee -a $filename
sleep 1
done
Instead of a double loop, I would use printf and its %Xs width formatting to pad with blank characters.
Moreover, I would print twice (once to stdout and once to your file) rather than using a pipe and starting new processes.
So your function could look like this:
function my_cmd() {
for i in {1..9}; do
printf "%s %${i}s\n" $i $i
printf "%s%s\n" $i $i >> res.txt
done
}

Output a file in two columns in BASH

I'd like to rearrange a file in two columns after the nth line.
For example, say I have a file like this here:
This is a bunch
of text
that I'd like to print
as two
columns starting
at line number 7
and separated by four spaces.
Here are some
more lines so I can
demonstrate
what I'm talking about.
And I'd like to print it out like this:
This is a bunch           and separated by four spaces.
of text                   Here are some
that I'd like to print    more lines so I can
as two                    demonstrate
columns starting          what I'm talking about.
at line number 7
How could I do that with a bash command or function?
Actually, pr can do almost exactly this:
pr --output-tabs=' 1' -2 -t tmp1
↓
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
-2 for two columns; -t to omit page headers; and without the --output-tabs=' 1', it'll insert a tab for every 8 spaces it added. You can also set the page width and length (if your actual files are much longer than 100 lines); check out man pr for some options.
If you're fixed upon “four spaces more than the longest line on the left,” then perhaps you might have to use something a bit more complex;
The following works with your test input, but is getting to the point where the correct answer would be, “just use Perl, already;”
#!/bin/sh
infile=${1:-tmp1}
longest=$(longest=0;
head -n $(( $( wc -l $infile | cut -d ' ' -f 1 ) / 2 )) $infile | \
while read line
do
current="$( echo $line | wc -c | cut -d ' ' -f 1 )"
if [ $current -gt $longest ]
then
echo $current
longest=$current
fi
done | tail -n 1 )
pr -t -2 -w$(( $longest * 2 + 6 )) --output-tabs=' 1' $infile
↓
This is a bunch and separated by four spa
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
… re-reading your question, I wonder if you meant that you were going to literally specify the nth line to the program, in which case, neither of the above will work unless that line happens to be halfway down.
Thank you chatraed and BRPocock (and your colleague). Your answers helped me think up this solution, which answers my need.
function make_cols
{
file=$1 # input file
line=$2 # line to break at
pad=$(($3-1)) # spaces between cols - 1
len=$( wc -l < $file )
max=$(( $( wc -L < <(head -$(( line - 1 )) $file ) ) + $pad ))
SAVEIFS=$IFS;IFS=$(echo -en "\n\b")
paste -d" " <( for l in $( cat <(head -$(( line - 1 )) $file ) )
do
printf "%-""$max""s\n" $l
done ) \
<(tail -$(( len - line + 1 )) $file )
IFS=$SAVEIFS
}
make_cols tmp1 7 4
Could be optimized in many ways, but does its job as requested.
Input data (configurable):
file
num of rows borrowed from file for the first column
num of spaces between columns
format.sh:
#!/bin/bash
file=$1
if [[ ! -f $file ]]; then
echo "File not found!"
exit 1
fi
spaces_col1_col2=4
rows_col1=6
rows_col2=$(($(cat $file | wc -l) - $rows_col1))
IFS=$'\n'
ar1=($(head -$rows_col1 $file))
ar2=($(tail -$rows_col2 $file))
maxlen_col1=0
for i in "${ar1[@]}"; do
if [[ $maxlen_col1 -lt ${#i} ]]; then
maxlen_col1=${#i}
fi
done
maxlen_col1=$(($maxlen_col1+$spaces_col1_col2))
if [[ $rows_col1 -lt $rows_col2 ]]; then
rows=$rows_col2
else
rows=$rows_col1
fi
ar=()
for i in $(seq 0 $(($rows-1))); do
line=$(printf "%-${maxlen_col1}s\n" ${ar1[$i]})
line="$line${ar2[$i]}"
ar+=("$line")
done
printf '%s\n' "${ar[@]}"
Output:
$ > bash format.sh myfile
This is a bunch           and separated by four spaces.
of text                   Here are some
that I'd like to print    more lines so I can
as two                    demonstrate
columns starting          what I'm talking about.
at line number 7
$ >
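The same idea can be written more compactly with arrays, roughly as below. This is a sketch assuming bash 4 or later (for mapfile). The function name two_cols and its argument order are made up for the illustration.

```bash
#!/usr/bin/env bash
# two_cols FILE LINE PAD (hypothetical helper)
# Prints lines 1..LINE-1 of FILE as the left column and the remaining lines
# as the right column, PAD spaces after the longest left-hand line.
two_cols() {
    local file=$1 line=$2 pad=$3
    local -a left right
    local width=0 rows i l

    mapfile -t left  < <(head -n "$(( line - 1 ))" "$file")
    mapfile -t right < <(tail -n "+$line" "$file")

    # Find the longest line in the left column.
    for l in "${left[@]}"; do
        (( ${#l} > width )) && width=${#l}
    done
    width=$(( width + pad ))

    # Print as many rows as the longer of the two columns has.
    rows=$(( ${#left[@]} > ${#right[@]} ? ${#left[@]} : ${#right[@]} ))
    for (( i = 0; i < rows; i++ )); do
        printf '%-*s%s\n' "$width" "${left[i]}" "${right[i]}"
    done
}

# Demo with a small temporary file.
demo=$(mktemp)
printf '%s\n' 'a' 'bbb' 'c' 'd' > "$demo"
two_cols "$demo" 3 4
rm -f "$demo"
```

For the file in the question the call would be two_cols tmp1 7 4. Rows that have only a left-hand entry end in trailing padding.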

ksh: shell script to search for a string in all files present in a directory at a regular interval

I have a directory (output) in unix (SUN). Two types of files are created there, each with a timestamp prefix in the file name. These files are created at a regular interval of 10 minutes.
e. g:
1. 20140129_170343_fail.csv (some lines are there)
2. 20140129_170343_success.csv (some lines are there)
Now I have to search for a particular string in all the files present in the output directory and if the string is found in fail and success files, I have to count the number of lines present in those files and save the output to the cnt_succ and cnt_fail variables. If the string is not found I will search again in the same directory after a sleep timer of 20 seconds.
here is my code
#!/usr/bin/ksh
for i in 1 2
do
grep -l 0140127_123933_part_hg_log_status.csv /osp/local/var/log/tool2/final_logs/* >log_t.txt; ### log_t.txt will contain all the matching file list
while read line ### reading the log_t.txt
do
echo "$line has following count"
CNT=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT=`expr $CNT - 1`
echo $CNT
done <log_t.txt
if [ "${CNT:-0}" -gt 0 ]
then
exit
fi
echo "waiting"
sleep 20
done
The problem I'm facing is that I'm not able to tell the _success and _fail files apart in line and check their counts separately.
I'm not sure about ksh, but in bash a while ... do ... done loop that sits in a pipeline is notorious for running off with whatever variables you set inside it. ksh might be similar.
If I've understood your question right, SunOS has grep, uniq and sort AFAIK, so a possible alternative might be...
First of all:
$ cat fail.txt
W34523TERG
ADFLKJ
W34523TERG
WER
ASDTQ34T
DBVSER6
W34523TERG
ASDTQ34T
DBVSER6
$ cat success.txt
abcde
defgh
234523452
vxczvzxc
jkl
vxczvzxc
asdf
234523452
vxczvzxc
dlkjhgl
jkl
wer
234523452
vxczvzxc
And now:
egrep "W34523TERG|ASDTQ34T" fail.txt | sort | uniq -c
2 ASDTQ34T
3 W34523TERG
egrep "234523452|vxczvzxc|jkl" success.txt | sort | uniq -c
3 234523452
2 jkl
4 vxczvzxc
Depending on the input data, you may want to see what options sort has on your system. Examining uniq's options may prove useful too (it can do more than just count duplicates).
Think you want something like this (will work in both bash and ksh)
#!/bin/ksh
while read -r file; do
lines=$(wc -l < "$file")
((sum+=$lines))
done < <(grep -Rl --include="[12]*_fail.csv" "somestring")
echo "$sum"
Note this will match files starting with 1 or 2 and ending in _fail.csv, not exactly clear if that's what you want or not.
e.g. Let's say I have two files, one starting with 1 (containing 4 lines) and one starting with 2 (containing 3 lines), both ending in _fail.csv, somewhere under my current working directory
> abovescript
7
Important to understand grep options here
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all
symbolic links, unlike -r.
and
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by
POSIX.)
Finally I was able to find the solution. Here is the complete code:
#!/usr/bin/ksh
file_name="0140127_123933.csv"
for i in 1 2
do
grep -l $file_name /osp/local/var/log/tool2/final_logs/* >log_t.txt;
while read line
do
if [ $(echo "$line" |awk '/success/') ] ## will check the success file
then
CNT_SUCC=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_SUCC=`expr $CNT_SUCC - 1`
fi
if [ $(echo "$line" |awk '/fail/') ] ## will check the fail file
then
CNT_FAIL=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_FAIL=`expr $CNT_FAIL - 1`
fi
done <log_t.txt
if [ "${CNT_SUCC:-0}" -gt 0 ] && [ "${CNT_FAIL:-0}" -gt 0 ]
then
echo " Fail count = $CNT_FAIL"
echo " Success count = $CNT_SUCC"
exit
fi
echo "waiting for next search..."
sleep 10
done
Thanks everyone for your help.
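For comparison, the per-file classification can also be done with a case statement on the file name, avoiding the temporary list file and the extra awk processes. The sketch below is only an illustration: count_matching, the sample directory and the search string key are all made up, and unlike the script above it does not subtract a header line.

```bash
#!/usr/bin/env bash
# count_matching DIR STRING (hypothetical helper)
# Sets cnt_succ and cnt_fail to the total number of lines in the
# *_success.csv and *_fail.csv files under DIR that contain STRING.
count_matching() {
    dir=$1
    needle=$2
    cnt_succ=0
    cnt_fail=0
    for file in "$dir"/*.csv; do
        [ -f "$file" ] || continue                 # no match for the glob
        grep -q -- "$needle" "$file" || continue   # string not in this file
        n=$(wc -l < "$file")
        case $file in
            *_success.csv) cnt_succ=$(( cnt_succ + n )) ;;
            *_fail.csv)    cnt_fail=$(( cnt_fail + n )) ;;
        esac
    done
}

# Demo with made-up files in a temporary directory.
demo_dir=$(mktemp -d)
printf 'header\nkey\nrow\n' > "$demo_dir/20140129_170343_success.csv"
printf 'header\nkey\n'      > "$demo_dir/20140129_170343_fail.csv"
count_matching "$demo_dir" key
echo "Success count = $cnt_succ"   # prints: Success count = 3
echo "Fail count = $cnt_fail"      # prints: Fail count = 2
rm -rf "$demo_dir"
```

The loop reads no pipe, so both counters are available after the function returns and can be tested with -gt before deciding whether to sleep and retry.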
I don't think I'm getting it right, but can't you differentiate the files?
Maybe try:
#...
CNT=`expr $CNT - 1`
if [ $(echo $line | grep -o "fail") ]
then
#do something with fail count
else
#do something with success count
fi

Resources