Shell: why does the while loop process an additional char


This is my simple shell script
root@Ubuntu:/tmp# cat script.sh
echo
while x= read -n 1 char
do
    echo -e "Original value = $char"
    echo -e "Plus one = `expr $char + 1`\n"
done < number.txt
echo
root@Ubuntu:/tmp#
And this is the content of number.txt
root@Ubuntu:/tmp# cat number.txt
12345
root@Ubuntu:/tmp#
As you can see in the code, I'm trying to read each digit and process it separately. In this case, I would like to add one to each of them and print the result on a new line.
root@Ubuntu:/tmp# ./script.sh
Original value = 1
Plus one = 2
Original value = 2
Plus one = 3
Original value = 3
Plus one = 4
Original value = 4
Plus one = 5
Original value = 5
Plus one = 6
Original value =
Plus one = 1
root@Ubuntu:/tmp#
Everything looks fine except for the last two lines. I only have 5 digits, but the code seems to process one additional character.
Original value =
Plus one = 1
The question is: why does this happen, and how can I fix it?

It seems the input file number.txt contains a complete line, which is terminated by a line feed character (LF). (You can verify with ls -l that the file is longer than 5 bytes.) After the five digits, read encounters the LF and gives you an empty char (it strips the terminating LF from the input, just as it would without the -n option). The command then becomes expr + 1, which prints 1. You can explicitly test for the empty char and terminate the while loop using test -n, which checks for a non-zero-length string:
echo "12345" | while read -n 1 char && [ -n "$char" ]; do echo "$char" ; done
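Applied to the loop from the question, the same guard might look like the sketch below. The function name plus_one_each is made up for illustration, the input is piped in with printf instead of being read from number.txt, and $(( )) is swapped in for expr; none of that comes from the original answer.

```shell
#!/bin/bash
# plus_one_each is a hypothetical helper name, not from the original post.
plus_one_each() {
    local char
    # Stop as soon as read hands back an empty string (the trailing LF).
    while IFS= read -r -n 1 char && [ -n "$char" ]; do
        echo "Original value = $char"
        echo "Plus one = $((char + 1))"
    done
}

# Same content as number.txt, including the terminating line feed.
printf '12345\n' | plus_one_each
```

With this guard the loop body runs five times, once per digit, and the empty sixth read ends the loop instead of producing "Plus one = 1".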

Related

Continuously-updated (running-count) output from a program reading from a pipeline

How can I get continuously-updated output from a program that's reading from a pipeline? For example, let's say that this program were a version of wc:
$ ls | running_wc
So I'd like this to output instantly, e.g.
0 0 0
and then every time a new output line is received, it'd update again, e.g.
1 2 12
2 4 24
etc.
Of course my command isn't really ls, it's a process that slowly outputs data. I'd actually love to have it dynamically count matches and non-matches, and sum this info up on a single line, e.g.,
$ my_process | count_matches error
This would constantly update a single line of output with the matching and non matching counts, e.g.
$ my_process | count_matches error
0 5
then later on it might look like so, since it's found 2 matches and 10 non matching lines.
$ my_process | count_matches error
2 10

dd will print out statistics if it receives a SIGUSR1 signal, but neither wc nor grep does that. You'll need to re-implement them, more or less.
count_matches() {
    local pattern=$1
    local matches=0 nonmatches=0
    local line
    while IFS= read -r line; do
        if [[ $line == *$pattern* ]]; then ((++matches)); else ((++nonmatches)); fi
        printf '\r%s %s' "$matches" "$nonmatches"
    done
    printf '\n'
}
Printing a carriage return \r each time causes the printouts to overwrite each other.
Most programs will switch from line buffering to full buffering when used in a pipeline. Your slow-running program should flush its output after each line to ensure the results are available immediately. Or if you can't modify it, you can often use stdbuf -oL to force programs that use C stdio to line buffer stdout.
stdbuf -oL my_process | count_matches error
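The question also asks for the zero counts to appear immediately, before any input arrives. A possible variant is sketched below; the name count_matches_live and the simulated producer loop are assumptions for illustration, and the pattern is quoted here so it is matched literally instead of as a glob.

```shell
#!/bin/bash
# count_matches_live is a hypothetical name for this variant.
count_matches_live() {
    local pattern=$1
    local matches=0 nonmatches=0 line
    # Show the zero counts before the first line arrives.
    printf '\r%s %s' "$matches" "$nonmatches"
    while IFS= read -r line; do
        if [[ $line == *"$pattern"* ]]; then ((++matches)); else ((++nonmatches)); fi
        printf '\r%s %s' "$matches" "$nonmatches"
    done
    printf '\n'
}

# Simulated slow producer standing in for my_process.
for word in ok error ok error ok; do
    echo "$word"
    sleep 0.2
done | count_matches_live error
```

On a terminal the single status line should end at 2 3 for this input (two matching lines, three non-matching).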

Using awk. First we create the "my_process":
$ for i in {1..10} ; do echo $i ; sleep 1 ; done # slowly prints lines
The match counter:
$ awk 'BEGIN {
print "match","miss" # print header
m=0 # reset match count
}
{
if($1~/(3|6)/) # match is a 3 or 6 (for this output)
m++ # increment match count
print m,NR-m # for each record output match / miss counts
}'
Running it:
$ for i in {1..10} ; do echo $i ; sleep 1 ; done | awk 'BEGIN{print "match","miss";m=0}{if($1~/(3|6)/)m++;print m,NR-m}'
match miss
0 1
0 2
1 2
1 3
1 4
2 4
2 5
2 6
2 7
2 8

How to nest loops correctly

I have 2 scripts, #1 and #2. Each works OK by itself. I want to read a 15-row file, row by row, and process it. Script #2 selects rows. Row 0 is indicated as firstline=0, lastline=1. Row 14 would be firstline=14, lastline=15. I see good results from echo. I want to do the same with script #1, but I can't get my head around nesting the two correctly. Code below.
#!/bin/bash
# script 1
filename=slash
firstline=0
lastline=1
i=0
exec <${filename}
while read ; do
    i=$(( $i + 1 ))
    if [ "$i" -ge "${firstline}" ] ; then
        if [ "$i" -gt "${lastline}" ] ; then
            break
        else
            echo "${REPLY}" > slash1
            fold -w 21 -s slash1 > news1
            sleep 5
        fi
    fi
done
# script2
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
for ((i=0;i<${#firstline[@]};i++))
do
    echo ${firstline[$i]} ${lastline[$i]};
done

Your question is very unclear, but perhaps you are simply looking for some simple function calls:
#!/bin/bash
script_1() {
    filename=slash
    firstline=$1
    lastline=$2
    i=0
    exec <${filename}
    while read ; do
        i=$(( $i + 1 ))
        if [ "$i" -ge "${firstline}" ] ; then
            if [ "$i" -gt "${lastline}" ] ; then
                break
            else
                echo "${REPLY}" > slash1
                fold -w 21 -s slash1 > news1
                sleep 5
            fi
        fi
    done
}
# script2
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
for ((i=0;i<${#firstline[@]};i++))
do
    script_1 ${firstline[$i]} ${lastline[$i]};
done
Note that reading the file this way is extremely inefficient, and there are undoubtedly better ways to handle this, but I am trying to minimize the changes from your code.

Update: Based on your later comments, the following idiomatic Bash code that uses sed to extract the line of interest in each iteration solves your problem much more simply:
Note:
- If the input file does not change between loop iterations, and the input file is small enough (as it is in the case at hand), it's more efficient to buffer the file contents in a variable up front, as is demonstrated in the original answer below.
- As tripleee points out in a comment: If simply reading the input lines sequentially is sufficient (as opposed to extracting lines by specific line numbers), then a single, simple while read -r line; do ... # fold and output, then sleep ... done < "$filename" is enough.
# Determine the input filename.
filename='slash'
# Count its number of lines.
lineCount=$(wc -l < "$filename")
# Loop over the line numbers of the file.
for (( lineNum = 1; lineNum <= lineCount; ++lineNum )); do
    # Use `sed` to extract the line with the line number at hand,
    # reformat it, and output to the target file.
    fold -w 21 -s <(sed -n "$lineNum {p;q;}" "$filename") > 'news1'
    sleep 5
done
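The purely sequential alternative mentioned in the note above could be sketched as follows. The function name fold_lines_in_turn, its parameters, and the throwaway sample file are assumptions for illustration; the delay is made configurable only so the example finishes quickly.

```shell
#!/bin/bash
# fold_lines_in_turn and its parameters are hypothetical names for this sketch.
fold_lines_in_turn() {
    local filename=$1 target=$2 delay=${3:-5}
    local line
    while IFS= read -r line; do
        # Reformat the current line and overwrite the target file with it.
        printf '%s\n' "$line" | fold -w 21 -s > "$target"
        sleep "$delay"
    done < "$filename"
}

# Example run against a throwaway two-line file, with no pause between lines.
tmpdir=$(mktemp -d)
printf '%s\n' 'first line of the sample input file' 'second line' > "$tmpdir/slash"
fold_lines_in_turn "$tmpdir/slash" "$tmpdir/news1" 0
cat "$tmpdir/news1"
```

Because the target file is overwritten on every iteration, it holds only the folded form of the most recently read line once the loop ends.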
A simplified version of what I think you're trying to achieve:
#!/bin/bash
# Split fields by newlines on input,
# and separate array items by newlines on output.
IFS=$'\n'
# Read all input lines up front, into array ${lines[@]}
# In terms of your code, you'd use
# read -d '' -ra lines < "$filename"
read -d '' -ra lines <<<$'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\nline 8\nline 9\nline 10\nline 11\nline 12\nline 13\nline 14\nline 15'
# Define the arrays specifying the line ranges to select.
firstline=(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14)
lastline=(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
# Loop over the ranges and select a range of lines in each iteration.
for ((i=0; i<${#firstline[@]}; i++)); do
    extractedLines="${lines[*]: ${firstline[i]}: 1 + ${lastline[i]} - ${firstline[i]}}"
    # Process the extracted lines.
    # In terms of your code, the `> slash1` and `fold ...` commands would go here.
    echo "$extractedLines"
    echo '------'
done
Note:
- The name of the array variable filled with read -ra is lines. ${lines[@]} is Bash syntax for returning all array elements as separate words (${lines[*]} also refers to all elements, but with slightly different semantics); the comments use this syntax to make clear that lines is an array variable. If you referenced it as plain $lines, you would implicitly get only the element at index 0, the same as ${lines[0]}.
- <<<$'line 1\n...' uses a here-string (<<<) to read an ad-hoc sample document (expressed as an ANSI C-quoted string, $'...') so that the example code is self-contained. As stated in the comment, you'd read from $filename instead:
read -d '' -ra lines <"$filename"
- extractedLines="${lines[*]: ${firstline[i]}: 1 + ${lastline[i]} - ${firstline[i]}}" extracts the lines of interest. ${firstline[i]} references the current element (index i) of array ${firstline[@]}. Since the last token in Bash's array-slicing syntax (${lines[*]: <startIndex>: <elementCount>}) is the count of elements to return, we must calculate that count, which is what 1 + ${lastline[i]} - ${firstline[i]} does.
- By virtue of using "${lines[*]...}" rather than "${lines[@]...}", the extracted array elements are joined by the first character in $IFS, which in our case is a newline ($'\n'). When extracting a single line, that doesn't really matter.
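The difference between the two slicing forms can be seen in isolation with a small sketch; the array contents and the variable names joined and slice are placeholders chosen for the example.

```shell
#!/bin/bash
lines=(alpha beta gamma delta epsilon)
IFS=$'\n'

# Elements at indices 1..3, joined by the first character of $IFS (a newline).
joined="${lines[*]:1:3}"
printf '%s\n' "$joined"

# The same slice with [@] stays three separate words.
slice=("${lines[@]:1:3}")
printf 'count=%s\n' "${#slice[@]}"
```

The first printf prints beta, gamma and delta on separate lines from one single string; the second reports count=3 because the [@] form keeps the elements apart.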

How to sum a row of numbers from text file-- Bash Shell Scripting

I'm trying to write a bash script that calculates the average of numbers by rows and columns. An example of a text file that I'm reading in is:
1 2 3 4 5
4 6 7 8 0
There is an unknown number of rows and unknown number of columns. Currently, I'm just trying to sum each row with a while loop. The desired output is:
1 2 3 4 5 Sum = 15
4 6 7 8 0 Sum = 25
And so on and so forth with each row. Currently this is the code I have:
while read i
do
echo "num: $i"
(( sum=$sum+$i ))
echo "sum: $sum"
done < $2
To call the program it's stats -r test_file. "-r" indicates rows; I haven't started on columns yet. My current code actually just takes the first number of each column and adds them together, and then the rest of the numbers error out as a syntax error. It says the error comes from line 16, which is the (( sum=$sum+$i )) line, but I honestly can't figure out what the problem is. I should tell you I'm extremely new to bash scripting, and I have googled and searched high and low for the answer and can't find it. Any help is greatly appreciated.

You are reading the file line by line, so $i holds a whole row such as 1 2 3 4 5, and that is not a valid arithmetic expression. Try this:
while read -r i
do
    sum=0
    for num in $i
    do
        sum=$(($sum + $num))
    done
    echo "$i Sum = $sum"
done < "$2"
The inner for loop splits each line into its individual numbers. I hope this helps.
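A variation on the same idea lets read split each row into an array directly. This is only a sketch: the function name sum_rows is made up, it reads rows from standard input instead of from $2, and it assumes whitespace-separated integers without leading zeros.

```shell
#!/bin/bash
# sum_rows is a hypothetical helper; it reads rows of integers from stdin.
sum_rows() {
    local -a nums
    local n sum
    while read -r -a nums; do
        sum=0
        for n in "${nums[@]}"; do
            sum=$((sum + n))
        done
        echo "${nums[*]} Sum = $sum"
    done
}

printf '1 2 3 4 5\n4 6 7 8 0\n' | sum_rows
```

For the two sample rows this prints the desired lines, ending in Sum = 15 and Sum = 25. To run it on a file, redirect the file into the function, for example sum_rows < "$2".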

Another non bash way (con: OP asked for bash, pro: does not depend on bashisms, works with floats).
awk '{c=0;for(i=1;i<=NF;++i){c+=$i};print $0, "Sum:", c}'

Another way (not a pure bash):
while read line
do
sum=$(sed 's/[ ]\+/+/g' <<< "$line" | bc -q)
echo "$line Sum = $sum"
done < filename

Using the numsum -r util covers the row addition, but the output format needs a little glue, by inefficiently paste-ing a few utils:
paste "$2" \
<(yes "Sum =" | head -$(wc -l < "$2") ) \
<(numsum -r "$2")
Output:
1 2 3 4 5 Sum = 15
4 6 7 8 0 Sum = 25
Note -- to run the above line on a given file foo, first initialize $2 like so:
set -- "" foo
paste "$2" <(yes "Sum =" | head -$(wc -l < "$2") ) <(numsum -r "$2")

I want to delete a batch from file

I have a file whose contents look like this:
|T1234
010000000000
02123456878
05122345600000000000000
07445678920000000000000
09000000000123000000000
10000000000000000000000
.T1234
|T798
013457829
0298365799
05600002222222222222222
09348977722220000000000
10000057000004578933333
.T798
Here one complete batch starts with a |T line and ends with a .T line.
The file contains 2 batches.
I want to edit this file so that a batch is deleted when its record 10 (positions 1-2 are 10) has only zeros from position 3 through position 20.
Please let me know how I can achieve this with a shell script, syncsort, sed or awk.

I am still a little unclear about exactly what you want, but I think I have it enough to give you an outline on a bash solution. The part I was unclear on is exactly which line contained the first two characters of 10 and remaining 0's, but it looks like that is the last line in each batch. Not knowing exactly how you wanted the batch (with the matching 10) handled, I have simply written the remaining wanted batch(es) out to a file called newbatch.txt in the current working directory.
The basic outline of the script is to read each batch into a temporary array. If during the read, the 10 and 0's match is found, it sets a flag to delete the batch. After the last line is read, it checks the flag, if set simply outputs the batch number to delete. If the flag is not set, then it writes the batch to ./newbatch.txt.
Let me know if your requirements are different, but this should be fairly close to a solution. The code is fairly well commented. If you have questions, just drop a comment.
#!/bin/bash
ifn=${1:-dat/batch.txt} # input filename
ofn=./newbatch.txt # output filename
:>"$ofn" # truncate output filename
declare -i bln=0 # batch line number
declare -i delb=0 # delete batch flag
declare -a ba # temporary batch array
[ -r "$ifn" ] || { # test input file readable
printf "error: file not readable. usage: %s filename\n" "${0//*\//}"
exit 1
}
## read each line in input file
while read -r line || test -n "$line"; do
    printf " %d %s\n" $bln "$line"
    ba+=( "$line" ) # add line to array
    ## if chars 1-2 == 10 and chars 3-20 are all zeros
    if [[ ${line:0:2} == 10 && ${line:2:18} == 000000000000000000 ]]; then
        delb=1 # set delete flag
    fi
    ((bln++)) # increment line number
    ## if the line starts with '.'
    if [[ ${line:0:1} == '.' ]]; then
        ## if the delete batch flag is set
        if [ $delb -eq 1 ]; then
            ## do nothing (but show batch no. to delete)
            printf " => deleting batch : %s\n" "${ba[0]}"
        ## if delb not set, then write the batch to output file
        else
            printf "%s\n" "${ba[@]}" >> "$ofn"
        fi
        ## reset line no., flags, and unset array.
        bln=0
        delb=0
        unset ba
    fi
done <"$ifn"
exit 0
Output (to stdout)
$ bash batchdel.sh
0 |T1234
1 010000000000
2 02123456878
3 05122345600000000000000
4 07445678920000000000000
5 09000000000123000000000
6 10000000000000000000000
7 .T1234
=> deleting batch : |T1234
0 |T798
1 013457829
2 0298365799
3 05600002222222222222222
4 09348977722220000000000
5 10000057000004578933333
6 .T798
Output (to newbatch.txt)
$ cat newbatch.txt
|T798
013457829
0298365799
05600002222222222222222
09348977722220000000000
10000057000004578933333
.T798
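Since the question also allows awk, the same buffering idea can be sketched as a filter. This is an illustrative alternative, not part of the answer above: the function name filter_batches is made up, and it assumes every batch begins with a |T line and ends with a .T line as in the sample.

```shell
#!/bin/bash
# filter_batches is a hypothetical name; it reads the batch file given as
# arguments (or stdin) and prints only the batches that should be kept.
filter_batches() {
    awk '
        /^[|]T/ { batch = ""; drop = 0 }     # a new batch starts: reset state
        { batch = batch $0 "\n" }            # buffer every line of the batch
        # Record 10 with positions 3-20 all zero marks the batch for deletion.
        /^10/ && substr($0, 3, 18) == "000000000000000000" { drop = 1 }
        /^[.]T/ { if (!drop) printf "%s", batch; batch = "" }
    ' "$@"
}

# Two shortened batches using record-10 lines from the sample data.
filter_batches <<'EOF'
|T1234
010000000000
10000000000000000000000
.T1234
|T798
013457829
10000057000004578933333
.T798
EOF
```

Only the |T798 batch is printed here, because its record 10 has non-zero digits in positions 3-20. Against a real file, something like filter_batches batch.txt > newbatch.txt would write the kept batches to a new file.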

Bash - Different Value of parameter Inside If condition

input from file $2 : 1 -> 2
while read -a line; do
if (( ${line[2]} > linesNumber )); then
echo "Graph does not match known sites4"
exit
fi
done < "$2"
For some reason, inside the if condition the value of ${line[2]} is not 2,
but if I print the value outside the if:
echo `${line[2]}`
2

What's linesNumber? Even if you put $linesNumber, where is it coming from?
If you are tracking the line number, you need to set it and increment it. Here's my sample program and data. It's inspired by your example, but doesn't do exactly what you want. However, it shows you how to set up a variable that tracks the line number, how to increment it, and how to use it in an if statement:
foo.txt:
this 1
that 2
foo 4
barf 4
flux 5
The Program:
lineNum=0
while read -a line
do
((lineNum++))
if (( ${line[1]} > $lineNum ))
then
echo "Line Number Too High!"
fi
echo "Verb = ${line[0]} Number = ${line[1]}"
done < foo.txt
Output:
Verb = this Number = 1
Verb = that Number = 2
Line Number Too High!
Verb = foo Number = 4
Verb = barf Number = 4
Verb = flux Number = 5
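One possible reason the comparison in the question misbehaves is that linesNumber is never set: inside (( )), an unset variable evaluates to 0. The question does not show where linesNumber comes from, so this is only a sketch of that one scenario, and check_edge is a made-up helper name.

```shell
#!/bin/bash
# check_edge is a hypothetical helper; linesNumber is the name used in the question.
check_edge() {
    local -a line
    # Split the input into words: "1 -> 2" becomes (1 -> 2), so index 2 is "2".
    read -r -a line <<< "$1"
    if (( ${line[2]} > linesNumber )); then
        echo "Graph does not match known sites"
    else
        echo "ok"
    fi
}

unset linesNumber
check_edge '1 -> 2'    # unset counts as 0 inside (( )), so 2 > 0 is true

linesNumber=5
check_edge '1 -> 2'    # 2 > 5 is false
```

The first call prints the error message even though ${line[2]} is 2, and the second prints ok, so checking whether linesNumber holds the expected value just before the loop is a reasonable first step.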
