Error in bash script while reading a file

The following is a script I wrote to run an executable ./runnable on the argument/input file input.
It takes standard input from another file called final_file and writes its output to a file called outfile. There are 91 lines in final_file (i.e., 91 different space-delimited standard inputs), so the bash script should call ./runnable input 91 times.
But it calls it only once, and I am not sure why. Any suggestions on what's going wrong?
#!/bin/bash
OUTFILE=outfile
(
a=0
while read line
do
    ./runnable input
    echo "This is line number: $a"
    a='expr $a+ 1'
done <final_file
) >$OUTFILE
To clarify, the final_file looks like
__DATA__
2,9,2,9,10,0,38
2,9,2,10,11,0,0
2,9,2,11,12,0,0
2,9,2,12,13,0,0
2,9,2,13,0,1,4
2,9,2,13,3,2,2
and so on. One line at a time is given as standard input. The number of lines in final_file corresponds to the number of times standard input is supplied. So in the case above, the script should run six times, as there are six data lines.

I'll hazard that ./runnable reads all the way through stdin. With no input left to read, the while loop ends after one iteration.
Reasoning: your example Works For Me (TM), substituting a file I happen to have (/etc/services) for final_file and commenting out the line that invokes ./runnable.
On the other hand, if I replace the ./runnable invocation with a command that simply reads and discards standard input (e.g., cat - > /dev/null or perl -ne 1), I get the behavior you describe.
(Note that you want backticks or $() around the call to expr.)
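If ./runnable is indeed meant to consume one line per invocation, a minimal sketch of a fix is to hand each invocation exactly one line on its stdin (with the expr call corrected as noted above):
#!/bin/bash
OUTFILE=outfile
(
a=0
while read line
do
    # give ./runnable only the current line, so it cannot swallow the rest of final_file
    printf '%s\n' "$line" | ./runnable input
    echo "This is line number: $a"
    a=$(expr $a + 1)
done <final_file
) >$OUTFILE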

Run your shell script with the -x option for some debugging output.
Add echo $line after your while read line; do.
Note that while read line; do echo $line; done reads line-separated input, not space-separated input.
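For example, assuming your script is saved as script.sh (a placeholder name):
$ bash -x script.sh
Equivalently, add set -x near the top of the script to turn on the same trace output.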

Related

During a while loop file read, where is the first line of stdin lost?

Assume we have a file with the numbers 1 to 5 written down line by line.
When I open a file for reading as standard input and use while read, commands inside the loop that can read stdin are unable to read the first line of that file.
$ while read x; do sed ''; done<file
2
3
4
5
It makes no difference which command you use: sed, awk, cat, etc. The problem occurs with any command that is able to read from stdin. There is also no difference between shells: I tried the same thing in sh, bash, and zsh, and the results are identical.
It's worth noting that the loop iterates five times, once for each line. For example:
$ while read x; do printf 'something\n'; done<file
something
something
something
something
something
I understand that if I want to read all lines correctly, I must specify a variable in the read command and then pass it to the command. But I'm trying to figure out what's going on here. Why does this problem occur when I do not specify input for a command directly?
Perhaps it is a side effect with no functional purpose.
I couldn't find any information about this behavior of the 'while read' statement, and neither did I find anyone who had a similar problem.
Your code only iterates once.
while read x; do sed ''; done<file
...behaves as follows:
file is opened and attached to stdin
read consumes the first line of the file from stdin and puts it into $x
sed '' consumes the entire rest of the file from stdin and prints it to stdout without changes.
read sees there's no more data (because sed consumed it all), and the loop ends.
If you want sed to operate on only the one line that read x consumed, and to safeguard against other bugs, you might instead write:
while IFS= read -r x; do printf '%s\n' "$x" | sed ''; done <file
The changes:
Using IFS= prevents leading or trailing whitespace from being deleted by read.
Using the -r argument prevents backslashes from being consumed by read.
Piping from printf '%s\n' "$x" into sed changes sed's stdin, such that instead of containing the rest of the file, it only contains the one line. Thus, this ensures that sed is processing the line that was consumed by read, instead of ignoring that line and processing the entire rest of the file. (Using printf instead of echo is a correctness concern; see Why is printf better than echo? on UNIX & Linux Stack Exchange).
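In bash, a here-string is an equivalent, slightly shorter way to give sed just the one line (this is a bash extension, not POSIX sh):
while IFS= read -r x; do sed '' <<<"$x"; done <file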
The first line of stdin is not lost; it is consumed by read, and sed then consumes everything that is left. If you want the commands inside the loop to keep their own stdin rather than share the redirected file, you can have read take its input from a separate file descriptor:
$ while read x <&3; do sed ''; done 3<file
Here read pulls lines from file via descriptor 3, and sed's stdin is left untouched. Note that this alone does not make sed process the line that read consumed; for that, use the pipe shown above.

understanding how input was redirected into while construct from a file

I came across the syntax for a "while read" loop in a bash script:
$> while read line; do echo $line; done < f1 # f1 is a file in my current directory
This will print the file line by line.
My search for "while read" in the GNU bash manual (https://www.gnu.org/software/bash/manual/) came up short, and while other tutorial sites give some usage examples, I would still like to understand the full syntax options for this construct.
Can it be used with "for" loops as well? Something like
for line in read; do echo $line; done < f1
The syntax for a while loop is
while list-1; do list-2; done
where list-1 is one or more commands (usually just one). The loop continues while list-1 is successful (returns zero); list-2 is the "body" of the loop.
The syntax of a for loop is different:
for name in word; do list ; done
where word is usually a list of strings, not a command (although it can be hacked to use a command which returns word).
The purpose of a for loop is to iterate through word, the purpose of while is to loop while a command is successful. They are used for different tasks.
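For example, the "hack" mentioned above looks like the following, though it is usually discouraged: the unquoted command substitution splits on whitespace (not on lines) and is subject to globbing.
for word in $(cat f1); do echo "$word"; done
This is exactly why while read, not for, is the idiomatic construct for reading a file line by line.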
Redirection changes a file descriptor to refer to another file or file descriptor.
< changes file descriptor 0 (zero), also known as stdin
> changes file descriptor 1 (one), also known as stdout
So somecommand < foo changes stdin to read from foo rather than the terminal keyboard.
somecommand > foo changes stdout to write to foo rather than the terminal screen (if foo exists it will be overwritten).
In your case somecommand is the while loop, but it can be any other command. Note that not all commands read from stdin, yet the command syntax with < is still valid.
A common mistake is:
# WRONG!
while read < somefile
do
....
done
In that case somecommand is read, and the effect is that it reads the first line of somefile, proceeds with the body of the loop, comes back, and then reads the first line of the file again. It loops forever, reading just the first line, since while has no knowledge of or interest in what read is doing, only its return value of success or failure. (read puts the line in the variable REPLY if you don't specify one.)
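The correct form redirects the loop as a whole, placing the redirection after done, as in the earlier examples:
while read line
do
    ....
done < somefile
Here the file is opened once for the entire loop, and each call to read consumes the next line from it.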
Redirection examples ($ indicates a prompt):
$ cat > file1
111111
<CTRL+D>
$ cat > file2
222222
<CTRL+D>
cat reads from stdin if we don't specify a filename, so it reads from the keyboard. Instead of writing to the screen we redirect to a file. The <CTRL+D> indicates End-Of-File sent from the keyboard.
This redirects stdin to read from a file:
$ cat < file1
111111
Can you explain this?
$ cat < file1 file2
222222
(Here cat was given a file operand, file2, so it ignores the redirected stdin and reads the named file instead; only file2's contents are shown.)

Writing a Unix filter using bash

A Unix/Linux command that accepts its input data from standard input and produces its output (result) on standard output is known as a filter.
The trivial filter is cat. It just copies stdin to stdout without any modification whatsoever.
How do I implement cat in bash? (neglecting the case that the command gets command line arguments)
I came up with
#! /bin/bash
while IFS="" read -r line
do
    echo -E "$line"
done
That seems to work in most cases, also for text files containing some binary bytes as long as they are not null bytes. However, if the last line does not end in a newline character, it will be missing from the output.
How can that be fixed?
I'm nearly sure this must have been answered before, but my searching skills don't seem to be good enough.
Obviously I don't want to re-implement cat in bash: It wouldn't work anyway because of the null byte problem. But I want to extend the basic loop to do some custom processing to certain lines of a text file. However, we have all seen text files without the final line feed, so I'd prefer if that case could be handled.
Assuming you don't need to work with arbitrary binary files (since the shell cannot store null bytes in a variable), you can handle a file that isn't terminated by a newline by checking if line is not empty once the loop exits.
while IFS= read -r line; do
    printf '%s\n' "$line"
done
if [ -n "$line" ]; then
    printf '%s' "$line"
fi
In the loop, we output the newline that read stripped off. In the final if, we don't output a newline because $line would be empty if the last line read by the while loop had ended with one.
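As a quick check of the edge case (assuming the snippet above is saved as filter.sh, a name chosen here for illustration):
$ printf 'a\nb' | bash filter.sh | od -c
0000000   a  \n   b
0000003
The final b, which has no trailing newline in the input, survives in the output, also without a trailing newline.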

Passing values of one file into another file as inputs

I have a file that contains the following information, tab-separated:
abscdfr 2 5678
bgbhjgy 7 8756
ptxfgst 5 6783
Let's call this file A; it contains 2000 lines.
I have another file B, written in Ruby,
that takes these values as command-line input:
f_id = ARGV[0]
lane = ARGV[1].to_i
sample_id = ARGV[2].to_i
puts " #{f_id}_#{lane}_#{sample_id}.bw"
I execute file B by providing the information from file A:
./fileB.rb abscdfr 2 5678
I want to know how I can pass the values of file A as input to file B, one line after another.
If it were one value it would be easy, but I am confused by the three values.
Kindly help me write a wrapper around these two files, either in bash or Ruby.
Thank you
The following command will do the job in bash:
while read line; do ./fileB.rb $line; done < fileA
This reads each line into the variable line. Then it runs ./fileB.rb $line for each line. $line is expanded before the command line is evaluated, so each word in the line is passed as its own argument; it is important that there is no quoting like "$line" here, or the whole line would be passed as a single argument. read reads from STDIN and would usually wait for user input, but with < fileA the contents of fileA are redirected to STDIN, so read takes its input from there.
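Since each line of fileA has exactly three tab-separated fields, a slightly more robust variant (a sketch, not something the question requires) reads them into three named variables and quotes them, which protects against glob characters in the data:
while read -r f_id lane sample_id; do
    ./fileB.rb "$f_id" "$lane" "$sample_id"
done < fileA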
You could use a little bash script to loop through each line in the file and output the contents as arguments to another script.
while read line; do
    eval "./fileB.rb $line"
done < fileA
This will evaluate the line in the quotes as if you had typed it into the shell yourself. (Be aware that eval also expands any shell metacharacters in the line, which can be unwanted or even dangerous with untrusted data.)
You can also use a Ruby one-liner:
ruby -ne 'system( "./fileB.rb #{$_}" )' < fileA
Explanation:
-e allows us to specify the script on the command line.
-n tells Ruby to read the input (or input file) line by line, wrapping the script in an implicit loop, somewhat like sed -n or awk.
$_ Ruby stores the current input line in the $_ variable by default.

WHILE loop - read line of a file one by one -- Not working the No. of times the file has lines in it

I'm using a "while" loop within a shell script (BASH) to read line of a file (one by one) -- "Fortunately", its not working the No. of times the file has lines in it.
Here's the summary:
$ cat inputfile.txt
1
2
3
4
5
Now, the shell script content is pretty simple as shown below:
#!/bin/bash
while read line
do
    echo $line ----------;
done < inputfile.txt;
The above script code works just fine..... :). It shows all the 5 lines from inputfile.txt.
Now, I have another script whose code is like:
#!/bin/bash
while read line
do
    echo $line ----------;
    somevariable="$(ssh sshuser@sshserver "hostname")";
    echo $somevariable;
done < inputfile.txt;
Now, in this script, the while loop shows only the line "1 ----------" and exits the loop after showing a valid value for "$somevariable".
Any idea what I'm missing here? I didn't try opening the file on some descriptor N (N< inputfile.txt with done <&N, i.e., changing the input redirection to go through descriptor N),
but I'm curious why this simple script does not run N times once I add a simple variable assignment that performs an ssh operation in a subshell.
Thanks.
You might want to add the -n option to the ssh command. This prevents it from "swallowing" your inputfile.txt as its standard input.
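For example:
somevariable="$(ssh -n sshuser@sshserver "hostname")";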
Alternatively, you might just redirect ssh's stdin from /dev/null, e.g.:
somevariable="$(ssh sshuser#sshserver "hostname" </dev/null)";
