File input getting consumed inside the while loop - bash

I'm reading through a lookup file and performing a set of actions for each line in the file. However the while loop only reads the first line in the file and exits. Here's the current code that I have.
sql_from_lkp $lookup
function sql_from_lkp {
lkp=$1
while read line; do
sql_from_columns ${line}
echo ${line}
done < ${lkp}
}
function sql_from_columns {
table_name=$1
table_column_info_file=${table_name}_columns
line_count=`cat $table_info_file | wc -l`
....
}
By selectively commenting the code, I found that if I comment the line_count line, the while loop goes through every line in the file and works fine. So the input is getting consumed by the cat statement.
I've checked other answers and understood that ssh usually consumes the file inputs inside while loops if -n option is not used. But not sure how to fix this case. Need some help.

You've mistyped a variable name: $table_info_file should be $table_column_info_file.
If you correct that, your problem will go away.
By referring to a non-extant variable - the mistyped $table_info_file - you're essentially executing cat | wc -l (no filename argument passed to cat) in sql_from_columns(), which makes cat read from stdin.
Therefore - after having read the 1st line in the while loop - the cat command in sql_from_columns() consumes the entire rest of your input (< ${lkp}), which is why the while loop exits after the 1st iteration.
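A minimal sketch of the effect, assuming a throwaway three-line file (the function names and the temp file are made up for illustration). Any command in the loop body that reads stdin shares the loop's redirected input, so it can swallow the remaining lines:

```bash
#!/usr/bin/env bash
# Hypothetical demo: a stdin-reading command inside the loop body eats the loop's input.

count_iterations_buggy() {
  local n=0 line
  while read -r line; do
    n=$((n + 1))
    cat > /dev/null              # no filename: cat reads the rest of "$1" via stdin
  done < "$1"
  echo "$n"
}

count_iterations_fixed() {
  local n=0 line
  while read -r line; do
    n=$((n + 1))
    cat < /dev/null > /dev/null  # stdin redirected away from the loop's input
  done < "$1"
  echo "$n"
}

demo_file=$(mktemp)
printf '%s\n' table_a table_b table_c > "$demo_file"

buggy=$(count_iterations_buggy "$demo_file")
fixed=$(count_iterations_fixed "$demo_file")
echo "buggy=$buggy fixed=$fixed"   # prints "buggy=1 fixed=3"

rm -f "$demo_file"
```

Redirecting the inner command's stdin (here from /dev/null) is a general safeguard when a body command might read stdin; in the question's case, correcting the variable name is enough.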
Generally,
You should double-quote all your variable references so as not to subject their values to word-splitting and globbing.
Bash won't allow you to call functions before they're defined, so as presented in your question, your code fundamentally couldn't work.
While the legacy `...` syntax for command substitutions is still supported, it has pitfalls that can be avoided with the modern $(...) syntax.
A more efficient way to count lines is to pass the input file to wc -l via < rather than via cat and a pipeline (wc also accepts filename operands directly, but it then prints the input filename after the counts).
Incidentally, you probably would have caught your mistyped variable reference more easily had you done that, as Bash would have reported an ambiguous redirect error in the absence of a filename following <.
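The difference between the two wc forms, and the error mentioned above, can be sketched as follows. The file and variable names are assumptions made for the demo, not part of the original script:

```bash
#!/usr/bin/env bash
# Hypothetical demo file; names are illustrative only.
columns_file=$(mktemp)
printf '%s\n' id name created_at > "$columns_file"

via_redirect=$(wc -l < "$columns_file")   # count only
via_operand=$(wc -l "$columns_file")      # count followed by the file name

# Arithmetic expansion strips the padding some wc implementations add.
line_count=$((via_redirect))
echo "line_count=$line_count"             # prints "line_count=3"
echo "operand form: $via_operand"

# An unset, unquoted variable after < makes Bash refuse the redirection
# instead of silently falling back to stdin.
unset -v no_such_var
err=$( { wc -l < $no_such_var; } 2>&1 ) || true
echo "error text: $err"

rm -f "$columns_file"
```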
Here's a reformulation that addresses all the issues:
function sql_from_lkp {
lkp=$1
while read line; do
sql_from_columns "${line}"
echo "${line}"
done < "${lkp}"
}
function sql_from_columns {
table_name=$1
table_column_info_file=${table_name}_columns
line_count=$(wc -l < "$table_column_info_file")
# ...
}
sql_from_lkp "$lookup"
Note that I've only added double quotes where strictly needed to make the command robust; it wouldn't hurt to add them whenever a parameter (variable) is referenced.

Related

how does a while read loop work in bash?

This is a crawler from GitHub that I want to run myself, but I'm a novice and can't read the Bash. Can someone explain it in an answer?
#!/bin/bash
# Create an array files that contains list of filenames
files=($(< url.txt))
cities=($(< city.txt))
url="http://www.grotal.com/"
citycodes=($(<citycode.txt))
# Read through the url.txt file and execute wget command for every filename
while IFS='=| ' read -r param uri; do
for file in "${files[@]}"; do
for city in "${cities[@]}"; do
mkdir "${city}"
mkdir "${city}/${file}"
wget -O "${city}/${file}/${file}${citycodes[@]}" "${uri}${url}${city}/${file}-${citycodes[@]}/"
done
done
done < url.txt
Specifically, these lines (even if you choose to downvote...):
while IFS='=| ' read -r param uri;
and then this:
done < url.txt
Let's break this down into pieces:
read, unless given a non-default -d argument to specify a terminator to use in place of the newline, reads a single line from stdin (that is, reads up to the next newline); splits that line on IFS characters, and writes each field into a different variable. If it stops being able to read more data before reaching a newline, then it emits a nonzero exit status, even if it successfully populated the variables given. (The -r argument prevents read from treating backslashes as continuation characters rather than literals; unless you have a specific reason to have continuation characters available in the context at hand, you should make a habit of passing -r to read by default).
< url.txt redirects a read handle on url.txt into stdin for the command (including a compound command such as a while loop) to which it's appended.
A while loop runs the conditional command it's given, checks whether that conditional reports success or failure, and then proceeds to run the body and restart on success, or exit on failure.
Thus, if you have IFS='=| ' read -r param uri, it will read a single line from stdin; assign everything up to the first =, | or space to the variable named param, and assign what's left to the variable uri.
If you put that in the conditional part of a while loop, then the loop will operate until that read fails -- as it will if there isn't more content (up to and including a newline character) available to be read.
For more in-depth discussion of the idiom and its uses, see BashFAQ #1.
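The splitting behavior described above can be sketched with made-up input lines (the real url.txt is not shown in the question, and parse_pairs is a hypothetical helper):

```bash
#!/usr/bin/env bash
# Hypothetical input; the real url.txt from the question is not shown.
parse_pairs() {
  local param uri
  # IFS applies only to this read; the last variable receives the remainder.
  while IFS='=| ' read -r param uri; do
    printf 'param=[%s] uri=[%s]\n' "$param" "$uri"
  done
}

parsed=$(printf '%s\n' 'site=http://example.com/a' 'mirror|http://example.org/b c' | parse_pairs)
echo "$parsed"
# prints:
# param=[site] uri=[http://example.com/a]
# param=[mirror] uri=[http://example.org/b c]

# Without a trailing newline, read fills the variables but reports failure,
# so the loop body never runs for that final fragment.
no_newline=$(printf 'k=v' | parse_pairs)
echo "no_newline=[$no_newline]"   # prints "no_newline=[]"
```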
Some asides:
Using mkdir -p -- "${city}/${file}" will let you have only a single mkdir command that creates both directories (and avoids generating error messages if they already exist).
Using readarray -t files < url.txt is a more robust way to read the contents of url.txt into an array named files, though it requires bash 4.0 or newer. For older versions of the shell, consider IFS=$'\n' read -r -d '' -a files <url.txt || (( ${#files[@]} )). These will behave far better than the original idiom if you have wildcards, whitespace, or other unexpected content in your input files.
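As a rough sketch of both asides, assuming bash 4.0 or newer and an invented city list (the directory names are placeholders, not the crawler's real data):

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ for readarray. File contents and names are made up.
workdir=$(mktemp -d)
printf '%s\n' 'new york' 'paris' > "$workdir/city.txt"

readarray -t cities < "$workdir/city.txt"   # one element per line, spaces intact
echo "cities read: ${#cities[@]}"           # prints "cities read: 2"

for city in "${cities[@]}"; do
  # -p creates parents as needed and is silent if the directory already exists
  mkdir -p -- "$workdir/$city/plumbers"
  mkdir -p -- "$workdir/$city/plumbers"     # repeating it is harmless
done

first_city=${cities[0]}
[ -d "$workdir/new york/plumbers" ] && made_dir=yes || made_dir=no
echo "first=$first_city made_dir=$made_dir"  # prints "first=new york made_dir=yes"

rm -rf "$workdir"
```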

Get tokens from a String until they are exhausted in shell script

I have a shell script that reads input strings from stdin and get only part of the value from the input. The input string can have any number of key/value pairs and is in the following format:
{"input0":"name:/data/name0.csv",
"input1":"name:/data/name1.csv",
....}
So in the above example, I want to get these as the output of my script:
/data/name0.csv
/data/name1.csv
.....
I think I need two while loops, one needs to keep reading from stdin, the other one needs to extract the values from the input until there is no more. Can someone let me know how to do the second loop block ?
if you have
{"input0":"name:/data/name0.csv",
"input1":"name:/data/name1.csv",
....}
inside a file abc.in, then you can parse your input with sed:
sed 's/.*"input[0-9]\+":"name:\(\/data\/name[0-9]\+\.csv\)".*$/\1/' abc.in
It checks each line against a regular expression of the form: beginning of line, anything, "input and a number":"name:/data/name and a number.csv", anything, end of line. On a match, the whole line is replaced with the captured path; lines that do not match are printed unchanged. (The \+ quantifier is GNU sed syntax.)
The result is:
/data/name1.csv
/data/name2.csv
/data/name3.csv
/data/name4.csv
...
A simple BashFAQ #1 loop works here, with jq preprocessing your string into line-oriented content:
while read -r value; do
echo "${value#name:}"
done < <(jq -r '.[]')
That said, you can actually do the whole thing just in jq with no bash at all; the following transforms your given input directly to your desired output (given jq 1.5 or newer):
jq -r '.[] | sub("name:"; "")'
If you really want to do things the fragile way rather than leveraging a JSON parser, you can do that too:
# This is evil: Will fail very badly if input formatting changes
content_re='"name:(.*)"'
while read -r line; do
[[ $line =~ $content_re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
done
There's still no inner loop required -- just a single loop iterating over lines of input, with the body determining how to process each line.
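For readers without jq at hand, here is the single-loop regex variant run against the sample input from the question. It is the fragile approach, shown only to make the "one loop is enough" point concrete; extract_paths is a hypothetical wrapper name:

```bash
#!/usr/bin/env bash
# The regex mirrors the one above; the function name is hypothetical.
extract_paths() {
  local content_re='"name:(.*)"' line
  while read -r line; do
    if [[ $line =~ $content_re ]]; then
      printf '%s\n' "${BASH_REMATCH[1]}"
    fi
  done
}

paths=$(printf '%s\n' \
  '{"input0":"name:/data/name0.csv",' \
  '"input1":"name:/data/name1.csv"}' | extract_paths)
echo "$paths"
# prints:
# /data/name0.csv
# /data/name1.csv
```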

Numerically sorting strings from file

I would like to numerically sort the strings from a file without changing the file's content; the strings in the file must stay as they are after the sort. I want to use the lines for later editing, so my variable var should receive the values in order, from 0:wc up to 200:wc.
Input:
11:wc
1:wc
0:wc
200:wc
Desired order:
0:wc
1:wc
11:wc
200:wc
I'm using this code, but it has no effect:
sort -k1n $1 | while read line
do
if [[ ${line:0:1} != "#" ]]
then
var=$line
fi
done <$1
Why not just
$ sort -k1n -t: file.txt
specifying the field separator as ':'.
You need to sort numerically on the first key, and if you need them for later, just read them into an array:
myarray=( $(sort -k1n <file) )
which will provide an array with sorted contents:
0:wc
1:wc
11:wc
200:wc
Two issues:
When you create a pipe, such as command | while read line; do ... done, the individual commands in the pipe (command and while read line; do ... done) run in subshells.
The subshells are created with copies of all the current variables, but are not able to reflect changes back into their parent. In this case line is only present in the subshell, and when the subshell terminates, it disappears with it.
You can use bash process substitution to avoid creating a subshell for one of the pipeline commands. For example, you could use:
while read line; do ... done < <(command)
If you both pipe and redirect, the redirect wins.
So when you write: command | while read line; do ... done < input, the while loop actually reads from input, not from the output of command.
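A small sketch of the subshell issue, using the sample data from the question (variable names are illustrative, and mapfile assumes bash 4.0 or newer):

```bash
#!/usr/bin/env bash
# Sample data taken from the question; variable names are illustrative.
input=$(mktemp)
printf '%s\n' 11:wc 1:wc 0:wc 200:wc > "$input"

# Pipeline: the loop runs in a subshell, so the assignment is lost.
last_piped=unset
sort -t: -k1,1n "$input" | while read -r line; do last_piped=$line; done

# Process substitution: the loop runs in the current shell.
last_kept=unset
while read -r line; do last_kept=$line; done < <(sort -t: -k1,1n "$input")

# Or keep every sorted line in an array (bash 4.0+).
mapfile -t sorted < <(sort -t: -k1,1n "$input")

echo "piped=$last_piped kept=$last_kept first=${sorted[0]}"
# prints "piped=unset kept=200:wc first=0:wc"

rm -f "$input"
```

The input file itself is never modified; sort only writes the reordered lines to its output.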

Passing a filename into a script as an argument. No such file or directory

I'm relatively new to shell scripting and I've been stuck on this error for a couple days now. I'm trying to read in the contents of a file containing a list of strings and numbers, format it, and output the number of numbers below 50.
All the commands work when typed into the shell; however, in the script, when I try to pass the filename in as an argument, I keep getting a "No such file or directory" error.
Here is the function in question:
belowFifty(){
count=0
numbers=`cut -d : -f 3 < "$2"` #here is where the error occurs
for num in $numbers
do
if ((num<50));
then
count=$((count+1))
fi
done
echo $count
}
edit: sorry I forgot to mention the script does a couple things. $1 is the option, $2 is the file. I'm calling it like so:
./script.sh m filename
Try:
: "${2?2 arguments are required to function belowFifty}"
numbers=$( cut -d : -f 3 < "$2" )
I suspect the problem is that you are calling the function without specifying the 2nd argument. Within the function, $2 is the argument passed to the function, not the argument passed to the main script.
You specify "$2"; what's in the "$1" that's passed to the function and ignored? My strong suspicion is that you are trying to open the file with an empty string as the name, and there is no such file - hence the error message. The corollary is that you probably intended to reference "$1".
If so, you should probably write:
numbers=$(cut -d : -f 3 < "$1")
The back-tick notation should usually be avoided in favour of $(...).
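To illustrate the point about positional parameters, here is a sketch in which main stands in for the script body and the data file is invented. The function's $1 is whatever the caller passes, independent of the script's own $1 and $2:

```bash
#!/usr/bin/env bash
# Sketch with made-up data; field 3 of each colon-separated line is the number.
below_fifty() {
  local file=${1:?usage: below_fifty FILE} count=0 num
  while read -r num; do
    if (( num < 50 )); then
      count=$((count + 1))
    fi
  done < <(cut -d : -f 3 < "$file")
  echo "$count"
}

main() {
  # Here $1 is the option and $2 the file, as in "./script.sh m filename".
  # Inside below_fifty, the same file arrives as that function's own $1.
  below_fifty "$2"
}

data=$(mktemp)
printf '%s\n' 'ann:x:10' 'bob:y:75' 'cy:z:49' > "$data"

result=$(main m "$data")
echo "below fifty: $result"   # prints "below fifty: 2"

rm -f "$data"
```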

Handle special characters in bash for...in loop

Suppose I've got a list of files
file1
"file 1"
file2
a for...in loop breaks it up between whitespace, not newlines:
for x in $( ls ); do
echo $x
done
results:
file
1
file1
file2
I want to execute a command on each file. "file" and "1" above are not actual files. How can I do that if the filenames contains things like spaces or commas?
It's a little trickier than I think find -print0 | xargs -0 could handle, because I actually want the command to be something like "convert input/file1.jpg .... output/file1.jpg" so I need to permutate the filename in the process.
Actually, Mark's suggestion works fine without even doing anything to the internal field separator. The problem is that the output of ls, whether captured with backticks or $( ), undergoes word splitting, so the for loop cannot tell spaces inside a name from the separators between names. Simply using
for f in *
instead of the ls solves the problem.
#!/bin/bash
for f in *
do
echo "$f"
done
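Since the question also needs to derive an output name from each input name, the glob loop can be combined with parameter expansion. This is only a sketch: the directories are temporary placeholders, and cp stands in for the real conversion command:

```bash
#!/usr/bin/env bash
# Hypothetical directories; "cp" stands in for the real conversion command.
in_dir=$(mktemp -d)
out_dir=$(mktemp -d)
touch "$in_dir/file1.jpg" "$in_dir/file 1.jpg" "$in_dir/file2.jpg"

processed=0
for f in "$in_dir"/*.jpg; do
  [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
  name=${f##*/}                    # strip the directory part
  cp -- "$f" "$out_dir/$name"      # replace with the actual command as needed
  processed=$((processed + 1))
done

echo "processed=$processed"        # prints "processed=3"
[ -f "$out_dir/file 1.jpg" ] && spaced_ok=yes || spaced_ok=no

rm -rf "$in_dir" "$out_dir"
```

Because the glob is expanded by the shell itself and "$f" is quoted, names with spaces or commas stay intact as single words.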
UPDATE BY OP: this answer sucks and shouldn't be on top ... @Jordan's post below should be the accepted answer.
one possible way:
ls -1 | while read x; do
echo $x
done
I know this one is LONG past "answered", and with all due respect to eduffy, I came up with a better way and I thought I'd share it.
What's "wrong" with eduffy's answer isn't that it's wrong, but that it imposes what for me is a painful limitation: there's an implied creation of a subshell when the output of the ls is piped and this means that variables set inside the loop are lost after the loop exits. Thus, if you want to write some more sophisticated code, you have a pain in the buttocks to deal with.
My solution was to take the "readline" function and write a program out of it in which you can specify any specific line number that you may want that results from any given function call. ... As a simple example, starting with eduffy's:
ls_output=$(ls -1)
# The cut at the end of the following line removes any trailing new line character
declare -i line_count=$(echo "$ls_output" | wc -l | cut -d ' ' -f 1)
declare -i cur_line=1
while [ $cur_line -le $line_count ] ;
do
# NONE of the values in the variables inside this do loop are trapped here.
filename=$(echo "$ls_output" | readline -n $cur_line)
# Now filename contains a filename from the preceding ls command
cur_line=cur_line+1
done
Now you have wrapped up all the subshell activity into neat little contained packages and can go about your shell coding without having to worry about the scope of your variable values getting trapped in subshells.
I wrote my version of readline in gnuc if anyone wants a copy, it's a little big to post here, but maybe we can find a way...
Hope this helps,
RT
