How to read stdin when no arguments are passed? - bash

My script doesn't work when I want it to use standard input because no arguments (files) were passed. Is there a way to use stdin instead of a file in this code?
I tried this:
if [ ! -n $1 ] # check if argument exists
then
$1=$(</dev/stdin) # if not use stdin as an argument
fi
var="$1"
while read line
do
... # find the longest line
done <"$var"

For a general case of wanting to read a value from stdin when a parameter is missing, this will work.
$ echo param | script.sh
$ script.sh param
script.sh
#!/bin/bash
set -- "${1:-$(</dev/stdin)}" "${@:2}"
echo "$1"

Just substitute bash's specially interpreted /dev/stdin as the filename:
VAR=$1
while read blah; do
...
done < "${VAR:-/dev/stdin}"
(Note that bash will actually use that special file /dev/stdin if built for an OS that offers it, but since bash 2.04 will work around that file's absence on systems that do not support it.)

pilcrow's answer provides an elegant solution; this is an explanation of why the OP's approach didn't work.
The main problem with the OP's approach was the attempt to assign to positional parameter $1 with $1=..., which won't work.
The LHS is expanded by the shell to the value of $1, and the result is interpreted as the name of the variable to assign to - clearly, not the intent.
The only way to assign to $1 in bash is via the set builtin.
The caveat is that set invariably sets all positional parameters, so you have to include the other ones as well, if any.
set -- "${1:-/dev/stdin}" "${@:2}" # "${@:2}" expands to all remaining parameters
(If you expect only at most 1 argument, set -- "${1:-/dev/stdin}" will do.)
The above also corrects a secondary problem with the OP's approach: the attempt to store the contents rather than the filename of stdin in $1, since < is used.
${1:-/dev/stdin} is an application of bash parameter expansion that says: return the value of $1, unless $1 is undefined (no argument was passed) or its value is the empty string ("" or '' was passed), in which case return /dev/stdin instead. The variation ${1-/dev/stdin} (no :) would only return /dev/stdin if $1 is undefined (if it contains any value, even the empty string, it would be returned).
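A minimal sketch of the difference between the two forms. The helper function pick is hypothetical (it is not part of the answer above); it only exists so that $1 can be varied without re-running a script:

```bash
#!/bin/bash
# Hypothetical helper: reports what each expansion yields for its first argument.
pick() {
    printf 'colon=%s plain=%s\n' "${1:-/dev/stdin}" "${1-/dev/stdin}"
}

pick file.txt   # both forms yield file.txt
pick ''         # only the ':-' form falls back to /dev/stdin
pick            # both forms fall back to /dev/stdin
```

The second call is the only one where the two forms disagree: an explicitly passed empty string counts as "set", so the form without the colon returns it unchanged.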
If we put it all together:
# Default to filename '/dev/stdin' (stdin), if none was specified.
set -- "${1:-/dev/stdin}" "${@:2}"
while read -r line; do
... # find the longest line
done < "$1"
But, of course, the much simpler approach would be to use ${1:-/dev/stdin} as the filename directly:
while read -r line; do
... # find the longest line
done < "${1:-/dev/stdin}"
or, via an intermediate variable:
filename=${1:-/dev/stdin}
while read -r line; do
... # find the longest line
done < "$filename"
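For completeness, here is one way the elided loop body could be filled in. This is only a sketch: the function name longest_line and the tie-breaking rule (the first of several equally long lines wins) are assumptions, not something stated in the question.

```bash
#!/bin/bash
# Hypothetical function: prints the longest line of the file named by $1,
# or of stdin when no (or an empty) argument is given.
longest_line() {
    local line longest=''
    # '|| [[ -n $line ]]' also processes a final line lacking a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if (( ${#line} > ${#longest} )); then
            longest=$line
        fi
    done < "${1:-/dev/stdin}"
    printf '%s\n' "$longest"
}

printf 'ab\nabcd\nabc\n' | longest_line   # no argument: reads from stdin
```

Calling longest_line some_file reads the named file instead; the loop itself is identical in both cases, which is the point of defaulting the filename rather than the contents.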

Variables are assigned a value by Var=Value and that variable is used by e.g. echo $Var. In your case, that would amount to
1=$(</dev/stdin)
when assigning the standard input. However, I do not think that variable names are allowed to start with a digit character. See the question bash read from file or stdin for ways to solve this.

Here is my version of script:
#!/bin/bash
file=${1--} # POSIX-compliant; ${1:--} can be used as well.
while IFS= read -r line; do
printf '%s\n' "$line"
done < <(cat -- "$file")
If no file is given as an argument, the script reads from standard input.
See more examples: How to read from file or stdin in bash? at stackoverflow SE

Related

Why won't my grep-filtered string print from within a while-loop?

Tried to keep my code as simple as possible:
1: What are the rules for using echo within a while loop?
All my $a and some of my $word variables are echoed, but not my echo kk?
2: What is the scope of my count variable? Why is it not working within my while loop? Can I extend the variable to make it global?
3: When I use the grep in the final row, the $word variable only prints the first word in the passing rows, while if I remove the grep at the end, $word functions as intended and prints all the words.
count=1
while read a; do
((count=count+1))
if [ $count -le 2 ]
then
echo $a
echo kk
for word in $a; do
echo $word
done
fi
done < data.txt | grep Iteration
Use Process Substitution
In a comment, you say:
I thtought I was using grep on data.txt (sic)
No. Your current pipeline passes the loop's results through grep, not the source file. To do that, you need to rewrite your redirection to use process substitution. For example:
count=1
while read a; do
((count=count+1))
if [ $count -le 2 ]
then
echo $a
echo kk
for word in $a; do
echo $word
done
fi
done < <(grep -F Iteration data.txt)
@CodeGnome answered your question, but there are other problems with your script that will come back to bite you at some point (see https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for discussions on some of them, and also google quoting shell variables). Just don't do it. Shell scripts are just for sequencing calls to tools and the UNIX tool for manipulating text is awk. In this case all you'd need to do the job robustly, portably and efficiently would be:
awk '
/Iteration/ {
if (++count <= 2) {
print
print "kk"
for (i=1; i<=NF; i++) {
print $i
}
}
}' data.txt
and of course it'd be more efficient still if you just stop reading the input when count hits 2:
awk '
/Iteration/ {
print
print "kk"
for (i=1; i<=NF; i++) {
print $i
}
if (++count == 2) {
exit
}
}' data.txt
To complement CodeGnome's helpful answer with an explanation of how your command actually works and why it doesn't do what you want:
In Bash's grammar, an input redirection such as < data.txt is part of a single command, whereas |, the pipe symbol, chains multiple commands, from left to right, to form a pipeline.
Technically, while ... done ... < data.txt | grep Iteration is a single pipeline composed of 2 commands:
a single compound command (while ...; do ...; done) with an input redirection (< data.txt),
and a simple command (grep Iteration) that receives the stdout output from the compound command via its stdin, courtesy of the pipe.
In other words:
only the contents of data.txt is fed to the while loop as input (via stdin),
and whatever stdout output the while loop produces is then sent to the next pipeline segment, the grep command.
By contrast, it sounds like you want to apply grep to data.txt first, and only send the matching lines to the while loop.
You have the following options for sending a command's output to another command:
Note: The following solutions use a simplified while loop for brevity - whether a while command is single-line or spans multiple lines is irrelevant.
Also, instead of using input redirection (< data.txt) to pass the file content to grep, data.txt is passed as a filename argument.
Option 1: Place the command whose output to send to your while loop first in the pipeline:
grep 'Iteration' data.txt | while read -r a; do echo "$a"; done
The down-side of this approach is that your while loop then runs in a subshell (as all segments of a pipeline do by default), which means that variables defined or modified in your while command won't be visible to the current shell.
In Bash v4.2+, you can fix this by running shopt -s lastpipe, which tells Bash to run the last pipeline segment - the while command in this case - in the current shell instead.
Note that lastpipe is a nonstandard bash extension to the POSIX standard.
(To try this in an interactive shell, you must first turn off job control with set +m.)
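A short sketch of the lastpipe behavior, assuming Bash 4.2 or later and a non-interactive script (where job control is off by default). The sample input produced by printf is made up for illustration:

```bash
#!/bin/bash
# In a script, job control is off by default, so lastpipe takes effect.
shopt -s lastpipe

count=0
printf 'x\ny\nz\n' | while read -r a; do
    count=$((count + 1))
done

echo "$count"   # the loop ran in the current shell, so count was updated
```

Without the shopt line, the same script would print 0, because the loop would increment a copy of count inside a subshell.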
Option 2: Use a process substitution:
Loosely speaking, a process substitution <(...) allows you to present command output as the content of a temporary file that cleans up after itself.
Since <(...) expands to the temporary file's (FIFO's) path, and read in the while loop only accepts stdin input, input redirection must be applied as well: < <(...):
while read -r a; do echo "$a"; done < <(grep 'Iteration' data.txt)
The advantage of this approach is that the while loop runs in the current shell, and any variable definitions or modifications therefore remain in scope after the command completes.
The potential down-side of this approach is that process substitutions are a nonstandard bash extension to the POSIX standard (although ksh and zsh support them too).
Option 3: Use a command substitution inside a here-document:
Using the command first in the pipeline (option 1) is a POSIX-compliant approach, but doesn't allow you to modify variables in the current shell (and Bash's lastpipe option is not POSIX-compliant).
The only POSIX-compliant way to send command output to a command that runs in the current shell is to use a command substitution ($(...)) inside a double-quoted here-document:
while read -r a; do echo "$a"; done <<EOF
$(grep 'Iteration' data.txt)
EOF
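A sketch showing that a variable modified inside the loop survives when the here-document form is used. The printf call is a hypothetical stand-in for grep 'Iteration' data.txt, so that the example does not depend on a data file:

```bash
#!/bin/sh
# printf stands in for the grep command; the two lines are made-up sample data.
count=0
while read -r a; do
    count=$((count + 1))
done <<EOF
$(printf 'Iteration 1\nIteration 2\n')
EOF

echo "$count"   # still visible here: the loop ran in the current shell
```

Note that the command substitution runs to completion before the loop starts, so the whole output is held in memory; for very large inputs that may matter.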
Streamlining your code and making it more robust:
The rest of your code has some non-obvious pitfalls that are worth addressing:
Double-quote your variable references (e.g., echo "$a" instead of echo $a), unless you specifically want word-splitting and globbing (filename expansion) applied to the values; word splitting and globbing are two kinds of shell expansions.
Similarly, don't use for to iterate over an (of necessity unquoted) variable reference (don't use for word in $a, in your case), unless you want globbing applied to the individual words - see what happens when you run a='one *'; for word in $a; do echo "$word"; done
You could turn globbing off beforehand (set -f) and back on after (set +f), but it's better to use read -ra words ... to read the words into an array first, and then safely iterate over the array elements with for word in "${words[@]}"; ... - note the "..." around the array variable reference.
Always use -r with read; without it, rarely used \-preprocessing is applied, which will "eat" embedded \ chars.
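The globbing pitfall described above can be reproduced in isolation. This sketch works in a scratch directory created with mktemp so that the result of expanding * is predictable; the file names file1 and file2 are arbitrary:

```bash
#!/bin/bash
# Work in a scratch directory with known contents so the glob result is predictable.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch file1 file2

a='one *'

# Unquoted expansion: '*' is globbed into the file names.
unsafe=()
for word in $a; do
    unsafe+=("$word")
done

# read -ra splits on IFS without globbing.
read -ra safe <<< "$a"

printf 'unsafe: %s\n' "${unsafe[*]}"   # one file1 file2
printf 'safe:   %s\n' "${safe[*]}"     # one *

cd / && rm -rf "$dir"
```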
If we heed the advice above, apply a few additional tweaks, and use a process substitution to feed grep's output to the while loop, we get:
count=1
while read -r a; do # Note the -r
if (( ++count <= 2 )); then
echo "$a"
# Split $a safely into words and store the words in
# array variable ${words[@]}.
read -ra words <<<"$a" # Note the -a to read into an *array*.
# Loop over the words (elements of the array).
# Note: To simply print the words, you could use
# `printf '%s\n' "${words[@]}"` instead of the loop.
for word in "${words[@]}"; do
echo "$word"
done
fi
done < <(grep 'Iteration' data.txt)
Note: As written, you don't need a loop at all, because the body only does anything on the 1st iteration (count starts at 1, so ++count <= 2 holds only once).
Finally, as a general alternative for larger input sets, consider Ed Morton's helpful answer, which is much faster due to using awk to process your input file, whereas looping in shell code is generally slow.

String expansion - escaped quoted variable to value

To get started, here's the script I'm running to get the offending string:
# sed finds all sourced file paths from inputted file.
#
# while reads each match output from sed to $SOURCEFILE variable.
# Each should be a file path, or a variable that represents a file path.
# Any variables found should be expanded to the full path.
#
# echo and calls are used for demonstrative purposes only
# I intend to do something else with the path once it's expanded.
PATH_SOME_SCRIPT="/path/to/bash/script"
while read -r SOURCEFILE; do
echo "$SOURCEFILE"
"$SOURCEFILE"
$SOURCEFILE
done < <(cat $PATH_SOME_SCRIPT | sed -n -e "s/^\(source\|\.\|\$include\) //p")
You may also wish to use the following to test this out as mock data:
[ /path/to/bash/script ]
#!/bin/bash
source "$HOME/bash_file"
source "$GLOBAL_VAR_SCRIPT_PATH"
echo "No cow powers here"
For the tl;dr crew, basically the while loop spits out the following on the mock data:
"$HOME/bash_file"
bash: "$HOME/bash_file": no such file or directory
bash: "$HOME/bash_file": no such file or directory
"$GLOBAL_VAR_SCRIPT_PATH"
"$GLOBAL_VAR_SCRIPT_PATH": command not found
"$GLOBAL_VAR_SCRIPT_PATH": command not found
My question is, can you get the variable to expand correctly, e.g., print "/home//bash_file" and "/expanded/variable/path"? I should also state that although eval works I do not intend to use it because of its potential insecurities.
Protip that any variable value used in cat | sed would be available globally, including to the calling script, so it's not because the script cannot call the variable value.
FIRST SOLUTION ATTEMPT
Using anubhava's envsubst solution:
SOMEVARIABLE="/home/nick/.some_path"
while read -r SOURCEFILE; do
echo "$SOURCEFILE"
envsubst <<< "$SOURCEFILE";
done < <(echo -e "\"\$SOMEVARIABLE\"\n\"$HOME/.another_file\"")
This outputs the following:
"$SOMEVARIABLE"
""
"/home/nick/.another_file"
"/home/nick/.another_file"
Unfortunately, it does not expand the variable! Oh dear :(
SECOND SOLUTION ATTEMPT
Based upon the first attempt:
export SOMEVARIABLE="/home/nick/.some_path"
while read -r SOURCEFILE; do
echo "$SOURCEFILE"
envsubst <<< "$SOURCEFILE";
done < <(echo -e "\"\$SOMEVARIABLE\"\n\"$HOME/.another_file\"")
unset SOMEVARIABLE
which produces the results we wanted without eval and without messing with global variables (for too long anyway), hoorah!
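A possible variation on the second attempt, sketched under the assumption that envsubst (part of GNU gettext) is installed: prefixing the command with the assignment places the variable in the environment of that one envsubst call, so no export/unset pair is needed. The sample line is the same mock input as above.

```bash
#!/bin/bash
# Assumes the envsubst program (part of GNU gettext) is installed.
line='"$SOMEVARIABLE"'   # sample input line, as it would be read by the loop

# The assignment prefix puts SOMEVARIABLE into the environment of this one
# envsubst invocation only; it is not exported to the rest of the script.
expanded=$(SOMEVARIABLE="/home/nick/.some_path" envsubst <<< "$line")

printf '%s\n' "$expanded"
```

The surrounding double quotes are part of the data and are passed through unchanged; envsubst only replaces the $SOMEVARIABLE reference.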
Good runners-up using eval (although potentially unsafe) were also suggested; they can be found in this answer and here (link courtesy of anubhava's extended comments).
My question is, can you get the variable to expand correctly, e.g., print "/home//bash_file" and "/expanded/variable/path"?
Yes, you can use the envsubst program, which substitutes the values of environment variables:
while read -r sourceFile; do
envsubst <<< "$sourceFile"
done < <(sed -n "s/^\(source\|\.\|\$include\) //p" "$PATH_SOME_SCRIPT")
I think you are asking how to recursively expand variables in bash. Try
expanded=$(eval echo $SOURCEFILE)
inside your loop. eval runs the expanded command you give it. Since $SOURCEFILE isn't in quotes, it will be expanded to, e.g., $HOME/whatever. Then the eval will expand the $HOME before passing it to echo. echo will print the result, and expanded=$(...) will put the printed result in $expanded.

bash using stdin as a variable

I want to make sure my script will work when the user uses a syntax like this:
script.sh firstVariable < SecondVariable
For some reason I can't get this to work.
I want $1=firstVariable
And $2=SecondVariable
But for some reason my script thinks only firstVariable exists?
This is a classic X-Y problem. The goal is to write a utility in which
utility file1 file2
and
utility file1 < file2
have the same behaviour. It seems tempting to find a way to somehow translate the second invocation into the first one by (somehow) figuring out the "name" of stdin, and then using that name the same way as the second argument would be used. Unfortunately, that's not possible. The redirection happens before the utility is invoked, and there is no portable way to get the "name" of an open file descriptor. (Indeed, it might not even have a name, in the case of other_cmd | utility file1.)
So the solution is to focus on what is being asked for: make the two behaviours consistent. This is the case with most standard utilities (grep, cat, sort, etc.): if the input file is not specified, the utility uses stdin.
In many unix implementations, stdin does actually have a name: /dev/stdin. In such systems, the above can be achieved trivially:
utility() {
utility_implementation "$1" "${2:-/dev/stdin}"
}
where utility_implementation actually does whatever is required to be done. The syntax of the second argument is normal default parameter expansion; it represents the value of $2 if $2 is present and non-empty, and otherwise the string /dev/stdin. (If you leave out the : so that it is "${2-/dev/stdin}", then it won't do the substitution if $2 is present and empty, which might be better.)
Another way to solve the problem is to ensure that the first syntax becomes the same as the second syntax, so that the input is always coming from stdin even with a named file. The obvious simple approach:
utility() {
if (( $# < 2 )); then
utility_implementation "$1"
else
utility_implementation "$1" < "$2"
fi
}
Another way to do this uses the exec command with just a redirection to redirect the shell's own stdin. Note that we have to do this inside a subshell ((...) instead of {...}) so that the redirection does not apply to the shell which invokes the function:
utility() (
if (( $# > 1 )); then exec < "$2"; fi
# implementation goes here. $1 is file1 and stdin
# is now redirected to $2 if $2 was provided.
# ...
)
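To see the two invocations behave identically, here is a small runnable sketch of the subshell variant. The body of utility_implementation is a hypothetical stand-in (it merely prefixes each input line with $1), and the temporary file plays the role of file2:

```bash
#!/bin/bash
# Hypothetical stand-in for the real work: prefixes each stdin line with $1.
utility_implementation() {
    local line
    while IFS= read -r line; do
        printf '%s: %s\n' "$1" "$line"
    done
}

# Subshell body, so the exec redirection does not leak into the caller.
utility() (
    if (( $# > 1 )); then exec < "$2"; fi
    utility_implementation "$1"
)

tmp=$(mktemp) || exit 1
printf 'alpha\nbeta\n' > "$tmp"

utility tag "$tmp"     # file named as the second argument
utility tag < "$tmp"   # same file supplied on stdin
rm -f "$tmp"
```

Both calls print the same two lines, because in either case the implementation only ever reads from stdin.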
To append the contents of stdin as the final argument to the script (so with one argument followed by < file, the contents become the second argument), you can use the following:
#!/bin/bash
##read loop to read in stdin
while read -r line
do
## This just checks if the variable is empty, so a newline isn't appended on the front
[[ -z $Vars ]] && Vars="$line" && continue
## Appends every line read to variable
Vars="$Vars"$'\n'"$line"
## While read loop using stdin
done < /dev/stdin
##Set re-sets the arguments to the script to the original arguments and then the new argument we derived from stdin
set -- "$@" "$Vars"
## Echo the new arguments
echo "$@"

bash - cleaner way to count the number of lines in a variable using pure bash?

I have a variable containing some command output, and I want to count the number of lines in that output. I am trying to do this using pure bash (instead of piping to wc like in this question).
The best I have come up with seems kind of cumbersome:
function count_lines() {
num_lines=0
while IFS='' read -r line; do
((num_lines++))
done <<< "$1"
echo "$num_lines"
}
count_lines "$my_var"
Is there a cleaner or shorter way to do this?
count_lines () (
IFS=$'\n'
set -f
set -- $1
echo $#
)
Some tricks used:
The function is defined using a subshell instead of a command group to localize the change to IFS and the -f shell option. (You could be less fancy and use local IFS=$'\n' instead, and run set +f at the end of the function.)
Disable filename generation to avoid any metacharacters in the argument from expanding and interfering with the line count.
Once IFS is changed, set the positional parameters for the function using unquoted parameter expansion for the argument.
Finally, output the number of positional parameters; there is one per line in the original argument.
IFS=$'\n'
set -f
x=( $variable )
Now ${#x[@]} will contain the number of 'lines'. (Use "set +f" to undo "set -f".)
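One caveat worth checking against your own data: newline is IFS whitespace, so consecutive newlines collapse during word splitting, and the splitting-based approaches therefore skip empty lines, whereas the read loop from the question counts them. The sketch below puts both side by side; the function names count_lines_read and count_lines_split are made up for the comparison.

```bash
#!/bin/bash
# The read-based counter from the question (arithmetic rewritten as an assignment).
count_lines_read() {
    local n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
    done <<< "$1"
    echo "$n"
}

# The word-splitting counter; the subshell body keeps IFS and set -f local.
count_lines_split() (
    IFS=$'\n'
    set -f
    set -- $1
    echo $#
)

text=$'a\n\nb'              # three lines, the middle one empty
count_lines_read "$text"    # 3
count_lines_split "$text"   # 2
```

The two also differ for an empty string: the here-string always appends a newline, so the read loop reports 1, while splitting reports 0.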
Another possible solution I came up with:
export x=0;while read i;do export x=$(($x+1));done < /path/to/your/file;echo $x

Read a Bash variable assignment from other file

I have this test script:
#!/bin/bash
echo "Read a variable"
#open file
exec 6<test.txt
read EXAMPLE <&6
#close file again
exec 6<&-
echo $EXAMPLE
The file test.txt has only one line:
EXAMPLE=1
The output is:
bash-3.2$ ./Read_Variables.sh
Read a variable
EXAMPLE=1
I need just to use the value of $EXAMPLE, in this case 1. So how can I avoid getting the EXAMPLE= part in the output?
Thanks
If the file containing your variables is using bash syntax throughout (e.g. X=Y), another option is to use source:
#!/bin/bash
echo "Read a variable"
source test.txt
echo $EXAMPLE
As an alternative to sourcing the entire file, you can try the following:
while read line; do
[[ $line =~ EXAMPLE= ]] && declare "$line" && break
done < test.txt
which will scan the file until it finds the first line that looks like an assignment to EXAMPLE, then use the declare builtin to perform the assignment. It's probably a little slower, but it's more selective about what is actually executed.
I think the most proper way to do this is by sourcing the file which contains the variable (if it has bash syntax), but if I were to do that, I'd source it in a subshell, so that if there are ever other variables declared there, they won't override any important variables in current shell:
(. test.txt && echo $EXAMPLE)
You could read the line in as an array (notice the -a option) which can then be indexed into:
# ...
IFS='=' read -a EXAMPLE <&6
echo ${EXAMPLE[0]} # EXAMPLE
echo ${EXAMPLE[1]} # 1
# ...
This call to read splits the input line on the IFS and puts the resulting parts into an indexed array.
See help read for more information about read options and behaviour.
You could also manipulate the EXAMPLE variable directly:
# ...
read EXAMPLE <&6
echo ${EXAMPLE##*=} # 1
# ...
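Building on the parameter-expansion idea, here is a sketch of a lookup that never executes anything from the file. The function name get_value and the sample file contents are assumptions made for illustration. It uses ${line#*=} (shortest match) rather than ${line##*=}, so that a value which itself contains = is kept whole.

```bash
#!/bin/bash
# Hypothetical helper: prints the value assigned to key $1 in file $2,
# without executing anything from the file.
get_value() {
    local key=$1 line
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == "$key="* ]]; then
            printf '%s\n' "${line#*=}"   # strip through the first '='
            return 0
        fi
    done < "$2"
    return 1   # key not found
}

tmp=$(mktemp) || exit 1
printf 'OTHER=x\nEXAMPLE=1\n' > "$tmp"
get_value EXAMPLE "$tmp"
rm -f "$tmp"
```

The non-zero return status for a missing key lets callers write value=$(get_value EXAMPLE test.txt) || handle the error explicitly.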
If all you need is to "import" other Bash declarations from a file you should just use:
source file
