Read stdin in a function in a bash script

I have some set of bash functions which output some information:
find-modelname-in-epson-ppds
find-modelname-in-samsung-ppds
find-modelname-in-hp-ppds
etc ...
I've been writing functions which read output and filter it:
function filter-epson {
find-modelname-in-epson-ppds | sed <bla-blah-blah>
}
function filter-hp {
find-modelname-in-hp-ppds | sed <the same bla-blah-blah>
}
etc ...
But then I thought it would be better to do something like this:
function filter-general {
(somehow get input) | sed <bla-blah-blah>
}
and then call it in other high-level functions:
function high-level-func {
# outputs filtered information
find-modelname-in-hp/epson/...-ppds | filter-general
}
How can I achieve that with the best bash practices?

If the question is How do I pass stdin to a bash function?, then the answer is:
Shellscript functions take stdin the ordinary way, as if they were commands or programs. :)
input.txt:
HELLO WORLD
HELLO BOB
NO MATCH
test.sh:
#!/bin/sh
myfunction() {
grep HELLO
}
cat input.txt | myfunction
Output:
hobbes@metalbaby:~/scratch$ ./test.sh
HELLO WORLD
HELLO BOB
Note that command line arguments are ALSO handled in the ordinary way, like this:
test2.sh:
#!/bin/sh
myfunction() {
grep "$1"
}
cat input.txt | myfunction BOB
Output:
hobbes@metalbaby:~/scratch$ ./test2.sh
HELLO BOB

To be painfully explicit that I'm piping from stdin, I sometimes write
cat - | ...

A very simple means to get stdin into a variable is to use read. By default, it reads from file descriptor 0, i.e. stdin, i.e. /dev/stdin.
Example Function:
input(){ local in; read in; echo you said $in; }
Example implementation:
echo "Hello World" | input
Result:
you said Hello World
Additional info
You don't need to declare a variable as being local, of course. I just included that for the sake of good form. Plain old read in does what you need.
So that you understand how read works: by default it reads data off the given file descriptor (or implicit stdin) and blocks until it encounters a newline. Much of the time you'll find that a newline is implicitly attached to your input, even if you weren't aware of it. If you have a function that seems to hang with this mechanism, just keep this detail in mind (there are other ways of using read to deal with that...).
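For example, one such way is read's -t timeout, so the function gives up instead of blocking forever when no newline-terminated input arrives (a small sketch, not part of the original answer; the 5-second timeout is arbitrary):
input_with_timeout() {
    local in
    if read -r -t 5 in; then
        echo "you said $in"
    else
        echo 'no newline-terminated input within 5 seconds' >&2
        return 1
    fi
}
echo "Hello World" | input_with_timeout   # you said Hello World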
More robust solutions
Adding on to the basic example, here's a variation that lets you pass the input via stdin OR an argument:
input()
{
local in=$1; if [ -z "$in" ]; then read in; fi
echo you said $in
}
With that tweak, you could ALSO call the function like:
input "Hello World"
How about handling a stdin option plus other arguments? Many standard *nix utilities, especially those that typically work with stdin/stdout, adhere to the common practice of treating a dash (-) to mean "default", which contextually means either stdin or stdout. You can follow that convention and treat an argument specified as - to mean "stdin":
input()
{
local a=$1; if [ "$a" == "-" ]; then read a; fi
local b=$2
echo you said $a $b
}
Call this like:
input "Hello" "World"
or
echo "Hello" | input - "World"
Going even further, there is actually no reason to limit stdin to being an option for only the first argument. You could create a super-flexible function that uses it for any of them...
input()
{
local a=$1; if [ "$a" == "-" ]; then read a; fi
local b=$2; if [ "$b" == "-" ]; then read b; fi
echo you said $a $b
}
Why would you want that? Because you could formulate, and pipe in, whatever argument you might need...
myFunc | input "Hello" -
In this case, I pipe in the 2nd argument using the results of myFunc rather than only having the option for the first.

Call sed directly. That's it.
function filter-general {
sed <bla-blah-blah>
}
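Applied to the question, the high-level functions then simply pipe into it, for example (a sketch; find-modelname-in-epson-ppds and the sed expression are placeholders from the question):
function high-level-func {
    find-modelname-in-epson-ppds | filter-general
}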

Related

How to use/call variables from another function in bash

I am learning to play around with functions in bash. I have a first function, read_file(), that reads /etc/file and replaces ':' with a space between words (e.g. root:x:0:0:root ... becomes root x 0 0 root ...). I then want to be able to manipulate output from individual words in each of the lines.
My second function, display__user_shell(), prints out the shell for each corresponding user, as in the /etc/file.
My problem is figuring out how to call the first function, read_file(), and use its variables in the display__user_shell() function.
I have been able to do the above when using input from a single line rather than reading from a file; I just referenced new_data, i.e. $new_data, from the display__user_shell() function.
read_file() {
read -p "Enter file" file
while read line
do
newlin=$(echo $line | tr ":" " ")
echo newlin
done
}
oldIFS=$IFS
IFS=" "
ct=0
display__user_shell() {
readfile
for item in $newlin;
do
[ $ct -eq 0 ] && name="$item";
[ $ct -eq 6 ] && name="$item";
done
echo "$user's shell is $shell"
}
IFS=$oldIFS
display__user_shell
The first line of the output should be:
root's shell is /bin/bash
Irrespective of the implementation there is an interesting question here: how to reference variables from one function in another function. The short answer is that you can:
$ a() { aye=bee; }
$ b() { echo "$aye"; }
$ a
$ b
bee
But this is a very bad idea - Bash has "unfortunate" scoping rules different from safer languages like Java, Python, or Ruby, and code like this is very hard to follow. Instead there are several patterns you can use to produce more readable code:
Print the value in the inner function and assign that to a value in the outer function:
a() {
echo 'bee'
}
b() {
aye="$(a)"
echo "$aye"
}
b # Prints "bee"
Call the first function in the outer scope, assign its output to a variable there, and use that variable in the second function:
a() {
echo 'bee'
}
aye="$(a)"
b() {
echo "$aye"
}
b # Prints "bee"
Treat the first and second functions as a pipeline, passing standard output of the first one to the standard input of the second one (read is a slow way to process a large file, but it'll serve as an example):
a() {
echo 'bee'
}
b() {
while read -r line
do
echo "$line"
done
}
a | b # Prints "bee"
Which one you choose depends on things like what else you intend to do with what a returns and whether a produces huge amounts of output.
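As a concrete illustration, here is a sketch of the pipeline pattern applied to the original question. It assumes the file in question is /etc/passwd (the question calls it /etc/file) and parses the fields with IFS=':' rather than translating ':' to spaces, so fields that contain spaces stay intact:
read_file() {
    cat /etc/passwd
}
display_user_shell() {
    read_file | while IFS=: read -r user _ _ _ _ _ shell; do
        echo "$user's shell is $shell"
    done
}
display_user_shell   # first line: root's shell is /bin/bash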

Use variables outside the subprocess in bash

There's a getStrings() function that calls a getPage() function, which returns some HTML page. That HTML is piped through an egrep and sed combination to get only 3 strings. Then I try to put every string into a separate variable (link, profile, gallery respectively) using a while read ... construction. But it works only inside the while ... done loop because it runs in a subprocess. What should I do to use those variables outside the getStrings() function?
getStrings() {
local i=2
local C=0
getPage $(getPageLink 1 $i) |
egrep *some expression that results in 3 strings* |
while read line; do
if (( (C % 3) == 0 )); then
link=$line
elif (( (C % 3) == 1 )); then
profile=$line
else
gallery=$line
fi
C=$((C+1)) #Counter
done
}
Simple: don't run the loop in a subprocess :)
To actually accomplish that, you can use process substitution.
while read line; do
...
done < <(getPage $(getPageLink 1 $i) | egrep ...)
For the curious, a POSIX-compatible way is to use a named pipe (and it's possible that bash uses named pipes to implement process substitution):
mkfifo pipe
getPage $(getPageLink 1 $i) | egrep ... > pipe &
while read line; do
...
done < pipe
Starting in bash 4.2, you can just set the lastpipe option, which causes the last command in a pipeline to run in the current shell, rather than a subshell.
shopt -s lastpipe
getPage $(getPageLink 1 $i) | egrep ... | while read line; do
...
done
However, using a while loop is not the best way to set the three variables. It's easier to just call read three times within a command group, so that they all read from the same stream. In any of the three scenarios above, replace the while loop with
{ read link; read profile; read gallery; }
If you want to be a little more flexible, put the names of the variables you might want to read in an array:
fields=( link profile gallery )
then replace the while loop with this for loop instead:
for var in "${fields[@]}"; do read $var; done
This lets you easily adjust your code, should the pipeline ever return more or fewer lines, by just editing the fields array to have the appropriate field names.
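Putting those pieces together, here is a hedged sketch of the whole function (getPage, getPageLink and the egrep expression are placeholders carried over from the question):
fields=( link profile gallery )
getStrings() {
    local var
    for var in "${fields[@]}"; do
        read -r "$var"
    done < <(getPage "$(getPageLink 1 2)" | egrep 'some expression')
}
getStrings
echo "$link $profile $gallery"   # all three are visible outside the function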
One more solution, using an array:
getStrings() {
array_3=( $(getPage |    # some function
    egrep | ...) )       # pipe conveyor
}
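The strings can then be pulled out by index (a usage sketch; note that word splitting inside ( ) happens on whitespace, not on lines, so this only works cleanly if none of the three strings contains spaces):
getStrings
link=${array_3[0]}
profile=${array_3[1]}
gallery=${array_3[2]}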

How can you override function redirection in bash?

I recently discovered some bash code that used the little-known (well, little known to me anyway) feature of function redirection, such as the greatly simplified:
function xyzzy () {
echo hello
} >/dev/null
When you call the function with a simple xyzzy, it automatically applies the redirections attached to the function regardless of what you've done when calling it.
What I'd like to know is if there's any way to override this behaviour in the call to the function itself, to see the message being generated. I'm reluctant to change the file containing all the functions since (1) it's large, (2) it changes regularly, and (3) it's heavily protected by the group that supports it.
I've tried:
xyzzy >&1
to try to override it but the output still doesn't show up (possibly because >&1 may be considered a no-op).
In other words, given the script:
function xyzzy () {
echo hello
} >/tmp/junk
rm -f /tmp/junk
echo ================
echo Standard output
echo ----------------
xyzzy # something else here
echo ================
echo Function capture
echo ----------------
cat /tmp/junk
echo ================
it currently outputs:
================
Standard output
----------------
================
Function capture
----------------
hello
================
What can I change the xyzzy call to, so as to get hello printed in the standard output section rather than the function capture section?
And this needs to be without trying to read the file /tmp/junk after it's created since the actual redirections may be to /dev/null so they won't be in a file.
The only thing I can think of would be to parse the output of declare -f function_name and remove the redirection.
This is perhaps the easiest approach. Note that you need to tailor the awk script to the specific function layout and it doesn't modify the body of the function at all. That means you can only turn off redirection at the top level. You could modify whole call trees of functions to turn off redirection but that would require a bash parser capable of recognising and changing function calls within the body.
The following script shows how to do it with your sample function. All the awk command does is create a new function my_xyzzy which mirrors the xyzzy function except for the final line, effectively turning it into:
function my_xyzzy () {
echo hello
}
And the complete script as per the specifications:
function xyzzy () {
echo hello
} >/tmp/qqqq
declare -f xyzzy | awk '
NR==1 {print "my_xyzzy ()"}
NR==2 {prev=$0}
NR>2 {print prev;prev=$0}
END {print "}"}' >$$.bash
. $$.bash
rm -f $$.bash
rm -f /tmp/qqqq
echo ================
echo Standard output
echo ----------------
my_xyzzy
echo ================
echo Function capture
echo ----------------
cat /tmp/qqqq
echo ================
The output of that is:
================
Standard output
----------------
hello
================
Function capture
----------------
cat: /tmp/qqqq: No such file or directory
================
I don't think Bash function redirections can be overridden dynamically in the call to the function itself although a temporarily altered shell context can be made use of by combining Bash aliases and functions (see Magic Aliases: A Layering Loophole in the Bourne Shell).
Non-dynamically, it is the last redirection expression, i.e. the rightmost one, that overrides the previous ones if the redirection expressions refer to the same file descriptor.
# example
ls -ld / no_such_file 1>/dev/null 1>/dev/tty 1>&2 1>redirtest.txt
cat redirtest.txt
Therefore, glenn jackman's suggestion to use declare -f function_name seems the way to add a final stdout redirection expression to override the previous ones.
xyzzy() { echo 'Hello, world!'; } 1>/dev/null
#func="$(declare -f xyzzy) 1>&2"
func="$(declare -f xyzzy) 1>/dev/tty"
eval "$func"
xyzzy
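A related, shorter variant of the declare -f trick is to strip the trailing redirection with sed and redefine the function under a new name. This is only a sketch, and it assumes (as in the question) that the redirection sits on the closing-brace line:
eval "my_xyzzy() $(declare -f xyzzy | sed '1d; $s/}.*/}/')"
my_xyzzy   # now prints Hello, world! on stdout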

Multiple returns from a function in a shell script

function get_arguments()
{
read -p 'data : ' data
read -p 'lambda: ' lambda
echo $data $lambda
}
data,lambda=$(get_arguments)
But I am getting an error:
data : /home/wolfman/Downloads/data
lambda value: 2
./shell_script.sh: line 25: data,lambda,= /home/wolfman/Downloads/data: No such file or directory
But
1) Why is it even evaluating whether that file exists or not? It's just a string.
2) What am I doing wrong? :(
Thanks
sh syntax does not allow that. But, the variables in the function are global, so you can just invoke the function and data and lambda will be set in the caller.
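A minimal sketch of that approach, reusing the question's prompts:
get_arguments() {
    read -p 'data  : ' data
    read -p 'lambda: ' lambda
}
get_arguments              # data and lambda are now set in the calling shell
echo "$data $lambda"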
Alternatively: functions return only an integer value, but they can print arbitrary data which can be read by the caller. For example, you could do:
get_arguments | { read data lambda; echo $data $lambda; }
The drawback is that the values are only available in that block. (The pipe creates a subshell, and the values read by read are only valid in that subshell.)
Just for fun here are a couple of other possible methods.
read -r data lambda <<< $(get_arguments)
or
set -- $(get_arguments)
data=$1
lambda=$2
Shells don't allow direct assignment to lists of variables; you have to manage that with shell string parsing (or possibly other methods). Try:
data_lambda=$(get_arguments)
data=${data_lambda% *}
#-----------------^^space char
lambda=${data_lambda#* }
#------------------^^space char
$ d=123 l=345
$ data_lambda=$(echo $d $l)
$ echo $data_lambda
123 345
$ data=${data_lambda% *}
$ lambda=${data_lambda#* }
$ echo $data
123
$ echo $lambda
345
Here, $(echo $d $l) substitutes for $(get_arguments).
See my write-up on shell parameter modifiers
IHTH

Capturing multiple line output into a Bash variable

I've got a script 'myscript' that outputs the following:
abc
def
ghi
In another script, I call:
declare RESULT=$(./myscript)
and $RESULT gets the value
abc def ghi
Is there a way to store the result either with the newlines, or with '\n' character so I can output it with 'echo -e'?
Actually, RESULT contains what you want — to demonstrate:
echo "$RESULT"
What you show is what you get from:
echo $RESULT
As noted in the comments, the difference is that (1) the double-quoted version of the variable (echo "$RESULT") preserves internal spacing of the value exactly as it is represented in the variable — newlines, tabs, multiple blanks and all — whereas (2) the unquoted version (echo $RESULT) replaces each sequence of one or more blanks, tabs and newlines with a single space. Thus (1) preserves the shape of the input variable, whereas (2) creates a potentially very long single line of output with 'words' separated by single spaces (where a 'word' is a sequence of non-whitespace characters; there needn't be any alphanumerics in any of the words).
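For instance, with the three-line example from the question (a quick sketch, using printf to stand in for ./myscript):
RESULT="$(printf 'abc\ndef\nghi\n')"
echo "$RESULT"    # three lines: abc / def / ghi
echo $RESULT      # one line:    abc def ghi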
Another pitfall with this is that command substitution — $() — strips trailing newlines. Probably not always important, but if you really want to preserve exactly what was output, you'll have to use another line and some quoting:
RESULTX="$(./myscript; echo x)"
RESULT="${RESULTX%x}"
This is especially important if you want to handle all possible filenames (to avoid undefined behavior like operating on the wrong file).
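To see the difference (a quick sketch):
plain="$(printf 'data\n\n\n')"          # the trailing newlines are stripped here
kept="$(printf 'data\n\n\n'; echo x)"   # the extra x protects them
kept="${kept%x}"                        # kept now really ends in three newlines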
In case you're interested in specific lines, use a result array:
declare RESULT=($(./myscript)) # (..) = array
echo "First line: ${RESULT[0]}"
echo "Second line: ${RESULT[1]}"
echo "N-th line: ${RESULT[N]}"
In addition to the answer given by @l0b0, I just had the situation where I needed to both keep any trailing newlines output by the script and check the script's return code.
And the problem with l0b0's answer is that the 'echo x' was resetting $? back to zero... so I managed to come up with this very cunning solution:
RESULTX="$(./myscript; echo x$?)"
RETURNCODE=${RESULTX##*x}
RESULT="${RESULTX%x*}"
Parsing multiple output
Introduction
So your myscript outputs 3 lines; it could look like:
myscript() { echo $'abc\ndef\nghi'; }
or
myscript() { local i; for i in abc def ghi ;do echo $i; done ;}
OK, this is a function, not a script (no need for the ./ path), but the output is the same:
myscript
abc
def
ghi
Considering result code
To check for result code, test function will become:
myscript() { local i;for i in abc def ghi ;do echo $i;done;return $((RANDOM%128));}
1. Storing multiple output in one single variable, showing newlines
Your operation is correct:
RESULT=$(myscript)
About result code, you could add:
RCODE=$?
even on the same line:
RESULT=$(myscript) RCODE=$?
Then
echo $RESULT $RCODE
abc def ghi 66
echo "$RESULT"
abc
def
ghi
echo ${RESULT@Q}
$'abc\ndef\nghi'
printf '%q\n' "$RESULT"
$'abc\ndef\nghi'
but for showing variable definition, use declare -p:
declare -p RESULT RCODE
declare -- RESULT="abc
def
ghi"
declare -- RCODE="66"
2. Parsing multiple output into an array, using mapfile
Storing answer into myvar variable:
mapfile -t myvar < <(myscript)
echo ${myvar[2]}
ghi
Showing $myvar:
declare -p myvar
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
Considering result code
In case you have to check for result code, you could:
RESULT=$(myscript) RCODE=$?
mapfile -t myvar <<<"$RESULT"
declare -p myvar RCODE
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
declare -- RCODE="40"
3. Parsing multiple output with consecutive reads in a command group
{ read firstline; read secondline; read thirdline;} < <(myscript)
echo $secondline
def
Showing variables:
declare -p firstline secondline thirdline
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
I often use:
{ read foo;read foo total use free foo ;} < <(df -k /)
Then
declare -p use free total
declare -- use="843476"
declare -- free="582128"
declare -- total="1515376"
Considering result code
Same prepended step:
RESULT=$(myscript) RCODE=$?
{ read firstline; read secondline; read thirdline;} <<<"$RESULT"
declare -p firstline secondline thirdline RCODE
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
declare -- RCODE="50"
After trying most of the solutions here, the easiest thing I found was the obvious - using a temp file. I'm not sure what you want to do with your multiple line output, but you can then deal with it line by line using read. About the only thing you can't really do is easily stick it all in the same variable, but for most practical purposes this is way easier to deal with.
./myscript.sh > /tmp/foo
while read line ; do
echo 'whatever you want to do with $line'
done < /tmp/foo
Quick hack to make it do the requested action:
result=""
./myscript.sh > /tmp/foo
while read line ; do
result="$result$line\n"
done < /tmp/foo
echo -e $result
Note this adds an extra trailing line. If you work on it you can code around it; I'm just too lazy.
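If you do want to avoid the extra line, one way (a sketch) is to add the \n separator only from the second line onward:
result=""
./myscript.sh > /tmp/foo
while read -r line ; do
    result="${result:+$result\n}$line"
done < /tmp/foo
echo -e "$result"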
EDIT: While this case works perfectly well, people reading this should be aware that you can easily squash your stdin inside the while loop, thus giving you a script that will run one line, clear stdin, and exit. Commands like ssh will do that, I think; I just saw it recently. Other code examples are here: https://unix.stackexchange.com/questions/24260/reading-lines-from-a-file-with-bash-for-vs-while
One more time! This time with a different filehandle (stdin, stdout, stderr are 0-2, so we can use &3 or higher in bash).
result=""
./test>/tmp/foo
while read line <&3; do
result="$result$line\n"
done 3</tmp/foo
echo -e $result
You can also use mktemp; the above is just a quick code example. Usage for mktemp looks like:
filenamevar=`mktemp /tmp/tempXXXXXX`
./test > $filenamevar
Then use $filenamevar like you would the actual name of a file. Probably doesn't need to be explained here but someone complained in the comments.
How about this? It will read each line into a variable, and that can be used subsequently.
Say the myscript output is redirected to a file called myscript_output:
awk '{while ( (getline var < "myscript_output") >0){print var;} close ("myscript_output");}'
