Consistent syntax for obtaining output of a command efficiently in bash? - bash

Bash has the command substitution syntax $(f), which captures
the STDOUT of a command f. If the command is an executable, this is fine;
the creation of a new process is necessary anyway. But if the command is
a shell function, using this syntax creates an overhead of about 25ms for
each subshell on my system. This is enough to add up to noticeable delays
when used in inner loops, especially in interactive contexts such as
command completions or $PS1.
A common optimization is to use global variables instead [1] for returning
values, but it comes at a cost to readability: the intent becomes less clear,
and output capturing suddenly is inconsistent between shell functions and
executables. I am adding a comparison of options and their weaknesses below.
In order to get a consistent, reliable syntax, I was wondering if bash has
any feature that allows capturing shell-function and executable output
alike, while avoiding subshells for shell functions.
Ideally, a solution would also contain a more efficient alternative to executing multiple commands in a subshell, which allows more cleanly isolating concerns, e.g.
person=$(
    db_handler=$(database_connect) # avoids leaking the variable
    query $db_handler lastname     # outside its required
    echo ", "                      # scope.
    query $db_handler firstname
    database_close $db_handler
)
Such a construct allows the reader of the code to ignore everything inside $(), if the details of how $person is formatted aren't interesting to them.
Comparison of Options
1. With command substitution
person="$(get lastname), $(get firstname)"
Slow, but readable and consistent: It doesn't matter to the reader at first
glance whether get is a shell function or an executable.
2. With same global variable for all functions
get lastname
person="$R, "
get firstname
person+="$R"
Obscures what $person is supposed to contain. Alternatively,
get lastname
local lastname="$R"
get firstname
local firstname="$R"
person="$lastname, $firstname"
but that's very verbose.
3. With different global variable for each function
get_lastname
get_firstname
person="$lastname $firstname"
More readable assignment, but
If some function is invoked twice, we're back to (2).
The side-effect of setting the variable is not obvious.
It is easy to use the wrong variable by accident.
4. With global variable, whose name is passed as argument
get LN lastname
get FN firstname
person="$LN, $FN"
More readable, allows multiple return values easily.
Still inconsistent with capturing output from executables.
Note: Assignment to dynamic variable names should be done with declare
rather than eval:
$VARNAME="$LOCALVALUE" # doesn't work.
declare -g "$VARNAME=$LOCALVALUE" # will work.
eval "$VARNAME='$LOCALVALUE'" # doesn't work for *arbitrary* values.
eval "$VARNAME=$(printf %q "$LOCALVALUE")"
# doesn't avoid a subshell after all.
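One more candidate for the dynamic assignment above, offered as a sketch rather than a recommendation: bash's printf builtin accepts -v varname, which stores the formatted result in the named variable without eval and without a subshell. The helper name assign and the sample value are made up for illustration.

```bash
# Hypothetical helper (not from the question): store $2 in the variable named by $1.
# `printf -v` is a bash feature; no eval and no command substitution are involved.
assign() {
    printf -v "$1" '%s' "$2"
}

# The value contains a quote and a command substitution that must NOT be executed.
assign LN "O'Brien \$(echo oops)"
printf '%s\n' "$LN"
```

Because the value is passed as a printf argument and never re-parsed as shell code, arbitrary strings are stored verbatim.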
[1] http://rus.har.mn/blog/2010-07-05/subshells/

If you want it to be efficient the shell functions can't return their result via stdout. If they did, there'd be no way to get it but by running the function in a subshell and capturing the output via an internal pipe, and these operations are kind of expensive (a few ms on a modern system).
When I was focusing on shell scripts and I needed to max their performance I used a convention where function foo would return its result via a variable foo. This you can do even in a POSIX shell and it has the nice property that it won't overwrite your locals because if foo is a function, you've already kind of reserved the name.
Then I had this bx_r getter function that runs a shell function and saves its output into either a variable whose name is given by the first argument or it outputs the output to stdout if the first argument is a word that's an illegal variable name (without a newline if the word is exactly an empty word, i.e., '').
I've modified it so it can be used uniformly with either commands or functions.
You can't use the type builtin to differentiate between the two here because type returns its result via stdout => you'd need to capture that result and that would impose the forking penalty again.
So what I do when I'm about to run function foo is I check if there's a corresponding variable foo (this can catch a local variable but you'll avoid the chances of this if you limit yourself to properly namespaced shell function names). If there is, I assume that's where function foo returns its result, otherwise I run it in a $(), capturing its stdout.
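As a side note for bash only (it does not help in a plain POSIX sh, which this answer also targets): declare -F name reports whether a name is a function through its exit status, so its printed output can be discarded instead of captured. A minimal sketch, with is_function and greet as made-up names:

```bash
# Hypothetical helper: succeeds if $1 names a shell function (bash only).
# The redirection throws away the printed name; no $() capture is needed.
is_function() {
    declare -F "$1" >/dev/null
}

greet() { echo hello; }

if is_function greet; then echo "greet: function"; fi
if ! is_function date; then echo "date: not a function"; fi
```

This only answers "is it a function?"; it does not say where the function leaves its result, so the variable-name convention described above is still needed.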
Here's the code with some testing code:
bx_varlike_eh()
{
case $1 in
([!A-Za-z_0-9]*) false;;
(*) true;;
esac
}
bx_r() #{{{ Varname=$1; shift; Invoke $@ and save it to $Varname if a legal varname or print it
{
# `bx_r '' some_command` prints without a newline
# `bx_r - some_command` (or any non-variable-character-containing word instead of -)
# prints with a newline
local bx_r__varname="$1"; shift 1
local bx_r
if ! bx_varlike_eh "$1" || eval "[ \"\${$1+set}\" != set ]"; then
#https://unix.stackexchange.com/a/465715/23692
bx_r=$( "$@" ) || return #$1 not varlike or unset => must be a regular command, so capture
else
#if $1 is a variable name, assume $1 is a function that saves its output there
"$@" || return
eval "bx_r=\$$1" #put it in bx_r
fi
case "$bx_r__varname" in
('') printf '%s' "$bx_r";;
([!A-Za-z_0-9]*) printf '%s\n' "$bx_r";;
(*) eval "$bx_r__varname=\$bx_r";;
esac
} #}}}
#TEST
for sh in sh bash; do
time $sh -c '
. ./bx_r.sh
bx_getnext=; bx_getnext() { bx_getnext=$((bx_getnext+1)); }
bx_r - bx_getnext
bx_r - bx_getnext
i=0; while [ $i -lt 10000 ]; do
bx_r ans bx_getnext
i=$((i+1)); done; echo ans=$ans
'
echo ====
$sh -c '
. ./bx_r.sh
bx_r - date
bx_r - /bin/date
bx_r ans /bin/date
echo ans=$ans
'
echo ====
time $sh -c '
. ./bx_r.sh
bx_echoget() { echo 42; }
i=0; while [ $i -lt 10000 ]; do
ans=$(bx_echoget)
i=$((i+1)); done; echo ans=$ans
'
done
exit
#MY TEST OUTPUT
1
2
ans=10002
0.14user 0.00system 0:00.14elapsed 99%CPU (0avgtext+0avgdata 1644maxresident)k
0inputs+0outputs (0major+76minor)pagefaults 0swaps
====
Thu Sep 5 17:12:01 CEST 2019
Thu Sep 5 17:12:01 CEST 2019
ans=Thu Sep 5 17:12:01 CEST 2019
====
ans=42
1.95user 1.14system 0:02.81elapsed 110%CPU (0avgtext+0avgdata 1656maxresident)k
0inputs+1256outputs (0major+350075minor)pagefaults 0swaps
1
2
ans=10002
0.92user 0.03system 0:00.96elapsed 99%CPU (0avgtext+0avgdata 3284maxresident)k
0inputs+0outputs (0major+159minor)pagefaults 0swaps
====
Thu Sep 5 17:12:05 CEST 2019
Thu Sep 5 17:12:05 CEST 2019
ans=Thu Sep 5 17:12:05 CEST 2019
====
ans=42
5.20user 2.40system 0:06.96elapsed 109%CPU (0avgtext+0avgdata 3220maxresident)k
0inputs+1248outputs (0major+949297minor)pagefaults 0swaps
As you can see, you can get uniform call syntax with this, while speeding up
the execution of small shell functions by up to about 14 times due to eliminating the need for captures ($()).

Use a bash nameref.
With bash 4.3 or later you can use variable namerefs:
get() {
    declare -n _get__res="$1"
    case "$2" in
        firstname) _get__res="Kamil" ;;
        lastname) _get__res="Cuk" ;;
    esac
}
get LN lastname
get FN firstname
person="$LN, $FN"
Namerefs can still clash with variables from an outer scope, so give them long names. Here I used an underscore, the function name, two underscores, and then the variable name.
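To illustrate the naming advice, here is a small sketch (bash 4.3+ assumed; get_name and res are illustrative names). The caller uses the short name res, which is exactly the kind of name a nameref called res inside the function would collide with:

```bash
# Requires bash 4.3+. The function and variable names are illustrative.
get_name() {
    local -n _get_name__out="$1"   # namespaced nameref, unlikely to match a caller's variable
    _get_name__out="Cuk, Kamil"
}

res=""          # short caller-side name
get_name res
printf '%s\n' "$res"
```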

Related

How to use the numeric value returned by a function? [duplicate]

I am working with a bash script and I want to execute a function to print a return value:
function fun1(){
return 34
}
function fun2(){
local res=$(fun1)
echo $res
}
When I execute fun2, it does not print "34". Why is this the case?
Although Bash has a return statement, the only thing you can specify with it is the function's own exit status (a value between 0 and 255, 0 meaning "success"). So return is not what you want.
You might want to convert your return statement to an echo statement; that way your function's output can be captured using command substitution, $(), which seems to be exactly what you want.
Here is an example:
function fun1(){
echo 34
}
function fun2(){
local res=$(fun1)
echo $res
}
Another way to get the return value (if you just want to return an integer 0-255) is $?.
function fun1(){
return 34
}
function fun2(){
fun1
local res=$?
echo $res
}
Also, note that you can use the return value to use Boolean logic - like fun1 || fun2 will only run fun2 if fun1 returns a non-0 value. The default return value is the exit value of the last statement executed within the function.
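A short sketch of that Boolean use of the exit status (is_even is a made-up predicate): the function has no return statement, so its status is that of the last command it runs.

```bash
# is_even "returns" true/false through its exit status,
# which is the status of the [ ... ] test run inside it.
is_even() {
    [ $(( $1 % 2 )) -eq 0 ]
}

if is_even 4; then echo "4 is even"; fi
is_even 3 || echo "3 is odd"
is_even 10 && echo "10 is even"
```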
Functions in Bash are not functions like in other languages; they're actually commands. So functions are used as if they were binaries or scripts fetched from your path. From the perspective of your program logic, there shouldn't really be any difference.
Shell commands are connected by pipes (aka streams), and not fundamental or user-defined data types, as in "real" programming languages. There is no such thing like a return value for a command, maybe mostly because there's no real way to declare it. It could occur on the man-page, or the --help output of the command, but both are only human-readable and hence are written to the wind.
When a command wants to get input it reads it from its input stream, or the argument list. In both cases text strings have to be parsed.
When a command wants to return something, it has to echo it to its output stream. Another often practiced way is to store the return value in dedicated, global variables. Writing to the output stream is clearer and more flexible, because it can take also binary data. For example, you can return a BLOB easily:
encrypt() {
gpg -c -o- "$1" # Encrypt data in filename to standard output (asks for a passphrase)
}
encrypt public.dat > private.dat # Write the function result to a file
As others have written in this thread, the caller can also use command substitution $() to capture the output.
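As a sketch of how this looks in practice (compute is a made-up function): the result goes to standard output, where $() picks it up, while anything written to standard error bypasses the capture and can carry diagnostics.

```bash
# compute is a made-up function: result on stdout, progress notes on stderr.
compute() {
    echo "compute: starting" >&2   # diagnostic, not captured by $()
    echo 42                        # the "return value"
}

result=$(compute 2>/dev/null)      # stderr silenced here only to keep the demo quiet
echo "result=$result"
```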
In parallel, the function would "return" the exit code of gpg (GnuPG). Think of the exit code as a bonus that other languages don't have, or, depending on your temperament, as a "Schmutzeffekt" of shell functions. This status is, by convention, 0 on success or an integer in the range 1-255 for something else. To make this clear: return (like exit) can only take a value from 0-255, and values other than 0 are not necessarily errors, as is often asserted.
When you don't provide an explicit value with return, the status is taken from the last command in a Bash statement/function/command and so forth. So there is always a status, and return is just an easy way to provide it.
$(...) captures the text sent to standard output by the command contained within. return does not output to standard output. $? contains the result code of the last command.
fun1 (){
return 34
}
fun2 (){
fun1
local res=$?
echo $res
}
The problem with other answers is they either use a global, which can be overwritten when several functions are in a call chain, or echo which means your function cannot output diagnostic information (you will forget your function does this and the "result", i.e. return value, will contain more information than your caller expects, leading to weird bugs), or eval which is way too heavy and hacky.
The proper way to do this is to put the top level stuff in a function and use a local with Bash's dynamic scoping rule. Example:
func1()
{
ret_val=hi
}
func2()
{
ret_val=bye
}
func3()
{
local ret_val=nothing
echo $ret_val
func1
echo $ret_val
func2
echo $ret_val
}
func3
This outputs
nothing
hi
bye
Dynamic scoping means that ret_val points to a different object, depending on the caller! This is different from lexical scoping, which is what most programming languages use. This is actually a documented feature, just easy to miss, and not very well explained. Here is the documentation for it (emphasis is mine):
Variables local to the function may be declared with the local
builtin. These variables are visible only to the function and the
commands it invokes.
For someone with a C, C++, Python, Java, C#, or JavaScript background, this is probably the biggest hurdle: functions in bash are not functions, they are commands, and behave as such: they can output to stdout/stderr, they can pipe in/out, and they can return an exit code. Basically, there isn't any difference between defining a command in a script and creating an executable that can be called from the command line.
So instead of writing your script like this:
Top-level code
Bunch of functions
More top-level code
write it like this:
# Define your main, containing all top-level code
main()
Bunch of functions
# Call main
main
where main() declares ret_val as local and all other functions return values via ret_val.
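A runnable sketch of that layout, with placeholder functions get_user and get_host standing in for the "bunch of functions":

```bash
# get_user/get_host are placeholders; each "returns" through ret_val.
get_user() { ret_val="alice"; }
get_host() { ret_val="example"; }

main() {
    local ret_val            # callees assign to this local via dynamic scoping
    local user host
    get_user; user=$ret_val
    get_host; host=$ret_val
    echo "$user on $host"
}

main
echo "global ret_val is ${ret_val-unset}"
```

The last line shows that nothing leaked: the assignments made by the callees landed in main's local, which disappears when main returns.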
See also the Unix & Linux question Scope of Local Variables in Shell Functions.
Another, perhaps even better solution depending on situation, is the one posted by ya.teck which uses local -n.
Another way to achieve this is name references (requires Bash 4.3+).
function example {
local -n VAR=$1
VAR=foo
}
example RESULT
echo $RESULT
The return statement sets the exit code of the function, much the same as exit will do for the entire script.
The exit code for the last command is always available in the $? variable.
function fun1(){
return 34
}
function fun2(){
local res=$(fun1)
echo $? # <-- Always echos 0 since the 'local' command passes.
res=$(fun1)
echo $? #<-- Outputs 34
}
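Building on the example above, a common way to keep both the captured output and the exit status is to declare the local first and assign on a separate line. A minimal sketch (the "payload" string is made up):

```bash
fun1() {
    echo "payload"
    return 34
}

fun2() {
    local res status
    res=$(fun1)      # plain assignment: $? is the status of fun1
    status=$?
    echo "res=$res status=$status"
}

fun2
```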
As an add-on to others' excellent posts, here's an article summarizing these techniques:
set a global variable
set a global variable, whose name you passed to the function
set the return code (and pick it up with $?)
'echo' some data (and pick it up with MYVAR=$(myfunction) )
Returning Values from Bash Functions
I like to do the following if running in a script where the function is defined:
POINTER= # Used for function return values
my_function() {
# Do stuff
POINTER="my_function_return"
}
my_other_function() {
# Do stuff
POINTER="my_other_function_return"
}
my_function
RESULT="$POINTER"
my_other_function
RESULT="$POINTER"
I like this, because I can then include echo statements in my functions if I want
my_function() {
echo "-> my_function()"
# Do stuff
POINTER="my_function_return"
echo "<- my_function. $POINTER"
}
The simplest way I can think of is to use echo in the method body like so
get_greeting() {
echo "Hello there, $1!"
}
STRING_VAR=$(get_greeting "General Kenobi")
echo $STRING_VAR
# Outputs: Hello there, General Kenobi!
Instead of calling var=$(func) with the whole function output, you can create a function that modifies the input arguments with eval,
var1="is there"
var2="anybody"
function modify_args() {
    echo "Modifying first argument"
    eval "$1='out'"
    echo "Modifying second argument"
    eval "$2='there?'"
}
modify_args var1 var2
# Prints "Modifying first argument" and "Modifying second argument"
# Sets var1 = out
# Sets var2 = there?
This might be useful in case you need to:
Print to stdout/stderr within the function scope (without returning it)
Return (set) multiple variables.
Git Bash on Windows is using arrays for multiple return values
Bash code:
#!/bin/bash
## A 6-element array used for returning
## values from functions:
declare -a RET_ARR
RET_ARR[0]="A"
RET_ARR[1]="B"
RET_ARR[2]="C"
RET_ARR[3]="D"
RET_ARR[4]="E"
RET_ARR[5]="F"
function FN_MULTIPLE_RETURN_VALUES(){
## Give the positional arguments/inputs
## $1 and $2 some sensible names:
local out_dex_1="$1" ## Output index
local out_dex_2="$2" ## Output index
## Echo for debugging:
echo "Running: FN_MULTIPLE_RETURN_VALUES"
## Here: Calculate output values:
local op_var_1="Hello"
local op_var_2="World"
## Set the return values:
RET_ARR[ $out_dex_1 ]=$op_var_1
RET_ARR[ $out_dex_2 ]=$op_var_2
}
echo "FN_MULTIPLE_RETURN_VALUES EXAMPLES:"
echo "-------------------------------------------"
fn="FN_MULTIPLE_RETURN_VALUES"
out_dex_a=0
out_dex_b=1
eval $fn $out_dex_a $out_dex_b ## <-- Call function
a=${RET_ARR[0]} && echo "RET_ARR[0]: $a "
b=${RET_ARR[1]} && echo "RET_ARR[1]: $b "
echo
## ---------------------------------------------- ##
c="2"
d="3"
FN_MULTIPLE_RETURN_VALUES $c $d ## <--Call function
c_res=${RET_ARR[2]} && echo "RET_ARR[2]: $c_res "
d_res=${RET_ARR[3]} && echo "RET_ARR[3]: $d_res "
echo
## ---------------------------------------------- ##
FN_MULTIPLE_RETURN_VALUES 4 5 ## <--- Call function
e=${RET_ARR[4]} && echo "RET_ARR[4]: $e "
f=${RET_ARR[5]} && echo "RET_ARR[5]: $f "
echo
##----------------------------------------------##
read -p "Press Enter To Exit:"
Expected output:
FN_MULTIPLE_RETURN_VALUES EXAMPLES:
-------------------------------------------
Running: FN_MULTIPLE_RETURN_VALUES
RET_ARR[0]: Hello
RET_ARR[1]: World
Running: FN_MULTIPLE_RETURN_VALUES
RET_ARR[2]: Hello
RET_ARR[3]: World
Running: FN_MULTIPLE_RETURN_VALUES
RET_ARR[4]: Hello
RET_ARR[5]: World
Press Enter To Exit:

Get length of an empty or unset array when “nounset” option is in effect

Due to the fact that Bash, when running in set -o nounset mode (aka set -u), may consider empty arrays as unset regardless of whether they have actually been assigned an empty value, care must be taken when attempting to expand an array — one of the workarounds is to check whether the array length is zero. Not to mention that getting the number of elements in an array is a common operation by itself.
While developing with Bash 4.2.47(1)-release on openSUSE 42.1, I got used to the fact that getting the array size with ${#ARRAY_NAME[@]} succeeds when the array is either empty or unset. However, while checking my script with Bash 4.3.46(1)-release on FreeBSD 10.3, it turned out that this operation may fail with the generic “unbound variable” error message. Providing a default value for the expansion does not seem to work for array length. Providing alternative command chains seems to work, but not inside a function called through a subshell expansion: the function just exits after the first failure. What else can be of any help here?
Consider the following example:
function Size ()
{
    declare VAR="$1"
    declare REF="\${#${VAR}[@]}"
    eval "echo \"${REF}\" || echo 0" 2>/dev/null || echo 0
}
set -u
declare -a MYARRAY
echo "size: ${#MYARRAY[@]}"
echo "size: ${#MYARRAY[@]-0}"
echo "Size: $(Size 'MYARRAY')"
echo -n "Size: "; Size 'MYARRAY'
In openSUSE environment, all echo lines output 0, as expected. In FreeBSD, the same outcome is only possible when the array is explicitly assigned an empty value: MYARRAY=(); otherwise, both inline queries in the first two lines fail, the third line just outputs Size: (meaning that the expansion result is empty), and only the last line succeeds completely thanks to the outer || echo 0 — however passing the result through to the screen is not what is usually intended when trying to obtain array length.
Here is the summary of my observations:
                                      Bash 4.2   Bash 4.3
                                      openSUSE   FreeBSD
counting elements of unset array      OK         FAILED
counting elements of empty array      OK         OK
content expansion of unset array      FAILED     FAILED
content expansion of unset array(*)   OK         OK
content expansion of empty array      FAILED     FAILED
content expansion of empty array(*)   OK         OK
(* with fallback value supplied)
To me, that looks pretty inconsistent. Is there any real future-proof and cross-platform solution for that?
There are known (documented) differences between the Linux and BSD flavors of bash. I would suggest writing your code as per the POSIX standard. You can start here for more information -> www2.opengroup.org.
With that in mind, you can start bash with the --posix command-line option or you can execute the command set -o posix while bash is running. Either will cause bash to conform to the POSIX standard.
The above suggestion will increase the probability of cross-platform consistency.
As a temporary solution, I followed the route suggested by @william-pursell and just unset the nounset option during the query:
function GetArrayLength ()
{
    declare ARRAY_NAME="$1"
    declare INDIRECT_REFERENCE="\${#${ARRAY_NAME}[@]}"
    case "$-" in
        *'u'*)
            set +u
            eval "echo \"${INDIRECT_REFERENCE}\""
            set -u
            ;;
        *)
            eval "echo \"${INDIRECT_REFERENCE}\""
            ;;
    esac
}
(Using if instead of case leads to negligibly slower execution on my test machines. Moreover, case allows matching additional options easily if that would become necessary sometime.)
I also tried exploiting the fact that content expansion (with fallback or replacement value) usually succeeds even for unset arrays:
function GetArrayLength ()
{
    declare ARRAY_NAME="$1"
    declare INDIRECT_REFERENCE="${ARRAY_NAME}[@]"
    if [[ -z "${!INDIRECT_REFERENCE+isset}" ]]; then
        echo 0
    else
        INDIRECT_REFERENCE="\${#${ARRAY_NAME}[@]}"
        eval "echo \"${INDIRECT_REFERENCE}\""
    fi
}
However, it turns out that Bash does not optimize the ${a[@]+b} expansion: execution time clearly increases for larger arrays, although it is the smallest for empty or unset arrays.
Nevertheless, if anyone has a better solution, feel free to post other answers.
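Another workaround sometimes seen, sketched here under assumptions and not verified on the exact builds mentioned above: expand the array only when it is set, using ${ARRAY[@]+"${ARRAY[@]}"}, and count the resulting words. The helper name myarray_length is made up, and it is hard-wired to MYARRAY to keep the sketch free of eval. Like the variant above, it expands all elements, so it will not be fast for large arrays.

```bash
set -u

declare -a MYARRAY            # declared but never assigned

# Hypothetical helper: prints the element count of MYARRAY.
# `set --` only replaces this function's own positional parameters.
myarray_length() {
    set -- ${MYARRAY[@]+"${MYARRAY[@]}"}
    echo "$#"
}

myarray_length                # array never assigned
MYARRAY=(a "b c" d)
myarray_length                # three elements, one of them containing a space
```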

Detect empty command

Consider this PS1
PS1='\n${_:+$? }$ '
Here is the result of a few commands
$ [ 2 = 2 ]
0 $ [ 2 = 3 ]
1 $
1 $
The first line shows no status as expected, and the next two lines show the
correct exit code. However on line 3 only Enter was pressed, so I would like the
status to go away, like line 1. How can I do this?
Here's a funny, very simple possibility: it uses the \# escape sequence of PS1 together with parameter expansions (and the way Bash expands its prompt).
The escape sequence \# expands to the command number of the command to be executed. This is incremented each time a command has actually been executed. Try it:
$ PS1='\# $ '
2 $ echo hello
hello
3 $ # this is a comment
3 $
3 $ echo hello
hello
4 $
Now, each time a prompt is to be displayed, Bash first expands the escape sequences found in PS1, then (provided the shell option promptvars is set, which is the default), this string is expanded via parameter expansion, command substitution, arithmetic expansion, and quote removal.
The trick is then to have an array that will have the k-th field set (to the empty string) whenever the (k-1)-th command is executed. Then, using appropriate parameter expansions, we'll be able to detect when these fields are set and to display the return code of the previous command if the field isn't set. If you want to call this array __cmdnbary, just do:
PS1='\n${__cmdnbary[\#]-$? }${__cmdnbary[\#]=}\$ '
Look:
$ PS1='\n${__cmdnbary[\#]-$? }${__cmdnbary[\#]=}\$ '
0 $ [ 2 = 3 ]
1 $
$ # it seems that it works
$ echo "it works"
it works
0 $
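The two parameter expansions the prompt relies on can be tried outside PS1. In this sketch, seen stands in for __cmdnbary and n for the command number \# (both names and the number 5 are arbitrary):

```bash
unset seen                       # `seen` stands in for __cmdnbary
n=5                              # stands in for the command number \#

first="${seen[n]-status }"       # field unset -> expands to the fallback text
: "${seen[n]=}"                  # assigns the empty string to seen[5]
second="${seen[n]-status }"      # field now set (though empty) -> expands to nothing

printf '[%s] [%s]\n' "$first" "$second"
```

The first expansion corresponds to a prompt shown after a real command, the second to a prompt redrawn after pressing Enter on an empty line, when \# has not advanced.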
To qualify for the shortest answer challenge:
PS1='\n${a[\#]-$? }${a[\#]=}$ '
that's 31 characters.
Don't use this, of course, as a is too trivial a name; also, \$ might be better than $.
Seems you don't like that the initial prompt is 0 $; you can very easily modify this by initializing the array __cmdnbary appropriately: you'll put this somewhere in your configuration file:
__cmdnbary=( '' '' ) # Initialize the field 1!
PS1='\n${__cmdnbary[\#]-$? }${__cmdnbary[\#]=}\$ '
Got some time to play around this weekend. Looking at my earlier (not so good) answer and the other answers, I think this may be the smallest answer.
Place these lines at the end of your ~/.bash_profile:
PS1='$_ret$ '
trapDbg() {
local c="$BASH_COMMAND"
[[ "$c" != "pc" ]] && export _cmd="$c"
}
pc() {
local r=$?
trap "" DEBUG
[[ -n "$_cmd" ]] && _ret="$r " || _ret=""
export _ret
export _cmd=
trap 'trapDbg' DEBUG
}
export PROMPT_COMMAND=pc
trap 'trapDbg' DEBUG
Then open a new terminal and note this desired behavior on BASH prompt:
$ uname
Darwin
0 $
$
$
$ date
Sun Dec 14 05:59:03 EST 2014
0 $
$
$ [ 1 = 2 ]
1 $
$
$ ls 123
ls: cannot access 123: No such file or directory
2 $
$
Explanation:
This is based on trap 'handler' DEBUG and PROMPT_COMMAND hooks.
PS1 is using a variable _ret i.e. PS1='$_ret$ '.
trap command runs only when a command is executed but PROMPT_COMMAND is run even when an empty enter is pressed.
trap command sets a variable _cmd to the actually executed command using BASH internal var BASH_COMMAND.
PROMPT_COMMAND hook sets _ret to "$? " if _cmd is non-empty otherwise sets _ret to "". Finally it resets _cmd var to empty state.
The variable HISTCMD is updated every time a new command is executed. Unfortunately, the value is masked during the execution of PROMPT_COMMAND (I suppose for reasons related to not having history messed up with things which happen in the prompt command). The workaround I came up with is kind of messy, but it seems to work in my limited testing.
# This only works if the prompt has a prefix
# which is displayed before the status code field.
# Fortunately, in this case, there is one.
# Maybe use a no-op prefix in the worst case (!)
PS1_base=$'\n'
# Functions for PROMPT_COMMAND
PS1_update_HISTCMD () {
# If HISTCONTROL contains "ignoredups" or "ignoreboth", this breaks.
# We should not change it programmatically
# (think principle of least astonishment etc)
# but we can always gripe.
case :$HISTCONTROL: in
*:ignoredups:* | *:ignoreboth:* )
echo "PS1_update_HISTCMD(): HISTCONTROL contains 'ignoredups' or 'ignoreboth'" >&2
echo "PS1_update_HISTCMD(): Warning: Please remove this setting." >&2 ;;
esac
# PS1_HISTCMD needs to contain the old value of PS1_HISTCMD2 (a copy of HISTCMD)
PS1_HISTCMD=${PS1_HISTCMD2:-$PS1_HISTCMD}
# PS1_HISTCMD2 needs to be unset for the next prompt to trigger properly
unset PS1_HISTCMD2
}
PROMPT_COMMAND=PS1_update_HISTCMD
# Finally, the actual prompt:
PS1='${PS1_base#foo${PS1_HISTCMD2:=${HISTCMD%$PS1_HISTCMD}}}${_:+${PS1_HISTCMD2:+$? }}$ '
The logic in the prompt is roughly as follows:
${PS1_base#foo...}
This displays the prefix. The stuff in #... is useful only for its side effects. We want to do some variable manipulation without having the values of the variables display, so we hide them in a string substitution. (This will display odd and possibly spectacular things if the value of PS1_base ever happens to begin with foo followed by the current command history index.)
${PS1_HISTCMD2:=...}
This assigns a value to PS1_HISTCMD2 (if it is unset, which we have made sure it is). The substitution would nominally also expand to the new value, but we have hidden it in a ${var#subst} as explained above.
${HISTCMD%$PS1_HISTCMD}
We assign either the value of HISTCMD (when a new entry in the command history is being made, i.e. we are executing a new command) or an empty string (when the command is empty) to PS1_HISTCMD2. This works by trimming from the value of HISTCMD any match on PS1_HISTCMD (using the ${var%subst} suffix removal syntax).
${_:+...}
This is from the question. It will expand to ... something if the value of $_ is set and nonempty (which it is when a command is being executed, but not e.g. if we are performing a variable assignment). The "something" should be the status code (and a space, for legibility) if PS1_HISTCMD2 is nonempty.
${PS1_HISTCMD2:+$? }
There.
'$ '
This is just the actual prompt suffix, as in the original question.
So the key parts are the variables PS1_HISTCMD which remembers the previous value of HISTCMD, and the variable PS1_HISTCMD2 which captures the value of HISTCMD so it can be accessed from within PROMPT_COMMAND, but needs to be unset in the PROMPT_COMMAND so that the ${PS1_HISTCMD2:=...} assignment will fire again the next time the prompt is displayed.
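The suffix-trimming step at the heart of this can be tried in isolation. In the sketch below, plain variables cur and prev stand in for HISTCMD and PS1_HISTCMD, and the numbers are arbitrary:

```bash
# Plain variables stand in for HISTCMD and PS1_HISTCMD; the numbers are arbitrary.
prev=41
cur=42
new_command="${cur%$prev}"     # suffix does not match: value kept

prev=42
empty_enter="${cur%$prev}"     # whole value matches: trimmed to the empty string

printf 'new=[%s] empty=[%s]\n' "$new_command" "$empty_enter"
```

A non-empty result means the history number moved on (a command was run); an empty result means it did not.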
I fiddled for a bit with trying to hide the output from ${PS1_HISTCMD2:=...} but then realized that there is in fact something we want to display anyhow, so just piggyback on that. You can't have a completely empty PS1_base because the shell apparently notices, and does not even attempt to perform a substitution when there is no value; but perhaps you can come up with a dummy value (a no-op escape sequence, perhaps?) if you have nothing else you want to display. Or maybe this could be refactored to run with a suffix instead; but that is probably going to be trickier still.
In response to Anubhava's "smallest answer" challenge, here is the code without comments or error checking.
PS1_base=$'\n'
PS1_update_HISTCMD () { PS1_HISTCMD=${PS1_HISTCMD2:-$PS1_HISTCMD}; unset PS1_HISTCMD2; }
PROMPT_COMMAND=PS1_update_HISTCMD
PS1='${PS1_base#foo${PS1_HISTCMD2:=${HISTCMD%$PS1_HISTCMD}}}${_:+${PS1_HISTCMD2:+$? }}$ '
This is probably not the best way to do this, but it seems to be working
function pc {
foo=$_
fc -l > /tmp/new
if cmp -s /tmp/{new,old} || test -z "$foo"
then
PS1='\n$ '
else
PS1='\n$? $ '
fi
cp /tmp/{new,old}
}
PROMPT_COMMAND=pc
Result
$ [ 2 = 2 ]
0 $ [ 2 = 3 ]
1 $
$
I needed to use the great bash-preexec.sh script.
Although I don't like external dependencies, this was the only thing that helped me avoid having 1 in $? after just pressing Enter without running any command.
This goes to your ~/.bashrc:
__prompt_command() {
    local exit="$?"
    PS1='\u@\h: \w \$ '
    [ -n "$LASTCMD" -a "$exit" != "0" ] && PS1='['${red}$exit$clear"] $PS1"
}
PROMPT_COMMAND=__prompt_command
[ -f ~/.bash-preexec.sh ] && . ~/.bash-preexec.sh
preexec() { LASTCMD="$1"; }
UPDATE: later I was able to find a solution without dependency on .bash-preexec.sh.

returning values from functions in bash [duplicate]

I'd like to return a string from a Bash function.
I'll write the example in java to show what I'd like to do:
public String getSomeString() {
return "tadaa";
}
String variable = getSomeString();
The example below works in bash, but is there a better way to do this?
function getSomeString {
echo "tadaa"
}
VARIABLE=$(getSomeString)
There is no better way I know of. Bash knows only status codes (integers) and strings written to the stdout.
You could have the function take a variable as the first arg and modify the variable with the string you want to return.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "$1='foo bar rab oof'"
}
return_var=''
pass_back_a_string return_var
echo $return_var
Prints "foo bar rab oof".
Edit: added quoting in the appropriate place to allow whitespace in string to address @Luca Borrione's comment.
Edit: As a demonstration, see the following program. This is a general-purpose solution: it even allows you to receive a string into a local variable.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "$1='foo bar rab oof'"
}
return_var=''
pass_back_a_string return_var
echo $return_var
function call_a_string_func() {
local lvar=''
pass_back_a_string lvar
echo "lvar='$lvar' locally"
}
call_a_string_func
echo "lvar='$lvar' globally"
This prints:
+ return_var=
+ pass_back_a_string return_var
+ eval 'return_var='\''foo bar rab oof'\'''
++ return_var='foo bar rab oof'
+ echo foo bar rab oof
foo bar rab oof
+ call_a_string_func
+ local lvar=
+ pass_back_a_string lvar
+ eval 'lvar='\''foo bar rab oof'\'''
++ lvar='foo bar rab oof'
+ echo 'lvar='\''foo bar rab oof'\'' locally'
lvar='foo bar rab oof' locally
+ echo 'lvar='\'''\'' globally'
lvar='' globally
Edit: demonstrating that the original variable's value is available in the function, as was incorrectly criticized by @Xichen Li in a comment.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "echo in pass_back_a_string, original $1 is \$$1"
eval "$1='foo bar rab oof'"
}
return_var='original return_var'
pass_back_a_string return_var
echo $return_var
function call_a_string_func() {
local lvar='original lvar'
pass_back_a_string lvar
echo "lvar='$lvar' locally"
}
call_a_string_func
echo "lvar='$lvar' globally"
This gives output:
+ return_var='original return_var'
+ pass_back_a_string return_var
+ eval 'echo in pass_back_a_string, original return_var is $return_var'
++ echo in pass_back_a_string, original return_var is original return_var
in pass_back_a_string, original return_var is original return_var
+ eval 'return_var='\''foo bar rab oof'\'''
++ return_var='foo bar rab oof'
+ echo foo bar rab oof
foo bar rab oof
+ call_a_string_func
+ local 'lvar=original lvar'
+ pass_back_a_string lvar
+ eval 'echo in pass_back_a_string, original lvar is $lvar'
++ echo in pass_back_a_string, original lvar is original lvar
in pass_back_a_string, original lvar is original lvar
+ eval 'lvar='\''foo bar rab oof'\'''
++ lvar='foo bar rab oof'
+ echo 'lvar='\''foo bar rab oof'\'' locally'
lvar='foo bar rab oof' locally
+ echo 'lvar='\'''\'' globally'
lvar='' globally
All answers above ignore what has been stated in the man page of bash.
All variables declared inside a function will be shared with the calling environment.
All variables declared local will not be shared.
Example code
#!/bin/bash
f()
{
echo function starts
local WillNotExists="It still does!"
DoesNotExists="It still does!"
echo function ends
}
echo $DoesNotExists #Should print empty line
echo $WillNotExists #Should print empty line
f #Call the function
echo $DoesNotExists #Should print It still does!
echo $WillNotExists #Should print empty line
And output
$ sh -x ./x.sh
+ echo
+ echo
+ f
+ echo function starts
function starts
+ local 'WillNotExists=It still does!'
+ DoesNotExists='It still does!'
+ echo function ends
function ends
+ echo It still 'does!'
It still does!
+ echo
Also under pdksh and ksh this script does the same!
Since version 4.3 (released February 2014), Bash has had explicit support for reference variables, or name references (namerefs). They give the same performance and indirection benefits as "eval", may be clearer in your scripts, and make it harder to forget the 'eval' and have to fix the resulting error:
declare [-aAfFgilnrtux] [-p] [name[=value] ...]
typeset [-aAfFgilnrtux] [-p] [name[=value] ...]
Declare variables and/or give them attributes
...
-n Give each name the nameref attribute, making it a name reference
to another variable. That other variable is defined by the value
of name. All references and assignments to name, except for
changing the -n attribute itself, are performed on the variable
referenced by name's value. The -n attribute cannot be applied to
array variables.
...
When used in a function, declare and typeset make each name local,
as with the local command, unless the -g option is supplied...
and also:
PARAMETERS
A variable can be assigned the nameref attribute using the -n option to the
declare or local builtin commands (see the descriptions of declare and local
below) to create a nameref, or a reference to another variable. This allows
variables to be manipulated indirectly. Whenever the nameref variable is
referenced or assigned to, the operation is actually performed on the variable
specified by the nameref variable's value. A nameref is commonly used within
shell functions to refer to a variable whose name is passed as an argument to
the function. For instance, if a variable name is passed to a shell function
as its first argument, running
declare -n ref=$1
inside the function creates a nameref variable ref whose value is the variable
name passed as the first argument. References and assignments to ref are
treated as references and assignments to the variable whose name was passed as
$1. If the control variable in a for loop has the nameref attribute, the list
of words can be a list of shell variables, and a name reference will be
established for each word in the list, in turn, when the loop is executed.
Array variables cannot be given the -n attribute. However, nameref variables
can reference array variables and subscripted array variables. Namerefs can be
unset using the -n option to the unset builtin. Otherwise, if unset is executed
with the name of a nameref variable as an argument, the variable referenced by
the nameref variable will be unset.
For example (EDIT 2, thank you Ron: I namespaced (prefixed) the function-internal variable name to minimize clashes with external variables, which should finally address the issue Karsten raised in the comments):
# $1 : string; your variable to contain the return value
function return_a_string () {
declare -n ret=$1
local MYLIB_return_a_string_message="The date is "
MYLIB_return_a_string_message+=$(date)
ret=$MYLIB_return_a_string_message
}
and testing this example:
$ return_a_string result; echo $result
The date is 20160817
Note that the bash "declare" builtin, when used in a function, makes the declared variable "local" by default, and "-n" can also be used with "local".
I prefer to distinguish "important declare" variables from "boring local" variables, so using "declare" and "local" in this way acts as documentation.
EDIT 1 - (Response to comment below by Karsten) - I cannot add comments below any more, but Karsten's comment got me thinking, so I did the following test which WORKS FINE, AFAICT - Karsten if you read this, please provide an exact set of test steps from the command line, showing the problem you assume exists, because these following steps work just fine:
$ return_a_string ret; echo $ret
The date is 20170104
(I ran this just now, after pasting the above function into a bash term - as you can see, the result works just fine.)
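The man page excerpt above also says that a nameref can refer to an array variable. Here is a minimal sketch of that, assuming bash 4.3 or later; the function name fill_colors, the array name palette and the prefixed nameref name are made up for illustration:

```bash
#!/bin/bash
# Hypothetical function: fills the caller's array, whose name is passed as $1.
fill_colors() {
    # Nameref to the caller's variable. The prefix keeps the internal
    # name unlikely to clash with a name the caller passes in.
    local -n _fill_colors_out=$1
    _fill_colors_out=(red green blue)
}

declare -a palette
fill_colors palette
echo "${#palette[@]} colors, first is ${palette[0]}"
```

No subshell is involved, so the array is filled directly in the caller's scope.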
Like bstpierre above, I use and recommend the use of explicitly naming output variables:
function some_func() # OUTVAR ARG1
{
local _outvar=$1
local _result # Use some naming convention to avoid OUTVARs to clash
... some processing ....
eval $_outvar=\$_result # Instead of just =$_result
}
Note the quoting of the $ in \$_result. It prevents the content of $_result from being interpreted as shell special characters. I have found that this is an order of magnitude faster than the result=$(some_func "arg1") idiom of capturing an echo. The speed difference seems even more notable using bash on MSYS, where capturing stdout from function calls is almost catastrophic.
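If you want to check the speed difference on your own system, a rough timing sketch could look like this. The two functions by_echo and by_outvar and the loop count are made up for the comparison, and the timings will vary from machine to machine:

```bash
#!/bin/bash
# Two hypothetical functions that hand back the same value in different ways.
by_echo()   { echo "value"; }                                       # caller captures stdout
by_outvar() { local _bo_result="value"; eval "$1=\$_bo_result"; }   # caller passes a name

n=500

# Command substitution: one subshell per call.
time for ((i = 0; i < n; i++)); do r1=$(by_echo); done

# Named output variable: no subshell.
time for ((i = 0; i < n; i++)); do by_outvar r2; done

echo "r1=$r1 r2=$r2"
```

The time keyword writes its report to stderr, so the two results can still be compared on stdout.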
It's OK to pass in a local variable, since locals are dynamically scoped in bash:
function another_func() # ARG
{
local result
some_func result "$1"
echo result is $result
}
You could also capture the function output:
#!/bin/bash
function getSomeString() {
echo "tadaa!"
}
return_var=$(getSomeString)
echo $return_var
# Alternative syntax:
return_var=`getSomeString`
echo $return_var
Looks weird, but is better than using global variables IMHO. Passing parameters works as usual, just put them inside the parentheses or backticks.
The most straightforward and robust solution is to use command substitution, as other people wrote:
assign()
{
local x
x="Test"
echo "$x"
}
x=$(assign) # This assigns string "Test" to x
The downside is performance as this requires a separate process.
The other technique suggested in this topic, namely passing the name of a variable to assign to as an argument, has side effects, and I wouldn't recommend it in its basic form. The problem is that you will probably need some variables in the function to calculate the return value, and it may happen that the name of the variable intended to store the return value will interfere with one of them:
assign()
{
local x
x="Test"
eval "$1=\$x"
}
assign y # This assigns string "Test" to y, as expected
assign x # This will NOT assign anything to x in this scope
# because the name "x" is declared as local inside the function
You might, of course, not declare internal variables of the function as local, but you really should always do it as otherwise you may, on the other hand, accidentally overwrite an unrelated variable from the parent scope if there is one with the same name.
One possible workaround is an explicit declaration of the passed variable as global:
assign()
{
local x
eval declare -g $1
x="Test"
eval "$1=\$x"
}
If the name "x" is passed as an argument, the second line of the function body will overwrite the previous local declaration. But the names themselves might still interfere, so if you intend to use the value previously stored in the passed variable before writing the return value there, be aware that you must copy it into another local variable at the very beginning; otherwise the result will be unpredictable!
Besides, this only works in Bash 4.2 and later, where declare -g was introduced. More portable code might use explicit conditional constructs with the same effect:
assign()
{
if [[ $1 != x ]]; then
local x
fi
x="Test"
eval "$1=\$x"
}
Perhaps the most elegant solution is just to reserve one global name for function return values and
use it consistently in every function you write.
As previously mentioned, the "correct" way to return a string from a function is with command substitution. In the event that the function also needs to output to the console (as @Mani mentions above), create a temporary fd at the beginning of the function and redirect to the console. Close the temporary fd before returning your string.
#!/bin/bash
# file: func_return_test.sh
returnString() {
exec 3>&1 >/dev/tty
local s=$1
s=${s:="some default string"}
echo "writing directly to console"
exec 3>&-
echo "$s"
}
my_string=$(returnString "$*")
echo "my_string: [$my_string]"
executing script with no params produces...
# ./func_return_test.sh
writing directly to console
my_string: [some default string]
hope this helps people
-Andy
You could use a global variable:
declare globalvar='some string'
string ()
{
eval "$1='some other string'"
} # ---------- end of function string ----------
string globalvar
echo "'${globalvar}'"
This gives
'some other string'
To illustrate my comment on Andy's answer, with additional file descriptor manipulation to avoid use of /dev/tty:
#!/bin/bash
exec 3>&1
returnString() {
exec 4>&1 >&3
local s=$1
s=${s:="some default string"}
echo "writing to stdout"
echo "writing to stderr" >&2
exec >&4-
echo "$s"
}
my_string=$(returnString "$*")
echo "my_string: [$my_string]"
Still nasty, though.
The way you have it is the only way to do this without breaking scope. Bash doesn't have a concept of return types, just exit codes and file descriptors (stdin/out/err, etc)
Addressing Vicky Ronnen's heads-up, consider the following code:
function use_global
{
eval "$1='changed using a global var'"
}
function capture_output
{
echo "always changed"
}
function test_inside_a_func
{
local _myvar='local starting value'
echo "3. $_myvar"
use_global '_myvar'
echo "4. $_myvar"
_myvar=$( capture_output )
echo "5. $_myvar"
}
function only_difference
{
local _myvar='local starting value'
echo "7. $_myvar"
local use_global '_myvar'
echo "8. $_myvar"
local _myvar=$( capture_output )
echo "9. $_myvar"
}
declare myvar='global starting value'
echo "0. $myvar"
use_global 'myvar'
echo "1. $myvar"
myvar=$( capture_output )
echo "2. $myvar"
test_inside_a_func
echo "6. $_myvar" # this was local inside the above function
only_difference
will give
0. global starting value
1. changed using a global var
2. always changed
3. local starting value
4. changed using a global var
5. always changed
6.
7. local starting value
8. local starting value
9. always changed
The normal scenario is probably the syntax used in test_inside_a_func, so you can use both methods in the majority of cases. Capturing the output is the safer method, though: it works in any situation and mimics the return value of a function in other languages, as Vicky Ronnen correctly pointed out.
The options have been all enumerated, I think. Choosing one may come down to a matter of the best style for your particular application, and in that vein, I want to offer one particular style I've found useful. In bash, variables and functions are not in the same namespace. So, treating the variable of the same name as the value of the function is a convention that I find minimizes name clashes and enhances readability, if I apply it rigorously. An example from real life:
UnGetChar=
function GetChar() {
# assume failure
GetChar=
# if someone previously "ungot" a char
if ! [ -z "$UnGetChar" ]; then
GetChar="$UnGetChar"
UnGetChar=
return 0 # success
# else, if not at EOF
elif IFS= read -N1 GetChar ; then
return 0 # success
else
return 1 # EOF
fi
}
function UnGetChar(){
UnGetChar="$1"
}
And, an example of using such functions:
function GetToken() {
# assume failure
GetToken=
# if at end of file
if ! GetChar; then
return 1 # EOF
# if start of comment
elif [[ "$GetChar" == "#" ]]; then
while [[ "$GetChar" != $'\n' ]]; do
GetToken+="$GetChar"
GetChar
done
UnGetChar "$GetChar"
# if start of quoted string
elif [ "$GetChar" == '"' ]; then
# ... et cetera
As you can see, the return status is there for you to use when you need it, or ignore if you don't. The "returned" variable can likewise be used or ignored, but of course only after the function is invoked.
Of course, this is only a convention. You are free to fail to set the associated value before returning (hence my convention of always nulling it at the start of the function) or to trample its value by calling the function again (possibly indirectly). Still, it's a convention I find very useful if I find myself making heavy use of bash functions.
As opposed to the sentiment that this is a sign one should e.g. "move to perl", my philosophy is that conventions are always important for managing the complexity of any language whatsoever.
In my programs, by convention, this is what the pre-existing $REPLY variable is for, which read uses for that exact purpose.
function getSomeString {
REPLY="tadaa"
}
getSomeString
echo $REPLY
This echoes
tadaa
But to avoid conflicts, any other global variable will do.
declare result
function getSomeString {
result="tadaa"
}
getSomeString
echo $result
If that isn’t enough, I recommend Markarian451’s solution.
The key problem of any 'named output variable' scheme where the caller can pass in the variable name (whether using eval or declare -n) is inadvertent aliasing, i.e. name clashes: From an encapsulation point of view, it's awful to not be able to add or rename a local variable in a function without checking ALL the function's callers first to make sure they're not wanting to pass that same name as the output parameter. (Or in the other direction, I don't want to have to read the source of the function I'm calling just to make sure the output parameter I intend to use is not a local in that function.)
The only way around that is to use a single dedicated output variable like REPLY (as suggested by Evi1M4chine) or a convention like the one suggested by Ron Burk.
However, it's possible to have functions use a fixed output variable internally, and then add some sugar over the top to hide this fact from the caller, as I've done with the call function in the following example. Consider this a proof of concept, but the key points are
The function always assigns the return value to REPLY, and can also return an exit code as usual
From the perspective of the caller, the return value can be assigned to any variable (local or global) including REPLY (see the wrapper example). The exit code of the function is passed through, so using them in e.g. an if or while or similar constructs works as expected.
Syntactically the function call is still a single simple statement.
The reason this works is because the call function itself has no locals and uses no variables other than REPLY, avoiding any potential for name clashes. At the point where the caller-defined output variable name is assigned, we're effectively in the caller's scope (technically in the identical scope of the call function), rather than in the scope of the function being called.
#!/bin/bash
function call() { # var=func [args ...]
REPLY=; "${1#*=}" "${@:2}"; eval "${1%%=*}=\$REPLY; return $?"
}
function greet() {
case "$1" in
us) REPLY="hello";;
nz) REPLY="kia ora";;
*) return 123;;
esac
}
function wrapper() {
call REPLY=greet "$@"
}
function main() {
local a b c d
call a=greet us
echo "a='$a' ($?)"
call b=greet nz
echo "b='$b' ($?)"
call c=greet de
echo "c='$c' ($?)"
call d=wrapper us
echo "d='$d' ($?)"
}
main
Output:
a='hello' (0)
b='kia ora' (0)
c='' (123)
d='hello' (0)
You can echo a string, but catch it by piping (|) the function to something else.
You can do it with expr, though ShellCheck reports this usage as deprecated.
bash pattern to return both scalar and array value objects:
definition
url_parse() { # parse 'url' into: 'url_host', 'url_port', ...
local "$@" # inject caller 'url' argument in local scope
local url_host="..." url_path="..." # calculate 'url_*' components
declare -p ${!url_*} # return only 'url_*' object fields to the caller
}
invocation
main() { # invoke url parser and inject 'url_*' results in local scope
eval "$(url_parse url=http://host/path)" # parse 'url'
echo "host=$url_host path=$url_path" # use 'url_*' components
}
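The "..." placeholders above leave the actual parsing open. As a runnable sketch of the same declare -p / eval pattern, here is a version with a deliberately naive parse; the splitting logic, the helper variable rest and the example URL are my own assumptions, and only the overall pattern comes from the answer:

```bash
#!/bin/bash
url_parse() {
    local "$@"                    # expects an argument of the form url=scheme://host/path
    local rest=${url#*://}        # naive: drop everything up to and including "://"
    local url_host=${rest%%/*}    # text before the first slash
    local url_path=/${rest#*/}    # text after the first slash (assumes a path is present)
    declare -p ${!url_*}          # print declarations for the 'url_*' variables only
}

main() {
    # The eval runs the printed declare statements inside main,
    # so url_host and url_path become local to main.
    eval "$(url_parse url=http://example.com/some/path)"
    echo "host=$url_host path=$url_path"
}

main
```

Note that this still pays for one command substitution per call. What it buys you is several return values, including arrays, from a single call.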
Although there were a lot of good answers, they all did not work the way I wanted them to. So here is my solution with these key points:
Helping the forgetful programmer
At least I would struggle to always remember error checking after something like this: var=$(myFunction)
Allows assigning values with newline chars \n
Some solutions do not allow for that as some forgot about the single quotes around the value to assign. Right way: eval "${returnVariable}='${value}'" or even better: see the next point below.
Using printf instead of eval
Just try using something like this myFunction "date && var2" to some of the supposed solutions here. eval will execute whatever is given to it. I only want to assign values so I use printf -v "${returnVariable}" "%s" "${value}" instead.
Encapsulation and protection against variable name collision
If a different user, or at least someone with less knowledge about the function (likely me in a few months' time), is using myFunction, I do not want them to have to know that they must use a global return value name or that some variable names are forbidden. That is why I added a name check at the top of myFunction:
if [[ "${1}" = "returnVariable" ]]; then
echo "Cannot give the ouput to \"returnVariable\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
Note this could also be put into a function itself if you have to check a lot of variables.
If I still want to use the same name (here: returnVariable), I just create a buffer variable, give that to myFunction, and then copy the value to returnVariable.
So here it is:
myFunction():
myFunction() {
if [[ "${1}" = "returnVariable" ]]; then
echo "Cannot give the ouput to \"returnVariable\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
if [[ "${1}" = "value" ]]; then
echo "Cannot give the ouput to \"value\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
local returnVariable="${1}"
local value=$'===========\nHello World\n==========='
echo "setting the returnVariable now..."
printf -v "${returnVariable}" "%s" "${value}"
}
Test cases:
var1="I'm not greeting!"
myFunction var1
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var1:\n%s\n" "${var1}"
# Output:
# setting the returnVariable now...
# myFunction(): SUCCESS
# var1:
# ===========
# Hello World
# ===========
returnVariable="I'm not greeting!"
myFunction returnVariable
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "returnVariable:\n%s\n" "${returnVariable}"
# Output
# Cannot give the ouput to "returnVariable" as a variable with the same name is used in myFunction()!
# If that is still what you want to do please do that outside of myFunction()!
# myFunction(): FAILURE
# returnVariable:
# I'm not greeting!
var2="I'm not greeting!"
myFunction "date && var2"
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var2:\n%s\n" "${var2}"
# Output
# setting the returnVariable now...
# ...myFunction: line ..: printf: `date && var2': not a valid identifier
# myFunction(): FAILURE
# var2:
# I'm not greeting!
myFunction var3
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var3:\n%s\n" "${var3}"
# Output
# setting the returnVariable now...
# myFunction(): SUCCESS
# var3:
# ===========
# Hello World
# ===========
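As mentioned above, the name check can be moved into a function of its own when there are many internal names to protect. A minimal sketch of that idea follows; the helper check_outvar, its prefixed internal names and the shortened myFunction2 are hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: fails if the requested output name ($1) equals
# one of the reserved internal names ($2 ...).
check_outvar() {
    local _co_name=$1 _co_reserved
    shift
    for _co_reserved in "$@"; do
        if [[ $_co_name == "$_co_reserved" ]]; then
            echo "Cannot give the output to \"$_co_name\": the name is used internally." >&2
            return 1
        fi
    done
}

myFunction2() {
    # List every local name of this function once, in a single call.
    check_outvar "$1" returnVariable value || return 1
    local returnVariable=$1
    local value=$'line one\nline two'
    printf -v "$returnVariable" '%s' "$value"   # assigns without eval
}

myFunction2 out && printf 'out:\n%s\n' "$out"
myFunction2 value || echo "refused, as intended"
```

The helper receives the names by value and never assigns to a caller variable, so it cannot cause a clash itself.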
#Implement a generic return stack for functions:
STACK=()
push() {
STACK+=( "${1}" )
}
pop() {
export $1="${STACK[${#STACK[@]}-1]}"
unset 'STACK[${#STACK[@]}-1]';
}
#Usage:
my_func() {
push "Hello world!"
push "Hello world2!"
}
my_func ; pop MESSAGE2 ; pop MESSAGE1
echo ${MESSAGE1} ${MESSAGE2}
agt@agtsoft:~/temp$ cat ./fc
#!/bin/sh
fcall='function fcall { local res p=$1; shift; fname $*; eval "$p=$res"; }; fcall'
function f1 {
res=$[($1+$2)*2];
}
function f2 {
local a;
eval ${fcall//fname/f1} a 2 3;
echo f2:$a;
}
a=3;
f2;
echo after:a=$a, res=$res
agt@agtsoft:~/temp$ ./fc
f2:10
after:a=3, res=

In Bash, is it okay for a variable and a function to have the same name?

I have the following code in my ~/.bashrc:
date=$(which date)
date() {
if [[ $1 == -R || $1 == --rfc-822 ]]; then
# Output RFC-822 compliant date string.
# e.g. Wed, 16 Dec 2009 15:18:11 +0100
$date | sed "s/[^ ][^ ]*$/$($date +%z)/"
else
$date "$@"
fi
}
This works fine, as far as I can tell. Is there a reason to avoid having a variable and a function with the same name?
It's alright apart from being confusing. Besides, they are not the same:
$ date=/bin/ls
$ type date
date is hashed (/bin/date)
$ type $date
/bin/ls is /bin/ls
$ moo=foo
$ type $moo
-bash: type: foo: not found
$ function date() { true; }
$ type date
date is a function
date ()
{
true
}
$ which true
/bin/true
$ type true
true is a shell builtin
Whenever you type a command, bash looks in several different places to find that command. The priority is as follows:
shell aliases (help alias)
shell functions (help function)
shell builtins (help)
hashed binaries from $PATH ('leftmost' folders scanned first)
Variables are prefixed with a dollar sign, which makes them different from all of the above. To compare to your example: $date and date are not the same thing. So a variable and a function never really share the same name, because they live in different "namespaces".
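A quick way to convince yourself that the two namespaces are separate; the name greet is made up for the demonstration:

```bash
#!/bin/bash
greet="from the variable"                 # a variable named greet
greet() { echo "from the function"; }     # a function with the same name

echo "$greet"     # expands the variable
greet             # calls the function

unset -v greet    # removes only the variable
greet             # the function is still defined
```

unset -v touches only the variable and unset -f only the function, so neither can remove the other by accident.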
You may find this somewhat confusing, but many scripts define "method variables" at the top of the file. e.g.
SED=/bin/sed
AWK=/usr/bin/awk
GREP=/usr/local/gnu/bin/grep
The common thing to do is type the variable names in capitals. This is useful for two purposes (apart from being less confusing):
There is no $PATH
Checking that all "dependencies" are runnable
You can't really check like this:
if [ "`which binary`" ]; then echo it\'s ok to continue.. ;fi
Because which will give you an error if binary has not yet been hashed (found in a path folder).
Since you always have to use $ to dereference a variable in Bash, you're free to use any name you like.
Beware of overriding a global, though.
See also:
http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_02.html
An alternative to using a variable: use bash's command keyword (see the manual or run help command from a prompt):
date() {
case $1 in
-R|--rfc-2822) command date ... ;;
*) command date "$@" ;;
esac
}
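To see why command matters here: inside a function, calling the same name again would call the function itself and recurse forever, while command skips function lookup. A small self-contained sketch with a wrapper around pwd; the --upper option is invented purely for the demonstration:

```bash
#!/bin/bash
pwd() {
    case $1 in
        # Hypothetical extra option handled by the wrapper itself.
        --upper) command pwd | tr '[:lower:]' '[:upper:]' ;;
        # Everything else goes to the real pwd, bypassing this function.
        *)       command pwd "$@" ;;
    esac
}

pwd            # same output as the real pwd
pwd --upper    # same path, upper-cased
```

With this approach no variable holding the path to the binary is needed at all.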

Resources