bash when is a variable read during a function or loop - bash

In a bash script, take the extreme example below, where myFn() is called 5 minutes before the echo of $inVar (a reference to $myvar) happens. Between the start of the function and the point where it uses $myvar, the variable is updated.
myvar=$(... //some json
alpha=hello )
myFn(){
    local -n inVar=$1
    # wait for 5 mins .... :)
    echo "$inVar"
}
myFn "myvar"
if [ -z "$var" ]
then
    # wait 5 mins
    # but life goes on and this is non blocking
    # so other parts of the script are running
    echo "$myvar"
fi
myvar=$(... // after 2 mins update alpha
alpha=world
)
Since myvar is passed to myFn() by name, when is its value actually read:
at myFn call time (when the function is called/starts),
at the reference copy time (inVar=$1), or
when the echo $inVar occurs?
And is this the same for other constructs such as while, if, etc.?

You're setting inVar as a nameref, so the value is not read until the variable is expanded at the echo statement.
HOWEVER
In your scenario, myFn is "non blocking", meaning you launch it in the background. In this case, the background subshell gets a copy of the current value of myvar. If myvar is updated subsequently, that update happens in the current shell, not in the background shell.
To demonstrate:
$ bash -c '
fn() { local -n y=$1; sleep 2; echo "in background function, y=$y"; }
x=5
fn x &
x=10
wait
'
in background function, y=5
TL;DR: namerefs and background processes don't mix well.
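For contrast, here is a minimal foreground sketch of the nameref behavior (bash 4.3 or later; the names read_late, counter and late_value are made up for illustration). The variable is updated after the nameref is bound but before it is expanded, and the expansion sees the update:

```shell
#!/usr/bin/env bash
# Requires bash 4.3+ for namerefs (local -n).

read_late() {
    local -n ref=$1     # binds the NAME passed in; no value is copied here
    counter=10          # update the target after the nameref is set up
    late_value=$ref     # the value is read here, at expansion time
}

counter=5
read_late counter
echo "late_value=$late_value"
```

Because everything runs in one shell process, the assignment and the expansion touch the same variable. Once the function is started with &, it works on its own copy instead.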

Related

Tcsh Script Last Exit Code ($?) value is resetting

I am running the following script using tcsh. In my while loop, I'm running a C++ program that I created and will return a different exit code depending on certain things. While it returns an exit code of 0, I want the script to increment counter and run the program again.
#!/bin/tcsh
echo "Starting the script."
set counter = 0
while ($? == 0)
# counter ++
./auto $counter
end
I have verified that my program is definitely returning with exit code = 1 after a certain point. However, the condition in the while loop keeps evaluating to true for some reason and running.
I found that if I stick the following line at the end of my loop and then replace the condition check in the while loop with this new variable, it works fine.
while ($return_code == 0)
# counter ++
./auto $counter
set return_code = $?
end
Why is it that I can't just use $? directly? Is another operation underneath the hood performed in between running my custom program and checking the loop condition that's causing $? to change value?
That is peculiar.
I've altered your example to something that I think illustrates the issue more clearly. (Note that $? is an alias for $status.)
#!/bin/tcsh -f
foreach i (1 2 3)
false
# echo false status=$status
end
echo Done status=$status
The output is
Done status=0
If I uncomment the echo command in the loop, the output is:
false status=1
false status=1
false status=1
Done status=0
(Of course the echo in the loop would break the logic anyway, because the echo command completes successfully and sets $status to zero.)
I think what's happening is that the end that terminates the loop is executed as a statement, and it sets $status ($?) to 0.
I see the same behavior with both tcsh and bsd-csh.
Saving the value of $status in another variable immediately after the command is a good workaround -- and arguably just a better way of doing it, since $status is extremely fragile, and will almost literally be clobbered if you look at it.
Note that I've added a -f option to the #! line. This prevents tcsh from sourcing your init file(s) (.cshrc or .tcshrc) and is considered good practice. (That's not the case for sh/bash/ksh/zsh, which assign a completely different meaning to -f.)
A digression: I used tcsh regularly for many years, both as my interactive login shell and for scripting. I would not have anticipated that end would set $status. This is not the first time I've had to find out how tcsh or csh behaves by trial and error and been surprised by the result. It is one of the reasons I switched to bash for interactive and scripting use. I won't tell you to do the same, but you might want to read Tom Christiansen's classic "csh.whynot".
Slightly shorter/simpler explanation:
Recall that in tcsh/csh EVERY command (including shell builtins) returns a status. Therefore $? (an alias for $status) is updated by 'if' statements, loops, assignments, ...
From a practical point of view, it is better to limit direct use of $? to an if statement immediately after the command:
do-something
if ( $status == 0 ) then
...
endif
In all other cases, capture the status in a variable, and use only that variable:
do-something
set something_status = $status
if ( $something_status == 0 ) then
...
endif
To expand on $status: even a condition test in an if statement modifies the status, so the following repeated test on $status will never reach the '$status == 5' branch, even when do-something returns a status of 5:
do-something
if ( $status == 2 ) then
echo FOO
else if ( $status == 5 ) then
echo BAR
endif
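The same capture-first habit carries over to bash, where [ is itself a command that overwrites $?. The following is a minimal bash (not tcsh) sketch, assuming a hypothetical do_something function that returns 5:

```shell
#!/usr/bin/env bash

do_something() { return 5; }   # hypothetical command that exits with status 5

# Naive version: the second test sees the status of the first [ ... ], not 5.
naive=none
do_something
if [ $? -eq 2 ]; then
    naive=FOO
elif [ $? -eq 5 ]; then
    naive=BAR
fi

# Capture-first version: save the status once, then test the saved copy.
captured=none
do_something
rc=$?
if [ "$rc" -eq 2 ]; then
    captured=FOO
elif [ "$rc" -eq 5 ]; then
    captured=BAR
fi

echo "naive=$naive captured=$captured"
```

In the naive branch the first test fails with status 1, so the elif compares 1 with 5 and neither branch runs.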

shell $RANDOM seed not honored in pipelines

This is a strange behavior I can't explain. I want to use shell to generate a predictable random number sequence. I use $RANDOM with a seed. Here is a test program.
RANDOM=15
echo $RANDOM
This works fine by giving the same number every time I run it. But if I add a pipe to this program it gives different results every time. Try the following simplified program.
RANDOM=15
echo $RANDOM | cat
I have found 2 fixes to the problem (making it predictable), but still can't explain why.
Fix 1
RANDOM=15
x=$RANDOM
echo $x | cat
Fix 2
(RANDOM=15
echo $RANDOM) | cat
I tried on Linux and Mac. The behavior is consistent. Can somebody explain?
Pipelines, as in echo $RANDOM | cat, create subshells -- separate processes forked from the parent but not replaced with a different executable image using an exec()-family call. You're observing a difference in behavior between the shell in which RANDOM is explicitly set, and subshells forked from same.
Your workarounds either move the evaluation of $RANDOM out of a subshell into the parent (first case), or move the explicit seed set into the subshell (second case).
Thank you, Charles Duffy, for pointing me in the right direction (subshells). In the bash source code there is a file, variables.c. $RANDOM is a "dynamic variable": a function is called to get its value, and that function re-seeds the random generator the first time $RANDOM is evaluated in a subshell.
// from bash-4.3/variables.c
int
get_random_number ()
{
  int rv, pid;

  /* Reset for command and process substitution. */
  pid = getpid ();
  if (subshell_environment && seeded_subshell != pid)
    {
      seedrand ();  // <<<<==== re-seed!
      seeded_subshell = pid;
    }

  do
    rv = brand ();
  while (rv == last_random_value);
  return rv;
}
The seed is a static variable, so each shell process has its own copy. Re-seeding in the subshell has no effect on the parent. Here is another test case showing that a $RANDOM reference in a subshell has nothing to do with the sequence in the parent shell.
RANDOM=15
echo $RANDOM $RANDOM
RANDOM=15
echo $RANDOM | cat
echo $RANDOM
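The last line prints the first number of the seed-15 sequence (the same as the first number printed by line 2), which shows that the read in the subshell did not advance the parent's generator.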
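A quick way to check the parent-shell side of this is to compare values rather than eyeball them. This is a minimal sketch (the variable names first, again and piped are made up): it seeds the generator three times in the parent and confirms that capturing $RANDOM before the pipeline, as in Fix 1, yields the same number each time. The actual numbers depend on the bash version, so none are shown.

```shell
#!/usr/bin/env bash

RANDOM=15
first=$RANDOM            # first number of the seed-15 sequence

RANDOM=15
again=$RANDOM            # same seed in the same shell: same number

RANDOM=15
x=$RANDOM                # Fix 1: expand $RANDOM in the parent...
piped=$(echo "$x" | cat) # ...and only pass the already-fixed value through the pipe

echo "first=$first again=$again piped=$piped"
```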

Shell: Return value of a non-child process

In shell script I am trying to wait for non-child process. I got reference on how to do it from:
WAIT for "any process" to finish
My shell script structure is:
Main.sh
func1(){
return 1
}
func2(){
# Wait for func1 to finish
while kill -0 "$pid_func1"; do
sleep 0.5
done
}
# Call function 1 in background
func1 &
pid_func1=$!
func2 &
In this case how do I receive the return value of func1 inside function func2?
You generally cannot capture the exit status of non-child processes. You may be able to work something out that involves logging the exit codes to status files and then reading the values back, but otherwise you're not going to be able to capture them.
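The status-file idea might look like the sketch below. It is one possible arrangement, not the only one: the status_file and recorded_status names and the mktemp location are assumptions, and the background job is wrapped in a subshell that writes func1's return value before exiting.

```shell
#!/usr/bin/env bash

status_file=$(mktemp)    # hypothetical location for the recorded status

func1() {
    return 23
}

func2() {
    # Poll until the func1 wrapper process is gone.
    while kill -0 "$pid_func1" 2>/dev/null; do
        sleep 0.1
    done
    # The wrapper wrote the status before exiting, so the file is complete here.
    echo "func1 ret: $(cat "$status_file")"
}

# Run func1 in the background and record its return value in the status file.
( func1; echo "$?" > "$status_file" ) &
pid_func1=$!

func2 &
wait                     # let both background jobs finish

recorded_status=$(cat "$status_file")
rm -f "$status_file"
```

The file is written by the same process whose PID is being polled, so by the time kill -0 starts failing the status is already on disk.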
I used another shell variable to store the return status in this case and checked the value of this shell variable wherever required. A sample shell script that simulates the scenario is below.
#!/bin/bash
func1(){
retvalue=23 # return value which needs to be returned
status_func1=$retvalue # store this value in shell variable
echo "func1 executing"
return $retvalue
}
func2(){
# Not possible to use wait command for pid of func1 as it is not a child of func2
#wait $pid_func1
#ret_func1=$?
while kill -0 "$pid_func1"; do
echo "func1 is still executing"
sleep 0.5
done
echo "func2 executing"
#echo "func1 ret: $ret_func1"
echo "func1 ret: $status_func1"
}
# Main shell script starts here
func1 &
pid_func1=$!
func2 &
Hope it's useful for others who are facing the same issue.

Increment a global variable in Bash

Here's a shell script:
globvar=0
function myfunc {
let globvar=globvar+1
echo "myfunc: $globvar"
}
myfunc
echo "something" | myfunc
echo "Global: $globvar"
When called, it prints out the following:
$ sh zzz.sh
myfunc: 1
myfunc: 2
Global: 1
$ bash zzz.sh
myfunc: 1
myfunc: 2
Global: 1
$ zsh zzz.sh
myfunc: 1
myfunc: 2
Global: 2
The question is: why does this happen, and which behavior is correct?
P.S. I have a strange feeling that the function behind the pipe is called in a forked shell... So, is there a simple workaround?
P.P.S. This function is a simple test wrapper. It runs test application and analyzes its output. Then it increments $PASSED or $FAILED variables. Finally, you get a number of passed/failed tests in global variables. The usage is like:
test-util << EOF | myfunc
input for test #1
EOF
test-util << EOF | myfunc
input for test #2
EOF
echo "Passed: $PASSED, failed: $FAILED"
Korn shell gives the same results as zsh, by the way.
Please see BashFAQ/024. Pipes create subshells in Bash and variables are lost when subshells exit.
Based on your example, I would restructure it something like this:
globvar=0
function myfunc {
    echo $(($1 + 1))
}
globvar=$(myfunc "$globvar")
globvar=$(echo "something" | myfunc "$globvar")
echo "Global: $globvar"
Piping something into myfunc in sh or bash causes a new shell to spawn. You can confirm this by adding a long sleep in myfunc. While it's sleeping call ps and you'll see a subprocess. When the function returns, that sub shell exits without changing the value in the parent process.
If you really need that value to be changed, you'll need to return a value from the function and check $PIPESTATUS after, I guess, like this:
globvar=0
function myfunc {
let globvar=globvar+1
echo "myfunc: $globvar"
return $globvar
}
myfunc
echo "something" | myfunc
globvar=${PIPESTATUS[1]}
echo "Global: $globvar"
The problem is 'which end of a pipeline using built-ins is executed by the original process?'
In zsh, it looks like the last command in the pipeline is executed by the main shell script when the command is a function or built-in.
In Bash (and sh is likely to be a link to Bash if you're on Linux), either both commands are run in sub-shells, or the first command is run by the main process and the others are run by sub-shells.
Clearly, when the function is run in a sub-shell, it does not affect the variable in the parent shell (only the global in the sub-shell).
Consider adding an extra test:
echo Something | { myfunc; echo $globvar; }
echo $globvar
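If the function has to modify the global in bash, one option is to feed it input without a pipeline, so it runs in the main shell. The sketch below is bash-specific (here-strings and process substitution are not POSIX sh features), and the read inside myfunc is an addition for illustration so that the function actually consumes its input:

```shell
#!/usr/bin/env bash

globvar=0

myfunc() {
    local line
    read -r line                  # consume the text that would have been piped in
    globvar=$((globvar + 1))
    echo "myfunc: $globvar (got: $line)"
}

myfunc <<< "first"                # here-string: no pipeline, no subshell for myfunc
myfunc < <(echo "something")      # process substitution: only echo runs in a subshell

echo "Global: $globvar"
```

Only the producer side runs in a separate process here, so both increments land in the parent's globvar.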

Simple timer to measure seconds an operation took to complete

I run my own script to dump databases into files on a nightly basis.
I wanted to count time (in seconds) it takes to dump each database, so I was trying to write some functions to help me achieve it, but I'm running into problems.
I am no expert in bash scripting, so if I'm doing it plain wrong, just say so and ideally suggest an alternative, please.
Here's the script:
#!/bin/bash
declare -i time_start
function get_timestamp {
declare -i time_curr=`date -j -f "%a %b %d %T %Z %Y" "\`date\`" "+%s"`
echo "get_timestamp:" $time_curr
return $time_curr
}
function timer_start {
get_timestamp
time_start=$?
echo "timer_start:" $time_start
}
function timer_stop {
get_timestamp
declare -i time_curr=$?
echo "timer_stop:" $time_curr
declare -i time_diff=$time_curr-$time_start
return $time_diff
}
timer_start
sleep 3
timer_stop
echo $?
The code should really be quite self-explanatory. echo commands are only for debugging.
I expect the output to be something like this:
$ bash timer.sh
get_timestamp: 1285945972
timer_start: 1285945972
get_timestamp: 1285945975
timer_stop: 1285945975
3
Now this is not the case unfortunately. What I get is:
$ bash timer.sh
get_timestamp: 1285945972
timer_start: 116
get_timestamp: 1285945975
timer_stop: 119
3
As you can see, the value that local var time_curr gets from the command is a valid timestamp, but returning this value causes it to be changed to an integer between 0 and 255.
Can someone please explain to me why this is happening?
PS. This obviously is just my timer test script without any other logic.
UPDATE
Just to be perfectly clear, I want this to be part of a bash script very similar to this one, where I want to measure each loop cycle.
Unless of course I can do it with time, then please suggest a solution.
You don't need to do all this. Just run time <yourscript> in the shell.
$? is used to hold the exit status of a command and can only hold a value between 0 and 255. If you pass an exit code outside this range (say, in a C program calling exit(-1)), the shell will still receive a value in that range and set $? accordingly.
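The truncation can be seen directly. In this small sketch (big_return is a made-up name), a function returns the first timestamp from the question; bash keeps only the low 8 bits, which is where the 116 in the question's output comes from:

```shell
#!/usr/bin/env bash

big_return() {
    return 1285945972    # far outside 0..255
}

big_return
truncated=$?             # only the low 8 bits survive

expected=$((1285945972 % 256))
echo "truncated=$truncated expected=$expected"
```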
As a workaround, you could just set a different value in your bash function:
function get_timestamp {
declare -i time_curr=`date -j -f "%a %b %d %T %Z %Y" "\`date\`" "+%s"`
echo "get_timestamp:" $time_curr
get_timestamp_return_value=$time_curr
}
function timer_start {
get_timestamp
#time_start=$?
time_start=$get_timestamp_return_value
echo "timer_start:" $time_start
}
...
I believe you should be able to use the existing "time" function.
After Update to the question:
This was the bit of script from your link which was doing a for loop.
# dump each database in turn
for db in $databases; do
echo $db
$MYSQLDUMP --force --opt --user=$USER --password=$PASSWORD \
--databases $db > "$OUTPUTDIR/$db.bak"
done
You could extract the inner portion of the loop into a new script (call it dump_one_db.sh)
and do this inside the loop:
# dump each database in turn
for db in $databases; do
time dump_one_db.sh $db
done
Make sure to write the output of the time against the db name into some file.
This is happening because return codes must be between 0 and 255; you can't return an arbitrary number. If you'd rather roll your own timer than use the built-in time, change your functions to echo their timestamp and use command substitution ($(...)) to grab the value.
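A reworked version of the question's timer along those lines might look like this. It is a sketch, not a drop-in replacement: it keeps the question's function names, uses plain date +%s for the timestamp, and introduces an elapsed variable for the result.

```shell
#!/usr/bin/env bash

get_timestamp() {
    date +%s                       # print seconds since the epoch on stdout
}

timer_start() {
    time_start=$(get_timestamp)    # capture the printed value, not $?
}

timer_stop() {
    local time_curr
    time_curr=$(get_timestamp)
    echo $((time_curr - time_start))
}

timer_start
sleep 1
elapsed=$(timer_stop)              # timer_stop only reads time_start, so a subshell is fine
echo "elapsed: ${elapsed}s"
```

timer_start is called directly rather than inside $(...), because it has to set time_start in the main shell.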

Resources