What are the side effects of "ulimit -t unlimited" in ksh93? - ksh

I was bitten by this ksh93 bug (also here). Here is a SSCCE close to my use case:
$ cat bug.sh
#!/bin/ksh93
unset a b
c=0
function set_ac { a=1; c=1; }
function set_abc { ( set_ac ; b=1 ) }
set_abc
echo "a=$a b=$b c=$c"
$ ./bug.sh
a=1 b= c=0
Hence, although set_abc calls set_ac in a subshell, the assignment a=1 propagates to the parent shell. There are a few known workarounds and I'm leaning towards the one that says to replace set_abc above with
function set_abc { ( ulimit -t unlimited ; set_ac ; b=1 ) }
This seems to work fine. Now I wonder if there is any side effect of ulimit -t unlimited, other than provoking the subshell to fork (which is the point of the workaround), that could cause me trouble in the future. (FWIW: this is supposed to be run by a user without root privileges.)

The latest ksh93 release's implementation of non-forking/virtual subshells is chock-full of bugs. A subshell is supposed to be an environment that is copied from, but cleanly separated from, the parent shell environment. Forcing the subshell to fork delegates that separation to the kernel, which is very robust. So the only side effects should be that you get (1) slightly slower performance and (2) much fewer bugs.
As for ulimit itself, using the unlimited parameter should not cause any side effects either because that is the default. The only possibility is that a parent shell environment already limited the CPU time with ulimit -t; in that case you don't have permission to set it back to unlimited and it prints an error message. However, it still forces the fork. So to be completely safe, you can use ulimit -t unlimited 2>/dev/null instead, to suppress any possible error message.
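Putting the two together, the workaround from the question would then look like this (a sketch; set_ac and the b=1 assignment are just the placeholders from the SSCCE above):
# Force the subshell to fork; discard the error ulimit prints if the
# parent environment has already lowered the CPU-time limit.
function set_abc { ( ulimit -t unlimited 2>/dev/null ; set_ac ; b=1 ) }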

Bash: read input available (was any key hit?)

Question
I am using Bash 5 and have a long-running loop which needs to check occasionally for various keys the user may have hit. I know how to do this using stty — see my answer below — but it's uglier than it ought to be.
Essentially, I'm looking for a clean way to do this:
keyhit-p() {
    if "read -n1 would block"; then
        return false
    else
        return true
    fi
}
Non-solution: read -t 0
I have read the bash manual and know about read -t 0. That does not do what I want, which is to detect if any input is available. Instead, it only returns true if the user hits ENTER (a complete line of input).
For example:
while true; do
    if read -n1 -t0; then
        echo "This only works if you hit enter"
        break
    fi
done
A working answer, albeit ugly
While the following works, I am hoping someone has a better answer.
#!/bin/bash
# Reset terminal's stty to previous values on exit.
# (Double quotes so $(stty --save) expands now, while the settings are
# still sane, rather than later when the EXIT trap fires.)
trap "stty $(stty --save)" EXIT

keyhit-p() {
    # Return true if input is available on stdin (any key has been hit).
    local sttysave=$(stty --save)
    stty -icanon min 1 time 0   # ⎫
    read -t0                    # ⎬ Ugly: This ought to be atomic so the
    local status=$?             # ⎪ terminal's stty is always restored.
    stty "$sttysave"            # ⎭
    return $status
}
while true; do
    echo -n .
    if ! keyhit-p; then
        continue
    else
        while keyhit-p; do
            read -n1
            echo Key: $REPLY
        done
        break
    fi
done
This alters the user's terminal settings (stty) before the read and attempts to write them back afterward, but does so non-atomically. It's possible for the script to get interrupted and leave the user's terminal in an incorrect state. I'd like to see an answer which solves that problem, ideally using only the tools built in to bash.
A faster, even uglier answer
Another flaw in the above routine is that it takes a lot of CPU time trying to get everything right. It requires calling an external program (stty) three times just to check that nothing has happened. Forks can be expensive in loops. If we dispense with correctness, we can get a routine that runs two orders of magnitude (256×) faster.
#!/bin/bash
# Reset terminal's stty to previous values on exit.
# (Double quotes so the current, still-sane settings are captured here.)
trap "stty $(stty --save)" EXIT
# Set one character at a time input for the whole script.
stty -icanon min 1 time 0

while true; do
    echo -n .
    # We save time by presuming `read -t0` no longer waits for lines.
    # This may cause problems and can be wrong, for example, with ^Z.
    if ! read -t0; then
        continue
    else
        while read -t0; do
            read -n1
            echo Key: $REPLY
        done
        break
    fi
done
Instead of changing to non-canonical mode only during the read test, this script sets it once at the beginning and uses an EXIT trap to restore the terminal when the script exits.
While I like that the code looks cleaner, the atomicity flaw of the original version is exacerbated because the suspend signal (SIGTSTP, i.e. ^Z) isn't handled. If the user's shell is bash, icanon is re-enabled when the process is suspended, but NOT disabled again when the process is foregrounded. That makes read -t0 return FALSE even when keys (other than Enter) are hit. Other user shells may not enable icanon on ^Z as bash does, but that's even worse, as entering commands at the prompt will no longer work as usual.
Additionally, requiring non-canonical mode to be left on all the time may cause other problems as the script gets longer than this trivial example. It is not documented how non-canonical mode is supposed to affect read and other bash built-ins. It seems to work in my tests, but will it always? Chances of running into problems would multiply when calling — or being called by — external programs. Maybe there would be no issues, but it would require tedious testing.
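One partial mitigation for the ^Z problem, sketched below under the assumption that bash delivers SIGCONT to the script when it is foregrounded again, is to re-apply the terminal settings from a CONT trap. This does not make the save/restore atomic; it only re-arms non-canonical mode after a suspend.
#!/bin/bash
sttysave=$(stty --save)                  # settings to restore on exit
trap 'stty "$sttysave"' EXIT
trap 'stty -icanon min 1 time 0' CONT    # re-apply after fg following ^Z
stty -icanon min 1 time 0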

flock fd across child processes : if I use same fd do I risk that it gets assigned to other process lock?

I'm writing a multiprocess bash script that runs several processes continuously. There are some functions I need to launch, and every function has a form similar to the one described in the flock man page:
function1 () {
    #.... preamble, variable initialization
    (
        flock -x -w 10 200 || exit 1
        # ... my commands ....
    ) 200>/var/lock/.`basename $0`$conf"-rtrr.lock"
}
Each function has its own fd number, different from the others, and every function is independent of the others.
The desired behavior is that I should be able to run the same function several times in parallel; the only condition is that only one instance of a given function with a given set of parameters can run at any moment.
It is not a problem if a function execution fails; the parent runs in an infinite loop and launches it again.
For example, consider this as a list of running processes/functions:
OK
function1 a
function1 b
function2 a
function3 b
NO:
function1 a
function1 a
function2 b
function3 b
For each of these I specify a different lockfile name, using something like:
/var/lock/${program_name}-${parameter}-${function_name}.lock
Example lockfile, if function1 is called with a:
/var/lock/program-a-function1.lock
The questions:
1. Using the same fd number across several processes (the ones that launch the same function), do I risk that one child process overwrites the fd mapping of another child? The risk is that a process may end up waiting on the wrong lock.
2. Can I use a variable as the fd, for example a number which is a sort of hash of the parameter?
3. Otherwise, is there a way to avoid using an fd with the flock command?
4. Do you think, for this desired behavior, it would be better to use simple files to acquire and release locks, i.e. creating a file when acquiring, deleting the file when releasing, and having an if on top to check for the lock file's presence?
1. No risk. When a child process opens a file at some file descriptor slot, it will overshadow the inherited file descriptor (if one was inherited into that slot).
2. You can, but it's ill-advised. For maximum portability, you should use something less than 10, e.g. 9 as in the flock manpage example.
3. Yes, as the flock manpage describes, but that implies execing a command, which is not suitable for your case.
4. Sounds needlessly complicated.
If I were you, I'd create lockfiles with > whose names would be derived from $0 (script name), ${FUNCNAME[0]} (function name), and $* (concatenation of the function arguments), and use them with a small file descriptor such as 9, as in the flock manpage. If you use basename on the script name $0, do it once and save the result in a global.
Example code:
#!/bin/bash
script_name="$(basename "$0")"
func1()(
    echo "$BASHPID: Locking $(readlink /dev/fd/9)"
    flock 9 || return 1
    echo "$BASHPID: Locked $(readlink /dev/fd/9)"
    echo "$BASHPID: WORK"
    sleep 1
    echo "$BASHPID: Release $(readlink /dev/fd/9)"
) 9>/var/lock/"$script_name-${FUNCNAME[0]}-$*"
func1 a &
func1 a &
func1 a &
func1 b &
func1 b &
func1 b &
wait
Possible output of the example code:
16993: Locking /run/lock/locks-func1-b
16985: Locking /run/lock/locks-func1-a
16987: Locking /run/lock/locks-func1-a
16995: Locking /run/lock/locks-func1-b
16994: Locking /run/lock/locks-func1-a
16987: Locked /run/lock/locks-func1-a
16987: WORK
16999: Locking /run/lock/locks-func1-b
16995: Locked /run/lock/locks-func1-b
16995: WORK
16987: Release /run/lock/locks-func1-a
16995: Release /run/lock/locks-func1-b
16985: Locked /run/lock/locks-func1-a
16985: WORK
16993: Locked /run/lock/locks-func1-b
16993: WORK
16985: Release /run/lock/locks-func1-a
16993: Release /run/lock/locks-func1-b
16994: Locked /run/lock/locks-func1-a
16994: WORK
16999: Locked /run/lock/locks-func1-b
16999: WORK
16994: Release /run/lock/locks-func1-a
16999: Release /run/lock/locks-func1-b
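If you also want the 10 second wait from your original snippet, the same pattern takes a timeout; a sketch based on the example above (func2 is just a hypothetical second function):
func2()(
    flock -x -w 10 9 || { echo "$BASHPID: could not get lock, giving up" >&2; return 1; }
    echo "$BASHPID: WORK"
    sleep 1
) 9>/var/lock/"$script_name-${FUNCNAME[0]}-$*"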

How to increment a global variable within another bash script

Question,
I want to have a bash script that will have a global variable that can be incremented from other bash scripts.
Example:
I have a script like the following:
#! /bin/bash
export Counter=0
for SCRIPT in /Users/<user>/Desktop/*sh
do
    $SCRIPT
done
echo $Counter
That script will call all the other bash scripts in a folder and those scripts will have something like the following:
if [ "$Output" = "$Check" ]
then
echo "OK"
((Counter++))
I want it to increment the $Counter variable whenever the check prints "OK", and then pass that value back to the initial bash script so I can keep that counter number and have a total at the end.
Any idea on how to go about doing that?
Environment variables propagate in one direction only: from parent to child. Thus, a child process cannot change the value of an environment variable set in its parent.
What you can do is use the filesystem:
export counter_file=$(mktemp "$HOME/.counter.XXXXXX")
for script in ~user/Desktop/*sh; do "$script"; done
...and, in the individual script:
counter_curr=$(< "$counter_file" )
(( ++counter_curr ))
printf '%s\n' "$counter_curr" >"$counter_file"
This isn't currently concurrency-safe, but your parent script as currently written will never call more than one child at a time.
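For completeness, a sketch of how the parent could report the total after its loop (a hypothetical continuation, not part of the original answer):
# All children have run; the accumulated count is the file's content.
total=$(< "$counter_file")
echo "Total OK count: ${total:-0}"
rm -f -- "$counter_file"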
An even easier approach, assuming that the value you're tracking remains relatively small, is to use the file's size as a proxy for the counter's value. To do this, incrementing the counter is as simple as this:
printf '\n' >>"$counter_file"
...and checking its value in O(1) time -- without needing to open the file and read its content -- is as simple as checking the file's size; with GNU stat:
counter=$(stat -c %s "$counter_file")
(With BSD or macOS stat, the equivalent is stat -f %z "$counter_file".)
Note that locking may be required for this to be concurrency-safe if using a filesystem such as NFS which does not correctly implement O_APPEND; see Norman Gray's answer (to which this owes inspiration) for a working implementation.
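If you do need locking there, a sketch of wrapping the append in flock (flock is Linux-specific; counter_file is the variable from the snippet above):
(
    flock 9 || exit 1               # serialize appends across children
    printf '\n' >>"$counter_file"
) 9>"${counter_file}.lock"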
You could source the other scripts, which means they're not running in a sub-process but "inline" in the calling script like this:
#! /bin/bash
export counter=0
for script in /Users/<user>/Desktop/*sh
do
    source "$script"
done
echo $counter
But as pointed out in the comments, I'd only advise using this approach if you control the called scripts yourself. If they, for example, call exit or have variable names that clash with each other, bad things could happen.
As described, you can't do this, since there isn't anything which corresponds to a ‘global variable’ for shell scripts.
As the comment suggests, you'll have to use the filesystem to communicate between scripts.
One simple/crude way of doing what you describe would be to simply have each cooperating script append a line to a file, and the ‘global count’ is the size of this file:
#! /bin/sh -
echo ping >>/tmp/scriptcountfile
then wc -l /tmp/scriptcountfile is the number of times that's happened. Of course, there's a potential race condition there, so something like the following would sequence those accesses:
#! /bin/sh -
(
    flock 9 || exit 1       # wait for the lock; bail out if flock itself fails
    echo 'do stuff...'
    echo ping >>/tmp/scriptcountfile
) 9>/tmp/lockfile
(the flock command is available on Linux, but isn't portable).
Of course, then you can start to do fancier things by having scripts send stuff through pipes and sockets, but that's going somewhat over the top.

Bash script to start Solr deltaimporthandler

I am after a bash script which I can use to trigger a delta import of XML files via CRON. After a bit of digging and modification I have this:
#!/bin/bash
# Bash to initiate Solr Delta Import Handler
# Setup Variables
urlCmd='http://localhost:8080/solr/dataimport?command=delta-import&clean=false'
statusCmd='http://localhost:8080/solr/dataimport?command=status'
outputDir=.
# Operations
wget -O $outputDir/check_status_update_index.txt ${statusCmd}
2>/dev/null
status=`fgrep idle $outputDir/check_status_update_index.txt`
if [[ ${status} == *idle* ]]
then
    wget -O $outputDir/status_update_index.txt ${urlCmd}
    2>/dev/null
fi
Can I get any feedback on this? Is there a better way of doing it? Any optimisations or improvements would be most welcome.
This certainly looks usable. Just to confirm: you intend to run this every X minutes from your crontab? That seems reasonable.
The only major quibble (IMHO) is discarding STDERR information with 2>/dev/null. Of course, it depends on your expectations for this system. If this is for a paying customer or employer, do you want to have to explain to the boss, "gosh, I didn't know I was getting the error message 'Can't connect to host X' for the last 3 months because we redirect STDERR to /dev/null"? If this is for your own project and you're monitoring the work via other channels, then it's not so terrible. But why not capture STDERR to a file and check that there are no errors? As a general idea:
myStdErrLog=/tmp/myProject/myProg.stderr.$(/bin/date +%Y%m%d.%H%M)
wget -O $outputDir/check_status_update_index.txt ${statusCmd} 2> ${myStdErrLog}
if [[ -s ${myStdErrLog} ]] ; then
    mail -s "error on myProg" me@myself.org < ${myStdErrLog}
fi
rm ${myStdErrLog}
Depending on what wget includes in its STDERR output, you may need to filter what is in the StdErrLog to see if there are "real" error messages that you need to have sent to you.
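For instance, a sketch of such a filter; the patterns below are only a guess at wget's usual progress/status chatter and would need adjusting to whatever your wget version and options actually print:
# Keep only lines that don't look like wget's normal progress output
# (hypothetical patterns; adjust to what your wget really emits).
if grep -Ev '^(--|Resolving |Connecting |HTTP request|Length:|Saving to:|$)' \
        "${myStdErrLog}" > "${myStdErrLog}.real"; then
    # grep exits 0 when at least one "real" line survived the filter
    mail -s "error on myProg" me@myself.org < "${myStdErrLog}.real"
fi
rm -f "${myStdErrLog}.real"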
A medium quibble is your use of backticks for command substitution. If you're using double-square-brackets for evaluations, then why not embrace complete ksh93/bash semantics? The only reason to use backticks is if you think you need to be ultra-backwards compatible and that you'll be running this script under the Bourne shell (or possibly one of the stripped-down shells like dash). Backticks have been deprecated in ksh since at least 1993. Try
status=$(fgrep idle $outputDir/check_status_update_index.txt)
The $( ... ) form of command substitution makes it very easy to nest multiple command substitutions, e.g. echo $(echo one $(echo two) ). (Bad example, as the need to nest command substitutions is pretty rare; I can't think of a better one right now.)
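One place where nesting does come up in practice (a sketch; readlink -f is a GNU coreutils extension, so not fully portable):
# Directory that actually contains the running script, symlinks resolved.
script_dir=$(dirname "$(readlink -f "$0")")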
Depending on your situation, but in a large production environment where new software is installed to version-numbered directories, you might want to construct your paths from variables, i.e.
hostName=localhost
portNum=8080
SOLRPATH=/solr
SOLRCMD='delta-import&clean=false'
urlCmd="http://${hostName}:${portNum}${SOLRPATH}/dataimport?command=${SOLRCMD}"
The final, minor quibble ;-). Are you sure ${status} == *idle* does what you want?
Try using something like
case "${status}" in
*idle* ) .... ;;
* ) echo "unknown status = ${status} or similar" 1>&2 ;;
esac
Yes, your if ... fi certainly works, but if you want to start doing more refined processing of the information that you put in your ${status} variable, then case ... esac is the way to go.
EDIT
I agree with @alinsoar that 2>/dev/null on a line by itself will be a no-op. I assumed that it was a formatting issue, but looking at your code in edit mode I see that it does appear to be on its own line. If you really want to discard STDERR messages, then you need cmd ... 2>/dev/null all on one line, OR, as alinsoar advocates, the shell will accept redirections at the front of the line, but again, all on one line ;-).
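That is, something like either of these (a sketch using the variables from your script):
# Redirection at the end of the command line:
wget -O $outputDir/check_status_update_index.txt ${statusCmd} 2>/dev/null
# ...or the same redirection placed at the front of the line:
2>/dev/null wget -O $outputDir/check_status_update_index.txt ${statusCmd}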
IHTH

What does "local -a foo" mean in zsh?

The zsh manual mentions that option -a means ALL_EXPORT:
ALL_EXPORT (-a, ksh: -a)
All parameters subsequently defined are automatically exported.
While export makes the variable available to sub-processes, how then can the same variable foo be local?
In local -a, the -a has the same meaning as it does for typeset:
-a
The names refer to array parameters. An array parameter
may be created this way, but it may not be assigned to in
the typeset statement. When displaying, both normal and
associative arrays are shown.
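A minimal illustration of that (a sketch, not part of the quoted documentation): local -a declares an array that exists only inside the function:
function demo {
    local -a foo            # foo is an array, local to demo
    foo=(one two three)
    print -l -- $foo        # prints the three elements, one per line
}
demo
print "outside: foo has ${#foo} elements"   # 0: foo does not exist here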
I think you might be confused on a number of fronts.
The ALL_EXPORT (-a) setting is for setopt, not local. To flag a variable for export with local, you use local -x.
And you're also confusing directions of propagation :-)
Defining a variable as local will prevent its lifetime from extending beyond the current function (outwards or upwards depending on how your mind thinks).
This does not affect the propagation of the variable to sub-processes run within the function (inwards or downwards).
For example, consider the following scripts, qq.zsh:
function xyz {
    local LOCVAR1
    local -x LOCVAR2
    LOCVAR1=123
    LOCVAR2=456
    GLOBVAR=789
    zsh qq2.zsh
}
xyz
echo locvar1 is $LOCVAR1
echo locvar2 is $LOCVAR2
echo globvar is $GLOBVAR
and qq2.zsh:
echo subshell locvar1 is $LOCVAR1
echo subshell locvar2 is $LOCVAR2
When you run zsh qq.zsh, the output is:
subshell locvar1 is
subshell locvar2 is 456
locvar1 is
locvar2 is
globvar is 789
so you can see that neither local variable survives the return from the function. However, the auto-export of the local variables to a sub-process called within xyz is different: the one marked for export with local -x is available in the child process, the other isn't.
