Addressable timers in Bash

I am using inotifywait to run a command when a filesystem event happens. I would like this to wait for 5 seconds to see if another filesystem event happens and if another one does, I would like to reset the timer back to five seconds and wait some more. Make sense?
My problem is I'm attacking this in Bash and I don't know how I would do this. In JavaScript, I'd use setTimeout with some code like this:
function doSomething() { ... }

var timer;
function setTimer() {
    window.clearTimeout(timer);
    timer = window.setTimeout(doSomething, 5000);
}

// and then I'd just plug setTimer into the inotifywait loop.
But are there addressable, clearable background timers in Bash?

One idea I've had rattling around is forking out a subshell that sleeps and then runs my desired end command, and then stuffing that in the background. If it's run again, it'll pick up the previous PID and try to nuke it.
As an intended safety feature, after the sleep has finished, the subshell clears $PID to avoid the command being killed mid-execution (note that this assignment happens inside the backgrounded subshell, so the parent loop's $PID is not actually updated):
PID=0
while inotifywait -r test/; do
    [[ $PID -gt 0 ]] && kill -9 $PID
    { sleep 5; PID=0; command; } & PID=$!
done
It's a bit messy but I've tested it and it works. If I create new files in ./test/ it sees that and if $PID isn't zero, it'll kill the previous sleeping command and reset the timer.
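If you want to reuse the pattern elsewhere, here is a generalized sketch (the debounce name and interface are mine, and untested against the loop above) that restarts the timer on every line read from stdin:
# A reusable sketch: debounce an arbitrary command, resetting the timer on
# every event line read from stdin and running the command once things go quiet.
debounce() {
    local delay="$1"; shift
    local pid=0
    while read -r _event; do
        (( pid > 0 )) && kill "$pid" 2>/dev/null   # cancel the pending run
        { sleep "$delay"; "$@"; } &                # schedule a fresh one
        pid=$!
    done
}

# Example usage: run "make css" 5 seconds after the last event settles
# inotifywait -m -r -e close_write test/ | debounce 5 make css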

I provide this answer to illustrate a similar but more complex use case. Note that the code provided by @Oli is included in my answer.
I want to post process a file when it has changed. Specifically I want to invoke dart-sass on a scss file to produce a css file and its map file. Then the css file is compressed.
My problem is that editing/saving the scss source file could be done directly through vim (which uses a backup copy when writing the file) or through SFTP (specifically using macOS Transmit). That means the change could be seen by inotifywait as a CREATE followed by a CLOSE_WRITE,CLOSE pair, or as a single CREATE (due to the RENAME command through SFTP, I think). So I have to launch the processing if I see a CLOSE_WRITE,CLOSE, or a CREATE which is not followed by anything else.
Remarks:
It has to handle multiple concurrent edit/save.
The temporary files used by Transmit of the form <filename>_safe_save_<digits>.scss must not be taken into account.
The version of inotify-tools is 3.20.2.2 and has been compiled from the source (no package manager) to get a recent version with the include option.
#!/usr/bin/bash
declare -A pids

# $1: full path to source file (src_file_full)
# $2: full path to target file (dst_file_full)
function launch_dart() {
    echo "dart"
    /opt/dart-sass/sass "$1" "$2" && /usr/bin/gzip -9 -f -k "$2"
}

inotifywait -e close_write,create --include "\.scss$" -mr assets/css |
    grep -v -P '(?:\w+)_safe_save_(?:\d+)\.scss$' --line-buffered |
    while read dir action file; do
        src_file_full="$dir$file"
        dst_dir="${dir%assets/css/}"
        dst_file="${file%.scss}.css"
        dst_file_full="priv/static/css/${dst_dir%/}${dst_file}"
        echo "'$action' on file '$file' in directory '$dir' ('$src_file_full')"
        echo "dst_dir='$dst_dir', dst_file='$dst_file', dst_file_full='$dst_file_full'"
        # if [ "$action" == "DELETE" ]; then
        #     rm -f "$dst_file_full" "${dst_file_full}.gz" "${dst_file_full}.map"
        if [ "$action" == "CREATE" ]; then
            echo "create. file size: " $(stat -c%s "$src_file_full")
            { sleep 1; pids[$src_file_full]=0; launch_dart "$src_file_full" "$dst_file_full"; } & pids[$src_file_full]=$!
        elif [ "$action" == "CLOSE_WRITE,CLOSE" ]; then
            [[ ${pids[$src_file_full]} -gt 0 ]] && kill -9 ${pids[$src_file_full]}
            launch_dart "$src_file_full" "$dst_file_full"
        fi
    done

Related

BASH How to split my terminal during a long command line, and communicate with process?

First, sorry for the title; it's really difficult to summarize my problem in a catchphrase.
I would like to build a bash script which makes a backup of a targeted directory; this backup will be stored on a server, using ssh.
My strategy is to use rsync to make the backup properly, since this program supports ssh connections.
BUT the problem is, when I use the rsync command to copy heavy data it takes some time, and during this time I want to print a loader.
My question is: how can I print a loader during the rsync process, and close the loader when the copy is finished? I tried to launch rsync in the background with & but I don't know how to communicate with that process.
My script :
#!/bin/bash
#Strat : launch rsync on the local machine
#$1 contains the source directory, ex : cmd my_dir
function loader(){
    local i sp
    sp[0]="."
    sp[1]=".."
    sp[2]="..."
    sp[3]=".."
    sp[4]="."
    for i in "${sp[@]}"
    do
        echo -ne "\b$i"
        sleep 0.1
    done
}
#main
if [ $# -ne 1 ]; then
    help #a function not detailed here
    exit 1
else
    if [ $1 = "-h" ]; then
        help
    else
        echo "==== SAVE PROGRAM ===="
        echo "connection in progress ..."
        sleep 1 #esthetic only
        #I launch the copy, it works fine. rsync is launched in non-verbose mode
        rsync -az $1"/" -e "ssh -p 22" serveradress:"SAVE/" &
        w=$! #pid of the rsync command above
        #some code here to "wait" for rsync; during this time I would like to print
        #a loading animation in my terminal (the loader function).
        while [ #condition ??? ]
        do
            loader
        done
        wait $w
        echo "Copy complete !"
    fi
fi
Thanks for the help.
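One common pattern for this (a sketch, untested, reusing the rsync line and server placeholder from the script above) is to poll the background PID with kill -0 and animate while it is alive, then wait to collect rsync's exit status:
# Launch rsync in the background and remember its PID
rsync -az "$1/" -e "ssh -p 22" serveradress:"SAVE/" &
w=$!

frames=('.' '..' '...' '..' '.')
i=0
while kill -0 "$w" 2>/dev/null; do   # true as long as rsync is still running
    printf '\rsaving%-3s' "${frames[i]}"
    i=$(( (i + 1) % ${#frames[@]} ))
    sleep 0.1
done

wait "$w"                            # reap rsync and collect its exit status
status=$?
printf '\n'
if [ "$status" -eq 0 ]; then
    echo "Copy complete !"
else
    echo "rsync failed with status $status"
fi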

Monitor Pre-existing and new files in a directory with bash

I have a script using inotify-tool.
This script notifies when a new file arrives in a folder. It performs some work with the file, and when done it moves the file to another folder. (It looks something along these lines):
inotifywait -m -e modify "${path}" |
    while read NEWFILE
    do
        work on/with NEWFILE
        move NEWFILE to a new directory
    done
By using inotifywait, one can only monitor new files. A similar procedure using for OLDFILE in path instead of inotifywait will work for existing files:
for OLDFILE in ${path}
do
    work on/with OLDFILE
    move OLDFILE to a new directory
done
I tried combining the two loops by first running the second loop. But if files arrive quickly and in large numbers, there is a chance that files will arrive while the second loop is running. These files will then not be captured by either loop.
Given that files already exists in a folder, and that new files will arrive quickly inside the folder, how can one make sure that the script will catch all files?
Once inotifywait is up and waiting, it will print the message Watches established. to standard error. So you need to go through existing files after that point.
So, one approach is to write something that will process standard error, and when it sees that message, lists all the existing files. You can wrap that functionality in a function for convenience:
function list-existing-and-follow-modify() {
    local path="$1"
    inotifywait --monitor \
                --event modify \
                --format %f \
                -- \
                "$path" \
        2> >( while IFS= read -r line ; do
                  printf '%s\n' "$line" >&2
                  if [[ "$line" = 'Watches established.' ]] ; then
                      for file in "$path"/* ; do
                          if [[ -e "$file" ]] ; then
                              basename "$file"
                          fi
                      done
                      break
                  fi
              done
              cat >&2
            )
}
and then write:
list-existing-and-follow-modify "$path" \
    | while IFS= read -r file ; do
          # ... work on/with "$file"
          # move "$file" to a new directory
      done
Notes:
If you're not familiar with the >(...) notation that I used, it's called "process substitution"; see https://www.gnu.org/software/bash/manual/bash.html#Process-Substitution for details.
The above will now have the opposite race condition from your original one: if a file is created shortly after inotifywait starts up, then list-existing-and-follow-modify may list it twice. But you can easily handle that inside your while-loop by using if [[ -e "$file" ]] to make sure the file still exists before you operate on it (see the sketch after these notes).
I'm a bit skeptical that your inotifywait options are really quite what you want; modify, in particular, seems like the wrong event. But I'm sure you can adjust them as needed. The only change I've made above, other than switching to long options for clarity/explicitness and adding -- for robustness, is to add --format %f so that you get the filenames without extraneous details.
There doesn't seem to be any way to tell inotifywait to use a separator other than newlines, so, I just rolled with that. Make sure to avoid filenames that include newlines.
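For the duplicate-listing race mentioned in the notes, a guarded version of the consumer loop might look like this (a sketch; the destination directory is a placeholder):
list-existing-and-follow-modify "$path" \
    | while IFS= read -r file ; do
          # The function emits bare filenames (--format %f), so prepend the path.
          # Skip names that no longer exist by the time we get to them; this
          # covers the possible double listing described above.
          [[ -e "$path/$file" ]] || continue
          # ... work on/with "$path/$file" ...
          mv -- "$path/$file" /some/new/directory/
      done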
By using inotifywait, one can only monitor new files.
I would ask for a definition of a "new file". The inotifywait man page specifies a list of events, which also lists events like create, delete and delete_self, and inotifywait can also watch "old files" (defined as files existing prior to inotifywait execution) and directories. You specified only a single event, -e modify, which notifies about modification of files within ${path}; it includes modification of both preexisting files and files created after inotifywait execution.
... how can one make sure that the script will catch all files?
Your script is just enough to catch all the events that happen inside the path. If you have no means of synchronization between the part that generates files and the part that receives them, there is nothing you can do and there will always be a race condition. What if your script receives 0% of CPU time and the part that generates the files gets 100% of CPU time? There is no guarantee of CPU time between processes (unless you are using a certified real-time system...). Implement synchronization between them.
You can watch some other event. If the generating side closes files when it is done with them, watch for the close event. Also, you could run work on/with NEWFILE in parallel in the background to speed up execution and reading new files. But if the receiving side is slower than the sending side, i.e. if your script works on NEWFILEs more slowly than new files are generated, there is nothing you can do...
If you have no special characters and spaces in filenames, I would go with:
inotifywait -m -e modify "${path}" |
    while IFS=' ' read -r path event file; do
        lock "${path}"
        work on "${path}/${file}"
        ex. mv "${path}/${file}" ${new_location}
        unlock "${path}"
    done
where lock and unlock are some locking mechanism implemented between your script and the generating part. You can create a communication channel between the-creation-of-files-process and the-processing-of-the-files-process.
I think you could use some transactional file system that would let you "lock" a directory from the other scripts until you are done working on it, but I have no experience in that field.
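As one possible shape for those placeholders (a sketch, not tested here): mkdir is atomic on POSIX filesystems, so a lock directory shared by the producer and this script can serve as a simple mutex:
# lock/unlock built on mkdir; the .lock directory name is an arbitrary choice.
# Both the producer and the consumer would wrap their access in lock/unlock.
lock() {
    # Spin until we manage to create the lock directory
    until mkdir -- "$1/.lock" 2>/dev/null; do
        sleep 0.1
    done
}

unlock() {
    rmdir -- "$1/.lock"
}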
I tried combining the two loops. But if files arrive quickly and in large numbers there is a chance that the files will arrive while the second loop is running.
Run the process_new_files_loop in the background prior to running the process_old_files_loop. Also, it would be nice to make sure (i.e. synchronize) that inotifywait has successfully started before you continue to the processing-existing-files loop, so that there is no race condition between them either.
Maybe a simple example and/or startpoint would be:
work() {
    local file="$1"
    some work "$file"
    mv "$file" "$predefined_path"
}

process_new_files_loop() {
    # let's work on modified files in parallel, so that it is faster
    trap 'wait' INT
    inotifywait -m -e modify "${path}" |
        while IFS=' ' read -r path event file; do
            work "${path}/${file}" &
        done
}

process_old_files_loop() {
    # maybe we should parse in parallel here too?
    # maybe export -f work; find "${path}" -type f | xargs -P0 -n1 -- bash -c 'work $1' -- ?
    find "${path}" -type f |
        while IFS= read -r file; do
            work "${file}"
        done
}

process_new_files_loop &
child=$!
sleep 1
if ! ps -p "$child" >/dev/null 2>&1; then
    echo "ERROR running process_new_files_loop" >&2
    exit 1
fi
process_old_files_loop
wait # wait for process_new_files_loop
If you really care about execution speed and want to do it faster, change to python or to C (or to anything but shell). Bash is not fast; it is a shell, and should be used to interconnect two processes (passing stdout of one to stdin of another). Parsing a stream line by line with while IFS= read -r line is extremely slow in bash and should generally be used as a last resort. Maybe using xargs like xargs -P0 -n1 sh -c "work on $1; mv $1 $path" -- or parallel would be a means to speed things up, but an average python or C program will probably be many times faster.
A simpler solution is to add an ls in front of the inotifywait in a subshell, with awk to create output that looks like inotifywait.
I use this to detect and process existing and new files:
(ls ${path} | awk '{print "'${path}' EXISTS "$1}' && inotifywait -m ${path} -e close_write -e moved_to) |
    while read dir action file; do
        echo $action $dir $file
        # DO MY PROCESSING
    done
So it runs the ls, formats the output and sends it to stdout, then runs the inotifywait in the same subshell, also sending its output to stdout for processing.

Synchronizing Current Directory Between Two Zsh Sessions

I have two iTerm windows running zsh: one I use to edit documents in vim; the other I use to execute shell commands. I would like to synchronize the current working directories of the two sessions. I thought I could do this by writing the new directory to a file ~/.cwd every time I change directories
alias cd="cd; pwd > ~/.cwd"
and creating a shell script ~/.dirsync that monitors the contents of ~/.cwd every second and changes directory if the other shell has updated it.
#!/bin/sh
echo $(pwd) > ~/.cwd
alias cd="cd; echo $(pwd) > ~/.cwd"
while true
do
    if [[ $(pwd) != $(cat ~/.cwd) ]]
    then
        cd $(cat ~/.cwd)
    fi
    sleep 1
done
I would then append the following line of code to the end of my ~/.zshrc.
~/.dirsync &
However, it did not work. I then found out that shell scripts always execute in their own subshell. Does anyone know of a way to make this work?
Caveat emptor: I'm doing this on Ubuntu 10.04 with gnome-terminal, but it should work on any *NIX platform running zsh.
I've also changed things slightly. Instead of mixing "pwd" and "cwd", I've stuck with "pwd" everywhere.
Recording the Present Working Directory
If you want to run a function every time you cd, the preferred way is to use the chpwd function or the more extensible chpwd_functions array. I prefer chpwd_functions since you can dynamically append and remove functions from it.
# Records $PWD to file
function +record_pwd {
    echo "$(pwd)" > ~/.pwd
}

# Removes the PWD record file
function +clean_up_pwd_record {
    rm -f ~/.pwd
}

# Adds +record_pwd to the list of functions executed when "cd" is called
# and records the present directory
function start_recording_pwd {
    if [[ -z $chpwd_functions[(r)+record_pwd] ]]; then
        chpwd_functions=(${chpwd_functions[@]} "+record_pwd")
    fi
    +record_pwd
}

# Removes +record_pwd from the list of functions executed when "cd" is called
# and cleans up the record file
function stop_recording_pwd {
    if [[ -n $chpwd_functions[(r)+record_pwd] ]]; then
        chpwd_functions=("${(@)chpwd_functions:#+record_pwd}")
        +clean_up_pwd_record
    fi
}
Adding a + to the +record_pwd and +clean_up_pwd_record function names is a hack-ish way to hide them from normal use (similarly, the vcs_info hooks do this by prefixing everything with +vi-).
With the above, you would simply call start_recording_pwd to start recording the present working directory every time you change directories. Likewise, you can call stop_recording_pwd to disable that behavior. stop_recording_pwd also removes the ~/.pwd file (just to keep things clean).
By doing things this way, synchronization can easily be made opt-in (since you may not want this for every single zsh session you run).
First Attempt: Using the preexec Hook
Similar to the suggestion of @Celada, the preexec hook gets run before executing a command. This seemed like an easy way to get the functionality you want:
autoload -Uz add-zsh-hook

function my_preexec_hook {
    if [[ -r ~/.pwd ]] && [[ $(pwd) != $(cat ~/.pwd) ]]; then
        cd "$(cat ~/.pwd)"
    fi
}

add-zsh-hook preexec my_preexec_hook
This works... sort of. Since the preexec hook runs before each command, it will automatically change directories before running your next command. However, up until then, the prompt stays in the last working directory, so it tab completes for the last directory, etc. (By the way, a blank line doesn't count as a command.) So, it sort of works, but it's not intuitive.
Second Attempt: Using signals and traps
In order to get a terminal to automatically cd and re-print the prompt, things got a lot more complicated.
After some searching, I found out that $$ (the shell's process ID) does not change in subshells. Thus, a subshell (or background job) can easily send signals to its parent. Combine this with the fact that zsh allows you to trap signals, and you have a means of polling ~/.pwd periodically:
# Used to make sure USR1 signals are not taken as synchronization signals
# unless the terminal has been told to do so
local _FOLLOWING_PWD

# Traps all USR1 signals
TRAPUSR1() {
    # If following the .pwd file and we need to change
    if (($+_FOLLOWING_PWD)) && [[ -r ~/.pwd ]] && [[ "$(pwd)" != "$(cat ~/.pwd)" ]]; then
        # Change directories and redisplay the prompt
        # (Still don't fully understand this magic combination of commands)
        [[ -o zle ]] && zle -R && cd "$(cat ~/.pwd)" && precmd && zle reset-prompt 2>/dev/null
    fi
}

# Sends the shell a USR1 signal every second
function +check_recorded_pwd_loop {
    while true; do
        kill -s USR1 "$$" 2>/dev/null
        sleep 1
    done
}

# PID of the disowned +check_recorded_pwd_loop job
local _POLLING_LOOP_PID

function start_following_recorded_pwd {
    _FOLLOWING_PWD=1
    [[ -n "$_POLLING_LOOP_PID" ]] && return
    # Launch signalling loop as a disowned process
    +check_recorded_pwd_loop &!
    # Record the signalling loop's PID
    _POLLING_LOOP_PID="$!"
}

function stop_following_recorded_pwd {
    unset _FOLLOWING_PWD
    [[ -z "$_POLLING_LOOP_PID" ]] && return
    # Kill the background loop
    kill "$_POLLING_LOOP_PID" 2>/dev/null
    unset _POLLING_LOOP_PID
}
If you call start_following_recorded_pwd, this launches +check_recorded_pwd_loop as a disowned background process. This way, you won't get an annoying "suspended jobs" warning when you go to close your shell. The PID of the loop is recorded (via $!) so it can be stopped later.
The loop just sends the parent shell a USR1 signal every second. This signal gets trapped by TRAPUSR1(), which will cd and reprint the prompt if necessary. I don't understand having to call both zle -R and zle reset-prompt, but that was the magic combination that worked for me.
There is also the _FOLLOWING_PWD flag. Since every terminal will have the TRAPUSR1 function defined, this prevents them from handling that signal (and changing directories) unless you actually specified that behavior.
As with recording the present working directory, you can call stop_following_recorded_pwd to stop the whole auto-cd thing.
Putting both halves together:
function begin_synchronize {
    start_recording_pwd
    start_following_recorded_pwd
}

function end_synchronize {
    stop_recording_pwd
    stop_following_recorded_pwd
}
Finally, you will probably want to do this:
trap 'end_synchronize' EXIT
This will automatically clean up everything just before your terminal exits, thus preventing you from accidentally leaving orphaned signalling loops around.

Is this a valid self-update approach for a bash script?

I'm working on a script that has gotten so complex I want to include an easy option to update it to the most recent version. This is my approach:
set -o errexit

SELF=$(basename $0)
UPDATE_BASE=http://something

runSelfUpdate() {
    echo "Performing self-update..."

    # Download new version
    wget --quiet --output-document=$0.tmp $UPDATE_BASE/$SELF

    # Copy over modes from old version
    OCTAL_MODE=$(stat -c '%a' $0)
    chmod $OCTAL_MODE $0.tmp

    # Overwrite old file with new
    mv $0.tmp $0
    exit 0
}
The script seems to work as intended, but I'm wondering if there might be caveats with this kind of approach. I just have a hard time believing that a script can overwrite itself without any repercussions.
To be more clear, I'm wondering whether bash might read and execute the script line by line, so that after the mv, the exit 0 could be something else from the new script. I think I remember Windows behaving like that with .bat files.
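To make that concern concrete, here is a tiny, hypothetical experiment (not part of my actual script); whether the appended line runs at all depends on how the shell buffers and seeks within the script file, which is exactly the behaviour that doesn't seem to be pinned down anywhere:
#!/bin/bash
# self_append_test.sh -- append a command to this very file while it runs.
# Note: each run permanently adds another line to the file.
echo "before self-modification"
echo 'echo "this line was appended while the script was running"' >> "$0"
echo "after self-modification"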
Update: My original snippet did not include set -o errexit. To my understanding, that should keep me safe from issues caused by wget.
Also, in this case, UPDATE_BASE points to a location under version control (to ease concerns).
Result: Based on the input from these answers, I constructed this revised approach:
runSelfUpdate() {
    echo "Performing self-update..."

    # Download new version
    echo -n "Downloading latest version..."
    if ! wget --quiet --output-document="$0.tmp" $UPDATE_BASE/$SELF ; then
        echo "Failed: Error while trying to wget new version!"
        echo "File requested: $UPDATE_BASE/$SELF"
        exit 1
    fi
    echo "Done."

    # Copy over modes from old version
    OCTAL_MODE=$(stat -c '%a' $SELF)
    if ! chmod $OCTAL_MODE "$0.tmp" ; then
        echo "Failed: Error while trying to set mode on $0.tmp."
        exit 1
    fi

    # Spawn update script
    cat > updateScript.sh << EOF
#!/bin/bash
# Overwrite old file with new
if mv "$0.tmp" "$0"; then
    echo "Done. Update complete."
    rm \$0
else
    echo "Failed!"
fi
EOF

    echo -n "Inserting update process..."
    exec /bin/bash updateScript.sh
}
(At least it doesn't try to continue running after updating itself!)
The thing that makes me nervous about your approach is that you're overwriting the current script (mv $0.tmp $0) as it's running. There are a number of reasons why this will probably work, but I wouldn't bet large amounts that it's guaranteed to work in all circumstances. I don't know of anything in POSIX or any other standard that specifies how the shell processes a file that it's executing as a script.
Here's what's probably going to happen:
You execute the script. The kernel sees the #!/bin/sh line (you didn't show it, but I presume it's there) and invokes /bin/sh with the name of your script as an argument. The shell then uses fopen(), or perhaps open() to open your script, reads from it, and starts interpreting its contents as shell commands.
For a sufficiently small script, the shell probably just reads the whole thing into memory, either explicitly or as part of the buffering done by normal file I/O. For a larger script, it might read it in chunks as it's executing. But either way, it probably only opens the file once, and keeps it open as long as it's executing.
If you remove or rename a file, the actual file is not necessarily immediately erased from disk. If there's another hard link to it, or if some process has it open, the file continues to exist, even though it may no longer be possible for another process to open it under the same name, or at all. The file is not physically deleted until the last link (directory entry) that refers to it has been removed, and no processes have it open. (Even then, its contents won't immediately be erased, but that's going beyond what's relevant here.)
And furthermore, the mv command that clobbers the script file is immediately followed by exit 0.
BUT it's at least conceivable that the shell could close the file and then re-open it by name. I can't think of any good reason for it to do so, but I know of no absolute guarantee that it won't.
And some systems tend to do stricter file locking than most Unix systems do. On Windows, for example, I suspect that the mv command would fail because a process (the shell) has the file open. Your script might fail on Cygwin. (I haven't tried it.)
So what makes me nervous is not so much the small possibility that it could fail, but the long and tenuous line of reasoning that seems to demonstrate that it will probably succeed, and the very real possibility that there's something else I haven't thought of.
My suggestion: write a second script whose one and only job is to update the first. Put the runSelfUpdate() function, or equivalent code, into that script. In your original script, use exec to invoke the update script, so that the original script is no longer running when you update it. If you want to avoid the hassle of maintaining, distributing, and installing two separate scripts, you could have the original script create the update script with a unique name in /tmp; that would also solve the problem of updating the update script. (I wouldn't worry about cleaning up the autogenerated update script in /tmp; that would just reopen the same can of worms.)
Yes, but ... I would recommend you keep a more layered version of your script's history, unless the remote host can also perform version-control with histories. That being said, to respond directly to the code you have posted, see the following comments ;-)
What happens to your system when wget has a hiccup and quietly leaves you with only a partial or otherwise corrupt copy? Your next step does a mv $0.tmp $0, so you've lost your working version. (I hope you have it in version control on the remote!)
You can check to see if wget returned an error:
if ! wget --quiet --output-document=$0.tmp $UPDATE_BASE/$SELF ; then
    echo "error on wget on $UPDATE_BASE/$SELF"
    exit 1
fi
Also, Rule-of-thumb tests will help, i.e.
if (( $(wc -c < $0.tmp) >= $(wc -c < $0) )); then
    mv $0.tmp $0
fi
but are hardly foolproof.
If your $0 could wind up with spaces in it, it's better to surround all references with quotes, like "$0".
To be super bullet-proof, consider checking all command returns AND that OCTAL_MODE has a reasonable value:
OCTAL_MODE=$(stat -c '%a' $0)
case ${OCTAL_MODE:--1} in
    -[1] )
        printf "Error : OCTAL_MODE was empty\n"
        exit 1
        ;;
    777|775|755 ) : nothing ;;
    * )
        printf "Error in OCTAL_MODEs, found value=${OCTAL_MODE}\n"
        exit 1
        ;;
esac

if ! chmod $OCTAL_MODE $0.tmp ; then
    echo "error on chmod $OCTAL_MODE $0.tmp from $UPDATE_BASE/$SELF, can't continue"
    exit 1
fi
I hope this helps.
Very late answer here, but as I just solved this too, I thought it might help someone to post the approach:
#!/usr/bin/env bash
#
set -fb

readonly THISDIR=$(cd "$(dirname "$0")" ; pwd)
readonly MY_NAME=$(basename "$0")
readonly FILE_TO_FETCH_URL="https://your_url_to_downloadable_file_here"
readonly EXISTING_SHELL_SCRIPT="${THISDIR}/somescript.sh"
readonly EXECUTABLE_SHELL_SCRIPT="${THISDIR}/.somescript.sh"

function get_remote_file() {
    readonly REQUEST_URL=$1
    readonly OUTPUT_FILENAME=$2
    readonly TEMP_FILE="${THISDIR}/tmp.file"
    if [ -n "$(which wget)" ]; then
        # download quietly to a temp file and rely on wget's exit status
        wget -O "${TEMP_FILE}" "$REQUEST_URL" > /dev/null 2>&1
        if [[ $? -eq 0 ]]; then
            mv "${TEMP_FILE}" "${OUTPUT_FILENAME}"
            chmod 755 "${OUTPUT_FILENAME}"
        else
            return 1
        fi
    fi
}

function clean_up() {
    # clean up code (if required) that has to execute every time here
    :
}

function self_clean_up() {
    rm -f "${EXECUTABLE_SHELL_SCRIPT}"
}

function update_self_and_invoke() {
    get_remote_file "${FILE_TO_FETCH_URL}" "${EXECUTABLE_SHELL_SCRIPT}"
    if [ $? -ne 0 ]; then
        cp "${EXISTING_SHELL_SCRIPT}" "${EXECUTABLE_SHELL_SCRIPT}"
    fi
    exec "${EXECUTABLE_SHELL_SCRIPT}" "$@"
}

function main() {
    cp "${EXECUTABLE_SHELL_SCRIPT}" "${EXISTING_SHELL_SCRIPT}"
    # your code here
}

if [[ $MY_NAME = \.* ]]; then
    # invoke real main program
    trap "clean_up; self_clean_up" EXIT
    main "$@"
else
    # update myself and invoke updated version
    trap clean_up EXIT
    update_self_and_invoke "$@"
fi

Concurrent or lock access in bash script function

Does anyone know how to lock a function in a bash script?
I wanted to do something like in Java (like synchronized), ensuring that each file saved in the monitored folder is put on hold whenever another file is already using the submit function.
an excerpt from my script:
(...)
ON_EVENT () {
    local date=$1
    local time=$2
    local file=$3
    sleep 5
    echo "$date $time New file created: $file"
    submit $file
}

submit () {
    local file=$1
    python avsubmit.py -f $file -v
    python dbmgr.py -a $file
}

if [ ! -e "$FIFO" ]; then
    mkfifo "$FIFO"
fi

inotifywait -m -e "$EVENTS" --timefmt '%Y-%m-%d %H:%M:%S' --format '%T %f' "$DIR" > "$FIFO" &
INOTIFY_PID=$!

trap "on_exit" 2 3 15

while read date time file
do
    ON_EVENT $date $time $file &
done < "$FIFO"

on_exit
I'm using inotify to monitor a folder for newly saved files. Each file saved (received) is submitted to the VirusTotal service (avsubmit.py) and to ThreatExpert (dbmgr.py).
Concurrent access would be ideal to avoid blocking every new file created in the monitored folder, but locking the submit function should be sufficient.
Thank you guys!
Something like this should work:
if (set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null; then
    trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

    # Your code here

    rm -f "$lockfile"
    trap - INT TERM EXIT
else
    echo "Failed to acquire $lockfile. Held by $(cat $lockfile)"
fi
Any code using rm in combination with trap or a similar facility is inherently flawed against ungraceful kills, panics, system crashes, newbie sysadmins, etc. The flaw is that the lock needs to be manually cleaned after such a catastrophic event for the script to run again. That may or may not be a problem for you. It is a problem for those managing many machines or wishing to have an unplugged vacation once in a while.
A modern solution using a file descriptor lock has been around for a while - I detailed it here and a working example is on GitHub here. If you do not need to track the process ID for monitoring or other reasons, there is an interesting suggestion for a self-lock (I did not try it, and I am not sure of its portability guarantees).
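For reference, a minimal sketch of that style of lock, assuming flock(1) from util-linux is available; the lock path and descriptor number are arbitrary choices:
# Keep FD 9 open on the lock file for the lifetime of the script
exec 9>/tmp/avsubmit.lock
if ! flock -n 9; then                 # try to take an exclusive lock without waiting
    echo "another instance holds the lock" >&2
    exit 1
fi
# ... critical section (e.g. the submit calls) goes here ...
# The lock is released automatically when FD 9 is closed, i.e. when the
# process exits for any reason, so no rm/trap cleanup is required.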
You can use a lock file to determine whether or not the file should be submitted.
Inside your ON_EVENT function, you should check if the appropriate lock file exists before calling the submit function. If it does exist, then return, or sleep and check again later to see if it's gone. If it doesn't exist, then create the lock and call submit. After the submit function completes, then delete the lock file.
See this thread for implementation details.
But I would like files that cannot get the lock to stay on a waiting list (cache), to be submitted then or later.
I currently have something like this:
lockfile="./lock"

on_event() {
    local date=$1
    local time=$2
    local file=$3
    sleep 5
    echo "$date $time New file created: $file"
    if (set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null; then
        trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
        submit_samples $file
        rm -f "$lockfile"
        trap - INT TERM EXIT
    else
        echo "Failed to acquire lockfile: $lockfile."
        echo "Held by $(cat $lockfile)"
    fi
}

submit_samples() {
    local file=$1
    python avsubmit.py -f $file -v
    python dbmgr.py -a $file
}
Thank you once again ...
I had problems with this approach and found a better solution:
Procmail comes with a lockfile command which does what I wanted:
lockfile -5 -r10 /tmp/lock.file
do something very important
rm -f /tmp/lock.file
lockfile will try to create the specified lockfile. If it exists, it will retry in 5 seconds; this will be repeated a maximum of 10 times. If it can create the file, it goes on with the script.
Another solution is the lockfile-progs package in Debian; an example directly from the man page:
Locking a file during a lengthy process:
lockfile-create /some/file
lockfile-touch /some/file &
# Save the PID of the lockfile-touch process
BADGER="$!"
do-something-important-with /some/file
kill "${BADGER}"
lockfile-remove /some/file
If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:
inotifywait -q -m -r -e CLOSE_WRITE --format %w%f $DIR |
parallel -u python avsubmit.py -f {}\; python dbmgr.py -a {}
It will run at most one python per CPU when a file is written (and closed). That way you can bypass all the locking, and you get the added benefit that you avoid a potential race condition where a file is immediately overwritten (how do you make sure that both the first and the second version were checked?).
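If you do want strictly one submission at a time (the locking behaviour the question asks for), a small, untested variation is to cap parallel at a single job slot:
# -j1 limits parallel to one job at a time, serializing the submissions
inotifywait -q -m -r -e CLOSE_WRITE --format %w%f $DIR |
    parallel -u -j1 python avsubmit.py -f {}\; python dbmgr.py -a {}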
You can install GNU Parallel simply by:
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
