Using flock in Bash so a Request is Made Only Once

I'm trying to configure my script in such a way that:
If some data isn't available, try to fetch it
If another process is already fetching it, wait for that process to finish
Use the data
From here I found this very nice example of flock:
exec 200>$pidfile
flock -n 200 || exit 1
pid=$$
echo $pid 1>&200
And this fails if it can't acquire the lock (the -n flag).
Can I assume that this means another process has locked the $pidfile, and how can I detect that the lock has been released in a different process?
I understand that wait $pid would wait until that process is complete, so if there's some way to record which process currently holds the lock, or just to detect the unlocking so that other processes know once the data is available, then I think this will work.
Any ideas?

As per the flock(1) man page:
if the lock cannot be immediately acquired, [in the absence of a -w timeout], flock waits until the lock is available
You can use fuser to see which process is holding a file handle.
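For instance, a minimal sketch of using a blocking flock as the wait (the file names and getData are hypothetical stand-ins, not your exact code):
exec 200>"$pidfile"
if flock -n 200; then
    # We won the lock: fetch the data (getData is a hypothetical fetcher)
    getData > "$datafile"
else
    # Someone else is fetching: block here until that process releases the lock
    flock 200
fi
# Either way, the data should be available by now
cat "$datafile"
fuser "$pidfile" would then list the PIDs of the processes that have the lock file open.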

My solution uses two files, pid.temp and data.temp:
backgroundGetData() {
    local data=$1
    # If the global is empty, check the temp file.
    if [ -z "$data" ]; then
        data=$( cat "$DATA_TEMP_FILE" 2>/dev/null )
    fi
    # If the file is also empty, check whether a process is already making the request.
    if [ -z "$data" ]; then
        for i in {1..5}; do
            echo "INFO - Process: $BASHPID - Attempting to lock data temp file" >&2
            local request_pid=$( cat "$PID_FILE" 2>/dev/null )
            if [ -z "$request_pid" ]; then request_pid=0; fi
            local exit_code=1
            if [ "$request_pid" -eq 0 ]; then
                ( flock -n 200 || exit 1
                  echo "INFO - Process: $BASHPID - Fetching data." >&2
                  echo "$BASHPID" > "$PID_FILE"
                  getData > "$DATA_TEMP_FILE"
                ) 200>"$DATA_TEMP_FILE"
                exit_code=$?
            fi
            echo "INFO - Process: $BASHPID - returned $exit_code from lock attempt." >&2
            [ "$request_pid" -ne 0 ] && echo "INFO - Process: $BASHPID - $request_pid is possibly locking" >&2
            if [ "$exit_code" -ne 0 ] && [ "$request_pid" -ne 0 ]; then
                echo "INFO - Process: $BASHPID - waiting on $request_pid to complete" >&2
                tail --pid=${request_pid} -f /dev/null
                echo "INFO - Process: $BASHPID - finished waiting." >&2
                break
            elif [ "$exit_code" -eq 0 ]; then
                break
            else
                sleep 2
            fi
        done
        data=$( cat "$DATA_TEMP_FILE" )
        if [ -z "$data" ]; then
            echo "WARN - Process: $BASHPID - Failed to retrieve data." >&2
        fi
    fi
    echo "$data"
}
And it can be used like so:
DATA=""
DATA_TEMP_FILE="data.temp"
PID_FILE="pid.temp"
$( backgroundGetData $DATA ) & ## Begin making request
doThing() {
if [ -z $DATA ]; then
# Redirect 3 to stdout, then use sterr in backgroundGetData to 3 so that
# logging messages can be shown and output can also be captued in variable.
exec 3>&1
DATA=$( backgroundGetData $DATA 2>&3)
fi
}
for job in "$jobs"; do
doThing &
done
It's working for me, though I'm not 100% sure on how safe it is.
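For comparison, a shorter sketch that leans entirely on a blocking flock and no PID file should avoid the races, since waiters simply queue on the lock; getData and the .lock file name are assumptions here:
getDataOnce() {
    (
        flock 200                             # block until the lock is free
        if [ ! -s "$DATA_TEMP_FILE" ]; then   # first holder does the fetch
            getData > "$DATA_TEMP_FILE"       # getData is the hypothetical fetcher
        fi
    ) 200>"$DATA_TEMP_FILE.lock"
    cat "$DATA_TEMP_FILE"                     # by now the data exists
}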

Related

How to avoid printing an error in the console in a Bash script when executing a command?

How to avoid printing an error in Bash? I want to do something like this: if the user enters a wrong argument (like a "." for example), the script should just exit rather than displaying the error on the terminal. (I've not posted the whole code here; it's a bit long.)
if [ -n "$1" ]; then
    sleep_time=$1
    # it doesn't work, and displays the error on the screen
    sleep $sleep_time > /dev/null
    if [ "$?" -eq 0 ]; then
        measurement $sleep_time
    else
        exit
    fi
# if invalid arguments passed, take the refreshing interval from the user
else
    echo "Proper Usage: $0 refresh_interval(in seconds)"
    read -p "Please Provide the Update Time: " sleep_time
    sleep $sleep_time > /dev/null
    if [ "$?" -eq 0 ]; then
        measurement $sleep_time
    else
        exit
    fi
fi
2>/dev/null will discard any errors. Your code can be simplified like this:
#!/usr/bin/env bash
if [[ $# -eq 0 ]]; then
    echo "Usage: $0 refresh_interval (in seconds)"
    read -p "Please provide time: " sleep_time
else
    sleep_time=$1
fi
sleep "$sleep_time" 2>/dev/null || { echo "Wrong time" >&2; exit 1; }
# everything OK - do stuff here
# ...
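For example, an invalid argument now fails quietly with a single message on stderr (assuming the script is saved as refresh.sh):
$ ./refresh.sh abc
Wrong time
$ echo $?
1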

How to print in the same line as a progress in bash script

I want to print the text in the following manner
Waiting for completion.
Waiting for completion..
Waiting for completion...
[Note: not more than three dots]
The above should be on the same line, updated in a loop.
When the loop condition becomes false, I want to get the following on the same line as well:
Waiting for completion... [OK]
How do I achieve this in bash script?
You should use a carriage return; look at the \r escape in echo's documentation.
For example, maybe you want something like this:
#!/bin/bash
while true; do
    echo -n -e 'Waiting for completion.\r'
    sleep 1
    echo -n -e 'Waiting for completion..\r'
    sleep 1
    echo -n -e 'Waiting for completion...\r'
    sleep 1
    echo -n -e '                          \r'
done
You need the sleep calls because otherwise you would not be able to watch the dots changing.
How about this:
#!/usr/bin/env bash
trap ctrl_c INT
ctrl_c()
{
    flag=1
}
dots()
{
    if [ "$1" -eq 1 ]
    then
        echo .
    fi
    if [ "$1" -eq 2 ]
    then
        echo ..
    fi
    if [ "$1" -eq 3 ]
    then
        echo ...
    fi
}
flag=0
dots_count=1
while [ "$flag" -eq 0 ]
do
    if [ "$dots_count" -eq 4 ]
    then
        dots_count=1
    fi
    printf "\r%sWaiting for completion%s" "$(tput el)" "$(dots "$dots_count")"
    dots_count=$((dots_count + 1))
    sleep 1
done
printf "\r%sWaiting for completion... [OK]\n" "$(tput el)"
It will continuously print "Waiting for completion" followed by up to three dots, all on the same line. When Ctrl-C is pressed, "Waiting for completion... [OK]" is printed.
Use echo's "-n" switch; the next echo will then print on the same line.
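A minimal sketch of that approach:
echo -n "Waiting for completion"
for i in 1 2 3; do
    sleep 1
    echo -n "."    # each -n echo appends to the same line
done
echo " [OK]"       # the final echo supplies the newline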

Create parallel processes and wait for all of them to finish, then redo steps

What I want to do should be pretty simple; on my own I have reached the solution below, and all I need is a few pointers to tell me if this is the way to do it or whether I should refactor anything in the code.
The code below should create a few parallel processes, wait for them to finish executing, then rerun the code again and again...
The script is triggered by a cron job once every 10 minutes; if the script is already running, do nothing, otherwise start the working process.
Any insight is highly appreciated, since I am not that familiar with bash programming.
#!/bin/bash
# paths
THISPATH="$( cd "$( dirname "$0" )" && pwd )"
# make sure we move into the working directory
cd $THISPATH
# console init path
CONSOLEPATH="$( cd ../../ && pwd )/console.php"
# command line arguments
daemon=0
PHPPATH="/usr/bin/php"
help=0
# flag for binary search
LOOKEDFORPHP=0
# arguments init
while getopts d:p:h: opt; do
    case $opt in
        d)
            daemon=$OPTARG
            ;;
        p)
            PHPPATH=$OPTARG
            LOOKEDFORPHP=1
            ;;
        h)
            help=$OPTARG
            ;;
    esac
done
shift $((OPTIND - 1))
# allow only one process
processesLength=$(ps aux | grep -v "grep" | grep -c $THISPATH/send-campaigns-daemon.sh)
if [ ${processesLength:-0} -gt 2 ]; then
    # The process is already running
    exit 0
fi
if [ $help -eq 1 ]; then
    echo "---------------------------------------------------------------"
    echo "| Usage: send-campaigns-daemon.sh                             |"
    echo "| To force PHP CLI binary :                                   |"
    echo "| send-campaigns-daemon.sh -p /path/to/php-cli/binary         |"
    echo "---------------------------------------------------------------"
    exit 0
fi
# php executable path, find it if not provided
if [ $PHPPATH ] && [ ! -f $PHPPATH ] && [ $LOOKEDFORPHP -eq 0 ]; then
    phpVariants=( "php-cli" "php5-cli" "php5" "php" )
    LOOKEDFORPHP=1
    for i in "${phpVariants[@]}"
    do
        which $i >/dev/null 2>&1
        if [ $? -eq 0 ]; then
            PHPPATH=$(which $i)
        fi
    done
fi
if [ ! $PHPPATH ] || [ ! -f $PHPPATH ]; then
    # Did not find PHP
    exit 1
fi
# load options from app
parallelProcessesPerCampaign=3
campaignsAtOnce=10
subscribersAtOnce=300
sleepTime=30
function loadOptions {
    local COMMAND="$PHPPATH $CONSOLEPATH option get_option --name=%s --default=%d"
    parallelProcessesPerCampaign=$(printf "$COMMAND" "system.cron.send_campaigns.parallel_processes_per_campaign" 3)
    campaignsAtOnce=$(printf "$COMMAND" "system.cron.send_campaigns.campaigns_at_once" 10)
    subscribersAtOnce=$(printf "$COMMAND" "system.cron.send_campaigns.subscribers_at_once" 300)
    sleepTime=$(printf "$COMMAND" "system.cron.send_campaigns.pause" 30)
    # each variable now holds a command string; run it and keep its output
    parallelProcessesPerCampaign=$($parallelProcessesPerCampaign)
    campaignsAtOnce=$($campaignsAtOnce)
    subscribersAtOnce=$($subscribersAtOnce)
    sleepTime=$($sleepTime)
}
# define the daemon function that will stay in a loop
function daemon {
    loadOptions
    local pids=()
    local k=0
    local i=0
    local COMMAND="$PHPPATH -q $CONSOLEPATH send-campaigns --campaigns_offset=%d --campaigns_limit=%d --subscribers_offset=%d --subscribers_limit=%d --parallel_process_number=%d --parallel_processes_count=%d --usleep=%d --from_daemon=1"
    while [ $i -lt $campaignsAtOnce ]
    do
        while [ $k -lt $parallelProcessesPerCampaign ]
        do
            parallelProcessNumber=$(( $k + 1 ))
            usleep=$(( $k * 10 + $i * 10 ))
            CMD=$(printf "$COMMAND" $i 1 $(( $subscribersAtOnce * $k )) $subscribersAtOnce $parallelProcessNumber $parallelProcessesPerCampaign $usleep)
            $CMD > /dev/null 2>&1 &
            pids+=($!)
            k=$(( k + 1 ))
        done
        i=$(( i + 1 ))
    done
    waitForPids pids
    sleep $sleepTime
    daemon
}
function daemonize {
    $THISPATH/send-campaigns-daemon.sh -d 1 -p $PHPPATH > /dev/null 2>&1 &
}
function waitForPids {
    stillRunning=0
    for i in "${pids[@]}"
    do
        if ps -p $i > /dev/null
        then
            stillRunning=1
            break
        fi
    done
    if [ $stillRunning -eq 1 ]; then
        sleep 0.5
        waitForPids pids
    fi
    return 0
}
if [ $daemon -eq 1 ]; then
    daemon
else
    daemonize
fi
exit 0
When starting a script, create a lock file so you know that the script is running. When the script finishes, delete the lock file. If somebody kills the process while it is running, the lock file remains forever, so test how old it is and delete it if it is older than a defined value. For example:
#!/bin/bash
# 10 min
LOCK_MAX=600
LOCKFILE=/var/lock/${0##*/}.lock
if [[ -f $LOCKFILE ]] ; then
    TIMEINI=$( stat -c %X "$LOCKFILE" )
    SEGS=$(( $(date +%s) - TIMEINI ))
    if [[ $SEGS -gt $LOCK_MAX ]] ; then
        # report the stale lock here, or something to inform you
        # Kill the old instance ???
        OLDPID=$(<"$LOCKFILE")
        [[ -e /proc/$OLDPID ]] && kill -9 "$OLDPID"
        # Next time the script runs, there is no lock file and it will run.
        rm "$LOCKFILE"
    fi
    exit 65
fi
# Save the PID of this instance in the lock file
echo "$$" > "$LOCKFILE"
### Your code goes here
# Remove the lock file before the script finishes
[[ -e $LOCKFILE ]] && rm "$LOCKFILE"
exit 0
from here:
#!/bin/bash
...
echo PARALLEL_JOBS:${PARALLEL_JOBS:=1}
declare -a tests=($(.../find_what_to_run))
echo "${tests[@]}" | \
    xargs -d' ' -n1 -P${PARALLEL_JOBS} -I {} bash -c ".../run_that {}" || { echo "FAILURE"; exit 1; }
echo "SUCCESS"
and here you can nick the code for portable locking with fuser
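In case that link rots, the fuser-based idea is roughly this (a sketch, not the linked code; the -s silent flag is Linux fuser):
LOCKFILE=/tmp/${0##*/}.lock
# fuser exits 0 if any process still has the lock file open
if fuser -s "$LOCKFILE" 2>/dev/null; then
    echo "another instance is running" >&2
    exit 1
fi
# Hold the lock by keeping the file open for the life of this script;
# once the script dies, no process holds the file and the check passes again
exec 9>"$LOCKFILE"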
Okay, so I guess I can answer my own question with a proper answer that works after many tests.
So here is the final version, simplified, without comments/echos:
#!/bin/bash
sleep 2
DIR="$( cd "$( dirname "$0" )" && pwd )"
FILE_NAME="$( basename "$0" )"
COMMAND_FILE_PATH="$DIR/$FILE_NAME"
if [ ! -f "$COMMAND_FILE_PATH" ]; then
    exit 1
fi
cd $DIR
CONSOLE_PATH="$( cd ../../ && pwd )/console.php"
PHP_PATH="/usr/bin/php"
help=0
LOOKED_FOR_PHP=0
while getopts p:h: opt; do
    case $opt in
        p)
            PHP_PATH=$OPTARG
            LOOKED_FOR_PHP=1
            ;;
        h)
            help=$OPTARG
            ;;
    esac
done
shift $((OPTIND - 1))
if [ $help -eq 1 ]; then
    printf "%s\n" "HELP INFO"
    exit 0
fi
if [ "$PHP_PATH" ] && [ ! -f "$PHP_PATH" ] && [ "$LOOKED_FOR_PHP" -eq 0 ]; then
    php_variants=( "php-cli" "php5-cli" "php5" "php" )
    LOOKED_FOR_PHP=1
    for i in "${php_variants[@]}"
    do
        which $i >/dev/null 2>&1
        if [ $? -eq 0 ]; then
            PHP_PATH="$(which $i)"
            break
        fi
    done
fi
if [ ! "$PHP_PATH" ] || [ ! -f "$PHP_PATH" ]; then
    exit 1
fi
LOCK_BASE_PATH="$( cd ../../../common/runtime && pwd )/shell-pids"
LOCK_PATH="$LOCK_BASE_PATH/send-campaigns-daemon.pid"
function remove_lock {
    if [ -d "$LOCK_PATH" ]; then
        rmdir "$LOCK_PATH" > /dev/null 2>&1
    fi
    exit 0
}
if [ ! -d "$LOCK_BASE_PATH" ]; then
    if ! mkdir -p "$LOCK_BASE_PATH" > /dev/null 2>&1; then
        exit 1
    fi
fi
process_running=0
if mkdir "$LOCK_PATH" > /dev/null 2>&1; then
    process_running=0
else
    process_running=1
fi
if [ $process_running -eq 1 ]; then
    exit 0
fi
trap "remove_lock" 1 2 3 15
COMMAND="$PHP_PATH $CONSOLE_PATH option get_option --name=%s --default=%d"
parallel_processes_per_campaign=$(printf "$COMMAND" "system.cron.send_campaigns.parallel_processes_per_campaign" 3)
campaigns_at_once=$(printf "$COMMAND" "system.cron.send_campaigns.campaigns_at_once" 10)
subscribers_at_once=$(printf "$COMMAND" "system.cron.send_campaigns.subscribers_at_once" 300)
sleep_time=$(printf "$COMMAND" "system.cron.send_campaigns.pause" 30)
parallel_processes_per_campaign=$($parallel_processes_per_campaign)
campaigns_at_once=$($campaigns_at_once)
subscribers_at_once=$($subscribers_at_once)
sleep_time=$($sleep_time)
k=0
i=0
pp=0
COMMAND="$PHP_PATH -q $CONSOLE_PATH send-campaigns --campaigns_offset=%d --campaigns_limit=%d --subscribers_offset=%d --subscribers_limit=%d --parallel_process_number=%d --parallel_processes_count=%d --usleep=%d --from_daemon=1"
while [ $i -lt $campaigns_at_once ]
do
    while [ $k -lt $parallel_processes_per_campaign ]
    do
        parallel_process_number=$(( $k + 1 ))
        usleep=$(( $k * 10 + $i * 10 ))
        CMD=$(printf "$COMMAND" $i 1 $(( $subscribers_at_once * $k )) $subscribers_at_once $parallel_process_number $parallel_processes_per_campaign $usleep)
        $CMD > /dev/null 2>&1 &
        k=$(( k + 1 ))
        pp=$(( pp + 1 ))
    done
    i=$(( i + 1 ))
done
wait
sleep ${sleep_time:-30}
$COMMAND_FILE_PATH -p "$PHP_PATH" > /dev/null 2>&1 &
remove_lock
exit 0
Usually, it is a lock file, not a lock directory. You hold the PID in the lock file for monitoring your process; in this case your lock directory does not hold any PID information. Your script also does not do any PID file/directory maintenance when it starts, in case of an improper shutdown of your process without cleaning up the lock.
I like your first script better with this in mind. Monitoring the running PIDs directly is cleaner. The only problem is that if you start a second instance with cron, it is not aware of the PIDs connected to the first instance.
You also have processesLength -gt 2, which allows 2 (not 1) processes to run, so you will duplicate your process threads.
It also seems that daemonize just re-invokes the script with daemon, which is not very useful. Also, having a variable with the same name as a function is not effective.
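For example, a more conventional daemonize would detach the child from the controlling terminal instead of just re-invoking the script; a sketch, assuming setsid is available:
function daemonize {
    # Detach: new session, no inherited stdio
    setsid "$THISPATH/send-campaigns-daemon.sh" -d 1 -p "$PHPPATH" \
        </dev/null >/dev/null 2>&1 &
}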
The correct way to make a lockfile is like this:
# Create a temporary file
echo $$ > ${LOCKFILE}.tmp$$
# Try the lock; ln without -f is atomic
if ln ${LOCKFILE}.tmp$$ ${LOCKFILE}; then
    : # we got the lock
else
    : # we didn't get the lock
fi
# Tidy up the temporary file
rm ${LOCKFILE}.tmp$$
And to release the lock:
# Unlock
rm ${LOCKFILE}
The key thing is to create the lock file off to one side, using a unique name, and then try to link it to the real name. This is an atomic operation, so it is safe.
Any solution that does "test and set" gives you a race condition to deal with. Yes, that can be sorted out, but you end up writing extra code.
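Put together as a runnable sketch (LOCKFILE is whatever path your script standardizes on):
#!/bin/bash
LOCKFILE=/tmp/myscript.lock
# Create a uniquely named temporary file, then try to ln it into place;
# the link either succeeds atomically or fails because the lock already exists.
echo $$ > "${LOCKFILE}.tmp$$"
if ln "${LOCKFILE}.tmp$$" "$LOCKFILE" 2>/dev/null; then
    rm "${LOCKFILE}.tmp$$"
    echo "got the lock, working..."
    # ... do the real work here ...
    rm "$LOCKFILE"                 # release the lock
else
    rm "${LOCKFILE}.tmp$$"
    echo "another instance holds the lock" >&2
    exit 1
fi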

exit statement will not break while loop in unix shell

The exit statements in each status-check if statement do not break the while loop and truly exit the script. Is there something I can do to break the loop and exit with that $STATUS code?
EDIT: I've updated my code and it still isn't working. The status-check if statements successfully break the loop, but when I try to evaluate $EXIT_STATUS it's always null, which likely has something to do with scope. What am I missing here?
if [ $RESTART -le $STEP ]; then
    . tell_step
    while read XML_INPUT; do
        XML_GDG=`get_full_name $GDG_NAME P`
        cp $XML_INPUT $XML_GDG
        STATUS=$?
        EXIT_STATUS=$STATUS
        if [ $STATUS -ne 0 ]; then
            break
        fi
        add_one_gen $XML_GDG
        STATUS=$?
        EXIT_STATUS=$STATUS
        if [ $STATUS -ne 0 ]; then
            break
        fi
    done < $XML_STAGE_LIST
    echo $EXIT_STATUS
    if [ $EXIT_STATUS -ne 0 ]; then
        exit $EXIT_STATUS
    fi
fi
I had the same problem: when piping into a while loop, the script did not exit on exit; instead, exit behaved like break should.
I have found 2 solutions:
a) After your while loop, check the return code of the while loop and exit then:
somecommand | while something; do
    ...
done
# pass the exit code from the while loop; save it first,
# since $? is overwritten by the test itself
rc=$?
if [ $rc != 0 ]
then
    # this really exits
    exit $rc
fi
b) Set the bash script to exit on any error. Paste this at the beginning of your script:
set -e
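A quick demonstration of option b (a sketch): with set -e, the non-zero status of the piped while loop, which runs in a subshell, aborts the whole script:
#!/usr/bin/env bash
set -e
printf 'a\nb\n' | while read -r line; do
    [ "$line" = "b" ] && exit 3   # only exits the subshell, with status 3
    echo "processing $line"
done
echo "never reached"              # set -e stops the script before this line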
I don't really understand why your script doesn't exit on exit, because the following works without problems:
while read name; do
    echo "checking: $name"
    grep $name /etc/passwd >/dev/null 2>&1
    STATUS=$?
    if [ $STATUS -ne 0 ]; then
        echo "grep failed for $name rc-$STATUS"
        exit $STATUS
    fi
done <<EOF
root
bullshit
daemon
EOF
Running it produces:
$ bash testscript.sh ; echo "exited with: $?"
grep failed for bullshit rc-1
exited with: 1
As you can see, the script exited immediately and doesn't check "daemon".
Anyway, maybe it is more readable when you use bash functions like:
dostep1() {
    grep "$1:" /etc/passwd >/dev/null 2>&1
    return $?
}
dostep2() {
    grep "$1:" /some/nonexistent/file >/dev/null 2>&1
    return $?
}
err() {
    retval=$1; shift;
    echo "$@" >&2 ; return $retval
}
while read name
do
    echo =checking $name=
    dostep1 $name || err $? "Step 1 failed" || exit $?
    dostep2 $name || err $? "Step 2 failed" || exit $?
done
when run like:
echo 'root
> bullshit' | bash testexit.sh; echo "status: $?"
=checking root=
Step 2 failed
status: 2
so, step 1 was OK and it exited on step 2 (nonexistent file) with grep exit status 2, and when
echo 'bullshit
bin' | bash testexit.sh; echo "status: $?"
=checking bullshit=
Step 1 failed
status: 1
it exited immediately on step 1 (bullshit isn't in /etc/passwd) with grep exit status 1
You'll need to break out of your loop and then exit from your script. You can use a variable which is set on error to test if you need to exit with an error condition.
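A minimal sketch of that pattern (process and input.txt are hypothetical stand-ins):
err=0
while read -r line; do
    process "$line" || { err=$?; break; }   # remember the failure, leave the loop
done < input.txt
if [ "$err" -ne 0 ]; then
    exit "$err"                             # now really exit the script
fi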
I had a similar problem when pipelining. My guess is that a separate shell is started when pipelining. Hopefully it helps someone else who stumbles across the problem.
From jm666's post above, this will not print 'Here I am!':
while read name; do
    echo "checking: $name"
    grep $name /etc/passwd >/dev/null 2>&1
    STATUS=$?
    if [ $STATUS -ne 0 ]; then
        echo "grep failed for $name rc-$STATUS"
        exit $STATUS
    fi
done <<EOF
root
yayablah
daemon
EOF
echo "Here I am!"
However the following, which pipes the names to the while loop, does, and it also exits with a code of 0. Setting a variable and breaking doesn't seem to work either (which makes sense if it is another shell). Another method needs to be used to either communicate the error or avoid the situation in the first place; one such alternative is sketched after the code below.
cat <<EOF |
root
yayablah
daemon
EOF
while read name; do
    echo "checking: $name"
    grep $name /etc/passwd >/dev/null 2>&1
    STATUS=$?
    if [ $STATUS -ne 0 ]; then
        echo "grep failed for $name rc-$STATUS"
        exit $STATUS
    fi
done
echo "Here I am!"

bash: running cURLs in parallel slower than one after another

We have to cache quite a big database of data after each upload, so we created a bash script that should handle it for us. The script should start 4 parallel curls to the site and, once they're done, start the next one from the URL list we store in the file.
In theory everything works, and the concept works if we run 4 processes from our local machines against the target site.
If I set MAX_NPROC=1, the curl takes as long as it would if a browser hit the URL,
i.e. 20s.
If I set MAX_NPROC=2, the time the request takes triples.
Am I missing something? Is there an Apache setting that is slowing us down? Or is there a secret cURL setting that I'm missing?
Any help will be appreciated. Please find the bash script below.
#!/bin/bash
if [[ -z $2 ]]; then
    MAX_NPROC=4 # default
else
    MAX_NPROC=$2
fi
if [[ -z $1 ]]; then
    echo "File with URLs is missing"
    exit
fi
NUM=0
QUEUE=""
DATA=""
URL=""
declare -a URL_ARRAY
declare -a TIME_ARRAY
ERROR_LOG=""
function queue {
    QUEUE="$QUEUE $1"
    NUM=$(($NUM+1))
}
function regeneratequeue {
    OLDREQUEUE=$QUEUE
    echo "OLDREQUEUE:$OLDREQUEUE"
    QUEUE=""
    NUM=0
    for PID in $OLDREQUEUE
    do
        process_count=`ps ax | awk '{print $1 }' | grep -c "^${PID}$"`
        if [ $process_count -eq 1 ] ; then
            QUEUE="$QUEUE $PID"
            NUM=$(($NUM+1))
        fi
    done
}
function checkqueue {
    OLDCHQUEUE=$QUEUE
    for PID in $OLDCHQUEUE
    do
        process_count=`ps ax | awk '{print $1 }' | grep -c "^${PID}$"`
        if [ $process_count -eq 0 ] ; then
            wait $PID
            my_status=$?
            if [[ $my_status -ne 0 ]]
            then
                echo "`date` $my_status ${URL_ARRAY[$PID]}" >> $ERROR_LOG
            fi
            current_time=`date +%s`
            old_time=${TIME_ARRAY[$PID]}
            time_difference=$(expr $current_time - $old_time)
            echo "`date` ${URL_ARRAY[$PID]} END ($time_difference seconds)" >> $REVERSE_LOG
            #unset TIME_ARRAY[$PID]
            #unset URL_ARRAY[$PID]
            regeneratequeue # at least one PID has finished
            break
        fi
    done
}
REVERSE_LOG="$1.rvrs"
ERROR_LOG="$1.error"
echo "Cache STARTED at `date`" > $REVERSE_LOG
echo "" > $ERROR_LOG
while read line; do
    # create the command to be run (as a string, so it can be backgrounded below)
    DATA="username=user@server.com&password=password"
    URL=$line
    CMD="curl --data ${DATA} -s -o /dev/null --url ${URL}"
    echo "Command: ${CMD}"
    # Run the command in the background
    $CMD &
    # Get PID for process
    PID=$!
    queue $PID
    URL_ARRAY[$PID]=$URL
    TIME_ARRAY[$PID]=`date +%s`
    while [ $NUM -ge $MAX_NPROC ]; do
        checkqueue
        sleep 0.4
    done
done < $1
echo "Cache FINISHED at `date`" >> $REVERSE_LOG
exit
The network is almost always the bottleneck; spawning more connections usually makes it slower.
You can try to see if parallelizing will do you any good by spawning several timed curls:
time curl ...... &
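Something like this (with a stand-in URL) makes the comparison concrete:
# Sequential baseline
time curl -s -o /dev/null "http://example.com/page"
# Two in parallel; compare the reported wall-clock times
time curl -s -o /dev/null "http://example.com/page" &
time curl -s -o /dev/null "http://example.com/page" &
wait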
