I am new to bash and wondering if there is a way to run a script up to x times until it succeeds. I have the following script, but it naturally bails out after the first failure and doesn't retry.
yarn graphql
if [ $? -eq 0 ]
then
echo "SUCCESS"
else
echo "FAIL"
fi
I can see there is a way to loop continuously; however, is there a way to throttle this to, say, loop every second for 30 seconds?
while :
do
command
done
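For reference, a minimal sketch of such a throttled retry loop, assuming yarn graphql is the command to retry (one attempt per second, at most 30 attempts):
# Retry the command once per second, up to 30 times
for i in $(seq 1 30); do
    if yarn graphql; then
        echo "SUCCESS"
        exit 0
    fi
    echo "Attempt $i failed, retrying in 1s..."
    sleep 1
done
echo "FAIL"
exit 1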
I guess you could devise a dedicated bash function for this, relying on the sleep command.
E.g., this code is freely inspired by that code by Travis, distributed under the MIT license:
#!/usr/bin/env bash
ANSI_GREEN="\033[32;1m"
ANSI_RED="\033[31;1m"
ANSI_RESET="\033[0m"
usage() {
cat >&2 <<EOF
Usage: retry_until WAIT MAX_TIMES COMMAND...
Examples:
retry_until 1s 3 echo ok
retry_until 1s 3 false
retry_until 1s 0 false
retry_until 30s 0 false
EOF
}
retry_until() {
[ $# -lt 3 ] && { usage; return 2; }
local wait_for="$1" # e.g., "30s"
local max_times="$2" # e.g., "3" (or "0" to have no limit)
shift 2
local result=0
local count=1
local str_of=''
[ "$max_times" -gt 0 ] && str_of=" of $max_times"
while [ "$count" -le "$max_times" ] || [ "$max_times" -le 0 ]; do
[ "$result" -ne 0 ] && {
echo -e "\n${ANSI_RED}The command '$*' failed. Retrying, #$count$str_of.${ANSI_RESET}\n" >&2
}
"$#" && {
echo -e "\n${ANSI_GREEN}The command '$*' succeeded on attempt #$count.${ANSI_RESET}\n" >&2
result=0
break
} || result=$?
count=$((count + 1))
sleep "$wait_for"
done
[ "$max_times" -gt 0 ] && [ "$count" -gt "$max_times" ] && {
echo -e "\n${ANSI_RED}The command '$*' failed $max_times times.${ANSI_RESET}\n" >&2
}
return "$result"
}
Then to fully answer your question, you could run:
retry_until 1s 30 command
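Or, applied to the script in the question:
retry_until 1s 30 yarn graphql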
I want to check if an application is up and running by accessing its "health check" link, so I want to implement the following logic:
If it returns 200, the script stops with exit code 0, but if it returns something else, the script stops with exit code 1. However, I want to test the "health check" link 3 times in a row. So I did this:
#!/bin/bash
n=0
until [ "$n" -ge 3 ]
do
i=`curl -o /dev/null -Isw '%{http_code}\n' localhost:8080`
if [ "$i" == "200" ]
then
echo "Health check is OK"
break
else
echo "Health check failed"
fi
n=$((n+1))
sleep 5
done
When it's 200 and break is triggered, the script exits with exit code 0, which is fine, but I also want it to exit with exit code 1 when it's never 200, after the third iteration of the until loop. If I add exit 1 inside or below the if statement, the script exits with exit code 1 after the first failed iteration.
Is there any way to make it exit with exit code 1 after the third iteration of the loop?
You could immediately exit 0 inside the loop when the health check passes, and exit 1 after the loop.
n=0
until [ "$n" -ge 3 ]
do
i=`curl -o /dev/null -Isw '%{http_code}\n' localhost:8080`
if [ "$i" == "200" ]
then
echo "Health check is OK"
exit 0
else
echo "Health check failed"
fi
n=$((n+1))
sleep 5
done
exit 1
Is there any way to make it exit with exit code 1 after the third iteration of the loop?
If we always want to exit 1 when any of the requests resulted in something other than 200, we can use a simple flag (see "How can I declare and use Boolean variables in a shell script?"):
#!/bin/bash
n=0
failed=false
until [ "$n" -ge 3 ]
do
i=`curl -o /dev/null -Isw '%{http_code}\n' localhost:8080`
if [ "$i" == "200" ]
then
echo "Health check is OK"
else
echo "Health check failed"
failed=true
fi
n=$((n+1))
sleep 5
done
if [ "$failed" = true ] ; then
exit 1
else
exit 0
fi
Some additional tips to shorten the code:
We can remove the n variable by using a for loop
A short-if to exit
#!/bin/bash
failed=false
for (( i = 0; i < 3; i++ )); do
r=`curl -o /dev/null -Isw '%{http_code}\n' localhost:8080`
if [ "$r" == "200" ]; then
echo "Health check is OK"
else
echo "Health check failed"
failed=true
fi
sleep 5
done
[[ "$failed" = true ]] && exit 1 || exit 0
See #>>> notes:
#!/bin/bash
n=0
#>>> default status if never successful
status=1
until [ "$n" -ge 3 ]
do
i=`curl -o /dev/null -Isw '%{http_code}\n' localhost:8080`
if [ "$i" == "200" ]
then
echo "Health check is OK"
#>>> successful: set status to 0
status=0
break
else
echo "Health check failed"
fi
n=$((n+1))
sleep 5
done
#>>> exit with $status
exit $status
You don't need an extra boolean variable if you set the status before the end of the loop. So move your counter update and sleep timer into the failed-check branch:
#!/usr/bin/env sh
n=0
until [ "$n" -ge 3 ]; do
i=$(curl -o /dev/null -Isw '%{http_code}\n' localhost:8080)
if [ "$i" -eq 200 ]; then
echo "Health check is OK"
break
else
echo "Health check failed"
n=$((n + 1))
sleep 1
false # Set status to false
fi
done
#!/bin/sh
c=3
until
i=$(curl -Isw '%{http_code}' -o /dev/null localhost:8080)
[ "$i" = 200 ]
do
echo 'Health check failed' >&2
[ "$c" -gt 1 ] || exit
c=$((c-1))
sleep 5
done
echo 'Health check is OK' >&2
With the loop restructured this way (i.e. until <success> ...), the state of the counter variable can be used to exit after the last try. Having the counter start at n and decrease towards zero simplifies using and testing the variable. Also, sleep 5 after the nth failure is avoided.
I have a file named data_file with data:
london
paris
newyork
italy...50 more items
I have a directory with over 75 files, say dfile1, dfile2 ... dfile75, in which I am searching for the entries from data_file.
files=$(find . -type f)
for f in $files; do
while read -r line; do
found=$(grep "$line" "$f")
if [ ! -z "$found" ]; then
: # perform task here
fi
done < data_file
done
As the loop runs for each file one by one, it takes a lot of time to finish. How can I speed it up? Can I run the for loop for multiple files at the same time?
Using GNU Parallel you can do something like this:
doit() {
f="$1"
line="$2"
found=$(grep "$line" "$f")
if [ ! -z "$found" ]; then
: # perform task here
fi
}
export -f doit
find . -type f | parallel doit :::: - data_file
The following example is a full-blown parallel execution method that deals with:
Execution time (will warn after a certain execution time, and stop tasks after more time has passed)
Async logging (keeps logging what's going on while tasks are being executed)
Parallelism (allows you to specify the number of simultaneous tasks)
IO-related zombie tasks (will not block the execution)
Handles killing of grandchild pids
Lots more
In your example, your (hardened) code would look like:
# Load the ExecTasks function described below (must be in the same directory as this one)
source ./exectasks.sh
directoryToProcess="/my/dir/to/find/stuff/into"
tasklist=""
# Prepare task list separated by semicolons
# Note: "$line" (the pattern to look for) and "my_task" (the actual command to queue) are placeholders for your use case
while IFS= read -r -d $'\0' file; do
if grep "$line" "$file" > /dev/null 2>&1; then
tasklist="$tasklist""my_task;"
fi
done < <(find "$directoryToProcess" -type f -print0)
# Run tasks
ExecTasks "$tasklist" "trivial-task-id" false 1800 3600 18000 36000 true 1 1800 true false false 8
Here we use a complex function, ExecTasks, that deals with queueing the tasks in parallel and lets you keep control of what's going on, without fear of blocking the script because of some hung task.
Quick explanation of the ExecTasks arguments:
"$tasklist" = variable containing task list
"some name" trivial task id (in order to identify in logs)
boolean: read tasks from file (you may have passed a task list from a file if there are too many to fit in a variable
1800 = maximum number of seconds a task may run before a warning is raised
3600 = maximum number of seconds a task may run before an error is raised and the task is stopped
18000 = maximum number of seconds the whole set of tasks may run before a warning is raised
36000 = maximum number of seconds the whole set of tasks may run before an error is raised and all the tasks are stopped
boolean: account execution time since the beginning of task execution (true) or since script begin (false)
1 = number of seconds between each state check (accepts float like .1)
1800 = Number of seconds between each "i am alive" log just to know everything works as expected
boolean: show spinner (true) or not (false)
boolean: log errors when reaching max times (false) or do not log them (true)
boolean: do not log any errors at all (true) or do log them (false)
And finally
8 = number of simultaneous tasks to launch (8 in our case)
Here's the source to exectasks.sh (which you can also copy paste directly into your script header instead of source ./exectasks.sh):
function Logger {
# Dummy log function, replace with whatever you need
echo "$2: $1"
}
# Nice cli spinner so we know execution is ongoing
_OFUNCTIONS_SPINNER="|/-\\"
function Spinner {
printf " [%c] \b\b\b\b\b\b" "$_OFUNCTIONS_SPINNER"
_OFUNCTIONS_SPINNER=${_OFUNCTIONS_SPINNER#?}${_OFUNCTIONS_SPINNER%%???}
return 0
}
# Portable child (and grandchild) kill function, tested under Linux, BSD and MacOS X
function KillChilds {
local pid="${1}" # Parent pid whose children should be killed
local self="${2:-false}" # Should the parent be killed too ?
# Paranoid checks, we can safely assume that $pid should not be 0 nor 1
if [ $(IsInteger "$pid") -eq 0 ] || [ "$pid" == "" ] || [ "$pid" == "0" ] || [ "$pid" == "1" ]; then
Logger "Bogus pid given [$pid]." "CRITICAL"
return 1
fi
if kill -0 "$pid" > /dev/null 2>&1; then
if children="$(pgrep -P "$pid")"; then
if [[ "$pid" == *"$children"* ]]; then
Logger "Bogus pgrep implementation." "CRITICAL"
children="${children/$pid/}"
fi
for child in $children; do
Logger "Launching KillChilds \"$child\" true" "DEBUG" #__WITH_PARANOIA_DEBUG
KillChilds "$child" true
done
fi
fi
# Try to kill nicely, if not, wait 15 seconds to let Trap actions happen before killing
if [ "$self" == true ]; then
# We need to check for pid again because it may have disappeared after recursive function call
if kill -0 "$pid" > /dev/null 2>&1; then
kill -s TERM "$pid"
Logger "Sent SIGTERM to process [$pid]." "DEBUG"
if [ $? -ne 0 ]; then
sleep 15
Logger "Sending SIGTERM to process [$pid] failed." "DEBUG"
kill -9 "$pid"
if [ $? -ne 0 ]; then
Logger "Sending SIGKILL to process [$pid] failed." "DEBUG"
return 1
fi # Simplify the return 0 logic here
else
return 0
fi
else
return 0
fi
else
return 0
fi
}
function ExecTasks {
# Mandatory arguments
local mainInput="${1}" # Contains list of pids / commands separated by semicolons or filepath to list of pids / commands
# Optional arguments
local id="${2:-base}" # Optional ID in order to identify global variables from this run (only bash variable names, no '-'). Global variables are WAIT_FOR_TASK_COMPLETION_$id and HARD_MAX_EXEC_TIME_REACHED_$id
local readFromFile="${3:-false}" # Should mainInput / auxInput be read from a filepath (true) or treated as a semicolon separated list (false)
local softPerProcessTime="${4:-0}" # Max time (in seconds) a pid or command can run before a warning is logged, unless set to 0
local hardPerProcessTime="${5:-0}" # Max time (in seconds) a pid or command can run before the given command / pid is stopped, unless set to 0
local softMaxTime="${6:-0}" # Max time (in seconds) for the whole function to run before a warning is logged, unless set to 0
local hardMaxTime="${7:-0}" # Max time (in seconds) for the whole function to run before all pids / commands given are stopped, unless set to 0
local counting="${8:-true}" # Should softMaxTime and hardMaxTime be accounted since function begin (true) or since script begin (false)
local sleepTime="${9:-.5}" # Seconds between each state check. The shorter the value, the snappier ExecTasks will be, but as a tradeoff, more cpu power will be used (good values are between .05 and 1)
local keepLogging="${10:-1800}" # Every keepLogging seconds, an alive message is logged. Setting this value to zero disables any alive logging
local spinner="${11:-true}" # Show spinner (true) or do not show anything (false) while running
local noTimeErrorLog="${12:-false}" # Log errors when reaching soft / hard execution times (false) or do not log errors on those triggers (true)
local noErrorLogsAtAll="${13:-false}" # Do not log any errors at all (useful for recursive ExecTasks checks)
# Parallelism specific arguments
local numberOfProcesses="${14:-0}" # Number of simultaneous commands to run, given as mainInput. Set to 0 by default (WaitForTaskCompletion mode). Setting this value enables ParallelExec mode.
local auxInput="${15}" # Contains list of commands separated by semicolons or filepath to a list of commands. Exit code of those commands decides whether main commands will be executed or not
local maxPostponeRetries="${16:-3}" # If a conditional command fails, how many times shall we try to postpone the associated main command. Set this to 0 to disable postponing
local minTimeBetweenRetries="${17:-300}" # Time (in seconds) between postponed command retries
local validExitCodes="${18:-0}" # Semi colon separated list of valid main command exit codes which will not trigger errors
local i
# Expand validExitCodes into array
IFS=';' read -r -a validExitCodes <<< "$validExitCodes"
# ParallelExec specific variables
local auxItemCount=0 # Number of conditional commands
local commandsArray=() # Array containing commands
local commandsConditionArray=() # Array containing conditional commands
local currentCommand # Variable containing currently processed command
local currentCommandCondition # Variable containing currently processed conditional command
local commandsArrayPid=() # Array containing commands indexed by pids
local commandsArrayOutput=() # Array containing command results indexed by pids
local postponedRetryCount=0 # Number of current postponed commands retries
local postponedItemCount=0 # Number of commands that have been postponed (keep at least one in order to check once)
local postponedCounter=0
local isPostponedCommand=false # Is the current command from a postponed file ?
local postponedExecTime=0 # How much time has passed since last postponed condition was checked
local needsPostponing # Does currentCommand need to be postponed
local temp
# Common variables
local pid # Current pid working on
local pidState # State of the process
local mainItemCount=0 # number of given items (pids or commands)
local readFromFile # Should we read pids / commands from a file (true)
local counter=0
local log_ttime=0 # local time instance for comparison
local seconds_begin=$SECONDS # Seconds since the beginning of the script
local exec_time=0 # Seconds since the beginning of this function
local retval=0 # return value of monitored pid process
local subRetval=0 # return value of condition commands
local errorcount=0 # Number of pids that finished with errors
local pidsArray # Array of currently running pids
local newPidsArray # New array of currently running pids for next iteration
local pidsTimeArray # Array containing execution begin time of pids
local executeCommand # Boolean to check if currentCommand can be executed given a condition
local functionMode
local softAlert=false # Does a soft alert need to be triggered, if yes, send an alert once
local failedPidsList # List containing failed pids with exit code separated by semicolons (eg : 2355:1;4534:2;2354:3)
local randomOutputName # Random filename for command outputs
local currentRunningPids # String of pids running, used for debugging purposes only
# fnver 2019081401
# Initialise global variable
eval "WAIT_FOR_TASK_COMPLETION_$id=\"\""
eval "HARD_MAX_EXEC_TIME_REACHED_$id=false"
# Init function variables depending on mode
if [ $numberOfProcesses -gt 0 ]; then
functionMode=ParallelExec
else
functionMode=WaitForTaskCompletion
fi
if [ $readFromFile == false ]; then
if [ $functionMode == "WaitForTaskCompletion" ]; then
IFS=';' read -r -a pidsArray <<< "$mainInput"
mainItemCount="${#pidsArray[@]}"
else
IFS=';' read -r -a commandsArray <<< "$mainInput"
mainItemCount="${#commandsArray[@]}"
IFS=';' read -r -a commandsConditionArray <<< "$auxInput"
auxItemCount="${#commandsConditionArray[@]}"
fi
else
if [ -f "$mainInput" ]; then
mainItemCount=$(wc -l < "$mainInput")
readFromFile=true
else
Logger "Cannot read main file [$mainInput]." "WARN"
fi
if [ "$auxInput" != "" ]; then
if [ -f "$auxInput" ]; then
auxItemCount=$(wc -l < "$auxInput")
else
Logger "Cannot read aux file [$auxInput]." "WARN"
fi
fi
fi
if [ $functionMode == "WaitForTaskCompletion" ]; then
# Force first while loop condition to be true because we don't deal with counters but pids in WaitForTaskCompletion mode
counter=$mainItemCount
fi
# soft / hard execution time checks that needs to be a subfunction since it is called both from main loop and from parallelExec sub loop
function _ExecTasksTimeCheck {
if [ $spinner == true ]; then
Spinner
fi
if [ $counting == true ]; then
exec_time=$((SECONDS - seconds_begin))
else
exec_time=$SECONDS
fi
if [ $keepLogging -ne 0 ]; then
# This log solely exists for readability purposes before having next set of logs
if [ ${#pidsArray[@]} -eq $numberOfProcesses ] && [ $log_ttime -eq 0 ]; then
log_ttime=$exec_time
Logger "There are $((mainItemCount-counter+postponedItemCount)) / $mainItemCount tasks in the queue of which $postponedItemCount are postponed. Currently, ${#pidsArray[@]} tasks running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
fi
if [ $(((exec_time + 1) % keepLogging)) -eq 0 ]; then
if [ $log_ttime -ne $exec_time ]; then # Fix when sleep time lower than 1 second
log_ttime=$exec_time
if [ $functionMode == "WaitForTaskCompletion" ]; then
Logger "Current tasks still running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
elif [ $functionMode == "ParallelExec" ]; then
Logger "There are $((mainItemCount-counter+postponedItemCount)) / $mainItemCount tasks in the queue of which $postponedItemCount are postponed. Currently, ${#pidsArray[@]} tasks running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
fi
fi
fi
fi
if [ $exec_time -gt $softMaxTime ]; then
if [ "$softAlert" != true ] && [ $softMaxTime -ne 0 ] && [ $noTimeErrorLog != true ]; then
Logger "Max soft execution time [$softMaxTime] exceeded for task [$id] with pids [$(joinString , ${pidsArray[#]})]." "WARN"
softAlert=true
SendAlert true
fi
fi
if [ $exec_time -gt $hardMaxTime ] && [ $hardMaxTime -ne 0 ]; then
if [ $noTimeErrorLog != true ]; then
Logger "Max hard execution time [$hardMaxTime] exceeded for task [$id] with pids [$(joinString , ${pidsArray[#]})]. Stopping task execution." "ERROR"
fi
for pid in "${pidsArray[#]}"; do
KillChilds $pid true
if [ $? -eq 0 ]; then
Logger "Task with pid [$pid] stopped successfully." "NOTICE"
else
if [ $noErrorLogsAtAll != true ]; then
Logger "Could not stop task with pid [$pid]." "ERROR"
fi
fi
errorcount=$((errorcount+1))
done
if [ $noTimeErrorLog != true ]; then
SendAlert true
fi
eval "HARD_MAX_EXEC_TIME_REACHED_$id=true"
if [ $functionMode == "WaitForTaskCompletion" ]; then
return $errorcount
else
return 129
fi
fi
}
function _ExecTasksPidsCheck {
newPidsArray=()
if [ "$currentRunningPids" != "$(joinString " " ${pidsArray[#]})" ]; then
Logger "ExecTask running for pids [$(joinString " " ${pidsArray[#]})]." "DEBUG"
currentRunningPids="$(joinString " " ${pidsArray[#]})"
fi
for pid in "${pidsArray[#]}"; do
if [ $(IsInteger $pid) -eq 1 ]; then
if kill -0 $pid > /dev/null 2>&1; then
# Handle uninterruptible sleep state or zombies by omitting them from the running process array (how to kill what is already dead? :)
pidState="$(eval $PROCESS_STATE_CMD)"
if [ "$pidState" != "D" ] && [ "$pidState" != "Z" ]; then
# Check if pid hasn't run more than soft/hard perProcessTime
pidsTimeArray[$pid]=$((SECONDS - seconds_begin))
if [ ${pidsTimeArray[$pid]} -gt $softPerProcessTime ]; then
if [ "$softAlert" != true ] && [ $softPerProcessTime -ne 0 ] && [ $noTimeErrorLog != true ]; then
Logger "Max soft execution time [$softPerProcessTime] exceeded for pid [$pid]." "WARN"
if [ "${commandsArrayPid[$pid]}]" != "" ]; then
Logger "Command was [${commandsArrayPid[$pid]}]]." "WARN"
fi
softAlert=true
SendAlert true
fi
fi
if [ ${pidsTimeArray[$pid]} -gt $hardPerProcessTime ] && [ $hardPerProcessTime -ne 0 ]; then
if [ $noTimeErrorLog != true ] && [ $noErrorLogsAtAll != true ]; then
Logger "Max hard execution time [$hardPerProcessTime] exceeded for pid [$pid]. Stopping command execution." "ERROR"
if [ "${commandsArrayPid[$pid]}]" != "" ]; then
Logger "Command was [${commandsArrayPid[$pid]}]]." "WARN"
fi
fi
KillChilds $pid true
if [ $? -eq 0 ]; then
Logger "Command with pid [$pid] stopped successfully." "NOTICE"
else
if [ $noErrorLogsAtAll != true ]; then
Logger "Could not stop command with pid [$pid]." "ERROR"
fi
fi
errorcount=$((errorcount+1))
if [ $noTimeErrorLog != true ]; then
SendAlert true
fi
fi
newPidsArray+=($pid)
fi
else
# pid is dead, get its exit code from wait command
wait $pid
retval=$?
# Check for valid exit codes
if [ $(ArrayContains $retval "${validExitCodes[@]}") -eq 0 ]; then
if [ $noErrorLogsAtAll != true ]; then
Logger "${FUNCNAME[0]} called by [$id] finished monitoring pid [$pid] with exitcode [$retval]." "ERROR"
if [ "$functionMode" == "ParallelExec" ]; then
Logger "Command was [${commandsArrayPid[$pid]}]." "ERROR"
fi
if [ -f "${commandsArrayOutput[$pid]}" ]; then
Logger "Truncated output:\n$(head -c16384 "${commandsArrayOutput[$pid]}")" "ERROR"
fi
fi
errorcount=$((errorcount+1))
# Welcome to variable variable bash hell
if [ "$failedPidsList" == "" ]; then
failedPidsList="$pid:$retval"
else
failedPidsList="$failedPidsList;$pid:$retval"
fi
else
Logger "${FUNCNAME[0]} called by [$id] finished monitoring pid [$pid] with exitcode [$retval]." "DEBUG"
fi
fi
fi
done
# hasPids can be false on last iteration in ParallelExec mode
pidsArray=("${newPidsArray[@]}")
# Trivial wait time for bash to not eat up all CPU
sleep $sleepTime
}
while [ ${#pidsArray[@]} -gt 0 ] || [ $counter -lt $mainItemCount ] || [ $postponedItemCount -ne 0 ]; do
_ExecTasksTimeCheck
retval=$?
if [ $retval -ne 0 ]; then
return $retval;
fi
# The following execution block is only needed in ParallelExec mode since WaitForTaskCompletion does not execute commands, but only monitors them
if [ $functionMode == "ParallelExec" ]; then
while [ ${#pidsArray[@]} -lt $numberOfProcesses ] && ([ $counter -lt $mainItemCount ] || [ $postponedItemCount -ne 0 ]); do
_ExecTasksTimeCheck
retval=$?
if [ $retval -ne 0 ]; then
return $retval;
fi
executeCommand=false
isPostponedCommand=false
currentCommand=""
currentCommandCondition=""
needsPostponing=false
if [ $readFromFile == true ]; then
# awk identifies first line as 1 instead of 0 so we need to increase counter
currentCommand=$(awk 'NR == num_line {print; exit}' num_line=$((counter+1)) "$mainInput")
if [ $auxItemCount -ne 0 ]; then
currentCommandCondition=$(awk 'NR == num_line {print; exit}' num_line=$((counter+1)) "$auxInput")
fi
# Check if we need to fetch postponed commands
if [ "$currentCommand" == "" ]; then
currentCommand=$(awk 'NR == num_line {print; exit}' num_line=$((postponedCounter+1)) "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedMain.$id.$SCRIPT_PID.$TSTAMP")
currentCommandCondition=$(awk 'NR == num_line {print; exit}' num_line=$((postponedCounter+1)) "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedAux.$id.$SCRIPT_PID.$TSTAMP")
isPostponedCommand=true
fi
else
currentCommand="${commandsArray[$counter]}"
if [ $auxItemCount -ne 0 ]; then
currentCommandCondition="${commandsConditionArray[$counter]}"
fi
if [ "$currentCommand" == "" ]; then
currentCommand="${postponedCommandsArray[$postponedCounter]}"
currentCommandCondition="${postponedCommandsConditionArray[$postponedCounter]}"
isPostponedCommand=true
fi
fi
# Check if we execute postponed commands, or if we delay them
if [ $isPostponedCommand == true ]; then
# Get first value before '#'
postponedExecTime="${currentCommand%%#*}"
postponedExecTime=$((SECONDS-postponedExecTime))
# Get everything after first '#'
temp="${currentCommand#*#}"
# Get first value before '#'
postponedRetryCount="${temp%%#*}"
# Replace currentCommand with actual filtered currentCommand
currentCommand="${temp#*#}"
# Since we read a postponed command, we may decrease postponedItemCount
postponedItemCount=$((postponedItemCount-1))
# Since we read one line, we need to increase the counter
postponedCounter=$((postponedCounter+1))
else
postponedRetryCount=0
postponedExecTime=0
fi
if ([ $postponedRetryCount -lt $maxPostponeRetries ] && [ $postponedExecTime -ge $minTimeBetweenRetries ]) || [ $isPostponedCommand == false ]; then
if [ "$currentCommandCondition" != "" ]; then
Logger "Checking condition [$currentCommandCondition] for command [$currentCommand]." "DEBUG"
eval "$currentCommandCondition" &
ExecTasks $! "subConditionCheck" false 0 0 1800 3600 true $SLEEP_TIME $KEEP_LOGGING true true true
subRetval=$?
if [ $subRetval -ne 0 ]; then
# is postponing enabled ?
if [ $maxPostponeRetries -gt 0 ]; then
Logger "Condition [$currentCommandCondition] not met for command [$currentCommand]. Exit code [$subRetval]. Postponing command." "NOTICE"
postponedRetryCount=$((postponedRetryCount+1))
if [ $postponedRetryCount -ge $maxPostponeRetries ]; then
Logger "Max retries reached for postponed command [$currentCommand]. Skipping command." "NOTICE"
else
needsPostponing=true
fi
postponedExecTime=0
else
Logger "Condition [$currentCommandCondition] not met for command [$currentCommand]. Exit code [$subRetval]. Ignoring command." "NOTICE"
fi
else
executeCommand=true
fi
else
executeCommand=true
fi
else
needsPostponing=true
fi
if [ $needsPostponing == true ]; then
postponedItemCount=$((postponedItemCount+1))
if [ $readFromFile == true ]; then
echo "$((SECONDS-postponedExecTime))#$postponedRetryCount#$currentCommand" >> "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedMain.$id.$SCRIPT_PID.$TSTAMP"
echo "$currentCommandCondition" >> "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedAux.$id.$SCRIPT_PID.$TSTAMP"
else
postponedCommandsArray+=("$((SECONDS-postponedExecTime))#$postponedRetryCount#$currentCommand")
postponedCommandsConditionArray+=("$currentCommandCondition")
fi
fi
if [ $executeCommand == true ]; then
Logger "Running command [$currentCommand]." "DEBUG"
randomOutputName=$(date '+%Y%m%dT%H%M%S').$(PoorMansRandomGenerator 5)
eval "$currentCommand" >> "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}.$id.$pid.$randomOutputName.$SCRIPT_PID.$TSTAMP" 2>&1 &
pid=$!
pidsArray+=($pid)
commandsArrayPid[$pid]="$currentCommand"
commandsArrayOutput[$pid]="$RUN_DIR/$PROGRAM.${FUNCNAME[0]}.$id.$pid.$randomOutputName.$SCRIPT_PID.$TSTAMP"
# Initialize pid execution time array
pidsTimeArray[$pid]=0
else
Logger "Skipping command [$currentCommand]." "DEBUG"
fi
if [ $isPostponedCommand == false ]; then
counter=$((counter+1))
fi
_ExecTasksPidsCheck
done
fi
_ExecTasksPidsCheck
done
# Return exit code if only one process was monitored, else return number of errors
# As we cannot return multiple values, a global variable WAIT_FOR_TASK_COMPLETION contains all pids with their return value
eval "WAIT_FOR_TASK_COMPLETION_$id=\"$failedPidsList\""
if [ $mainItemCount -eq 1 ]; then
return $retval
else
return $errorcount
fi
}
Hope you have fun.
You can do it like this :
files=$(find . -type f)
for f in $files; do
while read -r line; do
{
found=$(grep "$line" "$f")
if [ ! -z "$found" ]; then
## perform task here
fi
} &
done < data_file
done
wait
It will execute the block within {} in the background, so basically it will start one background process per line of data_file for each file. If you want finer control over how many processes are actually spawned you can instead use parallel.
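For instance, a sketch of the same pattern with GNU parallel, capping concurrency at 8 jobs (the helper name do_task and its body are placeholders, not from the original):
do_task() {
    f="$1"
    line="$2"
    if grep -q "$line" "$f"; then
        : # perform task here
    fi
}
export -f do_task
# -j 8 limits the number of simultaneous jobs; each job receives one file and one pattern line
find . -type f | parallel -j 8 do_task :::: - data_file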
The find command will slow things down and the script is more complicated than it needs to be.
If you want to do this with grep, it is better to loop through data_file and within that run grep "$line" * > /dev/null && do_something (or grep -R "$line" * > /dev/null && do_something if there are subdirectories to deal with).
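A minimal sketch of that approach (do_something stands in for whatever should run on a match; here it is passed the matching pattern):
while read -r line; do
    # -R also searches subdirectories; drop it for a flat directory
    if grep -R "$line" . > /dev/null; then
        do_something "$line"
    fi
done < data_file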
You could use grep's -q option to stop searching after the first match and its -f option to obtain the patterns from a file:
for f in $(find . -type f); do
if grep -qf data_file "$f"; then
...
fi
done
If data_file contains:
xxx
yyy
zzz
then grep -qf "$data_file" "$f" evaluates to true if either xxx, yyy, or zzz are found in $f.
What I want to do should be pretty simple; on my own I have reached the solution below. All I need is a few pointers to tell me whether this is the way to do it or whether I should refactor anything in the code.
The code below should create a few parallel processes, wait for them to finish executing, then rerun the code again and again and again...
The script is triggered by a cron job once every 10 minutes; if the script is already running, it should do nothing, otherwise start the working process.
Any insight is highly appreciated since I am not that familiar with bash programming.
#!/bin/bash
# paths
THISPATH="$( cd "$( dirname "$0" )" && pwd )"
# make sure we move in the working directory
cd $THISPATH
# console init path
CONSOLEPATH="$( cd ../../ && pwd )/console.php"
# command line arguments
daemon=0
PHPPATH="/usr/bin/php"
help=0
# flag for binary search
LOOKEDFORPHP=0
# arguments init
while getopts d:p:h: opt; do
case $opt in
d)
daemon=$OPTARG
;;
p)
PHPPATH=$OPTARG
LOOKEDFORPHP=1
;;
h)
help=$OPTARG
;;
esac
done
shift $((OPTIND - 1))
# allow only one process
processesLength=$(ps aux | grep -v "grep" | grep -c $THISPATH/send-campaigns-daemon.sh)
if [ ${processesLength:-0} -gt 2 ]; then
# The process is already running
exit 0
fi
if [ $help -eq 1 ]; then
echo "---------------------------------------------------------------"
echo "| Usage: send-campaigns-daemon.sh |"
echo "| To force PHP CLI binary : |"
echo "| send-campaigns-daemon.sh -p /path/to/php-cli/binary |"
echo "---------------------------------------------------------------"
exit 0
fi
# php executable path, find it if not provided
if [ $PHPPATH ] && [ ! -f $PHPPATH ] && [ $LOOKEDFORPHP -eq 0 ]; then
phpVariants=( "php-cli" "php5-cli" "php5" "php" )
LOOKEDFORPHP=1
for i in "${phpVariants[#]}"
do
which $i >/dev/null 2>&1
if [ $? -eq 0 ]; then
PHPPATH=$(which $i)
fi
done
fi
if [ ! $PHPPATH ] || [ ! -f $PHPPATH ]; then
# Did not find PHP
exit 1
fi
# load options from app
parallelProcessesPerCampaign=3
campaignsAtOnce=10
subscribersAtOnce=300
sleepTime=30
function loadOptions {
local COMMAND="$PHPPATH $CONSOLEPATH option get_option --name=%s --default=%d"
parallelProcessesPerCampaign=$(printf "$COMMAND" "system.cron.send_campaigns.parallel_processes_per_campaign" 3)
campaignsAtOnce=$(printf "$COMMAND" "system.cron.send_campaigns.campaigns_at_once" 10)
subscribersAtOnce=$(printf "$COMMAND" "system.cron.send_campaigns.subscribers_at_once" 300)
sleepTime=$(printf "$COMMAND" "system.cron.send_campaigns.pause" 30)
parallelProcessesPerCampaign=$($parallelProcessesPerCampaign)
campaignsAtOnce=$($campaignsAtOnce)
subscribersAtOnce=$($subscribersAtOnce)
sleepTime=$($sleepTime)
}
# define the daemon function that will stay in loop
function daemon {
loadOptions
local pids=()
local k=0
local i=0
local COMMAND="$PHPPATH -q $CONSOLEPATH send-campaigns --campaigns_offset=%d --campaigns_limit=%d --subscribers_offset=%d --subscribers_limit=%d --parallel_process_number=%d --parallel_processes_count=%d --usleep=%d --from_daemon=1"
while [ $i -lt $campaignsAtOnce ]
do
while [ $k -lt $parallelProcessesPerCampaign ]
do
parallelProcessNumber=$(( $k + 1 ))
usleep=$(( $k * 10 + $i * 10 ))
CMD=$(printf "$COMMAND" $i 1 $(( $subscribersAtOnce * $k )) $subscribersAtOnce $parallelProcessNumber $parallelProcessesPerCampaign $usleep)
$CMD > /dev/null 2>&1 &
pids+=($!)
k=$(( k + 1 ))
done
i=$(( i + 1 ))
done
waitForPids pids
sleep $sleepTime
daemon
}
function daemonize {
$THISPATH/send-campaigns-daemon.sh -d 1 -p $PHPPATH > /dev/null 2>&1 &
}
function waitForPids {
stillRunning=0
for i in "${pids[#]}"
do
if ps -p $i > /dev/null
then
stillRunning=1
break
fi
done
if [ $stillRunning -eq 1 ]; then
sleep 0.5
waitForPids pids
fi
return 0
}
if [ $daemon -eq 1 ]; then
daemon
else
daemonize
fi
exit 0
When starting a script, create a lock file to know that this script is running. When the script finishes, delete the lock file. If somebody kills the process while it is running, the lock file remains forever, so test how old it is and delete it if it is older than a defined value. For example,
#!/bin/bash
# 10 min
LOCK_MAX=600
LOCKFILE=/var/lock/${0##*/}.lock
if [[ -f $LOCKFILE ]] ; then
TIMEINI=$( stat -c %X $LOCKFILE )
SEGS=$(( $(date +%s) - $TIMEINI ))
if [[ $SEGS -gt $LOCK_MAX ]] ; then
: # reportLocking or something to inform you
# Kill old intance ???
OLDPID=$(<$LOCKFILE)
[[ -e /proc/$OLDPID ]] && kill -9 $OLDPID
# Next time that the program is run, there is no lock file and it will run.
rm $LOCKFILE
fi
exit 65
fi
# Save PID of this instance to the lock file
echo "$$" > $LOCKFILE
### Your code go here
# Remove the lock file before script finish
[[ -e $LOCKFILE ]] && rm $LOCKFILE
exit 0
from here:
#!/bin/bash
...
echo PARALLEL_JOBS:${PARALLEL_JOBS:=1}
declare -a tests=($(.../find_what_to_run))
echo "${tests[#]}" | \
xargs -d' ' -n1 -P${PARALLEL_JOBS} -I {} bash -c ".../run_that {}" || { echo "FAILURE"; exit 1; }
echo "SUCCESS"
and here you can nick the code for portable locking with fuser
Okay, so I guess I can answer my own question with a proper answer that works after many tests.
So here is the final version, simplified, without comments/echo:
#!/bin/bash
sleep 2
DIR="$( cd "$( dirname "$0" )" && pwd )"
FILE_NAME="$( basename "$0" )"
COMMAND_FILE_PATH="$DIR/$FILE_NAME"
if [ ! -f "$COMMAND_FILE_PATH" ]; then
exit 1
fi
cd $DIR
CONSOLE_PATH="$( cd ../../ && pwd )/console.php"
PHP_PATH="/usr/bin/php"
help=0
LOOKED_FOR_PHP=0
while getopts p:h: opt; do
case $opt in
p)
PHP_PATH=$OPTARG
LOOKED_FOR_PHP=1
;;
h)
help=$OPTARG
;;
esac
done
shift $((OPTIND - 1))
if [ $help -eq 1 ]; then
printf "%s\n" "HELP INFO"
exit 0
fi
if [ "$PHP_PATH" ] && [ ! -f "$PHP_PATH" ] && [ "$LOOKED_FOR_PHP" -eq 0 ]; then
php_variants=( "php-cli" "php5-cli" "php5" "php" )
LOOKED_FOR_PHP=1
for i in "${php_variants[#]}"
do
which $i >/dev/null 2>&1
if [ $? -eq 0 ]; then
PHP_PATH="$(which $i)"
break
fi
done
fi
if [ ! "$PHP_PATH" ] || [ ! -f "$PHP_PATH" ]; then
exit 1
fi
LOCK_BASE_PATH="$( cd ../../../common/runtime && pwd )/shell-pids"
LOCK_PATH="$LOCK_BASE_PATH/send-campaigns-daemon.pid"
function remove_lock {
if [ -d "$LOCK_PATH" ]; then
rmdir "$LOCK_PATH" > /dev/null 2>&1
fi
exit 0
}
if [ ! -d "$LOCK_BASE_PATH" ]; then
if ! mkdir -p "$LOCK_BASE_PATH" > /dev/null 2>&1; then
exit 1
fi
fi
process_running=0
if mkdir "$LOCK_PATH" > /dev/null 2>&1; then
process_running=0
else
process_running=1
fi
if [ $process_running -eq 1 ]; then
exit 0
fi
trap "remove_lock" 1 2 3 15
COMMAND="$PHP_PATH $CONSOLE_PATH option get_option --name=%s --default=%d"
parallel_processes_per_campaign=$(printf "$COMMAND" "system.cron.send_campaigns.parallel_processes_per_campaign" 3)
campaigns_at_once=$(printf "$COMMAND" "system.cron.send_campaigns.campaigns_at_once" 10)
subscribers_at_once=$(printf "$COMMAND" "system.cron.send_campaigns.subscribers_at_once" 300)
sleep_time=$(printf "$COMMAND" "system.cron.send_campaigns.pause" 30)
parallel_processes_per_campaign=$($parallel_processes_per_campaign)
campaigns_at_once=$($campaigns_at_once)
subscribers_at_once=$($subscribers_at_once)
sleep_time=$($sleep_time)
k=0
i=0
pp=0
COMMAND="$PHP_PATH -q $CONSOLE_PATH send-campaigns --campaigns_offset=%d --campaigns_limit=%d --subscribers_offset=%d --subscribers_limit=%d --parallel_process_number=%d --parallel_processes_count=%d --usleep=%d --from_daemon=1"
while [ $i -lt $campaigns_at_once ]
do
while [ $k -lt $parallel_processes_per_campaign ]
do
parallel_process_number=$(( $k + 1 ))
usleep=$(( $k * 10 + $i * 10 ))
CMD=$(printf "$COMMAND" $i 1 $(( $subscribers_at_once * $k )) $subscribers_at_once $parallel_process_number $parallel_processes_per_campaign $usleep)
$CMD > /dev/null 2>&1 &
k=$(( k + 1 ))
pp=$(( pp + 1 ))
done
i=$(( i + 1 ))
done
wait
sleep ${sleep_time:-30}
$COMMAND_FILE_PATH -p "$PHP_PATH" > /dev/null 2>&1 &
remove_lock
exit 0
Usually, it is a lock file, not a lock path. You hold the PID in the lock file for monitoring your process. In this case your lock directory does not hold any PID information. Your script also does not do any PID file/directory maintenance when it starts, in case of an improper shutdown of your process that left the lock behind.
I like your first script better with this in mind. Monitoring the running PIDs directly is cleaner. The only problem is that if you start a second instance with cron, it is not aware of the PIDs connected to the first instance.
You also have processesLength -gt 2, which allows 2 running processes rather than 1, so you will duplicate your process threads.
It also seems that daemonize just re-invokes the script in daemon mode, which is not very useful. Also, having a variable with the same name as a function is not a good idea.
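As an illustration, a minimal sketch of that kind of PID-file maintenance (the path /var/run/send-campaigns-daemon.pid is only an assumed example):
PIDFILE=/var/run/send-campaigns-daemon.pid
# refuse to start if a previous instance is still alive
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    exit 0
fi
# stale or missing pid file: claim it and clean it up on exit
echo $$ > "$PIDFILE"
trap 'rm -f "$PIDFILE"' EXIT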
The correct way to make a lockfile is like this:
# Create a temporary file
echo $$ > ${LOCKFILE}.tmp$$
# Try the lock; ln without -f is atomic
if ln ${LOCKFILE}.tmp$$ ${LOCKFILE}; then
: # we got the lock
else
: # we didn't get the lock
fi
# Tidy up the temporary file
rm ${LOCKFILE}.tmp$$
And to release the lock:
# Unlock
rm ${LOCKFILE}
The key thing is to create the lock file to one side, using a unique name, and then try to link it to the real name. This is an atomic operation, so it should be safe.
Any solution that does "test and set" gives you a race condition to deal with. Yes, that can be sorted out, but you end up writing extra code.
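Putting it together, a hedged sketch of the acquire/work/release flow with cleanup on exit (the LOCKFILE path is an assumption):
LOCKFILE=/tmp/myscript.lock
echo $$ > "${LOCKFILE}.tmp$$"
if ln "${LOCKFILE}.tmp$$" "$LOCKFILE" 2>/dev/null; then
    # got the lock; release it even if the script is interrupted
    trap 'rm -f "$LOCKFILE"' EXIT
    rm -f "${LOCKFILE}.tmp$$"
    : # ... do the real work here ...
else
    rm -f "${LOCKFILE}.tmp$$"
    echo "Another instance holds the lock" >&2
    exit 1
fi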
I have a Gradle app that I start up using ./gradlew run. This works fine, but I'm trying to deploy to an AWS instance (Ubuntu 12) and I would like the script to execute on boot. I tried writing a startup.sh file with the above command, but no dice. I've also tried adding the command to the /etc/rc.local file, but that doesn't seem to work either. Can someone give me an idea as to how to execute ./gradlew run on startup? Thanks!
I wrote the following init script for starting gradle applications at system startup for redhat distros (centos/fedora etc).
You need to perform a few steps to tie it all together:
deploy your gradle application using gradle distZip onto your target server
create a configuration file /etc/my-service.conf
link the init script (see below) to the service name in /etc/init.d/my-service
An example configuration file /etc/my-service.conf
username=someunixuser
serviceName=MyStandaloneServer
prog="/path/to/bin/MyStandaloneServer -a any -p params -y you -w want"
javaClass="some.java.MyStandaloneServer"
Note the path to the application from the distZip in the prog line.
You then link the init script to the actual service you want it to be run as, e.g.
ln -s /path/to/gradle-init-start-stop /etc/init.d/my-service
Once you've done this, you can use chkconfig to add the service in the usual way (it defaults to 3/4/5)
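For example, with the link created above (my-service being the chosen service name):
chkconfig --add my-service
chkconfig my-service on
# then start it right away
service my-service start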
Here is the script gradle-init-start-stop
#!/bin/bash
#
# chkconfig: 345 80 20
# description: Start and stop script for gradle created java application startup
#
# This is a generic file that can be used by any distribution from gradle ("gradle distZip").
# Link this file to the name of the process you want to run.
# e.g.
# ln -s /path/to/gradle-init-start-stop /etc/init.d/ivy-jetty
#
# it requires a conf file /etc/NAME.conf, e.g. /etc/ivy-jetty.conf
# otherwise it will quit.
#
# CONFIGURATION FILE ENTRIES:
# ---------------------------
# username=process-owner
# prog="/path/to/gradle-startscript -a any -e extra parameters"
# serviceName=SomeShortNameForService
# javaClass=package.for.JavaClass
. /etc/rc.d/init.d/functions
BASENAME=$(basename $0)
maxShutdownTime=15
CONF=/etc/${BASENAME}.conf
pidfile=/var/run/$BASENAME.pid
if [ ! -f $CONF ] ; then
echo "Could not find configuration file: $CONF"
exit 1
fi
####### SOURCE CONFIGURATION FILE
source $CONF
checkProcessIsRunning() {
local pid="$1"
if [ -z "$pid" -o "$pid" == " " ]; then return 1; fi
if [ ! -e /proc/$pid ]; then return 1; fi
return 0
}
checkProcessIsOurService() {
local pid="$1"
if [ "$(ps -p $pid --no-headers -o comm)" != "java" ]; then return 1; fi
grep -q --binary -F "$javaClass" /proc/$pid/cmdline
if [ $? -ne 0 ]; then return 1; fi
return 0
}
getServicePID() {
if [ ! -f $pidfile ]; then return 1; fi
pid="$(<$pidfile)"
checkProcessIsRunning $pid || return 1
checkProcessIsOurService $pid || return 1
return 0
}
startService() {
cmd="nohup $prog >/dev/null 2>&1 & echo \$!"
sudo -u $username -H $SHELL -c "$cmd" > $pidfile
sleep 0.2
pid="$(<$pidfile)"
if checkProcessIsRunning $pid; then
return 0
else
return 1
fi
}
start() {
getServicePID
if [ $? -eq 0 ]; then echo -n "$serviceName is already running"; RETVAL=0; echo ""; return 0; fi
echo -n "Starting $serviceName: "
startService
if [ $? -ne 0 ] ; then
echo "failed"
return 1
else
echo "started"
return 0
fi
}
stopService() {
# soft kill first...
kill $pid || return 1
# check if process dead, sleep 0.2s otherwise
for ((i=0; i<maxShutdownTime*5; i++)); do
checkProcessIsRunning $pid
if [ $? -ne 0 ] ; then
rm -f $pidfile
return 0
fi
sleep 0.2
done
# hard kill now...
kill -s KILL $pid || return 1
# check if process dead, sleep 0.2s otherwise
for ((i=0; i<maxShutdownTime*5; i++)); do
checkProcessIsRunning $pid
if [ $? -ne 0 ] ; then
rm -f $pidfile
return 0
fi
sleep 0.2
done
return 1
}
stop() {
getServicePID
if [ $? -ne 0 ]; then echo -n "$serviceName is not running"; RETVAL=0; echo ""; return 0; fi
pid="$(<$pidfile)"
echo -n "Stopping $serviceName "
stopService
if [ $? -ne 0 ]; then RETVAL=1; echo "failed"; return 1; fi
echo "stopped PID=$pid"
RETVAL=0
return 0
}
restart() {
stop
start
}
checkServiceStatus() {
echo -n "Checking for $serviceName: "
if getServicePID; then
echo "running PID=$pid"
RETVAL=0
else
echo "stopped"
RETVAL=3
fi
return 0;
}
####### START OF MAIN SCRIPT
RETVAL=0
case "$1" in
start)
$1
;;
stop)
$1
;;
restart)
$1
;;
status)
checkServiceStatus
;;
*)
echo "Usage: $0 {start|stop|status|restart}"
exit 1
esac
exit $RETVAL
I have to make a conditional in ash that depends on the results of two commands. The problem is that one of them returns its result on stdout, the other as an exit code.
Do I have to write
command2
RET=$?
if [ `command1` -eq 1 -a $RET -eq 2 ] ; then ...
or is there some construct that would let me simply access the return code of command2 within the logic of [ ] ?
if [ `command1` -eq 1 -a ${{{ command2 }}} -eq 2 ] ; then ...
( with ${{{ }}} being the magical expression extracting the return code ? )
It would be better to write:
if [ "`command1`" -eq 1 ] && command2
then
....
fi
Or when you want to check if the exit code is 2 then:
if [ "`command1`" -eq 1 ] && { command2 ; [ "$?" = 2 ] ; }
then
....
fi
Example:
$ cat 1.sh
ARG="$1"
command1()
{
echo 1
}
command2()
{
return "$ARG"
}
if [ "`command1`" -eq 1 ] && { command2 ; [ "$?" = 2 ] ; }
then
echo OK
else
echo FAILED
fi
$ sh 1.sh 2
OK
$ sh 1.sh 3
FAILED
I guess there is no way to avoid $?, but I can use the command inside the test statement by adding ; echo $? at the end.
if [ `command1` -eq 1 -a `command2 ; echo $?` -eq 2 ] ; then ...