Docker kill an infinite process in a container after X amount of time - bash

I am using the code found in this Docker issue to start a container, run a process in it for up to 20 seconds, and kill the container regardless of whether the process completes, fails to execute, or times out.
The code I am using currently is this:
#!/bin/bash
set -e
to=$1
shift
cont=$(docker run -d "$@")
code=$(timeout "$to" docker wait "$cont" || true)
docker kill $cont &> /dev/null
echo -n 'status: '
if [ -z "$code" ]; then
echo timeout
else
echo exited: $code
fi
echo output:
# pipe to sed simply for pretty nice indentation
docker logs $cont | sed 's/^/\t/'
docker rm $cont &> /dev/null
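For reference, a minimal usage sketch - the script name and image are placeholders I picked; the script takes the timeout first, then the arguments for docker run:
# saved as run_with_timeout.sh
./run_with_timeout.sh 20 ubuntu sleep 1000
# should print something like "status: timeout", since "sleep 1000" never finishes within 20 seconds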
This is almost perfect, however if you run an infinite process (for example this Python infinite loop):
while True:
    print "infinite loop"
the whole system jams up and the app crashes. After reading around a bit I think it has something to do with the stdout buffer, but I have absolutely no idea what that means.

The problem you have is with a process that is writing massive amounts of data to stdout.
These messages get logged into a file which grows infinitely.
Have a look at (depending on your system's location for log files):
sudo find /var/lib/docker/containers/ -name '*.log' -ls
You can remove old log files if they are of no interest.
One possibility is to start your docker run -d command
under a ulimit restriction on the maximum size a file can grow to.
Add to the start of your script, for example:
ulimit -f 20000 -c 0
This limits file sizes to 20000*1024 bytes, and disables core dumps, which you would otherwise
expect to get when the looping process's writes are forced to fail.
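A related option, assuming a reasonably recent Docker and the default json-file logging driver, is to cap the container's log size when starting it - a sketch only; check the docker run documentation for the log options your version supports:
cont=$(docker run -d --log-opt max-size=10m --log-opt max-file=3 "$@")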

Please add & at the end of
cont=$(docker run -d "$@") &
It will run the process in the background.
I don't know Docker, but if it still fails to stop, you may also add the following just after this line:
mypid=$!
sleep 20 && kill $mypid
Regards

Related

Cron job won't start again after I stopped it?

I wrote a script to run constantly on startup. If for whatever reason the script were to fail, I wrote a second script to check if it has failed, and if so, run the first script again. I then set this second script as a cronjob to run every minute so that it is constantly checking if the first script is alive.
So to test this, I reboot my system. I can see in htop that the first script is running from start up as expected. Good. I kill the process to test the second script. Sure enough, the second script starts the first script again. Still good. I then kill this process, but the second script won't run again now. It still updates a txt file when I manually start the first script, but the second script just doesn't start the first script like it's supposed to. Is it because I killed the cronjob? Restarting the cron service doesn't fix anything though, so I don't know why my second script isn't running again at all.
First script:
#!/bin/bash
stamp=$(date +%Y%m%d-%H%M)
timeout 10d tcpdump -i eth0 -s 96 -z gzip -C 10 -w /home/user/Documents/${stamp}
Second script:
#!/bin/bash
echo "not running" > /home/working.txt
if (( $(ps -ef | grep -v grep | grep tcpdump.sh | wc -l) > 0 ))
then
echo "tcpdump is running!!!" > /home/working.txt
else
/usr/local/bin/tcpdump.sh start
fi
Any help?
You would probably be better off running a simple for loop as the main script, and that kicks off the tcpdump script in the background, so something like:
#!/bin/bash
while true; do
if ps -ef | grep -v grep | grep -q tcpdump; then
: tcpdump running OK
else
# tcpdump not running - start it off
nohup /usr/local/bin/firstscript.sh start &
fi
sleep 30
done
This checks that "tcpdump.sh" is in the output of the "ps -ef" command - if it is, then do nothing (note that you must have an actual command between the "then" and "else" - the ":" command, which just takes its arguments and ignores them, is sufficient). If it isn't running, start the first script in the background. Then sleep 30 seconds and check again. (Yes, I could have inverted the test so that I didn't need an empty "then" arm, but it would have made the code less obvious.)
You put this script as the one which starts at boot time.
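One way to do that, assuming the loop above is saved as /usr/local/bin/watchdog.sh (a placeholder path), is an @reboot crontab entry:
@reboot /usr/local/bin/watchdog.sh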
Edit: Do you really want to check for "tcpdump.sh"? Is that what the first script is actually called? Assuming that you actually want to check for the tcpdump program, you could use:
if pgrep tcpdump; then
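Filled out, that check might look like this - just a sketch; pgrep -x matches the process name exactly, and its output is discarded since only the exit status matters:
if pgrep -x tcpdump > /dev/null; then
    : tcpdump running OK
else
    nohup /usr/local/bin/firstscript.sh start &
fi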

Terminal Application to Keep Web Server Process Alive

Is there an app that can, given a command and options, execute for the lifetime of the process and ping a given URL indefinitely on a specific interval?
If not, could this be done on the terminal as a bash script? I'm almost positive it's doable through terminal, but am not fluent enough to whip it up within a few minutes.
Found this post that has a portion of the solution, minus the ping bits. ping runs on Linux indefinitely until it's actively killed. How would I kill it from bash after, say, two pings?
General Script
As others have suggested, use this in pseudo code:
execute command and save PID
while PID is active, ping and sleep
exit
This results in following script:
#!/bin/bash
# execute command, use '&' at the end to run in background
<command here> &
# store pid
pid=$!
while ps | awk '{ print $1 }' | grep -q "$pid"; do
ping <address here>
sleep <timeout here in seconds>
done
Note that the stuff inside <> should be replaced with actual values, be it a command or an IP address.
Break from Loop
To answer your second question, that depends on the loop. In the loop above, simply track the loop count using a variable: add ((count++)) inside the loop, and then do [[ $count -eq 2 ]] && break. Now the loop will break after the second ping.
Something like this:
...
while ...; do
...
((count++))
[[ $count -eq 2 ]] && break
done
ping twice
To ping only a few times, use the -c option:
ping -c <count here> <address here>
Example:
ping -c 2 www.google.com
Use man ping for more information.
Better practice
As hek2mgl noted in a comment below, the current solution may not suffice to solve the problem: it answers the question, but the core problem will still persist. To address that problem, a cron job is suggested in which a simple wget or curl HTTP request is sent periodically. This results in a fairly simple script containing but one line:
#!/bin/bash
curl <address here> > /dev/null 2>&1
This script can be added as a cron job. Leave a comment if you would like more information on how to set up such a scheduled job. Special thanks to hek2mgl for analyzing the problem and suggesting a sound solution.
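For example, a crontab entry that runs it every five minutes could look like this (the script path is a placeholder):
*/5 * * * * /path/to/keepalive.sh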
Say you want to start a download with wget and while it is running, ping the url:
wget http://example.com/large_file.tgz & #put in background
pid=$!
while kill -s 0 $pid #test if process is running
do
ping -c 1 127.0.0.1 #ping your address once
sleep 5 #and sleep for 5 seconds
done
A nice little generic utility for this is Daemonize. Its relevant options:
Usage: daemonize [OPTIONS] path [arg] ...
-c <dir> # Set daemon's working directory to <dir>.
-E var=value # Pass environment setting to daemon. May appear multiple times.
-p <pidfile> # Save PID to <pidfile>.
-u <user> # Run daemon as user <user>. Requires invocation as root.
-l <lockfile> # Single-instance checking using lockfile <lockfile>.
Here's an example of starting/killing in use: flickd
To get more sophisticated, you could turn your ping script into a systemd service, now standard on many recent Linuxes.

how to check if mongodb is up and ready to accept connections from bash script?

I have a bash shell script which does a bunch of stuff before trying to mongorestore.
I want to make sure that MongoDB is not only up, but also ready to accept connections, before I try to restore.
Right now, what I am seeing is that the mongo process is up, but it takes 45+ seconds to do the initial setup (setting up journal files etc.) before it is ready to accept connections. Ideally I want to keep testing the connection in a loop, and only run mongorestore once I am able to connect.
Can someone show me how to do this in Bash or point me in the right direction?
To test the connection in a loop like you suggest,
until nc -z localhost 27017
do
sleep 1
done
A solution using the MongoDB tools themselves. Useful in a Docker container or something similar where you do not want to install nc.
until mongo --eval "print(\"waited for connection\")"
do
sleep 60
done
Based on that other guy's answer.
I recently had the same problem. I decided to configure mongod to log all its output to a logfile, and then wait in a loop, checking the logfile until we see some output that suggests mongod is ready.
This is an example logfile output line we need to wait for:
Tue Dec 3 14:25:28.217 [initandlisten] waiting for connections on port 27017
This is the bash script I came up with:
#!/bin/bash
# Initialize a mongo data folder and logfile
mkdir -p /data/db
touch /var/log/mongodb.log
# Start mongodb with logging
# --logpath Without this mongod will output all log information to the standard output.
# --logappend Ensure mongod appends new entries to the end of the logfile. We create it first so that the below tail always finds something
/usr/bin/mongod --quiet --logpath /var/log/mongodb.log --logappend &
# Wait until mongo logs that it's ready (or timeout after 60s)
COUNTER=0
grep -q 'waiting for connections on port' /var/log/mongodb.log
while [[ $? -ne 0 && $COUNTER -lt 60 ]] ; do
sleep 2
let COUNTER+=2
echo "Waiting for mongo to initialize... ($COUNTER seconds so far)"
grep -q 'waiting for connections on port' /var/log/mongodb.log
done
# Now we know mongo is ready and can continue with other commands
...
Notice the script will not wait forever; it will time out after 60s - you may or may not want that depending on your use case.
I needed Mongo running in Docker to initialize before creating a user. I combined the answers of Tom and Björn. This is the script I am using:
#!/bin/bash
# Wait until Mongo is ready to accept connections, exit if this does not happen within 30 seconds
COUNTER=0
until mongo --host ${MONGO_HOST} --eval "printjson(db.serverStatus())"
do
sleep 1
COUNTER=$((COUNTER+1))
if [[ ${COUNTER} -eq 30 ]]; then
echo "MongoDB did not initialize within 30 seconds, exiting"
exit 2
fi
echo "Waiting for MongoDB to initialize... ${COUNTER}/30"
done
# Connect to the MongoDB and execute the create users script
mongo ${FWRD_API_DB} /root/create-user.js --host ${MONGO_HOST} -u ${MONGO_ROOT_USER} -p ${MONGO_ROOT_PASSWORD} --authenticationDatabase admin
While Tom's answer will work quite well in most situations, it can fail if you are running your script with the -e flag, that is, with set -e at the very top. Why? Because Tom's answer relies on the exit status of the previous command, in this case grep -q, which will fail if it does not find the required string, and therefore the entire script will fail. The -e flag documentation gives a clue on how to avoid this problem:
The shell does not exit if the command that fails is part of the
command list immediately following a while or until keyword, part of
the test in an if statement, part of any command executed in a && or
|| list except the command following the final && or ||, any command
in a pipeline but the last, or if the command’s return status is being
inverted with !.
So, one solution is to make the grep command part of the while condition. However, since he is running mongodb with the --logappend option, the search string could appear as a result of a previous run. I combined that other guy's answer with Tom's answer and it works really well:
# Wait until mongo logs that it's ready (or timeout after 60s)
COUNTER=0
while ! nc -z localhost 27017 && [[ $COUNTER -lt 60 ]]; do
sleep 2
let COUNTER+=2
echo "Waiting for mongo to initialize... ($COUNTER seconds so far)"
done
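Since that loop exits either because the port opened or because the counter reached 60, you may want to verify which one happened before continuing; a small sketch:
if ! nc -z localhost 27017; then
    echo "mongo did not come up within 60 seconds" >&2
    exit 1
fi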
I find that using netcat is the best solution, because it actually tests whether something is listening.
Possible Docker solution:
Given the docker_id for me it works reading the docker logs like:
until [ $(docker logs --tail all $docker_id | grep "waiting for connections on port" | wc -l) -gt 0 ]; do
printf '.'
sleep 1
done
then continue with any mongo-dependent task.
This approach is used in the Bitnami MongoDB Helm chart, but it requires the MongoDB shell client (mongosh) to be installed:
mongosh --host {your_mongo_host} --port {your_mongo_port} --eval "db.adminCommand('ping')"
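If you need to wait rather than test once, the same command drops into a loop like the ones above (host and port placeholders as before):
until mongosh --host {your_mongo_host} --port {your_mongo_port} --eval "db.adminCommand('ping')" > /dev/null 2>&1
do
    sleep 1
done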

Restart a process when cpu gets high

I've got a cron job checking for webserver (seeing if its active), which is handy..
http://pastebin.com/raw.php?i=KW8crfzh
I'm after something similar for CPU usage. I'm running a Java backend which occasionally hits 70%+ CPU. I'd like a cron script to automatically kill/restart the Java process if CPU load gets too high - how is this possible?
You could use top in batch mode coupled with some code to parse its output. For example:
top -p 1234 -n 1 -b
Will output a snapshot of the state of process 1234.
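As a rough sketch of how that could be wired into a cron-driven check - the way the java process is selected, the 70% threshold, and the restart command are all assumptions you would adapt:
#!/bin/bash
# find the (assumed single) main java process; do nothing if it is not running
pid=$(pgrep -o java) || exit 0
# take one batch-mode snapshot of top and read the %CPU column for that pid
cpu=$(top -p "$pid" -n 1 -b | awk -v p="$pid" '$1 == p {print int($9)}')
if [ -n "$cpu" ] && [ "$cpu" -ge 70 ]; then
    kill "$pid"
    /etc/init.d/myjavaapp start   # placeholder for whatever restarts your backend
fi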
I use this script and it is pretty cool
#!/bin/bash
# author = Jaysunn
# Log
LOGFILE=/var/log/load_kill_log
# log the process causing the load at the time.
PSFILE=/var/log/ps_log
# Obtain the server load
loadavg=`uptime |cut -d , -f 4|cut -d : -f 2`
thisloadavg=`echo $loadavg|awk -F \. '{print $1}'`
if [ "$thisloadavg" -ge "10" ]; then
ps auxfww >> $PSFILE
date >> $LOGFILE
# Issue the command of choice. This can be any shell command.
## Put the command which restarts ..
fi
give executable permissions and add this to crontab with proper path to this script.
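For example (the script path is a placeholder):
chmod +x /usr/local/bin/load_check.sh
# then in crontab -e, run the check every five minutes:
*/5 * * * * /usr/local/bin/load_check.sh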

How do I make sure my bash script isn't already running?

I have a bash script I want to run every 5 minutes from cron... but there's a chance the previous run of the script isn't done yet... in this case, I want the new run to just exit. I don't want to rely on just a lock file in /tmp... I want to make sure the process is actually running before I honor the lock file (or whatever)...
Here is what I have stolen from the internet so far... how do I smarten it up a bit? Or is there a completely different way that's better?
if [ -f /tmp/mylockFile ] ; then
echo 'Script is still running'
else
echo 1 > /tmp/mylockFile
# Do some stuff
rm -f /tmp/mylockFile
fi
# Use a lockfile containing the pid of the running process
# If script crashes and leaves lockfile around, it will have a different pid so
# will not prevent script running again.
#
lf=/tmp/pidLockFile
# create empty lock file if none exists
cat /dev/null >> $lf
read lastPID < $lf
# if lastPID is not null and a process with that pid exists , exit
[ ! -z "$lastPID" -a -d /proc/$lastPID ] && exit
echo not running
# save my pid in the lock file
echo $$ > $lf
# sleep just to make testing easier
sleep 5
There is at least one race condition in this script. Don't use it for a life support system, lol. But it should work fine for your example, because your environment doesn't start two scripts simultaneously. There are lots of ways to use more atomic locks, but they generally depend on having a particular thing optionally installed, or work differently on NFS, etc...
You might want to have a look at the man page for the flock command, if you're lucky enough to get it on your distribution.
NAME
flock - Manage locks from shell scripts
SYNOPSIS
flock [-sxon] [-w timeout] lockfile [-c] command...
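A common way to use it from inside a script, rather than on the command line, is to take the lock on a file descriptor at the top of the script - a sketch, with an arbitrarily chosen lock path:
#!/bin/bash
exec 200>/var/lock/myscript.lock
flock -n 200 || { echo "already running" >&2; exit 1; }
# everything below here runs under the lock, which is released when the script exits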
Never use a lock file, always use a lock directory.
In your specific case, it's not so important because the start of the script is scheduled in 5min intervals. But if you ever reuse this code for a webserver cgi-script you are toast.
if mkdir /tmp/my_lock_dir 2>/dev/null
then
echo "running now the script"
sleep 10
rmdir /tmp/my_lock_dir
fi
This has a problem if you have a stale lock, meaning the lock is there but there is no associated process. Your cron job will never run again.
Why use a directory? Because mkdir is an atomic operation. Only one process at a time can create a directory, all other processes get an error. This even works across shared filesystems and probably even between different OS types.
Store your pid in mylockFile. When you need to check, look up the process with ps using the pid you read from the file. If it exists, your script is running.
If you want to check the process's existence, just look at the output of
ps aux | grep your_script_name
If it's there, it's not dead...
As pointed out in the comments and other answers, using the PID stored in the lockfile is much safer and is the standard approach most apps take. I just do this because it's convenient and I almost never see the corner cases (e.g. editing the file when the cron executes) in practice.
If you use a lockfile, you should make sure that the lockfile is always removed. You can do this with 'trap':
if ( set -o noclobber; echo "locked" > "$lockfile") 2> /dev/null; then
trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
echo "Locking succeeded" >&2
rm -f "$lockfile"
else
echo "Lock failed - exit" >&2
exit 1
fi
The noclobber option makes the creation of lockfile atomic, like using a directory.
As a one-liner and if you do not want to use a lockfile (e.g. b/c/ of a read only filesystem, etc)
test "$(pidof -x $(basename $0))" != $$ && exit
It checks that the full list of PIDs that bear the name of your script is equal to the current PID. The "-x" flag also checks for the names of shell scripts.
Bash makes it even shorter and faster:
[[ "$(pidof -x $(basename $0))" != $$ ]] && exit
In some cases, you might want to be able to distinguish between who is running the script and allow some concurrency but not all. In that case, you can use per-user, per-tty or cron-specific locks.
You can use environment variables such as $USER or the output of a program such as tty to create the filename. For cron, you can set a variable in the crontab file and test for it in your script.
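For example, a per-user or per-terminal lock file name could be built like this (a sketch; the directory and naming scheme are arbitrary):
lockfile="/tmp/$(basename "$0").${USER}.lock"            # one lock per user
# or, per terminal:
lockfile="/tmp/$(basename "$0").$(tty | tr -d '/').lock"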
you can use this one:
pgrep -f "/bin/\w*sh .*scriptname" | grep -vq $$ && exit
I was trying to solve this problem today and I came up with the below:
COMMAND_LINE="$0 $*"
JOBS=$(SUBSHELL_PID=$BASHPID; ps axo pid,command | grep "${COMMAND_LINE}" | grep -v $$ | grep -v ${SUBSHELL_PID} | grep -v grep)
if [[ -z "${JOBS}" ]]
then
# not already running
else
# already running
fi
This relies on $BASHPID which contains the PID inside a subshell ($$ in the subshell is the parent pid). However, this relies on Bash v4 and I needed to run this on OSX which has Bash v3.2.48. I ultimately came up with another solution and it is cleaner:
JOBS=$(sh -c "ps axo pid,command | grep \"${COMMAND_LINE}\" | grep -v grep | grep -v $$")
You can always just:
if ps -e -o cmd | grep scriptname > /dev/null; then
exit
fi
But I like the lockfile myself, so I wouldn't do this without the lock file as well.
Since a socket solution has not yet been mentioned, it is worth pointing out that sockets can be used as effective mutexes. Socket creation is an atomic operation, like mkdir is (as Gunstick pointed out), so a socket is suitable to use as a lock or mutex.
Tim Kay's Perl script 'Solo' is a very small and effective script to make sure only one copy of a script can be run at any one time. It was designed specifically for use with cron jobs, although it works perfectly for other tasks as well and I've used it for non-cron jobs very effectively.
Solo has one advantage over the other techniques mentioned so far in that the check is done outside of the script you only want to run one copy of. If the script is already running then a second instance of that script will never even be started. This is as opposed to isolating a block of code inside the script which is protected by a lock. EDIT: If flock is used in a cron job, rather than from inside a script, then you can also use that to prevent a second instance of the script from starting - see example below.
Here's an example of how you might use it with cron:
*/5 * * * * solo -port=3801 /path/to/script.sh args args args
# "/path/to/script.sh args args args" is only called if no other instance of
# "/path/to/script.sh" is running, or more accurately if the socket on port 3801
# is not open. Distinct port numbers can be used for different programs so that
# if script_1.sh is running it does not prevent script_2.sh from starting, I've
# used the port range 3801 to 3810 without conflicts. For Linux non-root users
# the valid port range is 1024 to 65535 (0 to 1023 are reserved for root).
* * * * * solo -port=3802 /path/to/script_1.sh
* * * * * solo -port=3803 /path/to/script_2.sh
# Flock can also be used in cron jobs with a distinct lock path for different
# programs, in the example below script_3.sh will only be started if the one
# started a minute earlier has already finished.
* * * * * flock -n /tmp/path.to.lock -c /path/to/script_3.sh
Links:
Solo web page: http://timkay.com/solo/
Solo script: http://timkay.com/solo/solo
Hope this helps.
You can use this.
I'll just shamelessly copy-paste the solution here, as it is an answer for both questions (I would argue that it's actually a better fit for this question).
Usage
include sh_lock_functions.sh
init using sh_lock_init
lock using sh_acquire_lock
check lock using sh_check_lock
unlock using sh_remove_lock
Script File
sh_lock_functions.sh
#!/bin/bash
function sh_lock_init {
sh_lock_scriptName=$(basename $0)
sh_lock_dir="/tmp/${sh_lock_scriptName}.lock" #lock directory
sh_lock_file="${sh_lock_dir}/lockPid.txt" #lock file
}
function sh_acquire_lock {
if mkdir $sh_lock_dir 2>/dev/null; then #check for lock
echo "$sh_lock_scriptName lock acquired successfully.">&2
touch $sh_lock_file
echo $$ > $sh_lock_file # set current pid in lockFile
return 0
else
touch $sh_lock_file
read sh_lock_lastPID < $sh_lock_file
if [ ! -z "$sh_lock_lastPID" -a -d /proc/$sh_lock_lastPID ]; then # if lastPID is not null and a process with that pid exists
echo "$sh_lock_scriptName is already running.">&2
return 1
else
echo "$sh_lock_scriptName stopped during execution, reacquiring lock.">&2
echo $$ > $sh_lock_file # set current pid in lockFile
return 2
fi
fi
return 0
}
function sh_check_lock {
[[ ! -f $sh_lock_file ]] && echo "$sh_lock_scriptName lock file removed.">&2 && return 1
read sh_lock_lastPID < $sh_lock_file
[[ $sh_lock_lastPID -ne $$ ]] && echo "$sh_lock_scriptName lock file pid has changed.">&2 && return 2
echo "$sh_lock_scriptName lock still in place.">&2
return 0
}
function sh_remove_lock {
rm -r $sh_lock_dir
}
Usage example
sh_lock_usage_example.sh
#!/bin/bash
. /path/to/sh_lock_functions.sh # load sh lock functions
sh_lock_init || exit $?
sh_acquire_lock
lockStatus=$?
[[ $lockStatus -eq 1 ]] && exit $lockStatus
[[ $lockStatus -eq 2 ]] && echo "lock is set, do some resume from crash procedures";
#monitoring example
cnt=0
while sh_check_lock # loop while lock is in place
do
echo "$sh_scriptName running (pid $$)"
sleep 1
let cnt++
[[ $cnt -gt 5 ]] && break
done
#remove lock when process finished
sh_remove_lock || exit $?
exit 0
Features
Uses a combination of a lock file, a lock directory and the process id to make sure that the process is not already running
You can detect if the script stopped before lock removal (eg. process kill, shutdown, error etc.)
You can check the lock file, and use it to trigger a process shutdown when the lock is missing
Verbose: outputs error messages for easier debugging
