Is it OK to check the PID for rare exceptions? - bash

I read this interesting question, which basically says that I should always avoid fetching the PID of processes that aren't child processes. It's well explained and makes perfect sense.
BUT, while the OP was trying to do something cron isn't meant for, I'm in a very different situation:
I want to run a process, say, every 5 minutes, but once in a hundred times it takes a little more than 5 minutes to run (and I can't have two instances running at once).
I don't want to kill or manipulate other processes; I just want my process to exit without doing anything if another instance of it is already running.
Is it OK to fetch the PID of non-child processes in that case? If so, how would I do it?
I've tried things like if pgrep "myscript"; then ..., but the process finds its own PID. I need to detect whether it finds more than one.
(Initially, before being redirected, I read this question, but the solution given doesn't work: it can return the PID of the process calling it.)
EDIT: I should have mentioned it before, but if the script is already in use I still need to write something to a log file, at least: date >> script.log; echo "Script already in use" >> script.log. I may be wrong, but I think flock doesn't allow that.

Use lckdo or flock to avoid duplicate runs.
DESCRIPTION
lckdo runs a program with a lock held, in order to prevent multiple
processes from running in parallel. Use just like nice or nohup.
Now that util-linux contains a similar command named flock, lckdo is
deprecated, and will be removed from some future version of moreutils.
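Since the OP's edit also asks about logging when a run is skipped, here is a minimal sketch using flock in non-blocking mode (the lock path and log file name are illustrative):
#!/bin/bash
# Take an exclusive, non-blocking lock on file descriptor 9.
exec 9>/tmp/myscript.lock
if ! flock -n 9; then
    # Another instance holds the lock: log it and bail out.
    { date; echo "Script already in use"; } >> script.log
    exit 1
fi
# ... the actual task goes here; the lock is released when the script exits.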
Of course, you can implement this primitive lockfile feature yourself. Note that the check and the touch below are not atomic, so two instances starting at the same instant can race past the check; flock does not have that problem.
if [ ! -f /tmp/my.lock ]; then
    touch /tmp/my.lock
    run_prog   # placeholder for the actual command
    rm -f /tmp/my.lock
fi

Related

Correct use of the Bash wait command with an unknown number of processes

I'm writing a bash script that essentially fires off a python script that takes roughly 10 hours to complete, followed by an R script that checks the outputs of the python script for anything I need to be concerned about. Here is what I have:
ProdRun="python scripts/run_prod.py"
echo "Commencing Production Run"
$ProdRun #Runs python script
wait
DupCompare="R CMD BATCH --no-save ../dupCompareTD.R" #Runs R script
$DupCompare
Now my issue is that the python script can often generate a whole heap of different processes on our Linux server depending on its input, with lots of different PIDs, AND we have heaps of workers using the same server firing off scripts. As far as I can tell from reading, the wait command must wait for all processes to finish or for a specific PID to finish, but when I cannot tell what or how many PIDs will be assigned, or how many processes will run, how exactly do I use it?
EDIT: Thank you to all who helped; here is what caused my dilemma, for anyone searching for this. I broke the ProdRun python script up into the individual scripts it was itself calling, but still had the issue. I then found that one of those scripts was calling another, smaller script with a "&" at the end, which made it ignore any commands to wait on it inside the python script itself. Simply removing this and calling the script with os.system() allowed all the code to run sequentially.
It sounds like you are trying to implement a job scheduler, possibly with complex dependencies between tasks. I recommend using a dedicated job scheduler instead. It allows you to specify how those jobs run while also benefiting from features like monitoring, handling of exceptional cases, errors, and so on.
Examples are the open-source rundeck (https://github.com/rundeck/rundeck) or the commercial Control-M (http://www.bmcsoftware.uk/it-solutions/control-m.html).
Make your Python program wait on the children it spawns. That's the proper way to fix this scenario. Then you don't have to wait for Python after it finishes (sic).
(Also, don't put your commands in variables.)
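To make that concrete, here is a minimal sketch of how wait is meant to be used (the loop inputs are hypothetical). Foreground commands already block on their own; wait only applies to jobs this shell put in the background:
# Foreground commands block until they finish; no wait needed:
python scripts/run_prod.py
R CMD BATCH --no-save ../dupCompareTD.R

# If you deliberately background several jobs, record their PIDs and wait:
pids=()
for input in a b c; do                    # hypothetical inputs
    python scripts/run_prod.py "$input" &
    pids+=("$!")
done
wait "${pids[@]}"                         # blocks until all recorded PIDs exit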

Check status of a forked process?

I'm running a process that will take, optimistically, several hours, and in the worst case, probably a couple of days.
I've tried a couple of times to run it and it just never seems to complete (I should add, I didn't write the program; it's just a big dataset). I know my syntax for the command is correct, as I use it all the time on smaller data and it works properly (I'll spare you the details, as they're obscure and I don't think relevant to the question).
Consequently, I'd like to leave the program unattended running as a fork with &.
Now, I'm not totally sure whether the process is just grinding to a halt or is running but taking much longer than expected.
Is there any way to check on the progress of the process other than ps and top (pressing 1 to check per-CPU use)?
My only other thought was to get the process to output a logfile and periodically check to see if the logfile has grown in size/content.
As a sidebar, is it necessary to also use nohup with a forked command?
I would use screen for this purpose; see its man page for more details.
A brief summary of how to use it:
screen -S some_session_name - starts a new screen session named some_session_name
Ctrl + a, then d - detaches the session
screen -r some_session_name - reattaches you to that session
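For the logfile idea and the nohup question in the post, a minimal sketch (long_job and the file names are placeholders for the real command and paths):
# nohup keeps the job alive after logout; redirect output so progress is visible.
nohup long_job --big-dataset > job.log 2>&1 &
echo $! > job.pid                               # remember the child's PID

# Later, check liveness and progress:
kill -0 "$(cat job.pid)" 2>/dev/null && echo "still running"
ls -l job.log                                   # has the log grown?
tail -f job.log                                 # watch output as it arrives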

bash shell script: sleep or cron job, which is preferred?

I want to run a task every 5 minutes, and I want to control when it starts and when it ends.
One way is to use sleep in a while true loop; another is to use a cron job. Which one is preferred performance-wise?
Thanks!
cron is almost always the best solution.
If you try to do it yourself with a simple script running in a while loop:
while true; do
    task
    sleep 300
done
you eventually find that nothing is happening because your task failed due to a transient error. Or the system rebooted. Or some such. Making your script robust enough to deal with all these eventualities is hard work, and unnecessary. That's what cron is for, after all.
Also, if the task takes some non-trivial amount of time, the above simple-minded while loop will slowly shift out of sync with the clock. That could be fixed:
while true; do
    task
    # Sleep until the next 300-second boundary of the wall clock,
    # so a slow task does not shift the schedule.
    sleep $((300 - $(date +%s) % 300))
done
Again, it's hardly worth it since cron will do that for you, too. However, cron will not save you from starting the task before the previous invocation finished, if the previous invocation got stuck somehow. So it's not a completely free ride, but it still provides you with some additional robustness.
A simple approach to solving the stuck-task problem is to use the flock utility. For example, you could cron a script containing the following:
(
    flock -n 8 || {
        logger -p user.warning Task is taking too long
        # You might want to kill the stuck task here. See pkill.
        exit 1
    }
    # Do the task here
) 8> /tmp/my_task.lck
Use a cron job. Cron is made for this type of use case, and it frees you from having to code the while loop yourself.
However, cron may be unsuitable if the script's run time is unpredictable and exceeds the schedule interval.
Performance-wise, it is hard to tell without knowing what the script does and how often it runs, but generally speaking neither option should have a negative impact on performance.
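Tying the two answers together, flock can also wrap the command directly in the cron entry itself, so no wrapper script is needed; a sketch of such a crontab line (the script path and lock file are placeholders):
# Run every 5 minutes, but skip the run if the previous one still holds the lock.
*/5 * * * * flock -n /tmp/my_task.lck /usr/local/bin/my_task.sh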

Screen Bash Scripting for Cronjob

Hi, I am trying to use screen as part of a cron job.
Currently I have the following command:
screen -fa -d -m -S mapper /home/user/cron
Is there any way I can make this command do nothing if the mapper screen already exists? The mapper runs on a half-hour cron job, but sometimes the mapping takes more than half an hour to complete, so runs end up overlapping, slowing each other down, sometimes even making the next run slow as well, and I end up with lots of mapper screens running.
Thanks for your time,
ls /var/run/screen/S-"$USER"/*.mapper >/dev/null 2>&1 || screen -S mapper ...
This will check whether any screen sessions named mapper exist for the current user, and only if none do will it launch a new one.
Why would you want a job run by cron, which (by definition) does not have a terminal attached to it, to do anything with the screen? According to Wikipedia, 'GNU Screen is a software application which can be used to multiplex several virtual consoles, allowing a user to access multiple separate terminal sessions'.
However, assuming there is some reason for doing it, then you probably need to create a lock file which the process checks before proceeding. At this point, you need to run a shell script from the cron entry (which is usually a good technique anyway), and the shell script can check whether the previous run of the task has completed, exiting if not. If the previous incarnation is complete, then the current incarnation creates a lock file containing its PID and runs the job. When it completes, it removes the lock file. You should review the shell trap command, and make sure that the lock file is removed if the shell script exits as a result of a trappable signal (you can't trap KILL and some process-control signals).
Judging from another answer, the screen program already creates lock files; you may not have to do anything special to create them - but will need to detect whether they exist. Also check the GNU manual for screen.
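A minimal sketch of the lock-file-plus-trap approach described above (the lock path is illustrative, and the job runs in the foreground so the lock lives exactly as long as the work does):
#!/bin/bash
LOCK=/tmp/mapper.lock

# If a previous run's lock exists and that process is still alive, do nothing.
if [ -f "$LOCK" ] && kill -0 "$(cat "$LOCK")" 2>/dev/null; then
    exit 0
fi

echo $$ > "$LOCK"
# Remove the lock on normal exit and on trappable signals (KILL can't be trapped).
trap 'rm -f "$LOCK"; exit' INT TERM
trap 'rm -f "$LOCK"' EXIT

/home/user/cron   # the mapping job, run in the foreground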

How to fire and forget a subprocess?

I have a long running process and I need it to launch another process (that will run for a good while too). I need to only start it, and then completely forget about it.
I managed to do what I needed by scooping some code from the Programming Ruby book, but I'd like to find the best/right way, and understand what is going on. Here's what I got initially:
exec("whatever --take-very-long") if fork.nil?
Process.detach($$)
So, is this the way, or how else should I do it?
After checking the answers below I ended up with this code, which seems to make more sense:
(pid = fork) ? Process.detach(pid) : exec("foo")
I'd appreciate some explanation on how fork works. [got that already]
Was detaching $$ right? I don't know why this works, and I'd really love to have a better grasp of the situation.
Alnitak is right. Here's a more explicit way to write it, without $$
pid = Process.fork
if pid.nil?
  # In child
  exec "whatever --take-very-long"
else
  # In parent
  Process.detach(pid)
end
The purpose of detach is just to say, "I don't care when the child terminates" to avoid zombie processes.
The fork function splits your process in two.
Both processes then receive the result of the call: the child receives a value of zero/nil (and hence knows that it's the child) and the parent receives the PID of the child.
Hence:
exec("something") if fork.nil?
will make the child process start "something", and the parent process will carry on with where it was.
Note that exec() replaces the current process with "something", so the child process will never execute any subsequent Ruby code.
The call to Process.detach() looks like it might be incorrect. I would have expected it to have the child's PID in it, but if I read your code right it's actually detaching the parent process.
Detaching $$ wasn't right. From p. 348 of the Pickaxe (2nd Ed.):
$$ (Fixnum) - The process number of the program being executed. [r/o]
That section, "Variables and Constants" in the "Ruby Language" chapter, is very handy for decoding Ruby's various short $ constants.
So what you were actually doing was detaching the program from itself, not from its child.
Like others have said, the proper way to detach from the child is to use the child's pid returned from fork().
The other answers are good if you're sure you want to detach the child process. However, if you either don't mind, or would prefer to keep the child process attached (e.g. you are launching sub-servers/services for a web app), then you can take advantage of the following shorthand:
fork do
  exec('whatever --option-flag')
end
Providing a block tells fork to execute that block (and only that block) in the child process, while continuing on in the parent.
I found the answers above broke my terminal and messed up the output. This is the solution I found:
system("nohup ./test.sh &")
Just in case anyone has the same issue: my goal was to log into an SSH server and then keep that process running indefinitely, so test.sh is this:
#!/usr/bin/expect -f
# Open an ssh session, answer the password prompt, then hold the session open.
spawn ssh host -l admin -i cloudkey
expect "pass"
send "superpass\r"
sleep 1000000
