Hi, I am trying to use screen as part of a cron job.
Currently I have the following command:
screen -fa -d -m -S mapper /home/user/cron
Is there any way I can make this command do nothing if the screen session mapper already exists? The mapper runs on a half-hour cron job, but sometimes the mapping takes more than half an hour to complete, so the runs end up overlapping, slowing each other down, sometimes making the next run slow as well, and I end up with lots of mapper screens running.
Thanks for your time,
ls /var/run/screen/S-"$USER"/*.mapper >/dev/null 2>&1 || screen -S mapper ...
This will check whether any screen session named mapper exists for the current user, and only if none does will it launch a new one.
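Putting that together with the command from the question, the full cron command would look something like this (assuming cron sets $USER and that your distribution keeps screen sockets under /var/run/screen):

ls /var/run/screen/S-"$USER"/*.mapper >/dev/null 2>&1 || screen -fa -d -m -S mapper /home/user/cron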
Why would you want a job run by cron, which (by definition) does not have a terminal attached to it, to do anything with screen? According to Wikipedia, 'GNU Screen is a software application which can be used to multiplex several virtual consoles, allowing a user to access multiple separate terminal sessions'.
However, assuming there is some reason for doing it, then you probably need to create a lock file which the process checks before proceeding. At this point, you need to run a shell script from the cron entry (which is usually a good technique anyway), and the shell script can check whether the previous run of the task has completed, exiting if not. If the previous incarnation is complete, then the current incarnation creates a lock file containing its PID and runs the job. When it completes, it removes the lock file. You should review the shell trap command, and make sure that the lock file is removed if the shell script exits as a result of a trappable signal (you can't trap KILL and some process-control signals).
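A minimal sketch of that approach, assuming a lock path of /var/tmp/mapper.lock (the job command is the one from the question):

#!/bin/sh
LOCKFILE=/var/tmp/mapper.lock

# If the previous run is still alive, do nothing.
if [ -f "$LOCKFILE" ] && kill -0 "$(cat "$LOCKFILE")" 2>/dev/null; then
    exit 0
fi

echo $$ > "$LOCKFILE"
trap 'rm -f "$LOCKFILE"' EXIT    # remove the lock on normal exit
trap 'exit 1' HUP INT TERM       # route trappable signals through the EXIT trap

/home/user/cron                  # the actual job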
Judging from another answer, the screen program already creates lock files; you may not have to do anything special to create them, but you will need to detect whether they exist. Also check the GNU manual for screen.
I'm writing a bash script that essentially fires off a python script that takes roughly 10 hours to complete, followed by an R script that checks the outputs of the python script for anything I need to be concerned about. Here is what I have:
ProdRun="python scripts/run_prod.py"
echo "Commencing Production Run"
$ProdRun #Runs python script
wait
DupCompare="R CMD BATCH --no-save ../dupCompareTD.R" #Runs R script
$DupCompare
Now my issue is that often the python script can generate a whole heap of different processes on our linux server depending on its input, with lots of different PIDs, AND we have heaps of workers using the same server firing off scripts. As far as I can tell from reading, the wait command must wait for all processes to finish or for a specific PID to finish, but when I cannot tell what or how many PIDs will be assigned, or how many processes will run, how exactly do I use it?
EDIT: Thank you to all that helped; here is what caused my dilemma, for anyone Google-searching this. I broke the ProdRun python script up into the individual scripts it was itself calling, but still had the issue. I then found that one of those scripts was calling another smaller script with a "&" at the end of it, which ignored any commands to wait on it inside the python script itself. Simply removing the "&" and calling it with os.system() instead allowed all the code to run sequentially.
It sounds like you are trying to implement a job scheduler with possibly some complex dependencies between different tasks. I recommend using a job scheduler instead. It allows you to specify how to run those jobs while also benefiting from features like monitoring and handling of exceptional cases and errors.
Examples are the open-source Rundeck (https://github.com/rundeck/rundeck) or the commercial Control-M (http://www.bmcsoftware.uk/it-solutions/control-m.html).
Make your Python program wait on the children it spawns. That's the proper way to fix this scenario. Then you don't have to wait on the Python script from the shell at all: when it exits, its work is done.
(Also, don't put your commands in variables.)
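As an aside on why the wait in the original script did not help: the shell's wait builtin only waits for background children of that shell itself, not for anything the python script spawns internally. A minimal illustration (the command name is a placeholder):

long_step &   # backgrounded: this shell can wait for it
wait          # waits only for this shell's own background children, nothing deeper
long_step     # foreground: the script simply blocks here until it finishes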
I read this interesting question, which basically says that I should always avoid getting the PID of processes that aren't child processes. It's well explained and makes perfect sense.
BUT, while the OP there was trying to do something that cron isn't meant for, I'm in a very different situation:
I want to run a process say every 5 minutes, but once in a hundred times it takes a little more than 5 minutes to run (and I can't have two instances running at once).
I don't want to kill or manipulate other processes, I just want to end my process without doing anything if another instance of the process is running.
Is it OK to fetch the PID of non-child processes in that case? If so, how would I do it?
I've tried doing if pgrep "myscript"; then ... or stuff like that, but the process finds its own PID. I need to detect if it finds more than one.
(Initially, before being redirected, I read this question, but the solution given doesn't work: it can give the PID of the process doing the checking.)
EDIT: I should have mentioned it before, but if the script is already in use I still need to write something to a log file, at least: date>>script.log; echo "Script already in use">>script.log. I may be wrong, but I think flock doesn't let me do that.
Use lckdo or flock to avoid duplicate runs.
DESCRIPTION
lckdo runs a program with a lock held, in order to prevent multiple
processes from running in parallel. Use just like nice or nohup.
Now that util-linux contains a similar command named flock, lckdo is
deprecated, and will be removed from some future version of moreutils.
Of course, you can implement this primitive lock-file feature yourself:
if [ ! -f /tmp/my.lock ]; then
    touch /tmp/my.lock
    run prog
    rm -f /tmp/my.lock
fi
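With flock from util-linux, a non-blocking variant that also writes the log lines mentioned in the question's edit when another instance already holds the lock might look like this (the lock path and script path are illustrative):

(
  flock -n 9 || { date >> script.log; echo "Script already in use" >> script.log; exit 1; }
  /path/to/myscript    # replace with the real script
) 9> /var/tmp/myscript.lock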
I'm running a process that will take, optimistically, several hours, and in the worst case, probably a couple of days.
I've tried a couple of times to run it and it just never seems to complete (I should add, I didn't write the program, it's just a big dataset). I know my syntax for the command is correct as I use it all the time for smaller data and it works properly (I'll spare you the details as they are obscure for SO and I don't think they are relevant to the question).
Consequently, I'd like to leave the program running unattended, forked into the background with &.
Now, I'm not totally sure whether the process is just grinding to a halt or is running but taking much longer than expected.
Is there any way to check the progress of the process other than ps and top + 1 (to check CPU use)?
My only other thought was to get the process to output a logfile and periodically check to see if the logfile has grown in size/content.
As a sidebar, is it necessary to also use nohup with a forked command?
I would use screen for this purpose; see the man page for more details.
A brief summary of how to use it:
screen -S some_session_name - starts a new screen session named some_session_name
Ctrl-a followed by d - detaches the session
screen -r some_session_name - returns you to your session
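For the long-running job in the question, a typical sequence might be (the session name, command, and log file are illustrative):

screen -S bigjob                        # start a new session
./long_running_program > run.log 2>&1   # run the job inside it, logging its output
# press Ctrl-a, then d, to detach; the job keeps running, no nohup needed
screen -r bigjob                        # reattach later to check on progress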
I have a script named program.rb and would like to write a script named main.rb that would do the following:
system("ruby", "program.rb")
constantly check if program.rb is running until it is done
if program.rb has reached completion
  exit main.rb
end
otherwise keep doing this until program.rb reaches completion {
  if program.rb is not running and stopped before completing
    restart program.rb from where it left off
  end
}
I've looked into Pidify but could not find a way to apply it to this in exactly the right way...
Any help in how to approach this script would be greatly appreciated!
Update:
I could figure out how to resume the script from where it left off inside program.rb itself, if there's no way to do it from main.rb.
It's impossible to "restart a script from where it left off" without full cooperation from program.rb. That is, it should be able to advertise its progress (by writing its current state to a file, maybe?) and be able to start correctly from a step specified in ARGV. There's no external Ruby magic that can replace this functionality.
Also, if a program terminated abnormally, it means one of two things:
the error is (semi-)permanent (disk is full, no appropriate access rights to a file, etc). In this case, simply restarting the program would cause it to fail again. And again. Infinite fail loop.
the error is temporary (a shaky internet connection, say). In this case, the program should do a better job of exception handling and retry on its own (instead of terminating).
In either case, there's no need for restarting, IMHO.
Well, here is one way.
Modify program.rb to take an optional flag argument --restart or something.
When program.rb starts up without this argument it will initialize a file to record its current state. Periodically, it will write whatever it needs into this file to record some kind of checkpoint.
When program.rb starts up with the restart flag, it will read its checkpoint file and start processing at that point. For this to work, it must either checkpoint all state changes or arrange for all processing between checkpoints to be idempotent so it can be repeated without ill effect.
There are lots of ways to monitor the health of program.rb. The best way is with some sort of ping, perhaps something like GET /health_check or a dummy message via a socket or pipe. You could just have a locked file to detect if the lock is still held, or you could record the PID on startup and check that it still exists.
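In shell terms (from Ruby the same check can be done with Process.kill(0, pid)), that last idea might look roughly like this; program.pid is an assumed file that program.rb would write its own PID to at startup:

# Check whether the recorded PID still refers to a live process.
if kill -0 "$(cat program.pid)" 2>/dev/null; then
    echo "program.rb is still running"
else
    echo "program.rb has stopped"
fi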
I have a program that runs in the command line (i.e. $ run program starts up a prompt) that runs mathematical calculations. It has its own prompt that takes in text input and responds back through standard out/error (or creates a separate X window if needed, but this can be disabled). Sometimes I would like to send it small input, and other times I send in a large text file filled with a series of inputs, one per line. This program takes a lot of resources and also has a long startup time, so it would be best to have only one instance of it running at a time. I could keep the program's prompt open and supply input that way, or I could send the input followed by an exit command (to leave the prompt), which just prints the output. The problem with sending the request with an exit command is that the program must start up each time (slow ...). Furthermore, the output of this program is sometimes cryptic and it would be helpful to filter it in some way (e.g. simplify the output, apply ANSI colors, etc.).
This all makes me want to put some two-way IO filter (or is that a "pipe"? or a "wrapper"?) around the program so that it can run in the background as a single process. I would then communicate with it without having to restart it, while also filtering its output to be more user friendly. I have been looking all over for ideas and I am stumped at how to accomplish this in some simple, shell-accessible manner.
Some things I have tried: redirecting stdin and stdout to files, but the program hangs (doesn't quit) and only reads the file once, leaving me unable to continue communicating. I think this was because the prompt was waiting for more user input after the EOF. I thought this could be set up as a local server, but I am uncertain how to begin accomplishing that.
I would love to find some simple way to accomplish this. Additionally, if you can think of a way to perform this, do you think there is a way to also allow for attaching or detaching to the prompt by request? Any help and ideas would be greatly appreciated.
You could create two named pipes (man mkfifo) and redirect input and output:
myprog < fifoin > fifoout
Then you could open new terminal windows and do this in one:
cat > fifoin
And this in the other:
cat < fifoout
(Or use tee to save the input/output as well.)
To dump a large input file into the program, use:
cat myfile > fifoin
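One caveat when sending input a command at a time: a process reading from a FIFO sees EOF once every writer has closed it, so it helps to hold one writer open on the input pipe for as long as the session should last. A rough single-terminal sketch (myprog and the pipe names are the placeholders used above):

mkfifo fifoin fifoout            # create the two named pipes
myprog < fifoin > fifoout &      # run the program in the background
exec 3> fifoin                   # keep a writer open so the program never sees EOF
echo "first command" > fifoin    # send input whenever you like
cat fifoout &                    # read responses (optionally through a filter)
exec 3>&-                        # close the held writer when done, so the program sees EOF and exits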