Queue using several processes to launch bash jobs - bash

I need to run many (hundreds of) shell commands, but I want at most 4 processes from the queue running at once. Each process will last several hours.
When a process finishes I want the next command to be "popped" from the queue and executed.
I also want to be able to add more jobs after the start, and it would be great if I could remove some jobs from the queue, or at least empty the queue.
I have seen solutions using a makefile, but that only works if I have the full list of commands before starting. I also tried mkfifo, sjobq, and others, but none of them met my needs...
Does anyone have code to solve this problem?
Edit: In response to Mark Setchell
The solution with tail -f and parallel is almost perfect, but when I run it, the last 4 commands are never launched until I add more, and so on. I don't know why, and it is quite troublesome...
As for Redis, it is also a good solution, but it takes more time to master.
Thanks !

Use GNU Parallel to make a job queue like this:
# Clear out file containing job queue
> jobqueue
# Start GNU Parallel processing jobs from queue
# -k means "keep" output in order
# -j 4 means run 4 jobs at a time
tail -f jobqueue | parallel -k -j 4
# From another terminal, submit 40 jobs to the queue
for i in {1..40}; do echo "sleep 5;date +'%H:%M:%S Job $i'"; done >> jobqueue
Another option is to use REDIS - see my answer here Run several jobs parallelly and Efficiently
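For the parts of the question about removing queued jobs or emptying the queue, one possibility is a directory-based queue in plain bash. This is only a minimal sketch and not part of the answer above: the function name run_queue, the .job suffix and the STOP marker file are assumptions made up for illustration, and wait -n needs bash 4.3 or newer.

```bash
#!/usr/bin/env bash
# Hypothetical directory-based job queue (all names here are assumptions).
# Each *.job file in the queue directory holds one shell command.
run_queue() {
    local queue_dir=$1 max_jobs=${2:-4} f job cmd
    while true; do
        # Pick the first pending job file in glob (alphabetical) order.
        job=
        for f in "$queue_dir"/*.job; do
            if [[ -e $f ]]; then job=$f; break; fi
        done
        if [[ -z $job ]]; then
            # Queue is empty: stop if a STOP marker exists, else poll again.
            if [[ -e $queue_dir/STOP ]]; then break; fi
            sleep 1
            continue
        fi
        # Block while all slots are busy.
        while (( $(jobs -pr | wc -l) >= max_jobs )); do
            wait -n
        done
        cmd=$(<"$job")
        rm -f -- "$job"        # a job leaves the queue once it starts
        bash -c "$cmd" &
    done
    wait                       # let the jobs still running finish
}
```

With this layout, submitting a job means creating a file such as queue/0001.job, cancelling it means deleting that file, emptying the queue is rm queue/*.job, and touch queue/STOP lets run_queue exit once the queue has drained. Writing the job file under a temporary name and renaming it to *.job avoids the runner reading a half-written command.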

Related

is there a way to trigger 10 scripts at any given time in Linux shell scripting?

I have a requirement where I need to trigger 10 shell scripts at a time. I may have 200+ shell scripts to be executed.
e.g. if I trigger 10 jobs and two jobs completed, I need to trigger another 2 jobs which will make number of jobs currently executing to 10.
I need your help and suggestion to cater this requirement.
Yes with GNU Parallel like this:
parallel -j 10 < ListOfJobs.txt
Or, if your jobs are called job_1.sh to job_200.sh:
parallel -j 10 job_{}.sh ::: {1..200}
Or, if your jobs have discontiguous, random names but are all shell scripts with a .sh suffix in one directory:
parallel -j 10 ::: *.sh
There is a very good overview here. There are lots of questions and answers on Stack Overflow here.
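If GNU Parallel cannot be installed, xargs with -P gives a similar "keep N running" behaviour. The following is a sketch under assumptions: -P, -0 and -maxdepth are extensions found in GNU and BSD tools rather than POSIX, and the function name is hypothetical.

```bash
#!/usr/bin/env bash
# run_scripts_limited DIR [MAX]: run every *.sh file in DIR with bash,
# keeping at most MAX (default 10) of them running at any time.
run_scripts_limited() {
    local dir=$1 max=${2:-10}
    # -print0 / -0 keep file names with spaces intact.
    # -n 1 passes one script per bash invocation; -P sets the concurrency.
    find "$dir" -maxdepth 1 -name '*.sh' -print0 |
        xargs -0 -n 1 -P "$max" bash
}
```

As with parallel -j 10, a new script is started as soon as one of the running ones exits, and the function returns once all of them have finished.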
Simply run them as background jobs:
for i in {1..10}; { ./script.sh & }
Adding more jobs if less than 10 are running:
while true; do
pids=($(jobs -pr))
((${#pids[@]}<10)) && ./script.sh &
done &> /dev/null
There are different ways to handle this:
Launch them together as background tasks (1)
Launch them in parallel (1)
Use the crontab (2)
Use at (3)
Explanations:
(1) You can launch the processes exactly when you like (by launching a command, click a button or whatever event you choose)
(2) The processes will be launched at the same time, every (working) day, periodically.
(3) You choose a time when the processes will be launched together once.
I have used the following to trigger 10 jobs at a time.
max_jobs_trigger=10
while mapfile -t -n ${max_jobs_trigger} ary && ((${#ary[@]})); do
jobs_to_trigger=$(printf '%s\n' "${ary[@]}")
#Trigger script in background
done
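The loop above leaves the launch step as a comment. A filled-in version might look like the following sketch, which assumes the list file holds one command per line; the function name run_in_batches is hypothetical, and mapfile -n needs bash 4 or newer.

```bash
#!/usr/bin/env bash
# run_in_batches LIST_FILE [BATCH_SIZE]: read commands from LIST_FILE and
# run them BATCH_SIZE (default 10) at a time, waiting between batches.
run_in_batches() {
    local list_file=$1 batch_size=${2:-10} job
    local -a batch
    # mapfile returns success even at end of file, so also check that
    # the batch is non-empty.
    while mapfile -t -n "$batch_size" batch && (( ${#batch[@]} )); do
        for job in "${batch[@]}"; do
            # Detach stdin so a job cannot swallow the rest of the list.
            bash -c "$job" < /dev/null &
        done
        wait    # the next batch starts only when this one is done
    done < "$list_file"
}
```

Note that this is batch-at-a-time: the next 10 jobs start only after the slowest job of the current batch has finished, so it does not top the pool back up to 10 the way the question asks.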

How do I create a Stack or LIFO for GNU Parallel in Bash

While my original problem was solved in a different manner (see comment thread under this question, as well as the edits to this question), I was able to create a stack/LIFO for GNU Parallel in Bash. So I have edited my background/question to reflect a situation where it could be needed.
Background
I am using GNU Parallel to process files with a Bash script. As the files are processed, more files are created and new commands need to be added to parallel's list. I am not able to give parallel a complete list of commands, as information is generated as the initial files are processed.
I need a way to add the lines to parallel's list while it is running.
Parallel will also need to wait for a new line if nothing is in the queue and exit once the queue is finished.
Solution
First I created a fifo:
mkfifo /tmp/fifo
Next I created a bash script that cats the FIFO and pipes the output to parallel, which checks for the end_of_file line. (I wrote this with help from the accepted answer as well as from here)
#!/bin/bash
while true;
do
cat /tmp/fifo
done | parallel --ungroup --gnu --eof "end_of_file" "{}"
Then I write to the pipe with this command, adding lines to parallel's queue:
echo "command here" > /tmp/fifo
With this setup, all new commands are added to the queue. Once the queue is full parallel will begin processing it. This means that if you have slots for 32 jobs (32 processors), then you will need to add 32 jobs in order to start the queue.
If parallel is occupying all of its processors, it will put the job on hold until a processor becomes available.
By using the --ungroup argument, parallel will process/output jobs as they are added to the queue once the queue is full.
Without the --ungroup argument, parallel waits until a new slot is needed to complete a job. From the accepted answer:
Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or -u, in which case the output from the jobs are printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of second completed job will only be printed when job 12 has started.
From http://www.gnu.org/software/parallel/man.html#EXAMPLE:-GNU-Parallel-as-queue-system-batch-manager
There is a small issue when using GNU parallel as queue system/batch manager: You have to submit JobSlot number of jobs before they will start, and after that you can submit one at a time, and job will start immediately if free slots are available. Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or -u, in which case the output from the jobs are printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of second completed job will only be printed when job 12 has started.
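To see the FIFO mechanics in isolation, here is a small reader without parallel. It is a sketch under assumptions: the function name is made up, commands run one after another rather than in parallel, and opening a FIFO read-write (which keeps it open between writers so the reader never sees end-of-file) works on Linux but is not guaranteed by POSIX.

```bash
#!/usr/bin/env bash
# read_fifo_queue FIFO: run each line written to FIFO as a command, in
# order, until the sentinel line "end_of_file" arrives.
read_fifo_queue() {
    local fifo=$1 fd line
    # Read-write open: the FIFO stays open even when no writer is
    # connected, so read blocks instead of returning end-of-file.
    exec {fd}<>"$fifo"
    while IFS= read -r -u "$fd" line; do
        if [[ $line == end_of_file ]]; then break; fi
        bash -c "$line"
    done
    exec {fd}<&-    # close the descriptor again
}
```

Writers use the same echo "command here" > /tmp/fifo as above, and echo end_of_file > /tmp/fifo stops the reader. The while true; cat loop in the script above serves the same purpose as the read-write open here: it reopens the FIFO each time a writer disconnects.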

shell script to loop and start processes in parallel?

I need a shell script that will create a loop to start parallel tasks read in from a file...
Something along the lines of:
#!/bin/bash
mylist=/home/mylist.txt
# read one path per line from the list file
while IFS= read -r i
do
    # do something like:
    cp -rp "$i" /destination &
done < "$mylist"
wait
So what I am trying to do is send a bunch of tasks to the background with "&", one for each line in $mylist, and wait for them to finish before exiting.
However, there may be a lot of lines in there so I want to control how many parallel background processes get started; want to be able to max it at say.. 5? 10?
Any ideas?
Thank you
Your task manager will make it seem like you can run many parallel jobs. How many you can actually run to obtain maximum efficiency depends on your processor. Overall you don't have to worry about starting too many processes because your system will do that for you. If you want to limit them anyway because the number could get absurdly high you could use something like this (provided you execute a cp command every time):
...
while ...; do
jobs=$(pgrep 'cp' | wc -l)
# use { ...; } rather than ( ... ): continue inside a subshell would not affect this loop
[[ $jobs -gt 50 ]] && { sleep 100; continue; }
...
done
The number of running cp commands will be stored in the jobs variable and before starting a new iteration it will check if there are too many already. Note that we jump to a new iteration so you'd have to keep track of how many commands you already executed. Alternatively you could use wait.
Edit:
On a side note, you can pin a process to a specific CPU core using taskset; this may come in handy when you have fewer, more complex commands.
You are probably looking for something like this using GNU Parallel:
parallel -j10 cp -rp {} /destination :::: /home/mylist.txt
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU.
GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time.
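The difference between the two strategies can be illustrated with a toy calculation. This is only a sketch: the function names are hypothetical and the durations in the usage note are made up. It compares the finishing time of a fixed split with that of handing each job to whichever CPU frees up first.

```bash
#!/usr/bin/env bash
# static_makespan CPUS DURATION...: split the jobs into equal contiguous
# chunks, one chunk per CPU, and print when the slowest CPU finishes.
static_makespan() {
    local cpus=$1; shift
    local -a load=()
    local n=$# per i=0 d c max=0
    per=$(( (n + cpus - 1) / cpus ))      # jobs per CPU, rounded up
    for d in "$@"; do
        c=$(( i / per ))
        load[c]=$(( ${load[c]:-0} + d ))
        i=$(( i + 1 ))
    done
    for d in "${load[@]}"; do (( d > max )) && max=$d; done
    echo "$max"
}

# dynamic_makespan CPUS DURATION...: give each job, in order, to the CPU
# that becomes free first, which is how a job-slot scheduler behaves.
dynamic_makespan() {
    local cpus=$1; shift
    local -a free_at=()
    local d c best max=0
    for (( c = 0; c < cpus; c++ )); do free_at[c]=0; done
    for d in "$@"; do
        best=0
        for (( c = 1; c < cpus; c++ )); do
            (( free_at[c] < free_at[best] )) && best=$c
        done
        free_at[best]=$(( free_at[best] + d ))
    done
    for d in "${free_at[@]}"; do (( d > max )) && max=$d; done
    echo "$max"
}
```

For example, with 2 CPUs and one long job followed by seven short ones, static_makespan 2 8 1 1 1 1 1 1 1 prints 11, while dynamic_makespan 2 8 1 1 1 1 1 1 1 prints 8, because the second CPU works through the short jobs while the first is busy with the long one.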
Installation
If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

Hold a bash script on PBS status without Torque

I have access to a low-priority queue on a large national system. I can allocate only 1 job at a time in the queue.
The PBS job contains a program that is not likely to complete before the wall-time ends. No more than 3 jobs on hold can be queued.
It means that:
I cannot use -W depend=afterok:$ID_of_previous_job. The script would submit all the jobs at once, but just the first 3 would enter the queue (the last 2 in H state).
I cannot modify the submission script with a last line that submits the next_job (it is very likely that the actual program won't finish before the walltime ends, and then the last line is not executed).
I can not install any software so I am limited to use a Bash Script, rather than Torque
I'd rather not use a "time check" script (such as: every 5 minute check if previous_job is over)
Is it possible to use a while loop and/or sleep?
Option 1
To use a while and sleep requires you to do something very similar to a time check script:
#!/bin/bash
jobid=`submit the first job`
while [[ -z `qstat ${jobid} | grep C` ]]; do
sleep 5
done
# submit the new job once the loop is done, after checking the exit status if desired
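Wrapped into functions, Option 1 might look like the sketch below. The function names, the 300-second default interval and the use of qstat -f with its job_state line are assumptions; state C is the one the grep above looks for, and other schedulers may report a finished job differently.

```bash
#!/usr/bin/env bash
# wait_for_job JOBID [INTERVAL]: poll qstat until the job is reported as
# completed (state C) or the scheduler no longer knows about it.
wait_for_job() {
    local jobid=$1 interval=${2:-300} state
    while true; do
        # "qstat -f" prints a "job_state = X" line for a known job.
        state=$(qstat -f "$jobid" 2>/dev/null |
                sed -n 's/^ *job_state = //p')
        if [[ -z $state || $state == C ]]; then break; fi
        sleep "$interval"
    done
}

# submit_chain SCRIPT COUNT [INTERVAL]: submit SCRIPT COUNT times, each
# submission happening only after the previous job has left the queue.
submit_chain() {
    local script=$1 count=$2 interval=${3:-300} i jobid
    for (( i = 0; i < count; i++ )); do
        jobid=$(qsub "$script") || return 1
        wait_for_job "$jobid" "$interval"
    done
}
```

This keeps only one job in the queue at a time, at the cost of a script that has to stay running on the login node for the whole chain.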
Option 2 - may be TORQUE only, not sure:
Perhaps a better way, suggested by Dmitri Chubarov in the comments, would be to use the per-job epilogue option. To do this the compute nodes have to be able to submit jobs, but since you were considering having the final line of the job do it then this seems like a possibility.
Add to the job a perjob epilogue by adding this line to the script:
#PBS -l epilogue=/path/to/script
And then have the script:
#!/bin/bash
# check the exit code if desired; it is argument 10 to the script
# submit the next job
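A filled-in epilogue could be as small as the following sketch. The resubmission policy (resubmit whenever the exit code is non-zero, on the assumption that this means the walltime cut the job short) and the name next_job.pbs are illustrative assumptions, not something the scheduler prescribes.

```bash
#!/bin/bash
# resubmit_if_unfinished EXIT_CODE NEXT_SCRIPT: submit NEXT_SCRIPT when
# the finished job reported a non-zero exit code (assumed policy).
resubmit_if_unfinished() {
    local exit_code=$1 next_script=$2
    if (( exit_code != 0 )); then
        qsub "$next_script"
    fi
}

# Inside the epilogue script this would be called with argument 10, e.g.:
#   resubmit_if_unfinished "${10}" /path/to/next_job.pbs
```

Note the braces in ${10}: a bare $10 is read by bash as $1 followed by a literal 0.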

GNU parallel: different commands to different computers?

Have searched on SO and GNU parallel tutorial and gone through examples here, but still don't quite see what I need solved. Any tips appreciated on how I could accomplish the following:
I need to invoke the same script on several remote servers with a different argument passed to each one (argument is a string), then wait until all those jobs are done... Then, run that same script some more times on those same remote servers, but this time try to keep the remote servers as busy as possible (ie when they finish their job, send them another job). Ideally the strings could be read in from a file on the "master" machine that is sending the jobs to the remote servers.
To diagram this, I'm trying to run *my_script* like this:
server A: myscript fee
server B: myscript fi
When both jobs are done I then want to do something like:
server A: myscript fo
server B: myscript fum
... and supposing A finished its work before server B, immediately sending it the next job like :
server A: myscript englishmun
... etc
Again, hugely appreciate any ideas people might have about whether this is easy/hard with GNU parallel (or if something else like pdsh, cluster ssh, might be better suited).
TIA!
It seems we can split the problem up in two parts: an initialization part that needs to be run on all servers and a job processing part that does not care which server it is run on.
The last part is GNU Parallel's specialty:
cat argfile | parallel -S serverA,serverB myscript
The first part is a bit more tricky: You want the first k arguments to go to k servers.
head -n 2 argfile | parallel -j1 -S serverA,serverB myscript
The problem here is that if there are loads of servers, then serverA may finish before you get to the last server. It is much easier to run the same job on all servers:
head -n 1 argfile | parallel --onall -S serverA,serverB myscript
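Putting the head variant and the job processing part together, the two phases could be scripted roughly as follows. This is a sketch: the function name is hypothetical, and skipping the first k lines in the second phase with tail is an assumption about what is wanted (the commands above do not say whether those arguments should be reused). As noted above, -j1 does not strictly guarantee that the first k arguments land on k different servers.

```bash
#!/usr/bin/env bash
# run_two_phases ARGFILE SERVERS SCRIPT
# SERVERS is a comma-separated list as accepted by parallel -S.
run_two_phases() {
    local argfile=$1 servers=$2 script=$3 k
    # k = number of servers = number of comma-separated entries.
    k=$(awk -F, '{ print NF }' <<< "$servers")
    # Phase 1: the first k arguments, one job slot per server (-j1).
    # parallel returns only when all of these jobs are done.
    head -n "$k" "$argfile" | parallel -j1 -S "$servers" "$script"
    # Phase 2: the remaining arguments; each server gets a new job as
    # soon as it finishes the previous one.
    tail -n +"$((k + 1))" "$argfile" | parallel -S "$servers" "$script"
}
```

It would be called as run_two_phases argfile serverA,serverB myscript.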
