Resource management in Bash for running processes in parallel - bash

We have a shared server with multiple GPU nodes and no resource manager. We make agreements of the form: "this week you can use nodes ID1, ID2 and ID5". I have a program that takes this ID as a parameter.
When I need to run my program ten times with ten different sets of parameters $ARGS1, $ARGS2, ..., $ARGS10, I run the first three commands
programOnGPU $ARGS1 -p ID1 &
programOnGPU $ARGS2 -p ID2 &
programOnGPU $ARGS3 -p ID5 &
Then I must wait for any of those to finish, and if e.g. ID2 finishes first, I then run
programOnGPU $ARGS4 -p ID2 &
As this is not very convenient when you have a lot of processes, I would like to automate it. I cannot use parallel as I need to reuse IDs.
The first use case is a script that needs to execute 10 commands, known a priori, of the type
programOnGPU $PARAMS -p IDX
and, when any of them finishes, assign its ID to the next one in the queue. Is this possible using bash without the overhead of something like SLURM? I don't need to check the state of the physical resource.
A more general solution would be a queue in bash, or a simple command-line utility, to which I can submit commands of the type
programABC $PARAMS
and it will add the GPU ID parameter and manage the queue, which is preconfigured to use only the given IDs, one ID at a time. Again, I don't want this layer to touch physical GPUs, just to ensure that it schedules consistently over the allowed IDs.

This is very simple with Redis. It is a very small, very fast, networked, in-memory data-structure server. It can store sets, queues, hashes, strings, lists, atomic integers and so on.
You can access it across a network in a lab, or across the world. There are clients for bash, C/C++, Ruby, PHP, Python and so on.
So, if you are allocated nodes 1, 2 and 5 for the week, you can just store those in a Redis "list" with LPUSH, using the Redis command-line interface (redis-cli) from bash:
redis-cli lpush VojtaKsNodes 1 2 5
If you are not on the Redis host, add its hostname/IP-address into the command like this:
redis-cli -h 192.168.0.4 lpush VojtaKsNodes 1 2 5
Now, when you want to run a job, get a node with BRPOP. I specify an infinite timeout with the zero at the end, but you could wait a different amount of time:
# Get a node with infinite timeout. BRPOP returns both the key name and the
# popped element, so keep only the last line (the element itself)
node=$(redis-cli brpop VojtaKsNodes 0 | tail -n 1)
run job on "$node"
# Give node back
redis-cli lpush VojtaKsNodes "$node"
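Putting this together, a minimal sketch (assuming the list was seeded as above, and using programOnGPU and the $ARGS1 ... $ARGS10 variables from the question) that runs all ten jobs and recycles whichever node frees up first:
run_one() {
    local args="$1" node
    # take a free node; BRPOP blocks until one is available, keep only the element
    node=$(redis-cli brpop VojtaKsNodes 0 | tail -n 1)
    programOnGPU $args -p "$node"          # word-splitting of $args is intentional
    # give the node back so the next waiting job can use it
    redis-cli lpush VojtaKsNodes "$node" > /dev/null
}

for args in "$ARGS1" "$ARGS2" "$ARGS3" "$ARGS4" "$ARGS5" \
            "$ARGS6" "$ARGS7" "$ARGS8" "$ARGS9" "$ARGS10"; do
    run_one "$args" &
done
wait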

I would:
I have a list of IDs: IDS=(ID1 ID2 ID5)
I would make 3 files, one containing each of the IDs.
Run something like <arguments xargs -L1 -P3 programOnGPUFromLockedFile, so that one wrapper process is started per line of the arguments file, at most 3 at a time.
Each of the processes non-blockingly tries to flock each of the 3 files in an endless loop (i.e. you can run more than 3 processes at once if you want).
When one of them succeeds in locking a file,
it reads the ID from the file
and runs the action on that ID.
When it terminates, it releases the lock, so the next process can lock the file and use the ID.
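A rough, untested sketch of such a programOnGPUFromLockedFile wrapper (the file names id.1, id.2 and id.3, each containing one of the allowed IDs, are an assumption of mine):
#!/bin/bash
# Rough sketch: loop over the ID files until one can be locked, then run the job.
while true; do
    for f in id.1 id.2 id.3; do
        exec {fd}<"$f"                    # open the ID file on a fresh descriptor
        if flock -n "$fd"; then           # try to take the lock without blocking
            id=$(<"$f")                   # the ID stored in the file
            programOnGPU "$@" -p "$id"    # run the real job while holding the lock
            flock -u "$fd"                # release the lock so the ID can be reused
            exec {fd}<&-
            exit 0
        fi
        exec {fd}<&-                      # lock is busy, close and try the next file
    done
    sleep 1                               # all IDs busy; wait a bit and retry
done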
I.e., it's very basic mutex locking. There are also other ways you can do it, for example with an atomic fifo:
Create a fifo
Spawn one process for each argument you want to run, which will:
Read one line from the fifo
That line will be the ID to run on
Do the job on that ID
Write the ID back to the fifo as one line
And then write one ID per line to the fifo (in 3 separate writes, so each line is written atomically), so 3 processes may start.
Wait until all except 3 child processes have exited
Read 3 lines from the fifo
Wait until all child processes have exited
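A minimal sketch of the fifo idea, in a slightly simplified form where the parent reads the ID tokens instead of the children (the variable names, and the $ARGS1 ... $ARGS10 parameter sets from the question, are assumptions):
#!/bin/bash
# The fifo acts as an atomic token queue: one line per free ID.
ids=(ID1 ID2 ID5)
all_args=("$ARGS1" "$ARGS2" "$ARGS3" "$ARGS4" "$ARGS5" \
          "$ARGS6" "$ARGS7" "$ARGS8" "$ARGS9" "$ARGS10")

fifo=$(mktemp -u) && mkfifo "$fifo"
exec 3<>"$fifo"                      # keep the fifo open for reading and writing
rm "$fifo"

for id in "${ids[@]}"; do            # seed the queue: one write per ID
    printf '%s\n' "$id" >&3
done

for args in "${all_args[@]}"; do
    read -r -u3 id                   # blocks until an ID token is available
    {
        programOnGPU $args -p "$id"  # word-splitting of $args is intentional
        printf '%s\n' "$id" >&3      # give the ID back when the job finishes
    } &
done
wait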

Related

Is there any way to remove the Redis keys of completed jobs of Horizon?

I searched a lot but could not find anything suitable. I found Redis::command('flushdb'); but this flushes all of my other keys too, which are needed to complete the queued jobs.
Since the OP said this is related to Horizon: Horizon keeps the failed jobs in Redis under multiple keys.
Get your HORIZON_PREFIX from the configuration file; let's say it is foo.
You can either invoke the following commands from your codebase (in tinker, maybe):
Redis::connection()->del('foo:failed:*');
Redis::connection()->del('foo:failed_jobs');
Or you can use redis-cli
127.0.0.1:6379> del foo:failed:* foo:failed_jobs
(integer) 1
127.0.0.1:6379>
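Note that DEL takes literal key names, not glob patterns, so the foo:failed:* form above only removes a key literally named foo:failed:*. To delete every key matching the pattern, enumerate the keys first, for example with redis-cli --scan (a sketch; adjust the prefix to your HORIZON_PREFIX):
redis-cli --scan --pattern 'foo:failed:*' | xargs -r redis-cli del   # -r (GNU xargs) skips the del if nothing matches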

Redis pipeline, dealing with cache misses

I'm trying to figure out the best way to implement Redis pipelining. We use redis as a cache on top of MySQL to store user data, product listings, etc.
I'm using this as a starting point: https://joshtronic.com/2014/06/08/how-to-pipeline-with-phpredis/
My question is this: assume you have an array of ids, properly sorted, and you loop through the redis pipeline like this:
$redis = new Redis();
// Opens up the pipeline
$pipe = $redis->multi(Redis::PIPELINE);
// Loops through the data and performs actions
foreach ($users as $user_id => $username)
{
    // Increment the number of times the user record has been accessed
    $pipe->incr('accessed:' . $user_id);
    // Pulls the user record
    $pipe->get('user:' . $user_id);
}
// Executes all of the commands in one shot
$users = $pipe->exec();
What happens when $pipe->get('user:' . $user_id); is not available, because it hasn't been requested before or has been evicted by Redis, etc? Assuming it's result # 13 from 50, how do we a) find out that we weren't able to retrieve that object and b) keep the array of users properly sorted?
Thank you
I will answer the question in terms of the Redis protocol; how it works in a particular language is more or less the same in this case.
First of all, let's check how a Redis pipeline works:
It is just a way to send multiple commands to the server, execute them, and get the replies back. There is nothing special about it: you simply get an array with one reply for each command in the pipeline.
Pipelines are much faster because the round-trip time per command is saved, i.e. for 100 commands there is only one round trip instead of 100. In addition, Redis executes client commands one at a time, so 100 separate commands may have to compete 100 times for Redis to pick them up; a pipeline is treated as one long command and therefore only waits to be picked up once.
You can read more about pipelining here: https://redis.io/topics/pipelining. One more note: because each pipelined batch runs uninterrupted (from Redis's point of view), it makes sense to send the commands in manageable chunks, i.e. don't send 100k commands in a single pipeline as that might block Redis for a long time; split them into chunks of 1k or 10k commands.
In your case, each iteration of the loop adds the following fragment to the pipeline:
// Increment the number of times the user record has been accessed
$pipe->incr('accessed:' . $user_id);
// Pulls the user record
$pipe->get('user:' . $user_id);
The question is what gets put into the pipeline. Let's say you update data for the user ids u1, u2, u3 and u4. The pipeline of Redis commands will then look like:
INCR accessed:u1
GET user:u1
INCR accessed:u2
GET user:u2
INCR accessed:u3
GET user:u3
INCR accessed:u4
GET user:u4
Let's say:
u1 was accessed 100 times before,
u2 was accessed 5 times before,
u3 was not accessed before and
u4 and its accompanying data do not exist.
The result will be in that case an array of Redis replies having:
101
u1 string data stored at user:u1
6
u2 string data stored at user:u2
1
u3 string data stored at user:u3
1
NIL
As you can see, Redis treats the missing key for INCR as having the value 0 and increments it, so the reply is 1. Also, nothing gets sorted by Redis: the results come back in the same order as the commands were sent.
The language binding, i.e. the Redis driver, just parses that protocol for you and presents the parsed data. Without preserving the order of commands it would be impossible for the driver to work correctly, or for you as a programmer to deduce anything. Keep in mind that the request is not duplicated in the reply, i.e. you will not receive the key for u1 or u2 when doing a GET, only the data stored at that key. So your implementation must remember that position 1 (zero-based) holds the result of the GET for u1; if that position holds NIL, it was a cache miss and you need to fall back to MySQL for that id.
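For illustration, a short redis-cli session shows the same behaviour at the protocol level (hypothetical keys; it assumes user:u1 was stored earlier and user:u4 never was):
redis-cli set user:u1 'u1 string data'
redis-cli incr accessed:u4        # counter did not exist: treated as 0, reply is 1
redis-cli get user:u1             # prints the stored string
redis-cli get user:u4             # the cache miss: prints (nil), or an empty line in raw mode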

Bash parallelization for double for loop

So I have a function that requests a REST API and takes two arguments: an instance and a date. I am given a list of instances and a range of dates, which need to be iterated over with two for loops. One constraint is that any given instance can only be requested one at a time.
I have tried using & and wait, and my pseudocode looks like this.
for each date:
    for each instance:
        do-something "$date" "$instance" &
    done
    wait
done
This actually works, since each instance is requested only once per date and the loop only progresses to the next date once all instances have been processed, so no instance is ever requested twice at the same time.
The problem is that requests for certain instances take a long time, so instances that finished earlier sit idle. How can I solve this?
Define a function which will process a given instance for each date sequentially:
for_each_date () {
    instance=$1
    shift
    for d in "$@"; do
        some_command "$d" "$instance"
    done
}
Now, spawn a background process to run this function for each instance.
dates=(2015-07-21 2015-07-22 2015-07-23) # For example
instances=(inst1 inst2 inst3)
for instance in "${instances[@]}"; do
    for_each_date "$instance" "${dates[@]}" &
done
wait
Each background job will run some_command for a different instance, and will never run more than one request at a time, so you meet your constraint of one request per instance. At the same time, for_each_date starts a new request for its instance as soon as the previous one completes, keeping your machine as busy as possible.
With GNU Parallel you would do:
parallel do-something ::: "${dates[@]}" ::: "${instances[@]}"

Move a range of jobs to another queue with qmove

I have several (idle) jobs scheduled on a cluster that I want to move to another queue.
I can move a single job like this (where 1234 is the job id):
qmove newQueue 1234
But now I have hundreds of jobs that I want to move to newQueue. Is it possible to move them all? Using * as a wildcard operator does not work.
If the job ids are in sequential order, you could use Bash's brace expansion. For example:
$ echo {0..9}
0 1 2 3 4 5 6 7 8 9
Applied to moving all jobs with ids ranging from 1000 to 2000, the qmove command would be:
qmove newQueue {1000..2000}
This might even work if the range includes job ids that you are not allowed to move (from other users, or jobs in a running state); those should simply be ignored. (Not tested.)
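If the job ids are not sequential and the cluster runs PBS/Torque, another option (a sketch, untested; it assumes qselect is available and that your qmove accepts several job ids at once) is to let the scheduler list your idle jobs and feed them to qmove:
# Select all of your own queued (idle) jobs and move them in one go
qselect -u "$USER" -s Q | xargs qmove newQueue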

How can I performance test using shell scripts - tools and techniques?

I have a system to which I must apply load for the purpose of performance testing. Some of the load can be created via LoadRunner over HTTP.
However in order to generate realistic load for the system I also need to simulate users using a command line tool which uses a non HTTP protocol* to talk to the server.
* edit: actually it is HTTP but we've been advised by the vendor that it's not something easy to record/script and replay. So we're limited to having to invoke it using the CLI tool.
I have the constraint of not having the licences for LoadRunner to do this and not having the time to put the case to get the license.
Therefore I was wondering if there was a tool that I could use to control the concurrent execution of a collection of shell scripts (it needs to run on Solaris) which will be my transactions. Ideally it would be able to ramp up in accordance with a predetermined schedule.
I've had a look around and can't tell if JMeter will do the trick. It seems very web oriented.
You can use the below script to trigger a load test for HTTP/S requests:
#!/bin/bash
# define variables
set -x            # run in debug mode
DURATION=60       # how long should load be applied? - in seconds
TPS=20            # number of requests per second
end=$((SECONDS+DURATION))

# start load
while [ $SECONDS -lt $end ]; do
    for ((i=1; i<=TPS; i++)); do
        curl -X POST <url> -H 'Accept: application/json' -H 'Authorization: Bearer xxxxxxxxxxxxx' -H 'Content-Type: application/json' -d '{}' --cacert /path/to/cert/cert.crt -o /dev/null -s -w '%{time_starttransfer}\n' >> response-times.log &
    done
    sleep 1
done
wait

# end load
echo "Load test has been completed"
If all you need is starting a bunch of shell scripts in parallel, you can quickly create something of your own in perl with fork, exec and sleep.
#!/usr/bin/perl
# Start 1000 copies of script.sh, launching one new copy every second.
for $i (1..1000)
{
    if (fork == 0)
    {
        # child process: replace itself with the shell script
        exec ("script.sh");
        exit;    # only reached if exec fails
    }
    # parent: pause before spawning the next one
    sleep 1;
}
For anyone interested I have written a Java tool to manage this for me. It references a few files to control how it runs:
1) Schedules File - defines various named lists of timings which control the length of sequential phases.
e.g. MAIN,120,120,120,120,120
This will result in a schedule named MAIN which has 5 phases each 120 seconds long.
2) Transactions File - defines the transactions that need to run. Each transaction has a name, a command that should be called, a boolean controlling repetition, an integer controlling the pause between repetitions in seconds, a data file reference, the schedule to use, and increments.
e.g. Trans1,/path/to/trans1.ksh,true,10,trans1.data.csv,MAIN,0,10,0,10,0
This will result in a transaction running trans1.ksh repeatedly, with a pause of 10 seconds between repetitions. It will reference the data in trans1.data.csv. During phase 1 it will increment the number of parallel invocations by 0, phase 2 will add 10 parallel invocations, phase 3 adds none, and so on. Phase times are taken from the schedule named MAIN.
3) Data Files - as referenced in the transaction file, this will be a CSV with a header. Each line of data will be passed to subsequent invocations of the transaction.
e.g.
HOSTNAME,USERNAME,PASSWORD
server1,jimmy,password123
server1,rodney,ILoveHorses
These get passed to the transaction scripts via environment variables (e.g. PASSWORD=ILoveHorses), which is a bit clunky, but workable.
My Java tool simply parses the config files and sets up a manager thread per transaction, which itself takes care of setting up and starting executor threads in accordance with the configuration. Managers take care of adding executors linearly so as not to totally overload the system.
When it runs, it just reports every second on how many workers each transaction has running and which phase it's in.
It was a fun little weekend project; it's certainly no LoadRunner and I'm sure there are some massive flaws in it that I'm currently blissfully unaware of, but it seems to do OK.
So in summary the answer here was to "roll ya own".
