sh via Ruby: Running ulimit and a program in the same line - ruby

I am trying to run a computationally intensive program from Ruby via the following command:
%x(heavy_program)
However, I sometimes want to limit the running time of the program. So I tried doing
%x(ulimit -St #{max_time} & heavy_program)
But it seems to fail; the "&" trick does not work even when I try it in a running sh shell outside Ruby.
I'm sure there's a better way of doing this...

Use either && or ;:
%x(ulimit -St #{max_time} && heavy_program)
%x(ulimit -St #{max_time}; heavy_program)
However, ulimit may not be what you really need; consider this code instead:
require 'timeout'
Timeout.timeout(max_time){ %x'heavy_program' }
ulimit limits CPU time, whereas timeout limits the total (wall-clock) running time, as we humans usually count it.
So, for example, if you run the shell command sleep 999999 under ulimit -St 5, it will run not for 5 seconds but for the full 999999, because sleep uses a negligible amount of CPU time.
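You can see the difference directly in a shell (a quick illustration; the 5-second and 10-second values are arbitrary):
# CPU-time limit: sleep burns almost no CPU, so it still finishes after the full 10 s
bash -c 'ulimit -St 5; sleep 10; echo "sleep finished despite the CPU limit"'
# wall-clock limit: timeout kills sleep after 5 s and exits with status 124
timeout 5 sleep 10; echo "exit status: $?"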

Related

Is there a way to limit time and memory resources for running a bash command?

Basically I want to run my compiled C++ code and limit execution time (to a second, for example) and memory (to 100k), like the online judges do. Is it possible to do this by adding options to the command? It has to be done without modifying the source code, of course.
Try the ulimit command; it can set limits on CPU time and memory.
Try this example
bash -c 'ulimit -St 1 ; while true; do true; done;'
The result you will get will be
CPU time limit exceeded (core dumped)
To limit wall-clock time you can use the timeout command:
timeout 15s command
Check this for more details: link
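Putting both limits together for the judge scenario could look like this (a sketch; ./a.out, input.txt and output.txt are placeholders, the limit values are examples, and ulimit -v takes kilobytes):
# limit CPU time to 1 s and virtual memory to about 100 MB for the judged program
bash -c 'ulimit -St 1; ulimit -Sv 102400; ./a.out < input.txt > output.txt'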

shell script to loop and start processes in parallel?

I need a shell script that will create a loop to start parallel tasks read in from a file...
Something along the lines of...
#!/bin/bash
mylist=/home/mylist.txt
while read -r i
do
    cp -rp "$i" /destination &    # do something like this for each line
done < "$mylist"
wait
So what I am trying to do is send a bunch of tasks into the background with "&" for each line in $mylist and wait for them to finish before exiting.
However, there may be a lot of lines in there, so I want to control how many parallel background processes get started; I want to be able to cap it at, say, 5 or 10.
Any ideas?
Thank you
Your task manager will make it seem like you can run many parallel jobs, but how many you can actually run with maximum efficiency depends on your processor. Overall you don't have to worry about starting too many processes, because your system will schedule them for you. If you want to limit them anyway, because the number could get absurdly high, you could use something like this (provided you execute a cp command every time):
...
while ...; do
    jobs=$(pgrep 'cp' | wc -l)
    # use a brace group, not a subshell, so that 'continue' affects this loop
    [[ $jobs -gt 50 ]] && { sleep 100; continue; }
    ...
done
The number of running cp commands is stored in the jobs variable, and before starting a new iteration the loop checks whether there are already too many. Note that we jump to a new iteration, so you would have to keep track of how many commands you have already executed. Alternatively, you could use wait, as sketched below.
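A sketch of that wait-based alternative, starting the copies in batches (paths taken from the question; the batch size of 5 is arbitrary):
#!/bin/bash
max_jobs=5
count=0
while read -r src; do
    cp -rp "$src" /destination &
    count=$((count + 1))
    if [ "$count" -ge "$max_jobs" ]; then
        wait    # let the current batch finish before starting the next
        count=0
    fi
done < /home/mylist.txt
wait    # wait for the final, possibly partial, batch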
Edit:
On a side note, you can assign a specific CPU core to a process using taskset; it may come in handy when you have fewer, more complex commands.
You are probably looking for something like this using GNU Parallel:
parallel -j10 cp -rp {} /destination :::: /home/mylist.txt
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU. GNU Parallel instead spawns a new process whenever one finishes, keeping the CPUs active and thus saving time.
Installation
If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

How to limit allowed time for a script run as a sub process in ksh script

I am trying to limit the allowed running time of a sub process in a ksh script. I tried using ulimit (hard or soft values), but the sub process always breaks the limit (if it takes longer than the allowed time).
# value for a test
Sc_Timeout=2
Sc_FileOutRun=MyScript.log.Running
Sc_Cmd=./AScriptToRunInSubShell.sh
(
    ulimit -Ht ${Sc_Timeout}
    ulimit -St ${Sc_Timeout}
    time (
        ${Sc_Cmd} >> ${Sc_FileOutRun} 2>&1
    ) >> ${Sc_FileOutRun} 2>&1
    # some other command not relevant for this
)
result:
1> ./MyScript.log.Running
ulimit -Ht 2
ulimit -St 2
1>> ./MyScript.log.Running 2>& 1
real 0m11.45s
user 0m3.33s
sys 0m4.12s
I expect a timeout error with a sys or user time of something like 0m2.00s.
When I run a test directly from the command line, the hard ulimit does seem to limit the time effectively, but not inside the script.
The test/dev system is AIX 6.1, but this should also work on other versions and on Sun and Linux.
Each process has its own time limits, but time shows the cumulative time for the script. Each time you create a child process, that child has its own limits. So, for example, if you call cut and grep in the script, those processes use their own CPU time; the quota is not decremented from the script's own quota, although the limits themselves are inherited.
If you want a time limit, you might wish to investigate trap ALRM.
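If what you actually need is a wall-clock limit rather than a CPU-time limit, another portable option is a background watchdog instead of ulimit; a minimal sketch reusing the variables from the question:
# start the command in the background and note its PID
${Sc_Cmd} >> ${Sc_FileOutRun} 2>&1 &
cmd_pid=$!
# watchdog: kill the command if it is still running after Sc_Timeout seconds
( sleep ${Sc_Timeout}; kill -TERM ${cmd_pid} 2>/dev/null ) &
watchdog_pid=$!
wait ${cmd_pid}
kill ${watchdog_pid} 2>/dev/null    # cancel the watchdog if the command finished in time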

Run script in multiple machines in parallel

I am interested to know the best way to start a script in the background on multiple machines as fast as possible. Currently, I'm running this for each IP address:
ssh user@ip -t "perl ~/setup.pl >& ~/log &" &
But this is slow, as it SSHes into each machine one by one to start setup.pl in the background there, and I have a large number of machines to start this script on.
I tried using GNU parallel, but couldn't get it to work properly:
seq COUNT | parallel -j 1 -u -S ip1,ip2,... perl ~/setup.pl >& ~/log
But it doesn't seem to work: I see the script started by GNU Parallel on the target machine, but it's stagnant, and I don't see anything in the log.
What am I doing wrong in using the GNU parallel?
GNU Parallel assumes by default that it does not matter which machine it runs a job on, which is normally true for computations. In your case it matters greatly: you want one job on each machine. GNU Parallel will also give a number as an argument to setup.pl, and you clearly do not want that.
Luckily GNU Parallel does support what you want using --nonall:
http://www.gnu.org/software/parallel/man.html#example__running_the_same_command_on_remote_computers
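Applied to the command from your question, that looks roughly like this (ip1,ip2,ip3 stand in for your real host list):
# run setup.pl once on every listed host; --nonall passes no job arguments
parallel --nonall -S ip1,ip2,ip3 'perl ~/setup.pl > ~/log 2>&1'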
I encourage you to read and understand the rest of the examples, too.
I recommend that you use pdsh
It allows you to run the same command on multiple machines
Usage:
pdsh -w machine1,machine2,...,machineN <command>
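For example, applied to the script from the question (hostnames are placeholders):
pdsh -w machine1,machine2 "perl ~/setup.pl > ~/log 2>&1"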
It might not be included in your Linux distribution, so install it through yum or apt.
Try wrapping ssh user@ip -t "perl ~/setup.pl >& ~/log &" & in a shell script, and run ./myscript.sh & for each IP address.

Un*x shell script: what is the correct way to run a script for at most x milliseconds?

I'm not a scripting expert and I was wondering what was an acceptable way to run a script for at most x milliseconds (and yet finish before x milliseconds if the script is done before the timeout).
I solved that problem using Bash in a way that I think is very hacky and I wonder if there's a better way to do it.
Basically I've got one shell script called sleep_kill.sh that takes a PID as the first argument and a timeout as its second argument and that does this:
sleep $2
kill -9 $1 2> /dev/null 1> /dev/null
So if the PID corresponds to a script that finishes before timing out, nothing is killed (I take it that the OS will not have had time to reuse this PID for another, unrelated process, since it cycles through all the process IDs before starting to reuse them).
Anyway, then I call my script that may "hang" or timeout:
command_that_may_hang.sh &
PID=$!
sleep_kill.sh $PID .3 &
wait $PID > /dev/null 2>&1
And I'll be waiting at most 300 ms for command_that_may_hang.sh. Yet if command_that_may_hang.sh took only 10 ms to execute, I won't be "stuck" for 300 ms.
It would be great if some shell expert could explain the drawbacks of this approach and what should be done instead.
Have a look at this script: http://www.pixelbeat.org/scripts/timeout
Note that timeouts of less than one second are pretty much nonsensical on most systems due to scheduling delays and the like. Note also that newer coreutils includes the timeout command, which has a resolution of 1 second.
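With the coreutils timeout command, the whole sleep_kill.sh approach reduces to a one-liner (command_that_may_hang.sh being the placeholder from the question; the 1-second value reflects the resolution noted above):
timeout 1 ./command_that_may_hang.sh
echo "exit status: $?"    # 124 means the command was killed because it timed out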
