Wait for completion of a process created with taskset - bash

I am using taskset (as described in the Linux manual page) to run a very processing-intensive task only on specific cores.
The taskset call is wrapped in a loop: each time a new target directory is selected and the task is run on it. Running the process multiple times in parallel may lead to fatal results.
The pseudo code is as follows:
#!/bin/bash
while :
do
    target_dir=$(select_dir)   # select new directory to process
    sudo taskset -c 4,5,6,7,8,9,10,11 ./processing_intense_task --dir "$target_dir"
done
I have found nothing in the documentation about whether taskset actually waits for the process to finish.
If it does not wait, how do I wait for the task to complete before starting a new instance of processing_intense_task?

"I have found nothing in the documentation about whether taskset actually waits for the process to finish."
taskset calls exec, so it becomes the command it launches: https://github.com/util-linux/util-linux/blob/master/schedutils/taskset.c#L246
This is the same thing other similar commands, like nice and ionice, do.
"If it does not wait,"
Well, technically taskset doesn't wait; it becomes the command itself.
"how do I wait for the task to complete before starting a new instance of processing_intense_task?"
You just wait for the taskset process to finish, as it is the same process as the command. In other words: do nothing special, the loop already does the right thing.
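So the loop in the question already blocks on each iteration. If you prefer to make the waiting explicit anyway, a minimal sketch (select_dir is the placeholder from the question) could background the command and then wait on its PID:
#!/bin/bash
while :
do
    target_dir=$(select_dir)   # placeholder from the question: pick the next directory
    # run pinned to the chosen cores in the background, then wait for that exact PID
    sudo taskset -c 4,5,6,7,8,9,10,11 ./processing_intense_task --dir "$target_dir" &
    wait $!
done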

Related

How do I record all child processes spawned by an Ant script over time?

I inherited a legacy Ant-based build system and I'm trying to get a sense of its scope. I observed multiple jvm and junit tasks with fork=yes. It calls subant and similar tasks wildly. Occasionally, it just execs other processes.
I really don't want to search through hundreds of scripts and the reference documentation for every task to find possible forking behavior. I'd like to capture the child-process list while the build runs.
I managed to create a clean Vagrant + Puppet environment for builds and I can run the full build like so
$ cd /vagrant && $ANT_HOME/bin/ant
If I had to brute force something... I'd have a script kick off the build and capture child processes until the build is completed?
#!/bin/bash
$ANT_HOME/bin/ant &
ant_pid=$!   # PID of the background Ant build
# poll while the build is still running, logging its children once per second
while ps -p $ant_pid > /dev/null
do
    sleep 1
    ps --ppid $ant_pid >> build_processes
done
User Jayan recommended strace, specifically:
$ strace -f -e trace=fork ant
The -f option follows child processes as they are created, and -e trace=fork limits the trace to fork system calls. From the strace man page:
Trace child processes as they are created by currently traced processes as a result of the fork(2) system call. The new process is attached to as soon as its pid is known (through the return value of fork(2) in the parent process). This means that such children may run uncontrolled for a while (especially in the case of a vfork(2)), until the parent is scheduled again to complete its (v)fork(2) call. If the parent process decides to wait(2) for a child that is currently being traced, it is suspended until an appropriate child process either terminates or incurs a signal that would cause it to terminate (as determined from the child's current signal disposition).
I can't find the trace=fork expression, but trace=process seems useful.
-e trace=process
Trace all system calls which involve process management. This is useful for watching the fork, wait, and exec steps of a process.
http://linuxcommand.org/man_pages/strace1.html
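For example, a rough sketch of how this might be used for the Ant build (the log file name is arbitrary):
# follow forked children (-f) and log process-management syscalls to a file
strace -f -e trace=process -o ant_strace.log $ANT_HOME/bin/ant
# execve lines in the log show every child command that was launched
grep execve ant_strace.log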
As Ant is a Java process, you can try to use Byteman. In a Byteman script you define rules that are triggered when the exec methods from java.lang.Runtime are executed.
You attach Byteman to Ant using the ANT_OPTS environment variable.
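A rough sketch of what that attachment might look like (the jar and script paths are hypothetical; exec.btm would contain the rules that fire on java.lang.Runtime.exec):
# hypothetical paths: point ANT_OPTS at the Byteman agent and a rule script
export ANT_OPTS="-javaagent:/path/to/byteman.jar=script:/path/to/exec.btm"
$ANT_HOME/bin/ant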

How can I create a process in Bash that has zero overhead but which gives me a process ID

For those of you who know what you're talking about I apologise for butchering the way that I'm going to phrase this question. I know nothing about bash whatsoever. With that caveat out of the way, let me get out my cleaver...
I am building a Rails app that has what's called a Procfile, which sets up any processes that need to be run in different environments,
e.g.
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
redis: redis-server
worker: bundle exec sidekiq
proxylocal: bin/proxylocal_local
Each one of these lines specifies a process to be run. It also expects a PID to be returned after the process spins up. The syntax is
process_name: process_invocation_script
However, the last process, proxylocal, only actually starts a process in development. In production it doesn't do anything.
Unfortunately that causes the Procfile to choke, as it needs a process ID returned. So is there some super-simple, zero-overhead process that I can spawn in that case just to keep the Procfile happy?
The sleep command does nothing for a specified period of time, with very low overhead. Give it an argument longer than your code will run.
For example
sleep 2147483647
does nothing for 2^31 - 1 seconds, just over 68 years. I picked that number because any reasonable implementation of sleep should be able to handle it.
In the unlikely event that that doesn't work (say, if you're on an old 16-bit system that can't sleep for more than 2^16 - 1 seconds), you can do a sleep in an infinite loop:
sh -c 'while : ; do sleep 30000 ; done'
This assumes that you need the process to run for a very long time; that depends on what your application needs to do with the process ID. If it's required to be unique as long as the application is running, you need something that will continue to run for a long time; if the process terminates, its PID can be re-used by another process.
If that's not a requirement, you can use sleep 0 or true, which will terminate immediately.
If you need to give the application a little time to get the process ID before the process terminates, something like sleep 10 or even sleep 1 might work, though determining just how long it needs to run can be tricky and error-prone.
If Heroku isn't doing anything with proxylocal, I'm not sure why you'd even want this in your Procfile. I'm also a bit confused about whether you want to change the Procfile or what bin/proxylocal_local does, and how you would even do that.
That being said, if you are able to do anything you like in production, your script can just call cat: it will get a PID and then just sit waiting for the next command (which never comes).
For truly minimal overhead, you don't want to run any external commands. When the shell starts a command, it first forks itself, then the child shell execs the external command. If the forked child can run a builtin, you can skip the exec.
Start by creating a read-only fifo somewhere.
mkfifo foo
chmod 400 foo
Then, whenever you need a do-nothing process, just fork a shell which tries to read from the fifo. It's read-only, so no one can write to it, so all reads will block.
read < foo &
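A short usage sketch (using the fifo foo created above): the backgrounded read never returns, and $! gives you a PID to hand to whatever needs one:
# the read blocks forever because nothing can ever write to the read-only fifo
read < foo &
echo "do-nothing process running as PID $!"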

How to make a shell script wait for another without using sleep

I want to know how to make a shell script wait till another script finishes its execution, without the help of the sleep command.
Suppose I have the scripts run.sh and kill.sh, where run.sh brings all the processes up (meaning it starts running the image on the box), whereas kill.sh contains just the kill commands to kill all the running processes.
Whenever I run run.sh, it brings all the processes up and then ends. What happens then is that all the running processes become orphans (handled by init). Whenever we run kill.sh, some of the processes become zombies.
That is, orphan processes are becoming zombies.
To avoid this, I want to make run.sh wait until the kill.sh script has finished.
So, how do I make a shell script wait for another script? Please provide your comments.
Thanks in advance.
You can use wait to let the first script finish without an explicit sleep:
#!/bin/bash
./first_script.sh &
wait   # blocks until all background jobs (here, first_script.sh) have finished
./second_script.sh

Understanding the behavior of processes - why do all processes run together and sleep together?

I have written a script to initiate multi-processing
for i in $(seq 1 "$1")
do
    /usr/bin/php index.php name &
done
wait
A cron job runs it every minute as myscript.sh 3, so three background processes get initiated, and after some time I see the list of processes via the ps command. I see that all the processes are in "Sleep" or "Running" mode together. Now I want to achieve that when one goes to sleep, the other processes keep processing. How can I achieve it? Or is this normal?
This is normal. A program that can run will be given time by the operating system... when possible. If all three are sleeping, then the system is most likely busy and time is being given to other processes.
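If you want to watch the states yourself, a small sketch (assuming the children show up as plain php processes) is to list them with their state column:
# S = sleeping, R = running; re-run this (or wrap it in watch) to see the states change
ps -o pid,stat,etime,cmd -C php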

Resource leaking of available PIDs by long running bash scripts

I am currently reading up on some more details on Bash scripting and especially process management here. In the section on "PIDs and Parents" I found the following statement:
A process's PID will NEVER be freed up for use after the process dies UNTIL the parent process waits for the PID to see whether it ended and retrieve its exit code.
So if I understand this correctly: if I start a process in a bash script and the process then terminates, its PID cannot be used by any other process. Wouldn't this mean that if I have a long-running script which repeatedly starts other sub-processes but never waits on them, I'll eventually have a resource leak, because the used PIDs are never returned to the system?
What if I actually wait for the other process, but the wait gets cancelled by a trap? Would this wait somehow still free up the PID, or do I have to wait again after the trap has been caught?
Luckily you won't. I can't tell you exactly why but you can easily test this. Run the following script (stop with Ctrl+C):
#!/bin/bash
while true; do
    sleep 5 &
    sleep 1
done
You can see you get no zombies (leaked PIDs) after 6+ seconds. To see some zombies use the following python code (again, stop with Ctrl+C):
#!/usr/bin/python
import subprocess, time
pl = []
while True:
    pl.append(subprocess.Popen(["sleep", "5"]))
    time.sleep(1)
After 6 seconds you'll see one zombie:
ps xaw | grep 'sleep'
...
26470 pts/2 Z+ 0:00 [sleep] <defunct>
...
My guess is that bash does wait and stores the results, reaping the zombie processes with or without the builtin wait command. For the Python script, if you remove the pl.append part, the garbage collector releases the Popen objects and does its magic, again reaping the zombies. Just for info, a child may never become a zombie at all (from Wikipedia, Zombie process):
...if the parent explicitly ignores SIGCHLD by setting its handler to SIG_IGN (rather than simply ignoring the signal by default) or has the SA_NOCLDWAIT flag set, all child exit status information will be discarded and no zombie processes will be left.
You don't have to explicitly wait on foreground processes because the shell in which your script is running waits on them. The next process won't start until the previous one finishes.
If you start many long-running background processes, you could use up all available PIDs, but that's subject to the limit of ulimit -u (which could be unlimited).
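If you want to see what those limits are on your system, a quick check (the second file is Linux-specific):
# per-user process limit for the current shell
ulimit -u
# system-wide maximum PID value on Linux
cat /proc/sys/kernel/pid_max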
