Is it possible to command the CPU to pause all non-essential processes until my program has finished processing? The goal is to reduce the number of processes competing for CPU time; I am ultimately expecting an improvement in the wall-clock running time of my program.
So I want to start my program running, command the CPU to pause non-essential processes except for my program, and when my program terminates then the CPU can resume the previously paused processes.
On Linux, the obvious initial tactic is to increase the priority of your process using renice. The lower the nice value, the higher the priority, with a maximum priority of -20.
(Here I create a long-running process as an example:)
sleep 100000 &
As root, grep for the process:
ps -ef | grep sleep
500 **4323** 2995 0 18:44 pts/1 00:00:00 sleep 100000
500 4371 2995 0 18:45 pts/1 00:00:00 grep --color=auto sleep
Renice the process to a very high priority:
renice -20 4323
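Note that renice adjusts an already-running process; to start a program at a given niceness in the first place, you can use nice. A minimal sketch, with a positive value so it can be tried without root (raising priority with a negative value requires root) and sleep standing in for your real program:

```shell
# Launch at niceness 10 and confirm with ps; a negative value
# (higher priority) would need root privileges.
nice -n 10 sleep 30 &
pid=$!
ps -o ni= -p "$pid"    # prints the nice value, 10
kill "$pid"
```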
You can also send the SIGSTOP and SIGCONT signals to stop and continue particular processes, like so:
skill -STOP -p <processid>
skill -CONT -p <processid>
Unfortunately, what constitutes non-essential processes depends on your own definition. You can stop all non-root processes by examining the process list and using the following commands to stop all of a particular user's processes temporarily:
skill -STOP -u <userid>
skill -CONT -u <userid>
Obviously beware of stopping processes such as the shell that spawned your sudo root session.
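The pause/run/resume workflow can be sketched as follows. This is a minimal illustration against a single competing process (a background sleep standing in for the non-essential work, and a short sleep standing in for your program); stopping all of a user's processes works the same way with -u instead of a PID:

```shell
# Pause a competing process, run our job, then resume it.
sleep 300 &               # stand-in for a "non-essential" process
victim=$!

kill -STOP "$victim"      # same effect as: skill -STOP -p $victim
ps -o stat= -p "$victim"  # state now starts with T (stopped)

sleep 1                   # our "real work" runs here uncontested

kill -CONT "$victim"      # resume the paused process afterwards
ps -o stat= -p "$victim"  # state back to S (sleeping)
kill "$victim"
```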
Our teachers told us to experiment with the terminal and kill -9 -1
To my understanding, on UNIX-based OSes, the first process started is init, with PID 1, from which all other processes spawn. I assumed that you couldn't kill it, as it is loaded in a protected part of memory.
On a VM running Linux Mint, the command would cause the session to close itself. On macOS, it would close/crash(?) all applications.
On some other people's laptops running different distributions of Linux, the command would be denied, which was the behaviour I would have expected on any OS.
So I am confused by the behaviour of the command.
What should be the normal result? Or is it bound to each OS implementation?
Thanks.
The general behaviour, from the kill(2) man page:
If pid equals -1, then sig is sent to every process for which the calling process has permission to send signals, except for process 1 (init)
So, "kill -9 -1" will kill all processes it can.
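The negative-pid rule generalizes: a pid below -1 signals the process *group* whose ID is -pid, which is a far safer way to clean up a tree of processes than -1. A minimal sketch, assuming it runs from a non-interactive script (where setsid makes the child its own group leader, so $! is also the group ID):

```shell
# pid -1 signals everything we're allowed to; pid -N signals process
# group N. setsid gives the child its own session and process group,
# so the whole tree can be killed with one negative-pid kill.
setsid bash -c 'sleep 300 & sleep 300 & wait' &
leader=$!                  # in a script, also the new group ID
sleep 1
kill -TERM -- "-$leader"   # TERM the entire group at once
```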
Let's say I run the following command:
cat /var/log/dmesg | festival --tts
This might return the message [1] 4726, indicating a process ID associated with this operation. When I run kill 4726 or killall festival or killall cat or killall aplay, the speech does not stop (or, at least, it continues on for quite some time before stopping). If I run the command above, how can I kill what it starts doing?
kill sends SIGTERM to the program. SIGTERM tells the program to stop, allowing it to shut down gracefully: deallocating memory, closing connections, flushing to disk, removing temp files, etc. So SIGTERM may not take effect immediately or quickly.
kill -9 sends SIGKILL, which is seen only by the kernel: the kernel terminates the process directly. While this is faster, it does not allow for a graceful exit.
I am not familiar with festival, so if you are worried that these commands are forking off processes and you want to stop all the children, you can brute force the issue by spawning them all out of a bash shell. When you kill the parent bash shell, it will kill all of the processes owned by it.
bash -c "cat /var/log/dmesg | festival --tts" &
You will get back the PID of the bash shell, which you can kill to clean up all sub-processes.
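If you'd rather not rely on the shell tearing its children down, one option is pkill's -P flag, which signals processes by their parent PID. A sketch with sleeps standing in for the festival pipeline (since the technique is the same for any wrapper shell):

```shell
# `sleep 300 | sleep 300` stands in for `cat ... | festival --tts`.
# pkill -P signals every direct child of the wrapper shell.
bash -c 'sleep 300 | sleep 300' &
wrapper=$!
sleep 1
pkill -TERM -P "$wrapper"   # the children (cat and festival in real use)
wait "$wrapper"; echo "wrapper exited with status $?"
```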
I have written a script to initiate multi-processing
for i in `seq 1 $1`
do
/usr/bin/php index.php name&
done
wait
A cron job runs every minute: myscript.sh 3. Now three background processes get initiated, and after some time I see the list of processes via the ps command. I see all the processes together in "Sleep" or "Running" state... Now I want to achieve that when one goes to sleep, the other processes keep working. How can I achieve this? Or is this normal?
This is normal. A program that can run will be given time by the operating system... when possible. If all three are sleeping, then the system is most likely busy and time is being given to other processes.
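If what you actually want is to keep a fixed number of workers busy, starting a new task as soon as any one finishes, a sketch like the following works (it needs bash 4.3 or later for wait -n; sleep stands in for the `/usr/bin/php index.php name` command):

```shell
# Keep at most 3 workers running; as soon as any one of them
# exits, launch the next task. `sleep 1` is a stand-in for
# `/usr/bin/php index.php name`.
maxjobs=3
for task in 1 2 3 4 5; do
    while [ "$(jobs -rp | wc -l)" -ge "$maxjobs" ]; do
        wait -n              # block until any background job exits
    done
    sleep 1 &                # launch the next worker
done
wait                         # wait for the remaining workers
```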
I am currently reading up on some more details on Bash scripting and especially process management here. In the section on "PIDs and Parents" I found the following statement:
A process's PID will NEVER be freed up for use after the process dies UNTIL the parent process waits for the PID to see whether it ended and retrieve its exit code.
So if I understand this correctly, if I start a process in a bash script and the process then terminates, its PID cannot be used by any other process. Wouldn't this mean that if I have a long-running script which repeatedly starts other sub-processes but never waits on them, I'll eventually have a resource leak, because the used PIDs will not be returned to the system?
How about if I actually wait for the other process, but the wait gets cancelled by a trap? Would this wait somehow still free up the PID, or do I have to wait again after the trap has been caught?
Luckily, you won't. I can't tell you exactly why, but you can easily test this. Run the following script (stop it with Ctrl+C):
#!/bin/bash
while true; do
sleep 5 &
sleep 1
done
You can see you get no zombies (leaked PIDs) after 6+ seconds. To see some zombies, use the following Python code (again, stop with Ctrl+C):
#!/usr/bin/python
import subprocess, time
pl = []
while True:
pl.append(subprocess.Popen(["sleep", "5"]))
time.sleep(1)
After 6 seconds you'll see one zombie:
ps xaw | grep 'sleep'
...
26470 pts/2 Z+ 0:00 [sleep] <defunct>
...
My guess is that bash does wait and stores the results, reaping the zombie processes with or without the builtin wait command. For the python script, if you remove the pl.append part, the garbage collection releases the objects and does its magic, again reaping the zombies. Just for info, a child may never become a zombie (from Wikipedia, Zombie process):
...if the parent explicitly ignores SIGCHLD by setting its handler to SIG_IGN (rather
than simply ignoring the signal by default) or has the SA_NOCLDWAIT flag set, all
child exit status information will be discarded and no zombie processes will be left.
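You can check the bash side of this directly from a script: even without an explicit wait, bash reaps the exited background child, so its PID disappears rather than lingering as Z. A small sketch:

```shell
# After the background child exits, bash's SIGCHLD handling reaps
# it even though we never call `wait`, so ps no longer finds it.
sleep 1 &
pid=$!
sleep 3                     # the child exits during this pause
ps -o stat= -p "$pid" || echo "child $pid already reaped"
```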
You don't have to explicitly wait on foreground processes because the shell in which your script is running waits on them. The next process won't start until the previous one finishes.
If you start many long running background processes, you could use all available PIDs, but that's subject to the limit of ulimit -u (which could be unlimited).
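On Linux, both ceilings are easy to inspect (pid_max is Linux-specific; the kernel wraps PIDs around once the counter reaches it):

```shell
# The per-user process limit and the kernel-wide PID ceiling.
ulimit -u                        # max user processes (may be "unlimited")
cat /proc/sys/kernel/pid_max     # highest PID before the counter wraps
```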
I have some processes showing up as <defunct> in top (and ps). I've boiled things down from the real scripts and programs.
In my crontab:
* * * * * /tmp/launcher.sh /tmp/tester.sh
The contents of launcher.sh (which is of course marked executable):
#!/bin/bash
# the real script does a little argument processing here
"$@"
The contents of tester.sh (which is of course marked executable):
#!/bin/bash
sleep 27 & # the real script launches a compiled C program in the background
ps shows the following:
user 24257 24256 0 18:32 ? 00:00:00 [launcher.sh] <defunct>
user 24259 1 0 18:32 ? 00:00:00 sleep 27
Note that tester.sh does not appear--it has exited after launching the background job.
Why does launcher.sh stick around, marked <defunct>? It only seems to do this when launched by cron--not when I run it myself.
Additional note: launcher.sh is a common script on the system this runs on, and is not easily modified. The other things (crontab, tester.sh, even the program that I run instead of sleep) can be modified much more easily.
Because they haven't been the subject of a wait(2) system call.
Since someone may wait for these processes in the future, the kernel can't completely get rid of them or it won't be able to execute the wait system call because it won't have the exit status or evidence of its existence any more.
When you start one from the shell, your shell is trapping SIGCHLD and doing various wait operations anyway, so nothing stays defunct for long.
But cron isn't in a wait state, it is sleeping, so the defunct child may stick around for a while until cron wakes up.
Update: Responding to comment...
Hmm. I did manage to duplicate the issue:
PPID PID PGID SESS COMMAND
1 3562 3562 3562 cron
3562 1629 3562 3562 \_ cron
1629 1636 1636 1636 \_ sh <defunct>
1 1639 1636 1636 sleep
So, what happened was, I think:
cron forks and cron child starts shell
shell (1636) starts sid and pgid 1636 and starts sleep
shell exits, SIGCHLD sent to cron 3562
signal is ignored or mishandled
shell turns zombie
Note that sleep is reparented to init, so when the sleep exits, init will get the signal and clean up. I'm still trying to figure out when the zombie gets reaped. Probably, with no active children, cron 1629 figures out it can exit; at that point the zombie will be reparented to init and get reaped.
So now we wonder about the missing SIGCHLD that cron should have processed. It isn't necessarily vixie cron's fault. As you can see here, libdaemon installs a SIGCHLD handler during daemon_fork(), and this could interfere with signal delivery on a quick exit by intermediate 1629. Now, I don't even know if vixie cron on my Ubuntu system is even built with libdaemon, but at least I have a new theory. :-)
In my opinion it's caused by the process CROND (spawned by crond for every task) waiting for input on stdin, which is piped to the stdout/stderr of the command in the crontab. This is done because cron is able to send the resulting output via mail to the user.
So CROND is waiting for EOF until the user command and all its spawned child processes have closed the pipe. Once this happens, CROND continues with its wait statement, and then the defunct user command disappears.
So I think you have to explicitly disconnect every spawned subprocess in your script from the pipe (e.g. by redirecting it to a file or /dev/null).
So the following line should work in the crontab:
* * * * * ( /tmp/launcher.sh /tmp/tester.sh &>/dev/null & )
I suspect that cron is waiting for all subprocesses in the session to terminate. See wait(2) with respect to negative pid arguments. You can see the SESS with:
ps faxo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
Here's what I see (edited):
STAT EUID RUID TT TPGID SESS PGRP PPID PID %CPU COMMAND
Ss 0 0 ? -1 3197 3197 1 3197 0.0 cron
S 0 0 ? -1 3197 3197 3197 18825 0.0 \_ cron
Zs 1000 1000 ? -1 18832 18832 18825 18832 0.0 \_ sh <defunct>
S 1000 1000 ? -1 18832 18832 1 18836 0.0 sleep
Notice that the sh and the sleep are in the same SESS.
Use the command setsid(1). Here's tester.sh:
#!/bin/bash
setsid sleep 27 # the real script launches a compiled C program in the background
Notice you don't need &; setsid puts it in the background.
I'd recommend that you solve the problem by simply not having two separate processes: have launcher.sh do this on its last line:
exec "$@"
This will eliminate the superfluous process.
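A quick way to see the effect, with sleep standing in for the launched program:

```shell
# With exec, the wrapper shell replaces itself with the command
# instead of forking a child, so no intermediate process remains.
bash -c 'exec sleep 30' &
pid=$!
sleep 1                  # give the child a moment to exec
ps -o comm= -p "$pid"    # shows "sleep", not "bash"
kill "$pid"
```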
I found this question while I was looking for a solution to a similar issue. Unfortunately, the answers here didn't solve my problem.
Killing a defunct process directly is not an option; you need to find and kill its parent process instead. I ended up clearing the defunct processes in the following way (note that $3 in ps -ef output is the parent PID):
ps -ef | grep '<defunct>' | grep -v grep | awk '{print "kill -9 ",$3}' | sh
In the grep '<defunct>' part of the pipeline you can narrow down the search to the specific defunct process you are after.
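A more cautious variant reads the columns by name rather than by position, so you can review the parents before signalling anything:

```shell
# List parent PID and command name for every zombie, without
# killing anything yet; inspect this before sending signals.
ps -eo ppid=,stat=,comm= | awk '$2 ~ /^Z/ {print $1, $3}' | sort -u
```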
I have tested the same problem so many times.
And finally I've got the solution.
Just specify '/bin/bash' before the bash script, as shown below:
* * * * * /bin/bash /tmp/launcher.sh /tmp/tester.sh