I would like to run arbitrary console-based sub-processes and manage them from a single master process. The console-based sub-processes communicate via stdin, stdout and stderr, and if you run them in a genuine console they terminate cleanly when you press CTRL+C. Some of them may in fact be a tree of processes, such as a batch script that runs an executable which may in turn run another executable to do some work. I would like to redirect their standard I/O (for example, so that I can show their output in a GUI window) and, in certain circumstances, send them a CTRL+C event so that they give up and terminate cleanly.
The following two diagrams show first the normal structure - one master process has four worker sub-processes, and some of those workers have their own subprocesses; and then what should happen when one of the workers needs to be stopped - it and all of its children should get the CTRL+C event, but no other processes should receive the CTRL+C event.
(Diagrams not reproduced here; source: livejournal.com)
Additionally, I would much prefer that there are no extra windows visible to the user.
Here's what I've tried (note that I'm working in Python, but solutions for C would still be helpful):
Spawning an extra intermediate process with CREATE_NEW_CONSOLE, and then having it spawn the worker process. Then have it call GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) when we want to kill the worker. Unfortunately, CREATE_NEW_CONSOLE seems to prevent me from redirecting the standard I/O channels, so I'm left with no easy way to get the output back to the main program.
Spawning an extra intermediate process with CREATE_NEW_PROCESS_GROUP, and then having it spawn the worker process. Then have it call GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) when we want to kill the worker. Somehow, this manages to send the CTRL+C only to the master process, which is completely useless. On closer inspection, the GenerateConsoleCtrlEvent documentation says that CTRL+C cannot be sent to process groups.
Spawning the subprocess with CREATE_NEW_PROCESS_GROUP. Then call GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT, pid) to kill the worker. This is not ideal, because CTRL+BREAK is less friendly than CTRL+C and will probably result in a messier termination. (E.g. if it's a Python process, no KeyboardInterrupt can be caught and no finally blocks run.)
Is there any good way to do what I want? I can see that I could theoretically build on the first attempt and find some other way to communicate between the processes, but I am worried it will turn out to be extremely awkward. Are there good examples of other programs that achieve the same effect? It seems so simple that it can't be all that uncommon a requirement.
I don't know about managing/redirecting stdin et al., but for managing the subprocess tree, have you considered using the Windows Job Objects API?
There are several other questions about managing process trees ("How do I automatically destroy child processes in Windows?" and "Performing equivalent of 'Kill Process Tree' in C++ on Windows"), and Job Objects look like the cleanest method if you can use them.
Chapter 5 of Windows via C/C++ by Jeffrey Richter has a good discussion on using CreateJobObject and the related APIs.
Related
I am looking for a cross-platform way (various flavours of Unix, including Linux) to find and kill all processes spawned by my program. For Linux, I can walk /proc to obtain this information, and I am sure I can find something similar for OS X and *BSD. But I'd prefer it if there were a standard library for this.
Background: I am writing a custom job scheduler which needs to terminate jobs that don't complete within a given period of time. Simply killing the child process (SIGTERM, followed by SIGKILL if the former is ignored) works fine when the job doesn't spawn any other processes, or when it handles SIGTERM properly and takes care of the cleanup. But I don't control the jobs, and I know at least some of them are poorly written. In the latter case, the system is left with a bunch of orphaned processes which keep holding on to certain resources and cause all sorts of problems.
Any pointer to libraries or some cross platform way of doing this would be welcome.
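For illustration only: one common POSIX technique (not something the question itself mentions) is to start each job in its own session, so that the job and its descendants share one process group that can be signalled as a unit. The sketch below assumes Python 3 on a Unix-like system; `run_with_timeout` and its `grace` parameter are hypothetical names. It does not catch a job that deliberately moves itself into yet another session or process group.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout, grace=2.0):
    """Run cmd in its own process group; on timeout, signal the whole group.

    Hypothetical helper: returns the exit status of the direct child
    (negative signal number if it was killed by a signal).
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group whose id equals its pid (POSIX only).
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass

    pgid = proc.pid
    # Polite request first: SIGTERM goes to every process in the group.
    os.killpg(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        pass

    # Sweep up anything in the group that ignored SIGTERM.
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # nothing left in the group
    return proc.wait()
```

Called as `run_with_timeout(["sh", "-c", "sleep 30 & sleep 30"], timeout=0.5)`, this should terminate both the shell and the background `sleep`, because both inherit the same process group.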
I have been reading about ptys from this page's example: http://www.rkoucha.fr/tech_corner/pty_pdip.html. I have two questions:
What is the difference, or the most important difference, between using a pty and using a pipe? From what I have read, both are for inter-process communication, but with a pty the process can "treat it like a normal terminal". What does that mean?
What is a "controlling terminal"? I have read about them but can't understand what they really are. Is the controlling terminal always the pty assigned to the process?
The article you mention is excellent, and hard to improve upon, but it is rather technical. I'll try to give a less technical explanation (bear with me, Unix gurus!)
A pipe is just a unidirectional data channel: it can only be written to at one end and read from at the other. For bidirectional inter-process communication you'll always need two pipes. Pipes are excellent for moving bits around, but not for much more.
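As a small illustration of the one-way nature of a pipe, here is a minimal Python sketch (the function names are made up for this example, and the behaviour shown is what I'd expect on Linux). Data written to the write end comes out of the read end, while an attempt to write to the read end fails.

```python
import os


def send_through_pipe(payload: bytes) -> bytes:
    """Write payload into a pipe and read it back from the other end.

    The payload must fit in the pipe buffer, since nobody reads concurrently.
    """
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)
    finally:
        os.close(write_fd)  # closing the write end lets the reader see EOF
    try:
        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # empty read means EOF
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(read_fd)


def write_to_read_end_fails() -> bool:
    """Return True if writing to the read end of a pipe raises OSError."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(read_fd, b"x")  # wrong direction
    except OSError:
        return True
    else:
        return False
    finally:
        os.close(read_fd)
        os.close(write_fd)
```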
A pty (pseudoterminal) can be read and written on both ends, but it is much more than just a bidirectional data channel. To understand this, it is useful to have a look at a real terminal: On one end there is a process reading keystrokes and sending characters to a teletype or screen. On the other end there is a real human banging away at a keyboard and staring at the above-mentioned screen. Only one end has a file descriptor, the other end is just a connector and a cable.
Historically, terminals have developed many attributes that can be controlled by the programs running on them (like 'echo mode' or 'canonical mode'; see termios(3)). Also, a terminal can let the user (by way of the above-mentioned connector and cable) send signals that can be used for 'job control', e.g. by typing CTRL-Z to suspend a foreground job so that it can be moved to the background.
A pty is like a real terminal where both ends are file descriptors:
the slave end behaves exactly like a real terminal: a process that has a descriptor for the slave end ("inferior process") can read from and write to it, but can also set terminal attributes like echo mode or the interrupt character (e.g. CTRL+C). It will usually not even be aware that it is not connected to a real screen and keyboard.
the master end looks more like a keyboard and teletype for use, not by humans, but by other processes: any process that has opened the master end can write to it, and will receive echo (but only if the ECHO attribute is set on the slave). It can also (on most modern unices) control the session that has the slave as its controlling terminal, e.g. by sending CTRL+Z.
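To make the difference from a pipe concrete, here is a rough Python sketch using the standard `pty` module (the function names are invented for this example, and it assumes a Linux-like system with default terminal settings). It checks that the slave end passes the "is this a terminal?" test while a pipe does not, and shows that a line "typed" on the master end is what a program reading the slave end receives.

```python
import os
import pty


def compare_pipe_and_pty():
    """Report whether a pipe end and a pty slave end look like terminals."""
    pipe_r, pipe_w = os.pipe()
    master, slave = pty.openpty()
    try:
        return {
            "pipe_is_tty": os.isatty(pipe_r),
            "slave_is_tty": os.isatty(slave),
        }
    finally:
        for fd in (pipe_r, pipe_w, master, slave):
            os.close(fd)


def type_line(line: bytes) -> bytes:
    """Play the 'keyboard': write a line to the master, read it on the slave."""
    master, slave = pty.openpty()
    try:
        # In the default canonical mode the slave reader gets the input
        # once a complete line (terminated by newline) has arrived.
        os.write(master, line + b"\n")
        return os.read(slave, 1024)
    finally:
        os.close(master)
        os.close(slave)
```

A program that calls `isatty()` to decide whether to prompt or use colours will therefore behave as if a human were present when its standard streams are the slave end, which is the practical meaning of "treat it like a normal terminal".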
To understand what a controlling terminal is, it is again useful to think about the scenario where a real user is logged in at a real terminal. The user can start a "session", i.e. a collection of processes, some of them in foreground jobs, others in the background.
To prevent chaos, a controlling terminal (i.e. the kernel structure associated with it) keeps track of which processes are in a foreground or background job, and which processes are allowed to read from and write to it. Whenever a process tries something illegal (like a background process reading from the controlling terminal), the kernel stops the whole job with the signal SIGTTIN; if that signal is blocked or ignored, the read fails with EIO instead.
This shows that, just as with a real terminal, only the slave end of a pty can be a controlling terminal, and that the concept only makes sense on a Unix system that supports job control (any Unix system, nowadays).
I have to create a script (ksh or perl) that starts a certain number of parallel jobs (other scripts), each of which runs as a foreground process in a separate session. I also start a monitoring job that has to determine whether any of those scripts is expecting input from the operator, and switch to the corresponding session if necessary.
My problem is that I have not found a good way to determine that a process is expecting input. For a background process it's pretty easy: the process state is "stopped", and this can easily be checked with the 'ps' command. For a foreground process this does not work.
So far I tried to attach to the process with dbx or truss to see if it's hanging on 'read', but this approach seems too heavyweight.
Could you suggest a better solution? Perl, shell, C, Java, etc. are all OK, as long as the solution is standard and does not require installing extra third-party or OS-specific software.
Thank you.
What you're asking isn't possible, at least not reliably. The process may be using select or another polling method rather than blocking on a read call. You can't know whether it's waiting for operator input or busy doing other stuff, and in general it could be both (doing stuff in the background while being responsive to operator input).
The normal way for a program to signal that it's waiting for operator input is to print a prompt. Thus you should consider a session to be active if it's displayed a prompt since the last time you fed it input.
If your programs don't behave this way, you'll need to find some other program-specific way to know that these processes are waiting for input.
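A rough sketch of the prompt-based heuristic, in Python for brevity: the monitor remembers the output seen since it last sent input and treats the session as waiting when that output ends with something that looks like a prompt. The class name and the default prompt suffixes are assumptions made up for this example; real programs would need their own list.

```python
class SessionWatcher:
    """Track one session's output and guess whether it is showing a prompt."""

    def __init__(self, prompt_suffixes=("> ", ": ", "? ", "$ ")):
        # Hypothetical prompt endings; adjust per program being monitored.
        self.prompt_suffixes = tuple(prompt_suffixes)
        self._tail = ""

    def on_output(self, text: str) -> None:
        """Record output from the session, keeping only the recent tail."""
        self._tail = (self._tail + text)[-256:]

    def on_input_sent(self) -> None:
        """Forget earlier output once the operator has answered."""
        self._tail = ""

    def waiting_for_input(self) -> bool:
        """True if the output since the last input ends with a prompt."""
        return self._tail.endswith(self.prompt_suffixes)
```

This is only a heuristic: a program that prints "Loading: " and then keeps working would be misreported, which is why the prompt list has to be program-specific.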
I'm refactoring a bit of concurrent processing in my Ruby on Rails server (running on Linux) to use Spawn. Spawn::fork_it documentation claims that forked processes can still be waited on after being detached: https://github.com/tra/spawn/blob/master/lib/spawn.rb (line 186):
# detach from child process (parent may still wait for detached process if they wish)
Process.detach(child)
However, the Ruby Process::detach documentation says you should not do this: http://www.ruby-doc.org/core/classes/Process.html
Some operating systems retain the status of terminated child processes until the parent collects that status (normally using some variant of wait()). If the parent never collects this status, the child stays around as a zombie process. Process::detach prevents this by setting up a separate Ruby thread whose sole job is to reap the status of the process pid when it terminates. Use detach only when you do not intend to explicitly wait for the child to terminate.
Yet Spawn::wait effectively allows you to do just that by wrapping Process::wait. On a side note, I specifically want to use the Process::waitpid2 method to wait on the child processes, instead of using the Spawn::wait method.
Will detach-and-wait not work correctly on Linux? I'm concerned that this may cause a race condition between the detached reaper thread and the waiting parent process, as to who collects the child status first.
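The underlying POSIX behaviour behind this concern can be shown without Ruby. The following Python sketch (an analogue, not the Spawn or Ruby implementation) demonstrates that a child's exit status can be collected only once: whoever waits second gets an error. With a detached reaper thread and an explicit wait in play, which of the two ends up "second" depends on timing.

```python
import os


def wait_twice():
    """Fork a child, collect its status, then try to collect it again."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with a recognisable status

    # First wait succeeds and reaps the child.
    _, status = os.waitpid(pid, 0)
    first = os.WEXITSTATUS(status)

    # Second wait finds nothing left to collect (ECHILD).
    try:
        os.waitpid(pid, 0)
    except ChildProcessError:
        second = "ECHILD"
    else:
        second = "collected again"
    return first, second
```

On Linux I would expect `wait_twice()` to return `(7, "ECHILD")`.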
The answer to this question is there in the documentation. Are you writing code for your own use in a controlled environment? Or to be used widely by third parties? Ruby is written to be widely used by third parties, so their recommendation is to not do something that could fail on "some operating systems". Perhaps the Spawn library is designed primarily for use on Linux machines and tested only on a small subset thereof where this tactic works.
If you're distributing the code you're writing to be used by anyone and everyone, I would take Ruby's approach.
If you control the environment where this code will be run, I would write two tests:
A test that spawns a process, detaches it and then waits for it.
A test that spawns a process and then just waits for it.
Count the failure rate for both and if they are equal (within a margin that you feel is acceptable), go for it!
I want to increase the throughput of a script which does net I/O (a scraper). Instead of making it multithreaded in Ruby (I use the default 1.9.1 interpreter), I want to launch multiple processes. Is there a system for doing this that lets me track when one finishes and re-launch it, so that I have X processes running at any time? Also, some will run with different command args. I was thinking of writing a bash script, but it sounds like a potentially bad idea if there already exists a method for doing something like this on Linux.
I would recommend not forking but instead using EventMachine (and the excellent em-http-request if you're doing HTTP). Managing multiple processes can be a bit of a handful, even more so than handling multiple threads, but going down the evented path is, in comparison, much simpler. Since you want to do mostly network IO, which consists mostly of waiting, I think that an evented approach would scale as well as, or better than, forking or threading. And most importantly: it will require much less code, and it will be more readable.
Even if you decide on running separate processes for each task, EventMachine can help you write the code that manages the subprocesses using, for example, EventMachine.popen.
And finally, if you want to do it without EventMachine, read the docs for IO.popen, Open3.popen3 and Open4.popen4. All do more or less the same thing but give you access to the stdin, stdout, stderr (Open3, Open4), and pid (Open4) of the subprocess.
You can try fork: http://ruby-doc.org/core/classes/Process.html#M003148
It returns the PID, which you can use to check whether the process is still running.
If you want to manage IO concurrency, I suggest you use EventMachine.
You can either
implement (or find an equivalent gem) a ThreadPool (ProcessPool, in your case), or
prepare an array of all, let's say, 1000 tasks to be processed, split it into, say, 10 chunks of 100 tasks (10 being the number of parallel processes you want to launch), and launch 10 processes, each of which receives its 100 tasks right away. That way you don't need to launch 1000 processes and ensure that no more than 10 of them are working at the same time.
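Both options can be sketched briefly. The example below is in Python rather than Ruby and uses invented helper names: `chunked` splits a task list into a fixed number of nearly equal chunks (the second option), and `run_commands` keeps at most `max_parallel` child processes running at once by letting each thread of a small pool supervise one subprocess (a simple stand-in for a process pool).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def chunked(tasks, n_chunks):
    """Split tasks into n_chunks contiguous chunks of nearly equal size."""
    size, extra = divmod(len(tasks), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional task each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(tasks[start:end])
        start = end
    return chunks


def run_commands(commands, max_parallel):
    """Run each command as a subprocess, at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    # Each worker thread blocks on one child process, so the pool size
    # caps the number of children alive at any moment.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

With the pool approach a slow task only delays its own slot, whereas with fixed chunks one slow chunk can leave the other processes idle at the end; the chunked approach is simpler to launch from a shell script.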