Controlling an interactive command-line utility from a Cocoa app - trouble with ptys - macos

What I'm trying to do
My Cocoa app needs to run a bunch of command-line programs. Most of these are non-interactive: I launch them with some command-line arguments, they do their thing, output something, and quit. One of the programs is interactive: it writes some text and a prompt to stdout, then expects input on stdin, and this back-and-forth continues until you send it a quit command.
What works
The non-interactive programs, which just dump a load of data to stdout and then terminate, are comparatively trivial:
Create NSPipes for stdout/stdin/stderr
Launch NSTask with those pipes
Then, either
Get the NSFileHandle for the other end of the pipe to read all data until the end of the stream and process it in one go when the task ends
or
Get the -fileDescriptor from the NSFileHandle of the other end of each output pipe (a rough sketch of this whole approach follows the list).
Set the file descriptor to use non-blocking mode
Create a GCD dispatch source with each of those file descriptors using dispatch_source_create(DISPATCH_SOURCE_TYPE_READ, ...
Resume the dispatch source and handle the data it throws at you using read()
Keep going until the task ends and the pipe file descriptor reports EOF (read() reports 0 bytes read)
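Roughly, that read path looks like this (a simplified sketch, ARC assumed; the function and callback names are placeholders rather than my actual code):
#include <dispatch/dispatch.h>
#include <fcntl.h>
#include <unistd.h>

// Sketch of the dispatch-source read loop described above.
static dispatch_source_t startReading(int fd,
                                      void (^handleChunk)(const char *bytes, ssize_t length),
                                      void (^handleEOF)(void)) {
    fcntl(fd, F_SETFL, O_NONBLOCK);                       // don't let read() block the queue
    dispatch_source_t src = dispatch_source_create(DISPATCH_SOURCE_TYPE_READ,
                                                   (uintptr_t)fd, 0,
                                                   dispatch_get_main_queue());
    dispatch_source_set_event_handler(src, ^{
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            handleChunk(buf, n);                          // deliver this chunk
        } else if (n == 0) {                              // EOF: the other end closed the pipe
            dispatch_source_cancel(src);                  // (a cancel handler could close(fd) here)
            handleEOF();
        }                                                 // n < 0 with EAGAIN: wait for the next event
    });
    dispatch_resume(src);
    return src;
}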
What doesn't work
Either approach completely breaks down for interactive tools. Obviously I can't wait until the program exits because it's sitting at a command prompt and never will exit unless I tell it to. On the other hand, NSPipe buffers the data, so you receive it in buffer-sized chunks, unless the CLI program happens to flush the pipe explicitly, which the one in my case does not. The initial command prompt is much smaller than the buffer size, so I don't receive anything, and it just sits there. So NSPipe is also a no-go.
After some research, I determined that I needed to use a pseudo-terminal (pty) in place of the NSPipe. Unfortunately, I've had nothing but trouble getting it working.
What I've tried
Instead of the stdout pipe, I create a pty like so:
#include <util.h>   // openpty() on macOS; also needs <termios.h> and <strings.h> for bzero()
int masterFD = -1, slaveFD = -1;
struct termios termp;
bzero(&termp, sizeof(termp));   // note: all-zero terminal attributes, not a copy of a real terminal's settings
int res = openpty(&masterFD, &slaveFD, NULL, &termp, NULL);
This gives me two file descriptors; I hand the slaveFD over to an NSFileHandle, which gets passed to the NSTask for either just stdout or both stdout and stdin. Then I try to do the usual asynchronous reading from the master side.
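In code, that wiring looks roughly like this (a simplified sketch, ARC assumed; the tool path is a placeholder):
NSFileHandle *slaveHandle = [[NSFileHandle alloc] initWithFileDescriptor:slaveFD
                                                          closeOnDealloc:YES];
NSTask *task = [[NSTask alloc] init];
task.launchPath = @"/usr/local/bin/interactive-tool";  // placeholder path for the controlled program
task.standardOutput = slaveHandle;                     // child's stdout -> pty slave
task.standardInput  = slaveHandle;                     // optionally its stdin as well
[task launch];
// The app then reads from (and writes to) masterFD on its side of the pty.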
If I run the program I'm controlling in a Terminal window, it starts off by outputting 2 lines of text, one 18 bytes long including the newline, one 22 bytes and with no newline for the command prompt. After those 40 bytes it waits for input.
If I just use the pty for stdout, I receive 18 bytes of output (exactly one line, ending in newline) from the controlled program, and no more. Everything just sits there after the initial 18 bytes, no more events - the GCD event source's handler doesn't get called.
If I also use the pty for stdin, I usually receive 19 bytes of output (the aforementioned line plus one character from the next line) and then the controlled program dies immediately. If I wait a little before attempting to read the data (or scheduling noise causes a small pause), I actually get the whole 40 bytes before the program again dies instantly.
An additional dead end
At one point I wondered if my async reading code was flawed, so I re-did everything using NSFileHandle and its -readInBackgroundAndNotify method. This behaved exactly the same as the GCD version. (I originally picked GCD over the NSFileHandle API because there doesn't appear to be any asynchronous writing support in NSFileHandle.)
Questions
Having arrived at this point after well over a day of futile attempts, I could do with some kind of help. Is there some fundamental problem with what I'm trying to do? Why does hooking up stdin to the pty terminate the program? I'm not closing the master end of the pty, so it shouldn't be receiving EOF. Leaving aside stdin, why am I only getting one line's worth of output? Is there a problem with the way I'm performing I/O on the pty's file descriptor? Am I using the master and slave ends correctly - master in the controlling process, slave in the NSTask?
What I haven't tried
So far I have only performed non-blocking (asynchronous) I/O on pipes and ptys. The only thing I can think of is that the pty simply doesn't support that. (If so, why does fcntl(fd, F_SETFL, O_NONBLOCK); succeed, though?) I can try doing blocking I/O on background threads instead and send messages to the main thread, roughly as sketched below. I was hoping to avoid having to deal with multithreading, but considering how broken all these APIs seem to be, it can't be any more time-consuming than trying yet another permutation of async I/O. Still, I'd love to know what exactly I'm doing wrong.
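For completeness, that untried blocking variant would look something like this (a sketch only; it assumes masterFD is left in blocking mode, and the chunk handler is a placeholder):
#import <Foundation/Foundation.h>
#include <unistd.h>

// Sketch: blocking read() on a background queue, chunks forwarded to the main thread.
static void readInBackground(int masterFD, void (^handleChunk)(NSData *chunk)) {
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_UTILITY, 0), ^{
        char buf[4096];
        for (;;) {
            ssize_t n = read(masterFD, buf, sizeof(buf));   // blocks until data arrives
            if (n <= 0) break;                              // EOF (0) or error (-1) ends the loop
            NSData *data = [NSData dataWithBytes:buf length:(NSUInteger)n];
            dispatch_async(dispatch_get_main_queue(), ^{
                handleChunk(data);                          // handle the chunk on the main thread
            });
        }
    });
}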

The problem is likely that the stdio library inside the command-line program is buffering its output. The output will only appear on the read side of the pipe when the program flushes it: either because it writes a "\n" via the stdio library (in line-buffered mode), calls fflush(), fills the buffer, or exits (which causes the stdio library to automatically flush any output still buffered), or possibly under some other conditions. If those printf strings were "\n"-terminated, then you MIGHT get the output sooner. That's because there are three output buffering styles -- unbuffered, line-buffered ("\n" causes a flush), and block-buffered (the output is auto-flushed when the buffer gets full).
stdout is line-buffered by default if the output file descriptor refers to a tty (or pty); otherwise it is block-buffered. stderr is unbuffered by default. The setvbuf() function is used to change the buffering mode. These are all standard BSD UNIX (and probably general UNIX) behaviours.
NSTask does not do any setting up of ttys/ptys for you. It wouldn't help in this case anyway since the printfs aren't printing out \n.
Now, the problem is that the setvbuf() needs to be executed inside the command-line program. Unless (1) you have the source to the command-line program and can modify it and use that modified program, or (2) the command-line program has a feature that allows you to tell it to not buffer its output [ie, call setvbuf() itself], there's no way to change this, that I know of. The parent simply cannot affect the subprocess in this way, either to force flushing at certain points or change the stdio buffering behavior, unless the command-line utility has those features built into it (which would be rare).
Source: Re: NSTask, NSPipe's and interactive UNIX command
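To make the setvbuf() point concrete: if you do have the command-line program's source, the fix is a single call before it produces any output, along these lines (a minimal sketch; the prompt text and echo loop are just placeholders):
#include <stdio.h>

int main(void) {
    setvbuf(stdout, NULL, _IONBF, 0);     /* make stdout unbuffered for the rest of the run */
    printf("prompt> ");                   /* now reaches the pipe even without a trailing \n */
    char line[256];
    while (fgets(line, sizeof line, stdin) != NULL)
        printf("echo: %s", line);         /* placeholder behaviour for an interactive tool */
    return 0;
}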

Related

Maximum size of pipe used by CreateProcess

I'm currently using this example as a guide to redirect standard error of a child process launched by CreateProcess.
However, unlike the example, I'm currently waiting until the process finishes (checking GetExitCodeProcess), closing the pipe, and then reading the error output if a non-zero exit code comes back.
I've since read, however, that if the pipe fills up, the client process will block until the pipe is cleared. The reason I'm not currently reading from the pipe during execution is that the ReadFile call blocks (the standard error is only output at the end), so I couldn't pump the message queue, and the GUI would "ghost" and be marked as not responding.
I can't find any reference to how big the pipe is by default (although I can set a size myself). Is this something I need to worry about, given that I'm buffering the output into a string variable for later use anyway? (That is, it needs to fit into the process's available memory in any case, so it has a hard limit there; it's not going to a file like most of the examples do.)
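For reference, draining the pipe during execution from a worker thread would look roughly like this (a sketch only; g_readPipe, DrainStderr and the buffer sizes are placeholders, not code from my app):
#include <windows.h>
#include <string.h>

static HANDLE g_readPipe;                /* parent's read end of the child's stderr pipe */
static char   g_errBuf[1 << 20];         /* 1 MiB scratch buffer, for the sketch only */
static DWORD  g_errLen;

static DWORD WINAPI DrainStderr(LPVOID param) {
    (void)param;
    char chunk[4096];
    DWORD n;
    /* ReadFile blocks here on the worker thread, so the GUI thread keeps pumping
     * messages; meanwhile the pipe stays drained and the child never stalls on a
     * full pipe buffer. */
    while (ReadFile(g_readPipe, chunk, sizeof(chunk), &n, NULL) && n > 0) {
        DWORD space = (DWORD)sizeof(g_errBuf) - g_errLen;
        DWORD copy  = (n < space) ? n : space;
        memcpy(g_errBuf + g_errLen, chunk, copy);
        g_errLen += copy;
    }
    return 0;                            /* loop ends when the child closes its stderr */
}

/* After CreateProcess: start the reader, then pump/wait as before, e.g.
 *   HANDLE reader = CreateThread(NULL, 0, DrainStderr, NULL, 0, NULL);
 *   ...message loop / GetExitCodeProcess polling...
 *   WaitForSingleObject(reader, INFINITE);
 */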

2-way communication with background process (I/O)

I have a program that runs in the command line (i.e. $ run program starts up a prompt) that runs mathematical calculations. It has its own prompt that takes in text input and responds through standard out/error (or creates a separate X window if needed, but this can be disabled). Sometimes I would like to send it small input, and other times I send in a large text file containing a series of inputs, one per line. This program uses a lot of resources and also has a long startup time, so it would be best to have only one instance of it running at a time. I could keep the program's prompt open and supply the input that way, or I can send the process its input ending with an exit command (to leave the prompt), which just prints the output. The problem with sending the request with an exit command is that the program must start up each time (slow...). Furthermore, the output of this program is sometimes cryptic, and it would be helpful to filter the output in some way (e.g. simplify it, apply ANSI colors, etc.).
This all makes me want to put some two-way I/O filter (or is that a "pipe"? or a "wrapper"?) around the program so that it can run in the background as a single process. I would then communicate with it without having to restart it. I would also like to filter the output to be more user-friendly while doing this. I have been looking all over for ideas and I am stumped as to how to accomplish this in some simple, shell-accessible manner.
Some things I have tried were redirecting stdin and stdout to files, but the program hangs (doesn't quit) and only reads the file once, leaving me unable to continue the communication. I think this was because the prompt is waiting for more user input after the EOF. I thought this could be set up as a local server, but I am uncertain how to begin accomplishing that.
I would love to find some simple way to accomplish this. Additionally, if you can think of a way to perform this, do you think there is a way to also allow for attaching to or detaching from the prompt on request? Any help and ideas would be greatly appreciated.
You could create two named pipes (man mkfifo) and redirect input and output:
mkfifo fifoin fifoout
myprog < fifoin > fifoout
Then you could open new terminal windows and do this in one:
cat > fifoin
And this in the other:
cat < fifoout
(Or use tee to save the input/output as well.)
To dump a large input file into the program, use:
cat myfile > fifoin

Can I capture stdout/stderr separately and maintain original order?

I've written a Windows application using the native win32 API. My app will launch other processes and capture the output and highlight stderr output in red.
In order to accomplish this I create a separate pipe for stdout and stderr and use them in the STARTUPINFO structure when calling CreateProcess. I then launch a separate thread for each stdout/stderr handle that reads from the pipe and logs the output to a window.
This works fine in most cases. The problem I am having is that if the child process logs to stderr and stdout in quick succession, my app will sometimes display the output in the incorrect order. I'm assuming this is due to using two threads to read from each handle.
Is it possible to capture stdout and stderr in the original order they were written to, while being able to distinguish between the two?
I'm pretty sure it can't be done, short of having the spawned program write in packets and add a time stamp to each. Without that, you can normally expect buffering to happen in the standard library of the child process, so by the time they're even being transmitted through the pipe to the parent, there's a good chance that they're already out of order.
In most implementations of stdout and stderr that I've seen, stdout is buffered and stderr is not. Basically what this means is that you aren't guaranteed they're going to be in order even when running the program on straight command line.
http://en.wikipedia.org/wiki/Stderr#Standard_error_.28stderr.29
The short answer: You cannot ensure that you read the lines in the same order that they appear on cmd.exe because the order they appear on cmd.exe is not guaranteed.
Not really. You would think so, but stdout is under the control of the system designers: exactly how and when stdout gets written is subject to the system scheduler, which in my testing is subordinated to issues that are not well documented.
I was writing some stuff one day and did some work on one of the devices on the system while I had the code open in the editor, and discovered that the system was giving real-time priority to the driver, leaving my carefully crafted C code somewhere around one tenth as important as the proprietary code.
Re-inverting that so that you get sequential ordering of the writes is going to be challenging, to say the least.
You can redirect stderr to stdout:
command_name 2>&1
This is possible in C using pipes, as I recall.
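In Win32 C, with the pipe setup the question already uses, the 2>&1 equivalent is to hand the same write handle to both standard output and standard error; a rough sketch (the child command line is only an example):
#include <windows.h>
#include <stdio.h>

int main(void) {
    SECURITY_ATTRIBUTES sa = { sizeof(sa), NULL, TRUE };     /* inheritable handles */
    HANDLE readPipe, writePipe;
    if (!CreatePipe(&readPipe, &writePipe, &sa, 0)) return 1;
    SetHandleInformation(readPipe, HANDLE_FLAG_INHERIT, 0);  /* keep the read end on our side only */

    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    ZeroMemory(&si, sizeof(si));
    si.cb         = sizeof(si);
    si.dwFlags    = STARTF_USESTDHANDLES;
    si.hStdOutput = writePipe;
    si.hStdError  = writePipe;            /* same handle: the CreateProcess version of 2>&1 */
    si.hStdInput  = GetStdHandle(STD_INPUT_HANDLE);

    char cmd[] = "cmd /c dir . & dir no-such-dir";   /* example child that writes to both streams */
    if (!CreateProcessA(NULL, cmd, NULL, NULL, TRUE, 0, NULL, NULL, &si, &pi)) return 1;
    CloseHandle(writePipe);               /* so ReadFile reports EOF once the child exits */

    char buf[4096];
    DWORD n;
    while (ReadFile(readPipe, buf, sizeof(buf), &n, NULL) && n > 0)
        fwrite(buf, 1, n, stdout);        /* one merged stream (still subject to the child's own buffering) */

    CloseHandle(readPipe);
    WaitForSingleObject(pi.hProcess, INFINITE);
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    return 0;
}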
UPDATE: Oh, sorry -- I missed the part about being able to distinguish between the two. I know TextMate did it somehow using some kind of user-visible code... I haven't looked in a while, but I'll give it a peek. After some further thought, could you use something like Open3 in Ruby? You'd have to watch both STDOUT and STDERR at the same time, but really no one should expect a particular ordering of output between those two.
UPDATE 2: Example of what I meant in Ruby:
require 'open3'
Open3.popen3('ruby print3.rb') do |stdin, stdout, stderr|
  loop do
    puts stdout.gets
    puts stderr.gets
  end
end
...where print3.rb is just:
loop do
  $stdout.puts 'hello from stdout'
  $stderr.puts 'hello from stderr'
end
Instead of throwing the output straight to puts, you could send a message to an observer which would print it out in your program. Sorry, I don't have Windows on this machine (or any immediately available), but I hope this illustrates the concept.
I'm pretty sure that even if you don't separate them at all, you're still not guaranteed that they'll interleave in the correct order.
Since the intent is to annotate the output of an existing program, any possible interleaving of the two streams must be correct. The original developer will have placed appropriate flush() calls to ensure any mandatory ordering is honoured.
As previously explained, record each fragment that is written with a time stamp, and use this to recover the sequence actually seen by the output devices.

Basic Questions about Pipes

I have some basic questions about pipes I am unsure about.
a) What is the standard behavior if a process writing to a pipe gets killed (ie. SIGKILL SIGINT) Does it close the pipe? Does it flush the pipe? Or is the behavior undefined?
b) What is the standard behavior if a process returns normally? Is it guaranteed to flush the pipe and close the pipe? (without explicitly doing so of course).
I would like these answers to be as general as possible, but in reality if it depends entirely on the OS specs I can accept that! However, if there is a Posix standard or a current defined Windows behavior I would be very grateful to know.
Thanks.
a. What is the standard behavior if a process writing to a pipe gets killed (ie. SIGKILL SIGINT) Does it close the pipe? Does it flush the pipe? Or is the behavior undefined?
SIGKILL never allows any cleanup - the process dies, dead. With SIGINT, it depends on whether the process handles the signal. If so, it is likely to exit via exit(3), which flushes standard I/O streams. The question is: was the pipe connected to standard output or via popen()? If so, outstanding buffered data may be flushed; if not, there is no buffered data, so flushing is immaterial.
If there is unread data in the pipe, that data remains in the pipe, ready for the reader to collect - assuming there is a reader.
b. What is the standard behavior if a process returns normally? Is it guaranteed to flush the pipe and close the pipe? (without explicitly doing so of course).
It depends on whether the pipe was connected via standard I/O or not. If not, there is nothing pending. If so, then yes, any material in the buffers will be flushed as the standard I/O stream is closed.
c. Thanks for the info on signals and the unread data, but I'm a little confused about the standard I/O pipe connection. After you mentioned popen() I looked it up, and the man page says its return value is identical to an I/O stream and that streams are fully buffered by default. I'm just not clear on the difference between the two, nor do I understand where the difference comes from.
The basic system call for creating pipes is pipe(2). It creates two file descriptors, one for the read end of the pipe, one for the write end. If you do nothing else with them, then they remain as file descriptors, with unbuffered output (via write(2) and related system calls). If the process terminates, there is no buffering in the application; the pipe is closed.
If you use popen(3), then it does a whole lot more work for you. It still invokes pipe(2) to create the pipes, but it then does a fork(2). The child arranges the correct configuration of the pipes and then executes the requested command. The parent also closes the unused end of the pipe, and uses fdopen(3) to create a standard I/O file stream for the calling process to use.
With the file stream, if there is data in the I/O buffer, then a close or equivalent will ensure that the outstanding data is flushed and the file descriptor is closed.
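A tiny illustration of that flushing behaviour (the tr command is only an example):
#include <stdio.h>

int main(void) {
    FILE *child = popen("tr a-z A-Z", "w");   /* stdio stream wrapped around the pipe's write end */
    if (child == NULL) return 1;
    fprintf(child, "hello via popen\n");      /* may sit in the stream's buffer for a while... */
    pclose(child);                            /* ...but pclose() flushes it, closes the pipe, and waits for the child */
    return 0;
}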
The normal behaviour is that all file descriptors are closed when a process terminates. This means that a pipe, like any other open file descriptor, is closed normally.
One interesting thing about pipes, though: in POSIX, if a process writes to a pipe that has been closed, the writer will get a signal, SIGPIPE.
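A small demonstration of that (SIGPIPE is ignored here so the failed write() shows up as EPIPE instead of killing the process):
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    if (pipe(fds) != 0) return 1;
    close(fds[0]);                            /* no reader left on the pipe */
    signal(SIGPIPE, SIG_IGN);                 /* otherwise the write below would terminate us */
    if (write(fds[1], "x", 1) == -1 && errno == EPIPE)
        fprintf(stderr, "write to a closed pipe failed with EPIPE\n");
    return 0;
}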
Edit:
A caveat: the difference between a SIGx termination and a normal termination is that, as with any other file write, you may lose data that has been buffered (via a FILE write) and not yet written to the file descriptor.

Is there a terminal program that differentiates between input, output, and commands?

Is there a terminal program that shows the difference between input, standard output, error output, the prompt, and user-entered commands? It should also show when standard input is needed vs. running a command.
One way would be to highlight each differently. The cursor could change color depending on if it was waiting for a command, running a command, or waiting for standard input.
Another way would be to have three frames -- a large frame on the top for output (including the prompt and running commands), a small frame near the bottom for standard input, and a one-line frame at the bottom for command-line input. That would possibly even allow running another command to provide input while the previous command is still waiting for standard input.
From http://jamesjava.blogspot.com/2007/09/terminal-window-with-3-frames.html
Hotwire could be a good candidate, but it doesn't do that out of the box, AFAIK.
For now it appears that there is no such program.
My program gush (Graphical User SHell) does part of this. It uses different colours for commands and for program stdin/stdout/stderr.
Note that the traditional separation of shell and terminal makes this impossible, because the interface between them models an old serial terminal connection and therefore only has a single input and a single output channel. I get around this problem by combining shell and terminal into one program.
It would be nice to also indicate when a program is waiting for input, but I don't think there's any way to detect this unless you traced the system calls of the child program to detect when it tries to read stdin. For interactive programs, you can guess that if the last output did not end with a newline it's probably prompting for input, but this would not work for non-interactive programs, e.g. sed.
