benefit of using systemu instead of open3? - ruby

The systemu page says:
systemu can be used on any platform to return status, stdout, and stderr of any command. unlike other methods like open3/popen4 there is zero danger of full pipes or threading issues hanging your process or subprocess.
(https://github.com/ahoward/systemu)
Could anyone explain this a little bit?

Methods like popen and its various spinoffs are convenient and are part of the expected API for a full I/O library.
However, they must be used either casually or carefully because they are prone to deadlock. By casually, I mean, if you both write and read from the command, it's still OK as long as you either don't write a lot or don't read a lot. By carefully, I mean, you can move large amounts of data, but only if you keep the inner details of the operation in mind and deliberately engineer against deadlock.
Imagine writing lots of stuff to your popened command and then reading a result. If you write more than a pipe will buffer, then your process will sleep. That's OK in practice, most of the time, but what if the command has to write a lot of stuff? Now it may sleep and not finish reading input that you are sending. You won't finish sending input so you will never wake up and read results.
Deadlock!
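One way to sidestep the deadlock described above, sketched here with Ruby's standard Open3.capture3 rather than systemu: capture3 drains stdout and stderr on background threads while it feeds stdin, so neither side stalls on a full pipe. The helper name echo_through_child and the echoing child command are assumptions made up for this illustration.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: sends `input` to a child process that copies its
# stdin to its stdout, and returns everything the child wrote.
def echo_through_child(input)
  # The child echoes as it reads, so it fills its output pipe while we are
  # still writing: exactly the situation that deadlocks a naive popen.
  child = [RbConfig.ruby, '-e', 'IO.copy_stream($stdin, $stdout)']
  # capture3 reads stdout/stderr on threads while writing stdin_data.
  out, err, status = Open3.capture3(*child, stdin_data: input)
  raise "child failed: #{err}" unless status.success?
  out
end

big = 'x' * 1_000_000 # far more than a pipe buffer typically holds
result = echo_through_child(big)
puts result.bytesize
```

Writing all of big to the child first and only then reading its output would be the "careless" version that can hang once both pipes fill up.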

Related

Confusion about Ruby's IO#(read/write)_nonblock calls

I am currently doing the Ruby on the Web project for The Odin Project. The goal is to implement a very basic webserver that parses and responds to GET or POST requests.
My solution uses IO#gets and IO#read(maxlen) together with the Content-Length Header attribute to do the parsing.
Other solutions use IO#read_nonblock. I googled it, but was quite confused by its documentation. It's often mentioned together with Kernel#select, which didn't really help either.
Can someone explain to me what the nonblock calls do differently than the normal ones, how they avoid blocking the thread of execution, and how they play together with the Kernel#select method?
explain to me what the nonblock calls do differently than the normal ones
The crucial difference in behavior is when there is no data available to read at call time, but not at EOF:
read_nonblock() raises an exception that is a kind of IO::WaitReadable
normal read(length) waits until length bytes are read (or EOF)
how they avoid blocking the thread of execution
According to the documentation, #read_nonblock is using the read(2) system call after O_NONBLOCK is set for the underlying file descriptor.
how they play together with the Kernel#select method?
There's also IO.select. We can use it in this case to wait for availability of input data, so that a subsequent read_nonblock() won't cause an error. This is especially useful if there are multiple input streams, where it is not known from which stream data will arrive next and for which read() would have to be called.
In a blocking write, you wait until the bytes have been written to the file. A nonblocking write, on the other hand, returns immediately: it writes whatever the underlying buffer can accept right now, and raises an exception that is a kind of IO::WaitWritable if it can accept nothing. This means you can continue executing your program instead of waiting. Then, when you want to write again, you use select to check whether the file is ready to accept the next write.
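A minimal sketch of how read_nonblock and IO.select fit together, using an in-process pipe so it runs on its own. The helper name read_available and its timeout and chunk parameters are assumptions for illustration, not part of Ruby's API.

```ruby
# Hypothetical helper: returns the data currently available on `io`,
# nil if nothing arrives within `timeout` seconds, or :eof at end of file.
def read_available(io, timeout: 1, chunk: 4096)
  begin
    io.read_nonblock(chunk)
  rescue IO::WaitReadable
    # No data right now. Wait until the descriptor is readable (or give up).
    return nil unless IO.select([io], nil, nil, timeout)
    retry
  rescue EOFError
    :eof
  end
end

reader, writer = IO.pipe

first = read_available(reader, timeout: 0.1) # nothing written yet
writer.write('hello')
second = read_available(reader)              # data is waiting in the pipe
writer.close
third = read_available(reader)               # writer closed, so EOF

p [first, second, third]
```

A blocking reader.read(5) in place of the first call would simply hang until five bytes arrived or the writer closed.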

How can I exit reader.ReadString from waiting for user input?

I am making it so that it stops asking for input upon CTRL-C.
What I have currently is a separate goroutine that, upon receiving a CTRL-C, changes the value of a variable so the program won't ask for another line. However, I can't seem to find a way around the current line.
That is, I still have to press Enter once to get out of the current iteration, which is waiting to read a \n.
Is there perhaps a way to push a "\n" into stdin for reader.ReadString to read? Or a way to stop its execution altogether?
The only decent mechanism that Go gives you to proceed when either of two things happens is select, and select only works on channel operations, so your only option is to change your signal-handler goroutine to write to a channel, add another goroutine that reads stdin and passes lines of input to a second channel, and then select on the two channels.
However, that still leaves your question half-unanswered: your main program can stop waiting for input on a Ctrl-C, but the goroutine that's reading input will still be waiting for input. In some cases that might be okay... if you will never need stdin again, or if you will go right back to processing lines in the same exact way. But if you want to do something other than ReadString from that reader, you're stuck... literally. The only solution I see would be to write your own state machine around Read or ReadByte that is capable of changing its behavior in response to external conditions, but that can easily get horribly complicated.
Basically, this looks like a case where Go simplifies things compared to the underlying system (not exposing anything like EINTR, not allowing select on filehandles), but ends up providing less power to the programmer.

Get a long-running-process' output stream

There's a long-running Unix process whose output I'd like to capture and process with Clojure. A good example of such a process is a repl-y / nREPL session: its duration is indefinite, and output gets printed to stdout.
If I try (clojure.java.shell/sh "lein" "repl"), evaluation will block until the underlying process finishes, and only then can I observe the output.
This is not what I want - I want to get a stream immediately instead.
Can I achieve this using clojure.java.io, or similar, existing Clojure tools? Wouldn't mind resorting to Java otherwise.
Take a look at the me.raynes.conch library; it's a bit more versatile than clojure.java.shell. Its low-level API seems to be what you're looking for.
Not a detailed answer, but the source for Clojure's sh function is pretty short. If you reworked it slightly to remove the .waitFor (or added a higher-order function to consume the partial reads returned by the InputStreamReader as they arrived), you could probably get updated data as it's returned by the process. But be careful of deadlocks in case your subprocess expects input as well (as in your lein repl example).
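The same idea (consume output as it arrives instead of waiting for the process to exit) can be sketched in Ruby, the language of the other code on this page, rather than Clojure. The child command below is a stand-in made up for this example; a real long-running process would take its place.

```ruby
require 'rbconfig'

# Stand-in child: prints three lines, flushing each one immediately.
child = [RbConfig.ruby, '-e',
         '$stdout.sync = true; 3.times { |i| puts "line #{i}" }']

lines = []
# IO.popen hands back the stream right away; each_line yields every line
# as soon as the child writes it, without waiting for the child to exit.
IO.popen(child) do |out|
  out.each_line do |line|
    lines << line.chomp
    puts "got: #{line}"
  end
end
```

As the answer above warns, a child that also expects input needs its stdin fed or closed separately, or both sides can end up waiting on each other.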

Named pipe WriteFile questions (Win32)

I am writing a Win32 app that uses a named pipe for inter-process communication. When one process writes, it first calls WriteFile to send a header structure (telling the other process how many bytes follow, plus other info), then calls WriteFile again to send the actual data.
The other process reads the first message, then reads the second message based on the information it got from the first.
My questions are:
If the server process has written data that the client process hasn't read yet, is it possible to lose the first message? For example, when the server calls WriteFile a second time to write the actual data, will the previous message be overwritten?
Is there a recommended way to use WaitForSingleObject to synchronize?
Thanks
A pipe is a little like a real pipe -- when you write more to the pipe, it doesn't overwrite what was already in the pipe. It just adds more data to the pipe that will be delivered after the data that you previously wrote to the pipe.
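The append-only behaviour is easy to see with a header-then-payload framing like the one in the question. This is a rough sketch using an anonymous Ruby pipe rather than a Win32 named pipe; the 4-byte big-endian length header and the helper names write_message and read_message are assumptions chosen for the example.

```ruby
# Hypothetical framing: a 4-byte big-endian length, then the payload bytes.
def write_message(io, payload)
  io.write([payload.bytesize].pack('N')) # first write: the header
  io.write(payload)                      # second write: the data
end

# Reads one framed message, or returns nil once the writer has closed.
def read_message(io)
  header = io.read(4)
  return nil if header.nil?
  io.read(header.unpack1('N'))
end

reader, writer = IO.pipe

# Two complete messages are queued before anything is read.
write_message(writer, 'first')
write_message(writer, 'second message')
writer.close

messages = []
while (msg = read_message(reader))
  messages << msg
end
p messages
```

Nothing written earlier is lost: the reader receives both headers and both payloads in exactly the order they were written.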
I rarely find WaitForSingleObject useful for a pipe. If you want to block the current thread until it receives data from the pipe, you can just do a synchronous read, and it'll block until there's data. If you want to block until there's input from any of a number of sources, you usually want WaitForMultipleObjects or MsgWaitForMultipleObjects, so your thread will run when any of the sources has input to process.
The only times I can recall using WaitForSingleObject on a pipe were with a zero timeout, so the receiver would continue other processing if there was no pipe input, and every once in a while check if the pipe has some data to process. While it initially seems like PeekNamedPipe would work for this, it's really most useful for other purposes -- though it might work for you, to read the header data and figure out what other code to invoke to read and process the entire message.
Having said all that, I feel obliged to point out that I haven't written any new code using named pipes in quite a while. I can think of very few situations in which I'd even consider them today -- I'd almost always use sockets instead.

Can I capture stdout/stderr separately and maintain original order?

I've written a Windows application using the native win32 API. My app will launch other processes and capture the output and highlight stderr output in red.
In order to accomplish this I create a separate pipe for stdout and stderr and use them in the STARTUPINFO structure when calling CreateProcess. I then launch a separate thread for each stdout/stderr handle that reads from the pipe and logs the output to a window.
This works fine in most cases. The problem I am having is that if the child process logs to stderr and stdout in quick succession, my app will sometimes display the output in the incorrect order. I'm assuming this is due to using two threads to read from each handle.
Is it possible to capture stdout and stderr in the original order they were written to, while being able to distinguish between the two?
I'm pretty sure it can't be done, short of writing the spawned program to write in packets and add a time-stamp to each. Without that, you can normally plan on buffering happening in the standard library of the child process, so by the time they're even being transmitted through the pipe to the parent, there's a good chance that they're already out of order.
In most implementations of stdout and stderr that I've seen, stdout is buffered and stderr is not. Basically what this means is that you aren't guaranteed they're going to be in order even when running the program on straight command line.
http://en.wikipedia.org/wiki/Stderr#Standard_error_.28stderr.29
The short answer: You cannot ensure that you read the lines in the same order that they appear on cmd.exe because the order they appear on cmd.exe is not guaranteed.
Not really. You would think so, but stdout is under the control of the system designers: exactly how and when stdout gets written is subject to the system scheduler, which in my testing is influenced by factors that are not well documented.
I was writing some code one day and did some work on one of the devices on the system while I had the code open in the editor, and discovered that the system was giving real-time priority to the driver, leaving my carefully crafted C code about one tenth as important as the proprietary code.
Re-inverting that so that you get sequential ordering of the writes is gonna be challenging to say the least.
You can redirect stderr to stdout:
command_name 2>&1
This is possible in C using pipes, as I recall.
UPDATE: Oh, sorry -- missed the part about being able to distinguish between the two. I know TextMate did it somehow using kinda user visible code... Haven't looked for a while, but I'll give it a peek. But after some further thought, could you use something like Open3 in Ruby? You'd have to watch both STDOUT and STDERR at the same time, but really no one should expect a certain ordering of output regarding these two.
UPDATE 2: Example of what I meant in Ruby:
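require 'open3'

Open3.popen3('ruby print3.rb') do |stdin, stdout, stderr|
  stdin.close # the child needs no input
  streams = [stdout, stderr]
  until streams.empty?
    # Block until at least one stream is readable, then read only from those.
    ready, = IO.select(streams)
    ready.each do |io|
      line = io.gets
      line ? puts(line) : streams.delete(io) # nil means EOF on that stream
    end
  end
end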
...where print3.rb is just:
$stdout.sync = true # flush every line so the parent sees it promptly
loop do
  $stdout.puts 'hello from stdout'
  $stderr.puts 'hello from stderr'
end
Instead of throwing the output straight to puts, you could send a message to an observer which would print it out in your program. Sorry, I don't have Windows on this machine (or any immediately available), but I hope this illustrates the concept.
I'm pretty sure that even if you don't separate them at all, you're still not guaranteed that they'll interleave in the correct order.
Since the intent is to annotate the output of an existing program, any possible interleaving of the two streams must be considered correct. The original developer will have placed appropriate flush() calls to ensure any mandatory ordering is honoured.
As previously explained, record each fragment that is written with a time stamp, and use this to recover the sequence actually seen by the output devices.
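A minimal sketch of the timestamp approach, assuming the child process can be changed to tag every fragment at the moment it writes it. The Fragment struct, the merge_by_time helper, and the sample data are all hypothetical.

```ruby
# Hypothetical record: the time a fragment was written, its stream, its text.
Fragment = Struct.new(:time, :stream, :text)

# Merges fragments captured from separate pipes back into write order.
# The index is a tie-breaker so fragments with equal timestamps keep
# their relative input order.
def merge_by_time(*lists)
  lists.flatten(1)
       .each_with_index
       .sort_by { |frag, i| [frag.time, i] }
       .map(&:first)
end

# Made-up sample data, as the two reader threads might have collected it.
out_fragments = [Fragment.new(1.0, :stdout, 'starting'),
                 Fragment.new(3.0, :stdout, 'done')]
err_fragments = [Fragment.new(2.0, :stderr, 'warning: low disk space')]

merged = merge_by_time(out_fragments, err_fragments)
merged.each { |f| puts "[#{f.stream}] #{f.text}" }
```

Timestamps taken by the parent when it reads would not help here, since buffering in the child may already have reordered the data before it reaches the pipes.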
