Recovering control of a closed input descriptor process - scheme

While doing some tests in scm (a Scheme interpreter), I intentionally closed the current-input-port (equivalent to the standard input file descriptor). Since the program runs as a REPL, things went crazy: it printed an error message over and over. My question is: how can I regain control of the process, that is, how can I re-establish the input file descriptor of such a process?
Searching for "changing file descriptor of a running process" or something similar, I couldn't find a helpful article.
Thanks in advance
System information: Debian 10.

You almost certainly can't, although this does slightly depend on how the language-level ports are mapped to the underlying OS-level I/O system.
If what you do is close the OS-level standard input then all is lost:
the REPL tries to read from standard input, gets an error as it's closed;
it tries to raise some error which will involve prompting the user for input ...
... from standard input, which is closed, so it gets an error;
game over.
The only way to survive this is for one of two things to be true:
either you've wrapped an error handler around the code which is already prepared to deal with this;
or the implementation is smart enough to recognise that it's getting closed-port errors in its closed-port error handler and gives up in some smart way.
Basically once the OS level standard input is gone anything that needs to get input from it is doomed: you can't put it back without OS-level surgery on the process.
However, it's possible that the implementation maps a single OS-level I/O stream to multiple language-level streams, so that closing only one of these streams leaves the system with some other stream-of-last-resort to which it can still talk, and which still refers to the OS-level standard input. Common Lisp is an example of a system which can (depending on configuration) do this. It has, for instance, *standard-input*, *error-output*, *query-io*, *terminal-io* and other streams, and it's very possible to be in a situation where, for instance, *standard-input* has been closed, causing read errors, but *query-io* still points somewhere with a human on the end of it.
I don't know if scm does that.
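The stream-of-last-resort idea can be sketched outside Scheme. The following is a minimal Ruby illustration, not anything scm is known to do: a pipe stands in for standard input, and the name last_resort is made up for the example. Two language-level streams refer to the same OS-level stream, so closing one leaves the other usable.

```ruby
# A pipe stands in for the OS-level standard input.
reader, writer = IO.pipe
writer.puts "hello"

# A second language-level stream for the same OS-level stream
# (hypothetical name; dup(2) is used under the hood).
last_resort = reader.dup

# Closing one language-level stream does not close the other.
reader.close
line = last_resort.gets
```

If the only stream referring to the descriptor is closed, there is nothing left to read from, which is the situation the answer describes.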

Related

Making STDIN unbuffered under Windows in Perl

I am trying to do input processing (from the console) in Perl asynchronously. My first approach was to use IO::Select but that does not work under Windows.
I then came across the post Non-buffered processor in Perl which roughly suggests this:
binmode STDIN;
binmode STDOUT;
STDIN->blocking(0) or warn $!;
STDOUT->autoflush(1);
while (1) {
    my $buffer;
    my $read_count = sysread(STDIN, $buffer, 4096);
    if (not defined($read_count)) {
        next;       # no data available yet (or a read error): try again
    } elsif (0 == $read_count) {
        exit 0;     # end of file
    }
}
That works as expected for regular Unix systems but not for Windows, where the sysread actually does block. I have tested that on Windows 10 with 64-bit Strawberry Perl 5.32.1.
When you check the return value of blocking() (as done in the code above), it turns out that the call fails with the funny error message "An operation was attempted on something that is not a socket".
Edit: My application is a chess engine that theoretically can be run interactively in a terminal but usually communicates via pipes with a GUI. Therefore, Win32::Console does not help.
Has something changed since the blog post was published? The author explicitly claims that this approach would work for Windows. Is there any other option that I can go with, maybe some module from the Win32:: namespace?
The solution I now implemented in https://github.com/gflohr/Chess-Plisco/blob/main/lib/Chess/Plisco/Engine.pm (search for the method __msDosSocket()) can be outlined as follows:
If Windows is detected as the operating system, create a temporary file as a Unix domain socket with IO::Socket::UNIX for writing.
Do a fork() which actually creates a thread in Perl for Windows because the system does not have a real fork().
In the "parent", create another instance of IO::Socket::UNIX with the same path for reading.
In the "child", read from standard input with getline(). This blocks, of course. Every line read is echoed to the write end of the socket.
The "parent" uses the read-end of the socket as a replacement for standard input and puts it into non-blocking mode. That works even under Windows because it is a socket.
From here on, everything is working the same as under Unix: All input is read in non-blocking mode with IO::Select.
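The shape of the steps above can be sketched in a few lines. This is a Ruby analogue for illustration only, not the Perl code from the linked module: forward_lines is a hypothetical helper, a pipe stands in for standard input, and UNIXSocket.pair assumes a Unix-like system. A helper thread performs the blocking reads and forwards every line to a socket, which the main loop then polls without blocking.

```ruby
require "socket"

# Hypothetical helper: a thread reads lines from a blocking source and
# echoes them to one end of a socket pair; the other end is returned.
def forward_lines(source)
  parent_end, child_end = UNIXSocket.pair
  Thread.new do
    while (line = source.gets)        # blocks, but only in this thread
      child_end.write(line)
      break if line.chomp == "quit"   # same termination rule as in the text
    end
    child_end.close
  end
  parent_end
end

source_r, source_w = IO.pipe          # stands in for standard input
input = forward_lines(source_r)
source_w.puts "go depth 5"
source_w.puts "quit"
source_w.close

received = String.new
loop do
  # Wait up to two seconds for the socket to become readable.
  break unless IO.select([input], nil, nil, 2.0)
  begin
    received << input.read_nonblock(4096)
  rescue IO::WaitReadable
    next                              # spurious wakeup: poll again
  rescue EOFError
    break                             # forwarding thread closed its end
  end
end
```

The main loop never calls a blocking read on the original source, so it stays free to do other work between polls.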
Instead of a Unix domain socket it is probably wiser to route the communication through the loopback interface because under Windows it is hard to guarantee that a temporary file gets deleted when the process terminates since you cannot unlink it while it is in use. It is also stated in the comments that IO::Socket::UNIX may not work under older Windows versions, and so inet sockets are probably more portable to use.
I also had trouble terminating both threads. A call to kill() does not seem to work. In my case, the protocol that the program implements is such that the command "quit" read from standard input should cause the program to terminate. The child thread therefore checks whether the line read was "quit" and terminates with exit in that case. A proper solution should find a better way of letting the parent kill the child.
I did not bother to ignore SIGCHLD (because it doesn't exist under Windows) or call wait*() because fork does not spawn a new process image under Windows but only a new thread.
This approach is close to the one suggested in one of the comments to the question, only that the thread comes in disguise as a child process created by fork().
The other suggestion was to use the module Win32::Console. This does not work for two reasons:
As the name suggests, it only works for the console. But my software is a backend for a GUI frontend and rarely runs in a console.
The underlying API is for keyboard and mouse events. It works fine for key strokes and most mouse events, but polling an event blocks as soon as the user has selected something with the mouse. So even for a real console application, this approach would not work. A solution built on Win32::Console must also handle events like pressing the CTRL, ALT or Shift key because they will not guarantee that input can be read immediately from the tty.
It is somewhat surprising that a task as trivial as non-blocking I/O on a file descriptor is so hard to implement in a portable way in Perl because Windows actually has a similar concept called "overlapped" I/O. I tried to understand that concept, failed at it, and concluded that it is true to the Windows maxim "make easy things hard, and hard things impossible". Therefore I just cannot blame the Perl developers for not using it as an emulation of non-blocking I/O. Maybe it is simply not possible.

Confusion about Ruby's IO#(read/write)_nonblock calls

I am currently doing the Ruby on the Web project for The Odin Project. The goal is to implement a very basic webserver that parses and responds to GET or POST requests.
My solution uses IO#gets and IO#read(maxlen) together with the Content-Length Header attribute to do the parsing.
Other solutions use IO#read_nonblock. I googled for it, but was quite confused by its documentation. It's often mentioned together with Kernel#select, which didn't really help either.
Can someone explain to me what the nonblock calls do differently than the normal ones, how they avoid blocking the thread of execution, and how they play together with the Kernel#select method?
explain to me what the nonblock calls do differently than the normal ones
The crucial difference in behavior is when there is no data available to read at call time, but not at EOF:
read_nonblock() raises an exception that is a kind of IO::WaitReadable
normal read(length) waits until length bytes are read (or EOF)
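A minimal sketch of that difference, using a pipe so that "no data yet, but not at EOF" is easy to reproduce (the variable names are only for illustration):

```ruby
reader, writer = IO.pipe

# Nothing has been written yet and the write end is still open,
# so a non-blocking read cannot succeed and raises instead of waiting.
outcome =
  begin
    reader.read_nonblock(16)
  rescue IO::WaitReadable
    :would_block
  end

# Once data is available, read_nonblock returns what is there,
# even if that is fewer than the 16 bytes asked for.
writer.write("hi")
data = reader.read_nonblock(16)
writer.close
```

A plain reader.read(16) in place of the first call would have blocked until 16 bytes arrived or the write end was closed.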
how they avoid blocking the thread of execution
According to the documentation, #read_nonblock is using the read(2) system call after O_NONBLOCK is set for the underlying file descriptor.
how they play together with the Kernel#select method?
There's also IO.select. We can use it in this case to wait for availability of input data, so that a subsequent read_nonblock() won't cause an error. This is especially useful if there are multiple input streams, where it is not known from which stream data will arrive next and for which read() would have to be called.
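As a rough sketch of the multiple-stream case, assuming two pipes stand in for two client connections: IO.select reports which streams have data, and read_nonblock is then called only on those.

```ruby
a_reader, a_writer = IO.pipe
b_reader, b_writer = IO.pipe

# Only the second stream has data waiting.
b_writer.write("from b")

# Wait up to one second; the first element of the result
# is the list of streams that are ready for reading.
ready, = IO.select([a_reader, b_reader], nil, nil, 1.0)

# Reading from a stream that select reported as ready will not raise
# IO::WaitReadable here, because the data is already in the pipe.
chunks = ready.map { |io| io.read_nonblock(1024) }
```

Calling a blocking read on a_reader first would have stalled the program even though b_reader had data ready.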
In a blocking write you wait until the bytes have been written. A nonblocking write, on the other hand, returns immediately: it writes only as much as the operating system can accept right now, and raises an exception that is a kind of IO::WaitWritable if nothing can be accepted. This means that you can continue to execute your program instead of waiting. Then, when you want to write again, you use select to see whether the file is ready to accept the next write.
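A small sketch of the write side, assuming a pipe whose reader is deliberately slow: the writer fills the pipe until the operating system refuses more data, and select is then used to find out when writing is possible again.

```ruby
reader, writer = IO.pipe
chunk = "x" * 4096
written = 0
blocked = false

# Keep writing until the pipe's buffer is full.
begin
  loop { written += writer.write_nonblock(chunk) }
rescue IO::WaitWritable
  blocked = true            # a blocking write would have stalled here
end

# With a zero timeout, select returns nil while the pipe is still full.
full = IO.select(nil, [writer], nil, 0).nil?

# Drain the pipe; afterwards select reports the writer as ready again.
reader.read(written)
writable_again = !IO.select(nil, [writer], nil, 1.0).nil?
```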

Closing all pipes of a process

I am working on making a program that will act in a similar way as a shell, but supports only foreground processes and pipes. I have multiple processes writing to the same pipe and some other properties that differ from the normal usage of pipes. Anyhow, my question is,
Is there any easy (automatic) way to close all file descriptors of a process except the three basic ones?
I am asking this question since I have a lot of difficulties keeping track of all file descriptors for every process. And sometimes they act in some unpredictable ways to me. It could be also because of the fact that I don't have a very thorough understanding of them.
Is there any easy way(automatic) to close all file descriptors of a process except the three basic ones?
The normal way to do this is to simply iterate over all of them and close them:
for (i = getdtablesize(); i > 3;) close(--i);
That's already a one-liner. It doesn't get any more "automatic" than that.
I am asking this question since I have a lot of difficulty keeping track of all file descriptors for every process.
It will be worth your time to think about the life cycle of each file descriptor you open, when it gets duplicated (e.g. dup2() and fork()), how it gets used, and make sure you account for how each one is going to get closed when it is no longer needed. Papering over a problem of leaked file descriptors by indiscriminately closing them all is not going to be sustainable.
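To make the life-cycle point concrete, here is a minimal sketch in Ruby rather than C (chosen only for brevity; fork requires a Unix-like system). Each process closes the pipe end it does not use as soon as it can. In particular, the parent must close its copy of the write end, or its read would never see end of file.

```ruby
reader, writer = IO.pipe

pid = fork do
  reader.close              # the child only writes
  writer.puts "from child"
  writer.close
end

writer.close                # the parent only reads; this copy must go
output = reader.read        # returns once every write end is closed
reader.close
Process.wait(pid)
```

Every descriptor created here has exactly one owner and one place where it is closed, which is easier to reason about than closing everything in bulk afterwards.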
I have multiple processes writing to the same pipe
If you do this, then you need to be aware that the order in which data arrive at the other end of the pipe is going to be unpredictable. It will be difficult to avoid corrupting the data stream.
Use the closefrom(3) C library function.
From the manpage:
The closefrom() system call deletes all open file descriptors greater
than or equal to lowfd from the per-process object reference table.
Any errors encountered while closing file descriptors are ignored.
Example usage:
#include <stdio.h>
#include <unistd.h>
int main(void) {
    // Close everything except stdin, stdout and stderr
    closefrom(3); // Where 3 is the lowest file descriptor you wish to close
    printf("Clear of all, but the three basic file descriptors!\n");
    return 0;
}
This works in most unices, but requires the libbsd support library for Linux.

Ruby file handle management (too many open files)

I am performing very rapid file access in ruby (2.0.0 p39474), and keep getting the exception Too many open files
Having looked at this thread, here, and various other sources, I'm well aware of the OS limits (set to 1024 on my system).
The part of my code that performs this file access is mutexed, and takes the form:
File.open( filename, 'w'){|f| Marshal.dump(value, f) }
where filename is subject to rapid change, depending on the thread calling the section. It's my understanding that this form relinquishes its file handle after the block.
I can verify the number of File objects that are open using ObjectSpace.each_object(File). This reports that there are up to 100 resident in memory, but only one is ever open, as expected.
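Both observations can be checked in isolation. The following sketch uses a temporary directory and made-up names (value, handle, still_open), not the code from the question: it keeps a reference to the File object from the block and inspects it afterwards.

```ruby
require "tmpdir"

value = { "answer" => 42 }
handle = nil
restored = nil

Dir.mktmpdir do |dir|
  filename = File.join(dir, "value.bin")
  File.open(filename, "wb") do |f|
    handle = f                      # keep a reference for inspection
    Marshal.dump(value, f)
  end                               # the block form closes f here
  restored = Marshal.load(File.binread(filename))
end

# File objects may linger in memory until garbage collected,
# but only the ones that are not closed hold a descriptor.
still_open = ObjectSpace.each_object(File).reject(&:closed?)
```

If handle.closed? is true and the data round-trips, the block form is releasing its descriptor, and the leak has to come from somewhere else.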
Further, the exception itself is thrown at a time when there are only 10-40 File objects reported by ObjectSpace. Further, manually garbage collecting fails to improve any of these counts, as does slowing down my script by inserting sleep calls.
My question is, therefore:
Am I fundamentally misunderstanding the nature of the OS limit---does it cover the whole lifetime of a process?
If so, how do web servers avoid crashing out after accessing over ulimit -n files?
Is ruby retaining its file handles outside of its object system, or is the kernel simply very slow at counting 'concurrent' access?
Edit 20130417:
strace indicates that ruby doesn't write all of its data to the file, returning and releasing the mutex before doing so. As such, the file handles stack up until the OS limit.
In an attempt to fix this, I have used syswrite/sysread, synchronous mode, and called flush before close. None of these methods worked.
My question is thus revised to:
Why is ruby failing to close its file handles, and how can I force it to do so?
Use dtrace or strace or whatever equivalent is on your system, and find out exactly what files are being opened.
Note that these could be sockets.
I agree that the code you have pasted does not seem to be capable of causing this problem, at least, not without a rather strange concurrency bug as well.

Unwanted buffering when filtering console output in Win32

My question is related to "Turn off buffering in pipe" albeit concerning Windows rather than Unix.
I'm writing a Make clone and to stop parallel processes from thrashing each others' console output I've redirected the output to pipes (as described in here) on which I can do any filtering I want. Unfortunately long-running processes now buffer up their output rather than sending it in real-time as they would on a console.
From peeking at the MSVCRT sources it seems the root cause is that GetFileType() is used to check whether the standard I/O handles are attached to a console, which then sets an internal flag and ends up disabling buffering.
Apparently a separate array of inheritable file handles and flags can also be passed on through the undocumented lpReserved2 member of the STARTUPINFO structure when creating the process. About the only working solution I've figured out is to use this list and just lie about the device type when setting the flags for stdout/stderr.
Now then... Is there any sane way of solving this problem?
There is not. Yes, GetFileType() tells it that stdout is no longer a char device, so _isatty() returns false and the CRT switches the output stream to buffered mode. That is important for reasonable throughput. Flushing output one character at a time is only acceptable when a human is looking at it.
You would have to relink the programs you are trying to redirect with a customized version of the CRT. I don't doubt that if that was possible, you wouldn't be messing with this in the first place. Patching GetFileType() is another un-sane solution.
