Closing all pipes of a process - shell

I am working on making a program that will act in a similar way as a shell, but supports only foreground processes and pipes. I have multiple processes writing to the same pipe and some other properties that differ from the normal usage of pipes. Anyhow, my question is,
Is there any easy (automatic) way to close all file descriptors of a process except the three basic ones?
I am asking this question since I have a lot of difficulties keeping track of all file descriptors for every process. And sometimes they act in some unpredictable ways to me. It could be also because of the fact that I don't have a very thorough understanding of them.

Is there any easy way (automatic) to close all file descriptors of a process except the three basic ones?
The normal way to do this is to simply iterate over all of them and close them:
for (i = getdtablesize(); i > 3;) close(--i);
That's already a one-liner. It doesn't get any more "automatic" than that.
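For comparison, here is a minimal Python sketch of the same idea. Instead of sweeping every possible descriptor number, it lists /dev/fd (available on Linux and macOS) and closes only what is actually open. That is a swapped-in technique, not the loop above. The names close_fds_from and demo are made up for illustration; demo runs the cleanup in a forked child so the calling process keeps its own descriptors.

```python
import os


def close_fds_from(lowfd=3):
    """Close every open descriptor >= lowfd, ignoring errors."""
    try:
        # /dev/fd lists this process's open descriptors on Linux and macOS.
        fds = [int(name) for name in os.listdir("/dev/fd")]
    except OSError:
        # Assumed fallback: sweep up to the descriptor limit, like the C loop.
        fds = range(lowfd, os.sysconf("SC_OPEN_MAX"))
    for fd in fds:
        if fd >= lowfd:
            try:
                os.close(fd)
            except OSError:
                pass  # not open any more (e.g. the fd used to read /dev/fd)


def demo():
    """Return True if a forked child no longer holds inherited pipe ends."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            close_fds_from(3)
            closed = 0
            for fd in (r, w):
                try:
                    os.fstat(fd)
                except OSError:
                    closed += 1  # fstat fails on a closed descriptor
            if closed == 2:
                status = 0
        finally:
            os._exit(status)
    os.close(r)
    os.close(w)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0
```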
I am asking this question since I have a lot of difficulty keeping track of all file descriptors for every process.
It will be worth your time to think about the life cycle of each file descriptor you open, when it gets duplicated (e.g. dup2() and fork()), how it gets used, and make sure you account for how each one is going to get closed when it is no longer needed. Papering over a problem of leaked file descriptors by indiscriminately closing them all is not going to be sustainable.
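As an illustration of that bookkeeping, here is a rough Python sketch of a two-command pipeline. The function name run_pipeline and the second pipe used to capture the output are assumptions made for this example. The point is that every descriptor created by pipe() is closed in every process that does not need it, right after the dup2() calls.

```python
import os


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return the consumer's output as bytes."""
    link_r, link_w = os.pipe()  # producer stdout -> consumer stdin
    out_r, out_w = os.pipe()    # consumer stdout -> parent

    pids = []
    # Each entry: (argv, fd that becomes stdin or None, fd that becomes stdout)
    for argv, stdin_fd, stdout_fd in ((producer, None, link_w),
                                      (consumer, link_r, out_w)):
        pid = os.fork()
        if pid == 0:
            try:
                if stdin_fd is not None:
                    os.dup2(stdin_fd, 0)
                os.dup2(stdout_fd, 1)
                # The copies on 0 and 1 are all the child needs. Python marks
                # pipe fds close-on-exec anyway; the explicit closes mirror
                # what a C shell has to do by hand.
                for fd in (link_r, link_w, out_r, out_w):
                    os.close(fd)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if something above failed
        pids.append(pid)

    # Parent: close every end it does not read from, otherwise a stray
    # write end stays open and read() below never reports end-of-file.
    for fd in (link_r, link_w, out_w):
        os.close(fd)

    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    for pid in pids:
        os.waitpid(pid, 0)
    return b"".join(chunks)
```

For example, run_pipeline(["echo", "hello"], ["tr", "a-z", "A-Z"]) runs the equivalent of echo hello | tr a-z A-Z.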
I have multiple processes writing to the same pipe
If you do this, then you need to be aware that the order in which data arrive at the other end of the pipe is going to be unpredictable. It will be difficult to avoid corrupting the data stream.
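One thing that helps here: POSIX guarantees that a write() to a pipe of at most PIPE_BUF bytes is atomic, so records written with one call each are not split, even though their order across writers stays unpredictable. The following Python sketch assumes line-oriented records that are much shorter than PIPE_BUF; the function name and record format are invented for the example.

```python
import os
import select


def collect_from_writers(n_writers=4, n_records=200):
    """Fork writers that share one pipe; return the lines the parent reads."""
    r, w = os.pipe()
    pids = []
    for wid in range(n_writers):
        pid = os.fork()
        if pid == 0:
            try:
                os.close(r)
                for i in range(n_records):
                    # Each record is far shorter than select.PIPE_BUF and is
                    # sent with a single write() call, so it arrives intact.
                    record = f"writer={wid} seq={i}\n".encode()
                    os.write(w, record)
            finally:
                os._exit(0)
        pids.append(pid)

    os.close(w)  # the parent must drop its write end to ever see EOF
    chunks = []
    while True:
        chunk = os.read(r, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(r)
    for pid in pids:
        os.waitpid(pid, 0)
    return b"".join(chunks).decode().splitlines()
```

Records longer than PIPE_BUF, or records assembled from several write() calls, lose this guarantee and can interleave.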

Use the closefrom(3) C library function.
From the manpage:
The closefrom() system call deletes all open file descriptors greater
than or equal to lowfd from the per-process object reference table.
Any errors encountered while closing file descriptors are ignored.
Example usage:
#include <stdio.h>
#include <unistd.h> /* with libbsd on Linux: <bsd/unistd.h> */
int main(void) {
    // Close everything except stdin, stdout and stderr
    closefrom(3); // where 3 is the lowest file descriptor you wish to close
    printf("Clear of all, but the three basic file descriptors!\n");
    return 0;
}
This works on most Unix systems. On Linux it needs glibc 2.34 or later, or the libbsd support library.

Related

Recovering control of a closed input descriptor process

While doing some tests in scm (a Scheme interpreter), I intentionally closed the current-input-port (equivalent to the standard input file descriptor). Since the program runs as a REPL, things went crazy: it printed an error message over and over. My question is: how can I regain control of the process, that is, how can I re-establish the input file descriptor of such a process?
I searched for "changing file descriptor of a running process" and similar phrases, but couldn't find a helpful article.
Thanks in advance
System information: Debian 10.
You almost certainly can't, although this does slightly depend on how the language-level ports are mapped to the underlying OS-level I/O system.
If what you do is close the OS-level standard input then all is lost:
the REPL tries to read from standard input, gets an error as it's closed;
it tries to raise some error which will involve prompting the user for input ...
... from standard input, which is closed, so it gets an error;
game over.
The only way to survive this is for one of two things to be true:
either you've wrapped an error handler around the code which is already prepared to deal with this;
or the implementation is smart enough to recognise that it's getting closed-port errors in its closed-port error handler and gives up in some smart way.
Basically once the OS level standard input is gone anything that needs to get input from it is doomed: you can't put it back without OS-level surgery on the process.
However, it's possible that the implementation maps a single OS-level I/O stream to multiple language-level streams, and closing only one of these streams would leave the system with some other stream-of-last-resort to which it can still talk, and which still refers to the OS-level standard input. Common Lisp is an example of a system which can (depending on configuration) do this. It has, for instance, *standard-input*, *error-output*, *query-io*, *terminal-io* and other streams, and it's very possible to be in a situation where, for instance, *standard-input* has been closed, causing read errors, but *query-io* still points somewhere with a human on the end of it.
I don't know if scm does that.
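At the OS level, the only way a process can put a closed descriptor back by itself is to have kept another reference to the same open file before closing it. The Python sketch below shows that on an arbitrary descriptor; the function name is made up, and it says nothing about how scm maps its ports onto descriptors.

```python
import os


def close_and_restore(fd):
    """Close `fd`, then bring it back from a duplicate saved beforehand."""
    saved = os.dup(fd)   # second descriptor for the same open file
    os.close(fd)         # fd itself is now invalid
    try:
        os.fstat(fd)
        was_closed = False
    except OSError:
        was_closed = True
    os.dup2(saved, fd)   # make fd refer to the saved open file again
    os.close(saved)
    return was_closed
```

Without the saved duplicate there is nothing left to pass to dup2(), which is the situation described above.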

worst case scenario: launched two copies of a program which appends lines to a file

I have a Python program which performs a simple operation on a file:
with open(self.cache_filename_url, "a", encoding="utf8") as f:
    w = csv.writer(f, delimiter=',', quotechar='"', lineterminator='\n')
    w.writerow([cache_url, rpd_products])
As you can see it just opens the file and appends a CSV line to it. It does this a lot, in a loop.
I accidentally ran two copies of this program simultaneously, so I think they would have been appending to the file simultaneously. I am trying to determine the worst-case-scenario for file corruption.
Do you think the writes would at least be atomic operations in this case? For example this wouldn't be a problem for me:
old line
old line
new line written by instance 1
new line written by instance 2
new line written by one
This would be a problem for me:
old line
old line
[half of new line written by instance 1] [half of new line by instance 2]
etc
To put it another way, is it possible for the two append operations to "interfere" with each other?
EDIT: I am using Windows 7
Opening the same file multiple times in shared write mode can definitely be problematic. And, if they don't open in shared mode, you'll get one of them throwing exceptions that it cannot open the file.
If SHARED mode:
Both instances will have their own internal pointer. In most cases, they will probably write independently. You could get:
Process A opens file, sets pointer to end (byte 1024)
Process B opens file, sets pointer to end (byte 1024)
Process B writes at byte 1024 and closes file
Process A writes at byte 1024 and closes file.
Both processes will have written to the file at the same location. You've basically lost the record from Process B, and depending on how the close works (if it truncates), if the lines it writes are different lengths, you could get part of Process B if the line was longer.
If it is in EXCLUSIVE mode, one process will fail to open the file, and whatever exception handling you have will kick in.
Which mode you are in can be system dependent, as Python doesn't seem to provide any mechanisms for controlling the share mode.
Update: I ran a check on my file, and I did indeed have corrupted partial lines (the case under "This would be a problem for me" in my question)
It's unfortunate, especially since it implies you could have problems even when you intend to share a file between two processes.
I am still interested in any pointers on how to avoid this outcome. I will hold off on marking an answer as accepted for now. (The other answer is good, but doesn't provide enough details on these modes or how to determine which will be used.)
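One possible way to avoid this, sketched here under assumptions: serialize the appends with a lock file created with O_CREAT | O_EXCL, which fails if the file already exists and is available on both Windows and POSIX. The ".lock" suffix, the timeout and the function name append_row are invented for this example. A process that crashes while holding the lock leaves a stale lock file behind, which this sketch does not handle.

```python
import csv
import io
import os
import time


def append_row(path, row, timeout=10.0):
    """Append one CSV row to `path` while holding a hypothetical lock file."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file exists, so only one
            # process at a time gets past this point.
            lock_fd = os.open(lock_path,
                              os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError("could not acquire " + lock_path)
            time.sleep(0.01)
    try:
        # Build the complete line in memory, then write it in one go.
        buf = io.StringIO()
        writer = csv.writer(buf, delimiter=",", quotechar='"',
                            lineterminator="\n")
        writer.writerow(row)
        with open(path, "a", encoding="utf8", newline="") as f:
            f.write(buf.getvalue())
    finally:
        os.close(lock_fd)
        os.remove(lock_path)
```

Every program that appends to the file has to go through the same function, since the lock file only protects against writers that check for it.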

two programs accessing one file

New to this forum - looks great!
I have some Processing code that periodically reads data wirelessly from remote devices and writes that data as bytes to a file, e.g. data.dat. I want to write an Objective C program on my Mac Mini using Xcode to read this file, parse the data, and act on the data if data values indicate a problem. My question is: can my two different programs access the same file asynchronously without a problem? If this is a problem can you suggest a technique that will allow these operations?
Thanks,
Kevin H.
Multiple processes can read from the same file at the same time without any problem. A process can also read from a file while another writes to it without problem, although you'll have to take care to ensure that you read in any new data that was written. Multiple processes should not write to the same file at the same time, though. The OS will let you do it, but the ordering of data will be undefined, and you'll likely overwrite data. In general, you're gonna have a bad time if you do that. So you should take care to ensure that only one process writes to a file at a time.
The simplest way to protect a file so that only one process can write to it at a time is with the C function flock(), although that function is admittedly a bit rudimentary and may or may not suit your use case.
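A minimal sketch of that approach, written in Python through its fcntl module, which wraps the same flock() call. The helper names locked_append and locked_read are made up for illustration. It only works if both programs take the lock, because flock() is advisory.

```python
import fcntl
import os


def locked_append(path, data):
    """Append bytes to `path` while holding an exclusive flock() on it."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another lock is held
        os.write(fd, data)
    finally:
        os.close(fd)                    # closing the fd releases the lock


def locked_read(path):
    """Read the whole file while holding a shared flock() on it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH)  # waits for an exclusive holder
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```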

What happens if another process tries to write to a flock(2)'d file?

Specifically, if the following events take place in the given order:
Process 1 opens a file in append mode.
Process 2 opens the same file in append mode.
Process 2 gets an exclusive lock using flock(2) on the file descriptor.
Process 1 attempts to write to the file.
What happens?
Will the write return immediately with a code indicating failure? Will it hang until the lock is released, then write and return success? Does the behavior vary by kernel? It seems odd that the documentation doesn't cover this case.
(I could write a couple processes to test it on my system, but I don't know whether my test would be representative of the general case, and if anyone does know, I can anticipate this answer saving a lot of other people a lot of time.)
The write proceeds as normal. flock provides advisory locking. Locking a file exclusively only prevents others from getting a shared or exclusive lock on the same file. Calls other than flock are not affected.
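This can be checked with a small Python sketch. It assumes that two separate open() calls in one process can stand in for the two processes, which holds because a flock() lock belongs to the open file description rather than to the process. The function name is invented for the example.

```python
import fcntl
import os


def write_despite_lock(path):
    """Return (bytes_written, second_lock_refused) for a flock()'d file."""
    fd1 = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    fd2 = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        fcntl.flock(fd2, fcntl.LOCK_EX)         # "process 2" takes the lock
        written = os.write(fd1, b"from fd1\n")  # "process 1" writes anyway
        try:
            # A non-blocking lock attempt is the only call that notices.
            fcntl.flock(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)
            refused = False
        except OSError:
            refused = True
        return written, refused
    finally:
        os.close(fd1)
        os.close(fd2)
```

On a local file system the write succeeds while the lock is held, and only the second flock() attempt is refused.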

Unwanted buffering when filtering console output in Win32

My question is related to "Turn off buffering in pipe" albeit concerning Windows rather than Unix.
I'm writing a Make clone, and to stop parallel processes from thrashing each other's console output I've redirected the output to pipes (as described here) on which I can do any filtering I want. Unfortunately, long-running processes now buffer up their output rather than sending it in real time as they would on a console.
From peeking at the MSVCRT sources it seems the root cause is that GetFileType() is used to check whether the standard I/O handles are attached to a console, which then sets an internal flag and ends up disabling buffering.
Apparently a separate array of inheritable file handles and flags can also be passed on through the undocumented lpReserved2 member of the STARTUPINFO structure when creating the process. About the only working solution I've figured out is to use this list and just lie about the device type when setting the flags for stdout/stderr.
Now then... Is there any sane way of solving this problem?
There is not. Yes, GetFileType() tells the CRT that stdout is no longer a character device, so _isatty() returns false and the CRT switches the output stream to buffered mode. That matters for getting reasonable throughput: flushing output one character at a time is only acceptable when a human is looking at it.
You would have to relink the programs you are trying to redirect with a customized version of the CRT. I don't doubt that if that was possible, you wouldn't be messing with this in the first place. Patching GetFileType() is another un-sane solution.

Resources