It seems that os.Open() opens files read-only, so I think there is no need to Close() them. The doc is not clear on this. Is my understanding correct?
https://golang.org/pkg/os/#Open
In general, you should always close the files you open. In a long-running program, you may exhaust all available file handles if you do not close your files. That said, an *os.File that becomes unreachable is closed by a finalizer when the garbage collector reclaims it, so depending on your exact situation leaving files open may not be a big deal.
There is a limit to how many file handles a process can have open at once. The limit is determined by your environment, so it's important to close them.
In addition, Windows file locking is complicated; if you hold a file open it may not be able to be written to or deleted.
Unless you're returning the open file handle, I'd advise always matching an open with a defer file.Close().
Close releases resources that are independent of the read/write status of the file. Close the file when you are done with it.
Your best bet is to always use defer file.Close(). This function is invoked for cleanup purposes, and also releases resources that are indirectly related to the I/O operation itself.
This also holds true for HTTP(S) response bodies and any data type that implements the io.Closer interface (for example, an io.ReadCloser).
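As a minimal sketch of the advice above (the helper names readAll and roundTrip are made up for illustration and are not part of any library), a read-only open paired with a deferred Close could look like this:

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readAll opens path read-only, reads the whole file, and closes the
// descriptor before returning, even though nothing was written.
func readAll(path string) ([]byte, error) {
	f, err := os.Open(path) // os.Open opens the file with O_RDONLY
	if err != nil {
		return nil, err
	}
	defer f.Close() // releases the file descriptor when readAll returns
	return io.ReadAll(f)
}

// roundTrip is a hypothetical helper: it writes content to a temporary
// file, then reads it back through readAll.
func roundTrip(content string) (string, error) {
	tmp, err := os.CreateTemp("", "close-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString(content); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	data, err := readAll(tmp.Name())
	return string(data), err
}

func main() {
	out, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The defer is placed only after the error check, so Close is never called on a nil *os.File.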
I’ve got an HTTP endpoint which calls net/http.(*Request).FormFile to read an uploaded file. I noticed the returned multipart.File is never closed with Close(). That is fine for small files, where it is a no-op, but it appears that r.ParseMultipartForm (https://golang.org/src/net/http/request.go#L1369) will copy the file out of memory and into a temp file if the file is larger than 32MB. You can see the os.Open call here: https://golang.org/src/mime/multipart/formdata.go?s=3614:3656#L146
AFAICT this would leak file handles, but when I examine the process, I do not see leaking file handles. Where is this cleaned up?
UPDATE: Here is a complete program for testing: https://play.golang.org/p/79_kt46t1PQ
The multipart.Form declares a RemoveAll() method, which calls os.Remove(fh.tmpfile).
This method is called either on an error during multipart.Reader.ReadForm or after the request has been served:
if w.req.MultipartForm != nil {
    w.req.MultipartForm.RemoveAll()
}
The above is located at https://golang.org/src/net/http/server.go#L1957
Edit:
Actually you might be on to something. There is an open issue mentioning this. In the issue it is noted that RemoveAll gets called at the end of the request, but the files are reportedly still there.
As you also commented, this might be related to the os.Remove call being implemented (on Unix) with the unlink syscall, whose man page says:
If the name was the last link to a file but any processes still
have the file open, the file will remain in existence until the
last file descriptor referring to it is closed.
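The unlink behaviour quoted above can be observed from Go. The following is only a sketch under the assumption of a Unix-like system (on Windows, removing a file that is still open typically fails instead); readAfterRemove is a made-up name:

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readAfterRemove writes content to a temporary file, removes the file's
// name while the handle is still open, and then reads the data back
// through that handle.
func readAfterRemove(content string) (string, error) {
	f, err := os.CreateTemp("", "unlink-demo-*")
	if err != nil {
		return "", err
	}
	defer f.Close() // only now is the storage actually released

	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	// The name disappears from the directory, but the open descriptor
	// keeps the file alive.
	if err := os.Remove(f.Name()); err != nil {
		return "", err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	data, err := io.ReadAll(f)
	return string(data), err
}

func main() {
	out, err := readAfterRemove("still here")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```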
So to wrap up, I think you are supposed to call Close() yourself on the multipart.File, as suggested in the same thread:
file, _, _ := r.FormFile("file")
defer file.Close()
It could be a documentation bug. Some could argue that closing files after use is obvious, but in this particular case the docs could be more explicit. Anyway, at this point I guess you could ask for further clarification on the linked issue.
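A slightly fuller sketch of such a handler, with the error from FormFile checked before the deferred Close, might look like the following. The names uploadHandler and postFile, the form field "file", and the /upload path are all assumptions made for this illustration:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// uploadHandler reads the form file named "file" and replies with its size.
func uploadHandler(w http.ResponseWriter, r *http.Request) {
	file, _, err := r.FormFile("file")
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// A no-op for parts kept in memory, but it releases the descriptor
	// when the part was spilled to a temporary file.
	defer file.Close()

	n, err := io.Copy(io.Discard, file)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "%d", n)
}

// postFile sends payload to uploadHandler as a multipart upload and
// returns the response body.
func postFile(payload []byte) (string, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile("file", "demo.bin")
	if err != nil {
		return "", err
	}
	if _, err := part.Write(payload); err != nil {
		return "", err
	}
	if err := mw.Close(); err != nil {
		return "", err
	}

	req := httptest.NewRequest(http.MethodPost, "/upload", &body)
	req.Header.Set("Content-Type", mw.FormDataContentType())
	rec := httptest.NewRecorder()
	uploadHandler(rec, req)
	return rec.Body.String(), nil
}

func main() {
	out, err := postFile([]byte("hello"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // size of the uploaded payload in bytes
}
```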
Doing some tests in scm (a Scheme interpreter), I intentionally closed the current-input-port (equivalent to the standard input file descriptor). Since the program runs as a REPL, things went crazy: it kept printing an error message over and over. My question is: how can I regain control of the process, that is, how can I re-establish the input file descriptor of such a process?
Searching for "changing file descriptor of a running process" and similar phrases, I couldn't find a helpful article.
Thanks in advance
System information: Debian 10.
You almost certainly can't, although this does slightly depend on how the language-level ports are mapped to the underlying OS-level I/O system.
If what you do is close the OS-level standard input then all is lost:
the REPL tries to read from standard input, gets an error as it's closed;
it tries to raise some error which will involve prompting the user for input ...
... from standard input, which is closed, so it gets error;
game over.
The only way to survive this is for one of two things to be true:
either you've wrapped an error handler around the code which is already prepared to deal with this;
or the implementation is smart enough to recognise that it's getting closed-port errors in its closed-port error handler and gives up in some smart way.
Basically once the OS level standard input is gone anything that needs to get input from it is doomed: you can't put it back without OS-level surgery on the process.
However, it's possible that the implementation maps a single OS-level I/O stream to multiple language-level streams, and closing only one of these streams would leave the system with some other stream-of-last-resort to which it can still talk, and which still refers to the OS-level standard input. Common Lisp is an example of a system which can (depending on configuration) do this. It has, for instance, *standard-input*, *error-output*, *query-io*, *terminal-io* and other streams, and it's very possible to be in a situation where, for instance, *standard-input* has been closed, causing read errors, but *query-io* still points somewhere with a human on the end of it.
I don't know if scm does that.
I am working on making a program that will act in a similar way as a shell, but supports only foreground processes and pipes. I have multiple processes writing to the same pipe and some other properties that differ from the normal usage of pipes. Anyhow, my question is,
Is there any easy (automatic) way to close all file descriptors of a process except the three basic ones?
I am asking this question since I have a lot of difficulties keeping track of all file descriptors for every process. And sometimes they act in some unpredictable ways to me. It could be also because of the fact that I don't have a very thorough understanding of them.
Is there any easy way(automatic) to close all file descriptors of a process except the three basic ones?
The normal way to do this is to simply iterate over all of them and close them:
for (i = getdtablesize(); i > 3;) close(--i);
That's already a one-liner. It doesn't get any more "automatic" than that.
I am asking this question since I have a lot of difficulty keeping track of all file descriptors for every process.
It will be worth your time to think about the life cycle of each file descriptor you open, when it gets duplicated (e.g. dup2() and fork()), how it gets used, and make sure you account for how each one is going to get closed when it is no longer needed. Papering over a problem of leaked file descriptors by indiscriminately closing them all is not going to be sustainable.
I have multiple processes writing to the same pipe
If you do this, then you need to be aware that the order in which data arrive at the other end of the pipe is going to be unpredictable. It will be difficult to avoid corrupting the data stream.
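To illustrate both points (the life cycle of the write end and the unpredictable ordering), here is a rough sketch in Go in which goroutines stand in for the writer processes. The function name collectLines is invented for this example:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
	"sync"
)

// collectLines starts the given number of writers on one pipe and returns
// how many lines the reader received. The reader sees EOF only after the
// write end has been closed.
func collectLines(writers int) (int, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return 0, err
	}
	defer r.Close()

	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Each line goes out in a single short write; the order in
			// which lines arrive is still not predictable.
			fmt.Fprintf(w, "writer %d\n", id)
		}(i)
	}
	go func() {
		wg.Wait()
		w.Close() // without this, ReadAll below would block forever
	}()

	data, err := io.ReadAll(r)
	if err != nil {
		return 0, err
	}
	return strings.Count(string(data), "\n"), nil
}

func main() {
	n, err := collectLines(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n, "lines received")
}
```

With real child processes the same rule applies: every copy of the write end, including the one the parent inherited or kept, has to be closed before the reader gets EOF.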
Use the closefrom(3) C library function.
From the manpage:
The closefrom() system call deletes all open file descriptors greater
than or equal to lowfd from the per-process object reference table.
Any errors encountered while closing file descriptors are ignored.
Example usage:
#include <stdio.h>   /* printf */
#include <unistd.h>  /* closefrom; with libbsd on Linux, include <bsd/unistd.h> instead */

int main(void) {
    // Close everything except stdin, stdout and stderr
    closefrom(3); // where 3 is the lowest file descriptor you wish to close
    printf("Clear of all, but the three basic file descriptors!\n");
    return 0;
}
This works on most Unices. On Linux it requires either glibc 2.34 or later, or the libbsd support library.
I am working on a tool which writes data to files.
At some point, a file might be "locked" and is not writable until other handles have been closed.
I could use the CreateFile API in a loop until the file is available for writing access.
But I have 2 concerns using CreateFile in a loop:
The Harddrive (cache) is always running...?!
I need to call CreateFile again to obtain a valid writing handle with different flags...?!
So my question is:
What is the best solution to wait for a file to be writable and instantly get a valid handle?
Are there any event solutions or anything, which allows to "queue/reserve" for a handle once, so that there is no "uncontrolled" race condition with others?
A file can be "locked" for two reasons:
An actual file lock which prevents writing to, and possibly reading from the file.
The file being opened without sharing access (accidentally or voluntarily), which prevents you from even opening a handle. If you already see CreateFile failing, that's likely the case rather than a real lock.
There are conceptually[1] at least two ways of knowing that no other process has locked a file without busy waiting:
By finding out who holds locks and waiting on the process or thread to exit (or, by outright killing them...)
By locking the file yourself
Who holds locks?
Finding out about lock owners is rather nasty: you can do it via the totally undocumented SystemLocksInformation class used with the undocumented NtQuerySystemInformation function (the latter is "only undocumented", but the former is so undocumented that it's really hard to find any information at all). The returned structure is explained here, and it contains an owning thread id.
Luckily, holding a lock presumes holding a handle. Closing the file handle will unlock all file ranges. Which means: No lock without handle.
In other words, the problem can also be expressed as "who is holding an open handle to the file?". Of course not all processes that hold a handle to a file will have the file locked, but no process having a handle guarantees that no process has the file locked.
Code for finding out which processes have a file open is much easier (using restart manager) and is readily available at Raymond Chen's site.
Now that you know which processes and threads are holding file handles and locks, make a list of all thread/process handles and use WaitForMultipleObjects on the list of process handles. When a process exits, all handles are closed.
This also transparently deals with the possibility of a "lock" because a process does not share access.
Locking the file yourself
You can use LockFileEx, which operates asynchronously. Note that LockFileEx needs a valid handle that has been opened with either read or write permissions (getting write permission may not be possible, but read should work almost always -- even if you are prevented from actually reading by an exclusive lock, it's still possible to create a handle that could read if there was no lock).
You can then wait on the asynchronous locking to complete either via the event in the OVERLAPPED structure, or on a completion port, and can even do other useful stuff in the mean time, too. Once you have locked the file, you know that nobody else has it locked.
[1] The wording "conceptually" suggests that I am pretty sure either method will work, but I have not tested them.
Apart from a busy loop repeatedly trying to open the file with write access (which doesn't smell right: if the file is locked by a process that is stuck and requires a reboot or manual termination, you'll never be able to write to it), there are a few other options.
You could write to a temporary file and rename it afterwards (you can tell the OS a file rename operation is required and it will do it at next boot). If you need to append instead of write, then you'll have to write a process to append your temporary file to the correct one, possibly at startup (write the instructions of which file to append to where to a file that your process reads).
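The write-then-rename idea can be sketched as follows. This is only an illustration in Go: writeViaTemp and writeAndReadBack are made-up names, and the sketch does not cover the deferred rename at next boot mentioned above. On Windows the rename itself can still fail while another process holds the target open.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeViaTemp writes data to a temporary file in the same directory as
// path, then renames the temporary file over path.
func writeViaTemp(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	// Cleans up on failure; after a successful rename the name is gone
	// and this Remove simply returns an error that is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// writeAndReadBack is a hypothetical helper that exercises writeViaTemp
// inside a scratch directory and returns what ended up in the target.
func writeAndReadBack(content string) (string, error) {
	dir, err := os.MkdirTemp("", "rename-demo-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "data.txt")
	if err := writeViaTemp(target, []byte(content)); err != nil {
		return "", err
	}
	data, err := os.ReadFile(target)
	return string(data), err
}

func main() {
	out, err := writeAndReadBack("payload")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Creating the temporary file in the target's own directory keeps both names on the same volume, which a plain rename needs.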
If you need to modify a locked file, then you'll just have to take a lock on it as soon as you can, and refuse to start the program if you don't have write access - warn the user right at the start.
There is a possibility that you can wait in a better way: if a file is locked for writing, you can assume that someone is going to write to it, and so use FindFirstChangeNotification to receive FILE_NOTIFY_CHANGE_LAST_WRITE or FILE_NOTIFY_CHANGE_ATTRIBUTES events. It's not perfect, in that someone could request exclusive access for reading too.
I suppose you could try to get the handle to the file that is locked and wait on that, so when it is released your WaitForSingleObject will return. However, there's a good chance that the security subsystem will not allow you to get a handle owned by a different process.
I am performing very rapid file access in ruby (2.0.0 p39474), and keep getting the exception Too many open files
Having looked at this thread, here, and various other sources, I'm well aware of the OS limits (set to 1024 on my system).
The part of my code that performs this file access is mutexed, and takes the form:
File.open( filename, 'w'){|f| Marshal.dump(value, f) }
where filename is subject to rapid change, depending on the thread calling the section. It's my understanding that this form relinquishes its file handle after the block.
I can verify the number of File objects that are open using ObjectSpace.each_object(File). This reports that there are up to 100 resident in memory, but only one is ever open, as expected.
Further, the exception itself is thrown at a time when there are only 10-40 File objects reported by ObjectSpace. Further, manually garbage collecting fails to improve any of these counts, as does slowing down my script by inserting sleep calls.
My question is, therefore:
Am I fundamentally misunderstanding the nature of the OS limit---does it cover the whole lifetime of a process?
If so, how do web servers avoid crashing out after accessing over ulimit -n files?
Is ruby retaining its file handles outside of its object system, or is the kernel simply very slow at counting 'concurrent' access?
Edit 20130417:
strace indicates that ruby doesn't write all of its data to the file, returning and releasing the mutex before doing so. As such, the file handles stack up until the OS limit.
In an attempt to fix this, I have used syswrite/sysread, synchronous mode, and called flush before close. None of these methods worked.
My question is thus revised to:
Why is ruby failing to close its file handles, and how can I force it to do so?
Use dtrace or strace or whatever equivalent is on your system, and find out exactly what files are being opened.
Note that these could be sockets.
I agree that the code you have pasted does not seem to be capable of causing this problem, at least, not without a rather strange concurrency bug as well.