Golang program closes file before writing is finished

I have implemented a custom Write interface for my cloud program.
My problem so far is that after I am done copying files to my writer and have closed the Writer, the writer still has a few Writes left to do (usually about four writes of 4096 bytes each). The last Write is usually less than 4096 bytes.
It has not happened yet, but I know there is roughly a 1/4096 chance that the last Write is exactly 4096 bytes, in which case my program won't terminate.
I am using this for a zipping program, so checking for io.EOF is not effective because every write chunk ends with one, and checking whether the writer is closed happens too early, while there are still some writes to do.
What is the best way to handle this situation?
**EDIT**
I ended up implementing more robust Write(), Flush() and Close() methods. Now everything is fine if I use defer Close(), but I still get the same problem if I call Close() manually at the end.

Since you have full control of the writer, you could use a sync.WaitGroup
to wait in your main for all goroutines to finish.

The problem was solved by implementing a more robust Close() function. I also used defer Close() to make sure all the goroutines were handled before the program exited.


Is Close() needed for a file opened by os.Open()? [duplicate]

It seems that os.Open() opens files read-only, so I think there is no need to Close() them. The docs are not clear on this. Is my understanding correct?
https://golang.org/pkg/os/#Open
In general, you should always close the files you open. In a long running program, you may exhaust all available file handles if you do not close your files. That said, the Go runtime sets a finalizer that closes an os.File when it is garbage collected, so depending on your exact situation leaving files open may not be a big deal.
There is a limit to how many file handles a process can have open at once; the limit is determined by your environment, so it's important to close them.
In addition, Windows file locking is complicated; if you hold a file open it may not be able to be written to or deleted.
Unless you're returning the open file handle, I'd advise always matching an open with a defer file.Close().
Close releases resources that are independent of the read/write status of the file. Close the file when you are done with it.
Your best bet is to always use defer file.Close(). The deferred call runs on every return path, ensuring cleanup happens and releasing resources that are tied to the file beyond the I/O operation itself.
This also holds true for HTTP(S) response bodies and any type that implements the io.Closer interface.

How do I print the contents of files in a directory but ignore files which are open in write mode?

I have a goroutine which periodically checks for new files in a directory and then prints the contents of the files. However there is another goroutine which creates a file, writes contents into it and then saves the file.
How do I ignore the files which are open in WRITE mode in a directory?
Sample Code:
for {
	fileList, err := ioutil.ReadDir("/uploadFiles")
	if err != nil {
		log.Println(err) // log.Fatal would exit and make continue unreachable
		continue
	}
	for _, f := range fileList {
		log.Println("File:", f.Name())
		go printContents(f.Name())
	}
	time.Sleep(time.Second * 5)
}
In the printContents goroutine I want to ignore the files which are open in WRITE mode.
That is not how it's done.
Off the top of my head I can think of these options:
If both goroutines are working in the same program,
there is little problem: make the "producer" goroutine register
the names of the files it has finished modifying in some
registry, and make the "consumer" goroutine read (and delete)
from that registry.
In the simplest case that could be a buffered channel.
If the producer works much faster than the consumer
and you don't want to block the former for some reason,
then a slice protected by a mutex would fit the bill.
If the goroutines work in different processes on the same
machine but you control both programs, make the producer
process communicate the same data to the consumer process
via any suitable sort of IPC.
Which method of IPC is best depends on how the
processes start up, interact, etc.
There is a wide variety of options.
If you control both processes but do not want to mess with
IPC between them (there are reasons for that, too), then make the producer
follow best practices for writing a file
(more on this in a moment), and make the consumer use a
filesystem-monitoring facility to report which files get created ("appear") once finished by the producer.
You may start with github.com/fsnotify/fsnotify.
To write a file properly, the producer has to write its
data to a temporary file: that is, a file located in the same
directory but having a filename which is well understood to
indicate that the file is not finished yet. For instance,
".foobar.data.part" or "foobar.data.276gd14054.tmp" is OK when writing "foobar.data".
(Other approaches exist, but this one is good enough to
start with.)
Once the file is ready, the producer has to rename the
file from its temporary name to its "proper", final name.
This operation is atomic on all sensible OSes/filesystems,
and makes the file atomically "spring into existence" from the PoV
of the consumer. For instance, inotify on Linux generates
an event of type "moved to" for such an appearance.
If you don't feel like doing the proper thing yourself, github.com/dchest/safefile is a good cross-platform start.
As you can see, with this approach you know
the file is done simply from the fact it was reported
as having appeared.
If you do not control the producer, you may need to resort to
guessing.
The simplest is, again, to monitor the filesystem for
events, but this time for "file updated" events rather than "file created"
events. For each file reported as updated you have to remember
the timestamp of that event, and once a certain amount of time passes without further updates, you may declare that the file is done by the producer.
IMO this approach is the worst of all, but if you have no
better options it's at least something.

Confusion about Ruby's IO#(read/write)_nonblock calls

I am currently doing the Ruby on the Web project for The Odin Project. The goal is to implement a very basic webserver that parses and responds to GET or POST requests.
My solution uses IO#gets and IO#read(maxlen) together with the Content-Length Header attribute to do the parsing.
Other solutions use IO#read_nonblock. I googled it, but was quite confused by the documentation. It's often mentioned together with Kernel#select, which didn't really help either.
Can someone explain to me what the nonblock calls do differently than the normal ones, how they avoid blocking the thread of execution, and how they play together with the Kernel#select method?
explain to me what the nonblock calls do differently than the normal ones
The crucial difference in behavior arises when there is no data available to read at call time, but the stream is not at EOF:
read_nonblock() raises an exception of kind IO::WaitReadable
a normal read(length) waits until length bytes are read (or EOF is reached)
how they avoid blocking the thread of execution
According to the documentation, #read_nonblock is using the read(2) system call after O_NONBLOCK is set for the underlying file descriptor.
how they play together with the Kernel#select method?
There's also IO.select. We can use it in this case to wait for availability of input data, so that a subsequent read_nonblock() won't cause an error. This is especially useful if there are multiple input streams, where it is not known from which stream data will arrive next and for which read() would have to be called.
In a blocking write you wait until the bytes are written to the file; a nonblocking write, on the other hand, returns immediately. That means you can continue executing your program while the operating system writes the data asynchronously. Then, when you want to write again, you use select to see whether the file is ready to accept the next write.

Golang io.Reader usage with net.Pipe

The problem I'm trying to solve is using io.Reader and io.Writer in a net application without using bufio and strings as per the examples I've been able to find online. For efficiency I'm trying to avoid the memory copies those imply.
I've created a test application using net.Pipe on the play area (https://play.golang.org/p/-7YDs1uEc5). There is a data source and sink which talk through a net.Pipe pair of connections (to model a network connection) and a loopback on the far end to reflect the data right back at us.
The program gets as far as the loopback agent reading the sent data, but as far as I can see the write back to the connection blocks; it certainly never completes. Additionally, the receiver in the Sink never receives any data whatsoever.
I can't figure out why the write cannot proceed as it's wholly symmetrical with the path that does work. I've written other test systems that use bi-directional network connections but as soon as I stop using bufio and ReadString I encounter this problem. I've looked at the code of those and can't see what I've missed.
Thanks in advance for any help.
The issue is on line 68:
data_received := make([]byte, 0, count)
This line creates a slice with length 0 and capacity count. The call to Read does not read data because the length is 0. The call to Write blocks because the data is never read.
Fix the issue by changing the line to:
data_received := make([]byte, count)
playground example
Note that "Finished Writing" may not be printed because the program can exit before dataSrc finishes executing.

Is Go's bufio goroutine-safe?

Could multiple goroutines invoke bufio's Read function at the same time? I read the source code of bufio, and it looks like it has no mechanism to ensure the buffer is only read by one goroutine at a time.
No, reading from a buffer is not a thread-safe operation. You have to manage coordination yourself. The thing is, a read from the buffer modifies its state, so there's not really any reasonable way to do it concurrently: there's a position marker that has to be advanced at the end of each read, so you can't begin a second read until the first completes.
