I have created an anonymous pipe (using the pipe system call on Linux and _pipe() on Windows). I wanted to know:
1. Are reads and writes on this pipe blocking calls (i.e., if the pipe is full, will the write block)?
2. Is there any chance of data being overwritten in an anonymous pipe? If so, what is a better alternative?
Thanks,
Manoj
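Yes, by default a write blocks when the pipe is full, and a read blocks when it is empty. The pipe buffer has a fixed capacity (64 KiB by default on Linux), so whether this happens depends on how quickly the reader drains the pipe rather than on how much memory the machine has. Data already in the pipe is never overwritten.
If a writer stays blocked forever, it's a serious bug, usually a reader that has stopped reading.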
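To see both points for yourself, here is a minimal Python sketch, assuming a POSIX system. The helper name fill_pipe is made up for illustration. It puts the write end in non-blocking mode so that a full pipe raises an error instead of hanging, then checks that the first bytes written are still the first bytes read.

```python
import os

def fill_pipe(marker=b"first"):
    """Fill a pipe that nobody reads; return (bytes_accepted, first_bytes_read_back)."""
    r, w = os.pipe()
    try:
        os.write(w, marker)        # this goes into the pipe first
        os.set_blocking(w, False)  # a full pipe now raises instead of sleeping
        accepted = len(marker)
        chunk = b"x" * 4096
        try:
            while True:
                accepted += os.write(w, chunk)
        except BlockingIOError:
            pass  # pipe is full; a blocking write would have slept here
        # The oldest data is still intact: nothing was overwritten.
        first = os.read(r, len(marker))
        return accepted, first
    finally:
        os.close(r)
        os.close(w)

accepted, first = fill_pipe()
print(accepted, first)
```

The number printed is the pipe capacity on your system; it varies by platform, so don't rely on a particular value.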
I am currently doing the Ruby on the Web project for The Odin Project. The goal is to implement a very basic webserver that parses and responds to GET or POST requests.
My solution uses IO#gets and IO#read(maxlen) together with the Content-Length Header attribute to do the parsing.
Other solutions use IO#read_nonblock. I googled it, but was quite confused by its documentation. It's often mentioned together with Kernel#select, which didn't really help either.
Can someone explain to me what the nonblock calls do differently than the normal ones, how they avoid blocking the thread of execution, and how they play together with the Kernel#select method?
explain to me what the nonblock calls do differently than the normal ones
The crucial difference in behavior is when there is no data available to read at call time, but not at EOF:
read_nonblock() raises an exception that is a kind of IO::WaitReadable
normal read(length) waits until length bytes are read (or EOF)
how they avoid blocking the thread of execution
According to the documentation, #read_nonblock is using the read(2) system call after O_NONBLOCK is set for the underlying file descriptor.
how they play together with the Kernel#select method?
There's also IO.select. We can use it in this case to wait for availability of input data, so that a subsequent read_nonblock() won't cause an error. This is especially useful if there are multiple input streams, where it is not known from which stream data will arrive next and for which read() would have to be called.
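The same pattern can be sketched in Python as an analogy, not as the Ruby API itself: os.set_blocking plus os.read stands in for read_nonblock, BlockingIOError stands in for IO::WaitReadable, and select.select stands in for IO.select. The helper names try_read and wait_and_read are made up for this example, and it assumes a POSIX system.

```python
import os
import select

def try_read(fd, maxlen=4096):
    """Non-blocking read: return bytes, or None if nothing is available yet."""
    try:
        return os.read(fd, maxlen)
    except BlockingIOError:
        return None  # no data right now, and not at EOF either

def wait_and_read(fd, timeout, maxlen=4096):
    """Wait up to `timeout` seconds for data, then read without blocking."""
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None  # timed out, still nothing to read
    return os.read(fd, maxlen)

r, w = os.pipe()
os.set_blocking(r, False)

before = try_read(r)                    # nothing written yet
os.write(w, b"GET / HTTP/1.0\r\n")
after = wait_and_read(r, timeout=1.0)   # select reports the pipe as readable
idle = wait_and_read(r, timeout=0.05)   # pipe drained again, so this times out

os.close(r)
os.close(w)
```

With several input streams you would pass all of their descriptors to select.select and read only from the ones it returns.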
In a blocking write you wait until the bytes have been written to the file; a nonblocking write, on the other hand, returns immediately. This means you can continue executing your program while the operating system writes the data out. Then, when you want to write again, you use select to check whether the file is ready to accept the next write.
I have a situation where I need to concurrently read/write from/to the file, but the scope of operations is limited:
append only, no random offset writes
reads from random positions, where I know for sure the content has already been written (via append; access is serialized internally with a Go channel so that a random read happens only after the content has been appended)
there is only one process running
This is a heavily loaded application, and I would like to avoid locking the file for each read/write I do.
I was going to open 2 files: one for reading, another for append only.
Would doing so create any potential issues/bugs?
What is the recommended practice if I would like to avoid file locking for each read/write I do?
p.s. golang, linux, ext4
I'll assume by "random read" you actually mean "arbitrary read".
If I understand your use case correctly, you don't need to seek or lock or do anything manual. UNIX has this covered via O_APPEND. Here is what you can do:
Open the file with os.O_APPEND. This way every write, regardless of any preceding operations, will go to the end of the file
When reading use File.ReadAt. This lets you specify arbitrary offsets for your reads
Using this scheme you can avoid any sort of locking in your own code: the OS will do it for you. Because of the buffer cache, this scheme is not even inefficient: appends and reads are pretty much independent.
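Here is a rough sketch of that scheme in Python rather than Go, assuming Linux. os.pread plays the role of File.ReadAt, and the AppendLog class and its method names are invented for illustration. The size counter is not thread-safe on its own; it relies on appends being serialized, as the question says they are.

```python
import os
import tempfile

class AppendLog:
    """Append-only file with positional reads (hypothetical helper)."""

    def __init__(self, path):
        # O_APPEND: every write lands at the current end of the file.
        self.fd = os.open(path, os.O_RDWR | os.O_APPEND | os.O_CREAT, 0o644)
        self.size = os.fstat(self.fd).st_size

    def append(self, payload):
        """Append payload and return the offset where it starts."""
        offset = self.size
        view = memoryview(payload)
        while len(view) > 0:  # handle short writes
            written = os.write(self.fd, view)
            view = view[written:]
        self.size += len(payload)
        return offset

    def read_at(self, offset, length):
        # pread takes an explicit offset and does not move the file position.
        return os.pread(self.fd, length, offset)

    def close(self):
        os.close(self.fd)

path = os.path.join(tempfile.mkdtemp(), "log.bin")
log = AppendLog(path)
off_a = log.append(b"alpha")
off_b = log.append(b"bravo!")
second = log.read_at(off_b, 6)  # read the later record first
first = log.read_at(off_a, 5)
log.close()
```

Because reads carry their own offset, they never disturb where the next append goes, so no seek and no application-level lock is needed.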
I am writing a Win32 app that uses a named pipe for inter-process communication. When one process writes, it first calls WriteFile to write a structure (telling the other process how many bytes follow, plus other info), then calls WriteFile again to write the actual data.
The other process reads the first message, then reads the second message based on the information it got from the first.
My questions are:
If the server process has written the data but the client process hasn't read it yet, is it possible to lose the first message? For example, when the server calls WriteFile a second time to write the actual data, will the previous message be overwritten?
Is using WaitForSingleObject a good way to synchronize, or is there a better solution?
Thanks
A pipe is a little like a real pipe -- when you write more to the pipe, it doesn't overwrite what was already in the pipe. It just adds more data to the pipe that will be delivered after the data that you previously wrote to the pipe.
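To illustrate the header-then-data pattern from the question, here is a small Python sketch. It uses a POSIX anonymous pipe rather than a Win32 named pipe, so treat it as an analogy. The 4-byte little-endian length header and the function names are assumptions made for the example, not anything from the original code.

```python
import os
import struct

HEADER = struct.Struct("<I")  # assumed header: 4-byte little-endian payload length

def write_all(fd, data):
    """Write every byte of data, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]

def read_exact(fd, n):
    """Read exactly n bytes, or raise if the pipe closes first."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def send_message(fd, payload):
    write_all(fd, HEADER.pack(len(payload)))  # first write: the header
    write_all(fd, payload)                    # second write: the data

def recv_message(fd):
    (length,) = HEADER.unpack(read_exact(fd, HEADER.size))
    return read_exact(fd, length)

r, w = os.pipe()
send_message(w, b"hello")           # nothing has been read yet...
send_message(w, b"\x00\x01binary")  # ...and this does not overwrite it
received = [recv_message(r), recv_message(r)]
os.close(r)
os.close(w)
```

Both messages come out in the order they went in, even though the reader did nothing until after the second send.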
I rarely find WaitForSingleObject useful for a pipe. If you want to block the current thread until it receives data from the pipe, you can just do a synchronous read, and it'll block until there's data. If you want to block until there's input from any of a number of sources, you usually want WaitForMultipleObjects or MsgWaitForMultipleObjects, so your thread will run when any of the sources has input to process.
The only times I can recall using WaitForSingleObject on a pipe were with a zero timeout, so the receiver would continue other processing if there was no pipe input, and every once in a while check if the pipe has some data to process. While it initially seems like PeekNamedPipe would work for this, it's really most useful for other purposes -- though it might work for you, to read the header data and figure out what other code to invoke to read and process the entire message.
Having said all that, I feel obliged to point out that I haven't written any new code using named pipes in quite a while. I can think of very few situations in which I'd even consider them today -- I'd almost always use sockets instead.
I need to send data from child processes to the parent. Some of this data is HTML, plain text, etc., but it may also be necessary to send image data, zip file data, etc.
As I understand it, anonymous pipes use the child process standard input and standard output. Conventionally stdin and stdout only convey textual data: would there be any problem with sending non printable characters using this mechanism?
There is no relation between anonymous pipes and stdin/stdout. As one process has only one stdin/stdout, you could create only one anonymous pipe per process that way, which sounds stupid, doesn't it? You can redirect the stdin/stdout of a child process to the pipe, yes. But you don't have to, if the child process is able to report by other means (like a logfile or network activity). A call to CreatePipe gives you reading and writing handles, and it's up to you how you use them. Sending arbitrary binary data is indeed possible. An anonymous pipe is in no way different from a named pipe in that respect.
Even if you do choose to use stdin/stdout redirection to pass the pipe handle(s) to the child process, you shouldn't have any problems provided the child process uses the Windows API to send the data rather than the C runtime library functions.
That is, WriteFile will work perfectly, but printf would not be a good idea: the C runtime's text-mode streams translate newline characters, which would corrupt binary data.
You can use GetStdHandle to get the handle(s) to the pipe(s) for use with the Windows API functions.
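The same idea can be shown with a short Python sketch instead of the Windows API: the child writes raw bytes to its redirected stdout through the binary buffer (the rough equivalent of calling WriteFile instead of printf), and the parent reads them from the pipe unchanged. The CHILD script and the function name are made up for the example.

```python
import subprocess
import sys

# Hypothetical child: emits all 256 byte values, including non-printable ones,
# through the binary layer of stdout rather than a text-mode print.
CHILD = "import sys; sys.stdout.buffer.write(bytes(range(256)))"

def read_binary_from_child():
    """Run the child with stdout redirected to a pipe and return the raw bytes."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout

data = read_binary_from_child()
```

Every byte value survives the trip, so images or zip data can go over the same channel.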
The systemu page says:
systemu can be used on any platform to return status, stdout, and stderr of any command. unlike other methods like open3/popen4 there is zero danger of full pipes or threading issues hanging your process or subprocess.
(https://github.com/ahoward/systemu)
Could anyone explain this a little bit?
Methods like popen and its various spinoffs are convenient and are part of the expected API for a full I/O library.
However, they must be used either casually or carefully because they are prone to deadlock. By casually, I mean, if you both write and read from the command, it's still OK as long as you either don't write a lot or don't read a lot. By carefully, I mean, you can move large amounts of data, but only if you keep the inner details of the operation in mind and deliberately engineer against deadlock.
Imagine writing lots of stuff to your popened command and then reading a result. If you write more than a pipe will buffer, then your process will sleep. That's OK in practice, most of the time, but what if the command has to write a lot of stuff? Now it may sleep and not finish reading input that you are sending. You won't finish sending input so you will never wake up and read results.
Deadlock!
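One way to engineer against this, sketched in Python rather than Ruby, is to let the library feed the input and drain the output at the same time. Popen.communicate does that. The ECHO child below is a made-up stand-in for any command that reads a lot and writes a lot.

```python
import subprocess
import sys

# Hypothetical child: copies everything from stdin to stdout.
ECHO = "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"

def round_trip(payload):
    """Send payload through the child and return what it writes back."""
    proc = subprocess.Popen(
        [sys.executable, "-c", ECHO],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Deadlock-prone alternative: proc.stdin.write(payload) and only then
    # proc.stdout.read(). Once both pipes fill up, each side sleeps waiting
    # for the other. communicate() writes and reads concurrently instead.
    out, _ = proc.communicate(input=payload)
    return out

payload = b"x" * (1 << 20)  # 1 MiB, well beyond a typical pipe buffer
echoed = round_trip(payload)
```

With a payload this size the naive write-then-read version would likely hang, which is the failure mode described above.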