What do I need the Socket::MSG_* constants for in Ruby?

I want to develop a P2P app that communicates via UDPSocket. I'm just starting to read the docs and couldn't understand this part of Ruby's socket API.
Specifically, it's possible to pass these "flags", as ruby-doc calls them, to every send call (http://www.ruby-doc.org/stdlib-1.9.3/libdoc/socket/rdoc/UDPSocket.html#method-i-send).
But when do I use those and how?

You'll probably know if you need to use them as you'll have an example or some documentation that refers to them.
Some of the more common options used with recvfrom are: MSG_OOB to process out-of-band data, MSG_PEEK to peek at the incoming message without de-queueing it, and MSG_WAITALL to block until the full requested length has been received.
These are edge cases, so you will rarely see one used.
The flags come from the low-level send and recv calls on which Socket is based.
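As a minimal sketch of what one of these flags does, the following peeks at a UDP datagram with MSG_PEEK and then reads it normally. The loopback address, the ephemeral port, and the variable names are assumptions made for illustration only.

```ruby
require "socket"

# Bind a receiver to an ephemeral port on loopback (port 0 lets the OS pick one).
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0)
port = receiver.addr[1]

sender = UDPSocket.new
# The second argument to send is the flags value; 0 means "no flags".
sender.send("hello", 0, "127.0.0.1", port)

# MSG_PEEK returns the datagram but leaves it in the receive queue.
peeked, _peek_addr = receiver.recvfrom(1024, Socket::MSG_PEEK)

# A normal receive then returns the same datagram and removes it from the queue.
received, _recv_addr = receiver.recvfrom(1024)

sender.close
receiver.close
```

For ordinary sends and receives, passing 0 as the flags argument is enough.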

Related

Is there any way to use IOCP to notify when a socket is readable / writeable?

I'm looking for some way to get a signal on an I/O completion port when a socket becomes readable/writeable (i.e. the next send/recv will complete immediately). Basically I want an overlapped version of WSASelect.
(Yes, I know that for many applications, this is unnecessary, and you can just keep issuing overlapped send calls. But in other applications you want to delay generating the message to send until the last moment possible, as discussed e.g. here. In these cases it's useful to do (a) wait for socket to be writeable, (b) generate the next message, (c) send the next message.)
So far the best solution I've been able to come up with is to spawn a thread just to call select and then PostQueuedCompletionStatus, which is awful and not particularly scalable... is there any better way?
It turns out that this is possible!
Basically the trick is:
1. Use the WSAIoctl SIO_BASE_HANDLE to peek through any "layered service providers"
2. Use DeviceIoControl to submit an AFD_POLL request for the base handle, to the AFD driver (this is what select does internally)
There are many, many complications that are probably worth understanding, but at the end of the day the above should just work in practice. This is supposed to be a private API, but libuv uses it, and MS's compatibility policies mean that they will never break libuv, so you're fine. For details, read the thread starting from this message: https://github.com/python-trio/trio/issues/52#issuecomment-424591743
For detecting that a socket is readable, it turns out that there is an undocumented but well-known piece of folklore: you can issue a "zero byte read", i.e., an overlapped WSARecv with a zero-byte receive buffer, and that will not complete until there is some data to be read. This has been recommended for servers that are trying to do simultaneous reads from a large number of mostly-idle sockets, in order to avoid problems with memory usage (apparently IOCP receive buffers get pinned into RAM). An example of this technique can be seen in the libuv source code. They also have an additional refinement, which is that to use this with UDP sockets, they issue a zero-byte receive with MSG_PEEK set. (This is important because without that flag, the zero-byte receive would consume a packet, truncating it to zero bytes.) MSDN claims that you can't combine MSG_PEEK with overlapped I/O, but apparently it works for them...
Of course, that's only half of an answer, because there's still the question of detecting writability.
It's possible that a similar "zero-byte send" trick would work? (Used directly for TCP, and adding the MSG_PARTIAL flag on UDP sockets, to avoid actually sending a zero-byte packet.) Experimentally I've checked that attempting to do a zero-byte send on a non-writable non-blocking TCP socket returns WSAEWOULDBLOCK, so that's a promising sign, but I haven't tried with overlapped I/O. I'll get around to it eventually and update this answer; or alternatively if someone wants to try it first and post their own consolidated answer then I'll probably accept it :-)

Is there a preferred way to design signal or event APIs in Go?

I am designing a package where I want to provide an API based on the observer pattern: that is, there are points where I'd like to emit a signal that will trigger zero or more interested parties. Those interested parties shouldn't necessarily need to know about each other.
I know I can implement an API like this from scratch (e.g. using a collection of channels or callback functions), but was wondering if there was a preferred way of structuring such APIs.
In many of the languages and frameworks I've played with, there have been standard ways to build these APIs so that they behave the way users expect: e.g. the g_signal_* functions for GLib-based applications, events and addEventListener() for JavaScript DOM apps, or multicast delegates for .NET.
Is there anything similar for Go? If not, is there some other way of structuring this type of API that is more idiomatic in Go?
I would say that a goroutine receiving from a channel is, to a certain extent, an analogue of an observer. So IMHO an idiomatic way to expose events in Go is to return channels from a package (function). Another observation is that callbacks are not used very often in Go programs. One reason for that is the powerful select statement.
As a final note: some people (me too) consider GoF patterns as Go antipatterns.
Go gives you a lot of tools for designing a signal API.
First you have to decide a few things:
Do you want a push or a pull model? That is, does the publisher push events to the subscribers, or do the subscribers pull events from the publisher?
If you want a push system then having the subscribers give the publisher a channel to send messages on would work really well. If you want a pull method then just a message box guarded with a mutex would work. Other than that without knowing more about your requirements it's hard to give much more detail.
I needed an "observer pattern" type thing in a couple of projects. Here's a reusable example from a recent project.
It's got a corresponding test that shows how to use it.
The basic theory is that an event emitter calls Submit with some piece of data whenever something interesting occurs. Any client that wants to be aware of that event will Register a channel from which it reads the event data. The registered channel can be used in a select loop, or you can read it directly (or buffer and poll it).
When you're done, you Unregister.
It's not perfect for all cases (e.g. I may want a force-unregister type of event for slow observers), but it works where I use it.
I would say there is no standard way of doing this because channels are built into the language. There is no channel library with standard ways of doing things with channels, there are simply channels. Having channels as built in first class objects frees you from having to work with standard techniques and lets you solve problems in the simplest most natural way.
There is a basic Golang version of Node EventEmitter at https://github.com/chuckpreslar/emission
See http://itjumpstart.wordpress.com/2014/11/21/eventemitter-in-go/

Monitoring files asynchronously

On Unix: I’ve been through FAM and Gamin, and both seem to provide a client/server file monitoring system. I would rather have a system where I tell the kernel to monitor some inodes and it pokes me back when events occur. Inotify looked promising at first on that side: inotify_init1 let me pass IN_NONBLOCK which in turn caused poll() to return directly. However I understood that I would have to call it regularly if I wanted to have news about the monitored files. Now I’m a bit short of ideas.
Is there something to monitor files asynchronously?
PS: I haven’t looked on Windows yet, but I would love to have some answers about it too.
As Celada says in the comments above, inotify and poll are the right way to do this.
Signals are not a mechanism for reasonable asynchronous programming -- and signal handlers are remarkably dangerous for the inexperienced and even for the experienced. One does not use them for such purposes voluntarily.
Instead, one should structure one's program around an event loop (see http://en.wikipedia.org/wiki/Event-driven_programming for an overall explanation) using poll, select, or some similar system call as the core of your program's event handling mechanism.
Alternatively, you can use threads, or threads plus an event loop.
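A rough sketch of such an event loop, written in Ruby with IO.select. The pipe here only stands in for whatever descriptor is being watched (a socket, or an inotify descriptor obtained through some binding); the variable names and the one-second timeout are assumptions for the example.

```ruby
# A pipe stands in for any readable descriptor.
reader, writer = IO.pipe
writer.write("event-1\n")
writer.close

events = []
watched = [reader]

# Minimal event loop: wait in select until something is readable, then dispatch.
until watched.empty?
  ready, = IO.select(watched, nil, nil, 1.0)
  break if ready.nil? # timed out with nothing to do

  ready.each do |io|
    chunk = io.read_nonblock(4096, exception: false)
    case chunk
    when :wait_readable
      next               # nothing to read after all; go back to select
    when nil
      io.close           # end of file: stop watching this descriptor
      watched.delete(io)
    else
      events << chunk    # hand the data to whatever handles events
    end
  end
end
```

The loop only ever blocks in one place, the select call, which is what lets a single thread serve many descriptors.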
However interesting your answers are, I am sorry but I can’t accept a mechanism based on blocking calls to poll or select, however deeply hidden, when the question says “asynchronously”.
On the other hand, I found out that one can run inotify asynchronously by passing the IN_NONBLOCK flag to inotify_init1. Signals are not triggered as they would be with aio; instead, a read call that would otherwise block fails with errno set to EWOULDBLOCK.

Is it a good idea to implement a TCP/IP socket client-server with signals?

To clarify, I am wondering what are the cons and pros of writing a "multiple simultaneous clients to a single server" using TCP/IP sockets and signal handlers that are called in response to "can read / can write" signal conditions on client socket file descriptors? As far as I understand at least the Linux kernel uses signals to notify a process of conditions related to socket descriptors? Obviously one has to be careful in a signal handler, which, again as I understand, interrupts the process - reentrancy, atomicity, undefined state for variables, etc.
But one does not have to have the signal handlers do most of the work. Quite the opposite: add the socket to a set of sockets ready for reading or writing, much like select, poll and epoll_wait do, and let the normal code flow of the process work with these sets. In effect, one emulates much the same pattern as with the functions mentioned. But purely in principle, is it doable, and is it worth it?
There are already a couple of such methods. One is the SIGIO signal; check man 7 socket and look for the section named "Signals" for more information.
The other method is standardized by POSIX and called async I/O. The functions to use are all prefixed with aio_ (for example aio_read). See this link for an example on how to use this or check the manual page.

Streaming files from EventMachine handler?

I am creating a streaming eventmachine server. I'm concerned about avoiding blocking IO or doing anything else to muck up the event loop.
From what I've read, ruby's non-blocking IO can be used to stream files in a non-blocking way, or I can call next_tick, but I'm a little unclear about which of these approaches is preferable.
Part of the problem is that I have not found a good explanation of non-blocking IO library functions in ruby.
Short version:
Assuming a long-lived network IO operation, several wall clock minutes of streaming per file transfer, what is the best way to do this in eventmachine without gumming up the event loop?
# Reads the whole file in one tight loop, so the event loop stalls until EOF.
while (bytes = file.read(16 * 1024))
  # conn.send_data bytes
end
I understand that the above code will block and I'm wondering what to put in its place. Also, I cannot use the FileStreamer class that is part of eventmachine as is, because I need to manipulate the data after it's read but before it's sent.
I think you can still use FileStreamer. FileStreamer expects its first argument to be a Connection, but this is a loose contract. As long as you implement the methods that FileStreamer expects, it should work. Take a look at this
https://gist.github.com/f4d997c3eeb6bdc5a9f3
The methods you'll need to handle are send_data and send_file_data. You can perform your manipulations here. Then pass the result along to EM::Connection.
Also, from my reading of the code, the special property of FileStreamer is that it allocates a memory mapped file (unless the file is small). You could do essentially the same thing by opening a regular Ruby File, reading blocks out of it, doing your manipulation, and emulating the behavior of FileStreamer.stream_one_chunk. Which is basically:
Each iteration must either send some data to the Connection, or reschedule itself using next_tick
Data can be repeatedly written to the Connection until the outbound buffer is full (according to get_outbound_data_size)
Once the file has been fully read, it should be closed (of course)
In fact, it seems to me that you had better not use FileStreamer unless your file will comfortably fit in memory.
You can look at the EM::Protocols for ideas about how to transform the data as it is streaming through.
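The chunk-by-chunk approach described above could be sketched roughly as follows. ChunkPump, CHUNK_SIZE, HIGH_WATER and the defer callable are hypothetical names invented for this example, not part of EventMachine; the only things assumed about the connection are send_data and get_outbound_data_size, which are the methods mentioned above.

```ruby
# Hypothetical helper (not part of EventMachine): streams a file in chunks,
# transforming each chunk, and yields to the event loop when the buffer is full.
class ChunkPump
  CHUNK_SIZE = 16 * 1024 # bytes read per step (assumed value)
  HIGH_WATER = 64 * 1024 # stop writing once this many bytes are queued (assumed)

  # conn      - responds to send_data and get_outbound_data_size
  # file      - an open IO-like object responding to read(n) and close
  # defer     - callable that arranges for a given callable to run later
  # transform - block applied to every chunk before it is sent
  def initialize(conn, file, defer, &transform)
    @conn = conn
    @file = file
    @defer = defer
    @transform = transform
  end

  def pump
    # Keep writing until the outbound buffer is considered full.
    while @conn.get_outbound_data_size < HIGH_WATER
      chunk = @file.read(CHUNK_SIZE)
      if chunk.nil?
        @file.close # fully read: close the file and stop
        return
      end
      @conn.send_data(@transform.call(chunk))
    end
    # Buffer is full: give control back and continue on a later tick.
    @defer.call(method(:pump))
  end
end
```

Inside an EventMachine handler, conn would presumably be the connection itself and defer something like ->(work) { EM.next_tick { work.call } }, so that each call to pump does a bounded amount of work before returning to the reactor.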

Resources