Replace ZeroMQ's select() on Windows - zeromq

It is unbelievable that ZeroMQ uses select() on Windows. I didn't know that until I had completed my code and started performance testing. They should present this information on their web site in a big red font.
Is there any way to replace ZeroMQ's select()?
IOCP is a proactor model and can't easily be integrated into it. How about WSAEventSelect? It is also a reactor model and has performance close to poll().
Another choice for me is http://nanomsg.org/, but it is still in alpha.

One of the main objectives of ZeroMQ is to provide a consistent API for communication between threads, processes, nodes, and clusters. Protocol-specific optimization is outside of this scope because of the ways that it can affect other areas of communication. For example, shared memory would be a better form of IPC, but UNIX domain sockets make a consistent API easier. It would also be nice to know when an endpoint disconnects, but how would you implement such behavior between threads?
Their main goal is to allow every pattern to work the same way regardless of topology, protocol, system, or language, to the point that any mixture can be used regardless of how odd it may seem (Node.js WebSockets communicating with C# brokers passing messages to Ruby and PHP workers which share work with Java threads, etc.).
Each of its features would be enhanced greatly if optimised for each specific protocol and system, but that would also make uniform patterns close to impossible.
BTW, they might accept a patch if you could find a way to implement IOCP while still maintaining this versatility and neutrality.
PS, nanomsg is made by one of the main original developers of ZeroMQ. Crossroads.IO is a direct fork of ZeroMQ, also by original ZeroMQ developers, including some of the developers of nanomsg. If I'm not mistaken, Nano will likely become the core of Crossroads when complete.
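The reactor model the question refers to (wait until a socket is ready, then do the read yourself) can be sketched with Python's standard selectors module. This is only an illustration of the model, not ZeroMQ's poller code; the function name and the socket pair are assumptions made for the example. A proactor such as IOCP works the other way round: you start the read first and are notified when it has completed.

```python
import selectors
import socket


def reactor_roundtrip(messages):
    """Push messages through a socket pair and read them back from a readiness loop."""
    # DefaultSelector picks epoll/kqueue where available and falls back to select().
    sel = selectors.DefaultSelector()
    left, right = socket.socketpair()
    right.setblocking(False)
    received = []

    def on_readable(sock):
        # Reactor style: we are told the socket is ready, then we do the read ourselves.
        received.append(sock.recv(4096))

    sel.register(right, selectors.EVENT_READ, on_readable)
    for msg in messages:
        left.sendall(msg)
        # Block until some registered socket is readable, then dispatch its callback.
        for key, _events in sel.select(timeout=1.0):
            key.data(key.fileobj)

    sel.unregister(right)
    sel.close()
    left.close()
    right.close()
    return received


if __name__ == "__main__":
    print(b"".join(reactor_roundtrip([b"ping", b"pong"])))
```

On Windows the selectors module itself falls back to select(), so this sketch shows the programming model rather than a faster alternative.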

Related

Simplest C++ library that supports distributed messaging - Observer Pattern

I need to do something relatively simple, and I don't really want to install a MOM like RabbitMQ etc.
There are several programs that "register" with a central
"service" server through TCP. The only function of the server is to
call back all the registered clients when they all in turn say
"DONE". So it is a kind of "join" (edit: Barrier) for distributed client processes.
When all clients say "DONE" (they can be done at totally different times), the central server messages
them all saying "ALL-COMPLETE". The clients "block" until asynchronously called back.
So this is a kind of distributed asynchronous Observer Pattern. The server has to keep track of where the clients are somehow. It is OK for the client to pass its IP address to the server etc. It is constructable with things like Boost.Signals, Boost.Asio, Boost.Dataflow etc., but I don't want to reinvent the wheel if something simple already exists. I got very close with ZeroMQ, but none of their patterns support this use case very well, AFAIK.
Is there a very simple system that does this? Notice that the server can be written in any language. I just need C++ bindings for the clients.
After much searching, I used this library
https://github.com/actor-framework
It turns out that doing this with this framework is relatively straightforward. The only real "impediment" to using it is that the library seems to have gone through an API transition recently and the documentation .pdf file has not completely caught up with the source. No biggie, since the example programs and the source (.hpp) files get you over this hump. However, they need to bring the docs in sync with the source. In addition, IMO they need to provide more interesting examples of how to use C++ actors for extreme performance. For my case it is not needed, but the idea of actors (shared nothing) is one of the reasons people use them instead of shared-memory communication between threads.
Also, if you are not used to state-of-the-art C++11 programs, the syntax that the library enforces (get used to lambdas!) can be a bit of a mind-twister at first. The only other caveat was the trivial matter of remembering all the clients that registered with the server.
STRONGLY RECOMMENDED.
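For comparison, the barrier described in the question can also be sketched with nothing but TCP sockets. The following is a minimal Python sketch, not the actor-framework solution used above. It assumes the number of clients is known in advance and treats a TCP connection as the "registration"; the function names and the line-based DONE / ALL-COMPLETE messages are illustrative choices.

```python
import socket
import threading


def recv_line(sock):
    """Read one newline-terminated line; TCP is a byte stream, so loop until a newline arrives."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:  # peer closed the connection
            break
        data += chunk
    return data.decode().strip()


def run_barrier_server(server_sock, expected_clients):
    """Accept every client, wait for DONE from each, then release them all."""
    conns = []
    for _ in range(expected_clients):
        conn, _addr = server_sock.accept()  # connecting counts as registering
        conns.append(conn)
    for conn in conns:
        # The order of waiting does not matter: nobody is released early.
        if recv_line(conn) != "DONE":
            raise RuntimeError("unexpected message from client")
    for conn in conns:
        conn.sendall(b"ALL-COMPLETE\n")
        conn.close()


def run_client(address, name, results):
    """Register, report DONE, then block until the server releases everyone."""
    with socket.create_connection(address) as sock:
        # ... the client's real work would happen here ...
        sock.sendall(b"DONE\n")
        results[name] = recv_line(sock)


def demo(n_clients=3):
    server_sock = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    address = server_sock.getsockname()
    results = {}
    threads = [threading.Thread(target=run_barrier_server,
                                args=(server_sock, n_clients))]
    threads += [threading.Thread(target=run_client, args=(address, i, results))
                for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    server_sock.close()
    return results


if __name__ == "__main__":
    print(demo())
```

A real deployment would also need timeouts and a policy for clients that disconnect before saying DONE, which this sketch leaves out.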

ZeroMQ vs Crossroads I/O

I am looking into using ZeroMQ as the messaging/transport layer for a fairly large distributed system, mainly targeting monitoring and data collection (many producers, a few consumers).
As far as I can see there are currently two different implementations of the same concept; ZeroMQ and Crossroads I/O, the latter being a fork of ZeroMQ (in 2012?).
I am trying to figure out which one to use and wonder about the differences between them, but have so far not found much information regarding this.
For example:
Are they compatible on the wire?
Are they API compatible, i.e. some kind of common base API, possibly with different add-ons?
Do they both implement support for ZMTP (ZeroMQ Message Transport Protocol)?
Do they share some kind of common understanding of future development or will they continue in two separate and possibly different directions?
What are the pros/cons in relation to the other?
Basically, how does one choose one over the other?
Crossroads.io is pretty dead since Martin Sustrik has started on a new stack, in C, called nano: https://github.com/250bpm/nanomsg
Crossroads.io does not, afaik, implement ZMTP/1.0 nor ZMTP/2.0 but its own version of the protocol.
Nano has pluggable transports and we'll probably make a ZMTP transport for that. Nano is really nice, a rethinking of the original libzmq library, and if it's successful would make a good new kernel.
Ideally, Nano would interoperate at both the API and the protocol level, and so be a pluggable replacement for libzmq. It does have quite a long way to go, though.
Note that there are now several rewrites of libzmq emerging, including JeroMQ (Java) and NetMQ (C#). These two do implement ZMTP/1.0 and ZMTP/2.0 properly. There are also other libraries like Axon (https://github.com/visionmedia/axon) which are heavily inspired by 0MQ but not compatible.
Based on experience, users value interoperability more than almost anything else, so it's quite likely that different 0MQ-like stacks will end up speaking the same protocols.

multi-client inter-process communication on Windows, VB6

What is the best way for multiple client programs to
communicate with a single server program, all running
on a single Windows computer? All written in VB6.
I'd appreciate recommendations of how you might solve
this problem.
NOTE: we are working on a transition to .NET, but have to
add a capability to the VB6 version before the .NET version
will be ready.
The possibilities include TCP connections, named pipes,
shared memory, messages, files, and more.
A client passes the server a string as input, and the server
combines it with data known only to the server, to generate
another string which is returned to the client. Both strings
are only about 100 characters long. The server is contacted
only when a new file needs to be opened, and so it is a very
low volume of communication... probably a flurry of 10 calls
within 15 seconds, followed by an hour of idle time.
But it is possible that two clients would choose about the
same time to request information. Blocking/Locking are certainly
acceptable, as the server will be done with each request in
well under a second, and several seconds of delay is unimportant
to any of the programs.
The server's algorithm is complex, and for several reasons important
to the application should not be replicated in each helper program.
That is the reason for needing a server.
Background:
I am adding capability to a large existing legacy program.
This single program has several other legacy programs which
act as helpers and are run when the user makes certain
choices. These programs are started with a shell command,
and are not just separate threads. For instance, one helper
loads new data from a DVD drive onto the hard drive. Another
helper just displays a chart of the current positions of
the planets.
This is a LARGE commercial legacy program that happens to be
written in VB6. We are working to convert it and all the
helper programs to .NET, but must first release a new version
under vb6 with this added capability. (Please don't tell me
to not use VB6, as we are already moving elsewhere.)
We need a temporary VB6 solution.
VB6 does TCP and UDP extremely well via the standard Winsock Control component included in the Pro and Enterprise Editions. A lot of shadetree coders do seem to struggle with it, though. This is probably the most obvious route, since the only other native IPC in VB6 would be COM/DCOM and DDE. MSMQ also provides excellent support for VB6.
The downside of IP-based protocols is their limited namespace and resulting high probability of collisions (64K port numbers, many set aside for standard applications, ephemeral port ranges, etc.). They're also somewhat "heavyweight" but considering the vast resources of even the oldest PCs still in service and your light traffic requirements you can ignore that in deciding.
Another option you've considered is Named Pipes.
This offers a number of advantages in your situation. For one thing the namespace is much larger requiring only a unique name, which in the post-Win9x era can be up to 256 characters long making uniqueness fairly easy to achieve. For another, as long as your firewalls permit "File and Print Sharing" you're all set on that front.
Also, for your application you only seem to require an RPC-style mechanism rather than arbitrary bidirectional streams or messages. TransactNamedPipe() calls in your clients might be ideal. Named Pipes work over a LAN, but within one PC they are quite fast and lightweight.
While VB6 doesn't come with a Named Pipe component such a thing is fairly easy to create as long as extremely high performance isn't required. You can use Timer-based polling in the server instead of trying to implement overlapped I/O to get asynchronicity. I put one together a couple of years ago and have had good luck with this approach.
I published a fairly stable rendition of this a while back at PipeRPC - RPC Over Named Pipes. There is an older and a somewhat newer version there with examples of use and documentation. As designed, clients make "calls" passing a Byte array of request parameters and receiving back a Byte array of response results. You can also shove Unicode Strings through with no changes, letting the compiler coerce the types.
Just one "drop in" UserControl for both clients and servers.
Looking back at this question:
The server's algorithm is complex, and for several reasons important
to the application should not be replicated in each helper program.
That is the reason for needing a server.
If that's really the concern why not just create a shared DLL that all programs use?
For a one-off upgrade release to an existing VB6 application being moved to a newer platform, I would stress keeping the modification as simple and straightforward as possible. As a result, I wouldn't go down any routes involving shared memory or anything relatively unusual.
A few options, none perfectly simple, but at least some ideas:
Expose a COM object in the server code that performs the translation, and can be consumed by the client apps. The clients instantiate the object from the server as an out-of-process object, and let COM handle all the marshalling, etc.
Does the server have any network awareness? VB6 doesn't do sockets/TCP natively very well, but if you've had a reason to add that in, you might be able to leverage it to perform a socket-based connection and data exchange.
The server and client could each poll a common resource folder for the presence of a specific file that constituted inbound/outbound requests for the translation service you describe. Not very elegant, but it might be the simplest.
Just a few ideas to give you some things to think about. Hope that's helpful in some way. Good luck!
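Whichever transport is chosen, the exchange described in the question has the same shape: one short string in, one short string out, with requests handled one at a time. The sketch below shows that shape over a localhost TCP socket. It is written in Python rather than VB6 purely to illustrate the protocol; the handler name, the newline-terminated framing, and SERVER_SECRET are hypothetical stand-ins, not part of any answer above.

```python
import socket
import socketserver
import threading

SERVER_SECRET = "server-only-data"  # hypothetical stand-in for data only the server knows


class TranslateHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: read a line, combine it, write a line back.
        request = self.rfile.readline().decode("utf-8").rstrip("\n")
        reply = f"{request}|{SERVER_SECRET}"
        self.wfile.write(reply.encode("utf-8") + b"\n")


def ask(address, text):
    """Client side: send one line and block until the one-line reply arrives."""
    with socket.create_connection(address) as sock:
        sock.sendall(text.encode("utf-8") + b"\n")
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().rstrip("\n")


def demo():
    # TCPServer (unlike ThreadingTCPServer) serves requests strictly one at a time,
    # so two clients calling at once simply queue up behind each other.
    with socketserver.TCPServer(("127.0.0.1", 0), TranslateHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return [ask(server.server_address, name)
                    for name in ("chart.dat", "dvd.dat")]
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

Serial handling is enough here because the question allows several seconds of delay and each request finishes in well under a second.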

How do CPG of Corosync, ZeroMQ, and Spread compare for messaging?

I'm interested in:
Performance
Latency
Throughput
Resource usage (CPU, memory, ...)
High availability
No single point of failure
Features
Transport options
Routing options
Stability
Community
Active development
Widely used
Helpful mailing list, forum, IRC channel, ...
Ease of integration with my current codebase
Gotchas maybe
Any other thing you think I omitted
I've read about them, but I couldn't find a good comparison. I'm especially interested in performance benchmarks comparing them. (Maybe I should do one on my own! I hope not.)
Well, I haven't used the other two, but I can share my experiences with ZeroMQ. In my opinion, it excels at all of your criteria.
Speed and throughput
It's about as fast as raw TCP and uses little CPU and memory. It can push A LOT of messages very quickly without breaking a sweat. It will saturate your network channel way before you run out of memory (I doubt you'll ever be able to max out the CPU). There was a comparison to RabbitMQ somewhere in which ZMQ outperformed it by a factor of 2. From things I've read around the web, it's in use in high-speed trading.
RabbitMQ is also a very good tool. Have a look at it; it might be a good fit for what you are looking for.
SPOF
If you design your application properly, then you can have no single point of failure. It's very easy to connect two sockets to another one, so if one of them fails, the other is there to handle the work. There are things like high-water marks to help you along the way. Read the ZeroMQ Guide to learn how to design your app without a SPOF.
Transports and routing
Regarding transport options (if I'm understanding this correctly) - it's up to you to define your protocol. ZeroMQ basically promises you that it will deliver this blob of data to the other end. Use JSON, Protocol buffers, Morse code, whatever you like.
There is no built-in routing like there is in AMQP. Again, it's up to you to specify which ZeroMQ socket connects to which, but this is very easy.
Stability
I've been developing with it for a few months (using Python) and haven't found a single issue with its stability. Even when I try to use it the wrong way it just throws a nice error telling me not to do that. Even restarting/killing some of the services and bringing them back up doesn't cause any problems. I'd say it's a very stable piece of software.
As a note: always use the latest version - the 2.1 version is very much stability oriented, so many stability issues are resolved in it.
Community
Bindings for more than 20 languages, active mailing list, very good documentation, frequent releases. Anything else?
Integration
Because it's designed as a library, it's up to you to design your application (unlike the case with a framework), and it pretty much stays out of your way. It feels a bit like a normal TCP socket, but much more powerful and easier to use (it guarantees that a message will be delivered as a whole, not the first 128 bytes now and the rest later, as is the case with regular sockets).
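To see what the whole-message guarantee saves you, here is a minimal sketch of what an application has to do itself on a plain TCP socket, where recv() may return any fragment of what was sent. It uses length-prefix framing in Python; the helper names and the 4-byte header are assumptions for illustration and are not ZeroMQ's wire format.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (illustrative choice)


def send_msg(sock, payload):
    """Send the payload's length followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf += chunk
    return bytes(buf)


def recv_msg(sock):
    """Read one complete message: the length header first, then that many bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


if __name__ == "__main__":
    a, b = socket.socketpair()
    send_msg(a, b"hello")
    send_msg(a, b"a second, longer message")
    print(recv_msg(b))
    print(recv_msg(b))
    a.close()
    b.close()
```

With a ZeroMQ socket, the equivalent of send_msg and recv_msg is what the library already provides, which is the point the answer is making.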
Gotchas
There are some, but they are all documented in the guide. (For example, you might miss the first few messages from a PUB socket when you connect a SUB socket to it. There is an explanation of this in the guide and a recipe for handling it.)
Overall
I find this one of the best designed pieces of software - stable, well written, well documented and doesn't stand in my way.
I recommend reading the guide end-to-end. It's well written, has examples in a lot of languages (including C++), and describes a lot of edge cases and pain points.

How does Erlang's support for *transparent* distribution of actors impact application design?

One of the features of the actor model in Erlang is transparent distribution. Unless I'm misinterpreting, when you send messages between actors, you theoretically shouldn't assume that they are in the same process space or even co-located on the same physical machine.
I've always been under the impression that distributed, fault tolerant systems require careful application design to solve inherent problems around ordering/causality and consensus (among others).
I'm pretty sure that Erlang doesn't promise to solve these classes of problems transparently, so my question is, how do Erlang developers cope with this? Do you design your application as if all the actors are in the same process space and then only solve distribution problems when it comes time to actually distribute them?
If so, is this transparent distribution feature of Erlang really just concerned with the wire protocol used for remote messaging and not really transparent in the sense that a true distributed application still requires careful design in the application layer?
You are correct that Erlang does not inherently solve the problems of ordering/causality or consensus. What Erlang does abstract for you is the difference between sending messages to local or remote nodes.
I'm not sure it would really be possible to solve those problems in a language design. That more properly belongs in a framework. The OTP framework does have some tools to help with that. Really though it's somewhat dependent on the specific problem you are solving.
For one example of an Erlang VectorClock implementation look at distributerl
Erlang OTP Supervisors also might provide some of the necessary infrastructure for consensus but there is some thought that Consensus is an impossibility in asynchronous message passing distributed systems. See your referenced wiki page for additional information on that.
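The vector clocks mentioned above are one of the application-level tools for tracking causality between nodes. The following is a minimal sketch of the idea in Python rather than Erlang; it is not the API of distributerl or of OTP, and the function names are illustrative.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced (a local event or a send)."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(local, received, node):
    """On receive: take the element-wise maximum, then count the receive as an event."""
    merged = {n: max(local.get(n, 0), received.get(n, 0))
              for n in local.keys() | received.keys()}
    return increment(merged, node)


def happened_before(a, b):
    """True if a causally precedes b: a <= b in every entry and a < b in at least one."""
    nodes = a.keys() | b.keys()
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))


def concurrent(a, b):
    """True if the clocks differ and neither precedes the other (no causal order)."""
    nodes = a.keys() | b.keys()
    differ = any(a.get(n, 0) != b.get(n, 0) for n in nodes)
    return differ and not happened_before(a, b) and not happened_before(b, a)


if __name__ == "__main__":
    a1 = increment({}, "a")   # node a does something
    b1 = increment({}, "b")   # node b does something independently
    print(concurrent(a1, b1))
    b2 = merge(b1, a1, "b")   # node b receives a message stamped with a1
    print(b2, happened_before(a1, b2))
```

Each message carries the sender's clock, and the receiver merges it into its own, so transparent message sending still leaves this bookkeeping to the application or a library.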
Erlang does, in fact, solve these problems transparently. It can do this because it is a functional language with immutable (single-assignment) variables. It uses the Actor model for concurrency, and was specifically designed to allow hot-swapping of code and concurrent programming without the programmer having to worry about synchronization.
The Wikipedia article actually has a pretty good description of this. It is my understanding that Ericsson invented the language as a practical way to program massively parallel phone switches.
Erlang promises these things (http://www.sics.se/~joe/thesis/armstrong_thesis_2003.pdf section 3.1 (39-40)):
Everything is a process.
Processes are strongly isolated.
Process creation and destruction is a lightweight operation.
Message passing is the only way for processes to interact.
Processes have unique names.
If you know the name of a process you can send it a message.
Processes share no resources.
Error handling is non-local.
Processes do what they are supposed to do or fail.
The rest is up to you. If you want to know why, see chapter 2. In short, you can send a message to a process if you know its PID, even if it is on another piece of hardware. You can't be sure that the message arrived unless you receive a response containing a shared secret. You can be sure that you will receive a failure message when a process fails, provided you monitor (or link to) it. Those are the basic elements from which you can build whatever you want.

Resources