ZeroMQ - Handling slow receivers without dropping

I have an architecture where I have a ROUTER socket and multiple DEALER sockets.
Some DEALER sockets will only send data, some will only receive data and others can do a mixture of both.
I have a scenario where I have one DEALER socket that is sending data at an extremely fast rate. This data is received by another DEALER that processes it as fast as it can. The send rate is always going to be higher than the receive rate.
In my current setup the ZMQ_SNDHWM on my ROUTER socket gets hit for the receive client and will silently drop messages. I do not want this to be the case.
What is the best way to do so as to deal with this scenario?
I have looked at DEALER->DEALER on a different port, but this could be hard to maintain: depending on the number of sessions created, I could potentially end up needing one port per session.
The other way I can think of solving this is to do some pipelining, in which the receiving DEALER socket tells the sender when it is ready to receive, but this seems to add a lot of complication to the overall protocol and a lot more state management. It also seems to defeat the ability to block naturally on DEALER sockets, which is really what I need in this case; the DEALER sockets will never have to communicate with any other socket.

Do not rely on blocking, still less on uncontrolled use of resources
In distributed systems there is not much room for optimistic beliefs and expectations. In many posts I advocate, where possible, not relying on blocking states, as your code gets out of control and you can do nothing about leaving such a state but pray for a message to come any time soon, if ever.
Rather bear the responsibility end-to-end, which in a distributed system means that you also need to design strategies for surviving a "remote" death and similar situations that are outside the range of your domain of control, but which your code design has no right to abstract away.
Even if not willing to, explicit flow-management is the way to go
The late '90s demonstrated many flow-control strategies for distributed systems, so this is definitely not a new field.
Increasing the size of the can of worms does not help to manage an uncontrolled / unmanaged flow of events. While ZMQ_???HWM + ZMQ_???BUF may somewhat help in non-brutal cases, where having a bit more space can temporarily postpone the principal weakness of uncontrolled message flows, the problem is like standing still, eyes closed, right in the middle of cross-fire. Such an agent may survive, but its survival is not a result of its design cleverness, just accidental luck.
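For completeness, a minimal sketch of that tweaking, using the libzmq C API from C++ (the endpoint and the values are illustrative assumptions; note the options must be set before bind):

    // Sketch: enlarging the queueing capacity on the ROUTER socket.
    // This only postpones the problem described above - it is not flow-control.
    #include <zmq.h>

    int main()
    {
        void *ctx    = zmq_ctx_new();
        void *router = zmq_socket( ctx, ZMQ_ROUTER );

        int sndhwm = 100000;              // messages queued per peer (assumption)
        int sndbuf = 4 * 1024 * 1024;     // kernel TX buffer, bytes (assumption)
        zmq_setsockopt( router, ZMQ_SNDHWM, &sndhwm, sizeof sndhwm );
        zmq_setsockopt( router, ZMQ_SNDBUF, &sndbuf, sizeof sndbuf );

        zmq_bind( router, "tcp://*:5555" );   // hypothetical endpoint
        // ... route traffic as before ...
        zmq_close( router );
        zmq_ctx_term( ctx );
    }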
Token-passing seems to be the cheapest way to throttle the flow so that it remains acceptable / processable on the slowest node; a minimal sketch follows below. Increasing that node's processing performance may be done with a static, an incrementally expanded, or a fully adaptive pooling-proxy, so even this known bottleneck remains manageable and under your design control.
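The sketch assumes a plain DEALER-to-DEALER channel; the endpoint, the batch size of 8 tokens and the payloads are illustrative assumptions, and the grant messages could equally be relayed through the existing ROUTER:

    // --- receiver.cpp ---
    // Sketch: credit-based (token-passing) flow control, receiver side.
    // The receiver pre-grants a batch of tokens and replenishes them only
    // after a batch has actually been processed, so the sender can never
    // run ahead of the slowest node.
    #include <zmq.h>

    int main()
    {
        void *ctx      = zmq_ctx_new();
        void *receiver = zmq_socket( ctx, ZMQ_DEALER );
        zmq_bind( receiver, "tcp://*:5556" );          // hypothetical endpoint

        zmq_send( receiver, "GRANT", 5, 0 );           // initial batch of 8 tokens
        int processed = 0;
        while ( true )
        {
            char buf[256];
            if ( zmq_recv( receiver, buf, sizeof buf, 0 ) == -1 )
                break;
            // ... process the message as fast as we can ...
            if ( ++processed % 8 == 0 )
                zmq_send( receiver, "GRANT", 5, 0 );   // replenish 8 more tokens
        }
        zmq_close( receiver );
        zmq_ctx_term( ctx );
    }

    // --- sender.cpp ---
    // Sender side: spend one token per message, block when out of tokens.
    #include <zmq.h>

    int main()
    {
        void *ctx    = zmq_ctx_new();
        void *sender = zmq_socket( ctx, ZMQ_DEALER );
        zmq_connect( sender, "tcp://localhost:5556" );

        int credits = 0;
        for ( int i = 0; i < 1000000; ++i )
        {
            while ( credits == 0 )                     // throttled: wait for a grant
            {
                char grant[8];
                zmq_recv( sender, grant, sizeof grant, 0 );
                credits += 8;                          // one grant == 8 tokens (assumed)
            }
            zmq_send( sender, "payload", 7, 0 );
            --credits;
        }
        zmq_close( sender );
        zmq_ctx_term( ctx );
    }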
The highest layer of robustness lies in making your distributed system's design resilient to spurious bursts of events (be they messages or .connect() attempts). So, independently of the choice of building blocks, the designer also has the responsibility to design in all the survival strategies. Not doing this leaves your system vulnerable to a capacity-directed attack vector or other unhandled exploits of these kinds of known weaknesses.

Related

What are the downsides of using ZeroMQ for sending large messages (up to gigabytes)?

I found that people don't recommend sending large messages with ZeroMQ. But it is a real headache for me to split the data (it is somewhat twisted). Why is this not recommended - is there some specific reason? Can it be overcome?
Why is this not recommended?
Resources ...
Even the best Zero-Copy implementation has to have spare resources to store the payloads in several principally independent, separate locations:
|<fatMessageNo1>|
|...............|__________________________________________________________ RAM
|...............|<fatMessageNo1>|
|...............|...............|__________________Context().Queue[peerNo1] RAM
|...............|...............|<fatMessageNo1>|
|...............|...............|...............|________O/S.Buffers[L3/L2] RAM
Can it be overcome?
Sure - do not send mastodon-sized GB+ messages. You may use any kind of off-RAM representation thereof and send just a lightweight reference to allow a remote peer to access such an immense beast.
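A hedged sketch of that reference-passing idea; the shared location and the PUSH/PULL channel are purely illustrative assumptions:

    // Sketch: ship a lightweight reference, not the GB+ payload itself.
    // The shared location is hypothetical - anything both peers can reach
    // (shared filesystem, object store, memory-mapped region) works.
    #include <zmq.h>
    #include <string>

    int main()
    {
        void *ctx  = zmq_ctx_new();
        void *push = zmq_socket( ctx, ZMQ_PUSH );
        zmq_connect( push, "tcp://localhost:5557" );        // hypothetical endpoint

        // The immense beast is already parked off-RAM; send only its locator:
        std::string ref = "file:///mnt/shared/molecules.bin";
        zmq_send( push, ref.data(), ref.size(), 0 );        // a few dozen bytes

        zmq_close( push );
        zmq_ctx_term( ctx );
    }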
Many new questions added via comment:
I was concerned more about something like transmission failure: what will ZeroMQ do (will it try to retransmit automatically, will it be transparent for me, etc.)? RAM is not so crucial - servers can have more than enough of it, and the service that we write is not intended to have a huge number of clients at the same time. The data that I talk about is very interrelated (we have molecules/atoms info and bonds between them), so it is impossible to send a chunk of it and use it - we need it all)) – Paul
You may already be aware that ZeroMQ works under a Zen-of-Zero, where a zero-warranty also has its place.
So a ZeroMQ-dispatched message will either be delivered "through" error-free, or not delivered at all. This is a great pain-saver, as your code will receive only fully-protected content, atomically, so no tortured trash will ever reach your target post-processing. Higher-level soft-protocol handshaking allows one to remain in control, enabling mitigation of non-delivered cases from higher levels of abstraction. So, if your design appetite and deployment conditions permit, one can harness brute force and send whatever-[TB]-BLOBs, at one's own risk of blocking both local and infrastructure resources, if others permit and don't mind ( ... but never on my advice :o) )
Error-recovery self-healing - from lost connection(s) and similar real-life issues - is handled if configuration, resources and timeouts permit, so a lot of the trouble of keeping L1/L2/L3 ISO-OSI-layer issues at bay is efficiently hidden from user-app programmers.

Using ZMQ for bidirectional inter-thread communication

I am new to ZeroMQ. I have spent the last couple of months reading the documentation and experimenting with the library. I am currently developing a multi-threaded C++ application and want to use ZeroMQ instead of mutexes to exchange data between my main thread and one of its child threads.
The child thread handles the communication with an external application. Therefore, I will need two queues/sockets between the main thread and its child: one for outgoing messages and one for incoming messages.
Which ZMQ socket type should I use in order to achieve this?
Thanks in advance.
By moving from using shared memory and mutexes to using ZeroMQ, you are entering the realm of Actor model programming.
This, in my opinion, is a fairly good thing. However, there are some things to be aware of.
The only reason mutexes are no longer needed is because you are copying data, not sharing it. The 'cost' is that copying a lot of data takes a lot longer than locking a mutex that points to shared data. So you can end up with a nice looking Actor model program that runs like a dog in comparison to an equivalent program that uses shared memory / mutexes.
A caveat is that on complicated architectures like Intel Xeons with multiple CPUs, accessing shared memory can, conceivably, take just as long as copying it. This is because this may (depending on how lucky you've been) mean transactions across the QPI bus. Actor model programming is ideal for NUMA hardware architectures. Modern Intel and AMD architectures are, partially/fundamentally, NUMA, but the protocols they run over QPI / Hypertransport "fake" an SMP environment.
I would avoid ZMQ_PAIR sockets wherever practicable. They don't work across network connections. This means that if, for any reason, your application needs to scale across multiple computers you have to re-write your code. However, if you use different socket types from the very beginning, a scale-up of your application is nothing more than a matter of redeploying your code, not changing it. FYI nanomsg PAIRs do not have this restriction.
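As a hedged sketch of that advice, here are the two pipes from the question built as PUSH/PULL over inproc (the endpoint names are illustrative assumptions; swapping them for tcp:// endpoints is the redeploy-not-rewrite scale-up mentioned above):

    // Sketch: two one-way pipes between a main thread and a child thread.
    // PUSH/PULL over inproc:// today; swap for tcp:// to scale out later.
    #include <zmq.h>
    #include <thread>

    int main()
    {
        void *ctx = zmq_ctx_new();                       // one context, shared by both threads

        void *to_child   = zmq_socket( ctx, ZMQ_PUSH );  // main -> child
        void *from_child = zmq_socket( ctx, ZMQ_PULL );  // child -> main
        zmq_bind( to_child,   "inproc://to_child" );     // hypothetical names; bind
        zmq_bind( from_child, "inproc://from_child" );   // before the child connects

        std::thread child( [ctx]()
        {
            void *rx = zmq_socket( ctx, ZMQ_PULL );
            void *tx = zmq_socket( ctx, ZMQ_PUSH );
            zmq_connect( rx, "inproc://to_child" );
            zmq_connect( tx, "inproc://from_child" );

            char buf[256];
            int  n = zmq_recv( rx, buf, sizeof buf, 0 ); // wait for work
            zmq_send( tx, buf, n, 0 );                   // echo a reply
            zmq_close( rx );
            zmq_close( tx );
        } );

        zmq_send( to_child, "hello", 5, 0 );
        char reply[256];
        zmq_recv( from_child, reply, sizeof reply, 0 );

        child.join();
        zmq_close( to_child );
        zmq_close( from_child );
        zmq_ctx_term( ctx );
    }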
Don't for one moment assume that Actor model programming is going to solve all your problems. It brings in a whole suite of problems all of its own. You can still deadlock, livelock, spinlock, etc. The problem with Actor model programs is that these problems can lurk in your code for years and never happen, until one day the network is just a little bit busier and - bam - your program stops running...
However, there is a development of Actor model programming called "Communicating Sequential Processes". This doesn't eliminate those problems, but if you've written your program with these problems in it, they are guaranteed to happen every single time. So you discover the problem during development and testing, not five years later. There's also a process calculus for it, i.e. you can algebraically prove that your design is problem-free before you ever write a single line of code. ZeroMQ is not CSP. Interestingly, CSP is making something of a comeback - the Rust and Go languages both do CSP. However, they do not do CSP across network connections - it's all in-process stuff. Erlang does CSP too, and AFAIK does it across network connections.
Assuming you've read all that about CSP and are still going to use ZeroMQ, think carefully about what it is you are planning on sending across the ZeroMQ sockets. If it's all within one program on the same machine, then sending copies of, for example, arrays of integers is fine. They'll still be interpretable as integers at the receiving end. However, if you have aspirations to send data through ZMQ sockets to another computer it's well worth considering some sort of serialisation technology. ZeroMQ delivers messages. Why not make those messages the byte stream from an object serialiser? Then you can guarantee that the received message will, after de-serialisation, mean something appropriate at the receiving end, instead of having to solve problems with endianness, etc.
Favourite serialisers for me include Google Protocol Buffers. It is language / operating system agnostic, giving lots of options for a heterogeneous system. ASN.1 is another really good option, it can be got for most of the important languages, and it has a rich set of wire formats (including XML and, now/soon, JSON, which gives some interesting inter-op options), and does Constraints (something Google PBufs don't do), but does tend to cost money if one wants really good tools for it. XML can be understood by almost anything, but is bloated. Basically it's worth picking one that doesn't tie you down to using, say, C#, or Python everywhere.
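For illustration only, a sketch of the serialiser-in-front-of-the-socket idea with Google Protocol Buffers; the Sample message type and its generated header are hypothetical, produced from a .proto you would define:

    // Sketch: make the ZeroMQ message the byte stream of a serialised object.
    // Assumes `Sample` was generated by protoc from a hypothetical sample.proto.
    #include <zmq.h>
    #include <string>
    #include "sample.pb.h"   // hypothetical generated header

    void send_sample( void *socket, const Sample &sample )
    {
        std::string wire;
        sample.SerializeToString( &wire );                // portable byte stream
        zmq_send( socket, wire.data(), wire.size(), 0 );
    }

    bool recv_sample( void *socket, Sample &sample )
    {
        char buf[4096];
        int  n = zmq_recv( socket, buf, sizeof buf, 0 );
        if ( n < 0 || n > (int)sizeof buf )               // failed or truncated
            return false;
        return sample.ParseFromArray( buf, n );           // endianness handled for us
    }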
Good luck!

Performance benefit of multiple pending reads or multiple pending writes per individual TCP socket?

IOCP is great for many connections, but what I'm wondering is: is there a significant benefit to allowing multiple pending receives or multiple pending writes per individual TCP socket, or am I not really going to lose performance if I just allow one pending receive and one pending send per socket (which really simplifies things, as I don't have to deal with out-of-order completion notifications)?
My general use case is 2 worker threads servicing the IOCP port, handling several connections (more than 2 but fewer than 10), where the transmitted data takes either of two forms: one is frequent, very small messages (which I combine manually where possible, but generally need to send often enough that the per-send data is still pretty small), and the other is transferring large files.
Multiple pending recvs tend to be of limited use unless you plan to turn off the network stack's recv buffering, in which case they're essential. Bear in mind that if you DO decide to issue multiple pending recvs, then you must do some work to make sure you process them in the correct sequence. Whilst the recvs will complete from the IOCP in the order that they were issued, thread scheduling may mean that they are processed by different I/O threads in a different order unless you actively work to ensure that this is not the case; see here for details.
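A hedged sketch of the bookkeeping that sequencing implies - stamp each pending recv with an issue-order number in an extended OVERLAPPED so worker threads can re-order completions (names and buffer sizes are illustrative assumptions):

    // Sketch: stamp each pending recv so completions processed by different
    // worker threads can be re-ordered. Issues for a given socket must be
    // serialised (e.g. under a per-socket lock) so stamps match wire order.
    #include <winsock2.h>
    #include <windows.h>

    struct RecvOp                       // extended OVERLAPPED, one per pending recv
    {
        OVERLAPPED ov;                  // must be first so a cast recovers RecvOp
        WSABUF     wsabuf;
        char       buffer[4096];        // illustrative size
        LONGLONG   seq;                 // issue-order stamp
    };

    volatile LONGLONG g_nextSeq = 0;    // would be per-socket in real code

    bool IssueRecv( SOCKET s, RecvOp *op )
    {
        ZeroMemory( &op->ov, sizeof op->ov );
        op->wsabuf.buf = op->buffer;
        op->wsabuf.len = sizeof op->buffer;
        op->seq        = InterlockedIncrement64( &g_nextSeq );

        DWORD flags = 0;
        int rc = WSARecv( s, &op->wsabuf, 1, NULL, &flags, &op->ov, NULL );
        return rc == 0 || WSAGetLastError() == WSA_IO_PENDING;
    }
    // On completion, a worker buffers RecvOps in a map keyed by seq and only
    // consumes the one whose seq equals the next expected value.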
Multiple pending sends are more useful for fully utilising the TCP connection's available TCP window (and sending at the maximum rate possible), but only if you have lots of data to send, only if you want to send it as efficiently as you can, and only if you take care to ensure that you don't have too many pending writes. See here for details of issues that you can come up against if you don't actively manage the number of pending writes.
For fewer than 10 connections over TCP, you probably won't feel any difference even at high rates. You may see better performance simply by growing your buffer sizes.
Queuing up I/Os is going to help if your application is bursty and expensive to process. Basically, it lets you perform the costly work up front, so that when the burst comes in you're spending a little of the CPU on I/O and as much of it as possible on processing.

What reliability guarantees (if any) does ZMQ make for PUB/SUB over epgm?

I've got an app sending messages on an epgm PUB socket to one or more epgm SUB sockets. Things mostly work, but if a subscribing application is left up long enough, it will generally end up missing a message or a few messages. (My messages have sequence numbers, so I can tell if any are missing or out of order.) Based on my reading of the ZMQ docs, I would have thought that the "reliable multicast" nature of epgm would prevent this from happening - that after a SUB socket gets one message, it's guaranteed to keep getting them until shutdown or until major network troubles (i.e., the connection is maxed out).
Anyway, that's the context, but the question is simply the title: What reliability guarantees (if any) does ZMQ make for PUB/SUB over epgm?
The PGM implementation within ZeroMQ uses an in-memory window for recovery, and recovery is thus only short-lived. If recovery fails due to the window being exhausted - for example, publishing faster than a recovery can complete - then the underlying PGM socket will reset and continue at best effort.
This means that at high data rates, or with significant packet loss, the transport will be constantly resetting, and you will be dropping messages that cannot be recovered: hence reliable delivery is not guaranteed.
The PGM configuration is targeted at real-time broadcast, such that slow receivers cannot stall the sender. The protocol does support both paradigms, but the latter has not been implemented due to lack of demand.
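That in-memory window is the part you can size. A hedged sketch of the relevant knobs on an epgm publisher (the values, interface and multicast group are illustrative assumptions):

    // Sketch: sizing the PGM rate and recovery window on an epgm PUB socket.
    #include <zmq.h>

    int main()
    {
        void *ctx = zmq_ctx_new();
        void *pub = zmq_socket( ctx, ZMQ_PUB );

        int rate         = 10000;   // kbits/s (illustrative)
        int recovery_ivl = 2000;    // ms of traffic retained for recovery (illustrative)
        zmq_setsockopt( pub, ZMQ_RATE,         &rate,         sizeof rate );
        zmq_setsockopt( pub, ZMQ_RECOVERY_IVL, &recovery_ivl, sizeof recovery_ivl );

        zmq_bind( pub, "epgm://eth0;239.192.1.1:5558" );   // hypothetical interface/group
        // ... publish ...
        zmq_close( pub );
        zmq_ctx_term( ctx );
    }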
ZeroMQ makes exactly one guarantee: all messages are complete - you will never receive partial messages. It makes no guarantee of reliability. You should check out the documentation of the high-water-mark (HWM) behaviour, which is the most common cause of dropped messages, as illustrated by the suicidal snail.
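Since the messages in question already carry sequence numbers, here is a small sketch of the SUB-side gap check that at least detects the drops (the framing - a uint64_t counter in the first 8 bytes - and the endpoint are assumptions):

    // Sketch: detect dropped messages on the SUB side via sender-assigned
    // sequence numbers. Assumes each message starts with a uint64_t counter
    // and that both peers share the same byte order.
    #include <zmq.h>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    int main()
    {
        void *ctx = zmq_ctx_new();
        void *sub = zmq_socket( ctx, ZMQ_SUB );
        zmq_setsockopt( sub, ZMQ_SUBSCRIBE, "", 0 );        // everything
        zmq_connect( sub, "epgm://eth0;239.192.1.1:5558" ); // hypothetical group

        uint64_t expected = 0;
        char     buf[4096];
        while ( true )
        {
            int n = zmq_recv( sub, buf, sizeof buf, 0 );
            if ( n < (int)sizeof( uint64_t ) )
                continue;                                   // malformed, skip

            uint64_t seq;
            std::memcpy( &seq, buf, sizeof seq );
            if ( expected != 0 && seq > expected )          // forward jump == loss
                std::fprintf( stderr, "gap: lost %llu message(s)\n",
                              (unsigned long long)( seq - expected ) );
            expected = seq + 1;
        }
    }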

How can I tell Windows XP/7 not to switch threads during a certain segment of my code?

I want to prevent a thread switch by Windows XP/7 in a time-critical part of my code that runs in a background thread. I'm pretty sure I can't create a situation where I can guarantee that won't happen, because of higher-priority interrupts from system drivers, etc. However, I'd like to decrease the probability of a thread switch during that part of my code to the minimum that I can. Are there any create-thread flags or Windows API calls that can assist me? General technique tips are appreciated too. If there is a way to get this done without having to raise the thread's priority to real-time-critical, that would be great, since I worry about creating system performance issues for the user if I do that.
UPDATE: I am adding this update after seeing the first responses to my original post. The concrete application that motivated the question has to do with real-time audio streaming. I want to eliminate every bit of delay I can. I found after coding up my original design that a thread switch can at times cause a delay of 70 ms or more. Since my app sits between two sockets, acting as a middleman for delivering audio, the instant I receive an audio buffer I want to immediately turn around and push it out to the destination socket. My original design used two cooperating threads and a semaphore, since there was one thread managing the source socket and another thread for the destination socket. This architecture evolved from the fact that the two devices behind the sockets are disparate entities.
I realized that if I combined the two sockets onto the same thread, I could write a code block that reacted immediately to the socket-data-received message and turned it around to the destination socket in one shot. Now, if I can do my best to avoid an intervening thread switch, that would be the optimal coding architecture for minimizing delay. To repeat, I know I can't guarantee this, but I am looking for tips/suggestions on how to write a block of code that does this and minimizes, as best I can, the chance of an intervening thread switch.
Note, I am aware that O/S code behind the sockets introduces (potential) delays of its own.
AFAIK there are no such flags in CreateThread etc. (this also doesn't make sense, IMHO). You may suspend other threads in your process from execution during critical situations (by enumerating them and using SuspendThread), and you could theoretically enumerate & suspend threads in other processes as well.
OTOH, suspending threads is generally not a good idea: you may eventually call some 3rd-party code that implicitly waits for something that should be accomplished in another thread - one that you suspended.
IMHO you should use what's suggested for this case: playing with thread/process priorities (you may also consider SetThreadPriorityBoost). Also, the OS tends to raise the priority of threads that don't use the CPU aggressively. That is, threads that run often but for short durations (before calling one of the waiting functions that suspend them until some condition is met) are considered to behave "nicely", and they get prioritized.
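A hedged sketch of that suggestion for the combined-socket audio thread; THREAD_PRIORITY_HIGHEST is a judgment call that stays below the real-time band the questioner wanted to avoid:

    // Sketch: raise the audio-forwarding thread's priority instead of trying
    // to forbid context switches outright.
    #include <windows.h>

    DWORD WINAPI ForwardingThread( LPVOID param )
    {
        // Prefer this thread without entering the real-time priority band:
        SetThreadPriority( GetCurrentThread(), THREAD_PRIORITY_HIGHEST );

        // Keep the OS's dynamic boosting (the "nice thread" bonus described above):
        SetThreadPriorityBoost( GetCurrentThread(), FALSE );  // FALSE = boosting stays enabled

        // ... recv from source socket, send to destination socket, repeat ...
        return 0;
    }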
