Right ZeroMQ topology - zeromq

I need to write an Order Manager that routes client (stock, FX, whatever) orders to the proper exchange. The clients want to send orders but know nothing about FIX or other proprietary protocols; they only know an internal (normalized) format for sending orders. I have applications (servers) that each connect through a FIX/binary/etc. connection to one FIX/etc. provider. I would like a broker program between the clients and the servers that takes the normalized order and turns it into the proper format for a given FIX/etc. provider, and takes messages from the servers and turns them back into the normalized format for the clients. It is OK for the clients to specify a route, but it is up to the broker to relay messages about that order back and forth between clients and servers. So somehow the output [fills, partial fills, errors, etc.] from a server has to be routed back to the right client.
I have studied the ZMQ topologies, and REQ->ROUTER->DEALER doesn't work [the code works - I mean it is the wrong topology] since the servers are not identical.
// This topology doesn't work because the servers are not identical
#include "zhelpers.hpp"

int main (int argc, char *argv[])
{
    // Prepare our context and sockets
    zmq::context_t context(1);
    zmq::socket_t frontend (context, ZMQ_ROUTER);
    zmq::socket_t backend (context, ZMQ_DEALER); // ZMQ_ROUTER here? Can't get it to work
    frontend.bind("tcp://*:5559");
    backend.bind("tcp://*:5560");

    // Start built-in device
    zmq::device (ZMQ_QUEUE, frontend, backend);
    return 0;
}
I thought that maybe a ROUTER->ROUTER topology is the correct one instead, but I can't get the code to work: the clients send orders but never get responses back, so I must be doing something wrong. I thought that using ZMQ_IDENTITY was the correct thing to do, but I can't get that to work either, and it also seems as if ZMQ is moving away from ZMQ_IDENTITY?
Can someone give a simple example of three ZMQ programs [not in separate threads, three separate processes] that show the correct way to do this?

Look at the MajorDomo example in the Guide: http://zguide.zeromq.org/page:all#toc71
You'd use a worker pool per exchange.
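A minimal sketch of the bookkeeping such a broker needs, written in plain Python rather than against the ZeroMQ API. It shows one worker pool per route and a table that remembers which client owns each in-flight order, so that fills and errors can be sent back to the right client. The class name `OrderRouter`, its method names, and the simple round-robin choice are assumptions made for illustration; they are not taken from the Majordomo example.

```python
from collections import deque


class OrderRouter:
    """One worker pool per route, plus an order -> client lookup table."""

    def __init__(self):
        self.pools = {}    # route name -> deque of worker identities
        self.owners = {}   # order id -> identity of the client that sent it

    def add_worker(self, route, worker_id):
        """Register a server process that handles the given route."""
        self.pools.setdefault(route, deque()).append(worker_id)

    def dispatch(self, client_id, order_id, route):
        """Pick a worker for this route and remember who owns the order."""
        pool = self.pools.get(route)
        if not pool:
            raise LookupError(f"no worker available for route {route!r}")
        worker_id = pool[0]
        pool.rotate(-1)    # round-robin within the pool (an assumption)
        self.owners[order_id] = client_id
        return worker_id

    def reply_target(self, order_id):
        """Return the client that should receive a fill, reject, etc."""
        return self.owners[order_id]
```

In a real broker the identities would be the identity frames a ROUTER socket prepends to incoming messages, and `reply_target` would supply the first frame of the message sent back out on the client-facing socket.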

Responding to:
ROUTER->ROUTER topology instead is correct, but I can't get the code to work
My understanding is that ZMQ sockets come in pairs, each pair enabling a certain pattern:
PAIR
REQ/REP
PUB/SUB
PUSH/PULL
Only a PAIR socket can talk to another socket of type PAIR, and it behaves much like a normal socket.
For the other socket types in this list, there is a complementary socket type for communication. For example, in the basic pattern a REQ socket talks to a REP socket; a REQ socket cannot talk to another REQ socket.
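My understanding is that in ROUTER/DEALER, a ROUTER can talk to a DEALER. The zmq_socket reference also lists ROUTER as a compatible peer for ROUTER, but in that case each side has to address the other by its identity, so it is harder to get working.
My understanding could be wrong, but this is what I have taken from the examples so far.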

Related

Best Performance - emit to sockets via a loop or rooms

We currently have a chat app in which, when emitting messages to the appropriate users (one or several, depending on how many are in the conversation), we loop through all socket (Socket.io 2.0.2) connections to the server (NodeJS) to collect the sockets belonging to a user, based on a member ID value, since each user could be connected from multiple devices. The code that determines which of a user's sockets should receive the message looks like this:
var sockets = Object.keys(socketList);
var results = [];
for (var key in sockets) {
    if (hasOwnProperty(socketList[sockets[key]].handshake.query, 'token')) {
        if (JSON.parse(socketList[sockets[key]].handshake.query.member).id === memberId) {
            results.push(socketList[sockets[key]]);
        }
    }
}
Having to loop through the socket connections seems inefficient and I wonder is there a better way. My thought is to create a room for each user, most users will have only the one connection but some will be connected via multiple devices so they could have multiple sockets in their room. Then I would just broadcast to the appropriate rooms rather than always looping through all sockets. Given that 95% of users will only have the one socket connection I'm not sure if this approach is any more efficient or not and would appreciate some input on this.
Thanks.
First off, socket.io already creates a room for every single user. That room has the name of the socket.id. Rooms are very lightweight objects. They basically just consist of an object with all the ids of the sockets that are in the room. So, there should be no hesitancy to use rooms at all. If they fit the model of what you're doing, then use them.
As for looping yourself vs. emitting to a room, there's really no difference - use whichever makes your code simpler. When you emit to a room, all it does is loop through the sockets in the room and send to each one individually.
Having to loop through the socket connections seems inefficient and I wonder is there a better way.
The main advantage of rooms is that they are pre-built associations of sockets, so you don't have to dynamically figure out which sockets you want to send to: there's already a list of sockets in the right room that you can send to. So it would likely be a little more efficient to send to all sockets in a room than to do what your code is doing, because your code dynamically works out which sockets to send to rather than sending to an already-made list. Would this make a difference? That depends on how long the whole list of sockets is and how expensive the computation is to figure out which ones you want to send to. My guess is that it probably wouldn't make much difference either way.
Sending a message to a room is not much more efficient on the actual sending part. Each socket has to be sent the message individually so somebody (your code or the socket.io rooms code) is going to be looping through a list of sockets either way. The underlying OS does not contain a function to send a single message to multiple sockets. Each socket has to be sent to individually.
Then I would just broadcast to the appropriate rooms rather than always looping through all sockets.
Sending to a room is a programming convenience for you, but socket.io will just be looping under the covers anyway.
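The "pre-built association" idea can be sketched independently of socket.io as an index that is updated on connect and disconnect, so that sending never has to scan every connection. This is a plain-Python illustration, not socket.io code; the class `MemberIndex` and its method names are hypothetical.

```python
from collections import defaultdict


class MemberIndex:
    """Maps a member ID to the set of socket IDs that member has open."""

    def __init__(self):
        self._by_member = defaultdict(set)

    def connect(self, member_id, socket_id):
        """Record a new connection for this member."""
        self._by_member[member_id].add(socket_id)

    def disconnect(self, member_id, socket_id):
        """Forget a connection; drop the member once no sockets remain."""
        sockets = self._by_member.get(member_id)
        if sockets is None:
            return
        sockets.discard(socket_id)
        if not sockets:
            del self._by_member[member_id]

    def sockets_for(self, member_id):
        """Return a copy of the member's sockets (empty set if none)."""
        return set(self._by_member.get(member_id, ()))
```

Looking up a member's sockets is then a single dictionary access, which is essentially what a per-user room gives you, at the cost of keeping the index in sync as devices come and go.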
I would use Socket.io rooms to accomplish what you want to do.
Server side, adding a client to a chat room:
socket.join('some room');
Then I would use socket.to('some room').emit to send a message to all participants in the room other than the sender.

What does tcp proxy session details include?

Suppose a TCP proxy has forwarded a request to the backend server. When it receives the reply from the backend server, how does it know which client to reply to? What exact session information does a proxy store?
Can anyone please shed some light on this?
It depends on the protocol, it depends on the proxy, and it depends on whether transparency is a goal. Addressing all of these points exhaustively would take forever, so let's consider a simplistic case.
A network connection in software is usually represented by some sort of handle (whether that's a file descriptor or some other resource). In a C program on a POSIX system, we could simply keep two file descriptors associated with each other:
struct proxy_session {
    int client_fd;
    int server_fd;
};
This is the bare-minimum requirement.
When a client connects, we allocate one of these structures. There may be a protocol that lets us know what backend we should use, or we may be doing load balancing and picking backends ourselves.
Once we've picked a backend (either by virtue of having parsed the protocol or by having made some form of routing decision), we initiate a connection to it. Simplistically, a proxy (as an intermediary) simply forwards packets between a client and a server.
We can use any number of interfaces for tying these two things together. On Linux, for example, epoll(2) allows us to associate a pointer with events on a file descriptor. We can provide it a pointer to our proxy_session structure for both the client and server side. When data comes in on either of those file descriptors, we know where to map it.
Lacking such an interface, we still necessarily have a means of differentiating connection handles (whether they're file descriptors, pointers, or some other representation). We could then use a structure like a hash table to look up the destination for a handle. The solution comes simply from being able to tell connections apart and holding some state that "glues" two connections together.
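The hash-table variant can be sketched in a few lines of Python, where a dictionary keyed by socket object plays the role of the lookup structure. This is a simplified illustration under the assumptions above, not a complete proxy: the function names `link` and `forward_once` are hypothetical, and a real proxy would also need an event loop, non-blocking I/O, and cleanup on disconnect.

```python
import socket


def link(sessions, client_sock, server_sock):
    """Glue two connections together: each one can look up the other."""
    sessions[client_sock] = server_sock
    sessions[server_sock] = client_sock


def forward_once(sessions, ready_sock, bufsize=4096):
    """Copy one chunk from ready_sock to its peer; return bytes forwarded.

    A return value of 0 means the sender closed its side of the connection.
    """
    data = ready_sock.recv(bufsize)
    if data:
        sessions[ready_sock].sendall(data)
    return len(data)
```

Because both directions are stored in the same table, the proxy does not need to know whether a readable socket is the client side or the server side; it just forwards to whatever peer the table returns.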

How do I add a pipeline to a REQ-REP in ZeroMQ?

I am experimenting with ZeroMQ, where I want to create a server that does:
REQ-PIPELINE-REPLY
I want to sequentially receive data query requests, push each one through an inproc pipeline to parallelise the data query, and have the sink merge the data back. After the sink merges the data together, the sink sends the merged data as the reply to the request.
Is this possible? How would it look? I am not sure if push/pull will preserve the client's address for the REP socket to send back to.
Assume that each client has only a single request outstanding at any one time.
Is this possible?
Yes, but with different socket types.
How would it look?
(in C)
What you may like to do is shift from a ZMQ_REP socket on the external server socket to a ZMQ_ROUTER socket. The Router/Dealer sockets have identities which can allow you to have multiple requests in your pipeline and still respond correctly to each.
The Asynchronous Client/Server Pattern:
http://zguide.zeromq.org/php:chapter3#The-Asynchronous-Client-Server-Pattern
The only hitch in this is that you will need to manage the multiple parts of the ZMQ message. The first part is the identity. The second is an empty (null) delimiter frame. The third is the data. As long as you send the REPLY frames in the same order as they arrived in the REQUEST, the identity will guide your response's data to the correct client. I wrapped my requests in a struct:
struct msg {
    zmq_msg_t * identity;
    zmq_msg_t * nullMsg;
    zmq_msg_t * data;
};
Make sure to use zmq_msg_more when receiving messages, and to set the ZMQ_SNDMORE flag correctly when sending.
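The three-part layout described above can be illustrated without any ZeroMQ calls by treating a multipart message as a list of byte strings. The sketch below only shows how the envelope is split on the way in and rebuilt on the way out; the helper names `split_envelope` and `build_reply` are hypothetical, and it assumes the client is a REQ socket, which is what adds the empty delimiter frame.

```python
def split_envelope(frames):
    """Split [identity, b"", data...] into (identity, data frames)."""
    if len(frames) < 3 or frames[1] != b"":
        raise ValueError("expected identity, empty delimiter, then data")
    return frames[0], frames[2:]


def build_reply(identity, data_frames):
    """Rebuild the envelope so the ROUTER can route the reply back."""
    return [identity, b""] + list(data_frames)
```

The identity can travel through the pipeline alongside the work item, for example as an extra frame or a field in the task, so that the sink still has it when the merged result is ready to be sent.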
I am not sure if the push/pull will preserve client's address for the REP socket to send back to.
You are correct. A push pull pattern would not allow for specifying of the return address between multiple clients.

Is there a way to reverse the bind on zmq pub/sub?

I have server code on one box that needs to listen to status coming from another box containing about 10 chips, each running embedded Linux. The 10 chips have their own IP addresses, and each will send what is basically health status to the server, which could (possibly) do something with it.
I would like the server to just listen passively and not have to send a response. So this looks like a job for zmq's pub/sub, where each of the 10 chips has its own publication and the server subscribes to each.
However, the server would need to know the well-known address that each chip bound its publication to. But in the field, these chips can be swapped or replaced with one that has a different IP address.
Instead, it's safer to have the chips know the server code's IP address.
What I would like is a pub/sub where the receiver is the well-known address. Or a request/response pattern where the clients (the chips) send messages to the server (the requests), but neither the server nor the chips need to send or receive a response.
Now, currently, there are two servers on the separate box. So, if possible I'd like a solution for one server and multiple servers.
Is this possible in zmq? And what pattern would that be?
thanks.
Yes, you can do this exactly the way you'd expect to do so. Just bind on your subscriber, then connect to that subscriber with your publishers. ZMQ doesn't designate which end should be the "server", or more reliable end, and which should be the "client", or more transient end, specifically for this reason, and this is an excellent reason to switch up the normal paradigm.
Edit to address the new clarification--
It should work fine with multiple servers. In general it would work like the following (the order of operations in this case is just to ensure no messages get lost, which is possible if the PUB socket starts sending messages before the SUB is ready):
1. Spin up server 1. Create SUB socket and bind on address:port.
2. Spin up server 2. Create SUB socket and bind on address:port.
3. Spin up a chip. That chip will create a PUB socket and connect on [server 1] address:port and connect on [server 2] address:port.
4. Repeat step (3) for the other nine chips.
Dual .SUB model
Oh yes, each .PUB-lishing entity may have numerous .SUB-s listening,
so having two <serverNode>-s meets the .PUB/.SUB-primitive Formal Communication Pattern ( one speaks - many listen )
As given above, each of your <serverNode>-s binds
.bind( aFixServer{A|B}_ipAddress_portNumber )
so as to allow each .PUB-lishing <chipNode> to
.connect( anAprioriKnownServer{A|B}_bindingNode_ipAddress_portNumber )
And both <serverNode{A|B}>-s then .SUB-scribe to receive any messages from them.
Multi-Server model
As seen above, the {A|B} grammar is freely extensible to {A|B|C|D|...} so the principal messaging model will stand for any reasonable multi-server extension
Q.E.D.

ZeroMQ, ROUTER - DEALER, send a message to all

One server - ZMQ_ROUTER, many clients - ZMQ_DEALER
How can the server (ZMQ_ROUTER) send a message to all clients (ZMQ_DEALER)?
UPD:
I know there is the PUB-SUB pattern, and that is really what I need. But I want to use only the current ROUTER-DEALER sockets. Is it possible?
Yes, but it won't be the answer you would like to hear. I don't think there is a flag or socket option for this. What you can do:
Track the connected dealers manually, then loop and send the same message to every connected dealer. If you send large messages you can zero-copy the payload, so you don't have to allocate the memory each time.
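A minimal sketch of that manual tracking, in plain Python with the socket abstracted away. The class `DealerRegistry` and the `send` callback are hypothetical stand-ins: in a real program `send` would be whatever performs a multipart send on the ROUTER socket, and `seen` would be called with the identity frame of every message the ROUTER receives.

```python
class DealerRegistry:
    """Remembers which dealer identities the ROUTER has heard from."""

    def __init__(self):
        self.identities = set()

    def seen(self, identity):
        """Call with the identity frame of each incoming message."""
        self.identities.add(identity)

    def forget(self, identity):
        """Call when a dealer is known to be gone (e.g. missed heartbeats)."""
        self.identities.discard(identity)

    def broadcast(self, payload, send):
        """Send [identity, payload] once per known dealer; return the count."""
        for identity in sorted(self.identities):
            send([identity, payload])
        return len(self.identities)
```

Note that a ROUTER only learns an identity after that dealer has sent at least one message, so each client would need to announce itself (for example with a hello or heartbeat message) before it can be included in a broadcast.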

Resources