ZeroMQ: One to many, one-way communication without topics? - zeromq

What is the simplest way to use ZeroMQ so that one sender sends out messages and an arbitrary number of receivers can get all those messages?
I would like to send messages which are Python objects (dicts) so I would like to use recv_json and send_json for that.
As I understand PUB/SUB will always require topics and will send multipart messages, which seems unnecessary with my use case. But I cannot see any other protocol which would do the same without the overhead.
Another complication (perhaps) is that this should work for both Python2 and Python3 receivers.
When there is no topic, can I just ignore the issue of topics altogether when sending and receiving, and use send_json/recv_json as with other protocols?

You don't need to use topics for PUB/SUB; you can just subscribe to "all topics" (an empty prefix) and then send whatever messages you want. E.g., I can write a publisher like this:
import json
import time
import zmq

c = zmq.Context()
s = c.socket(zmq.PUB)
s.bind("tcp://127.0.0.1:4321")
while True:
    s.send(json.dumps({"color": "red", "size": "large", "count": 10}).encode())
    time.sleep(0.5)
And a client like this:
import json
import zmq

c = zmq.Context()
s = c.socket(zmq.SUB)
s.connect('tcp://127.0.0.1:4321')
s.subscribe('')
while True:
    m = json.loads(s.recv())
    print(m)
It all works, and you'll note that I am neither making use of topics nor am I using multipart messages.
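Since the question mentions both Python 2 and Python 3 receivers, it may help to make the bytes/text boundary explicit on both ends. The sketch below is only an illustration: encode_message and decode_message are hypothetical helper names, not part of pyzmq, and the assumption is that every message is a JSON-serializable dict sent as a single UTF-8 frame.

```python
import json


def encode_message(obj):
    """Serialize a dict to one UTF-8 encoded frame (hypothetical helper)."""
    return json.dumps(obj).encode("utf-8")


def decode_message(frame):
    """Decode a received frame back into a Python object (hypothetical helper).

    Decoding to text first avoids relying on json.loads accepting bytes,
    which older Python 3 versions do not support.
    """
    return json.loads(frame.decode("utf-8"))


message = {"color": "red", "size": "large", "count": 10}
frame = encode_message(message)      # what would be passed to s.send(...)
restored = decode_message(frame)     # what would be done with s.recv()
```

With helpers like these, the publisher would call s.send(encode_message(...)) and the subscriber decode_message(s.recv()), with no topic frame involved.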

Related

How to filter in PUB/SUB with protobuf binaries?

Suppose I want to serialize and transmit protobuf binaries with ZMQ using a protocol defined in cake.proto:
syntax = "proto3";
message Cake {
  int32 radius = 1;
}
I can find plenty of examples for the PUB/SUB pattern where a subscriber filters a topic with a string:
socket.setsockopt_string(zmq.SUBSCRIBE, "abc")
But how does subscribing to topics work when it comes to protobuf binaries? Do I use the bytes themselves, or does ZMQ provide a wrapper for a message with a header I can use for cases like this?
There is no wrapper for this; the subject is just the beginning of the first frame of the ZeroMQ message, and a subscription is a byte prefix matched against it.
If you are confident your protobuf messages will always start with the specific sequence of bytes (that make your subject) then yes you can just subscribe to that byte prefix pattern.
The other option is to copy the subject pattern into an initial frame then add the protobuf frame(s) via ZMQ_SNDMORE. If you can pack many protobuf frames into that same zmq message then the efficiency is good. If each protobuf message has its own "subject" then you will have the overhead of an extra subject frame per protobuf.
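The two options can be compared with a small sketch that needs no sockets. It mimics the prefix rule a SUB socket applies to the first frame; encode_cake and sub_accepts are hypothetical helpers written for this illustration (a real program would use the generated protobuf class and a real socket), and the hand encoding assumes a non-negative radius.

```python
def encode_cake(radius):
    """Hand-encode Cake{radius} as protobuf wire bytes (illustrative only).

    Field 1 with wire type 0 (varint) has the tag byte 0x08; proto3 omits
    a field that holds its default value of 0. Assumes radius >= 0.
    """
    if radius == 0:
        return b""
    out = bytearray([0x08])
    while radius > 0x7F:
        out.append((radius & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        radius >>= 7
    out.append(radius)
    return bytes(out)


def sub_accepts(subscriptions, frames):
    """Mimic the SUB-side rule: a message passes if its first frame starts
    with any subscribed prefix (the empty prefix matches everything)."""
    return any(frames[0].startswith(prefix) for prefix in subscriptions)


payload = encode_cake(300)

# Option 1: a single frame, so any filter has to match the payload bytes.
single = [payload]

# Option 2: a subject frame first, then the protobuf payload as a second frame.
framed = [b"cake", payload]
```

Here every Cake with a non-zero radius starts with the same tag byte, so filtering on the raw payload cannot tell cakes apart by anything useful, while the extra subject frame gives a readable, stable prefix at the cost of one more frame per message.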

Why do almost all ZeroMQ code samples contain sleep() operations?

I'm learning ZeroMQ and trying to build a simple message queue in Python.
I noticed basically all code samples contain some kind of sleep() operation.
Even the hello world example on the ZeroMQ guide does, with the comment "Do some work".
I find this a little unclear, is the motivation to simulate the act of processing the message? Why is this necessary?
import time
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")
while True:
    # Wait for next request from client
    message = socket.recv()
    print("Received request: %s" % message)
    # Do some 'work'
    time.sleep(1)
    # Send reply back to client
    socket.send(b"World")
Is the motivation to simulate the act of processing the message?
Sort of, yes. Running while True: without any brake would soon get ugly on screen, with a literally endless river of print() output, wouldn't it?
Why is this necessary?
It is just a cheap convenience trick that saves lines of code. Except for cases where some latency needs to be injected, there is no technical reason for the sleep() calls.

ZeroMQ: Many-to-one no-reply async messages

I have read through the zguide but haven't found the kind of pattern I'm looking for:
There is one central server (with known endpoint) and many clients (which may come and go).
Clients keep sending heartbeats to the server, but they don't want the server to reply.
Server receives heartbeats, but it does not reply to clients.
Heartbeats sent while clients and server are disconnected should somehow be dropped, to prevent a heartbeat flood when they come back online.
The closest I can think of is the DEALER-ROUTER pattern, but since this is meant to be used as an async REQ-REP pattern (no?), I'm not sure what would happen if the server just kept silent on incoming "requests." Also, the DEALER socket would block rather than start dropping heartbeats when the send High Water Mark is reached, which would still result in a heartbeat flood.
The PUSH/PULL pattern should give you what you need.
# Client example
import zmq

class Client(object):
    def __init__(self, client_id):
        self.client_id = client_id
        ctx = zmq.Context.instance()
        self.socket = ctx.socket(zmq.PUSH)
        self.socket.connect("tcp://localhost:12345")

    def send_heartbeat(self):
        # Frames must be bytes on Python 3, so encode the ID explicitly
        self.socket.send(str(self.client_id).encode())

# Server example
import zmq

class Server(object):
    def __init__(self):
        ctx = zmq.Context.instance()
        self.socket = ctx.socket(zmq.PULL)
        self.socket.bind("tcp://*:12345")

    def receive_heartbeat(self):
        return self.socket.recv()  # returns the client_id of the message's sender
This PUSH/PULL pattern works with multiple clients, as you wish. The server should keep a record of the received messages (i.e., a dictionary like {client_id: last_received}, updated with datetime.utcnow() on each received message) and implement a housekeeping function that periodically checks that record for clients with old timestamps.
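The bookkeeping described above could look roughly like the following sketch. HeartbeatRegistry, record, expire and the timeout value are hypothetical names chosen for this illustration, and it swaps in time.monotonic() for the datetime.utcnow() timestamps mentioned in the answer, on the assumption that only elapsed time matters.

```python
import time


class HeartbeatRegistry(object):
    """Tracks when each client was last heard from (illustrative sketch)."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout   # seconds of silence before a client is stale
        self.last_seen = {}      # client_id -> monotonic timestamp

    def record(self, client_id, now=None):
        """Call this for every heartbeat the server receives."""
        self.last_seen[client_id] = time.monotonic() if now is None else now

    def expire(self, now=None):
        """Housekeeping: drop and return the clients that have gone quiet."""
        now = time.monotonic() if now is None else now
        stale = [cid for cid, seen in self.last_seen.items()
                 if now - seen > self.timeout]
        for cid in stale:
            del self.last_seen[cid]
        return stale
```

The server loop would call registry.record(server.receive_heartbeat()) for each message and run registry.expire() periodically; the optional now argument exists only to make the logic easy to exercise without waiting.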

How do I force the sendMessage in my Autobahn websocket server to send data without delay

I am trying to write a client/server app using websockets, and I am thinking about using Autobahn websocket as my communication medium. The client is going to send a command to the server to perform a task and then wait for a series of progress responses from the server. In the server, after I receive the command from the client, I perform a series of tasks and then call self.sendMessage ("percent completed") % (percent) to the client. The problem I ran into is that sendMessage appears to buffer up all the messages and then send them all at once at the end. Any idea how I can solve this problem? Here is the code snippet from the websocket/example/echo/server.py:
import sys
import time
from twisted.internet import reactor
from twisted.python import log
from twisted.web.server import Site
from twisted.web.static import File
from autobahn.websocket import WebSocketServerFactory, \
                               WebSocketServerProtocol, \
                               listenWS

class EchoServerProtocol(WebSocketServerProtocol):
    def onMessage(self, msg, binary):
        self.sendMessage("server respond message 1", binary)
        time.sleep(2)
        self.sendMessage("server response message 2", binary)
        time.sleep(2)
        self.sendMessage("server response message 3", binary)
I expect the client to receive a message from the server every 2 seconds; instead, it gets all three messages at once.
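time.sleep will block the Twisted reactor. That's (almost) never a good idea: while the reactor is blocked, nothing is written to the socket, so the queued messages only go out once onMessage returns. Twisted has reactor.callLater to delay in a non-blocking way.
You can check out the example here to assure yourself that Autobahn sends out messages immediately.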
I had the same problem as you. After reading the WebSocket protocol specification, I found a solution:
self.sendMessage(msg, binary, fragmentSize=len(msg))
From the WebSocket protocol, RFC 6455:
The primary purpose of fragmentation is to allow sending a message
that is of unknown size when the message is started without having to
buffer that message. If messages couldn’t be fragmented, then an
endpoint would have to buffer the entire message so its length could
be counted before the first byte is sent. With fragmentation, a
server or intermediary may choose a reasonable size buffer and, when
the buffer is full, write a fragment to the network.

Publisher finishes before subscriber and messages are lost - why?

Fairly new to zeromq and trying to get a basic pub/sub to work. When I run the following (sub starting before pub), the publisher finishes but the subscriber hangs, having not received all the messages. Why?
I think the socket is being closed, but the messages have been sent? Is there a way of ensuring all messages are received?
Publisher:
import zmq
import random
import time
import tnetstring

context=zmq.Context()
socket=context.socket(zmq.PUB)
socket.bind("tcp://*:5556")
y=0
for x in xrange(5000):
    st = random.randrange(1,10)
    data = []
    data.append(random.randrange(1,100000))
    data.append(int(time.time()))
    data.append(random.uniform(1.0,10.0))
    s = tnetstring.dumps(data)
    print 'Sending ...%d %s' % (st,s)
    socket.send("%d %s" % (st,s))
    print "Messages sent: %d" % x
    y+=1
print '*** SERVER FINISHED. # MESSAGES SENT = ' + str(y)
Subscriber:
import sys
import zmq
import tnetstring

# Socket to talk to server
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:5556")
filter = "" # get all messages
socket.setsockopt(zmq.SUBSCRIBE, filter)
x=0
while True:
    topic,data = socket.recv().split()
    print "Topic: %s, Data = %s. Total # Messages = %d" % (topic,data,x)
    x+=1
In ZeroMQ, clients and servers always try to reconnect; they won't go down if the other side disconnects (because in many cases you'd want them to resume talking if the other side comes up again). So in your test code, the client will just wait until the server starts sending messages again, unless you stop recv()ing messages at some point.
In your specific case, you may want to look into calling socket.close() and context.term() when the publisher is done; context.term() will block until all the messages have been sent. You also have the slow-joiner problem. You can add a sleep after the bind, before you start publishing. This works in a test case, but you will want to really understand what is a solution versus a band-aid.
You need to think of the PUB/SUB pattern like a radio. The sender and receiver are both asynchronous. The Publisher will continue to send even if no one is listening. The subscriber will only receive data if it is listening. If the network goes down in the middle, the data will be lost.
You need to understand this in order to design your messages. For example, if you design your messages to be "idempotent", it doesn't matter if you lose data. An example of this would be a status-type message. It doesn't matter whether you have any of the previous statuses: the latest one is correct, so message loss doesn't matter. The benefit of this approach is that you end up with a more robust and performant system. The downside comes when you can't design your messages this way.
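A small simulation can make the difference concrete. It is only a sketch with made-up message shapes ("key", "value" and "change" are assumptions for this example): a status message carries the full current value, while a delta-style message carries only a change.

```python
def apply_status(state, msg):
    """Status message: carries the full current value, so it overwrites."""
    state[msg["key"]] = msg["value"]


def apply_delta(state, msg):
    """Delta message: carries only the change, so every one must arrive."""
    state[msg["key"]] = state.get(msg["key"], 0) + msg["change"]


def replay(apply, messages):
    """Feed a stream of messages into an empty state and return the result."""
    state = {}
    for msg in messages:
        apply(state, msg)
    return state


statuses = [{"key": "temp", "value": v} for v in (20, 21, 23)]
deltas = [{"key": "temp", "change": c} for c in (20, 1, 2)]

# Simulate the network dropping the middle message of each stream.
lossy_statuses = statuses[:1] + statuses[2:]
lossy_deltas = deltas[:1] + deltas[2:]
```

Replaying lossy_statuses ends in the same state as the full stream, whereas replaying lossy_deltas ends in a different, wrong state, which is why delta-style messages need extra machinery to detect and repair loss.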
Your example includes a type of message that requires no loss. Another type of message would be transactional. For example, if you just sent the deltas of what changed in your system, you could not afford to lose any messages. Database replication is often managed this way, which is why DB replication is often so fragile. To try to provide guarantees, you need to do a couple of things. One is to add a persistent cache: each message sent needs to be logged in it. Each message also needs to be assigned a unique ID (preferably a sequence number) so that the clients can determine whether they are missing a message. A second socket pair (ROUTER/REQ) needs to be added so that a client can request the missing messages individually. Alternatively, you could use the secondary socket only to request resending over the PUB/SUB. The clients would then all receive the messages again (which works for the multicast version) and would ignore the messages they had already seen. NOTE: this follows the MAJORDOMO pattern found in the ZeroMQ guide.
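The client-side half of the sequence-number idea can be sketched without any sockets. GapDetector and observe are hypothetical names for this illustration, and the sketch assumes the publisher numbers its messages with consecutive integers; it reports which sequence numbers a client would have to request again and ignores messages it has already seen.

```python
class GapDetector(object):
    """Tracks sequence numbers and reports the ones that were skipped."""

    def __init__(self):
        self.expected = None  # next sequence number we expect to see

    def observe(self, seq):
        """Return the list of sequence numbers missed just before seq.

        Duplicates and replayed (older) messages yield an empty list.
        """
        if self.expected is None or seq == self.expected:
            self.expected = seq + 1
            return []
        if seq < self.expected:
            return []  # already seen, e.g. a resend meant for another client
        missing = list(range(self.expected, seq))
        self.expected = seq + 1
        return missing
```

A subscriber would call observe() on the sequence number of each received message and send any returned numbers over the secondary socket as a resend request.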
An alternative approach is to create your own broker using ROUTER/DEALER sockets. When the ROUTER socket sees a DEALER connect, it stores that client's ID. When the ROUTER needs to send data, it iterates over all client IDs and sends the message to each one. Each message should contain a sequence number so that the client can know which missing messages to request. NOTE: this is a sort of reimplementation of Kafka from LinkedIn.

Resources