Streaming Pull hogging PubSub Messages

I'm currently running some tests on the latest google-cloud-pubsub==0.35.4 pubsub release. My intention is to process a never-ending stream (varying in load) using a dynamic number of subscriber clients.
However, when I have a queue of, say, 600 messages and 1 client running, and then add additional clients:
Expected: All remaining messages get distributed evenly across all clients
Observed: Only new messages are distributed across clients; any older messages are sent to pre-existing clients
Below is a simplified version of what I use for my clients (for reference, we'll only be running the low priority topic).
I won't include the publisher since it is not relevant.
ACKS_PER_MIN = 100.00
PRIORITY_LOW: 'test_low',
class Subscriber:
subscriber_client = None
subscriptions = {}
priority_queue = defaultdict(Queue.Queue)
priorities = []
def __init__(self):
self.subscriber_client = pubsub_v1.SubscriberClient()
for option, percentage in ACKS_RATIO.iteritems():
self.priorities += [option] * percentage
def subscribe_to_topic(self, topic, max_messages=10):
self.subscriptions[topic] = self.subscriber_client.subscribe(
BASE_TOPIC_PATH.format(project=PROJECT, topic=topic,),
def un_subscribe_from_topic(self, topic):
subscription = self.subscriptions.get(topic)
if subscription:
del self.subscriptions[topic]
def process_message(self, message):
json_message = json.loads('utf8'))
def retrieve_message(self):
message = None
priority = random.choice(self.priorities)
ack_priorities = PRIORITY_SEQUENCES[priority]
for ack_priority in ack_priorities:
message = self.priority_queue[ack_priority].get(block=False)
except Queue.Empty:
return message
if __name__ == '__main__':
messages_acked = 0
pub_sub = Subscriber()
pub_sub.subscribe_to_topic(PRIORITY_TOPICS[PRIORITY_LOW], MESSAGE_LIMIT * 3)
while True:
msg = pub_sub.retrieve_message()
if msg:
json_msg = json.loads('utf8'))
print ("%s - Akked Priority %s , High %s, Medium %s, Low %s" % ('%H:%M:%S'),
time.sleep(60.0 / ACKS_PER_MIN)
I'm wondering if this behaviour is inherent to how streaming pulls function or if there are configurations that can alter it.

According to the Cloud Pub/Sub documentation, Cloud Pub/Sub delivers each published message at least once for every subscription. There are, however, some exceptions to this behavior:
A message that cannot be delivered within the maximum retention time of 7 days is deleted.
A message published before a given subscription was created will not be delivered.
In other words, the service delivers messages only to subscriptions that were created before the message was published, so old messages are not available to new subscriptions. As far as I know, Cloud Pub/Sub does not offer a feature to change this behavior.
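The second exception can be illustrated with a toy in-memory model. This is only a sketch of the delivery rule described above, not the Cloud Pub/Sub API: the class `ToyTopic` and its methods are hypothetical names made up for the illustration.

```python
class ToyTopic:
    """Hypothetical in-memory stand-in for a topic (not the real Pub/Sub API)."""

    def __init__(self):
        # Maps a subscription name to its backlog of undelivered messages.
        self._subscriptions = {}

    def create_subscription(self, name):
        # A new subscription starts with an empty backlog: it does not
        # see anything that was published before it existed.
        self._subscriptions[name] = []

    def publish(self, message):
        # Each message is queued once for every subscription that
        # exists at publish time.
        for backlog in self._subscriptions.values():
            backlog.append(message)

    def pull(self, name):
        # Return and clear the backlog of one subscription.
        backlog = self._subscriptions[name]
        messages = list(backlog)
        backlog.clear()
        return messages


topic = ToyTopic()
topic.create_subscription("early")
topic.publish("m1")                # only "early" exists at this point
topic.create_subscription("late")
topic.publish("m2")                # both subscriptions exist now

early = topic.pull("early")
late = topic.pull("late")
print(early)
print(late)
```

In this model the "late" subscription never sees "m1", which is the behaviour the documentation describes for messages published before a subscription was created.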


How to work around zeromq late-joiner problem in a proxy (xpub/xsub)?

I have read all salient posts and the zeromq guide on the topic of the pub/sub late joiner, and hopefully I simply missed the answer, but just in case I haven't, I have two questions about the zeromq proxy in the context of the late joiner:
Does the zeromq proxy with the suggested XSUB/XPUB sockets also suffer from the late joiner problem, i.e. are the first few pub messages of a new publisher dropped?
If so, what is the accepted solution to ensure that even the first published message is received by subscribers with a matching topic? (My latest info is to sleep...)
I don't believe it is pertinent to the question, but just in case, here are the relevant pieces.
My proxy's run method, which runs in its own thread; it starts a capture thread if DEBUG is true to log all messages; if no matching subscription exists, nothing is captured.
The proxy works fine, including the capture socket. However, I am now adding functionality where a new publisher is started in a thread and will immediately start to publish messages. Hence my question: if a message is published straight away, will it be dropped?
def run(self):
msg = debug_log(self, f"{} thread running...")
debug_log(self, msg + "binding sockets...")
xpub = self.zmq_ctx.socket(zmq.XPUB)
xsub = self.zmq_ctx.socket(zmq.XSUB)
if self.loglevel == DEBUG and sys_conf.system__zmq_broker_capt_addr:
debug_log(self, msg + " done, now starting broker with message "
"capture and logging.")
capt = self.zmq_ctx.socket(zmq.PAIR)
capt_th = Thread( + "_capture_thread", target=self.capture_run,
args=(self.zmq_ctx,), daemon=True
capt.recv() # wait for peer thread to start and check in
debug_log(self, msg + "capture peer synchronised, proceeding.")
zmq.proxy(xpub, xsub, capt) # blocks until thread terminates
debug_log(self, msg + " starting broker.")
zmq.proxy(xpub, xsub) # blocks until thread terminates
def capture_run(self, ctx):
""" capture socket's thread's run method.
debug_log(self, f"{} capture thread running.")
sock = ctx.socket(zmq.PAIR)
sock.send(b"") # ack message to calling thread
while True:
try: # assume we're getting topic string followed by python object
topic = sock.recv_string()
obj = sock.recv_pyobj()
except Exception: # if not simply log message in bytes
topic = None
obj = sock.recv()
debug_log(self, f"{} capture_run received topic {topic}, "
f"obj {obj}.")
The users of the proxy (all of them are both subscribers and publishers) run in different threads and/or processes from the proxy:
# establish zmq subscriber socket and connect to broker
self._evm_subsock = self._zmq_ctx.socket(zmq.SUB)
# establish pub socket and connect to broker
self._evm_pub_sock = self._zmq_ctx.socket(zmq.PUB)
debug_log(self, msg + " Connected to pub/sub broker.")

ZeroMQ Subscribers not receiving message from Publisher over an inproc: transport class

I am fairly new to pyzmq. I am trying to understand the inproc: transport class and have created this sample to play with.
It looks like the Publisher instance is publishing messages but the Subscriber instances are not receiving any.
If I move the Subscriber instances into a separate process and change inproc: to a tcp: transport class, the example works.
Here is the code:
import threading
import time
import zmq
context = zmq.Context.instance()
address = 'inproc://test'
class Publisher(threading.Thread):
def __init__(self):
self.socket = context.socket(zmq.PUB)
def run(self):
while True:
message = 'snapshot,current_time_%s' % str(time.time())
print 'sending message %s' % message
class Subscriber(object):
def __init__(self, sub_name): = sub_name
self.socket = context.socket(zmq.SUB)
def listen(self):
while True:
msg = self.socket.recv()
a, b = msg.split(' ', 1)
print 'Received message -> %s-%s-%s' % (, a, b)
except zmq.ZMQError as e:
if __name__ == '__main__':
thread_a = []
for i in range(0, 1):
subs = Subscriber('subscriber_%s' % str(i))
th = threading.Thread(target=subs.listen)
pub = Publisher()
pub_th = threading.Thread(
There is nothing wrong, but
ZeroMQ is a wonderful toolbox. It is full of smart, bright and self-adapting services under the hood that literally save our poor lives in many ways. Still, it is worth reading and obeying a few rules from the documentation.
The inproc transport class has one such rule: .bind() first, before any .connect()-s.
[ Page 38, Code Connected, Volume I ]... inproc is an inter-thread signalling transport ... it is faster than tcp or ipc. This transport has a specific limitation compared to tcp and ipc: the server must issue a bind before any client issues a connect. This is something future versions of ØMQ may fix, but at present this defines how you use inproc sockets.
So, as an example:
if __name__ == '__main__':
pub = Publisher()
pub_th = threading.Thread( target = )
# give it a place to start before .connect()-s may take place
# give it a time to start before .connect()-s may take place
thread_a = []
for i in range( 0, 1 ):
subs = Subscriber( 'subscriber_%s' % str( i ) )
th = threading.Thread( target = subs.listen )
thread_a.append( th )
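A way to enforce the bind-before-connect order without relying on a fixed sleep is to let the publisher thread signal when its bind has completed. The following is a minimal sketch, assuming pyzmq on Python 3. The endpoint name, the `bound` and `done` events and the message text are made up for the illustration and are not taken from the question's code.

```python
import threading
import time

import zmq

context = zmq.Context.instance()
ADDRESS = "inproc://bind-first-demo"  # hypothetical endpoint name

bound = threading.Event()  # set once the publisher has called bind()
done = threading.Event()   # set once the subscriber has finished
received = []


def publisher():
    sock = context.socket(zmq.PUB)
    sock.bind(ADDRESS)           # bind first ...
    bound.set()                  # ... then let subscribers connect
    while not done.is_set():
        # Keep publishing: a PUB socket drops messages sent before
        # the subscription has arrived.
        sock.send_string("snapshot hello")
        time.sleep(0.01)
    sock.close()


def subscriber():
    bound.wait()                 # do not connect before the bind
    sock = context.socket(zmq.SUB)
    sock.setsockopt_string(zmq.SUBSCRIBE, "snapshot")
    sock.connect(ADDRESS)
    if sock.poll(timeout=5000):  # wait up to 5 s for one message
        received.append(sock.recv_string())
    done.set()
    sock.close()


pub_th = threading.Thread(target=publisher)
sub_th = threading.Thread(target=subscriber)
pub_th.start()
sub_th.start()
sub_th.join()
pub_th.join()
print(received)
```

Each socket is created and used inside a single thread, and both threads share the one context, which inproc requires.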

Akka actors and Clustering - I'm having trouble with ClusterSingletonManager - unhandled event in state Start

I've got a system that uses Akka 2.2.4 which creates a bunch of local actors and sets them as the routees of a Broadcast Router. Each worker handles some segment of the total work, according to some hash range we pass it. It works great.
Now, I've got to cluster this application for failover. Based on the requirement that only one worker per hash range exist/be triggered on the cluster, it seems to me that setting up each one as a ClusterSingletonManager would make sense. However, I'm having trouble getting it working. The actor system starts up, it creates the ClusterSingletonManager, it adds the path in the code cited below to a Broadcast Router, but it never instantiates my actual worker actor to handle my messages for some reason. All I get is a log message: "unhandled event ${my message} in state Start". What am I doing wrong? Is there something else I need to do to start up this single instance cluster? Am I sending the wrong actor a message?
Here's my Akka config (I use the default config as a fallback):
min-nr-of-members = 1
role {
workerSystem.min-nr-of-members = 1
daemonic = true
remote {
enabled-transports = ["akka.remote.netty.tcp"]
netty.tcp {
hostname = ""
port = ${akkaPort}
provider = akka.cluster.ClusterActorRefProvider
single-message-bound-mailbox {
# FQCN of the MailboxType. The Class of the FQCN must have a public
# constructor with
# (, com.typesafe.config.Config) parameters.
mailbox-type = "akka.dispatch.BoundedMailbox"
# If the mailbox is bounded then it uses this setting to determine its
# capacity. The provided value must be positive.
# Up to version 2.1 the mailbox type was determined based on this setting;
# this is no longer the case, the type must explicitly be a bounded mailbox.
mailbox-capacity = 1
# If the mailbox is bounded then this is the timeout for enqueueing
# in case the mailbox is full. Negative values signify infinite
# timeout, which should be avoided as it bears the risk of dead-lock.
mailbox-push-timeout-time = 1
type = PinnedDispatcher
executor = "thread-pool-executor"
# Throughput defines the number of messages that are processed in a batch
# before the thread is returned to the pool. Set to 1 for as fair as possible.
throughput = 500
thread-pool-executor {
# Keep alive time for threads
keep-alive-time = 60s
# Min number of threads to cap factor-based core number to
core-pool-size-min = ${workerCount}
# The core pool size factor is used to determine thread pool core size
# using the following formula: ceil(available processors * factor).
# Resulting size is then bounded by the core-pool-size-min and
# core-pool-size-max values.
core-pool-size-factor = 3.0
# Max number of threads to cap factor-based number to
core-pool-size-max = 64
# Minimum number of threads to cap factor-based max number to
# (if using a bounded task queue)
max-pool-size-min = ${workerCount}
# Max no of threads (if using a bounded task queue) is determined by
# calculating: ceil(available processors * factor)
max-pool-size-factor = 3.0
# Max number of threads to cap factor-based max number to
# (if using a bounded task queue)
max-pool-size-max = 64
# Specifies the bounded capacity of the task queue (< 1 == unbounded)
task-queue-size = -1
# Specifies which type of task queue will be used, can be "array" or
# "linked" (default)
task-queue-type = "linked"
# Allow core threads to time out
allow-core-timeout = on
fork-join-executor {
# Min number of threads to cap factor-based parallelism number to
parallelism-min = 1
# The parallelism factor is used to determine thread pool size using the
# following formula: ceil(available processors * factor). Resulting size
# is then bounded by the parallelism-min and parallelism-max values.
parallelism-factor = 3.0
# Max number of threads to cap factor-based parallelism number to
parallelism-max = 1
Here's where I create my actors (it's written in Groovy):
Props clusteredProps = ClusterSingletonManager.defaultProps("worker".toString(), PoisonPill.getInstance(), "workerSystem",
new ClusterSingletonPropsFactory(){
Props create(Object handOverData) {"called in ClusterSingetonManager")
Props.create(WorkerActorCreator.create(applicationContext, it.start, it.end)).withDispatcher("").withMailbox("")
} )
ActorRef manager = system.actorOf(clusteredProps, "worker-${it.start}-${it.end}".toString())
String path = manager.path().child("worker").toString()
When I try to send a message to the actual worker actor, should the path above resolve? Currently it does not.
What am I doing wrong? Also, these actors live within a Spring application, and the worker actors are set up with some @Autowired dependencies. While this Spring integration worked well in a non-clustered environment, are there any gotchas in a clustered environment I should be looking out for?
Thank you.
FYI: I've also posted this in the akka-user Google group.
The path in your code is to the ClusterSingletonManager actor that you start on each node with role "workerSystem". It will create a child actor (WorkerActor) with name "worker-${it.start}-${it.end}" on the oldest node in the cluster, i.e. singleton within the cluster.
You should also define the name of the ClusterSingletonManager, e.g. system.actorOf(clusteredProps, "workerSingletonManager").
You can't send the messages to the ClusterSingletonManager. You must send them to the path of the active worker, i.e. including the address of the oldest node. That is illustrated by the ConsumerProxy in the documentation.
I'm not sure you should use a singleton at all for this. All workers will be running on the same node, the oldest. I would prefer to discuss alternative solutions to your problem at the akka-user google group.

ZeroMQ High Water Mark Doesn't Work

When I read "Durable Subscribers and High-Water Marks" in the zmq guide, it said "The HWM causes ØMQ to drop messages it can't put onto the queue", but no messages were lost when I ran the example. I hit Ctrl+C to terminate the subscriber and then restarted it.
The example is from the zmq guide, in Python. Other languages behave the same.
import zmq
import time
context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.setsockopt(zmq.IDENTITY, "Hello")
subscriber.setsockopt(zmq.SUBSCRIBE, "")
sync = context.socket(zmq.PUSH)
while True:
data = subscriber.recv()
print data
if data == "END":
import zmq
import time
context = zmq.Context()
sync = context.socket(zmq.PULL)
publisher = context.socket(zmq.PUB)
publisher.setsockopt(zmq.HWM, 2)
sync_request = sync.recv()
for n in xrange(10):
msg = "Update %d" % n
The suggestion above is valid, but doesn't properly address the problem in this particular code.
The real problem here is that you call publisher.setsockopt(zmq.HWM, 2) AFTER calling publisher.bind. You should call setsockopt BEFORE bind or connect.
Please refer to 0MQ API documentation for setsockopt:
Caution: All options, with the exception of ZMQ_SUBSCRIBE, ZMQ_UNSUBSCRIBE and ZMQ_LINGER, only take effect for subsequent socket bind/connects.
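The ordering can be sketched as follows. This is only an illustration, assuming pyzmq built against libzmq 3.x or later, where the single zmq.HWM option used in the question was split into zmq.SNDHWM and zmq.RCVHWM. The endpoint name is made up for the example.

```python
import zmq

context = zmq.Context.instance()
publisher = context.socket(zmq.PUB)

# Set the send-side high-water mark BEFORE bind, so that it applies
# to the endpoint created by the bind call below.
publisher.setsockopt(zmq.SNDHWM, 2)
publisher.bind("inproc://hwm-demo")  # hypothetical endpoint name

# Read the option back to confirm the value stored on the socket.
hwm = publisher.getsockopt(zmq.SNDHWM)
print(hwm)

publisher.close()
```

Reading the option back only confirms the stored value. Whether it affects a given connection still depends on it having been set before the bind or connect that created that connection.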

Redis / RabbitMQ - Pub / Sub - Performances

I wrote a little test for a simple scenario:
One publisher and one subscriber
The publisher sends 1,000,000 messages
The subscriber receives the 1,000,000 messages
First test with RabbitMQ, fanout exchange, RabbitMQ node type RAM: 320 seconds
Second test with Redis, basic pub/sub: 24 seconds
Am I missing something? Why such a difference? Is this a configuration problem or something else?
First scenario: one node.js process for the subscriber, one for the publisher, each with one connection to RabbitMQ using the amqp node module.
Second scenario: one node.js process for the subscriber, one for the publisher, each with one connection to Redis.
Any help in understanding this is welcome. I can share the code if needed.
I'm pretty new to all of this.
What I need is a high-performance pub/sub messaging system. I'd like to have clustering capabilities.
To run my test, I just launch the RabbitMQ server (default configuration) and I use the following:
var sys = require('sys');
var amqp = require('amqp');
var nb_messages = process.argv[2];
var connection = amqp.createConnection({url: 'amqp://guest:guest@localhost:5672'});
connection.addListener('ready', function () {
exchangeName = 'myexchange';
var start = end = null;
var exchange =, {type: 'fanout'}, function(exchange){
start = (new Date()).getTime();
for(i=1; i <= nb_messages; i++){
if (i%1000 == 0){
exchange.publish("", "hello");
end = (new Date()).getTime();
console.log("Publishing duration: "+((end-start)/1000)+" sec");
var sys = require('sys');
var amqp = require('amqp');
var nb_messages = process.argv[2];
var connection = amqp.createConnection({url: 'amqp://guest:guest@localhost:5672'});
connection.addListener('ready', function () {
exchangeName = 'myexchange';
queueName = 'myqueue'+Math.random();
var queue = connection.queue(queueName, function (queue) {
queue.bind(exchangeName, "");
queue.start = false;
queue.nb_messages = 0;
queue.subscribe(function (message) {
if (!queue.start){
queue.start = (new Date()).getTime();
if (queue.nb_messages % 1000 == 0){
if (queue.nb_messages >= nb_messages){
queue.end = (new Date()).getTime();
console.log("Ending at "+queue.end);
console.log("Receive duration: "+((queue.end - queue.start)/1000));
Check to ensure that:
Your RabbitMQ queue is not configured as persistent (since that would require disk writes for each message)
Your prefetch count on the subscriber side is 0
You are not using transactions or publisher confirms
There are other things which could be tuned, but without knowing the details of your test it's hard to guess. I would just make sure that you are comparing "apples to apples".
Most messaging products can be made to go as fast as humanly possible at the expense of various guarantees (like delivery assurance, etc) so make sure you understand your application's requirements first. If your only requirement is for data to get shoveled from point A to point B and you can tolerate the loss of some messages, pretty much every messaging system out there can do that, and do it well. The harder part is figuring out what you need beyond raw speed, and tuning to meet those requirements as well.
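For scale, the two timings reported in the question can be converted into message rates. The short calculation below uses only the figures from the question (1,000,000 messages, 320 seconds and 24 seconds); the variable names are chosen for the illustration.

```python
MESSAGES = 1_000_000

rabbitmq_seconds = 320  # fanout exchange, RAM node, from the question
redis_seconds = 24      # basic pub/sub, from the question

# Throughput in messages per second for each run.
rabbitmq_rate = MESSAGES / rabbitmq_seconds
redis_rate = MESSAGES / redis_seconds

# How many times longer the RabbitMQ run took.
ratio = rabbitmq_seconds / redis_seconds

print(f"RabbitMQ: {rabbitmq_rate:.0f} msg/s")
print(f"Redis:    {redis_rate:.0f} msg/s")
print(f"Ratio:    {ratio:.1f}x")
```

That is roughly 3,100 messages per second against roughly 41,700, a gap of about 13x, which is the difference the checklist above is trying to explain.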
