My topology's processing rate is about 2500 messages per second, but complete latency is about 7ms. Shouldn't it equal 1000 / 2500 = 0.4ms? - apache-storm

My topology reads from RabbitMQ, and its processing rate is about 2500 messages per second, but complete latency is about 7ms. Shouldn't it equal 1000 / 2500 = 0.4ms?
Topology summary:
Please help me understand what the Complete latency parameter means in my case.
The topology processes messages from the RabbitMQ queue at a rate of about 2500/sec.
RabbitMQ screenshot:

According to the Storm docs, "The complete latency is just for spouts. It is the average amount of time it took for ack or fail to be called for a tuple after it was emitted."
So it is the time between the moment your RabbitMQ spout emitted a tuple and the moment the last bolt acked it.
Storm has an internal queue to apply backpressure; its maximum size is defined by the topology.max.spout.pending config variable. If you set it to a high value, your RabbitMQ consumer will read messages from RabbitMQ to fill this queue ahead of the actual processing done by the bolts in the topology, which distorts the measured latency of your topology.
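One way to reconcile your two numbers is Little's law (assuming the topology is in a steady state): throughput and complete latency are linked through the number of tuples in flight, not as reciprocals of one another.
in-flight tuples = throughput × complete latency ≈ 2500 msg/s × 0.007 s ≈ 17.5 tuples
Latency would only equal 1 / throughput if exactly one tuple were in flight at a time; with pipelining across spout and bolt executors, many tuples move through the topology concurrently, so 2500 msg/s and 7 ms are perfectly consistent.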
In the RabbitMQ panel you see how fast messages are consumed from the queue, not how fast they are fully processed, so you are comparing apples and oranges.
To measure latency I would recommend running your topology for a couple of days; the 202 seconds shown in your screenshot is too short a window.

Related

why does SwiftMQ show flow control behaviour even when flow control is disabled?

I'm trying to benchmark the performance of SwiftMQ 5.0.0 with a producer and a consumer application I wrote, so that I can vary the number of producer threads and consumer threads. I have added a delay on the consumer to simulate the time taken to process a message. I ran a test with the producer threads fixed at 2, varying the number of consumer threads from 20 to 92 in steps of 4.
Initially, the producer rate starts high and the consumer rate is low (as expected, due to the added delay and the small number of consumer threads).
As the number of consumer threads increases, the producer rate drops and the consumer rate increases, and they become equal at around 48 consumer threads.
After that, as the number of consumer threads increases further, both producer and consumer rates keep increasing linearly. I am wondering what the reason for this behavior is?
See this image for the result graph.
Notes:
I have disabled flow control at the queue level by setting flowcontrol-start-queuesize="-1".
I also have not set a value for inbound-flow-control-enabled in the routing swiftlet (I believe it defaults to false).
Any help on this matter is much appreciated. TIA

Apache Storm: throttling spout based on configuration

My topology reads from Kafka and makes an HTTP call to an external system. The ingestion rate into Kafka is about 200 messages per second. The external system only supports 20 HTTP calls per second. How can I introduce throttling so that the bolt that makes HTTP calls processes only 20 messages per second?
You can use the topology.max.spout.pending setting to throttle the spout based on how many tuples are in flight in the topology. The setting is per spout instance, so if you have e.g. 10 spout executors and you set a max of 100 tuples, you will get a max of 1000 tuples in the topology.
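As a minimal sketch of that setting (assuming the org.apache.storm package layout of Storm 1.x; the number is a placeholder):
import org.apache.storm.Config;

Config conf = new Config();
// Cap un-acked tuples per spout task: with 10 spout executors and a cap
// of 100 you get at most ~1000 tuples in flight, as described above.
conf.setMaxSpoutPending(100);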
You can use the resetTimeout method on the OutputCollector to keep tuples you want to postpone from failing due to timeout.
This being said, you probably need to batch up your messages into larger bundles. If you can only process 20 messages per second, and you have an input of 200 per second, you will start falling behind and never catch up.
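If you need a hard cap on the HTTP call rate inside the bolt itself, one option (my own suggestion, not something from the answer above) is a token-bucket limiter such as Guava's RateLimiter. A rough sketch, where callExternalSystem is a hypothetical HTTP helper:
import com.google.common.util.concurrent.RateLimiter;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;
import java.util.Map;

public class ThrottledHttpBolt extends BaseBasicBolt {
    // RateLimiter is not serializable, so create it in prepare(), not the constructor.
    private transient RateLimiter limiter;

    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        limiter = RateLimiter.create(20.0); // 20 permits per second
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        limiter.acquire();             // blocks until a permit is available
        // callExternalSystem(tuple);  // hypothetical HTTP call
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {}
}
Note that acquire() deliberately blocks the executor thread; combined with a modest max.spout.pending this throttles the whole pipeline instead of letting tuples time out.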

High latency between spout -> bolt and bolt -> bolts

In my topology I see around 1-2 ms of latency when transferring tuples from spouts to bolts or from bolts to bolts. I am calculating latency using nanosecond timestamps, which works because the whole topology runs inside a single worker.
The topology runs in a cluster on production-grade hardware.
To my understanding, tuples need not be serialized/deserialized in this case since everything is inside a single JVM. I have set the parallelism hint for most spouts and bolts to 5, and the spouts only produce events at a rate of 100 per second. I don't think the high latency is due to queuing of events, because I don't see any increase in latency over time. There is no memory increase either. Log levels are set to ERROR. CPU usage is in the range of 200 to 300%.
What could be causing this latency? I was expecting only a few microseconds for tuple transfer.
I'm going to assume you're using one of the released Storm versions, and not 2.0.0-SNAPSHOT, since the queueing implementation has changed in that version.
I think it's likely that the delay is because Storm batches up tuples before delivering them to the consumer. Take a look at https://github.com/apache/storm/blob/v1.2.1/storm-core/src/jvm/org/apache/storm/utils/DisruptorQueue.java#L247, and also at the Flusher class in that file. When a spout/bolt publishes a tuple, it is put into the _currentBatch list. It stays there until either enough tuples have been received that the batch is "big enough" (you can look at the _inputBatchSize variable to figure out when this is), or until the Flusher is triggered (by default once per millisecond).
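If you want to trade some throughput for lower latency, the batching is configurable. A hedged sketch, assuming the Storm 1.x config keys topology.disruptor.batch.size (default 100) and topology.disruptor.batch.timeout.millis (default 1):
import org.apache.storm.Config;

Config conf = new Config();
conf.put("topology.disruptor.batch.size", 1);           // flush after every tuple
conf.put("topology.disruptor.batch.timeout.millis", 1); // background flusher interval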

how to understand the arrival rate of the apache storm disruptor queue

Regarding Storm metrics: I do not understand the relationship between the send queue (SQ) arrival rate and the receive queue (RQ) arrival rate.
For example, with acking enabled, if a spout receives one tuple and emits one tuple, is RQ arrival rate : SQ arrival rate = 1:2?
Also, if the system is not stable, can this ratio change?
Spout instances in Storm do not have a receive queue (only a send queue), so I assume you are referring to bolts?
Although it is a little old, this article by Michael Noll gives a good overview of the internal queues within the workers.
To answer your question: the ratio between the queues will not always be 2:1. The disruptor queues report their metrics averaged over the user-configurable topology.builtin.metrics.bucket.size.secs period, so this will obscure some of the difference. Also, all metrics are subject to a sample ratio, set by the topology.stats.sample.rate config variable (by default only 5% of transferred tuples are sampled), which can also cause the reported numbers to be off.
Also, depending on the code in your bolts, one input tuple may produce many output tuples, so you would have to take this into account in any ratios you calculate.
You refer to the stability of an equation in your question. The arrival rate is not based on any queuing-theory equation; it is simply the number of tuples put onto the queue during a metrics bucket period, divided by the period length in seconds. Storm does, however, report a queue sojourn time metric. This is based on a very simple queuing-theory equation that is not reliable for unstable queue systems and should be avoided.
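As a concrete illustration (the numbers are made up): with topology.builtin.metrics.bucket.size.secs set to 60, a queue that received 6000 tuples during the bucket reports an arrival rate of 6000 / 60 = 100 tuples/s, regardless of whether those tuples arrived evenly or in a single burst.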

Storm latency caused by ack

I was using kafka-storm to connect Kafka and Storm. I have 3 servers running ZooKeeper, Kafka and Storm. There is a topic 'test' in Kafka that has 9 partitions.
In the Storm topology, the number of KafkaSpout executors is 9 and, by default, the number of tasks should be 9 as well. The 'extract' bolt is the only bolt connected to the KafkaSpout, the 'log' spout.
From the UI, there is a huge failure rate in the spout. However, the number of executed messages in the bolt = the number of emitted messages - the number of failed messages in the bolt. This equation almost holds at the beginning, while the number of failed messages is still zero.
Based on my understanding, this means that the bolt did receive the messages from the spout, but the ack signals are stuck in flight. That's the reason why the number of acks in the spout is so small.
This problem might be solved by increasing the timeout seconds and the spout pending message count, but this will cause more memory usage and I cannot increase them indefinitely.
I was wondering if there is a way to force Storm to ignore the acks for some spout/bolt, so that it does not wait for that signal until the timeout. This should increase the throughput significantly, though it would no longer guarantee message processing.
If you set the number of ackers to 0, Storm will automatically ack every tuple.
config.setNumAckers(0); // 0 acker executors: tuples are acked as soon as they are emitted
Please note that by default the UI only measures and shows 5% of the data flow, unless you set
config.setStatsSampleRate(1.0d); // sample 100% of tuples
Try increasing the bolt's timeout and reducing topology.max.spout.pending, as in the sketch below.
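A minimal sketch of both knobs (the values are placeholders, not recommendations):
config.setMessageTimeoutSecs(60); // raise the tuple timeout (default is 30s)
config.setMaxSpoutPending(500);   // cap un-acked tuples per spout task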
Also, make sure the spout's nextTuple() method is non-blocking and optimized.
I would also recommend profiling the code; your Storm queues may be filling up, in which case you need to increase their sizes.
config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);            // worker transfer queue size
config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384); // executor receive queue size
config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);    // executor send queue size
Your capacity numbers are a bit high, leading me to believe that you're really maximizing the use of system resources (CPU, memory). In other words, the system seems to be bogged down a bit and that's probably why tuples are timing out. You might try using the topology.max.spout.pending config property to limit the number of inflight tuples from the spout. If you can reduce the number just enough, the topology should be able to efficiently handle the load without tuples timing out.
