Apache Storm tuple timed out after 10 minutes but topology.message.timeout.secs is configured as 5 minutes

We have a topology with topology.message.timeout.secs = 300 secs.
Recently, we ran into an issue where the first bolt after the spout reached a capacity of ~2.
The bolt started processing tuples very slowly (it only began processing them 5 minutes after the spout emitted them).
After a few minutes, the spout would emit tuples, but before the bolt could start processing them, the tuples timed out.
The strange part is that the time difference between when a tuple failed and when it was emitted is 10 minutes.
The expectation was that the tuple should have failed after 5 mins (300 secs configured).
Any thoughts on why the configured timeout was not honored would be very helpful. Is there any other configuration that can affect the tuple timeout?

Related

My topology's processing rate is about 2500 messages per second, but Complete latency is about 7 ms. Shouldn't it equal 1000 / 2500 = 0.4 ms?

My topology reads from RabbitMQ, and its processing rate is about 2500 messages per second, but Complete latency is about 7 ms. Shouldn't it equal 1000 / 2500 = 0.4 ms?
Topology summary:
Please help me understand what the Complete latency parameter means in my case.
The topology processes messages from a RabbitMQ queue at a rate of about 2500/sec.
RabbitMQ screenshot:
According to the Storm docs, the complete latency is just for spouts: it is the average amount of time it took for ack or fail to be called for a tuple after it was emitted.
So it is the time between your RabbitMQ spout emitting a tuple and the last bolt acking it.
Storm has an internal queue used to apply backpressure; the maximum size of this queue is defined by the topology.max.spout.pending setting. If you set it to a high value, your RabbitMQ consumer will read messages from RabbitMQ to fill this queue ahead of the actual processing by the bolts in the topology, which skews the measured latency of your topology.
In the RabbitMQ panel you see how fast messages are consumed from the queue, not how fast they are processed; you are comparing apples and oranges.
To measure latency I would recommend running your topology for a couple of days; the 202 seconds shown in your screenshot is too short a window.
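If you want the complete latency to reflect actual processing rather than time spent queued, one option while measuring is to keep topology.max.spout.pending small. A minimal sketch, assuming the standard org.apache.storm.Config setter (the value is a placeholder):

import org.apache.storm.Config;

Config conf = new Config();
// Keep only a few tuples in flight while measuring, so tuples are not sitting
// in internal queues and inflating the complete latency.
conf.setMaxSpoutPending(50);   // topology.max.spout.pending (placeholder value)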

Apache Storm UI window

In the Apache Storm UI, Window specifies "The past period of time for which the statistics apply", so it may be 10 min, 3 h, or 1 day. When a topology is running, is the number of tuples emitted/transferred computed over this window? The UI shows "10 min" statistics even before the topology has actually been running for 10 minutes, which doesn't make sense to me.
For example: emitted = 1764260 tuples, so would the tuple emission rate be 1764260 / 600 ≈ 2940 tuples/sec?
It does not display the average, it displays the total number of tuples emitted in the last period of time (10 min, 3h or 1 day).
Therefore, if you started the application 2 minutes ago, it will display all tuples emitted in the last two minutes, and you'll see the number increase until you get to 10 minutes.
After 10 minutes, it will only show the number of tuples emitted in the last 10 minutes, not an average of the tuples emitted. So if, for example, you started the application 30 minutes ago, it will display the number of tuples emitted between minutes 20 and 30.
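As a rough illustration of turning that total into a rate (the numbers come from the question, and the uptime is an assumption):

// The UI shows a running total for the window, so an approximate emission rate
// is the total divided by the time span the window actually covers.
long emitted = 1_764_260L;   // "emitted" counter shown for the 10-minute window
long windowSecs = 600;       // 10-minute window
long uptimeSecs = 1_800;     // example: topology has been up for 30 minutes
double ratePerSec = (double) emitted / Math.min(windowSecs, uptimeSecs);  // ≈ 2940 tuples/sec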

Tuples failing at the spout, and it seems they are not even reaching the bolt

I have a topology that has been running for a few days now, and it started failing tuples over the last couple of days. From the logs it seems that the tuples are not reaching the bolts; attached is the Storm UI screenshot.
I am acking the tuples in a finally block in my code, so there is no case of unacked tuples, and the timeout is set to 10 sec, which is much higher than the times shown in the UI.
Any hints?
The log you're seeing is simply the Kafka spout telling you that it has fallen too far behind, and it has started skipping tuples.
I believe only acked tuples count for the complete latency metric https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/storm-client/src/jvm/org/apache/storm/stats/SpoutExecutorStats.java#L54. Failed tuples don't (how would Storm know the actual latency of tuples that time out?), so the complete latency you're seeing is only for the initial couple of acked tuples.
I think what's happening is that your tuples are reaching the bolt, and then either you're not acking them (or acking them more than once), or the tuples are taking too long to process so they time out while queued up for the bolt. Keep in mind that the tuple timeout starts when the spout emits the tuple, so time spent in the bolt's input queue counts. Since your initial couple of tuples are taking a while to process, I think the bolt queue gets backed up with tuples that are already timed out. The bolt doesn't discard tuples that are timed out, so the queued timed out tuples are preventing fresh tuples from being processed in time.
I'd raise the tuple timeout, and also cap the number of pending tuples by setting topology.max.spout.pending to whatever you think is reasonable (something like the number of tuples you think you can process within the timeout).
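For reference, a minimal sketch of applying both settings through the topology config, assuming the standard org.apache.storm.Config setters (the values are placeholders to tune, not recommendations):

import org.apache.storm.Config;

Config conf = new Config();
conf.setMessageTimeoutSecs(60);   // topology.message.timeout.secs: cover worst-case time in the bolt's input queue
conf.setMaxSpoutPending(500);     // topology.max.spout.pending: roughly what you can process within the timeout

With max spout pending capped, the spout stops emitting once that many tuples are in flight, so the bolt's input queue can no longer fill up with tuples that have already expired.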

Apache Storm UI capacity metric

How is 'capacity' calculated?
From their documentation
The "capacity" metric is very useful and tells you what % of the time in the last 10 minutes the bolt spent executing tuples. If this value is close to 1, then the bolt is "at capacity" and is a bottleneck in your topology. The solution to at-capacity bolts is to increase the parallelism of that bolt.
I don't quite understand % of time. So if the value is 0.75 - what does it really mean?
It's the percentage of time that the bolt is busy rather than idle. A value of 0.75 means that 25% of the time the bolt is waiting for new data to process.
Let's say you receive a new input tuple every second but your bolt takes 0.1 seconds to process it; the bolt will be idle 90% of the time and the capacity will be 0.1.
Another example: imagine you receive more data in real time than you can process, and you have two bolts where the work done by the first bolt takes more time than the second, so the first bolt is your bottleneck. The capacity of the first bolt will be around 1 and the capacity of the second will be below 1.
In both examples, you can determine the parallelism (or processing power) that you need to set up for each bolt by looking at this number.
If the first bolt's capacity is 1 and the second's is 0.5, you probably want to assign twice as many executors to the first bolt as to the second. At the same time (and most importantly), you have to increase the number of executors until that bolt's capacity is below 1, so you are sure that your topology is able to keep up and process all the incoming data in real time.
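As a back-of-the-envelope check, the capacity number lines up with the other per-bolt columns in the UI. The figures below are made up, and the formula is the commonly cited approximation rather than a quote of Storm's internal code:

// capacity ≈ (tuples executed in the window * average execute latency) / window length
long executedInWindow = 450_000;   // "executed" column (assumed)
double executeLatencyMs = 1.0;     // "execute latency" column, in ms (assumed)
double windowMs = 600_000.0;       // 10-minute window
double capacity = executedInWindow * executeLatencyMs / windowMs;  // = 0.75, i.e. busy 75% of the time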

Storm topology processing slowing down gradually

I have been reading about Apache Storm and have tried a few examples from storm-starter. I have also learnt how to tune a topology and how to scale it to perform fast enough to meet the required throughput.
I have created an example topology with acking enabled, and I am able to achieve 3K-5K messages processed per second. It performs really fast for the initial 10 to 15 minutes, or around 1-2 million messages, and then it starts slowing down. On the Storm UI, I can see the overall latency start going up gradually and never come back down; after a while the processing drops to only a few hundred messages a second. I am getting exactly the same behavior for all the topologies I tried. The simplest one just reads from Kafka using KafkaSpout, sends the messages to a transform bolt that parses them, and writes them back to Kafka using KafkaBolt. The parser is very fast, taking less than a millisecond per message. I have tried a few options such as increasing/decreasing the parallelism and changing the buffer sizes, but I see the same behavior. Please help me find the reason for the gradual slowdown in the topology. Here is the config I am using:
1 Nimbus machine (4 CPU) 24GB RAM
2 Supervisor machines (8CPU) and using 1 thread per core with 24GB RAM
4 Node kafka cluster running on above 2 supervisor machines (each topic has 4 partitions)
KafkaSpout(2 parallelism)-->TransformerBolt(8)-->KafkaBolt(2)
topology.executor.receive.buffer.size: 65536
topology.executor.send.buffer.size: 65536
topology.spout.max.batch.size: 65536
topology.transfer.buffer.size: 32
topology.receiver.buffer.size: 8
topology.max.spout.pending: 250
At the start
After a few minutes
After 45 min - latency started going up
After 80 min - latency keeps going up, reaching about 100 seconds by the time it hits 8-10 million messages
VisualVM screenshot
Threads
Pay attention to the capacity metric on RT_LEFT_BOLT: it is very close to 1, which explains why your topology is slowing down.
From the Storm documentation:
The Storm UI has also been made significantly more useful. There are new stats "#executed", "execute latency", and "capacity" tracked for all bolts. The "capacity" metric is very useful and tells you what % of the time in the last 10 minutes the bolt spent executing tuples. If this value is close to 1, then the bolt is "at capacity" and is a bottleneck in your topology. The solution to at-capacity bolts is to increase the parallelism of that bolt.
Therefore, your solution is to add more executors (and tasks) to that bolt (RT_LEFT_BOLT). Another thing you can do is reduce the number of executors on RT_RIGHT_BOLT; its capacity indicates you don't need that many executors, and probably 1 or 2 can do the job.
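A sketch of what that rebalancing might look like when wiring the topology; the bolt ids come from the screenshot, but the spout id, component classes, and groupings are placeholders for the asker's actual code:

import org.apache.storm.topology.TopologyBuilder;

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", new MyKafkaSpout(), 2);    // placeholder spout
builder.setBolt("RT_LEFT_BOLT", new RtLeftBolt(), 8)       // bottleneck bolt: give it more executors
       .shuffleGrouping("kafka-spout");
builder.setBolt("RT_RIGHT_BOLT", new RtRightBolt(), 2)     // capacity well below 1: fewer executors suffice
       .shuffleGrouping("RT_LEFT_BOLT");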
The issue was due to the GC settings for the new generation: the JVM was not using the allocated heap completely, so Storm's internal queues were filling up and running out of memory. The strange thing was that Storm did not throw an out-of-memory error; it just stalled. With the help of VisualVM I was able to trace it down.
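For anyone hitting the same thing: worker JVM flags, including new-generation sizing, are usually passed through topology.worker.childopts. The flags below are placeholders to show where such tuning lives, not the exact values that fixed this setup:

import org.apache.storm.Config;

Config conf = new Config();
// Placeholder heap / new-gen sizes; tune them against what VisualVM shows for your workers.
conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx4g -Xmn1g -XX:+PrintGCDetails");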
