Apache Storm Bolts stop processing after reading 1 million messages

I am using a Storm topology to read from a Kafka queue and emit aggregates using a single spout. When I have a single supervisor node, the topology works fine and the spout emits as expected. However, when a second supervisor node is added, the spout stops emitting. I was able to verify via the Storm UI that there are two supervisor nodes. There are no errors in supervisor.log or in the worker log files on either node.
Please help me resolve this issue.

Related

Add bolts in a running topology of storm

I deploy a storm cluster with version 1.0.2. There is a case as below:
A topology is created and submitted on the cluster to do some analysis for one client/customer. When another client also needs the same analysis, my thought is that similar bolts could be created and attached to the spout in the topology. I wonder if it's possible to create such bolts while the topology is running, so that the running analysis for the first client isn't interrupted. Is it possible?
Thanks for any comments.
Max
Altering a running topology is not possible. You need to kill the topology and re-submit it with the newly added bolt.
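A minimal sketch of what the resubmitted topology might look like, using the standard TopologyBuilder and StormSubmitter APIs from Storm 1.x (AnalysisSpout and ClientAnalysisBolt are hypothetical placeholders for your own components, as is the topology name):

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;

    public class ResubmitWithNewBolt {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // Same spout as in the currently running topology.
            builder.setSpout("events", new AnalysisSpout(), 1);
            // Existing bolt for the first client.
            builder.setBolt("client-a-analysis", new ClientAnalysisBolt("client-a"), 2)
                   .shuffleGrouping("events");
            // Newly added bolt for the second client, subscribing to the same spout.
            builder.setBolt("client-b-analysis", new ClientAnalysisBolt("client-b"), 2)
                   .shuffleGrouping("events");

            Config conf = new Config();
            conf.setNumWorkers(2);
            // Kill the old topology first (storm kill <name>), then resubmit.
            StormSubmitter.submitTopology("analysis-topology", conf, builder.createTopology());
        }
    }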

Apache Storm Flux change topology

Is it possible to change the topology layout while it is running? I would like to change the stream groupings and bolts while it is active.
Submitting the YAML file with the new topology layout fails, saying it cannot deploy because the topology is already running.
I'm using Apache Storm 0.10.0
Thanks
It is not possible to change the structure of a topology while it is running. You need to kill the topology and redeploy the new version afterwards.
The only parameter you can change while a topology is running is the parallelism. See here for more details:
V 2.0.0 - https://storm.apache.org/releases/2.0.0/Understanding-the-parallelism-of-a-Storm-topology.html
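For example, you can leave headroom for a later parallelism increase by setting more tasks than initial executors, and then adjust the executor count at runtime with the storm rebalance command. A sketch using the 0.10.0-era backtype.storm packages (SentenceSpout, CountBolt, and the topology name my-topology are hypothetical placeholders):

    import backtype.storm.topology.TopologyBuilder;

    public class RebalanceFriendlyTopology {
        public static void main(String[] args) {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("sentences", new SentenceSpout(), 1);
            // Start with 2 executors but 8 tasks: the executor count can be
            // raised at runtime with `storm rebalance`, up to the task count
            // fixed here at submission time.
            builder.setBolt("counter", new CountBolt(), 2)
                   .setNumTasks(8)
                   .shuffleGrouping("sentences");
            // Later, without redeploying:
            //   storm rebalance my-topology -n 4 -e counter=8
        }
    }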

How to submit a topology to a specific Storm worker node?

Assume that I have a Storm cluster formed of three servers, named server1, server2, and server3.
Server1 runs as the master node; server2 and server3 run as worker nodes.
When I submit a topology to server1, it always distributes the topology to run on server2.
But there is something wrong with server2 (newly submitted topologies run but don't truly work, and I don't know why), so I want to change the server the topologies run on.
And here is my question:
How can I submit my topologies to a specific worker server?
I guess you are confusing workers with supervisors. A supervisor runs on each node in your cluster and is started when the Storm cluster starts up. Workers are started by supervisors when a topology is submitted. You can configure the worker slots available on each supervisor in storm.yaml (via supervisor.slots.ports). Nimbus communicates with the supervisors only (via Zookeeper): see https://storm.apache.org/documentation/Tutorial.html
Furthermore, you can implement a custom scheduler in Storm and thus influence which nodes (i.e., supervisors) a topology is assigned to.
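A rough sketch of such a custom scheduler, which blacklists the problematic node before delegating to Storm's built-in even scheduler (this follows the org.apache.storm.scheduler interfaces from Storm 1.x; exact signatures vary between versions, so treat this as an outline rather than a drop-in implementation):

    import java.util.Map;
    import org.apache.storm.scheduler.Cluster;
    import org.apache.storm.scheduler.EvenScheduler;
    import org.apache.storm.scheduler.IScheduler;
    import org.apache.storm.scheduler.SupervisorDetails;
    import org.apache.storm.scheduler.Topologies;

    public class AvoidServer2Scheduler implements IScheduler {
        @Override
        public void prepare(Map conf) { }

        @Override
        public void schedule(Topologies topologies, Cluster cluster) {
            // Remove the broken node from consideration for assignments.
            for (SupervisorDetails supervisor : cluster.getSupervisors().values()) {
                if ("server2".equals(supervisor.getHost())) {
                    cluster.blacklistHost(supervisor.getHost());
                }
            }
            // Delegate the actual assignment work to the default scheduler.
            new EvenScheduler().schedule(topologies, cluster);
        }
    }

The scheduler is enabled by setting storm.scheduler to its fully qualified class name in Nimbus's storm.yaml.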
Hope this helps.

Apache Storm - spout and bolts not present in Storm UI

I am developing a Storm topology locally, using Storm 0.9.2-incubating, and have written a simple topology. When I deploy it using the LocalCluster() option, it works fine, but it does not show up in my Storm UI; it just executes.
When I deploy it regularly, the topology shows up in my Storm UI, but no spouts or bolts are visible when I click on it.
I have also tried this with the example WordCountTopology that comes with many storm-starter projects. The same behavior occurs.
My question is really: why are the spouts and bolts not showing up? If you deploy a topology locally without using the LocalCluster() option, will that cause problems? Is it possible to deploy a topology on my local box, see it in the Storm UI with all the spouts and bolts, and have it not execute immediately but wait for something such as a Kafka message?
Are you running the Storm supervisor? If you deploy a new topology and the supervisor isn't running, the topology will show up in the UI, but since it is never initialized it doesn't show any stats when you click into it.
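To make the distinction concrete, here is a minimal sketch of the two deployment paths using the 0.9.2-era backtype.storm APIs (WordSpout and CountBolt are hypothetical placeholders). LocalCluster runs everything in-process and never registers with the Storm UI; StormSubmitter requires the nimbus, supervisor, and ui daemons to be running:

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class DeployModes {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new WordSpout(), 1);
            builder.setBolt("count", new CountBolt(), 2).shuffleGrouping("words");

            Config conf = new Config();
            if (args.length > 0) {
                // Cluster mode: visible in the Storm UI, needs running daemons.
                conf.setNumWorkers(2);
                StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
            } else {
                // Local mode: in-process only, never appears in the Storm UI.
                LocalCluster cluster = new LocalCluster();
                cluster.submitTopology("local-test", conf, builder.createTopology());
            }
        }
    }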

Need help regarding Storm

1) What happens if Nimbus fails? Can we convert some other node into a Nimbus?
2) Where is the output of a topology stored? When a bolt emits a tuple, where is it stored?
3) What happens if Zookeeper fails?
Nimbus is itself a fault-tolerant process: it doesn't store its state in memory but in an external database (Zookeeper). So if Nimbus crashes (an unlikely scenario), on the next start it will resume processing right where it stopped. Nimbus usually should be set up to be monitored by an external monitoring system, such as Monit, which checks the Nimbus process state periodically and restarts it if any problem occurs. I suggest you read the Storm project's wiki for further information.
Nimbus is the master node of a Storm cluster, and it isn't possible to have multiple Nimbus nodes. (Update: as of 5/2014, the Storm community is actively working on making the Nimbus daemon fault tolerant in a failover manner, with multiple Nimbuses heartbeating each other.)
The tuple is "stored" in the tuple tree and is passed to the next bolt in the topology's execution chain as processing progresses. As for physical storage, tuples are held in in-memory structures and serialized as necessary to be distributed among the cluster's nodes. The complete Storm cluster state itself is stored in Zookeeper. Storm doesn't concern itself with persistent storage of a topology's or a bolt's output; it is your job to persist the results of the processing.
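For example, persisting results is typically done explicitly in a bolt's execute() method. Below is a sketch using the standard BaseRichBolt API from the 0.9.x-era backtype.storm packages, where KeyValueStore is a hypothetical stand-in for your database client:

    import java.util.Map;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    // Terminal bolt that writes results to an external store.
    public class PersistingBolt extends BaseRichBolt {
        private OutputCollector collector;
        private transient KeyValueStore store;  // hypothetical database client

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.store = KeyValueStore.connect();  // open one connection per worker
        }

        @Override
        public void execute(Tuple tuple) {
            // Persist the result yourself; Storm will not do it for you.
            store.put(tuple.getString(0), tuple.getValue(1));
            collector.ack(tuple);  // ack so the tuple tree can complete
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: no downstream streams declared.
        }
    }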
As with Nimbus, Zookeeper in a real, production Storm cluster must be configured for reliability, and for Zookeeper that means having an odd number of Zookeeper nodes running on different servers. You can find more information on configuring a Zookeeper production cluster in the Zookeeper Administrator's Guide. If Zookeeper were to fail (although a highly unlikely scenario in a properly configured Zookeeper cluster), the Storm cluster wouldn't be able to continue processing, since all of the cluster's state is stored in Zookeeper.
Regarding question 1), this bug report and a subsequent comment from Storm author and maintainer Nathan Marz clarify the issue:
Storm is not designed for having topologies partially running. When you bring down the master, it is unable to reassign failed workers. We are working on Nimbus failover. Nimbus is fault-tolerant to the process restarting, which has made it fault-tolerant enough for our and most people's use cases.
