I have an EvaluationBolt (for e.g. memory monitoring) and I want to make sure one executor for it runs on every worker process (which in my case is one per physical node, i.e. supervisor.slots.ports is configured to only port 6700). On the topic I found this question:
How bolts and spouts are shared among workers?
But it does not state how and if I myself can control distribution of bolts and spouts. Can one somehow configure the scheduler manually?
Cheers,
Tomi
The complicated and correct route is to write a Storm scheduler: http://xumingming.sinaapp.com/885/twitter-storm-how-to-develop-a-pluggable-scheduler/.
What I've also found is that the Storm scheduler by default performs round-robin scheduling between hosts, so most of the time you can just use the built-in scheduler to distribute your tasks equally over all hosts.
Related
I have just started learning about Apache Storm. One thing that I cannot understand is whether the entire topology is replicated on atleast one worker process on a supervisor node. If it is the case, then a component in the topology which is very compute intensive (and possibly gives better(performance) executed on a single machine by itself), is a potential bottleneck? If not, I assume Nimbus in a way "distributes" parts of topology across the cluster. How does it know how to optimally "distribute" the topology?
Storm does not replicate a topology. If you deploy a topology, all executor threads are distributed evenly over all worker nodes (using a round-robin scheduling mechanism). the number of worker nodes a topology can use, can be configured via Config.setNumWorkers(int);.
If you have a compute intensive bolt and you want to ensure that it is deployed to an own worker, you would need to implement a custom scheduler. See her for more details: https://xumingming.sinaapp.com/885/twitter-storm-how-to-develop-a-pluggable-scheduler/
I read many sites related to Storm.
But still I cannot map topology into storm cluster perfectly.
Please help me to understand this.
In storm cluster there are terms like
Supervisor
Worker node
Worker processor
Workers
Slots
Executer
Tasks
In topology, there are
Spout
Bolt
Also there is possible to configure
numWorkers
parallelism
So anyone please relate all these thing to help me.
I want to know like, each spout/bolt is act executer or is it the task.
If parallelism hint is given, the count of which entity will increase.
If num workers set, which one's count is that.
All these things to map with storm cluster.
I already worked in a project. So I know the topology.
Physical Cluster Setup:
The term node usually refers to a physical machine (or a VM) in your cluster. On each node a supervisor is running in an own JVM. Supervisor have worker slots. This is a logical configuration and tells how many worker can be started by a supervisor. Each worker (if started) runs in an own JVM (thus, some people call it worker process). In summary: on a node there is one supervisor JVM and up to number-of-worker-slots worker JVMs. Therefore, the node a worker JVM is running on, can be called worker node. While the supervisor is running all the time, workers are started if needed, ie, if topologies are deployed, and stopped when a topology is killed. Within a worker, executors are running as threads (ie, each executor maps to an own thread).
Logical Topology Setup:
Topologies are build out of Spouts (also called sources, ie, operators with no incoming data stream) and Bolts (regular operators with at least one incoming data stream and any number of outgoing data streams -- if there is no outgoing data stream, a Bolt is also called sink). For each Spout/Bolt you can configure two parameters:
the number of tasks
the dop (degree of parallelism, called parallelism_hint), ie, the number of executors you want to have for a Spout/Bolt
Tasks are logical unit of works (ie, something passive). Let's assume you use fieldsGrouping connection pattern. Thus, the data stream is partitioned into number-of-tasks many sub-streams. Tasks are assigned to executors, ie, each executor processes one or multiple tasks. This implies, that you cannot have less tasks than executors (ie, parallelism); otherwise, there would be a thread without any work to do.
See the Storm documentation for further details (https://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html). Furthermore, there are many other question on SO about task/executors in Storm.
Last but not least, you can configure the numberOfWorkers for a topology. This parameter indicates, how many workers should be started to run the topology. The overall number of executors for a topology is the sum of dops over all Spouts/Bolts. All executors will be evenly distributes over all available worker JVMs.
Furthermore, a single worker can only run executors of a single topology. This is done for fault-tolerance reasons, ie, topologies are isolated from each other. At the same time, a worker itself can run any number of executors.
We have the Storm configured in a single node development server with most of the configurations set to default (not local mode).
Having storm nimbus, supervisor and workers running in that single node only and UI also configured.
AFAIK parallelism and configuration differs from topology to topology.
I think finding the right parallelism and configuration is by trial and error method only.
So, to find the best parallelism we have started testing our Storm topology with various configurations in a single node.
Strangely the results are unexpected:
Our topology processes stream of xml files from HDFS directory.
Having a single spout (Parallelism always 1) and four bolts.
Single worker
Whatever the topology parallelism we get the almost same performance results (the rate of data processed)
Multiple workers
Whatever the topology parallelism we get the similar performance as of single worker until sometime (most of the cases it is 10 minutes).
But after that complete topology gets restarted without any error traces.
We had observed that Whatever data processed in 20 minutes with single worker took 90 minutes with 5 workers having the same parallelism.
Also Topology had restarted 7 times with 5 workers.
And CPU usage is relatively high.
(Someone else also had faced this topology restart issue http://search-hadoop.com/m/LrAq5ZWeaU but no answer)
After testing many configurations we found that single worker with less no of parallelism (each bolt with 2 or 3 instances) works better than high parallelism or more no of workers.
Ideally the performance of Storm topology should be better with more no workers/ parallelism.
Apparently this rule is not holding good here.
why can't we set more than a single worker in a single node?
What are the maximum no of workers can be run in a single node?
What are the Storm configurations changes that are need to scale the performance? (I have tried nimbus.childopts and worker.childopts)
If your CPU usage is high on the one node then you're not going to get any better performance as you increase parallelism. If you do increase parallelism, there will just be greater contention for a constant number of CPU cycles. Not knowing any more about your specific topology, I can only suggest that you look for ways to reduce the CPU usage across your bolts and spouts. Only then can you would it make sense to add more bolt and spout instances.
In the following, I refer to this article: Understanding the Parallelism of a Storm Topology by Michael G. Noll
It seems to me that a worker process may host an arbitrary number of executors (threads) to run an arbitrary number of tasks (instances of topology components). Why should I configure more than one worker per cluster node?
The only reason I see is that a worker can only run a subset of at most one topology. Hence, if I want to run multiple topologies on the same cluster, I would need to configure the same number of workers per cluster node as the number of topologies to be run.
(Example: This is because I would want to be flexible in case that some cluster nodes fail. If for example, only one cluster node remains I need at least as many worker processes as topologies running on that cluster in order to keep all topologies running.)
Is there any other reason? Especially, is there any reason to configure more than one worker per cluster node if running only one topology? (Better fail-safety, etc.)
To balance the costs of a supervisor daemon per node, and the risk of impact of a worker crashing. If you have one large, monolithic worker JVM, one crash impacts everything running in that worker, as well as bad behaving portions of your worker impact more residents. But by having more than one worker per node, you make your supervisor more efficient, and now have a bulkhead pattern somewhat, keeping from the all or nothing approach.
The shared resources I refer to could be yours or storm's; several pieces of storm's architecture are shared per JVM, and could create contention problems. I refer to the receive and send threads, and underlying network pieces, specifically. Documented here.
I'm trying to learn twitter storm by following the great article "Understanding the parallelism of a Storm topology"
However I'm a bit confused by the concept of "task". Is a task an running instance of the component(spout or bolt) ? A executor having multiple tasks actually is saying the same component is executed for multiple times by the executor, am I correct ?
Moreover in a general parallelism sense, Storm will spawn a dedicated thread(executor) for a spout or bolt, but what is contributed to the parallelism by an executor(thread) having multiple tasks ? I think having multiple tasks in a thread, since a thread executes sequentially, only make the thread a kind of "cached" resource, which avoids spawning new thread for next task run. Am I correct?
I may clear those confusion by myself after taking more time to investigate, but you know, we both love stackoverflow ;-)
Thanks in advance.
Disclaimer: I wrote the article you referenced in your question above.
However I'm a bit confused by the concept of "task". Is a task an running instance of the component(spout or bolt) ? A executor having multiple tasks actually is saying the same component is executed for multiple times by the executor, am I correct ?
Yes, and yes.
Moreover in a general parallelism sense, Storm will spawn a dedicated thread(executor) for a spout or bolt, but what is contributed to the parallelism by an executor(thread) having multiple tasks ?
Running more than one task per executor does not increase the level of parallelism -- an executor always has one thread that it uses for all of its tasks, which means that tasks run serially on an executor.
As I wrote in the article please note that:
The number of executor threads can be changed after the topology has been started (see storm rebalance command).
The number of tasks of a topology is static.
And by definition there is the invariant of #executors <= #tasks.
So one reason for having 2+ tasks per executor thread is to give you the flexibility to expand/scale up the topology through the storm rebalance command in the future without taking the topology offline. For instance, imagine you start out with a Storm cluster of 15 machines but already know that next week another 10 boxes will be added. Here you could opt for running the topology at the anticipated parallelism level of 25 machines already on the 15 initial boxes (which is of course slower than 25 boxes). Once the additional 10 boxes are integrated you can then storm rebalance the topology to make full use of all 25 boxes without any downtime.
Another reason to run 2+ tasks per executor is for (primarily functional) testing. For instance, if your dev machine or CI server is only powerful enough to run, say, 2 executors alongside all the other stuff running on the machine, you can still run 30 tasks (here: 15 per executor) to see whether code such as your custom Storm grouping is working as expected.
In practice we normally we run 1 task per executor.
PS: Note that Storm will actually spawn a few more threads behind the scenes. For instance, each executor has its own "send thread" that is responsible for handling outgoing tuples. There are also "system-level" background threads for e.g. acking tuples that run alongside "your" threads. IIRC the Storm UI counts those acking threads in addition to "your" threads.