I am trying to put Kafka data through Storm into HDFS and Hive. I am working with Hortonworks. Therefore I have the following structure, as seen (slightly modified) in many tutorials (http://henning.kropponline.de/2015/01/24/hive-streaming-with-storm/):
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", kafkaSpout);
builder.setBolt("hdfs-bolt", hdfsBolt).globalGrouping("kafka-spout");
builder.setBolt("parse-bolt", new ParseBolt()).globalGrouping("kafka-spout");
builder.setBolt("hive-bolt", hiveBolt).globalGrouping("parse-bolt");
I send the kafka-spout data directly to the hdfs-bolt, which works when I only use the hdfs-bolt. When I add the parse-bolt to parse the Kafka data and emit it to the hive-bolt, the whole system goes crazy. Even when I send just a single message over Kafka, that message is duplicated by the kafka-spout endlessly and written to HDFS endlessly.
If there is an error in the parse-bolt, shouldn't the hdfs-bolt still work normally? I'm new to the topic; can someone spot a simple beginner's mistake? I am grateful for any advice.
Are you acking the messages at the end of both bolts' execute methods?
When both bolts read from the same stream of your kafka-spout, the messages get anchored to the same spout tuple, but with unique messageIds. So even though it is the parse-bolt's tuple that fails, since it is anchored to the same spout tuple, it gets replayed at the spout. This results in another tuple with a different messageId but the same content being replayed to all the bolts subscribed to that stream, in your case the parse-bolt and the hdfs-bolt.
Remember that the replay happens at the Spout and hence everything subscribed to that stream from the spout will get redundant messages.
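For reference, here is a minimal sketch of a parse bolt that anchors its output to the input tuple and acks (or fails) every tuple it receives in execute. The parsing body is just a placeholder, and the org.apache.storm imports are an assumption (older HDP releases use backtype.storm):
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class ParseBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            String parsed = input.getString(0).trim();   // placeholder parsing step
            collector.emit(input, new Values(parsed));   // anchor the new tuple to the input
            collector.ack(input);                        // tell the acker this bolt is done
        } catch (Exception e) {
            collector.fail(input);                       // fail explicitly instead of waiting for a timeout
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("parsed"));
    }
}
If the hdfs-bolt or the hive-bolt never acks, the spout tuple eventually times out and the KafkaSpout replays it, which produces exactly the endless duplication described above.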
I tried to create a small example in Trident. The goal was to see how tuples are replayed in case of failures. Below is the topology definition:
Random rand = new Random();
Config config = new Config();
config.setDebug(true);
config.setNumWorkers(1);
TridentTopology topology = new TridentTopology();
topology.newStream("spout", new RandomIntegerSpout())
        .map((MapFunction) tridentTuple -> {
            if ((tridentTuple.getLongByField("msgid") % 50 == 0) &&
                    (rand.nextInt(2) == 1)) {
                System.out.println(String.format("Failed to process tuple %d",
                        tridentTuple.getLongByField("msgid")));
                throw new ReportedFailedException("Divisible by 50");
            }
            return new Values(tridentTuple.toArray());
        })
        .peek((Consumer) tridentTuple -> System.out.println(tridentTuple.getValues()));
I use the RandomIntegerSpout from storm-starter which extends BaseRichSpout and just generates random numbers. I then apply a MapFunction that just draws a random number every 50 tuples and randomly fails the tuple.
The problem is that I do not get any acks or fails.
I played around with the spout and ran it in debug mode, tried some sample output, and tried it with standard Storm bolts. The anchoring works fine; it just does not get called by Trident.
I reproduced this problem with LocalCluster and StormSubmitter, in v1.2.3 and v2.0.0.
Below is a screenshot of the Storm UI:
The bolts corresponding to the map ack and fail the tuples as expected, but this is never propagated back to the spout.
I thought the Trident master coordinator might expect some kind of persistence in a state to realize the topology is done, but replacing peek with a persistentAggregate did not help. I also ruled out a bug in map by doing the same with each.
Since the code is almost trivial by inspection, I am probably misunderstanding something fundamental about Trident / Storm. Am I wrong to expect Trident to call the spout's ack method once a batch is done? I realized there is no fail method in IBatchSpout. How does Trident handle replaying of batches?
Trident spouts don't ack or fail at the individual tuple level. Instead, tuples are acked as a batch.
Trident spouts will often look something like this interface.
M emitPartitionBatch(TransactionAttempt tx, TridentCollector collector, PartitionT partition, M lastPartitionMeta);
The idea is that Trident manages keeping track of acks/fails of the batch's tuples, and then if the batch fails, it asks the spout to repeat the batch; if not, it simply won't.
Note how this is different from a standard Storm spout. With a normal spout, the framework basically tells the spout "Hey, emit something. Up to you what you emit.", and then the ack and fail methods are used to tell the spout whether it should emit a particular tuple again.
With Trident, the spout is instead told "Hey, (re)emit batch number x", and it is then up to the spout to know which tuples were in that batch. With this model there's no need for a fail method. Some Trident spouts will have an ack/succeed method though, to allow the spout to drop any state it may have related to a particular in-progress batch.
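Purely as a conceptual sketch (this is not the real ITridentSpout API, just an illustration of the contract described above), a transactional emitter keeps enough state per transaction id to re-produce a batch on request and drops that state once the batch succeeds:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchCacheSketch {
    // Whatever is needed to re-produce a batch, keyed by transaction id.
    private final Map<Long, List<Object>> batchesByTxId = new HashMap<>();

    // Trident: "emit batch number txId" (first attempt or a retry).
    List<Object> emitBatch(long txId) {
        // On a retry, return exactly what was emitted the first time.
        return batchesByTxId.computeIfAbsent(txId, id -> readNextChunkFromSource());
    }

    // Trident: "batch txId is fully processed".
    void success(long txId) {
        batchesByTxId.remove(txId);   // per-batch state can now be dropped
    }

    private List<Object> readNextChunkFromSource() {
        return new ArrayList<>();     // placeholder for reading from the real source
    }
}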
For wrapped IRichSpouts, there's some bridging code that wraps them into the Trident API. Basically, the wrapper calls nextTuple until it has a full batch, then it stores the ids in a cache. If the wrapper is asked to reemit a batch, it calls fail on the spout. Otherwise, it calls ack once the batch has succeeded.
I think the reason you're not seeing anything related to this in Storm UI is that the IRichSpout isn't actually represented there. Instead it is wrapped, so the ack/fail calls are happening "under the hood" inside the spout-spout component. If you want to know for sure whether ack/fail is being called, try adding some logging to the ack/fail methods of your IRichSpout.
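For example, a stand-in for RandomIntegerSpout with that logging added might look like the following; the field layout is simplified (it still emits a "msgid" field so the map above keeps working), and the SLF4J logger is an assumption, although Storm itself ships with SLF4J:
import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingIntegerSpout extends BaseRichSpout {
    private static final Logger LOG = LoggerFactory.getLogger(LoggingIntegerSpout.class);
    private final Random rand = new Random();
    private long msgId = 0;
    private SpoutOutputCollector collector;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        Utils.sleep(100);
        long id = ++msgId;
        collector.emit(new Values(rand.nextInt(1000), id), id);
    }

    @Override
    public void ack(Object msgId) {
        LOG.info("ack called for msgId {}", msgId);   // confirms the wrapper acks the batch
    }

    @Override
    public void fail(Object msgId) {
        LOG.warn("fail called for msgId {}", msgId);  // confirms the wrapper asks for a replay
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("value", "msgid"));
    }
}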
My topology looks like this (diagram: Data_Enrichment_Persistence_Topology).
So basically the problem I am trying to solve here is that every time any issue comes up in the Stop or Load service bolts and a tuple fails, it is replayed and the spout re-emits it. This makes the Cassandra bolt reprocess the tuple and rewrite the data.
I cannot make the tuples in the Load and Stop bolts unanchored, as I need them to be replayed in case of any failure. However, I only want the upper workflow to be replayed.
I am using a KafkaSpout to emit data (it is emitting it on the "default" stream). I am not sure how to duplicate the streams at the KafkaSpout's emit level.
If I can duplicate the streams, a replay on either of the two will only re-emit the message on that particular stream right at the spout level, leaving the other stream untouched, right?
TIA!
You need to use two output streams in your spout -- one for each downstream path. Furthermore, you emit each tuple to both streams (using different message IDs).
Thus, if one fails, you can replay that tuple on just that stream.
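A sketch of what that can look like (the stream names, field names, and the broker read are made up for illustration); each copy gets its own message id, so a fail() only replays the copy on the stream that actually failed:
import java.util.Map;
import java.util.UUID;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class DualStreamSpout extends BaseRichSpout {
    public static final String LOAD_STREAM = "load";
    public static final String STOP_STREAM = "stop";

    private SpoutOutputCollector collector;
    // A real spout would also remember msgId -> (stream, payload) so that
    // fail() can re-emit the right copy; omitted here for brevity.

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        String message = fetchNextMessage();   // placeholder for the Kafka/broker read
        if (message == null) {
            return;
        }
        // Same payload, two streams, two independent message ids.
        collector.emit(LOAD_STREAM, new Values(message), UUID.randomUUID());
        collector.emit(STOP_STREAM, new Values(message), UUID.randomUUID());
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream(LOAD_STREAM, new Fields("message"));
        declarer.declareStream(STOP_STREAM, new Fields("message"));
    }

    private String fetchNextMessage() {
        return null;                           // placeholder
    }
}
Downstream, each branch then subscribes to its own stream, e.g. builder.setBolt("load-bolt", new LoadBolt()).shuffleGrouping("spout", DualStreamSpout.LOAD_STREAM).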
I am working on a scenario in which I have a Spout that reads data from a Message Broker and emits the message as a tuple to a Bolt for some processing.
After processing, the Bolt converts it into separate messages, and each sub-message has to be sent to a different broker, which can be hosted on a different machine.
Assume I have a finite number of recipients (in my case there are 3 Message Brokers for output).
So, after processing, Bolt1 could drop the messages directly to these 3 Message Brokers itself.
Now, if I use a single bolt here which drops the messages to these three brokers by itself, and let's say one of them fails (due to unavailability etc.), I call the collector's fail method.
Once the fail method is called on the bolt, the fail method of my Spout gets invoked.
Here, I believe I will have to process the entire message again (I have to make sure every message gets processed), even though 2 out of 3 sub-messages were delivered successfully.
Alternatively, even if I emit these 3 sub-messages to different bolts, I think the Spout will still have to process the entire message again.
This is because I append a unique GUID to the message while emitting it the first time in the spout's nextTuple() method.
Is there a way to ensure that only the failed sub-message is reprocessed and not the entire one?
Thanks
Storm (the low-level Java API) provides only "at-least-once" processing guarantees, i.e., there is no support for avoiding duplicate processing in case of failure.
If you need exactly-once processing, you can use Trident on top of Storm. However, even Trident cannot give exactly-once guarantees if you emit data to an external system (unless the external system can detect and delete duplicates). This is not a Storm-specific but a general problem. Other systems like Apache Flink, Apache Spark Streaming, or S-Store (a recent research prototype system from MIT -> Stonebraker) "suffer" from the exact same problem.
Maybe the best approach is to try out Trident and evaluate whether it can meet your requirements.
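Since the question already attaches a unique GUID to each message, one common (non-Storm-specific) pattern is to make the delivery to each broker idempotent by de-duplicating on that GUID at the target side, so a replayed tuple does not produce a second send. A generic sketch, with the caveat that the seen-id set would have to live in a durable shared store (e.g. a database keyed on the GUID) rather than in bolt memory:
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DedupingSender {
    // Intended as one instance per target broker, keyed by the parent message's GUID.
    // In-memory only for illustration; production use needs a durable store.
    private final Set<String> deliveredGuids = ConcurrentHashMap.newKeySet();

    public void sendOnce(String guid, String payload) {
        // add() returns false if this GUID was already delivered, so a replay
        // of the parent tuple becomes a no-op for brokers that already got it.
        if (deliveredGuids.add(guid)) {
            deliverToBroker(payload);   // placeholder for the real producer call
        }
    }

    private void deliverToBroker(String payload) {
        // send to the actual message broker here
    }
}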
As I understand things, ZooKeeper will persist tuples emitted by bolts so if a bolt crashes (or a computer with the bolt crashes, or the entire cluster crashes), the tuple emitted by the bolt will not be lost. Once everything is restarted, the tuples will be fetched from ZooKeeper, and everything will continue on as if nothing bad ever happened.
What I don't yet understand is if the same thing is true for spouts. If a spout emits a tuple (i.e., the emit() function within a spout is executed), and the computer the spout is running on crashes shortly thereafter, will that tuple be resurrected by ZooKeeper? Or do we need Kafka in order to guarantee this?
P.S. I understand that the tuple emitted by the spout must be assigned a unique ID in the call to emit().
P.P.S. I see sample code in books that uses something like ConcurrentHashMap<UUID, Values> to track which spouted tuples have not yet been acked. Is this somehow automatically persisted with ZooKeeper? If not, then I shouldn't really be doing that, should I? What should I be doing instead? Using Kafka?
Florian Hussonnois answered my question thoroughly and clearly in this storm-user thread. This was his answer:
Actually, the tuples aren't persisted into ZooKeeper. If your spout emits a tuple with a unique id, it will automatically be tracked internally by Storm (i.e., by the ackers). Thus, if the emitted tuple fails because of a bolt failure, Storm invokes the 'fail' method on the origin spout task with the unique id as argument. It's then up to you to re-emit the failed tuple.
In sample code, spouts use a Map to track which tuples have been fully processed by the entire topology, in order to be able to re-emit them in case of a bolt failure.
However, if the failure doesn't come from a bolt but from your spout, the in-memory Map will be lost and your topology will not be able to re-emit failed tuples.
For such a scenario you can rely on Kafka. In fact, the Kafka spout stores its read offset in ZooKeeper. That way, if a spout task goes down, it will be able to read its offset from ZooKeeper after restarting.
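Putting the quoted advice into code, here is a hedged sketch of such a spout: it keeps an in-memory map from message id to tuple, forgets entries on ack, and re-emits on fail. The source read is a placeholder, and, as the answer warns, the map disappears if the spout task itself dies, which is exactly the gap the Kafka spout's offset-in-ZooKeeper approach closes:
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class ReplayingSpout extends BaseRichSpout {
    // In-memory only: lost if this spout task crashes, exactly as the answer warns.
    private final Map<UUID, Values> pending = new ConcurrentHashMap<>();
    private SpoutOutputCollector collector;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        String message = readFromSource();   // placeholder for the real source
        if (message == null) {
            return;
        }
        UUID msgId = UUID.randomUUID();
        Values tuple = new Values(message);
        pending.put(msgId, tuple);           // remember until fully acked
        collector.emit(tuple, msgId);
    }

    @Override
    public void ack(Object msgId) {
        pending.remove(msgId);               // fully processed by the topology
    }

    @Override
    public void fail(Object msgId) {
        Values tuple = pending.get(msgId);
        if (tuple != null) {
            collector.emit(tuple, msgId);    // re-emit the failed tuple with the same id
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("message"));
    }

    private String readFromSource() {
        return null;                         // placeholder
    }
}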
We have a fairly simple Storm topology with one headache.
One of our bolts can either find the data it is processing to be valid, in which case everything carries on downstream as normal, or it can find it to be invalid but fixable, in which case we need to send it for some additional processing.
We tried making this step part of the topology with a separate bolt and stream.
declarer.declareStream(NORMAL_STREAM, getStreamFields());
declarer.declareStream(ERROR_STREAM, getErrorStreamFields());
Followed by something like the following at the end of the execute method:
if (errorOutput != null) {
    collector.emit(ERROR_STREAM, input, errorOutput);
} else {
    collector.emit(NORMAL_STREAM, input, output);
}
collector.ack(input);
This does work, but it has the breaking side effect of causing all of the tuples that do not go down this error path to fail and get re-sent by the spout endlessly.
I think this is because the error bolt cannot send acks for messages it doesn't receive, but the acker waits for all the bolts in a topology to ack before sending the ack back to the spout. At the very least, taking out the error-processing bolt causes everything to get acked back to the spout correctly.
What is the best way to achieve something like this?
It's possible that the error bolt is slower than you suspect, causing a backup on error_stream which, in turn, causes a backup into your first bolt and finally causes tuples to start timing out. When a tuple times out, it gets resent by the spout.
Try:
Increasing the timeout config (topology.message.timeout.secs),
Limiting the number of inflight tuples from the spout (topology.max.spout.pending) and/or
Increasing the parallelism count for your bolts
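For reference, the first two knobs are set on the topology Config (the values below are arbitrary examples), and the parallelism count is the extra argument to setBolt; the component names and the ErrorBolt class are placeholders, and ERROR_STREAM refers to the constant declared in your bolt above:
Config config = new Config();
config.setMessageTimeoutSecs(120);   // topology.message.timeout.secs (default is 30)
config.setMaxSpoutPending(500);      // topology.max.spout.pending (unset by default)

TopologyBuilder builder = new TopologyBuilder();
// parallelism hint: run 4 executors of the (hypothetical) error-handling bolt,
// subscribed to the error stream declared by the upstream bolt
builder.setBolt("error-bolt", new ErrorBolt(), 4)
       .shuffleGrouping("main-bolt", ERROR_STREAM);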