Why does Kafka Streams create topics for aggregations and joins? - apache-kafka-streams

I recently created my first Kafka Streams application for learning, using spring-cloud-stream-kafka-binding. It is a simple eCommerce system in which I read a topic called products, which receives a new entry whenever fresh stock of a product comes in. I aggregate the quantities to get the total quantity available for each product.
I had two choices:
Send the aggregated data (the KTable) to another Kafka topic called aggregated-products
Materialize the aggregated data in a state store
I opted for the second option, and found that the application created a Kafka topic by itself; when I consumed messages from that topic, I got the aggregated messages.
.peek((k, v) -> LOGGER.info("Received product with key [{}] and value [{}]", k, v))
.groupByKey()
.aggregate(Product::new,
        (key, value, aggregate) -> aggregate.process(value),
        Materialized.<String, Product, KeyValueStore<Bytes, byte[]>>as(PRODUCT_AGGREGATE_STATE_STORE)
                .withValueSerde(productEventSerde)
                // .withKeySerde(keySerde) is not needed, because keySerde is configured in application.properties
);
Using InteractiveQueryService, I am able to access this state store in my application to find out the total quantity available for a product.
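A minimal sketch of such a lookup with spring-cloud-stream's InteractiveQueryService, assuming the Product type and the PRODUCT_AGGREGATE_STATE_STORE constant from the snippet above (the service class around it is illustrative):

import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
import org.springframework.cloud.stream.binder.kafka.streams.InteractiveQueryService;
import org.springframework.stereotype.Service;

@Service
public class ProductQuantityService {

    private final InteractiveQueryService interactiveQueryService;

    public ProductQuantityService(InteractiveQueryService interactiveQueryService) {
        this.interactiveQueryService = interactiveQueryService;
    }

    public Product totalQuantityFor(String productId) {
        // Same store name constant as in the aggregation above.
        ReadOnlyKeyValueStore<String, Product> store =
                interactiveQueryService.getQueryableStore(
                        PRODUCT_AGGREGATE_STATE_STORE, QueryableStoreTypes.keyValueStore());
        return store.get(productId); // null if no stock has been recorded for this product
    }
}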
Now I have a few questions:
Why did the application create a new Kafka topic?
If the answer is "to store the aggregated data", then how is this different from option 1, in which I could have sent the aggregated data to a topic myself?
Where does RocksDB come into the picture?
The code of my application (which does more than what I explained here) can be accessed via this link:
https://github.com/prashantbhardwaj/kafka-stream-example/blob/master/src/main/java/com/appcloid/kafka/stream/example/config/SpringStreamBinderTopologyBuilderConfig.java

The internal topics are called changelog topics and are used for fault tolerance. The state of the aggregation is stored both locally on disk using RocksDB and on the Kafka broker in the form of a changelog topic - which is essentially a "backup". If a task is moved to a new machine, or the local state is lost for some other reason, Kafka Streams can restore the local state by reading all changes to the original state from the changelog topic and applying them to a new RocksDB instance. After restoration has finished (the whole changelog topic has been processed), the same state should be on the new machine, and the new machine can continue processing where the old one stopped. There are a lot of intricate details to this (e.g. in the default setting, it can happen that the state is updated twice for the same input record when failures happen).
See also https://developer.confluent.io/learn-kafka/kafka-streams/stateful-fault-tolerance/
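If you do not need that backup, or want to tune the auto-created changelog topic, the DSL lets you configure it on the Materialized object. A minimal sketch, reusing the Product type and value serde from the question (the store name and the topic setting shown here are placeholders):

import java.util.Map;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

class ChangelogConfigExample {

    static Materialized<String, Product, KeyValueStore<Bytes, byte[]>> withTunedChangelog(
            Serde<Product> productEventSerde) {
        return Materialized.<String, Product, KeyValueStore<Bytes, byte[]>>as("product-aggregate-store")
                .withValueSerde(productEventSerde)
                // Override broker-side settings of the auto-created changelog topic:
                .withLoggingEnabled(Map.of("min.insync.replicas", "2"));
        // Alternatively, .withLoggingDisabled() turns the backup off entirely,
        // at the cost of not being able to rebuild local state after a failure.
    }
}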

Related

KStream topology with in-memory state store data not committed

I need to aggregate client information and push it to an output topic every hour.
I have a topology with:
input-topic
processor
sink topic
Data arrives in the input topic with a String key that contains a clientID concatenated with a date in YYYYMMDDHH format.
In my processor I use a simple InMemoryKeyValueStore (with caching disabled) to merge/aggregate data according to specific rules (depending on business logic, some data is not aggregated).
In a punctuator that fires every hour, the program scans the state store, transforms all the messages, and forwards them to the sink topic, after which I delete the processed messages from the state store.
After the punctuation, I check the size of the store, which is indeed empty (via .all() and approximateNumEntries()); everything looks OK.
But when I restart the application, the state store is restored with all the elements that should have been deleted.
When I read the changelog topic of the state store manually (with a simple KafkaConsumer), I see two records for each key:
The first record is committed and its value contains my aggregation.
The second record is a deletion (a message with a null value), but it is not committed (visible only with read_uncommitted), which is dangerous in my case because the next punctuation will forward the aggregate again.
I have played with calling commit in the punctuator that forwards, and I have created another punctuator that commits the context periodically (every 3 seconds), but after a restart my data is still restored in the store (which makes sense, since my delete messages are not committed).
I have a classic Kafka Streams configuration:
acks=all
enable.idempotence=true
processing.guarantee=exactly_once_v2
commit.interval.ms=100
isolation.level=read_committed
with the latest version of the kafka-streams library (3.2.2) and a broker cluster on 2.6.
Any help in getting my state store deletions committed is welcome. I don't use TimeWindowedKStream because it doesn't exactly match my need (sometimes I don't aggregate but forward directly).
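For reference, a stripped-down sketch of the pattern described above (Processor API with an hourly wall-clock punctuator that forwards and then deletes). ClientEvent, the store name, and the merge rule are illustrative placeholders, not the actual code:

import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class HourlyAggregatingProcessor implements Processor<String, ClientEvent, String, ClientEvent> {

    private ProcessorContext<String, ClientEvent> context;
    private KeyValueStore<String, ClientEvent> store;

    @Override
    public void init(ProcessorContext<String, ClientEvent> context) {
        this.context = context;
        this.store = context.getStateStore("client-aggregate-store"); // in-memory store, caching disabled
        // Fire once per hour on wall-clock time.
        context.schedule(Duration.ofHours(1), PunctuationType.WALL_CLOCK_TIME, this::flush);
    }

    @Override
    public void process(Record<String, ClientEvent> record) {
        ClientEvent current = store.get(record.key());
        store.put(record.key(), merge(current, record.value())); // business-specific merge rules
    }

    private void flush(long timestamp) {
        try (KeyValueIterator<String, ClientEvent> it = store.all()) {
            while (it.hasNext()) {
                KeyValue<String, ClientEvent> entry = it.next();
                context.forward(new Record<>(entry.key, entry.value, timestamp));
                store.delete(entry.key); // writes a tombstone to the changelog
            }
        }
    }

    private ClientEvent merge(ClientEvent current, ClientEvent incoming) {
        return current == null ? incoming : current; // placeholder for the real aggregation logic
    }
}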

Flink, Kafka and JDBC sink

I have a Flink 1.11 job that consumes messages from a Kafka topic, keys them, filters them (keyBy followed by a custom ProcessFunction), and saves them into the db via JDBC sink (as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/jdbc.html)
The Kafka consumer is initialized with these options:
properties.setProperty("auto.offset.reset", "earliest")
kafkaConsumer = new FlinkKafkaConsumer(topic, deserializer, properties)
kafkaConsumer.setStartFromGroupOffsets()
kafkaConsumer.setCommitOffsetsOnCheckpoints(true)
Checkpoints are enabled on the cluster.
What I want to achieve is a guarantee that all filtered data is saved into the db, even if the db is down for, say, 6 hours, or there are programming errors while saving to the db and the job needs to be updated, redeployed and restarted.
For this to happen, any checkpointing of the Kafka offsets should mean that either
Data that was read from Kafka is in Flink operator state, waiting to be filtered / passed into the sink, and will be checkpointed as part of Flink operator checkpointing, OR
Data that was read from Kafka has already been committed into the db.
While looking at the implementation of the JdbcSink, I see that it does not really keep any internal state that will be checkpointed/restored - rather, its checkpointing is a write out to the database. Now, if this write fails during checkpointing, and the Kafka offsets do get saved, I'll be in a situation where I've "lost" data - subsequent reads from Kafka will resume from the committed offsets, and whatever data was in flight when the db write failed is no longer being read from Kafka, nor is it in the db.
So is there a way to stop advancing the Kafka offsets whenever the full pipeline (Kafka -> Flink -> DB) fails to execute? Or, potentially, is the solution here (in a pre-1.13 world) to create my own implementation of GenericJdbcSinkFunction that maintains some ValueState until the db write succeeds?
There are 3 options that I can see:
Try out the JDBC 1.13 connector with your Flink version. There is a good chance it might just work.
If that doesn't work immediately, check if you can backport it to 1.11. There shouldn't be too many changes.
Write your own 2-phase-commit sink, either by extending TwoPhaseCommitSinkFunction or by implementing your own SinkFunction with CheckpointedFunction and CheckpointListener. Basically, you create a new transaction after a successful checkpoint and commit it in notifyCheckpointComplete.
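As a rough illustration of the last option in its simpler form (a SinkFunction combined with CheckpointedFunction, closer to the ValueState idea from the question than a full two-phase commit), a sketch could look like the following. Row is used as a stand-in record type and flushToDatabase is a placeholder for the JDBC batch write:

import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.types.Row;

public class BufferingJdbcSink implements SinkFunction<Row>, CheckpointedFunction {

    private final List<Row> buffer = new ArrayList<>();
    private transient ListState<Row> pendingRows;

    @Override
    public void invoke(Row value, Context context) {
        buffer.add(value);
        tryFlush(); // best effort; rows stay buffered if the DB is unavailable
    }

    @Override
    public void snapshotState(FunctionSnapshotContext ctx) throws Exception {
        tryFlush();
        // Anything not yet written to the DB is checkpointed as operator state,
        // so advancing the Kafka offsets cannot lose it.
        pendingRows.update(new ArrayList<>(buffer));
    }

    @Override
    public void initializeState(FunctionInitializationContext ctx) throws Exception {
        pendingRows = ctx.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("pending-rows", Row.class));
        if (ctx.isRestored()) {
            for (Row row : pendingRows.get()) {
                buffer.add(row);
            }
        }
    }

    private void tryFlush() {
        try {
            flushToDatabase(buffer); // placeholder for a JDBC batch insert
            buffer.clear();
        } catch (Exception databaseUnavailable) {
            // keep the rows buffered and retry on the next record / checkpoint
        }
    }

    private void flushToDatabase(List<Row> rows) throws Exception {
        // JDBC batching omitted in this sketch.
    }
}

This keeps unwritten rows in checkpointed operator state (guarantee 1 from the question), so advancing the Kafka offsets cannot lose them; it is at-least-once towards the DB, since rows flushed after the last completed checkpoint can be written again on recovery. Exactly-once would require the two-phase-commit route.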

Execute code when two previous events have been processed (Apache Kafka)

I'm new to Apache Kafka and Spring Boot. I'm trying to create a Spring Boot listener that generates a new event only when two specific messages (sent through Apache Kafka) have been received for a given resource.
The obvious solution is to use the database: change the status of the resource when the first event comes, and execute the code when the second event comes (if the customer is in the correct status in the database). In this case, I'm worried about what happens if both events arrive at the same time.
Is there a way to aggregate both messages in Spring Boot/Apache Kafka instead of doing this manually?
Thanks.
You can do it with Kafka Streams. Example topology (sketched in code below):
input stream (key/value from input topic A)
filter (filter by event type for example)
groupBy (group events by key or some field)
aggregate (aggregate events into new data structure)
filter (verify whether the aggregate is complete)
map (generate new output event with aggregate values)
output stream (key/value to topic B)
Check the details in the official docs: https://kafka.apache.org/24/documentation/streams/developer-guide/dsl-api.html#creating-source-streams-from-kafka
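A minimal sketch of that topology in the Java DSL, assuming an Event value type keyed by resource id and a small PairAggregate helper whose add method returns the updated aggregate (all names are illustrative, and serde configuration is omitted):

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;

public class PairTopology {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, Event> events = builder.stream("topic-A"); // key = resource id

        events
                .filter((resourceId, event) -> event.isRelevantType())        // keep only the two event types
                .groupByKey()
                .aggregate(PairAggregate::new,
                        (resourceId, event, aggregate) -> aggregate.add(event),
                        Materialized.as("pair-aggregate-store"))
                .toStream()
                .filter((resourceId, aggregate) -> aggregate.isComplete())     // both events seen?
                .mapValues(PairAggregate::toOutputEvent)
                .to("topic-B");

        return builder.build();
    }
}

One thing to watch out for: once an aggregate is complete, any later event for the same key updates the KTable again and re-triggers the downstream filter, so you may want to de-duplicate or clear the entry after emitting.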

Is there any option for cold-bootstrapping a persistent store in Kafka Streams?

I have been working with kafka-streams for a couple of months. We are using RocksDB to store data. Now, suppose the changelog topic keeps data for only a few days while our application's persistent stores hold data from a few months. How will the store state be restored if a partition is moved from one node to another (which, I think, happens through the changelog)?
Also, suppose the node containing the active task goes down and a new node is introduced. The replica will then be promoted to active and a new replica will start building on the new node. If the changelog has only a few days of data, the new replica will have only that data instead of the original few months.
So, is there any option to transfer data to a replica from the active store rather than from the changelog (as it only has a fraction of the data)?
Changelog topics that are used to back up stores don't have a retention time but are configured with log compaction enabled (cf. https://kafka.apache.org/documentation/#compaction). Thus, it's guaranteed that no data is lost no matter how long you run. The changelog topic will always contain the exact same data as your RocksDB stores.
Thus, for fail-over or scale-out, when a task migrates and a store needs to be rebuilt, the rebuilt store will be a complete copy of the original store.
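For the failover scenario in the question (a replica being promoted to active), the number of warm replicas is controlled by a single Streams setting. A minimal sketch, with placeholder application id and bootstrap servers:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StandbyConfig {

    static Properties streamsProperties() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");     // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        // Keep one warm copy of each state store on another instance so failover
        // does not have to replay the whole changelog from scratch.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        return props;
    }
}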

Designing model for Storm Topology

I am using Apache Kafka & Apache Storm integration.
I need to design a model. Here are the specifications of my topology:
I have configured a topic in Kafka, let's say customer1. The Storm bolts read the data from the customer1 kafka-spout, process it, and write it into Mongo and Cassandra DBs. Here the DB names are also the same as the Kafka topic, customer1. The table structure and the rest of the setup are the same.
Now, suppose I get a new customer, say customer2. I need to read data from a customer2 kafka-spout and write it into Mongo and Cassandra DBs where the DB names will be customer2.
I can think of two ways to do it:
Write a bolt which gets triggered whenever a new customer name is added to a Kafka topic. That bolt will have code which creates and submits the new topology to the cluster.
Create independent jars for each customer and submit the topologies manually.
I have searched a lot about this but couldn't figure out which approach is better.
What are the pros and cons of the approaches above in terms of efficiency, code maintainability, and adding new changes to the existing model?
Is there any other way to handle this?
