Stream thread calculation - apache-kafka-streams

I'm using the Streams DSL. I have three source topics with 17, 100, and 40 partitions.
I will be running three instances and 2 standby instances.
How can I calculate how many stream threads I need so that each thread gets exactly one task, i.e. the highest parallelism is achieved?

This depends on the structure of your application. You can run the application with a single thread and observe the number of tasks that are created. The number of tasks is the maximum number of threads you can use.
The created tasks are logged, or you can obtain them via KafkaStreams#localThreadsMetadata().
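As a rough illustration of why the structure matters: in Kafka Streams, each sub-topology gets as many tasks as the largest partition count among its source topics, and the totals are summed over sub-topologies. The sketch below assumes two hypothetical layouts for the three topics (each processed independently, or all merged into one sub-topology). The function name and both layouts are made up for illustration; the number reported by the running application remains the authoritative one.

```python
def count_tasks(sub_topologies):
    """Return the total task count for a list of sub-topologies.

    Each sub-topology is given as a list of source-topic partition counts.
    A sub-topology gets max(partitions) tasks; the totals are summed.
    """
    return sum(max(partitions) for partitions in sub_topologies)


# Hypothetical layout A: each topic is processed independently.
independent = [[17], [100], [40]]
# Hypothetical layout B: all three topics feed one sub-topology (e.g. a merge).
combined = [[17, 100, 40]]

tasks_independent = count_tasks(independent)  # 17 + 100 + 40
tasks_combined = count_tasks(combined)        # max(17, 100, 40)
print(tasks_independent, tasks_combined)
```

Under these assumptions the same three topics yield anywhere between 100 and 157 tasks, which is why the thread count cannot be derived from the partition counts alone.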

Here is one approach, in short.
You are asking for maximum parallelism.
This can be achieved by putting each topic in a separate topology.
Each topology has its own thread count (one thread per consumer per topic): 17/3, 100/3, and 40/3 threads per instance, i.e. topic partitions / instances, rounded up.
This ensures that each topology gets its own thread count and its own parallelism.
Each topology acts as a separate consumer group.
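The partitions / instances arithmetic from this answer can be sketched as follows. Rounding up is an assumption on my part (the answer only gives the fractions), and the function name is hypothetical.

```python
import math


def threads_per_instance(partitions, instances):
    """Threads each instance needs so every partition can have its own thread."""
    return math.ceil(partitions / instances)


topic_partitions = [17, 100, 40]
instances = 3

# One entry per topology, in the same order as topic_partitions.
per_topology = [threads_per_instance(p, instances) for p in topic_partitions]
print(per_topology)
```

Because of the rounding, a few threads stay idle: for the 17-partition topic, 3 instances with 6 threads each give 18 threads for 17 partitions.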

Related

Dynamically adapt the number of consumer thread to the number of Kafka partitions

I have a Kafka topic with 50 partitions.
My Spring Boot application uses Spring Kafka to read those messages with a @KafkaListener.
The number of instances of my application autoscales in Kubernetes.
By default, it seems that Spring Kafka launches 1 consumer thread per topic.
org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1
So, with a single instance of the application, one thread reads all 50 partitions.
With 2 instances, the load is balanced and each instance listens to 25 partitions, still with 1 thread per instance.
I know I can set the number of threads using the concurrency parameter on @KafkaListener.
But this is a fixed value.
Is there any way to tell Spring to dynamically adapt the number of consumer threads to the number of partitions the client is currently listening to?
I think there might be a better way of approaching this.
You should figure out how many records / partitions in parallel one instance of your application can handle optimally, through load / performance tests.
Let's say one instance can handle 10 threads / records in parallel optimally. Now if you scale out your app to 50 instances, in your approach, each instance will get one partition, and each instance will be performing below its capacity, wasting resources.
Now consider the opposite: only one instance is left, and it spawns 50 threads to consume from all partitions in parallel. The app's performance will be severely degraded; it might become unresponsive or even crash.
So, in this hypothetical scenario, what you might want to do is, for example, start with one or two instances handling all partitions with 10 threads each, and have it scale up to 5 instances if there's consumer lag, so that each partition has a dedicated thread processing it.
Again, the actual figures should be determined through load / performance testing.
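To make the trade-off concrete, here is a small sketch of the sizing logic described above: one thread per assigned partition, but never more than the per-instance optimum. The cap of 10 is the hypothetical figure from this answer, not a measured value, and the function name is made up.

```python
import math


def threads_for_instance(partitions, instances, max_threads):
    """Consumer threads per instance: one per assigned partition, capped."""
    assigned = math.ceil(partitions / instances)
    return min(assigned, max_threads)


PARTITIONS = 50
MAX_THREADS = 10  # hypothetical optimum found through load testing

# Threads per instance for a few instance counts.
plan = {n: threads_for_instance(PARTITIONS, n, MAX_THREADS) for n in (1, 2, 5, 50)}
print(plan)
```

With 5 instances every partition gets a dedicated thread; below that the cap protects each instance, and above that threads would sit idle.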

Queue that serves producers evenly

I have multiple producers that generate tasks,
and there is a consumer (or several) that executes these tasks (e.g. counting the number of lines in a file).
My problem is that producers should be treated evenly: if one producer generates 10 tasks and another only 2, the consumer should first do 1 task from producer1, then a task from producer2, then producer1 again, and then the rest.
Basically, the system must guarantee for each producer that its tasks will not wait behind a large chunk of tasks from other producers.
Can you help me with an algorithm or ready-to-use broker/queue software that can achieve this goal?
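The ordering described in the question amounts to round-robin over per-producer queues. Below is a minimal in-memory sketch of that idea, assuming a single process and no persistence; the FairQueue class and its methods are hypothetical names, not an existing library.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin queue: one FIFO per producer, served in rotation."""

    def __init__(self):
        self._queues = OrderedDict()  # producer id -> deque of pending tasks

    def put(self, producer, task):
        self._queues.setdefault(producer, deque()).append(task)

    def get(self):
        """Pop the next task from the producer at the front of the rotation."""
        if not self._queues:
            raise IndexError("queue is empty")
        producer, tasks = next(iter(self._queues.items()))
        task = tasks.popleft()
        del self._queues[producer]
        if tasks:
            # Re-insert at the back so other producers are served first.
            self._queues[producer] = tasks
        return producer, task

    def __len__(self):
        return sum(len(q) for q in self._queues.values())


queue = FairQueue()
for i in range(10):
    queue.put("producer1", f"p1-task{i}")
for i in range(2):
    queue.put("producer2", f"p2-task{i}")

# Drain the queue and record which producer each task came from.
order = [queue.get()[0] for _ in range(len(queue))]
print(order)
```

The two tasks from producer2 are served within the first four slots instead of waiting behind all ten tasks from producer1.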

Kafka-stream Threading model

Is it safe to say that, all in all, in Kafka Streams tasks represent subscriptions to partitions, while threads represent consumers?
That is, if there are 8 partitions there will always be 8 tasks. However, the number of consumers is determined by the number of threads available, which are spread across application instances. So one application instance may represent 2 consumers, provided that it has 2 threads associated with it.
For full parallelism, with a topic with 8 partitions we could have 2 application instances with 4 threads each, or one application instance with 8 threads, and so on.
Yes. The number of tasks equals the maximum number of partitions among the input topics of a Kafka Streams app (more precisely, this holds per sub-topology).
Say there are two topics, "A" and "B", each with 8 partitions. The number of tasks will be max(8, 8) = 8. Each consumer corresponds to a thread. If you set the number of threads to 2, the 2 threads will split the tasks between them, and each thread will get 4 tasks to process.
"For full parallelism, with a topic with 8 partitions we could have 2 application instance with each having 4 Thread, or one application instance with 8 Threads and so on."
To achieve full parallelism, the total number of threads should always equal the maximum number of partitions. You can run them in several application instances or in one.
The Kafka Streams threading model is explained well here:
https://docs.confluent.io/current/streams/architecture.html#parallelism-model
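The 8-tasks example above can be sketched like this. The round-robin split is an assumption made for illustration (the real Kafka Streams assignor takes more into account, such as state and stickiness), and assign_tasks is a hypothetical helper.

```python
def assign_tasks(num_tasks, num_threads):
    """Spread task ids over threads round-robin; returns one list per thread."""
    assignment = [[] for _ in range(num_threads)]
    for task_id in range(num_tasks):
        assignment[task_id % num_threads].append(task_id)
    return assignment


num_tasks = max(8, 8)  # topics "A" and "B", 8 partitions each

two_threads = assign_tasks(num_tasks, 2)     # 4 tasks per thread
eight_threads = assign_tasks(num_tasks, 8)   # full parallelism, 1 task each
ten_threads = assign_tasks(num_tasks, 10)    # more threads than tasks
idle_threads = sum(1 for tasks in ten_threads if not tasks)
print(two_threads, idle_threads)
```

Threads beyond the task count receive nothing, which is why the number of tasks is the useful upper bound for the total thread count.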

Multiple queues vs multiple jobs in resque

I am using Resque to process two types of jobs in the background:
(1) 3rd-party API requests
(2) DB query and insert
While the two job types can be processed in parallel, jobs of each type can only be processed in serial order. For example, DB operations need to happen in serial order but can be executed in parallel with 3rd-party API requests.
I am contemplating either of the following methods for executing this:
(1) Having two queues, with one queue handling only API requests and the other queue handling only DB queries. Each queue will have its own worker.
(2) One single queue but two workers. One worker for each job.
I would like to know the difference between the two approaches and which of the two would be the better approach to take.
Choosing an architecture is not straightforward; you have to keep many things in mind.
(1) Having two queues, with one queue handling only API requests and the other queue handling only DB queries. Each queue will have its own worker.
Ans: This architecture works when both queues are equally busy. If one queue has many jobs and the other is empty, one worker will be idle while jobs are waiting in the other queue.
You should always think about full utilisation of your workers.
(2) One single queue but two workers. One worker for each job.
Ans: This approach is what we also use in our project.
All jobs are enqueued to one queue, and multiple workers run on it.
Your workers will always be busy, no matter which type of job is present.
Proper worker utilisation is possible.
In the end, I would suggest the 2nd approach.
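The utilisation argument can be illustrated with a toy calculation. It assumes every job takes one time unit and, like this answer, it ignores the per-type serial ordering requirement from the question, so treat it as a sketch of the idle-worker effect only. Both function names are hypothetical.

```python
import math


def makespan_dedicated(api_jobs, db_jobs):
    """Two queues, one worker each: each worker drains only its own queue."""
    return max(api_jobs, db_jobs)


def makespan_shared(api_jobs, db_jobs, workers=2):
    """One queue, any worker takes any job; each job takes one time unit."""
    return math.ceil((api_jobs + db_jobs) / workers)


# Skewed load: 6 API jobs, no DB jobs.
skewed = (makespan_dedicated(6, 0), makespan_shared(6, 0))
# Balanced load: 3 jobs of each type.
balanced = (makespan_dedicated(3, 3), makespan_shared(3, 3))
print(skewed, balanced)
```

With a balanced load both setups finish at the same time; with a skewed load the dedicated-queue setup leaves one worker idle and takes twice as long.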

multiple consumers per kinesis shard

I read that you can have multiple consumer apps per Kinesis stream.
http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
However, I heard you can only have one consumer per shard. Is this true? I can't find any documentation to support this, and can't imagine how that could be if multiple consumers are reading from the same stream. Certainly, it doesn't mean the producer needs to repeat content in different shards for different consumers.
The Kinesis Client Library starts threads in the background, each of which listens to 1 shard in the stream. You cannot connect to a shard over multiple threads; that is by design.
http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-scaling.html
"For example, if your application is running on one EC2 instance, and is processing one Amazon Kinesis stream that has four shards. This one instance has one KCL worker and four record processors (one record processor for every shard). These four record processors run in parallel within the same process."
In the explanation above, the term "KCL worker" refers to a Kinesis consumer application, not to the threads.
But below, the same term "KCL worker" refers to a "Worker" thread in the application, which is a Runnable.
"Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes). Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard."
See the Worker.java class in the KCL source.
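The shard-to-instance relationship from the quoted docs can be sketched as below. The even split is an assumption for illustration (KCL balances shard leases between workers on its own schedule), and processors_per_instance is a hypothetical helper, not a KCL call.

```python
def processors_per_instance(num_shards, num_instances):
    """Record processors each instance runs when shards are spread evenly.

    One record processor per shard; instances beyond the shard count get none.
    """
    base, extra = divmod(num_shards, num_instances)
    return [base + (1 if i < extra else 0) for i in range(num_instances)]


one_instance = processors_per_instance(4, 1)    # the example from the docs
two_instances = processors_per_instance(4, 2)
five_instances = processors_per_instance(4, 5)  # more instances than shards
print(one_instance, two_instances, five_instances)
```

With five instances and four shards, one instance has nothing to process, which matches the advice that instances should not exceed shards except as standby.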
Late to the party, but the answer is that you can have multiple consumers per Kinesis shard. A KCL instance will only start one process per shard, but you can have another KCL instance consuming the same stream (and shard), assuming the second one has permission.
There are limits, though, as laid out in the docs, including:
Each shard can support up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second.
If you want a stream with multiple consumers where each message will be processed once, you're probably better off with something like Amazon Simple Queue Service.
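To see what the quoted per-shard limits mean for several consumers, here is a back-of-the-envelope sketch. It assumes the consumers share one shard's limits evenly, which nothing enforces, so the real share depends on how each consumer polls. The constants come from the quoted limit; the function names are made up.

```python
READS_PER_SECOND_PER_SHARD = 5   # from the quoted limit
MB_PER_SECOND_PER_SHARD = 2.0    # from the quoted limit


def per_consumer_budget(num_consumers):
    """Even split of one shard's read limits among independent consumers.

    Returns (reads per second, MB per second) available to each consumer.
    """
    return (
        READS_PER_SECOND_PER_SHARD / num_consumers,
        MB_PER_SECOND_PER_SHARD / num_consumers,
    )


def min_poll_interval_seconds(num_consumers):
    """Shortest polling interval that keeps all consumers within 5 reads/s."""
    return num_consumers / READS_PER_SECOND_PER_SHARD


budget_two = per_consumer_budget(2)
interval_five = min_poll_interval_seconds(5)
print(budget_two, interval_five)
```

Under this assumption, five consumers on the same shard can each poll at most once per second, so adding consumers directly increases read latency.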
To keep it simple, you can have multiple different Lambda functions triggered by the Kinesis data. This way, both Lambdas will get all the data from the Kinesis stream. The downside is that you will then have to increase the throughput at the Kinesis level, which is going to be pricey. Use SQS instead for your use case.
