I have a Lambda configured as a consumer of a Kinesis data stream, with a batch size of 10,000 (the maximum).
The Lambda parses the given records and inserts them into Aurora PostgreSQL (using an INSERT command).
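A minimal sketch of such a handler, assuming a pg8000 driver and a hypothetical events table (placeholders, not the actual code):

```
import base64
import json
import os

import pg8000.native  # assumed driver; any PostgreSQL client works similarly


def handler(event, context):
    # Each invocation receives up to BatchSize records from the shard.
    rows = [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]

    conn = pg8000.native.Connection(
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        host=os.environ["DB_HOST"],
        database=os.environ["DB_NAME"],
    )
    try:
        for row in rows:
            # Hypothetical table and columns, for illustration only.
            conn.run(
                "INSERT INTO events (id, payload) VALUES (:id, :payload)",
                id=row["id"],
                payload=json.dumps(row),
            )
    finally:
        conn.close()

    return {"batch_size": len(rows)}
```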
Somehow, I see that the Lambda is invoked most of the time with a relatively small number of records (fewer than 200), although the 'IteratorAge' metric is constantly high (about 60 seconds). The records are put into the stream with a random partition key (generated as a uuid4), and of size
How can that be explained? As I understand it, if the shard is not empty, all of the current records, up to the configured batch size, should be polled.
I assume that if the Lambda were invoked with bigger batches, this delay could be prevented.
Note: There is also a Kinesis Firehose configured as a consumer (it doesn't seem to have any issue).
It turns out that the iterator age of the Kinesis stream itself was 0 ms, so this behavior makes sense.
The Lambda's IteratorAge metric is a bit different:
Measures the age of the last record for each batch of records processed. Age is the difference between the time Lambda received the batch, and the time the last record in the batch was written to the stream.
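To compare the two metrics side by side, both can be pulled from CloudWatch; a sketch with boto3 (stream and function names are placeholders):

```
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)
window = dict(StartTime=now - timedelta(hours=1), EndTime=now,
              Period=60, Statistics=["Maximum"])

# Stream-side age: how far the oldest unread record lags behind the tip of the shard.
stream_age = cloudwatch.get_metric_statistics(
    Namespace="AWS/Kinesis",
    MetricName="GetRecords.IteratorAgeMilliseconds",
    Dimensions=[{"Name": "StreamName", "Value": "my-stream"}],  # placeholder
    **window,
)

# Lambda-side age: age of the last record in each batch when the function received it.
function_age = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="IteratorAge",
    Dimensions=[{"Name": "FunctionName", "Value": "my-consumer"}],  # placeholder
    **window,
)

print(max((p["Maximum"] for p in stream_age["Datapoints"]), default=0))
print(max((p["Maximum"] for p in function_age["Datapoints"]), default=0))
```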
Related
I've connected a Lambda to a DynamoDB table via a stream. When a record is written to the table, it triggers the Lambda. The traffic is very bursty, so nothing might happen for a while, then I'll write several thousand records.
What I'm seeing is that a few Lambda instances will be triggered, but not enough to handle the burst. Then, at random times, the number of Lambda instances will jump by an order of magnitude or two (from 2 to 90 or more), and it will catch up. The problem is that the jump might not occur for 30 minutes or more.
I'm seeing the records written to the table very quickly (seconds). The processing of 20 records by the lambda shouldn't take more than 2 minutes. It seems like the lambdas are spending most of their time sitting around waiting for records to show up. The record key for the table is a GUID.
Things I've tried
Playing with the number of records to make sure there are no Lambda timeouts (20 seems to be conservative, but 100 causes timeouts)
Moving the Lambda to a different subnet
Batching the writes to the table (~500-1000 records in a batch; see the sketch after this list)
Breaking up the writes in the hope it would trigger more Lambdas (~20-100 records in a batch)
Increasing the Lambda memory to the max (3 GB)
Reducing the memory to a value still larger than what is used (1 GB, ~300 MB used)
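A sketch of what batching the writes with boto3 can look like (the table name is a placeholder):

```
import boto3

table = boto3.resource("dynamodb").Table("my-table")  # placeholder table name


def write_batch(items):
    # batch_writer() buffers the items and issues BatchWriteItem calls of up to
    # 25 items each, retrying any unprocessed items automatically.
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
```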
Is there a better pattern to be using? Should I skip the stream and just write SNS messages? I don't care about order, but would prefer to not run the job more than once.
So here's what I found out.
It looks like the problem is contention on the DynamoDB stream by the Lambda instances.
My solution was to skip the DynamoDB stream and not use it at all, and instead publish to an SNS topic. The Lambdas pick up the messages and scale much better. Times have gone from hours to seconds.
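Something like the following sketch covers the publishing side (the topic ARN is a placeholder); each subscribed Lambda then reads the payload from event["Records"][i]["Sns"]["Message"]:

```
import json

import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:my-topic"  # placeholder


def publish_records(records):
    # One publish per record; SNS fans each message out to every subscribed
    # Lambda, so concurrency is driven by message volume, not by shard count.
    for record in records:
        sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps(record))
```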
What is the effect of setting the batch interval when creating the streaming context?
new StreamingContext(spark.sparkContext, batchInterval)
According to this Amazon blog, the Kinesis batch interval is hard-coded to 1 second.
The Kinesis batch interval mentioned in the blog is the interval at which the receiver reads data from the stream, which is set to 1 second by default. This interval only determines the input rate of the receiver.
The batchInterval provided when creating the StreamingContext divides the input records into batches of the given interval, to be processed by Spark Streaming.
For example, if you have a single Kinesis receiver and your batchInterval is 10 seconds, the receiver would be able to read up to 10,000 records in those 10 seconds, i.e. 1,000 records per second from the Kinesis stream. That streaming batch will then include 10,000 records.
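The snippet in the question is Scala; the same idea as a PySpark sketch (the 10-second value is just the example above):

```
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="kinesis-example")

# batchDuration (10 seconds here, matching the example above) controls how the
# received records are grouped into batches for processing; the receiver itself
# still reads from Kinesis once per second.
ssc = StreamingContext(sc, batchDuration=10)

# ... attach the Kinesis receiver and the transformations here, then:
ssc.start()
ssc.awaitTermination()
```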
I have already read some questions about Kinesis shards and multiple consumers, but I still don't understand how it works.
My use case: I have a Kinesis stream with just one shard. I would like to consume this shard using different Lambda functions, each of them independently, as if each Lambda function had its own shard iterator.
Is it possible to set multiple Lambda consumers (stream-based) reading from the same stream/shard?
Hey Mr Magalhaes, I believe the following should answer some of your questions.
To clarify: you can set multiple Lambdas as consumers on a Kinesis stream, but the Lambdas will block each other on processing. If your stream has only one shard, it will only have one concurrent Lambda.
If you have one Kinesis stream, you can connect as many Lambda functions as you want, each through its own event source mapping.
All functions will run simultaneously and fully independently of each other, and they will be invoked continually as new records arrive in the stream.
The number of shards does not matter.
For a single Lambda function:
"For Lambda functions that process Kinesis or DynamoDB streams the number of shards is the unit of concurrency. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently. This is because Lambda processes each shard’s events in sequence." [https://docs.aws.amazon.com/lambda/latest/dg/scaling.html]
But there is no limit on how many different Lambda consumers you can attach to a Kinesis stream.
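As a sketch, wiring two independent functions to the same stream with boto3 could look like this (function names and the stream ARN are placeholders):

```
import boto3

lambda_client = boto3.client("lambda")
STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/my-stream"  # placeholder

# Each event source mapping is an independent consumer: both functions receive
# every record, and each gets at most one concurrent invocation per shard.
for function_name in ["consumer-a", "consumer-b"]:  # placeholder function names
    lambda_client.create_event_source_mapping(
        EventSourceArn=STREAM_ARN,
        FunctionName=function_name,
        StartingPosition="LATEST",
        BatchSize=100,
    )
```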
Yes, no problem with this!
The number of shards doesn't limit the number of consumers a stream can have.
In your case, it will just limit the number of concurrent invocations of each Lambda. This means that each consumer can have at most as many concurrent executions as there are shards.
See this doc for more details.
Short answer:
Yes it will work, and will work concurrently.
Long answer:
Each shard in a Kinesis stream has 2 MiB/sec of read throughput:
https://docs.aws.amazon.com/streams/latest/dev/building-consumers.html
If you have multiple applications (in your case, Lambdas), they will share that throughput.
A description taken from the link above:
Fixed at a total of 2 MiB/sec per shard. If there are multiple consumers reading from the same shard, they all share this throughput. The sum of the throughput they receive from the shard doesn't exceed 2 MiB/sec.
If you produce (write) less than 1 MiB/sec of data, you should be able to support two "applications" with a single shard.
In general, if you have Y shards and X applications, it should work properly as long as your total write throughput (in MiB/sec) is less than 2 MiB/sec * Y / X and the data is spread evenly across the shards.
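As a back-of-the-envelope helper (just a sketch of the arithmetic above):

```
def max_write_mib_per_sec(shards: int, consumers: int) -> float:
    # Each shard offers 2 MiB/sec of read throughput shared by all consumers,
    # so total writes must stay below 2 * Y / X for every consumer to keep up.
    return 2.0 * shards / consumers


print(max_write_mib_per_sec(shards=1, consumers=2))  # 1.0 MiB/sec
```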
If you require each "application" to use the full 2 MiB/sec, you may enable "Consumers with Enhanced Fan-Out", which "fans out" the stream, giving each application a dedicated 2 MiB/sec per shard (instead of sharing the throughput).
This is described in the following link:
https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html
In Amazon Kinesis Data Streams, you can build consumers that use a feature called enhanced fan-out. This feature enables consumers to receive records from a stream with throughput of up to 2 MiB of data per second per shard. This throughput is dedicated, which means that consumers that use enhanced fan-out don't have to contend with other consumers that are receiving data from the stream.
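A sketch of registering an enhanced fan-out consumer with boto3; for Lambda, the returned consumer ARN is then used as the event source ARN (the stream ARN and names are placeholders):

```
import boto3

kinesis = boto3.client("kinesis")
lambda_client = boto3.client("lambda")

STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/my-stream"  # placeholder

# Register a dedicated-throughput (enhanced fan-out) consumer for one application.
consumer = kinesis.register_stream_consumer(
    StreamARN=STREAM_ARN,
    ConsumerName="my-efo-consumer",  # placeholder
)["Consumer"]

# Point the event source mapping at the consumer ARN instead of the stream ARN
# so this function gets its own 2 MiB/sec per shard.
lambda_client.create_event_source_mapping(
    EventSourceArn=consumer["ConsumerARN"],
    FunctionName="my-consumer",  # placeholder
    StartingPosition="LATEST",
)
```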
Is there an optimal way to improve the speed of insertAll requests via the golang client?
My current structure is such that:
Rows get submitted to a queue
Upon hitting the job size, a new job is queued up (approx. 250 rows)
A worker picks up a job from the queue
Worker formats + sends insertAll request to BQ
The above works fairly well; however, I seem to be averaging 8-12 seconds per 250-row request.
I should mention that the number of workers being used is generally 200; however, I have tried higher and lower values and it does not seem to make much of a difference (higher usually seems to take longer per job).
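The question is about the Go client, but as an illustration of the queue/worker shape described above, here is a rough sketch with the Python client (table ID, worker count, and job size are placeholders):

```
import queue
import threading

from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my-project.my_dataset.my_table"  # placeholder
JOB_SIZE = 250   # rows per insertAll request, as in the setup above
WORKERS = 8      # tuning knob, not a recommendation

jobs = queue.Queue()


def submit(rows):
    # Split incoming rows into jobs of JOB_SIZE and hand them to the workers.
    for i in range(0, len(rows), JOB_SIZE):
        jobs.put(rows[i:i + JOB_SIZE])


def worker():
    while True:
        rows = jobs.get()
        # insert_rows_json issues one streaming insertAll request per batch.
        errors = client.insert_rows_json(TABLE_ID, rows)
        if errors:
            print("insert errors:", errors)
        jobs.task_done()


for _ in range(WORKERS):
    threading.Thread(target=worker, daemon=True).start()
```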
I read you can have multiple consumer apps per Kinesis stream.
http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
However, I heard you can only have one consumer per shard. Is this true? I can't find any documentation to support this, and can't imagine how that could be the case if multiple consumers are reading from the same stream. Certainly, it can't mean the producer needs to repeat content in different shards for different consumers.
The Kinesis Client Library starts threads in the background; each listens to one shard in the stream. You cannot connect to a shard over multiple threads; that is by design.
http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-scaling.html
For example, if your application is running on one EC2 instance, and is processing one Amazon Kinesis stream that has four shards. This one instance has one KCL worker and four record processors (one record processor for every shard). These four record processors run in parallel within the same process.
In the explanation above, the term "KCL worker" refers to a Kinesis consumer application, not the threads.
But below, the same "KCL worker" term refers to a "Worker" thread in the application, which is a Runnable.
Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes). Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard.
See the Worker.java class in the KCL source.
Late to the party, but the answer is that you can have multiple consumers per Kinesis shard. A KCL instance will only start one process per shard, but you can have another KCL instance consuming the same stream (and shard), assuming the second one has permission.
There are limits, though, as laid out in the docs, including:
Each shard can support up to 5 transactions per second for reads, up to a maximum total data read rate of 2 MB per second.
If you want a stream with multiple consumers where each message will be processed once, you're probably better off with something like Amazon Simple Queue Service.
To keep it simple: you can have multiple/different Lambda functions triggered on the Kinesis data. This way, both of your Lambdas are going to get all the data from the Kinesis stream. The downside is that now you will have to increase the throughput at the Kinesis level, which is going to be pricey. Use SQS instead for your use case.
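For completeness, a minimal sketch of the SQS pattern suggested above, with boto3 (the queue URL is a placeholder and process() is a hypothetical handler):

```
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder


def send(record):
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(record))


def poll_once(process):
    # 'process' is a hypothetical callback supplied by the consumer.
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for message in response.get("Messages", []):
        process(json.loads(message["Body"]))
        # Deleting the message marks it as handled, so it is not redelivered.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
```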