200ms latency to DynamoDB from cold lambda vs <10ms when hot. Why? - aws-lambda

I'm developing an AWS Serverless architecture.
I have a lambda attached to a subnet within a VPC. I have setup a VPC endpoint to reach my DynamoDB table.
When my lambda is cold, it takes up to 200-300ms to make a simple GetItem call to my DynamoDB table. This is just the GetItem call; I have already subtracted Lambda initialization, DynamoDB client instantiation, etc. This is unacceptable for my application.
However, when my lambda is hot, the GetItem call only takes ~8-9ms, which is acceptable.
Is there some ENI latency because my lambda is attached to a subnet? If so, what can I do to speed it up? Or is there another problem that I do not see?

The latency you're experiencing is due to the metadata caching that DynamoDB uses to lower latency.
When requests are made frequently, DynamoDB caches certain metadata locally, such as authentication and data locality information.
When requests are infrequent, this cache goes stale, which results in extra hops within the DynamoDB service to serve your request. This is one of the fundamental reasons that DynamoDB latency decreases as throughput increases.
Some things you can do to avoid the latency hit:
Ensure you reuse TCP connections by setting keep-alive to true (see the sketch after this list)
Ensure the client is created outside of the Lambda handler
Send dummy traffic to your DynamoDB table to keep the metadata cache warm.
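A minimal sketch of the first two points, assuming TypeScript and AWS SDK for JavaScript v3 (the table name and key attribute are hypothetical; newer SDK and Node.js versions already enable keep-alive by default):

```typescript
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";
import { NodeHttpHandler } from "@smithy/node-http-handler";
import { Agent } from "https";

// Create the client once per container, outside the handler, with TCP keep-alive
// enabled so warm invocations reuse the same connection.
const client = new DynamoDBClient({
  requestHandler: new NodeHttpHandler({
    httpsAgent: new Agent({ keepAlive: true }),
  }),
});

export const handler = async (event: { id: string }) => {
  const result = await client.send(
    new GetItemCommand({
      TableName: "my-table",        // hypothetical table name
      Key: { pk: { S: event.id } }, // hypothetical key attribute
    })
  );
  return result.Item;
};
```

Note that this only removes the TCP/TLS handshake and client setup from the hot path; the very first request against a cold metadata cache can still be slower.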


Configure dynamodb stream event source triggers to have configurable delay for retries

We have basically
dynamodb streams =>
trigger lambda (batch size XX, concurrency 1, retries YY) =>
write to service
There are multiple shards, so we may have some number of concurrent writes to the service. Under some conditions too many streams have too much data, and too many lambda instances are writing to the service, which then responds with 429.
Right now a failure simply ends up as a failure: the lambda retries, but the service is still overwhelmed.
What we would like is for the lambda trigger to delay before invoking a retry, essentially an exponential backoff before triggering. We can easily implement that "inside" the lambda; we can retry and wait for up to the 15m lambda duration.
But then we are billed for whole lambda execution time, while it is sleeping for however many backoffs are required.
Is there a way to configure the lambda/dynamodb trigger to have a delay (that we can control up and down) before invoking the retry? For SQS triggers there is some talk of a redrive policy that can somehow control the rate of retries, but it's not clear how, or whether, that applies to dynamodb streams.
I understand that the streams will "backup" as we slow down the dispatch of lambdas, but this is assumed to be a transient situation, and the dynamodb stream will act as a queue. And we can also configure a dead letter queue, but that is sort of orthogonal to the basic question.
You can configure a wait. And yes, while you are billed for that time, it's pennies. Seriously, the AWS free tier covers a million Lambda invocations a month. At the enterprise level it's really nothing compared to what EC2 servers cost. But I'm not your CFO, so maybe it is a concern.
You can take your stream, process it into whatever service calls you would need, and add all their payloads to the same SQS queue. You can configure your SQS queue to throttle itself, in effect, so it only sends so many messages over a given time. The messages in your queue would go to another Lambda that makes the service call for you, one at a time, doled out by SQS (a sketch of this fan-out follows below).
Alternatively, set up a Dead Letter Queue (possibly in combination with either of the above) to catch the failed records and try them again when traffic is lower.
As an aside, you don't want to 'pause' your DynamoDB stream, as it only retains records for 24 hours. If your stream pauses for too long you will lose data. Better to take the stream in whole and put it into an SQS queue as individual writes, because SQS retains messages for up to 14 days.
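A minimal sketch of that fan-out, assuming TypeScript and AWS SDK v3 (the queue URL comes from a hypothetical environment variable, and the 30-second delay is just an illustrative value):

```typescript
import { DynamoDBStreamEvent } from "aws-lambda";
import { SQSClient, SendMessageBatchCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const QUEUE_URL = process.env.QUEUE_URL!; // hypothetical environment variable

// Fan the stream records out to SQS so a separate, concurrency-limited consumer
// Lambda can call the downstream service at a controlled rate.
export const handler = async (event: DynamoDBStreamEvent) => {
  const entries = event.Records.map((record, i) => ({
    Id: `${i}`,
    MessageBody: JSON.stringify(record.dynamodb ?? {}),
    DelaySeconds: 30, // illustrative per-message delay (standard queues allow up to 900s)
  }));

  // SendMessageBatch accepts at most 10 entries per call
  for (let i = 0; i < entries.length; i += 10) {
    await sqs.send(
      new SendMessageBatchCommand({
        QueueUrl: QUEUE_URL,
        Entries: entries.slice(i, i + 10),
      })
    );
  }
};
```

Pairing this with a low reserved concurrency on the consumer Lambda is what actually caps the request rate against the downstream service.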

Best method to persist data from an AWS Lambda invocation?

I use AWS Simple Email Services (SES) for email. I've configured SES to save incoming email to an S3 bucket, which triggers an AWS Lambda function. This function reads the new object and forwards the object contents to an alternate email address.
I'd like to log some basic info. from my AWS Lambda function during invocation -- who the email is from, to whom it was sent, if it contained any links, etc.
Ideally I'd save this info to a database, but since AWS Lambda functions are costly (relative to other AWS operations), I'd like to do this as efficiently as possible.
I was thinking I could issue an HTTPS GET request to a private endpoint with a query string containing the info I want logged. Since I could fire my request asynchronously at the outset and continue processing, I thought this might be a cheap and efficient approach.
Is this a good method? Are there any alternatives?
My Lambda function fires irregularly, so despite Lambda containers being kept alive for 10 minutes or so after firing, it seems a database connection is likely to be slow and costly, since AWS charges per 100ms of usage.
Since I could conceivably get thousands of emails per month, keeping my Lambda function efficient is paramount for cost. I maintain hundreds of domain names, so my numbers aren't exaggerated. Thanks in advance.
I do not think that thousands of emails per month should be a problem; these cloud services have been developed with scalability in mind and can go way beyond the numbers you are suggesting.
In terms of persisting, without logs or metrics I cannot really see why your DB connection would be slow. From the moment you use AWS, traffic stays on its own internal infrastructure, so speeds will be high and not something you should be worrying about.
I am not an expert on billing, but from what you are describing, it seems like using Lambda + S3 + DynamoDB is highly optimised for your use case.
From the type of data you are describing (email data), it doesn't seem that you would have either a memory issue (Lambdas have memory constraints which can be a pain) or an I/O bottleneck. If you can share more details on the memory used during invocation and the time taken, that would be great. Also how much data you store on each Lambda invocation.
I think you could store JSON-serialized strings of your email data in DynamoDB easily; it should be pretty seamless and not that costly.
I have not used SES, but you could put a trigger on DynamoDB whenever you store a record, in case you want to follow up with another Lambda.
You could combine S3 + DynamoDB: when you store a record, simply upload a file containing the record to a new S3 key and update the row in DynamoDB with a pointer to the new S3 object.
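As a rough TypeScript sketch of that logging step (the table name and item attributes are hypothetical), the write from the Lambda can be a single PutItem:

```typescript
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

const ddb = new DynamoDBClient({});

// Hypothetical shape of the metadata extracted from the SES message stored in S3
interface EmailLogEntry {
  messageId: string;
  from: string;
  to: string;
  containsLinks: boolean;
  s3Key: string; // pointer to the raw email object in S3
}

// One small log item per forwarded email; a single-digit-millisecond PutItem
// adds very little to the billed Lambda duration.
export async function logEmail(entry: EmailLogEntry): Promise<void> {
  await ddb.send(
    new PutItemCommand({
      TableName: "email-log", // hypothetical table name
      Item: {
        messageId: { S: entry.messageId },
        from: { S: entry.from },
        to: { S: entry.to },
        containsLinks: { BOOL: entry.containsLinks },
        s3Key: { S: entry.s3Key },
      },
    })
  );
}
```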
You can now persist data using AWS EFS.

What is the maximum outbound connections I can create from AWS Lambda?

I am looking at the documentation on Lambda Limits, which says:
Number of file descriptors 1,024
I am wondering if this is per invoking lambda or total across all lambdas?
I am processing a very large number of items from a Kinesis stream and calling a web endpoint, and I seem to be hitting a bottleneck of about 1,024 concurrent connections to the API, and I'm not sure where the bottleneck is. I'm investigating limits on my load balancer and instances, but I'm also wondering whether Lambda itself simply cannot create more than 1,024 concurrent outbound connections across all lambdas.
This question is old, but a suitable answer may help others in the future. The limit, as correctly noted in the question, is 1,024 outbound connections per Lambda function; however, this limit applies only for the life cycle of the container. There are currently no public documents stating the length of the life cycle, but through my own testing I observed the following:
A new container is created after 5 minutes of idle time for the Lambda function
A new container is created after 60 minutes of frequent use of the Lambda function
A new container is created on any update to the code or configuration of the Lambda
A final note on new containers: when a new container is created, it runs all of your code from the start, whereas invoking a warm container just invokes the handler, skipping the loading of libraries, etc. Because of this, it is a best practice to implement connection pooling and declare the connection outside of the handler so that it can be reused in subsequent invocations; examples of this can be found in the AWS docs.
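As a rough TypeScript illustration of that practice (the endpoint and socket limit are hypothetical), a pooled keep-alive agent declared outside the handler might look like this:

```typescript
import * as https from "https";

// One agent per container, created outside the handler, so sockets are pooled and
// reused across warm invocations instead of opening a new connection (and consuming
// a new file descriptor) on every call.
const agent = new https.Agent({ keepAlive: true, maxSockets: 50 });

export const handler = async (event: { path: string }): Promise<string> =>
  new Promise((resolve, reject) => {
    https
      .get(
        { host: "api.example.com", path: event.path, agent }, // hypothetical endpoint
        (res) => {
          let body = "";
          res.on("data", (chunk) => (body += chunk));
          res.on("end", () => resolve(body));
        }
      )
      .on("error", reject);
  });
```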

Amazon Web Services: Spark Streaming or Lambda

I am looking for some high-level guidance on an architecture. I have a provider writing "transactions" to a Kinesis pipe (about 1MM/day). I need to pull those transactions off, one at a time, validate the data, hit other SOAP or REST services for additional information, apply some business logic, and write the results to S3.
One approach that has been proposed is to use a Spark job that runs forever, pulling data and processing it within the Spark environment. The benefits were enumerated as shareable cached data, availability of SQL, and in-house knowledge of Spark.
My thought was to have a series of Lambda functions that would process the data. As I understand it, I can have a Lambda watching the Kinesis pipe for new data. I want to run the pulled data through a bunch of small steps (lambdas), each one doing a single step in the process. This seems like an ideal use of Step Functions. With regards to caches, if any are needed, I thought that Redis on ElastiCache could be used.
Can this be done using a combination of Lambda and Step Functions (using lambdas)? If it can be done, is it the best approach? What other alternatives should I consider?
This can be achieved using a combination of Lambda and Step Functions. As you described, the lambda would monitor the stream and kick off a new execution of a state machine, passing the transaction data to it as an input. You can see more documentation around kinesis with lambda here: http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html.
The state machine would then pass the data from one Lambda function to the next, where it will be processed and written to S3. You will need to contact AWS for an increase to the default StartExecution API limit of 2 per second to support 1MM/day.
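A minimal TypeScript sketch of the Lambda that watches the Kinesis stream and starts one execution per record (the state machine ARN comes from a hypothetical environment variable, and the records are assumed to already be JSON):

```typescript
import { KinesisStreamEvent } from "aws-lambda";
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";

const sfn = new SFNClient({});
const STATE_MACHINE_ARN = process.env.STATE_MACHINE_ARN!; // hypothetical environment variable

// Start one state machine execution per transaction; the state machine then runs
// the validate / enrich / write-to-S3 steps as individual Lambda states.
export const handler = async (event: KinesisStreamEvent) => {
  for (const record of event.Records) {
    const transaction = Buffer.from(record.kinesis.data, "base64").toString("utf8");
    await sfn.send(
      new StartExecutionCommand({
        stateMachineArn: STATE_MACHINE_ARN,
        input: transaction, // StartExecution expects a JSON string
      })
    );
  }
};
```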
Hope this helps!

How to add storage-level caching between DynamoDB and Titan?

I am using the Titan/DynamoDB library to use AWS DynamoDB as a backend for my Titan DB graphs. My app is very read-heavy and I noticed Titan is mostly executing query requests against DynamoDB. I am using transaction- and instance-local caches and indexes to reduce my DynamoDB read units and the overall latency. I would like to introduce a cache layer that is consistent for all my EC2 instances: A read/write-through cache between DynamoDB and my application to store query results, vertices, and edges.
I see two solutions to this:
Implicit caching done directly by the Titan/DynamoDB library. Classes like the ParallelScanner could be changed to read from AWS ElastiCache first. The change would have to be applied to read & write operations to ensure consistency.
Explicit caching done by the application before even invoking the Titan/Gremlin API.
The first option seems to be the more fine-grained, cross-cutting, and generic.
Does something like this already exist? Maybe for other storage backends?
Is there a reason why this does not exist already? Graph DB applications seem to be very read-intensive, so cross-instance caching seems like a pretty significant feature to speed up queries.
First, ParallelScanner is not the only thing you would need to change. Most importantly, all the changes you need to make are in DynamoDBDelegate (that is the only class that makes low level DynamoDB API calls).
Regarding implicit caching, you could add a caching layer on top of DynamoDB. For example, you could implement a cache using API Gateway on top of DynamoDB, or you could use Elasticache. Either way, you need to figure out a way to invalidate Query/Scan pages. Inserting/deleting items will cause page boundaries to change so it requires some thought.
Explicit caching may be easier to do than implicit caching. The level of abstraction is higher, so based on your incoming writes it may be easier for you to decide at the application level whether a traversal that is cached needs to be invalidated. If you treat your graph application as another service, you could cache the results at the service level.
Something in between may also be possible (but requires some work). You could continue to use your vertex/database caches as provided by Titan, and use a low value for TTL that is consistent with how frequently you write columns. Or, you could take your caching approach a step further and do the following.
Enable DynamoDB Stream on edgestore.
Use a Lambda function to stream the edgestore updates to a Kinesis Stream (a sketch of this step follows after this list).
Consume the Kinesis Stream with edgestore updates in the same JVM as the Gremlin Server on each of your Gremlin Server instances. You would need to instrument the database level cache in Titan to consume the Kinesis stream and invalidate the cached columns as appropriate, in each Titan instance.
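As a rough TypeScript sketch of that second step (the stream name and the key attribute are hypothetical, and how the consumers map keys back to cached columns is left out), the forwarding Lambda could look like this:

```typescript
import { DynamoDBStreamEvent } from "aws-lambda";
import { KinesisClient, PutRecordsCommand } from "@aws-sdk/client-kinesis";

const kinesis = new KinesisClient({});
const STREAM_NAME = process.env.INVALIDATION_STREAM!; // hypothetical environment variable

// Forward edgestore changes to a Kinesis stream that each Gremlin Server instance
// consumes in order to invalidate its local Titan database-level cache.
export const handler = async (event: DynamoDBStreamEvent) => {
  const records = event.Records.map((record) => ({
    Data: Buffer.from(JSON.stringify(record.dynamodb?.Keys ?? {})),
    PartitionKey: record.dynamodb?.Keys?.hk?.S ?? "unknown", // hypothetical key attribute
  }));

  // PutRecords accepts up to 500 records per call
  for (let i = 0; i < records.length; i += 500) {
    await kinesis.send(
      new PutRecordsCommand({
        StreamName: STREAM_NAME,
        Records: records.slice(i, i + 500),
      })
    );
  }
};
```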
