AWS DynamoDB persistent connection - performance

Given an AWS Lambda function on Node.js 6 that performs a very simple, test-purpose CRUD interaction with a DynamoDB table. Measured performance is extremely slow, independent of the selected Lambda RAM or the RCU/WCU units provisioned for DynamoDB.
I ran a benchmark, and the results are unsatisfying. Even a MySQL database in a micro container performs several times better than DynamoDB.
Update operations  | 1000               | 10000              | 20000              | 100000
RCU=1000/WCU=1000  | 104708 ms / 942 MB | 176109 ms / 707 MB | 276689 ms / 896 MB | N/A (>5 min)
RCU=2000/WCU=2000  |  45953 ms / 646 MB | 167686 ms / 829 MB | 245937 ms / 896 MB | N/A (>5 min)
RCU=3000/WCU=3000  |  74205 ms / 657 MB | 151072 ms / 840 MB | 253800 ms / 854 MB | N/A (>5 min)
RCU=4000/WCU=4000  |  76636 ms / 896 MB | 175258 ms / 896 MB | 257238 ms / 896 MB | N/A (>5 min)
(Each cell shows elapsed time / max Lambda memory used.)
After some quick research I found the reason for this behavior: the DynamoDB client performs a new HTTP(S) request for each CRUD operation (https://github.com/aws/aws-sdk-js/blob/master/lib/http/node.js#L25). That is extremely slow, especially over TLS, which has a relatively long connection-establishment time including key exchanges. The HTTP headers are also a large overhead, sometimes bigger than the CRUD payload itself.
So the question: is there a way to communicate with DynamoDB over a persistent connection from a Node.js Lambda? Batch operations are not an appropriate solution, since they do not support update operations.

The AWS Node.js SDK supports HTTP keep-alive, but in v2 it is not enabled by default.
Set the AWS_NODEJS_CONNECTION_REUSE_ENABLED environment variable to 1 to enable connection reuse.
Please refer to: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/node-reusing-connections.html

By design, DynamoDB is a web service. You can't hold a persistent connection to the database the way you can with an RDBMS or most other databases. The documentation says:
DynamoDB is a web service, and interactions with it are stateless.
Applications do not need to maintain persistent network connections.
Instead, interaction with DynamoDB occurs using HTTP(S) requests and
responses.
Workaround solution:
If the updates are unrelated, you can issue them asynchronously to improve performance. However, you then need to think about how you want to handle errors.
The code below registers no callback on the update, so it sends the update operation without waiting for its response:
const requestObj = docClient.update(params); // build the request without a callback
requestObj.send(); // fire it and don't wait for the response
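If you do need the errors back, one sketch of the same idea is to fire the updates in parallel and collect failures afterwards; `updateItem` below is a hypothetical stand-in for `(p) => docClient.update(p).promise()`:

```javascript
// Issue independent updates in parallel and collect failures afterwards,
// instead of paying one blocking HTTP round-trip per item.
async function updateAll(paramsList, updateItem) {
  const results = await Promise.allSettled(paramsList.map((p) => updateItem(p)));
  const failures = paramsList.filter((_, i) => results[i].status === 'rejected');
  return { ok: results.length - failures.length, failures };
}
```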

Related

Provisioned concurrency has minor impact on response time of Lambda function

We are using a serverless architecture with AWS Lambda and an API Gateway. The execution time of the Lambda is in the range of a few milliseconds, yet the final response at the client's end arrives in seconds, which is far more than the Lambda's execution time even when the init duration of cold starts is taken into account.
While debugging this with the API Gateway logs, I found integration latency in the range of seconds, which makes the end-to-end response considerably slow. To remove the init duration (cold start), I added CloudWatch rules to call the Lambdas periodically and keep them warm.
The init duration disappeared completely, and this reduced the integration latency as well. Some Lambdas cannot be scheduled this way because calling them requires authentication, so for those I added provisioned concurrency of 5.
However, that Lambda still shows an init duration in the logs. Provisioned concurrency is supposed to be another way to get rid of cold starts, but it is not affecting the time at which the Lambda's response becomes available to API Gateway.
I have followed below links to assign provisioned concurrency to Lambdas:
Provisioned Concurrency: What it is and how to use it with the Serverless Framework
AWS News Blog – Provisioned Concurrency for Lambda Functions
CloudWatch logs of the Lambda to which I have added provisioning to:
Duration: 1331.38 ms Billed Duration: 1332 ms Memory Size: 256 MB Max Memory Used: 130 MB Init Duration: 1174.18 ms
One thing I noticed in the API Gateway and Lambda logs is that the request was sent from API Gateway at 2021-02-15T11:51:36.621+05:30 but received by the Lambda at 2021-02-15T11:51:38.535+05:30. That is about 2 seconds of delay before the request even reaches the Lambda.
AWS X-RAY TRACING
I have enabled AWS X-Ray tracing for both API Gateway and Lambda, and this is what the traces show: the request took 595 ms in total, yet in Postman the response was received after 1558 ms. Where is the extra second or so added before the response from API Gateway reaches the client?
I believe the reason is that the provisioned concurrency of 5 is not enough and you still run into cold starts of your Lambda function. This means if the external service is calling your API endpoint (i.e. your Lambda function behind API Gateway), your Lambda function is warm with 5 instances. If we assume your Lambda function can handle 2 requests per second (500ms for each invocation), then you can roughly handle 10 requests per second with your Lambda function. If the external service is making 20 requests per second, AWS Lambda tries to spin up new instances because the existing ones are busy handling requests. This has the consequence that the external service experiences high response times because of cold starts of your function.
Also, consider that the instances of your Lambda function do not live "forever" but are cleaned up after some point. I.e. if you experience many spikes in your traffic patterns, then this can mean that after one spike the instances live like 15 minutes, then AWS Lambda shuts them down to only keep the 5 provisioned ones and if then another spike comes, you'll see the same problem as before.
Please note: this is a very simplified explanation of what's happening behind the scenes, and more of an educated guess based on your description. It would help if you provided some example numbers (e.g. init duration, execution duration, response time) and maybe some example code showing what your Lambda function does. Also: which runtime are you using, and what does your traffic pattern look like?
Potential solutions
Reduce the cold start time of your Lambda functions -> always a good idea for Lambda functions that are behind an API Gateway
Provision more instances -> only possible up to a certain (soft) limit
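The arithmetic in the answer above can be sketched as a rule of thumb: warm instances needed is roughly arrival rate times average duration (Little's law). The numbers below reuse the hypothetical 20 req/s at 500 ms example:

```javascript
// Little's-law style estimate: concurrent executions ≈ requests/s × avg duration.
function requiredConcurrency(requestsPerSecond, avgDurationSeconds) {
  return Math.ceil(requestsPerSecond * avgDurationSeconds);
}

console.log(requiredConcurrency(20, 0.5)); // → 10 warm instances for 20 req/s at 500 ms
```

With only 5 provisioned instances, the remaining requests hit on-demand instances and can experience cold starts.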
===== Flow-of-Services =====
API GW ⮕ Lambda function(Provisioned)
===== Query =====
You want to understand why there is latency while processing the request from API GW to Lambda function.
===== Time-stamp-of-CW =====
2021-02-15T11:51:36.621+05:30
2021-02-15T11:51:38.535+05:30
Lambda duration - Duration: 1331.38 ms Billed Duration: 1332 ms Memory Size: 256 MB Max Memory Used: 130 MB Init Duration: 1174.18 ms
===== Follow-up questions =====
While the request was processed from API GW to the Lambda function, the execution environment took 1174.18 ms (~1.1 s) to become active and executed your code in the remaining ~0.3 s, which makes a total of about 1.4 seconds.
Q. What is the type of processor you are using?
Q. Type of API & Endpoint type?
===== K.C =====
You should read the AWS documentation on optimizing your Lambda function code: "Optimizing static initialization".
Lambda won't charge you for the time it takes to initialize your code (e.g. importing dependencies) as long as initialization finishes in about X seconds.
===== Replication Observation =====
Without provisioned concurrency:
API GW execution time - 286 ms
Initialization - 195 ms
Invocation - 11 ms
Overhead - 0 ms
With provisioned concurrency:
API GW execution time - 1,103 ms
Initialization - 97 ms
Invocation - 1 ms
Overhead - 0 ms
I'm in the US-WEST-2 region and calling from 12,575 km away from the Region. I have a REST API configured with the 'Regional' endpoint type. The Lambda function runs on x86_64 (64-bit x86 architecture, for x86-based processors).
-- Check whether you have optimized your Lambda function code.
-- For lower latency, you may use an 'edge'-optimized REST API. An edge-optimized API endpoint is best for geographically distributed clients: API requests are routed to the nearest CloudFront Point of Presence (POP).
-- Always choose the Region closest to the high-traffic region.
References:
new-provisioned-concurrency-for-lambda-functions
provisioned-concurrency.html#optimizing-latency

High latency when sending events to Azure Event Hub

I have an API which takes a JSON object and forwards it to Azure Event Hub. The API runs .NET Core 3.1 with Event Hubs SDK 3.0, and it also has Application Insights configured to collect dependency telemetry, including Event Hub calls.
Using the following Kusto query in Application Insights, I found that some calls to Event Hub have really high latency (the highest is 60 seconds; on average they fall around 3-7 seconds).
dependencies
| where timestamp > now()-7d
| where type == "Azure Event Hubs" and duration > 3000
| order by duration desc
It is also worth noting that the query returns 890 results out of 4.6 million Azure Event Hubs dependency records.
I've checked the Event Hub metrics blade in the Azure Portal: the average (at 1-minute granularity) incoming/outgoing requests are way below the throughput unit (I have 2 event hubs in one EH namespace, 1 TU, autoscaling to 20 max), at around 50-100 messages per second and around 100 kB, both incoming and outgoing, with 0 throttled requests and only occasional 1-2 server/user errors.
There are spikes, but they do not exceed the throughput limit, and the timestamps of the slow dependencies don't match those spikes either.
I also increased the throughput units to 2 manually, and it did not change anything.
My question is:
Is it normal to sometimes see extremely high latency to Event Hub? Or is it acceptable as long as it affects only a small fraction of requests?
Code-wise, I use only one EventHubClient instance to send all requests. Is that bad practice, or should I use something like a client pool?
A support engineer also told me that during a window where Application Insights shows high latency, the Event Hub logs do not show such high latency (322 ms max). Without going into details, is it possible for Application Insights to produce incorrect performance telemetry?

Occasional AWS Lambda timeouts, but otherwise sub-second execution

We have an AWS Lambda written in Java that usually completes in about 200 ms. Occasionally, it times out after 5 seconds (our configured timeout value).
I understand that there is occasional added latency due to container setup (though, I'm not clear if that counts against your execution time). I added some debug logging, and it seems like the code just runs slow.
For example, a particularly noticeable log entry shows a call to HttpClients.createDefault usually takes less than 200 ms (based on the fact that the Lambda executes in less than 200 ms), but when the timeout happens, it takes around 2-3 seconds.
2017-09-14 16:31:28 DEBUG Helper:Creating HTTP Client
2017-09-14 16:31:31 DEBUG Helper:Executing request
Unless I'm misunderstanding something, it seems like any latency due to container initialization would have already happened. Am I wrong in assuming that code execution should not have dramatic differences in speed from one execution to the next? Or is this just something we should expect?
Setting up new containers or replacing cold containers takes some time, and both count against your execution time. The time you see in the console is the time you are billed for.
I assume Amazon doesn't charge for provisioning the container itself, but they certainly start the timer as soon as your runtime starts. You are likely paying for the time during which the SDK/JDK gets initialized and loads its classes. They are certainly not charging us for starting the operating system that hosts the containers.
Running a simple Java Lambda twice shows the different times for new and reused instances: the first run takes 374.58 ms and the second 0.89 ms, with billed durations of 400 ms and 100 ms respectively; for the second run the container was reused. While you can try to keep your containers warm, as already pointed out by #dashmug, AWS will occasionally recycle containers and will spawn new ones as load increases or decreases. The blog posts "How long does AWS Lambda keep your idle functions around before a cold start?" and "How does language, memory and package size affect cold starts of AWS Lambda?" are worth a look as well. If you include external libraries, your times will increase; that blog shows that for Java with smaller memory allocations, cold starts can regularly exceed 2-4 seconds.
Given these times you should probably increase your timeout, and not only look at the log lines produced by your application but also at the START, END and REPORT entries for an actual timeout event. Each running Lambda container instance seems to create its own log stream. Consider keeping your Lambdas warm if they aren't called that often.
05:57:20 START RequestId: bc2e7237-99da-11e7-919d-0bd21baa5a3d Version: $LATEST
05:57:20 Hello from Lambda com.udoheld.aws.lambda.HelloLogSimple.
05:57:20 END RequestId: bc2e7237-99da-11e7-919d-0bd21baa5a3d
05:57:20 REPORT RequestId: bc2e7237-99da-11e7-919d-0bd21baa5a3d Duration: 374.58 ms Billed Duration: 400 ms Memory Size: 128 MB Max Memory Used: 44 MB
05:58:01 START RequestId: d534155b-99da-11e7-8898-2dcaeed855d3 Version: $LATEST
05:58:01 Hello from Lambda com.udoheld.aws.lambda.HelloLogSimple.
05:58:01 END RequestId: d534155b-99da-11e7-8898-2dcaeed855d3
05:58:01 REPORT RequestId: d534155b-99da-11e7-8898-2dcaeed855d3 Duration: 0.89 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 44 MB
Try keeping your function always warm and see if it makes a difference.
If the timeouts really are due to container warm-up, keeping the containers warm will greatly reduce the frequency of these timeouts. You'd still get cold starts when you deploy changes, but at least those are predictable.
https://read.acloud.guru/how-to-keep-your-lambda-functions-warm-9d7e1aa6e2f0
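A minimal Node.js sketch of such a warm-up handler, assuming the scheduled event carries a hypothetical `warmup` marker field so the pings skip the real work:

```javascript
// Scheduled pings keep the container warm; the handler short-circuits them
// so they stay cheap and never touch downstream services.
const handler = async (event) => {
  if (event && event.warmup) {
    return 'warmed'; // ping from the schedule, skip real work
  }
  // ... real request handling goes here ...
  return 'done';
};

module.exports = { handler };
```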
For Java-based applications the warm-up period is longer because of the JVM. Node.js or Python have shorter warm-up periods, so consider them if switching is an option. If you can't switch the tech stack, keep the container warm by triggering it periodically, or increase the memory size: that reduces execution time, because Lambda allocates more CPU for larger memory allocations.

Improve caching performance in Mule

I am using Anypoint 6.1 and Mule 3.8.1 and I'm finding problems with the performance and it looks like it is down to the cache scope.
The cache is a managed store (so I can invalidate the cache when new data is loaded) and has the following values:
Max Entries: 1000
Entry TTL: 84600
Expiration Interval: 84600
The response returns approx 200 JSON records.
Is there any way to improve this and get a faster response?
Thanks
Expiration Interval is the frequency with which the object store checks for expired cached response events. It can be set anywhere from 1 second up to hours; depending on the message rate you expect, you can try different values and test your application's performance.
Also, try an in-memory-object-store for your caching strategy: it keeps responses in system memory, so it is a bit faster, but you have to be careful with its usage to avoid OutOfMemory errors.

Bigquery Streaming inserts, persistent or new http connection on every insert?

I am using google-api-ruby-client for streaming data into BigQuery. Whenever there is a request, it is pushed into Redis as a queue, and then a Sidekiq worker tries to insert it into BigQuery. I think this involves opening a new HTTPS connection to BigQuery for every insert.
the way, I have it setup is:
Events post every 1 second or when the batch size reaches 1 MB (one megabyte), whichever occurs first. This is per worker, so the BigQuery API may receive tens of HTTP posts per second over multiple HTTPS connections.
This is done using the provided API client by Google.
Now the question: for streaming inserts, which is the better approach?
A persistent HTTPS connection. If so, should it be a global connection shared across all requests, or something else?
Opening a new connection per insert, as we do now with google-api-ruby-client.
I think it's much too early to talk about these optimizations. Other context is also missing, such as whether you have exhausted the kernel's TCP connections, how many connections are in the TIME_WAIT state, and so on.
Until the worker pool reaches on the order of 1,000 connections per second on the same machine, you should stick with the default mode the library offers.
Beyond that, optimizing would require a lot more context and a deep understanding of how this works under the hood.
On the other hand, you can batch more rows into the same streaming insert request; the limits are:
Maximum row size: 1 MB
HTTP request size limit: 10 MB
Maximum rows per second: 100,000 rows per second, per table.
Maximum rows per request: 500
Maximum bytes per second: 100 MB per second, per table
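As a sketch of batching under those limits, a helper like the following (illustrative, not part of any Google SDK) groups rows into requests of at most 500 rows and roughly the request-size cap:

```javascript
// Group rows into streaming-insert batches: at most `maxRows` rows and
// roughly `maxBytes` of serialized JSON per request.
function batchRows(rows, maxRows = 500, maxBytes = 10 * 1024 * 1024) {
  const batches = [];
  let current = [];
  let bytes = 0;
  for (const row of rows) {
    const size = Buffer.byteLength(JSON.stringify(row));
    if (current.length >= maxRows || (current.length > 0 && bytes + size > maxBytes)) {
      batches.push(current);
      current = [];
      bytes = 0;
    }
    current.push(row);
    bytes += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each batch then becomes one insertAll-style HTTP request instead of one request per row.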
Read my other recommendations
Google BigQuery: Slow streaming inserts performance
To give some more context for the complex situation when ports are exhausted, let's say a machine has a pool of 30,000 local ports and 500 new connections per second (a typical rate):
1 second goes by and you have 29,500 ports left
10 seconds go by and you have 25,000
30 seconds go by and you have 15,000
at 59 seconds you are down to 500
at 60 seconds the first 500 ports are released again, so you level out at 29,500 in use, and that keeps rolling. Everyone is happy.
Now say you're averaging 550 connections a second. Suddenly there aren't any available ports to use.
So your first option is to bump up the range of allowed local ports. Easy enough, but even if you open it up as much as you can and go from 1025 to 65535, that's still only about 64,000 ports; with a 60-second TCP_TIMEWAIT_LEN you can sustain an average of roughly 1,000 connections a second, and still no persistent connections are in use.
This port exhaustion problem is discussed in more detail here: http://www.gossamer-threads.com/lists/nanog/users/158655
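The arithmetic behind that walkthrough fits in one line: the sustainable new-connection rate is roughly the usable local ports divided by the TIME_WAIT length (a sketch, ignoring port-range tuning details):

```javascript
// Without connection reuse, each request holds a local port for the TIME_WAIT
// period, so the sustainable rate is ports / TIME_WAIT seconds.
function maxNewConnectionsPerSecond(localPorts, timeWaitSeconds) {
  return Math.floor(localPorts / timeWaitSeconds);
}

console.log(maxNewConnectionsPerSecond(30000, 60)); // → 500, matching the walkthrough
console.log(maxNewConnectionsPerSecond(64000, 60)); // → 1066, the "~1,000/s" ceiling
```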
