I want to build a cronjob-like system that gets all users from the database, makes multiple (I mean lots of) concurrent requests for each of them, runs some processing, and saves the results to the db. It will run every hour, 24/7.
I came up with this solution:
Gets all users from db (that's the easy part)
Dynamically creates lambda functions and distributes all users to these functions
Each lambda function makes concurrent requests and runs the processing
(handling results and saving them to the db)
Have these functions communicate via SNS when needed
So, does my approach make sense for this situation?
The most important thing here is scaling (that's why I thought of distributing the users across lambda functions, to limit concurrent requests and resources). How can we come up with a scalable and efficient design for an exponentially increasing user count?
Or any other suggestions?
Here is my solution:
If 100 concurrent lambdas are not enough for your needs, create a support ticket to increase your limit; you will only be charged for what you use.
However, you still can't determine how many lambdas will be required in the future. It is not necessary to process each user in a separate lambda; instead, you can invoke a lambda with a chunk of user data. For example, let's say your max lambda limit is 100 and there are 1000 users. Then you can do something like this (I don't know Go; here is Python code which may not be 100% syntactically correct):
users = get_users_fromdb()  # e.g. users = [1, 2, 3, ... 1000]
number_of_users = len(users)
chunk_size = max(number_of_users // 100, 1)  # 100 is your lambda limit
for i in range(0, number_of_users, chunk_size):
    # e.g. chunk_users_data = [1, 2, 3, ... 10]
    chunk_users_data = users[i : i + chunk_size]
    invoke_lambda_to_process_users_chunk_data(chunk_users_data)
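In case it helps, here is a minimal sketch of what that invoke could look like with boto3, assuming the worker function is called process_users_chunk (the function name and the payload shape are my assumptions, not part of the original answer):

import json
import boto3

lambda_client = boto3.client('lambda')

def invoke_lambda_to_process_users_chunk_data(chunk_users_data):
    # InvocationType='Event' makes the call asynchronous, so the loop above
    # does not wait for each chunk to finish before invoking the next one
    lambda_client.invoke(
        FunctionName='process_users_chunk',  # hypothetical worker function name
        InvocationType='Event',
        Payload=json.dumps({'users': chunk_users_data}),
    )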
Here is what you can do in the other lambda:
def handler(event, context):
    users = event.get('users', [])
    for user in users:
        try:
            process_user(user)
        except Exception as e:
            print(e)  # handle the exception / error if you want
Update:
By default, the limit for concurrently running lambdas is 100. If you have 100K users, IMO you should open a support case to increase your account's concurrent lambda limit to 1000 or more. I work with Lambda and we have a 10K limit. One more thing to keep in mind: it is not guaranteed that a single lambda invocation will be able to process all users in a chunk, so add some logic to reinvoke with the remaining users before the timeout. A lambda can run for a maximum of 5 minutes. You can get the remaining time, in milliseconds, from the context object.
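As a rough illustration of that reinvoke-before-timeout idea (the 30-second safety margin is an arbitrary assumption), the chunk handler could look something like this:

import json
import boto3

lambda_client = boto3.client('lambda')

def handler(event, context):
    users = event.get('users', [])
    for index, user in enumerate(users):
        # if we are close to the timeout, hand the remaining users
        # off to a fresh invocation of this same function
        if context.get_remaining_time_in_millis() < 30000:
            lambda_client.invoke(
                FunctionName=context.function_name,
                InvocationType='Event',
                Payload=json.dumps({'users': users[index:]}),
            )
            return
        try:
            process_user(user)
        except Exception as e:
            print(e)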
Related
I am using AWS with the Serverless Framework. My lambda function gets triggered by an event. It then talks to the database, and there is a limit on the number of connections I can open to the DB.
So I want to only run 5 lambda functions at a time and queue other events. I know there is:
provisionedConcurrency: 3 # optional, Count of provisioned lambda instances
reservedConcurrency: 5 # optional, reserved concurrency limit for this function. By default, AWS uses account concurrency limit
So in this case, the specified number of long-running jobs will be there, and they will be serving the events.
But rather than that, what I want is event queuing, with the functions triggered such that at most 5 are running at a time.
I am wondering whether this notion of event queuing is supported in AWS?
In AWS Lambda, a concurrency limit determines how many invocations of a function can run simultaneously in one region. You can set this limit through the AWS Lambda console or through the Serverless Framework.
If your account limit is 1000 and you reserved 100 concurrent executions for a specific function and 100 concurrent executions for another, the rest of the functions in that region will share the remaining 800 executions.
If you reserve concurrent executions for a specific function, AWS Lambda assumes that you know how many to reserve to avoid performance issues. Functions with allocated concurrency can’t access unreserved concurrency.
The right way to set the reserved concurrency limit in Serverless Framework is the one you shared:
functions:
  hello:
    handler: handler.hello # required, handler set in AWS Lambda
    reservedConcurrency: 5 # optional, reserved concurrency limit for this function. By default, AWS uses account concurrency limit
I would suggest using SQS to manage your queue. One of the common architectural reasons for using a queue is to limit the pressure on a different part of your architecture. This could mean preventing overloading of a database or avoiding rate limits on a third-party API when processing a large batch of messages.
For example, let's think about your case, where your SQS processing logic needs to connect to a database. You want to limit your workers to no more than 5 open connections to your database at a time; with concurrency control, you can set proper limits to keep your architecture up.
In your case you could have a function, hello, that receives your requests and puts them in an SQS queue. On the other side, the function compute will get those SQS messages and process them, with the number of concurrent invocations limited to 5.
You can even set a batch size, which is the number of SQS messages that can be included in a single Lambda invocation.
functions:
  hello:
    handler: handler.hello
  compute:
    handler: handler.compute
    reservedConcurrency: 5
    events:
      - sqs:
          arn: arn:aws:sqs:region:XXXXXX:myQueue
          batchSize: 10 # how many SQS messages can be included in a single Lambda invocation
          maximumBatchingWindow: 60 # maximum amount of time in seconds to gather records before invoking the function
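For completeness, here is a minimal sketch of what handler.hello could do to enqueue the requests, assuming the queue URL is provided via an environment variable (the variable name and message shape are assumptions):

import json
import os
import boto3

sqs = boto3.client('sqs')

def hello(event, context):
    # push the incoming request onto the queue; 'compute' picks it up later,
    # with at most 5 concurrent invocations thanks to reservedConcurrency: 5
    sqs.send_message(
        QueueUrl=os.environ['QUEUE_URL'],  # hypothetical environment variable
        MessageBody=json.dumps(event),
    )
    return {'statusCode': 200}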
Have you considered a proxy endpoint (acting like a pool) instead of limiting the concurrency of the lambda? Also, I think the Lambda <-> SQS communication happens via an event poller, and setting the concurrency lower than however many pollers are running will cause you to have to handle lost messages.
https://aws.amazon.com/rds/proxy/
I am implementing a solution that involves SQS triggering a Lambda function, which uses a 3rd-party API to perform some operations.
That 3rd-party API has a limit of requests per second, so I would like to limit the rate at which SQS messages are processed by my Lambda function to a similar rate.
Is there any way to limit the number of messages visible per second on the SQS queue, or the number of invocations per second of a Lambda function?
[edited]
After some insights given in the comments about AWS Kinesis:
There is no clean solution from tuning the Kinesis parameters Batch Window, Batch Size and payload size, because of the way Kinesis behaves: it triggers the lambda execution as soon as ANY of those thresholds is reached:
* Let N = the max number of requests per second I can execute against the 3rd-party API.
* Configuring a Batch Window of 1 second and a Batch Size of N, back pressure can still trigger executions with more than N requests.
* Configuring a Batch Window of 1 second and a Batch Size of MAX_ALLOWED_VALUE will be underperformant and also does not guarantee fewer than N executions per second.
The simplest solution I have found is creating a Lambda on a fixed 1-second schedule that reads a fixed number of messages N from SQS/Kinesis and writes them to another SQS queue / Kinesis stream, which has the actual processing Lambda as its consumer.
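To make that idea concrete, here is a rough sketch of such a forwarder for the SQS variant (the queue URLs, the environment variable names and the value of N are assumptions; a scheduled trigger would invoke it at the desired rate):

import os
import boto3

sqs = boto3.client('sqs')
N = 10  # max requests per second allowed by the 3rd-party API (assumption)

def forward(event, context):
    # move at most N messages per run from the source queue to the target queue
    response = sqs.receive_message(
        QueueUrl=os.environ['SOURCE_QUEUE_URL'],
        MaxNumberOfMessages=min(N, 10),  # a single SQS receive returns at most 10
    )
    for message in response.get('Messages', []):
        sqs.send_message(
            QueueUrl=os.environ['TARGET_QUEUE_URL'],
            MessageBody=message['Body'],
        )
        sqs.delete_message(
            QueueUrl=os.environ['SOURCE_QUEUE_URL'],
            ReceiptHandle=message['ReceiptHandle'],
        )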
This is a difficult situation.
Amazon SQS can trigger multiple AWS Lambda functions in parallel, so there is no central oversight of how fast requests are made to the 3rd-party API.
From Managing concurrency for a Lambda function - AWS Lambda:
To ensure that a function can always reach a certain level of concurrency, you can configure the function with reserved concurrency. When a function has reserved concurrency, no other function can use that concurrency. Reserved concurrency also limits the maximum concurrency for the function, and applies to the function as a whole, including versions and aliases.
Therefore, concurrency can be used to limit the number of simultaneous Lambda functions executing, but this does not necessarily map to "x API calls per second". That would depend upon how long the Lambda function takes to execute (eg 2 seconds) and how many API calls it makes in that time (eg 2 API calls).
It might be necessary to introduce delays either within the Lambda function (not great because you are still paying for the function to run while waiting), or outside the Lambda function (by triggering the Lambda functions in a different way, or even doing the processing outside of Lambda).
The easiest (but not efficient) method might be:
Set a concurrency of 1
Have the Lambda function retry the API call if it is rejected
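As a sketch of that retry idea (call_third_party_api and RateLimitError are hypothetical placeholders for your actual API client and its "rejected" signal):

import time

class RateLimitError(Exception):
    pass  # stand-in for whatever error your 3rd-party client raises when rejected

def call_third_party_api(payload):
    raise NotImplementedError  # placeholder for the real 3rd-party call

def call_with_retry(payload, max_attempts=5):
    delay = 1
    for attempt in range(max_attempts):
        try:
            return call_third_party_api(payload)
        except RateLimitError:
            time.sleep(delay)  # back off before trying again
            delay *= 2
    raise RuntimeError('API kept rejecting the request')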
Thanks to @John Rotenstein, who gave a comprehensive and detailed answer about the SQS part.
If your design is limited to a single consumer, then you may replace SQS with Kinesis streams. By doing so, you can use the Batch Window option of Kinesis to limit the requests made by the consumer. The Batch Window option is used to reduce the number of invocations.
Lambda reads records from a stream at a fixed cadence (e.g. once per second for Kinesis data streams) and invokes a function with a batch of records. Batch Window allows you to wait as long as 300s to build a batch before invoking a function. Now, a function is invoked when one of the following conditions is met: the payload size reaches 6MB, the Batch Window reaches its maximum value, or the Batch Size reaches its maximum value. With Batch Window, you can increase the average number of records passed to the function with each invocation. This is helpful when you want to reduce the number of invocations and optimize cost.
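If you happen to configure this with the Serverless Framework (used earlier in this thread), the Batch Window is set on the stream event. A minimal sketch, where the function name and stream ARN are placeholders (check the current docs for exact limits):

functions:
  consumer:
    handler: handler.consume
    events:
      - stream:
          type: kinesis
          arn: arn:aws:kinesis:region:XXXXXX:stream/myStream
          batchSize: 100
          batchWindow: 10 # seconds to gather records before invoking the function (up to 300)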
We use an API that imposes a rate limit per hour.
I wonder what the best way would be to make a set number of requests per hour to the API from our own scripts, i.e. making 10 requests per hour so we don't exceed our allowance and avoid overcharges.
I was thinking of just using sleep(60*6) in my loop, but API calls can take minutes, so it might end up making far fewer requests than allowed.
What would be the best practice to spread out our requests?
Edit:
I ended up doing something like this; what do you guys think?
while(queue.size > 0) do
  Thread.new {
    element = queue.pop
    # do the rate limited API calls and things
  }
  sleep(60*6)
end
Consider the Rack::Attack middleware.
To sum it up: you keep somewhere (in memory, or in a database like Redis) the number of requests executed by a specific client (identified by IP, identity, or in any other form) within a given time window.
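A rough sketch of that counter idea in Python, assuming a reachable Redis instance and a 1-hour window (the key naming and the limit value are assumptions):

import time
import redis

r = redis.Redis()
LIMIT_PER_HOUR = 10  # your hourly allowance

def allowed(client_id):
    # one counter per client per hour window; the key expires automatically
    key = f"rate:{client_id}:{int(time.time() // 3600)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, 3600)
    return count <= LIMIT_PER_HOUR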
I needed to implement a stream solution using AWS Kinesis streams & Lambda.
Lambda function 1 -
It adds data to the stream and is invoked every 10 seconds. I added 100 data records (each of 1 KB) to the stream. I am running two instances of the script that invokes this lambda function.
Lambda function 2 -
This lambda uses the above stream as a trigger. With a small volume of data / interval, the second lambda gets the data almost immediately. But with the metrics above, the data arrives more slowly than usual (about 10 minutes late after 1+ hour of streaming).
I checked the logic of both lambda functions and verified that the first lambda does not add latency before pushing data to the stream. I also verified this from the stream records in the second lambda, where the difference between approximateArrivalTimestamp and the current time is clearly increasing.
Kinesis itself did not show any issues / throttling in its analytics (I am using 1 shard).
Are there any architectural changes I need to make to have it run more smoothly, given that I need to scale up at least 10 times, e.g. 20 invocations of the first lambda with 200 records and a 1-10 second timeout in later benchmarks?
I am using 100 as the batch size. Can increasing/decreasing it have an advantage?
UPDATE: As I explored more online, I found the idea of adding an async / front-facing lambda in front of Kinesis, which in turn invokes the actual lambda asynchronously, so that the lambda processing time does not become the bottleneck. However, this approach also failed: I have the same latency issue. I checked the execution time; the front-facing lambda finishes in 1 second. But I still get a big gap between approximateArrivalTimestamp and the current time in both lambdas.
Please help!
For one shard, there will only be one instance of the 2nd lambda.
So it works like this for the 2nd lambda: it reads the configured batch of records from the stream and processes them. It won't read further records until the previous records have been successfully processed.
By adding a second shard, you would have 2 lambdas processing the records. Thus the way I see to scale this architecture is by increasing the number of shards; however, make sure the data is evenly distributed across shards.
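As a small illustration of the "evenly distributed" point, the producer can use a high-cardinality partition key when putting records; a sketch with boto3 (the stream name is a placeholder):

import json
import uuid
import boto3

kinesis = boto3.client('kinesis')

def put_record(payload, stream_name='myStream'):
    # a random UUID as partition key spreads records across shards;
    # a constant key would pin all records to a single shard
    kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload),
        PartitionKey=str(uuid.uuid4()),
    )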
I want to retrieve information from an Exchange Server (2010, via the EWS API). In detail, I want to build a Windows service that iterates over all Exchange users and indexes their private mailboxes using impersonation.
That works well, but it's very slow when I do this one user after another (depending on the mailbox volume and the amount of users). The indexing speed is currently about 500 items per minute.
The following call takes about 250 milliseconds on my test system:
PropertySet myPropertySet = new PropertySet(BasePropertySet.FirstClassProperties, ItemSchema.ParentFolderId);
myPropertySet.RequestedBodyType = BodyType.Text;
myPropertySet.Add(entryIdExtendedProperty);
Item item = Item.Bind(es, itemKey, myPropertySet);
So my idea was to parallelize. So far I have tried 3 ways:
Background worker: One worker thread per user.
Result: No effect. It seems that doing this slows down every call; in sum, the overall speed stays the same.
Separate EXE processes: One EXE per user. I created a "Worker" EXE and called it with the user as an argument: IndexWorker.exe -user1
Result: Same result! The calls in every EXE are slowed down!
Separate Windows Services: One service per user.
Result: Suddenly, the requests did not slow down, which means I could raise the overall speed to a multiple of 500 items per minute (I tried up to 3 processes, that's 1500 items per minute). Not bad, but it leaves me with the question:
Why are EWS calls slowed down in 1) and 2) but not in 3)?
Threading would be the most elegant way for me; is there any option or setting that I could use?
I read a couple of things about throttling policies and the EWSFindCountLimit. Is this the right direction?
Did you get to the bottom of why the separate service gave you such an increase in performance? The throttling is applied at the Service Account level, so it should not matter where you are making the calls from.
Your issue is the throttling policy. You need to create a throttling policy for your service account that doesn't restrict EWS or RPC activity.