Is there a way to track the status of 100 concurrent Lambda functions and send an email when all of them have executed successfully? - aws-lambda

PROBLEM: We are iterating over a list of items and, for each item, pushing a message into SQS, which has a Lambda trigger. Suppose we have 100 items and send 100 messages into SQS. Then 100 Lambda executions will run (the function's concurrency is set to 50). Now, we need to send an email when all 100 of these Lambda executions have completed successfully. Is there a way in AWS to monitor the status of these 100 Lambda executions?
I have a few ideas. We could create a separate row in a database for each item and mark its status as completed after each successful Lambda execution. Then, at the end of every Lambda function, we could check whether all 100 entries in the database have status completed, and if so, send the email.
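A minimal sketch of that first idea, assuming a hypothetical DynamoDB tracking table and an atomic counter rather than scanning 100 individual rows (the table name, batchId field, e-mail addresses, and the single-record event shape are all assumptions, not anything prescribed by AWS):

```python
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("batch-progress")   # hypothetical tracking table

TOTAL_ITEMS = 100


def process_item(item):
    ...  # placeholder for the real per-item work


def handler(event, context):
    # Simplified: assume this invocation handles a single SQS record whose body
    # carries the item and a batchId identifying the run of 100 items.
    body = json.loads(event["Records"][0]["body"])
    process_item(body)

    # Atomically bump the completed counter; the update returns the new value,
    # so only the invocation that finishes item #100 sends the email.
    response = table.update_item(
        Key={"batchId": body["batchId"]},
        UpdateExpression="ADD completedCount :one",
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    if int(response["Attributes"]["completedCount"]) == TOTAL_ITEMS:
        boto3.client("ses").send_email(
            Source="noreply@example.com",
            Destination={"ToAddresses": ["team@example.com"]},
            Message={
                "Subject": {"Data": "Batch complete"},
                "Body": {"Text": {"Data": "All 100 items processed successfully."}},
            },
        )
```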
EDIT:
Or, we could use a Step Function in which a parent task has dynamic parallel states covering all the child Lambda functions, using the state machine's Map state. Then, in the next step, we could send the consolidated email.
Please let me know your thoughts. That would be very helpful.

This looks like a case where Step Functions Distributed Map would help. I'm not sure how you are doing your original iteration, but you could replace that with iteration using Distributed Map. If the list you are iterating over is already in S3, then you get that out of the box. If it's elsewhere (e.g., in DynamoDB), you'd need to add a step to either read that list in (if it's less than 256 KB, it can be passed as state input) or write it to S3 first and iterate over it there (it can be JSON or CSV).
With this, you can define a workflow to run for each of the items, including calling your Lambda functions. Then, when the Distributed Map state completes, you can add any subsequent steps, such as sending an email or emitting events to EventBridge.
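A rough sketch of what that could look like, assuming the 100 items live in a JSON array in S3 and a hypothetical SNS topic delivers the final email (all ARNs, bucket and key names are placeholders):

```python
import json

import boto3

# Distributed Map: read the item list from S3, run one Lambda per item
# (up to 50 concurrently, matching the question), then publish to SNS
# once every child execution has finished.
definition = {
    "StartAt": "ProcessItems",
    "States": {
        "ProcessItems": {
            "Type": "Map",
            "ItemReader": {
                "Resource": "arn:aws:states:::s3:getObject",
                "ReaderConfig": {"InputType": "JSON"},
                "Parameters": {"Bucket": "my-items-bucket", "Key": "items.json"},
            },
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
                "StartAt": "ProcessOneItem",
                "States": {
                    "ProcessOneItem": {
                        "Type": "Task",
                        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-item",
                        "End": True,
                    }
                },
            },
            "MaxConcurrency": 50,
            "Next": "SendEmail",
        },
        "SendEmail": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:job-complete",
                "Message": "All items processed successfully.",
            },
            "End": True,
        },
    },
}

boto3.client("stepfunctions").create_state_machine(
    name="process-all-items",
    roleArn="arn:aws:iam::123456789012:role/states-execution-role",
    definition=json.dumps(definition),
)
```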
To learn more about Distributed Map, you can read the docs linked above and check out this blog post or this session from re:Invent 2022.

Related

DynamoDB:PutItem calls silently ignored

I have a Lambda function bound to CodeBuild notifications; each Lambda instance writes details of the notification that triggered it to a DynamoDB table (BillingMode PAY_PER_REQUEST).
Each CodeBuild notification spawns an independent Lambda instance. A CodeBuild build can spawn 7-8 separate notifications/Lambda instances, many of which often happen simultaneously.
The Lambda function uses DynamoDB:PutItem to put details of the notification to DynamoDB. What I find is that out of 7-8 notifications in a 30 second period, sometimes all 7-8 get written to DynamoDB, but sometimes it can be as low as 0-1; many calls to DynamoDB:PutItem simply seem to be "ignored".
Why is this happening?
My guess is that DynamoDB simply shouldn't be accessed by multiple Lambda instances in this way; that best practice is to push the updates to an SQS queue bound to a separate Lambda, and have that separate Lambda write many updates to DynamoDB as part of a transaction.
Is that right? Why might parallel independent calls to DynamoDB:PutItem fail silently?
TIA.
DynamoDB is accessed over an HTTP endpoint and can handle any number of concurrent connections, so the issue is not how many Lambdas are writing.
I typically see this happen when users do not let the Lambda wait until the API requests are complete, and the container gets shut down prematurely. I would first check your code and ensure that your Lambda stays alive until all items have been written; you can verify this by adding some simple logging to your code.
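For example, a minimal handler sketch along those lines (the table name and event shape are assumptions): it writes the item synchronously and logs the outcome before returning, so CloudWatch shows whether every invocation actually completed its PutItem call.

```python
import json
import logging

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)
table = boto3.resource("dynamodb").Table("codebuild-notifications")  # placeholder name


def handler(event, context):
    # Write synchronously and log the result before the handler returns, so a
    # missing log line pinpoints invocations that never finished their write.
    item = {
        "buildId": event.get("buildId", context.aws_request_id),  # assumed event shape
        "raw": json.dumps(event),
    }
    response = table.put_item(Item=item)
    logger.info(
        "PutItem HTTP status %s for %s",
        response["ResponseMetadata"]["HTTPStatusCode"],
        item["buildId"],
    )
    return {"written": item["buildId"]}
```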
What you are describing is a good use case for Step Functions.
As much as Lambda functions are great glue between services, they have their overheads and limitations. With Step Functions, you can call DynamoDB:PutItem directly, and you can handle various scenarios and flows, such as async calls. These flows can be implemented in a Lambda function, but with less visibility and traceability.
BTW, you can also call a Lambda function from Step Functions; however, I recommend trying the direct service integration to get the most out of Step Functions.
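As a hypothetical fragment of such a state machine (the table and field names are placeholders, and the input is assumed to carry buildId and buildStatus), a Task state can write the notification straight to DynamoDB with the optimized service integration, no Lambda in between:

```python
# Fragment of a Step Functions definition: direct DynamoDB PutItem integration.
put_notification_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::dynamodb:putItem",
    "Parameters": {
        "TableName": "codebuild-notifications",
        "Item": {
            "buildId": {"S.$": "$.buildId"},        # assumed input field
            "status": {"S.$": "$.buildStatus"},     # assumed input field
        },
    },
    "End": True,
}
```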
My mistake, I had a separate issue that was messing up some of the range keys and causing updates to "fail" silently. But thanks for the tip regarding timeouts.

API waiting for a specific record on DynamoDB without polling

I am inheriting a workflow that has a reasonable amount of data stored in DynamoDb. The data is periodically refreshed by Lambdas calling third parties when needed. The lambdas are triggered by both SQS and DynamoDB streams and go through four or five steps before the data is updated.
I'm given the task to write an API that can forcibly update N items and return their status. The obvious way to do this without reinventing the wheel and honoring DRY is to trigger an event that spawns off a refresh for each item so that the lambdas can do their thing.
The trouble is that I'm not sure of the best pub/sub approach for being notified that the end state of each workflow has been reached. Do I read from a DynamoDB update/insert stream to see if the records are updated? Do I create some sort of pub/sub model like Redis or SNS to listen for the end state of each Lambda being triggered?
Since I'm writing a REST API, timeouts, if there are failures along the line, are fine. But at the same time I want to make sure I can handle the following.
Be guaranteed that I can be notified that an update occurred for my targets after my call (in the case of multiple forced updates being called at once I only care about the first one to arrive).
Not be bogged down listening for record updates that are not contextually relevant to the API call in question.
Have an amortized time complexity of O(1).
In other words, in terms of the CAP theorem I care about C and A but not P (because a 502 isn't that big a deal). But getting the timing wrong or missing a subscription is a problem.
I know I can just listen to a DynamoDB event stream, but I'm concerned that when things get noisy there will be more irrelevant stuff slowing me down. And I'm not sure whether having every single record get its own topic is scalable (or how messy that would be).
You can use DynamoDB streams in combination with Lambda Event Filtering so the Lambda function only executes for the relevant change you are interested in. More information is available here:
https://aws.amazon.com/about-aws/whats-new/2021/11/aws-lambda-event-filtering-amazon-sqs-dynamodb-kinesis-sources/
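For instance, a hedged sketch of how that event source mapping might look with boto3 (the stream ARN, function name, and the "status" attribute are placeholders), invoking the function only when an item's status changes to COMPLETED:

```python
import json

import boto3

# Attach a Lambda to a DynamoDB stream, but only invoke it for MODIFY events
# whose new image has status = COMPLETED; everything else is filtered out
# before the function ever runs.
boto3.client("lambda").create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/items/stream/2024-01-01T00:00:00.000",
    FunctionName="notify-on-complete",
    StartingPosition="LATEST",
    FilterCriteria={
        "Filters": [
            {
                "Pattern": json.dumps(
                    {
                        "eventName": ["MODIFY"],
                        "dynamodb": {"NewImage": {"status": {"S": ["COMPLETED"]}}},
                    }
                )
            }
        ]
    },
)
```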

DynamoDb re-processing records

I just inherited someone else's code that uses a serverless Lambda function to process records from DynamoDB. The original developer is using DynamoDB much like how RabbitMQ works: as a temporary staging area with some level of fault tolerance, plus a Lambda function that will process the records at a later date.
We currently have a way to delay message publication in RabbitMQ at my company, but this feature is missing on the AWS side of the fence.
I wrote some code in my serverless Lambda function so that it checks a special property called ProcessAfter (UTC DateTime) and skips processing any given DynamoDB record if the current UTC date/time is earlier than the ProcessAfter value. However, DynamoDB never sends me that record again. It appears that DynamoDB streams only ever allow a single attempt at processing a record (excluding the built-in retries on exceptions), so I'm stuck with my attempted solution for implementing a delay capability.
Is there any way to replicate the delay functionality in DynamoDB, or in my Lambda function, so that messages are skipped and then re-processed as often as necessary until the delay is over and the record is successfully processed?
Looks like you are listening to DynamoDB streams. They work such that whenever a configured event (insert, update, etc.) happens for a record, it is sent to a listener for processing.
For your specific scenario, you need an SQS queue in place so that a record you do not want to handle immediately can be processed later.
The better architecture I would advise is to add an extra SQS queue and Lambda. That Lambda listens to the DynamoDB stream event, compares ProcessAfter with the current time to compute the delay, sets that delay as the message's DelaySeconds, and sends the message to SQS.
Finally, a Lambda listening on that queue processes the message after the specified delay (or zero delay, as required).
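A minimal sketch of that middle Lambda, assuming ProcessAfter is stored as an epoch timestamp and the queue URL comes from a hypothetical environment variable (note that SQS caps per-message delay at 15 minutes, so longer delays would need another hop or a re-check):

```python
import json
import os
import time

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = os.environ["DELAY_QUEUE_URL"]   # hypothetical environment variable


def handler(event, context):
    # For each stream record, work out how long until ProcessAfter and
    # re-publish the item to SQS with that delay (capped at SQS's 900-second max).
    for record in event["Records"]:
        new_image = record["dynamodb"]["NewImage"]
        process_after = int(new_image["ProcessAfter"]["N"])   # assumed epoch seconds
        delay = max(0, min(900, process_after - int(time.time())))
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps(new_image),
            DelaySeconds=delay,
        )
```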

Disable lambda concurrency for the same event

I have a Lambda that processes files from S3 events. Is it possible to prevent parallel execution of the Lambda for the same event? For example, I upload a file to S3 and it triggers the Lambda. While that Lambda is working, I upload the same file again and don't want the Lambda to execute in parallel. The first Lambda should finish, and only after that should the second one start. Thanks.
No, this is not possible, but you can always maintain a DDB table and check for record existence before further processing.
Well, Lambda functions work asynchronously, independently, and in isolation; they are meant to do a specific task in a serverless manner. You can't hold back another execution of the Lambda function.
If you configured the trigger to fire on any file upload, each upload will trigger the Lambda function.
However, to keep track of what has been processed and what is currently being processed, you can use a database or file system to record processed files; if the same file arrives again in Lambda, return without processing it (or return the existing result, if you are storing one), as in the sketch below.
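A minimal sketch of that tracking idea, assuming a hypothetical processed-files DynamoDB table keyed by the S3 object key:

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed-files")   # hypothetical table


def process_file(record):
    ...  # placeholder for the real work


def handler(event, context):
    # Atomically claim each object key with a conditional put; a second
    # invocation for the same key fails the condition and skips the record.
    for record in event["Records"]:
        key = record["s3"]["object"]["key"]
        try:
            table.put_item(
                Item={"objectKey": key, "status": "PROCESSING"},
                ConditionExpression="attribute_not_exists(objectKey)",
            )
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                continue   # already processed or in progress, skip
            raise
        process_file(record)
```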

How can I trigger one AWS Lambda function from another, guaranteeing the second only runs once?

I've built a bit of a pipeline of AWS Lambda functions using the Serverless framework. There are currently five steps/functions, and I need them to run in order and each run exactly once. Roughly, the functions are:
Trigger function by an HTTP request, respond with an ID.
Access an API to get the URL of a resource to download.
Download that resource and upload a copy to S3.
Alter that resource and upload the altered copy to S3.
Submit the altered resource to a different API.
The specifics aren't important, but the question is: What's the best event/trigger to use to move along down this line of functions? The first one is triggered by an HTTP call, but the first one needs to trigger the second somehow, then the second triggers the third, and so on.
I wrote all the code using AWS SNS, but now that I've deployed it to staging I see that SNS often triggers more than once. I could add a bunch of code to detect this, but I'd rather not. And the problem is also compounding -- if the second function gets triggered twice, it sends two SNS notifications to trigger step three. If either of those notifications gets doubled... it's not unreasonable that the last function could be called ten times instead of once.
So what's my best option here? Trigger the chain through HTTP? Kinesis maybe? I have never worked with a trigger other than HTTP or SNS, so I'm not really sure what my options are, and which options are guaranteed to only trigger the function once.
AWS Step Functions seems pretty well targeted at this use-case of tying together separate AWS operations into a coherent workflow with well-defined error handling.
Not sure if the pricing will work for you (can be pricey for millions+ operations) but it may be worth looking at.
Also not sure about performance overhead or other limitations, so YMMV.
You can simply trigger the next lambda asynchronously in your lambda function after you complete the required processing in that step.
So, the first Lambda is triggered by an HTTP call, and in that Lambda execution, after you finish processing this step, just launch the next Lambda function asynchronously instead of sending the trigger through SNS or Kinesis. Repeat this process in each of your steps. This would ensure each step is executed only once.
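For instance, a sketch of that hand-off in Python with boto3 (the function and field names are assumptions, not part of the original pipeline):

```python
import json

import boto3

lambda_client = boto3.client("lambda")


def do_this_step(event):
    ...  # placeholder for this step's real work
    return {"id": event.get("id"), "status": "done"}


def handler(event, context):
    result = do_this_step(event)
    # Fire-and-forget hand-off to the next step in the chain.
    lambda_client.invoke(
        FunctionName="pipeline-step-2",    # placeholder name
        InvocationType="Event",            # asynchronous: returns immediately with 202
        Payload=json.dumps(result),
    )
    return result
```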
Eventful Lambda triggers (SNS, S3, CloudWatch, ...) generally guarantee at-least-once invocation, not exactly-once. As you noted you'd have to handle deduplication manually by, for example, keeping track of event IDs in DynamoDB (using strongly consistent reads!), or by implementing idempotent Lambdas, meaning functions that have no additional effects even when invoked several times with the same input. In your example step 4 is essentially idempotent providing that the function doesn't have any side effects apart from storing the altered copy, and that the new copy overwrites any previously stored copies with the same event ID.
One service that does guarantee exactly-once delivery out of the box is SQS FIFO. This service unfortunately cannot be used to trigger Lambdas directly so you'd have to set up a scheduled Lambda to poll the FIFO queue periodically (as per this answer). In your case you could handle step 5 with this arrangement, since I'm assuming you don't want to submit the same resource to the target API several times.
So in summary here's how I'd go about it:
Lambda A, invoked via HTTP, responds with ID and proceeds to asynchronously fetch resource from the API and store it to S3
Lambda B, invoked by S3 upload event, downloads the uploaded resource, alters it, stores the altered copy to S3 and finally pushes a message into the FIFO SQS queue using the altered resource's filename as the distinct deduplication ID
Lambda C, invoked by CloudWatch scheduler, polls the FIFO SQS queue and upon a new message fetches the specified altered resource from S3 and submits it to the other API
With this arrangement even if Lambda B is occasionally executed twice or more by the same S3 upload event there's no harm done since the FIFO SQS queue handles deduplication for you before the flow reaches Lambda C.
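A sketch of Lambda B's final step under that scheme, assuming a hypothetical FIFO queue URL; the altered file's name serves as the deduplication ID, so repeat invocations within the five-minute deduplication window collapse into a single queue message:

```python
import boto3

sqs = boto3.client("sqs")
# Placeholder queue URL; FIFO queue names must end in ".fifo".
FIFO_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/altered-resources.fifo"


def notify_resource_ready(filename: str) -> None:
    # Duplicate sends with the same MessageDeduplicationId are dropped by SQS,
    # so Lambda C only ever sees one message per altered resource.
    sqs.send_message(
        QueueUrl=FIFO_QUEUE_URL,
        MessageBody=filename,
        MessageGroupId="altered-resources",
        MessageDeduplicationId=filename,
    )
```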
AWS Step Functions is meant for this: https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html
You execute the steps you want based on the previous steps' outputs.
Each task/step just needs to output JSON correctly reflecting the desired "state".
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-states.html
Based on that state, your workflow moves on. You can create your workflow easily and trigger Lambdas or ECS tasks.
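For example, a small hypothetical fragment of such a definition, routing on a status field produced by the previous step (state and field names are placeholders):

```python
# Choice state: move to the next step only when the previous task's JSON
# output says the resource is ready; otherwise fall back to a retry branch.
route_on_status = {
    "Type": "Choice",
    "Choices": [
        {"Variable": "$.status", "StringEquals": "READY", "Next": "SubmitToApi"},
    ],
    "Default": "WaitAndRetry",
}
```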
ECS tasks are your own "Lambda" environment, running without the constraints of the AWS Lambda environment.
With ECS tasks you can run on bare metal, on your own EC2 machines, or in Docker containers on ECS, and thus have far more flexible resource limits.
Compare this to Lambda, where the limits are pretty strict: 512 MB of ephemeral disk by default, a capped execution time, etc.
