What are the drawbacks of SQS poller which AWS Lambda removes? - aws-lambda

I have an architecture that looks as follows:
Multiple SNS topics -> (AWS Lambda or SQS with poller)??? -> DynamoDB
So, basically, either an AWS Lambda function or an SQS queue with a poller is subscribed to multiple SNS topics, and that component pushes data to DynamoDB.
This "???" component does a lot of message transformation in between, so I can either use AWS Lambda or SQS with a poller. With AWS Lambda, I can do the transformation in the Lambda function; with SQS, I can do the transformation in the poller. The one problem I see with AWS Lambda is that the code would become quite large, because the transformation is quite complex (it has a lot of rules), so I am thinking of using SQS. But before settling on SQS, I want to know: what are the drawbacks of SQS that AWS Lambda removes?
Please help. Let me know if you need further information.

Your question does not contain much detail, so I shall attempt to interpret your needs.
Option 1: SQS Polling
Information is sent to an Amazon SNS topic
An SQS queue is subscribed to the SNS topic
An application running on Amazon EC2 instance(s) regularly polls the SQS queue to ask for a message
If a message is available, the data in the message is transformed and saved to an Amazon DynamoDB table
This approach is good if the transformation takes a long time to process. The number of EC2 instances can be scaled based upon the amount of work in the queue. Multiple messages can be received at the same time. It is a traditional message-based approach.
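Option 1 can be sketched roughly as follows. This is a minimal illustration, not a production poller: `poll_once` and `transform` are hypothetical names, the `sqs` and `table` arguments are assumed to behave like `boto3.client("sqs")` and `boto3.resource("dynamodb").Table(...)`, and the message body is assumed to be the JSON envelope that SNS wraps around each message when raw message delivery is disabled.

```python
import json


def transform(payload):
    """Placeholder for the complex transformation rules (hypothetical)."""
    return {"id": payload["id"], "value": str(payload["value"]).upper()}


def poll_once(sqs, table, queue_url):
    """Receive one batch from SQS, write each item to DynamoDB, then delete it.

    `sqs` and `table` are passed in by the caller (assumed to be a boto3 SQS
    client and a boto3 DynamoDB Table resource).
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling reduces empty responses
    )
    processed = 0
    for message in response.get("Messages", []):
        envelope = json.loads(message["Body"])     # SNS envelope
        payload = json.loads(envelope["Message"])  # the publisher's JSON payload
        table.put_item(Item=transform(payload))
        # Delete only after a successful write, so failed messages are redelivered.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        processed += 1
    return processed
```

The application on EC2 would call `poll_once` in a loop; keeping the process running (and scaled) is the operational cost that this option carries.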
Option 2: Using Lambda
Information is sent to an Amazon SNS topic
An AWS Lambda function is subscribed to the SNS topic
A Lambda function is invoked when a message is sent to the SNS topic
The Lambda function transforms the data in the message and saves it to an Amazon DynamoDB table
At the time of writing, AWS Lambda functions were limited to five minutes of execution time (the limit has since been raised to 15 minutes), so this approach will only work if the transformation process can be completed within that timeframe.
No servers are required because Lambda will automatically run multiple functions in parallel. When no work is to be performed, no Lambda functions execute and there is no compute charge.
Between the two options, using AWS Lambda is much more efficient and scalable but it might vary depending upon your specific workload.
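Option 2 might look like the sketch below. The table name `transformed-messages` and the `transform` function are made up for illustration; the event shape (`Records[].Sns.Message`) is the one Lambda receives from an SNS subscription. Keeping the transformation rules in their own function (or module) is one way to stop the handler itself from growing large.

```python
import json


def transform(payload):
    """Placeholder for the transformation rules (hypothetical)."""
    return {"id": payload["id"], "source": payload.get("source", "unknown")}


def handle_sns_event(event, table):
    """Transform every SNS record in the event and write it to DynamoDB."""
    written = 0
    for record in event["Records"]:
        payload = json.loads(record["Sns"]["Message"])
        table.put_item(Item=transform(payload))
        written += 1
    return written


def lambda_handler(event, context):
    # boto3 ships with the Lambda Python runtime; the table name is an assumption.
    import boto3

    table = boto3.resource("dynamodb").Table("transformed-messages")
    return handle_sns_event(event, table)
```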

We can now use SQS messages to trigger AWS Lambda Functions.
28 JUN 2018: AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources
Moreover, it is no longer necessary to run a message polling service or to create an SQS to SNS mapping.
The AWS Serverless Application Model (SAM) supports the new event source as follows:
    Type: SQS
    Properties:
      Queue: arn:aws:sqs:us-west-2:213455678901:test-queue
      BatchSize: 10
The AWS Console also supports configuring SQS as a trigger.
Further details:
https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
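A function triggered this way receives a batch of messages in `event["Records"]`, each with a `messageId` and a `body`. The sketch below is one possible handler shape, assuming the messages are JSON and that the event source mapping has `ReportBatchItemFailures` enabled, so that only the failed messages are returned to the queue. `handle_sqs_event` and `process` are hypothetical names.

```python
import json


def handle_sqs_event(event, process):
    """Process each SQS record; report the ones that failed.

    Returns the partial-batch response format, which Lambda honours only
    when ReportBatchItemFailures is enabled on the event source mapping.
    """
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only this message becomes visible in the queue again.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Without that setting, raising an exception from the handler causes the whole batch to be retried.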

Related

Multiple processes on Single SQS with in a Lambda function

Currently, we have an SQS queue subscribed to a standard SNS topic; the queue triggers a Lambda function that publishes some data based on these events to a downstream system X.
We now have another use case where we want to listen to this existing SNS topic and publish another set of data based on these events to downstream Y. In future we might have yet another use case where we want to listen to this existing SNS topic and publish another set of data to downstream Z.
I was wondering whether we can reuse the existing SQS queue and Lambda function for these new use cases. I am curious how we can handle failure scenarios when one of the publishes fails. A failure of 1 process out of x will send the message to the DLQ, from where a redrive would be required, so will all the consumer processes for this message within the Lambda function have to process the redriven message again?
Another way would be to have a separate SQS queue and a separate Lambda function for each such use case.
Has anyone had a similar problem, and which of the two approaches did you follow? Anything that helps to reuse some of the existing infrastructure would also be welcome.
You should subscribe multiple Amazon SQS queues to the Amazon SNS topic, so that each of them receives a copy of the message. Each SQS queue can have its own Dead Letter Queue for error handling.
It is also possible to subscribe the AWS Lambda function to the Amazon SNS topic directly, without using an Amazon SQS queue. The Lambda function can send failed messages directly to a Dead Letter Queue.
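The fan-out described above could be wired up along these lines. This is a sketch under assumptions: `fan_out` is a hypothetical helper, `sns` and `sqs` are assumed to be boto3 clients supplied by the caller, the queues and dead letter queues already exist, and a `maxReceiveCount` of 3 is an arbitrary choice.

```python
import json


def fan_out(sns, sqs, topic_arn, queues):
    """Subscribe each queue to the topic and give it its own dead letter queue.

    `queues` is a list of dicts with the keys 'url', 'arn' and 'dlq_arn'.
    """
    subscription_arns = []
    for queue in queues:
        # Messages that fail three receives move to this queue's own DLQ.
        redrive = {"deadLetterTargetArn": queue["dlq_arn"], "maxReceiveCount": 3}
        sqs.set_queue_attributes(
            QueueUrl=queue["url"],
            Attributes={"RedrivePolicy": json.dumps(redrive)},
        )
        response = sns.subscribe(
            TopicArn=topic_arn, Protocol="sqs", Endpoint=queue["arn"]
        )
        subscription_arns.append(response["SubscriptionArn"])
    return subscription_arns
```

Each queue also needs an access policy that allows the topic to send messages to it, which is not shown here. With one queue per downstream, a failed publish to Y is retried or redriven without X being processed again.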

What are the benefits of using SNS to SQS to Lambda compared to having SNS to SQS to SQSConsumers?

We have an SQS queue subscribed to an SNS topic that publishes about 1-5 million events per month. I want to know which of these combinations, SNS->SQS->Lambda or SNS->SQS->SQSConsumer, would benefit me for such use cases.
I understand the main difference between them is event-driven vs. pull-driven. A Lambda function is triggered for each message that comes into a queue, so that is an event-driven architecture, whereas an SQSConsumer has to constantly poll for messages. A poller like that needs constant uptime, while a Lambda function is only triggered once a message is received.
I have a couple of questions here:
Why is SNS->SQS->Lambda considered event-driven, when Lambda has to poll the SQS queue just as an SQSConsumer does?
Follow-up: if Lambda is also constantly polling, why is it considered more cost-efficient than an SQSConsumer?
If you ignore the 'internals' of how Amazon SQS with AWS Lambda is implemented, simply think of it as SQS directly triggering the Lambda function. This is a serverless model, whereas using an SQS consumer requires code to be running on a computer somewhere. Lambda will automatically scale, so it is more cost effective than having computing infrastructure waiting around for events (and costing money even when it isn't used).
So, it's really a decision about whether to use a serverless architecture.
You could also subscribe the AWS Lambda function directly to the Amazon SNS topic, without using Amazon SQS in the middle.

AWS SQS List Triggers from SDK

I'm looking for a method to programmatically identify the triggers associated with an SQS queue. Looking through the SQS SDK docs, this doesn't seem to be possible. I thought instead of trying from the other end, and it appears the Lambda ListEventSourceMappings function would likely do what I want, since I'm able to provide it with the queue ARN. However, this requires the lambda:ListEventSourceMappings permission on all Lambda functions (*), which isn't really ideal. It shouldn't really hurt, but it's not what I want. Is there another mechanism for this that I'm missing, or another approach?
Lambda polls SQS queues. It doesn't appear that way in the console, because they hide some of the details from you, but behind the scenes there is a process running within the AWS Lambda system that is polling your SQS queue and invoking your Lambda function when a message is available.
SQS doesn't push messages to Lambda (or anywhere else). SQS just holds messages and hands them out to anything that asks for them. So from an SQS perspective, there is no knowledge of who the message consumers are.
Given the above, the only way to find what you want is to use the Lambda ListEventSourceMappings API.
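A lookup along those lines might be sketched as follows. `functions_triggered_by` is a hypothetical helper name, and `lambda_client` is assumed to be a `boto3.client("lambda")` passed in by the caller; the call filters by `EventSourceArn` and follows `NextMarker` in case the results are paginated.

```python
def functions_triggered_by(lambda_client, queue_arn):
    """Return the ARNs of the Lambda functions mapped to the given queue."""
    function_arns = []
    kwargs = {"EventSourceArn": queue_arn}
    while True:
        page = lambda_client.list_event_source_mappings(**kwargs)
        for mapping in page.get("EventSourceMappings", []):
            function_arns.append(mapping["FunctionArn"])
        marker = page.get("NextMarker")
        if not marker:
            return function_arns
        kwargs["Marker"] = marker  # request the next page
```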

Is there a good pattern to send a message between AWS Lambdas

My use case is the following. I have 5 lambdas. They need to talk to each other. I've heard that it can be done with SNS but also SNS and SQS. What is the difference, why not call lambdas only from one another directly?
It's possible to design durable and scalable applications using the SNS-SQS pattern. Lambda A posts to an SNS topic, and the topic delivers each message to a subscribed SQS queue. That way, a high volume of messages is buffered in the queue and worked through at the consumer's pace instead of overwhelming it.
Note that both SNS and standard SQS queues deliver at least once, so a consumer may receive the same message more than once.
For more info check the article here:
https://aws.amazon.com/blogs/compute/designing-durable-serverless-apps-with-dlqs-for-amazon-sns-amazon-sqs-aws-lambda/
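Because a message can arrive more than once, the consuming Lambda function should be idempotent. The sketch below shows the idea with an in-memory set; this is only a stand-in, since a real function would need a durable store shared across invocations (a DynamoDB conditional write is one common choice). `IdempotentProcessor` is a hypothetical name.

```python
class IdempotentProcessor:
    """Skip messages whose ID has already been processed successfully."""

    def __init__(self, process):
        self._process = process
        self._seen = set()  # stand-in for a durable store (assumption)

    def handle(self, message_id, payload):
        """Return True if the payload was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._process(payload)
        # Record the ID only after success, so a failed attempt can be retried.
        self._seen.add(message_id)
        return True
```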
You can also use AWS Step Functions, a serverless function orchestrator that makes it easy to sequence AWS Lambda functions and multiple AWS services.
You can check out getting started guide here - https://docs.aws.amazon.com/step-functions/latest/dg/getting-started.html

Best way to schedule one-time events in serverless environments

Example use case
Send the user a notification 2 hours after signup.
Options considered
setTimeout(() => { /* send notification */ }, 2*60*60*1000); is not an option in serverless environments since the function terminates after execution (so it has to be stateless).
CloudWatch events can schedule lambda invocations using cron expressions - but this was designed for repetitive invocations (there's a limit of 100 rules/region).
I have not seen scheduling options in AWS SNS/SQS or GCP Pub/Sub. Are there alternatives with scheduling?
I want to avoid (if possible) setting up a dedicated message broker (overkill) or stateful/non-serverless instance - is there a serverless way to do this?
I can queue the events in a database and invoke a lambda function every minute to poll the database for events to execute in that minute... is there a more elegant solution?
Use AWS Step Functions. A state machine is like a serverless workflow that is not bound by the 15-minute limit of an AWS Lambda function. You can design a workflow in Step Functions that integrates with API Gateway, Lambda and SNS to send email and text notifications as follows:
Create a REST API via API gateway that will invoke a Lambda function passing in for example, the destination address (email, phone #) of the SNS notification, when it should be sent, notification method (e.g. email, text, etc.).
The Lambda function on invocation will invoke the Step function passing in the data (Lambda is needed because API Gateway currently can't invoke Step functions directly).
The Step function is basically a workflow, you can define states for waiting (like waiting for the specified time to send the notification e.g. 30 seconds), and states for invoking other Lambda functions that can use SNS to send out an email and/or text notifications.
A rudimentary example is provided by AWS w/ their Task Timer example.
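The workflow described above could be expressed as a state machine definition like the one built below. It is a sketch, not the definition from the AWS example: the state names and the `send_at` input field are assumptions, and the function ARN is supplied by the caller. The `Wait` state pauses until the timestamp found in the execution input, then a `Task` state invokes the notifying Lambda function.

```python
import json


def build_delayed_notification_definition(notify_function_arn):
    """Return an Amazon States Language definition as a JSON string."""
    definition = {
        "Comment": "Wait until the requested time, then send one notification",
        "StartAt": "WaitUntilDue",
        "States": {
            "WaitUntilDue": {
                "Type": "Wait",
                # Reads an ISO 8601 timestamp from the execution input.
                "TimestampPath": "$.send_at",
                "Next": "SendNotification",
            },
            "SendNotification": {
                "Type": "Task",
                "Resource": notify_function_arn,
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)
```

For the use case in the question, the signup code would start one execution per user with `send_at` set to two hours after signup.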
GCP has something in the works for this, but it will not arrive soon. So, for now, the solution is to poll a database.
You can do that with Datastore/Firestore, with the execution datetime indexed (to avoid reading all the documents every minute). But be careful with traffic spikes, as you could create a hotspot.
You can use Cloud Scheduler on Google Cloud Platform. As stated in the official documentation:
Cloud Scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure to reduce manual toil and intervention. Cloud Scheduler even acts as a single pane of glass, allowing you to manage all your automation tasks from one place.
Here you can check a quickstart for using it with Pub/Sub and Cloud Functions.

Resources