Lambda function times out when calling Google Sheets after making a PostgreSQL query - google-api

I am trying to write a Lambda function (using AWS Cloud9) which makes a query to Redshift (using the node-postgres package) and then writes the result to a Google Sheet (using the googleapis package).
I currently have the code spread over two separate Lambda functions - one to make the query, and one to write to the sheet, though this same error occurred when I tried it in a single function.
Both functions individually work fine. The query function makes a query and returns a result, and the writing function writes a test payload to the sheet.
However, if I try to invoke the writing function from the query function, the whole thing freezes up and eventually times out. This is the exact log from a run.
Error:
Read timeout on endpoint URL: "https://lambda.us-east-2.amazonaws.com/2015-03-31/functions/queryRedshift/invocations"
at convertStderrToError (https://d28a1z68q19s1r.cloudfront.net/content/ce0bff16a8467f5a19e655ab833e28a385f3a62f/#aws/aws-toolkit-cloud9/configs/bundle.js:424:33)
at exports.EventEmitter.<anonymous> (https://d28a1z68q19s1r.cloudfront.net/content/ce0bff16a8467f5a19e655ab833e28a385f3a62f/#aws/aws-toolkit-cloud9/configs/bundle.js:416:70)
at exports.EventEmitter.EventEmitter.emit (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:20:23)
at Consumer.onExit (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47444:80)
at Consumer.<anonymous> (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47204:4)
at Consumer.Agent._onMessage (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47289:4)
at EngineIoTransport.EventEmitter.emit (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47041:16)
at module.exports.onMessage (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47348:6)
at module.exports.EventEmitter.emit (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:19:23)
at module.exports.ReliableSocket.onMessage (https://d373lap04ubgnu.cloudfront.net/c9-af167ac416de-ide/build/configs/ide/#aws/cloud9/configs/ide/environment-default.js:47560:76)
I have tried reworking the code to separate things, but I'm not sure where to start: I can only find one other similar problem, with no answer, and as far as I can tell (I'm not super experienced at this) the log doesn't point to where things are getting stuck.
If someone can at least point me in the right direction, it would be super helpful!
Thanks in advance!
EDIT: I have now also tried the node-redshift package with the same result.
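For reference, the writing function follows the standard googleapis append pattern; a minimal sketch along those lines (the spreadsheet ID, range, and credential setup are placeholders, not my real values):

// Sketch of the writing side only; spreadsheet ID and range are placeholders.
const { google } = require('googleapis');

exports.handler = async (event) => {
  const auth = new google.auth.GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/spreadsheets'],
  });
  const sheets = google.sheets({ version: 'v4', auth });

  await sheets.spreadsheets.values.append({
    spreadsheetId: 'YOUR_SPREADSHEET_ID',   // placeholder
    range: 'Sheet1!A1',
    valueInputOption: 'RAW',
    requestBody: { values: event.rows || [['test payload']] },
  });
  return { statusCode: 200 };
};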

From the info you provided, this may be the situation:
The querying Lambda can connect to Redshift within AWS.
The writing Lambda can connect to the Google Sheets API over the Internet.
The querying Lambda has no Internet connectivity, so it cannot reach lambda.us-east-2.amazonaws.com and the invoke call times out.
For a Lambda function inside a VPC to access the Internet you have to follow the steps here:
https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/
If I am correct, you have all your subnets attached to an Internet Gateway, but none of them routing through a NAT Gateway.
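For what it's worth, here is a minimal sketch (not your code; the function name and payload are placeholders) of how the querying Lambda might invoke the writing Lambda with the AWS SDK for Node.js v2. Inside a VPC, this is exactly the call that hangs until the subnet routes outbound traffic through a NAT Gateway (or a VPC interface endpoint for Lambda is in place).

// Sketch only: invoking the writing Lambda from the querying Lambda.
// "writeToGoogleSheet" is a placeholder function name.
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda({ region: 'us-east-2' });

exports.handler = async (event) => {
  const params = {
    FunctionName: 'writeToGoogleSheet',   // placeholder
    InvocationType: 'Event',              // asynchronous invoke; no need to wait for the writer
    Payload: JSON.stringify({ rows: event.rows || [] }),
  };
  const result = await lambda.invoke(params).promise();
  return { statusCode: result.StatusCode }; // 202 for an Event invocation
};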

Related

AWS Lambda caching layer using Extensions

I have a Lambda function that uses SSM Parameter Store values. It queries the Parameter Store and stores the fetched values in Lambda env variables so that next time it can use them from the env variables instead of calling the Parameter Store. This works fine while the Lambda is in a hot state, but during cold starts my Lambda still makes many calls to the Parameter Store at peak traffic and gets throttling exceptions.
I'm looking to reduce the number of calls to the Parameter Store by adding a caching layer. I found this article online, but I'm new to Lambda extensions, and I just wanted to check whether this caching works across cold starts before I create a POC. Please advise.
Thanks in advance!
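For reference, the in-execution-environment caching I described above looks roughly like this (a minimal sketch; the parameter name is a placeholder):

// Fetch once per execution environment and reuse while the Lambda stays warm.
const AWS = require('aws-sdk');
const ssm = new AWS.SSM();

let cachedValue; // survives between invocations in the same warm environment

exports.handler = async (event) => {
  if (cachedValue === undefined) {
    // Cold start (or first invocation): this is the call that can be throttled.
    const result = await ssm.getParameter({
      Name: '/my-app/db-password',   // placeholder parameter name
      WithDecryption: true,
    }).promise();
    cachedValue = result.Parameter.Value;
  }
  return { statusCode: 200 };
};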

Invoking 1 AWS Lambda with API Gateway sequentially

I know there's a question with the same title, but my question is a little different: I have a Lambda API - saveInputAPI() - that saves a value into a specified field. Users can invoke this API with different parameters, for example:
saveInput({"adressType",1}); //adressType is a DB field.
or
saveInput({"name","test"}) //name is a DB field.
This is hosted on AWS, so I'm also using API Gateway. The problem is that sometimes an ordering issue like this happens:
As you can see, API call No. 19 was invoked first but ended up finishing later
(10:10:16:828) -> (10:10:18:060)
while API call No. 18 was invoked later but finished sooner:
(10:10:17:611) -> (10:10:17:861)
This causes a lot of problems in my project, and sometimes the delay between two API calls is up to 10 seconds. The front end acts independently, so users don't know what happens behind the scenes: they think they have set addressType to 1, but in reality addressType is still 2. The project is large and I cannot change this design of using a single API to update DB values. Is there any way to fix this problem? I'd really appreciate any ideas. Thanks.
If updates to the database can't simply be skipped when the last-updated timestamp is more recent than the source event's timestamp, you need to decouple API Gateway and Lambda (a consumer sketch follows the steps below):
API Gateway writes to an SQS FIFO queue.
Lambda consumes the SQS queue and processes the requests.
This ensures the older event is processed first.
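A minimal sketch of that consumer Lambda, assuming the queue message body carries the field/value pair; saveInputToDb stands in for the existing save logic:

// Records sharing a MessageGroupId in a FIFO queue are delivered in order,
// so processing them one at a time (not in parallel) preserves that order.
async function saveInputToDb(input) {
  // placeholder: write { field: value } to the database here
  console.log('saving', input);
}

exports.handler = async (event) => {
  for (const record of event.Records) {
    const input = JSON.parse(record.body); // e.g. { "addressType": 1 }
    await saveInputToDb(input);
  }
};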
AWS Lambda is asynchronous by design. That means trying to make it synchronous and predictable is somewhat wasted effort.
If your concern is avoiding "old" data (in a scheduling sense) overwriting "fresh" data, then you might consider timestamping each write and applying a constraint like "to overwrite the target data, the source timestamp has to be newer than the timestamp of the targeted data".
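A minimal sketch of that timestamp guard, using node-postgres as an example store; the table and column names are assumptions for illustration:

const { Pool } = require('pg');
const pool = new Pool(); // connection settings come from environment variables

async function saveInput(userId, field, value, sourceTimestamp) {
  // Only overwrite if this event is newer than what is already stored.
  const sql = `
    UPDATE user_settings
       SET value = $1, updated_at = $2
     WHERE user_id = $3 AND field = $4 AND updated_at < $2`;
  const result = await pool.query(sql, [value, sourceTimestamp, userId, field]);
  return result.rowCount === 1; // false means a newer write already landed
}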

Is there a way to combine a query and a command in CQRS?

I have a project built using CQRS, but I can't figure out how to implement one use case.
The user needs to be able to make a Query which will return a set of data for them to view. However, I also need to save the data they got at the same time.
Is there a way to do this within a Query without violating CQRS' principles? Or would the Query and Command need to be two separate API calls one after another?
In CQRS it is your client that can do both commands and queries. This client is not necessarily a piece of UI.
It can be an API endpoint handler (sketched after this list), which would
receive a query
forward it to the query endpoint
wait for the answer
send an answer to the caller
send a command to store the answer
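A rough sketch of such an endpoint handler; the Express setup and the queryEndpoint / commandEndpoint names are assumptions, not any specific framework's API:

const express = require('express');
const app = express();

// Stand-ins for the real query and command dispatchers.
async function queryEndpoint(query) { return { items: [], query }; }
async function commandEndpoint(command) { /* persist the answer somewhere */ }

app.get('/report', async (req, res) => {
  const answer = await queryEndpoint({ type: 'GetReport', filter: req.query });
  res.json(answer);                                                // answer the caller
  await commandEndpoint({ type: 'StoreReport', payload: answer }); // then record what was returned
});

app.listen(3000);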
Is there a way to do this within a Query without violating CQRS' principles?
It depends.
If "save the data" means "make some change to the domain model"... well, that would be pretty weird.
Asking a question should not change the answer. -- Bertrand Meyer
On the other hand, logging/telemetry are pretty normal ways to track the activity of an application, so that should be fine.
There are some realities of a distributed system on an unreliable network that you need to be aware of: what should the behavior be if the telemetry system is not available? What are the consequences of recording queries whose answers never actually reach the client (because the network is unreliable)?
As @VoiceOfUnreason stated, it may be somewhat strange to effect domain changes when querying data.
However, you could swap that around.
For instance, perhaps one could query a forecast of sorts, and we would want to store that forecast. It then seems as though the query forces us to save its result, which appears to break CQS at some level, since each query would result in a change of state.
If we swap that around and first request a forecast through the domain handling, and that produces a result (or even a pointer to the result), then the query becomes something you can perform on the stored data multiple times without "breaking" CQS.
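A minimal sketch of that swapped-around flow; all names are illustrative, and the in-memory map stands in for a real read store:

const storedForecasts = new Map(); // stand-in for a real read store

// Command handler: does the domain work and persists the result.
async function handleRequestForecast(command) {
  const forecastId = `forecast-${Date.now()}`;
  const forecast = { region: command.region, value: Math.random() }; // placeholder computation
  storedForecasts.set(forecastId, forecast);
  return forecastId; // pointer to the result
}

// Query handler: a pure read, safe to repeat any number of times.
async function handleGetForecast(query) {
  return storedForecasts.get(query.forecastId) || null;
}

// Usage: issue the command first, then query the stored result.
handleRequestForecast({ region: 'EU' })
  .then((id) => handleGetForecast({ forecastId: id }))
  .then((forecast) => console.log(forecast));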

Autoscaling : Minimum 2 Instances and a subsequent Lambda

All,
I'm really stuck and have tried almost everything. Can someone please help?
I provision 2 instances when creating my Auto Scaling group, and I trigger a Lambda (which manipulates the tags) to change each instance's name to a unique name.
Desired State
I want the first Lambda invocation to give the first instance the name "web-1".
The second Lambda invocation would then run and assign the name "web-2".
Current State
I start with a search over running instances to see whether "web-1" already exists.
In this case my Lambda executes twice and gives both instances the same name (web-1, web-1).
How do I get around this? I know the problem is that the Lambda listens to CloudWatch events: the ASG launch creates 2 events at the same time in my case, which leads to the problem I have.
Thanks.
You are running into a classic multi-threading issue. Both Lambda executions run simultaneously, see the same "unused" web-1 name, and tag both instances with it.
What you need is an atomic operation that gives each Lambda execution "permission" to proceed. You can try using a helper DynamoDB table to serialize the tag attempts (a rough sketch of the conditional write follows the steps below):
Have your lambda function decide which tag to set (web-1, web-2, etc.)
Check a DynamoDB table to see if that tag has been set in the last 30 seconds. If so, someone else got to it first, so go back to step 1.
Try to write your "ownership" of the sought-after tag to the DynamoDB table along with your current timestamp, using attribute_not_exists or another DynamoDB condition expression so that only one simultaneous write succeeds.
If you fail at writing, go back to step 1.
If you succeed at writing, then you're free to set your tag.
The reason for the timestamps is to allow for "web-1" to be terminated, and then having a new EC2 instance launched and labelled "web-1".
The above logic is not proven to work, but hopefully should give enough guidance to develop a working solution.
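A rough sketch of that conditional write (step 3); the table name and key attribute are assumptions, and the 30-second staleness window matches step 2:

const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

async function claimTag(tagName, instanceId) {
  try {
    await ddb.put({
      TableName: 'instance-name-claims',   // assumed table, keyed on tagName
      Item: { tagName, instanceId, claimedAt: Date.now() },
      // Succeeds only if nobody holds the tag, or the previous claim is stale.
      ConditionExpression: 'attribute_not_exists(tagName) OR claimedAt < :stale',
      ExpressionAttributeValues: { ':stale': Date.now() - 30 * 1000 },
    }).promise();
    return true;  // we own the tag; safe to name the instance
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') return false; // someone else got it first
    throw err;
  }
}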

Parse.com. Execute backend code before response

I need to know the relative position of an object in a list. Let's say I need to know the position of a certain wine among all wines added to the database, based on the votes received by users. The app should be able to receive the ranking position as an object property when retrieving a "wine" class object.
This should be easy to do on the backend side, but I've looked at Cloud Code and it seems it is only able to execute code before or after saving or deleting, not before reading and returning a response.
Is there any way to do this? Any workaround?
Thanks.
I think you would have to write a Cloud function to perform this calculation for a particular wine.
https://www.parse.com/docs/cloud_code_guide#functions
This would be a function you would call manually. You would have to provide the "wine" object or objectId as a parameter and then have your Cloud function return the value you need. Keep in mind there are limitations on Cloud functions; read the documentation about time limits. You also don't want to make too many API calls every time you run this. Your computation could be fairly heavy if your dataset is large and you aren't caching at least some of the information.
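A rough sketch of such a Cloud function; the "Wine" class name and "votes" field are assumptions, and the rank is computed as the number of wines with strictly more votes, plus one:

Parse.Cloud.define("getWineRank", function(request, response) {
  var wineQuery = new Parse.Query("Wine");
  wineQuery.get(request.params.wineId).then(function(wine) {
    // Count wines that rank above this one.
    var countQuery = new Parse.Query("Wine");
    countQuery.greaterThan("votes", wine.get("votes"));
    return countQuery.count();
  }).then(function(winesAbove) {
    response.success(winesAbove + 1);
  }, function(error) {
    response.error(error);
  });
});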
