Query AWS RDS from Lambda Securely - aws-lambda

I am trying to connect my Lambda to RDS just as a learning exercise.
Currently, all resources are created through CloudFormation and I would like to continue to do that if possible.
My issue is with the following statement from https://docs.aws.amazon.com/lambda/latest/dg/vpc-rds.html, which details how to connect:
A second file contains connection information for the function.
Example rds_config.py
#config file containing credentials for RDS MySQL instance
db_username = "username"
db_password = "password"
db_name = "ExampleDB"
The statement AWS is making makes it seem like I should hardcode these values into a file which does not seem secure. I could try to use environment variables but I think the same issue will arise.
If anyone has any advice for how to connect lambda to rds securely I would greatly appreciate it!!!

If you don't want to use environment variables for whatever reason, you can have your Lambda function query the AWS Systems Manager (SSM) Parameter Store instead.
Once your function has been triggered, it can query SSM for the desired parameters and then pass them into your RDS connection.
Just remember that if your Lambda also needs Internet Access (and in this case it does, because it will need to access SSM), you'll need to attach 2 subnets to it: a private and a public. The private will route traffic to RDS and the public will route traffic to other AWS Services / or to the internet.
Setting up environment variables would be the easiest way to get you off the ground, though.
EDIT: Check this answer where I walk the OP through creating a VPC with both public and private subnets if you need a quick start.
EDIT 2: Good news. AWS released VPC endpoints for SSM some time ago, so your Lambda won't need to go through the Internet anymore; it can just hit that VPC endpoint. You can see it in the official docs.
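As a minimal sketch of the Parameter Store approach (not the exact code from the AWS tutorial): the parameter names under `/myapp/db/` and the helper name `load_db_settings` are assumptions for illustration. `get_parameter` with `WithDecryption=True` is the boto3 call that reads a (possibly SecureString) parameter.

```python
def load_db_settings(prefix="/myapp/db/", ssm_client=None):
    """Read the DB username, password and name from SSM Parameter Store.

    The prefix and the three key names are hypothetical; use whatever
    names you created your parameters under.
    """
    if ssm_client is None:
        # Imported lazily so the helper can be exercised without AWS access.
        import boto3
        ssm_client = boto3.client("ssm")

    settings = {}
    for key in ("username", "password", "name"):
        # WithDecryption=True returns the plaintext of SecureString parameters.
        response = ssm_client.get_parameter(Name=prefix + key, WithDecryption=True)
        settings[key] = response["Parameter"]["Value"]
    return settings
```

The function's IAM role would need permission to read those parameters (and to use the KMS key if they are SecureStrings). Calling the helper outside the handler body lets warm invocations reuse the values.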

Related

Lambda function times out trying to connect to RDS if in VPC, but doesn't if outside VPC

I have a single AWS lambda function that connects to a single AWS RDS Postgres db and simply returns a json list of all records in the db.
If I don't assign a VPC to the lambda function, it is able to access the AWS RDS db. However, if I assign a VPC to the lambda function it can no longer access the db.
The VPC is the same for both the lambda function and the RDS db. I've also temporarily opened all inbound and outbound traffic to 0.0.0.0/0 to find the issue, but I am still unable to connect.
I believe it might be a role permission related to VPC for the lambda function, but I've already assigned the policy AmazonVPCFullAccess to the lambda role.
The fact that the lambda can access the DB when not in a VPC is a bit troubling in the sense that the DB is then probably public.
A common mistake is deploying the lambda to a public subnet. Lambdas only get assigned private IP addresses in a VPC. When deployed to a public subnet, its only route to the internet is the internet gateway. That doesn't really work if the lambda itself only has a private IP address (the internet couldn't route traffic back to you :P).
One part of the solution is to make sure your lambda is deployed to a private subnet instead with a route to a NAT gateway if it needs access to public resources.
However, the better part of the solution is to actually put the database in a private subnet WITHOUT a public IP address.
Because I've seen my customers make many mistakes with this, and because it can't be stressed enough: I'd strongly suggest you follow a three-tier networking model with your VPCs. This basically means:
Don't use the default VPC. Create your own.
Create 9 subnets:
3 public
3 private. Put your private Lambdas here.
3 isolated. Put your database here.
There are lots of articles / templates available that do this for you. A quick Google search gives me
https://github.com/aws-samples/vpc-multi-tier
https://www.wellarchitectedlabs.com/reliability/100_labs/100_deploy_cloudformation/1_deploy_vpc/
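To make the nine-subnet layout concrete, here is a small sketch that carves a VPC range into three tiers of three subnets each using only the standard library. The `10.0.0.0/16` range, the `/20` subnet size and the function name are assumptions for illustration, not values from the linked templates.

```python
import ipaddress


def plan_three_tier_subnets(vpc_cidr="10.0.0.0/16", new_prefix=20, zones=3):
    """Return CIDR blocks for public, private and isolated tiers.

    One subnet per tier per availability zone; the blocks are handed out
    in order from the start of the VPC range.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equally sized subnets
    plan = {}
    for tier in ("public", "private", "isolated"):
        plan[tier] = [str(next(blocks)) for _ in range(zones)]
    return plan
```

The resulting CIDR strings could then be fed into whatever CloudFormation template or tooling actually creates the subnets.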

Is it possible to connect to database hosted in local machine through AWS lambda

I launched an RDS instance, S3, and EC2 in AWS, and everything is triggered properly using Lambda. Now I wish to move the RDS database and EC2 from AWS to my local machine. My Lambda is triggered from S3.
How do I connect the local database through lambda in AWS?
It appears that your requirement is:
You wish to run an AWS Lambda function
Within the function, you wish to connect to a database running on your own computer (outside of AWS)
Firstly, I would not recommend this strategy. To maintain good performance, you should always have an application as close as possible to the database. This means on the same network, in the same location and not going across remote network connections or the Internet.
However, if you wish to do this, then here are some things you would need to do:
Your database will need to be accessible on the Internet, so that you can connect to it remotely. To test this, try accessing it from an Amazon EC2 instance.
The AWS Lambda function should either be configured without VPC connectivity (which means that it is connected to the Internet) or, if you have configured it for VPC connectivity, it needs to be in a Private Subnet with a NAT Gateway enabling Internet access.
(Optional) For added security, you could lock-down your database to only accept connections from a known IP address. To achieve this, you would need to use the VPC + NAT Gateway so that all traffic is coming from the Elastic IP address assigned to the NAT Gateway.
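For the reachability test mentioned above, a quick check that only opens a TCP connection is often enough to tell a networking problem from a credentials problem. This is a rough sketch; the function name is made up, and you would pass your own public hostname and database port.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

Running this from an EC2 instance (or from the Lambda itself) against the database's public address shows whether the port is reachable at all before you involve a database driver.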
I agree with John Rotenstein that connecting your local machine to a Lambda running on AWS is probably a bad idea.
If your intention is to develop or test locally, I recommend the serverless framework, and the serverless-offline plugin. It will allow you to simulate Lambda locally, and you can pass database config values through as environment variables.
See: Running AWS Lambda and API Gateway locally: serverless-offline
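Passing the database config through environment variables could look roughly like this. The variable names (`DB_HOST`, `DB_PORT`, `DB_USERNAME`, `DB_PASSWORD`, `DB_NAME`) and the default port are assumptions for illustration; nothing in Lambda or serverless-offline requires these particular names.

```python
import os


def db_config_from_env(environ=None):
    """Build a connection config dict from environment variables.

    Raises KeyError if a required variable is missing, which fails fast
    instead of connecting with half a configuration.
    """
    env = os.environ if environ is None else environ
    return {
        "host": env["DB_HOST"],
        "port": int(env.get("DB_PORT", "3306")),  # hypothetical default
        "user": env["DB_USERNAME"],
        "password": env["DB_PASSWORD"],
        "database": env["DB_NAME"],
    }
```

The same handler code then works locally and in AWS, with only the variable values differing between the two.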

What is the downside of NOT running AWS Lambda functions in a VPC?

I am running AWS Lambda functions in a VPC.
And during the course of the project I have hit problems because:
no access to my database - had to solve this somehow
no access to AWS SES - had to find workaround
no access to AWS SQS - removed all queuing functionality from Lambda functions
no access to external Internet - still don't know how to implement reCAPTCHA without Internet access
no access to AWS Cognito - cannot get information about logged in users
I COULD implement a NAT gateway in the VPC but what is the point of serverless if I have to run a NAT server instance? That's not serverless.
So finally AWS has worn me down and I have decided to give up on running my AWS Lambda functions in a VPC - without endpoints for Internet proxying and the various AWS services it's just too hard.
So my question is - what is the downside/disadvantage of running my AWS Lambda functions with no VPC?
If you need access to resources within a VPC, then run your AWS Lambda function within a VPC. If you do not require this access, then do not run it within a VPC.
If you require Internet access, then you should connect your Lambda functions to a Private Subnet and use a NAT Gateway, which is a fully-managed NAT so you can remain serverless. It will solve the problems you listed.
AWS has provided a reference document for Lambda deployments: Serverless Application Lens, AWS Well-Architected Framework. In it they provide the following decision tree:
The only major downside noted is that a Lambda outside of a VPC cannot directly access private resources within a VPC.
One reason to create a Lambda in a VPC would be that you need a specific IP or IP range for it. This could be the case if a system only accepts calls from a specific IP, which would need to be whitelisted.
A fixed IP for a Lambda function is discussed here: Is there a way to assign a Static IP to a AWS Lambda without VPC?
Downside of not having Lambda in VPC: Not having specific IP / IP-range for your Lambda function.
In the end I stayed with the VPC but I added an EC2 instance into the VPC and ran TinyProxy on it. I then configured my AWS Lambda functions with the environment variable:
HTTPS_PROXY https://ip-10-0-1-53.eu-west-1.compute.internal:8888
boto3 picked up the environment variable and sent all requests to the proxy. This seems to work fine without the complexity of a NAT gateway.
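If you would rather not rely on the environment variable being picked up implicitly, botocore's `Config` object accepts a `proxies` mapping. The sketch below is an assumption about how one might wire that up, not what the setup above actually ran; the helper names are hypothetical.

```python
import os


def proxy_map_from_env(environ=None):
    """Return a proxies mapping for botocore, or None if no proxy is set."""
    env = os.environ if environ is None else environ
    proxy_url = env.get("HTTPS_PROXY") or env.get("https_proxy")
    return {"https": proxy_url} if proxy_url else None


def make_client(service_name, environ=None):
    """Create a boto3 client that sends HTTPS requests through the proxy."""
    # Imported lazily so proxy_map_from_env can be used without boto3 installed.
    import boto3
    from botocore.config import Config

    return boto3.client(service_name, config=Config(proxies=proxy_map_from_env(environ)))
```

Making the proxy explicit in code keeps the behaviour visible to whoever reads the function later, at the cost of one more thing to configure.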

AWS Lambda: Unable to access SQS Queue from a Lambda function with VPC access

I have a Lambda function that needs to read messages from an SQS queue using its URL. Then it needs to insert that data into Cassandra running on a server inside a VPC.
I am able to access the Cassandra server from my Lambda function, using its private IP and configuring the security groups correctly.
However, I am not able to read messages from the SQS Queue. When I change the configuration of Lambda function to No VPC, then I am able to read the messages from the SQS Queue. However, with VPC settings, it just times out.
How can I overcome this? I have checked that the security group of my Lambda function has full outbound access to all IP addresses.
At the end of 2018, AWS announced support for SQS endpoints which provide
connectivity to Amazon SQS without requiring an internet gateway, network address translation (NAT) instance, or VPN connection.
There is a tutorial for Sending a Message to an Amazon SQS Queue from Amazon Virtual Private Cloud
See also the SQS VPC Endpoints Documentation for more information.
It's important to note that if you want to access SQS within the Lambda VPC there are a few other things you need to do:
Make sure to specify the SQS region in your code. For example, I had to set my endpoint_url to "https://sqs.us-west-2.amazonaws.com"
Make sure that you have attached a "wide open" security group to the SQS VPC Interface, otherwise SQS will not work.
Make sure that your subnets in your Lambda VPC match what you have set up for your SQS VPC Interface.
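The first point can be sketched as follows. `boto3.client` accepts both `region_name` and `endpoint_url`; the helper names are made up, and the URL pattern simply generalises the `us-west-2` example above to other regions.

```python
def sqs_endpoint_url(region):
    """Build the regional SQS endpoint URL, e.g. for us-west-2."""
    return f"https://sqs.{region}.amazonaws.com"


def make_sqs_client(region):
    """Create an SQS client pinned to the regional endpoint."""
    # Imported lazily so sqs_endpoint_url can be used without boto3 installed.
    import boto3

    return boto3.client(
        "sqs",
        region_name=region,
        endpoint_url=sqs_endpoint_url(region),
    )
```

Inside the handler you would then call something like `make_sqs_client("us-west-2").receive_message(QueueUrl=...)` with your own queue URL.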
Some services (e.g. S3) offer VPC endpoints to solve this particular problem, but at the time this answer was written SQS was not one of them. Without an endpoint, the only real solution is to run a NAT inside your VPC so the network traffic from the Lambda function can be routed to the outside world.
I ran into the same kind of problem when I was running a lambda function with access to ElastiCache on the VPC. While the function was configured to run in the VPC, I wasn't able to talk to any other service (specifically CodeDeploy for me).
As @garnaat pointed out, NAT seems to be the only way to go about solving this problem for services without VPC endpoints.
And like you pointed out, I also ran into the same trouble where I couldn't SSH into the machine(s) once I replaced the entry with the IGW in the route table. It seems that detaching the IGW cuts the VPC off from incoming traffic from the internet (mostly) or outgoing traffic to it. So here's what I did and it worked for me:
Create a new Subnet within the VPC
Now, when lambda runs, make sure lambda operates from this subnet.
You can do this by using aws-cli like so:
aws lambda update-function-configuration --function-name your-function-name --vpc-config SubnetIds="subnet-id-of-created-subnet",SecurityGroupIds="sg-1","sg-2"
Make sure you add all the security groups whose inbound and outbound traffic rules apply for your lambda function.
Next, go to Route Tables in the VPC console and create a new route table.
Here is where you add a route with the NAT gateway as its target.
Finally, go to the Subnet Associations tab in the new route table and add the newly created subnet there.
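The same steps can be scripted with boto3 instead of the CLI and console. This is only a sketch under the assumption that the subnet and NAT gateway already exist; the function name and all IDs are placeholders. The clients are passed in, e.g. `boto3.client("lambda")` and `boto3.client("ec2")`.

```python
def move_lambda_to_nat_subnet(lambda_client, ec2_client, function_name,
                              vpc_id, subnet_id, security_group_ids,
                              nat_gateway_id):
    """Attach the function to a subnet and route that subnet through a NAT gateway."""
    # Same effect as `aws lambda update-function-configuration --vpc-config ...`.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": [subnet_id],
            "SecurityGroupIds": list(security_group_ids),
        },
    )
    # New route table with a default route whose target is the NAT gateway.
    route_table_id = ec2_client.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
    ec2_client.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_gateway_id,
    )
    # Associate the new subnet with the route table.
    ec2_client.associate_route_table(RouteTableId=route_table_id, SubnetId=subnet_id)
    return route_table_id
```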
That's all; this should get it working. Mind you, please treat this as only a workaround. I haven't done much digging and I have a very limited idea of how things get resolved internally while doing this. This might not be an ideal solution.
The ideal solution seems to be to design the VPC beforehand. Use subnets to isolate resources/instances that need internet access from those that don't (private and public subnets) and place appropriate gateways where needed (so that you don't have to create a separate subnet for this purpose later). Thanks
I was unable to get either of the other two answers to this question to work. Perhaps this is due to one or more mistakes on my part. Regardless, I did find a workaround that I wanted to share, in case I'm not alone with this problem.
Solution: I created two Lambda functions. The first Lambda function runs inside my VPC and performs the desired work (in mandeep_m91's case, that's a data insertion to Cassandra; in my case it was accessing an RDS instance). The second Lambda function lives outside the VPC, so I could hook it up to the SQS queue. I then had the second Lambda function call the first, using the information found in this StackOverflow Q&A answer. Note, the linked question has both node.js and Python examples in the answers.
This will effectively double the cost of making a function call, since each call results in two function executions. However, for my situation, the volume is so low it won't make a real difference.
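A rough Python sketch of the outer (non-VPC) function, assuming it is triggered by SQS and forwards each message body to the inner function. The function name `vpc-worker` and the payload shape are made up; `invoke` with `InvocationType="Event"` is the asynchronous form of the call, and `"RequestResponse"` would wait for the result instead.

```python
import json


def forward_sqs_records(event, lambda_client, target_function="vpc-worker"):
    """Invoke the in-VPC function once per SQS record; return how many were sent."""
    forwarded = 0
    for record in event.get("Records", []):
        lambda_client.invoke(
            FunctionName=target_function,   # hypothetical name of the in-VPC function
            InvocationType="Event",         # fire-and-forget
            Payload=json.dumps({"body": record["body"]}).encode("utf-8"),
        )
        forwarded += 1
    return forwarded


def lambda_handler(event, context):
    # Imported here so forward_sqs_records can be tested without boto3.
    import boto3

    return forward_sqs_records(event, boto3.client("lambda"))
```

The outer function's role needs `lambda:InvokeFunction` permission on the inner function.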
To clarify a point above about a "wide open" security group, the group set on the endpoint needs to allow inbound access to SQS from your lambda function.
I created a security group for my endpoint that only opened 443 to my lambda's security group.
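That rule can be expressed with boto3 as below. This is a sketch with placeholder security group IDs and a made-up function name; `authorize_security_group_ingress` with a `UserIdGroupPairs` entry is the call that allows traffic from another security group rather than from a CIDR range.

```python
def allow_https_from_lambda(ec2_client, endpoint_sg_id, lambda_sg_id):
    """Open TCP 443 on the endpoint's security group to the Lambda's security group only."""
    return ec2_client.authorize_security_group_ingress(
        GroupId=endpoint_sg_id,
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            # Source is a security group, not an IP range.
            "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
        }],
    )
```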

How to edit AWS EC2 instance's security groups to allow access to a lambda function only

I am running into a security related issue with AWS lambda and not sure what is the right way to resolve this.
Consider an EC2 instance A accessing the database on another EC2 instance B. If I want to restrict the accessibility of the DB on instance B to instance A only, I would modify the security group and add a custom TCP rule to allow access to only the public IP of instance A. So, this way, AWS will take care of everything and the DB server will not be accessible from any other IP address.
Now let us replace instance A by a lambda function. Since it is no longer an instance, there is no definite IP address. So, how do I restrict access to only the lambda function and block any other traffic?
Have the Lambda job determine its IP, and dynamically update the instance B security group, then reset the security group when done.
Until there is support for Lambda running within a VPC this is the only option. Support for that has been announced for later this year. The following quote is from the referenced link above.
Many AWS customers host microservices within a Amazon Virtual Private
Cloud and would like to be able to access them from their Lambda
functions. Perhaps they run a MongoDB cluster with lookup data, or
want to use Amazon ElastiCache as a stateful store for Lambda
functions, but don’t want to expose these resources to the Internet.
You will soon be able to access resources of this type by setting up
one or more security groups within the target VPC, configure them to
accept inbound traffic from Lambda, and attach them to the target VPC
subnets. Then you will need to specify the VPC, the subnets, and the
security groups when your create your Lambda function (you can also
add them to an existing function). You’ll also need to give your
function permission (via its IAM role) to access a couple of EC2
functions related to Elastic Networking.
This feature will be available later this year. I’ll have more info
(and a walk-through) when we launch it.
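The interim workaround suggested at the start of this answer (open the security group for the Lambda's current IP, then close it again) could be sketched as a context manager. How the function discovers its own public IP is left out and assumed to be handled elsewhere; the helper name and arguments are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def temporary_ingress(ec2_client, group_id, source_ip, port):
    """Allow source_ip to reach the given port only for the duration of the block."""
    permissions = [{
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{source_ip}/32"}],
    }]
    ec2_client.authorize_security_group_ingress(GroupId=group_id, IpPermissions=permissions)
    try:
        yield
    finally:
        # Always remove the rule again, even if the database work raised.
        ec2_client.revoke_security_group_ingress(GroupId=group_id, IpPermissions=permissions)
```

Usage would look like `with temporary_ingress(ec2, "sg-...", my_ip, 3306): run_queries()`, where the group ID, IP and `run_queries` are your own.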
I believe the below link will explain lambda permission model for you.
http://docs.aws.amazon.com/lambda/latest/dg/intro-permission-model.html

Resources