Best way to connect to MySQL from AWS Lambda

I want to know the best approach for creating and terminating connections to an external MySQL instance, so that more than 1000 users can invoke the AWS Lambda function at the same time.

The best option is to configure MySQL on Amazon RDS; AWS has a tutorial on using Lambda with an RDS database in a VPC: https://docs.aws.amazon.com/lambda/latest/dg/vpc-rds.html. Accessing MySQL from AWS Lambda is no different from accessing it from code in any language (Java, Python, Node.js, or C#). Be sure to configure the proper roles and security groups so that MySQL can be reached from Lambda.
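Because Lambda reuses execution environments between invocations, the usual pattern is to create the connection once, outside the handler, instead of opening and closing one per request. Here is a minimal sketch in Python, assuming the PyMySQL driver and credentials supplied through environment variables (both are illustrative choices, not requirements):
import os
import pymysql  # bundle with your deployment package: pip install pymysql

# Created once per execution environment and reused across invocations,
# so high concurrency does not translate into a new connection per request.
connection = pymysql.connect(
    host=os.environ["DB_HOST"],  # placeholder: your MySQL endpoint
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
    database=os.environ["DB_NAME"],
    connect_timeout=5,
)

def lambda_handler(event, context):
    # Reconnect transparently if the idle connection was dropped.
    connection.ping(reconnect=True)
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        return {"result": cursor.fetchone()[0]}
Note that with 1000 concurrent executions there can still be up to 1000 simultaneous connections (one per warm environment), so the MySQL server's max_connections setting has to accommodate that.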

1000 concurrent executions is the default account-level limit for Lambda, not a per-function connection limit. From the console (or the CLI/SDK) you can reserve concurrency for a single function, or request an increase to the account limit. -> https://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html
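If you prefer to script it rather than click through the console, reserving concurrency for a function is a single call; a sketch with boto3 (the function name and value are placeholders):
import boto3

lambda_client = boto3.client("lambda")

# Reserve up to 500 concurrent executions for this (hypothetical) function.
lambda_client.put_function_concurrency(
    FunctionName="my-mysql-function",
    ReservedConcurrentExecutions=500,
)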

Related

BadRequestException when trying to access serverless database cluster via its Data API inside a Lambda

My Lambda function has exactly the same IAM permissions as an IAM user I created for testing purposes. When I configure the AWS CLI on my local computer to use the IAM user and execute the following command:
aws rds-data execute-statement --resource-arn "arn:aws:rds:eu-central-1:xxxxxxxxxxx:cluster:xxxxxxxxxxx" --database="test" --secret-arn "arn:aws:secretsmanager:eu-central-1:xxxxxxxxxxx:secret:databaseclusterdatabaseSecr-xxxxxxxxxxx" --sql "show databases;"
it succeeds and prints all databases as expected.
When I do the same thing inside my Lambda:
const command = new ExecuteSqlCommand({
  dbClusterOrInstanceArn, // matches the value I used for the CLI command
  awsSecretStoreArn,      // matches the value I used for the CLI command
  sqlStatements: 'show databases;',
  database: 'test',
});
const result = await databaseClient.client.send(command);
I receive the following error:
{
  "name": "BadRequestException",
  "$fault": "client",
  "$metadata": {
    "httpStatusCode": 400,
    "requestId": "74171357-0de6-4350-a776-d88a4ae748ac",
    "attempts": 1,
    "totalRetryDelay": 0
  }
}
Do I have to perform any additional network configuration for my Lambda to be able to connect to my serverless database cluster? Do my Lambda and my cluster need to be in the same VPC? If not, can someone point me in the right direction as to how I can debug this problem? Thanks a lot guys.
It turned out I had confused ExecuteSqlCommand with ExecuteStatementCommand. ExecuteSqlCommand is the deprecated variant; the current Data API call is ExecuteStatementCommand, which takes resourceArn, secretArn, and sql instead of dbClusterOrInstanceArn, awsSecretStoreArn, and sqlStatements.
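For comparison, the equivalent of the working CLI call through boto3's rds-data client would look roughly like this (ARNs redacted as in the question):
import boto3

client = boto3.client("rds-data")

# execute_statement is the current Data API call; the ARNs are redacted placeholders.
response = client.execute_statement(
    resourceArn="arn:aws:rds:eu-central-1:xxxxxxxxxxx:cluster:xxxxxxxxxxx",
    secretArn="arn:aws:secretsmanager:eu-central-1:xxxxxxxxxxx:secret:databaseclusterdatabaseSecr-xxxxxxxxxxx",
    database="test",
    sql="show databases;",
)
print(response["records"])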

Evolving Tarantool instance

What should we do if we have one Tarantool instance (without Cartridge or VShard) and at some point in the future need to replicate it to another machine without downtime?
Or, if the easiest way is to use Cartridge, how do we connect to Tarantool Cartridge from outside the cluster, for example from Go (and what are the username and password?):
taran, err = tarantool.Connect(cfg.Tarantool.Addr, tarantool.Opts{
	User:          cfg.Tarantool.User,
	Pass:          cfg.Tarantool.Pass,
	Reconnect:     10 * time.Second, // interval between reconnect attempts
	MaxReconnects: 8640,             // give up after this many attempts
})
For example, in other databases you only need to attach a new replica to the master (one command-line call) and wait for it to sync (100% replicated).
Not sure I'll completely answer your question, but let's discuss each point separately.
Replication
You can use replication without VShard or Cartridge. VShard is a module for sharding; if you don't need sharding, you can use the replication feature on its own.
Read about configuring replication in the documentation: https://www.tarantool.io/en/doc/latest/book/replication/. Cartridge is just a framework that simplifies cluster management and gives you a huge number of useful features.
User password
You also asked about users and passwords. After you call box.cfg{listen=...} you can create a user, grant it rights, and change its password. Please read about user management in Tarantool in our documentation: https://www.tarantool.io/en/doc/latest/book/box/authentication/. Once you have created a user, you can connect to the Tarantool instance under that user via a connector, via the console (using tarantoolctl), or from another Tarantool instance (using the net.box module).
As for Cartridge, it uses the system user admin, with the cluster cookie as the password.
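So, in the Go snippet above, User would be admin and Pass the cluster cookie. The same connection from Python, as a sketch using the tarantool-python connector (the address and cookie value are placeholders):
import tarantool  # pip install tarantool

# Placeholders: the instance's binary port and the cluster cookie chosen
# when the Cartridge cluster was created.
conn = tarantool.connect("127.0.0.1", 3301, user="admin", password="my-cluster-cookie")
print(conn.eval("return box.info.version"))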

How to decrypt database activity events from AWS Aurora?

I have turned on database activity events, which I believe produce some kind of log for AWS Aurora. They are currently being passed through AWS Kinesis into S3 via AWS Firehose. The log in S3 looks like this:
{"type":"DatabaseActivityMonitoringRecords","version":"1.0","databaseActivityEvents":"AYADeOC+7S/mFpoYLr17gZCXuq8AXwABABVhd3MtY3J5cHRvLXB1YmxpYy1rZXkAREFvbjhIZ01uQTVpVHlyS0l3NnVIOS9xdXF3OWEza0xZV0c2QXYzQmtWUFI2alpIK2hsczNwalAyTTIzYnpPS2RXUT09AAEAAkJDABtEYXRhS2V5AAAAgAAAAAwzb2YKNe4h6b2CpykAMLzY7gDftUKUr3QxmxSzylw9qCRxnGW9Fn1qL4uKnbDV/PE44WyOQbXKGXv9s8BxEwIAAAAADAAAEAAAAAAAAAAAAAAAAAC+gU55u4hvWxW1RG/FNNSJ/////wAAAAEAAAAAAAAAAAAAAAEAAACtbmBmDwZw2/1rKiwA4Nyl7cm19/RcHhCpMMwbOFFkZHKL/bvsohf5T+yM9vNxCgAi2qTUIEe17VA5bJ0eCcNAA9mb6Ys+PR1w7QhKrQsHHTBC2dhJ4ELwpXamGRmPLga5Dml2rOveA59YefcJ4PhrqztZXfrS8fBYJ3HgBWHY9nPh1jdyinjQAl61hQrz2LPII85zlqAWTNeL2pXwaRdtGdYeIXXoh4VsoV3Q18Hj/uOQzTIbT8EJvwnk0gj8AGcwZQIxAJNuoCJhHPUfbkk0fHF6HYz1STIc4HX2HOl0qSIHqwpgtQK6BMa3YlPI9hNwhB8x+AIwWDY0bMjuLRGQgjjBv5z1xPpZQ+pMZ4K6m9JaNBFVKxZTvqDL1z7lrV0rlbZThad+","key":"AQIDAHhQgnMAiP8TEQ3/r+nxwePP2VOcLmMGvmFXX8om3hCCugE7IUxSH/eJBEKvnkYoNIqFAAAAfjB8BgkqhkiG9w0BBwagbzBtAgEAMGgGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMQIX97gE5ioBR1+nnAgEQgDuDX2B2T7nOxjKDyL31+wHJb0pwkCeaU7CwA6BwIkiT7FmhMB71XgvCVrY9C9ABUtc1e5J7QIfsVB214w=="}
I think a KMS key is being used to encrypt that log. How do I decrypt it? Is there working sample code somewhere? Also, more importantly, the Aurora database I'm using is a test database with no activity (no inserts, selects, or updates). Why are there so many logs? Why are there so many databaseActivityEvents? They seem to be getting written to S3 every minute of the day.
Yes, it uses the RDS activity stream KMS key (ActivityStreamKmsKeyId) to encrypt each log event, and the result is additionally base64-encoded. You will have to use the AWS cryptography SDKs to decrypt the data key and then the log event itself.
For reference, AWS provides sample Java and Python versions:
Processing a Database Activity Stream using the AWS SDK
In your Firehose pipeline you can add a transformation step backed by a Lambda function and do this decryption inside that Lambda.
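Condensed, the Python approach looks roughly like this (a sketch adapted from the pattern in that AWS sample, assuming the aws-encryption-sdk package; the region and cluster resource ID are placeholders you must replace with your own):
import base64
import json
import zlib

import boto3
import aws_encryption_sdk
from aws_encryption_sdk import CommitmentPolicy
from aws_encryption_sdk.internal.crypto import WrappingKey
from aws_encryption_sdk.key_providers.raw import RawMasterKeyProvider
from aws_encryption_sdk.identifiers import WrappingAlgorithm, EncryptionKeyType

REGION = "eu-central-1"               # placeholder: your cluster's region
RESOURCE_ID = "cluster-ABC123DEF456"  # placeholder: the cluster's DbClusterResourceId

enc_client = aws_encryption_sdk.EncryptionSDKClient(
    commitment_policy=CommitmentPolicy.FORBID_ENCRYPT_ALLOW_DECRYPT
)

class DataKeyProvider(RawMasterKeyProvider):
    # Hands the KMS-decrypted data key to the Encryption SDK.
    provider_id = "BC"

    def __new__(cls, *args, **kwargs):
        obj = super(RawMasterKeyProvider, cls).__new__(cls)
        return obj

    def __init__(self, plain_key):
        RawMasterKeyProvider.__init__(self)
        self.wrapping_key = WrappingKey(
            wrapping_algorithm=WrappingAlgorithm.AES_256_GCM_IV12_TAG16_NO_PADDING,
            wrapping_key=plain_key,
            wrapping_key_type=EncryptionKeyType.SYMMETRIC,
        )

    def _get_raw_key(self, key_id):
        return self.wrapping_key

def decrypt_record(record):
    # Step 1: KMS-decrypt the envelope key; the encryption context must
    # name the cluster the stream belongs to.
    kms = boto3.client("kms", region_name=REGION)
    data_key = kms.decrypt(
        CiphertextBlob=base64.b64decode(record["key"]),
        EncryptionContext={"aws:rds:dbc-id": RESOURCE_ID},
    )["Plaintext"]

    # Step 2: decrypt the payload with the Encryption SDK, then gunzip it.
    provider = DataKeyProvider(data_key)
    provider.add_master_key("DataKey")
    decrypted, _header = enc_client.decrypt(
        source=base64.b64decode(record["databaseActivityEvents"]),
        key_provider=provider,
    )
    return json.loads(zlib.decompress(decrypted, zlib.MAX_WBITS + 16))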
Why are there so many events in an idle Postgres RDS cluster? They are heartbeat events.
When you decrypt and look at the actual activity event JSON, it has a type field whose value is either record or heartbeat. Events with type record are the ones generated by user activity.

How to invoke step function from a lambda which is inside a vpc?

I am trying to invoke a Step Function from a Lambda that is inside a VPC.
I get an exception that the HTTP request timed out.
Is it possible to access Step Functions from a Lambda in a VPC?
Thanks,
If your Lambda function is running inside a VPC, you need to add a VPC endpoint for Step Functions.
In the VPC console, under Endpoints : Create Endpoint, the service name for Step Functions is com.amazonaws.us-east-1.states (the region part may vary).
Took me a while to find this in the documentation.
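The same endpoint can be created with boto3 instead of the console (all IDs below are placeholders for your own VPC resources):
import boto3

ec2 = boto3.client("ec2")

# An interface endpoint puts Step Functions on a private IP inside your VPC.
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.states",  # adjust the region
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,
)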
It is possible, but it depends on how you are trying to access Step Functions. If you are using the AWS SDK, it takes care of signing the requests for you; if you are issuing raw HTTP calls, you will need to deal with the AWS authentication headers yourself.
The other thing to look at is the role your Lambda executes under. Without seeing how you have things configured I can only suggest things I encountered: you may need to adjust your policies so the role has the action sts:AssumeRole, and possibly also add the action iam:PassRole to the same execution role.
The easiest way to get going is to grant your execution role administrator privileges, test it out, then work backwards to lock down the role's access. Remember to treat your Lambda function like another API user account and set its privileges appropriately.
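For the SDK route, starting an execution is a single call; a Python sketch (the state machine ARN is a placeholder, and the Lambda's execution role needs states:StartExecution on it):
import json
import boto3

sfn = boto3.client("stepfunctions")

# Placeholder ARN; with the VPC endpoint in place this call resolves privately.
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:MyStateMachine",
    input=json.dumps({"orderId": 42}),
)
print(response["executionArn"])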

Automate SQL Query to send email on Redshift

I am a beginner in AWS (coming from a Microsoft background). I want to run a SQL query against Redshift tables to find duplicates on a daily basis and send the results out by email to a Prod Support group.
Please advise on the right way to proceed with this.
I recommend doing this with either AWS Lambda or AWS Batch: use one of these services to run a short query on a schedule and send the results out if required.
Lambda is ideal for simple tasks that complete quickly. https://aws.amazon.com/lambda/ Note that Lambda charges by duration and has tight limits on how long an invocation can run. A basic skeleton for connecting to Redshift in Lambda is provided in this S.O. answer: Using psycopg2 with Lambda to Update Redshift (Python)
Batch is useful for more complex or longer-running tasks that need to run in sequence. https://aws.amazon.com/batch/
There is no in-built capability in Amazon Redshift to do this for you (e.g. no stored procedures).
The right way is to write a program that queries Redshift and then sends an email (see the sketch below).
I see that you tagged your question with aws-lambda. A Lambda function might not be suitable here because it can only run for a maximum of 5 minutes (the limit at the time of writing; it has since been raised to 15), and that might be shorter than your analysis needs.
Instead, you could run the program from an Amazon EC2 instance, or from any computer connected to the Internet.
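A minimal sketch of such a program, assuming psycopg2 for the Redshift connection and Amazon SES for the email (the endpoint, credentials, query, and addresses are all placeholders):
import boto3
import psycopg2

# Placeholders: your Redshift endpoint, credentials, and duplicate-check query.
conn = psycopg2.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="prod",
    user="report_user",
    password="...",
)

with conn.cursor() as cur:
    cur.execute("""
        SELECT order_id, COUNT(*)
        FROM orders
        GROUP BY order_id
        HAVING COUNT(*) > 1;
    """)
    duplicates = cur.fetchall()

if duplicates:
    body = "\n".join(str(row) for row in duplicates)
    # SES only sends from verified addresses; both addresses are placeholders.
    boto3.client("ses").send_email(
        Source="reports@example.com",
        Destination={"ToAddresses": ["prod-support@example.com"]},
        Message={
            "Subject": {"Data": "Daily Redshift duplicate report"},
            "Body": {"Text": {"Data": body}},
        },
    )
Schedule it daily with cron on the EC2 instance (or a CloudWatch Events rule if you do end up using Lambda).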
