Can Aurora RDS Serverless be restricted to a geography? - amazon-aurora

I am developing an application on AWS that has a regulatory requirement to keep its data in a certain geography. I know we can achieve this with RDS. But if we use Aurora Serverless, can we ensure that the data does not leave the geography in which the Amazon data centre is located?
I have gone through the AWS documentation. It seems to suggest that the data is geographically distributed to improve latency, which would mean I have no control over where the data is. My need is the opposite: I want to restrict it to a certain geographic location.

Aurora Serverless clusters are similar to provisioned clusters: they are tied to a region. Provisioned clusters have newer features such as Global Databases, which make the data available in other geographies, but Aurora Serverless does not support those features. Your data in, say, us-east-1 does not leave that region.
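If you want to double-check this for compliance purposes, one option is to audit the region embedded in each cluster ARN. Below is a minimal sketch, assuming you have already collected the cluster ARNs somehow (for example from the console or an inventory export). The ARNs, account ID and helper names are made up for illustration.

```python
def region_of(arn: str) -> str:
    """Return the region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[3]


def clusters_outside(arns, allowed_regions):
    """Return the ARNs whose region is not in the allowed set."""
    allowed = set(allowed_regions)
    return [arn for arn in arns if region_of(arn) not in allowed]


# Hypothetical cluster ARNs, for illustration only.
CLUSTERS = [
    "arn:aws:rds:us-east-1:123456789012:cluster:orders",
    "arn:aws:rds:eu-west-1:123456789012:cluster:reports",
]

if __name__ == "__main__":
    # Lists every cluster that lives outside the permitted geography.
    print(clusters_outside(CLUSTERS, ["us-east-1"]))
```

A check like this only reports where clusters were created; it does not enforce anything by itself.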

Related

Aurora Serverless AWS strong read consistency?

Is it possible to get strong read consistency (read-after-write) with Aurora Serverless? I'm using the data api client.
Aurora Serverless will always give you read-after-write consistency. It is pretty much a single (writer) instance Aurora provisioned cluster with serverless support to automatically scale (up or down) the compute.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html

Accessing Amazon Aurora from Lambda?

I am a beginner in AWS development and I have a question about accessing Amazon Aurora from Lambda.
I have read that all Amazon Aurora instances need to be created inside a VPC. However, it seems that Lambda incurs massive latency setting up an elastic network interface (ENI) every time it tries to access resources inside a VPC:
https://medium.freecodecamp.org/lambda-vpc-cold-starts-a-latency-killer-5408323278dd
Since this could increase the cold start time by around 10s, is there a way to avoid this ENI setup latency while using Lambda to access Amazon RDS?
No. There is currently no "good" way to reliably prevent the cold start.
(1) Yes, keeping the Lambda function warm can help reduce the problem, but it will still be present.
(2) The only way would be to run your RDS instance "outside" a VPC (i.e. make it publicly available) and secure it using security groups. But this is a really bad idea for a lot of reasons (Lambda IP addresses change, so you would need to leave the RDS instance wide open to any attacker; it violates AWS best practices; etc.).
AWS Lambda + RDS is currently not suitable if you need responsiveness. That's why Amazon is pushing the use of DynamoDB with Lambda so much (since that uses HTTPS).
TL;DR: if you need responsiveness + security, stay away from Lambda + RDS.
What you need to do is make sure your lambda role has the AWSLambdaVPCAccessExecutionRole policy attached to it.
Your ENI is created on cold start. Avoid the cold start by creating another lambda to invoke your current lambda on a schedule to keep it warm.
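For the keep-warm approach, the target function should return as early as possible when it is pinged, so the warm-up call does not touch the database. Here is a rough sketch, assuming the pinging Lambda sends a payload with a "warmup" key. That key and the do_real_work placeholder are my own naming, not anything AWS defines.

```python
def do_real_work(event):
    """Placeholder for the real logic (e.g. querying Aurora)."""
    return {"warmed": False, "echo": event}


def handler(event, context):
    """Lambda entry point: short-circuit on warm-up pings."""
    # "warmup" is a hypothetical marker set by the scheduled pinger function.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True}
    return do_real_work(event)
```

The pinger would invoke this function on a schedule with {"warmup": true} as its payload. This keeps one container warm; concurrent requests can still hit a cold start.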

Global borderless implementation website/app on Serverless AWS

I am planning to use AWS to host a global website that has customers all around the world. We will have a website and an app, and we will use a serverless architecture. I will also consider multi-region DynamoDB so that users can access the database instance closest to their region.
My question is about the best design for a solution that is not locked down to one particular region, i.e. a borderless implementation. I am also expecting high traffic and a high number of users across different countries.
I am looking at this https://aws.amazon.com/getting-started/serverless-web-app/module-1/ but it requires me to choose a region. I almost need a router in front of this with multiple S3 buckets, but I don't know how to set that up. For example, how do users access the copy of the landing page closest to their region? How do mobile app users call Lambda functions in their region?
If you could point me to a posting or article or simply your response, I would be most grateful.
Note: would be interested if Google Cloud Platform is also an option?
thank you!
S3
Instead of setting up an S3 bucket per-region, you could set up a CloudFront distribution to serve the contents of a single bucket at all edge locations.
During the Create Distribution process, select the S3 bucket in the Origin Domain Name dropdown.
Caveat: when you update the bucket contents, you need to invalidate the CloudFront cache so that the updated contents get distributed. This isn't such a big deal.
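To illustrate the invalidation step: as far as I know, the CloudFront CreateInvalidation call takes a batch containing the paths, their count and a unique caller reference. The sketch below only builds that batch as a plain dict and does not call AWS. The helper name and the sample paths are my own.

```python
import time


def invalidation_batch(paths, caller_reference=None):
    """Build an InvalidationBatch-shaped dict for a set of object paths."""
    items = sorted(set(paths))  # de-duplicate and keep a stable order
    for path in items:
        if not path.startswith("/"):
            raise ValueError(f"invalidation paths must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # The reference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time()),
    }


if __name__ == "__main__":
    print(invalidation_batch(["/index.html", "/css/*"], "deploy-1"))
```

With boto3 you would pass the result as the InvalidationBatch argument together with your distribution ID.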
API Gateway
Setting up an API Gateway gives you the choice of Edge-Optimized or Regional.
In the Edge-Optimized case, AWS automatically serves your API via the edge network, but requests are all routed back to your original API Gateway instance in its home region. This is the easy option.
In the Regional case, you would need to deploy multiple instances of your API, one per region. From there, you could do a latency-based routing setup in Route 53. This is the harder option, but more flexible.
Refer to this SO answer for more detail
Note: you can always start developing in an Edge-Optimized configuration, and then later on redeploy to a Regional configuration.
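For the Regional option, the latency-based routing in Route 53 boils down to one record per region, all sharing the same name, each tagged with a region and a unique set identifier. Below is a sketch that builds such a change batch as a plain dict. The domain names and regional targets are placeholders, and I am assuming simple CNAME records rather than alias records.

```python
def latency_record(name, region, target, ttl=60):
    """One latency-routed record pointing `name` at a regional endpoint."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": f"api-{region}",  # must be unique per record
            "Region": region,                  # enables latency-based routing
            "TTL": ttl,
            "ResourceRecords": [{"Value": target}],
        },
    }


def change_batch(name, targets):
    """Build a change batch from a {region: endpoint} mapping."""
    return {"Changes": [latency_record(name, r, t) for r, t in sorted(targets.items())]}


# Hypothetical regional API endpoints.
TARGETS = {
    "us-east-1": "us.api.example.com",
    "eu-west-1": "eu.api.example.com",
}

if __name__ == "__main__":
    print(change_batch("api.example.com", TARGETS))
```

The resulting dict has the shape that boto3's change_resource_record_sets expects for its ChangeBatch argument, but double-check against the Route 53 docs before relying on it.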
DynamoDB / Lambda
DynamoDB and Lambda are regional services, but you could deploy instances to multiple regions.
In the case of DynamoDB, you could set up cross-region replication using stream functions.
Though I have never implemented it, AWS provides documentation on how to set up replication
Note: Like with Edge-Optimized API Gateway, you can start developing DynamoDB tables and Lambda functions in a single region and then later scale out to a multi-regional deployment.
Update
As noted in the comments, DynamoDB has a feature called Global Tables, which handles the cross-regional replication for you. Appears to be fairly simple -- create a table, and then manage its cross-region replication from the Global Tables tab (from that tab, enable streams, and then add additional regions).
For more info, here are the AWS Docs
At the time of writing, this feature is only supported in the following regions: US West (Oregon), US East (Ohio), US East (N. Virginia), EU (Frankfurt), EU West (Ireland). I imagine when enough customers request this feature in other regions it would become available.
Also noted, you can run Lambda@Edge functions to respond to CloudFront events.
The Lambda function can inspect the AWS_REGION environment variable at runtime and then invoke a region-appropriate service (e.g. API Gateway), forwarding the request details. This means you could also use Lambda@Edge as an API Gateway replacement by inspecting the query string yourself (YMMV).
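The region lookup described above could look roughly like this. It is only the selection logic, not a full Lambda@Edge handler. The endpoint URLs and the fallback region are assumptions for the example.

```python
import os

# Hypothetical mapping from execution region to a regional backend.
REGIONAL_APIS = {
    "us-east-1": "https://api-us.example.com",
    "eu-west-1": "https://api-eu.example.com",
}
DEFAULT_REGION = "us-east-1"  # assumed fallback when no regional backend exists


def pick_backend(region=None):
    """Choose the backend for the region the function is currently running in."""
    region = region or os.environ.get("AWS_REGION", DEFAULT_REGION)
    return REGIONAL_APIS.get(region, REGIONAL_APIS[DEFAULT_REGION])
```

Inside the real handler you would call pick_backend() with no argument and forward the request to the returned URL.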

Instance type on EC2 Amazon AWS

For bandwidth of 400GB per month, what EC2 instance should I use if I want to create a video streaming infrastructure to different regions?
You won't get any specific answers on questions like this. It is totally dependent on your application.
If you stored the videos in Amazon S3 and streamed videos through Amazon CloudFront, then Amazon EC2 would purely be handling user interactions and web pages, without having to serve video content at all.
For any application, the only way to know how much compute is required is to test the application under many different workloads and instance types and measure the performance. Alternatively, an application can be designed to use serverless microservices using AWS Lambda, which can automatically scale without using EC2 instances.

How to create a Windows Azure application hosted in different datacenters

I'm trying to figure out how to scale a Windows Azure app, where there are some web roles and some worker roles.
The objective is to have some instances in a US datacenter and others in a European datacenter, so that users in America and Europe both get the best response time. My problem is replicating all my storage (for users in Europe who travel to America and vice versa), and also coping with trouble in one datacenter.
Until now, I understand that it's possible using Traffic Manager to let Azure know which datacenter is closer to the user.
I know I can replicate data between databases with SQL Data Sync.
The blob storage can also be replicated using the Copy Blob API.
I understand the queues cannot be automatically replicated but I don't have much problem with that.
My problem is I cannot find a way to replicate table storages.
As a matter of fact I really don't know if this is the best strategy for my problem...
Thank you.
DX - you are right on with Traffic Manager and Data Sync. Those are the best options for roles & SQL. However, BLOBs are much easier - enable CDN and your BLOBs are replicated across 24 data centers automatically. Read Using CDN for Windows Azure for how to set up the CDN from your primary Storage account.
For table storage, I would handle this programmatically: keep a list of the Table connections and then use a parallel foreach to insert into the different data centers.
We maintain a different Service Configuration file for each Data Center to simplify deployment.
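The table storage fan-out mentioned above can be sketched as follows. This uses Python threads in place of a .NET parallel foreach, and an in-memory class standing in for each data center's table client, so the class and method names are purely illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryTable:
    """Stand-in for one data center's table connection (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def upsert(self, key, entity):
        self.rows[key] = dict(entity)


def replicate(tables, key, entity):
    """Write the entity to every table in parallel; return names of tables that failed."""

    def write(table):
        try:
            table.upsert(key, entity)
            return None
        except Exception:
            return table.name  # report the failure so the caller can retry

    with ThreadPoolExecutor(max_workers=max(1, len(tables))) as pool:
        return [name for name in pool.map(write, tables) if name is not None]


if __name__ == "__main__":
    centers = [InMemoryTable("us"), InMemoryTable("eu")]
    print(replicate(centers, "user-1", {"name": "DX"}))
```

Note that this gives no atomicity across data centers: if one write fails, you have to retry or reconcile it yourself.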

Resources