I am trying to get data for certain objects that are in a specific Availability Zone (ec2.ap-southeast-2b.amazonaws.com), but setting the endpoint fails:
ec2Config.ServiceURL = "http://ec2.ap-southeast-2b.amazonaws.com"
and I get a NameResolutionException!
How can I get the data (I am calling ProductionClient.DescribeVolumes()) for this specific AZ?
AWS service endpoints are specific to a whole Region rather than just an Availability Zone.
So, the URL should be: http://ec2.ap-southeast-2.amazonaws.com (without the 'b')
For a list of Endpoints, see: AWS Regions and Endpoints
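To restrict the results to one AZ, point the client at the Region endpoint and filter on the availability zone instead. A minimal sketch with the AWS SDK for .NET (synchronous call shown, as on .NET Framework; on .NET Core you would use DescribeVolumesAsync):

```csharp
using System;
using System.Collections.Generic;
using Amazon.EC2;
using Amazon.EC2.Model;

// Use the Region endpoint; AZ-specific hostnames do not exist.
var ec2Config = new AmazonEC2Config { ServiceURL = "https://ec2.ap-southeast-2.amazonaws.com" };
var client = new AmazonEC2Client(ec2Config);

// Narrow DescribeVolumes to the desired AZ with a server-side filter.
var request = new DescribeVolumesRequest
{
    Filters = new List<Filter>
    {
        new Filter { Name = "availability-zone", Values = new List<string> { "ap-southeast-2b" } }
    }
};

var response = client.DescribeVolumes(request);
foreach (var volume in response.Volumes)
{
    Console.WriteLine($"{volume.VolumeId} ({volume.AvailabilityZone})");
}
```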
I need to use SQL in multiple locations. The best option would be to place some databases (or even some records, like tagging in Mongo) in different locations. Is this possible with Google Cloud SQL?
There may be two scenarios:
One single Cloud SQL instance in multiple locations
Different Cloud SQL instances in multiple locations
When you create a Cloud SQL instance, you choose a region where the instance and its data are stored. To reduce latency and increase availability, choose the same region for your data and your Compute Engine instances, standard environment apps, and other services.
There are mainly two location types: a regional location, i.e. a specific geographic place, and a multi-regional location, which contains at least two geographic places. Multi-regional locations are only used for backup operations in Cloud SQL.
You choose a location when you first create the instance. The location can't be changed after the instance is created.
One single region consists of many small data centers called zones. While creating a Cloud SQL instance you can specify the instance to be available in a single zone or in two different zones within the selected region. Selecting the Cloud SQL instance to be in two different zones is called High Availability (HA) configuration.
The purpose of an HA configuration is to reduce downtime when a zone or instance becomes unavailable which might happen during a zonal outage, or when an instance becomes corrupted. With HA, your data continues to be available to client applications.
The HA configuration provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and is located in a primary and secondary zone within the configured region. Within a regional instance, the configuration is made up of a primary instance and a standby instance.
So, for the first scenario: a single Cloud SQL instance can be located in multiple locations if you consider different zones to be different locations (two zones are physically separated data centers within a single GCP region). However, it can only span two zones, and for that you have to configure High Availability (HA) for the instance.
For the second scenario, you can always create different Cloud SQL instances in different regions.
You can go through Instance locations in Cloud SQL and the Overview of the high availability configuration to get a brief understanding of the above.
There is another option in Cloud SQL called read replicas.
You use a read replica to offload work from a Cloud SQL instance. The read replica is an exact copy of the primary instance. Data and other changes on the primary instance are updated in almost real time on the read replica.
Read replicas are read-only; you cannot write to them. The read replica processes queries, read requests, and analytics traffic, thus reducing the load on the primary instance.
If you want the data to be available in multiple locations you may consider using cross-region read replicas.
Cross-region replication lets you create a read replica in a different region from the primary instance.
Cross-region read replicas have several advantages:
Improve read performance by making replicas available closer to your application's region.
Provide additional disaster recovery capability to guard against a regional failure.
Let you migrate data from one region to another.
Is it OK to store multiple tenants' QnAs in a single data source? For example, in Azure Table Storage with all QnAs stored in a single table, each tenant's data differentiated by a unique key, and results then filtered by that key. This would help me reduce the Azure service cost, but are there any drawbacks to this method?
Sharing a service/index in developer/test environments is fine, but there are additional concerns for production environments. These are some drawbacks, though you might not care about some of them:
competing queries: high traffic volume for one tenant can affect query latency/throughput for another tenant
harder to manage data for individual tenants: can you easily delete all documents for a particular tenant? Would the whole index need to be deleted or recreated for any reason which will affect all tenants?
flexibility in location: multiple services allow you to put data physically closer to where the queries will be issued. There can also be legal requirements for where data is stored.
susceptible to bugs/human error: people make mistakes; how bad is it to return data for the wrong tenant? How would you guard against that?
permission management: do you need to grant permissions to view data for only a subset of the tenants?
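If you do go with the shared-table route described in the question, the usual guard against returning the wrong tenant's data is to make the tenant key the PartitionKey and scope every query by it. A minimal sketch, assuming the Azure.Data.Tables package (table, property, and tenant names are illustrative):

```csharp
using System;
using Azure.Data.Tables;

// One table for all tenants; the tenant id is the PartitionKey.
var table = new TableClient("<storage-connection-string>", "QnaEntries");

// Write a QnA row under the tenant's partition.
var entity = new TableEntity(partitionKey: "tenant-42", rowKey: Guid.NewGuid().ToString())
{
    ["Question"] = "What are your opening hours?",
    ["Answer"] = "9am to 5pm, Monday to Friday."
};
table.AddEntity(entity);

// Every read is scoped to the caller's tenant, so nothing outside that partition comes back.
var results = table.Query<TableEntity>(filter: "PartitionKey eq 'tenant-42'");
foreach (var row in results)
{
    Console.WriteLine($"{row["Question"]} -> {row["Answer"]}");
}
```

Note that this only partially addresses the bug/human-error point above: it still relies on every code path remembering to apply the tenant filter.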
Because Lambdas are stateless and multiple instances can run at the same time, it might be a bad idea to generate IDs based on timestamps. I am currently using UUIDv1. I know the chance of generating the same ID from the same timestamp is already practically zero, and it's unique enough for my application. Out of curiosity, I'm thinking of ways to generate truly, mathematically unique IDs on AWS Lambda.
UUIDv1 uses a node ID to distinguish IDs generated with the same timestamp. Random numbers or MAC addresses (a bad idea for virtual instances) are used to create node IDs.
If I had a unique ID for my active Lambda instance, I would be able to generate truly unique IDs. There is an awsRequestId inside the context object, but it just seems like another timestamp-based UUID.
Maybe you guys have more ideas?
AWS Lambda:
String nodeId = System.getenv("AWS_LAMBDA_LOG_STREAM_NAME").replaceFirst(".*?(?=\\w+$)", ""); // keep only the trailing, instance-specific token of the log stream name
See: Defined runtime environment variables
AWS EC2:
HTTP GET http://169.254.169.254/latest/meta-data/instance-id
See: Instance metadata and user data
AWS ECS:
HTTP GET http://localhost:51678/v1/metadata
See: How to get Task ID from within ECS container?
Unique within the subnet:
String executableId = ManagementFactory.getRuntimeMXBean().getName(); // typically "<pid>@<hostname>"; on EC2 the default hostname encodes the private IP, which is unique within the subnet
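For what it's worth, the same trick expressed as a hedged C# sketch: treat the trailing token of the log stream name as the node ID (it is stable per Lambda execution environment) and combine it with the clock and a per-container counter. The helper name is made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;

static class LambdaNodeId
{
    // The log stream name is unique per Lambda execution environment; its trailing
    // token serves as a node id (the same idea as the Java one-liner above).
    // Falls back to a random value when the variable is missing (e.g. local runs).
    private static readonly string Node = Regex.Match(
        Environment.GetEnvironmentVariable("AWS_LAMBDA_LOG_STREAM_NAME") ?? Guid.NewGuid().ToString("N"),
        @"\w+$").Value;

    private static long _counter;

    // Node id + UTC ticks + per-container counter: distinct containers differ in the
    // node part, and calls within one container differ in the counter even on the same tick.
    public static string NewId() =>
        $"{Node}-{DateTime.UtcNow.Ticks}-{Interlocked.Increment(ref _counter)}";
}
```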
I have a simple web app UI that stores certain dataset parameters (for simplicity, assume they are all data tables in a single Redshift database, but the schema/table name can vary; the Redshift cluster is in AWS). Tableau is installed on an EC2 instance in the same AWS account.
I am trying to determine an automated way of passing 'parameters' as a data source (i.e. within the connection string inside Tableau on EC2/AWS) rather than manually creating data source connections and inputting the various customer requests.
The flow would be: say 50 users select various parameters in the UI (for simplicity, suppose the parameters are stored as a JSON file in AWS) -> the parameters are sent to Tableau and data sources are created -> the connection is established within Tableau without the customer 'seeing' anything in the back end -> the customer can play with the data in Tableau and create tables and charts accordingly.
How can I do this, at least through a batch job or a CloudFormation setup? A "hacky" solution is fine.
Bonus: if the above is doable in real-time across multiple users that would be awesome.
** I am open to using other dashboard UI tools which solve this problem e.g. QuickSight **
After installing Tableau on EC2, I am struggling to find an article/documentation on how to pass parameters into the connection string itself, or even how to parameterise it manually.
An example: customer1 selects "public_schema.dataset_currentdata" and "public_schema.dataset_yesterday", and customer2 selects "other_schema.dataset_currentdata", all of which exist in a single database.
Three data sources should be generated (one for each above), but only the data sources a customer selected should be open to that customer, i.e. customer2 should only see the connection for other_schema.dataset_currentdata.
One hack I was considering is to spin up a CloudFormation template with Tableau installed when a customer makes a request, create the connection accordingly, and delete the stack when they are done. I am mainly unsure how I would establish the connection, i.e. pass in the parameters. I am also not sure spinning up 50 EC2 instances is wise. :D
An issue I have seen so far is that creating a manual extract limits the number of rows, so I think I need a live connection per customer request; hence I am trying to work around this.
You can do this with a combination of a basic embed and applying filters. This would load the Tableau workbook. Then you would apply a filter based on whatever values your user selects from the JSON.
The final missing piece is that you would use a parameter instead of a filter and pass those values to the database via initial SQL.
I am aware of the generation of the Performance Counters and Diagnosis in webrole and worker-role in Azure.
My question is: can I get the performance counters from a remote place or a remote app, given the subscription ID and the required certificates (i.e. a third-party app that reports the performance counters)?
In other words, can I get the performance counter data the way I use the Service Management API for any hosted service?
What pre-configuration is required on the server to get CPU data?
Following is the description of the attributes for Performance counters table:
EventTickCount: Stores the tick count (in UTC) when the log entry was recorded.
DeploymentId: Id of your deployment.
Role: Role name
RoleInstance: Role instance name
CounterName: Name of the counter
CounterValue: Value of the performance counter
One of the key things here is to understand how to effectively query this table (and the other diagnostics tables). One of the things we would want from the diagnostics tables is to fetch the data for a certain period of time. Our natural instinct would be to query this table on the Timestamp attribute. However, that's a bad design choice because, as you know, data in an Azure table is indexed on PartitionKey and RowKey; querying on any other attribute results in a full table scan, which becomes a problem once your table contains a lot of data.
The good thing about these logs table is that PartitionKey value in a way represents the date/time when the data point was collected. Basically PartitionKey is created by using higher order bits of DateTime.Ticks (in UTC). So if you were to fetch the data for a certain date/time range, first you would need to calculate the Ticks for your range (in UTC) and then prepend a "0" in front of it and use those values in your query.
If you're querying using REST API, you would use syntax like:
PartitionKey ge '0<from date/time ticks in UTC>' and PartitionKey le '0<to date/time ticks in UTC>'.
You could use this syntax if you're querying table storage in our tool Cloud Storage Studio, Visual Studio or Azure Storage Explorer.
Unfortunately I don't have much experience with the Storage Client library, but let me work something out. Maybe I will write a blog post about it. Once I do, I will post the link to my blog post here.
Gaurav
Since the performance counter data gets persisted in Windows Azure Table Storage (in the WADPerformanceCountersTable), you can query that table from a remote app (either by using Microsoft's Storage Client library or by writing your own custom wrapper around the Azure Table Service REST API) to retrieve the data. All you will need is the storage account name and key.
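To illustrate the PartitionKey trick described above, here is a minimal remote-query sketch. It assumes the modern Azure.Data.Tables package rather than the original Storage Client library, but the filter string is the same one you would send over the REST API:

```csharp
using System;
using Azure.Data.Tables;

// Connect to the diagnostics storage account; the account name and key are all you need.
var table = new TableClient(
    "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>",
    "WADPerformanceCountersTable");

// Build the PartitionKey range: '0' prepended to DateTime.Ticks (UTC) of the window you want.
DateTime fromUtc = DateTime.UtcNow.AddHours(-1);
DateTime toUtc = DateTime.UtcNow;
string filter = $"PartitionKey ge '0{fromUtc.Ticks}' and PartitionKey le '0{toUtc.Ticks}'";

// Query on PartitionKey only, so the server can do a range scan instead of a full table scan.
foreach (var row in table.Query<TableEntity>(filter))
{
    Console.WriteLine($"{row["CounterName"]} = {row["CounterValue"]} ({row["RoleInstance"]})");
}
```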