On checking the following documentation for Alibaba Cloud ECS:
https://www.alibabacloud.com/help/doc-detail/59643.htm
https://www.alibabacloud.com/help/doc-detail/25499.htm?#CreateInstance
https://www.alibabacloud.com/help/doc-detail/25517.htm
I see that there is an option to enable encryption for data disks using the following parameter(s):
Set the parameter DataDisk.n.Encrypted (CreateInstance) or Encrypted (CreateDisk) to true.
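For example (an illustrative sketch based on the documentation above, with placeholder values and the common request parameters omitted), a CreateInstance request with an encrypted data disk would include parameters like:
https://ecs.aliyuncs.com/?Action=CreateInstance
&RegionId=cn-hangzhou
&ImageId=&lt;your-image-id&gt;
&InstanceType=ecs.g5.large
&DataDisk.1.Size=100
&DataDisk.1.Category=cloud_ssd
&DataDisk.1.Encrypted=true
&<common request parameters>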
However, I don't see a similar option for encrypting the SystemDisk of the ECS instance, either while creating the instance or in ModifyDiskAttribute.
Is there an option for doing this that is perhaps not documented?
I missed this in the documentation; it's covered in the Limits section of the following article:
https://www.alibabacloud.com/help/doc-detail/59643.htm
You can only encrypt data disks, not system disks.
Also, the following official Alibaba Cloud blog post states that "Data in the instance operating system is not encrypted.":
https://www.alibabacloud.com/blog/data-encryption-at-storage-on-alibaba-cloud_594581
I think the reason is that encryption on the system disk would slow down processing.
In the AWS policy conditions section, what is the difference between ec2:ResourceTag/${TagKey} and aws:ResourceTag/${TagKey}?
I am trying to understand whether there is a difference between adding aws:ResourceTag/${TagKey} and ec2:ResourceTag/${TagKey} to the conditions.
aws:ResourceTag is an AWS Global Condition Context Key, whereas ec2:ResourceTag is an AWS Service-specific Key.
In the general case, global condition keys should work across all services, but this is not guaranteed (and was not in the past), so you need to verify that the key is supported by the relevant service.
In this specific case, for EC2, they behave the same way, as you can see in the relevant documentation:
https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonec2.html#amazonec2-aws_ResourceTag___TagKey_
https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonec2.html#amazonec2-ec2_ResourceTag___TagKey_
See also https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html
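To make the comparison concrete, here is a minimal policy statement sketch (the action, tag key, and tag value are hypothetical; either condition key alone is sufficient, both are shown only for comparison):
{
  "Effect": "Allow",
  "Action": "ec2:StopInstances",
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/Team": "backend",
      "ec2:ResourceTag/Team": "backend"
    }
  }
}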
AWS has requested that the product I'm working on identify the requests it makes to our users' S3 resources on their behalf, so they can assess its impact.
To accomplish this, we have to set the User-Agent header for every upload request made against an S3 bucket from an EMR application. How can this be achieved?
Hadoop's documentation mentions the fs.s3a.user.agent.prefix property (core-default.xml). However, the s3a protocol seems to be deprecated (see Work with Storage and File Systems), so I'm not sure whether this property will work.
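For reference, this is how the property would normally be set in the Hadoop configuration (core-site.xml); whether it actually takes effect on EMR depends on which S3 connector handles the requests:
<property>
  <name>fs.s3a.user.agent.prefix</name>
  <value>APN/1.0 PARTNER/1.0 PRODUCT/1.0</value>
</property>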
To give a bit more context on what I need to do: with the AWS Java SDK it is possible to set the User-Agent header's prefix, for example:
import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.s3.AmazonS3Client;

// Prefix the User-Agent header on every request made by this client
ClientConfiguration conf = new ClientConfiguration()
        .withUserAgentPrefix("APN/1.0 PARTNER/1.0 PRODUCT/1.0");
AmazonS3Client client = new AmazonS3Client(new DefaultAWSCredentialsProviderChain(), conf);
Then every request's User-Agent HTTP header will have a value similar to: APN/1.0 PARTNER/1.0 PRODUCT/1.0, aws-sdk-java/1.11.234 Linux/4.15.0-58-generic Java_HotSpot(TM)_64-Bit_Server_VM/25.201-b09 java/1.8.0_201. I need to achieve something similar when uploading files from an EMR application.
S3A is not deprecated in ASF Hadoop; I would argue it is now ahead of what EMR's own connector does. If you are using EMR you may be able to use it; otherwise you have to work with what they implement.
FWIW, in S3A we're looking at what it would take to dynamically change the header for a specific query, so you can go beyond specific users to specific Hive/Spark queries in shared clusters. It would be fairly complex to do, though, as it has to be set on a per-request basis.
The solution in my case was to include an awssdk_config_default.json file inside the JAR submitted to the EMR job. This file is used by the AWS SDK to allow developers to override some settings.
I added this JSON file to the JAR with the following content:
{
"userAgentTemplate": "APN/1.0 PARTNER/1.0 PRODUCT/1.0 aws-sdk-{platform}/{version} {os.name}/{os.version} {java.vm.name}/{java.vm.version} java/{java.version}{language.and.region}{additional.languages} vendor/{java.vendor}"
}
Note: passing the fs.s3a.user.agent.prefix property to the EMR job didn't work. AWS EMR uses EMRFS when handling files stored in S3, and EMRFS uses the AWS SDK. I realized this because of an exception occasionally thrown in AWS EMR; part of its stack trace was:
Caused by: java.lang.ExceptionInInitializerError: null
at com.amazon.ws.emr.hadoop.fs.files.TemporaryDirectoriesGenerator.createAndTrack(TemporaryDirectoriesGenerator.java:144)
at com.amazon.ws.emr.hadoop.fs.files.TemporaryDirectoriesGenerator.createTemporaryDirectories(TemporaryDirectoriesGenerator.java:93)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.create(S3NativeFileSystem.java:616)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:932)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:825)
at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.create(EmrFileSystem.java:217)
at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
I'm posting the answer here for future reference. Some interesting links:
The class in the AWS SDK that uses this configuration file: InternalConfig.java
https://stackoverflow.com/a/31173739/1070393
EMRFS
We can take a VSS-based snapshot of a Hyper-V VM using the WMI CreateSnapshot() API provided by Msvm_VirtualSystemSnapshotService.
But there is no API provided to read the snapshot data.
Please suggest ways to read Hyper-V snapshot data for backup purposes.
There are several details you can leverage for your backup purposes.
The Msvm_VirtualSystemSnapshotService class contains the following properties:
Description, DetailedStatus, OperatingStatus, OperationalStatus, PrimaryStatus, StatusDescriptions, etc.
I am creating a VPC in Amazon's cloud, but I can't figure out how to disable the source/dest check on my NAT instance from the AWS SDK. Specifically, I am using Ruby, and the docs show a call that returns a boolean indicating whether it is on or not: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/EC2/Instance.html#source_dest_check-instance_method
I don't see anywhere that I can actually set it from the AWS SDK. I can do it through the console or through the command line tools, but it looks like they might have left this out of the API?
No, it hasn't been left out of the API. You can use this:
http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/EC2/Client.html#modify_instance_attribute-instance_method
or this:
http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/EC2/Client.html#modify_network_interface_attribute-instance_method
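For example, with the first of those calls the code would look roughly like this (an untested sketch; the instance ID is a placeholder and the option names follow the linked API reference):
require 'aws-sdk'

ec2 = AWS::EC2.new
# Disable the source/destination check on the NAT instance
ec2.client.modify_instance_attribute(
  :instance_id => 'i-12345678',
  :source_dest_check => { :value => false }
)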
I would like to access external data from my AWS EC2 instance.
In more detail: I would like to specify, inside my user-data, the name of a folder containing about 2 MB of binary data. When my AWS instance starts up, I would like it to download the files in that folder and copy them to a specific location on the local disk. I only need to access the data once, at startup.
I don't want to store the data in S3 because, as I understand it, this would require storing my AWS credentials on the instance itself, or passing them as user-data, which is also a security risk. Please correct me if I am wrong here.
I am looking for a solution that is both secure and highly reliable.
Which operating system do you run?
You can use Elastic Block Storage (EBS). It's like a device you can mount at boot (without credentials), and you get permanent storage there.
You can also sync instances using something like the Gluster filesystem; see this thread on it.