How to Copy files from one EBS to Another EBS - amazon-ec2

The problem is simple.
I need to copy the files from one EBS volume to another without passing them through my local machine. Is this possible? If so, how?

In order to copy files from one EBS volume to another EBS volume, both volumes will (at some point) need to be attached to an instance, though not necessarily the same instance. There are a lot of ways to do this if you allow for multiple instances and/or temporarily storing the files on a third storage option, but without constraints, and assuming the volumes are not currently attached to instances, a simple solution would be:
Start a temporary instance. Use a larger size for higher IO bandwidth.
Attach both EBS volumes to the instance and mount them as, say, /vol1 and /vol2
Copy the files from /vol1 to /vol2 (perhaps using something like: rsync -aSHAX /vol1/ /vol2/ )
Unmount the volumes, detach the EBS volumes, terminate the temporary instance.
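The copy step above can be sketched in Python. This is only an illustration: it assumes both volumes are already attached and mounted at /vol1 and /vol2 and that rsync is installed on the temporary instance. The function names build_rsync_command and copy_volume are made up for this sketch.

```python
import subprocess


def build_rsync_command(src, dst):
    """Return the rsync argument list for mirroring src into dst.

    The trailing slash on src tells rsync to copy the directory's
    contents rather than the directory itself.
    """
    src = src.rstrip("/") + "/"
    dst = dst.rstrip("/") + "/"
    # -a archive mode, -S handle sparse files, -H keep hard links,
    # -A keep ACLs, -X keep extended attributes
    return ["rsync", "-aSHAX", src, dst]


def copy_volume(src="/vol1", dst="/vol2"):
    """Run the copy; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(src, dst), check=True)
```

Calling copy_volume() with no arguments runs the same command as in the step above. Preserving ownership, ACLs and extended attributes generally requires running it as root.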
If you have additional constraints, you should update your question to specify exactly what your environment is and what you're trying to do.
[Followup based on the error you are seeing]
The EC2 instance and EBS volume must exist in the same EC2 region and availability zone for the volume to be attached to the instance. If they are not, then you may want to create a temporary instance in the EBS volume's availability zone and use something like rsync to copy files between two instances.
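The availability zone constraint can be checked before attempting the attach. The sketch below assumes you already have the instance and volume descriptions as dictionaries shaped like the entries boto3 returns from describe_instances and describe_volumes. The helper name and the sample zone names are only illustrative.

```python
def same_availability_zone(instance, volume):
    """Return True if the instance and the volume are in the same zone.

    instance: dict shaped like an entry of Reservations[].Instances[]
    volume:   dict shaped like an entry of Volumes[]
    """
    instance_az = instance["Placement"]["AvailabilityZone"]
    volume_az = volume["AvailabilityZone"]
    return instance_az == volume_az


# Hypothetical sample data, not real resources.
example_instance = {"Placement": {"AvailabilityZone": "us-east-1a"}}
example_volume = {"AvailabilityZone": "us-east-1b"}
```

With the sample data above the function returns False, which is the situation where a second temporary instance in the volume's zone is needed.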

Related

How to take a backup of an SSD AWS [ec2]?

I have an m3.large instance. I can find the other EBS volumes associated with that instance in the Volumes section, but I am not able to find my 32GB SSD disk.
How can I take a backup of this SSD?
It appears that you are referring to the Instance Store SSD volume that is provided as part of an m3.large Amazon EC2 instance.
Instance Store volumes are temporary (aka "ephemeral") and their content is lost when the instance is Stopped, Terminated or fails. Therefore, they are recommended only for temporary files and swap files. Be sure to copy off any data you wish to keep before the instance is Stopped.
Instance Store volumes are not the same as Elastic Block Store (EBS) volumes. While EBS provides a snapshot capability, this is not available for Instance Store volumes.
Instead, you must copy off any data you wish to keep via normal filesystem commands, or run traditional backup software.
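One way to copy data off with ordinary filesystem tools is to write a compressed archive to persistent storage, such as a mounted EBS volume. A minimal sketch using Python's standard tarfile module follows. The function name and any paths you pass in are assumptions for illustration, and the archive path should be outside the directory being archived.

```python
import os
import tarfile


def archive_directory(source_dir, archive_path):
    """Write a gzip-compressed tar archive of source_dir to archive_path."""
    # Store entries under the directory's own name instead of its full path.
    base = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=base)
    return archive_path
```

For example, archive_directory("/mnt/scratch", "/data/scratch-backup.tar.gz") would archive a hypothetical Instance Store directory onto a hypothetical EBS mount at /data.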

Running out of disk space EC2

I ran into some issues with my EC2 micro instance and had to terminate it and create a new one in its place. But it seems even though the old instance is no longer visible in the list, it is still using up some space on my disk. My df -h is listed below:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.8G 7.0G 719M 91% /
When I go to the EC2 console I see there are 3 volumes, each 8GB, in the list. One of them is attached (/dev/xvda) and is showing as "in-use". The other 2 are simply showing as "Available".
Is the terminated instance really using up my disk space? If yes, how to free it up?
I have just solved my problem by running this command:
sudo apt autoremove
This removes a lot of old packages, for instance many files like linux-aws-headers-4.4.0-1028.
Amazon Elastic Block Store (EBS) is a service that provides virtual disks for use with Amazon EC2. It is network-attached storage that persists even when an EC2 instance is stopped or terminated.
When launching an Amazon EC2 instance, a boot volume is automatically attached to the instance. The contents of the boot volume are copied from an Amazon Machine Image (AMI), which can be chosen from a pre-populated list (including the ability to create your own AMI).
When an Amazon EC2 instance is Stopped, all EBS volumes remain attached to the instance. This allows the instance to be Started with the same configuration as when it was stopped.
When an Amazon EC2 instance is Terminated, EBS volumes might or might not be deleted, based upon the Delete on Termination setting of each volume:
By default, boot volumes are deleted when an instance is terminated. This is because the volume was originally just a copy of an AMI, so there is unlikely to be any important data on the volume. (Hint: Don't store data on a boot volume.)
Additional volumes default to "do not delete on termination", on the assumption that they contain data that should be retained. When the instance is terminated, these volumes will remain in an Available state, ready to be attached to another instance.
So, if you do not require any content on your remaining EBS volumes, simply delete them. In future, when launching instances, keep an eye on the Delete on Termination setting to make the clean-up process simpler.
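To find the leftover volumes programmatically, you can filter a volume listing on its State field. The sketch below works on a dictionary shaped like the response of boto3's describe_volumes call. The sample response and its volume IDs are made up, and the helper name is hypothetical.

```python
def unattached_volume_ids(describe_volumes_response):
    """Return the IDs of volumes in the 'available' (unattached) state."""
    return [
        vol["VolumeId"]
        for vol in describe_volumes_response.get("Volumes", [])
        if vol.get("State") == "available"
    ]


# Hypothetical response with one attached and two leftover volumes.
SAMPLE_RESPONSE = {
    "Volumes": [
        {"VolumeId": "vol-aaaa", "State": "in-use"},
        {"VolumeId": "vol-bbbb", "State": "available"},
        {"VolumeId": "vol-cccc", "State": "available"},
    ]
}
```

With boto3 installed and credentials configured, the response would come from the EC2 client's describe_volumes(), and each returned ID could be passed to delete_volume(VolumeId=...) once you are sure the data is not needed.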
Please note that the df -h command is only showing currently-attached volumes. It is not showing the volumes in Available state, since they are not visible to that instance. The concept of "Disk Space" typically refers to the space within an EBS volume, while "EBS Storage" refers to the volumes themselves. So, the 7GB of the volume that is used is related to that specific (boot) volume.
If you are running out of space on an EBS volume, see: Expanding the Storage Space of an EBS Volume on Linux. Expanding the volume involves:
Creating a snapshot
Creating a new (bigger) volume from the snapshot
Swapping the disks (requiring a Stop/Start if you are swapping a boot volume)
These 2 steps add an extra hard drive to your EC2 and format it for use:
Attach an extra hard drive (EBS: Elastic Block Store) to an EC2
Format an EBS drive attached to an EC2
Here's pricing info. Free Tier includes 30GB. Afterward it's $1.25/month for 10GB on a General Purpose SSD (gp2).
To see how much space you are using/need:
Check your current disk use/available in Linux with df -h.
Check the size of a directory in Linux with du -sh [path].
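The same two checks can be approximated from Python with the standard library. This is a rough sketch, not a replacement for df and du: percent_used divides used by total space, so it can differ slightly from df's Use% column, and directory_size sums apparent file sizes rather than allocated blocks. Both function names are made up for this example.

```python
import os
import shutil


def percent_used(path="/"):
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def directory_size(path):
    """Total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so link targets are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```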

Can I create an AMI that includes multiple ebs volumes (i.e. both sda and sdb)

I have an ebs-backed instance running on EC2. I'm using it to do some computationally intensive text processing on around 16GB of data which is stored on sdb (i.e. the larger ebs volume associated with the instance).
I'd like to parallelize the processing by creating replicas of this instance, each with its own copy of the data. I can create an AMI from the instance but I need the image to include BOTH sda (the root ebs volume) AND ALSO sdb, which is the volume where all the data is. How can I make a replica of the whole package?
Creating an image in the AWS Management Console just copies sda (i.e. the root volume, which is too small to hold my data).
Is this even possible?
(PS: I don't even see the sdb volume in the AWS Management Console Elastic Block Store->Volumes panel)
Thanks!
I once needed this sort of setup, where I had to set up MySQL on an EBS-backed machine with the data stored on a separate EBS volume. The AMI had to be such that every time you instantiate it, the data volume (with static data in it) is attached. This is how I did it:
Created an EBS-backed instance from any existing image.
Attached an EBS volume, performed mkfs, mounted it on /database.
Copied data to the volume, e.g. under /database/mysql.
Created an image of this setup from the AMI web console.
Now, every time I launch this image, I see that the volume with all the data is there. I just mount it on /database and things get going.
I am not sure if this is helpful to you, but your problem seemed close to this.
Update after #NAD's comment
Yeah, the AMI creation process excludes anything under:
/sys
/proc
/dev
/media
/mnt
So, the trick is to not keep anything that you want to bundle with your AMI under these directories.
Also, if you have a volume that you want to auto-mount at boot, register it in fstab.
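For the fstab registration, each line has six fields: device, mount point, filesystem type, mount options, dump flag and fsck pass. The helper below only builds such a line as text. The device name /dev/xvdf, the ext4 type and the nofail option in the usage note are assumptions for illustration, not values taken from the answer above.

```python
def fstab_entry(device, mount_point, fs_type="ext4", options="defaults,nofail"):
    """Build one /etc/fstab line: device, mount point, type, options, dump, pass."""
    # dump=0 disables dump backups; pass=2 checks the filesystem after the root one.
    fields = [device, mount_point, fs_type, options, "0", "2"]
    return "\t".join(fields)
```

For example, fstab_entry("/dev/xvdf", "/database") gives a line you could append to /etc/fstab after checking the device name on your instance. The nofail option lets the boot continue if the volume is not attached.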

amazon ec2, is mount information stored in ami or snapshot?

I have an EC2 instance set up where the MySQL DB is mounted on a separate volume.
(as detailed in http://aws.amazon.com/articles/1663 )
I want to duplicate this instance setup so that the application servers on the duplicated instances share the DB volume, which is attached to the already running EC2 instance. (I can specify the MySQL IP through a configuration file.)
Since almost everything in the setup except the MySQL IP is identical, I'd like to create an AMI from the first instance and modify it slightly to create the 2nd and 3rd instances.
The question is: will the mount information stored in the first instance take effect when I launch the 2nd instance?
To elaborate on the question:
1. I read that a volume cannot be attached to more than one EC2 instance at the same time.
2. The running instance attaches/mounts a volume to itself on startup (so it seems).
3. If I were to create an AMI from the first instance and use that to launch other instances, how would the auto attach/mount information (which I assume will be stored in the AMI) affect the other instances?
Eugene,
Mounting the same device on several servers is not possible, so you'd better forget about this option.
The best solution is to:
Create a copy of your master instance.
Detach the created mount volume. We are going to create an image from this new instance, and you don't want the useless drive copy to be re-created every time.
Change the settings that you need to change, in order to make this server rely on the remote (master) mysql server.
Once you are satisfied with the outcome, create an image from this instance.
Good luck!
Dotan

How do multiple EC2 instances (scaling) work with one EBS volume for data storage?

So, in a simple situation, if there is only one instance, then I can store the data on an EBS volume mounted on that instance, e.g. /mnt/db.
However, how does it work if I scale and have multiple instances (either static or dynamic scaling)?
Because one EBS volume can only be attached to one instance, if I have multiple instances, does it mean that I have to attach an EBS volume to each instance? If that's the case, the data on each instance's EBS volume will be different.
Obviously, I want all instances to access (read and write) a single volume as data storage. The data in the volume will constantly grow, and there can be no downtime.
What is the solution? Is there a way to access the data on the device (EBS) without mounting it?
Here is what I can think of:
1) If each instance has its own EBS volume, then at each time interval (e.g. 1 hour), all instances unmount and detach their EBS volume and attach a new one. Then one powerful instance mounts all the EBS volumes just detached and aggregates all the data.
2) Or, similar to 1), instead of detaching and attaching, I just take a snapshot of every instance's volume. Then the powerful instance aggregates the data from the snapshots and saves the result to either another EBS volume or S3.
These two approaches seem workable, but require a lot of work. Is there a smarter way to approach this problem? Thanks.
By the way, because of performance issues, I cannot have the instances write data to S3. :)
Oh, how about this:
3) First, all instances have their own EBS volume and write data to it. Then each hour, the data is sent to S3, and another instance aggregates it.
How about having an NFS instance which can be mounted on the other instances?
It seems that you need to create an EBS snapshot of your most up-to-date EC2 instance. This will create an EBS-backed AMI. You would then need to terminate all your EC2 instances that are not up to date and launch a new stack of instances from your newly created AMI. If you had a load balancer running, then you would have to attach these new instances to your load balancer as well.
It seems a little long-winded, but it can all be done programmatically. At least this is how I think scaling in the cloud with Amazon works as far as propagating changes across multiple instances goes. Somebody with more experience should verify this. I plan to test it out myself later on.

Resources