Do Amazon EBS snapshots retain deleted data? - amazon-ec2

Consider the following scenario:
I have a file with sensitive information stored on an EBS-backed EC2 instance. I delete this file in the standard non-secure way (rm -f my_secret_file). Once the file is deleted, I immediately shut down the instance and take an EBS snapshot. (Or create an AMI... either one, really.)
If a malicious party was able to gain access to the snapshot and mount/boot it, could they undelete any portion of my_secret_file using the various filesystem tools available? Put another way, do the EBS snapshots retain the data that existed in "unallocated"/deleted blocks at the time?

Yes - I would be extremely surprised if they didn't. The EBS snapshots are block-level snapshots so they will capture everything, regardless of the logical state of the file system similar to a hard disk image.

Related

How to take a backup of an SSD AWS [ec2]?

I have an m3 large. Although I can find the other EBS volumes associated with that instance in the Volumes Section.
But I am not able to find my 32GB SSD disk.
How can we take backup of this SSD?
It appears that you are referring to the Instance Store SSD volume that is provided as part of an m3.large Amazon EC2 instance.
Instance Store volumes are temporary (aka "ephemeral") and the content is lost when the instance is Stopped, Terminated or fails. Therefore, it is recommended only for temporary files and swap files. Be sure to copy off any data you wish to keep before the instance is Stopped.
Instance Store volumes are not the same as Elastic Block Store (EBS) volumes. While EBS provides a snapshot capability, this is not available for Instance Store volumes.
Instead, you must copy off any data you wish to keep via normal filesystem commands, or run traditional backup software. There is no snapshot-like capability available for Instance Store volumes.

How to Copy files from one EBS to Another EBS

The problem is simple.
I need to copy the files from one EBS to Another without passing the files through my local machine. Is this possible? If so how?
In order to copy files from one EBS volume to another EBS volume, both volumes will (at some point) need to be attached to an instance, though not necessarily the same instance. There are a lot of ways to do this if you allow for multiple instances and/or temporarily storing the files on a third storage option, but without constraints, and assuming the volumes are not currently attached to instances, a simple solution would be:
Start a temporary instance. Use a larger size for higher IO bandwidth.
Attach both EBS volumes to the instance and mount them as, say, /vol1 and /vol2
Copy the files from /vol1 to /vol2 (perhaps using something like: rsync -aSHAX /vol1/ /vol2/ )
Unmount the volumes, detach the EBS volumes, terminate the temporary instance.
If you have additional constraints, you should update your question to specify exactly what your environment is and what you're trying to do.
[Followup based on the error you are seeing]
The EC2 instance and EBS volume must exist in the same EC2 region and availability zone for the volume to be attached to the instance. If they are not, then you may want to create a temporary instance in the EBS volume's availability zone and use something like rsync to copy files between two instances.

Swap disappears when stopping an ebs backed instance.

My instance swap file is disappearing when I start my instance.
I have an Ubuntu ec2 instance, and I follow the "Four-step Process to Add Swap File" instructions at https://help.ubuntu.com/community/SwapFaq:
sudo dd if=/dev/zero of=/mnt/512MiB.swap bs=1024 count=524288
sudo chmod 600 /mnt/512MiB.swap
sudo mkswap /mnt/512MiB.swap
sudo swapon /mnt/512MiB.swap
I then changed my /etc/fstab to include:
/mnt/512MiB.swap none swap sw 0 0
Since I am using a much bigger swap, this process takes some time, and I don't want to do it every time I start. I would rather pay for the storage. However, when I start my instance, the swap has disappeared. If I type 'top', the instance does not have a swap file in use.
What should I do?
While the Amazon EC2 instance you are using has EBS backed Root Device Storage, all EC2 instance types still have the EC2 instance storage (also known as an ephemeral store) available for use as well, and the smaller instance types (e.g. m1.small and c1.medium) have it attached and mounted at /mnt by default even (the larger ones not!).
The most important characteristic of this storage type to be aware of is, that the data on the instance store volumes persists only during the life of the associated Amazon EC2 instance.
This statement is nowadays a tiny bit misleading, insofar it applies to stopping an EBS backed instance as well (not rebooting though), i.e. the moment you stop that instance, the ephemeral volume mounted at /mnt is detached and deleted and all data stored there is lost, including your swap file of course; once you start the instance again, a new ephemeral volume will be attached and mounted at /mnt.
Solution
You can still use the EC2 instance storage (which is plentiful and free of charge) if you exactly know what you are doing (see section Background below), e.g. it is a perfect option for strictly temporary data or anything that can be recreated easily on demand, like a cache for example.
A swap file is matching this requirements as well of course, so you simply need to create a script with the commands outlined in your question and execute it on instance start to recreate the swap file. You should put a guard in place though, because the instance storage survives reboots, i.e. you neither need nor should recreate the swap file on reboots, just with real stop/start cycles.
Background
The instance storage used to be the only storage option when Amazon EC2 was first introduced, but the resulting severe limitations for everyday usage have fortunately been remedied with the Amazon Elastic Block Store (EBS) you are using as well accordingly. Eric Hammond has recently provided a great summary why You Should Use EBS Boot Instances on Amazon EC2, addressing this very topic:
If you are just getting started with Amazon EC2, then use EBS boot
instances and stop reading this article. Forget that you ever heard
about instance-store and accept my apology that I just mentioned it.
Once you are completely comfortable with using EBS boot instances on
EC2, you may (or may not) want to come back here and read why you made
a good decision.

Why do Windows snapshots take a long time?

I am running a vanilla Windows install on Amazon EBS volume. The computer takes 10 minutes to boot, which may be understandable as 2 reboots are required. However, taking a snapshot is also a 10-15 minute process. Can anyone explain this? Any way to speed it up? I am a bit surprised, because I thought that snapshots are immediate replicas of the running EBS volume, in which case shouldn't they take just a couple of seconds to complete?
I will add that the console shows that "snapshot" is completed very quickly. But the "AMI" section is what seems to take 10-20 minutes. What's the difference? Is the snapshot available for use immediately, or do I need to wait for the AMI?
From the EBS product page:
Amazon EBS snapshots are incremental
backups, meaning that only the blocks
on the device that have changed since
your last snapshot will be saved. If
you have a device with 100 GBs of
data, but only 5 GBs of data has
changed since your last snapshot, only
the 5 additional GBs of snapshot data
will be stored back to Amazon S3.
Subsequent snapshots are fast because only the changed blocks need to be saved. So the time it takes scales with the amount of changes since the last snapshot.
Is the snapshot available for use
immediately, or do I need to wait for
the AMI?
Also from the product page:
New volumes created from existing
Amazon S3 snapshots load lazily in the
background. This means that once a
volume is created from a snapshot,
there is no need to wait for all of
the data to transfer from Amazon S3 to
your Amazon EBS volume before your
attached instance can start accessing
the volume and all of its data. If
your instance accesses a piece of data
which hasn’t yet been loaded, the
volume will immediately download the
requested data from Amazon S3, and
then will continue loading the rest of
the volume’s data in the background.
The creation of an AMI is a multiple-step process.
The Snapshot of the current machine is started (that's darn near instantaneous)
The snapshot copies the "changed blocks" from the base AMI to the snapshot lazily (that's pretty quick too)
The underlying Windows image is then prepared to be an AMI base image this starts with booting a "ghost" instance from the image with the snapshot as the disk image.
A SYSPREP is started to "reseal" the machine so it gets new machine SIDs.
The new image is then re-snapshotted
The AMI is marked "complete"

EBS for storing databases vs. website files

I spent the day experimenting with AWS for the first time. I've got an EC2 instance running and I mounted an Elastic Block Store (EBS) to keep the MySQL databases.
Does it make sense to also put my web application files on the EBS, or should I just deploy them to the normal EC2 file system?
When you say your web application files, I'm not sure what exactly you are referring to.
If you are referring to your deployed code, it probably doesn't make sense to use EBS. What you want to do is create an AMI with your prerequisites, then have a script to create an instance of that AMI and deploy your latest code. I highly recommend you automate and test this process as it's easy to forget about some setting you have to manually change somewhere.
If you are storing data files, that are modified by the running application, EBS may make sense. If this is something like user-uploaded images or similar, you will likely find that S3 gives you a much simpler model.
EBS would be good for: databases, lucene indexes, file based CMS, SVN repository, or anything similar to that.
EBS gives you persistent storage so if you EC2 instance fails the files still exist. Apparently their is increased IO performance but I would test it to be sure.
If your files are going to change frequently (like a DB does) and you don't want to keep syncing them to S3 (or somewhere else), then an EBS is a good way to go. If you make infrequent changes and you can manually (or scripted) sync the files as necessary then store them in S3. If you need to shutdown or you lose your instance for whatever reason, you can just pull them down when you start up the new instance.
This is also assuming that you care about cost. If cost is not an issue, using the EBS is less complicated.
I'm not sure if you plan on having a separate EBS for your DB and your web files but if you only plan on having one EBS and you have enough empty space on it for your web files, then again, the EBS is less complicated.
If it's performance you are worried about, as mentioned, it's best to test your particular app.
Our approach is to have a script pre-deployed on our AMI that fetches the latest and greatest version of the code from source control. That makes it very straightforward to launch new instances quickly, or update all running instances (we take them out of the load balancing rotation one at a time, run the script, and put them back in the rotation).
UPDATE:
Reading between the lines it looks like you're mounting a separate EBS volume to an instance-store backed instance. AWS recently introduced EBS backed instances that have a ton of benefits vs. the old instance-store ones. I still mount my MySQL data on a separate EBS partition, though, so that I can easily mount it to a different server if needed.
I strongly suggest an EBS backed instance with a separate EBS volume for the MySQL data.

Resources