How do multiple EC2 instances (scaling) work with one EBS volume for data storage? - amazon-ec2

So, in a simple situation, if there is only one instance, I can store the data on an EBS volume mounted on that instance, e.g. at /mnt/db.
However, how does it work if I scale out and have multiple instances (with either static or dynamic scaling)?
Because an EBS volume can only be attached to one instance, does having multiple instances mean that I have to attach a separate EBS volume to each one? If that's the case, the data on each instance's EBS volume will be different.
Obviously, I want all instances to read from and write to a single volume (as the data store). The data in the volume will grow constantly, and there can be no downtime.
What is the solution? Is there a way to access the data on the EBS volume without mounting the device?
Here is what I can think of:
1) If each instance has its own EBS volume, then at a fixed interval (e.g. every hour) every instance unmounts and detaches its EBS volume and attaches a new one. One powerful instance then mounts all the EBS volumes that were just detached and aggregates the data.
2) Similar to 1), but instead of detaching and attaching, I just take a snapshot of every instance's volume. The powerful instance then aggregates the data from the snapshots and saves the result to either another EBS volume or S3.
These two approaches seem workable, but they require a lot of work. Is there a smarter way to approach this problem? Thanks.
By the way, because of performance issues, I cannot have the instances write data directly to S3. :)
Oh, how about this:
3) First, all instances have their own EBS volume and write data to it. Then each hour the data is sent to S3, and another instance aggregates it.
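The aggregation step in option 3 could look roughly like the sketch below. It is only an illustration under assumptions that are not in the question: the hourly uploads have already been copied from S3 into a local staging directory, and the layout <staging_dir>/<instance_id>/<hour>.log, the function name, and the file naming are all hypothetical.

```python
from pathlib import Path


def aggregate_hour(staging_dir, hour, out_path):
    """Merge every instance's file for one hour into a single output file.

    Assumed (hypothetical) layout: <staging_dir>/<instance_id>/<hour>.log,
    where staging_dir stands in for a local copy of the S3 prefix.
    Returns the number of lines written.
    """
    staging = Path(staging_dir)
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # Sort by instance directory so the output order is deterministic.
        for instance_dir in sorted(p for p in staging.iterdir() if p.is_dir()):
            part = instance_dir / f"{hour}.log"
            if not part.exists():
                continue  # this instance uploaded nothing for that hour
            with open(part, encoding="utf-8") as src:
                for line in src:
                    # Make sure a missing final newline does not glue two records together.
                    out.write(line if line.endswith("\n") else line + "\n")
                    written += 1
    return written
```

Plain concatenation only makes sense if the records from different instances are independent; if they need de-duplication or ordering by timestamp, that logic would go inside the inner loop.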

How about having an NFS instance whose export can be mounted by the other instances?

It seems that you need to create an EBS snapshot of your most up-to-date EC2 instance. This will create an EBS-backed AMI. You would then need to terminate all your EC2 instances that are not up to date and launch a new stack of instances from your newly created AMI. If you have a load balancer running, you would also have to attach these new instances to it.
It seems a little long-winded, but it can all be done programmatically. At least, this is how I think scaling in the cloud with Amazon works as far as propagating changes across multiple instances goes. Could somebody with more experience verify this? I plan to test it out myself later on.

Related

How to save intermediate results on Amazon EC2 when a spot instance is used?

I do some scientific calculations that produce intermediate results on each iteration, so I think I can use a spot instance to reduce the cost of processing.
How can I save the intermediate results on each iteration?
How can I automatically rerun the instance from the last checkpoint when it's terminated?
When the spot price of an Amazon EC2 instance rises above your bid price, your Amazon EC2 instance is terminated. A 2-minute notice is provided via the metadata interface. You can use this notice as a trigger for saving your work, or you could simply save work at regular intervals regardless of the notice period.
Do not save your work only "locally", since the Amazon EBS volumes will either be deleted (e.g. the boot volume) or disconnected (e.g. data volumes). I would recommend that you save your work in a persistent datastore, such as a database or Amazon S3.
One option would be to save files to your local disk and then copy them to Amazon S3 with the AWS Command-Line Interface (CLI), using the aws s3 sync command.
Then, if you have configured a persistent spot instance, simply copy the files from Amazon S3 when the new Amazon EC2 spot instance is started.
See:
Spot Instance Interruptions
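The "save at regular intervals and resume from the last checkpoint" idea can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names, the JSON state layout, and the should_stop hook are all assumptions made for the example, and step stands in for one iteration of the real calculation.

```python
import json
import os


def save_checkpoint(path, state):
    """Write state atomically so an interruption never leaves a half-written file."""
    tmp = f"{path}.tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on the same filesystem


def load_checkpoint(path, default):
    """Return the last saved state, or default when no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, total_iterations, step, should_stop=lambda: False):
    """Resume from the last checkpoint and save after every iteration.

    step(value) is a placeholder for one iteration of the real calculation.
    should_stop() is a hypothetical hook, e.g. a check of the spot
    termination notice exposed through the instance metadata.
    """
    state = load_checkpoint(path, {"iteration": 0, "value": 0})
    while state["iteration"] < total_iterations and not should_stop():
        state = {"iteration": state["iteration"] + 1, "value": step(state["value"])}
        save_checkpoint(path, state)
    return state
```

The checkpoint file written here still lives on the instance's disk, so it would have to be copied to persistent storage (for example with aws s3 sync, as suggested above) and copied back before run is called on the replacement instance.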

Can a new EC2 instance attach EBS volumes that are currently attached to another EC2 instance that has failed its reachability check?

I have a small EC2 instance that has two EBS volumes attached and uses EBS as its root device. The instance is currently not reachable for some reason (AWS engineers are looking into it). In the meantime, we are thinking about launching another EC2 instance and attaching the two EBS volumes to it. What's the best practice for that? Do I need to take a snapshot of the volumes before re-attaching them to the new EC2 instance? Can we attach them without destroying the existing data on the volumes?
You do not need to take a snapshot in order to attach the volume to a new instance. You can simply detach the volumes and re-attach them to a new instance.
Your data will not be destroyed in this process.
Hope it helps.

How to copy files from one EBS volume to another EBS volume

The problem is simple.
I need to copy the files from one EBS volume to another without passing the files through my local machine. Is this possible? If so, how?
In order to copy files from one EBS volume to another EBS volume, both volumes will (at some point) need to be attached to an instance, though not necessarily the same instance. There are a lot of ways to do this if you allow for multiple instances and/or temporarily storing the files on a third storage option, but without constraints, and assuming the volumes are not currently attached to instances, a simple solution would be:
1) Start a temporary instance. Use a larger size for higher I/O bandwidth.
2) Attach both EBS volumes to the instance and mount them as, say, /vol1 and /vol2.
3) Copy the files from /vol1 to /vol2 (perhaps using something like: rsync -aSHAX /vol1/ /vol2/ ).
4) Unmount the volumes, detach the EBS volumes, and terminate the temporary instance.
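If rsync is not available on the temporary instance, the copy step can be approximated in Python, with a checksum comparison before the volumes are detached. This is a rough sketch rather than an equivalent of rsync -aSHAX: shutil.copytree keeps symlinks, timestamps and permissions, but it does not preserve hard links or ACLs the way those rsync flags do. The function names are made up for the example.

```python
import hashlib
import os
import shutil


def copy_volume(src_mount, dst_mount):
    """Copy everything under src_mount into dst_mount, keeping symlinks and metadata."""
    # dirs_exist_ok requires Python 3.8+; it is needed because the target
    # mount point already exists.
    shutil.copytree(src_mount, dst_mount, symlinks=True, dirs_exist_ok=True)


def tree_digest(root):
    """Map each regular file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests
```

With the mount points from the steps above, copy_volume("/vol1", "/vol2") would be followed by a check that tree_digest("/vol1") == tree_digest("/vol2"), which only holds if /vol2 started out empty.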
If you have additional constraints, you should update your question to specify exactly what your environment is and what you're trying to do.
[Followup based on the error you are seeing]
The EC2 instance and EBS volume must exist in the same EC2 region and availability zone for the volume to be attached to the instance. If they are not, then you may want to create a temporary instance in the EBS volume's availability zone and use something like rsync to copy files between two instances.

Terminating Amazon EC2 - what happens to persistent data

It's a pretty quick question - I have set up a pretty simple LAMP-based website on EC2. I created an EBS volume and mounted it on the instance, and that is where I'm saving all the MySQL data and other backups.
To connect to the instance I use WinSCP with the Elastic IP, from where I can view all the data.
Now my question is: say I terminate the instance. The backup data and MySQL data that reside on the EBS volume will still be available, right? So how can I access this data?
I mean, using WinSCP and the same Elastic IP I won't be able to connect anymore because the instance is terminated, so how can I access the data stored on the EBS volume?
Sorry for the ignorant question, but I'm just starting to play with EC2.
Thanks
I'm assuming you've created an EBS-backed instance and attached a further EBS volume to it as a chunk of extra storage. In that case, when you terminate the instance, the boot EBS volume is detached and (by default) deleted, but the additional EBS volume is only detached: it remains in the 'Available' state after the instance has been destroyed, and its data is left intact. You can then access whatever is on it by simply attaching it to another running instance.

EBS for storing databases vs. website files

I spent the day experimenting with AWS for the first time. I've got an EC2 instance running and I mounted an Elastic Block Store (EBS) volume to keep the MySQL databases.
Does it make sense to also put my web application files on the EBS, or should I just deploy them to the normal EC2 file system?
When you say your web application files, I'm not sure what exactly you are referring to.
If you are referring to your deployed code, it probably doesn't make sense to use EBS. What you want to do is create an AMI with your prerequisites, then have a script to create an instance of that AMI and deploy your latest code. I highly recommend you automate and test this process as it's easy to forget about some setting you have to manually change somewhere.
If you are storing data files that are modified by the running application, EBS may make sense. If this is something like user-uploaded images or similar, you will likely find that S3 gives you a much simpler model.
EBS would be good for: databases, Lucene indexes, a file-based CMS, an SVN repository, or anything similar.
EBS gives you persistent storage, so if your EC2 instance fails the files still exist. Apparently there is also increased I/O performance, but I would test it to be sure.
If your files are going to change frequently (like a DB does) and you don't want to keep syncing them to S3 (or somewhere else), then an EBS volume is a good way to go. If you make infrequent changes and you can sync the files as necessary (manually or with a script), then store them in S3. If you need to shut down or you lose your instance for whatever reason, you can just pull them down when you start up the new instance.
This is also assuming that you care about cost. If cost is not an issue, using EBS is less complicated.
I'm not sure whether you plan on having separate EBS volumes for your DB and your web files, but if you only plan on having one EBS volume and it has enough empty space for your web files, then again, EBS is less complicated.
If it's performance you are worried about, as mentioned, it's best to test your particular app.
Our approach is to have a script pre-deployed on our AMI that fetches the latest and greatest version of the code from source control. That makes it very straightforward to launch new instances quickly, or update all running instances (we take them out of the load balancing rotation one at a time, run the script, and put them back in the rotation).
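The one-at-a-time update described above can be sketched like this. It is only an outline of the control flow: deregister, update and register are hypothetical callables that the caller would supply (for example wrappers around the load balancer API and the pre-deployed fetch script), and stopping the rollout on the first failure is a design choice made for the example, not something stated in the answer.

```python
def rolling_update(instances, deregister, update, register):
    """Update instances one at a time so the rest keep serving traffic.

    deregister, update and register are caller-supplied callables
    (hypothetical hooks for the load balancer and the deploy script).
    Returns the list of instances that were updated successfully.
    """
    done = []
    for instance in instances:
        deregister(instance)  # take it out of the rotation
        try:
            update(instance)  # e.g. run the script that fetches the latest code
        except Exception:
            # Leave the failed instance out of the rotation and stop the rollout,
            # so a bad release does not spread to the remaining instances.
            break
        register(instance)  # put it back into the rotation
        done.append(instance)
    return done
```

Because only one instance is out of the rotation at any moment, the remaining instances must be able to carry the full load while the update runs.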
UPDATE:
Reading between the lines, it looks like you're mounting a separate EBS volume on an instance-store-backed instance. AWS recently introduced EBS-backed instances, which have a ton of benefits vs. the old instance-store ones. I still keep my MySQL data on a separate EBS volume, though, so that I can easily attach it to a different server if needed.
I strongly suggest an EBS-backed instance with a separate EBS volume for the MySQL data.
