Syncing between Amazon EBS Devices - amazon-ec2

I have 2 EC2 instances, each with its own EBS volume attached. Sitting in front of the EC2 instances is a load balancer.
These instances run CMS-driven sites, where users can upload files.
What would be the best solution to the problem of a file getting uploaded to one EBS and the load balancer sending a visitor to the EC2 instance whose EBS does not have the file? Some sort of cron which runs an rsync?
Suggestions very welcome!
Thanks
S

I believe the best solution would be to use a single shared storage service such as Amazon S3. Ideally, use a plugin for your CMS that stores users' files on S3. If no such plugin exists, you can use the s3fs FUSE adapter to mount the bucket as a file system on both instances and configure your CMS to store uploads in that directory.
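As a rough illustration of the s3fs route, the mount step might look like the sketch below. The bucket name, mount point, and credentials file path are placeholders rather than values from the question; the command form follows the s3fs-fuse README. With DRY_RUN=1 the function only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: bucket, mount point and credentials file are hypothetical.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# mount_bucket BUCKET MOUNTPOINT PASSWD_FILE
# Mounts an S3 bucket with s3fs so the CMS can write uploads into MOUNTPOINT.
mount_bucket() {
  run mkdir -p "$2" &&
  run s3fs "$1" "$2" -o "passwd_file=$3"
}
```

You would run the same call on both instances, for example: mount_bucket my-cms-uploads /var/www/uploads /etc/passwd-s3fs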

There are several solutions to this problem. Off the top of my head:
NFS/Samba shared directory between instances
SVN deploy
cluster file systems - OCFS/GFS
a deployment tool such as Capistrano, triggering a deploy when you need it
and of course cron jobs that run ftp, scp, rsync, s3sync/copy etc.

Or possibly, set up one EC2 instance as an NFS server and share its directories with your other instances.

There are multiple solutions to keep data on both EC2 instances in sync, with or without using EBS volumes.
You can use the AWS EFS service instead of EBS volumes. An EFS file system can be shared between EC2 instances within a VPC, and both instances will see the same data under the path where EFS is mounted.
Another solution is GlusterFS. This can also work between EBS volumes in different AWS regions. Refer to this link: http://sanketdangi.com/post/5601762671/gluster-config-aws-multi-az
You can mount an S3 bucket on your EC2 instances using S3 FUSE (s3fs). Refer to this link: https://github.com/s3fs-fuse/s3fs-fuse/wiki/Fuse-Over-Amazon
Maybe you can also use "s3 sync" on both EBS volumes. This way both volumes will be kept in sync via S3. Refer to this link: https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
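The "s3 sync" option could be sketched roughly as follows: each instance pushes its local uploads to a bucket, then pulls whatever the other instance pushed. The directory and bucket names used in the example call are hypothetical, and DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: the local directory and S3 URI are hypothetical placeholders.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# sync_via_s3 LOCAL_DIR S3_URI
# Pushes local changes up, then pulls down what the other instance pushed.
# Without --delete, removals are not propagated: a file deleted locally
# is still in the bucket and comes back on the next pull.
sync_via_s3() {
  run aws s3 sync "$1" "$2" &&
  run aws s3 sync "$2" "$1"
}
```

Example call, to be run on both instances: sync_via_s3 /var/www/uploads s3://my-cms-uploads/uploads. Between runs, a visitor can still land on the instance that does not yet have a new file, so this narrows the window rather than closing it.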

Related

How to Backup running EC2 instances with EBS root volumes?

I am new to AWS and had to take over an existing VPC with multiple EC2 instances.
I am looking for a way to back up the instances (whole disks).
I read about EBS snapshots on forums and this seems a good solution.
The instances' root disks are all EBS volumes.
I read the AWS documentation on EBS snapshot which states as shown below:
To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the instance before taking the snapshot.
I cannot shut down the EC2 instances just for a backup.
How do senior AWS sysadmins back up their instances with EBS root volumes?
With KVM it is possible to pause a host. Is there a similar functionality available in AWS?

sync EBS volumes via S3

I am looking to have multiple Amazon EC2 instances use the same data store. Amazon does not condone mounting an S3 Bucket as a file system, so I am trying to avoid this solution. Is there a way to synchronize an EBS volume with S3 or would it be best to use rsync and cron?
Do you really have to have the files locally available from within EBS? What if instead you served them to yourself via CloudFront, and restricted the permissions so that only your instances (or only your Security Group) could see the files?
Come Fall 2015 you'll be able to use Elastic File System (EFS) for this. But until then, I would suppose the next best thing is to use the AWS command-line interface to sync down from S3 to your volume:
aws s3 sync s3://my-bucket/my/folder /mnt/my-ebs/
After the initial run, that sync command is surprisingly fast. So from there you could just cron it to run hourly or so?
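One way to set up the hourly run is a crontab entry. The helper below is only a sketch: the function name is made up for illustration, and the paths in the example call are the ones from the command above.

```shell
#!/bin/sh
# hourly_sync_entry S3_URI LOCAL_DIR
# Prints a crontab line that runs the sync at minute 0 of every hour.
hourly_sync_entry() {
  printf '0 * * * * aws s3 sync %s %s\n' "$1" "$2"
}
```

To append the entry to the current user's crontab: (crontab -l 2>/dev/null; hourly_sync_entry s3://my-bucket/my/folder /mnt/my-ebs/) | crontab -
Cron usually runs with a minimal PATH, so the full path to the aws binary may be needed in the entry.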

How to access file storage from web application on Amazon EC2

I am in the process of hosting a dynamic website on Amazon EC2. I have created the environment and deployed a WAR on Elastic Beanstalk. I can connect to the MySQL database too. But I am not sure how my web application will read/write to the disk, and at which path.
As per my understanding, Amazon provides 3 options for file storage:
S3
EBS (persistent)
instance storage
I could upload files to S3 by creating a bucket, but how can my web application read or write to an S3 bucket path on a different server?
I am not sure how I should upload or write files to EBS. When connected to the EC2 instance, I cannot cd into the /dev/sd* device for the EBS volume attached to my environment instance. How can I configure my web app to use it as a directory for images etc.?
Instance storage is lost if I stop or recreate the environment, so it is not persistent. I am not interested in storing files there.
Can you help me on this?
Where to upload file that are read by application?
Where can my application write files?
Your question: "how can my web application read or write to S3 bucket path on different server?"
I'm a newbie user of AWS too, so can only offer limited help, but this is what I understand:
The webapp running in the EC2 instance can access the S3 storage using the REST or SOAP APIs. Here's the link to the reference guide for using the REST GET function to get a file from S3:
GET object documentation
I guess the idea is that the S3 storage bucket that Amazon creates for your Elastic Beanstalk "environments" provides permanent storage for your application and data files (images etc.). When an EC2 instance is created or rebooted, it should get any additional application files from an S3 bucket and 'cache' them on the file system ("volume") attached to the EC2 "instance".

amazon ec2 and s3 setup

I am about to migrate a large web project (many sites using common data) to EC2 and I wondered what would be the best setup (I am very much a newbie with Amazon AWS).
The site pages are rebuilt by scripts once a week and the resultant static pages are served (currently about 7 to 10k views a day). In between the weekly builds I would like to access the db to add/edit data.
I am thinking either EC2 + RDS or EC2 and S3 (S3 having the advantage of keeping a copy of the static pages too). Do these options sound reasonable, based on what I have mentioned?
Thanks in advance
We're using EC2 (we experimented with a few instance types and learned that the CPU extra large type worked best for our type of application), and rather than using RDS we make extensive use of EBS:
one EBS volume for running code, and one EBS volume which holds the MySQL database files.
S3 is used mostly for incremental backups, as an EBS volume can easily be mounted on any other instance.

How to rsync to all Amazon EC2 servers?

I have a Scalr EC2 cluster, and want an easy way to synchronize files across all instances.
For example, I have a bunch of files in /var/www on one instance, I want to be able to identify all of the other hosts, and then rsync to each of those hosts to update their files.
ls /etc/aws/hosts/app/
returns the IP addresses of all of the other instances
10.1.2.3
10.1.33.2
10.166.23.1
Ideas?
As Zach said you could use S3.
You could download one of many clients out there for mapping drives to S3. (search for S3 and webdav).
If I were going this route, I would set up an S3 bucket with all my shared files and use JetS3t in a cron job to sync each node's local drive to the bucket (pulling down S3 bucket updates). Then, since I normally use Eclipse and Ant for building, I would create an Ant job for deploying updates to the S3 bucket (pushing updates up to the S3 bucket).
From http://jets3t.s3.amazonaws.com/applications/synchronize.html
Usage: Synchronize [options] UP <S3Path> <File/Directory>
(...)
or: Synchronize [options] DOWN
UP : Synchronize the contents of the Local Directory with S3.
DOWN : Synchronize the contents of S3 with the Local Directory
...
I would recommend the above solution, if you don't need cross-node file locking. It's easy and every system can just pull data from a central location.
If you need more cross-node locking:
An ideal solution would be to use IBM's GPFS, but IBM doesn't just give it away (at least not yet). Even though it's designed for high-performance interconnects, it can also be used over slower connections. We used it as a replacement for NFS and it was amazingly fast (about 3 times faster than NFS). There may be something similar that is open source, but I don't know. EDIT: OpenAFS may work well for building a clustered filesystem over many EC2 instances.
Have you evaluated using NFS? Maybe you could dedicate one instance as an NFS host.
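If plain rsync is preferred, as in the question, a loop over the host directory might look like the sketch below. It assumes each entry under /etc/aws/hosts/app/ is named after a host IP (as the ls output in the question suggests) and that key-based SSH between the instances is already set up. The function name and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes entries in HOSTS_DIR are named after reachable hosts.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# push_to_hosts HOSTS_DIR SRC_DIR
# Rsyncs SRC_DIR to the same path on every host listed in HOSTS_DIR.
# Returns non-zero if any host failed, but still tries the remaining hosts.
push_to_hosts() {
  failed=0
  for entry in "$1"/*; do
    [ -e "$entry" ] || continue   # nothing listed in HOSTS_DIR
    host=$(basename "$entry")
    # Trailing slashes copy the directory contents, not the directory itself.
    run rsync -az "$2/" "$host:$2/" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}
```

Example call from the instance that holds the current files: push_to_hosts /etc/aws/hosts/app /var/www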

Resources