I am in the process of hosting a dynamic website on Amazon EC2. I have created the environment and deployed a WAR on Elastic Beanstalk. I can connect to the MySQL database too. But I am not sure how my web application will read from and write to disk, and at which path.
As I understand it, Amazon provides three options for file storage:
S3
EBS (persistent)
instance storage
I could upload files to S3 by creating a bucket, but how can my web application, running on a different server, read or write to a path in the S3 bucket?
I am not sure how I should upload or write files to EBS. When connected to the EC2 instance, I cannot cd into the /dev/sd* device for the EBS volume attached to my environment's instance. How can I configure my web app to use it as the directory for images etc.?
Instance storage is not persistent: it is lost if I stop or recreate the environment. So I am not interested in storing files there.
Can you help me on this?
Where should I upload files that are read by the application?
Where can my application write files?
Your question: "How can my web application read or write to an S3 bucket path on a different server?"
I'm a newbie user of AWS too, so can only offer limited help, but this is what I understand:
The web app running on the EC2 instance can access S3 storage using the REST or SOAP APIs. Here's the link to the reference guide for using the REST GET operation to get a file from S3:
GET object documentation
I guess the idea is that the S3 storage bucket that Amazon creates for your Elastic Beanstalk "environments" provides permanent storage for your application and data files (images etc.). When an EC2 instance is created or rebooted, it should get any additional application files from an S3 bucket and 'cache' them on the file system ("volume") attached to the EC2 "instance".
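To make the REST access mentioned above a bit more concrete, here is a minimal Python sketch of a GET against S3 using only the standard library. The bucket name, region, and key are hypothetical, and the fetch only works for a publicly readable object. Private objects need signed requests, which are normally handled by an AWS SDK rather than by hand.

```python
from urllib.parse import quote
from urllib.request import urlopen


def s3_object_url(bucket: str, region: str, key: str) -> str:
    """Build the virtual-hosted-style HTTPS URL for an S3 object."""
    # Percent-encode the key but keep "/" so pseudo-folders stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def fetch_public_object(bucket: str, region: str, key: str) -> bytes:
    """GET an object over REST. Assumes the object is publicly readable."""
    with urlopen(s3_object_url(bucket, region, key), timeout=10) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical bucket, region, and key, for illustration only.
    print(s3_object_url("my-bucket", "us-east-1", "images/logo 1.png"))
```

The application never needs a local path for the bucket: it addresses each file by bucket and key over HTTPS, which is why it works from any server.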
My Laravel app allows users to upload images. Currently, when the user uploads their images, they are stored in a temporary location on the server. A cron job then modifies the uploaded images (compresses them, etc.), and uploads them to S3. Any temporary files older than 48 hours that failed to upload to S3 are deleted by another cron job.
I've set up an Elastic Beanstalk environment, but it's occurred to me that storing uploaded images in a temporary directory on an instance is risky because instances can be created and destroyed when necessary.
How and where, then, would I store these temporary files so that they're not at risk of being deleted by an instance?
As discussed in the comments, I think that uploading the file to S3 is the best option. As far as I know, it's not possible to stop Elastic Beanstalk from destroying an EC2 instance unless you want to give up all of the scaling and automatic instance replacement features.
One option I don't know much about is Amazon EBS. "Amazon Elastic Block Store (Amazon EBS) provides persistent block storage volumes for use with Amazon EC2 instances in the AWS Cloud." I don't have any direct experience with EBS; the overriding question, of course, would be whether EBS is truly persistent even after an EC2 instance is destroyed. As EBS has costs associated with it and you are already using S3, S3 seems like the way to go.
S3 has a feature called object lifecycle management that you can use to have files deleted automatically by setting them to expire 2 days after they're uploaded.
You can either:
A) Prefix the temporary files to put them in an S3 pseudo-folder (e.g., Temp/), apply the object lifecycle expiration rule to that specific prefix (or "folder"), and use the files in there as the source of truth for the new files derived from them after manipulation.
or
B) Create an S3 bucket specifically for temporary files. Manipulate the files from there and copy to the production bucket.
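As a rough sketch of option A, the expiration rule can be expressed as an S3 lifecycle configuration document. The Python below only builds that document as JSON; the rule ID is a made-up name, and the Temp/ prefix and 2-day window are taken from the description above.

```python
import json


def temp_expiry_lifecycle(prefix: str = "Temp/", days: int = 2) -> dict:
    """Lifecycle configuration that expires objects under `prefix` after `days` days."""
    return {
        "Rules": [
            {
                "ID": "expire-temporary-uploads",  # hypothetical rule name
                "Filter": {"Prefix": prefix},      # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    # Print the document so it can be saved to a file such as lifecycle.json.
    print(json.dumps(temp_expiry_lifecycle(), indent=2))
```

The resulting file could then be applied with `aws s3api put-bucket-lifecycle-configuration --bucket <your-bucket> --lifecycle-configuration file://lifecycle.json`. For option B the same document would apply to the temporary bucket, with an empty prefix to cover every object.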
I have an application (A) deployed on AWS using Elastic Beanstalk. I also have another multi-threaded Java application (B), which periodically creates some files that need to be read and updated by application (A) running on Elastic Beanstalk.
If I run application (B) directly on EC2, then application (A) does not have access to the files.
What model should I use in this situation so that application (A) can access the files created by application (B)?
Upload the files created by B to S3. You can do this with the AWS API, or use S3 FUSE (s3fs) to mount the bucket in the filesystem. Then have A read them the same way, with either the API or S3 FUSE.
I've launched an application on AWS Elastic Beanstalk using a pre-installed server template.
During the Beanstalk setup I see that it creates an S3 bucket. I'm pretty sure that I didn't select any option to use an S3 bucket. If an S3 bucket is needed for the Beanstalk application, can you tell me how they work together and what its purpose is? Can I avoid using S3 with Beanstalk?
This S3 bucket is indeed automatically created by Elastic Beanstalk for your new application.
It is used to store some environment files and, more importantly, zipped builds of your app (each one being a different version). The Beanstalk deployment script simply downloads the .zip from the bucket to the EBS volume.
It looks like there is no option on AWS to change this.
By the way, why don't you want to use S3?
I am looking to have multiple Amazon EC2 instances use the same data store. Amazon does not condone mounting an S3 Bucket as a file system, so I am trying to avoid this solution. Is there a way to synchronize an EBS volume with S3 or would it be best to use rsync and cron?
Do you really have to have the files locally available from within EBS? What if instead you served them to yourself via CloudFront, and restricted the permissions so that only your instances (or only your Security Group) could see the files?
Come Fall 2015 you'll be able to use Elastic File System (EFS) for this. But until then, I suppose the next best thing is to use the AWS command-line interface to sync down from S3 to your volume:
aws s3 sync s3://my-bucket/my/folder /mnt/my-ebs/
After the initial run, that sync command is surprisingly fast. So from there you could just cron it to run hourly or so?
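If you want to generate the cron entry rather than type it by hand, a small Python sketch might look like the following. The helper name is hypothetical; the source and destination are the ones from the command above, and the schedule is "minute 0 of every hour".

```python
import shlex


def hourly_cron_line(source: str, destination: str) -> str:
    """Return a crontab line that runs `aws s3 sync` at minute 0 of every hour."""
    # Quote each argument so unusual paths cannot break the shell command.
    command = " ".join(shlex.quote(part) for part in ["aws", "s3", "sync", source, destination])
    return f"0 * * * * {command}"


if __name__ == "__main__":
    print(hourly_cron_line("s3://my-bucket/my/folder", "/mnt/my-ebs/"))
```

Keep in mind that cron usually runs with a minimal PATH, so the absolute path to the aws executable may be needed in the real crontab.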
I have 2 EC2 instances, each with its own EBS volume attached. Sitting in front of the EC2 instances is a load balancer.
These instances run CMS-driven sites, where users can upload files.
What would be the best solution to the problem of a file getting uploaded to one EBS and the load balancer sending a visitor to the EC2 instance whose EBS does not have the file? Some sort of cron which runs an rsync?
Suggestions very welcome!
Thanks
S
I believe the best solution would be to use a single shared store such as Amazon S3. Ideally, use a plugin for your CMS that stores users' files on S3. If there is no such plugin, you can use the s3fs FUSE adapter to mount the bucket as a file system on both instances and configure your CMS to store those files in the mounted directory.
There are several solutions to this problem. Off the top of my head:
an NFS/Samba shared directory between instances
svn deploy
cluster file systems such as OCFS/GFS
a deployment tool such as Capistrano, triggering a deploy when you need one
and of course cron jobs that run ftp, scp, rsync, s3sync/copy, etc.
Or possibly, set up one EC2 instance as an NFS server and share its directories with your other instances.
There are multiple solutions to keep data on both EC2 instances in sync, with or without EBS volumes.
You can use the AWS EFS service instead of EBS volumes. An EFS file system can be shared between EC2 instances within a VPC, and both instances will see the same data at the path where EFS is mounted.
Another solution is GlusterFS. This can also work between EBS volumes in a different AWS region. See: http://sanketdangi.com/post/5601762671/gluster-config-aws-multi-az
You can mount an S3 bucket on your EC2 instances using S3 FUSE. See: https://github.com/s3fs-fuse/s3fs-fuse/wiki/Fuse-Over-Amazon
You may also be able to use "s3 sync" on both EBS volumes. This way both volumes stay in sync via S3. See: https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
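For the EFS option above, each instance mounts the same file system over NFS. As a rough illustration, the Python sketch below builds the /etc/fstab line for such a mount. The file system ID, region, and mount point are hypothetical, and the NFS options are the ones AWS commonly recommends for EFS, so check them against the current documentation before relying on them.

```python
def efs_fstab_entry(file_system_id: str, region: str, mount_point: str) -> str:
    """Return an /etc/fstab line that mounts an EFS file system over NFSv4.1."""
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # _netdev delays the mount until networking is available at boot.
    options = (
        "nfsvers=4.1,rsize=1048576,wsize=1048576,"
        "hard,timeo=600,retrans=2,noresvport,_netdev"
    )
    return f"{dns_name}:/ {mount_point} nfs4 {options} 0 0"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(efs_fstab_entry("fs-12345678", "us-east-1", "/mnt/efs"))
```

Adding the same line on both instances gives them one shared directory, so a file uploaded through either instance is visible to the other without any sync job.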