Basic AWS questions - amazon-ec2

I'm a newbie on AWS, and it has so many products (EC2, Load Balancer, EBS, S3, SimpleDB, etc.) and so many docs that I can't figure out where to start.
My goal is to be ready for scalability.
Suppose I want to set up a simple webserver that accesses a database on MongoLab. I suppose I need one EC2 instance to run it. At this point, do I need anything more (EBS, S3, etc.)?
At some point my app will have reached enough traffic that I must scale it. I was thinking of starting a new copy (instance) of my EC2 machine. But then it will have another IP. So how is traffic distributed between both EC2 instances? Is that done automatically? Must I use a Load Balancer service to distribute the traffic? Will I then have to pay for 2 EC2 instances and 1 LB? At this point, do I need anything more (e.g. an Elastic IP)?

Welcome to the club, Sony Santos.
AWS is a very powerful architecture, but with this power comes responsibility. I, and presumably many others, have learned the hard way while building applications on AWS's services.
You ask, where do I start? This is actually a very good question, but you probably won't like my answer. You need to read and do research about all the technologies offered by Amazon, and even other providers such as Rackspace, GoGrid, Google's Cloud and Azure. Amazon is not easy to get going with, but it's not really meant to be; its focus is on being very customizable and having a very extensive API. But let's get back to your question.
To run a simple webserver you would need to start an EC2 instance. By default this instance runs on a disk drive called EBS. Essentially, an EBS drive is a normal hard drive, except that you can do lots of other cool stuff with it, like detach it from one server and move it to another. S3 is really more of a file storage system; it's more useful if you have a bunch of images or if you want to store a lot of backups of your databases, but it's not a requirement for a simple webserver. Just running an EC2 instance is all you need; everything else will happen behind the scenes.
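For a concrete picture, here is a minimal sketch of launching a single EBS-backed instance with the boto3 Python SDK; the region, AMI ID, key pair name, and instance type are placeholders you would swap for your own:

    import boto3

    # Assumed region and placeholder AMI / key pair - substitute your own values.
    ec2 = boto3.resource("ec2", region_name="us-east-1")
    instances = ec2.create_instances(
        ImageId="ami-12345678",      # placeholder AMI (e.g. an Amazon Linux image)
        InstanceType="t2.micro",     # small instance type for a simple webserver
        KeyName="my-keypair",        # placeholder key pair for SSH access
        MinCount=1,
        MaxCount=1,
    )
    print(instances[0].id)

That single instance, with its EBS root volume, is enough to serve a simple site.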
If your app reaches a lot of traffic you have two options. You can scale your machine up by shutting it down and restarting it as a larger instance type. Generally speaking this is the easiest thing to do, but you'll get to a point where you either cannot handle all the traffic with one instance even at the larger size and you'll decide you need two, OR you'll want a more fault-tolerant application that will still be online in the event of a failure or update.
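If you take the scale-up route, the resize boils down to stop, change the instance type, start. A rough boto3 sketch, with a placeholder instance ID and target size:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"   # placeholder

    # The instance must be stopped before its type can be changed.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Switch to a larger type, then bring it back up.
    ec2.modify_instance_attribute(InstanceId=instance_id,
                                  InstanceType={"Value": "m5.large"})
    ec2.start_instances(InstanceIds=[instance_id])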
If you create a second instance you will need to do some form of load balancing. I recommend using Amazon's Elastic Load Balancer, as it's easy to configure and its integration with the cloud is better than using round-robin DNS or an application like HAProxy. Elastic Load Balancers are not expensive; I believe they cost around $18/month plus the data that passes through the load balancer.
But no, you don't need anything else to scale up your site. Two EC2 instances and an ELB will do the trick.
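To make the two-instance setup concrete, here is a hedged boto3 sketch of a Classic ELB fronting two webservers; the load balancer name, Availability Zones, and instance IDs are placeholders:

    import boto3

    elb = boto3.client("elb", region_name="us-east-1")   # Classic ELB API

    # Create an HTTP listener on port 80 (name and zones are placeholders).
    elb.create_load_balancer(
        LoadBalancerName="my-web-elb",
        Listeners=[{"Protocol": "HTTP", "LoadBalancerPort": 80, "InstancePort": 80}],
        AvailabilityZones=["us-east-1a", "us-east-1b"],
    )

    # Register both EC2 instances so traffic is spread across them.
    elb.register_instances_with_load_balancer(
        LoadBalancerName="my-web-elb",
        Instances=[{"InstanceId": "i-aaaaaaaa"}, {"InstanceId": "i-bbbbbbbb"}],
    )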
Additional questions you didn't ask but probably should have.
How often does an EC2 instance experience hardware failure and crash my server? What can I do if this happens?
It happens frequently, usually in batches. Sometimes I go months without any problems, then I will get a few servers crash at a time. But it's definitely something you should plan for; I didn't in the beginning and I paid for it. Make sure you create scripts and have backups and a backup plan ready in case your server fails. Be OK with it being down, or have a load-balanced solution from day 1.
What's the hardest part about scalability?
Testing, testing, testing, testing... Don't ever assume anything. Also, be prepared for sudden spikes in your traffic. You have to be prepared for anything: if your page goes from 1 to 1,000 people overnight, are you prepared to handle it? Have you tested what you "think" will happen?
Best of luck and have fun... I know I have :)

Related

Amazon EC2 and Pen-Testing Form

I want to get an Amazon EC2 instance (the first-year trial is free) for my tutorials, but I have found out that I need to complete a form about pen-testing on their website, as I will be using the Amazon EC2 instance only to perform such actions against my own systems, which I physically own. I was just wondering whether a normal person like me can apply for it, or is it limited to companies so that normal users can't apply?
I will appreciate any help.
Kind Regards
You can apply just like anybody else - no special qualifications needed. Mostly they want to make sure you are only pen testing against your own instances, not somebody else's instance.
But also keep in mind, since it sounds like you are trying to stay within the free-tier, that you are probably going to need to pay for a bigger instance to test against:
At this time, our policy does not permit testing small or micro RDS instance types. Testing of m1.small or t1.micro EC2 instance types is not permitted. This is to prevent potential adverse performance impacts on resources that may be shared with other customers.

DB Server Requirements Advice

I am building a MySQL database with a web front end for a client. The client and their staff will use this webapp on a daily basis, creating anywhere from a few thousand, to possibly a few hundred thousand records annually. I just picked up a second client who wishes to have the same product and will probably be creating the same number of records annually, possibly more.
In the future I hope to pick up a few more clients. In the next few years I could have up to 5 databases & web front ends running for 5 distinct clients, all needing tight security while creating, likely, millions of records annually (cumulatively across all the databases).
I would like to run all of this with Amazon's EC2 service but am having difficulty deciding on what type of instance to run. I am not sure if I should have several distinct Linux instances, one per client, or run one "large" instance which would manage all the clients' databases and web front ends.
I know that hardware configuration is rather specific to the task at hand. The web front ends will be using jQuery to make the MySQL queries "pretty", and I will likely be doing some graphing of data (again with jQuery). The front ends will be using SSL for security, which I understand can add some overhead to network speed.
I'm looking for some of your thoughts on this situation.
Thanks
Use the tools that are available. The Amazon RDS service lets you run a MySQL database in the cloud with no extra effort. You can scale it up and down as you need - start small, and then as you hit your limits, add extra capacity (at extra cost).
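As an illustration only, creating a small managed MySQL instance with boto3 might look like the sketch below; the identifier, instance class, storage, and credentials are all placeholders:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # A small MySQL instance to start with; raise the class/storage later as needed.
    rds.create_db_instance(
        DBInstanceIdentifier="client-one-db",   # placeholder name
        Engine="mysql",
        DBInstanceClass="db.t3.micro",          # placeholder size
        AllocatedStorage=20,                    # GB
        MasterUsername="admin",                 # placeholder credentials
        MasterUserPassword="change-me-please",
    )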
Next, use Elastic Load Balancing (ELB) with an SSL certificate, so you offload the overhead of SSL decryption to an Amazon service.
If you're using Java for your webapp, you could use Elastic Beanstalk to handle the whole hosting process for you.
Don't be afraid to experiment - you can always resize instances with no data loss (if they boot from an EBS volume) and you can always create and delete instances. Scaling horizontally is often better than scaling vertically, as you can spread your instances across multiple Availability Zones.
Good luck!

Amazon EC2 Capacity & Workflow Questions

I'm hoping some of you with experience using Amazon EC2 could offer some advice... of course it'll be subjective, which is fine; I'm pretty sure your guesstimate would be better than mine.
I am planning on moving all my clients' websites from shared hosting environments to Amazon EC2. They're all pretty low-traffic sites (the busiest site receives around 50 unique visitors a day). There are about 8 sites, but I may expand this as I take on more projects and host more sites... current capacity planning is for, say, 12 sites.
Each site runs on ASP.Net (Umbraco CMS), and requires a SQL Server database.
My thoughts are one of the following:
Setup a Small Instance (1.7gb RAM, 1 EC2 Compute Unit), and run IIS and SQL Server Express on that server.
Setup 2 Micro Instances (613MB Ram each, Up to 2 EC2 Compute Units) – one for IIS, the other for SQL Server.
Which arrangement do you think would work best for my requirements? I've started setting up a Micro instance with Server 2008, SQL Server Express, etc., and found it not coping with the memory requirements, hence considering expanding. I could always do the configuration on a Small instance, then export the AMI and fire it up in a Micro instance afterwards, and do the same every time any serious changes to the server are required. I guess I could even do all updates etc. on a spare Small Spot instance, then load that AMI up in a Micro and transfer the IP address across, so I don't need to do too much work on the production servers. I figure if I store all my website data files on EBS Volumes, then it should be fairly easy to move hosting between servers with minimal downtime, while never working on a production server.
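For what it's worth, the configure-on-a-Small-then-shrink workflow described above maps onto baking an AMI from the configured instance and relaunching it on a Micro; a rough boto3 sketch with placeholder IDs and names:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Bake the configured Small instance into an AMI (placeholder instance ID).
    image = ec2.create_image(InstanceId="i-0123456789abcdef0",
                             Name="umbraco-baseline")
    ec2.get_waiter("image_available").wait(ImageIds=[image["ImageId"]])

    # Relaunch the same image on a Micro for day-to-day serving.
    ec2.run_instances(ImageId=image["ImageId"], InstanceType="t1.micro",
                      MinCount=1, MaxCount=1)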
I’m interested to know what you all think, and what strategies you employ for such activities as upgrades, windows updates, software installations, etc.
And what capacity do you think I'd need for my requirements?
Cheers
Greg
Well, first up, Server 2008 doesn't play well in the 613MB RAM the Micro instance gives you. It runs, but it's a dog, and it barks louder the more services (IIS, SSE, etc.) you layer on top. We use nothing smaller than a Small for Server 2008, and in fact typically do the environment config in a Medium and scale down to a Small once the heavy lifting is complete and the OS is ready to use. Server 2003, however, seems to breathe easier on a Micro - but we still do the config on a larger instance and scale down.
We're running low-traffic websites on Server 2003/IIS6 in a Micro, with a Server 2008/SS install on a shared, separate, Small instance. We do also have one Server 2008/IIS7 Micro build running, but only to remind ourselves why we don't use it more widely. ;)
Larger websites run Server 2008/IIS7 in either Small or Medium instances, but almost always still using that shared separate SS instance for database services. We try not to deploy multiple SS installations, since it makes maintenance and backups more complex.
Stashing content and config on EBS Volumes is of course good practice, unless you like rebuilding the entire system whenever an Instance disappears. Snapshotting your Instances periodically is also good practice, since you can spin-up a new Instance from a baseline AMI and swap the snapshot in as a boot Volume for fast recovery in the event of disaster.
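A minimal sketch of that snapshot-and-restore practice with boto3, assuming an existing data volume; the volume ID, description, and Availability Zone are placeholders:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    volume_id = "vol-0123456789abcdef0"   # placeholder content/config volume

    # Take a point-in-time snapshot of the volume.
    snap = ec2.create_snapshot(VolumeId=volume_id,
                               Description="baseline for web content")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # Later, restore it as a fresh volume in the AZ of the replacement instance.
    ec2.create_volume(SnapshotId=snap["SnapshotId"], AvailabilityZone="us-east-1a")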

Amazon EC2 consideration - redundancy and elastic IPs

I've been tasked with determining if Amazon EC2 is something we should move our ecommerce site to. We currently use Amazon S3 for a lot of images and files. The cost would go up by about $20/mo for our host costs, but we could sell our server for a few thousand dollars. This all came up because right now there are no procedures in place if something happened to our server.
How reliable is Amazon EC2? Is the redundancy good? I don't see anything about this in the FAQ, and it's a problem on our current system that I'm looking to solve.
Are elastic IPs beneficial? It sounds like you could point DNS to that IP and then on Amazon's end, reroute that IP address to any EC2 instance so you could easily get another instance up and running if the first one failed.
I'm aware of scalability, it's the redundancy and reliability that I'm asking about.
At work, I've had something like 20-40 instances running at all times for over a year. I think we've had 1-3 alert emails come from Amazon suggesting that we terminate and boot another instance (presumably because they are detecting possible failure in the underlying hardware). We've never had an instance go down suddenly, which seems rather good.
Elastic IPs are amazing and are part of the solution. The other part is being able to rapidly bring up new instances. I've learned that you shouldn't care about individual instances going down; it's more important to use proper load balancing and be able to bring up commodity instances quickly.
Yes, it's very good. If you aren't able to put together a concurrent redundancy (where you have multiple servers fulfilling requests simultaneously), using the elastic IP to quickly redirect to another EC2 instance would be a way to minimize downtime.
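As a sketch of that failover step, here is how re-pointing an already-allocated Elastic IP (in a VPC) at a standby instance might look with boto3; the address and instance ID are placeholders:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Look up the allocation for an Elastic IP you already own (placeholder address).
    address = ec2.describe_addresses(PublicIps=["203.0.113.10"])["Addresses"][0]

    # Re-associate it with the replacement instance; DNS never has to change.
    ec2.associate_address(AllocationId=address["AllocationId"],
                          InstanceId="i-0fedcba9876543210",
                          AllowReassociation=True)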
Yeah, I think moving from an in-house server to Amazon will definitely make a lot of sense economically. EBS-backed instances ensure that even if the machine gets stopped or rebooted, the data on its volumes is not lost. And if you have a clear separation between your application and data layers and can put them on different machines, then you can build even better redundancy for your data.
For example, if you use MySQL, then you can consider the Amazon RDS service, which gives you a highly available and reliable MySQL instance, fully managed (patches and all). The application layer can then be made more resilient by having more, smaller instances rather than one larger instance, behind load balancing.
The cost you will save on is really hardware maintenance and the cost you would otherwise incur to build in disaster recovery.

What are the practical limitations with auto scaling EC2 or other cloud computing infrastructure?

I was curious if anyone has experimented with auto scaling the web or DB tier in EC2 or other cloud computing infrastructure. It seems theoretically possible, but I am curious what the practical limitations are or may be.
Thanks!
We are also starting to look at auto-scaling.
The first candidate approach is to use Amazon's ELB (Elastic Load Balancer) and CloudFront. However, our traffic is a web service. Callers frequently send the 100-Continue HTTP message, and ELB cannot understand that message. There's no word yet from Amazon on when that might be fixed. Also, there are a number of complaints in the Amazon forums about ELB not handling heavy load.
lighttpd 1.5 looks like a promising partial solution, in that it can detect when an instance is not functioning and transparently take it out of the rotation, and it can be dynamically reconfigured without restarting the load balancer.
There are a number of commercial solutions as well. We will probably have a look at RightScale.
This is more of a question than an answer, but I'm about to start experimenting with autoscaling myself (most likely using the Amazon CloudFront facilities) and am thinking that instance startup time will be a factor. I've noticed that a new EC2 instance can take from 5 to 20 minutes to start up, so it's not as if you can instantly add more capacity when your load increases; it seems like you would need one or more idle instances to be running and ready to pick up increased load.
Late addition:
Consider SimpleDB as well... this would eliminate the DB scaling side.
For autoscaling, we rolled our own scripts to monitor, launch, and provision servers, and yes, the whole process takes about 7 minutes. We do a little predictive analysis to guess when new servers will be needed, and then just tear them down if they aren't. Total cost: ~10 cents.
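For anyone curious what "rolled our own scripts" can look like, here is a toy monitor-and-launch loop in boto3; the metric dimension, threshold, and AMI are all placeholders, not the poster's actual setup:

    from datetime import datetime, timedelta
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    ec2 = boto3.resource("ec2", region_name="us-east-1")

    # Average CPU of one web server over the last 10 minutes (placeholder instance ID).
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=datetime.utcnow() - timedelta(minutes=10),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    avg_cpu = sum(p["Average"] for p in points) / len(points) if points else 0.0

    # If we're running hot, launch one more webserver from a baked AMI (placeholder).
    if avg_cpu > 70:
        ec2.create_instances(ImageId="ami-12345678", InstanceType="t2.micro",
                             MinCount=1, MaxCount=1)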
Also, Scalr looks promising as a commercial solution (haven't used it).
Chad