I want to setup a proxy (using Squid) that will take ~ 5000 requests per day on an Amazon EC2 instance. Will there be a noticeable difference in speed between a micro vs small instance?
The requests are for HTML, not for any media like images or videos.
Micro instances main issue is spotty CPU and disk I/O performance, but that amount of traffic is actually quite small, averaging 3 requests a minute. As a bonus, a micro instance will qualify for Amazon's free tier. For optimal performance, make sure the machine is stripped down of unnecessary services.
I would advice you to start with ebs (not instance-store) micro instance.
Later you will be able to convert micro to small.
Besides if you are planing to use the instance during for long time (more than one year) just think about purchasing reserved instance (you will save some money). But before you reserve instance you must definitely know what is better for you micro or small.
Good luck!
A micro instance should be able to handle that kind of traffic as long as you don't have too many other things installed on it. Since you are only charged for how long the instance is running. I would take both a micro and a small instance and then run some tests to see if the micro can handle your traffic or if the small instance gives you better performance and is worth the extra cash.
Related
I'm newbie on AWS, and it has so many products (EC2, Load Balancer, EBS, S3, SimpleDB etc.), and so many docs, that I can't figure out where I must start from.
My goal is to be ready for scalability.
Suppose I want to set up a simple webserver, which access a database in mongolab. I suppose I need one EC2 instance to run it. At this point, do I need something more (EBS, S3, etc.)?
At some point of time, my app has reached enough traffic and I must scale it. I was thinking of starting a new copy (instance) of my EC2 machine. But then it will have another IP. So, how traffic is distributed between both EC2 instances? Is that did automatically? Must I hire a Load Balancer service to distribute the traffic? And then will I have to pay for 2 EC2 instances and 1 LB? At this point, do I need something more (e.g.: Elastic IP)?
Welcome to the club Sony Santos,
AWS is a very powerfull architecture, but with this power comes responsibility. I and presumably many others have learned the hard way building applications using AWS's services.
You ask, where do I start? This is actually a very good question, but you probably won't like my answer. You need to read and do research about all the technologies offered by amazon and even other providers such as Rackspace, GoGrid, Google's Cloud and Azure. Amazon is not easy to get going but its not meant to be really, its focus is more about being very customizable and have a very extensive api. But lets get back to your question.
To run a simple webserver you would need to start an EC2 instance this instance by default runs on a diskdrive called EBS. Essentially an EBS drive is a normal harddrive except that you can do lots of other cool stuff with it like take it off one server and move it to another. S3 is really more of a file storage system its more useful if you have a bunch of images or if you want to store a lot of backups of your databases etc, but its not a requirement for a simple webserver. Just running an EC2 instance is all you need, everything else will happen behind the scenes.
If you app reaches a lot of traffic you have two options. You can scale your machine up by shutting it off and starting it with a larger instance. Generally speaking this is the easiest thing to do, but you'll get to a point where you either cannot handle all the traffic with 1 instance even at the larger size and you'll decide you need two OR you'll want a more fault tolerant application that will still be online in the event of a failure or update.
If you create a second instance you will need to do some form of loadbalancing. I recommend using amazons Elastic Load Balancer as its easy to configure and its integration with the cloud is better than using Round Robin DNS or a application like haproxy. Elastic Load Balancers are not expensive, I believe they cost around $18 / month + data that's passed between the loadbalancer.
But no, you don't need anything else to do scale up your site. 2 EC2 instances and a ELB will do the trick.
Additional questions you didn't ask but probably should have.
How often does an EC2 instance experience hardware failure and crash my server. What can I do if this happens?
It happens frequently, usually in batches. Sometimes I go months without any problems then I will get a few servers crash at a time. But its defiantly something you should plan for I didn't in the beginning and I paid for it. Make sure you create scripts and have backups and a backup plan ready incase your server fails. Be ok with it being down or have a load balanced solution from day 1.
Whats the hardest part about scalabilty?
Testing testing testing testing... Don't ever assume anything. Also be prepared for sudden spikes in your traffic. You have to be prepared for anything if you page goes from 1 to 1000 people over night are you prepared to handle it? Have you tested what you "think" will happen?
Best of luck and have fun... I know I have :)
I've been tasked with determining if Amazon EC2 is something we should move our ecommerce site to. We currently use Amazon S3 for a lot of images and files. The cost would go up by about $20/mo for our host costs, but we could sell our server for a few thousand dollars. This all came up because right now there are no procedures in place if something happened to our server.
How reliable is Amazon EC2? Is the redundancy good, I don't see anything about this in the FAQ and it's a problem on our current system I'm looking to solve.
Are elastic IPs beneficial? It sounds like you could point DNS to that IP and then on Amazon's end, reroute that IP address to any EC2 instance so you could easily get another instance up and running if the first one failed.
I'm aware of scalability, it's the redundancy and reliability that I'm asking about.
At work, I've had something like 20-40 instances running at all times for over a year. I think we've had 1-3 alert emails come from amazon suggesting that we terminate and boot another instance (presumably because they are detecting possible failure in the underlying hardware). We've never had an instance go down suddenly, which seems rather good.
Elastic IP's are amazing and are part of the solution. The other part is being able to rapidly bring up new instances. I've learned that you shouldn't care about instances going down, that it's more important to use proper load balancing and be able to bring up commodity instances quickly.
Yes, it's very good. If you aren't able to put together a concurrent redundancy (where you have multiple servers fulfilling requests simultaneously), using the elastic IP to quickly redirect to another EC2 instance would be a way to minimize downtime.
Yeah I think moving from inhouse server to Amazon will definitely make a lot of sense economically. EBS backed instances ensure that even if the machine gets rebooted, the transient memory is not lost. And if you have a clear separation between your application and data layer and can have them on different machines, then you can build even better redundancy for your data.
For ex, if you use mysql, then you can consider using Amazon RDS service - which gives you a highly available and reliable MySQL instance, fully managed (patches and all). The application layer then can be made more resilient by having more smaller instances rather than one larger instance, through load balancing.
The cost you will save on is really hardware maintenance and the cost you would have to incur to build in disaster recovery.
I want to separate the current high cpu medium instance that I have into high cpu + micro instance in order to take in more http traffic. Does anyone know how many simultaneous connections can a micro instance take in? The initial idea is to separate the db and host it from sql azure but due to some old stored procedures, I'm opting to stay on a pure ec2 setup. The reason is because the current instance already takes in 100% cpu at times.
This question is quite old so presumably you've made the decision on your own, but just for posterity:
The capacity of any server will vary WIDELY depending on what you're using it for. It would be silly to even speculate on a value. If each HTTP request requires lots of server processing, combing through database results, repackaging media from other servers, you could quickly overload a Micro. However, if all you're doing is basic web serving, you could probably do alright for a while.
The only way to answer a question like this is to set up the micro instance and test it yourself. Set up your development workstation to flood it with requests, and then keep an eye on the response time and CPU usage in AWS CloudWatch to see if the behavior you get is acceptable to you.
(As a side note, there are also differences in network I/O between the various EC2 sizes, so if performance under heavy load seems bad but the CPU isn't working hard, it may be I/O bound.)
I currently hold a tracking service that records visits from various sources. At times we record the visits and redirect to our clients or we let clients call us to report visits. The architecture is two worker boxes configured behind a load-balancer. This system is setup using Amazon EC2 and the load-balancer used is Amazon's Elastic LB.
I did some benchmarking tests and have noticed significant network latencies. The traffic through load-balancer suffers atleast 2 times more delay than hitting any of the boxes directly.
Has anyone experienced such an issue and has attempted to solve it? Is this an Amazon EC2 specific issue?
Is there any other architecture in use that would lower my network latencies significantly. e.g. Using a HA such that traffic needn't go through a load-balancer but instead hits the end point servers directly? Before I start investing time on that I wanted to hear what others think of the same.
Thanks alot for your time,
Santosh
Change your LB and give it another try. HAProxy is great session/cookie aware L7 balancer and can be setup in Amazon cloud AFAIK. See this: http://agiletesting.blogspot.com/2009/02/load-balancing-in-amazon-ec2-with.html
You have to take into account that ELBs perform better after a while, then initially. Don't ask me why, but that's how it is -- loadbalancer warming?
It also really depends how much traffic you sent the ELB. Keep in mind that the hardware the ELB is provisioned on seems like a regular small instance. So the throughput is capped at ~25 MBit (last time I checked). If you require more, go dedicated.
In the end, I too would suggest that you look Haproxy on a dedicated instance. I'd expect some delay, 2x more delay sounds unreal. Maybe use another small instance and benchmark it directly against the ELB and then try a c1.medium.
I was curious if anyone has experimented with auto scaling web or db tier in EC2 or other cloud computing infrastructure? It seems theoretically possible, but I am curious what the practical limitations are/maybe.
Thanks!
We are also starting to look at auto-scaling.
The first candidate approach is to use Amazon's ELB (Elastic Load Balancer) and Cloud Front. However, our traffic is a web service. Caller's frequently send the 100-Continue http message, and ELB cannot understand that message. There's no word yet from Amazon on when that might be fixed. Also, there are a number of complaints in the Amazon forums about ELB not handling heavy load.
LigHTTPD 1.5 looks like a promising partial solution, in that it can detect when an instance is not functioning and transparently take it out of the rotation, and can be dynamically reconfigured without restarting the load balancer.
There are a number of commercial solutions as well. We will probably have a look at Right Scale.
This is more of a question than an answer, but I'm about to start experimenting with autoscaling myself (most likely using the Amazon CloudFront facilities) and am thinking that instance startup time will be a factor. I've noticed that a new EC2 instance can take from 5 to 20 minutes to start up, so it's not as if you can instantly add more capacity when your load increases; it seems like you would need one or more idle instances to be running and ready to pick up increased load.
Late addition:
Consider SimpleDB as well... this would eliminate the DB scaling side.
For autoscaling, we rolled our own scripts to monitor, launch, and provision servers and yes, the whole process takes about 7 minutes. We do a little predictive analysis to guess when new servers will be needed and then just break them down if they aren't. Total cost: ~10 cents.
Also, Scalr looks promising as a commercial solution (haven't used it).
Chad