Drupal on AWS for 3000 users, what instance to use? - amazon-ec2

I am looking at installing Drupal on Amazon Web services to host a staff training website & staff newsletter. There will be no heavy custom coding just a very standard drupal install.
The day to day use will be pretty low, but there will be time were nearly 3000 users will hit the site with in a hour.
Any advice on which instance size to use? if to use a load balancer?
Thanks in advance

It depeneds on many factors. The factors revolve around the question: how optimized are you? If you are not optimized, I'm not sure even the 7.2 GB will be enough. If correctly optimized, it probably will.
Optimization tips? Tell us what this site is about. Mostly static files, not expecting a lot of users really login in? YOU MUST HAVE BOOST INSTALLED. Also, use aws cloudfront and get a 3.4 GB instance, it should be enough. (needless to say, all static files should go to S3).
Did you mean that 3000 users will be login in within an hour? well, that can be tricky... If the surge in users is just for a couple of hours, don't be cheap. Get a huge instance just for a couple of hours. The money is worth all the optimizations you can do...

Related

How many instances would I need?

I'm trying to figure out hosting requirements for our organization's site. Any guidance to this would be much appreciated!
I need to know how many / which kind of instances I'll need so I can start planning this in my head.
Info:
- We'll be running ExpressionEngine (PHP) to power our sites, there will two sites so we'll be using the Multi-site manager
- We receive on average 85k hits daily - off months are around 6k a day, but it all balances out to an 85k average
- All images / media will be hosted on S3
- Database to run on RDS
- I'll cache the pages in the CMS so minimize load
I know we'll need a few EC2 instances, wondering what you guys suggest in terms of number of instances / which ones. I haven't used the AWS load balancers before, but I'm sure I'll need them.
I appreciate any suggestions, as well as links I could read up on the requirements. Thank you!
Honestly, nobody can answer this question without it being a wild-guess, but perhaps the better thing to do, instead of trying to figure out how many instances you need, is to spend the time instead architecting your site so that in can adapt to use as many instances it needs based on real usage metrics which it collects once it goes live.
AWS provides autoscaling for just this purpose. Take a guess how many instances you need, but setup everything up so that if AWS detects a given threshold has been passed (using a metric that you define), it automatically spins up more instances to meet demand, and then can takes them off-line when it is safe to do so. In general you should spin up extra instances fast, and take them off-line slowly (since you pay for the full hour anyway once they spin up). You can set a minimum and maximum as well, so you don't all of a sudden have to pay for 500+ instances that mistakenly got spun up!.
Even if you knew what would satisfy the load the first week, can you really know what the demand will be in 6-12 months? With autoscaling properly setup you don't have to know ahead of time.
This link is a good starting point; http://aws.amazon.com/autoscaling/

What's the speediest web hosting choices out there that are scalable to large traffic spikes and can handle fast page loads?

Is cloud hosting the way to go? Or is there something better that delivers fast page loads?
The reason I ask is because I run a buddypress site on a bluehost dedicated server, but it seems to run slow at most times of the day. This scares me because at the moment the sites not live and I'm afraid when it gets traffic it'll become worse and my visitors will lose interest. I use Amazon Cloud to handle all my media, JS, and CSS files along with a catching plugin, but it still loads slow at times.
I feel like the problem is Bluehost, because I visit other sites running buddypress and their sites seem to load instantly. Im not web hosting savvy so can someone please point me in the right direction here?
The hosting choice depends on many factors such as technical requirements, growth rates, burst rates, budgets and more.
Bigger Hardware
To scale up hosting operation, your first choice is often just using a more powerful server, VPS, or cloud instance. The point is not so much cloud vs. dedicated but that you simply bring more compute power to the problem. Cloud can make scaling up easier - often with a few clicks.
Division of Labor
The next step often is division of labor. You offload database, static content, caching or other items to specific servers or services. For example, you could offload static content to a CDN. You could a dedicated database.
Once again, cloud vs non-cloud is not the issue. The point is to bring more resources to your hosting problems.
Pick the Right Application Stack
I cannot stress enough picking the right underlying technology for your needs. For example, I've recently helped a client switch from a Apache/PHP stack to a Varnish/Nginx/PHP-FPM stack for a very business Wordpress operation (>100 million page views/mo). This change boosted capacity by nearly 5X with modest hardware changes.
Same App. Different Story
Also just because you are using a specific application, it does not mean the same hosting setup will work for you. I don't know about the specific app you are using but with Drupal, Wordpress, Joomla, Vbulletin and others, the plugins, site design, themes and other items are critical to overall performance.
To complicate matter, user behavior is something to consider as well. Consider a discussion form that has a 95:1 read:post ratio. What if you do something in the design to encourage more posts and that ratio moves to 75:1. That means more database writes, less caching, etc.
In short, details matter, so get a good understanding of your application before you start to scale out hosting.
A hosting service is part of the solution. Another part is proper server configuration.
For instance this guy has optimized his setup to serve 10 million requests in a day off a micro-instance on AWS.
I think you should look at your server config first, then shop for other hosts. If you can't control server configuration, try AWS, Rackspace or other cloud services.
just an FYI: You can sign up for AWS and use a micro instance free for one year. The link I posted - he just optimized on the same server. You might have to upgrade to a small server because Amazon has stated that micro is only to handle spikes and sustained traffic.
Good luck.

What are the reasons for a "simple" website not to choose Cloud Based Hosting?

I have been doing some catching up lately by reading about cloud hosting.
For a client that has about the same characteristics as StackOverflow (Windows stack, same amount of visitors), I need to set up a hosting environment. Stackoverflow went from renting to buying.
The question is why didn't they choose cloud hosting?
Since Stackoverflow doesn't use any weird stuff that needs to run on a dedicated server and supposedly cloud hosting is 'the' solution, why not use it?
By getting answers to this question I hope to be able to make a weighted decision myself.
I honestly do not know why SO runs like it does, on privately owned servers.
However, I can assume why a website would prefer this:
Maintainability - when things DO go wrong, you want to be hands-on on the problem, and solve it as quickly as possible, without needing to count on some third-party. Of course the downside is that you need to be available 24/7 to handle these problems.
Scalability - Cloud hosting (or any external hosting, for that matter) is very convenient for a small to medium-sized site. And most of the hosting providers today do give you the option to start small (shared hosting for example) and grow to private servers/VPN/etc... But if you truly believe you will need that extra growth space, you might want to count only on your own infrastructure.
Full Control - with your own servers, you are never bound to any restrictions or limitations a hosting service might impose on you. Run whatever you want, hog your CPU or your RAM, whatever. It's your server. Many hosting providers do not give you this freedom (unless you pay up, of course :) )
Again, this is a cost-effectiveness issue, and each business will handle it differently.
I think this might be a big reason why:
Cloud databases are typically more
limited in functionality than their
local counterparts. App Engine returns
up to 1000 results. SimpleDB times out
within 5 seconds. Joining records from
two tables in a single query breaks
databases optimized for scale. App
Engine offers specialized storage and
query types such as geographical
coordinates.
The database layer of a cloud instance
can be abstracted as a separate
best-of-breed layer within a cloud
stack but developers are most likely
to use the local solution for both its
speed and simplicity.
From Niall Kennedy
Obviously I cannot say for StackOverflow, but I have a few clients that went the "cloud hosting" route. All of which are now frantically trying to get off of the cloud.
In a lot of cases, it just isn't 100% there yet. Limitations in user tracking (passing of requestor's IP address), fluctuating performance due to other load on the cloud, and unknown usage number are just a few of the issues that have came up.
From what I've seen (and this is just based on reading various blogged stories) most of the time the dollar-costs of cloud hosting just don't work out, especially given a little bit of planning or analysis. It's only really valuable for somebody who expects highly fluctuating traffic which defies prediction, or seasonal bursts. I guess in it's infancy it's just not quite competitive enough.
IIRC Jeff and Joel said (in one of the podcasts) that they did actually run the numbers and it didn't work out cloud-favouring.
I think Jeff said in one of the Podcasts that he wanted to learn a lot of things about hosting, and generally has fun doing it. Some headaches aside (see the SO blog), I think it's a great learning experience.
Cloud computing definitely has it's advantages as many of the other answers have noted, but sometimes you just want to be able to control every bit of your server.
I looked into it once for quite a small site. Running a small Amazon instance for a year would cost around £700 + bandwidth costs + S3 storage costs. VPS hosting with similar specs and a decent bandwidth allowance chucked in is around £500. So I think cost has a lot to do with it unless you are going to have fluctuating traffic and lots of it!
I'm sure someone from SO will answer it but "Isn't just more hassle"? Old school hosting is still cheap and unless you got big scalability problems why would you do cloud hosting?

mosso versus gogrid which is better?

I have reasonable experience to manage my own server, so gogrid style management is not a problem. But seems mosso is a tag cheaper somewhat- except the very difficult to access compute cycles terms. Anyone could share about this would be very welcomed.
Well, even at the current moment as correct answer is marked GoGrid choice, I think I need to share my experience with GoGrid.
It's been several weeks after we broke our commitment with them and I think I'm pretty calm now to write cons for them.
1) Images. We were trying to use Windows 2008 images and those were pretty old. To be up to date, you need to install 80+ updates and that takes a while. But that's not the worst thing. Worst thing is, that default image hdd size is 20gb and that was not enough to complete windows updating, at least in automatic way (not talking about installing additional software). There's no way to increase image size, so you need to make all kinds of workarounds (for example disable virtual memory, when installing).
2) Support. It's not fanatic. I would call it robotic. Although live chat is working, at least we were unable to solve by live chat most of the problems, because live chat support personel would always forward request to upper level, which is not accessible through live chat. Another thing is, that as I understood, engineers, that have real knowledge and access to infrastructure don't work at night and in weekends (I was working from Europe, so I had completely different time zone).
3) Service Level Agreement. You need to be careful about small print (for example I've missed that rule 1hour of non working is compensated 100x was working only for one month bill), but there are things, that are not mentioned - for example I was told, that SLA terms do not work for cloud storage, although I think you won't find this mentioned in SLA.
4) Reaction time. Although in SLA they say, that will solve any issue in two hours, we couldn't get solution in 10 days. Problem was clear: network speed between gogrid server instances, also between instance and cloud storage was 10-15kbps (measured using several tools, such as netio and etc., tested several instances and so on). That wasn't because they forgot or smth., we were checking status at various levels every day. My management talked with VP of technology or something and he promised that problem will be solved in nearest time, several days passed and no solution was proposed. And some of the emails about how they are investigating problem made me laugh.
5) Internet speeds. Sometimes they were really good (I've measured 550mbps download speed), but sometimes they are terrible (upload up to 0.05mbps).
If someone thinks, that this is some kind of competitors posting, I have chat and email logs about mentioned issues, also screen shots of internet speed tests and could provide under request.
Ok, and one good thing about their service - you can use several IP addresses on one instance (what our current hosting provider - Amazon EC2 is unable to do).
Stay away from GoGrid !
I don't have any experience with Mosso, but I do have (unfortunately) VERY bad experience with GoGrid.
As other people mentioned, their support is horrible. Most times you will get a live chat person that really is no help at all - doesn't really know their system or how it works so he can't really help with any problem beyond restarting your server.
Another issue is their performance which is at best unreliable and at worst just not there. Starting from I/O which can drop to < 1mb/s (measured by a few tools) - ranging to network connections that are very slow - load balancers which do not spread the load (2 servers on RoundRobin get 70/30)
Not to mention a very buggy portal - new server picks a free ip, which I am then told is in use...and not by me - even though I have the whole range "assigned" to me -
new cases which are saved without the text - buttons which say "upgrade to a new plan" but do nothing... etc... etc...
Their billing department which is not responsive and you have to argue about everything (why am I paying $0.5/gb traffic when the site states $0.29 ?????)
I have been using them for about a year now - and that's only because I don't have the time to move. Hopefully I will be able to get the hell out of there in a month.
As you can tell, I am very very frustrated with them. I know it's my fault I didn't run away sooner, but I really didn't expect such a low level of service and quality.
beware....
Yoav.
Mosso has way better service though, and the clients stay happy. The only issue I have experienced with them ever was installing DNN (which is a pain period) and a single client machine refused to allow for FTP access to their site... but again, Mosso techs did everything they could to get it going.
It's simple, Mosso is just like a "reseller" hosting. They provide you everything whitelabel from billing to control panel then you sell it back to customers.
If you are developer, I recommend you choose GoGrid. Firstly, Mosso doesn't provide SSH access. Secondly, if you are RoR/Mongrel user, you are capped to limited RAM (unless you pay extra in addition to $100). Moreover, GoGrid allows you to choose server image (CentOS, Redhat, Windows) with some out-of-the-box support for RoR and LAMP.
Somemore, GoGrid provides you initial credits ($50 or $95 if you use MS-WEBFWRD) for you to try out before actually paying for it.
Mosso does not give you Admin control over the "servers" anymore...
Disclosure: I am the Technology Evangelist for GoGrid.
I wanted to address some of the points above by #Giedrius and #Yoav. I'm sorry if your experience was lower than expected. We have and continue to make dramatic improvements and upgrades to both our product features as well as our service. That being said, I want to answer a few points that you listed above, specifically:
1) Images - Do note that the HD size (persistent storage) is tied to the RAM allocation. Our base images for the lowest RAM allocation (512 MB) is now 30 GBs. Also, because some users experienced some performance issues with low allocations of RAM on Windows servers, we have set a minimum allocation of 1 GB or higher for most Windows instances. Also, all of our Windows 2008 instances now have SP2 on them: wiki.gogrid.com/wiki/index.php/Server_Images#Windows_2008_Server
2) Support - We are always working on making our support team and processes even better. Remember that there are several public clouds that charge for support, something we don't do. Yes, it is available 24/7/365 and you are correct that there are typically more support personnel available during business hours (that is the norm for many companies). Be we are here to help 24x7. Also, every GoGrid account gets a dedicated service team which consists of a variety of personnel from our organization (acct mgmt, tech support, billing, etc.)
3) SLA - We offer one of the most robust SLAs in the marketplace. Also, Cloud Storage IS in fact covered in our SLA under Section VI here: www.gogrid.com/legal/sla.php .
4) Reaction time - I do not believe that we ever state in the SLA that any issue will be "resolved" within 2 hours. I doubt that ANY hosting provider can offer that, simply because of the nature of hosting and the complexity therein. We will acknowledge and respond to tickets (as stated within the SLA) within 2 hours or 30 minutes depending on the nature of the ticket. I'm sorry if that isn't clear so please let me know where it can be better explained.
5) Internet speeds - we have multiple bandwidth providers for our datacenter. It is not typical that there is latency, jitter or slow transfer speeds. If a situation is encountered where the speeds are not what you expect, I encourage you to open a support ticket so that we can investigate.
6) I/O - recently we have been benchmarked by an independent 3rd party, CloudHarmony.com, as having the best I/O of cloud providers: http://blog.cloudharmony.com/2010/06/disk-io-benchmarking-in-cloud.html
7) Network Connections - see #5 above
8) Load Balancers - if you are encountering balancing issues, we encourage you to report it. Details on our LB can be found on the wiki: wiki.gogrid.com/wiki/index.php/(F5)_Load_Balancer
9) Portal - We continue to make optimizations to the web portal including recently launching a "list view" for customers with larger environments. If the portal is "misbehaving", I recommend clearing your cache and using the latest browser version (I personally use Chrome and Firefox regularly on the portal w/o issue). Alternatively, you could use the API to manage your GoGrid infrastructure.
10) Transfer Plan - A few months ago, we released some new RAM and Transfer Plans. It seems that you are still on the old Transfer plan if you have $0.50/GB instead of $0.29. We don't automatically change customers' plans without their permission. So I recommend that you upgrade your plan to enjoy the new pricing.
Hope that helps answer the questions/concerns. I didn't mean for it to be a sales pitch (as I'm not a sales guy) but I wanted to be sure that other readers had "the other side of the story."
Please contact me should you have any questions: michael[at]gogrid.com
Thanks!
-Michael

How to affordably release a Web App

I am a broke college student. I have built a small web app in PHP5 and MySQL, and I already have a domain. What is an affordable way to get it online? A few people have suggested amazon's cloud services, but that seems equivalent to slitting my wrists and watching money slowly trickle out. So suggestions? Hosting companies, CIA drop sites, anything?
Update: A lot of suggestions have been for Dreamhost. Their plan allows for 5TB of bandwidth. Could anyone put this in perspective? For instance, how much bandwidth does a site with the kind of traffic StackOverflow get?
I say pay the 50-80 bucks for a real host. The classic "you get what you pay for" is very true for hosting. This will save you time, time you can spend getting those $80.
I use and recommend DreamHost for both their prices and customer service. I've hosted several sites here and performance has always been good. $5.95 a month for their basic package.
I highly recommend HostRocket. I have been with them for about 6 or 7 years now with multiple domains and have found uptime and database availability flawless. The only reason I'm leaving them is because I'm doing some .NET web apps now and HostRocket is purely LAMP based.
But without making things an ongoing ad. I will put in two "gotchas" that you'll want to be wary of when searching:
"Free" hosting services. Most of these will make you subdomain on them and worse, they'll put a header and a footer on your page (sometimes in gaudy frame format) that they advertise heavily on. I don't care how poor you are, this will not help attract traffic to your app.
A lot of the cheaper rates depend on pre-payment. HostRocket will give you $4.99 a month in hosting, but you have to pre-pay for 3 years. If you go month to month, it is $8.99. There are definitely advantages to the pre-payment, but you don't want to get caught with close to twice the monthly payment if you weren't expecting it.
I recently found a site called WebHostingStuff that seems to have a decent list of hosts and folks that put in their reviews. While I wouldn't consider it "the final authority" I have been using it as of late for some ideas when looking for a new host.
I hope this helps and happy hunting!
I have no specific sites to suggest, but a typical hosting company will charge you less than $10 per month for service. A simple Google search will turn up lots of results for "comparison of web hosts": http://www.google.com/search?hl=en&q=comparison+of+web+hosts&btnG=Google+Search
Well, Amazon EC2 is only as bad as the amount of traffic you get. So the ideal situation is to monetize your site (ads, affiliate programs, etc) so that that more traffic you get, the more you pay Amazon, but the more you make...in theory of course.
As for a budget of nothing...there's not really much you can do...hosting typically always costs something, but since you are using the LAMP stack, it's pretty cheap.
For example, hosting on GoDaddy.com for 1year can be about $50-60 which is not too bad.
I use dreamhost which costs about $80 per year, but I get MUCH more storage and bandwidth.
I agree with pix0r. With your requirements of php5 and mysql it seems that for starting out Dreamhost would be a good recommendation. You can always move it over pretty easily to ec2 if it takes off.
Dreamhost is great and cheap for a php5 mysql setup that gives you command line access. The problems come if you want to use some other web language/framework like RoR or Python/Django/Pylons. I know there are hacks to get things working, but last time I tried they were spotty at best and not supported by Dreamhost.
It may be helpful to know what kind of app we are talking about. Also what sort of traffic do you expect and to echo Adam's note what sort of business model (if any) do you have?
I've been at HostingMatters for years. They're relatively cheap, and their service is awesome. <12 hours for any support ticket I've ever had.
Additionally, since I've been with them for about ten years, they bumped me to an unmetered plan for no cost (at the same $10/month I was paying.) ....

Resources