What various methods and technologies have you used to successfully address scalability and performance concerns of a website? I am an ASP.NET web developer exploring .NET remoting with WCF with SQL clustering and am curious as to what other approaches exist (such as the ‘cloud’). In which cases would you apply various approaches (for example method a for roughly x many ‘active’ users).
An example of what I mean, a myspace case study: http://highscalability.com/myspace-architecture
This is a very broad question making it difficult to answer, but I'll try and provide a few general suggestions.
1 - Unless you are doing some things seriously wrong then you'll likely not need to worry about perf or scale until you hit a significant amount of traffic (over 1 million page views a month).
2 - Your biggest performance problems initially are likely to be the page load times from other countries. Try the Gomez Instant Site Test to see the page load times from around the world, and use YSlow as a guide for optimizing.
3 - When you do start hitting performance problems, the cause will most likely be database work at first. Use SQL Server Profiler to examine your SQL traffic for long-running queries to optimize, and use sys.dm_db_missing_index_details to look for indexes you should add.
4 - If your web servers start becoming the performance bottleneck, use a profiler (such as the ANTS Profiler) to look for ways to optimize your web page code.
5 - If your web servers are well optimized and still running too hot, look for more caching opportunities, but you're probably going to need to simply add more web servers.
6 - If your database is well optimized and still running too hot, then look at adding a distributed caching system. This probably won't happen until you're over 10 million page views a month.
7 - If your database is starting to get overwhelmed even with distributed caching, then look at a sharding architecture. This probably won't happen until you're over 100 million page views a month.
I've worked on a few sites that get millions of hits per month. Here are some basics:
Cache, cache, cache. Caching is one of the simplest and most effective ways to reduce load on your webserver and database. Cache page content, queries, expensive computation, anything that is I/O bound. Memcache is dead simple and effective.
Use multiple servers once you are maxed out. You can have multiple web servers and multiple database servers (with replication).
Reduce the overall number of requests to your web servers. This entails caching JS, CSS and images using Expires headers. You can also move your static content to a CDN, which will speed up your users' experience.
Measure & benchmark. Run Nagios on your production machines and load test on your dev/qa server. You need to know when your server will catch on fire so you can prevent it.
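To make the "cache, cache, cache" point concrete, here is a minimal cache-aside sketch. It uses an in-process dictionary with a time-to-live as a stand-in for a real cache such as memcached; the class name TTLCache, the function load_user_profile and the 60-second TTL are all hypothetical and only serve the illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process stand-in for a cache server (assumption: one process)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit: skip the expensive work
        value = compute()                # cache miss: do the expensive work once
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = 0


def load_user_profile(user_id: int) -> dict:
    """Hypothetical expensive query; counts calls so the effect is visible."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user{user_id}"}


cache = TTLCache(ttl_seconds=60)
for _ in range(100):
    profile = cache.get_or_compute("user:42", lambda: load_user_profile(42))

print(f"100 page requests, {db_calls} database call(s)")
```

The same get-or-compute shape applies to page fragments, query results and expensive computations; what changes in production is where the entries live and how you invalidate them.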
I'd recommend reading Building Scalable Websites, it was written by one of the Flickr engineers and is a great reference.
Check out my blog post about scalability too, it has a lot of links to presentations about scaling with multiple languages and platforms:
http://www.ryandoherty.net/2008/07/13/unicorns-and-scalability/
There is also Velocity from Microsoft, memcached has a .NET port now, and there is indeXus.Net.
The following questions were posed by a customer who is about to write a large-scale XPages application. While I think the questions are actually too broad to fit the Stack Overflow style, they are interesting, and the collective knowledge of the experts here could yield better results than one person answering them:
How many concurrent users can use XPages applications on one Lotus Domino server (there are several applications on the server, not just one)?
How can we identify and analyze memory leaks on a Lotus Domino server when running XPages applications?
How can we write XPages the right way to achieve the best performance and avoid memory leaks?
What code methods and objects should not be used?
What are typical errors when a LotusScript developer begins to write code for XPages? What are the best practices?
How can we build a centralized, consolidated XPages application for 10000 - 15000 users? How many servers do we need? How should the XPages application be configured in that case?
How do we balance users?
I will provide my insights, please share yours
How long is a string? It depends on how the server is configured. And "application" could be a single form or hundreds. Only a test can tell. In general: build a high performance server preferably with 64Bit architecture and lots of RAM. Make that RAM available for the JVM. If the applications use attachments, use DAOS, put it on a separate disk - and of course make sure you have the latest version of Domino (8.5.3FP1 at time of this writing)
There is the XPages Toolbox that includes a memory and CPU profiler.
It depends on the type of application. Clever use of the scopes for caching, Expression Language and beans instead of SSJS. You leak memory when you forget .recycle(). Hire an experienced lead developer and read the book also the other one and two. Consider threading off longer running code, so users don't need to wait.
Depends on your needs. The general lessons of Domino development apply when it comes to db operations, so FTSearch over DBSearch, scope usage over @DbColumn for parameters. EL over SSJS.
Typical errors include: all code in the XPages -> use script libraries. Too much @DbLookup and @DbColumn instead of scope. Validation in buttons instead of validators. Violation of decomposition principles. Forgetting to use .recycle(). Designing applications "like old Notes screens" instead of single page interaction. Too little use of partial refresh. No use of caching. Too little object orientation (creating function graves in script libraries).
This is a summary of question 1-5, nothing new to answer
When clustering Domino servers for XPages and putting a load balancer in front, the load balancer needs to be configured to keep a session on the same server, so partial refreshes and Ajax calls reach the server that has the component tree rendered for that user.
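The session-affinity requirement can be sketched as follows. This is only an illustration of the idea, not how any particular load balancer is configured: real products usually offer cookie-based persistence, and the host names and the hash-based pick_backend function here are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_backend(session_id: str, backends: Sequence[str]) -> str:
    """Map a session ID to one backend so every request of that session
    (including partial refreshes and Ajax calls) lands on the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical cluster members.
BACKENDS = ["domino-a.example.com", "domino-b.example.com"]

first = pick_backend("SessionID=ABC123", BACKENDS)
again = pick_backend("SessionID=ABC123", BACKENDS)
print(first == again)  # the same session always maps to the same server
```

Note that a plain hash like this remaps sessions whenever the list of servers changes, which is one reason cookie-based persistence is commonly preferred.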
It depends on the server setup. For example, I have an XPages extranet with 12,000 registered users spanning approximately 20 XPages applications. It runs on one Windows 2003 server with 4 GB RAM and a quad-core CPU. The data amount is about 60 GB across these 20 applications. No DAOS, no beans, just SSJS. Performance is excellent. When I upgrade this installation to 64-bit and DAOS, the applications will scale even further. So 64-bit and lots of RAM are the key to supporting a lot of users.
I haven't done anything around this
Make sure to recycle when you do document loops. Use the openntf.org debug toolbar; it will save a lot of time until we have a debugger for XPages.
Always keep in mind that whatever you do will be done by several users at once, so try to cut down the number of lookups or getElementByKey calls. Try to use ViewNavigator when you can.
It all depends on how many users use the system concurrently. If you have 10000 - 15000 concurrent users, then you have to look at what the applications do and how many users will use the same application at the same time.
Those are my insights into the question.
Is cloud hosting the way to go? Or is there something better that delivers fast page loads?
The reason I ask is that I run a BuddyPress site on a Bluehost dedicated server, but it seems to run slow at most times of the day. This scares me because at the moment the site's not live, and I'm afraid that when it gets traffic it'll become worse and my visitors will lose interest. I use Amazon Cloud to handle all my media, JS, and CSS files along with a caching plugin, but it still loads slow at times.
I feel like the problem is Bluehost, because I visit other sites running BuddyPress and their sites seem to load instantly. I'm not web hosting savvy, so can someone please point me in the right direction here?
The hosting choice depends on many factors such as technical requirements, growth rates, burst rates, budgets and more.
Bigger Hardware
To scale up hosting operation, your first choice is often just using a more powerful server, VPS, or cloud instance. The point is not so much cloud vs. dedicated but that you simply bring more compute power to the problem. Cloud can make scaling up easier - often with a few clicks.
Division of Labor
The next step is often division of labor. You offload the database, static content, caching or other items to specific servers or services. For example, you could offload static content to a CDN. You could use a dedicated database server.
Once again, cloud vs non-cloud is not the issue. The point is to bring more resources to your hosting problems.
Pick the Right Application Stack
I cannot stress enough the importance of picking the right underlying technology for your needs. For example, I've recently helped a client switch from an Apache/PHP stack to a Varnish/Nginx/PHP-FPM stack for a very busy WordPress operation (>100 million page views/mo). This change boosted capacity by nearly 5X with modest hardware changes.
Same App. Different Story
Also just because you are using a specific application, it does not mean the same hosting setup will work for you. I don't know about the specific app you are using but with Drupal, Wordpress, Joomla, Vbulletin and others, the plugins, site design, themes and other items are critical to overall performance.
To complicate matters, user behavior is something to consider as well. Consider a discussion forum that has a 95:1 read:post ratio. What if you do something in the design to encourage more posts and that ratio moves to 75:1? That means more database writes, less caching, etc.
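A quick calculation shows how much that ratio shift matters. The sketch below assumes total traffic stays the same and that every request is either a read or a post; the function name write_fraction is just for illustration.

```python
def write_fraction(reads_per_post: float) -> float:
    """Share of all requests that are posts, given N reads for every 1 post."""
    return 1.0 / (reads_per_post + 1.0)


before = write_fraction(95)   # 1 post in every 96 requests
after = write_fraction(75)    # 1 post in every 76 requests
relative_increase = after / before - 1.0

print(f"posts before: {before:.2%} of requests")
print(f"posts after:  {after:.2%} of requests")
print(f"relative increase in writes: {relative_increase:.1%}")
```

Under these assumptions the share of writes grows from about 1.0% to about 1.3% of requests, which is roughly 26% more database writes for the same amount of traffic.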
In short, details matter, so get a good understanding of your application before you start to scale out hosting.
A hosting service is part of the solution. Another part is proper server configuration.
For instance this guy has optimized his setup to serve 10 million requests in a day off a micro-instance on AWS.
I think you should look at your server config first, then shop for other hosts. If you can't control server configuration, try AWS, Rackspace or other cloud services.
Just an FYI: you can sign up for AWS and use a micro instance free for one year. In the link I posted, he optimized on that same kind of server. You might have to upgrade to a small instance, because Amazon has stated that micro is only meant to handle spikes, not sustained traffic.
Good luck.
We have a new project for a web app that will display banners ads on websites (as a network) and our estimate is for it to handle 20 to 40 billion impressions a month.
Our current language is ASP, but we are moving to PHP. Does PHP 5 have limits when it comes to scaling a web application? Or should I have our team invest in picking up JSP?
Or, is it a matter of the app server and/or DB? We plan to use Oracle 10g as the database.
No offense, but I strongly suspect you're vastly overestimating how many impressions you'll serve.
That said:
PHP or other languages used in the application tier really have little to do with scalability. Since the application tier delegates its state to the database or equivalent, it's straightforward to add as much capacity as you need behind appropriate load balancing. Choice of language does influence per-server efficiency and hence costs, but that's different from scalability.
It's scaling the state/data storage that gets more complicated.
For your app, you have three basic jobs:
what ad do we show?
serving the ad
logging the impression
Each of these will require thought and likely different tools.
The second, serving the ad, is the simplest: use a CDN. If you actually serve the volume you claim, you should be able to negotiate favorable rates.
Deciding which ad to show is going to be very specific to your network. It may be as simple as reading a few rows from a database that give ad placements for a given property for a given calendar period. Or it may be complex contextual advertising like Google's. Assuming it's more the former, and that the database of placements is small, then this is the simple task of scaling database reads. You can use replication trees or alternatively a caching layer like memcached.
The last will ultimately be the most difficult: how to scale the writes. A common approach would be to still use databases, but to adopt a sharding scaling strategy. More exotic options might be to use a key/value store supporting counter instructions, such as Redis, or a scalable OLAP database such as Vertica.
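As a rough sketch of the sharding idea for impression logging: each ad ID is hashed to one of several shards, so writes for different ads spread across independent stores. The in-memory Counter objects below stand in for separate database servers, and the class name, shard count and ad IDs are hypothetical.

```python
import hashlib
from collections import Counter
from typing import List


class ShardedImpressionLog:
    """Routes each ad's impression counter to one of N shards by hashing the ad ID."""

    def __init__(self, shard_count: int) -> None:
        # Each Counter is a stand-in for one database shard.
        self._shards: List[Counter] = [Counter() for _ in range(shard_count)]

    def _shard_index(self, ad_id: str) -> int:
        digest = hashlib.sha256(ad_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def log_impression(self, ad_id: str) -> None:
        self._shards[self._shard_index(ad_id)][ad_id] += 1

    def impressions(self, ad_id: str) -> int:
        return self._shards[self._shard_index(ad_id)][ad_id]

    def total(self) -> int:
        return sum(sum(shard.values()) for shard in self._shards)


log = ShardedImpressionLog(shard_count=4)
for ad in ["ad-1", "ad-2", "ad-1", "ad-3", "ad-1"]:
    log.log_impression(ad)

print(log.impressions("ad-1"), log.total())
```

One design caveat: sharding by ad ID concentrates all writes for a very popular ad on a single shard, so a real system might shard on a different key or split hot counters further.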
All of the above assumes that you're able to secure data center space and network provisioning capable of serving this load, which is not trivial at the numbers you're talking.
You do realize that 40 billion per month is roughly 15,500 per second, right?
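The arithmetic is easy to check. This sketch assumes a 30-day month and perfectly even traffic, so it gives an average rate; real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(impressions_per_month: float, days_in_month: int = 30) -> float:
    """Average impressions per second, assuming evenly spread traffic."""
    return impressions_per_month / (days_in_month * SECONDS_PER_DAY)


low = per_second(20e9)    # lower end of the estimate in the question
high = per_second(40e9)   # upper end of the estimate in the question

print(f"20 billion/month is about {low:,.0f} per second on average")
print(f"40 billion/month is about {high:,.0f} per second on average")
```

With these assumptions, 40 billion a month works out to a little over 15,400 impressions per second on average, and 20 billion to about 7,700.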
Scaling isn't going to be your problem - infrastructure period is going to be your problem. No matter what technology stack you choose, you are going to need an enormous amount of hardware - as others have said in the form of a farm or cloud.
This question (and the entire subject) is a bit subjective. You can write a dog slow program in any language, and host it on anything.
I think your best bet is to see how your current implementation works under load. Maybe just a few tweaks will make things work for you - but changing your underlying framework seems a bit much.
That being said - your infrastructure team will also have to be involved as it seems you have some serious load requirements.
Good luck!
I think that it is not a matter of language, but it can be a matter of database speed and CPU processing speed. Have you considered a web farm? This way you can have more than one machine serving your application. There are several ways to implement this solution. You can start with two servers and add more servers as the app requires more processing capacity.
On another point, Oracle 10g is a very good database server; in my humble opinion you only need a standalone Oracle server to handle the volume of requests. Remember that a SQL server is faster when people request more or less the same things each time, which is what happens in a web application if you plan your database schema carefully.
You should also check the existing ad server solutions; there are some very good ones. Just search Google for "Open Source AD servers".
PHP will be capable of serving your needs. However, as others have said, your first limits will be your network infrastructure.
But your second limits will be writing scalable code. You will need good abstraction and isolation so that resources can easily be added at any level. Things like a fast data-object mapper, multiple data caching mechanisms, separate configuration files, and so on.
What makes a site good for high traffic?
Does it have more to do with the hardware/infrastructure, or with how one writes the software, using Java as the example, if it matters?
I'm wondering how the software changes just because it is expected that billions of users will be on the site, if at all.
My understanding up to this point is that the code doesn't change, but that it is deployed on multiple servers, in a cluster, and a load balancer distributes the load, so really, on any one server/deployment, the application is just as any other standard application/website.
I highly recommend reading Jeff Atwood's blog post on micro-optimization. In previous posts he talks somewhat about how this site was created and the hardware upgrades it has had (in short, better hardware helps only to the extent that it is faster), but the real speed of a site comes from good programming, and that post should answer many of your programming questions quite well.
Hardware is cheap. Programming is expensive.
There are some programming techniques to make sure your code can handle multiple simultaneous views/updates. If you're using an existing framework, much of that work is (hopefully) done for you, but otherwise you're going to find stuff that worked for a few hundred hits an hour on one server isn't going to work when you're getting hundreds of thousands of hits and you have to deploy multiple load balancing machines.
Well, it is primarily an issue of hardware scaling but there are a few things to keep in mind with respect to the software involved in scaling. For example, if you are on a server farm, you'll need to work with a session management server (either via SQL Server or via a state server - which has implications in that your session variables need to be serializable).
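The serializability requirement can be illustrated with a small sketch. It is written in Python rather than ASP.NET purely for illustration: a dictionary of byte blobs stands in for the out-of-process store (SQL Server or a state server), and pickle stands in for the serializer. The class and key names are hypothetical.

```python
import pickle
import threading
from typing import Any, Dict


class SharedSessionStore:
    """In-memory stand-in for a session store shared by all web servers."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save(self, session_id: str, session: Dict[str, Any]) -> None:
        # Session data must be turned into bytes to leave the web server process.
        self._blobs[session_id] = pickle.dumps(session)

    def load(self, session_id: str) -> Dict[str, Any]:
        return pickle.loads(self._blobs[session_id])


store = SharedSessionStore()

# Plain data round-trips fine, so any server in the farm can pick it up.
store.save("abc", {"user_id": 7, "cart": ["sku-1", "sku-2"]})
restored = store.load("abc")

# An object tied to one process (here a lock) cannot be serialized.
try:
    store.save("bad", {"lock": threading.Lock()})
    lock_was_saved = True
except TypeError:
    lock_was_saved = False

print(restored, lock_was_saved)
```

Code that happily kept arbitrary objects in session on a single server will fail in the same way once session state moves out of process, so it is worth auditing what goes into the session before moving to a farm.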
But, in the bigger picture, there are a variety of things that you would want to do to scale to an enterprise level. For example, it becomes particularly important that you abstract out your database calls to a DAL because you may well need to adopt the use of a middleware package for high volume environments.
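A minimal sketch of what abstracting database calls behind a DAL buys you: page code depends only on an interface, so the implementation behind it can later be swapped for one that goes through middleware. All names here (UserStore, DirectUserStore, MiddlewareUserStore, greeting) are hypothetical, and the "middleware" is just a wrapper that records calls.

```python
from typing import Dict, List, Optional, Protocol


class UserStore(Protocol):
    """The only data-access surface the page code is allowed to see."""

    def find_name(self, user_id: int) -> Optional[str]: ...


class DirectUserStore:
    """Stand-in for code that talks to the database directly."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = dict(rows)

    def find_name(self, user_id: int) -> Optional[str]:
        return self._rows.get(user_id)


class MiddlewareUserStore:
    """Stand-in for access routed through a middleware layer; records each call."""

    def __init__(self, inner: UserStore) -> None:
        self._inner = inner
        self.calls: List[int] = []

    def find_name(self, user_id: int) -> Optional[str]:
        self.calls.append(user_id)
        return self._inner.find_name(user_id)


def greeting(store: UserStore, user_id: int) -> str:
    """Page-level code: unchanged no matter which store is plugged in."""
    name = store.find_name(user_id)
    return f"Hello, {name}!" if name is not None else "Hello, guest!"


direct = DirectUserStore({1: "Ada"})
routed = MiddlewareUserStore(direct)

print(greeting(direct, 1))
print(greeting(routed, 2))
```

The point of the design is that greeting never changes when the data path does; only the object passed in is different.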
SharePoint isn't the speediest of server applications, and I've read about a few tips to speed it up. What steps do you think are necessary to increase performance so it can be used to host a high-traffic site?
At the end of the day SharePoint is just a complicated web site with all the standard components.
In order to optimize performance you need to analyze each component and determine which one is a problem, and then adjust it accordingly.
We're in the process of implementing a SharePoint website for 1000 concurrent users, which may or may not count as large. Some of the steps we are taking are:
Implementing a detailed caching strategy, to cache webpart content intelligently.
Using load-balanced servers to ensure all our hardware is utilised rather than lying idle.
We've undertaken capacity planning based on the existing solution, so we have a good idea which component is the bottleneck for us (the SQL Server), and we will ensure the server can cope with expected load and future growth of the site.
We're also using hardware load balancers, which will ensure our network and the related servers operate as expected. Again, this is something to investigate before you implement a SharePoint website.
We're also ensuring our webparts don't generate unnecessary HTML and don't return unnecessary data, as this will slow down loading times.
Something which I definitely think is a good idea is to have a goal to work towards, as you can spend a huge amount of money and time optimizing SharePoint, which may prove unnecessary.
My additional best bets are:
use x64 to allow more RAM on your server
Make the best use of your application pool recycling http://blogs.msdn.com/joelo/archive/2007/10/29/sharepoint-app-pool-settings.aspx
Make sure all custom code properly disposes SPWeb and SPSite objects using this http://blogs.msdn.com/rogerla/archive/2008/02/12/sharepoint-2007-and-wss-3-0-dispose-patterns-by-example.aspx
utilize MS Capacity Planning Tool http://technet.microsoft.com/en-us/library/bb961988.aspx
Plan your site collection and database sizes. Keeping your databases and site collections under control will be key
GOVERNANCE GOVERNANCE GOVERNANCE - Plan for site size limits and expiration strategy. Old data should be deleted or archived for better performance. http://technet.microsoft.com/en-us/office/sharepointserver/bb507202.aspx
I cannot emphasize enough that proper early planning is essential for a successful SharePoint implementation.
In addition to caching and hardware, try to make sure that your master pages and page layouts are not unghosted, i.e. customized and stored in the database (requiring a database call to retrieve).
Do this by ensuring the files get released to the 12 hive in your solution.
Don't forget careful selection of the built-in cache settings (choose the right one for your situation).
Use the BLOBCache.
Use IIS Compression/caching (the defaults are not enough BTW).
Ensure your SQL box can keep up, especially during indexing/crawling. Splitting the Application roles (indexing vs search query and dedicated WFE for indexing/crawling) helps.
BTW if you're running VMware VMs for your WFEs, Windows NLB breaks (though not consistently), so use hardware load balancers or DNS round-robin, etc.
If you don't need more than 2 GB of RAM for the IIS application pool on a WFE, don't bother with 64-bit on the WFE.
Just my 2c.