how to store dynamically generated pages in html? - performance

I'm working on ASP.net MVC3 Web application that is facing scalability issue.
For improving performance I want to store dynamically generated pages in html and serve them from generated html directly rather then querying database for each page request.
I'm sure this will dramatically increase performance.
Can any one share any hint / example / tutorial on how to do it? And what are challenges?
I would also like to know how others are handling performance issue for large e-commerce sites with at-least thousand categories and 200k products with at least 200-500 concurrent visitors? What are the best approaches?
Thanks in advance.

You shoult have a look at Improving Performance with Output Caching.
It provides several ways to cache the output of your controllers like this:
[OutputCache(Duration=10, VaryByParam="none")]
public ActionResult Index()
{
return View();
}

You don't need to do that, just enable the output cache. It will be the same, instead of hitting all your logic for creating the pages you will be retrieving a static one, but from the cache instead of disk.

I would look at implementing a cache system for commonly viewed pages. The .Net platform has some really nice cache libraries that can be used that would allow you to manage the cache in real time.
Cache Best Practices
msdn.microsoft.com/en-us/library/aa478965.aspx
I might also take a look a writing a restful API that can be load balanced across multiple nodes of a cluster.

Do you know where your capacity bottleneck is?
My guess is your DB is the bottleneck, but unless you measure to find out where the bottleneck is you're likely to spend a lot of time optimising things that may not make much difference.
First things to do are get hold of a copy of PAL and monitor the web server and DB to see what it tells you.
Also run SQL queries to diagnose the most expensive and frequent queries.
Measure you're actually page generation times and follow it down the stack to see what calls are being made and which are expensive and can be cached.
Rather than output caching, I'd generally introduce a caching layer infront of the webservers, and caching layer between the app server and the DB but without measuring it's very hard to judge.
If you want to know more about caching in general, I'd read this answer I wrote a while back How to get started with web caching, CDNs, and proxy servers?

Related

How to speed up the TYPO3 Backend?

Given: Each call to a BE module takes several seconds even with a SSD drive. (A well configured setup runs below 1 second for general BE tasks.)
What are likely bottlenecks?
How to check for them?
What options to speed up?
On purpose I don't give a special configuration, but ask for a general checklist, so that the answer is suitable for many people as first entry point.
General tips on performance tuning for TYPO3 can be found here: https://wiki.typo3.org/Performance_tuning
However, in my experience most general performance problems are due to one of a few reasons:
Bad/no caching. Usually this is a problem with one or more extensions (partly) disabling cache. Try disabling all third party extensions and enabling them one by one to see which causes the site to slow down the most. $GLOBALS['TSFE']->set_no_cache() will disable all cache, so you could search for that. USER_INT and COA_INT in TypoScript also disable cache for anything that's configured inside there.
A lot of data. Check the database for any tables containing a lot of data. How many constitutes "a lot", depends on a lot of factors, but generally anything below a million records shouldn't be too much of a problem unless for example you do queries with things like LIKE '%...%' on fields containing a lot of data.
Not enough resources on the server. To fix this, add more memory and/or CPU cores to the server. Or if it's a shared server, reduce the number of sites running on it.
Heavy traffic. No matter how many resources a server has, it will always have a limit to the number of requests it can process in a given time. If this is your problem you will have to look into load balancing and caching servers. If you don't (normally) have a lot of visitors, high traffic can still be caused by robots crawling your site too quickly. These are usually easy to block on IP address in your firewall or webserver configuration.
A slow backend on a server without any other traffic (you're the only one who can access it) rules out 1 (can only cause a slow backend if users are accessing the frontend and causing a high server load) and 4 (no other traffic).
one further aspect you could inspect: in the user record a lot of things are stored, for example the settings you used in the log module.
one setting which could consume a lot of memory (and time to serialize and deserialize) is the state of the pagetree (which pages are expanded/ which are not).
Cleaning the user settings could make the backend faster for this user.
If you have a large page tree and the user has to navigate through many pages the effect will stall. another draw back: you loose all settings as there still is no selective cleaning.
Cannot comment here but need to say: The TSFE-Object does absolutely nothing in the TYPO3 Backend. The Backend is always uncached. The TYPO3-Backend is a standalone module to edit and maintenance the frontend output. There are tons of Google search results that will ignore this fact.
Possible performance bottlenecks are poor written extensions that do rendering or data processing. Hooks to core functions are usually no big deal but rendering of many elements for edit forms (especially in TYPO3s Fluid Template Engine) can cause performance problems.
The Extbase-DBAL-Layer can also cause massive performance problems. The reason is the database model does not know indexes. It' simple but stupid. A SQL-Join on a big table of 2000 records+ will delay the output perceptibly, depending on the data model.
Also TYPO3 Backend does not really depend on the Typoscript-Configuration but in effect to control some output or loaded by extensions, the full parsing of the *.ts files is needed. And this parser is very slow.
If you want to speed things up you need to know what goes wrong. The only way to debug this behaviour is to inspect the runtime with a PHP profiling tool like xdebug because the TYPO3 Framework is very complex. It's using some kind of Doctrine Framework and will load tons of files, by every request. Thus a good configured OpCache is a must.
Most reason the whole thing is slow is because it is poor written. You can confirm that fact by inspecting the runtime.
In addition to what already has been said, put the runtime environment onto your checklist:
Memory:
If heavy IDE and other tools are open at the same time, available memory can become an issue. To check the memory profile, you may start a tool that monitors the memory usage of the machine.
If virtualization is used, check the memory assigned to the box. Try if assigning more memory improves behaviour.
If required and possible spend more memory to your machine. This should not be a bugfix to poorly written code. Bad code can blow up any size of memory.
File access:
TYPO3 reads and writes thousands of files. If you work with a contemporary SSD, this is surprisingly fast. I did measure this. Loading all class files of TYPO3 takes just a fraction of a second.
However this may look different if you do not work with a standard setup. Many factors may slow you down:
USB-Sticks as storage.
Memory cards as storage.
All kind of external storage may be limited due to slow drivers.
Virtualization can become an issue. Again it's a question of drivers.
In doubt test and store your files and DB on a different drive to compere the behaviour.
Routing
The database itself may be fast. A bad routing of your request may still slow you down. Think of firewalls, proxies etc. even on your local machine and specially if virtualisation is used.
Database connection:
I fast database connection is crucial. If the database access is slow TYPO3 can't be fast.
Especially due to Extbase TYPO3 often queries much more data than really required and more often than really required, because a lot of relations are resolved in the PHP layer instead of the DB layer itself. Loading data structures like the root line may cause a lot of ping-pong between the PHP and the DB layer.
I can't give advice, how to measure your DB-connection. You have to as your admin for that. What you always can do is to test and compare with another DB from a completely different environment.
The speed of the database may depend on the type of the database itself. Typically you use MySQL/Maria-DB which should be fast. It also depends on the factors mentioned above, memory, file access and routing.
Strategy:
Even without being and admin and knowing all performance tools, you can always exchange parts of your system and check if matters improve. By this approach you can localise the culprit without being an expert. Once having spotted the culprit, Google may help you to get more information.
When it comes to a clean and performant setup of routing or virtualisation it's still the best idea to ask an experienced admin.
Summary
This is all in addition to what others have already pointed to.
What would be really helpful would be a BE-Plugin, that analyses and measures the environment. May there are some out there I don't know.

Is memcache(d) necessary when using Cloudflare/Incapsula

If you need caching in your website to make database use lower, do you have to do it using memcache or memcached (in PHP, for example) or can you achieve this by using professional services like CloudFlare, Incapsula or others like that do some caching for you?
Services like Cloudflare cache your HTML and/or assets like images and CSS files in a CDN, so that your entire server is hit less often. This is great for semi-static sites but may not be the best fit for highly dynamic sites.
Local caches like memcached just store any data in a way that's fast to access. You can use that to cache database queries and lower your database activity, but you can also use it to store pre-computed data that would be expensive to re-create all the time or whatever else you may want to store non-permanently in a fast-to-access way.
Both solutions solve different problems. You may use both together, or either, or neither. It really depends on where exactly your bottleneck is and which solution fits your problem better.
I'm the CEO of CloudFlare and I'd say: more (intelligent) caching is almost always a good thing. While we can significantly decrease the load coming to your web server, to get the best performance it's still extremely important to optimize your web application and it's interaction with your database. To that end, memcache and other fast caching layers can play an important role and I'd never discourage them.
PS - we work great with dynamic sites. 95%+ of our sites are highly dynamic web applications.

Client-side caching in Rich Internet Applications

I'm starting to step into unfamiliar territory with regards to performance improvement and our RIA (Rich Internet Application) built with GWT. For those unfamiliar with GWT, essentially when deployed it's just pure JavaScript. We're interfacing with the server side using a REST-style XML web service via XMLHttpRequest.
Our XML is un-marshalled into JavaScript objects and used within the application to represent the data model behind the interface. When changes occur, the model is updated and marshalled back to XML and sent back to the server.
I've learned the number one rule of performance (in terms of user experience) is to make as few requests as possible. Obviously this brings up the possibility of caching. Caching is great for static data but things get tricky in a multi-user system where data on the server may be changing. Also, use of "Last-Modified" and "If-Modified-Since" requests don't quite do enough since we'd like to avoid unnecessary requests altogether.
I'm trying to figure out if caching data in the browser is even right for us before researching the approaches. I hope someone has tread this path before. I'm looking for similar approaches, lessons learned, things to avoid, etc.
I'm happy to provide more specific info if needed...
For GWT, if performance matters that much to you, you get better performance by sending all the data you need in a single request, instead of querying multiple small data. I would recommend against client-side data caching as there are lots of issues like keeping the data in sync with the database.
Besides, you already have a good advantage with GWT over traditional html apps. Unless you are dealing with special data (eg: does not become stale too quickly - implies mostly-read queries) I found out that there is no special need for caching. You are better off doing a service-layer caching, since most of the time should come of server-side processing.
If you can provide more details about the nature of the app, maybe some different conclusions can be taken.

Scalability and Performance of Web Applications, Approaches?

What various methods and technologies have you used to successfully address scalability and performance concerns of a website? I am an ASP.NET web developer exploring .NET remoting with WCF with SQL clustering and am curious as to what other approaches exist (such as the ‘cloud’). In which cases would you apply various approaches (for example method a for roughly x many ‘active’ users).
An example of what I mean, a myspace case study: http://highscalability.com/myspace-architecture
This is a very broad question making it difficult to answer, but I'll try and provide a few general suggestions.
1 - Unless you are doing some things seriously wrong then you'll likely not need to worry about perf or scale until you hit a significant amount of traffic (over 1 million page views a month).
2 - Your biggest performance problems initially are likely to be the page load times from other countries. Try the Gomez Instance Site Test to see the page load times from around the world, and use YSlow as a guide for optimizing.
3 - When you do start hitting performance problems it will first most likely be due to the database work. Use the SQL Server Profiler to examine your SQL traffic looking for long running queries to try optimizing, and also use dm_db_missing_index_details to look for indexes you should add.
4 - If your web servers start becoming the performance bottleneck, use a profiler to (such as the ANTS Profiler) to look for ways to optimize your web pages code.
5 - If your web servers are well optimized and still running too hot, look for more caching opportunities, but you're probably going to need to simply add more web servers.
6 - If your database is well optimized and still running too hot, then look at adding a distributed caching system. This probably won't happen until you're over 10 million page views a month.
7 - If your database is starting to get overwhelmed even with distributed caching, then look at a sharding architecture. This probably won't happen until you're over 100 million page views a month.
I've worked on a few sites that get millions/hits/month. Here are some basics:
Cache, cache, cache. Caching is one of the simplest and most effective ways to reduce load on your webserver and database. Cache page content, queries, expensive computation, anything that is I/O bound. Memcache is dead simple and effective.
Use multiple servers once you are maxed out. You can have multiple web servers and multiple database servers (with replication).
Reduce overall # of request to your webservers. This entails caching JS, CSS and images using expires headers. You can also move your static content to a CDN, which will speed up your user's experience.
Measure & benchmark. Run Nagios on your production machines and load test on your dev/qa server. You need to know when your server will catch on fire so you can prevent it.
I'd recommend reading Building Scalable Websites, it was written by one of the Flickr engineers and is a great reference.
Check out my blog post about scalability too, it has a lot of links to presentations about scaling with multiple languages and platforms:
http://www.ryandoherty.net/2008/07/13/unicorns-and-scalability/
There is velocity from MS as well as MEMCache has a port to .NET now and also indeXus.Net

Does Ajax detoriate performance?

Does excess use of AJAX affects performance? In context of big size web-applications, how do you handle AJAX requests to control asynchronous requests?
excess use of anything degrades performance; using AJAX where necessary will improve performance, especially if the alternative is a complete full-page round-trip to the server [a 'postback' in asp.net terminology]
There are two sides to this story.
AJAX generally improves the performance from the client's perspective. Rather than loading an entire page, a smaller amount of data is requested from the server when it is needed. Given that a HTML page often references many dependent files (images, css, javascript,etc, each requiring a hit from the server (or the cache)) the client performance from judicious use of AJAX can be remarkable.
On the server-side, the issue becomes one of having many more connections to manage. Polling applications, such as in-browser chat in particular, can really start to increase the load on the server because the browser is now hitting the server much more rapidly. In a typical dynamic application (where the response is generated by code rather than from a static file) you may start running into issues - but these are generally balanced by the fact that the complexity of your request is often much lower (again, you aren't generating the entire page but a small subset of the page) and so therefore your platform can probably get a higher throughput in any case.
The exact outcome of any performance issue is going to depend on a number of factors including your server, platform, framework, and prevailing climactic conditions at the time.
My ultimate advice - focus on creating a good user experience, develop intelligently, collect as many metrics as you can and optimise when you know you need it.
AJAX itself (being asynchronous requests).. No not generally.
However if you have an abundance of javascript and markup and have large amounts of data transferred via your xmlhttprequests then yes you can see a performance hit. It really depends on how you want your website to function any degredation is generally avoidable if sculpted correctly.
Performance of what exactly? I'm going to assume you meant performance of an application in terms of user experience.
What Ajax appears to be best at is causing network traffic only when it's needed. Rather than downloading a honkin' great web page in one hit, it downloads only what's needed in as quick a manner as possible.
Then, if you do something that needs more info, it goes and gets it from the network then.
This means unused stuff is never downloaded (if you design it right, of course - bad code can be written in Ajax as much as any other environment).
I prefer to mix Ajax methods for data transfer and a client-side library like jQuery for pretty interface.
Depending on the situation, AJAX may have a performance overhead or it can actually have better performance than an equivelantly functioning web site that doesn't use AJAX.
It's very easy to overuse AJAX to overload the server with tons of frivilous requests and it can also be a burden on the client's CPU. Conversely, AJAX can also be used to deliver small bits of HTML and other code rather than a whole page for each request, which is at least less of a burden on the server.
Ajax is just an ordinary HTTP request, so as long as your server can handle those requests it won't be a problem. The upside to Ajax is faster perceived performance by the user, since the page doesn't have to reload and redraw itself for every user action.
If scalability is a concern, I'm sure you are also looking at scaling the system horizontally by adding more web servers to the farm. Same goes with even non-Ajax web apps anyway.
AJAX, like any technology can be a good thing or a bad thing depending on the situation and how it is implemented. If you have a specific need for the asynchronous process then it is a good tool to use. However, if you use it irresponsibly you can get into trouble. If you do use it, try to find a good framework that does most of the heavy lifting and be aware of some of the downsides of AJAX...
http://learningremix.net/w2007integ/vangoori/2007/01/the_downsides_of_ajax.shtml
I would agree with quite a few other posts in here. If you are using it in an intelligent way (ie, not using ajax every 30 seconds), then it will be fine. I use ajax on my website (and there is also a js free version) and from a clients perspective, the ajax version loads at anywhere from near-equal speeds to four times faster. It all depends on the design (graphics and other content) of the website and what you are updating.
The downside is, since you have to load some frameworks (even if you create your own like I have) you will have a bit slower of a load for the first page, or any full refreshes, and it does increase the processing load a bit. But that is just because the ajax has increased productivity and therefore the user can make more requests/updates
If the site is busy then it will, eventually, kill the server, unless your in a farm.
As to the site itself it shouldn't.

Resources