Designing XPages applications for large user populations and high performance

The following questions were posed by a customer who is about to write a large-scale XPages application. While I think the questions are actually too broad for the Stack Overflow format, they are interesting, and the collective knowledge of the experts here could yield better results than one person answering them:
How many concurrent users can use XPages applications on one Lotus Domino server (there are several applications on the server, not just one)?
How can we detect and analyze memory leaks on a Lotus Domino server when running XPages applications?
How can we write XPages the right way to achieve the best performance and avoid memory leaks?
What coding methods and objects should not be used?
What are typical errors when a LotusScript developer begins to write code for XPages? What are the best practices?
How can we build a centralized, consolidated application on XPages for 10,000 - 15,000 users? How many servers do we need? How should the XPages application be configured in that case?
How do we load-balance users?
I will provide my insights; please share yours.

How long is a string? It depends on how the server is configured, and "application" could be a single form or hundreds. Only a test can tell. In general: build a high-performance server, preferably with a 64-bit architecture and lots of RAM, and make that RAM available to the JVM. If the applications use attachments, use DAOS and put it on a separate disk - and of course make sure you have the latest version of Domino (8.5.3 FP1 at the time of this writing).
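For example, on a 64-bit Domino 8.5.x server the Java heap available to the XPages runtime can be raised via notes.ini. This is only a sketch; the 1024M figure is a placeholder to be sized against your actual hardware and the other tasks on the server:

    HTTPJVMMaxHeapSize=1024M
    HTTPJVMMaxHeapSizeSet=1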
There is the XPages Toolbox that includes a memory and CPU profiler.
It depends on the type of application. Clever use of the scopes for caching, and Expression Language and beans instead of SSJS. You leak memory when you forget .recycle(). Hire an experienced lead developer and read the available XPages books. Consider threading off longer-running code so users don't need to wait.
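To illustrate the scopes-and-beans point, here is a minimal sketch of a hypothetical managed bean (the bean name, cache key and lookup are all invented for the example) that caches an expensive lookup in applicationScope; on the XPage it would be bound with EL such as #{keywordProvider.categories} instead of an SSJS lookup:

    import java.io.Serializable;
    import java.util.List;
    import java.util.Map;
    import javax.faces.context.FacesContext;

    // Hypothetical bean, registered in faces-config.xml under the name "keywordProvider".
    public class KeywordProvider implements Serializable {
        private static final long serialVersionUID = 1L;

        // Bound from the XPage with EL, e.g. value="#{keywordProvider.categories}"
        @SuppressWarnings("unchecked")
        public List<String> getCategories() {
            FacesContext ctx = FacesContext.getCurrentInstance();
            Map<String, Object> appScope = (Map<String, Object>) ctx.getApplication()
                    .getVariableResolver().resolveVariable(ctx, "applicationScope");

            List<String> cached = (List<String>) appScope.get("keyword.categories");
            if (cached == null) {
                cached = loadCategoriesFromView();          // expensive Domino lookup, done once
                appScope.put("keyword.categories", cached); // later requests hit the cache
            }
            return cached;
        }

        private List<String> loadCategoriesFromView() {
            // placeholder for the real view lookup (remember to recycle the Domino handles in here)
            return java.util.Arrays.asList("Sales", "Support", "Development");
        }
    }

Because the result lives in applicationScope, the lookup runs once per application instead of once per page request.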
Depends on your needs. The general lessons of Domino development apply when it comes to database operations, so FTSearch over database.search, scope usage over @DbColumn for parameters, and EL over SSJS.
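A small illustration of the FTSearch point in Java, assuming the database has a full-text index (the helper class and the example query are placeholders):

    import lotus.domino.Database;
    import lotus.domino.DocumentCollection;
    import lotus.domino.NotesException;

    // Counts matches via the full-text index rather than a formula-based db.search(),
    // which would scan every document in the database.
    public class FtCounter {

        public int countMatches(Database db, String query) throws NotesException {
            DocumentCollection results = db.FTSearch(query); // e.g. "[Form] = \"Order\" AND smith"
            int count = results.getCount();
            results.recycle(); // always recycle Domino handles
            return count;
        }
    }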
Typical errors include: all code in the XPages (use script libraries instead); too many @DbLookup and @DbColumn calls instead of scopes; validation in buttons instead of validators; violation of decomposition principles; forgetting to use .recycle(); designing applications "like old Notes screens" instead of single-page interaction; too little use of partial refresh; no use of caching; and too little object orientation (creating function graveyards in script libraries).
This is a summary of questions 1-5; nothing new to answer.
When clustering Domino servers for XPages and putting a load balancer in front, the load balancer needs to be configured to keep a session on the same server, so partial refreshes and Ajax calls reach the server that has the component tree rendered for that user.

It depends on the server setup. I have, for example, an XPages extranet with 12,000 registered users spanning approximately 20 XPages applications. That runs on one Windows 2003 server with 4 GB of RAM and a quad-core CPU. The data amount is about 60 GB across these 20 applications. No DAOS, no beans, just SSJS - and performance is excellent. When I upgrade this installation to 64-bit and DAOS, the applications will scale even more. So 64-bit and lots of RAM are the key to a lot of users.
I haven't done anything around this
Make sure to recycle when you do document loops (a recycling ViewNavigator loop is sketched after the next point). Use the openntf.org debug toolbar; it will save a lot of time until we have a debugger for XPages.
Always remember that whatever you are doing will be done by several users at once, so try to cut down the number of lookups and getDocumentByKey/getEntryByKey calls. Try to use ViewNavigator when you can.
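Tying the last two points together, a minimal Java sketch of a ViewNavigator loop that recycles as it goes; the view name is a placeholder and the same pattern applies in SSJS:

    import lotus.domino.Database;
    import lotus.domino.NotesException;
    import lotus.domino.View;
    import lotus.domino.ViewEntry;
    import lotus.domino.ViewNavigator;

    // Walks a view with ViewNavigator and recycles every handle it creates.
    public class ViewWalker {

        public int countDocuments(Database db) throws NotesException {
            View view = db.getView("ByExpiryDate"); // placeholder view name
            view.setAutoUpdate(false);              // don't refresh the view while reading it
            ViewNavigator nav = view.createViewNav();
            int count = 0;
            ViewEntry entry = nav.getFirst();
            while (entry != null) {
                if (entry.isDocument()) {
                    count++; // read column values here instead of opening each document
                }
                ViewEntry next = nav.getNext(entry);
                entry.recycle();                    // recycle the old handle before moving on
                entry = next;
            }
            nav.recycle();
            view.recycle();
            return count;
        }
    }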
It all depends on how many users use the system concurrently. If you have 10,000 - 15,000 concurrent users, then you have to look at what the applications do and how many users will use the same application at the same time.
Those are my insights into the question.

Related

How many users can a single small Windows Azure web role support?

We are creating a simple website, but with heavy user logins (about 25,000 concurrent users). How can I calculate the number of instances required to support it?
Load testing and performance testing are really the only way you're going to figure out the performance metrics and instance requirements of your app. You'll need to define "concurrent users" - does that mean 25,000 concurrent transactions, or does that simply mean 25,000 active sessions? And if the latter, how frequently does a user visit web pages (e.g. think-time between pages)? Then, there's all the other moving parts: databases, Azure storage, external web services, intra-role communication, etc. All these steps in your processing pipeline could be a bottleneck.
Don't forget SLA: Assuming you CAN support 25,000 concurrent sessions (not transactions per second), what's an acceptable round-trip time? Two seconds? Five?
When thinking about instance count, you also need to consider VM size in your equation. Depending again on your processing pipeline, you might need a medium or large VM to support specific memory requirements, for instance. You might get completely different results when testing different VM sizes.
You need to have a way of performing empirical tests that are repeatable and remove edge-case errors (for instance: running tests a minimum of 3 times to get an average; and methodically ramping up load in a well-defined way and observing results while under that load for a set amount of time to allow for the chaotic behavior of adding load to stabilize). This empirical testing includes well-crafted test plans (e.g. what pages the users will hit for given usage scenarios, including possible form data). And you'll need the proper tools for monitoring the systems under test to determine when a given load creates a "knee in the curve" (meaning you've hit a bottleneck and your performance plummets).
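Purely as an illustration of such a ramp-up, here is a minimal Java sketch; the URL, step sizes and durations are placeholders, and a real test plan would script whole usage scenarios (including form data and think-time) rather than a single GET:

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicLong;

    // Ramps virtual users up in well-defined steps and reports the average response time per step.
    public class LoadRamp {
        public static void main(String[] args) throws Exception {
            String target = "http://test-server.example.com/app"; // placeholder URL
            for (int users = 10; users <= 100; users += 10) {      // ramp in steps of 10 users
                AtomicLong totalMillis = new AtomicLong();
                AtomicLong requests = new AtomicLong();
                ExecutorService pool = Executors.newFixedThreadPool(users);
                long stepEnd = System.currentTimeMillis() + 60_000; // hold each load level for 60s
                for (int u = 0; u < users; u++) {
                    pool.submit(() -> {
                        while (System.currentTimeMillis() < stepEnd) {
                            long start = System.currentTimeMillis();
                            try {
                                HttpURLConnection con = (HttpURLConnection) new URL(target).openConnection();
                                con.getResponseCode();  // fetch the page
                                con.disconnect();
                            } catch (Exception ignored) { }
                            totalMillis.addAndGet(System.currentTimeMillis() - start);
                            requests.incrementAndGet();
                        }
                    });
                }
                pool.shutdown();
                pool.awaitTermination(5, TimeUnit.MINUTES);
                System.out.printf("%d users: avg %.0f ms over %d requests%n",
                        users, (double) totalMillis.get() / Math.max(1, requests.get()), requests.get());
            }
        }
    }

Running each step at least three times and averaging, as described above, smooths out the noise before you read anything into the numbers.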
Final thought: Be sure your load-generation tool is not the bottleneck during the test! You might want to look into using Microsoft's load-testing solution with Visual Studio, or a cloud-based load-test solution such as Loadstorm (disclaimer: Loadstorm interviewed me about load/performance testing last year, but I don't work for them in any capacity).
EDIT June 21, 2013 Announced at TechEd 2013, Team Foundation Service will offer cloud-based load-testing, with the Preview launching June 26, coincident with the //build conference. The announcement is here.
No one can answer this question without a lot more information... like what technology you're using to build the website, what happens on a page load, what backend storage is used (if any), etc. It could be that for each user who logs on, you compute a million digits of pi, or it could be that for each user you serve up static content from a cache.
The best advice I have is to test your application (either in the cloud or equivalent hardware) and see how it performs.
It all depends on the architecture design, persistence technology and number of read/write operations you are performing per second (average/peak).
I would recommend looking into CQRS-based architectures for this kind of application. They fit cloud computing environments and allow for elastic scaling.
I was recently at a Cloud Summit and there were a few case studies. The one that sticks in my mind is an exam app: it has a burst load of about 80,000 users over 2 hours, for which they spin up about 300 instances.
Without knowing your load profile it's hard to add more value; just keep in mind that concurrent and continuous are not the same thing. Remember the Stack Overflow versus Digg debacle (http://twitter.com/#!/spolsky/status/27244766467)?

Best scaling methodologies for a high-traffic web application?

We have a new project for a web app that will display banner ads on websites (as a network), and our estimate is for it to handle 20 to 40 billion impressions a month.
Our current language is ASP... but we are moving to PHP. Does PHP 5 have limits when scaling a web application? Or should I have our team invest in picking up JSP?
Or is it a matter of the app server and/or DB? We plan to use Oracle 10g as the database.
No offense, but I strongly suspect you're vastly overestimating how many impressions you'll serve.
That said:
PHP or other languages used in the application tier really have little to do with scalability. Since the application tier delegates its state to the database or equivalent, it's straightforward to add as much capacity as you need behind appropriate load balancing. The choice of language does influence per-server efficiency and hence costs, but that's different from scalability.
It's scaling the state/data storage that gets more complicated.
For your app, you have three basic jobs:
what ad do we show?
serving the ad
logging the impression
Each of these will require thought and likely different tools.
The second, serving the ad, is the simplest: use a CDN. If you actually serve the volume you claim, you should be able to negotiate favorable rates.
Deciding which ad to show is going to be very specific to your network. It may be as simple as reading a few rows from a database that give the ad placements for a given property for a given calendar period, or it may be complex contextual advertising like Google's. Assuming it's more the former, and that the database of placements is small, this is the simple task of scaling database reads. You can use replication trees or, alternatively, a caching layer like memcached.
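For instance, a caching layer in front of the placement reads might look roughly like this sketch, assuming the spymemcached Java client; the host, key layout and TTL are placeholders:

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    // Caches the small, read-mostly placement data in front of the database.
    public class PlacementCache {
        private final MemcachedClient cache;

        public PlacementCache() throws Exception {
            cache = new MemcachedClient(new InetSocketAddress("memcache-host.example.com", 11211));
        }

        public String getPlacements(String propertyId) {
            String key = "placements:" + propertyId;
            String cached = (String) cache.get(key);             // try the cache first
            if (cached == null) {
                cached = loadPlacementsFromDatabase(propertyId); // slow path: hit the database
                cache.set(key, 300, cached);                     // keep it for five minutes
            }
            return cached;
        }

        private String loadPlacementsFromDatabase(String propertyId) {
            return "banner-123,banner-456"; // placeholder for the real query
        }
    }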
The last will ultimately be the most difficult: how to scale the writes. A common approach would be to still use databases, but to adopt a sharding scaling strategy. More exotic options might be to use a key/value store supporting counter instructions, such as Redis, or a scalable OLAP database such as Vertica.
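On the write side, a counter-based sketch assuming the Jedis client for Redis; the per-day key layout is an assumption, and the counters would be flushed to permanent storage in batches rather than writing one database row per impression:

    import redis.clients.jedis.Jedis;

    // Counts impressions with Redis INCR instead of a per-impression database write.
    public class ImpressionCounter {

        public long recordImpression(Jedis redis, String adId, String day) {
            // e.g. key "imp:2012-06-01:ad-42" -> one counter per ad per day
            return redis.incr("imp:" + day + ":" + adId);
        }
    }

Usage would look like new ImpressionCounter().recordImpression(new Jedis("redis-host.example.com"), "ad-42", "2012-06-01"); a periodic job would then read and reset the counters into the reporting database.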
All of the above assumes that you're able to secure data center space and network provisioning capable of serving this load, which is not trivial at the numbers you're talking.
You do realize that 40 billion per month is roughly 15,500 per second, right?
Scaling isn't going to be your problem - infrastructure period is going to be your problem. No matter what technology stack you choose, you are going to need an enormous amount of hardware - as others have said in the form of a farm or cloud.
This question (and the entire subject) is a bit subjective. You can write a dog slow program in any language, and host it on anything.
I think your best bet is to see how your current implementation works under load. Maybe just a few tweaks will make things work for you - but changing your underlying framework seems a bit much.
That being said - your infrastructure team will also have to be involved as it seems you have some serious load requirements.
Good luck!
I think that it is not a matter of language; it can be a matter of database speed as much as CPU processing speed. Have you considered a web farm? That way you can have more than one machine serving your application. There are several ways to implement this solution. You can start with two servers and add more as the app requires more processing volume.
On another point, Oracle 10g is a very good database server; in my humble opinion you only need a standalone Oracle server to handle the volume of requests. Remember that a SQL database is faster when people request more or less the same things each time, which is what happens in a web application if you plan your database schema carefully.
You should also check out the existing ad-server solutions - there are some very good ones; just try Google with "open source ad servers".
PHP will be capable of serving your needs. However, as others have said, your first limits will be your network infrastructure.
But your second limits will be writing scalable code. You will need good abstraction and isolation so that resources can easily be added at any level. Things like a fast data-object mapper, multiple data caching mechanisms, separate configuration files, and so on.

About dynamics CRM performance

My boss asked me to do some research on the CMSes available on the market, because the CMS we are using currently is rather a mess.
For me as a .NET developer it would be great to choose and implement Dynamics CRM because of its extensibility and perfect integration with the .NET environment and well-known tools.
All the marketing sounds great, but I'd like to know about common DISADVANTAGES and ISSUES concerning this system.
The most important question is how it performs in a company with about 150 concurrent and very active users. I heard that it's really slow compared to competitors' systems.
The Dynamics CRM Product team has published an excellent whitepaper with guidance and benchmarks for 500 concurrent users. You can learn a lot by studying this paper. The link is here:
Microsoft Dynamics CRM 4.0 Suggested Hardware for Deployments of up to 500 Concurrent Users
I can't answer regarding the number of users/activity, but I can refer you to the SDK article 'Performance Best Practices'. I'll speak to the side of you that would be writing plugins (registered on data access messages), custom pages accessing the CRM web services, and SSRS reports. A couple of points I can relate to:
Disable Plug-ins. This is an attractive and major integration point into CRM. The fact that they list it as a performance issue is disheartening. We have seen OutOfMemory exceptions stemming from the plugin cache. We got around this issue by deploying to disk rather than database. In the database they reload the assembly and confirm the signature every time a plugin is called. We believe this was eating up the Large Object Heap. Probably not an issue for your normal CRM implementation.
Limit Data Retrieved. Definitely. Avoid lookups/picklists/bits you don't need when you can as these cause an extra join. Not going to be a huge deal on smaller entities. But if you need entities with a large number of attributes it could be. Probably not an issue for normal CRM customization. A good design in other cases should avoid this issue.
I can't really offer any advice on how it compares to its leading competitors. I know the main thing is that its cheaper and very actively developed.
I can say a bit about performance though which might help.
We have about 400 - 600 concurrent users on the system. The system isn't particularly web-server intensive. We have two web servers for resilience - it would be a disaster if the system went offline, but these servers are never taxed. They have a couple of virtual cores and 4 GB of RAM each.
Our database is 130 GB in size and is hosted on a 24-core database server with 48 GB of RAM. It is clustered, but because SQL Server can't handle two active nodes, only one server is ever active.
The database server really never gets maxed out. However, there is one very important change we needed to make, and one that I think MS is now advising all users of large CRM installs to make. By default SQL Server has a locking mode that will block people writing to the database while a row is being read. In our system (and numerous others, apparently) that was causing huge issues.
We switched on a different mode (I think it's called "snapshot isolation" or something like that). To be fair, though, even if you did have 200 concurrent users, it won't be an issue until the more central tables like activitypointer and account get pretty large (in the millions of rows).
So there is no doubt that CRM 2011 can handle that many users, as long as you have suitable hardware and someone who understands SQL Server.
HTH
S

Scalability and Performance of Web Applications, Approaches?

What various methods and technologies have you used to successfully address the scalability and performance concerns of a website? I am an ASP.NET web developer exploring .NET remoting with WCF and SQL clustering, and I am curious what other approaches exist (such as the ‘cloud’) and in which cases you would apply the various approaches (for example, method A for roughly X ‘active’ users).
An example of what I mean, a myspace case study: http://highscalability.com/myspace-architecture
This is a very broad question making it difficult to answer, but I'll try and provide a few general suggestions.
1 - Unless you are doing something seriously wrong, you likely won't need to worry about perf or scale until you hit a significant amount of traffic (over 1 million page views a month).
2 - Your biggest performance problems initially are likely to be the page load times from other countries. Try the Gomez Instance Site Test to see the page load times from around the world, and use YSlow as a guide for optimizing.
3 - When you do start hitting performance problems, it will most likely be due to the database work first. Use the SQL Server Profiler to examine your SQL traffic, looking for long-running queries to optimize, and also use the sys.dm_db_missing_index_details DMV to look for indexes you should add.
4 - If your web servers start becoming the performance bottleneck, use a profiler (such as the ANTS Profiler) to look for ways to optimize your web page code.
5 - If your web servers are well optimized and still running too hot, look for more caching opportunities, but you're probably going to need to simply add more web servers.
6 - If your database is well optimized and still running too hot, then look at adding a distributed caching system. This probably won't happen until you're over 10 million page views a month.
7 - If your database is starting to get overwhelmed even with distributed caching, then look at a sharding architecture. This probably won't happen until you're over 100 million page views a month.
I've worked on a few sites that get millions of hits per month. Here are some basics:
Cache, cache, cache. Caching is one of the simplest and most effective ways to reduce load on your webserver and database. Cache page content, queries, expensive computation, anything that is I/O bound. Memcache is dead simple and effective.
Use multiple servers once you are maxed out. You can have multiple web servers and multiple database servers (with replication).
Reduce the overall number of requests to your web servers. This entails caching JS, CSS and images using expires headers (see the sketch after this list). You can also move your static content to a CDN, which will speed up your users' experience.
Measure & benchmark. Run Nagios on your production machines and load test on your dev/qa server. You need to know when your server will catch on fire so you can prevent it.
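As a rough sketch of the expires-header point above, assuming a Java servlet stack (the one-year lifetime and the /static/* mapping it would get in web.xml are just examples):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Adds far-future caching headers to static assets so browsers stop re-requesting them.
    public class StaticCacheFilter implements Filter {
        private static final long ONE_YEAR_SECONDS = 365L * 24 * 60 * 60;

        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletResponse response = (HttpServletResponse) res;
            response.setHeader("Cache-Control", "public, max-age=" + ONE_YEAR_SECONDS);
            response.setDateHeader("Expires", System.currentTimeMillis() + ONE_YEAR_SECONDS * 1000);
            chain.doFilter(req, res);
        }

        public void init(FilterConfig filterConfig) { }
        public void destroy() { }
    }

Mapped to the static asset paths, this lets browsers and the CDN serve repeat views without touching your web servers.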
I'd recommend reading Building Scalable Websites, it was written by one of the Flickr engineers and is a great reference.
Check out my blog post about scalability too; it has a lot of links to presentations about scaling with multiple languages and platforms:
http://www.ryandoherty.net/2008/07/13/unicorns-and-scalability/
There is Velocity from MS, Memcached now has a port to .NET, and there is also indeXus.Net.

What steps do you take to increase performance of a Sharepoint site?

SharePoint isn't the speediest of server applications, and I've read about a few tips to speed it up. What steps do you think are necessary to increase performance so it can be used to host a high-traffic site?
At the end of the day SharePoint is just a complicated web site with all the standard components.
In order to optimize performance you need to analyze each component and determine which one is a problem, and then adjust it accordingly.
We're in the process of implementing a SharePoint website for 1,000 concurrent users, which may or may not be large; however, some steps we are taking are:
Implementing a detailed caching strategy, to cache web part content intelligently.
Using load-balanced servers to ensure all our hardware is utilised rather than lying idle.
We've undertaken capacity planning for the existing solution, so we have a good idea which component is the bottleneck for us (the SQL Server), and we will ensure the server can cope with the expected load and future growth of the site.
We're also using hardware load balancers to ensure our network and the related servers operate as expected; again, this is something to investigate before you implement a SharePoint website.
We're also ensuring our web parts don't generate unnecessary HTML and don't return unnecessary data, as this will slow down loading times.
Something which I definitely think is a good idea is to have a goal to work towards, as you can spend a huge amount of money and time optimizing SharePoint, which may prove unnecessary.
My additional best bets are:
use x64 to allow more RAM on your server
Make the best use of your application pool recycling http://blogs.msdn.com/joelo/archive/2007/10/29/sharepoint-app-pool-settings.aspx
Make sure all custom code properly disposes SPWeb and SPSite objects using this http://blogs.msdn.com/rogerla/archive/2008/02/12/sharepoint-2007-and-wss-3-0-dispose-patterns-by-example.aspx
utilize MS Capacity Planning Tool http://technet.microsoft.com/en-us/library/bb961988.aspx
Plan your site collection and database sizes. Keeping your databases and site collections under control will be key
GOVERNANCE GOVERNANCE GOVERNANCE - Plan for site size limits and expiration strategy. Old data should be deleted or archived for better performance. http://technet.microsoft.com/en-us/office/sharepointserver/bb507202.aspx
I cannot emphasize enough that proper early planning is essential for a successful SharePoint implementation.
In addition to caching and hardware, try to make sure that your master pages and page layouts are not customized (unghosted) into the content database, which would require a database call to retrieve them.
Do this by ensuring the files stay deployed to the 12 hive in your solution.
Don't forget careful selection of the built-in cache settings (choose the right one for your situation).
Use the BLOBCache.
Use IIS Compression/caching (the defaults are not enough BTW).
Ensure your SQL box can keep up, especially during indexing/crawling. Splitting the Application roles (indexing vs search query and dedicated WFE for indexing/crawling) helps.
BTW if you're running VMWare VMs for your WFEs, Windows NLB breaks (though not consistently), so use hardware NLBs or DNS round-robin, etc.
If you don't need more than 2 GB of RAM for the IIS application pool on a WFE, don't bother with 64-bit on the WFE.
Just my 2c.
