We delivered a successful project a few days back and now we need to make some performance improvements in our WCF Restful API.
The project uses the following tools/technologies:
1- LINQ
2- Entity Framework
3- Enterprise library for Logging/Exception handling
4- MS SQL 2008
5- Deployed on IIS 7
A few things to note
1- 10-20 queries have more than 7 table joins in LINQ
2- The current IIS has more than 10 applications deployed
3- The entity framework has around 60 tables
4- The WCF api is using HTTPS
5- All API calls return JSON responses
The general flow is
1- WCF call is received
2- Session is checked
3- Function from BL layer is called
4- Function from DA layer is called
5- Response returned in JSON
Based on my limited knowledge and research, I think the following might improve performance:
1- Implement caching for reference data
2- Move LINQ queries with more than 3 joins to stored procedure (and use hints maybe?)
3- Database table re-indexing
4- Use performance counters to find the problem areas
5- Move functions with more than 3 update/delete/inserts to stored procedure
Can you point out any issues with the above improvements? And what other improvements can I make?
Your post is missing some background on your improvement suggestions. Are they just guesses or have you actually measured and identified them as problem areas?
There really is no substitute for proper performance monitoring and profiling to determine which area you should focus on for optimizations. Everything else is just guesswork, and although some things might be obvious, it's often the not-so-obvious things that actually improve performance.
Run your code through a performance profiling tool to quickly identify problem areas inside the actual application. If you don't have access to the Visual Studio Performance Analyzer (Visual Studio Premium or Ultimate), take a look at PerfView which is a pretty good memory/CPU profiler that won't cost you anything.
Use a tool such as MiniProfiler to be able to easily set up measuring points, as well as monitoring the Entity Framework execution at runtime. MiniProfiler can also be configured to save the results to a database which is handy when you don't have a UI.
Analyzing the generated T-SQL statements from the Entity Framework, which can be seen in MiniProfiler, should allow you to easily measure the query performance by looking at the SQL execution plans as well as fetching the SQL IO statistics. That should give you a good overview of what can/should be put into stored procedures and if you need any other indexes.
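To illustrate the idea of measuring points in a language-neutral way, here is a minimal Python sketch (not MiniProfiler itself, and not .NET code). The `step` helper, the step names, and the `time.sleep` calls standing in for the session check, BL call and DA call are all hypothetical; the point is only that each stage of the request is timed separately, so you can see where the time goes before deciding what to optimize.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wall-clock samples (in seconds) collected per named step.
timings = defaultdict(list)

@contextmanager
def step(name):
    """Record how long the enclosed block takes under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)

def handle_request():
    # Hypothetical stand-ins for the stages of the request flow.
    with step("session check"):
        time.sleep(0.001)
    with step("business layer"):
        time.sleep(0.002)
    with step("data access"):
        time.sleep(0.005)

def report():
    """Return (step, total seconds, calls), slowest total first."""
    rows = [(name, sum(samples), len(samples)) for name, samples in timings.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

for _ in range(3):
    handle_request()

for name, total, calls in report():
    print(f"{name:15s} {total * 1000:8.2f} ms over {calls} calls")
```

In a real service the samples would go to a log or a database rather than to stdout, which is roughly what the MiniProfiler storage option mentioned above gives you.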
We have an app in ASP.NET MVC 3 that, due to legacy and porting reasons, is written entirely using traditional ADO.NET for the data layer.
I am now tasked with adding some reporting to this website, and the reports can result in some extremely complicated queries.
Are there any pitfalls in using the EF Power Tools to reverse-engineer a code first model and using it side-by-side with our current ADO.NET model? Doing so would allow me to use LINQ for querying the data I need, greatly speeding up the time required to write each report. I would need to shut off data context initialization, as we have our current model do that, but are there any glaring risks or problems associated with trying to do this?
If it's of any relevance (I know EF 5 has a ton of new features), we are using .NET 4 and will begin moving to .NET 4.5 as soon as it launches.
I think this is a very sensible thing to do. You could also use a database-first model, which you can refresh whenever the database changes and which does not try to initialize a database.
Since you will use the context read-only, you can optimize the query process by setting the MergeOption property of your ObjectQuery instances to MergeOption.NoTracking. This reduces overhead because the context will not track changes to the generated objects.
A problem might be that there is more maintenance if the database changes, but I think the absence of walls of boiler-plate query code for reporting on the old data layer far outweighs that.
One day :) you may even decide to use the EF model to display data that users want to filter in the UI and use the old data layer for CUD commands. (a bit like CQRS).
Currently we have a project to implement an Internet banking site, and we are evaluating NHibernate for it. Is NHibernate suitable for this kind of application, where performance is important and a large number of users will be performing operations simultaneously?
Do you know of any success stories of using NHibernate in this kind of environment?
I think NHibernate is slow only when it is used incorrectly, and I think we can use it with a lot of tweaking, best practices and common sense.
UPDATE: We were contacted about the project not too long ago, and we are still collecting requirements to define the specs. The application is for a small to medium bank in our country, so they expect at most around 200-300 simultaneous users.
I'm pretty sure the DB will be SQL Server 2005, and it will be an n-tier application using web services to access the data layer.
My team has been using NHibernate in a system requiring high throughput for years without a problem. NH is fairly efficient to begin with, and provides fine-grained control over when and how objects are reconstituted.
With that said, we don't know the specifics of your problem, so we can't make certain predictions. Perform scaling tests before you commit yourself.
NHibernate can be suitable if used correctly.
But don't hold us to this answer, since we do not know the exact specs.
I have an application that talks to several internal and external sources using SOAP, REST services or just database stored procedures. Obviously, performance and stability are major issues that I am dealing with. Even when the endpoints are performing at their best, for large sets of data I easily see calls that take tens of seconds.
So, I am trying to improve the performance of my application by prefetching the data and storing locally - so that at least the read operations are fast.
While my application is the major consumer and producer of data, some of the data can also change from outside my application, which I have no control over. If I use caching, I would never know when to invalidate the cache when such data changes from outside my application.
So I think my only option is to have a job scheduler running that consistently updates the database. I could prioritize the users based on how often they login and use the application.
I am talking about 50 thousand users, and at least 10 endpoints that are terribly slow and can sometimes take a minute for a single call. Would something like Quartz give me the scale I need? And how would I get around the scheduler becoming a single point of failure?
I am just looking for something that doesn't require high maintenance, and speeds at least some of the lesser complicated subsystems - if not most. Any suggestions?
This does sound like you might need a data warehouse. You would update the data warehouse from the various sources, on whatever schedule was necessary. However, all the read-only transactions would come from the data warehouse, and would not require immediate calls to the various external sources.
This assumes you don't need realtime access to the most up to date data. Even if you needed data accurate to within the past hour from a particular source, that only means you would need to update from that source every hour.
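As a rough sketch of that "update each source only as often as its data needs to be fresh" idea, the following Python snippet decides which sources are due for a refresh. The source names and allowed ages are made up for illustration, and a real scheduler or ETL tool would replace the hand-rolled check.

```python
from datetime import datetime, timedelta

# Hypothetical sources and how stale each one is allowed to become.
MAX_AGE = {
    "billing_soap": timedelta(hours=1),
    "crm_rest": timedelta(minutes=15),
    "legacy_procs": timedelta(hours=24),
}

def sources_due(last_refreshed, now):
    """Return the sources whose warehouse copy is older than its allowed age.

    A source that has never been refreshed is always due.
    """
    due = []
    for name, max_age in MAX_AGE.items():
        last = last_refreshed.get(name)
        if last is None or now - last >= max_age:
            due.append(name)
    return due

now = datetime(2024, 1, 1, 12, 0)
last_refreshed = {
    "billing_soap": now - timedelta(minutes=90),  # older than one hour: stale
    "crm_rest": now - timedelta(minutes=5),       # still fresh
    # "legacy_procs" has never been loaded
}
print(sources_due(last_refreshed, now))
```

Keeping the freshness requirement per source, instead of one global schedule, means the slowest endpoints are only called as often as the data actually demands.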
You haven't said what platforms you're using. If you were using SQL Server 2005 or later, I would recommend SQL Server Integration Services (SSIS) for updating the data warehouse. It's made for just this sort of thing.
Of course, depending on your platform choices, there may be alternatives that are more appropriate.
Here are some resources on SSIS and data warehouses. I know you've stated you will not be using Microsoft products. I include these links as a point of reference: these are the products I was talking about above.
SSIS Overview
Typical Uses of Integration Services
SSIS Documentation Portal
Best Practices for Data Warehousing with SQL Server 2008
Recently I had a project in which I had to get some data from particular software system to a portlet. The software used a database, and I spent a fair bit of time modeling the data I wanted and then creating a web service so that my portlet could grab the information.
Then it suddenly struck me that I was wasting my time. I grabbed BIRT, tossed it into a portlet, and then just wrote some reports that directly grabbed the necessary data from the database. I was done in an afternoon.
I understand that reporting is a one way street, but this got me thinking. Reporting tools can be very effective for creating reports (duh) from your actual data, but when you're doing this you're bypassing your model which except in simple cases is not a direct representation of your data as it exists in your database.
If you're writing a data-intensive application and require the ability to perform non-trivial reporting, do you bypass your application and use something like BIRT or Crystal Reports? How do you manage these tools as part of your overall process? Do you consider the reports you write as being part of your application and treat them as such? A report is a view and a model and a controller (if you will) all in one big mess, how do you deal with and interpret and plan for that?
Revised question: it's possible and even common that a report will perform some business calculations that in a perfect world you would like to have contained in your application. This can lead to a mismatch of information given back to the user. On the other hand, reporting tools make it so easy to gather and display information that it's hard to take a purist's approach and do everything from within the application. Are there any good techniques for ensuring that the data in your reports matches the data that you might be showing in the regular GUI?
I see reporting as simply another view on the data, not a view/model/controller in one (well, maybe a view and controller in one).
We have our reports (built in SQL Server 2008 Reporting Services) consume a service in our application layer to get data (in keeping with our standard that data access lives in a repository). These functions could do a simple query or handle very complex processing that would be a nightmare in your reporting environment or a stored procedure. In practice, we find this takes no longer than coding up some one-off stored procedure that will, as your system grows and grows, become a nightmare to maintain.
Treating reporting as simply a one-off or not integrating into your application design is a huge mistake.
Reporting is crucial, mostly for sharing values collected in one system with external users, i.e. users not directly using the system (e.g. management looking at sales figures). So reporting is a lot more than just displaying facts and figures, and it is central to almost every system that drives a commercial business.
At least the more advanced systems allow you to enhance them with your own reusable "controls". Even a way back can be implemented, if you just use the correct plugins. Once I wrote a system to send emails out of a report, because the system did not allow for change. It worked, though it was not meant to be used that way ;)
Reports make up a good part of the application, and you gain a lot of freedom if you make reports changeable for your customers. Sometimes you come up with more possibilities than you thought of when you built the system in the first place.
So yes, for me reporting is part of the system.
Reports are part of your app, but because they are generally something a user will have stronger ideas about than, say, your data capture UI, I'd sacrifice purity for convenience/speed of delivery and get back to "real" coding... :-)
As soon as you've done a report, users want another one or change the colour or optional grouping or more filtering or... something that takes you away from whizzier stuff... so I don't bust a gut maintaining purity.
This is a fine line indeed. You don't want to spend too much time building reports (that users want you to change all the time anyway), but you don't want to duplicate logic by putting business logic into your reports! With our reporting products at Data Dynamics I think we have reached a happy medium between these two tradeoffs.
By using the ObjectDataProvider (see links below for more info) you can bind the report directly to business objects (plain old objects) so you don't have to bypass your business layer for getting data. At the same time we provide a way to reference and use functions from other libraries in your report. This way if you have some code configured already to do some business logic calculations you can reuse those functions directly within your report. You can see an example of this in the links below too.
Binding to Objects for your Data (see "Object Provider" section): http://www.datadynamics.com/Help/ddReports/ddrconDataSetAndObjectDataSource.html
Adding Custom Code to your reports Walkthrough: http://www.datadynamics.com/Help/ddReports/ddrwlkCustomCode.html
Using Custom Assemblies (referencing shared libraries/dlls from your report): http://www.datadynamics.com/Help/ddReports/ddrconCustomCode.html, and http://www.datadynamics.com/Help/ddReports/ddrtskCreatingAnInstanceMethod.html
Scott Willeke
Data Dynamics / GrapeCity
The way I've always worked with reports is to consider them part of the code base, stored in source control along with the application. In some contexts, reports are more important than the application, in that management makes business decisions based on report data: having the wrong information can cause them to cancel a product line, cancel a campaign, or fire a salesperson. Obviously, this depends highly on your management and your application.
Regarding keeping your model consistent, this is a bit trickier question. One way to ensure consistent model between reports and your application is to use stored procedures (or views) to retrieve data, depending on your application's architecture.
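A minimal sketch of the view approach, using Python's built-in sqlite3 purely for illustration: the table, the view name and the discount rule are all hypothetical. The business calculation lives in one view, and both the application code and the report query read from that view, so they cannot drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_lines (
    order_id INTEGER, quantity INTEGER, unit_price REAL, discount REAL
);
INSERT INTO order_lines VALUES
    (1, 2, 10.0, 0.0),
    (1, 1, 5.0, 0.5),
    (2, 3, 4.0, 0.0);

-- The business rule (how a total is calculated) is defined exactly once.
CREATE VIEW order_totals AS
    SELECT order_id,
           SUM(quantity * unit_price * (1 - discount)) AS total
    FROM order_lines
    GROUP BY order_id;
""")

def app_order_total(order_id):
    """What the application GUI would show for a single order."""
    row = conn.execute(
        "SELECT total FROM order_totals WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]

def report_rows():
    """What a reporting tool pointed at the same view would list."""
    return conn.execute(
        "SELECT order_id, total FROM order_totals ORDER BY order_id"
    ).fetchall()

print(app_order_total(1))
print(report_rows())
```

The trade-off is that the calculation now lives in the database rather than in application code, which may or may not suit your architecture.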
I've got an old classic asp/sql server app which is constantly throwing 500 errors/timeouts even though the load is not massive. Some of the DB queries are pretty intensive but nothing that should be causing it to fall over.
Are there any good pieces of software I can install on my server which will show up precisely where the bottlenecks are in either the asp or the DB?
Some tools you can try:
HP (formerly Mercury) LoadRunner or Performance Center
Visual Studio Application Center Test (Enterprise Editions only?)
Microsoft Web Application Stress tool (aka WAS, aka "Homer"; predecessor to Application Center Test)
WebLoad
MS Visual Studio Analyzer if you want to trace through the application code. This can show you how long the app waits on DB calls, and what the SQL was that was used. You can then use the SQL profiler to tune the queries.
Where is the timeout occurring? Is it at the lines where ASP is connecting to the database or executing SQL? If so, your problem is either with the connection to the DB server or in the DB itself. Load up SQL Profiler in MSSQL to see how long the queries take. Perhaps it is due to locks in the database.
Do you use transactions? If so, make sure they do not lock your database for a long time. Make sure you use transactions in ADO and not on the entire ASP page. You can also ignore locks in SQL SELECTs by using the WITH (NOLOCK) hint on tables, at the cost of possibly reading uncommitted data.
Make sure your database is optimized with indexes.
Also make sure you are connected to the DB for as short a time as possible, i.e. (illustrative, not working code): conn.open; set rs = conn.execute(); rs.close; conn.close. So store recordsets in a variable instead of looping through them while holding the connection to the DB open. A good way to do this is the GetRows() function in ADO.
Always explicitly close ADO objects and set them to Nothing. Failing to do so can cause the connection to the DB to remain open.
Enable connection pooling.
Load ADO constants in global.asa if you are using them
Do not store any objects in session or application scopes.
Upgrade to the latest versions of ADO, MDAC, SQL Server service packs etc.
Are you sure the server can handle the load? Maybe upgrade it? Is it on shared hosting? Maybe your app is not the problem.
It is quite simple to measure a script's performance by timing it from the first line to the last line. This way you can identify slow-running pages.
Have you tried running the SQL Server Profiler on the server? It will highlight any unexpected activity hitting the database from the app as well as help identifying badly performing queries.
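Once you have a trace, grouping it by query text is a quick way to see which statements cost the most in total. The sketch below is Python over a hypothetical list of (query, duration in ms) pairs; it does not read a real SQL Server Profiler trace file, so treat the input format as an assumption.

```python
from collections import defaultdict

# Hypothetical trace rows: (query text, duration in milliseconds).
trace = [
    ("SELECT * FROM orders WHERE customer_id = ?", 120),
    ("SELECT name FROM products WHERE id = ?", 3),
    ("SELECT * FROM orders WHERE customer_id = ?", 140),
    ("SELECT name FROM products WHERE id = ?", 4),
    ("SELECT name FROM products WHERE id = ?", 2),
    ("EXEC monthly_report", 200),
]

def rank_by_total_time(rows):
    """Return (query, total ms, call count), most expensive total first."""
    totals = defaultdict(lambda: [0, 0])
    for sql, ms in rows:
        totals[sql][0] += ms
        totals[sql][1] += 1
    ranked = sorted(totals.items(), key=lambda item: item[1][0], reverse=True)
    return [(sql, total, count) for sql, (total, count) in ranked]

for sql, total, count in rank_by_total_time(trace):
    print(f"{total:5d} ms  {count:3d} calls  {sql}")
```

Ranking by total time rather than by single slowest call matters because a moderately slow query that runs on every page can cost more than one very slow report.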
If you're satisfied that the DB queries are necessarily intensive, then perhaps you need to set more appropriate timeouts on the pages that use these queries.
Set the Server.ScriptTimeout to something larger, you may also need to set the timeout on ADO Command objects used by the script.
Here's how I'd approach it.
Look at the running tasks on the server. Which is taking up more CPU time - SQL server or IIS? Most of the time, it will be SQL server and it certainly sounds that way based on your post. It's very rare that any ASP application actually does a lot of processing on the ASP side of things as opposed to the COM or SQL sides.
Use SQL Profiler to check out all the queries hitting the database server.
Deal with the low-hanging fruit first. Generally you will have a few "problem" queries that hit the database frequently and chew up a lot of time. Deal with these. (A truism in software development is that 10% of the code chews up 90% of the execution time...)
In addition to looking at query costs with SQL Profiler and Query Analyzer/SQL Studio and doing the normal SQL performance detective work you might also want to check if your database calls are returning inordinate amounts of data to your ASP code. I've seen cases where innocuous-looking queries returned HUGE amounts of unneeded data to ASP - the classic ("select * from tablename") kind of query written by lazy/inexperienced programmers that returns 10,000 huge rows when the programmer really only needed 1 field from 1 row. The reason I mention this special case is because these sorts of queries often have low execution times and low query costs on the SQL side of things and can therefore slip under the radar.
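To make that last point concrete, here is a small Python/sqlite3 sketch comparing how much data comes back from a `select *` versus a query for the one field actually needed. The `articles` table and its sizes are invented for the example, and `payload_size` is only a rough character count, not a measure of real network traffic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [(f"Article {i}", "x" * 10_000) for i in range(100)],
)

def payload_size(rows):
    """Rough size of a result set: total characters across all fields."""
    return sum(len(str(value)) for row in rows for value in row)

# The lazy query: every column of every row.
everything = conn.execute("SELECT * FROM articles").fetchall()

# What the page actually needed: one field from one row.
needed = conn.execute(
    "SELECT title FROM articles WHERE id = ?", (1,)
).fetchall()

print("select *      :", payload_size(everything), "characters")
print("one field/row :", payload_size(needed), "characters")
conn.close()
```

Both queries are cheap for the database to execute, which is exactly why this kind of problem tends not to show up when you only look at query cost.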