LinqToSQL DateTime filters? - performance

I've got a LINQ to SQL query filtering and ordering by a date column that takes 20 seconds to run. When I run the generated SQL query directly on the DB it returns in 0 seconds.
var myObjs = DB.Table
.Where(obj => obj.DateCreated>=DateTime.Today)
.OrderByDescending(obj => obj.DateCreated);
The table has only 100,000 records and the DateTime column is indexed.
Just another in a long line of linqtosql performance grievances. But this one is SOO bad that I'm sure I must be doing something wrong.

I suspect the difference is that although running the generated query only takes 0 seconds, that's because it's not actually showing you all the results if you're using something like Enterprise Manager. Just fetching (and deserializing) all the data for 100,000 results could well take a significant amount of time, but your manual query is probably only showing you the first 20 hits or something similar.
If you run the same SQL in .NET and use a DataReader to fetch all the data, how long does it take then?
If you run the server with profiling turned on, how long does it say the query took to execute from LINQ to SQL?
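One way to make that comparison is to time a plain ADO.NET reader that pulls every row, which is closer to what LINQ to SQL has to do than a query window showing the first screenful. This is only a sketch: the connection string argument, the table name, and the column name are placeholders assumed from the question, not the asker's actual schema.

```csharp
using System;
using System.Data.SqlClient;
using System.Diagnostics;

class ReaderTiming
{
    static void Main(string[] args)
    {
        // Assumption: the connection string is passed as the first argument.
        string connectionString = args[0];

        // Assumption: table and column names mirror the LINQ query in the question.
        const string sql =
            "SELECT * FROM [Table] WHERE DateCreated >= @today ORDER BY DateCreated DESC";

        var stopwatch = Stopwatch.StartNew();
        int rowCount = 0;

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@today", DateTime.Today);
            connection.Open();

            using (SqlDataReader reader = command.ExecuteReader())
            {
                var values = new object[reader.FieldCount];
                while (reader.Read())
                {
                    // Copy every column so the whole row is actually transferred.
                    reader.GetValues(values);
                    rowCount++;
                }
            }
        }

        stopwatch.Stop();
        Console.WriteLine("{0} rows in {1} ms", rowCount, stopwatch.ElapsedMilliseconds);
    }
}
```

If this takes about as long as the LINQ to SQL version, the time is going into the query and the data transfer rather than into LINQ itself.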

Thanks guys...
The problem was mine, not linq's. For brevity I shortened the query in the question but there was actually another filter that had been applied to a NON indexed column. Adding the index solved the problem.
What threw me for a loop though was that, as Jon Skeet suggested, running the query in SQL Management Studio gave a false sense of confidence because the query was paged, and very quickly returned the top 20 rows, leaving me to think LINQ was to blame. So the index problem only showed up in LINQ and not in SQL Management Studio.

I can't see anything wrong in your query. It would be great to see the T-SQL generated by Linq. Did you try that?

Related

How to handle large amount of data using linq in mvc

I face a problem using a linq query. I am getting data from a SQL database using this query and date time parameter (see below). When this query executes, it takes a long time, and after a long wait, I get an error.
Data is available in the database, and when I use Take() with a number of rows, it works. I don't know how to figure out the problem.
Is it possible my query hits a large amount of data causing the query to not work? Can you please share any suggestions on how to solve this issue?
from ClassificationData in DbSet
where ClassificationData.CameraListId == id &&
ClassificationData.DateTime <= endDate &&
ClassificationData.DateTime >= startdate
orderby ClassificationData.Id descending
select ClassificationData
Your problem is probably more in the realm of SQL than LINQ. LINQ just translates what you write into Transact-SQL (T-SQL) that gets sent up to SQL Server. If your SQL Server is not set-up properly, then you'll get a timeout if the query takes too long.
You need to make sure that you have indexes on the ClassificationData table (I assume it's a table, but it could be a view -- whatever it is, you need to put indexes on it if it has lots of rows). Make sure that an index is on DateTime, and that an index is also on CameraListId. Then, bring down the data unordered and execute the order-by in a separate query done on the local machine -- that will let SQL Server start giving you data right away instead of sorting it first, reducing the chance for a timeout.
If your problems persist, write queries directly against SQL Server (in Query Analyzer, for instance, if they still use that -- I'm old school, but a bit rusty). Once you have SQL Server actually working, then you should be able to write the LINQ that translates into that SQL. But you could also make it a stored procedure, which has other benefits, but that's another story for another day.
Finally, if it's still too much data, break up your data into multiple tables, possibly by CameraListId, or perhaps a week at a time of data so the DateTime part of the query doesn't have as much to look through.
Good luck.
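The "filter on the server, sort locally" suggestion above might look roughly like the following. It is a sketch under assumptions: the ClassificationData class here is a hypothetical stand-in shaped like the entity in the question, and an in-memory list replaces the DbSet only so the example runs on its own. The key call is AsEnumerable(), after which the remaining operators run in memory instead of being translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shaped like the one in the question.
class ClassificationData
{
    public int Id { get; set; }
    public int CameraListId { get; set; }
    public DateTime DateTime { get; set; }
}

class LocalSortDemo
{
    static List<ClassificationData> Load(
        IQueryable<ClassificationData> source, int id, DateTime startDate, DateTime endDate)
    {
        return source
            .Where(c => c.CameraListId == id
                     && c.DateTime >= startDate
                     && c.DateTime <= endDate)  // the filter stays part of the provider query
            .AsEnumerable()                     // everything after this runs in memory
            .OrderByDescending(c => c.Id)
            .ToList();
    }

    static void Main()
    {
        // In-memory stand-in for the DbSet, only so the sketch is self-contained.
        var rows = new List<ClassificationData>
        {
            new ClassificationData { Id = 1, CameraListId = 7, DateTime = new DateTime(2020, 1, 1) },
            new ClassificationData { Id = 2, CameraListId = 7, DateTime = new DateTime(2020, 1, 2) },
            new ClassificationData { Id = 3, CameraListId = 8, DateTime = new DateTime(2020, 1, 2) },
        }.AsQueryable();

        foreach (var row in Load(rows, 7, new DateTime(2020, 1, 1), new DateTime(2020, 1, 31)))
        {
            Console.WriteLine(row.Id); // prints 2, then 1
        }
    }
}
```

Note that this still pulls every matching row across the wire, so it only helps if the sort, rather than the sheer volume of data, is what pushes the query past the timeout.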

JPA getResultList much slower than SQL query

I have an Oracle table with about 5 million records and a quite complex query which returns about 5000 of them in less than 5 seconds with a database tool like Toad.
However, when I run the query via the EntityManager (EclipseLink), it runs for minutes...
I'm probably too naive in the implementation.
I do:
Query query = em.createNativeQuery(complexQueryString, Myspecific.class);
... setParameter...
List result = query.getResultList();
The complexQueryString starts with a "SELECT *".
What kinds of optimization do I have?
Maybe one is to select only the fields I really need later. Some explanation would be great.
I had a similar problem (I tried to read 800000 records with 8 columns in less than one second) and the best solution was to fall back to JDBC. The ResultSet was created and read about 10 times faster than with JPA, even when doing a native query.
How to use JDBC: normally in Java EE servers a JDBC DataSource can be injected with @Resource.
An explanation: I think the OR mappers try to create and cache objects so that changes can easily be detected later. This is a very substantial overhead that you don't notice when you are just working with single entities.
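Setting the JDBC fetch size may help a bit. It tells the JDBC driver how many rows to return in one chunk. The standard JPA Query interface has no setFetchSize() method (Hibernate's own Query does), so with EclipseLink set it as a query hint before calling getResultList():
query.setHint("eclipselink.jdbc.fetch-size", 5000);
query.getResultList();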

Query duration in NHibernate Profiler

I have an ASP.NET MVC application which uses Fluent NHibernate to access an Oracle database. I also use NHibernate Profiler for monitoring the queries generated by NHibernate. I have one query which is really simple (selecting all rows from a table with 4 string columns). It is used for creating a report in CSV format. My problem is that the query is taking very long to run, and I would like to get a bit more insight into the durations displayed by nhprof. With 65,000 rows, it is taking 10-20 seconds, even though the "Database only" duration only shows something like 20 ms. Network lag should not account for much of this time, because the servers are on the same gigabit LAN. I don't expect people to be able to pinpoint for me exactly where the bottleneck is, but I would like to know more about how to read the duration measurements in NHibernate Profiler.
What is included in the "Database only" part, and what is included in the "Total time"? Does the total time also include the processing done after populating the C# objects, so that this time is actually for the entire http request? Knowing more about this would hopefully make me able to eliminate some factors.
This is what the NHibernate mapping class looks like:
Table("V_TICKET_DETAILS");
CompositeId()
.KeyProperty(x => x.TicketId, "TICKET_ID")
.KeyProperty(x => x.Key, "COLUMN_NAME")
.KeyProperty(x => x.Parent, "PARENT_NAME");
Map(x => x.Value, "COLUMN_VALUE");
And the query generated by nh profiler is like this:
SELECT this_.TICKET_ID as TICKET1_35_0_,
this_.COLUMN_NAME as COLUMN2_35_0_,
this_.PARENT_NAME as PARENT3_35_0_,
this_.COLUMN_VALUE as COLUMN4_35_0_
FROM V_TICKET_DETAILS this_
The view is really simple, only joining two tables on a 2-digit integer.
I am by no means a database expert, so I would be happy for all comments that would point me in the correct direction.
The total time is for the call to the nHib query only.
However, in addition to the time in the db, it includes the time it takes nHib to populate your entities (hydration), and that's likely your culprit.
I've had a similar problem, perhaps some of the suggestions there may help you.
The bottom line is that nHib is not really intended to load large datasets.
If none of the suggestions I got helped you, I would suggest a couple of things:
1. It's unlikely that your user needs to view 65,000 rows of data at the same time. Perhaps you can find a way to filter the data so that the result set is smaller (and more readable).
2. Otherwise, if it's, as you say, a 'special' case that only occurs when you generate a report, you don't have to use nHib. You can just use, say, good ol' ADO.NET classes...
There is also IStatelessSession, which is intended for such situations. It doesn't have a session cache and saves a lot of work. It should be a lot faster.
using (var session = factory.OpenStatelessSession())
{
    // Run the report query here; a stateless session keeps no session cache.
}

Entity Framework queries speed

Recently I started learning Entity Framework.
The first question that came to my mind is:
When we use LINQ to fetch data in EF, every query like this:
var a = from p in contacts select p.name ;
will be converted to a SQL command like this:
select name from contacts
Is this conversion repeated every time we run the query?
I heard that stored procedures are cached in the database. Does the same happen with LINQ queries in Entity Framework?
And lastly, is my question clear?
I think a LINQ query is converted each time you execute it. To improve performance you can use compiled queries.
There are all sorts of optimizations being made, both in the LINQ expression caching and in what SQL Server chooses to cache. The only way to know is to measure your speed and memory consumption.
To see what SQL is created you can use http://efprof.com/ which I've found quite good. You can get some of this info through SQL profiler, it's just a lot more work.

Does LINQ skip & take have decent performance?

We have a query for about 40 data fields related to customers. The query will often return a large amount of records, say up to 20,000. We only want to use say around the first 500 results. Then, we just want to be able to page through them 10 at a time.
Is LINQ Skip and Take a reasonable approach for this? Are there any potential performance issues with using this approach vs doing it manually some other way?
Take() without Skip() generates SQL using TOP clause.
Take() with Skip generates SQL using ROW_NUMBER() as shown here.
Also, I'd recommend using the excellent LINQPad tool or LINQ to SQL logging to check the generated queries.
Yes, if you're on SQL Server 2005+ it'll generate SQL that uses the ROW_NUMBER() function to make paging efficient (not unlike the SQL that Scott uses in this blog post).
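For the scenario in the question (cap at about 500 results, then page 10 at a time), the Skip/Take combination could be wrapped up roughly like this. It is a sketch, not a tested LINQ to SQL query: the Page helper and its parameter names are hypothetical, and a range of integers stands in for the customer query so the example runs on its own. The source is ordered first so that pages are stable between requests.

```csharp
using System;
using System.Linq;

static class PagingExtensions
{
    // Hypothetical helper: caps the result set, then returns one page of it.
    public static IQueryable<T> Page<T>(
        this IQueryable<T> ordered, int maxResults, int pageIndex, int pageSize)
    {
        return ordered
            .Take(maxResults)            // only consider the first maxResults rows
            .Skip(pageIndex * pageSize)  // jump over the earlier pages
            .Take(pageSize);             // keep one page
    }
}

class PagingDemo
{
    static void Main()
    {
        // In-memory stand-in for the 20,000-row customer query.
        var customers = Enumerable.Range(1, 20000).AsQueryable().OrderBy(n => n);

        // Third page (index 2) of 10, within the first 500 results.
        var thirdPage = customers.Page(500, 2, 10).ToList();

        Console.WriteLine(string.Join(", ", thirdPage)); // 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
    }
}
```

Against a real provider it is worth checking the generated SQL (with LINQPad or logging, as suggested above) to see how the chained Take/Skip/Take is translated.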

Resources