Linq Update vs SQL Update performance - linq

Let say I got a 500 - 1,000 data in my table.
LINQ
(from a in db.Person
select a).ToList().ForEach(x => x.PersonName = "Juan");
db.SaveChanges();
SQL
UPDATE dbo.Person
SET PersonName = Juan;
Which query is much better performance in terms of speed when updating a record?.

Clearly the raw update will be faster. ToList() materialises the data set (so it gets read). The update may or may not track changes in your entity but by default will. Then it sends it all back again.
The update just operates at the database.
Without the ToList() it might be closer.

Related

Is it a good idea to store and access an active query resultset in Coldfusion vs re-quering the database?

I have a product search engine using Coldfusion8 and MySQL 5.0.88
The product search has two display modes: Multiple View and Single View.
Multiple displays basic record info, Single requires additional data to be polled from the database.
Right now a user does a search and I'm polling the database for
(a) total records and
(b) records FROM to TO.
The user always goes to Single view from his current resultset, so my idea was to store the current resultset for each user and not have to query the database again to get (waste a) overall number of records and (waste b) a the single record I already queried before AND then getting the detail information I still need for the Single view.
However, I'm getting nowhere with this.
I cannot cache the current resultset-query, because it's unique to each user(session).
The queries are running inside a CFINVOKED method inside a CFC I'm calling through AJAX, so the whole query runs and afterwards the CFC and CFINVOKE method are discarded, so I can't use query of query or variables.cfc_storage.
So my idea was to store the current resultset in the Session scope, which will be updated with every new search, the user runs (either pagination or completely new search). The maximum results stored will be the number of results displayed.
I can store the query allright, using:
<cfset Session.resultset = query_name>
This stores the whole query with results, like so:
query
CACHED: false
EXECUTIONTIME: 2031
SQL: SELECT a.*, p.ek, p.vk, p.x, p.y
FROM arts a
LEFT JOIN p ON
...
LEFT JOIN f ON
...
WHERE a.aktiv = "ja"
AND
... 20 conditions ...
SQLPARAMETERS: [array]
1) ... 20+ parameters
RESULTSET:
[Record # 1]
a: true
style: 402
price: 2.3
currency: CHF
...
[Record # 2]
a: true
style: 402abc
...
This would be overwritten every time a user does a new search. However, if a user wants to see the details of one of these items, I don't need to query (total number of records & get one record) if I can access the record I need from my temp storage. This way I would save two database trips worth 2031 execution time each to get data which I already pulled before.
The tradeoff would be every user having a resultset of up to 48 results (max number of items per page) in Session.scope.
My questions:
1. Is this feasable or should I requery the database?
2. If I have a struture/array/object like a the above, how do I pick the record I need out of it by style number = how do I access the resultset? I can't just loop over the stored query (tried this for a while now...).
Thanks for help!
KISS rule. Just re-query the database unless you find the performance is really an issue. With the correct index, it should scales pretty well. When the it is an issue, you can simply add query cache there.
QoQ would introduce overhead (on the CF side, memory & computation), and might return stale data (where the query in session is older than the one on DB). I only use QoQ when the same query is used on the same view, but not throughout a Session time span.
Feasible? Yes, depending on how many users and how much data this stores in memory, it's probably much better than going to the DB again.
It seems like the best way to get the single record you want is a query of query. In CF you can create another query that uses an existing query as it's data source. It would look like this:
<cfquery name="subQuery" dbtype="query">
SELECT *
FROM Session.resultset
WHERE style = #SelectedStyleVariable#
</cfquery>
note that if you are using CFBuilder, it will probably scream Error at you for not having a datasource, this is a bug in CFBuilder, you are not required to have a datasource if your DBType is "query"
Depending on how many records, what I would do is have the detail data stored in application scope as a structure where the ID is the key. Something like:
APPLICATION.products[product_id].product_name
.product_price
.product_attribute
Then you would really only need to query for the ID of the item on demand.
And to improve the "on demand" query, you have at least two "in code" options:
1. A query of query, where you query the entire collection of items once, and then query from that for the data you need.
2. Verity or SOLR to index everything and then you'd only have to query for everything when refreshing your search collection. That would be tons faster than doing all the joins for every single query.

Data Entity Framework and LINQ - Get giant data set and execute a command one at a time on each object

We are pulling in a giant dataset of records (in the 100's of thousands) and then need to update a field on each one, one at a time in an atomic transation. They records are unrelated to each other and we don't want to do a blind update to all couple hundred thousand (there are views and indexes on this table that make that very prohibitive). The ONLY way that I could get this to work without doing a giant transation was as follows (container is a reference to a custom ObjectContext):
var expiredWorkflows = from iw in container.InitiatedWorkflows
where iw.InitiationStatusID != 1 && iw.ExpirationDate < DateTime.Now
select iw.ID;
foreach (int expiredWorkflow in expiredWorkflows)
container.ExecuteStoreCommand("UPDATE dbo.InitiatedWorkflow SET InitiationStatusID = 7 WHERE ID = #ID", new SqlParameter() { ParameterName = "#ID", Value = expiredWorkflow.ToString() } );
We tried looping through each one and just updating the field via the container and then calling SaveChanges(), but that runs everything as one transaction. We tried calling SaveChanges() in the foreach loop, but that threw transaction exceptions. Is there any way to what we are trying to do using the ObjectContext, so it would do something like (the above select would be changed to return the full object, not just the ID):
foreach (var expiredWorkflow in expiredWorkflows)
expiredWorkflow.InitiationStatusID = 7
container.SaveChanges(SaveOptions.OneAtATime);
Speaking generally, if the operation you need to carry out is as simple as the sort of UPDATE your code above suggests, this is the sort of operation that will run far better on the back end database--assuming, of course, there's some clear way to select only the rows that need to be changed. Entity Framework is intended more for manipulating small to medium sets of objects that can easily be loaded into memory and twiddled there, not large bulk-processing operations for which stored procedures are often best. EF can certainly perform those big operations, but it will take a lot longer to execute one SQL statement per row.

Help to convert the SQL query to Lambda(Linq)

Can anyone help me, to convert the below sql query to Linq (Lambda Expressions).
Thanks in Advance.
SQL Query:
UPDATE A INNER JOIN B ON (A.Trade Number=B.Trade Number) AND
(A.AccountId=B.AccountId) SET A.Float = B.CCY, A.Float = B.BASIS,
A.LastReset = B.FIXING, A.LastLibor = B.INTRATE;
Updates are not expressed as LINQ statements. You can use LINQ to load the record you wish to modify, but the actual update will involve a separate method call, the details of which will depend on which database technology you're using.
So, assuming you already have a reference to the record you wish to update from A, load the associated record from B (using LINQ), perform the updates to A programatically, and then save the changes to A.
Here's an example using LINQ-to-SQL.

LINQ and Generated sql

suppose my LINQ query is like
var qry = from c in nwEntitiesContext.CategorySet.AsEnumerable()
let products = this.GetProducts().WithCategoryID(c.CategoryID)
select new Model.Category
{
ID = c.CategoryID,
Name = c.CategoryName,
Products = new Model.LazyList<Core.Model.Product>(products)
};
return qry.AsQueryable();
i just want to know what query it will generate at runtime....how to see what query it is generating from VS2010 IDE when we run the code in debug mode....guide me step by step.
There is not much to see here - it will just select all fields from the Category table since you call AsEnumerable thus fetching all the data from the Category table into memory. After that you are in object space. Well, depending on what this.GetProducts() does - and my guess it makes another EF query fetching the results into memory. If that's the case, I would strongly recommend you to post another question with this code and the code of your GetProducts method so that we can take a look and rewrite this in a more optimal way. (Apart from this, you are projecting onto a mapped entity Model.Category which again won't (and should not) work with Linq-to-Entities.)
Before reading into your query I was going to recommend doing something like this:
string sqlQueryString = ((ObjectQuery)qry).ToTraceString();
But that won't work since you are mixing Linq-to-Entities with Linq-to-objects and you will actually have several queries executed in case GetProducts queries EF. You can separate the part with your EF query and see the SQL like this though:
string sqlString = nwEntitiesContext.CategorySet.ToTraceString();
but as I mentioned earlier - that would just select everything from the Categories table.
In your case (unless you rewrite your code in a drastic way), you actually want to see what queries are run against the DB when you execute the code and enumerate the results of the queries. See this question:
exact sql query executed by Entity Framework
Your choices are SQL Server Profiler and Entity Framework Profiler. You can also try out LinqPad, but in general I still recommend you to describe what your queries are doing in more detail (and most probably rewrite them in a more optimal way before proceeding).
Try Linqpad
This will produce SELECT * FROM Categories. Nothing more. Once you call AsEnumerable you are in Linq-to-objects and there is no way to get back to Linq-to-entities (AsQueryable doesn't do that).
If you want to see what query is generated use SQL Profiler or any method described in this article.

LINQ - Using where or join - Performance difference?

Based on this question:
What is difference between Where and Join in linq?
My question is following:
Is there a performance difference in the following two statements:
from order in myDB.OrdersSet
from person in myDB.PersonSet
from product in myDB.ProductSet
where order.Persons_Id==person.Id && order.Products_Id==product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
and
from order in myDB.OrdersSet
join person in myDB.PersonSet on order.Persons_Id equals person.Id
join product in myDB.ProductSet on order.Products_Id equals product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
I would always use the second one just because it´s more clear.
My question is now, is the first one slower than the second one?
Does it build a cartesic product and filters it afterwards with the where clauses ?
Thank you.
It entirely depends on the provider you're using.
With LINQ to Objects, it will absolutely build the Cartesian product and filter afterwards.
For out-of-process query providers such as LINQ to SQL, it depends on whether it's smart enough to realise that it can translate it into a SQL join. Even if LINQ to SQL doesn't, it's likely that the query engine actually performing the query will do so - you'd have to check with the relevant query plan tool for your database to see what's actually going to happen.
Side-note: multiple "from" clauses don't always result in a Cartesian product - the contents of one "from" can depend on the current element of earlier ones, e.g.
from file in files
from line in ReadLines(file)
...
My question is now, is the first one slower than the second one? Does it build a cartesic product and filters it afterwards with the where clauses ?
If the collections are in memory, then yes. There is no query optimizer for LinqToObjects - it simply does what the programmer asks in the order that is asked.
If the collections are in a database (which is suspected due to the myDB variable), then no. The query is translated into sql and sent off to the database where there is a query optimizer. This optimizer will generate an execution plan. Since both queries are asking for the same logical result, it is reasonable to expect the same efficient plan will be generated for both. The only ways to be certain are to
inspect the execution plans
or measure the IO (SET STATISTICS IO ON).
Is there a performance difference
If you find yourself in a scenario where you have to ask, you should cultivate tools with which to measure and discover the truth for yourself. Measure - not ask.

Resources