Performace issue using Foreach in LINQ - linq

I am using an IList<Employee> where i get the records more then 5000 by using linq which could be better? empdetailsList has 5000
Example :
foreach(Employee emp in empdetailsList)
{
Employee employee=new Employee();
employee=Details.GetFeeDetails(emp.Emplid);
}
The above example takes a lot of time in order to iterate each empdetails where i need to get corresponding fees list.
suggest me anybody what to do?

Linq to SQL/Linq to Entities use a deferred execution pattern. As soon as you call For Each or anything else that indirectly calls GetEnumerator, that's when your query gets translated into SQL and performed against the database.
The trick is to make sure your query is completely and correctly defined before that happens. Use Where(...), and the other Linq filters to reduce as much as possible the amount of data the query will retrieve. These filters are built into a single query before the database is called.
Linq to SQL/Linq to Entities also both use Lazy Loading. This is where if you have related entities (like Sales Order --> has many Sales Order Lines --> has 1 Product), the query will not return them unless it knows it needs to. If you did something like this:
Dim orders = entities.SalesOrders
For Each o in orders
For Each ol in o.SalesOrderLines
Console.WriteLine(ol.Product.Name)
Next
Next
You will get awful performance, because at the time of calling GetEnumerator (the start of the For Each), the query engine doesn't know you need the related entities, so "saves time" by ignoring them. If you observe the database activity, you'll then see hundreds/thousands of database roundtrips as each related entity is then retrieved 1 at a time.
To avoid this problem, if you know you'll need related entities, use the Include() method in Entity Framework. If you've got it right, when you profile the database activity you should only see a single query being made, and every item being retrieved by that query should be used for something by your application.

If the call to Details.GetFeeDetails(emp.Emplid); involves another round-trip of some sort, then that's the issue. I would suggest altering your query in this case to return fee details with the original IList<Employee> query.

Related

Is Laravel's 'pluck' method cheaper than a general 'get'?

I'm trying to dramatically cut down on pricey DB queries for an app I'm building, and thought I should perhaps just return IDs of a child collection (then find the related object from my React state), rather than returning the children themselves.
I suppose I'm asking, if I use 'pluck' to just return child IDs, is that more efficient than a general 'get', or would I be wasting my time with that?
Yes,pluck method is just fine if you are trying to retrieving a Single Column from tables.
If you use get() method it will retrieve all information about child model and that could lead to a little slower process for querying and get results.
So in my opinion, You are using great method for retrieving the result.
Laravel has also different methods for select queries. Here you can look Selects.
The good practice to perform DB select query in a application, is to select columns that are necessary. If id column is needed, then id column should be selected, instead of all columns. Otherwise, it will spend unnecessary memory to hold unused data. If your mind is clear, pluck and get are the same:
Model::pluck('id')
// which is the same as
Model::select('id')->get()->pluck('id');
// which is the same as
Model::get(['id'])->pluck('id');
I know i'm a little late to the party, but i was wondering this myself and i decided to research it. It proves that one method is faster than the other.
Using Model::select('id')->get() is faster than Model::get()->pluck('id').
This is because Illuminate\Support\Collection::pluck will iterate over each returned Model and extract only the selected column(s) using a PHP foreach loop, while the first method will make it cheaper in general as it is a database query instead.

typed data set; parent/child select and update with ONE trip to the database (for each op)?

Is it possible, using an ADO.NET typed DataSet containing two tables in a parent/child relationship, to populate the DataSet with ONE trip to the d/b (query could return one or two tables; if one, then result set has columns from both tables, right?), and to update the d/b with ONE trip to the d/b (call to generated stored proc, I guess).
By "is it possible", I mean is it possible to have Visual Studio (2012) automagically generate the classes and SQL code to make this happen?
Or am I kind of on my own? It's looking an awful lot like VS really wants to generate one d/b server round trip for each table involved.
*I guess the update stored proc would have to take table-typed parameters from both parent and child, and perform inserts/updates/deletes appropriately.
Yes, one round trip per table is the way to go.
(- It's certainly possible to use a join query to populate a datatable but VS will then be reluctant to generate update etc SQL. This may or may not be a problem, depending on what you intend to do with the dataset.)
But if you have two tables in a dataset, lets say customers - orders, then you would typically use two queries, and two trips to the db:
SELECT * FROM customers WHERE customers.customerid=#customerid
and
SELECT * FROM orders WHERE orders.customerid=#customerid
Somewhat more counter-intuitive is the situation where you want all customers and orders for one country:
SELECT * FROM customers WHERE customers.countryid=#countryid
and
SELECT orders.* FROM orders INNER JOIN customers ON customers.customerid=orders.customerid WHERE customers.countryid=#countryid
Note how the join query returns data from only one table, but uses the join to identify which rows to return.
Then, once you have the data in your dataset, you can navigate it using the getparentrow and getchildrows methods. This is how ADO.Net manages hierarchical data.
You do need this one-table-at-a-time approach, because, assuming you have foreign key constraints in your db, you need to insert and update in reverse order from delete.
EDIT Yes, this does mean that in some circumstances, depending on the data you want and the structure of your primary keys, you could end up with a humungous set of JOINS that still only pull the data from the table at the end of the hierarchy. This might seem wrong in terms of traditional SQL, but actually it's fine. The time you have lost in the multiple, more complex queries is saved by the reduced amount of data you have to pull back across the wire, compared with one big join query that would be returning multiple copies of the parent data.

Entity Framework - Querying from ObjectContext vs Querying from Navigation Property

I've noticed that depending on how I extract data from my Entity Framework model, I get different types of results. For example, when getting the list of employees in a particular department:
If I pull directly from ObjectContext, I get an IQueryable<Employee>, which is actually a System.Data.Objects.ObjectQuery<Employee>:
var employees = MyObjectContext.Employees.Where(e => e.DepartmentId == MyDepartment.Id && e.SomeCondtition)
But if I use the Navigation Property of MyDepartment, I get an IEnumerable<Employee>, which is actually a System.Linq.WhereEnumerableIterator<Employee> (private class in System.Linq.Enumerable):
var employees = MyDeparment.Employees.Where(e => e.SomeCondtition)
In the code that follows, I heavily use employees in several LINQ queries (Where, OrderBy, First, Sum, etc.)
Should I be taking into consideration which query method I use? Will there be a performance difference? Does the latter use deferred execution? Is one better practice? Or does it not make a difference?
I ask this because since installing ReShaper 6, I'm getting lots of Possible multiple enumeration of IEnumerable warnings when using the latter method, but none when using direct queries. I've been using the latter method more often, simply because it's much cleaner to write, and I'm wondering if doing so has actually had a detrimental effect!
There is very big difference.
If you are using the first approach you have IQueryable = exression tree and you can still add other expressions and only when you execute the query (deferred execution) the expression tree will be converted to SQL and executed in the database. So if you use your first example and add .Sum of something you will indeed execute operation in the database and it will transfer only single number back to your application. That is linq-to-entities.
The second example uses in memory collection. Navigation property doesn't represent IQueryable (expression tree). All linq commands are treated as linq-to-objects = all records representing related data in navigation property must be first loaded from database to your application and all operations are done in memory of your application server. You can load navigation property eagerly (by using Include), explicitly (by using Load) or lazily (it is just done automatically when you access the property for the first time if lazy loading is enabled). So if you want to have sum of something this scenario requires you to load all data from database and then execute the operation locally.

Help me understand entity framework 4 caching for lazy loading

I am getting some unexpected behaviour with entity framework 4.0 and I am hoping someone can help me understand this. I am using the northwind database for the purposes of this question. I am also using the default code generator (not poco or self tracking). I am expecting that anytime I query the context for the framework to only make a round trip if I have not already fetched those objects. I do get this behaviour if I turn off lazy loading. Currently in my application I am breifly turning on lazy loading and then turning it back off so I can get the desired behaviour. That pretty much sucks, so please help. Here is a good code example that can demonstrate my problem.
Public Sub ManyRoundTrips()
context.ContextOptions.LazyLoadingEnabled = True
Dim employees As List(Of Employee) = context.Employees.Execute(System.Data.Objects.MergeOption.AppendOnly).ToList()
'makes unnessesary round trip to the database, I just loaded the employees'
MessageBox.Show(context.Employees.Where(Function(x) x.EmployeeID < 10).ToList().Count)
context.Orders.Execute(System.Data.Objects.MergeOption.AppendOnly)
For Each emp As Employee In employees
'makes unnessesary trip to database every time despite orders being pre loaded.'
Dim i As Integer = emp.Orders.Count
Next
End Sub
Public Sub OneRoundTrip()
context.ContextOptions.LazyLoadingEnabled = True
Dim employees As List(Of Employee) = context.Employees.Include("Orders").Execute(System.Data.Objects.MergeOption.AppendOnly).ToList()
MessageBox.Show(employees.Where(Function(x) x.EmployeeID < 10).ToList().Count)
For Each emp As Employee In employees
Dim i As Integer = emp.Orders.Count
Next
End Sub
Why is the first block of code making unnessesary round trips?
Your expectation is not correct. Queries always query the DB. Always. That's because LINQ is always converted to SQL.
To load an object from the context if it's already been fetched and from the DB if it hasn't, use ObjectContext.GetObjectByKey().
The first 'unnecessary' trip is necessary - you did a new query and the database could have changed in the meantime. If you used the employees variable instead (where you have stored the result of the query) it wouldn't need to make a trip to the database.
The second one is necessary because you are asking it to fetch the Orders for each Employee. With Lazy loading and no Include() it hasn't read the Orders until you ask it to with emp.Orders.Count().
Remember that until you start iterating on a query (or call some method that requires it to iterate) LINQ to EF does nothing. If you save that query in a variable and then call .Count() on it there will be a round trip. If you use that same query and start enumerating it there will be another round trip. If the entities in that query themselves have relationships and lazy loading is on each time you access one there will be yet another round trip.
Your second example shows how to do this right if you know ahead of time that you want the Orders, that's eager loading. Notice how you don't go back to the context to ask it again for the employees, you reuse the one you've already loaded.

JSF session issue

I have got a situation where I have list of records say 10,000, I am using datatable and I am using paging,(10 records per display). I wanted to put put that list in the session as:
facesContext........put("mylist", mylist);
And in the getters of the mylist, I have
public List<MyClass> getMyList() {
if(mylist== null){
mylist= (List<MyClass>) FacesContext......getSessionMap().get("mylist");
}
return mylist;
}
Now the problem is whene ever i click on paging button to go to second page, only the first records are displayed,
I know i am missing some thing, and I have few questions:
Is the way of putting the list in session correct.
Is this the way I should be calling the list in my case.
Thnaks in advance...
Something entirely different: I strongly recommend to not put those 10.000 records in the session scope. That is plain inefficient. If 100 users are visiting your datatable, those records would be duplicated in memory for every user. This makes no sense. Just leave them in the database and write SQL queries accordingly that it returns exactly the rows you want to display per request. That's the job the DB is designed for. If the datamodel is well designed (indexes on columns involved in WHERE and if necessary ORDER BY clauses), then it's certainly faster than hauling the entire table in Java's memory for each visitor.
You can find more insights and code examples in this article.

Resources