How to bulk load CRM custom entities - dynamics-crm

I'm using CRM 2013 and I would like to return a large number of custom entities (around 100) based on values in a list.
I can't do context.MyCustomEntity.Where(i=>list.Contains(i.Id));
I can't use RetrieveMultiple since I want to update these entities and send them back to the server.
So I'm forced to call context.MyCustomEntity.Where(i=>i.Id == id) in a loop.
Is there some way to prefetch the entities into the context, or to execute the Where clause once rather than in a loop?

If you already have the Ids and you don't need to reference any existing values on the records, you can just update the records directly. Here's an example:
var toUpdate = new MyCustomEntity { Id = id };
// update the appropriate values
orgService.Update(toUpdate);
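If there are many records to touch, the same idea can be batched. Here's a minimal sketch using the SDK's ExecuteMultipleRequest (available in CRM 2013); the settings shown are illustrative, not required values:

var batch = new ExecuteMultipleRequest
{
    // illustrative settings: stop on first error, skip individual responses
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = false,
        ReturnResponses = false
    },
    Requests = new OrganizationRequestCollection()
};
foreach (var id in ids) // ids is your existing list of Guids
{
    var toUpdate = new MyCustomEntity { Id = id };
    // set the appropriate values here
    batch.Requests.Add(new UpdateRequest { Target = toUpdate });
}
orgService.Execute(batch);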
However, if you do need to update the fields dynamically, you could chunk the retrievals by using PredicateBuilder to build a dynamic OR condition, though you are limited in the number of conditions you can add to a single query. Let me know if this is a path you need to go down, and I can provide more guidance.
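As a minimal sketch of that chunked approach, assuming LinqKit's PredicateBuilder (the chunk size of 500 is an arbitrary placeholder, and you should verify that the CRM LINQ provider will translate the expanded predicate):

const int chunkSize = 500; // arbitrary; tune to the provider's criteria limit
var results = new List<MyCustomEntity>();
foreach (var chunk in list.Select((id, i) => new { id, i })
                          .GroupBy(x => x.i / chunkSize, x => x.id))
{
    // build "Id == a || Id == b || ..." for this chunk
    var predicate = PredicateBuilder.False<MyCustomEntity>();
    foreach (var id in chunk)
    {
        var captured = id; // avoid modified-closure surprises
        predicate = predicate.Or(e => e.Id == captured);
    }
    results.AddRange(context.MyCustomEntity
                            .AsExpandable() // LinqKit expression expansion
                            .Where(predicate));
}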

Related

Eloquent Eager Loading in Cursor (Lazy Collection)

I'm trying to export a large number of records from my database, but I need relationship data in order to build the export correctly. Ideally I would be able to use cursor() to get a Lazy Collection, but that won't load the relationships. I can't load the relationship within a loop, because that will create N+1 queries, and this could be hundreds of thousands of additional queries, which is unacceptable.
Here's what "works" (but runs out of memory):
Record::with('projects')->get()->map(function ($record) {
    dd($record); // Shows the `projects` relationship
});
But when I use cursor()...
Record::with('projects')->cursor()->map(function ($record) {
    dd($record); // Does NOT show the `projects` relationship
});
Is there a way to get a lazy collection that includes a record's relationship? I have looked in the documentation and it's not clear. Other suggestions have been to use chunk() which is unfortunately not a possibility in this situation.
EDIT: I shouldn't say chunk isn't a possibility, but it's a very expensive re-write. Currently, the data is structured with a lot of variability. So in order to construct the CSV for export, I need (for example) a header for the file. I currently grab that header by looping through all the records (the fields are stored in a JSONB field) and building out an array based on the fields present on those records.
I am also normalizing the data against those headers. So if one record has the field "address-1" but another record doesn't have that, the one that doesn't have it instead shows a blank value in the appropriate column. Otherwise, when inserting the row into the CSV, it doesn't respect the header.
These operations currently grab the entire data set and use a LazyCollection to map the header and normalize the records, and then feed it into the CSV one at a time. It would be ideal if I could grab relationships in a LazyCollection as well rather than having to rewrite the workflow.
According to the documentation, cursor() works at the database stage, while relationship loading happens after a method like get() or first(). So the code inside the cursor() loop sees each database row hydrated as a model instance, one at a time, before the overall result set exists; eager loading never runs as you iterate through your database records, which is why the relationship is missing.
If you can't use chunk(), then I think you could fall back to raw expressions and let MySQL assemble the data for you.

Entity Framework - Querying from ObjectContext vs Querying from Navigation Property

I've noticed that depending on how I extract data from my Entity Framework model, I get different types of results. For example, when getting the list of employees in a particular department:
If I pull directly from ObjectContext, I get an IQueryable<Employee>, which is actually a System.Data.Objects.ObjectQuery<Employee>:
var employees = MyObjectContext.Employees.Where(e => e.DepartmentId == MyDepartment.Id && e.SomeCondition);
But if I use the Navigation Property of MyDepartment, I get an IEnumerable<Employee>, which is actually a System.Linq.WhereEnumerableIterator<Employee> (private class in System.Linq.Enumerable):
var employees = MyDepartment.Employees.Where(e => e.SomeCondition);
In the code that follows, I heavily use employees in several LINQ queries (Where, OrderBy, First, Sum, etc.)
Should I be taking into consideration which query method I use? Will there be a performance difference? Does the latter use deferred execution? Is one better practice? Or does it not make a difference?
I ask this because, since installing ReSharper 6, I'm getting lots of "Possible multiple enumeration of IEnumerable" warnings when using the latter method, but none when using direct queries. I've been using the latter method more often, simply because it's much cleaner to write, and I'm wondering whether doing so has actually had a detrimental effect!
There is a very big difference.
If you use the first approach you have an IQueryable, i.e. an expression tree, and you can still add other expressions; only when you execute the query (deferred execution) is the expression tree converted to SQL and executed in the database. So if you take your first example and add a .Sum of something, the operation really executes in the database and only a single number is transferred back to your application. That is LINQ to Entities.
The second example uses an in-memory collection. A navigation property doesn't represent an IQueryable (expression tree), so all LINQ commands are treated as LINQ to Objects: every record backing the navigation property must first be loaded from the database into your application, and all operations run in your application server's memory. You can load a navigation property eagerly (by using Include), explicitly (by using Load), or lazily (it happens automatically the first time you access the property, if lazy loading is enabled). So if you want the sum of something, this scenario requires you to load all the data from the database and then compute the sum locally.
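To make the contrast concrete, here is a short sketch (the Salary property is assumed purely for illustration):

// IQueryable: translated to SQL, e.g. SELECT SUM(Salary) ... WHERE DepartmentId = @p0;
// only a single number comes back over the wire.
var dbSum = MyObjectContext.Employees
    .Where(e => e.DepartmentId == MyDepartment.Id)
    .Sum(e => e.Salary);

// IEnumerable: every Employee in the department is materialized first,
// then LINQ to Objects computes the sum in your application's memory.
var localSum = MyDepartment.Employees.Sum(e => e.Salary);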

Performance issue using foreach in LINQ

I am using an IList<Employee> where I retrieve more than 5,000 records using LINQ. Which approach would be better? empdetailsList has 5,000 records.
Example :
foreach (Employee emp in empdetailsList)
{
    Employee employee = Details.GetFeeDetails(emp.Emplid);
}
The above example takes a lot of time, because it iterates over each employee record and fetches the corresponding fee details one at a time.
Can anybody suggest what to do?
Linq to SQL/Linq to Entities use a deferred execution pattern. As soon as you call For Each or anything else that indirectly calls GetEnumerator, that's when your query gets translated into SQL and performed against the database.
The trick is to make sure your query is completely and correctly defined before that happens. Use Where(...), and the other Linq filters to reduce as much as possible the amount of data the query will retrieve. These filters are built into a single query before the database is called.
Linq to SQL/Linq to Entities also both use Lazy Loading. This is where if you have related entities (like Sales Order --> has many Sales Order Lines --> has 1 Product), the query will not return them unless it knows it needs to. If you did something like this:
Dim orders = entities.SalesOrders
For Each o In orders
    For Each ol In o.SalesOrderLines
        Console.WriteLine(ol.Product.Name)
    Next
Next
You will get awful performance, because at the time of calling GetEnumerator (the start of the For Each), the query engine doesn't know you need the related entities, so "saves time" by ignoring them. If you observe the database activity, you'll then see hundreds/thousands of database roundtrips as each related entity is then retrieved 1 at a time.
To avoid this problem, if you know you'll need related entities, use the Include() method in Entity Framework. If you've got it right, when you profile the database activity you should only see a single query being made, and every item being retrieved by that query should be used for something by your application.
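In C#, using the example above, eager loading looks something like this (the string-path Include is the ObjectContext-era API; entity names follow the VB sketch):

// One joined query retrieves orders, their lines, and each line's product,
// instead of one round-trip per order and per line.
var orders = entities.SalesOrders.Include("SalesOrderLines.Product");

foreach (var o in orders)
{
    foreach (var ol in o.SalesOrderLines)
    {
        Console.WriteLine(ol.Product.Name);
    }
}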
If the call to Details.GetFeeDetails(emp.Emplid) involves another round-trip of some sort, then that's the issue. I would suggest altering your query in this case to return the fee details with the original IList<Employee> query.
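For instance, if fee details live in a related table, something like the following returns everything in one round-trip instead of 5,000 (the context variable, the FeeDetails set, and the Emplid join key are assumptions about your model):

// Single query: employees joined to their fee details, materialized once.
var employeesWithFees =
    (from emp in context.Employees
     join fee in context.FeeDetails on emp.Emplid equals fee.Emplid
     select new { Employee = emp, Fee = fee }).ToList();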

Using Linq SubmitChanges without TimeStamp and StoredProcedures the same time

I am using Sql tables without rowversion or timestamp. However, I need to use Linq to update certain values in the table. Since Linq cannot know which values to update, I am using a second DataContext to retrieve the current object from database and use both the database and the actual object as Input for the Attach method like so:
Public Sub SaveCustomer(ByVal cust As Customer)
    Using dc As New AppDataContext()
        If (cust.Id > 0) Then
            Dim tempCust As Customer = Nothing
            Using dc2 As New AppDataContext()
                tempCust = dc2.Customers.Single(Function(c) c.Id = cust.Id)
            End Using
            dc.Customers.Attach(cust, tempCust)
        Else
            dc.Customers.InsertOnSubmit(cust)
        End If
        dc.SubmitChanges()
    End Using
End Sub
While this does work, I have a problem though: I am also using StoredProcedures to update some fields of Customer at certain times. Now imagine the following workflow:
Get customer from database
Set a customer field to a new value
Use a stored procedure to update another customer field
Call SaveCustomer
What happens now is that the SaveCustomer method retrieves the current object from the database, which does not contain the value set in code but DOES contain the value set by the stored procedure. When that copy is attached together with the actual object and the changes are submitted, the value set in code is updated in the database and... tadaaaa... the other one is set back to NULL, since the actual object does not contain the change made by the stored procedure.
Was that understandable?
Is there any best practice to solve this problem?
If you make changes behind the back of the ORM and don't use concurrency checking, then you are going to have problems. You don't show what you did in step 3, but IMO you should update the object model to reflect those changes, perhaps using OUTPUT TSQL parameters; or stick to the object-oriented approach throughout.
Of course, doing anything without concurrency checking is a good way to lose data, so my preferred option is simply to add a rowversion. Otherwise, you could perhaps read the updated object out and merge things... somehow guessing what the right data is...
If you're going to disconnect your object from one context and use another one for the update, you need to either retain the original object, use a rowversion, or implement some sort of hashing routine in your database and retain the hash as part of your object. Of these, I highly recommend the rowversion option as well. Using the current database value as the "original", the way you are doing now, is only asking for concurrency problems.
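Here is a minimal sketch of the "retain the original object" option (shown in C# for brevity; the DeepCopy helper is hypothetical):

// Retrieve once and keep an untouched copy to serve as the Attach baseline.
Customer original;
using (var dc = new AppDataContext())
{
    original = dc.Customers.Single(c => c.Id == id);
}

Customer working = DeepCopy(original); // hypothetical clone helper
working.SomeField = "new value";       // the change made in code

using (var dc = new AppDataContext())
{
    // LINQ to SQL diffs 'working' against 'original', so only members changed
    // in code end up in the UPDATE's SET clause. Note that default UpdateCheck
    // settings can still raise a ChangeConflictException if the stored
    // procedure modified a checked column in the meantime.
    dc.Customers.Attach(working, original);
    dc.SubmitChanges();
}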

Using LINQ with stored procedure that returns multiple instances of the same entity per row

Our development policy dictates that all database accesses are made via stored procedures, and this is creating an issue when using LINQ.
The scenario discussed below has been somewhat simplified, in order to make the explanation easier.
Consider a database that has 2 tables.
Orders (OrderID (PK), InvoiceAddressID (FK), DeliveryAddressID (FK) )
Addresses (AddressID (PK), Street, ZipCode)
The resultset returned by the stored procedure has to rename the address related columns, so that the invoice and delivery addresses are distinct from each other.
OrderID  InvAddrID  DelAddrID  InvStreet  DelStreet  InvZipCode  DelZipCode
1        27         46         Main St    Back St    abc123      xyz789
This, however, means that LINQ has no idea what to do with these columns in the resultset, as they no longer match the property names in the Address entity.
The frustrating thing about this is that there seems to be no way to define which resultset columns map to which Entity properties, even though it is possible (to a certain extent) to map entity properties to stored procedure parameters for the insert/update operations.
Has anybody else had the same issue?
I'd imagine that this is a relatively common scenario, from a schema point of view, but the stored procedure seems to be the key factor here.
Have you considered creating a view like the below for the stored procedure to select from? It would add complexity, but allow LINQ to see the Entity the way you wanted.
Create view OrderAddress as
Select o.OrderID
      ,i.AddressID as InvID
      ,d.AddressID as DelID
      ...
from Orders o
left join Addresses i
    on o.InvoiceAddressID = i.AddressID
left join Addresses d
    on o.DeliveryAddressID = d.AddressID
LINQ is a bit fussy about querying data; it wants the schema to match. I suspect you're going to have to bring that result back into an automatically generated type and do the mapping to your entity type afterwards in LINQ to Objects (i.e. after AsEnumerable() or similar), as LINQ doesn't like you creating instances of the mapped entities manually inside a query.
Actually, I would recommend challenging the requirement in one respect: rather than SPs, consider using UDFs to query data. They work similarly in terms of being owned by the database, but they are composable at the server (paging, sorting, joins, etc.).
(this bit is a bit random - take it with a pinch of salt)
UDFs can be associated with entity types if the schema matches, so another option (I haven't tried it) would be to have a GetAddress(id) udf, and a "main" udf, and join them:
var qry = from row in ctx.MainUdf(id)
          select new {
              Order = ctx.GetOrder(row.OrderId),
              InvoiceAddress = ctx.GetAddress(row.InvoiceAddressId),
              DeliveryAddress = ctx.GetAddress(row.DeliveryAddressId)
          };
(where the udf just returns the ids - actually, you might have to join to the other udfs, making it even worse).
or something - might be too messy for serious consideration, though.
If you know exactly what columns your result set will include, you should be able to create a new entity type that has properties for each column in the result set. Rather than trying to pack the data into an Order, for example, you can pack it into an OrderWithAddresses, which has exactly the structure your stored procedure would expect. If you're using LINQ to Entities, you should even be able to indicate in your .edmx file that an OrderWithAddresses is an Order with two additional properties. In LINQ to SQL you will have to specify all of the columns as if it were an entirely unrelated data type.
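Such a type would simply mirror the result set columns, e.g. (property types assumed from the sample row shown earlier):

public class OrderWithAddresses
{
    public int OrderID { get; set; }
    public int InvAddrID { get; set; }
    public int DelAddrID { get; set; }
    public string InvStreet { get; set; }
    public string DelStreet { get; set; }
    public string InvZipCode { get; set; }
    public string DelZipCode { get; set; }
}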
If your columns get generated dynamically by the stored procedure, you will need to try a different approach: Create a new stored procedure that only pulls data from the Orders table, and one that only pulls data from the addresses table. Set up your LINQ mapping to use these stored procedures instead. (Of course, the only reason you're using stored procs is to comply with your company policy). Then, use LINQ to join these data. It should be only slightly less efficient, but it will more appropriately reflect the actual structure of your data, which I think is better programming practice.
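A rough sketch of that join (usp_GetOrdersOnly and usp_GetAddressesOnly are hypothetical names for the split procedures):

// Materialize both stored procedure results, then join in LINQ to Objects.
var orders = db.usp_GetOrdersOnly().ToList();
var addresses = db.usp_GetAddressesOnly().ToList();

var result =
    from o in orders
    join inv in addresses on o.InvoiceAddressID equals inv.AddressID
    join del in addresses on o.DeliveryAddressID equals del.AddressID
    select new { Order = o, InvoiceAddress = inv, DeliveryAddress = del };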
I think I understand what you're after, but I could be wildly off...
If you mock up classes in a DBML (right-click -> new -> class) that have the same structure as your source tables, you can simply create new objects based on what is read from the stored procedure. Using LINQ to Objects, you can still query your selection. It's more code, but it's not that hard to do. For example, mock up your DBML like this:
(Screenshot of the DBML associations: http://geeksharp.com/screens/orders-dbml.png)
Make sure you pay attention to the associations I added. You can expand "Parent Property" and change the name of those associations to "InvoiceAddress" and "DeliveryAddress." I also changed the child property names to "InvoiceOrders" and "DeliveryOrders" respectively. Notice the stored procedure up top called "usp_GetOrders." Now, with a bit of code, you can map the columns manually. I know it's not ideal, especially if the stored proc doesn't expose every member of each table, but it can get you close:
public List<Order> GetOrders()
{
    // our DBML classes
    List<Order> dbOrders = new List<Order>();
    using (OrderSystemDataContext db = new OrderSystemDataContext())
    {
        // call stored proc
        var spOrders = db.usp_GetOrders();
        foreach (var spOrder in spOrders)
        {
            Order ord = new Order();
            Address invAddr = new Address();
            Address delAddr = new Address();
            // set all the properties
            ord.OrderID = spOrder.OrderID;
            // add the invoice address
            invAddr.AddressID = spOrder.InvAddrID;
            invAddr.Street = spOrder.InvStreet;
            invAddr.ZipCode = spOrder.InvZipCode;
            ord.InvoiceAddress = invAddr;
            // add the delivery address
            delAddr.AddressID = spOrder.DelAddrID;
            delAddr.Street = spOrder.DelStreet;
            delAddr.ZipCode = spOrder.DelZipCode;
            ord.DeliveryAddress = delAddr;
            // add to the collection
            dbOrders.Add(ord);
        }
    }
    // at this point I have a List of orders I can query...
    return dbOrders;
}
Again, I realize this seems cumbersome, but I think the end result is worth a few extra lines of code.
This isn't very efficient at all, but if all else fails, you could try making two procedure calls from the application: one to get the invoice address and another to get the delivery address.
