Entity Framework Bulk Updates - performance

I'm currently using EF5 and I have an entity InvoiceGroup which is related to many Invoices. Each Invoice is also related to many Practices.
When I want to pay an invoice, I need only the practices with a certain status (L).
These L practices, which were originally related to invoice 0, need to move to another invoice that is created on the fly, and each invoice is different depending on the customer. In addition, these newly created invoices will belong to one InvoiceGroup.
I need to perform a bulk update that moves the practices from invoice 0 to the newly created invoice. The problem is that I have close to 5000 practices per invoice and more than 100 invoices in each group.
It takes a very long time to do this and I assume that EF updates them one by one.
I've planned to do this with a stored procedure (SP), but my question is: is there a better way to do this using only EF?

No, EF will only do INSERTs, UPDATEs, and DELETEs one-by-one.
You can create an SP, and either execute it directly, or import it into your EF model and execute it from there. Either way, you can't do the type of bulk operation you describe using only the Entity Framework.
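To illustrate what the stored procedure buys you, here is a minimal sketch of a set-based update: one UPDATE statement moves every matching practice, instead of one statement per row. It uses Python's built-in sqlite3 module purely as a stand-in for your real database, and the table and column names (Practice, InvoiceId, Status, CustomerId) are assumptions, not your actual schema.

```python
import sqlite3

def move_practices(conn, new_invoice_id, customer_id, status="L"):
    """Reassign all matching practices from invoice 0 in a single statement."""
    cur = conn.execute(
        "UPDATE Practice SET InvoiceId = ? "
        "WHERE InvoiceId = 0 AND Status = ? AND CustomerId = ?",
        (new_invoice_id, status, customer_id),
    )
    return cur.rowcount  # number of rows the UPDATE changed

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Practice ("
    "Id INTEGER PRIMARY KEY, InvoiceId INTEGER, Status TEXT, CustomerId INTEGER)"
)
rows = [(0, "L" if i % 2 == 0 else "P", i % 3) for i in range(6000)]
conn.executemany(
    "INSERT INTO Practice (InvoiceId, Status, CustomerId) VALUES (?, ?, ?)", rows
)

moved = move_practices(conn, new_invoice_id=42, customer_id=1)
conn.commit()
```

The body of a stored procedure would contain essentially the same UPDATE, parameterised by the new invoice and the customer.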

Related

Elasticsearch read model sync with write model

My application following CQRS strategy separates Read model from Write model. I have a Product and multiple Purchase orders related to that Product.
The PurchaseOrder read model is in Elasticsearch, with the product name attached. Now if I change the product name in the write model, I need to update the productName field of every related PurchaseOrder in the read model (using Elasticsearch's bulk update API).
My question is: as I have millions of PurchaseOrders, will this productName sync be a performance issue? Are there any suggestions for modeling this kind of syncing?
Although I do not believe that changing a product name on existing orders is a good idea (the invoice might have been generated and the product name in the order should match the one in the invoice), the question still has merit.
You may want your PurchaseOrder to only keep the ID (and perhaps the version?) of the Product, so that you can avoid such a mass update. This approach, on the other hand, requires a call to the Product aggregate root every time you want to translate the ID of the product into its name. The impact of such a read can obviously be mitigated by using a cache.
I guess it really depends on how often each of these two situations occurs; I would then optimize for the more frequent one.
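As a rough sketch of the ID-plus-version idea with a cache in front of the lookup: the order stores only productId and productVersion, and the name is resolved when the order is rendered. Everything here (the PRODUCT_NAMES dictionary standing in for the Product aggregate, the field names, the LOOKUPS list used to count uncached reads) is hypothetical.

```python
from functools import lru_cache

# Stand-in for the Product write model: names keyed by (id, version).
PRODUCT_NAMES = {("p-1", 1): "Widget", ("p-1", 2): "Widget Pro"}
LOOKUPS = []  # records every uncached read, to show the cache at work

@lru_cache(maxsize=1024)
def product_name(product_id, version):
    """Resolve a product name; an (id, version) pair never changes, so it caches safely."""
    LOOKUPS.append((product_id, version))
    return PRODUCT_NAMES[(product_id, version)]

def to_read_model(order):
    """Render an order for display, translating the product reference into a name."""
    return {
        "orderId": order["orderId"],
        "productId": order["productId"],
        "productName": product_name(order["productId"], order["productVersion"]),
    }

orders = [
    {"orderId": "o-1", "productId": "p-1", "productVersion": 1},
    {"orderId": "o-2", "productId": "p-1", "productVersion": 1},
    {"orderId": "o-3", "productId": "p-1", "productVersion": 2},
]
views = [to_read_model(o) for o in orders]
```

Renaming the product only adds a new version; no existing order document has to be rewritten, and orders placed against the old version keep showing the old name.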

Event Sourcing - Aggregate modeling

Two questions
1) How to model aggregate and reference between them
2) How to organise/store events so that they can be retrieved efficiently
Take this typical use case as an example: we have Order and LineItem (they form an aggregate, with Order as the aggregate root), and a Product aggregate.
LineItem needs to know which Product it refers to, so there are two options: 1) LineItem has a direct reference to the Product aggregate (which does not seem to be best practice, as it violates the idea of an aggregate being a consistency boundary, because we could update the Product aggregate directly from the Order aggregate); 2) LineItem only has a ProductId.
It looks like the 2nd option is the way to go... What do you think?
However, another problem arises when building an Order read/view model. This Order view model needs to know which Products are in the Order (i.e. ProductId, Type, etc.). The typical use case is reporting, and a CommandHandler can also use this Product object to perform logic such as checking whether there are too many of a particular product, etc. Given that the data lives in two separate aggregates, we need more than one database roundtrip. As we are using events to build the model, the pseudo code looks like this:
1) for a given order id (guid, order aggregate id), we load all the events for it; -- 1st database access
2) then build an Order aggregate, so we know which ProductIds are referenced in the Order;
3) for the list of ProductIds, we load all events for them; -- 2nd database access
If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database accesses (each of which is slow)... What's your idea here?
Thanks
> Take this typical use case as an example: we have Order and LineItem (they form an aggregate, with Order as the aggregate root), and a Product aggregate.
The Order aggregate makes sense the way you have described it. "Product aggregate" is more suspicious; do you ask the model if the product is allowed to change, or are you telling the model that the product has changed?
If Product can change without first consulting with the order, then the LineItem must not include the product. A reference to the product (aka the ProductId) is ok.
> If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database accesses (each of which is slow)... What's your idea here?
For reads, reports, and the like -- where you aren't going to be adding new events to the history -- one possible answer is to do the slow work in advance. An asynchronous process listens for writes in the event store, and then publishes those events to a bus. Subscribers build new versions of the reports when new events are observed, and cache the results. (search keyword: cqrs)
When a client asks for a report, you give them one out of the cache. All the work is done, so it's very quick.
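A minimal sketch of that idea, assuming hypothetical event names (ProductRegistered, LineItemAdded) and dictionary-shaped events: a subscriber folds each event into a precomputed report as it arrives, so answering a query is just a lookup.

```python
class OrderReportProjection:
    """Subscriber that keeps a ready-made report per order, updated event by event."""

    def __init__(self):
        self.product_types = {}  # productId -> type, learned from Product events
        self.reports = {}        # orderId -> list of report lines (the "cache")

    def handle(self, event):
        kind = event["type"]
        if kind == "ProductRegistered":
            self.product_types[event["productId"]] = event["productType"]
        elif kind == "LineItemAdded":
            line = {
                "productId": event["productId"],
                # Product data is copied in at projection time, not at query time.
                "productType": self.product_types.get(event["productId"]),
                "quantity": event["quantity"],
            }
            self.reports.setdefault(event["orderId"], []).append(line)

    def report(self, order_id):
        """Query side: no event replay, only a dictionary lookup."""
        return self.reports.get(order_id, [])


projection = OrderReportProjection()
for evt in [
    {"type": "ProductRegistered", "productId": "p-1", "productType": "book"},
    {"type": "LineItemAdded", "orderId": "o-1", "productId": "p-1", "quantity": 2},
    {"type": "LineItemAdded", "orderId": "o-1", "productId": "p-9", "quantity": 1},
]:
    projection.handle(evt)

report = projection.report("o-1")
```

In a real system the loop would be replaced by whatever delivers events from the bus, and the dictionaries by a persistent store; the sketch only shows where the work happens.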
For command handlers, the answer is more complicated. Business rules are supposed to be in the domain model, so having the command handler try to validate the command (as opposed to the domain model) is a bit broken.
The command handler can load the products to see what the state might look like, and pass that information to the aggregate with the command data, but it's not clear that's a good idea. If the client is going to send a command to be run, and you need to flesh out the Order command with Product data, why not have the client add the product data to the command directly and skip the middleman?
> CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc.
This example is a bit vague, but taking a guess: you are thinking about cases where you prevent an order from being placed if the available inventory is insufficient to fulfill the order.
For real world inventory - a physical book in a warehouse - that's probably the wrong approach to take. First, the model itself is wrong; if you want to know how much product is in the warehouse, you should be querying the warehouse, not the product. Second, a physical warehouse is not constrained by your model -- calling the addProduct method on a warehouse aggregate doesn't cause the product to magically appear there.
Third, it probably doesn't match very well with what your domain experts want anyway. If the model says that the warehouse doesn't have enough product, do you think the stakeholders want the system to
1) tell the shopper to buy the product somewhere else, or...
2) accept the order, and contact the supplier for a new delivery?
Hint: when in doubt, carefully review how amazon.com does it.

CRM - Transfer list of products from opportunity to order on close opportunity

When closing an opportunity as won, a workflow will run and create an Order based on some of the fields in Opportunity. We are not currently transferring the product list from opportunity to order. We are aware that there is a default relationship between opportunity and product called 'opportunityproduct'.
We need to know if there is a way to transfer the Products from 'opportunityproduct' to 'orderproduct'. Can this be achieved solely through the use of CRM workflow? Or would we need to create a custom Plugin?

Very slow search of a simple entity relationship

We use CRM 4.0 at our institution and have no plans to upgrade at present, as we've spent the last year and a half customising and extending the CRM to work with our processes.
A tiny part of the model is a simple hierarchy: we have a group of learning rooms that has a one-to-many relationship with another entity that describes the courses available for that learning room.
Another entity has a list of all potential and enrolled students who have expressed an interest in whichever course.
That bit's all straightforward and works pretty well and is modelled into 3 custom entities.
Now, we've got an Admin application that reads the rooms and then wants to show the courses for that room, but only where there are enrolled students.
In SQL this is simplified to:
SELECT DISTINCT r.CourseName, r.OtherInformation
FROM Rooms r
INNER JOIN Students s
ON s.CourseId = r.CourseId
WHERE r.RoomId = @RoomId
And this indeed is very close to the eventual SQL that CRM generates.
We use a Crm QueryEntity, a Filter and a LinkEntity to represent this same structure.
The problem now is that CRM normalizes a customized entity into a Base table, which has the standard CRM entity data that all entities share, and an ExtensionBase table, which has our customisations. To give flattened access to this, it creates a view that merges both tables.
This view is what is used by the Generated SQL.
Now the base tables have indices but the view doesn't.
All we want to do is return Courses where the inner join is satisfied. It's enough to prove there are entries, and CRM makes it a SELECT DISTINCT, so we only get one item back for the Room.
At first this worked perfectly well, but now we have thousands of queries, it takes well over 30 seconds and of course causes a timeout in anything but SSMS.
I'm given to believe that we can create and alter indices on tables in CRM and that's not considered to be an unsupported modification; but what about views?
I know that if we alter an entity then its views are recreated, which would of course make us redo our indices when this happens.
Is there any way to hint to CRM 4.0 that we want a specific index in place?
Another source recommends that where you get problems like this, then it's best to bring data closer together, but this isn't something I'd feel comfortable in trying to engineer into our solution.
I had considered adding a new entity that only has RoomId, CourseId and Enrolment Count in it, but that smacks of being incredibly hacky too; after all, an index would remove the need to duplicate this data and to have some kind of trigger that updates it after every student operation.
Lastly, whilst I know we're stuck on CRM 4 at the moment, is this the kind of thing that we could expect to have resolved in CRM 2011? It would certainly add more weight to the argument for upgrading this 5-year-old product.
Since views are "dynamic" (conceptually, their contents are generated on the fly from the base tables every time they are used), they typically can't be indexed. However, SQL Server does support something called an "indexed view". You need to create a unique clustered index on the view, and the query optimizer should be able to use it to speed up your join.
Someone asked a similar question here and I see no conclusive answer. The cited concerns from Microsoft are referential integrity (a non-issue here) and upgrade complications. You mention the unsupported option of indexing the view and managing it over upgrades and entity changes. That is an option; unsupported and hackish as it is, it should work.
FetchXml does have aggregation, but the query execution plans still use the views. Here is the SQL generated from a simple select count from incident:
'select
top 5000 COUNT(*) as "rowcount"
, MAX("__AggLimitExceededFlag__") as "__AggregateLimitExceeded__" from (select top 50001 case when ROW_NUMBER() over(order by (SELECT 1)) > 50000 then 1 else 0 end as "__AggLimitExceededFlag__" from Incident as "incident0" ...
I don't see a supported solution for your problem.
If you are building an outside admin app and you are hosting CRM 4 on-premise, you could go directly to the database for your query, bypassing the CRM API. This is not supported, but it would allow you to solve the problem.
I'm going to add this as a potential answer, although I don't believe it's a sustainable or indeed valid long-term solution.
After analysing the indexes that CRM had defined automatically, I realised that selecting more information in my query would be enough to fulfil the column requirements of an index, and now the query runs in less than a second.

Creating a Relationship Between Order Product (salesorderdetail) and Service Activity (serviceappointment)

We are using Microsoft CRM 4.0 to run a consulting business. It's working pretty well but we want to simplify the way we are doing some things. What we want to do is create an Order (salesorder) with multiple Order Products (salesorderdetail). So far so good.
Next I want to be able to associate each Order Product (salesorderdetail) with a Service Activity (serviceappointment), representing that this billable line item in the order is actually going to be fulfilled as a consulting engagement.
The problem is, I can't seem to create an association between the Order Product (salesorderdetail) and the Service Activity (serviceappointment). It simply doesn't appear in the drop-down list.
Can anyone think of a reason for this? I've seen some posts about field mapping between Quote Product, Order Product, Opportunity Product and Invoice Product, but that isn't quite what I am after.
Any suggestions gratefully received, even if it is an explanation of why it's not possible.
I created a simple 1:N mapping from Case to Invoice. The Case records its ID and Title in custom fields in the Invoice. Unfortunately this does not allow for product creation as children of the Invoice, so that should be created as a custom code workflow.
