Very slow search of a simple entity relationship - dynamics-crm

We use CRM 4.0 at our institution and have no plans to upgrade at present, as we've spent the last year and a half customising and extending the CRM to work with our processes.
A tiny part of the model is a simple hierarchy: we have a group of learning rooms that has a one-to-many relationship with another entity that describes the courses available for that learning room.
Another entity has a list of all potential and enrolled students who have expressed an interest in whichever course.
That bit's all straightforward and works pretty well and is modelled into 3 custom entities.
Now, we've got an Admin application that reads the rooms and then wants to show the courses for that room, but only where there are enrolled students.
In SQL this is simplified to:
SELECT DISTINCT r.CourseName, r.OtherInformation
FROM Rooms r
INNER JOIN Students S
ON S.CourseId = r.CourseId
WHERE r.RoomId = @RoomId
And this indeed is very close to the eventual SQL that CRM generates.
We use a Crm QueryEntity, a Filter and a LinkEntity to represent this same structure.
The problem is that CRM normalizes a customized entity into a Base table, which holds the standard CRM entity data that all entities share, and an ExtensionBase table, which holds our customisations. To give flattened access to these, it creates a view that merges both tables.
This view is what the generated SQL uses.
Now, the base tables have indices but the view doesn't.
All we want to do is return courses where the inner join is satisfied. It's enough to prove that there are entries, and CRM makes the query a SELECT DISTINCT, so we only get one row back per course for the room.
At first this worked perfectly well, but now we have thousands of queries, it takes well over 30 seconds and of course causes a timeout in anything but SSMS.
I'm given to believe that we can create and alter indices on tables in CRM and that this is not considered an unsupported modification, but what about views?
I know that if we alter an entity then its views are recreated, which would of course make us redo our indices when this happens.
Is there any way to hint to CRM 4.0 that we want a specific index in place?
Another source recommends that where you get problems like this, then it's best to bring data closer together, but this isn't something I'd feel comfortable in trying to engineer into our solution.
I had considered adding a new entity that only holds RoomId, CourseId and Enrolment Count, but that smacks of being incredibly hacky too. After all, an index would remove the need to duplicate this data and to have some kind of trigger that updates it after every student operation.
Lastly, whilst I know we're stuck on CRM 4 at the moment, is this the kind of thing that we could expect to have resolved in CRM 2011? It would certainly add more weight to the argument for upgrading this 5-year-old product.

Since views are "dynamic" (conceptually, their contents are generated on-the-fly from the base tables every time they are used), they typically can't be indexed. However, SQL Server does support something called an "indexed view". You need to create a unique clustered index on the view, and the query optimizer should be able to use it to speed up your join.

Someone asked a similar question here and I see no conclusive answer. The concerns cited by Microsoft are referential integrity (a non-issue here) and upgrade complications. You mention the unsupported option of adding the view and managing it over upgrades and entity changes. That is an option; as unsupported and hackish as it is, it should work.
FetchXml does have aggregation, but the query execution plans still use the views. Here is the SQL generated from a simple select count from incident:
'select
top 5000 COUNT(*) as "rowcount"
, MAX("__AggLimitExceededFlag__") as "__AggregateLimitExceeded__" from (select top 50001 case when ROW_NUMBER() over(order by (SELECT 1)) > 50000 then 1 else 0 end as "__AggLimitExceededFlag__" from Incident as "incident0" ...
I don't see a supported solution for your problem.
If you are building an outside admin app and you are hosting CRM 4 on-premise, you could go directly to the database for your query, bypassing the CRM API. This is not supported, but it would allow you to solve the problem.
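If you do go to the database directly, the "it's enough to prove there are entries" requirement from the question can be written as an EXISTS test rather than a DISTINCT join. The following is only a minimal sketch, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (courses, students, course_id, room_id) are hypothetical simplifications, not the real CRM views, and whether this form is actually faster on a CRM database would have to be checked against the real execution plan.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the custom entities (not the CRM schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses  (course_id INTEGER PRIMARY KEY, room_id INTEGER, course_name TEXT);
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, course_id INTEGER);
    CREATE INDEX ix_students_course ON students (course_id);
    INSERT INTO courses  VALUES (1, 10, 'Maths'), (2, 10, 'Physics'), (3, 20, 'Art');
    INSERT INTO students VALUES (1, 1), (2, 1), (3, 3);
""")

def courses_distinct_join(room_id):
    """Shape of the original query: join to students, then de-duplicate."""
    rows = conn.execute(
        """SELECT DISTINCT c.course_name
           FROM courses c
           INNER JOIN students s ON s.course_id = c.course_id
           WHERE c.room_id = ?
           ORDER BY c.course_name""",
        (room_id,),
    ).fetchall()
    return [name for (name,) in rows]

def courses_exists(room_id):
    """Same result, but only asks whether at least one student row exists."""
    rows = conn.execute(
        """SELECT c.course_name
           FROM courses c
           WHERE c.room_id = ?
             AND EXISTS (SELECT 1 FROM students s WHERE s.course_id = c.course_id)
           ORDER BY c.course_name""",
        (room_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Both functions return the same courses for a room; the second never needs to produce one row per student before discarding the duplicates.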

I'm going to add this as a potential answer, although I don't believe it's a sustainable or indeed valid long-term solution.
After analysing the indexes that CRM had defined automatically, I realised that selecting more information in my query would be enough to fulfil the column requirements of an index, and now the query runs in less than a second.

How to create a quick search in CRM that spans multiple entities with grouped conditions

We are a housing association with a large CRM system (2016 & SP1). We have a new requirement: our users need to be able to search for people who are current (i.e. not previous) occupants or residents, or who are not residents (e.g. contractors).
For this purpose, we need to search the Person entity which has a related Tenancy entity. Person has TenancyType field with possible (option set) values Occupant, Resident, Contractor. Tenancy has TenancyStatus field with possible (text) values Current and Previous.
We tried using the following filter criteria in the quick view on the Person entity:
thinking that it would return all people who are not previous residents. However we noticed that it would filter out contractors because contractors do not have related tenancy records.
We needed to change the criteria to return all contractors OR all residents and occupants with no previous tenancy. So we changed it to the following:
at which point we got stuck because we noticed that it was not possible to AND together the second and the third conditions as the third one is a related entity.
We are wondering what the best way is to achieve the above, bearing in mind that we do not want a separate view for each condition, e.g. one for residents, one for non-residents, etc.
Any help or suggestion is greatly appreciated.
It is not possible to do this with a single query.
Instead, you can use two queries. If you do not want to do that, then using reports (as suggested by Alex) or a BI-solution would be other possibilities.
Thanks to everyone here who spent time answering my question. The following describes the correct answer:
https://community.dynamics.com/crm/f/117/p/241352/666651#666651

Event Sourcing - Aggregate modeling

Two questions
1) How to model aggregates and the references between them
2) How to organise/store events so that they can be retrieved efficiently
Take this typical use case as an example: we have Order and LineItem (together they form an aggregate, with Order as the aggregate root), and a Product aggregate.
As LineItem needs to know which Product it refers to, there are two options: 1) LineItem holds a direct reference to the Product aggregate (which does not seem to be best practice, as it violates the idea of an aggregate being a consistency boundary, because we could update the Product aggregate directly from the Order aggregate); 2) LineItem only holds the ProductId.
It looks like the 2nd option is the way to go... What do you think?
However, another problem arises when building an Order read/view model. This Order view model needs to know which Products are in the Order (i.e. ProductId, Type, etc.). The typical use case is reporting, and a CommandHandler can also use this Product object to perform logic such as checking whether there are too many of a particular product, etc. Because the data lives in two separate aggregates, we need more than one database round trip. As we are using events to build the model, the pseudocode looks like this:
1) for a given order id (guid, order aggregate id), we load all the events for it; -- 1st database access
2) then build an Order aggregate, so that we know which ProductIds are referenced in the Order;
3) for the list of ProductIds, we load all events for it; -- 2nd database access
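The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the event store is a hypothetical in-memory dictionary, and the event names (LineItemAdded, ProductCreated, ProductRenamed) and their fields are invented for the example. The one design point it shows is that step 3 can fetch all referenced product streams in a single batched access instead of one access per product.

```python
from collections import defaultdict

# Hypothetical in-memory event store: aggregate id -> ordered list of events.
EVENT_STORE = defaultdict(list)

def append(aggregate_id, event_type, **data):
    EVENT_STORE[aggregate_id].append({"type": event_type, **data})

def load_events(aggregate_ids):
    """Stands in for ONE database access returning the streams for all ids."""
    return {agg_id: list(EVENT_STORE[agg_id]) for agg_id in aggregate_ids}

def build_order(events):
    """Step 2: fold the order's events into product_id -> quantity."""
    lines = {}
    for e in events:
        if e["type"] == "LineItemAdded":
            lines[e["product_id"]] = lines.get(e["product_id"], 0) + e["quantity"]
    return lines

def build_product(events):
    """Fold a product's events into its current state."""
    product = {}
    for e in events:
        if e["type"] == "ProductCreated":
            product.update(name=e["name"], type=e["product_type"])
        elif e["type"] == "ProductRenamed":
            product["name"] = e["name"]
    return product

def load_order_view(order_id):
    lines = build_order(load_events([order_id])[order_id])  # step 1: 1st access
    streams = load_events(list(lines))                       # step 3: 2nd access, batched
    return {
        pid: {**build_product(streams[pid]), "quantity": qty}
        for pid, qty in lines.items()
    }

# Made-up sample data for the sketch.
append("p-1", "ProductCreated", name="Book", product_type="physical")
append("p-1", "ProductRenamed", name="Hardback book")
append("order-1", "LineItemAdded", product_id="p-1", quantity=2)
order_view = load_order_view("order-1")
```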
If we build a really big graph of objects (a lot of different aggregates), then this may end up needing several more database accesses (each of which is slow)... What's your idea here?
Thanks
Take this typical use case as example, we have Order and LineItem (they are an aggregate, Order is the aggregate root), and Product aggregate.
The Order aggregate makes sense the way you have described it. "Product aggregate" is more suspicious; do you ask the model if the product is allowed to change, or are you telling the model that the product has changed?
If Product can change without first consulting with the order, then the LineItem must not include the product. A reference to the product (aka the ProductId) is ok.
If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database access (each of which is slow)...What's your idea in here?
For reads, reports, and the like -- where you aren't going to be adding new events to the history -- one possible answer is to do the slow work in advance. An asynchronous process listens for writes in the event store, and then publishes those events to a bus. Subscribers build new versions of the reports when new events are observed, and cache the results. (search keyword: cqrs)
When a client asks for a report, you give them one out of the cache. All the work is done, so it's very quick.
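A minimal sketch of that idea, assuming a synchronous in-process bus for brevity (a real system would normally deliver events asynchronously, as described above). The class names EventBus and OrderReportProjection, and the event shapes, are hypothetical and only serve the illustration.

```python
class EventBus:
    """Toy publish/subscribe bus; delivery here is synchronous for simplicity."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


class OrderReportProjection:
    """Keeps a ready-made report per order, updated as events are observed."""

    def __init__(self):
        self.product_names = {}  # product_id -> latest known name
        self.reports = {}        # order_id -> {product name: quantity}

    def handle(self, event):
        kind = event["type"]
        if kind == "ProductCreated":
            self.product_names[event["product_id"]] = event["name"]
        elif kind == "LineItemAdded":
            report = self.reports.setdefault(event["order_id"], {})
            name = self.product_names.get(event["product_id"], "unknown")
            report[name] = report.get(name, 0) + event["quantity"]

    def get_report(self, order_id):
        # No event replay at read time: the work was done when events arrived.
        return self.reports.get(order_id, {})


bus = EventBus()
projection = OrderReportProjection()
bus.subscribe(projection.handle)

# Made-up events for the sketch.
bus.publish({"type": "ProductCreated", "product_id": "p-1", "name": "Book"})
bus.publish({"type": "LineItemAdded", "order_id": "o-1", "product_id": "p-1", "quantity": 2})
bus.publish({"type": "LineItemAdded", "order_id": "o-1", "product_id": "p-1", "quantity": 1})
```

Reading the report is then a dictionary lookup, regardless of how many aggregates contributed events to it.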
For command handlers, the answer is more complicated. Business rules are supposed to be in the domain model, so having the command handler try to validate the command (as opposed to the domain model) is a bit broken.
The command handler can load the products to see what the state might look like, and pass that information to the aggregate with the command data, but it's not clear that's a good idea. If the client is going to send a command to be run, and you need to flesh out the Order command with Product data, why not instead have the client add the product data to the command directly, and skip the middle man?
CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc.
This example is a bit vague, but taking a guess: you are thinking about cases where you prevent an order from being placed if the available inventory is insufficient to fulfill the order.
For real world inventory - a physical book in a warehouse - that's probably the wrong approach to take. First, the model itself is wrong; if you want to know how much product is in the warehouse, you should be querying the warehouse, not the product. Second, a physical warehouse is not constrained by your model -- calling the addProduct method on a warehouse aggregate doesn't cause the product to magically appear there.
Third, it probably doesn't match very well with what your domain experts want anyway. If the model says that the warehouse doesn't have enough product, do you think the stakeholders want the system to
tell the shopper to buy the product somewhere else, or...
accept the order, and contact the supplier for a new delivery?
Hint: when in doubt, carefully review how amazon.com does it.

Does Neo4j Support the concept of views

I'm starting with Neo4j, trying to migrate my current system from a relational DB to Neo4j,
and have a peculiar problem to overcome.
I have a table called Orders that has 2 particular columns that are being a pain.
ShipBy is a value for (Train/Air/Truck)
Carrier is the Id of the company carrying the order but this changes, if it ships by Air, it has something like UPS/ALASKA/CONTINENTAL; if it ships by Train, it has something like BNSF/KANSASCITYRAIL/ETC...
these values come from different catalog tables, so this was resolved in my system with something like this
Select Orders.Number, Carriers.Name from Orders, (Select 'T' Type, Id, Name from Truckers union all Select 'R' Type, Id, Name from RailCompanies union all Select 'A' Type, Id, Name from AirLines) Carriers
Where Orders.ShipBy = Carriers.Type and Orders.CarrierId = Carriers.Id
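For reference, here is a small runnable sketch of what that inline union does when it is stored as a relational view. It uses Python's built-in sqlite3 module purely as a stand-in for the real database; the lower-case table and column names and the sample rows (including the "Acme Trucking" carrier) are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders         (number INTEGER PRIMARY KEY, ship_by TEXT, carrier_id INTEGER);
    CREATE TABLE truckers       (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rail_companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE airlines       (id INTEGER PRIMARY KEY, name TEXT);

    -- The union of the three catalog tables, stored as a view.
    CREATE VIEW carriers AS
        SELECT 'T' AS carrier_type, id, name FROM truckers
        UNION ALL
        SELECT 'R', id, name FROM rail_companies
        UNION ALL
        SELECT 'A', id, name FROM airlines;

    INSERT INTO truckers       VALUES (1, 'Acme Trucking');
    INSERT INTO rail_companies VALUES (1, 'BNSF'), (2, 'KANSASCITYRAIL');
    INSERT INTO airlines       VALUES (1, 'UPS'), (2, 'ALASKA');
    INSERT INTO orders         VALUES (1001, 'T', 1), (1002, 'A', 1), (1003, 'R', 2);
""")

def orders_with_carrier():
    """Resolve each order's carrier name through the (type, id) pair."""
    return conn.execute(
        """SELECT o.number, c.name
           FROM orders o
           JOIN carriers c
             ON o.ship_by = c.carrier_type AND o.carrier_id = c.id
           ORDER BY o.number"""
    ).fetchall()
```

The same carrier id (1) resolves to a different company depending on the ship-by type, which is the behaviour that has to be reproduced after the migration.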
I'd appreciate any pointer on this.
Neo4j doesn't have views in the way that relational DBs have. There are several things you could do as alternatives:
Continually re-issue the query that computes the "view" you need, as needed
Create a special "view node", and then link that node via relationships to all of the other nodes that would naturally occur in your "view". Querying your view then becomes as simple as pulling up that one "view node" and traversing your edges to the view results.
Option #1 is easiest; option #2 is probably faster, but it comes with a maintenance burden: as the underlying nodes in the DB change, you need to maintain your view and make sure it still points to the right places.
As we can read here "In database theory, a view is the result set of a stored query on the data, which the database users can query just as they would in a persistent database collection object."
Neo4j doesn't host stored queries, but you could think to extend Neo4j Servers as posted here by Stefan: https://stackoverflow.com/a/21780942/3442366
Materialized views are of course different...
Rely on the power of relationship management offered by Neo4j ;-)
Cheers,
Lorenzo

What might be the purpose of this column in eTRM (Oracle eBusiness suite)

I realize this is quite a specialized question (about Oracle's eTRM + eBusiness suite). I'm trying to figure out the meaning of this
REMIT_TO_ADDRESS_ID NUMBER (15)
which comes from the AR.RA_CUSTOMER_TRX_ALL table. The reason is that in a query I have, there's a bug like this where we say:
LEFT OUTER JOIN ra_customer_trx_all rct
ON rct.REMIT_TO_ADDRESS_ID = acct.REMIT_TO_ADDRESS_ID
(acct is from the table hz_cust_acct_sites_all, by the way)
My guess is that REMIT_TO_ADDRESS_ID is some kind of meta-data?
I really appreciate any pointers/tips. Thanks.
Little bit rusty, but did Oracle Apps for 10 years. From your question I understand that you are new to Oracle Apps technology. ra_customer_trx_all stands for:
"RA" => "Accounts Receivables" also known as "AR" (something you sell and want money for),
"customer" says it,
"trx" => "transactions",
"_all" => all records across all organisations (multi-org).
It is a nice table with lots of features :-)
When in Oracle Apps a column is listed with name ending in '_id' and data type of number(15, 0), it is generally a reference to a row in another table. Depending on the Oracle Apps module, you will sometimes find also a foreign key constraint. But generally most Oracle Apps modules rely on the frontend to enforce referential integrity.
So remit_to_address_id refers to another table. In this case address information. Also, the naming of the column tells us that the referred row is used in a special way (role) namely as "remit to".
You might want to join it to the address table of Apps. When you do so, please check the columns listed in the indexes. The multi-org field org_id may be listed first (probably not in AR). If you forget such columns, you will still get correct results, since the IDs are unique across the system, but the index might not be used.
For end user queries, I generally recommend using the multi-orged view instead of the _all table. This ensures that users only see their current organisation. Remember that you need to set up the client_identifier session variable (if I recall correctly) to store the current organisation ID in.
I hope this helps you.
I have no knowledge of eTRM, or any other Oracle business application.
That said, as a complete wild guess, I would say that the REMIT_TO_ADDRESS_ID is the ID of an address that a payment of some kind is sent to, and that the address is optional (thus the outer join). So, in an Accounts Payable system, you may have a vendor, who has a normal business address. But when you send actual monies, they have an optional Remit To Address, and the payment is sent there instead of the normal business address.

typed data set; parent/child select and update with ONE trip to the database (for each op)?

Is it possible, using an ADO.NET typed DataSet containing two tables in a parent/child relationship, to populate the DataSet with ONE trip to the d/b (query could return one or two tables; if one, then result set has columns from both tables, right?), and to update the d/b with ONE trip to the d/b (call to generated stored proc, I guess).
By "is it possible", I mean is it possible to have Visual Studio (2012) automagically generate the classes and SQL code to make this happen?
Or am I kind of on my own? It's looking an awful lot like VS really wants to generate one d/b server round trip for each table involved.
*I guess the update stored proc would have to take table-typed parameters from both parent and child, and perform inserts/updates/deletes appropriately.
Yes, one round trip per table is the way to go.
(It's certainly possible to use a join query to populate a DataTable, but VS will then be reluctant to generate the UPDATE etc. SQL. This may or may not be a problem, depending on what you intend to do with the dataset.)
But if you have two tables in a dataset, let's say customers and orders, then you would typically use two queries, and two trips to the db:
SELECT * FROM customers WHERE customers.customerid=@customerid
and
SELECT * FROM orders WHERE orders.customerid=@customerid
Somewhat more counter-intuitive is the situation where you want all customers and orders for one country:
SELECT * FROM customers WHERE customers.countryid=@countryid
and
SELECT orders.* FROM orders INNER JOIN customers ON customers.customerid=orders.customerid WHERE customers.countryid=@countryid
Note how the join query returns data from only one table, but uses the join to identify which rows to return.
Then, once you have the data in your dataset, you can navigate it using the GetParentRow and GetChildRows methods. This is how ADO.NET manages hierarchical data.
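To make the one-query-per-table pattern concrete, here is a small sketch in Python with the built-in sqlite3 module, standing in for ADO.NET and the real database. The sample rows are made up, and child_rows is a hypothetical helper that only imitates what GetChildRows does on a typed DataSet; it is not the ADO.NET API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER PRIMARY KEY, name TEXT, countryid INTEGER);
    CREATE TABLE orders    (orderid INTEGER PRIMARY KEY,
                            customerid INTEGER REFERENCES customers(customerid),
                            total REAL);
    INSERT INTO customers VALUES (1, 'Ann', 44), (2, 'Bob', 44), (3, 'Cy', 1);
    INSERT INTO orders    VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 3, 9.0);
""")

def load_country(countryid):
    """Two round trips: one result set per table, each with only its own columns."""
    customers = conn.execute(
        "SELECT customerid, name FROM customers WHERE countryid = ? ORDER BY customerid",
        (countryid,),
    ).fetchall()
    # The join is used only to decide WHICH order rows to return.
    orders = conn.execute(
        """SELECT orders.orderid, orders.customerid, orders.total
           FROM orders
           INNER JOIN customers ON customers.customerid = orders.customerid
           WHERE customers.countryid = ?
           ORDER BY orders.orderid""",
        (countryid,),
    ).fetchall()
    return customers, orders

def child_rows(orders, customerid):
    """Hypothetical helper: the orders belonging to one parent customer."""
    return [row for row in orders if row[1] == customerid]

customers, orders = load_country(44)
```

Each parent row appears exactly once in the first result set, and the hierarchy is rebuilt in memory from the foreign key rather than being flattened into one wide join.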
You do need this one-table-at-a-time approach, because, assuming you have foreign key constraints in your db, you need to insert and update in reverse order from delete.
EDIT Yes, this does mean that in some circumstances, depending on the data you want and the structure of your primary keys, you could end up with a humungous set of JOINS that still only pull the data from the table at the end of the hierarchy. This might seem wrong in terms of traditional SQL, but actually it's fine. The time you have lost in the multiple, more complex queries is saved by the reduced amount of data you have to pull back across the wire, compared with one big join query that would be returning multiple copies of the parent data.
