WCF Data Service: performance issue while fetching nested entities

My database schema:
Table: Terminology (ID (PK), Name, Comments)
Table: Content (ID (PK), TerminologyID (FK), Data, LanguageID)
There is a 1-to-many relationship between Terminology and Content: one Terminology can have any number of Content rows, one per LanguageID.
The Terminology and Content tables may have millions of records.
Now, even though I fetch only a few hundred records at a time (pagination) from my client side using WCF Data Services, after 5-6 attempts I get a timeout exception.
_DataService.Terminologies.Expand("Contents").Skip(index1).Take(count).ToList();
If I don't expand Contents, the query works fine :), but then I don't have the Content data.
What is the best way to handle this scenario?
Options...
Is there any performance improvement if I use Include on the server side (i.e., write a custom WebGet method) instead of Expand on the client side?
Creating database views and accessing them from the client side.
Creating a stored procedure that takes my preferred LanguageID and calling it from the client side.
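For reference, option 3 might look roughly like this (a sketch that assumes SQL Server 2012+ for OFFSET/FETCH paging; older versions would page with ROW_NUMBER() instead):

CREATE PROCEDURE dbo.GetTerminologyPage
    @LanguageID INT,
    @Skip       INT,
    @Take       INT
AS
BEGIN
    -- Join only the Content rows for the requested language,
    -- then page the joined result on the server.
    SELECT t.ID, t.Name, t.Comments, c.Data
    FROM Terminology t
    JOIN Content c ON c.TerminologyID = t.ID
                  AND c.LanguageID = @LanguageID
    ORDER BY t.ID
    OFFSET @Skip ROWS FETCH NEXT @Take ROWS ONLY;
END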

Is this an ADO.NET Data Services service built by the default wizard?
In any case, if your client can access the database directly it will be a lot faster, so if the direct DB option is available, take it.
If WCF is the only option, then you will have to create your own paging web service implementation, perhaps even with a stored procedure that returns multiple recordsets.
On a side note, I do not see a LanguageID in your service query, and that could slow things down a lot:
_DataService.Terminologies.Expand("Contents").Skip(index1).Take(count).ToList();
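If you do move the filtering server-side, a minimal sketch of such a service operation (method and property names are illustrative) could look like this:

// Inside your DataService<T> class; remember to expose the operation in
// InitializeService via config.SetServiceOperationAccessRule(
//     "TerminologiesByLanguage", ServiceOperationRights.AllRead).
[WebGet]
public IQueryable<Terminology> TerminologiesByLanguage(int languageId)
{
    // Filtering by language on the server keeps the expanded payload small;
    // the client can still compose $skip/$top/$expand on top of this IQueryable.
    return CurrentDataSource.Terminologies
        .Where(t => t.Contents.Any(c => c.LanguageID == languageId));
}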

Related

How to fetch data from a database using Spring Boot without mapping

I have a database, and in that database there are many tables. I want to fetch the data from any one of those tables by entering a query from the front-end application. I'm not doing any manipulation of the data, just retrieving it from the database.
Also, mapping the data requires writing many entity or POJO classes, so I don't want to map the data to any object. How can I achieve this?
In this case, assuming the mapping of tables is not relevant, you don't need to use JPA/Hibernate at all.
You can use the old, battle-tested JdbcTemplate to execute a query of your choice (passed in from the client), serialize the response into a JSONObject, and return it as the response from your controller.
The client side will be responsible for rendering the result.
You might also query the database metadata to obtain information about column names, types, etc., so that the client side gets this information as well and can show the results in a more convenient / "advanced" way.
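As a rough sketch of the JdbcTemplate idea above (endpoint and names are illustrative, and it inherits the security problem described next):

import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class QueryController {

    private final JdbcTemplate jdbcTemplate;

    public QueryController(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Runs the client-supplied SQL and returns each row as a
    // column-name -> value map; Spring serializes the list to JSON,
    // so no entity/POJO classes are needed.
    @PostMapping("/query")
    public List<Map<String, Object>> run(@RequestBody String sql) {
        return jdbcTemplate.queryForList(sql);
    }
}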
Beware of the security implications, though. Basically it means the client will be able to delete all the records in the database with a simple query, and you won't be able to prevent it :)

Persisting Data Across Requests MVC3 and Razor

I am working on an MVC3 and Razor website. The user has to select their way through a few choices before finally working on the data.
For example:
Client List -> Version List (Filtered by client) -> Etc (Filtered by version)
Once a user selects a client, they select a version for that client, so I'm passing the client ID on the query string. For each action of the version controller I'm passing around the client ID. On views where I want to show the client name, I'm querying the database for the client and stuffing it into the ViewBag. This seems very inefficient. I feel like I could use a cookie to hold the client ID and name.
Now that I've got my version controller done, I'm facing the same pattern again with each subsequent controller, but now I need to persist both client and version...
What is a preferred approach for persisting information like this across requests?
This seems very inefficient
That's what databases are made and optimized for => querying data based on fields, and if you put indexes on those fields it will be screamingly fast. Of course Session, cookies and caching are common techniques you could employ to limit the number of queries to the database, but you have to accept the possible staleness of the data you get that way (if some other thread/process modifies the data in the database, you no longer get correct results).
So before doing any premature optimization, here's what I would recommend: hammer your database until you discover that it is actually a bottleneck for your application. Databases only become the bottleneck in some very high-traffic applications, where you should resort to one of the aforementioned techniques (or in some poorly written applications of course, but let's exclude that possibility for the moment).
You should use TempData, which allows you to pass data between the current and next HTTP requests. Be sure to keep in mind that it uses the session.
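A minimal sketch of that flow (controller and action names are illustrative):

public ActionResult SelectClient(int clientId)
{
    // Survives until the next request reads it; call TempData.Keep("ClientId")
    // there if it must live across further requests.
    TempData["ClientId"] = clientId;
    return RedirectToAction("Index", "Version");
}

public ActionResult Index()
{
    var clientId = (int)TempData["ClientId"];
    // ... filter the version list by clientId ...
    return View();
}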
Greg Shackles has a great article all about TempData here
See this similar question: MVC3 multi step form - How to persist model object

How to access data in Dynamics CRM?

What is the best way, in terms of platform speed and maintainability, to access data (read only) in Dynamics CRM 4? I've done all three, but I'm interested in the opinions of the crowd.
Via the API
Via the webservices directly
Via DB calls to the views
...and why?
My thoughts normally center around DB calls to the views but I know there are purists out there.
Given both requirements I'd say you want to call the views. Properly crafted SQL queries will fly.
Going through the API is required if you plan to modify data, but it isn't the fastest approach around because it doesn't allow deep loading of entities. For instance, if you want to look at customers and their orders, you'll have to load both up individually and then join them manually, whereas a SQL query returns the data already joined.
Never mind that the TDS stream is a lot more efficient than the SOAP messages used by the API and web services.
UPDATE
I should point out, in regard to the views and the CRM database in general: CRM does not optimize the indexes on the tables or views for custom entities (how could it?). So if you have a truckload entity that you look up by destination all the time, you'll need to add an index for that property. Depending upon your application, it could make a huge difference in performance.
I'll add to jake's comment by saying that querying against the tables directly instead of the views (*base & *extensionbase) will be even faster.
In order of speed it'd be:
direct table query
view query
filtered view query
api call
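To make that ordering concrete, the same read at each level might look like this (Account is just an example entity, and New_Region stands in for a custom attribute):

-- 1. Direct table query: fastest, but you join base and extension yourself.
SELECT b.AccountId, b.Name, e.New_Region
FROM AccountBase b
JOIN AccountExtensionBase e ON e.AccountId = b.AccountId

-- 2. Unfiltered view: CRM has already joined the two tables for you.
SELECT AccountId, Name, New_Region FROM Account

-- 3. Filtered view: adds per-user security checks on every row, hence the cost.
SELECT AccountId, Name, New_Region FROM FilteredAccount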
Direct table updates:
I disagree with Jake that all updates must go through the API. The correct statement is that going through the API is the only supported way to do updates. There are in fact several instances where directly modifying the tables is the most reasonable option:
One time imports of large volumes of data while the system is not in operation.
Modification of specific fields across large volumes of data.
I agree that this sort of direct modification should only be a last resort when the performance of the API is unacceptable. However, if you want to modify a boolean field on thousands of records, doing a direct SQL update to the table is a great option.
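For instance, flipping a boolean custom field across a region might be a single statement (hypothetical field names; this is unsupported, so back up first and run it while the system is offline):

-- Unsupported direct write: bypasses CRM business logic, plugins and auditing.
UPDATE AccountExtensionBase
SET New_IsPreferred = 1
WHERE New_Region = 'EMEA'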
Relative Speed
I agree with XVargas as far as relative speed.
Unfiltered Views vs Tables: I have not found the performance advantage to be worth the hassle of manually joining the base and extension tables.
Unfiltered views vs Filtered views: I recently was working with a complicated query which took about 15 minutes to run using the filtered views. After switching to the unfiltered views this query ran in about 10 seconds. Looking at the respective query plans, the raw query had 8 operations while the query against the filtered views had over 80 operations.
Unfiltered Views vs API: I have never compared querying through the API against querying views, but I have compared the cost of writing data through the API vs inserting directly through SQL. Importing millions of records through the API can take several days, while the same operation using insert statements might take several minutes. I assume the difference isn't as great during reads but it is probably still large.

Is there a specific performance issue with Entity Framework + GUID PKs?

My company has a basic SOA, using Entity Framework for database access. One of our services must interface with a legacy database which uses uniqueidentifier primary keys in the database. Performance is good during unit and integration testing.
However, under load, the affected service starts to produce a backlog of queued requests. This queue can be eliminated by removing the code sections which use Entity Framework to match rows by uniqueidentifier PK. The service then becomes performant again, matching rows by integer PK as part of the same routines, using the same ObjectContext.
The database table performs well being queried by the legacy application which it was designed for. I therefore do not believe this problem to be related to fragmentation within the database/index.
In the code sample shown, domain is an Entity Framework ObjectContext, MetaSetting relates to the mapped EntityObject and match.ObjectId is a Guid.
metaSettings = domain.MetaSetting.Where(
    ms => ms.AppUID.Equals(match.ObjectId)
).ToList();
Use Profiler
Have you checked the TSQL sent to the server using Profiler? It will be much easier to pinpoint any TSQL bottlenecks that happen while getting the data.
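If you'd rather inspect the TSQL from code, EF's ObjectQuery<T>.ToTraceString() shows it without running Profiler; a sketch using the names from the question:

// using System.Data.Objects; -- for ObjectQuery<T>
// EF LINQ queries are ObjectQuery instances underneath, so the cast is safe here.
var query = (ObjectQuery<MetaSetting>)domain.MetaSetting
    .Where(ms => ms.AppUID.Equals(match.ObjectId));
Console.WriteLine(query.ToTraceString());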
But looking at your code, this simple LINQ should be just as performant as a direct call to the DB. There is some overhead from EF to do the transformation, but it should be minor since you're transforming a single record into a single object instance.
You will run into problems when using Include() calls, because those may bloat the TSQL script and cripple it. I've had issues with it when a *:1 navigation property translated to a LEFT OUTER JOIN instead of an INNER one.

Reasonably secure way to allow table name access on client side

We have a need coming up in an application where the following is true:
A web page uses AJAX to request data from a server.
The specification of the data (e.g., the table name) requested from the server will not be known until run-time.
The configuration of the data view is itself data-driven, and configurable by an administrator.
Data updates and inserts must be supported, not just views.
Prototyping this was very easy - we could pass in the appropriate information (table name, changeset, whatever) to a generic data service that just did what it was told (using JSON as the data storage mechanism). The data service could do basic validation on the parameters to ensure the current user can perform the requested operation (read the data, insert a row, read the row).
The issue we have, now that we are looking to do this in a secure production manner, is that the idea of passing table names and column names around is frightening. Everything we think of to deal with this devolves into trusting the client in some significant way, or seems to involve substantial bookkeeping on the server. For example:
User requests a viewing page.
The server notes the table name and saves it server side with a request ID
The server notes the column names and saves them, replacing them with "col1, col2", etc., and stores the mapping with the request ID data.
The client page sends the request ID to the service, which looks up the server storage by ID
The service returns col1, col2, etc.
This would work, we think, but feels very messy.
Does anyone have experience with this kind of problem and can offer a solution?
Do you need to give them access to raw tables?
Perhaps you can go meta, and make a meta-table that stores the tabular data in a secure manner (i.e., only the system knows the real table/schema, while the user's concepts of schema/table are just abstractions that all map back to that same schema/table)...
Again, more information is needed as to what can be abstracted. Allowing DDL operations by the end-users is asking for trouble, as you rightfully assessed, and I would just abstract that so that "DDL" becomes DML.
However, mapping actual SQL that is written against this data would be much more difficult to abstract, if that is a requirement.
If I had to expose back-end information to end customers, I'd probably hide the actual physical representation using metadata that remaps table names and columns to more user-friendly text. That would also let me provide views on the tables that are a bit more advanced than plain table/column names, properly modeling associations between tables, and so on...
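One way to implement that remapping is a server-side whitelist that the client can only reference by an opaque view id; a minimal sketch (all names illustrative):

// using System; using System.Collections.Generic;
public sealed class TableMapping
{
    public string Table { get; private set; }
    public string[] Columns { get; private set; }
    public TableMapping(string table, string[] columns)
    {
        Table = table;
        Columns = columns;
    }
}

static readonly Dictionary<string, TableMapping> Views =
    new Dictionary<string, TableMapping>
    {
        { "orders", new TableMapping("dbo.Orders",
              new[] { "Id", "CustomerName", "Total" }) }
    };

public static string BuildSelect(string viewId)
{
    // The client never sends table or column names, only a view id;
    // unknown ids are rejected, so no raw identifiers reach the SQL text.
    TableMapping map;
    if (!Views.TryGetValue(viewId, out map))
        throw new UnauthorizedAccessException("Unknown view id: " + viewId);
    return "SELECT " + string.Join(", ", map.Columns) + " FROM " + map.Table;
}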
