How to cache queries in EJB and return results efficiently (performance POV)

I use the JBoss EJB 3.0 implementation (JBoss 4.2.3 server).
At the beginning I created a native query every time, using a construct like:
Query query = entityManager.createNativeQuery("select * from _table_");
Of course this is not that efficient. I ran some tests and found that it really takes a lot of time. Then I found a better way to deal with it: use an annotation to define native queries:
@NamedNativeQuery(name = "fetchData", query = "select * from _table_", resultClass = Entity.class)
and then just use it:
Query query = entityManager.createNamedQuery("fetchData");
The performance of the line above is about twice as good as where I started, but still not as good as I expected. Then I found that I can switch to the Hibernate annotation for NamedNativeQuery (JBoss's EJB implementation is based on Hibernate anyway) and add one more thing:
@NamedNativeQuery(name = "fetchData2", query = "select * from _table_", resultClass = Entity.class, readOnly = true)
readOnly marks whether the results are fetched in read-only mode. That sounds good, because in my case I don't need to update the data; I just want to fetch it for a report. When I started the server to measure performance, I noticed that the query without readOnly=true (the default is false) returns results faster with each iteration, while the other one (fetchData2) stays "stable". The difference between them keeps shrinking, and after 5 iterations the speed of both was almost the same.
The questions are:
1) Is there any other way to speed up query usage? It seems that named queries should be prepared only once, but I can't confirm it. Creating the query once and then reusing it would be better from a performance point of view, but caching this object is problematic: after creating the query I can set parameters (when I use ":variable" in the query), and that changes the query object (doesn't it?). So, is there any way to cache them? Or is a named query the best option I can use?
2) Are there any other approaches to make retrieving results faster? For instance, I don't need those entities to be attached and I won't update them; all I need is to fetch a collection of data. Maybe readOnly is the only available way and I can't speed it up further, but who knows :)
P.S. I'm not asking about DB performance. All I need now is a way to avoid creating the query object every time, so that I use it efficiently, and to "allow" EJB to do less work while returning the same data.
Added 15.03.2010:
By query I mean the query object (that is, how to cache this object for reuse). Caching query results is not a solution for me, because the where clause can be almost unique for each execution due to the floating-point parameters in it. The cache will not understand that "a > 50.0001" and "a > 50.00101" may give the same result, but may also not.
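To make the point about floating-point parameters concrete, here is a minimal sketch. It assumes a result cache keyed on the exact bound parameter value; the class ThresholdResultCache and its in-memory "table" are hypothetical stand-ins, not part of JPA or Hibernate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a query-result cache keyed on the bound parameter value.
final class ThresholdResultCache {
    private final Map<Double, Integer> cache = new HashMap<>();
    private final double[] rows; // pretend column "a" of the table
    private int hits = 0;
    private int misses = 0;

    ThresholdResultCache(double[] rows) {
        this.rows = rows.clone();
    }

    // Counts rows with a > threshold, consulting the cache first.
    int countGreaterThan(double threshold) {
        Integer cached = cache.get(threshold);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        int count = 0;
        for (double a : rows) {
            if (a > threshold) {
                count++;
            }
        }
        cache.put(threshold, count);
        return count;
    }

    int hits() { return hits; }
    int misses() { return misses; }

    public static void main(String[] args) {
        ThresholdResultCache c = new ThresholdResultCache(new double[] {49.9, 50.002, 75.0});
        c.countGreaterThan(50.0001);  // miss
        c.countGreaterThan(50.00101); // miss, although the answer happens to be the same
        c.countGreaterThan(50.0001);  // hit: only an identical value is recognized
        System.out.println("hits=" + c.hits() + ", misses=" + c.misses());
    }
}
```

With nearly unique thresholds almost every call is a miss, so such a cache mostly costs memory without saving work.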

You could use the second-level cache and the query cache to avoid hitting the database (this works especially well with read-only objects). Hibernate supports a second-level cache (with a third-party cache provider), but it is an extension to JPA 1.0.

Related

What are the advantages of using Spring JPA Specifications over direct queries

I am currently working on a project where I have to retrieve some rows from the database based on some filters (I also have to paginate them).
My solution was to make a function that generates the queries and to query the database directly (it works and it's fast)
When I presented this solution to the senior programmer he told me this is going to work but it's not a long-term solution and I should rather use Spring Specifications.
Now here come my questions:
Why is Spring Specifications better than generating a query?
Is a query generated by Spring Specifications faster than a normal query?
Is it that big of a deal to use hard-coded queries ?
Is there a better approach to this problem ?
I have to mention that the tables in the database don't store a lot of data; the biggest one (which will be queried the least) has around 134,000 rows one year after the application was launched.
The tables have indexes on the columns that we will use to filter.
A "function that generates the queries" sounds like building query strings by concatenating smaller parts based on conditions. Even presuming this is a JPQL query string and not a native SQL string that would be DB dependent, there are several problems:
you lose the IDE's help if you ever refactor your entities
not easy to modularize and reuse parts of the query generation logic (eg. if you want to extract a method that adds the same conditions to a bunch of different queries with different joins and aliases for the tables)
easy to break the syntax of the query by a typo (eg. "a=b" + "and c=d")
more difficult to debug
if your queries are native SQL then you also become dependent on a database (eg. maybe you want your integration tests to run on an in-memory DB while the production code is on a regular DB)
if all the other queries in your project are generated one way but yours are generated differently (without a good reason), then maintenance of the code will be more difficult
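As a rough illustration of the typo and reuse points above, the following sketch contrasts string concatenation with small composable conditions. It uses java.util.function.Predicate only as an analogy for how Specifications compose: a Predicate filters in memory, whereas a Specification is translated into the query. The Item type and helper names are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class ComposableFilters {
    // Hypothetical row type, used only for this illustration.
    static final class Item {
        final String category;
        final int price;

        Item(String category, int price) {
            this.category = category;
            this.price = price;
        }
    }

    // The typo from the list: the missing space compiles fine and only fails when the query runs.
    static String concatenatedWhere() {
        return "a=b" + "and c=d";
    }

    // Small reusable conditions that can be combined without touching any string.
    static Predicate<Item> inCategory(String category) {
        return item -> item.category.equals(category);
    }

    static Predicate<Item> cheaperThan(int limit) {
        return item -> item.price < limit;
    }

    static List<Item> filter(List<Item> items, Predicate<Item> condition) {
        return items.stream().filter(condition).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(concatenatedWhere()); // prints "a=band c=d"
        List<Item> items = List.of(new Item("book", 10), new Item("book", 50), new Item("toy", 5));
        Predicate<Item> cheapBooks = inCategory("book").and(cheaperThan(20));
        System.out.println(filter(items, cheapBooks).size()); // prints 1
    }
}
```

The compiler checks how the conditions fit together, so a mistake like the missing space cannot occur in the composed version.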
JPA frameworks generate optimized queries for most common use cases, so generally speaking you'll get at least the same speed from a Specification query as you do from a native one. There are times when you need to write native SQL to further optimize a query but these are exceptional cases.
Yes, hard-coding queries is bad practice that makes maintenance a nightmare.

Apply Laravel 4 query cache to all database reads

Laravel 4 has a query cache built into its query builder: just add ->remember(), according to the docs.
Can anybody tell me how I can apply this method to all queries in my application, without appending ->remember() to each and every database call in it? Some kind of after filter, I suppose.
You might be able to extend the query builder and simply overload the get() method to first call remember(), and then do the get() statement.
Practically, though, if you want to cache every single query, you might as well do this at the database level. MySQL, for example, has a query cache (in versions before 8.0, where it was removed) that can be configured to cache the results of all SELECT queries automatically. However, in an application that does a lot of inserts/updates/deletes, this will perform poorly, since the cache is cleared for that table on every such call.
Caching every query in Laravel would also mean getting outdated data if you do inserts/updates/deletes in the meantime, so you'd have to clear the cache every time you update.
Best practice would be to decide deliberately, query by query, whether it should be cached or not.
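The staleness problem described above can be sketched in a language-neutral way. The class below is a hypothetical opt-in cache written in Java, not Laravel's API: only calls routed through remember() are cached, and forget() has to be called after a write.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical opt-in cache: only queries routed through remember() are cached.
final class SelectiveQueryCache<V> {
    private final Map<String, V> cache = new HashMap<>();

    // Returns the cached value for key, running the loader only on a miss.
    V remember(String key, Supplier<V> loader) {
        return cache.computeIfAbsent(key, k -> loader.get());
    }

    // Must be called after any insert/update/delete that affects the cached data.
    void forget(String key) {
        cache.remove(key);
    }

    public static void main(String[] args) {
        int[] table = {1}; // stand-in for a value stored in the database
        SelectiveQueryCache<Integer> cache = new SelectiveQueryCache<>();
        System.out.println(cache.remember("count", () -> table[0])); // 1, loaded
        table[0] = 2;                                                // a write happens
        System.out.println(cache.remember("count", () -> table[0])); // 1, stale
        cache.forget("count");                                       // invalidate after the write
        System.out.println(cache.remember("count", () -> table[0])); // 2, reloaded
    }
}
```

If every query were cached automatically, every write path would need a matching forget() call, which is why choosing per query tends to be easier to keep correct.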

MS Access 2010: query slows down dramatically when using parameters

I hope this was not asked here before (I searched here and on Google, but could not find an answer).
The problem is: I'm using MS Access 2010 to select records from a linked table (there are millions of records in the table). If I specify the criteria (e.g. a date) directly (for example date=#1/1/2013#), the query returns in an instant. If I use parameters (add a parameter of type date/time and provide a value of 1/1/2013 when prompted, or a date in some other format, or reference a control on a form), the query takes minutes to load.
Please let me know if you have any ideas about what could be causing this. I do feel bad about asking such a question and possibly wasting someone's time...
Here's a potential answer, I didn't know this myself and did a little digging.
If performance is important, it may be necessary to prefer dynamic SQL even for where parameter queries are suitable due to how queries are optimized. Generally, Access creates a plan for a new query upon saving. When a query contains a parameter, then Access cannot know what value the parameter may contain and has to make a "good guess". Depending on which actual values are later supplied, it may be okay or poor, resulting in sub-optimal performance. In contrast, dynamic SQL sidesteps this because the "parameters" are hard-coded into the temporary string and thus a new plan is compiled with that value, guaranteeing optimal execution plan. Since compiling a new plan at runtime is very fast, it can be the case that dynamic SQL will outperform parameter queries.
Source: http://www.utteraccess.com/wiki/index.php/Parameter_Query#Performance
Also, if I had to guess, in your parameter query Access is requesting the ENTIRE table from Oracle and then filtering it down with your WHERE clause, but when the value is hard-coded in the WHERE clause, it actually loads just those records and possibly makes use of indexes.
As far as a solution, I would build your query string in VBA then execute it. It opens you up to injection, but you can handle that. So:
Instead of using a saved parameter query object in Access, try to do something like this.
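Dim qr As String
Dim rs As DAO.Recordset
' Format the date explicitly so the literal is read as mm/dd/yyyy regardless of regional settings
qr = "SELECT * FROM myTable WHERE myDate = #" & Format(Me.dateControl, "mm\/dd\/yyyy") & "#;"
' DoCmd.RunSQL and CurrentDb.Execute only accept action queries, so open a recordset for a SELECT
Set rs = CurrentDb.OpenRecordset(qr)
As you replied, CurrentDb.OpenRecordset(qr) is the call to use here.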
This would force the engine to make an execution plan at runtime rather than having a saved potentially suboptimal plan. Let me know if this works out for you, I'd be interested to see.
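On the injection concern mentioned above, one way to "handle that" is to build the string only from a value that has already been parsed as a date. The sketch below shows the idea in Java under some assumptions: the class name AccessDateSql is hypothetical, the input is expected in ISO yyyy-MM-dd form, and the table and column names are taken from the snippet above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class AccessDateSql {
    // Access date literals between # signs are read in US month/day/year order.
    private static final DateTimeFormatter US_LITERAL = DateTimeFormatter.ofPattern("MM/dd/yyyy");

    // Builds the dynamic SQL only from a value that parsed as a date.
    static String selectByDate(String userInput) {
        // Expects ISO yyyy-MM-dd; anything else throws DateTimeParseException
        // before it can reach the SQL string.
        LocalDate date = LocalDate.parse(userInput);
        return "SELECT * FROM myTable WHERE myDate = #" + date.format(US_LITERAL) + "#;";
    }

    public static void main(String[] args) {
        System.out.println(selectByDate("2013-01-01"));
    }
}
```

The same principle applies in VBA: convert the control's value to a real date first and format it yourself, rather than concatenating raw text.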
Of course, the above reference about using parameters with Access (JET/ACE) applies ONLY to Access back ends, not to ODBC ones like SQL Server or Oracle. Since you pointed out that you're using Oracle here, creating a view or using a pass-through query should resolve this performance issue. In any case, you do NOT want to use Access/JET parameters with data coming from a server-based system. It is better to send the server SQL strings, and much better to use a pass-through query. Pass-through queries are read-only, so if the result set needs editing you have to create a view on the server and link to that view.

What are the best practices for determining when (or not) to pre-compile a Linq Query? [duplicate]

I have a table:
-- Tag
ID | Name
-----------
1 | c#
2 | linq
3 | entity-framework
I have a class that will have the following methods:
IEnumerable<Tag> GetAll();
IEnumerable<Tag> GetByName();
Should I use a compiled query in this case?
static readonly Func<Entities, IEnumerable<Tag>> AllTags =
CompiledQuery.Compile<Entities, IEnumerable<Tag>>
(
e => e.Tags
);
Then my GetByName method would be:
IEnumerable<Tag> GetByName(string name)
{
using (var db = new Entities())
{
return AllTags(db).Where(t => t.Name.Contains(name)).ToList();
}
}
This generates SELECT ID, Name FROM Tag and executes the Where in code. Or should I avoid CompiledQuery in this case?
Basically I want to know when I should use compiled queries. Also, on a website, are they compiled only once for the entire application?
You should use a CompiledQuery when all of the following are true:
The query will be executed more than once, varying only by parameter values.
The query is complex enough that the cost of expression evaluation and view generation is "significant" (trial and error)
You are not using a LINQ feature like IEnumerable<T>.Contains() which won't work with CompiledQuery.
You have already simplified the query, which gives a bigger performance benefit, when possible.
You do not intend to further compose the query results (e.g., restrict or project), which has the effect of "decompiling" it.
CompiledQuery does its work the first time a query is executed. It gives no benefit for the first execution. Like any performance tuning, generally avoid it until you're sure you're fixing an actual performance hotspot.
2012 Update: EF 5 will do this automatically (see "Entity Framework 5: Controlling automatic query compilation") . So add "You're not using EF 5" to the above list.
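The "no benefit for the first execution, reuse afterwards" behaviour can be sketched independently of EF. The Java class below is only an analogy under stated assumptions: CompiledTagQuery and its compile() step are hypothetical stand-ins for the expensive translation, and the tag list replaces the database. It is not thread-safe.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CompiledTagQuery {
    static int compilations = 0; // how often the expensive step ran
    private static final List<String> TAGS = List.of("c#", "linq", "entity-framework");
    private static Function<String, List<String>> byName; // built on first use, then reused

    // Stand-in for the expensive translation a compiled query performs once.
    private static Function<String, List<String>> compile() {
        compilations++;
        return name -> TAGS.stream().filter(t -> t.contains(name)).collect(Collectors.toList());
    }

    static List<String> tagsByName(String name) {
        if (byName == null) {
            byName = compile(); // the first call pays the cost
        }
        return byName.apply(name); // later calls vary only the parameter
    }

    public static void main(String[] args) {
        System.out.println(tagsByName("linq"));   // [linq]
        System.out.println(tagsByName("entity")); // [entity-framework]
        System.out.println(compilations);         // 1
    }
}
```

Note that the parameter is part of the compiled function itself; filtering the result afterwards, as in the question's GetByName, happens outside of it.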
Compiled queries save you time, which would be spent generating expression trees. If the query is used often and you'll save the compiled query, you should definitely use it. I had many cases when the query parsing took more time than the actual round trip to the database.
In your case, if you are sure that it generates SELECT ID, Name FROM Tag without the WHERE clause (which I doubt, as your AllTags function should return IQueryable and the actual query should be made only after calling ToList), you shouldn't use it.
As someone already mentioned, on bigger tables SELECT * FROM [someBigTable] would take very long and you'll spend even more time filtering that on the client side. So you should make sure that your filtering is made on the database side, no matter if you are using compiled queries or not.
Compiled queries are most helpful for LINQ queries with large expression trees (that is, complex queries), where you gain performance by not building the expression tree again and again when reusing the query. In your case I guess it will save very little time.
Compiled queries are translated once at run time (on first execution, not when the application is built) and then reused. Whenever you reuse a query often, or it is complex, you should definitely try compiled queries to make execution faster.
But I would not go for it on all queries, as it is a little more code to write, and for simple queries it might not be worthwhile.
For maximum performance you should also evaluate stored procedures, where all the processing is done on the database server. Even though LINQ tries to push as much of the work to the DB as possible, there will be situations where a stored procedure is faster.
Compiled queries offer a performance improvement, but it's not huge. If you have complex queries, I'd rather go with a stored procedure or a view, if possible; letting the database do its thing might be a better approach.

Loading a huge entity tree with EF

I need to load a model, consisting of roughly 20 tables, from the database with Entity Framework.
So there are probably a few ways of doing this:
Use one huge Include call
Use many Includes calls while manually iterating the model
Use many IsLoaded and Load calls
Here's what happens with each of these 3 options:
EF creates a HUGE query, puts a very heavy load on the DB and then again with mapping the model. So not really an option.
The database gets called a lot, with again pretty big queries.
Again, the database gets called even more, but this time with small loads.
All of these options weigh heavily on performance. I do need to load all of that data (calculations for drawing).
So what can I do?
a) Heavy operation => heavy load => do nothing :)
b) Review design => but how?
c) A magical option that will make all these problems go away
When you need to load a lot of data from a lot of different tables, there is no "magic" solution which makes all problems go away. But in addition to what you have already discussed, you should consider projection. If you don't need every single property of an entity, it is often cheaper to project only the information you do need, i.e.:
from parent in MyEntities.Parents
select new
{
    // Project only the properties that are actually needed
    ParentName = parent.Name,
    Children = from child in parent.Children
               select new
               {
                   ChildName = child.Name
               }
}
One other thing to keep in mind is that for very large queries, the cost of compiling the query can often exceed the cost of executing it. Only profiling can tell you if this is the problem. If this turns out to be the problem, consider using CompiledQuery.
You might analyze the ratio of queries to updates. If you mostly upload the model once, then everything else is a query, then maybe you should store an XML representation of the model in the database as a "shadow" of the model. You should be able to either read the entire XML column in at once fairly quickly, or else maybe you can do your calculations (or at least the fetch of the values necessary for the calculations) using XQuery.
This assumes SQL Server 2005 or above.
You could consider caching your data in memory instead of getting it from the database each time.
I would recommend Enterprise Library Caching Application block: http://msdn.microsoft.com/en-us/library/dd203099.aspx
