In practice, are the following SQL queries equally efficient?
select * from aTable where myColumn LIKE '%'
select * from aTable
I known that in principle the first clause implies a table/index search whereas the second doesn't, but I am wondering whether SQL Server and Oracle these days (SS 2008 onwards, and Oracle 10g onwards) are intelligent enough to discard or ignore the redundant WHERE clause.
The reason I ask is that I have an application to look at that dynamically generates sql queries, and it is inserting clauses such myColumn LIKE '%' and I wonder if I should spend scarce time worrying about that.
Related
I have a large Oracle table with an indexed date_time field: "DISCONNECT_DATE"
When I use the following where clause my query runs quickly:
DISCONNECT_DATE > TO_DATE('01-DEC-2016', 'DD-MON-YYYY') AND
DISCONNECT_DATE < TO_DATE('01-JAN-2017', 'DD-MON-YYYY')
When I use the following where clause my query runs (very) slowly:
extract(month from disconnect_date) = '12' and
extract(year from disconnect_date) = '2016'
They are both more or less equivalent in their intentions. Why does the former work and the later not? (I don't think I have this problem in SQL SERVER)
(I am using PL SQL Developer to write the query)
The issue is the use of indexes. In the first, all the functions are on the "constant" side, not on the "column" side. So, Oracle can readily see that an index can be applied.
The logic that does indexing, though, doesn't understand extract(), so the index doesn't get used. If you want to use that construct, you can create an index on function calls:
create index idx_t_ddyear_ddmonth on t(extract(month from disconnect_date), extract(year from disconnect_date));
Note: extract() returns a number not a string, so you should get rid of the single quotes. Mixing data types can also confuse the optimizer.
I am working with huge table which has 9 million records. I am trying to take backup of the table. As below.
First Query
Create table newtable1 as select * from hugetable ;
Second Query
Select * into newtable2 from hugetable ;
Here first query seems to be quite fast. I want to know the functional difference and Why it is fast compare to second query?, Which one is to be preferred?
Quote from the manual:
This command (create table as) is functionally similar to SELECT INTO, but it is preferred since it is less likely to be confused with other uses of the SELECT INTO syntax. Furthermore, CREATE TABLE AS offers a superset of the functionality offered by SELECT INTO.
So both do the same thing, but CREATE TABLE AS is part of the SQL standard (and thus works on other DBMS as well), whereas the SELECT INTO is non-standard. As the manuals says, you should prefer CREATE TABLE AS over the non-standard syntax.
Performance wise both are identical, the differences you see are probably related to caching or other activities on your system.
I have the same database on 2 different Oracle servers, one is 11.2.0.1.0 and the other is 11.2.0.4.0.
I have the same 2 geometry tables in both databases and run the following query on both servers. When run on an 11.2.0.1.0 version of Oracle, the query runs for a few minutes and I get results, the same query when run on 11.2.0.4.0 runs for about 3 seconds and returns no results.
The BLPUs table holds 36 million points and the PD_B2 table holds a polygon. I am trying to find all the points that fall in the polygon.
Other spatial queries do return rows but it takes hours and hours whereas the table join suggested in the Oracle Spatial documentation, takes 15 minutes to return all the points.
SELECT /*+ ordered */ a.uprn
FROM TABLE(SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')) c, blpus a, PD_B2 b
WHERE c.rowid1 = a.rowid
AND c.rowid2 = b.rowid;
The spatial queryies below return SDO_ROWIDSET() when run on the 11.2.0.4 server
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')
from dual;
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC')
from dual;
On the 11.2.0.1 server they return results.
I have discovered that a much smaller table of points will work on 11.2.0.4 so it seems that there is a size limit on 11.2.0.4 when using SDO_JOIN where as 11.2.0.1 seems to cope with the large table.
Does anyone know why this is or if there is an actual limit on table size when using SDO_JOIN?
This is strange. I see no reason why SDO_JOIN will not work the same way in 11.2.0.4. At least I have not seen that sort of behavior before. It looks like a bug to me and I suggest you file a service request with Oracle Support so we can take a look. You may need to provide a dump of the tables - or at least of a small enough subset that demonstrates the problem.
That said there are a few things to check: did you apply the 11.2.0.4 patch on the same database ? I.e. nothing changed in terms of table structures or content, grants, etc ?
When you say that the query returns no rows, does it do so immediately ? Or does it perform some processing before completing without anything being returned ?
How large is the PD_B2 table ?
What happens when you do:
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC','mask=ANYINTERACT')
from dual;
Does this also return nothing ?
What happens if you do
select SDO_JOIN('BLPUS', 'GEOLOC', 'PD_B2', 'GEOLOC')
from dual;
Both should return something that looks like this:
SDO_ROWIDSET(SDO_ROWIDPAIR('AAAW3LAAGAAAADmAAu', 'AAAW3TAAGAAAAg7AAC'), SDO_ROWIDPAIR('AAAW3LAAGAAAADmABE', 'AAAW3TAAGAAAAgrAAA'),...)
You will see this if you run the query in sqlplus. [If you use a GUI (like TOAD or SQLDeveloper) then you may not see that. All those GUIs have problems dealing with objects or arrays.]
But the fact that your 11.2.0.4 tests complete very quickly probably means that you get an empty result back, maybe like this:
SDO_ROWIDSET()
and that confirms that something did not work.
Now, from what you say PD_B2 only contains one row ? If so then there is no reason whatsoever to go via SDO_JOIN. A straightforward spatial query is easier to write. SDO_JOIN only makes sense when both tables being joined contain multiple rows. Then again, if one of the tables is very small (like the PD_B2 table in your case), then it anyway falls back to a simple query.
A simple query would look like this:
SELECT a.uprn
FROM blpus a, PD_B2 b
WHERE sdo_anyinteract (a.geoloc, b.geoloc) = 'TRUE';
Does this return what you expect in 11.2.0.4 ?
You may also want to examine the cost of the query and the query plan used. In sqlplus, do this:
set timing on
set autotrace traceonly
then run the above query. The effect is that sqlplus will not display any results - but it will still fetch and format the output: it will just not print them. At the end you will get a printout of the query plan used as well as some execution statistics and the elapsed time. Please add those results to your question.
Running the query from within your application should have a similar profile: similar response time and database-side costs - assuming you do fetch all the 1.3 million rows.
BTW, what do you do with those results ? Do you show them out as a report ? Do you save them into a table for later analysis ? Surely you do not want to show them all on a map ?
To clarify: SDO_JOIN is only for matching many to many geometries. For matching one to many, the simple SDO_ANYINTERACT() is what you need. As a matter of fact, SDO_JOIN will automatically fall back to a simple SDO_ANYINTERACT when one of the sets is very much smaller than the other. I can't see how SDO_JOIN could be faster in your circumstances (i.e. when searching for all objects that match one object) since it will perform the same as an SDO_ANYINTERACT and will require extra joins to the two tables.
Going back to the original issue - that of a difference in behavior in SDO_JOIN between 11.2.0.1 and 11.2.0.4 that IMO is definitely a bug that you need to report to Oracle Support.
We are using hibernate named query as follows :
named_query : Select this_ from TableA this_ where this_.id in( select max(id) from TableA where COLA is null group by COLB ) and rownum=1
Query query = getNamedQuery("nq.select.DirtySubject.onMaxDirtySubjectRecId");
List<SomeObject> objectList = query.list();
The query has been flagged by the DBAs and the comments verbatim are as follows
These indicate the SQL statements that are consuming the most parsing
resources every time they execute.
The SQL statements that appear in
this report are probably being reparsed. Excessively-parsed SQL
statements should be optimized to reduce their parsing frequency. This
involves using bind variables and identical statement syntax and case,
in order to be able to reuse any previously-parsed statements in the
SQL cache.
Examine these queries to see if any SQL optimization is
possible and reasonable.
Other material facts:
This query is part of a polling logic and gets fired repeatedly.
Database:Oracle 11G
Technology stack : Java,Hibernate,Tomcat,Linux,Oracle 11G
Questions:
1: Behind the scene- Hibernate will be using Prepared statement - correct ?
2: What can we possibly do more from the application side - to avoid re-parsing of this query ?
3: Anything we can do on the Database server to avoid re-parsing ?
The comment talks about parsing resources (in contrast to execution time/resources), but the statement itself does not really look that difficult to parse. The most interesting information would be the execution plan. I assume that the query is always the same (but the comment contains the word "probably").
As for question 1:
If you want to make sure that prepared statements do help, you can try to run the same query without Hibernate - just plain java.sql, PreparedStatement.
If so, there is a Hibernate prepared statement cache, but I have never used it.
As for question 2:
If this is your hotspot, you can use "direct SQL" with Hibernate (if it and your connection pool allows the caching of prepared statements) or even without it: Then you have full control.
As for question 3: Have a look on Oracle 11g result cache.
well this problem is general in sql server ce
i have indexes on all the the fields.
also the same query but with ID IN ( list of int ids) is pretty fast.
i tried to change the query to OUTER Join but this just make it worse.
so any hints on why this happen and how to fix this problem?
That's because the index is not really helpful for that kind of query, so the database has to do a full table scan. If the query is (for some reason) slower than a simple "SELECT * FROM TABLE", do that instead and filter the unwanted IDs in the program.
EDIT: by your comment, I recognize you use a subquery instead of a list. Because of that, there are three possible ways to do the same (hopefully one of them is faster):
Original statement:
select * from mytable where id not in (select id from othertable);
Alternative 1:
select * from mytable where not exists
(select 1 from othertable where mytable.id=othertable.id);
Alternative 2:
select * from mytable
minus
select mytable.* from mytable in join othertable on mytable.id=othertable.id;
Alternative 3: (ugly and hard to understand, but if everything else fails...)
select * from mytable
left outer join othertable on (mytable.id=othertable.id)
where othertable.id is null;
This is not a problem in SQL Server CE, but overall database.
The OPERATION IN is sargable and NOT IN is nonsargable.
What this mean ?
Search ARGument Able, thies mean that DBMS engine can take advantage of using index, for Non Search ARGument Ablee the index can't be used.
The solution might be using filter statement to remove those IDs
More in SQL Performance Tuning by Peter Gulutzan.
ammoQ is right, index does not help much with your query. Depending on distribution of values in your ID column you could optimise the query by specifying which IDs to select rather than not to select. If you end up requesting say more than ~25% of the table index will not be used anyway though because for nonclustered indexed (which is the only type of indexes which SQL CE supports if memory serves) it would be cheaper to scan the table. Otherwise (if the query is actually selective) you could re-write query with ID ranges to select ('union all' may work better than 'or' to combine ranges if SQL CE supports 'union all', not sure)