I need to combine multiple rows in Oracle SQL but have no access to LISTAGG or wm_concat (EVALUATE_AGGR disabled).
Note: I need this to work in Oracle OBIEE 11.1.1.9.
Grateful for any help or tips at all.
Ugh. I hate writing this as an Answer, but I found that the sys_connect_by_path solutions, both at Oracle-Base (see Alex Poole's comment) and on William Robertson's web site (quoted in the Oracle-Base article), are less than perfect, and this won't fit in a comment.
Oracle-Base link: https://oracle-base.com/articles/misc/string-aggregation-techniques#row_number
William Robertson web site: http://www.williamrobertson.net/documents/one-row.html
The solution on Oracle-Base uses two calls to row_number() when only one is needed, and it uses an aggregate query instead of connect_by_isleaf. Perhaps that's the solution originally posted by William, but his page currently has the better solution, using just one row_number() call and connect_by_isleaf instead of aggregation.
However, on William's page, he uses ltrim() without the argument that shows which character to trim, so in fact it has no effect. And he subtracts 1 from the value of row_number(), so in the result the first token in each comma-separated list is left out.
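To make the ltrim() point concrete, here is a minimal sketch against DUAL; the string literal is only an illustration. With a single argument, ltrim() strips leading blanks only, so the leading comma that sys_connect_by_path produces stays in place unless the trim set is given explicitly.

```sql
-- One-argument LTRIM removes leading spaces only; the comma survives.
-- Two-argument LTRIM removes every leading character found in the set.
select ltrim(',CLARK,KING')      as default_trim_set
     , ltrim(',CLARK,KING', ',') as comma_trim_set
from   dual;
```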
Here is the corrected solution - for reference; I claim no originality to any of this. The illustration is run on the EMP table in the standard SCOTT schema.
select deptno
     , ltrim(sys_connect_by_path(ename,','), ',') as name_list
from   ( select deptno
              , ename
              , row_number() over (partition by deptno order by ename) as seq
         from   emp )
where  connect_by_isleaf = 1
connect by seq = prior seq + 1 and deptno = prior deptno
start with seq = 1;
DEPTNO NAME_LIST
------ ------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
My understanding as per standard practice is that HAVING is to be used along with GROUP BY for filtering conditions, while WHERE is supposed to be used for general row-wise filtering conditions.
However, online discussions reach mixed conclusions on whether HAVING can be used as a superset of the WHERE clause, that is, whether it can be used even without GROUP BY, in which case it works like a WHERE clause.
I want to understand the industry practice for using the HAVING clause across Oracle, Microsoft SQL Server, MySQL, PostgreSQL and other tools.
A funny thing I observed when executing this query:
SELECT *
FROM SH.SALES
WHERE amount_sold > 1000
HAVING amount_sold < 2000;
It gives an error when executed in Oracle SQL Developer desktop, whereas it runs successfully in Oracle SQL Developer Web.
This is a great question AND puzzle!
Oracle SQL Developer Web is provided via Oracle REST Data Services (ORDS). There is a RESTful Web Service used to execute 'ad hoc' SQL statements and scripts.
Instead of bringing back all the rows from a query in a single call, we page them. And instead of holding a resultset open and process running, we stick to the RESTful way, and do all the work on a single call and response.
How do we make this happen?
Well, when you type in that query from your question and execute it, on the back end, that's not actually what gets executed.
We wrap that query with another SELECT, and use the ROW_NUMBER() OVER analytic function call. This allows us to 'window' the query results, in this case between rows 1 and 26, or the first 25 rows of that query, your query.
SELECT *
FROM (
SELECT Q_.*,
ROW_NUMBER() OVER(
ORDER BY 1
) RN___
FROM (
select *
from sh.sales
where amount_sold > 1000
having amount_sold < 2000
) Q_
)
WHERE RN___ BETWEEN :1 AND :2
Ok, but so what?
Well, the optimizer figures out this query can still run, even if the HAVING clause isn't appropriate.
The optimizer is always free to re-arrange a query before searching for best execution plans.
In this case, a 10053 trace shows that a query such as below that came from SQL Dev Web (I'm using EMP but the same applies for any table)
SELECT *
FROM (
SELECT Q_.*,
ROW_NUMBER() OVER(
ORDER BY 1
) RN___
FROM (
SELECT *
FROM emp
WHERE sal > 1000
HAVING sal < 2000
) Q_
)
WHERE RN___ BETWEEN :1 AND :2
got internally transformed to the following before being optimized for plans.
SELECT
subq.EMPNO EMPNO,
subq.ENAME ENAME,
subq.JOB JOB,
subq.MGR MGR,
subq.HIREDATE HIREDATE,
subq.SAL SAL,subq.COMM COMM,
subq.DEPTNO DEPTNO,
subq.RN___ RN___
FROM
(SELECT
EMP.EMPNO EMPNO,
EMP.ENAME ENAME,
EMP.JOB JOB,EMP.MGR MGR,
EMP.HIREDATE HIREDATE,
EMP.SAL SAL,
EMP.COMM COMM,
EMP.DEPTNO DEPTNO,
ROW_NUMBER() OVER ( ORDER BY NULL ) RN___
FROM EMP EMP
WHERE EMP.SAL>1000 AND TO_NUMBER(:B1)>=TO_NUMBER(:B2)
) subq
WHERE subq.RN___>=TO_NUMBER(:B3)
AND subq.RN___<=TO_NUMBER(:B4)
Notice the HAVING has been transformed/optimized out of the query, which lets it pass through onto the execution phase.
Major 👏 to #connor-mcdonald of AskTom fame for helping me parse this out.
And so that's why it works in SQL Developer Web, but NOT in SQL Developer Desktop, where the query is executed exactly as written.
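For contrast, here is a sketch of how the intent of the original query could be written so that it does not depend on which tool runs it. It assumes the standard SH sample schema (SH.SALES with PROD_ID and AMOUNT_SOLD); the grouped query is only an illustration of where HAVING normally belongs.

```sql
-- Row-level filtering: both predicates go in WHERE.
select *
from   sh.sales
where  amount_sold > 1000
and    amount_sold < 2000;

-- HAVING filters groups after aggregation, so it is paired with GROUP BY.
select prod_id
     , sum(amount_sold) as total_sold
from   sh.sales
where  amount_sold > 1000          -- applied to individual rows first
group by prod_id
having sum(amount_sold) < 2000;    -- applied to each group's aggregate
```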
I have a question about using the IN clause in a SQL query: which of the following provides better performance?
SELECT * FROM emp WHERE deptno IN (10,20)
OR
WITH dep AS (SELECT 10 deptno FROM DUAL UNION ALL
SELECT 20 deptno FROM DUAL)
SELECT * FROM EMP e
WHERE EXISTS (SELECT 1 FROM dep WHERE dep.deptno=e.deptno);
I am looking for whichever will provide better performance.
The IN clause will be the better choice. In the second example the optimizer can't figure out how to join the two tables, so it scans the whole emp table and checks whether each record meets your condition. I checked this on a huge table (more than a million rows) and the query plans were very different. Of course, I assumed that you have an index on the deptno column. Without it, both solutions require a full table scan on the emp table.
I have been working with a complex view written by another company in 2005. I am trying to understand what it is doing, for reasons beyond this post. Given the highly complex nature of this view (over 500 lines of code), I take it that the writers knew what they were doing.
I keep finding things like TO_NUMBER(null), TO_DATE(null) in various places.
It seems to me like a totally unnecessary use of a function.
Are there any technical reasons or advantages that justify why this was designed like this?
By default NULL does not have a data type:
SQL> select dump(null) from dual;
DUMP
----
NULL
SQL>
However, if we force Oracle into making a decision it will default to making it a string:
SQL> create or replace view v1 as
select 1 as id
, null as dt
from dual
/
View created.
SQL> desc v1
Name Null? Type
-------------- -------- ----------------------------
ID NUMBER
DT VARCHAR2
SQL>
But this is not always desirable. We might need to use NULL in a view for a number of reasons (defining an API, filling out a jagged UNION, etc.) and so we cast the NULL to another datatype to get the projection we need.
SQL> create or replace view v1 as
select 1 as id
, to_date(null) as dt
from dual
/
View created.
SQL> desc v1
Name Null? Type
-------------- -------- ----------------------------
ID NUMBER
DT DATE
SQL>
Later versions have become smarter about handling UNION. On my 11gR2 database, even though I use the null in the first declared query (and that usually drives things), I still get the correct datatype:
SQL> create or replace view v1 as
select 1 as id
, null as dt
from dual
union all
select 2 as id
, sysdate as something_else
from dual
/
View created.
SQL>
SQL> desc v1
Name Null? Type
-------------- -------- ----------------------------
ID NUMBER
DT DATE
SQL>
Explicitly casting NULL may be left over from 8i, or a workaround for a bug, or, as ammoQ said, "superstitious".
In some old and rare cases the implicit conversion of NULL in set operations caused errors like ORA-01790: expression must have same datatype as corresponding expression.
I can't find any great references for this old behavior, but Google returns a few results that claim a query like this would fail in 8i:
select 'a' a from dual
union
select null a from dual;
And there is at least one similar bug, "Bug 9456979 Wrong result from push of NVL / DECODE into UNION view with NULL select list item - superceded".
But don't let 16-year-old software and some rare bug dictate how to program. And don't think there's a positive correlation between code size and programming skill. There's a negative correlation: good programmers will create smaller, more readable code, and won't leave as many mysteries for future coders.
Well, I would prefer CAST(NULL AS DATE) or CAST(NULL AS NUMBER) instead of TO_DATE(NULL); it looks more logical in my eyes.
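A minimal sketch of what that looks like in a view definition. The view name and the amount column are hypothetical, chosen only for illustration; the point is that each NULL column is projected with the datatype named in the CAST.

```sql
-- Typed NULL columns via CAST instead of TO_DATE(NULL) / TO_NUMBER(NULL).
-- v_typed_nulls and amount are illustrative names, not from the original view.
create or replace view v_typed_nulls as
select 1                    as id
     , cast(null as date)   as dt       -- projected as DATE
     , cast(null as number) as amount   -- projected as NUMBER
from   dual;
```

Running desc v_typed_nulls afterwards should show DATE and NUMBER for the two NULL columns rather than VARCHAR2.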
I know two scenarios where such an expression is required.
One is the case of UNION, as already stated in the other answers.
Another scenario is the case of overloaded procedures/functions, for example:
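CREATE OR REPLACE PACKAGE MY_PKG AS
  PROCEDURE MY_PROC(val IN DATE);
  PROCEDURE MY_PROC(val IN NUMBER);
END MY_PKG;
/
CREATE OR REPLACE PACKAGE BODY MY_PKG AS
  -- Overload 1: delete by hire date
  PROCEDURE MY_PROC(val IN DATE) AS
  BEGIN
    DELETE FROM EMP WHERE HIRE_DATE = val;
  END MY_PROC;
  -- Overload 2: delete by employee id
  PROCEDURE MY_PROC(val IN NUMBER) AS
  BEGIN
    DELETE FROM EMP WHERE EMP_ID = val;
  END MY_PROC;
END MY_PKG;
/
The overloads live in a package because standalone procedures cannot be overloaded: a second CREATE OR REPLACE PROCEDURE with the same name would simply replace the first. Calling the procedure like MY_PKG.MY_PROC(NULL); does not work, because Oracle does not know which overload to execute. You must call it like MY_PKG.MY_PROC(CAST(NULL AS DATE)); for example.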
As an Oracle PL/SQL programmer, I really don't find any logical reason for doing the things you have described. The only logical approach to dealing with null in Oracle is to use nvl(); I really don't find any reason to use TO_NUMBER(null) or TO_DATE(null) in a complex view.
Disclaimer: I'm a developer and not a DBA.
I've been a huge fan of the USING clause in Oracle since I accidentally stumbled upon it and have used it in place of the old-fashioned ON clause to join fact tables with dimension tables ever since. To me, it creates a much more succinct SQL and produces a more concise result set with no unnecessary duplicated columns.
However, I was asked yesterday by a colleague to convert all my USING clauses into ONs. I will check with him and ask him what his reasons are. He works much more closely with the database than I do, so I assume he has some good reasons.
I have not heard back from him (we work in different timezones), but I wonder if there are any guidelines or best practices regarding the use of the "using" clause? I've googled around quite a bit, but have not come across anything definitive. In fact, I've not even seen a good debate anywhere.
Can someone shed some light on this? Or provide a link to a good discussion on the topic?
Thank you!
You're presumably already aware of the distinction, but from the documentation:
ON condition Use the ON clause to specify a join condition. Doing so
lets you specify join conditions separate from any search or filter
conditions in the WHERE clause.
USING (column) When you are specifying an equijoin of columns that
have the same name in both tables, the USING column clause indicates
the columns to be used. You can use this clause only if the join
columns in both tables have the same name. Within this clause, do not
qualify the column name with a table name or table alias.
So these would be equivalent:
select e.ename, d.dname
from emp e join dept d using (deptno);
select e.ename, d.dname
from emp e join dept d on d.deptno = e.deptno;
To a large extent which you use is a matter of style, but there are (at least) two situations where you can't use using: (a) when the column names are not the same in the two tables, and (b) when you want to use the joining column:
select e.ename, d.dname, d.deptno
from emp e join dept d using(deptno);
select e.ename, d.dname, d.deptno
*
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier
You can of course just leave off the qualifier and select ..., deptno, as long as you don't have another table with the same column that isn't joined using it:
select e.ename, d.dname, deptno
from emp e join dept d using (deptno) join mytab m using (empno);
select e.ename, d.dname, deptno
*
ERROR at line 1:
ORA-00918: column ambiguously defined
In that case you can only select the qualified m.deptno. (OK, this is rather contrived...).
The main reason I can see for avoiding using is just consistency; since you sometimes can't use it, occasionally switching to on for those situations might be a bit jarring. But again that's more about style than any deep technical reason.
Perhaps your colleague is simply imposing (or suggesting) coding standards, but only they will know that. It also isn't quite clear if you're being asked to change some new code you've written that is going through review, or old code. If it's the latter then regardless of the reasons for them preferring on, I think you'd need to get a separate justification for modifying proven code, as there's a risk of introducing new problems even when the modified code is retested - quite apart from the cost/effort involved in the rework and retesting.
A couple of things strike me about your question though. Firstly, you describe the on syntax as 'old-fashioned', but I don't think that's fair - both are valid and current (as of SQL:2011 I think, but citation needed!). And this:
produces a more concise result set with no unnecessary duplicated columns.
... which I think suggests you're using select *, otherwise you would just select one of the values, albeit with a couple of extra characters for the qualifier. Using select * is generally considered bad practice (here for example) for anything other than ad hoc queries and some subqueries.
Related question.
It seems the main difference is syntactic: the columns are merged in a USING join.
In all cases this means that you can't access the value of a joined column from a specific table, in effect some SQL will not compile, for example:
SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
2 v AS (SELECT 1 a, 2 b, 3 c FROM dual)
3 SELECT t.* FROM t JOIN v USING (a);
SELECT t.* FROM t JOIN v USING (a)
^
ORA-25154: column part of USING clause cannot have qualifier
In an outer join this means you can't access the outer table value:
SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
2 v AS (SELECT NULL a, 2 b, 3 c FROM dual)
3 SELECT * FROM t LEFT JOIN v USING (a)
4 WHERE v.a IS NULL;
WHERE v.a IS NULL
^
ORA-25154: column part of USING clause cannot have qualifier
This means that there is no equivalent for this anti-join syntax with the USING clause:
SQL> WITH t AS (SELECT 1 a, 2 b, 3 c FROM dual),
2 v AS (SELECT NULL a, 2 b, 3 c FROM dual)
3 SELECT * FROM t LEFT JOIN v ON v.a = t.a
4 WHERE v.a IS NULL;
A B C A B C
---------- ---------- ---------- - ---------- ----------
1 2 3
Apart from this, I'm not aware of any difference once the SQL is valid.
However, since it seems this syntax is less commonly used, I wouldn't be surprised if there were specific bugs that affect only the USING clause, especially in early versions where ANSI SQL was introduced. I haven't found anything on MOS that could confirm this, partly because the USING word is ubiquitous in bug descriptions.
If the reason for not using this feature is because of bugs, it seems to me the burden of the proof lies with your colleague: the bugs must be referenced/documented, so that the ban can eventually be lifted once the bugs are patched (database upgrade...).
If the reason is cosmetic or part of a coding convention, surely it must be documented too.
With USING you also cannot do a join like:
select a.id,aval,bval,cval
from a
left join b on a.id = b.id
left join c on c.id = b.id;
that is, only give the column from C when it is matched to a row in the B table.
I need to make a navigation panel that shows only a subset of a possibly large result set. This subset is the 20 records before and the 20 records after the resulting record. As I navigate the results through the navigation panel, I'll be applying a sliding window design using ROWNUM to get the next subset. My question is: does Oracle's ROWNUM build the whole table before it extracts the rows you want, or is it intelligent enough to generate only the rows I need? I googled but couldn't find an explanation.
The pre-analytic-function method for doing this would be:
select col1, col2 from (
select col1, col2, rownum rn from (
select col1, col2 from the_table order by sort_column
)
where rownum <= 20
)
where rn > 10
The Oracle optimizer will recognize in this case that it only needs to get the top 20 rows to satisfy the inner query. It will likely have to look at all the rows (unless, say, the sort column is indexed in a way that lets it avoid the sort altogether) but it will not need to do a full sort of all the rows.
Your solution will not work (as Bob correctly pointed out) but you can use row_number() to do what you want:
SELECT col1,
col2
FROM (
SELECT col1,
col2,
row_number() over (order by some_column) as rn
FROM your_table
) t
WHERE rn BETWEEN 10 AND 20
Note that this solution has the added benefit that you can order the final result by different criteria if you want to.
Edit: forgot to answer your initial question:
With the above solution, yes Oracle will have to build the full result in order to find out the correct numbering.
With 11g and above you might improve your query using the query cache.
Concerning the question's title.
See http://www.orafaq.com/wiki/ROWNUM and this in-depth explanation by Tom Kyte.
Concerning the question's goal.
This should be what you're looking for: Paging with Oracle
I don't think your design is quite going to work out as you've planned. Oracle assigns values to ROWNUM in the order that they are produced by the query - the first row produced is assigned ROWNUM=1, the second is assigned ROWNUM=2, etc. Notice that in order to have ROWNUM=21 assigned the query must first return the first twenty rows and thus if you write a query which says
SELECT *
FROM MY_TABLE
WHERE ROWNUM >= 21 AND
ROWNUM <= 40
no rows will be returned because in order for there to be rows with ROWNUM >= 21 the query must first return all the rows with ROWNUM <= 20.
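A window such as rows 21 to 40 can still be fetched with ROWNUM if the upper bound is applied while ROWNUM is being assigned and the lower bound is applied one level further out, on a column alias. The following is only a sketch: sort_column is a hypothetical ordering column, and rn is an alias introduced for illustration.

```sql
-- Innermost query fixes the order; the middle query assigns ROWNUM and
-- stops at the upper bound; the outer query filters on the aliased value,
-- which by then is an ordinary column rather than the ROWNUM pseudocolumn.
select *
from   ( select t.*
              , rownum as rn
         from   ( select *
                  from   my_table
                  order by sort_column ) t
         where  rownum <= 40 )
where  rn >= 21;
```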
I hope this helps.
It's an old question but you should try this - http://www.inf.unideb.hu/~gabora/pagination/results.html