ORA-01722: invalid number but only when query used as subquery - oracle

A query, like so:
SELECT SUM(col1 * col3) AS total, col2
FROM table1
GROUP BY col2
works as expected when run individually.
For reference:
table1.col1 -- float
table1.col2 -- varchar2
table1.col3 -- float
When this query is moved to a subquery, I get an ORA-01722 error, with reference to the "col2" position in the select clause. The larger query looks like this:
SELECT col3, subquery1.total
FROM table3
LEFT JOIN (
SELECT SUM(table1.col1 * table1.col3) AS total, table1.col2
FROM table1
GROUP BY table1.col2
) subquery1 ON table3.col3 = subquery1.col2
For reference:
table3.col3 -- varchar2
It may also be worth noting that I have another table, table2, with the same structure as table1. If I use the subquery against table2, it works. It never works when using table1.
There is no concatenation, the data types match, the query works by itself... I'm at a loss here. What else should I be looking for? What painfully obvious problem is staring me in the face?
(I didn't choose or make the table structures and can't change them, so answers to that end will unfortunately not be helpful.)

Try using an explicit TO_CHAR cast so both sides of the join are compared as characters:
SELECT col3, subquery1.total
FROM table3
LEFT JOIN (
SELECT SUM(table1.col1 * table1.col3) AS total, table1.col2
FROM table1
GROUP BY table1.col2
) subquery1 ON to_char(table3.col3) = subquery1.col2
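If the cast alone doesn't fix it, it's worth checking whether some value in table1.col2 can't survive an implicit TO_NUMBER conversion somewhere in the plan. A hypothetical diagnostic (the regex pattern is an assumption about what "numeric" means for your data; adjust as needed):
-- Hypothetical check: list col2 values that would fail an implicit TO_NUMBER.
-- The pattern assumes plain integers or decimals.
SELECT col2
FROM table1
WHERE NOT REGEXP_LIKE(TRIM(col2), '^-?[0-9]+(\.[0-9]+)?$');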

Related

snowflake select max date from date array

Imagine I have a table with some fields, one of which is an array of dates, as below:
as below
col1  col2  alldate                        Max_date
1     2     ["2021-02-12","2021-02-13"]    "2021-02-13"
2     3     ["2021-01-12","2021-02-13"]    "2021-02-13"
4     4     ["2021-01-12"]                 "2021-01-12"
5     3     ["2021-01-11","2021-02-12"]    "2021-02-12"
6     7     ["2021-02-13"]                 "2021-02-13"
I need to write a query that selects only the rows whose array contains the overall max date. (There is also a column that holds each row's max date.)
The select statement should show:
col1  col2  alldate                        Max_date
1     2     ["2021-02-12","2021-02-13"]    "2021-02-13"
2     3     ["2021-01-12","2021-02-13"]    "2021-02-13"
6     7     ["2021-02-13"]                 "2021-02-13"
The table is huge, so an optimized query is needed.
So far I was thinking of:
select col1, col2, maxdate
from t1
where array_contains((select max(max_date) from t1)::variant, alldate);
But running an extra select statement inside the query seems like a bad idea to me.
Any suggestions?
If you want pure speed, using lateral flatten is 10% faster than the array_contains approach over 500,000,000 records on an XS warehouse. You can copy-paste the code below straight into Snowflake to test for yourself.
Why is the lateral flatten approach faster?
Looking at the query plans, the optimiser filters at the first step (immediately culling records), whereas the array_contains approach waits until the 4th step before doing the same. The filter is the qualifier on the max(max_date) ...
Create Random Dataset:
create or replace table stack_overflow_68132958 as
SELECT
seq4() col_1,
UNIFORM (1, 500, random()) col_2,
DATEADD(day, UNIFORM (-40, -0, random()), current_date()) random_date_1,
DATEADD(day, UNIFORM (-40, -0, random()), current_date()) random_date_2,
DATEADD(day, UNIFORM (-40, -0, random()), current_date()) random_date_3,
ARRAY_CONSTRUCT(random_date_1, random_date_2, random_date_3) date_array,
greatest(random_date_1, random_date_2, random_date_3) max_date,
to_array(greatest(random_date_1, random_date_2, random_date_3)) max_date_array
FROM
TABLE (GENERATOR (ROWCOUNT => 500000000)) ;
Test Felipe/Mike approach -> 51 secs
select
distinct
col_1
,col_2
from
stack_overflow_68132958
qualify
array_contains(max(max_date) over () :: variant, date_array);
Test Adrian approach -> 47 secs
select
distinct
col_1
, col_2
from
stack_overflow_68132958
, lateral flatten(input => date_array) g
qualify
max(max_date) over () = g.value;
I would likely use a CTE for this, like:
WITH x AS (
SELECT max(max_date) as max_max_date
FROM t1
)
select col1, col2, maxdate
from t1
cross join x
where array_contains(x.max_max_date::variant,alldate);
I have not tested the syntax exactly, and the data types might vary things a bit, but the concept here is that the CTE will be VERY fast and return a single record with a single value. A MAX() function leverages metadata in Snowflake, so it won't even use a warehouse to get it.
That said, the Snowflake profiler is pretty smart, so your query might actually create the exact same query profile as this statement. Test them both and see what the Profile looks like to see if it truly makes a difference.
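If you'd rather inspect before running both, Snowflake's EXPLAIN prints the plan without executing the query; a quick sketch against the question's t1:
EXPLAIN
select col1, col2, maxdate
from t1
where array_contains((select max(max_date) from t1)::variant, alldate);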
To build on Mike's answer, we can do everything in the QUALIFY, without the need for a CTE:
with t1 as (
select 'a' col1, 'b' col2, '2020-01-01'::date maxdate, array_construct('2020-01-01'::date, '2018-01-01', '2017-01-01') alldate
)
select col1, col2, alldate, maxdate
from t1
qualify array_contains((max(maxdate) over())::variant, alldate)
;
Note that you should be careful with types. Both of these are true:
select array_contains('2020-01-01'::date::string::variant, array_construct('2020-01-01', '2019-01-01'));
select array_contains('2020-01-01'::date::variant, array_construct('2020-01-01'::date, '2019-01-01'));
But this is false:
select array_contains('2020-01-01'::date::variant, array_construct('2020-01-01', '2019-01-01'));
You have some great answers, which I only saw after I wrote mine up.
If your data types match, you should be good to go. Copy-paste this directly into Snowflake and it should work.
create or replace schema abc;
use schema abc;
create or replace table myarraytable(col1 number, col2 number, alldates variant, max_date timestamp_ltz);
insert into myarraytable
select 1,2,array_construct('2021-02-12'::timestamp_ltz,'2021-02-13'::timestamp_ltz), '2021-02-13'
union
select 2,3,array_construct('2021-01-12'::timestamp_ltz,'2021-02-13'::timestamp_ltz),'2021-02-13'
union
select 4,4,array_construct('2021-01-12'::timestamp_ltz) , '2021-01-12'
union
select 5,3,array_construct('2021-01-11'::timestamp_ltz,'2021-02-12'::timestamp_ltz) , '2021-02-12'
union
select 6,7,array_construct('2021-02-13'::timestamp_ltz) , '2021-02-13';
select * from myarraytable
order by 1 ;
WITH cte_max AS (
SELECT max(max_date) as max_date
FROM myarraytable
)
select myarraytable.*
from myarraytable, cte_max
where array_contains(cte_max.max_date::variant, alldates)
order by 1 ;

sqlplus: Retrieve results in chunks (update ROWNUM in loop)

I'm really new to using sqlplus, so I'm sorry if this is a stupid question.
I have a really long running query of the form:
SELECT columnA
from tableA
where fieldA in (
(select unique columnB
from tableB
where fieldB in
(select columnC
from tableC
where fieldC not in
(select columnD
from tableD
where x=y
and a=b
and columnX in
(select columnE
from tableE
where p=q)))
and columnInTableB = <some value>
and anotherColumnInTableB = <some other value>
and thirdColumnInTableB IN (<set of values>)
and fourthColumnInTableB like <some string>);
Each of the tables has about 15 - 30 columns and a varying number of rows. TableB is the largest, with about 5 million rows in all. Tables A - E have between 500,000 and 1 million rows each.
I've tried a couple of approaches:
1) Run this query as is:
This query runs for a really long time and then I get the error:
ORA-01555: snapshot too old: rollback segment number <> with name <>
I did some research and found things like:
ORA-1555: snapshot too old: rollback segment number
However, I don't have privileges to change the undo segment.
2) I re-wrote the query using with...as, but then I get the error:
unable to extend temp segment by <> in tablespace TEMP
Again, I found explanations as to how to fix this error, but I don't have privileges to extend the temp segment.
The query that takes longest to run is:
(select unique columnB
from tableB ...
The and fourthColumnInTableB like <some string> condition matches about three million entries in tableB in the worst case.
Someone suggested to me to 'run the query in smaller chunks'.
An approach I thought of was to retrieve the data for the long-running subquery
(select unique columnB
from tableB ...
)
in chunks (using ROWNUM as suggested here).
My question is this:
I don't know exactly how many potential matches there are for this subquery. Can I dynamically set ROWNUM to retrieve data in chunks?
If yes, could you please give me an example of what the while loop must look like, and how I can determine when the result set has been exhausted?
An option I found for this was to check while @@ROWCOUNT > 0 (though that appears to be SQL Server syntax) or use:
while exists (query)
However, I'm still not sure how to write the loop and how to use a variable (?) to dynamically set ROWNUM.
Basically, I'm wondering if I can do:
SELECT columnA
from tableA
where fieldA in (
while all results have not been fetched:
select *
from
(select a.*, rownum rnum
from
(select unique columnB
from tableB
where fieldB in
(select columnC
from tableC
where fieldC not in
(select columnD
from tableD
where x=y
and a=b
and columnX in
(select columnE
from tableE
where p=q)))
and columnInTableB = <some value>
and anotherColumnInTableB = <some other value>
and thirdColumnInTableB IN (<set of values>)
and fourthColumnInTableB like <some string>) a
where rownum <= i) and rnum >= i);
update value of i here (?) so that the loop can repeat
How can I update the value of 'i' for rownum/rnum above dynamically in some loop to retrieve results in chunks (of, say, 10000) until the result set has been exhausted?
Also, what should the while loop look like?
I also have no idea how to rewrite this using joins (my knowledge of sql is very limited), so if someone can help me rewrite this more efficiently using joins or any other method, that would work too.
I'd really appreciate any help on this. I've been stuck on this for a few days now and I'm unable to determine a proper solution.
Thank you!
1) Try removing the UNIQUE/DISTINCT clause. It should cut the in-memory sort and temp segment usage.
2) Try applying a 'rownum < x_rows' filter at the end, to restrict the number of rows sent to the client and reduce IO.
SELECT columnA
FROM tableA
WHERE fieldA IN
    (SELECT columnB          -- UNIQUE removed, per point 1
     FROM tableB
     WHERE fieldB IN
        (SELECT columnC
         FROM tableC
         WHERE fieldC NOT IN
            (SELECT columnD
             FROM tableD
             WHERE x = y
             AND a = b
             AND columnX IN
                (SELECT columnE FROM tableE WHERE p = q)))
     AND columnInTableB = <some value>
     AND anotherColumnInTableB = <some other value>
     AND thirdColumnInTableB IN (<set of values>)
     AND fourthColumnInTableB LIKE <some string>)
AND ROWNUM < 50;             -- row limit added, per point 2
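As for the loop the question asks about, here is a minimal PL/SQL sketch of the classic two-level ROWNUM pagination, fetching in chunks of 10,000 until the result set is exhausted. The table and column names are the question's placeholders, and a deterministic ORDER BY (assumed to be columnB here) is required for the chunks to stay consistent across iterations:
DECLARE
  c_chunk CONSTANT PLS_INTEGER := 10000;
  v_lo    PLS_INTEGER := 1;
  v_hi    PLS_INTEGER := c_chunk;
  v_cnt   PLS_INTEGER;
BEGIN
  LOOP
    -- Count (or process) the rows in the current window [v_lo, v_hi].
    SELECT COUNT(*)
      INTO v_cnt
      FROM (SELECT a.*, ROWNUM rnum
              FROM (SELECT UNIQUE columnB
                      FROM tableB
                      -- ... the question's filters go here ...
                     ORDER BY columnB) a   -- a stable ORDER BY is essential
             WHERE ROWNUM <= v_hi)
     WHERE rnum >= v_lo;

    EXIT WHEN v_cnt = 0;   -- window empty: result set exhausted

    -- process this chunk here (e.g. BULK COLLECT it into a collection) ...

    v_lo := v_hi + 1;
    v_hi := v_hi + c_chunk;
  END LOOP;
END;
/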

Converting rownum from Oracle to Postgres

I need to make a conversion from Oracle SQL to PostgreSQL.
select * from table1 inner join table2 on table1.id = table2.table1Id
where table1.col1 = 'TEST'
and rownum <=5
order by table2.col1
If I delete and rownum <=5 and put limit 5 at the end, the two dialects behave differently. In Oracle, the 5 rows are selected first, and after that they are sorted by table2.col1. In Postgres, the whole result is sorted first, and AFTER that the first 5 rows are selected.
How can I obtain the same result in Postgres as in Oracle?
Thanks!
To get the behavior you desire, you can use a subquery like this:
SELECT * FROM (
SELECT table1.col1 as t1col1, table2.col1 as t2col1
FROM table1 INNER JOIN table2 ON table1.id = table2.table1Id
WHERE table1.col1 = 'TEST'
LIMIT 5
) AS sub
ORDER BY t2col1;
I named the columns there because in your example both tables had a col1.
Note however that without any ordering on the inner query, the selection of 5 rows you get will be purely random and subject to change.
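If a deterministic pick is wanted, an ORDER BY can go inside the subquery as well; a sketch, assuming table1.id is a sensible key to define which five rows come "first":
SELECT * FROM (
    SELECT table1.col1 AS t1col1, table2.col1 AS t2col1
    FROM table1 INNER JOIN table2 ON table1.id = table2.table1Id
    WHERE table1.col1 = 'TEST'
    ORDER BY table1.id   -- assumed key; use whatever matches the intended order
    LIMIT 5
) AS sub
ORDER BY t2col1;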
Depending on the version you are using: PostgreSQL 8.4 and above have window functions, and ROW_NUMBER() can fill the role of Oracle's rownum pseudocolumn. Note that the numbering has to be computed in a subquery before you can filter on it:
SELECT *
FROM (
    SELECT table1.col1 AS t1col1, table2.col1 AS t2col1,
           row_number() OVER () AS rownum
    FROM table1
    INNER JOIN table2 ON table1.id = table2.table1Id
    WHERE table1.col1 = 'TEST'
) AS sub
WHERE rownum <= 5
ORDER BY t2col1;

Oracle - EXCEPT Error

I have a few columns and tables, as follows:
NOTE: The names of elements used are for illustrative purposes only.
SELECT T.col1
FROM Table1 T
WHERE NOT EXISTS (
(SELECT * FROM Table2)
EXCEPT (SELECT TT.col1
FROM TableTT TT
WHERE TT.col2 = T.col2)
);
Error: Missing right parenthesis, though the parentheses seem to match.
But, I do know that it has nothing to do with the parenthesis actually. And I suspect the error to be somewhere in the EXCEPT clause. What might have resulted in the error?
There's no EXCEPT operator in Oracle. Use MINUS instead.
In your query the word 'EXCEPT' is most probably treated as a table alias for the (SELECT * FROM Table2) subquery.
UPDATE:
The full query for the provided data structure will look like:
SELECT T.col1
FROM Table1 T
WHERE NOT EXISTS
((SELECT col1 FROM Table2)
MINUS
(SELECT TT.col1 FROM TableTT TT WHERE TT.col2 = T.col2));
Note that I have changed * to col1 for Table2 - if you're selecting a single INT column TT.col1 from TT, then you should also select a single INT column from Table2.
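A minimal runnable illustration of the dialect difference (illustrative literals only):
-- Oracle spells the set-difference operator MINUS:
SELECT 1 AS n FROM dual
MINUS
SELECT 2 AS n FROM dual;
-- Standard SQL (PostgreSQL, SQL Server, ...) spells the same operator EXCEPT.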

Getting Error in query

The query:
update tablename set (col1, col2, col3) = (select col1, col2, col3 from tableName2 order by tableName2.col4)
returns the error Missing ). The query works fine if I remove the ORDER BY clause.
ORDER BY is not allowed in a subquery within an UPDATE. So you get the error "Missing )" because the parser expects the subquery to end at the point that you have ORDER BY.
What is the ORDER BY intended to do?
What you probably have in mind is something like:
UPDATE TableName
SET (Col1, Col2, Col3) = (SELECT T2.Col1, T2.Col2, T2.Col3
FROM TableName2 AS T2
WHERE TableName.Col4 = T2.Col4
)
WHERE EXISTS(SELECT * FROM TableName2 AS T2 WHERE TableName.Col4 = T2.Col4);
This clumsy looking operation:
1) Grabs rows from TableName2 that match TableName on the value in Col4 and updates TableName with the values from the corresponding columns.
2) Ensures that only rows in TableName with a corresponding row in TableName2 are altered; if you drop the WHERE clause from the UPDATE, rows in TableName without a matching entry in TableName2 get their Col1, Col2, and Col3 replaced with nulls.
Some DBMS also support an update-join notation to reduce the ghastliness of this notation.
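In Oracle, for instance, MERGE gives the same effect more directly; a sketch, assuming Col4 is a usable join key that matches at most one row in TableName2:
-- Update matching rows in TableName from TableName2, joined on Col4.
MERGE INTO TableName t
USING TableName2 t2
ON (t.Col4 = t2.Col4)
WHEN MATCHED THEN UPDATE
    SET t.Col1 = t2.Col1,
        t.Col2 = t2.Col2,
        t.Col3 = t2.Col3;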
