Duplicates with CONNECT BY in Oracle

select RP.COUNTRYID,RP.PRDCODE,
RP.REPID,
RP.CHANNELID,
RP.CUSTOMERID,
RP.DIVISION,
RP.WWCOGS_GAUSS,
RP.WWCOGS_SAP,
RP.WWCOGS_BusLine,
RP.CURRID ,
ADD_MONTHS(to_date('01-01-'||rp.year,'DD-MM-YYYY'),level-1) KFDATE
from
(SELECT CP.COUNTRYID,
CP.PRDCODE,
CP.REPID,
CP.CHANNELID,
CP.CUSTOMERID,
IP.DIVISION,
CP.WWCOGS_GAUSS,
CP.WWCOGS_SAP,
CP.WWCOGS_BusLine,
CP.year,
CP.CURRID from
(select distinct IBC.COUNTRYID ,
decode(ls.dmdunit,null,mwa.ProductName,ls.dmdunit) PRDCODE,
ReportingUnit REPID,
'99' CHANNELID,
IBC.COUNTRYID CUSTOMERID ,
(
CASE
WHEN MWA.COGSSourceCF LIKE '%Gauss%'
THEN MWA.COGSPriceCF * MWA.ExchRate
ELSE NULL
END) WWCOGS_GAUSS,
(
CASE
WHEN MWA.COGSSourceCF LIKE '%SAP%'
THEN MWA.COGSPriceCF * MWA.ExchRate
ELSE NULL
END) WWCOGS_SAP,
(
CASE
WHEN MWA.COGSSourceCF NOT LIKE '%Gauss%'
AND MWA.COGSSourceCF NOT LIKE '%SAP%'
THEN MWA.COGSPriceCF * MWA.ExchRate
ELSE NULL
END) WWCOGS_BusLine,
mwa.year,
IRC.CURRID
from BAM.M_WWCOGS_AREA MWA,
MICSTAG.M_IBP_REPUNIT_CURRENCY IRC,
micstag.M_IBP_BDREPORTINGCOUNTRY IBC
,MICSTAG.M_LOCALPRODUCT_STAG LS
WHERE MWA.ReportingUnit=IRC.REPID
--and mwa.productname='FR21030390085'
--and MWA.GaussCountry='BE BELGIUM'
AND IBC.COUNTRYPLANNINGGROUP =MWA.GaussCountry
and IBC.businessdivision=31
and MWA.COGSPriceCF <>0
and ls.REPORTINGUNITID(+)=mwa.ReportingUnit
and ls.IBPLOCALPRDID(+)= mwa.ProductName ) CP, micstag.M_IBP_PRODUCT IP
where CP.PRDCODE=IP.PRDID) RP
CONNECT BY level <= 12 ;
The above query returns unwanted duplicates, and if I use DISTINCT the query runs forever.
Requirement: repeat each record in the result set of RP once per month of its year.
For example, if the value of year is 2019, then 12 records should come back, dated from 1-Jan-2019 to 1-Dec-2019.
More than one value of year is possible.

The CONNECT BY LEVEL <= 12 trick works nicely with dual because that table returns one row. Things are messier when the base result set returns more than one row: without a condition linking child to parent, every row is connected to every other row at each level, so the CONNECT BY generates a product. This is why you're getting duplicates.
What you need to do is specify some additional criteria for the connection. Ideally you will have a primary key in the projection - I'm going to assume it's REPID, so if it's something different you'll need to tweak this. Anyway, you'll need something like this:
) RP
CONNECT BY level <= 12
and rp.repid = prior rp.repid
and prior sys_guid() is not null
The prior sys_guid() bit prevents ORA-01436: CONNECT BY loop in user data.
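To illustrate the intended result (12 rows per base row, no product), here is a minimal sketch of an alternative approach: generate the 12 month numbers once and cross join them to the base rows. It uses Python's sqlite3 module as a stand-in for Oracle, and the table rp with columns repid and year is a hypothetical, simplified version of the RP result set, so treat it as an illustration rather than a drop-in replacement.

```python
import sqlite3


def expand_to_months(rows):
    """Return one (repid, kfdate) row per month for each (repid, year) input row."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for the RP inline view.
    conn.execute("CREATE TABLE rp (repid TEXT, year INTEGER)")
    conn.executemany("INSERT INTO rp VALUES (?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE months(m) AS (
            SELECT 1
            UNION ALL
            SELECT m + 1 FROM months WHERE m < 12
        )
        SELECT rp.repid,
               printf('%04d-%02d-01', rp.year, months.m) AS kfdate
        FROM rp CROSS JOIN months
        ORDER BY rp.repid, kfdate
        """
    ).fetchall()
    conn.close()
    return result


expanded = expand_to_months([("R1", 2019), ("R2", 2020)])
print(len(expanded))               # 24: 12 rows for each of the 2 base rows
print(expanded[0], expanded[11])   # first and last month of R1's year
```

Because the month generator has exactly 12 rows, each base row appears exactly 12 times regardless of how many base rows there are.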

Related

MIN function behavior changed on Oracle databases after SAS Upgrade to 9.4M7

I have a program that has been working for years. Today, we upgraded from SAS 9.4M3 to 9.4M7.
proc setinit
Current version: 9.04.01M7P080520
Since then, I am not able to get the same results as before the upgrade.
Please note that I am querying on Oracle databases directly.
While trying to replicate the issue with a minimal, reproducible SAS table example, I found that the issue disappears when querying a SAS table instead of the Oracle databases.
Let's say I have the following dataset:
data have;
infile datalines delimiter="|";
input name :$8. id :$1. value :$8.;
datalines;
Joe|A|TLO
Joe|B|IKSK
Joe|C|Yes
;
Using the temporary table:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have
group by name;
quit;
Results:
name A
Joe TLO
However, when running the very same query on the Oracle database directly, I get a missing value instead:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have_oracle
group by name;
quit;
name A
Joe
As per the documentation, the min() function behaves properly when used on the SAS table:
The MIN function returns a missing value (.) only if all arguments are missing.
I believe this happens when Oracle doesn't understand the function that SAS is passing to it - the min functions in SAS and Oracle are very different and the equivalent in SAS would be LEAST().
So my guess is that the upgrade messed up how it translates the SAS min function to Oracle, but it remains a guess. Has anyone run into this type of behavior?
EDIT: @Richard's comment
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
proc sql;
create table want as
select t1.name,
min(case when id = 'A' then value else "" end) as A length 8
from oracle_db.names t1 inner join oracle_db.ids t2 on (t1.tid = t2.tid)
group by t1.name;
ORACLE_26: Prepared: on connection 0
SELECT * FROM NAMES
ORACLE_27: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='NAMES' AND
UIC.TABLE_NAME='NAMES' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_28: Executed: on connection 1
SELECT statement ORACLE_27
ORACLE_29: Prepared: on connection 0
SELECT * FROM IDS
ORACLE_30: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='IDS' AND
UIC.TABLE_NAME='IDS' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_31: Executed: on connection 1
SELECT statement ORACLE_30
ORACLE_32: Prepared: on connection 0
select t1."NAME", MIN(case when t2."ID" = 'A' then t1."VALUE" else ' ' end) as A from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID" group by t1."NAME"
ORACLE_33: Executed: on connection 0
SELECT statement ORACLE_32
ACCESS ENGINE: SQL statement was passed to the DBMS for fetching data.
NOTE: Table WORK.SELECTED_ATTR created, with 1 row and 2 columns.
! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.34 seconds
cpu time 0.09 seconds
Use the SASTRACE= system option to log SQL statements sent to the DBMS.
options SASTRACE=',,,d';
will provide the most detailed logging.
From the prepared statement you can see why you are getting a blank from the Oracle query.
select
t1."NAME"
, MIN ( case
when t2."ID" = 'A' then t1."VALUE"
else ' '
end
) as A
from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID"
group by
t1."NAME"
The SQL MIN() aggregate function excludes null values from consideration.
In SAS SQL, a blank value is also interpreted as null.
In SAS, your SQL query therefore returns the minimum non-null value, TLO.
In the query passed to Oracle, the SAS blank "" is translated to ' ', a single blank character, which is not null. Since ' ' < 'TLO', you get the blank result.
The MIN you actually want to force in Oracle is min(case when id = 'A' then value else null end), which @Tom has shown is possible by using null in, or omitting, the else clause.
The only way to see the actual difference is to run the query with trace in the prior SAS version or, if you are lucky, to find the explanation in the (ignored by many) "What's New" documents.
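The blank-versus-null effect can be reproduced outside SAS and Oracle. The following is a minimal sketch using Python's sqlite3 module as a stand-in database (an assumption for illustration only; SQLite is not Oracle, but its MIN aggregate likewise ignores NULL and compares ' ' as an ordinary string). The table have mirrors the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (name TEXT, id TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?, ?)",
    [("Joe", "A", "TLO"), ("Joe", "B", "IKSK"), ("Joe", "C", "Yes")],
)


def min_a(else_expr):
    """Run the aggregate with the given SQL expression in the ELSE branch."""
    sql = (
        "SELECT name, "
        f"MIN(CASE WHEN id = 'A' THEN value ELSE {else_expr} END) AS a "
        "FROM have GROUP BY name"
    )
    return conn.execute(sql).fetchall()


with_blank = min_a("' '")   # ELSE ' ' : the blank is a real value and sorts first
with_null = min_a("NULL")   # ELSE NULL: nulls are skipped by MIN
print(with_blank)  # [('Joe', ' ')]
print(with_null)   # [('Joe', 'TLO')]
```

The only difference between the two calls is the ELSE branch, which is exactly the difference between the statement SAS passed to Oracle and the intended one.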
Why are you using ' ' or '' as the ELSE value? Perhaps Oracle is treating a string with blanks in it differently than a null string.
Why not use null in the ELSE clause?
or just leave off the ELSE clause and let it default to null?
libname mylib oracle .... ;
proc sql;
create table want as
select name
, min(case when id = "A" then value else null end) as A length 8
from mylib.have_oracle
group by name
;
quit;
Also try running the Oracle code yourself, instead of using implicit pass thru.
proc sql;
connect to oracle ..... ;
create table want as
select * from connection to oracle
(
select name,
min(case when id = 'A' then value else null end) as A
from have_oracle
group by name
)
;
quit;
When I try to reproduce this in Oracle I get the result you are looking for so I suspect it has something to do with SAS (which I'm not familiar with).
with t as (
select 'Joe' name, 'A' id, 'TLO' value from dual union all
select 'Joe' name, 'B' id, 'IKSK' value from dual union all
select 'Joe' name, 'C' id, 'Yes' value from dual
)
select name
, min(case when id = 'A' then value else '' end) as a
from t
group by name;
NAME A
---- ----
Joe TLO
Unrelated, if you are only interested in id = 'A' then a better query would be:
select name
, min(value) as a
from t
where id = 'A'
group by name;

Issue in Jqgrid pagination in Oracle server

We have code that sorts and paginates data and renders it in a jqGrid. The code works fine when it is connected to SQL Server: each page returns distinct data, as expected. When it is connected to an Oracle server, however, duplicate data is rendered after some point. Both the Oracle and SQL Server databases hold the same data. The jqGrid page parameters and the number of pages arrive as expected on the server side; that is, the start point and chunk size are transferred correctly when paging. The duplicates appear after sorting on columns that are of type varchar in the database but also hold numeric values. For example, the status column holds the values 3 and A, and the paging issue shows up after sorting by it. By duplicate data I mean that the data on page 2 is the same as the data on page 3. Any help will be appreciated. Thanks in advance...
Query One:-
select * from ( select row_.*, rownum rownum_ from ( Select x,y,z,status FROM tablename c WHERE status IN('in condition seperated with status') ORDER BY status asc ) row_ where rownum <= 30 ) where rownum_ > 20;
Query Two:-
select * from ( select row_.*, rownum rownum_ from ( Select x,y,z,status FROM tablename c WHERE status IN('in condition seperated with status') ORDER BY status asc ) row_ where rownum <= 20 ) where rownum_ > 10;
Here, queries 1 and 2 always return the same results.
Where two or more values in the column of your ORDER BY clause are the same, you must always provide a secondary column to rank by. Otherwise, the database is free to return the tied rows in any order, and that order can differ from one query to the next, so getting the pages you expect is as much a matter of luck as rolling dice. The secondary column must be unique for accurate results. You might be tempted to assume that tied rows will sort themselves in the order they were entered, but that is not guaranteed.
select * from( select row_.*, rownum rownum_ from( Select x,y,z,status FROM tablename c WHERE status IN('in condition seperated with status') ORDER BY status,x asc ) row_ where rownum <= 30 ) where rownum_ > 20;
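This assumes x is a unique value. DBMS_RANDOM.VALUE is not a suitable tie-breaker here, because it yields different values every time a query runs and each page is fetched by a separate query; for an Oracle-specific query, ROWID can serve as a stable tie-breaker if no unique column is available.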
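The effect of a unique tie-breaker can be checked with a small sketch. It uses Python's sqlite3 module and LIMIT/OFFSET in place of Oracle's ROWNUM wrapping (both are assumptions for illustration), with a hypothetical tablename whose x column is unique and whose status column holds only the values 3 and A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (x INTEGER PRIMARY KEY, status TEXT)")
# 30 rows with heavily tied status values: odd x gets '3', even x gets 'A'.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(i, "3" if i % 2 else "A") for i in range(1, 31)],
)


def fetch_page(page, size=10):
    """Fetch one page, ordering by status and then by the unique column x."""
    return conn.execute(
        "SELECT x, status FROM tablename ORDER BY status, x LIMIT ? OFFSET ?",
        (size, (page - 1) * size),
    ).fetchall()


pages = [fetch_page(p) for p in (1, 2, 3)]
seen = [row[0] for page in pages for row in page]
print(len(seen), len(set(seen)))  # 30 30: every row appears exactly once
```

With ORDER BY status, x the sort order is total, so each page is a fixed, non-overlapping slice of the same sequence no matter how many times the pages are fetched.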

11g Oracle aggregate SQL query

Can you please help me write a query for this scenario? In the case below it should return a single row with A=13, because 13 and 14 have the most occurrences in column A and the value of B (30) is greater for 13. We are interested in the value of A with the most occurrences; in case of a tie, B should be used as the tie-breaker.
A B
13 30
13 12
14 10
14 25
15 5
In the case below, where each value of A occurs once (all tied), it should return 14, which has the maximum value of B (40).
A B
13 30
14 40
15 5
Use case: we get calls from corporate customers. We want to know during which hours of the day most calls come in and, in case of a tie, which of the busiest hours has the longest call.
Further question
There is a further question on this. I want to use either of the two solutions, '11g or lower' from @GurV or 'dense_rank' from @mathguy, in the bigger query below. How can I do it?
SELECT dv.id , u.email , dv.email_subject AS headline , dv.start_date , dv.closing_date, b.name AS business_name, ls.call_cost, dv.currency,
SUM(lsc.duration) AS duration, COUNT(lsc.id) AS call_count, ROUND(AVG(lsc.duration), 2) AS avg_duration
-- max(extract(HOUR from started )) keep (dense_rank last order by count(duration), max(duration)) as most_popular_hour
FROM deal_voucher dv
JOIN lead_source ls ON dv.id = ls.deal_id
JOIN lead_source_call lsc ON ls.PHONE_SID = lsc.phone_number_id
JOIN business b ON dv.business_id = b.id
JOIN users u ON b.id = u.business_id
AND TRUNC(dv.closing_date) = to_date('13-01-2017', 'dd-mm-yyyy')
AND lsc.status = 'completed' and lsc.duration >= 30
GROUP BY dv.id , u.email , dv.email_subject , dv.start_date , dv.closing_date, b.name, ls.call_cost, dv.currency
--, extract(HOUR from started )
Try this if 12c+
select a
from t
group by a
order by count(*) desc, max(b) desc
fetch first 1 row only;
If 11g or lower:
select * from (
select a
from t
group by a
order by count(*) desc, max(b) desc
) where rownum = 1;
Note that if there is equal count and equal max value for two or more values of A, then any one of them will be fetched.
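As a quick check of the ordering logic, here is a minimal sketch that runs the same GROUP BY / ORDER BY against both sample tables from the question. It uses Python's sqlite3 module with LIMIT 1 standing in for FETCH FIRST 1 ROW ONLY or ROWNUM = 1 (an assumption for illustration, not Oracle syntax).

```python
import sqlite3


def most_frequent_a(rows):
    """Return the A with the most occurrences, using MAX(B) as the tie-breaker."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    row = conn.execute(
        """
        SELECT a
        FROM t
        GROUP BY a
        ORDER BY COUNT(*) DESC, MAX(b) DESC
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return row[0]


first = most_frequent_a([(13, 30), (13, 12), (14, 10), (14, 25), (15, 5)])
second = most_frequent_a([(13, 30), (14, 40), (15, 5)])
print(first)   # 13: 13 and 14 both occur twice, and 13 has the larger max B
print(second)  # 14: all values occur once, and 14 has the largest B
```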
Here is a query that will work in older versions (no fetch clause) and does not require a subquery. It uses the first/last function. In case of ties by both "count by A" and "value of max(B)" it selects only the row with the largest value of A. You can change that to min(A), or even to sum(A) (although that probably doesn't make sense in your problem) or LISTAGG(A, ',') WITHIN GROUP (ORDER BY A) to get a comma-delimited list of the A's that are tied for first place, but that requires 11.2 (I believe).
select max(a) keep (dense_rank last order by count(b), max(b)) as a
, max(max(b)) keep (dense_rank last order by count(b)) as b
from inputs
group by a
;

Passing a parameter to a WITH clause query in Oracle

I'm wondering if it's possible to pass one or more parameters to a WITH clause query; in a very simple way, doing something like this (that, obviously, is not working!):
with qq(a) as (
select a+1 as increment
from dual
)
select qq.increment
from qq(10); -- should get 11
Of course, my actual use case is much more complicated, since the WITH clause would be in a subquery and the parameters I'd pass are values taken from the main query....details upon request... ;-)
Thanks for any hint
OK.....here's the whole deal:
select appu.* from
(<quite a complex query here>) appu
where not exists
(select 1
from dual
where appu.ORA_APP IN
(select slot from
(select distinct slots.inizio,slots.fine from
(
with
params as (select 1900 fine from dual)
--params as (select app.ora_fine_attivita fine
-- where app.cod_agenda = appu.AGE
-- and app.ora_fine_attivita = appu.fine_fascia
--and app.data_appuntamento = appu.dataapp
--)
,
Intervals (inizio, EDM) as
( select 1700, 20 from dual
union all
select inizio+EDM, EDM from Intervals join params on
(inizio <= fine)
)
select * from Intervals join params on (inizio <= fine)
) slots
) slots
where slots.slot <= slots.fine
)
order by 1,2,3;
Without going into too much detail, the where condition should remove those records where 'appu.ORA_APP' matches one of the records that are supposed to be created in the (outer) 'slots' table.
The constants used in the example are good for a subset of records (a single 'appu.AGE' value); that's why I need to parametrize it, in order to use the commented-out 'params' table (to be replicated, then, in the 'Intervals' table).
I know that's not simple to analyze from scratch, but I tried to make it as clear as possible; feel free to ask for a numeric example if needed....
Thanks

Return Boolean value when table has data in the specified range

I need a query that returns a boolean indicating whether a table has data in a given range.
Assume table
Customer
[User ID, Name, Date, Products_Purchased]
I'm trying to do:
select case when exists(
select Date, count(*)
from Customer
where date between '2015-08-03' and '2015-08-05'
)
then cast(1 as BIT)
else case(0 as BIT)end;
This throws an error near "select Date".
However, the weird part is that the inner query runs perfectly fine on its own.
I'm wondering if I'm missing something here!
What about something more straightforward e.g.
select case when count(*) >0 then 1 else 0 end as HIT
from ... where ...
That way you don't have to bother about Hive assuming that EXISTS implies a correlated sub-query, automagically translated into a MapJoin, i.e. a Java HashMap shuffled to the 2nd line of Mappers jobs, etc. Not exactly your use case.
Then it's not useful to compute the exact count, so the query could be refined as
select case when count(*) >0 then 1 else 0 end as HIT
from
(select ... from ... where ... limit 1) X
[Edit] There is no "bit" datatype in Hive. But the default "int" should be OK if you just want a return flag (zero / non-zero)
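The refined query above can be sketched as follows. This uses Python's sqlite3 module rather than Hive (an assumption for illustration), and the customer table with a purchase_date column stored as ISO-formatted text is a hypothetical stand-in for the Customer table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the Customer table; dates stored as ISO text.
conn.execute(
    "CREATE TABLE customer "
    "(user_id INTEGER, name TEXT, purchase_date TEXT, products_purchased INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Ann", "2015-08-04", 3), (2, "Bob", "2015-09-01", 1)],
)


def has_data(start, end):
    """Return 1 if any row falls in [start, end], else 0."""
    row = conn.execute(
        """
        SELECT CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS hit
        FROM (
            SELECT 1 FROM customer
            WHERE purchase_date BETWEEN ? AND ?
            LIMIT 1  -- stop after the first match; the exact count is not needed
        ) AS x
        """,
        (start, end),
    ).fetchone()
    return row[0]


in_range = has_data("2015-08-03", "2015-08-05")
out_of_range = has_data("2015-10-01", "2015-10-31")
print(in_range, out_of_range)  # 1 0
```

The outer COUNT(*) always returns exactly one row, so the caller gets a flag even when nothing matches.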
