I have a query where pivoting on one column and getting the total notionals works but I also need to get the total notional for another column also (type).
I have tried creating another pivot to show the total for the type column from source table.
SELECT * FROM (SELECT venue, type_, notional FROM ABC ) PIVOT ( SUM(notional) FOR (venue) IN ('A' as A1 , 'B' as B1) ) PIVOT ( SUM(notional) FOR (type_) IN ('Prime') );
I expect to see one pivot which would show the total for Prime, A1 and B1.
I think you shouldn't use pivot for this problem, but simple aggregate functions:
SELECT SUM(CASE WHEN venue = 'A' THEN notional END) AS venueA
, SUM(CASE WHEN venue = 'B' THEN notional END) AS venueB
, SUM(CASE WHEN type_ = 'Prime' THEN notional END) AS Prime
FROM ABC
If it's really only one type you can do the following:
SELECT * FROM
(SELECT venue
, SUM(CASE WHEN type_ = 'Prime' THEN notional END) over() prime
, notional
FROM ABC )
PIVOT ( SUM(notional) FOR (venue) IN ('A' as A1 , 'B' as B1) )
If there is also more then one type you have todo several PIVOT queries and then JOIN them afterwards:
SELECT * FROM
(SELECT * FROM (SELECT venue, notional FROM ABC )
PIVOT ( SUM(notional) FOR (venue) IN ('A' as A1 , 'B' as B1) ))
CROSS JOIN
(SELECT * FROM (SELECT type_, notional FROM ABC )
PIVOT ( SUM(notional) FOR (type_) IN ('Prime' AS Prime)))
Related
I have the following setup, which works fine and generates output as expected.
I'm trying to add the locations subquery into the CTE so my output will have a random location_id for each row.
The subquery is straight forward and should work but I am getting syntax errors when I try to place it into the 'data's CTE. I was hoping someone could help me out.
CREATE TABLE employees(
employee_id NUMBER(6),
emp_name VARCHAR2(30)
);
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(1, 'John Doe');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(2, 'Jane Smith');
INSERT INTO employees(
employee_id,
emp_name
) VALUES
(3, 'Mike Jones');
CREATE TABLE locations AS
SELECT level AS location_id,
'Door ' || level AS location_name
FROM dual
CONNECT BY level <=
with rws as (
select level rn from dual connect by level <= 5 ),
data as ( select e.*,round (dbms_random.value(1,5)
) n from employees e)
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
-- trying to make this work
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
select employee_id,
emp_name,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws
join data d on rn <= n
order by employee_id;
You need to alias the subquery column expression, rather than trying to assign it to a [variable] name. So instead of this:
with rws as ( select level rn from dual connect by level <= 5 ),
data as ( select e.*, loc.location_id = (
select location_id
from locations order by dbms_random.value()
fetch first 1 row only
),
round (dbms_random.value(1,5)
) n from employees e )
you would do this:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
(
select location_id
from locations
order by dbms_random.value()
fetch first 1 row only
) as location_id,
round (dbms_random.value(1,5)) as n
from employees e
)
db<>fiddle
But yes, you'll get the same location_id for each row, which probably isn't what you want.
There are probably better ways to avoid it (or to approach whatever you're actually trying to achieve) but one option is to force the subquery to be correlated by adding something like:
where location_id != -1 * e.employee_id
db<>fiddle
although that might be expensive. It's probably worth asking a new question about that specific aspect.
I am getting the same location_id for every employee_id, which I don't want either.
The subquery is in the wrong place then; move it to the main query, and correlate against both ID and n:
with rws as (
select level rn
from dual
connect by level <= 5
),
data as (
select e.*,
round (dbms_random.value(1,5)) as n
from employees e
)
select d.employee_id,
d.emp_name,
(
select location_id
from locations
where location_id != -1 * d.employee_id * d.n
order by dbms_random.value()
fetch first 1 row only
) as location_id,
trunc (sysdate) + dbms_random.value (0, 5) AS random_date
from rws r
join data d on r.rn <= d.n
order by d.employee_id;
db<>fiddle
Or move the location part to a new CTE, I suppose, with its own row number; and join that on one of your other generated values.
Current Scenario => We have a query which we are running on our prod cluster.
this query selects just 3 fields from a join between 1 table and ( nested way of join )another huge table and then performs a groupby at the end but runs for 2 hours on production and it hits one huge table in that join
Query :
INSERT OVERWRITE TABLE mstr_wrk.final_acct_data
SELECT
a0,
a1,
a2,
a3
FROM
(
SELECT
t1.a0 as a0
FROM
(
SELECT
t1.a0 as a0
FROM
(
SELECT
CAST(t1.acct_id AS STRING) as a0
FROM
mstr_wrk.cust_xref t1
)
t1
GROUP BY
t1.a0
)
t1
)
tab1
RIGHT OUTER JOIN
(
SELECT
a0,
a1,
a2
FROM
(
SELECT
(
CASE
WHEN
1 = t1.a1
THEN
t1.a0
ELSE
CAST(NULL AS TIMESTAMP)
END
) as a0, UDFcalldate('TRUNC', UDFcalldate('ADD_TO_DATE',
(
CASE
WHEN
1 = t1.a1
THEN
t1.a0
ELSE
CAST(NULL AS TIMESTAMP)
END
)
, 'D', - 1), 'DD') as a1
FROM
(
SELECT
MAX(t1.a0) as a0,
MAX(t1.a1) as a1
FROM
(
SELECT
load_audit.run_ts as a0,
1 as a1
FROM
mstr_wrk.load_audit
WHERE
val_name = 'card_stg'
)
t1
)
t1
)
tab4
JOIN
(
SELECT
CAST(t1.acct_cd AS STRING) as a0,
CAST(t1.h_acct_cd AS STRING) as a1,
CAST(t1.acct_num AS STRING) as a2,
CAST(t1.load_dt AS TIMESTAMP) as a3,
t1.ts as a4
FROM
mstr_work.acct_crd t1
)
tab3
WHERE
(
tab4.a0 < tab3.a4
)
AND
(
tab4.a1 <= tab3.a3
)
)
tab2
ON (tab1.a0 = tab2.a1)
WHERE
1 =
(
CASE
WHEN
tab1.a0 IS NULL
THEN
1
ELSE
0
END
)
GROUP BY
tab2.a0, tab2.a1, tab2.a2
What I tried -
I tried to enable CBO and vectorization along with ppd but no luck
I dont see any small table here so cant try map side join
One of the joins, which looks like a cross join can be translated to inner join
but is there anyway i can try CTE here
Request -
Kindly guide how can i fix it
is there any better way to rewrite this query.
I have a table called ‘MainTable’ with following data
Another table called ‘ChildTable’ with following data (foreighn key Number)
Now I want to fetch those records from ‘ChildTable’ if there exists at least one ‘S’ status.
But if any other record for this number id ‘R’ then I don’t want to fetch it
Something like this-
I tried following
Select m.Number, c.Status from MainTable m, ChildTable c
where EXISTS (SELECT NULL
FROM ChildTable c2
WHERE c2.status =’S’ and c2.status <> ‘R’
AND c2.number = m.number)
But here I am getting record having ‘R’ status also, what I am doing wrong?
You can try something like this
select num, status
from
(select id, num, status,
sum(decode(status, 'R', 1, 0)) over (partition by num) Rs,
sum(decode(status, 'S', 1, 0)) over (partition by num) Ss
from child_table) t
where t.Rs = 0 and t.Ss >= 1
-- and status = 'S'
Here is a sqlfiddle demo
The child records with 'R' might be associated with a maintable record that also has another child record with status 'S' -- that is what your query is asking for.
Select
m.Number,
c.Status
from MainTable m
join ChildTable c on c.number = m.number
where EXISTS (
SELECT NULL
FROM ChildTable c2
WHERE c2.status =’S’
AND c2.number = m.number) and
NOT EXISTS (
SELECT NULL
FROM ChildTable c2
WHERE c2.status =’R’
AND c2.number = m.number)
WITH ChildrenWithS AS (
SELECT Number
FROM ChildTable
WHERE Status = 'S'
)
,ChildrenWithR AS (
SELECT Number
FROM ChildTable
WHERE Status = 'R'
)
SELECT MaintTable.Number
,ChildTable.Status
FROM MainTable
INNER JOIN ChildTable
ON MainTable.Number = ChildTable.Number
WHERE MainTable.Number IN (SELECT Number FROM ChildrenWithS)
AND MainTable.Number NOT IN (SELECT Number FROM ChildrenWithR)
I have these tables:
Products, Articles, Product_Articles
Lets say, product_ids are: p1 , p2 article_ids are: a1 , a2 , a3
product_articles is:
(p1,a1)
(p1,a2)
(p2,a1)
(p2,a1)
(p2,a2)
(p2,a3)
How to query for product_id, which has only a1,a2, nothing less, nothing more?
UPDATED Try
SELECT p.*
FROM products p JOIN
(
SELECT product_id
FROM product_articles
GROUP BY product_id
HAVING COUNT(*) = SUM(CASE WHEN article_id IN (1, 2) THEN 1 ELSE 0 END)
AND SUM(CASE WHEN article_id IN (1, 2) THEN 1 ELSE 0 END) = 2
) q ON p.product_id = q.product_id
or
SELECT p.*
FROM products p JOIN
(
SELECT product_id, COUNT(*) a_count
FROM product_articles
WHERE article_id IN (1, 2)
GROUP BY product_id
HAVING COUNT(*) = 2
) a ON p.product_id = a.product_id JOIN
(
SELECT product_id, COUNT(*) total_count
FROM product_articles
GROUP BY product_id
) b ON p.product_id = b.product_id
WHERE a.a_count = b.total_count
Here is SQLFiddle demo for both queries
This is an example of a "set-within-sets" subquery. I advocate using aggregation with a having clause for the logic, because this is the most general way to express the relationships.
The idea is that you can count the appearance of the articles within a product (in this case) in a way similar to using a where statement. The code is a bit more complex, but it offers flexibility. In your case, this would be:
select pa.product_id
from product_articles pa
group by pa.product_id
having sum(case when pa.article_id = 'a1' then 1 else 0 end) > 0 and
sum(case when pa.article_id = 'a2' then 1 else 0 end) > 0 and
sum(case when pa.article_id not in ('a1', 'a2') then 1 else 0 end) = 0;
The first two clauses count the appearance of the two articles, making sure that there is at least one occurrence of each. The last counts the number of rows without those two articles, making sure there are none.
You can see how this easily generalizes to more articles. Or to queries where you have "a1" and "a2" but not "a3". Or where you have three of four of specific articles, and so on.
I believe this can be done entirely using relational joins, as follows:
SELECT DISTINCT pa1.PRODUCT_ID
FROM PRODUCT_ARTICLES pa1
INNER JOIN PRODUCT_ARTICLES pa2
ON (pa2.PRODUCT_ID = pa1.PRODUCT_ID)
LEFT OUTER JOIN (SELECT *
FROM PRODUCT_ARTICLES
WHERE ARTICLE_ID NOT IN (1, 2)) pa3
ON (pa3.PRODUCT_ID = pa1.PRODUCT_ID)
WHERE pa1.ARTICLE_ID = 1 AND
pa2.ARTICLE_ID = 2 AND
pa3.PRODUCT_ID IS NULL
SQLFiddle here.
The inner join looks for products associated with the articles we care about (articles 1 and 2 - produces product 1 and 2). The left outer looks for products associated with articles we don't care about (anything article except 1 and 2) and then only accepts products which don't have any unwanted articles (i.e. pa3.PRODUCT_ID IS NULL, indicating that no row from pa3 was joined in).
I have identified a way to get fast paged results from the database using CTEs and the Row_Number function, as follows...
DECLARE #PageSize INT = 1
DECLARE #PageNumber INT = 2
DECLARE #Customer TABLE (
ID INT IDENTITY(1, 1),
Name VARCHAR(10),
age INT,
employed BIT)
INSERT INTO #Customer
(name,age,employed)
SELECT 'bob',21,1
UNION ALL
SELECT 'fred',33,1
UNION ALL
SELECT 'joe',29,1
UNION ALL
SELECT 'sam',16,1
UNION ALL
SELECT 'arthur',17,0;
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row
FROM #Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total = ( SELECT
Count( id )
FROM cteCustomers )
FROM cteCustomers
INNER JOIN #Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( #PageSize * #PageNumber - 1 ) AND ( #PageSize * ( #PageNumber ) )
ORDER BY row ASC
Using this technique the returned results is really really fast even on complex joins and filters.
To perform paging I need to know the Total Rows returned by the full CTE. I have "Bodged" this by putting a column that holds it
Total = ( SELECT
Count( id )
FROM cteCustomers )
Is there a better way to return the total in a different result set without bodging it into a column? Because it's a CTE I can't seem to get it into a second result set.
Without using a temp table first, I'd use a CROSS JOIN to reduce the risk of row by row evaluation on the COUNT
To get total row, this needs to happen separately to the WHERE
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row
FROM #Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total
FROM cteCustomers
INNER JOIN #Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
CROSS JOIN
(SELECT Count( *) AS Total FROM cteCustomers ) foo
WHERE row BETWEEN ( #PageSize * #PageNumber - 1 ) AND ( #PageSize * ( #PageNumber ) )
ORDER BY row ASC
However, this isn't guaranteed to give accurate results as demonstrated here:
can I get count() and rows from one sql query in sql server?
Edit: after a few comments.
How to avoid a CROSS JOIN
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row,
COUNT(*) OVER () AS Total --the magic for this edit
FROM #Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total
FROM cteCustomers
INNER JOIN #Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( #PageSize * #PageNumber - 1 ) AND ( #PageSize * ( #PageNumber ) )
ORDER BY row ASC
Note: YMMV for performance depending on 2005 or 2008, Service pack etc
Edit 2:
SQL Server Central shows another technique where you have reverse ROW_NUMBER. Looks useful
#Digiguru
OMG, this really is the wholy grail!
WITH cteCustomers
AS ( SELECT id,
Row_Number() OVER(ORDER BY Age DESC) AS Row,
Row_Number() OVER(ORDER BY id ASC)
+ Row_Number() OVER(ORDER BY id DESC) - 1 AS Total /*<- voodoo here*/
FROM #Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT name, age, Total
/*This is where I choose the columns I want to read, it returns really fast!*/
FROM cteCustomers
INNER JOIN #Customer cust
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( #PageSize * #PageNumber - 1 ) AND ( #PageSize * ( #PageNumber ) )
ORDER BY row ASC
So obvious now.