I have multiple tables, each with FK relationships that connect them to one another. I need to create a pivot table using details out of some of the tables.
Region Table
Region_ID|Region_Description
State Table
State_ID|State_Description|Region_ID_FK
Order Table
Order_ID|Order_Date|State_ID_FK
Category Table
Category_ID|Category|Description|Order_ID_FK
I am joining all the tables using a natural join, based on the FKs.
I need to determine how many orders are in each category for each region.
The resulting table should look like this:
Category|Region1|Region2|Region3|Total
Sporting|1|0|3|4
ETC|0|2|1|3
SELECT c.Category,
       COUNT( CASE r.Region_ID WHEN 1 THEN 1 ELSE NULL END ) AS Region1,
       COUNT( CASE r.Region_ID WHEN 2 THEN 1 ELSE NULL END ) AS Region2,
       COUNT( CASE r.Region_ID WHEN 3 THEN 1 ELSE NULL END ) AS Region3,
       COUNT( CASE r.Region_ID WHEN 4 THEN 1 ELSE NULL END ) AS Region4,
       COUNT(*) AS Total  -- every joined row is one order in this category
FROM REGION r
INNER JOIN STATE s
  ON (r.Region_ID = s.Region_ID_FK)
INNER JOIN "ORDER" o  -- ORDER is a reserved word: quote the name or rename the table
  ON (s.State_ID = o.State_ID_FK)
INNER JOIN CATEGORY c
  ON (o.Order_ID = c.Order_ID_FK)
GROUP BY c.Category
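To check the conditional-aggregation pattern against the desired output, here is a minimal sketch that runs it on made-up rows. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the real database, names the order table `orders` to sidestep the reserved word, and uses hypothetical sample data chosen to reproduce the two rows shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the column layout comes from the question.
conn.executescript("""
CREATE TABLE region   (region_id INTEGER, region_description TEXT);
CREATE TABLE state    (state_id INTEGER, state_description TEXT, region_id_fk INTEGER);
CREATE TABLE orders   (order_id INTEGER, order_date TEXT, state_id_fk INTEGER);
CREATE TABLE category (category_id INTEGER, category TEXT, description TEXT, order_id_fk INTEGER);
INSERT INTO region VALUES (1, 'R1'), (2, 'R2'), (3, 'R3');
INSERT INTO state  VALUES (10, 'S1', 1), (20, 'S2', 2), (30, 'S3', 3);
INSERT INTO orders VALUES (100, '2017-01-01', 10), (101, '2017-01-02', 30),
                          (102, '2017-01-03', 30), (103, '2017-01-04', 30),
                          (104, '2017-01-05', 20), (105, '2017-01-06', 20),
                          (106, '2017-01-07', 30);
INSERT INTO category VALUES (1, 'Sporting', '', 100), (2, 'Sporting', '', 101),
                            (3, 'Sporting', '', 102), (4, 'Sporting', '', 103),
                            (5, 'ETC', '', 104), (6, 'ETC', '', 105),
                            (7, 'ETC', '', 106);
""")

# One COUNT(CASE ...) per region column; COUNT ignores the NULLs that
# CASE yields for rows belonging to other regions.
pivot_rows = conn.execute("""
SELECT c.category,
       COUNT(CASE r.region_id WHEN 1 THEN 1 END) AS region1,
       COUNT(CASE r.region_id WHEN 2 THEN 1 END) AS region2,
       COUNT(CASE r.region_id WHEN 3 THEN 1 END) AS region3,
       COUNT(*) AS total
FROM region r
JOIN state s    ON r.region_id = s.region_id_fk
JOIN orders o   ON s.state_id  = o.state_id_fk
JOIN category c ON o.order_id  = c.order_id_fk
GROUP BY c.category
ORDER BY c.category DESC
""").fetchall()

for row in pivot_rows:
    print(row)
```

The region columns are hard-coded, so a new region means a new COUNT(CASE ...) line; the pattern itself does not discover regions dynamically.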
I have two tables in Oracle: the first has 100 users and the second has 100,000 records. I want to distribute the records equally among the users.
Instead of writing repeated updates with rownum <= 1000 to distribute the data, I want to write a MERGE statement that divides the records equally among the 100 users.
Table 1
column A Column B column c
1 Pre 90008765
2 Pre 90008766 and so on like this
Table 2
column a Column B column C Column d
1 null null null
2 null null null
And so on; this table will have 100,000 records.
Column a is common to the two tables, so it can be used in the join condition. Please guide me with a MERGE query.
If I correctly understand the words "write merge statement that can divide equal number of records between 100 users", you want this:
merge into table2 tgt
using (
  select tb.rwd, ta.a
  from (select rownum rn, a, b, c, count(1) over () cnt from table1) ta
  join (select rowid rwd, rownum rn, a, b, c, d from table2) tb
    on mod(ta.rn, cnt) = mod(tb.rn, cnt)) src
on (tgt.rowid = src.rwd)
when matched then update set a = src.a
dbfiddle
This statement assigns rows from T1 to rows in T2 in sequence 1-2-3-...-1-2-3-..., using function mod(). Of course you can update other columns if you need, not only A.
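The round-robin idea behind mod() can be tried on a small scale. The sketch below is an assumption-laden stand-in, not the Oracle statement: SQLite has no MERGE, so it uses an UPDATE with a correlated subquery, a hypothetical `id` column plays the role of rownum/rowid, and the tables are cut down to 3 users and 9 records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced data: 3 users and 9 records instead of 100 / 100,000.
conn.executescript("""
CREATE TABLE table1 (a INTEGER);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, a INTEGER);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 (id) VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9);
""")

# Number the users 1..cnt, then give each record the user whose number has
# the same remainder modulo cnt as the record's own sequence number.
# Window functions need SQLite 3.25 or newer.
conn.execute("""
UPDATE table2
SET a = (
    SELECT ta.a
    FROM (SELECT a,
                 ROW_NUMBER() OVER (ORDER BY a) AS rn,
                 COUNT(*) OVER () AS cnt
          FROM table1) ta
    WHERE ta.rn % ta.cnt = table2.id % ta.cnt
)
""")

# How many records each user received.
counts = dict(conn.execute("SELECT a, COUNT(*) FROM table2 GROUP BY a"))
print(counts)
```

When the record count is not a multiple of the user count, some users receive one record more than others, which is as even as a split can get.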
I have a question concerning Hive. Let me explain the scenario:
I am using a Hive action on Oozie; I have a query that does successive LEFT JOINs on different tables;
The total number of rows to be inserted is about 35 million;
At first the job crashed for lack of memory, so I set "set hive.auto.convert.join=false". The query then ran to completion, but it took 4 hours;
I tried reordering the LEFT JOINs to put the large tables at the end, with the same result: about 4 hours to execute;
Here is what the query looks like:
INSERT OVERWRITE TABLE final_table
SELECT
T1.Id,
T1.some_field_name,
T1.another_filed_name,
T2.also_another_filed_name,
FROM table1 T1
LEFT JOIN table2 T2 ON ( T2.Id = T1.Id ) -- T2 is the smallest table
LEFT JOIN table3 T3 ON ( T3.Id = T1.Id )
LEFT JOIN table4 T4 ON ( T4.Id = T1.Id ) -- T4 is the biggest table
So, knowing the structure of the query, is there a way to rewrite it so that I can avoid so many JOINs?
Thanks in advance
PS: Even vectorization gave me the same timing
Too long for a comment, will be deleted later.
(1) Your current query won't compile.
(2) You are not selecting anything from T3 and T4, which makes no sense.
(3) Changing the order of the tables is not likely to have any impact with the cost-based optimizer.
(4) Basically I would suggest collecting statistics on the tables, specifically on the id columns, but in your case I have a feeling that id is not unique in more than one table.
Add to your post the result of the following query:
select *
, case when cnt_1 = 0 then 1 else cnt_1 end
* case when cnt_2 = 0 then 1 else cnt_2 end
* case when cnt_3 = 0 then 1 else cnt_3 end
* case when cnt_4 = 0 then 1 else cnt_4 end as product
from (select id
,count(case when tab = 1 then 1 end) as cnt_1
,count(case when tab = 2 then 1 end) as cnt_2
,count(case when tab = 3 then 1 end) as cnt_3
,count(case when tab = 4 then 1 end) as cnt_4
from ( select 1 as tab,id from table1
union all select 2 as tab,id from table2
union all select 3 as tab,id from table3
union all select 4 as tab,id from table4
) t
group by id
having greatest (cnt_1,cnt_2,cnt_3,cnt_4) >= 10
) t
order by product desc
limit 10
;
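Why non-unique ids matter (point 4) can be seen on a toy case. The sketch below assumes SQLite via Python's sqlite3 rather than Hive, with hypothetical tables holding a single id repeated 1, 2, 3 and 4 times; the chained LEFT JOINs then return the product of those counts, which is what the `product` column in the diagnostic query estimates per id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: id 1 appears once, twice, three and four times.
conn.executescript("""
CREATE TABLE table1 (id INTEGER);
CREATE TABLE table2 (id INTEGER);
CREATE TABLE table3 (id INTEGER);
CREATE TABLE table4 (id INTEGER);
INSERT INTO table1 VALUES (1);
INSERT INTO table2 VALUES (1), (1);
INSERT INTO table3 VALUES (1), (1), (1);
INSERT INTO table4 VALUES (1), (1), (1), (1);
""")

# Every duplicate on one side multiplies the rows coming from the other side.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM table1 t1
LEFT JOIN table2 t2 ON t2.id = t1.id
LEFT JOIN table3 t3 ON t3.id = t1.id
LEFT JOIN table4 t4 ON t4.id = t1.id
""").fetchone()[0]

print(joined_rows)  # 1 * 2 * 3 * 4 = 24
```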
I need help with my SQL query. I have two tables that I need to join using a LEFT OUTER JOIN, and then I need to create a database view over that join. If I query the join for name A, I need to get A's latest brand, "AP".
Table 1
ID name address
-----------------------
1 A ATL
2 B ATL
TABLE 2
ID PER_ID brand DATEE
--------------------------------------------
1 1 MS 5/19/17:1:00pm
2 1 XB 5/19/17:1:05pm
3 1 AP 5/19/17:2:00pm
4 2 RO 5/19/17:3:00pm
5 2 WE 5/19/17:4:00pm
I tried query a, which returns the correct result, but I get Problem 1 when I try to build the database view on top of the join. I tried query b, but when I query my view in Oracle SQL Developer I still get all the rows rather than only the latest.
query a:
SELECT * from table_1
left outer join table_2 on table_1.ID = Table_2.PER_ID
AND table_2.DATEE = (SELECT MAX(DATEE) from table_2 z where z.PER_ID = table_2.PER_ID)
Problem 1
Error report -
ORA-01799: a column may not be outer-joined to a subquery
01799. 00000 - "a column may not be outer-joined to a subquery"
*Cause: <expression>(+) <relop> (<subquery>) is not allowed.
*Action: Either remove the (+) or make a view out of the subquery.
In V6 and before, the (+) was just ignored in this case.
query b:
SELECT * from table_1
left outer join(SELECT PER_ID,brand, max(DATEE) from table_2 group by brand,PER_ID) t2 on table_1.ID = t2.PER_ID
Use row_number():
select t1.id, t1.name, t1.address, t2.id as t2_id, t2.brand, t2.datee
from table_1 t1 left outer join
     (select t2.*,
             row_number() over (partition by per_id order by datee desc) as seqnum
      from table_2 t2
     ) t2
     on t1.ID = t2.PER_ID and t2.seqnum = 1;
When defining a view, you should be in the habit of listing the columns explicitly.
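As a sketch of both points (the row_number() filter and an explicit column list on the view), the following runs on the question's sample rows. It assumes SQLite via Python's sqlite3 instead of Oracle, stores the timestamps as ISO strings so that they sort correctly, and uses a hypothetical view name, `latest_brand`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; timestamps rewritten as ISO strings (assumption).
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, name TEXT, address TEXT);
CREATE TABLE table_2 (id INTEGER, per_id INTEGER, brand TEXT, datee TEXT);
INSERT INTO table_1 VALUES (1, 'A', 'ATL'), (2, 'B', 'ATL');
INSERT INTO table_2 VALUES
    (1, 1, 'MS', '2017-05-19 13:00'),
    (2, 1, 'XB', '2017-05-19 13:05'),
    (3, 1, 'AP', '2017-05-19 14:00'),
    (4, 2, 'RO', '2017-05-19 15:00'),
    (5, 2, 'WE', '2017-05-19 16:00');

-- The view lists its columns explicitly instead of relying on SELECT *.
CREATE VIEW latest_brand (id, name, address, brand, datee) AS
SELECT t1.id, t1.name, t1.address, t2.brand, t2.datee
FROM table_1 t1
LEFT JOIN (SELECT t.*,
                  ROW_NUMBER() OVER (PARTITION BY per_id
                                     ORDER BY datee DESC) AS seqnum
           FROM table_2 t) t2
  ON t1.id = t2.per_id AND t2.seqnum = 1;
""")

# Each person appears once, paired with the most recent brand.
latest = conn.execute(
    "SELECT name, brand FROM latest_brand ORDER BY name"
).fetchall()
print(latest)
```

Keeping `seqnum = 1` in the ON clause rather than in a WHERE clause preserves the outer join, so a person with no rows in table_2 would still appear with a NULL brand.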
I have 2 tables on my Oracle DB
One with a product list
PRODUCT_ID - PRODUCT_NAME - PRODUCT_PRICE
1 P_1 50
2 P_2 60
3 P_3 70
4 P_4 80
And one with the orders
CLIENT_ID - PRODUCT_ID - ORDER_PRICE
1 1 50
2 3 60
3 2 70
4 2 70
I need a query that returns the product_list table ordered by how frequently each Product_id appears in the orders table. In this case the product with ID 2 must be first on the list.
I have found some examples, but I can't find one that works for this case.
You can use subquery for aggregation on orders table to find count for each product id and then left join it with the product_list table to use the calculated count for ordering.
select p.*
from product_list p
left join (
select product_id,
count(*) as cnt
from orders
group by product_id
) o on p.product_id = o.product_id
order by o.cnt desc nulls last;
LEFT JOIN is used because some products may have no orders, and we still need an order count for every product.
GROUP BY is used because we use the aggregate count() to find the number of orders for a given product.
ORDER BY ... DESC sorts the products from the highest order count to the lowest. However, when ties exist, the order of the tied rows is undefined because no second-level sort key is given. We could add Product_ID as a second key so that tied products are sorted low to high.
SELECT PL.Product_ID, PL.Product_Name, PL.Product_Price, count(O.Product_ID) cnt
FROM Product_List PL
LEFT JOIN Orders O
  on O.Product_ID = PL.Product_ID
GROUP BY PL.Product_ID, PL.Product_Name, PL.Product_Price
ORDER BY cnt Desc
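A minimal sketch of the join-and-count approach with the tie-breaker added, run on the question's sample rows. It assumes SQLite via Python's sqlite3 in place of Oracle; the second sort key, `product_id`, is the suggested addition for ties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question.
conn.executescript("""
CREATE TABLE product_list (product_id INTEGER, product_name TEXT, product_price INTEGER);
CREATE TABLE orders (client_id INTEGER, product_id INTEGER, order_price INTEGER);
INSERT INTO product_list VALUES (1, 'P_1', 50), (2, 'P_2', 60), (3, 'P_3', 70), (4, 'P_4', 80);
INSERT INTO orders VALUES (1, 1, 50), (2, 3, 60), (3, 2, 70), (4, 2, 70);
""")

# COUNT(o.product_id) counts only matched order rows, so a product with
# no orders gets 0 instead of NULL and sorts last without NULLS LAST.
ranked = conn.execute("""
SELECT pl.product_id, pl.product_name, COUNT(o.product_id) AS cnt
FROM product_list pl
LEFT JOIN orders o ON o.product_id = pl.product_id
GROUP BY pl.product_id, pl.product_name
ORDER BY cnt DESC, pl.product_id
""").fetchall()

for row in ranked:
    print(row)
```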
I have these tables:
Products, Articles, Product_Articles
Let's say the product_ids are p1, p2 and the article_ids are a1, a2, a3.
product_articles is:
(p1,a1)
(p1,a2)
(p2,a1)
(p2,a1)
(p2,a2)
(p2,a3)
How can I query for the product_id that has only a1 and a2, nothing less, nothing more?
UPDATED Try
SELECT p.*
FROM products p JOIN
(
SELECT product_id
FROM product_articles
GROUP BY product_id
HAVING COUNT(*) = SUM(CASE WHEN article_id IN (1, 2) THEN 1 ELSE 0 END)
AND SUM(CASE WHEN article_id IN (1, 2) THEN 1 ELSE 0 END) = 2
) q ON p.product_id = q.product_id
or
SELECT p.*
FROM products p JOIN
(
SELECT product_id, COUNT(*) a_count
FROM product_articles
WHERE article_id IN (1, 2)
GROUP BY product_id
HAVING COUNT(*) = 2
) a ON p.product_id = a.product_id JOIN
(
SELECT product_id, COUNT(*) total_count
FROM product_articles
GROUP BY product_id
) b ON p.product_id = b.product_id
WHERE a.a_count = b.total_count
Here is SQLFiddle demo for both queries
This is an example of a "set-within-sets" subquery. I advocate using aggregation with a having clause for the logic, because this is the most general way to express the relationships.
The idea is that you can count the appearances of the articles within a product (in this case) in a way similar to filtering with a where clause. The code is a bit more complex, but it offers flexibility. In your case, this would be:
select pa.product_id
from product_articles pa
group by pa.product_id
having sum(case when pa.article_id = 'a1' then 1 else 0 end) > 0 and
sum(case when pa.article_id = 'a2' then 1 else 0 end) > 0 and
sum(case when pa.article_id not in ('a1', 'a2') then 1 else 0 end) = 0;
The first two clauses count the appearance of the two articles, making sure that there is at least one occurrence of each. The last counts the number of rows without those two articles, making sure there are none.
You can see how this easily generalizes to more articles. Or to queries where you have "a1" and "a2" but not "a3". Or where you have three of four specific articles, and so on.
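To see the having-clause logic and one of its variations on the question's data, here is a short sketch. It assumes SQLite via Python's sqlite3 and stores the ids as the strings 'p1', 'a1' and so on, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows from the question, including the duplicate (p2, a1).
conn.executescript("""
CREATE TABLE product_articles (product_id TEXT, article_id TEXT);
INSERT INTO product_articles VALUES
    ('p1', 'a1'), ('p1', 'a2'),
    ('p2', 'a1'), ('p2', 'a1'), ('p2', 'a2'), ('p2', 'a3');
""")

# Exactly a1 and a2: both present, and nothing else present.
exact = [r[0] for r in conn.execute("""
SELECT pa.product_id
FROM product_articles pa
GROUP BY pa.product_id
HAVING SUM(CASE WHEN pa.article_id = 'a1' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN pa.article_id = 'a2' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN pa.article_id NOT IN ('a1', 'a2') THEN 1 ELSE 0 END) = 0
ORDER BY pa.product_id
""")]

# Variation: at least a1 and a2 (drop the third condition).
at_least = [r[0] for r in conn.execute("""
SELECT pa.product_id
FROM product_articles pa
GROUP BY pa.product_id
HAVING SUM(CASE WHEN pa.article_id = 'a1' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN pa.article_id = 'a2' THEN 1 ELSE 0 END) > 0
ORDER BY pa.product_id
""")]

print(exact, at_least)
```

Because each condition tests "greater than zero" rather than an exact count, the duplicate (p2, a1) row does not affect either result.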
I believe this can be done entirely using relational joins, as follows:
SELECT DISTINCT pa1.PRODUCT_ID
FROM PRODUCT_ARTICLES pa1
INNER JOIN PRODUCT_ARTICLES pa2
ON (pa2.PRODUCT_ID = pa1.PRODUCT_ID)
LEFT OUTER JOIN (SELECT *
FROM PRODUCT_ARTICLES
WHERE ARTICLE_ID NOT IN (1, 2)) pa3
ON (pa3.PRODUCT_ID = pa1.PRODUCT_ID)
WHERE pa1.ARTICLE_ID = 1 AND
pa2.ARTICLE_ID = 2 AND
pa3.PRODUCT_ID IS NULL
SQLFiddle here.
The inner join looks for products associated with both of the articles we care about (articles 1 and 2; this produces products 1 and 2). The left outer join looks for rows linking those products to articles we don't care about (any article other than 1 and 2), and the WHERE clause then accepts only products that have no unwanted articles (i.e. pa3.PRODUCT_ID IS NULL, indicating that no row from pa3 was joined in).
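The two stages described above can be checked separately. This sketch assumes SQLite via Python's sqlite3 and uses the question's string ids ('a1', 'a2') in place of the numeric 1 and 2 in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows from the question, including the duplicate (p2, a1).
conn.executescript("""
CREATE TABLE product_articles (product_id TEXT, article_id TEXT);
INSERT INTO product_articles VALUES
    ('p1', 'a1'), ('p1', 'a2'),
    ('p2', 'a1'), ('p2', 'a1'), ('p2', 'a2'), ('p2', 'a3');
""")

# Stage 1: the self inner join keeps products that have both a1 and a2.
has_both = [r[0] for r in conn.execute("""
SELECT DISTINCT pa1.product_id
FROM product_articles pa1
JOIN product_articles pa2 ON pa2.product_id = pa1.product_id
WHERE pa1.article_id = 'a1' AND pa2.article_id = 'a2'
ORDER BY pa1.product_id
""")]

# Stage 2: the left join to "unwanted" rows plus IS NULL (an anti-join)
# drops every product that has any article other than a1 or a2.
only_both = [r[0] for r in conn.execute("""
SELECT DISTINCT pa1.product_id
FROM product_articles pa1
JOIN product_articles pa2 ON pa2.product_id = pa1.product_id
LEFT JOIN (SELECT * FROM product_articles
           WHERE article_id NOT IN ('a1', 'a2')) pa3
       ON pa3.product_id = pa1.product_id
WHERE pa1.article_id = 'a1'
  AND pa2.article_id = 'a2'
  AND pa3.product_id IS NULL
ORDER BY pa1.product_id
""")]

print(has_both, only_both)
```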