Recursive sum of values in an hierarchical table in Oracle 10g - view

Assuming I have this table:
CREATE TABLE MY_EXAMPLE ( ID NUMBER , PARENT NUMBER , VALUE NUMBER );
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (1,null,100);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (2,1,50);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (3,null,0);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (4,2,1000);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (5,1,1);
|id |parent |value |
|1 |null |100 |
|2 |1 |50 |
|3 |null |0 |
|4 |2 |1000 |
|5 |1 |1 |
I need to create a view (which should perform well) with the same number of rows but giving the row's plus the children's value summed. Many levels are possible as well as many children.
|id |parent |value |
|1 |null |1151 | (sum of 1 + 2 + 4 + 5)
|2 |1 |1050 | (sum of 2 + 4)
|3 |null |0 | (only 3 because has no children)
|4 |2 |1000 | (only 4 because has no children)
|5 |1 |1 | (only 5 because has no children)
ps.: I tried something like this but it didn't work in Oracle 10g first because the keyword RECURSIVE is not supported and second because it won't allow recursive WITH ("forward or recursive reference of a query name in WITH clause is not allowed").
Also I couldn't figure out a way to do it with CONNECT BY that includes the id and parent columns and gives me the whole table (in my attempts I always had to use START WITH).

You will have to create a recursive function:
CREATE FUNCTION RECURSIVE_ADD(
ROOT_ID IN NUMBER)
RETURN NUMBER
IS
TOTAL NUMBER;
BEGIN
SELECT SUM(VALUE)
INTO TOTAL
FROM (
(
SELECT VALUE FROM MY_EXAMPLE WHERE ID = ROOT_ID
)
UNION
(
SELECT recursive_add(id) FROM my_example WHERE parent = root_id
));
RETURN total;
END;
select id, parent, value, RECURSIVE_ADD(id) from my_example;
Make sure you don't have a cycle in your data (for example, if you set the parent of 1 to 2) otherwise this will never terminate. There are other ways to do this in newer versions of Oracle, but this will work in 10g.

Related

oracle index in the underlying sql query

I have a table with 2 columns
item_name varchar2
brand varchar2
both of them have bitmap index
let's say I create a view for a specific brand and rename the column item_name ,something like that
create view my_brand as
select item_name as item from table x where brand='x'
We cannot create an index on a normal view but what is Oracle doing when issuing the underlying query of that view? Is the index of the item_name column being used if we write select item from my_brand where item='item1'?
thanks
The answer will be “it depends”. The index access path is certainly an option open to the optimizer; but remember that the optimizer makes a cost based decision. So essentially it will evaluate the cost of all the available plans and choose the one with the lowest cost.
Here is an example:
create table tab1 ( item_name varchar2(15), brand varchar2(15) );
insert into tab1
select 'Name '||to_char( rownum), 'Brand '||to_char(mod(rownum,10))
from dual
connect by rownum < 1000000
commit;
exec dbms_stats.gather_table_stats( user, 'TAB1' );
create bitmap index bm1 on tab1 ( item_name );
create bitmap index bm2 on tab1 ( brand );
create or replace view my_brand
as select item_name as item from tab1 where brand = 'Brand 1';
explain plan for
select item from my_brand where item = 'Name 1001'
select * from table( dbms_xplan.display )
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB1 | 1 | 20 | 3 (0)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
|* 3 | BITMAP INDEX SINGLE VALUE | BM1 | | | | |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BRAND"='Brand 1')
3 - access("ITEM_NAME"='Name 1001')
We can create an index on a normal view
No, you can't
SQL> create table as_idx_view (item_name varchar2(10), brand varchar2(10));
Table created.
SQL> create view as_view as select item_name item from as_idx_view where brand = 'X';
View created.
SQL> create index as_idx_view_idx1 on as_view (item);
create index as_idx_view_idx1 on as_view (item)
*
ERROR at line 1:
ORA-01702: a view is not appropriate here

Join two tables in HIVE with sub query

I need to get the cost of an item at a certain date and time. I have these two tables:
create table sales ( product_id int, items_sold int, date_loaded date );
create table product ( product_id int, description string, item_cost double, date_loaded date );
The product table is a history of each item. If the cost of an item today is $1.00 but the cost of that item yesterday was $0.99 I would have two records one for each day. When I load my sales data I need to reflect the cost of the item yesterday and not today's cost.
Here is the query I am trying to execute:
SELECT s.product_id, s.items_sold, p.description, s.items_sold * p.item_cost as total_cost FROM sales s, product p
WHERE
p.product_id = s.product_id and
p.date_loaded <= (
SELECT MAX(pp.date_loaded)
FROM product pp
WHERE
pp.product_id = s.product_id and
pp.date_loaded <= s.date_loaded
)
SALES TABLE:
|PRODUCT_ID |ITEMS_SOLD |DATE_LOADED |
|1 |4 |2016-06-30 |
|1 |5 |2016-07-01 |
|1 |6 |2016-07-02 |
|1 |3 |2016-07-03 |
PRODUCT TABLE:
|PRODUCT_ID |DESCRIPTION |ITEM_COST |DATE_LOADED |
|1 |ITEM A |0.99 |2016-06-20 |
|1 |ITEM A |1.00 |2016-07-02 |
I would expect to see this result:
|PRODUCT_ID |ITEMS_SOLD |DESCRIPTION |ITEM_COST |TOTAL_COST |
|1 |4 |ITEM A |0.99 |3.96 |
|1 |5 |ITEM A |0.99 |4.95 |
|1 |6 |ITEM A |1.00 |6.00 |
|1 |3 |ITEM A |1.00 |3.00 |
From everything I have read this form of a sub query is not allowed. So how can I accomplish this in HIVE?
It can be accomplished with CTE and Lag widow function
With result as(select PRODUCT_ID, DESCRIPTION, ITEM_COST , DATE_LOADED ,
LEAD(DATE_LOADED, 1,'2999-01-01')
OVER (ORDER BY DATE_LOADED) AS fromdate from PRODUCT )
SELECT s.product_id, s.items_sold, p.description, s.items_sold * p.item_cost
as total_cost FROM sales s join result p on s.product_id = p.product_id
where s.DATE_LOADED >= p.DATE_LOADED and s.DATE_LOADED < p.fromdate ;

How to select duplicate rows and group by two columns

I have the following table:
ID|user_id|group_id|subject |book_id
1| 2 |3 |history |1
2| 4 |3 |history |1
3| 5 |3 |art |2
4| 2 |3 |art |2
5| 1 |4 |sport |5
I would like to list all rows for group 3(id) that have duplicate rows with the same subject_id and book_id. The subject and book_id is what would determine the 2 or more rows to be duplicate.
I would like my distinct results to look like this:
|subject |book_id|
|history |1 |
|art |2 |
Using either query builder or eloquent
A SQL query to get the desired result may look
SELECT subject, book_id
FROM table1
WHERE group_id = 3
GROUP BY subject, book_id
HAVING COUNT(*) > 1
Here is a SQLFiddle demo
Now the same using the Laravel Query Builder
$duplicates = DB::table('table1')
->select('subject', 'book_id')
->where('group_id', 3)
->groupBy('subject', 'book_id')
->havingRaw('COUNT(*) > 1')
->get();

Table scan on internal tables

Assume I have 2 tables - TABLE-1 & TABLE-2 and each of the table has 1 million rows with 10 columns and index on col1..
Now I build a internal table on this 2 tables ( 1 + 1 = 2 million) rows,
select * from
(select col1, col2,....col10 from table-1
union all
select col1, col2,....col10 from table-2) x
Questions,
how will the internal table will be treated in Oracle since its a internal table..
1. Will the internal table will be treated as a table with index on col1?
2. Will this be captured in the Explain plan?
Yes and yes.
Oracle will effectively treat this inline view as a table. It can use predicate pushing to apply a filter on the inline view to the base tables, and potentially use an index. The explain plan will show this.
Tables, indexes, sample data, and statistics
create table table1(col1 number, col2 number, col3 number, col4 number);
create table table2(col1 number, col2 number, col3 number, col4 number);
create index table1_idx on table1(col1);
create index table2_idx on table2(col1);
insert into table1 select level, level, level, level
from dual connect by level <= 100000;
insert into table2 select level, level, level, level
from dual connect by level <= 100000;
commit;
begin
dbms_stats.gather_table_stats(user, 'TABLE1');
dbms_stats.gather_table_stats(user, 'TABLE2');
end;
/
Explain plan showing predicate pushing and index access
explain plan for
select * from
(
select col1, col2, col3, col4 from table1
union all
select col1, col2, col3, col4 from table2
)
where col1 = 1;
select * from table(dbms_xplan.display);
Plan hash value: 400235428
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 40 | 2 (0)| 00:00:01 |
| 1 | VIEW | | 2 | 40 | 2 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE1 | 1 | 20 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | TABLE1_IDX | 1 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE2 | 1 | 20 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | TABLE2_IDX | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("COL1"=1)
6 - access("COL1"=1)
Notice how the predicates happen before the VIEW, and both indexes are used. By default everything should work as well as can be expected.
Notes
This type of query structure is called an inline view. Although a physical table is not built, the phrase "internal tables" is a good way of thinking about how the query logically works. Ideally, an inline view would work exactly like a pre-built table with the same data. In reality there are some cases where things don't quit work that way. But in general you are definitely on the right path - build a large query by assembling small inline views, and assume that Oracle will optimize it correctly.
for your particular query no any index will be used, but I suppose you do some filtering, ie where x.col1 = ###, I'm not sure that oracle will be able to use table-1/table-2 indexes to filter, so I suggest you to put where statements inside "union query"

How can I get a COUNT(col) ... GROUP BY to use an index?

I've got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:
SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1
Unfortunately, this is resulting in a full table scan of the table, which takes upwards of a minute. Is there any way of getting oracle to use the index on the columns to return the results much faster?
EDIT:
more specifically, I'm running the following query:
SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner
and there is an index on SYS.OBJ$ (SYS.I_OBJ2) which indexes the owner# and name columns; I believe I should be able to use this index in the query, rather than a full table scan of SYS.OBJ$
I have had the chance to play around with this, and my previous comments regarding the NOT IN are a red herring in this case. The key thing is the presence of NULLs, or rather whether the indexed columns have NOT NULL constraints enforced.
This is going to depend on the version of the database you're using, because the optimizer gets smarter with each release. I'm using 11gR1 and the optimizer used the index in all cases except one: when both columns were null and I didn't include the NOT IN clause:
SQL> desc big_table
Name Null? Type
----------------------------------- ------ -------------------
ID NUMBER
COL1 NUMBER
COL2 VARCHAR2(30 CHAR)
COL3 DATE
COL4 NUMBER
Without the NOT IN clause...
SQL> explain plan for
2 select col4, count(col1) from big_table
3 group by col4
4 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31964 | 280K| | 7574 (2)| 00:01:31 |
| 1 | HASH GROUP BY | | 31964 | 280K| 45M| 7574 (2)| 00:01:31 |
| 2 | TABLE ACCESS FULL| BIG_TABLE | 2340K| 20M| | 4284 (1)| 00:00:52 |
----------------------------------------------------------------------------------------
9 rows selected.
SQL>
When I dobbed the NOT IN clause back in, the optimizer opted to use the index. Weird.
SQL> explain plan for
2 select col4, count(col1) from big_table
3 where col1 not in (12, 19)
4 group by col4
5 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31964 | 280K| | 5057 (3)| 00:01:01 |
| 1 | HASH GROUP BY | | 31964 | 280K| 45M| 5057 (3)| 00:01:01 |
|* 2 | INDEX FAST FULL SCAN| BIG_I2 | 2340K| 20M| | 1767 (2)| 00:00:22 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
2 - filter("COL1"<>12 AND "COL1"<>19)
14 rows selected.
SQL>
Just to repeat, in all other cases, as long as one of the indexed columns was declared not nill, the index was used to satisfy the query. This may not be true on earlier versions of Oracle, but it probably points the way forward.
you could use a hint http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm ,
but remember that using an index might not always result in faster execution.
(Just in case, are you sure it's doing a table scan and not an index scan?)
Try using COUNT(*) instead of COUNT(col2) (assuming this is appropriate for you problem, of course). Also, maybe try an index with just col1.
You are querying against oracle's fixed tables, since you've not stated which db vesion this is, I'll assume a recent one. Have the fixed tables been analyzed and have updated statistics? Have you tried your query using the rule base optimizer by the use of the /*+ rule */ hint. Often I've seen that queries against oracle's own fixed tables perform better when the rule base optimizer is used.

Resources