Oracle insert into an indexed table: loading 500 thousand rows takes longer than inserting 16 million rows

At first I tried a plain insert into the target table from the temporary table:
INSERT /*+ APPEND */ INTO RDW10DM.INV_ITEM_LW_DM
SELECT
*
FROM
RDW10PRD.TMP_MDS_RECLS_INV_ITEM_LW_DM
;
COMMIT;
It took only 17 min to load. The total row count in the temp table TMP_MDS_RECLS_INV_ITEM_LW_DM is 16491650.
Plan for Execution:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 16M| 1290M| 4927 |
| 1 | LOAD AS SELECT | | | | |
| 2 | TABLE ACCESS FULL | TMP_MDS_RECLS_INV_ITEM_LW_DM | 16M| 1290M| 4927 |
--------------------------------------------------------------------------------------
Note: cpu costing is off
Then I tried to load location-wise (one LOC_KEY at a time):
INSERT /*+ APPEND */ INTO RDW10DM.INV_ITEM_LW_DM
SELECT
*
FROM
RDW10PRD.TMP_MDS_RECLS_INV_ITEM_LW_DM
where LOC_KEY=222
;
COMMIT;
This time it took around 28 min to load. The row count in the temp table with the filter is 493465.
Plan for execution:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 492K| 38M| 4927 |
| 1 | LOAD AS SELECT | | | | |
|* 2 | TABLE ACCESS FULL | TMP_MDS_RECLS_INV_ITEM_LW_DM | 492K| 38M| 4927 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TMP_MDS_RECLS_INV_ITEM_LW_DM"."LOC_KEY"=222)
Note: cpu costing is off
Index in Target table:
Does anyone have any idea why this is happening?

My guess? The TMP table doesn't have an index.
Therefore, selecting all records and inserting them is faster than applying a filter to 16 million records.
As you can see, your second execution plan uses TABLE ACCESS FULL with a filter, which slows down the query. Try adding an index on TMP_MDS_RECLS_INV_ITEM_LW_DM(LOC_KEY). It should boost your query performance.
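A minimal sketch of that suggestion follows. The index name TMP_INV_ITEM_LOC_IDX is made up for illustration; the schema, table, and column come from the question. Whether the optimizer actually uses the index depends on how selective LOC_KEY is, so compare the plans before and after.

```sql
-- Hypothetical index name; table and column are taken from the question.
CREATE INDEX RDW10PRD.TMP_INV_ITEM_LOC_IDX
    ON RDW10PRD.TMP_MDS_RECLS_INV_ITEM_LW_DM (LOC_KEY);

-- Refresh optimizer statistics so the new index can be costed.
BEGIN
    DBMS_STATS.GATHER_TABLE_STATS(
        ownname => 'RDW10PRD',
        tabname => 'TMP_MDS_RECLS_INV_ITEM_LW_DM',
        cascade => TRUE);
END;
/
```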

Thank you, everyone, for your valuable thoughts.
I found the actual problem later. Since I have been doing frequent truncate-and-load cycles on the target table RDW10DM.INV_ITEM_LW_DM, its indexes had probably become fragmented.
So I reran the query after rebuilding the indexes and got the expected results.
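For anyone hitting the same thing, the rebuild can be sketched roughly like this. The index name INV_ITEM_LW_DM_IDX1 is hypothetical, since the actual index names on the target table are not listed above; the first query shows how to find them.

```sql
-- List the indexes that exist on the target table and their current status.
SELECT index_name, status
FROM   all_indexes
WHERE  owner      = 'RDW10DM'
AND    table_name = 'INV_ITEM_LW_DM';

-- Rebuild each index found above; INV_ITEM_LW_DM_IDX1 is a placeholder name.
ALTER INDEX RDW10DM.INV_ITEM_LW_DM_IDX1 REBUILD;
```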

Related

Oracle Function based index returning slowly

I have a table setup that contains some 640m records and I'm trying to create an index.
The manner in which I want to select records involves something like this:
index_i9(ORA_HASH(placard_bcd,128),event);
However, that still returns about 3-4 million records, and from my testing it takes a lengthy amount of time (~12 minutes or so).
Is this a bad idea as an index? I don't think getting 3-4m records should take that long.
Any ideas?
Edit (adding more info):
The table has a bunch of columns but I don't know if I need to list all of them:
table_a
container NOT NULL NUMBER(19),
placard_bcd NOT NULL VARCHAR2(30),
event NOT NULL VARCHAR(5),
bin_number NUMBER(3),
...
...
It takes about 12 minutes to return all of the records that match the index above, that is, all 3-4 million records.
The query used looks something like this:
select barcode, event, bin_number
from table_a
where ora_hash(barcode,128) = 105
and event in ('CLOS','PASG','BUILD');
The explain plan provided is this:
Plan hash value: 4185630329
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35074 | 4212K| 338 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE_A | 35074 | 4212K| 338 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | TABLE_A_I9 | 14030 | | 14 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(("EVENT_TYPE"='BUILD' OR "EVENT_TYPE"='CLOS' OR "EVENT_TYPE"='PASG') AND
ORA_HASH("PLACARD_BCD",128)=105)
Everything seems correct, but it still is taking a while to provide me with the records.

Optimizing the SORT MERGE join of a MERGE statement

Consider the problem of applying changes to an aggregate table. Rows that exist must be updated, while new rows must be inserted. My approach was as follows:
Insert all changes in a temporary table (100K at a time)
MERGE the temporary table into the main table (eventually reaching 100s of millions rows)
The SQL (with a SORT MERGE hint) looks as follows (nothing fancy):
merge /*+ USE_MERGE(t s) */
into F_SCREEN_INSTANCE t
using F_SCREEN_INSTANCE_BUF s
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID)
when matched then update set
t.ACTIVE_TIME_SUM = t.ACTIVE_TIME_SUM + s.ACTIVE_TIME_SUM,
t.IDLE_TIME_SUM = t.IDLE_TIME_SUM + s.IDLE_TIME_SUM
when not matched then insert values (
s.DAY_ID, s.PARTIAL_ID, s.ID, s.AGENT_USER_ID, s.COMPUTER_ID, s.RAW_APPLICATION_ID, s.APP_USER_ID, s.APPLICATION_ID, s.USER_ID, s.RAW_MODULE_ID, s.MODULE_ID, s.START_TIME, s.RAW_SCREEN_NAME, s.SCREEN_ID, s.SCREEN_TYPE, s.ACTIVE_TIME_SUM, s.IDLE_TIME_SUM)
The F_SCREEN_INSTANCE table has (DAY_ID, PARTIAL_ID) as a primary key and also is IOT (index organized table). This makes it an ideal candidate for a merge join: the rows are physically sorted by the lookup key.
So far so good. I've started a benchmark and the initial times looked good, 10s for one merge. But after about an hour, the merges were taking about 4 min with heavy tempdb usage (4GB per merge). The query plan below shows that F_SCREEN_INSTANCE is re-sorted before the merge, even though the table is ideally sorted already. And of course, as the table grows even more tempdb will be needed and the whole approach falls apart.
OK, so why re-sort the table? It turns out to be a limitation of the merge join implementation: the second table is always sorted.
If an index exists, then the database can avoid sorting the first data
set. However, the database always sorts the second data set,
regardless of indexes.
O...K, so then can I make the main table to be first and the buffer to be second? Nope, that's not possible either. No matter how I list the tables in the USE_MERGE hint, the source table is always first.
Finally, here is my question: Have I missed anything? Is it possible to make this SORT MERGE approach work?
Here are some more details addressing questions you might ask:
What Oracle version? 12c.
Have you tried HASH JOIN? Yes, it's bad, as expected. The main table needs to be scanned in order to build the hash table. It can't scale as F_SCREEN_INSTANCE grows.
Have you tried LOOP JOIN? Yes, it's also bad. Considering the size of the buffer table, 100K lookups into F_SCREEN_INSTANCE take unreasonably long. Merge times climbed to about 3 min very quickly.
All in all, the MERGE JOIN is conceptually the best access strategy, but the Oracle implementation seems to be severely crippled by re-sorting the target table.

Sort merge outer joins will always put the outer-joined table second regardless of the hints. Adding an extra inner-join allows control of the join order, and then ROWID can be used to join again to the large table. Hopefully two good joins will work better than one bad join.
Assumptions
This answer assumes that the sort merge join is the fastest join, and that the manual is correct that the second data set is always sorted. It would be difficult to test these assumptions without significantly more information about the data.
Sample Schema
Here are some similar tables, with fake statistics to make the optimizer think they have 500M rows and 100K rows.
create table F_SCREEN_INSTANCE(DAY_ID number, PARTIAL_ID number, ID number, AGENT_USER_ID number,COMPUTER_ID number, RAW_APPLICATION_ID number, APP_USER_ID number, APPLICATION_ID number, USER_ID number, RAW_MODULE_ID number,MODULE_ID number, START_TIME date, RAW_SCREEN_NAME varchar2(100), SCREEN_ID number, SCREEN_TYPE number, ACTIVE_TIME_SUM number, IDLE_TIME_SUM number,
constraint f_screen_instance_pk primary key (day_id, partial_id)
) organization index;
create table F_SCREEN_INSTANCE_BUF(DAY_ID number, PARTIAL_ID number, ID number, AGENT_USER_ID number,COMPUTER_ID number, RAW_APPLICATION_ID number, APP_USER_ID number,APPLICATION_ID number, USER_ID number, RAW_MODULE_ID number, MODULE_ID number, START_TIME date, RAW_SCREEN_NAME varchar2(100), SCREEN_ID number, SCREEN_TYPE number, ACTIVE_TIME_SUM number, IDLE_TIME_SUM number,
constraint f_screen_instance_buf_pk primary key (day_id, partial_id)
);
begin
dbms_stats.set_table_stats(user, 'F_SCREEN_INSTANCE', numrows => 500000000);
dbms_stats.set_table_stats(user, 'F_SCREEN_INSTANCE_BUF', numrows => 100000);
end;
/
The Problem
The desired join and join order can be achieved with the LEADING hint when an inner join is used. The smaller table, F_SCREEN_INSTANCE_BUF, is the second table.
explain plan for
select /*+ use_merge(t s) leading(t s) */ *
from f_screen_instance_buf s
join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID);
select * from table(dbms_xplan.display(format => '-predicate'));
Plan hash value: 563239985
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 19M| | 6898 (66)| 00:00:01 |
| 1 | MERGE JOIN | | 100K| 19M| | 6898 (66)| 00:00:01 |
| 2 | INDEX FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 4504 (100)| 00:00:01 |
| 3 | SORT JOIN | | 100K| 9765K| 26M| 2393 (1)| 00:00:01 |
| 4 | TABLE ACCESS FULL| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 34 (6)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
The LEADING hint does not work when changing to a left join.
explain plan for
select /*+ use_merge(t s) leading(t s) */ *
from f_screen_instance_buf s
left join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID);
select * from table(dbms_xplan.display(format => '-predicate'));
Plan hash value: 1472690071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 19M| | 16M (1)| 00:10:34 |
| 1 | MERGE JOIN OUTER | | 100K| 19M| | 16M (1)| 00:10:34 |
| 2 | TABLE ACCESS BY INDEX ROWID| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 826 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | F_SCREEN_INSTANCE_BUF_PK | 100K| | | 26 (0)| 00:00:01 |
| 4 | SORT JOIN | | 500M| 46G| 131G| 16M (1)| 00:10:34 |
| 5 | INDEX FAST FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 2703 (100)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------
This limitation is not documented as far as I can tell. I tried using the +outline setting of DBMS_XPLAN to see the full set of hints and then changed them around. But nothing I did could make the join order change for the LEFT JOIN version. Perhaps someone else can get this to work.
select * from table(dbms_xplan.display(format => '-predicate +outline'));
...
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_MERGE(#"SEL$0E991E55" "T"#"SEL$1")
LEADING(#"SEL$0E991E55" "S"#"SEL$1" "T"#"SEL$1")
INDEX_FFS(#"SEL$0E991E55" "T"#"SEL$1" ("F_SCREEN_INSTANCE"."DAY_ID" "F_SCREEN_INSTANCE"."PARTIAL_ID"))
INDEX(#"SEL$0E991E55" "S"#"SEL$1" ("F_SCREEN_INSTANCE_BUF"."DAY_ID"
"F_SCREEN_INSTANCE_BUF"."PARTIAL_ID"))
OUTLINE(#"SEL$9EC647DD")
OUTLINE(#"SEL$2")
MERGE(#"SEL$9EC647DD")
OUTLINE_LEAF(#"SEL$0E991E55")
ALL_ROWS
DB_VERSION('12.1.0.1')
OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Possible Solution
--#3: Join the large table to the smaller result set. This uses the largest table twice,
--but the plan can use the ROWID for a very quick join.
explain plan for
merge into F_SCREEN_INSTANCE t
using
(
--#2: Now get the missing rows with an outer join. Since the _BUF table is
--small I assume it does not make a big difference exactly how it is joined
--to the 100K result set.
--The hints NO_MERGE and NO_PUSH_PRED are required to keep the INNER_JOIN
--inline view intact.
select /*+ no_merge(inner_join) no_push_pred(inner_join) */ inner_join.*
from f_screen_instance_buf s
left join
(
--#1: Get 100K rows efficiently with an inner join.
--Note that the ROWID is retrieved here.
select /*+ use_merge(t s) leading(t s) */ s.*, s.rowid s_rowid
from f_screen_instance_buf s
join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID)
) inner_join
on (s.DAY_ID = inner_join.DAY_ID and s.PARTIAL_ID = inner_join.PARTIAL_ID)
) s
on (s.s_rowid = t.rowid)
when matched then update set
t.ACTIVE_TIME_SUM = t.ACTIVE_TIME_SUM + s.ACTIVE_TIME_SUM,
t.IDLE_TIME_SUM = t.IDLE_TIME_SUM + s.IDLE_TIME_SUM
when not matched then insert values (
s.DAY_ID, s.PARTIAL_ID, s.ID, s.AGENT_USER_ID, s.COMPUTER_ID, s.RAW_APPLICATION_ID, s.APP_USER_ID, s.APPLICATION_ID, s.USER_ID, s.RAW_MODULE_ID, s.MODULE_ID, s.START_TIME, s.RAW_SCREEN_NAME, s.SCREEN_ID, s.SCREEN_TYPE, s.ACTIVE_TIME_SUM, s.IDLE_TIME_SUM);
It ain't pretty, but at least it generates a plan with the large table first in the sort merge join.
select * from table(dbms_xplan.display);
Plan hash value: 1086560566
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 500G| 173T| | 5355K (43)| 00:03:30 |
| 1 | MERGE | F_SCREEN_INSTANCE | | | | | |
| 2 | VIEW | | | | | | |
|* 3 | HASH JOIN OUTER | | 500G| 179T| 29M| 5355K (43)| 00:03:30 |
|* 4 | HASH JOIN OUTER | | 100K| 28M| 3712K| 8663 (53)| 00:00:01 |
| 5 | INDEX FAST FULL SCAN| F_SCREEN_INSTANCE_BUF_PK | 100K| 2539K| | 9 (0)| 00:00:01 |
| 6 | VIEW | | 100K| 25M| | 6898 (66)| 00:00:01 |
| 7 | MERGE JOIN | | 100K| 12M| | 6898 (66)| 00:00:01 |
| 8 | INDEX FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 12G| | 4504 (100)| 00:00:01 |
|* 9 | SORT JOIN | | 100K| 9765K| 26M| 2393 (1)| 00:00:01 |
| 10 | TABLE ACCESS FULL| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 34 (6)| 00:00:01 |
| 11 | INDEX FAST FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 2703 (100)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("INNER_JOIN"."S_ROWID"=("T".ROWID(+)))
4 - access("S"."PARTIAL_ID"="INNER_JOIN"."PARTIAL_ID"(+) AND
"S"."DAY_ID"="INNER_JOIN"."DAY_ID"(+))
9 - access("S"."DAY_ID"="T"."DAY_ID" AND "S"."PARTIAL_ID"="T"."PARTIAL_ID")
filter("S"."PARTIAL_ID"="T"."PARTIAL_ID" AND "S"."DAY_ID"="T"."DAY_ID")

Improving SQL Exists scalability

Say we have two tables, TEST and TEST_CHILD, defined as follows:
CREATE TABLE TEST(id1 number PRIMARY KEY, word VARCHAR(50), numero number);
CREATE TABLE TEST_CHILD (id2 number references TEST(id1), word2 VARCHAR(50));
CREATE INDEX TEST_IDX ON TEST_CHILD(word2);
CREATE INDEX TEST_JOIN_IDX ON TEST_CHILD(id2);
insert into TEST SELECT ROWNUM,U1.USERNAME||U2.TABLE_NAME, LENGTH(U1.USERNAME) FROM ALL_USERS U1,ALL_TABLES U2;
INSERT INTO TEST_CHILD SELECT MOD(ROWNUM,15000)+1,U1.USER_ID||U2.TABLE_NAME FROM ALL_USERS U1,ALL_TABLES U2;
We would like to query to get rows from TEST table that satisfy some criteria in the child table, so we go for:
SELECT /*+FIRST_ROWS(10)*/* FROM TEST T WHERE EXISTS (SELECT NULL FROM TEST_CHILD TC WHERE word2 like 'string%' AND TC.id = T.id ) AND ROWNUM < 10;
We always want just the first 10 results, not any more at all. Therefore, we would like to get the same response time to read 10 results whether table has 10 matching values or 1,000,000; since it could get 10 distinct results from the child table and get the values on the parent table (or at least that is the plan that we would like). But when checking the actual execution plan we see:
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 54 | 5 (20)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 1 | 54 | 5 (20)| 00:00:01 |
| 4 | SORT UNIQUE | | 1 | 23 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| TEST_CHILD | 1 | 23 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | TEST_IDX | 1 | | 2 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | SYS_C005145 | 1 | | 0 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY INDEX ROWID | TEST | 1 | 31 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
6 - access("WORD2" LIKE 'string%')
filter("WORD2" LIKE 'string%')
7 - access("TC"."ID"="T"."ID")
There is a SORT UNIQUE under the STOPKEY, which as far as I know means that it reads all results from the child table and makes them distinct before finally selecting only the first 10, so the query is not as scalable as we would like it to be.
Is there any mistake in my example?
Is it possible to improve this execution plan so it scales better?

The SORT UNIQUE is going to find and sort all of the records from TEST_CHILD that match 'string%'; it is NOT going to read all rows from the child table. Your logic requires this. If you only picked the first 10 rows from TEST_CHILD that matched 'string%', and those 10 rows all had the same ID, then your final results from TEST would only have 1 row.
Anyway, your performance should be fine as long as 'string%' matches a relatively low number of rows in TEST_CHILD. If your situation is such that 'string%' often matches a HUGE record count on TEST_CHILD, there's not much you can do to make the SQL more performant given the current tables. In such a case, if this is a mission-critical SQL, with performance tied to your annual bonus, there's probably some fancy footwork you could do with MATERIALIZED VIEWs to, e.g., pre-compute 10 TEST rows for high-cardinality WORD2 values in TEST_CHILD.
One final thought - a "risky" solution, but one which should work if you don't have thousands of TEST_CHILD rows matching the same TEST row, would be the following:
SELECT *
FROM TEST
WHERE ID1 IN
(SELECT ID2
FROM TEST_CHILD
WHERE word2 like 'string%'
AND ROWNUM < 1000)
AND ROWNUM <10;
You can adjust 1000 up or down, of course, but if it's too low, you risk finding fewer than 10 distinct ID values, which would give you final results with fewer than 10 rows.

Oracle 11g how to estimate needed TEMP tablespace?

We do an initial bulk load of some tables (both, source and target are Oracle 11g). The process is as follows: 1. truncate, 2. drop indexes (the PK and a unique index), 3. bulk insert, 4. create indexes (again the PK and the unique index). Now I got the following error:
alter table TARGET_SCHEMA.MYBIGTABLE
add constraint PK_MYBIGTABLE primary key (MYBIGTABLE_PK)
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
So obviously the TEMP tablespace is too small for PK creation (FYI the table has 6 columns and about 2.2 billion records). So I did this:
explain plan for
select line_1,line_2,line_3,line_4,line_5,line_6,count(*) as cnt
from SOURCE_SCHEMA.MYBIGTABLE
group by line_1,line_2,line_3,line_4,line_5,line_6;
select * from table( dbms_xplan.display );
/*
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2274M| 63G| | 16M (2)| 00:05:06 |
| 1 | HASH GROUP BY | | 2274M| 63G| 102G| 16M (2)| 00:05:06 |
| 2 | TABLE ACCESS FULL| MYBIGTABLE | 2274M| 63G| | 744K (7)| 00:00:14 |
-----------------------------------------------------------------------------------------------
*/
Is this how to tell how much TEMP tablespace will be needed for PK creation (102 GB in my case)? Or would you make the estimate differently?
Additional: The PK only exists on the target system. But fair point, so I run your query on target PK:
explain plan for
select MYBIGTABLE_PK
from TARGET_SCHEMA.MYBIGTABLE
group by MYBIGTABLE_PK ;
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | HASH GROUP BY | | 1 | 13 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MYBIGTABLE | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
So how would I have to read this now?

This is a good question.
First, if you create the following primary key
alter table TARGET_SCHEMA.MYBIGTABLE
add constraint PK_MYBIGTABLE primary key (MYBIGTABLE_PK)
then you should query
explain plan for
select MYBIGTABLE_PK
from SOURCE_SCHEMA.MYBIGTABLE
group by MYBIGTABLE_PK;
to get an estimate (make sure you gather stats first: exec dbms_stats.gather_table_stats('SOURCE_SCHEMA','MYBIGTABLE')).
Second, you can query V$TEMPSEG_USAGE to see how many temp blocks were consumed before the error was raised, and V$SESSION_LONGOPS to see how much of the total process had finished.
The Oracle docs suggest creating a dedicated temp tablespace for the process so that it does not disturb any other operations.
Please post an edit if you find a more accurate solution.
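As a rough sketch of the monitoring step, the two views can be queried from a second session while the index build is running. This assumes the session has SELECT privileges on the V$ views; the BLOCKS figure is in database blocks, so multiply by the block size to get bytes.

```sql
-- Temp segments currently allocated, largest first.
SELECT username, tablespace, segtype, blocks, extents
FROM   v$tempseg_usage
ORDER  BY blocks DESC;

-- Progress of long-running operations that have not finished yet.
SELECT sid, opname, target, sofar, totalwork,
       ROUND(100 * sofar / totalwork, 1) AS pct_done,
       time_remaining
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork;
```

Dividing the blocks consumed so far by the fraction of work completed gives a crude extrapolation of the total TEMP requirement; treat it as an estimate only.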

Oracle explain plan over simple select performs multiple hash joins when multiple columns are indexed in a table

I am currently running into an issue with my Oracle instance. I have two simple select statements:
select * from dog_vets
and
select * from dog_statuses
and the following fiddle
My explain plan on dog_vets is as follows:
0 | Select Statement
1 | Table Access Full Scan dog_vets
My explain plan on dog_statuses is as follows:
ID | Operation            | Name                  | Rows | Bytes | Cost   | Time
0  | Select Statement     |                       | 20G  | 500M  | 100000 | 999:99:17
1  | View                 | index%_join_001       | 20G  | 500M  | 100000 | 999:99:17
2  | Hash Join            |                       |      |       |        |
3  | Hash Join            |                       |      |       |        |
4  | Index fast full scan | dog_statuses_check_up | 20G  | 500M  | 100000 | 32:15:00
5  | Index fast full scan | dog_statuses_sick     | 20G  | 500M  | 100000 | 35:19:00
To get this type of output execute the following statement:
explain plan for
select * from dog_vets;
OR
explain plan for
select * from dog_statuses;
and then
select * from table(dbms_xplan.display);
Now my question is: why do multiple indexes imply a view (materialized, I assume) being created in my above statements, and what type of performance hit am I suffering on this type of query? As it stands now, dog_vets has ~300 million records and dog_statuses has about 500 million. I have yet to be able to get select * from dog_statuses to return in under 10 hours. This is primarily because the query dies before it completes.
DDL
In case sql fiddle dies:
create table dog_vets
(
name varchar2(50),
founded timestamp,
staff_count number
);
create table dog_statuses
(
check_up timestamp,
sick varchar2(1)
);
create index dog_vet_name
on dog_vets(name);
create index dog_status_check_up
on dog_statuses(check_up);
create index dog_status_sick
on dog_statuses(sick);

You could try telling the optimizer to ignore the indexes:
SELECT /*+NO_INDEX(dog_statuses)*/ *
FROM dog_statuses
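Another way to express the same intent, offered only as a sketch, is the FULL hint, which asks for a full table scan directly. Re-running the explain plan afterwards shows whether the index join has gone away.

```sql
-- Ask the optimizer for a full table scan instead of joining the two indexes.
EXPLAIN PLAN FOR
SELECT /*+ FULL(dog_statuses) */ *
FROM   dog_statuses;

-- Check that the plan now shows a full table access rather than a hash join of index scans.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```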
