I am using Apache Phoenix to run some queries but their performance look bad compared to what I was expecting. As an example, considering a table like:
CREATE TABLE MY_SHORT_TABLE (
MPK BIGINT not null,
... 38 other columns ...
CONSTRAINT pk PRIMARY KEY (MPK, 4 other columns))
SALT_BUCKETS = 4;
which has arround 460000 lines,
a query like:
select sum(MST.VALUES),
MST.III, MST.BBB, MST.DDD, MST.FFF,
MST.AAA, MST.CCC, MST.EEE, MST.HHH
from
MY_SHORT_TABLE MST
group by
MST.AAA, MST.BBB, MST.CCC, MST.DDD,
MST.EEE, MST.FFF, MST.HHH, MST.III
is taking arround 9-11 seconds to complete.
In a table with a similar structure but with near 3 400 000 lines, it takes arroud 45 seconds to complete the query.
I have 5 hosts (1 Master and 4 RegionServer+PhoenixQS) in this cluster with 6 vCPU and 32GB RAM.
The configurations I am using at in this example are:
HBase RegionServer Maximum Memory=8192(8GB)
HBase Master Maximum Memory=8192(8GB)
Number of Handlers per RegionServer=30
Memstore Flush Size=128MB
Maximum Record Size=1MB
Maximum Region File Size=10GB
% of RegionServer Allocated to Read Buffers=40%
% of RegionServer Allocated to Write Buffers=40%
HBase RPC Timeout=6min
Zookeeper Session Timeout=6min
Phoenix Query Timeout=6min
Number of Fetched Rows when Scanning from Disk=1000
dfs.client.read.shortcircuit=true
dfs.client.read.shortcircuit.buffer.size=131072
phoenix.coprocessor.maxServerCacheTimeToLiveMs=30000
I am using HDP 2.4.0, so Phoenix 4.4.
The example query explain is the following:
+------------------------------------------+
| PLAN |
+------------------------------------------+
| CLIENT 8-CHUNK PARALLEL 8-WAY FULL SCAN OVER MY_SHORT_TABLE |
| SERVER AGGREGATE INTO DISTINCT ROWS BY [AAA, BBB, CCC, DDD, EEE, FFF, HHH |
| CLIENT MERGE SORT |
+------------------------------------------+
Also, I have created an index as:
CREATE INDEX i1DENORM2T1 ON MY_SHORT_TABLE (HHH)
INCLUDE ( AAA, BBB, CCC, DDD, EEE, FFF, HHH, VALUES ) ;
This index changes the query execution plan to:
+------------------------------------------+
| PLAN |
+------------------------------------------+
| CLIENT 4-CHUNK PARALLEL 4-WAY FULL SCAN OVER I1DENORM2T1 |
| SERVER AGGREGATE INTO DISTINCT ROWS BY ["AAA", "BBB", "DDD", "EEE", "FFF", "HHH |
| CLIENT MERGE SORT |
+------------------------------------------+
However the performance do not match the expectations (arround 3-4 seconds).
What is wrong in the above configs or what should I change in order to get a better performance?
Thanks in advance.
Related
I am new to dynamo db.
The table looks like below
| id |rangekey |timestamp |dimensions
| -----| --------|-------------|----------
| of1 | ACTIVE |1631460979529|{"type":"test","content":"abc"}
| of2 | ACTIVE |1631499979529|{"type":"test","content":"bxh"}
| of3 | ACTIVE |1631499979520|{"type":"practice","content":"xyz"}
| of4 | ACTIVE |1631499979528|{"type":"lecture","content":"lll"}
| of5 | ACTIVE |1631460979927|{"type":"practice","content":"olp","component":"one"}
| .. |.. |... |...
so on.
The id is the partition key and range key is the sort key.The id values are unique.
It seems like poorly designed table when it comes to querying all the id for which the dimensions contains (or begins with)
"type":"test" or "type":"practice"
I am aware of the below approaches:
Scan the table with filter expression like below
contains(dimensions,'"type":"test"') or contains(dimensions,'"type":"practice"')
Query the partition id one
by one with filter expression as above.This seems like a problem
because i have a large list of id (partition keys) approx up-to
5000 .But this could be run in parallel to reduce time
Or can i create a dynamo db stream sort of a materialized view which has the
view containing all id whose dimension is of type test or
practice.Need more insight on this one.
Does any of the above approach seems good cost wise or efficiency.Are there better ways of doing this.Thanks in advance !
I have created a table as below with 3 buckets, and loaded some data into it.
create table testBucket (id int,name String)
partitioned by (region String)
clustered by (id) into 3 buckets;
I have set bucketing property as well. $set hive.enforce.bucketing=true;
But when I listed the table files in HDFS I could see that that 3 files are creates as I have mentioned 3 buckets.
But data got loaded in only one file and rest 2 files are just empty. So I am confused why my data got loaded into only file?
So could someone please explain me how data distribution happens in bucketing?
[test#localhost user]$ hadoop fs -ls /user/hive/warehouse/database2.db/buckettab/region=USA
Found 3 items
-rw-r--r-- 1 user supergroup 38 2016-06-27 08:34 /user/hive/warehouse/database2.db/buckettab/region=USA/000000_0
-rw-r--r-- 1 user supergroup 0 2016-06-27 08:34 /user/hive/warehouse/database2.db/buckettab/region=USA/000001_0
-rw-r--r-- 1 user supergroup 0 2016-06-27 08:34 /user/hive/warehouse/database2.db/buckettab/region=USA/000002_0
Bucketing is a method to evenly distributed the data across many files. Create multiple buckets and then place each record into one of the buckets based on some logic mostly some hashing algorithm.
Bucketing feature of Hive can be used to distribute/organize the table/partition data into multiple files such that similar records are present in the same file. While creating a Hive table, a user needs to give the columns to be used for bucketing and the number of buckets to store the data into. Which records go to which bucket are decided by the Hash value of columns used for bucketing.
[Hash(column(s))] MOD [Number of buckets]
Hash value for different columns types is calculated differently. For int columns, the hash value is equal to the value of int. For String columns, the hash value is calculated using some computation on each character present in the String.
Data for each bucket is stored in a separate HDFS file under the table directory on HDFS. Inside each bucket, we can define the arrangement of data by providing the SORT BY column while creating the table.
Lets See an Example
Creating a Hive table using bucketing
For creating a bucketed table, we need to use CLUSTERED BY clause to define the columns for bucketing and provide the number of buckets. Following query creates a table Employee bucketed using the ID column into 5 buckets.
CREATE TABLE Employee(
ID BIGINT,
NAME STRING,
AGE INT,
SALARY BIGINT,
DEPARTMENT STRING
)
COMMENT 'This is Employee table stored as textfile clustered by id into 5 buckets'
CLUSTERED BY(ID) INTO 5 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
Inserting data into a bucketed table
We have following data in Employee_old table.
0: jdbc:hive2://localhost:10000> select * from employee_old;
+------------------+--------------------+-------------------+----------------------+--------------------------+--+
| employee_old.id | employee_old.name | employee_old.age | employee_old.salary | employee_old.department |
+------------------+--------------------+-------------------+----------------------+--------------------------+--+
| 1 | Sudip | 34 | 62000 | HR |
| 2 | Suresh | 45 | 76000 | FINANCE |
| 3 | Aarti | 25 | 37000 | BIGDATA |
| 4 | Neha | 27 | 39000 | FINANCE |
| 5 | Rajesh | 29 | 59000 | BIGDATA |
| 6 | Suman | 37 | 63000 | HR |
| 7 | Paresh | 42 | 71000 | BIGDATA |
| 8 | Rami | 33 | 56000 | HR |
| 9 | Arpit | 41 | 46000 | HR |
| 10 | Sanjeev | 51 | 99000 | FINANCE |
| 11 | Sanjay | 32 | 67000 | FINANCE |
+------------------+--------------------+-------------------+----------------------+--------------------------+--+
We will select data from the table Employee_old and insert it into our bucketed table Employee.
We need to set the property ‘hive.enforce.bucketing‘ to true while inserting data into a bucketed table. This will enforce bucketing, while inserting data into the table.
Set the property
0: jdbc:hive2://localhost:10000> set hive.enforce.bucketing=true;
Insert data into Bucketed table employee
0: jdbc:hive2://localhost:10000> INSERT OVERWRITE TABLE Employee SELECT * from Employee_old;
Verify the Data in Buckets
Once we execute the INSERT query, we can verify that 5 files are created Under the Employee table directory on HDFS.
Name Type
000000_0 file
000001_0 file
000002_0 file
000003_0 file
000004_0 file
Each file represents a bucket. Let us see the contents of these files.
Content of 000000_0
All records with Hash(ID) mod 5 == 0 goes into this file.
5,Rajesh,29,59000,BIGDATA
10,Sanjeev,51,99000,FINANCE
Content of 000001_0
All records with Hash(ID) mod 5 == 1 goes into this file.
1,Sudip,34,62000,HR
6,Suman,37,63000,HR
11,Sanjay,32,67000,FINANCE
Content of 000002_0
All records with Hash(ID) mod 5 == 2 goes into this file.
2,Suresh,45,76000,FINANCE
7,Paresh,42,71000,BIGDATA
Content of 000003_0
All records with Hash(ID) mod 5 == 3 goes into this file.
3,Aarti,25,37000,BIGDATA
8,Rami,33,56000,HR
Content of 000004_0
All records with Hash(ID) mod 5 == 4 goes into this file.
4,Neha,27,39000,FINANCE
9,Arpit,41,46000,HR
I feel all ID MOD 3 will be same for USA partition (region=USA) in sample data.
Bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets. (There's a '0x7FFFFFFF in there too, but that's not that important). The hash_function depends on the type of the bucketing column. For an int, it's easy, hash_int(i) == i. For example, if user_id were an int, and there were 10 buckets, we would expect all user_id's that end in 0 to be in bucket 1, all user_id's that end in a 1 to be in bucket 2, etc. For other datatypes, it's a little tricky. In particular, the hash of a BIGINT is not the same as the BIGINT. And the hash of a string or a complex datatype will be some number that's derived from the value, but not anything humanly-recognizable. For example, if user_id were a STRING, then the user_id's in bucket 1 would probably not end in 0. In general, distributing rows based on the hash will give you a even distribution in the buckets.
Take a look at the language Manual here
It states:
How does Hive distribute the rows across the buckets? In general, the bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets. (There's a '0x7FFFFFFF in there too, but that's not that important). The hash_function depends on the type of the bucketing column. For an int, it's easy, hash_int(i) == i. For example, if user_id were an int, and there were 10 buckets, we would expect all user_id's that end in 0 to be in bucket 1, all user_id's that end in a 1 to be in bucket 2, etc. For other datatypes, it's a little tricky. In particular, the hash of a BIGINT is not the same as the BIGINT. And the hash of a string or a complex datatype will be some number that's derived from the value, but not anything humanly-recognizable. For example, if user_id were a STRING, then the user_id's in bucket 1 would probably not end in 0. In general, distributing rows based on the hash will give you a even distribution in the buckets.
In your case because you are clustering by Id which is an Int and then you're bucketing it into 3 buckets only it looks like all values are being hashed into one of these buckets. To ensure this is working, add some rows that have different ids from the ones you have in the file and increase the number of buckets and see if they get hashed into separate files this time around.
I have a table PO_HEADER with ~20 million records. Considering our future load on the table we have decided to partitioned the table to increase the performance of the sql queries. Below are the queries used to create the new partitioned tables.
CREATE TABLE PO_HEADER_LP
PARTITION BY LIST (BUYER_IDENTIFIER)
(PARTITION GC66287246AA VALUES ('GC66287246AA') TABLESPACE MITRIX_TABLES,
PARTITION GC43837235JK VALUES ('GC43837235JK') TABLESPACE MITRIX_TABLES,
PARTITION GC84338293AA VALUES ('GC84338293AA') TABLESPACE MITRIX_TABLES,
PARTITION DEFAULTBUID VALUES (DEFAULT) TABLESPACE MITRIX_TABLES)
AS SELECT *
FROM PO_HEADER;
create index PO_HEADER_LP_SI_IDX on PO_HEADER_LP("SUPPLIER_IDENTIFIER") TABLESPACE MITRIX_INDEXES LOCAL;
Old Table PO_HEADER has two indexes on "BUYER_IDENTIFIER" and "SUPPLIER_IDENTIFIER" columns as follows:
create index PO_HEADER_BI_IDX on PO_HEADER("BUYER_IDENTIFIER") TABLESPACE MITRIX_INDEXES;
create index PO_HEADER_SI_IDX on PO_HEADER("SUPPLIER_IDENTIFIER") TABLESPACE MITRIX_INDEXES;
To test the performance of the query, I executed below query on both the tables. But, to my wonder I saw the cost of the 2nd query is almost double than the 1st one. Can any body know, why is the query cost is high of the partitioned table compared to normal table. Thanks in Advance.
select * from po_header where buyer_identifier='GC84338293AA' and supplier_identifier='GC75987723HT'; --cost: 56,941
select * from po_header_lp where buyer_identifier= 'GC84338293AA' and supplier_identifier='GC75987723HT'; --cost: 93,309
PO_HEADER with Global Index on buyer_identifier & supplier_identifier column
PO_HEADER_LP with Global Index on supplier_identifier column
PO_HEADER_LP with Local Index on supplier_identifier column
From your DDL I assume, you have three big buyers (say 5M records each) and a bunch of smaller ones. In other word this would be the correct setup for you list partitioning schema.
You may verify, whether it works testing access on buyer only:
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select * from tab_lp where BUYER_ID = 1;
;
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6662K| 82M| 4445 (2)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 6662K| 82M| 4445 (2)| 00:00:01 | KEY | KEY |
| 2 | TABLE ACCESS FULL | TAB_LP | 6662K| 82M| 4445 (2)| 00:00:01 | 2 | 2 |
------------------------------------------------------------------------------------------------
The same query for the non-partitioned table should produce much higher cost. Why?
In the partitioned table the selected buyer (in your case GC84338293AA, I'm using surrogate keys) has it own partition.
So full scan of this partition is the best access.
select * from tab where BUYER_ID = 1;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6596K| 81M| 14025 (1)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TAB | 6596K| 81M| 14025 (1)| 00:00:01 |
--------------------------------------------------------------------------
1 - filter("BUYER_ID"=1)
For the non-partitioned table (to get approximately one fourth of the data) the FULL TABLE SCAN is OK as well,
but of course has higher cost as all data must be scanned.
Note - if you see here lower cost, unrealistically low Rows count and/or INDEX ACCESS,
than this is the cause of the problem of the underestimating of the cost. So don't worry the old cost are too low, not the new one too high!
The next step is the access on both buyer and supplier. To get the answer you must provide
additional information.
How selective is the supplier filter?
I.e. if the predicate buyer_identifier='GC84338293AA' returns say 5M records, how may records return the predicate with both columns?
buyer_identifier='GC84338293AA' and supplier_identifier='GC75987723HT'
Is it 4M or 100 records?
If the complete predicate returns only few records than the local index on supplier is OK.
If it returns large number of rows (say the quarter of the partition) - you should stay on FULL PARTITION SCAN and not use it.
This is similar to my comment on the non partitioned table.
Estimation of the supplier cardinality
In case that the column SUPPLIER contains a skewed data (which may fool the CBO to calulate improper cost) you may define explicitely histogram in this column.
I used this statement statement, that calculates the histogram on full data (100% is important for highly skewed data) and for the table and partition.
exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'TAB_LP',granularity=>'all',estimate_percent => 100,METHOD_OPT => 'for columns SUPPLIER_ID size 254');
This worked for my test data, i.e. for supplier with low cardinality an index access was opened (on local no-prefixed index) and for huge suppliers a full partition scan was used.
You can create a Local partitioned index using this script.
CREATE INDEX PO_HEADER_LOCAL_IDX ON PO_HEADER_LP
(BUYER_IDENTIFIER, SUPPLIER_IDENTIFIER)
LOCAL (
PARTITION GC66287246AA,
PARTITION GC43837235JK,
PARTITION GC84338293AA,
PARTITION DEFAULTBUID
);
Also it is recommended to gather statistics of the newly created partition table using this script:
EXEC DBMS_STATS.GATHER_TABLE_STATS('SCHEMA Name','PO_HEADER_LP');
Now you can generate the execution plan again of the following SQL:
select * from po_header_lp where buyer_identifier= 'GC84338293AA' and supplier_identifier='GC75987723HT';
Hope this will help you.
I have created a query doing this in ORACLE:
SELECT SUBSTR(title,1,INSTR(title,' ',1,1)) AS first_word, COUNT(*) AS word_count
FROM FILM
GROUP BY SUBSTR(title,1,INSTR(title,' ',1,1))
HAVING COUNT(*) >= 20;
Results after running:
539 rows selected. Elapsed: 00:00:00.22
I need to improve the performance of this and created a function-based index as so:
CREATE INDEX INDX_FIRSTWRD ON FILM(SUBSTR(title,1,INSTR(title,' ',1,1)));
After running the same query at the top of this post, I still get the same performance:
539 rows selected. Elapsed: 00:00:00.22
Is the index not being applied or overwritten or am I doing something wrong?
Thanks for any help you could provide. :)
EDIT:
Execution Plan:
----------------------------------------------------------
Plan hash value: 2033354507
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20000 | 2968K| 138 (2)| 00:00:02 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 20000 | 2968K| 138 (2)| 00:00:02 |
| 3 | TABLE ACCESS FULL| FILM | 20000 | 2968K| 136 (0)| 00:00:02 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(COUNT(*)>=20)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
471 consistent gets
0 physical reads
0 redo size
14030 bytes sent via SQL*Net to client
908 bytes received via SQL*Net from client
37 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
539 rows processed
The problem is that the value you're using for the index may be null - if there is no space in the title (i.e. it's a one-word title like "Jaws") then your substr evaluates to null. That probably isn't what you want, incidentally - you probably want the end position to be conditional on whether there is a space at all, but that's beyond the scope of the question. (And even if you correct that logic, Oracle may still not be able to trust that the result can't be null, even if the underlying column is not nullable). Edit: see below for more on using nvl to handle single-word titles.
Since nulls aren't included in indexes, the single-title rows won't be indexed. But you're asking for all rows, and Oracle knows the index doesn't hold all rows, so it can't use the index to fulfil the query - even if you add a hint telling it to, it has to ignore that hint.
The only time the index will be used is if you include a filter that references the indexed value too, and explicitly or implicitly exclude nulls, e.g.:
SELECT SUBSTR(title,1,INSTR(title,' ',1,1)) AS first_word, COUNT(*) AS word_count
FROM FILM
WHERE SUBSTR(title,1,INSTR(title,' ',1,1)) IS NOT NULL
GROUP BY SUBSTR(title,1,INSTR(title,' ',1,1))
HAVING COUNT(*) >= 20;
(which also probably isn't what you actually want).
SQL Fiddle for queries with and without a filter, and with and without an index hint. (Click the 'execution plan' link against each result section to see whether it's doing a full table scan or a full index scan).
And another Fiddle showing that the index can't be used even with the filter if the filter still allows null values, again since they are not in the index.
Since SylvainLeroux brought it up, Oracle isn't quite clever enough to know the computed value can't be null if you coalesce it, even if the underlying column is not-null (as a function-based index or as a virtual column). Possibly because there could be a lot of branches to evaluate. But it is clever enough if you use the simpler and proprietary nvl instead:
CREATE INDEX INDX_FIRSTWRD
ON FILM(NVL(SUBSTR(title,1,INSTR(title,' ',1,1)),title));
SELECT NVL(SUBSTR(title,1,INSTR(title,' ',1,1)),title) AS first_word,
COUNT(*) AS word_count
FROM FILM
GROUP BY NVL(SUBSTR(title,1,INSTR(title,' ',1,1)),title)
HAVING COUNT(*) >= 20;
But only if title is defined as not-null. And coalesce does work if the virtual column is also declared not-null (thanks Sylvain).
SQL Fiddle with a function-based index and another with a virtual column.
539 rows selected. Elapsed: 00:00:00.22
Do you really think you need to tune the query which returns 539 rows in less than a second? 220 milliseconds, precicely! Think about it.
In your case, I think CBO does the best possible thing. And that is the reason it doesn't use the index. Because, to read every row from the table, using the index is an overhead. It needs to read the index and then do a table access by rowid. Probably, in your small table, it could read the entire table with less IO to fetch the data.
If the table is small enough to be in a single block, then, it just requires a one IO to fetch required data from single block with full table scan.
You can try to check the explain plan by hinting the query to use the index and see if anything really improves. Remember, you are trying unnecessarily to improve the performance of a query which executes in less than a second!
every month I do a simple update statement on my oracle database. But, since monday it takes very long. The table grows every month by 5 percent. Now there are 8 million records stored.
The Statement:
update /*+ parallel(destination_tab, 4) */ destination_tab dest
set (full_name, state) =
(select /*+ parallel(source_tab, 4) */ dest.name, src.state
from source_tab src
where src.city = dest.city);
In real there are 20 fields to update, not only two... but so it looks easier to descripe the problem.
explain plan:
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | update statement | | 8517K| 3167M| 579M (50)|999:59:59 |
| 1 | update | destination_tab | | | | |
| 2 | PX COORDINATOR | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 8517K| 3167M| 6198 (1)| 00:01:27 |
| 4 | px block iterator | | 8517K| 3167M| 6198 (1)| 00:01:27 |
| 5 | table access full | DESTINATION_TAB | 8517K| 3167M| 6198 (1)| 00:01:27 |
| 6 | table access by index rowid| SOURCE_TAB | 1 | 56 | 1 (0)| 00:00:01 |
|* 7 | index unique scan | CITY_PK | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Could anyone descripe to me, how this can be? The plan looks very bad! Thank you very very much.
You didn't say how long is too long. You are joining an 8 million row table. Not sure how many rows are in source_tab.
I noticed the execution plan indicates a full table scan of destination_tab. Is the city column on the destination_tab table indexed? If not, try adding an index. If it is, Oracle may be ignoring it because it knows it needs to return every value anyway and destination_tab is the driving table.
No matter how you optimize it, this will always degrade in performance as the tables grow because you are updating every row by fetching a value from the same table joined to another. That is, you are always doing N operations where N is the number of rows in destination_tab.
High-level questions/suggestions:
Do you need to update every row every time? Are only certain rows likely to have changed values? If so, can you somehow predict which rows you need to update and limit your updates to it.
Why are the hints there? If performance changes, I would experiment with dropping hints. It's the optimizer's job to find the best plan for you. By using hints, you are telling the optimizer how to do its job. You'd better be right.
You are updating the full_name column on destination_tab to the name column of the same row. But you are obtaining the name column through a join to the table. It may be quicker to take that out of your select and use something like below. This is a guess. It may not matter.
update destination_tab dest
set full_name = name,
state =
(select src.state
from source_tab src
where src.city = dest.city);
Try the following.
merge
into destination_tab d
using source_tab s
on (d.city = d.city)
when matched then
update
set d.state = s.state
where decode(d.state, s.state, 1, 0) = 0;
If this is a data warehouse, I wouldn't do updates, especially not every row in a large table. I'd probably create a materialized view combining the pieces from various base tables, and do a full refresh when needed (non-atomic: truncate + insert append).
Edit:
As for WHY the current update approach is taking much longer than usual, my guess is that in previous runs Oracle found a good number of blocks needed for the update in buffer cache, and lately Oracle has had to pull a lot from disk into buffer first. You can look into consistent gets and db block gets (logical io) vs physical io (disk).
I understand the comments about the sense of a data warehouse and so on. However, I have to do this update in this kind. The update is part of an ETL workflow. I have to copy every month the complete 8 million records of the table "destination". After this step I have to do the UPDATE which makes problems.
I do not understand the problem, that the performance is so bad day-to-day. Usually, the update runs 45 minutes. Now, it runs about 4 hours. But why? There is no sorting necessary, so the famous reason "sorting on disc instead on main memory" is not possible. What is the problem in my case?
Could there be an difference about the performance between normal update (how I do it) and the merge-update?