Refreshable on commit materialized view using MAX() - oracle

I'm being hit hard by the entity-attribute-value antipattern. Some day, years ago, a guy decided that DDL wasn't sexy, and wanted to develop something "flexible enough" to keep information about people. He ignored the fact that people uses to have at least some basic attributes, as name, date of birth, and the like. Not only that, he put a bunch of (side-effects-ridden) PL/SQL packages on top of that schema. The thing managed to be a key subsystem in which other applications relied on.
Fast forward some years and 20 million rows. The guy isn't at the company anymore, and I have to deal with this. I need to implement some basic searches that right now would require multiple inner joins and just take forever for some cases. Rewriting the whole thing it's not possible, so I want to "pivot" the most important attributes.
I thought that materialized views could be a viable alternative, but I need some guidance, since I have never used them. I would like to get a table like this:
select
uid,
max(case when att = 'NAME' then UPPER(value) end) name,
max(case when att = 'SURNAME' then UPPER(value) end) surname,
max(case when att = 'BIRTH' then DATEORNULL(value) end) birth,
....,
count(*) cnt
from t
group by uid
as I understand reading Oracle docs, I should be able to create a "REFRESHABLE ON COMMIT" materialized view with MAX() if the query has no where clause.
But can't get it to work. I've tried:
create materialized view log on t WITH SEQUENCE,ROWID,(value) INCLUDING NEW VALUES;
create materialized view t_view
refresh fast on commit
as
select
uid,
max(case when att = 'NAME' then UPPER(value) end) name,
max(case when att = 'SURNAME' then UPPER(value) end) surname,
max(case when att = 'BIRTH' then DATEORNULL(value) end) birth,
count(*) cnt
from t
group by uid
It works for inserts, but not for updates. I see that is capable of these things:
REFRESH_COMPLETE
REFRESH_FAST
REFRESH_FAST_AFTER_INSERT
but I think I should see also REFRESH_FAST_AFTER_ONETAB_DML. Any ideas?
Update: Output of dbms_mview.explain_mview
REFRESH_COMPLETE |Y|
REFRESH_FAST |Y|
REFRESH_FAST_AFTER_INSERT |Y|
REFRESH_FAST_AFTER_ONETAB_DML|N|mv uses the MIN or MAX aggregate functions
REFRESH_FAST_AFTER_ANY_DML |N|see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
REFRESH_FAST_PCT |N|PCT is not possible on any of the detail tables in the mater

The MV_CAPABILITIES_TABLE.MSGTXT is wrong, what you really need to do is replace case with decode.
When I tried this on 11g I got the message CASE expressions present in materialized view. Changing it to use decode fixed it on both 10g and 11g.

Related

Oracle 11 joining with view has high cost

I'm having some difficulty with joining a view to another table. This is on an Oracle RAC system running 11.2
I'll try and give as much detail as possible without going into specific table structures as my company would not like that.
You all know how this works. "Hey, can you write some really ugly software to implement our crazy ideas?"
The idea of what they wanted me to do was to make a view where the end user wouldn't know if they were going after the new table or the old table so one of the tables is a parameter table that will return "ON" or "OFF" and is used in the case statements.
There are some not too difficult but nested case statements in the select clause
I have a view:
create view my_view as
select t1.a as a, t1.b as b, t1.c as c,
sum(case when t2.a = 'xx' then case when t3.a then ... ,
case when t2.a = 'xx' then case when t3.a then ... ,
from table1 t1
join table t2 on (t1.a = t2.a etc...)
full outer join t3 on (t1.a = t3.a etc...)
full outer join t4 on (t1.a = t4.a etc...)
group by t1.a, t1.b, t2.c, and all the ugly case statements...
Now, when I run the query
select * from my_view where a='xxx' and b='yyy' and c='zzz'
the query runs great and the cost is 10.
However, when I join this view with another table everything falls apart.
select * from my_table mt join my_view mv on (mt.a = mv.a and mt.b=mv.b and mt.c=mv.c) where ..."
everything falls apart with a cost though the roof.
What I think is happening is the predicates are not getting pushed to the view. As such, the view is now doing full tables scans and joining everything to everything and then finally removing all the rows.
Every hint, tweak, or anything I've done doesn't appear to help.
When looking at the plan it looks like it has the predicates.
But this happens after everything is joined.
Sorry if this is cryptic but any help would be greatly appreciated.
Since you have the view with a "GROUP BY", predicates could not be pushed to the inner query
Also, you have the group by functions in a case statement, which could also make it worse for the optimizer
Oracle introduces enhancements to Optimizer every version/release/patch. It is hard to say what is supported in the version you're running. However, you can try:
See if removing the case from the GROUP BY function will make any difference
Otherwise, you have to take the GROUP BY and GROUP BY functions from the view to the outer most query
After many keyboard indentations on my forehead I may have tricked Oracle into pushing the predicates. I don't know exactly why this works but simplifying things may have helped.
I changed all my ON clauses to USING clauses and in this way the column names now match the columns from which I'm joining to. On some other predicates that were constants I added in a where clause to the view.
The end result is I can now join this view with another table and the cost is reasonable and the plan shows that the predicates are being pushed.
Thank you to everybody who looked at this problem.

How to query for Inactive Employees using BI Publisher in Oracle Fusion?

I'm new to BI Publisher and I'm using it through Oracle Fusion Applications.
I am trying to make a report relating to the Inactive Employees in an organization. However I am unable to figure out how to query for an inactive or terminated employee.
I initially used this query:
SELECT PERSON_ID, PERSON_NUMBER, EFFECTIVE_START_DATE, EFFECTIVE_END_DATE
FROM PER_ALL_PEOPLE_F
WHERE TRUNC(SYSDATE) NOT BETWEEN TRUNC(EFFECTIVE_START_DATE) AND TRUNC(EFFECTIVE_END_DATE)
My other considerations were attributes from the PER_ALL_ASSIGNMENTS_M table including PRIMARY_WORK_RELATION_FLAG, PRIMARY_ASSIGNMENT_FLAG and ASSIGNMENT_TYPE considering that the employee's assignment details would help somehow. However I was unsuccessful.
I wanted to know if there was any other proper way to query for inactive employees. Is there any particular attribute in any table which would tell me for certain that an employee is active or terminated? When an employee is terminated in Oracle Fusion, which all table attributes get affected?
Thank you for your help.
The easiest way to do this is simply :
SELECT * FROM YourTable t
WHERE TRUNC(t.END_DATE) <= trunc(sysdate)
Some times there is also an indication column like IS_ACTIVE or something. You can also consider adding it, and simply updating it to 1 for all the records returned from the above query.
Other then that, we can't really help you. We don't know your table structures, we don't know what data you store in them and which column indicates what .
I have found what i was looking for. ASSIGNMENT_STATUS_TYPE='INACTIVE' was what I needed (As mentioned in the question, this solution is without considering 'EFFECTIVE_END_DATE') Getting the 'latest' assignment status of an employee was what I needed to find. The following query works if the employee has only one Assignment assigned.
SELECT PAPF.PERSON_ID, PAPF.PERSON_NUMBER
FROM PER_ALL_PEOPLE_F PAPF, PER_ALL_ASSIGNMENTS_M PAAM
WHERE 1=1
AND TRUNC(PAAM.EFFECTIVE_START_DATE) = (SELECT MAX(TRUNC(PAAM_INNER.EFFECTIVE_START_DATE))
FROM PER_ALL_ASSIGNMENTS_M PAAM_INNER
WHERE PAAM_INNER.PERSON_ID=PAAM.PERSON_ID
GROUP BY PAAM_INNER.PERSON_ID)
AND PAPF.PERSON_ID=PAAM.PERSON_ID
AND PAAM.PRIMARY_FLAG='Y'
AND PAAM.ASSIGNMENT_STATUS_TYPE='INACTIVE'
AND TRUNC(SYSDATE) BETWEEN PAAM.EFFECTIVE_START_DATE AND PAAM.EFFECTIVE_END_DATE
AND TRUNC(SYSDATE) BETWEEN PAPF.EFFECTIVE_START_DATE AND PAPF.EFFECTIVE_END_DATE
ORDER BY 1 ASC
I had help from the Oracle Support Community to get to an answer.
Link: https://community.oracle.com/message/14000136#14000136
However in a case where an employee was given an assignment say starting from year 2000 and ending at 20015, then another assignment starting from 2016 till present, the above query will return one record of the said employee as 'Inactive' if the Max Effective_start_date condition is not checked. (Since one became Inactive on 2015), even though her current Assignment status is 'Active' and she is currently not terminated.
In such a case, it is wise to retrieve the record with the greatest 'EFFECTIVE_START_DATE' from the PER_ALL_ASSIGNMENTS_M table, ie, checking if EFFECTIVE_START_DATE = MAX(EFFECTIVE_START_DATE)

AM I on the right path

I'm taking a database intro master's class. We are working on SQL. The professor likes to be ambiguous with certain explains.
Here's my question. Certain questions we are required to find out the opposite of a query something like if a supplier ships parts that are red and blue what colors don't the ship.
here is how I figured out a solution
SELECT distinct PARTS.COLOR
FROM PARTS, SHIPMENTS
WHERE PARTS.COLOR NOT IN(
SELECT distinct PARTS.COLOR
FROM SHIPMENTS, PARTS
WHERE PARTS.PARTNO IN(
SELECT distinct SHIPMENTS.PARTNO
FROM SHIPMENTS
WHERE SHIPMENTS.SUPPLIERNO='S1'))
AND SHIPMENTS.PARTNO = PARTS.PARTNO;
What I was wondering is, is this best approach to this question. This works but I'm not sure it is how it should be done.
I should also mention he does not want us to use all available operations. He did not show us JOIN, EXISTS,
he showed us SELECT, IN, ALL/ANY, Aggregates so MAX, MIN, SUM, GROUP BY, and HAVING
Thanks
If you learn now to use "EXPLAIN PLAN" to view the query plan, you'll find that Oracle often uses the same execution plan for "WHERE .. IN()" and "WHERE EXISTS". Depending on if there are indexes on the columns, it comes down to several aspects, mainly if you are using statistics gathering, Oracle will look at the number of rows for each table / index and decide which is the best way to execute it. So unless you find that IN() vs EXISTS() runs drastically differently than each other, just use whichever one makes most sense to you at the time, but always check the execution plan.
As far as your question, since you are prohibited from using joins or exists, I see nothing wrong with your solution.
The easy options I can come up with to simplify either use a join or an exists. You could do it with group and outer join, probably, but I see no point.
Without the restrictions, I could simplify it down to:
SELECT distinct P.COLOR
FROM PARTS P WHERE NOT EXISTS
(SELECT 1 FROM SHIPMENTS S WHERE S.PARTNO = P.PARTNO AND S.SUPPLIERNO = 'S1')
though I am not certain about your schema, and where color is. I assumed a part has a distinct color. If not, this is not adequate and you'd need to correlate the subquery on color, not partno.
Your question is: "if a supplier ships parts that are red and blue what colors don't they ship."
Interesting question. I think the easiest method uses analytic functions, which you probably haven't covered:
select sp.supplierno, color, count(*)
from (select s.*, p.color
max(case when p.color = 'red' then 1 else 0 end) over (partition by partno) as HasRed,
max(case when p.color = 'blue' then 1 else 0 end) over (partition by partno) as HasBlue
from shipments s join
parts p
on s.partno = p.partno
) sp
where hasRed > 0 and hasBlue > 0
group by sp.supplierno, color;

How does Inline view differ from Inline table in oracle?

Could anyone tell the difference between Inline view and Inline table ?
Explanation with SQL code might be good to understand the concept easily.
"this question has been asked to me in an interview."
We hear this a lot. The problem with these type of questions is you're asking the wrong people: you should have have the courage to say to your interviewer, "I'm sorry, I'm not familiar with the term 'inline table' could you please explain it?"
Instead you ask us, and the thing is, we don't know what the interview had in mind. I agree with Alex that the nearest thing to 'inline table' is the TABLE() function for querying nested table collections, but it's not a standard term.
I have been on the opposite side of the interviewing table many times. I always give credit to a candidate who asked me to clarify a question; I always mark down a candidate who blusters.
The world is struggling to optimize the database query, so am I. Well if I say I have something which can speed the application by factors to 80%, if used at right situations, what will you say...
Here I give you a problem, suppose you want to calculate the lowest and highest salary across the department with the name of employees with their respective manager.
One way to do it is to create a temp table which contain the aggregated salary for the employees.
create table tmp_emp_sal as select t.emp_id,max(t.sal) as maxsal,min(t.sal) as minsal,avg(t.sal) as avgsal from sal t group by t.emp_id
and then use it in query further.
select concat(e.last_nm, e.first_nm) as employee_name,concat(m.last_nm,m.first_nm) as manager_name,tt.maxsal,tt.minsal,tt.avgsal from emp e,emp m,dept d,tmp_test tt where e.dept_id = d.dept_id and s.emp_id = tt.emp_id and e.mgr_id = m.emp_id order by employee_name, manager_name
Now I will optimize the above code by merging the two DML and DDL operations in to a single DML query.
select concat(e.last_nm, e.first_nm) as employee_name,concat(m.last_nm, m.first_nm) as manager_name,tt.maxsal,tt.minsal,tt.avgsal from emp e,emp m, dept d,(select t.emp_id, max(t.sal) as maxsal, min(t.sal) as minsal, avg(t.sal) as avgsal from sal t group by emp_id) tt where e.dept_id = d.dept_id and s.emp_id = tt.emp_id and e.mgr_id = m.emp_id order by employee_name,manager_name
The above query saves user from the following shortcomings :-
Eliminates expensive DDL statements.
Eliminates a round trip to the database server.
Memory usage is much lighter because it only stores the final result rather than the intermediate steps as well.
So its preferable to use inline views in place of temp tables.

Oracle command hangs when using view for "WHERE x IN..." subquery

I'm working on a web service that fetches data from an oracle data source in chunks and passes it back to an indexing/search tool in XML format. I'm the C#/.NET guy, and am kind of fuzzy on parts of Oracle.
Our Oracle team gave us the following script to run, and it works well:
SELECT ROWID, [columns]
FROM [table]
WHERE ROWID IN (
SELECT ROWID
FROM (
SELECT ROWID
FROM [table]
WHERE ROWID > '[previous_batch_last_rowid]'
ORDER BY ROWID
)
WHERE ROWNUM <= 10000
)
ORDER BY ROWID
10,000 rows is an arbitrary but reasonable chunk size and ROWID is sufficiently unique for our purposes to use as a UID since each indexing run hits only one table at a time. Bracketed values are filled in programmatically by the web service.
Now we're going to start adding views to the indexing, each of which will union a few separate tables. Since ROWID would no longer function as a unique identifier, they added a column to the views (VIEW_UNIQUE_ID) that concatenates the ROWIDs from the component tables to construct a UID for each union.
But this script does not work, even though it follows the same form as the previous one:
SELECT VIEW_UNIQUE_ID, [columns]
FROM [view]
WHERE VIEW_UNIQUE_ID IN (
SELECT VIEW_UNIQUE_ID
FROM (
SELECT VIEW_UNIQUE_ID
FROM [view]
WHERE VIEW_UNIQUE_ID > '[previous_batch_last_view_unique_id]'
ORDER BY VIEW_UNIQUE_ID
)
WHERE ROWNUM <= 10000
)
ORDER BY VIEW_UNIQUE_ID
It hangs indefinitely with no response from the Oracle server. I've waited 20+ minutes and the SQLTools dialog box indicating a running query remains the same, with no progress or updates.
I've tested each subquery independently and each works fine and takes a very short amount of time (<= 1 second), so the view itself is sound. But as soon as the inner two SELECT queries are added with "WHERE VIEW_UNIQUE_ID IN...", it hangs.
Why doesn't this query work for views? In what important way are they not interchangeable here?
Updated: the architecture of the solution stipulates that it is to be stateless, so I shouldn't try to make the web service preserve any index state information between requests from consumers.
they added a column to the views
(VIEW_UNIQUE_ID) that concatenates the
ROWIDs from the component tables to
construct a UID for each union.
God, that is the most obscene idea I've seen in a long time.
Let's say the view is a simple one like
SELECT C.CUST_ID, C.CUST_NAME, O.ORDER_ID, C.ROWID||':'||O.ROWID VIEW_UNIQUE_ID
FROM CUSTOMER C JOIN ORDER O ON C.CUST_ID = O.CUST_ID
Every time you want to do the
SELECT VIEW_UNIQUE_ID
FROM [view]
WHERE VIEW_UNIQUE_ID > '[previous_batch_last_view_unique_id]'
ORDER BY VIEW_UNIQUE_ID
It has to build that entire result set, apply the filter, and order it. For anything other than trivially sized tables, that will be a nightmare.
Stop using the database to paginate/chunk the data here and do that in the client. Open the database connection, execute the query, fetch the first ten thousand rows from the query, index them, fetch the next ten thousand. Don't close and reopen the query each time, only after you've processed each row. You'll be able to forget about ordering.
For stateless, you need to re-architect. The whole thing with concatenated ROWIDs will not fly.
Start by putting the records to be processed into a fresh table, then you can flag them/process them/delete them in chunks.
INSERT INTO pending_table
SELECT 'N' state_flag, v.* FROM view v;
<start looping here>
UPDATE pending_table
SET state_flag = 'P'
WHERE ROWNUM < 10000;
COMMIT;
SELECT * FROM pending_table
WHERE state_flag = 'P';
<client processing>
DELETE FROM pending_table
WHERE state_flag = 'P';
<go back to start of loop, and keep going until pending_table is empty>

Resources