using multiple left outer joins pl/sql - oracle

So I have three tables I am trying to pull data from with the following query:
select tats.machine_interval_id as machine_interval_id,
tats.interval_type as interval_type,
tats.interval_category as interval_category,
ops.opstatemnemonic as operational_state,
nptc.categorytype as idle_category,
tats.interval_duration as interval_duration
from temp_agg_time_summary tats
left outer join operationalstates ops on ops.opstateid=tats.operationalstatenumeric
left outer join nptcategories nptc on nptc.categoryid=tats.categorytypenumeric
The problem I'm having is that whenever there is a value that is not null from the nptcategories table, it double the record which in turns throws off any calculations I have later in my packages. I believe the problem has to do with having more than one left outer join in the query. My question may see fairly simple, but I'm new to PL/SQL so bear with me.
What I want to know is how can I use multiple left outer joins in a query with out having this problem occur? What would be a better way to structure this query?
Update
Okay so I found the offending line of code it is below:
left outer join nptcategories nptc on nptc.categoryid=tats.categorytypenumeric
Also when using distinct, it removes all of the duplicate records, but will using this cause any problems I am unaware of? Should I focus more on figuring out why the join above does not work properly, or is the distinct good enough?

Okay, so after taking Wolf's suggestion, i went in and ran the following line of code
select categorytype, count(*)
from nptcategories
group by categorytype
having count(*) > 1;
After running this, i found that somehow there were duplicates of records in this table so, this was fixed by removing the duplicates and setting the table to have unique ids. This was done by using the following script on the DB:
alter table nptcategories add constraint nptcatidunq unique(categoryid)

Related

ORA-02019:connection description for remote database not found - left join in a view

I have 3 tables:
table1: id, person_code
table2: id, address, person_code_foreing(same with that one from table 1), admission_date_1
table3: id, id_table2, admission_date_2, something
(the tables are fictive)
I'm trying to make a view who takes infos from this 3 tables using left join, i'm doing like this because in the first table i have some record who don't have the person_code in the others tables but I want also this info to be returned by the view:
CREATE OR REPLACE VIEW schema.my_view
SELECT t1.name, t2.adress, t3.something
from schema.table1#ambient1 t1
left join schema.table2#ambient1 t2
on t1.person_code = t2.person_code_foreing
left join schema.table3#ambient1 t3
on t3.id_table2 = t2.id
and t1.admission_date_1=t2.admission_date_2;
This view needs to be created in another ambient (ambient2).
I tried using a subquery, there I need also a left join to use, and this thing is very confusing because I don't get it, the subquery and the left join are the big no-no?! Or just de left-join?!
Has this happened to anyone?
How did you risolved it?
Thanks a lot.
ORA-2019 indicates that your database link (#ambient1) does not exist, or is not visible to the current user. You can confirm by checking the ALL_DB_LINKS view, which should list all links to which the user has access:
select owner, db_link from all_db_links;
Also keep in mind that Oracle will perform the joins in the database making the call, not the remote database, so you will almost certainly have to pull the entire contents of all three tables over the network to be written into TEMP for the join and then thrown away, every time you run a query. You will also lose the benefit of any indexes on the data and most likely wind up with full table scans on the temp tables within your local database.
I don't know if this is an option for you, but from a performance perspective and given that it isn't joining with anything in the local database, it would make much more sense to create the view in the remote database and just query that through the database link. That way all of the joins are performed efficiently where the data lives, only the result set is pushed over the network, and your client database SQL becomes much simpler.
I managed to make it work, but apparently ambient2 doesn't like my "left-join", and i used only a subquery and the operator (+), this is how it worked:
CREATE OR REPLACE VIEW schema.my_view
SELECT t1.name, all.adress, all.something
from schema.table1#ambient1 t1,(select * from
schema.table3#ambient1 t3, schema.table2#ambient1 t2
where t3.id_table2 = t2.id(+)
and (t1.admission_date_1=t2.admission_date_2 or t1.admission_date is null))
all
where t1.person_code = t2.person_code_foreing(+);
I tried to test if a query in ambient2 using a right-join works (with 2 tables created there) and it does. I thought there is a problem with that ambient..
For me, there is no sense why in my case this kind of join retrieves that error.
The versions are different?! I don't know, and I don't find any official documentation about that.
Maybe some of you guys have any clue..
There is a mistery for me :))
Thanks.

Oracle 11 joining with view has high cost

I'm having some difficulty with joining a view to another table. This is on an Oracle RAC system running 11.2
I'll try and give as much detail as possible without going into specific table structures as my company would not like that.
You all know how this works. "Hey, can you write some really ugly software to implement our crazy ideas?"
The idea of what they wanted me to do was to make a view where the end user wouldn't know if they were going after the new table or the old table so one of the tables is a parameter table that will return "ON" or "OFF" and is used in the case statements.
There are some not too difficult but nested case statements in the select clause
I have a view:
create view my_view as
select t1.a as a, t1.b as b, t1.c as c,
sum(case when t2.a = 'xx' then case when t3.a then ... ,
case when t2.a = 'xx' then case when t3.a then ... ,
from table1 t1
join table t2 on (t1.a = t2.a etc...)
full outer join t3 on (t1.a = t3.a etc...)
full outer join t4 on (t1.a = t4.a etc...)
group by t1.a, t1.b, t2.c, and all the ugly case statements...
Now, when I run the query
select * from my_view where a='xxx' and b='yyy' and c='zzz'
the query runs great and the cost is 10.
However, when I join this view with another table everything falls apart.
select * from my_table mt join my_view mv on (mt.a = mv.a and mt.b=mv.b and mt.c=mv.c) where ..."
everything falls apart with a cost though the roof.
What I think is happening is the predicates are not getting pushed to the view. As such, the view is now doing full tables scans and joining everything to everything and then finally removing all the rows.
Every hint, tweak, or anything I've done doesn't appear to help.
When looking at the plan it looks like it has the predicates.
But this happens after everything is joined.
Sorry if this is cryptic but any help would be greatly appreciated.
Since you have the view with a "GROUP BY", predicates could not be pushed to the inner query
Also, you have the group by functions in a case statement, which could also make it worse for the optimizer
Oracle introduces enhancements to Optimizer every version/release/patch. It is hard to say what is supported in the version you're running. However, you can try:
See if removing the case from the GROUP BY function will make any difference
Otherwise, you have to take the GROUP BY and GROUP BY functions from the view to the outer most query
After many keyboard indentations on my forehead I may have tricked Oracle into pushing the predicates. I don't know exactly why this works but simplifying things may have helped.
I changed all my ON clauses to USING clauses and in this way the column names now match the columns from which I'm joining to. On some other predicates that were constants I added in a where clause to the view.
The end result is I can now join this view with another table and the cost is reasonable and the plan shows that the predicates are being pushed.
Thank you to everybody who looked at this problem.

Write a nested select statement with a where clause in Hive

I have a requirement to do a nested select within a where clause in a Hive query. A sample code snippet would be as follows;
select *
from TableA
where TA_timestamp > (select timestmp from TableB where id="hourDim")
Is this possible or am I doing something wrong here, because I am getting an error while running the above script ?!
To further elaborate on what I am trying to do, there is a cassandra keyspace that I publish statistics with a timestamp. Periodically (hourly for example) this stats will be summarized using hive, once summarized that data will be stored separately with the corresponding hour. So when the query runs for the second time (and consecutive runs) the query should only run on the new data (i.e. - timestamp > previous_execution_timestamp). I am trying to do that by storing the latest executed timestamp in a separate hive table, and then use that value to filter out the raw stats.
Can this be achieved this using hive ?!
Subqueries inside a WHERE clause are not supported in Hive:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries
However, often you can use a JOIN statement instead to get to the same result:
https://karmasphere.com/hive-queries-on-table-data#join_syntax
For example, this query:
SELECT a.KEY, a.value
FROM a
WHERE a.KEY IN
(SELECT b.KEY FROM B);
can be rewritten to:
SELECT a.KEY, a.val
FROM a LEFT SEMI JOIN b ON (a.KEY = b.KEY)
Looking at the business requirements underlying your question, it occurs that you might get more efficient results by partitioning your Hive table using hour. If the data can be written to use this factor as the partition key, then your query to update the summary will be much faster and require fewer resources.
Partitions can get out of hand when they reach the scale of millions, but this seems like a case that will not tease that limitation.
It will work if you put in :
select *
from TableA
where TA_timestamp in (select timestmp from TableB where id="hourDim")
EXPLANATION : As > , < , = need one exact figure in the right side, while here we are getting multiple values which can be taken only with 'IN' clause.

ORA-30926: unable to get a stable set of rows in the source tables

I am getting
ORA-30926: unable to get a stable set of rows in the source tables
in the following query:
MERGE INTO table_1 a
USING
(SELECT a.ROWID row_id, 'Y'
FROM table_1 a ,table_2 b ,table_3 c
WHERE a.mbr = c.mbr
AND b.head = c.head
AND b.type_of_action <> '6') src
ON ( a.ROWID = src.row_id )
WHEN MATCHED THEN UPDATE SET in_correct = 'Y';
I've ran table_1 it has data and also I've ran the inside query (src) which also has data.
Why would this error come and how can it be resolved?
This is usually caused by duplicates in the query specified in USING clause. This probably means that TABLE_A is a parent table and the same ROWID is returned several times.
You could quickly solve the problem by using a DISTINCT in your query (in fact, if 'Y' is a constant value you don't even need to put it in the query).
Assuming your query is correct (don't know your tables) you could do something like this:
MERGE INTO table_1 a
USING
(SELECT distinct ta.ROWID row_id
FROM table_1 a ,table_2 b ,table_3 c
WHERE a.mbr = c.mbr
AND b.head = c.head
AND b.type_of_action <> '6') src
ON ( a.ROWID = src.row_id )
WHEN MATCHED THEN UPDATE SET in_correct = 'Y';
You're probably trying to to update the same row of the target table multiple times. I just encountered the very same problem in a merge statement I developed. Make sure your update does not touch the same record more than once in the execution of the merge.
A further clarification to the use of DISTINCT to resolve error ORA-30926 in the general case:
You need to ensure that the set of data specified by the USING() clause has no duplicate values of the join columns, i.e. the columns in the ON() clause.
In OP's example where the USING clause only selects a key, it was sufficient to add DISTINCT to the USING clause. However, in the general case the USING clause may select a combination of key columns to match on and attribute columns to be used in the UPDATE ... SET clause. Therefore in the general case, adding DISTINCT to the USING clause will still allow different update rows for the same keys, in which case you will still get the ORA-30926 error.
This is an elaboration of DCookie's answer and point 3.1 in Tagar's answer, which from my experience may not be immediately obvious.
How to Troubleshoot ORA-30926 Errors? (Doc ID 471956.1)
1) Identify the failing statement
alter session set events ‘30926 trace name errorstack level 3’;
or
alter system set events ‘30926 trace name errorstack off’;
and watch for .trc files in UDUMP when it occurs.
2) Having found the SQL statement, check if it is correct (perhaps using explain plan or tkprof to check the query execution plan) and analyze or compute statistics on the tables concerned if this has not recently been done. Rebuilding (or dropping/recreating) indexes may help too.
3.1) Is the SQL statement a MERGE?
evaluate the data returned by the USING clause to ensure that there are no duplicate values in the join. Modify the merge statement to include a deterministic where clause
3.2) Is this an UPDATE statement via a view?
If so, try populating the view result into a table and try updating the table directly.
3.3) Is there a trigger on the table? Try disabling it to see if it still fails.
3.4) Does the statement contain a non-mergeable view in an 'IN-Subquery'? This can result in duplicate rows being returned if the query has a "FOR UPDATE" clause. See Bug 2681037
3.5) Does the table have unused columns? Dropping these may prevent the error.
4) If modifying the SQL does not cure the error, the issue may be with the table, especially if there are chained rows.
4.1) Run the ‘ANALYZE TABLE VALIDATE STRUCTURE CASCADE’ statement on all tables used in the SQL to see if there are any corruptions in the table or its indexes.
4.2) Check for, and eliminate, any CHAINED or migrated ROWS on the table. There are ways to minimize this, such as the correct setting of PCTFREE.
Use Note 122020.1 - Row Chaining and Migration
4.3) If the table is additionally Index Organized, see:
Note 102932.1 - Monitoring Chained Rows on IOTs
Had the error today on a 12c and none of the existing answers fit (no duplicates, no non-deterministic expressions in the WHERE clause). My case was related to that other possible cause of the error, according to Oracle's message text (emphasis below):
ORA-30926: unable to get a stable set of rows in the source tables
Cause: A stable set of rows could not be got because of large dml activity or a non-deterministic where clause.
The merge was part of a larger batch, and was executed on a live database with many concurrent users. There was no need to change the statement. I just committed the transaction before the merge, then ran the merge separately, and committed again. So the solution was found in the suggested action of the message:
Action: Remove any non-deterministic where clauses and reissue the dml.
SQL Error: ORA-30926: unable to get a stable set of rows in the source tables
30926. 00000 - "unable to get a stable set of rows in the source tables"
*Cause: A stable set of rows could not be got because of large dml
activity or a non-deterministic where clause.
*Action: Remove any non-deterministic where clauses and reissue the dml.
This Error occurred for me because of duplicate records(16K)
I tried with unique it worked .
but again when I tried merge without unique same proble occurred
Second time it was due to commit
after merge if commit is not done same Error will be shown.
Without unique, Query will work if commit is given after each merge operation.
I was not able to resolve this after several hours. Eventually I just did a select with the two tables joined, created an extract and created individual SQL update statements for the 500 rows in the table. Ugly but beats spending hours trying to get a query to work.
As someone explained earlier, probably your MERGE statement tries to update the same row more than once and that does not work (could cause ambiguity).
Here is one simple example. MERGE that tries to mark some products as found when matching the given search patterns:
CREATE TABLE patterns(search_pattern VARCHAR2(20));
INSERT INTO patterns(search_pattern) VALUES('Basic%');
INSERT INTO patterns(search_pattern) VALUES('%thing');
CREATE TABLE products (id NUMBER,name VARCHAR2(20),found NUMBER);
INSERT INTO products(id,name,found) VALUES(1,'Basic instinct',0);
INSERT INTO products(id,name,found) VALUES(2,'Basic thing',0);
INSERT INTO products(id,name,found) VALUES(3,'Super thing',0);
INSERT INTO products(id,name,found) VALUES(4,'Hyper instinct',0);
MERGE INTO products p USING
(
SELECT search_pattern FROM patterns
) o
ON (p.name LIKE o.search_pattern)
WHEN MATCHED THEN UPDATE SET p.found=1;
SELECT * FROM products;
If patterns table contains Basic% and Super% patterns then MERGE works and first three products will be updated (found). But if patterns table contains Basic% and %thing search patterns, then MERGE does NOT work because it will try to update second product twice and this causes the problem. MERGE does not work if some records should be updated more than once. Probably you ask why not update twice!?
Here first update 1 and second update 1 are the same value but only by accident. Now look at this scenario:
CREATE TABLE patterns(code CHAR(1),search_pattern VARCHAR2(20));
INSERT INTO patterns(code,search_pattern) VALUES('B','Basic%');
INSERT INTO patterns(code,search_pattern) VALUES('T','%thing');
CREATE TABLE products (id NUMBER,name VARCHAR2(20),found CHAR(1));
INSERT INTO products(id,name,found) VALUES(1,'Basic instinct',NULL);
INSERT INTO products(id,name,found) VALUES(2,'Basic thing',NULL);
INSERT INTO products(id,name,found) VALUES(3,'Super thing',NULL);
INSERT INTO products(id,name,found) VALUES(4,'Hyper instinct',NULL);
MERGE INTO products p USING
(
SELECT code,search_pattern FROM patterns
) s
ON (p.name LIKE s.search_pattern)
WHEN MATCHED THEN UPDATE SET p.found=s.code;
SELECT * FROM products;
Now first product name matches Basic% pattern and it will be updated with code B but second product matched both patterns and cannot be updated with both codes B and T in the same time (ambiguity)!
That's why DB engine complaints. Don't blame it! It knows what it is doing! ;-)

Table Join Efficiency Question

When joining across tables (as in the examples below), is there an efficiency difference between joining on the tables or joining subqueries containing only the needed columns?
In other words, is there a difference in efficiency between these two tables?
SELECT result
FROM result_tbl
JOIN test_tbl USING (test_id)
JOIN sample_tbl USING (sample_id)
JOIN (SELECT request_id
FROM request_tbl
WHERE request_status='A') USING(request_id)
vs
SELECT result
FROM (SELECT result, test_id FROM result_tbl)
JOIN (SELECT test_id, sample_id FROM test_tbl) USING(test_id)
JOIN (SELECT sample_id FROM sample_tbl) USING(sample_id)
JOIN (SELECT request_id
FROM request_tbl
WHERE request_status='A') USING(request_id)
The only way to find out for sure is to run both with tracing turned on and then look at the trace file. But in all probability they will be treated the same: the optimizer will merge all the inline views into the main statement and come up with the same query plan.
It doesn't matter. It may actually be WORSE since you are taking control away from the optimizer which generally knows best.
However, remember if you are doing a JOIN and only including a column from one of the tables that it is QUITE OFTEN better to re-write it as a series of EXISTS statements -- because that's what you really mean. JOINs (with some exceptions) will join matching rows which is a lot more work for the optimizer to do.
e.g.
SELECT t1.id1
FROM table1 t1
INNER JOIN table2 ON something = something
should almost always be
SELECT id1
FROM table1 t1
WHERE EXISTS( SELECT *
FROM table2
WHERE something = something )
For simple queries the optimizer may reduce the query plans into identical ones. Check it out on your DBMS.
Also this is a code smell and probably should be changed:
JOIN (SELECT request_id
FROM request_tbl
WHERE request_status='A')
to
SELECT result
FROM request
WHERE EXISTS(...)
AND request_status = 'A'
No difference.
You can tell by running EXPLAIN PLAN on both those statements - Oracle knows that all you want is the "result" column, so it only does the minimum necessary to get the data it needs - you should find that the plans will be identical.
The Oracle optimiser does, sometimes, "materialize" a subquery (i.e. run the subquery and keep the results in memory for later reuse), but this is rare and only occurs when the optimiser believes this will result in a performance improvement; in any case, Oracle will do this "materialization" whether you specified the columns in the subqueries or not.
Obviously if the only place the "results" column is stored is in the blocks (along with the rest of the data), Oracle has to visit those blocks - but it will only keep the relevant info (the "result" column and other relevant columns, e.g. "test_id") in memory when processing the query.

Resources