Oracle Connect By query resultset randomly not in hierarchy - oracle

I have prepared few queries using startwith and connectby for fetching all items of a table with parent - childs relationship.
Till now, these queries were working perfectly fine. But now, i noticed the hierarchy returned was not the same. hierarchy is returned in totally random way, although the data is same.
Can anyone suggest why this is happening..
Below is the sample query:
SELECT id,loc.title Title FROM
(SELECT level level,id id,parent_id Parent_Id,sort_order FROM table1
START WITH parent_id=0
CONNECT BY prior id = parent_id ORDER SIBLINGS BY sort_order)
INNER JOIN table2 loc ON id = loc.id WHERE loc.locale=?

the ORDER SIBLINGS BY will do what you want if you don't wrap the query as an inline view.
once you wrap it, then the ordering is no longer specified
i would suggest doing the JOIN directly in the inner query.. then your order by should work.

Related

ORA-02019:connection description for remote database not found - left join in a view

I have 3 tables:
table1: id, person_code
table2: id, address, person_code_foreing(same with that one from table 1), admission_date_1
table3: id, id_table2, admission_date_2, something
(the tables are fictive)
I'm trying to make a view who takes infos from this 3 tables using left join, i'm doing like this because in the first table i have some record who don't have the person_code in the others tables but I want also this info to be returned by the view:
CREATE OR REPLACE VIEW schema.my_view
SELECT t1.name, t2.adress, t3.something
from schema.table1#ambient1 t1
left join schema.table2#ambient1 t2
on t1.person_code = t2.person_code_foreing
left join schema.table3#ambient1 t3
on t3.id_table2 = t2.id
and t1.admission_date_1=t2.admission_date_2;
This view needs to be created in another ambient (ambient2).
I tried using a subquery, there I need also a left join to use, and this thing is very confusing because I don't get it, the subquery and the left join are the big no-no?! Or just de left-join?!
Has this happened to anyone?
How did you risolved it?
Thanks a lot.
ORA-2019 indicates that your database link (#ambient1) does not exist, or is not visible to the current user. You can confirm by checking the ALL_DB_LINKS view, which should list all links to which the user has access:
select owner, db_link from all_db_links;
Also keep in mind that Oracle will perform the joins in the database making the call, not the remote database, so you will almost certainly have to pull the entire contents of all three tables over the network to be written into TEMP for the join and then thrown away, every time you run a query. You will also lose the benefit of any indexes on the data and most likely wind up with full table scans on the temp tables within your local database.
I don't know if this is an option for you, but from a performance perspective and given that it isn't joining with anything in the local database, it would make much more sense to create the view in the remote database and just query that through the database link. That way all of the joins are performed efficiently where the data lives, only the result set is pushed over the network, and your client database SQL becomes much simpler.
I managed to make it work, but apparently ambient2 doesn't like my "left-join", and i used only a subquery and the operator (+), this is how it worked:
CREATE OR REPLACE VIEW schema.my_view
SELECT t1.name, all.adress, all.something
from schema.table1#ambient1 t1,(select * from
schema.table3#ambient1 t3, schema.table2#ambient1 t2
where t3.id_table2 = t2.id(+)
and (t1.admission_date_1=t2.admission_date_2 or t1.admission_date is null))
all
where t1.person_code = t2.person_code_foreing(+);
I tried to test if a query in ambient2 using a right-join works (with 2 tables created there) and it does. I thought there is a problem with that ambient..
For me, there is no sense why in my case this kind of join retrieves that error.
The versions are different?! I don't know, and I don't find any official documentation about that.
Maybe some of you guys have any clue..
There is a mistery for me :))
Thanks.

Write a nested select statement with a where clause in Hive

I have a requirement to do a nested select within a where clause in a Hive query. A sample code snippet would be as follows;
select *
from TableA
where TA_timestamp > (select timestmp from TableB where id="hourDim")
Is this possible or am I doing something wrong here, because I am getting an error while running the above script ?!
To further elaborate on what I am trying to do, there is a cassandra keyspace that I publish statistics with a timestamp. Periodically (hourly for example) this stats will be summarized using hive, once summarized that data will be stored separately with the corresponding hour. So when the query runs for the second time (and consecutive runs) the query should only run on the new data (i.e. - timestamp > previous_execution_timestamp). I am trying to do that by storing the latest executed timestamp in a separate hive table, and then use that value to filter out the raw stats.
Can this be achieved this using hive ?!
Subqueries inside a WHERE clause are not supported in Hive:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries
However, often you can use a JOIN statement instead to get to the same result:
https://karmasphere.com/hive-queries-on-table-data#join_syntax
For example, this query:
SELECT a.KEY, a.value
FROM a
WHERE a.KEY IN
(SELECT b.KEY FROM B);
can be rewritten to:
SELECT a.KEY, a.val
FROM a LEFT SEMI JOIN b ON (a.KEY = b.KEY)
Looking at the business requirements underlying your question, it occurs that you might get more efficient results by partitioning your Hive table using hour. If the data can be written to use this factor as the partition key, then your query to update the summary will be much faster and require fewer resources.
Partitions can get out of hand when they reach the scale of millions, but this seems like a case that will not tease that limitation.
It will work if you put in :
select *
from TableA
where TA_timestamp in (select timestmp from TableB where id="hourDim")
EXPLANATION : As > , < , = need one exact figure in the right side, while here we are getting multiple values which can be taken only with 'IN' clause.

using multiple left outer joins pl/sql

So I have three tables I am trying to pull data from with the following query:
select tats.machine_interval_id as machine_interval_id,
tats.interval_type as interval_type,
tats.interval_category as interval_category,
ops.opstatemnemonic as operational_state,
nptc.categorytype as idle_category,
tats.interval_duration as interval_duration
from temp_agg_time_summary tats
left outer join operationalstates ops on ops.opstateid=tats.operationalstatenumeric
left outer join nptcategories nptc on nptc.categoryid=tats.categorytypenumeric
The problem I'm having is that whenever there is a value that is not null from the nptcategories table, it double the record which in turns throws off any calculations I have later in my packages. I believe the problem has to do with having more than one left outer join in the query. My question may see fairly simple, but I'm new to PL/SQL so bear with me.
What I want to know is how can I use multiple left outer joins in a query with out having this problem occur? What would be a better way to structure this query?
Update
Okay so I found the offending line of code it is below:
left outer join nptcategories nptc on nptc.categoryid=tats.categorytypenumeric
Also when using distinct, it removes all of the duplicate records, but will using this cause any problems I am unaware of? Should I focus more on figuring out why the join above does not work properly, or is the distinct good enough?
Okay, so after taking Wolf's suggestion, i went in and ran the following line of code
select categorytype, count(*)
from nptcategories
group by categorytype
having count(*) > 1;
After running this, i found that somehow there were duplicates of records in this table so, this was fixed by removing the duplicates and setting the table to have unique ids. This was done by using the following script on the DB:
alter table nptcategories add constraint nptcatidunq unique(categoryid)

Oracle Select Query, Order By + Limit Results

I am new to Oracle and working with a fairly large database. I would like to perform a query that will select the desired columns, order by a certain column and also limit the results. According to everything I have read, the below query should be working but it is returning "ORA-00918: column ambiguously defined":
SELECT * FROM(SELECT * FROM EAI.EAI_EVENT_LOG e,
EAI.EAI_EVENT_LOG_MESSAGE e1 WHERE e.SOURCE_URL LIKE '%.XML'
ORDER BY e.REQUEST_DATE_TIME DESC) WHERE ROWNUM <= 20
Any suggestions would be greatly appreciated :D
The error message means your result set contains two columns with the same name. Each column in a query's projection needs to have a unique name. Presumably you have a column (or columns) with the same name in both EAI_EVENT_LOG and EAI_EVENT_LOG_MESSAGE.
You also want to join on that column. At the moment you are generating a cross join between the two tables. In other words, if you have a hundred records in EAI_EVENT_LOG and two hundred records EAI_EVENT_LOG_MESSAGE your result set will be twenty thousand records (without the rownum). This is probably your intention.
"By switching to innerjoin, will that eliminate the error with the
current code?"
No, you'll still need to handle having two columns with the same name. Basically this comes from using SELECT * on two multiple tables. SELECT * is bad practice. It's convenient but it is always better to specify the exact columns you want in the query's projection. That way you can include (say) e.TRANSACTION_ID and exclude e1.TRANSACTION_ID, and avoid the ORA-00918 exception.
Maybe you have some columns in both EAI_EVENT_LOG and EAI_EVENT_LOG_MESSAGE tables having identical names? Instead of SELECT * list all columns you want to select.
Other problem I see is that you are selecting from two tables but you're not joining them in the WHERE clause hence the result set will be the cross product of those two table.
You need to stop using SQL '89 implicit join syntax.
Not because it doesn't work, but because it is evil.
Right now you have a cross join which in 99,9% of the cases is not what you want.
Also every sub-select needs to have it's own alias.
SELECT * FROM
(SELECT e.*, e1.* FROM EAI.EAI_EVENT_LOG e
INNER JOIN EAI.EAI_EVENT_LOG_MESSAGE e1 on (......)
WHERE e.SOURCE_URL LIKE '%.XML'
ORDER BY e.REQUEST_DATE_TIME DESC) s WHERE ROWNUM <= 20
Please specify a join criterion on the dotted line.
Normally you do a join on a keyfield e.g. ON (e.id = e1.event_id)
It's bad idea to use select *, it's better to specify exactly which fields you want:
SELECT e.field1 as customer_id
,e.field2 as customer_name
.....

JDBC: Is it possible to execute another query on the results of a previous query?

I want to first get some result set (that includes two joins and selection), and then get the maximum value for one of the columns in the result set.
I need both the data in the original results set, and the max.
Is this possible with JDBC, and how?
I think it is. Should look like this:
SELECT MAX(derivedTable.myRow) FROM
(SELECT * FROM table1 JOIN table2 ON table1.id = table2.some_id) derivedTable
The key is to assign your inner select an alias ("derivedTable" above) and perform another selection on that.
--- Edit based on comment:
No, I don't think that's possible. Even without the JDBC layer - say, in a direct SQL console - I don't think there is a way to query data in a result set in any RDBMS I know.
Depending on the speed of your query and the size of the result, either performing a second query or just iterating through the results to find the maximum are your best options.
You can do it with standard SQL although it's a bit awkward:
SELECT a, b, c, (SELECT MAX(c) FROM table1 JOIN table2 ON table1.id = table2.table_id) max_c
FROM table1
JOIN table2 ON table1.id = table2.table_id

Resources