Querying multiple Oracle databases performance issue

I have over a million records in these tables in both databases.
I am trying to compare the data in the two tables across the databases.
SELECT COUNT(*)
FROM DB1.MYTABLE
WHERE SEQ_NO NOT IN (SELECT SEQ_NO FROM DB2.MYTABLE)
  AND FILENAME NOT LIKE '%{%'
  AND PT_TYPE NOT IN (15,24,268,284,285,286,12,17,9,290,214,73)
  AND STTS = 1
The query is taking ages. Is there any way I can make it faster?
Appreciate your help in advance.

Do you actually mean different databases? Or do you mean different schemas? You talk about different databases, but the syntax appears to be using tables in two different schemas, not two different databases. I don't see any reference to a database link, which would be needed if there were two different databases, but perhaps DB2.MYTABLE is supposed to be a synonym for MYTABLE@DB2.
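For reference, a minimal sketch of the link-plus-synonym setup this comment alludes to (all names below are placeholders, not taken from the question):

-- create a link to the remote database (connection details are placeholders)
CREATE DATABASE LINK db2_link
  CONNECT TO remote_user IDENTIFIED BY remote_pwd
  USING 'DB2_TNS_ALIAS';

-- hide the link behind a synonym so queries can use a plain table name
CREATE SYNONYM db2_mytable FOR mytable@db2_link;

-- the remote table can then be queried through the synonym
SELECT COUNT(*) FROM db2_mytable;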
It would be helpful if you could post the query plan that is generated. It would also be useful to indicate what indexes exist and how selective each of these predicates is. My guess is that modifying the query to be
SELECT COUNT(*)
FROM schema1.mytable a
WHERE NOT EXISTS (
        SELECT 1
        FROM schema2.mytable b
        WHERE a.seq_no = b.seq_no )
  AND a.filename NOT LIKE '%{%'
  AND a.pt_type NOT IN (15,24,268,284,285,286,12,17,9,290,214,73)
  AND a.stts = 1
might be more efficient if most of the rows in SCHEMA1.MYTABLE are eliminated because the SEQ_NO exists in SCHEMA2.MYTABLE.
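If you want to capture the plan, one common way is EXPLAIN PLAN together with DBMS_XPLAN; a minimal sketch (substitute the actual query and schema names):

EXPLAIN PLAN FOR
SELECT COUNT(*)
FROM schema1.mytable a
WHERE NOT EXISTS (SELECT 1 FROM schema2.mytable b WHERE a.seq_no = b.seq_no)
  AND a.filename NOT LIKE '%{%'
  AND a.pt_type NOT IN (15,24,268,284,285,286,12,17,9,290,214,73)
  AND a.stts = 1;

-- display the plan the optimizer chose
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);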

Related

Combine Multiple Hive Tables as single table in Hadoop

Hi, I have around 15-20 Hive tables. All the tables share a common schema, and I need to combine them into a single table. The single table will be queried from a reporting tool, so performance also needs to be taken care of.
I tried this:
create table new as
select * from table_a
union all
select * from table_b
Is there any other way to combine all the tables more efficiently? Any help will be appreciated.
Hive will process in parallel if you set "hive.exec.parallel" to true. With "hive.exec.parallel.thread.number" you can specify the number of parallel threads. This will increase the overall efficiency.
If you are trying to merge table_a and table_b into a single one, the easiest way is to use the UNION ALL operator. You can find the syntax and use cases here - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union
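Putting the two suggestions together, a minimal sketch might look like this (the target table name and thread count are placeholders, and the subquery wrapper around UNION ALL keeps it valid on older Hive versions):

SET hive.exec.parallel=true;
SET hive.exec.parallel.thread.number=8;

CREATE TABLE combined_table AS
SELECT * FROM (
  SELECT * FROM table_a
  UNION ALL
  SELECT * FROM table_b
) unioned;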

Searching from all tables in oracle

I am developing a Java app which can connect to an Oracle database and select column names from any tables. After selecting the columns, I have to query the data from the tables the user selected in my Java app. My question is: how can I join all the tables in the database so that the query returns data successfully? I want to be able to connect to any Oracle schema, not just a specific one; I will write the logic in Java, but I am unable to find a query that can extract the data from all tables. I tried a natural join among all the tables, but that depends on the joining columns having the same names, so I want to know a generic way which will work in all conditions.
As others have mentioned, it seems that there are other tools out there that you should probably leverage before trying to roll your own complex solution.
With that said, if you wish to roll your own solution, you could look into using some of Oracle's dictionary views, such as:
SELECT * FROM all_tables;
SELECT * FROM all_tab_cols;
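For example, to find out which tables contain a particular column, you could query the dictionary like this (a rough sketch; the owner and column name are made-up placeholders):

SELECT owner, table_name, column_name, data_type
FROM all_tab_cols
WHERE owner = 'MY_SCHEMA'
  AND column_name = 'CUSTOMER_ID'
ORDER BY table_name;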

MonetDB simple join performance on 2 tables

Let's assume I have two tables with the same row count. Both tables contain a column that allows for a 1-1 join between them.
If those tables were turned into one table instead, and thus the JOIN eliminated from the query, would there be any performance benefit from that?
Another example: let's assume I have a table with 10 columns. From that table I create a new table taking only one column. If I issue a statement selecting that one column with a WHERE predicate on the same column, would there be any performance difference in executing this query against the two tables?
What I'm trying to get at is: if performance is the same in the above cases, is it safe to say tables are only containers wrapping a number of columns together?
I ran a couple of tests, but with inconclusive results.
Let's assume I have two tables with the same row count. Both tables contain a column that allows for a 1-1 join between them. If those tables were turned into one table instead, and thus the JOIN eliminated from the query, would there be any performance benefit from that?
Performing that join for every query is of course more expensive than materializing the table once and then reading it. So yes, there would be a performance benefit.
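To make that concrete, here is a minimal sketch under the assumptions above (the table and column names are invented for illustration):

-- the 1-1 join that has to be performed on every query
SELECT t1.a, t2.b
FROM t1 JOIN t2 ON t1.k = t2.k;

-- versus materializing the join once ...
CREATE TABLE t_combined AS
SELECT t1.k, t1.a, t2.b
FROM t1 JOIN t2 ON t1.k = t2.k
WITH DATA;

-- ... and then simply scanning the pre-joined table
SELECT a, b FROM t_combined;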
Another example: let's assume I have a table with 10 columns. From that table I create a new table taking only one column. If I issue a statement selecting that one column with a WHERE predicate on the same column, would there be any performance difference in executing this query against the two tables?
No, there would be no difference, since tables are represented as collections of columns, which are each stored in their own file.
What I'm trying to get at is: if performance is the same in the above cases, is it safe to say tables are only containers wrapping a number of columns together?
That is indeed safe to say.

Is there any use to create index on all the table columns in oracle?

In one of our production databases, we have a 4-column table with no PK or UK constraints on it, only a NOT NULL constraint on one column. Inserts into this table are slow, and when I checked the indexes, there is one index built on all columns.
It is a normal table, not an IOT. I really don't see a need for an all-column index, but I am wondering why the developers created it.
I'd appreciate your thoughts.
It might be useful: if you (mainly) query all columns, Oracle doesn't have to access the table at all, but can get all the data from the index. However, inserts take longer, because a larger index has to be maintained by the DBMS on every modification.
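A rough sketch of what such index-only access might look like (the table and column names are placeholders, not the actual production table):

-- composite index covering all four columns
CREATE INDEX mytable_all_cols_ix ON mytable (c1, c2, c3, c4);

-- a query that touches only indexed columns can be answered from the index alone,
-- without visiting the table
SELECT c1, c2, c3, c4
FROM mytable
WHERE c1 = :some_value;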
One case where it could be useful: say, for example, you are checking for the existence of records in this table, and for that you have to join on all four columns. In such a case, if you have written a correlated query like the one below, the all-column index can satisfy the subquery without touching the table.
SELECT <something>
FROM table_1 t1
WHERE EXISTS (
        SELECT 1
        FROM table_2 t2
        WHERE t1.c1 = t2.c1
          AND t1.c2 = t2.c2
          AND t1.c3 = t2.c3
          AND t1.c4 = t2.c4 )
Apart from the above case, it looks like an error on the developers' side to me.
Indexes are good for query optimization, but they cause slow updates/inserts because the indexes need to be maintained on each modification.
If this table is mainly used for querying and inserts happen only during specific periods, like a batch at the beginning or end of the day, then you can drop the indexes before loading the table and restore them afterwards, as in the sketch below.
In addition, all the queries against these tables need to be analysed to see which indexes are useful and which are not.
Anyway, you need to ask the developers before removing these indexes.
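A minimal sketch of that drop-load-recreate pattern (index, table and column names are placeholders):

-- drop the index before the batch load
DROP INDEX mytable_all_cols_ix;

-- ... run the batch inserts here ...

-- recreate the index once the load has finished
CREATE INDEX mytable_all_cols_ix ON mytable (c1, c2, c3, c4);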

Oracle: Having a join or a simple from/where clause has no effect on performance?

My manager just told me that having joins or a WHERE clause in an Oracle query doesn't affect performance, even when you have a million records in each table. I am just not satisfied with this and want to confirm it.
Which of the following queries performs better, in Oracle and also in PostgreSQL?
1- select a.name, b.salary, c.address
   from a, b, c
   where a.id = b.id and a.id = c.id;
2- select a.name, b.salary, c.address
   from a
   join b on a.id = b.id
   join c on a.id = c.id;
I have tried EXPLAIN in PostgreSQL on a small data set and the query time was the same (maybe because I have just a few rows), and right now I have no access to Oracle or the actual database to analyse the EXPLAIN output in a real environment.
Using JOINS makes the code easier to read, since it's self-explanatory.
In speed there is no difference (I have just tested it), and the execution plan is the same.
If the query optimizer is doing its job right, there should be no difference between those queries.
They are just two ways to specify the same desired result.
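Once you have access to a real environment, you can check this yourself; a minimal sketch in PostgreSQL (table names follow the question, and note that EXPLAIN ANALYZE actually executes the statements):

EXPLAIN ANALYZE
SELECT a.name, b.salary, c.address
FROM a, b, c
WHERE a.id = b.id AND a.id = c.id;

EXPLAIN ANALYZE
SELECT a.name, b.salary, c.address
FROM a
JOIN b ON a.id = b.id
JOIN c ON a.id = c.id;

-- if the optimizer treats both forms identically, the two plans will match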
