Can anyone suggest how to convert an Oracle CONNECT BY query to Greenplum? Greenplum doesn't support recursive queries, so we cannot use WITH RECURSIVE. Is there an alternative way to rewrite the query below?
SELECT child_id, Parnet_id, LEVEL, SYS_CONNECT_BY_PATH(child_id, '/') AS hierarchy
FROM pathnode
START WITH Parnet_id = child_id
CONNECT BY NOCYCLE PRIOR child_id = Parnet_id;
There are ways to do this, but it will be a one-off per query. You will need to create a function that loops through your pathnode table and uses "return next" to emit each row. You can search this site for examples of doing this with PostgreSQL 8.2.
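Roughly, such a function could walk the hierarchy level by level. The sketch below is only an illustration: the type name, function name, integer key types, and the temp-table approach are all assumptions, the NOT EXISTS check is a rough stand-in for NOCYCLE, and Greenplum's restrictions on DDL inside functions may force adjustments.

CREATE TYPE pathnode_row AS (child_id int, parnet_id int, lvl int, hierarchy text);

CREATE OR REPLACE FUNCTION pathnode_hierarchy() RETURNS SETOF pathnode_row AS $$
DECLARE
    r         pathnode_row;
    cur_level int := 1;
    n_rows    int;
BEGIN
    -- Level 1: the START WITH rows (parnet_id = child_id)
    CREATE TEMP TABLE work_set AS
        SELECT child_id, parnet_id, 1 AS lvl,
               '/' || child_id::text AS hierarchy
        FROM pathnode
        WHERE parnet_id = child_id;

    LOOP
        -- Emit the rows found at the current level
        FOR r IN SELECT * FROM work_set WHERE lvl = cur_level LOOP
            RETURN NEXT r;
        END LOOP;

        -- Attach children of the current level (PRIOR child_id = parnet_id),
        -- skipping nodes already visited (rough NOCYCLE behaviour)
        INSERT INTO work_set
        SELECT p.child_id, p.parnet_id, cur_level + 1,
               w.hierarchy || '/' || p.child_id::text
        FROM pathnode p
        JOIN work_set w ON p.parnet_id = w.child_id AND w.lvl = cur_level
        WHERE NOT EXISTS (SELECT 1 FROM work_set v WHERE v.child_id = p.child_id);

        GET DIAGNOSTICS n_rows = ROW_COUNT;
        EXIT WHEN n_rows = 0;
        cur_level := cur_level + 1;
    END LOOP;

    DROP TABLE work_set;
    RETURN;
END;
$$ LANGUAGE plpgsql;

-- SELECT * FROM pathnode_hierarchy();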
Work is happening to rebase Greenplum onto PostgreSQL 8.3, 8.4, and so on. Those later PostgreSQL versions support WITH RECURSIVE, which is the ANSI SQL way to write this kind of query, but Greenplum doesn't support it yet. Even when Greenplum does support it, I don't think it will perform all that well: the query forces looping and individual row lookups, which works great in an OLTP database but not so well in an MPP database.
I suggest you flatten your data in Oracle with a VIEW and then just dump the view to a file to load into Greenplum. A self-referencing, N-level table design will never be a good idea in an MPP database.
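For example, the query from the question could simply be wrapped in a view on the Oracle side, and the view's output dumped to a flat file for loading into Greenplum (the view name here is made up):

-- Oracle side: flatten the hierarchy once, then export the view's rows
CREATE OR REPLACE VIEW pathnode_flat AS
SELECT child_id,
       Parnet_id,
       LEVEL AS lvl,
       SYS_CONNECT_BY_PATH(child_id, '/') AS hierarchy
FROM pathnode
START WITH Parnet_id = child_id
CONNECT BY NOCYCLE PRIOR child_id = Parnet_id;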
Related
Currently we are using Oracle SQL and replicating data. After doing some analysis, we found that we are not using most of the data we replicate. To optimize this, we would like to replicate data based on column conditions. Can this be achieved in Oracle? If so, what strategy can we use?
In my Java program, I take values from a PostgreSQL database and use them in a SELECT query against an Oracle database.
The problem is that this takes too long. First I fetch data from a Postgres table and load it into a variable.
Then, using this variable, I execute a SELECT query against an Oracle table.
I want to make this process faster. Is it possible to do this in a single query that takes data from the PostgreSQL table and fetches the matching data from the Oracle table?
Postgres statement:
select filial_name
into f_name
from branch
where id=1;
Oracle statement:
select sum(credit)
from balance
where filial_n = f_name;
The above process runs in a loop.
If you have to run a massive join between an Oracle table and a PostgreSQL table, that is never going to be very fast.
But you can do much better than performing the join in your application by defining an oracle_fdw foreign table in PostgreSQL and performing the join in PostgreSQL.
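A rough sketch of that approach, reusing the table and column names from the question (the server name, connection string, credentials, and column types are assumptions):

-- Expose the Oracle table "balance" as a foreign table in PostgreSQL
CREATE EXTENSION oracle_fdw;

CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//oracle-host:1521/ORCL');

CREATE USER MAPPING FOR CURRENT_USER SERVER oradb
    OPTIONS (user 'scott', password 'tiger');

CREATE FOREIGN TABLE balance (
    filial_n text,
    credit   numeric
) SERVER oradb OPTIONS (schema 'SCOTT', table 'BALANCE');

-- One set-based query replaces the per-row loop in the application
SELECT br.filial_name, SUM(bal.credit) AS total_credit
FROM branch br
JOIN balance bal ON bal.filial_n = br.filial_name
GROUP BY br.filial_name;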
I have a question about Postgres. I have used dbms_stats.gather_table_stats for performance optimization in Oracle. We would like to switch our database from Oracle to Postgres, so I want the same capability in Postgres. I searched the internet for a Postgres equivalent of Oracle's dbms_stats.gather_table_stats, but all I found were things like EXPLAIN and VACUUM, which already exist in Oracle under the same names; I can't find a proper counterpart for dbms_stats.gather_table_stats. I am spending a lot of time on this, so any advice would be appreciated.
The GATHER_TABLE_STATS procedure of DBMS_STATS package collects statistics of the specified table in Oracle.
In Postgres, we use ANALYZE for the same purpose.
ANALYZE collects statistics about the contents of tables in the database, and stores the results in the pg_statistic system catalog. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.
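For example (the table and column names below are only placeholders):

-- Collect statistics for one table, roughly the role
-- DBMS_STATS.GATHER_TABLE_STATS plays in Oracle:
ANALYZE orders;

-- Restrict collection to specific columns:
ANALYZE orders (customer_id, order_date);

-- Analyze every table in the current database:
ANALYZE;

Note that autovacuum normally runs ANALYZE for you; running it manually is mainly useful right after bulk loads or large data changes.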
I am developing a Java app that connects to an Oracle database and selects column names from any table. After the user selects columns in my Java app, I have to query the data from those tables. My question is: how can I join all the tables in the database so that the query returns data successfully? I want to be able to connect to any Oracle schema; I will build the logic in Java, but I cannot find a query that extracts the data from all tables. I tried a natural join across all tables, but it depends on the joining columns having the same names. Is there a generic approach that works in all cases?
As others have mentioned, there are existing tools you should probably leverage before trying to roll your own complex solution.
That said, if you wish to roll your own solution, you could look into using some of Oracle's data dictionary views, such as:
SELECT * FROM all_tables;
SELECT * FROM all_tab_cols;
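For instance, a query along these lines (the schema name is a placeholder) lists pairs of tables that share a column name, which your Java code could then turn into candidate join conditions:

-- Columns shared by pairs of tables in one schema; each match is a
-- candidate join condition the application could offer to the user.
SELECT a.table_name AS table_a,
       b.table_name AS table_b,
       a.column_name
FROM   all_tab_cols a
JOIN   all_tab_cols b
       ON  a.owner = b.owner
       AND a.column_name = b.column_name
       AND a.table_name < b.table_name
WHERE  a.owner = 'MY_SCHEMA'
ORDER BY a.table_name, b.table_name, a.column_name;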
Hey EXPERIENCED SSIS DEVELOPERS, I need your help.
High-Level Requirements
Query a SQL Server table (on a different server than my SSIS server), resulting in a result set of about 200-300k records.
Use three output columns from each row to look up data in an Oracle database.
Insert or update a SQL Server table with the results.
Use SSIS.
SQL Server 2008
Sounds easy, right?
Here is what I have done:
Created an Execute SQL Task on the Control Flow that gets a recordset from SQL Server. Very fast, easy query, like select field1, field2, field3 from table where condition > 0. That's it. Takes less than a second.
Created a variable (evaluated as an expression) for the Oracle query that uses the result set from the above in its WHERE clause (a rough sketch of such a query follows this list).
Created a ForEachLoop Container that takes the results (from #1 above), and for each row in the recordset runs a Data Flow that uses the Oracle query (from #2 above), with data access mode "SQL command from variable", against an Oracle data source. Fast, simple query with only about 6 columns returned.
Data Conversion - obvious reasons - changing 3 columns from Oracle data types to SQL Server data types.
OLE DB Destination to insert to SQL Server using Fast Load to staging table.
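For illustration only, the expression-built Oracle query from steps 2 and 3 would take roughly this shape; every table, column, and variable name below is a placeholder:

-- Hypothetical shape of the expression on the query variable:
--   "SELECT col_a, col_b, col_c, col_d, col_e, col_f
--    FROM ora_schema.lookup_table
--    WHERE key_col = '" + @[User::CurrentKey] + "'"
--
-- which, for one iteration of the loop, evaluates to something like:
SELECT col_a, col_b, col_c, col_d, col_e, col_f
FROM ora_schema.lookup_table
WHERE key_col = 'value-from-current-row';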
It works perfectly! Hooray! Bad news - it is very, very slow. When I say slow, I mean it processes 3,000 records per hour. Holy moly - so freaking slow.
Question: am I missing a way to speed it up? It seems like the ForEachLoop Container is the bottleneck. Growl.
Important Points:
- I have NO write access in the Oracle environment, so don't even suggest a potential solution that requires it. Not a possibility. At all.
- Oracle sources do not allow for direct parameter definition, so no SELECT FIELD FROM TABLE WHERE ?. Don't suggest it - it doesn't work.
Ideas
- Should I find a way to break down the results of the Execute SQL task and send them through several ForEachLoop Containers for faster processing?
- Is there another design that is more appropriate?
- Is there a script I can use that is faster?
- Would it be faster to create a temporary table in memory and populate it, then use the results to bulk insert into SQL Server? Does this work when using an Oracle data source?
ANY OTHER IDEAS?