How to efficiently export and import an Oracle table?

What is the best way to do this so that we waste as little time as possible on both the export and the import, taking into account that we are talking about a huge table with more than a decade of data?
What I've been planning so far:
directory=dumps
dumpfile=foo.dmp
parallel=8
logfile=foo_exp.log
tables=FOO
query=FOO:"WHERE TSP <= sysdate"
content=DATA_ONLY
The import part:
directory=dumps
dumpfile=foo.dmp
parallel=8
logfile=foo_imp.log
remap_table=FOO:FOO_REPARTITIONED
table_exists_action=REPLACE
Both scripts are going to be run like this:
nohup expdp USER/PWD@sid parfile=export.par &
nohup impdp USER/PWD@sid parfile=import.par &
Is the parallel parameter going to work as expected? Do I need to take anything else into account?

You need to consider a few things.
The PARALLEL parameter of Data Pump will not work unless you specify multiple dump files using the %U substitution variable. So in your case:
directory=dumps
dumpfile=foo_%U.dmp
parallel=8
logfile=foo_exp.log
tables=FOO
query=FOO:"WHERE TSP <= sysdate"
content=DATA_ONLY
From the documentation
The value that you specify for integer should be less than, or equal
to, the number of files in the dump file set (or you should specify
either the %U or %L substitution variables in the dump file
specifications).
Also, take into consideration the following restrictions:
This parameter is valid only in the Enterprise Edition of Oracle
Database 11g or later.
To export a table or table partition in parallel (using parallel
query, or PQ, worker processes), you must have the
DATAPUMP_EXP_FULL_DATABASE role.
Transportable tablespace metadata cannot be exported in parallel.
Metadata cannot be exported in parallel when the NETWORK_LINK
parameter is also used. The following objects cannot be exported in
parallel: TRIGGER, VIEW, OBJECT_GRANT, SEQUENCE, CONSTRAINT,
REF_CONSTRAINT.
So in your case, set the parameter to a value appropriate for the hardware available on your server.
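To get a feel for how far you can push it, a common starting point (just a rule of thumb, not an Oracle recommendation) is the CPU count the instance reports; a minimal sketch to check it:
-- Sketch: see how many CPUs the instance reports before choosing a PARALLEL value.
SELECT name, value
FROM v$parameter
WHERE name = 'cpu_count';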
Update
Sorry for taking so long to answer; I was rather busy. You mentioned issues during the import. Well, if the structure of the tables is not the same (for example, the partition key), that can have an effect on the import operation. Normally in this case I would suggest being smart and speeding up the import by splitting the operation into two steps:
First Step - Import Datapump into normal table
directory=dumps
dumpfile=foo_%U.dmp
parallel=8
logfile=foo_imp.log
remap_table=FOO:TMP_FOO
table_exists_action=TRUNCATE
TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y
ACCESS_METHOD=DIRECT_PATH
content=DATA_ONLY
Be sure to have the table TMP_FOO created before starting the operation. The first step is to import the Data Pump file (data only) into a non-partitioned table using direct path and without logging.
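If FOO is visible from the schema you are importing into, a minimal sketch for creating the staging table is a CTAS with no rows (otherwise, build TMP_FOO from the DDL of FOO with the partitioning clause removed):
-- Sketch: empty, non-partitioned, NOLOGGING staging table with the same columns as FOO.
CREATE TABLE tmp_foo NOLOGGING
AS SELECT * FROM foo WHERE 1 = 0;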
Second Step - Direct Path Insert from TMP_FOO into your final table
alter session enable parallel dml;
alter session force parallel query;
insert /*+ append parallel(a,8) */ into your_partitioned_table a
select /*+ parallel(b,8) */ * from tmp_foo b;
commit;
I think this would make the time go down.
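Once the insert has committed, a sketch of the usual housekeeping (the names are the placeholders used above; the LOGGING step only matters if the table was switched to NOLOGGING for the load, and a backup afterwards is advisable because NOLOGGING / direct-path work cannot be recovered from the archived logs):
ALTER TABLE your_partitioned_table LOGGING;
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'YOUR_PARTITIONED_TABLE', cascade => TRUE);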

Related

Minimal Oracle DDL to bump DBA_OBJECTS.LAST_DDL_TIME

I'm using an external tool that scans tables in my database. It uses dba_objects.last_ddl_time to determine which tables have been scanned. Obviously, this strategy does not work if the table data is modified in between scans so sometimes I have to help it...
I need a way to "bump" the Last DDL time without actually changing anything.
I'm looking for the simplest possible instant DDL statement that can be executed on any table, knowing just the table name.
I have sysdba privileges.
Edit:
For example, I can use comment on table xxx is 'Boom'; but then I lose the original comment. I know how to fix this, but then it is no longer a small and easy statement I can quickly type in SQL*Plus.
Changing LOGGING/NOLOGGING is pretty fast (though not instant).
If you set the LOGGING attribute back to itself, it will bump the LAST_DDL_TIME without making any real change to the table. The example below tries to touch every table except those in the system schemas (presumably you'd want more restrictions here):
BEGIN
  FOR TABLE_POINTER IN (SELECT OWNER,
                               TABLE_NAME,
                               DECODE(LOGGING, 'YES', 'LOGGING', 'NOLOGGING') DO_LOGGING
                        FROM   DBA_TABLES
                        WHERE  OWNER NOT IN ('SYSTEM', 'SYS', 'SYSBACKUP', 'MDSYS') -- etc. other restrictions here
                       )
  LOOP
    EXECUTE IMMEDIATE UTL_LMS.FORMAT_MESSAGE('ALTER TABLE %s.%s %s',
                                             TABLE_POINTER.OWNER,
                                             TABLE_POINTER.TABLE_NAME,
                                             TABLE_POINTER.DO_LOGGING);
  END LOOP;
END;
/
EDIT: The above wouldn't work with temp tables. An alternative such as setting PCT_FREE to itself or another suitable attribute may be preferable. You may need to handle IOTs, Partitioned Tables, etc. differently than the rest of the tables as well.
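For a single, known table, the PCT_FREE variant could look like this sketch (MYOWNER.MYTABLE is a placeholder, and PCT_FREE can be NULL for partitioned tables, which is one of the cases that needs separate handling):
DECLARE
  l_pct_free dba_tables.pct_free%TYPE;
BEGIN
  SELECT pct_free INTO l_pct_free
  FROM dba_tables
  WHERE owner = 'MYOWNER' AND table_name = 'MYTABLE';
  -- Setting PCTFREE to its current value changes nothing but still counts as DDL.
  EXECUTE IMMEDIATE 'ALTER TABLE MYOWNER.MYTABLE PCTFREE ' || l_pct_free;
END;
/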

How to update an Oracle Table from SAS efficiently?

The problem I am trying to solve:
I have a SAS dataset work.testData (in the work library) that contains 8 columns and around 1 million rows. All columns are text (i.e. no numeric data). This SAS dataset is around 100 MB in file size. My objective is to have a step that pushes this entire SAS dataset into Oracle, i.e. sort of like a "copy and paste" of the SAS dataset from the SAS platform to the Oracle platform. The rationale behind this is that, on a daily basis, this table in Oracle gets "replaced" by the one in SAS, which enables downstream Oracle processes.
My approach to solve the problem:
One-off initial setup in Oracle:
In Oracle, I created a table called testData with a table structure pretty much identical to the SAS dataset testData. (i.e. Same table name, same number of columns, same column names, etc.).
On-going repeating process:
In SAS, do a SQL pass-through to truncate ora.testData (i.e. remove all rows whilst keeping the table structure). This ensures ora.testData is empty before inserting from SAS.
In SAS, a LIBNAME statement to assign the Oracle database as a SAS library (called ora). So I can "see" what's in Oracle and perform read/update from SAS.
In SAS, a PROC SQL procedure to "insert" data from the SAS dataset work.testData into the Oracle table ora.testData.
Sample codes
One-off initial setup in Oracle:
Step 1: Run this Oracle SQL Script in Oracle SQL Developer (to create table structure for table testData. 0 rows of data to begin with.)
DROP TABLE testData;
CREATE TABLE testData
(
NODENAME VARCHAR2(64) NOT NULL,
STORAGE_NAME VARCHAR2(100) NOT NULL,
TS VARCHAR2(10) NOT NULL,
STORAGE_TYPE VARCHAR2(12) NOT NULL,
CAPACITY_MB VARCHAR2(11) NOT NULL,
MAX_UTIL_PCT VARCHAR2(12) NOT NULL,
AVG_UTIL_PCT VARCHAR2(12) NOT NULL,
JOBRUN_START_TIME VARCHAR2(19) NOT NULL
)
;
COMMIT;
On-going repeating process:
Step 2, 3 and 4: Run this SAS code in SAS
******************************************************;
******* On-going repeatable process starts here ******;
******************************************************;
*** Step 2: Truncate the temporary Oracle transaction dataset;
proc sql;
connect to oracle (user=XXX password=YYY path=ZZZ);
execute (
truncate table testData
) by oracle;
execute (
commit
) by oracle;
disconnect from oracle;
quit;
*** Step 3: Assign Oracle DB as a libname;
LIBNAME ora Oracle user=XXX password=YYY path=ZZZ dbcommit=100000;
*** Step 4: Insert data from SAS to Oracle;
PROC SQL;
insert into ora.testData
select NODENAME length=64,
STORAGE_NAME length=100,
TS length=10,
STORAGE_TYPE length=12,
CAPACITY_MB length=11,
MAX_UTIL_PCT length=12,
AVG_UTIL_PCT length=12,
JOBRUN_START_TIME length=19
from work.testData;
QUIT;
******************************************************;
**** On-going repeatable process ends here *****;
******************************************************;
The limitation / problem to my approach:
The PROC SQL step (that transfers 100 MB of data from SAS to Oracle) takes around 5 hours to perform - the job takes too long to run!
The Question:
Is there a more sensible way to perform data transfer from SAS to Oracle? (i.e. updating an Oracle table from SAS).
First off, you can do the drop/recreate from SAS if that's a necessity. I wouldn't drop and recreate each time - a truncate seems an easier way to get the same result - but if you have other reasons then that's fine; either way, you can use execute (truncate table xyz) by oracle, or a similar pass-through statement to drop the table.
Second, assuming there are no constraints or indexes on the table - which seems likely given you are dropping and recreating it - you may not be able to improve this much, because it may be limited by network latency. However, there is one area you should look at in the connection settings (which you don't provide): how often SAS commits the data.
There are two ways to control this: the DBCOMMIT setting and the BULKLOAD setting. The former controls how frequently commits are executed (so if DBCOMMIT=100 then a commit is executed every 100 rows). More frequent commits mean less data is lost if a random failure occurs, but much slower execution. DBCOMMIT defaults to 0 for PROC SQL INSERT, which means just make one commit (the fastest option assuming no errors), so this is less likely to be helpful unless you're overriding it (as the dbcommit=100000 in your LIBNAME does).
Bulkload is probably my recommendation; that uses SQL*Loader (SQLLDR) to load your data, i.e., it batches the whole lot over to Oracle and then says 'Load this please, thanks.' It only works with certain settings and certain kinds of queries, but it ought to work here (subject to other conditions - read the documentation page above).
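As a sketch of the bulk-load route (this swaps the PROC SQL INSERT for PROC APPEND with the BULKLOAD= data set option of SAS/ACCESS to Oracle; credentials are the placeholders from the question, and your site may need additional bulk-load options):
*** Sketch: hand the rows to SQL*Loader instead of doing row-by-row inserts;
libname ora oracle user=XXX password=YYY path=ZZZ;
proc append base=ora.testData (bulkload=yes)
            data=work.testData force;
run;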
If you're using BULKLOAD, then you may be up against network latency. 5 hours for 100 MB seems slow, but I've seen all sorts of things in my (relatively short) day. If BULKLOAD didn't work I would probably bring in the Oracle DBAs and have them troubleshoot this, starting from a .csv file and a SQL*LDR command file (which should be basically identical to what SAS is doing with BULKLOAD); they should know how to troubleshoot that and at least be able to monitor performance of the database itself. If there are constraints on other tables that are problematic here (ie, other tables that too-frequently recalculate themselves based on your inserts or whatever), they should be able to find out and recommend solutions.
You could look into PROC DBLOAD, which sometimes is faster than inserts in SQL (though all in all shouldn't really be, and is an 'older' procedure not used too much anymore). You could also look into whether you can avoid doing a complete flush and fill (ie, if there's a way to transfer less data across the network), or even simply shrinking the column sizes.

Oracle DBMS package command to export table content as INSERT statement

Is there any subprogram similar to DBMS_METADATA.GET_DDL that can actually export the table data as INSERT statements?
For example, using DBMS_METADATA.GET_DDL('TABLE', 'MYTABLE', 'MYOWNER') will export the CREATE TABLE script for MYOWNER.MYTABLE. Any such things to generate all data from MYOWNER.MYTABLE as INSERT statements?
I know that, for instance, TOAD for Oracle or SQL Developer can export as INSERT statements pretty fast, but I need a more programmatic way of doing it. Also, I cannot create any procedures or functions in the database I'm working on.
Thanks.
As far as I know, there is no Oracle supplied package to do this. And I would be skeptical of any 3rd party tool that claims to accomplish this goal, because it's basically impossible.
I once wrote a package like this, and quickly regretted it. It's easy to get something that works 99% of the time, but that last 1% will kill you.
If you really need something like this, and need it to be very accurate, you must tightly control what data is allowed and what tools can be used to run the script. Below is a small fraction of the issues you will face:
Escaping
Single inserts are very slow (especially if it goes over a network)
Combining inserts is faster, but can run into some nasty parsing bugs when you start inserting hundreds of rows
There are many potential data types, including custom ones. You may only have NUMBER, VARCHAR2, and DATE now, but what happens if someone adds RAW, BLOB, BFILE, nested tables, etc.?
Storing LOBs requires breaking the data into chunks because of VARCHAR2 size limitations (4000 or 32767, depending on how you do it); see the sketch after this list.
Character set issues - This will drive you ¿¿¿¿¿¿¿ insane.
Environment limitations - For example, SQL*Plus does not allow more than 2500 characters per line, and will drop whitespace at the end of your line.
Referential Integrity - You'll need to disable these constraints or insert data in the right order.
"Fake" columns - virtual columns, XML lobs, etc. - don't import these.
Missing partitions - If you're not using INTERVAL partitioning you may need to manually create them.
Novalidated data - Just about any constraint can be violated, so you may need to disable everything.
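To illustrate the LOB point above, here is a rough sketch (MY_TAB, ID and DOC are hypothetical names; the chunk size is kept small so that the doubled-up quotes still fit in a SQL string literal) of spooling a CLOB column as appended chunks:
DECLARE
  l_chunk CONSTANT PLS_INTEGER := 1000;
  l_len   PLS_INTEGER;
  l_pos   PLS_INTEGER;
BEGIN
  FOR r IN (SELECT id, doc FROM my_tab) LOOP
    l_len := DBMS_LOB.GETLENGTH(r.doc);
    l_pos := 1;
    WHILE l_pos <= l_len LOOP
      -- Each generated statement appends one quote-escaped chunk to the target CLOB.
      DBMS_OUTPUT.PUT_LINE(
        'UPDATE my_tab SET doc = doc || ''' ||
        REPLACE(DBMS_LOB.SUBSTR(r.doc, l_chunk, l_pos), '''', '''''') ||
        ''' WHERE id = ' || r.id || ';');
      l_pos := l_pos + l_chunk;
    END LOOP;
  END LOOP;
END;
/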
If you want your data to be accurate you just have to use the Oracle utilities, like data pump and export.
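For example, a table-mode Data Pump export is a one-liner (a sketch; MYOWNER, MYTABLE and the file names are placeholders, and the DATA_PUMP_DIR directory object must exist and be writable):
expdp myowner/password directory=DATA_PUMP_DIR dumpfile=mytable.dmp logfile=mytable.log tables=MYTABLE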
Why don't you use regular export?
If you must, you can generate the export script:
Let's assume a Table myTable(Name VARCHAR(30), AGE Number, Address VARCHAR(60)).
select 'INSERT INTO myTable VALUES (''' || Name || ''', ' || AGE || ', ''' || Address || ''');' from myTable;
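If Name or Address can contain single quotes, or AGE can be NULL, the generated statement also has to handle that (the Escaping point from the answer above); a sketch:
select 'INSERT INTO myTable VALUES (''' || REPLACE(Name, '''', '''''') || ''', '
    || NVL(TO_CHAR(AGE), 'NULL') || ', '''
    || REPLACE(Address, '''', '''''') || ''');'
from myTable;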
Oracle SQL Developer does that with its Export feature, DDL as well as the data itself.
It can be a bit inconvenient for huge tables and is likely to hit the issues mentioned above, but it works well 99% of the time.

Oracle export problem

cmd:
exp bla/bla file=c:\bla.bkp
My bla schema contains the following objects:
Table
T_1
T_2
T_3
T_4
Functions
F_1
F_2
Procedure
P_1
P_2
I need all objects but not table T_4. How can I do that?
If you are using the deprecated export utility, you cannot exclude a single object. You would have to specify every table that you wanted in a TABLES clause, i.e.
exp username/password file=c:\bla.dmp tables=(T_1, T_2, T_3)
Obviously, that gets unwieldy rather quickly. You can potentially write a query that generates the tables list for you and then copy & paste from a SQL*Plus window. But that is also rather unwieldy.
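A sketch of such a query (LISTAGG needs 11gR2 or later; adjust the exclusion list to your case):
SELECT LISTAGG(table_name, ',') WITHIN GROUP (ORDER BY table_name) AS tables_list
FROM user_tables
WHERE table_name <> 'T_4';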
Assuming you are using a reasonably new version of Oracle, however, you should be able to use the Data Pump versions of the export and import utilities, expdp and impdp. With expdp you can exclude a single table:
expdp username/password dumpfile=c:\bla.dmp exclude=TABLE:"IN ('T_4')"
You can specify the tables of interest on the command line, something like
exp bla/bla file=c:\bla.bkp TABLES=(T_1,T_2,T_3)
OK, that only gets the tables; for the rest of the objects you are going to have to use/write something else. Look at the dbms_metadata.GET_DDL function.
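For example (a sketch; run it as the bla user or pass the owner as the third argument, and raise the SQL*Plus LONG setting so the output is not truncated):
SET LONG 100000
SELECT dbms_metadata.get_ddl('FUNCTION', 'F_1') FROM dual;
SELECT dbms_metadata.get_ddl('PROCEDURE', 'P_1') FROM dual;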

Importing selective data using impdp

I have an entire DB to be imported as a dump into my own. I want to exclude the data from certain tables (mostly because they are huge and not useful). I cannot entirely exclude those tables, since I need the table objects themselves (minus the data) and would have to recreate them in my schema if I did so. Also, in the absence of those table objects, various foreign key constraints defined on other tables would fail to be imported and would need to be redefined. So I need to exclude just the data from certain tables; I want the data from all other tables, though.
Is there a set of parameters for impdp that can help me do so?
I would make two runs at it. The first would import metadata only:
impdp ... CONTENT=METADATA_ONLY
The second would include the data only for the tables I was interested in:
impdp ... CONTENT=DATA_ONLY TABLES=table1,table2...
Definitely make two runs: one to create all the table objects, but instead of using TABLES in the second impdp run, use EXCLUDE:
impdp ... CONTENT=DATA_ONLY EXCLUDE=TABLE:"IN ('table1', 'table2')"
The other way works, but this way you only have to list the tables you don't want versus all that you want.
If the table is too big to export in full, you can use the SAMPLE parameter in the expdp command to export only a percentage of its rows, for example:
$ expdp tables=T100test DIRECTORY=expimp1 DUMPFILE=test12.dmp SAMPLE=10
This command exports only 10% of the T100test table's data.
Syntax:
EXCLUDE=[object_type]:[name_clause],[object_type]:[name_clause]
INCLUDE=[object_type]:[name_clause],[object_type]:[name_clause]
Examples of operator-usage:
EXCLUDE=SEQUENCE
or EXCLUDE=TABLE:"IN ('EMP','DEPT')"
or EXCLUDE=INDEX:"= 'MY_INDX'"
or INCLUDE=PROCEDURE:"LIKE 'MY_PROC_%'"
or INCLUDE=TABLE:"> 'E'"
The parameter can also be stored in a parameter file, for example: exp.par
DIRECTORY = my_dir
DUMPFILE = exp_tab.dmp
LOGFILE = exp_tab.log
SCHEMAS = scott
INCLUDE = TABLE:"IN ('EMP', 'DEPT')"
It seems you can exclude the data directly when importing, using the impdp QUERY parameter:
impdp [...] QUERY='TABLE_NAME:"WHERE rownum = 0"'
cf : community.oracle.com
