How to tune Oracle's SQL*Loader append?

I am writing a Java program that creates a CSV file with 6,800,000 records conforming to specific distribution parameters and populates a table using Oracle's SQL*Loader.
I am testing my program with different record counts (50,000 and 500,000). Generating the CSV file itself is quite fast; using concurrency it takes milliseconds to create these records and write them to the file.
Loading those records, on the other hand, is taking too long. According to the log file generated by SQL*Loader, it takes 00:00:32.90 to populate the table with 50,000 records and 00:07:58.83 to populate it with 500,000.
SQL*Loader benchmarks I've googled show much better performance, such as 2 million rows in less than 2 minutes. I've followed this tutorial to improve the time, but it barely changed at all. There's obviously something wrong here, but I don't know what.
Here's my control file:
OPTIONS (SILENT=ALL, DIRECT=TRUE, ERRORS=50, COLUMNARRAYROWS=50000, STREAMSIZE=500000)
UNRECOVERABLE LOAD DATA
APPEND
INTO TABLE MY_TABLE
FIELDS TERMINATED BY ","
TRAILING NULLCOLS
...
Another important detail: I've tried using PARALLEL=TRUE, but I get the ORA-26002 error (Table MY_TABLE has index defined upon it). Unfortunately, running with skip_index_maintenance renders the index UNUSABLE.
What am I doing wrong?
Update
I have noticed that soon after running the program (less than a second), all rows are already present in the database. Yet, SQL*Loader is still busy and only finishes after 32-45 seconds.
What could it be doing?

One thought would be to create an external table pointed at the CSV file. Then, after creating the file, you can run a SQL script inside Oracle to process the data directly.
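A minimal sketch of that external-table approach, assuming a directory object and placeholder column names (the column list must mirror MY_TABLE's real layout, and the paths/file names are made up):

-- one-time setup: a directory object pointing at the folder holding the CSV (path is a placeholder)
CREATE OR REPLACE DIRECTORY load_dir AS '/path/to/csv';

CREATE TABLE my_table_ext (
  id   NUMBER,
  name VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('my_data.csv')
)
REJECT LIMIT UNLIMITED;

-- then a single direct-path insert moves everything into the target table
INSERT /*+ APPEND */ INTO my_table SELECT * FROM my_table_ext;
COMMIT;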
Or, look at the following (copied from here):
This issue is caused by using the bulk load option in parallel to load an Oracle target that has an index on it. It is an Oracle limitation.
To resolve this issue do one of the following:
· Change the target load option to Normal.
· Disable the enable parallel mode option in relational connection browser.
· Drop the indexes before loading.
· Or create pre- and post-session SQL to drop and re-create the indexes and key constraints (see the sketch below)
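A sketch of that last option (the index name and column are placeholders; adapt them to the actual index reported by ORA-26002):

-- pre-load: drop the index that blocks the parallel direct load
DROP INDEX my_table_idx;

-- ... run SQL*Loader with DIRECT=TRUE (and PARALLEL=TRUE) here ...

-- post-load: re-create the index; NOLOGGING and PARALLEL speed up the rebuild
CREATE INDEX my_table_idx ON my_table (some_column) NOLOGGING PARALLEL;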

Related

Oracle Data Integrator (ODI) 12.2.1 - Loadplan record count issue

I came across a scenario in my project. I am loading data from a file to a table using ODI, and I run my interfaces through a load plan. I have 1,000 records in my source file and I am also getting 1,000 records in the target table, but when I check the ODI load plan execution log it shows the number of inserts as 2,000. Can anyone please help, or is this an ODI bug?
The number of inserts does not only count the inserts into the target table but also all the inserts happening in temporary tables. Depending on the knowledge modules (KMs) used in an interface, ODI might load data into a C$_ table (LKM) or an I$_ table (IKM/CKM). The rows loaded into these tables are also counted.
You can look at the code generated in the Operator to check whether your KMs are using these temporary tables. You can also simulate an execution to see the code that would be generated.
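If you want to double-check outside ODI, here is a minimal sketch of a query against the work schema (it assumes ODI's default C$_/I$_ prefixes and that the staging tables are visible to your user):

-- lists the ODI staging (C$_/I$_) tables visible to the current user
SELECT table_name
FROM   all_tables
WHERE  table_name LIKE 'C$\_%' ESCAPE '\'
   OR  table_name LIKE 'I$\_%' ESCAPE '\';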

Oracle: troubles using the parallel option in CREATE TABLE

I am not an expert in Oracle; I just write queries and scripts through TOAD and I don't really know how Oracle works internally.
For a data analysis task I have created several tables in order to build a DB from them.
To build them I used the following command:
CREATE TABLE SCHEMA.NAME
TABLESPACE SCHEMASPACE NOLOGGING PARALLEL AS
select [...]
with the parallel option active, in order to make the server work on more processors.
Once the DB was created I tried some queries on it, but the same query run twice on the same DB gave me two different answers. In detail, the first time I ran the query it returned about 2,500 rows, and the second time it returned about 2,650 rows.
I have tried to redo the entire process without the parallel option and now it seems to work, but it takes too much time and I need to run the process several times. Does anyone have a solution?

unique constraint violated error performance

I am inserting millions of records into a table; the operation will take hours or maybe a day. After 2 hours the connection from my PC was dropped, so I want to repeat the insert from the start.
My Question
Which is faster: truncating the table and repeating the load, or creating a primary key and continuing, even though a 'unique constraint violated' error will be raised for every record that was already inserted during the first 2 hours?
Truncating the table (if it's a full refresh) is the best option, hands down. There is also a SKIP parameter if you use Oracle's SQL*Loader utility. Let me explain to some extent!
Also try loading the table with SQL*Loader using the DIRECT load option, which loads straight into the data blocks instead of issuing conventional INSERT statements.
With this kind of loading you can enable UNRECOVERABLE, which means little or no redo is written, so the load is much faster (>70%) than a conventional INSERT.
The downside of this loading is that all indexes on the table (except NULL constraints) are made UNUSABLE before the load starts, and then the data is loaded. On successful completion, SQL*Loader re-enables the indexes by rebuilding them. So if the loading is interrupted for any reason, the error messages are logged properly and the indexes are left UNUSABLE.
More details on DIRECT/CONVENTIONAL loading: please find here.
Alternatively, SQL*Loader can perform a conventional load, in which it generates a stream of INSERTs from the file and processes them. With this type of loading, all the indexes are left as they are and the table remains unharmed.
If any error happens, SQL*Loader logs a SKIP value, which means that on the next run, if you specify that number, the table will be loaded from that point of the file.
More details on SQL*Loader: here.
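As a rough sketch of both invocations described above (credentials, file names, and the skip count are placeholders):

# direct path load: fastest, indexes are rebuilt at the end
sqlldr userid=scott/tiger control=load_table.ctl log=load_table.log direct=true

# conventional path re-run after a failure, skipping the rows already committed
sqlldr userid=scott/tiger control=load_table.ctl log=load_table.log skip=1500000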
Not sure how you are loading your table, but this is a classic situation where you should use an Oracle external table.

Attempting to use SQL-Developer to analyze a system table dump created with 'exp'

I'm attempting to recover the data from a specific table that exists in a system table dump I performed earlier. I would like to append the rows existing in the dump to any rows that may exist in the active table. The problem is, it's likely that the name of the table in the dump is not the same as what exists in the database currently (they're dynamically created with a prefix of ARC_TREND_). In addition, I don't know the name of the table as it exists in the dump. I was hoping to use SQL Developer to analyze the dump file, as I can recognize the correct table by its columns and its existing rows.
While I'm going on blind faith that SQL Developer can work with my dump file, when attempting to open it I get a Java heap OutOfMemory exception. I've adjusted the maximum heap size from 640m to 1024m in both sqldeveloper.bat and sqldeveloper.conf, but to no avail.
Can someone recommend a course of action to recover the data from a table that exists in an exp-created dump file? A graphical tool would be nice, but I'm no stranger to the command line. I need to analyze the tables that exist in the dump in order to pick the correct one. Then I assume I can use imp TABLES= to bring it back into the active instance. It likely won't match the existing table name, so I will use SQL Developer to copy the rows from the imported table to the table where I need them to be.
The dump was taken from a Linux server running 10g, and will be imported to (the same server & database instance, upgraded) an 11g instance of the same database.
Thanks
Since you're referring to imp rather than impdp, I assume this wasn't exported with data pump. Either way, I doubt you'll get anything useful through SQL Developer.
Fortunately, most of what you're trying to do is quite easy from the command line; just run imp with the INDEXFILE parameter, which will give you a text file containing all of the table creation commands (commented out with REM) and the index creation commands. From that you should be able to spot the table by its column names.
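For example, something along these lines (credentials and file names are placeholders):

# writes the table and index DDL from the dump into dump_ddl.sql
imp username/password file=system_dump.dmp full=y indexfile=dump_ddl.sql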
You can't really see any row data though, so if there's more than one possible match you might need to import several tables and inspect the data in them in the database to see which one you really want.

sqlldr corrupts my primary key after the first commit

sqlldr is corrupting my primary key index after the first commit defined in my ctl file. After that first commit, no matter what I set the ROWS value to in my control file, I get:
ORA-39776: fatal Direct Path API error loading table PE_OWNER.CLINICAL_CODE
ORA-01502: index 'PE_OWNER.CODE_PK' or partition of such index is in unusable state
SQL*Loader-2026: the load was aborted because SQL Loader cannot continue.
I'm using Oracle database and client 11.1.0.6.0.
I know the issue is not due to duplicate rows, because if I set the ROWS directive to a huge value the index is not corrupted after sqlldr does a single commit for the entire file. This gives me a workaround, but it's still a little alarming...
Thanks for any guidance anyone can give.
I don't use SQL*Loader much on production tables, but from what I've read, you need to use a conventional path load.
From the SQL*Loader documentation:
When to Use a Conventional Path Load

If load speed is most important to you, you should use direct path load because it is faster than conventional path load. However, certain restrictions on direct path loads may require you to use a conventional path load. You should use a conventional path load in the following situations:

* When accessing an indexed table concurrently with the load, or when applying inserts or updates to a nonindexed table concurrently with the load

To use a direct path load (with the exception of parallel loads), SQL*Loader must have exclusive write access to the table and exclusive read/write access to any indexes.
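In practical terms that means leaving direct path off, for example (a sketch with placeholder credentials and file names; DIRECT defaults to FALSE and is shown only for clarity, and ROWS here is just the commit interval):

# conventional path load
sqlldr userid=pe_owner/secret control=clinical_code.ctl log=clinical_code.log direct=false rows=5000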
I believe the issue was that Oracle did not have time to rebuild the indices on the table in question, so I increased the batch commit size to a number larger than the number of records I was importing.
That fixed the issue.
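Expressed as a control file, that workaround looks roughly like this (the INFILE name and the ROWS value are placeholders; ROWS just needs to exceed the file's total record count so only one data save occurs):

OPTIONS (DIRECT=TRUE, ROWS=10000000)
LOAD DATA
INFILE 'clinical_codes.csv'
APPEND
INTO TABLE PE_OWNER.CLINICAL_CODE
FIELDS TERMINATED BY ","
...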
