Conditional import - how to discard records? - oracle

I want to import a CSV file using SQL*Loader, but I only want specific records. I have solved this with "WHEN record_type = 1" in my control file.
This works, but the log file gets flooded with "Record xxx: Discarded - failed all WHEN clauses." The input files contain millions of records but only a few percent satisfy the condition, so I end up with a log file about the same size as the input file :)
Am I doing this incorrectly?
Is there another way to discard/filter records when using SQLLDR?
Example Data:
record_type;a;b;c
24;a1;b1;c1
17;a2;b2;c2
22;an;bn;cn
1;a1;b1;c1
1;a2;b2;c2
1;an;bn;cn
Control file
load data
truncate
into table my_table_t
WHEN record_type = 1
(...
)

What you are doing is right, IMO.
SQL*Loader logs the load for you at the finest level of detail, but you can opt out of some of it.
You can disable the logging of discarded records by adding
SILENT=(DISCARDS) to your SQL*Loader invocation.
You can refer to the documentation for further details.
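For example, the option can be given on the sqlldr command line or in an OPTIONS clause at the top of the control file (the connection string and file names below are placeholders):
sqlldr myuser/mypassword@mydb control=my_control.ctl silent=discards
or, in the control file:
OPTIONS (SILENT=(DISCARDS))
load data
truncate
into table my_table_t
WHEN record_type = 1
(...
)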

If you just want to get rid of the log, you can send it to /dev/null if you are using Linux/Unix, or to NUL on Windows.
Example
Data File.
[oracle@ora12c Desktop]$ cat sample.txt
record_type;a;b;c
24;a1;b1;c1
17;a2;b2;c2
22;an;bn;cn
1;a1;b1;c1
1;a2;b2;c2
1;an;bn;cn
Control file.
[oracle@ora12c Desktop]$ cat control.ctl
load data
infile 'sample.txt'
insert
into table table_1 when record_type = '1'
fields terminated by ";"
(record_type, a, b, c)
Let's try to load the records.
[oracle@ora12c Desktop]$ sqlldr jay/password@orapdb1 control=control.ctl data=sample.txt log=/dev/null
SQL*Loader: Release 12.1.0.2.0 - Production on Fri Feb 10 16:05:10 2017
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
Path used: Conventional
Commit point reached - logical record count 7
Table TABLE_1:
3 Rows successfully loaded.
Check the log file:
/dev/null
for more information about the load.
There was no log file.
And we got only the selected records:
SQL> select * from table_1;
RECORD_TYPE A B C
----------- -------------------- -------------------- --------------------
1 a1 b1 c1
1 a2 b2 c2
1 an bn cn

Alternatively, if you define the CSV file as an external table, you can then use simple SQL to load your table:
insert into my_table_t( record_type, a, b, c )
select record_type, a, b, c
from my_external_table
where record_type = 1
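A minimal sketch of how the external table itself might be defined, assuming a directory object named DATA_DIR, the sample.txt file, and the column layout used above; SKIP 1 drops the header line:
create table my_external_table (
    record_type number,
    a           varchar2(20),
    b           varchar2(20),
    c           varchar2(20)
)
organization external (
    type oracle_loader
    default directory DATA_DIR
    access parameters (
        records delimited by newline
        skip 1
        fields terminated by ';'
        missing field values are null
    )
    location ('sample.txt')
)
reject limit unlimited;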

Related

I am getting an "ORA-04043: object does not exist" error in SQL*Loader. When the command is executed, I get a "table does not exist" error.

CSV FILE CONTENT
portal,,
ex portal,,
,,
i_id,i_name,risk
1,a,aa
2,b,bb
3,c,cc
4,d,dd
5,e,ee
6,f,ff
7,g,gg
8,h,hh
9,i,ii
10,j,jj
CONTROL FILE CONTENT
options (
skip=4,
PARALLEL=true,
DIRECT=true
)
LOAD DATA
INFILE 'E:\sqlloader\testfile.csv'
APPEND
INTO TABLE LOADER_TAB
FIELDS TERMINATED BY ","
(
i_id,
i_name,
risk
)
I am getting "object does not exist", but the table does exist in the SYSTEM schema:
select tab.owner, tab.STATUS
from dba_tables tab
where tab.TABLE_NAME = 'LOADER_TAB';
Also tried giving schema_name.table_name, but no luck.
options (
skip=4,
PARALLEL=true,
DIRECT=true
)
LOAD DATA
INFILE 'E:\sqlloader\testfile.csv'
APPEND
INTO TABLE SYSTEM.LOADER_TAB
FIELDS TERMINATED BY ","
(
i_id,
i_name,
risk
)
Can someone help me with this? I have searched for the answer and tried every way I could find, but I am not getting the solution.
You're almost there. Here's a top-to-bottom run of the code, the only changes being that I've created a schema to hold the table and adjusted the path names for the CSV file. So follow the demo below, and if yours does not get the same result, edit the question with full output similar to the below. Also, if you still get issues, try it without DIRECT/PARALLEL, which will help us dig deeper into the "why".
SQL> create user myuser identified by mypassword;
User created.
SQL> alter user myuser quota unlimited on users;
User altered.
SQL> grant connect, resource to myuser;
Grant succeeded.
SQL> create table myuser.LOADER_TAB(i_id number(10),i_name varchar2(30),risk varchar2(30));
Table created.
x:\tmp>sqlldr userid=myuser/mypassword@db19_pdb1 control=loader.ctl
SQL*Loader: Release 19.0.0.0.0 - Production on Tue Nov 2 11:38:58 2021
Version 19.12.0.0.0
Copyright (c) 1982, 2021, Oracle and/or its affiliates. All rights reserved.
Path used: Direct
Load completed - logical record count 10.
Table LOADER_TAB:
10 Rows successfully loaded.
Check the log file:
loader.log
for more information about the load.
x:\tmp>sqlplus myuser/mypassword@db19_pdb1
SQL*Plus: Release 19.0.0.0.0 - Production on Tue Nov 2 11:39:21 2021
Version 19.12.0.0.0
Copyright (c) 1982, 2021, Oracle. All rights reserved.
Last Successful login time: Tue Nov 02 2021 11:38:58 +08:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.12.0.0.0
SQL> select * from loader_tab;
I_ID I_NAME RISK
---------- ------------------------------ ------------------------------
1 a aa
2 b bb
3 c cc
4 d dd
5 e ee
6 f ff
7 g gg
8 h hh
9 i ii
10 j jj
10 rows selected.

Registered/trademark symbols in Vertica

I have a txt file containing some data.
One of the columns contains registered/trademark/copyright symbols in it.
For example, "DataWeb #symphone ®" and "Copyright © technologies".
Now when I load this txt file into the database, all data gets stored properly except these symbols: ® ©
Are they supported by Vertica? Is there any way to do this?
Thanks!
Vertica supports Unicode characters encoded as UTF-8. Your message is a little bit vague because it is not clear what your problem is. If I were you, I would double-check that those characters are properly encoded and that your font set is able to display them. Here is a little test...
First let's create a properly UTF-8 encoded file:
$ echo -e "DataWeb #symphone \xc2\xae" > /tmp/test.dat
$ echo -e "Copyright \xc2\xa9 technologies" >> /tmp/test.dat
$ cat /tmp/test.dat
DataWeb #symphone ®
Copyright © technologies
Then let's create/load a table:
$ vsql
SQL> CREATE TABLE public.test ( txt VARCHAR(20) ) ;
SQL> COPY public.test FROM '/tmp/test.dat' ABORT ON ERROR DIRECT;
And, finally, let's query this table:
$ vsql
SQL> SELECT txt FROM public.test ;
txt
---------------------
DataWeb #symphone ®
Copyright © technol
(2 rows)
I'd suggest you run this test from Linux using the vsql command-line interface (avoid Windows and point-and-click interfaces).
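As a quick sanity check on the input file's encoding before loading, you can inspect it with the standard Linux file utility (the exact output wording varies by platform):
$ file /tmp/test.dat
/tmp/test.dat: UTF-8 Unicode text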

Vertica - is there a way of retrieving the rejected records by code?

The "REJECTMAX" parameter is a technique of executing copy command even though there are invalid records in the csv
(so if i have 100 records, 9 of them are invalid & max rejected is 10 the file will upload)
I wonder if there is a way that i can get as a text the rejected records that prints into the rejected file so i can log it into application error log.
Here you have an example on how to use REJECTED DATA. Suppose you have a table like this:
SQL> CREATE TABLE public.mydata ( id INTEGER ) ;
CREATE TABLE
and an input file containing:
$ cat /tmp/mydata
1
2
3
ABC
4
5
Clearly ABC won't fit into an integer...
So we run:
SQL> COPY public.mydata FROM '/tmp/mydata' REJECTMAX 2 REJECTED DATA '/tmp/mydata.rejected' ;
NOTICE 7850: In a multi-threaded load, rejected record data may be written to additional files
HINT: Rejected data may be written to files [/tmp/mydata.rejected], [/tmp/mydata.rejected.1], etc
Rows Loaded
-------------
5
And now...
$ cat /tmp/mydata.rejected
ABC
Is this what you were looking for?
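If you would rather read the rejections with SQL instead of parsing a file, Vertica can also save them to a table with REJECTED DATA AS TABLE; a sketch, where mydata_rejections is just an example name:
SQL> COPY public.mydata FROM '/tmp/mydata' REJECTMAX 2 REJECTED DATA AS TABLE mydata_rejections ;
SQL> SELECT rejected_data, rejected_reason FROM mydata_rejections ;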

Update Oracle database with content of text file

I would like to update a field in an Oracle database with the content of a standard txt file.
The file is generated every 10 minutes by an external program over which I do not have control.
I would like to create a job in Oracle, or a SQL*Plus batch file, that would pick up the content of the file and update a specific record in an Oracle database.
For example, My_Table would contain this:
ID Description FileContent
-- ----------- ---------------------------------------------------------
00 test1.txt This is content of test.txt
01 test2.txt Content of files may
Contain several lines
blank lines
pretty much everything (but must be limited to 2000char)
02 test3.txt not loaded yet
My file "test3.txt" changes often but i do no know when and would look like this:
File generated at 3:33 on august 19, 2016
Result :
1 Banana
2 Apple
3 Pineapple
END OF FILE
I would like the full content of the file to be loaded into its corresponding record in an Oracle database.
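One way to sketch the described job, under assumptions: the PL/SQL below reads the file with UTL_FILE through an Oracle directory object and updates the matching row. The directory object FILE_DIR, the procedure name, and the sizes are placeholders, and it assumes the file fits in 32K characters (use a CLOB otherwise). It could then be scheduled every 10 minutes with DBMS_SCHEDULER or called from a SQL*Plus script driven by the OS scheduler.
create or replace procedure load_file_into_row (
    p_id       in my_table.id%type,
    p_filename in varchar2
) is
    l_file    utl_file.file_type;
    l_line    varchar2(32767);
    l_content varchar2(32767);
begin
    -- FILE_DIR is an assumed directory object pointing at the folder
    -- where the external program writes the txt files
    l_file := utl_file.fopen('FILE_DIR', p_filename, 'r', 32767);
    begin
        loop
            utl_file.get_line(l_file, l_line);
            l_content := l_content || l_line || chr(10);  -- keep line breaks
        end loop;
    exception
        when no_data_found then
            null;  -- reached end of file
    end;
    utl_file.fclose(l_file);
    -- the question limits the column to 2000 characters
    update my_table
       set filecontent = substr(l_content, 1, 2000)
     where id = p_id;
    commit;
end;
/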

How to determine the Schemas inside an Oracle Data Pump Export file

I have an Oracle database backup file (.dmp) that was created with expdp.
The .dmp file was an export of an entire database.
I need to restore 1 of the schemas from within this dump file.
I don't know the names of the schemas inside this dump file.
To use impdp to import the data I need the name of the schema to load.
So, I need to inspect the .dmp file and list all of the schemas in it, how do I do that?
Update (2008-09-18 13:02) - More detailed information:
The impdp command I'm currently using is:
impdp user/password@database directory=DPUMP_DIR
dumpfile=EXPORT.DMP logfile=IMPORT.LOG
And the DPUMP_DIR is correctly configured.
SQL> SELECT directory_path
2 FROM dba_directories
3 WHERE directory_name = 'DPUMP_DIR';
DIRECTORY_PATH
-------------------------
D:\directory_path\dpump_dir\
And yes, the EXPORT.DMP file is in fact in that folder.
The error message I get when I run the impdp command is:
Connected to: Oracle Database 10g Enterprise Edition ...
ORA-31655: no data or metadata objects selected for the job
ORA-39154: Objects from foreign schemas have been removed from import
This error message is mostly expected. I need the impdp command to be:
impdp user/password@database directory=DPUMP_DIR dumpfile=EXPORT.DMP
SCHEMAS=SOURCE_SCHEMA REMAP_SCHEMA=SOURCE_SCHEMA:MY_SCHEMA
But to do that, I need the source schema.
impdp exports the DDL of a .dmp backup to a file if you use the SQLFILE parameter. For example, put this into a script:
impdp '/ as sysdba' dumpfile=<your .dmp file> logfile=import_log.txt sqlfile=ddl_dump.txt
Then check ddl_dump.txt for the tablespaces, users, and schemas in the backup.
According to the documentation, this does not actually modify the database:
The SQL is not actually executed, and the target system remains unchanged.
If you open the DMP file with an editor that can handle big files, you might be able to locate the areas where the schema names are mentioned. Just be sure not to change anything. It would be better if you opened a copy of the original dump.
Update (2008-09-19 10:05) - Solution:
My Solution: Social engineering, I dug real hard and found someone who knew the schema name.
Technical Solution: Searching the .dmp file did yield the schema name.
Once I knew the schema name, I searched the dump file and learned where to find it.
Places the Schemas name were seen, in the .dmp file:
<OWNER_NAME>SOURCE_SCHEMA</OWNER_NAME>
This was seen before each table name/definition.
SCHEMA_LIST 'SOURCE_SCHEMA'
This was seen near the end of the .dmp.
Interestingly enough, around the SCHEMA_LIST 'SOURCE_SCHEMA' section, it also had the command line used to create the dump, the directories used, the par files used, the Windows version it was run on, and the export session settings (language, date formats).
So, problem solved :)
Assuming that you do not have the log file from the expdp job that generated the file in the first place, the easiest option would probably be to use the SQLFILE parameter to have impdp generate a file of DDL (based on a full import). Then you can grab the schema names from that file. Not ideal, of course, since impdp has to read the entire dump file to extract the DDL and then again to get to the schema you're interested in, and you have to do a bit of text file searching for the various CREATE USER statements, but it should be doable.
To run the impdp command to produce a SQL file, you will need to run it as a user that has the DATAPUMP_IMP_FULL_DATABASE role.
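If the account does not already have that role, a DBA can grant it (the user name below is a placeholder):
SQL> GRANT DATAPUMP_IMP_FULL_DATABASE TO myuser;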
Or... run it as a low privileged user and use the MASTER_ONLY=YES option, then inspect the master table. e.g.
select value_t
from SYS_IMPORT_TABLE_01
where name = 'CLIENT_COMMAND'
and process_order = -59;
col object_name for a30
col processing_status head STATUS for a6
col processing_state head STATE for a5
select distinct
object_schema,
object_name,
object_type,
object_tablespace,
process_order,
duplicate,
processing_status,
processing_state
from sys_import_table_01
where process_order > 0
and object_name is not null
order by object_schema, object_name
/
http://download.oracle.com/otndocs/products/database/enterprise_edition/utilities/pdf/oow2011_dp_mastering.pdf
Step 1: Here is one simple example. You have to create a SQL file from the dump file using the SQLFILE option.
Step 2: Grep for CREATE USER in the generated SQL file (here, tables.sql).
Example here:
$ impdp directory=exp_dir dumpfile=exp_user1_all_tab.dmp logfile=imp_exp_user1_tab sqlfile=tables.sql
Import: Release 11.2.0.3.0 - Production on Fri Apr 26 08:29:06 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Username: / as sysdba
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Job "SYS"."SYS_SQL_FILE_FULL_01" successfully completed at 08:29:12
$ grep "CREATE USER" tables.sql
CREATE USER "USER1" IDENTIFIED BY VALUES 'S:270D559F9B97C05EA50F78507CD6EAC6AD63969E5E;BBE7786A5F9103'
A lot of Data Pump options are explained here: http://www.acehints.com/p/site-map.html
You need to search for OWNER_NAME.
cat -v dumpfile.dmp | grep -o '<OWNER_NAME>.*</OWNER_NAME>' | uniq -u
cat -v turns the dump file into visible text.
grep -o shows only the match, so we don't see really long lines.
uniq -u suppresses repeated lines, so you see less output.
This works pretty well, even on large dump files, and could be tweaked for usage in a script.
My solution (similar to KyleLanser's answer) (on a Unix box):
strings dumpfile.dmp | grep SCHEMA_LIST
In my case, based on Aldur's and slafs' answers I came up with this expression that should tell you just the name of the original schema:
cat -v file.dmp | grep 'SCHEMA_LIST' | uniq -u | grep -o -P '(?<=SCHEMAS\=).*(?=content)'
Tested with a .dmp file from Oracle version 19.8.
