Oracle External Table Columns based on header row or star (*)

I have a text file with around 100 columns terminated by "|", and I need to get a few of the columns from this file into my external table. The solutions I have are either to specify all columns under the ACCESS PARAMETERS section in the same order as the file and define only the required columns in the CREATE TABLE definition, or to define all columns in the same order in the CREATE TABLE itself.
Can I avoid defining all the columns in the query? Is it possible to derive the columns from the first row itself, provided I have the column names as the first row?
Or is it at least possible to get all columns, like a (select *), without mentioning each column?
Below is the code I use:
drop table lz_purchase_data;
CREATE TABLE lz_purchase_data
( REC_ID CHAR(50) )
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
  DEFAULT DIRECTORY "FILEZONE"
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE SKIP 1
    FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
    LRTRIM MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('PURCHASE_DATA.txt')
)
REJECT LIMIT UNLIMITED
PARALLEL 2;
select * from LZ_PURCHASE_DATA;
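For what it's worth, the first option described above usually looks like the sketch below: list every field of the file, in order, inside the access parameters, and declare only the wanted columns in the table. This relies on ORACLE_LOADER discarding listed fields that have no matching table column (worth verifying on your version); the extra field names here (supplier_name, col3 and so on) are placeholders, not from the original file.
drop table lz_purchase_data;
CREATE TABLE lz_purchase_data
( rec_id        CHAR(50),
  supplier_name CHAR(100)        -- hypothetical second column of interest
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
  DEFAULT DIRECTORY "FILEZONE"
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE SKIP 1
    FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
    LRTRIM MISSING FIELD VALUES ARE NULL
    ( rec_id,                    -- field 1 of the file
      supplier_name,             -- field 2 of the file
      col3, col4, col5           -- ...and so on, in file order, through col100
    )                            -- fields with no matching table column are skipped
  )
  LOCATION ('PURCHASE_DATA.txt')
)
REJECT LIMIT UNLIMITED
PARALLEL 2;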

Related

Replace specific junk characters from column in hive

I have an issue where one of the columns loaded into a Hive table contains a junk character ("~) suffixed to the actual value (ABC), so the visible value for this column is (ABC"~).
This column can contain either ABC (or any such string) or NULL. The table is huge and UPDATE is not an option here.
My plan is to create a temp table where this column contains either the string (ABC) or NULL, removing the junk character ("~) completely while copying the data from the original table to the temp table.
Any help on how I can remove this junk? I tried using a regexp function, but with no success. Any suggestions?
I was not using regexp properly; my fault.
The data loaded initially into the table had the extra characters attached to a column's values. For example, if the column's actual value was Adf452, then the data contained in the cell was Adf452"~.
So I loaded the data into a temp table like this, with the cleaned value taking colC's place:
insert overwrite table tempTable select colA, colB, regexp_replace(colC,"\"~",""), partitionedCol from origTable;
This simply loaded the data into tempTable without those junk characters.
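As a quick sanity check of the pattern before running the full insert (assuming a Hive version recent enough to allow SELECT without a FROM clause); note that with a single-quoted pattern no backslash escape is needed, since " is not a regex metacharacter:
SELECT regexp_replace('Adf452"~', '"~', '');
-- returns: Adf452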

Insert part of data from csv into oracle table

I have a CSV (pipe-delimited) file as below
ID|NAME|DES
1|A|B
2|C|D
3|E|F
I need to insert the data into a temp table where I already have SQL*Loader in place, but my table has only one column. Below is the control file configuration for loading from the CSV.
OPTIONS (SKIP=1)
LOAD DATA
CHARACTERSET UTF8
TRUNCATE
INTO TABLE EMPLOYEE
FIELDS TERMINATED BY '|'
TRAILING NULLCOLS
(
NAME
)
How do I select the data from only the 2nd column of the CSV and insert it into the single column of the EMPLOYEE table?
Please let me know if you have any questions.
If you're using a FILLER field, you don't need to have a matching column in the database table; that's the point, really. And as long as you know the field you're interested in is always the second one, you don't need to modify the control file when there are extra fields in the file: you just never specify them.
So this works, with just a filler ID field added and the three-field data file you showed:
OPTIONS (SKIP=1)
LOAD DATA
CHARACTERSET UTF8
TRUNCATE
INTO TABLE EMPLOYEE
FIELDS TERMINATED BY '|'
TRAILING NULLCOLS
(
ID FILLER,
NAME
)
Demoed with:
SQL> create table employee (name varchar2(30));
$ sqlldr ...
Commit point reached - logical record count 3
SQL> select * from employee;
NAME
------------------------------
A
C
E
Adding more fields to the data file makes no difference, as long as they are after the field you are actually interested in. The same thing works for external tables, which can be more convenient for temporary/staging tables, as long as the CSV file is available on the database server.
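For the external-table route mentioned above, a minimal sketch (the directory object and file name are made up); there is no FILLER keyword here, but the effect is the same, assuming fields listed in the access parameters without a matching table column are simply not loaded:
CREATE TABLE employee_ext ( name VARCHAR2(30) )
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
  DEFAULT DIRECTORY mydir              -- hypothetical directory object
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE SKIP 1
    FIELDS TERMINATED BY '|'
    MISSING FIELD VALUES ARE NULL
    ( id, name, des )                  -- id and des have no table column, so they're discarded
  )
  LOCATION ('employees.csv')
)
REJECT LIMIT UNLIMITED;

SELECT name FROM employee_ext;         -- A, C, E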
Columns in the data file which need to be excluded from the load can be defined as FILLER.
In the given example, list all incoming fields and add FILLER to those columns that need to be ignored, e.g.
(
ID FILLER,
NAME,
DES FILLER
)
Another issue here is ignoring the header line of the CSV; just use the OPTIONS clause, e.g.
OPTIONS(SKIP=1)
LOAD DATA ...

update table based on concatenated column value

I have a table with only 4 columns:
First column - the concatenated column values for each row from another table. The columns are concatenated based on column id from the metadata table; the order of concatenation is the same order as the column ids.
Second column - the comma-separated primary key columns.
Third column - based on the primary keys in the second column, I need to update this column with the values for those primary keys, extracted from the concatenated first column.
Fourth column - it has the table name.
I am using a cursor and string functions, and it works perfectly fine, but when I tested it against huge data volumes (millions of rows) it failed; the performance is very poor.
Could anyone please give me a single update query for the same?
There is a comparison tool which compares the data between 2 tables in different databases but with the same data structure, and it dumps the mismatched rows into a table with all the columns concatenated (pipe separated). The columns are in the same order as the column ids, and I know the primary keys for that table (concatenated, but pipe separated). So, based on this data, I need to extract the primary key values for which there is a data mismatch.
I need to do something like:
Update column4 (primary key values, pipe separated, extracted from column2)
Check this LINK, maybe it will be useful. With that query you could concatenate values with whatever character you need (this works for version 11gR2; for earlier versions use the xmlagg, xmlelement, extract method).
CREATE TABLE TEST (FIELD INT);

INSERT INTO TEST VALUES (1);
INSERT INTO TEST VALUES (2);
INSERT INTO TEST VALUES (3);
INSERT INTO TEST VALUES (4);

SELECT listagg(FIELD, ',') WITHIN GROUP (ORDER BY FIELD)
FROM TEST;

Returns '1,2,3,4'
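Going the other way (splitting a pipe-separated value back apart, as the question needs for the primary keys), regexp_substr can pull out the Nth field. A small illustration, with the caveat that the [^|]+ pattern skips over empty fields:
-- returns 'B', the 2nd pipe-separated value
SELECT regexp_substr('A|B|C|D', '[^|]+', 1, 2) FROM dual;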

Oracle SQL save file name from LOCATION as a column in external table

I have several input files being read into an external table in Oracle. I want to run some queries across the content from all the files; however, for some queries I would like to filter the data based on the input file it came from. Is there a way to access the name of the source file in a select statement against an external table, or somehow create a column in the external table that includes the location source?
Here is an example:
CREATE TABLE MY_TABLE (
  first_name CHAR(100 BYTES),
  last_name  CHAR(100 BYTES)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY TMP
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    SKIP 1
    BADFILE 'my_table.bad'
    DISCARDFILE 'my_table.dsc'
    LOGFILE 'my_table.log'
    FIELDS TERMINATED BY 0x'09' OPTIONALLY ENCLOSED BY '"' LRTRIM
    MISSING FIELD VALUES ARE NULL
    (
      first_name CHAR(100),
      last_name
    )
  )
  LOCATION ( TMP:'file1.txt', 'file2.txt' )
)
REJECT LIMIT 100;
select distinct last_name
from MY_TABLE
where location like 'file2.txt' -- This is the part I don't know how to code
Any suggestions?
There is always the option to add the file name to the input file itself as an additional column. Ideally, I would like to avoid this workaround.
The ALL_EXTERNAL_LOCATIONS data dictionary view contains information about external table locations. Also DBA_* and USER_* versions.
Edit: (It would help if I read the question thoroughly.)
You don't just want to read the location for the external table; you want to know which row came from which file. Basically, you need to:
1. Create a shell script that prepends the file location to the file contents and writes them to stdout.
2. Add the PREPROCESSOR directive to your external table definition to execute the script.
3. Alter the external table definition to include a column for the filename prepended in the first step.
Here is an asktom article explaining it in detail, and a rough sketch follows below.
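A minimal sketch of those three steps, assuming an executable directory object EXEC_DIR for the script and a tab as the injected separator (all names here are illustrative, not from the original). The preprocessor receives the data file's full path as its first argument and must write the transformed records to stdout:
#!/bin/sh
# add_filename.sh - prepend the data file's path to every record
/usr/bin/awk -v fn="$1" '{ printf "%s\t%s\n", fn, $0 }' "$1"
Then the table gains a leading column to hold that prepended filename:
CREATE TABLE my_table (
  source_file VARCHAR2(255),           -- holds the filename added by the script
  first_name  CHAR(100 BYTES),
  last_name   CHAR(100 BYTES)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY TMP
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    PREPROCESSOR EXEC_DIR:'add_filename.sh'
    SKIP 1
    FIELDS TERMINATED BY 0x'09' OPTIONALLY ENCLOSED BY '"' LRTRIM
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ( TMP:'file1.txt', 'file2.txt' )
)
REJECT LIMIT 100;

SELECT DISTINCT last_name FROM my_table WHERE source_file LIKE '%file2.txt';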

Have dynamic columns in external tables

My requirement is that I have to use a single external table in a stored procedure for different text files which have different columns.
Can I use dynamic columns in external tables in Oracle 11g? Like this:
create table ext_table as select * from TBL_test
organization external (
type oracle_loader
default directory DATALOAD
access parameters(
records delimited by newline
fields terminated by '#'
missing field values are null
)
location ('APD.txt')
)
reject limit unlimited;
The set of columns defined for an external table, just like the set of columns defined for a regular table, must be known at the time the external table is created. You can't decide at runtime that the table has 30 columns today and 35 columns tomorrow.
You could potentially define the external table to have the maximum number of columns that any of the flat files will have, name the columns generically (i.e. col1 through col50), and then move the complexity of figuring out that column N of the external table is really a particular field into the ETL code. It's not obvious, though, why that would be more useful than creating the external table definition properly.
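A sketch of that generic-column idea, reusing the question's own directory, delimiter, and file name (the column count and widths are arbitrary placeholders):
create table ext_generic (
  col1 varchar2(4000),
  col2 varchar2(4000),
  col3 varchar2(4000)  -- ...continue up to the widest expected file, e.g. col50
)
organization external (
  type oracle_loader
  default directory DATALOAD
  access parameters (
    records delimited by newline
    fields terminated by '#'
    missing field values are null  -- files with fewer fields leave trailing columns null
  )
  location ('APD.txt')
)
reject limit unlimited;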
Why is there a requirement that you use a single external table definition to load many differently formatted files? That does not seem reasonable.
Can you drop and re-create the external table definition at runtime? Or does that violate the requirement for a single external table definition?
