sqlldr stream record format on commandline - oracle

I am loading a delimited file using sqlldr. I have kept file format/table details in the ctl file and pass other parameters on the command line.
sqlldr control=sp.ctl data=data.20170502.txt SKIP=1 userid=xyz#db/pwd log=sp.log bad=sp.bad
sp.ctl
LOAD DATA
TRUNCATE
INTO TABLE "T_DATA"
TRUNCATE
FIELDS TERMINATED BY '|'
TRAILING NULLCOLS
(
C_1 CHAR(2000),
C_2 CHAR(2000),
C_3 CHAR(2000)
)
I now need to use a stream record format on this data file.
infile 'example3.dat' "str '|\n'"
However, I am not using the infile syntax.
So I tried using
sqlldr control=sp.ctl data=data.20170502.txt "str '!\n'" SKIP=1
userid=xyz#db/pwd log=sp.log bad=sp.bad
It gives an error:
LRM-00112: multiple values not allowed for parameter 'data'
How do I pass the record delimiter on the command line?
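One possible workaround, offered as a sketch rather than a confirmed answer: the "str ..." clause is part of the INFILE clause in the control file, so a wrapper script can generate the control file for each run and still take the data file name and record terminator from the command line. The helper name write_ctl and the output name sp.generated.ctl are hypothetical; the table and column definitions are copied from sp.ctl above.

```shell
#!/usr/bin/env bash
# Sketch: write a control file whose INFILE clause carries the stream record format.
# write_ctl is a hypothetical helper, not part of SQL*Loader.
write_ctl() {
    local data_file=$1    # e.g. data.20170502.txt
    local terminator=$2   # record terminator as SQL*Loader should see it, e.g. |\n
    local ctl_file=$3     # control file to (over)write

    # Unquoted heredoc: shell variables expand, while the backslash in \n
    # is kept literally so SQL*Loader receives it unchanged.
    cat > "$ctl_file" <<EOF
LOAD DATA
INFILE '$data_file' "str '$terminator'"
TRUNCATE
INTO TABLE "T_DATA"
FIELDS TERMINATED BY '|'
TRAILING NULLCOLS
(
C_1 CHAR(2000),
C_2 CHAR(2000),
C_3 CHAR(2000)
)
EOF
}

# Example (assumed file names):
# write_ctl 'data.20170502.txt' '|\n' sp.generated.ctl
# sqlldr control=sp.generated.ctl SKIP=1 log=sp.log bad=sp.bad
```

Because the data file is now named in the generated control file, the data= parameter is no longer needed on the sqlldr command line.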

Oracle External Table RECORDS DELIMITED BY '",\n"' not working: how can I delimit by a character and a newline at the same time?

I am trying to read large CSV files that contain many newline characters inside the data.
This is what the data looks like in the CSV file:
"LastValueInRow",
"FirstValueInNextRow",
I would like to use " + , + NEWLINE + " as the record delimiter, so that the other newline characters are not read as record boundaries.
The following code reads most CSV records correctly by using NEWLINE (\n) + " as the delimiter:
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "IMPORT_TEST"
ACCESS PARAMETERS
( RECORDS DELIMITED BY '\n"'
BADFILE SNOW_IMPORT_TEST:'TEST_1.bad'
LOGFILE SNOW_IMPORT_TEST:'TEST_1.log'
SKIP 1
FIELDS TERMINATED BY '","'
MISSING FIELD VALUES ARE NULL
)
LOCATION
( "IMPORT_TEST":'TEST_1.csv'
)
)
Adding any characters before the \n returns no rows. This is what I want, but it does not work:
( RECORDS DELIMITED BY '",\n"'
Is it possible to use " + , + \n + " as the record delimiter?
Thanks.
After a lot of research, the best solution I found is to replace the line-break characters in the CSV file with a different character using Windows PowerShell, and then update the record delimiter in the external table.
I created the following PowerShell script to replace every carriage return and line feed in the CSV file with '|' (where $loc is the directory and $file_name is the file name without its extension):
(Get-Content -Raw -Path "$loc\$file_name.csv") -replace '[\r\n]', '|' | Out-File -FilePath "$loc\${file_name}_PP.csv" -Force -Encoding ascii -NoNewline
Then I updated the external table parameters to read the records based on the new delimiter '",||"'.
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "IMPORT_TEST"
ACCESS PARAMETERS
( RECORDS DELIMITED BY '",||"'
BADFILE SNOW_IMPORT_TEST:'TEST_1_PP.bad'
LOGFILE SNOW_IMPORT_TEST:'TEST_1_PP.log'
SKIP 1
FIELDS TERMINATED BY '","'
MISSING FIELD VALUES ARE NULL
)
LOCATION
( "IMPORT_TEST":'TEST_1_PP.csv'
)
)
Now the external table is reading all the records correctly.
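For readers without PowerShell, roughly the same preprocessing can be sketched with tr. This is an illustration, not the script used above: the function name flatten_newlines and the sample rows are made up, and it assumes Windows line endings, so each \r\n pair becomes ||. That is why the record delimiter ends up as '",||"'.

```shell
#!/usr/bin/env bash
# Sketch: replace every CR and LF with '|', mirroring -replace '[\r\n]', '|'.
flatten_newlines() {
    # tr maps \r to | and \n to |, one character at a time
    tr '\r\n' '||'
}

# Two hypothetical records with CRLF line endings
printf '"a","b",\r\n"c","d",\r\n' | flatten_newlines
# prints "a","b",||"c","d",||   (with no trailing newline)
```

On a real file this would be run as flatten_newlines < TEST_1.csv > TEST_1_PP.csv.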

SQL Loader - Multiple Files and Grabbing file names

I have a folder with over 400K txt files.
With names like
deID.RESUL_12433287659.txt_234323456.txt
deID.RESUL_34534563649.txt_345353567.txt
deID.RESUL_44235345636.txt_537967875.txt
deID.RESUL_35234663456.txt_423452545.txt
I want to store all the files and their content in the following way:
file_name                                   file_content
deID.RESUL_12433287659.txt_234323456.txt    Content 1
deID.RESUL_34534563649.txt_345353567.txt    Content 2
deID.RESUL_44235345636.txt_537967875.txt    Content 3
deID.RESUL_35234663456.txt_423452545.txt    Content 4
I tried creating a control file like this:
LOAD
DATA
INFILE 'deID.RESUL_12433287659.txt_234323456.txt'
INFILE 'deID.RESUL_34534563649.txt_345353567.txt'
INFILE 'deID.RESUL_44235345636.txt_537967875.txt'
INFILE 'deID.RESUL_35234663456.txt_423452545.txt'
APPEND INTO TABLE TBL_DATA
EVALUATE CHECK_CONSTRAINTS
REENABLE DISABLED_CONSTRAINTS
EXCEPTIONS EXCEPTION_TABLE
FIELDS TERMINATED BY ""
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
FILE_NAME
)
Is there a way to pick up the file names dynamically, or to specify a wildcard in INFILE, so that I don't have to list 400K files one by one in my control file?
1) Create a table to hold the file names and their contents
create table TBL_DATA(file_name varchar2(4000), file_content clob);
2) Create load_all.ctl
LOAD DATA
INFILE 'file_list.txt'
APPEND INTO TABLE TBL_DATA
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
file_name char(4000)
, file_content LOBFILE(file_name) TERMINATED BY EOF
)
3) Write the list of file names to file_list.txt
ls -1 *.txt > file_list.txt
4) Run sqlldr user/pass@db control=load_all.ctl
5) load_all.ctl, file_list.txt, and the source files should all be in the same folder.
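With roughly 400K files, ls -1 *.txt may fail with an "Argument list too long" error, because the expanded glob is passed to an external command. A hedged alternative for step 3 is to let the shell builtin printf write the list. build_file_list is a hypothetical helper name, and excluding the list file itself is an added precaution for repeat runs, not part of the original steps.

```shell
#!/usr/bin/env bash
# Sketch: write the names of all .txt files in a directory to file_list.txt.
build_file_list() {
    local dir=$1
    local list=${2:-file_list.txt}
    (
        cd "$dir" || exit 1
        # printf is a builtin, so the glob expansion never goes through the
        # kernel's argument-size limit; grep drops the list file itself.
        printf '%s\n' *.txt | grep -vxF -- "$list" > "$list.tmp"
        mv -- "$list.tmp" "$list"
    )
}

# Example: build_file_list . && sqlldr user/pass@db control=load_all.ctl
```

The list is written to a temporary name first, so a half-written file_list.txt is never left behind if the command is interrupted.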

How to export a Hive table into a CSV file including header?

I used this Hive query to export a table into a CSV file.
hive -f mysql.sql
row format delimited fields terminated by ','
select * from Mydatabase.Mytable limit 100
cat /LocalPath/* > /LocalPath/table.csv
However, the output does not include the table's column names.
How can I export the column names to the CSV as well?
show tablename?
You should add set hive.cli.print.header=true; before your select query to get column names as the first row of your output. The output would look like Mytable.col1, Mytable.col2 ...
If you don't want the table name with the column names, use set hive.resultset.use.unique.column.names=false;. The first row of your output would then look like col1, col2 ...
Invoking the Hive command line with the parameters suggested in the other answer works for a plain select. So you can extract the column names first and use them to start the CSV, as follows:
hive -S --hiveconf hive.cli.print.header=true --hiveconf hive.resultset.use.unique.column.names=false --database Mydatabase -e 'select * from Mytable limit 0;' > /LocalPath/table.csv
After that, run the actual data extraction, but this time remember to append to the CSV:
cat /LocalPath/* >> /LocalPath/table.csv ## From your question with >> for append
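The two steps can be combined in one script. The sketch below makes two assumptions that go beyond the answers above: the Hive CLI prints its header tab-separated, so the header is converted to commas with tr, and the final CSV is written outside the directory that holds the exported data files, so that cat does not try to read its own output. export_with_header and its arguments are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: prepend a comma-separated header to already-exported Hive data files.
export_with_header() {
    local db=$1 table=$2 parts_dir=$3 out_csv=$4

    # "limit 0" returns no rows, so only the header line is printed.
    hive -S \
        --hiveconf hive.cli.print.header=true \
        --hiveconf hive.resultset.use.unique.column.names=false \
        --database "$db" \
        -e "select * from $table limit 0;" \
        | tr '\t' ',' > "$out_csv"

    # Append the data files produced by the export step.
    cat "$parts_dir"/* >> "$out_csv"
}

# Example call (paths are placeholders):
# export_with_header Mydatabase Mytable /LocalPath /tmp/table.csv
```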

Hive table creation error through Bash Shell

Can anyone tell me why I am getting an error while creating a partitioned table from the bash shell?
[cloudera@localhost ~]$ hive -e "create table peoplecountry (
name1 string,
name2 string,
salary int,
country string
)
partitioned by (country string)
row format delimited
column terminated by '\n'";
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.10.0-cdh4.7.0.jar!/hive-log4j.properties
Hive history file=/tmp/cloudera/hive_job_log_0fdf7083-8ab4-499f-8048-a85f162d1357_376056456.txt
FAILED: ParseException line 8:0 missing EOF at 'column' near 'delimited'
If you meant a newline at the end of each row of your data, then you need to use:
lines terminated by '\n'
instead of column terminated by.
If you meant that the columns in each row are separated by a delimiter, then specify it as:
fields terminated by '\n'
Refer to:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
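Putting this together, a corrected statement might look like the sketch below. Two things are assumptions rather than part of the answer: the field delimiter ',' is only a placeholder, and country is removed from the column list because Hive rejects a partition column that is also declared as a regular column. peoplecountry_ddl is a hypothetical helper that only prints the DDL.

```shell
#!/usr/bin/env bash
# Sketch: print a corrected CREATE TABLE statement for the partitioned table.
peoplecountry_ddl() {
    # Quoted heredoc delimiter: the text is passed through verbatim,
    # so '\n' reaches Hive as backslash-n.
    cat <<'EOF'
create table peoplecountry (
  name1 string,
  name2 string,
  salary int
)
partitioned by (country string)
row format delimited
fields terminated by ','
lines terminated by '\n'
EOF
}

# Usage: hive -e "$(peoplecountry_ddl)"
```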

LOAD DATA query error

What is the problem with this line?
$load ="LOAD DATA INFILE $inputFile INTO TABLE $tableName FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";
echo $load;
mysql_query($load);
The echo result is:
LOAD DATA INFILE appendpb.csv INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED BY ' ' IGNORE 1 LINES
The error is:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'appendpb.csv INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED B' at line 1
According to the MySQL LOAD DATA reference, the input file name must be enclosed in single quotes:
$load ="LOAD DATA INFILE '$inputFile' INTO TABLE $tableName FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";
The resulting statement then looks like this:
LOAD DATA INFILE 'appendpb.csv' INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED BY ' ' IGNORE 1 LINES
This assumes the path to the file is correct.
