Calculating number of rows in a file created by UTL_FILE - oracle

I have a PL/SQL package that writes out an extract from an Oracle table into multiple files. Each file can have a maximum of 50,000,000 rows; generally, 5 or 6 such files are created. I am using the UTL_FILE functionality to create these extract files.
I have a requirement to log generated file names and number of rows in the generated file to an Oracle table.
I can log the file names, but how can I log the number of rows exported to each file?

How? Count them one by one:
create a local variable,
increment it after each UTL_FILE.PUT_LINE call,
log it once you're done (a sketch follows below).
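A minimal PL/SQL sketch of that idea, assuming a directory object EXTRACT_DIR, a placeholder source table your_table and a hypothetical log table extract_log (all of these names are illustrative, not taken from the original package):

DECLARE
  l_file     UTL_FILE.FILE_TYPE;
  l_rowcount PLS_INTEGER := 0;                  -- row counter for this file
BEGIN
  l_file := UTL_FILE.FOPEN('EXTRACT_DIR', 'ocit_4001_1.txt', 'w', 32767);
  FOR r IN (SELECT col1 FROM your_table) LOOP
    UTL_FILE.PUT_LINE(l_file, r.col1);
    l_rowcount := l_rowcount + 1;               -- increment after each PUT_LINE
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
  DBMS_OUTPUT.PUT_LINE('ocit_4001_1.txt: ' || l_rowcount || ' row(s)');
  INSERT INTO extract_log (file_name, row_count)
  VALUES ('ocit_4001_1.txt', l_rowcount);       -- log file name and row count
  COMMIT;
END;
/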
I've just tested it, output (result of DBMS_OUTPUT.PUT_LINE) looks like e.g.
ocit_4001_1.txt: 37465 row(s)
ocit_4001_2.txt: 37464 row(s)
ocit_4001_3.txt: 37462 row(s)

Related

I have n (large) small-sized txt files in Hive

I have n (large) small-sized txt files which I want to merge into k (small) files.
If you have a Hive table on top of these txt files, then use
insert overwrite <db>.<existing_table> select * from <db>.<existing_table> order by <col_name>;
Hive supports selecting from and overwriting the same table; the ORDER BY clause forces the query to run with 1 reducer, which results in only 1 file being created in the directory.
However, if you have a large amount of data, ORDER BY will not perform well; use a SORT BY (or CLUSTER BY) clause instead so that more than one reducer is used.
insert overwrite <db>.<existing_table> select * from <db>.<existing_table> sort by <col_name>;
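If you want roughly k output files rather than just one, a common approach is to cap the reducer count before running the insert; a sketch, where the target of 10 reducers and the column name are illustrative:

set mapreduce.job.reduces=10;
insert overwrite <db>.<existing_table>
select * from <db>.<existing_table>
distribute by <col_name>;

The number of files produced is then roughly the number of reducers actually used.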

How to find out failed insert script number among the multiple insertion scripts

My file.sql has 50,000 insert statements; one or more of them failed during execution because a value was too large for its column. How can I find out which insert failed (i.e. which line number in the file)?
I take it you want the missing data to be inserted after all?
1. Can you delete all data, change the table to hold larger values, and run the script again?
2. Is there a unique key on the table? Then modify the table so it can hold larger values and run the script again. Only the data you do not already have will be inserted.
3. Create the same table in another schema or database with the modified definition. Insert the data there. Query the records where the length of the column value exceeds the previous maximum, generate insert statements only for those records, and run them against the original (but now modified to hold larger values) table. A sketch of that query follows.
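A minimal sketch of step 3, assuming the oversized column is called DESCRIPTION, its old limit was 100 characters, and the modified copy lives in a STAGING schema (all names and sizes are illustrative):

-- find the rows that could not fit the original column definition
SELECT *
FROM   staging.your_table
WHERE  LENGTH(description) > 100;

-- generate the missing insert statements from those rows
SELECT 'INSERT INTO your_table (id, description) VALUES ('
       || id || ', ''' || REPLACE(description, '''', '''''') || ''');'
FROM   staging.your_table
WHERE  LENGTH(description) > 100;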

Load multiple files content to table using SQL loader

How can I insert data from multiple files having different columns into a table in an Oracle database using SQL*Loader with a single control file?
Basically,
We have 3 CSV files
file 1 having columns a,b,c
file 2 having columns d,e,f
file 3 having columns g,h,i
We need to insert the above attributes into a table named "TableTest"
having columns a,b,c,d,e,f,g,h,i
using a single control file.
Thanks in advance
You really can't. You can either splice the .csv files together (a lot of nasty work) or create 3 tables to load and then use PL/SQL or SQL to join them together into your target table.
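A hedged sketch of the staging-table approach, assuming the three files line up row for row and each staging table gets a record-number column populated via SQL*Loader's RECNUM keyword (table and column names are illustrative):

-- control file for file 1 (repeat the pattern for files 2 and 3)
LOAD DATA
INFILE 'file1.csv'
TRUNCATE INTO TABLE stage1
FIELDS TERMINATED BY ','
(a, b, c, recno RECNUM)

-- combine the three staging tables into the target
INSERT INTO TableTest (a, b, c, d, e, f, g, h, i)
SELECT s1.a, s1.b, s1.c, s2.d, s2.e, s2.f, s3.g, s3.h, s3.i
FROM   stage1 s1
JOIN   stage2 s2 ON s2.recno = s1.recno
JOIN   stage3 s3 ON s3.recno = s1.recno;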

How to create table dynamically based on the uploaded csv file column header using oracle apex

Based on the csv file column header, it should create a table dynamically and also insert the records of that csv file into the newly created table.
Ex:
1) If I upload a file TEST.csv with 3 columns, it should create a table dynamically with three columns.
2) Again, if I upload a new file called TEST2.csv with 5 columns, it should create a table dynamically with five columns.
Every time, it should create a table based on the uploaded csv file's header.
How can this be achieved in Oracle APEX?
Thanks in advance.
Without creating new tables you can treat the CSVs as tables using a TABLE function you can SELECT from. If you download the packages from the Alexandria Project you will find a function that does just that inside CSV_UTIL_PKG (clob_to_csv is that function, but you will find other goodies in there too).
You would just upload the CSV and store in a CLOB column and then you can build reports on it using the CSV_UTIL_PKG code.
If you must create a new table for the upload you could still use this parser. Upload the file and then select just the first row (e.g. SELECT * FROM csv_util_pkg.clob_to_csv(your_clob) WHERE ROWNUM = 1). You could insert this row into an Apex Collection using APEX_COLLECTION.CREATE_COLLECTION_FROM_QUERY to make it easy to then iterate over each column.
You would need to determine the datatype for each column but could just use VARCHAR2 for everything.
But if you are just using generic columns, you could just as easily store one additional column holding a name for this collection of records and keep all of the uploads in the same table. Just build another table to store the column names.
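If you do go the dynamic-table route, here is a minimal sketch of the CREATE TABLE step, assuming you have already parsed the header row into a collection of column names and are happy to declare everything as VARCHAR2 (the table name, type and column names are illustrative):

DECLARE
  TYPE t_cols IS TABLE OF VARCHAR2(30);
  l_cols t_cols := t_cols('COL_A', 'COL_B', 'COL_C');  -- parsed from the CSV header
  l_sql  VARCHAR2(4000);
BEGIN
  l_sql := 'CREATE TABLE test_upload (';
  FOR i IN 1 .. l_cols.COUNT LOOP
    l_sql := l_sql || CASE WHEN i > 1 THEN ', ' END
                   || DBMS_ASSERT.SIMPLE_SQL_NAME(l_cols(i))  -- guard against injection
                   || ' VARCHAR2(4000)';
  END LOOP;
  l_sql := l_sql || ')';
  EXECUTE IMMEDIATE l_sql;
END;
/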
Simply store the file as a BLOB if the structure is "dynamic".
You can use the XML data type for this use case too, but it won't be very different from a BLOB column.
There is also the SecureFiles feature, available since 11g: a newer LOB implementation that performs better than regular (BasicFiles) LOBs and is well suited to unstructured or semi-structured data.

What is the best way to produce large results in Hive

I've been trying to run some Hive queries with largish result sets. My normal approach is to submit a job through the WebHCat API, and read the results from the resulting stdout file, or to just run hive at the console and pipe stdout to a file. However, with large results (more than one reducer used), the stdout is blank or truncated.
My current solution is to create a new table from the results (CREATE TABLE ... AS SELECT), which introduces an extra step and leaves a table to clean up afterwards if I don't want to keep the result set.
Does anyone have a better method for capturing all the results from such a Hive query?
You can write the data directly to a directory on either hdfs or the local file system, then do what you want with the files. For example, to generate CSV files:
INSERT OVERWRITE DIRECTORY '/hive/output/folder'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
SELECT ... FROM ...;
This is essentially the same as CREATE TABLE ... AS SELECT, but you don't have to clean up the table. Here's the full documentation:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintothefilesystemfromqueries
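If you want the files on the local file system of the machine running the query rather than on HDFS, the same statement accepts the LOCAL keyword (the path below is illustrative):

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hive/output/folder'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
SELECT ... FROM ...;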
