SSIS pulling data from Oracle

I have an SSIS package that is pulling data from an Oracle database using the Native OLE DB\Oracle Provider for OLE DB.
My package pulls the data from one view in Oracle to a staging table in my SQL Server database successfully, but slowly. The problem I am having is that some of the fields in Oracle are 4000 characters long, and in SQL Server I only need the first 255 characters. Would it be better to do a substring in my Oracle query and only take the size I need, or to take all 4000 characters? Is there a better way to handle this data import?
Here is a sample of the query I am using to extract the data from Oracle:
select a
, b
, substr(c, 1, 255) as c
, substr(d, 1, 255) as d
, e
, CASE WHEN EXTRACT(YEAR FROM LAST_TAKEN_DT) < 1900
       THEN NULL
       WHEN EXTRACT(YEAR FROM LAST_TAKEN_DT) > 2025
       THEN NULL
       ELSE LAST_TAKEN_DT
  END AS LAST_TAKEN_DT
from oracle_View1

First off, if it were up to me, I would suggest using the Attunity Oracle adapters instead of the OLE DB connection. They are definitely a lot faster, and you can choose to do the substring either in the Oracle query or within your SSIS package using a Derived Column transformation.
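If you push the trim into the Oracle query, the extra bytes never cross the network, which also helps the slow transfer. A minimal sketch of what the source query could look like (column and view names taken from the question; the explicit CAST is an assumption, added so the 255-character width is visible to the SSIS metadata):

select a
, b
, cast(substr(c, 1, 255) as varchar2(255)) as c  -- trim server-side, fix the reported width
, cast(substr(d, 1, 255) as varchar2(255)) as d
, e
from oracle_View1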

Related

Import massive table from Oracle to PostgreSQL with oracle-fdw returns ORA-01406

I work on a project to transfer data from an Oracle database to a PostgreSQL database to create a data warehouse with bash & SQL scripts. To access the Oracle database, I use the PostgreSQL extension oracle-fdw.
One of my scripts imports data from a massive table (~100,000,000 new rows/day). This table is partitioned and each partition contains 1 day of data. The query I use to import data looks like this:
INSERT INTO postgre_target_table (some_fields)
SELECT some_aggregated_fields -- (~150 fields)
FROM oracle_source_table
WHERE partition_id = :v_partition_id AND some_others_filters
GROUP BY primary_key;
On the DEV server, the query works fine (there is much less data on this server), but on PREPROD it returns the error ORA-01406: fetched column value was truncated.
In some posts, people say that the output fields may be too small, but if I send a simple SELECT query without the INSERT or GROUP BY, I get the same error.
Another idea I found in another post is to create an Oracle-side view, but my query uses multiple parameters that I cannot use in a view.
The last idea I found is to create an Oracle stored procedure that fills a table with the aggregated data and then to import data from this table, but the Oracle database is critical and my customer prefers to avoid adding more data to it.
Now I'm starting to think there's no solution, and that's not good...
PostgreSQL version: 12.4 / Oracle version: 11.2
UPDATE
It seems my problem is more complicated than I thought.
After applying the modification given by Laurenz Albe, the query runs correctly in pgAdmin, but the problem still appears when I use the psql command.
Moreover, another query seems to have the same problem. This other query does not use the same source table as the first one; it uses 4 joined tables without any partitioning. The common point between these queries is their structure.
The detail I omitted in the original post is that the purpose of both queries is to pivot a table. They look like this:
SELECT osr.id,
MIN(CASE osr.category
WHEN 123 THEN
1
END) AS field1,
MIN(CASE osr.category
WHEN 264 THEN
1
END) AS field2,
MIN(CASE osr.category
WHEN 975 THEN
1
END) AS field3,
...
FROM oracle_source_table osr
WHERE osr.category IN (123, 264, 975, ...)
GROUP BY osr.id;
Now that I have detailed what the queries look like, here are some results I got with the second one, without changing the value of max_long (this query is lighter than the first one):
Sometimes it works (~10%) and sometimes it fails (~90%) in pgAdmin, but it never works with the psql command
If I delete the WHERE clause, it always works
I don't understand why deleting the WHERE clause changes anything; the field used in this clause is a NUMBER(6, 0) between 0 and 2500, and it is still used in the SELECT clause... Oh, and in the 4 Oracle tables used by this query there is no LONG datatype; only the NUMBER datatype is used.
Among the 20 queries I have, only these two have a problem; their structure is similar and I don't believe in coincidences.
Don't despair!
Set the max_long option on the foreign table big enough that all your oversized data fit.
The documentation has the details:
max_long (optional, defaults to "32767")
The maximal length of any LONG, LONG RAW and XMLTYPE columns in the Oracle table. Possible values are integers between 1 and 1073741823 (the maximal size of a bytea in PostgreSQL). This amount of memory will be allocated at least twice, so large values will consume a lot of memory.
If max_long is less than the length of the longest value retrieved, you will receive the error message
ORA-01406: fetched column value was truncated
Example:
ALTER FOREIGN TABLE my_tab OPTIONS (ADD max_long '1000000');
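Note that ADD only works if max_long has not been set on the foreign table yet; if the option already exists, PostgreSQL requires the SET form instead (same table name as in the example above):

ALTER FOREIGN TABLE my_tab OPTIONS (SET max_long '1000000');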

Access to Oracle XmlType column through ODBC with data over 4K

Our legacy C++ ODBC application recently added support for XML data columns (data type xml in MS SQL Server, XmlType in Oracle). We used
SQLBindParameter( hStmt, i, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_VARCHAR, ... )
for our insert/update queries, and
SQLBindCol( hStmt, i, SQL_C_CHAR, ... )
for our selects. While the XML data remained small, this worked fine on both DBMSs. MS SQL Server also handles larger XML data without issues; on Oracle, though, we ran into the 4K limit for string literals.
We realize that we can use CLOB columns instead in Oracle (and varchar(max) on MS SQL) to store larger XML data bound with SQL_LONGVARCHAR. However, we can only change the database schema in our next major version.
Is there any way to get Oracle to work with data over 4K characters through ODBC while keeping the column as XmlType?
I've found various workarounds in PL/SQL, but nothing that applies to ODBC. Attempting to use PL/SQL stored procedures that take CLOB IN parameters as helpers only resulted in the same
ORA-01461: can bind a LONG value only for insert into a LONG column
error message that we also get when directly attempting to bind the column.
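One idea that comes up for this situation (offered only as a sketch, not verified against this driver, and the ORA-01461 above suggests it may fail the same way) is to keep the column as XMLType but let Oracle do the conversion server-side: bind the parameter as SQL_LONGVARCHAR and wrap the marker in the XMLType constructor in the statement text. Table and column names here are hypothetical:

-- xml_tab / xml_col are hypothetical; the second marker is bound as SQL_LONGVARCHAR
INSERT INTO xml_tab (id, xml_col) VALUES (?, XMLType(?))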

Joining 2 Oracle tables using Join in SSIS

I have to get data from database1, something like
select *
from tab1
where dtCol > <*dtVariable*>
The value of dtVariable comes from database2, from
select *
from processdt
where proc_name = 'PROCESS1'
Can you please let me know how to do this?
I am using SSIS 2008
Oracle DB: 12c
Oracle drivers: Attunity 1.2
No DB links (basically, we need to avoid using DB links)
You would simply need to pull the input value from database2 into an SSIS variable as step 1.
In step 2, you would create an expression that forms the SELECT SQL call incorporating that variable.
MSDN has an article on how to use the Execute SQL Task.
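As a hedged sketch (the variable name User::dtVariable and the date format are assumptions, not from the question): step 1's Execute SQL Task stores the date from database2 in a string variable, and the source statement for database1 is then built with an SSIS expression such as

"select * from tab1 where dtCol > TO_DATE('" + @[User::dtVariable] + "', 'YYYY-MM-DD HH24:MI:SS')"

Point the Oracle source at that expression (for example via a variable with EvaluateAsExpression set to true, used as the SQL command) so the statement is rebuilt at run time.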

Import blob through SAS from ORACLE DB

Good day to everyone.
I ran into a huge problem during my work last week.
Here is the deal:
I need to download an Excel file (BLOB) from an Oracle database through SAS.
As a first step I need to get the data from Oracle. I used this construction (the BLOB file is nearly 100 KB):
proc sql;
connect to oracle;
create table SASTBL as
select * from connection to oracle (
select dbms_lob.substr(myblobfield,1,32767) as blob_1,
dbms_lob.substr(myblobfield,32768,32767) as blob_2,
dbms_lob.substr(myblobfield,65535,32767) as blob_3,
dbms_lob.substr(myblobfield,97302,32767) as blob_4
from my_tbl;
);
quit;
And the result is:
blob_1 = 70020202020202...02
blob_2 = 02020202020...02
blob_3 = 02020202...02
I do not understand why the field consists of "02" (the whole file).
And the length of any variable in SAS is 1024 (instead of 32767), in $HEX2024 format.
If I take dbms_lob.substr(my_blob_field,2000,900) from the same object, the result looks much more like the truth:
blob = "A234ABC4536AE7...."
The question is: how can I correctly get binary data from a BLOB field through SAS? What is my mistake?
Thank you.
EDIT 1:
I get the information, but the maximum string length is 2,000.
Use the DBMAX_TEXT option on the CONNECT statement (or a LIBNAME statement) to get up to 32,767 characters. The default is probably 1024.
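A minimal sketch of that option on a pass-through connection (credentials and names are placeholders, not from the question). Note also that DBMS_LOB.SUBSTR's argument order is (lob, amount, offset), so the first 32,767 bytes are substr(field, 32767, 1):

proc sql;
connect to oracle (user=myuser password=mypass path=mydb dbmax_text=32767);
create table work.sastbl as
select * from connection to oracle (
select dbms_lob.substr(myblobfield, 32767, 1) as blob_1 /* amount first, then offset */
from my_tbl
);
disconnect from oracle;
quit;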
PROC SQL uses SQL to interact with SAS datasets (create tables, query tables, aggregate data, connect externally, etc.). The procedure mostly follows the ANSI standard with a few SAS-specific extensions. Each RDBMS extends ANSI as well, including Oracle with its XML handling, such as saving content in a BLOB column. Possibly, SAS cannot properly read the Oracle-specific (non-ANSI) binary large object type. Typically SAS processes string, numeric, datetime, and a few other types.
As an alternative, consider saving the XML content from Oracle externally as an .xml file and using SAS's XML engine to read the content into a SAS dataset:
** STORING XML CONTENT;
libname tempdata xml 'C:\Path\To\XML\File.xml';
** APPEND CONTENT TO SAS DATASET;
data Work.XMLData;
set tempdata.NodeName; /* CHANGE TO REPEAT PARENT NODE OF XML. */
run;
Adding this as another answer as I can't comment yet... The issue you experienced is that the return value of dbms_lob.substr is actually a VARCHAR, so SAS limits it to 2,000. To avoid this, you could wrap it in to_clob( ... ) AND set the DBMAX_TEXT option as previously answered.
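For instance, a hedged sketch of that wrap inside the pass-through query (this assumes the underlying column is a CLOB; the names reuse the earlier snippets):

select to_clob(dbms_lob.substr(myblobfield, 32767, 1)) as blob_1 /* returns a CLOB, so DBMAX_TEXT governs the SAS length */
from my_tbl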
Another alternative is below...
The code below is an effective method for retrieving a single record with a large CLOB. Instead of calculating how many fields to split the CLOB into, which results in a very wide record, it splits the CLOB into multiple rows. See the expected output at the bottom.
Disclaimer: although effective, it may not be efficient, i.e. it may not scale well to multiple rows; the generally accepted approach then is a pipelined PL/SQL function. That being said, the below got me out of a pinch, if you can't create a stored procedure...
PROC SQL;
connect to oracle (authdomain=YOUR_Auth path=devdb DBMAX_TEXT=32767 );
create table clob_chunks (compress=yes) as
select *
from connection to Oracle (
SELECT id
, key
, level clob_order
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') clob_chunk
FROM (
SELECT id, key, clob_value
FROM schema.table
WHERE id = 123
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
)
order by id, key, clob_order;
disconnect from oracle;
QUIT;
Expected output:
ID KEY CLOB_ORDER CLOB_CHUNK
1 1 1 short_clob
2 2 1 long clob chunk1of3
2 2 2 long clob chunk2of3
2 2 3 long clob chunk3of3
3 3 1 another_short_one
Explanation:
DBMAX_TEXT tells SAS to adjust the default of 1024 for a clob field.
The regex .{1,32767} tells Oracle to match any character at least once but no more than 32767 times. This splits the input into chunks of up to 32767 characters, the last chunk likely being shorter.
The regexp_substr pulls one chunk from the CLOB (param 1), starting from the beginning of the CLOB (param 2), skipping to the LEVEL'th occurrence (param 3) and treating the CLOB as one large string (param 4, 'n').
The CONNECT BY re-runs the regex to count the chunks, so that LEVEL stops incrementing beyond the end of the CLOB.
References:
SAS KB article for DBMAX_TEXT
Oracle docs for REGEXP_COUNT
Oracle docs for REGEXP_SUBSTR
Oracle regex syntax
Stackoverflow example of regex splitting

Java 1.4: how to insert multiple records in a database with one single hit using executeBatch?

I am reading record data from a file (the record count can be up to thousands). Now I want to insert each record into the database, and I want to insert all the records in one hit to reduce the performance cost. If I use addBatch(String sqlQuery) on a Statement object, my SQL query has to be static, but in my case the query is not static. Please tell me the possible solutions with the best performance.
Platform:
Java 1.4
SQL Server 2000
From Wiki
A SQL feature (since SQL-92) is the use of row value constructors to insert multiple rows at a time in a single SQL statement:
INSERT INTO table (column1 [, column2, ...])
VALUES (value1a [, value1b, ...]),
       (value2a [, value2b, ...]),
       ...
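As a concrete instance of that multi-row form (table and values hypothetical), with one caveat: SQL Server only added this syntax in SQL Server 2008, so it will not run on the SQL Server 2000 mentioned above:

-- hypothetical table; multi-row VALUES requires SQL Server 2008+
INSERT INTO employees (id, name)
VALUES (1, 'Alice'),
       (2, 'Bob'),
       (3, 'Carol');

On SQL Server 2000 with Java 1.4, the realistic route is a java.sql.PreparedStatement with parameter markers: the SQL text stays static while the data varies, you call the setXxx() methods and addBatch() once per record, and executeBatch() once at the end.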
