Failure loading parquet in Synapse Analytics - INT mapped as UTF8 - Oracle

We have an on-premises Oracle database from which we need to extract data and store it in a Synapse dedicated pool. I have created a Synapse pipeline which first copies the data from Oracle to a data lake as a parquet file, which should then be imported into Synapse using a second copy task.
The data from Oracle is extracted through a dynamically created query. This query has 2 hard-coded INT values which are generated at runtime. The query runs fine and the parquet file is created correctly, but if I use PolyBase or the COPY command to import the file into Synapse it fails with the following error:
"errorCode": "2200",
"message": "ErrorCode=UserErrorSqlDWCopyCommandError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL DW Copy Command operation failed with error 'HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: ',Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: ,Source=.Net SqlClient Data Provider,SqlErrorNumber=106000,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=106000,State=1,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: ,},],'",
Bulk insert works but is less efficient on large quantities of data, so I don't want to use that.
The mapping for the copy activities is created dynamically based on the target database table definition. However, when I created a separate copy task and imported the mapping to check what is going on, I noticed that the 2 INT columns are mapped as UTF8 on the parquet source side. The sink table is INT32. When I exclude both columns the copy task completes successfully. It seems that the copy activity fails because it cannot implicitly cast a string to an integer.
The 2 columns are explicitly cast as integers in the Oracle query that is the source for the parquet file.
SELECT t.*
, CAST(419 AS INT) AS "Execution_id"
, CAST(4832 AS INT) AS "Task_id"
, TO_DATE('2022-07-05 14:40:34', 'YYYY-MM-DD HH24:MI:SS') AS "ProcessedDTS"
, t.DEMUTDT AS "EffectiveDTS"
FROM CBO.DRKASTR t
WHERE DEMUTDT >= TO_DATE('2022-07-05 13:37:35', 'YYYY-MM-DD HH24:MI:SS');
Adding an explicit Oracle-to-parquet mapping that maps these columns as INT also doesn't solve the problem.
How do I prevent these 2 columns from being interpreted as strings instead of integers?!

We ended up resolving this by first importing the data as strings in the database and casting to the correct data type during further processing.
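A minimal sketch of that pattern, assuming a Synapse dedicated pool and made-up staging/target table names (only the relevant columns are shown):

-- Hypothetical staging table: the two problem columns land as strings first
CREATE TABLE dbo.Stage_DRKASTR
(
    Execution_id  VARCHAR(20),
    Task_id       VARCHAR(20),
    ProcessedDTS  DATETIME2,
    EffectiveDTS  DATETIME2
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);

-- COPY/PolyBase loads the parquet file into the staging table with no string-to-int cast;
-- the cast happens afterwards, inside the pool
INSERT INTO dbo.DRKASTR (Execution_id, Task_id, ProcessedDTS, EffectiveDTS)
SELECT TRY_CAST(Execution_id AS INT),
       TRY_CAST(Task_id AS INT),
       ProcessedDTS,
       EffectiveDTS
FROM dbo.Stage_DRKASTR;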

Related

SQLSTATE HY104; INVALID PRECISION VALUE. ERROR IN PARAMETER

I am using SSIS for ETL. Source and destination databases are Oracle.
When I run the job through SQL Agent it prompts me with the following error:
This table contains 5 date columns which are creating this issue.
I have tried all possible solutions but nothing worked. It does not seem to be a data issue, as I reran the job on those selective dates and it worked perfectly. On a full load it failed.
The bottom error message is:
Data Flow: Task:Error: SQLSTATE 22007, Message: [Microsoft][ODBC Oracle Wire Protocol driver]Invalid datetime format. Error in parameter 17.
You have an Invalid datetime format. You need to fix it by correcting either the data or the format model you are using but, since you haven't included any code, we can't help further.
I had a similar issue; the difference is that my source was a SQL Server database and the destination an Oracle database.
I converted the source DateTime columns to type String first, and then they were loaded into the destination date columns successfully.
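For illustration, a sketch of that conversion on the SQL Server side (table and column names here are made up; CONVERT style 120 produces 'YYYY-MM-DD HH:MI:SS', which the Oracle side can parse with a matching format model):

-- Hypothetical source query: send the datetimes as ODBC-canonical strings
SELECT OrderId,
       CONVERT(VARCHAR(19), OrderDate, 120) AS OrderDate,
       CONVERT(VARCHAR(19), ShipDate, 120)  AS ShipDate
FROM dbo.Orders;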

ADF: Can't get the ForEach parameter to work in the query for the source

We have an Oracle database in which one of the tables holds dates. I want to iterate over this table by date to copy data from Oracle to Azure Data Lake. But somehow I cannot get this to work.
The lookup for the foreach works fine, but when I want to copy the data using one of the dates from the lookup, the copy activity fails with the error: Message=ERROR [HY000] [Microsoft][ODBC Oracle Wire Protocol driver][Oracle]ORA-00936: missing expression
I suspect it has something to do with the date format that Oracle spits out and expects in the where clause. When I run the lookup query in SQL Developer, the date format is like 29-DEC-14.
The query for the lookup looks like this:
select distinct activity_day
from Table 1
where activity_day < '01-JAN-15'
I restrict the data for testing so it only has to iterate over everything before 01-01-2015 (which in this case is three rows).
In the foreach component, items is set as follows:
#activity('LookupDates').output.value
In the Copy activity the source is specified as an Oracle query (the connection to the Oracle database works fine):
select column1, column2, column3,.......
from Table
where activity_day = #item().activity_day
The result should be that I get three files in my data lake with the data from three days. But as stated earlier, it fails in the copy activity on the source side. The complete error is below:
"errorCode": "2200",
"message": "Failure happened on 'Source' side. ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ERROR [HY000] [Microsoft][ODBC Oracle Wire Protocol driver][Oracle]ORA-00936: missing expression,Source=Microsoft.DataTransfer.ClientLibrary.Odbc.OdbcConnector,''Type=System.Data.Odbc.OdbcException,Message=ERROR [HY000] [Microsoft][ODBC Oracle Wire Protocol driver][Oracle]ORA-00936: missing expression,Source=msora28.dll,'",
"failureType": "UserError",
"target": "Copy Data1"
The answer was given on MSDN, in combination with another topic on Stack Overflow:
https://social.msdn.microsoft.com/Forums/en-US/4224338f-9511-4f80-9fbf-4bf4cbc1b596/cant-get-lookup-data-passed-to-oracle-database?forum=AzureDataFactory
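For reference, a sketch of the usual shape of such a fix: quote the looked-up value in the source query and give Oracle an explicit format model. This assumes the query is entered as dynamic content using ADF's @{...} string interpolation and that the table/column names below are placeholders:

select column1, column2, column3
from MyTable
where activity_day = TO_DATE('@{formatDateTime(item().activity_day, 'yyyy-MM-dd')}', 'YYYY-MM-DD')

formatDateTime reduces the lookup value to a plain date string, and TO_DATE tells Oracle exactly how to read it, so the query no longer depends on the session's NLS date format.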

Suggestion for loading data of 2M records into DB

Users upload a data file with 2 million records through an application (JSF), and I have to load it into the DB. Loading it through an asynchronous Java call uses too much memory (out-of-memory exception) and also times out most of the time.
So what I did is store the uploaded file as a CLOB in table1 and use a UNIX shell script which runs every 15 minutes to check whether table1 has unprocessed records; if so, it reads that CLOB and loads it into table2 using SQLLDR from the same shell script. It is working fine, but there is a 15-minute delay in processing records.
So I think the same SQLLDR process could be run through a PL/SQL package or procedure, and the same package could be called through a Java JDBC call, right? Any examples?
If it's a one-time export/import you can use SQL Developer. It enables you to export displayed rows in loader format. BLOBs/CLOBs are exported as separate files.
Following Oracle's blog:
LOAD DATA
INFILE 'loader.txt'
INTO TABLE my_table
FIELDS TERMINATED BY ','
( id CHAR(10),
author CHAR(30),
created DATE "YYYY-MM-DD" ":created",
fname FILLER CHAR(80),
text LOBFILE(fname) TERMINATED BY EOF
)
"fname" is an arbitrary label, we could have used "fred" and it would
have worked exactly the same. It just needs to be the same on the two
lines where it is used.
loader.txt:
1,John Smith,2015-04-29,file1.txt
2,Pete Jones,2013-01-31,file2.txt
If you want to know how to dump a CLOB column into a file, please refer to Dumping CLOB fields into files?.
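As an aside on the PL/SQL part of the question: the in-database equivalent of SQL*Loader is an external table (it uses the same ORACLE_LOADER driver), and the load then becomes a plain INSERT ... SELECT that can live in a stored procedure called over JDBC. A rough sketch, assuming the uploaded file can be written to a directory the database server can see (all names here are made up, and this differs from the CLOB-in-table1 setup above):

-- Hypothetical directory object pointing at the folder where the upload is written
CREATE OR REPLACE DIRECTORY upload_dir AS '/data/uploads';

-- External table over the uploaded CSV; the ORACLE_LOADER driver does the parsing
CREATE TABLE table2_ext (
  id      NUMBER,
  author  VARCHAR2(30),
  created DATE
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY upload_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    ( id,
      author,
      created CHAR(10) DATE_FORMAT DATE MASK "YYYY-MM-DD" )
  )
  LOCATION ('loader.txt')
);

-- The load is now plain SQL, so it can sit inside a PL/SQL procedure that JDBC calls directly
INSERT INTO table2 (id, author, created)
SELECT id, author, created
FROM   table2_ext;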

Oracle destination in SSIS data flow is failing with Error- ORA-01405: fetched column value is NULL

I have one SSIS package in which there is one DFT (Data Flow Task). In the DFT, I have one Oracle source and one Oracle destination.
In the Oracle destination I am using the Data Access Mode 'Table Name - Fast Load (Using Direct Path)'.
There is one strange issue with that: it is failing with the following error
[Dest 1 [251]] Error: Fast Load error encountered during
PreLoad or Setup phase. Class: OCI_ERROR Status: -1 Code: 0 Note:
At: ORAOPRdrpthEngine.c:735 Text: ORA-00604: error occurred at
recursive SQL level 1 ORA-01405: fetched column value is NULL
I thought it was due to NULL values in the source, but there is no NOT NULL constraint on the destination table, so that should not be an issue. And to add to this, the package works fine with 'Normal Load' but not with 'Fast Load'.
I have tried using NVL for NULL values coming from the source, but still no luck.
I have also recreated the DFT with these connections, but that too was in vain.
Can someone please help me with this?
It worked fine after recreating the Oracle table with the same script.

JDBC error on insert with DB2 (works with SQL Server)

I use JDBC in a Java application to query the DBMS. The application works correctly with SQL Server, but I get this error with DB2 during one insert:
com.ibm.db2.jcc.am.SqlDataException: DB2 SQL Error: SQLCODE=-302, SQLSTATE=22001, SQLERRMC=1, DRIVER=3.63.75
The insert is made using ResultSet.TYPE_SCROLL_SENSITIVE and ResultSet.CONCUR_UPDATABLE.
My query is a plain select on the table; then I declare my PreparedStatement, pass the parameters, and afterwards on the ResultSet I first call moveToInsertRow() and then insertRow().
Do you know if there are any problems with this approach with DB2?
As I said before, the same code works correctly with SQL Server.
SQL Code -302 on DB2 means:
THE VALUE OF INPUT VARIABLE OR PARAMETER NUMBER position-number IS INVALID OR TOO LARGE FOR THE TARGET COLUMN OR THE TARGET VALUE
So it seems you are trying to insert a value that is too large for the target column (e.g. 'Hello World' into a VARCHAR(5)). Probably the column has a different length in DB2 than in SQL Server, or you are inserting different values.
Probably too late to add to this thread, but someone else might find it useful.
I got the same SQL exception when trying to do a SELECT: I didn't realize the property value in the WHERE clause was exceeding the length limit of the corresponding column.
SELECT * FROM <schema>.<table_name> WHERE PropertyName = 'value';
value was a VARCHAR but exceeded the length limit.
The detailed exception does say clearly that data integrity was violated: org.springframework.dao.DataIntegrityViolationException
So a good idea would be to do a length check on the value(s) being set on the properties before firing any queries to the database.
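For example, the declared lengths to validate against can be read from the DB2 catalog (a sketch for DB2 LUW, with placeholder schema and table names):

-- Look up the declared type and length of each column before binding values
SELECT COLNAME, TYPENAME, LENGTH
FROM   SYSCAT.COLUMNS
WHERE  TABSCHEMA = 'MYSCHEMA'
AND    TABNAME   = 'MYTABLE';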
