How to perform a search and replace on an NCLOB? - oracle

With an Oracle 11g database and an NCLOB column in a table, I'm trying to replace some text with other text. At a high level it's pretty simple, and I did it successfully in a SQL Server version of the script, but with Oracle things get complicated, mainly because the data in the NCLOB column can easily exceed 46k characters.
I hit error ORA-22835 (Buffer too small for CLOB to CHAR or BLOB to RAW conversion), and the usual suggested workarounds don't apply: the data has a variable length, and chunking it with SUBSTR could split my "search string" down the middle.
I'm looking for a straightforward and simple solution I could use in a SQL script.
Here is an example of script I'm using with SQL Server:
DECLARE @replacestring NVarChar(MAX) = '0D0D000402175300008950.. very long string 46k+ in length ..1CA68848EEB58360000000049454E44AE426082';
DECLARE @oldFingerprintStart NVarChar(MAX) = '0D0D0004002BA80000FFD8FFE000104A46494600010201004800480000FFE10B304578696600004D4D002A0000000800070';
DECLARE @oldFingerprintEnd NVarChar(MAX) = '02800A002800A002800A002800A002800A002800A002800A002800A002800A002800A002800A002800A002800A002803FFD9';
UPDATE Table1
SET datacolumn =
CONCAT(
SUBSTRING(datacolumn, 0, CHARINDEX(@oldFingerprintStart, datacolumn)),
@replacestring,
SUBSTRING(datacolumn, CHARINDEX(@oldFingerprintEnd, datacolumn) + LEN(@oldFingerprintEnd), LEN(datacolumn) - (CHARINDEX(@oldFingerprintEnd, datacolumn) + LEN(@oldFingerprintEnd)) + 1)
)
WHERE CHARINDEX(@oldFingerprintStart, datacolumn) > 0
AND CHARINDEX(@oldFingerprintEnd, datacolumn) > 0

You can find a nice and detailed explanation here, but in my experience (and as stated in the Oracle documentation), the standard REPLACE function works on NCLOB columns the same way as on VARCHAR2.
UPDATE a_table
SET that_field = REPLACE(that_field, 'XYZ', 'ABC')
WHERE INSTR(that_field, 'XYZ') > 0
This way you avoid any trouble with buffer limits, because no CLOB-to-CHAR conversion takes place. (INSTR, like REPLACE, operates on LOBs directly; the CONTAINS operator would instead require an Oracle Text index.)
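For completeness, here is a hedged PL/SQL sketch of the original start/end-fingerprint replacement done natively in Oracle. SUBSTR, INSTR and LENGTH accept NCLOB arguments and keep the result in the LOB domain, so no CLOB-to-CHAR conversion (and no ORA-22835) is involved. The variable names are illustrative and the hex strings are truncated as in the question:
DECLARE
  -- truncated placeholders; the real strings come from the question
  -- (note: a PL/SQL string literal is capped at 32767 chars, so a 46k+
  -- value must be built in pieces or loaded from a table)
  v_replace NCLOB          := N'0D0D000402175300008950...';  -- 46k+ replacement
  v_start   NVARCHAR2(101) := N'0D0D0004002BA80000FFD8...';  -- start fingerprint
  v_end     NVARCHAR2(101) := N'02800A002800A002800A00...';  -- end fingerprint
BEGIN
  UPDATE Table1
     SET datacolumn = SUBSTR(datacolumn, 1, INSTR(datacolumn, v_start) - 1)
                      || v_replace
                      || SUBSTR(datacolumn, INSTR(datacolumn, v_end) + LENGTH(v_end))
   WHERE INSTR(datacolumn, v_start) > 0
     AND INSTR(datacolumn, v_end) > 0;
END;
/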

Related

Having issue with extracting date from CLOB data

Hi, I am having an issue extracting fields from CLOB data. For one record I am not getting the desired output.
The record is as below:
{1:F014243107336}{2:O2021216200422XXX24563}{3:{108:O2020}{121:2c02a452-5}{433:HIT}}{4:
:4A:SEC:20200901
:4B:FC5253
:4C:20042000,
:4D:XXXXXXX
:4E:RXX
:4F:RXXXX
-}{5:{CHK:87D1003B01F7}{TNG:}}{S:{SAC:}{COP:S}}<APSECSIGN>FS3sfasdfg!==</APSECSIGN>?
I want to extract the data from tag :4A: into REF_NUMBER.
I am using the SQL below to get the data.
NVL(TRIM(TRANSLATE(REGEXP_REPLACE(REGEXP_SUBSTR(dbms_lob.substr(CLOB, 4000, 1 ), ':4A.?:[^:]+(:|-\})'), ':20.?:([^:]+)(:|-\})', '\1'),CHR(10)||CHR(13), ' ')),' ') AS REF_NUMBER
The output I am getting is "SEC"; however, I want the output to be SEC:20200901.
Can anyone suggest what I am missing in my query, or provide the correct query?
A general suggestion: why not store your data as JSON? JSON-related functions are very fast compared to string parsing, and your problem then becomes quite easy.
However, to answer your question:
with inputs (str) as
(
select to_clob(q'<
{1:F014243107336}{2:O2021216200422XXX24563}{3:{108:O2020}{121:2c02a452-5}{433:HIT}}{4:
:4A:SEC:20200901
:4B:FC5253
:4C:20042000,
:4D:XXXXXXX
:4E:RXX
:4F:RXXXX
-}{5:{CHK:87D1003B01F7}{TNG:}}{S:{SAC:}{COP:S}}<APSECSIGN>FS3sfasdfg!==</APSECSIGN>?
>') from dual
)
select str, regexp_substr(str,'SEC:\d+',1,1,'n') as val
from inputs;
Output:
SEC:20200901
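A hedged variation on the same inputs CTE that does not depend on the 8-digit date format, assuming the :4A: value always runs to the end of its line:
-- capture group 1 is everything between ':4A:' and the next line feed
select regexp_substr(str, ':4A:([^' || chr(10) || ']+)', 1, 1, null, 1) as ref_number
from inputs;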
Updated
If you know the date is always going to be 8 digits after the :4A: tag, you can use REGEXP_SUBSTR to get the value you need. Combining it with DBMS_LOB.SUBSTR removes the tag and converts it to a string.
SELECT DBMS_LOB.SUBSTR ((REGEXP_SUBSTR (clob_val, ':4A:.*\d{8}')), 4000, 5)
FROM (SELECT EMPTY_CLOB ()
|| '{1:F014243107336}{2:O2021216200422XXX24563}{3:{108:O2020}{121:2c02a452-5}{433:HIT}}{4:
:4A:SEC:20200901
:4B:FC5253
:4C:20042000,
:4D:XXXXXXX
:4E:RXX
:4F:RXXXX
-}{5:{CHK:87D1003B01F7}{TNG:}}{S:{SAC:}{COP:S}}<APSECSIGN>FS3sfasdfg!==</APSECSIGN>?' AS clob_val
FROM DUAL);

java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion

I have a dynamic query built and executed with TypedQuery<NewsContentBaseInfo>, and one of the fields is a CLOB object, news.stores. Here is the error I get; I can't find information on how to solve it:
java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion
Here is the query:
SELECT DISTINCT new com.kaufland.newsletter.usecase.newscontent.search.dto.response.NewsContentBaseInfo(news.id, news.uuid, news.dayAndTimeOfPublish, news.title, news.subtitle, news.categoryCountry, news.newsPeriod, to_char(news.stores))
FROM com.kaufland.newsletter.domain.content.AbstractNewsContent news
LEFT OUTER JOIN news.newsLinks newsLinks
WHERE news.country = :country AND news.status = :status
AND news.dayAndTimeOfPublish >= :dayAndTimeOfPublishStart
AND news.dayAndTimeOfPublish <= :dayAndTimeOfPublishEnd
AND (news.stores LIKE '%'||:storeNumber0||'%')
AND news.categoryCountry.id in :includeCategoryIds
AND (LOWER(news.title) LIKE LOWER('%'||:searchText||'%')
OR LOWER(news.subtitle) LIKE LOWER('%'||:searchText||'%')
OR LOWER(news.text1) LIKE LOWER('%'||:searchTextEscaped||'%')
OR LOWER(news.text2) LIKE LOWER('%'||:searchTextEscaped||'%')
OR LOWER(news.text3) LIKE LOWER('%'||:searchTextEscaped||'%')
OR LOWER(newsLinks.displayText) LIKE LOWER('%'||:searchText||'%'))
ORDER BY news.dayAndTimeOfPublish DESC
The to_char function returns a varchar that is limited to 4000 characters, so if the CLOB is larger than that you can get this error (depending on the Oracle version).
If you really need a String value, you can try the dbms_lob package (https://docs.oracle.com/database/121/ARPLS/d_lob.htm#ARPLS600), which can handle more characters.
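For example, a minimal sketch of that workaround in the select list, assuming the first 4000 characters are enough; the table name is a placeholder for whatever the entity maps to:
-- DBMS_LOB.SUBSTR(lob, amount, offset) reads a bounded chunk of the CLOB,
-- so no oversized LOB-to-CHAR conversion is attempted
SELECT DBMS_LOB.SUBSTR(news.stores, 4000, 1) AS stores_text
FROM news_content news;  -- hypothetical table name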

Retrieving data from oracle table using scala jdbc giving wrong results

I am using Scala JDBC to check whether a partition exists for an Oracle table. It is returning wrong results when an aggregate function like count(*) is used.
I have checked the DB connectivity, and other queries work fine. I tried to extract the value of count(*) using an alias, but it failed; I also tried getString, but that failed too.
import java.sql.DriverManager

Class.forName(jdbcDriver)
val connection = DriverManager.getConnection(jdbcUrl, dbUser, pswd)
val statement = connection.createStatement()
try {
  val sqlQuery = s"""SELECT COUNT(*) FROM USER_TAB_PARTITIONS
       WHERE TABLE_NAME = '$tableName' AND PARTITION_NAME = '$partitionName'"""
  val resultSet1 = statement.executeQuery(sqlQuery)
  while (resultSet1.next()) {
    val cnt = resultSet1.getInt(1)
    println("Count=" + cnt)
    if (cnt == 0) {
      // Code to add partition and insert data
    } else {
      // Code to insert data in existing partition
    }
  }
} catch { case e: Exception => ... }
The value of cnt always prints as 0 even though the Oracle partition already exists. Can you please let me know what the error in the code is? Is it giving wrong results because I am using Scala JDBC to get the result of an aggregate function like count(*)? If so, what would be the correct code? I need to use Scala JDBC to check whether the partition already exists in Oracle and then insert data accordingly.
This is just a suggestion, but it might be the solution in your case.
Whenever you search Oracle's metadata tables, always use UPPER or LOWER on both sides of the equals sign.
Oracle converts every object name to upper case and stores it in the metadata, unless you specifically provided a lower-case object name in double quotes when creating it.
Take the following example:
-- 1
CREATE TABLE "My_table_name1" ... -- CASE SENSISTIVE
-- 2
CREATE TABLE My_table_name2 ... -- CASE INSENSITIVE
In the first query, we used double quotes, so the name is stored in Oracle's metadata as a case-sensitive name.
In the second query, we did not use double quotes, so the table name is converted to upper case and stored in the metadata.
So if you want to write a query against Oracle metadata that covers both of the above cases, use UPPER or LOWER on both the column and the value, as follows:
SELECT * FROM USER_TABLES WHERE UPPER(TABLE_NAME) = UPPER('<YOUR TABLE NAME>');
Hope this helps you solve the issue.
Cheers!!
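Applied to the partition check from the question, the query becomes the following sketch (the table and partition names are placeholders):
SELECT COUNT(*)
FROM USER_TAB_PARTITIONS
WHERE UPPER(TABLE_NAME) = UPPER('my_table')
AND UPPER(PARTITION_NAME) = UPPER('my_partition');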

Oracle SQL Query Performance, Function based Indexes

I have been trying to fine-tune a SQL query that takes 1.5 hours to process approximately 4,000 error records; the run time increases with the number of rows.
I figured out that one condition in my SQL is actually causing the issue:
AND (DECODE (aia.doc_sequence_value,
NULL, DECODE(aia.voucher_num,
NULL, SUBSTR(aia.invoice_num, 1, 10),
aia.voucher_num) ,
aia.doc_sequence_value) ||'_' ||
aila.line_number ||'_' ||
aida.distribution_line_number ||'_' ||
DECODE (aca.doc_sequence_value,
NULL, DECODE(aca.check_voucher_num,
NULL, SUBSTR(aca.check_number, 1, 10),
aca.check_voucher_num) ,
aca.doc_sequence_value)) = "P_ID"
(P_ID - a value from the first cursor sql)
(Note that these are standard Oracle Applications(ERP) Invoice tables)
The P_ID column in the staging table is derived the same way as the expression above, and it is compared here again in the second SQL to get the latest data for that record. (Basically we are reprocessing the error records; the value of P_ID is something like "999703_1_1_9995248".)
Q1) Can I create a function-based index on the whole left-hand derivation? If so, what is the syntax?
Q2) Would it be okay, or against Oracle's standard rules, to create a function-based index on standard Oracle tables? (Not creating it directly on the table itself.)
Q3) If not, what is the best approach to solve this issue?
Briefly, no, you can't place a function-based index on that expression, because the input values are derived from four different tables (or table aliases).
What you might look into is a materialised view, but that is a big hammer for solving a single query optimisation problem.
You might instead investigate decomposing the string "999703_1_1_9995248" and applying the relevant parts to the separate expressions:
DECODE(aia.doc_sequence_value,
NULL,
DECODE(aia.voucher_num,
NULL, SUBSTR(aia.invoice_num, 1, 10),
aia.voucher_num) ,
aia.doc_sequence_value) = '999703' and
aila.line_number = '1' and
aida.distribution_line_number = '1' and
DECODE (aca.doc_sequence_value,
NULL,
DECODE(aca.check_voucher_num,
NULL, SUBSTR(aca.check_number, 1, 10),
aca.check_voucher_num) ,
aca.doc_sequence_value) = '9995248'
Then you can use indexes on the expressions and columns.
You could separate the four components of the P_ID value using regular expressions, or a combination of InStr() and SubStr(), as in the sketch below.
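For example, a hedged sketch of that decomposition with REGEXP_SUBSTR (the :p_id bind variable is illustrative):
-- each call picks the nth run of characters between underscores
SELECT REGEXP_SUBSTR(:p_id, '[^_]+', 1, 1) AS doc_sequence_value,        -- '999703'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 2) AS line_number,               -- '1'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 3) AS distribution_line_number,  -- '1'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 4) AS check_sequence_value       -- '9995248'
FROM dual;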
Ad 1) Based on the SQL you've posted, you cannot create a function-based index on that expression. The reason is that a function-based index must:
Be deterministic - i.e. the function used in the index definition always returns the same result for the same input arguments, and
Use only columns from the table the index is created for. In your case - based on the aliases you're using - you have four tables (aia, aila, aida, aca).
Requirement #2 makes it impossible to build a function-based index for that expression.

Why does Oracle require TO_NCHAR when binding SQL_C_WCHAR text via ODBC

I use the following statement prepared and bound in ODBC:
SELECT (CASE profile WHEN ? THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;
It is executed over an ODBC 3.0 connection to an Oracle 10g database in the AL32UTF8 character set. Even after binding to a wchar_t string using SQLBindParameter(SQL_C_WCHAR), it still gives the error ORA-12704: character set mismatch.
Why? I'm binding as wchar. Shouldn't a wchar be considered an NCHAR?
If I wrap the parameter with TO_NCHAR(), the query works without error. However, since these queries are used with multiple database backends, I don't want to add TO_NCHAR just for Oracle text bindings. Is there something I am missing? Is there another way to solve this without the TO_NCHAR hammer?
I haven't been able to find anything relevant via searches or in the manuals.
More details...
-- error
SELECT (CASE profile WHEN '_default' THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;
-- ok
SELECT (CASE profile WHEN TO_NCHAR('_default') THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;
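As a hedged aside (not part of the original post): for literal text, an N'...' national-character literal achieves the same thing as TO_NCHAR, because it makes both sides of the comparison use the national character set:
-- also ok: N'...' is an NCHAR literal, matching the NVARCHAR2 column
SELECT (CASE profile WHEN N'_default' THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;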
SQL> describe engine_properties;
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
EID                                       NOT NULL NVARCHAR2(22)
LID                                       NOT NULL NUMBER(11)
PROFILE                                   NOT NULL NVARCHAR2(32)
PKEY                                      NOT NULL NVARCHAR2(50)
VALUE                                     NOT NULL NVARCHAR2(64)
READONLY                                  NOT NULL NUMBER(5)
This version without TO_NCHAR works fine in SQL Server and PostgreSQL (via ODBC) and SQLite (direct). However, in Oracle it returns "ORA-12704: character set mismatch".
SQLPrepare(SELECT (CASE profile WHEN ? THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;) = SQL_SUCCESS
SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_WCHAR,
SQL_VARCHAR, 32, 0, "_default", 18, 16) = SQL_SUCCESS
SQLExecute() = SQL_ERROR
SQLGetDiagRec(1) = SQL_SUCCESS
[SQLSTATE: HY000, NATIVE: 12704, MESSAGE: [Oracle][ODBC]
[Ora]ORA-12704: character set mismatch]
SQLGetDiagRec(2) = SQL_NO_DATA
If I do use TO_NCHAR, it's okay (but won't work in SQL Server, Postgres, SQLite, etc).
SQLPrepare(SELECT (CASE profile WHEN TO_NCHAR(?) THEN 1 ELSE 2 END) AS profile_order
FROM engine_properties;) = SQL_SUCCESS
SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_WCHAR,
SQL_VARCHAR, 32, 0, "_default", 18, 16) = SQL_SUCCESS
SQLExecute() = SQL_SUCCESS
SQLNumResultCols() = SQL_SUCCESS (count = 1)
SQLFetch() = SQL_SUCCESS
If the Oracle database character set is AL32UTF8, why are the columns defined as NVARCHAR2? That means that you want those columns encoded using the national character set (normally AL16UTF16, but that may be different on your database). Unless you are primarily storing Asian language data (or other data that requires 3 bytes of storage in AL32UTF8), it is relatively uncommon to create NVARCHAR2 columns in an Oracle database when the database character set supports Unicode.
In general, you are far better served sticking with the database character set (CHAR and VARCHAR2 columns) rather than trying to work with the national character set (NCHAR and NVARCHAR2 columns) because there are far fewer hoops that need to be jumped through on the development/ configuration side of things. Since you aren't increasing the set of characters you can encode by choosing NVARCHAR2 data types, I'll wager that you'd be happier with VARCHAR2 data types.
Thanks Justin.
I can't say that I understand exactly how to choose between VARCHAR2 and NVARCHAR2 yet. I had tried using VARCHAR2 for my data (which does include a lot of different languages, both European and Asian) and it didn't work that time.
I have played around a bit more, though, and found that Justin's suggestion works in this combination:
AL32UTF8 database charset
VARCHAR2 column types
set NLS_LANG=.UTF8 before starting sqlplus.exe
data files using UTF-8 (i.e. the files with all the INSERT statements)
inserting and extracting strings from the database using SQL_C_WCHAR
I still don't find Oracle as fun to play with as (for instance) PostgreSQL though... :-)
