I am trying to split a huge CLOB which contains lines with more than 32K characters.
I tried this query:
SELECT REGEXP_SUBSTR(file_cont, '[^'||chr(10)||']+', 1, LEVEL) AS substr
from data_tab where interface = 'Historical'
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(file_cont, '[^'||chr(10)||']+')) + 1
The table data_tab contains some files with pipe as a separator.
The column file_cont is a clob which contains the file we are interested in.
However, when I execute the above query it never finishes; it looks like an infinite loop.
For information, the CLOB contains more than 600 lines.
What I want to do is split the CLOB, line by line, into separate CLOBs.
Do you know a query that can display this result without falling into an infinite loop?
EDIT : The file's size is 22MB.
Thank you in advance.
I have a special package for split and PCRE regular expressions:
https://github.com/xtender/XT_REGEXP
You can find this function in https://github.com/xtender/XT_REGEXP/blob/master/xt_regexp.pck
/**
* Clob simple split
*/
function clob_split_simple(p_clob in clob,p_delim in varchar2)
return clob_table pipelined is
row clob;
l_b number:=1;
l_e number:=1;
$IF DBMS_DB_VERSION.ver_le_11 $THEN
$ELSE
pragma UDF;
$END
begin
while l_e>0
loop
l_e:=instr(p_clob,p_delim,l_b);
pipe row(substr(p_clob,l_b,case when l_e>0 then l_e-l_b else length(p_clob)+length(p_delim)-l_b end));
l_b:=l_e+length(p_delim);
end loop;
end clob_split_simple;
So you can either use this pipelined function:
select *
from table(xt_regexp.clob_split_simple(:clob,chr(10)));
or take this code as an example.
clob_table is just a table of clob:
https://github.com/xtender/XT_REGEXP/blob/master/types.sql
create or replace type clob_table as table of clob;
/
create or replace type date_table as table of date;
/
create or replace type number_table as table of number;
/
create or replace type varchar2_table as table of varchar2(4000);
/
create or replace type xml_table as table of xmltype;
/
Update: fixed a bug with long matches. dbms_lob.substr, which returns a varchar2, has been replaced with substr(clob), which returns a clob.
You can use a PL/SQL function to read and split the value:
If you have the data type:
CREATE TYPE clob_table AS TABLE OF CLOB;
Then the function:
CREATE FUNCTION split_clob(
p_value IN CLOB,
p_delimiter IN VARCHAR2 DEFAULT ','
) RETURN clob_table PIPELINED
IS
v_start PLS_INTEGER;
v_next PLS_INTEGER;
v_len PLS_INTEGER;
BEGIN
v_start := 1;
LOOP
v_next := DBMS_LOB.INSTR( p_value, p_delimiter, v_start );
v_len := CASE v_next WHEN 0 THEN LENGTH( p_value ) + 1 ELSE v_next END - v_start;
PIPE ROW ( SUBSTR( p_value, v_start, v_len ) );
EXIT WHEN v_next = 0;
v_start := v_next + LENGTH(p_delimiter);
END LOOP;
END;
/
For the sample data:
CREATE TABLE table_name ( value CLOB );
DECLARE
v_value TABLE_NAME.VALUE%TYPE := EMPTY_CLOB();
BEGIN
FOR ch IN 65 .. 68 LOOP
FOR i IN 1 .. 10 LOOP
v_value := v_value || RPAD( CHR(ch), 4000, CHR(ch) );
END LOOP;
IF ch < 68 THEN
v_value := v_value || CHR(10);
END IF;
END LOOP;
INSERT INTO table_name ( value ) VALUES ( v_value );
END;
/
Then the output of:
SELECT SUBSTR( s.column_value, 1, 10 ) AS value,
LENGTH( s.column_value ) AS len
FROM table_name t
CROSS APPLY TABLE( split_clob( t.value, CHR(10) ) ) s
Is:
VALUE | LEN
:--------- | ----:
AAAAAAAAAA | 40000
BBBBBBBBBB | 40000
CCCCCCCCCC | 40000
DDDDDDDDDD | 40000
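To check the scan-and-slice logic of split_clob outside the database, here is a rough Python sketch of the same loop. It is not the PL/SQL function itself: the name split_text is made up for illustration, Python strings are 0-based, and str.find returns -1 where DBMS_LOB.INSTR returns 0, so the offsets differ by one from the code above.

```python
def split_text(value: str, delimiter: str = ","):
    """Yield the pieces of value between occurrences of delimiter."""
    start = 0  # Python strings are 0-based; the PL/SQL version starts at 1
    while True:
        nxt = value.find(delimiter, start)  # -1 when no further delimiter exists
        if nxt == -1:
            yield value[start:]  # final piece: everything after the last delimiter
            return
        yield value[start:nxt]
        start = nxt + len(delimiter)  # skip past the delimiter itself


# Same shape as the sample data: four 40,000-character lines joined by line feeds
sample = "\n".join(ch * 40000 for ch in "ABCD")
pieces = list(split_text(sample, "\n"))
print([(p[:10], len(p)) for p in pieces])
```

Each piece is found with a single forward search from the previous position, so the work grows with the size of the text rather than with the number of lines times the size of the text.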
I have a number of sources for information that I need to store in a table (along with other information). As of now I don't know which and how many sources there will be. The source is not required by business logic but rather is stored for investigative purposes only. Also, this table will be used in production only once for data migration, so I would like to keep the solution as simple as possible (i.e. without a properly normalized table structure).
I could create a boolean column for each source (as in source1 char(1) default '0', source2 char(1) default '0' etc.) However, I would have to add a column for each new source. What I'd like to have is a column that's a bit array, each bit representing one source. This is very similar to the order_status column mentioned in the documentation for the BITAND function.
My question is,
What would be the preferred data type for this column assuming that there will be say max. 16 sources? NUMBER(2)?
How would I update this field (e.g. set bit number 3)? I've been looking into UTL_RAW functions but they all seem to (surprise, surprise) expect RAW input which makes things a little cumbersome.
I'm open to other ideas as well, as long as adding a new source doesn't require changes in the table structure. (I'm aware that using bit arrays in a database table is rarely a good idea, but these are special circumstances so no need to comment on that.) Our DB is 12c (12.1).
Create an Object Type:
Oracle Setup:
CREATE TYPE bitarray AS OBJECT(
data BLOB,
len NUMBER(38,0),
CONSTRUCTOR FUNCTION bitarray( in_length NUMBER ) RETURN SELF AS RESULT,
CONSTRUCTOR FUNCTION bitarray( in_data VARCHAR2 ) RETURN SELF AS RESULT,
MEMBER FUNCTION getBit( in_index NUMBER ) RETURN NUMBER,
MEMBER FUNCTION setBit( in_index NUMBER, in_value NUMBER ) RETURN bitarray,
MEMBER FUNCTION toString RETURN CLOB,
STATIC FUNCTION byteToRaw( in_value BINARY_INTEGER ) RETURN RAW
);
/
CREATE TYPE BODY bitarray AS
CONSTRUCTOR FUNCTION bitarray( in_length NUMBER ) RETURN SELF AS RESULT
AS
p_raw RAW(1) := BITARRAY.BYTETORAW( 0 );
BEGIN
DBMS_LOB.CREATETEMPORARY( SELF.DATA, FALSE );
SELF.LEN := in_length;
FOR i IN 1 .. CEIL( in_length / 8 ) LOOP
DBMS_LOB.WRITEAPPEND( SELF.DATA, 1, p_raw );
END LOOP;
RETURN;
END;
CONSTRUCTOR FUNCTION bitarray( in_data VARCHAR2 ) RETURN SELF AS RESULT
AS
p_value BINARY_INTEGER := 0;
p_power BINARY_INTEGER := 1;
BEGIN
SELF.LEN := LENGTH( in_data );
DBMS_LOB.CREATETEMPORARY( SELF.DATA, FALSE );
FOR i IN 1 .. SELF.LEN LOOP
IF SUBSTR( in_data, i, 1 ) = '1' THEN
p_value := p_value + p_power;
END IF;
IF MOD( i, 8 ) = 0 OR i = SELF.LEN THEN
DBMS_LOB.WRITEAPPEND( SELF.DATA, 1, BITARRAY.BYTETORAW( p_value ) );
p_value := 0;
p_power := 1;
ELSE
p_power := p_power * 2;
END IF;
END LOOP;
RETURN;
END;
MEMBER FUNCTION getBit( in_index NUMBER ) RETURN NUMBER
AS
p_amount BINARY_INTEGER := 1;
p_raw RAW(1);
p_bit_index BINARY_INTEGER := MOD( in_index - 1, 8 );
p_byte_index BINARY_INTEGER := ( in_index - 1 - p_bit_index ) / 8 + 1;
p_bit_value BINARY_INTEGER := POWER( 2, p_bit_index );
BEGIN
IF in_index IS NULL OR in_index < 1 OR in_index > SELF.LEN THEN
RETURN NULL;
END IF;
DBMS_LOB.READ( SELF.DATA, p_amount, p_byte_index, p_raw );
RETURN BITAND( UTL_RAW.CAST_TO_BINARY_INTEGER( p_raw ), p_bit_value ) / p_bit_value;
END;
MEMBER FUNCTION setBit( in_index NUMBER, in_value NUMBER ) RETURN bitarray
AS
p_amount BINARY_INTEGER := 1;
p_raw RAW(1);
p_bit_index BINARY_INTEGER := MOD( in_index - 1, 8 );
p_byte_index BINARY_INTEGER := ( in_index - 1 - p_bit_index ) / 8 + 1;
p_bit_value RAW(1) := BITARRAY.BYTETORAW( POWER( 2, p_bit_index ) );
p_array bitarray := SELF;
BEGIN
IF in_index IS NULL OR in_value NOT IN ( 0, 1 ) OR in_index < 1 OR in_index > SELF.LEN THEN
RETURN p_array;
END IF;
DBMS_LOB.READ( SELF.DATA, p_amount, p_byte_index, p_raw );
IF in_value = 1 THEN
p_raw := UTL_RAW.BIT_OR( p_raw, p_bit_value );
ELSE
p_raw := UTL_RAW.BIT_AND( p_raw, UTL_RAW.BIT_COMPLEMENT( p_bit_value ) );
END IF;
DBMS_LOB.WRITE( p_array.DATA, p_amount, p_byte_index, p_raw );
RETURN p_array;
END;
MEMBER FUNCTION toString RETURN CLOB
AS
p_string CLOB := EMPTY_CLOB();
BEGIN
FOR i IN 1 .. SELF.LEN LOOP
IF SELF.getBit(i) = 0 THEN
p_string := p_string || '0';
ELSIF SELF.getBit(i) = 1 THEN
p_string := p_string || '1';
ELSE
p_string := p_string || '-';
END IF;
END LOOP;
RETURN p_string;
END;
STATIC FUNCTION byteToRaw( in_value BINARY_INTEGER ) RETURN RAW
AS
BEGIN
RETURN UTL_RAW.SUBSTR( UTL_RAW.CAST_FROM_BINARY_INTEGER( in_value ), 4, 1 );
END;
END;
/
Query:
Then you can use it in SQL:
SELECT BITARRAY(5).toString() AS default_value,
BITARRAY('10110').toString() AS with_values,
BITARRAY('10110').setBit(3,0).toString() AS set_values
FROM DUAL;
Output:
DEFAULT_VALUE | WITH_VALUES | SET_VALUES
:------------ | :---------- | :---------
00000 | 10110 | 10010
Storage in a Table:
CREATE TABLE table_name ( id INT, bits BITARRAY );
INSERT INTO table_name
SELECT 1, bitarray( 4 ).setBit( 1, 1 ).setBit( 4, 1 ) FROM DUAL UNION ALL
SELECT 1, bitarray( '1011001' ) FROM DUAL;
Then query it using:
SELECT id, t.bits.toString() FROM table_name t;
which outputs:
ID | T.BITS.TOSTRING()
-: | :----------------
1 | 1001
1 | 1011001
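The byte layout used by the type above (bit i lives in byte (i - 1) / 8, at position MOD(i - 1, 8), least significant bit first) can be sketched in Python as follows. This is only a model of the packing scheme, not a replacement for the object type; the class and method names are made up for illustration.

```python
class BitArray:
    """Fixed-length bit array packed LSB-first into bytes; indexes are 1-based."""

    def __init__(self, bits):
        # Accept either a length (all bits 0) or a string of '0'/'1' characters.
        if isinstance(bits, int):
            bits = "0" * bits
        self.length = len(bits)
        self.data = bytearray((self.length + 7) // 8)
        for i, ch in enumerate(bits, start=1):
            if ch == "1":
                self._store(i, 1)

    def _locate(self, index):
        # Byte offset and single-bit mask for a 1-based bit index.
        return (index - 1) // 8, 1 << ((index - 1) % 8)

    def _store(self, index, value):
        byte, mask = self._locate(index)
        if value:
            self.data[byte] |= mask
        else:
            self.data[byte] &= ~mask & 0xFF

    def get_bit(self, index):
        if not 1 <= index <= self.length:
            return None
        byte, mask = self._locate(index)
        return 1 if self.data[byte] & mask else 0

    def set_bit(self, index, value):
        # Like the PL/SQL setBit: return a modified copy, ignore invalid input.
        copy = BitArray(str(self))
        if value in (0, 1) and 1 <= index <= self.length:
            copy._store(index, value)
        return copy

    def __str__(self):
        return "".join(str(self.get_bit(i)) for i in range(1, self.length + 1))


print(BitArray(5), BitArray("10110"), BitArray("10110").set_bit(3, 0))
print(BitArray(4).set_bit(1, 1).set_bit(4, 1))
```

The two print calls produce 00000 10110 10010 and 1001, the same strings as the queries above.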
Instead of creating such a column, how about creating a table? It would contain two columns:
source name
Boolean info
For example:
SQL> create table that_table
2 (source_name varchar2(30),
3 cb_bool number(1) default 0 not null
4 );
Table created.
SQL> insert into that_table
2 select 'source 1', 0 from dual union all
3 select 'source 2', 1 from dual union all
4 select 'source 9', 1 from dual;
3 rows created.
SQL> select * From that_table;
SOURCE_NAME CB_BOOL
------------------------------ ----------
source 1 0
source 2 1
source 9 1
SQL>
As opposed to your idea, this scales and it doesn't really matter how many sources there are - you'd just INSERT a new (or UPDATE existing) row.
I have a table with a varchar2 column containing values like (1,2,20-25,222-256)
Now I have to filter the records based on the following search criteria (24,210,300,250)
Sample Records
Id | RangeOfString
---------------------------
1 | 20-25, 101, 222-256, 1001-1045, 1046, 1047, 1048
2 | 1, 2, 3, 2100-2300
3 | 56-89, 186-326, 548, 601, 875
Expected Result
Id | RangeOfString
---------------------------
1 | 20-25, 101, 222-256, 1001-1045, 1046, 1047, 1048
3 | 56-89, 186-326, 548, 601, 875
You can write a function to return a collection:
Oracle Setup:
CREATE TYPE intlist IS TABLE OF INTEGER;
/
CREATE FUNCTION splitGroupedList(
p_grouped IN VARCHAR2,
p_delimiter IN VARCHAR2 DEFAULT ',',
p_separator IN VARCHAR2 DEFAULT '-'
) RETURN intlist DETERMINISTIC
AS
v_start PLS_INTEGER := 1;
v_end PLS_INTEGER;
v_sep PLS_INTEGER;
v_range VARCHAR2(4000);
v_lower INTEGER;
v_upper INTEGER;
v_numbers intlist := intlist();
c_del_len CONSTANT PLS_INTEGER := LENGTH( p_delimiter );
c_sep_len CONSTANT PLS_INTEGER := LENGTH( p_separator );
BEGIN
IF p_grouped IS NULL THEN
RETURN v_numbers;
END IF;
LOOP
EXIT WHEN v_start = 0;
v_end := INSTR( p_grouped, p_delimiter, v_start );
IF v_end = 0 THEN
v_range := SUBSTR( p_grouped, v_start );
v_start := 0;
ELSE
v_range := SUBSTR( p_grouped, v_start, v_end - v_start );
v_start := v_end + c_del_len;
END IF;
IF v_range IS NULL THEN
CONTINUE;
END IF;
v_sep := INSTR( v_range, p_separator );
IF v_sep = 0 THEN
v_lower := TO_NUMBER( v_range );
v_upper := v_lower;
ELSE
v_lower := TO_NUMBER( SUBSTR( v_range, 1, v_sep - 1 ) );
v_upper := TO_NUMBER( SUBSTR( v_range, v_sep + c_sep_len ) );
END IF;
FOR i IN v_lower .. v_upper LOOP
v_numbers.EXTEND;
v_numbers( v_numbers.COUNT ) := i;
END LOOP;
END LOOP;
RETURN v_numbers;
END;
/
Query:
WITH your_data ( Id, RangeOfString ) AS (
SELECT 1, '20-25,101,222-256,1001-1045,1046,1047,1048' FROM DUAL UNION ALL
SELECT 2, '1,2,3,2100-2300' FROM DUAL UNION ALL
SELECT 3, '56-89,186-326,548,601,875' FROM DUAL
)
SELECT *
FROM your_data
WHERE intlist( 24,210,300,250 ) MULTISET INTERSECT splitGroupedList( RangeOfString ) IS NOT EMPTY;
Output:
ID RANGEOFSTRING
-- ------------------------------------------
1 20-25,101,222-256,1001-1045,1046,1047,1048
3 56-89,186-326,548,601,875
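The expand-then-intersect idea can be checked quickly outside the database. Below is a rough Python sketch, assuming non-negative integers and the same ',' and '-' conventions; the function name expand_grouped_list is made up for illustration, and a set intersection plays the role of MULTISET INTERSECT.

```python
def expand_grouped_list(grouped, delimiter=",", separator="-"):
    """Expand a string such as '20-25,101' into the set of integers it covers."""
    numbers = set()
    if not grouped:
        return numbers
    for item in grouped.split(delimiter):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing delimiter
        lower, sep, upper = item.partition(separator)
        low = int(lower)
        high = int(upper) if sep else low  # a single value is a range of one
        numbers.update(range(low, high + 1))
    return numbers


rows = {
    1: "20-25, 101, 222-256, 1001-1045, 1046, 1047, 1048",
    2: "1, 2, 3, 2100-2300",
    3: "56-89, 186-326, 548, 601, 875",
}
wanted = {24, 210, 300, 250}

# Keep the rows whose expanded ranges share at least one value with the search list
matches = [row_id for row_id, text in rows.items() if wanted & expand_grouped_list(text)]
print(matches)  # [1, 3]
```

Expanding every range is simple but materializes each number; for very wide ranges it may be cheaper to compare each search value against the lower and upper bounds instead.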
Oracle 11g has certainly improved usability of CLOBs, having overloaded most of the string functions so they now work natively with CLOBs.
However, a colleague was getting this error from his code:
ORA-22828: input pattern or replacement parameters exceed 32K size limit
22828. 00000 - "input pattern or replacement parameters exceed 32K size limit"
*Cause: Value provided for the pattern or replacement string in the form of
VARCHAR2 or CLOB for LOB SQL functions exceeded the 32K size limit.
*Action: Use a shorter pattern or process a long pattern string in multiple
passes.
This only occurred when the third parameter to replace was a CLOB with more than 32k characters.
(Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production)
Test case:
declare
v2 varchar2(32767);
cl_small clob;
cl_big clob;
cl_big2 clob;
begin
v2 := rpad('x', 32767, 'x');
dbms_output.put_line('v2:' || length(v2));
cl_small := v2;
dbms_output.put_line('cl_small:' || length(cl_small));
cl_big := v2 || 'y' || v2;
dbms_output.put_line('cl_big[1]:' || length(cl_big));
cl_big2 := replace(cl_big, 'y', cl_small);
dbms_output.put_line('cl_big[2]:' || length(cl_big2));
cl_big2 := replace(cl_big, 'y', cl_big);
dbms_output.put_line('cl_big[3]:' || length(cl_big2));
end;
/
Results:
v2:32767
cl_small:32767
cl_big[1]:65535
cl_big[2]:98301
ORA-22828: input pattern or replacement parameters exceed 32K size limit
This seems at odds with the docs which imply that the replacement string may be a CLOB - I would have thought this should imply that any CLOB would be allowed, not just those that happen to be <32K: http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions153.htm#SQLRF00697
Here is a rough first draft for a function that will do the job with certain limitations. It hasn't been very well tested yet:
function replace_with_clob
(i_source in clob
,i_search in varchar2
,i_replace in clob
) return clob is
l_pos pls_integer;
begin
l_pos := instr(i_source, i_search);
if l_pos > 0 then
return substr(i_source, 1, l_pos-1)
|| i_replace
|| substr(i_source, l_pos+length(i_search));
end if;
return i_source;
end replace_with_clob;
It only does a single replace on the first instance of the search term.
declare
v2 varchar2(32767);
cl_small clob;
cl_big clob;
cl_big2 clob;
begin
v2 := rpad('x', 32767, 'x');
dbms_output.put_line('v2:' || length(v2));
cl_small := v2;
dbms_output.put_line('cl_small:' || length(cl_small));
cl_big := v2 || 'y' || v2;
dbms_output.put_line('cl_big[1]:' || length(cl_big));
cl_big2 := replace(cl_big, 'y', cl_small);
dbms_output.put_line('cl_big[2]:' || length(cl_big2));
cl_big2 := replace_with_clob(cl_big, 'y', cl_big);
dbms_output.put_line('cl_big[3]:' || length(cl_big2));
end;
/
v2:32767
cl_small:32767
cl_big[1]:65535
cl_big[2]:98301
cl_big[3]:131069
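The lengths in that output follow from the splice itself: prefix, replacement, then the rest of the source after the search term. A minimal Python sketch of the same first-occurrence splice (replace_first is a made-up name, and Python strings stand in for CLOBs) reproduces the arithmetic:

```python
def replace_first(source: str, search: str, replacement: str) -> str:
    """Replace only the first occurrence of search in source."""
    if not search:
        return source
    pos = source.find(search)  # -1 when the search term is absent
    if pos == -1:
        return source
    return source[:pos] + replacement + source[pos + len(search):]


v2 = "x" * 32767
cl_big = v2 + "y" + v2
result = replace_first(cl_big, "y", cl_big)
print(len(cl_big), len(result))  # 65535 131069
```

The result is one character shorter than twice the input because the single 'y' is consumed: 65535 - 1 + 65535 = 131069.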
You can create a function to handle CLOB values of any length:
CREATE FUNCTION lob_replace(
i_lob IN clob,
i_what IN varchar2,
i_with IN clob,
i_offset IN INTEGER DEFAULT 1,
i_nth IN INTEGER DEFAULT 1
) RETURN CLOB
AS
o_lob CLOB;
n PLS_INTEGER;
l_lob PLS_INTEGER;
l_what PLS_INTEGER;
l_with PLS_INTEGER;
BEGIN
IF i_lob IS NULL
OR i_what IS NULL
OR i_offset < 1
OR i_offset > DBMS_LOB.LOBMAXSIZE
OR i_nth < 1
OR i_nth > DBMS_LOB.LOBMAXSIZE
THEN
RETURN NULL;
END IF;
n := NVL( DBMS_LOB.INSTR( i_lob, i_what, i_offset, i_nth ), 0 );
l_lob := DBMS_LOB.GETLENGTH( i_lob );
l_what := LENGTH( i_what );
l_with := NVL( DBMS_LOB.GETLENGTH( i_with ), 0 );
DBMS_LOB.CREATETEMPORARY( o_lob, FALSE );
IF n > 0 THEN
IF n > 1 THEN
DBMS_LOB.COPY( o_lob, i_lob, n-1, 1, 1 );
END IF;
IF l_with > 0 THEN
DBMS_LOB.APPEND( o_lob, i_with );
END IF;
IF n + l_what <= l_lob THEN
DBMS_LOB.COPY( o_lob, i_lob, l_lob - n - l_what + 1, n + l_with, n + l_what );
END IF;
ELSE
DBMS_LOB.APPEND( o_lob, i_lob );
END IF;
RETURN o_lob;
END;
/
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value clob)
/
CREATE TABLE replacements ( str VARCHAR2(4000), repl CLOB )
/
DECLARE
str VARCHAR2(4000) := 'value';
r CLOB;
c1l CLOB;
c1m CLOB;
c1r CLOB;
c2l CLOB;
c2m CLOB;
c2r CLOB;
c3l CLOB;
c3m CLOB;
c3r CLOB;
BEGIN
DBMS_LOB.CREATETEMPORARY( r, FALSE );
DBMS_LOB.CREATETEMPORARY( c1l, FALSE );
DBMS_LOB.CREATETEMPORARY( c1m, FALSE );
DBMS_LOB.CREATETEMPORARY( c1r, FALSE );
DBMS_LOB.CREATETEMPORARY( c2l, FALSE );
DBMS_LOB.CREATETEMPORARY( c2m, FALSE );
DBMS_LOB.CREATETEMPORARY( c2r, FALSE );
DBMS_LOB.CREATETEMPORARY( c3l, FALSE );
DBMS_LOB.CREATETEMPORARY( c3m, FALSE );
DBMS_LOB.CREATETEMPORARY( c3r, FALSE );
FOR i IN 1 .. 10 LOOP
DBMS_LOB.WRITEAPPEND( r, 4000, RPAD( 'y', 4000, 'y' ) );
DBMS_LOB.WRITEAPPEND( C1m, 20, RPAD( 'x', 20, 'x' ) );
DBMS_LOB.WRITEAPPEND( C1r, 40, RPAD( 'x', 40, 'x' ) );
DBMS_LOB.WRITEAPPEND( C2m, 200, RPAD( 'x', 200, 'x' ) );
DBMS_LOB.WRITEAPPEND( C2r, 400, RPAD( 'x', 400, 'x' ) );
DBMS_LOB.WRITEAPPEND( C3m, 2000, RPAD( 'x', 2000, 'x' ) );
DBMS_LOB.WRITEAPPEND( C3r, 4000, RPAD( 'x', 4000, 'x' ) );
END LOOP;
DBMS_LOB.WRITEAPPEND( c1l, 5, str );
DBMS_LOB.WRITEAPPEND( c1m, 5, str );
DBMS_LOB.WRITEAPPEND( c1r, 5, str );
DBMS_LOB.WRITEAPPEND( c2l, 5, str );
DBMS_LOB.WRITEAPPEND( c2m, 5, str );
DBMS_LOB.WRITEAPPEND( c2r, 5, str );
DBMS_LOB.WRITEAPPEND( c3l, 5, str );
DBMS_LOB.WRITEAPPEND( c3m, 5, str );
DBMS_LOB.WRITEAPPEND( c3r, 5, str );
FOR i IN 1 .. 10 LOOP
DBMS_LOB.WRITEAPPEND( C1l, 40, RPAD( 'x', 40, 'x' ) );
DBMS_LOB.WRITEAPPEND( C1m, 20, RPAD( 'x', 20, 'x' ) );
DBMS_LOB.WRITEAPPEND( C2l, 400, RPAD( 'x', 400, 'x' ) );
DBMS_LOB.WRITEAPPEND( C2m, 200, RPAD( 'x', 200, 'x' ) );
DBMS_LOB.WRITEAPPEND( C3l, 4000, RPAD( 'x', 4000, 'x' ) );
DBMS_LOB.WRITEAPPEND( C3m, 2000, RPAD( 'x', 2000, 'x' ) );
END LOOP;
INSERT INTO table_name VALUES ( NULL );
INSERT INTO table_name VALUES ( EMPTY_CLOB() );
INSERT INTO table_name VALUES ( '0123456789' );
INSERT INTO table_name VALUES ( str );
INSERT INTO table_name VALUES ( c1l );
INSERT INTO table_name VALUES ( c1m );
INSERT INTO table_name VALUES ( c1r );
INSERT INTO table_name VALUES ( c2l );
INSERT INTO table_name VALUES ( c2m );
INSERT INTO table_name VALUES ( c2r );
INSERT INTO table_name VALUES ( c3l );
INSERT INTO table_name VALUES ( c3m );
INSERT INTO table_name VALUES ( c3r );
INSERT INTO replacements VALUES ( str, r );
COMMIT;
END;
/
Query 1:
SELECT DBMS_LOB.GETLENGTH( value )
FROM table_name
Results:
| DBMS_LOB.GETLENGTH(VALUE) |
|---------------------------|
| (null) |
| 0 |
| 10 |
| 5 |
| 405 |
| 405 |
| 405 |
| 4005 |
| 4005 |
| 4005 |
| 40005 |
| 40005 |
| 40005 |
Query 2:
UPDATE table_name
SET value = LOB_REPLACE(
value,
( SELECT str FROM replacements ),
( SELECT repl FROM replacements )
)
Query 3:
SELECT DBMS_LOB.GETLENGTH( value )
FROM table_name
Results:
| DBMS_LOB.GETLENGTH(VALUE) |
|---------------------------|
| (null) |
| 0 |
| 10 |
| 40000 |
| 40400 |
| 40400 |
| 40400 |
| 44000 |
| 44000 |
| 44000 |
| 80000 |
| 80000 |
| 80000 |
This will do the job:
function CLOBREPLACE(
AINPUT CLOB,
APATTERN VARCHAR2,
ASUBSTITUTE CLOB
) return CLOB is
FCLOB CLOB := AINPUT;
FOFFSET INTEGER;
FCHUNK CLOB;
begin
if length(ASUBSTITUTE) > 32000 then
FOFFSET := 1;
FCLOB := replace(FCLOB, APATTERN, '###CLOBREPLACE###');
while FOFFSET <= length(ASUBSTITUTE) loop
FCHUNK := substr(ASUBSTITUTE, FOFFSET, 32000) || '###CLOBREPLACE###';
FCLOB := regexp_replace(FCLOB, '###CLOBREPLACE###', FCHUNK);
FOFFSET := FOFFSET + 32000;
end loop;
FCLOB := regexp_replace(FCLOB, '###CLOBREPLACE###', '');
else
FCLOB := replace(FCLOB, APATTERN, ASUBSTITUTE);
end if;
return FCLOB;
end;
Result of test case:
v2:32767
cl_small:32767
cl_big[1]:65535
cl_big[2]:98301
cl_big[3]:131069
The REPLACE function can be used with CLOBs.
Doc:
http://psoug.org/reference/translate_replace.html
Working sample:
declare
l_clob clob;
l_parname clob;
l_value clob;
l_par_id_obj varchar2(200) := '${id_obj}';
procedure setCLOBValue(p_CLOB in out CLOB, p_value string) as
begin
DBMS_LOB.createtemporary(p_CLOB, false);
dbms_lob.open(p_CLOB, dbms_lob.lob_readwrite);
dbms_lob.write(p_CLOB, length(p_value), 1, p_value);
dbms_lob.close(p_CLOB);
end;
begin
select SOURCE_FILE into l_clob from st_static_source_clob;
setCLOBValue(l_parname, l_par_id_obj);
setCLOBValue(l_value, '200701000024');
l_clob := replace(l_clob, l_parname, l_value);
end;
The function below solves the problem:
create or replace FUNCTION replace_clob
(
in_source IN CLOB,
in_search IN VARCHAR2,
in_replace IN CLOB
)
RETURN CLOB
IS
l_pos pls_integer;
out_replace_clob CLOB := in_source;
BEGIN
l_pos := instr(in_source, in_search);
IF l_pos > 0 THEN
WHILE l_pos > 0 LOOP
out_replace_clob := substr(out_replace_clob, 1, l_pos-1)
|| in_replace
|| substr(out_replace_clob, l_pos+LENGTH(in_search));
l_pos := instr(out_replace_clob, in_search);
END LOOP;
RETURN out_replace_clob;
END IF;
RETURN in_source;
END replace_clob;
/
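One caveat, as far as I can tell from reading the loop above: after each substitution it searches again from the start of the modified text, so if in_replace itself contains in_search the loop never terminates. A common way around this is to resume the search just past the text that was inserted. Here is a rough Python sketch of that variant (replace_all is a made-up name, and Python strings stand in for CLOBs):

```python
def replace_all(source: str, search: str, replacement: str) -> str:
    """Replace every occurrence of search, never rescanning inserted text."""
    if not search:
        return source
    parts = []
    start = 0
    while True:
        pos = source.find(search, start)
        if pos == -1:
            parts.append(source[start:])  # tail after the last match
            break
        parts.append(source[start:pos])   # untouched text before the match
        parts.append(replacement)
        start = pos + len(search)         # continue after the match in the source
    return "".join(parts)


# The replacement contains the search term, yet the loop still terminates
print(replace_all("a-y-b", "y", "xyx"))  # a-xyx-b
```

In PL/SQL the same effect could be had by passing a start offset to instr and advancing it by the length of the replacement after each substitution.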
The function below provides replace-all functionality for the CLOB datatype in Oracle PL/SQL:
CREATE FUNCTION lob_replaceall
(
i_lob IN clob,
i_what IN varchar2,
i_with IN clob,
i_offset IN INTEGER DEFAULT 1
) RETURN CLOB
AS
o_lob CLOB;
n PLS_INTEGER;
l_lob PLS_INTEGER;
l_what PLS_INTEGER;
l_with PLS_INTEGER;
l_temp_lob CLOB;
BEGIN
IF i_lob IS NULL
OR i_what IS NULL
OR i_offset < 1
OR i_offset > DBMS_LOB.LOBMAXSIZE
THEN
RETURN NULL;
END IF;
l_temp_lob:=i_lob;
LOOP
n := NVL( DBMS_LOB.INSTR( l_temp_lob, i_what, i_offset, 1 ), 0 );
l_lob := DBMS_LOB.GETLENGTH( l_temp_lob );
l_what := LENGTH( i_what );
l_with := NVL( DBMS_LOB.GETLENGTH( i_with ), 0 );
DBMS_LOB.CREATETEMPORARY( o_lob, FALSE );
IF n > 0 THEN
IF n > 1 THEN
DBMS_LOB.COPY( o_lob, l_temp_lob, n-1, 1, 1 );
END IF;
IF l_with > 0 THEN
DBMS_LOB.APPEND( o_lob, i_with );
END IF;
IF n + l_what <= l_lob THEN
DBMS_LOB.COPY( o_lob, l_temp_lob, l_lob - n - l_what + 1, n + l_with, n + l_what );
END IF;
ELSE
DBMS_LOB.APPEND( o_lob, l_temp_lob );
END IF;
EXIT WHEN n=0;
l_temp_lob:= o_lob;
o_lob := NULL;
END LOOP;
RETURN o_lob;
END;
/
I'm using a pdf package for oracle pl/sql called pl_fpdf to create pdfs on the fly (this is what I have to use at the moment). It works on one database, but doesn't work on the other. I believe I've narrowed down the issue to a difference in character set and the behavior of utl_raw.cast_to_varchar2 when trying to convert image binary to ascii (base64).
The working character set is WE8MSWIN1252, and the other is AL32UTF8 (seems to be much more common these days)
My question is, how do I make utl_raw.cast_to_varchar2 behave the same with AL32UTF8 as it does with WE8MSWIN1252 so that the resulting base64 image data is correct?
Here's the code where I think the issue is. If I'm completely wrong here, then please let me know.
procedure p_putstream(pData in out NOCOPY blob) is
offset integer := 1;
lv_content_length number := dbms_lob.getlength(pdata);
buf_size integer := 2000;
buf raw(2000);
begin
p_out('stream');
-- read the blob and put it in small pieces in a varchar
while offset < lv_content_length loop
dbms_lob.read(pData,buf_size,offset,buf);
p_out(utl_raw.cast_to_varchar2(buf), false);
offset := offset + buf_size;
end loop;
-- put a line feed (chr(10)) at the end of the blob
p_out(chr(10), false);
p_out('endstream');
exception
when others then
error('p_putstream : '||sqlerrm);
end p_putstream;
What is p_out? A wrapper around dbms_output.put_line?
Could this be a client character set issue? According to the utl_raw.cast_to_varchar2 documentation:
"When casting to a VARCHAR2, the current Globalization Support character set is used for the characters within that VARCHAR2."
E.g.
$ export NLS_LANG=AMERICAN_AMERICA.UTF8
$ sqlplus
SQL> select utl_raw.cast_to_varchar2('80') from dual;
UTL_RAW.CAST_TO_VARCHAR2('80')
--------------------------------------------------------------------------------
€
SQL>
But
$ unset NLS_LANG
$ sqlplus
SQL> select utl_raw.cast_to_varchar2('80') from dual;
UTL_RAW.CAST_TO_VARCHAR2('80')
--------------------------------------------------------------------------------
?
SQL>
The above was run against a database with this character set:
SQL> select * from nls_database_parameters where parameter like '%CHARACTERSET';
PARAMETER VALUE
------------------------------ ----------------------------------------
NLS_CHARACTERSET WE8MSWIN1252
NLS_NCHAR_CHARACTERSET AL16UTF16
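The reason the same byte behaves differently is that 0x80 is a complete character in a single-byte Windows-1252 character set but is not a valid character on its own in UTF-8. A small Python sketch shows this, under the assumption that Python's cp1252 and utf-8 codecs are close enough stand-ins for WE8MSWIN1252 and AL32UTF8 for this one byte:

```python
raw = bytes([0x80])

# Windows-1252 is single-byte: 0x80 maps directly to the euro sign (U+20AC)
decoded = raw.decode("cp1252")
print(hex(ord(decoded)))

# The same character needs three bytes in UTF-8
print("\u20ac".encode("utf-8").hex())

# On its own, 0x80 is a continuation byte and cannot be decoded as UTF-8
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError as exc:
    valid_utf8 = False
    print("not valid UTF-8:", exc.reason)
```

This is why reinterpreting arbitrary binary data (such as image bytes) as character data only round-trips cleanly when every byte value is a valid character, which holds for a single-byte character set but not for a multi-byte one.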
I solved my own issue. As it turns out, the codebase I am using to import DBF files bastardizes the VARCHAR2 datatype as a RAW. I probably should really rewrite it to build the DBF header using RAW operations. That being said, I just hacked it up some more. In particular, I used nchar_cs, a customized version of utl_raw.cast_to_varchar2 and a customized substr (to return nvarchar2)
select ascii(chr(194)) from dual;
select ascii(chr(194 using nchar_cs)) from dual;
select ascii(chr(193)) from dual;
These queries illustrate the root cause well (the first one returns 0 on my XE installation, but 194 on my 11g Enterprise setup). I'm really gun-shy about posting this code because it is such a hack job, but it does work now.
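For comparison, here is what reading the header with plain byte operations looks like, which sidesteps character sets entirely. This is a Python sketch, not part of the package; the byte offsets are taken from the get_header routine below, and the arithmetic in to_int treats the first byte as the least significant, which corresponds to "<" (little-endian) in struct. The sample header values are made up.

```python
import struct


def parse_dbf_header(header: bytes) -> dict:
    """Decode the first 12 bytes of a 32-byte DBF file header."""
    version, year, month, day, no_records, hdr_len, rec_len = struct.unpack(
        "<BBBBIHH", header[:12]
    )
    return {
        "version": version,
        "year": 1900 + year,       # stored as a single byte offset from 1900
        "month": month,
        "day": day,
        "no_records": no_records,  # 4-byte unsigned integer
        "hdr_len": hdr_len,        # 2-byte unsigned integer
        "rec_len": rec_len,        # 2-byte unsigned integer
        # 32-byte field descriptors follow the 32-byte file header
        "no_fields": (hdr_len - 32) // 32,
    }


# Hypothetical header: version 3, dated 2010-05-17, 1000 records, 2 fields
sample = struct.pack("<BBBBIHH", 3, 110, 5, 17, 1000, 32 + 2 * 32 + 1, 61) + bytes(20)
info = parse_dbf_header(sample)
print(info)
```

Because the bytes are never converted to characters, the result does not depend on the database or client character set.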
create or replace package dbase_fox as
-- procedure to a load a table with records
-- from a DBASE file.
--
-- Uses a BFILE to read binary data and dbms_sql
-- to dynamically insert into any table you
-- have insert on.
--
-- p_dir is the name of an ORACLE Directory Object
-- that was created via the CREATE DIRECTORY
-- command
--
-- p_file is the name of a file in that directory
-- will be the name of the DBASE file
--
-- p_tname is the name of the table to load from
--
-- p_cnames is an optional list of comma separated
-- column names. If not supplied, this pkg
-- assumes the column names in the DBASE file
-- are the same as the column names in the
-- table
--
-- p_show boolean that if TRUE will cause us to just
-- PRINT (and not insert) what we find in the
-- DBASE files (not the data, just the info
-- from the dbase headers....)
procedure load_Table( p_dir in varchar2,
p_file in varchar2,
p_tname in varchar2,
p_cnames in varchar2 default NULL,
p_show in BOOLEAN default FALSE,
p_rownum in BOOLEAN default FALSE);
end;
/
create or replace package body dbase_fox
as
-- Might have to change on your platform!!!
-- Controls the byte order of binary integers read in
-- from the dbase file
BIG_ENDIAN constant boolean default TRUE;
type dbf_header is RECORD
(
version varchar2(25), -- dBASE version number
year int, -- 1 byte int year, add to 1900
month int, -- 1 byte month
day int, -- 1 byte day
no_records int, -- number of records in file,
-- 4 byte int
hdr_len int, -- length of header, 2 byte int
rec_len int, -- number of bytes in record,
-- 2 byte int
no_fields int -- number of fields
);
type field_descriptor is RECORD
(
name varchar2(11),
type char(1),
length int, -- 1 byte length
decimals int -- 1 byte scale
);
type field_descriptor_array
is table of
field_descriptor index by binary_integer;
type rowArray
is table of
varchar2(4000) index by binary_integer;
g_cursor binary_integer default dbms_sql.open_cursor;
function mysubstr(d in varchar2, s in number, l in number) return nvarchar2 is
begin
return substr(d,s,l);
end;
-- Function to convert a binary unsigned integer
-- into a PLSQL number
function to_int( p_data in varchar2 ) return number
is
l_number number default 0;
l_bytes number default length(p_data);
begin
if (big_endian)
then
for i in 1 .. l_bytes loop
l_number := l_number +
ascii(mysubstr(p_data,i,1)) *
power(2,8*(i-1));
end loop;
else
for i in 1 .. l_bytes loop
l_number := l_number +
ascii(mysubstr(p_data,l_bytes-i+1,1)) *
power(2,8*(i-1));
end loop;
end if;
return l_number;
end;
procedure dump( p_data in varchar2 ) is
l_number number default 0;
l_bytes number default length(p_data);
byte_number number;
byte_string nvarchar2 (1);
begin
if( l_bytes > 0 ) then
dbms_output.put_line('toti=' || l_bytes);
for i in 1 .. l_bytes loop
byte_string := substr(p_data,l_bytes-i+1,1);
dbms_output.put_line('i=' || i || ' ref=' || (l_bytes-i+1) || ' val=' || ascii(byte_string));
end loop;
end if;
end;
function mycast( d in varchar2 ) return varchar2 is
--replaces utl_raw.cast_to_varchar2
t varchar2(2000) default '';
l number default length(d)/2;
function h(n in number) return number is
begin
if n > 47 and n < 58 then
return n - 48;
else
return n - 55;
end if;
end;
begin
if( l > 0 ) then
for i in 1 .. l loop
--t := t || substr(d,2*i-1,1) || substr(d,2*i,1);
--dbms_output.put_line('i=' || (2*i-1) || ' val=' || 16*(h(ascii(substr(d,2*i-1,1)))));
--dbms_output.put_line('i=' || (2*i) || ' val=' || (h(ascii(substr(d,2*i,1)))));
--dbms_output.put_line('ii=' || i || ' val=' || ((h(ascii(substr(d,2*i,1))))+16*(h(ascii(substr(d,2*i-1,1))))));
t := t || chr((h(ascii(substr(d,2*i,1))))+16*(h(ascii(substr(d,2*i-1,1)))) using nchar_cs);
end loop;
end if;
return t;
end;
-- Alex from Russia add this function
-- to convert a HexDecimal value
-- into a Decimal value
function Hex2Dec( p_data in varchar2 ) return number
is
l_number number default 0;
l_bytes number default length(p_data);
byte_number number;
byte_string nvarchar2 (1);
begin
if( l_bytes > 0 ) then
for i in 1 .. l_bytes loop
byte_string := substr(p_data,l_bytes-i+1,1);
case byte_string
when 'A' then byte_number:=10;
when 'B' then byte_number:=11;
when 'C' then byte_number:=12;
when 'D' then byte_number:=13;
when 'E' then byte_number:=14;
when 'F' then byte_number:=15;
else byte_number:=to_number(byte_string);
end case;
l_number := l_number + byte_number * power(16,(i-1));
end loop;
return l_number;
else
return 0;
end if;
end;
--Mattia from Italy add this function
function mytrim(p_str in varchar2) return varchar2 is
i number;
j number;
v_res varchar2(100);
begin
for i in 1 .. 11 loop
if ascii(mysubstr(p_str,i,1)) = 0 then
j:= i;
exit;
end if;
end loop;
v_res := mysubstr(p_str,1,j-1);
return v_res;
end mytrim;
-- Routine to parse the DBASE header record, can get
-- all of the details of the contents of a dbase file from
-- this header
procedure get_header
(p_bfile in bfile,
p_bfile_offset in out NUMBER,
p_hdr in out dbf_header,
p_flds in out field_descriptor_array )
is
l_data varchar2(100);
l_hdr_size number default 32;
l_field_desc_size number default 32;
l_flds field_descriptor_array;
begin
p_flds := l_flds;
l_data := mycast(
dbms_lob.substr( p_bfile,
l_hdr_size,
p_bfile_offset ) );
--dump(l_data);
p_bfile_offset := p_bfile_offset + l_hdr_size;
p_hdr.version := ascii( mysubstr( l_data, 1, 1 ) );
p_hdr.year := 1900 + ascii( mysubstr( l_data, 2, 1 ) );
p_hdr.month := ascii( mysubstr( l_data, 3, 1 ) );
p_hdr.day := ascii( mysubstr( l_data, 4, 1 ) );
p_hdr.no_records := to_int( mysubstr( l_data, 5, 4 ) );
--dbms_output.put_line('hdr_len:' || ascii(mysubstr(l_data,9,1)) || ',' || ascii(mysubstr(l_data,10,1)));
p_hdr.hdr_len := to_int( mysubstr( l_data, 9, 2 ) );
p_hdr.rec_len := to_int( mysubstr( l_data, 11, 2 ) );
p_hdr.no_fields := trunc( (p_hdr.hdr_len - l_hdr_size)/
l_field_desc_size );
for i in 1 .. p_hdr.no_fields
loop
l_data := mycast(
dbms_lob.substr( p_bfile,
l_field_desc_size,
p_bfile_offset ));
p_bfile_offset := p_bfile_offset + l_field_desc_size;
p_flds(i).name := mytrim(mysubstr(l_data,1,11));
p_flds(i).type := mysubstr( l_data, 12, 1 );
p_flds(i).length := ascii( mysubstr( l_data, 17, 1 ) );
p_flds(i).decimals := ascii(mysubstr(l_data,18,1) );
end loop;
p_bfile_offset := p_bfile_offset +
mod( p_hdr.hdr_len - l_hdr_size,
l_field_desc_size );
end;
function build_insert
( p_tname in varchar2,
p_cnames in varchar2,
p_flds in field_descriptor_array,
p_rownum in BOOLEAN) return varchar2
is
l_insert_statement long;
begin
l_insert_statement := 'insert into ' || p_tname || '(';
if ( p_cnames is NOT NULL )
then
l_insert_statement := l_insert_statement ||
p_cnames || ') values (';
else
for i in 1 .. p_flds.count
loop
if ( i <> 1 )
then
l_insert_statement := l_insert_statement||',';
end if;
l_insert_statement := l_insert_statement ||
'"'|| p_flds(i).name || '"';
end loop;
--add rownum functionality
if ( p_rownum )
then
l_insert_statement := l_insert_statement ||
',"ROWNUM"';
end if;
l_insert_statement := l_insert_statement ||
') values (';
end if;
for i in 1 .. p_flds.count
loop
if ( i <> 1 )
then
l_insert_statement := l_insert_statement || ',';
end if;
if ( p_flds(i).type = 'D' )
then
l_insert_statement := l_insert_statement ||
'to_date(:bv' || i || ',''yyyymmdd'' )';
else
l_insert_statement := l_insert_statement ||
':bv' || i;
end if;
end loop;
--add rownum functionality
if ( p_rownum )
then
l_insert_statement := l_insert_statement ||
',:bv' || (p_flds.count + 1);
end if;
l_insert_statement := l_insert_statement || ')';
return l_insert_statement;
end;
function get_row
( p_bfile in bfile,
p_bfile_offset in out number,
p_hdr in dbf_header,
p_flds in field_descriptor_array,
f_bfile in bfile,
memo_block in number ) return rowArray
is
l_data varchar2(4000);
l_row rowArray;
l_n number default 2;
f_block number;
begin
l_data := mycast(
dbms_lob.substr( p_bfile,
p_hdr.rec_len,
p_bfile_offset ) );
p_bfile_offset := p_bfile_offset + p_hdr.rec_len;
l_row(0) := mysubstr( l_data, 1, 1 );
for i in 1 .. p_hdr.no_fields loop
l_row(i) := rtrim(ltrim(mysubstr( l_data,
l_n,
p_flds(i).length ) ));
if ( p_flds(i).type = 'F' and l_row(i) = '.' )
then
l_row(i) := NULL;
-- Memo fields: the .dbf record holds only a block number; the text lives in the .fpt file
elsif ( p_flds(i).type = 'M' ) then
-- Check whether the memo file is open
if( dbms_lob.isopen( f_bfile ) != 0) then
-- f_block: length of the memo text, stored in bytes 5-8 of the memo block,
-- where l_row(i) is the number of the memo block in the .fpt file
f_block := Hex2Dec(dbms_lob.substr( f_bfile, 4, to_number(l_row(i))*memo_block+5 ));
-- The memo text itself starts at byte 9 of the block, right after the length
l_row(i) := mycast(dbms_lob.substr( f_bfile, f_block, to_number(l_row(i))*memo_block+9));
else
-- No memo file: store NULL and continue, so that the remaining fields
-- of the row are still populated (an EXIT here would leave them unset)
dbms_output.put_line('.fpt file not found');
l_row(i) := NULL;
end if;
end if;
l_n := l_n + p_flds(i).length;
end loop;
return l_row;
end get_row;
procedure show( p_hdr in dbf_header,
p_flds in field_descriptor_array,
p_tname in varchar2,
p_cnames in varchar2,
p_bfile in bfile,
p_rownum in BOOLEAN )
is
l_sep varchar2(1) default ',';
procedure p(p_str in varchar2)
is
l_str varchar2(32767) default p_str;
begin
while( l_str is not null )
loop
dbms_output.put_line( substr(l_str,1,250) );
l_str := substr( l_str, 251 );
end loop;
end;
begin
p( 'Sizeof DBASE File: ' || dbms_lob.getlength(p_bfile) );
p( 'DBASE Header Information: ' );
p( chr(9)||'Version = ' || p_hdr.version );
p( chr(9)||'Year = ' || p_hdr.year );
p( chr(9)||'Month = ' || p_hdr.month );
p( chr(9)||'Day = ' || p_hdr.day );
p( chr(9)||'#Recs = ' || p_hdr.no_records);
p( chr(9)||'Hdr Len = ' || p_hdr.hdr_len );
p( chr(9)||'Rec Len = ' || p_hdr.rec_len );
p( chr(9)||'#Fields = ' || p_hdr.no_fields );
if p_hdr.no_fields > 100 then
return;
end if;
p( chr(10)||'Data Fields:' );
for i in 1 .. p_hdr.no_fields
loop
p( 'Field(' || i || ') '
|| 'Name = "' || p_flds(i).name || '", '
|| 'Type = ' || p_flds(i).Type || ', '
|| 'Len = ' || p_flds(i).length || ', '
|| 'Scale= ' || p_flds(i).decimals );
end loop;
p( chr(10) || 'Insert we would use:' );
p( build_insert( p_tname, p_cnames, p_flds, p_rownum ) );
p( chr(10) || 'Table that could be created to hold data:');
p( 'create table ' || p_tname );
p( '(' );
for i in 1 .. p_hdr.no_fields
loop
-- Close the column list on the last field, unless the "ROWNUM" column follows it
if ( i = p_hdr.no_fields and not p_rownum ) then l_sep := ')'; end if;
dbms_output.put
( chr(9) || '"' || p_flds(i).name || '" ');
if ( p_flds(i).type = 'D' ) then
p( 'date' || l_sep );
elsif ( p_flds(i).type = 'F' ) then
p( 'float' || l_sep );
elsif ( p_flds(i).type = 'N' ) then
if ( p_flds(i).decimals > 0 )
then
p( 'number('||p_flds(i).length||','||
p_flds(i).decimals || ')' ||
l_sep );
else
p( 'number('||p_flds(i).length||')'||l_sep );
end if;
elsif ( p_flds(i).type = 'M' ) then
p( 'clob' || l_sep);
else
p( 'varchar2(' || p_flds(i).length || ')'||l_sep);
end if;
end loop;
--add rownum functionality
if ( p_rownum )
then
p( chr(9) || '"ROWNUM" number)' );
end if;
p( '/' );
end;
procedure load_Table( p_dir in varchar2,
p_file in varchar2,
p_tname in varchar2,
p_cnames in varchar2 default NULL,
p_show in BOOLEAN default FALSE,
p_rownum in BOOLEAN default FALSE )
is
l_bfile bfile;
f_bfile bfile;
l_offset number default 1;
l_hdr dbf_header;
l_flds field_descriptor_array;
l_row rowArray;
f_file varchar2(255); -- wide enough for long file names
memo_block number;
l_cnt int default 0;
begin
-- Derive the memo file name by replacing the last 4 characters (e.g. '.dbf') with '.fpt'
f_file := substr(p_file,1,length(p_file)-4) || '.fpt';
l_bfile := bfilename( p_dir, p_file );
dbms_lob.fileopen( l_bfile );
----------------------- Added by Alex from Russia: open the memo (.fpt) file if it exists
f_bfile := bfilename( p_dir, f_file );
if( dbms_lob.fileexists(f_bfile) != 0 ) then
dbms_output.put_line(f_file || ' - Open memo file');
dbms_lob.fileopen( f_bfile );
end if;
--------------------------------------------------
get_header( l_bfile, l_offset, l_hdr, l_flds );
if ( p_show )
then
show( l_hdr, l_flds, p_tname, p_cnames, l_bfile, p_rownum );
else
dbms_sql.parse( g_cursor,
build_insert( p_tname, p_cnames, l_flds, p_rownum ),
dbms_sql.native );
-- Memo block size, read from bytes 7-8 of the .fpt file header
if ( dbms_lob.isopen( f_bfile ) > 0 ) then
memo_block := Hex2Dec(dbms_lob.substr(f_bfile, 2, 7));
else
memo_block := 0;
end if;
for i in 1 .. l_hdr.no_records loop
l_row := get_row( l_bfile,
l_offset,
l_hdr,
l_flds, f_bfile, memo_block );
if ( l_row(0) <> '*' ) -- deleted record
then
for j in 1..l_hdr.no_fields loop -- j avoids shadowing the outer record counter i
dbms_sql.bind_variable( g_cursor,
':bv'||j,
l_row(j),
4000 );
end loop;
--add rownum functionality
if ( p_rownum )
then
l_cnt := l_cnt + 1;
dbms_sql.bind_variable( g_cursor,
':bv'||(l_hdr.no_fields+1),
l_cnt,
4000 );
end if;
if ( dbms_sql.execute( g_cursor ) <> 1 )
then
raise_application_error( -20001,
'Insert failed ' || sqlerrm );
end if;
end if;
end loop;
end if;
dbms_lob.fileclose( l_bfile );
if ( dbms_lob.isopen( f_bfile ) > 0 ) then
dbms_lob.fileclose( f_bfile );
end if;
--exception
-- when others then
-- if ( dbms_lob.isopen( l_bfile ) > 0 ) then
-- dbms_lob.fileclose( l_bfile );
-- end if;
-- if ( dbms_lob.isopen( f_bfile ) > 0 ) then
-- dbms_lob.fileclose( f_bfile );
-- end if;
-- RAISE;
end;
end;
/