I want to insert the same long string into all cells in a certain column, which is of CLOB type.
I was told I should use "bind variables" to do it, so I googled and came up with this:
variable xmlstuff CLOB;
exec :xmlstuff := '<?xml version="1.0"?> ... really long xml...';
UPDATE TABLE_NAME SET COLUMN_NAME = '&&xmlstuff';
Now it still says:
The string literal is longer than 4000 characters.
What is the proper use of a bind variable in this case?
If you are programming in C# or Java, just use the OracleCLOB object; it will take care of all the necessary steps.
If you want to use a CLOB in SQL or PL/SQL,
you need to allocate it and release it after use; search for DBMS_LOB for the details.
Regarding the 4000-byte limit: that is a VARCHAR2 limit within SQL.
To bypass it you can use PL/SQL, which allows a VARCHAR2 of up to 32KB. That is nowhere near the 4GB you can hold in a CLOB, but it is the limit for "automatic" creation of a CLOB.
If your string is longer than 32K, you'll have to use DBMS_LOB to load the data into the CLOB, by appending to the CLOB object chunk by chunk.
This is the quickest link I found on how to do it: http://geekswithblogs.net/robertphyatt/archive/2010/03/24/write-read-and-update-oracle-clobs-with-plsql.aspx
I wanted to answer quickly, so please let me know if you cannot solve your issue with this information and I'll try to explain it better.
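A minimal sketch of that DBMS_LOB approach (TABLE_NAME/COLUMN_NAME are the question's placeholders and the XML chunks are dummies):
DECLARE
  l_clob  CLOB;
  l_chunk VARCHAR2(32767);
BEGIN
  -- Build the value in a temporary CLOB, appending at most 32K at a time
  DBMS_LOB.CREATETEMPORARY(l_clob, TRUE);
  l_chunk := '<?xml version="1.0"?> ... first part of the really long xml ...';
  DBMS_LOB.WRITEAPPEND(l_clob, LENGTH(l_chunk), l_chunk);
  l_chunk := '... rest of the really long xml ...';
  DBMS_LOB.WRITEAPPEND(l_clob, LENGTH(l_chunk), l_chunk);

  -- The PL/SQL variable acts as the bind variable in the UPDATE
  UPDATE table_name SET column_name = l_clob;
  COMMIT;

  DBMS_LOB.FREETEMPORARY(l_clob);
END;
/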
We use Oracle 10g and Oracle 11g.
We also have a layer to automatically compose queries, from pseudo-SQL code written in .net (something like SqlAlchemy for Python).
Our layer currently wraps any string in single quotes ' and, if it contains non-ANSI characters, it automatically composes a UNISTR call with the special characters written as Unicode escapes (like \00E0).
Now we created a method for doing multiple inserts with the following construct:
INSERT INTO ... (...)
SELECT ... FROM DUAL
UNION ALL SELECT ... FROM DUAL
...
This algorithm could compose queries where the same string field is sometimes passed as 'my simple string' and sometimes wrapped as UNISTR('my string with special chars like \00E0').
The described condition causes an ORA-12704: character set mismatch.
One solution is to use the INSERT ALL construct but it is very slow compared to the one used now.
Another solution is to instruct our layer to put N in front of any string (except for the ones already wrapped with UNISTR). This is simple.
I just want to know if this could cause any side-effect on existing queries.
Note: all our fields on DB are either NCHAR or NVARCHAR2.
Oracle ref: http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch7progrunicode.htm
Basically what you are asking is: is there a difference in how a string is stored with or without the N function?
You can just check for yourself. Consider:
SQL> create table test (val nvarchar2(20));
Table TEST created.
SQL> insert into test select n'test' from dual;
1 row inserted.
SQL> insert into test select 'test' from dual;
1 row inserted.
SQL> select dump(val) from test;
DUMP(VAL)
--------------------------------------------------------------------------------
Typ=1 Len=8: 0,116,0,101,0,115,0,116
Typ=1 Len=8: 0,116,0,101,0,115,0,116
As you can see, they are identical, so there is no side effect.
The reason this works so beautifully is the elegance of Unicode.
If you are interested, here is a nice video explaining it:
https://www.youtube.com/watch?v=MijmeoH9LT4
I assume that you get the error "ORA-12704: character set mismatch" because the data inside plain quotes is treated as CHAR while your fields are NCHAR, so the two are handled with different character sets: one uses NLS_CHARACTERSET, the other NLS_NCHAR_CHARACTERSET.
When you use the UNISTR function, it converts data from CHAR to NCHAR (and it also decodes encoded values into characters), as the Oracle docs say:
"UNISTR takes as its argument a text literal or an expression that
resolves to character data and returns it in the national character
set."
When you convert values explicitly using N or TO_NCHAR you only get values in NLS_NCHAR_CHARACTERSET, without decoding. If you have values encoded like "\00E0" they will not be decoded and will be passed through unchanged.
So if you have an insert such as:
insert into ... select N'my string with special chars like \00E0',
UNISTR('my string with special chars like \00E0') from dual ...
the data in the first inserted field will be 'my string with special chars like \00E0', not 'my string with special chars like à'. This is the only side effect I'm aware of. Other queries should already use the NLS_NCHAR_CHARACTERSET encoding, so there shouldn't be any problem with an explicit conversion.
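You can see the difference for yourself with a quick query (a minimal illustration, runnable against any schema):
-- The N literal keeps the escape sequence as literal text,
-- while UNISTR decodes it into the single character à:
select N'special char: \00E0'        as n_version,
       unistr('special char: \00E0') as unistr_version
  from dual;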
And by the way, why not just insert all values as N'my string with special chars like à'? Just encode them into UTF-16 (I assume you use UTF-16 for NCHAR) first if your 'upper level' software uses a different encoding.
On the use of the N function: you already have answers above.
If you have any chance to change the character set of the database, that would really make your life easier. I have worked on huge production systems, and the trend is that, because storage space is cheap, everyone simply moves to AL32UTF8 and the hassle of internationalization slowly becomes a painful memory of the past.
I found the easiest thing is to use AL32UTF8 as the character set of the database instance, and simply use VARCHAR2 everywhere. We read and write standard Java Unicode strings via JDBC as bind variables without any harm or fiddling.
Your idea to construct a huge text of SQL inserts may not scale well for multiple reasons:
there is a fixed maximum length for a SQL statement, so it won't work with 10000 inserts
it is advised to use bind variables (and then you don't have the n'xxx' vs unistr mess either; see the sketch after this list)
creating a new SQL statement dynamically each time is very resource-unfriendly. It does not allow Oracle to cache any execution plan, and makes Oracle hard parse your very long statement on each call.
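For comparison, the bind-variable form of a single insert looks like the sketch below; my_table and its columns are placeholders, and the statement would be prepared once and executed repeatedly with different values (e.g. via JDBC batching):
INSERT INTO my_table (id, name, description)
VALUES (:id, :name, :description);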
What you're trying to achieve is a mass insert. Use the JDBC batch mode of the Oracle driver to perform that at light-speed, see e.g.: http://viralpatel.net/blogs/batch-insert-in-java-jdbc/
Note that insert speed is also affected by triggers (which have to be executed) and foreign key constraints (which have to be validated). So if you're about to insert more than a few thousand rows, consider disabling the triggers and foreign key constraints, and re-enable them after the insert. (You'll lose the trigger calls, but the constraint validation after the insert can make an impact.)
Also consider the rollback segment size. If you're inserting a million records, that will need a huge rollback segment, which will likely cause serious swapping on the storage media. A good rule of thumb is to commit after every 1000 records.
(Oracle uses versioning instead of shared locks, so a table with uncommitted changes is still consistently available for reading. A commit every 1000 records means roughly 1 commit per second - slow enough to benefit from write buffers, but frequent enough not to interfere with other people who want to update the same table.)
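If the load were driven from PL/SQL instead of JDBC, that commit rhythm could look roughly like this (staging_table/target_table and the columns are hypothetical):
DECLARE
  l_count PLS_INTEGER := 0;
BEGIN
  FOR rec IN (SELECT id, name FROM staging_table) LOOP
    INSERT INTO target_table (id, name) VALUES (rec.id, rec.name);
    l_count := l_count + 1;
    IF MOD(l_count, 1000) = 0 THEN
      COMMIT;  -- keep the undo/rollback footprint small
    END IF;
  END LOOP;
  COMMIT;      -- commit the final partial batch
END;
/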
Good day to everyone.
I ran into a huge problem at work last week.
Here is the deal:
I need to download an Excel file (a BLOB) from an Oracle database through SAS.
I am using:
First step: I need to get the data from Oracle. I used this construction (the BLOB file is nearly 100KB):
proc sql;
connect to oracle;
create table SASTBL as
select * from connection to oracle (
select dbms_lob.substr(myblobfield,1,32767) as blob_1,
dbms_lob.substr(myblobfield,32768,32767) as blob_2,
dbms_lob.substr(myblobfield,65535,32767) as blob_3,
dbms_lob.substr(myblobfield,97302,32767) as blob_4
from my_tbl;
);
quit;
And the result is:
blob_1 = 70020202020202...02
blob_2 = 02020202020...02
blob_3 = 02020202...02
I do not understand why the field consists of "02" (for the whole file).
And the length of each variable in SAS is 1024 (instead of 32767), with the $HEX2024 format.
If I take:
dbms_lob.substr(my_blob_field,2000,900) from the same object, the result is much closer to the truth:
blob = "A234ABC4536AE7...."
The question is: how can I get the binary data from the BLOB field correctly through SAS? What is my mistake?
Thank you.
EDIT 1:
I got the information, but the maximum string length is 2,000.
Use the DBMAX_TEXT option on the CONNECT statement (or a LIBNAME statement) to get up to 32,767 characters. The default is probably 1024.
PROC SQL uses SQL to interact with SAS datasets (create tables, query tables, aggregate data, connect externally, etc.). The procedure mostly follows the ANSI standard with a few SAS-specific extensions. Each RDBMS extends ANSI in its own way, including Oracle with its XML handling, such as saving content in a BLOB column. Possibly, SAS cannot properly read the Oracle-specific (non-ANSI) binary large object type; typically SAS processes string, numeric, datetime, and a few other types.
As an alternative, consider saving the XML content from Oracle externally as an .xml file and using SAS's XML engine to read the content into a SAS dataset:
** STORING XML CONTENT;
libname tempdata xml 'C:\Path\To\XML\File.xml';
** APPEND CONTENT TO SAS DATASET;
data Work.XMLData;
set tempdata.NodeName; /* CHANGE TO REPEAT PARENT NODE OF XML. */
run;
Adding this as another answer as I can't comment yet... the issue you experienced is that the return type of dbms_lob.substr is actually a VARCHAR2, so SAS limits it to 2,000. To avoid this, you could wrap it in to_clob( ... ) AND set the DBMAX_TEXT option as previously answered.
Another alternative is below...
The code below is an effective method for retrieving a single record with a large CLOB. Instead of calculating how many fields to split the CLOB into, which results in a very wide record, it splits it into multiple rows. See the expected output at the bottom.
Disclaimer: although effective, it may not be efficient, i.e. it may not scale well to multiple rows; the generally accepted approach in that case is pipelined PL/SQL. That being said, the approach below got me out of a pinch when you can't create a procedure...
PROC SQL;
connect to oracle (authdomain=YOUR_Auth path=devdb DBMAX_TEXT=32767 );
create table clob_chunks (compress=yes) as
select *
from connection to Oracle (
SELECT id
, key
, level clob_order
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') clob_chunk
FROM (
SELECT id, key, clob_value
FROM schema.table
WHERE id = 123
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
)
order by id, key, clob_order;
disconnect from oracle;
QUIT;
Expected output:
ID KEY CLOB_ORDER CLOB_CHUNK
1 1 1 short_clob
2 2 1 long clob chunk1of3
2 2 2 long clob chunk2of3
2 2 3 long clob chunk3of3
3 3 1 another_short_one
Explanation:
DBMAX_TEXT tells SAS to raise the default of 1024 characters for a CLOB field.
The regex .{1,32767} tells Oracle to match any character at least once but no more than 32767 times. This splits the input and also captures the last chunk, which is likely to be under 32767 in length.
The regexp_substr pulls a chunk from the CLOB (first argument), starting from the start of the CLOB (the position argument, 1), skipping to the LEVEL'th occurrence (the occurrence argument) and treating the CLOB as one large string (the 'n' match parameter).
The CONNECT BY re-runs the regex to count the chunks, so that LEVEL stops incrementing beyond the end of the CLOB.
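A quick way to see the chunking behaviour in isolation (a sketch with the chunk size reduced to 10 for readability):
SELECT LEVEL AS chunk_no,
       REGEXP_SUBSTR('abcdefghijklmnopqrstuvwxyz', '.{1,10}', 1, LEVEL, 'n') AS chunk
  FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT('abcdefghijklmnopqrstuvwxyz', '.{1,10}', 1, 'n');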
References:
SAS KB article for DBMAX_TEXT
Oracle docs for REGEXP_COUNT
Oracle docs for REGEXP_SUBSTR
Oracle regex syntax
Stackoverflow example of regex splitting
Or should I?
(The title is inspired by Gary Myers' comment in Why does Oracle varchar2 have a mandatory size as a definition parameter?)
Consider the following variables:
declare
-- database table column interfacing variable
v_a tablex.a%type; -- tablex.a is varchar2
-- PL/SQL only variable
v_b varchar2(32767); -- is this a poor convention ?
begin
select a into v_a from tablex where id = 1;
v_b := 'Some arbitrary string: ' || v_a; -- ignore potential ORA-06502
insert into tabley(id, a) values(1, v_a); -- tablex.a and tabley.a types match
v_b := v_b || ' More arbitrary characters';
end;
/
Variable v_a is used to interface with a database table column and therefore uses the %type attribute. But if I know the data type is varchar2, why shouldn't I use varchar2(4000) or varchar2(32767), which also guarantee that the string read from the database column will always fit into the PL/SQL variable? Is there any argument against this convention other than the superiority of the %type attribute?
Variable v_b is only used in PL/SQL code and is usually returned to a JDBC client (a Java/Python program, Oracle SOA/OSB, etc.) or dumped into a flat file (with UTL_FILE). If the varchar2 represents e.g. a CSV line, why should I bother to calculate the exact maximum possible length (except to verify that the line will fit into 32767 bytes in all cases so I don't need a CLOB) and re-calculate it every time my data model changes?
There are plenty of questions that cover varchar2 length semantics in SQL and explain why varchar2(4000) is a poor practice in SQL. The difference between the SQL and PL/SQL varchar2 types is also well covered:
What is the size limit for a varchar2 PL/SQL subprogram argument in Oracle?
VARCHAR(MAX) versus VARCHAR(n) in Oracle
Why does Oracle varchar2 have a mandatory size as a definition parameter?
Why does VARCHAR need length specification?
What is the default size of a varchar2 input to Oracle stored procedure, and can it be changed?
Why using anything else but VARCHAR2(4000) to store strings in an Oracle database?
Why does an oracle plsql varchar2 variable need a size but a parameter does not?
Ask Tom question: "I work with a modelers group, who would like to define every varchar2 field with the maximum length."
The only place where I have seen this issue discussed is points #3 and #4 in an answer by APC:
The database uses the length of a variable when allocating memory for PL/SQL collections. As that memory comes out of the PGA supersizing the variable declaration can lead to programs failing because the server has run out of memory.
There are similar issues with the declaration of single variables in PL/SQL programs, it is just that collections tend to multiply the problem.
E.g. Oracle PL/SQL Programming, 5th Edition by Steven Feuerstein doesn't mention any drawbacks of declaring overly long varchar2 variables, so it can't be a critical mistake, right?
Update
After some more googling I found out that the Oracle documentation has evolved across releases:
A quote from PL/SQL User's Guide and Reference 10g Release 2 Chapter 3 PL/SQL Datatypes:
Small VARCHAR2 variables are optimized for performance, and larger ones are optimized for efficient memory use. The cutoff point is 2000 bytes. For a VARCHAR2 that is 2000 bytes or longer, PL/SQL dynamically allocates only enough memory to hold the actual value. For a VARCHAR2 variable that is shorter than 2000 bytes, PL/SQL preallocates the full declared length of the variable. For example, if you assign the same 500-byte value to a VARCHAR2(2000 BYTE) variable and to a VARCHAR2(1999 BYTE) variable, the former takes up 500 bytes and the latter takes up 1999 bytes.
A quote from PL/SQL User's Guide and Reference 11g Release 1 Chapter 3 PL/SQL Datatypes:
For a CHAR variable, or for a VARCHAR2 variable whose maximum size is less than 2,000 bytes, PL/SQL allocates enough memory for the maximum size at compile time. For a VARCHAR2 whose maximum size is 2,000 bytes or more, PL/SQL allocates enough memory to store the actual value at run time. In this way, PL/SQL optimizes smaller VARCHAR2 variables for performance and larger ones for efficient memory use.
For example, if you assign the same 500-byte value to VARCHAR2(1999 BYTE) and VARCHAR2(2000 BYTE) variables, PL/SQL allocates 1999 bytes for the former variable at compile time and 500 bytes for the latter variable at run time.
But PL/SQL User's Guide and Reference 11g Release 2 Chapter 3 PL/SQL Datatypes doesn't mention memory allocation any more, and I fail to find any other information about memory allocation at all. (I'm using this release, so I checked only the 11.2 documentation.) The same also holds for PL/SQL User's Guide and Reference 12c Release 1 Chapter 3 PL/SQL Datatypes.
I also found an answer by Jeffrey Kemp that addresses this question too. However, Jeffrey's answer refers to the 10.2 documentation and that question is not about PL/SQL at all.
It looks like this is one of the areas where the PL/SQL functionality has evolved over releases when Oracle has implemented different optimizations.
Note this also means that some of the answers listed in the OP are release-specific, even though that is not explicitly mentioned in those questions/answers. As time passes and the use of older Oracle releases ends (am I daydreaming?), that information will become outdated (though it might take decades).
The conclusion above is backed by the following quote from chapter 12 Tuning PL/SQL Applications for Performance of the PL/SQL Language Reference 11g R1:
Declare VARCHAR2 Variables of 4000 or More Characters
You might need to allocate large VARCHAR2 variables when you are not sure how big an expression result will be. You can conserve memory by declaring VARCHAR2 variables with large sizes, such as 32000, rather than estimating just a little on the high side, such as by specifying 256 or 1000. PL/SQL has an optimization that makes it easy to avoid overflow problems and still conserve memory. Specify a size of more than 4000 characters for the VARCHAR2 variable; PL/SQL waits until you assign the variable, then only allocates as much storage as needed.
This issue is no longer mentioned in the 11g R2 or 12c R1 versions of the document. This is in line with the evolution of chapter 3 PL/SQL Datatypes.
Answer:
Since 11gR2 it makes no difference, from a memory-use point of view, whether you use varchar2(10) or varchar2(32767). The Oracle PL/SQL compiler takes care of the dirty details for you in an optimal fashion!
For releases prior to 11gR2 there is a cutoff point where different memory management strategies are used, and this is clearly documented in each release's PL/SQL Language Reference.
The above only applies to PL/SQL-only variables when there is no natural length restriction that can be derived from the problem domain. If a varchar2 variable represents a GTIN-14, then one should declare it as varchar2(14).
When a PL/SQL variable interfaces with a table column, use the %type attribute, as that is the zero-effort way to keep your PL/SQL code and database structure in sync.
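For example, a declaration section following these guidelines might look like this (tablex.a is the column from the question; the other names are hypothetical):
declare
  v_gtin   varchar2(14);     -- natural maximum length from the problem domain
  v_col    tablex.a%type;    -- interfaces a table column, so stay in sync via %type
  v_buffer varchar2(32767);  -- PL/SQL-only scratch string with no natural limit
begin
  null;
end;
/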
Memory test results:
I ran a memory analysis in Oracle Database 11g Enterprise Edition Release 11.2.0.3.0, with the following results:
str_size iterations UGA PGA
-------- ---------- ----- ------
10 100 65488 0
10 1000 65488 65536
10 10000 65488 655360
32767 100 65488 0
32767 1000 65488 65536
32767 10000 65488 655360
Because the PGA changes are identical and depend only on iterations, not on str_size, I conclude that the declared varchar2 size doesn't matter. The test might be too naïve though - comments welcome!
The test script:
-- plsql_memory is a convenience package wrapping the sys.v_$mystat and
-- sys.v_$statname views, written by Steven Feuerstein and available in the
-- code-zip file accompanying his book.
set verify off
define str_size=&1
define iterations=&2
declare
type str_list_t is table of varchar2(&str_size);
begin
plsql_memory.start_analysis;
declare
v_strs str_list_t := str_list_t();
begin
for i in 1 .. &iterations
loop
v_strs.extend;
v_strs(i) := rpad(to_char(i), 10, to_char(i));
end loop;
plsql_memory.show_memory_usage;
end;
end;
/
exit
Test run example:
$ sqlplus -SL <CONNECT_STR> @memory-test.sql 32767 10000
Change in UGA memory: 65488 (Current = 1927304)
Change in PGA memory: 655360 (Current = 3572704)
PL/SQL procedure successfully completed.
$
Maybe it's because I came of age in an era when the most memory any system had was 48K and then, happy days, up to a full 64K. And it was not virtual: what you allocated is what you got (wyaiwyg, "WAY-wig"?). I see in a lot of younger programmers a tendency to be lazy in their design and to hide design flaws by throwing more memory at the problem.
If we get into the habit of just typing varchar2(MAX) whenever we define a string variable, we stop thinking about the length. But sometimes length matters. If we haven't already done so, as soon as we type the ( then we should stop and put some thought into how big it really needs to be. Does the size matter here? If so, what is a reasonable maximum (or minimum)? This forces us to look beyond the bytes and fields and indexes to the actual "thing" we are trying to work with. That is never a bad idea.
Discipline is hard. We should be developing habits to enforce it whenever we can. Good data and code design is difficult by nature. There are tools and techniques we can use to make it a little easier, but we shouldn't be doing anything just because it is easier. That's the path to the Dark Side and it catches up to us sooner or later.
I think the problem is the same as in any other piece of software: allocating too much memory decreases performance and increases the likelihood of failure because all the available memory has been consumed.
In my opinion, use %type as much as possible, since it prevents mistakes when the data type or length changes and makes it clear where the variable originates from.
I have a LONG RAW column in an Oracle table. INSERT with SELECT is not working because the LONG RAW column is part of my SELECT statement as well. Basically, I am trying to insert a history row with a couple of parameters changed, hence I was thinking of using PL/SQL in Oracle. I have no experience in PL/SQL, and I haven't found anything after googling for a couple of days. Can anyone help me with a sample PL/SQL block for my problem? Thanks in advance!
LONG and LONG RAW datatypes are deprecated, and have been for many years now. You really are much better off getting away from them.
Having said that, if you're using PL/SQL, you will be limited to 32,760 bytes of data, which is the max that the LONG RAW PL/SQL datatype will hold. However, the LONG RAW database datatype, can hold up to 2GB of data. So, if any rows in your table contain data longer than 32,760 bytes, you will not be able to retrieve it using PL/SQL. This is a fundamental limitation of LONG and LONG RAW datatypes, and one of the reasons Oracle has deprecated their use.
In that case, the only options are Pro*C or OCI.
More information can be found here:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14261/datatypes.htm#CJAEGDEB
Hope that helps.
You can work with a LONG RAW column directly in PL/SQL if your data is limited to 32kB:
FOR cc IN (SELECT col1, col2... col_raw FROM your_table) LOOP
INSERT INTO your_other_table (col1, col2... col_raw)
VALUES (cc.col1, cc.col2... cc.col_raw);
END LOOP;
This will fail if any LONG RAW is larger than 32k.
In that case you will have to use another language. You could use Java, since it is included in the DB. I have already answered a couple of questions on SO about LONG RAW and Java:
Copying data from LOB Column to Long Raw Column (will work with LONG RAW to LONG RAW too, just replace the UPDATE with an INSERT)
Get the LENGTH of a LONG RAW
In any case, as you have noticed, it is a pain to work with this data type. If converting to a LOB is not possible, you will have to use a workaround.
As a follow-up to this question, I need help with the following scenario:
In Oracle, given a simple data table:
create table data (
id VARCHAR2(255),
key VARCHAR2(255),
value CLOB);
I am using the following merge command:
merge into data
using (
select
? id,
? key,
? value
from
dual
) val on (
data.id=val.id
and data.key=val.key
)
when matched then
update set data.value = val.value
when not matched then
insert (id, key, value) values (val.id, val.key, val.value);
I am invoking the query via JDBC from a Java application.
When the "value" string is large, the above query results in the following Oracle error:
ORA-01461: cannot bind a LONG value for insert into a long column
I even set the "SetBigStringTryClob" property as documented here with the same result.
Is it possible to achieve the behavior I want given that "value" is a CLOB?
EDIT: Client environment is Java
You haven't mentioned specifically in your post, but judging by the tags for the question, I'm assuming you're doing this from Java.
I've had success with code like this in a project I just finished. This application used Unicode, so there may be simpler solutions if your problem domain is limited to a standard ASCII character set.
Are you currently using the OracleStatement.setCLOB() method? It's a terribly awkward thing to have to do, but we couldn't get around it any other way. You have to actually create a temporary CLOB, and then use that temporary CLOB in the setCLOB() method call.
Now, I've ripped this from a working system, and had to make a few ad-hoc adjustments, so if this doesn't appear to work in your situation, let me know and I'll go back to see if I can get a smaller working example.
This of course assumes you're using the Oracle Corp. JDBC drivers (ojdbc14.jar or ojdbc5.jar) which are found in $ORACLE_HOME/jdbc/lib
// Assumes oracle.sql.CLOB (Oracle JDBC driver) and java.io.Writer are imported;
// conn, myStatement, stringData and column.order come from the surrounding code.
CLOB tempClob = CLOB.createTemporary(conn, true, CLOB.DURATION_SESSION);
// Open the temporary CLOB in readwrite mode to enable writing
tempClob.open(CLOB.MODE_READWRITE);
// Get the output stream to write
Writer tempClobWriter = tempClob.getCharacterOutputStream();
// Write the data into the temporary CLOB
tempClobWriter.write(stringData);
// Flush and close the stream
tempClobWriter.flush();
tempClobWriter.close();
// Close the temporary CLOB
tempClob.close();
myStatement.setCLOB(column.order, tempClob);
Regards,
Dwayne King