Unable to create NVARCHAR2 > 2000 - Oracle

In Oracle 18c I am not able to create a table with an NVARCHAR2 column of length > 2000:
Error report -
ORA-00910: specified length too long for its datatype
00910. 00000 - "specified length too long for its datatype"
*Cause: for datatypes CHAR and RAW, the length specified was > 2000;
otherwise, the length specified was > 4000.
*Action: use a shorter length or switch to a datatype permitting a
longer length such as a VARCHAR2, LONG CHAR, or LONG RAW
Which is strange, because MAX_STRING_SIZE is STANDARD, so I should be able to store up to 4000.
What should be changed in the DB settings to allow it?

With MAX_STRING_SIZE = STANDARD the limit for VARCHAR2, NVARCHAR2, and RAW types in Oracle SQL is 4000 bytes.
When you specify VARCHAR2(size) the size is interpreted as a byte length by default. Therefore you can specify up to VARCHAR2(4000).
When you specify NVARCHAR2(size) the size is interpreted as character length. The relevant character set is the national character set, which is AL16UTF16 by default. And for AL16UTF16 the multiplier to convert character length to byte length is 2. Therefore, you can specify up to NVARCHAR2(2000) because this converts to 4000 bytes.
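To make the arithmetic concrete, here is a minimal sketch (assuming MAX_STRING_SIZE = STANDARD and the default AL16UTF16 national character set; table names are illustrative):
CREATE TABLE t_ok (c NVARCHAR2(2000));        -- fits: 2000 characters * 2 bytes = 4000 bytes
CREATE TABLE t_too_long (c NVARCHAR2(2001));  -- fails with ORA-00910: would exceed 4000 bytes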
Setting MAX_STRING_SIZE = EXTENDED increases the capacity of NVARCHAR2 to 32767 bytes, which means you can specify up to NVARCHAR2(16383).
Note that changing MAX_STRING_SIZE in an existing database is more complicated than simply setting a parameter, because a metadata update is required for some database objects to account for the increased VARCHAR2, NVARCHAR2, and RAW capacity. The process is explained in the Oracle Reference Manual.
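For orientation, the procedure for a non-CDB instance has roughly this shape (a sketch based on the Reference Manual; container and pluggable databases require a different, per-PDB procedure, so follow the documentation for your exact version):
SHUTDOWN IMMEDIATE
STARTUP UPGRADE
ALTER SYSTEM SET MAX_STRING_SIZE = EXTENDED;
@?/rdbms/admin/utl32k.sql   -- updates metadata for the increased limits
SHUTDOWN IMMEDIATE
STARTUP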
Note that Oracle's Autonomous Database Cloud Service on Shared Infrastructure (ADB-S) is set up with MAX_STRING_SIZE = EXTENDED by default!
If you are interested in more details, there are some nuances:
Are you sure you need to use NVARCHAR2 at all? Typically if you choose a database character set that supports Unicode, such as AL32UTF8, then you can avoid NCHAR, NVARCHAR2, and NCLOB data types.
It is possible to specify the VARCHAR2 type with character length semantics, either by using the optional CHAR keyword, as in VARCHAR2(1000 CHAR), or by setting the NLS_LENGTH_SEMANTICS parameter in the session to CHAR (see the sketch after this list).
Strictly speaking, for NVARCHAR2(size) size is not characters but code points in the national character set. Some characters use two code points in AL16UTF16.
The national character set is AL16UTF16 by default, but it can also be UTF8. The Oracle documentation for the NVARCHAR2 type explains the NVARCHAR2 limits for the different combinations of MAX_STRING_SIZE and national character set:
The minimum value of size is 1. The maximum value is:
16383 if MAX_STRING_SIZE = EXTENDED and the national character set is AL16UTF16
32767 if MAX_STRING_SIZE = EXTENDED and the national character set is UTF8
2000 if MAX_STRING_SIZE = STANDARD and the national character set is AL16UTF16
4000 if MAX_STRING_SIZE = STANDARD and the national character set is UTF8
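A minimal sketch of both ways to get character semantics on VARCHAR2 (table and column names are illustrative):
CREATE TABLE t1 (c VARCHAR2(1000 CHAR));          -- explicit CHAR keyword on the column
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;    -- or set the session default...
CREATE TABLE t2 (c VARCHAR2(1000));               -- ...so plain sizes are in characters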

Related

Oracle not able to insert data into varchar2(4000 char) column

I use Oracle 12c. I have the below table in my DB.
CREATE TABLE TEST_T (COL VARCHAR2(4000 CHAR));
I need to insert multibyte characters into this table. The character is a 3-byte character.
I am able to insert only 1333 characters (up to 3999 bytes) into the table.
My expectation is to insert up to 1500 multibyte characters, but I get ORA-01461.
I don't want to change the data type to CLOB or LONG.
Is there any way to use VARCHAR2(4000 CHAR) to achieve this?
Below is the code:
SET SERVEROUTPUT ON
DECLARE
  LV_VAR CHAR(1) := 'プ'; -- 3-byte character
  LV_STR VARCHAR2(32000) := '';
BEGIN
  FOR I IN 1..1500
  LOOP
    LV_STR := LV_STR || LV_VAR;
  END LOOP;
  --
  INSERT INTO TEST_T VALUES (LV_STR);
END;
/
Error report -
ORA-01461: can bind a LONG value only for insert into a LONG column
ORA-06512: at line 11
01461. 00000 - "can bind a LONG value only for insert into a LONG column"
*Cause:
*Action:
The problem is that the 4000 byte limit is a hard limit, regardless of whether the datatype is defined as VARCHAR2(4000 CHAR), VARCHAR2(4000 BYTE), or NVARCHAR2(4000). This means that multibyte characters will always have the chance of overflowing a max-size non-CLOB text column.
Oracle's table of Datatype Limits shows each of the VARCHAR2 variants as holding a max of 4000 bytes. And this is precisely the problem you have encountered.
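You can verify the character/byte mismatch directly (a quick check, assuming an AL32UTF8 database character set as in the question):
SELECT LENGTH('プ') AS chars, LENGTHB('プ') AS bytes FROM dual;
-- chars = 1, bytes = 3, so 1333 of these characters (3999 bytes) fit
-- within 4000 bytes, while 1500 (4500 bytes) cannot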
You do have the option of increasing the max size for VARCHAR2 in your Oracle 12c database to 32k.
Here's how to do it: MAX_STRING_SIZE documentation
This is not something to be done without careful consideration: once you change your database to use extended VARCHAR2 strings you cannot go back.
Nevertheless, if your database is all your own and you like the idea of having 32K strings, then this feature was created specifically to address your situation.
Be careful to read the details about pluggable and container databases, as they require different upgrade techniques. This is a change that cuts across the entire database, so you want to get it right.
Use NVARCHAR2 instead of VARCHAR2
NCHAR and NVARCHAR2 are Unicode datatypes that store Unicode character data. The character set of NCHAR and NVARCHAR2 datatypes can only be either AL16UTF16 or UTF8 and is specified at database creation time as the national character set. AL16UTF16 and UTF8 are both Unicode encoding.
The maximum length of an NVARCHAR2 column is 4000 bytes. It can hold up to 4000 characters. The actual data is subject to the maximum byte limit of 4000. The two size constraints must be satisfied simultaneously at run time.
The maximum size for VARCHAR2 is 4000 bytes, not 4000-plus bytes for multibyte characters. You have to change the type to CLOB or NVARCHAR2.
The maximum byte length of an NVARCHAR2 depends on the configured national character set.
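A sketch of the NVARCHAR2 route for the table in the question (assuming the default AL16UTF16 national character set, in which 'プ' takes 2 bytes):
CREATE TABLE TEST_T (COL NVARCHAR2(2000));
-- 1500 characters * 2 bytes = 3000 bytes, within the 4000-byte limit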

Declaring a CLOB in an Oracle database with a custom charset

Is it possible to declare a UTF-8 CLOB if the database is set up with the following character sets?
PARAMETER              VALUE
NLS_CHARACTERSET       CL8ISO8859P5
NLS_NCHAR_CHARACTERSET AL16UTF16
I tried passing a charset name to the declaration, but it looks like it can only accept references to character sets of other objects.
declare
  clob_1 clob character set "AL32UTF8";
begin
  null;
end;
/
I don't think this is possible; see PL/SQL Language Fundamentals:
PL/SQL uses the database character set to represent:
Stored source text of PL/SQL units
Character values of data types CHAR, VARCHAR2, CLOB, and LONG
So, in your case you have to use NCLOB, which uses AL16UTF16, or try a workaround with BLOB. However, this might become cumbersome.
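For illustration, the NCLOB variant of the declaration above compiles without a CHARACTER SET clause, because NCLOB always uses the national character set (AL16UTF16 here):
declare
  nclob_1 nclob; -- stored in the national character set
begin
  null;
end;
/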
As far as I can tell, you can't do that.
Database character set is defined during database creation (and can't be changed unless you recreate the database) and all character datatype columns store data in that character set.
Perhaps you could try with NCLOB data type, where "N" represents "national character set" and it'll store Unicode character data.
Unicode is a universal encoded character set that can store information in any language using a single character set

Thai characters not allowing more than 1333 characters from Java code

Thai characters are not allowing more than 1333 characters from Java code. Is there any possible way except using the CLOB data type in the DB? We are using Oracle 11g.
Simply, no (I assume you use the VARCHAR2 data type), except in Oracle 12c with EXTENDED strings.
VARCHAR2 columns allow 4000 bytes in normal mode and up to 32767 in extended.
Thai requires multibyte characters; that's why more than 1333 characters can take more than 4000 bytes.
NVARCHAR2 columns allow 2000 characters in normal mode and up to 16383 in extended.
What is the DB character set?
I suspect your scenario is as follows:
AL32UTF8 is the DB character set.
The VARCHAR2 column(s) in your table(s) have byte semantics.
The UTF-8 encoding represents each Thai character in up to 3 bytes; thus you encounter the length limit of 1333 (4000 / 3) instead of 4000.
You can change the length semantics from byte to char with ALTER TABLE <table> MODIFY (<column> VARCHAR2(n CHAR)); as shown below.
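A minimal sketch (table and column names are hypothetical; note that in 11g each value is still capped at 4000 bytes, so character semantics alone cannot push 3-byte Thai text past roughly 1333 characters):
ALTER TABLE my_table MODIFY (thai_col VARCHAR2(2000 CHAR));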
For the sake of completeness: in case you are operating with a single-byte DB character set like WE8ISO8859P11 (ISO 8859-11, Thai script), characters can be composed from base characters and diacritical marks. In that case you might have success changing the encoding in the data source to use the code points for composite characters. However, I feel this scenario is unlikely, given that each of your test-data characters would have to be composed from three parts to match the observation.

ORA-01704: string literal too long using long UTF-8 Character set

I'm testing a database that was recently converted to UTF-8. If I insert a long string of random UTF-8 characters into a varchar2(4000) field, I get:
[ORA-01704: string literal too long using long UTF-8 Character set]
If I cut the string down to about 3600 characters, it works. What gives? Is there a way to insert my 4000 characters?
Note that there are some pretty strange characters in the string.
Thanks.
From the documentation:
Independently of the maximum length in characters, the length of VARCHAR2 data cannot exceed 4000 bytes.
So a field declared as varchar2(4000 [char]) can hold 4000 single-byte characters, or a lower number of multi-byte characters. You can't get around that, at least until 12c when varchar2 supports up to 32k.
If you do actually need to allow 4000 multi-byte characters in 11g or earlier you will need to create the column as a CLOB, which can hold gigabytes of data. (You might want to read more on LOB storage as well).
A single UTF-8 character can be more than 1 byte long, and Oracle has a limit of 4000 bytes, so fewer than 4000 multibyte UTF-8 characters will fit into a 4000-character column.
Better to change the datatype of the column to CLOB.
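A sketch of the CLOB route (table and column names are illustrative; binding the value avoids the limit on string literals that ORA-01704 complains about):
CREATE TABLE utf8_text (body CLOB);
INSERT INTO utf8_text (body) VALUES (:long_string);  -- bind the full value instead of a literal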

Difference between VARCHAR2(10 CHAR) and NVARCHAR2(10)

I've installed Oracle Database 10g Express Edition (Universal) with the default settings:
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_CHARACTERSET       AL32UTF8
NLS_NCHAR_CHARACTERSET AL16UTF16
Given that both CHAR and NCHAR data types seem to accept multi-byte strings, what is the exact difference between these two column definitions?
VARCHAR2(10 CHAR)
NVARCHAR2(10)
The NVARCHAR2 datatype was introduced by Oracle for databases that want to use Unicode for some columns while keeping another character set for the rest of the database (which uses VARCHAR2). The NVARCHAR2 is a Unicode-only datatype.
One reason you may want to use NVARCHAR2 might be that your DB uses a non-Unicode character set and you still want to be able to store Unicode data for some columns without changing the primary character set. Another reason might be that you want to use two Unicode character sets (AL32UTF8 for data that comes mostly from western Europe, AL16UTF16 for data that comes mostly from Asia, for example) because different character sets won't store the same data equally efficiently.
Both columns in your example (Unicode VARCHAR2(10 CHAR) and NVARCHAR2(10)) would be able to store the same data, however the byte storage will be different. Some strings may be stored more efficiently in one or the other.
Note also that some features won't work with NVARCHAR2, see this SO question:
Oracle Text will not work with NVARCHAR2. What else might be unavailable?
I don't think the answer from Vincent Malgrat is correct. When NVARCHAR2 was introduced a long time ago, nobody was even talking about Unicode.
Initially Oracle provided VARCHAR2 and NVARCHAR2 to support localization. Common data (including PL/SQL) was held in VARCHAR2, most likely US7ASCII in those days. Then you could apply NLS_NCHAR_CHARACTERSET individually (e.g. WE8ISO8859P1) for each of your customers in any country without touching the common part of your application.
Nowadays the character set AL32UTF8 is the default, and it fully supports Unicode. In my opinion there is no reason anymore to use NLS_NCHAR_CHARACTERSET, i.e. NVARCHAR2, NCHAR, NCLOB. Note that there are more and more Oracle native functions which do not support NVARCHAR2, so you should really avoid it. Maybe the only remaining reason is when you have to support mainly Asian characters, where AL16UTF16 consumes less storage than AL32UTF8.
The NVARCHAR2 type stores variable-length character data. When you create a table with an NVARCHAR2 column, the maximum size is always in character length semantics, which is also the default and only length semantics for the NVARCHAR2 data type.
The NVARCHAR2 data type uses the AL16UTF16 character set, which encodes Unicode data in the UTF-16 encoding and uses 2 bytes to store a character. In addition, the maximum byte length of an NVARCHAR2 depends on the configured national character set.
VARCHAR2: the maximum size can be expressed in either bytes or characters. A VARCHAR2 column can only store characters in the database's default character set, while NVARCHAR2 can store virtually any character. A single character may require up to 4 bytes.
By defining the field as:
VARCHAR2(10 CHAR), you tell Oracle it can use enough space to store 10 characters, no matter how many bytes it takes to store each one. A single character may require up to 4 bytes.
NVARCHAR2(10), you tell Oracle it can store 10 characters with 2 bytes per character.
In Summary:
VARCHAR2(10 CHAR) can store a maximum of 10 characters and a maximum of 40 bytes (depending on the configured database character set).
NVARCHAR2(10) can store a maximum of 10 characters and a maximum of 20 bytes (depending on the configured national character set).
Note: the character set can be UTF-8, UTF-16, and so on.
Please have a look at this tutorial for more detail.
Have a good day!
NVARCHAR2 is Unicode-only storage.
Though both data types are variable-length string datatypes, you can notice the difference in how they store values.
Characters are stored in bytes, and, as we know, not all languages' alphabets take the same space: an English letter needs 1 byte per character, whereas languages like Japanese or Chinese need more than 1 byte per character.
When you specify VARCHAR2(10), you are telling the DB that only 10 bytes of data will be stored. But when you say NVARCHAR2(10), it means 10 characters will be stored. In this case, you don't have to worry about the number of bytes each character takes.
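For illustration, a quick comparison (a sketch assuming an AL32UTF8 database character set and an AL16UTF16 national character set, as in the question):
SELECT LENGTHB('日本語')  AS utf8_bytes,   -- 9: 3 bytes per character in AL32UTF8
       LENGTHB(N'日本語') AS utf16_bytes   -- 6: 2 bytes per character in AL16UTF16
FROM dual;
-- Both values are 3 characters long according to LENGTH()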
