I have an NVARCHAR2 field in Oracle and I would like to know how many bytes it can hold if it has a length of 178.
It depends on your national character set (the NLS_NCHAR_CHARACTERSET parameter).
Check http://docs.oracle.com/cd/B28359_01/server.111/b28298/ch6unicode.htm#g1008281
From the manual:
Width specifications of character data type NVARCHAR2 refer to the number of characters. The maximum column size allowed is 4000 bytes.
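For the concrete case in the question: with the usual national character set AL16UTF16, every character occupies exactly 2 bytes, so an NVARCHAR2(178) column holds at most 178 characters = 356 bytes, well under the 4000-byte cap. A minimal sketch to verify this on your own database (the table and column names are made up for illustration):

    -- Check which national character set is in effect:
    SELECT value FROM nls_database_parameters
    WHERE  parameter = 'NLS_NCHAR_CHARACTERSET';

    -- Measure the byte usage of a fully populated column:
    CREATE TABLE nv_test (v NVARCHAR2(178));
    INSERT INTO nv_test VALUES (RPAD(N'x', 178, N'x'));
    SELECT LENGTH(v) AS chars, LENGTHB(v) AS bytes FROM nv_test;
    -- With AL16UTF16 this reports 178 chars / 356 bytes.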
There is one aspect of the extended data types introduced with Oracle 12 that I don't quite understand. The documentation (https://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623) says:
A VARCHAR2 or NVARCHAR2 data type with a declared size of greater than 4000 bytes, or a RAW data type with a declared size of greater than 2000 bytes, is an extended data type. Extended data type columns are stored out-of-line, leveraging Oracle's LOB technology.
According to this definition, does VARCHAR2(4000 CHAR) have "a declared size of greater than 4000 bytes", because with a multi-byte character set (e.g. AL32UTF8) it could contain more than 4000 bytes?
Or, more specifically: what happens when I create a column of that type? I can think of the following possibilities:
It is created with an extended data type, so that the content of that column is always stored in a CLOB regardless of its size.
The values for that column are stored inline if they need no more than 4000 bytes, and as a CLOB if they are longer.
It will refuse to store any values with more than 4000 bytes. To circumvent that I would have to declare the column as VARCHAR2(4001 CHAR) or something similar.
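A minimal sketch to test which of these happens, assuming a 12c database with MAX_STRING_SIZE = EXTENDED and an AL32UTF8 database character set (the table and column names are made up):

    CREATE TABLE ext_test (txt VARCHAR2(4000 CHAR));
    -- 4000 two-byte characters = 8000 bytes ('ä' is 2 bytes in AL32UTF8):
    INSERT INTO ext_test VALUES (RPAD('ä', 4000, 'ä'));
    -- If VARCHAR2(4000 CHAR) is still capped at 4000 bytes, the INSERT
    -- raises ORA-12899; if the column behaves as an extended data type,
    -- it succeeds and the value occupies 8000 bytes:
    SELECT LENGTH(txt) AS chars, LENGTHB(txt) AS bytes FROM ext_test;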
Edit: The question had been marked as a duplicate of Enter large content to oracle database for a few weeks, but I don't think it is. The other question is generally about how you can enter more than 4000 characters in a VARCHAR2 column, but I am asking a very specific question about an edge case.
I have a database with the below NLS settings
NLS_NCHAR_CHARACTERSET - AL16UTF16
NLS_CHARACTERSET - AL32UTF8
There's a table with a CLOB column storing base64-encoded data.
Since the characters are mostly English letters, I would assume each character takes up only 1 byte, as a CLOB uses the NLS_CHARACTERSET character set for its encoding.
With an inline-enabled (ENABLE STORAGE IN ROW) CLOB column, the CLOB will be stored inline unless it grows beyond 4096 bytes in size. However, when I tried to store a value of 2048 characters, I found that it is not stored inline (checked via DBA_TABLES). So does that mean each character uses more than 1 byte? Can anyone elaborate on this?
Another test I added:
Create a table with a CLOB column with an 8 KB chunk size, so that the initial segment size is 65,536 bytes.
After inserting a row with 32,768 characters in the CLOB column, the creation of a second extent can be seen by querying DBA_SEGMENTS.
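For reference, a sketch of that second test (all names are illustrative; the value is built in PL/SQL because a single SQL literal would be too long):

    CREATE TABLE clob_chunk_test (c CLOB)
      LOB (c) STORE AS (ENABLE STORAGE IN ROW CHUNK 8192);

    DECLARE
      buf VARCHAR2(16384) := RPAD('x', 16384, 'x');
      v   CLOB;
    BEGIN
      INSERT INTO clob_chunk_test VALUES (EMPTY_CLOB()) RETURNING c INTO v;
      DBMS_LOB.WRITEAPPEND(v, 16384, buf);
      DBMS_LOB.WRITEAPPEND(v, 16384, buf);  -- 32,768 characters in total
      COMMIT;
    END;
    /

    -- Watch the LOB segment acquire a second extent:
    SELECT segment_name, extents, bytes
    FROM   dba_segments
    WHERE  segment_name IN (SELECT segment_name FROM dba_lobs
                            WHERE  table_name = 'CLOB_CHUNK_TEST');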
http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch6unicode.htm#r2c1-t12
It says:
Data in CLOB columns is stored in a format that is compatible with UCS-2 when the database character set is multibyte, such as UTF8 or AL32UTF8. This means that the storage space required for an English document doubles when the data is converted.
So it looks like a CLOB internally stores everything in UCS-2 (Unicode), i.e. a fixed 2 bytes per character. Consequently, at most 4096 / 2 = 2048 characters fit inline, which is exactly the boundary you hit.
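A quick sketch to see this effect, assuming an AL32UTF8 database character set (names are illustrative):

    CREATE TABLE clob_inline_test (c CLOB)
      LOB (c) STORE AS (ENABLE STORAGE IN ROW);
    INSERT INTO clob_inline_test VALUES (RPAD('a', 2048, 'a'));
    -- DBMS_LOB.GETLENGTH reports characters for a CLOB, not bytes:
    SELECT DBMS_LOB.GETLENGTH(c) FROM clob_inline_test;   -- 2048
    -- Internally those 2048 ASCII characters occupy about 4096 bytes
    -- (2 bytes each in the UCS-2-compatible format), which is why the
    -- value no longer fits inline.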
I have an NVARCHAR2(2000) column in Oracle to store comments. When a user tries to enter more than 2000 characters, I get the following error:
ORA-00910: specified length too long for its datatype.
The NLS_NCHAR_CHARACTERSET parameter is set to AL16UTF16.
Is there any way to increase the size to accept up to 6000 characters? My column already has lots of content, so I am not sure whether I will be able to change the datatype from NVARCHAR2(2000) to anything else.
Unless you use Oracle 12c, it's not possible to store more than 2000 characters: with AL16UTF16 each character takes 2 bytes, and the 4000-byte NVARCHAR2 limit divided by 2 bytes per character gives exactly 2000. See the datatype limits here:
http://docs.oracle.com/cd/B28359_01/server.111/b28320/limits001.htm
Instead, you should use the NCLOB datatype.
If you use 12c, check: http://dbasolved.com/2013/06/26/change-varchar2-to-32k-12c-edition/
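If you move to NCLOB, a minimal in-place migration sketch (assuming, purely for illustration, a table COMMENTS with a column COMMENT_TEXT):

    ALTER TABLE comments ADD (comment_text_new NCLOB);
    UPDATE comments SET comment_text_new = comment_text;  -- implicit NVARCHAR2-to-NCLOB conversion
    ALTER TABLE comments DROP COLUMN comment_text;
    ALTER TABLE comments RENAME COLUMN comment_text_new TO comment_text;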
What datatype in Oracle should I be using to store comment boxes? I was going to use LONG, but it allows only one LONG column per table. Or should I just use VARCHAR2 and set it really large?
What is the longest comment you want to be able to support?
If your comments are less than 4000 bytes in length, you can use a VARCHAR2(4000). If your comments are longer than 4000 bytes in length, you can use a CLOB. A CLOB can store any character data supported by your database character set.
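For example, a minimal sketch of such a table (names are illustrative):

    CREATE TABLE user_comments (
      comment_id   NUMBER PRIMARY KEY,
      comment_text CLOB   -- no practical length limit for comments
    );

A nice property of this design is that small CLOB values are stored inline with the row by default, so short comments don't pay the out-of-line LOB overhead.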
I've installed Oracle Database 10g Express Edition (Universal) with the default settings:
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_CHARACTERSET AL32UTF8
NLS_NCHAR_CHARACTERSET AL16UTF16
Given that both CHAR and NCHAR data types seem to accept multi-byte strings, what is the exact difference between these two column definitions?
VARCHAR2(10 CHAR)
NVARCHAR2(10)
The NVARCHAR2 datatype was introduced by Oracle for databases that want to use Unicode for some columns while keeping another character set for the rest of the database (which uses VARCHAR2). The NVARCHAR2 is a Unicode-only datatype.
One reason you may want to use NVARCHAR2 might be that your DB uses a non-Unicode character set and you still want to be able to store Unicode data for some columns without changing the primary character set. Another reason might be that you want to use two Unicode character sets (AL32UTF8 for data that comes mostly from Western Europe, AL16UTF16 for data that comes mostly from Asia, for example), because different character sets don't store the same data equally efficiently.
Both columns in your example (Unicode VARCHAR2(10 CHAR) and NVARCHAR2(10)) would be able to store the same data; however, the byte storage will be different. Some strings may be stored more efficiently in one or the other.
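A quick way to see that difference in byte storage, assuming the AL32UTF8 / AL16UTF16 combination from the question:

    SELECT LENGTHB('日本語')           AS varchar2_bytes,   -- 9: 3 bytes/char in AL32UTF8
           LENGTHB(TO_NCHAR('日本語')) AS nvarchar2_bytes   -- 6: 2 bytes/char in AL16UTF16
    FROM   dual;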
Note also that some features won't work with NVARCHAR2, see this SO question:
Oracle Text will not work with NVARCHAR2. What else might be unavailable?
I don't think the answer from Vincent Malgrat is correct. When NVARCHAR2 was introduced a long time ago, nobody was even talking about Unicode.
Initially, Oracle provided VARCHAR2 and NVARCHAR2 to support localization. Common data (including PL/SQL) was held in VARCHAR2, most likely US7ASCII in those days. Then you could apply an individual NLS_NCHAR_CHARACTERSET (e.g. WE8ISO8859P1) for each of your customers in any country, without touching the common part of your application.
Nowadays the character set AL32UTF8, which fully supports Unicode, is the default. In my opinion there is no reason anymore to use NLS_NCHAR_CHARACTERSET, i.e. NVARCHAR2, NCHAR, NCLOB. Note that there are more and more Oracle native functions which do not support NVARCHAR2, so you should really avoid it. Maybe the only remaining reason is when you have to support mainly Asian characters, where AL16UTF16 consumes less storage than AL32UTF8.
NVARCHAR2 stores variable-length character data. When you create a table with an NVARCHAR2 column, the maximum size is always in character length semantics, which is also the default and only length semantics for the NVARCHAR2 data type.
The NVARCHAR2 data type uses the AL16UTF16 character set, which encodes Unicode data in the UTF-16 encoding. AL16UTF16 uses 2 bytes to store each character. In addition, the maximum byte length of an NVARCHAR2 column depends on the configured national character set.
The maximum size of a VARCHAR2 can be specified in either bytes or characters. A VARCHAR2 column can only store characters in the database character set, while NVARCHAR2 can store virtually any character.
By defining the field as:
VARCHAR2(10 CHAR), you tell Oracle it can use enough space to store 10 characters, no matter how many bytes it takes to store each one. A single character may require up to 4 bytes.
NVARCHAR2(10), you tell Oracle it can store 10 characters, with 2 bytes per character.
In Summary:
VARCHAR2(10 CHAR) can store a maximum of 10 characters and a maximum of 40 bytes (depending on the configured database character set).
NVARCHAR2(10) can store a maximum of 10 characters and a maximum of 20 bytes (depending on the configured national character set).
Note: the character set can be UTF-8, UTF-16, and so on.
NVARCHAR2 is Unicode-only storage.
Though both data types are variable-length string datatypes, you can notice the difference in how they store values.
Each character is stored as some number of bytes, and not all languages need the same number: the English alphabet needs 1 byte per character, while languages like Japanese or Chinese need more than 1 byte per character.
When you specify VARCHAR2(10) (with the default byte length semantics), you are telling the database that at most 10 bytes of data will be stored. But when you say NVARCHAR2(10), it means 10 characters will be stored, and you don't have to worry about the number of bytes each character takes.
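A small sketch of that difference, assuming an AL32UTF8 database with the default byte length semantics (names are illustrative):

    CREATE TABLE semantics_test (v VARCHAR2(10), nv NVARCHAR2(10));
    -- 5 Japanese characters = 15 bytes in AL32UTF8:
    INSERT INTO semantics_test (v)  VALUES ('日本語です');   -- fails with ORA-12899 (15 bytes > 10)
    INSERT INTO semantics_test (nv) VALUES (N'日本語です');  -- succeeds: 5 of 10 characters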