Multibyte characters in Oracle

I want to insert Chinese characters into an Oracle database.
select length('有个可爱的小娃在旁边') from dual;
10
drop table multibyte;
create table multibyte (name varchar2(10));
insert into multibyte
values('有个可爱的小娃在旁边');
I get an error message saying:
An attempt was made to insert or update a column with a value
which is too wide for the width of the destination column.
The name of the column is given, along with the actual width
of the value, and the maximum allowed width of the column.
Note that widths are reported in characters if character length
semantics are in effect for the column, otherwise widths are
reported in bytes
I know that if I increase the column width the problem goes away.
My question is: when the LENGTH function tells me the length is 10, why can't I insert the value into a column defined as varchar2(10)?

The difference is due to the definition of the column: with the default length semantics (NLS_LENGTH_SEMANTICS = BYTE), VARCHAR2(10) is equivalent to VARCHAR2(10 BYTE), whereas what you want is VARCHAR2(10 CHAR).
Difference between BYTE and CHAR in column datatypes

Yeah, that is a bit unfortunate. By default, Oracle measures the length of a text column in bytes, so a varchar2(10) can store only ten bytes. In a UTF-8 database that is about three Chinese characters, since each one takes three bytes.
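The gap between the two lengths is easy to reproduce outside the database. The sketch below is a Python illustration, not Oracle code, and it assumes the database character set is UTF-8 based (for example AL32UTF8). It compares what LENGTH counts (characters) with what a VARCHAR2(10 BYTE) column counts (bytes).

```python
# Assumption: the database character set is UTF-8 based (e.g. AL32UTF8).
text = "有个可爱的小娃在旁边"

char_count = len(text)                  # what LENGTH() counts: characters
byte_count = len(text.encode("utf-8"))  # what a BYTE-semantics column counts

# How many leading characters fit into a 10-byte column?
limit = 10
used = 0
fits = 0
for ch in text:
    width = len(ch.encode("utf-8"))
    if used + width > limit:
        break
    used += width
    fits += 1

print(char_count, byte_count, fits)  # 10 30 3
```

Ten characters need 30 bytes here, so only the first three fit into ten bytes.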

Related

Oracle not able to insert data into varchar2(4000 char) column

I use Oracle 12c. I have the table below in my DB.
CREATE TABLE TEST_T (COL VARCHAR2(4000 CHAR));
I need to insert multibyte characters into this table. Each character takes 3 bytes.
I am able to insert only 1333 characters (up to 3999 bytes) into the table.
I expected to be able to insert up to 1500 multibyte characters, but I get ORA-01461.
I don't want to change the data type to CLOB or LONG.
Is there any way to achieve this with VARCHAR2(4000 CHAR)?
Below is the code:
SET SERVEROUTPUT ON
DECLARE
LV_VAR CHAR(1):='プ'; -- 3 byte character
LV_STR VARCHAR2(32000) := '';
BEGIN
FOR I IN 1..1500
LOOP
LV_STR := LV_STR||LV_VAR;
END LOOP;
--
INSERT INTO TEST_T VALUES (LV_STR);
END;
/
Error report -
ORA-01461: can bind a LONG value only for insert into a LONG column
ORA-06512: at line 11
01461. 00000 - "can bind a LONG value only for insert into a LONG column"
*Cause:
*Action:
The problem is that the 4000 byte limit is a hard limit, regardless of whether the datatype is defined as VARCHAR2(4000 CHAR), VARCHAR2(4000 BYTE), or NVARCHAR2(4000). This means that multibyte characters will always have the chance of overflowing a max-size non-CLOB text column.
Oracle's table of Datatype Limits shows each of the VARCHAR2 variants as holding a max of 4000 bytes. And this is precisely the problem you have encountered.
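The arithmetic behind the limit can be checked with a few lines of Python. This is only an illustration, under the assumptions that the data is stored as UTF-8 and that the default 4000-byte VARCHAR2 limit applies; MAX_BYTES is a name chosen for this sketch.

```python
# Assumptions: UTF-8 storage and the default 4000-byte VARCHAR2 limit.
MAX_BYTES = 4000

ch = "\u30d7"  # the katakana character from the question
bytes_per_char = len(ch.encode("utf-8"))

max_chars = MAX_BYTES // bytes_per_char  # most characters that can fit
wanted_bytes = 1500 * bytes_per_char     # bytes needed for 1500 characters

print(bytes_per_char, max_chars, wanted_bytes, wanted_bytes <= MAX_BYTES)
# 3 1333 4500 False
```

The 1333-character ceiling seen in the question is just 4000 divided by 3, rounded down, and 1500 characters would need 4500 bytes.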
You do have the option of increasing the max size for VARCHAR2 in your Oracle 12c database to 32k.
Here's how to do it: MAX_STRING_SIZE documentation
This is not something to be done without careful consideration: once you change your database to use extended VARCHAR2 strings you cannot go back.
Nevertheless, if your database is all your own and you like the idea of having 32K strings, then this feature was created specifically to address your situation.
Be careful to read the details for pluggable databases and container databases, as they require different upgrade techniques. This is a change that cuts across the entire database, so you want to get it right.
Use NVARCHAR2 instead of VARCHAR2
NCHAR and NVARCHAR2 are Unicode datatypes that store Unicode character data. The character set of NCHAR and NVARCHAR2 datatypes can only be either AL16UTF16 or UTF8 and is specified at database creation time as the national character set. AL16UTF16 and UTF8 are both Unicode encoding.
The maximum length of an NVARCHAR2 column is 4000 bytes. It can hold up to 4000 characters. The actual data is subject to the maximum byte limit of 4000. The two size constraints must be satisfied simultaneously at run time.
The maximum size for VARCHAR2 is 4000 bytes (VARCHAR2 max size) and is not 4000+ bytes for multibyte characters. You have to change the type to CLOB or NVARCHAR2.
The maximum byte length of an NVARCHAR2 depends on the configured national character set (NVARCHAR2).

VARCHAR2 datatype in ORACLE

I am very new to Oracle. Today I came across the VARCHAR2 data type and wanted to learn more about it, so I googled it, and that is where I ran into a problem.
I have gone through a few articles about the data type and found some directly contradictory descriptions of VARCHAR2.
DESCRIPTION 1:
When you create a table with a VARCHAR2 column, you specify a maximum
column length (in bytes, not characters) between 1 and 2000 for the
VARCHAR2 column(article)
DESCRIPTION 2:
you can store up to 4000 characters in a VARCHAR2 column. (article)
As you can see, it is a bit confusing. Does the VARCHAR2 length specify the maximum length in bytes or the maximum number of characters? Could somebody please explain which one is correct?
It depends on your Oracle version, but both articles are mostly incorrect.
When you DECLARE the column, you can either declare the stated length EXPLICITLY as either bytes or characters, or IMPLICITLY using your session's default.
Also, the maximum length is 4000 bytes, NOT characters. Even if you declare VARCHAR2(4000 CHAR), the column cannot store more than 4000 BYTES. It will store 4000 characters if they are all single-byte, otherwise it will store fewer than 4000 characters.
DESCRIPTION 2:
you can store up to 4000 characters in a VARCHAR2 column.
This is correct
The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column.
=> The varchar2 datatype is currently the same as the varchar datatype.
=> It is a variable-length datatype.
Ex. with "name varchar2(20)" and the value "Ram" stored in name, LENGTH(name) is 3, NOT 20.
=> It is an internal datatype managed by the Oracle server only.
=> Even if you declare a column as varchar, Oracle implicitly converts it to varchar2.

SQLPLUS displaying of tables with CLOB types

I have a table which has 3 columns: a NUMBER column, a CLOB column, and a BLOB column. How can I use some sort of SELECT * statement to display what I have entered into this table, not just a partial piece of the long character strings I have in there? The only way I know of displaying a long string from a CLOB is the DBMS_LOB.SUBSTR technique. My BLOB column is currently all NULL, so I am not too worried about displaying that, just the NUMBER column with its associated CLOB. Thanks!
See here How to query a CLOB column in Oracle
When you select a CLOB column (or a substring of it) in a query tool, the tool's own display limits can truncate the output, so you may need to raise them.
In SQL*Plus, for example, SET LONG controls how many characters of a CLOB are displayed. The default is only 80, so use something like SET LONG 10000 to see up to 10000 characters.
With DBMS_LOB.SUBSTR you can also specify the number of characters to return and the offset to start from.
So DBMS_LOB.SUBSTR(column, 3000) restricts the result to 3000 characters, which may be small enough for the tool to display in full.
See the Oracle documentation for more information on DBMS_LOB.SUBSTR:
DBMS_LOB.SUBSTR (
lob_loc IN CLOB CHARACTER SET ANY_CS,
amount IN INTEGER := 32767,
offset IN INTEGER := 1)
RETURN VARCHAR2 CHARACTER SET lob_loc%CHARSET;

Varchar2(CHAR) always in PLSQL?

If the DB is using utf8 as charset does it make sense to use varchar2 with byte semantics?
Consider a variable:
l_str varchar2(10 BYTE);
If I later assign values to it from a query that returns the contents of a 10 char column, I have no way of knowing how many bytes those characters will take; in utf8 that might be more than one byte per character.
So when using a multibyte character set shouldn't I always use the following?
l_str varchar2(10 CHAR);
Or to put it another way, is there any reason why you should use varchar2(10 byte) or varchar2(10) in PLSQL?
EDIT: The only reason I can think of for using byte is if you know for sure how many bytes the characters stored will need. So in this case you will allocate less memory.
With byte semantics you can end up with less room than you expect: depending on the character set, a column might hold only half as many characters as its declared size, for instance. This is confirmed by the documentation:
http://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_sqltypes.htm#sthref367
id VARCHAR2(32 BYTE)
The id column contains only single-byte data, up to 32 bytes.
name VARCHAR2(32 CHAR)
The name column contains data in the database character set. If the database character set allows multibyte characters, then the 32 characters can be stored as more than 32 bytes.
I have the AL32UTF8 character set with Oracle 11g. Oracle sizes columns declared with byte and char semantics differently, e.g.:
create table t1(aa varchar2(1));
create table t2(aa varchar2(1 char));
Now execute:
select * from ALL_TAB_COLUMNS where table_name in ('T1','T2');
Check the DATA_LENGTH column: for the column declared with byte semantics the value is 1, and for the one declared with char semantics it is 4.
See the ALL_TAB_COLUMNS documentation for a description of its columns.

ORA-12899: value too large for column

I am getting data from ERP systems in the form of feeds. In the feed, one particular column has a length of only 15.
The corresponding column in the target table is also varchar2(15), but when I try to load the data into the DB I get an error like:
ORA-12899: value too large for column emp_name (actual: 16, maximum: 15)
I can't increase the column length since it is a base table in production.
Have a look at this blog. The problem was resolved for me by changing the column datatype from varchar(100) to varchar(100 char). In my case the data contained some umlaut characters.
http://gerardnico.com/wiki/database/oracle/byte_or_character
The usual reason for problems like this is non-ASCII characters that can be represented with one byte in the original database but require two (or more) bytes in the target database (due to different NLS settings).
To ensure your target column is large enough for 15 characters, you can modify it:
ALTER TABLE table_name MODIFY column_name VARCHAR2(15 CHAR)
(note the 15 CHAR - you can also use BYTE; if neither is present, the database uses the NLS_LENGTH_SEMANTICS setting as a default).
To check which values are larger than 15 bytes, you can:
1. create a staging table in the target database with the column length set to 15 CHAR
2. insert the data from the source table into the staging table
3. find the offending rows with
SELECT * FROM staging WHERE lengthb(mycol) > 15
(note the use of LENGTHB as opposed to LENGTH: the former returns the length in bytes, whereas the latter returns the length in characters)
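The same check can be run on the feed before it ever reaches the database. The Python sketch below is only an illustration: the sample names are hypothetical, and byte_length is a helper defined here that mimics LENGTHB under the assumption that the target database uses a UTF-8 character set.

```python
# Hypothetical feed values; the target column is assumed to be
# VARCHAR2(15 BYTE) in a database with a UTF-8 character set.
MAX_BYTES = 15
names = ["John Smith", "J\u00fcrgen Schneide", "Anna M\u00fcller"]  # \u00fc is u-umlaut

def byte_length(value):
    """Rough stand-in for LENGTHB, assuming UTF-8 storage."""
    return len(value.encode("utf-8"))

# Keep (value, length in characters, length in bytes) for every value
# that would overflow the byte-limited column.
offending = [
    (name, len(name), byte_length(name))
    for name in names
    if byte_length(name) > MAX_BYTES
]

for name, chars, nbytes in offending:
    print(name, chars, nbytes)
```

The second name is 15 characters long but needs 16 bytes, which is the same "actual: 16, maximum: 15" pattern reported by ORA-12899.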
I found AL32UTF8 to be the only valid setting. It differs from the older UTF8 character set only in how a few supplementary characters are encoded, i.e. the two are about 99% the same. I am guessing you have character conversion problems going on. In other words, the data in table1 was written using one charset, and the new table has a slightly different charset.
If this is true, you have to find the source of the oddball charset, because otherwise this will continue to happen.
Solution to:
ORA-12899: VALUE TOO LARGE FOR COLUMN(ACTUAL,MAXIMUM)
If you need to shrink a column of a table that already holds values longer than the new length, the simple script below should work. Note that it truncates the existing data.
ALTER TABLE TABLE_NAME ADD (NEW_COLUMN_NAME DATATYPE(DATASIZE));
UPDATE TABLE_NAME SET NEW_COLUMN_NAME = SUBSTR(OLD_COLUMN_NAME , 1, NEW_LENGTH);
ALTER TABLE TABLE_NAME DROP COLUMN OLD_COLUMN_NAME ;
ALTER TABLE TABLE_NAME RENAME COLUMN NEW_COLUMN_NAME TO OLD_COLUMN_NAME;
Meaning of the query:
ALTER TABLE TABLE_NAME ADD (NEW_COLUMN_NAME DATATYPE(DATASIZE));
This just creates a new column of the required new length in your existing table.
UPDATE TABLE_NAME SET NEW_COLUMN_NAME = SUBSTR(OLD_COLUMN_NAME , 1, NEW_LENGTH);
This discards everything beyond the new length from the old column's values and stores the truncated values in the new column.
ALTER TABLE TABLE_NAME DROP COLUMN OLD_COLUMN_NAME ;
This removes the old column, which is obsolete now that its contents have been copied into the new column.
ALTER TABLE TABLE_NAME RENAME COLUMN NEW_COLUMN_NAME TO OLD_COLUMN_NAME;
Renaming the new column to the old column's name restores the original table structure, except for the new column size you wanted.
Certainly the cause of the error is that the value is too large for the column's data type. However, sometimes this is not obvious at first sight. Besides the "byte versus char" differences mentioned in other answers, line terminators can also be the problem.
I was trying to load a CSV file using SQL*Loader into a dockerized Oracle. The foo column, of type char(1), was the last column. I got the error ORA-12899: value too large for column foo (actual: 2, maximum: 1) even though all values in the foo column had length 1. Later I noticed that the CSV file had been edited in a Windows editor and accidentally saved with CRLF terminators. Since Linux in the Docker container expects just LF, the CR was treated as part of the column data.
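The effect of the stray CR can be shown with a small Python sketch. It does not use SQL*Loader; it only mimics a reader that treats LF alone as the record terminator, and the sample record is hypothetical.

```python
# Hypothetical CSV record saved with Windows (CRLF) line endings.
raw = "42,Y\r\n"

# A reader that strips only LF leaves the CR attached to the last field.
fields_lf_only = raw.rstrip("\n").split(",")
last_field = fields_lf_only[-1]
print(repr(last_field), len(last_field))  # 'Y\r' 2

# Stripping both terminator characters gives the expected 1-character value.
fields_clean = raw.rstrip("\r\n").split(",")
clean_field = fields_clean[-1]
print(repr(clean_field), len(clean_field))  # 'Y' 1
```

Only the last column is affected, because that is where the terminator sits, which matches the symptom described above.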
This error confused me a little bit.
VARCHAR2(x CHAR) means that the column will hold x characters but not
more than can fit into 4000 bytes. Internally, Oracle will set the
byte length of the column (DBA_TAB_COLUMNS.DATA_LENGTH) to MIN(x *
mchw, 4000), where mchw is the maximum byte width of a character in
the database character set. This is 1 for US7ASCII or WE8MSWIN1252, 2
for JA16SJIS, 3 for UTF8, and 4 for AL32UTF8.
For example, a VARCHAR2(3000 CHAR) column in an AL32UTF8 database will
be internally defined as having the width of 4000 bytes. It will hold
up to 3000 characters from the ASCII range (the character limit), but
only 1333 Chinese characters (the byte limit, 1333 * 3 bytes = 3999
bytes). A VARCHAR2(100 CHAR) column in an AL32UTF8 database will be
internally defined as having the width of 400 bytes. It will hold up
to any 100 Unicode characters.
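The MIN(x * mchw, 4000) rule quoted above can be written out as a short Python sketch. The function names are invented for this illustration, the widths are the ones listed in the quoted text, and the 4000-byte cap assumes the default (non-extended) VARCHAR2 limit.

```python
MAX_VARCHAR2_BYTES = 4000  # assumes the default, non-extended limit

# Maximum bytes per character (mchw), as listed in the quoted text.
MAX_CHAR_WIDTH = {
    "US7ASCII": 1,
    "WE8MSWIN1252": 1,
    "JA16SJIS": 2,
    "UTF8": 3,
    "AL32UTF8": 4,
}

def data_length(x_chars, charset):
    """Byte width recorded for a VARCHAR2(x CHAR) column: MIN(x * mchw, 4000)."""
    return min(x_chars * MAX_CHAR_WIDTH[charset], MAX_VARCHAR2_BYTES)

def max_storable_chars(x_chars, charset, bytes_per_char):
    """Characters that fit when every character takes bytes_per_char bytes."""
    return min(x_chars, data_length(x_chars, charset) // bytes_per_char)

print(data_length(3000, "AL32UTF8"))            # 4000
print(max_storable_chars(3000, "AL32UTF8", 1))  # 3000
print(max_storable_chars(3000, "AL32UTF8", 3))  # 1333
print(data_length(100, "AL32UTF8"))             # 400
```

Both the character limit and the byte limit apply, and whichever is reached first wins.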
Reference: https://community.oracle.com/tech/developers/discussion/421117/difference-between-varchar2-4000-byte-varchar2-4000-char
