If the DB is using utf8 as charset does it make sense to use varchar2 with byte semantics?
Consider a variable:
l_str varchar2(10 BYTE);
If I later assign values to it based on a query that will return the contents of a 10 char column I have no way of knowing how much bytes those characters will take, in utf8 that might be more than one byte per character.
So when using a multibyte character set shouldn't I always use the following?
l_str varchar2(10 CHAR);
Or to put it another way, is there any reason why you should use varchar2(10 byte) or varchar2(10) in PLSQL?
EDIT: The only reason I can think of for using byte is if you know for sure how many bytes the characters stored will need. So in this case you will allocate less memory.
Depending on the character set I think you just end up only getting half the space you might expect with a more restrictive set for instance. This is confirmed by the documentation
http://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_sqltypes.htm#sthref367
id VARCHAR2(32 BYTE)
The id column contains only single-byte data, up to 32 bytes.
name VARCHAR2(32 CHAR)
The name column contains data in the database character set. If the database character set allows multibyte characters, then the 32 characters can be stored as more than 32 bytes.
I have AL32UTF8 charset with Oracle 11g. Oracle stores byte and char data with different lenght. e.g.--
create table t1(aa varchar2(1));
create table t2(aa varchar2(1 char));
Now execute--
select * from ALL_TAB_COLUMNS where table_name in ('T1','T2');
Check the DATA_LENGTH column where for byte datatype, it has 1 as value and for char it has 4 as value.
You may follow this link for ALL_TAB_COLUMNS columns description.
Related
I use Oracle 12c. I have below table in my DB.
CREATE TABLE TEST_T (COL VARCHAR2(4000 CHAR));
I need insert multibyte characters in this table. The character is 3 byte character.
I am able to insert only 1333 (upto 3999 bytes) characters in table.
My expectation is to insert upto 1500 multibyte characters but I get ORA - 01461.
I don't want to change data type to CLOB or LONG.
Is there any way to use VARCHAR2(4000 CHAR) to achieve this.
Below is the code,
SET SERVEROUTPUT ON
DECLARE
LV_VAR CHAR(1):='プ'; -- 3 byte character
LV_STR VARCHAR2(32000) := '';
BEGIN
FOR I IN 1..1500
LOOP
LV_STR := LV_STR||LV_VAR;
END LOOP;
--
INSERT INTO TEST_T VALUES (LV_STR);
END;
/
Error report -
ORA-01461: can bind a LONG value only for insert into a LONG column
ORA-06512: at line 11
01461. 00000 - "can bind a LONG value only for insert into a LONG column"
*Cause:
*Action:
The problem is that the 4000 byte limit is a hard limit, regardless of whether the datatype is defined as VARCHAR2(4000 CHAR), VARCHAR2(4000 BYTE), or NVARCHAR2(4000). This means that multibyte characters will always have the chance of overflowing a max-size non-CLOB text column.
Oracle's table of Datatype Limits shows each of the VARCHAR2 variants as holding a max of 4000 bytes. And this is precisely the problem you have encountered.
You do have the option of increasing the max size for VARCHAR2 in your Oracle 12c database to 32k.
Here's how to do it: MAX_STRING_SIZE documentation
This is not something to be done without careful consideration: once you change your database to use extended VARCHAR2 strings you cannot go back.
Nevertheless, if your database is all your own and you like the idea of having 32K strings, then this feature was created specifically to address your situation.
Be careful to read the details of pluggable databases, container databases as they require different upgrade techniques. This is a change that cuts across the entire database so you want to get it right.
Use NVARCHAR2 instead of VARCHAR2
NCHAR and NVARCHAR2 are Unicode datatypes that store Unicode character data. The character set of NCHAR and NVARCHAR2 datatypes can only be either AL16UTF16 or UTF8 and is specified at database creation time as the national character set. AL16UTF16 and UTF8 are both Unicode encoding.
The maximum length of an NVARCHAR2 column is 4000 bytes. It can hold up to 4000 characters. The actual data is subject to the maximum byte limit of 4000. The two size constraints must be satisfied simultaneously at run time.
The maximum size for VARCHAR2 is 4000 bytes (VARCHAR2 max size) and is not 4000+ bytes for multibyte characters. You have to change the type to CLOB or NVARCHAR2.
The maximum byte length of an NVARCHAR2 depends on the configured national character set (NVARCHAR2).
I am very new to oracle and today I found about the data type VARCHAR2, and I wanted to learn more about it and google the datatype where I met the problem.
I have gone through few articles about the datatype, and I found out some direct opposite descriptions for VARCHAR2.
DESCRIPTION 1:
When you create a table with a VARCHAR2 column, you specify a maximum
column length (in bytes, not characters) between 1 and 2000 for the
VARCHAR2 column(article)
DESCRIPTION 2:
you can store up to 4000 characters in a VARCHAR2 column. (article)
As you can see it is bit confusing. Is VARCHAR2 is to specify the maximum column length or maximum characters length? Somebody please explain me which one is the correct one?
It depends on your Oracle version, but both articles are mostly incorrect.
When you DECLARE the column, you can either declare the stated length EXPLICITLY as either bytes or characters, or IMPLICITLY using your session's default.
Also, the maximum length is 4000 bytes, NOT characters. Even if you declare VARCHAR2(4000 CHAR), the column cannot store more than 4000 BYTES. It will store 4000 characters if they are all single-byte, otherwise it will store fewer than 4000 characters.
DESCRIPTION 2:
you can store up to 4000 characters in a VARCHAR2 column.
This is correct
The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column.
=> varchar2 datatype is same as varchar datatype.
=> its datatype with variable lengh.
Ex. "name varchar2(20)" and pass the value of name is "Ram" so, LENGTH(name) is 3 NOT 20.
=> its internal datatype managed by oracle server only.
=> even if, you declare varchar oracle implicitely converts to into varchar2
I am getting data from erp systems in the form of feeds ,in particular one column length in feed is 15 only.
In target table also corresponded column also length is varchar2(15) but when I am trying to load same into db it showing error like:
ORA-12899: value too large for column emp_name (actual: 16, maximum:
15)
I cant increase the column length since it is base table in the production.
have a look into this blog, the problem resolved for me by changing the column datatype from varchar(100) to varchar(100 char). in my case the data contains some umlaut characters.
http://gerardnico.com/wiki/database/oracle/byte_or_character
The usual reason for problems like this are non-ASCII characters that can be represented with one byte in the original database but require two (or more) bytes in the target database (due to different NLS settings).
To ensure your target column is large enough for 15 characters, you can modify it:
ALTER TABLE table_name MODIFY column_name VARCHAR2(15 CHAR)
(note the 15 CHAR - you can also use BYTE; if neither is present, the database uses the NLS_LENGTH_SEMANTICS setting as a default).
To check which values are larger than 15 bytes, you can
create a staging table in the target database with the column length set to 15 CHAR
insert the data from the source table into the staging table
find the offending rows with
SELECT * FROM staging WHERE lengthb(mycol) > 15
(note the use of LENGTHB as apposed to LENGTH - the former returns the length in bytes, whereas the latter returns the length in characters)
I found AL32UTF8 as the only valid setting. This varies from standard UTF8 with a few character having supplementary bytes, i.e, the characters are about 99% the same. I am guessing you have character conversion problems going on. In other words the data in table1 was written using one charset, and the new table has a slightly different charset.
If this is true, you have to find the source of the oddball charset. Because this will continue to happen.
Solution to:
ORA-12899: VALUE TOO LARGE FOR COLUMN(ACTUAL,MAXIMUM)
If you are facing problem while updating a column size of a table which already has data more than the new length below is the simple script that would work definitely.
ALTER TABLE TABLE_NAME ADD (NEW_COLUMN_NAME DATATYPE(DATASIZE));
UPDATE TABLE_NAME SET NEW_COLUMN_NAME = SUBSTR(OLD_COLUMN_NAME , 1, NEW_LENGTH);
ALTER TABLE TABLE_NAME DROP COLUMN OLD_COLUMN_NAME ;
ALTER TABLE TABLE_NAME RENAME COLUMN NEW_COLUMN_NAME TO OLD_COLUMN_NAME;
Meaning of the query:
ALTER TABLE TABLE_NAME ADD (NEW_COLUMN_NAME DATATYPE(DATASIZE));
It would just create a new column of the required new length in your existing table.
UPDATE TABLE_NAME SET NEW_COLUMN_NAME = SUBSTR(OLD_COLUMN_NAME , 1, NEW_LENGTH);
It will discard all the values after the new length value from old column values and set the trimmed values into the new column name.
ALTER TABLE TABLE_NAME DROP COLUMN OLD_COLUMN_NAME ;
It will remove the old column name as its absurd now and we have copied all the information into the new column.
ALTER TABLE TABLE_NAME RENAME COLUMN NEW_COLUMN_NAME TO OLD_COLUMN_NAME;
Renaming the new column name to the old column name would help you regain the original table structure except for the new column size as you wished.
Certainly the cause of error is that the value is too large for column data type. However, sometimes it is not visible at first sight. Except "byte versus char" differences mentioned in other answers, there can also be problem with line terminators.
I was trying to load CSV file using SQL*Loader in dockerized Oracle. The foo column of type char(1) was the last column. I got ORA-12899: value too large for column foo (actual: 2, maximum: 1) error despite all values of foo column were of length 1. Later I noticed the CSV file has been edited in Windows editor and accidentally saved with CRLF terminators. Since Linux in Docker container expects just LF, the CR was treated as part of column data.
This error made me confused a little bit.
VARCHAR2(x CHAR) means that the column will hold x characters but not
more than can fit into 4000 bytes. Internally, Oracle will set the
byte length of the column (DBA_TAB_COLUMNS.DATA_LENGTH) to MIN(x *
mchw, 4000), where mchw is the maximum byte width of a character in
the database character set. This is 1 for US7ASCII or WE8MSWIN1252, 2
for JA16SJIS, 3 for UTF8, and 4 for AL32UTF8.
For example, a VARCHAR2(3000 CHAR) column in an AL32UTF8 database will
be internally defined as having the width of 4000 bytes. It will hold
up to 3000 characters from the ASCII range (the character limit), but
only 1333 Chinese characters (the byte limit, 1333 * 3 bytes = 3999
bytes). A VARCHAR2(100 CHAR) column in an AL32UTF8 database will be
internally defined as having the width of 400 bytes. It will hold up
to any 100 Unicode characters.
Reference: https://community.oracle.com/tech/developers/discussion/421117/difference-between-varchar2-4000-byte-varchar2-4000-char
I want to insert Chinese characters in Oracle database.
select length('有个可爱的小娃在旁边') from dual;
10
drop table multibyte;
create table multibyte (name varchar2(10));
insert into multibyte
values('有个可爱的小娃在旁边');
I get a error message saying
An attempt was made to insert or update a column with a value
which is too wide for the width of the destination column.
The name of the column is given, along with the actual width
of the value, and the maximum allowed width of the column.
Note that widths are reported in characters if character length
semantics are in effect for the column, otherwise widths are
reported in bytes
I know that if i increase the column width the problem can be wished away.
My question is when the length function tells me the width is 10 why cant i insert it into a column which is of varchar2(10) ?
The difference is due to the definition of the column: VARCHAR2(10) is equivalent to VARCHAR2(10 BYTE), whereas what you want is VARCHAR2(10 CHAR).
Difference between BYTE and CHAR in column datatypes
Yeah, that is a bit unfortunate. Oracle measures the text column length in bytes, so a varchar2(10) can only store ten bytes, which is about three Chinese characters.
What is the difference between varchar and varchar2?
As for now, they are synonyms.
VARCHAR is reserved by Oracle to support distinction between NULL and empty string in future, as ANSI standard prescribes.
VARCHAR2 does not distinguish between a NULL and empty string, and never will.
If you rely on empty string and NULL being the same thing, you should use VARCHAR2.
Currently VARCHAR behaves exactly the same as VARCHAR2. However, the type VARCHAR should not be used as it is reserved for future usage.
Taken from: Difference Between CHAR, VARCHAR, VARCHAR2
Taken from the latest stable Oracle production version 12.2:
Data Types
The major difference is that VARCHAR2 is an internal data type and VARCHAR is an external data type. So we need to understand the difference between an internal and external data type...
Inside a database, values are stored in columns in tables. Internally, Oracle represents data in particular formats known as internal data types.
In general, OCI (Oracle Call Interface) applications do not work with internal data type representations of data, but with host language data types that are predefined by the language in which they are written. When data is transferred between an OCI client application and a database table, the OCI libraries convert the data between internal data types and external data types.
External types provide a convenience for the programmer by making it possible to work with host language types instead of proprietary data formats. OCI can perform a wide range of data type conversions when transferring data between an Oracle database and an OCI application. There are more OCI external data types than Oracle internal data types.
The VARCHAR2 data type is a variable-length string of characters with a maximum length of 4000 bytes. If the init.ora parameter max_string_size is default, the maximum length of a VARCHAR2 can be 4000 bytes. If the init.ora parameter max_string_size = extended, the maximum length of a VARCHAR2 can be 32767 bytes
The VARCHAR data type stores character strings of varying length. The first 2 bytes contain the length of the character string, and the remaining bytes contain the string. The specified length of the string in a bind or a define call must include the two length bytes, so the largest VARCHAR string that can be received or sent is 65533 bytes long, not 65535.
A quick test in a 12.2 database suggests that as an internal data type, Oracle still treats a VARCHAR as a pseudotype for VARCHAR2. It is NOT a SYNONYM which is an actual object type in Oracle.
SQL> select substr(banner,1,80) from v$version where rownum=1;
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> create table test (my_char varchar(20));
Table created.
SQL> desc test
Name Null? Type
MY_CHAR VARCHAR2(20)
There are also some implications of VARCHAR for ProC/C++ Precompiler options. For programmers who are interested, the link is at: Pro*C/C++ Programmer's Guide
After some experimentation (see below), I can confirm that as of September 2017, nothing has changed with regards to the functionality described in the accepted answer:-
Rextester demo for Oracle 11g:
Empty strings are inserted as NULLs for both VARCHAR
and VARCHAR2.
LiveSQL demo for Oracle 12c: Same results.
The historical reason for these two keywords is explained well in an answer to a different question.
VARCHAR can store up to 2000 bytes of characters while VARCHAR2 can store up to 4000 bytes of characters.
If we declare datatype as VARCHAR then it will occupy space for NULL values. In the case of VARCHAR2 datatype, it will not occupy any space for NULL values. e.g.,
name varchar(10)
will reserve 6 bytes of memory even if the name is 'Ravi__', whereas
name varchar2(10)
will reserve space according to the length of the input string. e.g., 4 bytes of memory for 'Ravi__'.
Here, _ represents NULL.
NOTE: varchar will reserve space for null values and varchar2 will not reserve any space for null values.
Currently, they are the same. but previously
Somewhere on the net, I read that,
VARCHAR is reserved by Oracle to support distinction between NULL and empty string in future, as ANSI standard prescribes.
VARCHAR2 does not distinguish between a NULL and empty string, and never will.
Also,
Emp_name varchar(10) - if you enter value less than 10 digits then remaining space cannot be deleted. it used total of 10 spaces.
Emp_name varchar2(10) - if you enter value less than 10 digits then remaining space is automatically deleted