I am comparing two databases which have similar schema. Both should support unicode characters.
When i describe the same table in both database, db 1 shows all the varchar fields with char, (eg varchar(20 char)) but the db2 shows without char, (varchar(20)
the second schema supports only one byte/char.
When i compare nls_database_parameters and v$nls_parameters in both database its all same.
could some one let me know what may be the change here?
Have you checked NLS_LENGTH_SEMANTICS? You can set the default to BYTE or CHAR for CHAR/VARCHAR2 types.
If these parameters are the same on both datbases then maybe the table was created by explicitly specifying it that way.
Related
I am analyzing Facebook data using Cassandra due to which I ended up having need of multiple languages text in one of my columns.
I am unable to insert text data into Cassandra which is not English:
<stdin>:1:'ascii' codec can't encode character u'\u010c' in position 51: ordinal not in range(128)
<stdin>:1:Invalid syntax at char 7623
I browsed thorough the Internet and found that i need to override coding (link)
but I am not sure how to configure this.
Note : there is a possibility of multiple language in a single row.
Your column seems to be of type ascii, which only supports US-ASCII-encoded text. If you need a wider range of characters, use varchar instead (see here for details on CQL types).
To change the column type, use this ALTER TABLE statement:
ALTER TABLE my_table ALTER my_column TYPE varchar;
See here for details on ALTER TABLE.
Recently I came across a unicode character (\u2019) in a database table column while parsing using Python.
Question: What are the reasons that can result in unicode characters showing up in the database table? Is it data entry issue?
Appreciate any input.
When you set up your Oracle Database you choose a character set which will be used in the SQL char datatypes (char, varchar2 etc).
Suppose you chose your character set and you have a table with a column of VARCHAR2 type. Suddenly you need to store some string with non-ASCII symbols not supported by your database (chosen character set). You may convert this string into ASCII string by calling ASCIISTR function for example and store it in your VARCHAR2 column (but it's not a good idea because many SQL built-in functions don't understand '\u2019' (they think it's just 6 symbols)). That's how Unicode may appear in your table column (ASCIISTR converts non-ascii symbols into unicode representation such as '\u2019').
Another option is special Oracle nchar datatypes which were designed to store UNICODE without altering global database settings.
Here is the link with Oracle documentation: https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch6unicode.htm
I have a column in Oracle to store comments of Nvarchar2(2000). When a user tries to enter beyond 2000 characters, I get the following error:
ORA-00910: specified length too long for its datatype.
The NLS_NCHAR_CHARACTERSET parameter is having AL16UTF16 value.
Is there any way to increase the size to accept up to 6000 characters? My column already has lots of contents, so not sure if I will be able to change the datatype from NVarchar(2000) to any other.
Unless you use Oracle 12c, it's not possible to store more than 2000 characters, see the datatypes description here:
http://docs.oracle.com/cd/B28359_01/server.111/b28320/limits001.htm
Instead, you should use the NCLOB datatype.
If you use 12c, check: http://dbasolved.com/2013/06/26/change-varchar2-to-32k-12c-edition/
I am running following query in SQL*Plus
CREATE TABLE tbl_audit_trail (
id NUMBER(11) NOT NULL,
old_value varchar2(255) NOT NULL,
new_value varchar2(255) NOT NULL,
action varchar2(20) CHARACTER SET latin1 NOT NULL,
model varchar2(255) CHARACTER SET latin1 NOT NULL,
field varchar2(64) CHARACTER SET latin1 NOT NULL,
stamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id NUMBER(11) NOT NULL,
model_id varchar2(65) CHARACTER SET latin1 NOT NULL,
PRIMARY KEY (id),
KEY idx_action (action)
);
I am getting following error:
action varchar2(20) CHARACTER SET latin1 NOT NULL,
*
ERROR at line 5:
ORA-00907: missing right parenthesis
Can you suggest what am I missing?
The simple answer is that, unlike MySQL, character sets can't be defined at column (or table) level. Latin1 is not a valid Oracle character set either.
Character sets are consistent across the database and will have been specified when you created the database. You can find your character by querying NLS_DATABASE_PARAMETERS,
select value
from nls_database_parameters
where parameter = 'NLS_CHARACTERSET'
The full list of possible character sets is available for 11g r2 and for 9i or you can query V$NLS_VALID_VALUES.
It is possible to use the ALTER SESSION statement to set the NLS_LANGUAGE or the NLS_TERRITORY, but unfortunately you can't do this for the character set. I believe this is because altering the language changes how Oracle would display the stored data whereas changing the character set would change how Oracle stores the data.
When displaying the data, you can of course specify the required character set in whichever client you're using.
Character set migration is not a trivial task and should not be done lightly.
On a slight side note why are you trying to use Latin 1? It would be more normal to set up a new database in something like UTF-8 (otherwise known as AL32UTF8 - don't use UTF8) or UTF-16 so that you can store multi-byte data effectively. Even if you don't need it now it's wise to attempt - no guarantees in life - to future proof your database with no need to migrate in the future.
If you're looking to specify differing character sets for different columns in a database then the better option would be to determine if this requirement is really necessary and to try to remove it. If it is definitely necessary1 then your best bet might be to use a character set that is a superset of all potential character sets. Then, have some sort of check constraint that limits the column to specific hex values. I would not recommend doing this at all, the potential for mistakes to creep in is massive and it's extremely complex. Furthermore, different character sets render different hex values differently. This, in turn, means that you need to enforce that a column is rendered in a specific character, which is impossible as it falls outside the scope of the database.
1. I'd be interested to know the situation
According to provided DDL statement it's some need to use 2 character sets. The implementation of this functionality in Oracle is different from MySQL and done with n* data types like nvarchar2, nchar... Latin1 is similar to some Western European character set that might be default. So you able to define for example "Latin1" (WE**) and some Unicode (UTF8..).
The NVARCHAR2 datatype was introduced by Oracle for databases that want to use Unicode for some columns while keeping another character set for the rest of the database (which uses VARCHAR2). The NVARCHAR2 is a Unicode-only datatype.
The reason you want to use NVARCHAR2 might be that your DB uses a non-Unicode character and you still want to be able to store Unicode data for some columns.
Columns in your example would be able to store the same data, however the byte storage will be different.
I have a query that has
... WHERE PRT_STATUS='ONT' ...
The prt_status field is defined as CHAR(5) though. So it's always padded with spaces. The query matches nothing as the result. To make this query work I have to do
... WHERE rtrim(PRT_STATUS)='ONT'
which does work.
That's annoying.
At the same time, a couple of pure-java DBMS clients (Oracle SQLDeveloper and AquaStudio) I have do NOT have a problem with the first query, they return the correct result. TOAD has no problem either.
I presume they simply put the connection into some compatibility mode (e.g. ANSI), so the Oracle knows that CHAR(5) expected to be compared with no respect to trailing characters.
How can I do it with Connection objects I get in my application?
UPDATE I cannot change the database schema.
SOLUTION It was indeed the way Oracle compares fields with passed in parameters.
When bind is done, the string is passed via PreparedStatement.setString(), which sets type to VARCHAR, and thus Oracle uses unpadded comparision -- and fails.
I tried to use setObject(n,str,Types.CHAR). Fails. Decompilation shows that Oracle ignores CHAR and passes it in as a VARCHAR again.
The variant that finally works is
setObject(n,str,OracleTypes.FIXED_CHAR);
It makes the code not portable though.
The UI clients succeed for a different reason -- they use character literals, not binding. When I type PRT_STATUS='ONT', 'ONT' is a literal, and as such compared using padded way.
Note that Oracle compares CHAR values using blank-padded comparison semantics.
From Datatype Comparison Rules,
Oracle uses blank-padded comparison
semantics only when both values in the
comparison are either expressions of
datatype CHAR, NCHAR, text literals,
or values returned by the USER
function.
In your example, is 'ONT' passed as a bind parameter, or is it built into the query textually, as you illustrated? If a bind parameter, then make sure that it is bound as type CHAR. Otherwise, verify the client library version used, as really old versions of Oracle (e.g. v6) will have different comparison semantics for CHAR.
If you cannot change your database table, you can modify your query.
Some alternatives for RTRIM:
.. WHERE PRT_STATUS like 'ONT%' ...
.. WHERE PRT_STATUS = 'ONT ' ... -- 2 white spaces behind T
.. WHERE PRT_STATUS = rpad('ONT',5,' ') ...
I would change CHAR(5) column into varchar2(5) in db.
You can use cast to char operation in your query:
... WHERE PRT_STATUS=cast('ONT' as char(5))
Or in more generic JDBC way:
... WHERE PRT_STATUS=cast(? as char(5))
And then in your JDBC code do use statement.setString(1, "ONT");