CLOB in an ASCII characterset database contains non-ASCII characters - how?

I am working with an Oracle 12.2 database. The database character set is WE8MSWIN1252 (i.e. an ASCII character set).
The database contains a table with a CLOB column (according to Oracle SQL Developer). Some values in this column contain non-ASCII characters; I know this because applying the ASCIISTR function to the column shows the escaped non-ASCII character codes.
How is this possible? I thought ASCII character set databases could only store Unicode in NVARCHAR, NCLOB, etc.
(I only discovered this when querying the Oracle database from SQL Server over a linked server: an OPENQUERY on the table with the CLOB returned ? for the non-ASCII characters. When I changed the OPENQUERY query string to use TO_NCLOB(clob_column), it returned the non-ASCII characters correctly.)
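For reference, the shape of the two queries (the linked server and table names here are placeholders):

-- Returned ? for the non-ASCII characters:
SELECT * FROM OPENQUERY(ORA_LINK, 'SELECT clob_column FROM some_table');
-- Returned the non-ASCII characters correctly:
SELECT * FROM OPENQUERY(ORA_LINK, 'SELECT TO_NCLOB(clob_column) FROM some_table');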
Any ideas?
Thanks

From the Wikipedia description of WE8MSWIN1252:
Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and many European languages including Spanish, French, and German.
So a CLOB in a database with this character set can store strings like éàè, and ASCIISTR returns escaped codes because these characters are not defined in ASCII. For example:
SQL> select asciistr('é') eaccent, asciistr('e') e from dual;
EACCENT E
---------- -
\FFFD\FFFD e

(Note: \FFFD is the Unicode replacement character; in this run the client session mangled the é before it reached the database. With a correctly configured client, ASCIISTR('é') returns \00E9.)
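To find which rows of the table actually contain non-ASCII characters, a minimal sketch (table and column names are hypothetical):

-- ASCIISTR escapes every non-ASCII character as \XXXX, so the escaped
-- text differs from the original exactly when non-ASCII is present.
-- DBMS_LOB.SUBSTR limits the check to the first 4000 characters.
SELECT id
FROM   my_table
WHERE  DBMS_LOB.SUBSTR(clob_column, 4000, 1)
       <> ASCIISTR(DBMS_LOB.SUBSTR(clob_column, 4000, 1));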

Related

national characters in Oracle

We are using Oracle 19c. The settings in nls_database_parameters are:
NLS_NCHAR_CHARACTERSET is UTF8
NLS_CHARACTERSET is WE8ISO8859P15
I have a table with one VARCHAR2 column and another NVARCHAR2 column.
I try to insert the same non-English letter into both columns, for example ş, and it is not working; but if I try another non-English letter from my language, like ž, then it works in both columns. Other colleagues of mine cannot insert any letter correctly using the same database user. I don't understand this behavior; what defines what you can insert as a national character?
We receive a big list of different cities in different languages. What is the best way to insert all of them correctly?
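A minimal diagnostic for this (\015F is ş, \017E is ž): build the exact character with UNISTR in the national character set, then force it through the database character set and back; a character missing from that set does not survive the round trip.

-- TO_CHAR converts NVARCHAR2 data into the database character set
-- (WE8ISO8859P15 here); TO_NCHAR converts it back. If the result no
-- longer equals the original, the character cannot be stored in VARCHAR2.
SELECT CASE
         WHEN UNISTR('\015F') = TO_NCHAR(TO_CHAR(UNISTR('\015F')))
         THEN 'representable'
         ELSE 'not representable in the database character set'
       END AS s_cedilla_check
FROM   dual;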

Oracle database - identify and convert special character issues

I have some issues related to special characters in some tables. For example, some words with the character ü were inserted into the database as NŒ. Is there a way to find these Unicode problems and convert them in a table?
I also checked NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET in the v$nls_parameters view and they look fine (AL32UTF8 and UTF8).
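One way to start hunting for affected rows (hypothetical table and column names): flag everything non-ASCII, then inspect it for mojibake.

-- ASCIISTR leaves pure-ASCII strings unchanged, so any difference
-- means the row contains at least one non-ASCII character.
SELECT id, text_col
FROM   my_table
WHERE  text_col <> ASCIISTR(text_col);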

Special Characters not getting displayed in Oracle tables

I have data which contains special characters like à, ç, è, etc.
I am trying to insert the data into tables having these characters. The data gets inserted without any issues, but these characters are replaced with ? / ?? when stored in the tables.
How should I resolve this issue? I want to store these characters in my tables.
Is it related to NLS parameters?
Currently the NLS character set is AL32UTF8, as seen from the V$NLS_PARAMETERS view.
Is there any specific table/column to be checked? Or is it something in the database settings?
Kindly advise.
Thanks in advance.
From the comments: It is not required that the column be NVARCHAR (resp. NVARCHAR2), because your database character set is AL32UTF8, which supports any Unicode character.
Set your NLS_LANG variable to AMERICAN_AMERICA.AL32UTF8 before you launch your SQL*Plus. You may change the language and/or territory to your own preferences.
Ensure you select a font which is able to display the special characters.
Note, the client character set AL32UTF8 is determined by your local environment (e.g. the LANG variable, such as en_US.UTF-8), not by the database character set.
Check also this answer for more information: OdbcConnection returning Chinese Characters as "?"
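For example, on a Linux or macOS client (on Windows use set instead of export):

export NLS_LANG=AMERICAN_AMERICA.AL32UTF8   # must match what the terminal actually sends
sqlplus user@db                             # hypothetical connect string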

how to store € in Oracle WE8ISO8859P1

We have a customer who wants to store the '€' symbol through our application in an Oracle VARCHAR2 column. Their database character set is WE8ISO8859P1.
Our customer does not want to change their database character set to WE8MSWIN1252, which supports storing the '€' symbol.
And we do not want to change the data type to NVARCHAR in the short term.
Is there any easy way to work around this and store € in Oracle WE8ISO8859P1?
Thanks,
Sorry, this character set is older than the euro. You can upgrade to WE8ISO8859P15, or try to use the unused hex values 80–9F (CHR(128)–CHR(159)); then you'd have to replace them on the client...
Another possibility is to store the text in its UTF-8 encoded representation.
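A sketch of the CHR(128) workaround mentioned above (table and column names are hypothetical; the application must translate byte 0x80 back to € when reading):

-- Store the otherwise-unused byte 0x80 as a stand-in for the euro sign.
INSERT INTO prices (label) VALUES ('100 ' || CHR(128));

-- Verify which raw byte was actually stored:
SELECT DUMP(label) FROM prices;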

How did the unicode characters endup in the database table column?

Recently I came across a Unicode character (\u2019) in a database table column while parsing it with Python.
Question: What are the reasons that can result in Unicode characters showing up in a database table? Is it a data entry issue?
Appreciate any input.
When you set up your Oracle database, you choose a character set which will be used for the SQL character datatypes (CHAR, VARCHAR2, etc.).
Suppose you chose your character set and you have a table with a column of VARCHAR2 type. Suddenly you need to store a string with non-ASCII symbols not supported by your database (that is, by the chosen character set). You may convert this string into an ASCII string, for example by calling the ASCIISTR function, and store it in your VARCHAR2 column (but it's not a good idea, because many SQL built-in functions don't understand such escapes: they see '\2019' as just five literal characters). That's how Unicode escapes may appear in your table column: ASCIISTR converts non-ASCII symbols into an escaped representation such as '\2019' (other layers, e.g. Python or JSON, use the similar '\u2019' notation).
Another option is the special Oracle NCHAR datatypes, which were designed to store Unicode without altering global database settings.
Here is the link with Oracle documentation: https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch6unicode.htm
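A quick illustration of that escaping; the "escaped" column below is plain ASCII text, not a real Unicode character:

-- U+2019 is the right single quotation mark from the question.
SELECT ASCIISTR(UNISTR('\2019'))         AS escaped,  -- the 5-character string '\2019'
       LENGTH(ASCIISTR(UNISTR('\2019'))) AS len       -- 5: built-ins see literal text
FROM   dual;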
