Oracle database - identify and convert special character issues - oracle

I have some issues related to special characters in some tables. For example, some words with character ü were inserted in database as NŒ. Is there a way to find this unicode problems and convert it in a table?
I also checked NLS_CHARACTERSET and LS_NCHAR_CHARACTERSET v$nls_parameters table and it looks fine (AL32UTF8 and UTF8).

Related

CLOB in an ASCII characterset database contains non-ASCII characters - how?

I am working with an Oracle 12.2 database. The database characterset is WE8MSWIN1252 (ie. an ASCII characterset).
The database contains a table with a CLOB column (according to Oracle SQL Developer). Some values in this column contain non-ASCII characters (I know this as when using ASCIISTR function on this column I can see the escaped non-ASCII character codes).
How is this possible? I thought ASCII characterset databases could only store unicode in NVARCHAR, NCLOB etc.
(I only discovered this when I was using a linked server to the Oracle db from SQL Server - when I ran an OPENQUERY on the table with the CLOB, it returned ? for the non-ASCII characters. I changed the OPENQUERY query string to use TO_NCLOB(clob_column) and it returned the non-ASCII characters.)
Any ideas?
Thanks
From wikipedia WE8MSWIN1252 description:
Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and many European languages including Spanish, French, and German.
So, it a CLOB in a database with this charset can store strings like éàè. And ASCIISTR returns escaped codes because these chars are not defined in ASCII, for example:
SQL> select asciistr('é') eaccent, asciistr('e') e from dual;
EACCENT E
---------- -
\FFFD\FFFD e

Special character conversion issue in Datastage

In Datastage, we have source system as Oracle and target system as Netezza. In Oracle the column datatype is varchar whereas in Netezza it is nvarchar. Most of the characters are Latin and Dutch.
We are getting character in our table row which is exactly opposite to the one mentioned in bracket (`) means it's heading towards right and slanting on left(mostly dutch). We feel it is Dutch character which represent apostrophe. The table consists of million records and many values in table have this special character. We want to process the value as it is but we are getting garbage value. Can anyone help us in which conversion function we should try?
I tried iso-8859-1 and iso-8859-15

Special Characters not getting displayed in Oracle tables

I have data which contains special characters like à ç è etc..
I am trying to insert the data into tables having these characters. Data gets inserted without any issues but these characters are replaced with with ?/?? when stored in tables
How should I resolve this issue?I want to store these characters in my tables.
Is it related to NLS parameters?
Currently the NLS characterset is having AL32UTF8 as seen from V$Nls_parameters table.
Is there any specific table/column to be checked ? Or is it something at the database settings ?
Kindly advise.
Thank in advance
From the comments: It is not required that column must be NVARCHAR (resp. NVARCHAR2), because your database character set is AL32UTF8 which supports any Unicode character.
Set your NLS_LANG variable to AMERICAN_AMERICA.AL32UTF8 before you launch your SQL*Plus. You may change the language and/or territory to your own preferences.
Ensure you select a font which is able to display the special characters.
Note, client character set AL32UTF8 is determined by your local LANG variable (i.e. en_US.UTF-8), not by the database character set.
Check also this answer for more information: OdbcConnection returning Chinese Characters as "?"

Oracle enconding inverted interrogation characters

I do find a problem with enconding of characters into Oracle
They are two inverted interrogation characters coming from imported data.
How can i search for two inverted characters into Oracle so i can see how many lines have this problem ?
I assume by "inverted interrogation character" you mean character ¿.
There are two possibilities:
Character ¿ is actually stored in your database, because your database character set (check with SELECT * FROM V$NLS_PARAMETERS WHERE PARAMETER LIKE '%CHARACTERSET') is not capable to support special characters you tried to import.
You can find affected rows with SELECT * FROM TABLE_NAME WHERE REGEXP_LIKE(COL_NAME, '¿');
Your client (e.g. SQL*Plus) is not able to display the special character and substitute those by placeholder ¿. In this case set your NLS_LANG value properly, see this answer for more details.
Unfortunately you did not tell us how you imported your data nor any of your characters sets. Thus I cannot provide you a guideline to set NLS_LANG properly.

How did the unicode characters endup in the database table column?

Recently I came across a unicode character (\u2019) in a database table column while parsing using Python.
Question: What are the reasons that can result in unicode characters showing up in the database table? Is it data entry issue?
Appreciate any input.
When you set up your Oracle Database you choose a character set which will be used in the SQL char datatypes (char, varchar2 etc).
Suppose you chose your character set and you have a table with a column of VARCHAR2 type. Suddenly you need to store some string with non-ASCII symbols not supported by your database (chosen character set). You may convert this string into ASCII string by calling ASCIISTR function for example and store it in your VARCHAR2 column (but it's not a good idea because many SQL built-in functions don't understand '\u2019' (they think it's just 6 symbols)). That's how Unicode may appear in your table column (ASCIISTR converts non-ascii symbols into unicode representation such as '\u2019').
Another option is special Oracle nchar datatypes which were designed to store UNICODE without altering global database settings.
Here is the link with Oracle documentation: https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch6unicode.htm

Resources