Is this normal behavior for a Unicode (UTF-8) database, or is it a bug? - h2

I am using H2 1.4.197 as an in-memory database and have a VARCHAR matching issue with a national character (İ uppercase, i lowercase). After some research I put together a small test query against DUAL to illustrate the problem.
Is this the expected result for the query below?
select lower('İ') as lower_from_upper,
       STRINGTOUTF8(lower('İ')) as lower_from_upper_utf,
       lower('i') as lower_from_lower,
       STRINGTOUTF8(lower('i')) as lower_from_lower_utf
from dual;
Result:
lower_from_upper  lower_from_upper_utf  lower_from_lower  lower_from_lower_utf
i̇                 69cc87                i                 69
At first glance they look like the same character in the output, but the UTF-8 columns give different byte values for them.
That is all, thanks in advance!
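For reference, this is Unicode's special case mapping at work: lowercasing U+0130 (İ, Latin capital I with dot above) produces U+0069 (i) followed by U+0307 (combining dot above), i.e. two code points, which is exactly the 69 cc 87 byte sequence in the result above. A quick check in Python, which follows the same Unicode rules:

```python
# Lowercasing U+0130 yields two code points: 'i' plus a combining dot above.
lowered = "İ".lower()
print([hex(ord(c)) for c in lowered])   # ['0x69', '0x307']
print(lowered.encode("utf-8").hex())    # 69cc87
print("i".encode("utf-8").hex())        # 69
```

So the two strings are canonically different, and a plain byte or code-point comparison will never match them unless one side is normalized first.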

Related

Special character displayed as question mark in RDF

I am using Oracle Reports 12c. While running a report, a special character in a VARCHAR2 column is displayed as ? in the PDF output of the report. Please help.
If
- the "special character" is correctly saved into the database (which means that the database's NLS settings know how to properly interpret those characters; UTF8 usually helps), and
- the report (the RDF file), in its Paper Layout editor, uses a font which is capable of properly displaying those "special characters",
then there shouldn't be any problems.
Therefore, check what's really stored into the table (because, you might think that you stored characters properly, but - in reality - you stored garbage) and that you chose correct font.
P.S. As reports run on server (WebLogic, right?), such a font should be installed there as well, not only on your own PC, i.e. a computer where you develop reports.
If you can connect to the database directly using something like SQL Developer, try running the query below but replacing Test String with the actual string you are having issues displaying. It may help identify what the character is that is giving you issues.
Most likely, the issue has to do with a character encoding or character set, either in the database or in the reports application, not supporting the character you are trying to display.
Query
This should help identify what the problematic character is
WITH
  test_string AS (SELECT 'Test String' AS val FROM DUAL),
  chars AS (
    SELECT SUBSTR(ts.val, LEVEL, 1) AS single_char
      FROM test_string ts
    CONNECT BY LEVEL <= LENGTH(ts.val))
SELECT single_char,
       ASCII(single_char) AS ascii_code,
       CONVERT(single_char, 'AL32UTF8') AS char_as_al32utf8
  FROM chars;
Result
SINGLE_CHAR ASCII_CODE CHAR_AS_AL32UTF8
______________ _____________ ___________________
T 84 T
e 101 e
s 115 s
t 116 t
32
S 83 S
t 116 t
r 114 r
i 105 i
n 110 n
g 103 g
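For comparison, the same per-character dump can be done outside the database. A small Python sketch (dump_chars is just an illustrative name) that lists each character with its code point and UTF-8 bytes:

```python
# List each character of a string with its decimal code point and UTF-8 hex,
# mirroring what the SQL above does with ASCII() and CONVERT().
def dump_chars(s):
    return [(ch, ord(ch), ch.encode("utf-8").hex()) for ch in s]

for ch, code, utf8_hex in dump_chars("Test String"):
    print(f"{ch!r}\t{code}\t{utf8_hex}")
```

Any character whose code point is above 127 (or whose UTF-8 form is more than one byte) is a candidate for the display problem.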

Can't insert the ñ character into an Oracle database

I've got an issue inserting the ñ character into an Oracle database.
The INSERT operation completes successfully.
However, when I SELECT, I get n instead of ñ.
Also, I noticed executing:
select 'ñ' from dual;
gives me 'n'.
select * from nls_database_parameters where parameter='NLS_CHARACTERSET';
gives me 'EE8MSWIN1250'.
How do I insert ñ?
I'd like to avoid modifying db settings.
The only way I got this working was:
1. read the input .csv file with its windows-1250 encoding
2. change the encoding to Unicode
3. change strings to their Unicode code representation (in which ñ is \00d1)
4. execute the INSERT/UPDATE passing the value from step 3 to the UNISTR function
Surely there is an easier way to achieve this.
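The escaping step (rewriting each non-ASCII character as a \XXXX escape for UNISTR) can be sketched in Python; unistr_escape is a hypothetical helper name, not part of any library. Note that lowercase ñ is U+00F1, while \00d1 in the question corresponds to the uppercase Ñ:

```python
def unistr_escape(s):
    # Replace each non-ASCII character with the \XXXX form UNISTR expects,
    # leaving plain ASCII characters as they are.
    return "".join(ch if ord(ch) < 128 else "\\%04x" % ord(ch) for ch in s)

print(unistr_escape("año"))  # a\00f1o
```

The escaped value can then be used as, for example, UNISTR('a\00f1o') in the INSERT.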
The simplest way would be to find the character's code using ASCII() and then use CHR() to convert it back to a string on insert.
SQL:
select ascii('ñ'),chr(50097) from dual;
Output:
ASCII('Ñ') CHR(50097)
---------- ----------
50097 ñ
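The value 50097 here is 0xC3B1, i.e. the two UTF-8 bytes of ñ read as a single big-endian number, which suggests this answer was tested on an AL32UTF8 database. A quick check in Python:

```python
# ñ encodes to the UTF-8 bytes C3 B1; read together as one big-endian
# integer they give the 50097 returned by Oracle's ASCII() above.
code = int.from_bytes("ñ".encode("utf-8"), "big")
print(code, hex(code))  # 50097 0xc3b1
```

This trick only round-trips when CHR() is interpreted in the same character set that produced the number, so it does not help on an EE8MSWIN1250 database.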
The client you are using to connect to the database matters. Some clients do not support these characters.
Before you start your SQL*Plus you have to set the codepage (using chcp command) and NLS_LANG environment parameter accordingly:
chcp 1250
set NLS_LANG=.EE8MSWIN1250
sqlplus ...
However, as already given in the comments, Windows-1250 does not support the character ñ, so these values will not work.
It is not compulsory to set these values equal to your database character set; however, the codepage and NLS_LANG must match. The following should work as well (as they all support ñ):
chcp 1252
set NLS_LANG=.WE8MSWIN1252
sqlplus ...
or
chcp 850
set NLS_LANG=.WE8PC850
sqlplus ...
Again, EE8MSWIN1250 does not support the character ñ; unless the data type of your column is NVARCHAR2 (or NCLOB, respectively), it is not possible to store ñ in the database.
Is that character supported by the database character set?
I could not find that character here: https://en.wikipedia.org/wiki/Windows-1250
I had no problems inserting that character in my database which has character set WE8MSWIN1252.
https://en.wikipedia.org/wiki/Windows-1252
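The character-set claims above are easy to verify with Python's built-in codecs; Windows-1250 rejects ñ while Windows-1252 and code page 850 accept it (supports is just an illustrative helper name):

```python
# ñ is absent from Windows-1250, but present in Windows-1252 and
# DOS code page 850, so only cp1250 fails to encode it.
def supports(codec, ch):
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

for codec in ("cp1250", "cp1252", "cp850"):
    print(codec, supports(codec, "ñ"))
```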

Inserting French character into Oracle gets converted into some junk characters

Using PL/SQL Developer, I'm able to insert French character in my Oracle database without any error.
Querying:
SELECT * FROM nls_database_parameters WHERE parameter = 'NLS_NCHAR_CHARACTERSET';
Output: AL16UTF16
But when I retrieve the data using a SELECT statement, it gets converted into junk characters. For example:
système gets converted to système and so on....
Any suggestion/workaround will be appreciated.
The issue was due to different values in NLS_LANGUAGE at client and server.
At server it was: AMERICAN
Use the following query to read the parameters:
SELECT * FROM nls_database_parameters
At client it was: AMERICAN_AMERICA.WE8MSWIN1252
In PL/SQL Developer Help->About, click on Additional Info button and scroll down.
Another thing I observed while trying to fix the issue:
The characters were not converted to junk characters on the first update.
But when I retrieved them (containing non-ASCII characters) and updated again, they were converted to junk characters.
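The système-to-système pattern is the classic signature of UTF-8 bytes being re-decoded as Windows-1252, matching the client's WE8MSWIN1252 setting; it can be reproduced in Python:

```python
# Encoding the text as UTF-8 and decoding those bytes as Windows-1252
# reproduces the mojibake from the question: è (C3 A8) becomes Ã¨.
garbled = "système".encode("utf-8").decode("cp1252")
print(garbled)  # systÃ¨me
```

Repeating the round trip on already-garbled text garbles it further, which matches the observation that a second update made things worse.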

Why Oracle ascii function return >255 code?

When using Oracle ascii function:
select ascii('A') from dual;
It returns 65, which is right.
But when I use:
select ascii('周') from dual;
The return is 55004. Can ASCII represent values greater than 255?
How is this explained?
Help!!!!
My Oracle version: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
My character set: NLS_CHARACTERSET ZHS16GBK
ASCII in the name is a holdover from when Oracle only supported ASCII. It does not mean it only returns ASCII values.
From the docs:
ASCII returns the decimal representation in the database character set of the first character of char.
http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions013.htm#sthref933
So the result depends on the database character set, which can be greater than 255.
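For the example in the question: 55004 is 0xD6DC, the two ZHS16GBK (GBK) bytes of 周 read as one big-endian number; Python's gbk codec can confirm this:

```python
# 周 encodes to two GBK bytes; read as a big-endian integer they give
# the 55004 that ASCII() returned on a ZHS16GBK database.
code = int.from_bytes("周".encode("gbk"), "big")
print(code, hex(code))  # 55004 0xd6dc
```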
This may vary with your version of Oracle, but it is probably trying to do you the favor of gracefully handling the non-7-bit-ASCII value that you are passing (but should not be). The documentation in at least one version discusses some handling of non-ASCII inputs (http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions007.htm), though if you are using a different version of Oracle you may want to refer to the appropriate docs.
If your docs don't say anything more about how it handles non-7-bit characters, then the behavior is probably not well defined (i.e. no guarantee from Oracle), and you may want to consider cleansing your input so you only call the ASCII function on values that you know to be in the proper input set.

UTF 8 from Oracle tables

The client has asked for a number of tables to be extracted into CSVs; all done, no problem. They've just asked that we make sure the files are always in UTF-8 format.
How do I check this is actually the case? Or, even better, force it to be so. Is it something I can set in a procedure before running a query, perhaps?
The data is extracted from an Oracle 10g database.
What should I be checking?
Thanks
You can check the database character set with the following query:
select value from nls_database_parameters
where parameter='NLS_CHARACTERSET'
If it says AL32UTF8 then your database is in the format you need, and if the export does not corrupt it then you are done.
You may read about Oracle globalization support here, and here about NLS parameters like the above.
How, exactly, are you generating the CSV files? Depending on the exact architecture, there will be different answers.
If you are, for example, using SQL*Plus to extract the data, you would need to set the NLS_LANG on the client machine to something appropriate (i.e. AMERICAN_AMERICA.AL32UTF8) to force the data to be sent to the client machine in UTF-8. If you are using other approaches, NLS_LANG may or may not be important.
What you have to look for is that the eight-bit characters in the input (if any) are translated into two-byte UTF-8 sequences.
This is highly dependent on your local code page, but typically:
"£", which is x'A3' in an eight-bit code page, magically becomes x'C2A3' in UTF-8.
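That £ example can be checked directly: U+00A3 is the single byte A3 in an eight-bit code page such as ISO-8859-1, but the two bytes C2 A3 in UTF-8:

```python
# £ is one byte (a3) in ISO-8859-1 but two bytes (c2 a3) in UTF-8.
print("£".encode("latin-1").hex())  # a3
print("£".encode("utf-8").hex())    # c2a3
```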
Ok it wasn't as simple as I first hoped. The query above returns AL32UTF8.
I am using a stored proc compiled on the database to loop through a list of table names held in an array inside the stored procedure.
I use DBMS_SQL package to build the SQL and UTL_FILE.PUT_NCHAR to insert data into a text file.
I believed my resulting output would then be in UTF-8; however, opening it in TextPad says it's ANSI and the data is garbled in places :)
Cheers
It might be important that NLS_CHARACTERSET is AL32UTF8 and NLS_NCHAR_CHARACTERSET is AL16UTF16
