UTF 8 from Oracle tables - oracle

The client has asked for a number of tables to be extracted into CSV files, which was done without any problem. They have now asked that we make sure the files are always in UTF-8 format.
How do I check that this is actually the case? Or, even better, how do I force it to be so? Is it something I can set in a procedure before running a query, perhaps?
The data is extracted from an Oracle 10g database.
What should I be checking?
Thanks

You can check the database character set with the following query:
select value from nls_database_parameters
where parameter='NLS_CHARACTERSET'
If it returns AL32UTF8, the database character set is already UTF-8, and provided the export does not convert the data on the way out, you are done.
You may read about Oracle globalization support here, and here about NLS parameters like the above.

How, exactly, are you generating the CSV files? Depending on the exact architecture, there will be different answers.
If you are, for example, using SQL*Plus to extract the data, you would need to set NLS_LANG on the client machine to something appropriate (e.g. AMERICAN_AMERICA.AL32UTF8) to force the data to be sent to the client machine in UTF-8. If you are using other approaches, NLS_LANG may or may not be important.
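Whichever way the files are produced, the result can also be checked outside the database. A minimal sketch in Python, assuming the file is small enough to read into memory; the function name is made up for illustration:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# '£' encoded as UTF-8 is a valid two-byte sequence ...
print(is_valid_utf8("£".encode("utf-8")))  # True
# ... while the lone byte 0xA3 (a single-byte '£') is not valid UTF-8.
print(is_valid_utf8(b"\xa3"))              # False
```

For a real export, pass the result of open(path, "rb").read(). A file that contains only 7-bit ASCII also passes this check, so the test is only meaningful on data that includes non-ASCII characters.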

What you have to check is that any eight-bit (non-ASCII) characters in the input are translated into multi-byte UTF-8 sequences.
This depends heavily on your local code page, but typically:
"£" is x'A3' in a single-byte code page such as ISO 8859-1 or Windows-1252 and becomes x'C2A3' in UTF-8.
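These byte values can be reproduced with Python's built-in codecs. This is only an illustration, with ISO 8859-1 (latin-1) assumed as the single-byte code page:

```python
pound = "£"
single_byte = pound.encode("latin-1")  # assumed single-byte code page
utf8 = pound.encode("utf-8")

print(single_byte.hex().upper())  # A3
print(utf8.hex().upper())         # C2A3
```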

Ok it wasn't as simple as I first hoped. The query above returns AL32UTF8.
I am using a stored proc compiled on the database to loop through a list of table names held in an array inside the stored procedure.
I use the DBMS_SQL package to build the SQL and UTL_FILE.PUT_NCHAR to write the data to a text file.
I expected the resulting output to be in UTF-8; however, opening it in TextPad shows it as ANSI and the data is garbled in places :)
Cheers
It might be important that NLS_CHARACTERSET is AL32UTF8 and NLS_NCHAR_CHARACTERSET is AL16UTF16
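One pattern that produces this kind of garbling is UTF-8 bytes being displayed by an editor that assumes a single-byte "ANSI" code page. The sketch below only illustrates that effect, with Windows-1252 as the assumed code page and a made-up sample value; it is not a diagnosis of what UTL_FILE actually wrote here.

```python
text = "£100"
utf8_bytes = text.encode("utf-8")

# An editor that assumes Windows-1252 shows each byte as its own character.
as_seen_in_ansi_editor = utf8_bytes.decode("cp1252")
print(as_seen_in_ansi_editor)      # Â£100

# Decoding the same bytes as UTF-8 gives the original text back.
print(utf8_bytes.decode("utf-8"))  # £100
```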

Related

Compilation converts utf-8 characters to question mark in PLSQL developer

I am using PL/SQL Developer to work with an Oracle database. When I compile a view which contains code like this:
select 'xidmət' as service from dual
the 'ə' character in the string becomes '?' character.
I think it's some Oracle or PL/SQL Developer configuration problem, but I don't know what. What do you think the problem is?
First, check what your character set is using this:
select value from nls_database_parameters where parameter='NLS_CHARACTERSET';
Then set your NLS_LANG environment variable to AMERICAN_AMERICA.CHARSET where CHARSET is the value found with that select. If you are in Windows, you will have to go to Control Panel, System, Advanced, Environment Variables, and set NLS_LANG under System variables.
Oracle does at least a 'one-pass' conversion between client and database, but the problem is that there are so many layers between client and database, including your client software, that it is usually better to match your client NLS_LANG with the database setting.
It also depends how that character was inserted. It might have been inserted using a different client tool using a different NLS_LANG, so you might have to update your extended ASCII characters (or foreign characters) before you get a consistent view from your select.
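The question-mark symptom is what a lossy character set conversion looks like. As an illustration only, with Windows-1252 assumed as the client character set (the actual setting is not given in the question), Python's codecs show the same substitution:

```python
literal = "xidmət"

# 'ə' (U+0259) has no code point in Windows-1252, so a lossy conversion
# substitutes a replacement character.
converted = literal.encode("cp1252", errors="replace").decode("cp1252")
print(converted)  # xidm?t

# With a Unicode encoding on both ends nothing is lost.
print(literal.encode("utf-8").decode("utf-8"))  # xidmət
```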
You need to configure your system's NLS_LANG parameter according to your language preferences. Here's a link:
http://www.nazmulhuda.info/setting-nls_lang-environment-variable-for-windows-and-unix-for-oracle-database
For example, we have letters like "ąčęėįšųū". In order to see them in PL/SQL Developer, we set NLS_LANG to "LITHUANIAN_LITHUANIA.BLT8MSWIN1257".
Hope it helps. Good luck.

Why does the Oracle ASCII function return a code greater than 255?

When I use the Oracle ASCII function:
select ascii('A') from dual;
it returns 65, which is right.
But when I use:
select ascii('周') from dual;
it returns 55004. Can ASCII represent values greater than 255?
How can this be explained?
Help!
My Oracle version: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
My character set: NLS_CHARACTERSET ZHS16GBK
ASCII in the name is a holdover from when Oracle only supported ASCII. It does not mean it only returns ASCII values.
From the docs:
ASCII returns the decimal representation in the database character set of the first character of char.
http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions013.htm#sthref933
So the result depends on the database character set, and in a multi-byte character set the value can be greater than 255.
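The value 55004 can be reproduced outside Oracle. A small sketch using Python's gbk codec, which is assumed here to agree with ZHS16GBK for this character:

```python
ch = "周"
gbk_bytes = ch.encode("gbk")

# The two-byte GBK encoding, read as one big-endian integer.
print(gbk_bytes.hex().upper())           # D6DC
print(int.from_bytes(gbk_bytes, "big"))  # 55004

# A plain ASCII character occupies one byte and gives the familiar value.
print(int.from_bytes("A".encode("gbk"), "big"))  # 65
```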
This may vary with your version of Oracle, but it is probably trying to do you the favor of gracefully handling the non-7-bit ASCII value that you are passing (but should not be). The documentation for at least one version discusses some handling of non-ASCII input (http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions007.htm), though if you are using a different version of Oracle you may want to refer to the appropriate docs.
If your docs say nothing more about how it handles non-7-bit characters, then the behavior is probably not well defined (i.e. no guarantee from Oracle) and you may want to consider cleansing your input so that you only call the ASCII function on values that you know to be in the proper input set.

Character encoding issue in Oracle PL/SQL

I'm facing a character discrepancy issue while extracting data from db tables.
I've written some PL/SQL code to spool data from my db tables to a .txt file, and I run it from a Unix shell. However, the contents of the spooled file differ from what is stored in the back end.
For example:
At back end: SADETTÝN
In Spooled txt file : SADETTŸN
If you look at the Y character, it has changed. I want to preserve all the characters exactly as they are in the back end.
My db's character set:
SELECT * FROM v$nls_parameters WHERE parameter LIKE 'NLS%CHARACTERSET'
PARAMETER VALUE
NLS_CHARACTERSET WE8ISO8859P1
NLS_NCHAR_CHARACTERSET WE8ISO8859P1
And Unix NLS_LANG parameter :
$ echo $NLS_LANG
AMERICAN_AMERICA.WE8ISO8859P1
I tried changing the NLS_LANG parameter to WE8ISO8859P9 (the Turkish character set), but it did not help!
Could anyone let me know the solution to this problem?
I presume that you are viewing your file with "vi" or something similar. The NLS_LANG parameter only affects how Oracle writes the data to your file. For your editor (vi), you need to set the LANG variable to the value that corresponds to your NLS_LANG.
Example: for ISO 8859-1 American English, run
export LANG=en_US.ISO8859-1
In other words, your file is just fine; it is your editor that does not know what to do with your Turkish characters.
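That the same file can look different depending on the viewer's character set can be shown with a single byte. The sketch assumes the character is stored as the byte 0xDD, which is 'Ý' in ISO 8859-1; that is an assumption for illustration, not something verified against the asker's data:

```python
stored = "SADETTÝN".encode("latin-1")  # assumed bytes, 'Ý' is 0xDD

# The same bytes, interpreted under two different character sets.
print(stored.decode("latin-1"))    # SADETTÝN
print(stored.decode("iso8859_9"))  # SADETTİN
```

The bytes never change between the two calls; only the character set used to interpret them does.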
You should use NCHAR data types. More information is available at Oracle Documentation - SQL and PL/SQL Programming with Unicode
For spooling from SQL*Plus, you need to set the NLS_LANG environment variable correctly. Here is a similar question in stackoverflow.

Oracle SQL Developer environment encoding

I have Oracle SQL Developer (3.1.07) and I'm trying to work with a database that uses WE8ISO8859P1 encoding:
SELECT * FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET';
I have problems saving packages that contain Unicode symbols. When I open a previously saved package, all Unicode symbols have turned into '¿'.
What settings do I have to change to make SQL Developer keep those symbols?
I've tried setting the environment encoding to 'ISO-8859-15' and some other encodings, but it didn't help.
If your database stores text in a non-Unicode single-byte encoding (e.g. ISO-8859), any symbol not present in that character table will be treated as invalid and replaced by a placeholder. You can't go back from that; the information is lost.
That can usually be worked around when storing data, but as for source code, you cannot control how Oracle encodes your strings.
If your database is configured to use such an encoding scheme, you're probably not supposed to write code that violates its rules.
You may need a character set migration, as described in Oracle's documentation:
http://docs.oracle.com/cd/B10501_01/server.920/a96529/ch10.htm#1656
To at least open the package in SQL Developer, you can give this a quick try and see if it works:
Change the SQL Developer 'encoding' setting to 'unicode-utf-8', which is the default in later versions.
Eventually, you would need to migrate the database character set to 'AL32UTF8' to avoid other issues (for example with data) caused by the current character set.
If you look at USER_SOURCE you'll see that the source code, as stored/interpreted by the database, is held in a VARCHAR2 column and so uses the database character set. As such, your source code will need to be in WE8ISO8859P1.
In theory, if the client and database are using the same character set, then the database won't try to do any character set translation and you may be able to sneak in a sequence of bytes that the database thinks are WE8ISO8859P1 but will make sense in unicode. However, at some point, someone will use the wrong client and it will break.
You don't need unicode for identifiers etc in the code, so I assume it is in string literals. You are better off storing these in a table (NVARCHAR2 column) and selecting them into the code rather than hard-coding them. If that isn't possible, you could use UNISTR and hard-code the relevant hex values.
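If the UNISTR route is taken, the escaped literal does not have to be written by hand. A hedged sketch of a helper that builds one; the function name is made up, and it assumes UNISTR's documented escape form of a backslash followed by four hex digits per UTF-16 code unit:

```python
def to_unistr(text: str) -> str:
    """Build an Oracle UNISTR('...') literal for the given text."""
    parts = []
    for ch in text:
        if ch == "\\":
            parts.append("\\\\")  # a literal backslash must be doubled
        elif ch == "'":
            parts.append("''")    # SQL string-literal quoting
        elif ord(ch) < 128:
            parts.append(ch)      # plain ASCII passes through unchanged
        else:
            # Emit each UTF-16 code unit as \XXXX.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                parts.append("\\%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "UNISTR('" + "".join(parts) + "')"


print(to_unistr("xidmət"))  # UNISTR('xidm\0259t')
```

The resulting literal contains only ASCII, so it should survive being stored in a WE8ISO8859P1 source column.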

Oracle Export with SQL Developer and "Special chars"

Sorry for this "stupid" question, but I am not able to find a solution.
I have a table in my Oracle database. The characters "ä,ö,ß" are stored in this format:
\344
\374
Is there any way to convert them back? I need an Excel sheet.
What you really mean is that the tool you used to get this data renders the characters as \344 \374.
To make sure how they are stored you would actually need to request a dump. As in
alter system dump datafile xx block min y block max z ;
This is the best test. It may well be that your characters are stored correctly but your tool settings are wrong. To find out, you first need to know the database character set:
select * from V$NLS_PARAMETERS where parameter='NLS_CHARACTERSET';
Then compare 344 and 374 with the expected codes. As a matter of fact, 344 is the correct octal value for a-umlaut (ä) in the ISO 8859-1 character set, and 374 is u-umlaut (ü).
Make sure your client NLS_LANG setting (either the environment variable or the registry setting) is correct (e.g. WE8MSWIN1252 for Windows).
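The octal comparison can be done mechanically. A minimal sketch, assuming the escapes are octal byte values in ISO 8859-1 as suggested above; the function name and the sample word are made up for illustration:

```python
import re


def decode_octal_escapes(text: str, charset: str = "latin-1") -> str:
    """Replace backslash-plus-three-octal-digit escapes with their character."""
    def repl(match):
        # Convert the three octal digits to one byte, then decode that byte.
        return bytes([int(match.group(1), 8)]).decode(charset)
    return re.sub(r"\\([0-3][0-7]{2})", repl, text)


print(decode_octal_escapes(r"\344"))       # ä
print(decode_octal_escapes(r"\374"))       # ü
print(decode_octal_escapes(r"M\374ller"))  # Müller
```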

Resources