I need to change the character set from EE8ISO8859P2 to EE8MSWIN1250.
I have read a lot of guides, but I have not found a solution. How can I make this conversion? I need complete instructions.
I would recommend changing it to UTF-8, i.e. AL32UTF8, following the Database Migration Assistant for Unicode Guide or Character Set Migration.
As sandman also suggested, do not run ALTER DATABASE CHARACTER SET ...
It has been de-supported since Oracle 10g.
Database SQL Reference 10g Release 1: ALTER DATABASE:
You can no longer change the database character set or the national
character set using the ALTER DATABASE statement. Please refer to
Oracle Database Globalization Support Guide for information on
database character set migration.
It used to be complicated, with csscan etc., but nowadays you download a GUI tool called the Database Migration Assistant for Unicode (DMU) and follow the instructions. It's a lot easier if your character sets are single-byte (as I'm assuming yours are), since you then avoid the lossy conversion that can happen with a multi-byte character set like UTF8.
You will need downtime though, and it might take hours, depending on the amount of data the DMU tool finds to convert. You can NOT change the character set by simply doing an 'alter database', as some people might suggest.
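Either way, it is worth confirming what the database currently uses before you start (nls_database_parameters is a standard dictionary view):
SELECT parameter, value
FROM nls_database_parameters
WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');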
Related
I've been tasked with determining if our web platform can be 'localized' to Japanese, and how to do so. The platform is PL/SQL based in an Oracle 10g database. We have localized it for French Canadian and Brazilian Portuguese in the past, but I'm wondering what issues I may run into with Japanese (Kanji, I believe). Am I correct that Japanese is a double-byte char set while the others we've used are single-byte? How will this impact code and/or database table structure and access?
The various sentences/phrases/statements are stored in a database table and are looked up as needed based on the user's id and language setting. The table field that stores the 'text' is defined as a CLOB. It's often read into a VARCHAR2 variable.
I tried to copy/paste some Japanese characters into the table via a direct paste into the field in a TOAD schema browser. That resulted in '??' being displayed.
Is there anything I have to do in order to be able to store Japanese characters in that table? Or access/display them from that table?
Check your database character set with:
SELECT *
FROM V$NLS_PARAMETERS
WHERE PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
If the character set supports Japanese (e.g. AL32UTF8), it should be no big deal to localize your application to Japanese as well. Changing the character set on an existing database is also possible but requires some effort; see Character Set Migration.
Check also this answer for topics related to database character set vs. client character set, i.e. NLS_LANG setting.
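For example, on a Windows client you would set NLS_LANG (as an environment variable or in the registry) before starting SQL*Plus or your application; this is just a sketch, and the language/territory parts are examples:
NLS_LANG=JAPANESE_JAPAN.AL32UTF8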
I have Oracle SQL Developer (3.1.07) and I'm trying to work with a database that uses WE8ISO8859P1 encoding:
SELECT * FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET';
I have problems with saving packages that contain Unicode symbols. When I open a previously saved package, all Unicode symbols have turned into '¿'.
What settings do I have to change to make SQL Developer keep those symbols?
I've tried setting the environment encoding to 'ISO-8859-15' and some other encodings, but it doesn't help.
If your database encodes text in a non-Unicode single-byte encoding (e.g. ISO-8859), any symbol not present in its character table will be treated as invalid and replaced by a placeholder. You can't go back from that; the information is lost.
That can usually be worked around when storing data, but as for source code, you cannot control how Oracle encodes your strings.
If your database is configured to use such an encoding scheme, you're probably not supposed to write code that violates its rules.
Maybe you need the character set migration chapter from Oracle's documentation:
http://docs.oracle.com/cd/B10501_01/server.920/a96529/ch10.htm#1656
At least to open the package in SQL Developer, you can give this a quick try and see if it works:
Change SQL Developer's 'encoding' to 'unicode-utf-8', which is the default in later versions.
You would eventually need to go for a database character set migration to 'AL32UTF8' to avoid other issues (with data, for example) caused by the current character set.
If you look at USER_SOURCE you'll see that the source code, as stored/interpreted by the database, is kept in a VARCHAR2 column and so uses the database character set. As such, your source code will need to be in WE8ISO8859P1.
In theory, if the client and the database use the same character set, the database won't try to do any character set translation, and you may be able to sneak in a sequence of bytes that the database thinks is WE8ISO8859P1 but that makes sense in Unicode. However, at some point someone will use the wrong client and it will break.
You don't need Unicode for identifiers etc. in the code, so I assume it is in string literals. You are better off storing these in a table (in an NVARCHAR2 column) and selecting them into the code rather than hard-coding them. If that isn't possible, you could use UNISTR and hard-code the relevant hex values.
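For example, UNISTR builds a string from UTF-16 code-unit escapes, so it survives whatever encoding your source file is saved in; a small illustration (U+20AC is the euro sign, which WE8ISO8859P1 cannot represent):
SELECT UNISTR('\20AC') FROM dual;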
I have to change the character set from AL32UTF8 to WE8MSWIN1252 in an Oracle 11g R2 Express instance... I tried to use the command:
ALTER DATABASE CHARACTER SET WE8MSWIN1252;
But it fails, saying that WE8MSWIN1252 isn't a superset of AL32UTF8. Then I found some articles about CSSCAN, but that tool doesn't seem to be available in Oracle 11g Express.
http://www.oracle-base.com/articles/10g/CharacterSetMigration.php
Does anyone have an idea of how to do that? Thanks in advance.
Edit
Clarifying a little bit: the real issue is that I'm trying to import data into a table that has a column defined as VARCHAR2(6 BYTE). The string causing the issue is 'eq.mês'; it needs 6 bytes in WE8MSWIN1252 but 7 bytes in UTF-8.
You can't.
The Express Edition of 11g is only available with a UTF-8 character set. If you want to go back to the Express Edition of 10g, there was a Western European version that used the Windows-1252 character set. Unlike with the other editions, Oracle doesn't support the full range of character sets in the Express Edition, nor does it support changing the character set of an existing XE database.
Why do you believe you need to change the database character set? Other than potentially taking a bit more storage space to support the characters in the upper half of the Windows-1252 range, which generally aren't heavily used, there aren't many downsides to a UTF-8 database.
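Also, judging by your edit, the import failure is really about byte-length semantics rather than the character set itself. One workaround that avoids migrating the database is to declare the column with character semantics, so it holds 6 characters regardless of how many bytes each takes (a sketch with hypothetical table and column names):
ALTER TABLE my_table MODIFY (my_col VARCHAR2(6 CHAR));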
When you want to go to a character set that supports only a subset of the original characters, I would say your best option is to export the data and import it back (using exp and imp, or expdp and impdp).
Are you sure that no table contains any character not found in the 1252 code page?
The problem with only executing that ALTER DATABASE command is that the Data Dictionary is not converted, and it can end up corrupted.
I had the same problem. In my case, we were using Oracle 11g Express Edition (11.2.0.2.0) and we really needed it to run with the WE8MSWIN1252 character set, but I could not change the character set at installation time (it always installs with AL32UTF8).
With an Oracle Client 11g installed as Administrator, we ran csscan full=y (check this link: https://oracle-base.com/articles/10g/character-set-migration) and noticed that there were lossy and convertible data problems in our database. However, the problems were in the MDSYS (Oracle Spatial) and APEX_040000 (Oracle Application Express) schemas. So, as we didn't need those products, we removed them (check this link: http://fast-dba.blogspot.com.br/2014/04/how-to-remove-unwanted-components-from.html).
Then we exported the user schemas with expdp and dropped the users (they must be recreated at the end of the process).
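A sketch of that round trip, with hypothetical schema, password, and directory names:
expdp system/password schemas=APP_USER directory=DATA_PUMP_DIR dumpfile=app_user.dmp
and, after the character set change and recreating the users:
impdp system/password schemas=APP_USER directory=DATA_PUMP_DIR dumpfile=app_user.dmp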
Executing csscan again with full=y capture=y, it reported: 'The data dictionary can be safely migrated using the CSALTER script.' If the report doesn't say this, the csalter.plb script will not work, because some of these conditions will not be satisfied:
changeless for all CHAR, VARCHAR2, and LONG data (Data Dictionary and Application Data)
changeless for all Application Data CLOB
changeless and/or convertible for all Data Dictionary CLOB
In our case, these conditions were satisfied and we were able to run the CSALTER script successfully. Moreover, this script executes the ALTER DATABASE command you are trying to run, and it converts the Data Dictionary CLOB data that is convertible.
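For reference, CSALTER is run from SQL*Plus as SYSDBA with the database in restricted mode; this is the invocation as described in the oracle-base article linked above:
SHUTDOWN IMMEDIATE;
STARTUP RESTRICT;
@?/rdbms/admin/csalter.plb
SHUTDOWN IMMEDIATE;
STARTUP;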
Finally, we recreated the users and the tablespaces of our application and imported the dump of the user data successfully.
I have an Oracle database whose NLS character set is AL32UTF8 and whose NLS NCHAR character set is UTF8.
However, if I insert any data into an NVARCHAR2 column in a table, a subsequent SELECT returns the data as '???'.
Why is this so?
The funny thing is that in TOAD I can view the correct NVARCHAR2 data using the schema browser -> Data tab.
But if I use SQL to do a SELECT, I get '???'.
Does anyone have any idea how to resolve this?
What is the NLS_LANG setting on your client? For a good reference on NLS parameters, read this FAQ by Oracle. For additional reading on Unicode, see this essay by Joel Spolsky.
You need to set the client's NLS_LANG to a UTF-8 character set.
sqlplus uses these environment variables (registry parameters on Windows; you may also need to use sqlplusw.exe for UTF-8 to work on Windows):
NLS_LANG=AMERICAN_AMERICA.AL32UTF8
LC_CTYPE="en_US.UTF-8"
ORA_NCHAR_LITERAL_REPLACE=true
See also : Inserting national characters into an oracle NCHAR or NVARCHAR column does not work
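For example, with ORA_NCHAR_LITERAL_REPLACE=true set as above, national character data can be inserted as an N'' literal without being mangled through the database character set on the way in (the table name here is hypothetical):
INSERT INTO translations (id, txt) VALUES (1, N'日本語');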
The client has asked for a number of tables to be extracted into CSVs; all done, no problem. They've just asked that we make sure the files are always in UTF-8 format.
How do I check that this is actually the case? Or, even better, force it to be so; is it something I can set in a procedure before running a query, perhaps?
The data is extracted from an Oracle 10g database.
What should I be checking?
Thanks
You can check the database character set with the following query:
select value from nls_database_parameters
where parameter='NLS_CHARACTERSET'
If it says AL32UTF8 then your database is in the format you need, and if the export does not impair it, then you are done.
You may read about Oracle globalization support here, and about NLS parameters like the one above here.
How, exactly, are you generating the CSV files? Depending on the exact architecture, there will be different answers.
If you are, for example, using SQL*Plus to extract the data, you would need to set the NLS_LANG on the client machine to something appropriate (i.e. AMERICAN_AMERICA.AL32UTF8) to force the data to be sent to the client machine in UTF-8. If you are using other approaches, NLS_LANG may or may not be important.
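For example, a sketch of driving the extract through SQL*Plus on Windows, where the connect string and script name are placeholders:
set NLS_LANG=AMERICAN_AMERICA.AL32UTF8
sqlplus -s scott/tiger@orcl @extract_csv.sql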
What you have to look for is whether the eight-bit characters in the input (if any) are translated into double-byte UTF-8 sequences.
This is highly dependent on your local code page, but typically:
ASCII "£" should be x'A3' in ascii magically becomes x'C2A3' in utf-8.
OK, it wasn't as simple as I first hoped. The query above returns AL32UTF8.
I am using a stored procedure compiled in the database to loop through a list of table names held in an array inside the procedure.
I use the DBMS_SQL package to build the SQL and UTL_FILE.PUT_NCHAR to write data to a text file.
I believed my resulting output would then be in UTF-8; however, opening it in Textpad says it's in ANSI, and the data is garbled in places :)
Cheers
It might be important that NLS_CHARACTERSET is AL32UTF8 and NLS_NCHAR_CHARACTERSET is AL16UTF16.
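For what it's worth, UTL_FILE only writes Unicode when the file is opened in NCHAR mode, and the output is then UTF-8 without a byte-order mark, which editors like Textpad may still report as ANSI. A minimal sketch, assuming a directory object named EXPORT_DIR exists:
DECLARE
  f UTL_FILE.FILE_TYPE;
BEGIN
  -- FOPEN_NCHAR opens the file in national character set mode;
  -- the contents are then written in UTF-8
  f := UTL_FILE.FOPEN_NCHAR('EXPORT_DIR', 'out.csv', 'w');
  UTL_FILE.PUT_LINE_NCHAR(f, N'header1,header2');
  UTL_FILE.FCLOSE(f);
END;
/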