Change encoding (collation?) of SQL Server 2008 R2 to UTF-8

We'd like to move our Confluence system to SQL Server 2008 R2. Now, since Confluence uses UTF-8 encoding, I'd need a database using the same encoding (I guess that's the collation?).
There's the command
ALTER DATABASE confluence COLLATE COLLATION_NAME
Now, as it seems, there is no UTF-8, and from what I found out, SQL Server uses UCS-2, which serves a similar purpose. But I can't figure out what the collation name for UCS-2 would be. Does somebody know about that?
Edit: I do see the difference between encoding and collation now. The Confluence documentation suggests that I should create a schema which relies on UCS-2 (because MS SQL lacks support for UTF-8). I have looked through Management Studio and found an entry for schemas in the Security directory of the database. However, I cannot figure out how to assign UCS-2 encoding to the schema. What do I have to do in Management Studio to accomplish this (or which query should I use)?

According to the Confluence documentation, you should set the collation to SQL_Latin1_General_CP1_CS_AS.
We have followed this document and have had a successful confluence deployment on SQL Server 2008 R2:
Database Setup for SQL Server
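
For reference, a minimal T-SQL sketch of that collation change on an existing database (the database name confluence comes from the question; note that SQL Server 2008 R2 has no UTF-8 collation, and Unicode data is stored in nvarchar columns as UCS-2/UTF-16 regardless of collation):

-- Changing the collation requires exclusive access to the database.
ALTER DATABASE confluence SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE confluence COLLATE SQL_Latin1_General_CP1_CS_AS;
ALTER DATABASE confluence SET MULTI_USER;
-- Verify the result:
SELECT DATABASEPROPERTYEX('confluence', 'Collation');

Keep in mind that ALTER DATABASE ... COLLATE only changes the database default; existing columns keep the collation they were created with.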

Related

Why does one machine assume unicode and another non-unicode for same database columns in SSIS OLE DB Source task?

A coworker made an SSIS package that pulls data from Oracle and transfers it to a nearly identical SQL Server database, so it's mostly data flow tasks that simply go from an OLE DB Source (Oracle) to an OLE DB Destination (SQL Server). When I open it on my computer, I get the error "Column [column name] cannot convert between unicode and non-unicode string data types" on all the source tasks. If I add a data conversion task to convert the unicode columns to non-unicode, everything works fine, but I really want this to work the way he has it, because it runs like this on the production server. My best guess is that it has to do with the installation of the Oracle client or drivers, or with the NLS_LANG variable, but I can't seem to solve it. My environment variable NLS_LANG = AMERICAN_AMERICA.WE8ISO8859P1
I was worried something had gone wrong with my Oracle client installation because of my registry values. I now have three clients installed, since I went through the install process again; the registry values in question were the third client's, and I added NLS_LANG myself and rebooted. I'm more of a SQL Server developer, so I'm possibly saying something wrong here.
The solution was to set the NLS_LANG environment variable to AMERICAN_AMERICA.WE8MSWIN1252, to match what my coworker had and what my registry has; I somehow hadn't noticed they were different! In fact, neither NLS_LANG was set to begin with, so the real solution was to add it. I rebooted, and when I reopened the package I got zero errors.
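
As a sketch, the registry value and the environment variable can be checked and set from an elevated command prompt (these are standard Windows commands; the exact Oracle home key varies per installation):

:: Search all Oracle homes in the registry for NLS_LANG:
reg query "HKLM\SOFTWARE\ORACLE" /s /f NLS_LANG
:: Check the environment variable:
echo %NLS_LANG%
:: Set it machine-wide (affects new processes only; reboot to be safe):
setx NLS_LANG AMERICAN_AMERICA.WE8MSWIN1252 /M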

How to change (migrate) the character set of a root container database in a multi-tenant architecture?

I have at my disposal a multi-tenant database where both the root container database (referred to as CDB from now on) and the pluggable database (referred to as PDB from now on) have been installed with the WE8DEC character set (select value from nls_database_parameters where parameter='NLS_CHARACTERSET';).
The requirement would be that both (or at least the PDB, which I am actually using) use the AL32UTF8 character set. And installing a new database from scratch is not an option.
In 12.2, character set migration to Unicode should be done with the DMU (Database Migration Assistant for Unicode) tool. However, in this particular setup that doesn't help me, because:
- DMU can only be used on PDBs,
- I cannot migrate a PDB to Unicode while its CDB is not already in Unicode,
- and DMU cannot migrate the CDB to Unicode.
Therefore my question is:
How can I migrate the CDB from WE8DEC to AL32UTF8?
(According to my research, an export/import approach could perhaps come into discussion, but that involves installing from scratch a new database server with AL32UTF8 encoding. And as I said, that's not an option in my specific case.)
Has anybody encountered this specific dilemma yet?
Thanks in advance for your answers.
Database Migration Assistant for Unicode (DMU) does not support migrating the CDB root. The solution is to create a new CDB in AL32UTF8, move the PDB to it, and then use DMU to migrate the PDB.
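
A minimal sketch of that move, assuming the new AL32UTF8 CDB has already been created, and using example names (mypdb and /tmp/mypdb.xml are placeholders):

-- On the old (WE8DEC) CDB: close, unplug and drop the PDB, keeping its datafiles.
ALTER PLUGGABLE DATABASE mypdb CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE mypdb UNPLUG INTO '/tmp/mypdb.xml';
DROP PLUGGABLE DATABASE mypdb KEEP DATAFILES;
-- On the new (AL32UTF8) CDB: plug the PDB in and open it.
CREATE PLUGGABLE DATABASE mypdb USING '/tmp/mypdb.xml' NOCOPY;
ALTER PLUGGABLE DATABASE mypdb OPEN;

After the plug-in, run DMU against the PDB to convert it from WE8DEC to AL32UTF8.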

OLE DB Command to Oracle - unicode/non-unicode conversion error

My package runs fine from both my desktop and my ETL server when I RDP into it. However, when it runs as part of a job, I get the following error on all my string columns: "Error: Column "***STATUS" cannot convert between unicode and non-Unicode string data types."
The error occurs on an OLE DB Command component that updates a table in an Oracle database. None of my columns, on either the SQL Server/SSIS side or the Oracle side, are Unicode. (The metadata directly leading into my OLE DB Command component was shown in a screenshot in the original post.)
I verified that the External Columns on the OLE DB Command component in question exactly match that metadata. I've also tried explicitly converting the columns to Unicode before inserting in case they were Unicode (I know they're not) on the Oracle side, but that leads to a hard error (red X) and the same message.
[The Oracle schema and the command were likewise shown as screenshots in the original post.]
Anyone have any idea on how to get this to run from the agent?
Based on the following Oracle support case:
Oraoledb: Cannot Convert Between Unicode And Non-Unicode String Data Types
the cause of this error message is:
developing an SSIS package that uses the Oracle OLEDB Provider on a 32-bit operating system and then deploying it to a 64-bit SQL Server installation.
Possible Workarounds
Note that I haven't tested these workarounds myself.
(1) Try running the package in 32-bit mode:
From Visual Studio
Go to Project Properties >> Debugging >> Run64BitRuntime = False
From SQL Agent
Check the following link (a dtexec sketch also follows this list):
SSIS Package Not Running as 32bit in SQL Server 2012
(2) Install Oracle x64 oledb provider
64-bit Oracle Data Access Components (ODAC) Downloads
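
As an additional sketch for the SQL Agent case, a job step of type "Operating system (CmdExec)" can call the 32-bit dtexec directly (the install path assumes a default SQL Server 2012 x64 setup; the package path is an example):

:: The 32-bit dtexec lives under Program Files (x86); the 64-bit one under Program Files.
"C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" /F "C:\packages\MyPackage.dtsx"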
On the OLE DB Source, ensure you have DelayValidation set to True.
I set AlwaysUseDefaultCodePage to False and had DefaultCodePage as 1252.
I have tried this with both 32-bit and 64-bit.
You also need to set the flag ValidateExternalMetadata = False.

Switching empty Oracle 12c database to Unicode

I have Oracle 12c installed in a Ubuntu Linux development environment with NLS_CHARACTERSET = WE8MSWIN1252. I want to import a database using data pump that has AL32UTF8 encoding. Is there a convenient way (other than reinstalling Oracle) to either switch my Oracle to Unicode entirely (no need to worry about preserving existing data), or to somehow "locally" use Unicode encoding for the schemas that I import?
To convert a database to the AL32UTF8 character set, Oracle provides the Database Migration Assistant for Unicode (DMU); as of version 12.1 it is the supported migration method.
It is "convenient" in that it is a GUI-based tool which guides you through all the steps; at the same time it could be called "complex", as it requires a separate repository.

SSIS converts Varchar2 to DT_STR

We have an SSIS package downloading data from an Oracle database to a SQL Server data warehouse. For this data warehouse, several environments are set up: Development, Test and Production. Dev and Test share a machine; Prod is stand-alone.
When the SSIS package is run on the PROD machine, it downloads the Varchar2 columns from our Oracle source database to MSSQL in DT_WSTR format and saves this to an NVarchar column, i.e. all steps involved support Unicode.
When this same package is run against the same source database on the DEV/Test box, it somehow sees the external columns as being Varchar, derives this to DT_STR in the data flow and refuses to store this in an NVarchar column.
All OSs are Win2K8R2, with MSSQL 2008 64-bit. The package is run in 32-bit mode; the same behaviour is seen when run from BIDS or from SQL Agent.
Anyone care to guess why? I've already seen the suggestion to disable validating external metadata (https://stackoverflow.com/a/18383598/2903056), but that's not a practical suggestion for our situation.
An old question, I know, but it seems to still be relevant. And since I could not find a suitable answer in the last 3 months I have been searching, I figure now is as good a time as any to post my findings.
I have had the same curious behaviour and have finally been able to resolve it.
My layout looked like this:
Oracle 10g R2 database on Windows 2003 Server (let's call it ORA)
Dev machine with Windows 8, Visual Studio 2012 + SSDT, SQL Express 2012, ODT 12.1.0.21 (let's call that DEV)
SQL Server 2012 on Windows 2012 Server, Oracle Client 11.2 (let's call that TEST)
Both DEV and TEST were connecting to ORA. DEV was reporting VARCHAR2 columns as DT_WSTR while TEST would insist that they are DT_STR.
I then installed ODT 12.1.0.21 on TEST and the problem was solved. Notably, I used the "machine wide" option during the install. I am not sure how much of an impact that had.
There seems to be a difference in the datatypes that are returned by the Oracle OleDb providers across the different versions of the client side components.
Check the value of the NLS_LANG in the registry.
reg query HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\ORACLE\KEY_<orahome> /f NLS_LANG
If it matches the server's character set, OraOLEDB will use regular (non-Unicode) datatype DBTYPE_STR, otherwise it uses Unicode-mode, datatype DBTYPE_WSTR.
If the NLS_LANG field is missing, it defaults to US7ASCII which almost certainly will not match your database and you will be using Unicode datatypes.
To get the server's characterset, do:
SELECT parameter, value FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET';
Check the ValidateExternalMetadata property value; if it is True, set it to False.
