DB2: How to set encoding for db2clp under Windows?

I have a DB2 database that was created with its codeset set to UTF-8:
db2 create database mydb using codeset UTF-8
My data insert scripts are also stored in UTF-8 encoding.
The problem now is that the command line processor seems to work with a different encoding, since the Windows installation doesn't use UTF-8:
C:\Users\Administrator>chcp
Active code page: 850
This leads to the problem that my data (which contains special characters) is not stored correctly in the database.
Under Linux/AIX I could change the command line encoding by setting
export LC_ALL=en_US.UTF-8
How do I achieve this under Windows? I already tried
chcp 65001
UPDATE:
But that doesn't have any effect. It seems that db2clp can't deal with the UTF-8 encoded file, because it prints out junk:
D:\Program Files\ibm_db2\SQLLIB\BIN>chcp 65001
Active code page: 65001
D:\Program Files\ibm_db2\SQLLIB\BIN>type d:\tmp\encoding.sql
INSERT INTO MY_TABLE (ID, TXT) VALUES (99, 'äöü');
D:\Program Files\ibm_db2\SQLLIB\BIN>db2 connect to mydb
Database Connection Information
Database server = DB2/NT64 9.5.0
SQL authorization ID = MYUSER
Local database alias = MYDB
D:\Program Files\ibm_db2\SQLLIB\BIN>db2 -tvf d:\tmp\encoding.sql
INSERT INTO MY_TABLE (ID, TXT) VALUES (99, 'äöü')
DB20000I The SQL command completed successfully.

You need to set both:
CHCP 65001
SET DB2CODEPAGE=1208
on the db2cmd command line, before running db2 -tvf. This works for databases that have their CODESET set to UTF-8. To check the CODESET setup for a database, run:
db2 get db cfg for <your database>
and look for "Database code page" and "Database code set"; they should be 1208 and UTF-8, respectively.
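For a UTF-8 database, the two relevant lines of that output look like this (whitespace trimmed; the values are the ones named above):
Database code page = 1208
Database code set = UTF-8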

When dealing with encodings, you have to take a careful look at your environments and where you currently are. So in your case:
the server stores its data in encoding A (like UTF-8)
the client resides in an environment which has encoding B (like windows-1252)
On the client side you have to use the encoding of your client environment, or tell the client that you intentionally use another encoding (like a UTF-8 encoded file inside a windows-1252 environment). The connection between the client and the server then does the work of converting encoding B into encoding A when storing the data in the database.
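One way to check that this conversion actually happened and the special characters arrived as UTF-8 on the server is to look at the raw bytes (a sketch; MY_TABLE and the row with ID 99 are taken from the question above):
db2 "SELECT HEX(TXT) FROM MY_TABLE WHERE ID = 99"
If 'äöü' was stored correctly as UTF-8, this returns C3A4C3B6C3BC.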

Setting db2codepage worked for me, thanks to Mr. Zoran Regvart.
By the way, after setting it you need to execute "db2 terminate" to reset the client, and then reconnect, as in the session sketched below.
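Putting it all together, a complete session in a db2cmd window would look like this (a sketch; mydb and the script path are the ones from the question):
chcp 65001
set DB2CODEPAGE=1208
db2 terminate
db2 connect to mydb
db2 -tvf d:\tmp\encoding.sql
db2 connect reset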

Related

psql on Windows: ERROR: invalid byte sequence for encoding "UTF8": 0xc8 0x20

on database1:
show LC_CTYPE; shows C
show LC_COLLATE; shows C
show SERVER_ENCODING; shows UTF8
but set "PGPASSWORD=password1" & set "PGCLIENTENCODING=UTF8" & psql.exe -h 127.0.0.1 -p 5432 -U postgres -d database1 -c "INSERT INTO table1 (column1) VALUES ('mise à jour 1');"
shows: ERROR: invalid byte sequence for encoding "UTF8": 0xc8 0x20
The error disappears if PGCLIENTENCODING is set to ISO_8859_5, for example.
How can I fix this issue?
There is nothing much to fix. Your Windows shell uses a different encoding than UTF-8, so you have to set the client encoding to that encoding to make it work. To find out which client encoding to use, you must figure out which encoding your shell uses. That in turn depends on which shell you are using and how the Windows system was configured.
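For example, one way to make the two match in cmd.exe is to switch the console to the Windows ANSI code page and tell psql about it (a sketch; 1252 is the usual ANSI code page on US and Western European systems):
chcp 1252
set PGCLIENTENCODING=WIN1252
psql.exe -h 127.0.0.1 -p 5432 -U postgres -d database1 -c "INSERT INTO table1 (column1) VALUES ('mise à jour 1');"
With the console code page and client encoding in agreement, the server converts the WIN1252 bytes to UTF8 on the way in.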

Copying from file: read psql message on Windows with UTF8

I'm trying to import a csv file into postgres with the COPY command.
As I received the well-known 'ERROR: character with byte sequence 0xd0 0x9f in encoding "UTF8" has no equivalent in encoding "WIN1252"', I changed my client_encoding to utf8.
Now I'm getting a completely unreadable message:
ПОМИЛКÐ: Ð²Ñ–Ð´Ð½Ð¾ÑˆÐµÐ½Ð½Ñ "mytab" не Ñ–Ñнує
I tried to change the console code page with chcp 65001, but with no luck.
Can anybody help me with that extraordinarily rare and complex task - importing a csv into a database?
Solution:
I would suggest that the problem is due to the UA or RU localization of the installed DB.
Switching the DB language should help (at least it helped me):
SET lc_messages TO 'en_US.UTF-8';
Please try it on your PC and let me know if that helps.
My investigation:
In PowerShell I was getting an error the whole time, but with:
ERROR: character with byte sequence 0xd0 0x9f in encoding "UTF8" has no equivalent in encoding "WIN1252"
When I switch the encoding to UTF-8 with the command:
SET client_encoding TO 'UTF8';
I start getting the same unreadable symbols, but if I go to pgAdmin4 and run the same command, it gives me a well explained error in Ukrainian:
ERROR: ПОМИЛКА: insert або update в таблиці "exam_results" порушує обмеження зовнішнього ключа "exam_results_subject_id_fkey"
DETAIL: Ключ (subject_id)=(0) не присутній в таблиці "subjects".
CONTEXT: SQL-оператор "insert into exam_results (student_id, subject_id, mark)
values ((random()*100000)::int,
(random()*1000)::int,
(random()*5)::int)"
Функція PL/pgSQL inline_code_block рядок 4 в SQL-оператор
(In English: ERROR: insert or update on table "exam_results" violates foreign key constraint "exam_results_subject_id_fkey". DETAIL: Key (subject_id)=(0) is not present in table "subjects". CONTEXT: PL/pgSQL function inline_code_block line 4 at SQL statement.)
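A session that applies both settings before the import could look like this (a sketch; mydb and the csv path are placeholders, mytab is the table from the error message above):
chcp 65001
psql -U postgres -d mydb
SET client_encoding TO 'UTF8';
SET lc_messages TO 'en_US.UTF-8';
\copy mytab FROM 'C:\tmp\mytab.csv' WITH (FORMAT csv, HEADER)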

Firebird 2.5 query returns COLLATION UTF8_CI_AI_NUMERIC_SORT for CHARACTER SET UTF8 is not installed

I have an old source database in which apparently a custom collation UTF8_CI_AI_NUMERIC_SORT was created. I'm running it on Docker via the image jacobalberty/firebird:2.5-ss. Originally the database was created on a Windows machine.
When I try to do a query on the table where this collation was used, I get the error:
SQL> select * from "InvoiceService";
Statement failed, SQLSTATE = 22021
COLLATION UTF8_CI_AI_NUMERIC_SORT for CHARACTER SET UTF8 is not installed
Show collations returns the following:
SQL> show collations;
UTF8_CI_AI_NUMERIC_SORT, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), CASE INSENSITIVE, ACCENT INSENSITIVE, 'NUMERIC-SORT=1'
I tried the following fixes:
Adding an entry to fbintl.conf:
<charset UTF8>
intl_module fbintl
collation UTF8_CI_AI_NUMERIC_SORT
</charset>
and then running the sp_register_character_set("UTF8", 4) procedure, which fails with an error about duplicate collations (because UTF8_CI_AI_NUMERIC_SORT is already defined in the DB).
Dropping the collation:
SQL> drop collation UTF8_CI_AI_NUMERIC_SORT;
Statement failed, SQLSTATE = 42000
unsuccessful metadata update
-Collation UTF8_CI_AI_NUMERIC_SORT is used in table InvoiceService (field name NAME) and cannot be dropped
Adding a new column which would use a different collation, but I can't even add it:
SQL> ALTER TABLE "InvoiceService" ADD NAME2 VARCHAR(600) CHARACTER SET UTF8;
Statement failed, SQLSTATE = 22021
unsuccessful metadata update
-InvoiceService
-COLLATION UTF8_CI_AI_NUMERIC_SORT for CHARACTER SET UTF8 is not installed
Using gbak to restore only the metadata, fixing the schema, and then inserting only the data; but gbak does not support restoring data only.
...
I'm out of ideas now. What else could I try?
So, I finally managed to solve the problem. What I did was to create a DB backup with
gbak -v -t -user SYSDBA /path/to/source.fdb /path/to/backup.fbk
Then use the 3.0 version of the Firebird Docker image (jacobalberty/firebird:3.0) and restore from the backup with
gbak -create /path/to/backup.fbk /path/to/restored3.fdb
Note that the same backup-restore procedure without switching the Docker image did not work.
I didn't have to do anything else. There's only a slight difference in SHOW COLLATIONS; output:
// originally:
UTF8_CI_AI_NUMERIC_SORT, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), CASE INSENSITIVE, ACCENT INSENSITIVE, 'NUMERIC-SORT=1'
// restored DB
UTF8_CI_AI_NUMERIC_SORT, CHARACTER SET UTF8, FROM EXTERNAL ('UNICODE'), CASE INSENSITIVE, ACCENT INSENSITIVE, 'COLL-VERSION=58.0.6.50;NUMERIC-SORT=1'
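Run from the host, the two steps could look like this (a sketch; it assumes containers named fb25 and fb30 are running from the 2.5 and 3.0 images and share the /path/to volume):
docker exec fb25 gbak -v -t -user SYSDBA /path/to/source.fdb /path/to/backup.fbk
docker exec fb30 gbak -create /path/to/backup.fbk /path/to/restored3.fdb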

Strange character in migration from Postgres to Oracle (ANSI)

I'm migrating a db from Postgres to Oracle. I create csv files with this command:
\copy ttt to 'C:\test\ttt.csv' CSV DELIMITER ',' HEADER encoding 'UTF8' quote as '"';
Then with Oracle SQL*Loader I put the data into Oracle tables.
It's all OK, but in some descriptions I have this character  that wasn't in the original DB.
The Postgres db's encoding is UTF8 and I'm on a Windows machine.
Thanks to all.
Gian Piero
Before you start SQL*Loader, run:
chcp 65001
set NLS_LANG=.AL32UTF8
chcp 65001 sets the code page of your cmd.exe to UTF-8 (which is inherited by SQL*Loader and SQL*Plus).
With set NLS_LANG=.AL32UTF8 you tell the Oracle database: "The client uses UTF-8."
Without these commands you would have this situation (due to defaults):
chcp 850
set NLS_LANG=AMERICAN_AMERICA.US7ASCII
Maybe on your PC you have code page 437 instead of 850; it depends on whether your PC is set up for the U.S. or Europe, see the National Language Support (NLS) API Reference, column "OEM codepage".
You can also set NLS_LANG as an environment variable in the PC settings, or you can define it in the registry at HKLM\SOFTWARE\Wow6432Node\ORACLE\KEY_%ORACLE_HOME_NAME%\NLS_LANG (for 32-bit) or HKLM\SOFTWARE\ORACLE\KEY_%ORACLE_HOME_NAME%\NLS_LANG (for 64-bit).
You can also change the code page of your cmd.exe persistently, see https://stackoverflow.com/a/33475373/3027266
For details about NLS_LANG see https://stackoverflow.com/a/33790600/3027266
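A minimal sketch of the whole invocation (the control file name and connect string are placeholders):
chcp 65001
set NLS_LANG=.AL32UTF8
sqlldr userid=myuser/mypassword@mydb control=ttt.ctl data=C:\test\ttt.csv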

Import / export Oracle schema with correct character set

I have exported a schema successfully. On the import, however, the log says that the character sets don't match. The strange thing is that on the server where the export was done, the character set is the same as on the target database.
This is from the source:
SQL> select * from v$NLS_PARAMETERS;
NLS_CHARACTERSET
WE8MSWIN1252
NLS_NCHAR_CHARACTERSET
AL16UTF16
And this is from the log of the import:
import done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set
export client uses US7ASCII character set (possible charset conversion)
Why is the dump recognized as US7ASCII? Both the source and target are non-US machines.
Thank you
Yes, this looks like an issue with the character set of the client session. Set it to the globally supported and recommended UTF8 format.
Please take the export again and try importing. (Do the following before the export):
In Windows
set NLS_LANG=AMERICAN_AMERICA.UTF8
In Unix
export NLS_LANG=AMERICAN_AMERICA.UTF8
These days the DB character set is also recommended to be 'AL32UTF8'.
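For example, on the source machine the export could be run like this (a sketch; credentials and the exp/imp parameters are placeholders):
set NLS_LANG=AMERICAN_AMERICA.UTF8
exp myuser/mypassword@sourcedb owner=MYSCHEMA file=myschema.dmp log=exp.log
and on the target, with the same NLS_LANG set:
set NLS_LANG=AMERICAN_AMERICA.UTF8
imp myuser/mypassword@targetdb file=myschema.dmp fromuser=MYSCHEMA touser=MYSCHEMA log=imp.log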
