PLSQL decode NVARCHAR2 from BASE64 to UTF-8

I have a database which stores usernames only in English at the moment.
I would like to use BASE64 and UTF-8 so that names in other languages can be stored as well, in a column of type NVARCHAR2.
The database procedure receives the name in BASE64. I decode it with UTL_ENCODE.BASE64_DECODE and convert the result to VARCHAR2 using UTL_RAW.CAST_TO_VARCHAR2, but I get gibberish back and not the actual word.
For example, I receive 'алекс' as the name in BASE64. I'm able to decode it, but the cast to VARCHAR2/NVARCHAR2 does not return the value: I get only gibberish.
I'm running Oracle 12c with NLS_CHARACTERSET WE8ISO8859P1.
Here is the code I use to decode:
DECLARE
  lv_OrgUserName     VARCHAR2(2000);
  lv_encodedUserName VARCHAR2(2000);
  lv_UserName        VARCHAR2(2000);
BEGIN
  lv_OrgUserName := 'алекс';
  lv_encodedUserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_ENCODE(UTL_RAW.CAST_TO_RAW(lv_OrgUserName)));
  DBMS_OUTPUT.PUT_LINE (lv_encodedUserName);
  lv_UserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW (lv_encodedUserName)));
  DBMS_OUTPUT.PUT_LINE (lv_UserName);
END;
How can I overcome this?

First and foremost, WE8ISO8859P1 (Western European 8-bit ISO 8859 Part 1) does not support Cyrillic characters;
see this link: https://en.wikipedia.org/wiki/ISO/IEC_8859-1
Therefore, if you try to store a string like алекс in a VARCHAR2 variable or column, you will always get a???? as the outcome.
Probably nobody considered Cyrillic characters when the database was installed, and an unsuitable code page was chosen.
A better option is ISO/IEC 8859-5 (Part 5); see this link: https://en.wikipedia.org/wiki/ISO/IEC_8859-5
One option is to change this encoding, but this is not easy and is beyond the scope of this question.
What you can do is strictly use the NVARCHAR2 datatype instead of VARCHAR2 everywhere in your application that must support Cyrillic characters.
There are still some pitfalls you need to be aware of:
You cannot use the DBMS_OUTPUT package to debug your code, because this package supports only the VARCHAR2 datatype; it doesn't support NVARCHAR2.
You must use N'some string' literals (with the N prefix) everywhere: 'алекс' is of VARCHAR2 datatype and is always automatically converted to 'a????' in your encoding, while n'алекс' is of NVARCHAR2 datatype and no such conversion occurs.
The code below is tested on version 12c. I am using the EE8MSWIN1250 code page (it also doesn't support Cyrillic characters):
select * from nls_database_parameters
where parameter like '%CHARACTERSET%';
PARAMETER               VALUE
----------------------- ------------
NLS_NCHAR_CHARACTERSET  AL16UTF16
NLS_CHARACTERSET        EE8MSWIN1250
Please give it a try:
CREATE OR REPLACE PACKAGE my_base64 AS
  FUNCTION BASE64_ENCODE( str nvarchar2 ) RETURN varchar2;
  FUNCTION BASE64_DECODE( str varchar2 ) RETURN nvarchar2;
END;
/
CREATE OR REPLACE PACKAGE BODY my_base64 AS
  -- NVARCHAR2 -> raw bytes in the national character set -> BASE64 text
  FUNCTION BASE64_ENCODE( str nvarchar2 ) RETURN varchar2
  IS
    lv_encodedUserName VARCHAR2(2000);
  BEGIN
    lv_encodedUserName := UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_ENCODE(UTL_RAW.CAST_TO_RAW(str)));
    RETURN lv_encodedUserName;
  END;

  -- BASE64 text -> raw bytes -> NVARCHAR2 (bytes are read as national character set data)
  FUNCTION BASE64_DECODE( str varchar2 ) RETURN nvarchar2
  IS
    lv_UserName nVARCHAR2(2000);
  BEGIN
    lv_UserName := UTL_RAW.CAST_TO_nVARCHAR2(UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW (str)));
    RETURN lv_UserName;
  END;
END;
/
A few examples:
select 'aлекс' As A, n'aлекс' As B from dual;
A B
----- -----
a???? aлекс
select my_base64.BASE64_ENCODE( n'аaaлекс' ) As aleks from dual;
ALEKS
--------------------------------------------------------------------------------
BDAAYQBhBDsENQQ6BEE=
select my_base64.BASE64_DECODE( 'BDAAYQBhBDsENQQ6BEE=' ) as aleks from dual;
ALEKS
--------------------------------------------------------------------------------
аaaлекс
select my_base64.BASE64_DECODE( my_base64.BASE64_ENCODE( n'аaaлекс' ) ) as Aleks from dual;
ALEKS
--------------------------------------------------------------------------------
аaaлекс
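Note that the package above round-trips the text through the national character set (AL16UTF16 here), so it only decodes BASE64 that was produced from UTF-16 bytes. If the caller sends BASE64 of UTF-8 bytes, as the question title suggests, the bytes need an explicit character set conversion. A minimal sketch of that, assuming the UTL_I18N package is available to your schema and server output is enabled; the variable names are made up for illustration:

```sql
DECLARE
  -- UNISTR builds the test name (алекс) without depending on the client character set
  l_name    NVARCHAR2(100) := UNISTR('\0430\043B\0435\043A\0441');
  l_b64     VARCHAR2(400);
  l_decoded NVARCHAR2(100);
BEGIN
  -- Encode: NVARCHAR2 -> UTF-8 bytes -> BASE64 text (what a client would send)
  l_b64 := UTL_RAW.CAST_TO_VARCHAR2(
             UTL_ENCODE.BASE64_ENCODE(
               UTL_I18N.STRING_TO_RAW(l_name, 'AL32UTF8')));
  DBMS_OUTPUT.PUT_LINE(l_b64);

  -- Decode: BASE64 text -> UTF-8 bytes -> NVARCHAR2
  l_decoded := UTL_I18N.RAW_TO_NCHAR(
                 UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW(l_b64)),
                 'AL32UTF8');

  -- DBMS_OUTPUT cannot print NVARCHAR2 reliably, so print the \XXXX escapes instead
  DBMS_OUTPUT.PUT_LINE(ASCIISTR(l_decoded));
END;
/
```

The BASE64 text itself is plain ASCII, so it is safe to carry in a VARCHAR2 parameter; only the decoded value has to stay in NVARCHAR2.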

Related

How to convert CLOB to UTF8 in an Oracle query?

I've got a query with a CLOB field whose value I want to return in UTF8 format. The following query works fine if the field is a VARCHAR2, for example, but if it is a CLOB it doesn't return a correct UTF8 string.
select convert(field, 'AL32UTF8', 'WE8ISO8859P15') from table;
What can I do to return a UTF8 string from a CLOB in a query?
Use DBMS_LOB.CONVERTTOBLOB.
From the Oracle documentation:
Oracle discourages the use of the CONVERT function in the current
Oracle Database release. The return value of CONVERT has a character
datatype, so it should be either in the database character set or in
the national character set, depending on the datatype. Any
dest_char_set that is not one of these two character sets is
unsupported. …
If you need a character datatype like CLOB in a character set that differs from the ones the database is set up with, it should be converted into a BLOB.
This is where DBMS_LOB.CONVERTTOBLOB comes in.
If you need a function that returns a BLOB, you have to wrap CONVERTTOBLOB in your own function.
For example:
CREATE OR REPLACE FUNCTION clob_to_blob (p_clob CLOB, p_charsetname VARCHAR2)
  RETURN BLOB
AS
  l_lang_ctx    INTEGER := DBMS_LOB.default_lang_ctx;
  l_warning     INTEGER;
  l_dest_offset NUMBER := 1;
  l_src_offset  NUMBER := 1;
  l_return      BLOB;
BEGIN
  DBMS_LOB.createtemporary (l_return, FALSE);
  DBMS_LOB.converttoblob (
    l_return,
    p_clob,
    DBMS_LOB.lobmaxsize,
    l_dest_offset,
    l_src_offset,
    CASE WHEN p_charsetname IS NOT NULL THEN NLS_CHARSET_ID (p_charsetname) ELSE DBMS_LOB.default_csid END,
    l_lang_ctx,
    l_warning);
  RETURN l_return;
END;
/
This allows queries like:
SELECT clob_to_blob (field, 'UTF8') FROM t;
To get a list of supported values for the character set name, use:
SELECT *
FROM v$nls_valid_values
WHERE parameter = 'CHARACTERSET';
Use the dbms_lob package for it, for example:
select convert(dbms_lob.substr(field,dbms_lob.getlength(field), 0),
'AL32UTF8',
'WE8ISO8859P15')
from table;
Fixed it (the offset argument 0 was removed, so the default offset of 1 applies):
select convert(dbms_lob.substr(field,dbms_lob.getlength(field)),
'AL32UTF8',
'WE8ISO8859P15')
from table;

Create Oracle XMLTYPE from CLOB specifying character set

I'm trying to create XMLTYPE from CLOB column and specify character set explicitly. I've found there is an overloaded XMLTYPE.createXML function which accepts character set but when I execute passing additional arguments I get an error. Why?
SELECT
XMLTYPE.createXML(TO_CLOB ('<node1><node2>the &#180; character</node2></node1>'),NLS_CHARSET_ID('AL32UTF8'),'',1,1)
from dual;
error:
ORA-06553: PLS-306: wrong number or types of arguments in call to
'CREATEXML'
The reason why I bother passing the character set is that the CLOB column contains characters which are encoded with a different character set than the database character set (which, for example, doesn't support &#180;).
"The reason why I bother passing the character set is, CLOB column contains characters which are encoded with different character set than the database character set (which for example doesn't support &#180;)."
This one I don't understand. &#180; is simple plain ASCII; it should work under any conditions.
Simply run
SELECT
XMLTYPE.createXML(TO_CLOB('<node1><node2>the &#180; character</node2></node1>'))
from dual;
or even shorter
SELECT
XMLTYPE('<node1><node2>the &#180; character</node2></node1>')
from dual;
Now, let's assume your XML contains characters which are not supported by database character set, in this case your XML could be <node1><node2>the ´ character</node2></node1> for example.
First of all you cannot store (or use) any character in CLOB (or VARCHAR2) which is not supported by database character set - never! You must use NCLOB (or NVARCHAR2) which are based on National Database Character Set and typically support any Unicode character.
You can specify character set in XMLTYPE.createXML(), however then you must provide XML as BLOB. You could do it like this:
DECLARE
  xmlString    NCLOB := '<node1><node2>the '||NCHR(180)||' character</node2></node1>';
  xmlDoc       XMLTYPE;
  xmlBinary    BLOB;
  lang_context INTEGER := DBMS_LOB.DEFAULT_LANG_CTX;
  dest_offset  INTEGER := 1;
  src_offset   INTEGER := 1;
  read_offset  INTEGER := 1;
  warning      INTEGER;
BEGIN
  DBMS_LOB.CREATETEMPORARY(xmlBinary, TRUE);
  DBMS_LOB.CONVERTTOBLOB(xmlBinary, xmlString, DBMS_LOB.LOBMAXSIZE, dest_offset, src_offset, 2000, lang_context, warning);
  xmlDoc := XMLTYPE.createXML(xmlBinary, 2000, NULL, 1, 1);
END;
2000 is the csid of your national database character set. Use
SELECT PARAMETER, VALUE, NLS_CHARSET_ID(VALUE)
FROM NLS_DATABASE_PARAMETERS
WHERE PARAMETER LIKE '%CHARACTERSET';
to get your IDs.
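If you would rather not hard-code 2000, the ID can also be resolved by name. A small sketch, assuming the NLS_CHARSET_ID keywords CHAR_CS and NCHAR_CS (which stand for the database and national character sets) behave as documented on your release:

```sql
-- Resolve character set IDs at run time instead of hard-coding them
SELECT NLS_CHARSET_ID('CHAR_CS')                     AS db_csid,
       NLS_CHARSET_ID('NCHAR_CS')                    AS nchar_csid,
       NLS_CHARSET_NAME(NLS_CHARSET_ID('NCHAR_CS'))  AS nchar_name
FROM dual;
```

In the PL/SQL block above, NLS_CHARSET_ID('NCHAR_CS') could then take the place of the literal 2000 in both calls.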
Some notes:
I tried with the string N'<node1><node2>the ´ character</node2></node1>', however Oracle immediately replaced ´ with ¿. I did not manage to enter ´ directly.
Almost all XML functions return VARCHAR2 values (not NVARCHAR2), and most XMLTYPE member functions work with CLOB (not NCLOB). If you just read and store your XML documents as XMLTYPE in your database, it should be fine; however, as soon as you start doing any operations on this data, sooner or later you will hit a conversion error. You should really consider migrating your database character set, see Character Set Migration and/or Oracle Database Migration Assistant for Unicode
You can directly use the XMLType function
SELECT XMLTYPE('<?xml version="1.0" encoding="UTF-8"?>'
||TO_CLOB ('<node1><node2>the ´ character</node2></node1>')) myxml
FROM dual;

Oracle PLSQL equivalent of ASCIISTR(N'str')

My database has NLS_LANGUAGE:AMERICAN / NLS_CHARACTERSET:WE8ISO8859P15 / NLS_NCHAR_CHARACTERSET:AL16UTF16; NLS_LANG is set to AMERICAN_AMERICA.WE8MSWIN1252 in my Windows> properties> advanced system settings> advanced> environment variables - hope it applies to my PLSQL Dev.
I use ASCIISTR to get a unicode encoded value for exotic chars like this:
SELECT ASCIISTR(N'κόσμε') FROM DUAL;
Results in
ASCIISTR(UNISTR('\03BA\1F79\03...
---------------------------------
\03BA\1F79\03C3\03BC\03B5
It looks like the 'N' means the string is Unicode, because if I don't specify it the result is wrongly encoded.
SELECT ASCIISTR('κόσμε') FROM DUAL;
Results in
ASCIISTR('??SµE')
--------------------
??s\00B5e
What does this 'N' stand for? How do I invoke it in PL/SQL?
I intend to use it on a PL/SQL variable to encode exotic characters like this:
DECLARE
  l_in VARCHAR2(2000);
  l_ec VARCHAR2(2000);
  l_dc VARCHAR2(2000);
BEGIN
  l_in := 'κόσμε';
  execute immediate 'select ASCIISTR(N'''||l_in||''') from dual' into l_ec;
  DBMS_OUTPUT.PUT_LINE(l_ec);
  select unistr(l_ec) into l_dc from dual;
  DBMS_OUTPUT.PUT_LINE (l_dc);
END;
But I get
??s\00B5e
??sµe
As if I were in the second case above, without the 'N'
N'κόσμε' is (more or less) equivalent to CAST('κόσμε' AS NVARCHAR2(..))
With N'κόσμε' you say "treat the string as NVARCHAR2". If you write just 'κόσμε', the string is treated as VARCHAR2. However, your NLS_CHARACTERSET is WE8ISO8859P15, which does not support Greek characters, so you get ? as a placeholder.
You didn't tell us your NLS_NCHAR_CHARACTERSET setting; most likely it supports Unicode.
By the way, you don't have to select ... from dual; simply write
l_ec := ASCIISTR(N'κόσμε');
in PL/SQL.
What is your local NLS_LANG value, i.e. on your client side? Most likely it does not match the character encoding of your SQL*Plus. See this answer for more details: OdbcConnection returning Chinese Characters as "?"
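In the question's block, l_in is a VARCHAR2, so the Greek characters are most likely already lost on assignment, before the dynamic SQL even runs. A minimal sketch that keeps the value in NVARCHAR2 throughout and builds the input with UNISTR so it does not depend on the client NLS_LANG; the variable names follow the question, and the round-trip check is only there for illustration:

```sql
DECLARE
  -- Same word as in the question, written as code points
  l_in NVARCHAR2(2000) := UNISTR('\03BA\1F79\03C3\03BC\03B5');
  l_ec VARCHAR2(2000);   -- escaped form is plain ASCII, so VARCHAR2 is safe
  l_dc NVARCHAR2(2000);  -- decoded form must stay in NVARCHAR2
BEGIN
  l_ec := ASCIISTR(l_in);        -- no SELECT ... FROM dual needed
  DBMS_OUTPUT.PUT_LINE(l_ec);

  l_dc := UNISTR(l_ec);
  -- DBMS_OUTPUT cannot show the Greek text, so report only whether it survived
  DBMS_OUTPUT.PUT_LINE(CASE WHEN l_dc = l_in
                            THEN 'round trip OK'
                            ELSE 'round trip lost data' END);
END;
/
```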
I (sadly) discovered in PLSQL decode NVARCHAR2 from BASE64 to UTF-8 that DBMS_OUTPUT doesn't support the NVARCHAR2 datatype, so I can't use it to debug.
Instead, I can do the following to test:
-- encoding
CREATE OR REPLACE FUNCTION my_ec(l_in nvarchar2) RETURN varchar2 is
l_out varchar2(32000);
BEGIN
l_out := asciistr(l_in);
return l_out;
END;
/
-- decoding
CREATE OR REPLACE FUNCTION my_dc(l_in varchar2) RETURN nvarchar2 is
l_out nvarchar2(32000);
BEGIN
l_out := unistr(l_in);
return l_out;
END;
/
with the expected result!
select my_ec(N'κόσμε') from dual;
--'\03BA\1F79\03C3\03BC\03B5'
select my_dc('\03BA\1F79\03C3\03BC\03B5') from dual;
--'κόσμε'
select my_dc(my_ec(N'κόσμε')) from dual;
--'κόσμε'

How to obtain the hexa code of a NCHAR into a VARCHAR2

I'm working on converting a T-SQL script to PL/SQL, and I need your help with a type conversion.
My T-SQL script:
set @cust_name_hex = convert(VARCHAR(max),convert(varbinary(max), @cust_name),2)
My conversion, but I'm not really sure about it:
set cust_name_hex = TO_CHAR(cust_name);
I have to obtain the hex code of the 'cust_name' variable. I searched the web and found the RAWTOHEX function.
I forgot to mention that the variable cust_name is an NCHAR. As I understand it, the conversion chain in T-SQL is NVARCHAR -> VARBINARY -> VARCHAR.
In PL/SQL I tried to make the same conversion, but I don't get the right result. I don't know how to convert an NCHAR to a VARCHAR2 that gives me the hex value.
A combination of the UTL_RAW.CAST_TO_RAW and RAWTOHEX functions should do the job:
SELECT RAWTOHEX(UTL_RAW.CAST_TO_RAW(N'unicode text')) FROM DUAL;
or, using PL/SQL:
DECLARE
  cust_name_hex VARCHAR2(255);
BEGIN
  cust_name_hex := RAWTOHEX(UTL_RAW.CAST_TO_RAW(N'unicode text'));
  DBMS_OUTPUT.PUT_LINE(cust_name_hex);
END;
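One thing to watch when comparing against the T-SQL output: SQL Server stores NVARCHAR as little-endian UTF-16, while CAST_TO_RAW on an NCHAR value returns the bytes of the national character set, which is big-endian when that is AL16UTF16. A sketch of both byte orders side by side, assuming the national character set is AL16UTF16 and that the AL16UTF16LE character set name is accepted by UTL_I18N.STRING_TO_RAW on your release:

```sql
-- hex_be: bytes as stored in an AL16UTF16 national character set (big-endian)
-- hex_le: the same text converted to little-endian UTF-16, as SQL Server would store it
SELECT RAWTOHEX(UTL_RAW.CAST_TO_RAW(N'ab'))                    AS hex_be,
       RAWTOHEX(UTL_I18N.STRING_TO_RAW(N'ab', 'AL16UTF16LE'))  AS hex_le
FROM dual;
```

If the hex string has to match what the T-SQL script produced, the second form is probably the one to use.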

Making a sha1-hash of a row in Oracle

I'm having a problem with making a sha1-hash of a row in a select on an Oracle database. I've done it in MSSQL as follows:
SELECT *,HASHBYTES('SHA1',CAST(ID as varchar(10))+
TextEntry1+TextEntry2+CAST(Timestamp as varchar(10))) as Hash
FROM dbo.ExampleTable
WHERE ID = [foo]
However, I can't seem to find a similar function to use when working with Oracle.
As far as my googling has brought me, I'm guessing dbms_crypto.hash_sh1 has something to do with it, but I haven't been able to wrap my brain around it yet...
Any pointers would be greatly appreciated.
The package DBMS_CRYPTO is the correct package to generate hashes. It is not granted to PUBLIC by default, so you will have to grant it specifically (GRANT EXECUTE ON SYS.DBMS_CRYPTO TO user1).
The result of this function is of datatype RAW. You can store it in a RAW column or convert it to VARCHAR2 using the RAWTOHEX or UTL_ENCODE.BASE64_ENCODE functions.
The HASH function is overloaded to accept three datatypes as input: RAW, CLOB and BLOB. Due to the rules of implicit conversion, if you use a VARCHAR2 as input, Oracle will try to convert it to RAW and will most likely fail since this conversion only works with hexadecimal strings.
If you use VARCHAR2, then you need to convert the input to a binary datatype or a CLOB, for instance:
DECLARE
  x RAW(20);
BEGIN
  SELECT sys.dbms_crypto.hash(utl_raw.cast_to_raw(col1||col2||to_char(col3)),
                              sys.dbms_crypto.hash_sh1)
    INTO x
    FROM t;
END;
You will find additional information in the documentation of DBMS_CRYPTO.hash.
The DBMS_CRYPTO package does not support VARCHAR2. It works with the RAW type, so if you need a VARCHAR2 you have to convert it. Here is a sample block showing how to do this:
declare
  p_string           varchar2(2000) := 'Hello world !';
  lv_hash_value_md5  raw (100);
  lv_hash_value_sh1  raw (100);
  lv_varchar_key_md5 varchar2 (32);
  lv_varchar_key_sh1 varchar2 (40);
begin
  lv_hash_value_md5 :=
    dbms_crypto.hash (src => utl_raw.cast_to_raw (p_string),
                      typ => dbms_crypto.hash_md5);
  -- convert into varchar2
  select lower (to_char (rawtohex (lv_hash_value_md5)))
    into lv_varchar_key_md5
    from dual;
  lv_hash_value_sh1 :=
    dbms_crypto.hash (src => utl_raw.cast_to_raw (p_string),
                      typ => dbms_crypto.hash_sh1);
  -- convert into varchar2
  select lower (to_char (rawtohex (lv_hash_value_sh1)))
    into lv_varchar_key_sh1
    from dual;
  --
  dbms_output.put_line('String to encrypt : '||p_string);
  dbms_output.put_line('MD5 encryption : '||lv_varchar_key_md5);
  dbms_output.put_line('SHA1 encryption : '||lv_varchar_key_sh1);
end;
Just to put it here, in case someone searches for it.
In Oracle 12c you can use the standard_hash(<your_value>, <algorithm>) function.
If the <algorithm> parameter is omitted, it generates a SHA-1 hash (output datatype RAW(20)).
You can define this function in your favorite package; I defined it in utils_pkg.
FUNCTION SHA1(STRING_TO_ENCRIPT VARCHAR2) RETURN VARCHAR2 AS
BEGIN
RETURN LOWER(TO_CHAR(RAWTOHEX(SYS.DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(STRING_TO_ENCRIPT), SYS.DBMS_CRYPTO.HASH_SH1))));
END SHA1;
Now to call it
SELECT UTILS_PKG.SHA1('My Text') AS SHA1 FROM DUAL;
The response is
SHA1
--------------------------------------------
5411d08baddc1ad09fa3329f9920814c33ea10c0
You can select a column from some table:
SELECT UTILS_PKG.SHA1(myTextColumn) FROM myTable;
Enjoy!
Oracle 19c:
select LOWER(standard_hash('1234')) from dual;
which is equivalent to
select LOWER(standard_hash('1234','SHA1')) from dual;
will return an SHA1 hash.
For alternative algorithms see: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/STANDARD_HASH.html
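Applied to the original question, STANDARD_HASH can be used directly in the select list. A sketch only: the table and column names below (exampletable, id, textentry1, textentry2, ts) are hypothetical stand-ins for the ones in the MSSQL query, ts is assumed to be a DATE or TIMESTAMP column, and :id is a bind variable:

```sql
-- Hash several columns of one row with SHA-1 and show it as lowercase hex
SELECT t.*,
       LOWER(RAWTOHEX(STANDARD_HASH(
         TO_CHAR(t.id) || t.textentry1 || t.textentry2 ||
         TO_CHAR(t.ts, 'YYYY-MM-DD HH24:MI:SS'),
         'SHA1'))) AS row_hash
FROM   exampletable t
WHERE  t.id = :id;
```

Concatenating without a separator means 'ab' || 'c' and 'a' || 'bc' hash to the same value, so a delimiter between the columns may be worth adding. The hash is computed over the bytes in the database character set, so it will not necessarily match a hash computed by SQL Server over the same text.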
