Firebird UTF-8 database with Turkish collation

I have a UTF-8 database with Turkish text data stored in it. Turkish is a problem when converting to uppercase or lowercase: unlike other Latin-script languages, Turkish has different conversion rules for the "i" and "I" characters. The problem is very common among RDBMS products. Most commercial and some open-source RDBMSs have solved this issue, but not Firebird, despite the fact that it is very popular among Turkish developers. BTW, it is not an issue when the database character set is ISO8859-9 (Turkish).
"i" -> uppercase -> "İ"
"ı" -> uppercase -> "I"
As far as I know, Firebird does not have a collation for Unicode/Turkish.
So the word "ikna" is uppercased as "IKNA" when it should be "İKNA".
Does anyone have a workaround for such cases? Specifically, I want a case-insensitive LIKE search on text data.
Pretty informative
http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html

Ceyhun, did you try character set WIN1254 collate pxw_turk?
create table test
(
recid integer,
name_ varchar(50) character set WIN1254 collate pxw_turk
);
commit;
insert into test(name_, recid) values('İsmail Ilgın', 1);
select
name_,
upper(name_),
lower(name_)
from test;
Or, using a domain:
create domain mydomain as varchar(50) character set WIN1254 collate pxw_turk;
create table test
(
recid integer,
name_ mydomain
);
commit;
insert into test(name_, recid) values('İsmail Ilgın', 1);
select
name_,
upper(name_),
lower(name_)
from test;
And the result is:
'İsmail Ilgın' 'İSMAİL ILGIN' 'ismail ılgın'
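The casing rule behind that result can be sketched outside the database. This is a minimal Python illustration, not Firebird code: the helper names turkish_upper, turkish_lower and turkish_contains are made up for the example, and only the dotted/dotless i pairs are handled before falling back to the default Unicode case mapping.

```python
# Illustration only: hypothetical helpers, not part of Firebird or any library.
# Turkish pairs: i <-> İ (dotted) and ı <-> I (dotless).
_TR_UPPER = {ord("i"): "İ", ord("ı"): "I"}
_TR_LOWER = {ord("İ"): "i", ord("I"): "ı"}


def turkish_upper(text: str) -> str:
    """Uppercase text, applying the Turkish rule for i and ı first."""
    return text.translate(_TR_UPPER).upper()


def turkish_lower(text: str) -> str:
    """Lowercase text, applying the Turkish rule for İ and I first."""
    return text.translate(_TR_LOWER).lower()


def turkish_contains(haystack: str, needle: str) -> bool:
    """Case-insensitive substring test, roughly LIKE '%needle%'."""
    return turkish_upper(needle) in turkish_upper(haystack)


print("ikna".upper())                 # default mapping: IKNA
print(turkish_upper("ikna"))          # Turkish mapping: İKNA
print(turkish_upper("İsmail Ilgın"))  # İSMAİL ILGIN
print(turkish_lower("İsmail Ilgın"))  # ismail ılgın
```

Normalizing both sides with the same function before comparing is the usual shape of a case-insensitive search; whether that is done in the application or through a collation in the database is a separate design choice.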

Related

Unable to insert Arabic alphabets in DB

I am trying to run an SQL file containing the insert query below, which has to insert Arabic letters:
INSERT INTO language
(locale_id, language_id,
VALUE
)
VALUES (4011951073333968003, 9161117031233296391,
'ابةتثجحخدذرزسشصضطظعغفقكلمنهوي'
);
But the result of this query is:
INSERT INTO language
(locale_id, language_id,
VALUE
)
VALUES (4011951073333968003, 9161117031233296391,
'يوهنملكقÙغعظطضصشسزرذدخحجثتبأ'
);
I am facing this issue even though I have set the charset to UTF-8.
Second problem:
If I specify a number after the Arabic letters, it is misplaced and ends up before the Arabic text, like this:
INSERT INTO language
(locale_id, language_id,
VALUE, priority
)
VALUES (4011951073333968003, 9161117031233296391,
1,'ابةتثجحخدذرزسشصضطظعغفقكلمنهوي'
);
Here, the corresponding priority value is 1, but I am unable to place it after the Arabic script; every time I put it after the script, it moves to the beginning.
Can anyone suggest a solution to this problem?
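The garbled output in the first problem has the shape you get when UTF-8 bytes are decoded with a single-byte Windows code page somewhere between the file and the database. The Python sketch below reproduces that effect for one letter. Taking cp1252 as the wrong decoder is an assumption based on the characters in the output, not something the question states.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as cp1252."""
    # errors="replace" because a few byte values are undefined in cp1252.
    return text.encode("utf-8").decode("cp1252", errors="replace")


def ungarble(text: str) -> str:
    """Reverse the mistake, provided every garbled character survived."""
    return text.encode("cp1252").decode("utf-8")


letter = "ي"                        # ARABIC LETTER YEH, UTF-8 bytes D9 8A
broken = garble(letter)
print(broken)                       # ÙŠ
print(ungarble(broken) == letter)   # True
```

If this is the cause, the file, the client tool, and the database connection all need to agree on UTF-8; setting it in only one of those places is not enough.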

How to make a good procedure in PL/SQL?

I have just started creating procedures in PL/SQL, but my skills are very limited. I have this table:
bank_account (
NUM_CC NUMBER(10) CONSTRAINT PK_NUMCC PRIMARY KEY,
NUM_CLIENT NUMBER(10),
SOLD NUMBER(10)
)
I want to create a procedure to transfer an amount 'm' to my account, so I have written this procedure:
Create or replace procedure virement-cc(num-client number, num-cc number , m number)AS
Begin
UPDATE bank_account
SET SOLD = SOLD+m
WHERE NUM_CLIENT = num-client
AND NUM_CC = num-cc ;
End ;
This procedure is not good, but I would like to know how to improve it to resolve my problem. Thank you all.
Don't use hyphens in identifiers; use underscores.
Don't give parameters the same names as columns; prefix them with e.g. p_. Otherwise, Oracle resolves the name to the column rather than the parameter: an assignment simply sets the column to its current value, and you'd have the impression that nothing happened.
So:
Create or replace procedure virement_cc
(p_num_client number, p_num_cc number, p_m number)
AS
Begin
UPDATE bank_account
SET SOLD = SOLD + p_m
WHERE NUM_CLIENT = p_num_client
AND NUM_CC = p_num_cc;
End ;
You need to distinguish between 'num_cc' the input parameter and 'num_cc' the column name. Rename 'num_cc' the input parameter as 'p_num_cc'.
Your WHERE clause has an issue:
WHERE NUM_CLIENT = p_num_client
AND NUM_CC = p_num_cc ;
(I've gone ahead and renamed your input parameter)
If NUM_CC is the primary key (it is), then it uniquely identifies a particular row, so there is no need to include a comparison on NUM_CLIENT. And with that, you don't even need an input parameter for it.
Give more thought to your naming conventions
I name all of my table columns in the format 'adjective_noun'. And don't be afraid to be descriptive. Instead of NUM_CC, how about CARD_NUMBER. Instead of NUM_CLIENT, try CLIENT_NUMBER. This 'adjective_noun' format also has the advantage of absolutely eliminating the possibility of accidentally trying to use a reserved or key word.
I name all of my parameters P_'adjective_noun'
I name all of my internal variables V_'adjective_noun'.
You may well decide on different naming standards. The key is to actually have naming standards, and for those standards to be well thought out and reasoned.
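The point about parameters that share a column's name can be seen with any SQL engine: once both sides of a comparison resolve to the column, the condition no longer filters anything. The sketch below uses Python's built-in sqlite3 module with a lowercase copy of the table, purely as an illustration. SQLite is not Oracle, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank_account ("
    " num_cc INTEGER PRIMARY KEY, num_client INTEGER, sold INTEGER)"
)
# Invented sample rows: (num_cc, num_client, sold).
conn.executemany(
    "INSERT INTO bank_account VALUES (?, ?, ?)",
    [(1, 10, 100), (2, 20, 200)],
)

# Both sides of the comparison are the column, so every row matches.
conn.execute("UPDATE bank_account SET sold = sold + 50 WHERE num_cc = num_cc")
after_self_compare = conn.execute(
    "SELECT sold FROM bank_account ORDER BY num_cc"
).fetchall()
print(after_self_compare)

# A distinctly named parameter (here a bound value) limits the update to one row.
conn.execute(
    "UPDATE bank_account SET sold = sold + 50 WHERE num_cc = :p_num_cc",
    {"p_num_cc": 1},
)
after_param = conn.execute(
    "SELECT sold FROM bank_account ORDER BY num_cc"
).fetchall()
print(after_param)
```

The first update touches both accounts, the second only account 1, which is why the p_ prefix matters.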

Postgres: After converting from bytea to varchar '\r' remains

I have a table that contains XML files as binary data. The XML contains "\r\n" characters, stored as "\015\012" in bytea. I need to change the column type from bytea to varchar.
I run:
ALTER TABLE my_table ALTER COLUMN xml_data TYPE VARCHAR;
UPDATE my_table SET xml_data = convert_from(xml_data::bytea, 'UTF8');
This works on Linux. But on Windows it converts '\015' to "\r" (two characters), so I get something like this in the result:
<field>...</field>\r
<field>...</field>
Is there a proper way to convert binary data to UTF-8?
You'll have to strip the carriage returns in a separate step.
If you are ok with getting rid of them wholesale, I suggest something like:
ALTER TABLE my_table
ALTER xml_data TYPE text
USING replace(
convert_from(xml_data, 'UTF8'),
E'\r',
''
);
Is there a good reason for using data type varchar (or text, which is the same) rather than xml?
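As a cross-check outside the database, the same two steps (decode the bytes as UTF-8, then drop the carriage returns) can be sketched in Python. The sample bytes are made up to mirror the "\015\012" line endings described in the question.

```python
# Made-up sample: two XML fields separated by CR LF (octal 015 012).
raw = b"<field>a</field>\r\n<field>b</field>"

# Corresponds to convert_from(xml_data, 'UTF8').
decoded = raw.decode("utf-8")

# Corresponds to replace(..., E'\r', '').
cleaned = decoded.replace("\r", "")

print(repr(cleaned))
```

Only the carriage returns are removed; the line feeds stay, so the line structure of the XML is preserved.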

Oracle 12c - SQL Loader Invalid month Error

I'm getting an "ORA-01843: not a valid month" error while loading data from a tab-delimited text file with SQL*Loader 11.1.0.6.0 on Oracle 12c.
Control file:
options (skip=1)
load data
truncate
into table test_table
fields terminated by '\t' TRAILING NULLCOLS
(
type,
rvw_date "case when :rvw_date = 'NULL' then null WHEN REGEXP_LIKE(:rvw_date, '\d{4}/\d{2}/\d{2}') THEN to_date(:rvw_date,'yyyy/mm/dd') else to_date(:rvw_date,'mm-dd-yy') end"
)
Data:
type rvw_date
Phone 2014/01/29
Phone 2014/02/13
Field NULL
Phone 01/26/15
Field 02/25/12
Schema:
create table test_table
(
type varchar2(20),
rvw_date date
)
The SQL*Loader control file is interpreting a backslash as an escape character, it seems. The SQL operator section shows double-quotes being escaped too; it isn't obvious it would apply to anything else, but it kind of makes sense that a single backslash would always be assumed to be escaping something.
Your regular expression pattern needs to double-escape the \d to make the pattern work in the file as it does in plain SQL. At the moment the pattern is not matched, so all the values go to the else branch, which has the wrong format mask for them (the else mask also needed correcting, as in the version below).
This works:
options (skip=1)
load data
truncate
into table test_table
fields terminated by '\t' TRAILING NULLCOLS
(
type,
rvw_date "case when :rvw_date = 'NULL' then null when REGEXP_LIKE(:rvw_date,'\\d{4}/\\d{2}/\\d{2}') then to_date(:rvw_date,'yyyy/mm/dd') else to_date(:rvw_date,'mm/dd/rr') end"
)
With your original data that creates rows:
alter session set nls_date_format = 'YYYY-MM-DD';
select * from test_table;
TYPE RVW_DATE
-------------------- ----------
Phone 2014-01-29
Phone 2014-02-13
Field
Phone 2015-01-26
Field 2012-02-25
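The branching in that CASE expression, and the effect of losing the backslashes, can be sketched in Python. This is only an illustration: parse_rvw_date is a made-up name, the UNESCAPED pattern assumes the single backslash is simply dropped by the control-file parser, and Python's %y is only an approximation of Oracle's RR mask.

```python
import re
from datetime import date, datetime
from typing import Optional

ESCAPED = r"\d{4}/\d{2}/\d{2}"   # what the doubled backslashes deliver
UNESCAPED = "d{4}/d{2}/d{2}"     # assumed result if the backslashes are consumed


def parse_rvw_date(value: str, pattern: str = ESCAPED) -> Optional[date]:
    """Mirror the CASE expression: NULL, then yyyy/mm/dd, else mm/dd/yy."""
    if value == "NULL":
        return None
    if re.search(pattern, value):
        return datetime.strptime(value, "%Y/%m/%d").date()
    # %y stands in for RR here; the two differ for some two-digit years.
    return datetime.strptime(value, "%m/%d/%y").date()


for raw in ["2014/01/29", "NULL", "01/26/15", "02/25/12"]:
    print(raw, "->", parse_rvw_date(raw))

# With the broken pattern the yyyy/mm/dd branch is never taken,
# so a four-digit-year value reaches the mm/dd/yy parser and fails.
try:
    parse_rvw_date("2014/01/29", UNESCAPED)
except ValueError as exc:
    print("broken pattern:", exc)
```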

Why did SQL*Loader load 808594481 when using the INTEGER data-type?

I was loading data using SQL*Loader and when making the control file I used the table definition and accidentally left the INTEGER data type on the "version" line.
And in the "version" field (data type integer) it inserted the value 808594481.
I'm having a hard time understanding how it processed this value -- I'm assuming it took it as a literal ... but is that the sum of the ASCII representations of each letter?
NOPE!
SELECT ASCII('I')+ascii('N')+ASCII('T')+ASCII('E')+ASCII('G')+ASCII('E')+ASCII('G')+ASCII('E')+ASCII('R')
FROM SYS.DUAL
returns 666 (which, btw is hilarious).
concatenate ascii values?
SELECT ASCII('I')||ascii('N')||ASCII('T')||ASCII('E')||ASCII('G')||ASCII('E')||ASCII('G')||ASCII('E')||ASCII('R')
FROM SYS.DUAL
returns 737884697169716982
I'm hoping someone out there knows the answer.
This is the actual control file:
OPTIONS (SKIP=1)
LOAD DATA
APPEND into table THETABLE
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(id ,
parent_id ,
record_id ,
version INTEGER,
created_at ,
updated_at ,
created_by ,
updated_by ,
species_and_cohort ,
species_and_cohort_count)
Table DDL:
create table THETABLE
(
id VARCHAR2(36),
parent_id VARCHAR2(36),
record_id VARCHAR2(36),
version INTEGER,
created_at VARCHAR2(25),
updated_at VARCHAR2(25),
created_by VARCHAR2(50),
updated_by VARCHAR2(50),
species_and_cohort VARCHAR2(150),
species_and_cohort_other VARCHAR2(150),
species_and_cohort_count NUMBER
)
Data:
id,parent_id,record_id,version,created_at,updated_at,created_by,updated_by,species_and_cohort,species_and_cohort_other,species_and_cohort_count
60D90F54-C5F2-47AF-951B-27A424EAE8E3,f9fe8a3b-3470-4caf-b0ba-3682a1c79731,f9fe8a3b-3470-4caf-b0ba-3682a1c79731,1,2014-09-23 21:02:54 UTC,2014-09-23 21:02:54 UTC,x#gmail.com,x#gmail.com,"PRCA Cherrylaurel,Sapling","",5
FC6A2120-AA0B-4238-A2F6-A6AEDD9B8202,f9fe8a3b-3470-4caf-b0ba-3682a1c79731,f9fe8a3b-3470-4caf-b0ba-3682a1c79731,1,2014-09-23 21:03:02 UTC,2014-09-23 21:03:02 UTC,x7#gmail.com,x7#gmail.com,"JUVI Eastern Redcedar,Sapling","",45
If you split 808594481 into bytes as it would be encoded in a 32-bit two's complement encoding, and treat each byte as an ASCII-encoded character, you get "02,1" or "1,20" depending on byte order. You probably inserted a string that starts or ends with one of those, and some layer between your code and the database silently converted it to an integer.
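The byte split is easy to check in Python. The second half of the sketch takes the text at the version field of the first data row ("1" followed by the comma and the start of the timestamp) and reads its first four bytes as a little-endian integer. That the loader reads four raw bytes for INTEGER is an assumption here, but it is consistent with the value seen.

```python
value = 808594481

# Interpret the 32-bit value as four ASCII characters, in both byte orders.
big = value.to_bytes(4, byteorder="big").decode("ascii")
little = value.to_bytes(4, byteorder="little").decode("ascii")
print(big)     # 02,1
print(little)  # 1,20

# Text starting at the version field of the first data row in the question.
field_start = b"1,2014-09-23 21:02:54 UTC"
as_int = int.from_bytes(field_start[:4], byteorder="little")
print(as_int)  # 808594481
```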
