Insert Unicode string to DB using Linq

When I try to execute this:
INSERT INTO [DB_NAME].[dbo].[Table]
([Column])
VALUES('some_hebrew_characters')
I get only question marks in the column. If I change it to N'some_hebrew_characters', then it's OK. Why is this happening? How can I translate it to Linq?
How can I make this table treat all data as Unicode by default? My column collation is Hebrew_CS_AI, and the server is SQL Server 2008 R2.
Thanks!
---EDIT----
something I just noticed:
even if I run this:
SELECT 'some_hebrew_characters'
I'm getting question marks in my results grid.

Didn't you forget to also define your column as NVARCHAR?

Probably your editor's default encoding is not Unicode.
To be sure, save your query as a Unicode file in SQL Server Management Studio and re-run it.
I think the results you get through Linq would be correct.

You need to prefix the string literal with the letter N.
When inserting a value that contains Unicode characters, you need to do this:
insert into table_name(unicode_field) values (N'会意字')
Without the N prefix, they'll be passed as ASCII characters.
Also, be sure that the column you're inserting into supports Unicode characters, i.e. nchar, nvarchar or ntext.

Related

Oracle APEX - no leading zero

I have an app built in Oracle APEX 18.2. Every number field in the app is missing its leading zero. For example, when the number is 0.5, APEX displays it as .5. The problem also occurs in SQL Workshop. In SQL Developer, numbers with leading zeros are formatted correctly, so I think this is a problem with Oracle APEX, not with the Oracle DB. Is there any global setting for number formatting in APEX?
As far as I can tell, there's no such global setting, which means that you'd have to apply some format mask either
directly (in the SELECT statement, within the TO_CHAR function call), or
in the column's (item's) properties
A format mask you might consider is FM999G990D00, as
FM will remove leading spaces and superfluous trailing zeros
instead of explicit , and . grouping and decimal characters, it uses G and D
I had an issue adding zeros to numbers, but I fixed it using a function in Oracle SQL Developer:
select LPAD((max(ID))+1, 6, '0') from Yourtable
and call it as a function.
You could probably use PL/SQL expressions to fix it.
I had the same issue recently. I fixed it using to_char(xxx,'FM999G990D00'):
SELECT mont.period_name AS PERIOD
, to_char(ivo.total_overtime,'FM999G990D00')
, to_char(emho.ACTUAL_TRANSFERRED_HOURS,'FM999G990D00')
, to_char(emho.actual_recup_days,'FM999G990D00')
FROM .....
worked like a charm
I'll assume you have a Classic or Interactive Report; in that case:
Go to page designer and select the column
Go to the format mask option
Select the numeric format you wish to have.
You'll probably get something like 999G999G999G999G990D00
If it has a 9D00 at the end, change the 9 to a 0.

Insert Unicode string to SQL Server using linq

I have a table where the Name column is nvarchar.
In a SQL statement I use the N prefix to save Unicode data:
Insert into TBL_Name (Name) values(N'Hôm nay đẹp trời')
It works fine.
But I don't know how to save a Unicode string when using Linq.
Please help me.
Thanks so much!
The N prefix means the string literal will be Unicode.
When you use Linq on an nvarchar field, Linq already knows that your string will be Unicode, so you don't need to do anything else.

Can N function cause problems with existing queries?

We use Oracle 10g and Oracle 11g.
We also have a layer to automatically compose queries, from pseudo-SQL code written in .net (something like SqlAlchemy for Python).
Our layer currently wraps any string in single quotes ' and, if it contains non-ANSI characters, automatically composes a UNISTR call with the special characters written as Unicode code points (like \00E0).
Now we created a method for doing multiple inserts with the following construct:
INSERT INTO ... (...)
SELECT ... FROM DUAL
UNION ALL SELECT ... FROM DUAL
...
This algorithm could compose queries where the same string field is sometimes passed as 'my simple string' and sometimes wrapped as UNISTR('my string with special chars like \00E0').
The described condition causes an ORA-12704: character set mismatch.
One solution is to use the INSERT ALL construct but it is very slow compared to the one used now.
Another solution is to instruct our layer to put N in front of any string (except for the ones already wrapped with UNISTR). This is simple.
I just want to know if this could cause any side-effect on existing queries.
Note: all our fields on DB are either NCHAR or NVARCHAR2.
Oracle ref: http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch7progrunicode.htm
Basically what you are asking is: is there a difference in how a string is stored with or without the N function?
You can check for yourself; consider:
SQL> create table test (val nvarchar2(20));
Table TEST created.
SQL> insert into test select n'test' from dual;
1 row inserted.
SQL> insert into test select 'test' from dual;
1 row inserted.
SQL> select dump(val) from test;
DUMP(VAL)
--------------------------------------------------------------------------------
Typ=1 Len=8: 0,116,0,101,0,115,0,116
Typ=1 Len=8: 0,116,0,101,0,115,0,116
As you can see, they are identical, so there is no side effect.
The reason this works so beautifully is the elegance of Unicode.
If you are interested, here is a nice video explaining it:
https://www.youtube.com/watch?v=MijmeoH9LT4
I assume that you get the error "ORA-12704: character set mismatch" because your data inside quotes is treated as CHAR while your fields are NCHAR, so the two are handled with different character sets: one uses NLS_CHARACTERSET, the other NLS_NCHAR_CHARACTERSET.
When you use the UNISTR function, it converts data from CHAR to NCHAR (and in doing so also converts the encoded values into characters), as the Oracle docs say:
"UNISTR takes as its argument a text literal or an expression that
resolves to character data and returns it in the national character
set."
When you convert values explicitly using N or TO_NCHAR, you only get values in NLS_NCHAR_CHARACTERSET without decoding. If you have values encoded like "\00E0", they will not be decoded and will remain unchanged.
So if you have an insert such as:
insert into ... select N'my string with special chars like \00E0',
UNISTR('my string with special chars like \00E0') from dual ...
the data in the first inserted field will be 'my string with special chars like \00E0', not 'my string with special chars like à'. This is the only side effect I'm aware of. Other queries should already use NLS_NCHAR_CHARACTERSET encoding, so there shouldn't be any problem with an explicit conversion.
And by the way, why not just insert all values as N'my string with special chars like à'? Just encode them into UTF-16 (I assume that you use UTF-16 for nchars) first if you use a different encoding in the 'upper level' software.
Use of the N function: you already have answers above.
If you have any chance to change the character set of the database, that would really make your life easier. I have worked on huge production systems, and the trend I found is that, because storage space is cheap, practically everyone moves to AL32UTF8, and the hassle of internationalization slowly becomes a painful memory of the past.
I found the easiest thing is to use AL32UTF8 as the character set of the database instance and simply use varchar2 everywhere. We read and write standard Java Unicode strings via JDBC as bind variables without any harm or fiddling.
Your idea to construct a huge text of SQL inserts may not scale well for multiple reasons:
there is a fixed maximum length for a SQL statement, so it won't work with 10,000 inserts
it is advised to use bind variables (and then you don't have the N'xxx' vs UNISTR mess either)
creating a new SQL statement dynamically is very resource-unfriendly; it does not allow Oracle to cache any execution plan, and makes Oracle hard parse your long statement on each call
What you're trying to achieve is a mass insert. Use the JDBC batch mode of the Oracle driver to perform that at light-speed, see e.g.: http://viralpatel.net/blogs/batch-insert-in-java-jdbc/
Note that insert speed is also affected by triggers (which have to be executed) and foreign key constraints (which have to be validated). So if you're about to insert more than a few thousand rows, consider disabling the triggers and foreign key constraints, and re-enabling them after the insert. (You'll lose the trigger calls, but the constraint validation after the insert can still make an impact.)
Also consider the rollback segment size. If you're inserting a million records, that will need a huge rollback segment, which will likely cause serious swapping on the storage media. It is a good rule of thumb to commit after every 1000 records.
(Oracle uses versioning instead of shared locks, therefore a table with uncommitted changes is still consistently available for reading. The 1000-record commit rate means roughly 1 commit per second: slow enough to benefit from write buffers, but frequent enough not to interfere with other users wanting to update the same table.)
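To illustrate the advice above, here is a minimal sketch (not from the original answer) of a batched insert with bind variables and a commit roughly every 1000 rows, using the plain Oracle JDBC API. The connection details, the table name MY_TABLE and the column NAME are made-up placeholders:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; replace with your own.
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password")) {
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO MY_TABLE (NAME) VALUES (?)")) {
                int n = 0;
                for (String value : valuesToInsert()) {
                    // Bind variable: no N'...' vs UNISTR() quoting needed in the SQL text.
                    // For NCHAR/NVARCHAR2 columns, setNString (or the oracle.jdbc.defaultNChar
                    // connection property) may be needed to preserve Unicode.
                    ps.setString(1, value);
                    ps.addBatch();
                    if (++n % 1000 == 0) {
                        ps.executeBatch();   // send the accumulated batch to the server
                        con.commit();        // commit roughly every 1000 rows
                    }
                }
                ps.executeBatch();           // flush the remaining rows
                con.commit();
            }
        }
    }

    // Placeholder for however the calling layer produces the strings.
    private static String[] valuesToInsert() {
        return new String[] { "my simple string", "my string with special chars like à" };
    }
}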

Oracle 10 and JDBC: how to make CHAR ignore trailing spaces in comparison?

I have a query that has
... WHERE PRT_STATUS='ONT' ...
The prt_status field is defined as CHAR(5) though, so it's always padded with spaces. The query matches nothing as a result. To make this query work I have to do
... WHERE rtrim(PRT_STATUS)='ONT'
which does work.
That's annoying.
At the same time, a couple of pure-Java DBMS clients I have (Oracle SQL Developer and AquaStudio) do NOT have a problem with the first query; they return the correct result. TOAD has no problem either.
I presume they simply put the connection into some compatibility mode (e.g. ANSI), so that Oracle knows the CHAR(5) is expected to be compared without regard to trailing characters.
How can I do it with Connection objects I get in my application?
UPDATE I cannot change the database schema.
SOLUTION It was indeed the way Oracle compares fields with passed in parameters.
When the bind is done, the string is passed via PreparedStatement.setString(), which sets the type to VARCHAR, so Oracle uses non-padded comparison and fails.
I tried to use setObject(n,str,Types.CHAR). Fails. Decompilation shows that Oracle ignores CHAR and passes it in as a VARCHAR again.
The variant that finally works is
setObject(n,str,OracleTypes.FIXED_CHAR);
It makes the code not portable though.
The UI clients succeed for a different reason: they use character literals, not binding. When I type PRT_STATUS='ONT', 'ONT' is a literal, and as such is compared the padded way.
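For reference, a minimal sketch of the working binding described above (illustrative, not the original application code); it assumes an open Connection, the Oracle JDBC driver's oracle.jdbc.OracleTypes class, and a made-up table name PRT_TABLE:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import oracle.jdbc.OracleTypes;

public class FixedCharBindSketch {
    // Assumes "con" is an open connection to the Oracle database.
    static void printMatches(Connection con) throws Exception {
        String sql = "SELECT * FROM PRT_TABLE WHERE PRT_STATUS = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            // setString() would bind as VARCHAR, so Oracle would use non-padded comparison and find nothing.
            // Binding as FIXED_CHAR makes Oracle apply blank-padded comparison semantics.
            ps.setObject(1, "ONT", OracleTypes.FIXED_CHAR);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("PRT_STATUS"));
                }
            }
        }
    }
}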
Note that Oracle compares CHAR values using blank-padded comparison semantics.
From Datatype Comparison Rules,
Oracle uses blank-padded comparison semantics only when both values in the comparison are either expressions of datatype CHAR, NCHAR, text literals, or values returned by the USER function.
In your example, is 'ONT' passed as a bind parameter, or is it built into the query textually, as you illustrated? If a bind parameter, then make sure that it is bound as type CHAR. Otherwise, verify the client library version used, as really old versions of Oracle (e.g. v6) will have different comparison semantics for CHAR.
If you cannot change your database table, you can modify your query.
Some alternatives for RTRIM:
.. WHERE PRT_STATUS like 'ONT%' ...
.. WHERE PRT_STATUS = 'ONT ' ... -- 2 white spaces behind T
.. WHERE PRT_STATUS = rpad('ONT',5,' ') ...
I would change the CHAR(5) column into VARCHAR2(5) in the DB.
You can use cast to char operation in your query:
... WHERE PRT_STATUS=cast('ONT' as char(5))
Or in more generic JDBC way:
... WHERE PRT_STATUS=cast(? as char(5))
And then in your JDBC code use statement.setString(1, "ONT");
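As an illustration, a small sketch of that approach (the table name PRT_TABLE and the open Connection are placeholders); it stays on the portable java.sql API:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CastCharBindSketch {
    // Assumes "con" is an open connection to the Oracle database.
    static void printMatches(Connection con) throws Exception {
        // The CAST pads the bound VARCHAR value to CHAR(5), so the comparison is blank-padded.
        String sql = "SELECT * FROM PRT_TABLE WHERE PRT_STATUS = CAST(? AS CHAR(5))";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "ONT");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("PRT_STATUS"));
                }
            }
        }
    }
}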

UTF-8 from Oracle tables

The client has asked for a number of tables to be extracted into CSVs; all done, no problem. They've just asked that we make sure the files are always in UTF-8 format.
How do I check this is actually the case? Or, even better, force it to be so; is it something I can set in a procedure before running the query, perhaps?
The data is extracted from an Oracle 10g database.
What should I be checking?
Thanks
You can check the database character set with the following query:
select value from nls_database_parameters
where parameter='NLS_CHARACTERSET'
If it says AL32UTF8 then your database is in the format you need, and if the export does not impair it then you are done.
You may read about Oracle globalization support and about NLS parameters like the one above in the Oracle documentation.
How, exactly, are you generating the CSV files? Depending on the exact architecture, there will be different answers.
If you are, for example, using SQL*Plus to extract the data, you would need to set the NLS_LANG on the client machine to something appropriate (i.e. AMERICAN_AMERICA.AL32UTF8) to force the data to be sent to the client machine in UTF-8. If you are using other approaches, NLS_LANG may or may not be important.
What you have to look for is that the eight-bit ASCII characters in the input (if any) are translated into double-byte UTF-8 characters.
This is highly dependent on your local ASCII code page, but typically:
ASCII "£", which is x'A3' in ASCII, magically becomes x'C2A3' in UTF-8.
OK, it wasn't as simple as I first hoped. The query above returns AL32UTF8.
I am using a stored procedure compiled on the database to loop through a list of table names held in an array inside the stored procedure.
I use the DBMS_SQL package to build the SQL and UTL_FILE.PUT_NCHAR to write data into a text file.
I believed my resulting output would then be in UTF-8; however, opening it in TextPad says it's ANSI and the data is garbled in places :)
Cheers
It might be important that NLS_CHARACTERSET is AL32UTF8 and NLS_NCHAR_CHARACTERSET is AL16UTF16
