Oracle: possible encoding problems when importing data

On Oracle 11, I dumped my data using exp/imp to be migrated to another DB.
I tested importing the dump file into my local database, with no problem at all.
But then my colleague tried the same on his own machine and some tables couldn't get imported due to the error:
can bind a LONG value only for insert into a LONG column.
I don't have any LONG columns, but I read that this error can also be thrown when a value exceeds the size of a VARCHAR2 column. So I checked the character sets of the databases: I have the default Windows character set and he has a UTF-8 character set. Do you think the same data might be represented with more bytes, and that this leads to this kind of error?
Do I have to change my database character set and create another dump? I'm looking for a better solution, because this also needs to be imported into a customer's database, which I know has a totally different character set.

Any Windows-inherited character set isn't multi-byte by definition. When you created a multi-byte (UTF-8) database, every single character may be converted during the import to 1-3 bytes. So before the import you have to increase every string-type column to three times its size. If you would hit the 4000-byte VARCHAR2 limit, use a CLOB type instead.
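A minimal sketch of that widening, assuming a hypothetical table my_table whose column name was VARCHAR2(100) (bytes) in the source database:
-- triple the byte length before importing into the UTF-8 database
ALTER TABLE my_table MODIFY (name VARCHAR2(300));
-- or keep the same character capacity regardless of the encoding
ALTER TABLE my_table MODIFY (name VARCHAR2(100 CHAR));
-- a column whose tripled size would exceed the 4000-byte VARCHAR2 limit
-- cannot simply be widened and would have to be rebuilt as a CLOB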

Related

Oracle Import Error - Field Length

I'm running into errors when trying to import a table from a Western character set into a UTF character set. I know the problem is that some of the data "grows".
For example:
ORA-02374: conversion error loading table "RESULT"
ORA-12899: value too large for column MATRIX_NAME (actual: 11, maximum: 10)
ORA-02372: data for row: MATRIX_NAME :
I know that when the backup was originally taken from the source database (which uses a Western character set), the column was defined as varchar(10); however, when imported into the target database (which uses a UTF character set), the data grows to a length of 11.
Does Oracle have a way to fix the field length automatically (so it expands the field length to 11) before it imports the data, to ensure there is no truncation of the data? How can I do that?
I'm using Oracle 11G.
Yes. Oracle's built-in import / export tool can automatically handle size issues.
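If the import still reports errors like the ORA-12899 above, one commonly suggested workaround (in line with the character-length-semantics answer further down this page) is to pre-create the table in the target database with CHAR semantics and then import into it; a minimal sketch, assuming the RESULT table from the error messages:
-- pre-create the table so lengths are counted in characters rather than bytes
CREATE TABLE result (
  matrix_name VARCHAR2(10 CHAR)
  -- ... remaining columns as in the source DDL
);
-- then run the import so it loads into the existing table
-- (imp with IGNORE=Y, or impdp with TABLE_EXISTS_ACTION=APPEND)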

Migrating an Oracle database from a non-Unicode server to a Unicode server

I want to move an oracle database from a non-unicode server (EL8ISO8859P7 character set and AL16UTF16 NCHAR character set) to a unicode server. Specifically to an Oracle Express server with AL32UTF8 character set.
Simply exporting (exp) and importing (imp) the data fails. We have a lot of VARCHAR2 columns with their length specified in bytes. When their contents are mapped to Unicode they take more bytes and are truncated.
I tried the following:
- doubling the length of all varchar2 columns of the original database with a script (varchar2(10) becomes varchar2(20))
- exporting
- importing to the new server
And it worked. Doubling is admittedly arbitrary; I probably should have changed them to the same size with CHAR semantics.
I also tried the following:
- changing all varchar2 columns to nvarchar2 (same size: varchar2(10) becomes nvarchar2(10))
- exporting
- importing to the new server
It also worked.
Somehow the latter (converting to nvarchar2) seems "cleaner". Then again, you end up with a Unicode database that also uses the national Unicode data types, which seems weird.
So the question is: is there a suggested way to go about moving the database between the two servers? Is there any serious problem with either of the two approaches I mentioned above?
Don't use NVARCHAR2 data types unless that is your only option. The national character set exists to deal with cases where you have an existing, legacy application that does not support Unicode and you want to add a handful of columns to the system that do support Unicode without touching those legacy applications. Using NVARCHAR2 columns is great for those cases but it creates all sorts of issues in application development. Plenty of tools, APIs, and applications either don't support NVARCHAR2 columns or require additional configuration to do so. And since NVARCHAR2 columns are relatively uncommon in the Oracle world, it's very easy to spend gobs of time trying to resolve the particular issues you encounter. Less critically, since AL16UTF16 requires at least 2 bytes per character, you're likely to require quite a bit more space since much of your data is likely to consist of English characters.
I would strongly prefer migrating to the new database with character-length semantics (i.e. VARCHAR2(10 BYTE) becomes VARCHAR2(10 CHAR)). That avoids doubling the allowed length. It also makes it much easier to explain to users what the length limits are (or to code those validations in front-ends). It's terribly confusing to most users to explain that a particular column can sometimes hold 20 characters (when only English characters are used), can sometimes hold 10 characters (when only non-English characters are used), and can sometimes hold something in the middle (when there is a mixture of characters). Character length semantics make all those issues drastically easier.
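A minimal sketch of what that looks like, with hypothetical table and column names; the session-level setting applies to any DDL run afterwards in that session:
-- switch an existing column explicitly
ALTER TABLE my_table MODIFY (my_col VARCHAR2(10 CHAR));
-- or set the default semantics before running pre-created CREATE TABLE scripts
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;
CREATE TABLE my_other_table (my_col VARCHAR2(10));  -- now means 10 characters, not 10 bytes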
Migrating to a Unicode database is a four-step process:
- Use exp[dp] to export the data and generate the DDL for the tables.
- Alter the DDL to change the byte-length VARCHAR2 fields to character-length fields.
- Create the tables using the modified DDL script.
- Import the data using imp[dp].
Skipping steps 2 and 3 leaves you with byte-length fields again, and probably with a lot of errors during import because the data doesn't fit in the defined columns. If there are only US-ASCII characters in the source database it won't be a big problem, but accented Latin characters, for example, will cause problems because a single character can need more bytes.
Following the listed procedure prevents the length problems. There are obviously more ways to do this, but the rule is to first get the DDL definition right and insert the data afterwards.
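For step 2, the edit is purely in the generated DDL; a small before/after sketch with made-up table and column names:
-- as generated from the byte-semantics source database
CREATE TABLE customers (name VARCHAR2(50 BYTE), city VARCHAR2(30 BYTE));
-- edited before running it against the Unicode target database
CREATE TABLE customers (name VARCHAR2(50 CHAR), city VARCHAR2(30 CHAR));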

SSIS Oracle Data Load is Incomplete

I have a data flow task where data from an Oracle source is fetched and stored in a SQL Server DB. After nearly 400k rows the data flow fails with the following error:
ORA-01489 result of string concatenation is too long
In the execution results I see the [Oracle Source [1543]] Error. What does this exactly mean?
I'm assuming you are using the VARCHAR2 data type, which is limited to 4000 characters.
This error occurs because the concatenated string is more than 4000 characters of VARCHAR2, which exceeds the limit; try using the CLOB data type instead.
http://nimishgarg.blogspot.in/2012/06/ora-01489-result-of-string.html
Use a Derived Column transformation after your source to cut the strings to 4000 characters.
Your data source (Oracle) is sending out strings that are larger than 4000 characters while your SSIS source expects something less than that. Check your source for any data that has a length > 4000.
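A minimal sketch of the two workarounds suggested above, assuming a hypothetical source query that concatenates columns col_a and col_b of a table src_table:
-- option 1: do the concatenation as a CLOB so the VARCHAR2 length limit no longer applies
SELECT TO_CLOB(col_a) || col_b AS full_text FROM src_table;
-- option 2: keep a VARCHAR2 result by cutting the value down to at most 4000 characters
SELECT DBMS_LOB.SUBSTR(TO_CLOB(col_a) || col_b, 4000, 1) AS full_text FROM src_table;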
After a long battle I decided to modify the package, and it turns out that deleting and recreating all the tasks solved the problem.
The real cause is still unknown to me.

SSIS Package Troubleshooting

I'm working with an SSIS package that pulls data from a DB2 source, runs it through a conversion process (Unicode stuff), and then stores the data in a SQL table. From the error information below, I have been able to determine that there are some kind of special characters in the DB2 file/table. What I do not know is how I can narrow down which specific record has the issue. There are about 200,000 records in the DB2 file and I need to know which one is specifically causing the issue.
Is there a way to query the DB2 source looking for "special characters"? Is there a way to have the SSIS package show me which record it is failing on?
Error: 2009-07-15 01:32:31.19 Code: 0xC020901C Source: Import MY APP Data DETAIL [2670]
Description: There was an error with output column "COLUMN1" (2710) on output "OLE DB Source Output" (2680).
The column status returned was: "Text was truncated or one or more characters had no match in the target code page.".
DB2 has a built-in function called HEX() that takes in just about any expression of any type and returns a VARCHAR of the hex representation of each byte. You can also specify any binary value as a literal by prefixing it with x', for example: x'0123456789abcdef'.
If the problem is coming from a single-byte character, you could find it by building up a temp table of all single characters from x'00' to x'ff' and seeing which ones appear in each row of your DB2 data. You could also add some code to the utility that converts the data to Unicode so it will scan the DB2 records for any anomalies.
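A rough sketch of that idea, assuming a hypothetical DB2 table my_db2_table with a key column id and a suspect column column1:
-- dump the raw bytes of the suspect column next to its value
SELECT id, column1, HEX(column1) FROM my_db2_table;
-- crude filter: flag rows whose hex dump contains a given byte value, e.g. 1A;
-- note this can false-positive when the two hex digits straddle adjacent bytes
SELECT id, column1 FROM my_db2_table WHERE LOCATE('1A', HEX(column1)) > 0;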

Consistent method of inserting TEXT column to Informix database using JDBC and ODBC

I have a problem when I try to insert some data into an Informix TEXT column
via JDBC. With ODBC I can simply run SQL like this:
INSERT INTO test_table (text_column) VALUES ('insert')
but this does not work with JDBC, and I get the error:
617: A blob data type must be supplied within this context.
I searched for this problem and found messages from 2003:
http://groups.google.com/group/comp.databases.informix/browse_thread/thread/4dab38472e521269?ie=UTF-8&oe=utf-8&q=Informix+jdbc+%22A+blob+data+type+must+be+supplied+within+this%22
I changed my code to use a PreparedStatement. Now it works with JDBC,
but with ODBC, when I try using a PreparedStatement, I get the error:
Error: [Informix][Informix ODBC Driver][Informix]
Illegal attempt to convert Text/Byte blob type.
[SQLCode: -608], [SQLState: S1000]
Test table was created with:
CREATE TABLE _text_test (id serial PRIMARY KEY, txt TEXT)
Jython code to test both drivers:
# for Jython 2.5 invoke with --verify
# because of bug: http://bugs.jython.org/issue1127
import traceback
import sys
from com.ziclix.python.sql import zxJDBC
def test_text(driver, db_url, usr, passwd):
    arr = db_url.split(':', 2)
    dbname = arr[1]
    if dbname == 'odbc':
        dbname = db_url
    print "\n\n%s\n--------------" % (dbname)
    try:
        connection = zxJDBC.connect(db_url, usr, passwd, driver)
    except:
        ex = sys.exc_info()
        s = 'Exception: %s: %s\n%s' % (ex[0], ex[1], db_url)
        print s
        return
    Errors = []
    try:
        cursor = connection.cursor()
        cursor.execute("DELETE FROM _text_test")
        try:
            cursor.execute("INSERT INTO _text_test (txt) VALUES (?)", ['prepared', ])
            print "prepared insert ok"
        except:
            ex = sys.exc_info()
            s = 'Exception in prepared insert: %s: %s\n%s\n' % (ex[0], ex[1], traceback.format_exc())
            Errors.append(s)
        try:
            cursor.execute("INSERT INTO _text_test (txt) VALUES ('normal')")
            print "insert ok"
        except:
            ex = sys.exc_info()
            s = 'Exception in insert: %s: %s\n%s' % (ex[0], ex[1], traceback.format_exc())
            Errors.append(s)
        cursor.execute("SELECT id, txt FROM _text_test")
        print "\nData:"
        for row in cursor.fetchall():
            print '[%s]\t[%s]' % (row[0], row[1])
        if Errors:
            print "\nErrors:"
            print "\n".join(Errors)
    finally:
        cursor.close()
        connection.commit()
        connection.close()
#test_varchar(driver, db_url, usr, passwd)
test_text("sun.jdbc.odbc.JdbcOdbcDriver", 'jdbc:odbc:test_db', 'usr', 'passwd')
test_text("com.informix.jdbc.IfxDriver", 'jdbc:informix-sqli://169.0.1.225:9088/test_db:informixserver=ol_225;DB_LOCALE=pl_PL.CP1250;CLIENT_LOCALE=pl_PL.CP1250;charSet=CP1250', 'usr', 'passwd')
Is there any setting in JDBC or ODBC to have one version of code
for both drivers?
Version info:
Server: IBM Informix Dynamic Server Version 11.50.TC2DE
Client:
ODBC driver 3.50.TC3DE
IBM Informix JDBC Driver for IBM Informix Dynamic Server 3.50.JC3DE
First off, are you really sure you want to use an Informix TEXT type? The type is a nuisance to use, in part because of the problems you are facing. It pre-dates anything in any SQL standard with respect to large objects (TEXT still isn't in SQL-2003 - though approximately equivalent structures, CLOB and BLOB, are). And the functionality of BYTE and TEXT blobs has not been changed since - oh, let's say 1996, though I suspect there's a case for choosing an earlier date, like 1991.
In particular, how much data are you planning to store in the TEXT columns? Your example shows the string 'insert'; that is, I presume, much much smaller than you would really use. You should be aware that a BYTE or TEXT column uses a 56-byte descriptor in the table plus a separate page (or set of pages) to store the actual data. So, for tiny strings like that, it is a waste of space and bandwidth (because the data for the BYTE or TEXT objects will be shipped between client and server separately from the rest of the row). If your size won't get above about 32 KB, then you should look at using LVARCHAR instead of TEXT. If you will be using data sizes above that, then BYTE or TEXT or BLOB or CLOB are sensible alternatives, but you should look at configuring either blob spaces (for BYTE or TEXT) or smart blob spaces (for BLOB or CLOB). You can, and are, using TEXT IN TABLE rather than in a blob space; be aware that doing so impacts your logical logs, whereas using a blob space does not impact them anything like as much.
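If LVARCHAR fits your data sizes, a minimal sketch of the alternative table definition (the 4096 here is just an assumed ceiling for this example):
CREATE TABLE _text_test (id SERIAL PRIMARY KEY, txt LVARCHAR(4096));
-- plain string literals and placeholders then behave like any other character column
INSERT INTO _text_test (txt) VALUES ('insert');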
One of the features I've been campaigning for a decade or so is the ability to pass string literals in SQL statements as TEXT literals (or BYTE literals). That is in part because of the experience of people like you. I haven't yet been successful in getting it prioritized ahead of other changes that need to be made. Of course, you need to be aware that the maximum size of an SQL statement is 64 KB text, so you could create too big an SQL statement if you aren't careful; placeholders (question marks) in the SQL normally prevent that being a problem - and increasing the size of an SQL statement is another feature request which I've been campaigning for, but a little less ardently.
OK, assuming that you have sound reasons for using TEXT...what next. I'm not clear what Java (the JDBC driver) is doing behind the scenes - apart from too much - but it is a fair bet that it is noticing that a TEXT 'locator' structure is needed and is shipping the parameter in the correct format. It appears that the ODBC driver is not indulging you with similar shenanigans.
In ESQL/C, where I normally work, then the code has to deal with BYTE and TEXT differently from everything else (and BLOB and CLOB have to be dealt with differently again). But you can create and populate a locator structure (loc_t or ifx_loc_t from locator.h - which may not be in the ODBC directory; it is in $INFORMIXDIR/incl/esql by default) and pass that to the ESQL/C code as the host variable for the relevant placeholder in the SQL statement. In principle, there is probably a parallel method available for ODBC. You may have to look at the Informix ODBC driver manual to find it, though.
