Migrate tables with special characters in Talend Studio - Oracle

I am migrating from table A (DB A) to table B (DB B), and an error occurs on one specific field that contains French characters (é, à, ..) and special characters (&, ', ..):
Exception in component tOracleOutput_1
java.sql.SQLException: ORA-12899: value too large for column "DB1"."COLUMN1"."COMMENT" (actual: 121, maximum: 118)
When querying the table from the SQL editor, the maximum length of the values is 100.
How can I insert these values into the new table without losing the special and the French characters?

This is not due to the special characters. Your column is too small.
You have three possibilities:
Increase the size of your column directly in the table schema (see the ALTER TABLE sketch after this list).
See here:
how to modify the size of a column
Remove blank characters before and after the value with the TRIM function in the tMap: StringHandling.TRIM(row1.yourcolumn)
Truncate the value to fit the column in your tMap: StringHandling.LEFT(row1.yourcolumn,118) (your column holds 118 characters at most)
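For the first option, a minimal sketch, assuming the schema, table and column names reported in the error message ("DB1"."COLUMN1"."COMMENT") and an arbitrary new size of 200; switching to CHAR semantics also stops accented characters from counting several bytes against the limit:
ALTER TABLE "DB1"."COLUMN1"
  MODIFY ("COMMENT" VARCHAR2(200 CHAR));  -- 200 is an example size, adjust it to your data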

Related

PostgreSQL: find multiple records from a comma-separated column

I am using PostgreSQL with Laravel 8.
In a Postgres table, I have one comma-separated character varying column.
I need those records from this table where the value of this comma-separated column contains 43 OR 56 OR 252.
Storing comma separated values in a single column is a huge mistake to begin with. But if you can't change the data model, you can achieve what you want by converting the value into an array, then use the array overlaps operator && with the values you want to test for:
select *
from the_table
where string_to_array(bad_column, ',')::int[] && array[43, 56, 252];
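If the column can contain tokens that are not valid integers, the cast to int[] will raise an error; a hedged variant is to compare text arrays instead (note this does exact string matching, so stray spaces around the commas would need to be cleaned up first):
select *
from the_table
where string_to_array(bad_column, ',') && array['43', '56', '252'];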

ORA-12899 error: string too large for the same column size, but success on a different table

I am updating a string into a column of length 35 in two tables.
The first table's update succeeds, but the second table gives ORA error ORA-12899 (value too large for column).
select length('Andres Peñalver D1 Palmar Sani salt') bytes from dual;
BYTES
----------
35
select lengthb('Andres Peñalver D1 Palmar Sani salt') bytes from dual;
BYTES
----------
36
In both tables the colm1 field is declared as VARCHAR2(35); the first table's update succeeds and the second one fails.
update t
set colm1='Andres Peñalver D1 Palmar Sani Salt'
where value1='123456';
update t2
set colm1='Andres Peñalver D1 Palmar Sani Salt'
where value1='123456';
ORA-12899
select value from nls_database_parameters where parameter='NLS_CHARACTERSET';
VALUE
----------------------------------------------------------------
AL32UTF8
Let me know why this behaviour occurs for these tables, which have the same column type.
Check the actual column sizes for both tables in all_tab_columns.
35 CHAR is not the same as 35 BYTE: if one table's column was defined with character semantics and the other with byte semantics (in the DDL), the effective size is different.
Plain characters like A-Z and a-z take 1 byte to store, but language-specific characters such as ñ take 2 or more bytes.
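A sketch of that check, assuming the two tables are named T and T2 as in the question; CHAR_USED shows 'B' for byte semantics and 'C' for character semantics:
select table_name, column_name, data_type, data_length, char_length, char_used
from   all_tab_columns
where  table_name in ('T', 'T2')
and    column_name = 'COLM1';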
The full error message as described in the error message documentation
should give you the answer:
$ oerr ora 12899
12899, 00000, "value too large for column %s (actual: %s, maximum: %s)"
// *Cause: An attempt was made to insert or update a column with a value
// which is too wide for the width of the destination column.
// The name of the column is given, along with the actual width
// of the value, and the maximum allowed width of the column.
// Note that widths are reported in characters if character length
// semantics are in effect for the column, otherwise widths are
// reported in bytes.
// *Action: Examine the SQL statement for correctness. Check source
// and destination column data types.
// Either make the destination column wider, or use a subset
// of the source column (i.e. use substring).
This is likely linked to character length semantics.
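A minimal illustration of that, assuming an AL32UTF8 database and hypothetical table names: the same 35-character string fits a CHAR-semantics column but overflows a BYTE-semantics one, because 'ñ' needs 2 bytes.
create table demo_char (colm1 varchar2(35 char));
create table demo_byte (colm1 varchar2(35 byte));

insert into demo_char values ('Andres Peñalver D1 Palmar Sani Salt'); -- succeeds (35 characters)
insert into demo_byte values ('Andres Peñalver D1 Palmar Sani Salt'); -- fails with ORA-12899 (36 bytes)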

How to check the size of the input to avoid exceeding the DB column limit

I have an input field on my page with size=8.
And in the DB, the corresponding column is VARCHAR2(8).
But if I input a string of length 8 that contains a special (non-ASCII) character in the field, I get the following exception.
ORA-12899: value too large for column xxxx (actual: 10, maximum: 8)
I'm trying to catch this in the validator; I check myString.getBytes().length, which is also 8.
I know one solution on the DB side is to change the column to VARCHAR2(8 CHAR).
Is there another solution, where I can check this in the controller?
The error is telling you that you've given 10 bytes but the column only allows 8. I am assuming it's bytes because of your use of the Chinese character set. So, I believe the column was created as if it were VARCHAR2(8 BYTE).
If you describe the table, you'll see what's going on. Compare that describe with a describe of this one:
create table x (a varchar2(30), b varchar2(30 byte), c varchar2(30 char));
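As a sketch of what to look for after the describe, the data dictionary also shows the semantics explicitly (CHAR_USED is 'B' for byte and 'C' for character semantics), assuming the demo table X above:
select column_name, data_type, data_length, char_length, char_used
from   user_tab_columns
where  table_name = 'X';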
The code you are executing to obtain the number of bytes is almost correct. Instead of:
myString.getBytes().length /* this probably returns 8 */
you need to execute this:
myString.getBytes("UTF-8").length /* this probably returns 10 */
This should help you; it returns the actual size in bytes.
SELECT LENGTHB ('é')
FROM DUAL;
The above returns 2. So, depending on which characters you are using, you can size the column accordingly, e.g. MY_VARCHAR_FIELD VARCHAR2(2 BYTE).

Import a CSV in which every cell is terminated by a newline

I have a CSV file. The data looks like this:
PRICE_a
123
PRICE_b
500
PRICE_c
1000
PRICE_d
506
My XYZ table is:
CREATE TABLE XYZ (
DESCRIPTION_1 VARCHAR2(25),
VALUE NUMBER
)
Can a CSV like the one above be imported into Oracle?
How do I create a control.ctl file?
Here's how to do it without having to do any pre-processing. Use the CONCATENATE 2 clause to tell SQL*Loader to join every 2 lines together. This builds logical records, but there is no separator between the 2 fields. That's not a problem, but first understand how the data file is read and processed. SQL*Loader reads the data file a record at a time and tries to map each field, in order from left to right, to the fields listed in the control file (see the control file below). Since the concatenated record it reads matches TEMP in the control file, and TEMP does not match a column in the table, it will not try to insert it. Instead, since it is defined as a BOUNDFILLER, that means: don't do anything with it, but save it for later use. There are no more data file fields to try to match, but the control file next lists a field name that matches a column name, DESCRIPTION_1, so it applies the expression and inserts the result.
The expression says to apply the regexp_substr function to the saved string :TEMP (which we know is the entire logical record from the file) and return the substring consisting of zero or more non-numeric characters from the start of the string, followed by zero or more numeric characters until the end of the string, and insert that into the DESCRIPTION_1 column.
The same is then done for the VALUE column, only returning the numeric part at the end of the string and skipping the non-numeric part at the beginning.
load data
infile 'xyz.dat'
CONCATENATE 2
into table XYZ
truncate
TRAILING NULLCOLS
(
TEMP BOUNDFILLER CHAR(30),
DESCRIPTION_1 EXPRESSION "REGEXP_SUBSTR(:TEMP, '^([^0-9]*)[0-9]*$', 1, 1, NULL, 1)",
VALUE EXPRESSION "REGEXP_SUBSTR(:TEMP, '^[^0-9]*([0-9]*)$', 1, 1, NULL, 1)"
)
Bada-boom, bada-bing:
SQL> select *
from XYZ
/
DESCRIPTION_1 VALUE
------------------------- ----------
PRICE_a 123
PRICE_b 500
PRICE_c 1000
PRICE_d 506
SQL>
Note that this is pretty dependent on the data following your example, and you should do some analysis of the data to make sure the regular expressions will work before putting this into production. Some tweaking will be required if the descriptions could contain numbers. If you can get the data to be properly formatted with a separator in a true CSV format, that would be much better.
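As a quick sanity check of those regular expressions, here is a hedged sketch run against one concatenated record (the two lines 'PRICE_a' and '123' joined into 'PRICE_a123'):
select regexp_substr('PRICE_a123', '^([^0-9]*)[0-9]*$', 1, 1, null, 1) as description_1,
       regexp_substr('PRICE_a123', '^[^0-9]*([0-9]*)$', 1, 1, null, 1) as value
from dual;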

What can I do to ensure fields longer than column width go to the BAD File?

When creating Oracle external tables, how should I phrase the reject rows clause to ensure that any field which exceeds its column width rejects and goes in the BADFILE?
This is my current design, and I don't want records longer than 20 characters; I want them to go to the BADFILE instead. Yet they still appear when I select * from foobar.
DROP TABLE FOOBAR CASCADE CONSTRAINTS;
CREATE TABLE FOOBAR
(
FOO_MAX20 VARCHAR2(20 CHAR)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY FOOBAR
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
BADFILE 'foobar_bad_rec.txt'
DISCARDFILE 'foobar_discard_rec.txt'
LOGFILE 'foobar_logfile.txt'
FIELDS
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
FOO_MAX20 POSITION(1:20)
)
)
LOCATION (foobar:'foobar.txt') )
REJECT LIMIT UNLIMITED
PARALLEL ( DEGREE DEFAULT INSTANCES DEFAULT )
NOMONITORING;
Here is my external file foobar.txt
1234567
1234567890123456
126464843750476074218751012345678901234567890
7135009765625
048669433593
7
527
You can't do this with the reject rows clause, as it only accepts one form.
You have a variable-length (delimited) record, but a fixed-length field. Everything after the last position you specify, which is 20 in this case, is seen as filler that you want to ignore. That isn't an error condition; you might have rubbish at the end that isn't relevant to your table. There is nothing that says chars 21-45 in your third record shouldn't be there - just that you aren't interested in them.
It would be nice if you could discard them with the load when clause, but you don't seem to be able to compare, say, (21:21) to null or an empty string - the former isn't recognised and the latter causes an internal error, which isn't good.
You can make the longer records be sent to the bad file by forcing an SQL error when it tries to put a longer parsed value from the file into the field, by changing:
FOO_MAX20 POSITION(1:20)
to
FOO_MAX20 POSITION(1:21)
Values that are up to 20 characters are still loaded:
select * from foobar;
FOO_MAX20
--------------------
1234567
1234567890123456
7135009765625
048669433593
7
527
6 rows selected
but for anything longer than 20 characters it'll try to put 21 chars into the database's 20-char field, which gets this in the log:
error processing column FOO_MAX20 in row 3 for datafile /path/to/dir/foobar.txt
ORA-12899: value too large for column FOO_MAX20 (actual: 21, maximum: 20)
And the bad file gets that record:
126464843750476074218751012345678901234567890
Have a CHECK CONSTRAINT on the column to disallow any value exceeding the LENGTH.
