How to get the data after last | using instr and substr in informatica

I have a column with data in two formats:
1. date|month|year
2. Date|year...
How do I get the value after the last |, i.e. the year, using INSTR and SUBSTR in Informatica?
I have tried some logic, but it works fine for the data with three pipes and not for the data with two pipes.

Please use this:
SUBSTR( COL, INSTR(COL,'|',-1)+1)
INSTR with a start position of -1 searches from the end of the string, so it returns the position of the last pipe. SUBSTR then returns everything after that pipe through the end of the string.
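To see why the same expression handles both formats, here is a minimal Python sketch of the same idea (find the last pipe, keep what follows). The function name and the sample values are made up for illustration; this is not Informatica code, only a model of the expression's behaviour.

```python
def after_last_pipe(value: str, sep: str = "|") -> str:
    """Return the text after the last separator.

    Mirrors SUBSTR(COL, INSTR(COL, '|', -1) + 1): locate the last pipe,
    then keep everything that follows it.
    """
    # rfind returns -1 when the separator is absent, so +1 gives 0 and the
    # whole string is returned unchanged.
    return value[value.rfind(sep) + 1:]


if __name__ == "__main__":
    # Hypothetical sample values in the two formats from the question.
    print(after_last_pipe("12|03|2020"))
    print(after_last_pipe("12|2020"))
```

Because only the last pipe matters, the number of pipes before it makes no difference.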

Related

How to load data into Oracle using SQL Loader with skipping and merging columns?

I am trying to load data into Oracle database using sqlloader,
My data looks like following.
1|2|3|4|5|6|7|8|9|10
I do not want to load first and last column into table,
I want to load 2|3|4|5|6|7|8|9 into one field.
The table I am trying to load into has only one field, named 'field1'.
If anyone has this kind of experience, could you give some advice?
I tried BOUNDFILLER, FILLER and so on, I could not make it.
Help me. :)

Load the entire row from the file into a BOUNDFILLER, then extract the part you need into the column. You have to tell sqlldr that the field is terminated by the carriage return/linefeed (assuming a Windows OS) so that it reads the entire line from the file as one field. Here the whole line from the file is read into "dummy" as a BOUNDFILLER. "dummy" does not match a column name, and it is defined as BOUNDFILLER anyway, so the whole row is "remembered". The next line in the control file starts with a column that DOES match a column name, so sqlldr attempts to execute the expression. It extracts a substring from the saved "dummy" and puts it into the "col_a" column.
The regular expression, in a nutshell, returns the part of the string after but not including the first pipe, and before but not including the last pipe. Note the double backslashes. A backslash is needed to take away the special meaning of the pipe, which is normally a logical OR (this is not needed between the square brackets). In my environment anyway, one backslash gets stripped when the expression passes from sqlldr to the regex engine, so two backslashes are required in the control file for one to get through in the end. If you have trouble here, try one backslash and see what happens. Guess how long THAT took me to figure out!
load data
infile 'x_test.dat'
TRUNCATE
into table x_test
FIELDS TERMINATED BY x'0D0A'
(
dummy BOUNDFILLER,
col_a expression "regexp_substr(:dummy, '[^|]*\\|(.+)\\|.*', 1, 1, NULL, 1)"
)
EDIT: Use this to test the regular expression. For example, if there is an additional pipe at the end:
select regexp_substr('1|2|3|4|5|6|7|8|9|10|', '[^|]*\|(.+)\|.*\|', 1, 1, NULL, 1)
from dual;
2nd edit: For those uncomfortable with regular expressions, this method uses nested SUBSTR and INSTR functions:
SQL> with tbl(str) as (
select '1|2|3|4|5|6|7|8|9|10|' from dual
)
select substr(str, instr(str, '|')+1, (instr(str, '|', -1, 2)-1 - instr(str, '|')) ) after
from tbl;
AFTER
---------------
2|3|4|5|6|7|8|9
Deciding which is easier to maintain is up to you. Think of the developer after you and comment at any rate! :-)
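To compare the two approaches outside the database, here is a small Python sketch of both: one uses the same regular expression shape as the control file, the other uses plain first-pipe/last-pipe positions. The function names are hypothetical, and the sketch targets the original data (no trailing pipe), so it cuts at the last pipe rather than the second-to-last one used in the SQL test above.

```python
import re

# Same shape as the control-file pattern: skip the first field, capture
# greedily up to the last pipe, then ignore the final field.
MIDDLE = re.compile(r"[^|]*\|(.+)\|.*")


def middle_fields_regex(line: str):
    """Return everything between the first and last pipe, or None."""
    match = MIDDLE.match(line)
    return match.group(1) if match else None


def middle_fields_find(line: str):
    """Same result using positions, like the nested SUBSTR/INSTR version."""
    first = line.find("|")
    last = line.rfind("|")
    if first == -1 or first == last:
        return None  # fewer than two pipes: there is no middle part
    return line[first + 1:last]


if __name__ == "__main__":
    row = "1|2|3|4|5|6|7|8|9|10"
    print(middle_fields_regex(row))
    print(middle_fields_find(row))
```

Both functions return 2|3|4|5|6|7|8|9 for the sample row from the question.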

Hive INSTR function working incorrectly on string with UTF8 characters

Hive INSTR function is working incorrectly on strings with UTF8 characters. When an accent character is part of the string, INSTR returns an incorrect character location for subsequent characters. It seems to be counting bytes instead of characters.
With the accent character as part of string it returns 8
select INSTR("Réservation:", 'a'); returns 8
Without the accent character as part of string it returns 7
select INSTR("Reservation:", 'a'); returns 7
Is there a fix to this or an alternate function that I could use ?

This is what I'm getting with Hive 1.1.0:
hive>select INSTR("Réservation:", 'a');
OK
7
So there is no issue with Hive itself. If you still have a problem using INSTR, write your own UDF to achieve this. For writing a UDF, refer to the link below:
Click here for UDF
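The difference between 7 and 8 is what you would expect if one environment counts characters and the other counts UTF-8 bytes. The Python sketch below only illustrates that arithmetic; it says nothing about how any particular Hive version is implemented, and the function names are made up.

```python
def instr_chars(text: str, needle: str) -> int:
    """1-based position of needle counted in characters (0 if absent)."""
    return text.find(needle) + 1


def instr_bytes(text: str, needle: str, encoding: str = "utf-8") -> int:
    """1-based position of needle counted in encoded bytes (0 if absent)."""
    return text.encode(encoding).find(needle.encode(encoding)) + 1


if __name__ == "__main__":
    accented = "R\u00e9servation:"  # "Réservation:" with a precomposed é
    plain = "Reservation:"
    # The é occupies two bytes in UTF-8, so every later byte offset shifts by one.
    print(instr_chars(accented, "a"), instr_bytes(accented, "a"))
    print(instr_chars(plain, "a"), instr_bytes(plain, "a"))
```

If a byte-based result shows up, checking the session or file encoding is a reasonable first step before writing a UDF.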

ORACLE xmlagg limitations

I have a table which has multiple lines of text that need to be merged into 1 line. I need to support strings longer than 4000 characters, so listagg is not an option.
I have done the following after much searching:
select mdesc.DEFINITION_ID,
xmlagg(xmlelement(E,mdesc.record_desc||' ')).EXTRACT('//text()')
FROM METRIC_DESC mdesc
GROUP BY DEFINITION_ID
but my results are returned with certain characters escaped.
This SO thread has some suggestions
How to tweak LISTAGG to support more than 4000 character in select query?
but I cannot convert to CLOB for my purposes.
Any idea how I can get the results of the query in a usable format? (i.e. not CLOB and not escaped?)

To convert the XMLType object to a CLOB, you will need to add the getClobVal() function.

Try using the rtrim function:
SELECT mdesc.definition_id,
Rtrim(Xmlagg(Xmlelement(e, mdesc.record_desc
|| ' ')).EXTRACT('//text()'), ',')
FROM metric_desc mdesc
GROUP BY definition_id ;
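The escaped characters in the question are presumably XML entities, introduced because the text passes through an XML element on its way out. That is an assumption, since the question does not show the actual output. A short Python sketch with a made-up sample string shows what that escaping looks like and that it is reversible:

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical description text containing characters that XML must escape.
raw = "profit & loss < 10%"

escaped = escape(raw)        # what the text looks like inside XML
restored = unescape(escaped)  # decoding the entities gives the original back

if __name__ == "__main__":
    print(escaped)
    print(restored)
```

Whatever consumes the query result would need an equivalent decoding step if the entities cannot be avoided in the query itself.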

bulk load UDT columns in Oracle

I have a table with the following structure:
create table my_table (
id integer,
point Point -- UDT made of two integers (x, y)
)
and I have a CSV file with the following data:
#id, point
1|(3, 5)
2|(7, 2)
3|(6, 2)
Now I want to bulk load this CSV into my table, but I can't find any information about how to handle the UDT in the Oracle sqlldr utility. Is it possible to use the bulk load utility when the table has UDT columns?

I don't know if sqlldr can do this, but personally I would use an external table.
Attach the file as an external table (the file must be on the database server), and then insert the contents of the external table into the destination table transforming the UDT into two values as you go. The following select from dual should help you with the translation:
select
regexp_substr('(5, 678)', '[[:digit:]]+', 1, 1) x_point,
regexp_substr('(5, 678)', '[[:digit:]]+', 1, 2) y_point
from dual;
UPDATE
In sqlldr, you can transform fields using standard SQL expressions:
LOAD DATA
INFILE 'data.dat'
BADFILE 'bad_orders.txt'
APPEND
INTO TABLE test_tab
FIELDS TERMINATED BY "|"
( info,
x_cord "regexp_substr(:x_cord, '[[:digit:]]+', 1, 1)"
)
The control file above will extract the first number from fields like (3, 4), but I cannot find a way to extract the second number, i.e. I am not sure whether the same field in the input file can be inserted into two columns.
If external tables are not an option for you, I would suggest either (1) transforming the file before loading, using sed, awk, Perl etc., or (2) loading the file with SQLLDR into a temporary table and then having a second process transform the data and insert it into your final table. Another option is to look at how the file is generated: could you generate it so that the field you need to transform is repeated in two fields in the file, e.g.:
data|(1, 2)|(1, 2)
Maybe someone else will chip in with a way to get sqlldr to do what you want.
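Option (1), transforming the file before loading, could look roughly like the Python sketch below. It assumes the input lines have exactly the shape shown in the question, id|(x, y) with integer coordinates, and rewrites each one as id,x,y. The function names are hypothetical, and malformed lines are rejected rather than guessed at.

```python
import re

# One data line: an id, a pipe, then "(x, y)" with integer coordinates.
LINE = re.compile(r"^\s*([^|]+?)\s*\|\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*$")


def flatten_point_line(line: str):
    """Turn 'id|(x, y)' into 'id,x,y'; return None for comments and blanks."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # skip the header comment and empty lines
    match = LINE.match(stripped)
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    return ",".join(match.groups())


def flatten_lines(lines):
    """Flatten every data line, dropping comments and blank lines."""
    flattened = (flatten_point_line(line) for line in lines)
    return [row for row in flattened if row is not None]


if __name__ == "__main__":
    sample = ["#id, point", "1|(3, 5)", "2|(7, 2)", "3|(6, 2)"]
    for row in flatten_lines(sample):
        print(row)
```

With x and y in separate fields, each value can be loaded without needing one input field to feed two columns.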

Solved the problem after more research: Oracle SQL*Loader has this feature, which is used by specifying a column object. The following was the solution:
LOAD DATA
INFILE *
INTO TABLE my_table
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
id,
point column object
(
x,
y
)
)
BEGINDATA
1,3,5
2,7,2
3,6,2

Showing only actual column data in SQL*Plus

I'm spooling out delimited text files from SQL*Plus, but every column is printed as the full size per its definition, rather than the data actually in that row.
For instance, a column defined as 10 characters, with a row value of "test", is printing out as "test      " instead of "test". I can confirm this by selecting the column along with the value of its LENGTH function. It prints "test      |4".
It kind of defeats the purpose of a delimiter if it forces me into fixed-width. Is there a SET option that will fix this, or some other way to make it print only the actual column data?
I don't want to add TRIM to every column, because if a value is actually stored with spaces I want to be able to keep them.
Thanks

I have seen many SQL*Plus scripts that create text files like this:
select A || ';' || B || ';' || C || ';' || D
from T
where ...
It's a strong indication to me that you can't just switch to variable length output with a SET command.
Instead of ';' you can of course use any other delimiter. And it's up to your query to properly escape any characters that could be confused with a delimiter or a line feed.
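To make the escaping point concrete, here is a small Python sketch of what "properly escaped" delimited output looks like when a value contains the delimiter itself. It uses the standard csv module on made-up rows; it is not a SQL*Plus feature, only an illustration of the rules your query (or an external unload script) would have to follow.

```python
import csv
import io


def to_delimited(rows, delimiter=";"):
    """Render rows as delimited text, quoting only values that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,  # quote a field only if it contains the delimiter, a quote, or a newline
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical rows: a plain value, one containing the delimiter,
    # and one stored with trailing spaces that must be preserved.
    rows = [("test", 4), ("a;b", 3), ("pad  ", 5)]
    print(to_delimited(rows), end="")
```

Values are written at their actual length, and stored trailing spaces survive because nothing is trimmed.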

Generally, I'd forget SQL*Plus as a method for getting CSV out of Oracle.
Tom Kyte has written a nice little Pro*C unloader.
Personally, I've written a utility which does something similar, but in Perl.
