What does the TRANSLATE function do if I want to change some characters to nothing? - oracle

I have a sql statement:
select translate('abcdefg', 'abc', '') from dual;
Why is the result empty?
I think it should be 'defg'.

From the documentation:
You cannot use an empty string for to_string to remove all characters in from_string from the return value. Oracle Database interprets the empty string as null, and if this function has a null argument, then it returns null. To remove all characters in from_string, concatenate another character to the beginning of from_string and specify this character as the to_string. For example, TRANSLATE(expr, 'x0123456789', 'x') removes all digits from expr.
So you can do something like:
select translate('abcdefg', '#abc', '#') from dual;
TRANSLATE('ABCDEFG','#ABC','#')
-------------------------------
defg
... using any character that isn't going to be in your from_string.
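A small sketch of the two points above, runnable against dual. The first query shows that the empty-string call really yields NULL. The second is an alternative that needs no placeholder character; REGEXP_REPLACE is not mentioned in the quoted documentation and is offered here only as an option, relying on its replace_string argument being optional (matches are then removed).

```sql
-- An empty to_string is NULL in Oracle, so TRANSLATE returns NULL.
SELECT CASE
         WHEN TRANSLATE('abcdefg', 'abc', '') IS NULL THEN 'NULL'
         ELSE 'NOT NULL'
       END AS translate_result          -- NULL
FROM dual;

-- Alternative without a placeholder: delete every a, b or c.
SELECT REGEXP_REPLACE('abcdefg', '[abc]') AS stripped   -- defg
FROM dual;
```

For a fixed set of single characters, TRANSLATE with a placeholder is usually the lighter choice; the regular expression form is mainly convenient when the set is easier to express as a character class.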

select translate('abcdefg', 'abc', '') from dual;
To add to Alex's answer: you can use any character (that is allowed in SQL) as the placeholder you concatenate in order to remove all the characters. So you could even use a space in place of the empty string. Oracle treats an empty string as NULL.
So, you could also do -
SQL> SELECT TRANSLATE('abcdefg', ' abc', ' ') FROM dual;
TRAN
----
defg
SQL>
Which is the same as -
SQL> SELECT TRANSLATE('abcdefg', chr(32)||'abc', chr(32)) FROM dual;
TRAN
----
defg
SQL>
This works because the ASCII value of a space is 32.
This was just a demo; it is better to use a character other than a space, for clarity and code readability.

Related

Extract sub-string after match in oracle

I have a string like Order#Confirm####2791 and I want to fetch 2791, the part after the #### delimiter. I tried the query below, but it does not return the substring I expect.
SELECT regexp_substr('Order # Confirm ####2791','[^####]+',1,2) regexp_substr
FROM dual;
I would like to return 2791 from the above query.
SELECT regexp_substr('Order # Confirm ####2791','####(.*)$',1,1, null, 1) regexp_substr
FROM dual;
If you want to restrict the match to digits:
SELECT regexp_substr('Order # Confirm ####2791','####(\d+)$',1,1, null, 1) regexp_substr
FROM dual;
regexp_replace works too:
SELECT regexp_replace('Order # Confirm ####2791','.*?####(\d+)$', '\1') regexp_replace
FROM dual;
Note: with regexp_substr(), NULL is returned if no match is found; with regexp_replace(), the original string is returned if no match is found.
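The difference in the no-match case can be seen side by side. This is a minimal sketch that reuses the two patterns above; the input 'Order 2791' is a made-up value chosen because it contains no #### delimiter.

```sql
-- No '####' in the input, so neither pattern matches.
SELECT regexp_substr('Order 2791', '####(\d+)$', 1, 1, NULL, 1) AS substr_result,   -- NULL
       regexp_replace('Order 2791', '.*?####(\d+)$', '\1')      AS replace_result   -- Order 2791
FROM dual;
```

If a missing delimiter should produce NULL rather than the untouched input, regexp_substr is the safer of the two.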
You don't need regular expressions for this.
SELECT substr('Order # Confirm ####2791',
instr('Order # Confirm ####2791', '####') + 4) as your_substr
FROM dual;
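One thing to watch with this approach: when the delimiter is missing, INSTR returns 0, so the expression above turns into SUBSTR(str, 4) and silently returns the text from the fourth character on. A possible guard, sketched here with a hypothetical inline view and column alias s, returns NULL in that case instead.

```sql
-- s is a hypothetical alias standing in for your column.
SELECT CASE
         WHEN INSTR(s, '####') > 0
         THEN SUBSTR(s, INSTR(s, '####') + 4)
       END AS your_substr               -- 2791; NULL when '####' is absent
FROM (SELECT 'Order # Confirm ####2791' AS s FROM dual);
```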

CHR(0) in REGEXP_LIKE

I am using the following queries to check how CHR(0) behaves in REGEXP_LIKE.
CREATE TABLE t1(a char(10));
INSERT INTO t1 VALUES('0123456789');
SELECT CASE WHEN REGEXP_LIKE(a,CHR(0)) THEN 1 ELSE 0 END col, DUMP(a)
FROM t1;
The output I get is:
col dump(a)
----------- -----------------------------------
1 Typ=96 Len=10: 48,49,50,51,52,53,54,55,56,57
I am totally confused: if there is no CHR(0) in the column, as DUMP(a) shows, how is REGEXP_LIKE finding CHR(0) and returning 1? Shouldn't it return 0 here?
CHR(0) is the character used to terminate a string in the C programming language (among others).
When you pass CHR(0) to the function, it will in turn pass it to a lower-level function that parses the strings you have passed in and builds a regular expression pattern from them. That function sees CHR(0), takes it to be the string terminator and ignores the rest of the pattern.
The behaviour is easier to see with REGEXP_REPLACE:
SELECT REGEXP_REPLACE( 'abc' || CHR(0) || 'e', CHR(0), 'd' )
FROM DUAL;
What happens when you run this:
CHR(0) is compiled into a regular expression and becomes a string terminator.
Now the pattern is just the string terminator, so the pattern is a zero-length string.
The regular expression is then matched against the input string. It reads the first character, a, and finds that a zero-length string can be matched before the a, so it replaces the nothing it has matched before the a with a d, giving the output da.
It then repeats for the next character, transforming b into db,
and so on until it reaches the end of the string, where it matches the zero-length pattern once more and appends a final d.
And you will get the output:
dadbdcd_ded
(where _ is the CHR(0) character.)
Note: the CHR(0) in the input is not replaced.
If the client program you are using is also truncating the string at CHR(0) you may not see the entire output (this is an issue with how your client is representing the string and not with Oracle's output) but it can also be shown using DUMP():
SELECT DUMP( REGEXP_REPLACE( 'abc' || CHR(0) || 'e', CHR(0), 'd' ) )
FROM DUAL;
Outputs:
Typ=1 Len=11: 100,97,100,98,100,99,100,0,100,101,100
[TL;DR] So what is happening with
REGEXP_LIKE( '1234567890', CHR(0) )
It builds a zero-length regular expression pattern and looks for a zero-length match before the 1 character, which it finds, so it reports a match.
Aleksej kind of beat me to it, but CHR(0) is the value for the string terminator (kind of like the NULL keyword but not exactly). Think of it like an internal end-of-string indicator that CHR(0) apparently can see. Note that if you try the query with the keyword NULL, it will return zero, as nothing can be compared to NULL and the comparison thus will fail (as you were expecting). Interesting. Perhaps someone more experienced with the internal workings can explain further, I would be interested to hear more.
Not an answer, just some experiments, but too long for a comment.
REGEXP_COUNT seems to be confused by chr(0), counting every character as chr(0); besides, it seems to find one occurrence more than the size of the string.
SQL> select dump('a'), regexp_count('a', chr(0)) from dual;
DUMP('A') REGEXP_COUNT('A',CHR(0))
---------------- ------------------------
Typ=96 Len=1: 97 2
SQL> select dump(chr(0)), regexp_count(chr(0), chr(0)) from dual;
DUMP(CHR(0)) REGEXP_COUNT(CHR(0),CHR(0))
-------------- ---------------------------
Typ=1 Len=1: 0 2
SQL> select dump('0123456789' || chr(0)), regexp_count('0123456789' || chr(0), chr(0)) from dual;
DUMP('0123456789'||CHR(0)) REGEXP_COUNT('0123456789'||CHR(0),CHR(0))
--------------------------------------------- -----------------------------------------
Typ=1 Len=11: 48,49,50,51,52,53,54,55,56,57,0 12
LIKE seems to behave correctly, while its REGEXP counterpart seems to fail:
SQL> select 1 from dual where 'a' like '%' || chr(0) || '%';
no rows selected
SQL> select 1 from dual where regexp_like ('a', chr(0));
1
----------
1
The same goes for INSTR and REGEXP_INSTR:
SQL> select 1 from dual where instr('a', chr(0)) != 0;
no rows selected
SQL> select 1 from dual where regexp_instr('a', chr(0)) != 0;
1
----------
1
Tested on 11g XE Release 11.2.0.2.0 - 64bit
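Going by the experiments above, a non-regex function looks like the safer way to test a column for CHR(0). The following is only a sketch that rewrites the original check with INSTR; it assumes the t1 table from the question and that INSTR behaves as shown in the test above, so for the '0123456789' row it should yield 0.

```sql
-- Uses the t1 table created in the question.
-- INSTR compares CHR(0) as an ordinary character instead of compiling it into a pattern.
SELECT CASE WHEN INSTR(a, CHR(0)) > 0 THEN 1 ELSE 0 END AS col,
       DUMP(a) AS dumped
FROM t1;
```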

Can't trim the string in Oracle

I have a string IN-123456; now I need to trim the IN- from that string. I tried this in Oracle:
select trim('IN-' from 'IN-123456) from dual;
but I get an error
ORA-30001: trim set should have only one character
30001. 00000 - "trim set should have only one character"
*Cause: Trim set contains more or less than 1 character. This is not
allowed in TRIM function.
How can I solve this?
Wouldn't a simple REPLACE do the trick?
select replace('IN-123456', 'IN-', '') from dual;
Thanks for the result...
It can also be solved with the LTRIM() function.
Clearly, TRIM is not the correct function for the job. You need to REPLACE the (sub)string IN- with nothing:
SELECT REPLACE('IN-123456', 'IN-') FROM dual;
Be aware that this will replace all occurrences of IN- anywhere in the string. If that's not appropriate, but the IN- will always be at the start of the string, then you could use SUBSTR instead:
SELECT SUBSTR('IN-123456', 4) FROM dual;
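Two more ways to drop only a leading IN-, sketched with made-up sample values. LTRIM, mentioned earlier in the thread, works for this input, but its second argument is a set of characters rather than a prefix, so it can remove more than intended. An anchored REGEXP_REPLACE (an alternative not proposed in the thread) removes the prefix only when the string actually starts with it.

```sql
-- LTRIM strips any leading I, N or - until another character is reached.
SELECT LTRIM('IN-123456', 'IN-') AS ltrim_result    -- 123456
FROM dual;

SELECT LTRIM('IN-NINE-1', 'IN-') AS ltrim_result    -- E-1 (more than the prefix is removed)
FROM dual;

-- Anchored pattern: removes IN- only at the start of the string.
SELECT REGEXP_REPLACE('IN-123456', '^IN-') AS prefix_removed   -- 123456
FROM dual;
```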
You just forgot to close the single quote:
select trim('IN-' from 'IN-123456') from dual;
Now try this.
The TRIM function only accepts a single trim character.
Here is an example:
SELECT TRIM(both 'P' FROM 'PHELLO WORLDP') FROM DUAL;
Output: HELLO WORLD
You may use LEADING or TRAILING instead of BOTH.
In your case, "IN-" contains three characters.
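For completeness, a short sketch of the LEADING and TRAILING variants mentioned above, using the same sample string:

```sql
-- Remove the trim character from one side only.
SELECT TRIM(LEADING 'P' FROM 'PHELLO WORLDP')  AS leading_only    -- HELLO WORLDP
FROM dual;

SELECT TRIM(TRAILING 'P' FROM 'PHELLO WORLDP') AS trailing_only   -- PHELLO WORLD
FROM dual;
```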

What does caret(^) in Oracle translate function mean?

I came across this statement in another developer's code; it returns ABCDEF:
SELECT TRANSLATE('ABC123DEF456', '^0123456789', '^') FROM DUAL;
Then I tested the following, which give the same result:
SELECT TRANSLATE('ABC123DEF456', '^0123456789', ' ') FROM DUAL;
SELECT TRANSLATE('ABC123DEF456', '0123456789', ' ') FROM DUAL;
But this one returns null:
SELECT TRANSLATE('ABC123DEF456', '0123456789', '') FROM DUAL;
What does caret(^) mean? Why is it necessary?
TRANSLATE(expr, from_string, to_string):
You cannot use an empty string for to_string to remove all characters
in from_string from the return value. Oracle Database interprets the
empty string as null, and if this function has a null argument, then
it returns null.
Thus you cannot specify '' as the value for the to_string parameter, because it would be interpreted as null.
I suspect ^ is used here because it is not expected to appear in expr. It only serves as a placeholder that maps to itself: if it does appear in expr, it is kept in the result, as in TRANSLATE('ABC12^3DE0F456', '^0123456789', '^'), which returns ABC^DEF.
Your original statement SELECT TRANSLATE('ABC123DEF456', '^0123456789', '^') FROM DUAL; effectively strips all digits from the source string, because for every digit in from_string there is no corresponding character in to_string. Characters that do not appear in from_string are left unchanged.
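The positional pairing between from_string and to_string is easier to see when to_string has more than just the placeholder. A minimal illustration with made-up values:

```sql
-- from_string '^123' and to_string '^X' pair up by position:
--   ^ -> ^   (placeholder, maps to itself)
--   1 -> X   (replaced)
--   2, 3     (no counterpart in to_string, so removed)
SELECT TRANSLATE('ABC123', '^123', '^X') AS result   -- ABCX
FROM dual;
```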

Oracle Regexp to replace \n,\r and \t with space

I am trying to select a column from a table that contains newline (NL) characters (and possibly others \n, \r, \t). I would like to use the REGEXP to select the data and replace (only these three) characters with a space, " ".
No need for regex. This can be done easily with the ASCII codes and boring old TRANSLATE()
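select translate(your_column, chr(10)||chr(9)||chr(13), '   ')
from your_table;
This replaces newline (chr(10)), tab (chr(9)) and carriage return (chr(13)) with a space. Note that to_string needs one space per character in from_string: any from_string character without a counterpart in to_string is removed rather than replaced.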
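A quick way to check the mapping without touching a real table is to build a literal that contains all three characters. In this sketch the sample value is made up; from_string holds newline, tab and carriage return, and to_string holds three spaces, one for each of them.

```sql
-- Each control character is paired with its own space in to_string.
SELECT TRANSLATE('a' || CHR(10) || 'b' || CHR(9) || 'c' || CHR(13) || 'd',
                 CHR(10) || CHR(9) || CHR(13),
                 '   ') AS cleaned      -- a b c d
FROM dual;
```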
TRANSLATE() is much more efficient than its regex equivalent. However, if your heart is set on that approach, you should know that we can build the character class from the same ASCII codes. So this statement is the regex version of the above.
select regexp_replace(your_column, '[' || chr(10) || chr(9) || chr(13) || ']', ' ')
from your_table;
The tweak is to concatenate the CHR() values into a bracket expression, since Oracle does not interpret hexadecimal escapes such as \x0A inside a bracket expression.
select translate(your_column, chr(10)||chr(9)||chr(13), ' ') from your_table;
To clean the value (remove these characters), it is essential to pass non-null values as parameters
(an Oracle function will generally return null as soon as one parameter is null; there are a few exceptions, such as the REPLACE functions):
select translate(your_column, ' '||chr(10)||chr(9)||chr(13), ' ') from your_table;
This example uses the ' ' -> ' ' translation as a dummy value to prevent a null value in parameter 3.

Resources