Does Oracle have built-in string character class constants (digits, letters, alphanum, upper, lower, etc)?
My actual goal is to efficiently return only the digits [0-9] from an existing string.
Unfortunately, we still use Oracle 9, so regular expressions are not an option here.
Examples
The field should contain zero to three letters, 3 or 4 digits, then zero to two letters. I want to extract the digits.
String --> Result
ABC1234YY --> 1234
D456YD --> 456
455PN --> 455
No string constants, but you can do:
select translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
)
from mytable;
For example:
select translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
)
from
( select 'fdkhsd1237ehjsdf7623A#L:P' as mystring from dual);
TRANSLAT
--------
12377623
If you want to do this often you can wrap it up as a function:
create function only_digits (mystring varchar2) return varchar2
is
begin
return
translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
);
end;
/
Then:
SQL> select only_digits ('fdkhsd1237ehjsdf7623A#L:P') from dual;
ONLY_DIGITS('FDKHSD1237EHJSDF7623A#L:P')
-----------------------------------------------------------------
12377623
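To see why the nested TRANSLATE works, here is a rough Python model of it. This is only a sketch: oracle_translate is a hypothetical helper that imitates how Oracle's TRANSLATE maps characters (characters in the from-list with no counterpart in the to-list are dropped), not Oracle code. The 'x' and '0' padding in the SQL is there because Oracle treats an empty string as NULL, and TRANSLATE returns NULL if any argument is NULL.

```python
def oracle_translate(expr, from_chars, to_chars):
    """Hypothetical stand-in for Oracle's TRANSLATE(expr, from, to)."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch not in mapping:  # the first occurrence in from_chars wins
            # None marks a character that has no counterpart and is removed
            mapping[ch] = to_chars[i] if i < len(to_chars) else None
    out = []
    for ch in expr:
        if ch not in mapping:
            out.append(ch)            # untouched character
        elif mapping[ch] is not None:
            out.append(mapping[ch])   # mapped character
    return "".join(out)


def only_digits(s):
    # Inner call: strip the digits, leaving only the unwanted characters.
    non_digits = oracle_translate(s, "x0123456789", "x")
    # Outer call: map '0' to itself and drop every unwanted character.
    return oracle_translate(s, "0" + non_digits, "0")


print(only_digits("fdkhsd1237ehjsdf7623A#L:P"))  # 12377623
print(only_digits("ABC1234YY"))                  # 1234
```

The inner call builds the list of characters to throw away from the string itself, which is why no character class constant is needed.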
You can check the list of Oracle's predefined datatypes here, but you are not going to find what you are looking for.
To extract the digits from a string you can use some combination of these functions:
TO_NUMBER, to convert a string to a number.
REPLACE, to remove occurrences.
TRANSLATE, to convert characters.
If you provide a more concrete example, it will be easier to give you a detailed solution.
If you are able to use PL/SQL here, another approach is to write your own regular expression matcher function. One starting point is Rob Pike's elegant, very tiny regular expression matcher in Chapter 1 of Beautiful Code. One of the exercises for the reader is to add character classes. (You'd first need to translate his 30 lines of C code into PL/SQL.)
Related
I have a string of comma-separated values that I want to trim down for display purposes.
The string is a comma separated list of values of varying lengths and number of list entries.
Each entry in the list is formatted as a five character pattern in the format "##-NX" followed by some text.
e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."
Is there a regular expression function I can use to remove the text after the 5-character prefix of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?
I am a novice with regular expressions and I haven't been able to figure out how to code the pattern.
I think what you need is
regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')
The inner REGEXP_REPLACE looks for a pattern like nn-NX (two numeric characters followed by "-NX") and any number of characters up to the next comma, then replaces it with the first and third term, dropping the "any number of characters" part.
The outer REGEXP_REPLACE looks for a pattern like two numeric characters followed by any number of characters up to the last NX, and keeps that part of the string.
Here is the Oracle code I used for testing:
with a as (
select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
from dual
)
select mystring
, regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
from a
This alternative calls REGEXP_REPLACE() once.
Match 2 digits, a dash and 'NX', followed by zero or more characters (non-greedy), up to a comma or the end of the string. Replace the match with the first group and the third group, which will be either the comma or the end of the string.
EDIT: Took dougp's advice and eliminated the RTRIM by adding the 3rd capture group. Thanks for that!
WITH tbl(str) AS (
SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' FROM dual
)
SELECT
REGEXP_REPLACE(str, '(\d{2}-NX)(.*?)(,|$)', '\1\3') str
from tbl;
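If you want to play with the pattern outside the database, the same expression can be tried in Python. This is only a sketch: it assumes Python's re module treats this particular pattern the same way Oracle's REGEXP_REPLACE does, and strip_entry_text is a made-up name.

```python
import re

# Same pattern as the single REGEXP_REPLACE call:
#   group 1 = the "##-NX" prefix
#   group 2 = the text to drop (non-greedy)
#   group 3 = the comma, or the empty match at end of string
PATTERN = re.compile(r"(\d{2}-NX)(.*?)(,|$)")


def strip_entry_text(s):
    # Keep groups 1 and 3, discard group 2.
    return PATTERN.sub(r"\1\3", s)


sample = "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc."
print(strip_entry_text(sample))  # 01-NX, 02-NX, 09-NX, 12-NX
```

The space after each comma survives because it sits between two matches and is never part of one.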
I'm building a string in oracle, where I get a number from a column and make it a 12 digit number with the LPad function, so the length of it is 12 now.
Example: LPad(nProjectNr,12,'0') and I get 000123856812 (for example).
Now I want to split this string into groups of 3 digits, each with a "\" as prefix, so that the result looks like this: \000\123\856\812.
How can I achieve this in a select statement? What function can accomplish this?
Assuming strings of 12 digits, regexp_replace could be a way:
select regexp_replace('000123856812', '(.{3})', '\\\1') from dual
The regexp matches sequences of 3 characters and adds a \ as a prefix
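For anyone who wants to check the idea outside Oracle, here is a small Python sketch of the same replacement. It assumes Python's re module as a stand-in for REGEXP_REPLACE, with zfill playing the role of LPAD; prefix_groups is a made-up name.

```python
import re


def prefix_groups(digits, size=3):
    # Capture each run of `size` characters and put a backslash before it.
    pattern = "(.{%d})" % size
    # In the replacement, \\ is a literal backslash and \1 is the captured group.
    return re.sub(pattern, r"\\\1", digits)


padded = str(123856812).zfill(12)  # plays the role of LPad(nProjectNr, 12, '0')
result = prefix_groups(padded)
print(result)  # \000\123\856\812
```

As in the SQL version, the replacement needs an escaped backslash followed by the backreference.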
It is much easier to do this using TO_CHAR(number) with the proper format model. Suppose we use \ as the thousands separator.... (alas we can't start a format model with a thousands separator - not allowed in TO_CHAR - so we still need to concatenate a \ to the left):
See also edit below
select 123856812 as n,
'\' || to_char(123856812, 'FM000G000G000G000', 'nls_numeric_characters=.\') as str
from dual
;
N STR
--------- ----------------
123856812 \000\123\856\812
Without the FM format model modifier, TO_CHAR will add a leading space (placeholder for the sign, plus or minus). FM means "shortest possible string representation consistent with the model provided" - that is, in this case, no leading space.
Edit - it just crossed my mind that we can exploit TO_CHAR() even further and not need to concatenate the first \. The thousands separator, G, may not be the first character of the string, but the currency symbol, placeholder L, can!
select 123856812 as n,
to_char(123856812, 'FML000G000G000G000',
'nls_numeric_characters=.\, nls_currency=\') as str
from dual
;
SUBSTR returns a substring of a string passed as the first argument. You can specify where the substring starts and how many characters it should be.
Try
SELECT '\'||SUBSTR('000123856812', 1,3)||'\'||SUBSTR('000123856812', 4,3)||'\'||SUBSTR('000123856812', 7,3)||'\'||SUBSTR('000123856812', 10,3) FROM dual;
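The same slicing idea, written as a loop so the positions are not hard-coded, looks roughly like this in Python. It is only a sketch to show the logic: split_with_prefix is a made-up name, and note that Python slices are 0-based while Oracle's SUBSTR positions are 1-based.

```python
def split_with_prefix(digits, size=3, prefix="\\"):
    # Equivalent of SUBSTR(s, 1, 3), SUBSTR(s, 4, 3), ... with 0-based slices.
    chunks = [digits[i:i + size] for i in range(0, len(digits), size)]
    # Put the prefix in front of every chunk and glue them back together.
    return "".join(prefix + chunk for chunk in chunks)


print(split_with_prefix("000123856812"))  # \000\123\856\812
```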
I am working in PL/SQL. The developer set up the Oracle password scheme this way:
=> input word => converted to ASCII => 2 added to each character code => converted back to a word
e.g. the input password is "admin".
admin is split into characters (a, d, m, i, n);
each is converted to its ASCII code, 2 is added, and the result is converted back to a character:
a=97 97+2 = 99 = c
d=100 100+2=102 = f
m=109 109+2=111 = o
i=105 105+2=107 = k
n=110 110+2=112 = p
What I did is:
$new_password = [];
foreach (str_split('admin') as $char) {
    $new_password[] = chr(ord($char) + 2); // shift each character code by 2
}
$final = implode('', $new_password); // join the characters, whatever the length
echo $final;
result: cfokp
But I could not work out how to produce the result string in a command and match it against the Oracle password that was retrieved.
Another way in SQL is to split the characters, add 2 to the ascii value, and aggregate the string.
Of course, it won't be faster than the TRANSLATE approach. But, for a single or small set of values it shouldn't matter much.
For example,
SQL> WITH data AS
2 (SELECT 'admin' str FROM dual
3 )
4 SELECT str, LISTAGG(CHR(ASCII(REGEXP_SUBSTR(str, '\w', 1, LEVEL)) + 2), '') WITHIN GROUP(
5 ORDER BY LEVEL) str_new
6 FROM data
7 CONNECT BY LEVEL <= LENGTH(str)
8 /
STR STR_NEW
------ -------
admin cfokp
SQL>
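The above SQL does the following:
Split the string into characters using REGEXP_SUBSTR and the row generator technique (CONNECT BY LEVEL). Note that '\w' matches only word characters, so a password containing punctuation would need '.' instead.
Add 2 to the ASCII value of each character.
Convert the modified ASCII values back into characters.
Aggregate the characters into a string using LISTAGG.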
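The same four steps can be sketched in a few lines of Python, which may make the flow easier to follow. This is an illustration only, not part of the SQL solution, and shift_chars is a made-up name.

```python
def shift_chars(word, offset=2):
    chars = list(word)                             # 1. split into characters
    codes = [ord(c) + offset for c in chars]       # 2. add the offset to each code
    shifted = [chr(code) for code in codes]        # 3. convert back to characters
    return "".join(shifted)                        # 4. aggregate into one string


print(shift_chars("admin"))  # cfokp
```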
This is probably easier to do with translate:
select translate('admin',
'abcdefghijklmnopqrstuvwxyz',
'cdefghijklmnopqrstuvwxyzab'
)
from dual;
I'm not sure what you want to do with "y" and "z". This maps them back to "a" and "b". You can extend this to upper case letters and other characters if you need.
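To make the "y" and "z" point concrete, here is a short Python sketch comparing the wrap-around mapping with a plain "add 2 to the code" shift. It assumes Python's str.translate as a stand-in for Oracle's TRANSLATE; rotate_lower is a made-up name.

```python
import string

LOWER = string.ascii_lowercase    # 'abcdefghijklmnopqrstuvwxyz'
ROTATED = LOWER[2:] + LOWER[:2]   # 'cdefghijklmnopqrstuvwxyzab'
TABLE = str.maketrans(LOWER, ROTATED)


def rotate_lower(word):
    # Characters outside a-z are left unchanged, as with TRANSLATE.
    return word.translate(TABLE)


print(rotate_lower("admin"))  # cfokp
print(rotate_lower("xyz"))    # zab  (wraps around)
print(chr(ord("y") + 2), chr(ord("z") + 2))  # { |  (plain +2 leaves the alphabet)
```

Which of the two behaviours matches the developer's scheme depends on how the original password routine handles "y" and "z".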
In Oracle, I want to check whether a string has an '=' sign at the end. Could you please let me know how to check this? If the string ends with an '=' sign, I need to trim that trailing '=' sign.
For example:
varStr VARCHAR2(20);
varStr := 'abcdef='; -- needs the trailing '=' sign trimmed
I don't think you need "pattern matching" here. Just check if the last character is the =
where substr(varstr, -1, 1) = '='
substr when called with a negative position will work from the end of the string, so substr(varstr,-1,1) extracts the last character of the given string.
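As a quick illustration of the "count from the end" idea, here is the equivalent check in Python, plus the trimming step the question asks about. It is a sketch only; the function names are made up, and Python's negative slice stands in for SUBSTR with a negative position.

```python
def ends_with_equals(s):
    # s[-1:] is the last character, or '' for an empty string,
    # much like substr(varstr, -1, 1).
    return s[-1:] == "="


def trim_trailing_equals(s):
    # Drop exactly one trailing '=' if it is there.
    return s[:-1] if ends_with_equals(s) else s


print(ends_with_equals("abcdef="))      # True
print(trim_trailing_equals("abcdef="))  # abcdef
```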
Use the REGEXP_LIKE function. I'm giving a SQL statement since you didn't specify the context in your question:
select *
from someTable
where regexp_like( someField, '=$' );
The $ anchor means that the preceding character must be at the end of the string.
See it here on SQL Fiddle: http://sqlfiddle.com/#!4/d8afd/3
It seems that substr is the way to go, at least with my sample data: on about 400K address lines, this returns 1043 entries that end in 'r' in an average of 0.2 seconds.
select count(*) from addrline where substr(text, -1, 1) = 'r';
On the other hand, the following returns the same results but takes 1.1 seconds.
select count(*) from addrline where regexp_like(text, 'r$' );
I am using the following code to write to a file in Oracle PL/SQL.
l_file := utl_file.fopen('HR_OUT', 'TRUMEDAID.txt', 'w');
utl_file.put_line
(
l_file,
utl_raw.cast_to_varchar2
(
utl_raw.convert
(
utl_raw.cast_to_raw(rec_text),
'AMERICAN_AMERICA.WE8ISO8859P1', -- To character set.
'AMERICAN_AMERICA.AL32UTF8' -- From Character set.
)
)
);
However this does not discard the UTF-8 characters but instead translates Pamêla&~ into Pam�&~. Is there another way that would at least give Pam�la&~? Why isn't the ascii character ê used?
Not sure why you are casting to RAW and back again. You can use the CONVERT function. The input string can be CHAR,VARCHAR2,CLOB etc. See: http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions027.htm#SQLRF00620
You only need to specify the destination character set since it takes the database character set as the default source.
Also, in these cases, never trust the resulting string of letters. Use the dump() function to investigate the byte values that make up the string. This way you can determine if the string is made up of the correct values.
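To show what looking at the byte values buys you, here is a small Python sketch of the same kind of inspection. It is not Oracle's dump() output. It assumes Python's "utf-8" and "iso-8859-1" codecs correspond to AL32UTF8 and WE8ISO8859P1, and dump_hex is a made-up helper.

```python
def dump_hex(data):
    # Show each byte as two hex digits, so the stored values can be compared.
    return " ".join(f"{b:02x}" for b in data)


name = "Pam\u00eala&~"  # the sample name, with e-circumflex written as an escape

utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("iso-8859-1")

print(dump_hex(utf8_bytes))    # 50 61 6d c3 aa 6c 61 26 7e
print(dump_hex(latin1_bytes))  # 50 61 6d ea 6c 61 26 7e

# Reading the single-byte form as if it were UTF-8 yields a replacement character.
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread == "Pam\ufffdla&~")  # True
```

The accented character is two bytes in UTF-8 and one byte in ISO 8859-1, and it is not part of 7-bit ASCII at all. So a converted file can be perfectly correct and still show a replacement character if the viewer reads it as UTF-8.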
select utl_raw.cast_to_varchar2
(
utl_raw.convert
(
utl_raw.cast_to_raw(first_name),
'AMERICAN_AMERICA.US7ASCII', -- To character set.
'AMERICAN_AMERICA.AL32UTF8' -- From Character set.
)
)
from per_all_people_f
where employee_number = '212164'
This worked for me.