Oracle Contains statement with special characters - oracle

I would like to search for this string 'A&G BROS, INC.' using oracle contains statement
FROM contact
WHERE CONTAINS
(name, 'A&G BROS, INC.') > 0
But I do not get accurate results I get over 300,000 records basically anything containing INC.
I tried escaping the & char using
FROM contact
WHERE CONTAINS
(name, 'A&' || 'G BROS, INC.') > 0
I still get same massive results
Any idea how to run this query with this special chars I want to narrow the results down so I can al least get results that starts with "A&G" Note "LIKE" and "INSTR" cannot be used.

Another way to deal with the special characters is to use the function CHR(n), where n is the ASCII value of the special character. For &, it is 38, so instead of
'A&G BROS, INC.' you can use 'A'||CHR(38)||'G BROS, INC.'
Using these special characters directly in literals can be tricky, because they can behave differently in different environments.
You can find the ASCII value of a character using the ASCII function, like this:
select ascii('&') from dual;
ASCII('&')
38

The & is AND, but the , is also ACCUM. The behaviour of those operators explains what you are seeing.
You need to escape those characters:
To query on words or symbols that have special meaning in query expressions such as and & or| accum, you must escape them. There are two ways to escape characters in a query expression...
So you could do:
FROM contact
WHERE CONTAINS
(name, 'A\&G BROS\, INC.') > 0
or
FROM contact
WHERE CONTAINS
(name, 'A{&}G BROS{,} INC.') > 0
or
FROM contact
WHERE CONTAINS
(name, '{A&G BROS, INC.}') > 0
If you can't stop your client prompting for substitution variables - which is really a separate issue to the contains escapes - then you could combine this with your original approach:
FROM contact
WHERE CONTAINS
(name, '{A&' || 'G BROS, INC.}') > 0

Related

Regular expression to remove a portion of text from each entry in commas separated list

I have a string of comma separated values, that I want to trim down for display purpose.
The string is a comma separated list of values of varying lengths and number of list entries.
Each entry in the list is formatted as a five character pattern in the format "##-NX" followed by some text.
e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."
Is there an regular expression function I can use to remove the text after the 5 character prefix portion of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?
I am a novice with regular expressions and I haven't been able figure out how to code the pattern.
I think what you need is
regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')
The inner REGEXP_REPLACE looks for a pattern like nn-NX (two numeric characters followed by "-NX") and any number of characters up to the next comma, then replaces it with the first and third term, dropping the "any number of characters" part.
The outer REGEXP_REPLACE looks for a pattern like two numeric characters followed by any number of characters up to the last NX, and keeps that part of the string.
Here is the Oracle code I used for testing:
with a as (
select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
from dual
)
select mystring
, regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
from a
This alternative calls REGEXP_REPLACE() once.
Match 2 digits, a dash and 'NX' followed by any number of zero or more characters (non-greedy) where followed by a comma or the end of the string. Replace with the first group and the 3rd group which will be either the comma or the end of the string.
EDIT: Took dougp's advice and eliminated the RTRIM by adding the 3rd capture group. Thanks for that!
WITH tbl(str) AS (
SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' FROM dual
)
SELECT
REGEXP_REPLACE(str, '(\d{2}-NX)(.*?)(,|$)', '\1\3') str
from tbl;

ADO Recordset filter comparing two fields

How can I filter a recordset by comparing two fields?
For a given ADO Recordset with n fields (Field1, Field2,...,Fieldn)
I used to filter a field against a value:
rs.Filter = "Field1 = 'something'"
But what I need to do is something like this:
rs.Filter = "Field1 = Field2"
Is that possible?
The criteria string is made up of clauses in the form FieldName-Operator-Value
Value is the value with which you will compare the field values (for example, 'Smith', #8/24/95#, 12.345, or $50.00). Use single quotes with strings and pound signs (#) with dates. For numbers, you can use decimal points, dollar signs, and scientific notation. If Operator is LIKE, Value can use wildcards. Only the asterisk (*) and percent sign (%) wild cards are allowed, and they must be the last character in the string. Value cannot be null.
This suggests that comparing fields to each other is not supported. Value must be a literal.

Oracle SQL: Using Replace function while Inserting

I have a query like this:
INSERT INTO TAB_AUTOCRCMTREQUESTS
(RequestOrigin, RequestKey, CommentText) VALUES ('Tracker', 'OPM03865_0', '[Orange.Security.OrangePrincipal]
em[u02650791]okok
it's friday!')
As expected it is throwing an error of missing comma, due to this it's friday! which has a single quote.
I want to remove this single quote while inserting using Replace function.
How can this be done?
Reason for error is because of the single Quote. In order to correct it, you shall not remove the single quote instead you need to add one more i.e. you need to make it's friday to it''s friday while inserting.
If you need to replace it for sure, then try the below code :
insert into Blagh values(REPLACE('it''s friday', '''', ''),12);
I would suggest using Oracle q quote.
Example:
INSERT INTO TAB_AUTOCRCMTREQUESTS (RequestOrigin, RequestKey, CommentText)
VALUES ('Tracker', 'OPM03865_0',
q'{[Orange.Security.OrangePrincipal] em[u02650791]okok it's friday!}')
You can read about q quote here.
To shorten this article you will follow this format: q'{your string here}' where "{" represents the starting delimiter, and "}" represents the ending delimiter. Oracle automatically recognizes "paired" delimiters, such as [], {}, (), and <>. If you want to use some other character as your start delimiter and it doesn't have a "natural" partner for termination, you must use the same character for start and end delimiters.
Obviously you can't user [] delimiters because you have this in your queries. I sugest using {} delimiters.
Of course you can use double qoute in it it''s with replace. You can omit last parameter in replace because it isn't mandatory and without it it automatically will remove ' character.
INSERT INTO TAB_AUTOCRCMTREQUESTS (CommentText) VALUES (REPLACE('...it''s friday!', ''''))
Single quotes are escaped by doubling them up
INSERT INTO Blagh VALUES(REPLACE('it''s friday', '''', ''),12);
You can try this, (sorry but I don't know why q'[ ] works)
INSERT INTO TAB_AUTOCRCMTREQUESTS
(RequestOrigin, RequestKey, CommentText) VALUES ('Tracker', 'OPM03865_0', q'[[Orange.Security.OrangePrincipal] em[u02650791]okok it's friday!]')
I just got the q'[] from this link Oracle pl-sql escape character (for a " ' ") - this question could be a possible duplicate

Format string in Oracle

I'm building a string in oracle, where I get a number from a column and make it a 12 digit number with the LPad function, so the length of it is 12 now.
Example: LPad(nProjectNr,12,'0') and I get 000123856812 (for example).
Now I want to split this string in parts of 3 digit with a "\" as prefix, so that the result will look like this \000\123\856\812.
How can I archive this in a select statement, what function can accomplish this?
Assuming strings of 12 digits, regexp_replace could be a way:
select regexp_replace('000123856812', '(.{3})', '\\\1') from dual
The regexp matches sequences of 3 characters and adds a \ as a prefix
It is much easier to do this using TO_CHAR(number) with the proper format model. Suppose we use \ as the thousands separator.... (alas we can't start a format model with a thousands separator - not allowed in TO_CHAR - so we still need to concatenate a \ to the left):
See also edit below
select 123856812 as n,
'\' || to_char(123856812, 'FM000G000G000G000', 'nls_numeric_characters=.\') as str
from dual
;
N STR
--------- ----------------
123856812 \000\123\856\812
Without the FM format model modifier, TO_CHAR will add a leading space (placeholder for the sign, plus or minus). FM means "shortest possible string representation consistent with the model provided" - that is, in this case, no leading space.
Edit - it just crossed my mind that we can exploit TO_CHAR() even further and not need to concatenate the first \. The thousands separator, G, may not be the first character of the string, but the currency symbol, placeholder L, can!
select 123856812 as n,
to_char(123856812, 'FML000G000G000G000',
'nls_numeric_characters=.\, nls_currency=\') as str
from dual
;
SUBSTR returns a substring of a string passed as the first argument. You can specify where the substring starts and how many characters it should be.
Try
SELECT '\'||SUBSTR('000123856812', 1,3)||'\'||SUBSTR('000123856812', 4,3)||'\'||SUBSTR('000123856812', 7,3)||'\'||SUBSTR('000123856812', 10,3) FROM dual;

Space characters inside Oracle's Contains() function

I needed to use Oracle 11g's Contains() function to search some exact text contained in some field typed by the user. I was asked not to use the 'like' operator.
According to the Oracle documentation, for everything to work you need to:
Double } characters
Put the whole input between {}
This works in most cases except for a few ones. Below it a test case:
create table theme
(name varchar2(300 char) not null);
insert into theme (name)
values ('a');
insert into theme (name)
values ('b');
insert into theme (name)
values ('a or b');
insert into theme (name)
values ('Pdz344_1_b');
create index name_index on theme(name) indextype is ctxsys.context;
If the 'or' operator was interpreted, I would get all four results, which is hopefully not the case. Now if I run the following, I would expect is to only find 'a or b'.
select * from theme
where contains(name, '{a or b}')>0;
However I also get 'Pdz344_1_b'. But there's no 'a', 'o' not 'r' and I find it very surprising that this text is matched. Is there something I don't get about contains()'s syntax?
CONTAINS is not like LIKE operator at all. Since it using ORACLE TEXT search engine (something like google search), not just string matching.
{} - is an escape marker. Means everything you put inside should be treated as text to escape.
Therefore you issue query to find text that looks like a or b not like a or b.
So your query get matched against Pdz344_1_b because it has b char in it.
Row with only a character ain't matched because a character exists in the default stop list.
Why just b ain't matched? Because your match sequence actually looks like a\ or\ b.
So we have 3 tokens a _or _b (underscores represents spaces). a in stop list, and we have no string _b in the b row, because there only single character. But we do have this combination in the Pdz344_1_b row, because non-alphabetic characters are treated as whitespace. If you remove {} or query for {b or a} then you'll get matches against b as well.

Resources