Count characters after a given symbol in an Oracle VARCHAR column value

How would I go about counting the characters after a certain character? I'm new to Oracle, and although I've learned quite a bit, I'm stumped at this point. I found a couple of functions that will get you a substring, and a function that will give you the length of a string. I am examining an email address, myemail#thedomain.com, and I want to check the length of the part after the '.'.
SELECT email
FROM user_table
WHERE length(substr(email, /*what values*/, /*to put here*/))
Is it actually possible to find the position of the final '.' in the email string?

I'm not sure I would use substr. You can try something like this:
select length('abcd#efgh.123.4567') - instr('abcd#efgh.123.4567', '.', -1) from dual
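Calling instr(..,..,-1) searches backwards from the last character, so it returns the position of the final '.'. Subtracting that position from the length leaves the number of characters after it.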
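The arithmetic in that query can be checked outside the database. The following is a minimal Python sketch, not Oracle code: chars_after_last is a hypothetical helper name, and str.rfind stands in for instr(.., .., -1). Note that rfind is 0-based while Oracle positions are 1-based, so the sketch adds 1 before subtracting. It assumes a single-character symbol.

```python
def chars_after_last(text: str, symbol: str = ".") -> int:
    """Count the characters that follow the last occurrence of symbol.

    Mirrors length(s) - instr(s, '.', -1) from the query above.
    """
    pos = text.rfind(symbol)  # 0-based index of the last match, -1 if none
    if pos == -1:
        # Oracle's instr returns 0 when nothing matches, so the SQL
        # expression yields the full length; this sketch does the same.
        return len(text)
    # Convert to a 1-based position to match Oracle's arithmetic.
    return len(text) - (pos + 1)


print(chars_after_last("abcd#efgh.123.4567"))     # 4
print(chars_after_last("myemail#thedomain.com"))  # 3
```

When the string contains no '.', both the SQL expression and this sketch return the full length, so a separate check for a missing '.' may be worth adding.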

Since you're doing checks, I suggest you validate the format with regular expressions using REGEXP_INSTR. For instance, an email validation I found on this site is REGEXP_INSTR(email, '\w+#\w+(\.\w+)+') > 0
I didn't check it myself, but it looks quite ok.
Cheers.
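Since that pattern was posted unchecked, here is a small Python sketch for trying it out. It is only an approximation of the Oracle call: looks_like_email is a hypothetical helper, re.search plays the role of REGEXP_INSTR(...) > 0 (a match anywhere in the string), and Python's \w is assumed to be close enough to Oracle's for plain ASCII input.

```python
import re

# Same pattern as in the REGEXP_INSTR call above ('#' separates user and domain).
EMAIL_PATTERN = re.compile(r"\w+#\w+(\.\w+)+")


def looks_like_email(value: str) -> bool:
    """True if the pattern matches somewhere in value."""
    return EMAIL_PATTERN.search(value) is not None


for candidate in ["myemail#thedomain.com", "myemail#thedomain", "thedomain.com"]:
    print(candidate, looks_like_email(candidate))
```

Because the pattern is not anchored, extra text before or after a valid-looking address still passes; anchoring with ^ and $ would tighten the check.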

Related

How to make Google Sheets REGEXREPLACE return an empty string if it fails?

I'm using the REGEXREPLACE function in Google Sheets to extract some text contained in SKUs. If it doesn't find the text, it seems to return the entire original string. Is it possible to make it return an empty string instead? This is a problem because it doesn't actually error out, so ISERROR doesn't help.
Example: my SKU SHOULD contain 5 separate groups delimited by the underscore character '_'. In this example it is missing the last group, so the formula returns the entire original string.
LDSS0107_SS-WH_5
=REGEXREPLACE($A3,"[^_]+_[^_]+_[^_]+_[^_]+_(.*)","$1")
It fails to find the fifth capture group, which is correct, but I need it to give me an empty string when it fails. At present it gives me the whole original string. Any ideas?
Perhaps the solution would be to add the missing groups:
=REGEXREPLACE($A1&REPT("_ ",4-(LEN($A1)-LEN(SUBSTITUTE($A1,"_","")))),"[^_]+_[^_]+_[^_]+_[^_]+_(.*)","$1")
This returns a space as the result for a missing group. If you don't want that, wrap the formula in the TRIM function.
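The behaviour in the question is typical of regex substitution in general: where the pattern does not match, nothing is replaced and the input comes back unchanged. A minimal Python sketch of the alternative, testing for a match first and falling back to an empty string, is shown below. This is not a Sheets formula; fifth_group is a hypothetical helper, and re.fullmatch is an assumption about the intended semantics (the whole SKU must fit the pattern).

```python
import re

# Same pattern as the formula above: four underscore-delimited groups,
# then everything after the fourth underscore.
SKU_PATTERN = re.compile(r"[^_]+_[^_]+_[^_]+_[^_]+_(.*)")


def fifth_group(sku: str) -> str:
    """Return the text after the fourth underscore, or "" if it is missing."""
    match = SKU_PATTERN.fullmatch(sku)
    return match.group(1) if match else ""


# A plain substitution leaves a non-matching input untouched.
print(repr(SKU_PATTERN.sub(r"\1", "LDSS0107_SS-WH_5")))
# Testing for a match first gives the empty string instead.
print(repr(fifth_group("LDSS0107_SS-WH_5")))
# A made-up SKU with all five groups, for comparison.
print(repr(fifth_group("LDSS0107_SS-WH_5_BLK_XL")))
```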

How do I write a regex for Excel cell range?

I need to validate that something is an Excel cell range in Ruby, i.e: "A4:A6". By looking at it, the requirement I am looking for is:
<Alphabetical, Capitalised><Integer>:<Alphabetical, Capitalised><Integer>
I am not sure how to form a RegExp for this.
I would appreciate a small explanation for a solution, as opposed to purely a solution.
A bonus would be to check that the range is restricted to within a row or column. I think this would be out of scope of Regular Expressions though.
I have tried /[A-Z]+[0-9]+:[A-Z]+[0-9]+/. This works, but it allows extra characters to be added on to the beginning or end.
"HELLOAA3:A7".match(/\A[A-Z]+[0-9]+:[A-Z]+[0-9]+\z/) also returns a match, but is more on the right track.
How would I limit the number range to 10000?
How would I limit the number of characters to 3?
This is my solution:
(?:(?:\'?(?:\[(?<wbook>.+)\])?(?<sheet>.+?)\'?!)?(?<colabs>\$)?(?<col>[a-zA-Z]+)(?<rowabs>\$)?(?<row>\d+)(?::(?<col2abs>\$)?(?<col2>[a-zA-Z]+)(?<row2abs>\$)?(?<row2>\d+))?|(?<name>[A-Za-z]+[A-Za-z\d]*))
It includes named ranges, but the R1C1 notation is not supported.
The pattern is written in the Perl-compatible regex dialect (so it can also be used with C#). I'm not familiar with Ruby, so I can't tell you what differs, but you may want to look here: What is the difference between Regex syntax in Ruby vs Perl?
This will do both: match an Excel range and require that both cells are in the same row or column.
^([A-Z]+)(\d+):(\1\d+|[A-Z]+\2)$
A4:A6 // ok
A5:B10 // not ok
B5:Z5 // ok
AZ100:B100hello // not ok
The magic here is the back-reference group:
([A-Z]+)(\d+) -- column is in capture group 1, row in group 2
(\1\d+|[A-Z]+\2) -- the first column followed by any row number; or
-- any column letters followed by the first row
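The back-reference approach above can be tried out with the four sample inputs. This is a Python sketch rather than Ruby; is_single_line_range is a hypothetical helper name, and re.fullmatch replaces the ^ and $ anchors so that trailing text is rejected.

```python
import re

# Group 1 is the first column, group 2 the first row. The second cell must
# either repeat the column (\1) or repeat the row (\2).
RANGE_PATTERN = re.compile(r"([A-Z]+)(\d+):(?:\1\d+|[A-Z]+\2)")


def is_single_line_range(ref: str) -> bool:
    """True if ref is a cell range confined to one row or one column."""
    return RANGE_PATTERN.fullmatch(ref) is not None


for ref in ["A4:A6", "A5:B10", "B5:Z5", "AZ100:B100hello"]:
    print(ref, is_single_line_range(ref))
```

The first and third inputs are accepted and the other two rejected, matching the ok / not ok annotations above.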

Clear table data in SQL

I hope there is no post limit since I have posted more than once today. :-P
Now I have a table in Oracle SQL. I noticed it contains some useless symbols and want to delete them. The way I do it is to replace all of them. Below is my table and my query.
Here is my query:
SELECT
CASE WHEN WORD IN ('!', '"', '#','""') Then ''
ELSE WORD END
FROM TERM_FREQUENCY;
It is not giving me an error, but these special characters are not going away either... Any thoughts?
A little typo of yours: you used - instead of _.
SELECT
CASE WHEN WORD IN ('!', '"', '#','""') Then ''
ELSE WORD END
-- FROM TERM-FREQUENCY; --This is where the problem is.
FROM TERM_FREQUENCY; -- Because your table is named TERM_FREQUENCY
You originally tagged your question with 'replace' but then didn't use that function in your code. You're comparing each whole word to those fixed strings, not seeing if it contains any of them.
You can either use nested replace calls to remove one character at a time:
select replace(replace(word, '!', null), '"', null) from ...
... which would be tedious and rely on you identifying every character you didn't want; or you could use a regular expression to keep only alphabetic characters, which I suspect is what you're really after:
select regexp_replace(word, '[^[:alpha:]]', null) from ...
You might also want to use lower or upper to get everything into the same case, as you probably don't really want to count different capitalisation differently either.
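To illustrate the difference between comparing whole words and stripping characters, here is a small Python sketch. It is not Oracle SQL: clean_word is a hypothetical helper, the sample words are made up, and [^A-Za-z] is an ASCII-only stand-in for [^[:alpha:]].

```python
import re


def clean_word(word: str) -> str:
    """Drop every non-letter character, then lower-case what is left."""
    # [^A-Za-z] is an ASCII-only approximation of Oracle's [^[:alpha:]].
    return re.sub(r"[^A-Za-z]", "", word).lower()


unwanted = {"!", '"', "#", '""'}
for raw in ["!", '"Hello"', "Term#"]:
    # The IN-style check only fires when the whole word is a symbol;
    # the regex removes symbols embedded in a longer word as well.
    print(repr(raw), raw in unwanted, repr(clean_word(raw)))
```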

Regular expression use in oracle

I am trying to get the user information from a table (named userinfo) from the Oracle database on the basis of name.
In database name can be like {"Ashwani Dahiya","Ashwani kumar","ashwani dahiya","ashwani kumar","Ashwani dahiya","ashwani Dahiya","ashwani"}
So if I search for the name "ashwani", I want it to return the whole list of users above.
select *
from userinfo
where regexp_like('name','Ashwani([[:space:]]* | [[:space:]]+[a-zA-Z0-9]*)','i')
I tried this, but got "no result found".
This expression
regexp_like('name','Ashwani([[:space:]]* | [[:space:]]+[a-zA-Z0-9]*)','i')
searches inside the string 'name' not inside a column called name. When you want to refer to a column you don't need quotes.
So you need to change your expression to:
regexp_like(name,'Ashwani([[:space:]]* | [[:space:]]+[a-zA-Z0-9]*)','i')
(Note that there are no single quotes around name.)
But I don't see the need for a regex here. A simple
where lower(name) like '%ashwani%'
will also do the trick (and will not be slower than the regex because neither of them will use an index)
I assume you meant to use the column called NAME and not the literal 'name' so try without the quotes around name and also lose the spaces around the "|":
select * from userinfo where regexp_like(name,'Ashwani([[:space:]]*|[[:space:]]+[a-zA-Z0-9]*)','i')
Note that for testing purposes you can try it with literals like so:
select *
from dual
where regexp_like('ashwani ','Ashwani([[:space:]]*|[[:space:]]+[a-zA-Z0-9]*)','i')
As mentioned by a_horse_with_no_name, you may get away with using LIKE, but I'm guessing you're looking for just "ashwani" and not "ashwaniX" (where X is some other letter), in which case you could use just
lower(name) = 'ashwani'
or lower(name) LIKE 'ashwani %'
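The two conditions above can be mirrored in a few lines of Python to see which names they accept. This is only a sketch of the logic, not Oracle code; is_ashwani is a hypothetical helper and "Ashwanix Roy" is a made-up value added to show the rejected case.

```python
def is_ashwani(name: str, first: str = "ashwani") -> bool:
    """Mirror lower(name) = 'ashwani' OR lower(name) LIKE 'ashwani %'."""
    lowered = name.lower()
    return lowered == first or lowered.startswith(first + " ")


names = ["Ashwani Dahiya", "ashwani kumar", "ashwani", "Ashwanix Roy"]
print([n for n in names if is_ashwani(n)])
```

A plain '%ashwani%' match would accept all four names, including the made-up one.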
The trouble with regex functions is that they can be rather slow if you're working with a lot of data.
If you wish to cling to regexes, try
select *
from userinfo
where regexp_count(name, '^ashwani', 1, 'i') = 1
;
but there really is no need for it if the matches will always start with the literal you are comparing against.

Oracle text curly braces behavior

I'm using oracle text to do a readahead (according to the spec writer) in the search bar.
Basically, a user can start typing text and we fill the suggestions bar with likely matches.
I tried using Oracle Text for this and ran into some issues, the latest one being:
Table contains this entry for answertext: ... we offer many pricing options ...
SELECT
questiontext as qtext,
answertext as text,
questionid FROM question
WHERE contains(answertext, '{pric}', 1) > 0
;
This query returns nothing. But using {pricing} will return the correct result.
Any suggestion as to why this is happening would be great!
Edit: just wanted to add that using stemming does not work for me because the user wants to differentiate between "report" and "reporting" and they want the matching substring to be highlighted which can be done if I can find the substring among the returned results.
Edit 2: My guess is that Oracle tokenizes each word in the index using a word boundary of some sort, so without any wildcards it looks for a token equal to 'pric' and does not find it (because the token is 'pricing'). If that guess is correct, I would love it if someone could chime in on how I can make the query above work with the example entry while still respecting whitespace, so that typing 'pricing options' returns the entry but typing 'many options' does not.
The CONTAINS operator supports wildcards and fuzzy text search. Try:
SELECT * FROM question WHERE contains(answertext, '{pric%}', 1) > 0;
or
SELECT * FROM question WHERE contains(answertext, 'fuzzy({pric})', 1) > 0;
But with fuzzy, "prize" will also match your search criteria.
To highlight found substrings you can use CTX_DOC.MARKUP.
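The guess in Edit 2 can be pictured with a toy model. The Python sketch below is not how Oracle Text is implemented; it only illustrates, under the assumption of a simple word-token index, why a whole-token lookup for "pric" fails while a prefix lookup succeeds. All function names are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lower-case word tokens (toy stand-in for an index)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_token(text: str, term: str) -> bool:
    """Whole-token lookup: the term must equal one of the tokens."""
    return term.lower() in tokenize(text)


def has_prefix(text: str, prefix: str) -> bool:
    """Prefix lookup, analogous to a trailing wildcard."""
    return any(tok.startswith(prefix.lower()) for tok in tokenize(text))


answer = "we offer many pricing options"
print(has_token(answer, "pric"))     # whole token: no token equals "pric"
print(has_token(answer, "pricing"))  # whole token: "pricing" is present
print(has_prefix(answer, "pric"))    # prefix: "pricing" starts with "pric"
```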
