remove special characters from query without using the replace function - oracle

I have some foreign language names in my query. The problem is, I don't know where all the special characters are, so using the REPLACE function will not be helpful because there are over 500,000 rows. Some foreign names appear like this for example:
I want the name to appear like this instead "COLLEGE BOREAL DARTS APPLIQUES ET DE TECHNOLOGIE"
Is there a way to achieve this without using the replace function? So that it works on other Names on the list as well
I tried something like this that I saw in another post:
SELECT
CTE.COLLEGE_NAME COLLATE Cyrillic_General_CI_AI
FROM SCHOOLS cte
But it did not work. If someone could please help me solve this, that would be great! thank you

You seem to be talking about both accented and special characters. As @Sayan showed, you can use nlssort to remove the accents, but as well as having to deal with the case change, it doesn't remove things like the trademark symbol (which you mentioned) as you might expect or want: the '™' is converted to 'tm', which is clever but unhelpful here, and it throws out the translate too (as shown here, adding examples to Sayan's code).
Another approach that might work for you is to use convert (which Oracle recommends against) or the utl_raw/utl_i18n functions to convert your values to plain ASCII. That takes care of the accents (hopefully all of them; I haven't tested extensively, and the discussion @Littlefoot linked to shows a lot of variations) and replaces any other non-ASCII values with a ?, which you can then conveniently remove along with other punctuation and symbols:
select college_name,
regexp_replace(
utl_i18n.raw_to_char(utl_i18n.string_to_raw(college_name, 'US7ASCII'), 'US7ASCII'),
'[[:punct:]]',
null) as result
from schools
which with your example and another with a trademark symbol gives:
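COLLEGE_NAME                                       RESULT
-------------------------------------------------- ------------------------------------------------
COLLÈGE BORÉAL D’ARTS APPLIQUÉS ET DE TECHNOLOGIE  COLLEGE BOREAL DARTS APPLIQUES ET DE TECHNOLOGIE
Collectives™ on Stack Overflow                     Collectives on Stack Overflow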
db<>fiddle including some variations; but don't use the convert ones *8-)
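If [[:punct:]] turns out to leave some symbols behind, a stricter variant is to whitelist what you keep instead. This is only a sketch and not tested against a full data set; it assumes you want nothing but ASCII letters, digits and spaces in the result, and the inline schools row is sample data standing in for your table:

```sql
-- Sample row standing in for the real SCHOOLS table (assumption for illustration).
with schools (college_name) as (
  select 'COLLÈGE BORÉAL D’ARTS APPLIQUÉS ET DE TECHNOLOGIE' from dual
)
select college_name,
       regexp_replace(
         -- Same US7ASCII round trip as above: accents are dropped,
         -- anything else non-ASCII is expected to come back as '?'.
         utl_i18n.raw_to_char(
           utl_i18n.string_to_raw(college_name, 'US7ASCII'), 'US7ASCII'),
         -- Remove every character that is not an ASCII letter, digit or space.
         '[^A-Za-z0-9 ]',
         null) as result
from schools;
```

The trade-off is that legitimate punctuation such as hyphens is removed as well, so adjust the bracket expression if you need to keep some of it.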

Of course, you can remove accents/umlauts from characters.
First of all, look at this example:
with t(n,name) as (
select 1, 'Löwenbrauerei' from dual union all
select 2, 'LÖwenbrauerei' from dual union all
select 3, 'Lowenbrauerei' from dual union all
select 4, 'LOwenbrauerei' from dual
)
select
n
,name
,utl_raw.cast_to_varchar2(nlssort(name, 'NLS_SORT=BINARY_AI')) name_AI
from t;
Results:
N NAME NAME_AI
---------- -------------- --------------------
1 Löwenbrauerei lowenbrauerei
2 LÖwenbrauerei lowenbrauerei
3 Lowenbrauerei lowenbrauerei
4 LOwenbrauerei lowenbrauerei
As you can see, NLSSORT(..., 'NLS_SORT=BINARY_AI') removes all accents and converts everything to lower case, so you just need to restore the original upper/lower case. For example, you can combine it with translate:
with t(n,name) as (
select 1, 'Löwenbrauerei' from dual union all
select 2, 'LÖwenbrauerei' from dual union all
select 3, 'Lowenbrauerei' from dual union all
select 4, 'LOwenbrauerei' from dual
)
select
n
,name
,upper(name)
,lower(utl_raw.cast_to_varchar2(nlssort(name, 'NLS_SORT=BINARY_AI'))) name_AI_lower
,upper(utl_raw.cast_to_varchar2(nlssort(name, 'NLS_SORT=BINARY_AI'))) name_AI_upper
,translate(
translate(
name
,upper(name)
,upper(utl_raw.cast_to_varchar2(nlssort(name, 'NLS_SORT=BINARY_AI')))
)
,lower(name)
,utl_raw.cast_to_varchar2(nlssort(name, 'NLS_SORT=BINARY_AI'))
) as name_ascent_removed
from t;
Results:
N NAME UPPER(NAME) NAME_AI_LOWER NAME_AI_UPPER NAME_ASCENT_REMOVED
---------- -------------- -------------- -------------------- -------------------- --------------------------------------------------------
1 Löwenbrauerei LÖWENBRAUEREI lowenbrauerei LOWENBRAUEREI Lowenbrauerei
2 LÖwenbrauerei LÖWENBRAUEREI lowenbrauerei LOWENBRAUEREI LOwenbrauerei
3 Lowenbrauerei LOWENBRAUEREI lowenbrauerei LOWENBRAUEREI Lowenbrauerei
4 LOwenbrauerei LOWENBRAUEREI lowenbrauerei LOWENBRAUEREI LOwenbrauerei
P.S. You can probably just set a code page/font on the client that ignores them...

Related

Allow multiple values from SSRS in oracle

I have a query that gets contract_types 1 to 10. This query is being used in an SSRS report to filter out a larger dataset. I am using -1 for nulls and -2 for all.
I would like to know how we would allow multiple values - does oracle concatenate the inputs together so '1,2,3' would be passed in? Say we get select -1,0,1 in SSRS, how could we alter the bottom query to return values?
My query to get ContractTypes:
SELECT
ContractType,
CASE WHEN ContractType = -2 THEN 'All'
WHEN ContractType = -1 THEN 'Null'
ELSE to_Char(ContractType)
END AS DisplayFigure
FROM ContractTypes
which returns
ContractType DisplayFig
-1 Null
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
This currently returns only single values or all, not multiple values:
SELECT *
FROM Employee
WHERE NVL(CONTRACT_TYPE, -1) = :contract_type or :contract_type = -2
I'm assuming we want to do something like:
WHERE NVL(CONTRACT_TYPE, -1) IN (:contract_type)
But this doesn't seem to work.
Data in Employee
Name ContractType
Bob 1
Sue 0
Bill Null
Joe 2
In my report, I want to be able to select contract_type as -1 (null), 0, 1 using the 'allow multiple values' checkbox. At the moment, I can only select either 'all' using my -2 value, or single contract types.
My input would be: contract type = -1,1,2
My output would be Bill, Bob, Joe.
This is how I'm executing my code
I use SSRS with Oracle a lot so I see where you're coming from. Thankfully, they work pretty well together.
First make sure the parameter is set to allow multiple values. This adds a Select All option to your dropdown so you don't have to worry about adding a special case for "All". You'll want to make sure the dataset for the parameter has a row with -1 as the Value and a friendly description for the Label.
Next, the WHERE clause would be just as you mentioned:
WHERE NVL(CONTRACT_TYPE, -1) IN (:contract_type)
SSRS automatically populates the values. There is no XML or string manipulation needed. Keep in mind that this will not work with single-value parameters.
If for some reason this still doesn't work as expected in your environment, there is another workaround you can use which is more universal and works even with ODBC connections.
In the dataset parameter properties, use an expression like this to concatenate the values into a single, comma-separated string:
="," + Join(Parameters!Parameter.Value, ",") + ","
Then use an expression like this in your WHERE clause:
where :parameter like '%,' || Column || ',%'
Obviously, this is less efficient because it most likely won't be using an index, but it works.
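To make the LIKE workaround concrete for the data in the question, here is a minimal sketch. The literal ',-1,1,2,' stands in for the joined SSRS parameter (an assumption; in the report it would be the bind variable), and note that Oracle concatenates strings with ||:

```sql
-- Inline copy of the Employee rows from the question.
with employee (name, contract_type) as (
  select 'Bob',  1    from dual union all
  select 'Sue',  0    from dual union all
  select 'Bill', null from dual union all
  select 'Joe',  2    from dual
)
select name
from employee
-- ',-1,1,2,' is what ="," + Join(...) + "," would produce for -1, 1, 2.
-- Wrapping each value in commas avoids partial matches such as 1 inside 10.
where ',-1,1,2,' like '%,' || to_char(nvl(contract_type, -1)) || ',%';
```

With that input the rows for Bob, Bill and Joe match, which is the output the question asks for.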
I don't know SSRS, but - if I understood you correctly, you'll have to split that comma-separated values list into rows. Something like in this example:
SQL> select *
2 from dept
3 where deptno in (select regexp_substr('&&contract_type', '[^,]+', 1, level)
4 from dual
5 connect by level <= regexp_count('&&contract_type', ',') + 1
6 );
Enter value for contract_type: 10,20,40
DEPTNO DNAME LOC
---------- -------------------- --------------------
20 RESEARCH DALLAS
10 ACCOUNTING NEW YORK
40 OPERATIONS BOSTON
SQL>
Applied to your code:
select *
from employee
where nvl(contract_type, -1) in (select regexp_substr(:contract_type, '[^,]+', 1, level)
from dual
connect by level <= regexp_count(:contract_type, ',') + 1
)
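To see what the subquery produces on its own, here is the same split run against a literal list. The value '-1,1,2' is just the example input from the question, and the to_number call is my addition so the values compare as numbers:

```sql
-- One row per comma-separated element: level walks through the list,
-- and regexp_count (commas + 1) gives the number of elements.
select to_number(regexp_substr('-1,1,2', '[^,]+', 1, level)) as contract_type
from dual
connect by level <= regexp_count('-1,1,2', ',') + 1;
```

This returns three rows: -1, 1 and 2.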
If you have a comma-separated list of numbers and want to split it, the following seems simple and easy to maintain:
select to_number(column_value) from xmltable(:val);
Inputs: 1,2,3,4
Output:
I think I understand your problem. If I'm right, the following should solve it:
with inputs(Name, ContractType) as
(
select 'Bob', 1 from dual union all
select 'Sue', 0 from dual union all
select 'Bill', Null from dual union all
select 'Joe', 2 from dual
)
select *
from inputs
where decode(:ContractType,'-2',-2,nvl(ContractType,-1)) in (select to_number(column_value) from xmltable(:ContractType))
Inputs: -1,1,2
Output:
Inputs: -2
Output:

How to use different group separators for thousands and millions. Oracle

I need to display numbers in the following format, for example:
40000000 to 40'000,000
I tried this, but when I use two different group separators I get the "invalid number format model" error:
select to_char(9999999999, '9g999g99999g9', 'NLS_NUMERIC_CHARACTERS='',.''')
from dual;
I also tried using substr and replace, but it doesn't work in all cases (for example, when the result is 3000000 or 700000000).
This works, but it is not an optimal solution:
SELECT substr(replace('40,000,000',',',''''),0,length(40000000)-2)|| substr('40,000,000',-4) from dual;
This is what the actual select looks like if I use the previous code:
SELECT substr(replace(to_char(oTOTAL_SENIOR, '999,999,999'),',',''''),0,length(oTOTAL_SENIOR)-2)|| substr(to_char(oTOTAL_SENIOR, '999,999,999'),-4) from dual
The previous select breaks when I use substr, replace and to_char together, because of the '999,999,999'.
I also tried using regexp_replace, but I'm not good at it.
I know I need to replace everything but the last 4 characters (,000), but I don't know how.
Any help will be appreciated.
You'll need someone smarter than me to do that properly, but - meanwhile, see whether this helps.
I'm on XE 11g; in order to avoid the "no more data to read from socket" error I got during the final steps of this query (I believe it was due to reversing the string twice), I created my own "reverse" function. There's an undocumented reverse function, but I don't want to use it.
Basically, it reverses a string you pass as a parameter. What do I need it for? I found it simpler to reverse values, split them to 3-by-3-by-3 characters and apply those "strange" separators you want. Also, it makes the whole code simpler. It can be done without it, but - as I said - no more data to read from socket won't allow it. Sorry about that.
Now, someone will say: why didn't you (meaning: me, LF) do that using PL/SQL completely and put everything into a function? No particular reason & no problem in doing it, if necessary.
OK, here it is:
SQL> create or replace function f_reverse (par_string in varchar2)
2 return varchar2
3 is
4 retval varchar2(20);
5 begin
6 select listagg(substr(par_string, level, 1))
7 within group (order by level desc)
8 into retval
9 from dual
10 connect by level <= length(par_string);
11
12 return retval;
13 end;
14 /
Function created.
SQL> select f_reverse('1234') from dual;
F_REVERSE('1234')
---------------------------------------------------------------------
4321
SQL>
Finally, this is what you want:
SQL> with test (id, col) as
2 (select 1, 40100200 from dual union all
3 select 2, 2300400 from dual union all
4 select 3, 700500 from dual union all
5 select 4, 25700 from dual union all
6 select 5, 6300 from dual union all
7 select 6, 555 from dual
8 )
9 select id,
10 regexp_replace(
11 f_reverse(
12 substr(f_reverse(col), 1, 3) ||','||
13 substr(f_reverse(col), 4, 3) || chr(39) ||
14 substr(f_reverse(col), 7)
15 ), '^[^0-9]+', '') result
16 from test;
ID RESULT
---------- ----------
1 40'100,200
2 2'300,400
3 700,500
4 25,700
5 6,300
6 555
6 rows selected.
SQL>
What does it do?
lines #1 - 8 - sample data
lines #9 onward - the useful code
lines #12 - 14 - splitting the reversed sample data into substrings 3 characters in length
line #12 - concatenates , as this is the thousands separator
line #13 - concatenates chr(39), which is ', the millions separator
line #11 - reverses the concatenated "reversed" string back
line #10 - removes possible non-numeric characters from the beginning of the result (those are separators for values that are shorter than "thousand" or "million")
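A different technique, which avoids the reversing and the helper function altogether, is to format with ordinary commas first and then swap only the millions separator with regexp_replace. This is just a sketch, assuming values of at most nine digits (the width of the format mask) and no decimals:

```sql
with test (id, col) as (
  select 1, 40100200 from dual union all
  select 2, 2300400  from dual union all
  select 3, 700500   from dual union all
  select 4, 25700    from dual union all
  select 5, 6300     from dual union all
  select 6, 555      from dual
)
select id,
       regexp_replace(
         -- FM suppresses the leading padding; the commas are literal separators.
         to_char(col, 'FM999,999,999'),
         -- A comma followed by exactly two trailing 3-digit groups
         -- can only be the millions separator.
         ',(\d{3},\d{3})$',
         -- Replace that comma with an apostrophe and keep the two groups.
         '''\1') as result
from test;
```

For the sample data this gives 40'100,200, 2'300,400, 700,500, 25,700, 6,300 and 555; values below one million have no second comma, so the pattern does not match and they are left alone.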

Find if the fifth position is a letter and not a number using ORACLE

How can I find if the fifth position is a letter and thus not a number using Oracle ?
My last try was using the following statement:
REGEXP_LIKE (table_column, '([abcdefghijklmnopqrstuvxyz])');
Perhaps you'd rather check whether 5th position contains a number (which means that it is not something else), i.e. do the opposite of what you're doing now.
Why? Because a "letter" isn't only ASCII; have a look at the 4th row in my example - it contains Croatian characters and these aren't between [a-z] (nor [A-Z]).
SQL> with test (col) as
2 (select 'abc_3def' from dual union all
3 select 'A435D887' from dual union all
4 select '!#$%&/()' from dual union all
5 select 'ASDĐŠŽĆČ' from dual
6 )
7 select col,
8 case when regexp_like(substr(col, 5, 1), '\d+') then 'number'
9 else 'not a number'
10 end result
11 from test;
COL RESULT
------------- ------------
abc_3def number
A435D887 not a number
!#$%&/() not a number
ASDĐŠŽĆČ not a number
SQL>
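Anchor the pattern to the start of the string, otherwise you may get unexpected results. The query below works as intended (it returns no rows, because the fifth character is 4), but remove the caret (the start-of-string anchor) and it returns 'TRUE', because the pattern can then match further along the string. Note that it uses the case-insensitive flag 'i'.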
select 'TRUE'
from dual
where regexp_like('abcd4fg', '^.{4}[A-Z]', 'i');
Yet another way to do it:
regexp_like(table_column, '^....[[:alpha:]]')
Using the character class [[:alpha:]] will pick up all letters (upper case, lower case, accented, etc.) but will ignore numbers, punctuation and white space characters.
If what you care about is that the character is not a number, then use
not regexp_like(table_column, '^....[[:digit:]]')
or
not regexp_like(table_column, '^....\d')
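As a quick way to try the [[:alpha:]] check, here is a small sketch over made-up sample strings (the test rows and the result labels are my own, not from the question):

```sql
with test (col) as (
  select 'abc_3def' from dual union all   -- 5th character is a digit
  select 'A435D887' from dual union all   -- 5th character is a letter
  select '!#$%&/()' from dual             -- 5th character is punctuation
)
select col,
       -- substr isolates the 5th character, so the pattern only has to
       -- describe a single character and needs no '....' prefix.
       case when regexp_like(substr(col, 5, 1), '^[[:alpha:]]$')
            then 'letter'
            else 'not a letter'
       end as result
from test;
```

Only the second row is reported as 'letter'. Strings shorter than five characters give a null substr and therefore fall into the else branch.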
Try:
REGEXP_LIKE (table_column, '^....[a-z]')
Or:
SUBSTR (table_column, 5, 1 ) BETWEEN 'a' AND 'z'

Replacing/Converting 1 to A with Oracle/PLSQL

Firstly, I greatly appreciate any feedback that anyone can offer. I am using Oracle SQL Developer, Version 4.0.2.15, Build 15.21.
I know and understand that many, many similar questions have been asked, as I've searched around on stackoverflow as well as the rest of the internet. However, the corresponding answers are either too vague or too extravagant, and attempt to do things that are way over my head and not what I am trying to accomplish. I am extremely new to SQL and haven't seriously done any coding since I did Java about 12 years ago. So please understand that something simple to you, is not so simple and obvious to me.
My bare-bones endstate that I am shooting for is taking a pre-existing Oracle Table Column, which is called 'service_level', that has parameters of 1-3, and making them A-C (where A=1, B=2, C=3). The reason for this is that I have an ArcGIS gdB featureclass that has a corresponding column, called 'MaintServi', with the parameters of A-C. I am going to join them using ArcToolbox once I have converted/replaced the 1-3 to A-C, and have exported them from Oracle into an Arc gdB as another table. The reason being is that the featureclass (obviously) has geometry, but this particular Oracle table does not.
From what I have gathered I know (or think) I will need to use something like:
chr(ord('a') + 3)
^ Where I will need to use/call upon the chr/ord functions. However, due to my inexperience, I cannot think of how to properly call this without getting an error. Below is what I have for my query thus far (but without chr/ord). I just need to figure out how to correctly insert it into my query to achieve the desired results.
SELECT v_wv_wp_crew.*,
Substr(v_wv_wp_crew.winter_supp_id, 1, 6) AS CostCenter,
Substr(v_wv_wp_crew.winter_supp_id, 8, 11) AS Crew_Supp_ID
FROM v_wv_wp_crew
WHERE crew_on_road >= '13-FEB-12'
AND ( operation = 2
OR operation = 3 );
Thanks again and hopefully I have complied with the posting rules of stackoverflow.
@Mark J. Bobak -
When implementing his ideas I get either this (like I said, I'm not sure how to insert it properly without receiving an error)
SELECT v_wv_wp_crew.*,
Substr(v_wv_wp_crew.winter_supp_id, 1, 6) AS CostCenter,
Substr(v_wv_wp_crew.winter_supp_id, 8, 11) AS Crew_Supp_ID
FROM v_wv_wp_crew
WHERE crew_on_road >= '13-FEB-12'
AND ( operation = 2
OR operation = 3 )
UNION ALL
WITH service_level as (select 1 service_level from dual
union all
select 2 service_level from dual union all
select 3 service_level from dual)
select decode(service_level,1,'A',2,'B',3,'C') from service_level;
I receive the following error:
*ORA-32034: unsupported use of WITH clause
32034. 00000 - "unsupported use of WITH clause"
*Cause: Inproper use of WITH clause because one of the following two reasons
1. nesting of WITH clause within WITH clause not supported yet
2. For a set query, WITH clause can't be specified for a branch.
3. WITH clause can't sepecified within parentheses.
Action: correct query and retry
Error at Line: 14 Column: 25
Or I receive an output of only 3 rows (A, B, C) if I run the query separately - sorry I don't have enough reputation to post the image yet.
You can use the DECODE() function. Something like this should work:
with list_of_digits as (select 1 col_a from dual
union all
select 2 col_a from dual
union all
select 3 col_a from dual
union all
select 4 col_a from dual)
select decode(col_a,1,'A',2,'B',3,'C','Other') from list_of_digits;
Using your query, try this:
WITH service_level as (select 1 service_level from dual
union all
select 2 service_level from dual union all
select 3 service_level from dual)
select decode(service_level,1,'A',2,'B',3,'C') from service_level
union all
SELECT v_wv_wp_crew.*,
Substr(v_wv_wp_crew.winter_supp_id, 1, 6) AS CostCenter,
Substr(v_wv_wp_crew.winter_supp_id, 8, 11) AS Crew_Supp_ID
FROM v_wv_wp_crew
WHERE crew_on_road >= '13-FEB-12'
AND ( operation = 2
OR operation = 3 );
ord isn't an Oracle function. The equivalent Oracle function is ASCII. However, even substituting in the correct function, I don't see how that gets you what you want.
It seems most likely that you just want to add a column (I'd use case to translate the values):
SELECT v_wv_wp_crew.*,
Substr(v_wv_wp_crew.winter_supp_id, 1, 6) AS CostCenter,
Substr(v_wv_wp_crew.winter_supp_id, 8, 11) AS Crew_Supp_ID,
case service_level
when '1' then 'a'
when '2' then 'b'
when '3' then 'c'
end as service_level_alpha
FROM v_wv_wp_crew
WHERE crew_on_road >= '13-FEB-12'
AND ( operation = 2
OR operation = 3 );
If you want to return this column as service_level, then you'll need to return the full list of columns instead of using the asterisk.
Since this is a straightforward character swap, you could use translate to really streamline the operation: translate(service_level,'123','abc'). However, I vastly prefer case over either decode or translate for readability.
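For comparison, here are the options mentioned in this thread side by side, producing the upper-case A-C the question asks for. This is a sketch that assumes service_level is numeric; the svc rows are stand-in data, and the last column shows the chr/ascii arithmetic the question was originally reaching for:

```sql
-- Stand-in rows for the service_level column (assumed to be a number).
with svc (service_level) as (
  select 1 from dual union all
  select 2 from dual union all
  select 3 from dual
)
select service_level,
       case service_level
         when 1 then 'A'
         when 2 then 'B'
         when 3 then 'C'
       end                                           as via_case,
       decode(service_level, 1, 'A', 2, 'B', 3, 'C') as via_decode,
       translate(to_char(service_level), '123', 'ABC') as via_translate,
       -- ascii('A') is 65, so 1 -> chr(65) = 'A', 2 -> 'B', 3 -> 'C'.
       chr(ascii('A') + service_level - 1)           as via_chr
from svc;
```

All four expressions return A, B and C for 1, 2 and 3. They differ for unexpected values: case and decode return null, translate passes other characters through unchanged, and the chr arithmetic keeps counting up the alphabet.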

Oracle text search for ranges

I'm looking for a better way of searching through numeric ranges in Oracle Text. I have a DB app that does a lot of GIS-type things, but we now want to add street range searching to it.
So I'd like to store the min and max values in a column, and search for a number within those values. I'm happy to go explore options, but I'd like some pointers on where to head. Does anyone have any suggestions for me?
EDIT: we're just trying to make address lookups easier. Text on the address parts has been a huge success, but we want to store street ranges instead of every individual house number.
So, if I searched for "11 high street", I'd expect a match if high street had a range of 1 to 1000. I'd also like some options that I can use if I searched for "flat 1 11 high street" too though. I expect that I will have to do some jiggery with the input in these cases, I just want to know what kind of tools there are that I could try working with.
My suggestion is to create a fixed-length string field for storing building numbers, create an index on this field, and then use between (or a range comparison) for the search.
Something like this format:
NNNNNCCCCBBBB
where:
NNNNN - left-padded house number;
CCCC - left-padded character (like 'A' in '11A');
BBBB - left-padded building number
By 'left-padded' I mean "filled on the left with some symbol up to a standard length"; see, for example, the result of the query select lpad('11',5,'X') from dual;.
E.g. suppose you have the address "11A high street building 5" and choose '%' as the filling symbol. Converted to the proposed format, it looks like '%%%11%%%A%%%5', with 'high street' stored in separate field(s).
Next is query example for selecting all houses between 1 and 1000:
with address_list as (
select '%%%11%%%A%%%%' bnum from dual union all
select '%1001%%%A%%%%' bnum from dual union all
select '%%%%1%%%A%%%%' bnum from dual union all
select '%%%%1%%%%%%%%' bnum from dual union all
select '%%321%%%A%%%%' bnum from dual union all
select '%1000%%%A%%%%' bnum from dual union all
select '%1000%%QQ%%12' bnum from dual
)
select * from address_list
where
-- from '1 high street'
bnum >= '%%%%1%%%%%%%%'
and
-- less than '1001 high street'
bnum < '%1001%%%%%%%%'
order by
bnum
In a real case it is better to use chr(1) or some other unprintable character as the padding symbol.
Another option is to build only a function-based index for the search, without storing the value in a real column.
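Building the padded key is just three lpad calls glued together. A minimal sketch, using the 5 + 4 + 4 widths that the sample values in the query above follow and '%' as the padding symbol (the literals are the "11A ... building 5" example):

```sql
-- House number 11, letter A, building 5, each padded to its own width.
select lpad('11', 5, '%')    -- '%%%11'
    || lpad('A',  4, '%')    -- '%%%A'
    || lpad('5',  4, '%')    -- '%%%5'
       as bnum
from dual;
```

The result is '%%%11%%%A%%%5'. The same expression over the real columns could serve as the body of the function-based index. Keep in mind that lpad returns null for a null input, so addresses with no letter or building number need an nvl with the padding symbol first.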
Anything wrong with
WHERE <number> BETWEEN minColumn AND maxColumn
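For the simple "11 high street" case that could look roughly like the sketch below. The street_ranges rows and the column names min_no and max_no are hypothetical, and the input is assumed to start with the house number:

```sql
-- Hypothetical range table: one row per street with its lowest and highest number.
with street_ranges (street, min_no, max_no) as (
  select 'HIGH STREET', 1, 1000 from dual union all
  select 'LOW STREET',  2,   40 from dual
)
select street, min_no, max_no
from street_ranges
-- Strip the leading digits and whitespace to get the street name.
where street = upper(regexp_replace('11 high street', '^\d+\s*', ''))
-- Take the leading digits as the house number and test it against the range.
  and to_number(regexp_substr('11 high street', '^\d+')) between min_no and max_no;
```

This returns the HIGH STREET row. Input such as "flat 1 11 high street" does not start with the house number, so it would need to be cleaned up before a check like this can be applied.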

Resources