Oracle Trim RegEx - oracle

I am trying to execute a join between two tables in Oracle where the join column is a string in one table and a number in the other.
I need to perform some sort of trim function on the string version because it is an 8-character field that is padded with leading 0s whenever the number has fewer than 8 digits.
For example, 123 = '00000123'. How can I get the string '00000123' to equal '123' regardless of the number of leading 0s?
Thanks!!

Use the to_number conversion function:
SELECT to_number('00000123')
FROM dual;
| TO_NUMBER('00000123') |
|-----------------------|
| 123 |
Demo: http://sqlfiddle.com/#!4/1792d/18
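Applied to the join in the question, the conversion goes into the join condition. The following is a minimal sketch, assuming hypothetical tables num_table (NUMBER column id) and str_table (VARCHAR2(8) column code), built inline from dual so the statement runs on its own:

```sql
-- num_table, str_table, id and code are hypothetical names for illustration.
WITH num_table AS (
  SELECT 123 AS id FROM dual
),
str_table AS (
  SELECT '00000123' AS code FROM dual UNION ALL
  SELECT '00000456' AS code FROM dual
)
SELECT n.id, s.code
FROM   num_table n
JOIN   str_table s
  ON   n.id = TO_NUMBER(s.code);  -- '00000123' converts to 123 and matches
```

TO_NUMBER raises ORA-01722 (invalid number) if a code value is not numeric. If the string column is indexed, comparing in the other direction with s.code = LPAD(TO_CHAR(n.id), 8, '0') leaves that column unwrapped, which may let the optimizer use the index.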

Related

Finding fields with non-alphanumeric values

I am looking for a way to find the values in a column that contain non-alphanumeric characters.
I tried
select 'kjh$' not RLIKE '([0-9][a-z]|[A-Z])*')
but it does not work.
Thanks for your help.
You can use REGEXP '^[A-Za-z0-9]+$' or RLIKE '^[A-Za-z0-9]+$'.
Sample SQL -
select * from my_table where mycol not RLIKE '^[A-Za-z0-9]+$'
^ - anchors the match at the start of the string
$ - anchors the match at the end of the string
+ - matches the preceding element one or more times
[A-Za-z0-9] - matches a single alphanumeric character
I ran a simple select statement to check whether a string is entirely alphanumeric using this regex:
select 'Aa90$$bc' ,'Aa90$$bc' rlike '^[A-Za-z0-9]+$'
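An alternative is to search directly for a single character outside the allowed set, rather than negating a whole-string match. A sketch, assuming a dialect that supports RLIKE (such as MySQL or Hive) and the placeholder names my_table and mycol:

```sql
-- Returns rows where mycol contains at least one character that is
-- not an ASCII letter or digit. my_table and mycol are placeholder names.
SELECT *
FROM   my_table
WHERE  mycol RLIKE '[^A-Za-z0-9]';
```

The two forms differ on empty strings: NOT RLIKE '^[A-Za-z0-9]+$' also returns rows where mycol is empty, because + requires at least one character, while the negated character class needs an offending character to be present. Rows where mycol is NULL are returned by neither form.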

How to display two field sums in the same query in Hive

I have a Hive table with the following fields:
id STRING , x STRING
where x can have values such as 'c'.
I need a query that displays the number of rows where column x contains the value 'c' and the number of rows where x has a value other than 'c'.
id | count(x='c') | count(x<>'c')
---|--------------|--------------
1 | 3 | 7
I don't know if it's possible.
You can try:
SELECT sum(if(x='c',1,0)), sum(if(x!='c',1,0)) FROM table_name;
This will print two columns. I didn't understand the id field in your sample output.
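If a per-id breakdown is what the sample output was meant to show, the same conditional sums can be grouped by id. A sketch, assuming the table is called table_name as in the query above; the column aliases c_count and other_count are made up for illustration:

```sql
-- One output row per id, with both counts side by side.
SELECT id,
       sum(if(x = 'c', 1, 0))  AS c_count,
       sum(if(x <> 'c', 1, 0)) AS other_count
FROM   table_name
GROUP  BY id;
```

Rows where x is NULL are counted in neither column, since both comparisons evaluate to NULL and if() then takes the else branch.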

Oracle cursor removes leading zero

I have a cursor which selects data from a column of type NUMBER containing floating-point numbers. Numbers like 4,3433 are returned properly, while numbers smaller than 1 have the leading zero removed.
For example, the number 0,4513 is returned as ,4513.
When I execute select used in the cursor on the database, numbers are formatted properly, with leading zeros.
This is how I loop over the records returned by the cursor:
FOR c_data IN cursor_name(p_date) LOOP
...
END LOOP;
Any ideas why it works that way?
Thank you in advance.
You're confusing number format and number value.
The two strings 0.123 and .123, when read as numbers, are mathematically equal. They represent the same number. In Oracle the true number representation is never displayed directly; a number is always converted to a character string to display it, either implicitly or explicitly with a function.
You assume that a number between 0 and 1 should be represented with a leading 0, but this is not true by default. It depends on how you ask for the number to be displayed. If you don't want unexpected results, you have to be explicit when displaying numbers and dates, for example:
to_char(your_number, '9990.99');
What you see is the default number formatting that Oracle provides.
If you want something custom, use the TO_CHAR function (either in the SQL query or in the PL/SQL code inside the loop).
Here is how it works:
WITH aa AS (
  select 1.3232 NUM from dual UNION ALL
  select 1.3232 NUM from dual UNION ALL
  select 332.323 NUM from dual UNION ALL
  select 0.3232 NUM from dual
)
select NUM, to_char(NUM, 'FM999990D9999999') FORMATTED from aa
/

       NUM FORMATTED
---------- ---------------
    1.3232 1.3232
    1.3232 1.3232
   332.323 332.323
     .3232 0.3232
In this example, 'FM' suppresses the padding blanks, '0' marks a digit position that is always shown (including leading or trailing zeros), '9' marks a digit position whose leading or trailing zeros are suppressed, and 'D' is the decimal character.
You can find many examples here:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#i34570
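Inside the cursor loop from the question, the same kind of format model can be applied at the point where the value is turned into text. A sketch for SQL*Plus, assuming an inline cursor over dual in place of cursor_name(p_date) and a made-up column alias num:

```sql
SET SERVEROUTPUT ON

BEGIN
  -- The inline query stands in for the question's cursor_name(p_date).
  FOR c_data IN (SELECT 0.4513 AS num FROM dual) LOOP
    -- Explicit conversion: the '0' in the format model forces a digit
    -- before the decimal character.
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(c_data.num, 'FM999990D9999'));
  END LOOP;
END;
/
```

The decimal character printed for D follows the session's NLS_NUMERIC_CHARACTERS setting, so a session configured with a comma would show 0,4513.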

Sorting results on Oracle as ASCII

I'm running a query that returns a VARCHAR2 and some other fields. I'm ordering my results by this VARCHAR2 and having some problems caused by the linguistic sort, which I found described in the Oracle documentation. For example:
SELECT id, my_varchar2 from my_table ORDER BY MY_VARCHAR2;
Will return:
ID MY_VARCHAR2
------ -----------
3648 A
3649 B
6504 C
7317 D
3647 0
I need it to return the string "0" as the first element of this sequence, as it would if ASCII values were compared. The string can have more than one character, so I can't use the ascii function, as it ignores every character except the first one.
What's the best way to do this?
For that case, you should be able to just order by the BINARY value of your characters:
SELECT id, my_varchar2
FROM my_table
ORDER BY NLSSORT(MY_VARCHAR2, 'NLS_SORT = BINARY')
SQLFiddle here.
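If every sort in the session should use byte order, the sort setting can also be changed once instead of wrapping each ORDER BY in NLSSORT. A sketch of that alternative, with part of the question's data inlined from dual so it runs on its own, and assuming an ASCII-based database character set:

```sql
-- Switch the session from linguistic to binary sorting.
ALTER SESSION SET NLS_SORT = BINARY;

WITH my_table AS (
  SELECT 3648 AS id, 'A' AS my_varchar2 FROM dual UNION ALL
  SELECT 3649, 'B' FROM dual UNION ALL
  SELECT 3647, '0' FROM dual
)
SELECT id, my_varchar2
FROM   my_table
ORDER  BY my_varchar2;  -- '0' now sorts ahead of 'A' and 'B'
```

The session setting affects every ORDER BY that follows, whereas the NLSSORT call changes only the query it appears in.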

How should I range partition an index with a varchar2 column in Oracle? Is it a bad idea?

I am using Oracle 10g Enterprise edition.
A table in our Oracle database stores the soundex value representation of another text column. We are using a custom soundex implementation in which the soundex values are longer than are generated by traditional soundex algorithms (such as the one Oracle uses). That's really beside the point.
Basically I have a varchar2 column that has values containing a single character followed by a dynamic number of numeric values (e.g. 'A12345', 'S382771', etc). The table is partitioned by another column, but I'd like to add a partitioned index to the soundex column since it is often searched. When trying to add a range partitioned index using the first character of the soundex column it worked great:
create index IDX_NAMES_SOUNDEX on NAMES_SOUNDEX (soundex)
global partition by range (soundex) (
partition IDX_NAMES_SOUNDEX_PART_A values less than ('B'), -- 'A%'
partition IDX_NAMES_SOUNDEX_PART_B values less than ('C'), -- 'B%'
...
);
However, in order to distribute the size of the partitions more evenly, I want to define some partitions by the first two chars, like so:
create index IDX_NAMES_SOUNDEX on NAMES_SOUNDEX (soundex)
global partition by range (soundex) (
partition IDX_NAMES_SOUNDEX_PART_A5 values less than ('A5'), -- 'A0% - A4%'
partition IDX_NAMES_SOUNDEX_PART_A values less than ('B'), -- 'A5% - A9%'
partition IDX_NAMES_SOUNDEX_PART_B values less than ('C'), -- 'B%'
...
);
I'm not sure how to properly range partition using varchar2 columns. I'm sure this is a less than ideal choice, so perhaps someone can recommend a better solution. Here's a distribution of the soundex data in my table:
-----------------------------------
| SUBSTR(SOUNDEX,1,1) | COUNT |
-----------------------------------
| A | 6476349 |
| B | 854880 |
| D | 520676 |
| F | 1200045 |
| G | 280647 |
| H | 3048637 |
| J | 711031 |
| K | 1336522 |
| L | 348743 |
| M | 3259464 |
| N | 1510070 |
| Q | 276769 |
| R | 1263008 |
| S | 3396223 |
| V | 533844 |
| W | 555007 |
| Y | 348504 |
| Z | 1079179 |
-----------------------------------
As you can see, the distribution is not evenly spread, which is why I want to define range partitions using the first two characters instead of just the first character.
Suggestions?
Thanks!
What exactly is your question?
Are you asking how to split your table into n equal parts to avoid skew?
You can do that with the percentile_disc() function.
Here is a SQL*Plus example with n=100. I admit that it isn't very sophisticated, but it will do the job.
set pages 0
set lines 200
drop table random_strings;
create table random_strings
as
select upper(dbms_random.string('A', 12)) rndmstr
from dual
connect by level < 1000;
spool parts
select 'select '||level||'/100,percentile_disc('||level||
'/100) within group (order by RNDMSTR) from random_strings;'
sql_statement
from dual
connect by level <= 100
/
spool off
This writes the following to the file parts.lst:
select 1/100,percentile_disc(1/100) within group (order by RNDMSTR) from random_strings;
select 2/100,percentile_disc(2/100) within group (order by RNDMSTR) from random_strings;
select 3/100,percentile_disc(3/100) within group (order by RNDMSTR) from random_strings;
...
select 100/100,percentile_disc(100/100) within group (order by RNDMSTR) from random_strings;
Now you can run script parts.lst to get the partition values. Each partition will contain 1% of the data initially.
Script parts.lst will output:
,01 AJUDRRSPGMNP
,02 AOMJZQPZASQZ
,03 AWDQXVGLLUSJ
,04 BIEPUHAEMELR
....
,99 ZTMHDWTXUJAR
1 ZYVJLNATVLOY
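The boundaries can also be computed in a single statement rather than by generating one query per percentile. This sketch swaps percentile_disc() for the NTILE analytic function, which is a different technique from the one above, and reuses the random_strings table created earlier; bucket and upper_value are made-up aliases:

```sql
-- Split the ordered values into 100 roughly equal buckets and report
-- the largest value that falls in each one.
SELECT bucket,
       MAX(rndmstr) AS upper_value
FROM  (SELECT rndmstr,
              NTILE(100) OVER (ORDER BY rndmstr) AS bucket
       FROM   random_strings)
GROUP  BY bucket
ORDER  BY bucket;
```

Because range partitions are declared with VALUES LESS THAN, each bound has to sort strictly after the upper_value of its bucket for that value to land in the intended partition.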
Is the table being searched by the partitioning key in addition to the SOUNDEX value? Or is it being searched just by the SOUNDEX column?
If you are just trying to achieve an even distribution of data among partitions, have you considered using hash partitions rather than range partitions? Assuming you choose a power of 2 for the number of partitions, that should give you a pretty even distribution of data between partitions.
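For the index in the question, a hash-partitioned version might look like the following. This is a sketch only: the partition count of 16 is an arbitrary assumption, and the index and table names are taken from the question:

```sql
-- Global index spread over 16 hash partitions (a power of 2).
CREATE INDEX IDX_NAMES_SOUNDEX ON NAMES_SOUNDEX (soundex)
GLOBAL PARTITION BY HASH (soundex)
PARTITIONS 16;
```

Hash partitioning suits equality lookups on soundex. A range predicate cannot be pruned to a subset of the hash partitions, so it has to probe all of them.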
Talk to me!
Can you tell me what your reason is for partitioning this table? It sounds like it is an OLTP table and may not need to be partitioned. We don’t want to partition just to say we are partitioned. Tell me what you are trying to accomplish by partitioning this table and I can help you pick a correct partitioning scheme. Partitioning does not equal faster queries. It can actually cause your queries to be slower in some cases.
I see some of your additional thoughts above and I don’t believe you need to partition your table. If your queries are going to be doing aggregates on entire partitions, then you may want to partition. If you are going to have hundreds of millions of rows of data, you may want to partition to help with DBA maintenance. If you just want your queries to run fast, then the primary key index will suffice. Please let me know.
Just create a global index on your desired columns.
