Finding fields with non alfa numeric values - hadoop

I am looking a way to find the values in a column that has non alfa numeric values...
I tried
select 'kjh$' not RLIKE '([0-9][a-z]|[A-Z])*')
but does not work
Thanks for your help

You can use REGEXP '^[A-Za-z0-9]+$' or RLIKE '^[A-Za-z0-9]+$'.
Sample SQL -
select * from my table where mycol not RLIKE '^[A-Za-z0-9]+$'
^ - determines start of the string
$ - end of the string
+ - match the preceding character one or more times
[A-Za-z0-9] - to check alphanumeric or not
I ran a simple select statement to check if a string has alphanumeric or not using regex and here is the output.
select 'Aa90$$bc' ,'Aa90$$bc' rlike '^[A-Za-z0-9]+$'

Related

REGEXP_LIKE Oracle equivalent to count characters in Snowflake

I am trying to come up with an equivalent of the below Oracle statement in Snowflake. This would check if the different parts of the string separated by '.' matches the number of characters in the REGEXP_LIKE expression. I have come up with a rudimentary version to perform the check in Snowflake but I am sure there's a better and cleaner way to do it. I am looking to come up with a one-liner regular expression check in Snowflake similar to Oracle. Appreciate your help!
-- Oracle
SELECT -- would return True
CASE
WHEN REGEXP_LIKE('AB.XYX.12.34.5670.89', '^\w{2}\.\w{3}\.\w{2}') THEN 'True'
ELSE NULL
END AS abc
FROM DUAL
-- Snowflake
SELECT -- would return True
REGEXP_LIKE(SPLIT_PART('AB.XYX.12.34.5670.89', '.', 1), '[A-Z0-9]{2}') AND
REGEXP_LIKE(SPLIT_PART('AB.XYX.12.34.5670.89', '.', 2), '[A-Z0-9]{3}') AND
REGEXP_LIKE(SPLIT_PART('AB.XYX.12.34.5670.89', '.', 3), '[A-Z0-9]{2}') AS abc
You need to add a .* at the end as the REGEXP_LIKE adds explicit ^ && $ to string:
The function implicitly anchors a pattern at both ends (i.e. '' automatically becomes '^$', and 'ABC' automatically becomes '^ABC$'). To match any string starting with ABC, the pattern would be 'ABC.*'.
select
column1 as str,
REGEXP_LIKE(str, '\\w{2}\\.\\w{3}\\.\\w{2}.*') as oracle_way
FROM VALUES
('AB.XYX.12.34.5670.89')
;
gives:
STR
ORACLE_WAY
AB.XYX.12.34.5670.89
TRUE
Or in the context of your question:
SELECT IFF(REGEXP_LIKE('AB.XYX.12.34.5670.89', '\\w{2}\\.\\w{3}\\.\\w{2}.*'), 'True', null) AS abc;
Your use of \w seems to suggest you don't need delimited strings to be strictly [A-Z0-9] since word characters allow underscore and period. If all bets were off and the only requirement was to have . at 3rd, 7th and 10th position, you could have used like this way.
select 'AB.XGH.12.34.5670.89' like '__.___.__.%' ;

Oracle query to find any special character in first position or end position of the field value

I have a table in Oracle database with special characters attached at first and last position in the field value. I want to eliminate those special characters while querying the table. I have used INSTR function but I had to apply for each and every special character using CASE expression.
Is there a way to eliminate any special characters that is attached only at first and last positions in one shot?
The query I am using as is below:
CASE WHEN
INSTR(emp_address,'"')=1 THEN REPLACE((emp_address,'"', '').
.
.
.
You can use regular expressions to replace the leading and trailing character of a string if they match the regular expression pattern. For example, if your definition of a "special character" is anything that is not an alpha-numeric character then you can use the regular expression:
^ the start-of-the-string then
[^[:alnum:]] any single character that does not match the POSIX alpha-numeric character group
| or
[^[:alnum:]] any single character that does not match the POSIX alpha-numeric character group then
$ the end-of-the-string.
Like this:
SELECT emp_address,
REGEXP_REPLACE(
emp_address,
'^[^[:alnum:]]|[^[:alnum:]]$'
) AS simplified_emp_address
FROM table_name
Which, for the sample data:
CREATE TABLE table_name (emp_address) AS
SELECT 'test' FROM DUAL UNION ALL
SELECT '"test2"' FROM DUAL UNION ALL
SELECT 'Not "this" one' FROM DUAL;
Outputs:
EMP_ADDRESS
SIMPLIFIED_EMP_ADDRESS
test
test
"test2"
test2
Not "this" one
Not "this" one
If you have a more complicated definition of a special character then change the regular expression appropriately.
db<>fiddle here

Remove string after second comma Oracle / PL SQL

I have this value '45465,6464,654' And I want to remove second comma and string after it. So basically I want ''45465,6464' . I found a solution but its for Mssql. How can I make this query for Oracle I couldnt do it even with substring. Can you help me?
This for MSSQL;
`declare #S varchar(20) = '45465#6464#654';
select left(#S, charindex('#', #S, charindex('#', #S)+1)-1);`
You can use something very similar:
WITH s AS (SELECT '45465#6464#654' s FROM dual)
SELECT SUBSTR(s,1,INSTR(s,'#',1,2)-1) FROM s
or you can use regular Expression:
SELECT regexp_substr('45465#6464#654','([^#]*#)?[^#]*') from dual

ORACLE SUBQUERY NOT WORKING IN (IN CONDITION)

I need help
i have records 123,456,789 in rows when i am execute like
this one is working
select * from table1 where num1 in('123','456')
but when i am execute
select * from table1 where num1 in(select value from table2)
no resultset found - why?
Check the DataType varchare2 or Number
try
select * from table1 where num1 in(select to_char(value) from table2)
Storing comma separated values could be the cause of problem.
You can try using regexp_substr to split comma.
First and foremost, an important thing to remember: Do not store numbers in character datatypes. Use NUMBER or INTEGER. Secondly, always prefer VARCHAR2 datatype over CHAR if you wish to store characters > 1.
You said in one of your comments that num1 column is of type char(4). The problem with CHAR datatype is that If your string is 3 characters wide, it stores the record by adding extra 1 space character to make it 4 characters. VARCHAR2 only stores as many characters as you pass while inserting/updating and are not blank padded.
To verify that you may run select length(any_char_col) from t;
Coming to your problem, the IN condition is never satisfied because what's actually being compared is
WHERE 'abc ' = 'abc' - Note the extra space in left side operator.
To fix this, one good option is to pad the right side expression with as many spaces as required to do the right comparison.The function RPAD( string1, padded_length [, pad_string] ) could be used for this purpose.So, your query should look something like this.
select * from table1 where num1 IN (select rpad(value,4) from table2);
This will likely utilise an index on the column num1 if it exists.
The other one is to use RTRIM on LHS, which is only useful if there's a function based index on RTRIM(num1)
select * from table1 where RTRIM(num1) in(select value from table2);
So, the takeaway from all these examples is always use NUMBER types to store numbers and prefer VARCHAR2 over CHAR for strings.
See Demo to fully understand what's happening.
EDIT : It seems You are storing comma separated numbers.You could do something like this.
SELECT *
FROM table1 t1
WHERE EXISTS (
SELECT 1
FROM table2 t2
WHERE ',' ||t2.value|| ',' LIKE '%,' || rtrim(t1.num1) || ',%'
);
See Demo2
Storing comma separated values are bound to cause problems, better change it.
Let me tell you first,
You have stored values in table2 which is comma seperated.
So, how could you match your data with table1 and table2.
Its not Possible.
That's why you did not get any values in result set.
I found the Solution using string array
SELECT T.* FROM TABLE1 T,
(SELECT TRIM(VALUE)AS VAL FROM TABLE2)TABLE2
WHERE
TRIM(NUM1) IN (SELECT COLUMN_VALUE FROM TABLE(FUNC_GETSTRING_ARRAY(TABLE2.VAL)))
thanks

psql query to return values that contain space at the start/end of the value specified

I have a DB with more than 1,000,000 entries in it, some of them contain space char at the start/end of the value.
I have tried the following queries and it works but i would have to go through 1,000,000 records because all id's are unique
select * from tablename where id like '%1234%';
select * from tablename where id='1234 ';
select * from tablename where id=' 1234';
select * from tablename where id=' 1234 ';
is there a query that can be run that returns all values with space/empty char at the start/end of the value?
Appreciate your help
B
You can use a regular expression that says "one or more spaces at the start of the string OR one of more spaces at the end of the string".
select * from tablename where id ~ '^\s+|\s+$';
Each special character:
^ match start of the string
\s match a white space character
| the or operator
+ one or more times
$ match the end of the string.
I found the way to do this
select * from tablename where id like '% %';

Resources