I have log table:
with t as
(select '16/04/2014 20:17:25 XXX. Xxxxxxx xxx xxx [SYSTEM_JOBS] xxx [POSTPONE_JOBS] xxx [SYSTEM] xxxx [JOB2]' col
from dual
UNION ALL
select '16/04/2014 20:17:25 XXX. Xxxxxxx [SYSTEM_JOBS] xxx [POSTPONE_JOBS]' col
from dual)
select * from t
I am trying to extract the CODE between '[' and ']' including [].
My version:
select regexp_substr(col, '(\[.*?\])', 1, level) col_substr,
regexp_replace(regexp_substr(col, '(\[.*?\])', 1, level),
'(\[|])',
'') col_replace_substr
from t
CONNECT BY regexp_substr(col, '(\[.*?\])', 1, level) is not null
My version work good, but i need get result without function regexp_replace, i wanna use one function in my code.
Can I get result only with REGEXP_SUBSTR?
This question popped up in the "Related" section and even though its an old question it is still relevant and deserves to be answered. I'm afraid the original poster is incorrect as his code does not work good but returns invalid results. His example returns 14 rows, where there are only 6 values surrounded by square brackets. Here is an example with a regex that allows for when the matched pattern is followed by the end of the line and also shows the row number and match number for proof. I hope this helps a future searcher:
SQL> with t(rownbr, col1) as (
select 1, '16/04/2014 20:17:25 XXX. Xxxxxxx xxx xxx [SYSTEM_JOBS] xxx [POSTPONE_JOBS] xxx [SYSTEM] xxxx [JOB2]' from dual
UNION
select 2, '16/04/2014 20:17:25 XXX. Xxxxxxx [SYSTEM_JOBS] xxx [POSTPONE_JOBS]' from dual
)
SELECT rownbr, column_value match_number,
regexp_substr(col1,'(\[.*?\])( |$)', 1, column_value, NULL, 1) col_substr,
regexp_substr(col1,'\[(.*?)\]( |$)', 1, column_value, NULL, 1) col_replace_substr
FROM t,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT(col1, '\[.*?\]')
) AS sys.OdciNumberList
)
)
WHERE regexp_substr(col1,'(\[.*?\])( |$)', 1, column_value, NULL, 1) IS NOT NULL;
ROWNBR MATCH_NUMBER COL_SUBSTR COL_REPLACE_SUBSTR
---------- ------------ -------------------- --------------------
1 1 [SYSTEM_JOBS] SYSTEM_JOBS
1 2 [POSTPONE_JOBS] POSTPONE_JOBS
1 3 [SYSTEM] SYSTEM
1 4 [JOB2] JOB2
2 1 [SYSTEM_JOBS] SYSTEM_JOBS
2 2 [POSTPONE_JOBS] POSTPONE_JOBS
6 rows selected.
SQL>
Related
I have a table with data in below format
COURSE
[]
["12345"]
["12345","7890"
I want to extract the data between [] but without "
So, my output would be in below format
COURSE
12345
12345, 7890
I tried the below code which works fine for first 3 rows
select REGEXP_SUBSTR (COURSE,
'"([^"]+)"',
1,
1,
NULL,
1) from TEST;
But 4th row only results in 12345.
Why not simple translate?
SQL> with test (course) as
2 (select '[]' from dual union
3 select null from dual union
4 select '["12345"]' from dual union
5 select '["12345","7890"]' from dual
6 )
7 select course,
8 translate(course, 'a[]"', 'a') result
9 from test;
COURSE RESULT
---------------- -----------------------------------------
["12345","7890"] 12345,7890
["12345"] 12345
[]
SQL>
I have a number column, I need to replace the first number by 7 in oracle.
How to replace guys?
number want_number
4789654 7789654
2754678 7754678
1765689 7765689
For instance
REGEXP_REPLACE(number, '^\d', '7')
should work.
Or, a substring option with concatenation:
SQL> with test (num) as
2 (select 4789654 from dual union all
3 select 2754678 from dual union all
4 select 1765689 from dual
5 )
6 select num, '7' || substr(num, 2) wanted_num
7 from test;
NUM WANTED_NUM
---------- --------------------
4789654 7789654
2754678 7754678
1765689 7765689
SQL>
You could manipulate as numbers, rather than converting to (and presumably later back from) strings:
with your_table (original) as (
select 4789654 from dual
union all select 2754678 from dual
union all select 1765689 from dual
union all select 999 from dual
union all select 1000 from dual
union all select 1001 from dual
)
select original,
original
- trunc(original, -floor(log(10, original)))
+ 7 * power(10, floor(log(10, original))) as wanted
from your_table;
ORIGINAL WANTED
---------- ----------
4789654 7789654
2754678 7754678
1765689 7765689
999 799
1000 7000
1001 7001
The floor(log(10, original) gives you the magnitude of the number. As an example, for your first original value 4789654 that evaluates to 6. If you then do trunc(original, -floor(log(10, original))) that is trunc(4789654, -6), which zeros the six least significant digits, giving you 4000000. Subtracting that from the original value gives you 789654. Then power(10, floor(log(10, original))) gives you power(10, 6) which is 1000000, multiplying that by 7 gives you 7000000, and adding that back on gives you 7789654.
(This won't work if your original value is <= zero, but that looks unlikely?)
I want to select the data from a Oracle table, whereas the table columns contains the data as , [ex : key,value] separated values; so here I want to select the second split i.e, value
table column data as below :
column_data
++++++++++++++
asper,worse
tincher,good
golder
null -- null values need to eliminate while selection
www,ewe
from the above data, desired output like below:
column_data
+++++++++++++
worse
good
golder
ewe
Please help me with the query
According to data you provided, here are two options:
result1: regular expressions one (get the 2nd word if it exists; otherwise, get the 1st one)
result2: SUBSTR + INSTR combination
SQL> with test (col) as
2 (select 'asper,worse' from dual union all
3 select 'tincher,good' from dual union all
4 select 'golder' from dual union all
5 select null from dual union all
6 select 'www,ewe' from dual
7 )
8 select col,
9 nvl(regexp_substr(col, '\w+', 1, 2), regexp_substr(col, '\w+', 1,1 )) result1,
10 --
11 nvl(substr(col, instr(col, ',') + 1), col) result2
12 from test
13 where col is not null;
COL RESULT1 RESULT2
------------ -------------------- --------------------
asper,worse worse worse
tincher,good good good
golder golder golder
www,ewe ewe ewe
SQL>
I have a table called TVL_DETAIL that contains column TVL_CD_LIST. Column TVL_CD_LIST contains three records:
TVL_CD_LIST:
M1180_Z6827
K5900_Z6828
I2510
I've used the following code in an attempt to return the values only(so excluding the underscore):
SELECT
TVL_CD_LIST
FROM TVL_DETAIL
WHERE TVL_CD_LIST IN (SELECT regexp_substr(TVL_CD_LIST,'[^_]+', 1, level) FROM DUAL
CONNECT BY regexp_substr(TVL_CD_LIST,'[^_]+', 1, level) IS NOT NULL)
What I was expecting to see returned in separate rows was:
M1180
Z6827
K5900
Z6828
I2510
But it only returns I2510(which is the original value that doesn't contain an underscore).
What am I doing wrong? Any help is appreciated. Thanks!
To answer your question, you are querying for the list where it matches a sub-element and that will only happen where the list is comprised of one element. What you really wanted to select are the sub-elements themselves.
Note: Explanation of why parsing strings using the regex form '[^_]+' is bad here: https://stackoverflow.com/a/31464699/2543416
You want to parse the list, selecting the elements:
SQL> with TVL_DETAIL(TVL_CD_LIST) as (
select 'M1180_Z6827' from dual union
select 'K5900_Z6828' from dual union
select 'I2510' from dual
)
SELECT distinct regexp_substr(TVL_CD_LIST, '(.*?)(_|$)', 1, level, NULL, 1) element
FROM TVL_DETAIL
CONNECT BY level <= LENGTH(regexp_replace(TVL_CD_LIST, '[^_]', '')) + 1;
-- 11g CONNECT BY level <= regexp_count(TVL_CD_LIST, '_') + 1;
ELEMENT
-----------
Z6827
K5900
M1180
I2510
Z6828
SQL>
And this is cool if you want to track by row and element within row:
SQL> with TVL_DETAIL(row_nbr, TVL_CD_LIST) as (
select 1, 'M1180_Z6827' from dual union
select 2, 'K5900_Z6828' from dual union
select 3, 'I2510' from dual
)
SELECT row_nbr, column_value substring_nbr,
regexp_substr(TVL_CD_LIST, '(.*?)(_|$)', 1, column_value, NULL, 1) element
FROM TVL_DETAIL,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY level <= LENGTH(regexp_replace(TVL_CD_LIST, '[^_]', '')) + 1
-- 11g CONNECT BY LEVEL <= REGEXP_COUNT(TVL_CD_LIST, '_')+1
) AS sys.OdciNumberList
)
)
order by row_nbr, substring_nbr;
ROW_NBR SUBSTRING_NBR ELEMENT
---------- ------------- -----------
1 1 M1180
1 2 Z6827
2 1 K5900
2 2 Z6828
3 1 I2510
SQL>
EDIT: Oops, edited to work with 10g as REGEXP_COUNT is not available until 11g.
The query you have used creates the list but you are comparing the list of record with column it self using the in clause, as such M1180 or Z6827 cannot be equal to M1180_Z6827 and so for K5900_Z6828. I2510 has only one value so it gets matched.
You can use below query if your requirement is exactly what you have mentioned in your desired output.
SQL> WITH tvl_detail AS
2 (SELECT 'M1180_Z6827' tvl_cd_list FROM dual
3 UNION ALL
4 SELECT 'K5900_Z6828' FROM dual
5 UNION ALL
6 SELECT 'I2510' FROM dual)
7 ---------------------------
8 --- End of data preparation
9 ---------------------------
10 SELECT regexp_substr(tvl_cd_list, '[^_]+', 1, LEVEL) AS tvl_cd_list
11 FROM tvl_detail
12 CONNECT BY regexp_substr(tvl_cd_list, '[^_]+', 1, LEVEL) IS NOT NULL
13 AND PRIOR tvl_cd_list = tvl_cd_list
14 AND PRIOR sys_guid() IS NOT NULL;
OUTPUT:
TVL_CD_LIST
--------------------------------------------
I2510
K5900
Z6828
M1180
Z6827
Data in the file_name field of the generation table should be an assigned number, then _01, _02, or _03, etc. and then .pdf (example 82617_01.pdf).
Somewhere, the program is putting a state name and sometimes a date/time stamp, between the assigned number and the 01, 02, etc. (82617_ALABAMA_01.pdf or 19998_MAINE_07-31-2010_11-05-59_AM.pdf or 5485325_OREGON_01.pdf for example).
We would like to develop a SQL statement to find the bad file names and fix them. In theory it seems rather simple to find file names that include a varchar2 data type and remove it, but putting the statement together is beyond me.
Any help or suggestions appreciated.
Something like:
UPDATE GENERATION
SET FILE_NAME (?)
WHERE FILE_NAME (?...LIKE '%STRING%');?
You can find the problem rows like this:
select *
from Files
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1
You can fix them like this:
update Files
set FILE_NAME = SUBSTR(FILE_NAME, 1, instr(FILE_NAME, '_') -1) ||
SUBSTR(FILE_NAME, instr(FILE_NAME, '_', 1, 2))
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1
SQL Fiddle Example
You can also use Regexp_replace function:
SQL> with t1(col) as(
2 select '82617_mm_01.pdf' from dual union all
3 select '456546_khkjh_89kjh_67_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col
8 , regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3') res
9 from t1;
COL RES
-------------------------------------- -----------------------------------------
82617_mm_01.pdf 82617_01.pdf
456546_khkjh_89kjh_67_01.pdf 456546_01.pdf
19998_MAINE_07-31-2010_11-05-59_AM.pdf 19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf 5485325_01.pdf
To display good or bad data regexp_like function will come in handy:
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col bad_data
8 from t1
9 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
BAD_DATA
--------------------------------------
19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col good_data
8 from t1
9 where regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
GOOD_DATA
--------------------------------------
826170_01.pdf
456546_01.pdf
To that end your update statement might look like this:
update your_table
set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3');
--where clause if needed