match words in all rows in same column - oracle

There is a column in table 'mytable' named 'Description'.
+----+-------------------------------+
| ID | Description |
+----+-------------------------------+
| 1 | My NAME is Sajid KHAN |
| 2 | My Name is Ahmed Khan |
| 3 | MY friend name is Salman Khan |
+----+-------------------------------+
I need to write an Oracle SQL query/procedure/function to list the distinct words in the column.
The output should be:
+------------------+-------+
| Word | Count |
+------------------+-------+
| MY | 3 |
| NAME | 3 |
| IS | 3 |
| SAJID | 1 |
| KHAN | 3 |
| AHMED | 1 |
| FRIEND | 1 |
| SALMAN | 1 |
+------------------+-------+
Word matching should be case-insensitive.
I am using Oracle 12.1.

Let's suppose we would somehow manage to split every description in words.
So, instead of single row with Id = 1 and Description = 'My NAME is Sajid KHAN' we'd have 5 rows like this
ID | Description
--- | ------------
1 | My
1 | NAME
1 | is
1 | Sajid
1 | KHAN
in this form it'd be trivial, something like
select Description, count(*) from data_in_new_form group by Description
So, let's do this using recursive query.
create table mytable
as
select 1 as ID, 'My NAME is Sajid KHAN' as Description from dual
union all
select 2, 'My Name is Ahmed Khan' from dual
union all
select 3, 'MY friend name is Salman Khan' from dual
union all
select 4, 'test, punctuation! it is' from dual
;
with
rec (id, str, depth, element_value) as
(
-- Anchor member.
select id, upper(Description) as str, 1 as depth, REGEXP_SUBSTR( upper(Description), '(.*?)( |$)', 1, 1, NULL, 1 ) AS element_value
from mytable
UNION ALL
-- Recursive member.
select id, str, depth + 1, REGEXP_SUBSTR( str ,'(.*?)( |$)', 1, depth+1, NULL, 1 ) AS element_value
from rec
where depth < regexp_count(str, ' ')+1
)
, data as (
select * from rec
--order by id, depth
)
select element_value, count(*) from data
group by element_value
order by element_value
;
Please notice this version doesn't do anything about punctuation assuming words are separated with spaces.
UPDATE alternative way using hierarchic query
with rec as
(
SELECT id, LEVEL AS depth,
REGEXP_SUBSTR( upper(description) ,'(.*?)( |$)', 1, LEVEL, NULL, 1 ) AS element_value
FROM mytable
CONNECT BY LEVEL <= regexp_count(description, ' ')+1
and prior id = id
and prior SYS_GUID() is not null
)
, data as (
select * from rec
--order by id, depth
)
select element_value, count(*) from data
group by element_value
order by 2 desc
;

This query will work. The ordering of the words may be different. However, frequent words come at the beginning as you have listed.
SELECT word,
COUNT(*)
FROM
(SELECT TRIM (REGEXP_SUBSTR (Description, '[^ ]+', 1, ROWNUM) ) AS Word
FROM
(SELECT LISTAGG(UPPER(Description),' ') within GROUP(
ORDER BY ROWNUM ) AS Description
FROM mytable
)
CONNECT BY LEVEL <= REGEXP_COUNT ( Description, '[^ ]+')
)
GROUP BY WORD
ORDER BY 2 DESC;

Related

display avg of the last column on the last row in SQL query

Hello I am having trouble trying to figure out this particular question. using oracle SQL developer.
trying to figure out how a query so that it will display exactly like the below table/picture.
the last row of this query to display word AVERAGE: and show the average (of all values in the in the sixth column) of the percentage above min selling price for all the sales made. and all the remaining column to display "--------"
Code ProductName Title ShopID SalePrice %SoldAbove Min.SellPrice
1 Martin Robot 1 $49000 15%
2
3
4
--- ------ ---- ---- AVERAGE: 16.5%
below is last row of the output i am looking for. But i have no clue on how to produce the
"--------" in the remaining columns let alone AVERAGE: and the average of all the values in the sixth column of the last row.
in summary, the last row of the output should show the average (in the sixth column) of the percentage
sold above the minimum selling price for all the sales.
Use ROLLUP:
SELECT DECODE( GROUPING( Code ), 1, '----', code ) AS code,
DECODE( GROUPING( Code ), 1, '----', MAX(col1) ) AS Col1,
DECODE( GROUPING( Code ), 1, '----', MAX(col2) ) AS Col2,
DECODE( GROUPING( Code ), 1, 'Average:', MAX(col3) ) AS Col3,
AVG( value )
FROM table_name
GROUP BY ROLLUP(Code);
Which, for the sample data:
CREATE TABLE table_name ( code, col1, col2, col3, value ) AS
SELECT 1, 'AAA', 1, 'AA1', 15.0 FROM DUAL UNION ALL
SELECT 2, 'BBB', 2, 'BB2', 17.5 FROM DUAL UNION ALL
SELECT 3, 'CCC', 3, 'CC3', 20.0 FROM DUAL;
Outputs:
CODE | COL1 | COL2 | COL3 | AVG(VALUE)
:--- | :--- | :--- | :------- | ---------:
1 | AAA | 1 | AA1 | 15
2 | BBB | 2 | BB2 | 17.5
3 | CCC | 3 | CC3 | 20
---- | ---- | ---- | Average: | 17.5
db<>fiddle here
This is a job for UNION ALL. A good way to get this sort of result:
SELECT * FROM (
SELECT COALESCE(whatever1, '------') whatever1
COALESCE(whatever2, '------') whatever2,
COALESCE(whatever3, '------') whatever3,
whatever4
FROM whatever
UNION ALL
SELECT NULL, NULL, NULL, AVG(whatever4) FROM whatever
) r
ORDER BY whatever1 NULLS LAST
The COALESCE function puts your ----- characters into the output.
You can also investigate GROUP BY WITH ROLLUP. It may do what you want.

How to use Oracle's LISTAGG function with a multi values?

I have an 'ITEMS' table like below:
ITEM_NO ITEM_NAME
1 Book
2 Pen
3 Sticky Notes
4 Ink
5 Corrector
6 Ruler
In another 'EMP_ITEMS' table I have the below:
EMPLOYEE ITEMS_LIST
John 1,2
Mikel 5
Sophia 2,3,6
William 3,4
Daniel null
Michael 6
The output has to be like this:
EMPLOYEE ITEMS_LIST ITEM_NAME
John 1,2 Book,Pen
Mikel 5 Corrector
Sophia 2,3,6 Pen,Sticky Notes,Ruler
William 3,4 Sticky Notes,Ink
Daniel null null
Michael 6 Ruler
I used the below query:
SELECT e.EMPLOYEE,e.ITEMS_LIST, LISTAGG(i.ITEM_NAME, ',') WITHIN GROUP (ORDER BY i.ITEM_NAME) ITEM_DESC
FROM EMP_ITEMS e
INNER JOIN ITEMS i ON i.ITEM_NO = e.ITEMS_LIST
GROUP BY e.EMPLOYEE,e.ITEMS_LIST;
But there is an error:
ORA-01722: invalid number
But there is an error: ORA-01722: invalid number
That is because your ITEMS_LIST is a string composed of numeric and comma characters and is not actually a list of numbers and you are trying to compare a single item number to a list of items.
Instead treat it as a string a look for sub-string matches. To do this you will need to surround the strings in the delimiter character and compare to see if one is the substring of the other:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE Items ( ITEM_NO, ITEM_NAME ) As
SELECT 1, 'Book' FROM DUAL UNION ALL
SELECT 2, 'Pen' FROM DUAL UNION ALL
SELECT 3, 'Sticky Notes' FROM DUAL UNION ALL
SELECT 4, 'Ink' FROM DUAL UNION ALL
SELECT 5, 'Corrector' FROM DUAL UNION ALL
SELECT 6, 'Ruler' FROM DUAL;
CREATE TABLE emp_items ( EMPLOYEE, ITEMS_LIST ) AS
SELECT 'John', '1,2' FROM DUAL UNION ALL
SELECT 'Mikel', '5' FROM DUAL UNION ALL
SELECT 'Sophia', '3,2,6' FROM DUAL UNION ALL
SELECT 'William', '3,4' FROM DUAL UNION ALL
SELECT 'Daniel', null FROM DUAL UNION ALL
SELECT 'Michael', '6' FROM DUAL;
Query 1:
SELECT e.employee,
e.items_list,
LISTAGG( i.item_name, ',' )
WITHIN GROUP (
ORDER BY INSTR( ','||e.items_list||',', ','||i.item_no||',' )
) AS item_names
FROM emp_items e
LEFT OUTER JOIN
items i
ON ( ','||e.items_list||',' LIKE '%,'||i.item_no||',%' )
GROUP BY e.employee, e.items_list
Results:
| EMPLOYEE | ITEMS_LIST | ITEM_NAMES |
|----------|------------|------------------------|
| John | 1,2 | Book,Pen |
| Mikel | 5 | Corrector |
| Daniel | (null) | (null) |
| Sophia | 3,2,6 | Sticky Notes,Pen,Ruler |
| Michael | 6 | Ruler |
| William | 3,4 | Sticky Notes,Ink |

How do I conditionally group by two different columns in Oracle?

Suppose I have a table with the following data:
+----------+-----+--------+
| CLASS_ID | Day | Period |
+----------+-----+--------+
| 1 | A | CCR |
+----------+-----+--------+
| 1 | B | CCR |
+----------+-----+--------+
| 2 | A | 1 |
+----------+-----+--------+
| 2 | A | 2 |
+----------+-----+--------+
| 3 | A | 3 |
+----------+-----+--------+
| 3 | B | 4 |
+----------+-----+--------+
| 4 | A | 5 |
+----------+-----+--------+
As you could probably guess from the nature of the data, I'm working on an Oracle SQL query that pulls class schedule data from a Student Information System. I'm trying to pull a class's "period expression", a calculated value that contains the Day and Period fields into a single field. Let's get my expectation out of the way first:
If the Periods match, Period should be the GROUP BY field, and Day should be the aggregated field (via a LISTAGG function), so the calculated field would be CCR (A-B)
If the Days match, Day should be the GROUP BY field, and Period should be the aggregated field, so the calculated field would be 1-2 (A)
I'm only aware of how to do each GROUP BY individually, something like for where Days match:
SELECT
day,
LISTAGG(period, '-') WITHIN GROUP (ORDER BY period)
FROM schedule
GROUP BY day
and vice versa for matching Periods, but I'm not seeing how I could do that dynamically for Period and Day in the same query.
You'll also notice that the last row in the example data set doesn't span multiple days or periods, so I also need to account for classes that don't need a GROUP BY at all.
Edit
The end result should be:
+------------+
| Expression |
+------------+
| CCR(A-B) |
+------------+
| 1-2(A) |
+------------+
| 3-4(A-B) |
+------------+
| 5(A) |
+------------+
It is really not clear to me WHY you want output in that way. It doesn't provide any useful information (I don't think) - you can't tell, for example for class_id = 3, which combinations of day and period are actually used. There are four possible combinations (according to the output), but only two are actually in the class schedule.
Anyway - you may have your reasons. Here is how you can do it. You seem to want to LISTAGG both the day and the period (both grouped by class_id, they are not grouped by each other). The difficulty is that you want distinct values in the aggregate lists only - no duplicates. So you will need to select distinct, separately for period and for day, then to the list aggregations, and then concatenate the results in an inner join.
Something like this:
with
test_data ( class_id, day, period ) as (
select 1, 'A', 'CCR' from dual union all
select 1, 'B', 'CCR' from dual union all
select 2, 'A', '1' from dual union all
select 2, 'A', '2' from dual union all
select 3, 'A', '3' from dual union all
select 3, 'B', '4' from dual union all
select 4, 'A', '5' from dual
)
-- end of test data; the actual solution (SQL query) begins below this line
select a.class_id, a.list_per || '(' || b.list_day || ')' as expression
from ( select class_id,
listagg(period, '-') within group (order by period) as list_per
from ( select distinct class_id, period from test_data )
group by class_id
) a
inner join
( select class_id,
listagg(day, '-') within group (order by day) as list_day
from ( select distinct class_id, day from test_data )
group by class_id
) b
on a.class_id = b.class_id
;
CLASS_ID EXPRESSION
-------- ----------
1 CCR(A-B)
2 1-2(A)
3 3-4(A-B)
4 5(A)
How about union with having count(*) = 1?
select LISTAGG(period, '-') list WITHIN GROUP (ORDER BY period)
from schedule
group by CLASS_ID, day
having count(*) = 1
union all
select LISTAGG(day, '-') list WITHIN GROUP (ORDER BY day)
from schedule
group by CLASS_ID, period
having count(*) = 1

Oracle regular expression for Alphanumeric check

We have below table structure.
From To Output
A001 A555 "A"
A556 A999 "B"
AA01 AA55 "C"
AA56 AA99 "D"
B001 B555 "C"
If my input value if greater than From column and less than To column then return that Output. E.g. If I received A222 then return output as "A". If I received AA22 then return output as "C".
Currently I have handle this in Java but wanted to check if I can write a Oracle query(using Regex) to get above output.
Thanks
Sach
Create your table with the appropriate (virtual) columns and indexes:
Oracle Setup:
CREATE TABLE table_name (
"from" VARCHAR2(10) UNIQUE,
"to" VARCHAR2(10) UNIQUE,
output CHAR(1) NOT NULL,
prefix VARCHAR2(9) GENERATED ALWAYS AS (
CAST(
REGEXP_SUBSTR( "from", '^\D+' )
AS VARCHAR2(9)
)
) VIRTUAL,
from_postfix NUMBER(9) GENERATED ALWAYS AS (
TO_NUMBER( REGEXP_SUBSTR( "from", '\d+$' ) )
) VIRTUAL,
to_postfix NUMBER(9) GENERATED ALWAYS AS (
TO_NUMBER( REGEXP_SUBSTR( "to", '\d+$' ) )
) VIRTUAL,
CONSTRAINT table_name__from_to__u PRIMARY KEY (
prefix, from_postfix, to_postfix
),
CONSTRAINT table_name__f_t_prefix__chk CHECK (
REGEXP_SUBSTR( "from", '\^\D+' ) = REGEXP_SUBSTR( "to", '\^\D+' )
)
);
INSERT INTO table_name ( "from", "to", output )
SELECT 'A001', 'A555', 'A' FROM DUAL UNION ALL
SELECT 'A556', 'A999', 'B' FROM DUAL UNION ALL
SELECT 'AA01', 'AA55', 'C' FROM DUAL UNION ALL
SELECT 'AA56', 'AA99', 'D' FROM DUAL UNION ALL
SELECT 'B001', 'B555', 'C' FROM DUAL;
COMMIT;
Query:
Then your query can use the index and not need to do a full table scan:
SELECT output
FROM table_name
WHERE REGEXP_SUBSTR( :input, '^\D+' ) = prefix
AND TO_NUMBER( REGEXP_SUBSTR( :input, '\d+$' ) )
BETWEEN from_postfix
AND to_postfix;
Output:
If the input bind variable is AA22 then the result is:
OUTPUT
------
C
Explain Plan:
PLAN_TABLE_OUTPUT
-------------------------------------------
Plan hash value: 2408507965
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TABLE_NAME | 1 | 35 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TABLE_NAME__FROM_TO__U | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("PREFIX"= REGEXP_SUBSTR (:INPUT,'^\D+') AND "TO_POSTFIX">=TO_NUMBER(
REGEXP_SUBSTR (:INPUT,'\d+$')) AND "FROM_POSTFIX"<=TO_NUMBER( REGEXP_SUBSTR (:INPUT,'\d+$')))
filter("TO_POSTFIX">=TO_NUMBER( REGEXP_SUBSTR (:INPUT,'\d+$')))
You have to extract varchar and number value from columns and input value and compare it each other
with tabl("FROM","TO",Output) as (
select 'A001','A555','A' from dual union all
select 'A556','A999','B' from dual union all
select 'AA01','AA55','C' from dual union all
select 'AA56','AA99','D' from dual union all
select 'B001','B555','C' from dual)
select * from tabl
where to_number(regexp_substr('AA22','[0-9]+')) between to_number(regexp_substr("FROM",'[0-9]+')) and to_number(regexp_substr("TO",'[0-9]+'))
and regexp_substr('AA22','[^0-9]+') = regexp_substr("FROM",'[^0-9]+')

Merging Query Result into single row - Oracle

Can I do like this in oracle,.? I have some data like this:
No | Data |
===========
1 | A |
1 | B |
1 | C |
1 | D |
Is there any query that can produce a result like this,.?
No | Data |
=================
1 | A, B, C, D |
Many thanks :D
Maybe this page shows what you are looking for.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( ID, DATA ) AS
SELECT 1, 'A' FROM DUAL
UNION ALL SELECT 1, 'B' FROM DUAL
UNION ALL SELECT 1, 'C' FROM DUAL
UNION ALL SELECT 1, 'D' FROM DUAL
UNION ALL SELECT 2, 'E' FROM DUAL
UNION ALL SELECT 2, 'F' FROM DUAL;
Query 1:
SELECT ID,
LISTAGG( DATA, ',' ) WITHIN GROUP ( ORDER BY DATA ) AS AGGREGATED_DATA
FROM TEST
GROUP BY ID
Results:
| ID | AGGREGATED_DATA |
|----|-----------------|
| 1 | A,B,C,D |
| 2 | E,F |
In Oracle we can use wm_concat function. Here is the query for example above:
SELECT no, wm_concat(data) from table group by no
reference: wm_concat
select
no,
rtrim (xmlagg (xmlelement (d, data|| ',')).extract ('//text()'), ',') data
from
table_name
group by
no
;

Resources