Selecting only distinct records from a table in Oracle

I have a table with the following records:
ID | NN | MBL | IC | OTHER
---+-----+------+----+------
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff
5 |123 | | | tr // duplicate NN of ID 1
6 | | 544 | | op // duplicate MBL of ID 2
7 | | | 124| ii // duplicate for IC ID 4
When querying with SELECT, I need only the first record for each value, skipping any later occurrence:
select
ID, NN, MBL, IC, OTHER
from
TABLE1 // this should return only one entry of any NN, MBL and IC
How do I get this? I cannot use DISTINCT on multiple columns, and I also need the ID and OTHER columns in the select list.
I expect a result like this:
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff

You can use the analytic function ROW_NUMBER() to rank each row within the values of every column you care about, and keep only the rows whose rank is 1 in all of them. NULL values are given rank 1 so that they never exclude a row.
Here is an example:
WITH testdata AS (
SELECT 1 AS ID, 123 AS NN, NULL AS MBL, NULL AS IC, 'ac' AS OTHER FROM DUAL UNION ALL
SELECT 2, NULL, 544 , NULL, 'dc' FROM DUAL UNION ALL
SELECT 3, NULL, NULL, 524 , 'df' FROM DUAL UNION ALL
SELECT 4, 527, NULL, 124, 'ff' FROM DUAL UNION ALL
SELECT 5, 123, NULL, NULL, 'tr' FROM DUAL UNION ALL
SELECT 6, NULL, 544, NULL, 'op' FROM DUAL UNION ALL
SELECT 7, NULL, NULL , 124, 'ii' FROM DUAL
)
SELECT *
FROM(SELECT ID,
NN,
CASE WHEN NN IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY NN ORDER BY ID) END AS NN_RANG,
MBL,
CASE WHEN MBL IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY MBL ORDER BY ID) END AS MBL_RANG,
IC,
CASE WHEN IC IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY IC ORDER BY ID) END AS IC_RANG,
OTHER
FROM testdata
)
WHERE NN_RANG = 1
AND MBL_RANG = 1
AND IC_RANG = 1
;
Hope it helps.
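To see why the three ranks together select the expected rows, here is a minimal Python sketch of the same logic. It is not part of the original answer; the function name first_occurrences and the tuple layout are assumptions made for illustration only.

```python
# Sample rows from the question: (ID, NN, MBL, IC, OTHER)
rows = [
    (1, 123, None, None, "ac"),
    (2, None, 544, None, "dc"),
    (3, None, None, 524, "df"),
    (4, 527, None, 124, "ff"),
    (5, 123, None, None, "tr"),
    (6, None, 544, None, "op"),
    (7, None, None, 124, "ii"),
]

def first_occurrences(rows):
    """Keep a row only if each of its non-null NN, MBL, IC values is a first occurrence."""
    seen = [set(), set(), set()]  # one set per key column: NN, MBL, IC
    kept = []
    for row in sorted(rows, key=lambda r: r[0]):  # ORDER BY ID
        keys = row[1:4]
        # "rank = 1" for a column: the value is NULL or has not appeared before
        is_first = all(k is None or k not in s for k, s in zip(keys, seen))
        # Every non-null value counts as seen even when the row is dropped,
        # mirroring ROW_NUMBER() being computed before the outer WHERE filter.
        for k, s in zip(keys, seen):
            if k is not None:
                s.add(k)
        if is_first:
            kept.append(row)
    return kept

kept_ids = [r[0] for r in first_occurrences(rows)]
print(kept_ids)
```

With the sample data this keeps IDs 1 to 4, the same rows as the expected result in the question.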

Related

Is there another way to calculate data by filtered date without a "with" clause?

I need to calculate, per name, the sum of value for rows whose date is less than 3 days after that name's first date, in BigQuery.
Example of data:
+------+------------+----------+-------+
| Name | date | order_id | value |
+------+------------+----------+-------+
| JONES| 2019-01-03 | 11 | 10 |
| JONES| 2019-01-05 | 12 | 5 |
| JONES| 2019-06-03 | 13 | 3 |
| JONES| 2019-07-03 | 14 | 20 |
| John | 2019-07-23 | 15 | 10 |
+------+------------+----------+-------+
My solution is:
WITH data AS (
SELECT "JONES" name, DATE("2019-01-03") date_time, 11 order_id, 10 value
UNION ALL
SELECT "JONES", DATE("2019-01-05"), 12, 5
UNION ALL
SELECT "JONES", DATE("2019-06-03"), 13, 3
UNION ALL
SELECT "JONES", DATE("2019-07-03"), 14, 20
UNION ALL
SELECT "John", DATE("2019-07-23"), 15, 10
),
data2 AS (
SELECT *, MIN(date_time) OVER (PARTITION BY name) min_date
FROM data
)
SELECT name,
ARRAY_AGG(STRUCT(order_id as f_id, date_time as f_date) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
sum(case when date_time< date_add(min_date,interval 3 day) then value end) as total_value_day3,
SUM(value) AS total
FROM data2
GROUP BY name
Output:
+------+------+------------+----------------+------+
| name | f_id | f_date |total_value_day3| total|
+------+------+------------+----------------+------+
| JONES| 11 | 2019-01-03 | 15 | 38 |
| John | 15 | 2019-07-23 | 10 | 10 |
+------+------+------------+----------------+------+
So my question: can the same calculation be done in a more efficient way?
Or is this solution OK for large datasets?
The following gets the same results without using window functions or array aggregations, so BQ has to do less ordering/partitioning. For this small example, my query takes longer to run, but there is less byte shuffling. If you run this against a much larger dataset, I think mine will be more efficient.
WITH data AS (
SELECT "JONES" name, DATE("2019-01-03") date_time, "11" order_id, 10 value UNION ALL
SELECT "JONES", DATE("2019-01-05"), "12", 5 UNION ALL
SELECT "JONES", DATE("2019-06-03"), "13", 3 UNION ALL
SELECT "JONES", DATE("2019-07-03"), "14", 20 UNION ALL
SELECT "John", DATE("2019-07-23"), "15", 10
),
aggs as (
select name, min(date_time) as first_order_date, min(order_id) as first_order_id, sum(value) as total
from data
group by 1
)
select
name,
first_order_id as f_id,
first_order_date as f_date,
sum(value) as total_value_day3,
total
from aggs
inner join data using(name)
where date_time < date_add(first_order_date, interval 3 day) -- <= perhaps
group by 1,2,3,5
Note that this assumes order_id is sequential (i.e. order_id 11 always occurs before order_id 12) in the same way that dates are sequential.
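The aggregate-then-join idea can be sketched outside SQL as two passes over the data. This is only an illustration in Python, not BigQuery code; the function name summarize and the dictionary keys are hypothetical, and order_id is kept as an integer as in the question.

```python
from datetime import date, timedelta

# (name, date_time, order_id, value) rows from the question
orders = [
    ("JONES", date(2019, 1, 3), 11, 10),
    ("JONES", date(2019, 1, 5), 12, 5),
    ("JONES", date(2019, 6, 3), 13, 3),
    ("JONES", date(2019, 7, 3), 14, 20),
    ("John", date(2019, 7, 23), 15, 10),
]

def summarize(orders, window_days=3):
    # Pass 1: per-name aggregates, like the "aggs" CTE (min date, min id, total).
    aggs = {}
    for name, day, order_id, value in orders:
        a = aggs.setdefault(
            name,
            {"f_id": order_id, "f_date": day, "total_value_day3": 0, "total": 0},
        )
        a["f_id"] = min(a["f_id"], order_id)
        a["f_date"] = min(a["f_date"], day)
        a["total"] += value
    # Pass 2: join back to the rows and sum those inside the window,
    # like the WHERE date_time < date_add(first_order_date, interval 3 day).
    for name, day, order_id, value in orders:
        a = aggs[name]
        if day < a["f_date"] + timedelta(days=window_days):
            a["total_value_day3"] += value
    return aggs

result = summarize(orders)
print(result["JONES"])
```

As in the SQL, f_id is simply the smallest order_id, so it only matches the first order when ids increase with the dates.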

ORACLE Query Get Last ID Using MIN Based On Quantity Consumed By ID

I have incoming stock transaction data in Oracle:
ID | DESCRIPTION | PART_NO | QUANTITY | DATEADDED
TR5 | FG | P0025 | 5 | 06-SEP-2017 08:20:33 <-- just now added
TR4 | Test | TEST1 | 8 | 05-SEP-2017 15:11:15
TR3 | FG | GSDFGSG | 10 | 31-AUG-2017 16:26:04
TR2 | FG | GSDFGSG | 2 | 31-AUG-2017 16:05:39
TR1 | FG | GSDFGSG | 2 | 30-AUG-2017 16:30:16
I group that data like this:
TR_ID | PART_NO | TOTAL
TR1 | GSDFGSG | 14
TR4 | TEST1 | 8
TR5 | P0025 | 5 <-- just now added
Query Code:
SELECT MIN(TRANSACTION_EQUIPMENTID) as TR_ID,
PART_NO,
SUM(T.QUANTITY) AS TOTAL
FROM WA_II_TBL_TR_EQUIPMENT T
GROUP BY T.PART_NO
As the data and query show, I use MIN to get TR_ID, the ID of the first transaction for each part.
Now I have outgoing transaction data.
Assume a quantity of 8 is taken out:
ID_FK | QUANTITY
TR1 | 8
Now I want the ID of the last row reached once a quantity of 8 has been consumed:
ID | DESCRIPTION | PART_NO | QUANTITY
TR3| FG | GSDFGSG | 10 <-- CONSUMED 4+2+2, TOTAL 8
TR2| FG | GSDFGSG | 2 <-- CONSUMED 2+2, TOTAL 4
TR1| FG | GSDFGSG | 2 <-- CONSUMED 2
As shown above, TR1 and TR2 have been fully consumed. Now I want this query
SELECT MIN(TRANSACTION_EQUIPMENTID) as TR_ID,
PART_NO,
SUM(T.QUANTITY) AS TOTAL
FROM WA_II_TBL_TR_EQUIPMENT T
GROUP BY T.PART_NO
to return TR3 as the ID, because TR1 and TR2 have been consumed.
How can I do that in a query?
Take the minimum id at which the running sum reaches at least 8. Use the analytic sum():
select min(id) id
from (select t.*,
sum(quantity) over (partition by part_no order by id) sq
from t
where part_no = 'GSDFGSG'
)
where sq >= 8
Test data, output:
create table t(ID varchar2(3), DESCRIPTION varchar2(5),
PART_NO varchar2(8), QUANTITY number(5), DATEADDED date);
insert into t values ('TR4', 'Test', 'TEST1', 8, timestamp '2017-09-05 15:11:15');
insert into t values ('TR3', 'FG', 'GSDFGSG', 10, timestamp '2017-08-31 16:26:04');
insert into t values ('TR2', 'FG', 'GSDFGSG', 2, timestamp '2017-08-31 16:05:39');
insert into t values ('TR1', 'FG', 'GSDFGSG', 2, timestamp '2017-08-30 16:30:16');
insert into t values ('TR5', 'FG', 'GSDFGSG', 3, timestamp '2017-08-31 17:00:00');
Edit:
Add part_no and total columns and group by clause:
select min(id) id, part_no, min(sq) total
from (select t.*,
sum(quantity) over (partition by part_no order by id) sq
from t
where part_no = 'GSDFGSG'
)
where sq >= 8
group by part_no
ID PART_NO TOTAL
--- -------- ----------
TR3 GSDFGSG 14
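The running-sum idea can be checked by hand with a small Python sketch. It is an illustration only: the function name first_id_reaching is hypothetical, and the list holds the GSDFGSG rows from the test data above, already ordered by id.

```python
from itertools import accumulate

# (id, quantity) for part GSDFGSG, ordered by id as in the OVER (... ORDER BY id) clause
stock = [("TR1", 2), ("TR2", 2), ("TR3", 10), ("TR5", 3)]

def first_id_reaching(stock, consumed):
    """Return (id, running_total) for the first row whose running total is >= consumed."""
    running = accumulate(qty for _, qty in stock)  # 2, 4, 14, 17
    for (tr_id, _), total in zip(stock, running):
        if total >= consumed:
            return tr_id, total
    return None  # not enough stock to cover the consumed quantity

answer = first_id_reaching(stock, 8)
print(answer)
```

Note that the ids are compared as strings in the SQL, so ordering by id only matches the transaction order while the ids sort the same way as text (TR10 would sort before TR2); ordering by DATEADDED would avoid that assumption.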

match words in all rows in same column

There is a column in table 'mytable' named 'Description'.
+----+-------------------------------+
| ID | Description |
+----+-------------------------------+
| 1 | My NAME is Sajid KHAN |
| 2 | My Name is Ahmed Khan |
| 3 | MY friend name is Salman Khan |
+----+-------------------------------+
I need to write an Oracle SQL query/procedure/function to list the distinct words in the column.
The output should be:
+------------------+-------+
| Word | Count |
+------------------+-------+
| MY | 3 |
| NAME | 3 |
| IS | 3 |
| SAJID | 1 |
| KHAN | 3 |
| AHMED | 1 |
| FRIEND | 1 |
| SALMAN | 1 |
+------------------+-------+
Word matching should be case-insensitive.
I am using Oracle 12.1.
Let's suppose we somehow managed to split every description into words.
So, instead of a single row with Id = 1 and Description = 'My NAME is Sajid KHAN', we'd have 5 rows like this:
ID | Description
--- | ------------
1 | My
1 | NAME
1 | is
1 | Sajid
1 | KHAN
In this form it would be trivial, something like:
select Description, count(*) from data_in_new_form group by Description
So, let's do this using a recursive query.
create table mytable
as
select 1 as ID, 'My NAME is Sajid KHAN' as Description from dual
union all
select 2, 'My Name is Ahmed Khan' from dual
union all
select 3, 'MY friend name is Salman Khan' from dual
union all
select 4, 'test, punctuation! it is' from dual
;
with
rec (id, str, depth, element_value) as
(
-- Anchor member.
select id, upper(Description) as str, 1 as depth, REGEXP_SUBSTR( upper(Description), '(.*?)( |$)', 1, 1, NULL, 1 ) AS element_value
from mytable
UNION ALL
-- Recursive member.
select id, str, depth + 1, REGEXP_SUBSTR( str ,'(.*?)( |$)', 1, depth+1, NULL, 1 ) AS element_value
from rec
where depth < regexp_count(str, ' ')+1
)
, data as (
select * from rec
--order by id, depth
)
select element_value, count(*) from data
group by element_value
order by element_value
;
Note that this version does nothing about punctuation; it assumes words are separated by spaces.
UPDATE: an alternative using a hierarchical query
with rec as
(
SELECT id, LEVEL AS depth,
REGEXP_SUBSTR( upper(description) ,'(.*?)( |$)', 1, LEVEL, NULL, 1 ) AS element_value
FROM mytable
CONNECT BY LEVEL <= regexp_count(description, ' ')+1
and prior id = id
and prior SYS_GUID() is not null
)
, data as (
select * from rec
--order by id, depth
)
select element_value, count(*) from data
group by element_value
order by 2 desc
;
This query also works. The words may come out in a different order, but the most frequent words come first, as in your expected output.
SELECT word,
COUNT(*)
FROM
(SELECT TRIM (REGEXP_SUBSTR (Description, '[^ ]+', 1, LEVEL) ) AS Word
FROM
(SELECT LISTAGG(UPPER(Description),' ') within GROUP(
ORDER BY ROWNUM ) AS Description
FROM mytable
)
CONNECT BY LEVEL <= REGEXP_COUNT ( Description, '[^ ]+')
)
GROUP BY WORD
ORDER BY 2 DESC;
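All three queries follow the same split-then-count steps. As a cross-check of the expected counts, here is a minimal Python sketch of those steps; it is not Oracle code, and the function name word_counts is made up for illustration. Like the queries, it treats whitespace as the only separator and ignores punctuation.

```python
from collections import Counter

# The three Description values from the question
descriptions = [
    "My NAME is Sajid KHAN",
    "My Name is Ahmed Khan",
    "MY friend name is Salman Khan",
]

def word_counts(descriptions):
    counts = Counter()
    for text in descriptions:
        # upper() makes the match case-insensitive; split() breaks on whitespace
        counts.update(text.upper().split())
    return counts

counts = word_counts(descriptions)
for word, n in counts.most_common():
    print(word, n)
```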

Oracle CONNECT BY with multiple tables

I have 4 tables containing my data:
Table COMP: definition of my component data
COMPID | NAME | DESCRIPTION
--------+-----------+------------
000123 | Comp. 1 | A44.123
000277 | Comp. 2 | A96.277
000528 | Comp. 3 | 1235287
001024 | Comp. 4 | Lollipop
004711 | Comp. 5 | Yippie
Table COMPLIST: containing the sub-components of each component
COMPID | POS | SUBCOMPID | QUANTITY
--------+------+------------ +-----------
000123 | 1 | 000277 | 3
000123 | 2 | 000528 | 1
000528 | 1 | 004711 | 1
Table COMPSUPPLIER: definition of the components suppliers
COMPID | SUPPLIER | ORDERNUMBER
--------+-----------+-------------
000123 | Supp1 | A44.123
000277 | Supp1 | A96.277
000528 | Supp2 | 1235287
001024 | Supp2 | ux12v39
004711 | Supp1 | 123456
Table ASSEMBLY: definition of my assembly
ASSYID | POS | COMPID | QUANTITY
--------+------+---------+----------
5021 | 1 | 000123 | 1
5021 | 2 | 001024 | 2
I want to get all components used in an assembly with their supplier and order number (Edited: added Position):
POS | COMPID | NAME | SUPPLIER | ORDERNUMBER | QUANTITY
-------|---------+---------+----------+-------------+----------
1 | 000123 | Comp. 1 | Supp1 | A44.123 | 1
1.1 | 000277 | Comp. 2 | Supp1 | A96.277 | 3
1.2 | 000528 | Comp. 3 | Supp2 | 1235287 | 1
1.2.1 | 004711 | Comp. 5 | Supp1 | 123456 | 1
2 | 001024 | Comp. 4 | Supp2 | ux12v39 | 2
My idea was to use a SELECT in combination with CONNECT BY but I can't get it working right.
My current approach (Edited: updated with GurV's input):
SELECT c.COMPID, c.NAME, cs.SUPPLIER, cs.ORDERNUMBER
FROM COMP c
JOIN COMPSUPPLIER cs ON c.COMPID = cs.COMPID
WHERE c.COMPID in (
SELECT COMPID
FROM ASSEMBLY
WHERE ASSYID = '5021'
UNION ALL
SELECT SUBCOMPID
FROM COMPLIST
CONNECT BY NOCYCLE PRIOR SUBCOMPID = COMPID
START WITH COMPID in (
SELECT COMPID
FROM ASSEMBLY
WHERE ASSYID = '5021')
);
With this I get all my sub-components but not the position. Is it possible to also get the position column somehow?
A standard hierarchical query will work for this problem. I see in your desired output that you don't have a column for assyid; if you have more than one assembly in your business, that's a flaw. Also, I thought at some point you will want to compute the total quantity of a sub-component for an assembly (say, screws are used in component a and also in component b, both part of assembly 1000, and you would need the total number of screws); but, since you want to show everything "in its proper hierarchy" (as reflected in the pos column), it seems you aren't interested in that, at least in this query. That would be harder to do with a standard hierarchical query and easier to do in a recursive query, but that doesn't seem to be the case here.
The idea is to union all between complist and assembly, adding a flag column to use in the start with clause of the hierarchical query. Everything else is pretty standard.
with
comp ( compid, name, description ) as (
select '000123', 'Comp. 1', 'A44.123' from dual union all
select '000277', 'Comp. 2', 'A96.277' from dual union all
select '000528', 'Comp. 3', '1235287' from dual union all
select '001024', 'Comp. 4', 'Lollipop' from dual union all
select '004711', 'Comp. 5', 'Yippie' from dual
),
Complist ( compid, pos, subcompid, quantity ) as (
select '000123', 1, '000277', 3 from dual union all
select '000123', 2, '000528', 1 from dual union all
select '000528', 1, '004711', 1 from dual
),
compsupplier ( compid, supplier, ordernumber ) as (
select '000123', 'Supp1', 'A44.123' from dual union all
select '000277', 'Supp1', 'A96.277' from dual union all
select '000528', 'Supp2', '1235287' from dual union all
select '001024', 'Supp2', 'ux12v39' from dual union all
select '004711', 'Supp1', '123456' from dual
),
assembly ( assyid, pos, compid, quantity ) as (
select '5021', 1, '000123', 1 from dual union all
select '5021', 2, '001024', 2 from dual
)
select h.assyid, ltrim(h.pos, '.') as pos, h.compid,
c.name, s.supplier, s.ordernumber, h.quantity
from (
select subcompid as compid, quantity,
connect_by_root compid as assyid,
sys_connect_by_path(pos, '.') as pos
from ( select complist.*, 'f' as flag from complist
union all
select assembly.*, null as flag from assembly
)
start with flag is null
connect by compid = prior subcompid
) h
left outer join comp c on h.compid = c.compid
left outer join compsupplier s on h.compid = s.compid
;
Output:
ASSYID POS COMPID NAME SUPPLIER ORDERNUMBER QUANTITY
------ -------- ------ ------- -------- ----------- ----------
5021 1 000123 Comp. 1 Supp1 A44.123 1
5021 1.1 000277 Comp. 2 Supp1 A96.277 3
5021 1.2 000528 Comp. 3 Supp2 1235287 1
5021 1.2.1 004711 Comp. 5 Supp1 123456 1
5021 2 001024 Comp. 4 Supp2 ux12v39 2
5 rows selected.
If I'm following your logic, you can use recursive subquery factoring instead of a hierarchical query, which makes cycles etc. a bit easier to cope with:
with rcte (position, compid, name, supplier, ordernumber, quantity) as (
select to_char(a.pos), a.compid, c.name, cs.supplier, cs.ordernumber, a.quantity
from assembly a
join compsupplier cs on cs.compid = a.compid
join comp c on c.compid = cs.compid
where a.assyid = 5021
union all
select rcte.position ||'.' || cl.pos, cl.subcompid, c.name,
cs.supplier, cs.ordernumber, cl.quantity
from rcte
join complist cl on cl.compid = rcte.compid
join compsupplier cs on cs.compid = cl.subcompid
join comp c on c.compid = cs.compid
)
select *
from rcte;
POSITION COMPID NAME SUPPL ORDERNU QUANTITY
---------- ------ ------- ----- ------- ----------
1 000123 Comp. 1 Supp1 A44.123 1
2 001024 Comp. 4 Supp2 ux12v39 2
1.1 000277 Comp. 2 Supp1 A96.277 3
1.2 000528 Comp. 3 Supp2 1235287 1
1.2.1 004711 Comp. 5 Supp1 123456 1
The anchor member gets the first two rows direct from the assembly data, including the position from that table - that's essentially your original (pre-Gurv) query, plus the position.
The recursive member then looks in complist for rows whose compid matches each generated row's compid, takes their subcompid as the next component, and appends their position to the parent's while getting the other relevant data from the other tables.
If you want to preserve the order as you showed it in the question, you can add additional columns to the recursive CTE that track the original position and the level you're currently at (possibly with other info to break ties, if they are possible), and exclude those from the final select list:
with rcte (position, compid, name, supplier, ordernumber, quantity,
order_by_1, order_by_2)
as (
select to_char(a.pos), a.compid, c.name, cs.supplier, cs.ordernumber, a.quantity,
a.pos, 1
from assembly a
join compsupplier cs on cs.compid = a.compid
join comp c on c.compid = cs.compid
where a.assyid = 5021
union all
select rcte.position ||'.' || cl.pos, cl.subcompid, c.name,
cs.supplier, cs.ordernumber, cl.quantity,
rcte.order_by_1, rcte.order_by_2 + 1
from rcte
join complist cl on cl.compid = rcte.compid
join compsupplier cs on cs.compid = cl.subcompid
join comp c on c.compid = cs.compid
)
select position, compid, name, supplier, ordernumber, quantity
from rcte
order by order_by_1, order_by_2;
POSITION COMPID NAME SUPPL ORDERNU QUANTITY
---------- ------ ------- ----- ------- ----------
1 000123 Comp. 1 Supp1 A44.123 1
1.1 000277 Comp. 2 Supp1 A96.277 3
1.2 000528 Comp. 3 Supp2 1235287 1
1.2.1 004711 Comp. 5 Supp1 123456 1
2 001024 Comp. 4 Supp2 ux12v39 2
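The way the dotted position is built up can also be followed in a small depth-first walk. This is a Python sketch for illustration only, not part of either answer; the function name explode is hypothetical, it omits the name and supplier lookups, and it assumes the component structure contains no cycles.

```python
# (assyid, pos, compid, quantity) and (compid, pos, subcompid, quantity) from the question
assembly = [("5021", 1, "000123", 1), ("5021", 2, "001024", 2)]
complist = [
    ("000123", 1, "000277", 3),
    ("000123", 2, "000528", 1),
    ("000528", 1, "004711", 1),
]

def explode(assy_id):
    # Index sub-components by their parent component id.
    children = {}
    for compid, pos, subcompid, qty in complist:
        children.setdefault(compid, []).append((pos, subcompid, qty))

    out = []

    def walk(prefix, pos, compid, qty):
        # Append this level's position to the parent's, e.g. "1.2" + "." + "1"
        position = f"{prefix}.{pos}" if prefix else str(pos)
        out.append((position, compid, qty))
        for cpos, sub, cqty in sorted(children.get(compid, [])):
            walk(position, cpos, sub, cqty)

    # Top-level rows come from the assembly table, like the anchor member.
    for _, pos, compid, qty in sorted(a for a in assembly if a[0] == assy_id):
        walk("", pos, compid, qty)
    return out

bom = explode("5021")
for line in bom:
    print(line)
```

Because the walk visits each component's children before moving to the next sibling, the rows come out in the hierarchical order shown in the question without a separate sort.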

Oracle sum group by date range

I have the following table
+-----------+-------+-------+
| Date | Type | Value |
+-----------+-------+-------+
| 1/1/2013 | A | 1 |
| 1/2/2013 | A | 3 |
| 1/3/2013 | A | 5 |
| 1/4/2013 | A | 6 |
| 1/6/2013 | A | 8 |
| 1/7/2013 | A | 1 |
| 1/8/2013 | A | 2 |
+-----------+-------+-------+
I want to sum the values for a given day and the two days before it, so I used this query,
e.g. with sel_date = 1/3/2013:
select type, sum(value)
from table_name
where date <= seldate
and date > seldate - 3
group by type
The problem is that I now want to output a table for a given date range, computing this 3-day sum for each date in the range,
e.g. sel_date range 1/3/2013 - 1/8/2013:
+-----------+-------+------------+
| Date | Type | Sum(Value) |
+-----------+-------+------------+
| 1/3/2013 | A | 9 | // 5 + 3 + 1
| 1/4/2013 | A | 14 | // 6 + 5 + 3
| 1/5/2013 | A | 11 | // 0 + 6 + 5
| 1/6/2013 | A | 14 | // 8 + 0 + 6
| 1/7/2013 | A | 9 | // 1 + 8 + 0
| 1/8/2013 | A | 11 | // 2 + 1 + 8
+-----------+-------+------------+
Is there a way to do this in a single query? I tried reading up on partitioning, but it is leading me nowhere.
Use RANGE BETWEEN in the windowing clause:
select dt, type, value,
sum(value) over (partition by type order by dt range between 2 preceding and current row) as sv
from t
Test data and output:
create table t (dt date, type varchar2(1), value number(5));
insert into t values (date '2013-01-01', 'A', 1);
insert into t values (date '2013-01-02', 'A', 3);
insert into t values (date '2013-01-03', 'A', 5);
insert into t values (date '2013-01-04', 'A', 6);
insert into t values (date '2013-01-05', 'A', 8);
insert into t values (date '2013-01-06', 'A', 1);
insert into t values (date '2013-01-07', 'A', 2);
insert into t values (date '2013-01-12', 'A', 2);
DT TYPE VALUE SV
----------- ---- ------ ----------
2013-01-01 A 1 1
2013-01-02 A 3 4
2013-01-03 A 5 9
2013-01-04 A 6 14
2013-01-05 A 8 19
2013-01-06 A 1 15
2013-01-07 A 2 11
2013-01-12 A 2 2
You can try with something like this:
with test(Date_, Type, Value ) as
(
select to_date('01/01/2013', 'mm/dd/yyyy'), 'A', 1 from dual union all
select to_date('01/02/2013', 'mm/dd/yyyy'), 'A', 3 from dual union all
select to_date('01/03/2013', 'mm/dd/yyyy'), 'A', 5 from dual union all
select to_date('01/04/2013', 'mm/dd/yyyy'), 'A', 6 from dual union all
select to_date('01/05/2013', 'mm/dd/yyyy'), 'A', 8 from dual union all
select to_date('01/06/2013', 'mm/dd/yyyy'), 'A', 1 from dual union all
select to_date('01/07/2013', 'mm/dd/yyyy'), 'A', 2 from dual
)
select *
from (
select date_, type,
value + nvl(lag(value, 1) over (partition by type order by date_), 0)
+ nvl(lag(value, 2) over (partition by type order by date_), 0) as value
from test
)
where date_ between to_date('01/03/2013', 'mm/dd/yyyy') and to_date('01/07/2013', 'mm/dd/yyyy')
This sums, for each row, its own value and the values of the two preceding rows, ordered by date; the outer query is only used to apply the filter, because filtering in the inner query would remove rows that LAG needs and lead to a wrong sum.
LAG reads values from the rows that precede the current row by 1 or 2 positions, so this counts rows rather than days and assumes there is one row per day with no gaps in the dates.
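The difference between a calendar window (RANGE) and a row window (LAG or ROWS) only shows up when a date is missing, as 1/5/2013 is in the question's data. The following Python sketch, with hypothetical function names range_sum and rows_sum, illustrates that difference; it is not a translation of any of the queries.

```python
from datetime import date, timedelta

# The question's data for type A; note there is no row for 2013-01-05
values = {
    date(2013, 1, 1): 1,
    date(2013, 1, 2): 3,
    date(2013, 1, 3): 5,
    date(2013, 1, 4): 6,
    date(2013, 1, 6): 8,
    date(2013, 1, 7): 1,
    date(2013, 1, 8): 2,
}

def range_sum(values, day, days=3):
    """Calendar window: the day itself plus the two days before it; missing days count as 0."""
    return sum(values.get(day - timedelta(days=i), 0) for i in range(days))

def rows_sum(values, day, rows=3):
    """Row window: the last three existing rows up to the day, whatever their dates."""
    ordered = sorted(d for d in values if d <= day)
    return sum(values[d] for d in ordered[-rows:])

start, end = date(2013, 1, 3), date(2013, 1, 8)
report = [
    (start + timedelta(days=i), range_sum(values, start + timedelta(days=i)))
    for i in range((end - start).days + 1)
]
for day, total in report:
    print(day, total)
```

The calendar window reproduces the sums in the question's expected table, including the row for 1/5/2013, while the row window gives 19 for 1/6/2013 because it reaches back to 1/3/2013. Producing a row for a date that has no data in SQL would additionally need a generated list of dates to join against.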
You can use this:
select date1 ,type,
(select sum(t1.value) sumvalue from table_name t1 where t1.type = t2.type and t1.date1 between (t2.date1 - 2) and t2.date1 )
from table_name t2
where date1 between startDate and endDate
select t.date, sum(t.value) OVER(ORDER BY t.date ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING) as Pre_3row_sum
from table_name t
