oracle sql compare result two subselects - oracle

Let's say I have two Oracle SQL tables for my invoices. INV_HEAD for address, date, ...
Then I have INV_POS for every position of the invoice.
INV_HEAD
--------
id
date
adr_id
total
INV_POS
---------
id
he_id
pos
art_id
quantity
price
I can list all the invoices with
SELECT he.id, he.date, po.art_id, po.quantity, po.price
FROM INV_HEAD he
JOIN INV_POS po on po.he_id = he.id
Now I want to find invoices with the same positions, not necessarily in the same order. How can I do this?
As a result I only need the INV_HEAD.id of all invoices with the same positions.
Here is some sample data:
id | he_id | pos | art_id | quantity | price
1 | 1 | 1 | 1000 | 5 | 100.00
2 | 1 | 2 | 2000 | 10 | 5000.00
3 | 2 | 1 | 2500 | 2 | 1250.00
4 | 3 | 1 | 2000 | 10 | 5000.00
5 | 3 | 2 | 1000 | 5 | 100.00
Invoices with he_id 1 and 3 have the same positions.

You can use the aggregate function LISTAGG to concatenate the ids of the invoices that share a position:
SELECT p.pos, LISTAGG(h.id, ', ') WITHIN GROUP (ORDER BY h.id) "Id"
FROM inv_head h, inv_pos p
where h.id=p.he_id
group by p.pos;
You will get the following results:
POS | Id
1 | 1, 2, 3
2 | 1, 3
I don't see a reason to join on inv_head, but I stuck to your original query (you probably have a reason for it).

You want something like (note that the next query does not work, because we cannot compare sets using =):
SELECT DISTINCT H1.ID, H2.ID
FROM INV_HEAD H1, INV_HEAD H2
WHERE H1.ID <> H2.ID AND
(SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H1.ID) =
(SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H2.ID)
But we can rethink the problem: A = B also means that A-B UNION B-A is an empty set.
So instead of A = B you can use NOT EXISTS((A MINUS B) UNION (B MINUS A))
where A is (SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H1.ID) and B is (SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H2.ID)
So your query is:
SELECT DISTINCT H1.ID, H2.ID
FROM INV_HEAD H1, INV_HEAD H2
WHERE H1.ID <> H2.ID
AND NOT EXISTS(
((SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H1.ID)
MINUS
(SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H2.ID))
UNION
((SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H2.ID)
MINUS
(SELECT ART_ID, QUANTITY, PRICE FROM INV_POS WHERE HE_ID = H1.ID)));
This query returns pairs of invoices that have the same positions (note that two invoices with no positions at all are considered equal).
The condition H1.ID <> H2.ID avoids pairs like (1, 1) or (3, 3). But if you get the pair (1, 3) you will also get the symmetric pair (3, 1). To avoid that symmetry, use H1.ID < H2.ID or H1.ID > H2.ID instead.
If you want to find the invoices with the same positions as a given invoice with ID = X, use WHERE H1.ID = X AND H1.ID <> H2.ID AND ... (use <>, never < or >, in this case).
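The set reasoning behind the query (A = B exactly when (A MINUS B) UNION (B MINUS A) is empty) can be checked on the sample data outside the database. This is only a sketch in plain Python, not Oracle SQL. The helper name same_positions is made up for illustration, and unlike the query it only sees invoices that have at least one row in INV_POS.

```python
from collections import defaultdict
from itertools import combinations

# Sample INV_POS rows from the question: (id, he_id, pos, art_id, quantity, price)
inv_pos = [
    (1, 1, 1, 1000, 5, 100.00),
    (2, 1, 2, 2000, 10, 5000.00),
    (3, 2, 1, 2500, 2, 1250.00),
    (4, 3, 1, 2000, 10, 5000.00),
    (5, 3, 2, 1000, 5, 100.00),
]

# One set of (art_id, quantity, price) per invoice; pos is ignored on purpose
# because the order of the positions should not matter.
positions = defaultdict(set)
for _id, he_id, _pos, art_id, quantity, price in inv_pos:
    positions[he_id].add((art_id, quantity, price))


def same_positions(a, b):
    """Hypothetical helper: (A - B) | (B - A) is empty exactly when A == B."""
    return not ((a - b) | (b - a))


# combinations() yields each unordered pair once, like H1.ID < H2.ID in the query.
matches = [
    (h1, h2)
    for h1, h2 in combinations(sorted(positions), 2)
    if same_positions(positions[h1], positions[h2])
]
print(matches)  # [(1, 3)]
```

Like MINUS, Python sets drop duplicates, so an invoice that lists the same article, quantity and price twice compares equal to one that lists it once.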

Is it possible to add distinct to part of a sum clause in Oracle?

I have a pretty lengthy SQL query which I'm going to run on Oracle via hibernate. It consists of two nested selects. In the first select statement, a number of sums are calculated, but in one of them I want to filter the results using unique ids.
SELECT ...
SUM(NVL(CASE WHEN SECOND_STATUS= 50 OR SECOND_STATUS IS NULL THEN RECEIVE_AMOUNT END, 0) +
NVL(CASE WHEN FIRST_STATUS = 1010 THEN AMOUNT END, 0) +
NVL(CASE WHEN FIRST_STATUS = 1030 THEN AMOUNT END, 0) -
NVL(CASE WHEN FIRST_STATUS = 1010 AND (SECOND_STATUS= 50 OR SECOND_STATUS IS NULL) THEN RECEIVE_AMOUNT END, 0)) TOTAL, ...
And at the end:
... FROM (SELECT s.*, p.* FROM FIRST_TABLE s
JOIN SECOND_TABLE p ON s.ID = p.FIRST_ID
In one of the lines that start with NVL (second line actually), I want to add a distinct clause that sums the amounts only if first table ids are unique. But I don't know if this is possible or not. If yes, how would it be?
Assume the following setup:
select * from first;
ID AMOUNT
---------- ----------
1 10
2 20
select * from second;
SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ----------
1 1 100
2 1 100
3 2 100
After the join you get the total sum of both amounts too high because the amount from the first table is duplicated.
select *
from first
join second on first.id = second.first_id;
ID AMOUNT SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ---------- ---------- ----------
1 10 1 1 100
1 10 2 1 100
2 20 3 2 100
You must add a row_number that identifies the first occurrence of each parent row, keep AMOUNT only on that first row, and reset it to NULL on the duplicated rows.
select ID,
case when row_number() over (partition by id order by second_id) = 1 then AMOUNT end as AMOUNT,
SECOND_ID, FIRST_ID, AMOUNT2
from first
join second on first.id = second.first_id;
ID AMOUNT SECOND_ID FIRST_ID AMOUNT2
---------- ---------- ---------- ---------- ----------
1 10 1 1 100
1 (null) 2 1 100
2 20 3 2 100
Now you can safely sum in a separate subquery
with tab as (
select ID,
case when row_number() over (partition by id order by second_id) = 1 then AMOUNT end as AMOUNT,
SECOND_ID, FIRST_ID, AMOUNT2
from first
join second on first.id = second.first_id
)
select id, sum(nvl(amount,0) + nvl(amount2,0))
from tab
group by id
;
ID SUM(NVL(AMOUNT,0)+NVL(AMOUNT2,0))
---------- ---------------------------------
1 210
2 120
Note that this answers your question as asked. Some will argue that in your situation you should first aggregate and then join, which would also resolve your problem, possibly more elegantly.
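The "aggregate first, then join" alternative might look like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module rather than Oracle, so COALESCE stands in for NVL, and the table names first_t and second_t are stand-ins for the tables in the setup above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_t  (id INTEGER, amount INTEGER);
    CREATE TABLE second_t (second_id INTEGER, first_id INTEGER, amount2 INTEGER);
    INSERT INTO first_t  VALUES (1, 10), (2, 20);
    INSERT INTO second_t VALUES (1, 1, 100), (2, 1, 100), (3, 2, 100);
""")

# Collapse the child table to one row per parent before joining,
# so the parent amount is never duplicated by the join.
rows = conn.execute("""
    SELECT f.id, COALESCE(f.amount, 0) + s.amount2_sum AS total
    FROM first_t f
    JOIN (
        SELECT first_id, SUM(amount2) AS amount2_sum
        FROM second_t
        GROUP BY first_id
    ) s ON s.first_id = f.id
    ORDER BY f.id
""").fetchall()
print(rows)  # [(1, 210), (2, 120)]
```

The totals match the row_number approach, and no window function is needed because each parent row meets exactly one pre-aggregated child row.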

Oracle add group function over result rows

I'm trying to add an aggregate function column to an existing result set. I've tried variations of OVER(), UNION, but cannot find a solution.
Example current result set:
ID ATTR VALUE
1 score 5
1 score 7
1 score 9
Example desired result set:
ID ATTR VALUE STDDEV (score)
1 score 5 2
1 score 7 2
1 score 9 2
Thank you
Seems like you're after one of:
stddev(value) over (partition by attr)
stddev(value) over (partition by id, attr)
It just depends on what you need to partition by. Based on the sample data, attr should be enough, but you may need both ID and attr.
Example:
With CTE (ID, Attr, Value) as (
SELECT 1, 'score', 5 from dual union all
SELECT 1, 'score', 7 from dual union all
SELECT 1, 'score', 9 from dual union all
SELECT 1, 'Z', 1 from dual union all
SELECT 1, 'Z', 5 from dual union all
SELECT 1, 'Z', 8 from dual)
SELECT A.*, stddev(value) over (partition by attr)
FROM cte A
ORDER BY attr, value
The docs show that adding an ORDER BY inside the analytic clause yields a cumulative standard deviation per record.
Without it, the query above gives us:
+----+-------+-------+------------------------------------------+
| ID | attr | value | stdev |
+----+-------+-------+------------------------------------------+
| 1 | Z | 1 | 3.51188458428424628280046822063322249225 |
| 1 | Z | 5 | 3.51188458428424628280046822063322249225 |
| 1 | Z | 8 | 3.51188458428424628280046822063322249225 |
| 1 | score | 5 | 2 |
| 1 | score | 7 | 2 |
| 1 | score | 9 | 2 |
+----+-------+-------+------------------------------------------+
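The figures in this table can be cross-checked by hand. Oracle's STDDEV is the sample standard deviation (it divides by n - 1), which is what Python's statistics.stdev computes, so a quick check outside the database could look like this sketch:

```python
import statistics

# Values per attr, taken from the CTE in the example above
values_by_attr = {"score": [5, 7, 9], "Z": [1, 5, 8]}

# statistics.stdev divides by n - 1 (sample standard deviation)
result = {attr: statistics.stdev(vals) for attr, vals in values_by_attr.items()}

print(result["score"])        # 2.0
print(round(result["Z"], 6))  # 3.511885
```

For "score" the mean is 7, the squared deviations sum to 8, and sqrt(8 / 2) = 2, matching the desired STDDEV column in the question.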

converting multiple comma separated columns into rows

I have an Oracle table which holds comma separated values in many columns. For example :
Id Column1 Column2
1 A,B,C H
2 D,E J,K
3 F L,M,N
I want to split all the columns into rows and the output should be this :
ID Column1 Column2
1 A H
1 B H
1 C H
2 D J
2 D K
2 E J
2 E K
3 F L
3 F M
3 F N
I found some suggestions which use regexp_substr and connect by, but they deal with only one column of comma separated values. I have also tried a sub-query method, handling one column at a time in an inner query and passing its output to the outer query, but this takes more time and there are many columns that hold comma separated values. So I cannot use the sub-query method.
For one column you can CROSS JOIN a TABLE() collection expression containing a correlated sub-query that uses a hierarchical query to split the column value up into separate strings. For two columns, you just do the same thing for the second column and the CROSS JOIN takes care of ensuring that each delimited value in column1 is paired with each delimited value in column2.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( Id, Column1, Column2 ) AS
SELECT 1, 'A,B,C', 'H' FROM DUAL UNION ALL
SELECT 2, 'D,E', 'J,K' FROM DUAL UNION ALL
SELECT 3, 'F', 'L,M,N' FROM DUAL;
Query 1:
SELECT t.id,
c1.COLUMN_VALUE AS c1,
c2.COLUMN_VALUE AS c2
FROM table_name t
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT REGEXP_SUBSTR( t.Column1, '[^,]+', 1, LEVEL )
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.Column1, '[^,]+' )
) AS SYS.ODCIVARCHAR2LIST
)
) c1
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT REGEXP_SUBSTR( t.Column2, '[^,]+', 1, LEVEL )
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.Column2, '[^,]+' )
) AS SYS.ODCIVARCHAR2LIST
)
) c2
Results:
| ID | C1 | C2 |
|----|----|----|
| 1 | A | H |
| 1 | B | H |
| 1 | C | H |
| 2 | D | J |
| 2 | D | K |
| 2 | E | J |
| 2 | E | K |
| 3 | F | L |
| 3 | F | M |
| 3 | F | N |
The query below will give you an idea of how to convert a comma separated string into rows. You can adapt this logic to your needs.
select regexp_substr('a,b,c,v,f', '[^,]+',1,level)
from dual
connect by level <= regexp_count('a,b,c,v,f', ',') + 1;
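The pairing that the CROSS JOIN produces is a Cartesian product of the split values of each column. As a rough illustration in Python rather than Oracle SQL (the function name explode is made up for this sketch), the expected rows for the sample table can be generated like this:

```python
from itertools import product

# Sample rows from the question: (Id, Column1, Column2)
table_rows = [(1, "A,B,C", "H"), (2, "D,E", "J,K"), (3, "F", "L,M,N")]


def explode(rows):
    """Hypothetical helper: split every delimited column and pair all values."""
    for row_id, *columns in rows:
        parts = [col.split(",") for col in columns]
        # product() pairs each value of one column with each value of the others,
        # which is what chaining CROSS JOINs does in the query.
        for combo in product(*parts):
            yield (row_id, *combo)


result = list(explode(table_rows))
for row in result:
    print(row)
```

This works for any number of delimited columns. One difference to keep in mind: str.split keeps empty strings between consecutive commas, while the '[^,]+' pattern skips them.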

Select an included values

I'm using Oracle SQL and I need help with a query. I hope it's not too easy a question; I didn't find an answer for it on Google.
I have a table that needs to be aggregated by the ID column, and I then want to select only the IDs for which two given values both appear in the table.
Table for example
ID | Value
1 | Y
1 | N
2 | N
2 | N
2 | Y
3 | Y
3 | Y
4 | Y
5 | Y
5 | N
5 | Y
5 | N
The output needs to include only the IDs for which both Y and N appear in the Value column. Output:
ID
1
2
5
Another solution that groups by the ID and uses HAVING to return only those with > 1 DISTINCT values:
with v_data(id, value) as (
select 1, 'Y' from dual union all
select 1, 'N' from dual union all
select 2, 'Y' from dual)
select id
from v_data
group by id
having count(distinct value) > 1;

select distinct a.id
from your_table a inner join your_table b on a.id = b.id and a.value != b.value;
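Both approaches, the GROUP BY with HAVING and the self-join, can be tried against the full sample table from the question. The sketch below uses an in-memory SQLite database through Python's sqlite3 module instead of Oracle; the statements involved are plain enough to behave the same way in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, value TEXT)")
sample = [
    (1, "Y"), (1, "N"),
    (2, "N"), (2, "N"), (2, "Y"),
    (3, "Y"), (3, "Y"),
    (4, "Y"),
    (5, "Y"), (5, "N"), (5, "Y"), (5, "N"),
]
conn.executemany("INSERT INTO your_table VALUES (?, ?)", sample)

# Approach 1: group by id and keep ids with more than one distinct value
grouped = [r[0] for r in conn.execute("""
    SELECT id FROM your_table
    GROUP BY id
    HAVING COUNT(DISTINCT value) > 1
    ORDER BY id
""")]

# Approach 2: self-join on the same id with a different value
joined = [r[0] for r in conn.execute("""
    SELECT DISTINCT a.id
    FROM your_table a
    JOIN your_table b ON a.id = b.id AND a.value != b.value
    ORDER BY a.id
""")]

print(grouped, joined)  # [1, 2, 5] [1, 2, 5]
```

Both queries rely on Value holding only Y or N. If other values can occur, one option is to filter with WHERE value IN ('Y', 'N') and require COUNT(DISTINCT value) = 2.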

Sample time series by time interval with Hive QL and calculate jumps

I have time series data in a table. Basically each row has a timestamp and a value.
The frequency of the data is absolutely random.
I'd like to sample it with a given frequency and, for each interval, extract relevant information about it: min, max, last, change (relative to the previous interval), return (change / previous) and maybe more (count...)
So here's my input:
08:00:10, 1
08:01:20, 2
08:01:21, 3
08:01:24, 5
08:02:24, 2
And I'd like to get the following result for 1 minute sampling (ts, min, max, last, change, return):
ts m M L Chg Return
08:01:00, 1, 1, 1, NULL, NULL
08:02:00, 2, 5, 5, 4, 4
08:03:00, 2, 2, 2, -3, -0.25
You could do it with something like this (comments inline):
SELECT
min
, mn
, mx
, l
, l - LAG(l, 1) OVER (ORDER BY min) c
-- This might not be the right calculation. Unsure how -0.25 was derived in question.
, (l - LAG(l, 1) OVER (ORDER BY min)) / (LAG(l, 1) OVER (ORDER BY min)) r
FROM
(
SELECT
min
, MIN(val) mn
, MAX(val) mx
-- We can take MAX here because all l's (last values) for the minute are the same.
, MAX(l) l
FROM
(
SELECT
min
, val
-- The last value of the minute, ordered by the timestamp, using all rows.
, LAST_VALUE(val) OVER (PARTITION BY min ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) l
FROM
(
SELECT
ts
-- Drop the seconds and move forward one minute by converting to seconds,
-- adding 60, and then going back to a shorter string format.
-- 2000-01-01 is a dummy date just to enable the conversion.
, CONCAT(FROM_UNIXTIME(UNIX_TIMESTAMP(CONCAT("2000-01-01 ", ts), "yyyy-MM-dd HH:mm:ss") + 60, "HH:mm"), ":00") min
, val
FROM
-- As from the question.
21908430_input a
) val_by_min
) val_by_min_with_l
GROUP BY min
) min_with_l_m_M
ORDER BY min
;
Result:
+----------+----+----+---+------+------+
| min | mn | mx | l | c | r |
+----------+----+----+---+------+------+
| 08:01:00 | 1 | 1 | 1 | NULL | NULL |
| 08:02:00 | 2 | 5 | 5 | 4 | 4 |
| 08:03:00 | 2 | 2 | 2 | -3 | -0.6 |
+----------+----+----+---+------+------+
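To see where each number in this result comes from, the same bucketing can be replayed in a few lines of Python. This is only a sketch of the logic, not Hive QL. It assumes all timestamps fall on a single day and, like the query, labels each sample with the end of its minute.

```python
from datetime import datetime, timedelta

# Input from the question: (timestamp, value)
samples = [("08:00:10", 1), ("08:01:20", 2), ("08:01:21", 3),
           ("08:01:24", 5), ("08:02:24", 2)]


def bucket(ts):
    """Drop the seconds and move forward one minute, as the query does."""
    t = datetime.strptime(ts, "%H:%M:%S")
    return (t.replace(second=0) + timedelta(minutes=1)).strftime("%H:%M:%S")


# Group values per minute; zero-padded HH:MM:SS strings sort chronologically.
buckets = {}
for ts, val in sorted(samples):
    buckets.setdefault(bucket(ts), []).append(val)

rows = []
prev_last = None
for minute in sorted(buckets):
    vals = buckets[minute]
    last = vals[-1]  # last value of the minute by timestamp
    change = None if prev_last is None else last - prev_last
    # Return is change / previous last; left as None when undefined
    ret = None if prev_last in (None, 0) else change / prev_last
    rows.append((minute, min(vals), max(vals), last, change, ret))
    prev_last = last

for row in rows:
    print(row)
```

The last row gives -3 / 5 = -0.6 for the return, which agrees with the query result rather than the -0.25 shown in the question.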
