What if I want Oracle SQL to GROUP BY this but not that? - oracle

This is the result I got from an earlier SQL query:
| Customer Id | week_ending | Purchase Id | Price |
|---|---|---|---|
| 1234 | 2/28/2015 | 8604220 | 15 |
| 1234 | 2/28/2015 | 8604220 | 13.75 |
| 1234 | 2/28/2015 | 8604220 | 12.95 |
| 1234 | 2/28/2015 | 8604220 | 18.95 |
| 567890 | 8/15/2015 | 6376243 | 5.15 |
| 567890 | 8/15/2015 | 6376243 | 0.89 |
| 567890 | 8/15/2015 | 6376243 | 3.99 |
| 567890 | 8/15/2015 | 6376243 | 2.3 |
| 1234 | 1/24/2015 | 8824241 | 0.99 |
| 1234 | 1/24/2015 | 8824241 | 3.99 |
| 1234 | 1/24/2015 | 8824241 | 3.89 |
Now I want to sum the price by Purchase ID, since it is unique for each customer order, but I don't want my SQL to sum it by Customer ID (each customer can order multiple times, each time with a different Purchase ID). Below is the code I wrote, but I'm afraid it will sum by customer_id. How do I avoid this mistake of double accounting? Thanks in advance!
WITH example AS(SELECT
customer_id
,MAX(nvl(promised_arrival_day, ship_day)) OVER (PARTITION BY purchase_id) AS ship2_day
,purchase_id
,SUM(price) AS order_size
FROM
my_table
GROUP BY
customer_id
,MAX(nvl(promised_arrival_day, ship_day)) OVER (PARTITION BY customer_purchase_id)
,purchase_id)
SELECT
example.customer_id
,TO_CHAR(example.ship2_day + (7-TO_CHAR(example.ship2_day,'d')),'MM-DD-YYYY') AS week_ending
,example.purchase_id
,example.order_size
FROM
example;

Just
SELECT customer_id, purchase_id, sum(price)
FROM your_table
GROUP BY customer_id, purchase_id
Each record will be counted only once, so there is no "double accounting", as you put it.
You will get one record for each unique combination of customer_id/purchase_id in your data.
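As a quick check of that claim, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same GROUP BY. SQLite is only a stand-in for Oracle so the query can be run from Python; the table name my_table and the column types are assumptions.

```python
import sqlite3

# Sample rows copied from the question:
# (customer_id, week_ending, purchase_id, price)
rows = [
    (1234, "2/28/2015", 8604220, 15),
    (1234, "2/28/2015", 8604220, 13.75),
    (1234, "2/28/2015", 8604220, 12.95),
    (1234, "2/28/2015", 8604220, 18.95),
    (567890, "8/15/2015", 6376243, 5.15),
    (567890, "8/15/2015", 6376243, 0.89),
    (567890, "8/15/2015", 6376243, 3.99),
    (567890, "8/15/2015", 6376243, 2.3),
    (1234, "1/24/2015", 8824241, 0.99),
    (1234, "1/24/2015", 8824241, 3.99),
    (1234, "1/24/2015", 8824241, 3.89),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(customer_id INTEGER, week_ending TEXT, purchase_id INTEGER, price REAL)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", rows)

# One output row per (customer_id, purchase_id) pair; every input row
# contributes to exactly one group.
totals = conn.execute(
    """
    SELECT customer_id, purchase_id, SUM(price) AS order_size
    FROM my_table
    GROUP BY customer_id, purchase_id
    ORDER BY purchase_id
    """
).fetchall()

for customer_id, purchase_id, order_size in totals:
    print(customer_id, purchase_id, round(order_size, 2))
```

Customer 1234 appears in two result rows, one per purchase_id, rather than in a single merged total.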

Looking at your data, I would do something like this:
with tbl as(
-- query by which you are getting dataset in example
)
select customer_id,purchase_id, sum(price) as total_price from tbl
group by purchase_id,customer_id

Related

Oracle: Retrieving specific group of records based by date

I have a table in Oracle that I'm trying to write a query for, but I'm having trouble writing it correctly. The data in the table looks like this:
| Name | ID | DATE |
|---|---|---|
| Shane | 1 | 01JAN2023 |
| Angie | 2 | 02JAN2023 |
| Shane | 1 | 02JAN2023 |
| Austin | 3 | 03JAN2023 |
| Shane | 1 | 03JAN2023 |
| Angie | 2 | 03JAN2023 |
| Tony | 4 | 05JAN2023 |
What I was trying to come up with was a way to iterate over each day, look at all the records for that day and compare with the rest of the records in the table that came before it and only pull back the first instance of the record based on the ID & Date. The expected output would be:
| Name | ID | DATE |
|---|---|---|
| Shane | 1 | 01JAN2023 |
| Angie | 2 | 02JAN2023 |
| Austin | 3 | 03JAN2023 |
| Tony | 4 | 05JAN2023 |
Can anyone tell me what the query should be to accomplish this?
Thank you in advance.
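You'll need to convert your date field to a real date so it orders correctly. DATE and TABLE are reserved words in Oracle, so placeholder names are used here:
SELECT name, id, MIN(TO_DATE(date_column, 'DDMONYYYY')) AS first_date
FROM your_table
GROUP BY name, id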
Isn't that just
select name, id, min(date_column)
from your_table
group by name, id;
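To see the aggregation produce the expected output, here is a minimal sketch run against SQLite from Python (a stand-in for Oracle). The names your_table and date_column are the placeholders from the answer. The dates are stored as ISO 'YYYY-MM-DD' text, an assumption made so that MIN compares them chronologically, which is the same concern as converting the string to a real date in Oracle.

```python
import sqlite3

# Rows from the question, with dates rewritten in ISO format so that
# text comparison matches chronological order.
rows = [
    ("Shane", 1, "2023-01-01"),
    ("Angie", 2, "2023-01-02"),
    ("Shane", 1, "2023-01-02"),
    ("Austin", 3, "2023-01-03"),
    ("Shane", 1, "2023-01-03"),
    ("Angie", 2, "2023-01-03"),
    ("Tony", 4, "2023-01-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, id INTEGER, date_column TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

# Earliest date per (name, id): one row per person.
first_seen = conn.execute(
    """
    SELECT name, id, MIN(date_column) AS first_date
    FROM your_table
    GROUP BY name, id
    ORDER BY first_date
    """
).fetchall()

for row in first_seen:
    print(row)
```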
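If you don't want to use aggregation, you can order by ROW_NUMBER() and use FETCH NEXT 1 ROWS WITH TIES (Oracle 12c and later). The earliest row of every Name/Id group gets row number 1, so all of them tie for first place and are returned: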
SELECT tab.*
FROM tab
ORDER BY ROW_NUMBER() OVER(PARTITION BY Name, Id ORDER BY DATE_)
FETCH NEXT 1 ROWS WITH TIES
Output:
| NAME | ID | DATE_ |
|---|---|---|
| Angie | 2 | 02-JAN-23 |
| Austin | 3 | 03-JAN-23 |
| Shane | 1 | 01-JAN-23 |
| Tony | 4 | 05-JAN-23 |

How to select rows from a table based on duplicated values in a column Snowflake

I have a table A that looks similar to:
| ID | PET | COUNTRY |
|---|---|---|
| 45 | DOG | US |
| 72 | DOG | CA |
| 15 | CAT | CA |
| 36 | CAT | US |
| 37 | CAT | SG |
| 12 | SNAKE | IN |
| 20 | PIG | US |
| 14 | PIG | RS |
| 33 | HORSE | IQ |
(The real table has a few hundred rows.)
I would like to retain the rows that have a duplicated "PET" value, so the result looks like:
| ID | PET | COUNTRY |
|---|---|---|
| 45 | DOG | US |
| 72 | DOG | CA |
| 15 | CAT | CA |
| 36 | CAT | US |
| 37 | CAT | SG |
| 20 | PIG | US |
| 14 | PIG | RS |
How can I remove the rows that do not have duplicated PET values? Would it be something like
SELECT ID, PET, COUNTRY, COUNT(*)
FROM A
GROUP BY PET, COUNTRY, ID
HAVING COUNT(*) >1
I am not sure how to group the values by PET and pick out the groups only containing one row. Thanks!
What about simply doing:
WITH
RES AS (SELECT PET, COUNT(*) FROM A GROUP BY PET HAVING COUNT(*) > 1)
SELECT ID, PET, COUNTRY FROM A WHERE PET IN (SELECT PET FROM RES);
This would give you all rows with pets present in more than one row.
A shorter way is to use QUALIFY:
SELECT *
FROM tab
QUALIFY COUNT(*) OVER(PARTITION BY PET) > 1;
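QUALIFY is not available in every engine. As a rough sketch of the same idea where it is missing, the window count can be computed in a subquery and filtered in the outer WHERE. The example below runs in SQLite from Python (window functions need SQLite 3.25 or later) with the sample rows from the question; the lower-case table name a and the pet_count alias are assumptions.

```python
import sqlite3

# Sample rows from the question: (id, pet, country)
rows = [
    (45, "DOG", "US"), (72, "DOG", "CA"),
    (15, "CAT", "CA"), (36, "CAT", "US"), (37, "CAT", "SG"),
    (12, "SNAKE", "IN"),
    (20, "PIG", "US"), (14, "PIG", "RS"),
    (33, "HORSE", "IQ"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, pet TEXT, country TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", rows)

# COUNT(*) OVER (PARTITION BY pet) attaches the group size to every row,
# so the outer query can keep only rows whose pet occurs more than once.
dupes = conn.execute(
    """
    SELECT id, pet, country
    FROM (
        SELECT id, pet, country,
               COUNT(*) OVER (PARTITION BY pet) AS pet_count
        FROM a
    )
    WHERE pet_count > 1
    ORDER BY id
    """
).fetchall()

for row in dupes:
    print(row)
```

SNAKE and HORSE each occur once, so their rows are dropped.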

Oracle 'Partition By' and 'Row_Number' keyword along with pivot

I have this query, written by someone else, and I am trying to figure out how it works. I have a general idea of each piece, such as row_number(), partition by, and pivot, but I am unable to understand them all together.
For this query:
select
d, p, s, a
from
(
select name,occupation, (ROW_NUMBER() OVER (partition by occupation order by name)) as rownumber from occupations
)
pivot
(
max(name)
for occupation
in ('Doctor' as d, 'Professor' as p, 'Singer' as s, 'Actor' as a)
)
order by rownumber;
This is the input table on which the above query works: [input table not preserved]
This is the output generated by the query, which is correct as per the question:
Jenny Ashley Meera Jane
Samantha Christeen Priya Julia
NULL Ketty NULL Maria
Now, I want to know how the output is generated by the query i.e. step by step with flow of execution. Explanation with easy examples matching the above situation would be much appreciated. Thanks in advance.
After the from clause you have the following:
select name,occupation, (ROW_NUMBER() OVER (partition by occupation order by name))
This restacks your table data into three columns: name, occupation, and rownumber. rownumber resets to 1 each time the occupation changes. The output will look like this:
NAME                 OCCUPATION           ROWNUMBER
-------------------- -------------------- ----------
Jane                 Actor                1
Julia                Actor                2
Maria                Actor                3
Jenny                Doctor               1 <-- rownumber reset to 1
Samantha             Doctor               2
The PIVOT clause lets you aggregate results and rotate rows into columns.
The general PIVOT syntax is:
PIVOT
(
aggregate_function(column2)
FOR column2
IN ( expr1, expr2, ... expr_n) | subquery
)
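So your PIVOT stacks the NAME values into one column per OCCUPATION. Because rownumber is neither aggregated nor pivoted, it acts as an implicit grouping column: each output row holds the names that share the same rownumber, and the final ORDER BY rownumber sorts those rows.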
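The two steps can be traced with a small sketch. SQLite (run here from Python, version 3.25 or later for window functions) has no PIVOT clause, so the second step is written as conditional aggregation grouped by rownumber, which is one common way to express the same operation; it is not the Oracle syntax itself. The input rows are an assumption, reconstructed from the output shown in the question, because the original input table was not preserved.

```python
import sqlite3

# Assumed input, reconstructed from the question's output.
rows = [
    ("Jenny", "Doctor"), ("Samantha", "Doctor"),
    ("Ashley", "Professor"), ("Christeen", "Professor"), ("Ketty", "Professor"),
    ("Meera", "Singer"), ("Priya", "Singer"),
    ("Jane", "Actor"), ("Julia", "Actor"), ("Maria", "Actor"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occupations (name TEXT, occupation TEXT)")
conn.executemany("INSERT INTO occupations VALUES (?, ?)", rows)

# Step 1: number the names inside each occupation, alphabetically.
numbered_sql = """
    SELECT name, occupation,
           ROW_NUMBER() OVER (PARTITION BY occupation ORDER BY name) AS rownumber
    FROM occupations
"""
numbered = conn.execute(
    numbered_sql + " ORDER BY occupation, rownumber"
).fetchall()

# Step 2: emulate PIVOT. Group by the leftover column (rownumber) and pick
# the name for each occupation; MAX ignores the NULLs from non-matching rows.
pivoted = conn.execute(
    f"""
    SELECT MAX(CASE WHEN occupation = 'Doctor'    THEN name END) AS d,
           MAX(CASE WHEN occupation = 'Professor' THEN name END) AS p,
           MAX(CASE WHEN occupation = 'Singer'    THEN name END) AS s,
           MAX(CASE WHEN occupation = 'Actor'     THEN name END) AS a
    FROM ({numbered_sql})
    GROUP BY rownumber
    ORDER BY rownumber
    """
).fetchall()

for row in pivoted:
    print(row)
```

Occupations with fewer names than the longest list produce NULL (None in Python) in the later rows, which matches the NULL entries in the question's output.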

Display Records through SQL in Oracle

I ran the following query in Oracle Database and it produced the following output:
Query: select id,name from member where name like 'A%';
ID Name
261 A....
706 Aaa.......
327 Ab.....
and more...
This query returns 50 records, and I want to display 10 records at a time to the user.
Since ID is not assigned in auto-increment fashion, I cannot use the BETWEEN operator on it, and ROWNUM doesn't help much either.
Kindly help.
Regards,
Ankit Agarwal
SELECT id, name
FROM (
  SELECT id, name, ROW_NUMBER() OVER (ORDER BY name) r
  FROM member
  WHERE name LIKE 'A%'
)
WHERE r BETWEEN :from_row AND :to_row  -- bind variables, e.g. 1 and 10 for the first page
ORDER BY r;
See http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:76812348057
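A minimal sketch of how the row-number window might be driven page by page, run against SQLite from Python as a stand-in for Oracle (SQLite 3.25 or later). The fetch_page helper, the generated member rows, and the extra id tiebreaker in the ORDER BY are all hypothetical and only illustrate the arithmetic for the first and last row of a page.

```python
import sqlite3

# Hypothetical data: 25 names starting with "A" whose ids are not sequential,
# plus one row that the LIKE filter should exclude.
members = [((i * 37) % 101 + 200, f"A{i:02d}") for i in range(25)]
members.append((999, "Bob"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO member VALUES (?, ?)", members)


def fetch_page(conn, page, page_size=10):
    """Return the rows for a 1-based page number."""
    first_row = (page - 1) * page_size + 1
    last_row = page * page_size
    return conn.execute(
        """
        SELECT id, name
        FROM (
            SELECT id, name,
                   ROW_NUMBER() OVER (ORDER BY name, id) AS r
            FROM member
            WHERE name LIKE 'A%'
        )
        WHERE r BETWEEN ? AND ?
        ORDER BY r
        """,
        (first_row, last_row),
    ).fetchall()


page1 = fetch_page(conn, 1)
page3 = fetch_page(conn, 3)
page4 = fetch_page(conn, 4)
print(len(page1), len(page3), len(page4))
```

Ordering by a unique combination (here name, then id) keeps the pages stable between calls; with ties in the ORDER BY, a row could otherwise appear on two pages or on none.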

Select all rows from SQL based upon existence of multiple rows (sequence numbers)

Let's say I have table data similar to the following:
123456 John Doe 1 Green 2001
234567 Jane Doe 1 Yellow 2001
234567 Jane Doe 2 Red 2001
345678 Jim Doe 1 Red 2001
What I am attempting to do is isolate only the records for Jane Doe, based on the fact that she has more than one row in this table (more than one sequence number).
I cannot isolate based upon ID, names, colors, years, etc...
The number 1 in the sequence tells me that it is the first record, and I need to be able to display that record as well as the number 2 record, the change record.
Say the table is called users and the fields are called ID, fname, lname, seq_no, color, date. How would I write the code to select only records that have more than one row in this table? For example:
I want the query to display only this, based upon the existence of the multiple rows:
234567 Jane Doe 1 Yellow 2001
234567 Jane Doe 2 Red 2001
In PL/SQL
First, to find the IDs that have multiple rows, you would use:
SELECT ID FROM users GROUP BY ID HAVING COUNT(*) > 1
So you could get all the records for all those people with:
SELECT * FROM users WHERE ID IN (SELECT ID FROM users GROUP BY ID HAVING COUNT(*) > 1)
If you know that the second sequence number will always be 2 and that the "2" record will never be deleted, you might find something like
SELECT * FROM users WHERE ID IN (SELECT ID FROM users WHERE seq_no = 2)
to be faster, but you had better be sure the requirements are guaranteed to be met in your database (and you would want a compound index on (seq_no, ID)).
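Here is a minimal sketch of the GROUP BY ... HAVING subquery applied to the four sample rows from the question, using the users table and seq_no column named there. It runs in SQLite from Python as a stand-in for Oracle; the column types are assumptions.

```python
import sqlite3

# Sample rows from the question: (id, fname, lname, seq_no, color, date)
rows = [
    (123456, "John", "Doe", 1, "Green", 2001),
    (234567, "Jane", "Doe", 1, "Yellow", 2001),
    (234567, "Jane", "Doe", 2, "Red", 2001),
    (345678, "Jim", "Doe", 1, "Red", 2001),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE users (id INTEGER, fname TEXT, lname TEXT, '
    'seq_no INTEGER, color TEXT, "date" INTEGER)'
)
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?, ?)", rows)

# The inner query finds ids that occur more than once;
# the outer query returns every row for those ids.
multi = conn.execute(
    """
    SELECT *
    FROM users
    WHERE id IN (SELECT id FROM users GROUP BY id HAVING COUNT(*) > 1)
    ORDER BY id, seq_no
    """
).fetchall()

for row in multi:
    print(row)
```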
Try something like the following. It needs a single table scan, as opposed to two for the other approaches.
SELECT * FROM (
SELECT t1.*, COUNT(*) OVER (PARTITION BY id) mycount FROM users t1
)
WHERE mycount > 1;
Use an inner join of the table with itself.
With JOIN syntax:
SELECT u1.ID, u1.fname, u1.lname, u1.seq_no, u1.color, u1.date
FROM users u1 JOIN users u2 ON (u1.ID = u2.ID and u2.seq_no = 2)
With WHERE syntax:
SELECT u1.ID, u1.fname, u1.lname, u1.seq_no, u1.color, u1.date
FROM users u1, users u2
WHERE
u1.ID = u2.ID AND
u2.seq_no = 2
Check out the HAVING clause for a summary query. You can specify stuff like
HAVING COUNT(*) >= 2
and so forth.
