How to select duplicate rows and group by two columns

I have the following table:
ID|user_id|group_id|subject |book_id
1| 2 |3 |history |1
2| 4 |3 |history |1
3| 5 |3 |art |2
4| 2 |3 |art |2
5| 1 |4 |sport |5
I would like to list all rows for group 3 (group_id) that have duplicates with the same subject and book_id. The subject and book_id together determine whether two or more rows are duplicates.
I would like my distinct results to look like this:
|subject |book_id|
|history |1 |
|art |2 |
I would like to do this using either the query builder or Eloquent.

A SQL query to get the desired result may look like this:
SELECT subject, book_id
FROM table1
WHERE group_id = 3
GROUP BY subject, book_id
HAVING COUNT(*) > 1
Now the same using the Laravel Query Builder
$duplicates = DB::table('table1')
->select('subject', 'book_id')
->where('group_id', 3)
->groupBy('subject', 'book_id')
->havingRaw('COUNT(*) > 1')
->get();
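To check the GROUP BY / HAVING logic without a Laravel project, here is a minimal sketch that runs the same SQL against an in-memory SQLite database from Python. The choice of SQLite and Python is an assumption made only for this illustration; the ORDER BY is added so the output order is deterministic.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, user_id INTEGER, group_id INTEGER, "
    "subject TEXT, book_id INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, 3, "history", 1),
        (2, 4, 3, "history", 1),
        (3, 5, 3, "art", 2),
        (4, 2, 3, "art", 2),
        (5, 1, 4, "sport", 5),
    ],
)

# Same query as in the answer: keep only (subject, book_id) pairs
# that occur more than once within group 3.
duplicates = conn.execute(
    """
    SELECT subject, book_id
    FROM table1
    WHERE group_id = ?
    GROUP BY subject, book_id
    HAVING COUNT(*) > 1
    ORDER BY book_id
    """,
    (3,),
).fetchall()

print(duplicates)  # [('history', 1), ('art', 2)]
```

The sport row is excluded twice over: it belongs to group 4, and it occurs only once.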

Related

Using partition by to get count in Oracle

I have an EMP table. I need to get the number of employees in each department, broken down by country name ('INDIA', 'USA', 'AUSTRALIA').
For example,
DEPARTMENT | #EMPLOYEE(INDIA) | #EMPLOYEE(USA) | # EMPLOYEE(AUSTRALIA)
ACCOUNTING | 5 |2 | 3
IT | 5 |2 | 1
BUSINESS | 1 |4 | 3
I need to use PARTITION BY to do it. I can use PARTITION BY to get the total count of employees for each department, but I am not able to subgroup by country name.
Please give me suggestions.
Thank you.

Consider a conditional count:
SELECT DEPARTMENT,
COUNT(CASE WHEN Country = 'INDIA' THEN 1 END) as emp_india,
COUNT(CASE WHEN Country = 'USA' THEN 1 END) as emp_usa,
COUNT(CASE WHEN Country = 'AUSTRALIA' THEN 1 END) as emp_australia
FROM EMP
GROUP BY DEPARTMENT
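The conditional count works because COUNT ignores NULLs and a CASE without an ELSE branch yields NULL for non-matching rows. A minimal sketch of that behaviour, run against SQLite from Python (an assumption for illustration only, not Oracle). The emp rows are generated from the counts in the question's example table, and the column names department and country are assumed from the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (department TEXT, country TEXT)")

# Counts copied from the example in the question; one row per employee.
example_counts = {
    "ACCOUNTING": {"INDIA": 5, "USA": 2, "AUSTRALIA": 3},
    "IT": {"INDIA": 5, "USA": 2, "AUSTRALIA": 1},
    "BUSINESS": {"INDIA": 1, "USA": 4, "AUSTRALIA": 3},
}
for department, by_country in example_counts.items():
    for country, n in by_country.items():
        conn.executemany(
            "INSERT INTO emp VALUES (?, ?)", [(department, country)] * n
        )

# CASE returns NULL when the country does not match, and COUNT skips NULLs,
# so each column counts only the employees of one country.
rows = conn.execute(
    """
    SELECT department,
           COUNT(CASE WHEN country = 'INDIA' THEN 1 END) AS emp_india,
           COUNT(CASE WHEN country = 'USA' THEN 1 END) AS emp_usa,
           COUNT(CASE WHEN country = 'AUSTRALIA' THEN 1 END) AS emp_australia
    FROM emp
    GROUP BY department
    ORDER BY department
    """
).fetchall()

for row in rows:
    print(row)
```

The result has one row per department with three count columns, which is the pivoted layout the question asks for.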

Join two tables in HIVE with sub query

I need to get the cost of an item at a certain date and time. I have these two tables:
create table sales ( product_id int, items_sold int, date_loaded date );
create table product ( product_id int, description string, item_cost double, date_loaded date );
The product table is a history of each item. If the cost of an item today is $1.00 but the cost of that item yesterday was $0.99 I would have two records one for each day. When I load my sales data I need to reflect the cost of the item yesterday and not today's cost.
Here is the query I am trying to execute:
SELECT s.product_id, s.items_sold, p.description, s.items_sold * p.item_cost as total_cost FROM sales s, product p
WHERE
p.product_id = s.product_id and
p.date_loaded <= (
SELECT MAX(pp.date_loaded)
FROM product pp
WHERE
pp.product_id = s.product_id and
pp.date_loaded <= s.date_loaded
)
SALES TABLE:
|PRODUCT_ID |ITEMS_SOLD |DATE_LOADED |
|1 |4 |2016-06-30 |
|1 |5 |2016-07-01 |
|1 |6 |2016-07-02 |
|1 |3 |2016-07-03 |
PRODUCT TABLE:
|PRODUCT_ID |DESCRIPTION |ITEM_COST |DATE_LOADED |
|1 |ITEM A |0.99 |2016-06-20 |
|1 |ITEM A |1.00 |2016-07-02 |
I would expect to see this result:
|PRODUCT_ID |ITEMS_SOLD |DESCRIPTION |ITEM_COST |TOTAL_COST |
|1 |4 |ITEM A |0.99 |3.96 |
|1 |5 |ITEM A |0.99 |4.95 |
|1 |6 |ITEM A |1.00 |6.00 |
|1 |3 |ITEM A |1.00 |3.00 |
From everything I have read this form of a sub query is not allowed. So how can I accomplish this in HIVE?

It can be accomplished with a CTE and the LEAD window function:
WITH result AS (
SELECT product_id, description, item_cost, date_loaded,
-- fromdate is the date on which the next cost record for the same product takes effect
LEAD(date_loaded, 1, '2999-01-01')
OVER (PARTITION BY product_id ORDER BY date_loaded) AS fromdate
FROM product )
SELECT s.product_id, s.items_sold, p.description, p.item_cost, s.items_sold * p.item_cost AS total_cost
FROM sales s JOIN result p ON s.product_id = p.product_id
WHERE s.date_loaded >= p.date_loaded AND s.date_loaded < p.fromdate;
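The idea is to turn each cost record into a validity interval that ends where the next record for the same product begins, then join each sale to the interval containing its date. A minimal sketch of that idea on the question's sample data, run against SQLite from Python rather than Hive (an assumption for illustration; it needs SQLite 3.25 or newer for window functions). Dates are stored as ISO strings, the column name next_date is hypothetical, and the window is partitioned by product_id so that intervals never span different products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (product_id INTEGER, items_sold INTEGER, date_loaded TEXT);
    CREATE TABLE product (product_id INTEGER, description TEXT,
                          item_cost REAL, date_loaded TEXT);
    INSERT INTO sales VALUES
        (1, 4, '2016-06-30'), (1, 5, '2016-07-01'),
        (1, 6, '2016-07-02'), (1, 3, '2016-07-03');
    INSERT INTO product VALUES
        (1, 'ITEM A', 0.99, '2016-06-20'),
        (1, 'ITEM A', 1.00, '2016-07-02');
    """
)

# Each cost record is valid from its own date_loaded up to (not including)
# the next record's date_loaded; the last record is open-ended.
rows = conn.execute(
    """
    WITH result AS (
        SELECT product_id, description, item_cost, date_loaded,
               LEAD(date_loaded, 1, '2999-01-01')
                   OVER (PARTITION BY product_id ORDER BY date_loaded) AS next_date
        FROM product
    )
    SELECT s.product_id, s.items_sold, p.item_cost,
           s.items_sold * p.item_cost AS total_cost
    FROM sales s
    JOIN result p ON s.product_id = p.product_id
    WHERE s.date_loaded >= p.date_loaded
      AND s.date_loaded < p.next_date
    ORDER BY s.date_loaded
    """
).fetchall()

for row in rows:
    print(row)
```

The first two sales pick up the 0.99 cost and the last two pick up 1.00, matching the expected result in the question.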

Oracle SQL group by cube with distinct ID

Example data (complete table has more columns and millions of rows):
invoice_number |year |department |euros
-------------------------------------------------------------
1234 |2010 |1 | 200
1234 |2011 |1 | 200
1234 |2011 |2 | 200
4567 |2010 |1 | 450
4567 |2010 |2 | 450
4567 |2010 |3 | 450
My Objective:
I want to sum the euros for every year and every department in every possible combination.
How result should look:
year |department |euros
--------------------------------------------
2010 |1 |650
2010 |2 |450
2010 |3 |450
2010 |(null) |650
2011 |1 |200
2011 |2 |200
(null) |1 |650
(null) |2 |650
(null) |3 |450
(null) |(null) |650
My query:
select year
, department
, sum(euros)
from table1
group by cube (
year
, department
)
Problem:
One invoice number can occur in several categories. For example, one invoice can have items from 2010 and 2011. This is no problem when I want to show the data per year. However, when I want the total over all years the euros will be summed twice, one time for each year. I want the functionality of 'group by cube' but I want to sum only distinct invoice numbers for aggregations.
Problem table:
year |department |euros
--------------------------------------------
2010 |1 |650
2010 |2 |450
2010 |3 |450
2010 |(null) |1550
2011 |1 |200
2011 |2 |200
(null) |1 |850
(null) |2 |650
(null) |3 |450
(null) |(null) |1950
Is it possible to do what I want? So far my search has yielded no results. I have created a SQL Fiddle, I hope it works

Here is a rather ugly solution, but it seems to work, even when two invoices have the same amount. It accesses the table twice, so you should check whether performance is acceptable.
with table1_cubed as
( select year
, department
, grouping_id(year,department) gid
from table1
group by cube(year,department)
)
, join_distinct_invoices as
( select distinct x.*
, r.invoice_number
, r.euros
from table1_cubed x
inner join table1 r on (nvl(x.year,r.year) = r.year and nvl(x.department,r.department) = r.department)
)
select year
, department
, sum(euros)
from join_distinct_invoices
group by year
, department
, gid
order by year
, department
/
YEAR |DEPARTMENT |SUM(EUROS)
--------------------------------------------
2010 |1 |650
2010 |2 |450
2010 |3 |450
2010 |(null) |650
2011 |1 |200
2011 |2 |200
2011 |(null) |200
(null) |1 |650
(null) |2 |650
(null) |3 |450
(null) |(null) |650
11 rows selected.
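To make the intended semantics concrete (each invoice counted once per cube cell), here is a small sketch in plain Python, not Oracle SQL. The function name cube_distinct_sum is hypothetical, and the sketch assumes, as in the sample data, that euros is the same on every row of a given invoice.

```python
from itertools import product

# (invoice_number, year, department, euros) rows from the question.
rows = [
    (1234, 2010, 1, 200),
    (1234, 2011, 1, 200),
    (1234, 2011, 2, 200),
    (4567, 2010, 1, 450),
    (4567, 2010, 2, 450),
    (4567, 2010, 3, 450),
]


def cube_distinct_sum(rows):
    """Sum euros per (year, department) cube cell, counting each invoice once per cell."""
    cells = {}  # (year or None, department or None) -> {invoice_number: euros}
    for invoice, year, department, euros in rows:
        # Every row falls into four cells: detail, per-year subtotal,
        # per-department subtotal, and grand total (None marks "all").
        for y, d in product((year, None), (department, None)):
            cells.setdefault((y, d), {})[invoice] = euros
    return {cell: sum(invoices.values()) for cell, invoices in cells.items()}


totals = cube_distinct_sum(rows)
print(totals[(2010, None)])  # 650
print(totals[(None, 1)])     # 650
print(totals[(None, None)])  # 650
```

Keying each cell by invoice number is what prevents invoice 1234 from being added once for 2010 and again for 2011 in the totals over all years.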

select year
,department
,case when GROUPING_id(year,department) in (3) then sum(dist_euro) else sum(euros) end sums
,decode(GROUPING_id(year,department),0,'NO GROUP',1,'DEPARTMENT IS NULL',2,'YEAR IS NULL',3,'TOTAL OVER ALL YEARS') info
from (
select year
, department
, euros
,case when row_number() over(partition by year order by year) = 1 then euros else 0 end dist_euro
from table1)
group by cube (
year
, department
)
order by GROUPING_id(year,department)

Recursive sum of values in an hierarchical table in Oracle 10g

Assuming I have this table:
CREATE TABLE MY_EXAMPLE ( ID NUMBER , PARENT NUMBER , VALUE NUMBER );
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (1,null,100);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (2,1,50);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (3,null,0);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (4,2,1000);
Insert into MY_EXAMPLE (ID,PARENT,VALUE) values (5,1,1);
|id |parent |value |
|1 |null |100 |
|2 |1 |50 |
|3 |null |0 |
|4 |2 |1000 |
|5 |1 |1 |
I need to create a view (which should perform well) with the same number of rows, where each row's value is its own value plus the sum of the values of all its descendants. Many levels are possible, as well as many children per row.
|id |parent |value |
|1 |null |1151 | (sum of 1 + 2 + 4 + 5)
|2 |1 |1050 | (sum of 2 + 4)
|3 |null |0 | (only 3 because has no children)
|4 |2 |1000 | (only 4 because has no children)
|5 |1 |1 | (only 5 because has no children)
ps.: I tried something like this but it didn't work in Oracle 10g first because the keyword RECURSIVE is not supported and second because it won't allow recursive WITH ("forward or recursive reference of a query name in WITH clause is not allowed").
Also I couldn't figure out a way to do it with CONNECT BY that includes the id and parent columns and gives me the whole table (in my attempts I always had to use START WITH).

You will have to create a recursive function:
CREATE FUNCTION RECURSIVE_ADD(
ROOT_ID IN NUMBER)
RETURN NUMBER
IS
TOTAL NUMBER;
BEGIN
SELECT SUM(VALUE)
INTO TOTAL
FROM (
(
SELECT VALUE FROM MY_EXAMPLE WHERE ID = ROOT_ID
)
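-- UNION ALL, not UNION: plain UNION removes duplicate values and would undercount equal amounts
UNION ALL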
(
SELECT recursive_add(id) FROM my_example WHERE parent = root_id
));
RETURN total;
END;
select id, parent, value, RECURSIVE_ADD(id) from my_example;
Make sure you don't have a cycle in your data (for example, if you set the parent of 1 to 2) otherwise this will never terminate. There are other ways to do this in newer versions of Oracle, but this will work in 10g.
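As an illustration of one of the "other ways" available on engines that do support recursive WITH, here is a sketch using a recursive CTE. It runs against SQLite from Python, not Oracle 10g (where, as the question notes, recursive WITH is not available), so treat it as a way to check the expected totals rather than a 10g solution. The CTE name tree and its column root_id are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE my_example (id INTEGER, parent INTEGER, value INTEGER);
    INSERT INTO my_example VALUES
        (1, NULL, 100), (2, 1, 50), (3, NULL, 0), (4, 2, 1000), (5, 1, 1);
    """
)

# tree pairs every row (root_id) with itself and with each of its descendants;
# summing value per root_id gives the row's own value plus its whole subtree.
totals = conn.execute(
    """
    WITH RECURSIVE tree(root_id, id, value) AS (
        SELECT id, id, value FROM my_example
        UNION ALL
        SELECT t.root_id, c.id, c.value
        FROM tree t
        JOIN my_example c ON c.parent = t.id
    )
    SELECT e.id, e.parent, SUM(t.value) AS total
    FROM my_example e
    JOIN tree t ON t.root_id = e.id
    GROUP BY e.id, e.parent
    ORDER BY e.id
    """
).fetchall()

for row in totals:
    print(row)
```

As with the recursive function, this does not terminate if the data contains a cycle.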

Complex SQL query to join two tables

Problem:
Given two tables, TableA and TableB, where TableA has a one-to-many relationship with TableB, I want to retrieve, for every record in TableA, the TableB value that matches the search criteria on a certain TableB column, and return NULL for the TableA records that have no matching attribute.
Table Structures:
Table A
ID(Primary Key) | Name | City
1 | ABX | San Francisco
2 | ASDF | Oakland
3 | FDFD | New York
4 | GFGF | Austin
5 | GFFFF | San Francisco
Table B
ATTR_ID |Attr_Type | Attr_Name | Attr_Value
1 | TableA | Attr_1 | Attr_Value_1
2 | TableD | Attr_1 | Attr_Value_2
1 | TableA | Attr_2 | Attr_Value_3
3 | TableA | Attr_4 | Attr_Value_4
9 | TableC | Attr_2 | Attr_Value_5
Table B holds attribute names and values and is a common table used across multiple tables. Each owning table is identified by Attr_Type, and ATTR_ID maps to the ID of a row in that table.
For instance, the record in Table A with ID 1 has two attributes in Table B with Attr_Names: Attr_1 and Attr_2 and so on.
Expected Output
ID | Name | City | TableB.Attr_Value
1 | ABX | San Francisco | Attr_Value_1
2 | ASDF | Oakland | Attr_Value_2
3 | FDFD | New York | NULL
4 | GFGF | Austin | NULL
5 | GFFFF | San Francisco | NULL
Search Criteria:
Get rows from Table B for each record in Table A with ATTR_NAME Attr_1. If a particular TableA record doesn't have Attr_1, return null.
My Query
select id, name, city,
b.attr_value from table_A
join table_B b on
table_A.id =b.attr_id and b.attr_name='Attr_1'

This is a strange data structure. You need a left outer join with the conditions in the on clause:
select a.id, a.name, a.city, b.attr_value
from table_A a left join
table_B b
on a.id = b.attr_id and b.attr_name = 'Attr_1' and b.attr_type = 'TableA';
I added the attr_type condition because that seems logical with this data structure.
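To see why the conditions belong in the ON clause, here is a minimal sketch on the question's sample data, run against SQLite from Python (an assumption for illustration only). It compares the left join above with a variant that moves the same filters into WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_A (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE table_B (attr_id INTEGER, attr_type TEXT,
                          attr_name TEXT, attr_value TEXT);
    INSERT INTO table_A VALUES
        (1, 'ABX', 'San Francisco'), (2, 'ASDF', 'Oakland'),
        (3, 'FDFD', 'New York'), (4, 'GFGF', 'Austin'),
        (5, 'GFFFF', 'San Francisco');
    INSERT INTO table_B VALUES
        (1, 'TableA', 'Attr_1', 'Attr_Value_1'),
        (2, 'TableD', 'Attr_1', 'Attr_Value_2'),
        (1, 'TableA', 'Attr_2', 'Attr_Value_3'),
        (3, 'TableA', 'Attr_4', 'Attr_Value_4'),
        (9, 'TableC', 'Attr_2', 'Attr_Value_5');
    """
)

# Filters in ON: every table_A row survives; unmatched rows get NULL.
on_rows = conn.execute(
    """
    SELECT a.id, a.name, a.city, b.attr_value
    FROM table_A a
    LEFT JOIN table_B b
      ON a.id = b.attr_id AND b.attr_name = 'Attr_1' AND b.attr_type = 'TableA'
    ORDER BY a.id
    """
).fetchall()

# Same filters in WHERE: rows whose b columns are NULL are filtered out,
# so the left join effectively becomes an inner join.
where_rows = conn.execute(
    """
    SELECT a.id, a.name, a.city, b.attr_value
    FROM table_A a
    LEFT JOIN table_B b ON a.id = b.attr_id
    WHERE b.attr_name = 'Attr_1' AND b.attr_type = 'TableA'
    ORDER BY a.id
    """
).fetchall()

print(len(on_rows), len(where_rows))  # 5 1
```

Note that with the attr_type condition, ID 2 gets NULL, because its Attr_1 row in the sample data belongs to TableD. Dropping that condition returns Attr_Value_2 for ID 2, as in the asker's expected output.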

I don't have a SQL Server instance to test this, but what you want is an outer join rather than an inner join. The old *= outer-join operator is deprecated, so use an explicit left join instead. You could do something like this:
select table_A.id, name, city,
b.attr_value from table_A
left join table_B b on
table_A.id = b.attr_id and b.attr_name = 'Attr_1'
Something like this should do the trick for you.
