I have the following tables:
menu_items:
item_id,
item_name,
price,
sales:
item_id,
customer_id,
employee_id,
date
I am attempting to join the tables on item_id. I want to display the item_name, the number of times each item was sold, and the date, grouped by date. How should I adjust the code below to make the query work?
select item_name, count(item_name), date
from menu_items join sales
on item_id = item_id
group by date
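As you probably found out, it won't work: every non-aggregated column in the select list must also appear in the GROUP BY clause. Also, you should always use table aliases; without them, on item_id = item_id is ambiguous. I'm calling the date column date_col here, since date is a reserved word in some databases (Oracle, for example).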
select s.date_col,
i.item_name,
count(*) number_of_items_sold
from menu_items i join sales s on s.item_id = i.item_id
group by s.date_col, i.item_name
order by s.date_col, i.item_name;
If that is not what you wanted, please post some sample data and the desired output; it would make the question easier to answer.
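To see what the grouped join returns, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and SQLite is only a stand-in for whatever database you are actually using.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption: not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_items (item_id INTEGER, item_name TEXT, price REAL);
    CREATE TABLE sales (item_id INTEGER, customer_id INTEGER,
                        employee_id INTEGER, date_col TEXT);
    INSERT INTO menu_items VALUES (1, 'coffee', 2.5), (2, 'bagel', 3.0);
    INSERT INTO sales VALUES
        (1, 10, 100, '2024-01-01'),
        (1, 11, 100, '2024-01-01'),
        (2, 10, 101, '2024-01-01'),
        (1, 12, 100, '2024-01-02');
""")

# Every non-aggregated column in the select list also appears in GROUP BY.
rows = conn.execute("""
    SELECT s.date_col, i.item_name, COUNT(*) AS number_of_items_sold
    FROM menu_items i
    JOIN sales s ON s.item_id = i.item_id
    GROUP BY s.date_col, i.item_name
    ORDER BY s.date_col, i.item_name
""").fetchall()

for row in rows:
    print(row)
```

You get one row per (date, item) pair, so an item sold on two different days shows up twice, once for each day.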
I have a Hive table with around 80 columns. I need to apply the distinct clause to some of the columns and also get the first values from the other columns. Below is a representation of what I am trying to achieve.
select distinct(col1,col2,col3),col5,col6,col7
from abc where col1 = 'something';
All the columns mentioned above are text columns. So I cannot apply group by and aggregate functions.
You can use the row_number function to solve the problem.
create table temp as
select *, row_number() over (partition by col1,col2,col3) as rn
from abc
where col1 = 'something';
select *
from temp
where rn=1
You can also sort the table while partitioning.
row_number() over (partition by col1,col2,col3 order by col4 asc) as rn
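Here is a small sketch of the same idea that you can run locally. It uses Python's sqlite3 module instead of Hive (an assumption made purely so the example is self-contained; it needs SQLite 3.25 or later for window functions), and the sample rows and the inline subquery in place of the temp table are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE abc (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT, col5 TEXT);
    INSERT INTO abc VALUES
        ('something', 'a', 'x', '2', 'second'),
        ('something', 'a', 'x', '1', 'first'),
        ('something', 'b', 'y', '1', 'only'),
        ('other',     'a', 'x', '1', 'filtered out');
""")

# Number the rows inside each (col1, col2, col3) group, ordered by col4,
# then keep only the first row of every group.
dedup_rows = conn.execute("""
    SELECT col1, col2, col3, col4, col5
    FROM (
        SELECT abc.*,
               ROW_NUMBER() OVER (PARTITION BY col1, col2, col3
                                  ORDER BY col4 ASC) AS rn
        FROM abc
        WHERE col1 = 'something'
    ) AS numbered
    WHERE rn = 1
    ORDER BY col2
""").fetchall()

for row in dedup_rows:
    print(row)
```

With the order by in place, "first" means the row with the smallest col4 in each group; without it, which row gets rn = 1 is up to the engine.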
DISTINCT is one of the most overused and least understood keywords in SQL, and it is not a function. It is applied to the entire result set and removes duplicates using ALL columns in your select, so wrapping a few columns in parentheses does not limit it to those columns. You can GROUP BY a string column; in fact, that is the answer here:
SELECT col1,col2,col3,COLLECT_SET(col4),COLLECT_SET(col5),COLLECT_SET(col6)
FROM abc WHERE col1 = 'something'
GROUP BY col1,col2,col3;
Now that I re-read your question though, I'm not really sure what you are after. You might have to join the table to an aggregate of itself.
Is it possible to select only distinct combinations of multiple columns?
E.g. only the distinct combinations of customers and the dates they placed orders (as a representation of only days they placed orders)?
What you’re looking for are groups of data (which shows you only distinct combinations of values), which you can return with the GROUP BY clause.
SELECT customer_id, date
FROM orders
GROUP BY customer_id, date;
SELECT DISTINCT
customerId, orderDate
FROM
table;
OR
SELECT DISTINCTROW
customerId, orderDate
FROM
table;
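As a quick check that GROUP BY and SELECT DISTINCT give the same combinations, here is a sketch using Python's sqlite3 module. The orders table and its rows are invented for the example, and DISTINCTROW is left out because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 7, '2024-03-01'),
        (2, 7, '2024-03-01'),
        (3, 7, '2024-03-02'),
        (4, 8, '2024-03-01');
""")

# Distinct (customer, date) combinations via GROUP BY.
by_group = conn.execute("""
    SELECT customer_id, order_date
    FROM orders
    GROUP BY customer_id, order_date
    ORDER BY customer_id, order_date
""").fetchall()

# The same combinations via SELECT DISTINCT.
by_distinct = conn.execute("""
    SELECT DISTINCT customer_id, order_date
    FROM orders
    ORDER BY customer_id, order_date
""").fetchall()

print(by_group)
print(by_distinct)
```

Customer 7 placed two orders on 2024-03-01, but that day appears only once in either result.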
I have a select statement that gets user_id and a list of transactions for the day such as this:
select user_id, sale_amount, date, product from transactions
I want to be able to select each user_id (there are many) along with their top sale_amount, date and product. If there is a tie, I want it to just select one. How is this possible? Rownum or rank seem to be close but not quite there?
I'm not in front of a computer, but this should work. Let me know.
select *
from (select user_id, sale_amount, date, product,
             row_number() over (partition by user_id order by sale_amount desc) as maxsale
      from transactions) l
where maxsale = 1
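If you want to try the row_number approach before running it on real data, here is a minimal sketch with Python's sqlite3 module (SQLite 3.25+ for window functions). The sample rows are made up, and I renamed the date column to sale_date just to stay clear of reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (user_id INTEGER, sale_amount INTEGER,
                               sale_date TEXT, product TEXT);
    INSERT INTO transactions VALUES
        (1, 50, '2024-06-01', 'widget'),
        (1, 50, '2024-06-01', 'gadget'),   -- tie with the row above
        (1, 20, '2024-06-01', 'gizmo'),
        (2, 30, '2024-06-01', 'widget');
""")

# Rank each user's sales from highest to lowest and keep only rank 1.
top_sales = conn.execute("""
    SELECT user_id, sale_amount, sale_date, product
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY sale_amount DESC) AS maxsale
        FROM transactions t
    ) AS ranked
    WHERE maxsale = 1
    ORDER BY user_id
""").fetchall()

for row in top_sales:
    print(row)
```

User 1 has two sales tied at 50, and row_number still returns exactly one of them; which one is not defined unless you add more columns to the order by.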
I'm using this query:
SELECT *
FROM HISTORY
LEFT JOIN CUSTOMER ON CUSTOMER.CUST_NUMBER = HISTORY.CUST_NUMBER
LEFT JOIN (
Select LOAN_DATE, CUST_NUMBER, ACCOUNT_NUMBER, STOCK_NUMBER, LOC_SALE
From LOAN
WHERE ACCOUNT_NUMBER != 'DD'
ORDER BY LOAN_DATE DESC
) LOAN ON LOAN.CUST_NUMBER = HISTORY.CUST_NUMBER
order by DATE desc
But I want only the top result from the loan table to be joined (Most recent by Loan_date). For some reason, it's getting three records (one for each loan on the customer I'm looking at). I'm sure I'm missing something simple?
If you're after joining the latest loan row per cust_number, then this ought to do the trick:
select *
from history
left join customer on customer.cust_number = history.cust_number
left join (select loan_date,
cust_number,
account_number,
stock_number,
loc_sale
from (select loan_date,
cust_number,
account_number,
stock_number,
loc_sale,
row_number() over (partition by cust_number
order by loan_date desc) rn
from loan
where account_number != 'DD')
where rn = 1) loan on loan.cust_number = history.cust_number
order by date desc;
If there are two rows with the same loan_date per cust_number and you want to retrieve both, then change the row_number() analytic function for rank().
If you only want to retrieve one row, then you need to add additional columns to the order by so that tied rows always sort in the same order; otherwise you could find that you sometimes get different rows returned on subsequent runs of the query.
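The difference between the two functions on tied dates can be sketched like this, using Python's sqlite3 module (SQLite 3.25+) as a stand-in database. The loan rows and the choice of account_number as the tie-breaker column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan (cust_number TEXT, loan_date TEXT, account_number TEXT);
    INSERT INTO loan VALUES
        ('C1', '2024-05-01', 'A1'),
        ('C1', '2024-05-01', 'A2'),   -- same latest date as A1
        ('C1', '2024-01-01', 'A3'),
        ('C2', '2024-02-01', 'B1');
""")

def latest_loans(window_expr):
    """Return (cust_number, account_number) rows where window_expr equals 1."""
    sql = f"""
        SELECT cust_number, account_number
        FROM (SELECT cust_number, account_number,
                     {window_expr} AS rn
              FROM loan) AS ranked
        WHERE rn = 1
        ORDER BY cust_number, account_number
    """
    return conn.execute(sql).fetchall()

# rank() gives tied rows the same number, so both latest loans come back.
with_rank = latest_loans(
    "RANK() OVER (PARTITION BY cust_number ORDER BY loan_date DESC)")

# row_number() with a tie-breaker column returns one row, always the same one.
with_row_number = latest_loans(
    "ROW_NUMBER() OVER (PARTITION BY cust_number "
    "ORDER BY loan_date DESC, account_number ASC)")

print(with_rank)
print(with_row_number)
```

Any column, or combination of columns, that is unique within a customer's loans will do as the tie-breaker.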
I have a table of accounts and a table of transactions. In a report I need to show the following for each account:
First Purchase Date,
First Purchase Amount,
Last Purchase Date,
Last Purchase Amount,
# of Purchases,
Total of All Purchases.
The transaction table looks like this:
TX_UID
Card_Number
Post_Date
TX_Type
TX_Amount
Currently the query I've inherited has a sub-query for each of these elements. It seems to me that there's got to be a more efficient way. I'm able to use a stored procedure for this and not a single query.
A sample of a query to get all transactions for a single account would be:
select * from tx_table where card_number = '12345' and TX_Type = 'Purchase'
Any ideas?
Try this:
select tt1.post_date as first_purchase_date,
tt1.tx_amount as first_purchase_amount,
tt2.post_date as last_purchase_date,
tt2.tx_amount as last_purchase_amount,
tg.pc as purchase_count,
tg.amount as Total
from (select Card_Number,min(post_date) as mipd, max(post_date) as mxpd, count(*) as pc, sum(TX_Amount) as Amount from tx_table where TX_Type = 'Purchase' group by card_number) tg
join tx_table tt1 on tg.card_number=tt1.card_number and tg.mipd=tt1.post_date
join tx_table tt2 on tg.card_number=tt2.card_number and tg.mxpd=tt2.post_date
where tt1.TX_Type = 'Purchase' and tt2.TX_Type = 'Purchase'
I added the count; I didn't see it the first time.
If you also need the summary for multiple TX_Types, remove the TX_Type filter from the where clauses, add TX_Type to the group by of the inner select, and include it in the join conditions. But I guess you only need it for purchases.
;with cte as
(
select
Card_Number,
TX_Type,
Post_Date,
TX_Amount,
row_number() over(partition by TX_Type, Card_Number order by Post_Date asc) as FirstP,
row_number() over(partition by TX_Type, Card_Number order by Post_Date desc) as LastP
from tx_table
)
select
F.Post_Date as "First Purchase Date",
F.TX_Amount as "First Purchase Amount",
L.Post_Date as "Last Purchase Date",
L.TX_Amount as "Last Purchase Amount",
C.CC as "# of Purchases",
C.Amount as "Total of All Purchases"
from (select Card_Number, TX_Type, count(*) as CC, sum(TX_Amount) as Amount
from cte
group by Card_Number, TX_Type) as C
inner join cte as F
on C.Card_Number = F.Card_Number and
C.TX_Type = F.TX_Type and
F.FirstP = 1
inner join cte as L
on C.Card_Number = L.Card_Number and
C.TX_Type = L.TX_Type and
L.LastP = 1
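To check the row_number approach above against a few rows, here is a sketch using Python's sqlite3 module (SQLite 3.25+). It is a simplified variant, not the exact query: it filters to purchases inside the CTE instead of partitioning by TX_Type, it returns Card_Number so the rows can be told apart, and the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tx_table (TX_UID INTEGER, Card_Number TEXT, Post_Date TEXT,
                           TX_Type TEXT, TX_Amount INTEGER);
    INSERT INTO tx_table VALUES
        (1, '12345', '2024-01-05', 'Purchase', 10),
        (2, '12345', '2024-02-10', 'Purchase', 25),
        (3, '12345', '2024-03-15', 'Purchase', 40),
        (4, '12345', '2024-04-01', 'Refund',    5),
        (5, '67890', '2024-02-01', 'Purchase', 15);
""")

# Number each card's purchases from both ends, then join the first and last
# rows back to the per-card count and total.
summary = conn.execute("""
    WITH cte AS (
        SELECT Card_Number, Post_Date, TX_Amount,
               ROW_NUMBER() OVER (PARTITION BY Card_Number
                                  ORDER BY Post_Date ASC)  AS FirstP,
               ROW_NUMBER() OVER (PARTITION BY Card_Number
                                  ORDER BY Post_Date DESC) AS LastP
        FROM tx_table
        WHERE TX_Type = 'Purchase'
    )
    SELECT C.Card_Number,
           F.Post_Date, F.TX_Amount,
           L.Post_Date, L.TX_Amount,
           C.CC, C.Amount
    FROM (SELECT Card_Number, COUNT(*) AS CC, SUM(TX_Amount) AS Amount
          FROM cte
          GROUP BY Card_Number) AS C
    JOIN cte AS F ON F.Card_Number = C.Card_Number AND F.FirstP = 1
    JOIN cte AS L ON L.Card_Number = C.Card_Number AND L.LastP = 1
    ORDER BY C.Card_Number
""").fetchall()

for row in summary:
    print(row)
```

A card with a single purchase gets the same row as both its first and its last purchase, and the refund is ignored because of the filter in the CTE.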