ActiveRecord query help: finding the latest record from each group

So I have an orders table that looks like this:
Where ledger_id is a uuid and version is a timestamp.
There could be many orders per ledger_id. This is a denormalized table btw, used to keep track of orders and their progression through processing FWIW.
If a couple ledger_ids come in and we want the latest order for each ledger_id, what's the ActiveRecord query that will get us this?
I feel like I'm close. I have this:
orders = Order.where(ledger_id: ledger_ids).group(:ledger_id, :id).having('version = MIN(version)').first
where ledger_ids is an array of 1 or more ledger uuids.
But this gives us an error:
:StatementInvalid: PG::GroupingError: ERROR: column "orders.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "orders".* FROM "orders" WHERE "orders"."ledger_id" ...
^
: SELECT "orders".* FROM "orders" WHERE "orders"."ledger_id" = $1 GROUP BY "orders"."ledger_id" ORDER BY "orders"."id" ASC LIMIT $2
Anyone know of a solution?
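One possible approach, offered as a sketch rather than as an answer from the thread: compute the newest version per ledger_id in a grouped subquery, then join that back to orders to recover the full rows. The example below uses Python's built-in sqlite3 module with an invented orders table (the real table is not shown in the question, so the status column and all rows are assumptions). The SQL is plain enough that it should carry over to PostgreSQL, where it could be handed to ActiveRecord through find_by_sql.

```python
import sqlite3

# In-memory database with a hypothetical, simplified orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        ledger_id TEXT,
        version TEXT,   -- timestamp stored as ISO-8601 text for this sketch
        status TEXT     -- assumed column, not from the question
    );
    INSERT INTO orders (ledger_id, version, status) VALUES
        ('aaa', '2020-01-01 10:00:00', 'received'),
        ('aaa', '2020-01-01 11:00:00', 'shipped'),
        ('bbb', '2020-01-02 09:00:00', 'received'),
        ('bbb', '2020-01-02 09:30:00', 'packed'),
        ('ccc', '2020-01-03 08:00:00', 'received');
""")

ledger_ids = ["aaa", "bbb"]
placeholders = ", ".join("?" for _ in ledger_ids)

# Inner query: newest version per requested ledger.
# Outer join: pick the full order row that carries that version.
sql = f"""
    SELECT o.id, o.ledger_id, o.version, o.status
    FROM orders o
    JOIN (
        SELECT ledger_id, MAX(version) AS latest_version
        FROM orders
        WHERE ledger_id IN ({placeholders})
        GROUP BY ledger_id
    ) latest
      ON latest.ledger_id = o.ledger_id
     AND latest.latest_version = o.version
    ORDER BY o.ledger_id
"""
latest_orders = conn.execute(sql, ledger_ids).fetchall()
print(latest_orders)
```

If two orders for the same ledger share a version timestamp, this returns both of them. On PostgreSQL, SELECT DISTINCT ON (ledger_id) combined with ORDER BY ledger_id, version DESC is another option that keeps exactly one row per ledger.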

Related

Nested Subquery Limitations in Oracle

So, I have been doing a fair amount of reading on this in various forums and resource sites but have not yet found a solution I believe applies to my case. Also, I can't believe how difficult this is proving to be; I would think this kind of query would be fairly common.
Essentially what I am doing here is querying two historical tables (tbl_b and tbl_c), via union, for a specific milestone date - for which there may be multiple results... I then wish to find the most recent of these results, using max. This date is then returned as a column in the main query.
My problem is that, in the 3rd tier subquery, I need to reference an identifier value from the table in the top query (tbl_a).
I know that a correlated subquery can only reference its immediate parent query, so I am stuck.
Edit 1
The target date I am searching for will most likely, but not necessarily, be unique within the result set. It is a timestamp of the data record. I am looking for the most recent entry in the history that correlates to each column in tbl_a. Creating an SQL Fiddle for this.
See sample below:
select tbl_a.col_a,
       tbl_a.col_b,
       (
         select max(target_date)
         from
         (
           select tbl_b.target_date
           from tbl_b
           where tbl_b.tbl_a_id = tbl_a.id and
                 tbl_b.flag = 1 and
                 tbl_b.milestone_id = tbl_a.milestone_id
           union
           select tbl_c.target_date
           from tbl_c
           where tbl_c.tbl_a_id = tbl_a.id and
                 tbl_c.flag = 1 and
                 tbl_c.milestone_id = tbl_a.milestone_id
         ) most_recent_target_date
       )
from tbl_a
Convert this query to a join, in this way:
select tbl_a.col_a,
       tbl_a.col_b,
       max(most_recent_target_date.target_date)
from tbl_a
join (
  select tbl_b.target_date, tbl_b.date_id
  from tbl_b
  where tbl_b.flag = 1
  union all
  select tbl_c.target_date, tbl_c.date_id
  from tbl_c
  where tbl_c.flag = 1
) most_recent_target_date
  ON tbl_a.date_id = most_recent_target_date.date_id
GROUP BY tbl_a.col_a,
         tbl_a.col_b
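The join form can be tried out on toy data. The sketch below uses Python's sqlite3 module as a stand-in for Oracle. It joins on tbl_a_id and milestone_id (the columns used in the question) rather than date_id, which is an assumption about the real schema. It also uses a LEFT JOIN so that tbl_a rows with no flagged history still appear with a NULL date, as they would with the original scalar subquery. All sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_a (id INTEGER, milestone_id INTEGER, col_a TEXT, col_b TEXT);
    CREATE TABLE tbl_b (tbl_a_id INTEGER, milestone_id INTEGER, flag INTEGER, target_date TEXT);
    CREATE TABLE tbl_c (tbl_a_id INTEGER, milestone_id INTEGER, flag INTEGER, target_date TEXT);

    INSERT INTO tbl_a VALUES (1, 10, 'a1', 'b1'), (2, 20, 'a2', 'b2');
    -- second tbl_b row is unflagged, second tbl_c row is for another milestone
    INSERT INTO tbl_b VALUES (1, 10, 1, '2015-01-05'), (1, 10, 0, '2015-09-09');
    INSERT INTO tbl_c VALUES (1, 10, 1, '2015-03-01'), (1, 99, 1, '2015-12-31');
""")

# The union is no longer correlated: it exposes the key columns,
# and the ON clause does the matching against tbl_a.
sql = """
    SELECT a.col_a, a.col_b, MAX(h.target_date) AS most_recent_target_date
    FROM tbl_a a
    LEFT JOIN (
        SELECT tbl_a_id, milestone_id, target_date FROM tbl_b WHERE flag = 1
        UNION ALL
        SELECT tbl_a_id, milestone_id, target_date FROM tbl_c WHERE flag = 1
    ) h
      ON h.tbl_a_id = a.id
     AND h.milestone_id = a.milestone_id
    GROUP BY a.id, a.col_a, a.col_b
    ORDER BY a.id
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Grouping by a.id as well as the displayed columns keeps two tbl_a rows that happen to share col_a and col_b from being merged into one.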

Oracle SQL formatting

I have a query to get bill data, grouped by the calendar year and the week of the year. I want to have the evolution of bills in total: not only the number of bills in one week, but the number from the beginning up to that week. I have the following query:
SELECT DD.CAL_YEAR,
       DD.WEEK_OF_YEAR AS "Date by week",
       SUM(DISTINCT FAB.ID) OVER (
         ORDER BY DD.CAL_YEAR, DD.WEEK_OF_YEAR
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS "Number of bills"
FROM BILLS_TABLE FAB
JOIN DIM_DATE DD ON FAB.BALANCE_DATE_ID = DD.ID
GROUP BY DD.CAL_YEAR, DD.WEEK_OF_YEAR;
But when I execute this query I get the following exception:
Error: ORA-30487: ORDER BY not allowed here
SQLState: 99999 ErrorCode: 30487
The order by clause is needed for the OVER function, so what is wrong?
The problem is really the use of distinct inside sum.
Because you have the order by clause, that means you are doing a cumulative sum. Performing a distinct in that case doesn't really make any sense. That's what the error is actually trying to tell you.
The documentation for the SUM analytic function mentions this:
If you specify DISTINCT, then you can specify only the query_partition_clause of the analytic_clause. The order_by_clause and windowing_clause are not allowed.
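A running total that avoids DISTINCT inside the window function might look like the sketch below: count the bills per week in an inner query, then take a cumulative SUM over those weekly counts. This assumes "number of bills" means a count of rows in BILLS_TABLE (the original summed the IDs). It runs on Python's sqlite3 module, which needs SQLite 3.25 or later for window functions, and the tables are cut down to invented sample data with one DIM_DATE row per week.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (id INTEGER PRIMARY KEY, cal_year INTEGER, week_of_year INTEGER);
    CREATE TABLE bills_table (id INTEGER PRIMARY KEY, balance_date_id INTEGER);

    INSERT INTO dim_date VALUES (1, 2014, 51), (2, 2014, 52), (3, 2015, 1);
    -- 2 bills in 2014/51, 1 bill in 2014/52, 3 bills in 2015/1
    INSERT INTO bills_table (balance_date_id) VALUES (1), (1), (2), (3), (3), (3);
""")

# Inner query: plain GROUP BY gives bills per week.
# Outer query: cumulative SUM over the weekly counts, no DISTINCT needed.
sql = """
    SELECT cal_year,
           week_of_year,
           SUM(bills_in_week) OVER (
               ORDER BY cal_year, week_of_year
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS bills_so_far
    FROM (
        SELECT dd.cal_year, dd.week_of_year, COUNT(fab.id) AS bills_in_week
        FROM bills_table fab
        JOIN dim_date dd ON fab.balance_date_id = dd.id
        GROUP BY dd.cal_year, dd.week_of_year
    )
    ORDER BY cal_year, week_of_year
"""
running = conn.execute(sql).fetchall()
print(running)
```

Because each bill has a unique ID, counting rows per week already counts each bill once, so the DISTINCT that triggered ORA-30487 is not needed in this shape.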

Compare date to month-year in Postgres/Ruby

I have a date column in my table and I would like to 'filter'/select out items after a certain year-month. So if I have data from 2010 on, I have a user input that specifies '2011-10' as the 'earliest date' they want to see data from.
My current SQL looks like this:
select round(sum(amount), 2) as amount,
date_part('month', date) as month
from receipts join items
on receipts.item = items.item
where items.expense = ?
and date_part('year', date)>=2014
and funding = 'General'
group by items.expense, month, items.order
order by items.order desc;
In the second part of the 'where', instead of doing year >= 2014, I want to do something like to_char(date, 'YY-MMMM') >= ? as another parameter and then pass in '2011-10'. However, when I do this:
costsSql = "select round(sum(amount), 2) as amount,
to_char(date, 'YY-MMMM') as year_month
from receipts join items
on receipts.item = items.item
where items.expense = ?
and year_month >= ?
and funding = 'General'
group by items.expense, year_month, items.order
order by items.order desc"
and call that with my two params, I get a postgres error: PG::UndefinedColumn: ERROR: column "year_month" does not exist.
Edit: I converted my YYYY-MM string into a date and passed that in as my param instead and it's working. But I still don't understand why I get the 'column does not exist' error after I created that column in the select clause - can someone explain? Can columns created like that not be used in where clauses?
The error column "year_month" does not exist happens because year_month is an alias defined in the SELECT list, and such aliases can't be referred to in the WHERE clause.
This is based on the fact that the SELECT-list is evaluated after the WHERE clause, see for example: Column alias in where clause? for an explanation from PG developers.
Some databases allow it nonetheless, others don't, and PostgreSQL doesn't. It's one of the many portability hazards between SQL engines.
In the case of the query shown in the question, you don't even need the to_char in the WHERE clause anyway, because as mentioned in the first comment, a direct comparison with a date is simpler and more efficient too.
When a query has a complex expression in the SELECT-list and repeating it in the WHERE clause looks wrong, sometimes it might be refactored to move the expression into a sub-select or a WITH clause at the beginning of the query.
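The sub-select refactoring mentioned above can be sketched as follows. The example runs on Python's sqlite3 module, so strftime('%Y-%m', date) stands in for PostgreSQL's to_char(date, 'YYYY-MM'), and the receipts table is reduced to two invented columns with made-up rows. SQLite is more lenient than PostgreSQL about aliases, so this only illustrates the shape of the query: the alias is defined in the inner SELECT and is therefore an ordinary column by the time the outer WHERE sees it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE receipts (date TEXT, amount REAL);
    INSERT INTO receipts VALUES
        ('2011-09-15', 10.0),
        ('2011-10-01', 20.5),
        ('2011-10-20', 4.5),
        ('2012-01-03', 7.25);
""")

# year_month is computed once in the sub-select; the outer query
# can then filter and group on it like any other column.
sql = """
    SELECT year_month, ROUND(SUM(amount), 2) AS amount
    FROM (
        SELECT strftime('%Y-%m', date) AS year_month, amount
        FROM receipts
    ) AS r
    WHERE year_month >= ?
    GROUP BY year_month
    ORDER BY year_month
"""
totals = conn.execute(sql, ("2011-10",)).fetchall()
print(totals)
```

Comparing 'YYYY-MM' strings works here only because that format sorts in chronological order. Comparing the date column directly against a date parameter, as suggested above, avoids the conversion entirely.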

Logic of applying Group by in Sql Queries

This is a very beginner-level question, but I am always confused whenever I want to use an aggregate function with GROUP BY. I am getting the right results, but I am not quite sure how GROUP BY is working here. My requirement is to get the count of sent items, based on the MessageGroup column.
MessageId  SenderId  MessageGroup                          Message
---------  --------  ------------------------------------  -------
1          2         67217969-e03d-41ec-863e-659ca26e660f  Hi
2          2         67217969-e03d-41ec-863e-659ca26e660f  Hello
3          2         67217969-e03d-41ec-863e-659ca26e660f  bye
4          1         c45dc414-9320-40a5-8f8f-9c960d6deffe  TC
5          1         8486d16b-294b-45a5-8674-e7024e55f39b  shutup
What I want is the count of sent messages. Here SenderId=2 has sent three messages to someone, but I want to show a single count, so I group by MessageGroup and count the groups.
I used this LINQ query:
return DB.tblMessage.Where(m => m.SenderId == 2 ).GroupBy(m => m.MessageGroup).Count();
This returns "1" which is correct and I want to show (1) in sent messages.
But if I try to run the same query in SQL Server, it returns 3.
Here is my SQL query:
select count(*)
from tblMessage
where SenderId = 2
group by MessageGroup
The LINQ query is right, since it returns 1, as Microsoft says here.
I am really confused by GROUP BY. Please help me understand.
When you use GROUP BY, include the grouped columns in the SELECT list so that each count is labelled with the group it belongs to:
select MessageGroup, count(MessageGroup)
from tblMessage
where SenderId=2
group by MessageGroup
You want to include MessageGroup as part of the select, like this:
select MessageGroup, count(*)
from tblMessage
where SenderId=2
group by MessageGroup
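The difference between the two results can be reproduced with the sample rows from the question. The sketch below uses Python's sqlite3 module in place of SQL Server. The GROUP BY query returns one row per group, each with the number of messages in it (3 for the single group), whereas the LINQ expression counts the groups themselves. COUNT(DISTINCT MessageGroup) is one way to get that second number in SQL; treating it as the equivalent of the LINQ query is an interpretation, not something stated in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMessage (MessageId INTEGER, SenderId INTEGER, MessageGroup TEXT, Message TEXT);
    INSERT INTO tblMessage VALUES
        (1, 2, '67217969-e03d-41ec-863e-659ca26e660f', 'Hi'),
        (2, 2, '67217969-e03d-41ec-863e-659ca26e660f', 'Hello'),
        (3, 2, '67217969-e03d-41ec-863e-659ca26e660f', 'bye'),
        (4, 1, 'c45dc414-9320-40a5-8f8f-9c960d6deffe', 'TC'),
        (5, 1, '8486d16b-294b-45a5-8674-e7024e55f39b', 'shutup');
""")

# One result row per group; the count is the number of messages in that group.
per_group = conn.execute("""
    SELECT MessageGroup, COUNT(*)
    FROM tblMessage
    WHERE SenderId = 2
    GROUP BY MessageGroup
""").fetchall()

# Number of distinct groups, which is what GroupBy(...).Count() reports in LINQ.
group_count = conn.execute("""
    SELECT COUNT(DISTINCT MessageGroup)
    FROM tblMessage
    WHERE SenderId = 2
""").fetchone()[0]

print(per_group)
print(group_count)
```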

oracle query error: exact fetch return more than requested no of rows

I have two tables seatinfo(siid,seatno,classid,tsid) and booking (bookid,siid,date,status).
I have the input parameters bookDate, v_tsId and v_clsId. I need exactly one row (bookid) to be returned. This query is not working and I don't know why. How can I fix it?
select bookid
into v_bookid
from booking
where (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
  and status=0
  and rownum <= 1
  and siid in (select siid
               from seatinfo
               where tsid=v_tsId
                 and classid= v_clsId);
I also tried this:
select bookid
into v_bookid
from booking,
     seatinfo
where booking.siid=seatinfo.siid
  and (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
  and booking.status=0
  and rownum <= 1
  and seatinfo.tsid=v_tsId
  and seatinfo.classid= v_clsId;
Are you saying that you get an "ORA-01422: exact fetch returns more than requested number of rows" when you run both of those queries? That seems highly unlikely since you're including the predicate rownum <= 1. Can you cut and paste from a SQL*Plus session that runs just this query in a PL/SQL block and generates the error?
If you are not complaining about the error you mention in the title, and the problem is just that you're not getting the data you expect, the likely problem is that you apparently have a bookDate parameter that has the same name as a column in your table. That is not going to work. When you say
(to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
you presumably mean to compare the bookDate column in the booking table against the bookDate parameter. But since column names take precedence over local variables, the right-hand side of your expression is also looking at the bookDate column in the booking table. So you're comparing a column to itself. It would make much more sense to change the name of the parameter (to, say, p_bookDate) and then write
booking.bookDate = p_bookDate
or, if you want to do the comparison ignoring the time component of the dates
trunc( booking.bookDate ) = trunc( p_bookDate )
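The effect of the name clash can be illustrated outside Oracle. The sketch below uses Python's sqlite3 module, not PL/SQL, so it only mimics the situation: in the first query the unqualified name bookDate resolves to the column, the predicate compares the column to itself, and every row matches. In the second query a distinctly named bind parameter (the hypothetical p_bookdate) carries the value and only the intended row comes back. The table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking (bookid INTEGER, bookdate TEXT, status INTEGER);
    INSERT INTO booking VALUES
        (1, '2013-05-01', 0),
        (2, '2013-05-02', 0),
        (3, '2013-05-03', 0);
""")

# "bookDate" is just another spelling of the bookdate column here,
# so this predicate is true for every row with a non-NULL date.
self_compare = conn.execute("""
    SELECT bookid FROM booking
    WHERE booking.bookdate = bookDate
    ORDER BY bookid
""").fetchall()

# A parameter with its own name cannot be mistaken for the column.
with_param = conn.execute("""
    SELECT bookid FROM booking
    WHERE booking.bookdate = :p_bookdate
    ORDER BY bookid
""", {"p_bookdate": "2013-05-02"}).fetchall()

print(self_compare)
print(with_param)
```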
