I have a query to get bill data, grouped by calendar year and week of the year. I want the cumulative evolution of bills: not only the number of bills in one week, but the total from the beginning up to that week. I have the following query:
SELECT DD.CAL_YEAR, DD.WEEK_OF_YEAR AS "Date by week", SUM(DISTINCT FAB.ID) OVER ( ORDER BY DD.CAL_YEAR, DD.WEEK_OF_YEAR ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS "Number of bills"
FROM BILLS_TABLE FAB
JOIN DIM_DATE DD ON FAB.BALANCE_DATE_ID = DD.ID
GROUP BY DD.CAL_YEAR,DD.WEEK_OF_YEAR;
But when I execute this query I get the following exception:
Error: ORA-30487: ORDER BY not allowed here
SQLState: 99999 ErrorCode: 30487
The order by clause is needed for the OVER function, so what is wrong?
The problem is really the use of distinct inside sum.
Because you have the order by clause, that means you are doing a cumulative sum. Performing a distinct in that case doesn't really make any sense. That's what the error is actually trying to tell you.
The documentation for the SUM analytic function mentions this:
If you specify DISTINCT, then you can specify only the query_partition_clause of the analytic_clause. The order_by_clause and windowing_clause are not allowed.
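One possible way around the restriction, sketched here under assumptions, is to do the distinct count per week in an inner query and apply the running SUM (without DISTINCT) in an outer one. The sketch assumes the goal is a count of bills rather than a sum of their IDs, uses SQLite through Python's sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later), and runs on invented sample rows.

```python
import sqlite3

# In-memory stand-in for the Oracle tables; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (id INTEGER PRIMARY KEY, cal_year INTEGER, week_of_year INTEGER);
    CREATE TABLE bills_table (id INTEGER PRIMARY KEY, balance_date_id INTEGER);
    INSERT INTO dim_date VALUES (1, 2016, 51), (2, 2016, 52), (3, 2017, 1);
    INSERT INTO bills_table VALUES (10, 1), (11, 1), (12, 2), (13, 3), (14, 3), (15, 3);
""")

# Inner query: distinct bills per (year, week).
# Outer query: running total over those weekly counts, with no DISTINCT
# inside the windowed SUM.
running_totals = conn.execute("""
    SELECT cal_year,
           week_of_year,
           SUM(bills_in_week) OVER (
               ORDER BY cal_year, week_of_year
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS bills_so_far
    FROM (
        SELECT dd.cal_year, dd.week_of_year, COUNT(DISTINCT fab.id) AS bills_in_week
        FROM bills_table fab
        JOIN dim_date dd ON fab.balance_date_id = dd.id
        GROUP BY dd.cal_year, dd.week_of_year
    )
    ORDER BY cal_year, week_of_year
""").fetchall()

for row in running_totals:
    print(row)
```

The same inner/outer shape should carry over to Oracle, but that has not been verified against the original schema.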
I'm trying to figure out how to build a measure that sums a total, but only taking the first non-empty row for a user.
For example, my data looks like the below:
date    user  value
-------------------
1/1/17  a     15
2/1/17  a     12
1/1/17  b     null
5/1/17  b     3
I'd therefore like a result of 18 (15 + 3).
I'm thinking that FIRSTNONBLANK would help, but it only takes a single column, and I'm not sure how to give it the grouping. Perhaps some sort of windowing is required.
I've tried the below, but am struggling to work out the correct syntax:
GROUPBY(
    GROUPBY(
        myTable,
        myTable[user],
        "Total", SUMX(CURRENTGROUP(), FIRSTNONBLANK([value], 1))
    ),
    SUM([Total])
)
I didn't have much luck getting FIRSTNONBLANK or GROUPBY to work exactly how I wanted, but I think I found something that works:
SUMX(
    ADDCOLUMNS(
        ADDCOLUMNS(
            VALUES(myTable[User]),
            "FirstDate",
            CALCULATE(MIN(myTable[Date]), NOT(ISBLANK(myTable[Value])))
        ),
        "FirstValue",
        CALCULATE(SUM(myTable[Value]), FILTER(myTable, myTable[Date] = [FirstDate]))
    ),
    [FirstValue]
)
The inner ADDCOLUMNS calculates, for each user in the filter context, the first date on which that user has a non-blank value.
The next ADDCOLUMNS takes that table of users and first dates and, for each user, sums every [Value] that occurred on that user's first date.
The outer SUMX takes the resulting table and totals the [FirstValue] column.
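The three steps can also be traced outside DAX. The following is a minimal Python sketch of the same logic on the sample rows from the question. The function name is hypothetical, and the dates are assumed to be day/month/year (the ordering of the rows is the same either way).

```python
from datetime import date

# Sample rows from the question as (date, user, value); None stands for a blank.
rows = [
    (date(2017, 1, 1), "a", 15),
    (date(2017, 1, 2), "a", 12),
    (date(2017, 1, 1), "b", None),
    (date(2017, 1, 5), "b", 3),
]

def sum_first_non_blank(rows):
    # Step 1 (inner ADDCOLUMNS): earliest date with a non-blank value, per user.
    first_date = {}
    for day, user, value in rows:
        if value is not None and (user not in first_date or day < first_date[user]):
            first_date[user] = day
    # Steps 2 and 3 (outer ADDCOLUMNS + SUMX): add up each user's values
    # on that user's first date.
    total = 0
    for day, user, value in rows:
        if value is not None and first_date.get(user) == day:
            total += value
    return total

total = sum_first_non_blank(rows)
print(total)
```

For the sample data this gives the 18 (15 + 3) that the question asks for.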
I'm trying to get a count based on two dates and I'm not sure how it should look in a query. I have two date fields, and I want to get a count based on those dates.
<cfquery>
SELECT COUNT(*)
FROM Table1
Where month of date1 is one month less than month of date2
</cfquery>
Assuming Table1 is your original query, you can accomplish your goal as follows.
Step 1 - Use QueryAddColumn twice to add two empty columns.
Step 2 - Loop through your query and populate these two columns with numbers. One will represent date1 and the other will represent date2. It's not quite as simple as putting in the month numbers because you have to account for the year as well.
Step 3 - Write your Q of Q with a filter resembling this:
where NewColumn1 - NewColumn2 = 1
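Step 2 is where the year has to be accounted for. One common way to turn a date into a comparable number is year * 12 + month, so that December and the following January differ by exactly 1. The sketch below shows that idea in Python rather than ColdFusion. The helper names and sample dates are made up, and it assumes "one month less" means date1 falls in the calendar month immediately before date2.

```python
from datetime import date

def month_index(d):
    # Months elapsed since year 0, so consecutive calendar months differ by exactly 1.
    return d.year * 12 + d.month

def is_one_month_before(date1, date2):
    # True when date1 falls in the calendar month immediately before date2's month.
    return month_index(date2) - month_index(date1) == 1

# Hypothetical (date1, date2) pairs.
pairs = [
    (date(2016, 12, 15), date(2017, 1, 3)),   # consecutive months across a year boundary
    (date(2017, 3, 31), date(2017, 4, 1)),    # consecutive months in the same year
    (date(2016, 1, 10), date(2017, 2, 10)),   # month numbers differ by 1, but 13 months apart
    (date(2017, 5, 1), date(2017, 5, 30)),    # same month
]

matching = sum(1 for d1, d2 in pairs if is_one_month_before(d1, d2))
print(matching)
```

Comparing bare month numbers would wrongly reject the first pair and wrongly accept the third, which is why the year is folded into the number.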
I have a date column in my table and I would like to 'filter'/select out items after a certain year-month. So if I have data from 2010 on, I have a user input that specifies '2011-10' as the 'earliest date' they want to see data from.
My current SQL looks like this:
select round(sum(amount), 2) as amount,
date_part('month', date) as month
from receipts join items
on receipts.item = items.item
where items.expense = ?
and date_part('year', date)>=2014
and funding = 'General'
group by items.expense, month, items.order
order by items.order desc;
In the second part of the 'where', instead of doing year >= 2014, I want to do something like to_char(date, 'YY-MMMM') >= ? as another parameter and then pass in '2011-10'. However, when I do this:
costsSql = "select round(sum(amount), 2) as amount,
to_char(date, 'YY-MMMM') as year_month
from receipts join items
on receipts.item = items.item
where items.expense = ?
and year_month >= ?
and funding = 'General'
group by items.expense, year_month, items.order
order by items.order desc"
and call that with my two params, I get a postgres error: PG::UndefinedColumn: ERROR: column "year_month" does not exist.
Edit: I converted my YYYY-MM string into a date and passed that in as my param instead and it's working. But I still don't understand why I get the 'column does not exist' error after I created that column in the select clause - can someone explain? Can columns created like that not be used in where clauses?
The error column "year_month" does not exist happens because year_month is an alias defined in the SELECT list, and such aliases can't be referred to in the WHERE clause.
This follows from the fact that the SELECT list is evaluated after the WHERE clause; see, for example, the discussion "Column alias in where clause?" for an explanation from PG developers.
Some databases allow it nonetheless, others don't, and PostgreSQL doesn't. It's one of the many portability hazards between SQL engines.
In the case of the query shown in the question, you don't even need the to_char in the WHERE clause anyway, because as mentioned in the first comment, a direct comparison with a date is simpler and more efficient too.
When a query has a complex expression in the SELECT-list and repeating it in the WHERE clause looks wrong, sometimes it might be refactored to move the expression into a sub-select or a WITH clause at the beginning of the query.
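A minimal sketch of that refactoring follows, run through Python's sqlite3 module so that it is self-contained. Several things here are assumptions: the table is a simplified receipts table with a hypothetical receipt_date column, the join to items is left out, and SQLite's strftime replaces PostgreSQL's to_char. SQLite is also more lenient than PostgreSQL about aliases in WHERE, so this shows only the shape of the rewrite, not the original error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE receipts (amount REAL, receipt_date TEXT, funding TEXT);
    INSERT INTO receipts VALUES
        (10.0, '2011-09-30', 'General'),
        (20.0, '2011-10-01', 'General'),
        (5.5,  '2011-10-15', 'General'),
        (7.0,  '2012-01-02', 'General'),
        (99.0, '2012-01-03', 'Other');
""")

# The expression is computed once inside the WITH clause; the outer query
# can then filter and group on year_month like any ordinary column.
rows = conn.execute("""
    WITH labelled AS (
        SELECT amount, funding, strftime('%Y-%m', receipt_date) AS year_month
        FROM receipts
    )
    SELECT year_month, ROUND(SUM(amount), 2) AS amount
    FROM labelled
    WHERE year_month >= ?
      AND funding = 'General'
    GROUP BY year_month
    ORDER BY year_month
""", ("2011-10",)).fetchall()

for row in rows:
    print(row)
```

Because the format puts the four-digit year first and zero-pads the month, comparing the strings gives the same order as comparing the dates.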
I have two tables seatinfo(siid,seatno,classid,tsid) and booking (bookid,siid,date,status).
I have the input parameters bookDate, v_tsId and v_clsId. I need exactly one row (bookid) to be returned. This query is not working and I don't know why. How can I fix it?
select bookid
into v_bookid
from booking
where (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
and status=0
and rownum <= 1
and siid in(select siid
from seatinfo
where tsid=v_tsId
and classid= v_clsId);
I also tried this:
select bookid
into v_bookid
from booking,
seatinfo
where booking.siid=seatinfo.siid
and (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
and booking.status=0
and rownum <= 1
and seatinfo.tsid=v_tsId
and seatinfo.classid= v_clsId;
Are you saying that you get an "ORA-01422: exact fetch returns more than requested number of rows" when you run both of those queries? That seems highly unlikely since you're including the predicate rownum <= 1. Can you cut and paste from a SQL*Plus session that runs just this query in a PL/SQL block and generates the error?
If you are not complaining about the error you mention in the title, and the problem is just that you're not getting the data you expect, the likely problem is that you apparently have a bookDate parameter that has the same name as a column in your table. That is not going to work. When you say
(to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
you presumably mean to compare the bookDate column in the booking table against the bookDate parameter. But since column names take precedence over local variables, the unqualified bookDate on the right-hand side of your expression also resolves to the bookDate column in the booking table. So you're comparing a column to itself. It would make much more sense to change the name of the parameter (to, say, p_bookDate) and then write
booking.bookDate = p_bookDate
or, if you want to do the comparison ignoring the time component of the dates
trunc( booking.bookDate ) = trunc( p_bookDate )
Database
Table1
Id
Table2Id
...
Table2
Id
StartTime
Duration //in hours
Query
select * from Table1 join Table2 on Table2Id = Table2.Id
where starttime < :starttime and starttime + Duration/24 > :endtime
This query currently takes about 2 seconds to run, which is too long. There is an index on the id columns and a function-based index on starttime + Duration/24. In SQL Developer the query plan shows no indexes being used. The query returns 475 rows for my test start and end times. Table2 has ~800k rows and Table1 has ~200k rows.
If the Duration/24 calculation is removed from the query and replaced with a static value, the query time is reduced by half. This does not retrieve exactly the same data, but it leads me to believe that the division is expensive.
I have also tested adding an endtime column to Table2 that is populated with (starttime + Duration/24). The column was prepopulated via a single update; if it were used in production, I would populate it via an update trigger.
select * from Table1 join Table2 on Table2Id = Table2.Id
where starttime < :starttime and endtime > :endtime
This query runs in about 600 ms and uses an index for the join. It is less than ideal because of the additional column with redundant data.
Are there any methods of making this query faster?
Create a function index on both starttime and the expression starttime + Duration/24:
create index myindex on table2(starttime, starttime + Duration / 24);
A compound index covering the entire predicate of your query should be selected. With the columns indexed individually, the optimizer is likely deciding that repeated table accesses by rowid, driven by a scan of one of those indexes, would be slower than a full table scan.
Also make sure that you're not doing an implicit conversion from varchar to date, by ensuring that you're passing DATEs in your bind variables.
Try lowering the optimizer_index_cost_adj system parameter. I believe the default is 100. Try setting that to 10 and see if your index is selected.
Consider partitioning the table by starttime.
You have two criteria with range predicates (greater than/less than). An index range scan can start at one point in the index and end at another.
For a compound index on starttime and "Starttime+duration/24", since the leading column is starttime and the predicate is "less than bind value", it will start at the left most edge of the index (earliest starttime) and range scan all rows up to the point where the starttime reaches the limit. For each of those matches, it can evaluate the calculated value for "Starttime+duration/24" on the index against the bind value and pass or reject the row. I'd suspect most of the data in the table is old, so most entries have an old starttime and you'd end up scanning most of the index.
For a compound index on "Starttime+duration/24" and starttime, since the leading column is the function and the predicate is "greater than bindvalue", it will start partway through the index and work its way to the end. For each of those matches, it can evaluate the starttime on the index against the bind value and pass or reject the row. If the enddate passed in is recent, I suspect this would actually involve a much smaller amount of the index being scanned.
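The effect of the leading column can be illustrated with a toy model in Python, where a sorted list of tuples stands in for each compound index and bisect stands in for the range scan. This is only a sketch on invented data (1000 rows that each last 10 hours, with a query window near the most recent rows); it does not model how Oracle actually costs or executes a plan.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: row i starts at hour i and lasts 10 hours.
rows = [(start, start + 10) for start in range(1000)]

# Bind values for a recent window: starttime < 995 AND endtime > 996.
q_start, q_end = 995, 996

# "Index" led by starttime: entries are (start, end), sorted.
by_start = sorted(rows)
# "Index" led by the computed end time: entries are (end, start), sorted.
by_end = sorted((end, start) for start, end in rows)

def scan_start_led(index, q_start, q_end):
    # Range scan from the left edge up to the first entry with start >= q_start,
    # then filter on the second column.
    stop = bisect_left(index, (q_start,))
    scanned = index[:stop]
    hits = [(s, e) for s, e in scanned if e > q_end]
    return len(scanned), hits

def scan_end_led(index, q_start, q_end):
    # Range scan from the first entry with end > q_end to the right edge,
    # then filter on the second column.
    begin = bisect_right(index, (q_end, float("inf")))
    scanned = index[begin:]
    hits = [(s, e) for e, s in scanned if s < q_start]
    return len(scanned), hits

scanned_a, hits_a = scan_start_led(by_start, q_start, q_end)
scanned_b, hits_b = scan_end_led(by_end, q_start, q_end)
print(scanned_a, scanned_b, len(hits_a))
```

Both scans find the same rows, but in this toy data the index led by the end time touches only a handful of entries while the one led by starttime walks almost the whole list.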
Even without the starttime as a second column on the index, the existing function-based index on "Starttime+duration/24" should still be useful and used. Check the explain plan to make sure the bind value is either a date or converted to a date. If it is converted, make sure the appropriate format mask is used (e.g. an entered value of '1/Jun/09' may be converted to year 0009, so Oracle will see the condition as very relaxed and would tend not to use the index; the result could also be wrong).
"In Sql Developer the query plan shows no indexes being used." If the index wasn't being used to find the table2 rows, I suspect the optimizer thought most or all of table2 would be returned (which it obviously isn't, by your numbers). I'd guess that it thought most of table1 would be returned, and thus that neither of your predicates did much filtering. As I said above, I think the "less than" predicate isn't selective, but the "greater than" should be. Look at the explain plan, especially the ROWS value, to see what Oracle thinks.
PS.
Adjusting the value (of optimizer_index_cost_adj) means the optimizer changes the basis for its estimates. If a journey planner says a trip will take six hours because it assumes an average speed of 50, and you tell it to assume an average of 100, it will come out with three hours. It won't actually affect the speed you travel at, or how long the journey actually takes.
So you only want to change that value to make it more accurately reflect the actual value for your database (or session).
Oracle will not use an index if the selectivity of the WHERE clause is poor. An index is used only when the number of rows returned is a small enough percentage of the total rows in the table (the threshold varies, since Oracle counts the cost of reading the index as well as reading the table).
Also, when an indexed column is wrapped in an expression in the WHERE clause, a plain index on that column cannot be used. For example, UPPER(some_index_column) prevents the use of an index on some_index_column. This is why starttime + Duration/24 > :endtime does not use the index.
Can you try this:
select * from Table1 join Table2 on Table1.Table2Id = Table2.Id
where starttime < :starttime and starttime > :endtime - Duration/24
This should allow the use of the Index and there is no need for an additional column.
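The rewrite relies on the two forms of the end-time condition selecting the same rows, since subtracting Duration/24 from both sides does not change the inequality. A quick check in Python on made-up rows is sketched below, with Duration in hours as in the schema. It only confirms the equivalence of the results; whether Oracle then picks the index still has to be checked in the explain plan.

```python
from datetime import datetime, timedelta

# Hypothetical Table2 rows: (starttime, duration in hours).
rows = [
    (datetime(2009, 6, 1, 8, 0), 5),
    (datetime(2009, 6, 1, 10, 0), 1),
    (datetime(2009, 6, 1, 11, 30), 6),
    (datetime(2009, 6, 1, 13, 0), 2),
]

# Stand-ins for the :starttime and :endtime bind variables.
bind_start = datetime(2009, 6, 1, 12, 0)
bind_end = datetime(2009, 6, 1, 12, 30)

# Original predicate: starttime < :starttime AND starttime + Duration/24 > :endtime
original = [
    (start, hours) for start, hours in rows
    if start < bind_start and start + timedelta(hours=hours) > bind_end
]

# Rewritten predicate: starttime < :starttime AND starttime > :endtime - Duration/24
rewritten = [
    (start, hours) for start, hours in rows
    if start < bind_start and start > bind_end - timedelta(hours=hours)
]

print(original == rewritten, len(original))
```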