Oracle LAG/LEAD analytic functions works as join or not? - oracle

lag/lead analytic functions allow access to previous/next rows without join technique according official docs:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions070.htm
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions074.htm
Does that mean that if I include fields which are not lag/lead - Oracle do join on this fields?
So if I query currency rates (to find holes - the day where are no rate defined):
select CURRENCY, RATE, XDATE, lead(XDATE) over (order by XDATE) from CURRENCY_RATE
I get lead(XDATE) for same CURRENCY as XDATE?
Or partition by keyword in lead expression must be used on CURRENCY to achieve this goal?

Your query make a lead without currency consideration.
If you want to sort by date for any currency, you need to add the partition by keyword :
select CURRENCY,
RATE,
XDATE,
lead(XDATE) over(partition by CURRENCY order by XDATE)
from CURRENCY_RATE
;

I found good explanation of analytical functions semantic in http://www.orafaq.com/node/55 :
Some functions support the window_clause inside the partition to further limit the records they act on. In the absence of any window_clause analytic functions are computed on all the records of the partition clause.
The functions SUM, COUNT, AVG, MIN, MAX are the common analytic functions the result of which does not depend on the order of the records.
Functions like LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST VALUE, LAST, LAST VALUE depends on order of records.
Another nice docs: http://www.oracle-base.com/articles/misc/analytic-functions.php

Related

How make view use index

The view has several join, but no WHEREcloses. It helped our developers to have all the data the needed in one appian object, that could be easily used in the "low code" later on. In most cases, Appian add conditions to query the data on the view, in a subsequent WHERE clause like below:
query: [Report on Record Type], order by: [[Sort[histoDateAction desc], Sort[id asc]]],
filters:[((histoDateAction >= TypedValue[it=9,v=2022-10-08 22:00:00.0])
AND (histoDateAction < TypedValue[it=9,v=2022-10-12 22:00:00.0])
AND (histoUtilisateur = TypedValue[it=3,v=miwem6]))
]) (APNX-1-4198-000) (APNX-1-4205-031)
Now we start to have data in the database, and performances get low. Reason seems to be, from execution plan view, query do not use indexes when data is queried.
Here is how the query for view VIEW_A looks:
SELECT
<columns> (not much transformation here)
FROM A
LEFT JOIN R on R.id=A.id_type1
LEFT JOIN R on R.id=A.id_type2
LEFT JOIN R on R.id=A.id_type3
LEFT JOIN U on U.id=A.id_user <500>
LEFT JOIN C on D.id=A.id_customer <50000>
LEFT JOIN P on P.id=A.id_prestati <100000>
and in the current, Appian added below clauses:
where A.DATE_ACTION < to_date('2022-10-12 22:00:00', 'YYYY-MM-DD HH24:MI:SS')
and A.DATE_ACTION >= to_date('2022-10-08 22:00:00', 'YYYY-MM-DD HH24:MI:SS')
and A.USER_ACTION = 'miwem6'
typically, when I show explain plan for the VIEW_A WHERE <conditions> , I have a cost around 6'000, and when I show explain plan for the <code of the view> where <clause>, the cost is 30.
Is it possible to use some Oracle hint to tell it: "Some day, someone will query this adding a WHERE clause on some columns, so don't be a stupid engine and use indexes when time comes"?
First, this isn't a great architecture. I can't tell you how many times folks have pulled me in to diagnose performance problems due to unpredictably dynamic queries where they are adding an assortment of unforseable WHERE predicates.
But if you have to do this, you can increase your likelihood of using indexes by lowering their cost. Like this:
SELECT /*+ opt_param('optimizer_index_cost_adj',1) */
<columns> (not much transformation here)
FROM A . . .
If you know for sure that nested loops + index use is the way you want to access everything, you can even disable the CBO entirely:
SELECT /*+ rule */
<columns> (not much transformation here)
FROM A . . .
But of course it's on you to ensure that there's an index on every high cardinality column that your system may use to significantly filter desired rows by. That's not every column, but it sounds like it may be quite a few.
Oh, and one more thing... please ignore COST. By definition Oracle always chooses what it computes as the lowest cost plan. When it makes wrong choices, it's because it's computation of cost is incorrect. Therefore by definition, if you are having problems, the COST numbers you see are wrong. Ignore them.

Oracle Sql group function is not allowed here

I need someone who can explain me about "group function is not allowed here" because I don't understand it and I would like to understand it.
I have to get the product name and the unit price of the products that have a price above the average
I initially tried to use this, but oracle quickly told me that it was wrong.
SELECT productname,unitprice
FROM products
WHERE unitprice>(AVG(unitprice));
search for information and found that I could get it this way:
SELECT productname,unitprice FROM products
WHERE unitprice > (SELECT AVG(unitprice) FROM products);
What I want to know is why do you put two select?
What does group function is not allowed here mean?
More than once I have encountered this error and I would like to be able to understand what to do when it appears
Thank you very much for your time
The phrase "group function not allowed here" is referring to anything that is in some way an "aggregation" of data, eg SUM, MIN, MAX, etc et. These functions must operate on a set of rows, and to operate on a set of rows you need to do a SELECT statement. (I'm leaving out UPDATE/DELETE here)
If this was not the case, you would end up with ambiguities, for example, lets say we allowed this:
select *
from products
where region = 'USA'
and avg(price) > 10
Does this mean you want the average prices across all products, or just the average price for those products in the USA? The syntax is no longer deterministic.
Here's another option:
SELECT *
FROM (
SELECT productname,unitprice,AVG(unitprice) OVER (PARTITION BY 1) avg_price
FROM products)
WHERE unitprice > avg_price
The reason your original SQL doesn't work is because you didn't tell Oracle how to compute the average. What table should it find it in? What rows should it include? What, if any, grouping do you wish to apply? None of that is communicated with "WHERE unitprice>(AVG(unitprice))".
Now, as a human, I can make a pretty educated guess that you intend the averaging to happen over the same set of rows you select from the main query, with the same granularity (no grouping). We can accomplish that either by using a sub-query to make a second pass on the table, as your second SQL did, or the newer windowing capabilities of aggregate functions to internally make a second pass on your query block results, as I did in my answer. Using the OVER clause, you can tell Oracle exactly what rows to include (ROWS BETWEEN ...) and how to group it (PARTITION BY...).

Tableau - Aggregate and non aggregate error for divide forumla

I used COUNT (CUST_ID) as measure value to come up [Total No of Customer]. When I created new measure for [Average Profit per customer] by formula - [Total Profit] / [Total No of Customer], the error of Aggregate and non aggregate error prompted.
DB level:
Cust ID_____Profit
123_______100
234_______500
345_______350
567_______505
You must be looking for avg aggregate function.
Select cust_id, avg(profit)
From your_table
Group by cust_id;
Cheers!!
In your database table, you appear to have one data row per customer. Customer ID is serving as a unique primary key. The level of detail (or granularity) of the database table is the customer.
Given that, the simplest solution to your question is to display AVG([Profit]) -- without having [Cust ID] in the view (i.e. not on any shelf)
If the assumptions mentioned above are not correct, then you may need to employ other methods depending on how you define your question. I suggest making sure you understand what COUNT() actually does compared to COUNTD(). The behavior is not what people tend to assume. LOD calculations may prove useful. All described in the online help.
Put the calculations directly in the calculated field as:
SUM([Profit])/COUNT([CUST_ID])
This will give you aggregate and aggregate calculation.
If you want to show Average profit using a key like [CUST_ID], you can use LOD expression:
{FIXED [CUST_ID]: AVG[Profit]}

How do I build effecient SQL filters?

After taking an advanced T-SQL performance/query tuning class, something that I thought I remembered hearing was that you can speed up some queries just a little bit if you put your date(time) filters first.
Ex:
WHERE
RunDate = '12/1/2015' AND
OtherFilters = etc...
But does this really only count if I have indexes in place on these columns I filter on for this table?
So to add to this just a little, should I be building my filters off of the indexes on any tables referenced in the query? Such that my first filters of the query are based on my indexes?
Ex:
WHERE
ID > 1000 AND
RunDate <= '1/1/206' AND
OtherFilters = etc...
Where ID and RunDate are part of my indexes/primary key.
The order of filters in WHERE clause does not matter. As long as you have index on the fields, SQL Server knows how to use your filters.
Assume you have index on (ID, RunDt) and you have both ID and RunDt in your WHERE clause. SQL Server first filters the data on ID and then from that subset rows, will filter on RunDt.
This scenario may change if you have other indexes depends on selectivity of your data.
Also if you have clustered index on RunDt, SQL will first filter on RunDt and then ID.
You don't need to worry about the order of your filters in WHERE clause, as long as you have the right order of columns in your index definition.
TSQL is just a logical representation
The query optimizer will set the actual execution order that is most efficient
It messes up some times but for the most part it is spot on
If you have a clustered PK on ID then this will typically be done first
Appears even the OP is confused about the question
Can only answer the stated question
But does this really only count if I have indexes in place on these
columns I filter on for this table?
The order in the where does not matter for columns with indexes
The order in the where does not matter for columns without indexes
The order in the where does not matter

Sum based on specific condition - Oracle

I need your advice on the following query that I have - Let's say that I have a table with all payments that are booked on my current account.
The details of the payment contain date of the operation and hour. I would like to extract the information in a such a way so to have next to each transaction the amount of of the balance(sum of transactions' amount) since the beginning of the day up to the current transaction. The balance for each day is reset to 0.
I was thinking to join this table to itself and find all unique operations from the joined table where the date matches and the hour is less then currently reviewed operation's hour then to use sum on the group.
Still I think that there is much more intelligent solution.
Thanks in advance
here is a sample of the table. Expected result is in the last column
My guess is that you just want a rolling sum. Making up column names and table names, you probably want something like this in your projection (your select list). You shouldn't need to do a self-join.
SUM(transaction_amount)
OVER (PARTITION BY account_number, trunc(transaction_date)
ORDER BY transaction_date) rolling_sum

Resources