Sorting an output in BigQuery - sorting

I ran a query to fetch the Total runs and few other metrics, scored by batsmen in Indian Premier League from 2008-2022.
1.I would like to sort to show the highest IPL scorers on top. How to do it?
How to rename line 3 as Year? Extract (Year from Start_Date) as Year throws an error that Start_Date is not aggregated neither grouped
Player,
Team,
Extract (Year from Start_Date),
count (Start_Date) AS Matches,
COUNT(Balls) AS Innings,
(COUNT(Balls) - COUNTIF(NotOut>0)) AS Innings_Out,
SUM(Runs) AS TotalRuns,
CAST(SAFE_DIVIDE(SUM(Runs),
(COUNT(Balls)-COUNTIF(NotOut>0))) AS INT64) AS Average,
SUM(_4s) AS Fours,
SUM(_6s) AS Sixes
FROM
ipl_all_batting.ipl
GROUP BY
Player,
Team,
extract (year from Start_Date)
ORDER BY
Player

Related

Computing the YTD using the PARTITION BY clause

I've the following dataset:
I need to compute the YTD Revenue for each combination of sender_country, receiver_country. That is for example, I should get for Canada-Nigeria 5.08 for the year 2016 and 10.24 for 2017 (if i'm not wrong). I've tried to apply the following:
SELECT sender_country, receiver_country, date_move, revenue, SUM(revenue) OVER (PARTITION BY sender_country, receiver_country ORDER BY date_move) as YTD FROM (TABLE));
However, I'm getting the following result:
As you can see I'm aggregating the results for each pair of sender_country and receiver_country but I don't manage to "reset" the aggregated revenue for each year.
Thanks in advance
Add the year to the partition with either TRUNC(date_move, 'YY') or EXTRACT(YEAR FROM date_move):
SELECT sender_country,
receiver_country,
date_move,
revenue,
SUM(revenue) OVER (
PARTITION BY sender_country, receiver_country, TRUNC(date_move, 'YY')
ORDER BY date_move
) AS YTD
FROM table_name;

oracle's analytical function issue

Please let me know if the following is off topic, or not clear, or too specific, or too complex to understand. I think the following is a challenge to describe, understand and solve.
CIF=cost, insurance, frieght (basically it is the import value)
The simiplified version of input table (Import) looks like this:
enter image description here
So from January to June the value 1 is assigned to SixMonthPeriod column, and the rest of the months are given the value 2.
I then want to calculate unit price for a six period, thus I use
select SixMonthPeriod, ProductDescrip, Sum(weight), Sum (CIF), (Sum (CIF))/(Sum(weight)) as UnitPrice
from Import
group by SixMonthPeriod, ProductDescrip;
This is fine, but I then want to calculate inflation for each product (over a six month period )where I need to use lag (an oracle analytical function). The six month period has to be fixed. Thus, if the previous period for a particular product is missing, then the unit price should be zero. I want to re-begin/begin the calculation of inflation for each product. The unit price and inflation equations looks like the following, respectively:
unit price = (Sum(weight) over a six month period)/(Sum (CIF) over a six month period)
inflation = (Current Unit price - previous unit price)/(previous unit price)
I use the following SQL to calculate inflation for a six month period for each product, where the calculation begins again for each product:
select Yr, SixMthPeriod, Product, UnitPrice, LagUnitPrice, ((UnitPrice -LagUnitPrice)/LagUnitPrice) as inflation
from (select Year as Yr, SixMonthPeriod as SixMthPeriod,
ProductDescrip as product, (Sum (CIF))/(Sum(weight)) as UnitPrice,
lag((Sum (CIF))/(Sum(weight)))
over (partition by ProductDescrip order by YEAR, SixMonthPeriod) as LagUnitPrice
From Import
group by Year, SixMonthPeriod, ProductDescrip)
The problem is the inflation period is not fixed.
For example, for the result, I get the following:
enter image description here
The first two rows are fine and there should be null values because they are my first line, thus there is no LagUnitPrice and inflation.
The third line has a problem where it has taken 0.34 as the LagUnitPrice but actually it is zero (for the period 2016 where SixMthPeriod=1 for the product barley). the oracle analytical functions does not take into account missing rows (e.g. for the period 2016 where SixMthPeriod=1 for the product barley).
how do I fix this problem (if you understand me)?
I have 96 rows, thus I can export the file to excel, and use excel's formulas to fix these exceptions.
You can autogenerate missing periods with nullable price, attach them to your data and do the rest as you did:
select product, year, smp, price, prev_price, (price - prev_price) / prev_price inflation
from (
select product, year, smp, price,
lag(price) over (partition by product order by year, smp) prev_price
from (
select year, ProductDescrip product, SixMonthPeriod smp, sum(CIF)/sum(weight) price
from Import
group by year, SixMonthPeriod, ProductDescrip) a
full join (
select distinct year, productdescrip product, column_value smp
from import cross join table(sys.odcinumberlist(1, 2))) b
using (product, year, smp))
order by product, year, smp
SQLFiddle demo
Subquery b is responsible for generating all periods, you can run it separately to see what it produces.

How can i solve this query in sql oracle?

It's an exercise that is not solved in the book in which I am studying.
The goal is to find the seller who has had the highest number of sales per month,
during all the months for which there is registered information. The problem is that I do not know how to divide tuples into periods of one month.
First table is:
Table Sellers
Id_seller
Name_Product
And the other one is:
Table Product
Name_Product
View_datetime
Budget
What did i do?
I made this query:
SELECT id_seller FROM(SELECT id_seller, COUNT(id_seller)
FROM SELLERS INNER JOIN PRODUCT
ON SELLERS.name_product = PRODUCT.name_product
GROUP BY id_seller HAVING COUNT(id_seller)>= 1
ORDER BY 2 DESC)
WHERE ROWNUM = 1;
The query returns me the seller that most sales has done, but not "per month since there are records" as the statement asks. Any ideas? I'm so lost...
The idea is to compare the total sales of each salesman in this month (sysdate), with those of a month ago, two months ago ... so long as there are older records. And get the maximum from each seller. And then you print the seller with more sales from the previous list. If a seller sells 400 products this month(April, the sysdate), but another seller sold in October last year 500, the result would be the second seller . That's what I do not know how to do.
Thanks ^^
You could try this query
select MonthName, id_seller, max(TotalSales) from (
select to_char(sysdate, 'Month') MonthName, sellers.id_seller, count(sellers.id_seller) TotalSales
from sellers inner join product
on sellers.name_product = product.name_product
group by to_char(view_datetime, 'Month'), sellers.id_seller
) tab
group by MonthName, id_seller
There are a few points to make...
The tables are weird. I assume your table sellers would better be called sales, right?
In this example, having count... >= 1 is a no-op. Count could only be 0 if there were no rows at all, in which case there would be no row in the group- by output. You can just leave this count away, here.
To get the sales per month, just add the month to the group by. I.e. group by id_seller, To_date(view_datetime,'YYYYMM').

using alias for cast in HIVE

I have a table called loan with loan amount,annual income, year (MMM-YY format) and member id. I am trying to find the highest loan amount in a year along wit annual income and member id details.
I tried to group the highest loan amount by year using the code
select max(cast(loan_amt as int)),issue_d from loan group by issue_d;
then I wanted also to fetch the member id and annual income information so I wrote the following code
but it is giving me error message for using alias for a column which is cast.
Code:
select a.loan_amt,a.member_id,a.annual_inc,a.issue_d
from
(select loan_amt,member_id,annual_inc,issue_d from loan) a
join
(select max(cast(loan_amt as int)) as ml,issue_d from loan group by issue_d) c
where ((a.issue_d=c.issue_d) and (a.loan_amt=a.ml));
What you want to do is rank the records based on the Amount, per Period, then keep only the top 1 record for each Period.
Use one of the analytic functions that are designed exactly for that purpose -- Hive has a pretty good support of the SQL standard on that topic.
Since you don't say what to do about ties (i.e. what if several loans have the same Amount???) I assume you want just one record chosen randomly...
select X, Y, Z, Period, Amount as TopAmount
from
(select X, Y, Z, Period, cast(StrAmt as double) as Amount,
row_number() over (partition by Period order by cast(StrAmt as double) desc) as TmpRank
from WTF
) TMPWTF
where TmpRank =1
If you want all the records with top Amount then replace row_number with rank or dense_rank (the "dense" stuff would make a difference for the top 2, but not for the top 1)

How to add one month to month and year

I am working in oracle and new to coding and new to this site so I apologize in advance for the newbie question:
I have a script I am trying to run that will return the sum of next months' sales orders and compare that figure against our budgeted sales forecast. It was working great last month (November) when I set it up but now that it's December, I believe it's having problems figuring out that next month is a new year.
Essentially I just want to sum of our sales order records from the next month and compare that number against our forecast number.
Here is what I have so far (I'm sure I am making lots of grammatical mistakes so please be patient!)
select
"Backlog", "Forecast Amount" , round("Backlog"/"Forecast Amount",4) as "Backlog Percent"
from
(select round(sum(NVL(unit_price,0) *NVL( ship_quan,0)),2) as "Backlog"
from v_backlog_releases
where
(TO_CHAR(V_BACKLOG_RELEASES.PROMISE_DATE,'MM\YYYY') = TO_CHAR(sysdate,'MM\YYYY')+1)),
(select budamount as "Forecast Amount"
from
glbudget,
glperiods
where
glbudget.glperiods_id=glperiods.id and
TO_CHAR(GLPERIODS.START_DATE,'MM') = TO_CHAR(sysdate,'MM')+1)
The system won't let me post images of the output since I am too new. Essentially I should get something that looks like this:
Backlog | Forecast Amount | Backlog Percent
100,000 | 200,000 | .50
The backlog column is just a sum of ship quantities * price for all orders due to ship the following month.
Your issue is that for December TO_CHAR(sysdate, 'MM') + 1 is returning 13 instead of 1 of the next year. Obviously there is no month 13...
Try using ADD_MONTHS(sysdate, 1) instead and handle that result as appropriate. Best advice is to handle dates as dates instead of chars whenever possible.
Update based on comments:
Try using:
EXTRACT(MONTH FROM GLPERIODS.START_DATE) = EXTRACT(MONTH FROM ADD_MONTHS(sysdate, 1))
Documentation: https://docs.oracle.com/cd/B14117_01/server.101/b10759/functions045.htm

Resources