Calculate Percentiles Using Two Variables

I have created a percentile measure based on one column of data (ENGAGED_DAYS), but where more than one person has the same value (e.g. PERSON_CODE 2 and 3) I'd like to use another column (REQUESTS_PER_ACTIVE_DAY) as a tie-breaker.
The data looks like this...
|COURSE_CODE|PERSON_CODE|ENGAGED_DAYS|REQUESTS_PER_ACTIVE_DAY|
|-----------|-----------|------------|-----------------------|
|MATHS101 |1 |10 |15 |
|MATHS101 |2 |15 |11 |
|MATHS101 |3 |15 |17 |
|MATHS101 |4 |20 |10 |
I created this measure...
EngDays = SUM('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'[ENGAGED_DAYS])
And this is the measure I created, which calculates the percentile based on ENGAGED_DAYS only...
Course_Activity_Percentile =
VAR EngDays =
    [EngDays]
RETURN
    IF(
        HASONEVALUE('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'[PERSON_CODE]),
        COALESCE(
            DIVIDE(
                -- Numerator (below) counts values that are < the student's active day count in the course
                CALCULATE(
                    COUNTROWS('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'),
                    FILTER(
                        ALLEXCEPT('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE','All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'[CANVAS_COURSE_CODE]),
                        'All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'[ENGAGED_DAYS] < EngDays
                    )
                ),
                -- Denominator (below) counts the total students in the course
                CALCULATE(
                    COUNTROWS('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'),
                    ALLEXCEPT('All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE','All Faculty Course Activity Percentile 2022-04-04 - NO PERCENTILE'[CANVAS_COURSE_CODE])
                )
            ),
            0
        )
    )
Can anyone help me incorporate this additional factor into the calculation?
Thanks very much for your help.
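One way to reason about the tie-breaker, sketched in Python rather than DAX: compare people on the pair (ENGAGED_DAYS, REQUESTS_PER_ACTIVE_DAY) and count how many people in the same course sort strictly below that pair. This is only an illustration of the logic under that assumption. The function name `percentile_with_tiebreak` and the tuple layout are hypothetical, and the rows are the sample data from the table above.

```python
# Sample rows: (course_code, person_code, engaged_days, requests_per_active_day)
ROWS = [
    ("MATHS101", 1, 10, 15),
    ("MATHS101", 2, 15, 11),
    ("MATHS101", 3, 15, 17),
    ("MATHS101", 4, 20, 10),
]


def percentile_with_tiebreak(rows, course, person):
    """Share of students in the course who rank strictly below this person.

    Ranking is by engaged days first; requests per active day only
    decides between students with the same number of engaged days.
    """
    peers = [r for r in rows if r[0] == course]
    me = next(r for r in peers if r[1] == person)
    # Tuple comparison is lexicographic, so the second element is
    # consulted only when the first elements are equal.
    below = sum(1 for r in peers if (r[2], r[3]) < (me[2], me[3]))
    return below / len(peers)


for person in (1, 2, 3, 4):
    print(person, percentile_with_tiebreak(ROWS, "MATHS101", person))
```

With the sample data, persons 2 and 3 no longer share a percentile: person 2 gets 0.25 and person 3 gets 0.5. In the measure, this would presumably mean widening the FILTER condition to also count rows with equal ENGAGED_DAYS and a smaller REQUESTS_PER_ACTIVE_DAY.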

Related

LOOKUPVALUE based upon aggregate function in DAX

I need a calculated column (because this will be used in a slicer) that returns the employee's most recent supervisor.
Data sample (table 'Performance'):
EMPLOYEE | DATE | SUPERVISOR
--------------------------------------------
Jim | 2018-11-01 | Bob
Jim | 2018-11-02 | Bob
Jim | 2018-11-03 | Bill
Mike | 2018-11-01 | Steve
Mike | 2018-11-02 | Gary
Desired Output:
EMPLOYEE | DATE | SUPERVISOR | LAST SUPER
---------------------------------------------------------------
Jim | 2018-11-01 | Bob | Bill
Jim | 2018-11-02 | Bob | Bill
Jim | 2018-11-03 | Bill | Bill
Mike | 2018-11-01 | Steve | Gary
Mike | 2018-11-02 | Gary | Gary
I tried to use
LAST SUPER =
LOOKUPVALUE (
Performance[SUPERVISOR],
Performance[DATE], MAXX ( Performance, [DATE] )
)
but I get the error:
Calculation error in column 'Performance'[]: A table of multiple
values was supplied where a single value was expected.
After doing more research, it appears this approach was doomed from the start. According to this, the search value cannot refer to any column in the same table being searched. However, even when I changed the search value to TODAY() or a static date as a test, I got the same error about multiple values. MAXX() is also returning the maximum date in the entire table, not just for that employee.
I wondered if it was a many to many issue, so I went back into Power Query, duplicated the original query, grouped by EMPLOYEE to get MAX(DATE), matched both fields against the original query to get the SUPERVISOR on MAX(DATE), and can treat this like a regular lookup table. While it does work, unsurprisingly the refresh is markedly slower.
I can't decide if I'm over-complicating, over-simplifying, or just wildly off base with either approach, but I would be grateful for any suggestions.
What I'd like to know is:
Is it possible to use a simple function like LOOKUPVALUE() to achieve the desired output?
If not, is there a more efficient approach than duplicating the query?

The reason LOOKUPVALUE is giving that particular error is that it's doing a lookup on the whole table, not just the rows associated with that particular employee. So if you have multiple supervisors matching the same maximal date, then you've got a problem.
If you want to use the LOOKUPVALUE function for this, I suggest the following:
Last Super =
VAR EmployeeRows =
    FILTER( Performance, Performance[Employee] = EARLIER( Performance[Employee] ) )
VAR MaxDate = MAXX( EmployeeRows, Performance[Date] )
RETURN
    LOOKUPVALUE(
        Performance[Supervisor],
        Performance[Date], MaxDate,
        Performance[Employee], Performance[Employee]
    )
There are two key differences here:
1. I'm taking the maximal date over only the rows for that particular employee (EmployeeRows).
2. I'm including Employee in the lookup function, so that it only matches for the appropriate employee.
For other possible solutions, please see this question:
Return top value ordered by another column
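The same two-step idea (find each employee's latest date, then read the supervisor on that date) can be sketched outside DAX. The Python below is only an illustration on the sample data from the question; the function name `last_supervisor` is hypothetical.

```python
from datetime import date

# Sample rows from the question: (employee, date, supervisor)
PERFORMANCE = [
    ("Jim", date(2018, 11, 1), "Bob"),
    ("Jim", date(2018, 11, 2), "Bob"),
    ("Jim", date(2018, 11, 3), "Bill"),
    ("Mike", date(2018, 11, 1), "Steve"),
    ("Mike", date(2018, 11, 2), "Gary"),
]


def last_supervisor(rows):
    """Map each employee to the supervisor on that employee's latest date."""
    latest = {}  # employee -> (latest date seen so far, supervisor on that date)
    for employee, day, supervisor in rows:
        if employee not in latest or day > latest[employee][0]:
            latest[employee] = (day, supervisor)
    return {employee: sup for employee, (_, sup) in latest.items()}


print(last_supervisor(PERFORMANCE))
```

Restricting the maximum to one employee's rows is what avoids the "multiple values" problem: a table-wide maximum date can match rows belonging to several employees.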

How to query to find times of departure and arrival on laravel 5?

I am trying to look up the departure and arrival times of a bus.
My database
id | bus_id | city | time
1 |1 | 1 | 05:00
2 |1 | 2 | 06:00
3 |1 | 3 | 07:00
For example, if the user chooses origin city = 1 and destination city = 2,
then the result I expect is
bus | time departure | time arrival
1 | 05:00 | 06:00
my query
$departures = Departure::whereBetween(
'city', [
$request->origin, $request->destination
])
->get();

If each bus has only one record per city, you can try the query below.
If a bus has several records for a city, it returns the maximum departure time and the maximum arrival time.
It merges the two records for a bus (bus 1 at the origin and bus 1 at the destination) and takes the maximum time as the departure or arrival time:
// Rows for the origin city carry the departure time.
$departure_citys = Departure::where('city', $request->origin)
    ->selectRaw('id,
        bus_id,
        city,
        time AS time_departure,
        NULL AS time_arrival');
// Rows for the destination city carry the arrival time.
$arrival_citys = Departure::where('city', $request->destination)
    ->selectRaw('id,
        bus_id,
        city,
        NULL AS time_departure,
        time AS time_arrival');
$departures = $departure_citys->unionAll($arrival_citys);
// Group by bus_id (not city) so that each bus's departure row and
// arrival row collapse into a single result row.
DB::table(DB::raw("({$departures->toSql()}) AS tr"))
    ->mergeBindings($departures->getQuery())
    ->groupBy('bus_id')
    ->selectRaw('bus_id,
        MAX(time_departure) AS time_dept,
        MAX(time_arrival) AS time_arvl')
    ->get();
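The union-then-aggregate idea can be checked on the sample data with a small Python sketch. It is only an illustration, not Laravel code: the function name `trips` and the dictionary layout are hypothetical, and unlike the query it returns only buses that have both an origin and a destination record.

```python
# Sample rows from the question's table.
DEPARTURES = [
    {"id": 1, "bus_id": 1, "city": 1, "time": "05:00"},
    {"id": 2, "bus_id": 1, "city": 2, "time": "06:00"},
    {"id": 3, "bus_id": 1, "city": 3, "time": "07:00"},
]


def trips(rows, origin, destination):
    """Return (bus_id, departure_time, arrival_time) for each bus."""
    departure = {}  # bus_id -> latest time at the origin city
    arrival = {}    # bus_id -> latest time at the destination city
    for row in rows:
        bus, time = row["bus_id"], row["time"]
        # Zero-padded "HH:MM" strings compare in chronological order.
        if row["city"] == origin:
            departure[bus] = max(departure.get(bus, time), time)
        if row["city"] == destination:
            arrival[bus] = max(arrival.get(bus, time), time)
    return [(bus, departure[bus], arrival[bus])
            for bus in sorted(departure) if bus in arrival]


print(trips(DEPARTURES, origin=1, destination=2))
```

Keying both dictionaries on `bus_id` plays the role of grouping by bus: the departure and arrival times end up on the same output row.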

Weighted average over two columns in different tables

I have a table called Products and another table called Sales. They are related by ProductId. I need to find the weighted average for a selected list of products in Products table. The formula is
(Product1.UnitCost * Sales[ProductId1].ItemsSold +
Product2.UnitCost * Sales[ProductId2].ItemsSold + ...) /
(Total sum of the chosen products items sold)
How can I write a DAX formula for this?
Products
ProductId | Name | Description | UnitItemCost
------------|---------|---------------|----------------
id1 | Name1 | Description1 | 10
id2 | Name2 | Description2 | 20
id3 | Name3 | Description3 | 30
Sales
ProductId | ItemsSold
-----------|-------------- 1714.126984
Id1 | 20
id2 | 30
id1 | 10
id2 | 40
id3 | 50
id3 | 39
Average unit cost = 23.12 (10*30+20*70+30*89)/189

I'm not sure about the logic in your example above. It looks like you're trying to take UnitItemCost * ItemsSold for each product, sum those products together, and divide by the total ItemsSold.
That should be ((10*30)+(20*70)+(30*89)) = 4370 divided by 189, which is 23.12.
If that's the case, you can create a calculated measure like so:
Average unit cost =
-- create a summary table, one row per product id, with a 'Cost * Sold' column giving you UnitItemCost * ItemsSold for each product
VAR my_table =
    SUMMARIZE (
        Sales,
        Sales[ProductId],
        "Cost * Sold", MAX ( Products[UnitItemCost] ) * SUM ( Sales[ItemsSold] )
    )
RETURN
    -- take the sum of UnitItemCost * ItemsSold for each product (4370 in your example) divided by the total ItemsSold (189 in your example)
    SUMX ( my_table, [Cost * Sold] ) / SUM ( Sales[ItemsSold] )
As long as your Products and Sales tables are related via ProductId, this should work. After testing it on my end with your sample data, I'm getting 23.12.

As Rory points out, your 1714.126984 value does not seem to match your example, but the weighted average unit cost can be calculated using a relatively simple measure:
Average unit cost = DIVIDE(
SUMX(Sales,
Sales[ItemsSold] * RELATED(Products[UnitItemCost])),
SUM(Sales[ItemsSold]))
This is the sum-product of the items and their associated cost divided by the total number of items sold.
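The arithmetic behind the sum-product can be reproduced with a short Python sketch on the sample data. It is only an illustration: the function name `weighted_average_cost` is hypothetical, and the product id `Id1` from the sample is assumed to be the same product as `id1`.

```python
# Products: ProductId -> UnitItemCost
PRODUCTS = {"id1": 10, "id2": 20, "id3": 30}

# Sales: (ProductId, ItemsSold), one tuple per sales row
SALES = [
    ("id1", 20),
    ("id2", 30),
    ("id1", 10),
    ("id2", 40),
    ("id3", 50),
    ("id3", 39),
]


def weighted_average_cost(unit_cost, sales):
    """Sum of unit cost times items sold, divided by total items sold."""
    total_items = sum(items for _, items in sales)
    if total_items == 0:
        return None  # nothing sold, so the average is undefined
    total_cost = sum(unit_cost[product] * items for product, items in sales)
    return total_cost / total_items


print(round(weighted_average_cost(PRODUCTS, SALES), 2))
```

Each sales row looks up its own unit cost, which is the role RELATED plays in the measure, so no intermediate per-product summary is needed.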

How to calculate day specific data from accumulated data

I have an Oracle table with values for two accounts, and each record has a date field. The first day of the week holds only the data for Day 1, but the data for Day 2 is cumulative. So we need to subtract the previous day's data from the Day 2 data to get the exact figure for Day 2, and likewise for Day 3 through Day 7.
Please suggest the best approach in a SQL query to handle this requirement. I am totally new to SQL and would really appreciate your input. As an example, the six columns and their headers are given below:
|Center|Entity|Bonus|Year|Period|Incentive|
|------|------|-----|----|------|---------|
|MANUFACTURING|NEW YORK|1200|FY18|31-12-2017|120|
|MANUFACTURING|NEW YORK|1500|FY18|01-01-2018|250|
|MANUFACTURING|NEW YORK|1800|FY18|01-01-2018|320|
So assuming Dec 31, 2017 is the first day of the week, that record shows only the data for Day 1. When we move on to Day 2 of the week, i.e. Jan 01, 2018, the record holds cumulative data covering Day 1 and Day 2. So we need to subtract the Day 1 data from the Day 2 data to get the exact figure for Day 2: 1500 - 1200 = 300. We need to follow the same approach for Day 3 through Day 7.
Expected output is given below
|Center|Entity|Bonus|Year|Period|Incentive|
|------|------|-----|----|------|---------|
|MANUFACTURING|NEW YORK|1200|FY18|01-01-2018|120|
|MANUFACTURING|NEW YORK|300|FY18|01-01-2018|130|
|MANUFACTURING|NEW YORK|300|FY18|01-01-2018|70|

You could use a simple LAG() function with NVL().
select Center, Entity,
       Bonus - NVL( LAG(Bonus, 1) OVER ( ORDER BY Period ), 0) Bonus,
       Year, Period,
       Incentive - NVL( LAG(Incentive, 1) OVER ( ORDER BY Period ), 0) Incentive
FROM yourtable;

You can do a self join of your table on period and subtract the previous date's data.
It looks like you have a typo in the test data and the result: the dates should be incremental, as stated in the description, but they contain duplicates.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE t
(Center varchar2(13), Entity varchar2(8), Bonus int, Year varchar2(4), Period DATE, Incentive int)
;
INSERT INTO t (Center, Entity, Bonus, Year, Period, Incentive)
VALUES ('MANUFACTURING', 'NEW YORK', 1200, 'FY18', DATE '2017-12-31', 120);
INSERT INTO t (Center, Entity, Bonus, Year, Period, Incentive)
VALUES ('MANUFACTURING', 'NEW YORK', 1500, 'FY18', DATE '2018-01-01', 250);
INSERT INTO t (Center, Entity, Bonus, Year, Period, Incentive)
VALUES ('MANUFACTURING', 'NEW YORK', 1800, 'FY18', DATE '2018-01-02', 320);
Query:
select t1.center,
t1.entity,
t1.bonus - nvl (t2.bonus,0) bonus,
t1.year,
t1.period,
t1.incentive - nvl(t2.incentive,0) incentive
from t t1
left outer join t t2
on t1.period = t2.period + 1
order by t1.period
Results:
| CENTER | ENTITY | BONUS | YEAR | PERIOD | INCENTIVE |
|---------------|----------|-------|------|----------------------|-----------|
| MANUFACTURING | NEW YORK | 1200 | FY18 | 2017-12-31T00:00:00Z | 120 |
| MANUFACTURING | NEW YORK | 300 | FY18 | 2018-01-01T00:00:00Z | 130 |
| MANUFACTURING | NEW YORK | 300 | FY18 | 2018-01-02T00:00:00Z | 70 |
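Both answers amount to "subtract the previous day's cumulative value, treating a missing previous day as zero". A minimal Python sketch of that step, using the corrected dates from the schema setup above, is shown here as an illustration only; the function name `daily_values` is hypothetical.

```python
from datetime import date

# Cumulative rows: (period, bonus, incentive), dates as in the schema setup
ROWS = [
    (date(2017, 12, 31), 1200, 120),
    (date(2018, 1, 1), 1500, 250),
    (date(2018, 1, 2), 1800, 320),
]


def daily_values(rows):
    """Turn cumulative (period, bonus, incentive) rows into per-day values."""
    result = []
    prev_bonus = prev_incentive = 0  # no previous day: subtract zero
    for period, bonus, incentive in sorted(rows):
        result.append((period, bonus - prev_bonus, incentive - prev_incentive))
        prev_bonus, prev_incentive = bonus, incentive
    return result


for row in daily_values(ROWS):
    print(row)
```

Starting the previous values at zero plays the role of NVL(..., 0) in the queries. Sorting by period first matters, because the subtraction is only meaningful between consecutive days.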

sort_array order by a different column, Hive

I have two columns, one of products, and one of the dates they were bought. I am able to order the dates by applying the sort_array(dates) function, but I want to be able to sort_array(products) by the purchase date.
Is there a way to do that in Hive?
Tablename is
ClientID Product Date
100 Shampoo 2016-01-02
101 Book 2016-02-04
100 Conditioner 2015-12-31
101 Bookmark 2016-07-10
100 Cream 2016-02-12
101 Book2 2016-01-03
Then, getting one row per customer:
select
clientID,
COLLECT_LIST(Product) as Prod_List,
sort_array(COLLECT_LIST(date)) as Date_Order
from tablename
group by 1;
As:
ClientID Prod_List Date_Order
100 ["Shampoo","Conditioner","Cream"] ["2015-12-31","2016-01-02","2016-02-12"]
101 ["Book","Bookmark","Book2"] ["2016-01-03","2016-02-04","2016-07-10"]
But what I want is the order of the products to be tied to the correct chronological order of purchases.

It is possible to do it using only built-in functions, but it is not a pretty sight :-)
select clientid
,split(regexp_replace(concat_ws(',',sort_array(collect_list(concat_ws(':',cast(date as string),product)))),'[^:]*:([^,]*(,|$))','$1'),',') as prod_list
,sort_array(collect_list(date)) as date_order
from tablename
group by clientid
;
+----------+-----------------------------------+------------------------------------------+
| clientid | prod_list | date_order |
+----------+-----------------------------------+------------------------------------------+
| 100 | ["Conditioner","Shampoo","Cream"] | ["2015-12-31","2016-01-02","2016-02-12"] |
| 101 | ["Book2","Book","Bookmark"] | ["2016-01-03","2016-02-04","2016-07-10"] |
+----------+-----------------------------------+------------------------------------------+
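The query works by gluing each date to its product, sorting the combined strings, and then stripping the date prefix off again. The same idea is easier to see in a short Python sketch on the sample data; this is only an illustration, and the function name `products_by_purchase_date` is hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (client_id, product, purchase_date)
PURCHASES = [
    (100, "Shampoo", "2016-01-02"),
    (101, "Book", "2016-02-04"),
    (100, "Conditioner", "2015-12-31"),
    (101, "Bookmark", "2016-07-10"),
    (100, "Cream", "2016-02-12"),
    (101, "Book2", "2016-01-03"),
]


def products_by_purchase_date(rows):
    """Group products per client, ordered by purchase date."""
    pairs = defaultdict(list)
    for client, product, day in rows:
        # Put the date first so that sorting the pairs sorts by date.
        pairs[client].append((day, product))
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return {client: [product for _, product in sorted(items)]
            for client, items in pairs.items()}


print(products_by_purchase_date(PURCHASES))
```

Putting the date first in each pair is the counterpart of `concat_ws(':', date, product)` in the query: the sort key travels with the value it orders.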
