Power query, iterate over the column records to apply a custom cumulative calculation - powerquery

Using Power Query in Excel. I am trying to implement a custom column that would iteratively calculate the row based on the previous row's value of the same column.
I have a 3 column table and the 4th column will be the calculation column that I am failing to implement.
The calculation is very easy to apply in Excel which goes as follows:
Formula in cell D3 --> =IF(A3=1,C3+6.4,IF(C3+D2>=12.8,12.8,IF(C3+D2<=1.28,1.28,C3+D2)))
The same formula is applied to the whole column by dragging.
The idea behind it:
For each category, I have an index column starting from 1,
If Index = 1, then Calculation is Value + 6.4,
else if Value + Value(previous row Custom cumulative) >= 12.8 then 12.8
else if Value + Value(previous row Custom cumulative) <= 1.28 then 1.28
else Value + Value(previous row Custom cumulative)
So, the calculation is a cumulative sum with an upper and lower cap built into it.
How can I implement this in Power Query and M-Language?
I really appreciate your help!
I have tried to use List.Generate and List.Accumulate, but I got stuck creating records that hold values from multiple columns.

Try this
(edited to make more efficient with single pass process)
let Source = Excel.CurrentWorkbook(){[Name="Table15"]}[Content],
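// the state list starts as {0}; that seed keeps the results aligned with the 1-based Index column
process = (zzz as list) => let x= List.Accumulate( zzz,{0},( state, current ) =>
// only the seed is present on the first row; testing the count (not the last value) stays correct if a result is exactly 0
if List.Count(state) =1 then List.Combine ({state,{6.4+current}}) else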
if List.Last(state)+current >=12.8 then List.Combine ({state,{12.8}}) else
if List.Last(state)+current <=1.28 then List.Combine ({state,{1.28}}) else
List.Combine ({state,{List.Last(state)+current}})
) in x,
#"Grouped Rows" = Table.Group(Source, {"Category"}, {{"data", each
let a=process(_[Values])
in Table.AddColumn(_, "Custom Cumulative", each a{[Index]}), type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Index", "Values", "Custom Cumulative"}, {"Index", "Values", "Custom Cumulative"})
in #"Expanded data"
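As a minimal sketch of the same capped running total on a literal list, assuming a single category: the sample values and the Clamp helper are made up for illustration and are not part of the answer above.

```powerquery
let
    // Hypothetical sample values for one category, in Index order
    Values = {1, 3, 5, -20, 2},
    // Hypothetical helper: keeps x between the lower cap 1.28 and the upper cap 12.8
    Clamp = (x as number) as number => List.Min({12.8, List.Max({1.28, x})}),
    // state is the list of results so far; an empty list means "first row"
    Result = List.Accumulate(
        Values,
        {},
        (state, current) =>
            if List.IsEmpty(state)
            then {current + 6.4}                                // Index = 1: Value + 6.4, no cap applied
            else state & {Clamp(List.Last(state) + current)}    // capped running total
    )
in
    Result
```

With these sample values the result should be approximately {7.4, 10.4, 12.8, 1.28, 3.28}. Because this list has no seed element, a row with Index n would read position n-1.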

Related

(Power Query) Complicated sort

I have a complicated sorting that I want, and I'm just not sure how to get power query to do it. The TLDR version is "oldest new ones first, then newest old ones." So I want to split the sort between ascending/descending depending on what data are in the columns.
Certain columns on my sheet (I through K) contain the word 'Yes' if it is a new item, otherwise blank. Possible combinations of columns that have 'yes' in them:
I only, J only, K only, I + J, J + K, I + J + K
Here's the sort logic I want:
All rows with a Yes in K are listed first, ascending by date (column H), whether they have 'Yes' in columns I or J or not.
Then, Of only the rows that are left, all rows with a Yes in J, ascending by date (column H)
Next, Of only the rows that are left, all rows with a Yes in I, ascending by date (column H)
Finally, the only rows left should not have a Yes in any columns I-K. Of those rows, DEscending by date (Column H).
I can sort of maybe figure out how to do the sort up through step 3 by creating a custom column to label and identifying whether the row will go in the first, second, or third sort, then sorting by that custom column before sorting the others.
But step 4 is stumping me because of the reverse to descending instead of ascending. I'm thinking maybe grouping the data, sorting it within the group descending and outside the group ascending (as a 4th entry in the custom column that sorted the first 3), and then expanding it back out again after the external sort, or something?
Please help!
Currently I'm only able to sort the sheet ascending and can't sort part of it descending.
Filter the rows for each case and sort each filtered set, then put the sorted sets together.
Load your data into Power Query (Data > From Table/Range) and paste the code below into Home > Advanced Editor. It assumes your data is loaded as Table1 with column headers A, H, I, J, K, so change those to match your actual table name and column names. If you already have your own code, remove the first row and change the Source in the second row to your #"PriorStepName".
sample code to transform image below on left to image on right:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"H", type date}, {"I", type text}, {"J", type text}, {"K", type text}}),
Part1 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] = "Yes")),{{"H", Order.Ascending}}),
Part2 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] = "Yes")),{{"H", Order.Ascending}}),
Part3 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] <> "Yes" and [I] = "Yes")),{{"H", Order.Ascending}}),
Part4 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] <> "Yes" and [I] <> "Yes")),{{"H", Order.Descending}}),
Combined = Table.Combine({Part1,Part2,Part3,Part4})
in Combined
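An alternative that may be worth trying is a single sort on two helper columns. The sketch below assumes the same column names (A, H, I, J, K); the inline sample table and the helper column names SortGroup and SortKey are invented for illustration. Negating the date's serial number for the last group turns an ascending sort into a descending one for those rows only.

```powerquery
let
    // Hypothetical sample data standing in for Table1
    Source = #table(
        type table [A = Int64.Type, H = date, I = nullable text, J = nullable text, K = nullable text],
        {
            {1, #date(2021, 1, 5), "Yes", null, null},
            {2, #date(2021, 1, 1), null, null, "Yes"},
            {3, #date(2021, 1, 3), null, null, null},
            {4, #date(2021, 1, 9), null, null, null},
            {5, #date(2021, 1, 2), null, "Yes", "Yes"},
            {6, #date(2021, 1, 4), null, "Yes", null}
        }
    ),
    // 1 = Yes in K, 2 = Yes in J (not K), 3 = Yes in I only, 4 = no Yes at all
    AddGroup = Table.AddColumn(Source, "SortGroup", each
        if [K] = "Yes" then 1
        else if [J] = "Yes" then 2
        else if [I] = "Yes" then 3
        else 4, Int64.Type),
    // Negate the date number for group 4 so ascending order on SortKey is descending by date
    AddKey = Table.AddColumn(AddGroup, "SortKey", each
        if [SortGroup] = 4 then -Number.From([H]) else Number.From([H]), type number),
    Sorted = Table.Sort(AddKey, {{"SortGroup", Order.Ascending}, {"SortKey", Order.Ascending}}),
    Result = Table.RemoveColumns(Sorted, {"SortGroup", "SortKey"})
in
    Result
```

For this sample, column A should come out in the order 2, 5, 6, 1, 4, 3.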

Powerquery: passing column value to custom function

I'm struggling on passing the column value to a formula. I tried many different combinations but I only have it working when I hard code the column,
(tbl as table, col as list) =>
let
avg = List.Average(col),
sdev = List.StandardDeviation(col)
in
Table.AddColumn(tbl, "newcolname" , each ([column] - avg)/sdev)
I'd like to replace [column] by a variable. In fact, it's the column I use for the average and the standard deviation.
Please any help.
Thank you
This probably does what you want. Call it as x = fctn(Source,"ColumnA").
It calculates the average and standard deviation from ColumnA of the Source table and applies the calculation to that same column.
(tbl as table, col as text) =>
let
avg = List.Average(Table.Column(tbl,col)),
sdev = List.StandardDeviation(Table.Column(tbl,col))
in Table.AddColumn(tbl, "newcolname" , each (Record.Field(_, col) - avg)/sdev)
Potentially you want this instead. It calculates the average and standard deviation on the list provided (which can come from any table) and applies the subsequent calculation to the named column in the table passed in.
Call it as x = fctn(Source,"ColumnNameInSource",SomeSource[SomeColumn])
(tbl as table, cname as text, col as list) =>
let
avg = List.Average(col),
sdev = List.StandardDeviation(col)
in Table.AddColumn(tbl, "newcolname" , each (Record.Field(_, cname) - avg)/sdev)
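A self-contained usage sketch of the first variant, assuming a small inline table. The sample table, the name fctn, and the "_z" suffix for the new column are illustrative choices only.

```powerquery
let
    // The function takes the column NAME as text and looks the value up per row
    fctn = (tbl as table, col as text) as table =>
        let
            values = Table.Column(tbl, col),
            avg = List.Average(values),
            // List.StandardDeviation returns the sample-based estimate
            sdev = List.StandardDeviation(values)
        in
            Table.AddColumn(tbl, col & "_z", each (Record.Field(_, col) - avg) / sdev, type number),
    // Hypothetical sample table
    Sample = #table({"ColumnA"}, {{1}, {2}, {3}}),
    Result = fctn(Sample, "ColumnA")
in
    Result
```

For the values 1, 2, 3 the average is 2 and the sample standard deviation is 1, so the new column should hold -1, 0 and 1.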

Convert Switch True() Dax calculated column to M query custom column

I am having issues with my calculated column and the multiple tables I am joining. It is not filtering my visuals correctly. After researching it was recommended to use a custom column in the query instead but I do not know where to start to convert the following DAX to M query.
overall =
VAR skills =
CALCULATETABLE (
VALUES ( tsr_skill[ts_skill] ),
ALLEXCEPT ( tsr_skill, tsr_skill[ts_tsr] )
)
RETURN
SWITCH (
TRUE (),
"JMSR" IN skills, "Senior",
"JMOV" IN skills, "Over",
"JMUN" IN skills, "Under",
"JMRH" IN skills, "RHT",
"MNT"
)
Data structure in Query:
How I would like the data to show in the Query instead of showing as a calculated column.
Preferred Output:
Based on your explanation, and the levels assigned in your DAX formula, it would seem that all should be assigned as "under".
In your "Preferred Output" you do show JMXX being assigned as "Over", but that tsr does not include the JMOV skill
If your written explanation is correct, and your Preferred Output screenshot incorrect based on the posted data, then, in PQ you can
Group by tsr
Create a custom aggregation returning the "overall" based on containing one of the skills listed in your DAX formula.
If that is not the case, please clarify how you are assigning "Over" to JMXX.
Edit: M Code simplified
M Code
let
//Source = the data structure you show
Source = Excel.CurrentWorkbook(){[Name="Table13"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ts_tsr", type text}, {"ts_skill", type text}}),
//Group rows by tsr, then check if it has one of the defined skills
//If so, return the appropriate ranking.
#"Grouped Rows" = Table.Group(#"Changed Type", {"ts_tsr"}, {
{"ALL", each _, type table [ts_tsr=nullable text, ts_skill=nullable text]},
{"overall", each if List.Contains([ts_skill],"JMSR") then "Senior"
else if List.Contains([ts_skill],"JMOV") then "Over"
else if List.Contains([ts_skill],"JMUN") then "Under"
else if List.Contains([ts_skill],"JMRH") then "RHT"
else "MNT"}
}),
//Then re-expand the table
#"Expanded ALL" = Table.ExpandTableColumn(#"Grouped Rows", "ALL", {"ts_skill"}, {"ts_skill"})
in
#"Expanded ALL"
Data
Output
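If the list of skills is likely to grow, the chain of if/else tests could be driven by a priority list instead. This is only a sketch under assumptions: the inline sample rows, the Priority list layout, and the helper name Rank are hypothetical, and the ranking is joined back to the rows rather than expanded from a nested table.

```powerquery
let
    // Hypothetical sample rows with the same two columns as the question
    Source = #table(
        {"ts_tsr", "ts_skill"},
        {{"T1", "JMUN"}, {"T1", "JMSR"}, {"T2", "ABCD"}, {"T3", "JMRH"}, {"T3", "JMOV"}}
    ),
    // {skill, label} pairs in priority order, mirroring the SWITCH branches
    Priority = {{"JMSR", "Senior"}, {"JMOV", "Over"}, {"JMUN", "Under"}, {"JMRH", "RHT"}},
    // Returns the label of the first priority skill found in the list, or "MNT" if none match
    Rank = (skills as list) as text =>
        let
            hit = List.First(List.Select(Priority, each List.Contains(skills, _{0})), null)
        in
            if hit = null then "MNT" else hit{1},
    Grouped = Table.Group(Source, {"ts_tsr"}, {{"overall", each Rank([ts_skill]), type text}}),
    // Bring the per-tsr ranking back onto every original row
    Joined = Table.NestedJoin(Source, {"ts_tsr"}, Grouped, {"ts_tsr"}, "g", JoinKind.LeftOuter),
    Result = Table.ExpandTableColumn(Joined, "g", {"overall"})
in
    Result
```

In this sample, T1 should be ranked "Senior", T2 "MNT" and T3 "Over".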

PowerQuery - use position of column instead of column name in calculation

New to PowerQuery and M-Code.
I have added a column with a calculation to get the max. Instead of using the hardcoded column name, I would like to use the position number of the column.
The current code is:
= Table.AddColumn(Source, "Maximum", each List.Max({[#"1-6-2021"], [#"1-5-2021"], [#"1-4-2021"]}), type number)
Instead of [#"1-6-2021"], I would like it to be column 3; for [#"1-5-2021"] column 4 etc.
How do I replace these columnnames with positions?
Many thanks for the help!
You can adjust the {x} part for the column # you want
0 is the first column, so this is max of columns 2/3/4
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
x= Table.AddColumn(Source, "Maximum", each List.Max({
Record.Field(_,Table.ColumnNames(Source){1}),
Record.Field(_,Table.ColumnNames(Source){2}),
Record.Field(_,Table.ColumnNames(Source){3})
}), type number)
in x
If you need to do a Max on a bunch of columns, below would, for example, do it for all columns except the first two, which are removed by the 2nd line
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
colToSum = List.RemoveFirstN(Table.ColumnNames(Source),2),
AddIndex = Table.AddIndexColumn(Source,"Index",0,1),
GetMax = Table.AddColumn(AddIndex, "Custom", each List.Max( Record.ToList( Table.SelectColumns(AddIndex,colToSum){[Index]}) ))
in GetMax
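A possibly simpler variant of the same idea reads the wanted fields straight from the current row, so no index column or per-row table lookup is needed. The inline sample table and the step name colsToMax are assumptions for illustration.

```powerquery
let
    // Hypothetical sample table: two leading columns, then the date-named value columns
    Source = #table(
        {"Id", "Name", "1-6-2021", "1-5-2021", "1-4-2021"},
        {{1, "a", 10, 30, 20}, {2, "b", 5, 2, 9}}
    ),
    // Pick columns by position: everything except the first two
    colsToMax = List.RemoveFirstN(Table.ColumnNames(Source), 2),
    // For each row (a record), keep only those fields and take the largest value
    GetMax = Table.AddColumn(Source, "Maximum", each
        List.Max(Record.ToList(Record.SelectFields(_, colsToMax))), type number)
in
    GetMax
```

For the two sample rows the Maximum column should contain 30 and 9.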

DAX EARLIER() function in Power Query

Is there an equivalent to EARLIER in M/Power Query?
Say, I have a table with lots of different dates in column DATE and a smaller number of letters in column LETTER. I now want the maximum date for each letter.
In DAX, I would use something like CALCULATE(MAX([Date]),FILTER(ALL(Table),[Letter]=EARLIER([Letter]))).
How would I achieve the same in M?
Thanks
2 Solutions in the code below. Notice that each uses "PreviousStep" as basis, so these are separate solutions.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
PreviousStep = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Letter", type text}}),
// 1. Add a column to the original table with the MaxDate for each letter
// "earlier" is just the name of a function parameter; it could as well have been "x" or "MarcelBeug"
AddedMaxDate = Table.AddColumn(PreviousStep, "MaxDate", (earlier) => List.Max(Table.SelectRows(PreviousStep, each [Letter] = earlier[Letter])[Date])),
// 2. Group by letter and get the MaxDate for each letter
GroupedOnLetter = Table.Group(PreviousStep, {"Letter"}, {{"MaxDate", each List.Max([Date]), type date}})
in
GroupedOnLetter
In short, there is no exact match for this function. Still, there are other ways to produce the same results.
To reproduce the example Microsoft gives in the help for the EARLIER function, you can use the following code (table1 is the table from that example before ranking):
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TVNNaxtBDP0rxuSoivn+uMYlLSUFE4f2YHIYd4d48Xq3rO1C/n01o1mc4+i9kZ6epP1+LcMa1o/94cvuOM3XCz0epHUgnccQ1m+wXytXGae8ekl/TpWhlACvBHrBDL8wdtc0dpWiLTgV0EVm1CrT9Trky4ooq016z5VnI2ij0OjKs402nVePM1XLrMgEcEaj8ZVU9czpxAmcAik1SlcxGSm2SX/5m4eoDVpToSJyc0z9WLEAwXgUrcX6a8hpzDNb4CAEhU5VuIjfzGk8XZoeGSVYpVBwd+X31zynfhjyjRM4A9FZ1NyWFhR7ymPX0hsJ0RuUbJ+s6DSzt96QtR4d96MK9m2Y/uVmfABtNVrWbSj2newc8iEtwjUoS401O2Rh5NQtyq0HZyNGFq4ZHs6Lz1aCjAopXmFV4I9uTtd+GlfbZfyR3IkafTOvJPlBneUPbj1GMCouMFkA6+f+/VhLcKjofp5aNmlBkKQ23JLs53QbrzSoVdkp3iYDWlgIzqBi6VJ9Jj7N6cxMA1ZSE16ga/XLTm3TOPZsPv8uora5SwNLMIIkK1Q8EF02bHs78xZJBS5alK1bCr1Mqbtro7+WfHPRoeZNk2Yh3XVpcNqBjgE9myuLrl3qaHg8GUUr5RYbVKlzP0kdLHhBJ9kOrsjfLQaWndCEWcZK8dfF7wcZIrkRUXNe7Ss6tzN8vR2WxTIQtMLQJl9Y023ux/d7o1JTHVOH0MyQ7hPv3isdh7F01gYFH5Aqvf7KF5akyLEYBYrmVpH0+5jz0C4nADEq+vYf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ProductSubcategoryKey = _t, EnglishProductSubcategoryName = _t, TotalSubcategorySales = _t]),
table1 = Table.TransformColumnTypes(Source,{{"ProductSubcategoryKey", Int64.Type}, {"EnglishProductSubcategoryName", type text}, {"TotalSubcategorySales", Currency.Type}}, "en-US"),
AddCount = Table.AddColumn(
table1,
"SubcategoryRanking", //(a) is a parameter for function, which equals current record, and function should return value for new cell of "SubcategoryRanking"
(a)=> Table.RowCount(
Table.SelectRows(
table1, //(b) equals whole table1. This function returns table filtered by given criteria
(b) => b[TotalSubcategorySales] < a[TotalSubcategorySales])
) + 1,
Int64.Type)
in
AddCount
I think you can use the Group By function to group the data by Letter and find the max of the date column. Your code would look like this:
= Table.Group(#"Previous step", {"Letter"}, {{"Max Date", each List.Max([Date]), type date}})
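If the maximum date is needed on every original row (closer to what EARLIER gives in a calculated column), one option is to group first and then join the grouped result back. A minimal sketch, assuming columns named Date and Letter; the inline sample table and the step names are hypothetical.

```powerquery
let
    // Hypothetical sample data
    PreviousStep = #table(
        type table [Date = date, Letter = text],
        {{#date(2021, 1, 1), "A"}, {#date(2021, 3, 1), "A"}, {#date(2021, 2, 1), "B"}}
    ),
    // One row per letter with its latest date
    MaxPerLetter = Table.Group(PreviousStep, {"Letter"}, {{"MaxDate", each List.Max([Date]), type date}}),
    // Join the per-letter maximum back onto every original row
    Joined = Table.NestedJoin(PreviousStep, {"Letter"}, MaxPerLetter, {"Letter"}, "m", JoinKind.LeftOuter),
    Result = Table.ExpandTableColumn(Joined, "m", {"MaxDate"})
in
    Result
```

Here both "A" rows should get 2021-03-01 and the "B" row 2021-02-01. On larger tables this is likely to be faster than filtering the whole table once per row, though that has not been measured here.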
