(Power Query) Complicated sort - powerquery

I have a complicated sorting that I want, and I'm just not sure how to get power query to do it. The TLDR version is "oldest new ones first, then newest old ones." So I want to split the sort between ascending/descending depending on what data are in the columns.
Certain columns on my sheet (I through K) contain the word 'Yes' if it is a new item, otherwise blank. Possible combinations of columns that have 'yes' in them:
I only, J only, K only, I + J, J + K, I + J + K
Here's the sort logic I want:
All rows with a Yes in K are listed first, ascending by date (column H), whether they have 'Yes' in columns I or J or not.
Then, of only the rows that are left, all rows with a Yes in J, ascending by date (column H).
Next, of only the rows that are left, all rows with a Yes in I, ascending by date (column H).
Finally, the only rows left should not have a Yes in any of columns I-K. Of those rows, descending by date (column H).
I can sort of maybe figure out how to do the sort up through step 3 by creating a custom column to label and identifying whether the row will go in the first, second, or third sort, then sorting by that custom column before sorting the others.
But step 4 is stumping me because of the reverse to descending instead of ascending. I'm thinking maybe grouping the data, sorting it within the group descending and outside the group ascending (as a 4th entry in the custom column that sorted the first 3), and then expanding it back out again after the external sort, or something?
Please help!
Currently I'm only able to sort the sheet ascending and can't sort part of it descending.

Filter the rows for one column and sort them, do the same for the next column, and so on, then put the pieces together.
Load your data into Power Query (Data ... From Table/Range ...) and paste the code below into Home ... Advanced Editor. It assumes your data is loaded as Table1 with column headers A, H, I, J, K, so change that to reflect your actual table name and column names. If you have your own code, remove the first row and change the Source in the second row to reflect your #"PriorStepName".
Sample code to transform the image below on the left to the image on the right:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"H", type date}, {"I", type text}, {"J", type text}, {"K", type text}}),
    Part1 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] = "Yes")),{{"H", Order.Ascending}}),
    Part2 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] = "Yes")),{{"H", Order.Ascending}}),
    Part3 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] <> "Yes" and [I] = "Yes")),{{"H", Order.Ascending}}),
    Part4 = Table.Sort(Table.SelectRows(#"Changed Type", each ([K] <> "Yes" and [J] <> "Yes" and [I] <> "Yes")),{{"H", Order.Descending}}),
    Combined = Table.Combine({Part1,Part2,Part3,Part4})
in Combined
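The custom-column idea from the question can also handle step 4 in a single sort. The sketch below is one possible way to do it, not the answerer's method: it labels each row with a bucket number and adds a numeric date key that is negated for the last bucket, so one ascending sort gives descending dates there. The sample rows and the helper column names Bucket and DateKey are made up for illustration.

```powerquery
let
    // Hypothetical sample rows; column names follow the answer above (H = date, I/J/K = flags)
    Source = #table(type table [H = date, I = nullable text, J = nullable text, K = nullable text], {
        {#date(2023, 3, 1), null, null, null},
        {#date(2023, 1, 1), "Yes", null, null},
        {#date(2023, 2, 1), null, "Yes", "Yes"},
        {#date(2023, 4, 1), null, null, null},
        {#date(2023, 1, 15), "Yes", "Yes", "Yes"}}),
    // Bucket 1 = Yes in K, 2 = Yes in J, 3 = Yes in I, 4 = no Yes at all
    AddBucket = Table.AddColumn(Source, "Bucket", each
        if [K] = "Yes" then 1 else if [J] = "Yes" then 2 else if [I] = "Yes" then 3 else 4, Int64.Type),
    // Negating the numeric date in bucket 4 makes an ascending sort run newest-first there
    AddKey = Table.AddColumn(AddBucket, "DateKey", each
        if [Bucket] = 4 then -Number.From([H]) else Number.From([H]), type number),
    Sorted = Table.Sort(AddKey, {{"Bucket", Order.Ascending}, {"DateKey", Order.Ascending}}),
    // Drop the helper columns once the order is fixed
    Result = Table.RemoveColumns(Sorted, {"Bucket", "DateKey"})
in
    Result
```

With these sample rows the dates should come out as 2023-01-15, 2023-02-01, 2023-01-01, 2023-04-01, 2023-03-01.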

Related

Power query, iterate over the column records to apply a custom cumulative calculation

Using Power Query in Excel. I am trying to implement a custom column that would iteratively calculate the row based on the previous row's value of the same column.
I have a 3 column table and the 4th column will be the calculation column that I am failing to implement.
The calculation is very easy to apply in Excel which goes as follows:
Formula in cell D3 --> =IF(A3=1,C3+6.4,IF(C3+D2>=12.8,12.8,IF(C3+D2<=1.28,1.28,C3+D2)))
The same formula is applied to the whole column by dragging.
The idea behind it:
For each category, I have an index column starting from 1,
If Index = 1, then Calculation is Value + 6.4,
else if Value + Value(previous row Custom cumulative) >= 12.8 then 12.8
else if Value + Value(previous row Custom cumulative) <= 1.28 then 1.28
else Value + Value(previous row Custom cumulative)
So, the calculation is a cumulative sum with an upper and lower cap built into it.
How can I implement this in Power Query and M-Language?
I really appreciate your help!
I have tried to use List.Generate and List.Accumulate, but I got stuck creating records that hold values from multiple columns.
Try this
(edited to make more efficient with single pass process)
let Source = Excel.CurrentWorkbook(){[Name="Table15"]}[Content],
    // Capped running total over a list; the seed {0} stays at position 0 of the result
    process = (zzz as list) =>
        let x = List.Accumulate(zzz, {0}, (state, current) =>
            if List.Last(state) = 0 then List.Combine({state, {6.4 + current}}) else
            if List.Last(state) + current >= 12.8 then List.Combine({state, {12.8}}) else
            if List.Last(state) + current <= 1.28 then List.Combine({state, {1.28}}) else
            List.Combine({state, {List.Last(state) + current}})
        ) in x,
    #"Grouped Rows" = Table.Group(Source, {"Category"}, {{"data", each
        let a = process(_[Values])
        // Index starts at 1, so a{[Index]} skips the seed at position 0
        in Table.AddColumn(_, "Custom Cumulative", each a{[Index]}), type table }}),
    #"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Index", "Values", "Custom Cumulative"}, {"Index", "Values", "Custom Cumulative"})
in #"Expanded data"
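The code above treats a last value of 0 as the marker for "first row". A variant that starts from an empty list avoids relying on that marker, and the two caps can be written as a clamp with List.Min and List.Max. This is only a sketch under assumptions: the sample data is invented, and it assumes the rows of each category are already in Index order so that the first row of a group is the Index = 1 row.

```powerquery
let
    // Hypothetical sample data using the question's column names
    Source = #table({"Category", "Index", "Values"}, {
        {"A", 1, 1}, {"A", 2, 5}, {"A", 3, 3}, {"A", 4, -20},
        {"B", 1, 2}, {"B", 2, 0.5}}),
    // Capped running total; an empty seed identifies the first row instead of a 0 sentinel
    Capped = (values as list) as list =>
        List.Accumulate(values, {}, (state, current) =>
            if List.IsEmpty(state) then {current + 6.4}
            // Clamp the new total to the range 1.28 .. 12.8
            else state & {List.Min({12.8, List.Max({1.28, List.Last(state) + current})})}),
    // Append the computed list to each category's rows as a new column
    Grouped = Table.Group(Source, {"Category"}, {{"data", (t) =>
        Table.FromColumns(Table.ToColumns(t) & {Capped(t[Values])},
            Table.ColumnNames(t) & {"Custom Cumulative"}), type table}}),
    Result = Table.Combine(Grouped[data])
in
    Result
```

As in the Excel formula, the first row of each category is Value + 6.4 without any cap applied.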

Convert Switch True() Dax calculated column to M query custom column

I am having issues with my calculated column and the multiple tables I am joining. It is not filtering my visuals correctly. After researching it was recommended to use a custom column in the query instead but I do not know where to start to convert the following DAX to M query.
overall =
VAR skills =
CALCULATETABLE (
VALUES ( tsr_skill[ts_skill] ),
ALLEXCEPT ( tsr_skill, tsr_skill[ts_tsr] )
)
RETURN
SWITCH (
TRUE (),
"JMSR" IN skills, "Senior",
"JMOV" IN skills, "Over",
"JMUN" IN skills, "Under",
"JMRH" IN skills, "RHT",
"MNT"
)
Data structure in Query:
How I would like the data to show in the Query instead of showing as a calculated column.
Preferred Output:
Based on your explanation, and the levels assigned in your DAX formula, it would seem that all should be assigned as "under".
In your "Preferred Output" you do show JMXX being assigned as "Over", but that tsr does not include the JMOV skill
If your written explanation is correct, and your Preferred Output screenshot incorrect based on the posted data, then, in PQ you can
Group by tsr
Create a custom aggregation returning the "overall" based on containing one of the skills listed in your DAX formula.
If that is not the case, please clarify how you are assigning "Over" to JMXX.
Edit: M Code simplified
M Code
let
    //Source = the data structure you show
    Source = Excel.CurrentWorkbook(){[Name="Table13"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ts_tsr", type text}, {"ts_skill", type text}}),
    //Group rows by tsr, then check if it has one of the defined skills
    //If so, return the appropriate ranking.
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ts_tsr"}, {
        {"ALL", each _, type table [ts_tsr=nullable text, ts_skill=nullable text]},
        {"overall", each if List.Contains([ts_skill],"JMSR") then "Senior"
            else if List.Contains([ts_skill],"JMOV") then "Over"
            else if List.Contains([ts_skill],"JMUN") then "Under"
            else if List.Contains([ts_skill],"JMRH") then "RHT"
            else "MNT", type text}
        }),
    //Then re-expand the table
    #"Expanded ALL" = Table.ExpandTableColumn(#"Grouped Rows", "ALL", {"ts_skill"}, {"ts_skill"})
in
    #"Expanded ALL"
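If the list of skills is likely to grow, the chain of else-if branches could be replaced by a priority list that is searched in order, which mirrors how SWITCH ( TRUE () ) returns the first matching branch. The following is a minimal sketch with invented sample rows; the names Priority and Overall are illustrative and not part of the original code.

```powerquery
let
    // Hypothetical sample rows; replace with your own source step
    Source = #table(type table [ts_tsr = text, ts_skill = text], {
        {"T1", "JMUN"}, {"T1", "JMOV"}, {"T2", "JMRH"}, {"T3", "ABCD"}}),
    // Skill/label pairs in priority order, mirroring the SWITCH branches
    Priority = {{"JMSR", "Senior"}, {"JMOV", "Over"}, {"JMUN", "Under"}, {"JMRH", "RHT"}},
    // Label of the first pair whose skill appears in the list, otherwise "MNT"
    Overall = (skills as list) as text =>
        let hit = List.First(List.Select(Priority, each List.Contains(skills, _{0})), null)
        in if hit = null then "MNT" else hit{1},
    Grouped = Table.Group(Source, {"ts_tsr"}, {
        {"ALL", each _, type table [ts_tsr = nullable text, ts_skill = nullable text]},
        {"overall", each Overall([ts_skill]), type text}}),
    // Re-expand so every original row carries its tsr-level label
    Expanded = Table.ExpandTableColumn(Grouped, "ALL", {"ts_skill"})
in
    Expanded
```

With this sample, both T1 rows get "Over", T2 gets "RHT", and T3 gets "MNT".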

PowerQuery - use position of column instead of column name in calculation

New to PowerQuery and M-Code.
I have added a column with a calculation to get the max. Instead of using the hardcoded column name, I would like to use the position number of the column.
The current code is:
= Table.AddColumn(Source, "Maximum", each List.Max({[#"1-6-2021"], [#"1-5-2021"], [#"1-4-2021"]}), type number)
Instead of [#"1-6-2021"], I would like it to be column 3; for [#"1-5-2021"] column 4 etc.
How do I replace these column names with positions?
Many thanks for the help!
You can adjust the {x} part for the column number you want.
0 is the first column, so this is the max of columns 2/3/4:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    x = Table.AddColumn(Source, "Maximum", each List.Max({
        Record.Field(_,Table.ColumnNames(Source){1}),
        Record.Field(_,Table.ColumnNames(Source){2}),
        Record.Field(_,Table.ColumnNames(Source){3})
    }), type number)
in x
If you need to do a Max on a bunch of columns, below would, for example, do it for all columns except the first two, which are removed by the 2nd line
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    colToSum = List.RemoveFirstN(Table.ColumnNames(Source),2),
    AddIndex = Table.AddIndexColumn(Source,"Index",0,1),
    GetMax = Table.AddColumn(AddIndex, "Custom", each List.Max( Record.ToList( Table.SelectColumns(AddIndex,colToSum){[Index]}) ))
in GetMax
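The same result can probably be reached without the index column by picking the wanted fields straight out of each row record with Record.SelectFields. A small sketch, using made-up sample data and the illustrative name ValueColumns:

```powerquery
let
    // Hypothetical sample table; the date-named columns follow the question
    Source = #table({"ID", "Name", "1-6-2021", "1-5-2021", "1-4-2021"}, {
        {1, "a", 10, 30, 20},
        {2, "b", 5, 1, 9}}),
    // Every column name except the first two, selected by position
    ValueColumns = List.RemoveFirstN(Table.ColumnNames(Source), 2),
    // For each row, keep only those fields and take the largest value
    AddMax = Table.AddColumn(Source, "Maximum", each
        List.Max(Record.ToList(Record.SelectFields(_, ValueColumns))), type number)
in
    AddMax
```

For the two sample rows the Maximum column holds 30 and 9. Because each row is read only once, this tends to scale better than looking the row up again by index.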

Google App Script: Remove blank rows from range selection for sorting

I want to sort real-time when a number is calculated in a "Total" column, which is a sum based on other cells, inputted by the user. The sort should be descending and I did achieve this functionality using the following:
function onEdit(event){
  var sheet = event.source.getActiveSheet();
  var range = sheet.getDataRange();
  var columnToSortBy = 6;
  range.sort( { column : columnToSortBy, ascending: false } );
}
It's short and sweet. However, the otherwise empty cells in the Total column contain the following formula, which blanks the cell if the sum is zero and otherwise prints the result:
=IF(SUM(C2:E2)=0,"",SUM(C2:E2))
It causes these rows with an invisible formula to be included in the range selection and upon descending sort, they get slapped up top for some reason. I want these blank rows either sorted to the bottom, or in an ideal scenario removed from the range itself (Without deleting them and the formula they contain from the sheet) prior to sorting.
Or maybe some better way which doesn't require me dragging a formula across an entire column of mostly empty rows. I've currently resorted to adding the formula manually one by one as new entries come in, but I'd rather avoid this.
EDIT: Upon request find below a screenshot of the sheet. As per below image, the 6th column of total points needs to be sorted descending, with winner on top. This should have a pre-pasted formula running lengthwise which sums up the preceding columns for each participant.
The column preceding it (Points for Tiers) is automatically calculated by multiplying the "Tiers" column by 10 to get final points. This column could be eliminated and everything shifted once left, but it's nice to maintain a visual of the actual points awarded. User input is entered in the 3 white columns.
You want to sort the sheet by the column "F" as the descending order.
You want to sort the sheet by ignoring the empty cells in the column "F".
You want to move the empty rows to the bottom of row.
You don't want to change the formulas at the column "F".
You want to achieve this using Google Apps Script.
If my understanding is correct, how about this answer?
Issue and workaround:
Currently, when empty cells are scattered through column "F", I think the built-in sort method of Class Range cannot be used directly. The empty cells are moved to the top, as in your issue. So in this answer, I would like to propose using JavaScript's sort method for this situation.
Modified script:
In order to run this function, please edit a cell.
function onEdit(event){
  const columnToSortBy = 6; // Column "F"
  const headerRow = 1; // The 1st row is the header row.
  const sheet = event.source.getActiveSheet();
  const values = sheet.getRange(1 + headerRow, 1, sheet.getLastRow() - headerRow, sheet.getLastColumn())
    .getValues()
    .sort((a, b) => a[columnToSortBy - 1] > b[columnToSortBy - 1] ? -1 : 1)
    .reduce((o, e) => {
      // Split each sorted row into the columns before and after column "F"
      o.a.push(e.splice(0, columnToSortBy - 1));
      e.splice(0, 1);
      if (e.length > 0) o.b.push(e);
      return o;
    }, {a: [], b: []});
  sheet.getRange(1 + headerRow, 1, values.a.length, values.a[0].length).setValues(values.a);
  if (values.b.length > 0) {
    sheet.getRange(1 + headerRow, columnToSortBy + 1, values.b.length, values.b[0].length).setValues(values.b);
  }
}
This sample script assumes that the header row is the 1st row. If no header row is used in your situation, please change it to const headerRow = 0;.
From your question, I couldn't tell what the columns other than column "F" contain. So in this sample script, all columns in the data range except column "F" are overwritten with the sorted values. Please be careful about this.
Note:
Please use this sample script with enabling V8.
References:
sort(sortSpecObj)
sort()
Added:
You want to sort the sheet by the column "F" as the descending order.
You want to sort the sheet by ignoring the empty cells in the column "F".
You want to move the empty rows to the bottom of row.
In your situation, there are the values in the column "A" to "F".
The formulas are included in not only the column "F", but also other columns.
You don't want to change the formulas.
You want to achieve this using Google Apps Script.
From your reply and updated question, I understand the requirements as above. Try this sample script:
Sample script:
function onEdit(event){
  const columnToSortBy = 6; // Column "F"
  const headerRow = 1; // The 1st row is the header row.
  const sheet = event.source.getActiveSheet();
  const range = sheet.getRange(1 + headerRow, 1, sheet.getLastRow() - headerRow, 6);
  const formulas = range.getFormulas();
  const values = range.getValues().sort((a, b) => a[columnToSortBy - 1] > b[columnToSortBy - 1] ? -1 : 1);
  // Write the sorted values back, keeping a formula wherever the cell originally had one
  range.setValues(values.map((r, i) => r.map((c, j) => formulas[i][j] || c)));
}
A much simpler way to fix this is to just change
=IF(SUM(C2:E2)=0,"",SUM(C2:E2))
to
=IF(SUM(C2:E2)=0,,SUM(C2:E2))
The cells that are made blank when the sum is zero will then be treated as truly empty and they will be excluded from sort, so only cells with content will appear sorted at the top of the sheet.
The reason your original formula doesn't work that way is that using "" actually causes the cell to contain content, so it's no longer treated as a blank cell. You can test this by entering =ISBLANK(F1) into another cell and checking the difference between the two formulas.

Delete rows based on certain logic in power query

I need to delete rows based on the below logic:
Sum of column B for the same product, to compare with one of the values in column D for this product.
If the sum exceeds the value in column D, then delete the rows with the extra ReceiptQty. In this case, for product AAA, the ReceiptQty total is 12000, which is > 10000, so row 7 should be deleted.
Is there any way to achieve this in power query? Thanks~
This code should work:
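let
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    // One nested table of rows per product
    group = Table.Group(Source, {"ProductID"}, {"temp", each _}),
    // Running total of ReceiptQty within each product; List.Skip drops the seed 0
    list = Table.AddColumn(group, "list", each List.Skip(List.Accumulate([temp][ReceiptQty], {0}, (a, b) => a & {List.Last(a) + b}))),
    // Append the running total to each nested table as RunningQty
    table = Table.AddColumn(list, "table", each Table.FromColumns(Table.ToColumns([temp])&{[list]}, Table.ColumnNames(Source)&{"RunningQty"})),
    // Keep only rows whose running total does not exceed OnhandQty
    final = Table.SelectRows(Table.Combine(table[table]), each [OnhandQty] >= [RunningQty])
in
    final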
