Power Query - Fill Down with special text conditions - powerquery

I am trying to clean a large, detailed profit and loss report that includes Category and Sub Category columns. I have successfully filled down the Category column. However, the Sub Category column only has subcategory names scattered throughout the report, so a normal fill down won't work.
How do I fill down starting where a Sub Category value exists, but only continue down to the matching Sub Category Total row?
Example in the picture below: Sub Category = "Closing stock - cattle"
Fill down with those exact words, Closing stock - cattle, until the cell that reads Total Closing stock - cattle. Then fill down from the next Sub Category in the same way.
Basically, the word Total is very important, as I do not want that particular value filled down. Please note that there can be hundreds of rows that do not have a Sub Category.

Filter out the total rows into their own table, fill down, then append the total rows back in.
let
    StartTable = <Your Data Source>,
    // Check for null first, because Text.Contains returns null when its text argument is null
    NoTotals = Table.SelectRows(StartTable, each [Sub Category] = null or not Text.Contains([Sub Category], "Total")),
    OnlyTotals = Table.SelectRows(StartTable, each [Sub Category] <> null and Text.Contains([Sub Category], "Total")),
    FillDown = Table.FillDown(NoTotals, {"Sub Category"}),
    Append = Table.Combine({FillDown, OnlyTotals})
in
    Append
If you need it to get back to the starting order, add an index column before doing any filtering and then sort by that index as your last step.
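The index suggestion can be sketched as follows. This is only an illustration: the sample rows, the Amount column, and the step names Indexed, Sorted and Result are assumptions, not part of the original report.

let
    // Hypothetical sample data; replace this step with the real report
    StartTable = #table(
        {"Sub Category", "Amount"},
        {
            {"Closing stock - cattle", 1},
            {null, 2},
            {"Total Closing stock - cattle", 3},
            {"Opening stock - cattle", 4},
            {null, 5}
        }
    ),
    // Remember the original row order before any filtering
    Indexed = Table.AddIndexColumn(StartTable, "Index", 0, 1),
    NoTotals = Table.SelectRows(Indexed, each [Sub Category] = null or not Text.Contains([Sub Category], "Total")),
    OnlyTotals = Table.SelectRows(Indexed, each [Sub Category] <> null and Text.Contains([Sub Category], "Total")),
    FillDown = Table.FillDown(NoTotals, {"Sub Category"}),
    Append = Table.Combine({FillDown, OnlyTotals}),
    // Restore the original order, then drop the helper column
    Sorted = Table.Sort(Append, {{"Index", Order.Ascending}}),
    Result = Table.RemoveColumns(Sorted, {"Index"})
in
    Result

Note that with this approach any blank rows between a Total row and the next Sub Category are still filled with the previous Sub Category, because the Total rows are not present while the fill down runs.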

Try this. It fills down a duplicate of the column, then leaves blank any empty row whose filled value is a Total row:
#"Duplicated Column" = Table.DuplicateColumn(#"PreviousStep", "Subcategory", "Dupe"),
#"Filled Down" = Table.FillDown(#"Duplicated Column",{"Dupe"}),
// Originally blank rows that inherited a "Total ..." value stay null; the null check on [Dupe] avoids an error on leading blank rows
#"Added Custom" = Table.AddColumn(#"Filled Down", "Custom", each if [Subcategory] = null and [Dupe] <> null and Text.Contains([Dupe], "Total") then null else [Dupe]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Subcategory", "Dupe"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom", "Subcategory"}})

Related

Power Query Table pivot transformation

Please see the attached image. I want to transform the days into dates and put all the days in a single column.
In Power Query, try this to rearrange the columns. You have not provided the information needed to figure out how to insert the date.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    base = 1, group = 25, // 1 base column, then groups of 25, stack them
    Combo = List.Transform(
        List.Split(List.Skip(Table.ColumnNames(Source), base), group),
        each List.FirstN(Table.ColumnNames(Source), base) & _),
    #"Added Custom" = List.Accumulate(Combo, #table({"Column1"}, {}),
        (state, current) => state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)), 1)),
    #"Rename" = Table.RenameColumns(#"Added Custom",
        List.Zip({Table.ColumnNames(#"Added Custom"), List.FirstN(Table.ColumnNames(Source), base + group)}))
in
    #"Rename"

Can I add a new column with Linear Interpolation in Power Query M?

I am working on extracting an interest rate curve from futures market prices, and I have created a table (Table 1) inside Power Query with the following columns:
- BusinessDays: the number of business days from today to the expiry of each futures contract
- InterestRate: the rate from today until the expiry of the futures contract
The second table (Table 2) lists internal financial products that expire on different business days.
- InstrumentID: unique internal ID of a financial product sold by a financial institution
- BusinessDays: the number of business days from today to the expiry of each financial product
I am having some trouble with the M language, and unfortunately this specific calculation must be executed in Excel, so I am restricted to Power Query M.
The specific step I am not able to do is:
Creating a function in Power Query that adds a new column to Table 2 containing the interpolated interest rate of each financial product.
The end result I am looking for would look like this
There are several ways to approach this but one way or another, you'll need to do some kind of lookup to determine which bracket to match your BusinessDays value with, so you can calculate the interpolated value.
I think it's simpler to just generate an all inclusive list of days vs interest rates, and then do a Join to pull out the matches.
I named this first query intRates; it expands the interest rate table to one row per business day:
let
//Get the interest rate/business day table
Source = Excel.CurrentWorkbook(){[Name="intRates"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"BusinessDays", Int64.Type}, {"InterestRate", Percentage.Type}}),
//Add two columns which are the interest rate and business day columns offset by one
//It is faster to subtract this way than by adding an Index column
offset=
Table.FromColumns(
Table.ToColumns(#"Changed Type")
& {List.RemoveFirstN(#"Changed Type"[BusinessDays]) & {null}}
& {(List.RemoveFirstN(#"Changed Type"[InterestRate])) & {null}},
type table[BusinessDays=Int64.Type, InterestRate=Percentage.Type, shifted BusDays=Int64.Type, shifted IntRate=Percentage.Type]),
//Add a column with a list of the interest rates for each data interpolated between the segments
#"Added Custom" = Table.AddColumn(offset, "IntList", each let
sbd=[shifted BusDays],
intRateIncrement = ([shifted IntRate]-[InterestRate])/([shifted BusDays]-[BusinessDays]),
Lists= List.Generate(
()=>[d=[BusinessDays],i=[InterestRate]],
each [d]< sbd,
each [d=[d]+1, i = [i]+intRateIncrement],
each [i])
in Lists),
//add another column with a list of days corresponding to the interest rates
#"Added Custom1" = Table.AddColumn(#"Added Custom", "dayList", each {[BusinessDays]..[shifted BusDays]-1}),
//remove the last row as it will have an error
remErrRow = Table.RemoveLastN(#"Added Custom1",1),
//create the new table which has the rates for every duration
intRateTable = Table.FromColumns(
{List.Combine(remErrRow[dayList]),List.Combine(remErrRow[IntList])},
type table[Days=Int64.Type, Interest=Percentage.Type])
in
intRateTable
This results in a table that has every day, starting from day 39, with its corresponding interest rate.
Then read in the "Instruments" table and join it with intRates, using JoinKind.LeftOuter:
let
Source = Excel.CurrentWorkbook(){[Name="Instruments"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"InstrumentID", type text}, {"BusinessDays", Int64.Type}}),
//add the rate column
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"BusinessDays"}, intRates, {"Days"}, "intRates", JoinKind.LeftOuter),
#"Expanded intRates" = Table.ExpandTableColumn(#"Merged Queries", "intRates", {"Interest"}, {"Interest"})
in
#"Expanded intRates"
Some of the results in the middle part of the table differ from what you've posted, but they seem to be consistent with the linear interpolation formula between two values, so I'm not sure how the discrepancy arises.
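For comparison, here is a minimal sketch of the other route mentioned above: looking up the bracket around each BusinessDays value and interpolating directly, without building the full day-by-day table. The sample curve and instrument values are made up, the names Curve, Interpolate and WithRate are hypothetical, and the sketch assumes the curve is sorted ascending by BusinessDays.

let
    // Hypothetical sample curve standing in for Table 1 (values are invented)
    Curve = #table(
        {"BusinessDays", "InterestRate"},
        {{10, 0.02}, {20, 0.03}, {40, 0.05}}
    ),
    // Linear interpolation for a single day count d
    Interpolate = (d as number) as nullable number =>
        let
            // Last curve point at or before d, and first curve point at or after d
            Lower = Table.Last(Table.SelectRows(Curve, each [BusinessDays] <= d)),
            Upper = Table.First(Table.SelectRows(Curve, each [BusinessDays] >= d)),
            Result =
                // Outside the curve there is no bracket, so return null
                if Lower = null or Upper = null then null
                // d sits exactly on a curve point
                else if Upper[BusinessDays] = Lower[BusinessDays] then Lower[InterestRate]
                else Lower[InterestRate]
                    + (Upper[InterestRate] - Lower[InterestRate])
                    * (d - Lower[BusinessDays])
                    / (Upper[BusinessDays] - Lower[BusinessDays])
        in
            Result,
    // Hypothetical sample standing in for Table 2
    Instruments = #table(
        {"InstrumentID", "BusinessDays"},
        {{"A", 15}, {"B", 40}, {"C", 5}}
    ),
    WithRate = Table.AddColumn(Instruments, "Interest", each Interpolate([BusinessDays]), type nullable number)
in
    WithRate

Instrument C falls before the first curve point, so it gets null rather than an extrapolated rate. This version scans the curve once per instrument, so the join approach above may be preferable for large tables.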

Bring Value with Sumifs in Pow.Query language to specified row, and column(location)

What is the next step? In Excel, I have used SUMIFS and many SUMIF formulas to bring information from another workbook into an exact row and column. Now I want to do the same with the query language. I can bring in two values if a condition is met, but it is unclear how to bring the total sum into a single row in the Excel workbook. Can anyone point me in the right direction? I guess I will need the Data Model...
= Table.AddColumn(#"Changed Type", "Sumif", each if [Column2] =2 or [Column2]=1 then [Column3]+[Column4] else 0)
let
Source = Folder.Files...
#"C:\Users...
#"Imported Excel" = Excel.Workbook(#"C:\...
SegPL_Chart = #"Imported Excel"{[Name="SegPL_Chart"]}[Data],
#"Removed Top Rows" = Table.Skip(SegPL_Chart,12),
#"Removed Alternate Rows" = Table.AlternateRows(#"Removed Top Rows",1,1,90),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Alternate Rows"),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([Col1]="1" or [Col1]="2")),
#"Table Group = Table.Group(#"Filtered Rows", {}, List.TransformMany(Table.ColumnNames(#"Filtered Rows",(x)=>{each if x = "Names" then "Totals" else List.Sum(Table.Column(_,x))},(x,y)=>{x,y})),
#"append" = Table.Combine({#"Filtered Rows",#"Table Group"})
in
#"append"
It gives an error at "in": Token comma needed..? What else do I need to do to bring in the total rows?
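Judging only from the snippet as posted, two things in the Table.Group step look likely to cause the parse error: the step name #"Table Group is missing its closing quote, and Table.ColumnNames(#"Filtered Rows" is missing its closing parenthesis. A minimal sketch of the same totals-row idea with both fixed, using made-up sample data (the Filtered table and its values are assumptions):

let
    // Hypothetical sample standing in for the filtered report
    Filtered = #table(
        {"Names", "Column3", "Column4"},
        {{"A", 1, 10}, {"B", 2, 20}}
    ),
    // One aggregation per column: a label for "Names", a sum for every other column
    Aggregations = List.TransformMany(
        Table.ColumnNames(Filtered),
        (x) => {each if x = "Names" then "Totals" else List.Sum(Table.Column(_, x))},
        (x, y) => {x, y}
    ),
    // An empty key list groups the whole table into a single row
    Totals = Table.Group(Filtered, {}, Aggregations),
    Appended = Table.Combine({Filtered, Totals})
in
    Appended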
You can use several steps to create helper columns that hold the intermediate results of the conditional sums. Then create a new column that sums all the intermediate results, and delete the helper columns.
Keep in mind that, unlike Excel, the calculations in Power Query always return constants, so you can delete calculated columns you no longer need. So:
1. Create helper column 1 with the complicated IF and Sum scenario
2. Create helper column 2 with the complicated IF and Sum scenario
3. Create a total column that adds column 1 + column 2
4. Delete the helper columns and keep only the total column
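The steps above can be sketched as follows. The sample table and the two conditions are placeholders chosen for illustration, not the asker's real rules.

let
    // Hypothetical sample data
    Sample = #table(
        {"Column2", "Column3", "Column4"},
        {{1, 10, 1}, {2, 20, 2}, {3, 30, 3}}
    ),
    // Helper columns holding the intermediate conditional results
    Helper1 = Table.AddColumn(Sample, "Helper1", each if [Column2] = 1 then [Column3] else 0, type number),
    Helper2 = Table.AddColumn(Helper1, "Helper2", each if [Column2] = 2 then [Column4] else 0, type number),
    // Total column built from the helpers
    Total = Table.AddColumn(Helper2, "Total", each [Helper1] + [Helper2], type number),
    // The helpers can be removed because Total already holds constant values
    Cleaned = Table.RemoveColumns(Total, {"Helper1", "Helper2"})
in
    Cleaned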
That gives me the exact result I was looking for, but it uses a DAX formula in Power Pivot:
=SUMX(FILTER('TableName',[ColName] = 1),'TableName'[ColName2])
So I would be glad to convert it to a Power Query formula.
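One possible Power Query counterpart is to filter the rows and then sum the resulting column. A minimal sketch, with a made-up table standing in for 'TableName':

let
    // Hypothetical sample standing in for 'TableName'
    Sample = #table(
        {"ColName", "ColName2"},
        {{1, 10}, {2, 20}, {1, 5}}
    ),
    // Keep rows where ColName = 1, then sum ColName2 over those rows
    Total = List.Sum(Table.SelectRows(Sample, each [ColName] = 1)[ColName2])
in
    Total

With this sample data the result is 15. Unlike a DAX measure, the value is computed once at refresh time and does not respond to pivot table filters.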

Power Query: Selecting multiple elements in 'value field settings' to measure a specifc field

I'm trying to create a measure that averages the 29 elements' value [Overtime/Hours_worked] into one cell, visualised by the attached image.
Cell F32 currently shows [AverageA Total Overtime/Total Hours_worked] but I want it to be an average of the 29 rows' values as displayed in cell H32, =AVERAGEA(F3:F31).
The elements' figures are based on underlying data from Data$, currently amounting to ~150k rows. When creating a measure that averages the elements' values from Column E [AverageA Overtime/Hours_worked] and shows it as a % of the 29 elements' aggregate %, I run into the problem that it averages the separate elements' underlying data taken from Data$. Note that F3:F31 is redundant in this instance; I'm looking for the average of the 29 elements' values in column E and not their respective averages shown in column F.
Am I right to use measure here or is there a better way to approach it? If measures can be used, is there a way to design the measure so that it refers to the Pivot Table's shown data instead of the underlying data taken from Data$? For instance by designing the measure to refer to column E in the pivot table?
Side note
The table needs to remain dynamic since Data$ is being updated regularly. I'm relatively new to Power Query so I'm not sure if there are other ways to solve this, i.e. through MDX, but I doubt I'll be able to sort that out myself.
Any and all help is appreciated, thanks.
I'm not sure how you are computing the individual entries in the AverageA Total Overtime/Total Hours_worked (so I left it blank), but to compute the totals and averages for the other columns, you can use the Table.Group command in a special way with an empty list for the key (so as to return the entire table for the Aggregation operations).
Given:
M Code
Read the comments in the code to understand the algorithm.
If your overtime % column is already in your original data, you can delete the code lines that add that column.
let
//be sure to change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="wrkTbl"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Area", type text}, {"Hours_worked", Int64.Type}, {"Overtime", Int64.Type}}),
//Add the percent overtime column
#"Added Custom" = Table.AddColumn(#"Changed Type", "Overtime/Hours_worked",
each [Overtime]/[Hours_worked], Percentage.Type),
//Use Table.Group to compute total/averages for the last row
//Be sure to use the exact same names as used in the original table
#"Grouped Rows" = Table.Group(#"Added Custom", {}, {
{"Area", each "Totals",type text},
{"Hours_worked", each List.Sum([Hours_worked]), Int64.Type},
{"Overtime", each List.Sum([Overtime]), Int64.Type},
{"Overtime/Hours_worked", each List.Sum([Overtime])/List.Sum([Hours_worked]), Percentage.Type},
{"AverageA Overtime/Hours_worked", each List.Average([#"Overtime/Hours_worked"]), Percentage.Type}
}),
//Append the two tables to add the Totals row
append= Table.Combine({ #"Added Custom", #"Grouped Rows"})
in
append
results in =>

Populate conditional column depending on column name criteria

I receive a weekly report which contains some repetition of columns. This is because it is drawn from a collection of web forms which ask similar questions to each other - let's say they all ask "Do you want to join our email list?" - but this question is stored in the source system as a separate field for each form (each form is effectively a separate table). The columns will always be consistently named - e.g. "Email_optin_1", "Email_optin_2" - so I can come up with rules to identify the columns which ask the email question. However, the number of columns may vary from week to week - one week the report might just contain "Email_optin_2", the next week it might include four such columns. (This depends on which web-forms have been used in that week). The possible values are the same in all these columns - let's say "Yes" and "No".
Each row should normally only have one of the "Email_optin" columns populated.
What I would like to do is create a single column in Power Query called "Email_Optin_FINAL", which would return "Yes" if ANY columns beginning with "Email_optin" contain a value of "Yes".
So, basically, instead of the criteria simply referring to the values in specific columns, what I would like it to do is first of all figure out which columns it needs to be looking at, and then look at the values in those columns.
Is this possible in PowerQuery?
Thanks in advance for any advice!
This would find all the columns containing Email_optin and merge them for you into a new column and remove the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
EmailList= List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Email_optin")),
#"Merged Columns" = Table.CombineColumns(Source,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged")
in #"Merged Columns"
This would find all the columns containing Email_optin and merge them for you into a new column and preserve the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Index= Table.AddIndexColumn(Source, "Index", 0, 1),
EmailList= List.Select(Table.ColumnNames(Index), each Text.Contains(_, "Email_optin")),
Merged = Table.CombineColumns(Index,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Queries" = Table.NestedJoin(Index,{"Index"},Merged,{"Index"},"Merged",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Merged", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table2",{"Index"})
in #"Removed Columns"
You can then filter for "Yes" among the merged answers if you want.
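If the goal is the Email_Optin_FINAL column itself rather than a merged text column, a minimal sketch could test each row's opt-in fields directly. The sample table is made up, and returning "No" when no column holds "Yes" is an assumption, since the question only specifies the "Yes" case.

let
    // Hypothetical sample report with two opt-in columns this week
    Source = #table(
        {"ID", "Email_optin_1", "Email_optin_2"},
        {{1, "Yes", null}, {2, null, "No"}, {3, null, null}}
    ),
    // Find whichever opt-in columns are present in this week's report
    OptinColumns = List.Select(Table.ColumnNames(Source), each Text.StartsWith(_, "Email_optin")),
    // "Yes" if any of those columns holds "Yes" for the row, otherwise "No" (assumed default)
    AddFinal = Table.AddColumn(
        Source,
        "Email_Optin_FINAL",
        each if List.Contains(Record.FieldValues(Record.SelectFields(_, OptinColumns)), "Yes") then "Yes" else "No",
        type text
    )
in
    AddFinal

The original columns are kept, and the comparison is case-sensitive, so "YES" or "yes" would not match.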

Resources