Power Query - Check if value in column B exists in column A - powerquery

I have tried this with a list of 500 rows and it works great. But when I try to add it to my real query of 500k rows it takes forever (looking at row count I see that it would take days to finish).
Is there any way to speed it up by "Buffering" or "Query List"?
I'm very new to Power Query and am using it with Excel only.
= Table.AddColumn(#"Changed Type", "TYPE_CHECK", each List.Contains(#"Source"[TYPE_SORT],[MASTER]))
This code works great but takes too many resources.
Example of what is expected:

Looks like you are trying to see if items in a column of Table2 appear in a column of Table1. Another way to do it is to merge the two tables.
base version:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"AAA", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"AAA"}, Table2, {"BBB"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"BBB"}, {"BBB"})
in #"Expanded Table2"
alternate True/False version
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"AAA", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"AAA"}, Table2, {"BBB"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"BBB"}, {"BBB"}),
ConvertTrueFalse = Table.TransformColumns(#"Expanded Table2",{{"BBB", each if _=null then false else true}})
in ConvertTrueFalse
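If the lookup column contains duplicates, expanding the merged column returns one row per match, so the row count can grow. A minimal sketch that avoids the expand step and only tests whether the nested table is empty is shown here. It assumes the same Table1, Table2, AAA and BBB names as above; the step names AddedFlag and Result and the column name InTable2 are made up for illustration.

```powerquery
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"AAA", type text}}),
    // Same left outer merge as above; each row gets a nested table of matches
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"AAA"}, Table2, {"BBB"}, "Table2", JoinKind.LeftOuter),
    // true when at least one row of Table2 matched, false otherwise
    AddedFlag = Table.AddColumn(#"Merged Queries", "InTable2", each not Table.IsEmpty([Table2]), type logical),
    // Drop the nested table column once the flag is computed
    Result = Table.RemoveColumns(AddedFlag, {"Table2"})
in
    Result
```

Because the nested tables are never expanded, the output keeps exactly one row per row of Table1.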
You can also merge the same table on top of itself to check columns against each other in the same table
merge table on itself:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"AAA", type text}, {"BBB", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"AAA"}, #"Changed Type", {"BBB"}, "Changed Type", JoinKind.LeftOuter),
#"Expanded Changed Type" = Table.ExpandTableColumn(#"Merged Queries", "Changed Type", {"BBB"}, {"AAAmatchfromBBB"}),
#"MakeTrueFalse" = Table.TransformColumns(#"Expanded Changed Type",{{"AAAmatchfromBBB", each if _=null then false else true}})
in #"MakeTrueFalse"

Right before the AddColumn step, insert a standalone step:
BufferList = List.Buffer(#"Source"[TYPE_SORT])
Then on the AddColumn step, you use
#"Added Column" = Table.AddColumn(#"Changed Type", "TYPE_CHECK", each List.Contains(BufferList,[MASTER]))
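Put together, the buffered version might look like the sketch below. The table name Table1 and the type conversions are assumptions, since the question does not show the earlier steps. The List.Distinct call is an optional extra that is not part of the answer above; it only shrinks the list that List.Contains has to scan.

```powerquery
let
    // Assumed source; replace with your own first steps
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"MASTER", type text}, {"TYPE_SORT", type text}}),
    // Evaluate the lookup column once and keep it in memory
    BufferList = List.Buffer(List.Distinct(Source[TYPE_SORT])),
    // Each row now checks against the in-memory list instead of re-reading Source
    #"Added Column" = Table.AddColumn(#"Changed Type", "TYPE_CHECK", each List.Contains(BufferList, [MASTER]), type logical)
in
    #"Added Column"
```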

Related

Using power query to group alternate rows

Starting with the table above, the headers and their respective values are in alternating rows. For example for Nike, the Serial for the boots is 123 and Part No. is ABC, and it is sold on 12 Apr 22 for $23.03 with 20 left in stock. What I am trying to achieve by using power query is the following table:
I have tried adding an index and divide-integer 2 as there are 2 rows (1 header, 1 value) for each item sold and grouping using the resultant index. Then unpivot all except the index.
Then split the Attribute and Value columns using #(lf)
But I'm stuck here and running out of ideas. Any advice will be greatly appreciated. Thanks.
Try
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"AlternateRows"=Table.AlternateRows(Source,0,1,1),
#"Added Index" = Table.AddIndexColumn(AlternateRows, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn( #"Added Index", "Custom", each Text.Split([Column2],"#(lf)")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Text.Split([Column4],"#(lf)")),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.2", each
Table.AddColumn(
Table.UnpivotOtherColumns(
Table.AddIndexColumn(
Table.FromColumns({[Custom],[Custom.1]})
, "Index", 0, 1, Int64.Type)
, {"Index"}, "Attribute", "Value")
,"Key", each Text.From([Index]) & Text.End([Attribute],1))
),
#"Expanded Custom.2" = Table.ExpandTableColumn(#"Added Custom2", "Custom.2", {"Value", "Key"}, {"Value", "Key"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom.2",{"Column2", "Column4", "Custom", "Custom.1"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Key]), "Key", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"01", "Serial"}, {"02", "Date"}, {"11", "PartNo"}, {"12", "Price"}, {"22", "Item"}, {"32", "Stocks Left"}, {"Column1", "Currency"}, {"Column3", "Brand"}}),
#"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"Index"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns1",{{"Date", type date}, {"Price", type number}, {"Stocks Left", type number}})
in #"Changed Type"
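The first step relies on Table.AlternateRows(table, offset, skip, take) to drop the header rows. A small sketch with made-up inline data (the column name v and the values are hypothetical) shows what the arguments 0, 1, 1 do:

```powerquery
let
    // Hypothetical sample: header and value rows alternate
    Sample = #table({"v"}, {{"header 1"}, {"value 1"}, {"header 2"}, {"value 2"}}),
    // offset 0: keep no leading rows; then repeatedly skip 1 row and take 1 row
    ValuesOnly = Table.AlternateRows(Sample, 0, 1, 1)
in
    ValuesOnly
```

With these arguments the rows "header 1" and "header 2" are skipped and only the value rows remain.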

Power Query: Duplicate Rows Based on Value

I have a column that contains the Total Stock of an item. I'd like to expand this out into 1 row per item (i.e. the item has 6 in stock and therefore appears as 6 line items).
Is this possible with power query?
The M-Code below will expand this input table
to this
let
Source = Excel.CurrentWorkbook(){[Name="tblData"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ColA", type text}, {"Stock", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Col", each List.Repeat({[ColA]},[Stock])),
#"Expanded Col" = Table.ExpandListColumn(#"Added Custom", "Col")
in
#"Expanded Col"
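To try the same steps without a workbook table, here is a self-contained sketch with inline sample data. The item names and stock counts are made up for illustration.

```powerquery
let
    // Hypothetical sample data standing in for tblData
    Source = #table(type table [ColA = text, Stock = Int64.Type], {{"Apple", 2}, {"Pear", 3}}),
    // Build a list holding the item name repeated [Stock] times
    #"Added Custom" = Table.AddColumn(Source, "Col", each List.Repeat({[ColA]}, [Stock])),
    // Expanding the list column yields one row per list element
    #"Expanded Col" = Table.ExpandListColumn(#"Added Custom", "Col")
in
    #"Expanded Col"
```

This produces two rows for "Apple" and three rows for "Pear", each still carrying the original Stock value.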

Power query column editing

I have a table in power bi query with dates
01.01.2020
02.01.2020
and so on..
I need to duplicate this table and replace values 01.01.2020 into 20200101 and so on. Is there an obvious, easy way for this?
First option:
Here is the simplest option I found:
Create a custom column that splits the text on ".", reverses the order of the three parts with "List.Reverse", and joins them back together. ("Text.Reverse" on its own reverses the individual characters, so "01.01.2020" would become "0202.10.10".)
Create a second custom column and apply "Text.Remove" for "." to the newly created column, which removes the "." from the string.
Here is what you will get, with "reverse date" as your column in year.month.day order, and "reverse date without point" as the second column without the points.
Here is the M code:
#"Promoted Headers" = Table.PromoteHeaders(Sheet2_Sheet, [PromoteAllScalars=true]),
#"Changed Type3" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type text}}),
#"Added Custom3" = Table.AddColumn(#"Changed Type3", "reverse date", each Text.Combine(List.Reverse(Text.Split([Date], ".")), ".")),
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "reverse date without point", each Text.Remove([reverse date], {"."}))
Second option:
Here is a second option, which is longer:
Break down your column in three distinct columns with "." as delimiter
Add new columns with padding zero to day and months (I called them "month with zero" and "day with zero")
Concatenate
and you get your result!
Here is my starting point:
Here is the first step, "breaking the column" in "columns":
Here is the custom column with zero padding:
Here is how you concatenate:
Here is the M code:
#"Split Column by Delimiter" = Table.SplitColumn(#"Promoted Headers", "Date", Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv), {"Date.1", "Date.2", "Date.3"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Date.1", Int64.Type}, {"Date.2", Int64.Type}, {"Date.3", Int64.Type}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Changed Type",{{"Date.1", type text}, {"Date.2", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"Date.1", "Day"}, {"Date.2", "Month"}, {"Date.3", "Year"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Month with zero", each Text.PadStart(Text.From([Month]),2,"0")),
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Day with zero", each Text.PadStart(Text.From([Day]),2,"0")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Day", "Month"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns",{{"Year", type text}}),
#"Added Custom1" = Table.AddColumn(#"Changed Type2", "New Date", each [Year] & [Month with zero] & [Day with zero])
in
#"Added Custom1"
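Another possibility, not taken from the answers above, is to parse the text as a real date and format it back as text. The sketch below assumes the column holds day.month.year text and uses the "de-DE" culture, whose short date format matches that layout; both the culture and the inline sample table are assumptions.

```powerquery
let
    // Hypothetical sample data in dd.MM.yyyy text form
    Source = #table(type table [Date = text], {{"01.01.2020"}, {"02.01.2020"}}),
    // Parse with a culture that reads dd.MM.yyyy, then format as yyyyMMdd text
    Converted = Table.AddColumn(Source, "New Date", each Date.ToText(Date.FromText([Date], "de-DE"), "yyyyMMdd"), type text)
in
    Converted
```

For the two sample rows this gives "20200101" and "20200102". Going through a date value also rejects text that is not a valid date instead of silently rearranging it.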

Power Query M - Custom Column for Rolling 28 Days Sales

I'm looking for some Power Query help. I have a huge set of sales data for 40k products over one year. For each product on each day I need to add a 28 day sales column.
I essentially want to do a sumifs like the below but in M.
=SUMIFS([SALES],[Product Code],[This Product Code],[Date],<=[This Date],[Date],>=[This Date]-28)
Try this then, it should work but would likely do so at a crawl
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sales", Int64.Type}, {"Product Code", type text}, {"Date", type date}}),
TotalAmountAdded = Table.AddColumn(#"Changed Type", "Total Amount", (i) => List.Sum(Table.SelectRows(#"Changed Type", each ([Product Code] = i[Product Code] and [Date]<=i[Date] and [Date]>=Date.AddDays(i[Date],-28)))[Sales]), type number )
in TotalAmountAdded
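The row-by-row lookup above scans the whole table once per row, which is why it crawls. One way to cut the work down, sketched here as an untested idea rather than a benchmarked fix, is to group by product first so that each lookup only scans that product's rows, and to buffer each group. The helper name AddRolling and the step names Grouped and Combined are made up for illustration; the table and column names are the ones assumed above.

```powerquery
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Sales", Int64.Type}, {"Product Code", type text}, {"Date", type date}}),
    // Hypothetical helper: adds the rolling total to the rows of a single product
    AddRolling = (t as table) as table =>
        let
            // Hold this product's rows in memory for the repeated lookups
            Buffered = Table.Buffer(t),
            Added = Table.AddColumn(Buffered, "Total Amount",
                (i) => List.Sum(Table.SelectRows(Buffered, each [Date] <= i[Date] and [Date] >= Date.AddDays(i[Date], -28))[Sales]),
                type number)
        in
            Added,
    // One nested table per product, each processed by the helper
    Grouped = Table.Group(#"Changed Type", {"Product Code"}, {{"Rows", AddRolling, type table}}),
    // Stack the per-product tables back into one table
    Combined = Table.Combine(Grouped[Rows])
in
    Combined
```

The date window is the same as in the SUMIFS formula, so it covers the current date and the 28 days before it.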
Add a custom column with date logic (based on your sample sumif formula), filter the new column to get the relevant rows, then group by product code and sum Sales. Assuming source data is in Table1 with three columns (Sales,Product Code, Date) the code would be
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sales", Int64.Type}, {"Product Code", type text}, {"Date", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "AddMe", each if [Date]<=DateTime.Date(DateTime.LocalNow()) and [Date]>=Date.AddDays(DateTime.Date(DateTime.LocalNow()),-28) then 1 else 0),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([AddMe] = 1)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Product Code"}, {{"ProductSales", each List.Sum([Sales]), type number}})
in #"Grouped Rows"

Collect related values from more than one column and row into a single column

I have this table:
Which I'd like to change to this:
As you can see, I want to collect all the related entries from all of the P1 rows, across all columns, into a single column under P1, and do the same for P2 and P3 related entries.
Is there a simple way to do this in PowerQuery / M?
Yes:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"P", type text}, {"15", Int64.Type}, {"25", Int64.Type}, {"35", Int64.Type}, {"45", Int64.Type}, {"55", Int64.Type}, {"65", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"P"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"P"}, {{"AllData", each _, type table}}),
TablesToLists = Table.TransformColumns(#"Grouped Rows",{{"AllData", each _[Value]}}),
#"Transposed Table" = Table.Transpose(TablesToLists),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
TableFromColumns = Table.FromColumns(Record.FieldValues(#"Promoted Headers"{0}),Table.ColumnNames(#"Promoted Headers"))
in
TableFromColumns
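A shorter variant of the same idea groups straight into lists and hands them to Table.FromColumns, which skips the transpose and promote-headers steps. The inline sample data below is made up, and the step names Unpivoted, Grouped and Result are illustrative only.

```powerquery
let
    // Hypothetical sample: P1 appears on two rows, P2 on one
    Source = #table({"P", "15", "25"}, {{"P1", 1, 2}, {"P2", 3, 4}, {"P1", 5, 6}}),
    // Turn every non-P column into Attribute/Value rows
    Unpivoted = Table.UnpivotOtherColumns(Source, {"P"}, "Attribute", "Value"),
    // Collect all values of each P into one list
    Grouped = Table.Group(Unpivoted, {"P"}, {{"Values", each [Value], type list}}),
    // One output column per P; shorter lists are padded with null
    Result = Table.FromColumns(Grouped[Values], Grouped[P])
in
    Result
```

Here column P1 holds 1, 2, 5, 6 and column P2 holds 3, 4 followed by two nulls.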
