Fill down on non-blank values using Power Query

I'm trying to figure out a way to clean this data set. I'm sure there is a smart way to do it, but I can't get my head around it.
I've tried using the Fill Down function, but for that I need a way to exclude the date values that sit under each header row, which are the values I'd like to keep.
Here is my data:
|My data (one column) |What I want (column 1) |What I want (column 2) |
|APA --> 22 |null |null |
|Je 06/01/2022 |APA --> 22 |Je 06/01/2022 |
|Ve 07/01/2022 |APA --> 22 |Ve 07/01/2022 |
|Lu 07/02/2022 |APA --> 22 |Lu 07/02/2022 |
|Ma 08/02/2022 |APA --> 22 |Ma 08/02/2022 |
|null |null |null |
|AR --> 6 |null |null |
|Ma 04/01/2022 |AR --> 6 |Ma 04/01/2022 |
|Ve 21/01/2022 |AR --> 6 |Ve 21/01/2022 |
|Sa 22/01/2022 |AR --> 6 |Sa 22/01/2022 |
|Me 23/02/2022 |AR --> 6 |Me 23/02/2022 |
|Lu 21/03/2022 |AR --> 6 |Lu 21/03/2022 |
|Ma 22/03/2022 |AR --> 6 |Ma 22/03/2022 |
|null |null |null |
|AS --> 545 |null |null |
|Sa 01/01/2022 |AS --> 545 |Sa 01/01/2022 |
|Sa 01/01/2022 |AS --> 545 |Sa 01/01/2022 |
|Sa 01/01/2022 |AS --> 545 |Sa 01/01/2022 |
|Di 02/01/2022 |AS --> 545 |Di 02/01/2022 |
Any help appreciated!

Try this:
Add column > Custom column, with the formula
= if Text.Contains([Column1],"->") then [Column1] else null
Right-click this new column and fill down.
Add column > Custom column, with the formula
= if not Text.Contains([Column1],"->") then [Column1] else null
Create a filter on the first column:
each [Column1]=null or not Text.Contains([Column1], "->")
On the two new columns, use Transform > Replace values > Replace errors and enter nothing to get a null.
Right-click the original column and remove it.
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each if Text.Contains([Column1],"->") then [Column1] else null),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if not Text.Contains([Column1],"->") then [Column1] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each [Column1]=null or not Text.Contains([Column1], "->")),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Filtered Rows", {{"Custom", ""}, {"Custom.1", ""}}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Errors",{"Column1"})
in
    #"Removed Columns"
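The Replace errors step is needed because Text.Contains returns null for a null input, and "if null then ..." raises an error. As a possible variation (a sketch, not the answerer's code), the null check can be made explicit so no errors appear in the first place. The inline sample table, the helper IsHeader and the column names Group and Date are assumptions made for illustration. Note that this version also drops the blank separator rows instead of keeping them as null rows.

```powerquery
let
    // Hypothetical sample standing in for the workbook table
    Source = #table(
        type table [Column1 = nullable text],
        {{"APA --> 22"}, {"Je 06/01/2022"}, {"Ve 07/01/2022"}, {null}, {"AR --> 6"}, {"Ma 04/01/2022"}}
    ),
    // True only for non-null values containing "->"; "and" short-circuits, so Text.Contains never sees null
    IsHeader = (value as nullable text) as logical => value <> null and Text.Contains(value, "->"),
    // Copy header values into a new column, leaving null everywhere else
    AddedGroup = Table.AddColumn(Source, "Group", each if IsHeader([Column1]) then [Column1] else null, type nullable text),
    // Fill each header down over the rows that follow it
    FilledDown = Table.FillDown(AddedGroup, {"Group"}),
    // Keep only the date rows (drop headers and blank separators)
    DatesOnly = Table.SelectRows(FilledDown, each [Column1] <> null and not IsHeader([Column1])),
    Renamed = Table.RenameColumns(DatesOnly, {{"Column1", "Date"}}),
    Result = Table.ReorderColumns(Renamed, {"Group", "Date"})
in
    Result
```

With the sample above, the result has three rows: two dates under "APA --> 22" and one under "AR --> 6".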

Related

special vlookup in power query

Thank you for answering our questions. Please look at the picture below: I need to transfer the appropriate Rate value from the first range to the second.
If I understand you correctly, the code below should do what you require.
Read the code comments and explore the Applied Steps to better understand the algorithm.
let
//Read in the lookup table
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
lookupTable = Table.TransformColumnTypes(Source,{
{"Date", Int64.Type},
{"P Code",Int64.Type},
{"SRC", Int64.Type},
{"Rate", Int64.Type}
}),
//read in data table
Source2 = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
typeIt2 = Table.TransformColumnTypes(Source2,{
{"Date", Int64.Type},
{"P.Code", Int64.Type},
{"Src", Int64.Type}
}),
//Join the two tables based on PCode and Src
join = Table.NestedJoin(typeIt2,{"P.Code","Src"},lookupTable,{"P Code","SRC"},"Joined", JoinKind.LeftOuter),
//for each joined subtable
// Sort descending by date
// Select only those rows where the date in table 2 is >= the corresponding date from table 1
// Then extract the first row Rate value (as that will be the closest to the date in table 2)
#"Added Custom" = Table.AddColumn(join, "Rate", each Table.SelectRows(Table.Sort([Joined],
{"Date",Order.Descending}),(t)=> t[Date] <= [Date])[Rate]{0}, Int64.Type),
//Remove unneeded join table column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Joined"})
in
#"Removed Columns"
Lookup Table
Results
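The selection rule inside the custom column (take the most recent rate dated on or before the row's date) can be tried on its own with a small sketch. The Rates table and the LookupRate function below are made-up names and data, not taken from the question. Unlike the one-line version above, this sketch returns null when no rate qualifies instead of raising an error on {0}.

```powerquery
let
    // Hypothetical rate table: a rate becomes valid from its Date onward
    Rates = #table(type table [Date = Int64.Type, Rate = Int64.Type], {{1, 10}, {5, 20}, {9, 30}}),
    // Return the rate with the latest Date that is <= asOf, or null if there is none
    LookupRate = (asOf as number) as nullable number =>
        let
            Eligible = Table.SelectRows(Rates, each [Date] <= asOf),
            Latest = Table.Sort(Eligible, {{"Date", Order.Descending}})
        in
            if Table.IsEmpty(Latest) then null else Latest{0}[Rate]
in
    {LookupRate(0), LookupRate(5), LookupRate(7)}   // {null, 20, 20}
```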

How to make a pivot table with power query

I have a table with numbers like the one below:
Phone Number
123, 456, 890
123453
902, 423
I would like to build a pivot table that shows every phone number (the delimiter is ",") and counts how many times each one appears in the list. Can someone assist with that?
So far I only have an initial step, with the code below:
let
Source = Excel.CurrentWorkbook(){[Name="Phone_Number"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Phone Number", type text}})
in
#"Changed Type"
Update: question solved.
In Power Query:
Right-click the column, then choose Home > Split Column > By Delimiter, with delimiter: Comma and Advanced options: Rows.
Then right-click the column and choose Group By...
Use the default options and hit OK.
let
    Source = Excel.CurrentWorkbook(){[Name="Phone_Number"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Phone Number", type text}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes(#"Changed Type", {{"Phone Number", type text}}, "en-US"), {{"Phone Number", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Phone Number"),
    #"Grouped Rows" = Table.Group(#"Split Column by Delimiter", {"Phone Number"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped Rows"
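One thing to watch for: the sample values are separated by a comma followed by a space, so splitting on "," alone leaves a leading space on every number after the first, and " 456" would then be counted separately from "456". A minimal sketch of the same split-and-count idea with trimming added is shown below. The inline table is an assumption, and its last row ("456") is an extra row added here only to show a count greater than one.

```powerquery
let
    // Hypothetical sample; the final row is not in the question's data
    Source = #table(type table [#"Phone Number" = text], {{"123, 456, 890"}, {"123453"}, {"902, 423"}, {"456"}}),
    // Turn each cell into a list of numbers with surrounding spaces removed
    ToLists = Table.TransformColumns(Source, {{"Phone Number", each List.Transform(Text.Split(_, ","), Text.Trim), type list}}),
    // One row per phone number
    ToRows = Table.ExpandListColumn(ToLists, "Phone Number"),
    // Count the occurrences of each number
    Counted = Table.Group(ToRows, {"Phone Number"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    Counted
```

Here "456" ends up with a count of 2 and every other number with a count of 1.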

Group records - keeping first Start Date (oldest) and last End Date (most recent)

I'm new to Power Query, so I'm hoping someone can help me with this problem. 
I have a dataset with ID, Status, Start Date and End Date. There are multiple rows for each ID with different start and end dates. Here's a sample of my dataset.
|ID |Status |Start Date |End Date |
|1 |A |01/04/2015 |28/05/2015 |
|1 |A |28/05/2015 |15/06/2016 |
|1 |B |15/06/2016 |19/06/2016 |
|1 |B |19/06/2016 |31/07/2016 |
|1 |B |31/07/2016 | |
|2 |B |01/03/2017 |03/06/2018 |
|2 |A |03/06/2018 |07/08/2018 |
|2 |A |07/08/2018 |31/12/2018 |
|2 |C |31/12/2018 |01/09/2019 |
|2 |C |01/09/2019 |03/05/2020 |
|2 |A |03/05/2020 | |
I want to group consecutive rows (where the End Date is the same as the Start Date of the next row) that have the same status for each ID, and take the oldest Start Date and the most recent End Date. End Date will be blank for records that are still active. Here is the output I'm looking for.
|ID |Status |Start Date |End Date |
|1 |A |01/04/2015 |15/06/2016 |
|1 |B |15/06/2016 | |
|2 |B |01/03/2017 |03/06/2018 |
|2 |A |03/06/2018 |31/12/2018 |
|2 |C |31/12/2018 |03/05/2020 |
|2 |A |03/05/2020 | |
Is this possible? 
Many Thanks for your help.
This M code seems to work for what you described. It starts with your sample data in a table named Table1 in an Excel spreadsheet.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "New Start Date",
each if [Index] = 0 or ([Index] > 0 and [ID] = #"Added Index"[ID]{[Index]-1} and [Status] <> #"Added Index"[Status]{[Index]-1}) then [Start Date]
else if [ID] <> #"Added Index"[ID]{[Index]-1}
then [Start Date]
else null),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "New Finish Date",
each if [Index] = List.Last(#"Added Index"[Index]) or ([ID] <> #"Added Index"[ID]{[Index]+1} or [Status] <> #"Added Index"[Status]{[Index]+1}) then [End Date]
else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"New Start Date"}),
#"Added Custom2" = Table.AddColumn(#"Filled Down", "Date Flag",
each if [Index] = 0
then if ([ID] <> #"Filled Down"[ID]{[Index]+1} or [Status] <> #"Filled Down"[Status]{[Index]+1}) or ([ID] = #"Filled Down"[ID]{[Index]+1} and [Status] = #"Filled Down"[Status]{[Index]+1} and [New Start Date] <> #"Filled Down"[New Start Date]{[Index]+1})
then 1
else 0
else if [Index] = List.Last(#"Filled Down"[Index])
then 1
else if ([ID] = #"Filled Down"[ID]{[Index]-1} and [Status] = #"Filled Down"[Status]{[Index]-1} and [New Start Date] <> #"Filled Down"[New Start Date]{[Index]+1}) or ([ID] = #"Filled Down"[ID]{[Index]+1} and [Status] <> #"Filled Down"[Status]{[Index]+1} and [New Start Date] <> #"Filled Down"[New Start Date]{[Index]+1})
then 1
else 0),
#"Filtered Rows" = Table.SelectRows(#"Added Custom2", each ([Date Flag] = 1)),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"ID", "Status", "New Start Date", "New Finish Date"})
in
#"Removed Other Columns"
First, replace the null values in End Date with today's date.
Second, use Group By (Advanced) on ID and Status with two new columns: the minimum of Start Date and the maximum of End Date.
The code is below.
P.S. #"Promoted Headers" is necessary only in my sample.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type any}, {"Column2", type text}, {"Column3", type any}, {"Column4", type any}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Status", type text}, {"Start Date", type datetime}, {"End Date", type datetime}}),
#"Replaced null withToday" = Table.ReplaceValue(#"Changed Type1",null,DateTime.LocalNow(),Replacer.ReplaceValue,{"End Date"}),
#"Grouped Rows" = Table.Group(#"Replaced null withToday", {"ID", "Status"}, {{"Minim Data (1)", each List.Min([Start Date]), type nullable datetime}, {"Maxim Data (1)", each List.Max([End Date]), type nullable datetime}})
in
#"Grouped Rows"
Thanks @Marc Pincince and @Marius Durlea for your responses. Following @Storax's comment, I now have a solution. I'm adding the code below too.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBLDoUgDEW3Yjom6YeHwFDfMoz734YtEKXRAYN7esItHAcwBNj0ECP9UIiTBilIqYczPM6EA3BCWi2st7N7rKF+O3VyohbntzPhAEsbyRjZqtFG2UIcd5Xb2TzWkJHKh/Pg1sfinb/HvbhaqM6ZcC+2PxJ67TNwe855AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Status = _t, #"Start Date" = _t, #"End Date" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Status", type text}, {"Start Date", type date}, {"End Date", type date}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID", "Status"}, {{"Start Date", each List.First([Start Date]), type nullable date}, {"End Date", each List.Last([End Date]), type nullable date}},GroupKind.Local
)
in
#"Grouped Rows"
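The key piece in this last query is the fourth argument of Table.Group. A small sketch of the difference between the default grouping and GroupKind.Local, using a made-up table that mirrors the status sequence of ID 2 (the step names are assumptions):

```powerquery
let
    // Hypothetical sample: status "A" occurs in two separate runs
    Source = #table(
        type table [ID = Int64.Type, Status = text],
        {{2, "B"}, {2, "A"}, {2, "A"}, {2, "C"}, {2, "A"}}
    ),
    // Default (GroupKind.Global): all "A" rows fall into one group, wherever they sit
    GlobalGroups = Table.Group(Source, {"ID", "Status"}, {{"Rows", each Table.RowCount(_), Int64.Type}}),
    // GroupKind.Local: a new group starts each time the key changes from one row to the next
    LocalGroups = Table.Group(Source, {"ID", "Status"}, {{"Rows", each Table.RowCount(_), Int64.Type}}, GroupKind.Local)
in
    [Global = GlobalGroups, Local = LocalGroups]
```

Global gives three groups (B: 1, A: 3, C: 1), while Local gives four (B: 1, A: 2, C: 1, A: 1), which matches the requested output shape. Local grouping only compares the key columns of adjacent rows, so it relies on the table already being in the right order and does not itself check that each End Date equals the next Start Date.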

How to get a calculated column based on 2 different column in Power Query

I'd appreciate any help or suggestions.
I have a table structured as follows:
Date, Product Code ,Result ,Schedule
Day1, A ,0 ,0
Day2, A ,20 ,100
Day3, A ,200 ,100
How can I add a new column [Different] that resets for each product code?
Date ,Product Code ,Result ,Schedule ,Different
Day1 ,A ,0 ,0 ,0
Day2 ,A ,20 ,100 ,-80
Day3 ,A ,200 ,100 ,20
where Different = the previous row's Different + Result - Schedule.
Thank you.
Paste the code below into Home > Advanced Editor, save, and name the query fnRunningSum.
It creates a function that computes a cumulative running total on the column named Amount.
(MyTable as table) =>
let
    Source = Table.Buffer(MyTable),
    MyColumn="Amount",
    TableType = Value.Type(Table.AddColumn(Source, "Cumul", each null, type number)),
    Cumulative = List.Skip(List.Accumulate(Table.Column(Source,MyColumn),{0},(cumulative,MyColumn) => cumulative & {List.Last(cumulative) + MyColumn})),
    Cumu = Table.FromColumns(Table.ToColumns(Source)&{Cumulative},TableType)
in
    Cumu
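To see what the List.Accumulate expression in this function does, here is a small sketch on a plain list. The values are Result - Schedule for the three sample rows in the question; the names Amounts and Running are made up for the illustration.

```powerquery
let
    // Result - Schedule for Day1..Day3 of product A
    Amounts = {0, -80, 100},
    // Start from the seed {0}, append (last total + current value) at each step,
    // then drop the seed with List.Skip
    Running = List.Skip(
        List.Accumulate(Amounts, {0}, (state, current) => state & {List.Last(state) + current})
    )
in
    Running   // {0, -80, 20}
```

The result matches the Different column the question asks for.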
Load your data into powerquery, here assumed to be in range Table1. Paste code below in Home ... Advanced Editor...
What it does is (1) add a new column that is Result-Schedule (2) Group on Product Code and cumulative sum the new column (3) Expand to get the columns back
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Amount", each [Result]-[Schedule]),
    #"Grouped Rows" = Table.Group(#"Added Custom" , {"Product Code"}, {{"AllData", fnRunningSum}}),
    #"Expanded AllData" = Table.ExpandTableColumn(#"Grouped Rows", "AllData", {"Date", "Result", "Schedule", "Cumul"}, {"Date", "Result", "Schedule", "Different"})
in
    #"Expanded AllData"
Try to achieve that with Column From Examples: type what you want to see in the new column. It usually works fine for me.
Documentation in case you have never used it before:
https://learn.microsoft.com/en-us/power-bi/desktop-add-column-from-example

Add calculated custom column using entire table for calculation power query or dax

I have this table in Power BI with events concerning some objects
|new_state |object_id | created_at |
|new |1 |11/4/2015 1:50:48 PM |
|in_use |3 |11/4/2015 2:31:10 PM |
|in_use |1 |11/4/2015 2:31:22 PM |
|deleted |2 |11/4/2015 3:14:10 PM |
.....
I am trying to add a calculated column either in DAX or power query so that for each row I would have the previous_state of that object. From a logical point of view it's not difficult: you group by id and for each row in that group you look for the closest previous time and get the "new_state" which would represent the previous state for that row.
I have tried doing this by creating a function in power query and use it in a custom column but I am getting a "cyclic reference detected" error and cannot do it. Any ideas on solutions?
It's hard to express comparisons between rows today in Power Query. Most of the functions assume the table is just an unordered set of rows.
To expand on Oğuz's comment, you could add an index column, then add a column PreviousState indexing into the previous row (or null). As an optimization it might be much faster if you buffer the whole table first.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WqslLLY8vLkksSVWoyU/KSk0uic9MUahRSC5KBYqlxCeWKNQoxepAFCrUGAKRob6JvpGBoamCoZWpgZWJhUKAL0xNZl58aTHQJGNkZUZWxoZWhgZYlBliKDMyQlKWkpqTCnSDQo0RsjpjK0MThHGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split Column by Delimiter" = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Column1.1", "Column1.5"}),
#"Trimmed Text" = Table.TransformColumns(#"Removed Columns",{},Text.Trim),
#"Promoted Headers" = Table.PromoteHeaders(#"Trimmed Text"),
ChangedType = Table.TransformColumnTypes(#"Promoted Headers",{{"object_id", Int64.Type}, {"created_at", type datetime}, {"new_state", type text}}),
#"Added Index" = Table.AddIndexColumn(ChangedType, "Index", 0, 1),
Buffer = Table.Buffer(#"Added Index"),
#"Added Custom" = Table.AddColumn(Buffer, "PreviousState", each try Buffer{[Index] - 1}[created_at] otherwise null),
#"Inserted Time Subtraction" = Table.AddColumn(#"Added Custom", "TimeDifference", each [created_at] - [PreviousState], type duration)
in
#"Inserted Time Subtraction"
There are surely neater solutions than this but in DAX you can create a calculated column (prevdate) to store the datetime of the previous entry:
=
CALCULATE (
MAX ( [created_at] ),
ALL ( table1 ),
Table1[created_at] < EARLIER ( [created_at] ),
Table1[object_id] = EARLIER ( [object_id] ) )
Then you add another calculated column to store the state at that previous time:
=
CALCULATE (
VALUES ( Table1[new_state] ),
ALL ( Table1 ),
Table1[created_at] = EARLIER ( Table1[prevdate] ),
Table1[object_id] = EARLIER ( Table1[object_id] )
)
I've solved it :D
#"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"object_id", Order.Ascending}, {"created_at", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
Buffer = Table.Buffer(#"Added Index"),
#"Added Custom" = Table.AddColumn(Buffer, "PreviousState", each try (if Buffer{[Index] - 1}[object_id]=Buffer{[Index]}[object_id] then Buffer{[Index] - 1}[new_state] else null ) otherwise null)
I'm not sure it's not mostly a hack but it seems to be working. Do you see any point where it might fail in the future?
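One possible weak point, offered as an observation rather than a confirmed problem: try ... otherwise null swallows every error in the expression, not only the missing row before index 0, so an unrelated problem (a renamed column, for example) would silently produce nulls. A sketch of the same approach with an explicit guard instead is shown below. The inline table copies the four sample rows from the question, and the step names are assumptions.

```powerquery
let
    // Sample rows from the question, typed inline
    Source = #table(
        type table [new_state = text, object_id = Int64.Type, created_at = datetime],
        {
            {"new",     1, #datetime(2015, 11, 4, 13, 50, 48)},
            {"in_use",  3, #datetime(2015, 11, 4, 14, 31, 10)},
            {"in_use",  1, #datetime(2015, 11, 4, 14, 31, 22)},
            {"deleted", 2, #datetime(2015, 11, 4, 15, 14, 10)}
        }
    ),
    // Put each object's events next to each other, oldest first
    Sorted = Table.Sort(Source, {{"object_id", Order.Ascending}, {"created_at", Order.Ascending}}),
    Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
    Buffered = Table.Buffer(Indexed),
    // Look one row back, but only when that row exists and belongs to the same object
    WithPrevious = Table.AddColumn(
        Buffered,
        "PreviousState",
        each
            if [Index] > 0 and Buffered{[Index] - 1}[object_id] = [object_id]
            then Buffered{[Index] - 1}[new_state]
            else null,
        type nullable text
    )
in
    WithPrevious
```

With this sample, only the second event of object 1 gets a previous state ("new"); the first event of every object gets null.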

Resources