Merge/Combine two header rows - powerquery

I have ever-changing header rows for a list of 1000+ shops (the column order changes) and I need to combine the first two rows into one header.
This is a simplified example table with 3 distinct shops and 4 weeks of data (the real data is > 40,000 rows by 88 columns):
| Column0 | Column1 | Column2 | Column3 | Column4 | Column5 | Column6 | Column8 | Column11 | Index | Shopbrand |
| Product | 00/00182 | Week | ProductA | ProductA | ProductA | ProductB | ProductB | ProductB | 359 | ShopBrand0 |
| Datatype | 00/00182 | | Sc Amount | Sc Value | Sc Profit | Sc Amount | Sc Value | Sc Profit | 360 | ShopBrand0 |
| Week | 00/00182 | 202201 | | | | | | | 361 | ShopBrand0 |
| Week | 00/00182 | 202202 | 4 | 11,96 | 4 | | | | 362 | ShopBrand0 |
| Week | 00/00182 | 202203 | 5 | 14,95 | 8 | | | | 363 | ShopBrand0 |
| Week | 00/00182 | 202204 | 1 | 6,49 | 1,5 | | | | 364 | ShopBrand0 |
| Product | 00/00205 | Week | ProductA | ProductA | ProductA | ProductB | ProductB | ProductB | 400 | ShopBrand0 |
| Datatype | 00/00205 | | Sc Amount | Sc Value | Sc Profit | Sc Amount | Sc Value | Sc Profit | 401 | ShopBrand0 |
| Week | 00/00205 | 202201 | | | | | | | 402 | ShopBrand0 |
| Week | 00/00205 | 202202 | | | | | | | 403 | ShopBrand0 |
| Week | 00/00205 | 202203 | 1 | 5,09 | 0,79 | 1 | 6,49 | 1,5 | 404 | ShopBrand0 |
| Week | 00/00205 | 202204 | 0 | 0 | -19,19 | 1 | 6,49 | -10 | 405 | ShopBrand0 |
| Product | 00/09002 | Week | ProductA | ProductA | ProductA | ProductB | ProductB | ProductB | 42557 | ShopBrand1 |
| Datatype | 00/09002 | | Sc Amount | Sc Value | Sc Profit | Sc Amount | Sc Value | Sc Profit | 42558 | ShopBrand1 |
| Week | 00/09002 | 202201 | 2 | 11,1 | 3,22 | 4 | 23,36 | 5,88 | 42559 | ShopBrand1 |
| Week | 00/09002 | 202202 | 5 | 25,45 | 3,95 | | | | 42560 | ShopBrand1 |
| Week | 00/09002 | 202203 | 3 | 14,97 | 2,09 | 2 | 8,98 | 0,48 | 42561 | ShopBrand1 |
| Week | 00/09002 | 202204 | 2 | 8,98 | 0,48 | 7 | 33,83 | 3,88 | 42562 | ShopBrand1 |
Note that every shop ID (Column1) has TWO header rows, whose "product" and "datatype" values (Column3 onwards) need to be combined.
I can group by shop to get individual tables (the real data has approx. 1200 shops), but how can I efficiently combine the two rows so that, e.g., Column3 gets the header "ProductA : Sc Amount"?
Note that due to the data source, ProductA is NOT always in Column3; for some shops a different product comes first, so I cannot simply apply one fixed header to all shops. I have to derive the header for each shop individually.
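For illustration only, the header-combining step itself is a simple zip-and-join; here is a minimal Python sketch of that logic (the function name and the fixed leading names are made up for the example, this is not the M solution):

```python
# Combine two header rows into one list of column names.
# The first few columns keep fixed names; the remaining pairs are
# joined with " : ", mirroring "ProductA : Sc Amount".
def combine_headers(row1, row2, fixed=None):
    fixed = fixed or []
    combined = [f"{a} : {b}".strip() for a, b in zip(row1, row2)]
    return fixed + combined[len(fixed):]

row1 = ["Product", "00/00182", "Week", "ProductA", "ProductB"]
row2 = ["Datatype", "00/00182", "", "Sc Amount", "Sc Amount"]
print(combine_headers(row1, row2, fixed=["delete-able", "shop-nr", "Week"]))
# ['delete-able', 'shop-nr', 'Week', 'ProductA : Sc Amount', 'ProductB : Sc Amount']
```

This is the per-shop header I am after; the question is how to do it efficiently in M.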
This is the first part of the code with a bit more table data (no row combining, just grouping)
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText(
"rZnNbtw2FIVfJZg1B+X9kUQuE/QBCgRoF0YWRuKiRdo4COxF3r7iJeURZ3itU8CAxxZk+ZAiv3N5SN/dnX778fjl+fPTKZxi/CVGSrxe/vHw8HX90X73/ujyg3spU16/f/zr8fuHH/ffvsR3p0/h7vTr/dP908/vD32j5cHP797/+/j87ale/37/z/NDvVw1//z7CXtE5jhotL3TrkGOzJFqy1dfMhMuUS50/RCFPLfrnRLjSrJeTEVJQy4/U68kuJL1Z/3MQcsMUJh6KcWlppHUzQ2ZJ1xy3nem3E0hl3eNQZNpzbjW4kzggkskRyLhEtmRGOE/lqDojbPWGzFwmTVZcLrJoXvB6SbGpn/BMSdxeoXzTepI4FzT5EjgHNPsSOD4koPvguNLDr4Lji8ZX+1zlpCuygfXKcZh5iOYz+VJSTjK7KCccJSZ939ZO2OFMAaxspNwhtlhOOEM82WQqY7RknstHGZ2YE44zOzAnHCY2YE54TCzA3PCYWanFiccX4ljiYzzKg6vGedV2JHAMRUH04xjKk6pzTid4tCZcTrFoTOP6LzKtDXCvFmm1Tii4DrT1kZPb5VpNR5xUxt8JdNqPOJmJzFGT+MRNzuJ8mjpyRRijS9W3m4WBY1HIO00rVq3z5lyK5wXybOlKI1HYO0kx3GmXC0hl8KlwYq6xqNCuBMtj1pwaomWglViu0WhNCfBVjSNR7Vxp9qV18FAHtXInVQ6kDqqlTup3F4sB1uF1nneZr4PrkpH1fMiSuMCrIS7oM++t69IuBvsUTaUbZtyZgq6vSSHbORp2/wRbhEnCCvhjnCCsBLuACcIK+G8O0FYCYfbCcJKONROEFbCYaZxdlDG0WUHXcbRtUdvDbTTwtm9Db0tbp55LczJ1HBindirjBPLXQ3nOVg6vM3ByjjCTvxVxhF24q8yjrATf5VxhJ34q4wj3MffSpHFvVg3Oyo4y04OVsFZdnKwCo6wHOzbVHCAnTSrgsPmpFkVHDYnzaqMYOvTbI7xTU9oV5tNXas0zLOt2dOb5dm12TRodjdercmXRNuHp+3QlSXYaE4hpaabcd2OLJudKagFwHoAu8r1h8mvyxUOLRdojY4cXtJRf8JZdAnX1aFIaUGkplN5ef3+vPl13fKKZfCEgp1UTsHOnexgYql5JrZYswoLLjyfhhl6MIWrruK6S+swp1CZqDnMOpxDHfJpm7gJ101b52Kw3LhmRPIHYsaF84bp2tOKqb2Djc2aR+26EleER0YcC9d9TuXfpnEK82aS3JbUwG0dnXGnEe1GYq4j4WxmijBuNf8c+WqXwNOCO65P0GM13Gd9mB7CuuDuovGOz+4uldKpnrQWXdxctJsPqvNhpfQK1lYOFtxdFsGbsJ2Cc+gW76KGe6p27XrCOzHcR24unxbcM24ynxbcIM45dBHBzcDbhpK4HT1fpnDdW1qB0SCNjYT7geWiYolW2kJ0sv15XTyWVscS7oyXs+qr5FXuploh1vltHkm4R/rgPmhgVcOdYW98u5MYy+K+4GUoUsd0qndyK7UJNwinYW+bAet6GcyTRRf3Cuf/MQq4eyT2Iuct5IyFcUfJPtzZzbqo2xo5VVo5GKxFGHeZbC7ryR/2N+Me68/UW8qx38pLRMi4tWQLd1eTPioPqzDuLbmcbA7/v1TUcG/JZehSXWAtb7S026KCtFyQb8z16T8=",
BinaryEncoding.Base64), Compression.Deflate)),
let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column0 = _t, Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column8 = _t, Column11 = _t, Index = _t, Shopbrand = _t]),
#"Grouped Rows" = Table.Group(Source, {"Column1"},
{{"storetables",
each _, type table [Column0=nullable text, Column1=nullable text, Column2=nullable text, Column3=nullable text, Column4=nullable text, Column5=nullable text, Column6=nullable text, Column8=nullable text, Column11=nullable text, Index=nullable text, Shopbrand=nullable text]}})
in
#"Grouped Rows"
Basically I need a more efficient version of what I already came up with. On smaller test data I cobbled together the code fragment below, inspired by https://community.powerbi.com/t5/Desktop/Power-Query-Help/m-p/381272, but on the real 40,000+ row table it was still running after 100 minutes without feedback, so I stopped it.
// Converts the list of records back into a table
Table.FromRecords(
// Creates a list by applying the transform operation to each row of the table
Table.TransformRows(
#"table",
(row) =>
let
// Keep only Cols that need transforming => remove the following:
TransformTheseColumns = List.RemoveItems(
// Removes all occurrences of the given values in the list2 from list1
Record.FieldNames(row),
//Returns the names of the fields in the record
{"Column0","Column1","Column2","Index","Shopbrand"}
),
Transforms = List.Transform(
TransformTheseColumns,
(name) =>
{name,(cell) =>
if Text.Contains(row[Column0],"Product")
then
cell
& " : "
& Table.FirstValue(
Table.SelectColumns(
Table.SelectRows(
#"table-empty-rows-removed",
each
[Index] = row[Index] + 1
),
name
)
)
else
cell
}
)
in
Record.TransformFields(
// Returns a record after applying transformations specified
row, Transforms
)
)
)
Another idea I came up with was to combine just the header rows, which leaves me with a list, but I don't know how to re-add it as the first row of the table. Still, it might be a good start for a more efficient solution, i.e. running through all the grouped tables, extracting the first two rows, combining them, and sticking the result back into the table.
= List.Transform( List.Zip( Table.ToRows ( just_two_headers )), each Lines.ToText( _, " : "))
The outcome should be a single combined header row for each shop.

To combine row1/row2 and make that the column names of all columns:
#"NewNames" = Table.AddColumn(Table.Transpose(Table.FirstN(Source,2)), "Custom", each Text.Trim([Column1]&":"&[Column2]))[Custom],
#"Rename"=Table.RenameColumns( Table.Skip(Source,2), List.Zip( { Table.ColumnNames( Source ), #"NewNames" } ) )
or same as above, but then specify the first 3 column names individually as special cases:
FirstFew={"A","B","C"},
#"NewNames1" = Table.AddColumn(Table.Transpose(Table.FirstN(Source,2)), "Custom", each Text.Trim([Column1]&":"&[Column2]))[Custom],
#"NewNames" = FirstFew & List.Skip(#"NewNames1",List.Count(FirstFew)),
#"Rename"=Table.RenameColumns( Table.Skip(Source,2), List.Zip( { Table.ColumnNames( Source ), #"NewNames" } ) )
updated answer
I don't understand why you are bothering with the grouping. It seems you can do it in one shot, then apply a filter to pull out the bad rows:
<snip>
#"Reordered Columns" = Table.ReorderColumns(Source, {"Index","Shopbrand","Column0","Column1","Column2","Column3","Column4","Column5","Column6","Column8","Column11" }),
NewNames = Table.AddColumn(Table.Transpose(Table.FirstN( #"Reordered Columns", 2)),"Custom",each Text.Trim([Column1]& " : "& [Column2]))[Custom],
rename_headers = Table.RenameColumns(Table.Skip( #"Reordered Columns", 2),List.Zip({Table.ColumnNames( #"Reordered Columns"),{"Index","Shopbrand","delete-able","shop-nr","Week"}& List.Skip(NewNames, 5)})),
#"Duplicated Column" = Table.DuplicateColumn(rename_headers, "shop-nr", "shop-nr - Copy"),
#"Reordered Columns2" = Table.ReorderColumns(#"Duplicated Column",{"shop-nr", "Index", "Shopbrand", "delete-able", "shop-nr - Copy", "Week"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns2", each ([#"delete-able"] = "Week"))
in #"Filtered Rows"

Try
Transpose table
Concatenate first 2 columns
Transpose back

This is my final solution, incorporating #horseyride's answer above.
let
Source = Table.FromRows(
Json.Document(
Binary.Decompress(
Binary.FromText(
"rZnNbtw2FIVfJZg1B+X9kUQuE/QBCgRoF0YWRuKiRdo4COxF3r7iJeURZ3itU8CAxxZk+ZAiv3N5SN/dnX778fjl+fPTKZxi/CVGSrxe/vHw8HX90X73/ujyg3spU16/f/zr8fuHH/ffvsR3p0/h7vTr/dP908/vD32j5cHP797/+/j87ale/37/z/NDvVw1//z7CXtE5jhotL3TrkGOzJFqy1dfMhMuUS50/RCFPLfrnRLjSrJeTEVJQy4/U68kuJL1Z/3MQcsMUJh6KcWlppHUzQ2ZJ1xy3nem3E0hl3eNQZNpzbjW4kzggkskRyLhEtmRGOE/lqDojbPWGzFwmTVZcLrJoXvB6SbGpn/BMSdxeoXzTepI4FzT5EjgHNPsSOD4koPvguNLDr4Lji8ZX+1zlpCuygfXKcZh5iOYz+VJSTjK7KCccJSZ939ZO2OFMAaxspNwhtlhOOEM82WQqY7RknstHGZ2YE44zOzAnHCY2YE54TCzA3PCYWanFiccX4ljiYzzKg6vGedV2JHAMRUH04xjKk6pzTid4tCZcTrFoTOP6LzKtDXCvFmm1Tii4DrT1kZPb5VpNR5xUxt8JdNqPOJmJzFGT+MRNzuJ8mjpyRRijS9W3m4WBY1HIO00rVq3z5lyK5wXybOlKI1HYO0kx3GmXC0hl8KlwYq6xqNCuBMtj1pwaomWglViu0WhNCfBVjSNR7Vxp9qV18FAHtXInVQ6kDqqlTup3F4sB1uF1nneZr4PrkpH1fMiSuMCrIS7oM++t69IuBvsUTaUbZtyZgq6vSSHbORp2/wRbhEnCCvhjnCCsBLuACcIK+G8O0FYCYfbCcJKONROEFbCYaZxdlDG0WUHXcbRtUdvDbTTwtm9Db0tbp55LczJ1HBindirjBPLXQ3nOVg6vM3ByjjCTvxVxhF24q8yjrATf5VxhJ34q4wj3MffSpHFvVg3Oyo4y04OVsFZdnKwCo6wHOzbVHCAnTSrgsPmpFkVHDYnzaqMYOvTbI7xTU9oV5tNXas0zLOt2dOb5dm12TRodjdercmXRNuHp+3QlSXYaE4hpaabcd2OLJudKagFwHoAu8r1h8mvyxUOLRdojY4cXtJRf8JZdAnX1aFIaUGkplN5ef3+vPl13fKKZfCEgp1UTsHOnexgYql5JrZYswoLLjyfhhl6MIWrruK6S+swp1CZqDnMOpxDHfJpm7gJ101b52Kw3LhmRPIHYsaF84bp2tOKqb2Djc2aR+26EleER0YcC9d9TuXfpnEK82aS3JbUwG0dnXGnEe1GYq4j4WxmijBuNf8c+WqXwNOCO65P0GM13Gd9mB7CuuDuovGOz+4uldKpnrQWXdxctJsPqvNhpfQK1lYOFtxdFsGbsJ2Cc+gW76KGe6p27XrCOzHcR24unxbcM24ynxbcIM45dBHBzcDbhpK4HT1fpnDdW1qB0SCNjYT7geWiYolW2kJ0sv15XTyWVscS7oyXs+qr5FXuploh1vltHkm4R/rgPmhgVcOdYW98u5MYy+K+4GUoUsd0qndyK7UJNwinYW+bAet6GcyTRRf3Cuf/MQq4eyT2Iuct5IyFcUfJPtzZzbqo2xo5VVo5GKxFGHeZbC7ryR/2N+Me68/UW8qx38pLRMi4tWQLd1eTPioPqzDuLbmcbA7/v1TUcG/JZehSXWAtb7S026KCtFyQb8z16T8=",
BinaryEncoding.Base64
),
Compression.Deflate
)
),
let
_t = ((type nullable text) meta [Serialized.Text = true] )
in
type table [Column0 = _t,Column1 = _t,Column2 = _t,Column3 = _t,Column4 = _t,Column5 = _t,Column6 = _t,Column8 = _t,Column11 = _t,Index = _t,
Shopbrand = _t
]
),
#"Reordered Columns" = Table.ReorderColumns(
Source, {"Index","Shopbrand","Column0","Column1","Column2","Column3","Column4","Column5","Column6","Column8","Column11" }
),
// Grouping and expanding based on this solution here https://stackoverflow.com/a/73800993/1440255
#"Grouped Rows" = Table.Group(
#"Reordered Columns",
{
"Column1"
},
{{
"storetables",
each
_,
type table [Column0 = nullable text,Column1 = nullable text,Column2 = nullable text,Column3 = nullable text,Column4 = nullable text,Column5 = nullable text,Column6 = nullable text,Column8 = nullable text,Column11 = nullable text,Index = nullable text,Shopbrand = nullable text]
}}
),
// helper only for debugging, to see one of the store tables
#"00/00182" = #"Grouped Rows"{[Column1 = "00/00182"]}[storetables],
// go through all tables in storetables and join the first two rows and use them as new headers
#"newshop" = Table.TransformColumns(
#"Grouped Rows",
{
{
"storetables",
each
let
// join the first two rows with " : "
NewNames = Table.AddColumn(
Table.Transpose(Table.FirstN(_, 2)),
"Custom",
each Text.Trim([Column1]& " : "& [Column2])
)[Custom],
// rename the headers, but keep the first 5 from the fixed list below
rename_headers = Table.RenameColumns(
Table.Skip(_, 2),
List.Zip(
{Table.ColumnNames(_),
{
"Index","Shopbrand","delete-able","shop-nr","Week"
}
& List.Skip(NewNames, 5)
}
)
)
in
rename_headers
}
}
),
// Expand all storetables back into one beautiful table
#"expand storetables" =
let
ColumnNames = Table.ColumnNames(
Table.Combine(#"newshop"[storetables])
),
ExpandColumns = Table.ExpandTableColumn(
#"newshop",
"storetables",
ColumnNames
)
in
ExpandColumns
in
#"expand storetables"
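Stripped of the M specifics, the per-shop flow above (partition by shop, build names from the first two rows, relabel the data rows) can be sanity-checked with a small Python sketch over made-up mini-data with a single data column:

```python
from itertools import groupby

# Made-up mini-data: (shop, Column0, Column2, one data column).
rows = [
    ("00/00182", "Product",  "Week",   "ProductA"),
    ("00/00182", "Datatype", "",       "Sc Amount"),
    ("00/00182", "Week",     "202202", "4"),
    ("00/00205", "Product",  "Week",   "ProductB"),
    ("00/00205", "Datatype", "",       "Sc Profit"),
    ("00/00205", "Week",     "202203", "1,5"),
]

result = {}
for shop, grp in groupby(rows, key=lambda r: r[0]):   # partition by shop
    grp = list(grp)
    header1, header2, data = grp[0], grp[1], grp[2:]  # first two rows = headers
    names = [f"{a} : {b}".strip() for a, b in zip(header1[3:], header2[3:])]
    result[shop] = [dict(zip(["Week"] + names, r[2:])) for r in data]

print(result["00/00182"])  # [{'Week': '202202', 'ProductA : Sc Amount': '4'}]
```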

Related

Counting customers that have sales in the same products

The data table has customers, products sales etc.
Based on a slicer selection products i want to count how many customers have sales in all selected products.
As below i only want to count customer B because he is the only one having all selected products
Customer a Product a product b product c
A. 1 1
B. 1 1 1
C. 1
D. 1 1
E. 1 1
Okay, I think your table is this:
Then you need this DAX code:
CustomerHavingAll3Products =
VAR Onlya =
CALCULATETABLE ( VALUES ( 'Product'[Customer] ), 'Product'[Product a] <> 0 )
VAR Onlyb =
CALCULATETABLE ( VALUES ( 'Product'[Customer] ), 'Product'[Product b] <> 0 )
VAR Onlyc =
CALCULATETABLE ( VALUES ( 'Product'[Customer] ), 'Product'[Product c] <> 0 )
VAR CombinedAll =
INTERSECT ( INTERSECT ( Onlya, Onlyb ), Onlyc )
RETURN
COUNTX ( CombinedAll, [Customer] )
If we test it on a table visual:
Please do not forget to click the down-pointing arrow on the Customer column in the filter pane and ensure that "Show items with no data" is checked; see the picture below.
You can do it on the Power Query side, totally dynamically:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIEYjAVqxOt5AQVMISLOMOkgRSI74KuxRVJi1JsLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, #"Product a" = _t, #"product b" = _t, #"product c" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product a", Int64.Type}, {"product b", Int64.Type}, {"product c", Int64.Type}}),
count_columns = Table.ColumnCount(Source)-1,
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Customer"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Columns", {"Customer"}, {{"Total Count", each Table.RowCount(_), Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each [Total Count] = count_columns )
in
#"Filtered Rows"
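The core of both answers is the same set logic: a customer qualifies only if they appear in the customer set of every selected product. A Python sketch with hypothetical ownership sets (only B holds all three, as in the question):

```python
# Customers owning every selected product = those whose product set
# contains the selected set (equivalently, the intersection of the
# per-product customer sets).
sales = {
    "A": {"a", "b"},
    "B": {"a", "b", "c"},
    "C": {"c"},
    "D": {"a", "c"},
    "E": {"b", "c"},
}
selected = {"a", "b", "c"}
having_all = {cust for cust, prods in sales.items() if selected <= prods}
print(having_all)  # {'B'}
```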

Power Query MAX Value from a table

I'm trying to get the MAX date from one table onto a different table with Power Query. At the moment I'm stuck: all I get is a table based on a condition. Not sure if this is clear, so I'll explain with the code. This is my code at the moment:
let
Source = Table.NestedJoin(Table.NestedJoin(SKU,{"SKU"},q_UltColh_NEW,{"SKU"},"qUltColh_NEW",JoinKind.LeftOuter),{"SKU"},r_STK,{"SKU"},"Rep_Stk", JoinKind.LeftOuter),
.
.
.
#"Expanded Origem" = ...
#"Expanded Origem" = Table.ExpandTableColumn(#"Merged Queries", "Origem", {"Desc_ORI", "Parent_ORI"}, {"Origem.Desc_ORI", "Origem.Parent_ORI"}),
#"Added Last_Rec" = Table.AddColumn(#"Expanded Origem", "Last_Rec", each
let SKU = [SKU]
in Table.SelectRows(r_GOODSREC,each [SKU]=SKU)
)
in
#"Added Last_Rec"
I have two tables:
SKU Desc
46_24_ ABC
103_5_ DEF
doc_DATE RowNo SKU Cod_ART QTT
10/01/2017 1 46_24_ 46.24 50
14/01/2017 1 46_24_ 46.24 100
14/01/2017 1 103_5_ 103.5 50
16/01/2017 1 103_5_ 103.5 100
And I want to get:
SKU Desc Last_Entry Qtt
46_24_ ABC 14/01/2017 50
103_5_ DEF 16/01/2017 100
my code is returning a table with various columns:
SKU Desc Last_Entry
46_24_ ABC Table
103_5_ DEF Table
I believe once I get the max value I can just expand the table, unless you tell me that is a bad idea.
Thank you very much,
I got this with the code below:
Notes:
1. My date format is month/day/year whereas yours was day/month/year.
2. Also, in your question, you show an expected QTT of 50 for SKU 46_24_; but your source table has 100 as the QTT for the latest date of SKU 46_24_, which is why my table has 100 instead of 50.
I used the same two starting tables as you. I called them Table1 and Table2. (Table1 is the one with just the SKU and Desc columns.)
Then I merged those two tables into a new table called Merge1, using a left-outer join.
I guess the key points are:
I used "Group By" (i.e., Table.Group) to group by each SKU and get its max date value in a column I called Last_Entry, and I included all row data in a column I called AllData. Here's the Group By pop-up window:
Then, after the Group By, I expanded the embedded table in AllData and added a new column to flag and filter out the rows where the doc_Date was not equal to Last_Entry.
let
Source = Table.NestedJoin(Table1,{"SKU"},Table2,{"SKU"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(Source, "Table2", {"doc_Date", "RowNo", "SKU", "Cod_ART", "QTT"}, {"doc_Date", "RowNo", "SKU.1", "Cod_ART", "QTT"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table2",{{"doc_Date", type date}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"SKU"}, {{"Last_Entry", each List.Max([doc_Date]), type datetime}, {"AllData", each _, type table}}),
#"Expanded AllData1" = Table.ExpandTableColumn(#"Grouped Rows", "AllData", {"Desc", "doc_Date", "QTT"}, {"Desc", "doc_Date", "QTT"}),
#"Added Custom" = Table.AddColumn(#"Expanded AllData1", "Custom", each if[Last_Entry]=[doc_Date] then "Last_Entry" else "NotLast_Entry"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Last_Entry")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom", "doc_Date"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"SKU", "Desc", "Last_Entry", "QTT"})
in
#"Reordered Columns"
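The Group By idea above, reduced to its essentials (max date per SKU, then keep only the rows matching that date), can be sketched in Python with the question's sample data:

```python
from collections import defaultdict
from datetime import date

rows = [
    {"SKU": "46_24_", "doc_Date": date(2017, 1, 10), "QTT": 50},
    {"SKU": "46_24_", "doc_Date": date(2017, 1, 14), "QTT": 100},
    {"SKU": "103_5_", "doc_Date": date(2017, 1, 14), "QTT": 50},
    {"SKU": "103_5_", "doc_Date": date(2017, 1, 16), "QTT": 100},
]

# Group by SKU and find each SKU's latest date (the Last_Entry column).
last_entry = defaultdict(lambda: date.min)
for r in rows:
    last_entry[r["SKU"]] = max(last_entry[r["SKU"]], r["doc_Date"])

# Keep only the rows whose date equals their SKU's Last_Entry.
latest = [r for r in rows if r["doc_Date"] == last_entry[r["SKU"]]]
print([(r["SKU"], r["QTT"]) for r in latest])  # [('46_24_', 100), ('103_5_', 100)]
```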

powerquery m language - how to select all rows until value

Very similar question to this one but using Power Query/M
Given the following (Power Query Excel import) ...
A B
1 Item Amount
2 Item1 1
3 Item2 4
4 Grand 5
How do you select all the rows up until (excluding) the fourth row with Grand? (and excluding all rows after)
I have created a new column like this:
#"Added Custom" = Table.AddColumn(#"Changed Type1", "match_check", each Text.Contains([A],"Grand"))
and it indicates correctly the "Grand" line, but what is really needed are all the lines ahead of it (and none of the lines after it).
That's easy! :))
Continuing your code:
#"Added Custom" = Table.AddColumn(#"Changed Type1", "match_check", each Text.Contains([A],"Grand")), //Your line
AddIndex = Table.AddIndexColumn(#"Added Custom", 1, 1),
SelectGrandTotals = Table.SelectRows(AddIndex, each [match_check] = true), //select matched rows with grand totals
MinIndex = List.Min(SelectGrandTotals[Index]), //select first totals row index (if there are several such rows)
FilterTable = Table.SelectRows(AddIndex, each [Index] < MinIndex) //get all rows before
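The same "keep everything before the first match" idea can be expressed as a take-while; a Python sketch (the trailing row is made up to show that rows after the match are dropped too):

```python
from itertools import takewhile

rows = [
    {"A": "Item",  "B": "Amount"},
    {"A": "Item1", "B": "1"},
    {"A": "Item2", "B": "4"},
    {"A": "Grand", "B": "5"},
    {"A": "Item3", "B": "9"},  # hypothetical row after the total, also dropped
]

# takewhile stops at the first row whose A column contains "Grand",
# keeping only the rows before it.
kept = list(takewhile(lambda r: "Grand" not in r["A"], rows))
print([r["A"] for r in kept])  # ['Item', 'Item1', 'Item2']
```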

Add calculated custom column using entire table for calculation power query or dax

I have this table in Power BI with events concerning some objects
|new_state |object_id | created_at |
|new |1 |11/4/2015 1:50:48 PM |
|in_use |3 |11/4/2015 2:31:10 PM |
|in_use |1 |11/4/2015 2:31:22 PM |
|deleted |2 |11/4/2015 3:14:10 PM |
.....
I am trying to add a calculated column, either in DAX or Power Query, so that each row gets the previous_state of its object. From a logical point of view it's not difficult: you group by id, and for each row in a group you look for the closest previous time and take that row's "new_state", which is the previous state of the current row.
I have tried doing this by creating a function in power query and use it in a custom column but I am getting a "cyclic reference detected" error and cannot do it. Any ideas on solutions?
It's hard to express comparisons between rows today in Power Query. Most of the functions assume the table is just an unordered set of rows.
To expand on Oğuz's comment, you could add an index column, then add a column PreviousState indexing into the previous row (or null). As an optimization it might be much faster if you buffer the whole table first.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WqslLLY8vLkksSVWoyU/KSk0uic9MUahRSC5KBYqlxCeWKNQoxepAFCrUGAKRob6JvpGBoamCoZWpgZWJhUKAL0xNZl58aTHQJGNkZUZWxoZWhgZYlBliKDMyQlKWkpqTCnSDQo0RsjpjK0MThHGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split Column by Delimiter" = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Column1.1", "Column1.5"}),
#"Trimmed Text" = Table.TransformColumns(#"Removed Columns",{},Text.Trim),
#"Promoted Headers" = Table.PromoteHeaders(#"Trimmed Text"),
ChangedType = Table.TransformColumnTypes(#"Promoted Headers",{{"object_id", Int64.Type}, {"created_at", type datetime}, {"new_state", type text}}),
#"Added Index" = Table.AddIndexColumn(ChangedType, "Index", 0, 1),
Buffer = Table.Buffer(#"Added Index"),
#"Added Custom" = Table.AddColumn(Buffer, "PreviousState", each try Buffer{[Index] - 1}[created_at] otherwise null),
#"Inserted Time Subtraction" = Table.AddColumn(#"Added Custom", "TimeDifference", each [created_at] - [PreviousState], type duration)
in
#"Inserted Time Subtraction"
There are surely neater solutions than this but in DAX you can create a calculated column (prevdate) to store the datetime of the previous entry:
=
CALCULATE (
MAX ( [created_at] ),
ALL ( table1 ),
Table1[created_at] < EARLIER ( [created_at] ),
Table1[object_id] = EARLIER ( [object_id] ) )
Then you add another calculated column to store the state at that previous time:
=
CALCULATE (
VALUES ( Table1[new_state] ),
ALL ( Table1 ),
Table1[created_at] = EARLIER ( Table1[prevdate] ),
Table1[object_id] = EARLIER ( Table1[object_id] )
)
I've solved it :D
#"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"object_id", Order.Ascending}, {"created_at", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
Buffer = Table.Buffer(#"Added Index"),
#"Added Custom" = Table.AddColumn(Buffer, "PreviousState", each try (if Buffer{[Index] - 1}[object_id]=Buffer{[Index]}[object_id] then Buffer{[Index] - 1}[new_state] else null ) otherwise null)
I'm not sure it isn't mostly a hack, but it seems to be working. Do you see any point where it might fail in the future?
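The sort-index-and-look-back approach boils down to: order by object and time, then read the previous row's state only if it belongs to the same object. A Python sketch using the question's sample rows:

```python
# Sort by object and time, then take the previous row's state when the
# previous row belongs to the same object, else None.
rows = [
    {"new_state": "new",     "object_id": 1, "created_at": "2015-11-04 13:50:48"},
    {"new_state": "in_use",  "object_id": 3, "created_at": "2015-11-04 14:31:10"},
    {"new_state": "in_use",  "object_id": 1, "created_at": "2015-11-04 14:31:22"},
    {"new_state": "deleted", "object_id": 2, "created_at": "2015-11-04 15:14:10"},
]
rows.sort(key=lambda r: (r["object_id"], r["created_at"]))
for i, r in enumerate(rows):
    prev = rows[i - 1] if i > 0 else None
    same_object = prev is not None and prev["object_id"] == r["object_id"]
    r["previous_state"] = prev["new_state"] if same_object else None
```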

How to get the last element by date of each "type" in LINQ or TSQL

Imagine to have a table defined as
CREATE TABLE [dbo].[Price](
[ID] [int] NOT NULL,
[StartDate] [datetime] NOT NULL,
[Price] [int] NOT NULL
)
where ID is the identifier of an action having a certain Price. This price can be updated if necessary by adding a new line with the same ID, different Price, and a more recent date.
So with a set of a data like
ID StartDate Price
1 01/01/2009 10
1 01/01/2010 20
2 01/01/2009 10
2 01/01/2010 20
How to obtain a set like the following?
1 01/01/2010 20
2 01/01/2010 20
In SQL, there are several ways to say it. Here's one that uses a subquery:
SELECT *
FROM Price p
WHERE NOT EXISTS (
SELECT *
FROM Price
WHERE ID = p.ID
AND StartDate > p.StartDate
)
This translates fairly trivially to LINQ:
var q = from p in ctx.Price
where !(from pp in ctx.Price
where pp.ID == p.ID
&& pp.StartDate > p.StartDate
select pp
).Any()
select p;
Or should I say, I think it does. I'm not in front of VS right now, so I can't verify that this is correct, or that LINQ will be able to convert it to SQL.
Minor quibble: don't use the name ID to store a non-unique value (the type, in this case); it's confusing.
Assuming ID & StartDate will be unique:
SELECT p.ID, p.StartDate, p.Price
FROM Price p
JOIN
(
SELECT ID, MAX(StartDate) AS LatestDate
FROM Price
GROUP BY ID
) p2 ON p.ID = p2.ID AND p.StartDate = p2.LatestDate
Since you tagged your question with LINQ to SQL, here is a LINQ query to express what you want:
from price in db.Prices
group price by price.Id into priceGroup
let maxDateInGroup = priceGroup.Max(g => g.StartDate)
let maxDatePrice = priceGroup.First(g => g.StartDate == maxDateInGroup)
select new
{
Id = priceGroup.Key,
StartDate = maxDatePrice.StartDate,
Price = maxDatePrice.Price
};
