Insert rows for missing dates in Power Query - powerquery

The starting point is the following table (journal), in which entries are made for events on specific days.
Entity  Event             Date        Amount
------  ----------------  ----------  ---------
0123    acquisition       05.05.2015  10,000.00
0123    capital increase  30.11.2015   1,000.00
0123    write-off         31.12.2017  -4,000.00
0123    write-up          31.12.2019   3,000.00
This journal is loaded into Power Query to be enhanced with additional information from other sources.
The goal is a Power Pivot table in which the amounts are summarized as at 31.12. of each year (Subtotals).
Year           Entity  Event             Date        Amount
-------------  ------  ----------------  ----------  ---------
2015           0123    acquisition       05.05.2015  10,000.00
2015           0123    capital increase  30.11.2015   1,000.00
2015 Subtotal  0123                                  11,000.00
2016 Subtotal  0123                                  11,000.00
2017           0123    write-off         31.12.2017  -4,000.00
2017 Subtotal  0123                                   7,000.00
2018 Subtotal  0123                                   7,000.00
2019           0123    write-up          31.12.2019   3,000.00
2019 Subtotal  0123                                  10,000.00
2020 Subtotal  0123                                  10,000.00
The question is how to insert rows in Power Query for years where no activity (event) has occurred (no entry in the journal) so that a subtotal can be shown in Power Pivot as of 31.12. of each year.
I hope I could explain my issue in an understandable way. Thanks in advance for your help!
Kind regards,
Joerg

See if something like this works for you. There are shorter ways to do it, but they are harder to follow.
Get the minimum and maximum year across all the data, and create a table of every combination of year and entity. Check which combinations already appear in the journal. For those that do not, append a row for that year and entity to the original table, dated 31 December.
This involves merging the query with itself, and not all of it can be done in the user interface, so paste the code into Home > Advanced Editor.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Entity", Int64.Type}, {"Event", type text}, {"Date", type date}, {"Amount", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date])),
// Create table of all possible Entities and Years
DateList = {Date.Year(List.Min(#"Added Custom"[Date])) .. Date.Year(List.Max(#"Added Custom"[Date]))},
Entities = Table.AddColumn(Table.Distinct(Table.SelectColumns(#"Added Custom",{"Entity"})),"Year", each DateList),
#"Expanded Year" = Table.ExpandListColumn(Entities, "Year"),
// Find unique Data and merge into original data set
#"Merged Queries" = Table.NestedJoin(#"Expanded Year",{"Year", "Entity"},#"Added Custom",{"Year", "Entity"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Date"}, {"Date2"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Table2", each ([Date2] = null)),
#"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Date", each #date([Year],12,31), type date),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Date2", "Year"}),
#"Appended Query" = Table.Combine({#"Changed Type", #"Removed Columns" })
in #"Appended Query"
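As a variation on the steps above, the merge, expand and filter can be collapsed into one left anti join, which keeps only the entity/year pairs that have no journal entry. This is a minimal sketch, not the answer's own code: the inline #table stands in for Table1, the step names are made up for illustration, and Entity is assumed to be text so that the leading zero survives.

```powerquery
let
    // Hypothetical inline sample standing in for Table1
    Journal = #table(
        type table [Entity = text, Event = text, Date = date, Amount = number],
        {
            {"0123", "acquisition", #date(2015, 5, 5), 10000},
            {"0123", "capital increase", #date(2015, 11, 30), 1000},
            {"0123", "write-off", #date(2017, 12, 31), -4000},
            {"0123", "write-up", #date(2019, 12, 31), 3000}
        }
    ),
    WithYear = Table.AddColumn(Journal, "Year", each Date.Year([Date]), Int64.Type),
    // Every year between the first and the last journal entry
    Years = {List.Min(WithYear[Year]) .. List.Max(WithYear[Year])},
    // All Entity/Year combinations
    AllPairs = Table.ExpandListColumn(
        Table.AddColumn(Table.Distinct(Table.SelectColumns(WithYear, {"Entity"})), "Year", each Years),
        "Year"
    ),
    // Left anti join: keep only the pairs that have no matching journal row
    Missing = Table.NestedJoin(AllPairs, {"Entity", "Year"}, WithYear, {"Entity", "Year"}, "Matches", JoinKind.LeftAnti),
    // One placeholder row per missing pair, dated 31.12. of that year
    Placeholders = Table.AddColumn(
        Table.RemoveColumns(Missing, {"Matches"}),
        "Date",
        each #date([Year], 12, 31),
        type date
    ),
    // Event and Amount are null in the placeholder rows
    Result = Table.Sort(
        Table.Combine({Journal, Table.RemoveColumns(Placeholders, {"Year"})}),
        {{"Entity", Order.Ascending}, {"Date", Order.Ascending}}
    )
in
    Result
```

With this sample, placeholder rows are added for 31.12.2016 and 31.12.2018. The year range stops at the last journal entry, so a later year such as the 2020 subtotal in the question would need List.Max replaced by the desired end year.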

Related

Power query - Group a table based on the date and the hourly interval

I need to group a table based on the date and the hourly interval, using the Sum:
Date
Interval: from 8am today to <8am today+1
Previously I was using MS Access and a query to create it. Now I need to go through Power Query in MS Excel.
That was the SQL Query used before:
SELECT switch(Tbl_Prod_Chat.[Interval]>=8,Tbl_Prod_Chat.[Date],Tbl_Prod_Chat.[Interval]<8,Tbl_Prod_Chat.[Date]-1) AS LINK_DATE, Tbl_Prod_Chat.Agent, Sum(Tbl_Prod_Chat.ProdChat) AS Prod_Chat
FROM Tbl_Prod_Chat
GROUP BY Switch(Tbl_Prod_Chat.[Interval]>=8,Tbl_Prod_Chat.[Date],Tbl_Prod_Chat.[Interval]<8,Tbl_Prod_Chat.[Date]-1), Tbl_Prod_Chat.Agent;
The table is built as:
Field 1 "Date" (type/format: mm/dd/yyyy)
Field 2 "Interval" (type: whole number): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0
Field 3 "Volume of contact" (type: whole number)
The new table would be:
Field 1 "Date"
Field 2 "Total Volume" (sum over 24h, from 8am today to <8am today+1).
Can you please help me with this?
Thanks
Seb
Sounds like you just need to add a single custom column
add column .. custom column...
= if [Interval] >7 or [Interval]=0 then [Date] else Date.AddDays([Date],-1)
or
= if [Interval] <8 and [Interval] > 0 then Date.AddDays([Date],-1) else [Date]
That keeps the current date for hours [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,0] and assigns hours [1,2,3,4,5,6,7] to the previous date.
Then right click ... Group By .. on that new custom column and do operation Sum on Column: Volume of Contact, with whatever name you want in New Column Name
sample full code
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Interval", Int64.Type}, {"Volume of contact", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each if [Interval] >7 or [Interval]=0 then [Date] else Date.AddDays([Date],-1), type date),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Custom"}, {{"Volume of Contact", each List.Sum([Volume of contact]), type number}})
in #"Grouped Rows"
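To check how the boundary hours are assigned, the same logic can be run against a few hand-made rows. This is only a sketch: the inline #table and the column names "Link Date" and "Total Volume" are assumptions for illustration.

```powerquery
let
    // Hypothetical inline sample standing in for Table1
    Source = #table(
        type table [Date = date, Interval = Int64.Type, #"Volume of contact" = Int64.Type],
        {
            {#date(2021, 1, 2), 8, 5},
            {#date(2021, 1, 2), 23, 3},
            {#date(2021, 1, 2), 0, 1},
            {#date(2021, 1, 3), 7, 2},
            {#date(2021, 1, 3), 8, 4}
        }
    ),
    // Hours 8..23 and 0 stay on their own date; hours 1..7 move to the previous date
    LinkDate = Table.AddColumn(
        Source,
        "Link Date",
        each if [Interval] > 7 or [Interval] = 0 then [Date] else Date.AddDays([Date], -1),
        type date
    ),
    // Sum the volume per shifted date
    Grouped = Table.Group(
        LinkDate,
        {"Link Date"},
        {{"Total Volume", each List.Sum([Volume of contact]), type number}}
    )
in
    Grouped
```

With these rows, 2 January totals 11 and 3 January totals 4. Note that interval 0 is kept on its own date here, whereas the Access query in the question sends every interval below 8, including 0, to the previous date. If that is the behaviour you want, the condition would become [Interval] >= 8, which moves the interval 0 row to 1 January.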

how to make measure that shows deviation

I have a table which shows sales depending on the source (fact and forecast).
year  week  category  sales rub  source
----  ----  --------  ---------  ---------
2021  32    shorts    54387      2021 fact
2021  32    shorts    58264      forecast
2021  33    dresses   4325       2021 fact
2021  33    dresses   5432       forecast
When I build a matrix in Power BI, I need to get the deviation of fact from forecast, but I cannot use a quick measure such as division because I actually have only one column of values. How can I calculate the deviation? Thanks a lot
If using Power Query, load the data
click the source column to select it
transform .. pivot column ... and for the values column, choose sales rub from the dropdown
add column .. custom column. Name it deviation, with the formula
= [2021 fact]-[forecast]
full sample code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Pivoted Column" = Table.Pivot(Source, List.Distinct(Source[source]), "source", "sales rub", List.Sum),
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "deviation", each [2021 fact]-[forecast])
in #"Added Custom"

Power Query - Data Transformation

New to PowerBi and Power Query and having some trouble transforming the data.
The data contains processes for each sale category, with a status showing whether the manufacturing process has been completed or not. I require a new aggregate table that has three calculated columns returning the following dates:
Start date which is defined as the first date the process enters the table
Predicted end date which is defined as the last date the process is shown in the table
Actual end date which is defined as the last instance the process status is equal to "Done"
Have managed to return the three dates but each ends up on a separate line rather than one line with the data. Below is the original data and required output.
Output Table
Would appreciate any assistance in transforming this data.
Most of it you can do using the Power Query UI:
Group by Month/Category/Process
Aggregations:
Start date => Min of Date
Estimated (or Predicted) end date => Max of Date
But then you need a custom aggregation where you determine the max date after filtering the subtable for "Done" in the status column.
You can do that in the Advanced Editor editing the M Code directly.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Month", type date},
{"Category", type text}, {"Process", type text}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Month", "Category", "Process"}, {
{"Start Date", each List.Min([Date]), type nullable date},
{"Predicted End Date", each List.Max([Date]), type nullable date},
//Custom aggregation to calculate Actual End Date
//Note that we can Filter the table here, and then select the last date
{"Actual End Date", each List.Max(Table.SelectRows(_, each [Status]= "Done")[Date]), type nullable date}
})
in
#"Grouped Rows"
Original Data
Results
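Since the original screenshots are not available, here is a small self-contained sketch of the same grouping. The sample rows and the process names "Cutting" and "Painting" are invented for illustration; only the aggregation logic follows the code above.

```powerquery
let
    // Hypothetical sample data; the real table comes from Table3
    Source = #table(
        type table [Date = date, Month = date, Category = text, Process = text, Status = text],
        {
            {#date(2021, 1, 4), #date(2021, 1, 1), "A", "Cutting", "Open"},
            {#date(2021, 1, 8), #date(2021, 1, 1), "A", "Cutting", "Done"},
            {#date(2021, 1, 12), #date(2021, 1, 1), "A", "Cutting", "Open"},
            {#date(2021, 1, 5), #date(2021, 1, 1), "A", "Painting", "Open"}
        }
    ),
    Grouped = Table.Group(Source, {"Month", "Category", "Process"}, {
        {"Start Date", each List.Min([Date]), type nullable date},
        {"Predicted End Date", each List.Max([Date]), type nullable date},
        // Filter the group's subtable to "Done" rows before taking the latest date
        {"Actual End Date", each List.Max(Table.SelectRows(_, each [Status] = "Done")[Date]), type nullable date}
    })
in
    Grouped
```

Each Month/Category/Process combination ends up on a single row. "Cutting" gets 4, 12 and 8 January as its three dates, while "Painting" has no "Done" row, so List.Max receives an empty list and its Actual End Date is null.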

PowerQuery filter on 20th of the month no matter the month or year

I am trying to filter a list on the 20th of the month as this has been given as a significant date to identify a specific subset of records. There is no set date just a set day so it can be the 20th of any month in any year. Is there a way I can filter on these in PowerQuery?
Thanks
I assume you mean you want to filter a Table, choosing only to show the rows where the day = the 20th
Let's also assume your data is loaded into Power Query, and the date info is in a column named Date
Add column, custom column, with formula
= Date.Day([Date])
( See the Power Query M function reference list )
Click at top of that new column and use the drop down filter to [x] the 20
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Date.Day([Date])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = 20))
in #"Filtered Rows"
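If the helper column is not needed in the output, the day test can go straight into the row filter. A minimal sketch, assuming the Date column is already of type date; the inline #table and the Item column are made up for illustration.

```powerquery
let
    // Hypothetical inline sample; replace with your own source
    Source = #table(
        type table [Date = date, Item = text],
        {
            {#date(2021, 1, 20), "A"},
            {#date(2021, 2, 19), "B"},
            {#date(2022, 7, 20), "C"}
        }
    ),
    // Keep rows whose day of the month is 20, whatever the month or year
    Filtered = Table.SelectRows(Source, each Date.Day([Date]) = 20)
in
    Filtered
```

With this sample, rows A and C are kept and row B is dropped.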

PowerQuery: taking the average of each of many columns

I'm new to PowerQuery and I have a table that is essentially a matrix of dates and hours within those days: the first column holds each date and the rest of the columns are labeled 1 through 24. An example is:
Date H1 H2 H3 H4 ...
---- -- -- -- --
Jan 1
Jan 2
Jan 3
...
This is stored in an Excel file that is quite large, so I want to be able to simply query that file and pull subsets of the data. One example is the average hourly number by year. In SQL this would be represented by "SELECT YEAR(Date), AVG(H1), AVG(H2), ... FROM SourceTable GROUP BY YEAR(Date)". However, in PowerQuery it seems that Group By can only generate one new column per aggregation, so I would have to repeat the operation 24 times in this case, or more if I had data by seconds, for example (to be fair, in the SQL query you also have to type out each column if you don't consider scripting solutions). Is there a simpler approach to generate my desired table (essentially collapsing each column to its average), or do I need to manually add each column?
You can unpivot your hour columns and then you only need to group by year and the unpivoted attribute column.
I made a sample table of your data like this and loaded it into power query. I converted the Date column to Year only, Unpivoted Other Columns on the Date column, then Grouped by the Date and Hour column after unpivoting. The result looks like this.
You can of course repivot the data after if you want inside or outside of power query. This is what the code in power query looks like, but this was all created with normal menu options, not written by hand.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Extracted Year" = Table.TransformColumns(Source,{{"Date", Date.Year, Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Extracted Year", {"Date"}, "Hour", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Date", "Hour"}, {{"Average", each List.Average([Value]), type number}})
in
#"Grouped Rows"
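The repivot mentioned above could look something like this. It is a sketch on a made-up sample with two hour columns instead of 24; the step names are illustrative, and the final Table.Pivot step is the only addition to the code above.

```powerquery
let
    // Hypothetical inline sample with two hour columns instead of 24
    Source = #table(
        type table [Date = date, H1 = number, H2 = number],
        {
            {#date(2020, 1, 1), 10, 20},
            {#date(2020, 1, 2), 30, 40},
            {#date(2021, 1, 1), 50, 60}
        }
    ),
    ExtractedYear = Table.TransformColumns(Source, {{"Date", Date.Year, Int64.Type}}),
    Unpivoted = Table.UnpivotOtherColumns(ExtractedYear, {"Date"}, "Hour", "Value"),
    Grouped = Table.Group(
        Unpivoted,
        {"Date", "Hour"},
        {{"Average", each List.Average([Value]), type number}}
    ),
    // Turn the Hour values back into columns, giving one row per year
    Repivoted = Table.Pivot(Grouped, List.Distinct(Grouped[Hour]), "Hour", "Average")
in
    Repivoted
```

For this sample the result has one row per year: 2020 with H1 = 20 and H2 = 30, and 2021 with H1 = 50 and H2 = 60. Because the hour columns are never named in the code, the same steps should carry over to all 24 columns.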
