How to aggregate/group the data from one table and output the result as another table in PowerBI?

I have a Raw Data Table as shown in the screenshot below:
I want to group the data in the raw data table into the Output Table as shown in the screenshot below:
Basically, the output table counts the number of students at each understanding level for each intake. How can I get the output table from the raw data table? I'm still new to Power Query, so any help will be greatly appreciated!
This is what I have tried:
Code:
= Table.Group(Source, {"Intake"}, {
{"Count_Little_Understand", each Table.RowCount(Table.SelectRows(_, each ([Topic 1] = "Little Understanding"))), Int64.Type},
{"Count_General_Understanding", each Table.RowCount(Table.SelectRows(_, each ([Topic 1] = "General Understanding"))), Int64.Type},
{"Count_Good_Understand", each Table.RowCount(Table.SelectRows(_, each ([Topic 1] = "Good Understanding"))), Int64.Type},
{"Count_Fully_Understand", each Table.RowCount(Table.SelectRows(_, each ([Topic 1] = "Fully Understand"))), Int64.Type}
})
I am only able to get the table for an individual Topic. I'm not sure how to append the other Topics below it, or how to add an extra column that labels the Topic, as shown in my second screenshot. I'd appreciate any advice on how I should modify the code. Thank you!

I've rebuilt a similar but shorter table:
First, go to Transform (1), select the Topic columns (2) and click Unpivot Columns (3).
Your table now looks like the following screenshot. Finally, select the Value column (1), click Pivot Column (2) and select Employee Name (3).
Result:

You can Unpivot the Topic columns, then Pivot the Understanding column, using Count of Employee Name as the aggregate value.
Then simply reorder columns and sort rows, to suit the output you need:
#"Unpivoted Topic" = Table.UnpivotOtherColumns(#"Raw Data Table", {"Employee Name", "Intake"}, "Topic", "Understanding"),
#"Pivoted Understanding" = Table.Pivot(#"Unpivoted Topic", List.Distinct(#"Unpivoted Topic"[Understanding]), "Understanding", "Employee Name", List.NonNullCount),
#"Reordered Columns" = Table.ReorderColumns(#"Pivoted Understanding",{"Intake", "Topic", "Little Understanding", "General Understanding", "Good Understanding", "Fully Understand"}),
#"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"Topic", Order.Ascending}, {"Intake", Order.Ascending}})
Output:
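
If you would rather stay close to the Table.Group attempt from the question, the same unpivot step can feed it. This is only a sketch: the sample rows and the CountLevel helper are hypothetical, and the level names are copied from the question's code.

```powerquery
let
    // Hypothetical sample rows standing in for the Raw Data Table
    RawData = #table(
        {"Employee Name", "Intake", "Topic 1", "Topic 2"},
        {
            {"Ann", "Jan", "Good Understanding", "Fully Understand"},
            {"Bob", "Jan", "Good Understanding", "Little Understanding"},
            {"Cid", "Feb", "General Understanding", "Fully Understand"}
        }
    ),
    // Turn every Topic column into (Topic, Understanding) rows
    Unpivoted = Table.UnpivotOtherColumns(RawData, {"Employee Name", "Intake"}, "Topic", "Understanding"),
    // Hypothetical helper: returns a function that counts the rows at a given level
    CountLevel = (level as text) => (t as table) =>
        Table.RowCount(Table.SelectRows(t, each [Understanding] = level)),
    // Group by Topic as well as Intake, so the Topic label becomes its own column
    Grouped = Table.Group(Unpivoted, {"Topic", "Intake"}, {
        {"Little Understanding", CountLevel("Little Understanding"), Int64.Type},
        {"General Understanding", CountLevel("General Understanding"), Int64.Type},
        {"Good Understanding", CountLevel("Good Understanding"), Int64.Type},
        {"Fully Understand", CountLevel("Fully Understand"), Int64.Type}
    })
in
    Grouped
```

Because the level names are listed explicitly, every level gets a column, with 0 where no rows match.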

Related

Creating Dynamic Columns in Power Query with Expression Lookups

I have a Source file that has a set of columns, each of binary type (true, false). I then have a processing (transform) table which has one or more rows, each row specifying a new column to be added to the Source, along with the expression to use to populate that column based on existing columns. In Power Query, I want to be able to extend the Source table with these additional column definitions. There can be any number of transform rows, so I would need to do this dynamically based on row count. Click on link below for an illustration of what I'm shooting for.
I believe this can be achieved using List.Accumulate in Power Query, but I haven't figured out exactly how to do it. Any suggestions?
Here is a method using List.Generate:
let
    Transform = Excel.CurrentWorkbook(){[Name="Transform"]}[Content],
    #"Transform Table" = Table.TransformColumnTypes(Transform, {{"Col Name", type text},{"Col Formula", type text}}),
    Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content],
    #"Source Table" = Table.TransformColumnTypes(Source,{{"A", type logical}, {"B", type logical}, {"C", type logical}}),
    addCol = List.Last(
        List.Generate(
            ()=>[c=Table.AddColumn(#"Source Table", #"Transform Table"[Col Name]{0},
                    Expression.Evaluate("each " & #"Transform Table"[Col Formula]{0}), type logical),
                idx=0],
            each [idx] < Table.RowCount(#"Transform Table"),
            each [c=Table.AddColumn([c], #"Transform Table"[Col Name]{[idx]+1},
                    Expression.Evaluate("each " & #"Transform Table"[Col Formula]{[idx]+1}), type logical),
                idx=[idx]+1],
            each [c]))
in
    addCol
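
Since the question mentions List.Accumulate, here is a rough sketch of how that variant might look. The inline sample tables and the example formulas are made up for illustration; in the real workbook they would be the Source and Transform tables loaded above.

```powerquery
let
    // Hypothetical sample data standing in for the Source and Transform tables
    Source = #table({"A", "B", "C"}, {{true, false, true}, {false, false, true}}),
    Transform = #table({"Col Name", "Col Formula"}, {{"D", "[A] and [B]"}, {"E", "[A] or [C]"}}),
    // Fold over the Transform rows, adding one column to the running table per row
    Result = List.Accumulate(
        Table.ToRecords(Transform),
        Source,
        (state, row) => Table.AddColumn(
            state,
            row[Col Name],
            Expression.Evaluate("each " & row[Col Formula]),
            type logical
        )
    )
in
    Result
```

Expression.Evaluate is called here without an environment, which is enough for formulas that only use field references and operators. A formula that calls library functions would need an environment passed as the second argument (for example #shared).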

Is there any way to refer to the current column name as a variable in Power Query?

I am new to Power Query and I would like to ask more experienced people about it.
I am trying to solve a problem with Text.Combine: I would like to combine the value in a specific column with the column name of the current cell.
Do you have any idea?
My idea is a formula similar to this:
=Text.Combine({[kod], Column.Name},";")
Thank you very much for your answer.
Tomas
Edit 8.12.2021:
@horseyride
I am actually trying to fill the columns automatically with data in the following format, where the first part is the value from the current row and the second part is the current column name:
For example:
8M0183:F01A0101.B in first row, second column,
8M0182:F01A0102.A in second row, first column
An example table is shown below.
Thank you very much for all answers.
See if this works for you. It combines the KOD column with the column name in each null cell, for every column that is not named KOD.
It finds all columns not named KOD and converts them to text. It replaces all nulls with a placeholder, here "xxx". We unpivot to get everything into just three columns. We then combine the column title and the KOD value wherever the cell contents equal the placeholder. Finally, we pivot to get back to the original format.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnList=List.Difference(Table.ColumnNames(Source),{"KOD"}),
ConvertToText= Table.TransformColumnTypes (Source,List.Transform(ColumnList, each {_ , type text})),
ReplaceNullsWithPlaceholder = Table.ReplaceValue(ConvertToText,null,"xxx",Replacer.ReplaceValue,ColumnList),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(ReplaceNullsWithPlaceholder, {"KOD"}, "Attribute", "Value"),
CombineColumnAndRow = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Value]="xxx" then [Attribute]&":"&[KOD] else [Value]),
#"Removed Columns" = Table.RemoveColumns(CombineColumnAndRow,{"Value"}),
#"Pivoted Column1" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Value2")
in #"Pivoted Column1"

Power Query: Selecting multiple elements in 'value field settings' to measure a specific field

I'm trying to create a measure that averages the 29 elements' value [Overtime/Hours_worked] into one cell, visualised by the attached image.
Cell F32 currently shows [AverageA Total Overtime/Total Hours_worked] but I want it to be an average of the 29 rows' values as displayed in cell H32, =AVERAGEA(F3:F31).
The elements' figures are based on underlying data from Data$, currently amounting to ~150k rows. When creating a measure that averages the elements' values from Column E [AverageA Overtime/Hours_worked] and shows it as a % of the 29 elements' aggregate %, I run into the problem of averaging the separate elements' underlying data taken from Data$. Note that F3:F31 is redundant in this instance: I'm looking for the average of the 29 elements' values in column E, not their respective averages shown in column F.
Am I right to use measure here or is there a better way to approach it? If measures can be used, is there a way to design the measure so that it refers to the Pivot Table's shown data instead of the underlying data taken from Data$? For instance by designing the measure to refer to column E in the pivot table?
Side note
The table needs to remain dynamic since Data$ is being updated regularly. I'm relatively new to Power Query so I'm not sure if there are other ways to solve this, i.e. through MDX, but I doubt I'll be able to sort that out myself.
Any and all help is appreciated, thanks.
I'm not sure how you are computing the individual entries in the AverageA Total Overtime/Total Hours_worked (so I left it blank), but to compute the totals and averages for the other columns, you can use the Table.Group command in a special way with an empty list for the key (so as to return the entire table for the Aggregation operations).
Given:
M Code
Read the comments in the code to understand the algorithm.
If your overtime % column is already in your original data, you can just delete the code lines that add that column.
let
//be sure to change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="wrkTbl"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Area", type text}, {"Hours_worked", Int64.Type}, {"Overtime", Int64.Type}}),
//Add the percent overtime column
#"Added Custom" = Table.AddColumn(#"Changed Type", "Overtime/Hours_worked",
each [Overtime]/[Hours_worked], Percentage.Type),
//Use Table.Group to compute total/averages for the last row
//Be sure to use the exact same names as used in the original table
#"Grouped Rows" = Table.Group(#"Added Custom", {}, {
{"Area", each "Totals",type text},
{"Hours_worked", each List.Sum([Hours_worked]), Int64.Type},
{"Overtime", each List.Sum([Overtime]), Int64.Type},
{"Overtime/Hours_worked", each List.Sum([Overtime])/List.Sum([Hours_worked]), Percentage.Type},
{"AverageA Overtime/Hours_worked", each List.Average([#"Overtime/Hours_worked"]), Percentage.Type}
}),
//Append the two tables to add the Totals row
append= Table.Combine({ #"Added Custom", #"Grouped Rows"})
in
append
results in =>
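
The two figures in the Totals row answer different questions, which seems to be the source of the confusion: one is the ratio of the sums, the other is the average of the per-row ratios. A tiny sketch with two made-up rows (the numbers are hypothetical, not taken from the question's data) shows how they diverge:

```powerquery
let
    // Hypothetical two-row sample: area A has 10/100 overtime, area B has 90/300
    Rows = #table({"Area", "Hours_worked", "Overtime"}, {{"A", 100, 10}, {"B", 300, 90}}),
    // Per-row ratios: 0.1 and 0.3
    Ratios = List.Transform(Table.ToRecords(Rows), each [Overtime] / [Hours_worked]),
    // Ratio of the sums: 100 / 400 = 0.25
    RatioOfSums = List.Sum(Rows[Overtime]) / List.Sum(Rows[Hours_worked]),
    // Average of the ratios: (0.1 + 0.3) / 2, about 0.2
    AverageOfRatios = List.Average(Ratios)
in
    [RatioOfSums = RatioOfSums, AverageOfRatios = AverageOfRatios]
```

The average of ratios weights every area equally, while the ratio of sums weights each area by its hours worked.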

Populate conditional column depending on column name criteria

I receive a weekly report which contains some repetition of columns. This is because it is drawn from a collection of web forms which ask similar questions to each other - let's say they all ask "Do you want to join our email list?" - but this question is stored in the source system as a separate field for each form (each form is effectively a separate table). The columns will always be consistently named - e.g. "Email_optin_1", "Email_optin_2" - so I can come up with rules to identify the columns which ask the email question. However, the number of columns may vary from week to week - one week the report might just contain "Email_optin_2", the next week it might include four such columns. (This depends on which web-forms have been used in that week). The possible values are the same in all these columns - let's say "Yes" and "No".
Each row should normally only have one of the "Email_optin" columns populated.
What I would like to do is create a single column in Power Query called "Email_Optin_FINAL", which would return "Yes" if ANY columns beginning with "Email_optin" contain a value of "Yes".
So, basically, instead of the criteria simply referring to the values in specific columns, what I would like it to do is first of all figure out which columns it needs to be looking at, and then look at the values in those columns.
Is this possible in PowerQuery?
Thanks in advance for any advice!
This would find all the columns containing Email_optin, merge them into a new column, and remove the original columns:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
EmailList= List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Email_optin")),
#"Merged Columns" = Table.CombineColumns(Source,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged")
in #"Merged Columns"
This would find all the columns containing Email_optin, merge them into a new column, and preserve the original columns:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Index= Table.AddIndexColumn(Source, "Index", 0, 1),
EmailList= List.Select(Table.ColumnNames(Index), each Text.Contains(_, "Email_optin")),
Merged = Table.CombineColumns(Index,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Queries" = Table.NestedJoin(Index,{"Index"},Merged,{"Index"},"Merged",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Merged", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table2",{"Index"})
in #"Removed Columns"
You can then filter for "Yes" among the merged answers if you want.
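
If the goal is the single "Email_Optin_FINAL" column described in the question, a custom column that checks the matching fields of each row is another option. This is a sketch under assumptions: the sample rows are hypothetical, and returning "No" when no column holds "Yes" is a guess, since the question only specifies the "Yes" case.

```powerquery
let
    // Hypothetical sample standing in for the weekly report
    Source = #table(
        {"Name", "Email_optin_1", "Email_optin_2"},
        {{"Ann", "Yes", null}, {"Bob", null, "No"}, {"Cid", null, null}}
    ),
    // Work out which columns to look at, whatever their number this week
    OptinColumns = List.Select(Table.ColumnNames(Source), each Text.StartsWith(_, "Email_optin")),
    // "Yes" if any of those columns holds "Yes" in the current row; the "No" fallback is an assumption
    AddFinal = Table.AddColumn(
        Source,
        "Email_Optin_FINAL",
        each if List.Contains(Record.ToList(Record.SelectFields(_, OptinColumns)), "Yes") then "Yes" else "No",
        type text
    )
in
    AddFinal
```

Text.StartsWith is case-sensitive by default, so the prefix has to match the source column names exactly.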

Inserting text manually in Powerquery

I'm merging multiple Excel files into one, where the user can review the rows and mark an additional Comment column as completed. Each day there are additional files, and I need to refresh the query and pull the data in while keeping the original Comment column values.
I've attempted to do this by following Marcel Beug's video, but that uses a SQL table and I cannot seem to get it to work with Excel files as the source.
After the Merge Queries step, I attempt to change the first table to my source "InputFile".
![Modify the Merge Formula1][2]
![Changed to last query step of InputFile][3]
![InputFile Query with Source2 and Merge][4]
![M Code of InputFile Query with Merge][5]
By setting the first field in the Merge formula to the last step in the InputFile query, I was able to get around the cyclic reference error. However, every refresh now creates duplicate rows: 4 become 8, which then become 16, and so on.
let
Source = Excel.Workbook(File.Contents("S:\Fin_Aid\Operations Team\COD mpn - lec\InputFiles\8.22.18 to 8.23.18.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
Rename_RecID = Table.RenameColumns(#"Removed Columns",{{"Column3.1", "RecID"}}),
Source2 = Excel.CurrentWorkbook(){[Name="InputFile"]}[Content],
InputWithComment = Table.TransformColumnTypes(Source2,{{"RecID", Int64.Type}, {"Column1", type text}, {"Column2", type text}, {"Column4", type text}, {"Column5", type text}, {"Comment", type text}}),
#"Merged Queries" = Table.NestedJoin(Rename_RecID,{"RecID"},InputWithComment,{"RecID"},"InputWithComment",JoinKind.LeftOuter),
#"Expanded InputWithComment" = Table.ExpandTableColumn(#"Merged Queries", "InputWithComment", {"Comment"}, {"Comment"})
in
#"Expanded InputWithComment"
Regards,
Jim
