Combining all data in table into one large column?

I am working in Excel's Power Query Editor and currently have a table that is 29 columns wide and x rows tall.
I need the value of every cell to become its own cell in one long column (or row).
The problem is that the height of the table will keep growing, so I can't do this manually, as a manual approach would ignore any rows added afterwards.
Does anyone know a fix for this?

Use Table.UnpivotOtherColumns to unpivot everything, then remove the Attribute column:
#"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"})
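For reference, the two steps above can be wrapped into a complete query. This is a minimal sketch: the small `#table` is hypothetical sample data standing in for the real 29-column table, so swap in your own `Source` step.

```powerquery
let
    // Hypothetical 2x3 sample standing in for the real 29-column table
    Source = #table({"A", "B", "C"}, {{1, 2, 3}, {4, 5, 6}}),
    // An empty list of "other" columns means every column gets unpivoted
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
    // Only the values are wanted, so drop the column that holds the old column names
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns", {"Attribute"})
in
    #"Removed Columns"
```

Because the query never names the rows, it picks up new rows on every refresh. Note that unpivoting skips null cells, so blank cells will not appear in the resulting column.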

Related

Is there any way how to refer actual column name as variable in Power Query

I am new to Power Query and would like to ask more experienced people about it.
I am trying to solve a problem with Text.Combine: I would like to combine the value in a specific column with the column name of the current cell.
Do you have any idea how to do this?
My idea is a formula similar to this:
=Text.Combine({[kod], Column.Name},";")
Thank you very much for answer.
Tomas
Edit 8.12.2021:
@horseyride
I am actually trying to fill the columns automatically with data in the following format, built from the value in the current row and the current column name.
For example:
8M0183:F01A0101.B in the first row, second column,
8M0182:F01A0102.A in the second row, first column.
An example table is shown below.
Thank you very much for all answers.

See if this works for you. It combines the KOD column with the column name in each null cell, for every column that is not named KOD.
It finds all columns not named KOD and converts them to text. It replaces all nulls with a placeholder, here "xxx". We unpivot to get everything into just three columns. We then combine the column title and the KOD value wherever the cell contents equal the placeholder. Finally, we pivot to get back to the original format.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnList=List.Difference(Table.ColumnNames(Source),{"KOD"}),
ConvertToText= Table.TransformColumnTypes (Source,List.Transform(ColumnList, each {_ , type text})),
ReplaceNullsWithPlaceholder = Table.ReplaceValue(ConvertToText,null,"xxx",Replacer.ReplaceValue,ColumnList),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(ReplaceNullsWithPlaceholder, {"KOD"}, "Attribute", "Value"),
CombineColumnAndRow = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Value]="xxx" then [Attribute]&":"&[KOD] else [Value]),
#"Removed Columns" = Table.RemoveColumns(CombineColumnAndRow,{"Value"}),
#"Pivoted Column1" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Value2")
in #"Pivoted Column1"

Bring Value with Sumifs in Pow.Query language to specified row, and column(location)

Next step? Using SUMIFS (and a lot of SUMIF formulas) I have pulled information from another workbook into exact rows and columns of an Excel workbook. Now I want to do the same in the Power Query language. I can add two values when a condition is met, but it is unclear how to bring the total sum into a single row in the Excel workbook. Can anyone show me the path? I guess I will need the Data Model...
= Table.AddColumn(#"Changed Type", "Sumif", each if [Column2] =2 or [Column2]=1 then [Column3]+[Column4] else 0)
let
Source = Folder.Files...
#"C:\Users...
#"Imported Excel" = Excel.Workbook(#"C:\...
SegPL_Chart = #"Imported Excel"{[Name="SegPL_Chart"]}[Data],
#"Removed Top Rows" = Table.Skip(SegPL_Chart,12),
#"Removed Alternate Rows" = Table.AlternateRows(#"Removed Top Rows",1,1,90),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Alternate Rows"),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([Col1]="1" or [Col1]="2")),
#"Table Group = Table.Group(#"Filtered Rows", {}, List.TransformMany(Table.ColumnNames(#"Filtered Rows",(x)=>{each if x = "Names" then "Totals" else List.Sum(Table.Column(_,x))},(x,y)=>{x,y})),
#"append" = Table.Combine({#"Filtered Rows",#"Table Group"})
in
#"append"
It gives an error at "in": "Token comma needed". What else do I need to do to bring in the total rows?

You can use several steps to create several helper columns with intermediate results of conditional sums. Then you can create a new column that sums up all the intermediate results, and then delete the helper columns.
Keep in mind that, unlike Excel, the calculations in Power Query always return constants, so you can delete calculated columns you no longer need. So:
Create helper column 1 with complicated IF and Sum scenario
Create helper column 2 with complicated IF and Sum scenario
Create total column to add column 1 + column 2
Delete helper columns and keep only the total column
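The four steps above could be sketched as follows. The sample table and the helper column names (`Helper1`, `Helper2`) are hypothetical; the condition and the `Column2`/`Column3`/`Column4` names are taken from the formula in the question.

```powerquery
let
    // Hypothetical sample data using the column names from the question
    Source = #table(
        {"Column2", "Column3", "Column4"},
        {{1, 10, 5}, {2, 20, 0}, {3, 30, 7}}
    ),
    // Helper column 1: Column3 counts only where Column2 is 1 or 2
    Helper1 = Table.AddColumn(Source, "Helper1",
        each if [Column2] = 1 or [Column2] = 2 then [Column3] else 0, type number),
    // Helper column 2: the same condition applied to Column4
    Helper2 = Table.AddColumn(Helper1, "Helper2",
        each if [Column2] = 1 or [Column2] = 2 then [Column4] else 0, type number),
    // Total column: sum of the intermediate results
    Total = Table.AddColumn(Helper2, "Sumif", each [Helper1] + [Helper2], type number),
    // The helpers are no longer needed once the total has been computed
    Result = Table.RemoveColumns(Total, {"Helper1", "Helper2"})
in
    Result
```

Removing the helper columns does not break `Sumif`, because each step produces values rather than live formulas.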

This gives me exactly the result I was looking for, but it is a DAX formula in Power Pivot:
=SUMX(FILTER('TableName',[ColName] = 1),'TableName'[ColName2])
So I would be glad to convert it to a Power Query formula.
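One possible M counterpart of that DAX expression is to filter the rows first and then sum the remaining column. This is only a sketch: the `#table` below is hypothetical sample data standing in for 'TableName'.

```powerquery
let
    // Hypothetical stand-in for 'TableName'
    TableName = #table({"ColName", "ColName2"}, {{1, 100}, {2, 50}, {1, 25}}),
    // FILTER('TableName', [ColName] = 1)
    Filtered = Table.SelectRows(TableName, each [ColName] = 1),
    // SUMX(..., [ColName2]): sum the column of the filtered table
    Total = List.Sum(Filtered[ColName2])
in
    Total
```

With the sample rows above, the query returns 125. Unlike a DAX measure, the result is a single value fixed at refresh time, and `List.Sum` returns null rather than 0 when no row matches.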

Power Query: Selecting multiple elements in 'value field settings' to measure a specific field

I'm trying to create a measure that averages the 29 elements' value [Overtime/Hours_worked] into one cell, visualised by the attached image.
Cell F32 currently shows [AverageA Total Overtime/Total Hours_worked] but I want it to be an average of the 29 rows' values as displayed in cell H32, =AVERAGEA(F3:F31).
The elements' figures are based upon underlying data from Data$, currently amounting to ~150k rows. When creating a measure that averages the elements' values from Column E [AverageA Overtime/Hours_worked] and shows the result as a % of the 29 elements' aggregate %, I run into the problem that it averages the separate elements' underlying data taken from Data$. Worth noting is that F3:F31 is redundant in this instance: I'm looking for the average of the 29 elements' values in column E and not their respective averages shown in column F.
Am I right to use measure here or is there a better way to approach it? If measures can be used, is there a way to design the measure so that it refers to the Pivot Table's shown data instead of the underlying data taken from Data$? For instance by designing the measure to refer to column E in the pivot table?
Side note
The table needs to remain dynamic since Data$ is being updated regularly. I'm relatively new to Power Query so I'm not sure if there are other ways to solve this, i.e. through MDX, but I doubt I'll be able to sort that out myself.
Any and all help is appreciated, thanks.

I'm not sure how you are computing the individual entries in the AverageA Total Overtime/Total Hours_worked (so I left it blank), but to compute the totals and averages for the other columns, you can use the Table.Group command in a special way with an empty list for the key (so as to return the entire table for the Aggregation operations).
M Code
Read the comments in the code to understand the algorithm.
If your overtime% column is already in your original data, you can just delete the code lines that add that column.
let
//be sure to change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="wrkTbl"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Area", type text}, {"Hours_worked", Int64.Type}, {"Overtime", Int64.Type}}),
//Add the percent overtime column
#"Added Custom" = Table.AddColumn(#"Changed Type", "Overtime/Hours_worked",
each [Overtime]/[Hours_worked], Percentage.Type),
//Use Table.Group to compute total/averages for the last row
//Be sure to use the exact same names as used in the original table
#"Grouped Rows" = Table.Group(#"Added Custom", {}, {
{"Area", each "Totals",type text},
{"Hours_worked", each List.Sum([Hours_worked]), Int64.Type},
{"Overtime", each List.Sum([Overtime]), Int64.Type},
{"Overtime/Hours_worked", each List.Sum([Overtime])/List.Sum([Hours_worked]), Percentage.Type},
{"AverageA Overtime/Hours_worked", each List.Average([#"Overtime/Hours_worked"]), Percentage.Type}
}),
//Append the two tables to add the Totals row
append= Table.Combine({ #"Added Custom", #"Grouped Rows"})
in
append

Populate conditional column depending on column name criteria

I receive a weekly report which contains some repetition of columns. This is because it is drawn from a collection of web forms which ask similar questions to each other - let's say they all ask "Do you want to join our email list?" - but this question is stored in the source system as a separate field for each form (each form is effectively a separate table). The columns will always be consistently named - e.g. "Email_optin_1", "Email_optin_2" - so I can come up with rules to identify the columns which ask the email question. However, the number of columns may vary from week to week - one week the report might just contain "Email_optin_2", the next week it might include four such columns. (This depends on which web-forms have been used in that week). The possible values are the same in all these columns - let's say "Yes" and "No".
Each row should normally only have one of the "Email_optin" columns populated.
What I would like to do is create a single column in Power Query called "Email_Optin_FINAL", which would return "Yes" if ANY columns beginning with "Email_optin" contain a value of "Yes".
So, basically, instead of the criteria simply referring to the values in specific columns, what I would like it to do is first of all figure out which columns it needs to be looking at, and then look at the values in those columns.
Is this possible in PowerQuery?
Thanks in advance for any advice!

This finds all the columns whose names contain Email_optin, merges them into a new column, and removes the original columns:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
EmailList= List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Email_optin")),
#"Merged Columns" = Table.CombineColumns(Source,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged")
in #"Merged Columns"
This finds all the columns whose names contain Email_optin, merges them into a new column, and preserves the original columns:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Index= Table.AddIndexColumn(Source, "Index", 0, 1),
EmailList= List.Select(Table.ColumnNames(Index), each Text.Contains(_, "Email_optin")),
Merged = Table.CombineColumns(Index,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Queries" = Table.NestedJoin(Index,{"Index"},Merged,{"Index"},"Merged",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Merged", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table2",{"Index"})
in #"Removed Columns"
You can then filter the merged column for "Yes" if you want.
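If you would rather get the "Email_Optin_FINAL" column from the question directly instead of filtering merged text, a sketch along these lines may work. The sample table is hypothetical, and returning "No" when no opt-in column holds "Yes" is an assumption, since the question only specifies the "Yes" case.

```powerquery
let
    // Hypothetical sample; the real Source would be the weekly report
    Source = #table(
        {"Name", "Email_optin_1", "Email_optin_2"},
        {{"Ann", "Yes", null}, {"Bob", null, "No"}, {"Cy", null, null}}
    ),
    // Work out which columns to look at, whatever their number this week
    EmailList = List.Select(Table.ColumnNames(Source),
        each Text.StartsWith(_, "Email_optin")),
    // For each row, take only those fields and check whether any holds "Yes"
    // (assumption: every other case, including all-blank rows, becomes "No")
    AddFinal = Table.AddColumn(Source, "Email_Optin_FINAL",
        each if List.Contains(Record.ToList(Record.SelectFields(_, EmailList)), "Yes")
             then "Yes" else "No", type text)
in
    AddFinal
```

This keeps the original columns and needs no index or join, because the check is done per row on the selected fields.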

Difference vs Previous Row & DistinctCount

I want to calculate a Delta Weeks column in Power Query WeekNum[current row] - WeekNum[previous row]
I found a way to do it using the [Index] column, but it is painfully slow, and my table is 100k rows.
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product", type text}, {"WeekNum", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Delta Weeks", each try Source[WeekNum]{[Index]} - Source[WeekNum]{[Index]-1} otherwise 0) in #"Added Custom"
Also, after this, I need another column that counts the distinct values from the beginning up to that row.
Most of the weeks are consecutive, so basically the distinct count will increase when they are not.
(I don't know how to do this in Power Query).

I believe PQ wasn't designed for working with previous row context.
What I did find works better than referencing the previous row using [Index]-1, is creating 2 index columns (one starting with: 0,1,2, and the other with 0,0,1,2, so basically an [Index]-1 floored at 0), and then joining the 2 tables, which basically puts the previous row on the same row, if that makes sense.
However even that was too slow for me, and in the end I implemented a different approach, and I simply use a bit of VBA code where I calculate the difference via previous row, and then import the table in PQ. I think this is a more efficient (and considerably faster) approach!
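The index-and-join idea described above could look roughly like this. It is a simplified variant rather than the exact setup from this answer: it uses one 0-based and one 1-based index instead of a floored second index, and the sample table is hypothetical.

```powerquery
let
    // Hypothetical sample with the WeekNum column from the question
    Source = #table({"Customer", "WeekNum"}, {{"A", 1}, {"A", 2}, {"A", 5}}),
    // Index0 numbers the rows 0,1,2,...; Index1 numbers them 1,2,3,...
    Idx0 = Table.AddIndexColumn(Source, "Index0", 0, 1),
    Idx1 = Table.AddIndexColumn(Idx0, "Index1", 1, 1),
    // A row's Index0 equals the previous row's Index1, so this self-join
    // places the previous row next to the current one
    Joined = Table.NestedJoin(Idx1, {"Index0"}, Idx1, {"Index1"}, "Prev", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "Prev", {"WeekNum"}, {"PrevWeekNum"}),
    // Joins do not guarantee row order, so restore it explicitly
    Sorted = Table.Sort(Expanded, {{"Index0", Order.Ascending}}),
    // The first row has no previous row, so its delta is set to 0
    Delta = Table.AddColumn(Sorted, "Delta Weeks",
        each if [PrevWeekNum] = null then 0 else [WeekNum] - [PrevWeekNum], Int64.Type),
    Result = Table.RemoveColumns(Delta, {"Index0", "Index1", "PrevWeekNum"})
in
    Result
```

Whether this is fast enough for 100k rows depends on the data source, so it is worth timing against the other suggestions here.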

try/otherwise might be pretty slow. Is it any faster if you use if [Index] > 0 then Source[WeekNum]{[Index]} - Source[WeekNum]{[Index] - 1} else 0 for the custom formula?

Instead of your try ... otherwise code I would use something more direct, like:
[WeekNum] - #"Added Index"[WeekNum]{[Index] - 1}
Then I would add a Replace Errors step to clean up the first row, where the lookup at position -1 fails.
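Put together, that suggestion might look like the sketch below. The sample table is hypothetical, and `Table.ReplaceErrorValues` is the function behind the Replace Errors command in the editor.

```powerquery
let
    // Hypothetical sample with the WeekNum column from the question
    Source = #table({"WeekNum"}, {{1}, {2}, {5}}),
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
    // Current value minus the value one position earlier in the same column;
    // on the first row the position is -1, which raises an error
    #"Added Custom" = Table.AddColumn(#"Added Index", "Delta Weeks",
        each [WeekNum] - #"Added Index"[WeekNum]{[Index] - 1}),
    // Replace that first-row error with 0
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Added Custom", {{"Delta Weeks", 0}})
in
    #"Replaced Errors"
```

On large tables it may also help to buffer the column once, for example with List.Buffer(#"Added Index"[WeekNum]) in its own step, and to index into that buffered list instead. That is a commonly suggested tweak, not something measured here.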
