Excel Power Query - Remove Column if it exists, otherwise don't try - powerquery

I have a calculated column, Custom = Column2 + Column3 - Column4.
After the calculation, I need to delete all columns except Custom.
The problem is that one of the columns, [Column4], sometimes does not exist in the dataset.
I can get Custom to calculate properly with "try ... otherwise", as in:
#"Added Custom" = Table.AddColumn(#"Previous Step", "Custom", each [Column2]+[Column3]- (try [Column4] otherwise 0)),
#"Removed Columns7" = Table.RemoveColumns(#"Added Custom",{"Column2", "Column3", "Column4"}),
This works fine, but the second step fails if [Column4] doesn't exist.
So I need a way to test whether [Column4] exists and remove it if it does, and otherwise not try to.

How about:
#"Added Custom" = Table.AddColumn(#"Previous Step", "Custom", each [Column2]+[Column3]- (try [Column4] otherwise 0)),
#"Removed Columns" = try Table.RemoveColumns(#"Added Custom",{"Column4"}) otherwise #"Added Custom"

One way to approach this is to select the columns you want to keep rather than removing columns you don't want. This is equivalent to removing all except the columns you specify.
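A minimal sketch of the keep-list approach, assuming Custom is the only column to retain. The sample table and the step name #"Kept Custom" are illustrative and not from the question:

```powerquery
let
    // Sample data without Column4, standing in for #"Previous Step"
    Source = #table({"Column2", "Column3"}, {{1, 2}, {3, 4}}),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each [Column2] + [Column3] - (try [Column4] otherwise 0)),
    // Only Custom is named here, so a missing Column4 cannot cause an error
    #"Kept Custom" = Table.SelectColumns(#"Added Custom", {"Custom"})
in
    #"Kept Custom"
```

Because the step never names the optional column, it should behave the same whether or not Column4 is present.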
Alternatively, you can intersect the table's full list of column names with your list, so that only columns that actually exist are removed:
Table.RemoveColumns(
#"Added Custom",
List.Intersect(
{
Table.ColumnNames(#"Added Custom"),
{"Column2", "Column3", "Column4"}
}
)
)

Another way:
add = Table.AddColumn(Source, "Custom", each List.Sum({[Column2],[Column3],-[Column4]?})),
del = Table.RemoveColumns(add,{"Column2", "Column3", "Column4"}, MissingField.Ignore)

If the only column you want to retain is Custom, then just use Table.SelectColumns.
If there might be other columns you want to retain, you can select them also or you can generate a list of columns to remove.
From what you write, it seems you want to remove any columns whose name starts with Column. If that is the case, here is one method:
#"Removed Columns"= Table.RemoveColumns(#"Previous Step",
List.Select(Table.ColumnNames(#"Previous Step"), each Text.StartsWith(_,"Column")))

Related

Power Query to filter only numbers

I've been searching everywhere for a way to filter a column that contains both text and numbers. I want to keep only the numbers from that column.
Thanks.
Add a custom column (Add Column > Custom Column), potentially with one of these:
= Text.Select([Column1],{"0".."9"})
= try Number.From([Column1]) otherwise "Text"
Try this:
let
//Change next line to reflect Data source
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
//change next line to include all columns and their names
#"Changed Type" = Table.TransformColumnTypes(Source,{{"COLUMN", type any}}),
//Change next line to be testing the proper column
#"Numbers Only" = Table.SelectRows(#"Changed Type", each not (try Number.From([COLUMN]))[HasError]),
#"Changed Type1" = Table.TransformColumnTypes(#"Numbers Only",{{"COLUMN", type number}})
in
#"Changed Type1"

Is there any way to refer to the current column name as a variable in Power Query

I am new to Power Query and would like to ask more experienced people about this.
I am trying to solve a problem with Text.Combine: I would like to combine the value in a specific column with the name of the current cell's column.
Do you have any ideas?
What I have in mind is a formula similar to this:
=Text.Combine({[kod], Column.Name},";")
Thank you very much for any answers.
Tomas
Edit 8.12.2021:
@horseyride
I am actually trying to fill the columns automatically with data in the following format, where each entry combines the value from the current row with the current column name.
For example:
8M0183:F01A0101.B in the first row, second column,
8M0182:F01A0102.A in the second row, first column.
A table example is shown below.
Thank you very much for all the answers.
See if this works for you. It combines the KOD column with the column name in each null cell, for every column that is not named KOD.
It finds all columns not named KOD and converts them to text. It replaces all nulls with a placeholder, here "xxx". It then unpivots so that everything sits in just three columns, and combines the column title and the KOD value wherever the cell content equals the placeholder. Finally, it pivots to get back to the original format.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnList=List.Difference(Table.ColumnNames(Source),{"KOD"}),
ConvertToText= Table.TransformColumnTypes (Source,List.Transform(ColumnList, each {_ , type text})),
ReplaceNullsWithPlaceholder = Table.ReplaceValue(ConvertToText,null,"xxx",Replacer.ReplaceValue,ColumnList),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(ReplaceNullsWithPlaceholder, {"KOD"}, "Attribute", "Value"),
CombineColumnAndRow = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Value]="xxx" then [Attribute]&":"&[KOD] else [Value]),
#"Removed Columns" = Table.RemoveColumns(CombineColumnAndRow,{"Value"}),
#"Pivoted Column1" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Value2")
in #"Pivoted Column1"

Power Query: How to insert a Column using the Left(Right Function based on A4

Is there a way to Add a column, in Power Query, by referencing data in a specific cell?
I want to take the text from "A4", use a Left(Right function, and add that to a new column.
My VBA macro is:
"Latest 4 Wks - Ending " & Left(Right(.Range("A4"), 24), 23)
I guess you want to do something like this. As a first step, define a named range for A4, which I named cellA4. I then loaded it into Power Query, added an extra column with the relevant part of the text (I used Text.Middle; other text functions are possible, of course), and drilled down to the content of the cell. The M code for that is:
let
Source = Excel.CurrentWorkbook(){[Name="cellA4"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Text.Middle([Column1],23)),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1"}),
Custom = #"Removed Columns"{0}[Custom]
in
Custom
Then I made a table with one column, imported it into Power Query, and added an extra column that contains just the text from cell A4. The M code is:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Col1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each cellA4)
in
#"Added Custom"
Through further research, I found that by adding a blank query I was able to add a column in Power Query that references data in a specific cell.
Insert a Blank Query.
In the Advanced Editor, enter:
(YourWorkSheet as table ) as text=>
let
SheetCellA4 =YourWorkSheet[Column1]{3},
SplitByFrom = Text.Split(SheetCellA4, "to "){1},
SplitByTime = Text.Split(SplitByFrom, "`"){0}
in SplitByTime
Then bring in the worksheet data.
After the Source line, add:
#"Added Custom" = Table.AddColumn(Source, "Custom", each Query1(Source))
in
#"Added Custom"
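For reference, the VBA expression Left(Right(x, 24), 23) from the question can be written directly in M with Text.End and Text.Start. This is only a sketch: it assumes the named range cellA4 from the first answer exists and that the cell holds text.

```powerquery
let
    // Assumes a named range "cellA4" pointing at cell A4, as in the first answer
    Source = Excel.CurrentWorkbook(){[Name="cellA4"]}[Content],
    CellText = Source{0}[Column1],
    // Left(Right(x, 24), 23): take the last 24 characters, then the first 23 of those
    Label = "Latest 4 Wks - Ending " & Text.Start(Text.End(CellText, 24), 23)
in
    Label
```

The resulting text value can then be used in Table.AddColumn in the same way cellA4 is used in the second query above.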

Populate conditional column depending on column name criteria

I receive a weekly report which contains some repetition of columns. This is because it is drawn from a collection of web forms which ask similar questions to each other - let's say they all ask "Do you want to join our email list?" - but this question is stored in the source system as a separate field for each form (each form is effectively a separate table). The columns will always be consistently named - e.g. "Email_optin_1", "Email_optin_2" - so I can come up with rules to identify the columns which ask the email question. However, the number of columns may vary from week to week - one week the report might just contain "Email_optin_2", the next week it might include four such columns. (This depends on which web-forms have been used in that week). The possible values are the same in all these columns - let's say "Yes" and "No".
Each row should normally only have one of the "Email_optin" columns populated.
What I would like to do is create a single column in Power Query called "Email_Optin_FINAL", which would return "Yes" if ANY columns beginning with "Email_optin" contain a value of "Yes".
So, basically, instead of the criteria simply referring to the values in specific columns, what I would like it to do is first of all figure out which columns it needs to be looking at, and then look at the values in those columns.
Is this possible in PowerQuery?
Thanks in advance for any advice!
This would find all the columns containing Email_optin and merge them for you into a new column and remove the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
EmailList= List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Email_optin")),
#"Merged Columns" = Table.CombineColumns(Source,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged")
in #"Merged Columns"
This would find all the columns containing Email_optin and merge them for you into a new column and preserve the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Index= Table.AddIndexColumn(Source, "Index", 0, 1),
EmailList= List.Select(Table.ColumnNames(Index), each Text.Contains(_, "Email_optin")),
Merged = Table.CombineColumns(Index,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Queries" = Table.NestedJoin(Index,{"Index"},Merged,{"Index"},"Merged",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Merged", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table2",{"Index"})
in #"Removed Columns"
You can then filter for "Yes" among the merged answers if you want.
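If you would rather get a plain "Yes"/"No" column than merged text, one possible variation is to test the matching fields of each row directly. This is a sketch rather than the method above: the sample table is made up, and returning "No" when nothing matches is an assumption.

```powerquery
let
    // Sample data standing in for the weekly report
    Source = #table(
        {"Name", "Email_optin_1", "Email_optin_2"},
        {{"A", "Yes", null}, {"B", null, "No"}}
    ),
    // Work out which columns to inspect from their names
    EmailList = List.Select(Table.ColumnNames(Source), each Text.StartsWith(_, "Email_optin")),
    // For each row, collect the values of those columns and look for "Yes"
    AddedFinal = Table.AddColumn(
        Source,
        "Email_Optin_FINAL",
        each if List.Contains(Record.FieldValues(Record.SelectFields(_, EmailList)), "Yes") then "Yes" else "No",
        type text
    )
in
    AddedFinal
```

Since EmailList is rebuilt from the column names on every refresh, the step should adapt when the number of Email_optin columns changes from week to week.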

Difference vs Previous Row & DistinctCount

I want to calculate a Delta Weeks column in Power Query: WeekNum[current row] - WeekNum[previous row].
I found a way to do it using the [Index] column, but it is painfully slow, and my table is 100k rows.
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product", type text}, {"WeekNum", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Delta Weeks", each try Source[WeekNum]{[Index]} - Source[WeekNum]{[Index]-1} otherwise 0)
in
#"Added Custom"
Also, after this, I need another column that counts the distinct values from the beginning up to that row.
Most of the weeks are consecutive, so the distinct count will only increase when they are not.
(I don't know how to do this in Power Query.)
I believe PQ wasn't designed for working with previous row context.
What I did find works better than referencing the previous row using [Index]-1, is creating 2 index columns (one starting with: 0,1,2, and the other with 0,0,1,2, so basically an [Index]-1 floored at 0), and then joining the 2 tables, which basically puts the previous row on the same row, if that makes sense.
However even that was too slow for me, and in the end I implemented a different approach, and I simply use a bit of VBA code where I calculate the difference via previous row, and then import the table in PQ. I think this is a more efficient (and considerably faster) approach!
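The index-join idea described above can be sketched roughly as follows. This is a variation, not the answerer's exact query: instead of a second index floored at 0, the copy of the table is indexed from 1, so the first row simply finds no match. The sample data and step names are illustrative.

```powerquery
let
    // Sample data standing in for Table3
    Source = #table({"WeekNum"}, {{1}, {2}, {4}}),
    Idx = Table.AddIndexColumn(Source, "Index", 0, 1),
    // Copy of the WeekNum column keyed one position later, so row n lines up with row n-1
    Prev = Table.AddIndexColumn(Table.SelectColumns(Source, {"WeekNum"}), "Index", 1, 1),
    PrevRenamed = Table.RenameColumns(Prev, {{"WeekNum", "PrevWeekNum"}}),
    Joined = Table.NestedJoin(Idx, {"Index"}, PrevRenamed, {"Index"}, "Prev", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "Prev", {"PrevWeekNum"}),
    // Joins do not guarantee row order, so restore it explicitly
    Sorted = Table.Sort(Expanded, {{"Index", Order.Ascending}}),
    AddedDelta = Table.AddColumn(
        Sorted,
        "Delta Weeks",
        each if [PrevWeekNum] = null then 0 else [WeekNum] - [PrevWeekNum],
        Int64.Type
    )
in
    AddedDelta
```

Whether this is fast enough for 100k rows would need to be measured on the real data.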
try/otherwise might be pretty slow. Is it any faster if you use if [Index] > 0 then Source[WeekNum]{[Index]} - Source[WeekNum]{[Index] - 1} else 0 for the custom formula?
Instead of your try ... otherwise code I would use something more direct like:
[WeekNum] - #"Added Index"[WeekNum]{[Index] - 1}
Then I would add a Replace Errors step to clean up the first row.
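A sketch of that last suggestion, with made-up sample data. The Replace Errors step in the UI corresponds to Table.ReplaceErrorValues:

```powerquery
let
    // Sample data standing in for Table3
    Source = #table({"WeekNum"}, {{1}, {2}, {4}}),
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
    // On the first row the lookup at position -1 raises an error
    #"Added Custom" = Table.AddColumn(#"Added Index", "Delta Weeks", each [WeekNum] - #"Added Index"[WeekNum]{[Index] - 1}),
    // Replace that error with 0
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Added Custom", {{"Delta Weeks", 0}})
in
    #"Replaced Errors"
```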
