Power Query add multiple empty columns? - dax

I need to add 12 empty columns in Power Query with custom names.
Right now I add one column at a time and then rename it.
Is it possible to do this faster/better?
I tried the first option but got an error.
When I add a column with this code:
= Table.FromColumns(
Table.ToColumns(#"Prev Step") & {{null}, {null}, {null}},
Table.ColumnNames(#"Prev Step") & {"Empty1", "Empty2", "Empty3"}
)
I get a lot of rows in the three columns.
What am I doing wrong?

How about this for adding three empty columns? Extend to more as needed.
= Table.FromColumns(
Table.ToColumns(#"Prev Step") & {{null}, {null}, {null}},
Table.ColumnNames(#"Prev Step") & {"Empty1", "Empty2", "Empty3"}
)

By generating lists of columns and associated names, you can specify just the number of columns to add, and let the list of nulls and names be generated automatically.
For example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkksSVQwVIrVgTKNlGJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//add Multiple Blank Columns
numCols = 12,
colNames = List.Generate(
()=>[colName = "Blank", idx=0],
each [idx] < numCols,
each [colName = "Blank" & Number.ToText([idx]+1), idx = [idx]+1],
each [colName]
),
addedCols = Table.FromColumns(
Table.ToColumns(#"Changed Type") & List.Repeat({{null}},numCols),
Table.ColumnNames(#"Changed Type") & colNames)
in
addedCols
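A shorter way to build the names, offered as a sketch rather than a drop-in replacement. The List.Generate step above yields "Blank", "Blank1", ..., "Blank11", because the first name comes from the initial record and so carries no number. If you would rather have "Blank1" through "Blank12", List.Transform over {1..numCols} does that, and List.Accumulate with Table.AddColumn is one alternative to the ToColumns/FromColumns round trip. The Source table and the step names below are made up for illustration.

```powerquery
let
    // Hypothetical stand-in for your previous step
    Source = #table({"Column1"}, {{"dat1"}, {"dat2"}}),
    numCols = 12,
    // {"Blank1", "Blank2", ..., "Blank12"}
    colNames = List.Transform({1..numCols}, each "Blank" & Number.ToText(_)),
    // Add one all-null column per generated name
    addedCols = List.Accumulate(
        colNames,
        Source,
        (tbl, colName) => Table.AddColumn(tbl, colName, each null)
    )
in
    addedCols
```

To use your own names instead of numbered ones, replace colNames with a literal list such as {"Jan", "Feb", "Mar"}.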

Related

Power Query: Change Header Titles from Camel Case to Snake Case

I have just imported some data into Power Query. The headers are in camel case. I.e.:
headerOne
headerTwo
headerThree
Etc.
I would like them to be in snake case. I.e.:
header_one
header_two
header_three
Etc.
I am not sure, though, how to do this. Any ideas?
Thanks.
Examine the applied steps to understand the algorithm.
let
//change next line to reflect actual data source
Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
//change column headers
colNames = Table.ColumnNames(Source),
#"Split at UpperCase" = List.Transform(colNames, each Splitter.SplitTextByCharacterTransition({"a".."z"},{"A".."Z"})(_)),
#"Snake Case" = List.Transform(#"Split at UpperCase", each Text.Lower(Text.Combine(_,"_"))),
rename = Table.RenameColumns(Source, List.Zip({colNames,#"Snake Case"}))
in
rename
Another method is to replace "X" with "_x" for any upper-case letter.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUUoC4mSl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [headerOne = _t, headerTwo = _t, headerThree = _t]),
Headers = Table.ColumnNames(Source),
NewHeaders = List.Transform(Headers, each Text.Combine(List.Transform(Text.ToList(_), each if List.Contains({"A".."Z"}, _) then "_" & Text.Lower(_) else _))),
Result = Table.RenameColumns(Source, List.Zip({Headers, NewHeaders}))
in
Result
This splits each header into a list of characters (using Text.ToList), replaces any capital letter in "A" to "Z" with the "_" prepended to the lower-case version (using List.Transform), and then combines the list back into a string (using Text.Combine).
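Either conversion can also be applied in a single step with Table.TransformColumnNames, which takes a text-to-text function and renames every column with it. The following is a minimal sketch that reuses the splitter from the first answer; the sample table and the helper name toSnake are invented for illustration.

```powerquery
let
    // Made-up sample table with camel-case headers
    Source = #table({"headerOne", "headerTwo", "headerThree"}, {{1, 2, 3}}),
    // Hypothetical helper: split at each lower-to-upper transition, join with "_", lower-case
    toSnake = (name as text) as text =>
        Text.Lower(
            Text.Combine(
                Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"})(name),
                "_"
            )
        ),
    // Columns become header_one, header_two, header_three
    Result = Table.TransformColumnNames(Source, toSnake)
in
    Result
```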

Create conditional Table.TransformColumnType in powerquery

I am trying to convert columns to numeric. If TransformColumnTypes causes an error, I want to keep it text. Something like this:
#"Changed Type" = try Table.TransformColumnTypes(CombineTables,List.Transform(sTranCol, each {_, type number})), otherwise Table.TransformColumnTypes(CombineTables,List.Transform(sTranCol, each {_, type number})),
Obviously this doesn't work. sTranCol is the list of columns to convert to numeric. It is built dynamically, not static. I don't mind errors in individual cells, but transposing a table that contains error cells causes the query to abort.
The M Code methods I've seen to detect data type of a column consist of sampling the data and determining the type. This seems messy.
But perhaps an alternative might be type the columns as numeric, and then replace the error values with something that won't cause a problem when transposing.
Here is some sample code to replace errors with null, but you could replace with anything null or numeric:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTJUitWJBpI6SsZglhGQZQ5mVQBZiWCWKZCVBGaZA1kVEB0ghYYmSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,
List.Transform(Table.ColumnNames(Source), each {_,type nullable number})),
nullList = List.Transform(Table.ColumnNames(#"Changed Type"), each {_, null}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", nullList)
in
#"Replaced Errors"
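If you would rather convert only the columns in sTranCol and never produce error cells in the first place, one option is a per-cell try ... otherwise inside Table.TransformColumns. This is a sketch under assumptions: the sample table and the sTranCol list are invented stand-ins for your own steps, and Number.From parses text according to your locale.

```powerquery
let
    // Made-up sample standing in for CombineTables
    CombineTables = #table({"Column1", "Column2"}, {{"1", "x"}, {"2", "3"}}),
    // Made-up list standing in for the dynamically built sTranCol
    sTranCol = {"Column1", "Column2"},
    // For each listed column: convert each value, falling back to null on failure
    Converted = Table.TransformColumns(
        CombineTables,
        List.Transform(
            sTranCol,
            (col) => {col, each try Number.From(_) otherwise null, type nullable number}
        )
    )
in
    Converted
```

Here the "x" in Column2 becomes null, while the other values become numbers.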
Edit: Added M code that sets each column's type depending on whether all of its values are numeric:
let
    Source = Excel.CurrentWorkbook(){[Name="Table37"]}[Content],
    //check data type
    //if all numbers set to number, else any
    colTypes = List.Accumulate(
        Table.ColumnNames(Source),
        {},
        (state, current) => List.Combine({
            state,
            if List.IsEmpty(
                List.RemoveMatchingItems(
                    List.Transform(Table.Column(Source, current), each Value.Type(_)),
                    {type number}))
            then {{current, type number}}
            else {{current, type any}}
        })
    ),
    #"Changed Type" = Table.TransformColumnTypes(Source, colTypes)
in
    #"Changed Type"
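The same "all values numeric" check can be written more compactly with List.MatchesAll and the `is` operator. This is only a sketch of an alternative, not the code the answer used; the sample table is made up. As in the version above, a column containing a null counts as non-numeric.

```powerquery
let
    // Made-up sample: column A is all numbers, column B is mixed
    Source = #table({"A", "B"}, {{1, "x"}, {2, 3}}),
    // One {name, type} pair per column
    colTypes = List.Transform(
        Table.ColumnNames(Source),
        (col) => {
            col,
            if List.MatchesAll(Table.Column(Source, col), each _ is number)
            then type number
            else type any
        }
    ),
    #"Changed Type" = Table.TransformColumnTypes(Source, colTypes)
in
    #"Changed Type"
```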

How to get Tables from Navigator Pane and Combine

This is the corrected code with the guidance from Alexis. My PDF returns two tables (and 1 page Table) per output page. Table001 is a throw away. I only need even numbered Tables so I use the List.Select to remove the Page Table and List.Alternate to skip odd numbered tables.
let
Source = Pdf.Tables(File.Contents("State_Fico.pdf"), [Implementation="1.3"]),
TableNames = List.Alternate(List.Select(Table.Column(Source, "Id"),each Text.Contains(_,"Table")),1,1),
TableList = List.Transform(TableNames, each Source{[Id=_]}[Data]),
CombineTables = Table.Combine(TableList)
in
CombineTables
This allows me to generate 1 table no matter how many pages the pdf is.
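To see what the List.Select and List.Alternate combination does without needing a PDF, here is a small sketch on a hard-coded list. The Id values are invented to mimic the Page/Table naming described above.

```powerquery
let
    // Made-up Ids in the style Pdf.Tables produces
    Ids = {"Page001", "Table001", "Table002", "Page002", "Table003", "Table004"},
    // Drop the Page entries
    TablesOnly = List.Select(Ids, each Text.Contains(_, "Table")),
    // Skip 1, keep 1, repeat: leaves the even-numbered tables
    EvenTables = List.Alternate(TablesOnly, 1, 1)
in
    EvenTables  // {"Table002", "Table004"}
```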
While this doesn't really answer the question in your title, I think the best way to do what you're ultimately after is to not use Expression.Evaluate at all and, instead, use list transformation(s).
For example, if you want to append Table002 and Table004, you can use Table.Combine on a list of tables, {Table002, Table004}.
Here's what the code might look like:
let
    Source = Pdf.Tables(File.Contents("State_Fico.pdf"), [Implementation="1.3"]),
    #"Transposed Table" = Table.Transpose(Source),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Page001", type any}, {"Table001", type any}, {"Page002", type any}, {"Table002", type any}, {"Table003", type any}, {"Table004", type any}}),
    ColumnNames = Table.ColumnNames(#"Changed Type"),
    TableNames = List.Alternate(List.Select(ColumnNames, each Text.Contains(_ ,"Table")),1,1),
    //--New Steps Below--//
    //field names are case-sensitive: the key column is "Id", not "ID"
    TableList = List.Transform(TableNames, each Source{[Id=_]}[Data]),
    CombineTables = Table.Combine(TableList)
in
    CombineTables

How to set source of a query which name is a value of another query?

I need a query (query1) that reads a file from a folder. The file is updated daily and I need to connect to the newest one. To do that I created a query (query2) that returns the newest filename as its single record.
Now, how do I set the source of query1 to a dynamic value extracted from query2?
In the example below, instead of pointing to staticfilename.xlsx, I want to point to a dynamic filename whose value is calculated by query2.
let
Source = Excel.Workbook(File.Contents("Q:\....\staticfilename.XLSX"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(...)
in
#"Changed Type"
An alternative to this is to load from a folder, sort by date created (or modified), and pick the top row instead of needing a separate query.
More details in this article and this one too.
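The folder approach described above might look like the following minimal sketch. The folder path is a placeholder, and the step names are invented. "Date modified" and "Content" are columns that Folder.Files returns.

```powerquery
let
    // Placeholder path: point this at your own folder
    Files = Folder.Files("C:\Reports"),
    // Newest file first
    Sorted = Table.Sort(Files, {{"Date modified", Order.Descending}}),
    // Binary content of the top row, i.e. the most recently modified file
    Newest = Sorted{0}[Content],
    Source = Excel.Workbook(Newest, null, true)
in
    Source
```

Depending on the folder, you may want to filter on [Extension] or exclude temporary files before sorting.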
If query2 returns a 1x1 table with column name Column1 and contents Q:\path\subpath\filename.xlsx, then this should work to pull the path from query2 into your next query:
Source = Excel.Workbook(File.Contents(query2{0}[Column1]), null, true),
Another option is to use a function to return the latest file:
//fnLatestFile (excluding temp files)
(
    FileFolder as text,
    optional FileNameContains as text,
    optional FileExtension as text,
    optional IncludeSubfolders as logical,
    optional OutputType as text
) =>
let
    fSwitch = (Expression as any, Values as list, Results as list, optional Else as any) =>
        try Results{List.PositionOf(Values, Expression)} otherwise if Else = null then "Value not found" else Else,
    Source = Table.SelectRows(Folder.Files(FileFolder), each not Text.Contains([Name], "~")),
    #"Filtered Name" = if FileNameContains = null then Source else Table.SelectRows(Source, each (Text.Contains([Name], FileNameContains) = true)),
    #"Filtered Extension" = if FileExtension = null then #"Filtered Name" else Table.SelectRows(#"Filtered Name", each ([Extension] = FileExtension)),
    #"Filtered Subfolder" = if IncludeSubfolders = true then #"Filtered Extension" else Table.SelectRows(#"Filtered Extension", each ([Folder Path] = Text.Combine({FileFolder, if Text.End(FileFolder,1) = "\" then "" else "\"}))),
    #"Sorted by Modified Date" = Table.Sort(#"Filtered Subfolder",{{"Date modified", Order.Descending}}),
    FileData = #"Sorted by Modified Date"{0},
    Output = fSwitch(
        Text.Lower(OutputType),
        {"name","fullname","date"},
        {FileData[Name], FileData[Folder Path] & FileData[Name], FileData[Date modified]},
        FileData[Content]
    )
in
    Output
Applying to your query, your first line then becomes:
Source = Excel.Workbook(fnLatestFile("Q:\....\", "staticfilename", ".xlsx", false), null, true),

DAX EARLIER() function in Power Query

Is there an equivalent to EARLIER in M/Power Query?
Say, I have a table with lots of different dates in column DATE and a smaller number of letters in column LETTER. I now want the maximum date for each letter.
In DAX, I would use something like CALCULATE(MAX([Date]),FILTER(ALL(Table),[Letter]=EARLIER([Letter]))).
How would I achieve the same in M?
Thanks
Two solutions are in the code below. Note that each one uses "PreviousStep" as its basis, so they are separate solutions.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    PreviousStep = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Letter", type text}}),
    // 1. Add a column to the original table with the MaxDate for each letter
    // "earlier" is just the name of a function parameter; it could as well have been "x" or "MarcelBeug"
    AddedMaxDate = Table.AddColumn(
        PreviousStep,
        "MaxDate",
        (earlier) => List.Max(Table.SelectRows(PreviousStep, each [Letter] = earlier[Letter])[Date])
    ),
    // 2. Group by letter and get the MaxDate for each letter
    GroupedOnLetter = Table.Group(PreviousStep, {"Letter"}, {{"MaxDate", each List.Max([Date]), type date}})
in
    GroupedOnLetter
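The two solutions can also be combined: group once, then join the grouped result back onto the original rows. That gives the per-row MaxDate of solution 1 without filtering the whole table for every row, which may scale better on large tables. The sketch below uses a made-up sample table in place of PreviousStep, and the join column name "Grouped" is arbitrary.

```powerquery
let
    // Made-up sample standing in for PreviousStep
    PreviousStep = #table(
        type table [Date = date, Letter = text],
        {{#date(2020, 1, 1), "A"}, {#date(2020, 3, 1), "A"}, {#date(2020, 2, 1), "B"}}
    ),
    // One row per letter with its latest date
    Grouped = Table.Group(PreviousStep, {"Letter"}, {{"MaxDate", each List.Max([Date]), type date}}),
    // Attach the grouped result to every original row
    Merged = Table.NestedJoin(PreviousStep, {"Letter"}, Grouped, {"Letter"}, "Grouped", JoinKind.LeftOuter),
    // Pull MaxDate out of the nested table column
    Expanded = Table.ExpandTableColumn(Merged, "Grouped", {"MaxDate"})
in
    Expanded
```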
In short, there is no exact equivalent of this function, but other approaches can produce the same results.
To reproduce the example Microsoft gives in the help for the EARLIER function, you can use the following code (table1 is the table from that example before ranking):
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TVNNaxtBDP0rxuSoivn+uMYlLSUFE4f2YHIYd4d48Xq3rO1C/n01o1mc4+i9kZ6epP1+LcMa1o/94cvuOM3XCz0epHUgnccQ1m+wXytXGae8ekl/TpWhlACvBHrBDL8wdtc0dpWiLTgV0EVm1CrT9Trky4ooq016z5VnI2ij0OjKs402nVePM1XLrMgEcEaj8ZVU9czpxAmcAik1SlcxGSm2SX/5m4eoDVpToSJyc0z9WLEAwXgUrcX6a8hpzDNb4CAEhU5VuIjfzGk8XZoeGSVYpVBwd+X31zynfhjyjRM4A9FZ1NyWFhR7ymPX0hsJ0RuUbJ+s6DSzt96QtR4d96MK9m2Y/uVmfABtNVrWbSj2newc8iEtwjUoS401O2Rh5NQtyq0HZyNGFq4ZHs6Lz1aCjAopXmFV4I9uTtd+GlfbZfyR3IkafTOvJPlBneUPbj1GMCouMFkA6+f+/VhLcKjofp5aNmlBkKQ23JLs53QbrzSoVdkp3iYDWlgIzqBi6VJ9Jj7N6cxMA1ZSE16ga/XLTm3TOPZsPv8uora5SwNLMIIkK1Q8EF02bHs78xZJBS5alK1bCr1Mqbtro7+WfHPRoeZNk2Yh3XVpcNqBjgE9myuLrl3qaHg8GUUr5RYbVKlzP0kdLHhBJ9kOrsjfLQaWndCEWcZK8dfF7wcZIrkRUXNe7Ss6tzN8vR2WxTIQtMLQJl9Y023ux/d7o1JTHVOH0MyQ7hPv3isdh7F01gYFH5Aqvf7KF5akyLEYBYrmVpH0+5jz0C4nADEq+vYf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ProductSubcategoryKey = _t, EnglishProductSubcategoryName = _t, TotalSubcategorySales = _t]),
table1 = Table.TransformColumnTypes(Source,{{"ProductSubcategoryKey", Int64.Type}, {"EnglishProductSubcategoryName", type text}, {"TotalSubcategorySales", Currency.Type}}, "en-US"),
AddCount = Table.AddColumn(
table1,
"SubcategoryRanking", //(a) is a parameter for function, which equals current record, and function should return value for new cell of "SubcategoryRanking"
(a)=> Table.RowCount(
Table.SelectRows(
table1, //(b) equals whole table1. This function returns table filtered by given criteria
(b) => b[TotalSubcategorySales] < a[TotalSubcategorySales])
) + 1,
Int64.Type)
in
AddCount
I think you can use the Table.Group function (Group By in the UI) to group the data by Letter and find the max of the date column. Your code would look something like this:
= Table.Group(#"Previous step", {"Letter"}, {{"Max Date", each List.Max([Date]), type date}})
