Hello, I hope someone can help me with a Power Query problem. I'm brand new to Power Query and the M language, and while I have some coding background, coding is not my day job. I'm pulling data from a web page, and the data in one column is a list nested in a list.
This is a clip of what I initially see from the query:
I then drill down on the list and see this for all of the rows:
I then drill down again in that list and get this:
At this level I get at least one row, but there could be many rows.
What I want is to take all of the values and combine them into one cell as a bulleted list like this:
Any assistance on how to do this would be appreciated.
I've tried the examples in other threads, but I only get errors when I do.
You didn't really provide enough detail here, but it looks like a bunch of lists within lists.
You can run them through something like this to expand them all. If the results don't look like what you want, provide more information and sample data we can reproduce.
let Source = <<copy whatever your source is here>>,
//Marcel Beug 2017
TableSchema = Table.Schema(Source),
ColumnNames = Table.SelectColumns(TableSchema,{"Name"}),
IsListColumn = Table.AddColumn(ColumnNames, "IsListColumn?", each List.AllTrue(List.Transform(Table.Column(Source,[Name]), each _ is list))),
NonListColumns = Table.SelectRows(IsListColumn, each ([#"IsListColumn?"] = false)),
NonListColumnNames = Table.RemoveColumns(NonListColumns,{"IsListColumn?"})[Name],
SelectNonListColumns = Table.SelectColumns(Source,NonListColumnNames),
ListColumns = Table.SelectRows(IsListColumn, each ([#"IsListColumn?"] = true)),
ListColumnNames = Table.RemoveColumns(ListColumns,{"IsListColumn?"})[Name],
SelectListColumns = Table.SelectColumns(Source,ListColumnNames),
TableFromLists = Table.AddColumn(SelectListColumns, "TableFromLists", each Table.FromColumns(Record.FieldValues(_))),
ListTables = Table.SelectColumns(TableFromLists,{"TableFromLists"}),
Custom1 = Table.FromColumns({Table.ToRecords(SelectNonListColumns),Table.ToRecords(ListTables)}),
#"Expanded Column1" = Table.ExpandRecordColumn(Custom1, "Column1", Table.ColumnNames(#table(List.Min({1,List.Count(NonListColumnNames)}),{})), NonListColumnNames),
#"Expanded Column2" = Table.ExpandRecordColumn(#"Expanded Column1", "Column2", {"TableFromLists"}, {"TableFromLists"}),
#"Expanded TableFromLists" = Table.ExpandTableColumn(#"Expanded Column2", "TableFromLists", Table.ColumnNames(#table(List.Count(ListColumnNames),{})), ListColumnNames),
#"Reordered Columns" = Table.ReorderColumns(#"Expanded TableFromLists",ColumnNames[Name])
in #"Reordered Columns"
EDIT for specific website clarification
let Source = Web.Page(Web.Contents("https://ised-isde.canada.ca/site/high-speed-internet-canada/en/universal-broadband-fund/selected-universal-broadband-fund-projects")),
#"Expanded Data" = Table.ExpandTableColumn(Source, "Data", {"Location of project", "Number of Households to be served / Number of kilometers to be covered (mobile projects)", "Funding recipient", "Funding amountFootnote *"}, {"Location of project", "Number of Households to be served / Number of kilometers to be covered (mobile p", "Funding recipient", "Funding amountFootnote *"}),
#"Added Custom" = Table.AddColumn(#"Expanded Data", "Location of Project2", each Text.Combine([Location of project]{1},"#(lf)")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Source", "ClassName", "Id", "Location of project"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Caption] <> "Document")),
#"FundingToAmount" = Table.TransformColumns(#"Filtered Rows",{{"Funding amountFootnote *", each Number.From(Text.Select(_,{"0".."9",".","$"})), type number}})
in #"FundingToAmount"
Then, in Excel, format that column with Text control > [x] Wrap text so the line feeds show as separate lines.
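For the bulleted list the question asks for, the same Text.Combine idea can be extended by prefixing each item before joining. This is only a sketch on a hypothetical sample table: the column values and the position of the inner list (index 0 here, whereas the query above uses index 1 for the real page) are assumptions.

```powerquery
let
    // Hypothetical sample: each cell holds an outer list wrapping the inner list of places
    Source = #table(
        {"Project", "Location of project"},
        {
            {"A", {{"Town 1", "Town 2", "Town 3"}}},
            {"B", {{"Town 4"}}}
        }
    ),
    // Prefix every inner item with a bullet, then join the items with line feeds
    Bulleted = Table.AddColumn(
        Source,
        "Locations",
        each Text.Combine(
            List.Transform([Location of project]{0}, each "• " & Text.From(_)),
            "#(lf)"
        ),
        type text
    )
in
    Bulleted
```

Text.From is there so that non-text items (numbers, for instance) do not break the concatenation.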
I would like to select certain columns in Power Query without using their names. In R, for example, I can do this with the select command. I'm wondering how I can do it in Power Query. I found some information here, but not all that I need.
Any idea how to do this if I want to refer to more than one column?
It doesn't work if I write the code as below:
#"Filtered Part Desc" = Table.SelectRows (
#"Removed Columns3",
each List.Contains(
{ "ENG", "TRANS" },
Record.Field(_, Table.ColumnNames(#"Removed Columns3") { 5, 6, 7 })
)
)
Let's say I have this table and want to do a couple of things to it.
First, I want to change the column type of the second and last columns. We can use Table.ColumnNames to do this using simple indexing (which starts at zero) as follows:
Table.TransformColumnTypes(
Source,
{
{Table.ColumnNames(Source){1}, Int64.Type},
{Table.ColumnNames(Source){3}, Int64.Type}
}
)
That works but requires specifying each index separately. If we want to unpivot these columns like this
Table.Unpivot(#"Changed Type", {"Col2", "Col4"}, "Attribute", "Value")
but using the index values instead we can use the same method as above
Table.Unpivot(
#"Changed Type",
{
Table.ColumnNames(Source){1},
Table.ColumnNames(Source){3}
}, "Attribute", "Value"
)
But is there a way to do this where we can use a single list of positional index values and use Table.ColumnNames only once? I found a relatively simple though unintuitive method on this blog. For this case, it works as follows:
Table.Unpivot(
#"Changed Type",
List.Transform({1,3}, each Table.ColumnNames(Source){_}),
"Attribute", "Value"
)
This method starts with the list of positional index values and then transforms them into column names by looking up the names of the columns corresponding to those positions.
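The same lookup can probably also replace the repeated indexing in the type-change step. A minimal sketch, assuming a hypothetical four-column text table; the step names Names and TypePairs are made up for illustration:

```powerquery
let
    // Hypothetical four-column table of text values
    Source = #table(
        {"Col1", "Col2", "Col3", "Col4"},
        {{"a", "1", "b", "2"}, {"c", "3", "d", "4"}}
    ),
    // Read the column names once
    Names = Table.ColumnNames(Source),
    // Turn positions 1 and 3 into {name, type} pairs
    TypePairs = List.Transform({1, 3}, each {Names{_}, Int64.Type}),
    ChangedType = Table.TransformColumnTypes(Source, TypePairs)
in
    ChangedType
```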
Here's the full M code for the query I was playing with:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTIE4nIgtlSK1YlWSgKyjIC4AogtwCLJQJYxEFcCsTlYJAXIMgHiKiA2U4qNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Col1 = _t, Col2 = _t, Col3 = _t, Col4 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{Table.ColumnNames(Source){1}, Int64.Type},{Table.ColumnNames(Source){3}, Int64.Type}}),
#"Unpivoted Columns" = Table.Unpivot(#"Changed Type", List.Transform({1,3}, each Table.ColumnNames(Source){_}), "Attribute", "Value")
in
#"Unpivoted Columns"
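Coming back to the row filter in the question: Record.Field expects a single field name, which is likely why passing several names at once fails. One possible approach is to select the fields by position and test their values together. The sketch below uses a hypothetical sample table in place of #"Removed Columns3", and it assumes that a row should be kept when any of the chosen columns contains "ENG" or "TRANS".

```powerquery
let
    // Hypothetical sample standing in for #"Removed Columns3"
    Source = #table(
        {"ID", "Part A", "Part B", "Part C"},
        {
            {1, "ENG", "X", "Y"},
            {2, "X", "Y", "Z"},
            {3, "X", "TRANS", "Y"}
        }
    ),
    // Translate the positions into column names once
    TargetColumns = List.Transform({1, 2, 3}, each Table.ColumnNames(Source){_}),
    // Keep rows where at least one target column holds a wanted value
    Filtered = Table.SelectRows(
        Source,
        each List.ContainsAny(
            Record.FieldValues(Record.SelectFields(_, TargetColumns)),
            {"ENG", "TRANS"}
        )
    )
in
    Filtered
```

If every chosen column has to match instead, List.ContainsAll with the arguments adapted accordingly would be the place to start.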
Forgive me, I'm very much a novice. I am experimenting with PQ in Excel to pull some sales data from a REST API: https://docs.vendhq.com/reference/2/spec/sales/listsales
My query looks like this:
let
MaxVersion = Excel.CurrentWorkbook(){[Name="Versions"]}[Content]{0}[Sales],
Source = Json.Document(Web.Contents("https://*****.vendhq.com/api/2.0/sales?after=" &Text.From(MaxVersion) , [Headers=[Authorization="Bearer **********************************"]])),
data = Source[data],
#"Converted to Table" = Table.FromList(data, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "outlet_id", "register_id", "user_id", "customer_id", "invoice_number", "source", "source_id", "status", "note", "short_code", "return_for", "total_price", "total_tax", "total_loyalty", "created_at", "updated_at", "sale_date", "deleted_at", "line_items", "payments", "adjustments", "version", "receipt_number", "total_price_incl", "taxes"}, {"Column1.id", "Column1.outlet_id", "Column1.register_id", "Column1.user_id", "Column1.customer_id", "Column1.invoice_number", "Column1.source", "Column1.source_id", "Column1.status", "Column1.note", "Column1.short_code", "Column1.return_for", "Column1.total_price", "Column1.total_tax", "Column1.total_loyalty", "Column1.created_at", "Column1.updated_at", "Column1.sale_date", "Column1.deleted_at", "Column1.line_items", "Column1.payments", "Column1.adjustments", "Column1.version", "Column1.receipt_number", "Column1.total_price_incl", "Column1.taxes"}),
#"Expanded Column1.line_items" = Table.ExpandListColumn(#"Expanded Column1", "Column1.line_items"),
#"Expanded Column1.line_items1" = Table.ExpandRecordColumn(#"Expanded Column1.line_items", "Column1.line_items", {"id", "product_id", "tax_id", "discount_total", "discount", "price_total", "price", "cost_total", "cost", "tax_total", "tax", "quantity", "loyalty_value", "note", "price_set", "status", "sequence", "gift_card_number", "tax_components", "promotions", "total_tax", "total_cost", "total_discount", "total_loyalty_value", "total_price", "unit_cost", "unit_discount", "unit_loyalty_value", "unit_price", "unit_tax", "is_return"}, {"Column1.line_items.id", "Column1.line_items.product_id", "Column1.line_items.tax_id", "Column1.line_items.discount_total", "Column1.line_items.discount", "Column1.line_items.price_total", "Column1.line_items.price", "Column1.line_items.cost_total", "Column1.line_items.cost", "Column1.line_items.tax_total", "Column1.line_items.tax", "Column1.line_items.quantity", "Column1.line_items.loyalty_value", "Column1.line_items.note", "Column1.line_items.price_set", "Column1.line_items.status", "Column1.line_items.sequence", "Column1.line_items.gift_card_number", "Column1.line_items.tax_components", "Column1.line_items.promotions", "Column1.line_items.total_tax", "Column1.line_items.total_cost", "Column1.line_items.total_discount", "Column1.line_items.total_loyalty_value", "Column1.line_items.total_price", "Column1.line_items.unit_cost", "Column1.line_items.unit_discount", "Column1.line_items.unit_loyalty_value", "Column1.line_items.unit_price", "Column1.line_items.unit_tax", "Column1.line_items.is_return"})
in
#"Expanded Column1.line_items1"
The query works great once. It passes a parameter from the Excel workbook to use as the version for pulling in sales after the last refresh (the parameter gets populated from a second query; I'm not sure whether it's possible to combine the queries).
The other problem is what happens on refresh when the data is empty (no new sales) or needs to be paginated. I can't quite work out the syntax for calling this iteratively as a function, with an escape clause for when the data returned is empty.
Seems appropriate to use List.Generate here.
I can't test the code below (as I don't have the API credentials or know the exact endpoint), so you'll need to check if it returns all/expected pages.
let
getSalesData = (maxVersion as number) =>
let
url = "https://YOUR_DOMAIN_PREFIX.vendhq.com/api/2.0/sales",
requestOptions = [
Query = [after = Number.ToText(maxVersion)],
Headers = [Authorization = "Bearer YOUR_API_KEY"]
],
response = Web.Contents(url, requestOptions),
deserialised = Json.Document(response)
in deserialised,
allPagesOfSalesData = List.Generate(
() => getSalesData(0),
each not List.IsEmpty([data]),
each getSalesData([version][max])
)
in
allPagesOfSalesData
I pass in 0 as an argument for the first API call even though the documentation (https://docs.vendhq.com/reference/introduction/pagination#first-page) states:
"By default, the value of the after parameter will be assumed as equal 0 so it's not necessary to use it on the first page."
The list allPagesOfSalesData stops being generated when the data field contains an empty list, which I think is what the documentation (https://docs.vendhq.com/reference/introduction/pagination#subsequent-pages) suggests:
"This should be repeated until an empty collection is returned. This will mean that all items of the collection have been returned."
But I've assumed that "empty collection" refers to the data field (in the JSON response).
You should get a list of responses (one record per page, in allPagesOfSalesData). You can convert each page's data field into a table and then combine the tables into a single table (using Table.Combine).
You might want to use Table.FromRecords (instead of the Table.FromList and Table.ExpandRecordColumn) when transforming the response.
Hopefully it works, but I can't test it.
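To show how the pieces might fit together without API access, here is a sketch in which getSalesData is replaced by a hypothetical stand-in that reads from three fake pages. The page contents and the fakePages record are invented; only the List.Generate loop and the Table.FromRecords / Table.Combine steps mirror the suggestions above.

```powerquery
let
    // Hypothetical fake pages keyed by the "after" version, standing in for the real API
    fakePages = [
        #"0" = [data = {[id = 1], [id = 2]}, version = [max = 2]],
        #"2" = [data = {[id = 3]}, version = [max = 3]],
        #"3" = [data = {}, version = [max = 3]]
    ],
    // Stand-in for the real Web.Contents call
    getSalesData = (maxVersion as number) as record =>
        Record.Field(fakePages, Number.ToText(maxVersion)),
    // Keep requesting pages until one comes back with an empty data list
    allPages = List.Generate(
        () => getSalesData(0),
        each not List.IsEmpty([data]),
        each getSalesData([version][max])
    ),
    // Each page is a record; turn its data list into a table, then stack the tables
    pageTables = List.Transform(allPages, each Table.FromRecords([data])),
    combined = Table.Combine(pageTables)
in
    combined
```

It is worth checking how the last step behaves when the very first page is already empty, since allPages is then an empty list.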
Suppose I have a list of strings
var data = new List<string>{"fname", "phone", "lname", "home", "home", "company", "phone", "phone"};
I would like to list all values and add index to duplicates like this
fname,
phone,
lname,
home,
home[1],
company,
phone[1],
phone[2]
or like this
fname,
phone[0],
lname,
home[0],
home[1],
company,
phone[1],
phone[2]
Both solutions would work for me.
Is that possible with Linq?
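You can use LINQ GroupBy to gather the matches, and then the counting version of Select to append the indexes. Note that GroupBy gathers equal values together, so the result is ordered by the first occurrence of each value (fname, phone, phone[1], phone[2], lname, ...) rather than by original position.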
var ans = data.GroupBy(d => d).SelectMany(dg => dg.Select((d, n) => n == 0 ? d : $"{d}[{n}]"));
Using Power Query in Microsoft Excel 2013, I created a table that looks like this:
// To insert this in Power Query, put a '=' before the 'Table.FromRows'
Table.FromRows(
{
{"0", "Tom", null, null},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
)
Now, I want to expand the columns Address and Wife by using the name attribute
on both records. Manually, I would do it like this:
// To insert this in Power Query, put a '=' before the 'let'
let
t = Table.FromRows(
{
{"0", "Tom", null, null},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
expAddress = Table.ExpandRecordColumn(t, "Address", {"name"}, {"Address - name"}),
expWife = Table.ExpandRecordColumn(expAddress, "Wife", {"name"}, {"Wife - name"})
in
expWife
Background
Whenever I have data tables that have a different layout, I need to rewrite the
query. In a fantasy world, you could expand all columns that have Records in
them using a specific key. Ideally, you would have the following library
functions:
// Returns a list with the names of the columns that match the specified type.
// Will also try to infer the type of a column if the table is untyped.
Table.ColumnsOfTypeInfer(
table as table,
listOfTypes as list
) as list
// Expands a column of records into columns with each of the values.
Table.ExpandRecordColumnByKey(
table as table,
columns as list,
key as text
) as table
Then, I could call
// To insert this in Power Query, put a '=' before the 'let'
let
t = Table.FromRows(
{
{"0", "Tom", null, null},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
recordColumns = Table.ColumnsOfTypeInfer(t, {type record}),
expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
expAll
Question
Can you get a list of columns with a specific type that is not specified in the table, aka infer it?
Can you make that record expansion generic?
Edit: Added row #0 with two null cells.
(First off, thanks for the clear explanation and sample data and suggestions!)
1) There's no way in M code to do type inference. This limitation might almost be considered a "feature", because if the source data changes in a way that causes the inferred type to be different, it will almost certainly break your query.
Once you load your untyped data, it should be quick to use the Detect Data Type button to generate the M for this. Or if you are reading data from JSON it should be mostly typed enough already.
If you have a specific scenario where this doesn't work, would you like to update your question? :)
2) It's very possible, and only a little convoluted, to make the record expansion generic, as long as the cell values of the table are records. This finds the columns where every row is either null or a record, and expands the name field of each.
Here are some simple implementations you can add to your library:
let
t = Table.FromRows(
{
{"0", "Tom", null, null},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
Table.ColumnsOfAllRowType = (table as table, typ as type) as list => let
ColumnNames = Table.ColumnNames(table),
ColumnsOfType = List.Select(ColumnNames, (name) =>
List.AllTrue(List.Transform(Table.Column(table, name), (cell) => Type.Is(Value.Type(cell), typ))))
in
ColumnsOfType,
Table.ExpandRecordColumnByKey = (table as table, columns as list, key as text) as table =>
List.Accumulate(columns, table, (state, columnToExpand) =>
Table.ExpandRecordColumn(state, columnToExpand, {key}, { columnToExpand & " - " & key })),
recordColumns = Table.ColumnsOfAllRowType(t, type nullable record),
expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
expAll
If a new library function can be implemented in just M we're less likely to add it to our standard library, but if you feel it is missing feel free to suggest it at: https://ideas.powerbi.com/forums/265200-power-bi/
You might have a good argument for adding something like Table.ReplaceTypeFromFirstRow(table as table) as table, because constructing the type with M is very messy.
Sorry to come to this a bit late, but I just had a similar challenge. I tried using Chris Webb's ExpandAll function:
http://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/
... but that only works on Table-type columns, not Record-type columns. I have managed to hack it for that purpose, though. I duplicated Chris' function as "ExpandAllRecords" and made 3 edits:
replaced each if _ is table then Table.ColumnNames(_) with each if _ is record then Record.FieldNames(_)
replaced Table.ExpandTableColumn with Table.ExpandRecordColumn
replaced ExpandAll with ExpandAllRecords
I tried getting both tables and records expanding in one function, but I kept getting type errors.
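Applied to Chris Webb's function, the three edits above could look roughly like the sketch below. It is written as a self-contained let expression, so the recursive call uses @ExpandAllRecords; the step names FieldsToExpand and CanExpand are renamed for readability and are not from the original.

```powerquery
let
    // Sketch of the record-expanding variant; takes a table and an optional column number
    ExpandAllRecords = (TableToExpand as table, optional ColumnNumber as number) as table =>
        let
            // Start at column 0 when no column number is given
            ActualColumnNumber = if ColumnNumber = null then 0 else ColumnNumber,
            ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
            ColumnContents = Table.Column(TableToExpand, ColumnName),
            // Edit 1: collect field names from record values instead of table values
            FieldsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents,
                each if _ is record then Record.FieldNames(_) else {}))),
            NewColumnNames = List.Transform(FieldsToExpand, each ColumnName & "." & _),
            CanExpand = List.Count(FieldsToExpand) > 0,
            // Edit 2: expand with Table.ExpandRecordColumn
            ExpandedTable = if CanExpand
                then Table.ExpandRecordColumn(TableToExpand, ColumnName, FieldsToExpand, NewColumnNames)
                else TableToExpand,
            // Stay on the same position after an expansion, otherwise move right
            NextColumnNumber = if CanExpand then ActualColumnNumber else ActualColumnNumber + 1,
            // Edit 3: recurse into the renamed function
            OutputTable = if NextColumnNumber > Table.ColumnCount(ExpandedTable) - 1
                then ExpandedTable
                else @ExpandAllRecords(ExpandedTable, NextColumnNumber)
        in
            OutputTable
in
    ExpandAllRecords
```

Saved as a query named ExpandAllRecords, it can be called as ExpandAllRecords(t).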
Anyway, with that in place, the final query is just:
let
t = Table.FromRows(
{
{"1", "Tom", null, [ name="Jane", age=35 ]},
{"2", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"3", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
Output = ExpandAllRecords(t)
in
Output
Edit:
Out of concern that the great snippet (by Chris Webb, mentioned by @MikeHoney) might one day disappear, I'll mirror the entire code here:
let
//Define function taking two parameters - a table and an optional column number
Source = (TableToExpand as table, optional ColumnNumber as number) =>
let
//If the column number is missing, make it 0
ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,
//Find the column name relating to the column number
ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
//Get a list containing all of the values in the column
ColumnContents = Table.Column(TableToExpand, ColumnName),
//Iterate over each value in the column and then
//If the value is of type table get a list of all of the columns in the table
//Then get a distinct list of all of these column names
ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents,
each if _ is table then Table.ColumnNames(_) else {}))),
//Append the original column name to the front of each of these column names
NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),
//Is there anything to expand in this column?
CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,
//If this column can be expanded, then expand it
ExpandedTable = if CanExpandCurrentColumn
then
Table.ExpandTableColumn(TableToExpand, ColumnName,
ColumnsToExpand, NewColumnNames)
else
TableToExpand,
//If the column has been expanded then keep the column number the same, otherwise add one to it
NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,
//If the column number is now greater than the number of columns in the table
//Then return the table as it is
//Else call the ExpandAll function recursively with the expanded table
OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1)
then
ExpandedTable
else
ExpandAll(ExpandedTable, NextColumnNumber)
in
OutputTable
in
Source
You can then use this function on the XML file as follows:
let
//Load XML file
Source = Xml.Tables(File.Contents("C:\Users\Chris\Documents\PQ XML Expand All Demo.xml")),
ChangedType = Table.TransformColumnTypes(Source,{{"companyname", type text}}),
//Call the ExpandAll function to expand all columns
Output = ExpandAll(ChangedType)
in
Output
(Source and downloadable example: Chris Webb's Bi Blog, 2014-05-21)