Expand List to same row in Power Query - powerquery

I'm new to Power Query and I'm parsing JSON. I have an array named "categories". When I expand it using Power Query, it creates three rows, one for each category, but I want to stay on one row and create three separate columns, one per category, like category1, category2, category3.
Here is my code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"no", type text}, {"complete", Int64.Type}, {"json", type text}}),
#"Parsed JSON" = Table.TransformColumns(#"Changed Type",{{"json", Json.Document}}),
#"Expanded json" = Table.ExpandRecordColumn(#"Parsed JSON", "json", {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"}, {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"})
in
#"Expanded json"

Add an extra step that combines the list into one comma-separated text value in the same row:
ExtractList = Table.TransformColumns(#"Expanded json", {"categories", each Text.Combine(List.Transform(_, Text.From), ","), type text})

Related

How do I filter for only the latest versions of a data line in Power Query?

I have a dataset where new data is appended as a 'Change' (last column) and numbered, with the higher number being the latest outcome. I'm trying to correctly filter the latest rows based on the highest change number available and get a result as per the image below.
While I've managed to get the right rows with columns Contract, Period, Person Company & Person Name, I can't seem to get the rest of the data to appear. Can anyone suggest what I'm missing?
EDIT - GOT SOMETHING WORKING
I don't understand it yet, but the below code from another question finally worked. Thanks to Olly on Power Query - Group by MAX Column Value
let
Partitions = Table.Group(Sheet1, {"Person Name"}, {{"Data", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table}}),
Combined = Table.Combine(Partitions[Data])
in
Combined
Here is another approach (assuming that your original dataset is formatted as a table called Table1).
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type datetime}, {"Person Company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", type text}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Person Name"}, {{"AllData", each _, type table [Contract=nullable number, Period=nullable datetime, Person Company=nullable text, Person Name=nullable text, Person Role=nullable text, Gender=nullable text, Age=nullable text, Employment Costs=nullable number, Monthly Hours=nullable number, Change=nullable number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Max_Change", each List.Max([AllData][Change])),
#"Expanded AllData" = Table.ExpandTableColumn(#"Added Custom", "AllData", {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}),
#"Added Custom1" = Table.AddColumn(#"Expanded AllData", "Filter", each if [Change] = [Max_Change] then "Yes" else null),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Filter] = "Yes")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Filter", "Max_Change"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Contract", "Period", "Person Company", "Person Name", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in
#"Reordered Columns"
The idea is to create a column that carries the maximum change (by using List.Max()) for each person and then keep only the rows that satisfy the condition [Change] = [Max_Change].
You basically have the answer already, but here are step-by-step instructions:
Select all the columns you want to group on (contract, period, person company, person name), right-click and choose Group By.
Use the operation All Rows and hit OK.
Go into Home > Advanced Editor and, on the group line, replace
... each _, type table [...}})
with
.... each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }})
Hit Done, then use the arrows atop the new column to expand the extra columns.
That sorts each group in descending order by Change number and then takes only the first row.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type date}, {"Person company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", Int64.Type}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows1" = Table.Group(#"Changed Type", {"Contract", "Period", "Person company", "Person Name"}, {{"Count", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows1", "Count", {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in #"Expanded Count"

How to remove duplicates in a text string with Power Query

As the title says, after a few steps (group by, filter, combine text, ...) I have a problem removing duplicates within the same cell in Power Query.
Example: the column "cc_emails" has many rows, but each row contains some duplicated emails because of the earlier Text.Combine step.
Something like this: "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
I would like each email to appear only once in the list. Can someone help with this?
Expected output:
"Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
Update: here is my query from the Query Editor:
let
Source = Exchange.Contents("giang.phan#abc.com"),
Mail1 = Source{[Name="Mail"]}[Data],
#"Reordered Columns" = Table.ReorderColumns(Mail1,{"DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients", "CcRecipients", "BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes", "Body", "Id"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each [DateTimeReceived] > #datetime(2021, 12, 29, 0, 0, 0) and [DateTimeReceived] < #datetime(2022, 1, 4, 0, 0, 0)),
#"Expanded ToRecipients" = Table.ExpandTableColumn(#"Filtered Rows", "ToRecipients", {"Address"}, {"ToRecipients.Address"}),
#"Expanded CcRecipients" = Table.ExpandTableColumn(#"Expanded ToRecipients", "CcRecipients", {"Address"}, {"CcRecipients.Address"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded CcRecipients",{"BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes"}),
#"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients.Address", "CcRecipients.Address", "Body"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
#"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address"}, {{"Last_time receive", each List.Max([DateTimeReceived]), type datetime}, {"Last_subject", each List.Max([Subject]), type nullable text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows1",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address", "Last_time receive", "Last_subject"})
in
#"Removed Other Columns"
You can split the text by delimiter, select the distinct list values, then recombine as a string:
let
Source = "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com",
#"Distinct Values" = Text.Combine(List.Distinct(Text.Split(Source, ", ")),", ")
in
#"Distinct Values"
Edit after question update:
In your case, you can simply change this line:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
to include the List.Distinct function, and return only distinct address values:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine(List.Distinct([CcRecipients.Address]),", "), type text}, {"to_address", each Text.Combine(List.Distinct([ToRecipients.Address]),", "), type text}}),

Problem with WebDataRocks Pivot Table captions in rows, columns, measures

I have 3 questions about captions (data localization):
How do I provide a caption for a field that is not included in the slice object?
Why are captions ignored in toolbar -> fields -> all fields -> any measure field?
Why are captions ignored while adding a calculated value: toolbar -> fields -> add calculated value -> any row or column field?
Check out this JS fiddle:
var pivot = new WebDataRocks({
    container: "#wdr-component",
    toolbar: true,
    height: 395,
    report: {
        dataSource: {
            filename: "https://cdn.webdatarocks.com/data/data.csv"
        },
        "slice": {
            "rows": [{
                "uniqueName": "Category",
                "caption": "Category_Localized"
            }],
            "columns": [{
                "uniqueName": "Color",
                "caption": "Color_Localized"
            }],
            "measures": [{
                "uniqueName": "Price",
                "aggregation": "sum",
                "caption": "Price_Localized"
            }]
        }
    }
});
It seems like all these features are currently not working as expected.
As a workaround to the first question though you can simply change the field names in the CSV file you're feeding to the pivot table.

PowerQuery to Convert List of Records to Delimited String

Given the below JSON I'm trying to load it into Excel. The "Ratings" section I would like to format into a single delimited string/cell. I'm pretty new to PowerQuery so I'm struggling to do this. I've managed to format the list of Records to its own table, but concatenating this into a string and adding it back into my source table is where I'm drawing a blank. Any help would be appreciated.
PowerQuery
let
Source = Json.Document(File.Contents("C:\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" })
in
LastStep
JSON
{
"Title": "Iron Man",
"Year": "2008",
"Rated": "PG-13",
"Ratings": [{
"Source": "Internet Movie Database",
"Value": "7.9/10"
}, {
"Source": "Rotten Tomatoes",
"Value": "93%"
}, {
"Source": "Metacritic",
"Value": "79/100"
}
]
}
Ultimately, something like below would be ideal.
How about this?
let
Source = Json.Document(File.Contents("C:\filename.json")),
#"Converted to Table" = Record.ToTable(Source),
#"Transposed Table" = Table.Transpose(#"Converted to Table"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Title", type text}, {"Rated", type text}, {"Year", Int64.Type}}),
#"Expanded Ratings" = Table.ExpandListColumn(#"Changed Type", "Ratings"),
#"Expanded Ratings1" = Table.ExpandRecordColumn(#"Expanded Ratings", "Ratings", {"Source", "Value"}, {"Source", "Value"}),
#"Added Custom" = Table.AddColumn(#"Expanded Ratings1", "Custom", each [Source] & "=" & [Value]),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Title", "Year", "Rated"}, {{"Ratings", each Text.Combine([Custom],"#(lf)"), type text}})
in
#"Grouped Rows"
Most of the steps here are fairly clear from their name and are produced through GUI controls. The one trickier step is where I use a custom aggregator when doing the grouping. If you use the GUI, Text.Combine is not an option in the Group By dialog box, so I selected Max (which becomes List.Max in the code) and replaced that with Text.Combine to concatenate with the line feed character as the separator.
This concatenates Source and Value with a pipe character into one column. Is that what you want?
let
Source = Json.Document(File.Contents("C:\temp\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" }),
#"Added Custom" = Table.AddColumn(LastStep, "Concat", each [Source]&"|"&[Value]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Concat"})
in #"Removed Other Columns"

jqGrid - hide specific column or row in pivot

Suppose we want to hide a column (month 7, marked with a black border)
or hide the columns of a whole year (year 2015, marked with a green border).
Is there any solution or jqGrid option that we can use for this?
The jqPivot method has no special option that hides columns directly, but one can use the beforeInitGrid callback to modify colModel before the grid is created. The only problem is that one has to understand the exact naming convention that jqPivot uses for the columns in order to write a correct beforeInitGrid callback. So I first describe some internal structures of jqPivot, after which the code of the beforeInitGrid callback will be easy to understand. I explain the problem based on an example. I also recommend reading the wiki article, which provides additional information about jqPivot as implemented in free jqGrid 4.9.0.
First of all, I have to remind you that jqPivot takes input data, indexes it based on the xDimension and yDimension options, and then calculates an aggregation function over all items with the same x and y values. The aggregation function is specified by the aggregates parameter. In other words, jqPivot is a "pre-processor" of the input data: it analyses the data and generates new data and a colModel that display the original data in a more compact form.
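The "index by x and y, then aggregate" idea can be sketched in plain JavaScript. This is only an illustration of the principle, not jqPivot's own implementation; the function name pivotSum, the "|" key separator and the three sample rows are assumptions made for the sketch.

```javascript
// Hypothetical sketch (not jqPivot code): sum one member over all rows
// that share the same x key and the same y key.
function pivotSum(rows, xNames, yNames, member) {
    var result = {};
    rows.forEach(function (row) {
        // Build a composite key from the x and y dimension values of the row.
        var xKey = xNames.map(function (n) { return row[n]; }).join("|");
        var yKey = yNames.map(function (n) { return row[n]; }).join("|");
        result[xKey] = result[xKey] || {};
        result[xKey][yKey] = (result[xKey][yKey] || 0) + parseFloat(row[member]);
    });
    return result;
}

var sample = [
    { CategoryName: "none", ProductName: "beauty", Price: "93.81", sellmonth: "12", sellyear: "2011" },
    { CategoryName: "none", ProductName: "beauty", Price: "93.81", sellmonth: "12", sellyear: "2011" },
    { CategoryName: "none", ProductName: "beauty", Price: "93.81", sellmonth: "12", sellyear: "2015" }
];

var sums = pivotSum(sample, ["CategoryName", "ProductName"], ["sellyear", "sellmonth"], "Price");
// sums["none|beauty"] holds one total per y key: "2011|12" and "2015|12".
```

The two 2011 rows collapse into one cell, while the 2015 row gets its own cell, which is the compaction that jqPivot performs before the grid is built.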
To implement your requirements, one needs to understand which column names jqPivot uses in the generated colModel, and how to get the corresponding y values for a column.
For example we have the following input data:
var data = [{
CategoryName: "Baby", ProductName: "Baby Oil",
Price: "193.81", Quantity: "1",
sellmonth: "7", sellyear: "2011", week: "first"
}, {
CategoryName: "Mom", ProductName: "Shampoo",
Price: "93.81", Quantity: "1",
sellmonth: "12", sellyear: "2011", week: "first"
}, {
CategoryName: "none", ProductName: "beauty",
Price: "93.81", Quantity: "1",
sellmonth: "12", sellyear: "2011", week: "second"
}, {
CategoryName: "none", ProductName: "beauty",
Price: "93.81", Quantity: "1",
sellmonth: "12", sellyear: "2011", week: "third"
}, {
CategoryName: "none", ProductName: "Shampoo",
Price: "105.37", Quantity: "2",
sellmonth: "12", sellyear: "2011", week: "third"
}, {
CategoryName: "none", ProductName: "beauty",
Price: "93.81", Quantity: "1",
sellmonth: "12", sellyear: "2015", week: "second"
}];
and we use as jqPivot options
$("#pvtCrewAttendance").jqGrid("jqPivot",
data,
{
footerTotals: true,
footerAggregator: "sum",
totals: true,
totalHeader: "Grand Total",
totalText: "<span style='font-style: italic'>Grand {0} {1}</span>",
xDimension: [
{ dataName: "CategoryName", label: "Category Name", sortorder: "desc" },
{ dataName: "ProductName", label: "Product Name", footerText: "Total:" }
],
yDimension: [
{ dataName: "sellyear", sorttype: "integer", totalHeader: "Total in {0}" },
{ dataName: "sellmonth", sorttype: "integer" }//,
//{ dataName: "week" }
],
aggregates: [
{ member: "Price", aggregator: "sum", summaryType: "sum", label: "{1}" },
{ member: "Quantity", aggregator: "sum", summaryType: "sum", label: "{1}" }
]
},
{/* jqGrid options ...*/});
The resulting pivot grid will be displayed on the demo:
The above options mean that the unique values of the CategoryName and ProductName properties of the input data build the x-values (the first rows of the grid). They are
[["Baby", "Baby Oil"], ["Mom", "Shampoo"], ["none", "beauty"], ["none", "Shampoo"]]
The above array is xIndex. In the same way the unique y-values are
[["2011", "7"], ["2011", "12"], ["2015", "12"]]
These values build the columns of colModel. If one uses the totalHeader, totalText or totals: true properties in some yDimension, then additional columns with the total over the group will be included. The above example uses totalHeader for dataName: "sellyear". This means that two additional columns, one per aggregate (sum of Price and sum of Quantity), will be inserted after the columns having sellyear "2011" and again after those having sellyear "2015".
The first columns of the grid are named "x0" and "x1" (one per item in xDimension). Then come the columns whose names start with y and end with a0 or a1 (one per item in aggregates). The final two "total" columns are named "ta0" and "ta1" (again one per item in aggregates). If aggregates contains only one element, the a0 and a1 suffixes are omitted from the column names that start with y or t. The group total columns have names that start with y, have t in the middle and a at the end (like y1t0a0). I include the column names for the example above.
I hope that one can see the column names, which I wrote in red. They are the name values of all 14 columns: x0, x1, y0a0, y0a1, y1a0, y1a1, y1t0a0, y1t0a1, y2a0, y2a1, y2t0a0, y2t0a1, ta0, ta1.
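To make the naming pattern concrete, here is a small sketch that rebuilds the 14 names listed above from the y-values. It is not taken from the jqPivot source; the helper pivotColumnNames is hypothetical, and it assumes one level of group totals (on the first y dimension) and at least two aggregates, as in this example.

```javascript
// Hypothetical helper (not jqPivot code): mirrors the naming pattern described above.
function pivotColumnNames(xCount, yItems, aggCount) {
    var names = [], i, a, isLastInGroup;
    for (i = 0; i < xCount; i++) {
        names.push("x" + i);
    }
    for (i = 0; i < yItems.length; i++) {
        for (a = 0; a < aggCount; a++) {
            names.push("y" + i + "a" + a);
        }
        // A group total follows the last y item of each sellyear group.
        isLastInGroup = i === yItems.length - 1 || yItems[i + 1][0] !== yItems[i][0];
        if (isLastInGroup) {
            for (a = 0; a < aggCount; a++) {
                names.push("y" + i + "t0a" + a);
            }
        }
    }
    for (a = 0; a < aggCount; a++) {
        names.push("ta" + a);
    }
    return names;
}

var yItems = [["2011", "7"], ["2011", "12"], ["2015", "12"]];
var names = pivotColumnNames(2, yItems, 2);
// names: x0, x1, y0a0, y0a1, y1a0, y1a1, y1t0a0, y1t0a1,
//        y2a0, y2a1, y2t0a0, y2t0a1, ta0, ta1
```

Note that the number after y is the index into the y-values array, so a group total such as y1t0a0 carries the index of the last item in its group.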
It's important to mention that jqPivot stores the xIndex and yIndex used for building the pivot table. To be exact, one can get the pivotOptions parameter of jqGrid and examine the xIndex.items and yIndex.items properties. One will see the arrays of items that I included above.
One now has enough information to understand the beforeInitGrid callback below, which the demo uses to hide the columns you asked about:
beforeInitGrid: function () {
    var $self = $(this), p = $self.jqGrid("getGridParam"),
        yItems = p.pivotOptions.yIndex.items, matches, iy, y,
        colModel = p.colModel, i, cm, l = colModel.length, cmName;
    for (i = 0; i < l; i++) {
        cm = colModel[i];
        cmName = cm.name;
        if (cmName.charAt(0) === "y") { // x, y, t
            // aggregation column; the last group is optional because the
            // "a" suffix is missing when there is only one aggregate
            matches = /^([xy])(\d+)(t(\d+))?(a)?(\d+)?/.exec(cmName);
            if (matches !== null && matches.length > 1) {
                // matches[2] - iy - index of the y item
                // matches[4] - undefined or total group index
                // matches[6] - ia - aggregation index (may be undefined)
                iy = parseInt(matches[2], 10);
                y = yItems[iy];
                // hide all of year 2015 and month 7 of year 2011
                if (y != null && (y[0] === "2015" || (y[0] === "2011" && y[1] === "7"))) {
                    cm.hidden = true;
                }
            }
        }
    }
}
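The hiding rule can be checked outside the grid with a small standalone sketch. The function isHiddenColumn is a hypothetical name, and the simplified regular expression is an assumption that covers only the column names of this example; the y-values and the 14 column names are the ones listed earlier.

```javascript
// Hypothetical standalone check (no jqGrid needed): decide from the column
// name alone whether the callback above would hide the column.
function isHiddenColumn(cmName, yItems) {
    // y<index>, optionally followed by t<group> and a<aggregate>
    var matches = /^y(\d+)(?:t\d+)?(?:a\d+)?$/.exec(cmName), y;
    if (matches === null) {
        return false; // x columns and the grand totals (ta0, ta1) stay visible
    }
    y = yItems[parseInt(matches[1], 10)];
    return y != null && (y[0] === "2015" || (y[0] === "2011" && y[1] === "7"));
}

var yIndexItems = [["2011", "7"], ["2011", "12"], ["2015", "12"]];
var allNames = ["x0", "x1", "y0a0", "y0a1", "y1a0", "y1a1", "y1t0a0", "y1t0a1",
    "y2a0", "y2a1", "y2t0a0", "y2t0a1", "ta0", "ta1"];

var hidden = allNames.filter(function (name) {
    return isHiddenColumn(name, yIndexItems);
});
// hidden: y0a0, y0a1, y2a0, y2a1, y2t0a0, y2t0a1
```

The "Total in 2011" columns (y1t0a0, y1t0a1) stay visible because their index points at the 2011/12 item, while the "Total in 2015" columns are hidden together with the rest of 2015.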

Resources