How do I filter for only the latest versions of a data line in Power Query? - powerquery

I have a dataset where new data is appended as a 'Change' (last column) and numbered, with the higher number being the latest outcome. I'm trying to correctly filter the latest rows based on the highest change number available and get a result as per the image below.
While I've managed to get the right rows with columns Contract, Period, Person Company & Person Name, I can't seem to get the rest of the data to appear. Can anyone suggest what I'm missing?
EDIT - GOT SOMETHING WORKING
I don't understand it yet, but the below code from another question finally worked. Thanks to Olly on Power Query - Group by MAX Column Value
let
Partitions = Table.Group(Sheet1, {"Person Name"}, {{"Data", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table}}),
Combined = Table.Combine(Partitions[Data])
in
Combined
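For anyone else who finds the M above hard to read: it groups the rows by person, sorts each group by Change in descending order, and keeps only the first row of each group. Here is a minimal Python sketch of that same logic. The sample rows and the helper name `latest_per_group` are made up for illustration; only the column names come from the question.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows; only the column names come from the question.
rows = [
    {"Person Name": "Ann", "Monthly Hours": 100, "Change": 0},
    {"Person Name": "Ann", "Monthly Hours": 120, "Change": 1},
    {"Person Name": "Bob", "Monthly Hours": 80, "Change": 0},
]

def latest_per_group(rows, key="Person Name", version="Change"):
    """Group by `key`, sort each group by `version` descending, keep the first row."""
    by_key = itemgetter(key)
    result = []
    # groupby only merges adjacent items, so sort by the grouping key first.
    for _, group in groupby(sorted(rows, key=by_key), key=by_key):
        ordered = sorted(group, key=itemgetter(version), reverse=True)
        result.append(ordered[0])  # plays the role of Table.FirstN(..., 1)
    return result

latest = latest_per_group(rows)
```

Because the whole row is kept, every other column comes along automatically, which is the part that was missing in the original attempt.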

Here is another approach (assuming that your original dataset is formatted as a table called Table1).
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type datetime}, {"Person Company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", type text}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Person Name"}, {{"AllData", each _, type table [Contract=nullable number, Period=nullable datetime, Person Company=nullable text, Person Name=nullable text, Person Role=nullable text, Gender=nullable text, Age=nullable text, Employment Costs=nullable number, Monthly Hours=nullable number, Change=nullable number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Max_Change", each List.Max([AllData][Change])),
#"Expanded AllData" = Table.ExpandTableColumn(#"Added Custom", "AllData", {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}),
#"Added Custom1" = Table.AddColumn(#"Expanded AllData", "Filter", each if [Change] = [Max_Change] then "Yes" else null),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Filter] = "Yes")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Filter", "Max_Change"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Contract", "Period", "Person Company", "Person Name", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in
#"Reordered Columns"
The idea is to create a column that carries the maximum change (using List.Max()) for each person, and then keep only the rows that satisfy the condition [Change] = [Max_Change].
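The two-step idea (compute the maximum per person, then filter on it) can be sketched outside Power Query as well. The Python below is only an illustration with invented sample rows; the names `max_change` and `filtered` are assumptions that mirror the Max_Change column and the filtered step.

```python
# Hypothetical sample rows; only the column names come from the answer.
rows = [
    {"Person Name": "Ann", "Age": "34", "Change": 0},
    {"Person Name": "Ann", "Age": "35", "Change": 2},
    {"Person Name": "Bob", "Age": "41", "Change": 1},
]

# Step 1: compute the maximum Change per person (the Max_Change column).
max_change = {}
for row in rows:
    name = row["Person Name"]
    max_change[name] = max(row["Change"], max_change.get(name, row["Change"]))

# Step 2: keep only rows whose Change equals that person's maximum.
filtered = [r for r in rows if r["Change"] == max_change[r["Person Name"]]]
```

One difference from the sort-and-take-first approach: if two rows for the same person share the highest Change number, this filter keeps both of them.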

You basically have the answer already, but here are step-by-step instructions:
Select all the columns you want to group on (Contract, Period, Person Company, Person Name), then right-click and choose Group By.
Use the operation All Rows and hit OK.
Go into Home > Advanced Editor and, on the group line, replace
... each _, type table [...}})
with
.... each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }})
Hit Done, then use the arrows atop the new column to expand the extra columns.
That sorts each group in descending order by Change number, then takes only the first row.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type date}, {"Person company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", Int64.Type}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows1" = Table.Group(#"Changed Type", {"Contract", "Period", "Person company", "Person Name"}, {{"Count", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows1", "Count", {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in #"Expanded Count"

Related

How to remove duplicates in a text string with Power Query

As the title says, after a few steps (group by, filter, combine text, ...), I am having trouble removing duplicates within a single cell in Power Query.
Example: the column "cc_emails" has many rows, but each row contains some duplicated emails because of an earlier Text.Combine step.
Something like this: "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
I would like each email to appear only once in the list. Can someone help me with this?
Expected output:
"Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
Update: here is my query from the Query Editor:
let
Source = Exchange.Contents("giang.phan#abc.com"),
Mail1 = Source{[Name="Mail"]}[Data],
#"Reordered Columns" = Table.ReorderColumns(Mail1,{"DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients", "CcRecipients", "BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes", "Body", "Id"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each [DateTimeReceived] > #datetime(2021, 12, 29, 0, 0, 0) and [DateTimeReceived] < #datetime(2022, 1, 4, 0, 0, 0)),
#"Expanded ToRecipients" = Table.ExpandTableColumn(#"Filtered Rows", "ToRecipients", {"Address"}, {"ToRecipients.Address"}),
#"Expanded CcRecipients" = Table.ExpandTableColumn(#"Expanded ToRecipients", "CcRecipients", {"Address"}, {"CcRecipients.Address"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded CcRecipients",{"BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes"}),
#"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients.Address", "CcRecipients.Address", "Body"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
#"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address"}, {{"Last_time receive", each List.Max([DateTimeReceived]), type datetime}, {"Last_subject", each List.Max([Subject]), type nullable text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows1",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address", "Last_time receive", "Last_subject"})
in
#"Removed Other Columns"
You can split the text by delimiter, select the distinct list values, then recombine as a string:
let
Source = "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com",
#"Distinct Values" = Text.Combine(List.Distinct(Text.Split(Source, ", ")),", ")
in
#"Distinct Values"
Edit after question update:
In your case, you can simply change this line:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
to include the List.Distinct function, and return only distinct address values:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine(List.Distinct([CcRecipients.Address]),", "), type text}, {"to_address", each Text.Combine(List.Distinct([ToRecipients.Address]),", "), type text}}),
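The split, distinct, recombine pattern used in this answer is not specific to M. As a rough illustration, here is the same idea in Python, using the sample string from the question; the function name `distinct_emails` is made up for this sketch.

```python
def distinct_emails(combined: str, sep: str = ", ") -> str:
    """Split on the separator, drop repeats (keeping first-seen order), rejoin."""
    parts = combined.split(sep)
    # dict.fromkeys preserves insertion order, so the first occurrence wins.
    return sep.join(dict.fromkeys(parts))

source = (
    "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, "
    "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
)
deduped = distinct_emails(source)
```

The comparison in this sketch is case-sensitive, so two copies of an address that differ only in letter case would both be kept.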

Expand List to same row in Power Query

I'm new to Power Query and I'm parsing JSON. I have an array named "categories". When I expand it using Power Query, it creates three rows, one for each category, whereas I want to stay on one row and create three separate columns, one per category, such as category1, category2, category3.
Here is my code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"no", type text}, {"complete", Int64.Type}, {"json", type text}}),
#"Parsed JSON" = Table.TransformColumns(#"Changed Type",{{"json", Json.Document}}),
#"Expanded json" = Table.ExpandRecordColumn(#"Parsed JSON", "json", {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"}, {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"})
in
#"Expanded json"
Add extra step:
ExtractList = Table.TransformColumns(#"Expanded json", {"categories", each Text.Combine(List.Transform(_, Text.From), ","), type text})
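Note that the step above keeps one row by joining the list into a single comma-separated text value, rather than creating one column per category. The Python sketch below shows both shapes side by side so the difference is clear; the record and its category values are invented for illustration.

```python
# Hypothetical parsed JSON record; the category values are made up.
record = {"title": "Widget", "categories": ["Home", "Kitchen", "Tools"]}

# What the extra step does: turn the list into one comma-separated text value.
as_text = ",".join(str(c) for c in record["categories"])

# The shape the question describes: one numbered column per list item.
as_columns = {
    f"category{i}": c for i, c in enumerate(record["categories"], start=1)
}
```

In Power Query, the joined text could presumably be split by delimiter afterwards to reach the second shape, but that is a further step not shown in the answer.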

Laravel/Eloquent, retrieve data dynamically (whereIn, sum)

I am pretty new to Laravel, and I would like to retrieve data from the DB in an "automated" way. I have a query that gets the value of the key "cases" from the latest row for a given 'state':
$pieWyoming=State::select('cases')->where('state', '=', 'Wyoming')->orderByDesc('id')->limit(1)->get()->sum('cases');
But I want to do this with whereIn
$statesArr = array("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas","Kentucky","Louisiana","Maine","Maryland","Massachusetts","Michigan","Minnesota","Mississippi","Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virgin Islands", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming");
$Wyoming=State::select('cases')->whereIn('state',$statesArr)->orderByDesc('id')->limit(1)->get()->sum('cases');
But it seems that this goes through $statesArr and only gets the last value. That is correct, but it is only one value from one state; I want ALL VALUES from ALL states.
EDIT, SAMPLE DATA AND EXPECTED OUTPUT
The database holds data like this: https://res.cloudinary.com/dcdajhifi/image/upload/v1599003808/ex1_ys5ksi.png
and I would like to get only the value of the field "cases" for each state in the array $statesArr, taken from the last row in which that state appears, for example
https://res.cloudinary.com/dcdajhifi/image/upload/v1599003808/ex2_lby8md.png
Here is the last time ALABAMA appears, so I would like to get the value of cases for this row, and the same for each state. That way I could create a panel where I display each state name with its cases and deaths values, without running a query for each state.
To pick the latest entry (based on latest autoincrement id) for each state you can do a self join with your table like
DB::table('states as a')
    ->select('a.*')
    ->leftJoin('states as b', function ($join) {
        $join->on('a.state', '=', 'b.state')
            ->whereRaw('a.id < b.id');
    })
    ->whereNull('b.id')
    ->whereIn('a.state', $statesArr)
    ->get();
If you are using a recent version of Laravel, you can rewrite the above using an inner join with joinSub:
$latestStates = DB::table('states')
    ->select(DB::raw('max(id) as max_id'))
    ->groupBy('state');
$states = DB::table('states as a')
    ->joinSub($latestStates, 'b', function ($join) {
        $join->on('a.id', '=', 'b.max_id');
    })
    ->whereIn('a.state', $statesArr)
    ->get();
Or you can use whereExists
$states = DB::table('states as a')
    ->whereExists(function ($query) {
        $query->select(DB::raw(1))
            ->from('states as b')
            ->whereRaw('a.state = b.state')
            ->groupBy('b.state')
            ->havingRaw('max(b.id) = a.id');
    })
    ->whereIn('a.state', $statesArr)
    ->get();
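The self-join used in the first snippet of this answer may be easier to follow as plain SQL. The sketch below runs roughly that query with Python's sqlite3 against made-up rows. It approximates the idea and is not the exact SQL Laravel generates; the table contents are assumptions, since the real data is only shown as a screenshot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE states (id INTEGER PRIMARY KEY, state TEXT, cases INTEGER)"
)
# Hypothetical sample data; ids are assigned 1..5 in insertion order.
conn.executemany(
    "INSERT INTO states (state, cases) VALUES (?, ?)",
    [("Alabama", 10), ("Wyoming", 5), ("Alabama", 25), ("Wyoming", 7), ("Texas", 99)],
)

wanted = ["Alabama", "Wyoming"]
placeholders = ", ".join("?" for _ in wanted)

# Anti-join: keep row a only if no row b exists for the same state with a higher id.
latest = conn.execute(
    f"""
    SELECT a.state, a.cases
    FROM states AS a
    LEFT JOIN states AS b ON a.state = b.state AND a.id < b.id
    WHERE b.id IS NULL AND a.state IN ({placeholders})
    ORDER BY a.state
    """,
    wanted,
).fetchall()
```

Each state in the list comes back exactly once, with the cases value from its highest id, so a single query can feed the whole panel.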
Try it like this (note that this sums cases over all matching rows, not only the latest row per state):
$statesArr = array("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas","Kentucky","Louisiana","Maine","Maryland","Massachusetts","Michigan","Minnesota","Mississippi","Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virgin Islands", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming");
$Wyoming = State::whereIn('state', $statesArr)->orderByDesc('id')->sum('cases');

Why is region_code not being stored properly?

I am using the Magento GraphQL API in my project. To create a customer address I used the createCustomerAddress mutation.
Below is the mutation that I called to create the customer address:
mutation createAddress {
createCustomerAddress(
input: {
firstname: "test"
lastname: "name"
company: "networld"
telephone: "1231231231"
street: ["test address line 1", "test address line 2"]
city: "Rajkot"
region: { region:"Gujarat", region_code: "GJ" }
postcode: "360001"
country_code: IN
}
) {
id
prefix
firstname
lastname
middlename
city
company
country_code
default_billing
default_shipping
postcode
region {
region
region_code
}
street
suffix
telephone
vat_id
}
}
This works properly and returns the result below:
{
"data": {
"createCustomerAddress": {
"id": 44,
"prefix": null,
"firstname": "test",
"lastname": "name",
"middlename": null,
"city": "Rajkot",
"company": "networld",
"country_code": "IN",
"default_billing": false,
"default_shipping": false,
"postcode": "360001",
"region": {
"region": "Gujarat",
"region_code": "GJ"
},
"street": [
"test address line 1",
"test address line 2"
],
"suffix": null,
"telephone": "1231231231",
"vat_id": null
}
}
}
But now, when I query for the customer address, it returns the wrong region_code.
Here is the query I wrote to get the customer address:
query{
customer{
addresses{
id
firstname
lastname
street
city
region{
region
region_code
}
country_code
postcode
telephone
}
}
}
Result :
{
"data": {
"customer": {
"addresses": [
{
"id": 44,
"firstname": "test",
"lastname": "name",
"street": [
"test address line 1",
"test address line 2"
],
"city": "Rajkot",
"region": {
"region": "Gujarat",
"region_code": "Gujarat"
},
"country_code": "IN",
"postcode": "360001",
"telephone": "1231231231"
}
]
}
}
}
As you can see, the region_code in this query result differs from the region_code in the mutation result. The query does not return the region_code that the mutation produced: the mutation returned GJ, while the query returned Gujarat.
Can anyone help me understand why this is happening and how to solve it?
I just stumbled upon this bug myself in Magento 2.3.4 and it looks like it's buggy with the region_code. There's a workaround for this, try to send the region_id instead of region_code, like this:
mutation {
createCustomerAddress(input: {
region: {
region: "Vendée"
region_id: 799
}
country_code: FR
street: ["123 Main Street"]
telephone: "7777777777"
postcode: "77777"
city: "Phoenix"
firstname: "Bob"
lastname: "Loblaw"
default_shipping: true
default_billing: false
}) {
id
region {
region
region_code
}
country_code
street
telephone
postcode
city
default_shipping
default_billing
}
}
After this, if you retrieve the region_code, it will show fine. It looks like it has problems identifying the region by the region_code.

PowerQuery to Convert List of Records to Delimited String

Given the below JSON I'm trying to load it into Excel. The "Ratings" section I would like to format into a single delimited string/cell. I'm pretty new to PowerQuery so I'm struggling to do this. I've managed to format the list of Records to its own table, but concatenating this into a string and adding it back into my source table is where I'm drawing a blank. Any help would be appreciated.
PowerQuery
let
Source = Json.Document(File.Contents("C:\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" })
in
LastStep
JSON
{
"Title": "Iron Man",
"Year": "2008",
"Rated": "PG-13",
"Ratings": [{
"Source": "Internet Movie Database",
"Value": "7.9/10"
}, {
"Source": "Rotten Tomatoes",
"Value": "93%"
}, {
"Source": "Metacritic",
"Value": "79/100"
}
]
}
Ultimately, something like below would be ideal.
How about this?
let
Source = Json.Document(File.Contents("C:\filename.json")),
#"Converted to Table" = Record.ToTable(Source),
#"Transposed Table" = Table.Transpose(#"Converted to Table"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Title", type text}, {"Rated", type text}, {"Year", Int64.Type}}),
#"Expanded Ratings" = Table.ExpandListColumn(#"Changed Type", "Ratings"),
#"Expanded Ratings1" = Table.ExpandRecordColumn(#"Expanded Ratings", "Ratings", {"Source", "Value"}, {"Source", "Value"}),
#"Added Custom" = Table.AddColumn(#"Expanded Ratings1", "Custom", each [Source] & "=" & [Value]),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Title", "Year", "Rated"}, {{"Ratings", each Text.Combine([Custom],"#(lf)"), type text}})
in
#"Grouped Rows"
Most of the steps here are fairly clear from their name and are produced through GUI controls. The one trickier step is where I use a custom aggregator when doing the grouping. If you use the GUI, Text.Combine is not an option in the Group By dialog box, so I selected Max (which becomes List.Max in the code) and replaced that with Text.Combine to concatenate with the line feed character as the separator.
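The custom aggregator described above amounts to building one "Source=Value" string per rating and joining them with a line feed. A minimal Python sketch of that result, using the JSON from the question (the variable names `movie` and `row` are made up for illustration):

```python
# The record from the question's JSON.
movie = {
    "Title": "Iron Man",
    "Year": "2008",
    "Rated": "PG-13",
    "Ratings": [
        {"Source": "Internet Movie Database", "Value": "7.9/10"},
        {"Source": "Rotten Tomatoes", "Value": "93%"},
        {"Source": "Metacritic", "Value": "79/100"},
    ],
}

# Copy the scalar fields, then replace the list of ratings with one text cell.
row = {k: v for k, v in movie.items() if k != "Ratings"}
row["Ratings"] = "\n".join(
    f"{r['Source']}={r['Value']}" for r in movie["Ratings"]
)
```

Swapping the line feed for another separator, such as a pipe or a comma, only changes the string passed to the join.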
This concatenates Source and Value with a pipe character into a single column. Is that what you want?
let
Source = Json.Document(File.Contents("C:\temp\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" }),
#"Added Custom" = Table.AddColumn(LastStep, "Concat", each [Source]&"|"&[Value]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Concat"})
in #"Removed Other Columns"
