As the subject says, after a few steps (group by, filter, combine text, ...), I have a problem removing duplicates within a single cell in Power Query.
Example: the column "cc_emails" has many rows, and each row contains duplicated emails because of an earlier Text.Combine step.
Something like this: "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
I would like each email to appear only once in the list. Can someone help with this?
Expected output:
"Giang.Phan#abc.com,thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com"
Update: here is my query from the Query Editor:
let
Source = Exchange.Contents("giang.phan#abc.com"),
Mail1 = Source{[Name="Mail"]}[Data],
#"Reordered Columns" = Table.ReorderColumns(Mail1,{"DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients", "CcRecipients", "BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes", "Body", "Id"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each [DateTimeReceived] > #datetime(2021, 12, 29, 0, 0, 0) and [DateTimeReceived] < #datetime(2022, 1, 4, 0, 0, 0)),
#"Expanded ToRecipients" = Table.ExpandTableColumn(#"Filtered Rows", "ToRecipients", {"Address"}, {"ToRecipients.Address"}),
#"Expanded CcRecipients" = Table.ExpandTableColumn(#"Expanded ToRecipients", "CcRecipients", {"Address"}, {"CcRecipients.Address"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded CcRecipients",{"BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes"}),
#"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients.Address", "CcRecipients.Address", "Body"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
#"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address"}, {{"Last_time receive", each List.Max([DateTimeReceived]), type datetime}, {"Last_subject", each List.Max([Subject]), type nullable text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows1",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address", "Last_time receive", "Last_subject"})
in
#"Removed Other Columns"
You can split the text by delimiter, select the distinct list values, then recombine as a string:
let
Source = "Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com, Giang.Phan#abc.com, thao.tran#abc.com, Khoa.Vu#abc.com, Vn.Offset#abc.com",
#"Distinct Values" = Text.Combine(List.Distinct(Text.Split(Source, ", ")),", ")
in
#"Distinct Values"
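For readers who want to check the split / distinct / recombine logic outside Power Query, here is a minimal sketch of the same idea in Python. It is only an illustration, not part of the M solution: the function name `distinct_join` and the sample addresses are made up. Like List.Distinct with its default comparer, it keeps the first occurrence of each item and compares case-sensitively.

```python
def distinct_join(text: str, sep: str = ", ") -> str:
    """Split on sep, drop repeated items (keeping first-seen order), rejoin."""
    parts = text.split(sep)
    # dict.fromkeys keeps insertion order, so the first occurrence of each item wins
    unique = list(dict.fromkeys(parts))
    return sep.join(unique)


# Hypothetical sample data, not taken from the question
sample = "a@example.com, b@example.com, a@example.com, b@example.com"
result = distinct_join(sample)
print(result)
```

If the same address can appear with different casing, the items would need to be normalised (for example lower-cased) before de-duplicating.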
Edit after question update:
In your case, you can simply change this line:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
to include the List.Distinct function, and return only distinct address values:
#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine(List.Distinct([CcRecipients.Address]),", "), type text}, {"to_address", each Text.Combine(List.Distinct([ToRecipients.Address]),", "), type text}}),
I have a dataset where new data is appended as a 'Change' (last column) and numbered, with the higher number being the latest outcome. I'm trying to correctly filter the latest rows based on the highest change number available and get a result as per the image below.
While I've managed to get the right rows with columns Contract, Period, Person Company & Person Name, I can't seem to get the rest of the data to appear. Can anyone suggest what I'm missing?
EDIT - GOT SOMETHING WORKING
I don't understand it yet, but the below code from another question finally worked. Thanks to Olly on Power Query - Group by MAX Column Value
let
Partitions = Table.Group(Sheet1, {"Person Name"}, {{"Data", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table}}),
Combined = Table.Combine(Partitions[Data])
in
Combined
Here is another approach (assuming that your original dataset is formatted as a table called Table1).
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type datetime}, {"Person Company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", type text}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Person Name"}, {{"AllData", each _, type table [Contract=nullable number, Period=nullable datetime, Person Company=nullable text, Person Name=nullable text, Person Role=nullable text, Gender=nullable text, Age=nullable text, Employment Costs=nullable number, Monthly Hours=nullable number, Change=nullable number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Max_Change", each List.Max([AllData][Change])),
#"Expanded AllData" = Table.ExpandTableColumn(#"Added Custom", "AllData", {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Contract", "Period", "Person Company", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}),
#"Added Custom1" = Table.AddColumn(#"Expanded AllData", "Filter", each if [Change] = [Max_Change] then "Yes" else null),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Filter] = "Yes")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Filter", "Max_Change"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Contract", "Period", "Person Company", "Person Name", "Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in
#"Reordered Columns"
The idea is to add a column that holds the maximum change for each person (using List.Max()) and then keep only the rows that satisfy the condition [Change] = [Max_Change].
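The same "compute the maximum per group, then keep the matching rows" idea can be sketched in Python to make the logic explicit. This is an illustration under assumptions: rows are plain dictionaries, and the function name `latest_rows` and the sample values are hypothetical.

```python
def latest_rows(rows, key="Person Name", version="Change"):
    """Keep only the rows whose version equals the maximum version for their key."""
    # First pass: find the highest Change value for each person
    max_change = {}
    for row in rows:
        k = row[key]
        if k not in max_change or row[version] > max_change[k]:
            max_change[k] = row[version]
    # Second pass: keep rows that match their person's maximum
    return [row for row in rows if row[version] == max_change[row[key]]]


# Hypothetical sample data
rows = [
    {"Person Name": "Ann", "Change": 1, "Monthly Hours": 100},
    {"Person Name": "Ann", "Change": 2, "Monthly Hours": 120},
    {"Person Name": "Bob", "Change": 1, "Monthly Hours": 80},
]
latest = latest_rows(rows)
print(latest)
```

Note that if two rows for the same person share the maximum Change value, this approach keeps both of them, whereas the sort-and-take-first approach keeps exactly one.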
You basically have the answer already, but here are step-by-step instructions:
Select all the columns you want to group on (Contract, Period, Person Company, Person Name), then right-click and choose Group By.
Use the operation All Rows and click OK.
Go to Home > Advanced Editor and, on the group line, replace
... each _, type table [...}})
with
.... each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }})
Click Done, then use the arrows at the top of the new column to expand the extra columns.
That sorts each group in descending order by Change number and then keeps only the first row.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Contract", Int64.Type}, {"Period", type date}, {"Person company", type text}, {"Person Name", type text}, {"Person Role", type text}, {"Gender", type text}, {"Age", Int64.Type}, {"Employment Costs", Int64.Type}, {"Monthly Hours", Int64.Type}, {"Change", Int64.Type}}),
#"Grouped Rows1" = Table.Group(#"Changed Type", {"Contract", "Period", "Person company", "Person Name"}, {{"Count", each Table.FirstN(Table.Sort(_,{{"Change", Order.Descending}}),1), type table }}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows1", "Count", {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"}, {"Person Role", "Gender", "Age", "Employment Costs", "Monthly Hours", "Change"})
in #"Expanded Count"
I'm new to Power Query and I'm parsing JSON. I have an array named "categories". When I expand it using Power Query, it creates three rows, one for each category, but I want to keep a single row and create three separate columns, one per category, such as category1, category2, category3.
Here is my code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"no", type text}, {"complete", Int64.Type}, {"json", type text}}),
#"Parsed JSON" = Table.TransformColumns(#"Changed Type",{{"json", Json.Document}}),
#"Expanded json" = Table.ExpandRecordColumn(#"Parsed JSON", "json", {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"}, {"title", "price", "StoreName", "ratings", "merchant", "categories", "VariantB", "detailA", "detailB", "bullets", "images", "description"})
in
#"Expanded json"
Add an extra step after #"Expanded json" and reference it in the in clause:
ExtractList = Table.TransformColumns(#"Expanded json", {"categories", each Text.Combine(List.Transform(_, Text.From), ","), type text})
I have two tables -
Podcasts
Categories
Podcast columns - podcast_id, podcast_autho, podcast_category_id, podcast_description, podcast_owner, podcast_subcategory_id, podcast_title
Category columns - category_id, category_name, category_owner
podcast_category_id has a value from the Categories table and so does podcast_subcategory_id.
Now, I have an SQL statement that joins the two tables to produce the category name instead of the id in the output -
SELECT p.*, c.category_name, c2.category_name AS sub_category_name FROM podcasts p
LEFT JOIN categories c ON p.podcast_category_id = c.category_id
LEFT JOIN categories c2 ON p.podcast_subcategory_id = c2.category_id;
I use the same query in my JpaRepository class to get the output -
interface PodcastRepository: JpaRepository<Podcast, Long> {
@Query(value = "SELECT p.*, c.category_name, c2.category_name AS sub_category_name FROM podcasts p " +
"LEFT JOIN categories c ON p.podcast_category_id = c.category_id " +
"LEFT JOIN categories c2 ON p.podcast_subcategory_id = c2.category_id",
nativeQuery = true)
fun getPodcastsByOwner(owner: Long): List<Any>
}
I do get an output, but it is not in JSON object format, unlike other cases where I don't use Any (Object in Java) but the specific Entity class.
What I get -
[
[
8,
"krktush",
2,
"World War 1",
1,
3,
"What Went Wrong",
"General",
"Specific"
],
[
9,
"krktush",
2,
"World War 2",
1,
3,
"What went right",
"General",
"Specific"
]
]
What I expect -
[
{
"id": 16,
"author": "krtkush",
"title": "World War 1",
"description": "What Went Wrong",
"category": "2",
"subCategory": "",
"owner": 1,
"categoryName": "General",
"subCategoryName": "Specific"
},
{
"id": 22,
"author": "krtkush",
"title": "World War 2",
"description": "What Went Right",
"category": "20",
"subCategory": "",
"owner": 1,
"categoryName": "General",
"subCategoryName": "Specific"
}
]
How do I achieve this?
I am quite new to Laravel, and I would like to retrieve data from the DB in an "automated" way. I have a query that gets the value of the "cases" column from the most recent row for a given state:
$pieWyoming=State::select('cases')->where('state', '=', 'Wyoming')->orderByDesc('id')->limit(1)->get()->sum('cases');
But I want to do this with whereIn:
$statesArr = array("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas","Kentucky","Louisiana","Maine","Maryland","Massachusetts","Michigan","Minnesota","Mississippi","Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virgin Islands", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming");
$Wyoming=State::select('cases')->whereIn('state',$statesArr)->orderByDesc('id')->limit(1)->get()->sum('cases');
But this seems to go through $statesArr and return only the last value. That value is correct, but it is only one value from one state; I want the latest value for every state.
Edit: sample data and expected output
The database holds data as this https://res.cloudinary.com/dcdajhifi/image/upload/v1599003808/ex1_ys5ksi.png
and I would like to get only the value of the field "cases" for each state in the array $statesArr, taken from the last row in which that state appears. For example:
https://res.cloudinary.com/dcdajhifi/image/upload/v1599003808/ex2_lby8md.png
Here is the last time Alabama appears, so I would like to get the value of cases for this row, and the same for each state. That way I could create a panel that displays each state name with its cases and deaths values, without running a query for each state.
To pick the latest entry (based on the highest auto-increment id) for each state, you can do a self join on your table, like this:
DB::table('states as a')
->select('a.*')
->leftJoin('states as b', function ($join) {
$join->on('a.state', '=', 'b.state')
->whereRaw('a.id < b.id')
;
})
->whereNull('b.id')
->whereIn('a.state',$statesArr)
->get();
If you are using a recent version of Laravel, you can rewrite the above using an inner join on a subquery:
$latestStates = DB::table('states')
->select(DB::raw('max(id) as max_id'))
->groupBy('state');
$states = DB::table('states as a')
->joinSub($latestStates, 'b', function ($join) {
$join->on('a.id', '=', 'b.max_id');
})->whereIn('a.state',$statesArr)
->get();
Or you can use whereExists
$states = DB::table('states as a')
->whereExists(function ($query) {
$query->select(DB::raw(1))
->from('states as b')
->whereRaw('a.state= b.state')
->groupBy('b.state')
->havingRaw('max(b.id) = a.id')
;
})->whereIn('a.state',$statesArr)
->get();
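All three queries above implement the same rule: for each state, keep the row with the highest id. As a rough illustration of that rule only (not a replacement for the queries), here is a small Python sketch. The function name `latest_per_state` and the sample rows are hypothetical.

```python
def latest_per_state(rows, states):
    """Return {state: row} where row is the one with the highest id for that state."""
    wanted = set(states)
    latest = {}
    for row in rows:
        state = row["state"]
        if state not in wanted:
            continue  # mirrors whereIn('a.state', $statesArr)
        current = latest.get(state)
        # Keep the row with the largest id seen so far for this state
        if current is None or row["id"] > current["id"]:
            latest[state] = row
    return latest


# Hypothetical sample data
rows = [
    {"id": 1, "state": "Alabama", "cases": 10},
    {"id": 2, "state": "Wyoming", "cases": 3},
    {"id": 3, "state": "Alabama", "cases": 25},
    {"id": 4, "state": "Texas", "cases": 99},
]
result = latest_per_state(rows, ["Alabama", "Wyoming"])
print({state: row["cases"] for state, row in result.items()})
```

Doing this in the database, as in the queries above, is generally preferable because only one row per state has to be transferred.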
Try like this:
$statesArr = array("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "District of Columbia", "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas","Kentucky","Louisiana","Maine","Maryland","Massachusetts","Michigan","Minnesota","Mississippi","Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virgin Islands", "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming");
$Wyoming = State::whereIn('state', $statesArr)->orderByDesc('id')->sum('cases');
Given the below JSON I'm trying to load it into Excel. The "Ratings" section I would like to format into a single delimited string/cell. I'm pretty new to PowerQuery so I'm struggling to do this. I've managed to format the list of Records to its own table, but concatenating this into a string and adding it back into my source table is where I'm drawing a blank. Any help would be appreciated.
PowerQuery
let
Source = Json.Document(File.Contents("C:\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" })
in
LastStep
JSON
{
"Title": "Iron Man",
"Year": "2008",
"Rated": "PG-13",
"Ratings": [{
"Source": "Internet Movie Database",
"Value": "7.9/10"
}, {
"Source": "Rotten Tomatoes",
"Value": "93%"
}, {
"Source": "Metacritic",
"Value": "79/100"
}
]
}
Ultimately, something like below would be ideal.
How about this?
let
Source = Json.Document(File.Contents("C:\filename.json")),
#"Converted to Table" = Record.ToTable(Source),
#"Transposed Table" = Table.Transpose(#"Converted to Table"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Title", type text}, {"Rated", type text}, {"Year", Int64.Type}}),
#"Expanded Ratings" = Table.ExpandListColumn(#"Changed Type", "Ratings"),
#"Expanded Ratings1" = Table.ExpandRecordColumn(#"Expanded Ratings", "Ratings", {"Source", "Value"}, {"Source", "Value"}),
#"Added Custom" = Table.AddColumn(#"Expanded Ratings1", "Custom", each [Source] & "=" & [Value]),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Title", "Year", "Rated"}, {{"Ratings", each Text.Combine([Custom],"#(lf)"), type text}})
in
#"Grouped Rows"
Most of the steps here are fairly clear from their name and are produced through GUI controls. The one trickier step is where I use a custom aggregator when doing the grouping. If you use the GUI, Text.Combine is not an option in the Group By dialog box, so I selected Max (which becomes List.Max in the code) and replaced that with Text.Combine to concatenate with the line feed character as the separator.
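To show what the custom aggregator produces, here is a minimal Python sketch of the same "Source=Value joined by a separator" step, using the ratings from the question's JSON. The function name `combine_ratings` is made up for illustration; the M query above remains the actual solution.

```python
def combine_ratings(ratings, sep="\n"):
    """Build 'Source=Value' for each rating and join them with sep."""
    return sep.join(f"{r['Source']}={r['Value']}" for r in ratings)


# The Ratings array from the question's JSON
ratings = [
    {"Source": "Internet Movie Database", "Value": "7.9/10"},
    {"Source": "Rotten Tomatoes", "Value": "93%"},
    {"Source": "Metacritic", "Value": "79/100"},
]
combined = combine_ratings(ratings, sep=" | ")
print(combined)
```

With the default separator the result is one rating per line, which corresponds to using "#(lf)" as the separator in Text.Combine.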
This concatenates Source and Value with a pipe character into a single column. Is that what you want?
let
Source = Json.Document(File.Contents("C:\temp\filename.json")),
Ratings1 = Source[Ratings],
#"Converted to Table" = Table.FromList(Ratings1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
LastStep = Table.ExpandRecordColumn(#"Converted to Table", "Column1", { "Source", "Value" }),
#"Added Custom" = Table.AddColumn(LastStep, "Concat", each [Source]&"|"&[Value]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Concat"})
in #"Removed Other Columns"