CountIf Equivalent in Power Query, counts per row within self

I need help creating a custom column that shows how many models there are per modality for each account. What would I need to enter in the Custom Column dialog in Power Query?

It depends on how many other columns you have. I don't see an account column, but you mention one.
In general, in Power Query, select the account and Modality columns, right-click, and choose Group By. Use the Count Rows operation with a new column name of your choice.
In the same dialog, click Add aggregation and use the All Rows operation for that second column.
Then expand the All Rows column using the arrows at the top of the column to bring back the columns that the grouping removed.
I edited the answer to provide all potential combinations. Try:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ChildName", type text}, {"Modality", type text}, {"Model Info", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ChildName"}, {
{"Modality per ChildName", each Table.RowCount(_), Int64.Type},
{"Unique Modality per ChildName", each List.Count(List.Distinct(_[Modality])), Int64.Type},
{"data", each _, type table}
}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Modality", "Model Info"}, {"Modality", "Model Info"}),
#"Grouped Rows1" = Table.Group(#"Expanded data", {"ChildName", "Modality"}, {
{"data", each _, type table },
{"Model Info Per Modality", each Table.RowCount(_), Int64.Type},
{"Unique Model Info Per Modality", each List.Count(List.Distinct(_[Model Info])), Int64.Type}
}),
#"Expanded data1" = Table.ExpandTableColumn(#"Grouped Rows1", "data", {"Modality per ChildName", "Unique Modality per ChildName", "Model Info"}, {"Modality per ChildName", "Unique Modality per ChildName", "Model Info"})
in #"Expanded data1"
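If only the single count from the question is needed, the Group By steps described above can be sketched more compactly. This is a minimal sketch, not the answerer's code: the column names Account, Modality, and Model and the sample rows are assumptions, since the question's actual table is not shown.

```powerquery
let
    // Hypothetical sample data; replace with your own Source step
    Source = Table.FromRows(
        {
            {"A", "CT", "M1"},
            {"A", "CT", "M2"},
            {"A", "MR", "M3"},
            {"B", "CT", "M1"}
        },
        {"Account", "Modality", "Model"}
    ),
    // Group by both key columns: count the rows and keep the original rows in a nested table
    #"Grouped Rows" = Table.Group(Source, {"Account", "Modality"}, {
        {"Models per Modality", each Table.RowCount(_), Int64.Type},
        {"data", each _, type table}
    }),
    // Expand the nested table so every original row carries its group's count
    #"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Model"}, {"Model"})
in
    #"Expanded data"
```

Each original row comes back with a Models per Modality value, which is roughly what a COUNTIFS over Account and Modality would give in a worksheet.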

Related

Remove duplicates in Power query editor

I have the data below:
Con                            Payment Status   Count
HUMANABRATTEN,MICOL9/20/2021   Resubmitted      15
HUMANABRATTEN,MICOL9/20/2021   In-Process       1
The Con values have exactly the same length, but when I try to remove duplicates it always removes the "Resubmitted" row, whereas I want to keep the Payment Status with the highest Count.
Normally in Excel, removing duplicates from any data keeps the first value and removes the second. I don't know why it isn't working that way in Power Query.
Power Query does not necessarily return results in the order you might expect. Even the sorts are unstable, if I recall correctly.
For your problem, one solution would be to use Table.Group and then extract the desired results. In your case that seems to be the maximum Count, and the Payment Status that is in the same row as that maximum.
For example:
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Con", type text}, {"Payment Status", type text}, {"Count", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Con"}, {
//Return the Payment Status cell that is in the same row as Max Count
{"Payment Status", each _[Payment Status]{List.PositionOf(_[Count],List.Max(_[Count]))}},
//determine the Max Count
{"Count", each List.Max([Count]), type nullable number}})
in
#"Grouped Rows"
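A variation on the same grouping idea, offered as a hedged sketch rather than as the answerer's method: Table.Max can return the whole row with the largest Count for each group, which is then expanded back into columns. The sample rows are copied from the question; the step names Grouped and Expanded and the intermediate column name Top are illustrative.

```powerquery
let
    // Sample rows from the question; replace with your own Source step
    Source = Table.FromRows(
        {
            {"HUMANABRATTEN,MICOL9/20/2021", "Resubmitted", 15},
            {"HUMANABRATTEN,MICOL9/20/2021", "In-Process", 1}
        },
        {"Con", "Payment Status", "Count"}
    ),
    // For each Con, keep the row (as a record) that has the largest Count
    Grouped = Table.Group(Source, {"Con"}, {
        {"Top", each Table.Max(_, "Count"), type record}
    }),
    // Turn the record's fields back into ordinary columns
    Expanded = Table.ExpandRecordColumn(Grouped, "Top", {"Payment Status", "Count"}, {"Payment Status", "Count"})
in
    Expanded
```

This avoids relying on row order at all, which matters given the ordering caveat above.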

Power Query: How to insert a Column using the Left(Right Function based on A4

Is there a way to Add a column, in Power Query, by referencing data in a specific cell?
I want to take the text from "A4", use a Left(Right function, and add that to a new column.
My VBA macro is:
"Latest 4 Wks - Ending " & Left(Right(.Range("A4"), 24), 23)
I guess you want to do something like this. As a first step, define a named range for A4, which I named cellA4. I then loaded it into Power Query, added an extra column with part of the text from the cell (I used Text.Middle; other text functions are possible, of course), and drilled down to the content of the cell. The M code for that is:
let
Source = Excel.CurrentWorkbook(){[Name="cellA4"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Text.Middle([Column1],23)),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1"}),
Custom = #"Removed Columns"{0}[Custom]
in
Custom
Then I made a table with one column, imported it into Power Query, and added an extra column that just contains the text from cell A4. The M code is:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Col1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each cellA4)
in
#"Added Custom"
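For a closer match to the VBA expression in the question, Left(Right(text, 24), 23) maps onto Text.End followed by Text.Start. The sketch below assumes a made-up string in place of cell A4 (the real cell content is not shown in the question), so treat CellText as a placeholder.

```powerquery
let
    // Hypothetical stand-in for the text in cell A4
    CellText = "Sales report for the period 01/01/2021 to 01/28/2021.",
    // Right(text, 24): the last 24 characters
    Last24 = Text.End(CellText, 24),
    // Left(..., 23): the first 23 of those characters
    Trimmed = Text.Start(Last24, 23),
    // Same prefix as in the VBA macro
    Result = "Latest 4 Wks - Ending " & Trimmed
in
    Result
```

In the queries above, CellText would be replaced by the drilled-down cellA4 value.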
Through further research, I found that by adding a Blank Query, I was able to add a column in Power Query by referencing data in a specific cell.
Insert a Blank Query and paste the following into the Advanced Editor:
(YourWorkSheet as table ) as text=>
let
SheetCellA4 =YourWorkSheet[Column1]{3},
SplitByFrom = Text.Split(SheetCellA4, "to "){1},
SplitByTime = Text.Split(SplitByFrom, "`"){0}
in SplitByTime
Then bring in the worksheet data. After the Source line, add:
#"Added Custom" = Table.AddColumn(Source, "Custom", each Query1(Source))
in
#"Added Custom"

Inserting text manually in Powerquery

I'm merging multiple Excel files into one table where the user can review the rows and mark an additional Comment column as completed. Each day there are additional files, so I need to refresh the query and pull in the new data while keeping the existing Comment column values.
I've attempted to do this by following Marcel Beug's video, but that uses a SQL table and I cannot get it to work with Excel files as the source.
After the Merge Queries step, I attempt to change the first table to my source, "InputFile".
![Modify the Merge Formula1][2]
![Changed to last query step of InputFile][3]
![InputFile Query with Source2 and Merge][4]
![M Code of InputFile Query with Merge][5]
By setting the first argument in the merge formula to the last step in the InputFile query, I was able to get around the cyclic reference error. However, every refresh now creates duplicate rows: 4 become 8, which then become 16, and so on.
let
Source = Excel.Workbook(File.Contents("S:\Fin_Aid\Operations Team\COD mpn - lec\InputFiles\8.22.18 to 8.23.18.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
Rename_RecID = Table.RenameColumns(#"Removed Columns",{{"Column3.1", "RecID"}}),
Source2 = Excel.CurrentWorkbook(){[Name="InputFile"]}[Content],
InputWithComment = Table.TransformColumnTypes(Source2,{{"RecID", Int64.Type}, {"Column1", type text}, {"Column2", type text}, {"Column4", type text}, {"Column5", type text}, {"Comment", type text}}),
#"Merged Queries" = Table.NestedJoin(Rename_RecID,{"RecID"},InputWithComment,{"RecID"},"InputWithComment",JoinKind.LeftOuter),
#"Expanded InputWithComment" = Table.ExpandTableColumn(#"Merged Queries", "InputWithComment", {"Comment"}, {"Comment"})
in
#"Expanded InputWithComment"
Regards,
Jim

Get errors in table

I'm trying to validate some data in Power Query by doing type conversion and catching the errors that happen. I then want to append the error messages to the end of the records where the errors appear. The catch is that I don't know beforehand which field/column an error may appear in. I convert the data type of all columns, and if a field in a record results in an error I need to catch it.
Currently I'm trying something like:
let
Source = #"Site Served Import",
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"Date", type datetime}, {"Campaign ID", type any}, {"Campaign Name", type text}, {"Site ID ", type any}, {"Site Name", type text}, {"Placement Name", type text}, {"Creative Name", type text}, {"Creative Size/Length", type text}, {"Buy Type", type text}, {"Planned Impressions", Int64.Type}, {"Planned Media Spend", type number}, {"Delivered Spend", type number}, {"Impressions", Int64.Type}, {"Clicks", Int64.Type}, {"Video Starts", Int64.Type}, {"25% Video Completion", Int64.Type}, {"50% Video Completion", Int64.Type}, {"75% Video Completion", Int64.Type}, {"Video Fully Completed", Int64.Type}, {"Engagements", Int64.Type}, {"Actions", Int64.Type}, {"Source", type text}, {"Action Description", type text}}),
#"Removed Errors" = Table.SelectRowsWithErrors(#"Changed Type")
in
#"Removed Errors"
This bit gives me only the records with errors in them. Then I'm trying to write a function to append those errors to the end of the records. So far my approach has been to add a custom column for each column with the formula:
try [column]
Honestly, I'm having trouble even adding columns in a loop. I don't think I'm thinking the right way about M yet. I'm more of a C type of guy, so this M stuff is strangely counterintuitive.
Does anyone know how to do this?
You are on the right track. The basic approach is to add a custom column using try [column]. That returns a nested record-type column that you can expand a couple of times to get to the error message (Error, then Message).
The easiest way to get started is to generate an example using the UI, then look at the script it produces.
Here's a good blog post on the subject:
https://blog.gbrueckl.at/2013/12/error-handling-in-power-query/
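When the failing column is not known in advance, the try approach can be applied to every field of each row in one custom column. The following is a minimal sketch under assumptions, not code from the question or the linked blog: the two-column sample table, the output column name Errors, and the "column: message" format are all illustrative.

```powerquery
let
    // Hypothetical sample data standing in for the imported table
    Source = Table.FromRows(
        {
            {"2018-01-01", "100"},
            {"not a date", "200"},
            {"2018-01-03", "abc"}
        },
        {"Date", "Impressions"}
    ),
    // Cells that cannot be converted become error values
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Date", type datetime}, {"Impressions", Int64.Type}}),
    ColumnNames = Table.ColumnNames(#"Changed Type"),
    // For each row, try every field and collect "column: message" for those that fail
    #"Added Errors" = Table.AddColumn(#"Changed Type", "Errors", (row) =>
        Text.Combine(
            List.RemoveNulls(
                List.Transform(ColumnNames, (name) =>
                    let
                        attempt = try Record.Field(row, name)
                    in
                        if attempt[HasError] then name & ": " & attempt[Error][Message] else null
                )
            ),
            "; "
        ),
        type text
    )
in
    #"Added Errors"
```

Rows without errors get an empty string in Errors, so filtering on a non-empty Errors value gives a similar selection to Table.SelectRowsWithErrors while keeping the messages.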

Power Query: issue with removing duplicates function

I have a problem when using the "Remove Duplicates" function in Power Query.
I'm running Excel 2013 with Power Query and Power Pivot. Multiple txt files in the same folder were loaded into the data model by creating a connection. The table looks like this:
CoCd  Doc.Id  Plant  PGroup  Purch.Doc.  Vendor
7200  411647  7200   U36     4800311931  2000031503
7020  421245  7020   D05     4800277051  2000032922
7200  404320  1000   8       4800000000  2000032944
7200  404321  7200   T48     4800293878  2000032944
7010  425013  7010   R21     4800346743  2000036726
There are 440k rows in total. By running a pivot table, I've identified 144k unique Doc.Ids.
I then selected the Doc.Id (Whole Number) column and used the "Remove Duplicates" function in Power Query to remove the duplicated rows. However, the final table only loaded 75k rows (it should be 144k). When I changed the data type of Doc.Id to text and then removed duplicates, the final table had 163k rows, which is somewhat plausible because the Doc.Ids contain both "603" and " 603". Unfortunately, I really need to have 144k rows in my final table.
Why doesn't the Remove Duplicates function work in my case with Doc.Id as a whole number?
The code in the Advanced Editor looks like this:
#"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows",{{"CreateTime", type time}, {" TotalAmoun", Currency.Type}, {"Pst Date", type date}, {"Doc. Date", type date}, {"Due Date", type date}, {"DaysToDue", Int64.Type}, {"CreateDate", type date}, {"Cycle Time", type text}, {"Doc. Id", type text}, {"Purch.Doc.", Int64.Type}, {"Vendor", type text}, {"CoCd", Int64.Type}, {"Plant", type text}}),
#"Removed Duplicates" = Table.Distinct(#"Changed Type1", {"Doc. Id"})
in
#"Removed Duplicates"
After some further digging, it appears that a chunk of Doc.Ids is missing between "398103" and "657238", plus some random ones. An example list of missing numbers is below. I can't find any reason why they should be missing.
"245233"
"261404"
...
...
"398103"
...
...
"657238"

Resources