Powershell Sorting a tuple not working - sorting

This is the most stripped down example of what I am trying and failing to do:
@(1, 3, 2) | Select-Object {New-Object "tuple[Int,Int]" $_, 1} | Sort-Object -Property Item1
It returns:
(2, 1)
(3, 1)
(1, 1)
I'm looking to return the list sorted by the first element of the tuple. Obviously in this little demo I could just do the sorting first, but in my actual use case I perform a calculation on the list during the Select-Object phase and then want to sort based on the results of that calculation.
I'm certainly not married to using tuples, but the other options I've tried haven't worked either. I just need to take a list, add additional information to each element based on the result of a calculation, and then sort by that additional information.
Feeling really stupid right now given how long this is taking me to figure out so any help is appreciated.

If you're just adding a custom property to an object, try returning a new object with the additional property. To use your same example with @(1, 3, 2):
@(1, 3, 2) | % {
    [PSCustomObject] @{
        Item1 = $_
        Item2 = 1
    }
} | Sort-Object -Property Item1

Another option is to add the tuples to a list and call its Sort() method.
Tuples compare item by item, so the sort order is always ascending from the first item to the last item:
$list = New-Object 'Collections.ArrayList'
foreach ($x in 1, 3, 2) {$null = $list.add([tuple]::create($x, 1))}
$list.Sort()
$list | select item1, item2
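
As a side note on why the original pipeline does not sort: Select-Object with a script block builds a calculated property, so each output object wraps the tuple in a single property instead of being the tuple itself, and Sort-Object finds no Item1 on it. A minimal sketch, assuming ForEach-Object is an acceptable replacement; the squaring step is only a hypothetical stand-in for the real calculation:

```powershell
# Emit the tuples themselves with ForEach-Object (not Select-Object),
# so that Item1 and Item2 are real properties of each pipeline object.
# Item2 = value squared is a made-up calculation for illustration.
$sorted = @(1, -3, 2) |
    ForEach-Object { [Tuple]::Create($_, $_ * $_) } |
    Sort-Object -Property Item2

$sorted | Select-Object Item1, Item2
```

Here the sort key is the calculated value in Item2, which matches the stated goal of sorting by the result of the calculation.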

Related

PowerShell | Optimization search : the matching between the elements of two arrays knowing in advance that only one unique pair exists

I would like to optimize the process when I match the elements between two arrays (each contains several thousand elements). If the match is found then we move on to the next element instead of continuing to search for another match (which does not exist because each element is unique).
$array1 = @(thousandItemsForExample)
$array2 = @(thousandItemsForExample)
foreach ($array1item in $array1) {
    $object = [PSCustomObject]@{
        property1 = $array1item.property1
        property2 = ($array2 | Where-Object { $_.property1 -eq $array1item.property1 } | Select-Object property2).property2
    }
}
I tried to find out if any of the comparison operators had this kind of option but I couldn't find anything.
Thank you! :)
PS : Sorry for my English, it's not my native language...
You can do this with the help of a hash table, which allows for fast look-ups. Group-Object -AsHashTable also helps greatly with constructing the hash table:
$array1 = @(thousandItemsForExample)
$array2 = thousandItemsForExample | Group-Object property1 -AsHashTable -AsString
$result = foreach ($item in $array1) {
    [PSCustomObject]@{
        property1 = $item.property1
        property2 = $array2[$item.property1].property2
    }
}
Create a hashtable and load all the items from $array2 into it, using the value of property1 as the key:
$array1 = @(thousandItemsForExample)
$array2 = @(thousandItemsForExample)
$lookupTable = @{}
$array2 | ForEach-Object {
    $lookupTable[$_.property1] = $_
}
Fetching the corresponding item from the hashtable by key is going to be significantly faster than filtering the whole array with Where-Object every time:
foreach ($array1item in $array1) {
    $object = [PSCustomObject]@{
        property1 = $array1item.property1
        property2 = $lookupTable[$array1item.property1].property2
    }
}
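
To try the lookup-table approach end to end, here is a small self-contained sketch. The two sample arrays and their values are hypothetical stand-ins for the real data; only the property names come from the question:

```powershell
# Hypothetical sample data standing in for the two large arrays.
$array1 = @(
    [PSCustomObject]@{ property1 = 'a' }
    [PSCustomObject]@{ property1 = 'b' }
)
$array2 = @(
    [PSCustomObject]@{ property1 = 'b'; property2 = 'second' }
    [PSCustomObject]@{ property1 = 'a'; property2 = 'first' }
)

# Build the lookup table once, with a single pass over $array2.
$lookupTable = @{}
foreach ($item in $array2) {
    $lookupTable[$item.property1] = $item
}

# Each match is now a direct key access instead of a scan of $array2.
$result = foreach ($item in $array1) {
    [PSCustomObject]@{
        property1 = $item.property1
        property2 = $lookupTable[$item.property1].property2
    }
}

$result
```

Because every key is unique, one hashtable entry per item is enough; capturing the foreach output in $result also keeps all the generated objects rather than only the last one.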

Powershell csv calculate total sum

I am currently working with PowerShell. PowerShell is new to me, so it's kind of hard to figure this one out.
I have three headers in my CSV file.
The headers are: Name, MessageCount and Direction.
The names are email addresses, and those addresses are all the same. Direction is either "Inbound" or "Outbound". MessageCount holds a bunch of different numbers:
Overview
I want to add up those numbers so I get "Inbound" and "Outbound" totals, along with the email address, on those rows.
When I try to loop over MessageCount with foreach and add the values together, it only gives me output like this:
MessageCount
Try something like this:
$data = Import-Csv "path-to-your-csv-file"
$data | group Name |
    select Name,
        @{n = "Inbound"; e = {
            (($_.Group | where Direction -eq "Inbound").MessageCount | Measure-Object -Sum).Sum }
        },
        @{n = "Outbound"; e = {
            (($_.Group | where Direction -eq "Outbound").MessageCount | Measure-Object -Sum).Sum }
        }
Code explanation
group Name groups the rows by the property Name, in this case the email address.
select lets you pick properties from an object or create custom ones with @{n="";e={}}.
($_.Group | where Direction -eq "Outbound").MessageCount gets the rows from the group, keeps those whose Direction equals Outbound, and then reads MessageCount from the rows found.
Measure-Object -Sum takes the array and creates an object with summary properties, e.g. the sum of the values in the array, so we get the sum of MessageCount and return it as a custom property on the output object.
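
The same grouping idea can be tried without a file on disk. A minimal sketch, assuming inline CSV text in place of the real export; the address and the counts are made up, and this variant groups by Direction rather than by Name:

```powershell
# Hypothetical CSV content; the real data would come from Import-Csv.
$csv = @'
Name,MessageCount,Direction
user@example.com,5,Inbound
user@example.com,7,Outbound
user@example.com,3,Inbound
'@

$data = $csv | ConvertFrom-Csv

# One output row per direction, with MessageCount summed per group.
$totals = $data | Group-Object Direction | ForEach-Object {
    [PSCustomObject]@{
        Direction = $_.Name
        Total     = ($_.Group | Measure-Object -Property MessageCount -Sum).Sum
    }
}

$totals
```

CSV fields arrive as strings, so letting Measure-Object do the summing avoids accidental string concatenation in a hand-written loop.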

fill time gaps with power query

I have following data
start stop status
+-----------+-----------+-----------+
| 09:01:10 | 09:01:40 | active |
| 09:02:30 | 09:04:50 | active |
| 09:10:01 | 09:11:50 | active |
+-----------+-----------+-----------+
I want to fill in the gaps with "passive"
start stop status
+-----------+-----------+-----------+
| 09:01:10 | 09:01:40 | active |
| 09:01:40 | 09:02:30 | passive |
| 09:02:30 | 09:04:50 | active |
| 09:04:50 | 09:10:01 | passive |
| 09:10:01 | 09:11:50 | active |
+-----------+-----------+-----------+
How can I do this in M Query language?
You could try something like the below (my first two steps someTable and changedTypes are just to re-create your sample data on my end):
let
someTable = Table.FromColumns({{"09:01:10", "09:02:30", "09:10:01"}, {"09:01:40", "09:04:50", "09:11:50"}, {"active", "active", "active"}}, {"start","stop","status"}),
changedTypes = Table.TransformColumnTypes(someTable, {{"start", type duration}, {"stop", type duration}, {"status", type text}}),
listOfRecords = Table.ToRecords(changedTypes),
transformList = List.Accumulate(List.Skip(List.Positions(listOfRecords)), {listOfRecords{0}}, (listState, currentIndex) =>
let
previousRecord = listOfRecords{currentIndex-1},
currentRecord = listOfRecords{currentIndex},
thereIsAGap = currentRecord[start] <> previousRecord[stop],
recordsToAdd = if thereIsAGap then {[start=previousRecord[stop], stop=currentRecord[start], status="passive"], currentRecord} else {currentRecord},
append = listState & recordsToAdd
in
append
),
backToTable = Table.FromRecords(transformList, type table [start=duration, stop=duration, status=text])
in
backToTable
This is what I start off with (at the changedTypes step):
This is what I end up with:
To integrate with your existing M code, you'll probably need to:
remove someTable and changedTypes from my code (and replace with your existing query)
change changedTypes in the listOfRecords step to whatever your last step is called (otherwise you'll get an error if you don't have a changedTypes expression in your code).
Edit:
Further to my answer, what I would suggest is:
Try changing this line in the code above:
listOfRecords = Table.ToRecords(changedTypes),
to
listOfRecords = List.Buffer(Table.ToRecords(changedTypes)),
I found that storing the list in memory reduced my refresh time significantly (maybe ~90% if quantified). I imagine there are limits and drawbacks (e.g. if the list can't fit), but might be okay for your use case.
Do you experience similar behaviour? Also, my basic graph indicates non-linear complexity of the code overall unfortunately.
Final note: I found that generating and processing 100k rows resulted in a stack overflow whilst refreshing the query (this might have been due to the generation of the input rows rather than the insertion of new rows; I don't know). So clearly, this approach has limits.
I think I may have a better performing solution.
From your source table (assuming it's sorted), add an index column starting from 0 and an index column starting from 1 and then merge the table with itself doing a left outer join on the index columns and expand the start column.
Remove columns except for stop, status, and start.1 and filter out nulls.
Rename columns to start, status, and stop and replace "active" with "passive".
Finally, append this table to your original table.
let
Source = Table.RenameColumns(#"Removed Columns",{{"Column1.2", "start"}, {"Column1.3", "stop"}, {"Column1.4", "status"}}),
Add1Index = Table.AddIndexColumn(Source, "Index", 1, 1),
Add0Index = Table.AddIndexColumn(Add1Index, "Index.1", 0, 1),
SelfMerge = Table.NestedJoin(Add0Index,{"Index"},Add0Index,{"Index.1"},"Added Index1",JoinKind.LeftOuter),
ExpandStart1 = Table.ExpandTableColumn(SelfMerge, "Added Index1", {"start"}, {"start.1"}),
RemoveCols = Table.RemoveColumns(ExpandStart1,{"start", "Index", "Index.1"}),
FilterNulls = Table.SelectRows(RemoveCols, each ([start.1] <> null)),
RenameCols = Table.RenameColumns(FilterNulls,{{"stop", "start"}, {"start.1", "stop"}}),
ActiveToPassive = Table.ReplaceValue(RenameCols,"active","passive",Replacer.ReplaceText,{"status"}),
AppendQuery = Table.Combine({Source, ActiveToPassive}),
#"Sorted Rows" = Table.Sort(AppendQuery,{{"start", Order.Ascending}})
in
#"Sorted Rows"
This should be O(n) complexity with similar logic to @chillin, but I think it should be faster than using a custom function since it uses a built-in merge, which is likely to be highly optimized.
I would approach this as follows:
Duplicate the first table.
Replace "active" with "passive".
Remove the start column.
Rename stop to start.
Create a new stop column by looking up the earliest start time from your original table that occurs after the current stop time.
Filter out nulls in this new column.
Append this table to the original table.
The M code will look something like this:
let
Source = <...your starting table...>,
PassiveStatus = Table.ReplaceValue(Source,"active","passive",Replacer.ReplaceText,{"status"}),
RemoveStart = Table.RemoveColumns(PassiveStatus,{"start"}),
RenameStart = Table.RenameColumns(RemoveStart,{{"stop", "start"}}),
AddStop = Table.AddColumn(RenameStart, "stop", (C) => List.Min(List.Select(Source[start], each _ > C[start])), type time),
RemoveNulls = Table.SelectRows(AddStop, each ([stop] <> null)),
CombineTables = Table.Combine({Source, RemoveNulls}),
#"Sorted Rows" = Table.Sort(CombineTables,{{"start", Order.Ascending}})
in
#"Sorted Rows"
The only tricky bit above is the custom column part where I define the new column like this:
(C) => List.Min(List.Select(Source[start], each _ > C[start]))
This takes each item in the column/list Source[start] and compares it to the time in the current row. It selects only the ones that occur after the time in the current row and then takes the minimum over that list to find the earliest one.

How to sort a comma separated string with a specific value on first position?

Let's say I have an unsorted string like "Apples,Bananas,Pineapples,Apricots" in my query. I want to sort that list and specifically have "Bananas" first in the list if it occurs, with the rest sorted ascending.
Example:
[BASKET] | [CONTENT] | [SORTED]
John | Apples,Apricots,Bananas | Bananas,Apples,Apricots
Melissa | Pineapples,Bananas | Bananas,Pineapples
Tom | Pineapples,Apricots,Apples | Apples,Apricots,Pineapples
How can I accomplish this with Power Query?
Cheap version: (a) replace Bananas with something that will sort first in a strict alphabetical sort, (b) sort into a new column, (c) change the placeholder back to Bananas, (d) remove the extra column.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Replaced Value" = Table.ReplaceValue(Source,"Bananas","111Bananas",Replacer.ReplaceText,{"Items"}),
MySort = (x) => Text.Combine(List.Sort(Text.Split(x, ",")), ","),
Sorted = Table.AddColumn(#"Replaced Value", "Sorted", each MySort([Items])),
#"Replaced Value1" = Table.ReplaceValue(Sorted,"111Bananas","Bananas",Replacer.ReplaceText,{"Sorted"}),
#"Removed Columns" = Table.RemoveColumns(#"Replaced Value1",{"Items"})
in #"Removed Columns"

Put results in a table and then sorted output

I am writing a script which produces two outputs within a foreach loop: one string, $server, and one integer, $util (let's say I get 20 results).
What is the simplest approach to put my results in a table while running the loop, so that I can output them sorted (descending) after the loop is finished?
SERVER UTIL
------ ----
SERVER001 95
SERVER002 74
SERVER003 32
SERVER004 12
If you want to sort the results in descending order, you will have to put the results in an array and then sort outside the loop, like so:
$arr = @()
foreach ($item in $collection)
{
    $arr += [pscustomobject]@{
        Server = $item.server
        Util = $item.util
    }
}
$arr | Sort-Object -Property Util -Descending
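
A variation worth considering: += on an array creates a new array on every iteration, so for larger loops it may be simpler to capture the loop's output directly. A minimal sketch, assuming a hypothetical $collection of four sample objects built from the values in the question:

```powershell
# Hypothetical input; in the real script these values come from the loop body.
$collection = @(
    [PSCustomObject]@{ server = 'SERVER003'; util = 32 }
    [PSCustomObject]@{ server = 'SERVER001'; util = 95 }
    [PSCustomObject]@{ server = 'SERVER004'; util = 12 }
    [PSCustomObject]@{ server = 'SERVER002'; util = 74 }
)

# Everything the loop body emits is collected into $results.
$results = foreach ($item in $collection) {
    [PSCustomObject]@{
        Server = $item.server
        Util   = $item.util
    }
}

# Sort once, after the loop has finished.
$sorted = $results | Sort-Object -Property Util -Descending
$sorted | Format-Table
```

Keeping Util as an integer matters here: if it were a string, Sort-Object would compare the values as text rather than as numbers.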
