PowerShell Performance with Where-Object? - performance

See the code below. The only difference between the two scripts is that the second one does not use a script block, and it runs significantly faster because of it.
Is there a reason for this? Is one form more native to PowerShell than the other?
I am writing a large number of scripts, many of which contain several similar blocks, and I would like to understand how to get easy performance gains like this one. Why does leaving out the scriptblock in Where-Object (alias ?) cut the run time by such a margin?
PS C:\Scripts> $a = 1..15 | % {
Measure-Command {
$G = Get-ADGroup -Filter *
1..3 | % {
$G | ? {$_.Name -eq "TestGroup$($_)"}
}
}
}
$b = 1..15 | % {
Measure-Command {
$G = Get-ADGroup -Filter *
1..3 | % {
$G | ? Name -eq "TestGroup$($_)"
}
}
}
($a.TotalMilliseconds | Measure -Average).Average
($b.TotalMilliseconds | Measure -Average).Average
283.479413333333
212.57384

PowerShell 3.0 introduced a simplified syntax for Where-Object (aliases: where, ?) that can check properties directly without a scriptblock.
When a scriptblock is executed, PowerShell, like any other language, creates a new execution context, which is very expensive in an interpreted language.
The only reasons to use the old notation with a scriptblock are:
to keep the code compatible with PS1/PS2;
to perform complex checks;
to do something besides the check itself.
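To illustrate the two notations side by side, here is a minimal sketch using made-up sample objects (the $people data and its property names are hypothetical, not taken from the question). The simplified syntax handles a single property comparison, while a check that combines two conditions still needs a scriptblock.

```powershell
# Hypothetical sample data, for illustration only
$people = @(
    [PSCustomObject]@{ Name = 'Ann';  Age = 31 }
    [PSCustomObject]@{ Name = 'Bob';  Age = 17 }
    [PSCustomObject]@{ Name = 'Cara'; Age = 45 }
)

# Simplified syntax (PowerShell 3.0+): one property, one operator, no scriptblock
$adults = $people | Where-Object Age -ge 18

# Complex check: combining conditions requires the scriptblock form
$adultsNamedWithA = $people | Where-Object { $_.Age -ge 18 -and $_.Name -like 'A*' }
```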
As for your code, both snippets still use scriptblocks in other places, use pipelining (which is 5-10x slower than the foreach statement), and needlessly measure the Get-ADGroup time, so we can optimize it even more:
$group = Get-ADGroup -Filter *
$c = foreach ($i in 1..15) {
Measure-Command {
foreach ($j in 1..3) {
$filtered = foreach ($g in $group) { if ($g.Name -eq "TestGroup$($j)") { $g } }
}
}
}
Sometimes a much bigger gain can be achieved by preparing the data. For example, if some array of objects is repeatedly checked in a loop to see whether it has a certain property value, it's better to convert the array into a hashtable with a key based on that property value:
# index by somefield
$lookup = @{}
foreach ($element in $array) { $lookup[$element.somefield] = $element }
# use the indexed lookup table
foreach ($element in $anotherarray) {
$existingElement = $lookup[$element.somefield]
if ($existingElement) {
# do something
}
}

Related

PowerShell | Optimization search : the matching between the elements of two arrays knowing in advance that only one unique pair exists

I would like to optimize the process when I match the elements between two arrays (each contains several thousand elements). If the match is found then we move on to the next element instead of continuing to search for another match (which does not exist because each element is unique).
$array1 = @(thousandItemsForExample)
$array2 = @(thousandItemsForExample)
foreach ($array1item in $array1) {
$object = [PSCustomObject]@{
property1 = $array1item.property1
property2 = ($array1 | Where-Object { $_.property1 -eq $array2.property1 } | Select-Object property2).property2
}
}
I tried to find out if any of the comparison operators had this kind of option but I couldn't find anything.
Thank you! :)
PS : Sorry for my English, it's not my native language...
You can do this with the help of a hash table, which allows fast lookups. Group-Object -AsHashTable also helps greatly with building the hash table:
$array1 = @(thousandItemsForExample)
$array2 = thousandItemsForExample | Group-Object property1 -AsHashTable -AsString
$result = foreach ($item in $array1) {
[PSCustomObject]@{
property1 = $item.property1
property2 = $array2[$item.property1].property2
}
}
Create a hashtable and load all the items from $array2 into it, using the value of property1 as the key:
$array1 = @(thousandItemsForExample)
$array2 = @(thousandItemsForExample)
$lookupTable = @{}
$array2 |ForEach-Object {
$lookupTable[$_.property1] = $_
}
Fetching the corresponding item from the hashtable by key is going to be significantly faster than filtering the whole array with Where-Object every time:
foreach ($array1item in $array1) {
$object = [PSCustomObject]@{
property1 = $array1item.property1
property2 = $lookupTable[$array1item.property1].property2
}
}

Powershell question - Looking for fastest method to loop through 500k objects looking for a match in another 500k object array

I have two large .csv files that I've imported using the import-csv cmdlet. I've done a lot of searching and trying and am finally posting to ask for some help to make this easier.
I need to move through the first array that will have anywhere from 80k rows to 500k rows. Each object in these arrays has multiple properties, and I then need to find the corresponding entry in a second array of the same size matching on a property from there.
I'm importing them as [System.Collections.ArrayList] and I've tried storing them in hashtables too. I have even tried to work with LINQ, which was mentioned in several other posts.
Any chance anyone can offer advice or insight how to make this run faster? It feels like I'm looking in one haystack for matching hay in a different stack.
$ImportTime1 = Measure-Command {
[System.Collections.ArrayList]$fileList1 = Import-csv file1.csv
[System.Collections.ArrayList]$fileSorted1 = ($fileList1 | Sort-Object -property 'Property1' -Unique -Descending)
Remove-Variable fileList1
}
$ImportTime2 = Measure-Command {
[System.Collections.ArrayList]$fileList2 = Import-csv file2.csv
[System.Collections.ArrayList]$fileSorted2 = ($fileList2 | Sort-Object -property 'Property1' -Unique -Descending)
Remove-Variable fileList2
}
$fileSorted1.foreach({
$variable1 = $_
$target = $fileSorted2.where({$_ -eq $variable1})
###do some other stuff
})
This may be of use: https://powershell.org/forums/topic/comparing-two-multi-dimensional-arrays/
Use the updated solution in comment #27359, with the change suggested by Max Kozlov in comment #27380 applied:
Function RJ-CombinedCompare() {
[CmdletBinding()]
PARAM(
[Parameter(Mandatory=$True)]$List1,
[Parameter(Mandatory=$True)]$L1Match,
[Parameter(Mandatory=$True)]$List2,
[Parameter(Mandatory=$True)]$L2Match
)
$hash = @{}
foreach ($data in $List1) {$hash[$data.$L1Match] += ,[pscustomobject]@{Owner=1;Value=$($data)}}
foreach ($data in $List2) {$hash[$data.$L2Match] += ,[pscustomobject]@{Owner=2;Value=$($data)}}
foreach ($kv in $hash.GetEnumerator()) {
$m1, $m2 = $kv.Value.where({$_.Owner -eq 1}, 'Split')
[PSCustomObject]@{
MatchValue = $kv.Key
L1Matches = $m1.Count
L2Matches = $m2.Count
L1MatchObject = $L1Match
L2MatchObject = $L2Match
List1 = $m1.Value
List2 = $m2.Value
}
}
}
$fileList1 = Import-csv file1.csv
$fileList2 = Import-csv file2.csv
$newList = RJ-CombinedCompare -List1 $fileList1 -L1Match $(yourcolumnhere) -List2 $fileList2 -L2Match $(yourothercolumnhere)
foreach ($item in $newList) {
# your logic here
}
It should be fast to pass the lists into this hashtable and it's fast to iterate through as well.

getting answer from sql query about specific command

I am new here. I tried to find an answer to my problem but couldn't.
I have two databases, and I am querying different information from each and comparing the results. I am thinking of what many other languages call "contains".
I need to find which names rapor_one has that rapor_two does not have. I probably have a problem with the Contains part. How can I use it?
for($i=1; $i -le $rapor_one.Rows.Count;$i++ ){
if($rapor_two.Contains($row)){
Write-Host $row
}
}
If it matters, I am using a connection string and a data adapter, which I am also adding below:
$connectionString =
$connection =
$connection.ConnectionString =
$connection.Open()
$command =
$command.CommandText =
$DataAdapter = new-object System.Data.SqlClient.SqlDataAdapter $
$Dataset = new-object System.Data.Dataset
$DataAdapter.Fill($Dataset)
I am sure my query works; I get a response from the database, but I have a problem comparing the results.
****ADDED: I also want to ask: if there were two or more attributes in my variable, how could I compare them?
****ADDED:
This algorithm is better. Can't we use contains as in other modern languages (I know it exists in C#, Java, etc.)?
foreach($row in $rapor_two)
{
if ($rapor_one -contains $row){
Write-Host "True"
}
}

Powershell Merge 2 lists Performance

I have two lists in Powershell with a large amount of data:
$Byods containing MACs and Usernames with approximately 5000 items
$DHCPLeases containing MACs and IPs with approximately 3000 items
I want to create a new list containing Usernames and IPs, where the Byods list is leading and the IPs are looked up in the DHCPLeases, containing only the records that found a match (a left join?).
I created a foreach loop that does the job. However, it takes a huge amount of time to complete (> 30 min).
I'm sure this can be faster. Anyone?
$UserByods = @()
foreach ($lease in $DHCPLeases)
{
$MAC = [string]$lease.MAC
$UserByod = @()
$UserByod = $Byods | where {$_.MAC -eq $MAC}
if (($UserByod | measure).count -eq 1) {
$ByodIP = New-Object -TypeName PSObject
$ByodIP | Add-Member -Name 'User' -MemberType Noteproperty -Value $UserByod.Username
$ByodIP | Add-Member -Name 'IP' -MemberType Noteproperty -Value $lease.IP
$UserByods += $ByodIP
}
}
A number of improvements can be made here. First off, don't use Add-Member to construct the objects; it is significantly slower than specifying the properties up front.
You should also avoid using the addition operator (+=) on a collection: arrays have a fixed size, so each += creates a new array and copies every existing element into it, which is quite memory-intensive.
Finally, use a hashtable for the MAC correlation; it will be much faster than looping through all 5000 entries 3000 times (which is basically what ...| Where {...} does):
$BYODTable = $Byods |ForEach-Object -Begin { $table = @{} } -Process { $table[$_.MAC] = $_.Username } -End { return $table }
$UserByods = foreach ($lease in $DHCPLeases)
{
$MAC = [string]$lease.MAC
if ($BYODTable.ContainsKey($MAC)) {
New-Object -TypeName PSObject -Property @{
User = $BYODTable[$MAC]
IP = $lease.IP
}
}
}
Appending to an array in a loop is slow. Just output your custom objects in the loop and collect the loop output in the variable $UserByods. Linear reads on a list are slow as well. Better build a hashtable from $Byods so you can lookup devices by their MAC address.
$tbl = @{}
$Byods | ForEach-Object { $tbl[$_.MAC] = $_ }
$UserByods = foreach ($lease in $DHCPLeases) {
New-Object -TypeName PSObject -Property @{
'User' = $tbl[$lease.MAC].Username
'IP' = $lease.IP
}
}
I do not have your list but I wonder how my Join-Object cmdlet performs on this.
The command should be something like this:
$Byods | LeftJoin $DHCPLeases Mac
I have put quite some effort into the performance, but as it is a general solution, it might not be able to compete with the specific solutions given here...

Put results in a table and then sorted output

I am writing a script that produces two outputs within a foreach loop: one string, $server, and one integer, $util (let's say I get 20 results).
What is the simplest approach to put my results in a table while running the loop, so that I can output them sorted (descending) after the loop is finished?
SERVER UTIL
------ ----
SERVER001 95
SERVER002 74
SERVER003 32
SERVER004 12
If you want to sort the results in descending order, you will have to put the results in an array and then sort outside the loop, like so:
$arr = @()
foreach ($item in $collection)
{
$arr += [pscustomobject]@{
Server = $item.server
util = $item.util
}
}
$arr | Sort-Object -Property Util -Descending
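The += in the loop above builds a new array on every iteration, which gets slow for large inputs. A sketch of the same idea that collects the loop output directly, assuming $collection holds objects with server and util properties (the sample values below are made up for illustration):

```powershell
# Hypothetical sample input, for illustration only
$collection = @(
    [PSCustomObject]@{ server = 'SERVER003'; util = 32 }
    [PSCustomObject]@{ server = 'SERVER001'; util = 95 }
    [PSCustomObject]@{ server = 'SERVER002'; util = 74 }
)

# Assign the foreach statement's output to a variable instead of appending with +=
$results = foreach ($item in $collection) {
    [PSCustomObject]@{
        Server = $item.server
        Util   = $item.util
    }
}

# Sort once, after the loop has finished
$sorted = $results | Sort-Object -Property Util -Descending
$sorted
```

For 20 results the difference is negligible, so this is mostly a habit worth having for larger loops.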
