Switch vs If-Else Performance

I have the following If block in a logon script which I am rewriting:
If ($distinguishedname -match 'Joe Bloggs') {
    Map-Drive 'X' "\\path\to\drive"
}
If ($distinguishedname -match 'Steve Bloggs') {
    Map-Drive 'X' "\\path\to\drive"
}
If ($distinguishedname -match 'Joe Jobs') {
    Map-Drive 'X' "\\path\to\drive"
}
This obviously needs to be rewritten as an If/ElseIf chain (as each user only has one name!). However, I prefer the look of the following switch -Regex approach:
switch -Regex ($distinguishedname) {
    'Joe Bloggs'   { Map-Drive 'X' "\\path\to\drive"; break }
    'Steve Bloggs' { Map-Drive 'X' "\\path\to\drive"; break }
    'Joe Jobs'     { Map-Drive 'X' "\\path\to\drive"; break }
}
My question is: would using a switch in this manner have any impact on the performance of this function? It must be better than the above (if/if/if), since not every possibility is evaluated each time, but would the switch be faster than an if/elseif/else chain?

I wrote this test to check if I could figure out which way is better using Measure-Command:
function switchtest {
    param($name)
    switch -Regex ($name) {
        $optionsarray[0] {
            Write-Host $name
            break
        }
        $optionsarray[1] {
            Write-Host $name
            break
        }
        $optionsarray[2] {
            Write-Host $name
            break
        }
        $optionsarray[3] {
            Write-Host $name
            break
        }
        $optionsarray[4] {
            Write-Host $name
            break
        }
        default { }
    }
}
function iftest {
    param($name)
    If ($name -match $optionsarray[0]) { Write-Host $name }
    ElseIf ($name -match $optionsarray[1]) { Write-Host $name }
    ElseIf ($name -match $optionsarray[2]) { Write-Host $name }
    ElseIf ($name -match $optionsarray[3]) { Write-Host $name }
    ElseIf ($name -match $optionsarray[4]) { Write-Host $name }
}
$optionsarray = @('Joe Bloggs', 'Blog Joggs', 'Steve Bloggs', 'Joe Jobs', 'Steve Joggs')
for ($i = 0; $i -lt 10000; $i++) {
    $iftime = 0
    $switchtime = 0
    $rand = Get-Random -Minimum 0 -Maximum 4
    $name = $optionsarray[$rand]
    $iftime = (Measure-Command { iftest $name }).Ticks
    $switchtime = (Measure-Command { switchtest $name }).Ticks
    Add-Content -Path C:\path\to\outfile\timetest.txt -Value "$switchtime`t$iftime"
}
Results
On average, this is how each function performed in 10,000 tests:
Switch - 11592.8566 ticks
IfElse - 15740.3281 ticks
The results were not entirely consistent (sometimes switch was faster, sometimes if/elseif was faster), but since switch is faster overall (on the mean average) I will be using it instead of if/elseif.
Would appreciate any feedback on this decision and my testing.

Typically, switch statements work by building a jump table in the generated code and using that to choose the appropriate branch, instead of performing comparisons like if/else. That's why switch statements are faster. I believe that with strings the compiler generates a hash code of each string and uses that to implement the jump table, so the switch statement remains faster. So the switch should be faster than the if/if/if you have written above, but it may not be, since jump tables typically rely on the options being somewhat evenly spaced (e.g. 1, 2, 3 or 5, 10, 15).
With that said, why don't you use an if/elseif/elseif instead of an if/if/if? That will definitely be faster, since not every option is evaluated each time.
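As for the testing: individual Measure-Command calls are noisy (process scheduling, console output, warm-up), so one way to tighten the comparison is to time many iterations inside a single Measure-Command call. The following is only a rough sketch, and it assumes the switchtest/iftest functions and $optionsarray from the test script above are already defined; exact numbers will still vary by machine and PowerShell version.
# Minimal sketch: time 10,000 calls of each function in one measurement.
# Assumes switchtest, iftest and $optionsarray from the test script above.
$iterations = 10000
$switchTotal = (Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        # -Maximum is exclusive, so 5 covers indexes 0-4
        switchtest $optionsarray[(Get-Random -Minimum 0 -Maximum 5)]
    }
}).TotalMilliseconds
$ifTotal = (Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        iftest $optionsarray[(Get-Random -Minimum 0 -Maximum 5)]
    }
}).TotalMilliseconds
"switch total: {0:N2} ms   if/elseif total: {1:N2} ms" -f $switchTotal, $ifTotal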

Related

How to get the highest value from a list of device names

I am generating a list of Windows workstation computer names by reading Active Directory, and I need to find the highest number so that I can assign a new device the next available number. I am not having any success in doing this - how can I do it? And as you can see from the list of names, I also have missing numbers in the sequence that, ideally, I would like to fill with new devices as well...
The code I am using to get the list from AD is below.
((Get-ADComputer -Filter {operatingsystem -notlike "*server*" -and Name -like $NamingConvention -and enabled -eq "true"} -Credential $credential -server $ADServerIP).Name)
List of device names
PC01
PC28
PC29
PC30
PC31
PC32
PC33
PC34
PC35
PC36
PC37
PC38
PC40
PC41
PC42
PC43
PC44
PC45
PC46
PC47
PC27
PC48
PC26
PC24
PC179
PC18
PC180
PC181
PC182
PC183
PC184
PC185
PC186
PC187
PC188
PC189
PC19
PC190
PC191
PC192
PC21
PC22
PC23
PC25
PC178
PC49
PC51
PC77
PC78
PC79
PC80
PC81
PC83
PC84
PC85
PC87
PC88
PC89
PC90
PC91
PC92
PC93
PC94
PC95
PC96
PC97
PC76
PC50
PC75
PC72
PC52
PC53
PC54
PC55
PC56
PC57
PC59
PC60
PC61
PC62
PC63
PC64
PC65
PC66
PC67
PC68
PC69
PC70
PC71
PC73
PC98
PC177
PC175
PC115
PC116
PC117
PC118
PC119
PC12
PC120
PC121
PC122
PC123
PC124
PC125
PC126
PC127
PC128
PC129
PC13
PC130
PC131
PC114
PC132
PC113
PC111
PC02
PC03
PC04
PC06
PC08
PC09
PC10
PC100
PC101
PC102
PC103
PC104
PC105
PC106
PC107
PC108
PC109
PC11
PC110
PC112
PC176
PC133
PC135
PC158
PC159
PC16
PC160
PC161
PC162
PC163
PC164
PC165
PC166
PC167
PC168
PC169
PC17
PC170
PC171
PC172
PC173
PC174
PC157
PC134
PC156
PC154
PC136
PC137
PC138
PC139
PC14
PC140
PC141
PC142
PC143
PC144
PC145
PC146
PC147
PC148
PC149
PC150
PC151
PC152
PC153
PC155
PC99
Sort the PC names on their numeric values and select the last one:
$lastPC = (Get-ADComputer -Filter {operatingsystem -notlike "*server*" -and Name -like $NamingConvention -and enabled -eq "true"} -Credential $credential -server $ADServerIP).Name |
Sort-Object { [int]($_ -replace '\D+')} | Select-Object -Last 1
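If you just need the next free name after the highest one (ignoring the gaps in the sequence, which the next answer addresses), a possible follow-up, assuming every name matches the PC<number> pattern, is:
# Assumes $lastPC from above and that names always look like PC<number>.
$nextNumber = [int]($lastPC -replace '\D+') + 1
'PC{0:d2}' -f $nextNumber   # pads to at least two digits, matching PC01, PC02, ...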
Here's a solution that will give you the highest number ($dataMax), the missing numbers ($dataMissing), and the next number to use ($dataNext). The next number to use will be either the first missing number, or, if there are no missing numbers, the highest number + 1.
# load the computers list
$data = ((Get-ADComputer -Filter {operatingsystem -notlike "*server*" -and Name -like $NamingConvention -and enabled -eq "true"} -Credential $credential -server $ADServerIP).Name)
# create an array by splitting the data text using the "space" character as a delimiter
$data = $data.Split(" ")
# remove all the alpha characters ("PC"), leaving only the number values so it can be sorted easier
$dataCleaned = $data -replace "[^0-9]" , '' | sort { [int]$_ }
# after sorting the data, [-1] represents the last element in the array which will be the highest number
[int]$dataMax = $dataCleaned[-1]
# create a number range that represents all the numbers from 1 to the highest number
$range = 1..$dataMax | foreach-object { '{0:d2}' -f $_ }
# compare the created range against the numbers actually in the computer array to find the missing numbers
$dataMissing = @(compare $range $dataCleaned -PassThru)
# if there's a missing value, [0] represents the first element in the array of missing numbers
if ($dataMissing)
{
$dataNext = $dataMissing[0]
}
# if there's no missing values, the next value is the max value + 1
else
{
$dataMissing = "none"
$dataNext = $dataMax + 1
}
Write-Host "The highest number is:"('{0:d2}' -f $dataMax)
Write-Host "The missing numbers are: $dataMissing"
Write-Host "The next number to use is:" ('{0:d2}' -f $dataNext)
Assuming your list is exactly as it appears to be, then this appears to be one way to do it:
$List = 'PC01 PC28 PC29 PC30 PC31 PC32 PC33 PC34 PC35 PC36 PC37 PC38 PC40 PC41 PC42 PC43 PC44 PC45 PC46 PC47 PC27 PC48 PC26 PC24 PC179 PC18 PC180 PC181 PC182 PC183 PC184 PC185 PC186 PC187 PC188 PC189 PC19 PC190 PC191 PC192 PC21 PC22 PC23 PC25 PC178 PC49 PC51 PC77 PC78 PC79 PC80 PC81 PC83 PC84 PC85 PC87 PC88 PC89 PC90 PC91 PC92 PC93 PC94 PC95 PC96 PC97 PC76 PC50 PC75 PC72 PC52 PC53 PC54 PC55 PC56 PC57 PC59 PC60 PC61 PC62 PC63 PC64 PC65 PC66 PC67 PC68 PC69 PC70 PC71 PC73 PC98 PC177 PC175 PC115 PC116 PC117 PC118 PC119 PC12 PC120 PC121 PC122 PC123 PC124 PC125 PC126 PC127 PC128 PC129 PC13 PC130 PC131 PC114 PC132 PC113 PC111 PC02 PC03 PC04 PC06 PC08 PC09 PC10 PC100 PC101 PC102 PC103 PC104 PC105 PC106 PC107 PC108 PC109 PC11 PC110 PC112 PC176 PC133 PC135 PC158 PC159 PC16 PC160 PC161 PC162 PC163 PC164 PC165 PC166 PC167 PC168 PC169 PC17 PC170 PC171 PC172 PC173 PC174 PC157 PC134 PC156 PC154 PC136 PC137 PC138 PC139 PC14 PC140 PC141 PC142 PC143 PC144 PC145 PC146 PC147 PC148 PC149 PC150 PC151 PC152 PC153 PC155 PC99'
$NextNumber = ($List -split "\s" | ForEach-Object { if ($_ -match 'PC(?<Number>\d+)') { $Matches.Number } } | Measure-Object -Maximum).Maximum + 1
$NextNumber
"PC$NextNumber"

PowerShell Performance

I have a problem with PowerShell performance while searching a 40 GB log file.
I need to check whether any of 1,000 email addresses appear in this 40 GB file. This would take 180 hours :D Any ideas?
$logFolder = "H:\log.txt"
$adressen = Get-Content H:\Adressen.txt
$ergebnis = @()
foreach ($adr in $adressen) {
    $suche = Select-String -Path $logFolder -Pattern "\[\(\'from\'\,.*$adr.*\'\)\]" -List
    $aktiv = $false
    $adr
    if ($suche) {
        $aktiv = $true
    }
    if ($aktiv -eq $true) {
        $ergebnis += $adr + ";Ja"
    }
    else {
        $ergebnis += $adr + ";Nein"
    }
}
$ergebnis | Out-File H:\output.txt
Don't read the file 1,000 times.
Build a regex pattern containing all 1,000 addresses (it's going to be a huge line, but hey, much smaller than 40 GB). Like:
$Pattern = "\[\(\'from\'\,.*$( $adressen -join '|' ).*\'\)\]"
Then do your Select-String once, and save the result so you can do an address-by-address search in it. Hopefully the result will be much smaller than 40 GB, and the search should be much faster.
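A rough sketch of that two-step approach (one pass over the 40 GB file, then per-address checks against the much smaller in-memory result) might look like this; it reuses $logFolder, $adressen and $Pattern from above, and [regex]::Escape is added because email addresses contain regex metacharacters such as dots:
# Sketch only: one Select-String pass, then in-memory per-address lookups.
$hits = Select-String -Path $logFolder -Pattern $Pattern | ForEach-Object { $_.Line }
$ergebnis = foreach ($adr in $adressen) {
    if ($hits -match [regex]::Escape($adr)) { "$adr;Ja" } else { "$adr;Nein" }
}
$ergebnis | Out-File H:\output.txt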
As mentioned in the comments, replace
$ergebnis = @()
with
$ergebnis = New-Object System.Collections.ArrayList
and
$ergebnis+=$adr + ";Ja"
with
$ergebnis.add("$adr;Ja")
or respective
$ergebnis.add("$adr;Nein")
This will speed up your script quite a bit.
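Putting those replacements together, the collection part of the loop might look roughly like this (a sketch only; the rest of the logic stays as in the question, and [void] just suppresses the index that ArrayList.Add returns):
$ergebnis = New-Object System.Collections.ArrayList
foreach ($adr in $adressen) {
    $suche = Select-String -Path $logFolder -Pattern "\[\(\'from\'\,.*$adr.*\'\)\]" -List
    if ($suche) {
        [void]$ergebnis.Add("$adr;Ja")
    }
    else {
        [void]$ergebnis.Add("$adr;Nein")
    }
}
$ergebnis | Out-File H:\output.txt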

Invoke-Command faster than the command itself?

I was trying to measure some ways to write to files in PowerShell. No question about that, but I don't understand why the first Measure-Command statement below takes longer to execute than the second one.
They are the same, except that in the second one I wrap the code in a script block and pass it to Invoke-Command, while in the first one I just run the code directly.
All the information I can find about Invoke-Command performance is about remoting.
This block takes about 4 seconds:
Measure-Command {
    $stream = [System.IO.StreamWriter] "$PSScriptRoot\t.txt"
    $i = 0
    while ($i -le 1000000) {
        $stream.WriteLine("This is the line number: $i")
        $i++
    }
    $stream.Close()
} # takes 4 sec
And this code below, which is exactly the same but written in a script block passed to Invoke-Command, takes about 1 second:
Measure-Command {
    $cmdtest = {
        $stream = [System.IO.StreamWriter] "$PSScriptRoot\t2.txt"
        $i = 0
        while ($i -le 1000000) {
            $stream.WriteLine("This is the line number: $i")
            $i++
        }
        $stream.Close()
    }
    Invoke-Command -ScriptBlock $cmdtest
} # Takes 1 second
How is that possible?
As it turns out, based on feedback from a PowerShell team member on GitHub issue #8911, the issue is more generally about (implicit) dot-sourcing (such as direct invocation of an expression) vs. running in a child scope, such as with &, the call operator, or, in the case at hand, with Invoke-Command -ScriptBlock.
Running in a child scope avoids variable lookups that are performed when (implicitly) dot-sourcing.
Therefore, as of Windows PowerShell v5.1 / PowerShell (Core) 7.2.x, you can speed up statements involving script blocks by invoking them via & { ... }, in a child scope (somewhat counter-intuitively, given that creating a new scope involves extra work).
Note that using & means that such blocks then cannot modify the caller's variables directly, but there are workarounds.
The following simplified code, which uses a foreach expression to loop 1 million times (1e6) demonstrates the performance advantage of running via & { ... }:
# REGULAR, direct invocation of an expression (a `foreach` statement in this case),
# which is implicitly DOT-SOURCED
(Measure-Command { $result = foreach ($n in 1..1e6) { $n } }).TotalSeconds
# OPTIMIZED invocation in CHILD SCOPE, using & { ... }
# up to 10+ TIMES FASTER, depending on OS and PowerShell edition
(Measure-Command { $result = & { foreach ($n in 1..1e6) { $n } } }).TotalSeconds
However, note that the performance advantage diminishes and can even go away the more preexisting variables are being referenced in the script block:
# Define a few sample variables to reference in the script blocks.
# Note that, due to PowerShell's dynamic scoping, even the child
# scope created by & { ... } sees these variables.
$i1=1; $i2=2; $i3=3; $i4=4; $i5=5
(Measure-Command { $result = foreach ($n in 1..1e6) { $n, $i1, $i2, $i3, $i4, $i5 } }).TotalSeconds
# MAY OR MAY NOT BE FASTER, depending on the OS and PowerShell edition.
(Measure-Command { $result = & { foreach ($n in 1..1e6) { $n, $i1, $i2, $i3, $i4, $i5 } } }).TotalSeconds
The reason is that variables that aren't created in the script block (by assigning to them inside it) require a variable lookup with & { ... } too, due to PowerShell's dynamic scoping (see this answer).
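As an experiment (not a recommendation from the linked answer), you could copy the preexisting values into variables that are assigned inside the block, so that the loop body only touches block-local variables; whether this recovers the speed-up varies by OS and PowerShell edition.
# Sketch: the copies $a1..$a5 are created inside the child scope, so the loop
# body no longer looks up $i1..$i5 on every iteration.
(Measure-Command {
    $result = & {
        $a1 = $i1; $a2 = $i2; $a3 = $i3; $a4 = $i4; $a5 = $i5
        foreach ($n in 1..1e6) { $n, $a1, $a2, $a3, $a4, $a5 }
    }
}).TotalSeconds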

Compare-Object PowerShell performance and Operation VS Loop

I was looking at this question where the OP wanted to know how to compare items in two arrays without looping through each array.
The command given was:
$array3 = @(Compare-Object $array1 $array2 | select -Expand InputObject)
My question is two-fold:
One, does this actually avoid iterating through the arrays in any form? Or does it simply hide the iteration from the user by doing it behind the scenes?
Two, as far as performance goes, is this the best method for comparing objects? It appears to me that it is actually significantly slower.
I made a real crude test:
$Array1 = @("1","2","Orchid","Envy","Sam","Map Of the World","Short String","s","V","DM","qwerty","1234567891011")
$Array2 = @("Bob", "Helmet", "Jane")
$Date1 = Get-Date
$Array2 | ForEach-Object `
{
    if ($Array1 -contains $_) {}
}
$Date2 = Get-Date
$Time1 = [TimeSpan]$Date2.Subtract($Date1)
Write-Host $Time1
$Date1 = Get-Date
$Array3 = @(Compare-Object $Array1 $Array2)
$Date2 = Get-Date
$Time2 = [TimeSpan]$Date2.Subtract($Date1)
Write-Host $Time2
And my times came out:
ForEach-Object: 00:00:00.0030001
Compare-Object: 00:00:00.0030002
Edit
I updated the script to make it more fair, and it essentially evened out the times.
So what is the behind the scenes difference between Compare-Object and a traditional loop? Am I correct in assuming none?
Edit 2
I found this code using the decompiler:
internal int Compare(ObjectCommandPropertyValue first, ObjectCommandPropertyValue second)
{
    if (first.IsExistingProperty && second.IsExistingProperty)
        return this.Compare(first.PropertyValue, second.PropertyValue);
    if (first.IsExistingProperty)
        return -1;
    return second.IsExistingProperty ? 1 : 0;
}
public int Compare(object first, object second)
{
    if (ObjectCommandComparer.IsValueNull(first) && ObjectCommandComparer.IsValueNull(second))
        return 0;
    PSObject psObject1 = first as PSObject;
    if (psObject1 != null)
        first = psObject1.BaseObject;
    PSObject psObject2 = second as PSObject;
    if (psObject2 != null)
        second = psObject2.BaseObject;
    try
    {
        return LanguagePrimitives.Compare(first, second, !this.caseSensitive, (IFormatProvider) this.cultureInfo) * (this.ascendingOrder ? 1 : -1);
    }
    catch (InvalidCastException ex)
    {
    }
    catch (ArgumentException ex)
    {
    }
    return string.Compare(((object) PSObject.AsPSObject(first)).ToString(), ((object) PSObject.AsPSObject(second)).ToString(), !this.caseSensitive, this.cultureInfo) * (this.ascendingOrder ? 1 : -1);
}
I have traced it around as best as I can, and I believe these are the two worker threads. It appears Compare-Object actually only does a 1 <==> 1 check down the list. Am I missing something here?

How to split a huge folder?

We have a folder on Windows that's ... huge. I ran "dir > list.txt". The command lost response after 1.5 hours. The output file is about 200 MB. It shows there are at least 2.8 million files. I know the situation is stupid, but let's focus on the problem itself. If I have such a folder, how can I split it into some "manageable" sub-folders? Surprisingly, all the solutions I have come up with involve getting all the files in the folder at some point, which is a no-no in my case. Any suggestions?
Thanks to Keith Hill and Mehrdad. I accepted Keith's answer because that's exactly what I wanted to do, but I couldn't quite get PowerShell working quickly enough.
With Mehrdad's tip, I wrote this little program. It took 7+ hours to move 2.8 million files. So the initial dir command did finish, but somehow it never returned to the console.
using System;
using System.IO;

namespace SplitHugeFolder
{
    class Program
    {
        static void Main(string[] args)
        {
            var destination = args[1];
            if (!Directory.Exists(destination))
                Directory.CreateDirectory(destination);

            var di = new DirectoryInfo(args[0]);
            var batchCount = int.Parse(args[2]);
            int currentBatch = 0;
            string targetFolder = GetNewSubfolder(destination);

            foreach (var fileInfo in di.EnumerateFiles())
            {
                if (currentBatch == batchCount)
                {
                    Console.WriteLine("New Batch...");
                    currentBatch = 0;
                    targetFolder = GetNewSubfolder(destination);
                }

                var source = fileInfo.FullName;
                var target = Path.Combine(targetFolder, fileInfo.Name);
                File.Move(source, target);
                currentBatch++;
            }
        }

        private static string GetNewSubfolder(string parent)
        {
            string newFolder;
            do
            {
                newFolder = Path.Combine(parent, Path.GetRandomFileName());
            } while (Directory.Exists(newFolder));
            Directory.CreateDirectory(newFolder);
            return newFolder;
        }
    }
}
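For reference, the program takes the source folder, the destination folder and the batch size as command-line arguments, so a run that splits into subfolders of 100,000 files each might look like this (paths are placeholders):
# <source> <destination> <batch size>
.\SplitHugeFolder.exe H:\hugedir H:\split 100000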
I use Get-ChildItem to index my whole C: drive every night into c:\filelist.txt. That's about 580,000 files and the resulting file size is ~60MB. Admittedly I'm on Win7 x64 with 8 GB of RAM. That said, you might try something like this:
md c:\newdir
Get-ChildItem C:\hugedir -r |
    Foreach -Begin { $i = $j = 0 } -Process {
        if ($i++ % 100000 -eq 0) {
            $dest = "C:\newdir\dir$j"
            md $dest
            $j++
        }
        Move-Item $_ $dest
    }
The key is to do the move in a streaming manner. That is, don't collect all the Get-ChildItem results into a single variable and then proceed; that would require all 2.8 million FileInfos to be in memory at once. Also, if you use the Name parameter on Get-ChildItem it will output a single string containing the file's path relative to the base dir. Even then, perhaps this size will just overwhelm the memory available to you. And no doubt it will take quite a while to execute. If I recall correctly, my indexing script takes several hours.
If it does work, you should wind up with c:\newdir\dir0 through dir28, but then again, I haven't tested this script at all, so your mileage may vary. BTW, this approach assumes that your huge dir is a pretty flat dir.
Update: Using the Name parameter is almost twice as slow so don't use that parameter.
I found out that Get-ChildItem is the slowest option when working with many items in a directory.
Look at the results:
Measure-Command { Get-ChildItem C:\Windows -rec | Out-Null }
TotalSeconds : 77,3730275
Measure-Command { listdir C:\Windows | Out-Null }
TotalSeconds : 20,4077132
measure-command { cmd /c dir c:\windows /s /b | out-null }
TotalSeconds : 13,8357157
(with listdir function defined like this:
function listdir($dir) {
    $dir
    [System.IO.Directory]::GetFiles($dir)
    foreach ($d in [System.IO.Directory]::GetDirectories($dir)) {
        listdir $d
    }
}
)
With this in mind, here is what I would do: I would stay in PowerShell, but use a more low-level approach with .NET methods:
function DoForFirst($directory, $max, $action) {
    function go($dir, $options)
    {
        foreach ($f in [System.IO.Directory]::EnumerateFiles($dir))
        {
            if ($options.Remaining -le 0) { return }
            & $action $f
            $options.Remaining--
        }
        foreach ($d in [System.IO.Directory]::EnumerateDirectories($dir))
        {
            if ($options.Remaining -le 0) { return }
            go $d $options
        }
    }
    go $directory (New-Object PSObject -Property @{ Remaining = $max })
}
doForFirst c:\windows 100 {write-host File: $args }
# I use PsObject to avoid global variables and ref parameters.
To use the code you have to switch to .NET 4.0 runtime -- enumerating methods are new in .NET 4.0.
You can specify any script block as the -action parameter, so in your case it would be something like { Move-Item -LiteralPath $args -Destination C:\dir }.
Just try listing the first 1,000 items; I hope it will finish very quickly:
doForFirst c:\yourdirectory 1000 {write-host '.' -nonew }
And of course you can process all items at once, just use
doForFirst c:\yourdirectory ([long]::MaxValue) {move-item ... }
and each item should be processed immediately after it is returned. So the whole list is not read at once and then processed, but it is processed during reading.
How about starting with this:
cmd /c dir /b > list.txt
That should get you a list of all the file names.
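Once you have list.txt (which for dir /b contains bare file names relative to the huge folder), you could consume it in a streaming way much like the script above, for example (paths and batch size are placeholders):
# Sketch: stream the name list and move files in batches of 100,000.
$i = $j = 0
Get-Content C:\list.txt | ForEach-Object {
    if ($i++ % 100000 -eq 0) {
        $dest = "C:\newdir\dir$j"
        md $dest | Out-Null
        $j++
    }
    Move-Item -LiteralPath (Join-Path C:\hugedir $_) -Destination $dest
}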
If you're doing "dir > list.txt" from a PowerShell prompt, dir is an alias for Get-ChildItem. Get-ChildItem has known issues enumerating large directories, and the object collections it returns can get huge.
