Efficiently finding an extremum - sorting

A common answer to "how do I find the newest file" is:
dir | Sort-Object -Property LastWriteTime | Select-Object -Last 1
This sorts the entire listing just to take one element, which is inefficient for a large number of files: finding an extremum only needs a single O(n) pass, not an O(n log n) sort.
Is there a built-in way to efficiently find extrema?

The fastest method I know of to accomplish that for a large directory is:
(cmd /c dir /b /a-d /tw /od)[-1]
(/b prints bare names, /a-d excludes directories, /tw uses the last-write time field, and /od sorts oldest-first, so the last element is the newest file.)

And for something a bit more .NET programmer-ish :-)
[Linq.Enumerable]::First([Linq.Enumerable]::OrderByDescending((new-object IO.DirectoryInfo $pwd).EnumerateFiles(), [Func[IO.FileInfo,DateTime]]{param($f) $f.LastWriteTime}))
This will return the full .NET FileInfo object. In limited testing, it seems to perform on the same order as @mjolinor's solution.

Another way to do it:
$newest = $null
dir | % { if ($newest -eq $null -or $_.LastWriteTime -gt $newest.LastWriteTime) { $newest = $_ } }
$newest
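For completeness, Measure-Object is a built-in single-pass alternative, but it returns the extreme value rather than the file object (a minimal sketch):
# Built-in O(n) maximum; yields the LastWriteTime value, not the FileInfo object
(Get-ChildItem -File | Measure-Object -Property LastWriteTime -Maximum).Maximum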

Here's one approach. The function Max:
function Max ($Property)
{
    $max = $null
    foreach ($elt in $input)
    {
        if ($max -eq $null) { $max = $elt }
        if ($elt.$Property -gt $max.$Property) { $max = $elt }
    }
    $max
}
can be used to define Newest:
function Newest () { $input | Max LastWriteTime }
It can be called as such:
dir | Newest
It can also be used to define Largest:
function Largest () { $input | Max Length }
E.g.:
dir -File | Largest
Similarly, Min can be used to define Oldest and Smallest:
function Min ($Property)
{
    $min = $null
    foreach ($elt in $input)
    {
        if ($min -eq $null) { $min = $elt }
        if ($elt.$Property -lt $min.$Property) { $min = $elt }
    }
    $min
}
function Oldest () { $input | Min LastWriteTime }
function Smallest () { $input | Min Length }
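By analogy with the earlier examples:
dir -File | Oldest
dir -File | Smallest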

Related

Nested intervals within PowerShell (c#) / faster alternative to "where-object" with sorted lists?

I'm looking for a way to accelerate the (Windows 10) PowerShell command Where-Object for a sorted array.
In the end the array will contain thousands of lines from a log file. All lines in the log file start with date and time and are sorted by date/time (new lines will always be appended).
The following command would work but is extremely slow and ineffective with a sorted array:
$arrFileContent | where {($_ -ge $Start) -and ($_ -le $End)}
Here is a (strongly simplified) example:
$arrFileContent = @("Bernie", "Emily", "Fred", "Jake", "Keith", "Maria", "Paul", "Richard", "Sally", "Tim", "Victor")
$Start = "E"
$End = "P"
Expected result: "Emily", "Fred", "Jake", "Keith", "Maria", "Paul".
I guess that using "nested intervals" should be much faster: find the first entry starting with "E" or above and the last entry starting with "P" or below, and return all entries in between.
I suppose there must be a simple PowerShell or .NET solution for this, so I won't have to code it myself, correct?
Edit 31.08.19: Not sure if "nested intervals" (German "Intervallschachtelung") is the right term.
What I mean is the "telephone book principle": Open the book in the middle, check if the wanted name is listed before or after, open the book in the middle of the first (or last) half, and so on.
In this case (checking 100,000 lines of a log file for a given date range):
- check line no. 50,000
- if it is after the given start date, check line no. 75,000; else check line no. 25,000
- check line no. 75,000 (or 25,000)
- if it is after the given start date, check line no. 87,500 (or ...); else check line no. 62,500 (or ...)
- and so on ...
The log file contains lines like this:
2018-01-17 14:28:19 Installation xxx started
(only with a lot more text)
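A direct translation of that telephone-book idea is a binary search for the boundary index. The sketch below is my own illustration (the helper name Find-LowerBound is made up, and it assumes $arrFileContent is already sorted): it locates the first element -ge $Start in O(log n) and then collects elements until one exceeds $End:
# Sketch: binary search for the index of the first element >= $Value in a sorted array
function Find-LowerBound ([string[]]$SortedArray, [string]$Value) {
    $lo = 0; $hi = $SortedArray.Count
    while ($lo -lt $hi) {
        $mid = [int][Math]::Floor(($lo + $hi) / 2)
        if ($SortedArray[$mid] -lt $Value) { $lo = $mid + 1 } else { $hi = $mid }
    }
    $lo   # index of the first element -ge $Value, or Count if there is none
}
$first = Find-LowerBound $arrFileContent $Start
$range = for ($i = $first; $i -lt $arrFileContent.Count -and $arrFileContent[$i] -le $End; $i++) {
    $arrFileContent[$i]
}
Only the lower bound costs O(log n); the linear part stops at the first line past $End.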
Let's measure all the approaches mentioned in the comments, mimicking thousands of lines from a log file with file names from Get-ChildItem:
$arrFileContent = (
    Get-ChildItem d:\bat\* -File -Recurse -ErrorAction SilentlyContinue
).Name | Sort-Object -Unique
$Start = "E"
$End = "P"
$arrFileContent.Count
('Where-Object', $(Measure-Command {
    $arrFileNarrowed = $arrFileContent | Where-Object {
        ($_ -ge $Start) -and ($_ -le $End)
    }
}).TotalMilliseconds, $arrFileNarrowed.Count) -join "`t"
('Where method', $(Measure-Command {
    $arrFileNarrowed = $arrFileContent.Where( {
        ($_ -ge $Start) -and ($_ -le $End)
    })
}).TotalMilliseconds, $arrFileNarrowed.Count) -join "`t"
('foreach + if', $(Measure-Command {
    $arrFileNarrowed = foreach ($OneName in $arrFileContent) {
        if ( ($OneName -ge $Start) -and ($OneName -le $End) ) {
            $OneName
        }
    }
}).TotalMilliseconds, $arrFileNarrowed.Count) -join "`t"
Output using Get-ChildItem d:\bat\*:
D:\PShell\SO\56993333.ps1
2777
Where-Object 111,5433 535
Where method 56,8577 535
foreach + if 6,542 535
Output using Get-ChildItem d:\* (much more names):
D:\PShell\SO\56993333.ps1
89570
Where-Object 4056,604 34087
Where method 1636,9539 34087
foreach + if 422,8259 34087
"Nested intervals", to me, means "intervals within intervals." I think I'd describe what you're looking to do is select a range. We can exploit the fact that the data is sorted to stop enumerating as soon as the end of the range is found.
.NET's LINQ queries allow us to do this easily. Assuming this content for Names.txt...
Bernie
Emily
Fred
Jake
Keith
Maria
Paul
Richard
Sally
Tim
Victor
...in C# the filtering would be as simple as...
IEnumerable<string> filteredNames = System.IO.File.ReadLines("Names.txt")
.Where(name => name[0] >= 'E')
.TakeWhile(name => name[0] <= 'P');
ReadLines() enumerates the lines of the file, Where() filters the output of ReadLines() (setting a lower bound on the range), and TakeWhile() stops enumerating Where() (and, therefore, ReadLines()) once its condition is no longer true (setting an upper bound on the range). This is all very efficient because A) the file is enumerated rather than read entirely into memory and B) enumeration stops as soon as the end of the desired range is reached.
We can invoke LINQ methods from PowerShell, too, but since PowerShell supports neither extension methods nor lambda expressions the equivalent code is a little more verbose...
$source = [System.IO.File]::ReadLines($inputFilePath)
$rangeStartPredicate = [Func[String, Boolean]] {
    $name = $args[0]
    return $name[0] -ge [Char] 'E'
}
$rangeEndPredicate = [Func[String, Boolean]] {
    $name = $args[0]
    return $name[0] -le [Char] 'P'
}
$filteredNames = [System.Linq.Enumerable]::TakeWhile(
    [System.Linq.Enumerable]::Where($source, $rangeStartPredicate),
    $rangeEndPredicate
)
In order for this to work you have to invoke the static LINQ methods directly and get all of the types correct. Thus, the first parameter of Where() is a System.Collections.Generic.IEnumerable[String], which is what ReadLines() returns (that's why I used a file for this). The predicate parameters of Where() and TakeWhile() are of type [Func[String, Boolean]] (a function that takes a String and returns a Boolean), which is why the ScriptBlocks must be explicitly cast to that type.
After this code executes $filteredNames will contain a query object; that is, it doesn't contain the results but rather a blueprint for how to get the results...
PS> $filteredNames.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
False False <TakeWhileIterator>d__27`1 System.Object
Only when the query is executed/evaluated does file enumeration and filtering actually occur...
PS> $filteredNames
Emily
Fred
Jake
Keith
Maria
Paul
If you are going to access the results multiple times you should store them in an array to avoid reading the file multiple times...
PS> $filteredNames = [System.Linq.Enumerable]::ToArray($filteredNames)
PS> $filteredNames.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String[] System.Array
PS> $filteredNames
Emily
Fred
Jake
Keith
Maria
Paul
I tried a variation on @josefz's answer. Breaking out of the loop once I was past the last line I wanted didn't give amazing results: even when the range was just 'a' to 'b', I only saved about a minute. Unless the slowness is due to Get-Content? "Get-Content log" will be slower than "Get-Content -ReadCount -1 log".
$arrFileContent = Get-ChildItem -Name -File -Recurse | select -First 89570 | sort -Unique
$start = 'e'
$end = 'p'
measure-command {
    $arrFileNarrowed = foreach ($OneName in $arrFileContent) {
        if ($OneName -ge $Start) {
            if ($OneName -le $End) {
                $OneName
            }
        }
    }
} | fl seconds, milliseconds
# break early
measure-command {
    $arrFileNarrowed = foreach ($OneName in $arrFileContent) {
        if ($OneName -ge $Start) {
            if ($OneName -le $End) {
                $OneName
            } else {
                break
            }
        }
    }
} | fl seconds, milliseconds
Output:
Seconds : 1
Milliseconds : 207
Seconds : 1
Milliseconds : 174
Trying out get-content vs switch -file:
$start = 'e'
$end = 'p'
# uses more memory
measure-command {
    $result1 = get-content -ReadCount -1 log | foreach {
        $_ | where { $_ -ge $start -and $_ -le $end }
    }
} | fl seconds, milliseconds
measure-command {
    $result2 = switch -file log {
        { $_ -ge $start -and $_ -le $end } { $_ }
    }
} | fl seconds, milliseconds
Output:
Seconds : 4
Milliseconds : 491
Seconds : 2
Milliseconds : 747
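Since the input is sorted, switch -file can also stop reading early: break exits the switch entirely, so nothing after the first line past $end is processed. A sketch of mine along the same lines:
measure-command {
    $result3 = switch -file log {
        { $_ -gt $end } { break }      # sorted input: no later line can match
        { $_ -ge $start } { $_ }
    }
} | fl seconds, milliseconds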
It is also effective to simply replace Where-Object with a plain script block that filters in its process block, which avoids the per-object cmdlet invocation overhead:
$arr | where { <expression> }
↓
$arr | & { process { if (<expression>) { $_ } } }
For this question:
$arrFileContent | & { process { if ($_ -ge $Start -and $_ -le $End) { $_ } } }

Shared and unique lines from large files. Fastest method?

This code returns unique and shared lines between two files. Unfortunately, it runs forever if the files have 1 million lines. Is there a faster way to do this (e.g., -eq, -match, wildcard, Compare-Object), or are containment operators the optimal approach?
$afile = Get-Content (Read-Host "Enter 'A' file")
$bfile = Get-Content (Read-Host "Enter 'B' file")
$afile |
? { $bfile -notcontains $_ } |
Set-Content lines_ONLY_in_A.txt
$bfile |
? { $afile -notcontains $_ } |
Set-Content lines_ONLY_in_B.txt
$afile |
? { $bfile -contains $_ } |
Set-Content lines_in_BOTH_A_and_B.txt
As mentioned in my answer to a previous question of yours, -contains is a slow operation, particularly with large arrays.
For exact matches you could use Compare-Object and discriminate the output by side indicator:
Compare-Object $afile $bfile -IncludeEqual | ForEach-Object {
    switch ($_.SideIndicator) {
        '<=' { $_.InputObject | Add-Content 'lines_ONLY_in_A.txt' }
        '=>' { $_.InputObject | Add-Content 'lines_ONLY_in_B.txt' }
        '==' { $_.InputObject | Add-Content 'lines_in_BOTH_A_and_B.txt' }
    }
}
If that's still too slow, try reading each file into a hashtable:
$afile = Get-Content (Read-Host "Enter 'A' file")
$ahash = @{}
$afile | ForEach-Object {
    $ahash[$_] = $true
}
and process the files like this (with $bhash built from file B in the same way):
$afile | Where-Object {
    -not $bhash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_A.txt'
If that still doesn't help you need to identify the bottleneck (reading the files, comparing the data, doing multiple comparisons, ...) and proceed from there.
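One direction worth trying is handing the whole comparison to .NET set operations. This is a sketch of mine, not from the thread (PowerShell 5+ syntax); note that duplicate lines within a file collapse to a single entry:
# Sketch: set differences/intersections via HashSet[string]
$aset = [System.Collections.Generic.HashSet[string]]::new([string[]]$afile)
$bset = [System.Collections.Generic.HashSet[string]]::new([string[]]$bfile)
$onlyA = [System.Collections.Generic.HashSet[string]]::new($aset)
$onlyA.ExceptWith($bset)       # lines only in A
$both = [System.Collections.Generic.HashSet[string]]::new($aset)
$both.IntersectWith($bset)     # lines in both A and B
$onlyA | Set-Content 'lines_ONLY_in_A.txt'
$both | Set-Content 'lines_in_BOTH_A_and_B.txt'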
try this:
$All = @()
$All += Get-Content "c:\temp\a.txt" | %{ [pscustomobject]@{Row=$_; File="A"} }
$All += Get-Content "c:\temp\b.txt" | %{ [pscustomobject]@{Row=$_; File="B"} }
$All | group Row | %{
    $InA = $_.Group.File.Contains("A")
    $InB = $_.Group.File.Contains("B")
    if ($InA -and $InB)
    {
        $_.Group.Row | select -Unique | Out-File c:\temp\lines_in_A_And_B.txt -Append
    }
    elseif ($InA)
    {
        $_.Group.Row | select -Unique | Out-File c:\temp\lines_Only_A.txt -Append
    }
    else
    {
        $_.Group.Row | select -Unique | Out-File c:\temp\lines_Only_B.txt -Append
    }
}
Full code for the best option (@Ansgar Wiechers's answer): A-unique, B-unique, and shared A and B lines:
$afile = Get-Content (Read-Host "Enter 'A' file")
$ahash = @{}
$afile | ForEach-Object {
    $ahash[$_] = $true
}
$bfile = Get-Content (Read-Host "Enter 'B' file")
$bhash = @{}
$bfile | ForEach-Object {
    $bhash[$_] = $true
}
$afile | Where-Object {
    -not $bhash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_A.txt'
$bfile | Where-Object {
    -not $ahash.ContainsKey($_)
} | Set-Content 'lines_ONLY_in_B.txt'
$afile | Where-Object {
    $bhash.ContainsKey($_)
} | Set-Content 'lines_in_BOTH_A_and_B.txt'
Considering my suggestion to do a binary search, I have created a reusable Search-SortedArray function for this:
Description
The Search-SortedArray function (alias Search) performs a binary search for a string in a sorted array. If the string is found, the index of the found string in the array is returned; otherwise $Null is returned.
Function Search-SortedArray ([String[]]$SortedArray, [String]$Find, [Switch]$CaseSensitive) {
    $l = 0; $r = $SortedArray.Count - 1
    While ($l -le $r) {
        $m = [int](($l + $r) / 2)                        # midpoint of the remaining range
        Switch ([String]::Compare($Find, $SortedArray[$m], !$CaseSensitive)) {
            -1      {$r = $m - 1}                        # $Find sorts before the midpoint
            1       {$l = $m + 1}                        # $Find sorts after the midpoint
            Default {Return $m}                          # found: return the index
        }
    }
}; Set-Alias Search Search-SortedArray
$afile |
? {(Search $bfile $_) -eq $Null} |
Set-Content lines_ONLY_in_A.txt
$bfile |
? {(Search $afile $_) -eq $Null} |
Set-Content lines_ONLY_in_B.txt
$afile |
? {(Search $bfile $_) -ne $Null} |
Set-Content lines_in_BOTH_A_and_B.txt
Note 1: Due to the overhead, a binary search only gives an advantage with (very) large arrays.
Note 2: The array has to be sorted, otherwise the result will be unpredictable.
Note 3: The search doesn't account for duplicates. In case of duplicate values, just one index will be returned (which isn't a concern for this specific question).
Added 2017-11-07, based on the comment from @Ansgar Wiechers:
Quick benchmark with 2 files with a couple thousand lines each (including duplicate lines): binary search: 2400ms; Compare-Object: 1850ms; hashtable lookup: 250ms
The idea is that the binary search takes its advantage in the long run: the larger the arrays, the more it gains proportionally in performance.
Taking $afile |? { $bfile -notcontains $_ } as an example, using the performance measurements in the comment and assuming "a couple thousand lines" means 3000 lines:
For a standard search, you will need an average of 1500 iterations in $bfile:*1
(3000 + 1) / 2 = 3001 / 2 ≈ 1500
For a binary search, you will need an average of 6.27 iterations in $bfile:
(log2 3000 + 1) / 2 = (11.55 + 1) / 2 = 6.27
In both situations you do this 3000 times (once for each item in $afile).
This means that each single iteration takes:
For a standard search: 250ms / 1500 / 3000 = 56 nanoseconds
For a binary search: 2400ms / 6.27 / 3000 = 127482 nanoseconds
The breakeven point will be at about:
56 * ((x + 1) / 2 * 3000) = 127482 * ((log2 x + 1) / 2 * 3000)
which, according to my calculations, is at about 40000 entries.
*1 presuming that a hashtable lookup doesn’t do a binary search itself as it is unaware that the array is sorted
Added 2017-11-07
Conclusion from the comments: hash tables appear to use associative-array algorithms that can't be outperformed with low-level programming commands.

Powershell Format-Wide and Sorting

One of the things I like to do in Powershell is:
Set-Location 'C:\Windows\System32\WindowsPowerShell\v1.0\en-US\'
Get-ChildItem *.txt | Format-Wide -Column 3
This gives me a great view on everything there is to learn and explore. The thing that bothered me is the sorting, because now I have 3 columns that start with 'A'. It would be more readable if I had (for example) one column with A-J, one column L-R, and one going from R-Z. This bothered me so much that I wrote a function to do it:
Function PSHELP {
    Set-Location 'C:\Windows\System32\WindowsPowerShell\v1.0\en-US\'
    # Initialize variables
    $files = gci *.txt | select name
    $count = $files.Count
    $remainder = $count % 3
    $rows = ($count - $remainder) / 3 -as [int]
    # I add an extra row by default, to deal with remainders
    $rows++
    $table = New-Object 'object[,]' $rows,3
    # Build up a table the way I want to see it
    # column 1: A,B,C...
    # column 2: L,M,N...
    # column 3: R,...,Z
    $index = 0
    for ($j = 0; $j -lt 3; $j++)
    {
        $ThisColumnLength = $rows
        if ($j -ge $remainder) {
            $ThisColumnLength--
        }
        for ($i = 0; $i -lt $ThisColumnLength; $i++)
        {
            $table[$i,$j] = $files[$index]
            $index++
        }
    }
    # Read the table in the order Format-Wide throws them on the screen
    # and store this in an array
    $array = @()
    for ($i = 0; $i -lt $rows; $i++)
    {
        $ThisRowLength = 3
        if (($i+1) -eq $rows) {
            $ThisRowLength = $remainder
        }
        if ($ThisRowLength -gt 0) {
            for ($j = 0; $j -lt $ThisRowLength; $j++)
            {
                $array += $table[$i,$j]
            }
        }
    }
    $array | fw -Column 3
}
Is there a more 'standard' way to do this in powershell? It seems like quite a natural option to me, but I couldn't find it. Is there an option or command that I've missed?
To clarify: I am not looking for ways to find help. This question is about the Format-Wide command, and/or possible alternative. I just thought this would be a nice example.
[Edit:] Changed my function to something slightly less clumsy.
[Edit2:] The code I posted is flawed, and it's getting late. If you paste it into the shell and compare it with Get-ChildItem *.txt | Format-Wide -Column 3, you should be able to see what I am trying to do here. I hope somebody can suggest some kind of alternative.
[Edit3:] Modified the code again, finally got this to work. In the process I found out a very interesting thing about what Format-Wide returns:
PS> (Get-ChildItem).count
Result: 125
PS> (Get-ChildItem | Format-Wide).count
Result: 129
This confused me a lot, because sometimes I counted the results and didn't get what I expected, so a couple of times I thought something was wrong with my code when everything was fine. (In hindsight this makes sense: the Format-* cmdlets don't emit the original objects but formatting records, including format-start, group-start, group-end and format-end records in addition to one entry per item, which accounts for the four extra objects.)
I found something that does exactly what I want, and is more generic:
http://www.leporelo.eu/blog.aspx?id=powershell-formatting-format-wide-rotated-to-format-columns (dead link)
Version 0.4 code (pasted at the bottom of this answer in case this link dies too)
Archived 0.1 code (included because it has more discussion/examples)
Example with regular Format-Wide
'The basic promise behind implicit remoting is that you can work ' +
'with remote commands using local syntax.' -split ' ' | Format-Wide -force -prop {$_} -col 3
The basic promise
behind implicit remoting
is that you
can work with
remote commands using
local syntax.
Example with Format-Columns
'The basic promise behind implicit remoting is that you can work ' +
'with remote commands using local syntax.' -split ' ' | . Format-Columns -col 3
The is remote
basic that commands
promise you using
behind can local
implicit work syntax.
remoting with
I am still surprised Powershell doesn't have a built-in option to do this.
Version 0.4 code:
function Format-Columns {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
        [PsObject[]]$InputObject,
        [Object]$Property,
        [int]$Column,
        [int]$MaxColumn,
        [switch]$Autosize
    )
    Begin   { $values = @() }
    Process { $values += $InputObject }
    End {
        function ProcessValues {
            $ret = $values
            $p = $Property
            if ($p -is [Hashtable]) {
                $exp = $p.Expression
                if ($exp) {
                    if ($exp -is [string])          { $ret = $ret | % { $_.($exp) } }
                    elseif ($exp -is [scriptblock]) { $ret = $ret | % { & $exp $_ } }
                    else { throw 'Invalid Expression value' }
                }
                if ($p.FormatString) {
                    if ($p.FormatString -is [string]) { $ret = $ret | % { $p.FormatString -f $_ } }
                    else { throw 'Invalid format string' }
                }
            }
            elseif ($p -is [scriptblock]) { $ret = $ret | % { & $p $_ } }
            elseif ($p -is [string])      { $ret = $ret | % { $_.$p } }
            elseif ($p -ne $null)         { throw 'Invalid -Property type' }
            # in case there were some numbers, objects, etc., convert them to strings
            $ret | % { $_.ToString() }
        }
        if (!$Column) { $Autosize = $true }
        $values = ProcessValues
        $valuesCount = @($values).Count
        if ($valuesCount -eq 1) {
            return $values
        }
        # for some reason the console host doesn't use the last column and writes to a new line
        $consoleWidth = $host.UI.RawUI.MaxWindowSize.Width - 1
        $gutterWidth = 2
        # get length of the longest string
        $values | % -Begin { [int]$maxLength = -1 } -Process { $maxLength = [Math]::Max($maxLength, $_.Length) }
        # get count of columns if not provided
        if ($Autosize) {
            $Column = [Math]::Max( 1, ([Math]::Floor(($consoleWidth/($maxLength+$gutterWidth)))) )
            $remainingSpace = $consoleWidth - $Column*($maxLength+$gutterWidth)
            if ($remainingSpace -ge $maxLength) {
                $Column++
            }
            if ($MaxColumn -and $MaxColumn -lt $Column) {
                $Column = $MaxColumn
            }
        }
        $countOfRows = [Math]::Ceiling($valuesCount / $Column)
        $maxPossibleLength = [Math]::Floor( ($consoleWidth / $Column) )
        # cut too-long values, considering the count of columns and the space between them
        $values = $values | % {
            if ($_.Length -gt $maxPossibleLength) { $_.Remove($maxPossibleLength-2) + '..' }
            else { $_ }
        }
        # add empty values so that the values fill the rectangle (2-dim array) without gaps
        if ($Column -gt 1) {
            $values += (@('') * ($countOfRows*$Column - $valuesCount))
        }
        # in case there is only one item, make it an array
        $values = @($values)
        <#
        now we have values like this: 1, 2, 3, 4, 5, 6, 7, ''
        and we want to display them like this:
        1 3 5 7
        2 4 6 ''
        #>
        $formatString = (1..$Column | %{ "{$($_-1),-$maxPossibleLength}" }) -join ''
        1..$countOfRows | % {
            $r = $_ - 1
            $line = @(1..$Column | %{ $values[$r + ($_-1)*$countOfRows] })
            Write-Output "$($formatString -f $line)".PadRight($consoleWidth,' ')
        }
    }
<#
.SYNOPSIS
Formats incoming data to columns.
.DESCRIPTION
It works similarly to Format-Wide, but vertically: Format-Wide outputs the data row by row, while Format-Columns outputs it column by column.
.PARAMETER Property
Name of property to get from the object.
It may be
-- string - name of property.
-- scriptblock
-- hashtable with keys 'Expression' (value is string=property name or scriptblock)
and 'FormatString' (used in -f operator)
.PARAMETER Column
Count of columns
.PARAMETER Autosize
Determines if count of columns is computed automatically.
.PARAMETER MaxColumn
Maximal count of columns if Autosize is specified
.PARAMETER InputObject
Data to display
.EXAMPLE
1..150 | Format-Columns -Autosize
.EXAMPLE
Format-Columns -Col 3 -Input 1..130
.EXAMPLE
Get-Process | Format-Columns -prop @{Expression='Handles'; FormatString='{0:00000}'} -auto
.EXAMPLE
Get-Process | Format-Columns -prop {$_.Handles} -auto
.NOTES
Name: Get-Columns
Author: stej, http://twitter.com/stejcz
Site: http://www.leporelo.eu/blog.aspx?id=powershell-formatting-format-wide-rotated-to-format-columns
Lastedit: 2017-09-11
Version 0.4 - 2017-09-11
- removed color support and changed output from Write-Host to Write-Output
Version 0.3 - 2017-04-24
- added ForegroundColor and BackgroundColor
Version 0.2 - 2010-01-14
- added MaxColumn
- fixed bug - displaying collection of 1 item was incorrect
Version 0.1 - 2010-01-06
#>
}
If you mean a standard way of finding all the help files in PowerShell, then yes, there is:
Get-Help * -Category HelpFile
Outside of that I just go check this page on Technet: Windows PowerShell Core About Topics
Here is a fork of the above script in a module form: https://github.com/loxia01/FormatColumn
Function syntax:
Format-Column [[-Property] <Object>] [-MaxColumnCount <int>] [-MinRowCount <int>] [-OrderBy <string>] [-InputObject <psobject>] [<CommonParameters>]
Format-Column [[-Property] <Object>] -ColumnCount <int> [-OrderBy <string>] [-InputObject <psobject>] [<CommonParameters>]

How to compare a folder size in Powershell

I have to apply a command IF the folder size is greater or equal to 600MB.
I tried something like this
$folders = Get-ChildItem d:\home -Exclude *.*
function Get-Size
{
    param([string]$pth)
    "{0:n2}" -f ((gci -Path $pth -Recurse | Measure-Object -Property Length -Sum).Sum / 1MB)
}
ForEach ($subFolder in $folders) {
    echo $subFolder | Select-Object FullName
    $size = Get-Size $subFolder
    echo $size
    if ($size -gt "600") { echo "Not ok." }
    else { echo "OK template." }
}
It doesn't work: it writes the right size for each folder, but the if statement is not respected. What am I doing wrong?
The simplest way is to use the FileSystemObject COM object:
function Get-FolderSize($path) {
(New-Object -ComObject 'Scripting.FileSystemObject').GetFolder($path).Size
}
I'd recommend against doing formatting in a Get-Size function, though. It's usually better to have the function return the raw size, and do calculations and formatting when you actually display the value.
Use it like this:
Get-ChildItem 'D:\home' | Where-Object {
    $_.PSIsContainer -and
    (Get-FolderSize $_.FullName) -gt 600MB
}
or like this:
Get-ChildItem 'D:\home' | Where-Object {
    $_.PSIsContainer
} | ForEach-Object {
    if ((Get-FolderSize $_.FullName) -gt 600MB) {
        'Not OK.'
    } else {
        'OK template.'
    }
}
On PowerShell v3 and newer you can use Get-ChildItem -Directory instead of Get-ChildItem | Where-Object { $_.PSIsContainer }.
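For example, combining that with the function above (my own one-liner, assuming PowerShell v3+):
Get-ChildItem 'D:\home' -Directory | Where-Object { (Get-FolderSize $_.FullName) -gt 600MB }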
When you compare using $size -gt "600", both operands are treated as strings (your Get-Size returns a formatted string), so PowerShell performs a lexical string comparison rather than a numeric one. Hence you are not getting the right results.
Compare numeric values instead.
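A minimal sketch of that fix, assuming Get-Size should return a raw number of megabytes instead of a formatted string:
function Get-Size {
    param([string]$pth)
    # return a plain number; format only when displaying the value
    (Get-ChildItem -Path $pth -Recurse | Measure-Object -Property Length -Sum).Sum / 1MB
}
$size = Get-Size $subFolder
if ($size -ge 600) { echo "Not ok." } else { echo "OK template." }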

PowerShell's equivalent of LINQ All, or how can I verify all the items in the collection are equal to specific value?

I have a collection of items, which I build from a regex match, like this:
$collection = $input | foreach {
    if ($_ -match $regex) { $matches } else { return }
} |
Select-Object -Property @{name='command'; expression={$_.command} },
                        @{name='id'; expression={$_.id} }
(sorry if that's not the best way to do this, I'm learning PowerShell :))
What I'd like to do is to make sure all the command properties in this $collection are equal to the same command, e.g. "myCommand", how can I do that?
If this were C#, I'd probably do something like:
if (collection.All(item => item.Key == "myCommand")) { ... }
What's the idiomatic way to do this in PowerShell?
The following snippet returns $true when all items in an array are equal. What it does is:
- iterate through the array of objects using ForEach-Object
- compare each one with the previous using the Compare-Object cmdlet
If Compare-Object returns non-null at least once, which means that two of the objects being compared are different, the snippet returns $false, otherwise $true.
Applying the snippet to the array @(1,1,1,1,1,2) will return $false because of the last item.
@(1,1,1,1,1,2) |
    ForEach-Object -Begin { $last = $null; $result = $true } {
        if ($last -ne $null -and $result -and (Compare-Object $last $_) -ne $null) {
            $result = $false
        }
        $last = $_
    } -End { $result }
I was pointed to this LINQ Module for PowerShell, which allowed me to do simply:
$collection | Linq-All { $_.command -eq "myCommand" }
Which is just what I needed! More examples here.
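For reference, the static LINQ method can also be invoked directly, without a module, in the same style as the EnumerateFiles example earlier. A sketch (the explicit Func cast stands in for a lambda, and the [object[]] cast pins down the generic type):
$predicate = [Func[object, bool]] { param($item) $item.command -eq 'myCommand' }
if ([Linq.Enumerable]::All([object[]]$collection, $predicate)) {
    # all commands match
}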
