Powershell Plus or Minus Comparison Operator (Fuzzy Logic)? - windows

So let me tell you what I'm trying to do here. Our SolarWinds alerts report on disk capacity as read by Windows, not the Virtual Machine vDisk size setting. What I'm trying to do is match the size so that I can find the correct vDisk and report on its datastore free space to determine whether or not we can add more.
Here's the problem: the GB number never matches between Windows and VMware. Say the disk has a capacity of 149.67 GB as reported by Windows; the VMware setting is 150, or 150.18854, or something of that sort. I cannot find the vDisk without knowing the exact number, but theoretically I could find it if I had a comparison with some breathing room, like plus or minus 1 or even 0.5. For example:
Get-HardDisk -Vm SERVERNAME | Where-Object {
    $_.CapacityGB -lt $size + 0.5 -and
    $_.CapacityGB -gt $size - 0.5
}
This doesn't work though, for whatever reason. I need something similar to this. Any ideas?
UPDATE: This turned out to be user error. I was experimenting with the wrong number when testing the command; I thought the syntax was at fault, but it was the number itself.
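For anyone landing here with the same need, the plus-or-minus check can also be written with [math]::Abs. This is only a minimal sketch: Test-WithinTolerance is a hypothetical helper name and the capacities are made-up sample values rather than real Get-HardDisk output. One thing to watch for if the size comes from a text file: a string on the left of + is concatenated rather than added ('149.67' + 0.5 gives '149.670.5'), so the sketch types its parameters as [double].

```powershell
# Hypothetical helper: true when two sizes differ by no more than $Tolerance.
function Test-WithinTolerance {
    param(
        [double]$Value,
        [double]$Target,
        [double]$Tolerance = 0.5
    )
    return [math]::Abs($Value - $Target) -le $Tolerance
}

# Made-up capacities standing in for the CapacityGB values of a VM's disks.
$capacities = 40.0, 150.18854, 500.0
$size = 149.67

# Keep only the capacities within 1 GB of the size Windows reported.
$found = @($capacities | Where-Object {
    Test-WithinTolerance -Value $_ -Target $size -Tolerance 1
})
$found
```

With real data, the Where-Object block would test $_.CapacityGB instead of $_.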

Because I managed to answer my own question, I thought I'd post a script for achieving this here. Note that you will need a txt file with a comma-separated server name and capacity on each line. You could probably modify this to do many other things with VMware data gathering if you wanted. In the end you'll need to know which columns are which and import the result to Excel as comma delimited.
Most of the variables are decimal values.
Also note that I have not yet figured out a way to programmatically deal with the discovery of multiple matching disks.
$serverlist = Get-Content "./ServerList.txt"
$logfile = "./Stores.txt"
# Start with a fresh log; ignore the error if the file does not exist yet
Remove-Item $logfile -ErrorAction SilentlyContinue
Function LogWrite {
    Param (
        [string]$srv,
        [string]$disk,
        [string]$store
    )
    Add-Content $logfile -Value "$srv,$disk,$store"
}
foreach ($item in $serverlist) {
    $store = "Blank"
    $disk = "Blank"
    try {
        # Each line is "servername,capacity"
        $server, $arg = $item.Split(',')
        $round = [math]::Round([double]$arg, 0)
        # Query once and keep the result as an array so .Count is reliable
        $found = @(Get-HardDisk -VM $server | Where-Object { $_.CapacityGB -lt ($round + 2) -and $_.CapacityGB -gt ($round - 2) })
        if ($found.Count -eq 0) {
            # No "continue" here, so this case still reaches LogWrite
            $disk = "Problem locating disk."
            $store = "N/A"
        } elseif ($found.Count -gt 1) {
            $disk = "More than one matching disk."
            $store = "N/A"
        } else {
            $disk = $found[0]
            $store = $found[0] | Get-Datastore | ForEach-Object { "{0},{1},{2}" -f $_.Name, [math]::Round($_.FreeSpaceGB, 1), [math]::Round($_.CapacityGB, 1) }
        }
    }
    catch {
        $disk = "Physical"
        $store = "N/A"
    }
    LogWrite $server $disk $store
}

Related

Why does PowerShell output all items in a collection, ignoring the IF statement?

This code is supposed to output only the MAC address of the active connection, yet it outputs the MAC addresses of all physical adapters.
# Assign selected fields of the network connection info to a collection
$q = Get-CIMInstance Win32_NetworkAdapter | Select-Object NetConnectionStatus, PhysicalAdapter, MACAddress
# Find mac address of active connection
foreach ($i in $q) {IF ($q.NetConnectionStatus -eq 2) {Write-Host $i.MacAddress}}
In the output I receive the MAC addresses of all physical adapters.
You have a typo in your if condition: $q is the collection, not the item ($i):
IF ($q.NetConnectionStatus -eq 2)
Since -eq acts as a filter when its left-hand side is a collection, the condition evaluates to $true as long as at least one element of $q.NetConnectionStatus equals 2, hence every item is written to the console.
Try this simple example to understand what is happening:
$collection = 0..10 | ForEach-Object {
    [pscustomobject]@{
        NetConnectionStatus = $_
        MACAddress = Get-Random
    }
}
foreach($item in $collection) {
    if($collection.NetConnectionStatus -eq 2) {
        $item.MACAddress
    }
}
This will output 11 random numbers to your console. The correct filtering condition in the example above would have been:
if($item.NetConnectionStatus -eq 2) {
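To see the filtering behaviour in isolation, here is a small sketch with made-up values (the variable names and MAC strings are placeholders, not real adapter data). It shows what -eq returns for an array on the left-hand side, and the Where-Object form that is usually the clearer way to pick matching items.

```powershell
$statuses = 0, 2, 7, 2

# With an array on the left, -eq returns the matching elements, not a Boolean.
$hits = @($statuses -eq 2)   # two elements, both 2
$none = @($statuses -eq 5)   # empty array

# In an if condition a non-empty result like $hits counts as true
# and an empty one counts as false.
[bool]$hits
[bool]$none

# Filtering per item: each object is tested on its own.
$adapters = @(
    [pscustomobject]@{ NetConnectionStatus = 2; MACAddress = 'AA-AA' }
    [pscustomobject]@{ NetConnectionStatus = 7; MACAddress = 'BB-BB' }
)
$active = @($adapters | Where-Object { $_.NetConnectionStatus -eq 2 })
$active.MACAddress
```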

How to iterate or loop through folders that match a certain condition

I have a directory that contains folders that correspond to dates (i.e. dated subfolders). They are all in the same yyyyMMdd format. There are other folders within this directory as well, with names that are text. What I'm trying to do is write a script that asks the user how many days back they would like to go. The script will then perform some action on each dated folder in that range. I'm having some trouble figuring out how to get a list of all of the folders correctly.
The directory will look something like:
C:--/
FOO--/
--20190525
--20190526
--20190527
--20190528
--20190529
--20190530
--20190531
--20190601
--20190602
etc.
I currently am using a for loop, but the issue is that when the end of the month occurs, i.e. 20190531, the loop will continue to 20190532 instead of 20190601.
$lastday = (Get-Date).AddDays(-1).ToString("yyyyMMdd")
$lastdayint = [int]$lastday
$days = [Microsoft.VisualBasic.Interaction]::InputBox('How many days back do you want to process?')
if ($days.Length -eq 0) {
Exit
}
$daysint = [int]$days
$firstday = (Get-Date).AddDays(-$daysint).ToString("yyyyMMdd")
$firstdayint = [int]$firstday
for ($i = $firstdayint; $i -le $lastdayint; $i++) {
# Get our dated sub folder by looking through the root of FOO
# and finding the folder that matches the condition
$datedsub = (Get-ChildItem -Path "C:\FOO\*" | Where-Object { ($_.PSIsContainer) -and ($_.Name -eq [string]$i) } ).Name
# If the path does not exist, both processes will fail, so exit out of the loop and move on to the next date
if ( $( try { !(Test-Path -Path ("C:\FOO\$datedsub").Trim() -PathType Container) } catch { $false } ) ) {
Continue
}
}
I'm kind of stuck figuring out how to get all of the folders with names between 2 dates. This was pretty easy to do in CMD scripting, as everything is a string and folders don't have properties. But with Powershell, it's a bit more difficult.
What the script needs to do is loop through all of the folders in a specified range. I'm pretty sure a foreach loop would probably be my best bet, but I'm having trouble figuring out how to set it up.
Your algorithm can be simplified:
# InputBox lives in the Microsoft.VisualBasic assembly
Add-Type -AssemblyName Microsoft.VisualBasic
$days = [Microsoft.VisualBasic.Interaction]::InputBox('How many days back do you want to process?')
if ($days.Length -eq 0) {
    Exit
}
$days = [int]$days
# will contain folders
$dirlist = New-Object System.Collections.ArrayList
# note: includes today. To not include today, start $i at 1
for ($i = 0; $i -le $days; $i++) {
    $datestr = (Get-Date).AddDays(-$i).ToString("yyyyMMdd")
    $dir = Get-ChildItem -Path "C:\FOO" -Directory -Filter $datestr -ErrorAction SilentlyContinue
    if ($null -ne $dir) {
        [void]$dirlist.Add($dir)
        # Or do work here. Otherwise you can use the resulting ArrayList
    }
}
# prints folders to console
$dirlist
The above:
Asks the user for the number of days back
Converts that number to an integer and loops that many times
Gets the date string using Today - $i
Uses Get-ChildItem with -Path for the parent folder, -Filter to match the dated folder name, -Directory to limit to directories, and -ErrorAction SilentlyContinue to suppress errors (if nothing matches, it returns $null)
Adds to the ArrayList if not $null
Instead of converting the string to an integer and incrementing that, you should use AddDays() to loop through the actual dates:
$days = [Microsoft.VisualBasic.Interaction]::InputBox('How many days back do you want to process?')
if ($days.Length -eq 0) {
    Exit
}
$startDate = (Get-Date).AddDays(-$days)
$endDate = (Get-Date).AddDays(-1)
# AddDays() returns a new DateTime, so assign the result back to advance the loop
for ($currentDate = $startDate; $currentDate -le $endDate; $currentDate = $currentDate.AddDays(1)) {
    $dateString = $currentDate.ToString('yyyyMMdd')
    if ( -not( Test-Path "C:\FOO\$dateString" -PathType Container ) ) {
        continue
    }
    # do work here
}
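To check the month rollover without touching the file system, the same loop can be run over a fixed range. This is just a sketch with hard-coded dates taken from the folder listing in the question; the variable names are placeholders.

```powershell
# Fixed range that crosses the end of May, as in the folder listing.
$startDate = [datetime]'2019-05-30'
$endDate   = [datetime]'2019-06-02'

# Step through real dates; AddDays() handles month and year boundaries.
$names = for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
    $d.ToString('yyyyMMdd')
}
$names
# 20190530
# 20190531
# 20190601
# 20190602
```

Because the loop variable is a DateTime rather than an integer, 20190532 can never be produced.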

How to get NUMA difference for a specific NIC which uses RSS in Powershell

I want to write a powershell-script which checks if a network interface card which uses receive side scaling uses a processor with a NUMA (Non-Uniform Memory Access) distance > 0.
What I've done so far:
$name = "Ethernet"
$adapter = Get-NetAdapterRss -Name $name
This outputs the RSS-Adapter processor data (together with other information) like:
RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0 0:2/0 0:4/0 0:6/0 0:8/0 0:10/0 0:12/0 0:14/0
0:16/0 0:18/0 0:20/0 0:22/0 0:24/32767 0:26/32767 0:28/32767
0:30/32767
0:32/32767 0:34/32767 0:36/32767 0:38/32767 0:40/32767
0:42/32767 0:44/32767 0:46/32767
As you see, the NUMA distance is the value behind the '/'.
Now i want to retrieve it like:
foreach($processor in $adapter.RssProcessorArray)
{
Write-Host $processor.ProcessorGroup
Write-Host $processor.ProcessorNumber
Write-Host $processor.??
}
Somehow there is no ".NumaDistance" property on the object i get. How can i get this value for each processor in the list?
Similar idea, but with regexp:
$str = (Get-NetAdapterrss -name "Ethernet" | Out-String).Split("`n") | where {$_ -like 'RssProcessorArray*'}
$rss = $str | Select-String '\d+:\d+/\d+' -AllMatches
Write-Output $rss.Matches.Value
$rss.Matches.Value | foreach { ($_ -split "[:/]") -join "---" } # if you need each value separately
Using static data as an example, but hope this helps
$text = 'RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0 0:2/0 0:4/0 0:6/0 0:8/0 0:10/0 0:12/0 0:14/0 0:16/0 0:18/0 0:20/0 0:22/0 0:24/32767 0:26/32767 0:28/32767 0:30/32767 0:32/32767 0:34/32767 0:36/32767 0:38/32767 0:40/32767 0:42/32767 0:44/32767 0:46/32767'
# split the text up on spaces
$firstSplit = $text.Split(' ')
# take all results starting at the first 0:0/0
# put into an array
[array]$processData = $firstSplit[4..($firstSplit.Count -1)]
# get just the data after the / for each item in the array
[array]$splitProcessData = $processData.split('/') | ? {$_ -notmatch ':'}
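# assumes $splitProcessData lists the distances in the same order as RssProcessorArray
$i = 0
foreach($processor in $adapter.RssProcessorArray)
{
    Write-Host $processor.ProcessorGroup
    Write-Host $processor.ProcessorNumber
    # NUMA distance for this processor
    Write-Host $splitProcessData[$i]
    $i++
}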
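Both answers boil down to parsing the Group:Number/Distance triples out of the text. A sketch that turns them into objects is below; it uses a shortened copy of the static text from the question, and the property names (Group, Number, NumaDistance) are my own choice, not something Get-NetAdapterRss provides.

```powershell
# Shortened sample of the RssProcessorArray text from the question.
$text = 'RssProcessorArray: [Group:Number/NUMA Distance] : 0:0/0 0:2/0 0:24/32767 0:26/32767'

# Each "group:number/distance" triple becomes one object.
$processors = [regex]::Matches($text, '(\d+):(\d+)/(\d+)') | ForEach-Object {
    [pscustomobject]@{
        Group        = [int]$_.Groups[1].Value
        Number       = [int]$_.Groups[2].Value
        NumaDistance = [int]$_.Groups[3].Value
    }
}

# Processors with a NUMA distance greater than 0.
$remote = @($processors | Where-Object { $_.NumaDistance -gt 0 })
$remote
```

On a live system the text would come from (Get-NetAdapterRss -Name $name | Out-String) as in the first answer.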

Powershell Format-Wide and Sorting

One of the things I like to do in Powershell is:
Set-Location 'C:\Windows\System32\WindowsPowerShell\v1.0\en-US\'
Get-ChildItem *.txt | Format-Wide -Column 3
This gives me a great view on everything there is to learn and explore. The thing that bothered me is the sorting, because now I have 3 columns that start with 'A'. It would be more readable when I'd have (for example) one column with A-J, one column L-R, and one going from R-Z. This bothered me so much, I wrote a function to do it:
Function PSHELP {
Set-Location 'C:\Windows\System32\WindowsPowerShell\v1.0\en-US\'
#Initialize variables
$files = gci *.txt | select name
$count = $files.count
$remainder = $count % 3
$rows = ($count - $remainder) /3 -as [int]
#I add an extra row by default, to deal with remainders
$rows++
$Table = New-Object 'object[,]' $rows,3
#Build up a table the way I want to see it
#column 1: A,B,C...
#column 2: L,M,N..
#column 3: R,...,Z
$index = 0
for ($j = 0; $j -lt 3; $j++)
{
$ThisColumnLength = $rows
if($j -ge $remainder){
$ThisColumnLength--
}
for ($i = 0; $i -lt $ThisColumnLength; $i++)
{
$table[$i,$j] = $files[$index]
$index++
}
}
#Read the table in the order Format-Wide throws them on the screen
#And store this in an array
$array = @()
for ($i = 0; $i -lt $rows; $i++)
{ $ThisRowLength = 3
if(($i+1) -eq $Rows){
$ThisRowLength = $remainder
}
if ($ThisRowLength -gt 0){
for ($j = 0; $j -lt $ThisRowLength; $j++)
{
$array += $table[$i,$j]
}
}
}
$array | fw -Column 3
}
Is there a more 'standard' way to do this in powershell? It seems like quite a natural option to me, but I couldn't find it. Is there an option or command that I've missed?
To clarify: I am not looking for ways to find help. This question is about the Format-Wide command, and/or possible alternative. I just thought this would be a nice example.
[Edit:] Changed my function to something slightly less clumsy.
[Edit2:] The code I posted is flawed, and it's getting late. If you paste it in the shell and compare it with {Get-Childitem *.txt | format-wide -column 3}, you should be able to see what I am trying to do here. I hope somebody can suggest some kind of alternative.
[Edit3:] Modified the code again, finally got this to work. In the process I found out a very interesting thing about what Format-Wide returns:
PS> (Get-ChildItem).count
Result: 125
PS> (Get-ChildItem | Format-Wide).count
Result: 129
This confused me a lot, because sometimes I counted the results and didn't get what I expected, so a couple of times I thought something was wrong with my code, but maybe everything was fine.
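As far as I can tell, the extra items are there because Format-Wide does not return the original objects: it emits formatting records, including a few start and end markers around the entries. A small sketch to inspect this, using made-up objects instead of files:

```powershell
# Five throwaway objects standing in for Get-ChildItem results.
$items = 1..5 | ForEach-Object { [pscustomobject]@{ Name = "item$_" } }

# Capture what Format-Wide actually puts on the pipeline.
$formatted = @($items | Format-Wide -Property Name)

# More records than input objects.
$formatted.Count

# Group the records by type name to see the markers next to the entries.
$formatted | Group-Object { $_.GetType().Name } -NoElement
```

So counting should be done before the data is piped to a Format-* command.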
I found something that does exactly what I want, and is more generic:
http://www.leporelo.eu/blog.aspx?id=powershell-formatting-format-wide-rotated-to-format-columns (dead link)
Version 0.4 code (pasted at the bottom of this answer in case this link dies too)
Archived 0.1 code (included because it has more discussion/examples)
Example with regular Format-Wide
'The basic promise behind implicit remoting is that you can work ' +
'with remote commands using local syntax.' -split ' ' | Format-Wide -force -prop {$_} -col 3
The             basic           promise
behind          implicit        remoting
is              that            you
can             work            with
remote          commands        using
local           syntax.
Example with Format-Columns
'The basic promise behind implicit remoting is that you can work ' +
'with remote commands using local syntax.' -split ' ' | . Format-Columns -col 3
The             is              remote
basic           that            commands
promise         you             using
behind          can             local
implicit        work            syntax.
remoting        with
I am still surprised Powershell doesn't have a built-in option to do this.
Version 0.4 code:
function Format-Columns {
[CmdletBinding()]
param(
[Parameter(Mandatory=$true,ValueFromPipeline=$true)]
[PsObject[]]$InputObject,
[Object]$Property,
[int]$Column,
[int]$MaxColumn,
[switch]$Autosize
)
Begin { $values = @() }
Process { $values += $InputObject }
End {
function ProcessValues {
$ret = $values
$p = $Property
if ($p -is [Hashtable]) {
$exp = $p.Expression
if ($exp) {
if ($exp -is [string]) { $ret = $ret | % { $_.($exp) } }
elseif ($exp -is [scriptblock]) { $ret = $ret | % { & $exp $_} }
else { throw 'Invalid Expression value' }
}
if ($p.FormatString) {
if ($p.FormatString -is [string]) { $ret = $ret | % { $p.FormatString -f $_ } }
else { throw 'Invalid format string' }
}
}
elseif ($p -is [scriptblock]) { $ret = $ret | % { & $p $_} }
elseif ($p -is [string]) { $ret = $ret | % { $_.$p } }
elseif ($p -ne $null) { throw 'Invalid -property type' }
# in case there were some numbers, objects, etc., convert them to string
$ret | % { $_.ToString() }
}
if (!$Column) { $Autosize = $true }
$values = ProcessValues
$valuesCount = @($values).Count
if ($valuesCount -eq 1) {
return $values
}
# from some reason the console host doesn't use the last column and writes to new line
$consoleWidth = $host.ui.RawUI.maxWindowSize.Width - 1
$gutterWidth = 2
# get length of the longest string
$values | % -Begin { [int]$maxLength = -1 } -Process { $maxLength = [Math]::Max($maxLength,$_.Length) }
# get count of columns if not provided
if ($Autosize) {
$Column = [Math]::Max( 1, ([Math]::Floor(($consoleWidth/($maxLength+$gutterWidth)))) )
$remainingSpace = $consoleWidth - $Column*($maxLength+$gutterWidth);
if ($remainingSpace -ge $maxLength) {
$Column++
}
if ($MaxColumn -and $MaxColumn -lt $Column) {
$Column = $MaxColumn
}
}
$countOfRows = [Math]::Ceiling($valuesCount / $Column)
$maxPossibleLength = [Math]::Floor( ($consoleWidth / $Column) )
# cut too long values, considers count of columns and space between them
$values = $values | % {
if ($_.length -gt $maxPossibleLength) { $_.Remove($maxPossibleLength-2) + '..' }
else { $_ }
}
#add empty values so that the values fill rectangle (2 dim array) without space
if ($Column -gt 1) {
$values += (@('') * ($countOfRows*$Column - $valuesCount))
}
# in case there is only one item, make it array
$values = @($values)
<#
now we have values like this: 1, 2, 3, 4, 5, 6, 7, ''
and we want to display them like this:
1 3 5 7
2 4 6 ''
#>
$formatString = (1..$Column | %{ "{$($_-1),-$maxPossibleLength}" }) -join ''
1..$countOfRows | % {
$r = $_-1
$line = @(1..$Column | %{ $values[$r + ($_-1)*$countOfRows]} )
Write-Output "$($formatString -f $line)".PadRight($consoleWidth,' ')
}
}
<#
.SYNOPSIS
Formats incoming data to columns.
.DESCRIPTION
It works similarly as Format-Wide but it works vertically. Format-Wide outputs the data row by row, but Format-Columns outputs them column by column.
.PARAMETER Property
Name of property to get from the object.
It may be
-- string - name of property.
-- scriptblock
-- hashtable with keys 'Expression' (value is string=property name or scriptblock)
and 'FormatString' (used in -f operator)
.PARAMETER Column
Count of columns
.PARAMETER Autosize
Determines if count of columns is computed automatically.
.PARAMETER MaxColumn
Maximal count of columns if Autosize is specified
.PARAMETER InputObject
Data to display
.EXAMPLE
1..150 | Format-Columns -Autosize
.EXAMPLE
Format-Columns -Col 3 -Input 1..130
.EXAMPLE
Get-Process | Format-Columns -prop @{Expression='Handles'; FormatString='{0:00000}'} -auto
.EXAMPLE
Get-Process | Format-Columns -prop {$_.Handles} -auto
.NOTES
Name: Get-Columns
Author: stej, http://twitter.com/stejcz
Site: http://www.leporelo.eu/blog.aspx?id=powershell-formatting-format-wide-rotated-to-format-columns
Lastedit: 2017-09-11
Version 0.4 - 2017-09-11
- removed color support and changed output from Write-Host to Write-Output
Version 0.3 - 2017-04-24
- added ForegroundColor and BackgroundColor
Version 0.2 - 2010-01-14
- added MaxColumn
- fixed bug - displaying collection of 1 item was incorrect
Version 0.1 - 2010-01-06
#>
}
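The core idea of the function above is just an index transformation: pad the list to a full rectangle and read it row by row while the data is laid out column by column. A stripped-down sketch of only that step follows; ConvertTo-ColumnOrder is a hypothetical name, and it does none of the width handling or truncation the full function does.

```powershell
# Hypothetical helper: reorder items so that a row-by-row display
# (such as Format-Wide) shows them sorted down the columns.
function ConvertTo-ColumnOrder {
    param(
        [string[]]$Items,
        [int]$Column = 3
    )
    $count = $Items.Count
    $rows = [int][math]::Ceiling($count / $Column)
    for ($r = 0; $r -lt $rows; $r++) {
        for ($c = 0; $c -lt $Column; $c++) {
            # Item at row $r, column $c of a column-major layout
            $index = $r + $c * $rows
            # Pad with empty strings where the last column runs short
            if ($index -lt $count) { $Items[$index] } else { '' }
        }
    }
}

$ordered = ConvertTo-ColumnOrder -Items ('a','b','c','d','e','f','g') -Column 3
$ordered -join ','
# a,d,g,b,e,,c,f,
```

The result can then be piped to Format-Wide -Force -Property {$_} -Column 3, as in the regular Format-Wide example earlier in this answer.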
If you mean a standard way of finding all the help files in PowerShell, then yes, there is:
Get-Help * -Category HelpFile
Outside of that I just go check this page on Technet: Windows PowerShell Core About Topics
Here is a fork of the above script in a module form: https://github.com/loxia01/FormatColumn
Function syntax:
Format-Column [[-Property] <Object>] [-MaxColumnCount <int>] [-MinRowCount <int>] [-OrderBy <string>] [-InputObject <psobject>] [<CommonParameters>]
Format-Column [[-Property] <Object>] -ColumnCount <int> [-OrderBy <string>] [-InputObject <psobject>] [<CommonParameters>]

Sort very large text file in PowerShell

I have standard Apache log files, between 500 MB and 2 GB in size. I need to sort the lines in them (each line starts with a date in yyyy-MM-dd hh:mm:ss format, so no treatment is necessary for sorting).
The simplest and most obvious thing that comes to mind is
Get-Content unsorted.txt | sort | get-unique > sorted.txt
I am guessing (without having tried it) that doing this using Get-Content would take forever in my 1GB files. I don't quite know my way around System.IO.StreamReader, but I'm curious if an efficient solution could be put together using that?
Thanks to anyone who might have a more efficient idea.
[edit]
I tried this subsequently, and it took a very long time; some 10 minutes for 400MB.
Get-Content is terribly inefficient for reading large files. Sort-Object is not very fast either.
Let's set up a base line:
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$c = Get-Content .\log3.txt -Encoding Ascii
$sw.Stop();
Write-Output ("Reading took {0}" -f $sw.Elapsed);
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$s = $c | Sort-Object;
$sw.Stop();
Write-Output ("Sorting took {0}" -f $sw.Elapsed);
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$u = $s | Get-Unique
$sw.Stop();
Write-Output ("uniq took {0}" -f $sw.Elapsed);
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$u | Out-File 'result.txt' -Encoding ascii
$sw.Stop();
Write-Output ("saving took {0}" -f $sw.Elapsed);
With a 40 MB file having 1.6 million lines (made of 100k unique lines repeated 16 times) this script produces the following output on my machine:
Reading took 00:02:16.5768663
Sorting took 00:02:04.0416976
uniq took 00:01:41.4630661
saving took 00:00:37.1630663
Totally unimpressive: more than 6 minutes to sort a tiny file. Every step can be improved a lot. Let's use a StreamReader to read the file line by line into a HashSet, which removes duplicates, then copy the data to a List and sort it there, then use a StreamWriter to dump the results back.
$hs = new-object System.Collections.Generic.HashSet[string]
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$reader = [System.IO.File]::OpenText("D:\log3.txt")
try {
while (($line = $reader.ReadLine()) -ne $null)
{
$t = $hs.Add($line)
}
}
finally {
$reader.Close()
}
$sw.Stop();
Write-Output ("read-uniq took {0}" -f $sw.Elapsed);
$sw = [System.Diagnostics.Stopwatch]::StartNew();
$ls = new-object system.collections.generic.List[string] $hs;
$ls.Sort();
$sw.Stop();
Write-Output ("sorting took {0}" -f $sw.Elapsed);
$sw = [System.Diagnostics.Stopwatch]::StartNew();
try
{
$f = New-Object System.IO.StreamWriter "d:\result2.txt";
foreach ($s in $ls)
{
$f.WriteLine($s);
}
}
finally
{
$f.Close();
}
$sw.Stop();
Write-Output ("saving took {0}" -f $sw.Elapsed);
This script produces:
read-uniq took 00:00:32.2225181
sorting took 00:00:00.2378838
saving took 00:00:01.0724802
On the same input file it runs more than 10 times faster. I am still surprised, though, that it takes 30 seconds to read the file from disk.
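The same read, de-duplicate, sort pipeline can be wrapped in a function. The sketch below is a variation, not the script that was timed above: it assumes PowerShell 5 or later (for the ::new() syntax), uses [System.IO.File]::ReadLines instead of an explicit StreamReader, and sorts with an ordinal comparer on the assumption that plain character order is what you want for lines starting with yyyy-MM-dd timestamps. Get-SortedUniqueLine is a hypothetical name, and I have not benchmarked this variant.

```powershell
# Hypothetical helper: returns the unique lines of a file in ordinal order.
function Get-SortedUniqueLine {
    param(
        [Parameter(Mandatory = $true)][string]$Path
    )
    # .NET methods do not follow the PowerShell location, so resolve the path first
    $fullPath = Convert-Path -LiteralPath $Path

    # HashSet drops duplicate lines while reading
    $set = [System.Collections.Generic.HashSet[string]]::new()
    foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
        [void]$set.Add($line)
    }

    # Copy to a List and sort in place by character code
    $list = [System.Collections.Generic.List[string]]::new($set)
    $list.Sort([System.StringComparer]::Ordinal)
    $list
}

# Small self-contained demo with a temporary file
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $tmp -Value '2012-08-16 10:00:01 b', '2012-08-16 09:00:00 a', '2012-08-16 10:00:01 b'
$sorted = @(Get-SortedUniqueLine -Path $tmp)
Remove-Item -LiteralPath $tmp
$sorted
```

For a real log, the result could be written out with [System.IO.File]::WriteAllLines, as in the PowerSort function further down.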
I've grown to hate this part of Windows PowerShell; it is a memory hog on these larger files. One trick is to read the lines with [System.IO.File]::ReadLines:
[System.IO.File]::ReadLines('file.txt') | sort -u | out-file file2.txt -encoding ascii
Another trick, seriously, is to just use Linux.
cat file.txt | sort -u > output.txt
Linux is so insanely fast at this, it makes me wonder what the heck Microsoft is thinking with this setup.
It may not be feasible in all cases, and I understand that, but if you have a Linux machine, you can copy 500 MB to it, sort and unique it, and copy it back in under a couple of minutes.
If each line of the log is prefixed with a timestamp, and the log messages don't contain embedded newlines (which would require special handling), I think it would take less memory and execution time to convert the timestamp from [String] to [DateTime] before sorting. The following assumes each log entry is of the format yyyy-MM-dd HH:mm:ss: <Message> (note that the HH format specifier is used for a 24-hour clock):
Get-Content unsorted.txt `
    | ForEach-Object {
        # Ignore empty lines; can substitute with [String]::IsNullOrWhiteSpace($_) on PowerShell 3.0 and above
        if (-not [String]::IsNullOrEmpty($_))
        {
            # Split into at most two fields, even if the message itself contains ': '
            [String[]] $fields = $_ -split ': ', 2;
            return New-Object -TypeName 'PSObject' -Property @{
                Timestamp = [DateTime] $fields[0];
                Message = $fields[1];
            };
        }
    } | Sort-Object -Property 'Timestamp', 'Message';
If you are processing the input file for interactive display purposes you can pipe the above into Out-GridView or Format-Table to view the results. If you need to save the sorted results you can pipe the above into the following:
| ForEach-Object {
# Reconstruct the log entry format of the input file
return '{0:yyyy-MM-dd HH:mm:ss}: {1}' -f $_.Timestamp, $_.Message;
} `
| Out-File -Encoding 'UTF8' -FilePath 'sorted.txt';
(Edited to be more clear based on n0rd's comments)
It might be a memory issue. Since you're loading the entire file into memory to sort it (and adding the overhead of the pipe into Sort-Object and the pipe into Get-Unique), it's possible that you're hitting the memory limits of the machine and forcing it to page to disk, which will slow things down a lot. One thing you might consider is splitting the logs up before sorting them, and then splicing them back together.
This probably won't match your format exactly, but if I've got a large log file for, say, 8/16/2012 which spans several hours, I can split it up into a different file for each hour using something like this:
for($i=0; $i -le 23; $i++){ Get-Content .\u_ex120816.log | ? { $_ -match ('^2012-08-16 {0:D2}:' -f $i) } | Set-Content -Path "$i.log" }
This is creating a regular expression for each hour of that day and dumping all the matching log entries into a smaller log file named by the hour (e.g. 16.log, 17.log).
Then I can run your process of sorting and getting unique entries on a much smaller subsets, which should run a lot faster:
for($i=0; $i -le 23; $i++){ Get-Content "$i.log" | sort | get-unique > "${i}sorted.txt" }
And then you can splice them back together.
Depending on the frequency of the logs, it might make more sense to split them by day, or minute; the main thing is to get them into more manageable chunks for sorting.
Again, this only makes sense if you're hitting the memory limits of the machine (or if Sort-Object is using a really inefficient algorithm).
"Get-Content" can be faster than you think. Check this code-snippet in addition to the above solution:
foreach ($block in (get-content $file -ReadCount 100)) {
foreach ($line in $block){[void] $hs.Add($line)}
}
There doesn't seem to be a great way to do it in PowerShell, including [IO.File]::ReadLines(), but with the native Windows sort.exe or the GNU sort.exe, either run within cmd.exe, 30 million random numbers can be sorted in about 5 minutes with around 1 GB of RAM. GNU sort automatically breaks things up into temp files to save RAM. Both commands have options to start the sort at a certain character column. GNU sort can also merge sorted files. See external sorting.
30 million line test file:
& { foreach ($i in 1..300kb) { get-random } } | set-content file.txt
And then in cmd:
copy file.txt+file.txt file2.txt
copy file2.txt+file2.txt file3.txt
copy file3.txt+file3.txt file4.txt
copy file4.txt+file4.txt file5.txt
copy file5.txt+file5.txt file6.txt
copy file6.txt+file6.txt file7.txt
copy file7.txt+file7.txt file8.txt
With gnu sort.exe from http://gnuwin32.sourceforge.net/packages/coreutils.htm . Don't forget the dependency dll's -- libiconv2.dll & libintl3.dll. Within cmd.exe:
.\sort.exe < file8.txt > filesorted.txt
Or windows sort.exe within cmd.exe:
sort.exe < file8.txt > filesorted.txt
With the function below:
PS> PowerSort -SrcFile C:\windows\win.ini
function PowerSort {
param(
[string]$SrcFile = "",
[string]$DstFile = "",
[switch]$Force
)
if ($SrcFile -eq "") {
write-host "USAGE: PowerSort -SrcFile (srcfile) [-DstFile (dstfile)] [-Force]"
return 0;
}
else {
$SrcFileFullPath = Resolve-Path $SrcFile -ErrorAction SilentlyContinue -ErrorVariable _frperror
if (-not($SrcFileFullPath)) {
throw "Source file not found: $SrcFile";
}
}
[Collections.Generic.List[string]]$lines = [System.IO.File]::ReadAllLines($SrcFileFullPath)
$lines.Sort();
# Write Sorted File to Pipe
if ($DstFile -eq "") {
foreach ($line in $lines) {
write-output $line
}
}
# Write Sorted File to File
else {
$pipe_enable = 0;
$DstFileFullPath = Resolve-Path $DstFile -ErrorAction SilentlyContinue -ErrorVariable ev
# Destination File doesn't exist
if (-not($DstFileFullPath)) {
$DstFileFullPath = $ev[0].TargetObject
}
# Destination Exists and -force not specified.
elseif (-not $Force) {
throw "Destination file already exists: ${DstFile} (use the -Force flag to overwrite)"
}
write-host "Writing-File: $DstFile"
[System.IO.File]::WriteAllLines($DstFileFullPath, $lines)
}
return
}
