Faster iteration - performance

I have this code, which is part of a function that returns a list of SQL rows based on a time range.
The query itself (the first line of code) is quite fast, but the foreach loop that extracts the relevant data takes a while to complete.
I have around 350,000 rows to iterate over. I expect that to take some time, but I was wondering whether there is any change I could make to speed it up.
$SqlDocmasterTableResuls = $this.SqlConnection.GetSqlData("SELECT DOCNUM, DOCLOC FROM MHGROUP.DOCMASTER WHERE ENTRYWHEN between '" + $this.FromDate + "' and '" + $this.ToDate + "'")
[System.Collections.ArrayList]$ListOfDocuments = [System.Collections.ArrayList]::New()
if ($SqlDocmasterTableResuls.Rows.Count)
{
    foreach ($Row in $SqlDocmasterTableResuls.Rows)
    {
        $DocProperties = @{
            "DOCNUM" = $Row.DOCNUM
            "SOURCE" = $Row.DOCLOC
            "DESTINATION" = $Row.DOCLOC -replace ([regex]::Escape($this.iManSourceFileServerName + ":" + $this.iManSourceFileServerPath.ROOTPATH)),
                ([regex]::Escape($this.iManDestinationFileServerName + ":" + $this.iManDestinationFileServerPath.ROOTPATH))
        }
        $DocObj = New-Object -TypeName PSObject -Property $DocProperties
        $ListOfDocuments.Add($DocObj)
    }
}
return $ListOfDocuments

Avoid appending to an array in a loop. The best way to capture loop data in a variable is to simply collect the loop output in a variable:
$ListOfDocuments = foreach ($Row in $SqlDocmasterTableResuls.Rows) {
    New-Object -Type PSObject -Property @{
        "DOCNUM" = $Row.DOCNUM
        "SOURCE" = $Row.DOCLOC
        "DESTINATION" = $Row.DOCLOC -replace ...
    }
}
You don't need the surrounding if conditional, because if the table doesn't have any rows the loop should skip right over it, leaving you with an empty result.
Since you want to return the list anyway, you don't even need to collect the loop output in a variable. Just leave the output as it is and it will get returned anyway.
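Also avoid repeating operations in a loop when their result doesn't change. Calculate the source and destination paths once before the loop. Only the search pattern needs [regex]::Escape(); in a -replace replacement string the backslash is a literal character, so escaping the destination would double its backslashes:
$srcPath = [regex]::Escape($this.iManSourceFileServerName + ':' + $this.iManSourceFileServerPath.ROOTPATH)
$dstPath = $this.iManDestinationFileServerName + ':' + $this.iManDestinationFileServerPath.ROOTPATH
and use the variables $srcPath and $dstPath inside the loop.
Something like this should do:
$SqlDocmasterTableResuls = $this.SqlConnection.GetSqlData("SELECT ...")
# Search pattern: regex metacharacters must be escaped
$srcPath = [regex]::Escape($this.iManSourceFileServerName + ':' + $this.iManSourceFileServerPath.ROOTPATH)
# Replacement string: used as-is (assumes the path contains no '$', the only special character there)
$dstPath = $this.iManDestinationFileServerName + ':' + $this.iManDestinationFileServerPath.ROOTPATH
foreach ($Row in $SqlDocmasterTableResuls.Rows) {
    New-Object -Type PSObject -Property @{
        'DOCNUM' = $Row.DOCNUM
        'SOURCE' = $Row.DOCLOC
        'DESTINATION' = $Row.DOCLOC -replace $srcPath, $dstPath
    }
}
Note that this relies on the implicit output of a function or script block. If the code lives in a PowerShell class method, only an explicit return statement produces a result, so there you would still collect the loop output in a variable and return it.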
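To make the two suggestions above concrete, here is a minimal self-contained sketch. The sample rows and the server paths (SRV1, SRV2) are made-up stand-ins for the SQL result and the $this properties, which are not available outside the original class.

```powershell
# Hypothetical stand-ins for the SQL result rows and the two server paths.
$rows = 1..3 | ForEach-Object {
    New-Object -TypeName PSObject -Property @{ DOCNUM = $_; DOCLOC = "SRV1:\docs\file$_.doc" }
}
$srcPath = [regex]::Escape('SRV1:\docs')   # search pattern: escape regex metacharacters
$dstPath = 'SRV2:\archive'                 # replacement string: backslashes are literal

# Collect the loop output directly instead of calling Add() for every row.
$ListOfDocuments = foreach ($Row in $rows) {
    New-Object -TypeName PSObject -Property @{
        'DOCNUM'      = $Row.DOCNUM
        'SOURCE'      = $Row.DOCLOC
        'DESTINATION' = $Row.DOCLOC -replace $srcPath, $dstPath
    }
}

$ListOfDocuments | Format-Table -AutoSize
```

Both path values are computed once, and the loop body only does the per-row work.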

[Edit: per Ansgar Wiechers, the [PSCustomObject] type accelerator is only available in PowerShell 3 and later.]
One other thing that may help is to replace New-Object with [PSCustomObject], which is usually somewhat faster. Something like this:
$DocObj = [PSCustomObject]$DocProperties
Another way to use that type accelerator is to do what Ansgar Wiechers did in his code sample, but with the accelerator instead of the cmdlet, like this:
[PSCustomObject]@{
    'DOCNUM' = $Row.DOCNUM
    'SOURCE' = $Row.DOCLOC
    'DESTINATION' = $Row.DOCLOC -replace $srcPath, $dstPath
}
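If you want to check the difference on your own machine, a rough measurement along these lines may help. The property values and the iteration count are arbitrary assumptions, and the timings will vary by PowerShell version and hardware, so no expected numbers are given.

```powershell
# Arbitrary sample properties and iteration count, chosen only for illustration.
$props = @{ DOCNUM = 1; SOURCE = 'a'; DESTINATION = 'b' }
$iterations = 20000

# Time object creation through the cmdlet.
$tNewObject = Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        $null = New-Object -TypeName PSObject -Property $props
    }
}

# Time object creation through the type accelerator (PowerShell 3+).
$tAccelerator = Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        $null = [PSCustomObject]$props
    }
}

'New-Object:       {0:N0} ms' -f $tNewObject.TotalMilliseconds
'[PSCustomObject]: {0:N0} ms' -f $tAccelerator.TotalMilliseconds
```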
hope that helps,
lee

Related

Prepend folder name to paths in an array

I'm looking to prepend a folder name to the start of an array of (relative) paths using a foreach statement, but it's not making any changes to the array (no errors either)
Note: This is more for educational purposes than functional as I have it working using a for loop which I've commented out, but I'm interested in learning how the foreach statement works
$myFiles = @(
    "blah1\blah2\file1.txt"
    "blah3\blah4\file2.txt"
    "blah5\blah6\file3.txt"
)
$checkoutFolder = "folder1"
#for ($h = 0; $h -lt $myFiles.Length; $h++) {
#    $myFiles[$h] = $checkoutFolder + "\" + $myFiles[$h]
#}
foreach ($path in $myFiles) {
    $path = $checkoutFolder + "\" + $path
}
$myFiles
I also tried using a buffer variable e.g.
$buffer = $checkoutFolder + "\" + $path
$path = $buffer
But same result i.e.
OUTPUT:
blah1\blah2\file1.txt
blah3\blah4\file2.txt
blah5\blah6\file3.txt
I could think of two ways:
Create a new array with the modified data of the old array:
$myFiles = @(
    "blah1\blah2\file1.txt"
    "blah3\blah4\file2.txt"
    "blah5\blah6\file3.txt"
)
$checkoutFolder = "folder1"
#Create new array $myFilesnew
$myFilesnew = @()
#For each line in the old array
foreach ($file in $myFiles)
{
    #Create new row from modified row $file of $myFiles array
    $row = $checkoutFolder+"\"+$file
    #Add row $row to the new array $myFilesnew
    $myFilesnew+=$row
}
$myFilesnew
Modify each row of the existing array:
$myFiles = @(
    "blah1\blah2\file1.txt"
    "blah3\blah4\file2.txt"
    "blah5\blah6\file3.txt"
)
$checkoutFolder = "folder1"
$i=0
while($i -lt $myFiles.Count)
{
    #Get row $myFiles[$i] from the array, insert the folder name at position 0, write the result back to $myFiles[$i]
    $myFiles[$i]=$myFiles[$i].Insert(0,$checkoutFolder+"\");
    #Add +1 to $i
    $i++
}
$myFiles
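As for why the original foreach had no effect: assigning to the loop variable only rebinds that variable, it does not write back into the array. A small sketch, using a shortened version of the question's sample data, shows the difference between assigning to $path and capturing the loop output.

```powershell
$checkoutFolder = 'folder1'
$myFiles = @(
    'blah1\blah2\file1.txt'
    'blah3\blah4\file2.txt'
)

# Assigning to $path changes only the loop variable; the array is untouched.
foreach ($path in $myFiles) {
    $path = $checkoutFolder + '\' + $path
}
$unchanged = $myFiles[0]

# Emitting the new value and capturing the loop output builds a new array.
$myFiles = foreach ($path in $myFiles) {
    $checkoutFolder + '\' + $path
}
$myFiles
```

After the first loop $unchanged still holds the original relative path; after the second loop every element of $myFiles starts with folder1\.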
Better start using the Join-Path cmdlet to avoid creating paths with backslashes omitted or doubled.
Something like this would do it:
$checkoutFolder = "folder1"
$myFiles = "blah1\blah2\file1.txt", "blah3\blah4\file2.txt", "blah5\blah6\file3.txt" | ForEach-Object {
Join-Path -Path $checkoutFolder -ChildPath $_
}
$myFiles
output:
folder1\blah1\blah2\file1.txt
folder1\blah3\blah4\file2.txt
folder1\blah5\blah6\file3.txt
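You can also use -replace with the regex anchor ^ (start of string) to insert the folder name at the beginning of each string. The operator works on a whole array at once, which makes this a useful idiom for generating computer names too.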
'blah1\blah2\file1.txt','blah3\blah4\file2.txt','blah5\blah6\file3.txt' -replace '^','folder1\'
folder1\blah1\blah2\file1.txt
folder1\blah3\blah4\file2.txt
folder1\blah5\blah6\file3.txt

Powershell Performance

I have a problem with PowerShell performance while searching a 40 GB log file.
I need to check whether any of 1000 email addresses appear in this 40 GB file. As written, this would take 180 hours :D Any ideas?
$logFolder = "H:\log.txt"
$adressen = Get-Content H:\Adressen.txt
$ergebnis = @()
foreach ($adr in $adressen) {
    $suche = Select-String -Path $logFolder -Pattern "\[\(\'from\'\,.*$adr.*\'\)\]" -List
    $aktiv = $false
    $adr
    if ($suche) {
        $aktiv = $true
    }
    if ($aktiv -eq $true) {
        $ergebnis += $adr + ";Ja"
    }
    else {
        $ergebnis += $adr + ";Nein"
    }
}
$ergebnis | Out-File H:\output.txt
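Don't read the file 1000 times.
Build one regex with all 1000 addresses (it's going to be a huge line, but hey, much smaller than 40 GB). The alternation has to sit in a group, otherwise | splits the whole pattern, and each address should be regex-escaped so that its dots match literally. Like:
$Pattern = "\[\('from',.*(?:$(($adressen | ForEach-Object { [regex]::Escape($_) }) -join '|')).*'\)\]"
Then run Select-String once (without -List, which stops at the first match per file) and save the result so you can do the address-by-address search in it. Hopefully the result will be much smaller than 40 GB, so that second search should be much faster.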
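A minimal sketch of that single-pass idea follows. The temporary sample log, its line format, and the three addresses are invented for illustration; the variable names follow the question.

```powershell
# Hypothetical sample data standing in for H:\log.txt and H:\Adressen.txt.
$logFile = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-log.txt'
@(
    "[('from', 'alice@example.com')]"
    "[('to', 'bob@example.com')]"
    "[('from', 'Carol <carol@example.com>')]"
) | Set-Content -Path $logFile
$adressen = 'alice@example.com', 'bob@example.com', 'carol@example.com'

# One grouped alternation of all (escaped) addresses.
$alternation = ($adressen | ForEach-Object { [regex]::Escape($_) }) -join '|'
$pattern = "\[\('from',.*(?:$alternation).*'\)\]"

# Single pass over the log; keep only the text of the matching lines.
$treffer = Select-String -Path $logFile -Pattern $pattern | ForEach-Object { $_.Line }

# Address-by-address check against the (much smaller) set of matching lines.
$ergebnis = foreach ($adr in $adressen) {
    $hits = @(@($treffer) -match [regex]::Escape($adr))
    if ($hits.Count -gt 0) { "$adr;Ja" } else { "$adr;Nein" }
}
$ergebnis

Remove-Item -Path $logFile
```

Like the original pattern, the second check is a substring match, so an address that is a suffix of another address would also count as found.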
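As mentioned in the comments, replace
$ergebnis = @()
with
$ergebnis = New-Object System.Collections.ArrayList
and
$ergebnis+=$adr + ";Ja"
with
[void]$ergebnis.Add("$adr;Ja")
or, respectively,
[void]$ergebnis.Add("$adr;Nein")
The [void] cast discards the index that Add() returns. This avoids copying the whole array on every append, which speeds up the result handling; the repeated scans of the log file remain the far larger cost.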

Get Color Palette of Image using PowerShell

I am trying to get the Color Palette of an image. I tried various methods, and now I use the following code in PowerShell, but I could not get the correct result:
$filename = "C:\Users\schoo\Desktop\bb.jpg"
$BitMap = [System.Drawing.Bitmap]::FromFile((Get-Item $filename).fullname)
Foreach($y in (1..($BitMap.Height-1))){
Foreach($x in (1..($BitMap.Width-1))){
$Pixel = $BitMap.GetPixel($X,$Y)
$BackGround = $Pixel.Name
}
$R = $Pixel | select -ExpandProperty R
$G = $Pixel | select -ExpandProperty G
$B = $Pixel | select -ExpandProperty B
$A = $Pixel | select -ExpandProperty A
$allClr = "$R" + "." + "$G" + "." + "$B" + "." + "$A"
$allClr
}
The result gives me more than a thousand RGB codes.
I assume that by "color palette" you mean the swathe of distinct colours that appear in the image.
A simple (and quite fast) way to select only a distinct subset of a collection is to use a hashtable.
# Load System.Drawing in case the session has not loaded it yet
Add-Type -AssemblyName System.Drawing
$filename = 'C:\Users\schoo\Desktop\bb.jpg'
$BitMap = [System.Drawing.Bitmap]::FromFile((Resolve-Path $filename).ProviderPath)
# A hashtable to keep track of the colors we've encountered
$table = @{}
foreach($h in 1..$BitMap.Height){
    foreach($w in 1..$BitMap.Width) {
        # Assign a value to the current Color key
        $table[$BitMap.GetPixel($w - 1,$h - 1)] = $true
    }
}
# The hashtable keys are our palette
$palette = $table.Keys
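The same hashtable idiom can be tried without an image file. The sketch below uses a made-up list of "R.G.B.A" strings (the format the question prints) in place of real pixels; each value becomes a key, so duplicates collapse automatically.

```powershell
# Hypothetical pixel values in the question's "R.G.B.A" format.
$pixels = '255.0.0.255', '0.255.0.255', '255.0.0.255', '0.0.255.255', '0.255.0.255'

# Each distinct value becomes one key; repeated values just overwrite the entry.
$seen = @{}
foreach ($p in $pixels) {
    $seen[$p] = $true
}

# The keys are the distinct values (sorted here only to get a stable order).
$palette = @($seen.Keys | Sort-Object)
$palette
```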

Powershell API Post Variable to Ducksboard

Trying to use the following Powershell script which I've taken from Github to push data into a Ducksboard dashboard. The function works excellently however I need to feed in a variable as part of the data. Of the two calls to the function included below the function which pushes in the actual value 44 works fine, however if I substitute it for my variable $qtybord the function falls over. I've tried a number of options to overcome the var being within the single quotes but cannot get it to work - can anyone help me?
# Squirt data to Ducksboard
function Execute-DucksboardApi
{
param(
[string] $url = $null,
[string] $data = $null,
[string] $apikey = $null,
[string] $contentType = "application/json",
[string] $codePageName = "UTF-8",
[string] $userAgent = $null
);
if ($url -and $data -and $apikey)
{
[System.Net.WebRequest]$webRequest = [System.Net.WebRequest]::Create($url);
$webRequest.ServicePoint.Expect100Continue = $false;
[System.Net.NetworkCredential]$credentials = New-Object System.Net.NetworkCredential($apikey, 'ignored');
$webRequest.Credentials = $credentials.GetCredential($url, 'Basic');
$webRequest.PreAuthenticate = $true;
$webRequest.ContentType = $contentType;
$webRequest.Method = "POST";
if ( $userAgent )
{
$webRequest.UserAgent = $userAgent;
}
$enc = [System.Text.Encoding]::GetEncoding($codePageName);
[byte[]]$bytes = $enc.GetBytes($data);
$webRequest.ContentLength = $bytes.Length;
[System.IO.Stream]$reqStream = $webRequest.GetRequestStream();
$reqStream.Write($bytes, 0, $bytes.Length);
$reqStream.Flush();
$resp = $webRequest.GetResponse();
$rs = $resp.GetResponseStream();
[System.IO.StreamReader]$sr = New-Object System.IO.StreamReader -argumentList $rs;
$sr.ReadToEnd();
}
}
$qtybord = 44
Execute-DucksboardApi -url 'https://push.ducksboard.com/v/123752/' -data '{"value": $qtybord}' -apikey 'tu2j3d3epqytWZD1haHnjJSJ1NqBrmvPe5SONc0VYge4BbIPi0'
Execute-DucksboardApi -url 'https://push.ducksboard.com/v/123752/' -data '{"value": 44}' -apikey 'tu2j3d3epqytWZD1haHnjJSJ1NqBrmvPe5SONc0VYge4BbIPi0'
Try this:
-data "{`"value`": $qtybord}"
or
-data "{""value"": $qtybord}"
Variables aren't expanded inside single quotes. Inside double quotes they are, but you need to escape the double quotes that are part of the string.
So make the -data parameter take $data instead and make $data = '{"value": ' + $qtybord + '}' or just wrap that whole thing in parens after -data.
Also, if you're on PowerShell v3 or later you can play with something like this:
$data = New-Object -Type PSObject -Property @{
    value = $qtybord
} | ConvertTo-Json
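The quoting variants from these answers can be compared side by side without calling the API. This sketch only builds the strings; the -Compress switch is used so the ConvertTo-Json output fits on one line.

```powershell
$qtybord = 44

# Single quotes: no expansion, the literal text $qtybord is sent.
$single  = '{"value": $qtybord}'

# Double quotes with escaped inner quotes: the variable is expanded.
$escaped = "{`"value`": $qtybord}"
$doubled = "{""value"": $qtybord}"

# String concatenation around single-quoted pieces.
$concat  = '{"value": ' + $qtybord + '}'

# Let PowerShell (v3+) generate the JSON from an object.
$json    = [PSCustomObject]@{ value = $qtybord } | ConvertTo-Json -Compress

$single, $escaped, $doubled, $concat, $json
```

Only the first string still contains the variable name; the other four carry the number 44.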

formatting csv files and powershell

Ok so we have a manual process that runs through PL/SQL Developer to run a query and then export to csv.
I am trying to automate that process using powershell since we are working in a windows environment.
I have created two files, one from the automated and one from the manual process, that seem to be exact duplicates. They don't behave the same, though, so I assume I am missing some hidden characters, but I can't find them or figure out how to remove them.
The most obvious example of the difference is opening them in Excel. The manual file opens with each column in its own separate column. The automated file instead puts everything into one column.
Can anybody shed some light? I am hoping that resolving this, or at least getting some information, will help with the bigger problem of the file not being processed correctly.
Thanks.
Example header of the automated file (everything ends up in one column):
"rownum","year","month","batch","facility","transfer_facility","trans_dt","meter","ticket","trans_product","trans","shipper","customer","supplier","broker","origin","destination","quantity"
Example header of the manual file (separate columns):
"","ROWNUM","RPT_YR","RPT_MO","BATCH_NBR","FACILITY_CD","TRANSFER_FACILITY_CD","TRANS_DT","METER_NBR","TKT_NBR","TRANS_PRODUCT_CD","TRANS_CD","SHIPPER_CD","CUSTOMER_NBR","SUPPLIER_NBR","BROKER_CD","ORIGIN_CD","DESTINATION_CD","NET_QTY"
$connectionstring = "Data Source=database;User Id=user;Password=password"
$connection = New-Object System.Data.OracleClient.OracleConnection($connectionstring)
$command = New-Object System.Data.OracleClient.OracleCommand($query, $connection)
$connection.Open()
Write-Host -ForegroundColor Black " Opening Oracle Connection"
Start-Sleep -Seconds 2
#Getting data from oracle
Write-Host
Write-Host -ForegroundColor Black "Getting data from Oracle"
$Oracle_data=$command.ExecuteReader()
Start-Sleep -Seconds 2
if ($Oracle_data.read()){
Write-Host -ForegroundColor Green "Connection Success"
while ($Oracle_data.read()) {
#Variables for recordset
$rownum = $Oracle_data.GetDecimal(0)
$rpt_yr = $Oracle_data.GetDecimal(1)
$rpt_mo = $Oracle_data.GetDecimal(2)
$batch_nbr = $Oracle_data.GetString(3)
$facility_cd = $Oracle_data.GetString(4)
$transfer_facility_cd = $Oracle_data.GetString(5)
$trans_dt = $Oracle_data.GetDateTime(6)
$meter_nbr = $Oracle_data.GetString(7)
$tkt_nbr = $Oracle_data.GetString(8)
$trans_product_cd = $Oracle_data.GetString(9)
$trans_cd = $Oracle_data.GetString(10)
$shipper_cd = $Oracle_data.GetString(11)
$customer_nbr = $Oracle_data.GetString(12)
$supplier_nbr = $Oracle_data.GetString(13)
$broker_cd = $Oracle_data.GetString(14)
$origin_cd = $Oracle_data.GetString(15)
$destination_cd = $Oracle_data.GetString(16)
$net_qty = $Oracle_data.GetDecimal(17)
#Define new file
$filename = "Pipeline" #Get-Date -UFormat "%b%Y"
$filename = $filename + ".csv"
$fileLocation = $newdir + "\" + $filename
$fileExists = Test-Path $fileLocation
#Create object to hold record
$obj = new-object psobject -prop @{
rownum = $rownum
year = $rpt_yr
month = $rpt_mo
batch = $batch_nbr
facility = $facility_cd
transfer_facility = $transfer_facility_cd
trans_dt = $trans_dt
meter = $meter_nbr
ticket = $tkt_nbr
trans_product = $trans_product_cd
trans = $trans_cd
shipper = $shipper_cd
customer = $customer_nbr
supplier = $supplier_nbr
broker = $broker_cd
origin = $origin_cd
destination = $destination_cd
quantity = $net_qty
}
$records += $obj
}
}else {
Write-Host -ForegroundColor Red " Connection Failed"
}
#Write records to file with headers
$records | Select-Object rownum,year,month,batch,facility,transfer_facility,trans_dt,meter,ticket,trans_product,trans,shipper,customer,supplier,broker,origin,destination,quantity |
ConvertTo-Csv |
Select -Skip 1|
Out-File $fileLocation
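Why are you skipping the first row (usually the headers)? Note that in Windows PowerShell, ConvertTo-Csv without -NoTypeInformation emits a #TYPE line first, so the skip may only be dropping that line. Out-File also writes UTF-16 by default in Windows PowerShell, which is a likely reason Excel puts everything into one column. Try using Export-Csv instead: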
#Write records to file with headers
$records | Select-Object rownum, year, month, batch, facility, transfer_facility, trans_dt, meter, ticket, trans_product, trans, shipper, customer, supplier, broker, origin, destination, quantity |
Export-Csv $fileLocation -NoTypeInformation
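To see what Export-Csv writes without touching the database, here is a small sketch with two invented records and a temporary output file. Only four of the question's columns are used to keep it short.

```powershell
# Hypothetical records standing in for the rows read from Oracle.
$records = @(
    [PSCustomObject]@{ rownum = 1; year = 2014; month = 1; batch = 'B001' }
    [PSCustomObject]@{ rownum = 2; year = 2014; month = 2; batch = 'B002' }
)

# Temporary output path, assumed here instead of $newdir\Pipeline.csv.
$fileLocation = Join-Path ([System.IO.Path]::GetTempPath()) 'Pipeline-sample.csv'

# Select-Object fixes the column order; -NoTypeInformation suppresses the #TYPE line.
$records |
    Select-Object rownum, year, month, batch |
    Export-Csv -Path $fileLocation -NoTypeInformation

$lines = Get-Content -Path $fileLocation
$lines

Remove-Item -Path $fileLocation
```

The first line of the file is the header row, so there is nothing left to skip.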

Resources