Get-ChildItem - Slow - powershell-4.0

Running PowerShell v4.0. Below is a script that searches through thousands of files, selecting them by the date stamp in each filename within a given date range, and finds all matching strings. However, the Get-ChildItem cmdlet is extremely slow. I'm new to PowerShell. Is there any way to make this more efficient? Perhaps read in batches?
######################################################
# Date ranges and filename
$startdate = [datetime]'01/26/2017'
$enddate   = [datetime]'02/05/2017'
$Filename  = "DACNBA0124IDT030"
######################################################
# Progress
######################################################
$array = do {
    $startdate.ToString('yyyy_MM_dd*')
    $startdate = $startdate.AddDays(1)
} until ($startdate -gt $enddate)
$files = $array | ForEach-Object { "G:\Live Engineering\AsRuns\Test\$_*" }
Write-Host $files
$Matches = Get-ChildItem $files -Recurse -Force -OutBuffer 20000 |
    Select-String $Filename |
    Where-Object -Verbose { $_.Line -notlike '*.mxf' }
$Matches.Matches.Count > "C:\Users\user\Desktop\report app\Log.txt"
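One way to cut the enumeration cost is to let the filesystem do the name matching: issue one provider-side -Filter per day instead of expanding a list of wildcard paths in a single Get-ChildItem call. A minimal sketch using the paths from the question (note it counts matching lines rather than individual regex matches):
$startdate = [datetime]'01/26/2017'
$enddate   = [datetime]'02/05/2017'
$Filename  = 'DACNBA0124IDT030'
$root      = 'G:\Live Engineering\AsRuns\Test'

$totalMatches = 0
for ($day = $startdate; $day -le $enddate; $day = $day.AddDays(1)) {
    # -Filter is evaluated by the filesystem provider, so files whose names
    # don't start with the date stamp are never handed to PowerShell at all.
    $found = Get-ChildItem -Path $root -Filter ($day.ToString('yyyy_MM_dd') + '*') -Recurse -Force |
        Select-String -SimpleMatch -Pattern $Filename |
        Where-Object { $_.Line -notlike '*.mxf' }
    $totalMatches += @($found).Count
}
$totalMatches > "C:\Users\user\Desktop\report app\Log.txt"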

I have a helper function that I use every day at work; it's available in my GitHub repo.
It contains two cmdlets that do the same and more.
Hope it helps.
https://github.com/kvprasoon/PSReadScript
Feel free to post issues, bugs, and requests there.
Regards,
Kvprasoon

Related

PS - Find Folders that Haven't Had their Files Modified in Some Time

We're migrating our FTP, and I would like to migrate only the folders that have had files in them used or written in the last 6 months. I would think this would be something I'd find all over the place with Google, but all the scripts I've found have the same fatal flaw.
Everything I find depends on the "Date modified" of the folder. The problem with that is, I have PLENTY of folders that show a "Date Modified" of years ago, yet when you dig into them, there are files being created and written as recently as today.
Example:
D:\Weblogs may show a date modified of 01/01/2018; however, when you dig into it, there is some folder, let's call it "Log6", and THAT folder has a log file in it that was modified as recently as yesterday.
All these scripts I'm seeing pull the date modified of the top folder, which just isn't accurate.
Is there any way around this? I would expect something like:
Get all folders at some top level, then foreach through the CONTENTS of those folders looking for files with a LastWriteTime -lt (Get-Date).AddDays(-180) filter. If something "new" is found, don't add the overarching directory to the array; otherwise, list the directory.
Any ideas?
Edit: I've tried this
$path = <some dir>
gci -Path $path -Directory | Where-Object {$_.LastWriteTime -lt (Get-Date).AddMonths(-6)}
and
$filter = {$_.LastWriteTime -lt (Get-Date).AddDays(-180)}
#$OldStuff = gci "D:\FTP\BELAMIINC" -file | Where-Object $filter
$OldFolders = gci "D:\FTP\BELAMIINC" -Directory | ForEach-Object {
    gci "D:\FTP\BELAMIINC" -file | Where-Object $filter
}
Write-Host $OldFolders
Give this a try; I added comments so you can follow the thought process.
The use of -Force is mainly to find hidden files and folders.
$path  = '/path/to/parentFolder'
$limit = [datetime]::Now.AddMonths(-6)
# Get all the child folders recursively
$folders = Get-ChildItem $path -Recurse -Directory -Force
# Loop over the folders to see if there is at least one file
# that has been modified or accessed after the limit date
$result = foreach($folder in $folders)
{
    :inner foreach($file in Get-ChildItem $folder -File -Force)
    {
        if($file.LastAccessTime -gt $limit -or $file.LastWriteTime -gt $limit)
        {
            # If this condition is true at least once, break this loop and return
            # the folder's FullName plus the file and its dates for reference
            [pscustomobject]@{
                FullName       = $folder.FullName
                File           = $file.Name
                LastAccessTime = $file.LastAccessTime
                LastWriteTime  = $file.LastWriteTime
            }
            break inner
        }
    }
}
$result | Out-GridView
If you need to find the folders that don't have files recently modified you can use the $folders array and compare it against $result:
$folders.where({$_.FullName -notin $result.FullName}).FullName

Accurate time measurements when copying data via powershell

Hi, I have been trying to find a way to estimate how long it will take to move databases from one location to another. My online research has helped me through a few issues so far, but I seem to be stuck: the script appears to use the correct commands to see all the files that need to be counted, yet it reports that copying a 5 TB database will take only 22 milliseconds. So either I have a faster network and server than I ever knew, or I have screwed this up somehow in a way I cannot see.
$item = Get-ChildItem 'D:\SQL01' -Recurse
$d = "E:\SQL01"
$results = @()
$results = Foreach ($i in $item) {
    Measure-Command -Expression {
        Copy-Item -LiteralPath $i $d
    }
}
($results | Measure-Object -Property TotalSeconds -Sum).Sum
$results -f "c"
Reading over this it seems fine, and it even returns a sum of the times, but there is no way that is accurate. Please leave a comment if anyone sees where I did something wrong or thinks there is something I could try differently.
Here is an example of Write-Progress in action:
# Get all directories on D:\SQL01 recursively
$directories = Get-ChildItem 'D:\SQL01' -Directory -Recurse
# Set a destination folder
$destination = 'E:\SQL01'
$dirCount = $directories.Count
$i = 0
foreach($directory in $directories)
{
    $progress = @{
        Activity        = "Copying - {0}" -f $directory.FullName
        Status          = "Folder $i of $dirCount"
        PercentComplete = $i++ / $dirCount * 100
    }
    Write-Progress @progress
    Copy-Item -Path $directory -Destination $destination -Recurse
}
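For the timing itself, a minimal sketch using a Stopwatch, assuming you want the wall-clock time of the whole copy rather than summing per-file Measure-Command results:
# Sketch only: Stopwatch measures elapsed wall-clock time for the entire copy.
$source      = 'D:\SQL01'
$destination = 'E:\SQL01'

$sw = [System.Diagnostics.Stopwatch]::StartNew()
Copy-Item -Path $source -Destination $destination -Recurse
$sw.Stop()

# "c" is the constant TimeSpan format, e.g. 00:42:13.1234567
'Copy took {0:c}' -f $sw.Elapsed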

Improve the efficiency of my PowerShell script

The code below takes 400+ numbers from a list.txt file and checks whether each exists within any of the files under the specified folder path.
The script is very slow and has yet to complete; it was still running after 25 minutes. The folder we are searching is 507 MB (532,369,408 bytes) and contains 1,119 files and 480 folders. Any help improving the speed and efficiency of the search is greatly appreciated.
$searchWords = (gc 'C:\temp\list.txt') -split ','
$results = @()
Foreach ($sw in $searchWords)
{
    $files = gci -Path 'C:\Users\david.craven\Dropbox\Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*' -Filter "*$sw*" -Recurse
    foreach ($file in $files)
    {
        $object = New-Object System.Object
        $object | Add-Member -Type NoteProperty -Name SearchWord -Value $sw
        $object | Add-Member -Type NoteProperty -Name FoundFile -Value $file.FullName
        $results += $object
    }
}
$results | Export-Csv C:\temp\output.csv -NoTypeInformation
The following should speed up your task substantially:
If the intent is truly to look for the search words in the file names:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse -PipelineVariable file |
    Select-Object -ExpandProperty Name |
    Select-String -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e='Pattern'},
                  @{n='FoundFile'; e={$file.FullName}} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
If the intent is to look for the search words in the files' contents:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse |
    Select-String -List -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e='Pattern'},
                  @{n='FoundFile'; e='Path'} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
The keys to performance improvement:
Perform the search with a single command, by passing all search words to Select-String. Note: -List limits matching to 1 match (by any of the given patterns).
Instead of constructing custom objects in a script block with New-Object and Add-Member, let Select-Object construct the objects for you directly in the pipeline, using calculated properties.
Instead of iteratively building an intermediate array with +=, which behind the scenes recreates the array every time, use a single pipeline to send the result objects directly to Export-Csv.
So there are definitely some basic things in the PowerShell code you posted that can be improved, but it may still not be super fast. Based on the sample you gave us I'll assume you're looking to match the file names against a list of words. You're looping through the list of words (400 iterations) and in each loop you're looping through all 1,119 files. That's a total of 447,600 iterations!
Assuming you can't reduce the number of iterations in the loop, let's start by making each iteration faster. The Add-Member cmdlet is going to be really slow, so switch that approach up by casting a hashtable to the [PSCustomObject] type accelerator:
[PSCustomObject]@{
    SearchWord = $Word
    File       = $File.FullName
}
Also, there is no reason to pre-create an array object and then add each file to it. You can simply capture the output of the foreach loop in a variable:
$Results = Foreach ($Word in $Words)
{
...
So a faster loop might look like this:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$Results = Foreach ($Word in $Words)
{
    foreach ($File in $Files)
    {
        if ($File.BaseName -match $Word)
        {
            [PSCustomObject]@{
                SearchWord = $Word
                File       = $File.FullName
            }
        }
    }
}
A simpler approach might be to use Where-Object on the files array:
$Results = Foreach ($Word in $Words)
{
$Files | Where-Object BaseName -match $Word
}
Try both and test out the performance.
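A quick way to compare the two, assuming $Words and $Files are populated as above (a sketch only; run it against your own data):
# Time each approach; the one with the lower TotalSeconds wins.
$t1 = Measure-Command {
    foreach ($Word in $Words) {
        foreach ($File in $Files) {
            if ($File.BaseName -match $Word) { $File.FullName }
        }
    }
}
$t2 = Measure-Command {
    foreach ($Word in $Words) {
        $Files | Where-Object BaseName -match $Word
    }
}
'{0:n2}s (explicit loop) vs {1:n2}s (Where-Object)' -f $t1.TotalSeconds, $t2.TotalSeconds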
So if speeding up the loop doesn't meet your needs, try removing the loop entirely. You could use regex and join all the words together:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$WordRegex = $Words -join '|'
$Files | Where basename -match $WordRegex
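One caveat with joining the raw words: any regex metacharacters in the list (dots, parentheses, plus signs) will change the meaning of the combined pattern. A small refinement is to escape each word first so every match stays literal:
# Escape each word so it is matched literally inside the alternation.
$WordRegex = ($Words | ForEach-Object { [regex]::Escape($_) }) -join '|'
$Files | Where-Object BaseName -match $WordRegex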

Cannot get file last modified time via PowerShell script

What I'm trying to do is search a directory and output the file path, name, version, and last modified time into a txt file.
My code is below:
function Get-Version($filePath)
{
    $name = @{Name="Name"; Expression={split-path -leaf $_.FileName}}
    $path = @{Name="Path"; Expression={split-path $_.FileName}}
    $time = @{Name="Last Modified"; Expression={Get-Date $_.LastWriteTime}}
    dir -recurse -path $filePath | % { if ($_.Name -match "(.*exe)$") {$_.VersionInfo} } | select $path, $name, $time, FileVersion
}
Get-Version('E:\PS test') >> "version_info.txt"
However, the output txt has name, path, and version, but no last modified time.
Any hints?
Thanks!
It's because you're returning the .VersionInfo property from your ForEach-Object (%) call, and .LastWriteTime is a property of the file object, not the version info. Have a look at this:
function Get-Version($filePath)
{
    $name = @{Name="Name"; Expression={split-path -leaf $_.VersionInfo.FileName}}
    $path = @{Name="Path"; Expression={split-path $_.VersionInfo.FileName}}
    $time = @{Name="Last Modified"; Expression={Get-Date $_.LastWriteTime}}
    $version = @{Name="FileVersion"; Expression={$_.VersionInfo.FileVersion}}
    dir -recurse -path $filePath | ? { $_.Name -match "(.*exe)$" } | select $path, $name, $time, $version
}
By changing the definitions of $name and $path to refer directly to the version info, you can operate on the original object. I also added $version to get at the FileVersion you were referring to in the select.
That makes the ForEach-Object redundant, since you'd only be passing along the input. And since you were only checking a condition in it anyway, it's easier to convert it to Where-Object (?).
Expanding your aliases makes it look like this:
function Get-Version($filePath)
{
    $name = @{Name="Name"; Expression={Split-Path -Leaf $_.VersionInfo.FileName}}
    $path = @{Name="Path"; Expression={Split-Path $_.VersionInfo.FileName}}
    $time = @{Name="Last Modified"; Expression={Get-Date $_.LastWriteTime}}
    $version = @{Name="FileVersion"; Expression={$_.VersionInfo.FileVersion}}
    Get-ChildItem -Recurse -Path $filePath | Where-Object { $_.Name -match "(.*exe)$" } | Select-Object $path, $name, $time, $version
}
However I should also point out that you can filter the file names directly in dir (Get-ChildItem), making the Where-Object superfluous too:
function Get-Version($filePath)
{
    $name = @{Name="Name"; Expression={Split-Path -Leaf $_.VersionInfo.FileName}}
    $path = @{Name="Path"; Expression={Split-Path $_.VersionInfo.FileName}}
    $time = @{Name="Last Modified"; Expression={Get-Date $_.LastWriteTime}}
    $version = @{Name="FileVersion"; Expression={$_.VersionInfo.FileVersion}}
    Get-ChildItem -Recurse -Path $filePath -Filter *.exe | Select-Object $path, $name, $time, $version
}
And then based on your comment, I realized it can be simplified even more:
function Get-Version($filePath)
{
    $path = @{Name="Path"; Expression={$_.DirectoryName}}
    $time = @{Name="Last Modified"; Expression={$_.LastWriteTime}}
    $version = @{Name="FileVersion"; Expression={$_.VersionInfo.FileVersion}}
    Get-ChildItem -Recurse -Path $filePath -Filter *.exe | Select-Object $path, Name, $time, $version
}
$name is not needed because the file object already has a property called .Name that has the file name.
$path can be simplified because $_.DirectoryName already has the path.
$time can be simplified because the .LastWriteTime property is already a [DateTime] so you don't need Get-Date.
The only reason you still need the name/expression hashes for those is to have the fields be named something other than the underlying property. If you don't care about that, you could do this:
function Get-Version($filePath)
{
    $version = @{Name="FileVersion"; Expression={$_.VersionInfo.FileVersion}}
    Get-ChildItem -Recurse -Path $filePath -Filter *.exe | Select-Object DirectoryName, Name, LastWriteTime, $version
}
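For completeness, the call site can then drop the parentheses; PowerShell functions take space-separated arguments, and the parenthesized call in the question only works because there happens to be a single argument:
Get-Version 'E:\PS test' >> "version_info.txt"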

Powershell - Speeding up writing to files

I wrote this script to find all of the folders in a directory and, for each folder, check a common file inside it to see if certain strings exist, adding them if not. I needed to insert the strings in particular places; not really knowing how to do that, I opted for a simpler find-and-replace at the spots where the strings needed to be inserted. Anyway, this script takes almost an hour to work through 800 files. I'm hoping some experienced members can point out ways to make the task quicker, as I have only been working with PowerShell for two days. Many thanks!
# First find and replace items.
$FindOne =
$ReplaceOneA =
$ReplaceOneB =
$ReplaceOneC =
# Second find and replace items.
$FindTwo =
$ReplaceTwo =
# Strings to test if exist.
# To avoid duplicate entries.
$PatternOne =
$PatternTwo =
$PatternThree =
$PatternFour =
# Gets window folder names.
$FilePath = "$ProjectPath\$Station\WINDOW"
$Folders = Get-ChildItem $FilePath | Where-Object {$_.mode -match "d"}
# Adds folder names to an array.
$FolderName = @()
$Folders | ForEach-Object { $FolderName += $_.Name }
# Adds code to each builder file.
ForEach ($Name in $FolderName) {
    $File = "$FilePath\$Name\main.xaml"
    $Test = Test-Path $File
    # First tests if file exists. If not, no action.
    If ($Test -eq $True) {
        $StringOne = Select-String -Pattern $PatternOne -Path $File
        $StringTwo = Select-String -Pattern $PatternTwo -Path $File
        $StringThree = Select-String -Pattern $PatternThree -Path $File
        $StringFour = Select-String -Pattern $PatternFour -Path $File
        $Content = Get-Content $File
        # If namespaces or object don't exist, add them.
        If ($StringOne -eq $null) {
            $Content = $Content -Replace $FindOne, $ReplaceOneA
        }
        If ($StringTwo -eq $null) {
            $Content = $Content -Replace $FindOne, $ReplaceOneB
        }
        If ($StringThree -eq $null) {
            $Content = $Content -Replace $FindOne, $ReplaceOneC
        }
        If ($StringFour -eq $null) {
            $Content = $Content -Replace $FindTwo, $ReplaceTwo
        }
        $Content | Set-Content $File
    }
}
# End of program.
You could try writing to the file with a stream, like this:
$stream = [System.IO.StreamWriter] $File
$stream.WriteLine($Content)
$stream.Close()
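Another option, assuming the whole file fits comfortably in memory (it already does, since the script reads it with Get-Content), is to write every line in one call. Note that the .NET file APIs resolve relative paths against the process working directory, so pass a full path; $File and $Content here are the variables from the script above:
# One buffered write of the whole array, instead of line-by-line output.
[System.IO.File]::WriteAllLines($File, [string[]]$Content)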
