Extract timestamp from filename and sort

I'm trying to look through each item in a folder and add each item to an array sorted by the datestamp in the filename.
For example, I have three files:
myfile_20150813_040949.txt
myfile_20150812_030949.txt
myfile_20150812_010949.txt
I'm not sure how to parse out the time from each and add them to an array in ascending order. Any help would be appreciated.

I am assuming that you are looking to sort the files by the timestamp parsed from the file name in this example. It may not be the best RegEx approach, but it works in testing.
#RegEx pattern to parse the timestamps
$Pattern = '.*_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.txt'
$List = New-Object System.Collections.ArrayList
$Temp = New-Object System.Collections.ArrayList
Get-ChildItem | ForEach {
    #Make sure the file matches the pattern
    If ($_.Name -match $Pattern) {
        Write-Verbose "Add $($_.Name)" -Verbose
        $Date = $Matches[2],$Matches[3],$Matches[1] -join '/'
        $Time = $Matches[4..6] -join ':'
        [void]$Temp.Add(
            (New-Object PSObject -Property @{
                Date = [datetime]"$($Date) $($Time)"
                File = $_
            })
        )
    }
}
#Sort the files by the parsed timestamp and add to the main list
$List.AddRange(@($Temp | Sort Date | Select -Expand File))
#Clear out the temp collection
$Temp.Clear()
#Display the results
$List
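If you don't actually need the intermediate ArrayLists, a more compact variant (a minimal sketch built on the same regex, only checked against the sample names) sorts directly on a calculated key:
# Sketch: sort in one pipeline using the regex captures as a datetime sort key
$Pattern = '.*_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.txt'
Get-ChildItem -Filter *.txt |
    Where-Object { $_.Name -match $Pattern } |
    Sort-Object { [datetime]::ParseExact(($_.Name -replace $Pattern, '$1$2$3$4$5$6'), 'yyyyMMddHHmmss', $null) }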

Another option is to use the string method .Split() together with the static [datetime] method TryParseExact(). Go through each file, add a "FromFileDate" property, and then sort on that.
$path = "C:\temp"
Get-ChildItem -Filter "*.txt" -Path $path | ForEach-Object{
$date = ($_.BaseName).Split("_",2)[1]
$result = New-Object DateTime
if([datetime]::TryParseExact($date,"yyyyMMdd_hhmmss",[System.Globalization.CultureInfo]::InvariantCulture,[System.Globalization.DateTimeStyles]::None,[ref]$result)){
# This is a good date
Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value $result -PassThru
} Else {
# Could not parse date from filename
Add-Member -InputObject $_ -MemberType NoteProperty -Name "FromFileDate" -Value "Could not Parse" -PassThru
}
} | Select-Object Name,fromfiledate | Sort-Object fromfiledate
We take the basename of each text file and split it into two parts at the first underscore. Using TryParseExact we then attempt to convert the "date" string using the format "yyyyMMdd_HHmmss". Because we use TryParseExact, the code simply continues if a filename date cannot be parsed.
Sample Output
Name FromFileDate
---- ------------
myfile_20150812_030949.txt 8/12/2015 3:09:49 AM
myfile_20150813_040949.txt 8/13/2015 4:09:49 AM
files.txt Could not Parse
If you didn't want the erroneous entries in the output, a simple Where-Object{$_.fromfiledate -is [datetime]} would remove them.
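For example, assuming $parsed holds the output of the pipeline above (the variable name is just for illustration), the filter would look like this:
# Sketch: drop entries whose filename date could not be parsed
$parsed | Where-Object { $_.fromfiledate -is [datetime] } | Sort-Object fromfiledate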

Related

How to select the file with the maximum version number for each file

I want to keep only the file with the largest version number for each zip file in the folder using PowerShell. I wrote a script but it returns all the files. How can I modify it to select only the file with the largest version?
$files = Get-ChildItem -Filter "*.zip"
$max = $files |Measure-Object -Maximum| ForEach-Object {[int]($_.Split("_")[-1].Split(".")[0])}
$largestFiles = $files | Where-Object {[int]($_.Split("_")[-1].Split(".")[0]) -eq $max}
Write-Output $largestFiles
Expectation:
A1_Fantasic_World_20.zip
A1_Fantasic_World_21.zip
B1_Mythical_Realms_11.zip
B1_Mythical_Realms_12.zip
C1_Eternal_Frame_Corporation_2.zip
C1_Eternal_Frame_Corporation_3.zip
↓
A1_Fantasic_World_21.zip
B1_Mythical_Realms_12.zip
C1_Eternal_Frame_Corporation_3.zip
A1_Fantasic_World's biggest number is 21. B1_Mythical_Realms's is 12. C1_Eternal_Frame_Corporation's is 3. So I want to choose the biggest version of each zip.
First you add the calculated properties to the file system objects that you use for filtering. Then with a combination of Group-Object, Sort-Object and Select-Object you can filter the desired files.
$FileList =
    Get-ChildItem -Filter *.zip |
        Select-Object -Property *,
        @{
            Name = 'Title'
            Expression = { ($_.BaseName -split '_')[0..$(($_.BaseName -split '_').count - 2)] -join '_' }
        },
        @{
            Name = 'Counter'
            Expression = { [INT]($_.BaseName -split '_')[-1] }
        }
$LastOnesList =
    $FileList |
        Group-Object -Property Title |
        ForEach-Object {
            $_.Group | Sort-Object -Property Counter | Select-Object -Last 1
        }
$LastOnesList |
    Select-Object -Property Name
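A more compact variant of the same idea (a sketch, assuming the version number is always the last underscore-separated token) groups on everything before that token and keeps the highest counter per group:
Get-ChildItem -Filter *.zip |
    Group-Object { ($_.BaseName -split '_')[0..(($_.BaseName -split '_').Count - 2)] -join '_' } |
    ForEach-Object {
        # Within each title group, keep the file with the largest trailing number
        $_.Group | Sort-Object { [int]($_.BaseName -split '_')[-1] } | Select-Object -Last 1
    }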

Improve the efficiency of my PowerShell script

The below code searches for 400+ numbers from a list.txt file to see if they exist within any files under the specified folder path.
The script is very slow and has yet to complete; it had not finished after 25 minutes of running. The folder we are searching is 507 MB (532,369,408 bytes) and it contains 1,119 files and 480 folders. Any help to improve the speed and efficiency of the search is greatly appreciated.
$searchWords = (gc 'C:\temp\list.txt') -split ','
$results = @()
Foreach ($sw in $searchWords)
{
    $files = gci -path 'C:\Users\david.craven\Dropbox\Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*' -filter "*$sw*" -recurse
    foreach ($file in $files)
    {
        $object = New-Object System.Object
        $object | Add-Member -Type NoteProperty -Name SearchWord -Value $sw
        $object | Add-Member -Type NoteProperty -Name FoundFile -Value $file.FullName
        $results += $object
    }
}
$results | Export-Csv C:\temp\output.csv -NoTypeInformation
The following should speed up your task substantially:
If the intent is truly to look for the search words in the file names:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'

Get-ChildItem -File -Path $path -Recurse -PipelineVariable file |
    Select-Object -ExpandProperty Name |
    Select-String -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e='Pattern'},
                  @{n='FoundFile'; e={$file.FullName}} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
If the intent is to look for the search words in the files' contents:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'

Get-ChildItem -File -Path $path -Recurse |
    Select-String -List -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e='Pattern'},
                  @{n='FoundFile'; e='Path'} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
The keys to performance improvement:
Perform the search with a single command, by passing all search words to Select-String. Note: -List limits matching to one match per file (by any of the given patterns).
Instead of constructing custom objects in a script block with New-Object and Add-Member, let Select-Object construct the objects for you directly in the pipeline, using calculated properties.
Instead of building an intermediate array iteratively with +=, which behind the scenes recreates the array every time, use a single pipeline to pipe the result objects directly to Export-Csv.
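If you want to see the cost of += for yourself, a quick illustrative comparison with Measure-Command (numbers will vary by machine) looks like this:
# Rebuilds the array on every iteration
(Measure-Command {
    $slow = @()
    1..10000 | ForEach-Object { $slow += $_ }
}).TotalMilliseconds

# Lets the pipeline collect the output once
(Measure-Command {
    $fast = 1..10000 | ForEach-Object { $_ }
}).TotalMilliseconds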
So there are definitely some basic things in the PowerShell code you posted that can be improved, but it may still not be super fast. Based on the sample you gave us I'll assume you're looking to match the file names against a list of words. You're looping through the list of words (400 iterations) and in each loop you're looping through all 1,119 files. That's a total of 447,600 iterations!
Assuming you can't reduce the number of iterations in the loop, let's start by making each iteration faster. The Add-Member cmdlet is going to be really slow, so switch that approach up by casting a hashtable to the [PSCustomObject] type accelerator:
[PSCustomObject]@{
    SearchWord = $Word
    File = $File.FullName
}
Also, there is no reason to pre-create an array object and then add each file to it. You can simply capture the output of the foreach loop in a variable:
$Results = Foreach ($Word in $Words)
{
...
So a faster loop might look like this:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File

$Results = Foreach ($Word in $Words)
{
    foreach ($File in $Files)
    {
        if ($File.BaseName -match $Word)
        {
            [PSCustomObject]@{
                SearchWord = $Word
                File = $File.FullName
            }
        }
    }
}
A simpler approach might be to use Where-Object on the files array:
$Results = Foreach ($Word in $Words)
{
$Files | Where-Object BaseName -match $Word
}
Try both and test out the performance.
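To test out the performance, one way (a sketch, assuming $Words and $Files are already populated as above) is to wrap each variant in Measure-Command and compare the elapsed times:
# Time the Where-Object version; repeat with the nested-foreach version and compare
$whereTime = Measure-Command {
    $Results = foreach ($Word in $Words) {
        $Files | Where-Object BaseName -match $Word
    }
}
$whereTime.TotalSeconds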
So if speeding up the loop doesn't meet your needs, try removing the loop entirely. You could use regex and join all the words together:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$WordRegex = $Words -join '|'
$Files | Where basename -match $WordRegex
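One caveat with the joined pattern: the words are treated as regular expressions, so if any of them contain metacharacters such as . or +, escape them first (a small defensive sketch):
# Escape each word so it is matched literally, then join into one alternation
$WordRegex = ($Words | ForEach-Object { [regex]::Escape($_) }) -join '|'
$Files | Where-Object BaseName -match $WordRegex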

Compare mkv's creationtime

I've been tasked with creating a script that checks to see if the office cameras we've set up have stopped uploading their feeds to the "Camera" share located on our Windows 2016 storage server. If the NEWEST .mkv is over an hour old compared to the current time (get-date) then the "problem" camera needs to be restarted manually. (No need to script that part.)
Here's what my Director has written so far:
#Variable Definitions start here
$numhours = 1
Get-ChildItem "d:\Shares\Cameras" | Foreach {
$folderToLookAt = ($_.FullName + "\*.mkv")
$result = Get-ChildItem -Recurse $folderToLookAt | Sort-Object CreationTime -Descending
echo $result[0].FullName
echo $result[0].CreationTime
}
The first variable really isn't used yet, but I'm kind of dumbstruck as to what to do next. The above successfully returns the full names and creation times of the newest .mkvs.
Suggestions on the next part?
Invert the logic - instead of searching all the files, sorting them, finding the most recent, and checking the date, do it the other way round.
Look for files created since the cutoff, and alert if there were none found:
$cutOffTime = [datetime]::Now.AddHours(-1)
Get-ChildItem "d:\Shares\Cameras" | Foreach {
    $folderToLookAt = ($_.FullName + "\*.mkv")
    $result = Get-ChildItem -Recurse $folderToLookAt | Where-Object { $_.CreationTime -gt $cutOffTime }
    if (-not $result)
    {
        "$($_.Name) has no files since the cutoff time"
    }
}
I'm assuming your paths look like:
D:\Shares\Cameras\Camera1\file1.mkv
D:\Shares\Cameras\Camera1\file2.mkv
D:\Shares\Cameras\Camera2\file1.mkv
D:\Shares\Cameras\Camera2\file2.mkv
D:\Shares\Cameras\Camera3\file1.mkv
.
.
.
If so, I would do something like this:
# The path to your files
$CameraShareRoot = 'D:\Shares\Cameras';
# Number of Hours
$NumberOfHours = 1;
# Date and time of significance. It's $NumberOfHours in the past.
$MinFileAge = (Get-Date).AddHours( - $NumberOfHours);
# Get all the folders at the camera share root
Get-ChildItem -Path $CameraShareRoot -Directory | ForEach-Object {
    # Get the most recently created file in each folder
    $_ | Get-ChildItem -Recurse -Filter '*.mkv' -File | Sort-Object -Property CreationTime -Descending | Select-Object -First 1
} | Where-Object {
    # Remove any files that were created after our datetime
    $_.CreationTime -lt $MinFileAge;
} | Select-Object -Property FullName, CreationTime
This will just output the full file name and creation time for stale cameras.
You could do something like this to email yourself a report when the results have any files:
# The path to your files
$CameraShareRoot = 'D:\Shares\Cameras';
# Number of Hours
$NumberOfHours = 1;
# Date and time of significance. It's $NumberOfHours in the past.
$MinFileAge = (Get-Date).AddHours( - $NumberOfHours);
# Get all the folders at the camera share root, save the results to $StaleCameraFiles
$StaleCameraFiles = Get-ChildItem -Path $CameraShareRoot -Directory | ForEach-Object {
    # Get the most recently created file in each folder
    $_ | Get-ChildItem -Recurse -Filter '*.mkv' -File | Sort-Object -Property CreationTime -Descending | Select-Object -First 1;
} | Where-Object {
    # Remove any files that were created after our datetime
    $_.CreationTime -lt $MinFileAge;
}
# If there are any stale camera files
if ($StaleCameraFiles) {
    # Send an email
    $MailMessage = @{
        SmtpServer = 'mail.example.com';
        To         = 'youremail@example.com';
        From       = 'youremail@example.com';
        Subject    = 'Stale Camera Files';
        Body       = $StaleCameraFiles | Select-Object -Property FullName, CreationTime | ConvertTo-Html -Fragment | Out-String;
        BodyAsHtml = $true;
    }
    Send-MailMessage @MailMessage;
}
Generally you will want to use LastWriteTime instead of CreationTime since the latter can be updated by a file move or copy, but maybe that's what you want here.
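If LastWriteTime turns out to be the better signal for your cameras, the swap is mechanical (a sketch reusing $CameraShareRoot and $MinFileAge from above):
Get-ChildItem -Path $CameraShareRoot -Directory | ForEach-Object {
    # Most recently written file in each camera folder
    $_ | Get-ChildItem -Recurse -Filter '*.mkv' -File |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1
} | Where-Object { $_.LastWriteTime -lt $MinFileAge } |
    Select-Object -Property FullName, LastWriteTime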
You have to compare the CreationTime date with (Get-Date).AddHours(-1). The AddHours method allows you to add hours to the DateTime, but also, with a negative value, to subtract them.
You can use the following example:
$Path = 'd:\Shares\Cameras'
$CreationTime = Get-ChildItem -Path $Path -Filter *.mkv |
Sort-Object -Property CreationTime -Descending |
Select-Object -First 1 -ExpandProperty CreationTime
if ($CreationTime -lt (Get-Date).AddHours(-1)) {
# your action here (restart, send mail, write output, ...)
}
It also optimizes your code a bit. ;)
$LatestFile = Get-ChildItem C:\Users\Connor\Desktop\ | Sort CreationTime | Select -Last 1
if ($LatestFile.CreationTime -gt (Get-Date).AddHours(-1)){
#It's Currently Working
} else {
#Do Other Stuff
}
try this :
Get-ChildItem "c:\temp" -Filter *.mkv -File | sort CreationTime -Descending |
select -First 1 | where CreationTime -lt (Get-Date).AddHours(-1) |
%{Write-Host "Alert !!" -ForegroundColor Red}

How do I filter directories with powershell on the amount of files contained

I am having issues finding the correct syntax I need to filter my results to list only directories with a file count above a specified amount (600 in my case).
This is my code so far;
$server_dir= "D:\backup"
$export_dir= "C:\support\spcount.txt"
if($server_dir)
{
$folders = Get-ChildItem $server_dir
$output = #()
foreach($folder in $folders)
{
$fname = $folder.Name
$fpath = $folder.FullName
$fcount = Get-ChildItem $fpath | Measure-Object | Select-Object -Expand Count
$obj = New-Object psobject -Property #{FolderName = $fname; FileCount = $fcount} | Format-List;
$output += $obj
}
#Output
$output | Tee-Object -FilePath $export_dir | Format-list FileCount
}
And I am getting positive results with this: it lists all child items within the backup dir. However, I need to filter this to only display (and output to text format) directories that contain 600 or more files.
Can anybody help me please?
I am fairly new to PowerShell, so please pull me up if this code is not the greatest; I am forever wanting to learn.
Thanks!
I think I found the issue. It's that Format-List statement at the end of your object creation statement. It pipes the newly created object through Format-List, and thus transforms it into something else.
$obj = New-Object psobject -Property @{FolderName = $fname; FileCount = $fcount} | Format-List
So if you remove that last bit, you'll get the object you expect
$obj = New-Object psobject -Property @{FolderName = $fname; FileCount = $fcount}
So when you use the where statement to filter, you'll actually have a FileCount property to filter on.
I detected it by running the $output through Get-Member which showed me it wasn't the object with the expected properties.
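You can reproduce that check yourself; the members reported with and without the trailing Format-List are very different (illustrative values only):
# Without Format-List: a PSCustomObject carrying the FolderName and FileCount NoteProperties
New-Object psobject -Property @{FolderName = 'x'; FileCount = 1} | Get-Member

# With Format-List: formatting records, no FolderName or FileCount to filter on
New-Object psobject -Property @{FolderName = 'x'; FileCount = 1} | Format-List | Get-Member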
So basically, here's your code, including fixes:
if($server_dir)
{
    # *** Added the -Directory flag, cause we don't need those pesky files ***
    $folders = Get-ChildItem $server_dir -Directory
    $output = @()
    foreach($folder in $folders)
    {
        $fname = $folder.Name
        $fpath = $folder.FullName
        $fcount = Get-ChildItem $fpath | Measure-Object | Select-Object -Expand Count
        # *** Format-List was dropped here to avoid losing the objects ***
        $obj = New-Object psobject -Property @{FolderName = $fname; FileCount = $fcount}
        $output += $obj
    }
    # *** And now the filter and we're done ***
    $output | where -Property FileCount -ge 600 | Tee-Object -FilePath $export_dir | Format-List FileCount
}
Note also the -directory to get only folders with get-childitem, and the -ge 600 (greater than or equal) instead of -gt 599 which is just a bit more obvious.
Remember that the Format-* statements actually transform the data passed through them. So you should only use those at the end of the pipeline to show data on screen or dump it to a file.
Don't use it to transform the data you still want to work with later on.
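A quick illustration of that rule (a sketch against any handy folder): the same Where-Object works before Format-List but finds nothing after it, because the formatting objects no longer carry a Length property:
# Works: FileInfo objects still expose Length
Get-ChildItem C:\Windows -File | Where-Object Length -gt 1MB

# Finds nothing: Format-List has replaced them with formatting objects
Get-ChildItem C:\Windows -File | Format-List | Where-Object Length -gt 1MB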
So in short you could do something like this to get that information.
Get-ChildItem C:\temp -Directory |
    Select Name,@{Label="Count";Expression={(Get-ChildItem $_.FullName -File -Recurse).Count}} |
    Where-Object{$_.Count -lt 10}
Let's see if we can incorporate that in your code. Your if statement is also kind of pointless: your variable contains a non-null, non-zero-length string, so it will always be True. You want it to run only if the directory exists, I imagine.
$server_dir= "D:\backup"
$export_dir= "C:\support\spcount.txt"
if(Test-Path $server_dir){
Get-ChildItem C:\temp -Directory |
Select Name,#{Label="Count";Expression={(Get-Childitem $_ -file -Recurse).Count}} |
Where-Object{$_.Count -lt 10} |
ConvertTo-Csv | Tee -File $export_dir | ConvertFrom-Csv
} Else {
Write-Warning "$server_dir does not exist."
}
The ConvertTo-Csv | Tee-Object | ConvertFrom-Csv combination is what gets the results both to the file and back to the screen as objects.
I see 2 ways to do this.
Filter it in your output like this:
$output | where -property FileCount -gt 599 | # ... your code to write to the output
Or not store it in the output array if it doesn't match the condition:
if ($fcount -gt 599) {
    $obj = New-Object psobject -Property @{FolderName = $fname; FileCount = $fcount}
    $output += $obj
}

Count, Sort and Group-By in Powershell

Are there any cool cmdlets that will help me do the following?
I want something in Powershell that is as simple as doing the same in SQL:
select RootElementName , count(*) from Table
group by RootElementName
order by RootElementName
I'm looping through all XML files in a directory, finding the root element of each XML file.
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force
foreach ($Filename in $AllFiles)
{
$FQFilename = $DirectoryName + $Filename
[xml]$xmlDoc = Get-Content $FQFilename
$rootElementName = $xmlDoc.SelectSingleNode("/*").Name;
Write-Host $FQFilename $rootElementName
}
Desired Result:
RootName Count
-------- -----
Root1 15
MyRoot 16
SomeRoot 24
I know I could either create two arrays, or an array of objects, store the root elements in the array, and do the counts all using typical code; I was just hoping that this new language might have something built-in that I haven't discovered yet.
Could I pipe the "Write-Host $FQFilename $rootElementName" output to something that would behave similarly to the SQL I referred to above?
You can get groups and counts by using Group-Object like this:
$AllFiles | Group-Object RootElementName | Sort-Object Name | Select-Object Name, Count
In your current example, Write-Host doesn't write an object to the pipeline that we can sort or group. Write-Host only prints text to the screen to show the user something, e.g. a script menu.
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force | ForEach-Object {
#The FullName-property contains the absolute path, so there's no need to join the filename and $directoryname
[xml]$xmlDoc = Get-Content $_.FullName
$rootElementName = $xmlDoc.SelectSingleNode("/*").Name
#Outputing an object that we can group and sort
New-Object -TypeName psobject -Property #{
FileName = $_.FullName
RootElementName = $rootElementName
}
}
$grped = $AllFiles | Group-Object RootElementName | Sort-Object Name | Select-Object Name, Count
I'm creating an object with a FileName-property and the RootElementName so you have it if you need to retrieve the filename+rootelement for a list. If not, we could simplify this to:
$DirectoryName = "d:\MyFolder\"
$AllFiles = Get-ChildItem $DirectoryName -Force | ForEach-Object {
#The FullName-property contains the absolute path, so there's no need to join the filename and $directoryname
[xml]$xmlDoc = Get-Content $_.FullName
#Output rootelementname
$xmlDoc.SelectSingleNode("/*").Name
}
$grped = $AllFiles | Group-Object | Sort-Object Name | Select-Object Name, Count
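Either way, $grped can then be displayed or saved however you like (a small usage sketch; the output file name is arbitrary):
# On screen, in the RootName / Count shape from the question
$grped | Format-Table -AutoSize

# Or persisted for later use
$grped | Export-Csv -Path .\rootelements.csv -NoTypeInformation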
