I have multiple machines uploading files to one FTP directory. The first part of the filename is the machine, the rest is a timestamp, e.g. AAAAA_20130312_125113.
Now I want to get a sorted list of all unique machines that have uploaded to this directory.
I managed to write the list of every filename's Substring(0,5) to the host, but I still don't have the unique machine names.
$files=Get-ChildItem $strMOVETO -Name -Include TAS*.csv -Recurse
ForEach ($i in $files) { Write-Host $i.Substring(0,5) }
Any hints on how to do this? It does not necessarily have to be a one-liner, although that would be a nice challenge ;-).
Thanks!
What happens when you have an 8-character machine name? Your substring will break. Since the machine name, date & time are delimited by an _, split on that & get the first item.
Get-ChildItem $strMOVETO -recurse -name -include TAS*.csv|%{$_.split("_")[0]}|sort-object -unique
To filter on date as well:
Get-ChildItem $strMOVETO -recurse -include TAS*.csv|where-object{$_.lastwritetime -ge (get-date).adddays(-1)}|%{$_.basename.split("_")[0]}|sort-object -unique
Not tested but something like this:
Get-ChildItem $strMOVETO -Name -Include TAS*.csv -Recurse | % { $_.Substring(0,5) } | Sort -Unique
You don't need the Write-Host inside the loop, and it's easier to use % instead of a foreach loop. Note that with -Name each pipeline item is already a string, so you call Substring on it directly.
Pipe the results of your command into a | sort -unique. Note that Write-Host writes to the console rather than the pipeline, so emit the strings directly:
$files = Get-ChildItem $strMOVETO -Name -Include TAS*.csv -Recurse
$files | ForEach-Object { $_.Substring(0,5) } | sort -unique
...but better still would be to simplify the script...
$filter = "TAS*.csv"
Get-ChildItem -Path $strMOVETO -Filter $filter -Recurse | % {$_.BaseName.Substring(0,5) } | sort -unique
I am having a helluva time trying to understand why this script is not working as intended. It is a simple script in which I am attempting to import a CSV, select a few columns that I want, then export the CSV over itself. (Basically we have archived data that I only need a few columns from for another project, due to memory size constraints.) This script is very simple, which apparently has an inverse relationship with how much frustration it causes when it doesn't work... Right now the end result is an empty csv instead of a csv containing only the columns I selected with Select-Object.
$RootPath = "D:\SomeFolder"
$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv |
    ForEach-Object {
        Import-CSV $_ |
            Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, Data_Avg_ms, Data_StdDev |
            Export-Csv $_.FullName -NoType -Force
    }
Unless you read the input file into memory in full, up front, you cannot safely read from and write back to the same file in a given pipeline.
Specifically, a command such as Import-Csv file.csv | ... | Export-Csv file.csv will erase the content of file.csv.
The simplest solution is to enclose the command that reads the input file in (...), but note that:
The file's content (transformed into objects) must fit into memory as a whole.
There is a slight risk of data loss if the pipeline is interrupted before all (transformed) objects have been written back to the file.
Applied to your command:
$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv -OutVariable csvFiles |
    ForEach-Object {
        (Import-CSV $_.FullName) | # NOTE THE (...)
            Select-Object Test_Name, Test_DataName, Device_Model, Device_FW,
                          Data_Avg_ms, Data_StdDev |
            Export-Csv $_.FullName -NoType -Force
    }
Note that I've used -OutVariable csvFiles in order to collect the CSV file-info objects in output variable $csvFiles. Your attempt to collect the file paths via $csvFilePaths = ... doesn't work, because it attempts to collect Export-Csv's output, but Export-Csv produces no output.
Also, to be safe, I've changed the Import-Csv argument from $_ to $_.FullName to ensure that Import-Csv finds the input file (because, regrettably, file-info object $_ is bound as a string, which sometimes expands to the mere file name).
A safer solution would be to output to a temporary file first, and (only) on successful completion replace the original file.
With either approach, the replacement file will have default file attributes and permissions; if the original file had special attributes and/or permissions that you want to preserve, you must recreate them explicitly.
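A minimal sketch of that temp-file approach (my illustration, not part of the original answer; the column names are reused from the question):
$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv | ForEach-Object {
    $tempFile = "$($_.FullName).tmp"  # write to a sibling temp file first
    Import-Csv $_.FullName |
        Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, Data_Avg_ms, Data_StdDev |
        Export-Csv $tempFile -NoTypeInformation
    if ($?) {
        # Replace the original only after the export completed successfully.
        Move-Item -LiteralPath $tempFile -Destination $_.FullName -Force
    }
}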
As Matt commented, by the time Export-Csv runs, $PSItem ($_) no longer refers to the Get-ChildItem result but to the output of Select-Object, which doesn't have a FullName property.
You can use a different foreach approach:
$RootPath = "D:\SomeFolder"
$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv
foreach ($csv in $csvFilePaths)
{
    Import-CSV $csv.FullName |
        Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev |
        Export-Csv $csv.FullName -NoType -Force
}
Or keeping your code, add $CsvPath Variable containing the csv path and use it later on:
$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv | ForEach-Object {
    $CsvPath = $_.FullName
    Import-CSV $CsvPath |
        Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev |
        Export-Csv $CsvPath -NoType -Force
}
So I have figured it out. In the original code I was piping Import-Csv directly into the rest of the pipeline; instead, I declared a variable whose value is the output of the Import-Csv cmdlet and piped that variable through the Select-Object and Export-Csv cmdlets. Here is the code snippet that gets what I wanted to get done, done. Thank you all for your assistance, I appreciate it!
$RootPath = "\someDirectory\"
$CsvFilePaths = @(Get-ChildItem $RootPath -Recurse -Include *.csv)
$ColumnsWanted = @('Test_Name','Test_DataName','Device_Model','Device_FW','Data_Avg_ms','Data_StdDev')
for ($i = 0; $i -lt $CsvFilePaths.Length; $i++) {
    $csvPath = $CsvFilePaths[$i]
    Write-Host $csvPath
    $importedCsv = Import-CSV $csvPath
    $importedCsv | Select-Object $ColumnsWanted | Export-CSV $csvPath -NoTypeInformation
}
For PowerShell 2.0 in Win 2008,
I need to check what's the newest file in a directory with about 1.6 million files.
I know I can use Get-ChildItem like so:
$path="G:\Calls"
$filter='*.wav'
$lastFile = Get-ChildItem -Recurse -Path $path -Include $filter | Sort-Object -Property LastWriteTime | Select-Object -Last 1
$lastFile.Name
$lastFile.LastWriteTime
The issue is that it takes sooooo long to find the newest file due to the sheer amount of files.
Is there a faster way to find that?
Sort-Object is slow here because it has to collect all the items in memory before it can sort them.
But you don't need to sort at all: just go over each file and keep track of the newest one:
Get-ChildItem -Recurse |ForEach-Object `
-Begin { $Newest = $Null } `
-Process { if ($_.LastWriteTime -gt $Newest.LastWriteTime) { $Newest = $_ } } `
-End { $Newest }
There are a couple of things that can be done to improve performance.
First, use -Filter rather than -Include, because the filter is passed to the underlying Win32 API, which is a bit faster.
Also, because the script gathers all the files and then sorts them, you might be creating a very large memory footprint during the sorting phase. I don't know if it's possible to query the MFT or some other process which avoids retrieving each file and inspecting the lastwritetime, but an alternative approach could be:
gci -rec -file -filter *.wav | %{$v = $null}{if ($_.lastwritetime -gt $v.lastwritetime){$v=$_}}{$v}
I tried this with all files and saw the following:
measure-command{ ls -rec -file |sort lastwritetime|select -last 1}
. . .
TotalSeconds : 142.1333641
vs
measure-command { gci -rec -file | %{$v = $null}{if ($_.lastwritetime -gt $v.lastwritetime){$v=$_}}{$v} }
. . .
TotalSeconds : 87.7215093
which is a pretty good savings. There may be additional ways to improve performance.
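One further option worth benchmarking (my suggestion, untested here; [System.IO.Directory]::EnumerateFiles requires .NET 4, which a PowerShell 2.0 host may not have): stream the file paths through .NET directly instead of materializing FileInfo objects:
$path = 'G:\Calls'
$newest = $null
$newestTime = [datetime]::MinValue
foreach ($f in [System.IO.Directory]::EnumerateFiles($path, '*.wav', [System.IO.SearchOption]::AllDirectories)) {
    $t = [System.IO.File]::GetLastWriteTime($f)  # one timestamp lookup per file
    if ($t -gt $newestTime) { $newestTime = $t; $newest = $f }
}
"$newest ($newestTime)"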
The below code searches 400+ numbers from a list.txt file to see if it exists within any files within the folder path specified.
The script is very slow and has yet to complete as it did not complete after 25 minutes of running. The folder we are searching is 507 MB (532,369,408 bytes) and it contains 1,119 Files & 480 Folders. Any help to improve the speed of the search and the efficiency is greatly appreciated.
$searchWords = (gc 'C:\temp\list.txt') -split ','
$results = @()
foreach ($sw in $searchWords)
{
    $files = gci -Path 'C:\Users\david.craven\Dropbox\Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*' -Filter "*$sw*" -Recurse
    foreach ($file in $files)
    {
        $object = New-Object System.Object
        $object | Add-Member -Type NoteProperty -Name SearchWord -Value $sw
        $object | Add-Member -Type NoteProperty -Name FoundFile -Value $file.FullName
        $results += $object
    }
}
$results | Export-Csv C:\temp\output.csv -NoTypeInformation
The following should speed up your task substantially:
If the intent is truly to look for the search words in the file names:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse -PipelineVariable file |
    Select-Object -ExpandProperty Name |
    Select-String -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e={$_.Pattern}},
                  @{n='FoundFile'; e={$file.FullName}} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
If the intent is to look for the search words in the files' contents:
$searchWords = (Get-Content 'C:\temp\list.txt') -split ','
$path = 'C:\Users\david.craven\Dropbox\Facebook Asset Tagging\_SJC Warehouse_\_Project Completed_\2018\A*'
Get-ChildItem -File -Path $path -Recurse |
    Select-String -List -SimpleMatch -Pattern $searchWords |
    Select-Object @{n='SearchWord'; e={$_.Pattern}},
                  @{n='FoundFile'; e={$_.Path}} |
    Export-Csv C:\temp\output.csv -NoTypeInformation
The keys to performance improvement:
Perform the search with a single command, by passing all search words to Select-String. Note: -List limits matching to 1 match (by any of the given patterns).
Instead of constructing custom objects in a script block with New-Object and Add-Member, let Select-Object construct the objects for you directly in the pipeline, using calculated properties.
Instead of building an intermediate array iteratively with += (which behind the scenes recreates the array every time), use a single pipeline to pipe the result objects directly to Export-Csv.
So there are definitely some basic things in the PowerShell code you posted that can be improved, but it may still not be super fast. Based on the sample you gave us I'll assume you're looking to match the file names against a list of words. You're looping through the list of words (400 iterations) and in each loop you're looping through all 1,119 files. That's a total of 447,600 iterations!
Assuming you can't reduce the number of iterations in the loop, let's start by making each iteration faster. The Add-Member cmdlet is going to be really slow, so switch that approach up by casting a hashtable to the [PSCustomObject] type accelerator:
[PSCustomObject]@{
    SearchWord = $Word
    File       = $File.FullName
}
Also, there is no reason to pre-create an array object and then add each file to it. You can simply capture the output of the foreach loop in a variable:
$Results = Foreach ($Word in $Words)
{
...
So a faster loop might look like this:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$Results = Foreach ($Word in $Words)
{
    foreach ($File in $Files)
    {
        if ($File.BaseName -match $Word)
        {
            [PSCustomObject]@{
                SearchWord = $Word
                File       = $File.FullName
            }
        }
    }
}
A simpler approach might be to use Where-Object on the files array:
$Results = Foreach ($Word in $Words)
{
    $Files | Where-Object BaseName -match $Word
}
Try both and test out the performance.
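To compare them, wrap each version in Measure-Command (a sketch of my own; $Words and $Files as defined above):
$loopTime = Measure-Command {
    $Results = foreach ($Word in $Words) {
        foreach ($File in $Files) {
            if ($File.BaseName -match $Word) {
                [PSCustomObject]@{ SearchWord = $Word; File = $File.FullName }
            }
        }
    }
}
$whereTime = Measure-Command {
    $Results = foreach ($Word in $Words) { $Files | Where-Object BaseName -match $Word }
}
"nested loop: {0:N1}s  Where-Object: {1:N1}s" -f $loopTime.TotalSeconds, $whereTime.TotalSeconds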
So if speeding up the loop doesn't meet your needs, try removing the loop entirely. You could use regex and join all the words together:
$Words = Get-Content -Path $WordList
$Files = Get-ChildItem -Path $Path -Recurse -File
$WordRegex = $Words -join '|'
$Files | Where basename -match $WordRegex
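One caveat I'd add: if any of the words contain regex metacharacters (such as . or +), escape each word first so it matches literally:
$WordRegex = ($Words | ForEach-Object { [regex]::Escape($_) }) -join '|'
$Files | Where-Object BaseName -match $WordRegex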
I have a series of folders and subfolders, structured in this way:
001/Fabric/Blue/ (.jpg files, sequentially named)
001/Fabric/Green/ (.jpg files, sequentially named)
002/Fabric/Blue/ (.jpg files, sequentially named)
002/Fabric/Green/ (.jpg files, sequentially named)
etc.
The file names have excess string characters that I would like to remove, and I would like to convert their file names into an easier sequential format (0.jpg, 1.jpg, etc.).
I tried working with a few different PowerShell examples to get this to work. I have the recursive searching functionality working, however I receive an error about an InvalidOperationException when trying to rename the files in the ForEach-Object loop. Additionally, I am afraid my sequential numbering is not being 'reset' for each of the folders where it renames files.
$i = 0
Get-ChildItem -Filter "*.jpg" -Recurse | ForEach-Object {
Rename-Item $_ -NewName ('$i.jpg' -f $i++)
}
So, two questions:
How can I fix the error with Rename-Item?
How can I ensure my variable is reset for each subfolder the script starts renaming files in?
If you take a two-step approach, first getting all the folders containing jpgs and then iterating through this list, you have no problem restarting the counter at 1 in each folder. But I'd always use leading zeroes for such a renumbering.
$BaseFld = "Q:\Test\"
$Ext = "*.jpg"
$jpgFolders = gci $($BaseFld + $Ext) -Recurse |
    Select -ExpandProperty Directory -Unique |
    Select -ExpandProperty FullName | Sort
ForEach ($Folder in $jpgFolders) {
    Set-Location $Folder
    $i = 1
    Get-ChildItem $Ext | % { Ren $_ -NewName ('{0:D4}.jpg' -f $i++) -WhatIf }
}
If the output suits you, remove the -WhatIf in the second-to-last line.
Another method:
$rootdir="C:\temp"
gci $rootdir -Recurse -Directory | %{ $i = 1; gci $_.FullName -File -Filter "*.jpg" | %{ Ren $_.FullName -NewName ('{0}.jpg' -f $i++) } }
Does anybody know a PowerShell 2.0 command/script to count all folders and subfolders (recursive; no files) in a specific folder (e.g. the number of all subfolders in C:\folder1\folder2)?
In addition, I also need the number of all "leaf" folders, in other words, I only want to count folders which don't have subfolders.
In PowerShell 3.0 you can use the Directory switch:
(Get-ChildItem -Path <path> -Directory -Recurse -Force).Count
You can use get-childitem -recurse to get all the files and folders in the current folder.
Pipe that into Where-Object to filter it to only those files that are containers.
$files = get-childitem -Path c:\temp -recurse
$folders = $files | where-object { $_.PSIsContainer }
Write-Host $folders.Count
As a one-liner:
(get-childitem -Path c:\temp -recurse | where-object { $_.PSIsContainer }).Count
To answer the second part of your question, getting the leaf-folder count, just modify the Where-Object clause to add a non-recursive search of each directory, keeping only those that return a count of 0:
(dir -rec | where-object{$_.PSIsContainer -and ((dir $_.fullname | where-object{$_.PSIsContainer}).count -eq 0)}).Count
It looks a little cleaner if you can use PowerShell 3.0:
(dir -rec -directory | where-object{(dir $_.fullname -directory).count -eq 0}).count
Another option, summing up the PSIsContainer flags (note: script-block properties for Measure-Object require a newer PowerShell version than 2.0):
(ls -Force -Recurse | measure -Sum { [int]$_.PSIsContainer }).Sum
This is a pretty good starting point:
(gci -force -recurse | where-object { $_.PSIsContainer }).Count
However, I suspect that this will include .zip files in the count. I'll test that and try to post an update...
EDIT: Have confirmed that zip files are not counted as containers. The above should be fine!
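If you want to double-check that on your own machine (hypothetical path), inspect PSIsContainer on a zip file directly:
(Get-Item C:\temp\archive.zip).PSIsContainer   # False: the filesystem provider treats zips as plain files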
Get the path's child items with the -Recurse option, pipe them to a filter that keeps only containers, then pipe again to measure the item count:
((get-childitem -Path $the_path -recurse | where-object { $_.PSIsContainer }) | measure).Count