I am looking through C:\Program Files for a jar file named log4j-core-x.y.z.jar. I am trying to match on the last component z, which can be either a one- or two-digit number (0-99). I can't seem to get the right glob pattern to accomplish this.
Code:
PS C:\Users\Administrator> Get-ChildItem -Path 'C:\Program Files\' -Filter log4j-core-*.*.[1-9][0-9].jar -Recurse -ErrorAction SilentlyContinue -Force | %{$_.FullName}
This yields no results, but when I just do all wildcards like, -Filter log4j-core-*.*.*.jar, I get:
C:\Program Files\apache-log4j-2.16.0-bin\apache-log4j-2.16.0-bin\log4j-core-2.16.0-javadoc.jar
C:\Program Files\apache-log4j-2.16.0-bin\apache-log4j-2.16.0-bin\log4j-core-2.16.0-sources.jar
C:\Program Files\apache-log4j-2.16.0-bin\apache-log4j-2.16.0-bin\log4j-core-2.16.0-tests.jar
C:\Program Files\apache-log4j-2.16.0-bin\apache-log4j-2.16.0-bin\log4j-core-2.16.0.jar
The only thing I care about getting is C:\Program Files\apache-log4j-2.16.0-bin\apache-log4j-2.16.0-bin\log4j-core-2.16.0.jar, log4j-core-2.16.0.jar
-Filter doesn't support filtering with regex or Character ranges such as [A-Z] or [0-9]. Thanks mklement0 for pointing it out.
From the parameter description of Get-ChildItem official documentation:
The filter string is passed to the .NET API to enumerate files. The API only supports * and ? wildcards.
Try with this:
Get-ChildItem -Path 'C:\Program Files\' -Filter log4j-core-*.*.??.jar -Recurse -ErrorAction SilentlyContinue -Force |
Where-Object {
$_.Name -match '\.\d{1,2}\.jar$'
# => Ends with a . followed by 1 or 2 digits and the .jar extension
}
Santiago Squarzon's helpful answer offers a regex-assisted solution that has the potential to perform much more sophisticated matching than required in the case at hand.
Let me complement it with a wildcard-based solution that builds on your own attempt:
The -Filter parameter does not support PowerShell's wildcard syntax; it only supports * and ? as wildcard metacharacters (as Santiago notes), not also character-range/set constructs such as [0-9].
Instead, -Filter arguments are interpreted by the platform's file-system APIs, which on Windows additionally have legacy quirks - see this answer.
That said, with patterns that -Filter does support, its use is preferable to -Include (see below), because it performs much better, due to filtering at the source.
By contrast, the -Include parameter does use PowerShell's wildcards and additionally supports multiple patterns.
Unlike regexes, character-range/set expressions in PowerShell's wildcard language do not support duplication (quantifier) logic and match exactly one character each (just like ? does for any single character; * is the only metacharacter that implicitly supports duplication: zero or more characters).
Therefore, [1-9][0-9] matches exactly 2 characters (digits), and also matching just one digit ([0-9]) requires an additional pattern:
Get-ChildItem -Recurse 'C:\Program Files' -Include log4j-core-*.*.[0-9].jar, log4j-core-*.*.[1-9][0-9].jar -ErrorAction SilentlyContinue -Force |
ForEach-Object FullName
Caveats:
Using -Include (or -Exclude) without -Recurse doesn't work as one would expect - see this answer.
As of PowerShell 7.2, combining -Recurse with -Include suffers from performance problems due to inefficient implementation - see GitHub issue #8662.
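To see the difference between the two-pattern wildcard approach and the single regex without touching the file system, here is a minimal sketch that applies both to plain strings with -like and -match. The sample names are hypothetical (modeled on the listing in the question), and -like is assumed here to behave like the wildcard matching that -Include performs.

```powershell
# Hypothetical sample names, modeled on the listing in the question.
$names = @(
    'log4j-core-2.16.0-javadoc.jar'
    'log4j-core-2.16.0-sources.jar'
    'log4j-core-2.16.0.jar'
    'log4j-core-2.17.12.jar'
    'log4j-core-2.17.123.jar'
)

# Wildcards: one pattern per digit count, because [0-9] matches exactly one character.
$patterns = 'log4j-core-*.*.[0-9].jar', 'log4j-core-*.*.[1-9][0-9].jar'
$viaWildcards = $names | Where-Object {
    $name = $_
    # Keep the name if at least one pattern matches it.
    @($patterns | Where-Object { $name -like $_ }).Count -gt 0
}

# Regex: a single pattern, because \d{1,2} can express "one or two digits".
$viaRegex = $names | Where-Object { $_ -match '^log4j-core-\d+\.\d+\.\d{1,2}\.jar$' }

$viaWildcards
$viaRegex
```

Both approaches should keep only the 2.16.0 and 2.17.12 names; the -javadoc/-sources variants and the three-digit version are rejected.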
I use this code to append text to some file names:
gci C:\Users\Documents\SAMPLE\* -in *.mp4,*.mkv -Recurse | % { rename-item -path $_.Fullname -Newname ( $_.basename + ' (TEST)' + $_.extension) }
It works as I want, but it does not work with files whose names contain "[ ]",
and it gives me this error:
Cannot rename because item at 'C:\Users\Documents\SAMPLE\AAAAAAAAA[B].mp4' does not exist.
Debugging tip: when you have code issues and are using the pipeline, rewrite the code not to use the pipeline, break the problem down into steps, and insert debugging aids to help troubleshoot. For example, it could be Write-Host, saving to temp variables, etc.
-LiteralPath
Specify the path with the -LiteralPath parameter:
[PS]> gci -LiteralPath '.\Music\Artist - Name\Album Name [Disc 1]\'
To improve the ability of Windows PowerShell 3.0 to interpret and
correctly handle special characters, the LiteralPath parameter, which
handles special characters in paths, is valid on almost all cmdlets
that have a Path parameter
See the following Q&A for more about square brackets & escaping:
How do I use square brackets in a wildcard pattern in PowerShell Get-ChildItem?
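Applied to the rename loop from the question, the -LiteralPath advice might look like the sketch below. The temporary folder and the sample file are assumptions made only so the example is self-contained; the extension filter is done with Where-Object instead of -Include to keep the sketch independent of how -Include interacts with -LiteralPath.

```powershell
# Hypothetical sandbox: a temp folder with one file whose name contains [ ].
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ('demo_' + [guid]::NewGuid().ToString('N'))
[System.IO.Directory]::CreateDirectory($dir) | Out-Null
[System.IO.File]::WriteAllText((Join-Path $dir 'AAAAAAAAA[B].mp4'), '')

# Collect the files first, so renamed files are not enumerated a second time.
$files = Get-ChildItem -LiteralPath $dir -File -Recurse |
    Where-Object { $_.Extension -in '.mp4', '.mkv' }

foreach ($file in $files) {
    # -LiteralPath takes the path verbatim, so [B] is not parsed as a wildcard set.
    Rename-Item -LiteralPath $file.FullName -NewName ($file.BaseName + ' (TEST)' + $file.Extension)
}

$renamed = Get-ChildItem -LiteralPath $dir -File | Select-Object -ExpandProperty Name
$renamed
```

The only change that matters compared with the original command is -LiteralPath in place of -Path on Rename-Item.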
I have a directory containing hundreds of thousands of PDF files with quite complex names. I need to be able to move SOME (not all files) from the directory they're in to another directory. Here is an example of my .sh script that handles it:
#!/bin/bash
/usr/bin/echo "Moving subset 300-399"
# 300-399
/usr/bin/mv *-*-*-3[0-9][0-9]-*-*-*-*.pdf ../destination_folder/
/usr/bin/echo "Moving subset 450-499"
# 450-499
/usr/bin/mv *-*-*-4[5-9][0-9]-*-*-*-*.pdf ../destination_folder/
/usr/bin/echo "Moving subset 500-599"
# 500-599
/usr/bin/mv *-*-*-5[0-9][0-9]-*-*-*-*.pdf ../destination_folder/
Because there are so many files and I think that mv is performing an evaluation on every single one, it's taking upwards of two hours to perform the work. This is a script that must be run EVERY day, so I need to find a more efficient way to do the work. Is there a more efficient command I can utilize in a Windows environment or a more efficient way I can evaluate each file in order to speed up the mv process?
As mentioned in the comments, PowerShell will probably be faster as it is native to Windows. The difference in speed will depend on the implementation of bash you are using.
For a pure bash solution, you can try:
#!/bin/bash
find /input/folder -regextype posix-extended -regex '.*/([^-/]+-){3}(4[5-9]|[35][0-9])[0-9](-[^-/]+){4}\.pdf' -exec mv -t /destination/folder {} +
Explanation:
find /input/folder -regextype posix-extended -regex :
find every file in your input folder whose full path matches the regex (-regex always matches against the whole path, so the pattern starts with .*/ and needs no ^ or $ anchors)
'.*/([^-/]+-){3}(4[5-9]|[35][0-9])[0-9](-[^-/]+){4}\.pdf'
the pattern matching your files; POSIX extended syntax has no (?:...) non-capturing groups, so plain groups are used. More explanations here
-exec mv -t /destination/folder {} +
execute the mv command on the files found; with +, the {} must come directly before the +, so the destination is passed with -t (a GNU mv option)
the + symbol means the command will be executed in as few calls as possible, each call receiving as many matching files as fit on one command line
It is worth mentioning that the duration of these mv commands depends, of course, on the amount of data: the total size of the pdf files in the current directory.
Please note that the mv command has at least 2 different behaviors with different performance, depending on the location of the ../destination_folder/ directory:
../destination_folder/ and *.pdf files on different file systems: the mv command copies the files and then removes them from the source directory.
../destination_folder/ and *.pdf files on the same file system: only a rename is done, which is super fast.
The df command can be used to show which file system the ../destination_folder/ directory is on.
If you can choose the destination directory, make sure it is located on the same file system as the source: expect a great improvement.
In addition, if the ../destination_folder/ directory is located on a remote server, the duration also depends on the network speed. If this is your situation, then compressing/uncompressing the files while moving should be tested: the performance can be much better.
If you have bash on Windows, you can run each in the background with the & suffix and try to parallelize it to achieve better performance. Use the wait keyword to wait for the background processes to complete. For example:
/usr/bin/echo "Moving subset 300-399"
/usr/bin/mv *-*-*-3[0-9][0-9]-*-*-*-*.pdf ../destination_folder/ & # Run this line in the background
# Other async calls
# Wait for background processes to finish
wait
If you want PowerShell, you can use Start-Job to run these in the background. To use your 300 subset as an example:
Write-Host "Moving subset 300-399"
$mv300jb = Start-Job {
$sourceFiles = Get-ChildItem -File .\*-*-*-3*-*-*-*-*.pdf | Where-Object {
$_.FullName -match '\\(\w+-){3}3[0-9]{2}(-\w+){4}\.pdf$'
}
Move-Item -Path $sourceFiles "..\destination_folder"
}
# Here you would also start other async jobs, assigning $mv400jb, $mv500jb, etc. like above
...
# Wait for job to complete
while( $mv300jb.State -notin 'Completed', 'Failed' ) {
Start-Sleep 30 # Change this to number of seconds to poll job again
}
Honorable mention
A second alternative on Windows would be to use robocopy.exe which copies and moves files more performantly than the standard copy and move commands. The /mt parameter will make use of multi-threading. Unfortunately, I don't have any robocopy examples to share here.
Explaining the regex
Note: I have since learned that you can use basic character ranges with Get-ChildItem and some other PowerShell cmdlets which support globbing. See my edit at the bottom of this answer for more information.
Since asked, here's a breakdown of the .NET regex I used to match on the filename:
\\(\w+-){3}3[0-9]{2}(-\w+){4}\.pdf$
\\: Literal \ character
(\w+-): Looks for group of one or more \w word-characters followed by a -
{3}: Quantifier to match on exactly 3 occurrences of the previous group
3[0-9]: Looks for literal 3 followed by a digit character
{2}: Quantifier applied to the preceding [0-9], so exactly two digit characters must follow the literal 3
(-\w+): Looks for a group consisting of a single - character followed by one or more word-characters \w.
{4}: Quantifier to match exactly 4 occurrences of the previous group
\.pdf: Literal . character followed by pdf
$: End of input/string
At the time of writing I was unaware that character ranges can be used with globbing in Get-ChildItem, so I resorted to using a regular expression to find the exact number of fields matching the specific number pattern in the 4th field, while ensuring the 8-field filename was intact for any found files.
If you plug this expression into https://regexr.com, it will break the expression down and explain everything better visually than I can here, without making this answer too long.
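As a quick check of the breakdown above, the expression can be tried against a few strings with the -match operator. The paths below are hypothetical and only follow the 8-field naming scheme described in the question.

```powershell
$pattern = '\\(\w+-){3}3[0-9]{2}(-\w+){4}\.pdf$'

# 4th field is 345: in the 300-399 subset.
$inSubset = 'C:\docs\aa-bb-cc-345-dd-ee-ff-gg.pdf' -match $pattern

# 4th field is 445: wrong leading digit.
$wrongRange = 'C:\docs\aa-bb-cc-445-dd-ee-ff-gg.pdf' -match $pattern

# Only 7 fields: the {3} and {4} quantifiers cannot both be satisfied.
$tooFewFields = 'C:\docs\aa-bb-345-dd-ee-ff-gg.pdf' -match $pattern

$inSubset, $wrongRange, $tooFewFields
```

Only the first string should match; the other two fail on the leading digit and on the field count respectively.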
EDIT
As I learned the other day, you can use character ranges with PowerShell's file matching, though this doesn't work in other contexts within Windows. In my example above the following line can be modified to match letter and number ranges as well without having to use regex. If you take the following code from above:
$sourceFiles = Get-ChildItem -File .\*-*-*-3*-*-*-*-*.pdf | Where-Object {
$_.FullName -match '\\(\w+-){3}3[0-9]{2}(-\w+){4}\.pdf$'
}
we can use globbing to match on the filename without having to use the Where-Object or regular expression, greatly reducing the complexity of this bit:
$sourceFiles = Get-ChildItem -File .\*-*-*-3[0-9][0-9]*-*-*-*-*.pdf
Here is the modified code for eschewing the regex in favor of globbing:
Write-Host "Moving subset 300-399"
$mv300jb = Start-Job {
$sourceFiles = Get-ChildItem -File .\*-*-*-3[0-9][0-9]*-*-*-*-*.pdf
Move-Item -Path $sourceFiles "..\destination_folder"
}
# Here you would also start other async jobs, assigning $mv400jb, $mv500jb, etc. like above
...
# Wait for job to complete
while( $mv300jb.State -notin 'Completed', 'Failed' ) {
Start-Sleep 30 # Change this to number of seconds to poll job again
}
The availability of this feature seems to hinge on whether a PowerShell construct is performing the globbing (it works) or if it is native to the Win32 API (does not work). In other words, it seems to be supported by PowerShell but not by other Windows APIs.
I have a large list of files (2,554 items), named like so
[mm_dd_yyyy hh_mm_ss] uniquefilenamestring.mp4
When sorting these by name, the folder of course puts all the months together rather than sorting by year. I need to run a PowerShell regex on the filenames but can't work out what I need to do.
Ideally I'd like
[yyyy_mm_dd hh_mm_ss] uniquefilenamestring.mp4
I feel like it's simple enough but I just can't fathom it. Originally the files also had a 9-digit number in front of the square brackets, but I managed to use the below to fix that.
get-childitem *.mp4 | rename-item -newname { [string]($_.name).substring(9) }
If I understand you correctly, you simply want to swap the year with the month/day. With or without the brackets this should do the trick.
get-childitem -filter *.mp4 |
rename-item -NewName {$_.name -replace '(\d{2}_\d{2})_(\d{4})','$2_$1'}
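To preview what the -replace expression does before renaming anything, it can be run on a plain string first. The file name below is a hypothetical example in the format from the question.

```powershell
# Hypothetical file name in the [mm_dd_yyyy hh_mm_ss] format.
$oldName = '[12_31_2021 23_59_59] uniquefilenamestring.mp4'

# $1 captures mm_dd and $2 captures yyyy; the replacement swaps them.
# Single quotes keep PowerShell from expanding $2 and $1 as variables.
$newName = $oldName -replace '(\d{2}_\d{2})_(\d{4})', '$2_$1'

$newName
```

The time part is left alone because it never contains a four-digit group for (\d{4}) to match.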
I am looking for a command in PowerShell for finding and moving files that contain certain string.
I have a folder with thousands of XML files. These XML files have the same structure and each file contains over 1000 lines. So the Select-String command will go through the entire file content, which is unnecessary, because the string I am looking for is present in the first 10 lines of the file.
So I would like to somehow help PowerShell get the result faster. (Recursive searching is needed.)
So, I want to find those files (in folder file_source) and move them to another folder called destination. The search pattern is "<type>\s*A73" (without quotes) and I have used this command:
Get-ChildItem -path ./file_source -recurse | Select-String -list -pattern "<type>\s*A73" | move -dest ./destination
Thanks.
You have not provided any code samples of what you are trying to do. That leaves some things open for interpretation. With that said, you can do something like the following:
$RootDirectoryToCheck = 'some directory path'
$DestinationDirectory = 'some directory path'
$TextToFind = 'some text'
Get-ChildItem -Path $RootDirectoryToCheck -Filter '*.xml' -File -Recurse |
where {(Get-Content $_.FullName -TotalCount 10) -match $TextToFind} |
Move-Item -Destination $DestinationDirectory
Explanation:
Get-ChildItem contains a -Recurse parameter to recursively search starting from -Path. -File ensures the output only contains files.
Get-Content's parameter -TotalCount tells PowerShell to only read the first 10 lines of a file. -match is a regex matching operator that will return True or False if comparing a single string. When comparing a collection of strings, it will return the matched string on successful match or null for an unsuccessful match.
The matched files can then be piped into Move-Item. The -Destination parameter can be used to direct where to move the files.
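The effect of -TotalCount combined with -match can be checked in isolation with a throwaway file. Everything in this sketch (the temp file, its 20 lines, the B99 marker) is made up for illustration; only the <type>\s*A73 pattern comes from the question.

```powershell
# Hypothetical 20-line file: one marker on line 3, another on line 15.
$file = Join-Path ([System.IO.Path]::GetTempPath()) ('head_' + [guid]::NewGuid().ToString('N') + '.xml')
$lines = 1..20 | ForEach-Object { "<line>$_</line>" }
$lines[2]  = '<type>  A73</type>'   # inside the first 10 lines
$lines[14] = '<type>  B99</type>'   # outside the first 10 lines
Set-Content -LiteralPath $file -Value $lines

# Only the first 10 lines are read from disk.
$head = Get-Content -LiteralPath $file -TotalCount 10

# On an array, -match returns the matching lines; [bool] turns that into found / not found.
$foundEarly = [bool]($head -match '<type>\s*A73')
$foundLate  = [bool]($head -match '<type>\s*B99')

Remove-Item -LiteralPath $file
$foundEarly, $foundLate
```

The marker on line 15 is not found because that line is never read, which is what makes this approach cheaper than scanning each file in full.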
I doubt this is faster, compared to reading first 10 lines:
(dir <SourcePath> -Recurse -File | Select-String -Pattern <SearchTerm> -List).Path | Move-Item -Destination <DestinationPath>
But what the heck, since I just spent the time realizing that Select-String can't be made recursive on its own...
In the Windows Command Prompt, special folders are resolved like so:
However, in powershell, these folders do not seem to be resolved:
Consider the string:
$myfile = "%temp%\\myfolder\\myfile.txt"
How can I use this as an argument to PowerShell functions (eg: Remove-Item), and have PowerShell correctly resolve the special folders, as opposed to taking it literally and prepending the current working directory?
Edit:
I am working with strings using standard windows path notation coming from external configuration files, for example:
config.json:
{
"some_file": "%TEMP%\\folder\\file.txt"
}
myscript.ps1:
$config = Get-Content -Raw -Path "config.json" | ConvertFrom-Json
Remove-Item -path $config.some_file -Force
Note: as any of the Windows special folders can appear in these strings, I'd rather avoid horrible find-replace hacks like this
$config.some_file = $config.some_file -replace '%TEMP%', $env:temp
You can expand it to a full path using:
[System.Environment]::ExpandEnvironmentVariables("%TEMP%\\myfolder\\myfile.txt")
c:\users\username\AppData\Local\Temp\\myfolder\\myfile.txt
Double-backslash \\ isn't a PowerShell thing either, \ is not a special character in a PowerShell string - but double backslashes in a path do seem to work.
Documentation: https://msdn.microsoft.com/en-us/library/system.environment.expandenvironmentvariables.aspx
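Tied back to the config.json scenario from the question, a minimal sketch could look like this. DEMO_ROOT is a hypothetical variable set only so the result is predictable on any machine; with a real config, %TEMP% or any other defined variable would be expanded the same way.

```powershell
# Hypothetical environment variable, defined only for this demonstration.
$env:DEMO_ROOT = 'C:\demo'

# Same shape as config.json in the question; \\ is JSON escaping for a single backslash.
$json = '{ "some_file": "%DEMO_ROOT%\\folder\\file.txt" }'
$config = $json | ConvertFrom-Json

# Every %NAME% token whose variable exists is replaced; unknown tokens are left as-is.
$expanded = [System.Environment]::ExpandEnvironmentVariables($config.some_file)

$expanded
```

The expanded string can then be passed to Remove-Item or any other cmdlet; using -LiteralPath there avoids wildcard interpretation of the result.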
If you don't mind some performance issues
$resolvedPathInABitHackyWay = (cmd /c echo "%TEMP%\\folder\\file.txt")
This will actually give you %TEMP% resolved by cmd itself.
You can grab all env variables from the env:\ drive and use that to construct a succinct regex pattern for your find-replace operation, then use the Regex.Replace() method with a match evaluator:
$envNames = Get-ChildItem env:\ |ForEach-Object {[regex]::Escape($_.Name)}
$find = "%($($envNames -join '|'))%"
[regex]::Replace($config.some_file, $find, {param([System.Text.RegularExpressions.Match]$found) return (Get-Item "env:\$($found.Groups[1].Value)").Value},'IgnoreCase')