Script to compare the contents of two different folders and rename files based on minimum similarity - Windows

Story:
I have multiple folders with 1000+ files in each. The files are named similarly across folders and relate to the same content, but the names differ slightly.
For example, in one folder I have files named quite simply "Jobs to do.doc" and in another folder "Jobs to do (UK) (Europe).doc" etc.
This is on Windows 10, not Linux.
Question:
Is there a script to compare each folder's contents and rename the files based on minimum similarity? The end result would be to remove all the jargon so that the matching files in each folder (there are several) share the same name, but STILL remain in their respective folders?
*Basically, compare the contents of multiple folders against one folder's contents and rename the files so that each matching file in every folder is named the same?
Example:
D:/Folder1/Name_Of_File1.jpeg
D:/Folder2/Name_Of_File1 (Europe).jpeg
D:/Folder3/Name_of_File1_(Random).jpeg
D:/folder1/another_file.doc
D:/Folder2/another_file_(date_month_year).txt
D:/Folder3/another_file(UK).XML
I have used different file extensions in the above example in the hope that someone can write a script that ignores file extensions.
I hope this makes sense. So either a script that removes the content in brackets while keeping each file's integrity, or one that renames ALL files across all folders based on minimum similarity.
The problem is that there are 1000+ files in each folder, so I want to run this as an automated job.
Thanks in advance.

If the stuff you want to get rid of is always in brackets, then you could write a regex like
(.*?)([\s_]*\(.*\))
Try something like this:
$folder = Get-ChildItem 'C:\TestFolder'
$regex = '(.*?)([\s_]*\(.*\))'
foreach ($file in $folder) {
    if ($file.BaseName -match $regex) {
        Rename-Item -Path $file.FullName -NewName "$($matches[1])$($file.Extension)" -Verbose #-WhatIf
    }
}
Regarding consistency, you could run a precheck using the same regex:
# change each filename if it matches the regex and store only its new basename
$folder1 = Get-ChildItem 'D:\T1' | ForEach-Object { if ($_.BaseName -match $regex) { $matches[1] } else { $_.BaseName } }
$folder2 = Get-ChildItem 'D:\T2' | ForEach-Object { if ($_.BaseName -match $regex) { $matches[1] } else { $_.BaseName } }
# compare basenames in the two folders - if all are the same, nothing will be returned
Compare-Object $folder1 $folder2
Maybe you could build with that idea.
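If you need the precheck across more than two folders, a rough sketch along the same lines (untested; the folder paths below are just the placeholders from your example) would be to treat one folder as the reference and compare every other folder against it:
# compare every additional folder against a reference folder
$regex = '(.*?)([\s_]*\(.*\))'
$reference = Get-ChildItem 'D:\Folder1' -File | ForEach-Object { if ($_.BaseName -match $regex) { $matches[1] } else { $_.BaseName } }
foreach ($path in 'D:\Folder2', 'D:\Folder3') {
    $names = Get-ChildItem $path -File | ForEach-Object { if ($_.BaseName -match $regex) { $matches[1] } else { $_.BaseName } }
    # anything returned here has no name match in the reference folder (or vice versa)
    "Differences between D:\Folder1 and $path"
    Compare-Object $reference $names
}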

Related

How can I convert part of a filename to become the file extension?

I downloaded a backup folder of about 3,000 files from our email service provider. None of the files have an associated filetype; instead the file extension was appended to the name of each individual file. For example:
community-involvement-photo-1-jpg
social-responsibility-31-2012-png
report-02-12-15-pdf
I can manually change the last dash to a period and the files work just fine. I'm wondering if there is a way to batch convert all of the files so they can be sorted and organized properly. I know in the Command Line I can do something like ren *. *.jpg but there are several different file types contained in the folder, so it wouldn't work for all of them. Is there any way I can tell it to convert the last "-" in each file name into a "." ?
I'm on Windows 10; unable to install any filename conversion programs unless I want to go through weeks of trouble with the IT group.
# note: this assumes every extension is exactly three characters long (e.g. jpg, png, pdf)
$Ordner = "c:\temp\pseudodaten"
$Liste = (Get-ChildItem -Path $Ordner).Name
cd $Ordner
foreach ($Datei in $Liste) {
    $Length = $Datei.Length
    # drop the trailing "-xxx" and re-append it as ".xxx"
    $NeuerName = $Datei.Substring(0, $Length - 4) + "." + $Datei.Substring($Length - 3, 3)
    Rename-Item -Path $Datei -NewName $NeuerName
}
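The snippet above assumes every extension is exactly three characters long. If some extensions are longer (e.g. -docx or -html), a more general sketch is to replace only the last "-" in each name with a "." (the folder path is just the example one, and -WhatIf is left in so you can preview the renames first):
# replace the final "-" in each file name with "." regardless of extension length
Get-ChildItem -Path "c:\temp\pseudodaten" -File | ForEach-Object {
    # greedy (.*) stops at the last hyphen, ([^-]+) captures the would-be extension
    $NeuerName = $_.Name -replace '^(.*)-([^-]+)$', '$1.$2'
    if ($NeuerName -ne $_.Name) {
        Rename-Item -Path $_.FullName -NewName $NeuerName -WhatIf   # remove -WhatIf to actually rename
    }
}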

Show all files in folder and subfolders and list names and encoding

I want to see all the files in a folder and its subfolders and list their encoding.
I know that you can use git ls-files to see the files and file * to get the name plus its encoding.
But I need help with how to do both at the same time.
The reason is that we have a problem with encodings and need to see which files are encoded in which way. So I guess a PS script would work fine as well.
I think the best way to solve this with PowerShell is to first get your files with the following script:
$folder = Get-ChildItem -Path "YourPath" -Recurse -File
and then, inside a foreach ($file in $folder) loop, use one of the following scripts to get the encoding (which is straightforward):
https://www.powershellgallery.com/packages/PSTemplatizer/1.0.20/Content/Functions%5CGet-FileEncoding.ps1
https://vertigion.com/2015/02/04/powershell-get-fileencoding/
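Putting the two together, a minimal sketch could look like this (it assumes you have dot-sourced one of the Get-FileEncoding implementations linked above, so that function name and the folder path are placeholders):
# list every file under a folder together with its detected encoding
# assumes a Get-FileEncoding function from one of the links above has been loaded
Get-ChildItem -Path "C:\YourRepo" -Recurse -File |
    Select-Object FullName, @{ Name = 'Encoding'; Expression = { Get-FileEncoding $_.FullName } }
You can pipe that to Export-Csv if you want a report file instead of console output.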

Powershell: Place all files in a specified directory into separate, uniquely named subdirectories

I have an existing directory, let's say "C:\Users\Test", that contains files (with various extensions) and subdirectories. I'm trying to write a PowerShell script that will put each file in "C:\Users\Test" into a uniquely named subdirectory, such as "\001", "\002", etc., while ignoring any existing subdirectories and the files therein. Example:
Before running script:
C:\Users\Test\ABC.xlsx
C:\Users\Test\QRS.pdf
C:\Users\Test\XYZ.docx
C:\Users\Test\Folder1\TUV.gif
After running script:
C:\Users\Test\001\ABC.xlsx
C:\Users\Test\002\QRS.pdf
C:\Users\Test\003\XYZ.docx
C:\Users\Test\Folder1\TUV.gif
Note:
Names, extensions, and number of files will vary each time the script is run on a batch of files. The order in which files are placed into the new numbered subdirectories is not important, just that each subdirectory has a short, unique name. I have another script that will apply a consistent sequential naming convention for all subdirectories, but first I need to get all files into separate folders while maintaining their native file names.
This is where I'm at so far:
$id = 1
Get-ChildItem | where {!$_.PsIsContainer} | % {
    MD ($_.root + ($id++).tostring('000'));
    MV $_ -Destination ($_.root + (001+n))
}
The MD expression successfully creates the subdirectories, but I'm not sure how to write the MV expression to actually move the files into them. I've written (001+n) to illustrate the concept I'm going for, where n would increment from 0 to the total number of files. Or perhaps an entirely different approach is needed.
$id = 1
Get-ChildItem C:\Test\ -File | % {
    New-Item -ItemType Directory -Path C:\Test -Name $id.ToString('000')
    Move-Item $_.FullName -Destination C:\Test\$($id.ToString('000'))
    $id++
}
Move-Item is what you were looking for.
OK, I think I figured it out. Running the following scripts sequentially produces the desired result. The trick was resetting the $id increment when running the MV expression. This can probably be improved though, so let me know if you have a better way! Edited: @ArcSet has provided a better answer in a single script! Thank you!
$id = 1
Get-ChildItem | where {!$_.PsIsContainer} | % {
    MD ($_.root + ($id++).tostring('000'))
}
$id = 1
Get-ChildItem | where {!$_.PsIsContainer} | % {
    MV $_ -Destination ($_.root + ($id++).tostring('000'))
}

Batch file to compress subdirectories individually with Windows native tools

I've seen variations of this question answered, but typically using something like 7-Zip. I'm trying to find a solution that will work with the capabilities that come with Windows, absent any additional tools.
I have a directory that contains several hundred subdirectories. I need to compress each subdirectory individually, so I'll wind up with several hundred zip files, one per subdirectory. This is on a machine at work where I don't have administrative privileges to install new software, hence the desire to stay away from 7-Zip, WinRAR, etc.
If this has already been answered elsewhere, my apologies...
Never tried that myself, but there is Compress-Archive:
The Compress-Archive cmdlet creates a zipped (or compressed) archive file from one or more specified files or folders. An archive file allows multiple files to be packaged, and optionally compressed, into a single zipped file for easier distribution and storage. An archive file can be compressed by using the compression algorithm specified by the CompressionLevel parameter.
Because Compress-Archive relies upon the Microsoft .NET Framework API System.IO.Compression.ZipArchive to compress files, the maximum file size that you can compress by using Compress-Archive is currently 2 GB. This is a limitation of the underlying API.
Here's a sample script I just hacked together:
# configure as needed
$source = "c:\temp"
$target = "d:\temp\test"
# grab source file names and list them
$files = gci $source -recurse
$files
# target exists?
if (-not (test-path $target)) {
    new-item $target -type directory
}
# compress, I am using -force here to overwrite existing files
$files | foreach {
    $dest = "$target\" + $_.name + ".zip"
    compress-archive $_ $dest -CompressionLevel Optimal -force
}
# list target dir contents
gci $target -recurse
You may have to improve it a bit when it comes to subfolders. In the above version, subfolders are compressed as a whole into a single file. This might not exactly be what you want.
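If you only want one zip per immediate subdirectory, a small tweak (a sketch reusing the $source and $target variables from above) is to enumerate directories only, instead of everything recursively:
# zip each top-level subdirectory of $source into its own archive under $target
Get-ChildItem $source -Directory | foreach {
    $dest = Join-Path $target ($_.Name + ".zip")
    compress-archive $_.FullName $dest -CompressionLevel Optimal -force
}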
Get-ChildItem c:\path\of\your\folder | ForEach-Object {
    $path = $_.FullName
    Compress-Archive -Path $path -DestinationPath "$path.zip"
}
I put this together as a quick snippet. Don't hesitate to comment if it does not fit your request.
In a folder X, there are subfolders Y1, Y2...
Y1.zip, Y2.zip... will be created.
Using PowerShell, go to the path that you would like to compress and run:
$folderlist = Get-ChildItem "."
foreach ($Folder in $folderlist) { Compress-Archive -path $Folder.Name -destinationPath "$($Folder.Name).zip"}

Windows batch script that moves files based on a partial character string, looking it up in a CSV/txt file

What I'm looking for might be a variation of this solution: Windows batch file to sort files into separate directories based on types specified in a csv
My situation: a batch process on a server creates files that look like this: S0028513-010716-0932.txt. S stands for summary, the first five digits stand for a supplier, and the last two before the hyphen stand for the Distribution Center. After the first hyphen there is the date, and after the second hyphen the timestamp.
What I need to do is:
set a variable for the month/year (e.g. 0716) (this has been set with "set /P c=Please enter MMYY:"). This part is done.
create a folder with subfolders (e.g. 0716\PHARMA, 0716\MEDICAL, etc). I've done this part.
look up the supplier number in a CSV file (e.g. S00285 above) and
move the file to the corresponding folder based on MMYY\PHARMA, etc.
Points 3 and 4 are obviously missing. A practical example: there are three folders where the files can be moved: PHARMA, MEDICAL and CONSUMER.
The CSV file looks like this:
S00285 CONSUMER
S00286 PHARMA
S00287 MEDICAL
...
What I want the script to do is to look up the month/year combination in variable c and take all files that correspond to this month/year and move them to the three folders according to the assignment in the CSV file.
Can this be done with standard Windows scripting? Sorry guys, I'm a novice as you can tell. I have only some very basic knowledge of BASH scripting.
Thank you a lot for any advice.
BR
Marcio
This can fairly easily be accomplished with PowerShell
$FolderRoot = "E:\Target\Directory"
Set-Location $FolderRoot
# 1. Have user input month/year string
do {
    $MMYY = $(Read-Host 'Please enter MMYY').Trim()
} until ($MMYY -match '\d{4}')
# 2. Create directory
mkdir $MMYY
# ?. Gather input files for that year
$Files = Get-ChildItem -Filter S*.txt | Where-Object {$_.BaseName -match "S\d{7}-\d{2}$MMYY-\d{4}"}
# ?. Load CSV file into a hash table to easily look up supplier numbers
$SupplierLookupTable = @{}
# Assuming the csv has headers: Supplier,Industry
Import-Csv -Path E:\path\to\suppliers.csv | ForEach-Object {
    $SupplierLookupTable[$_.Supplier] = $_.Industry
}
foreach ($File in $Files)
{
    # Grab the S and first 5 digits from the file name
    $Supplier = $File.BaseName.Substring(0,6)
    # 3. Look up the industry
    $Industry = $SupplierLookupTable[$Supplier]
    $Destination = Join-Path $MMYY $Industry
    # Create folder if it doesn't already exist
    if (-not (Test-Path $Destination))
    {
        mkdir $Destination
    }
    # 4. Move the file
    Move-Item $File.FullName -Destination $Destination
}
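One thing to double-check: the lookup file shown in the question is space-separated and has no header row, so in that case the import line would need an explicit delimiter and header names (a sketch, keeping the same hypothetical path as above):
# space-separated file without headers, e.g. "S00285 CONSUMER"
Import-Csv -Path E:\path\to\suppliers.csv -Delimiter ' ' -Header Supplier, Industry | ForEach-Object {
    $SupplierLookupTable[$_.Supplier] = $_.Industry
}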
