In one folder I have 10000 files, and the file names are structured like this:
File1_0, File1_1, File1_2, File1_3
File2_0, File2_1, File2_2
File3_0, File3_1, File3_2, File3_3
...
File1000_0, File1000_1
I would like to delete the previous revisions of each file and keep only the latest one. The files above should become:
File1_3
File2_2
File3_3
....
File1000_1
I am trying the following: put each name without the _xx suffix into an array, then remove the duplicates. But I don't think this is correct, and for the moment I am stuck on the logic of how I should do this.
This is the code:
Option Explicit
Dim fso, folder, sourcefolder, file
Dim b : b = Array()
Dim i
Dim x
Dim z
Dim y
sourcefolder = "C:\test"
Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder(sourcefolder)
i = 0
For Each file In folder.Files
i = i + 1
x = fso.GetBaseName(file)
ReDim Preserve b(UBound(b) + 1) : b(UBound(b)) = Left(x, Len(x)-2)
y = y & b(i-1) & "#"
Next
z = RemoveDuplicates(y)
For i=0 To UBound(z)-1
WScript.Echo i+1 & " " & z(i)
Next
Function RemoveDuplicates(str)
Dim d
Dim elem
If Trim(str) = "" Then
RemoveDuplicates = Array()
Exit Function
End If
Set d = CreateObject("Scripting.Dictionary")
d.CompareMode = vbTextCompare 'make dictionary case-insensitive
For Each elem In Split(str, "#")
d(elem) = True
Next
RemoveDuplicates = d.Keys
End Function
One way to approach your problem is to build a dictionary that maps the basenames of the files to the highest revision number:
Set d = CreateObject("Scripting.Dictionary")
d.CompareMode = vbTextCompare
For Each f In fso.GetFolder(sourcefolder).Files
basename = fso.GetBaseName(f.Name)
a = Split(basename, "_")
revision = Int(a(UBound(a)))
prefix = a(0)
if d(prefix) < revision Then
d(prefix) = revision
End If
Next
Then run a second loop to remove all files whose basename doesn't have that revision:
For Each f In fso.GetFolder(sourcefolder).Files
basename = fso.GetBaseName(f.Name)
a = Split(basename, "_")
revision = Int(a(UBound(a)))
prefix = a(0)
If d.Exists(prefix) And revision < d(prefix) Then
f.Delete
End If
Next
Note that this code assumes that the underscore separating prefix and revision is the only one in the basename. If you have filenames containing more than one underscore (like foo_bar_1.txt) you'll need to adjust the extraction of prefix and revision to take care of that.
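For names with more than one underscore, one option is to split at the last underscore only. The following is a minimal Python sketch of that idea, not a drop-in replacement for the VBScript above. `latest_revisions` is a hypothetical helper, and it works on plain base names (no extension) instead of touching the file system.

```python
def latest_revisions(basenames):
    """Map each prefix to its highest revision, splitting at the LAST underscore."""
    latest = {}
    for name in basenames:
        prefix, sep, rev = name.rpartition("_")
        if not sep or not rev.isdecimal():
            continue  # not of the form <prefix>_<number>; skip it
        key = prefix.lower()  # case-insensitive keys, similar to vbTextCompare
        latest[key] = max(latest.get(key, -1), int(rev))
    return latest


names = ["foo_bar_1", "foo_bar_2", "foo_0", "File1_3", "File1_10", "notes"]
print(latest_revisions(names))
```

A second pass would then delete every file whose revision is lower than the value stored for its prefix, as in the loop above.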
With that said, I strongly recommend against doing revision management in filenames. Use a revision control system (Git, Mercurial, Subversion, ...). That's what they were invented for.
Option Explicit
' Folder to process
Dim sourceFolder
sourceFolder = "."
Dim fso
Set fso = WScript.CreateObject("Scripting.FileSystemObject")
' Regular expression used to separate base name and sequence
Dim re
Set re = New RegExp
re.Pattern = "^(.*)_([0-9]+)$"
' Dictionary to store data for each sequence
Dim fileSequences
Set fileSequences = WScript.CreateObject("Scripting.Dictionary")
' fileSequences will use the base name of the files as key and
' hold as data an array with the higher sequence number and the
' associated file full path.
Dim f, colMatches, baseName, sequence
For Each f In fso.GetFolder(sourceFolder).Files
' Try to separate base name and sequence
Set colMatches = re.Execute( fso.GetBaseName(f.Name) )
' Only handle serialized files, those whose names match the regular expression
If colMatches.Count > 0 Then
' base name and sequence are stored in the Submatches collection
' file extension is included in the base name to avoid handling separate series as one
baseName = LCase( colMatches.Item(0).SubMatches(0) & "." & fso.GetExtensionName( f.Name ) )
' Get the numeric sequence value - This should also handle zero prefixed sequence numbers
sequence = CLng( colMatches.Item(0).SubMatches(1) )
Select Case True
Case Not fileSequences.Exists( baseName )
' New sequence found - store current sequence value and the associated file path
fileSequences.Add baseName, Array( sequence, f.Path )
Case sequence < fileSequences.Item( baseName )(0)
' File with a lower sequence number found - Remove
f.Delete
Case sequence > fileSequences.Item( baseName )(0)
' File with a higher sequence number found - Remove previous one
fso.DeleteFile fileSequences.Item( baseName )(1)
' Update sequence information with new higher value and the associated file path
fileSequences.Item(baseName) = Array( sequence, f.Path )
End Select
End If
Next
I have a folder with 8 Excel files with the following naming convention:
date_All_Groups
date_HRFull_Status_All
date_RME_Groups_Excluded
These files are used for monthly reports, therefore the date will obviously always be different.
I will be using a macro to manipulate the data in each worksheet; however, I cannot create the macro due to the changing file name (the date). The only guarantee I have is that each of these files will definitely contain a partial string match.
I have a script that finds the files in the location and renames them, but it only renames one file, and it's not the first file in the folder.
My issue is using the For Each loop effectively.
Here's the code I have:
Dim fso, folder, file
Dim folderName, searchFileName, renameFile1, renameFile2, renameFile3, renameFile4, renameFile5, renameFile6, renameFile7, renameFile8
'Path
folderName = "C:\test\"
'Future FileName
renameFile1 = "All Groups.csv"
renameFile2 = "Groups Excluded.csv"
renameFile3 = "No Exclusions.csv"
renameFile4 = "HR.csv"
renameFile5 = "AD Users.csv"
renameFile6 = "Encryption Status.csv"
renameFile7 = "ePO4 Not Encrypted.csv"
renameFile8 = "ePO5 Not Encrypted.csv"
' Create filesystem object and the folder object
Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder(folderName)
' Loop over all files in the folder until the searchFileName is found
For Each file In folder.Files
' See If the file starts with the name we search
If InStr(file.Name, "All_Groups") then
file.Name = renameFile1
End If
If InStr(file.Name, "Groups_Excluded") Then
file.Name = renameFile2
End If
If InStr(file.Name, "No_Exclusions") Then
file.Name = renameFile3
End If
If InStr(file.Name, "HR") Then
file.Name = renameFile4
End If
If InStr(file.Name, "AD_Users") then
file.Name = renameFile5
End If
If InStr(file.Name, "Encryption_Status") then
file.Name = renameFile6
End If
If InStr(file.Name, "ePO4") then
file.Name = renameFile7
End If
If InStr(file.Name, "ePO5") then
file.Name = renameFile8
End If
Exit For
' echo the job is completed
WScript.Echo "Completed!"
Next
The original code I found was exactly as above, but with only one If statement inside the For Each loop and the Exit For was inside the If statement.
Currently, when I execute the script, the code renames only one file, and it's always the HR file first.
If I execute the script again, it then starts with All Groups, then Groups Excluded, and so on.
And the "Echo Completed" does not do anything either.
If you just want to rename your files to "canonical" names you could do something like this, assuming that you just want the date from the beginning of the filename removed and the underscores replaced with spaces:
Set re = New RegExp
re.Pattern = "\d{4}-\d{2}-\d{2}_(.*\.csv)"
For Each f In folder.Files
For Each m In re.Execute(f.Name)
f.Name = Replace(m.Submatches(0), "_", " ")
Next
Next
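The same renaming rule can be written as a small pure function, which makes it easy to check against sample names before any file is touched. This is a Python sketch under the same assumption as the pattern above, namely that the names start with a date of the form YYYY-MM-DD followed by an underscore. `canonical_name` is a hypothetical helper.

```python
import re

# Assumed name shape: <YYYY-MM-DD>_<rest>.csv
_DATED = re.compile(r"^\d{4}-\d{2}-\d{2}_(.*\.csv)$", re.IGNORECASE)


def canonical_name(filename):
    """Strip the leading date and turn underscores into spaces; None if no match."""
    m = _DATED.match(filename)
    if m is None:
        return None
    return m.group(1).replace("_", " ")


print(canonical_name("2014-05-01_All_Groups.csv"))
```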
If the files all start with the same date, you only need to search for that date. For example, if it is an ISO-style date ("YYYYMMDD") and it is today's date (Date returns today's date):
IsoDate=CStr(Year(Date)) & Right("0" & CStr(Month(Date)),2) & Right("0" & CStr(Day(Date)),2)
And the For Each loop:
For Each file In folder.Files
If InStr(file.Name, IsoDate) = 1 Then 'only if the name starts with the date
file.Name = Mid(file.Name, Len(IsoDate)+1) 'the same name without the date
End If
Next
I have a folder that I will be looping through to process files differently based on their filenames. I was doing well with my script (my first one!) until I realized there will be filenames that also have numbers representing priority. For example, the folder may contain:
'NV_CX67_mainx.dxf'
'NV_CX67_mainx1.dxf'
'NV_CX67_mainx2.dxf'
'NV_CX67_mainxroad.dxf'
'NV_CX67_motx.dxf'
'NV_CX67_resxroad.dxf'
The mainx, mainx1 and mainx2 are the same file type but mainx2 has priority and should be the only one processed. Currently, my statement is:
If Instr(1,FileRef, "mainx",1) then
How might I add a 2nd filter to process only the file with the highest number before moving onto the next file?
You are going to have to run through the following process:
1. Sort your input files.
2. Loop through the files one by one.
3. Compare the current file to the previous one you looked at, ignoring the trailing numbers, to see whether it is greater.
4. Only process an item once you have scanned all the similar items, to ensure it has the largest number.
I wrote up an example below. Notice that only NV_CX67_mainx4.dxf and NV_CX67_mainxroad.dxf get processed:
Option Explicit
Dim i, sBaseFileName, sPrevFileName, prevBaseFile
sPrevFileName = "~"
prevBaseFile = "~"
Dim arr(5)
'Initialize test array. This will need to be sorted for this code to work properly
arr(0) = "NV_CX67_mainx.dxf"
arr(1) = "NV_CX67_mainx4.dxf"
arr(2) = "NV_CX67_mainx2.dxf"
arr(3) = "NV_CX67_mainxroad.dxf"
arr(4) = "NV_CX67_motx.dxf"
arr(5) = "NV_CX67_resxroad.dxf"
'Loop through the array
For i = LBound(arr) to UBound(arr)
If Instr(1, arr(i), "mainx",1) Then 'Check prev qualifier
sBaseFileName = getsBaseFileName(arr(i))
'First Case
If prevBaseFile = "~" Then
prevBaseFile = sBaseFileName
sPrevFileName = arr(i)
'Tie - Figure out which one to keep based on number at end of file name
ElseIf prevBaseFile = sBaseFileName Then
sPrevFileName = GetMaxFile(sPrevFileName, arr(i))
prevBaseFile = getsBaseFileName(sPrevFileName)
'New Case - Process prev case
Else
'Process File
MsgBox ("Processing " + sPrevFileName)
'Capture new current file for future processing
sPrevFileName = arr(i)
prevBaseFile = getsBaseFileName(sPrevFileName)
End If
End If
Next
'If last file was valid process it
If sPrevFileName <> "~" Then
MsgBox ("Processing " + sPrevFileName)
End If
'Return the larger of the two files based on numbers at end.
'Note "file9.txt" > "file10.txt" in this code
Function GetMaxFile(sFile1, sFile2)
GetMaxFile = sFile1
If sFile2 > sFile1 Then
GetMaxFile = sFile2
End If
End Function
'Return the file without extension and trailing numbers
'getsBaseFileName("hello123.txt") returns "hello"
Function getsBaseFileName(sFile)
Dim sFileRev
Dim iPos
getsBaseFileName = sFile
sFileRev = StrReverse(sFile)
'Get rid of the extension
iPos = Instr(1, sFileRev, ".",1)
If iPos < 1 Then
Exit Function
End If
sFileRev = Right(sFileRev, Len(sFileRev)-iPos)
'Get rid of trailing numbers
Do
If InStr(1, "1234567890", Left(sFileRev, 1), 1) Then
sFileRev = Right(sFileRev, Len(sFileRev)-1)
Else
Exit Do
End If
Loop While(Len(sFileRev) > 0)
getsBaseFileName = StrReverse(sFileRev)
End Function
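As the comment on GetMaxFile notes, a plain string comparison ranks "file9.txt" above "file10.txt". One way around that is to compare the trailing numbers as integers. The following Python sketch shows the idea. The names `split_name` and `pick_highest` are hypothetical, and it assumes every name has the shape base, optional digits, dot, extension.

```python
import re

# Assumed name shape: <base><optional digits>.<extension>
_NAME = re.compile(r"^(.*?)(\d*)\.([^.]+)$")


def split_name(filename):
    """Return (base, number); number is the trailing integer, or -1 if absent."""
    m = _NAME.match(filename)
    if m is None:
        return filename, -1
    base, digits, _ext = m.groups()
    return base, int(digits) if digits else -1


def pick_highest(filenames):
    """Keep, for every base name, only the file with the largest trailing number."""
    best = {}
    for name in filenames:
        base, number = split_name(name)
        if base not in best or number > best[base][0]:
            best[base] = (number, name)
    return [name for _number, name in best.values()]


print(pick_highest(["NV_CX67_mainx.dxf", "NV_CX67_mainx4.dxf", "NV_CX67_mainx2.dxf"]))
```

Because the best candidate per base name is kept in a dictionary, the input does not need to be sorted first.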
Basically I am trying to write a script to delete files after a certain filename, so based on the below file list
FILE_000001_FULL.ZIP
FILE_000002_FULL.ZIP
FILE_000003_FULL.ZIP
FILE_000004_FULL.ZIP
FILE_000005_FULL.ZIP
FILE_000006_DELTA.ZIP
FILE_000007_DELTA.ZIP
FILE_000008_FULL.ZIP
Everything up until FILE_000005_FULL.ZIP would be deleted. The files are created using a tool and will be sorted by file name, so highest number first. Basically, I need to keep the 2 latest FULL files and the DELTAs (if any) between them. I hope that makes sense.
So far, this is what I have, but it just loops constantly instead of stopping once it finds the 2 latest FULLs.
Dim fso, folder, files, ToDel, sfolder
Set fso = CreateObject("Scripting.FileSystemObject")
sFolder = ("C:\MDS")
Set ToDel = fso.CreateTextFile ("C:\MDS\FileList.txt", True)
Set folder = fso.GetFolder(sFolder)
set files = folder.files
For each folderIDX In files
ToDel.WriteLine(folderidx.Name)
Next
ToDel.close
Dim arrFileLines()
i = 0
Set ObjFile = FSO.OpenTextFile("C:\MDS\FileList.txt", 1)
Do Until objFile.AtEndOfStream
Redim Preserve arrFileLines(i)
arrFileLines(i) = objFile.ReadLine
i = i + 1
Loop
ObjFile.Close
s = 0
Do While s < 2
For l = Ubound(arrFileLines) to LBound(arrFileLines) Step -1
For Each strLine in arrFileLines
IF InStr(strLine, "FULL") <> 0 Then
wscript.echo "Found Full!!!!"
wscript.echo strLine, s
s = S + 1
End If
Next
Next
LooP
My thought was to delete the lines from the text file, then use this text file to delete the files from the directory.
Hopefully that all makes sense and someone can pass some advice on!
You should be able to do this with two iterations through your folder and without the need for a text file. During the first pass, record the numbers assigned to the two latest FULLs. Then, in your second pass, delete any files whose number is lower than that of your second-highest FULL.
Here's how it might look:
' First pass: Find the two latest FULLs...
For Each File In FSO.GetFolder("c:\mds").Files
' Is this a FULL?
If Right(File.Name, 8) = "FULL.ZIP" Then
' Get the numeric value from the file name (6 digits starting at position 6)...
intNum = CLng(Mid(File.Name, 6, 6))
' Maintain the two latest FULLs...
If intNum > intMax1 Then
intMax2 = intMax1
intMax1 = intNum
ElseIf intNum > intMax2 Then
intMax2 = intNum
End If
End If
Next
' Second pass: Delete anything prior to the second-latest FULL...
For Each File In FSO.GetFolder("c:\mds").Files
intNum = CLng(Mid(File.Name, 6, 6))
If intNum < intMax2 Then File.Delete
Next
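The two-pass logic can also be checked on a plain list of names before anything is deleted. Below is a Python sketch of it. `files_to_delete` is a hypothetical helper, and the regular expression encodes the assumed name shape FILE_, six digits, _FULL or _DELTA, .ZIP. Names that do not match (such as FileList.txt) are ignored rather than raising a conversion error.

```python
import re

# Assumed name shape from the question: FILE_<6 digits>_<FULL|DELTA>.ZIP
_PATTERN = re.compile(r"^FILE_(\d{6})_(FULL|DELTA)\.ZIP$", re.IGNORECASE)


def files_to_delete(filenames):
    """Return the names that are older than the second-latest FULL."""
    parsed = []
    for name in filenames:
        m = _PATTERN.match(name)
        if m:  # skip anything that is not a FULL or DELTA archive
            parsed.append((int(m.group(1)), m.group(2).upper(), name))

    # First pass: sequence numbers of the FULLs, newest first.
    fulls = sorted((num for num, kind, _ in parsed if kind == "FULL"), reverse=True)
    if len(fulls) < 2:
        return []  # fewer than two FULLs: delete nothing
    cutoff = fulls[1]

    # Second pass: everything below the second-latest FULL goes.
    return [name for num, _kind, name in parsed if num < cutoff]
```

With the eight files from the question, this returns FILE_000001_FULL.ZIP through FILE_000004_FULL.ZIP.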
I've found an issue with Adobe's Bates numbering tool, where file names are messing up the order in which the files are numbered. I was hoping to write a script that users would be able to click on and add the folder extension for all the files. The script would then prepend a sequential number to every file name within the folder: 000001filename.pdf, 000002filename.pdf, etc.
I've never combined scripts before but I've found scripts that either rename OR prepend, and I couldn't find anything that would rename sequentially with preceding 0s.
This is what I have so far:
Dim iloop As Integer
Dim iFileNumber As Integer
Dim sPrefix As String
Dim sNewFileName As String
Dim arr() As String
'Get array of all pdfs from the selected directory
arr = System.IO.Directory.GetFiles(strPath, "*.PDF")
'loop through the array
For iloop = 0 To UBound(arr)
'Create a prefix for each file
iFileNumber = iloop + 1
Select Case iFileNumber
Case 0 To 9 : sPrefix = "00000" & iFileNumber
Case 10 To 99 : sPrefix = "0000" & iFileNumber
Case 100 To 999 : sPrefix = "000" & iFileNumber
Case 1000 To 9999 : sPrefix = "00" & iFileNumber
Case 10000 To 99999 : sPrefix = "0" & iFileNumber
Case Else : sPrefix = iFileNumber
End Select
Dim arr2() As String
'split the path by the / symbol to get the filename
arr2 = Split(arr(iloop),"\")
'Add the prefix to the front of the filename, filename will be the last item in the array.
arr2(uBound(arr2)) = sPrefix & arr2(uBound(arr2))
'Put the new path and filename back together
sNewFileName = Join(arr2,"\")
'Rename the file with the new filename
System.IO.File.Move(arr(iloop),sNewFileName)
Next
A VBScript solution for renaming PDF files in a given folder by prefixing them with a running number, left-padded with zeroes, could look like this:
fldr = "..."
Set fso = CreateObject("Scripting.FileSystemObject")
i = 0
For Each f In fso.GetFolder(fldr).Files
If LCase(fso.GetExtensionName(f)) = "pdf" Then
Do
i = i + 1
' Six digits, starting at 000001, as asked for in the question
newname = Right("00000" & i, 6) & f.Name
Loop While fso.FileExists(fso.BuildPath(f.ParentFolder, newname))
f.Name = newname
End If
Next
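The padding rule itself can be expressed with a format width instead of string slicing. The Python sketch below only computes the old and new name pairs. `numbered_names` is a hypothetical helper, and processing the files in sorted name order is an assumption, since the question does not say which order the numbers should follow.

```python
def numbered_names(filenames, width=6, start=1):
    """Pair each PDF name with a zero-padded, sequentially numbered new name."""
    # Assumption: files are numbered in sorted name order.
    pdfs = sorted(n for n in filenames if n.lower().endswith(".pdf"))
    return [(old, f"{i:0{width}d}{old}") for i, old in enumerate(pdfs, start)]


for old, new in numbered_names(["report.pdf", "notes.txt", "appendix.pdf"]):
    print(old, "->", new)
```

Computing all pairs first and renaming afterwards also avoids renaming files while the folder is still being enumerated.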
I am trying to determine the number of files that would be copied from the source folder to the destination and then assign this value to progressbar.Max. But using the code below I get "Run-time error 5: Invalid procedure call or argument" at the marked position. Please guide me.
Private Sub cmdCopy_Click()
Dim sFileName As String 'Source File
Dim sDirName As String 'Source Directory
Dim dDirName As String 'Destination Directory
Dim fiFileCount As Integer 'Number of Files to be copied
Dim fbFileMatch As Boolean
If prgFCount.Visible = True Then prgFCount.Visible = False
dDirName = "D:\Destination\"
sDirName = "C:\Source\"
sFileName = Dir(sDirName)
' Disable this button so the user cannot
' start another copy.
cmdCopy.Enabled = False
cmdCancel.Enabled = True
fiFileCount = 0
Do While Len(sFileName) > 0
fbFileMatch = False
If Len(Dir$(dDirName & sFileName)) > 0 Then
fbFileMatch = True
End If
If fbFileMatch = False Then
fiFileCount = fiFileCount + 1
End If
sFileName = Dir '## Error at this Point ##
Loop
If fiFileCount = 0 Then
cmdCopy.Enabled = True
cmdCancel.Enabled = False
Exit Sub
End If
prgFCount.Min = 0
prgFCount.Max = fiFileCount
prgFCount.Visible = True
End Sub
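The problem is this line:
If Len(Dir$(dDirName & sFileName)) > 0 Then
You set up your directory iteration with the line:
sFileName = Dir(sDirName)
Calling the Dir function without parameters retrieves the next item that matches the file name pattern and attributes of the most recent call that supplied them. The Dir$ call inside the loop starts a new search, so the parameterless Dir on the marked line continues that search instead of the one over the source folder. Whenever the file is missing from the destination, the inner search has already returned a zero-length string, and calling Dir again without a path then raises error 5.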
I would suggest rewriting your code to loop through all the files in your source folder and build a list, then loop through the list and look for matches in your destination folder.
Something like this:
...
sFileName = Dir$(sDirName)
Do While Len(sFileName) > 0
i = i + 1
ReDim Preserve strSourceFileList(i)
strSourceFileList(i) = sFileName
sFileName = Dir()
Loop
If i > 0 Then
For i = 1 To UBound(strSourceFileList) ' element 0 is never filled, so start at 1
sFileName = Dir$(dDirName & strSourceFileList(i))
If Len(sFileName) = 0 Then
fiFileCount = fiFileCount + 1
End If
Next i
End If
...
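The "build a list first, then compare" approach can be sketched independently of Dir. The Python below is only an illustration of the counting step. `count_missing` and `list_files` are hypothetical helpers, and the comparison is case-insensitive on the assumption that the folders live on a Windows file system.

```python
import os


def count_missing(source_names, dest_names):
    """Count names present in source_names but absent from dest_names."""
    dest = {name.lower() for name in dest_names}  # case-insensitive lookup
    return sum(1 for name in source_names if name.lower() not in dest)


def list_files(folder):
    """List the plain files (not subfolders) directly inside folder."""
    return [entry.name for entry in os.scandir(folder) if entry.is_file()]


# Usage, with the folders from the question:
# count_missing(list_files(r"C:\Source"), list_files(r"D:\Destination"))
```

The result is the value that would be assigned to the progress bar's Max property.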
Dir returns the name of a matching file, directory, or folder.
Calling Dir should be fine but in your case it generates the error.
You also have no loop implemented to iterate through all the available source files.
Using the FileSystemObject is one of the options.
To use the FileSystemObject, click the Project menu option, followed by the References... menu option.
This will open the References Dialog.
Tick the box beside the reference named "Microsoft Scripting Runtime" and click OK.
Now you can declare a variable as a FileSystemObject. In addition you get access to more objects such as File, Folder, Files and more.
Using the FileSystemObject gives you access to a wide range of features.
The code below demonstrates how to get the count of files which do not exist in the destination and will be copied, using the FileSystemObject.
Private Sub cmdCopy_Click()
Dim fso As New FileSystemObject
Dim sourceFolder As Folder
Dim sourceFile As File
Dim destinationFolder As Folder
Dim filesToBeCopied As Integer
Set sourceFolder = fso.GetFolder("C:\-- Temp --\Source")
Set destinationFolder = fso.GetFolder("C:\-- Temp --\Destination")
filesToBeCopied = 0
' Iterate through each file in the source folder.
For Each sourceFile In sourceFolder.Files
' Check if the source file exists in the destination folder
If Not fso.FileExists(fso.BuildPath(destinationFolder.Path, sourceFile.Name)) Then
filesToBeCopied = filesToBeCopied + 1
End If
Next
End Sub
I have tested the above code and it correctly increments the count of filesToBeCopied to the expected number.