efficient directory traversal for archivable folders - performance

There is a file server I have been asked to clean up (Pentium 4! with 512 MB of RAM!!). Files get added to a directory structure that consists of a few provinces: within each province directory are town directories, within each town directory are 'newspaper' directories, within each newspaper directory are 'story' directories, and within each story directory are files named by a string consisting of topic and date. So each file looks like this:
water poisons yak 2013-04-11.doc, malaria peak 2012-01-05, etc. I have to filter this collection for story directories that are inactive (no contributions within 3 years). The collection is to be filtered once a month. About 1000 new files get added every month. The collection has 80,000 files.
My solution is to run through each story directory and check the name of each file with this:
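Sub LoopThroughFiles(storyfolder)
    ' Sets recentDir (assumed to be declared at module level) when a filename contains 2013 or 2014.
    Dim file As String
    file = Dir(storyfolder)
    While file <> ""
        ' Test each InStr result on its own; "InStr(...) Or InStr(...) > 0" mixes a bitwise Or with a comparison.
        If InStr(file, "2013") > 0 Or InStr(file, "2014") > 0 Then
            MsgBox "recent file found. move on to next directory"
            recentDir = True
            Exit Sub
        End If
        file = Dir
    Wend
End Sub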
This checks whether the year 2013 or 2014 is in the filename. If those years are not found, then I run through the story folder again, checking for the year 2012 and the month in a similar way. If a month more recent than the current one is not found, then that whole story directory is moved. In Visual Basic, this is accomplished by changing the pathname of every file in that particular story directory. So this means I run through the story folder again (a third time), this time assigning a different pathname to each file.
My question has to do with the correct design of this algorithm to be efficient. To know if a folder can be archived, the filenames of its contents have to be inspected.
If I loop through the whole collection at the beginning and load it into an array (to make the subsequent searches faster), then I will need a ton of memory. On the other hand, looping through each directory three times seems wasteful. I feel like this problem should have been solved already, so I am asking.
Although I have to use Visual Basic, the principles behind this question are fundamental and applicable to any language. Do not hesitate to reply in another computer language (in English though).
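Since answers in other languages are welcome, here is a minimal single-pass sketch in Python. It rests on several assumptions that are not confirmed by the question: every filename carries an ISO date (YYYY-MM-DD) as in the examples, the layout is exactly root/province/town/newspaper/story, the archive location lies outside the collection, and story folders with no dated files should be left alone. The function names are made up for illustration. Each story folder is listed once, only the newest date is remembered, and an inactive folder is moved as a whole instead of file by file.

```python
import os
import re
import shutil
from datetime import date
from pathlib import Path

# Assumption: filenames contain an ISO date such as 2013-04-11.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def newest_stamp(story_dir):
    """Return the latest YYYY-MM-DD string found in the filenames, or None."""
    newest = None
    with os.scandir(story_dir) as entries:
        for entry in entries:
            match = DATE_RE.search(entry.name)
            # ISO dates sort correctly as plain strings, so no parsing is needed.
            if match and (newest is None or match.group() > newest):
                newest = match.group()
    return newest


def is_inactive(story_dir, today, years=3):
    """True if the newest dated file is older than `years` years before `today`."""
    cutoff = f"{today.year - years:04d}-{today.month:02d}-{today.day:02d}"
    newest = newest_stamp(story_dir)
    # Assumption: folders without any dated file are not archived.
    return newest is not None and newest < cutoff


def archive_inactive(root, archive_root, today=None, years=3):
    """Move inactive story folders under archive_root, keeping their relative path."""
    root, archive_root = Path(root), Path(archive_root)
    today = today or date.today()
    moved = []
    # Assumed layout: root/province/town/newspaper/story. The list is built
    # before anything is moved so the traversal is not disturbed.
    for story in sorted(root.glob("*/*/*/*")):
        if story.is_dir() and is_inactive(story, today, years):
            target = archive_root / story.relative_to(root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(story), str(target))  # one move per folder
            moved.append(target)
    return moved
```

Only one folder listing is held in memory at a time, so the 512 MB machine should not be a concern for 80,000 files. When the archive is on the same volume, moving the folder is typically a single rename rather than one operation per file.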

Related

How to use Automator and Applescript to batch move files/folders to a specified folder based on specific text in Calendar Events, MacOS

Every day, I get up to 25 real estate photoshoots from a photographer, edit them, and export them into their respective realtors' folder in Dropbox. I get each shoot from the photographer in a folder named with its respective address/location. Currently, after editing each shoot, I look at the foldername (the address/location), go to the Calendar app, find which event has that location, see who the respective realtor is, navigate to their personal folder or create it if it doesn't exist, and then export the photoshoot into their folder (the realtor's name). Every. Single. Time.
I'm trying to automate this process by simply exporting all the shoots into one folder, triggering an Automator Folder Action which sorts the shoots into their respective realtor's folder, based on the foldername.
The relationship between the source folders and destination folders is determined based on location data from a Calendar Event. I've already figured out how to pull the data from Calendar, but getting Automator to move the folders around using the data has me thoroughly stumped.
I've tried several things in Automator, and I'm sure there's a simple way using Applescript but I haven't learned enough about Applescript to make it work. I've tried copy/pasting bits of Applescript code from other posts but I've finally thrown my hands up and decided to ask the experts lol. Here are some screenshots to help illustrate the issue.
Folders full of pictures "Test"(Sources):
"Realtors" Folder (Destinations)
Here is the Automator Workflow I've created so far, which pulls the address (Source foldername) and realtor names (Destination foldername [with other junk to be removed]) from Calendar event data, and creates 2 txt files (shown in the screenshot after this one):
Text files created by above workflow:
It's possible that I'm overcomplicating this horribly, but the solution that comes to mind is this: The text files correlate based on line number, i.e. line 1 of "Today's Locations.txt" (Location: 326 Whiterock Dr) corresponds with line 1 of "Today's Shoots.txt" (Summary: Photos for Remy Locasio).
When I drop these folders into the trigger folder, "Test" in this case, Automator/Applescript could search the "Today's Locations.txt" file for "326 Whiterock Dr" and say, "Hey, that's on line 1. I'm gonna go into "Today's Shoots.txt", find line 1, parse out (Summary: Photos for ), take the realtor's name only (Remy Locasio) and move the input folder ("326 Whiterock Dr") into the realtor's folder with that name ("Remy Locasio")", then do it again for the next folder and its corresponding line in the documents, etc., until all the folders are moved.
Before the realtor's name is typically the word "for ". Whether it's Photos for , Photos and videos for , Photos and aerial photos for etc. so "for " could be a delimiter maybe?
Sometimes, after the realtor's name, there's a note, always separated by a " (", eg. Brandi Smith (text when on the way), so " (" could be another delimiter maybe?
Sometimes the realtor's name is spelled wrong in the calendar but not in the folder, as is the case with Brandi vs Brandy Smith on line 9. There would need to be a popup in this case asking for the correct folder, something to the effect of "Could not find a folder called "Brandi Smith". Should we create it or choose another folder?", and I'd choose the correct folder, or in the case of "Megan Forsberg" (line 8), who doesn't have a folder yet, allow it to create the folder.
I've racked my brain for a week trying to do this with Automator and variables and I'm at a complete loss, any help would be a godsend. I'll gladly clarify anything that doesn't make sense or answer any questions anyone has.
I don't fully understand the whole problem, but you can use the code below in AppleScript for the regex part.
set a to "Photos for Karie Williams (Zonker)"
-- Drop the note: any spaces before the first "(" and everything after it.
set b to do shell script "sed 's/ *(.*//' <<< " & quoted form of a
-- Drop everything up to and including the last "for ".
set c to do shell script "sed 's/.*for //' <<< " & quoted form of b
display dialog c
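The line-number matching and the two delimiters described in the question ("for " before the name, " (" before an optional note) can also be sketched in Python, which could be called from an Automator "Run Shell Script" action. This is only an illustration: the function names are hypothetical, and it assumes each line of the locations file starts with "Location:" and that line N of one file corresponds to line N of the other, as the question describes.

```python
def realtor_name(summary):
    """Extract the name from e.g. 'Summary: Photos for Brandi Smith (note)'."""
    # Keep what follows the first " for "; return None if the marker is absent.
    _, found, rest = summary.partition(" for ")
    if not found:
        return None
    # Cut off an optional trailing note that starts with " (".
    return rest.split(" (", 1)[0].strip()


def build_lookup(location_lines, summary_lines):
    """Map each address to a realtor name, pairing the two files line by line."""
    lookup = {}
    for location, summary in zip(location_lines, summary_lines):
        address = location.strip()
        if address.startswith("Location:"):
            address = address[len("Location:"):].strip()
        name = realtor_name(summary)
        if address and name:
            lookup[address] = name
    return lookup
```

With the lookup built, each dropped folder's name can be used as the key to find its destination; a name with no existing folder (a misspelling or a new realtor) is the case where a prompt would still be needed.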

Is it possible to append prefix to files names taken from folder name, automatically? (Windows)

I work as a technical photographer. I take a lot of photos of particular parts. Each part gets a folder assigned, and then I copy the photos to that folder.
I would like the names of the files (photos) to get a prefix, which is the folder name. Example:
I take 20 photos of part A1. I copy those 20 photos from the SD card to my PC, into a previously created folder named "A1". I would like those 20 files to have names as follows:
A1(1)
A1(2)
A1(3)
[...]
A1(20)
Is it possible to make this automatic, or to do it with one click?
Thanks in advance
If you don't need to preserve the original numbering, it's as simple as selecting all the files in Explorer, pressing F2 (for rename) and typing in the new name. The files will automatically get non-colliding names in the form of "Name (number)".
This respects the ordering you have selected in Explorer, so if you want the index to increment from older to newer files, for example, just sort the files by date ascending.
This can also be used to preserve the original numbering, but only if there are no gaps and if the numbers start from 1. If you sort the files by name and do the rename trick, they will still be ordered the same as before. If there are gaps, they will not be there anymore with the new file names, though.
One more gotcha is that this only works if all of the files have the same extension. If some are jpg and others png, for example, each extension will get its own numbering.
If this isn't good enough, you'll either have to use a script, which is a bit more advanced, or some tool that helps with batch renaming. My favourite has been Total Commander for a long time - in TC, this is as simple as selecting the files you want to rename, pressing Ctrl+M, and changing the file name to something like A1 ([N]).
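For the script route, a minimal Python sketch could look like the following. It is an assumption-laden illustration, not a finished tool: the function name is made up, files are numbered in name order, and each file keeps its own extension. It refuses to overwrite an existing file rather than guessing.

```python
from pathlib import Path


def prefix_with_folder_name(folder):
    """Rename the files in `folder` to '<folder name>(<n>)<extension>'."""
    folder = Path(folder).resolve()
    # Number the files in name order, which usually matches the camera's order.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    renamed = []
    for index, path in enumerate(files, start=1):
        target = path.with_name(f"{folder.name}({index}){path.suffix}")
        if target != path:
            if target.exists():
                # Stop instead of overwriting a photo that already has this name.
                raise FileExistsError(f"{target.name} already exists")
            path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `prefix_with_folder_name(r"C:\Photos\A1")` (a hypothetical path) right after copying the photos from the card would do the renaming in one step.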

Xcopy creates blank/empty csv files

I have a batch file that I use (and have used all day successfully) to copy some badly named files (all are 1.csv) in a horrendously long dir tree to a different location with a more appropriate name (and shorter route to the file). It suddenly quit working; specifically, it IS making the new directory and it IS making the file with the correct name; however, the 'new' file is basically void. The source file may be 26 MB but the destination file created is 287 KB. I'm using
xcopy "Z:\jobs\!q-z\clientname\tars\2016_06_06\home\abcd\xyz\qwert\more\moarr\deeper\deepest\x\y\z\1.csv" "Z:\jobs\!q-z\clientname\daily\160606\biz.csv*"
As I said earlier, this was working just fine all day long and suddenly began creating all dest files exactly the same size, which is an empty shell of a csv file. I have a feeling this might have to do with some sort of "cache" issue, but if so, I don't know how to clear it. Or is there some error in my syntax that allowed it to work almost accidentally before today? I've tried using pretty much every switch for xcopy, with no better results.
Also, the only thing that ever changes in the batch file is the date.

Moving files to corresponding subdirectory in Python

Every week at work, I am responsible for manually moving 100-200 files from one folder into a corresponding subfolder. After doing this for a couple of weeks, I thought to myself: This can be done faster!
I have used Python 2.7 and 3.X a bit at school, but mostly with (very) basic search engines and text search.
I found another thread, where a guy was told to use either os.rename or shutil.move. I made a simple test with os.rename:
os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
And it works, so far so good.
Is there any way to make python run through every file from a folder and move it into a corresponding subdirectory in another folder? The original directory contains all the files, while the target directory contains all the folders.
Every file (A_file, B_file, etc.) has the same name as its folder (A_folder, B_folder, etc.), which means they are in the same order.
This makes me think a simple iteration could work, as in (more of an algorithm than code):
for file in original_dir
    move file to folder_x in tar_dir
    x += 1
Obviously this is not complete, but maybe someone can point me in the right direction.
This creates directories recursively, including any missing parents:
os.makedirs(path)
So you pass the path to the directory you want, e.g. /path/to/
You would then follow that up with the move:
import os
def move_file(new_path_to_file, file_to_move):
    # Take the bare filename from the source path.
    file_name = os.path.basename(file_to_move)
    # Create the destination if needed; exist_ok (Python 3.2+) avoids an error when it already exists.
    os.makedirs(new_path_to_file, exist_ok=True)
    os.rename(file_to_move, os.path.join(new_path_to_file, file_name))
You could also make it easier by passing in the filename as well.
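Matching by name is probably safer than relying on the files and folders being in the same order. A minimal sketch, assuming the target folder's name equals the file's name without its extension (the question does not say exactly how the names correspond, so adjust the matching rule as needed; the function name is hypothetical):

```python
import shutil
from pathlib import Path


def move_to_matching_folders(original_dir, target_dir):
    """Move each file into the subfolder of target_dir named like the file's stem."""
    target_dir = Path(target_dir)
    moved, unmatched = [], []
    # Build the full listing first so moving files does not disturb the loop.
    for path in sorted(Path(original_dir).iterdir()):
        if not path.is_file():
            continue
        # Assumption: "report.pdf" belongs in target_dir/"report".
        folder = target_dir / path.stem
        if folder.is_dir():
            shutil.move(str(path), str(folder / path.name))
            moved.append(path.name)
        else:
            # Leave the file in place and report it instead of guessing.
            unmatched.append(path.name)
    return moved, unmatched
```

The second returned list shows which files had no corresponding folder, so they can be handled by hand.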

VBS Removing files within a directory using FileSystemObject with exceptions?

I work with the rather finicky Oracle Business Intelligence software, and we often have issues that entail clearing out specific data on users' systems and then synchronizing with the server to pull down the data again. I've got a VBS script that I'm working on that removes key directories, renames others, stops services, etc.
Where I'm stuck is on one specific directory. Using FileSystemObject, what would be the easiest way to remove everything within a directory with the exception of a single folder?
So, for this specific example, I have C:\OracleBIData\sync\config
Where I want to delete everything inside of the "sync" directory, with the exception of the config directory. Any takers?
Snippet:
Option Explicit
Const folderspec = "C:\OracleBIData\sync"
Const excludeFolder = "C:\OracleBIData\sync\config"
deleteSubFolders CreateObject("Scripting.FileSystemObject").GetFolder(folderspec), excludeFolder
' Deletes the files and subfolders of MyFolder, skipping exclFolder.
Public Sub deleteSubFolders(ByRef MyFolder, exclFolder)
    Dim sf, f
    ' Files sitting directly in the folder are not covered by the subfolder loop.
    For Each f In MyFolder.Files
        f.Delete True
    Next
    For Each sf In MyFolder.SubFolders
        If Not (LCase(sf.Path) = LCase(exclFolder)) Then
            deleteSubFolders sf, exclFolder
            sf.Delete True
        End If
    Next
End Sub
It will not delete the excludeFolder or anything under it.
Brute force is all I can think of.
Walk through the directory item by item and delete each item if it is not config. Or, if this directory has lots and lots and lots of files, first run through deleting a*.*, b*.*, c*.* and so on for each letter, and then walk through the rest of the items.
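The item-by-item walk is short in most languages. As an illustration of the same idea outside VBScript, here is a minimal Python sketch; the function name is made up, and it assumes the folder to keep is a direct child of the directory being cleared, compared case-insensitively as on Windows.

```python
import shutil
from pathlib import Path


def clear_except(folder, keep_name):
    """Delete everything directly inside `folder` except the entry named keep_name."""
    removed = []
    # sorted() materialises the listing before anything is deleted.
    for entry in sorted(Path(folder).iterdir()):
        if entry.name.lower() == keep_name.lower():
            continue  # leave the excluded folder and its contents untouched
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # removes the subfolder with all its contents
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed
```

For the example in the question this would be called as `clear_except(r"C:\OracleBIData\sync", "config")`.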
