Batch OCR files in subfolders and save new files with new name - windows

I have the following code, which OCR's all PDF files in a specific folder (d:\extracttmp2), but it does not rename the files as I would like, or put the new files in the right place.
Currently, all files are within subfolders of 'extracttmp2'.
The OCR runs correctly, but I would like the OCR'ed files to be renamed to: <parent folder path>-<filename>_ocred.pdf. Naming them in such a manner will produce no file overwrites.
Currently, the code OCR's the files, but it saves the new files to the folder above the folder they are located in. It also saves the filenames as "JAN_ocred.pdf", for example, for a file named "JAN.pdf". The result of saving up one folder leads to some file overwrites, which is unwanted.
Also, it doesn't matter if the OCR'ed files remain in the folder where the un-OCR'ed files are located, or if they're saved up one folder. The desired renaming will eliminate any overwrites.
The software I'm using is PDF24. https://creator.pdf24.org/manual/10/#command-line. However, I think that my problem is not with the OCR software, but my syntax in the batch script.
Can anyone tell me what I am doing wrong?
For /R d:\extracttmp2\ %%G in (*.pdf) do "C:\Program Files\PDF24\pdf24-Ocr.exe" -outputFile "%%~nG_ocred.pdf" -language eng -dpi 300 -skipFilesWithText "%%G"

Is this what you mean? That is, files will be saved in the same location as before, but each name will be prefixed with its parent directory's name, followed by a hyphen/dash.
@For /R "D:\extracttmp2" %%G In (*.pdf) Do @For %%H In ("%%~dpG.") Do @"%ProgramFiles%\PDF24\pdf24-Ocr.exe" -outputFile "%%~nxH-%%~nG_ocred%%~xG" -language eng -dpi 300 -skipFilesWithText "%%G"
Just a quick clarification: D:\extracttmp2\directory1\JAN.pdf would be saved in the working directory with the name directory1-JAN_ocred.pdf, and D:\extracttmp2\directory2\subdirectory3\SOMENAME.pdf, as subdirectory3-SOMENAME_ocred.pdf
If you want to save the files somewhere else, either change the working directory, or prepend it to %%~nxH-%%~nG_ocred%%~xG
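As a minimal sketch of the second option, the output folder can be prepended to the -outputFile value. The folder D:\ocr_output and the variable name outDir are assumptions for illustration, not part of the original question; everything else is taken from the command above.

```batch
@echo off
rem Assumption: D:\ocr_output is a hypothetical destination outside the scanned tree.
set "outDir=D:\ocr_output"
if not exist "%outDir%\" mkdir "%outDir%"

rem %%G is each PDF found recursively; %%H is its parent directory.
rem %%~nxH yields the parent directory's name, %%~nG the file name without extension.
for /R "D:\extracttmp2" %%G in (*.pdf) do for %%H in ("%%~dpG.") do (
    "%ProgramFiles%\PDF24\pdf24-Ocr.exe" -outputFile "%outDir%\%%~nxH-%%~nG_ocred%%~xG" -language eng -dpi 300 -skipFilesWithText "%%G"
)
```

Keeping the output folder outside D:\extracttmp2 avoids the loop picking up the newly created files. Note that only the immediate parent's name is used as the prefix, so two identically named files in identically named folders under different parents would still collide.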

Related

How to batch copy a list of folders and its file contents from a text file into a new folder in windows

I have a folder with approximately 7000 subfolders. I am interested in copying 1500 of those subfolders and their file contents into a new folder.
The closest I have got is copying the subfolders' file contents into a new Target folder. However, the files are not copied into their respective subfolders, which makes the batch copy useless to me.
Here is what I have tried.
CD E:\Source_Folder
FOR /F "delims=" %%N in (List.txt) do COPY "%%N" "Target_Folder"
I have tried XCOPY and ROBOCOPY as well, and all give me the same output of individual files in my target folder. I am looking for the subfolders with their contents to be copied into my new target folder.
Any help would be much appreciated. Thanks!
Just a try: rsync on Linux has an option to take the list of files to sync from a file. Maybe robocopy (its Windows counterpart) has the same capability, in which case you would need only a single robocopy x y z call.
-edit: robocopy can't read from a file, but if you search for 'robocopy reading from list' you get a lot of short scripts.
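One such short script could look like the sketch below. It assumes List.txt contains one subfolder name per line, relative to E:\Source_Folder, and that E:\Target_Folder is the destination; the destination path and the variable names src and dst are illustrative assumptions.

```batch
@echo off
rem Assumptions: List.txt sits in the source folder and lists one subfolder
rem name per line; E:\Target_Folder is a hypothetical destination.
set "src=E:\Source_Folder"
set "dst=E:\Target_Folder"

rem Copy each listed subfolder, including its own subfolders (/E),
rem into a subfolder of the same name under the destination.
for /F "usebackq delims=" %%N in ("%src%\List.txt") do (
    robocopy "%src%\%%N" "%dst%\%%N" /E
)
```

The key difference from the COPY attempt is that the destination includes the subfolder name (%%N), so each folder's contents land in their own subfolder rather than being flattened into the target.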

How to batch copy files based on a list (txt) in to another folder with same directory structure?

I have a root directory with over 25,000 files in it. These files are in loads of different subdirectories.
I also have a text file with 4300 lines in it, each line is an absolute path to one of the files in the directory. Like below,
c:\dir1\hat1.gif
c:\dir1\hat2.gif
c:\dir1\dir2\hat1.gif
c:\dir1\dir2\hat2.gif
c:\dir1\dir3\cat.zip
c:\dir1\dir3\banana.exe
I also have another root directory, which is a copy of the original root directory structure, but all its directories are empty.
I would like to copy all the files listed in the text file to the empty directory structure and place the copied files in their respective subdirectories.
If I use the following batch file I keep getting file overwrite prompts because it is not copying the files to the correct directories.
@echo off
set dst_folder=c:\DSTN2
for /f "tokens=*" %%i in (USEDFILES.txt) DO (
xcopy /S/E "%%i" "%dst_folder%"
)
How do I modify this so the files are copied to the correct directory?
Since you are copying specific files from a list, you need to make sure the directory structure exists in the destination if you want it in a similar folder structure. So using the power of the FOR command modifiers you can get the file path only from the file name in the file list. You will use that modifier to create the destination directory and also use it as the destination for the XCOPY command.
I have taken the liberty of providing best practices for all the code you are using.
@echo off
set "dst_folder=c:\DSTN2"
for /f "usebackq delims=" %%G in ("USEDFILES.txt") DO (
mkdir "%dst_folder%%%~pG" 2>NUL
xcopy "%%~G" "%dst_folder%%%~pG"
)
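To see what the FOR modifiers used above expand to, a small sketch like the following can help. It uses one of the sample paths from the question and only echoes values; nothing is copied. The modifiers themselves (~d, ~p, ~n, ~x) are documented in the output of for /?.

```batch
@echo off
rem Illustration only: echo the pieces of one sample path from the question.
for %%G in ("c:\dir1\dir2\hat1.gif") do (
    rem ~d is the drive, ~p the path without drive, ~nx the name plus extension
    echo Drive:       %%~dG
    echo Path:        %%~pG
    echo Name+ext:    %%~nxG
    rem Destination directory as built in the answer above
    echo Destination: c:\DSTN2%%~pG
)
```

Because %%~pG begins and ends with a backslash, appending it directly to %dst_folder% produces a complete directory path ending in a backslash, which XCOPY treats as a folder.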

dynamically adjust to the unzipped folder name in a batch file

I'm having trouble setting the correct path to the folder containing my batch file. For example, right now, I have a zipped file called "example.zip". This zip file contains 4 files within it (file1, file2, file3, file4). When a user right-clicks and extracts the 4 files, they are given the option to rename the file path and folder name. By default, the folder name gets saved as "example". In my batch script, I can find and move the files great if they don't change the folder name. But if they change the folder name path to C:\Users\%username%\Downloads\"notexample" then it screws up the batch file.
I'm wondering how to grab the folder path after the user has extracted the zip file and possibly named the folder something other than the default name.
My current configuration in my batch script is
for /f "delims=" %%F in ('dir /b /s "%cd:~0,2%\Users\%username%\example" 2^>nul') do set filepath=%%F
This is just searching for any folder that matches "example" in the Users\Downloads directory and grabs the file path. You can see the problem if the user renames the folder "notexample". My batch script yells "folder not found"
Thanks
To get a batch file's location is quite simple, use the %~dp0 variable:
echo %~dp0
returns the batch file's folder.
That means you could simply use this:
set "filepath=%~dp0"
This won't require the for loop.
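A short sketch of how this might be used in the script from the question follows. The name file1 is the placeholder from the question; the existence check is only an illustration of building paths from %~dp0, which already ends with a backslash.

```batch
@echo off
rem %~dp0 is the drive and path of this batch file, with a trailing backslash,
rem so it stays correct whatever the user names the extracted folder.
set "filepath=%~dp0"
echo Script folder: "%filepath%"

rem file1 is the placeholder file name from the question.
if exist "%filepath%file1" echo Found "%filepath%file1"
```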

Windows CMD or BATCH file that "flags" empty folders (and folders that only contain empty folders)

Most companies I have worked for start new projects with a templated folder structure. Windows will automatically flag empty folders (the icon shown in the file explorer is that of an empty folder) however, folders whose subfolders are also empty will not be flagged (their icon shows a folder containing files). This can create confusion and lead to mistakes during the early stages of a project as it would appear at a glance that some folders have had their content added when they really only contain empty subfolders.
So my question is:
How can I iterate over folders with empty subfolders and "flag" them via the use of a Batch file.
The .bat file would need to search through subfolders to determine if a folder has any real content. If the folder does not have real content then the .bat file would need to flag it (flagging could be done with a change of its icon or filename). This would make it much easier to navigate new projects with large templated folder structures.
I don't need a completed file, I would just like to know how it could be achieved. However, any tips or suggestions on how to achieve this functionality would be more than welcome!
*Edit
Just to clarify I will show an example:
If I create an empty folder called 'Project' it will display with the Empty Folder icon. As shown Below:
Now I will add a new folder to my project folder called 'I am Empty':
The folder 'Project' no longer displays with the Empty Folder icon. It now uses the icon that shows it with content. As shown Below:
What I want is a .bat file that will parse the contents of the 'Project' folder and determine that it only contains the 'I am Empty' folder, which is empty and "flag" that folder (to flag it we could change the icon of the 'Project' folder back to the Empty Folder icon, change the name or "gray it out"). As shown Below:
cd example
for /d %%i in (*) do @dir /b /s /a-d "%%i">nul 2>&1|| @echo "%%i" has no files
Give the for command an additional /r if you want to check subfolders too.
The trick is to list files only (/a-d) in the folder and all subfolders (/s) and if this fails (||) (because there are no files), do something with this folder (just echo here, but rd /s /q or ren is also possible)
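With the additional /r, the loop might look like the sketch below. It only echoes the folders it finds; the starting folder "example" is the one from the answer above.

```batch
@echo off
cd example
rem /d iterates over directories, /r walks the whole tree below the current folder.
rem dir /a-d /s fails when no files exist anywhere under %%i, which triggers ||.
for /d /r %%i in (*) do @dir /b /s /a-d "%%i" >nul 2>&1 || @echo "%%i" has no files
```

With /r, %%i holds the full path of each directory rather than just its name.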
Building on the answer given by Stephan and looking at other related stack exchange posts I pieced this solution together and it works well for my needs:
@ECHO OFF
PUSHD "%~dp0"
FOR /f "tokens=* delims=" %%F in ('dir /b/s/ad *') DO (
@dir /s /a-d-s "%%F">nul 2>&1|| ATTRIB +H "%%F"
@dir /s /a-d-s "%%F">nul 2>&1&& ATTRIB -H "%%F"
)
POPD
EXIT
I opted for a solution that did not modify file names. While testing that approach I realized it doesn't create the best user experience.
Instead, running this batch file hides the folders that are effectively empty, and if your folder settings allow you to see hidden folders then they appear faded out. It also un-hides hidden folders that have gained content since the last time the batch file was run.
For those of you coming to this solution who, like myself, are relatively inexperienced with batch scripting, I will explain what the code does and why (as I understand it).
I don't want to change the batch file for each implementation, so I call PUSHD "%~dp0" to set the active directory to the folder containing the batch file (this lets me include the batch file in the folder-structure template, which is copied and pasted for each project).
Since I decided to use the hidden attribute for folder flagging, I needed to modify the FOR loop. A plain FOR loop typically ignores hidden items, which becomes troublesome if you need to show a folder that was previously hidden because it now has content. Running FOR /f "tokens=* delims=" %%F in ('dir /b/s/ad *') DO () allows the batch file to loop through all folders, hidden ones included, mainly because FOR /f iterates over the output of dir /ad rather than matching names itself. The post about looping through hidden folders has more information.
Inside the for loop I am pretty much doing what Stephan suggested in his answer, with added logic to remove the hidden attribute from folders that no longer need it.
The only thing this batch file lacks is the ability to auto-update on folder modifications or to auto-run when the folder is opened (I hear this might be possible with an .ini file?). However, for my needs it will suffice to rerun the batch file manually after making changes to the folder.
Batch scripting is way outside of my comfort zone as far as scripting languages go so please forgive and correct me if I have made any mistakes or if there is a more reliable way to do what I need.

batch list all .xls files in folder

I have the following batch file to list all the Excel files in my folder:
for /r %%i In (*.xls) DO echo %%i
However, this will also include all excel files in subfolders of my current folder. How can I prevent that? I only want the files in the folder itself, not the subfolder.
Pff, silly question, silly answer: just remove the /r
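That is, the loop from the question without the recursive switch, which then only looks at the current folder:

```batch
rem Without /r, FOR matches *.xls in the current folder only.
for %%i in (*.xls) do echo %%i
```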

Resources