I want to view all the duplicate files present on a drive using the command prompt. I have tried a few commands like tree, but I am not satisfied with the results.
I'll assume you are simply looking for duplicate file names, regardless of content.
This is inherently a relatively slow process. If you want a script-based solution, then your best bet is probably to write a custom PowerShell, VBScript, or JScript script.
But I have a pair of pure script-based utilities that can give decent performance. You should still expect the command to take many minutes before it begins printing results (perhaps hours on a large drive). The entire directory listing must fit within 2 GBytes; the command will fail if that limit is exceeded.
Files that you do not have access to will not be listed.
jren "^.*" "name()+' : '+path()" /list /j /s /p c:\ | sort | jrepl "^(.*? : ).*\n(?:\1.*\n)+" "$0" /m /i /jmatch
The above works by first using JREN to recursively list all files, one file per line as
fileName : fullFilePath
The list is then sorted, and then JREPL is used to extract consecutive lines where the leading file name repeats.
JREN.BAT is available at http://www.dostips.com/forum/viewtopic.php?f=3&t=6081
JREPL.BAT is available at http://www.dostips.com/forum/viewtopic.php?f=3&t=6044
Use JREN /? and JREPL /? to get full documentation on the utilities.
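For comparison, the same "group paths by file name, keep the names that repeat" idea can be sketched outside of batch. The Python below is only an illustration and not part of JREN/JREPL; the function names iter_files and group_duplicate_names are made up for this sketch, and names are compared case-insensitively on the assumption that this matches the /i option used above.

```python
import ntpath
import os
from collections import defaultdict


def iter_files(root):
    """Yield the full path of every file below root (unreadable folders are skipped)."""
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            yield os.path.join(folder, name)


def group_duplicate_names(paths):
    """Map each lower-cased file name to its paths, keeping only repeated names."""
    by_name = defaultdict(list)
    for path in paths:
        # ntpath understands Windows separators regardless of the host OS
        by_name[ntpath.basename(path).lower()].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}
```

Calling group_duplicate_names(iter_files("c:\\")) would return a dictionary whose keys are the duplicated names and whose values are the full paths that share each name.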
This is for my doctoral thesis in medicine, so please excuse my noobishness in programming.
I have a bunch of scans from patients (about 4000 files). There is a front and a back .jpg for each patient, and there were multiple patients each day.
The folder structure looks like this:
\images
    \2017-08-21
        \pa_102165.jpg
        \pa_10216500001.jpg
    \2017-06-14
        \pa_101545.jpg
        \pa_10154500001.jpg
        \pa_104761.jpg
        \pa_10476100001.jpg
        \pa_107514.jpg
        \pa_10751400001.jpg
    \2017-03-73
        \pa_109631.jpg
        \pa_10963100001.jpg
        \pa_108624.jpg
        \pa_10862400001.jpg
In the first example, 2017-08-21 is the date the patient came in, pa_102165.jpg is the front and pa_10216500001.jpg is the back. So the front is always pa_10XXXX.jpg and the back is pa_10XXXX00001.jpg. I had no hand in the naming scheme.
My goal is to make a batch script that merges the 2 corresponding .jpgs of each patient horizontally and automatically puts them in a different folder, so that I don't have to do it manually with something like MS Paint.
For example like this:
\images
\merged
\2017-08-21
\pa_102165_merged.jpg
\2017-06-14
\pa_101545_merged.jpg
\pa_104761_merged.jpg
\pa_107514_merged.jpg
\2017-03-73
\pa_109631_merged.jpg
\pa_108624_merged.jpg
I'm working on Windows 10 and found two promising methods so far but fail to comprehend how to make this into a batch file or something like it.
IrfanView Thumbnails
1. Mark the 2 corresponding .jpgs
2. File>Create contact sheet from selected files...
3. Create
4. File>Save as... in the destination folder, which I have to create for every day
which is faster than merging them by hand but would consume multiple workdays to do for all the pairs
and...
ImageMagick in Windows cmd
C:\Users\me\doctor\Images\test\images\2016-03-31>convert pa_102165.jpg pa_10216500001.jpg +append pa_102165_merged.jpg
This produces the merged .jpg in the same folder the input images are in. This looks more promising, but I fail to grasp how I could automate this process given the naming scheme and the folder structure.
Thanks for taking the time to read this! I'm happy for every input you have!
This should get you fairly close. Essentially, it uses the FOR command modifiers to extract the base file name and file extension. The FOR /F command captures the output of the DIR command, which is piped to the FINDSTR command. That way we only grab files matching the file mask pa_######.jpg.
Once we have a front file, we use the modifiers with the IF command to make sure the corresponding 00001 file exists. If it does, the convert command is executed. To make sure the code is performing correctly, I am just ECHOing the command to the screen. If the output on the screen looks correct, remove the ECHO so that the CONVERT command actually executes.
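@echo off
CD /D "C:\Users\me\doctor\Images\test\images"
REM List all files recursively and keep only front scans named pa_######.jpg
FOR /F "delims=" %%G IN ('DIR /A-D /B /S PA_*.jpg ^|findstr /RIC:"pa_[0-9][0-9][0-9][0-9][0-9][0-9]\.jpg$"') DO (
    REM Only merge when the matching back scan (same name plus 00001) exists
    IF EXIST "%%~dpnG00001%%~xG" (
        ECHO convert "%%G" "%%~dpnG00001%%~xG" +append "%%~dpnG_merged%%~xG"
    )
)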
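To make the name handling easier to follow, here is the same pairing logic as a small Python sketch. It is only an illustration of what the %%~dpnG and %%~xG modifiers produce, not a replacement for the batch file; the function name plan_merge and the example path are made up for this sketch.

```python
import ntpath
import re

# Front scans are named pa_ followed by exactly six digits (same mask as the findstr regex)
FRONT_NAME = re.compile(r"^pa_\d{6}\.jpg$", re.IGNORECASE)


def plan_merge(front_path):
    """Return (back_path, merged_path) for a front scan, or None for any other file."""
    folder, name = ntpath.split(front_path)
    if not FRONT_NAME.match(name):
        return None
    stem, ext = ntpath.splitext(name)  # like %%~nG and %%~xG
    back_path = ntpath.join(folder, stem + "00001" + ext)
    merged_path = ntpath.join(folder, stem + "_merged" + ext)
    return back_path, merged_path
```

For a hypothetical front scan C:\images\2017-08-21\pa_102165.jpg this yields the back scan pa_10216500001.jpg and the output pa_102165_merged.jpg in the same folder, while a back scan passed in by mistake returns None.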
This task could be done with IrfanView with the following batch file stored in the directory containing the folder images.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "IrfanView=%ProgramFiles(x86)%\IrfanView\i_view32.exe"
set "SourcePath=%~dp0images"
set "TargetPath=%~dp0merged"
for /F "delims=" %%I in ('%SystemRoot%\System32\where.exe /R "%SourcePath%" pa_10????.jpg 2^>nul') do for %%J in ("%%~dpI.") do (
    if not exist "%TargetPath%\%%~nxJ\%%~nI_merged%%~xI" if exist "%%~dpnI00001%%~xI" (
        if not exist "%TargetPath%\%%~nxJ\" md "%TargetPath%\%%~nxJ"
        if exist "%TargetPath%\%%~nxJ\" (
            echo Merging "%%~nxJ\%%~nxI" and "%%~nxJ\%%~nI00001%%~xI" ...
            "%IrfanView%" /convert="%TargetPath%\%%~nxJ\%%~nI_merged%%~xI" /jpgq=95 /panorama=(1,"%%I","%%~dpnI00001%%~xI"^)
        )
    )
)
endlocal
The fully qualified file name of IrfanView in the third line must be adapted to your installation. The percent value of the option /jpgq, which defines the quality of the output JPEG file, can be modified as well.
The command WHERE searches recursively in the subdirectory images of the directory containing the batch file for files matching the wildcard pattern pa_10????.jpg, ignoring all other files. The found file names are output with full path, and this list is captured by FOR and processed line by line after WHERE has finished. In this case WHERE is not executed by the cmd.exe processing the batch file, but by one more cmd.exe started in the background with option /c and the command line within ' as additional arguments.
Read the Microsoft documentation about Using command redirection operators for an explanation of 2>nul. The redirection operator > must be escaped with the caret character ^ on the FOR command line so that it is interpreted as a literal character when the Windows command interpreter processes this command line before executing FOR, which runs the embedded where command line in a separate command process started in the background.
Each image file name with full path (drive + path + name + extension) is assigned, one after the other, to the loop variable I. For each file name one more FOR loop is used, which processes just the full path of the current image file and assigns this path with a dot appended to the loop variable J. The dot at the end means the current directory, i.e. the directory containing the image file being processed, so %%~nxJ expands to the name of that directory.
The first IF condition checks whether a matching pa_10????_merged.jpg file already exists for that image file, in which case there is nothing to do for the current image file. That means the batch file can be executed on the same folder as often as wanted, because it runs IrfanView only for source JPEG files whose target JPEG file does not exist yet.
The second IF condition checks whether the back image also exists in the directory of the current front image, as otherwise nothing can be merged at all.
The third IF condition checks whether the target directory already exists; the directory is created if that is not the case.
The last IF condition checks the existence of the target directory once again. If it exists now as expected, IrfanView is called with the appropriate options to create the merged image file in the target directory with the appropriate file name.
The closing round bracket ) on the IrfanView command line must be escaped with ^ so that cmd.exe interprets it literally and passes it to IrfanView instead of treating it as the end of one of the command blocks opened with ( above.
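The way the target file name is assembled from the date folder and the front image name can be sketched in a few lines of Python. This is only meant to illustrate the path arithmetic that %%~nxJ and %%~nI perform; the function name merged_target and the example paths are hypothetical and not part of the batch file.

```python
import ntpath


def merged_target(front_path, target_root):
    """Build <target_root>\\<date folder>\\<front name>_merged<ext> for one front image."""
    folder, name = ntpath.split(front_path)
    date_folder = ntpath.basename(folder)  # what %%~nxJ expands to
    stem, ext = ntpath.splitext(name)      # what %%~nI and %%~xI expand to
    return ntpath.join(target_root, date_folder, stem + "_merged" + ext)
```

With a front image in images\2017-06-14 and a target root named merged, the result keeps the date folder name, so the merged files end up grouped by day in the same way as the source files.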
For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.
call /? ... explains %~dp0 ... drive and path of argument 0 which is the batch file path always ending with a backslash
echo /?
endlocal /?
for /?
if /?
md /?
setlocal /?
where /?
Double click on the text file i_options.txt in program files folder of IrfanView for the description of the IrfanView options as used in the batch file.
In my current script, I am using findstr (Windows) as follows:
findstr /s "string" C:\*.*
but this is extremely slow.
What is the fastest way to do this in Windows without using any additional software (e.g. python, c#, etc...).
Also, the files in the directories are constantly changing, so I'm unable to index the files and perform a search on the index.
The results need the full path and filename with the string match.
The full lines where the string matches need to be returned.
Only text based files need to be searched (e.g. xml, txt, etc...)
A possible batch file for this task would be:
@echo off
cd /D C:\
del "%USERPROFILE%\SearchResults.txt" 2>nul
%SystemRoot%\system32\findstr.exe /I /S /C:"string" *.htm *.html *.txt *.xml >"%USERPROFILE%\SearchResults.tmp"
ren "%USERPROFILE%\SearchResults.tmp" "SearchResults.txt"
First the current directory is changed to the root of drive C:.
Next a search results file possibly left over from a previous run in the profile directory of the current user is deleted.
Then findstr is executed to search case-insensitively for the string in the files specified with wildcards on the command line. More file extensions can be appended. There is no file extension for "all text files".
For each found occurrence, findstr prints the name of the file and the matching line to stdout, which is redirected into a text file with file extension tmp in the profile directory of the current user. It is not advisable to give the results file a file extension that is also specified on the findstr command line, because findstr would then also search its own output file.
Finally the created results file is renamed to change the file extension from tmp to txt.
But searching for a string in thousands of files and writing all found lines with the file name into a results file takes time. A second run finishes faster because Windows will have many of the files in its cache already and does not need to access the storage media again for files that have not changed in the meantime.
BTW: A case-sensitive search, done by removing /I, is much faster than a case-insensitive search.
I am currently trying to write a VBScript to identify file paths that are too long (255+ characters) in a large hierarchical network structure. These are usually truncated and end with a tilde (~). I need to get the file paths written to a text file, so that someone can manually decide what to do with them. Scanning the whole server would be too large a job, so I was hoping to be able to run the script on certain folders and their sub-folders.
I am quite comfortable with VB for Access, but have never used VBScript to manipulate directories like this.
I'm using Windows 7, and the directories could be server based.
Any help would be greatly appreciated. Thanks :)
If your problematic paths end with a ~, you can use
dir /s /b x:\someFolder1 | findstr /e /c:"~" > paths.txt
dir /s /b x:\someFolder2 | findstr /e /c:"~" >> paths.txt
....
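If the paths should also be checked against the length limit mentioned in the question, the filter could look like the following Python sketch. It is an assumption-laden illustration rather than part of the dir/findstr approach: the function name flag_suspect_paths is made up, and the threshold of 255 characters is taken from the question.

```python
def flag_suspect_paths(paths, max_len=255):
    """Return the paths that are at least max_len characters long or end with a tilde."""
    return [p for p in paths if len(p) >= max_len or p.endswith("~")]
```

The input could be the lines produced by dir /s /b, and the returned list could be written to paths.txt for manual review.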
Note: There is no need to use batch per se, but I am just familiar with batch. PowerShell would be better I imagine, so if there are easier solutions for this problem in PowerShell, please shout!
I have the arduous task of testing our DR backups for all our clients, that is, mounting the latest incremental of each ShadowProtect snapshot, writing and reading a file, then unmounting the image. The actual ShadowProtect part of the batch is fairly simple, but I would like to design a batch file that can automate this.
Essentially my question is:
How in a batch file can I firstly, enumerate files in a folder, and then place a specific part of a given filename into a variable?
The reason is that ShadowProtect incrementals have a naming convention like:
SERVERNAME_DRIVELETTER_b00X_i000x - whereby b = base image, i = incremental number
I need to mount the latest incremental image, therefore need to parse the folder and find the latest incremental image, based on the number following the i in the filename.
Is this possible in batch?
Thanks!
Something like this should work:
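@echo off
setlocal EnableDelayedExpansion
REM Split each matching file name at the underscores into four tokens
for /f "delims=_ tokens=1-4" %%f in ('dir /b *_*_*_*') do (
    set servername=%%f
    set driveletter=%%g
    set base_image=%%h
    set increment=%%i
)
REM After the loop the variables hold the values of the last file listed
echo !servername!
echo !driveletter!
echo !base_image!
echo !increment!
endlocal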
If you have several matching files and want to do something with all of them, you need to put the processing code inside the loop.
Edit:
for /f: process either a file or the output of a command enclosed in single quotes
delims=_: fields in the processed content are separated by underscores
tokens=1-4: assign the first four tokens to the parameters %%f through %%i (first parameter is the one given in the for statement)
dir /b *_*_*_*: list all files whose names contain at least 3 underscores, printing just the file names (the output of this command is processed by the for loop)
setlocal EnableDelayedExpansion: expand !variables! when a line is executed rather than when it is parsed (this is what you need to read variables inside the loop body after they have been set there)
For further details see help for and help dir.
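Picking the latest incremental then comes down to comparing the numbers after b and i. As a cross-check of that logic, here is a minimal Python sketch; it is not part of the batch answer, the function name latest_incremental is made up, and it assumes that neither the server name nor the drive letter contains an underscore (the same assumption delims=_ makes) and that all names belong to one server and drive.

```python
import re

# SERVERNAME_DRIVELETTER_b<base number>_i<incremental number>, per the question
INCREMENT = re.compile(r"^[^_]+_[^_]+_b(?P<base>\d+)_i(?P<inc>\d+)", re.IGNORECASE)


def latest_incremental(names):
    """Return the name with the highest (base, incremental) numbers, or None."""
    best_key, best_name = None, None
    for name in names:
        match = INCREMENT.match(name)
        if match is None:
            continue  # not an incremental image, ignore it
        key = (int(match["base"]), int(match["inc"]))
        if best_key is None or key > best_key:
            best_key, best_name = key, name
    return best_name
```

Comparing the numbers as integers avoids relying on the listing order or on zero padding, so i0010 is correctly treated as newer than i0009.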
You could always use VBScript or JScript. They are much more powerful than batch files. The JScript and VBScript hosts are also available on machines that don't have PowerShell!
Link for enumeration:
http://www.techimo.com/forum/webmastering-programming/100453-recursive-javascript-list-all-files-folders-given-folder.html
Jscript string reference:
http://msdn.microsoft.com/en-us/library/bxsyt3yc(v=vs.80).aspx
You should be able to combine the two.
You run your JScript file with cscript (I prefer JScript to VBScript because of its resemblance to JavaScript):
cscript scriptname
In Unix, I can pass a list of files to a command like this:
mycommand folder/*
argc will then equal the number of files in the directory (plus one for the program name) and argv will hold the name of each file in the directory.
However, this doesn't seem to be the same on Windows. Is there a way to emulate this without listing all the files of the folder as argument to the command?
Thanks.
Windows command prompt does not natively support wildcard expansion.
If "mycommand" is an application built with Visual C++ and you have control over how it is built, you can add support for wildcards to the application itself (by linking with setargv.obj), as described in the MSDN article Expanding Wildcard Arguments.
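The idea of letting the program expand the wildcards itself is not specific to Visual C++. As a hedged illustration, this is roughly how it can look in Python; the function name expand_args is made up for this sketch, and patterns that match nothing are passed through unchanged, which mirrors the default behaviour of common Unix shells.

```python
import glob


def expand_args(args):
    """Replace each wildcard argument with its sorted matches; keep it as-is if nothing matches."""
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        expanded.extend(matches if matches else [arg])
    return expanded
```

A script would typically call expand_args(sys.argv[1:]) at startup, so that mycommand folder\* sees one argument per file on Windows as well.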
From here:
To delete every .bak file in every subfolder starting at C:\temp
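C:\>FOR /R C:\temp\ %%G IN (*.bak) DO del %%G
(The doubled percent signs are for use in a batch file; use %G when typing the command directly at the prompt.)
Also take a look at FORFILES.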