Strange Windows DIR command behavior

I discovered this quite by accident while looking for a file with a number in the name. When I type:
dir *number*
(where number represents any number from 0 to 9 and with no spaces between the asterisks and the number)
at the cmd.exe command prompt, it returns various files that do not appear in any way to fit the search criteria. What's weird is that, depending on the directory, some numbers will work and not others. For example, in a directory associated with a website, I type the following:
dir *4*
and what is returned is:
Directory of C:\Ampps\www\includes\pages
04/30/2012 03:55 PM 153 inventory_list_retrieve.php
06/18/2012 11:17 AM 6,756 ix.html
06/19/2012 01:47 PM 257,501 jquery.1.7.1.js
3 File(s) 264,410 bytes
0 Dir(s) 362,280,906,752 bytes free
That just doesn't make any sense to me. Any clue?
The question is posed on Stack Overflow because the DIR command is often combined with FOR in batch programs. The strange DIR behavior would seem to make batch programs potentially unreliable if they use the DIR command.
Edit: (additional note). Though much time has passed, I discovered another quirk with this that almost cost me a lot of work. I wanted to delete all .htm files in a particular directory tree. I realized just before doing it that *.htm matches .html files as well. Also, *.man matches .manifest, and there are probably others. Deleting all .html files in that particular directory would have been upsetting to say the least.

Wild cards at the command prompt are matched against both the long file name and the short "8.3" name if one is present. This can produce surprises.
To see the short names, use the /X option to the DIR command.
Note that this behavior is not in any way specific to the DIR command, and can lead to other (often unpleasant) surprises when a wild card matches more than expected on any command, such as DEL.
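To make the double matching concrete, here is a minimal Python sketch of the rule described above. It is only a model: `fnmatch` approximates the Windows wildcard rules rather than reproducing them, and the short name `INVENT~4.PHP` is a made-up example of what the file system might have generated (the real one is whatever `dir /x` shows).

```python
from fnmatch import fnmatchcase


def matches_like_cmd(pattern, long_name, short_name=None):
    """Model of the rule above: a file matches if EITHER name fits the pattern.

    fnmatch is only an approximation of the Windows wildcard rules.
    """
    pat = pattern.upper()  # Windows file name matching ignores case
    if fnmatchcase(long_name.upper(), pat):
        return True
    return short_name is not None and fnmatchcase(short_name.upper(), pat)


# Hypothetical short name; the real one is shown by `dir /x`.
print(matches_like_cmd("*4*", "inventory_list_retrieve.php"))                  # long name only
print(matches_like_cmd("*4*", "inventory_list_retrieve.php", "INVENT~4.PHP"))  # long or short
```

The first call prints False and the second prints True, which is the same surprise the question ran into: the long name has no 4 in it, but the short alias does.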
Unlike in *nix shells, replacement of a file pattern with the list of matching names is implemented within each command and not implemented by the shell itself. This can mean that different commands could implement different wild card pattern rules, but in practice this is quite rare as Windows provides API calls to search a directory for files that match a pattern and most programs use those calls in the obvious way. For programs written in C or C++ using the "usual" tools, that expansion is provided "for free" by the C runtime library, using the Windows API.
The Windows API in question is FindFirstFile() and its close relatives FindFirstFileEx(), FindNextFile(), and FindClose().
Oddly, although the documentation for FindFirstFile() describes its lpFileName parameter as "directory or path, and the file name, which can include wildcard characters, for example, an asterisk (*) or a question mark (?)" it never actually defines what the * and ? characters mean.
The exact meaning of the file pattern has history in the CP/M operating system dating from the early 1970s that strongly influenced (some might say "was directly copied" in place of "influenced" here) the design of MSDOS. This has resulted in a number of "interesting" artifacts and behaviors. Some of this at the DOS end of the spectrum is described in this blog post from 2007, where Raymond Chen describes exactly how file patterns were implemented in DOS.

Yep. You'll see that it also searches through short names if you try this:
dir /x *4*
(/x switch is for short names)
To filter on the long file names only, use:
dir /b | find "4"

A quote from RBerteig's answer:
Note that this behavior is not in any way specific to the DIR command,
and can lead to other (often unpleasant) surprises when a wild card
matches more than expected on any command, such as DEL.
The above is true even for the FOR command, which is very nasty.
for %A in (*4*) do @echo %A contains a 4
will also search the short names. The solution again would be to use FIND or FINDSTR to filter out the names in a more reliable manner.
for %A in (*) do @echo %A | >nul findstr 4 && echo %A contains a 4
Note - change %A to %%A if using the command within a batch file.
Combining FOR with FINDSTR can be a general purpose method to safely use any command that runs into problems with short file names. Simply replace ECHO with the problem command such as COPY or DEL.
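For comparison, the same "list everything, then filter the long names yourself" idea can be sketched in Python. This is just an illustration, not part of the batch solution: the function name and the sample file names are made up, and it relies on `os.listdir()` reporting the long names only.

```python
import os
import tempfile


def long_names_containing(directory, text):
    """Return entries of `directory` whose long name contains `text`.

    os.listdir() reports the long names, so 8.3 aliases play no part here.
    """
    needle = text.lower()
    return sorted(name for name in os.listdir(directory) if needle in name.lower())


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("jquery.1.7.1.js", "ix.html", "report_2014.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    print(long_names_containing(demo_dir, "4"))
```

Only report_2014.txt is listed, because it is the only entry whose long name actually contains a 4.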

It seems that the dir command also searches the short (8.3) file names under the hood.
When I call dir *1* this is what I get:
Volume in drive C is System
Volume Serial Number is F061-0B78
Directory of C:\Users\Piotrek\Desktop\Downloads
2012-05-20 17:33 23 639 040 gDEBugger-5_8.msi
2012-05-20 17:30 761 942 glew-1.7.0.zip
2012-05-20 17:11 9 330 176 irfanview_plugins_433_setup.exe
2012-05-24 20:17 4 419 192 SumatraPDF-2.1.1-install.exe
2012-05-15 22:55 3 466 248 TrueCrypt Setup 7.1a.exe
5 File(s) 1 127 302 494 bytes
There is a gDEBugger-5_8.msi file among those listed, which does not appear to have any 1 character in it.
Everything becomes clear when I use the /X switch with the dir command, which makes dir display the 8.3 file names next to the long ones. Output from a dir /X *1* command:
Volume in drive C is System
Volume Serial Number is F061-0B78
Directory of C:\Users\Piotrek\Desktop\Downloads
2012-05-20 17:33 23 639 040 GDEBUG~1.MSI gDEBugger-5_8.msi
2012-05-20 17:30 761 942 GLEW-1~1.ZIP glew-1.7.0.zip
2012-05-20 17:11 9 330 176 IRFANV~1.EXE irfanview_plugins_433_setup.exe
2012-05-24 20:17 4 419 192 SUMATR~1.EXE SumatraPDF-2.1.1-install.exe
2012-05-15 22:55 3 466 248 TRUECR~1.EXE TrueCrypt Setup 7.1a.exe
5 File(s) 1 127 302 494 bytes
Quote from dir's help:
/X This displays the short names generated for non-8dot3 file
names. The format is that of /N with the short name inserted
before the long name. If no short name is present, blanks are
displayed in its place.

Related

Script to move all files starting with the same 7 letters in a different folder named after first 7 chars of its future content

All files are in a directory (over 500 000 files), named in the following pattern
AR00001_1
AR00001_2
AR00001_3
AR00002_1
AR00002_2
AR00002_3
I need a script (either batch or Unix shell) that takes every file starting with AR00001 and moves it into a new folder called AR00001, then does the same for the AR00002 files, and so on.
Here's what I've been trying to figure out until now
for f in *_*; do
DIR="$( echo ${f%.*} | tr '_' '/')"
mkdir -p "./$DIR"
mv "$f" "$DIR"
done
Thanks
// Update
Ran this in the CMD
for %F in (c:\test\*) do (md "d:\destination\%~nF"&move "%F" "d:\destination\%~nF\") >nul
Seems to be almost what I wanted, except that it does not take the first 7 characters as a substring but instead creates a folder for each file :/ I'm trying to mix it with your solutions
@echo off
setlocal enabledelayedexpansion
for %%a in (???????_*) do (
set "x=%%a"
set "x=!x:~0,7!"
md "!x!" 2>nul
move "!x!*" "!x!\" 2>nul
)
for every matching file do:
- get the first 7 characters
- create a folder with that name (ignore error message, if exist)
- move all files that start with those 7 characters (ignore error messages if the files don't exist because they were already moved)
The following achieves the desired effect and checks for non-existence of the target directory each time before creating it.
@echo off
setlocal ENABLEDELAYEDEXPANSION
set "TOBASE=c:\target\"
set "MATCHFILESPEC=AR*"
for %%F in ("%MATCHFILESPEC%") do (
set "FILENAME=%%~nF"
set "TOFOLDER=%TOBASE%!FILENAME:~0,7!"
if not exist "!TOFOLDER!\" md "!TOFOLDER!"
move "%%F" "!TOFOLDER!" >nul
)
endlocal
In the move command, by moving only the current file rather than including a wildcard, we ensure that we're not eating up file names that might be about to appear the next time around the loop. Keeping it simple, assuming that efficiency is not of prime importance.
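If Python happens to be available, the same one-file-at-a-time approach could be sketched as follows. This is an illustration under assumptions, not a drop-in replacement for the batch file: the function name and the demo paths are made up. The list of names is taken up front so that moving files cannot disturb the iteration, and because the directory listing contains long names only, the 8.3 aliases never come into it.

```python
import os
import shutil
import tempfile


def move_by_prefix(source_dir, target_base, prefix_len=7):
    """Move each file into target_base/<first prefix_len characters of its name>."""
    # Snapshot the names first so moved files cannot disturb the iteration.
    with os.scandir(source_dir) as entries:
        names = [e.name for e in entries if e.is_file()]
    for name in names:
        folder = os.path.join(target_base, name[:prefix_len])
        os.makedirs(folder, exist_ok=True)  # no error if it already exists
        shutil.move(os.path.join(source_dir, name), os.path.join(folder, name))
    return len(names)


# Demonstration with throwaway directories standing in for the real ones.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("AR00001_1", "AR00001_2", "AR00002_1"):
        open(os.path.join(src, name), "w").close()
    print(move_by_prefix(src, dst), sorted(os.listdir(dst)))
```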
I'd recommend prototyping by creating batch files (with a .bat or .cmd extension) rather than trying to do complex tasks interactively using one-liners. The behaviour can be different and there are more things you can do in a batch file, such as using setlocal to turn on delayed expansion of variables. It's also just a pain writing for loops using the %F interactively, only to have to remember to convert all those to %%F, %%~nF, etc. when pasting into a batch file for posterity.
One word of caution: with 500,000 files in the folder, and all of the files having very similar prefixes, if your file system has 8.3 directory naming turned on (which is often the default) it is possible to run into problems using wildcards. This happens as the 8.3 namespace gets more and more busy and there are fewer and fewer options for ways the file name can be encoded in 8 characters. (The hash table fills up and starts overflowing into unexpected file names).
One solution is to turn that feature off on the server but that may have severe implications for any legacy applications. To see what the file looks like in 8.3 naming scheme, you can do, e.g.:
dir /x /p AR*
... which might give you something like (where the left hand name is the one converted to 8.3):
ARB900~1.TST AR15467_RW322.tst
AR85E3~1.TST AR15468_RW322.tst
ARDDFE~1.TST AR15469_RW322.tst
AR1547~1.TST AR15470_RW322.tst
AR1547~2.TST AR15471_RW322.tst
...
In this example, since the first two characters seem to be maintained, there should be no conflict.
So, for example, if I say for %a in (AR8*) do @echo %a I get what might at first seem to be incorrect:
AR15468_RW322.tst
AR18565_RW322.tst
AR20376_RW322.tst
AR14569_RW322.tst
AR17278_RW322.tst
...
But this is actually correct; it is all the files that match AR8* in both the long file name and short file name formats.
Edit: I am aware in retrospect that this solution looks very similar to Stephan's, and I had browsed through the existing answers before starting work on my own, so I should credit him. I will try and save face by pointing out a benefit of Stephan's solution. Its use of wildcards should circumvent any 8.3 naming issue: by specifying the wildcard as ???????_*, it only catches the long file names and won't match any of the converted 8.3 file names (all of which are devoid of underscores in that position). Similarly, a wildcard such as AR?????_* would do the same.
With bash, you'd write:
for f in *; do
[[ -d $f ]] && continue # skip existing directories
prefix=${f:0:7} # substring of first 7 characters
mkdir -p "$prefix" # create the directory if it does not exist
mv "$f" "$prefix" # and move the file
done
For the substring expansion, see https://www.gnu.org/software/bash/manual/bash.html#Shell-Parameter-Expansion -- this is probably the bit you're missing.

Windows delete with wildcards deleting erratically

This is driving me crazy. Basically, I have a program that outputs tables to a flat file for multiple databases with the same structure. These files get named in the format tablename_####.dat, where #### is the 4 digit company number. After these are all created, the program then combines all of the files by tablename, and adds a timestamp on the end. So, the final file name is in the format tablename_YYYYMMDD_HHmmSS.dat. Finally, I want to delete all of the individual .dat files, leaving only the combined, time stamped files.
This works just fine for all of the tables, except for the table VEX. For example, I have files:
VEX_1234.dat
VEX_5678.dat
VEX_0987.dat
which combine to form VEX_20150414_144352.dat. After this, I run the command:
del *_????.dat
This deletes all of the tables' individual files (V_1234.dat, PAT_9534.dat, etc.), while leaving the combined files (V_20150414_142311.dat, PAT_20150413_132113.dat) ...except for VEX. It deletes both the individual files and the combined file. Shouldn't this only delete files that end with an underscore, 4 characters, and ".dat"?
I know this has to be something really simple that I'm missing. What is going on?
Most likely your issue is caused by short 8.3 file names.
The ? wildcard can match 0 or 1 character if it precedes a dot. Your file mask of *_????.dat will match any name that has any number of characters, followed by a _, followed by 0 to 4 characters, followed by the .dat extension. The tricky thing is it will attempt to match both the long file name, and any short 8.3 name, if it exists.
Try issuing dir /x *.dat, and look at the short name of the problem file. I suspect it will match your file mask.
There are patterns with how short names are derived, but there is no way to predict the short name of any given file unless you are aware of all existing short names within the folder, and then you would be relying on undocumented behavior.
This is a fairly common problem. If your files are on an NTFS drive and you have admin rights, then you can disable short file name generation. But this does not remove already existing short names.
The best general solution is to pipe DIR /B through FINDSTR to remove the unwanted files, and process the result with FOR /F to delete each file individually. The FINDSTR below will exclude file names that contain two or more _ characters.
for /f "delims=" %%F in ('dir /b *.dat^|findstr /v "_.*_"') do del "%%F"
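Since the files come from a program anyway, another option is to do the selection in that program against the long names only. Below is a minimal Python sketch of that idea. The regular expression assumes the individual files are always named tablename_####.dat with exactly four digits, as described in the question; the function names are made up for the example.

```python
import os
import re
import tempfile

# tablename_####.dat where #### is exactly four digits (an assumption taken
# from the question's description of the company number).
INDIVIDUAL = re.compile(r".+_[0-9]{4}\.dat", re.IGNORECASE)


def individual_files(names):
    """Keep only names of the form tablename_####.dat."""
    return [n for n in names if INDIVIDUAL.fullmatch(n)]


def delete_individual_files(directory):
    """Delete the per-company files, leaving the time-stamped ones alone."""
    doomed = individual_files(os.listdir(directory))
    for name in doomed:
        os.remove(os.path.join(directory, name))
    return sorted(doomed)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("VEX_1234.dat", "VEX_5678.dat", "VEX_20150414_144352.dat"):
        open(os.path.join(demo_dir, name), "w").close()
    print(delete_individual_files(demo_dir), os.listdir(demo_dir))
```

The pattern is matched against the whole long name, so the combined VEX_20150414_144352.dat file survives whatever its short name looks like.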

renaming files in windows...perhaps dos command prompt (For)

This kind of question has been asked a few times before on here and I have tried to use the answers in previous posts for my problem but I'm still struggling.
I have a directory with hundreds of files along the lines of
ab00123456.stp
ab00123457.stp
ab00123458.stp
...and so on
I would like to rename all these by adding a pre and post text to the file name.
So the end result would be...
CDE_AB00123456_A.stp
CDE_AB00123457_A.stp
CDE_AB00123458_A.stp
...and so on
(Note the upper and lowercase text change also......as if this wasn't difficult enough already!)
Any clues would be much appreciated.....along the lines of some DOS command perhaps....
Andy
for /? is extremely helpful. In particular, it contains the following substitutions:
%~nI - expands %I to a file name only
%~xI - expands %I to a file extension only
Thus, you create a for loop that iterates through your files with iteration variable %I and renames %I to CDE_%~nI_A%~xI.
Ready-to-use example:
for %i in (*) DO echo rename %i CDE_%~ni_A%~xi
Try this in a directory of your choice, fine-tune it and remove the echo once you are satisfied.
Note that translation to upper-case is much harder, but since Windows is not case sensitive anyway, I'd just double-check if this is really required.
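If the upper-casing really is required and Python is an option, the whole rename could be sketched like this. It is only a sketch: the function names are made up, the prefix and suffix come from the question, and it starts in dry-run mode in the same spirit as the echo above.

```python
import os


def new_name(old_name, prefix="CDE_", suffix="_A"):
    """ab00123456.stp -> CDE_AB00123456_A.stp (stem upper-cased, extension kept)."""
    stem, ext = os.path.splitext(old_name)
    return f"{prefix}{stem.upper()}{suffix}{ext}"


def rename_all(directory, extension=".stp", dry_run=True):
    """Print the planned renames; perform them only when dry_run is False."""
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(extension):
            continue
        target = new_name(name)
        print(f"{name} -> {target}")
        if not dry_run:
            os.rename(os.path.join(directory, name), os.path.join(directory, target))


print(new_name("ab00123456.stp"))
```

Note that running rename_all a second time with dry_run=False would prefix the already renamed files again, so check the dry-run output first.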
You should write a batch script to do this. But if you don't know how to script there are 100's of free file renaming tools.
here is a list of some
http://listoffreeware.com/list-of-best-free-file-rename-software/

How to rename a lot of folders with a Windows batch file

I need to rename a lot of folders with a batch script file.
The folder names have this format:
2013.03.12.08.05.06_Debug_Test1
2013.03.12.08.04.09_Debug_Test2
...
I need to change them to this:
2013.12.03.08.05.06_Debug_Test1
2013.12.03.08.04.09_Debug_Test2
That is, swap the number 12 with the number 03.
Is this possible using a Windows batch file?
@echo off
for /f "tokens=1,2,3*delims=." %%a in ('dir /b /ad "*.*.*.*"') do if not %%b==%%c echo ren "%%a.%%b.%%c.%%d" "%%a-%%c.%%b.%%d"
for /f "tokens=1,2,3*delims=.-" %%a in ('dir /b /ad "*-*.*.*"') do if not %%b==%%c echo ren "%%a-%%b.%%c.%%d" "%%a.%%b.%%c.%%d"
Should get you started.
The first FOR selects directories of the format *.*.*.* and renames them *-*.*.* with the 2nd and 3rd elements swapped.
The second renames the renamed directories to change the - to .
Consider directories 2013.03.12.08.05.06_Debug_Test1 and 2013.12.03.08.05.06_Debug_Test1 - attempting to rename the one will fail because the other exists, hence need to rename twice.
(I've assumed that '-' does not exist in your directory names - you may wish to substitute some other character - @,#,$,q suggest themselves)
Note that I've simply ECHOed the rename. Since the second rename depends on the first, the second set wouldn't be produced until the echo is removed from the first after careful checking.
I'd suggest you create a sample subdirectory to test first, including such names as I've highlighted.
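The same two-pass idea can be sketched in Python for anyone who prefers it. This is an illustration under assumptions: the function names and the temporary suffix ".swaptmp" are made up, and the suffix is assumed not to clash with any real folder name. As with the batch version, try it on a sample directory first.

```python
import os
import tempfile


def swapped(name):
    """Swap the 2nd and 3rd dot-separated fields: 2013.03.12.x -> 2013.12.03.x"""
    first, second, third, rest = name.split(".", 3)
    return ".".join((first, third, second, rest))


def swap_all(parent):
    """Rename in two passes so a new name never collides with an existing folder."""
    todo = []
    for name in os.listdir(parent):
        is_dir = os.path.isdir(os.path.join(parent, name))
        if is_dir and name.count(".") >= 3 and swapped(name) != name:
            todo.append(name)
    # Pass 1: park every folder under a temporary name (the suffix is an
    # assumption: it must not clash with any real folder name).
    for name in todo:
        os.rename(os.path.join(parent, name), os.path.join(parent, name + ".swaptmp"))
    # Pass 2: give each parked folder its final name.
    for name in todo:
        os.rename(os.path.join(parent, name + ".swaptmp"), os.path.join(parent, swapped(name)))
    return len(todo)


# Demonstration, including the awkward pair where each name is the other's target.
with tempfile.TemporaryDirectory() as parent:
    for name in ("2013.03.12.08.05.06_Debug_Test1", "2013.12.03.08.05.06_Debug_Test1"):
        os.mkdir(os.path.join(parent, name))
    print(swap_all(parent), sorted(os.listdir(parent)))
```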
StringSolver, which requires a valid JRE installation and sbt, lets you use a semi-automatic version of move:
move 2013.03.12.08.05.06_Debug_Test1 2013.12.03.08.05.06_Debug_Test1
Then check the transformation:
move --explain
concatenates for all a>=0 (a 2-digit number from the substring starting at the a+1-th number ending at the end of the a+1-th AlphaNumeric token in first input + the substring starting at the 2*a+2-th token not containing 0-9a-zA-Z ending at the end of the a+3-th non-number in first input) + the first input starting at the 4th AlphaNumeric token.
which means that it decomposed the transformation by:
2013.12.03.08.05.06_Debug_Test1
AAAABBBBAABCCCCCCCCCCCCCCCCCCCC
where
A is "a 2-digit number from the substring starting at the a+1-th number ending at the end of the a+1-th AlphaNumeric token in first input"
B is "the substring starting at the 2*a+2-th token not containing 0-9a-zA-Z ending at the end of the a+3-th non-number in first input"
C is "the first input starting at the 4th AlphaNumeric token."
which corresponds to what you expected for folders of this type.
If you do not trust it, you can have a dry run:
move --test
which displays what the mapping would do on all folders.
Then perform the transformation for all folders using move --auto or the abbreviated command
move
Alternative
Using the Monitor.ps1 modified and run in Powershell -Sta, you can do it yourself in Windows like in this Youtube video.
DISCLAIMER: I am a co-author of this software developed for academic purposes.

Extracting a 7-Zip file "silently" - command line option

I want to extract a 7-Zip archive in a Python script. It works fine except that it spits out the extraction details (which is huge in my case).
Is there a way to avoid this verbose information while extracting? I did not find any "silent" command line option to 7z.exe.
My command is
7z.exe -o some_dir x some_archive.7z
I just came across this when searching for the same, but I solved it myself! Assuming the command is processed with Windows / DOS, a simpler solution is to change your command to:
7z.exe -o some_dir x some_archive.7z > nul
That is, direct the output to a null file rather than the screen.
Or you could pipe the output to the DOS "find" command to only output specific data, that is,
7z.exe -o some_dir x some_archive.7z | FIND "ing archive"
This would just result in the following output.
Creating archive some_archive.7z
or
Updating archive some_archive.7z
My final solution was to change the command to
... some_archive.7z | FIND /V "ing  "
Note double space after 'ing'. This resulted in the following output.
7-Zip 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Scanning
Updating some_archive.7z
Everything is Ok
This removes the individual file processing, but produces a summary of the overall operation, regardless of the operation type.
One possibility would be to spawn the child process with popen, so its output will come back to the parent to be processed/displayed (if desired) or else completely ignored (create your popen object with stdout=PIPE and stderr=PIPE to be able to retrieve the output from the child).
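A minimal sketch of that approach with the standard subprocess module, assuming Python 3.7 or later. The helper names are made up, "7z" is assumed to be on the PATH, and the demo runs a stand-in child process so that the sketch works without 7-Zip installed.

```python
import subprocess
import sys


def run_quietly(command):
    """Run `command`, capturing its output instead of letting it reach the console."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,   # captured, not displayed
        stderr=subprocess.PIPE,
        text=True,
    )
    if result.returncode != 0:
        # Only show the captured text when something went wrong.
        sys.stderr.write(result.stdout + result.stderr)
    return result


def extract_command(archive, out_dir, exe="7z"):
    """Build the extraction command; -o takes its directory with no space in between."""
    return [exe, "x", archive, "-o" + out_dir, "-y"]


# Stand-in child process so the sketch runs without 7-Zip installed.
demo = run_quietly([sys.executable, "-c", "print('Extracting  file1')"])
print(demo.returncode, len(demo.stdout) > 0)
```

For the real thing you would call run_quietly(extract_command("some_archive.7z", "some_dir")). If the output is never needed, passing subprocess.DEVNULL instead of subprocess.PIPE discards it outright.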
Like they said, to hide most of the screen-filling messages you could use ... some_archive.7z | FIND /V "Compressing" but that "FIND" would also remove the error messages that had that word. You would not be warned. That "FIND" also may have to be changed because of a newer 7-zip version.
7-zip has a forced verbose output, no silence mode, mixes stderr and stdout(*), doesn't save Unix permissions, etc. Those anti-standards behaviors together put "7-zip" in a bad place when being compared to "tar+bzip2" or "zip", for example.
(*) "Upstream (Igor Pavlov) does not want to make different outputs for messages, even though he's been asked several times to do so :(" http://us.generation-nt.com/answer/bug-346463-p7zip-stdout-stderr-help-166693561.html - "Igor Pavlov does not want to change this behaviour" http://sourceforge.net/tracker/?func=detail&aid=1075294&group_id=111810&atid=660493
7zip does not have an explicit "quiet" or "silent" mode for command line extraction.
Besides the popen approach already mentioned, you can try this:
%COMSPEC% /c "%ProgramFiles%\7-Zip\7z.exe" ...
Expanding on @Matthew's answer and this answer https://superuser.com/questions/194659/how-to-disable-the-output-of-7-zip
I'm using FINDSTR instead of find so I can chain multiple lines to exclude and blank lines as well:
7za.exe a test1.zip .\foldertozip | FINDSTR /V /R /C:"^Compressing  " /C:"Igor Pavlov" /C:"^Scanning$" /C:"^$" /C:"^Everything is Ok$"
/V: exclude
/R: regex
/C:"^Compressing  " : beginning of line, Compressing, 2 spaces
/C:"^Scanning$" : the word Scanning on its own on a line (beginning/end)
/C:"^$" : a beginning and end without anything in between, i.e., a blank line
I'm using /C so that a space is a space; otherwise it's a separator between multiple words to exclude, as in this simpler version:
FINDSTR /V "Compressing Pavlov Scanning Everything"
(the same caveats exist, if the wording changes in a new version, or if a useful line starts with the word "Compressing ", it will not work as expected).
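Since the original question runs 7-Zip from a Python script, the same exclusion list could also be applied to captured output in Python. This is only a sketch: the patterns mirror the FINDSTR example above, the sample output in the demo is made up, and the same caveat about wording changes between 7-Zip versions applies.

```python
import re

# Patterns mirror the FINDSTR example; the exact wording depends on the 7-Zip version.
NOISE = [
    re.compile(r"^Compressing "),
    re.compile(r"Igor Pavlov"),
    re.compile(r"^Scanning$"),
    re.compile(r"^$"),
    re.compile(r"^Everything is Ok$"),
]


def strip_noise(output):
    """Return the lines of `output` that match none of the NOISE patterns."""
    kept = [line for line in output.splitlines()
            if not any(p.search(line) for p in NOISE)]
    return "\n".join(kept)


# Made-up sample output, for illustration only.
sample = "\n".join([
    "7-Zip 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18",
    "Scanning",
    "",
    "Compressing  a.txt",
    "ERROR: cannot open b.txt",
    "Everything is Ok",
])
print(strip_noise(sample))
```

With this sample, only the ERROR line is left.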
If you're running 7-zip.exe from Powershell, and you only want to see errors, then you could try something like this:
7-zip.exe u <Target> <Source> | Select-String "Error" -Context 10
This will only display the "Error" message line and the surrounding 10 lines (or whatever number) to capture the error specific output.
The | FIND is a good alternative to show what happened without displaying insignificant text.
Examining 7zip source I found hidden -ba switch that seems to do the trick. Unfortunately it is not finished. I managed to make it work with several modifications of sources but it's just a hack. If someone's interested, the option variable is called options.EnableHeaders and changes are required in CPP/7zip/UI/Console/Main.cpp file.
Alternatively you can poke 7Zip's author to finish the feature in tracker. There are several requests on this and one of them is here.
7-Zip has no such option. Also, the line printed for each compressed file is supposed to be displayed in the same spot without a newline, erasing the previous one, which has a cool effect. Unfortunately, in some contexts (Jenkins...) it produces several lines ☹️ flooding the console.
NUL (windows) is maybe one solution.
7-zip.exe -o some_dir x some_archive.7z>NUL
To show just the last 4 lines...
7z x -y some_archive.7z | tail -4
gives me:
Everything is Ok
Size: 917519
Compressed: 171589
The switch -y is to answer yes to everything (in my case to override existing files).
On Unix-like operating systems (Linux, BSD, etc.) the shell command 7z ... >/dev/null will discard all text written by 7z to standard output. That should cover all the status/informational messages written by 7z.
It seems that 7z writes error messages to standard error so if you do >/dev/null, error messages will still be shown.
As told by Fr0sT above, -ba switch outputs only valid things (at least in list option on which I was trying).
7z.exe l archive_name.zip
7z.exe l -ba archive_name.zip
made great difference, esp for parsing the output in scripts.
There is no need to modify anything; just use the -ba switch in version 19. This was also mentioned by someone above. I'm posting it as an answer because I can't comment.
You can stop 7-Zip from displaying prompts by using the -y switch. This will answer yes to all prompts. Use this only when you are confident.

Resources