How to get only line number of the matched pattern using FINDSTR - windows

I got stuck on a pattern search in a Windows batch (cmd) script. I need to search for a pattern in a file and return only the line number. I have used FINDSTR with the /N option, but it also appends the matched line to the line number.
I also don't have the privileges to install utilities such as the Unix tools, so I can't use cut to extract the line number.

for /f "delims=:" %%a in ('findstr /n "pattern" "file"') do echo "pattern" found in line #%%a

Endoro has posted a good pure batch solution.
Another option is to use a hybrid JScript/batch utility I wrote called REPL.BAT that performs regex search and replace on stdin and writes the result to stdout. It is purely script based, so no executables need to be installed. It works on any modern Windows machine from XP onward. REPL.BAT is available here.
Assuming REPL.BAT is in your current directory, or better yet, somewhere within your PATH:
findstr /n "pattern" "file.txt"|repl :.* ""
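For readers who happen to have Python available (an assumption, since the question is restricted to native tools), the same idea of numbering the lines and keeping only the numbers of the matches can be sketched as below. The function name matching_line_numbers is hypothetical, and the pattern is interpreted as a Python regular expression, not with FINDSTR's rules.

```python
import re


def matching_line_numbers(lines, pattern):
    """Return the 1-based numbers of the lines that contain a match for pattern."""
    regex = re.compile(pattern)
    return [number for number, line in enumerate(lines, start=1) if regex.search(line)]


if __name__ == "__main__":
    # Hypothetical sample data standing in for the contents of "file.txt".
    sample = ["alpha", "a pattern here", "beta", "pattern again"]
    for number in matching_line_numbers(sample, "pattern"):
        print(f'"pattern" found in line #{number}')
```

To run it against a real file, pass an open file object as lines, for example matching_line_numbers(open("file.txt"), "pattern").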

Related

Unconcatenating files using jeb's tricky method

EDIT: My essential question (without the specific setting for which I need a solution, as described in my original posting):
BinFile.bin is a file concatenated from binary files and a text file. The included text file consists only of lines beginning with a specific string, e.g. ;;;===,,,
With a batch file:
findstr /v "^;;;===,,," "BinFile.bin" > output.bin
an output bin file is generated in which the text file is completely removed.
How can findstr (or another DOS command) be used to remove not only all lines beginning with the specified string, but also the part of the bin file before the first such line (i.e. the complete binary part preceding the text file)?
>>> My original posting:
jeb invented a method to concatenate files using Windows native tools which can be unconcatenated (in a specific way) using native tools. His solution is just ingenious!
copy /a batchBin.bat + /b myBinaryFile.bin /b combined.bat
with batchBin.bat:
;;;===,,,@echo off
;;;===,,,echo line2
;;;===,,,findstr /v "^;;;===,,," "%~f0" > output.bin
;;;===,,,exit /b
"The key is the findstr command, it outputs all lines not beginning with ;;;===,,,.
And as each of them are standard batch delimiters, they can prefix any command in a batch file in any combination."
So myBinaryFile.bin can be extracted from combined.bat using only native tools!
My question:
In jeb's example the combined file is a batch file, because the first file in the copy command is a batch file. Could jeb's tricky method be used for the following task too, where the combined file would be combined.exe, an exe file?
copy /b aBat2ExeFile.exe + /a delimiter.bat + /b myBinaryFile.bin /b combined.exe
where delimiter.bat would be something like this:
;;;===,,,REM
and aBat2ExeFile.exe would be a batch file (aBat2ExeFile.bat) converted to exe, with a tricky use of findstr like in batchBin.bat, but with the result
[...] > output.exe
In aBat2ExeFile.bat findstr should be used with the result that all lines of combined.exe before and including the line ';;;===,,,REM' would be ignored and output.exe would be equal to myBinaryFile.bin again?
I think the concept is correct. But how could this be implemented in aBat2ExeFile.bat?
EDIT: My question can be simplified (the frame described above is not essential):
How could the findstr method used by jeb be adapted to process a binary file in such a way that not only lines starting with ';;;===,,,' but also all lines preceding the first such line are "ignored"?
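The requested behaviour can be stated precisely with a small Python sketch. This is an illustration of the logic only, not a native-tools answer and not jeb's method. It assumes the marker line really starts at the beginning of a line (i.e. the preceding binary part ends with a line feed) and that the payload contains no line beginning with the marker. The name extract_payload is hypothetical.

```python
MARKER = b";;;===,,,"


def extract_payload(data, marker=MARKER):
    """Drop everything up to and including the first marker line,
    and also drop any later lines that start with the marker."""
    parts = data.split(b"\n")
    first = next((i for i, part in enumerate(parts) if part.startswith(marker)), None)
    if first is None:
        # No marker line found: return the data unchanged (a choice made for this sketch).
        return data
    kept = [part for part in parts[first + 1:] if not part.startswith(marker)]
    return b"\n".join(kept)


if __name__ == "__main__":
    combined = b"EXE\x00stub\r\n;;;===,,,REM\r\nPAY\x00LOAD"
    print(extract_payload(combined))
```

Splitting on the line feed byte alone keeps every other byte of the payload untouched, which matters for binary data.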

How to read specific string in txt file by using windows batch?

This is the content of my .txt file
123:456
789:333
I'm trying to use findstr to read the file and search for 789:333, but it only prints the first line, 123:456.
As far as I know, cut can do this on Linux.
In Windows, is there a way to search for a string in a file from a batch script?
It is simple using a for loop.
@echo off
for /F "delims=" %%a in ('findstr /I "789:333" somefile.txt') do echo %%a
You can learn a lot about batch file commands by opening a cmd.exe window and typing help.
It briefly describes each command. Once you find one that you think might work, for example for, run for /? to display its help text.

findstr not working as expected

I am trying to find the last line of a text file using the regex ^.*\z. It works fine in Notepad++, but when I try it in cmd using findstr /R "^.*^Z" file.txt it does not work.
Open a command prompt window and run findstr /?. The help output explains what FINDSTR supports. The regular expression support in FINDSTR is limited. It does not offer all the features of the Boost Perl regular expression library, which many text editors use in various versions.
This batch code can be used to get the last non-empty line of a file assigned to an environment variable:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "LastLine="
if exist "file.txt" for /F "usebackq eol= delims=" %%# in ("file.txt") do set "LastLine=%%#"
echo Last line is: "%LastLine%"
endlocal
The FOR command skips all empty lines and, by default, also all lines starting with a semicolon. For that reason eol= is used to define the form-feed control character as the end-of-line character (the form-feed character after eol= is not visible in the listing). If the last line of the file can never start with ;, it would be best to remove eol= from the FOR command line.
If the file to process always has at least X lines, it makes sense to add the option skip=X after usebackq in the FOR options, so that the first X lines of the file are skipped for faster processing.
For details on command FOR open a command prompt window and run for /?.
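As a cross-check of the logic (keep overwriting a variable with each non-empty line, so that the last one survives), here is a minimal Python sketch. It is only an illustration for readers who have Python available. The function name last_non_empty_line is hypothetical, and unlike FOR it has no special handling of lines that start with a semicolon.

```python
def last_non_empty_line(lines):
    """Return the last line that is not empty, without its line ending,
    or None when every line is empty."""
    last = None
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped:
            last = stripped
    return last


if __name__ == "__main__":
    # Hypothetical sample data standing in for the contents of "file.txt".
    sample = ["first\r\n", "second\r\n", "\r\n"]
    print(f'Last line is: "{last_non_empty_line(sample)}"')
```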

How to view duplicate files in a drive using command in cmd

I want to view all the duplicate files present in a drive using the command prompt. I have tried a few commands like tree, but I am not satisfied with the results.
I'll assume you are simply looking for duplicate file names, regardless of content.
This is inherently a relatively slow process. If you want a script-based solution, then your best bet is probably to write a custom PowerShell, VBScript, or JScript script.
But I have a pair of pure script-based utilities that can give decent performance. You should still expect the command to take many minutes before it begins printing results (perhaps hours on a large drive). The entire directory listing must fit within 2 GBytes. This command will fail if the limit is exceeded.
This will not allow you to see files for which you do not have access.
jren "^.*" "name()+' : '+path()" /list /j /s /p c:\ | sort | jrepl "^(.*? : ).*\n(?:\1.*\n)+" "$0" /m /i /jmatch
The above works by first using JREN to recursively list all files, one file per line as
fileName : fullFilePath
The list is then sorted, and then JREPL is used to extract consecutive lines where the leading file name repeats.
JREN.BAT is available at http://www.dostips.com/forum/viewtopic.php?f=3&t=6081
JREPL.BAT is available at http://www.dostips.com/forum/viewtopic.php?f=3&t=6044
Use JREN /? and JREPL /? to get full documentation on the utilities.
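The same grouping idea (collect every file name, then report names that occur more than once) can also be sketched in Python, assuming Python is available. This is not how JREN or JREPL work internally. The function name find_duplicate_names is hypothetical. Names are compared case-insensitively, and folders that cannot be read are silently skipped by os.walk.

```python
import os
import sys
from collections import defaultdict


def find_duplicate_names(root):
    """Map each lower-cased file name that occurs more than once under root
    to the sorted list of full paths that carry that name."""
    paths_by_name = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            paths_by_name[name.lower()].append(os.path.join(folder, name))
    return {name: sorted(paths) for name, paths in paths_by_name.items() if len(paths) > 1}


if __name__ == "__main__":
    # The start folder defaults to the current directory when no argument is given.
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, paths in sorted(find_duplicate_names(start).items()):
        for path in paths:
            print(f"{name} : {path}")
```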

Batch: create fileC.txt from the result of (fileA.txt minus fileB.txt)

I'm trying to create a batch file that creates a fileC.txt containing all lines in fileA.txt except for those that contain any of the strings listed in fileB.txt:
Pseudo:
foreach(line L in fileA.txt)
    excluded = false
    foreach(string str in fileB.txt)
        if L contains str
            excluded = true
    if !excluded
        add L to fileC.txt
For example
fileA.txt: (all)
this\here\is\a\line.wav
and\this\is\another.wav
i\am\a\chocolate.wav
peanut\butter\jelly\time.wav
fileB.txt: (those to be excluded)
another.wav
time.wav
fileC.txt: (wanted result)
this\here\is\a\line.wav
i\am\a\chocolate.wav
I've been fiddling around with FINDSTR but I just can't seem to puzzle it together.. any help or pointers greatly appreciated!
Cheers!
/ Fredde
The answer should be this simple:
findstr /lvg:"fileB.txt" "fileA.txt" >fileC.txt
And with your example, the above does give the correct results.
But there is a nasty FINDSTR bug that makes it unreliable when using multiple case sensitive literal search strings. See Why doesn't this FINDSTR example with multiple literal search strings find a match?, as well as the answer that goes with it. For a "complete" list of undocumented FINDSTR features and bugs, see What are the undocumented features and limitations of the Windows FINDSTR command?.
So the simple code above can fail depending on the content of the files. If you can get away with using a case insensitive search, then the solution is simple.
findstr /livg:"fileB.txt" "fileA.txt" >fileC.txt
Edit: Both versions above will fail if fileB.txt contains \\ or \". In order to work properly, those strings must be escaped as \\\ and \\"
But if you must use a case sensitive search, then there is no simple solution. Your best bet for a pure batch solution might be to use the /R regular expression option. But then you will have to create a modified version of fileB.txt where all regex meta-characters are escaped so that the strings give the correct literal search. That is a mini project in and of itself.
Perhaps your best option for a case sensitive solution is to get a 3rd party tool like grep or sed for Windows.
Edit: Here is a reasonably performing pure batch solution that is nearly bullet proof
I looked into doing something like the proposed logic in your question. But using batch to read all lines in a file is relatively slow. This solution only reads the exclude file line by line. It uses FINDSTR to read the lines in "fileA.txt" repeatedly, once per search string. This is a much faster algorithm for a batch file.
The traditional method to read a file is to use a FOR /F loop, but there is another technique using SET /P that is faster, and it is safe to use with delayed expansion. The only limitations to this method are:
It strips trailing control characters from the line
It is limited to 1021 bytes per line
Each line must be terminated by <CR><LF> as is the Windows standard. It will not work with unix style lines terminated by <LF>
The search strings must have each \ and " escaped as \\ and \" when they are used with the /C option.
@echo off
setlocal enableDelayedExpansion
copy fileA.txt fileC.txt >nul
for /f %%N in ('find /c /v "" ^<fileB.txt') do set len=%%N
<fileB.txt (
  for /l %%N in (1 1 !len!) do (
    set "ln="
    set /p "ln="
    if defined ln (
      set "ln=!ln:\=\\!"
      set ln=!ln:"=\"!
      move /y fileC.txt temp.txt >nul
      findstr /lv /c:"!ln!" temp.txt >fileC.txt
    )
  )
)
del temp.txt
type fileC.txt
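For comparison, the intended result (fileA.txt minus every line that contains one of the strings in fileB.txt, case sensitive and literal) can be written down as a short Python sketch. It is meant as a reference for checking the batch output where Python is available, not as a replacement for the batch solution. The function name exclude_lines is hypothetical, and empty search strings are ignored, as in the batch script.

```python
def exclude_lines(all_lines, excluded_strings):
    """Return the lines that contain none of the excluded strings
    (case-sensitive, literal substring comparison)."""
    needles = [needle for needle in excluded_strings if needle]
    return [line for line in all_lines if not any(needle in line for needle in needles)]


if __name__ == "__main__":
    # Sample data taken from the question.
    file_a = [
        r"this\here\is\a\line.wav",
        r"and\this\is\another.wav",
        r"i\am\a\chocolate.wav",
        r"peanut\butter\jelly\time.wav",
    ]
    file_b = ["another.wav", "time.wav"]
    for line in exclude_lines(file_a, file_b):
        print(line)
```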

Resources