Batch split a text file - windows

I have this batch file to split a txt file:
#echo off
for /f "tokens=1*delims=:" %%a in ('findstr /n "^" "PASSWORD.txt"') do for /f "delims=~" %%c in ("%%~b") do >"text%%a.txt" echo(%%c
pause
It works but it splits it line by line. How do i make it split it every 5000 lines. Thanks in advance.
Edit:
I have just tried this:
#echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=passwordAll.txt
REM Edit this value to change the number of lines per file.
SET LPF=50000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile
REM Do not change beyond this line.
SET SFX=%BFN:~-3%
SET /A LineNum=0
SET /A FileNum=1
For /F "delims==" %%l in (%BFN%) Do (
SET /A LineNum+=1
echo %%l >> %SFN%!FileNum!.%SFX%
if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)
)
endlocal
Pause
exit
But i get an error saying: Not enough storage is available to process this command

This will give you the a basic skeleton. Adapt as needed
#echo off
setlocal enableextensions disabledelayedexpansion
set "nLines=5000"
set "line=0"
for /f "usebackq delims=" %%a in ("passwords.txt") do (
set /a "file=line/%nLines%", "line+=1"
setlocal enabledelayedexpansion
for %%b in (!file!) do (
endlocal
>>"passwords_%%b.txt" echo(%%a
)
)
endlocal
EDITED
As the comments indicated, a 4.3GB file is hard to manage. for /f needs to load the full file into memory, and the buffer needed is twice this size as the file is converted to unicode in memory.
This is a fully ad hoc solution. I've not tested it over a file that high, but at least in theory it should work (unless 5000 lines needs a lot of memory, it depends of the line length)
AND, with such a file it will be SLOW
#echo off
setlocal enableextensions disabledelayedexpansion
set "line=0"
set "tempFile=%temp%\passwords.tmp"
findstr /n "^" passwords.txt > "%tempFile%"
for /f %%a in ('type passwords.txt ^| find /c /v "" ') do set /a "nFiles=%%a/5000"
for /l %%a in (0 1 %nFiles%) do (
set /a "e1=%%a*5", "e2=e1+1", "e3=e2+1", "e4=e3+1", "e5=e4+1"
setlocal enabledelayedexpansion
if %%a equ 0 (
set "e=/c:"[1-9]:" /c:"[1-9][0-9]:" /c:"[1-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
) else (
set "e=/c:"!e1![0-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
)
for /f "delims=" %%e in ("!e!") do (
endlocal & (for /f "tokens=1,* delims=:" %%b in ('findstr /r /b %%e "%tempFile%"') do #echo(%%c)>passwords_%%a.txt
)
)
del "%tempFile%" >nul 2>nul
endlocal
EDITED, again: Previous code will not correctly work for lines starting with a colon, as it has been used as a delimiter in the for command to separate line numbers from data.
For an alternative, still pure batch but still SLOW
#echo off
setlocal enableextensions disabledelayedexpansion
set "nLines=5000"
set "line=0"
for /f %%a in ('type passwords.txt^|find /c /v ""') do set "fileLines=%%a"
< "passwords.txt" (for /l %%a in (1 1 %fileLines%) do (
set /p "data="
set /a "file=line/%nLines%", "line+=1"
setlocal enabledelayedexpansion
>>"passwords_!file!.txt" echo(!data!
endlocal
))
endlocal

Test this: the input file is "file.txt" and output files are "splitfile-5000.txt" for example.
This uses a helper batch file called findrepl.bat - download from: https://www.dropbox.com/s/rfdldmcb6vwi9xc/findrepl.bat
Place findrepl.bat in the same folder as the batch file or on the path.
#echo off
:: splits file.txt into 5000 line chunks.
set chunks=5000
set /a s=1-chunks
:loop
set /a s=s+chunks
set /a e=s+chunks-1
echo %s% to %e%
call findrepl /o:%s%:%e% <"file.txt" >"splitfile-%e%.txt"
for %%b in ("splitfile-%e%.txt") do (if %%~zb EQU 0 del "splitfile-%e%.txt" & goto :done)
goto :loop
:done
pause
A limitation is the number of lines in the file and the real largest number is 2^31 - 1 where batch math tops out.

#echo off
setlocal EnableDelayedExpansion
findstr /N "^" PASSWORD.txt > temp.txt
set part=0
call :splitFile < temp.txt
del temp.txt
goto :EOF
:splitFile
set /A part+=1
(for /L %%i in (1,1,5000) do (
set "line="
set /P line=
if defined line echo(!line:*:=!
)) > text%part%.txt
if defined line goto splitFile
exit /B
If the input file has not empty lines, previous method may be modified in order to run faster.

Related

Output filename and count of a directory in a text file - windows

I have a batch to output filename and count of filenames with the same prefix and date in a text file.
E.g.
Q1231111.zip
Q1241111.zip
where:
Q123 - is a prefix
1111 - is a date.
I want an output such as:
123 : 1
124 : 1
125 : 0
But the batch file unable to output the last one. I wanna see the file is present so i need the 0 output.
Here's my code:
#echo off
setlocal EnableExtensions
for %%I in ("Z:\StoreDataJDA\Q1231111.zip") do call :CountFile "%%~nI"
for %%I in ("Z:\StoreDataJDA\Q1241111.zip") do call :CountFile "%%~nI"
for %%I in ("Z:\StoreDataJDA\Q1251111.zip") do call :CountFile "%%~nI"
for /F "tokens=2,3 delims=#=" %%I in ('set Group# 2^>nul') do echo %%I: %%J >>count.txt
endlocal
goto :EOF
:CountFile
set "FileName=%~1"
set "FileGroup=%FileName:~1,4%"
if "Group#%FileGroup%" == "" (
set "Group#%FileGroup%=1"
) else (
set /A Group#%FileGroup%+=1
)
goto :EOF
pause
Thanks in advance!
From your description, it's hard to tell exactly what you are looking for. But by looking at your very similar question from a year ago, and from Gerhard's comment, I think what you're looking for is:
Count the number of files with a given prefix, each file having a different date suffix.
I think this should do it:
#echo off
setlocal EnableExtensions
set "Group#123=0" & for %%I in ("Z:\StoreDataJDA\Q123????.zip") do call :CountFile "%%~nI"
set "Group#124=0" & for %%I in ("Z:\StoreDataJDA\Q124????.zip") do call :CountFile "%%~nI"
set "Group#125=0" & for %%I in ("Z:\StoreDataJDA\Q125????.zip") do call :CountFile "%%~nI"
(for /F "tokens=2,3 delims=#=" %%I in ('set Group# 2^>nul') do echo %%I: %%J)>count.txt
endlocal
goto :EOF
:CountFile
set "FileName=%~1"
set "FileGroup=%FileName:~1,3%"
set /A Group#%FileGroup%+=1
goto :EOF

count length of filenames in batch

my problem is I want to count the length of multiple filenames and save this numbers into a file.
My approach is this:
#echo off
for %%i in (*.txt) do (
set Datei=%%~ni
call :strLen Datei strlen
:strLen
setlocal enabledelayedexpansion
:strLen_Loop
if not "!%1:~%len%!"=="" set /A len+=1 & goto :strLen_Loop
(endlocal & set %2=%len%)
echo.%strlen%>> tmp
)
The problem here is it only works for the first filename and after that it is stuck and does not go on to the next filename.
Just for another alternative, findstr can output the offset of each matching line. Using it over the dir output and substracting the offset of the previous line from the current one (and the CRLF at the end of the line), we get the length of the previous line / file name
#echo off
setlocal enableextensions disabledelayedexpansion
set "lastOffset=0"
for /f "tokens=1,* delims=:" %%a in ('(dir /b *.txt ^& echo(^) ^| findstr /o "^"') do (
if %%a gtr 0 (
set /a "size=%%a - lastOffset - 2"
setlocal enabledelayedexpansion
echo(!fileName! !size!
endlocal
)
set "lastOffset=%%a"
set "fileName=%%b"
)
Just for another alternative of a one-line command:
for %# in (*.txt) do #for /F "delims=:" %G in ('(echo "%~f#"^&echo(^)^|findstr /O "^"') do #if %~G NEQ 0 ( <^NUL set /p "dummy=%~f#|%~z#|"&set /a %~G-5&echo()
or the same in a bit more readable form:
for %# in (*.txt) ^
do #for /F "delims=:" %G in ('(echo "%~f#"^&echo(^)^|findstr /O "^"') ^
do #if %~G NEQ 0 ( <^NUL set /p "dummy=%~f#|%~z#|"&set /a %~G-5&echo()
Unfortunately, unlike the cmd shell, set command does not display its result in a batch script. Therefore, we need to set a string length to an environment variable and then echo its value with delayed expansion enabled:
#ECHO OFF >NUL
SETLOCAL EnableExtensions EnableDelayedExpansion
rem file lengths:
rem for %%A in (*.txt) do echo Name:%%A, Length: %%~zA
rem full path lengths:
for %%# in (*.txt) do (
for /F "delims=:" %%G in ('(echo "%%~f#"^&echo(^)^|findstr /O "^"') do (
if %%~G NEQ 0 (
set /a "length=%%~G-5"
rem echo(%%~f#^|%%~z#^|!length!
echo(%%~f#^|!length!
)
)
)
A slightly modified how to do count length of file name.
#echo off
SETLOCAL ENABLEDELAYEDEXPANSION
(
for /f "tokens=*" %%A in ('dir *.txt /B /S /A:-D ^| findstr /IV "_output.txt"') do (
set "name=%%~nA"
#echo "!name!">"%TMP%\_Temp.txt"
for %%I in ("%TMP%\_Temp.txt") do (
set /a "length=%%~zI"
set /a length-=4
#echo !length!:'!name!'
)
)
)> _output.txt 2>&1
del "%TMP%\_Temp.txt"
MORE /C /P _output.txt
ENDLOCAL
EXIT /B 0
well the solution was coming from you i just moved the parts out of the loop. the code is this:
#echo off
for %%i in (*.txt) do (
set Datei=%%~ni
call :strLen Datei strlen
)
:strLen
setlocal enabledelayedexpansion
:strLen_Loop
if not "!%1:~%len%!"=="" set /A len+=1
goto :strLen_Loop
(endlocal & set %2=%len%)
echo.%strlen%>> tmp

read then write delimited file - Batch cmd

Hi I have a pipe | delimited file.
I need to reverse all the numbers in there
The file looks like this
Entity|Division|Channel|200|300|800
I need to read the file an create a new one with reversed numbers
Entity|Division|Channel|-200|-300|-800
trying to make this work but, not entirely sure how to amend the text I'm getting. I need help on the :processToken procedure. How to output the tokens into a new file and add "-" and add back the delimiter |
for /f "tokens=* delims= " %%f in (M:\GPAR\Dev\EQ\Upload_Files\eq_test.txt) do (
set line=%%f
call :processToken
)
goto :eof
:processToken
for /f "tokens=* delims=|" %%a in ("%line%") do (
)
if not "%line%" == "" goto :processToken
goto :eof
Thanks
#echo off
set "ent_file=c:\entity.txt"
break>"%tmp%\rev.txt"
for /f "usebackq tokens=1,2,3,4,5,6 delims=|" %%a in ("%ent_file%") do (
echo %%a^|%%b^|%%c^|-%%d^|-%%e^|-%%f
)>>"%tmp%\rev.txt"
move /y "%tmp%\rev.txt" "%ent_file%"
del /q /f "%tmp%\rev.txt"
Should work if your file contains only lines like the one you've posted.
EDIT output the part after the 6th token:
#echo off
set "ent_file=c:\entity.txt"
break>"%tmp%\rev.txt"
for /f "usebackq tokens=1,2,3,4,5,6,* delims=|" %%a in ("%ent_file%") do (
echo %%a^|%%b^|%%c^|-%%d^|-%%e^|-%%f|%%g
)>>"%tmp%\rev.txt"
move /y "%tmp%\rev.txt" "%ent_file%"
del /q /f "%tmp%\rev.txt"
#echo off
setlocal EnableDelayedExpansion
set digits=0123456789
(for /F "delims=" %%f in (eq_test.txt) do (
set "input=%%f"
set "output="
call :processToken
set /P "=!output:~1!" < NUL
echo/
)) > new_file.txt
goto :EOF
:processToken
for /F "tokens=1* delims=|" %%a in ("%input%") do (
set "token=%%a"
set "input=%%b"
)
if "!digits:%token:~0,1%=!" neq "%digits%" set "token=-%token%"
set "output=%output%|%token%"
if defined input goto processToken
exit /B

How to create a unique output filename for Windows Script?

I am trying to create a windows script that should generate this kind of filename everytime I run it: filename1, filename2, filename3 and so on. Here is what I have so far:
(
#echo off
wmic logicaldisk get size,freespace,caption
) > disk.txt
I hope you can help me. Thanks!!
:: make a tempfile
:maketemp
SET "tempfile=%temp%\%random%"
IF EXIST "%tempfile%*" (GOTO maketemp) ELSE (ECHO.>"%tempfile%a")
You now have any number of filenames available.
%tempfile%a exists and is empty, but %tempfile%anythingelse should be available for use.
#ECHO OFF
SETLOCAL
SET "basename=filename"
SET /a outname=0
:genloop
SET /a outname+=1
IF EXIST "%basename% %outname%.txt" GOTO genloop
SET "outname=%basename% %outname%.txt"
ECHO %outname%
GOTO :EOF
Ah - increment the destination filename on each run. This should do that. It's not actually creating a file - you'd need to create the file %outname% each time to have it increment...
(the space between %basename% and %outname% is optional, of course - omit it if desired.)
edited to include .txt
This will give you up to 1000 filenames but you can go higher, up to 2 Billion, but the higher you go the longer the delay will be before it picks a filename.
#echo off
for /L %%a in (1,1,1000) do if not defined filename if not exist "filename%%a.txt" set "filename=filename%%a.txt"
(
wmic logicaldisk get size,freespace,caption
) > "%filename%"
#echo off
setlocal enableextensions
call :getNextFilename "filename*.txt" nextFilename
echo %nextFilename%
echo test > "%nextFilename%"
call :getNextFilename "%cd%\filename*.txt" nextFilename
echo %nextFilename%
echo test > "%nextFilename%"
endlocal
exit /b
:getNextFilename whatToSearch returnVariable
setlocal enableextensions enabledelayedexpansion
for /f %%a in ("$\%~1"
) do for /f "tokens=1,* delims=*" %%b in ("%%~nxa"
) do ( set "left=%%b" & set "right=%%c" )
set "max=0"
for %%a in ("%~1"
) do for /f "tokens=1 delims=%left%%right% " %%b in ("%%~nxa"
) do for /f "tokens=* delims=0 " %%c in ("0%%~b"
) do if %%~c geq !max! set /a "max=%%c+1"
endlocal & set "%~2=%~dp1%left%%max%%right%" & exit /b
This should find the next file in sequence independently of the existence of holes in the numeration of the files. A path can be included or omitted. The * will be used as the placeholder for the numeration. BUT this will not work if files or included paths have "problematic" characters.
If the date/time of creation of the file can be considered, then this version can be optimized as
:getNextFilename whatToSearch returnVariable
setlocal enableextensions disabledelayedexpansion
for /f %%a in ("$\%~1"
) do for /f "tokens=1,* delims=*?" %%b in ("%%~nxa"
) do ( set "left=%%b" & set "right=%%c" )
set "max=0"
for /f "delims=" %%a in ('dir /tc /o-d /b "%~1" 2^>nul'
) do for /f "tokens=1 delims=%left%%right% " %%b in ("%%~nxa"
) do for /f "tokens=* delims=0 " %%c in ("0%%~b"
) do set /a "max=%%c+1" & goto done
:done
endlocal & set "%~2=%~dp1%left%%max%%right%" & exit /b
that will take the latest created instance of the file set.
I finally figured out where to put the .txt extension. This is from #Magoo's answer but I wanted the file to be a text file so I placed the .txt twice in order for it to work properly.
#ECHO OFF
SETLOCAL
SET "basename=DISK-OUT"
SET /a outname=0
:genloop
SET /a outname+=1
IF EXIST "%basename% %outname%.txt" GOTO genloop
SET "outname=%basename% %outname%.txt"
(
wmic logicaldisk get size,freespace,caption
) > "%outname%"
GOTO :EOF

Getting a list of common strings with first occurance mark

I've got bunch of text files with some content. First I wanted to number the lines globally. Then I extracted all lines that are duplicated somewhere (occur in any of given files at least twice). But now I need to mark all of these lines with the filename and line number of the first occurrence of this line. And now the funny part - it needs to be a windows batch file, using native windows tools. That's why I've got this problem to begin with.
So, to sum it up:
I have a file A with unique strings/lines, each of them is said to occur at least twice in given set of files.
I need to search these files and mark all occurrences of given line from A file with
-file name in which the line first occured
-line number in this file
This is my code with effort to number lines and format files.
#echo off
setlocal EnableDelayedExpansion
set /a lnum=0
if not [%1]==[] pushd %1
for /r %%F in (*.txt) do call :sub "%%F"
echo Total lines in %Files% files: %Total%
popd
exit /b 0
:Sub
set /a Cnt=0
for /f %%n in ('type %1') do (
set /a Cnt+=1
set /a lnum=!lnum!+1
echo ^<!lnum!^> %%n >> %1_ln.txt && echo ^<!lnum!^> >> %1_ln.txt && echo. >> %1_ln.txt
)
set /a Total+=Cnt
set /a Files+=1
echo %1: %Cnt% lines
#echo off
setlocal EnableDelayedExpansion
set lnum=0
if not "%~1" == "" pushd %1
rem "I've got bunch of text files..." (%%F is file name)
for /r %%F in (*.txt) do call :sub "%%F"
echo Total lines in %Files% files: %lnum%
popd
exit /b 0
:Sub "filename"
set Cnt=0
rem "... with some content." (%%n is line contents)
(for /f "usebackq delims=" %%n in (%1) do (
set /a Cnt+=1
rem "First I wanted to number the lines globally."
set /a lnum+=1
echo ^<!lnum!^> %%n
rem "Then I extracted all lines that are duplicated somewhere" (that were defined before)
if defined line[%%n] (
rem "I need to mark all of these lines with the filename and line number of the first occurrence of this line."
echo ^<!line[%%n]!^>
echo/
) else (
REM (Store the first occurrence of this line with *local* line number and filename)
set line[%%n]=!Cnt!: %1
)
)) > "%~PN1_ln.txt"
set /A Files+=1
echo %1: %Cnt% lines
exit /B
The above Batch program ignore empty lines in the input files and fail if they contain special Batch characters, like ! & < > |; this limitation may be fixed if required.
#ECHO OFF
SETLOCAL
FOR /f "delims=" %%s IN (A) DO (
SET searching=Y
FOR /f "delims=" %%f IN (
'dir /s /b /a-d *.txt') DO IF DEFINED searching (
FOR /f "tokens=1delims=:" %%L IN (
'findstr /b /e /n /l /c:"%%s" ^<"%%f"') DO IF DEFINED searching (
ECHO Line %%L IN "%%f" FOUND "%%s"
SET "searching="
)
)
)
Here's the meat of a routine that should do what you appear to be looking for - and that's as clear as mud.
It looks through the "A" file for each string in turn, assigns the string to %%s and sets the flag searching
Then it looks through the file list, assigning filenames to %%f
Then it executes a findstr to find the /c:"%%s" complete string %%s (including any spaces) in /l or literal mode (ie. not using regular expressions) for a line that both /b and /e begins and ends with the target (ie exactly matches) and /n numbers those lines.
The output of findstr will be in the format linenumber:linecontents so if this line is examined by the FORwith the option "delims=:" then the partion up to the first : is assigned to to %%L
So - %%L contains the line#, %%f the filename, %%s the string
Clearing searching having detected this line by setting its value to [nothing] means it's not NOT DEFINED hence no further lines will be reported from the current file, and no further filenames will be examined.
Now if you want to get a listing of ALL of the occurrences of the target lines, all you need to do is to REM-out the SET "searching=" line. Searching will then never be reset, so each line in each file is reported.
If you want some other combination, please clarify.
I have absolutely no idea whatever what you mean by "marking" a line.
#ECHO OFF & setlocal
for /f "tokens=1*delims==" %%i in ('set "$" 2^>nul') do set "%%i="
for %%a in (*.txt) do (
for /f %%b in ('find /v /c "" ^<"%%a"') do echo(%%b lines in %%a.
set /a counter=0, files+=1
for /f "usebackqdelims=" %%b in ("%%~a") do (
set /a counter+=1, total+=1
set "line=%%b"
setlocal enabledelayedexpansion
if not defined $!line! set "$!line!=%%a=!counter!=!line!"
for /f "delims=" %%i in ('set "$" 2^>nul') do (if "!"=="" endlocal)& set "%%i"
)
)
echo(%total% lines in %files% files.
for /f "delims=" %%a in (a) do set "#%%a=%%a"
for /f "tokens=2,3*delims==:" %%i in ('set "$" 2^>nul') do (
if defined #%%k echo("%%k" found in %%i at line %%j.
)
Script can handle !&<>|%, but not =.

Resources