How to sort lines by length with batch script? - sorting

I want to make a script that sorts all lines in the document example.txt by length (some lines contain spaces). The longest line should come first and the shortest line last. The script may overwrite the original document. Thank you :-)

It is surprisingly easy and fast to accomplish this by writing each line as its own file within a temporary folder. Then use DIR /B /O-S to sort the files (lines) by size, capturing the result with FOR /F and then printing each file (line) with TYPE.
@echo off
setlocal disableDelayedExpansion
set "file=example.txt"
set "tempLoc=sortLinesTemp"
md "%tempLoc%"
set "cnt=0"
for /f usebackq^ delims^=^ eol^= %%A in ("%file%") do (
set /a cnt+=1
set "ln=%%A"
setlocal enableDelayedExpansion
echo(!ln!>"%tempLoc%\f!cnt!"
endlocal
)
(for /f %%F in ('dir /b /o-s "%tempLoc%"') do type "%tempLoc%\%%F")>"%file%.new"
move /y "%file%.new" "%file%" >nul
rd /s /q "%tempLoc%"
type "%file%"
This solution will strip empty lines. Empty lines can be preserved with a bit more code.
Also, line lengths are limited to a bit less than 8191 characters. This limitation is inherent to any pure native batch solution.

This batch file uses a VBS script to help get the line lengths, and sorts, then rewrites the input.txt file into input.new.txt
Use the batch like this: sortline.bat "filename.txt"
Leading | characters in a line will vanish.
@echo off
set "file=%temp%\sortline.vbs"
(
echo. Const ForReading = 1, ForWriting = 2
echo. infile = "%~1"
echo. Set fso = CreateObject("Scripting.FileSystemObject"^)
echo. Set f1 = fso.OpenTextFile(infile, ForReading^)
echo. Do While not f1.AtEndOfStream
echo. f = f1.readline
echo. Wscript.echo right(10000+len(f^),4^) ^& "|" ^& f
echo. loop
echo. f1.close
)>"%file%"
(for /f "tokens=1,* delims=|" %%a in (' cscript //nologo "%file%" ^|sort /r ') do echo(%%b)>"%~n1.new.txt"
del "%file%"

I think this is the simplest and fastest way to do that:
@echo off
setlocal EnableDelayedExpansion
set /A seqNum=10000, accumLen=0
set "lastLine="
for /F "tokens=1* delims=:" %%a in ('findstr /O "^" example.txt') do (
if not defined lastLine (
set "lastLine=%%b"
) else (
set /A "seqNum+=1, thisLen=10000-(%%a-accumLen), accumLen=%%a"
set "line[!thisLen!!seqNum:~-4!]=!lastLine!"
set "lastLine=%%b"
)
)
for %%a in (example.txt) do (
set /A "seqNum+=1, thisLen=10000-(%%~Za-accumLen)"
set "line[!thisLen!!seqNum:~-4!]=!lastLine!"
)
(for /F "tokens=1* delims==" %%a in ('set line[') do echo %%b) > sorted.txt
This solution strips empty lines and exclamation marks from the file. Both limitations can be fixed if needed.

You need the strLen function from this page. Also, here is a download link to download sortn.bat , which will give you the hints to get started. Also, you need to be familiar with the basic compare function:
public int compare(String o1, String o2) {
if (o1.length()!=o2.length()) {
return o1.length()-o2.length();
}
return o1.compareTo(o2);
}
Good luck.
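Since the link to strLen did not survive here, this is a minimal sketch of one common way to measure a string in pure batch: a binary search over substring offsets. It is not necessarily the function the answer linked to. The label :strLen, its argument order, and the variable names text and len are assumptions made for illustration, and the sketch does not try to handle strings containing double quotes.

```batch
@echo off
setlocal EnableDelayedExpansion
set "text=hello world"
call :strLen text len
echo Length of "!text!" is !len!
exit /b

:strLen  stringVarName  resultVarName
rem Hypothetical helper: stores the length of the named variable's value.
setlocal EnableDelayedExpansion
set "s=!%~1!"
set "n=0"
if defined s (
    set "n=1"
    rem Try to jump ahead by decreasing powers of two while a character exists there.
    for %%P in (4096 2048 1024 512 256 128 64 32 16 8 4 2 1) do (
        if "!s:~%%P,1!" neq "" (
            set /a "n+=%%P"
            set "s=!s:~%%P!"
        )
    )
)
endlocal & set "%~2=%n%"
exit /b
```

With such a helper, each line's length could be used as a sort key in the same way as the Java compare function above: shorter length first, then ordinary string comparison for ties.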

Related

Creating Each line of text as variable and them constantly changing in a loop in batch

What I'm trying to do is create a lookup for multiple people, where the text file lists names and numbers like this:
Example of text file:
Beth
1234567891
Jay
2134456544
This is the best way I can explain what I'm trying to do:
@echo off
set "file=Test1.txt"
setlocal EnableDelayedExpansion
<"!file!" (
for /f %%i in ('type "!file!" ^| find /c /v ""') do set /a n=%%i && for /l %%j in (1 1 %%i) do (
set /p "line_%%j="
)
)
set /a Name=1
set /a Number=2
Echo Line_%Name%> %Name%.txt (Im trying to get this to say line_2 to say 1st line in the text file)
Echo Line_%Number%> %Name%.txt (Im trying to get this to say line_2 to say 2nd line in the text file)
:Start
set /a Name=%Name%+2 (These are meant to take off after 1 so lines 3,5,7,9 so on)
set /a Number=%Number%+2 (These are meant to take off after 2 so lines 4,6,8,10 so on)
Echo Line_%Name%
Echo Line_%Number%
GOTO :Start
so the outcome would be
In Beth.txt:
Beth
1234567891
So every name will be a file name and the first line in a file. I will change it later so I can do a addition in each text file.
Name: Beth
Number: 1234567891
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=u:\your files"
SET "destdir=u:\your results"
SET "filename1=%sourcedir%\q65417881.txt"
rem make sure arrays are empty
For %%b IN (name number) DO FOR /F "delims==" %%a In ('set %%b[ 2^>Nul') DO SET "%%a="
rem Initialise counter and entry array
SET /a count=0
SET "number[0]=dummy"
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
IF DEFINED number[!count!] (SET /a count+=1&SET "name[!count!]=%%a") ELSE (SET "number[!count!]=%%a")
)
rem clear out dummy entry
SET "number[0]="
FOR /L %%c IN (1,1,%count%) DO (
rem replace spaces with dashes
SET "name[%%c]=!name[%%c]: =-!"
rem report to console
ECHO Name: !name[%%c]! Number: !number[%%c]!
rem generate name.txt file
(
ECHO !name[%%c]!
ECHO !number[%%c]!
)>"%destdir%\!name[%%c]!.txt"
)
GOTO :EOF
You would need to change the values assigned to sourcedir and destdir to suit your circumstances. The listing uses a setting that suits my system.
I deliberately include spaces in names to ensure that the spaces are processed correctly.
I used a file named q65417881.txt containing your data for my testing.
The line data read from the file is assigned to %%a, which is assigned to name[!count!] and number[!count!] alternately. The data is retained in these arrays for use by further processing.
[Edited to include conversion of spaces within names to dashes]
If I understand correctly, you want to precede every second line with Number: + SPACE and every other line with Name: + SPACE. For this you do not need to store each line in a variable first; you can use a single for /F loop to read the file line by line and process every line individually. There are two possibilities:
Temporarily precede every line with a line number plus : using findstr /N:
@echo off
rem // Loop through lines and precede each with line number plus `:`:
for /F "tokens=1* delims=:" %%K in ('findstr /N "^" "Test1.txt"') do (
rem // Calculate remainder of division by two:
set /A "MOD=%%K%%2" 2> nul
rem // Toggle delayed expansion to avoid issues with `!`:
setlocal EnableDelayedExpansion
rem // Conditionally return line string with adequate prefix:
if !MOD! neq 0 (
endlocal & echo Name: %%L
) else (
endlocal & echo Number: %%L
)
)
This will fail when a line begins with a :.
Check whether numeric representation of current line string is greater than 0:
@echo off
rem // Loop through (non-empty) lines:
for /F "usebackq delims=" %%L in ("Test1.txt") do (
rem // Determine numeric representation of current line string:
set /A "NUM=%%L" 2> nul
rem // Toggle delayed expansion to avoid issues with `!`:
setlocal EnableDelayedExpansion
rem // Conditionally return line string with adequate prefix:
if !NUM! equ 0 (
endlocal & echo Name: %%L
) else (
endlocal & echo Number: %%L
)
)
This fails when a name begins with numerals and/or when a numeric line is 0.
And just for the sake of posting something different:
@SetLocal EnableExtensions DisableDelayedExpansion & (Set LF=^
% 0x0A %
) & For /F %%G In ('Copy /Z "%~f0" NUL') Do @Set "CR=%%G"
@For /F "Tokens=1,2* Delims=:" %%G In ('%__AppDir__%cmd.exe /D/V/C ^
"%__AppDir__%findstr.exe /NR "^[a-Z]*!CR!!LF![0123456789]" "Test1?.txt" 2>NUL"
') Do @(SetLocal EnableDelayedExpansion
(Set /P "=Name: %%I!CR!!LF!Number: " 0<NUL & Set "_="
For /F Delims^=^ EOL^= %%J In ('%__AppDir__%more.com +%%H "%%G"') Do @(
If Not Defined _ Set "_=_" & Echo %%J)) 1>"%%I.txt" & EndLocal)
This file should be run with the Test1.txt file in the current working directory. It is important that alongside Test1.txt there are no other .txt files with the same basename followed by one other character (for example Test1a.txt or Test12.txt). Should you wish to change your filename, just remember that you must suffix its basename in the above code with a ? character (e.g. MyTextFile.log ⇒ MyTextFile?.log).
I had the rare opportunity to verify that this script worked against the following example Test1.txt file:
Beth
1234567891
Jay
2134456544
Bob
2137856514
Jimmy
4574459540
Mary
3734756547
Gemma
6938456114
Albert
0134056504

How to use Windows batch for a double loop

two files 1.txt 2.txt
1.txt
a2441
b4321
p8763
2.txt
Apple
banana
peach
I want to generate this content in 12.bat
ren a2441 Apple
ren b4321 banana
ren p8763 peach
I tried to write it like this, but it failed
@echo on
for /f %%a in (1.txt) do (
for /f %%b in (2.txt ) do (
ren %%a %%b>>12.bat
)
)
pause
How can I achieve my desired results? Please help, thanks.
It is not so straightforward. You'll need to skip an increasing number of lines in the inner loop, but because of the FOR parsing you'll need an additional subroutine call. And if there are empty lines, they need to be stripped with findstr for more accurate skipping of the lines. Also, the skip argument in FOR does not accept 0, so an if is needed as well.
@echo off
setlocal ENABLEDELAYEDEXPANSION
set /a s=0
for /f "tokens=* delims=" %%a in (1.txt) do (
call :inner !s!
echo ren %%a !second!
set /a s=s+1
)
endlocal
pause
exit /b
:inner
set "second="
if %1==0 (
for /f "tokens=* delims=" %%b in ('findstr /r "[a-zA-Z]" "2.txt"') do (
if not defined second set "second=%%b"
)
) else (
for /f "skip=%1 tokens=* delims=" %%b in ('findstr /r "[a-zA-Z]" "2.txt"' ) do (
if not defined second set "second=%%b"
)
)
The problem with nested loops is that for each line of the outer file/loop all the lines of the inner file/loop are processed.
One simple solution is to number the lines of both files and only generate the output line when both numbers match. Quick and dirty:
@echo off
setlocal enableextensions disabledelayedexpansion
>"12.bat" (
for /f "tokens=1,* delims=:" %%a in ('findstr /n "^" 1.txt') do (
for /f "tokens=1,* delims=:" %%c in ('findstr /n "^" 2.txt') do (
if %%a==%%c (
echo ren %%b %%d
)
))
)
This code just uses findstr to generate line numbers (/n) for each of the input files, and uses the delims and tokens clauses of the for /f command to separate them from the original line contents (note: if the lines could start with a colon, more code is needed). If the line number in the first file matches the line number in the second file, the command is echoed (there is an active output redirection to the final file).
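To make the numbering step easier to follow, here is a small sketch that only shows what findstr /n produces and how for /f splits it. It assumes the 1.txt from the question sits in the current directory; nothing is written to 12.bat.

```batch
@echo off
rem Assumes 1.txt from the question exists in the current directory.
rem findstr /n "^" prints every line (empty ones too) as  number:content
findstr /n "^" 1.txt
rem tokens=1,* with delims=: puts the number in %%a and the rest of the line in %%b
for /f "tokens=1,* delims=:" %%a in ('findstr /n "^" 1.txt') do echo line %%a holds "%%b"
```

Because for /f treats consecutive delimiters as one, a line whose content itself begins with a colon would lose that colon in %%b, which is the case the note above warns about.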
Another option is to assign each file to a input stream and to read them using set /p operations from each of the streams
@echo off
setlocal enableextensions disabledelayedexpansion
call :processFiles 9<"1.txt" 8<"2.txt" >"12.bat"
goto :eof
:processFiles
<&9 set /p f1= || goto :eof
<&8 set /p f2= || goto :eof
echo ren %f1% %f2%
goto :processFiles
In the call to the subroutine we assign each file to a stream. Inside the subroutine each stream is read with set /p (note: set /p can not read lines longer than 1021 characters)

Batch split a text file

I have this batch file to split a txt file:
@echo off
for /f "tokens=1*delims=:" %%a in ('findstr /n "^" "PASSWORD.txt"') do for /f "delims=~" %%c in ("%%~b") do >"text%%a.txt" echo(%%c
pause
It works but it splits it line by line. How do i make it split it every 5000 lines. Thanks in advance.
Edit:
I have just tried this:
@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=passwordAll.txt
REM Edit this value to change the number of lines per file.
SET LPF=50000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile
REM Do not change beyond this line.
SET SFX=%BFN:~-3%
SET /A LineNum=0
SET /A FileNum=1
For /F "delims==" %%l in (%BFN%) Do (
SET /A LineNum+=1
echo %%l >> %SFN%!FileNum!.%SFX%
if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)
)
endlocal
Pause
exit
But i get an error saying: Not enough storage is available to process this command
This will give you a basic skeleton. Adapt as needed.
@echo off
setlocal enableextensions disabledelayedexpansion
set "nLines=5000"
set "line=0"
for /f "usebackq delims=" %%a in ("passwords.txt") do (
set /a "file=line/%nLines%", "line+=1"
setlocal enabledelayedexpansion
for %%b in (!file!) do (
endlocal
>>"passwords_%%b.txt" echo(%%a
)
)
endlocal
EDITED
As the comments indicated, a 4.3GB file is hard to manage. for /f needs to load the full file into memory, and the buffer needed is twice this size as the file is converted to unicode in memory.
This is a fully ad hoc solution. I've not tested it on a file that large, but at least in theory it should work (unless 5000 lines need a lot of memory; it depends on the line length).
AND, with such a file it will be SLOW
@echo off
setlocal enableextensions disabledelayedexpansion
set "line=0"
set "tempFile=%temp%\passwords.tmp"
findstr /n "^" passwords.txt > "%tempFile%"
for /f %%a in ('type passwords.txt ^| find /c /v "" ') do set /a "nFiles=%%a/5000"
for /l %%a in (0 1 %nFiles%) do (
set /a "e1=%%a*5", "e2=e1+1", "e3=e2+1", "e4=e3+1", "e5=e4+1"
setlocal enabledelayedexpansion
if %%a equ 0 (
set "e=/c:"[1-9]:" /c:"[1-9][0-9]:" /c:"[1-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
) else (
set "e=/c:"!e1![0-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
)
for /f "delims=" %%e in ("!e!") do (
endlocal & (for /f "tokens=1,* delims=:" %%b in ('findstr /r /b %%e "%tempFile%"') do @echo(%%c)>passwords_%%a.txt
)
)
del "%tempFile%" >nul 2>nul
endlocal
EDITED, again: Previous code will not correctly work for lines starting with a colon, as it has been used as a delimiter in the for command to separate line numbers from data.
For an alternative, still pure batch but still SLOW
@echo off
setlocal enableextensions disabledelayedexpansion
set "nLines=5000"
set "line=0"
for /f %%a in ('type passwords.txt^|find /c /v ""') do set "fileLines=%%a"
< "passwords.txt" (for /l %%a in (1 1 %fileLines%) do (
set /p "data="
set /a "file=line/%nLines%", "line+=1"
setlocal enabledelayedexpansion
>>"passwords_!file!.txt" echo(!data!
endlocal
))
endlocal
Test this: the input file is "file.txt" and output files are "splitfile-5000.txt" for example.
This uses a helper batch file called findrepl.bat - download from: https://www.dropbox.com/s/rfdldmcb6vwi9xc/findrepl.bat
Place findrepl.bat in the same folder as the batch file or on the path.
@echo off
:: splits file.txt into 5000 line chunks.
set chunks=5000
set /a s=1-chunks
:loop
set /a s=s+chunks
set /a e=s+chunks-1
echo %s% to %e%
call findrepl /o:%s%:%e% <"file.txt" >"splitfile-%e%.txt"
for %%b in ("splitfile-%e%.txt") do (if %%~zb EQU 0 del "splitfile-%e%.txt" & goto :done)
goto :loop
:done
pause
One limitation is the number of lines in the file: the largest usable line number is 2^31 - 1, where batch arithmetic tops out.
@echo off
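The ceiling comes from set /a, which works with signed 32-bit integers. A minimal sketch of what happens at the boundary (the variable names max and wrapped are only for illustration):

```batch
@echo off
rem set /a uses signed 32-bit arithmetic, so 2147483647 is the largest value.
set /a "max=2147483647"
rem Adding 1 does not raise an error; the value wraps around to the negative range.
set /a "wrapped=max+1"
echo max is %max%
echo max+1 is %wrapped%
```

On a standard cmd.exe the second line should show -2147483648, so a start or end line number computed past that point would no longer be meaningful.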
setlocal EnableDelayedExpansion
findstr /N "^" PASSWORD.txt > temp.txt
set part=0
call :splitFile < temp.txt
del temp.txt
goto :EOF
:splitFile
set /A part+=1
(for /L %%i in (1,1,5000) do (
set "line="
set /P line=
if defined line echo(!line:*:=!
)) > text%part%.txt
if defined line goto splitFile
exit /B
If the input file has no empty lines, the previous method may be modified to run faster.
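One way that remark could be read (this is a guess at the intended modification, not the answer's own code): when no line is empty, an empty read can itself signal end of file, so the findstr /N pass, the temporary file, and the prefix removal can all be dropped.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sketch only: assumes PASSWORD.txt contains no empty lines, so an
rem empty read can be treated as end of file. No findstr /N pass is needed.
set part=0
call :splitFile < PASSWORD.txt
goto :EOF

:splitFile
set /A part+=1
(for /L %%i in (1,1,5000) do (
    set "line="
    set /P line=
    if defined line echo(!line!
)) > text%part%.txt
rem line is still defined only if the last read of this chunk returned data
if defined line goto splitFile
exit /B
```

As with the version above, a file whose line count is an exact multiple of 5000 leaves one empty trailing output file, and set /P still cannot read lines longer than 1021 characters.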

how to add space in for /f "tokens=*"

This is my myfile.txt. I want to add a space in the second column, as in this sample:
ARK,LAR SNE,QNE,898,ILO,SNE,SNE,LAR,LAR,545
AUS,MNY P08,TTL,7776,STO,STL,STL,MNY,MNY,567
BOS,MTZ TNK,SDK,444,PPO,TNK,TNK,MTZ,MTZ,456
this is the code I am using
for /f "tokens=* " %%i in (myfile.txt) do call :echo2 %%i %%J %%K %%L %%M %%N %%O %%P %%Q %%R %%S
goto :EOF
:echo2
echo insert ('%1','%2','%3','%4','%5','%6','%7','%8','%9','%10'); >>myfile1.txt
goto :EOF
It displays results, but not with the space where it should be. What am I missing? Any help is appreciated.
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
(
FOR /f "delims=" %%i IN (myfile.txt) DO (
SET "dataline=%%i"
SET "outline="
CALL :adddata
)
)>myfile1.txt
GOTO :EOF
:adddata
FOR /f "tokens=1*delims=," %%p IN ("%dataline%"
) DO SET outline=%outline%'%%p',&SET "dataline=%%q"
IF DEFINED dataline GOTO adddata
ECHO insert (%outline:~0,-1%);
GOTO :eof
This should do the job with no practical limit on columns - provided of course that the comma is reliably an end-of-column delimiter.
For each line in the source file, assign the entire line to dataline and clear outline
then take the first token, delimited by comma, from dataline, quote it,add a comma and append it to outline; then set dataline to the remainder of the line after the first comma.
repeat until there is nothing left in dataline
output the text insert ( + all but the last character of outline (which will be a comma) + );
If I understand you correctly, you want to preserve the spaces in the text between the 1st and 2nd comma, correct? Try this:
@echo off
for /f "tokens=1-10 delims=," %%a in (myfile.txt) do (
>>myfile1.txt echo.insert ('%%a','%%b','%%c','%%d','%%e','%%f','%%g','%%h','%%i','%%j'^);
)
Try this:
@echo off & setlocal
(for /f "delims=" %%i in (myfile.txt) do (
set "line='%%i'"
setlocal enabledelayedexpansion
set "line=!line:,=','!"
set "line=!line: = ','!"
echo(insert (!line!^);
endlocal
))>myfile1.txt
You can't exceed 9 variables so your script won't work after the 9th. You can use for /f to copy each line exactly as the original file like so:
for /f "tokens=* " %%i in (myfile.txt) do echo %%i >>myfile1.txt
goto :EOF
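To illustrate the limit mentioned above: batch only has the positional parameters %0 to %9, so %10 is read as %1 followed by a literal 0. A small sketch (the label :show and the sample arguments are made up for the demonstration):

```batch
@echo off
call :show a b c d e f g h i j
exit /b

:show
rem %10 is parsed as %1 followed by the character 0, so this prints a0, not j.
echo %%10 expands to: %10
rem shift moves every argument down one position; the tenth is now reachable as %9.
shift
echo after shift, %%9 is: %9
exit /b
```

This is why the original :echo2 routine cannot reach its tenth column through '%10'.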

Getting a list of common strings with first occurrence mark

I've got bunch of text files with some content. First I wanted to number the lines globally. Then I extracted all lines that are duplicated somewhere (occur in any of given files at least twice). But now I need to mark all of these lines with the filename and line number of the first occurrence of this line. And now the funny part - it needs to be a windows batch file, using native windows tools. That's why I've got this problem to begin with.
So, to sum it up:
I have a file A with unique strings/lines, each of them is said to occur at least twice in given set of files.
I need to search these files and mark all occurrences of given line from A file with
-file name in which the line first occurred
-line number in this file
This is my code with effort to number lines and format files.
@echo off
setlocal EnableDelayedExpansion
set /a lnum=0
if not [%1]==[] pushd %1
for /r %%F in (*.txt) do call :sub "%%F"
echo Total lines in %Files% files: %Total%
popd
exit /b 0
:Sub
set /a Cnt=0
for /f %%n in ('type %1') do (
set /a Cnt+=1
set /a lnum=!lnum!+1
echo ^<!lnum!^> %%n >> %1_ln.txt && echo ^<!lnum!^> >> %1_ln.txt && echo. >> %1_ln.txt
)
set /a Total+=Cnt
set /a Files+=1
echo %1: %Cnt% lines
@echo off
setlocal EnableDelayedExpansion
set lnum=0
if not "%~1" == "" pushd %1
rem "I've got bunch of text files..." (%%F is file name)
for /r %%F in (*.txt) do call :sub "%%F"
echo Total lines in %Files% files: %lnum%
popd
exit /b 0
:Sub "filename"
set Cnt=0
rem "... with some content." (%%n is line contents)
(for /f "usebackq delims=" %%n in (%1) do (
set /a Cnt+=1
rem "First I wanted to number the lines globally."
set /a lnum+=1
echo ^<!lnum!^> %%n
rem "Then I extracted all lines that are duplicated somewhere" (that were defined before)
if defined line[%%n] (
rem "I need to mark all of these lines with the filename and line number of the first occurrence of this line."
echo ^<!line[%%n]!^>
echo/
) else (
REM (Store the first occurrence of this line with *local* line number and filename)
set line[%%n]=!Cnt!: %1
)
)) > "%~PN1_ln.txt"
set /A Files+=1
echo %1: %Cnt% lines
exit /B
The above Batch program ignores empty lines in the input files and fails if they contain special Batch characters, like ! & < > |; this limitation may be fixed if required.
@ECHO OFF
SETLOCAL
FOR /f "delims=" %%s IN (A) DO (
SET searching=Y
FOR /f "delims=" %%f IN (
'dir /s /b /a-d *.txt') DO IF DEFINED searching (
FOR /f "tokens=1delims=:" %%L IN (
'findstr /b /e /n /l /c:"%%s" ^<"%%f"') DO IF DEFINED searching (
ECHO Line %%L IN "%%f" FOUND "%%s"
SET "searching="
)
)
)
Here's the meat of a routine that should do what you appear to be looking for - and that's as clear as mud.
It looks through the "A" file for each string in turn, assigns the string to %%s and sets the flag searching
Then it looks through the file list, assigning filenames to %%f
Then it executes a findstr to find the /c:"%%s" complete string %%s (including any spaces) in /l or literal mode (ie. not using regular expressions) for a line that both /b and /e begins and ends with the target (ie exactly matches) and /n numbers those lines.
The output of findstr will be in the format linenumber:linecontents, so if this line is examined by the FOR with the option "delims=:", then the portion up to the first : is assigned to %%L
So - %%L contains the line#, %%f the filename, %%s the string
Clearing searching having detected this line by setting its value to [nothing] means it's now NOT DEFINED, hence no further lines will be reported from the current file, and no further filenames will be examined.
Now if you want to get a listing of ALL of the occurrences of the target lines, all you need to do is to REM-out the SET "searching=" line. Searching will then never be reset, so each line in each file is reported.
If you want some other combination, please clarify.
I have absolutely no idea whatever what you mean by "marking" a line.
@ECHO OFF & setlocal
for /f "tokens=1*delims==" %%i in ('set "$" 2^>nul') do set "%%i="
for %%a in (*.txt) do (
for /f %%b in ('find /v /c "" ^<"%%a"') do echo(%%b lines in %%a.
set /a counter=0, files+=1
for /f "usebackqdelims=" %%b in ("%%~a") do (
set /a counter+=1, total+=1
set "line=%%b"
setlocal enabledelayedexpansion
if not defined $!line! set "$!line!=%%a=!counter!=!line!"
for /f "delims=" %%i in ('set "$" 2^>nul') do (if "!"=="" endlocal)& set "%%i"
)
)
echo(%total% lines in %files% files.
for /f "delims=" %%a in (a) do set "#%%a=%%a"
for /f "tokens=2,3*delims==:" %%i in ('set "$" 2^>nul') do (
if defined #%%k echo("%%k" found in %%i at line %%j.
)
Script can handle !&<>|%, but not =.

Resources