Text file string removal/replacement with Windows batch file - windows

I want to remove exactly the three numbers after each dot or, if it's easier, everything after . and before ; using a Windows batch script.
Before
ABC;CDEF;GH;123:456.XXX;EFG;789:123.XXXABC;CDEF;GH;123:456.XXX;EFG;789:123.XXX...
After
ABC;CDEF;GH;123:456;EFG;789:123ABC;CDEF;GH;123:456;EFG;789:123...
I've been trying something with myVar:~0,-4 but I don't know how to use it when replacing with other string:
set "str=;"
for /f "tokens=* delims=." %%A in (%input%) do set myVar=%%A & echo !myVar:%myVar:~0,-4%=%str%! >> %output%

Using your examples this method doesn't need to know the token count.
@Echo Off
SetLocal EnableDelayedExpansion
Copy/Y "input.txt" "output.txt">Nul
(For /F "UseBackQDelims=" %%A In ("output.txt") Do (Set "OL="
For %%B In (%%A) Do Set "OL=!OL!;%%~nB"
Echo=!OL:~1!))>"input.txt"
::Del "output.txt"
Pause
If you're happy with the content of input.txt you may remove the first two characters, :: from line 7. If not, don't worry nothing is lost, you can delete input.txt and rename output.txt to input.txt.
I have used input.txt and output.txt, please adjust those two names as necessary

for /f "tokens=1-3 delims=." %%A in (%input%) do set "myVar1=%%A"&set "myVar2=%%B"&set "myVar3=%%C" & echo !myVar1!!myVar2:~3!!myVar3:~3! >> %output%
or
for /f "tokens=1-3 delims=." %%A in (%input%) do set "myVar2=%%B"&set "myVar3=%%C" & echo %%A!myVar2:~3!!myVar3:~3! >> %output%
would be my first attempt - assuming your structure contains exactly 2 .s which are each followed by 3 characters to be removed.

This method works with any number of dots placed at any positions:
@echo off
setlocal EnableDelayedExpansion
rem Get file lines
(for /F "delims=" %%a in (input.txt) do (
rem Split line at dots
set "line=%%a"
set "myVar="
for %%b in ("!line:.=" "!") do (
if not defined myVar (
rem Copy first part
set "myVar=%%~b"
) else (
rem Eliminate first three chars from rest of parts
set "part=%%~b"
set "myVar=!myVar!!part:~3!"
)
)
echo !myVar!
)) > output.txt
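To cross-check what any of the batch scripts above should produce, here is a minimal sketch of the same transformation in Python rather than batch. The function name `strip_after_dots` is made up for illustration, and the sketch assumes, as the last answer does, that every dot is followed by exactly three characters to drop.

```python
def strip_after_dots(line, drop=3):
    """Remove `drop` characters after every dot, and the dot itself."""
    # Split at dots: the first part is kept whole, every later part
    # loses its first `drop` characters (same idea as !part:~3! above).
    first, *rest = line.split(".")
    return first + "".join(part[drop:] for part in rest)


if __name__ == "__main__":
    sample = "ABC;CDEF;GH;123:456.XXX;EFG;789:123.XXX"
    print(strip_after_dots(sample))
```

Running it on the sample line from the question should give the "After" form shown there.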

Related

How to extract the last word from the last line of a TXT file through a batch script

I am trying to extract the last word from the last line of a txt file.
The result I want is just Cup$2!.
This is what I tried:
@echo off
SetLocal EnableDelayedExpansion
set L=1
for /F "tokens=2 delims=" %%a in (corner.txt) do (
set line=%%a
if !L!==7 set Line7=%%a
set /a L=!L!+1
)
echo The word is %Line7%
pause
The result I'm getting is The word is.
What should I edit to get the above result?
Get line count.
for /f "tokens=3*" %%i in ('find /c /v /n /i"" corner.txt') do set /a v=%%i-1
Then get the last values from 7-th word of the last line:
for /f "tokens=7*" %%a in ('more corner.txt +%v%') do set "String="%%b""
Variable %String% keeps the values framed by double quotes: Cup&2!/Cup$2!
If you use / as delimiter you can get last value:
for /f "delims=/ tokens=2*" %%a in (%String%) do @echo %%a
Here's a quick example of how you could capture the substring you require, from reading your code, i.e. the last substring on the seventh line:
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "var="
For /F UseBackQ^ Skip^=6^ Delims^=^ EOL^= %%G In ("corner.txt") Do Set "var=%%~nxG" & GoTo Next
If Not Defined var GoTo EndIt
:Next
SetLocal EnableDelayedExpansion
Echo The word is !var!
EndLocal
:EndIt
Echo Press any key to close . . .
Pause 1>NUL
EndLocal
Exit /B
If however, as your question title and body asks, you want the last substring of the last line, it's a little bit simpler:
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "var="
For /F UseBackQ^ Delims^=^ EOL^= %%G In ("corner.txt") Do Set "var=%%~nxG"
If Not Defined var GoTo EndIt
SetLocal EnableDelayedExpansion
Echo The word is !var!
EndLocal
:EndIt
Echo Press any key to close . . .
Pause 1>NUL
EndLocal
Exit /B
I will not however be explaining any of it, please use the search facility at the top of the page, and the output from the built in help for each of the commands I have used.
Parsing strings in batch files is never ideal but this seems to work:
@echo off
setlocal ENABLEEXTENSIONS DISABLEDELAYEDEXPANSION
echo.line 1 > "%temp%\SO_Test.txt"
echo. >> "%temp%\SO_Test.txt"
echo.foo^&bar!baz hello/world Cup^&2!/Cup$2!>> "%temp%\SO_Test.txt"
goto start
:start
set "line="
for /F "usebackq delims=" %%a in ("%temp%\SO_Test.txt") do @set line=%%a
:word
set "b="
rem Delims is slash, tab, space and the order is important
for /F "tokens=1,* delims=/ " %%A in ("%line%") do (
set "line=%%A"
set "b=%%B"
)
if not "%b%" == "" (
set "line=%b%"
goto word
)
echo.Last word is "%line%"
@ECHO OFF
SETLOCAL
rem The following settings for the source directory, destination directory, target directory,
rem batch directory, filenames, output filename and temporary filename [if shown] are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q70761955.txt"
SET "amper=&"
FOR /f "usebackq delims=" %%b IN ("%filename1%") DO SET "lastword=%%b"
:: Replace ampersands with spaces, then slashes with spaces
CALL SET "lastword=%%lastword:%amper%= %%"
SET "lastword=%lastword:/= %"
FOR %%b IN (%lastword%) DO SET "lastword=%%b"
ECHO last "word" is -^>%lastword%^<-
GOTO :EOF
Read each line of the file, retaining the last-read line in lastword.
Replace each "delimiter" character (& and / are implied in the question. Others - no information.)
Select the last "word" in the last line.
You would need to change the value assigned to sourcedir to suit your circumstances. The listing uses a setting that suits my system.
I deliberately include spaces in names to ensure that the spaces are processed correctly.
I used a file named q70761955.txt containing your data for my testing.
You could use this in a batch-file run by cmd on windows.
FOR /F "delims=" %%A IN ('powershell -NoLogo -NoProfile -Command ^
"((Get-Content -Path 'corner.txt' -Last 1) -split '[\s*/]')[-1]"') DO (SET "LINE7=%%A")
ECHO %LINE7%
If the script were written in PowerShell instead of cmd language, it could be a one-liner.
$Line7 = ((Get-Content -Path 'corner.txt' -Last 1) -split '[\s*/]')[-1]
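As a way to sanity-check the expected result outside cmd, here is a rough Python sketch of the same idea as the PowerShell one-liner: take the last non-empty line and split it on whitespace and slashes. The function name `last_word` is hypothetical, and the set of delimiters (whitespace and `/`) is an assumption based on the sample data in the question.

```python
import re


def last_word(text):
    """Return the last word of the last non-empty line of `text`."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ""
    # Split on runs of whitespace or slashes and drop empty pieces,
    # so a trailing delimiter does not yield an empty "word".
    words = [w for w in re.split(r"[\s/]+", lines[-1]) if w]
    return words[-1] if words else ""


if __name__ == "__main__":
    sample = "line 1\n\nfoo&bar!baz hello/world Cup&2!/Cup$2!\n"
    print(last_word(sample))
```

To use it on a real file, read corner.txt into a string first and pass that string in.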

Reading line by line from one file and write to another file using batch script

In the code below I am trying to fetch the line number of the string "AXX0000XXXA" from the file data.txt, then read the file line by line and write each line to target.txt. When the loop reaches the matching line number, I add one more line from the file temp.txt. The code works fine with a small number of records (tested with 150 lines, file size 100 KB), but when I process 50K records (file size 25 MB) it takes more than 25 minutes. Could you please help me process the same file in less time?
@echo off
setlocal enabledelayedexpansion
for /f "delims=:" %%a in ('findstr /n "AXX0000XXXA" "C:\Users\23456\Desktop\data.txt"') do (set find_line=%%a)
set /a counter=0
for /f "usebackq delims=" %%b in (`"findstr /n ^^ C:\Users\23456\Desktop\data.txt"`) do (
set curr_line=%%b
set /a counter=!counter!+1
if !counter! equ !find_line! (
type temp.txt >> target.txt
)
call :print_line curr_line
)
endlocal
:print_line
setlocal enabledelayedexpansion
set line=!%1!
set line=!line:*:=!
echo !line!>>target.txt
endlocal
Your code uses three batch constructs that are inherently slow: the call command, the >> append redirection and setlocal/endlocal, and each of them is executed once per file line! It would be faster to move the subroutine body into the main loop to avoid the call and setlocal commands. Also, each echo !line!>>target.txt command has to open the file, seek to the end, append the data and close the file, so it is faster to use the construct (for ...) > target.txt, which opens the file just once. An example of code with such changes is in Compo's answer.
This is another method to solve this problem that may run faster when the search line is placed towards the beginning of the file:
@echo off
setlocal enabledelayedexpansion
for /f "delims=:" %%a in ('findstr /n "AXX0000XXXA" "C:\Users\23456\Desktop\data.txt"') do (set /A find_line=%%a-1)
call :processFile < "C:\Users\23456\Desktop\data.txt" > target.txt
goto :EOF
:processFile
rem Duplicate the first %find_line%-1 lines
for /L %%i in (1,1,%find_line%) do (
set /P "line="
echo !line!
)
rem Insert the additional line
type temp.txt
rem Copy the rest of lines
findstr ^^
exit /B
This should create target.txt with content matching data.txt except for an inserted line taken from temp.txt immediately above the line matching the search string, AXX0000XXXA.
@Echo Off
Set "fSrc=C:\Users\23456\Desktop\data.txt"
Set "iSrc=temp.txt"
Set "sStr=AXX0000XXXA"
Set "fDst=target.txt"
Set "iStr="
Set/P "iStr="<"%iSrc%" 2>Nul
If Not Defined iStr Exit/B
Set "nStr="
For /F "Delims=:" %%A In ('FindStr/N "%sStr%" "%fSrc%" 2^>Nul') Do Set "nStr=%%A"
If Not Defined nStr Exit/B
( For /F "Tokens=1*Delims=:" %%A In ('FindStr/N "^" "%fSrc%"') Do (
If "%%A"=="%nStr%" Echo %iStr%
Echo %%B))>"%fDst%"
I have made it easy for you to change your variable data: you only need to alter the four Set lines at the top (fSrc, iSrc, sStr and fDst).
I have assumed that this was your intention, your question was not clear, please accept my apologies if I have assumed incorrectly.
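For comparison, the logic all three answers share (find the matching line number, then copy the file once and emit the extra text just before that line) can be sketched in Python as below. This is only an illustration, not a replacement for the batch code: the function name `insert_before_match` is invented here, and it assumes that when several lines match, the last one wins, which is what the FOR /F loops above do by overwriting the line number on every match.

```python
def insert_before_match(lines, needle, extra_lines):
    """Return a copy of `lines` with `extra_lines` inserted before the last
    line that contains `needle`; unchanged if nothing matches."""
    lines = list(lines)
    match_index = None
    for index, line in enumerate(lines):
        if needle in line:
            # Keep overwriting, so the final match is the one used.
            match_index = index
    if match_index is None:
        return lines
    return lines[:match_index] + list(extra_lines) + lines[match_index:]


if __name__ == "__main__":
    data = ["first", "record AXX0000XXXA", "last"]
    for line in insert_before_match(data, "AXX0000XXXA", ["inserted"]):
        print(line)
```

The point of the single pass is the same as in the batch answers: the output is written once, front to back, instead of being reopened for every line.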

Parse and Rename text files

I need to rename all of these files to the 6 character Part Number(306391) on the 3rd line.
Currently I have:
setlocal enabledelayedexpansion
set first=1
for /f "skip=3 delims= " %%a in (Name.txt) do (
if !first! ==1 (
set first=0
echo %%a > out.txt
ren Name.txt %%a.txt
)
)
This finds the 6-digit part number and renames the file to the correct name, but it breaks if I use *.txt instead of the actual name of the .txt file. I need it to work for all .txt files in the directory.
Surround your for /f loop with another for loop, then reference the outer loop variable in your ren command. You can also eliminate the need for delayed expansion by using if defined for boolean checks. I put in other tweaks here and there. Just ask if you want details.
@echo off
setlocal
for %%I in (*.txt) do (
set first=
for /f "usebackq skip=3" %%a in ("%%~fI") do (
if not defined first (
set first=1
echo %%~nxI ^> %%a.txt
rem // uncomment this when you're satisfied that it works correctly
rem // ren "%%~fI" "%%a.txt"
)
)
)
This method should run faster, especially if the files are large, because only 4 lines of each file are read (instead of the whole file):
@echo off
setlocal EnableDelayedExpansion
rem Process all .txt files
for %%f in (*.txt) do (
rem Read the 4th line
(for /L %%i in (1,1,4) do set /P "line4=") < "%%f"
rem Rename the file
for /F %%a in ("!line4!") do ECHO ren "%%f" "%%a.txt"
)
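Both batch answers above print the planned rename before doing it. The same dry-run idea can be sketched in Python to check which names would be produced. Everything here is illustrative: the function names are made up, the input is a plain dict of file name to file text rather than real files, and the part number is assumed to be the first whitespace-separated token on the 4th line, as in the second answer.

```python
def part_number_from_lines(lines, line_number=4):
    """Return the first whitespace-separated token on the given 1-based
    line, or None if the line is missing or blank."""
    if len(lines) < line_number:
        return None
    tokens = lines[line_number - 1].split()
    return tokens[0] if tokens else None


def planned_renames(files):
    """Map each original name to '<token>.txt'. `files` is {name: text}.
    Nothing is renamed; this only builds the plan, like ECHO ren above."""
    plan = {}
    for name, text in files.items():
        token = part_number_from_lines(text.splitlines())
        if token is not None:
            plan[name] = token + ".txt"
    return plan


if __name__ == "__main__":
    sample = {"Name.txt": "header 1\nheader 2\nheader 3\n306391 widget\nmore"}
    print(planned_renames(sample))
```

Files with fewer than four lines, or with a blank 4th line, are simply left out of the plan in this sketch.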

Windows CMD FOR loop

I'm trying to write code that collects the first word from every line of HELP's output into a variable and then echoes that variable. Here is my code:
@echo off
set a=
for /F "tokens=1,*" %%i in ('help') do (
set a=%a% %%i
)
echo %a%
But it returns the first word of only the last line. Why?
Bali C solved your problem as stated, but it looks to me like you are trying to get a list of commands found in HELP.
Some of the commands appear on multiple lines, so you get some extraneous words. Also there is a leading and trailing line beginning with "For" on an English machine that is not wanted.
Here is a short script for an English machine that will build a list of commands. The FINDSTR command will have to change for different languages.
@echo off
setlocal enableDelayedExpansion
set "cmds="
for /f "eol= delims=." %%A in ('help^|findstr /bv "For"') do (
for /f %%B in ("%%A") do set "cmds=!cmds! %%B"
)
set "cmds=%cmds:~1%"
echo %cmds%
EDIT
Ansgar Wiechers came up with a more efficient algorithm to extract just the command names at https://stackoverflow.com/a/12733642/1012053 that I believe should work with all languages. I've used his idea to simplify the code below.
@echo off
setlocal enableDelayedExpansion
set "cmds="
for /f %%A in ('help^|findstr /brc:"[A-Z][A-Z]* "') do set "cmds=!cmds! %%A"
set "cmds=%cmds:~1%"
echo %cmds%
You need to use delayed expansion in your for loop
@echo off
setlocal enabledelayedexpansion
set a=
for /F "tokens=1,*" %%i in ('help') do (
set a=!a! %%i
)
echo %a%
Instead of using %'s around the a variable, you use !'s to use delayed expansion.
Because the echo is outside the do ( ...... )
@echo off
for /F "tokens=1,*" %%i in ('help') do (
echo %%i
)
and no need to print a, you can use directly %%i.
Another very simple example could be a batch like this saved as help1.cmd
@echo off
for /F "tokens=1,*" %%i in ('help') do (
if /I "%%i" EQU "%1" echo %%j
)
and you call this batch like
help1 MKDIR
to get the short help text for the MKDIR command

Batch to remove duplicate rows from text file

Is it possible to remove duplicate rows from a text file? If yes, how?
Sure can, but like most text file processing with batch, it is not pretty, and it is not particularly fast.
This solution ignores case when looking for duplicates, and it sorts the lines. The name of the file is passed in as the 1st and only argument to the batch script.
@echo off
setlocal disableDelayedExpansion
set "file=%~1"
set "sorted=%file%.sorted"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
sort "%file%" >"%sorted%"
>"%deduped%" (
set "prev="
for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
set "ln=%%A"
setlocal enableDelayedExpansion
if /i "!ln!" neq "!prev!" (
endlocal
(echo %%A)
set "prev=%%A"
) else endlocal
)
)
>nul move /y "%deduped%" "%file%"
del "%sorted%"
This solution is case sensitive and it leaves the lines in the original order (except for duplicates of course). Again the name of the file is passed in as the 1st and only argument.
@echo off
setlocal disableDelayedExpansion
set "file=%~1"
set "line=%file%.line"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
>"%deduped%" (
for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%file%") do (
set "ln=%%A"
setlocal enableDelayedExpansion
>"%line%" (echo !ln:\=\\!)
>nul findstr /xlg:"%line%" "%deduped%" || (echo !ln!)
endlocal
)
)
>nul move /y "%deduped%" "%file%"
2>nul del "%line%"
EDIT
Both solutions above strip blank lines. I didn't think blank lines were worth preserving when talking about distinct values.
I've modified both solutions to disable the FOR /F "EOL" option so that all non-blank lines are preserved, regardless what the 1st character is. The modified code sets the EOL option to a linefeed character.
New solution 2016-04-13: JSORT.BAT
You can use my JSORT.BAT hybrid JScript/batch utility to efficiently sort and remove duplicate lines with a simple one liner (plus a MOVE to overwrite the original file with the final result). JSORT is pure script that runs natively on any Windows machine from XP onward.
@jsort file.txt /u >file.txt.new
@move /y file.txt.new file.txt >nul
you may use uniq http://en.wikipedia.org/wiki/Uniq from UnxUtils http://sourceforge.net/projects/unxutils/
Some time ago I found an unexpectedly simple solution, but unfortunately it only works on Windows 10: the sort command features some undocumented options that can be adopted:
/UNIQ[UE] to output only unique lines;
/C[ASE_SENSITIVE] to sort case-sensitively;
So use the following line of code to remove duplicate lines (remove /C to do that in a case-insensitive manner):
sort /C /UNIQUE "incoming.txt" /O "outgoing.txt"
This removes duplicate lines from the text in incoming.txt and provides the result in outgoing.txt. Note that the original order is of course not going to be preserved (because, well, this is the main purpose of sort).
However, you should use these options with care as there might be some (un)known issues with them, because there is possibly a good reason for them not to be documented (so far).
The Batch file below does what you want:
@echo off
setlocal EnableDelayedExpansion
set "prevLine="
for /F "delims=" %%a in (theFile.txt) do (
if "%%a" neq "!prevLine!" (
echo %%a
set "prevLine=%%a"
)
)
If you need a more efficient method, try this Batch-JScript hybrid script that is developed as a filter, that is, similar to Unix uniq program. Save it with .bat extension, like uniq.bat:
@if (@CodeSection == @Batch) @then
@CScript //nologo //E:JScript "%~F0" & goto :EOF
@end
var line, prevLine = "";
while ( ! WScript.Stdin.AtEndOfStream ) {
line = WScript.Stdin.ReadLine();
if ( line != prevLine ) {
WScript.Stdout.WriteLine(line);
prevLine = line;
}
}
Both programs were copied from this post.
set "file=%CD%\%1"
sort "%file%">"%file%.sorted"
del /q "%file%"
SETLOCAL EnableDelayedExpansion
FOR /F "tokens=*" %%A IN (%file%.sorted) DO (
if not [%%A]==[!LN!] (
set "ln=%%A"
echo %%A>>"%file%"
)
)
ENDLOCAL
del /q "%file%.sorted"
This should work exactly the same. That dbenham example seemed way too hardcore for me, so, tested my own solution. usage ex.: filedup.cmd filename.ext
Pure batch - 3 effective lines.
@ECHO OFF
SETLOCAL
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "delims=" %%a IN (q34223624.txt) DO SET $%%a=Y
(FOR /F "delims=$=" %%a In ('set $ 2^>Nul') DO ECHO %%a)>u:\resultfile.txt
GOTO :EOF
Works happily if the data does not contain characters to which batch has a sensitivity.
"q34223624.txt" because question 34223624 contained this data
1.1.1.1
1.1.1.1
1.1.1.1
1.2.1.2
1.2.1.2
1.2.1.2
1.3.1.3
1.3.1.3
1.3.1.3
on which it works perfectly.
I came across this issue and had to resolve it myself because the use case was particular to my needs.
I needed to find duplicate URLs, and the order of lines was relevant so it needed to be preserved. The lines of text should not contain any double quotes and should not be very long, and sorting cannot be used.
Thus I did this:
setlocal enabledelayedexpansion
type nul>unique.txt
for /F "tokens=*" %%i in (list.txt) do (
find "%%i" unique.txt 1>nul
if !errorlevel! NEQ 0 (
echo %%i>>unique.txt
)
)
Auxiliary: if the text does contain double quotes then the FIND needs to use a filtered set variable as described in this post: Escape double quotes in parameter
So instead of:
find "%%i" unique.txt 1>nul
it would be more like:
set test=%%i
set test=!test:"=""!
find "!test!" unique.txt 1>nul
Thus find will look like find """what""" file and %%i will be unchanged.
I have used a fake "array" to accomplish this
@echo off
:: filter out all duplicate ip addresses
REM your file would take the place of %1
set file=%1
if [%1]==[] goto :EOF
setlocal EnableDelayedExpansion
set size=0
set cond=false
set max=0
for /F %%a IN ('type %file%') do (
if [!size!]==[0] (
set cond=true
set /a size="size+1"
set arr[!size!]=%%a
) ELSE (
call :inner
if [!cond!]==[true] (
set /a size="size+1"
set arr[!size!]=%%a&& ECHO > NUL
)
)
)
break> %file%
:: destroys old output
for /L %%b in (1,1,!size!) do echo !arr[%%b]!>> %file%
endlocal
goto :eof
:inner
for /L %%b in (1,1,!size!) do (
if "%%a" neq "!arr[%%b]!" (set cond=true) ELSE (set cond=false&&goto :break)
)
:break
The use of a label for the inner loop is something specific to cmd.exe and is the only way I have been successful in nesting for loops within each other. Basically this compares each new value read from the file against the values already stored, and if there is no match the program adds the value to memory. When it is done it destroys the target file's contents and replaces them with the unique strings.
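The answers in this thread fall into two families: keep the first occurrence of each line in the original order (case-sensitive), or sort first and drop adjacent duplicates (optionally ignoring case). A short Python sketch of both is given here purely as a reference for checking a batch script's output. The function names are invented for this example, and Python's sort order is not guaranteed to match the Windows sort command, so only the set of surviving lines should be compared in the sorted case.

```python
def dedupe_keep_order(lines):
    """Case-sensitive: keep the first occurrence of each line, in order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


def dedupe_sorted_ignore_case(lines):
    """Sort ignoring case, then drop lines equal (ignoring case) to the
    previously kept line, like the sort-then-compare batch solution."""
    result = []
    previous = None
    for line in sorted(lines, key=str.lower):
        if previous is None or line.lower() != previous.lower():
            result.append(line)
            previous = line
    return result


if __name__ == "__main__":
    data = ["1.1.1.1", "1.1.1.1", "1.2.1.2", "1.2.1.2", "1.3.1.3"]
    print(dedupe_keep_order(data))
```

Unlike the FIND-based batch loop above, the order-preserving version does one set lookup per line instead of rescanning the output file each time.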
