Remove multi-line strings from a text file using a batch script - windows

I am trying to create a batch file that will edit a text file to remove lines that contain a certain string and remove the line directly after that.
An example of this file would look like this:
LINE ENTRY KEEP_1 BLA BLA
END
LINE ENTRY REMOVE_1 FOO BAR
END
LINE ENTRY REMOVE_2 HELLO WORLD
END
LINE ENTRY KEEP_2 CAT DOG
END
After running the batch script I require the new file to contain
LINE ENTRY KEEP_1 BLA BLA
END
LINE ENTRY KEEP_2 CAT DOG
END
where any line containing REMOVE_ has been deleted, as well as the corresponding 'END' line.
I have tried using the technique found here to remove the lines containing the string, but it does not appear to be possible to include characters such as \r\n in the search so that it also matches the 'END' line. I can't do this as two separate FINDSTR commands, as I still need the 'END' text to be kept for the other two entries.
Using findstr /v REMOVE_ leaves me with the following:
LINE ENTRY KEEP_1 BLA BLA
END
END
END
LINE ENTRY KEEP_2 CAT DOG
END
and using findstr /v "REMOVE_*\r\nEnd" does not seem to work at all.
Just to confirm, each line is definitely terminated with \r\n.
Any help on this issue would be greatly appreciated.

The following batch script should do what you want:
@echo off
setlocal enabledelayedexpansion
set /A REMOVE_COUNT=1
if "%~2"=="" (
    echo Usage: %~n0 search_str file
    echo remove lines that contain a search_str and remove %REMOVE_COUNT% line^(s^) directly after that
    exit /b 1
)
set "SEARCH_STR=%~1"
set "SRC_FILE=%~2"
set /A SKIP_COUNT=0
for /F "skip=2 delims=[] tokens=1,*" %%I in ('find /v /n "" "%SRC_FILE%"') do (
    if !SKIP_COUNT! EQU 0 (
        set SRC_LINE=%%J
        if defined SRC_LINE (
            if "!SRC_LINE:%SEARCH_STR%=!" == "!SRC_LINE!" (
                echo.!SRC_LINE!
            ) else (
                set /A SKIP_COUNT=%REMOVE_COUNT%
            )
        ) else (
            rem SRC_LINE is empty
            echo.
        )
    ) else (
        set /A SKIP_COUNT-=1
    )
)
The number of lines to be removed after a matched line can be configured by setting the REMOVE_COUNT variable.
The script also handles files with empty lines correctly by using a trick: The find command is used to prefix all lines with line numbers. That way the for command will not skip empty lines.
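For comparison only, here is the same skip-counter logic as a minimal Python sketch. It is not part of the answer above; the function name drop_matches and its parameters are made up for illustration, and it assumes the lines have already been read into a list without their line terminators.

```python
def drop_matches(lines, needle, remove_count=1):
    """Drop every line containing needle plus the remove_count lines after it."""
    kept = []
    skip = 0  # how many upcoming lines still have to be dropped
    for line in lines:
        if skip > 0:
            skip -= 1              # line directly after a match: drop it
        elif needle in line:
            skip = remove_count    # matching line: drop it and arm the counter
        else:
            kept.append(line)      # everything else, including empty lines, is kept
    return kept


sample = [
    "LINE ENTRY KEEP_1 BLA BLA", "END",
    "LINE ENTRY REMOVE_1 FOO BAR", "END",
    "LINE ENTRY REMOVE_2 HELLO WORLD", "END",
    "LINE ENTRY KEEP_2 CAT DOG", "END",
]
print("\n".join(drop_matches(sample, "REMOVE_")))
```

As in the batch script, a line that is being skipped is not checked for the search string, and empty lines pass through untouched.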

findstr operates line-wise. You cannot do anything with it that spans more than a single line.
In any case, you're in for a world of pain if you do this with batch files. While you certainly can loop through the file and only output certain lines, this would look kinda like the following:
set remove=
for /f "delims=" %%x in (file.txt) do (
    if not defined remove (
        echo %%x|findstr "REMOVE" >nul 2>&1 && set remove=1
        if not defined remove echo.%%x
    ) else (
        set remove=
    )
)
(untested, but might work). The problem here is twofold: for /f removes any empty lines from the output so if your file had them before you won't have them afterwards. This may or may not be a problem for your specific case. Another problem is that dealing with special characters can get hairy. I give no guarantee that the above works as it should for things like >, <, &, |, ...
Your best bet in this case, if you need to run it on almost any Windows machine, would probably be a VBScript. The string handling capabilities are much more robust there.

Related

how to replace one line in text file without removing empty lines in batch

I am trying to write a simple batch script that can find and replace a line.
So far I've found a snippet that works perfectly fine for my purpose; the only problem is that it removes empty lines,
and I can't figure out why!
I've tried adding another if statement to the for loop, but that failed.
I also found that there is a batch file called JREPL. I tried running a few simple commands from its docs and failed again XD
here is the snippet:
:Variables
set InputFile=t.txt
set OutputFile=t-new.txt
set _strFind= old "data"
set _strInsert= new "data";
:Replace
>"%OutputFile%" (
for /f "usebackq delims=" %%A in ("%InputFile%") do (
if "%%A" equ "%_strFind%" (echo %_strInsert%) else (echo %%A)
)
)
I was expecting that this snippet wouldn't remove my empty lines,
and I can't figure out why it does.
I am posting this without testing, as I do not have an environment to test in right now.
To explain your issue: for /f omits empty lines, as it is built that way. It is the same as setting a variable to nothing and expecting it to return a result. So we make sure that no line is empty by adding some additional characters to every line (find /v /n "" prefixes each line with its line number in square brackets), and then just get rid of those characters once we have read the line. So here goes:
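@echo off
setlocal enabledelayedexpansion
set inputfile=t.txt
set outputfile=t-new.txt
set _strfind=old "data"
set _strinsert=new "data";
rem find /v /n prefixes every line with a line number in square brackets, so for /f sees no empty lines.
rem Note: the trailing break command empties the input file once it has been read.
for /f "tokens=*" %%a in ('type "%inputfile%" ^| find /v /n "" ^& break ^> "%inputfile%"') do (
    set "str=%%a"
    rem Strip the line-number prefix again, up to and including the first closing bracket.
    set "str=!str:*]=!"
    if "!str!"=="%_strfind%" set "str=%_strinsert%"
    >>"%outputfile%" echo(!str!
)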
That should write to the output file. You can, however, make the output file the same as the input file, which amounts to replacing the text in place in the original file. Once I am able to test, I will fix the answer if there are any issues with it.
As a side note, be careful of where you have additional whitespace in your variables you set. For instance:
set a = b
has two issues. First, the variable name will include the space after a, so the variable has to be referenced as:
%a %
Second, the value of the variable will start with a leading space, so where you expected b as the value, it is in fact b preceded by a space.
Lastly, it is always a good idea to enclose the assignment in double quotes, simply again to eliminate stray whitespace, because:
set a=b
can contain a space at the end of the line that you cannot see with the naked eye, so doing a direct match like:
if "b"=="b"
will evaluate to false, as in fact we have:
if "b"=="b "
So the correct way to set and compare the variable would be:
set "a=b"
if "%a%"=="b"
which will be a perfect match.
Note: I posted this from my phone, so I will resolve any spelling, grammar and code issues as I go through my answer.
…and one way using JREPL
JRepl "old \qdata\q" "new \qdata\q;" /I /XSEQ /F "t.txt" /O "t-new.txt"
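For comparison only, the whole-line replacement from the question can be sketched in Python as below. This is not one of the answers above; the function name replace_line is hypothetical, and the sketch assumes the file content has been split into a list of lines. Empty lines survive simply because nothing filters them out.

```python
def replace_line(lines, find, insert):
    """Replace every line that is exactly equal to find; keep all other lines."""
    return [insert if line == find else line for line in lines]


lines = ["first", "", 'old "data"', "", "last"]
for line in replace_line(lines, 'old "data"', 'new "data";'):
    print(line)
```

Like the batch loop, this compares whole lines; it does not replace a match inside a longer line the way the JREPL command does.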

CMD Batch - Search for last occurrence of character while looping through file

I have a .txt file which I loop through line by line and spool to another file. OK, no problem so far. But I do NOT want to spool lines that meet the following criteria:
They contain several slashes. Find the last slash, then search the rest of the string after it for .*** (* = wildcard). If it is not found, don't spool the line; otherwise spool it.
Input file content for example:
c:/abc/abc/
c:/abc/abc/test.txt
c:/eee/
c:/eee/test.cfg
c:/test/abc/test/xxx/bbb/ccc/aaa/test.txt
c:/test/abc/test/xxx/bbb/ccc/aaa/
Output should look like:
c:/abc/abc/test.txt
c:/eee/test.cfg
c:/test/abc/test/xxx/bbb/ccc/aaa/test.txt
The lines that should be removed do not appear at fixed positions. So I thought about finding the last slash, taking everything after it and checking whether it ends in ".***". If so, keep the line; otherwise don't echo it.
I don't want to use other tools for this. It must be done via native command-line functionality.
Maybe somebody can help me out.
Code:
>OUTPUT.txt (
    FOR /F "usebackq delims=" %%I IN ("FILE.txt") DO (
        set "line=%%I"
        setlocal enabledelayedexpansion
        rem DO SOMETHING HERE, I DO NOT KNOW HOW TO DO IT
        echo(!line!
        endlocal
    )
)
Just do it in one line using findstr in its regular expression mode (\....$ matches every line ending with a . followed by 3 characters):
findstr /R \....$ FILE.txt
result:
c:/abc/abc/test.txt
c:/eee/test.cfg
c:/test/abc/test/xxx/bbb/ccc/aaa/test.txt
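For comparison only, the check described in the question (take everything after the last slash and see whether it ends in a dot plus three characters) might look like this in Python. This is a sketch, not part of the answer above; the function name has_extension is hypothetical.

```python
import re

# A dot followed by exactly three arbitrary characters at the end of the string,
# the same pattern the findstr command above uses.
EXTENSION = re.compile(r"\....$")


def has_extension(path):
    """True if the part after the last slash ends in a dot plus three characters."""
    tail = path.rsplit("/", 1)[-1]
    return EXTENSION.search(tail) is not None


paths = [
    "c:/abc/abc/",
    "c:/abc/abc/test.txt",
    "c:/eee/",
    "c:/eee/test.cfg",
    "c:/test/abc/test/xxx/bbb/ccc/aaa/test.txt",
    "c:/test/abc/test/xxx/bbb/ccc/aaa/",
]
for path in paths:
    if has_extension(path):
        print(path)
```

Unlike the findstr one-liner, this variant only looks at the text after the last slash, so a dot inside a directory name cannot produce a match.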

Batch file to process csv document to add space in postcode field

I have a csv file populated with name, address, and postcode. A large number of the postcodes do not have the required space in between, e.g. LU79GH should be LU7 9GH and W13TP should be W1 3TP. I need to add a space in each postcode field if it is not there already; the space should always be before the last 3 characters.
What is the best way to solve this via windows command line?
Many Thanks
You can do this with for /f as follows:
@echo off
setlocal enabledelayedexpansion
if "%~1" equ "" (echo.%~0: usage: missing file name.& exit /b 1)
if "%~2" neq "" (echo.%~0: usage: too many arguments.& exit /b 1)
for /f %%i in (%~1) do (echo.%%i& goto :afterheader)
:afterheader
for /f "skip=1 tokens=1-3 delims=," %%i in (%~1) do (
    set name=%%i
    set address=%%j
    set postcode=%%k
    set postcode=!postcode: =!
    echo.!name!,!address!,!postcode:~0,-3! !postcode:~-3!
)
exit /b 0
Demo:
> type data.csv
name,address,postcode
n1,a1,LU79GH
n2,a2,W13TP
n1,a1,LU7 9GH
n2,a2,W1 3TP
> .\add-space.bat data.csv
name,address,postcode
n1,a1,LU7 9GH
n2,a2,W1 3TP
n1,a1,LU7 9GH
n2,a2,W1 3TP
You can redirect the output to a file to capture it. (But you can't redirect to the same file as the input, because then the redirection will overwrite the input file before it can be read by the script. If you want to overwrite the original file, you can redirect the output to a new file, and then move the new file over the original after the script has finished.)
On Windows you could do something with PowerShell.
$document = (Get-Content '\doc.csv')
foreach ($line in $document) {
    Write-Host $line
    # Split the line into columns; the postcode is assumed to be the last column
    $list = $line -split ","
    # Then use an if statement and a regular expression to match the ones with no space
    if ($list[-1] -match '^[A-Z0-9]+$') {
        # item has no space: add logic to add the space and write to file
    } else {
        # item has a space or doesn't match the above regular expression; could skip this
    }
}
There is pretty good documentation online; check out http://ss64.com/ps/ for help with PowerShell.
Parsing CSV can be tricky because a comma may be a column delimiter, or it may be a literal character within a quoted field.
Since your postcode is always the last field, I would simply look at the 4th character from the end of the entire line, and if it is not already a space, then insert a space before the last 3 characters in the line. I will also assume that the first line of the file lists the field names, so you don't want to modify that one.
Using pure batch (assuming no values contain !):
@echo off
setlocal enableDelayedExpansion
set "skip=true"
>"test.csv.new" (
    for /f "usebackq delims=" %%A in ("test.csv") do (
        set "line=%%A"
        if "!line:~-4,1!" equ " " set "skip=true"
        if defined skip (echo !line!) else (echo !line:~0,-3! !line:~-3!)
        set "skip="
    )
)
move /y "test.csv.new" "test.csv" >nul
The solution is simpler if you use my JREPL.BAT regular expression text processor. It is a pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward. The following one liner will do the trick:
jrepl "[^ ](?=...$)" "$& " /jbegln "skip=(ln==1)" /f test.csv /o -
Use CALL JREPL ... if you use the command within another script.
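For comparison only, the "look at the 4th character from the end" idea can be sketched in Python as follows. This is not part of the answer above; the function names add_postcode_space and fix_rows are hypothetical, the first line is assumed to be the header, and lines shorter than four characters are left alone (an assumption, since the source does not say what should happen to them).

```python
def add_postcode_space(line):
    """Insert a space before the last 3 characters unless one is already there."""
    if len(line) >= 4 and line[-4] != " ":
        return line[:-3] + " " + line[-3:]
    return line


def fix_rows(lines):
    """Leave the header line alone and fix every following line."""
    return lines[:1] + [add_postcode_space(line) for line in lines[1:]]


rows = [
    "name,address,postcode",
    "n1,a1,LU79GH",
    "n2,a2,W13TP",
    "n1,a1,LU7 9GH",
    "n2,a2,W1 3TP",
]
print("\n".join(fix_rows(rows)))
```

Because it only inspects the end of each line, the sketch never has to decide whether a comma is a delimiter or part of a quoted field.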

Combine multiple lines from one text file into one line

So I have a file with multiple lines of letters and numbers as shown:
a
1
h
7
G
k
3
l
END
I need a type of code that combines them together and preferably outputs it into a variable as shown:
var=a1h7Gk3l
Any help would be appreciated.
@echo off
setlocal enableDelayedExpansion
set "var="
for /f "usebackq" %%A in ("test.txt") do set var=!var!%%A
echo !var!
Edit
I assumed "END" does not physically exist in your file. If it does exist, then you can add the following line after the FOR statement to strip off the last 3 characters.
set "var=!var:~0,-3!"
Or, if you just want to put the result into a file (as opposed to storing it in memory for some purpose), you could do something like this:
@ECHO OFF
TYPE NUL >output.txt
FOR /F %%L IN (input.txt) DO (
    IF NOT "%%L" == "END" (<NUL >>output.txt SET /P "=%%L")
)
ECHO.>>output.txt
The last command may be unnecessary, depending on whether you need the End-of-Line symbol at, well, the end of the line.
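For comparison only, joining the lines into one string could be sketched in Python like this. It is not part of the answers above; the function name join_lines is hypothetical, and stopping at the first END line is an assumption about what the marker means (the batch versions either strip the last 3 characters or skip every END line).

```python
def join_lines(lines, terminator="END"):
    """Concatenate lines into one string, stopping at the terminator line."""
    parts = []
    for line in lines:
        if line == terminator:
            break  # assumed: END marks the end of the data
        parts.append(line)
    return "".join(parts)


var = join_lines(["a", "1", "h", "7", "G", "k", "3", "l", "END"])
print(var)
```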

Adding and removing codes in every line in windows

I have lines like the ones shown below.
abcbasndo
bacmaisca
ascmasoc
Now, I need to take out the first three characters of every line and add AAA at the start and end of each line, so that it looks like the one shown below.
AAAabcAAA
AAAbacAAA
AAAascAAA
I am using windows.
Please help.
This little cmd script will do the job for you:
@setlocal enableextensions enabledelayedexpansion
@echo off
for /f "delims=" %%a in (qq.txt) do (
    set var=%%a
    echo AAA!var:~0,3!AAA
)
endlocal
See the following transcript:
C:\Pax> type qq.txt
abcbasndo
bacmaisca
ascmasoc
C:\Pax> qq
AAAabcAAA
AAAbacAAA
AAAascAAA
The for loop grabs each line in the qq.txt file (without delims=, it would use spaces within the line as delimiters) and puts it in %%a.
The body of the for loop puts that value into var and then uses the substring operator to get the first three characters.
I haven't tested what will happen if the line has less than three characters since (1) you didn't specify what you expected; and (2) it should be fairly easy to expand this script to handle it.
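For comparison only, the same substring step can be sketched in Python. This is not part of the answer above; the function name wrap_prefix and its parameters are hypothetical. Python slicing simply returns the whole line when it has fewer than three characters, which is one possible way to treat short lines.

```python
def wrap_prefix(line, width=3, tag="AAA"):
    """Keep the first width characters of line and wrap them in tag."""
    return tag + line[:width] + tag


for line in ["abcbasndo", "bacmaisca", "ascmasoc"]:
    print(wrap_prefix(line))
```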

Resources