Sorry for bothering you for the (n+1)th time about search & replace with batch scripts.
I have text files (actually PS-files) (approx. 10kB-3MB) where I need to replace just a few numbers.
This should be easy, I thought.
I found quite a few scripts here on Stackoverflow but none of them worked properly so far. If I have overlooked THE "working one" please let me know.
The last one I tried:
@echo off
setlocal DisableDelayedExpansion
set OutputFile=%1
set OutputFile=%OutputFile:"=%
set InputFile=%OutputFile%.tmp
set SearchString=636170656C6C6133
set ReplaceString=636170656C6C6134
rem write empty file
type NUL > %OutputFile%
for /f "tokens=1,* delims=¶" %%A in ( '"type %InputFile%"') do (
SET string=%%A
setlocal EnableDelayedExpansion
SET modified=!string:%SearchString%=%ReplaceString%!
echo !modified!>>%OutputFile%
endlocal
)
del %InputFile%
First of all, it seems to be pretty(!) slow. I can see on disk how the file size increases.
The occurrences of the numbers seem to be replaced. However, the file is altered, which I easily can see from the different file size. As far as I can see, empty lines, exclamation marks and lines beginning with semicolon are skipped. This is messing up my file completely.
How to avoid this?
If I do the same thing with Perl I really get only the numbers altered, nothing else. However, I don't want to and cannot use Perl. I also don't want to use other extra programs or Windows-Powershell, since it should work on older systems too.
Is there any way to achieve this with a simple Windows batch script?
Thanks!
I believe the following is a working bat script that should not make any changes other than the desired number change:
@echo off
setlocal DisableDelayedExpansion
set "out=%~1"
set "in=%out%.tmp"
set "find=636170656C6C6133"
set "repl=636170656C6C6134"
>"%out%" (
for /f "delims=" %%A in ('findstr /n "^" "%in%"') do (
set "str=%%A"
setlocal EnableDelayedExpansion
set "str=!str:*:=!"
if defined str set "str=!str:%find%=%repl%!"
echo(!str!
endlocal
)
)
del "%in%"
Changes I have made:
Use %~1 to remove enclosing quotes. Technically that is not necessary, though: something like echo test >"someName.txt".new works just fine.
FOR /F strips empty lines. I used FINDSTR to prefix each line with the line number, followed by a colon. Now there are no empty lines.
I use an extra variable expansion find/replace with * to remove the line number prefix.
Variable expansion find/replace will fail if a string is empty (undefined variable). So I verify the variable is defined before doing find/replace.
ECHOing an empty line, or line containing only white space, will result in ECHO is off. output. This is solved by using echo(
It takes time to initialize redirection, and your loop does this every iteration, which slows things down. I improved performance by enclosing the entire FOR loop in parentheses and redirecting only once.
You may still see a slight file size change for either of the following reasons:
If the input has \n line terminators instead of \r\n.
If the last line of input is not terminated by \r\n. The script terminates all lines with \r\n, regardless what the input had.
The script will fail if any line contains a null byte, or if any line is >~8k length.
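The size change caused by line terminators can be checked with a few lines of Python. Python is used here only as a neutral illustration and is not part of the batch solution. The sketch models what the script above does (read each line, replace, write it back terminated by \r\n); the helper name rewrite_like_batch and the sample input are made up for this example.

```python
def rewrite_like_batch(data: bytes, find: bytes, repl: bytes) -> bytes:
    """Model of the batch loop: split into lines, replace, end every line with CRLF."""
    lines = data.splitlines()  # for bytes this splits on \n, \r\n and \r only
    return b"".join(line.replace(find, repl) + b"\r\n" for line in lines)


# Hypothetical input: \n terminators, one empty line, last line not terminated.
unix_input = b"line one\n\n636170656C6C6133\nlast line"
output = rewrite_like_batch(unix_input, b"636170656C6C6133", b"636170656C6C6134")

# The replacement itself keeps the length; only the terminators add bytes.
print(len(unix_input), len(output))
```

In this model every extra byte comes from a \n that became \r\n or from the terminator added to the last line, never from the replaced number.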
I hate editing text files with batch - it is complicated code, slow, and even the best possible solution still has significant limitations.
I recommend you use JREPL.BAT - a command line regular expression text processing utility. JREPL is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward - no 3rd party exe file or special configuration is needed.
The tool is very powerful, with many options. Full documentation is available from the command line via jrepl /?, or jrepl /?? for paged help.
Solving your problem with JREPL is trivial - you don't even need another script. The following command will work right from a command prompt:
jrepl 636170656C6C6133 636170656C6C6134 /f input.txt /o output.txt
Use CALL JREPL if you put the command within another batch script.
JREPL is way more powerful than what you need for this simple problem. But it is incredibly convenient, and once you have the utility, I suspect you will find many uses for it. Especially if you learn to use regular expressions, as well as the many JREPL options.
A VBS script
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Text = Inp.readall
Text = Replace(Text, "636170656C6C6133", "636170656C6C6134")
outp.write Text
To use
cscript //nologo script.vbs < input.txt > Output.txt
You use the right tool for the job. Batch is for starting programs and copying files.
The above is suited to the file sizes given. However if we are getting up to 100s of MB then this code is better.
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Do Until Inp.AtEndOfStream
Text = Inp.readline
Text = Replace(Text, "636170656C6C6133", "636170656C6C6134")
outp.writeline Text
Loop
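For comparison, the same two strategies (read everything at once versus line by line) can be sketched in Python. This is only an illustration of the trade-off, not a replacement for the VBS scripts; the function names and the sample text are hypothetical.

```python
import io

FIND = "636170656C6C6133"
REPL = "636170656C6C6134"


def replace_all(text: str) -> str:
    """Whole-file approach: the complete content is held in memory."""
    return text.replace(FIND, REPL)


def replace_streaming(inp, out) -> None:
    """Line-by-line approach: only one line is held in memory at a time."""
    for line in inp:  # each line keeps its original line ending
        out.write(line.replace(FIND, REPL))


sample = "first 636170656C6C6133\n\nthird line\n"
streamed = io.StringIO()
replace_streaming(io.StringIO(sample), streamed)
print(streamed.getvalue() == replace_all(sample))
```

Both variants give the same result as long as the search string never spans a line break, which holds for the numbers in the question.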
Related
I am trying to write a batch file that will create other, more complex batch files.
This is a portion of my script for right now:
(
echo @echo off
echo for /f "delims=" %%a in (themeset.txt) do set color=%%a&goto setcolor
echo :setcolor
echo color %%color%%
echo pause
) >otherfile.cmd
Theoretically, the output should be a .cmd file with these exact contents:
@echo off
for /f "delims=" %%a in (themeset.txt) do set color=%%a&goto setcolor
:setcolor
color %color%
pause
However, the batch file does not run, and when I attempted to double the %% signs it does not write the entire line correctly.
Any suggestions or solutions would be appreciated.
Max
Magoo's comment beat me to most of the issues.
There is a mostly obscure situation where ECHO text to echo fails, so many people switched to ECHO.text to echo. But somewhere along the way I hit an even more obscure case where that failed too, and after discovering that the semicolon didn't have that problem, I switched to ECHO;text to echo. So far I have not seen any issues with it.
The FOR /F command has issues, with EOL defaulting to the semicolon being one of them. Some web page out there made the claim that DELIMS is processed before EOL, so setting EOL to the same character as DELIMS would disable it. In my testing it does. But now if we want the whole line, it seems that TOKENS=* disables DELIMS. As far as I know, this is true, but I will admit that I haven't tested that one as much as I would like, so it is probably best to set DELIMS=~, or to a similar character that isn't likely to be found in the file.
The % has to be escaped with another percent as in %%, but FOR requires %%, so now you need 4 percents %%%% total.
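The percent counting can be followed with a deliberately simplified model, written in Python only for illustration. It assumes that one parsing pass of a batch script turns each doubled %% into a single %, and it ignores variable expansion and every other parser rule; parse_pass is a hypothetical helper, not a real cmd.exe function.

```python
def parse_pass(text: str) -> str:
    """Simplified model of one batch parsing pass: every %% collapses to %.

    Assumption: %name% expansion and all other parser rules are not modeled.
    """
    return text.replace("%%", "%")


in_generator = "%%%%L"                    # as typed in the generating script
in_generated = parse_pass(in_generator)   # as written to otherfile.cmd
seen_by_for = parse_pass(in_generated)    # as FOR sees it when otherfile.cmd runs

print(in_generator, in_generated, seen_by_for)
print(parse_pass("Color %%Color%%"))      # plain variables need only one doubling
```

Under this model a FOR variable passes through two parsing passes and therefore needs four percent signs, while an ordinary variable reference such as %Color% needs two on each side.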
%%~aa, %%~dd, %%~ff, %%~nn, %%~pp, %%~ss, %%~tt, %%~xx, and %%~zz are all ambiguous. So it is probably best to use only capital letters with FOR variables, and probably safest to use only these letters: %%B, %%C, %%E, %%G, %%H, %%I, %%J, %%K, %%L, %%M, %%O, %%Q, %%R, %%U, %%V, %%W, and %%Y. See the bottom sections of the help produced by FOR /? for more info on this.
Use the caret to escape special characters. This page, Escape using caret(^), gives some useful info, but it's NOT complete. You NEED the caret before the closing parenthesis ^), which that page fails to tell you. And to play it safe, I also caret the opening parenthesis ^(. But the good thing about that page is that it does point out that when DelayedExpansion is on, you have to double escape the exclamation mark ^^!.
Most of the rest is conventions, I normally capitalize all commands and place a colon in front of all labels, both GOTO :Label and CALL :Label.
The following code:
@ECHO OFF
(
ECHO;@ECHO OFF
ECHO;FOR /F "EOL=~ TOKENS=* DELIMS=~" %%%%L IN ^(themeset.txt^) DO SET Color=%%%%L^&GOTO :SetColor
ECHO;:SetColor
ECHO;Color %%Color%%
ECHO;PAUSE
) >otherfile.cmd
Produced the following file:
@ECHO OFF
FOR /F "EOL=~ TOKENS=* DELIMS=~" %%L IN (themeset.txt) DO SET Color=%%L&GOTO :SetColor
:SetColor
Color %Color%
PAUSE
It is clear from your code that you are not even performing the task in the most efficient manner. The following example will therefore perform the same task, but using another methodology. (i.e. retrieve the first line of the text file content and use it as the color command parameter from another Windows Command Script)
@( Echo @Set /P "colr=" 0^< "themeset.txt"
Echo @Color %%colr%%
Echo @Pause) 1> "otherfile.cmd"
I have a string in a batch file, of the structure
[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}
I need to get just the 01bcd123-1234-5678-0000-abcdefghijkl out of it, but trying to use " as a delimiter doesn't turn out well. \ and ^ don't seem to escape it properly.
set i=1
set "x!i!=%x:"=" & set /A i+=1 & set "x!i!=%"
Is what I have with x being the whole string, attempting to parse it into x1, x2 etc with " as the delimiter.
What is a proper way to split this string, using " as the delimiter?
Edit: Powershell tag is because I am running the script as part of a larger orchestration in Powershell and could export the functionality of the batch script into it if necessary.
Here are two approaches. The first one doesn't mess with the for syntax format, but it's risky - too much dependence on the string (the quotes are actually stripped by %%~). The second one is an ugly non-intuitive syntax, but actually delimits by quotes:
set "string=[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}"
for /f "tokens=2 delims=:{" %%a in ("%string%") do @echo %%~a
for /f tokens^=2delims^=^" %%a in ("%string%") do @echo %%a
Well, the self-expanding code you have posted works fine, provided that delayed expansion is enabled by placing the statement setlocal EnableDelayedExpansion before it. The string of interest is then stored in variable x2. Note that when the script terminates, x2 (like all the other x# variables) is no longer available, since an implicit endlocal is executed then. To avoid that, place endlocal & set "x2=%x2%" in the last line:
@echo off
rem // Define string to parse:
set "x=[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}"
rem // Enable delayed expansion:
setlocal EnableDelayedExpansion
rem // Initialise index counter:
set i=1
rem // Split string using self-expanding code:
set "x!i!=%x:"=" & set /A i+=1 & set "x!i!=%" & rem // (unbalanced `"`!)
rem // Display all `x#` variables:
set x
rem // Make `x2` survive the `endlocal` barrier:
endlocal & set "x2=%x2%"
rem // Return the retrieved value:
echo(%x2%
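To see why the self-expanding line works, it can help to look at the command line that results after %x:"=...% has been substituted. The following Python sketch only performs that textual substitution and prints the outcome; it does not run any batch code, and the variable names are chosen for this illustration.

```python
x = '[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}'

# Text that replaces every quotation mark in %x:"=...%
replacement = '" & set /A i+=1 & set "x!i!='

# The command line as it looks after percent expansion, before execution
expanded = 'set "x!i!=' + x.replace('"', replacement) + '"'
print(expanded)

# Each quotation mark starts a new assignment, so the x# variables
# correspond to the pieces between the quotation marks:
parts = x.split('"')
for i, part in enumerate(parts, start=1):
    print(f"x{i}={part}")
```

Every quotation mark in the original string closes one set "x!i!=..." assignment, increments i, and opens the next assignment, which is why the second piece ends up in x2.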
However, I would most probably use a for /F loop, but not with " as delimiter since the syntax appears quite odd then; rather I would use :, {, } and SPACE as delimiters. But I would remove the prefix [[status]] in advance:
@echo off
rem // Define string to parse:
set "x=[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}"
rem /* At first, split off everything up to the first occurrence of `]]`;
rem if there is no such prefix, there is no harm, because nothing happens;
rem then extract the first token that is delimited by `:`, `{`, `}` or space;
rem that way there may even be spaces around the `:` or around `{` or `}`;
rem then return it with surrounding quotation marks removed (`~`-modifier): */
for /F "tokens=1 eol=: delims=:{} " %%I in ("%x:*]]=%") do echo(%%~I
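The steps described in the remarks above (drop everything up to the first ]], take the first token delimited by :, {, } or space, then remove the surrounding quotation marks) can be mirrored in Python as a rough sketch. It is only an approximation of how for /F tokenizes; the function name first_token is hypothetical.

```python
import re

x = '[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}'


def first_token(s: str, delims: str = ":{} ") -> str:
    # Like %x:*]]=% : remove everything up to and including the first "]]"
    _, sep, rest = s.partition("]]")
    if not sep:
        rest = s  # no such prefix: the string is left unchanged
    # Split on runs of delimiter characters and drop empty pieces
    tokens = [t for t in re.split("[" + re.escape(delims) + "]+", rest) if t]
    token = tokens[0] if tokens else ""
    # Like the ~ modifier: remove one pair of surrounding quotation marks
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        token = token[1:-1]
    return token


print(first_token(x))
```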
N. B.:
The odd-looking syntax echo( is not a typo, it is actually the only safe way to echo an arbitrary string (even on, off or /?); take a look at this external thread for more details.
Since you tagged PowerShell, you can use the following regex, but I am not sure you want PowerShell based on the question.
[regex]::Match('[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}','(?<=")[^"]+(?=")').Value
Split regex can also work:
('[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}' -split '"')[1]
If you stick with a batch file, Stephan's helpful answer is definitely the simplest and fastest solution.
Needless to say, if you port your batch file to PowerShell, you'll have vastly more functionality at your disposal.
You can even harness that functionality from a batch file via PowerShell's CLI, by calling powershell.exe (Windows PowerShell) or pwsh.exe (PowerShell Core), but that comes with two caveats:
Doing so creates a PowerShell child process, whose startup time is not insignificant.
Getting nested quoting right can be a challenge, as shown below.
Here's a solution that calls PowerShell's CLI from a batch file, applying the -split technique from AdminOfThings' helpful answer; again, this solution would be overkill in the case at hand, but the approach may be of interest if you need to perform tasks that simply cannot be done in the batch language or would be too cumbersome.
@echo off
setlocal
:: # The input text.
set txt=[[status]]:{"01bcd123-1234-5678-0000-abcdefghijkl": "11"}
:: # Call the PowerShell CLI to extract the token of interest and save the
:: # result in variable %id%.
:: # In PowerShell code, the equivalent would be:
:: # $id = ($txt -split '"')[1]
for /f %%i in ('powershell -noprofile -c "('%txt:"=\"%' -split '\""')[1]"') do set id=%%i
:: # Echo the result.
echo %id%
Note the need to \-escape the " chars. embedded in %txt%, via substitution %txt:"=\"%, and the need for an additional " char. after \" in '\""' so as to prevent the for command from breaking.
I am trying to find the last line in a text file using the regex ^.*\z. It works fine in Notepad++, but when I try it in cmd using findstr /R "^.*^Z" file.txt, it does not work.
Open a command prompt window and run findstr /?. The help output explains what FINDSTR supports. Its regular expression support is limited: it does not offer all the features of the Boost Perl regular expression library used by many text editors.
This batch code could be used to get last non empty line from a file assigned to an environment variable:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "LastLine="
if exist "file.txt" for /F "usebackq eol= delims=" %%# in ("file.txt") do set "LastLine=%%#"
echo Last line is: "%LastLine%"
endlocal
Command FOR skips all empty lines and, by default, also all lines starting with a semicolon. For that reason eol= is used to define the form-feed control character as the end-of-line character. If the last line of the file never starts with ;, it would be best to remove eol= from the FOR command line.
If the file to process always has more than X lines, it makes sense to add the option skip=X after usebackq in the FOR options to skip the first X lines of the file for faster processing.
For details on command FOR open a command prompt window and run for /?.
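The logic of the loop above (skip empty lines and remember the most recent non-empty one) can be written out in Python as a small sketch. It illustrates the idea only; the special handling of eol and of very long lines in FOR /F is not modeled, and the function name is hypothetical.

```python
def last_non_empty_line(text: str) -> str:
    """Return the last line that is not empty, or "" if there is none."""
    last = ""
    for line in text.split("\n"):
        line = line.rstrip("\r")  # tolerate \r\n terminators
        if line:                  # FOR /F skips empty lines as well
            last = line
    return last


sample = "first\r\nsecond\r\n\r\nlast one\r\n\r\n"
print(last_non_empty_line(sample))
```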
I am running a batch file on Windows 7 and running into this error (I have narrowed down the error to the following line):
FOR /F "delims=" %%I in ('echo %RegVal%') do set sasroot=%%~sI
Where Regval is the file path of a given software, which in this case (on my Win7 machine) is:
RegVal = C:\Program Files\SAS 9.2_M3_10w37\SASFoundation\9.2(32-bit)
This same script used to work on Windows Vista, although I suspect the problem may be that there is now a parenthesis in RegVal, as it was previously C:\Program Files\SAS 9.2_M3_10w37\SASFoundation\ on my previous Vista machine.
Your suspicion is correct.
To get around it, enclose your variable in double quotes (you remove them again with the ~ in the set command):
FOR /F "delims=" %%I in ('echo "%RegVal%"') do set sasroot=%%~sI
I suggest you create a file with the value of RegVal in it, then parse it using the FOR loop:
echo %RegVal%>C:\SomeFile.txt
FOR /F "delims=" %%I in (C:\SomeFile.txt) do set sasroot=%%~sI
This should help you get around your problem.
Stephan's solution is much simpler, but I'll explain my solution anyway, which might prove useful in some cases.
When the FOR command parses the data specified in the IN part using a command, it replaces the command with the result of the command, then runs the FOR command. For example, with the question above, the FOR command that will be executed after expanding echo %RegVal% is:
FOR /F "delims=" %%I in (C:\Program Files\SAS 9.2_M3_10w37\SASFoundation\9.2(32-bit)) do set sasroot=%%~sI
Thus, when the parser hits the first closing parenthesis, it will stop, thinking that everything it read before is the text to work on. However, in this case this is wrong, as the first closing parenthesis is part of the string to read; it doesn't indicate the end of the string.
When parsing a file with the FOR command, it will read each line, assign the predefined tokens with the correct values, then execute the code block that follows. Rinse and repeat for every line in the file. But in this case, it will not replace the IN part with each line; it will only parse it and assign values to the tokens. This is the reason why special characters (such as parentheses) do not create parsing errors in this case.
I have a multi-line file (about 300 - 400 lines) in which each line has 72 characters, and I need it transformed into a single line.
Any ideas?
This is possible, assuming you want your concatenated line in one line in a text file. However, even though you can create the long line with batch, you will not be able to read the line using batch. As Electro Hacker says, you cannot create a batch environment variable longer than 8191 bytes long.
On XP, SET /P preserves leading spaces from each line, but SET /P on Vista and later strips them.
This solution adds a space after each concatenated line.
@echo off
setlocal
set "infile=test.txt"
set "outfile=out.txt"
>"%outfile%" (
for /f usebackq^ delims^=^ eol^= %%A in ("%infile%") do <nul set /p "=%%A "
)
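What the loop above produces can be modeled in a few lines of Python, shown only to make the resulting length easy to check. The sketch assumes that empty lines are skipped (as FOR /F does) and that each remaining line is followed by one space; the stripping of leading spaces by SET /P is not modeled, and join_lines is a hypothetical name.

```python
def join_lines(text: str) -> str:
    """Concatenate all non-empty lines, each followed by a single space."""
    pieces = []
    for line in text.split("\n"):
        line = line.rstrip("\r")
        if line:  # FOR /F skips empty lines
            pieces.append(line + " ")
    return "".join(pieces)


# Hypothetical input: three lines of 72 characters each
sample = "\r\n".join(["A" * 72, "B" * 72, "C" * 72]) + "\r\n"
joined = join_lines(sample)
print(len(joined))
```

With 400 lines of 72 characters the joined line would be 400 * 73 = 29200 characters long, well above the 8191-byte limit for a batch variable mentioned above.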
If you want to stick to standard Windows tools, PowerShell would also be an option:
-join (Get-Content foo.txt)
You can't break a limitation of the OS: you can't exceed the 255-character path limit in Windows, and you can't exceed the length limit of the CMD interpreter. It's as simple as that!
Sorry, but you can't store that line in a variable. There is no magic; computers are logical.
But it's not the end of the world. You can do it easily in any other language; I recommend Ruby or Python (Ruby for this). It's an easy job: open a file, store the content in a variable, and then do what you want. You don't need any experience for that; if you need an example, just leave a comment.