Extract first row from each text file in directory subtree - windows

I have a set of 5000 CSV files in a directory tree.
Every file has a first row denoting its encoding, then typical CSV content follows.
UTF-8
Key1,"Value 1",
Key2,"Value 2"
etc, etc...
How can I quickly collect the first row from each file to get an overview of their encodings?
I'm trying this with the help of this answer, but I don't fully understand the syntax nuances of batch file variables, so I am getting output like Echo is on.
for /R D:\resources\ %%f in (*.csv) do (
set /p line1=<%%f
echo %line1% >> out.txt
)
If the parts are executed individually (for a single file), they work.
I also tried a one-liner
for /R D:\resources\ %f in (*.csv) do set /p line1=<%f & echo %line1% >> out.txt
but this one puts the value collected from the first file into the output for every file.

The main problem here is the lack of delayed expansion, which is needed when you write and read a variable within the same block of code; without it, you are reading the variable value present when parsing the block, so before it is executed. Here is the fixed code:
@echo off
copy NUL out.txt
setlocal EnableDelayedExpansion
for /R "D:\resources" %%g in ("*.csv") do (
set "line1="
< "%%~g" set /P line1=""
>> "out.txt" echo(!line1!
)
endlocal
In addition, I quoted all file/directory paths in order to avoid trouble with white-spaces or special characters in them. I also reversed the redirection syntax, because echo(!line1! >> "out.txt" also outputs the SPACE in front of >>. The ( instead of a SPACE behind echo looks odd but prevents ECHO is {on|off}. being returned when the read line in !line1! is empty.
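A minimal sketch of the difference, assuming nothing beyond a plain cmd session (the variable name value is only illustrative): %value% is substituted when the parenthesized block is parsed, while !value! is substituted when the line runs.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Illustrative variable, set before the block is parsed.
set "value=before"
(
    set "value=after"
    rem Percent expansion was already done when the whole block was parsed.
    echo Percent expansion: %value%
    rem Delayed expansion reads the variable when this line actually runs.
    echo Delayed expansion: !value!
)
endlocal
```

Run as a batch file, the first echo should show before and the second should show after.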
By the way, you do not need to prepare an empty output file out.txt initially; you can simply redirect the whole for loop into the output file once using >, provided that you surround it with a pair of parentheses:
@echo off
setlocal EnableDelayedExpansion
> "out.txt" (
for /R "D:\resources" %%g in (*.csv) do (
set "line1="
< "%%~g" set /P line1=""
echo(!line1!
)
)
endlocal
To do the same directly in command prompt as a single-liner, try this:
cmd /V /C (for /R "D:\resources" %g in ("*.csv") do @(set "line1=" ^& ^< "%~g" set /P line1="" ^& echo(!line1!)) ^> "out.txt"

aschipfl's comment helped. Adding the corrected and completed script for the benefit of others:
@echo off
copy NUL out.txt
SETLOCAL EnableDelayedExpansion
for /R D:\resources\ %%g in (*.csv) do (
set /p line1=<%%g
echo !line1! >> out.txt
)

Related

Find and replace algorithm for string in text file using batch script, works, but stopping when `<`, `>`, or `|` characters appear

I've been trying to figure out how to replace an entire line in a text file that contains a certain string using a Batch Script. I've found this solution provided by another user on Stack Overflow, which does the job, however, it just stops iterating through the text file at some random point and in turn, the output file is left with a bunch of lines untransferred from the original file. I've looked character by character, and line by line of the script to figure out what each part exactly does, and can not seem to spot what is causing this bug.
The code provided, thanks to Ryan Bemrose on this question
copy nul output.txt
for /f "tokens=1* delims=:" %%a in ('findstr /n "^" file.txt') do call :do_line "%%b"
goto :eof
:do_line
set line=%1
if {%line:String =%}=={%line%} (
echo.%~1 >> output.txt
goto :eof
)
echo string >> output.txt
The lines it is stopping at always either contain < or > or both and lines with | will either cause it to stop, or sometimes it will delete the line and continue.
To do this robustly, delayed expansion is necessary to prevent "poison" characters such as < > & | etc. from being interpreted as command tokens.
Note, however, that delayed expansion should not be enabled until after the variable containing the line value is defined, so as to preserve any ! characters that may be present.
The following will robustly handle all printable ASCII characters *1, and preserve empty lines present in the source file:
@Echo off
Set "InFile=%~dp0input.txt"
Set "OutFile=%~dp0output.txt"
Set "Search=String "
Set "Replace="
>"%OutFile%" (
for /F "delims=" %%G in ('%SystemRoot%\System32\findstr.exe /N "^" "%InFile%"') do (
Set "line=%%G"
call :SearchReplace
)
)
Type "%OutFile%" | More
goto :eof
:SearchReplace
Setlocal EnableDelayedExpansion
Set "Line=!Line:*:=!"
If not defined Line (
(Echo()
Endlocal & goto :eof
)
(Echo(!Line:%Search%=%Replace%!)
Endlocal & goto :eof
*1 Note - Due to how substring modification operates, you cannot replace search strings that:
contain the = operator
begin with ~
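As a small sketch of why the order matters (the sample value of line is hypothetical): the variable is assigned while delayed expansion is still off, and only then echoed with !line!, so the special characters are never seen as redirection or pipe tokens.

```batch
@echo off
setlocal DisableDelayedExpansion
rem Hypothetical sample line containing characters cmd treats specially.
set "line=a < b | c & d > e"
setlocal EnableDelayedExpansion
rem Delayed expansion happens after the parser has looked for redirection
rem and pipe tokens, so the content is echoed literally.
echo(!line!
endlocal
endlocal
```

Echoing the same variable as %line% instead would let cmd act on the <, |, & and > characters.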

Batch file searching for multiple file formats in FOR loop

I'm coding a batch file that should search two folder paths for multiple file extensions and list them in a text file. Currently I'm trying to use a FOR loop with a list of file extensions (*.doc, *.docx, etc). I believe the file is erroring out because of the "*" character but I don't know how to correct this.
I've tried to straight list them: FOR %%G IN (*.one,*.mht,*.onepkg). I've tried quote marks: FOR %%G IN ("*.one","*.mht","*.onepkg"). I've tried carets: FOR %%G IN (^^*.one,^^*.mht,^^*.onepkg).
Here's my code:
set outputfilepath=d:\output.txt
FOR %%G IN ("*.one","*.mht","*.onepkg") DO (
echo Searching for %%G files
dir "C:\%%G" /s /b >> "%outputfilepath%"
Rem Add 2 blank lines between next search
echo. >> "%outputfilepath%"
echo. >> "%outputfilepath%" )
Nothing gets output to my text file.
Any help is appreciated.
@ECHO Off
SETLOCAL
set "outputfilepath=u:\output.txt"
(
FOR %%G IN (one,mht,onepkg) DO (
echo Searching for %%G files>con
dir ".\*.%%G" /s /b |FINDSTR /i /e /L ".%%G"
Rem Add 2 blank lines between next search
echo.
echo.
)
)> "%outputfilepath%"
GOTO :EOF
Please note that I've changed drivenames to suit my system.
Simply loop over a list of bare extensions, then add the * in the dir command. Filter the dir output using findstr to ensure that only names ending (/e) with the literal (/L) string ".%%G" are shown.
Also, by enclosing the entire for command in parentheses, you can send all stdout text (which would normally appear on the console) to the file. > means create the file anew; use >> to append if that's your preference.
The >con appended to the Searching... echo overrides the redirection and specifically sends the text from that echo to the console.
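A stripped-down sketch of that redirection pattern, using a hypothetical file name demo_out.txt: everything echoed inside the parentheses goes to the file, except the line that is explicitly sent to con.

```batch
@echo off
rem "demo_out.txt" is a hypothetical file name used only for this sketch.
(
    echo This progress message stays on the console>con
    echo This line is written to the file
) > "demo_out.txt"
rem Show what actually ended up in the file.
type "demo_out.txt"
```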
I really like the existing suggestions but this is a different approach. I find that this style reads more like code.
This may seem very complicated but...
This technique can be used to loop over parameters for ANYTHING to include ones passed via the command line.
The logic of looping through passed parameters is isolated to its own function (enumerate_search_types).
The logic of what you are going to do with each parameter is isolated to its own function (search_for_search_type).
This might make it easier for some, too complicated for others.
@echo off
:: The parameters we are working with...
set outputfilepath=d:\output.txt
set starting_path=c:\
set search_types="*.one" "*.mht" "*.onepkg"
pushd "%starting_path%"
call :enumerate_search_types %search_types%
popd
goto :EOF
:: ---------------------------------------------------------------
:enumerate_search_types
set "current_param=%~1"
if "%current_param%"=="" goto :EOF
call :search_for_search_type "%current_param%"
shift /1
goto :enumerate_search_types
:: ---------------------------------------------------------------
:search_for_search_type
set "current_search_type=%1"
set "had_output=false"
echo Searching for %current_search_type% files
for /f "delims=" %%f in ('dir /s /b %current_search_type% 2^>NUL') do set had_output=true&& echo %%f >> "%outputfilepath%"
:: If the dir command didn't produce anything, don't add the blank lines
if "false"=="%had_output%" goto :EOF
echo. >> "%outputfilepath%"
echo. >> "%outputfilepath%"
goto :EOF
Short answer, if you only need to vary the file extension.
To loop over different extensions in a for loop:
FOR %%G IN (one,mht,onepkg) do [command]
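For instance, with echo standing in for the real command (a sketch only), the wildcard is added inside the loop body rather than in the list itself:

```batch
@echo off
rem Items without wildcards are passed through as plain strings.
for %%G in (one,mht,onepkg) do echo Pattern to search: *.%%G
```

This should print one pattern per extension, which can then be handed to dir as in the answers above.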

Merge every 2nd line with previous line in batch scripting

I used the following code, but set Content is blank in my case. Please help. Thanks.
set content=
for /f "delims=" %%i in (fileA.txt) do set content=%%i
for /f "delims=" %%i in (FileA.txt) do set content=%content% %%i
ECHO %content%> result.txt
FileA.txt
test
A
Testing
B
Expected Output:
test A
Testing B
You need a single for command to process all lines and this simple logic: if it is the first line read, store it; otherwise show the stored first line and the second one, and delete the first line, so the same logic applies to every pair of lines:
@echo off
setlocal EnableDelayedExpansion
set "firstLine="
(for /F "delims=" %%a in (FileA.txt) do (
if not defined firstLine (
set "firstLine=%%a"
) else (
echo !firstLine! %%a
set "firstLine="
)
)) > result.txt
Your two for loops work independently (the second starts when the first has finished).
Your first loop leaves the last line of the file in the variable, and the second then adds every line of the text file to it (there is a limit on variable length, and you will soon reach it with this method).
The empty variable at the end is due to the lack of delayed expansion.
Work with a single for and an alternating flag instead:
@echo off
setlocal enabledelayedexpansion
set flag=0
(for /f "delims=" %%i in (FileA.txt) do (
if !flag!==0 (
<nul set /p ".=%%i "
) else (
echo %%i
)
set /a "flag=(flag+1) %% 2"
))>result.txt
Note: due to batch/cmd limitations, this may have some problems (line length, special characters).
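The alternating flag can be tried on its own with a short sketch (the list 1 2 3 4 is just placeholder input); note that inside a batch file the modulo operator has to be written as %%.

```batch
@echo off
setlocal EnableDelayedExpansion
set "flag=0"
for %%N in (1 2 3 4) do (
    echo Item %%N sees flag !flag!
    rem Inside a batch file the modulo operator is written as a doubled percent sign.
    set /a "flag=(flag+1) %% 2"
)
endlocal
```

The flag should alternate 0, 1, 0, 1 across the four items.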
The '@echo off' statement stops the script's own commands from being printed on every run, so only the output of the echo statements appears; 'rem' marks a line as a comment. 'SETLOCAL EnableExtensions EnableDelayedExpansion' is needed so that variables written as !var! are resolved at execution time.
@echo off
rem this for loop reads the file FileA.txt line by line by specifying delims= (nothing)
rem then checks the condition if the line is even line or not, if odd then adding it to myVar variable
rem if even then printing both earlier odd with the current even line to the result.txt file.
set myVar=
set nummod2=0
set /a i=0
rem create an empty result file every time the program runs
copy /y nul result.txt
SETLOCAL EnableExtensions EnableDelayedExpansion
for /f "delims=" %%a in (FileA.txt) do (
set /a i=i+1
set /a nummod2=i%%2
if !nummod2!==0 (
echo !myVar! %%a
) else (
set myVar=%%a
)
) >> result.txt;
echo 'Done with program execution. Result saved to result.txt in the same folder of this batchfile'
rem pause

Insert a string at a specific position in a file with batch script

I'm new to batch script, I have a file with a string containing the word "media"(quotes included) and I need to insert another string right before it.
I messed around with findstr but couldn't make heads or tails of it.
Edit2:
here's what i did, doesn't seem to work:
@echo off
SETLOCAL=ENABLEDELAYEDEXPANSION
for /f "delims=," %%a in (f1.txt) do (
set foo=%%a
if !foo!=="media"
set var=!foo:"media"=aa"media"!
echo !foo! >> f2.txt)
You have two options to do this. You can read the file with a FOR /F command or if you are just editing a single line file then you can use the SET /P command.
Here are both of those examples in a single batch file.
@echo off
setlocal enabledelayedexpansion
for /F "delims=" %%G in (sotemp.txt) do (
set "line=%%G"
set "foo=!line:"media"=aa"media"!"
echo !foo!
)
set /p "line="<sotemp.txt
echo %line:"media"=aa"media"%
pause

Batch File - Insert Line into file

I'm trying to insert a line into a file using the following code (from Write batch variable into specific line in a text file)
@echo off
setlocal enableextensions enabledelayedexpansion
set inputfile=variables.txt
set tempfile=%random%-%random%.tmp
copy /y nul %tempfile%
set line=0
for /f "delims=" %%l in (%inputfile%) do (
set /a line+=1
if !line!==4 (
echo WORDS YOU REPLACE IT WITH>>%tempfile%
) else (
echo %%l>>%tempfile%
)
)
del %inputfile%
ren %tempfile% %inputfile%
endlocal
My problem is that the file has comment lines (starting with semicolons) which need to be kept:
; directory during network startup. This statement must indicate a local disc
; drive on your PC and not a network disc drive.
LOCALDRIVE=C:\TEMP;
; PANELISATION PART/NET NAMING CONVENTION
; When jobs are panelised, parts/nets are renamed for each panel step by
When I run the batch file, it ignores the semicolon lines, so I only get:
LOCALDRIVE=C:\TEMP;
What do I need to do to keep the semicolon lines?
The EOL option determines what lines are to be ignored. The default value is a semicolon. If you know a character that can never appear in the first position of a line, then you can simply set EOL to that character. For example, if you know a line can't start with |, then you could use
for /f "eol=| delims=" %%l in (%inputfile%) do ...
There is an awkward syntax that disables EOL completely, and also disables DELIMS:
for /f delims^=^ eol^= %%l in (%inputfile%) do ...
Note that FOR /F always discards empty lines, so either of the above would result in:
; directory during network startup. This statement must indicate a local disc
; drive on your PC and not a network disc drive.
LOCALDRIVE=C:\TEMP;
; PANELISATION PART/NET NAMING CONVENTION
; When jobs are panelised, parts/nets are renamed for each panel step by
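A self-contained sketch of the EOL behaviour, assuming a hypothetical scratch file eol_demo.txt that the script creates and deletes itself:

```batch
@echo off
rem "eol_demo.txt" is a hypothetical scratch file created for this sketch.
> "eol_demo.txt" (
    echo ; a comment line
    echo KEY=value
)
echo Default options:
for /f "delims=" %%L in (eol_demo.txt) do echo   %%L
echo With eol set to the pipe character:
for /f "eol=| delims=" %%L in (eol_demo.txt) do echo   %%L
del "eol_demo.txt"
```

The first loop should list only KEY=value, while the second should also list the semicolon line.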
A trick is used if you want to preserve empty lines. Use FIND or FINDSTR to insert the line number before each line, and then use expansion find/replace to remove the line number. Now you know the line never begins with ;, so you can ignore the EOL option.
for /f "delims=" %%L in ('findstr /n "^" "%inputfile%"') do (
set "ln=%%L"
set "ln=!ln:*:=!"
REM You now have the original line, do whatever needs to be done here
)
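To see the prefix removal in isolation, here is a sketch with a hypothetical value shaped like one line of findstr /n output:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical value shaped like one line of findstr /n output.
set "ln=12:LOCALDRIVE=C:\TEMP;"
rem The pattern *: matches everything up to and including the first colon.
set "ln=!ln:*:=!"
echo(!ln!
endlocal
```

Only the text after the first colon should remain, so later colons in the line are left alone.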
But all of the above have a potential problem in that you have delayed expansion enabled when you expand the FOR variable, which means that any content containing ! will be corrupted. To solve this you must toggle delayed expansion on and off within the loop:
setlocal disableDelayedExpansion
...
for /f "delims=" %%L in ('findstr /n "^" "%inputfile%"') do (
set "ln=%%L"
setlocal enableDelayedExpansion
set "ln=!ln:*:=!"
REM You now have the original line with ! preserved, do whatever needs done here
endlocal
)
Also, when ECHOing an empty line, it will print out ECHO is off unless you do something like
echo(!ln!
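A short sketch comparing the two forms with an empty variable (ln is deliberately cleared here):

```batch
@echo off
setlocal EnableDelayedExpansion
set "ln="
rem With nothing after it, echo reports its own state instead of a blank line.
echo !ln!
rem The opening parenthesis form prints an empty line for an empty value.
echo(!ln!
endlocal
```

The first echo should print the ECHO is off message and the second an empty line.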
It takes time to open and position the write cursor to the end every time you use >> within the loop. It is faster to enclose the entire operation in one set of parentheses and redirect once. Also, you can replace the DEL and REN with a single MOVE command.
Here is a final robust script:
@echo off
setlocal disableDelayedExpansion
set "inputfile=variables.txt"
set line=0
>"%inputfile%.new" (
for /f "delims=" %%L in ('findstr /n "^" "%inputfile%"') do (
set "txt=%%L"
set /a line+=1
setlocal enableDelayedExpansion
set "txt=!txt:*:=!"
if !line! equ 4 (
echo New line content here
) else (
echo(!txt!
)
endlocal
)
)
move /y "%inputfile%.new" "%inputfile%" >nul
endlocal
That is an awful lot of work for such a simple task, and it requires a lot of arcane knowledge.
There is a much quicker hack that works as long as
your first 4 lines do not exceed 1021 bytes
none of your first 3 lines have trailing control characters that need to be preserved
the remaining lines do not have <tab> characters that must be preserved (MORE converts <tab> into a string of spaces).
@echo off
setlocal enableDelayedExpansion
set "inputfile=variables.txt"
>"%inputfile%.new" (
<"%inputfile%" (
for /l %%N in (1 1 3) do (
set "ln="
set /p "ln="
echo(!ln!
)
)
echo New line content here
more +4 "%inputfile%"
)
move /y "%inputfile%.new" "%inputfile%"
That is still a lot of work and arcane knowledge.
I would use my JREPL.BAT utility
Batch is really a terrible tool for text processing. That is why I developed JREPL.BAT to manipulate text using regular expressions. It is a hybrid JScript/batch script that runs natively on any Windows machine from XP onward. It is extremely versatile, robust, and fast.
A minimal amount of code is required to solve your problem with JREPL. Your problem doesn't really require the regular expression capabilities.
jrepl "^" "" /jendln "if (ln==4) $txt='New content here'" /f "variables.txt" /o -
If used within a batch script, then you must use call jrepl ... because JREPL.BAT is also a batch script.
By default, the FOR command treats ; as the end-of-line character, so all those lines that start with ; are being ignored.
Add eol= to your FOR command, like this:
for /f "eol= delims=" %%l in (%inputfile%) do (
It looks like you're echoing just the line delimiter, not the whole line:
echo %%l>>%tempfile%
I'm rusty on ms-dos scripts, so I can't give you more than that.

Resources