Remove header while merging multiple .csv files using batch - windows

I have written code to concatenate sample files into a single file, minus the headers of each file.
Input files:
File1:
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
File 2:
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
Expected Output:
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
Actual Output:
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
Please find below code used for this operation:
@echo off
break>Combined.csv
cls
setlocal enabledelayedexpansion
if exist C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv del C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv
dir /a-d /b C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv>C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt
cd C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\
for /f "tokens=*" %%A in (C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt) do (
set /p header=<%%A
if "!header!" neq "" (
(echo(!header!)>Combined.csv
goto :break_for
)
)
:break_for
for /f "tokens=*" %%A in (C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt) do (
more +1 %%A>>Combined.csv
)
del dirfiles.txt
}
Can someone please help me resolve this issue? I am a neophyte to batch scripting and have been unable to debug it.

A couple points about this question:
This question is an exact duplicate of Windows Batch file execution error
At that question there are 4 answers, one of which is mine.
In my answer I asked you to post a small section of your data files, but you never replied.
This is a copy of my answer at that question after I slightly modified it in order to insert the key point of your problem: the headers contain TWO lines:
EDIT: I modified the code according to the new specifications posted in a comment: there are three lines of headers in each file, but just the 3rd must be included in the output.
@echo off
setlocal enabledelayedexpansion
cls
REM cd C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\
set "header3="
(for %%A in (*.csv) do (
if not defined header3 (
(set /p "header1=" & set /p "header2=" & set /p "header3=") <%%A
echo !header3!
)
more +3 %%A
)) > Combined.txt
And this is the generated Combined.txt file when this program run with your data above:
.
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
As you can see, the output is the same as the one you want.
EDIT: I can't test the modification because the posted input files do not contain the same data as the real files...
You should follow up the questions you post and not post new questions with the exact same problem of a previous one.
You should be clearer in the description of your problem and post example data.

There is no need for an interim file that contains a list of CSV files, you can read and combine them by a standard for loop and a nested for /F loop, using its skip option to get rid of the headers (assuming the header is always a single line). The initial header can be taken from another for/for /F loop construct that is broken upon its first iteration:
> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
for /F "usebackq eol=| delims=" %%L in ("%%~F") do (
echo(%%L
goto :LEAVE
)
)
)
:LEAVE
>> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
for /F "usebackq skip=1 eol=| delims=" %%L in ("%%~F") do (
echo(%%L
)
)
)
If you need a specific sort order of the CSV files, you need another for /F loop instead of the standard for loop that parses the output of a dir /B command to do that job. The following example takes a two-line header, then it sorts the files from oldest to newest modification dates:
> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
set "FLAG="
for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
for /F "usebackq eol=| delims=" %%L in ("%%~F") do (
echo(%%L
if defined FLAG goto :LEAVE
set "FLAG=#"
)
)
)
:LEAVE
>> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
for /F "eol=| delims=" %%F in ('
dir /B /A:-D /O:D /T:W "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv"
') do (
for /F "usebackq skip=2 eol=| delims=" %%L in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\%%F") do (
echo(%%L
)
)
)

If you felt like installing awk - one of the handiest programs around from Unix/Linux - your task would become very simple. It is available for Windows from here.
Then you could just use:
awk 'NR<3 || FNR>2' *.csv
To explain the command, you need to know that NR is the Number of the Record (i.e. the line number): it starts at one for the first record/line of the first file and then increments with each record, so it will be less than 3 for just the first two records of the very first file. FNR, on the other hand, is the File Number of Record, which is the same, but it resets to one as each new file is opened, so it will be greater than 2 for every record after the first two of each file.
So, in summary, the command says... "Print any line if it is one of the first two lines of the very first input file, or if it is past line 2 of any of the files."
Note that you may need to replace the single quotes with double quotes on Windows.
Note that if you were to download gawk, it will work just the same as awk for this example.
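To make the NR/FNR logic concrete, here is a minimal Python sketch of the same filter. It is only a model of the awk one-liner, not a replacement for it; the function name merge_keep_first_header and the header_lines parameter are made up for illustration, and each input file is assumed to have been read into a list of lines already.

```python
from typing import Iterable, List


def merge_keep_first_header(files: Iterable[List[str]], header_lines: int = 2) -> List[str]:
    """Model of awk 'NR<3 || FNR>2' when header_lines is 2 (hypothetical helper)."""
    merged = []
    nr = 0  # plays the role of NR: counts records across all inputs
    for lines in files:
        # enumerate restarts at 1 for every file, like FNR
        for fnr, line in enumerate(lines, start=1):
            nr += 1
            if nr <= header_lines or fnr > header_lines:
                merged.append(line)
    return merged
```

Passing the two sample files from the question (as lists of lines) should give the two header lines once, followed by the data rows of each file in the order the files were passed.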

Related

Windows Batch - Find word in one string matching word in another string and capture output

While this may seem easy to some, I've struggled for hours on it.
I have a file:
MYFOLDER,JobE,JobD_ENDED_OK,
MYFOLDER,JobD,JobC_ENDED_OK,JobD_ENDED_OK
MYFOLDER,JobD,JobB_ENDED_OK,
MYFOLDER,JobC,JobA_ENDED_OK,JobC_ENDED_OK
MYFOLDER,JobB,JobA_ENDED_OK,JobB_ENDED_OK
MYFOLDER,JobA,,JobA_ENDED_OK
I need to loop through and find where token 4 in one line matches token 3 in another line and then echo a statement to a file. I am looking for an output file that shows this:
MYFOLDER_JobA_MYFOLDER_JobB matches JobA_ENDED_OK
MYFOLDER_JobA_MYFOLDER_JobC matches JobA_ENDED_OK
MYFOLDER_JobB_MYFOLDER_JobD matches JobB_ENDED_OK
MYFOLDER_JobC_MYFOLDER_JobD matches JobC_ENDED_OK
MYFOLDER_JobD_MYFOLDER_JobE matches JobD_ENDED_OK
I know it's a FOR loop with a DO, I am just not getting the rest of it.
Any assistance is greatly appreciated.
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q46510665.txt"
SET "outfile=%destdir%\outfile.txt"
(
FOR /f "usebackqdelims=" %%h IN ("%filename1%") DO (
SET "col4line=%%h"
SET "col4line=!col4line:,=|,|!"
FOR /f "tokens=1-4delims=," %%a IN ("!col4line!") DO IF "%%d" neq "|" (
FOR /f "usebackqdelims=" %%H IN ("%filename1%") DO (
SET "col3line=%%H"
SET "col3line=!col3line:,=|,|!"
FOR /f "tokens=1-4delims=," %%A IN ("!col3line!") DO (
IF "%%d|"=="%%C" (
SET "reportline=%%a_%%b_%%A_%%B matches %%C"
ECHO !reportline:^|=!
)
)
)
)
)
)>"%outfile%"
GOTO :EOF
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q46510665.txt containing your data for my testing.
Produces the file defined as %outfile%
For each line in the file, set a variable col4line to the entire line via %%h, then replace each , with |,| so that successive , will be separated. Tokenise on , and ignore any line which has simply | as its 4th token (ie last-column-empty).
Repeat the process for every line in the file this time through %%H into col3line (note case differential to use different actual metavariables) and if the third column matches the fourth column+| from the outer loop, assemble the report line from the tokens and output, removing the |s.
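The nested-loop idea described above can be checked with a small Python sketch. This is an illustration of the matching logic only, under the assumption that every line has four comma-separated columns; find_matches is a hypothetical name. Python's split keeps empty fields, so the |,| substitution that the batch version needs is not required here.

```python
from typing import List


def find_matches(lines: List[str]) -> List[str]:
    """Pair each non-empty 4th column with every line whose 3rd column equals it."""
    rows = [line.split(",") for line in lines]
    report = []
    for outer in rows:
        produced = outer[3] if len(outer) > 3 else ""
        if not produced:
            continue  # last column empty: nothing to match, as in the batch script
        for inner in rows:
            if len(inner) > 2 and inner[2] == produced:
                report.append(
                    f"{outer[0]}_{outer[1]}_{inner[0]}_{inner[1]} matches {inner[2]}"
                )
    return report
```

The report lines come out in file order rather than sorted, so sort the result if the order shown in the question matters.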

Windows batch code to merge file and edit

I have a lot of CSV files in a folder.
The files are named like OPERATORS_*.csv where * is a variable.
Using a batch file, I want to merge all the files into one, dropping the first row of each file and appending the * part of the file name to the end of each row.
I have tried this code:
copy /b OPERATORS_*.csv OPERATORS_FULL.csv
This works, but the first row of each file is still included and I lose the attribute from the filename.
Example:
OPERATORS_ACTIVITY1.csv
OPT;SALES;REDEMPTION
OPT1;12;75
OPERATORS_ACTIVITY2.csv
OPT;SALES;REDEMPTION
OPT2;22;64
and I want this:
OPERATORS_FULL.csv
OPT1;12;75;ACTIVITY1
OPT2;22;64;ACTIVITY2
Any suggestions?
Try this (Update #2):
@ECHO OFF
SETLOCAL EnableDelayedExpansion
IF EXIST OPERATORS_FULL.csv DEL OPERATORS_FULL.csv
IF EXIST OPERATORS_FULL.tmp DEL OPERATORS_FULL.tmp
FOR %%A IN ( OPERATORS_*.csv ) DO (
:: get attribute from filename
SET "attr=%%A"
SET "attr=!attr:OPERATORS_=!"
SET "attr=!attr:.csv=!"
:: get date suffix
SET tmp=!attr:_= !
FOR %%G IN ( !tmp! ) DO (
SET date_=%%G
)
:: if we have a date (i.e. a numeric value)
IF !date_! EQU +!date_! (
:: ...remove date from attr with leading underscore
CALL SET attr=%%attr:_!date_!=%%
) ELSE (
:: ...else clear date variable
SET date_=
)
:: dump CSVs, skipping each header line, adding the attribute from the filename
FOR /F "skip=1 tokens=*" %%G IN ( %%A ) DO ECHO %%G;!attr!;!date_! >> OPERATORS_FULL.tmp
)
REN OPERATORS_FULL.tmp OPERATORS_FULL.csv
Here is a different approach using redirection -- see all the explanatory rem remarks in the script:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_INPUT=OPERATORS_*.csv" & rem // (input files)
set "_OUTPUT=OPERATORS_FULL.csv" & rem // (output file)
set /A "_SKIP=1" & rem // (number of lines to skip for each input file)
rem // Redirect the whole output at once:
> "%_OUTPUT%" (
rem // Iterate over all the input files:
for %%F in ("%_INPUT%") do (
rem // Store the current file name to get the attribute name later:
set "NAME=%%~nF"
rem // Exclude the output file from being processed:
if /I not "%%~nxF"=="%_OUTPUT%" (
rem // Determine the number of lines of the current input file:
for /F %%E in ('^< "%%~F" find /C /V ""') do set /A "CNT=%%E"
rem // Read current input file:
< "%%~F" (
setlocal EnableDelayedExpansion
rem // Loop over every line:
for /L %%E in (1,1,!CNT!) do (
rem // Read current line:
set "LINE=" & set /P LINE=""
rem // Return current line if it is not to be skipped:
if %%E GTR %_SKIP% echo(!LINE!;!NAME:*_=!
)
endlocal
)
)
)
)
endlocal
exit /B
@echo off
setlocal
del operators_full.csv 2>nul >nul
FOR %%f IN (operators_*.csv) DO for /f "usebackqdelims=" %%a in ("%%f") do echo %%a>operators_full.txt&goto body
:body
(
FOR %%f IN (operators_*.csv) DO FOR /f "tokens=1*delims=_" %%s IN ("%%~nf") DO for /f "skip=1usebackqdelims=" %%a in ("%%f") do echo %%a;%%t
)>>operators_full.txt
move operators_full.txt operators_full.csv
First, delete the output file if it exists, then start copying the file(s) to a .txt file but deliberately abort after the very first line.
Then, for each file, tokenise on the _ in the name part of the file %%f and copy every line except the first, appending the post-_ part of the filename (in %%t) to each line, and append the result to the .txt file. (Note the position of the outer pair of parentheses: this syntax allows the output of the entire code block to be redirected.)
Finally, move or rename the file.
Oh -- you don't want the header line? Omit the first for line.
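As a cross-check of what the merged file should contain, here is a minimal Python sketch of the same steps (skip the header, append the part of the file name after the prefix). It models the logic only and does no file I/O; merge_with_attribute and its parameters are hypothetical names, and the input is assumed to be a mapping from file name to that file's lines.

```python
from typing import Dict, List


def merge_with_attribute(
    files: Dict[str, List[str]], prefix: str = "OPERATORS_", skip: int = 1
) -> List[str]:
    """Drop the first `skip` lines of each file and append ;ATTRIBUTE to the rest."""
    merged = []
    for name in sorted(files):
        stem = name.rsplit(".", 1)[0]  # file name without its extension
        # the attribute is whatever follows the prefix in the file name
        attribute = stem[len(prefix):] if stem.startswith(prefix) else stem
        for line in files[name][skip:]:
            merged.append(f"{line};{attribute}")
    return merged
```

With the two example files from the question this should return the two lines shown under OPERATORS_FULL.csv.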

Reading list of files in a directory and copying the contents using batch command file

I have a list of CSV files in a directory whose names have the format XX_YYYFile.csv, where XX is a name that can contain any characters (including spaces), and YYY is 3 random digits. For example: "book_123File.csv", "best movie_234File.csv", etc. I want to read this list of files, then create new CSV files with "_YYYFile" removed from the name. The content of each new file is the same as the original, except that a first line with the value "number,name,date" needs to be added.
set inputFileFolder=C:\Input
set outputFileFolder=C:\Output
FOR /F "delims=" %%F IN ('DIR %inputFileFolder%\*File.csv /B /O:D') DO (
set reportInputFile=%inputFileFolder%\%%F
set reportInputFileName=%%F
set result=!reportInputFileName:~0,-12!
set reportOutputFileName=!result!.csv
set reportOutputFile=%outputFileFolder%\!result!.csv
echo number,name,date > !reportOutputFile!
for /f "tokens=* delims=" %%a in (!reportInputFile!) do (
echo %%a >> !reportOutputFile!
)
)
If I run this batch file, file "book.csv" is successfully created with the correct contents (first line: "number,name,date", the next lines are from file "book_123.csv"). But file "best movie_234.csv" and other files that contain a space in the filename are not processed successfully. File "best movie.csv" is created with only 1 line, "number,name,date". The contents of file "best movie_234.csv" are not copied to file "best movie.csv".
Please help.
You need to Escape Characters, Delimiters and Quotes properly. Note the usebackq parameter in inner for /F loop as well:
@ECHO OFF
SETLOCAL EnableExtensions EnableDelayedExpansion
set "inputFileFolder=C:\Input"
set "outputFileFolder=C:\Output"
FOR /F "delims=" %%F IN ('DIR "%inputFileFolder%\*File.csv" /B /O:D') DO (
set "reportInputFile=%inputFileFolder%\%%F"
set "reportInputFileName=%%F"
set "result=!reportInputFileName:~0,-12!"
set "reportOutputFileName=!result!.csv"
set "reportOutputFile=%outputFileFolder%\!result!.csv"
>"!reportOutputFile!" echo number,name,date
for /f "usebackq tokens=* delims=" %%a in ("!reportInputFile!") do (
>>"!reportOutputFile!" echo %%a
)
rem above `for /f ... %%a ...` loop might be replaced by FINDSTR
rem >>"!reportOutputFile!" findstr "^" "!reportInputFile!"
rem or by TYPE
rem >>"!reportOutputFile!" type "!reportInputFile!"
)
Hint: each > and >> redirector works as follows:
opens the specified output file, then
writes something to the output file, and finally
closes the output file.
This procedure can be extremely slow for larger files when it is repeated on every iteration of the for /f ... %%a ... loop:
>"!reportOutputFile!" echo number,name,date
for /f "usebackq tokens=* delims=" %%a in ("!reportInputFile!") do (
>>"!reportOutputFile!" echo %%a
)
Use block syntax instead:
>"!reportOutputFile!" (
echo number,name,date
for /f "usebackq tokens=* delims=" %%a in ("!reportInputFile!") do (
echo %%a
)
)
The for /f ... %%a ... loop above might be replaced by the FINDSTR command (it eliminates empty lines like for does) as follows:
>"!reportOutputFile!" (
echo number,name,date
findstr "^." "!reportInputFile!"
)
or by TYPE command (it will retain empty lines unlike for) as follows:
>"!reportOutputFile!" (
echo number,name,date
type "!reportInputFile!"
)
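For comparison, the name slicing and the "open once, write everything" pattern can be sketched in Python. This is a rough model under stated assumptions, not the batch script's exact behaviour: output_name and write_report are hypothetical helpers, and keep_empty only imitates the difference between the TYPE variant (keeps empty lines) and the FOR /F or FINDSTR "^." variants (drop them).

```python
import io
from typing import Iterable, TextIO


def output_name(input_name: str) -> str:
    # "_YYYFile.csv" is 12 characters, matching the ~0,-12 substring in the batch code
    return input_name[:-12] + ".csv"


def write_report(out: TextIO, lines: Iterable[str], keep_empty: bool = True) -> None:
    """Write the header and all lines through one already-open output stream."""
    out.write("number,name,date\n")
    for line in lines:
        if line or keep_empty:
            out.write(line + "\n")


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for the real output file
    write_report(buffer, ["1,book,2020-01-01", "", "2,movie,2020-01-02"], keep_empty=False)
    print(output_name("best movie_234File.csv"))
    print(buffer.getvalue(), end="")
```

The output stream is opened once by the caller, which mirrors the block redirection above rather than reopening the file for every line.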

create multiple text files from a list using batch file

I have a list of letters in a file called "letters.txt" and the number of occurrences of each letter in a file called "LetterPerSample.txt". Both files are in the same order, so the first row of letters.txt has "a", the second has "b", etc., and likewise the first row of SamplePerLetter.txt has the max number for "a", the second has the max number for "b", and so on. I want to create a list of files like a_1.txt, a_2.txt, ..., a_max.txt, where max is the number listed above, and each generated file has its own letter written inside. So a_1.txt has "a" written inside, b_5.txt has "b" written inside, and so on.
what i have done so far is:
@echo off
setlocal enableDelayedExpansion
for /f "tokens=*" %%a in (letters.txt) do (
set letter=%%a
for /f "tokens=*" %%b in (SamplePerLetter.txt) do (
set num=%%b
for/L %%g IN (1,1,!num!) do (
set index=%%g
echo !letter!>letter_labels/!letter!/!letter!!index!.lab
)
)
)
sample of the output
a_1.txt
a_2.txt
...
a_10.txt
b_1.txt
b_2.txt
...
b_10.txt
But a and b don't have the same number of occurrences in the file LetterPerSample.txt: a has 10 and b has 5. So what's wrong with my code?
This method does not require that the letters in the letters.txt file be in order, so you may insert just the desired letters in that file:
@echo off
setlocal EnableDelayedExpansion
rem Load the number of occurrences of each letter from "LetterPerSample.txt" file
set "letters=abcdefghijklmnopqrstuvwxyz"
set "i=0"
for /F %%b in (LetterPerSample.txt) do (
for %%i in (!i!) do set "number[!letters:~%%i,1!]=%%b"
set /A i+=1
)
rem Process the letters in "letters.txt" file (in any order)
for /F %%a in (letters.txt) do (
set "letter=%%a"
set "num=!number[%%a]!"
for /L %%g in (1,1,!num!) do (
set "index=%%g"
echo !letter!>letter_labels\!letter!\!letter!_!index!.lab
)
)
You may review the management of arrays in Batch files at this post.
If the letters.txt file always has all the letters, from a to z, then this file contains redundant information that can be eliminated:
@echo off
setlocal EnableDelayedExpansion
rem Load the number of occurrences of each letter from "LetterPerSample.txt" file
rem and create the desired files
set "letters=abcdefghijklmnopqrstuvwxyz"
set "i=0"
for /F %%b in (LetterPerSample.txt) do (
for %%i in (!i!) do set "letter=!letters:~%%i,1!"
set /A i+=1
set "num=%%b"
for /L %%g in (1,1,!num!) do (
set "index=%%g"
echo !letter!>letter_labels\!letter!\!letter!_!index!.lab
)
)
Your problem is how to read two files simultaneously. Here is a trick to do so:
@echo off
setlocal enabledelayedexpansion
<letterpersample.txt (
for /f %%a in (letters.txt) do (
set /p num=
for /l %%i in (1,1,!num!) do (
echo %%a>letter_labels\%%a\%%a%%i.lab
)
)
)
The for loop (%%a) reads one line after the other from letters.txt. set /p reads a line from STDIN (which is redirected from letterpersample.txt), so if for reads line number 5 from one file, set /p reads line number 5 from the other.
(PS: I doubt, your echo logic is ok. Seems odd)
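The lockstep reading described above is the same idea as pairing two sequences item by item. A minimal Python sketch, assuming both files are already loaded as lists of lines and using the letter_index.lab naming from the other answers (plan_label_files is a hypothetical name, and it only plans the files instead of writing them):

```python
from typing import List, Tuple


def plan_label_files(letters: List[str], counts: List[str]) -> List[Tuple[str, str]]:
    """Return (file name, file content) pairs, reading both lists in lockstep."""
    plan = []
    # zip pairs line N of one list with line N of the other
    for letter, count in zip(letters, counts):
        for index in range(1, int(count) + 1):
            plan.append((f"{letter}_{index}.lab", letter))
    return plan
```

Each letter gets exactly as many files as its own count, which is what the nested loop in the question fails to do because it re-reads the whole counts file for every letter.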

Batch Scripting:search file, extract numbers, and copy into new file

I'm new to batch scripting and have a question. I was wondering if it's possible to search a .txt file by requirements and take data specified and copy into a new .txt file?
Like if I have 50 lines with 9 digit numbers and a bunch of other crap I don't need after them can I say,
"For any line beginning with a 1,2,3,4,5,6,7,8,or 9...take the first 9 digits and copy them into a new file, for all lines in the file???"
I thought this would be easier than trying to delete all the other stuff. Let me know if you know anything about how to do this! Thanks.
Here's an example of what one line looks like:
123456789#example
and I just need to extract the 9 digit numbers from about 50 lines of this.
You can use FINDSTR to filter out all lines that do not start with 9 digits. Then FOR /F can read the result, line by line. A variable is set, and a substring operation preserves just the first 9 digits.
@echo off
setlocal enableDelayedExpansion
(
for /f %%A in (
'findstr "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" yourFile.txt'
) do (
set "ln=%%A"
echo !ln:~0,9!
)
)>newFile.txt
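The filter-then-truncate logic of the FINDSTR approach can be modelled in a few lines of Python. This is a sketch for checking expected results, not part of the batch solution; leading_nine_digits is a hypothetical name and the input is assumed to be a list of lines already read from the file.

```python
import re
from typing import List

# nine digits at the start of the line, like the FINDSTR pattern above
NINE_DIGITS = re.compile(r"[0-9]{9}")


def leading_nine_digits(lines: List[str]) -> List[str]:
    """Keep only lines that start with nine digits and return just those digits."""
    result = []
    for line in lines:
        match = NINE_DIGITS.match(line)  # match() anchors at the start of the line
        if match:
            result.append(match.group(0))
    return result
```

Lines with fewer than nine leading digits are dropped entirely, which is the same choice the FINDSTR pattern makes.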
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
(
FOR /f "delims=" %%a IN (q24824079.txt) DO (
SET "line=%%a"
REM set DIGITS to the first 9 characters of LINE
SET "digits=9!line:~0,9!"
FOR /L %%z IN (0,1,9) DO SET "digits=!digits:%%z=!"
IF NOT DEFINED digits ECHO(!line:~0,9!
)
)>newfile.txt
GOTO :EOF
I used a file named q24824079.txt containing data for my testing.
Produces newfile.txt
You did not specify what to do if the line was all-digits but has fewer than 9 characters. I chose to report that line.
Hopefully this helps getting the job done:
@echo off
setlocal enabledelayedexpansion
for /f %%e in (emails.txt) do (
echo Email: %%e
for /f "delims=# tokens=1" %%b in ("%%e") do (
set BEGIN=%%b
echo Name: !BEGIN!
set FIRST=!BEGIN:~0,1!
echo First char: !FIRST!
set /a NUMERIC=!FIRST!+0
echo Converted to number: !NUMERIC!
if !FIRST!==!NUMERIC! echo Yippieh!
echo.
)
)
Instead of echo Yippieh! append the email (%%e) to a file, e.g. like
echo %%e >> output.txt

Resources