Why is this batch file producing extra, unexpected, unwanted characters? - windows

I'm trying to use the following batch script to concatenate some files together:
copy NUL bin\translate.js
for %%f in (source\Libraries\sprintf.js, source\translate-namespace.js, source\util.js, source\translator.js, source\translate.js) do (
type %%f >> bin\translate.js
echo. >> bin\translate.js
)
However, when I do this, an extra character seems to be printed at the end of each file. When I view the file in ASCII, it is interpreted as these three characters: ï»¿

Why is this happening? What can I do to fix it?

The ï»¿ looks like a Unicode byte order mark (BOM). Is it possible to start with files that are stored without the byte order mark? I am not aware of any command-line commands that can remove the mark.

The DOS copy command works like the UNIX cat command. That is, you can list multiple source files and one destination file, with the source files separated by + signs.
copy source\Libraries\sprintf.js+source\translate-namespace.js bin\translate.js
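If Python happens to be available, the byte order mark can be dropped while the files are concatenated. The following is only a minimal sketch and not part of the original answers: concat_without_bom is a hypothetical helper name, and the CRLF separator is an assumption that stands in for the echo. line in the question's loop.

```python
import codecs
from pathlib import Path


def concat_without_bom(sources, destination, separator=b"\r\n"):
    """Concatenate files byte-for-byte, dropping a leading UTF-8 BOM from each."""
    with open(destination, "wb") as out:
        for source in sources:
            data = Path(source).read_bytes()
            # codecs.BOM_UTF8 is the byte sequence EF BB BF
            if data.startswith(codecs.BOM_UTF8):
                data = data[len(codecs.BOM_UTF8):]
            out.write(data)
            # Assumption: a line break between files, like "echo." in the batch loop
            out.write(separator)
```

It would be called with the list of source paths from the question and bin\translate.js as the destination.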

Related

Batch file keeps adding special characters to file name?

This is the first time I've posted here, so I apologise if I'm in the wrong place.
I have a batch file that reads a list of domains from a text file and then does an nslookup ls against them, posting the results in their own text file.
I've never had a problem with this until recently, and I can't for the life of me work out why this has started happening.
All the files are perfect except for the first one! The first file name is always preceded by "" (without the quotes). These files get read by another program I have written, so it tends to cause a problem.
Here's the code that creates the files...
(
del /s /q "D:\Profile\Desktop\New_folder\Records\*.*"
for /f %%a in (D:\Profile\Desktop\New_folder\Domains\Domains.txt) do (
echo ls %%a >temp\tempfile.txt
echo exit >>temp\tempfile.txt
nslookup < temp\tempfile.txt > records\%%a.txt
)
)
Any help is much appreciated.
Cheers,
Aaron
According to the IBM extended character set, the characters you mentioned have the hex codes EF BB BF, which is the UTF-8 byte order mark ("BOM"); see Wikipedia. This means that the file Domains.txt seems to have been saved recently using UTF-8 character encoding with a BOM.
To get rid of the characters, simply edit the file and save it without a BOM. See, e.g., "How to make Notepad to save text in UTF-8 without BOM?" for how to do that, or search for "remove BOM".
Note that UTF-8 without a BOM is compatible with printable ASCII, i.e. "normal" characters encoded as UTF-8 will show correctly in most common character sets, such as the IBM extended character set.
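To see why three stray characters appear, the BOM bytes can be decoded with a single-byte code page. This is a small illustrative sketch in Python, not something the batch file itself needs; show_bom is a hypothetical helper, and cp1252 and cp437 are just two common code pages chosen as examples.

```python
import codecs


def show_bom(encoding):
    """Return the UTF-8 BOM bytes as they look when misread in `encoding`."""
    # codecs.BOM_UTF8 is the byte sequence EF BB BF
    return codecs.BOM_UTF8.decode(encoding)


# cp1252 is the Western "ANSI" code page; cp437 is the original IBM PC code page.
BOM_AS_CP1252 = show_bom("cp1252")
BOM_AS_CP437 = show_bom("cp437")
```

These evaluate to ï»¿ (cp1252) and ∩╗┐ (cp437), which is why the same three bytes can look different depending on where the file name or file content is displayed.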
If you cannot or do not want to edit the input file, then you might get rid of the prefix in your batch script (see Substrings at http://www.robvanderwoude.com/ntset.php#StrSubst), possibly with something like
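setlocal EnableDelayedExpansion
set BOM_REMOVED=false
for ... do (
    rem Inside a FOR block, read variables with !VAR! (delayed expansion), not %VAR%
    set "X=%%a"
    rem Drop the first three characters (the BOM) from the first entry only
    if !BOM_REMOVED!==false set "X=!X:~3!"
    set BOM_REMOVED=true
    echo ls !X! >temp\tempfile.txt
    ...
)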
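As an alternative outside of batch, Python's "utf-8-sig" codec reads UTF-8 text and ignores a leading BOM if there is one. A minimal sketch, assuming the domain list holds one domain per line; read_domains is a hypothetical name and is not something the original poster's program uses.

```python
def read_domains(path):
    """Return the non-empty lines of a domain list, ignoring a UTF-8 BOM if present."""
    # "utf-8-sig" drops a leading BOM; files saved without one are read normally.
    with open(path, encoding="utf-8-sig") as handle:
        return [line.strip() for line in handle if line.strip()]
```

The first entry then comes back without the three-byte prefix, so any file name built from it is clean.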

concatenating .txt files into a csv file with a tab delimiter

I am trying to concatenate a set of .txt files into a CSV file using the Windows command line.
So I use
type *.txt > me_new_file.csv
but the fields of a given row, which are tab delimited, end up in one column. How do I take advantage of the tab separation in the original text files to create a CSV file in which the fields are aligned in columns correctly, using one or more command lines? I am thinking there might be something like...
type *.txt > me_new_file.csv delim= ' '
but haven't been able to find anything yet.
Thank you for your help. I would also appreciate it if someone could direct me to a related answer.
From the command line you'd have a fairly complicated time of it. The Windows cmd.exe command processor is much, much simpler than dash, ash, or bash, et al.
The best thing would be to concatenate all of your files into the .csv file, open it in a text editor, and do a global find and replace, replacing each tab with a comma.
Be careful that your other data doesn't have any commas in it.
If the source files are tab delimited, then the output file is also tab delimited. Depending on the software you are using, you should be able to load the tab-delimited data properly.
Suppose you are using Excel. If the output file has a .csv extension, then Excel will default to comma delimited columns when it opens the file. Of course that does not work for you. But if you rename the file to have some other extension like .txt, then when you open it with Excel, it will open a series of dialog boxes where you can specify the format, including tab delimited.
If you want to keep the .csv extension and have Excel automatically open it properly, then you need to transform the data. This can be done very easily with JREPL.BAT - a hybrid JScript/batch utility that performs a regular expression search and replace on text data. JREPL.BAT is pure script that runs natively on any Windows machine from XP onward.
The following encloses each value in quotes, just in case a value contains a comma literal.
type *.txt 2>nul | jrepl "\t" "\q,\q" /x /jendln "$txt='\x22'+$txt+'\x22'" /o output.csv
Beware: Your use of type *.txt will fail if the last line in any of your source .txt files does not end with a newline. In such a case, the first line of the next file will be appended to the last line of the previous file. Not good.
You can solve that problem by processing each file individually in a FOR loop.
(for %F in (*.txt) do jrepl "\t" "\q,\q" /x /jendln "$txt='\x22'+$txt+'\x22'" /f "%F") >output.csv
The above is designed to run on the command line. If used in a batch script, then a few changes are needed:
(for %%F in (*.txt) do call jrepl "\t" "\q,\q" /x /jendln "$txt='\x22'+$txt+'\x22'" /f "%%F") >output.csv
Note: My answer assumes none of the source files contain quotes. If they do contain quotes, then a more complicated search and replace is required. But it still can be done efficiently with JREPL.
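For comparison, here is a minimal Python sketch of the same conversion using the standard csv module. It is an alternative to JREPL, not a description of how JREPL works. The function name tab_files_to_csv and the assumption that the input files are UTF-8 are mine. Each file is read row by row, so a missing final newline does not glue two files together, and the writer quotes any field that contains a comma or a quote.

```python
import csv


def tab_files_to_csv(sources, destination):
    """Merge tab-delimited text files into one comma-separated file."""
    with open(destination, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)  # quotes fields that contain commas or quotes
        for source in sources:
            # Assumption: inputs are UTF-8; "utf-8-sig" also tolerates a BOM
            with open(source, newline="", encoding="utf-8-sig") as handle:
                # QUOTE_NONE: treat quote characters in the input as ordinary text
                reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
                for row in reader:
                    if row:  # skip blank lines
                        writer.writerow(row)
```

The list of sources could come from glob.glob("*.txt"), sorted if the order of the files matters.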

Word Sorting in Batch

Right, let me rewrite this to try to make it clearer.
I have two files
File 1, contains words.
file 2, contains commands.
I need to put words from FILE 1
into FILE 2
I cannot copy-paste them one by one, because there are a LOT of words in FILE 1
File 1 is listed in alphabetical order (by first letter)
File 2 the command does not change
The issue is getting words from file 1 into file 2
but they have to be moved into quotes " " in file 2
so a script that could for example..
Take apple from file 1 and move it between the quotes in admin.executeremotecommand "apple" inside file 2, keeping the words in order as it goes down the list and moves them across.
This could perhaps be done the other way around, in which the script writes the command in front of the words in file 1 as it goes down file 1's list.
Is this even possible? I've never seen this done anywhere else, and I'm completely clueless as to whether batch is even the right language for it.
The question is a little confusing, but based on your responses in the comments my understanding is that you don't necessarily need the script to edit a preexisting file 2, because you're repeating the same command(s) for each word, so the script can just create a new file based on the words in file 1.
You can do it at the prompt like this:
FOR /F %a IN (words.txt) DO ECHO admin.executeremotecommand "%a" >> commands.txt
The original version of the question indicated that you want more than one command for each word. I take it you changed that in order to simplify the question, and figured you'd just run the script once for each command? However, it's quite simple to have it produce more than one command for each word:
FOR /F %a IN (words.txt) DO (ECHO first.command "%a" & ECHO second.command "%a") >> commands.txt
In a batch file, you'd do it this way:
@ECHO OFF
FOR /F %%a IN (words.txt) DO (
ECHO first.command "%%a"
ECHO second.command "%%a"
) >> commands.txt
BTW, in the code in some of your comments, you surrounded the variable with %'s (%A%). That's incorrect; it would evaluate to the value of %A followed by a literal %. Surrounding with %'s is used only for environment variables. (Note that the %'s around environment variables do not get doubled in a batch file. For example, to get the current date, use ECHO %date% both at the prompt and in a batch file.)
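If batch turns out not to be the right tool, the same idea is easy to express in other languages. Below is a minimal Python sketch under a few assumptions of mine: the function name write_commands, the {word} placeholder, and one word per line in the word file. Unlike FOR /F with its default settings, which takes only the first space-delimited token of each line, this uses the whole line.

```python
def write_commands(words_path, commands_path,
                   templates=('admin.executeremotecommand "{word}"',)):
    """Append one command line per template for every word, keeping word order."""
    with open(words_path, encoding="utf-8-sig") as words, \
            open(commands_path, "a", encoding="utf-8") as commands:
        for line in words:
            word = line.strip()
            if not word:
                continue  # ignore blank lines
            for template in templates:
                # Put the word between the quotes of the command template
                commands.write(template.format(word=word) + "\n")
```

Passing several templates produces several commands per word, in the same way as the first.command and second.command example above. The file is opened in append mode to mirror the >> redirection.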

Unicode characters in batch files

I need to use a lot of characters from character map for this batch file.
Here is part of the batch file I am using:
"C:\v2.vbs" "C:\file.txt" 火 a
Is there a way to have cmd recognize the 火 or any other non-keyboard characters I have in the batch file? This command seems to only work if I don't use special characters.
What else could I use that will run a batch file and accomplish this?
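If this
"C:\v2.vbs" "C:\file.txt" <literal UTF-16 character> a
means "start v2.vbs with 3 arguments", then you could encode the second parameter like "&Habcd", where abcd is the character's hexadecimal code (for example "&H706B" for 火; the quotes are needed because & is otherwise a command separator), and use sC = ChrW(WScript.Arguments(1)) in v2.vbs.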
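Looking up the hexadecimal code for many characters by hand is tedious, so it can be generated. A minimal Python sketch with hypothetical helper names (to_vbs_hex, from_vbs_hex); it only covers characters in the Basic Multilingual Plane, which is an assumption of this sketch, not something stated in the answer.

```python
def to_vbs_hex(char):
    """Return an "&Hxxxx" literal for a single character, e.g. for use with ChrW."""
    code = ord(char)
    if code > 0xFFFF:
        # Assumption: characters beyond U+FFFF are out of scope for this sketch
        raise ValueError("character is outside the Basic Multilingual Plane")
    return "&H{:04X}".format(code)


def from_vbs_hex(literal):
    """Inverse of to_vbs_hex: turn "&Hxxxx" back into the character."""
    return chr(int(literal[2:], 16))
```

For 火 this gives "&H706B", which is the value to put, in quotes, in place of the literal character in the batch file.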

Windows command line/shell - While appending file to another file, how to ignore lines that match a regex?

I'm not familiar with Windows shell. So, let's say my file is like:
DontAppend this line shouldn't be appended
DontAppend this line shouldn't be either
Some lines
more lines
And I'm appending like this:
type file.txt >> AppendHere.txt
This appends the whole file. How do I make it so it skips lines that begin with "DontAppend"?
The command findstr will let you search for lines not containing a string or regular expression so you can use:
findstr /vrc:"^[^A-Za-z0-9]*DontAppend" file.txt >> AppendHere.txt
The /v option outputs only the lines that do not match, the /r option says the search string should be treated as a regular expression, and the caret (^) anchors the match to the beginning of the line.
Edit: added a filter for non-alphanumeric characters that may solve the Unicode issues (Unicode files sometimes have non-printable indicator characters at the beginning).
Either get grep for Windows, or you could use Windows' own find command:
type so.txt|find /v "DontAppend" >> output.txt
The /v option means output lines that don't match your string.
find works for very simple things like this, but for anything more you will need a real filtering tool like grep.
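Where neither findstr nor grep fits, the same filter can be sketched in a few lines of Python. The function name append_filtered is hypothetical, the default pattern copies the findstr expression from the first answer, and UTF-8 input is an assumption.

```python
import re


def append_filtered(source, destination, pattern=r"^[^A-Za-z0-9]*DontAppend"):
    """Append lines of `source` to `destination`, skipping lines that match `pattern`."""
    skip = re.compile(pattern)
    # "utf-8-sig" ignores a BOM at the start of the source file, if any
    with open(source, encoding="utf-8-sig") as src, \
            open(destination, "a", encoding="utf-8") as dst:
        for line in src:
            if not skip.search(line):
                dst.write(line)
```

The destination is opened in append mode, so existing content is kept, as with >> in the batch version.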
