Trying to translate a Linux shell script loop to Windows + GnuWin32

Edited to remove unnecessary usage of the "cat" command...
I have a Linux shell script that reads data out of a CSV file and performs operations based on the last column of each line in the file.
The input file has this basic format:
asdf,foo
1234,foo
qwerty,bar
zxcv,baz
7890,bar
The original Linux script looks like this:
sed s/.*,//g $1 | sort -u | while read item
do
# Do stuff with $item
done
I'm having a tough time translating the Linux script to run in a Windows shell environment with GnuWin32 versions of cat, sed, and sort. Here's what I've tried so far:
for /f "tokens=*" %%A in ('sed s/.*,//g %1 ^| sort -u') do (some stuff with %%A)
When I try to run this, I get:
sed: -e expression #1, char 4: unterminated `s' command
cat: write error: Invalid argument
I'm sure I'm misunderstanding something rudimentary about batch scripting. I tried to cover some basics like escape sequences, but I'm still drawing a blank. Any hints?
Thanks guys!

Try this:
for /f "delims=" %%A in ('cat "%~1" ^| sed "s/.*,//g" ^| sort -u') do (some stuff with %%A)
On Windows, replace the single quotes (') used in the Linux shell script with double quotes ("). If a GnuWin32 command itself needs double quotes, they must be escaped with a backslash (\). This applies only to quoting at the Windows cmd prompt; the sed, awk, etc. scripts themselves are fully compatible.
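To check that a translated loop produces the same items as the original, it can help to have a reference for what the sed | sort -u pipeline computes. Below is a minimal Python sketch of that logic; the function name unique_last_fields is hypothetical, and plain string ordering is assumed, which may differ from what sort produces under some locales.

```python
def unique_last_fields(lines):
    """Return the sorted, de-duplicated text after the last comma of each line."""
    # sed "s/.*,//g": the greedy .* removes everything up to the last comma;
    # a line without a comma is left unchanged
    fields = {line.rstrip("\r\n").rsplit(",", 1)[-1] for line in lines}
    # sort -u: sort and drop duplicates (plain string order assumed here)
    return sorted(fields)


sample = ["asdf,foo\n", "1234,foo\n", "qwerty,bar\n", "zxcv,baz\n", "7890,bar\n"]
for item in unique_last_fields(sample):
    print(item)  # stand-in for "do stuff with item"
```

For the sample input from the question, the loop body runs once each for bar, baz and foo.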

Related

Replacement for $() in Windows batch script

I am trying to convert my bash script into a Windows batch file. It's a really simple one liner that's supposed to feed the contents of a file as arguments to script.exe, and send the results to output.txt.
This is the working bash script:
./script.exe $(cat input.txt) > output.txt
I know this might be bad style, but it works. The problem is, I have no idea how to do something like $() in a windows batch file. When I use it it sends the string "$(cat input.txt)" as the argument instead of running the command.
This bash construct is called command substitution. Here is a great answer from @MichaelBurr.
You can get a similar functionality using cmd.exe scripts with the
for /f command:
for /f "usebackq tokens=*" %%a in (`echo Test`) do my_command %%a
Yeah, it's kinda non-obvious (to say the least), but it's what's
there.
See for /? for the gory details.
Sidenote: I thought that to use "echo" inside the backticks in a
"for /f" command would need to be done using "cmd.exe /c echo
Test" since echo is an internal command to cmd.exe, but it works
in the more natural way. Windows batch scripts always surprise me
somehow (but not usually in a good way).
See also, on Superuser: Is there something like Command Substitution in WIndows CLI?
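For readers unsure what command substitution actually does, here is a rough Python sketch of the idea: run a command, capture its output, and split it into words that become arguments. The helper name command_substitution is hypothetical, and the sketch ignores the filename expansion that bash also applies to unquoted substitutions.

```python
import subprocess
import sys


def command_substitution(argv):
    """Run a command and return its output split on whitespace, like $(...)."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    # Unquoted $(...) in bash splits the captured output into words on whitespace
    return result.stdout.split()


# Stand-in for "cat input.txt": a child Python process that prints some words
words = command_substitution(
    [sys.executable, "-c", "print('alpha beta'); print('gamma')"]
)
print(words)
```

Note that for /f with "tokens=*" hands over each output line as a whole rather than splitting it into words, so the two mechanisms are similar but not identical.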

Why can't I pipe 'where' output to 'type' in batch?

I'm trying to print the contents of a batch file that exists in my path.
I can find the file with 'where':
> where myscript
C:\scripts\myscript.bat
I can display the contents of the file with 'type':
> type C:\scripts\myscript.bat
echo This is my script. There are many like it, but this one is mine.
However, when I want to be lazy and use a single command:
> where myscript | type
The syntax of the command is incorrect.
Based on some tests I did, it seems 'where' output can't be piped out and 'type' input can't be piped in.
Can anyone explain why this doesn't work in this way?
P.S. I was able to do this in Powershell: Get-Command myscript | Get-Content.
As @Luaan said in the comments, type accepts the filename only as an argument, not via its input channel, so piping won't do the trick in your case. You'll have to pass the result of the where command as an argument some other way. Fortunately, for /f can process the output of other commands. To print the file that the where command finds, use this on the command line:
FOR /F "delims=" %G IN ('where myscript') DO type "%G"
In a batch-file you'll have to use
@echo off
FOR /F "delims=" %%G IN ('where myscript') DO type "%%G"
As Luaan and J.Baoby explained, not all commands can read their arguments from a pipe or a redirection, but some of them can.
The output is not exactly the same, but this is probably the closest syntax on the command line:
where myScript | findstr /f:/ "^"
The output of the where command is piped into findstr. The /f switch specifies the list of files to search, and the slash means that the list is read from standard input. The "^" is just a regular expression that matches every line in the files listed by the where command.
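The underlying point is that the path has to reach the second step as an argument, not as piped data. A minimal Python sketch of the same two steps (locate, then print) follows; show_command_file is a hypothetical name, and the demo uses a throwaway directory in place of a real PATH entry.

```python
import os
import shutil
import tempfile


def show_command_file(name, search_path=None):
    """Locate name on the search path (like where) and return its text (like type)."""
    # mode=os.F_OK: accept any existing file, executable bit not required
    found = shutil.which(name, mode=os.F_OK, path=search_path)
    if found is None:
        raise FileNotFoundError(name)
    with open(found, encoding="utf-8") as handle:
        return handle.read()


# Demo with a throwaway directory standing in for a PATH entry
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "myscript.bat"), "w", encoding="utf-8") as handle:
        handle.write("echo This is my script.\n")
    demo_text = show_command_file("myscript.bat", search_path=folder)
    print(demo_text, end="")
```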

Can text lines be modified while writing them to a file?

Can we add a string to the text we are writing or appending to a file?
For example
dir filepath/filename.txt >> file1.txt
I want to add a string (,ab) at the end of every line in file1.txt. In the same line and not in the next line.
somewhat like
source file
data1
data2
data3
target should be
data1,ab
data2,ab
data3,ab
You can use the FOR command for this purpose (see HELP FOR on the command prompt):
FOR /F "delims=" %i IN (filepath\filename.txt) DO ECHO %i,ab >> file1.txt
That will read each line from filepath\filename.txt and write it to file1.txt, with your ab appended.
Now, if you really wanted to execute the dir filepath\filename.txt command, and add the ab to its output, then you'd do the following
FOR /F "usebackq delims=" %i IN (`dir filepath\filename.txt`) DO ECHO %i,ab >> file1.txt
or
FOR /F "delims=" %i IN ('dir filepath\filename.txt') DO ECHO %i,ab >> file1.txt
Finally, note that if you want to put the above commands in a batch file, you need to escape the %i by writing %%i.
Additionally, to reduce noise if executed on the command line, you can use "@ECHO" instead of "ECHO", i.e. as @dbenham commented, prefix the DO-command with the "@" character.
Christian K's answer using FOR /F is how you would do this using pure batch. There could be complications depending on the command output.
By default, FOR /F will skip lines beginning with ; (the default EOL character). If you know of a character that cannot appear in the output, then you can simply set EOL to that character. But sometimes you have no control or knowledge of the potential output. There is an awkward syntax that disables both DELIMS and EOL, thus solving that problem.
Note that I prefix the DO command with @ to prevent the command line from being echoed. I also enclose the entire construct within parentheses and redirect only once, as it is more efficient:
(for /f delims^=^ eol^= %A in ('someCommand') do @echo %A,ab)>file1.txt
But there is still a potential problem in that empty lines will be skipped. There is a solution using FINDSTR or FIND to prefix each line with the line number, and then remove the prefix within the loop. But that can slow things down.
It is much simpler to use my JREPL.BAT utility that performs a regular expression search and replace on text. This is also much faster if you are dealing with a lot of output.
The following will append ,ab to all lines of output, including empty lines:
someCommand | jrepl "$" ",ab" >>file1.txt
Your example implies that you do not want to append anything to empty lines. The following accomplishes that:
someCommand | jrepl ".$" "$&,ab" >>file1.txt
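The difference between the two patterns ("$" appends to every line, ".$" only to non-empty ones) can be sketched in a few lines of Python. This is an illustration of the intended behavior only, not of how JREPL works internally; append_suffix and its skip_empty flag are hypothetical names.

```python
def append_suffix(lines, suffix=",ab", skip_empty=False):
    """Append suffix to each line; optionally leave empty lines untouched."""
    result = []
    for line in lines:
        if skip_empty and line == "":
            result.append(line)  # like the ".$" pattern: nothing to match
        else:
            result.append(line + suffix)  # like the "$" pattern
    return result


source = ["data1", "", "data3"]
print(append_suffix(source))
print(append_suffix(source, skip_empty=True))
```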

Windows command prompt: Using a variable set in the same line of a one-liner

Okay, I learned that one can use & or && to combine multiple commands into a single line, but it seems that variables that are set aren't actually available for interpolation in the same line:
C:\Users\Andrew>set foo=hello, world!&& echo %foo%
%foo%
C:\Users\Andrew>echo %foo%
hello, world!
Why can't I make this work, and is there any way to make it work in a single line?
The reason I need a one-liner is that an external program I'm working with accepts a single command as a pre-run hook, and, of course, I need to run multiple commands.
Preemptive Defenses
"hello, world! should be surrounded in double-quotes!" Actually, doing so seems to store literal double-quotes in the variable, which I do not want, e.g.
C:\Users\Andrew>set bar="hello, world!"&& echo %bar%
%bar%
C:\Users\Andrew>echo %bar%
"hello, world!"
"There should be a space before the &&!" Actually, doing so seems to store a trailing space in the variable, which I do not want, e.g.
C:\Users\Andrew>set baz=hello, world! && echo %baz%
%baz%
C:\Users\Andrew>echo (%baz%)
(hello, world! )
"Both!" >:(
C:\Users\Andrew>set mu="hello, world!" && echo %mu%
%mu%
C:\Users\Andrew>echo (%mu%)
("hello, world!" )
You can do it in the same line, but I would recommend using a batch file like preHook.bat.
As a one-liner:
set "mu=hello, world!" && call echo %^mu%
First, note that the quotes are placed differently than in yours.
Here's why:
set test1="quotes"
set "test2=no quotes" with comment
In the first test, the quotes will be part of the test1 variable, as well as all characters after the last quote.
In the second test, it uses the extended syntax of the SET command.
So the content will be no quotes, as only the content to the last quote is used; the rest will be dropped.
To echo the variable content, I use call echo %^mu%, as percent expansion will expand it when the line is parsed, before any of the commands are executed.
But the call command will be executed later, and it restarts the parser, which, when used at the command line, uses different expansion rules than the batch parser: an empty variable (in this case, %^mu%, in the first time) stays unchanged in the line; but, in the next parser phase, the ^ caret will be removed.
In your case, call echo %mu% would also work, but only when mu is always empty. The caret variant also works when mu has content before the line is executed.
More about the parser at SO: How does the Windows Command Interpreter (CMD.EXE) parse scripts?
And about variable expansion at SO: Variable expansion
Though I accepted @jeb's answer, ultimately I had to go another route, because I needed to pipe the output of my command, and doing so on call led to mangled results. Indeed, the documentation for call itself remarks,
Do not use pipes and redirection symbols with call.
Instead, reminded by @Blorgbeard of cmd's /v option (and I believe that should be lowercase, not uppercase), I realized I could simply start a subprocess:
C:\Users\Andrew>cmd /v /c "set foo=hello, world!&& echo !foo!"
hello, world!
(/v must appear before /c, because cmd treats everything after /c as the command to run.) Within the quotes, I was able to pipe my output to other utilities. One tip for those taking this path: if you find yourself needing to use quotes within those quotes, I suggest avoiding them altogether and trying character codes, e.g. \x20 for space, \x22 for double quotes, and so on.
For example, this was the eventual solution to my problem (warning: may cause eyes to bleed):
C:\Users\Andrew>cmd /v /c "set source=C:\source& set target=C:\target& set archive=C:\archive& robocopy.exe !source! !target! /l /e /zb /xx /xl /fp /ns /nc /ndl /np /njh /njs | sed -e s/^^[\t\x20]\+// | sed -e /^^$/d | sed -e s/^!source:\=\\!// | sed -e s/.*/xcopy\x20\/Fvikrhyz\x20\x22!source:\=\\!^&\x22\x20\x22!archive:\=\\!^&\x22/"
Try the following:
cmd.exe /v /c "set foo=hello, world & echo !foo!"
The /v argument enables delayed variable expansion. This allows you to access variables' values at execution time rather than at parse time (the default). You do this by writing !foo! instead of %foo% (with this /v option turned on).
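The parse-time versus execution-time distinction can be illustrated with a deliberately simplified Python model. This is a toy sketch, not a description of cmd's real parser: run_line is a hypothetical function, it handles only set and echo joined by &, and it trims whitespace that cmd would keep.

```python
import re


def run_line(line, env, delayed=False):
    """Toy model of cmd running 'cmd1 & cmd2': returns what each echo prints."""
    def expand(text, marker):
        # Replace marker+name+marker by the variable's value;
        # unknown names stay as typed (command-line behavior)
        pattern = re.escape(marker) + r"(\w+)" + re.escape(marker)
        return re.sub(pattern, lambda m: env.get(m.group(1), m.group(0)), text)

    line = expand(line, "%")  # %var% is expanded once, before any command runs
    printed = []
    for command in (part.strip() for part in line.split("&")):
        if delayed:
            # !var! is expanded just before each individual command runs
            command = expand(command, "!")
        if command.startswith("set "):
            name, _, value = command[4:].partition("=")
            env[name] = value
        elif command.startswith("echo "):
            printed.append(command[5:])
    return printed


print(run_line("set foo=hello & echo %foo%", {}))                # ['%foo%']
print(run_line("set foo=hello & echo !foo!", {}, delayed=True))  # ['hello']
```

In the first call the whole line is expanded while foo is still undefined, so echo receives the literal text; in the second, the expansion happens after set has already run.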
Instead of passing /v, you can also turn delayed variable expansion on permanently via the registry, which obviously affects the entire system:
[HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor]
"DelayedExpansion"= (REG_DWORD)
1=enabled 0=disabled (default)
I could not get an IF command to work with either call or cmd /v /c so I would like to offer another option for those who may need it:
use FOR /F to declare and use a variable within a single command.
For example:
FOR /F %i IN ('echo 123') DO (IF %i==123 echo done)
The %i is the variable that is set with the result of the IN command, which in this example is 'echo 123'.
For values with single quotes or spaces, use the "usebackq tokens=*" flag to put the command in backquotes:
FOR /F "usebackq tokens=*" %i IN (`echo "hello, ' ' world!"`) DO (IF %i=="hello, ' ' world!" echo done)

How to find the number of occurrences of a string in file using windows command line?

I have a huge file with e-mail addresses and I would like to count how many of them are in this file. How can I do that using the Windows command line?
I have tried this, but it just prints the matching lines (by the way, all e-mails are contained in one line):
findstr /c:"@" mail.txt
Using what you have, you could pipe the results through a find. I've seen something like this used from time to time.
findstr /c:"@" mail.txt | find /c /v "GarbageStringDefNotInYourResults"
So you are counting the lines resulting from your findstr command that do not contain the garbage string. Kind of a hack, but it could work for you. Alternatively, just use find /c on the string you do care about being there. Lastly, you mentioned one address per line, so in this case the above works, but with multiple addresses per line this breaks.
Why not simply use this (it determines the number of lines containing at least one @ character):
find /C "@" "mail.txt"
Example output:
---------- MAIL.TXT: 96
To avoid the file name in the output, change it to this:
find /C "@" < "mail.txt"
Example output:
96
To capture the resulting number and store it in a variable, use this (change %N to %%N in a batch file):
set "NUM=0"
for /F %N in ('find /C "@" ^< "mail.txt"') do set "NUM=%N"
echo %NUM%
Using grep for Windows
Very simple solution:
grep -o "@" mail.txt | grep -c .
Remember the dot at the end of the line!
Here is a slightly more understandable way:
grep -o "@" mail.txt | grep -c "@"
The first grep selects only the "@" strings and puts each one on a new line.
The second grep counts lines (or lines containing @).
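The answers in this thread count two different things: lines that contain the character, and total occurrences of it. A short Python sketch makes the distinction concrete; the function names and the sample text are hypothetical, chosen only for illustration.

```python
def count_matching_lines(text, needle):
    """Number of lines containing needle at least once (a per-line count)."""
    return sum(1 for line in text.splitlines() if needle in line)


def count_occurrences(text, needle):
    """Total number of non-overlapping occurrences of needle in the text."""
    return text.count(needle)


# Hypothetical sample: three addresses, two of them on the same line
sample = "a@example.com b@example.com\nc@example.com\nno address here\n"
print(count_matching_lines(sample, "@"))  # 2
print(count_occurrences(sample, "@"))     # 3
```

When every address sits on a single line, as in the question, the per-line count is 1 regardless of how many addresses there are, which is why the occurrence count is the one that matters here.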
The grep utility can be easily installed from the grep-for-Windows page. It is a very small and safe text filter. grep is one of the most useful Unix/Linux commands, and I use it daily on both Linux and Windows.
The Windows findstr is good, but it does not have all the features of grep.
Installing grep on Windows is one of the best decisions you can make if you like the CLI or batch scripts.
Download and Installation
Download the latest version from the project page https://sourceforge.net/projects/grep-for-windows/. The direct link to the file is https://sourceforge.net/projects/grep-for-windows/files/grep-3.5_win32.zip/download.
Unzip the ZIP archive. A file is inside.
Put the grep.exe file in the C:\Windows directory or in another directory on the system path (list it with the command echo %PATH%).
That is all.
Test if grep is working:
Open a command-line window (cmd)
Run the command grep --help
Uninstallation
Delete the grep.exe file from the folder where you placed it.
Maybe it's a little bit late, but the following script worked for me (the source file contained quote characters, which is why I used the 'usebackq' parameter).
The caret sign (^) acts as the escape character in the Windows batch scripting language.
@setlocal enableextensions enabledelayedexpansion
SET TOTAL=0
FOR /F "usebackq tokens=*" %%I IN (file.txt) do (
SET LN=%%I
FOR %%J IN ("!LN!") do (
FOR /F %%K IN ('ECHO %%J ^| FIND /I /C "searchPhrase"') DO (
@SET /A TOTAL=!TOTAL!+%%K
)
)
)
ECHO Number of occurrences is !TOTAL!
I found this on the net. See if it works:
findstr /R /N "^.*certainString.*$" file.txt | find /c "@"
I would install the unix tools on your system (handy in any case :-), then it's really simple - look e.g. here:
Count the number of occurrences of a string using sed?
(Using awk:
awk '$1 ~ /title/ {++c} END {print c}' FS=: myFile.txt
).
You can get the Windows unix tools here:
http://unxutils.sourceforge.net/
OK - way late to the table, but... it seems many respondents missed the original spec that all email addresses occur on 1 line. This means unless you introduce a CRLF with each occurrence of the @ symbol, your suggestions to use variants of FINDSTR /c will not help.
Among the Unix tools for DOS is the very powerful SED.exe. Google it. It rocks RegEx. Here's a suggestion:
find "@" datafile.txt | find "@" | sed "s/@/@\n/g" | find /n "@" | SED "s/\[\(.*\)\].*/Set \/a NumFound=\1/">CountChars.bat
Explanation: (assuming the file with the data is named "Datafile.txt")
1) The 1st FIND includes 3 lines of header info, which throws off a line-count approach, so pipe the results to a 2nd (identical) find to strip off unwanted header info.
2) Pipe the above results to SED, which will search for each "@" character and replace it with itself + "\n" (which is a "new line" aka a CRLF) which gets each "@" on its own line in the output stream...
3) When you pipe the above output from SED into the FIND /n command, you'll be adding line numbers to the beginning of each line. Now, all you have to do is isolate the numeric portion of each line and preface it with "SET /a" to convert each line into a batch statement that (increasingly with each line) sets the variable equal to that line's number.
4) isolate each line's numeric part and preface the isolated number per the above via:
| SED "s/\[\(.*\)\].*/Set \/a NumFound=\1/"
In the above snippet, you're piping the previous command's output to SED, which uses this syntax "s/WhatToLookFor/WhatToReplaceItWith/", to do these steps:
a) look for a "[" (which must be "escaped" by prefacing it with "\")
b) begin saving (or "tokenizing") what follows, up to the closing "]"
--> in other words it ignores the brackets but stores the number
--> the ".*" that follows the bracket wildcards whatever follows the "]"
c) the stuff between the \( and the \) is "tokenized", which means it can be referred-to later, in the "WhatToReplaceItWith" section. The first stuff that's tokenized is referred to via "\1" then second as "\2", etc.
So... we're ignoring the [ and the ] and we're saving the number that lies between the brackets and IGNORING all the wild-carded remainder of each line... thus we're replacing the line with the literal string:
Set /a NumFound= + the saved, or "tokenized" number, i.e.
...the first line will read: Set /a NumFound=1
...& the next line reads: Set /a NumFound=2 etc. etc.
Thus, if you have 1,283 email addresses, your results will have 1,283 lines.
The last one executed = the one that matters.
If you use the ">" character to redirect all of the above output to a batch file, i.e.:
> CountChars.bat
...then just call that batch file & you'll have a DOS environment variable named "NumFound" with your answer.
This is how I do it, using an AND condition with FINDSTR (to count number of errors in a log file):
SET COUNT=0
FOR /F "tokens=4*" %%a IN ('TYPE "soapui.log" ^| FINDSTR.exe /I /R^
/C:"Assertion" ^| FINDSTR.exe /I /R /C:"has status VALID"') DO (
REM counts number of lines containing both "Assertion" and "has status VALID"
SET /A COUNT+=1
)
SET /A PASSNUM=%COUNT%
NOTE: This counts "number of lines containing string match" rather than "number of total occurrences in file".
Use this:
type file.txt | find /i "@" /c
