piping findstr's output - windows

Windows command line, I want to search a file for all rows starting with:
# NNN "<file>.inc"
where NNN is a number and <file> any string.
I want to use findstr, because I cannot require that the users of the script install ack.
Here is the expression I came up with:
>findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9_]*.inc" all_pre.txt
The file to search is all_pre.txt.
So far so good. Now I want to pipe that to another command, say for example more.
>findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9]*.inc" all_pre.txt | more
The result of this is the same output as the previous command, but with the file name as prefix for every row (all_pre.txt).
Then comes:
FINDSTR: cannot open |
FINDSTR: cannot open more
Why doesn't the pipe work?
snip of the content of all_pre.txt
# 1 "main.ss"
# 7 "main.ss"
# 11 "main.ss"
# 52 "main.ss"
# 1 "Build_flags.inc"
# 7 "Build_flags.inc"
# 11 "Build_flags.inc"
# 20 "Build_flags.inc"
# 45 "Build_flags.inc(function a called from b)"
EDIT: I also need to escape the dot in the regex. That is not the issue, but it is worth mentioning.
>findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9_]*\.inc" all_pre.txt
EDIT after Frank Bollack:
>findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9_]*\.inc.*" all_pre.txt | more
is not working, although (I think) it should look for the same string as before then any character any number of times. That must include the ", right?

You are missing a trailing \" in your search pattern.
findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9]*.inc\"" all_pre.txt | more
The above works for me.
Edit:
findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9]*\.inc.*\"" all_pre.txt | more
This updated search string will now match these lines from your example:
# 1 "Build_flags.inc"
# 7 "Build_flags.inc"
# 11 "Build_flags.inc"
# 20 "Build_flags.inc"
# 45 "Build_flags.inc(function a called from b)"
Edit:
To circumvent this "bug" in findstr, you can put your search into a batch file like this:
@findstr /r /c:"^# [0-9][0-9]* \"[a-zA-Z0-9_]*\.inc" %1
Name it something like myfindstr.bat and call it like this:
myfindstr all_pre.txt | more
You can now use the pipe and redirection operators as usual.
Hope that helps.

I can't really explain the why, but from my experience although findstr behaviour with fixed strings (e.g. /c:"some string") is exactly as desired, regular expressions are a different beast. I routinely use the fixed string search function like so to extract lines from CSV files:
C:\> findstr /C:"literal string" filename.csv > output.csv
No issue there.
But using regular expressions (e.g. /R "^\"some string\"" ) appears to force the findstr output to console and can't be redirected via any means. I tried >, >>, 1> , 2> and all fail when using regular expressions.
My workaround for this is to use findstr as the secondary command. In my case I did this:
C:\> type filename.csv | findstr /R "^\"some string\"" > output.csv
That worked for me without issue directly from a command line, with a very complex regular expression string. In my case I only had to escape the " for it to work. Other characters such as , and . worked fine as literals in the expression without escaping.
I confirmed that the behaviour is the same on both Windows 2008 and Windows 7.
EDIT: Another variant also apparently works:
C:\> findstr /R "^\"some string\"" < filename.csv > output.csv
It's the same principle as using type, but it uses the shell's own input redirection instead of a pipe.

If you use a regex with an even number of double quotes, it works perfectly. But if the number of " characters is odd, redirection doesn't work. You can either complete your regex with a second quote (you can use a range for this purpose: [\"\"]), or replace your quote character with the dot metacharacter.
It looks like a cmd.exe issue, findstr is not guilty.
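As a rough illustration of the parity rule described above, the sketch below counts the double quotes in two of the patterns from this thread. It is plain POSIX shell rather than cmd and is only a counting aid: count_quotes is a made-up helper name, and the check says nothing about how cmd.exe actually parses the line. The two quotes that wrap the /c: argument are left out, since they do not change the parity.

```shell
# Patterns as typed inside /c:"...", without the two wrapping quotes.
pattern_odd='^# [0-9][0-9]* \"[a-zA-Z0-9_]*\.inc'
pattern_even='^# [0-9][0-9]* \"[a-zA-Z0-9_]*\.inc\"'

# Hypothetical helper: delete everything except double quotes, then count bytes.
count_quotes() {
  printf '%s' "$1" | tr -cd '"' | wc -c
}

# 1 means an odd number of quotes (the case reported to break redirection).
odd_parity=$(( $(count_quotes "$pattern_odd") % 2 ))
even_parity=$(( $(count_quotes "$pattern_even") % 2 ))

printf 'first pattern parity: %s\n' "$odd_parity"
printf 'second pattern parity: %s\n' "$even_parity"
```

The first pattern is the one from the question, the second is the one with the trailing \" that was reported to work with the pipe.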

Here is my finding: it's related to an odd number of double quotes preventing redirection from within a batch script. Michael Yutsis had it right but didn't give an example, so I thought I would:
dataset:
"10/19/2022 20:02:06.057","99.526755039736002573"
"10/19/2022 20:02:07.061"," "
"10/19/2022 20:02:08.075","85.797437749585213851"
"10/19/2022 20:02:09.096","96.71306029796799919"
"10/19/2022 20:02:10.107","4.0273833029566628028"
I tried using the following to find just lines that had a fractional portion of a number at the end of each line.
findstr /r /c:"\.[0-9]*\"$" file1.txt > file2.txt
(a valid regex string surrounded by quotes that has one explicit double quote in it)
needed to become
findstr /r /c:"\"[0-9]*\.[0-9]*\"$" file1.txt > file2.txt
so it could identify the entire decimal (including the explicit quotes).
I tried just adding another double quote at the end of the string ($"" ) and the command worked and generated file2.txt, but it didn't match any lines in the file, so the extra trailing double quote becomes part of the regex string, I guess, and it doesn't match anything. Including the leading double quote around the full decimal was necessary, and fine for my needs.

Related

sed: remove parentheses from string

I am using a Mac.
I'm trying to remove all parentheses, ( and ), from a string using sed.
Input: this string contains (parentheses)
Desired output: this string contains parentheses
I've tried:
sed -E 's/[\)\(]//g'
but whether I escape the parentheses or not, I still only get a match (and consequently removal) for the first one.
EDIT: the problem was with the input string:
A close paren is ASCII 41, whereas my input has ASCII 239 which explains what's failing. Even more confusingly this equates to an acute accent. Closer examination shows that the ) can't be selected without the following 'space'.
tr with the -d (delete) flag is my go-to for removing one or more characters. From the man page:
The tr utility copies the standard input to the standard output with substitution or deletion of selected characters.
echo -n 'this string contains (parentheses)' | tr -d '()'
# this string contains parentheses
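Just DON'T use backslashes or -E. Inside a bracket expression the parentheses are already literal, so there is nothing to escape. Outside brackets, \( is the grouping operator in basic regular expressions, and with -E (extended) the roles flip, so that a bare ( groups and \( is the literal.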
$: echo "this string contains (parentheses)" | sed 's/[)(]//g'
this string contains parentheses
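To make the EDIT in the question concrete, here is a small sketch of what a look-alike parenthesis does to the sed command. The question does not say which character the input contained. Byte 239 (0xEF) starts many three-byte UTF-8 sequences, so this example ASSUMES the fullwidth right parenthesis U+FF09 (bytes EF BC 89). The variable names are made up for illustration.

```shell
# ASSUMPTION: the odd closing paren is U+FF09 (UTF-8 bytes EF BC 89).
fullwidth_close=$(printf '\357\274\211')
input="this string contains (parentheses${fullwidth_close}"

# The ASCII-only bracket expression removes "(" but leaves the three odd bytes.
ascii_only=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[()]//g')

# Deleting that byte sequence as well gives the desired output.
cleaned=$(printf '%s\n' "$input" | LC_ALL=C sed "s/[()]//g; s/${fullwidth_close}//g")

# od shows the raw bytes, which is how a non-ASCII look-alike can be spotted.
printf '%s\n' "$ascii_only" | od -An -c
printf '%s\n' "$cleaned"
```

Dumping the input with od first is usually the quickest way to find out whether the pattern or the data is at fault.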

How to overwrite a string in a file but keep the remainder of the string intact

Struggling with this; maybe somebody has an idea.
I just want to overwrite the start of the original string with a given output, for as many characters as that output has, and keep the remainder untouched.
example
original string
00000000000000000000000000000000000
new string
sometexthere00000000000000000000000
I looked at various ways, but they all seem to replace the whole line or look for certain strings to match.
Sed would be very helpful.
For example, if you want to replace the first 12 characters with sometexthere, you can write
>>> echo 0000000000000000000000000000000000 | sed -E 's/^.{12}/sometexthere/'
sometexthere0000000000000000000000
What does it do?
^ anchors the regex at the start of the string.
.{12}: the . matches any character, and combined with {12} it matches any 12 characters.
You can also use parameter substitution, for example
$ echo ${val/00000/hello}
hello000000000000000000000000000000
You can use a . in a regex to match any character (just once) and a ^ to match the start of a line.
Not knowing the exact format of the file etc, here is an example using sed to edit a string outputted by echo:
echo "00000000000000000000000000000000000" | sed 's/^............/aaaaaaaaaaaa/'
Full bash solution:
#!/bin/bash
# original="abcdefghijklmnopqrstuvwxyz0123456789"
original="00000000000000000000000000000000000"
overwrite="sometexthere"
modified=${overwrite}${original:${#overwrite}} # append overwrite with substring of original from overwrite's length to the end
echo ${modified}
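The sed approach and the bash approach above can be combined, so that the number of characters to overwrite is taken from the new text instead of being typed as a literal 12. This is only a sketch under one assumption: the new text contains no characters that are special in a sed replacement (/, & or \). If it might, it would need escaping first.

```shell
original="00000000000000000000000000000000000"
overwrite="sometexthere"

# ${#overwrite} is the length of the new text; \{n\} is the basic-regex
# spelling of "exactly n times", so no -E flag is needed.
modified=$(printf '%s\n' "$original" | sed "s/^.\{${#overwrite}\}/${overwrite}/")

printf '%s\n' "$modified"
```

The result has the same length as the original, starts with the new text, and keeps the remaining zeros untouched.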

FINDSTR refuses to match EOL

Given the following piped in text:
a master
a release
a release2
a some-release
Can someone please explain why
findstr /i /r /c:"a release$"
does not return line 2?
After several hours of reading everything imaginable about the windows findstr command, it just doesn't seem possible to get the $ character to match the EOL. Note that using the /E switch instead of $ makes no difference. I am running Windows 7.
Can someone come up with any way to match just line 2 using standard windows commands? I will resort to grep if necessary, but I can't believe there's no way to solve this natively.
Thanks!
You mention "piped" text. I just had this problem and was searching on stack. The answer for me was, findstr /R and the echo command have some weird quirks with pipes in DOS (or cmd).
I was trying to match files ending in .jpg, so to test, my command was:
echo "myfile spaces in name.jpg" | findstr /i /r "\.jpg$"
But that wasn't working. I used gnu utils to find out, echo with a space before the pipe inserts a SPACE in the output of the PIPE. Since I'm very used to UNIX style echo and regular expressions I did not expect the extra space inserted after the filename in my echo test.
To fix, I added a " *" (space-star) in the regex after the jpg, (to match 0 or more spaces):
echo "myfile spaces in name.jpg" | findstr /i /r "\.jpg *$"
That worked great.**
Proof using GNU's octal dump command (space is 040 in octal):
c:\>echo "myfile.jpg" | od -cb
0000000 " m y f i l e . j p g " \r \n
042 155 171 146 151 154 145 056 152 160 147 042 040 015 012
Now if I remove the space before the "|" pipe, it goes away:
c:\>echo "myfile.jpg"| od -cb
0000000 " m y f i l e . j p g " \r \n
042 155 171 146 151 154 145 056 152 160 147 042 015 012
This shouldn't happen in text files in DOS, nor most commands filtered through a pipe, but it will happen if you do quick command line tests using echo like I did.
** Another weird quirk (at least on my Win10 version of findstr), it ignores double-quotes near the end of the line.
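The effect of that trailing space is easy to reproduce outside of cmd. The sketch below uses grep as a stand-in for findstr (an assumption made only so the example runs in a POSIX shell; the two tools are not identical), and a line that ends in one space like the echo output above.

```shell
# A line with one trailing space, like the one echo produced above.
line='myfile spaces in name.jpg '

# grep -c prints the number of matching lines; "|| true" keeps a zero count
# from being treated as a failure by strict shells.
strict=$(printf '%s\n' "$line" | grep -c '\.jpg$' || true)
lenient=$(printf '%s\n' "$line" | grep -c '\.jpg *$' || true)

printf 'strict=%s lenient=%s\n' "$strict" "$lenient"
```

The strict pattern finds nothing because the space sits between jpg and the end of the line, while the space-star version matches.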
If there are no non-visible characters in the data, the most probable cause is the line termination character. If the piped lines do not end with carriage return / line feed (0x0D 0x0A) characters, findstr will not match the end of the line where it should.
Try something like
sourceofdata | more | findstr /r /c:"a release$"
sourceofdata | find /v "" | findstr /r /c:"a release$"
Both find and more change the line endings. If it works, you have found the source of the problem.
If not, here (if you have not read it yet) you will find extensive documentation on how findstr can fail.
It won't work. /c says do literal, not regular expression. You can make it work with command line switches though. Did you look up the reference before writing your command?

Print all characters up to a matching pattern from a file

Maybe a silly question, but I have a text file and need to display everything up to the first pattern match, which is a '/'. (No lines contain blank spaces.)
Example.txt:
somename/for/example/
something/as/another/example
thisfile/dir/dir/example
Preferred output:
somename
something
thisfile
I know this grep code will display everything after a matching pattern:
grep -o '/[^\n]*' '/my/file.txt'
So is there any way to do the complete opposite, maybe rm everything after matching pattern or invert to display my preferred output?
Thanks.
If you're calling an external command like grep, you can get the results you require with the sed command, i.e.
echo "something/as/another/example" | sed 's:/.*::'
something
Instead of focusing on what you want to keep, think about what you want to remove, in this case everything after the first '/' char. This is what this sed command does.
The leading s means substitute, and :/.*: is the pattern to match, with /.* meaning match the first '/' char and all characters after that. The second half of the sed command is the replacement. With ::, this means replace with nothing.
The traditional idiom for sed is to use s/str/rep/, using / chars to delimit the search from the replacement, but you can use any character you want after the initial s (substitute) command.
Some seds expect the / char, and want a special indication that the following character is the sub/replace delimiter. So if s:/.*:: doesn't work, then s\:/.*:: should work.
IHTH.
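To show that the delimiter really is a free choice, here is the same substitution written three ways on the sample line from the question. The variable names are only for illustration. The third form keeps / as the delimiter and therefore has to escape the / inside the pattern.

```shell
line='something/as/another/example'

# Colon as delimiter, as in the answer above.
with_colon=$(printf '%s\n' "$line" | sed 's:/.*::')

# Any other character works the same way, for example a pipe.
with_pipe=$(printf '%s\n' "$line" | sed 's|/.*||')

# Traditional slash delimiter: the slash in the pattern must be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/\/.*//')

printf '%s\n' "$with_colon" "$with_pipe" "$with_slash"
```

All three print something, so picking a delimiter that does not occur in the pattern is purely a readability choice.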
You can use a much simpler regex:
/[^/]*/
The forward slash after the caret is what you're matching up to.
Assuming filename as "file.txt"
cat file.txt | cut -d "/" -f 1
Here, we are cutting the input line with "/" as the delimiter (-d "/"). Then we select the first field (-f 1).
You just need to include starting anchor ^ and also the / in a negated character class.
grep -o '^[^/]*' file
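If no external command is wanted at all, the shell's own parameter expansion can do the same job. This is a different technique from the grep, sed and cut answers above: ${line%%/*} removes the longest suffix that starts with a /. The temporary file and variable names below are only for the sketch.

```shell
# Sample data from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'somename/for/example/' \
              'something/as/another/example' \
              'thisfile/dir/dir/example' > "$sample"

# Read line by line and strip everything from the first "/" onwards.
first_fields=$(
  while IFS= read -r line; do
    printf '%s\n' "${line%%/*}"
  done < "$sample"
)
rm -f "$sample"

printf '%s\n' "$first_fields"
```

For large files the grep or cut versions will likely be faster, since the loop runs in the shell itself.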

How to read a 3rd string in a file by makefile

I'm trying to convert some cmd script to a makefile with no success.
The script is:
for /F "eol=* tokens=2,3*" %%i in (%VERSION_FILE_PATH%\VersionInfo.h,%VERSION_FILE_PATH%\Version.h) do (
if %%i==%MAJOR% set MAJOR_VALUE=%%j
if %%i==%MINOR% set MINOR_VALUE=%%j
if %%i==%HOTFIX% set HOTFIX_VALUE=%%j
if %%i==%BUILD% set BUILD_VALUE=%%j
)
The script searches each line for a specific string and takes the string that follows it.
for example: #define MAJOR 4
I'm searching for MAJOR and getting the 4.
My question is how to do it in makefile.
If these lines always have the same structure, you could do it this way with awk:
tester:
	cat test | grep MAJOR | awk '{print $$3}'
Where I made a little test file that just contains
#define MAJOR 4
You could loop over each line, grepping for the token you want, and then grabbing the third value with awk. Note that you need the double $ to escape string expansion by make.
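The cat and grep steps can also be folded into awk itself, by comparing the second field against the token. The sketch below is written as plain shell so it can be run directly; inside a makefile recipe every $ would have to be doubled ($$1, $$2, $$3), as noted above. The temporary file stands in for Version.h, and the MINOR line in it is invented for the example.

```shell
# Stand-in for the header file; the MINOR line is made up for illustration.
header=$(mktemp)
printf '%s\n' '#define MAJOR 4' '#define MINOR 2' > "$header"

# Print the third field of the line whose second field is exactly MAJOR.
major_value=$(awk '$1 == "#define" && $2 == "MAJOR" { print $3 }' "$header")
rm -f "$header"

printf '%s\n' "$major_value"
```

Matching on the whole field avoids picking up lines that merely contain MAJOR somewhere, which a plain grep MAJOR would also return.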
