What is the proper method to pipe the output of the cut command into a grep command? - bash

I am currently learning a little more about using Bash shell on OSX terminal. I am trying to pipe the output of a cut command into a grep command, but the grep command is not giving any output even though I know there are matches. I am using the following command:
cut -d'|' -f2 <filename.txt> > <temp.txt> | grep -Ff <temp.txt> <searchfile.txt> > <filematches.txt>
I was thinking that this should work, but most of the examples I have seen pipe grep output into cut, not the other way around. My goal was to cut field 2 from filename.txt and use that as the pattern to search for in searchfile.txt. However, the command above produced no output.
When I generated temp.txt first with the cut command and then ran the grep on it manually, with no pipe, grep ran fine. I am not sure why that is.

You can use process substitution here:
grep -Ff <(cut -d'|' -f2 filename.txt) searchfile.txt > filematches.txt
<(cut -d'|' -f2 filename.txt) feeds the cut command's output to grep as if it were a file.
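If you're curious what the shell is doing there: bash replaces the <(...) expression with a file-like path (typically under /dev/fd) that grep then opens and reads patterns from. You can see it with a quick echo; the /dev/fd/63 below is just an illustrative value and will vary:
echo <(cut -d'|' -f2 filename.txt)
/dev/fd/63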

Okay, a reason this line doesn't behave as you expect
cut -d'|' -f2 <filename.txt> > <temp.txt> | grep -Ff <temp.txt> <searchfile.txt> > <filematches.txt>
is that the output of your cut is going to temp.txt, so nothing is actually sent down the pipe. Conveniently, the pipe still starts the next command, so grep runs anyway and reads searchfile.txt -- but because both sides of a pipeline start at the same time, temp.txt is most likely still empty when grep opens it with -f, which is why you see no matches.
But what are you trying to do? Here's what your command line is trying to do:
1. take the second pipe-delimited field from filename.txt
2. write it to a file
3. run grep ...
4. ... using the contents of the file from step 2 as a grep search string (which isn't going to do what you think either, as you're effectively asking grep to look for the pattern match1\nmatch2...)
You'd be closer with
cut ... && grep ...
as that runs grep only if cut completes successfully. Or you could use
grep -f `cut ...`
which would put the results on the command line. You need to mess with quoting, but you're still going to be looking for a line containing ALL of your match fields from cut.
I suspect you actually mean something like this:
for match in `cut ...`
do
    grep -F "$match" searchfile.txt >> filematches.txt
done
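If the extracted fields can contain spaces, a while-read loop is a safer variant of the same idea; this is just a sketch using the file names from the question, and like the loop above it runs one grep per field:
cut -d'|' -f2 filename.txt | while IFS= read -r match
do
    grep -F "$match" searchfile.txt >> filematches.txt
done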

Related

grep return the string in between words

I am trying to use grep to filter out the RDS snapshot identifier from the rds describe-db-snapshots command output below:
"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",
"rds:apple-pie-2018-05-06-17-12",
How do I return the exact output, as in
rds:apple-pie-2018-05-06-17-12
I tried using
grep -Eo ",rds:"
but was not able to get just that value.
The following awk may also help here:
awk 'match($0,/^"rds[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file
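Spelled out with comments (Input_file is a placeholder for your actual file name), the same one-liner reads as:
awk 'match($0, /^"rds[^"]*/) {                # match a leading "rds... and everything up to the next quote
         print substr($0, RSTART+1, RLENGTH-1)    # print the match without its opening quote
     }' Input_file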
Your grep -Eo ",rds:" is failing for different reasons:
You did not add a " in the string to match
Between the comma and rds you need to match the " character.
You are trying to match the comma that can be on the previous line
Your sample input is 2 lines (with a newline in between), perhaps the real input is without the newline.
You want to match until the next double quote.
You can support both input-styles (with/without newline) with
grep -Eo '(,|^)"rds:[^"]*' rdsfile |cut -d'"' -f2
You can do this in one command with
sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p' rdsfile
EDIT: Manipulating stdout instead of the file works with similar commands:
yourcommand | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'
You can also test the original commands with yourcommand > rdsfile.
You might notice that rdsfile is missing data that you saw on the screen; in that case the missing data was written to stderr, so add 2>&1:
yourcommand 2>&1 | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand 2>&1 | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'
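As a quick sanity check, you can recreate the two sample lines from the question in a throwaway rdsfile and confirm the expected value comes back:
printf '%s\n' '"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",' '"rds:apple-pie-2018-05-06-17-12",' > rdsfile
grep -Eo '(,|^)"rds:[^"]*' rdsfile | cut -d'"' -f2
# rds:apple-pie-2018-05-06-17-12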

Why is while read data; do echo "$data" | cut -d: -f1; done so slow?

To get only the files that git grep prints, I can do
$ git grep "search" | cut -d':' -f1
So I made a short helper script cutg that I can pipe to, and placed it in my ~/bin/ dir.
#!/bin/sh
while read data; do
    echo "$data" | cut -d':' -f1
done
So now I can do
$ git grep "search" | cutg
But it is very slow.
Why so? How do I make it as fast as the 1st command?
The script should be just:
cut -d':' -f1
or (better)
exec cut -d':' -f1
Shell loops are slow—especially if they invoke a process on each iteration, and especially if they're useless.
Your loop reads each line of input, and creates a new cut process for each line. The original one-liner used a single cut process for all the input. Thankfully, you can inherit the script's standard input, and simply write
#!/bin/sh
exec cut -d: -f1 "$@" -
There's no need for your script to do anything at all, except replace itself with an appropriate cut invocation. I included "$@" in case you want to provide additional arguments to cut, but you can safely leave that out if you're sure you don't need it.
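If you want to see the difference for yourself, time both versions on the same repository; the script names below are placeholders for the old loop-based script and the new exec-based one:
time git grep "search" | ~/bin/cutg-loop    # one cut process per input line
time git grep "search" | ~/bin/cutg         # a single cut process for the whole stream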

"not found" error in shell script

I am trying to write a script that should take values from an xml file.
Here is the xml file:
<manifestFile>
  <productInformation>
    <publicationInfo>
      <pubID pcsi-selector="P.S.">PACODE</pubID>
      <pubNumber/>
    </publicationInfo>
  </productInformation>
</manifestFile>
and my code is:
#!/bin/sh
Manifest=""
Manifest= `/bin/grep 'pcsi-selector="' /LDCManifest.xml | cut -f 2 -d '"'`
echo $Manifest
I expect my result to be P.S., but it keeps throwing an error:
./abc.sh: P.S.: not found
I am new to shell and I am not able to figure out what the error is here.
You can't have a space after the =.
When you run this command:
Manifest= `/bin/grep 'pcsi-selector="' /LDCManifest.xml | cut -f 2 -d '"'`
It's the same as this:
Manifest='' `/bin/grep 'pcsi-selector="' /LDCManifest.xml | cut -f 2 -d '"'`
That tells the shell to
1. Run the grep command.
2. Take its output.
3. Run that output as a command, with the environment variable Manifest set to the empty string for the duration of that command.
Get rid of the space after the = and you'll get the result you want.
However, you should also avoid using backticks for command substitution, because they interfere with quoting. Use $(...) instead:
Manifest=$(grep 'pcsi-selector="' /LDCManifest.xml | cut -f2 -d'"')
Also, using text/regex-based tools like grep and cut to manipulate XML is clunky and error-prone. You'd be better off installing something like XMLStarlet:
Manifest=$(xmlstarlet sel -t \
-v '/manifestFile/productInformation/publicationInfo/pubID/@pcsi-selector' -n \
/LDCManifest.xml)
Or simpler:
grep -oP 'pcsi-selector="\K[^"]+' /LDCManifest.xml
would print
P.S.
or assign it to the variable:
Manifest=$(grep -oP 'pcsi-selector="\K[^"]+' /LDCManifest.xml)
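Note that -P relies on GNU grep's PCRE support; if your grep lacks -P (for example BSD/macOS grep), a sed equivalent of the same extraction would be something like:
Manifest=$(sed -n 's/.*pcsi-selector="\([^"]*\)".*/\1/p' /LDCManifest.xml)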

Using linux "cut" with stdin

I'm trying to pipe data into "cut" to, say, cut away the first column of text. This works
$ cat test.txt | cut -d' ' -f2-
Reading from stdin also works:
$ cut -d' ' -f2- -
? doc/html/analysis.html
? doc/html/classxytree-members.html
<CTRL+D>
However, as soon as a pipe is involved, it doesn't accept my <CTRL+D> anymore, and I can't signal "end of file":
$ cut -d' ' -f2- - | xargs echo
Update: This is apparently a bug in an old version of bash (3.00.15). It does work in more recent versions (tried 4.0.33 and 3.2.25). It would be nice to have some workaround, though, since I can't easily upgrade.
Background: I've got a script/oneliner that gives me a condensed output of cvs status (I know, CVS...) in the form
? filename
e.g. for a file not committed yet. I'd like to be able to copy+paste parts of the output from that command and use this as an input to another command, that adds these files to cvs. Say:
$ cut -d' ' -f2- | xargs cvs add
<paste lines>
<CTRL-D> # <-- doesn't work
Ideas?
Have you tried
$ cat | cut -d' ' -f2- | xargs cvs add
<paste lines>
<CTRL-D>
Your examples work fine for me. What shell are you using? What utilities?
One thing that sometimes trips people up is that Ctrl-D only works if it's the first character in the line. If you copy and paste, you might sometimes accidentally have whitespace as the first character of the line, or no newline at the end of the pasted block, in which case Ctrl-D won't work. Just hit return and then try Ctrl-D again and see if that fixes your problem.

How do you pipe input through grep to another utility?

I am using 'tail -f' to follow a log file as it's updated; next I pipe the output of that to grep to show only the lines containing a search term ("org.springframework" in this case); the final step I'd like to add is piping the output from grep to a third command, 'cut':
tail -f logfile | grep org.springframework | cut -c 25-
The cut command would remove the first 25 characters of each line for me if it could get the input from grep! (It works as expected if I eliminate 'grep' from the chain.)
I'm using cygwin with bash.
Actual results: When I add the second pipe to connect to the 'cut' command, the result is that it hangs, as if it's waiting for input (in case you were wondering).
Assuming GNU grep, add --line-buffered to your command line, eg.
tail -f logfile | grep --line-buffered org.springframework | cut -c 25-
Edit:
I see grep buffering isn't the only problem here, as cut doesn't allow linewise buffering.
you might want to try replacing it with something you can control, such as sed:
tail -f logfile | sed -u -n -e '/org\.springframework/ s/\(.\{0,25\}\).*$/\1/p'
or awk
tail -f logfile | awk '/org\.springframework/ {print substr($0, 0, 25);fflush("")}'
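If GNU coreutils' stdbuf is available, another option is to keep the original pipeline and simply force line buffering on cut's output:
tail -f logfile | grep --line-buffered org.springframework | stdbuf -oL cut -c 25-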
On my system, about 8K was buffered before I got any output. This sequence worked to follow the file immediately:
tail -f logfile | while read line ; do echo "$line"| grep 'org.springframework'|cut -c 25- ; done
What you have should work fine -- that's the whole idea of pipelines. The only problem I see is that, in the version of cut I have (GNU coreutils 6.10), you should use the syntax cut -c 25- (i.e. use a minus sign instead of a plus sign) to remove the first 24 characters.
You're also searching for different patterns in your two examples, in case that's relevant.
