Why is while read data; do echo "$data" | cut -d: -f1; done so slow? - bash

To get only the files that git grep prints, I can do
$ git grep "search" | cut -d':' -f1
So I made a short helper script, cutg, that I can pipe into, and placed it in my ~/bin/ directory.
#!/bin/sh
while read data; do
echo "$data" | cut -d':' -f1
done
So now I can do
$ git grep "search" | cutg
But it is very slow.
Why so? How do I make it as fast as the 1st command?

The script should be just:
cut -d':' -f1
or (better)
exec cut -d':' -f1
Shell loops are slow—especially if they invoke a process on each iteration, and especially if they're useless.

Your loop reads each line of input and creates a new cut process for each line. The original one-liner used a single cut process for all the input. Thankfully, cut inherits the script's standard input, so you can simply write
#!/bin/sh
exec cut -d: -f1 "$@" -
There's no need for your script to do anything at all, except replace itself with an appropriate cut invocation. I included "$@" in case you want to provide additional arguments to cut, but you can safely leave that out if you're sure you don't need it.
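If a loop really is needed for some other reason, the per-line cost can be avoided by letting the shell do the splitting itself. The following is only a sketch: the two sample lines are made up to stand in for git grep output, and the variable names are illustrative.

```sh
#!/bin/sh
# Hypothetical sample input standing in for `git grep` output.
sample=$(printf 'src/a.c:12:foo\nsrc/b.c:7:bar\n')

# One cut process handles the whole stream.
fast=$(printf '%s\n' "$sample" | cut -d: -f1)

# A loop that forks nothing per line: parameter expansion
# removes everything from the first colon onwards.
loop=$(printf '%s\n' "$sample" | while IFS= read -r data; do
    printf '%s\n' "${data%%:*}"
done)

printf '%s\n' "$fast"
```

Both variables end up holding the same file names; the single cut is still the simpler choice when no other per-line work is required.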

Related

How to assign values in bash after running commands

NAME=aaa/bbbbb:0.1.2
How can I assign the new string?
NEW_VAR=$NAME | sed 's/.*\///' | cut -f1 -d":"
Assuming you are using bash, you don't actually need to shell out to other commands; there are built-in string manipulation operators:
NEW_VAR=${NAME%%:*}
If you would prefer to shell out, it'll be something like this:
NEW_VAR=$(echo "$NAME" | cut -f1 -d":")
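The question's pipeline also strips everything up to the last slash before cutting at the colon. A minimal sketch of doing both steps with parameter expansion only (the intermediate variable name base is just for illustration):

```sh
NAME=aaa/bbbbb:0.1.2

# Remove the longest prefix ending in "/" (what the sed step was for).
base=${NAME##*/}

# Remove the longest suffix starting at ":" (what the cut step was for).
NEW_VAR=${base%%:*}

printf '%s\n' "$NEW_VAR"
```

Both expansions are POSIX, so this should also work in plain sh.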

using cut on a line having multiple instances of the same delimiter - unix

I am trying to write a generic script which can have different file name inputs.
This is just a small part of my bash script.
For example, let's say folder 444-55 has 2 files
qq.filter.vcf
ee.filter.vcf
I want my output to be -
qq
ee
I tried this and it worked -
ls /data2/delivery/Stack_overflow/1111_2222_3333_23/secondary/444-55/*.filter.vcf | sort | cut -f1 -d "." | xargs -n 1 basename
But let's say I have a folder like this -
/data2/delivery/Stack_overflow/de.1111_2222_3333_23/secondary/444-55/*.filter.vcf
My script's output would then be
de
de
How can I make it generic?
Thank you so much for your help.
Something like this in a script will "cut" it:
for i in /data2/delivery/Stack_overflow/1111_2222_3333_23/secondary/444-55/*.filter.vcf
do
basename "$i" | cut -f1 -d.
done | sort
Advantages:
- it does not parse the output of ls, which is frowned upon
- it cuts after having applied the basename treatment, so the cut ignores the full path
- it sorts last, so the output is guaranteed to be sorted according to the prefix
Just move the basename call earlier in the pipeline:
printf "%s\n" /data2/delivery/Stack_overflow/1111_2222_3333_23/secondary/444-55/*.filter.vcf |
xargs -n 1 basename |
sort |
cut -f1 -d.
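The same result can be had without basename or cut by using parameter expansion inside the loop. This is a sketch under assumptions: it creates a throwaway directory with mktemp instead of using the real /data2/... path, and the two file names are taken from the question.

```sh
# Hypothetical scratch directory; its path may itself contain dots.
dir=$(mktemp -d)
touch "$dir/qq.filter.vcf" "$dir/ee.filter.vcf"

names=$(for f in "$dir"/*.filter.vcf; do
    name=${f##*/}                 # drop the directory part first
    printf '%s\n' "${name%%.*}"   # then keep the text before the first dot
done | sort)

printf '%s\n' "$names"
rm -rf "$dir"
```

Because the directory part is removed before the dot is looked for, a dot in any parent folder name no longer affects the output.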

What is the proper method to pipe the output of the cut command into a grep command?

I am currently learning a little more about using Bash shell on OSX terminal. I am trying to pipe the output of a cut command into a grep command, but the grep command is not giving any output even though I know there are matches. I am using the following command:
cut -d'|' -f2 <filename.txt> > <temp.txt> | grep -Ff <temp.txt> <searchfile.txt> > <filematches.txt>
I was thinking that this should work, but most of the examples I have seen normally pipe grep output into the cut. My goal was to cut field 2 from the file and use that as the pattern to search for in <searchfile.txt>. However, using the command produced no output.
When I generated the temp.txt first with the cut command and then ran the grep on it manually with no pipe, the grep seemed to run fine. I am not sure why this is?
You can use process substitution here:
grep -Ff <(cut -d'|' -f2 filename.txt) searchfile.txt > filematches.txt
<(cut -d'|' -f2 filename.txt) is feeding cut command's output to grep as a file.
Okay, a reason this line doesn't behave as you expect
cut -d'|' -f2 <filename.txt> > <temp.txt> | grep -Ff <temp.txt> <searchfile.txt> > <filematches.txt>
is that the output of your cut is going to temp.txt, so you're not sending anything down the pipe. The pipe still starts a new command, so grep runs and reads searchfile.txt. However, both sides of a pipeline start at the same time, so grep can open temp.txt before cut has finished writing it.
But what are you trying to do? Here's what your command line is trying to do:
1. take the second pipe-delimited field from filename.txt
2. write it to a file
3. run grep ...
4. ... using the contents of the file from 2 as a grep search string (which isn't going to do what you think either, as you're effectively asking grep to look for the pattern match1\nmatch2...)
You'd be closer with
cut ... && grep ...
as that runs grep assuming cut completes effectively. Or you could use
grep -f `cut ...`
which would put the results on the command line. You need to mess with quoting, but you're still going to be looking for a line containing ALL of your match fields from cut.
Maybe you mean something like this:
for match in $(cut ...)
do
grep -F -e "$match" searchfile.txt >> filematches.txt
done
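A sketch of the sequential cut ... && grep ... form, so that grep only starts once the pattern file is complete. The sample data and the scratch directory are invented for illustration; only the file names come from the question.

```sh
# Hypothetical scratch directory and sample data.
work=$(mktemp -d)
printf 'x|apple|1\ny|pear|2\n' > "$work/filename.txt"
printf 'green apple\nripe banana\nsoft pear\n' > "$work/searchfile.txt"

# Write the patterns first; run grep only if cut succeeded.
cut -d'|' -f2 "$work/filename.txt" > "$work/temp.txt" &&
    grep -Ff "$work/temp.txt" "$work/searchfile.txt" > "$work/filematches.txt"

matches=$(cat "$work/filematches.txt")
printf '%s\n' "$matches"
rm -rf "$work"
```

Here -F treats every line of temp.txt as a fixed string, and a line of searchfile.txt is printed if it contains any of them.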

How Do I Convert A Cut Command In Bash Into Grep With Given Code?

I've written a template engine script that uses cut to extract certain elements from a file, but I want to use grep in place of the cut. Here is the code I have written:
#!/bin/bash
IFS=# #makes # a delimiter.
while read line
do
dataspace=`echo ${line} | cut -d'=' -f1`
value=`echo ${line} | cut -d"=" -f2`
printf -v $dataspace "$value" #make the value stored in value into the name of a dataspace.
done < 'template.vars' #read template.vars for standard input.
skipflag=false #initialize the skipflag to false
while read line #while it is reading standard input one line at a time
Just came to the conclusion that the code blocks system here does not support bash.
Anyway, since Stack Overflow isn't letting me put Bash into code blocks, I am not posting the entire script, since it would look nasty. Based on what is currently highlighted, how would I go about changing the part using the cut command into a line using the grep command?
As has been noted, you should give more information for a better answer. Going with what you have, I would say that awk is a better option than grep
dataspace=$(awk -F= '{print $1}' <<< "$line")
value=$(awk -F= '{print $2}' <<< "$line")
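Another option, offered only as a sketch, is to split the line in the shell itself with parameter expansion, which starts no external process at all. The sample line is made up; note that, unlike cut -f2, the second expansion keeps any further = characters in the value.

```sh
# Hypothetical line from template.vars.
line='greeting=Hello World'

dataspace=${line%%=*}   # text before the first "="
value=${line#*=}        # text after the first "="

printf '%s -> %s\n' "$dataspace" "$value"
```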

How to handle variables that contain ";"?

I have a configuration file that contains lines like "hallo;welt;" and I want to do a grep on this file.
Whenever I try something like grep "$1;$2" my.config or echo "$1;$2" or even line="$1;$2", my script fails with something like:
: command not found95: line 155: =hallo...
How can I tell bash to ignore ; while evaluating "..." blocks?
EDIT: an example of my code.
# find entry
$line=$(grep "$1;$2;" $PERMISSIONSFILE)
# splitt line
reads=$(echo $line | cut -d';' -f3)
writes=$(echo $line | cut -d';' -f4)
admins=$(echo $line | cut -d';' -f5)
# do some stuff on the permissions
# replace old line with new line
nline="$1;$2;$reads;$writes;$admins"
sed -i "s/$line/$nline/g" $TEMPPERM
my script should be called like this: sh script "table" "a.b.*.>"
EDIT: another, simpler example
$test=$(grep "$1;$2;" temp.authorization.config)
the temp file:
table;pattern;read;write;stuff
the call sh test.sh table pattern results in: : command not foundtable;pattern;read;write;stuff
Don't use $ on the left side of an assignment in bash -- if you do it'll substitute the current value of the variable rather than assigning to it. That is, use:
test=$(grep "$1;$2;" temp.authorization.config)
instead of:
$test=$(grep "$1;$2;" temp.authorization.config)
Edit: also, variable expansions should be in double-quotes unless there's a good reason otherwise. For example, use:
reads=$(echo "$line" | cut -d';' -f3)
instead of:
reads=$(echo $line | cut -d';' -f3)
This doesn't matter for semicolons, but does matter for spaces, wildcards, and a few other things.
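Both points can be checked with a small experiment. This is a sketch with invented sample data; the variable names result, unquoted and quoted are only for illustration.

```sh
# 1. "$name=value" is not an assignment: the shell expands $result
#    (empty here) and tries to run "=ok" as a command, which fails.
unset result
( $result=ok ) 2>/dev/null && rc=0 || rc=$?

# 2. Quoting matters once the data contains runs of spaces.
line='table;pattern;read  me;write;stuff'
unquoted=$(echo $line | cut -d';' -f3)     # word splitting collapses the spaces
quoted=$(echo "$line" | cut -d';' -f3)     # the field is passed through intact

printf '%s\n' "$rc" "$unquoted" "$quoted"
```

The first command returns a non-zero status instead of setting anything, and only the quoted form preserves the two spaces in the third field.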
A ; inside quotes has no meaning at all for bash. However, if $1 contains a doublequote itself, then you'll end up with
grep "something";$2"
which'll be parsed by bash as two separate commands:
grep "something" ; other"
^---command 1----^ ^----command 2---^
Please show exactly what your script is doing around the spot the error is occurring, and what data you're feeding into it.
Counter-example:
$ cat file.txt
hello;welt;
hello;world;
hell;welt;
$ cat xx.sh
grep "$1;$2" file.txt
$ bash -x xx.sh hello welt
+ grep 'hello;welt' file.txt
hello;welt;
$
You have not yet classified your problem accurately.
If you try to assign the result of grep to a variable (like I do) your example breaks.
Please show what you mean. Using the same data file as before and doing an assignment, this is the output I get:
$ cat xx.sh
grep "$1;$2" file.txt
output=$(grep "$1;$2" file.txt)
echo "$output"
$ bash -x xx.sh hello welt
+ grep 'hello;welt' file.txt
hello;welt;
++ grep 'hello;welt' file.txt
+ output='hello;welt;'
+ echo 'hello;welt;'
hello;welt;
$
Seems to work for me. It also demonstrates why the question needs an explicit, complete, executable, minimal example so that we can see what the questioner is doing that is different from what people answering the question think is happening.
I see you've provided some sample code:
# find entry
$line=$(grep "$1;$2;" $PERMISSIONSFILE)
# splitt line
reads=$(echo $line | cut -d';' -f3)
writes=$(echo $line | cut -d';' -f4)
admins=$(echo $line | cut -d';' -f5)
The line $line=$(grep ...) is wrong. You should omit the $ before line. The shell does not recognize $line=... as an assignment; it expands $line first (to nothing, if the variable is unset) and then tries to run the resulting word, such as =table;pattern;..., as a command. That is where the 'command not found' message comes from.
For safety if nothing else, I would also enclose the $line values in double quotes in the echo lines. It may not strictly be necessary, but it is simple protective programming.
The changes lead to:
# find entry
line=$(grep "$1;$2;" "$PERMISSIONSFILE")
# split line
reads=$(echo "$line" | cut -d';' -f3)
writes=$(echo "$line" | cut -d';' -f4)
admins=$(echo "$line" | cut -d';' -f5)
The rest of your script was fine.
It seems like you are trying to read a semicolon-delimited file and identify a line starting with 'table;pattern;', where table is a string you specify and pattern is a regular expression grep will understand. Once the line is identified, you wish to replace the 3rd, 4th and 5th fields with different data and write the updated line back to the file.
Does this sound correct?
If so, try this code
#!/bin/bash
in_table="$1"
in_pattern="$2"
file="$3"
while IFS=';' read -r -d$'\n' tuple pattern reads writes admins ; do
line=$(cut -d: -f1<<<"$tuple")
table=$(cut -d: -f2<<<"$tuple")
# do some stuff with the variables
# e.g., update the values
reads=1
writes=2
admins=12345
# replace the old line with the new line
# (no -n here: combined with -i it would drop every other line from the file)
sed -i'' "$line"'{i\
'"$table;$pattern;$reads;$writes;$admins"'
;d;}' "$file"
done < <(grep -n '^'"${in_table}"';'"${in_pattern}"';' "${file}")
I chose to update by line number here to avoid problems of unknown characters in the left hand of the substitution.
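The update-by-line-number idea can also be sketched without sed -i, whose syntax differs between GNU and BSD sed. Everything here is an assumption for illustration: the scratch file, its contents, and the replacement values. Note that awk -v interprets backslash escapes in the assigned value, so this form suits replacement text that contains no backslashes.

```sh
# Hypothetical permissions file with two entries.
file=$(mktemp)
printf 'table;pattern;r;w;a\nother;x;r;w;a\n' > "$file"

# Line number of the first matching entry.
n=$(grep -n '^table;pattern;' "$file" | head -n 1 | cut -d: -f1)
new='table;pattern;1;2;12345'

# Print the new text for line n and every other line unchanged,
# then move the result over the original file.
awk -v n="$n" -v new="$new" 'NR == n { print new; next } { print }' \
    "$file" > "$file.tmp" && mv "$file.tmp" "$file"

updated=$(cat "$file")
printf '%s\n' "$updated"
rm -f "$file"
```

As in the answer above, the old line never appears on the left-hand side of a substitution, so characters such as . or * in it cannot be misread as regex syntax.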

Resources