I have several sequences that I want to look up in my file and, if they are present, extract into another file. Each sequence starts with a unique id that must be kept and ends at the next ">", which I don't want to keep. I wrote a test script, but I have a problem with the regular expression:
#!/bin/bash
cat data.fsa | grep "Qrob" | wc -l
for gene_id in 'gene1' 'gene2'
do
if cat "data.fsa" |grep $gene_id >/dev/null 2>&1
then
echo "data.fsa" | sed -n "s/.*${gene_id}\(.*\)>.*/\"\1\"/p"
else
continue
fi
done
How do I do this? Thanks for your help
I understand my error now, thanks to you! Thank you.
sed -n "/^>$gene_id/,/^>/p" data.fsa >> test.fsa && sed -i '$d' test.fsa
This generates the file directly; sed -i '$d' test.fsa then deletes the last line of the selection, which is the ">" header of the next sequence.
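A minimal sketch of the same extraction without the trailing '$d' step, assuming each record starts with a ">" header line that begins with the id. The sample data, the extract_records helper and the awk approach are illustrative assumptions, not part of the original thread; awk is swapped in for the sed range because it stops at the next header by itself.

```shell
#!/bin/sh
# Hypothetical sample data; the real data.fsa is not shown in the thread.
workdir=$(mktemp -d)
cat > "$workdir/data.fsa" <<'EOF'
>gene1 first
ACGT
TTGA
>gene2 second
GGCC
>gene3 third
AAAA
EOF

# extract_records FILE ID... : print every record whose header starts with ">ID".
# Note: this is a prefix match, so "gene1" would also select "gene10".
extract_records() {
    fasta=$1
    shift
    for gene_id in "$@"; do
        awk -v id="$gene_id" '
            /^>/ { keep = (index($0, ">" id) == 1) }  # on each header, decide whether to keep the record
            keep                                      # print header and sequence lines while keep is set
        ' "$fasta"
    done
}

extract_records "$workdir/data.fsa" gene1 gene3 > "$workdir/test.fsa"
cat "$workdir/test.fsa"
```

Because the header of the following record switches printing off, the last record in the file is handled the same way as any other.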
Related
I am trying to do a one liner command that would delete the first line from a bunch of files. The list of files will be generated by grep command.
grep -l 'hsv,vcv,tro,ztk' ${OUTPUT_DIR}/*.csv | tr -s "\n" " " | xargs /usr/bin/sed -i '1d'
The problem is that sed can't see the list of files to act on. I can't work out what is wrong with the command. Can someone please point out my mistake?
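By default, line numbers in sed are counted across all input files, so the address 1 matches only once per sed invocation.
If your sed does not restart the numbering for each file when editing in place (GNU sed's -i does, because it implies -s), only the first file in the list will get edited.
You can complete your task with a loop such as this, which runs sed once per file: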
grep -l 'hsv,vcv,tro,ztk' "${OUTPUT_DIR}/"*.csv |
while IFS= read -r file; do
sed -i '1d' "$file"
done
This might work for you (GNU sed and grep):
grep -l 'hsv,vcv,tro,ztk' ${OUTPUT_DIR}/*.csv | xargs sed -i '1d'
The -l option outputs the names of the matching files, which xargs passes to sed as arguments.
The -i option edits each file in place, and 1d removes its first line.
N.B. The -i option in sed works at a per-file level; to restart line numbers for each file without editing in place, use the -s option.
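The difference between continuous and per-file line numbering can be checked with a small experiment. This is a sketch that assumes GNU sed (for -s and for -i without a suffix argument); the file names and contents are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU sed; a.csv and b.csv are hypothetical sample files.
workdir=$(mktemp -d)
printf 'header a\nrow a1\n' > "$workdir/a.csv"
printf 'header b\nrow b1\n' > "$workdir/b.csv"

# Default: the inputs form one continuous stream, so address 1 matches only in the first file.
joined=$(sed -n '1p' "$workdir/a.csv" "$workdir/b.csv")

# -s: line numbering restarts for every file.
separate=$(sed -s -n '1p' "$workdir/a.csv" "$workdir/b.csv")

# In GNU sed, -i implies -s, so the first line of each file is removed.
sed -i '1d' "$workdir/a.csv" "$workdir/b.csv"

printf 'joined:\n%s\nseparate:\n%s\n' "$joined" "$separate"
cat "$workdir/a.csv" "$workdir/b.csv"
```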
Apart from the one posted by Dan above, the only solution that worked for me is this:
for k in $(grep -l 'hsv,vcv,tro,ztk' ${OUTPUT_DIR}/*.csv | tr -s "\n" " ")
do
/usr/bin/sed -i '1d' "${k}"
done
I'm trying to get the output from a grep and sed pipe to go to the terminal and a text file.
Neither
grep -Filr "string1" * 2>&1 | tee ~/outputfile.txt | sed -i "s|string1|string2|g"
nor
grep -Filr "string1" * | sed -i "s|string1|string2|g" 2>&1 | tee ~/outputfile.txt
work. I get "sed: no input files" going to the terminal so sed is not getting the correct input. I just want to see and write out to a text file which files are modified from the search and replace. I know using find instead of grep would be more efficient since the search wouldn't be done twice, but I'm not sure how to output the file name using find and sed when there is a search hit.
EDIT:
Oops I forgot to include xargs in the code. It should have been:
grep -Filr "string1" * 2>&1 | tee ~/outputfile.txt | xargs sed -i "s|string1|string2|g"
and
grep -Filr "string1" * | xargs sed -i "s|string1|string2|g" 2>&1 | tee ~/outputfile.txt
To be clear, I'm looking for a solution that modifies the matched files with the search and replace, and then outputs the modified files' file names to the terminal and a log file.
The -i option to sed is only useful when sed operates on a file, not on standard input. Drop it, and your first option is correct.
I'd use a loop:
for i in `grep -lr string1 *`; do sed -i . 's/string1/string2/g' $i; echo $i >> ~/outputfile.txt; done
I'd advise against using the -i option for grep, because it would also match files that differ only in case, which the case-sensitive sed command won't actually modify.
You can do the same with find and exec, but that's a dangerous tool.
I almost forgot about this. I eventually went with a for loop in a bash script:
#!/bin/bash
for i in $( grep -Flr "string1" * ); do
sed -i "s|string1|string2|g" "$i"
echo "$i"
echo "$i" >> ~/outputfile.txt
done
I'm using the vertical pipe | as the separator, because I'm replacing URL paths with lots of forward slashes.
Thank you both for your help.
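For reference, a sketch of the same idea that searches only once and uses tee to send the modified file names to both the terminal and a log. It assumes GNU grep and GNU sed (-i without a suffix); the directory layout, file names and the old/new/log variables are hypothetical.

```shell
#!/bin/sh
# Assumes GNU grep and GNU sed; all paths below are made up for illustration.
workdir=$(mktemp -d)
mkdir "$workdir/site"
printf 'link /old/path/x\n' > "$workdir/site/a.html"
printf 'link /old/path/y\n' > "$workdir/site/b.html"
printf 'nothing here\n'     > "$workdir/site/c.html"

old='/old/path'
new='/new/path'
log="$workdir/outputfile.txt"

# Finish the search before editing anything, so grep never races with sed.
grep -rlF -- "$old" "$workdir/site" > "$workdir/matches.txt"

# Edit each matching file, report its name only if sed succeeded,
# and let tee copy the names to the terminal and to the log file.
while IFS= read -r file; do
    sed -i "s|$old|$new|g" "$file" && printf '%s\n' "$file"
done < "$workdir/matches.txt" | tee "$log"
```

Reading names line by line copes with spaces in file names, which the for loop over a command substitution does not; names containing newlines would still need a NUL-delimited approach.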
I am trying to select the nth file in a folder whose filename matches a certain pattern.
I've tried using sed, e.g.,
sed -n 3p /path/to/files/*pattern*.txt
but it appears to return the 3rd line of the first matching file.
I've also tried
sed -n 3p ls /path/to/files/*pattern*.txt
which doesn't work either.
Thanks!
Why sed, when bash is so much better at it?
Assuming a variable n holds the index you want (counting from 0):
Bash
files=(path/to/files/*pattern*.txt)
echo "${files[n]}"
Posix sh
i=0
for file in path/to/files/*pattern*.txt; do
if [ "$i" -eq "$n" ]; then
break
fi
i=$((i + 1))
done
echo "$file"
What's wrong with sed is that you would have to jump through many hoops to make it safe for every character that can occur in a filename, and even if that doesn't matter to you, you end up with a double layer of subshells to get the answer (note that sed counts lines from 1):
file=$(printf '%s\n' path/to/files/*pattern*.txt | sed -n "$n"p)
Please, never parse ls.
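Another way to avoid both ls and sed in plain POSIX sh is to let the glob fill the positional parameters and shift to the one you want. This is a sketch under the assumption that a 1-based index is wanted; the nth_match helper and the sample file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample files; only the names matter here.
workdir=$(mktemp -d)
touch "$workdir/a_pattern_1.txt" "$workdir/b_pattern_2.txt" \
      "$workdir/c_pattern_3.txt" "$workdir/other.txt"

# nth_match N FILE... : print the Nth (1-based) of the given names,
# or return non-zero if N is out of range.
nth_match() {
    n=$1
    shift
    [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || return 1
    shift $((n - 1))
    printf '%s\n' "$1"
}

# The shell expands the glob in sorted order before the function is called.
third=$(nth_match 3 "$workdir"/*pattern*.txt)
echo "$third"
```

Since the names never pass through a pipe, spaces and other unusual characters in file names are not a problem.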
ls -1 /path/to/files/*pattern*.txt | sed -n '3p'
or, if pattern is a regex:
ls -1 /path/to/files/ | grep -E 'pattern' | sed -n '3p'
There are lots of other possibilities, depending on whether you care more about performance or simplicity.
Within my template (callbacks), there is a line that ends with "IP:" that I would like to append to. I tried this command:
cat callbacks | grep "IP:" | cut -d ":" -f 2 | echo $(ping -c2 host.com)
I thought I would be able to echo something at the end, but that didn't work. Could someone please shed some light on what I am doing wrong?
This is what i have so far:
for textfile in $(find . -iname "2013*-malware-callback*.txt")
do cat callbacks | cat - $textfile > tmpfile && mv tmpfile $textfile
done
The following takes the contents of $textfile, finds any occurrence of IP: and appends to it an IP address, and saves the result in tmpfile:
v="1.2.3.4"
cat "$textfile" | sed 's/IP:/IP: '"$v/" >tmpfile
The pipeline can be simplified:
sed 's/IP:/IP: '"$v/" <"$textfile" >tmpfile
Further, if the ultimate goal is to replace $textfile with the modified version, we can use sed's modify-in-place feature:
sed -i.bak 's/IP:/IP: '"$v/" "$textfile"
This modifies $textfile in place and, for safekeeping, leaves a backup copy of the original with extension .bak.
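Put together, a portable sketch that does not rely on -i could look like this. The file contents are made up, v is a placeholder value rather than a real lookup, and the pattern is anchored with $ on the assumption that only lines ending in "IP:" should change.

```shell
#!/bin/sh
# Hypothetical template contents; the real callbacks file is not shown in the thread.
workdir=$(mktemp -d)
printf 'Host: host.com\nIP:\nNotes: none\n' > "$workdir/callbacks"

v="1.2.3.4"   # placeholder address, not obtained from ping or DNS

# Append the value to lines that end with "IP:", write to a temporary file,
# and replace the original only if sed succeeded.
sed "s/IP:\$/IP: $v/" "$workdir/callbacks" > "$workdir/tmpfile" &&
    mv "$workdir/tmpfile" "$workdir/callbacks"

cat "$workdir/callbacks"
```

If the value could contain "/" or "&", it would need escaping before being placed in the replacement text.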
I have a command like this:
cat error | grep -o [0-9]
which prints only numbers such as 2, 30 and so on. Now I want to pass these numbers to sed.
Something like:
cat error | grep -o [0-9] | sed -n '$OutPutFromGrep,$OutPutFromGrepp'
Is it possible to do so?
I'm new to shell scripting. Thanks in advance
If the intention is to print the lines that grep returns, generating a sed script might be the way to go:
grep -E -o '[0-9]+' error | sed 's/$/p/' | sed -n -f - error
You are probably looking for xargs, particularly the -I option:
themel#eristoteles:~$ xargs -I FOO echo once FOO, twice FOO
hi
once hi, twice hi
there
once there, twice there
Your example:
themel#eristoteles:~$ cat error
error in line 123
error in line 234
errors in line 345 and 346
themel#eristoteles:~$ grep -o '[0-9]*' < error | xargs -I OutPutFromGrep echo sed -n 'OutPutFromGrep,OutPutFromGrepp'
sed -n 123,123p
sed -n 234,234p
sed -n 345,345p
sed -n 346,346p
For real-world use, you'll probably want to pass sed an input file and remove the echo.
(Fixed your UUOC, by the way.)
Yes you can pass output from grep to sed.
Please note that to match whole numbers you need [0-9]* rather than just [0-9], which matches only a single digit.
Also note that you should use double quotes so that variables are expanded in the sed argument, and it seems you have a typo in the second variable name.
Hope this helps.
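As an illustration of the double-quote advice, here is a sketch that reads each number from grep and uses it as a sed line address. The two sample files and the variable names are assumptions made for the example; grep -o is a common extension (GNU and BSD) rather than POSIX.

```shell
#!/bin/sh
# Hypothetical inputs: "error" mentions line numbers, "data.txt" is the file to print from.
workdir=$(mktemp -d)
printf 'error in line 2\nerrors in line 4 and 5\n' > "$workdir/error"
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > "$workdir/data.txt"

selected=$(
    # One number per output line; [0-9][0-9]* requires at least one digit.
    grep -o '[0-9][0-9]*' "$workdir/error" |
    while IFS= read -r n; do
        # Double quotes let the shell expand $n before sed sees the script.
        sed -n "${n}p" "$workdir/data.txt"
    done
)
printf '%s\n' "$selected"
```

This runs sed once per number, which is simple but slow for long lists; generating a single sed script, as in the first answer, avoids that.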