I have a FASTA file with headers like this:
tr|Q7MX99|Q7MX99_PORGI_BACT
I would like them to say:
tr|Q7MX99|Q7MX99_PORGI_BACT_ORALMICROBIOME
So basically, whenever I have PORGI_BACT I want to append _ORALMICROBIOME to each instance.
I'm sure there is an easy fix through the terminal, but I can't seem to find it.
My first idea is to do something like:
sed 's/>.*/&_ORALMICROBIOME/' file.fa > outfile.fa
BUT I only want to add to specific header endings, and that is where I'm stuck.
Using sed:
sed -r 's/(^.*)(PORGI_BACT|HUMAN_MAM|TESTA_BACT)(.*$)/\1\2_ORALMICROBIOME\3/' file.fa > outfile.fa
Enable extended regular expressions with -r (GNU sed) or -E. The pattern splits the line into three groups: everything before the tag, the tag itself (PORGI_BACT, HUMAN_MAM or TESTA_BACT), and everything after it. The replacement emits the first and second groups, then "_ORALMICROBIOME", and finally the third group.
You are close. Please try the following:
sed 's/^>.*PORGI_BACT/&_ORALMICROBIOME/' file.fa > outfile.fa
[Edit]
According to the OP's requirement, how about:
sed -E 's/^>.*(PORGI_BACT|HUMAN_MAM|TESTA_BACT)/&_ORALMICROBIOME/' file.fa > outfile.fa
Sample input as file.fa:
>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>SEQ1|tr|Q7MX88|Q7MX88_HUMAN_MAM
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME
LKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>SEQ2|tr|Q7MX77|Q7MX77_TESTA_BACT
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>SEQ3|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
Output:
>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT_ORALMICROBIOME
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>SEQ1|tr|Q7MX88|Q7MX88_HUMAN_MAM_ORALMICROBIOME
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME
LKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>SEQ2|tr|Q7MX77|Q7MX77_TESTA_BACT_ORALMICROBIOME
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>SEQ3|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
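If the suffix should only be added when the tag is the very end of a header line, a slightly stricter variant might look like the sketch below. It assumes headers start with > and reuses the tag list from the answers above; the sample records (including the "extra" text on SEQ1) are made up for illustration.

```shell
# Hypothetical sample records; SEQ1's tag sits mid-header, so it is left alone.
input='>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT
FQTWEEFSRAAEKLY
>SEQ1|tr|Q7MX88|Q7MX88_PORGI_BACT extra
KYRTWEEFTRAAEKLY
>SEQ2|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEFSRAVEKLY'

# /^>/ restricts the substitution to header lines;
# the trailing $ only matches when the tag ends the line.
result=$(printf '%s\n' "$input" |
  sed -E '/^>/ s/(PORGI_BACT|HUMAN_MAM|TESTA_BACT)$/&_ORALMICROBIOME/')

printf '%s\n' "$result"
```

Only SEQ0 gets the suffix here, because it is the only header that ends in one of the listed tags.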
I have a file full of IDs which I need to use to build a list of URLs as part of a bash file.
ids.txt is as follows:
s_Foo
p_Bar
s1_Blah
e_Yah
The URLs will always end in a filename that contains the ID, in its own path.
I've looked around for how to prepend and append using sed, but cannot figure out how to do the duplicating copy/paste part (\1) with that tool. The ID can be anything, so pattern matching seems hard. Would duplicating everything before the line break be more sensible? I don't know.
How do I create something like this as urls.txt using sed or awk? Is it possible?
https://link.domain.com/list/s_Foo/s_Foo_meta.xml
https://link.domain.com/list/p_Bar/p_Bar_meta.xml
https://link.domain.com/list/s1_Blah/s1_Blah_meta.xml
https://link.domain.com/list/e_Yah/e_Yah_meta.xml
$ sed 's#.*#https://link.domain.com/list/&/&_meta.xml#' ids.txt
https://link.domain.com/list/s_Foo/s_Foo_meta.xml
https://link.domain.com/list/p_Bar/p_Bar_meta.xml
https://link.domain.com/list/s1_Blah/s1_Blah_meta.xml
https://link.domain.com/list/e_Yah/e_Yah_meta.xml
$ awk '{sub(/.*/,"https://link.domain.com/list/&/&_meta.xml")}1' ids.txt
https://link.domain.com/list/s_Foo/s_Foo_meta.xml
https://link.domain.com/list/p_Bar/p_Bar_meta.xml
https://link.domain.com/list/s1_Blah/s1_Blah_meta.xml
https://link.domain.com/list/e_Yah/e_Yah_meta.xml
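Try GNU sed (\S is a GNU extension). Use # as the delimiter so the slashes in the URL do not end the expression early:
sed -E 's#\S+#https://link.domain.com/list/&/&_meta.xml#' ids.txt > urls.txt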
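Since the question asked about the \1 form specifically, here is a rough equivalent using a capture group instead of &. The two-line ids variable is just a stand-in for ids.txt.

```shell
# Hypothetical stand-in for ids.txt.
ids='s_Foo
p_Bar'

# \(.*\) captures the whole line; \1 reuses it twice in the replacement.
# Using # as the delimiter avoids escaping the slashes in the URL.
urls=$(printf '%s\n' "$ids" |
  sed 's#\(.*\)#https://link.domain.com/list/\1/\1_meta.xml#')

printf '%s\n' "$urls"
```

& and \1 give the same result here because the group spans the entire match; \1 becomes necessary only when part of the match should be dropped.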
I have a text file, and there is a part of it that I would like to replace with another one. The problem is that the new part contains special characters like {=./}, and I'm not able to do that using sed.
What I want to do is to change text123 into {text123=./bla}.
I tried to do it with sed, but it looks like there are some issues with that. Can anyone help me with this?
It looks like I found a solution, but unfortunately there is another problem. I want to put all the lines below (with all their special characters) into the text file using echo:
sed -i -e 's/Location/Location\n\n\/\/One_File\nPart Second File '"{ d =.\/source; }"'\nThird File '"{ d =.\/source; }"'/' $file
So once I run the script, it will give me something like this:
Location
\\One file
Part Second File { d =.\/source; }
Part Third File { d =.\/source; }
Can anyone help me with that? Seems there are too many special signs here, but I need them all.
echo text123 | sed 's/text123/{text123=.\/bla}/g'
Result:
{text123=./bla}
Using back reference with sed:
echo text123 | sed 's/\(text123\)/{\1=.\/bla}/g'
{text123=./bla}
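Another option, sketched here, is to pick a delimiter that does not occur in the replacement, so the slash needs no backslash at all. The choice of | is arbitrary; any character absent from both pattern and replacement should work.

```shell
# With | as the delimiter, / in the replacement needs no backslash.
# & stands for the matched text, so text123 is not typed twice.
out=$(echo 'text123' | sed 's|text123|{&=./bla}|g')
echo "$out"
```

The braces, = and . are ordinary characters in the replacement part, so only the delimiter, & and backslashes need care there.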
I have a configuration file which has a line ServerIP= in it. I want to find this line and add a new IP address to it, i.e. replace it with ServerIP=192.168.0.101. What would the command look like?
You can employ a find-and-replace command to do this:
sed -e 's/\(^ServerIP=\)/\1192.168.0.101/g' your_file
Are we doing this all over the file or only in one spot? The command above should replace it everywhere. You will have to send the output somewhere. I never edit in place with sed because I make too many mistakes.
One tricky thing is this part, \1192.168.0.101, which actually can be broken down like this:
\1 --> the thing we captured
192.168.0.101 --> the thing we are placing IMMEDIATELY after the thing we captured
Also, you may have other lines that look a little different. But, in the future, look up "sed capture and replace".
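To see the \1192.168.0.101 part in action without touching a real file, something like this can be run; the three-line config is invented for the demonstration.

```shell
# Hypothetical config contents used only for this demonstration.
config='Port=8080
ServerIP=
Name=demo'

# \1 is the captured "ServerIP="; sed backreferences are a single digit,
# so the 192... that follows is literal text, not part of the reference.
updated=$(printf '%s\n' "$config" | sed 's/^\(ServerIP=\)/\1192.168.0.101/')
printf '%s\n' "$updated"
```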
This one would work whether there's an existing value in ServerIP or not:
sed -i 's#\([[:blank:]]*ServerIP=\)[[:digit:].]*#\1192.168.0.101#' file
I also suggest that you try to learn using CLI editors like VIM or Nano instead.
try:
sed -i 's/^ *ServerIP=/&192.168.0.101/' file
I would do:
sed -i 's/^ServerIP=$/ServerIP=192.168.0.101/' file.config
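The answers above behave differently when ServerIP already has a value. A quick way to compare them is to feed each expression a line on stdin instead of editing a file; the function names and the 10.0.0.5 value below are made up for the comparison.

```shell
# Each function wraps one of the expressions above, reading stdin (no -i).
replace_value() { sed 's#\([[:blank:]]*ServerIP=\)[[:digit:].]*#\1192.168.0.101#'; }
append_value()  { sed 's/^ *ServerIP=/&192.168.0.101/'; }
only_if_empty() { sed 's/^ServerIP=$/ServerIP=192.168.0.101/'; }

existing='ServerIP=10.0.0.5'   # hypothetical line that already has a value

a=$(printf '%s\n' "$existing" | replace_value)   # old value replaced
b=$(printf '%s\n' "$existing" | append_value)    # new IP inserted before old one
c=$(printf '%s\n' "$existing" | only_if_empty)   # line left untouched

printf '%s\n' "$a" "$b" "$c"
```

On an empty ServerIP= line all three should produce the same result, so the choice mostly depends on what should happen to an existing value.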
I have a text file and I want to remove all lines containing the words: facebook, youtube, google, amazon, dropbox, etc.
I know to delete lines containing a string with sed:
sed '/facebook/d' myfile.txt
I don't want to run this command five different times though for each string, is there a way to combine all the strings into one command?
Try this:
sed '/facebook\|youtube\|google\|amazon\|dropbox/d' myfile.txt
From GNU's sed manual:
regexp1\|regexp2
Matches either regexp1 or regexp2. Use parentheses to use
complex alternative regular expressions. The matching process tries
each alternative in turn, from left to right, and the first one that
succeeds is used. It is a GNU extension.
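Because \| is a GNU extension, a version using extended regular expressions (-E) may be more portable; there the alternation is a plain |. A small sketch with made-up sample lines:

```shell
# Hypothetical sample text.
sample='visit facebook today
plain line
watch youtube later
another plain line'

# With -E, | means "or" without a backslash.
kept=$(printf '%s\n' "$sample" |
  sed -E '/facebook|youtube|google|amazon|dropbox/d')

printf '%s\n' "$kept"
```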
grep -vf wordsToExcludeFile myfile.txt
"wordsToExcludeFile" should contain the words you don't want, one per line.
If you need to save the result back to the same file, then add this to the command:
> myfile.new && mv myfile.new myfile.txt
With awk
awk '!/facebook|youtube|google|amazon|dropbox/' myfile.txt > filtered.txt
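For a longer or changing word list, the grep -f approach from the earlier answer can be tried out in a scratch directory as sketched below. The file names and contents are placeholders, and -F is added on the assumption that the words are plain strings rather than regular expressions.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' facebook youtube google amazon dropbox > "$tmpdir/words.txt"
printf '%s\n' 'visit facebook today' 'plain line' 'dropbox sync' > "$tmpdir/myfile.txt"

# -f reads one pattern per line, -F treats them as fixed strings,
# -v keeps only the lines that match none of them.
filtered=$(grep -vFf "$tmpdir/words.txt" "$tmpdir/myfile.txt")
printf '%s\n' "$filtered"

rm -rf "$tmpdir"
```

Watch out for blank lines in the word file: an empty pattern matches every line, so -v would then remove everything.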
I'm trying to find the parameters sent in all POST requests by looking at my production Rails log files. Right now I'm just using the following:
grep 'Started POST.*\/[fr]' log/production.log
This shows me when the POSTs happen but not the parameters. What I'd like is to do something along the lines of:
Store the line in sed's hold buffer when it encounters the regex above
Print the contents of the hold buffer and the current line when it encounters "Parameters:"
As I understand it, you need to display the line matching the regex and the line that follows it. grep can do this by itself:
grep -A1 'Started POST.*\/[fr]' log/production.log
with sed it will look like:
sed -n '/Started POST.*\/[fr]/{N;p}' log/production.log
or, if not every following line contains "Parameters" and you only need the ones that do:
sed -n '/Started POST.*\/[fr]/{N;/Parameters/p}' log/production.log
This should do what you're looking for:
sed -n '/Started POST.*\/[fr]/h;/Parameters/{H;g;p;q}' log/production.log
It holds the first line and prints it and the line containing "Parameters". Only the first set of lines will be printed.
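If every matching POST should be reported rather than just the first, an awk version along these lines might be easier to follow than juggling the hold buffer. The log excerpt is invented, and the sketch assumes the lines of one request are not interleaved with those of another.

```shell
# Hypothetical log excerpt in the usual Rails layout.
log='Started POST "/foo" for 127.0.0.1
Processing by FooController#create as HTML
  Parameters: {"a"=>"1"}
Started GET "/bar" for 127.0.0.1
  Parameters: {"b"=>"2"}
Started POST "/reports" for 127.0.0.1
  Parameters: {"c"=>"3"}'

pairs=$(printf '%s\n' "$log" | awk '
  /Started /             { held = "" }          # a new request clears any held line
  /Started POST.*\/[fr]/ { held = $0; next }    # remember the matching POST line
  /Parameters:/ && held != "" {                 # print the pair, then reset
    print held; print; held = ""
  }
')

printf '%s\n' "$pairs"
```

The GET request's parameters are skipped because its "Started" line resets the held value without setting a new one.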