Transform two lines separated by a line into one line - shell

I'm trying to find the parameters sent in all POST requests by looking at my production Rails log files. Right now I'm just using the following:
grep 'Started POST.*\/[fr]' log/production.log
This shows me when the POSTs happen but not the parameters. What I'd like is to do something along the lines of:
Store the line in sed's hold buffer when it encounters the regex above
Print the contents of the hold buffer and the current line when it encounters "Parameters:"

As I understand it, you need to display the line matching the regex and the line that follows it. grep can do that by itself:
grep -A1 'Started POST.*\/[fr]' log/production.log
with sed it will look like:
sed -n '/Started POST.*\/[fr]/{N;p}' log/production.log
or, if not every following line contains "Parameters" and you only want the ones that do:
sed -n '/Started POST.*\/[fr]/{N;/Parameters/p}' log/production.log

This should do what you're looking for:
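sed -n '/Started POST.*\/[fr]/h;/Parameters/{H;g;p;q}' log/production.log
It holds the matching "Started POST" line, then prints it together with the line containing "Parameters". Because of the q command, sed quits after the first pair, so only the first set of lines is printed.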
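If every matching pair is wanted rather than only the first, the hold-buffer idea can be sketched as below. The sample log is hypothetical (its layout, including the "Processing by" lines between "Started" and "Parameters:", is an assumption and not taken from the question), and the variable names `log` and `pairs` are just for illustration.

```shell
# Hypothetical sample log; the layout is an assumption, not real production data.
log=$(mktemp)
cat > "$log" <<'EOF'
Started POST "/foo" for 127.0.0.1
Processing by FooController#create as HTML
  Parameters: {"a"=>"1"}
Started GET "/bar" for 127.0.0.1
  Parameters: {"q"=>"x"}
Started POST "/reports" for 127.0.0.1
Processing by ReportsController#create as HTML
  Parameters: {"b"=>"2"}
EOF

# Hold every "Started" line. On a "Parameters:" line, swap the held line into
# the pattern space (x); only if it is a matching POST, append the parameters
# line from the hold space (G) and print both lines (p).
pairs=$(sed -n -e '/^Started /h' \
               -e '/Parameters:/{x;/Started POST.*\/[fr]/{G;p;};}' "$log")
printf '%s\n' "$pairs"
rm -f "$log"
```

Holding every "Started" line, not just the POST ones, keeps a stale POST line from being paired with the parameters of a later GET request.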

Related

Appending text to specific patterns in a fasta BASH

I have a fasta with headers like this:
tr|Q7MX99|Q7MX99_PORGI_BACT
I would like them to say:
tr|Q7MX99|Q7MX99_PORGI_BACT_ORALMICROBIOME
So basically, whenever I have PORGI_BACT I want to append _ORALMICROBIOME to each instance.
I'm sure there is an easy fix through the terminal, but I can't seem to find it.
My first idea is to do something like:
sed 's/>.*/&_ORALMICROBIOME/' file.fa > outfile.fa
BUT I only want to add to specific header endings, and that is where I'm stuck.
Using sed:
sed -r 's/(^.*)(PORGI_BACT|HUMAN_MAM|TESTA_BACT)(.*$)/\1\2_ORALMICROBIOME\3/' file.fa > outfile.fa
Enable extended regular expressions with -r (GNU sed) or -E, then split the line into three groups, with the matched suffix (for example "PORGI_BACT") in group two. The replacement emits groups one and two, followed by "_ORALMICROBIOME", and then group three.
You are almost there. Please try the following:
sed 's/^>.*PORGI_BACT/&_ORALMICROBIOME/' file.fa > outfile.fa
[Edit]
According to the OP's requirement, how about:
sed -E 's/^>.*(PORGI_BACT|HUMAN_MAM|TESTA_BACT)/&_ORALMICROBIOME/' file.fa > outfile.fa
Sample input as file.fa:
>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>SEQ1|tr|Q7MX88|Q7MX88_HUMAN_MAM
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME
LKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>SEQ2|tr|Q7MX77|Q7MX77_TESTA_BACT
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>SEQ3|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
Output:
>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT_ORALMICROBIOME
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>SEQ1|tr|Q7MX88|Q7MX88_HUMAN_MAM_ORALMICROBIOME
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME
LKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>SEQ2|tr|Q7MX77|Q7MX77_TESTA_BACT_ORALMICROBIOME
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>SEQ3|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
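The commands above append the tag wherever one of the names occurs in a header. If the tag should be added only when the header actually ends with one of those names, a variant anchored with $ might look like this. It is a sketch: the shortened sequences and the SEQ4 header are made up for illustration, and `fa` and `tagged` are hypothetical variable names.

```shell
# Hypothetical, shortened sample; SEQ4 is invented to show a non-matching header.
fa=$(mktemp)
cat > "$fa" <<'EOF'
>SEQ0|tr|Q7MX99|Q7MX99_PORGI_BACT
FQTWEEF
>SEQ3|tr|Q7MX66|Q7MX66_DUMMY
MYQVWEEF
>SEQ4|tr|Q7MX55|PORGI_BACT_LIKE
KYRTWEEF
EOF

# Only touch header lines (/^>/), and only when the header ends ($) with one
# of the listed names; & re-inserts the matched name before the new suffix.
tagged=$(sed -E '/^>/ s/(PORGI_BACT|HUMAN_MAM|TESTA_BACT)$/&_ORALMICROBIOME/' "$fa")
printf '%s\n' "$tagged"
rm -f "$fa"
```

Here only the SEQ0 header is changed; SEQ4 contains PORGI_BACT but does not end with it, so it is left alone.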

How do I get rid of “--” line separator when using grep

I'm using the commands given below for splitting my fastq file into two separate paired end reads files:
grep '#.*/1' -A 3 24538_7#2.fq >24538_7#2_1.fq
grep '#.*/2' -A 3 24538_7#2.fq >24538_7#2_2.fq
But grep automatically inserts a -- separator line between the entries, which makes the output an invalid fastq file that can't be processed further.
So I want to get rid of the separator line (--).
PS: I've found answers for Linux, but I'm on macOS, and those didn't work in the Mac terminal.
You can use the --no-group-separator option to suppress it (in GNU grep).
Alternatively, you could use (GNU) sed:
sed '\|#.*/1|,+3!d'
This deletes every line except those matching #.*/1 and the three lines following each match.
For macOS sed, you could use
sed -n '\|#.*/1|{N;N;N;p;}'
but this gets unwieldy quickly for more context lines.
Another approach would be to chain grep with itself:
grep '#.*/1' -A 3 file.fq | grep -v "^--"
The second grep uses -v to drop the lines that start with -- (a bare -- pattern can be interpreted as a command-line option, requiring awkward escaping like "[-][-]", which is why I put the ^ there).
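One caveat with filtering on ^-- is that a FASTQ quality line may itself begin with "--", since "-" is a valid quality character. A sketch that avoids separators altogether, assuming every record is exactly four lines, is to let awk decide per record whether to keep it. The reads below are made up, and `fq` and `reads1` are hypothetical names.

```shell
# Hypothetical reads; the third record has a quality line starting with "--".
fq=$(mktemp)
cat > "$fq" <<'EOF'
@r1#2/1
ACGT
+
IIII
@r1#2/2
TGCA
+
IIII
@r2#2/1
GGCC
+
--II
EOF

# On each header line (every 4th line, starting at line 1) decide whether the
# record belongs to read 1; then print lines while the flag is set.
reads1=$(awk 'NR % 4 == 1 { keep = ($0 ~ /#.*\/1$/) } keep' "$fq")
printf '%s\n' "$reads1"
rm -f "$fq"
```

This uses only POSIX awk features, so it should behave the same on macOS and Linux, and the "--II" quality line survives because nothing is filtered by content.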

How to omit reading of first line using sed, when a variable is used?

I want to read data based on a timestamp: check the first line of the input file and, if it matches $timestamp, remove the lines above the obtained maximum timestamp. I used the command below, but sed throws an error on the expression I wrote for deleting the lines before $timestamp. Where am I going wrong?
timestamp=`(exec mysql db -u xxx -pxxx -s -N -e "select max(time) from table;")`
sed -e '1/"$timestamp"/d' /home/xx/xx
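I guess what you want is to check the first line of the input file: if it matches $timestamp, remove that line; if it doesn't, leave the file unchanged.
Your sed expression has two problems: 1/"$timestamp"/d is not valid syntax (a regex test on line 1 has to be nested in a block), and the single quotes stop the shell from expanding $timestamp. This line should work: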
sed "1{/$timestamp/d}" /home/whatever
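A small self-contained check of that command might look like the following. The timestamp value and the two data lines are hypothetical stand-ins for the mysql result and the real file, and the ; before } is added only because BSD sed requires it.

```shell
# Hypothetical value standing in for the result of the mysql query.
timestamp='2013-06-06 08:32:43'

data=$(mktemp)
printf '%s\n' '2013-06-06 08:32:43 first' '2013-06-06 08:40:00 second' > "$data"

# Double quotes let the shell expand $timestamp before sed sees the script.
# The block { ... } runs only on line 1; d deletes it if the regex matches.
kept=$(sed "1{/$timestamp/d;}" "$data")

# With a value that does not match line 1, nothing is removed.
untouched=$(sed "1{/1999-01-01/d;}" "$data")

printf '%s\n' "$kept"
rm -f "$data"
```

Because the variable is pasted into a regex, a value containing / or regex metacharacters would need escaping or a different delimiter (for example \|...| as the address delimiter).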

Create polymorphic bash without die trying. Sed replacing

I'm creating a bash script which at one point needs to modify itself to make a change persistent; only one line of the script needs to change.
I know sed -i is what I need for this. The problem is that my sed command replaces the line where the command itself is stored instead of the line I want, so I guess I need to add an exclusion to the replacement. Here is the snippet:
#!/bin/bash
echo "blah,blah,blah"
echo "more code here, not matters"
sed -i "s/#Awesome line to be replaced/#New line here/" "/path/to/my/script" 2> /dev/null
#Awesome line to be replaced
echo "blah,blah,blah, more code blah"
The problem is that the line being replaced is not the one containing only #Awesome line to be replaced; sed replaces the line where the sed command itself is.
This is a reduced example, but the script is polymorphic and the line numbers may change, so the fix can't be based on line numbers. There will be more sed commands like this, so I thought it would be nice to have a piece of text that always appears in the sed command lines and can serve as an exclusion pattern. That piece of text is /dev/null, which will always be in the sed command lines and never in the line I want to replace.
How can I achieve this using sed -i? Thanks in advance.
EDIT: I forgot to say that the order of appearance (offset) can't be used either, because of the polymorphic thing.
EDIT2: The characters before #Awesome line to be replaced can't be used because they could change too. Sorry to those who already answered based on this. It is complicated to write a polymorphic snippet considering all the possibilities.
This hack can work:
sed -i "s/#[A]wesome line to be replaced/#New line here/" "/path/to/my/script" 2> /dev/null
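It will not match the sed line itself because the pattern [A]wesome matches the text "Awesome", while the sed line contains the literal text "[A]wesome", which that pattern does not match.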
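The bracket trick can be tried out safely on a copy instead of on a live script. The sketch below writes a hypothetical miniature script to a temporary file and applies the same substitution from outside, sending the result to a new file rather than using -i (whose syntax differs between GNU and BSD sed). The names `script` and `$script.new` are illustrative only.

```shell
# Hypothetical miniature of the self-modifying script.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
echo "start"
sed -i "s/#[A]wesome line to be replaced/#New line here/" "$0" 2> /dev/null
#Awesome line to be replaced
echo "end"
EOF

# Same substitution, run from outside and written to a new file.
# [A] is a one-character bracket expression, so the regex matches "#Awesome..."
# but not the literal "#[A]wesome..." inside the sed line itself.
sed 's/#[A]wesome line to be replaced/#New line here/' "$script" > "$script.new"
cat "$script.new"
```

After running it, the marker line reads #New line here while the line holding the sed command is unchanged.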
Anchor your expression to the start of the line:
sed -i "s/^[[:blank:]]*#Awesome (rest of command goes here)
This makes sed match only when the text sits at the beginning of the line, after zero or more spaces or tabs, so it will not match the line holding the actual sed command.

Awk/Sed - how to print selection between two patterns?

Following catonmat.net, I thought I could get the selection I'm interested in, between two patterns, using the following:
Source Text (one line): 6 June 2013 08.32.435 UTF+8 Report /content/folder[#name='....' Failure ....
Here the important part is the path to the report, therefore I am using:
awk '/content\/folder\[#name=/,/Failure/' source.csv
I got the entire matched line, instead of only the content path between the two matches.
I have also tried:
sed -n '/content\/folder\[#name/,/Failure/ {/content\/folder\[#name\|Failure/!p}' source.csv
Still returning the entire line...
What was wrong?
Try this:
sed -n '\|content/folder\[#name.*Failure|s|.*content/folder\[#name\(.*\)Failure.*|\1|p' source.csv
/re1/,/re2/ is for selecting a range of lines, not a range of text within a line. Since content/folder and Failure are on the same line, you don't need a range, just a regex that matches a line containing both. Then use s/// to extract the part between them. Note that a custom address delimiter must be introduced with a backslash (\|...|), and that with -n the p flag is needed to print the result.
sed 's,.*/content/folder\[#name=\(.*\)Failure.*,\1,' source.csv
grep -Po '(?<=#name=).*(?=Failure)' source.csv
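To see the within-line extraction on a concrete input, here is a minimal sketch. The sample line is hypothetical: the question elides the actual path, so 'sales' and the trailing "timeout" are invented for illustration, and `line` and `path` are made-up variable names.

```shell
# Hypothetical log line; the folder name and failure text are invented.
line="6 June 2013 08.32.435 UTF+8 Report /content/folder[#name='sales'] Failure timeout"

# Capture from "/content/folder[#name=" up to the text before " Failure",
# replace the whole line with that capture, and print only if a match occurred.
path=$(printf '%s\n' "$line" |
  sed -n 's|.*\(/content/folder\[#name=.*\) Failure.*|\1|p')

printf '%s\n' "$path"
```

Using | as the s delimiter avoids escaping the slashes in the path, and -n with the p flag means lines without both markers print nothing.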
