How to delete matching lines using sed in while loop - bash

Below is the content of the claim_note file:
B|2050013344207770
B|2050013344157085
I have an Input file which has these values:
B|2050013344207770|xxx|xxx
B|2050013344157085|xxx|xxx
B|2050013344157999|xxx|xxx
I am using the code below to delete the matching lines in the Input file, but my code deletes only the first matching pattern:
cat claim_note | while read FILE
do
echo $FILE
sed -n "/$FILE/!p" Input > TempInput
mv TempInput Input
done

Rather than looping and running sed on every line, you can use awk:
awk -F'|' 'FNR==NR{a[$1,$2]; next} !(($1,$2) in a)' claim_note Input
B|2050013344157999|xxx|xxx
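For reference, the same command spelled out with comments:
awk -F'|' '
  FNR == NR { a[$1,$2]; next }   # first file (claim_note): remember each B|<claim> pair as a key
  !(($1,$2) in a)                # second file (Input): print only lines whose key was not stored
' claim_note Input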

You can use this grep:
grep -vf claim.txt input.txt
Output:
B|2050013344157999|xxx|xxx
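Since the patterns contain literal | characters and are not meant as regular expressions, adding -F (fixed-string matching) makes the same approach a little safer; with the question's file names:
grep -vFf claim_note Input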

Related

Remove hyphen from duration format time

I need to remove the hyphens from a duration-format time, and I didn't succeed with the sed command as I intended.
original output:
00:0-26:0-8
00:0-28:0-30
00:0-28:0-4
00:0-28:0-28
00:0-27:0-54
00:0-27:0-19
Expected output:
00:26:08
00:28:30
00:28:04
00:28:28
00:27:54
00:27:19
I tried this command but I am stuck:
sed 's/;/ /g' temp_file.txt | awk '{print $8}' | grep - | sed 's/-//g;s/00:0/0:/g'
Using sed:
sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' file
The first command, s/\<[0-9]\>/0&/g, adds a leading zero to single-digit numbers.
The second command, s/:00-/:/g, then removes the (now padded) 00- in front of each number.
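Tracing the first sample line shows what each step does:
$ echo '00:0-26:0-8' | sed 's/\<[0-9]\>/0&/g'
00:00-26:00-08
$ echo '00:0-26:0-8' | sed 's/\<[0-9]\>/0&/g;s/:00-/:/g'
00:26:08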
With your shown sample only, the following awk may help:
awk -F":" '{for(i=1;i<=NF;i++){sub(/0-/,"",$i);$i=length($i)==1?0$i:$i}} 1' OFS=":" Input_file
In case you want to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the above command.
For the given example, this one-liner does the job:
awk -F':0-' '{printf "%02d:%02d:%02d\n",$1,$2,$3}' file
What if I have the output below, where "duration time" is only one of several columns? When I try to use one of your regexps above, it also adds a "0" in the other columns such as the timestamp, and I don't want that; only column $7, the duration_time (fields separated by ;), should be modified.
01;12May2018 8:20:36;192.168.1.111;78787;192.168.1.111;78787;80:25:0-49;2018-05-12_111111;RO
02;14May2018 2:43:16;192.168.1.132;78787;192.168.1.111;78787;36:10:0-10;2018-05-12_111111;RO
03;15May2018 7:40:01;192.168.131.1;78787;192.168.1.111;78787;18:39:0-44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.1.201;78787;192.168.1.111;78787;12:51:0-14;2018-05-12_111111;RO
Here is the output:
root#root> sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' temp_file
01;12May2018 08:20:36;192.168.01.111;78787;192.168.01.111;78787;80:25:49;2018-05-12_111111;RO
02;14May2018 02:43:16;192.168.01.132;78787;192.168.01.111;78787;36:10:10;2018-05-12_111111;RO
03;15May2018 07:40:01;192.168.131.01;78787;192.168.01.111;78787;18:39:44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.01.201;78787;192.168.01.111;78787;12:51:14;2018-05-12_111111;RO
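For the follow-up, one way to touch only the duration field is to rebuild just that field in awk; a sketch, assuming the duration is always field 7 and always has three :-separated parts:
awk -F';' -v OFS=';' '{
  split($7, t, ":")                                 # e.g. 80:25:0-49 -> 80, 25, 0-49
  for (i = 1; i <= 3; i++) sub(/^0-/, "", t[i])     # drop the stray 0- prefixes
  $7 = sprintf("%02d:%02d:%02d", t[1], t[2], t[3])  # zero-pad each part
  print
}' temp_file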

Delete 4 consecutive lines after a match in a file

I am in the process of deleting around 33k zones on a DNS server. I used this awk command to find the matching rows in my zones.conf file:
awk -v RS= -v ORS='\n\n' '/domain.com/' zones.conf
This gives me the output below, which is what I want:
zone "domain.com" {
type master;
file "/etc/bind/db/domain.com";
};
The problem I am facing now is deleting those 4 lines.
Is it possible to use sed or awk to perform this action?
EDIT:
I have decided that I want to run it in a while loop. list.txt contains the domains which I want to remove from the zones.conf file.
Every row is read into the variable '${line}' and used in the awk command (which was provided by "l'L'l").
The string was originally:
awk -v OFS='\n\n' '/domain.com/{n=4}; n {n--; next}; 1' < zones.conf > new.conf
I tried to modify it so it would accept a variable, but without result:
#!/bin/bash
while read line
do
awk -v OFS='\n\n' '/"'${line}'"/{n=4}; n {n--; next}; 1' zones.conf > new.conf
done<list.txt
Thanks in advance
This is quite easy with sed:
sed -i '/zone "domain.com"/,+4d' zones.conf
With a variable:
sed -i '/zone "'"$domain"'"/,+4d' zones.conf
Full working example:
#!/bin/bash
while read domain
do
sed -i '/zone "'"$domain"'"/,+4d' zones.conf
done<list.txt
You should be able to modify your existing awk command to remove a specified number of lines once the match is found, for example:
awk -v OFS='\n\n' '/domain.com/{n=4}; n {n--; next}; 1' < zones.conf > new.conf
This would remove 4 lines once domain.com is found (the matching zone line plus the 3 that follow), keeping the blank-line separation between the remaining zones.
Output:
zone "other.com" {
type master;
file "/etc/bind/db/other.com";
};
zone "foobar.com" {
type master;
file "/etc/bind/db/foobar.com";
};
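To plug the per-line variable from list.txt into that awk command (what the question's edit attempts), pass the domain with -v instead of splicing shell quotes into the script; note also that redirecting to new.conf on every iteration discards the previous pass, so this sketch writes the result back over zones.conf each time:
#!/bin/bash
while read -r line
do
  awk -v dom="$line" 'index($0, "zone \"" dom "\"") {n=4} n {n--; next} 1' zones.conf > new.conf \
    && mv new.conf zones.conf
done < list.txt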
My sed solution would be
sed '/zone "domain.com"/{:l1;/};\n$/!{N;bl1};d}' file > newfile
#But the above would be on the slower end if you're dealing with 33k zones
For in-place editing, use the -i option with sed as below:
sed -i.bak '/zone "domain.com"/{:l1;/};\n$/!{N;bl1};d}' file
#Above will create a backup of the original file with a '.bak' extension
For using variables
#!/bin/bash
while read domain #capitalized variables are usually reserved for the system
do
sed '/zone "'"${domain}"'"/{:l1;/};\n$/!{N;bl1};d}' file > newfile
# for inplace edit use below
# sed -i.bak '/zone "'"${domain}"'"/{:l1;/};\n$/!{N;bl1};d}' file
done<list.txt

Print last line of text file

I have a text file like this:
1.2.3.t
1.2.4.t
complete
I need to print the last non-blank line and the second-to-last non-blank line as two variables. The output should be:
a=1.2.4.t
b=complete
I tried this for last line:
b=$(awk '/./{line=$0} END{print line}' myfile)
but I have no idea for a.
grep . file | tail -n 2 | sed 's/^ *//;1s/^/a=/;2s/^/b=/'
Output:
a=1.2.4.t
b=complete
awk to the rescue!
$ awk 'NF{a=b;b=$0} END{print "a="a;print "b="b}' file
a=1.2.4.t
b=complete
Or, if you want the real variable assignment:
$ awk 'NF{a=b;b=$0} END{print a, b}' file | { read a b; echo "a=$a"; echo "b=$b"; }
a=1.2.4.t
b=complete
You may need the -r option for read if you have backslashes in the values.
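If the goal is to end up with real shell variables a and b, one option is process substitution (a sketch, bash-specific, which avoids the subshell a plain pipe into read would create):
read -r a b < <(awk 'NF{a=b; b=$0} END{print a, b}' myfile)
echo "a=$a"
echo "b=$b"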

How to extract FASTA sequence using sequence ID (shell script)

I have the following sequences, which are in FASTA format with a sequence header and its nucleotides.
How can I compare two files (Kcompare.pep and clade1i.txt) and extract the sequences with the same sequence header?
Can anyone help me?
Kcompare.pep
>ztr:MYCGRDRAFT_45998
MAAPLHAEGPIRTPYTGVELLNTPYLNKGTAFPADERRVLGLTALLPTSVHTLDQQLQRA
WHQYQSRDNDLARNTFLTSLKEQNEVLYYRLVLDHLSEVFSIIYTPTEGEAIQRYSSLFR
>kal:KALB_5042
MTAEVAVVSDGSAIPGASPPATLPLLQDYAELVREHAGLSAVPLAVDSARLAAELCALPK
RFRAVFLTHTDPERAFQVQRAVAKAGGPLVITDQDTTAISLTASTLTTLARRGRSPSDSR
clade1i.txt
cpo:COPRO5265_0583
ble:BleG1_3845
kal:KALB_5042
expected output
>kal:KALB_5042
MTAEVAVVSDGSAIPGASPPATLPLLQDYAELVREHAGLSAVPLAVDSARLAAELCALPK
RFRAVFLTHTDPERAFQVQRAVAKAGGPLVITDQDTTAISLTASTLTTLARRGRSPSDSR
I tried to run this, but neither an error nor a result appeared:
for i in K*
do
echo $i
awk -F ' ' '{print $1}' $i/$i.pep > Kcompare.pep
mv Kcompare.pep $i
awk -F '_' '{print $2":"$3"_"$4}' $i/firstClade.txt > $i/clade1i.txt
awk 'NR==1{printf $0"\t";next}{printf /^>/ ? "\n"$0"\t" : $0}' $i/Kcompare.pep | awk -F"\t" 'BEGIN{while((getline k <"$i/clade1i.txt")>0)i[k]=1}{gsub("^>","",$0);if(i[$1]){print ">"$1"\n"$2}}' > $i/firsti.pep
done
Using awk:
awk 'NR==FNR{a[">"$0];next}/^>/{f=0;}($0 in a)||f{print;f=1}' clade1i.txt Kcompare.pep
Read the clade1i.txt file and store its lines (prefixed with '>') in an array as keys.
Then read Kcompare.pep: on every line beginning with '>', clear the flag; if that header is one of the stored keys, print it, set the flag, and keep printing lines until the next '>' line is encountered.
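The same one-liner can be dropped into the question's per-directory loop; a sketch, assuming the $i/clade1i.txt and $i/Kcompare.pep layout from the question:
#!/bin/bash
for i in K*
do
  awk 'NR==FNR{a[">"$0];next}/^>/{f=0;}($0 in a)||f{print;f=1}' \
      "$i/clade1i.txt" "$i/Kcompare.pep" > "$i/firsti.pep"
done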
Use this:
while read l; do
sed -n '/^>'"$l"'/,/^>\|^$/p' Kcompare.pep
done <clade1i.txt
The while loop loops through the clade1i.txt file line by line.
sed -n suppresses automatic printing.
/regex/,/regex/ selects everything from the first regex (the wanted header) up to the second (the next header or a blank line).
p prints the selected lines.

Appending text from first line to last of a file through loop

I want to append text to the end of each line of an existing file. This text has to be appended from the first line of the file to the last line.
The thing is, I am passing the file contents as input to the loop, and the output of that has to be appended onto the same file. I could not figure out the logic for it.
FileName: sample
cat sample
Alex, Johston
Samuel, John
Vebron, Justus
Above are the contents of the file. Now I want to append the first column of values, i.e. Alex, Samuel and Vebron, to the end of each line with a comma.
My intended output:
Alex, Johston,Alex
Samuel, John,Samuel
Vebron, Justus,Vebron
The script I wrote to take the first column values:
while
read LINE
do
fcol=$(echo $LINE|awk -F, '{ print $1 }')
done < sample
Running through the above loop, the variable fcol will store the values Alex, Samuel and Vebron. I need to append these values to the end of each line.
Can someone guide me on this so that I can alter the above code to produce the intended output shown above?
Thanks!
awk -F, '{print $0", "$1}' sample
The loop is not required, as awk takes each line from the input file and processes it based on the command provided.
Here the command is print $0", "$1, which prints $0 and $1 with ", " between them.
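Running it on the sample gives (with the space dropped from the format string so the result matches the intended output exactly):
$ awk -F, '{print $0","$1}' sample
Alex, Johston,Alex
Samuel, John,Samuel
Vebron, Justus,Vebron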
You can use awk to do this:
cat sample | awk -F "," '{print $0", "$1}'
You can't do this without an intermediate file of some sort (or without storing the file's contents in memory before processing them). That said, things like sed -i will hide that detail from you.
awk -F, -v OFS=, '{print $0,$1}' sample > sample.new
sed -i.orig -e 's/^\([^,]*\)\(.*\)/&,\1/' sample
Initialize line_no=0 before looping through the file.
Increment line_no by 1 before processing each line.
The following sed command will append $fcol at the end of line $line_no:
sed -i "$line_no s/$/ ,$fcol/" sample
Final Script:
line_no=0
while read LINE
do
line_no=$((line_no+1))
fcol=$(echo $LINE|awk -F, '{ print $1 }')
sed -i "$line_no s/$/ ,$fcol/" sample
done < sample
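A variant that keeps the read loop but avoids calling sed on every line is this plain-bash sketch (writing to a hypothetical temporary file sample.new and then replacing the original):
while IFS= read -r LINE
do
  fcol=${LINE%%,*}                      # first column: everything before the first comma
  printf '%s,%s\n' "$LINE" "$fcol"
done < sample > sample.new && mv sample.new sample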
My solution is:
sed -r 's/([^,]*).*/&, \1/' sample
