sed delete all occurrences of pattern in each line - bash

The following command doesn't repeat the process for each occurrence in one line...
input_file.txt :
<!--:nl-->hond <span>bob</span><!--:fr-->chien <span>bob</span><!--:nl-->kat<!--:fr-->chat
<!--:nl-->hond<!--:fr-->chien<!--:nl-->kat<!--:fr-->chat
wrong sed command :
sed -e 's/\(\<\!--\:nl\--\>\).*\(\<\!--\:fr\--\>\)/\1\2/g' input_file.txt > output_file.txt
current output_file.txt result :
<!--:nl--><!--:fr-->chat
desired output_file.txt result :
chien <span>bob</span>chat
chienchat
[EDIT] hond, chien, kat and chat may have HTML tags around them that need to be kept...

You can use this sed:
sed 's/<!--:nl-->[^<]*<!--:fr-->//g' file
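Note that [^<]* stops at the first <, so with the edited sample (where the text to delete contains <span> tags) the first segment of line 1 is left in place. A minimal sketch of one workaround, assuming GNU sed (which accepts \n both in the replacement and inside a bracket expression): turn every <!--:fr--> into a newline, which cannot otherwise occur inside a line, then delete from each <!--:nl--> up to the nearest newline. The function name strip_nl_segments is made up for this example.

```shell
# Hypothetical helper: delete every <!--:nl-->...<!--:fr--> span on each line,
# even when the span itself contains HTML tags. Assumes GNU sed.
strip_nl_segments() {
  sed -e 's/<!--:fr-->/\n/g' \
      -e 's/<!--:nl-->[^\n]*\n//g'
}

printf '%s\n' \
  '<!--:nl-->hond <span>bob</span><!--:fr-->chien <span>bob</span><!--:nl-->kat<!--:fr-->chat' \
  '<!--:nl-->hond<!--:fr-->chien<!--:nl-->kat<!--:fr-->chat' |
  strip_nl_segments
```

If a line had a <!--:fr--> with no <!--:nl--> before it, the placeholder newline would stay in the output, so this only suits input shaped like the sample.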

The following awk command may also help:
awk -F'<!--:nl-->|<!--:fr-->' '{print $3$5}' Input_file
Explanation: this makes the strings <!--:nl--> and <!--:fr--> the field separators and prints the 3rd and 5th fields of each line. It matches the sample, which has exactly two nl/fr pairs per line; lines with a different number of pairs would need different field numbers.

Related

bash / sed : editing of the file

I use sed to remove all lines starting with "HETATM" from the input file, and cat to combine another file with the output received from sed:
sed -i '/^HETATM/ d' file1.pdb
cat file2.pdb file1.pdb > file3.pdb
Is there a way to do it in one line, e.g. using only sed?
If you want to consider awk then it can be done in a single command:
awk 'FNR == NR {print; next} !/^HETATM/' file2.pdb file1.pdb > file3.pdb
With a cat + grep combination, please try the following. cat concatenates its inputs when several files are passed to it; grep -v removes the lines starting with HETATM from file1.pdb, and process substitution feeds that filtered output to cat as if it were a second file. The result is written to file3.pdb.
cat file2.pdb <(grep -v '^HETATM' file1.pdb) > file3.pdb
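The same result can also be sketched with sed and a command group, which avoids editing file1.pdb in place. This is not sed alone (cat still supplies file2.pdb), and the sample file contents below are made up purely for illustration.

```shell
# Sample data standing in for the real PDB files (contents are made up).
workdir=$(mktemp -d)
printf '%s\n' 'ATOM      1  N   ALA' 'HETATM    2  O   HOH' 'ATOM      3  C   ALA' > "$workdir/file1.pdb"
printf '%s\n' 'HEADER from file2' > "$workdir/file2.pdb"

# file2.pdb first, then file1.pdb minus its HETATM lines; file1.pdb is not modified.
{ cat "$workdir/file2.pdb"; sed '/^HETATM/d' "$workdir/file1.pdb"; } > "$workdir/file3.pdb"

cat "$workdir/file3.pdb"
```

The braces group both commands so that a single redirection collects their combined output.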
I'm not sure what you mean by "remove all lines starting from 'HETATM'", but if you mean that any line that appears in the file after a line that starts with "HETATM" will not be outputted, then your sed expression won't do it - it will just remove all lines starting with the pattern while leaving all following lines that do not start with the pattern.
There are ways to get the effect I believe you wanted, possibly even with sed - but I don't know sed all that well. In perl I'd use the range operator with a guaranteed non-matching end expression (not sure what will be guaranteed for your input, I used "XXX" in this example):
perl -ne 'unless (/^HETATM/../XXX/) { print; }' file1.pdb
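If that second reading is the intended one (drop the first HETATM line and everything after it), sed can do it with an address range that ends at the last line, $. A minimal sketch; the function name drop_from_hetatm and the sample lines are made up for the example.

```shell
# Print lines up to, but not including, the first line that starts with HETATM.
# The range /^HETATM/,$ runs from the first match to the last line of input.
drop_from_hetatm() {
  sed '/^HETATM/,$d' "$@"
}

printf '%s\n' 'ATOM 1' 'ATOM 2' 'HETATM 3' 'ATOM 4' | drop_from_hetatm
```

Unlike the perl version, this needs no placeholder end pattern such as XXX.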
mawk '(FNR == NR) < NF' FS='^HETATM' f1 f2

sed extract part of string from a file

I've been trying to extract only part of a string from a file that looks like this:
str1=USER_NAME
str2=justAstring
str3=https://product.org/v-4.5-bin.zip
str4=USER_HOME
I need to extract ONLY the version - in this case: 4.5
I did it by grep and then sed but now the output is 4.5-bin.zip
-> grep str3 file.txt
str3=https://product.org/v-4.5-bin.zip
-> echo str3=https://product.org/v-4.5-bin.zip | sed -n "s/^.*v-\(\S*\)/\1/p"
4.5-bin.zip
What should I do in order to remove also the -bin.zip at the end?
Thanks.
1st solution: with your shown samples, please try the following sed code.
sed -n '/^str3=/s/.*-\([^-]*\)-.*/\1/p' Input_file
Explanation: sed's -n option suppresses the default printing, so only lines explicitly printed with p appear. The address /^str3=/ restricts the substitution to the line that starts with str3=. The substitution captures the text between two - characters (here 4.5) in a group, replaces the whole line with that group via \1, and the p flag prints the result.
2nd solution: with GNU grep you could try the following.
grep -oP '^str3=.*?-\K([^-]*)' Input_file
3rd solution: using awk to get the expected output for the shown samples.
awk -F'-' '/^str3=/{print $2}' Input_file
4th solution: using awk's match function together with the RSTART and RLENGTH variables, which are set when match finds a match.
awk 'match($0,/^str3=.*-/){split(substr($0,RSTART,RLENGTH),arr,"-");print arr[2]}' Input_file
If you know the version contains just digits and dots, replace \S by [0-9.]. Also, match the remaining characters outside of the capture group to get it removed.
sed -n 's/^.*v-\([0-9.]*\).*/\1/p'
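Putting those two points together with the /^str3=/ address from the first solution, so that no separate grep is needed, a small sketch run against the sample file might look like this. The function name extract_version is made up for the example.

```shell
# Hypothetical helper: print the version embedded in the str3= URL.
# [0-9.]* captures only digits and dots; the trailing .* consumes "-bin.zip"
# so that it is replaced away together with the rest of the line.
extract_version() {
  sed -n '/^str3=/s/^.*v-\([0-9.]*\).*/\1/p' "$@"
}

printf '%s\n' \
  'str1=USER_NAME' \
  'str2=justAstring' \
  'str3=https://product.org/v-4.5-bin.zip' \
  'str4=USER_HOME' |
  extract_version
```

With a real file, extract_version file.txt reads the file directly instead of standard input.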

Find multiple strings between values and replace with newline in bash

I need to write a bash script to list values from an sql database.
I've got so far but now I need to get the rest of the way.
The string so far is
10.255.200.0/24";i:1;s:15:"10.255.207.0/24";i:2;s:14:"192.168.0.0/21
I now need to delete everything between the speech marks and send it to a new line.
desired output:
10.255.200.0/24
10.255.207.0/24
192.168.0.0/21
any help would be greatly appreciated.
$ tr '"' '\n' <<< "$string" | awk 'NR%2'
10.255.200.0/24
10.255.207.0/24
192.168.0.0/21
You could use :
echo 'INPUT STRING HERE' | sed $'s/"[^"]*"/\\\n/g'
Explanation:
sed 's/<PATTERN1>/<PATTERN2>/g': substitute every occurrence of PATTERN1 with PATTERN2
[^"]*: any character that is not a ", any number of times
\\\n: inside bash's $'...' quoting this becomes a backslash followed by a literal newline, which is how sed writes a newline in the replacement
Assuming your Input_file looks like the sample shown, you could try the following.
awk '
{
  while(match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+/)){
    print substr($0,RSTART,RLENGTH)
    $0=substr($0,RSTART+RLENGTH)
  }
}' Input_file
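The same idea (pull out each address/prefix match rather than delete what lies between them) can be sketched with grep, assuming a grep that supports the -o option, as GNU and BSD grep do. The function name list_cidrs is made up for the example.

```shell
# Hypothetical helper: print each IPv4 address/prefix found in the input on its own line.
# -o prints only the matching parts, one per line; -E enables extended regex syntax.
list_cidrs() {
  grep -oE '[0-9]+(\.[0-9]+){3}/[0-9]+' "$@"
}

string='10.255.200.0/24";i:1;s:15:"10.255.207.0/24";i:2;s:14:"192.168.0.0/21'
printf '%s\n' "$string" | list_cidrs
```

The pattern only checks the shape of the text (four dot-separated numbers, a slash, a number); it does not validate that the numbers are in range.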
This might work for you (GNU sed):
sed 's/"[^"]*"/\n/g' file
Or using it alongside Bash:
sed $'s/"[^"]*"/\\\n/g' file
Or using most other seds:
sed ':a;/"[^"]*"\(.*\)\(.\|$\)/{G;s//\2\1/;ba}' file
This uses the fact that the hold space is empty unless something was stored in it, so G appends just a newline to the pattern space.

Delete first and last character from each line of a txt file

I need to delete the first and the last character from each line of a text file.
For example:
Input
$ cat file1.txt
|head1|head2|head3|
|1|2|3|
|2|3|4|
Output:
head1|head2|head3
1|2|3
2|3|4
Using sed:
sed 's/.$//; s/^.//' inputfile
A simple way to do it in one sed command:
sed -E 's/^.|.$//g' file
Match a character at the start or the end of the line and replace with nothing.
In basic mode, remember that the | needs to be escaped (\| is a GNU sed extension):
sed 's/^.\|.$//g' file
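For a sed without -E or \|, one substitution with a capture group does the same job using only basic regular expression syntax. A minimal sketch; the function name trim_ends is made up for the example.

```shell
# Strip the first and last character of each line with one basic-regex substitution.
# Lines shorter than two characters do not match and are printed unchanged.
trim_ends() {
  sed 's/^.\(.*\).$/\1/' "$@"
}

printf '%s\n' '|head1|head2|head3|' '|1|2|3|' '|2|3|4|' | trim_ends
```

The one behavioural difference from the commands above is on one-character lines: those commands delete the character, while this version leaves the line as it is.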
If awk is helpful:
awk '{print substr($0,2,length($0)-2)}' file
head1|head2|head3
1|2|3
2|3|4
You could also try the following; it removes a leading and a trailing | only:
awk '{gsub(/^\||\|$/,"");print}' Input_file

replace a string before the semi colon

I have several files, each of which begins like this:
unit,s_adj,partner,stk_flow,indic,geo\time;aaaa;2222;
time,s_adj,partner,stk_flow,lolo,geo\time;bbb;2222;
I want to replace everything before the first semicolon with the new value YEAR
The desired output would be:
YEAR;aaaa;2222;
YEAR;bbb;2222;
I tried with the following command line but it does not seem to do what I want
awk -F ";" 'NR==1 {$1=""; print "year"}' input_file
Your suggestions are welcomed.
Best.
try this:
sed 's/[^;]*/YEAR/' file
If you only want the substitution to happen on the first line:
sed '1s/[^;]*/YEAR/' file
You can also do:
awk '{$1="YEAR"}1' OFS=\; FS=\; input-file
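Since the question mentions several files, note that when sed is given several files at once it numbers lines across all of them, so the address 1 only matches the first line of the first file. A minimal sketch that handles each file separately; the file names, the extra data rows and the .new suffix are made up for the example.

```shell
# Made-up sample files standing in for the real inputs.
workdir=$(mktemp -d)
printf '%s\n' 'unit,s_adj,partner,stk_flow,indic,geo\time;aaaa;2222;' 'row;1;2;' > "$workdir/a.csv"
printf '%s\n' 'time,s_adj,partner,stk_flow,lolo,geo\time;bbb;2222;' 'row;3;4;' > "$workdir/b.csv"

# Rewrite the first field of the header line of each file, one file at a time,
# writing the result next to the original with a .new suffix.
for f in "$workdir"/*.csv; do
  sed '1s/[^;]*/YEAR/' "$f" > "$f.new"
done

cat "$workdir"/*.new
```

Writing to a separate file keeps the originals intact until the output has been checked.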

Resources