bash: cat + grep to produce several replicas merging two files - bash

Using the Linux bash command line, I need to merge two files by inserting a copy of file 1 into each specified part of file 2. File 1 looks like:
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
File 2 is a multi-block file whose parts begin with MODEL 1, MODEL 2, ... MODEL N and are each terminated by ENDMDL:
MODEL 1
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
MODEL 2
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
MODEL 3
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
I need to copy the whole content of file 1 into file 2 just before each ENDMDL separator, so that file 2 ends up with one copy of file 1 per block. Here is an example of the expected output:
MODEL 1
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
MODEL 2
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
MODEL 3
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
I have tried cat, but it just concatenates the two files without replicating the first one:
cat file1.pdb file2.pdb > together.pdb
Do I need to pipe this to some grep expression in order to insert file 1 before each ENDMDL in file 2?

Here is an awk solution that doesn't call unsafe system or getline:
awk 'NR==FNR {s = s $0 ORS; next} $0 == "ENDMDL" {$0 = s $0} 1' file1 file2
If the file names are held in shell variables, use:
awk 'NR==FNR {s = s $0 ORS; next}
$0 == "ENDMDL" {$0 = s $0} 1' "$file1" "$file2"

Use awk.
awk '/^ENDMDL$/ {system("cat file1.pdb");}; {print}' file2.pdb
Each line from file2 is written to standard output, but when the line matches ENDMDL, the entire contents of file1 are output first.
Some alternatives:
Replace /^ENDMDL$/ with $0 == "ENDMDL"
Replace {print} with 1. (A rule with no explicit pattern runs its action on every line. A rule with no explicit action prints the current line when its pattern is true, and 1 is always true.)
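Applying both alternatives gives a shorter command. The following is a minimal sketch, assuming file1.pdb sits in the current directory; the tiny input files written to a temporary directory are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical demo data in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'ATOM 1\nATOM 2\n' > file1.pdb
printf 'MODEL 1\nREMARK x\nENDMDL\nMODEL 2\nREMARK y\nENDMDL\n' > file2.pdb

# Exact string comparison instead of a regex; the trailing 1 prints each line.
result=$(awk '$0 == "ENDMDL" { system("cat file1.pdb") } 1' file2.pdb)
printf '%s\n' "$result"

cd / && rm -rf "$tmp"
```

The string comparison matches only lines that are exactly ENDMDL, which is the same condition the anchored regex expresses.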

Here's a straight-forward awk solution:
awk '
BEGIN {
FS = RS = "\a"
getline contents < ARGV[2]
close(ARGV[2])
ARGV[2] = ""
RS = "\n"
}
/^ENDMDL$/ { printf "%s", contents }
{ print }
' file1 file2
The script slurps the file content (the one to be inserted) into a variable then prints it each time ENDMDL appears. I'm using the BELL character as FS and RS because you won't encounter it in a PDB file.

Related

Removing lines depending upon keyword occurrence

I have about 7,000 files (sade1.pdbqt ... sade7200.pdbqt). Only some of the files contain a second (or further) occurrence of the keyword TORSDOF. For any file that has a second occurrence, I want to remove all lines following the first occurrence, while preserving the file names. Can somebody please provide a sample snippet? Thank you.
$ cat FileWith2ndOccurance.txt
ashu
vishu
jyoti
TORSDOF
Jatin
Vishal
Shivani
TORSDOF
Sushil
Kiran
After the function has run:
$ cat FileWith2ndOccurance.txt
ashu
vishu
jyoti
TORSDOF
EDIT1: Copy of an actual file:
REMARK Name = 17-DMAG.cdx
REMARK 8 active torsions:
REMARK status: ('A' for Active; 'I' for Inactive)
REMARK 1 A between atoms: C_1 and N_8
REMARK 2 A between atoms: N_8 and C_9
REMARK 3 A between atoms: C_9 and C_10
REMARK 4 A between atoms: C_10 and N_11
REMARK 5 A between atoms: C_15 and O_17
REMARK 6 A between atoms: C_25 and O_28
REMARK 7 A between atoms: C_27 and O_33
REMARK 8 A between atoms: O_28 and C_29
REMARK x y z vdW Elec q Type
REMARK _______ _______ _______ _____ _____ ______ ____
ROOT
ATOM 1 C UNL 1 7.579 11.905 0.000 0.00 0.00 +0.000 C
ATOM 2 C UNL 1 7.579 10.500 0.000 0.00 0.00 +0.000 C
ATOM 30 O UNL 1 8.796 8.398 0.000 0.00 0.00 +0.000 OA
ENDROOT
BRANCH 21 31
ATOM 31 O UNL 1 13.701 7.068 0.000 0.00 0.00 +0.000 OA
ATOM 32 C UNL 1 12.306 6.953 0.000 0.00 0.00 +0.000 C
ENDBRANCH 41 42
ENDBRANCH 19 41
TORSDOF 8
REMARK Name = 17-DMAG.cdx
REMARK 8 active torsions:
REMARK status: ('A' for Active; 'I' for Inactive)
REMARK 1 A between atoms: C_1 and N_8
REMARK 2 A between atoms: N_8 and C_9
REMARK x y z vdW Elec q Type
REMARK _______ _______ _______ _____ _____ ______ ____
ROOT
ATOM 1 CL UNL 1 0.000 11.656 0.000 0.00 0.00 +0.000 Cl
ENDROOT
TORSDOF 0
What I would do:
#!/bin/bash
for file in sade*.pdbqt; do
count=$(grep -c '^TORSDOF' "$file")
if ((count>1)); then
awk '/^TORSDOF/{print;exit}1' "$file" > /tmp/.torsdof &&
mv /tmp/.torsdof "$file"
fi
done

How do I return a varying number as a variable in a string found in another file that otherwise stays constant (BASH)?

I have a file that contains text like this (only a portion of it here) and want to find the ATOM # associated with the O5' line (in this case "2"). I would then like to store this number as a variable for future use. Note that the data below is stored in another file titled "xyz.file" for example. The number of spaces between "ATOM" and the column the number of interest is found in may vary as the number of interest's value changes.
ATOM 1 HO5' G5 1 7.415 -9.123 -8.109 1.00 0.00
ATOM 2 O5' G5 1 7.997 -8.960 -8.863 1.00 0.00
ATOM 3 C5' G5 1 9.136 -9.784 -8.729 1.00 0.00
ATOM 4 H5' G5 1 9.679 -9.808 -9.673 1.00 0.00
ATOM 5 H5'' G5 1 8.814 -10.797 -8.484 1.00 0.00
ATOM 6 C4' G5 1 10.067 -9.272 -7.628 1.00 0.00
ATOM 7 H4' G5 1 10.847 -10.015 -7.448 1.00 0.00
ATOM 8 O4' G5 1 10.700 -8.053 -7.990 1.00 0.00
ATOM 9 C1' G5 1 10.866 -7.262 -6.821 1.00 0.00
ATOM 10 H1' G5 1 11.907 -6.970 -6.696 1.00 0.00
ATOM 11 N9 G5 1 10.027 -6.048 -6.896 1.00 0.00
An awk one-liner:
n=$(awk '$3 == "O5'\''" {print $2; exit}' file)
echo "$n"
prints
2
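The '\'' sequence is only there to get a single quote into the single-quoted awk program. One way to avoid it, shown here as a sketch rather than as part of the original answer, is to pass the atom name in with -v. The variable name atom and the temporary sample file are illustrative choices.

```shell
#!/bin/bash
# Three sample lines in the same layout as xyz.file (copied for the demo).
sample=$(mktemp)
cat > "$sample" <<'EOF'
ATOM 1 HO5' G5 1 7.415 -9.123 -8.109 1.00 0.00
ATOM 2 O5' G5 1 7.997 -8.960 -8.863 1.00 0.00
ATOM 3 C5' G5 1 9.136 -9.784 -8.729 1.00 0.00
EOF

# -v hands the name to awk as a variable, so no quote escaping is needed
# inside the program; exit stops reading after the first match.
n=$(awk -v atom="O5'" '$3 == atom { print $2; exit }' "$sample")
echo "$n"
rm -f "$sample"
```

Because $3 is compared as a whole field, the HO5' line does not match even though it contains O5'.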

How to replace multiple columns with others using bash?

I have a text file that contains data arranged in columns, and I need to replace some columns with others, specifically the xyz coordinates. What I'm looking for is described in the image below (replace the red rectangle number 1 with the green rectangle number 2).
HETATM 1 C LIG 1 -0.517 1.592 -0.048 1.00 0.00 0.212 A
HETATM 2 C LIG 1 0.017 -0.536 0.534 1.00 0.00 0.149 A
HETATM 3 C LIG 1 1.133 0.155 0.029 1.00 0.00 0.212 A
HETATM 4 N LIG 1 -1.027 0.379 0.499 1.00 0.00 -0.337 N
HETATM 5 N LIG 1 0.789 1.466 -0.324 1.00 0.00 -0.219 NA
HETATM 6 C LIG 1 -2.429 0.112 0.889 1.00 0.00 0.221 C
HETATM 7 C LIG 1 -3.179 -0.453 -0.210 1.00 0.00 -0.097 C
HETATM 8 C LIG 1 -3.805 -0.925 -1.124 1.00 0.00 0.014 C
HETATM 9 N LIG 1 2.482 -0.388 -0.118 1.00 0.00 -0.095 N
HETATM 10 O LIG 1 2.619 -1.549 0.253 1.00 0.00 -0.530 OA
HETATM 11 O LIG 1 3.362 0.305 -0.578 1.00 0.00 -0.530 OA
ATOM 1 C LIG 1 -13.469 13.704 72.248 -0.37 -0.04 +0.212 75.145
ATOM 2 C LIG 1 -14.243 15.824 72.493 -0.41 -0.03 +0.149 75.145
ATOM 3 C LIG 1 -15.124 15.039 71.727 -0.40 -0.04 +0.212 75.145
ATOM 4 N LIG 1 -13.200 14.974 72.836 -0.28 +0.06 -0.337 75.145
ATOM 5 N LIG 1 -14.635 13.735 71.586 -0.32 +0.05 -0.219 75.145
ATOM 6 C LIG 1 -11.994 15.348 73.608 -0.46 -0.02 +0.221 75.145
ATOM 7 C LIG 1 -12.341 15.781 74.943 -0.66 +0.01 -0.097 75.145
ATOM 8 C LIG 1 -12.628 16.141 76.055 -0.66 -0.00 +0.014 75.145
ATOM 9 N LIG 1 -16.387 15.490 71.145 -0.60 +0.01 -0.095 75.145
ATOM 10 O LIG 1 -17.127 14.595 70.751 -0.10 +0.02 -0.530 75.145
ATOM 11 O LIG 1 -16.631 16.674 71.082 -0.58 -0.08 -0.530 75.145
Assuming that the files have the same number of lines, you could merge them with paste and then extract the columns in the desired order:
paste file1.txt file2.txt|awk '{print $1, $2, $3, $4, $5, $18, $19, $20, $9, $10, $11, $12}'
It's not clear if you are trying to align the rows contextually, but if you are literally just wanting to replace columns 6, 7, and 8 with the columns from the same row in the other file, you can just do something like:
$ cat file1
HETATM 1 C LIG 1 -0.517 1.592 -0.048 1.00 0.00 0.212 A
HETATM 2 C LIG 1 0.017 -0.536 0.534 1.00 0.00 0.149 A
HETATM 3 C LIG 1 1.133 0.155 0.029 1.00 0.00 0.212 A
HETATM 4 N LIG 1 -1.027 0.379 0.499 1.00 0.00 -0.337 N
HETATM 5 N LIG 1 0.789 1.466 -0.324 1.00 0.00 -0.219 NA
HETATM 6 C LIG 1 -2.429 0.112 0.889 1.00 0.00 0.221 C
HETATM 7 C LIG 1 -3.179 -0.453 -0.210 1.00 0.00 -0.097 C
HETATM 8 C LIG 1 -3.805 -0.925 -1.124 1.00 0.00 0.014 C
HETATM 9 N LIG 1 2.482 -0.388 -0.118 1.00 0.00 -0.095 N
HETATM 10 O LIG 1 2.619 -1.549 0.253 1.00 0.00 -0.530 OA
HETATM 11 O LIG 1 3.362 0.305 -0.578 1.00 0.00 -0.530 OA
$ cat file2
ATOM 1 C LIG 1 -13.469 13.704 72.248 -0.37 -0.04 +0.212 75.145
ATOM 2 C LIG 1 -14.243 15.824 72.493 -0.41 -0.03 +0.149 75.145
ATOM 3 C LIG 1 -15.124 15.039 71.727 -0.40 -0.04 +0.212 75.145
ATOM 4 N LIG 1 -13.200 14.974 72.836 -0.28 +0.06 -0.337 75.145
ATOM 5 N LIG 1 -14.635 13.735 71.586 -0.32 +0.05 -0.219 75.145
ATOM 6 C LIG 1 -11.994 15.348 73.608 -0.46 -0.02 +0.221 75.145
ATOM 7 C LIG 1 -12.341 15.781 74.943 -0.66 +0.01 -0.097 75.145
ATOM 8 C LIG 1 -12.628 16.141 76.055 -0.66 -0.00 +0.014 75.145
ATOM 9 N LIG 1 -16.387 15.490 71.145 -0.60 +0.01 -0.095 75.145
ATOM 10 O LIG 1 -17.127 14.595 70.751 -0.10 +0.02 -0.530 75.145
ATOM 11 O LIG 1 -16.631 16.674 71.082 -0.58 -0.08 -0.530 75.145
$ awk '{getline s < "file2"; split(s, a); $6 = a[6]; $7 = a[7]; $8 = a[8]}1' file1
HETATM 1 C LIG 1 -13.469 13.704 72.248 1.00 0.00 0.212 A
HETATM 2 C LIG 1 -14.243 15.824 72.493 1.00 0.00 0.149 A
HETATM 3 C LIG 1 -15.124 15.039 71.727 1.00 0.00 0.212 A
HETATM 4 N LIG 1 -13.200 14.974 72.836 1.00 0.00 -0.337 N
HETATM 5 N LIG 1 -14.635 13.735 71.586 1.00 0.00 -0.219 NA
HETATM 6 C LIG 1 -11.994 15.348 73.608 1.00 0.00 0.221 C
HETATM 7 C LIG 1 -12.341 15.781 74.943 1.00 0.00 -0.097 C
HETATM 8 C LIG 1 -12.628 16.141 76.055 1.00 0.00 0.014 C
HETATM 9 N LIG 1 -16.387 15.490 71.145 1.00 0.00 -0.095 N
HETATM 10 O LIG 1 -17.127 14.595 70.751 1.00 0.00 -0.530 OA
HETATM 11 O LIG 1 -16.631 16.674 71.082 1.00 0.00 -0.530 OA
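The getline call above silently reuses the previous line if file2 runs out of lines first. A hedged variant that stops with an error in that case could look like the following; the awk variable name other and the two-line sample files are placeholders for the demonstration.

```shell
#!/bin/bash
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
HETATM 1 C LIG 1 -0.517 1.592 -0.048 1.00 0.00 0.212 A
HETATM 2 C LIG 1 0.017 -0.536 0.534 1.00 0.00 0.149 A
EOF
cat > "$tmp/file2" <<'EOF'
ATOM 1 C LIG 1 -13.469 13.704 72.248 -0.37 -0.04 +0.212 75.145
ATOM 2 C LIG 1 -14.243 15.824 72.493 -0.41 -0.03 +0.149 75.145
EOF

out=$(awk -v other="$tmp/file2" '
    {
        # getline returns 1 on success, 0 at end of file, -1 on error.
        if ((getline s < other) <= 0) {
            print "no matching line in " other " for line " NR > "/dev/stderr"
            exit 1
        }
        split(s, a)
        $6 = a[6]; $7 = a[7]; $8 = a[8]
    }
    1
' "$tmp/file1")
printf '%s\n' "$out"
rm -rf "$tmp"
```

As in the original command, assigning to a field makes awk rebuild the line with single spaces between fields.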

delete rows after specific character | awk

I am writing a Bash script, and I need to remove all lines between the TER lines, including the TER lines themselves.
Input File :
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
TER
ATOM 1 HO5' A 1 3.429 -7.861 3.641 1.00 0.00 H
ATOM 2 O5' A 1 4.232 -7.360 3.480 1.00 0.00 O
ATOM 3 C5' A 1 5.480 -8.064 3.350 1.00 0.00 C
ATOM 4 H5' A 1 5.429 -8.766 2.518 1.00 0.00 H
TER
Expected output:
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
I found
sed '/TER/,$d' ${myArray[j]}.txt >> ${MyArray[j]}.txt ### ${MyArray[j]} file name through an array
But this does not work. I think awk would work within a Bash script. Any help is appreciated, thanks.
You can just use sed like this:
sed -i.bak '/^TER/,/^TER/d' "${myArray[j]}.txt"
cat "${myArray[j]}.txt"
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
sed '/TER/,/TER/d'
echo "ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
TER
ATOM 1 HO5' A 1 3.429 -7.861 3.641 1.00 0.00 H
ATOM 2 O5' A 1 4.232 -7.360 3.480 1.00 0.00 O
ATOM 3 C5' A 1 5.480 -8.064 3.350 1.00 0.00 C
ATOM 4 H5' A 1 5.429 -8.766 2.518 1.00 0.00 H
TER" |sed '/TER/,/TER/d'
######################################################################################
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
sed '/Start Pattern/,/End Pattern/d'
It can be done like this:
sed '/TER/,$d' "${myArray[j]}.txt" > tmp.txt   # note: only one ">"
mv tmp.txt "${myArray[j]}.txt"
awk also provides a simple solution using a flag to control printing. Below, the skip variable is used as a flag: while it is 1 the lines are skipped, and on the transition from 1 back to 0 the script exits.
awk '$1=="TER"{skip=!skip; if (!skip) exit; next} !skip' file
Above, $1=="TER" matches lines (records) whose first field is TER (this disambiguates between "TER" and "TERMINAL", etc.). Within the rule, skip=!skip sets skip to 1 the first time "TER" is encountered and back to 0 on the next, at which point the script exits; next keeps the TER line itself from being printed. The !skip pattern at the end prints every line read while skip is 0.
Example Use/Output
Using your data in file, you would get:
$ awk '$1=="TER"{skip=!skip; if (!skip) exit; next} !skip' file
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
ATOM 187 HO3' U 6 7.342 5.410 -5.865 1.00 0.00 H
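The command above stops reading once it exits, so anything after the TER block is dropped. If the file can contain further records after the closing TER, or several TER ... TER blocks, a toggling flag that keeps reading may be more useful. This is a sketch under that assumption, not something the question asked for; the last line of the sample input is made up.

```shell
#!/bin/bash
input=$(mktemp)
cat > "$input" <<'EOF'
ATOM 186 O3' U 6 7.297 6.145 -5.250 1.00 0.00 O
TER
ATOM 1 HO5' A 1 3.429 -7.861 3.641 1.00 0.00 H
TER
ATOM 5 XX A 1 0.000 0.000 0.000 1.00 0.00 C
EOF

# Each TER line flips the flag and is itself dropped (next);
# all other lines are printed only while the flag is 0.
kept=$(awk '$1 == "TER" { skip = !skip; next } !skip' "$input")
printf '%s\n' "$kept"
rm -f "$input"
```

With an odd number of TER lines, everything after the last one is treated as still inside a block and is removed.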

How to replace the letter 'L' with the letter 'H' in a specific column for a specific range of rows?

I want to replace the letter "L" with the letter "H" in column 5, starting from row 2334 through row 2343. How can I do this?
ATOM 2328 C PRO H 216 2.775 27.948 31.304 1.00 54.68 C
ANISOU 2328 C PRO H 216 6662 6876 7238 231 -273 -901 C
ATOM 2329 O PRO H 216 3.081 27.188 32.221 1.00 33.86 O
ANISOU 2329 O PRO H 216 4076 4302 4486 297 -305 -920 O
ATOM 2330 CB PRO H 216 0.348 28.666 31.322 1.00 32.21 C
ANISOU 2330 CB PRO H 216 3856 4070 4311 245 -165 -866 C
ATOM 2331 CG PRO H 216 -0.233 27.810 32.376 1.00 35.76 C
ANISOU 2331 CG PRO H 216 4380 4616 4590 319 -134 -850 C
ATOM 2332 CD PRO H 216 -0.205 26.395 31.831 1.00 29.01 C
ANISOU 2332 CD PRO H 216 3545 3784 3691 274 -64 -735 C
TER 2333 PRO H 216
ATOM 2334 N ASP L 1 12.679 9.090 -25.911 1.00 24.97 N
ANISOU 2334 N ASP L 1 3340 2560 3588 66 89 -196 N
ATOM 2335 CA ASP L 1 11.386 9.008 -25.214 1.00 22.13 C
ANISOU 2335 CA ASP L 1 3001 2290 3117 87 13 -178 C
ATOM 2336 C ASP L 1 10.586 10.270 -25.405 1.00 24.75 C
ANISOU 2336 C ASP L 1 3332 2595 3476 107 45 -149 C
ATOM 2337 O ASP L 1 11.150 11.366 -25.533 1.00 25.78 O
ANISOU 2337 O ASP L 1 3423 2631 3741 109 104 -176 O
ATOM 2338 CB ASP L 1 11.594 8.775 -23.699 1.00 23.18 C
ANISOU 2338 CB ASP L 1 3081 2475 3250 109 -77 -274 C
ATOM 2339 CG ASP L 1 12.293 7.471 -23.340 1.00 26.65 C
ANISOU 2339 CG ASP L 1 3535 2947 3645 108 -126 -303 C
ATOM 2340 OD1 ASP L 1 12.541 6.650 -24.258 1.00 24.18 O
ANISOU 2340 OD1 ASP L 1 3268 2622 3300 81 -97 -250 O
ATOM 2341 OD2 ASP L 1 12.537 7.243 -22.126 1.00 26.04 O
ANISOU 2341 OD2 ASP L 1 3432 2911 3553 145 -200 -379 O
ATOM 2342 N ILE L 2 9.260 10.129 -25.359 1.00 19.52 N
ANISOU 2342 N ILE L 2 2706 2012 2698 123 3 -103 N
ATOM 2343 CA ILE L 2 8.371 11.280 -25.505 1.00 19.22 C
ANISOU 2343 CA ILE L 2 2671 1960 2672 154 16 -82 C
.
.
.
.
HETATM 4661 O HOH L2236 8.200 18.486 2.750 1.00 58.70 O
HETATM 4662 O HOH L2237 2.087 16.407 1.748 1.00 45.02 O
HETATM 4663 O HOH L2238 1.933 41.087 7.631 1.00 31.01 O
HETATM 4664 O HOH L2239 4.744 42.515 11.051 1.00 60.18 O
HETATM 4665 O HOH L2240 2.258 41.306 12.333 1.00 45.78 0
Expected output:
ATOM 2328 C PRO H 216 2.775 27.948 31.304 1.00 54.68 C
ANISOU 2328 C PRO H 216 6662 6876 7238 231 -273 -901 C
ATOM 2329 O PRO H 216 3.081 27.188 32.221 1.00 33.86 O
ANISOU 2329 O PRO H 216 4076 4302 4486 297 -305 -920 O
ATOM 2330 CB PRO H 216 0.348 28.666 31.322 1.00 32.21 C
ANISOU 2330 CB PRO H 216 3856 4070 4311 245 -165 -866 C
ATOM 2331 CG PRO H 216 -0.233 27.810 32.376 1.00 35.76 C
ANISOU 2331 CG PRO H 216 4380 4616 4590 319 -134 -850 C
ATOM 2332 CD PRO H 216 -0.205 26.395 31.831 1.00 29.01 C
ANISOU 2332 CD PRO H 216 3545 3784 3691 274 -64 -735 C
TER 2333 PRO H 216
ATOM 2334 N ASP H 1 12.679 9.090 -25.911 1.00 24.97 N
ANISOU 2334 N ASP H 1 3340 2560 3588 66 89 -196 N
ATOM 2335 CA ASP H 1 11.386 9.008 -25.214 1.00 22.13 C
ANISOU 2335 CA ASP H 1 3001 2290 3117 87 13 -178 C
ATOM 2336 C ASP H 1 10.586 10.270 -25.405 1.00 24.75 C
ANISOU 2336 C ASP H 1 3332 2595 3476 107 45 -149 C
ATOM 2337 O ASP H 1 11.150 11.366 -25.533 1.00 25.78 O
ANISOU 2337 O ASP H 1 3423 2631 3741 109 104 -176 O
ATOM 2338 CB ASP H 1 11.594 8.775 -23.699 1.00 23.18 C
ANISOU 2338 CB ASP H 1 3081 2475 3250 109 -77 -274 C
ATOM 2339 CG ASP H 1 12.293 7.471 -23.340 1.00 26.65 C
ANISOU 2339 CG ASP H 1 3535 2947 3645 108 -126 -303 C
ATOM 2340 OD1 ASP H 1 12.541 6.650 -24.258 1.00 24.18 O
ANISOU 2340 OD1 ASP H 1 3268 2622 3300 81 -97 -250 O
ATOM 2341 OD2 ASP H 1 12.537 7.243 -22.126 1.00 26.04 O
ANISOU 2341 OD2 ASP H 1 3432 2911 3553 145 -200 -379 O
ATOM 2342 N ILE H 2 9.260 10.129 -25.359 1.00 19.52 N
ANISOU 2342 N ILE H 2 2706 2012 2698 123 3 -103 N
ATOM 2343 CA ILE H 2 8.371 11.280 -25.505 1.00 19.22 C
ANISOU 2343 CA ILE H 2 2671 1960 2672 154 16 -82 C
.
.
.
.
HETATM 4661 O HOH L2236 8.200 18.486 2.750 1.00 58.70 O
HETATM 4662 O HOH L2237 2.087 16.407 1.748 1.00 45.02 O
HETATM 4663 O HOH L2238 1.933 41.087 7.631 1.00 31.01 O
HETATM 4664 O HOH L2239 4.744 42.515 11.051 1.00 60.18 O
HETATM 4665 O HOH L2240 2.258 41.306 12.333 1.00 45.78 0
This works for your input sample:
sed '/2334/,/ANISOU 2343/s/ L / H /'
Note that it doesn't check the column number, you might need to tweak the expression to do that.
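One way to check the column is sketched below. It assumes the real file follows the fixed-width PDB layout (atom serial number in columns 7-11, chain identifier in column 22) rather than the space-collapsed form shown in the question. The demo lines are generated with printf so that the columns line up; coordinates are omitted and the serial 2400 line is made up.

```shell
#!/bin/bash
pdb=$(mktemp)
# record(6) serial(5) space name(4) altLoc(1) resName(3) space chain(1) resSeq(4)
fmt='%-6s%5d %-4s%1s%3s %1s%4d\n'
{
    printf "$fmt" ATOM   2332 ' CD' '' PRO H 216
    printf "$fmt" ATOM   2334 ' N'  '' ASP L 1
    printf "$fmt" ANISOU 2334 ' N'  '' ASP L 1
    printf "$fmt" ATOM   2400 ' N'  '' GLY L 20
    printf "$fmt" HETATM 4661 ' O'  '' HOH L 2236
} > "$pdb"

fixed=$(awk '
    /^(ATOM|ANISOU)/ {
        # Read the serial number from its fixed columns and force it numeric.
        serial = substr($0, 7, 5) + 0
        # Overwrite only column 22, leaving every other column in place.
        if (serial >= 2334 && serial <= 2343 && substr($0, 22, 1) == "L")
            $0 = substr($0, 1, 21) "H" substr($0, 23)
    }
    1
' "$pdb")
printf '%s\n' "$fixed"
rm -f "$pdb"
```

Rewriting the line with substr rather than assigning to $5 keeps the original spacing, which matters for tools that read PDB files by column position.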
Using sed:
sed '2334,2343s/\(\([^ ]* \)\{4\}\)L/\1H/' input
You can use sed:
sed '2334,2343s/\([^ ]* \+\)L/\1H/' input.txt
You can use awk for this:
awk 'NR>=2334 && NR<=2343 {gsub(/L/, "H", $5)} 1' file
Try this awk command,
awk '$2>=2334 && $2<=2343 {gsub(/L/,"H",$5)}1' file
