How can I repeat different lines a different number of times, using a file with the specification? (I think an example will make this clearer.)
Example (file whose lines I want repeated):
ID01 rs01 AB
ID02 rs01 BA
OA03 rs01 AA
EA04 rs01 BB
Example (file specifying how many times each line must be repeated, keyed by the identifier in the first column):
ID01 1
ID02 5
OA03 2
EA04 3
And I want the output file:
ID01 rs01 AB
ID02 rs01 BA
ID02 rs01 BA
ID02 rs01 BA
ID02 rs01 BA
ID02 rs01 BA
OA03 rs01 AA
OA03 rs01 AA
EA04 rs01 BB
EA04 rs01 BB
EA04 rs01 BB
But in my case, my real data is big. Thank you.
The following Python script will do the job:
import sys

default_repeats = 1

# Map each identifier (first column) to its repeat count.
with open(sys.argv[2]) as repeats_file:
    repeats = {i: int(n) for i, n in (l.split()[:2] for l in repeats_file)}

# Stream the data file line by line, so large inputs are not loaded at once.
with open(sys.argv[1]) as data_file:
    for line in data_file:
        identifier = line.split(' ')[0]
        sys.stdout.write(line * repeats.get(identifier, default_repeats))
It accepts two arguments:
$ python script_file.py <file_with_data> <file_with_repetitions>
I need to lock my NTAG215 after ten incorrect password attempts.
First, I protect pages 04 to 81:
RAW COMMAND: A2 83 04 00 00 04
Then, to enable brute-force protection:
RAW COMMAND: A2 84 82 00 00 00
HEX 82 = BIN 10000010
PROT = 1
CFGLCK = 0
RFUI = 0
NFC_CNT_EN = 0
NFC_CNT_PWD_PROT = 0
AUTHLIM = 010 ( 10 attempts )
But after 3 incorrect attempts my NTAG215 seems dead.
NTAG213-216 data sheet:
https://www.nxp.com/docs/en/data-sheet/NTAG213_215_216.pdf
Am I doing something incorrectly?
Thanks for help.
#SOLVED ( Thanks to #nanofarad )
AUTHLIM is a three-bit number: 010 is not ten attempts, it is binary 10 (meaning 2 attempts), so presumably the tag locks itself on the third.
You cannot set 10 attempts using AUTHLIM, because AUTHLIM can only represent numbers from zero (binary 000) to seven (binary 111).
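To double-check a value before writing it to the tag, the byte can be decoded on the host first. This is only a sketch: the bit layout is the one listed in the question (PROT in the top bit, AUTHLIM in the lowest three bits), and the function name `decode_access` is made up for illustration.

```python
def decode_access(access):
    """Split the ACCESS byte into the fields listed in the question.

    Assumed layout (from the breakdown above): bit 7 PROT, bit 6 CFGLCK,
    bit 5 RFUI, bit 4 NFC_CNT_EN, bit 3 NFC_CNT_PWD_PROT, bits 2..0 AUTHLIM.
    """
    return {
        "PROT": (access >> 7) & 1,
        "CFGLCK": (access >> 6) & 1,
        "RFUI": (access >> 5) & 1,
        "NFC_CNT_EN": (access >> 4) & 1,
        "NFC_CNT_PWD_PROT": (access >> 3) & 1,
        "AUTHLIM": access & 0b111,  # three bits, so 0..7 only
    }


fields = decode_access(0x82)
print(fields["AUTHLIM"])  # 2, not 10
```

With this layout, the largest limit that fits is `0x87`, which decodes to AUTHLIM = 7.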
I need to split the last column into two separate columns and delete part of it.
Currently all the values in the last column have 6 digits. I need to split them into two separate columns.
The first column should have the first three digits and the second column should have the next three digits.
I ultimately want to delete the newly created second column.
Data -
ID c1 c2 c3 c4 c5
12 A XY 123 456 657098
The new file should be created as below -
Data 2
ID c1 c2 c3 c4 c5
12 A XY 123 456 657
Thanks
You can use this awk that checks length of last column for each row:
awk 'length($NF) == 6 { $NF = substr($NF, 1, 3) } 1' file
Data -
ID c1 c2 c3 c4 c5
12 A XY 123 456 657
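If awk is not available, the same idea can be sketched in Python. The function name `truncate_last_field` and its parameters are made up for illustration. Unlike the awk one-liner, this version re-joins every line with single spaces, including lines it does not change.

```python
def truncate_last_field(line, full_len=6, keep=3):
    """Keep only the first `keep` characters of the last field
    when that field is exactly `full_len` characters long."""
    fields = line.split()
    if fields and len(fields[-1]) == full_len:
        fields[-1] = fields[-1][:keep]
    return " ".join(fields)


for row in ["ID c1 c2 c3 c4 c5", "12 A XY 123 456 657098"]:
    print(truncate_last_field(row))
```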
A number of strings are given.
We need to join all the strings to make a single valid string.
All the characters of the strings are lowercase letters [a-z].
A valid string means:
We need to concatenate sub-strings so that the last letter of one sub-string matches the first letter of the next (example: aa ab => becomes aaab // aa ba => becomes baaa)
All identical characters of the string should be adjacent.
Calculate how many orderings produce a valid string.
Example:
input :
array(aa, ab, bc, aa)
desired output :
This input has two possible combinations.
(since "aa aa ab bc" is different from "aa aa ab bc", because there are two aa, and they must not be considered equal)
My efforts :-
I only know that it's a graph problem (a connected-components problem).
Please help me so that I can build this algorithm.
I need your guidance, not the exact algorithm, so that I can build the algorithm myself.
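For small inputs, a brute-force count is a handy reference to check any smarter algorithm against. The sketch below assumes that "valid" means only that every letter forms one contiguous run in the concatenated string, and it treats equal strings at different positions as distinct, as in the example. The function names are made up for illustration, and the approach is factorial in the number of strings, so it is not meant for real data.

```python
from itertools import permutations


def is_valid(s):
    """True if every character of s occurs in a single contiguous run."""
    seen = set()
    prev = None
    for ch in s:
        if ch != prev:
            if ch in seen:  # this character already had an earlier run
                return False
            seen.add(ch)
            prev = ch
    return True


def count_valid_orderings(strings):
    """Count orderings whose concatenation is valid (brute force)."""
    return sum(is_valid("".join(p)) for p in permutations(strings))


print(count_valid_orderings(["aa", "ab", "bc", "aa"]))  # 2
```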
How I'd handle this one:
Let's declare 2 variables:
int factorials_quotient = 1;
int solution_count = 0;
As an example, I will use the strings: abc cme eqk aa a bb b, since it is the trickiest case
FIRST: identify the 'concat' strings.
Example: abc cme eqk aa a bb b
Since the 'abc cme eqk' strings only have one possible order, you MUST concatenate them and consider the result as a single string: abccmeeqk aa a bb b. (For each concatenation, if the first letter equals the last letter of the sub-string, you have an exception case that you must handle.)
SECOND: identify the 'factorials' strings.
Example: abccmeeqk aa a bb b
We are looking for two or more strings made of the same single character, like: aa a & bb b (DO NOT count 'abccmeeqk' since it contains characters other than a).
For each such group of n strings, multiply factorials_quotient by n!
In our case, we first have aa a; since we have 2 strings (a and aa) we multiply factorials_quotient by factorial(2), then do the same for bb b. Our factorials_quotient now equals 1 * 2! * 2! = 4.
Then concatenate aa to a and bb to b.
We now have: abccmeeqk aaa bbb.
THIRD: try to concatenate the factorials and the concats where possible.
Example: we now have abccmeeqk aaa bbb.
Since abccmeeqk starts with a, we have to put aaa in front of it; we now have: aaaabccmeeqk bbb
FOURTH: get your result
solution_count = factorials_quotient * factorial(number of sub-strings in our string)
In our case, aaaabccmeeqk bbb, we have 2 sub-strings.
solution_count = 4 * 2! = 4 * 2 = 8 solutions:
Possible solutions :
aa a abc cme eqk bb b
aa a abc cme eqk b bb
a aa abc cme eqk bb b
a aa abc cme eqk b bb
bb b aa a abc cme eqk
b bb aa a abc cme eqk
bb b a aa abc cme eqk
b bb a aa abc cme eqk
Array step by step :
STEP 0: (at start)
array(
"aa",
"a",
"abc",
"cme",
"eqk",
"bb",
"b"
)
STEP 1:
array(
"aa",
"a",
"abccmeeqk",
"bb",
"b"
)
STEP 2:
array(
"aaa",
"abccmeeqk",
"bbb"
)
STEP 3:
array(
"aaaabccmeeqk",
"bbb"
)
STEP 4:
you got your result
I need an algorithm which generates all the combinations of size n of k characters.
If for example I have n=1 and k={a,b}, the result should be:
a
b
If n=3 and k={a,b}, the result should be:
a a a
a a b
a b a
a b b
b a a
b a b
b b a
b b b
Can someone suggest an algorithm for achieving this?
Thank you!
You can simply map your solutions to the values 0 to (|k|^n) - 1. Each solution is the representation of one of these numbers in base |k|, padded to n digits.
e.g. k={a,b,c} n=2
The values are 0, 1, 2, ..., 3^2 - 1 = 8
decimal | representation in base 3
--------+---------------------------
0 | 00
1 | 01
2 | 02
3 | 10
4 | 11
5 | 12
6 | 20
7 | 21
8 | 22
now replace '0' by 'a', '1' by 'b' and '2' by 'c' and you get
aa
ab
ac
ba
bb
bc
ca
cb
cc
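The counting idea above can be sketched in a few lines of Python. The function name `words_by_counting` is made up for illustration; it peels off base-|k| digits with `divmod` and maps each digit to a character.

```python
def words_by_counting(alphabet, n):
    """Yield every length-n word over `alphabet` by counting in base len(alphabet)."""
    k = len(alphabet)
    for value in range(k ** n):
        digits = []
        for _ in range(n):
            value, d = divmod(value, k)  # d is the least significant digit
            digits.append(alphabet[d])
        yield "".join(reversed(digits))  # most significant digit first


for word in words_by_counting("abc", 2):
    print(word)
```

For k={a,b,c} and n=2 this prints the nine words in the same order as the table, from aa to cc.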
The number of combinations is the length of k to the power of n.
In Java (Math.pow returns a double, so a cast is needed):
int combinations = (int) Math.pow(k.length, n);
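This is how I do it.
It's intuitive and I hope it helps you.
Explanation: pick a character and choose what to put next to it. Do this recursively.
#include <iostream>
#include <string>
using namespace std;

char availablechars[] = {'a', 'b'}; // the set k; in your case this is {a,b}
int availablechars_size = 2;        // number of characters in k

// Prints every sequence of n characters drawn from availablechars.
void generate(int n, string res)
{
    if (n == 0)
    {
        cout << "\n" << res;
        return;
    }
    for (int i = 0; i < availablechars_size; i++)
    {
        string t = res;
        t += availablechars[i];
        t += " ";
        generate(n - 1, t);
    }
}
Time complexity: O(k^n) recursive calls, since k^n sequences are printed.
The function call is:
generate(n,"");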
I have files with the following format:
ATOM 8962 CA VAL W 8 8.647 81.467 25.656 1.00115.78 C
ATOM 8963 C VAL W 8 10.053 80.963 25.506 1.00114.60 C
ATOM 8964 O VAL W 8 10.636 80.422 26.442 1.00114.53 O
ATOM 8965 CB VAL W 8 7.643 80.389 25.325 1.00115.67 C
ATOM 8966 CG1 VAL W 8 6.476 80.508 26.249 1.00115.54 C
ATOM 8967 CG2 VAL W 8 7.174 80.526 23.886 1.00115.26 C
ATOM 4440 O TYR S 89 4.530 166.005 -14.543 1.00 95.76 O
ATOM 4441 CB TYR S 89 2.847 168.812 -13.864 1.00 96.31 C
ATOM 4442 CG TYR S 89 3.887 169.413 -14.756 1.00 98.43 C
ATOM 4443 CD1 TYR S 89 3.515 170.073 -15.932 1.00100.05 C
ATOM 4444 CD2 TYR S 89 5.251 169.308 -14.451 1.00100.50 C
ATOM 4445 CE1 TYR S 89 4.464 170.642 -16.779 1.00100.70 C
ATOM 4446 CE2 TYR S 89 6.219 169.868 -15.298 1.00101.40 C
ATOM 4447 CZ TYR S 89 5.811 170.535 -16.464 1.00100.46 C
ATOM 4448 OH TYR S 89 6.736 171.094 -17.321 1.00100.20 O
ATOM 4449 N LEU S 90 3.944 166.393 -12.414 1.00 94.95 N
ATOM 4450 CA LEU S 90 5.079 165.622 -11.914 1.00 94.44 C
ATOM 5151 N LEU W 8 -66.068 209.785 -11.037 1.00117.44 N
ATOM 5152 CA LEU W 8 -64.800 210.035 -10.384 1.00116.52 C
ATOM 5153 C LEU W 8 -64.177 208.641 -10.198 1.00116.71 C
ATOM 5154 O LEU W 8 -64.513 207.944 -9.241 1.00116.99 O
ATOM 5155 CB LEU W 8 -65.086 210.682 -9.033 1.00115.76 C
ATOM 5156 CG LEU W 8 -64.274 211.829 -8.478 1.00113.89 C
ATOM 5157 CD1 LEU W 8 -64.528 211.857 -7.006 1.00111.94 C
ATOM 5158 CD2 LEU W 8 -62.828 211.612 -8.739 1.00112.96 C
In principle, column 5 (W, in this case, which represents the chain ID) should be identical only in consecutive chunks. However, in files with too many chains, there are not enough letters of the alphabet to assign a unique ID to each chain, and therefore duplicates may occur.
I would like to be able to check whether or not this is the case. In other words, I would like to know if a given chain ID (A-Z, always in the 5th column) is present in non-consecutive chunks. I do not mind if it changes from W to S; I would like to know if there are two chunks sharing the same chain ID, in this case, if W or S reappears at some point. In fact, this is only a problem if they also share the first and the 6th columns, but I do not want to complicate things too much.
I do not want to print the lines, just the name of the file in which the issue occurs and the chain ID (in this case W), in order to solve the problem. In fact, I already know how to solve the problem, but I need to identify the problematic files so I can focus on those and not repair files that are already sane.
SOLUTION (thanks to all for your help and namely to sehe):
for pdb in *.pdb ; do
    # keep ATOM records, take the chain ID column, collapse runs, report IDs with more than one run
    hit=$(awk '$1 == "ATOM"' "$pdb" | cut -c22-23 | uniq | sort | uniq -dc)
    [ "$hit" ] && echo "$pdb" = $hit
done
For this particular sample:
cut -c22-23 t | uniq | sort | uniq -dc
Will output
2 W
(the 22nd column contains 2 runs of the letter 'W')
untested
awk '
    seen[$5] && $5 != current {
        print "found non-consecutive chain on line " NR
        exit
    }
    { current = $5; seen[$5] = 1 }
' filename
Here you go, this awk script is tested and takes into account not just 'W':
{
    if (ln[$5] && ln[$5] + 1 != NR) {
        print "dup " $5 " at line " NR;
    }
    ln[$5] = NR;
}
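For comparison, the same check can be sketched in Python. It assumes, like the awk answers, that the chain ID is the 5th whitespace-separated field of each ATOM line; the function name `repeated_chain_ids` is made up for illustration. It returns the chain IDs that start a new run after already having appeared earlier.

```python
def repeated_chain_ids(lines):
    """Return chain IDs (5th field of ATOM lines) found in non-consecutive runs."""
    seen = set()
    repeated = []
    current = None
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "ATOM":
            continue  # skip non-ATOM and malformed lines
        chain = fields[4]
        if chain != current:  # a new run starts here
            if chain in seen and chain not in repeated:
                repeated.append(chain)
            seen.add(chain)
            current = chain
    return repeated


sample = [
    "ATOM 8962 CA VAL W 8 8.647 81.467 25.656 1.00115.78 C",
    "ATOM 4440 O TYR S 89 4.530 166.005 -14.543 1.00 95.76 O",
    "ATOM 5151 N LEU W 8 -66.068 209.785 -11.037 1.00117.44 N",
]
print(repeated_chain_ids(sample))  # ['W']
```

To scan many files, call it once per file, for example with `open(path)` as the `lines` argument, and print the file name whenever the returned list is not empty.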