Excess (-1) in base 4 representation - algorithm

I've been trying to wrap my head around this one problem for the last couple of days, and I can't figure out a way to solve it. So, here it goes:
Given base 4 (that is, 0, 1, 2, 3 as the digits of a number), find the excess (-1) base 4 representation of any negative or positive integer.
examples:
-6 = (-1)22
conversely, (-1)22 in excess (-1) of base 4 = 2 * 4^0 + 2 * 4^1 + (-1) * 4^2 = 2 + 8 - 16 = 10 - 16 = -6 in base 10
27 = 2(-1)(-1)
conversely, 2(-1)(-1) = (-1) * 4^0 + (-1) * 4^1 + 2 * 4^2 = -1 - 4 + 32 = 27
I did come up with a few algorithms for positive numbers, but none of them hold true for all negative numbers, so into the trash they went.
Can anyone give me some kind of clue here? Thanks!
----------------
Edit: I'm going to try to rephrase this question so that it does not cause any confusion.
Consider the digit set obtained by subtracting 1 from every base-4 digit, called the excess-(-1) of base 4. In this system, any integer can be represented using the digits -1, 0, 1, 2. So the problem asks for an algorithm that takes any integer as input and outputs the representation of that number.
Examples:
decimal -6 = -1 2 2 for the excess-(-1) of base 4.
To verify this, we take the representation -1 2 2 and convert it to a decimal number, starting from the right-most digit and using the generic base-n to base-10 algorithm, like so:
number = 2 * 4^0 + 2 * 4^1 + (-1) * 4^2 = 2 + 8 - 16 = -6

I don't know if "quaterit" is the correct word for the radix in this representation, but I'm going to use it anyway.
Since you say you already have an algorithm for positive numbers, I'll try to take a negative number as an input and write something that uses what you already have. The code below doesn't quite work, but I'll explain why at the end.
int[] BaseFourExcessForNegativeNumbers(int x) {
    int powerOfFour = 1;
    while (-powerOfFour > x) {
        powerOfFour *= 4;
    }
    int firstQuaterit = -1;
    int remainder = x + powerOfFour;
    int[] otherQuaterits;
    if (remainder >= 0) {
        otherQuaterits = BaseFourExcessForPositiveNumbers(remainder);
    } else {
        otherQuaterits = BaseFourExcessForNegativeNumbers(remainder);
    }
    int[] result = new int[otherQuaterits.Length + 1];
    result[0] = firstQuaterit;
    for (int index = 0; index < otherQuaterits.Length; ++index) {
        result[index + 1] = otherQuaterits[index];
    }
    return result;
}
The idea here is that every negative number x will start with a (-1) in this representation. If that (-1) is in the 4^n position, we want to find out how to represent x - (-1)*4^n to see how to represent the rest of the number.
The reason the code I wrote won't work is that it doesn't take into consideration the possibility that the second quaterit is a 0. If that happens, the array my code will produce will be missing that 0. In fact, if BaseFourExcessForPositiveNumbers is written in the same way, the resulting array will be missing every 0, but will otherwise be correct. A workaround is to keep track of which place the first quaterit takes, and then make the array that size, and fill it from the back to the front.
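For comparison, here is a minimal sketch of a different approach (not the recursive one above): ordinary repeated division by 4, except that a remainder of 3 is written as the digit -1 with a carry into the next position. The function names are made up for illustration, and this is only one possible way to do it. It relies on Python's % returning a value in 0..3 even for negative numbers.

```python
def to_excess_minus_one_base4(n):
    """Return the digits of n, most significant first, using digits -1, 0, 1, 2."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 4          # always 0..3 in Python, also for negative n
        if r == 3:
            r = -1         # 3 is not an allowed digit: write -1 and carry upward
        digits.append(r)
        n = (n - r) // 4   # exact division, because n - r is a multiple of 4
    return digits[::-1]


def from_digits(digits):
    """Evaluate a most-significant-first digit list back to an integer."""
    value = 0
    for d in digits:
        value = value * 4 + d
    return value
```

With this sketch, -6 comes out as [-1, 2, 2] and 27 as [2, -1, -1], matching the examples in the question, and zeros in the middle of a number are kept because every position produces exactly one digit.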

Getting the maximum number in a set with special conditions

I recently encountered a problem and I am having a hard time finding the answer.
This is the question:
Consider a set of numbers. There are three kinds of input:
1 x
2 x
3
The first command adds the integer x to the set.
The second one means that for every element y in the set, we put:
y = y xor x
and the last command prints the biggest number in the set. For instance:
10
3
1 7
3
2 4
2 8
2 3
1 10
1 3
3
2 1
results:
0
7
15
if n is the number of commands in input:
and:
also there is a 1 second execution time limit!
My solution so far:
Let's call the set S and keep an integer m, which is initially 0. As you know:
number = number xor x xor x
meaning that if we apply xor with the same value twice, its effect is reversed and the original number doesn't change. That being said, every time we insert a number (command 1) we do the following:
y = y xor m
add y to S
and every time we want to get a number from the set:
find y
y = y xor m
return y
and if command two comes, we do the following:
m = m xor x
then the problem is almost solved, since we store the XORed version of the numbers and reverse it when needed!
But the problem here is finding the largest number in the set (note that the numbers stored in the set are different from the original numbers) so that command 3 works correctly. I don't know how to do this efficiently, but I have an idea:
if we store the binary representation of the numbers in the set in a trie data structure, then maybe we can quickly find the biggest number. I don't really know how, but this idea occurred to me.
so to sum up these are my issues:
problem 1:
how to find the biggest number in the revised list
problem 2:
is this trie idea good?
problem 3:
how can I implement it in code (the language is not very important here) so that it runs within the time limit?
also what is the time complexity needed to solve this problem in the first place?
Thanks for reading my question.
Yes, your idea is correct; it can be solved in O(N log 10^9) using a binary trie data structure.
The idea is to store the numbers in binary notation with the most significant bits first, so that while traversing the trie we can choose the branch that leads to the greatest answer.
We decide which branch to take bit by bit: if a trie node has two branches, with values 0 and 1, we choose the one that gives the better result after XORing with m.
Sample code (C++):
#include <bits/stdc++.h>
using namespace std;

int Trie[4000005][2];
int nxt = 2;

void Add(int x)
{
    bitset<32> b(x);
    int c = 1;
    for (int j = 31; j >= 0; j--)
        if (Trie[c][b[j]]) c = Trie[c][b[j]];
        else c = Trie[c][b[j]] = nxt++;
}

int Get(int x)
{
    bitset<32> b(x), res(0);
    int c = 1;
    for (int j = 31; j >= 0; j--)
        if (Trie[c][!b[j]]) c = Trie[c][!b[j]], res[j] = !b[j];
        else c = Trie[c][b[j]], res[j] = b[j];
    return res.to_ullong() ^ x;
}

int main()
{
    ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
    int q, m = 0;
    cin >> q;
    Add(0);
    while (q--)
    {
        int type;
        cin >> type;
        if (type == 1)
        {
            int x;
            cin >> x;
            Add(x ^ m);
        }
        else if (type == 2)
        {
            int x;
            cin >> x;
            m ^= x;
        }
        else cout << Get(m) << "\n";
    }
}
This is very similar to this problem and should be solvable in O(n), because the number of bits for x is constant (for 10^9 you will have to look at the 30 lowest bits).
At start m = 0, each time you encounter the 2nd command you do m ^= x (m = m xor x).
Use a binary tree. Unlike for the linked question the amount of numbers in a bucket doesn't matter, you just need to be able to tell if there is a number that has a certain bit which is one or zero. E.g. for 3-bit numbers 1, 4 and 5 the tree could look like this (left means bit is 0, right means bit is 1):
      *
    /   \
   1     1     there are numbers with highest bit 0 and 1
  /     /
 1     1       of the numbers with 1st bit 0, there is a number with 2nd bit 0 and ...
  \   / \
   1 1   1     of the numbers with 1st and 2nd bit 0, there is a number with 3rd bit 1,...
   1 4   5     (the numbers just to clarify)
So adding a number just means adding some edges and nodes.
To get the highest number in the set you go down the tree and through the bits of m and calculate the max x as follows:
1. Initialize node n as the root of the tree, i = 29 as the bit of m we are looking at, and the solution x = 0.
2. mi = (m & (1 << i)) >> i (1 if the bit in m is 1, 0 otherwise).
3. If n only has a 0-edge, or if mi == 1 and n has a 0-edge: n becomes the node connected by that edge and x = 2 * x + mi (or, more fancy: x = (x << 1) | mi).
4. Otherwise n becomes the node connected by the 1-edge and x = 2 * x + 1 - mi.
5. If i > 0: decrease i by 1 and continue with step 2.
An example for 3-bit numbers m = 6 (110) and the numbers 1 (001), 4 (100) and 5 (101) in the set, the answer should be 7 (111), i.e. 1 xor 6: First step we go left and x = 1, then we can only go left and x = 3, then we can only go right and x = 7.
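To make the idea concrete, here is a rough Python sketch of the trie with a lazily applied xor mask. The class and method names are invented for illustration, the 30-bit width is an assumption taken from the 10^9 bound mentioned above, and starting the set as {0} is inferred from the sample output (the first query prints 0 before anything is added).

```python
class XorMaxSet:
    """Set of non-negative ints supporting 'xor everything' and 'maximum'."""

    BITS = 30  # assumption: all values are below 2**30

    def __init__(self):
        self.children = [[0, 0]]  # node 0 is the root; 0 means "no child"
        self.mask = 0             # xor that is conceptually applied to every element
        self.add(0)               # assumption: the set initially contains 0

    def add(self, x):
        x ^= self.mask            # store a value such that stored ^ mask == x
        node = 0
        for i in range(self.BITS - 1, -1, -1):
            bit = (x >> i) & 1
            if self.children[node][bit] == 0:
                self.children.append([0, 0])
                self.children[node][bit] = len(self.children) - 1
            node = self.children[node][bit]

    def xor_all(self, x):
        self.mask ^= x            # O(1): nothing in the trie is touched

    def maximum(self):
        node, best = 0, 0
        for i in range(self.BITS - 1, -1, -1):
            want = 1 - ((self.mask >> i) & 1)  # stored bit that yields a 1 after xor
            if self.children[node][want]:
                node = self.children[node][want]
                best |= 1 << i
            else:
                node = self.children[node][1 - want]
        return best
```

Each command costs at most 30 trie steps, so the whole input is processed in time proportional to the number of commands.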

Querying in a range[L,R]

Given a binary string (that is, a string consisting of only 0 and 1), we are supposed to perform two types of query on the string.
Type 0: Given two indices l and r, print the value of the binary string from l to r modulo 3.
Type 1: Given an index l, flip the value at that index if and only if the value at that index is 0.
I am trying to solve this using BIT.
If the number in range [l,r] is even, then:
if the number of ones is even, the answer is 0, else 2.
If the number in range [l,r] is odd:
if the number of ones is even, the answer is 0, else 1.
But I am getting a wrong answer for some test cases. What is wrong with my approach?
public static void update(int i){
while(A.length>i){
A[i]+=1;
i+=i&-i;
}
}
public static int ans(int i){
int a=0;
while(i>0){
a+=A[i];
i-=i&-i;
}
return a;
}
Answer for each Query.
while(Q>0){
Q--;
int x = in.nextInt();
int l = in.nextInt()+1;
if(x==1){
if((ans(l)-ans(l-1))==0) update(l);
continue;
}
int r = in.nextInt()+1;
int f = ans(r) - ans(r-1);
if(f==0){
int sum = ans(r)- ans(l-1);
if(sum%2==0) System.out.println(0);
else System.out.println(2);
}else{
int sum = ans(r)- ans(l-1);
if(sum%2==0) System.out.println(0);
else System.out.println(1);
}
}
Full CODE
When building a binary number, from the left digit to the right digit, if you consider only a currently parsed section of the string, it is a binary number. We'll call it n.
When you append a digit to the right of n, this shifts it left, and then also adds the digit (1 or 0). So n0 = 2*n and n1 = 2*n+1
Because you only care about the number modulo 3, you can just keep track of this modulo 3.
You can note that
0 (mod 3) * 2 = 0 (mod 3)
0 (mod 3) * 2 + 1 = 1 (mod 3)
1 (mod 3) * 2 = 2 (mod 3)
1 (mod 3) * 2 + 1 = 0 (mod 3)
2 (mod 3) * 2 = 1 (mod 3)
2 (mod 3) * 2 + 1 = 2 (mod 3)
You can construct a simple fsm to represent these relationships and simply use the section of the string which you are interested in as input to it. Or implement it however else you want.
Hopefully you realise that "Given an index l flip the value of that index if and only if the value at that index is 0." simply means set the value at l to 1.
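As a small illustration of the state machine described above, the whole transition table collapses to state = (2 * state + bit) mod 3. This is only a sketch with a made-up function name; it scans the substring on every query and does not try to speed up repeated range queries.

```python
def binary_mod3(bits):
    """Value of a binary string modulo 3, reading digits left to right."""
    state = 0
    for ch in bits:
        # Appending a digit doubles the number and adds the digit.
        state = (2 * state + int(ch)) % 3
    return state


def query(s, l, r):
    """Type 0 query on 0-based inclusive indices l..r of the string s."""
    return binary_mod3(s[l:r + 1])
```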

Sum of all numbers written with particular digits in a given range

My objective is to find the sum of all numbers from 4 to 666554 which consists of 4,5,6 only.
SUM = 4+5+6+44+45+46+54+55+56+64+65+66+.....................+666554.
Simple method is to run a loop and add the numbers made of 4,5 and 6 only.
long long sum = 0;
for(int i=4;i <=666554;i++){
/*check if number contains only 4,5 and 6.
if condition is true then add the number to the sum*/
}
But it seems to be inefficient. Checking that a number is made up of only 4, 5 and 6 takes time. Is there any way to increase the efficiency? I have tried a lot, but I have not found a new approach. Please help.
For 1-digit numbers, note that
4 + 5 + 6 == 5 * 3
For 2-digits numbers:
(44 + 45 + 46) + (54 + 55 + 56) + (64 + 65 + 66)
== 45 * 3 + 55 * 3 + 65 * 3
== 55 * 9
and so on.
In general, for n-digit numbers, there are 3^n of them that consist of 4, 5, 6 only, and their average value is exactly 5...5 (n digits). Using code, their sum is ('5' * n).to_i * 3 ** n (Ruby), or int('5' * n) * 3 ** n (Python).
You calculate up to 6-digit numbers, then subtract the sum of the qualifying numbers from 666555 to 666666.
P.S: for small numbers like 666554, using pattern matching is fast enough. (example)
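Put together, the two steps could look roughly like this in Python. The function names are invented for this sketch, and the tail above the limit is simply brute-forced over all numbers of the same width, which is only 3^6 = 729 candidates here.

```python
from itertools import product


def sum_all_up_to_digits(max_digits):
    """Sum of every number with 1..max_digits digits, all taken from 4, 5, 6."""
    return sum(int("5" * n) * 3 ** n for n in range(1, max_digits + 1))


def sum_above(limit, width, digits="456"):
    """Sum of the width-digit numbers over the given digits that exceed limit."""
    total = 0
    for combo in product(digits, repeat=width):
        value = int("".join(combo))
        if value > limit:
            total += value
    return total


def sum_456_up_to(limit=666554):
    width = len(str(limit))
    return sum_all_up_to_digits(width) - sum_above(limit, width)
```

For the limit in the question, sum_456_up_to() gives 409632209.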
Implement a counter in base 3 (number of digit values), e.g. 0,1,2,10,11,12,20,21,22,100.... and then translate the base-3 number into a decimal with the digits 4,5,6 (0->4, 1->5, 2->6), and add to running total. Repeat until the limit.
def compute_sum(digits, max_val):

    def _next_val(cur_val):
        for pos in range(len(cur_val)):
            cur_val[pos]+=1
            if cur_val[pos]<len(digits):
                return
            cur_val[pos]=0
        cur_val.append(0)

    def _get_val(cur_val):
        digit_val=1
        num_val=0
        for x in cur_val:
            num_val+=digits[x]*digit_val
            digit_val*=10
        return num_val

    cur_val=[]
    sum=0
    while(True):
        _next_val(cur_val)
        num_val=_get_val(cur_val)
        if num_val>max_val:
            break
        sum+=num_val
    return sum

def main():
    digits=[4,5,6]
    max_val=666554
    print(digits, max_val)
    print(compute_sum(digits, max_val))
Mathematics are good, but not all problems are trivially "compressible", so knowing how to deal with them without mathematics can be worthwhile.
In this problem, the summation is trivial, the difficulty is efficiently enumerating the numbers that need be added, at first glance.
The "filter" route is a possibility: generate all possible numbers, incrementally, and filter out those which do not match; however it is also quite inefficient (in general):
the condition might not be trivial to match: in this case, the easier way is a conversion to string (fairly heavy on divisions and tests) followed by string-matching
the ratio of filtering is not too bad to start with at 30% per digit, but it scales very poorly as gen-y-s remarked: for a 4 digits number it is at 1%, or generating and checking 100 numbers to only get 1 out of them.
I would therefore advise a "generational" approach: only generate numbers that match the condition (and all of them).
I would note that generating all numbers composed of 4, 5 and 6 is like counting (in ternary):
starts from 4
45 becomes 46 (beware of carry-overs)
66 becomes 444 (extreme carry-over)
Let's go, in Python, as a generator:
def generator():
    def convert(array):
        i = 0
        for e in array:
            i *= 10
            i += e
        return i

    def increment(array):
        result = []
        carry = True
        for e in array[::-1]:
            if carry:
                e += 1
                carry = False
            if e > 6:
                e = 4
                carry = True
            result = [e,] + result
        if carry:
            result = [4,] + result
        return result

    array = [4]
    while True:
        num = convert(array)
        if num > 666554: break
        yield num
        array = increment(array)
Its result can be printed with sum(generator()):
$ time python example.py
409632209
python example.py 0.03s user 0.00s system 82% cpu 0.043 total
And here is the same in C++.
"Start with a simpler problem." —Polya
Sum the n-digit numbers which consist of the digits 4,5,6 only
As Yu Hao explains above, there are 3**n numbers and their average by symmetry is eg. 555555, so the sum is 3**n * (10**n-1)*5/9. But if you didn't spot that, here's how you might solve the problem another way.
The problem has a recursive construction, so let's try a recursive solution. Let g(n) be the sum of all 456-numbers of exactly n digits. Then we have the recurrence relation:
g(n) = (4+5+6)*10**(n-1)*3**(n-1) + 3*g(n-1)
To see this, separate the first digit of each number in the sum (eg. for n=3, the hundreds column). That gives the first term. The second term is sum of the remaining digits, one count of g(n-1) for each prefix of 4,5,6.
If that's still unclear, write out the n=2 sum and separate tens from units:
g(2) = 44+45+46 + 54+55+56 + 64+65+66
= (40+50+60)*3 + 3*(4+5+6)
= (4+5+6)*10*3 + 3*g(1)
Cool. At this point, the keen reader might like to check Yu Hao's formula for g(n) satisfies our recurrence relation.
To solve OP's problem, the sum of all 456-numbers from 4 to 666666 is g(1) + g(2) + g(3) + g(4) + g(5) + g(6). In Python, with dynamic programming:
def sum456(n):
    """Find the sum of all numbers at most n digits which consist of 4,5,6 only"""
    g = [0] * (n+1)
    for i in range(1,n+1):
        g[i] = 15*10**(i-1)*3**(i-1) + 3*g[i-1]
    print(g) # show the array of partial solutions
    return sum(g)
For n=6
>>> sum456(6)
[0, 15, 495, 14985, 449955, 13499865, 404999595]
418964910
Edit: I note that OP truncated his sum at 666554, so it doesn't fit the general pattern. The result is the full sum less the last few terms:
>>> sum456(6) - (666555 + 666556 + 666564 + 666565 + 666566 + 666644 + 666645 + 666646 + 666654 + 666655 + 666656 + 666664 + 666665 + 666666)
409632209
The sum of 4 through 666666 is:
total = sum([15*(3**i)*int('1'*(i+1)) for i in range(6)])
>>> 418964910
The sum of the few numbers between 666554 and 666666 is:
rest = (666555+666556+666564+666565+666566+
        666644+666645+666646+
        666654+666655+666656+
        666664+666665+666666)
>>> 9332701
total - rest
>>> 409632209
Java implementation of question:-
This uses the modulo(10^9 +7) for the answer.
public static long compute_sum(long[] digits, long max_val, long count[]) {
List<Long> cur_val = new ArrayList<>();
long sum = 0;
long mod = ((long)Math.pow(10,9))+7;
long num_val = 0;
while (true) {
_next_val(cur_val, digits);
num_val = _get_val(cur_val, digits, count);
sum =(sum%mod + (num_val)%mod)%mod;
if (num_val == max_val) {
break;
}
}
return sum;
}
public static void _next_val(List<Long> cur_val, long[] digits) {
for (int pos = 0; pos < cur_val.size(); pos++) {
cur_val.set(pos, cur_val.get(pos) + 1);
if (cur_val.get(pos) < digits.length)
return;
cur_val.set(pos, 0L);
}
cur_val.add(0L);
}
public static long _get_val(List<Long> cur_val, long[] digits, long count[]) {
long digit_val = 1;
long num_val = 0;
long[] digitAppearanceCount = new long[]{0,0,0};
for (Long x : cur_val) {
digitAppearanceCount[x.intValue()] = digitAppearanceCount[x.intValue()]+1;
if (digitAppearanceCount[x.intValue()]>count[x.intValue()]){
num_val=0;
break;
}
num_val = num_val+(digits[x.intValue()] * digit_val);
digit_val *= 10;
}
return num_val;
}
public static void main(String[] args) {
long [] digits=new long[]{4,5,6};
long count[] = new long[]{1,1,1};
long max_val= 654;
System.out.println(compute_sum(digits, max_val, count));
}
The Answer by #gen-y-s (https://stackoverflow.com/a/31286947/8398943) is wrong (It includes 55,66,44 for x=y=z=1 which is exceeding the available 4s, 5s, 6s). It gives output as 12189 but it should be 3675 for x=y=z=1.
The logic by #Yu Hao (https://stackoverflow.com/a/31285816/8398943) has the same mistake as mentioned above. It gives output as 12189 but it should be 3675 for x=y=z=1.

Find number of binary numbers with certain constraints

This is more of a puzzle than a coding problem. I need to find how many binary numbers can be generated satisfying certain constraints. The inputs are
(integer) Len - Number of digits in the binary number
(integer) x
(integer) y
The binary number has to be such that taking any x adjacent digits from the binary number should contain at least y 1's.
For example -
Len = 6, x = 3, y = 2
0 1 1 0 1 1 - Length is 6, Take any 3 adjacent digits from this and
there will be 2 1's
I had this C# coding question posed to me in an interview and I cannot figure out any algorithm to solve this. Not looking for code (although it's welcome), any sort of help, pointers are appreciated
This problem can be solved using dynamic programming. The main idea is to group the binary numbers according to the last x-1 bits and the length of each binary number. If appending a bit sequence to one number yields a number satisfying the constraint, then appending the same bit sequence to any number in the same group results in a number satisfying the constraint also.
For example, x = 4, y = 2. both of 01011 and 10011 have the same last 3 bits (011). Appending a 0 to each of them, resulting 010110 and 100110, both satisfy the constraint.
Here is pseudo code:
mask = (1<<(x-1)) - 1
count[0][0] = 1
for(i = 0; i < Len; ++i) {
    for(j = 0; j < 1<<i && j < 1<<(x-1); ++j) {
        if(i<x-1 || count1Bit(j*2+1)>=y)
            count[i+1][(j*2+1)&mask] += count[i][j];
        if(i<x-1 || count1Bit(j*2)>=y)
            count[i+1][(j*2)&mask] += count[i][j];
    }
}
answer = 0
for(j = 0; j < 1<<(x-1); ++j)
    answer += count[Len][j];
This algorithm assumes that Len >= x. The time complexity is O(Len*2^x).
EDIT
The count1Bit(j) function counts the number of 1 in the binary representation of j.
The only input to this algorithm are Len, x, and y. It starts from an empty binary string [length 0, group 0], and iteratively tries to append 0 and 1 until length equals to Len. It also does the grouping and counting the number of binary strings satisfying the 1-bits constraint in each group. The output of this algorithm is answer, which is the number of binary strings (numbers) satisfying the constraints.
For a binary string in group [length i, group j], appending 0 to it results in a binary string in group [length i+1, group (j*2)%(2^(x-1))]; appending 1 to it results in a binary string in group [length i+1, group (j*2+1)%(2^(x-1))].
Let count[i,j] be the number of binary strings in group [length i, group j] satisfying the 1-bits constraint. If there are at least y 1 in the binary representation of j*2, then appending 0 to each of these count[i,j] binary strings yields a binary string in group [length i+1, group (j*2)%(2^(x-1))] which also satisfies the 1-bit constraint. Therefore, we can add count[i,j] into count[i+1,(j*2)%(2^(x-1))]. The case of appending 1 is similar.
The condition i<x-1 in the above algorithm is to keep the binary strings growing when length is less than x-1.
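Here is a short Python sketch of the same grouping idea, in case it helps to see it run. It keeps only the current length's groups in a dictionary instead of a 2-D array; the function name and that storage choice are mine, not part of the pseudocode above.

```python
def count_strings(length, x, y):
    """Count binary strings of the given length in which every window of
    x adjacent digits contains at least y ones."""
    mask = (1 << (x - 1)) - 1
    counts = {0: 1}  # group (last x-1 bits as an int) -> number of strings
    for i in range(length):
        nxt = {}
        for group, c in counts.items():
            for bit in (0, 1):
                window = (group << 1) | bit  # last x bits after appending
                # A full window exists only once the new length reaches x.
                if i + 1 >= x and bin(window).count("1") < y:
                    continue
                key = window & mask
                nxt[key] = nxt.get(key, 0) + c
        counts = nxt
    return sum(counts.values())
```

For the example in the question, count_strings(6, 3, 2) gives 13.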
Using the example of LEN = 6, X = 3 and Y = 2...
Build an exhaustive bit pattern generator for X bits. A simple binary counter can do this. For example, if X = 3
then a counter from 0 to 7 will generate all possible bit patterns of length 3.
The patterns are:
000
001
010
011
100
101
110
111
Verify the adjacency requirement as the patterns are built. Reject any patterns that do not qualify.
Basically this boils down to rejecting any pattern containing fewer than 2 '1' bits (Y = 2). The list prunes down to:
011
101
110
111
For each member of the pruned list, add a '1' bit and retest the first X bits. Keep the new pattern if it passes the
adjacency test. Do the same with a '0' bit. For example this step proceeds as:
1011 <== Keep
1101 <== Keep
1110 <== Keep
1111 <== Keep
0011 <== Reject
0101 <== Reject
0110 <== Keep
0111 <== Keep
Which leaves:
1011
1101
1110
1111
0110
0111
Now repeat this process until the pruned set is empty or the member lengths become LEN bits long. In the end
the only patterns left are:
111011
111101
111110
111111
110110
110111
101101
101110
101111
011011
011101
011110
011111
Count them up and you are done.
Note that you only need to test the first X bits on each iteration because all the subsequent patterns were verified in prior steps.
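A rough Python sketch of this grow-and-prune procedure follows. The function name is made up, and it assumes length >= x, as in the walkthrough above.

```python
def enumerate_valid(length, x, y):
    """List all binary strings of the given length whose every window of
    x digits holds at least y ones (assumes length >= x)."""
    # Start with every x-bit pattern that already satisfies the rule.
    current = [format(v, "0{}b".format(x)) for v in range(2 ** x)]
    current = [p for p in current if p.count("1") >= y]
    while current and len(current[0]) < length:
        grown = []
        for p in current:
            for bit in "10":
                candidate = bit + p
                # Only the first x bits form a window that was not checked before.
                if candidate[:x].count("1") >= y:
                    grown.append(candidate)
        current = grown
    return current
```

Taking len(enumerate_valid(6, 3, 2)) counts the 13 patterns listed above.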
Considering that input values are variable and wanted to see the actual output, I used recursive algorithm to determine all combinations of 0 and 1 for a given length :
private static void BinaryNumberWithOnes(int n, int dump, int ones, string s = "")
{
if (n == 0)
{
if (BinaryWithoutDumpCountContainsnumberOfOnes(s, dump,ones))
Console.WriteLine(s);
return;
}
BinaryNumberWithOnes(n - 1, dump, ones, s + "0");
BinaryNumberWithOnes(n - 1, dump, ones, s + "1");
}
and BinaryWithoutDumpCountContainsnumberOfOnes to determine if the binary number meets the criteria
private static bool BinaryWithoutDumpCountContainsnumberOfOnes(string binaryNumber, int dump, int ones)
{
    int current = 0;
    int count = binaryNumber.Length;
    while (current + dump <= count)
    {
        // Count the 1s inside the window of `dump` digits starting at `current`.
        var fail = binaryNumber.Substring(current, dump).Replace("0", "").Length < ones;
        if (fail)
        {
            return false;
        }
        current++;
    }
    return true;
}
Calling BinaryNumberWithOnes(6, 3, 2) will output all binary numbers that match
011011
011101
011110
011111
101101
101110
101111
110110
110111
111011
111101
111110
111111
Sounds like a nested for loop would do the trick. Pseudocode (not tested).
value = '0101010111110101010111' // change this line to format you would need
for (i = 0; i <= (Len-x); i++) { // loop over value from left to right
    kount = 0
    for (j = i; j < (i+x); j++) { // count '1' bits in the next 'x' bits
        kount += value[j] // add 0 or 1
    }
    if kount < y then return fail // this window has too few '1' bits
}
return success
The naive approach would be a tree-recursive algorithm.
Our recursive method would slowly build the number up, e.g. it would start at xxxxxx, return the sum of a call with 1xxxxx and 0xxxxx, which themselves will return the sum of a call with 10, 11 and 00, 01, etc. except if the x/y conditions are NOT satisfied for the string it would build by calling itself it does NOT go down that path, and if you are at a terminal condition (built a number of the correct length) you return 1. (note that since we're building the string up from left to right, you don't have to check x/y for the entire string, just also considering the newly added digit!)
By returning a sum over all calls then all of the returned 1s will pool together and be returned by the initial call, equalling the number of constructed strings.
No idea what the big O notation for time complexity is for this one, it could be as bad as O(2^n)*O(checking x/y conditions) but it will prune lots of branches off the tree in most cases.
UPDATE: One insight I had is that all branches of the recursive tree can be 'merged' if they have identical last x digits so far, because then the same checks would be applied to all digits hereafter so you may as well double them up and save a lot of work. This now requires building the tree explicitly instead of implicitly via recursive calls, and maybe some kind of hashing scheme to detect when branches have identical x endings, but for large length it would provide a huge speedup.
My approach is to start by getting the all binary numbers with the minimum number of 1's, which is easy enough, you just get every unique permutation of a binary number of length x with y 1's, and cycle each unique permutation "Len" times. By flipping the 0 bits of these seeds in every combination possible, we are guaranteed to iterate over all of the binary numbers that fit the criteria.
from itertools import permutations, cycle, combinations

def uniq(x):
    d = {}
    for i in x:
        d[i]=1
    return d.keys()

def findn( l, x, y ):
    window = []
    for i in range(y):
        window.append(1)
    for i in range(x-y):
        window.append(0)
    perms = uniq(permutations(window))
    seeds=[]
    for p in perms:
        pr = cycle(p)
        seeds.append([ next(pr) for i in range(l) ]) ###a seed is a binary number fitting the criteria with minimum 1 bits
    bin_numbers=[]
    for seed in seeds:
        if seed in bin_numbers: continue
        indexes = [ i for i, x in enumerate(seed) if x == 0] ### get indexes of 0 "bits"
        exit = False
        for i in range(len(indexes)+1):
            if( exit ): break
            for combo in combinations(indexes, i): ### combinatorically flipping the zero bits in the seed
                new_num = seed[:]
                for index in combo: new_num[index]+=1
                if new_num in bin_numbers:
                    ### if our new binary number has been seen before
                    ### we can break out since we are doing a depth first traversal
                    exit=True
                    break
                else:
                    bin_numbers.append(new_num)
    print(len(bin_numbers))

findn(6,3,2)
Growth of this approach is definitely exponential, but I thought I'd share my approach in case it helps someone else get to a lower complexity solution...
Set some condition and introduce simple help variable.
L = 6, x = 3 , y = 2 introduce d = x - y = 1
Condition: if the list formed by the next number's hypothetical value and the previous x - 1 elements has more than d zero digits, the next number's concrete value must be 1; otherwise add two branches, with both 1 and 0 as the concrete value.
Start: check(Condition) => both 0,1 due to number of total zeros in the 0-count check.
Empty => add 0 and 1
Step 1:Check(Condition)
0 (number of next value if 0 and previous x - 1 zeros > d(=1)) -> add 1 to sequence
1 -> add both 0,1 in two different branches
Step 2: check(Condition)
01 -> add 1
10 -> add 1
11 -> add 0,1 in two different branches
Step 3:
011 -> add 0,1 in two branches
101 -> add 1 (the next value if 0 and prev x-1 seq would be 010, so we prune and set only 1)
110 -> add 1
111 -> add 0,1
Step 4:
0110 -> obviously 1
0111 -> both 0,1
1011 -> both 0,1
1101 -> 1
1110 -> 1
1111 -> 0,1
Step 5:
01101 -> 1
01110 -> 1
01111 -> 0,1
10110 -> 1
10111 -> 0,1
11011 -> 0,1
11101 -> 1
11110 -> 1
11111 -> 0,1
Step 6 (Finish):
011011
011101
011110
011111
101101
101110
101111
110110
110111
111011
111101
111110
111111
Now count. I've tested for L = 6, x = 4 and y = 2 too, but consider to check the algorithm for special cases and extended cases.
Note: I'm pretty sure some algorithm with Disposition Theory bases should be a really massive improvement of my algorithm.
So in a series of Len binary digits, you are looking for a x-long segment that contains y 1's ..
See the execution: http://ideone.com/xuaWaK
Here's my Algorithm in Java:
import java.util.*;
import java.lang.*;
class Main
{
public static ArrayList<String> solve (String input, int x, int y)
{
int s = 0;
ArrayList<String> matches = new ArrayList<String>();
String segment = null;
for (int i=0; i<(input.length()-x); i++)
{
s = 0;
segment = input.substring(i,(i+x));
System.out.print(" i: "+i+" ");
for (char c : segment.toCharArray())
{
System.out.print("*");
if (c == '1')
{
s = s + 1;
}
}
if (s == y)
{
matches.add(segment);
}
System.out.println();
}
return matches;
}
public static void main (String [] args)
{
String input = "011010101001101110110110101010111011010101000110010";
int x = 6;
int y = 4;
ArrayList<String> matches = null;
matches = solve (input, x, y);
for (String match : matches)
{
System.out.println(" > "+match);
}
System.out.println(" Number of matches is " + matches.size());
}
}
The number of patterns of length X that contain at least Y 1 bits is countable. For the case x == y we know there is exactly one pattern of the 2^x possible patterns that meets the criteria. For smaller y we need to sum up the number of patterns which have excess 1 bits and the number of patterns that have exactly y bits.
choose(n, k) = n! / (k! (n - k)!)
numPatterns(x, y) {
total = 0
for (int j = x; j >= y; j--)
total += choose(x, j)
return total
}
For example :
X = 4, Y = 4 : 1 pattern
X = 4, Y = 3 : 1 + 4 = 5 patterns
X = 4, Y = 2 : 1 + 4 + 6 = 11 patterns
X = 4, Y = 1 : 1 + 4 + 6 + 4 = 15 patterns
X = 4, Y = 0 : 1 + 4 + 6 + 4 + 1 = 16
(all possible patterns have at least 0 1 bits)
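The counting function above translates almost directly to Python using math.comb (available from Python 3.8). This sketch only covers the single-window count, not the sliding part discussed next.

```python
from math import comb


def num_patterns(x, y):
    """Number of x-bit patterns that contain at least y one-bits."""
    return sum(comb(x, j) for j in range(y, x + 1))
```

For x = 4 it reproduces the table: 1, 5, 11, 15 and 16 patterns for y = 4, 3, 2, 1 and 0.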
So let M be the number of X length patterns that meet the Y criteria. Now, that X length pattern is a subset of N bits. There are (N - x + 1) "window" positions for the sub pattern, and 2^N total patterns possible. If we start with any of our M patterns, we know that appending a 1 to the right and shifting to the next window will result in one of our known M patterns. The question is, how many of the M patterns can we add a 0 to, shift right, and still have a valid pattern in M?
Since we are adding a zero, we have to be either shifting away from a zero, or we have to already be in an M where we have an excess of 1 bits. To flip that around, we can ask how many of the M patterns have exactly Y bits and start with a 1. Which is the same as "how many patterns of length X-1 have Y-1 bits", which we know how to answer:
shiftablePatternCount = M - choose(X-1, Y-1)
So starting with M possibilities, we are going to increase by shiftablePatternCount when we slide to the right. All patterns in the new window are in the set of M, with some patterns now duplicated. We are going to shift a number of times to fill up N by (N - X), each time increasing the count by shiftablePatternCount, so the full answer should be :
totalCountOfMatchingPatterns = M + (N - X)*shiftablePatternCount
edit - realized a mistake. I need to count the duplicates of the shiftable patterns that are generated. I think that's doable. (draft still)
I am not sure about my answer, but here is my view. Take a look at it:
Len=4,
x=3,
y=2.
I just took two patterns, because every window must contain at least y 1's.
X 1 1 X
1 X 1 X
X represents "don't care".
Now the count for the 1st expression is 2 * 1 * 1 * 2 = 4
and for the 2nd expression 1 * 2 * 1 * 2 = 4,
but 2 patterns are common to both, so subtract 2. So there will be a total of 6 patterns which satisfy the condition.
I happen to be using an algorithm similar to your problem; while trying to find a way to improve it, I found your question. So I will share it:
static int GetCount(int length, int oneBits){
int result = 0;
double count = Math.Pow(2, length);
for (int i = 1; i <= count - 1; i++)
{
string str = Convert.ToString(i, 2).PadLeft(length, '0');
if (str.ToCharArray().Count(c => c == '1') == oneBits)
{
result++;
}
}
return result;
}
Not very efficient, I think, but an elegant solution.

Finding the index of a given permutation

I'm reading the numbers 0, 1, ..., (N - 1) one by one in some order. My goal is to find the lexicographic index of this given permutation, using only O(1) space.
This question was asked before, but all the algorithms I could find used O(N) space. I'm starting to think that it's not possible. But it would really help me a lot with reducing the number of allocations.
Considering the following data:
chars = [a, b, c, d]
perm = [c, d, a, b]
ids = get_indexes(perm, chars) = [2, 3, 0, 1]
A possible solution for permutation with repetitions goes as follows:
len = length(perm) (len = 4)
num_chars = length(chars) (num_chars = 4)
base = num_chars ^ len (base = 4 ^ 4 = 256)
base = base / num_chars (base = 256 / 4 = 64)
id = base * ids[0] (id = 64 * 2 = 128)
base = base / num_chars (base = 64 / 4 = 16)
id = id + (base * ids[1]) (id = 128 + (16 * 3) = 176)
base = base / num_chars (base = 16 / 4 = 4)
id = id + (base * ids[2]) (id = 176 + (4 * 0) = 176)
base = base / num_chars (base = 4 / 4 = 1)
id = id + (base * ids[3]) (id = 176 + (1 * 1) = 177)
Reverse process:
id = 177
(id / (4 ^ 3)) % 4 = (177 / 64) % 4 = 2 % 4 = 2 -> chars[2] -> c
(id / (4 ^ 2)) % 4 = (177 / 16) % 4 = 11 % 4 = 3 -> chars[3] -> d
(id / (4 ^ 1)) % 4 = (177 / 4) % 4 = 44 % 4 = 0 -> chars[0] -> a
(id / (4 ^ 0)) % 4 = (177 / 1) % 4 = 177 % 4 = 1 -> chars[1] -> b
The number of possible permutations is given by num_chars ^ num_perm_digits, having num_chars as the number of possible characters, and num_perm_digits as the number of digits in a permutation.
This requires O(1) in space, considering the initial list as a constant cost; and it requires O(N) in time, considering N as the number of digits your permutation will have.
Based on the steps above, you can do:
function identify_permutation(perm, chars) {
    len = length(perm);
    num_chars = length(chars);
    index = 0;
    // Place value of the first (most significant) digit.
    base = num_chars ^ (len - 1);
    for (i = 0; i < len; i++) {
        // Look up each digit on the fly, so no ids array is needed.
        index += base * get_index(perm[i], chars);
        base = base / num_chars;
    }
    return index;
}
It's pseudocode, but it's quite easy to convert to any language (:
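As a rough illustration, the steps above can be sketched in Python. The names encode and decode are mine, and the sketch assumes the sequence has already been mapped to digit indexes. Note that this numbering counts sequences with repetition (num_chars ^ len of them), so the result is not the rank among the N! permutations without repetition.

```python
def encode(ids, num_chars):
    # Treat the sequence as the digits of a number in base num_chars,
    # most significant digit first (Horner's rule, no power table needed).
    index = 0
    for digit in ids:
        index = index * num_chars + digit
    return index

def decode(index, num_chars, length):
    # Peel off digits from the least significant end, then reverse.
    ids = []
    for _ in range(length):
        ids.append(index % num_chars)
        index //= num_chars
    return ids[::-1]

chars = ["a", "b", "c", "d"]
perm = ["c", "d", "a", "b"]
ids = [chars.index(c) for c in perm]
print(encode(ids, len(chars)))                 # 177
print([chars[i] for i in decode(177, 4, 4)])   # ['c', 'd', 'a', 'b']
```

Apart from the input itself, encode keeps only a single running integer.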
If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.
I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
The following tested code will iterate through each unique combination:
public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 10;  // Total number of elements in the set.
    int K = 5;   // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The KIndexes array specifies the indexes for a lexicographic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop through all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the k-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the KIndexes returned can be used to retrieve the
        // rank or lexicographic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
You should be able to port this class over fairly easily to the language of your choice. You probably will not have to port over the generic part of the class to accomplish your goals. Depending on the number of combinations you are working with, you might need to use a bigger word size than 4 byte ints.
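For readers who only want the ranking idea, here is a small Python sketch of computing the lexicographic rank of a combination directly from binomial coefficients. This is not the BinCoeff class's code, and the class may order or index its entries differently; the function name combination_rank and the convention of 0-based, strictly increasing indexes are my assumptions.

```python
from itertools import combinations
from math import comb

def combination_rank(combo, n):
    # combo: strictly increasing indexes drawn from range(n).
    k = len(combo)
    # Number of combinations that come after `combo` in lexicographic order.
    after = sum(comb(n - 1 - c, k - i) for i, c in enumerate(combo))
    return comb(n, k) - 1 - after

# Cross-check against itertools, which yields combinations in lexicographic order.
n, k = 10, 5
ok = all(combination_rank(c, n) == rank
         for rank, c in enumerate(combinations(range(n), k)))
print(ok)
```

The rank is computed with k binomial coefficients and no iteration over the table of combinations.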
There is a Java solution to this problem on geekviewpoint. It has a good explanation of why the approach works, and the code is easy to follow: http://www.geekviewpoint.com/java/numbers/permutation_index. It also has a unit test that runs the code with different inputs.
There are N! permutations, so just representing the index takes about log2(N!) bits, which is at least N bits for N >= 4.
Here is a way to do it if you want to assume that arithmetic operations are constant time:
def permutationIndex(numbers):
    n=len(numbers)
    result=0
    j=0
    while j<n:
        # Determine factor, which is the number of possible permutations of
        # the remaining digits.
        i=1
        factor=1
        while i<n-j:
            factor*=i
            i+=1
        i=0
        # Determine index, which is how many previous digits there were at
        # the current position.
        index=numbers[j]
        while i<j:
            # Only the digits that weren't used so far are valid choices, so
            # the index gets reduced if the number at the current position
            # is greater than one of the previous digits.
            if numbers[i]<numbers[j]:
                index-=1
            i+=1
        # Update the result.
        result+=index*factor
        j+=1
    return result
I've purposefully written out certain calculations that could be done more simply using some Python built-in operations, but I wanted to make it more obvious that no extra non-constant amount of space was being used.
As maxim1000 noted, the number of bits required to represent the result will grow quickly as n increases, so eventually big integers will be required, which no longer have constant-time arithmetic, but I think this code addresses the spirit of your question.
Nothing really new in the idea, but here is a fully matrix-based method with no explicit loop or recursion (using NumPy, but easy to adapt):
import numpy as np
import math
vfact = np.vectorize(math.factorial, otypes='O')
def perm_index(p):
    return np.dot( vfact(range(len(p)-1, -1, -1)),
                   p-np.sum(np.triu(p>np.vstack(p)), axis=0) )
I just wrote some code in Visual Basic, and my program can directly calculate the index of any permutation, or the permutation corresponding to a given index, for up to 17 elements (the limit comes from my compiler switching to approximate scientific notation for numbers above 17!).
If you are interested, I can send the program or publish it somewhere for download.
It works fine and can be useful for testing and comparing the output of your code.
I used James D. McCaffrey's method, called factoradic; you can read about it here and also here (in the discussion at the end of the page).
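To make the factoradic idea concrete, here is a short Python sketch of the reverse direction, from an index to its permutation, using the factorial number system. It is my own illustration, not the Visual Basic program or McCaffrey's code; the function name permutation_from_index is hypothetical, and it uses O(N) extra space for the list of unused elements.

```python
from itertools import permutations

def permutation_from_index(index, n):
    # Place value of the first factoradic digit is (n-1)!.
    factor = 1
    for i in range(1, n):
        factor *= i
    remaining = list(range(n))   # elements not yet placed, in sorted order
    perm = []
    for i in range(n - 1, -1, -1):
        # The next factoradic digit selects among the remaining elements.
        digit, index = divmod(index, factor)
        perm.append(remaining.pop(digit))
        if i > 0:
            factor //= i         # step down to the next smaller factorial
    return perm

# Cross-check against itertools, which yields permutations in lexicographic order.
expected = [list(p) for p in permutations(range(4))]
print(all(permutation_from_index(i, 4) == expected[i] for i in range(24)))
```

Python integers are arbitrary precision, so this sketch does not hit the 17-element limit, although the arithmetic is no longer constant time for large n.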

Resources