Generate number with equal probability - algorithm

You are given a function let’s say bin() which will generate 0 or 1 with equal probability. Now you are given a range of contiguous integers say [a,b] (a and b inclusive).
Write a function say rand() using bin() to generate numbers within range [a,b] with equal probability

The insight you need is that your bin() function returns a single binary digit, or "bit". Invoking it once gives you 0 or 1. If you invoke it twice you get two bits b0 and b1 which can be combined as b1 * 2 + b0, giving you one of 0, 1, 2 or 3 with equal probability. If you invoke it thrice you get three bits b0, b1 and b2. Put them together and you get b2 * 2^2 + b1 * 2 + b0, giving you a member of {0, 1, 2, 3, 4, 5, 6, 7} with equal probability. And so on, as many as you want.
Your range [a, b] has m = b-a+1 values. You just need enough bits to generate a number between 0 and 2^n-1, where n is the smallest value that makes 2^n greater than or equal to m. Then just scale that set to start at a and you're good.
So let's say you are given the range [20, 30]. There are 11 numbers there from 20 to 30 inclusive. 11 is greater than 8 (2^3), but less than 16 (2^4), so you'll need 4 bits. Use bin() to generate four bits b0, b1, b2, and b3. Put them together as x = b3 * 2^3 + b2 * 2^2 + b1 * 2 + b0. You'll get a result, x, between 0 and 15. If x > 10 then discard it and generate another four bits. When x <= 10, your answer is x + 20.
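The bit-combining and re-rolling steps described above can be sketched in Python as follows. This is only an illustration: `coin()` is a hypothetical stand-in for the given bin() (renamed because `bin` is a Python built-in), and `rand_range` is an assumed name for the requested rand().

```python
import random


def coin():
    # Hypothetical stand-in for the given bin(): returns 0 or 1 with equal probability.
    return random.getrandbits(1)


def rand_range(a, b):
    """Return an integer in [a, b], each value equally likely, using only coin()."""
    m = b - a + 1                # number of values in the range
    n = (m - 1).bit_length()     # smallest n with 2**n >= m
    while True:
        x = 0
        for _ in range(n):       # build an n-bit number, one fair bit at a time
            x = (x << 1) | coin()
        if x < m:                # keep x only if it lands in [0, m - 1]
            return a + x
        # otherwise discard all n bits and try again


print(rand_range(20, 30))
```

Because fewer than half of the 2^n candidates are ever rejected, the expected number of attempts is below two.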

Help, but no code:
You can shift the range [0,2 ** n] easily to [a,a+2 ** n]
You can easily produce an equal probability from [0,2**n-1]
If you need a number that isn't a power of 2, just generate a number up to 2 ** n and re-roll if it exceeds the number you need

Subtract the numbers to work out your range:
Decimal: 20 - 10 = 10
Binary : 10100 - 01010 = 1010
Work out how many bits you need to represent this: 4.
For each of these, generate a random 1 or 0:
num_bits = 4
rand[num_bits]
for (x = 0; x < num_bits; ++x)
rand[x] = bin()
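Let's say rand[] = [0,1,0,0] after this. If the generated number is larger than the size of your range (10 here), throw it away and generate the bits again; otherwise add it back to the start of your range.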
Binary: 1010 + 0100 = 1110
Decimal: 10 + 4 = 14

You can always change the range [a,b] to [0,b-a], denote X = b - a. Then you can define a function rand(X) as follows:
function int rand(X) {
    int i = 1;
    // determine how many bits you need (see above answer for why):
    // the smallest i such that 2^i > X
    while (2^i <= X) {
        i++;
    }
    // generate the random numbers
    Boolean cont = true;
    int num = 0;
    while (cont == true) {
        num = 0; // start from scratch on every attempt
        for (j = 0 to i-1) {
            // this generates num in range [0,2^i -1] with equal prob
            // but we need to discard if num is larger than X
            num = num + bin() * 2^j;
        }
        if (num <= X) { cont = false; }
    }
    return num;
}

Related

How to approach and understand a math related DSA question

I found this question online and I really have no idea what the question is even asking. I would really appreciate some help in first understanding the question, and a solution if possible. Thanks!
To see if a number is divisible by 3, you need to add up the digits of its decimal notation, and check if the sum is divisible by 3.
To see if a number is divisible by 11, you need to split its decimal notation into pairs of digits (starting from the right end), add up corresponding numbers and check if the sum is divisible by 11.
For any prime p (except for 2 and 5) there exists an integer r such that a similar divisibility test exists: to check if a number is divisible by p, you need to split its decimal notation into r-tuples of digits (starting from the right end), add up these r-tuples and check whether their sum is divisible by p.
Given a prime int p, find the minimal r for which such divisibility test is valid and output it.
The input consists of a single integer p - a prime between 3 and 999983, inclusive, not equal to 5.
Example
input
3
output
1
input
11
output
2
This is a very cool problem! It uses modular arithmetic and some basic number theory to devise the solution.
Let's say we have p = 11. What divisibility rule applies here? How many digits at once do we need to take, to have a divisibility rule?
Well, let's try a single digit at a time. That would mean, that if we have 121 and we sum its digits 1 + 2 + 1, then we get 4. However we see, that although 121 is divisible by 11, 4 isn't and so the rule doesn't work.
What if we take two digits at a time? With 121 we get 1 + 21 = 22. We see that 22 IS divisible by 11, so the rule might work here. And in fact, it does. For p = 11, we have r = 2.
This requires a bit of intuition which I am unable to convey in text (I really have tried) but it can be proven that for a given prime p other than 2 and 5, the divisibility rule works for tuples of digits of length r if and only if the number 99...9 (with r nines) is divisible by p. And indeed, for p = 3 we have 9 % 3 = 0, while for p = 11 we have 9 % 11 = 9 (this is bad) and 99 % 11 = 0 (this is what we want).
If we want to find such an r, we start with r = 1. We check if 9 is divisible by p. If it is, then we found the r. Otherwise, we go further and we check if 99 is divisible by p. If it is, then we return r = 2. Then, we check if 999 is divisible by p and if so, return r = 3 and so on. However, the 99...9 numbers can get very large. Thankfully, to check divisibility by p we only need to store the remainder modulo p, which we know is small (at least smaller than 999983). So the code in C++ would look something like this:
int r(int p) {
int result = 1;
int remainder = 9 % p;
while (remainder != 0) {
remainder = (remainder * 10 + 9) % p;
result++;
}
return result;
}
I have no idea how they expect a random programmer with no background to figure out the answer from this.
But here is the brief introduction to modulo arithmetic that should make this doable.
In programming, n % k is the modulo operator. It refers to taking the remainder of n / k. It satisfies the following two important properties:
(n + m) % k = ((n % k) + (m % k)) % k
(n * m) % k = ((n % k) * (m % k)) % k
Because of this, for any k we can think of all numbers with the same remainder as somehow being the same. The result is something called "the integers modulo k". And it satisfies most of the rules of algebra that you're used to. You have the associative property, the commutative property, distributive law, addition by 0, and multiplication by 1.
However if k is a composite number like 10, you have the unfortunate fact that 2 * 5 = 10 which means that modulo 10, 2 * 5 = 0. That's kind of a problem for division.
BUT if k = p, a prime, then things become massively easier. If (a*m) % p = (b*m) % p then ((a-b) * m) % p = 0 so (a-b) * m is divisible by p. And therefore either (a-b) or m is divisible by p.
For any non-zero remainder m, let's look at the sequence m % p, m^2 % p, m^3 % p, .... This sequence is infinitely long and can only take on p values. So we must have a repeat where a < b and m^a % p = m^b % p. So (1 * m^a) % p = (m^(b-a) * m^a) % p. Since p doesn't divide m, it doesn't divide m^a either, and therefore m^(b-a) % p = 1. Furthermore m^(b-a-1) % p acts just like m^(-1) = 1/m. (If you take enough math, you'll find that the non-zero remainders under multiplication form a finite group, and all the remainders form a field. But let's ignore that.)
(I'm going to drop the % p everywhere. Just assume it is there in any calculation.)
Now let's let a be the smallest positive number such that m^a = 1. Then 1, m, m^2, ..., m^(a-1) forms a cycle of length a. For any n in 1, ..., p-1 we can form a cycle (possibly the same, possibly different) n, n*m, n*m^2, ..., n*m^(a-1). It can be shown that these cycles partition 1, 2, ..., p-1 where every number is in a cycle, and each cycle has length a. THEREFORE, a divides p-1. As a side note, since a divides p-1, we easily get Fermat's little theorem that m^(p-1) has remainder 1 and therefore m^p = m.
OK, enough theory. Now to your problem. Suppose we have a base b = 10^i. The divisibility test that they are discussing is that a_0 + a_1 * b + a_2 * b^2 + ... + a_k * b^k is divisible by a prime p if and only if a_0 + a_1 + ... + a_k is divisible by p. Looking at (p-1) + b, this can only happen if b % p is 1. And if b % p is 1, then in modulo arithmetic b to any power is 1, and the test works.
So we're looking for the smallest i such that 10^i % p is 1. From what I showed above, i always exists, and divides p-1. So you just need to factor p-1, and try 10 to the power of each divisor of p-1 until you find the smallest i that works.
Note that you should % p at every step you can to keep those powers from getting too big. And with repeated squaring you can speed up the calculation. So, for example, calculating 10^20 % p could be done by calculating each of the following in turn.
10 % p
10^2 % p
10^4 % p
10^5 % p
10^10 % p
10^20 % p
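The divisor-based search described above can be sketched in Python. The function name `min_block_size` is made up for this illustration; the built-in three-argument `pow` performs the modular exponentiation by repeated squaring, and simple trial division is assumed to be fast enough for p below one million.

```python
def min_block_size(p):
    """Smallest r >= 1 with 10**r % p == 1, for a prime p other than 2 and 5."""
    n = p - 1
    # Collect every divisor of p - 1 by trial division up to sqrt(p - 1).
    divisors = set()
    d = 1
    while d * d <= n:
        if n % d == 0:
            divisors.add(d)
            divisors.add(n // d)
        d += 1
    # The answer divides p - 1, so the smallest divisor that works is the answer.
    for r in sorted(divisors):
        if pow(10, r, p) == 1:
            return r


print(min_block_size(3), min_block_size(11), min_block_size(19))
```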
This is an almost direct application of Fermat's little theorem.
First, you have to reformulate the "split decimal notation into tuples [...]"-condition into something you can work with:
to check if a number is divisible by p, you need to split its decimal notation into r-tuples of digits (starting from the right end), add up these r-tuples and check whether their sum is divisible by p
When you translate it from prose into a formula, what it essentially says is that you want
$$\sum_i b_i \cdot 10^{ri} \equiv \sum_i b_i \pmod{p}$$
for any choice of "r-tuples of digits" b_i from { 0, ..., 10^r - 1 } (with only finitely many b_i being non-zero).
Taking b_1 = 1 and all other b_i = 0, it's easy to see that it is necessary that
$$10^r \equiv 1 \pmod{p}$$
It's even easier to see that this is also sufficient (all 10^ri on the left hand side simply transform into factor 1 that does nothing).
Now, if p is neither 2 nor 5, then 10 will not be divisible by p, so that Fermat's little theorem guarantees us that
$$10^{p-1} \equiv 1 \pmod{p}$$
that is, at least the solution r = p - 1 exists. This might not be the smallest such r though, and computing the smallest one is hard if you don't have a quantum computer handy.
Despite it being hard in general, for very small p, you can simply use an algorithm that is linear in p (you simply look at the sequence
10 mod p
100 mod p
1000 mod p
10000 mod p
...
and stop as soon as you find something that equals 1 mod p).
Written out as code, for example, in Scala:
def blockSize(p: Int, n: Int = 10, r: Int = 1): Int =
  if n % p == 1 then r else blockSize(p, n * 10 % p, r + 1)
println(blockSize(3)) // 1
println(blockSize(11)) // 2
println(blockSize(19)) // 18
or in Python:
def blockSize(p: int, n: int = 10, r: int = 1) -> int:
    return r if n % p == 1 else blockSize(p, n * 10 % p, r + 1)
print(blockSize(3)) # 1
print(blockSize(11)) # 2
print(blockSize(19)) # 18
A wall of numbers, just in case someone else wants to sanity-check alternative approaches:
11 -> 2
13 -> 6
17 -> 16
19 -> 18
23 -> 22
29 -> 28
31 -> 15
37 -> 3
41 -> 5
43 -> 21
47 -> 46
53 -> 13
59 -> 58
61 -> 60
67 -> 33
71 -> 35
73 -> 8
79 -> 13
83 -> 41
89 -> 44
97 -> 96
101 -> 4
103 -> 34
107 -> 53
109 -> 108
113 -> 112
127 -> 42
131 -> 130
137 -> 8
139 -> 46
149 -> 148
151 -> 75
157 -> 78
163 -> 81
167 -> 166
173 -> 43
179 -> 178
181 -> 180
191 -> 95
193 -> 192
197 -> 98
199 -> 99
Thank you andrey tyukin.
Simple rules to remember:
1. If x % y = z, then (x % y) % y is again z.
2. (x + y) % z == (x % z + y % z) % z
Keep these in mind.
Now break any number into blocks of r digits, starting from the right. For example, with r = 6, 3456733 = 3 * 10^(6*1) + 456733 * 10^(6*0).
Likewise, 12536382626373 = 12 * 10^(6*2) + 536382 * 10^(6*1) + 626373 * 10^(6*0).
Observe that here r is 6.
So when we say that we split off r digits at a time, sum the blocks, and take the result modulo p, we are really applying the modulo to the coefficients of the breakdown above.
So why does the sum of the coefficients stand in for the whole number?
If 10^(6*k) % p is 1 for every power in the breakdown, then each term has the same remainder as its coefficient alone, so the powers of 10 have no effect. You can check this with rules 1 and 2, together with the analogous rule for products, (x * y) % z == ((x % z) * (y % z)) % z.
It is enough to show that 10^r % p is 1, because then 10^(r*k) % p = (10^r % p)^k % p = 1 for every k.
So the important thing is that 10^r % p equals 1: the whole number then has the same remainder modulo p as the sum of its r-digit blocks.
Conclusion: find the smallest r such that 10^r leaves 1 as remainder when divided by the given prime number.
That also means the length of the smallest 9...9 which is divisible by the given prime number decides r.
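To make the block-splitting rule concrete, here is a small Python sketch. The helper name `block_sum` is invented for this illustration; it only demonstrates the rule on small numbers and is not part of any of the answers above.

```python
def block_sum(n, r):
    """Sum of the r-digit blocks of n, taken from the right end."""
    total = 0
    base = 10 ** r
    while n > 0:
        total += n % base   # rightmost r digits
        n //= base          # drop them and continue with the rest
    return total


# 121 split into 2-digit blocks: 21 + 1 = 22, which is divisible by 11.
print(block_sum(121, 2))
# With r = 1 the rule fails for 11: 1 + 2 + 1 = 4 is not divisible by 11.
print(block_sum(121, 1))
```

The rule works for p = 11 with r = 2 because 10^2 % 11 = 1, and for p = 37 with r = 3 because 10^3 % 37 = 1.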

How to find the count of numbers which are divisible by 7?

Given an integer N, how to efficiently find the count of numbers which are divisible by 7 (their reverse should also be divisible by 7) in the range:
[0, 10^N - 1]
Example:
For N=2, answer:
4 {0, 7, 70, 77}
[All numbers from 0 to 99 which are divisible by 7 (also their reverse is divisible)]
My approach, simple brute-force:
initialize count to zero
run a loop from i=0 till end
if a(i) % 7 == 0 && reverse(a(i)) % 7 == 0, then we increase the count
Note:
reverse(123) = 321, reverse(1200) = 21, for example!
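The brute-force approach from the question can be written in Python as a baseline. It is only practical for small N, but it is handy for checking faster solutions. The function name is made up for this sketch.

```python
def count_brute_force(n_digits):
    """Count x in [0, 10**n_digits - 1] where x and reverse(x) are both divisible by 7."""
    count = 0
    for x in range(10 ** n_digits):
        rev = int(str(x)[::-1])   # reverse(1200) == 21: leading zeros are dropped
        if x % 7 == 0 and rev % 7 == 0:
            count += 1
    return count


print(count_brute_force(2))  # the example from the question: {0, 7, 70, 77}
```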
Let's see what happens mod 7 when we add a digit, d, to a prefix, abc.
10 * abc + d =>
(10 mod 7 * abc mod 7) mod 7 + d mod 7
reversed number:
abc + d * 10^(length(prefix)) =>
abc mod 7 + (d mod 7 * 10^3 mod 7) mod 7
Note that we only need the count of prefixes of abc mod 7 for each such remainder, not the actual prefixes.
Let COUNTS(n,f,r) be the number of n-digit numbers v such that v%7 = f and REVERSE(v)%7 = r
The counts are easy to calculate for n=1:
COUNTS(1,f,r) = 0 when f!=r, since a 1-digit number is the same as its reverse.
COUNTS(1,x,x) = 1 when x >= 3, and
COUNTS(1,x,x) = 2 when x < 3, since 7%7=0, 8%7=1, and 9%7=2
The counts for other lengths can be figured out by calculating what happens when you add each digit from 0 to 9 to the numbers characterized by the previous counts.
At the end, COUNTS(N,0,0) is the answer you are looking for.
In python, for example, it looks like this:
def getModCounts(len):
    counts=[[0]*7 for i in range(0,7)]
    if len<1:
        return counts
    if len<2:
        counts[0][0] = counts[1][1] = counts[2][2] = 2
        counts[3][3] = counts[4][4] = counts[5][5] = counts[6][6] = 1
        return counts
    prevCounts = getModCounts(len-1)
    for f in range(0,7):
        for r in range(0,7):
            c = prevCounts[f][r]
            rplace=(10**(len-1))%7
            for newdigit in range(0,10):
                newf=(f*10 + newdigit)%7
                newr=(r + newdigit*rplace)%7
                counts[newf][newr]+=c
    return counts
def numFwdAndRevDivisible(len):
    return getModCounts(len)[0][0]
#TEST
for i in range(0,20):
    print("{0} -> {1}".format(i, numFwdAndRevDivisible(i)))
See if it gives the answers you're expecting. If not, maybe there's a bug I need to fix:
0 -> 0
1 -> 2
2 -> 4
3 -> 22
4 -> 206
5 -> 2113
6 -> 20728
7 -> 205438
8 -> 2043640
9 -> 20411101
10 -> 204084732
11 -> 2040990205
12 -> 20408959192
13 -> 204085028987
14 -> 2040823461232
15 -> 20408170697950
16 -> 204081640379568
17 -> 2040816769367351
18 -> 20408165293673530
19 -> 204081641308734748
This is a pretty good answer when counting up to N is reasonable -- way better than brute force, which counts up to 10^N.
For very long lengths like N=10^18 (you would probably be asked for the count mod 1000000007 or something), there is a next-level answer.
Note that there is a linear relationship between the counts for length n and the counts for length n+1, and that this relationship can be represented by a 49x49 matrix. You can exponentiate this matrix to the Nth power using exponentiation by squaring in O(log N) matrix multiplications, and then just multiply by the single digit counts to get the length N counts.
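A possible sketch of the matrix idea in Python follows. One detail is an assumption of this sketch rather than something stated above: appending a digit changes the reversed remainder by a factor that depends on the current length, so the sketch tracks the reversed remainder divided by 10^length modulo 7 (using 5, the inverse of 10 modulo 7). That makes the 49x49 transition the same at every step, and the rescaled remainder is 0 exactly when the real one is. All names are hypothetical.

```python
def mat_mul(a, b, mod):
    """Multiply two square matrices modulo mod."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if a[i][k]:
                aik = a[i][k]
                for j in range(n):
                    out[i][j] = (out[i][j] + aik * b[k][j]) % mod
    return out


def count_fwd_rev_div7(length, mod=1_000_000_007):
    """Count length-digit strings whose value and reversed value are both divisible by 7."""
    # State index = f * 7 + s: f is the value mod 7, s is the rescaled reversed value mod 7.
    size = 49
    step = [[0] * size for _ in range(size)]
    for f in range(7):
        for s in range(7):
            for d in range(10):
                nf = (10 * f + d) % 7
                ns = (5 * (s + d)) % 7   # 5 is the inverse of 10 modulo 7
                step[nf * 7 + ns][f * 7 + s] += 1
    # Exponentiation by squaring: result = step ** length.
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    power = step
    n = length
    while n > 0:
        if n & 1:
            result = mat_mul(result, power, mod)
        power = mat_mul(power, power, mod)
        n >>= 1
    # Paths from the empty-string state (0, 0) back to state (0, 0).
    return result[0][0]


print(count_fwd_rev_div7(2))
```

Each matrix product costs about 49^3 multiplications, so even N = 10^18 needs only around 120 products.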
There is a recursive solution using digit dp technique for any digits.
long long call(int pos , int Mod ,int revMod){
if(pos == len ){
if(!Mod && !revMod)return 1;
return 0;
}
if(dp[pos][Mod][revMod] != -1 )return dp[pos][Mod][revMod] ;
long long res =0;
for(int i= 0; i<= 9; i++ ){
int revValue =(base[pos]*i + revMod)%7;
int curValue = (Mod*10 + i)%7;
res += call(pos+1, curValue,revValue) ;
}
return dp[pos][Mod][revMod] = res ;
}

How can I find a permutation of all the digits of a given number such that it is closest to the target number

I just came across this interesting question in a book and I am unable to find the answer.
I have a given number X and a target number Y, task is to find such permutation of all the digits of X such that it is closest to Y.
Numbers are in form of array. No array size limit is given there.
Example
Given number X = 1212
Target number Y = 1500
Answer = 1221
Here, abs(1500-1221) is smallest among all permutations of X.
Given number X = 1212
Target number Y = 1900
Answer = 2112
Here, abs(1900-2112) is smallest among all permutations of X.
Given number X = 1029
Target number Y = 2000
Answer = 2019
Here, abs(2000-2019) is smallest among all permutations of X.
One solution I can find is to generate all permutations of the given number and calculate the difference for each one. But this is very slow.
I tried to find the greedy approach, where I will iterate through all the indices of the target number Y and at each index I will put that digit of the given number X such that abs(Y[i] - X[i]) is minimum. But this fails for many cases.
I am trying to think of a DP approach, but unable to come up with any.
Any lead to the answer will be helpful.
Edit -
Adding pseudo code for my greedy approach
for each index i in [0, Y.length):
    min_index = 0;
    for each index j in [1, X.length):
        if abs(X[j] - Y[i]) < abs(X[min_index] - Y[i]):
            min_index = j
    print X[min_index]
    remove min_index from X
Example X = 1212 and Y = 1900.
step 1 - output 1 and remove index 0 from X.
step 2 - output 2 and remove index 1 from X.
step 3 - output 1 and remove index 2 from X.
step 4 - output 2 and remove index 3 from X.
answer = 1212 which is wrong (correct answer is 2112).
So fails for this test case and lots more.
So, the problem can be seen as follow:
Starting from the largest significant digits, for each of these index, there are three cases:
The current digit will be less than the desired digit, so for the rest of the digits, we try to create the largest number possible => for the rest of the digits, we sorted them in descending order , i.e if we have 0, 2, 7, 5 left -> we will create 7520
The current digit will be larger than the desired digit, so for the rest of the digits, we try to create the smallest number possible => for the rest of the digits, we sorted them in ascending order , i.e if we have 0, 2, 7, 5 left -> we will create 0275
If the current digit is equal to the desired digit, we will append it to the prefix and try to find better match in next iteration.
Pseudo-code:
int prefix = 0, result;
for each index i from 0 to Y.length() - 1 {
    int larger = prefix + smallestDigitLargerThan(Y(i)) + OtherDigitInAscendingOrder;
    int smaller = prefix + largestDigitSmallerThan(Y(i)) + OtherDigitInDescendingOrder;
    update result based on larger and smaller;
    if there is no digit equals to Y(i)
        break;
    else {
        remove Y(i) in X
        prefix = prefix*10 + Y(i)
    }
}
if prefix == Y {
    //We have a full match
    return prefix;
}
return result;
For example
X = 1029
Y = 2000
At index 0 -> Y(0) = 2,
int smaller = 0 (prefix) + 1(largest digit that is less than 2) + 920 (other digit in descending order) = 1920
int larger = 0 (prefix) + 9(smallest digit that is greater than 2) + 012 (other digit in ascending order) = 9012
int result = 1920
int prefix = 2
At index 1 -> Y(1) = 0,
int smaller = //Not exist
int larger = 2 + 1 + 09 = 2109
int result = 1920
int prefix = 20
At index 2 -> Y(2) = 0,
int smaller = //Not exist
int larger = 20 + 1 + 9 = 2019
int result = 2019
//Break as there is no digit match Y(2) = 0 from X
Other example:
X = 1212
Y = 1500
At index 0 -> Y(0) = 1,
int smaller = //Not exist
int larger = 0 + 2 + 112 = 2112
int result = 2112
int prefix = 1
At index 1 -> Y(1) = 5,
int smaller = 1 + 2 + 21 = 1221
int larger = //Not exist
int result = 1221
//Break from here as there is no digit match Y(1) = 5 in X
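The walkthroughs above can be turned into a short Python sketch. It assumes X and Y are given as digit strings of the same length and that a permutation may start with 0; the function name is invented for this illustration, and ties are resolved in favour of the first candidate found.

```python
def closest_permutation(x, y):
    """Permutation of the digits of x (as a string) whose value is closest to y."""
    remaining = sorted(x)        # unused digits, ascending
    candidates = []
    prefix = ""
    for target in y:
        bigger = [d for d in remaining if d > target]
        smaller = [d for d in remaining if d < target]
        if bigger:
            # Go just above y here, then make the tail as small as possible.
            rest = list(remaining)
            rest.remove(bigger[0])
            candidates.append(prefix + bigger[0] + "".join(rest))
        if smaller:
            # Go just below y here, then make the tail as large as possible.
            rest = list(remaining)
            rest.remove(smaller[-1])
            candidates.append(prefix + smaller[-1] + "".join(reversed(rest)))
        if target not in remaining:
            break                # the prefix cannot follow y any further
        remaining.remove(target)
        prefix += target
    else:
        candidates.append(prefix)  # every digit matched: x can form y exactly
    return min(candidates, key=lambda c: abs(int(c) - int(y)))


print(closest_permutation("1212", "1500"))
print(closest_permutation("1029", "2000"))
```

Sorting once and scanning the remaining digits at each position keeps the work at roughly O(n^2) for n digits, compared with n! for trying every permutation.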
Beam search with a width of 3 could be an approach. The idea is to construct the numbers from the most significant to the least significant digit, filling the rest with zeros. At each step you construct the nearest and the second nearest numbers for each number in the beam, and discard all numbers which are worse than the top three. (In fact you need a beam size of two at most. The case of three is only needed if the distances of two entries in the beam are equal.) During computation the constructed numbers A and B should never be equal (except for the special case that X only contains the same digit.)
Here are the beams for the second example. The * denotes the best beam, and no * means that both are equally good:
2000* -> 2100* -> 2112*
2200 -> 2211
1000 -> 1200
1100
This is for the first example:
1000 -> 1200* -> 1221*
1100 -> 1122
2000 -> 2100
2200
Third example needs a beam size of 3 for second step, because the distance of second best beams 1900 and 2100 to 2000 is 100:
1000 -> 1900 -> 1901
1100
2000* -> 2000* -> 2019*
2100 2109
Note: I've joined the 3. and the 4. step in all examples.
The numbers X = 1992 and Y = 2000 are an interesting example
1000 -> 1900 -> 1992*
1200
2000* -> 2100 -> 2199
2900
because the best beam is changing during computation.
I wrote a small python program for demonstration:
import sys
X = sys.argv[1]
Y = int(sys.argv[2])
def remove(s, i):
    return s[:i] + s[i+1:]
def expand(t):
    result = set()
    val = t[0]
    chars = t[1]
    index = len(val) - len(chars)
    for i in range(len(chars)):
        s = val[:index] + chars[i]
        r = remove(chars, i)
        if index < len(val):
            s += val[index + 1:]
        result.add((s, r))
    return result
beams = [("0" * len(X), X)]
for i in range(len(X)):
    newBeams = set()
    for t in beams:
        newBeams.update(expand(t))
    beams = sorted(newBeams, key = lambda t: abs(Y - int(t[0])))[:3]
    print(beams)
print("Result:", beams[0][0])
The code is not optimal but this algorithm has polynomial running time, O(n² ln n) at most, and this estimate is very generous.

How to check divisibility of a number not in base 10 without converting?

Let's say I have a number of base 3, 1211. How could I check this number is divisible by 2 without converting it back to base 10?
Update
The original problem is from TopCoder
The digits 3 and 9 share an interesting property. If you take any multiple of 3 and sum its digits, you get another multiple of 3. For example, 118*3 = 354 and 3+5+4 = 12, which is a multiple of 3. Similarly, if you take any multiple of 9 and sum its digits, you get another multiple of 9. For example, 75*9 = 675 and 6+7+5 = 18, which is a multiple of 9. Call any digit for which this property holds interesting, except for 0 and 1, for which the property holds trivially.
A digit that is interesting in one base is not necessarily interesting in another base. For example, 3 is interesting in base 10 but uninteresting in base 5. Given an int base, your task is to return all the interesting digits for that base in increasing order. To determine whether a particular digit is interesting or not, you need not consider all multiples of the digit. You can be certain that, if the property holds for all multiples of the digit with fewer than four digits, then it also holds for multiples with more digits. For example, in base 10, you would not need to consider any multiples greater than 999.
Notes
- When base is greater than 10, digits may have a numeric value greater than 9. Because integers are displayed in base 10 by default, do not be alarmed when such digits appear on your screen as more than one decimal digit. For example, one of the interesting digits in base 16 is 15.
Constraints
- base is between 3 and 30, inclusive.
This is my solution:
class InterestingDigits {
public:
vector<int> digits( int base ) {
vector<int> temp;
for( int i = 2; i <= base; ++i )
if( base % i == 1 )
temp.push_back( i );
return temp;
}
};
The trick was well explained here : https://math.stackexchange.com/questions/17242/how-does-base-of-a-number-relate-to-modulos-of-its-each-individual-digit
Thanks,
Chan
If your number k is in base three, then you can write it as
k = a0 3^n + a1 3^{n-1} + a2 3^{n-2} + ... + an 3^0
where a0, a1, ..., an are the digits in the base-three representation.
To see if the number is divisible by two, you're interested in whether the number, modulo 2, is equal to zero. Well, k mod 2 is given by
k mod 2 = (a0 3^n + a1 3^{n-1} + a2 3^{n-2} + ... + an 3^0) mod 2
= (a0 3^n) mod 2 + (a1 3^{n-1}) mod 2 + ... + an (3^0) mod 2
= (a0 mod 2) (3^n mod 2) + ... + (an mod 2) (3^0 mod 2)
The trick here is that 3^i = 1 (mod 2), so this expression is
k mod 2 = (a0 mod 2) + (a1 mod 2) + ... + (an mod 2)
In other words, if you sum up the digits of the ternary representation and get that this value is divisible by two, then the number itself must be divisible by two. To make this even cooler, since the only ternary digits are 0, 1, and 2, this is equivalent to asking whether the number of 1s in the ternary representation is even!
More generally, though, if you have a number in base m, then that number is divisible by m - 1 iff the sum of the digits is divisible by m - 1. This is why you can check if a number in base 10 is divisible by 9 by summing the digits and seeing if that value is divisible by nine.
You can always build a finite automaton for any base and any divisor:
Normally to compute the value n of a string of digits in base b
you iterate over the digits and do
n = (n * b) + d
for each digit d.
Now if you are interested in divisibility you do this modulo m instead:
n = ((n * b) + d) % m
Here n can take at most m different values. Take these as states of a finite automaton, and compute the transitions depending on the digit d according to that formula. The accepting state is the one where the remainder is 0.
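The remainder-tracking loop above translates directly into Python. This is a sketch with made-up function names; it relies on the built-in `int(ch, base)` to read a single digit, which limits it to bases up to 36.

```python
def remainder_in_base(digits, base, m):
    """Remainder modulo m of the number written as `digits` in the given base."""
    n = 0
    for ch in digits:
        d = int(ch, base)          # value of one digit character
        n = (n * base + d) % m     # n never leaves the range 0..m-1
    return n


def transition_table(base, m):
    """Automaton transitions: (state, digit) -> next state."""
    return {(n, d): (n * base + d) % m for n in range(m) for d in range(base)}


print(remainder_in_base("1211", 3, 2))   # 1211 in base 3 is 49, which is odd
print(transition_table(3, 2))
```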
For your specific case we have
n == 0, d == 0: n = ((0 * 3) + 0) % 2 = 0
n == 0, d == 1: n = ((0 * 3) + 1) % 2 = 1
n == 0, d == 2: n = ((0 * 3) + 2) % 2 = 0
n == 1, d == 0: n = ((1 * 3) + 0) % 2 = 1
n == 1, d == 1: n = ((1 * 3) + 1) % 2 = 0
n == 1, d == 2: n = ((1 * 3) + 2) % 2 = 1
which shows that you can just sum the digits 1 modulo 2 and ignore any digits 0 or 2.
Add all the digits together (or even just count the ones) - if the answer is odd, the number is odd; if it's even, the number is even.
How does that work? Each digit from the number contributes 0, 1 or 2 times (1, 3, 9, 27, ...). A 0 or a 2 adds an even number, so no effect on the oddness/evenness (parity) of the number as a whole. A 1 adds one of the powers of 3, which is always odd, and so flips the parity. And we start from 0 (even). So by counting whether the number of flips is odd or even we can tell whether the number itself is.
I'm not sure on what CPU you have a number in base-3, but the normal way to do this is to perform a modulus/remainder operation.
if (n % 2 == 0) {
// divisible by 2, so even
} else {
// odd
}
How to implement the modulus operator is going to depend on how you're storing your base-3 number. The simplest to code will probably be to implement normal pencil-and-paper long division, and get the remainder from that.
0 2 2 0
_______
2 ⟌ 1 2 1 1
0
---
1 2
1 1
-----
1 1
1 1
-----
0 1 <--- remainder = 1 (so odd)
(This works regardless of base, there are "tricks" for base-3 as others have mentioned)
Same as in base 10, for your example:
1. Find the largest multiple of 2 that's <= 1211, that's 1210 (see below how to achieve it)
2. Subtract 1210 from 1211, you get 1
3. The remainder 1 is not 0, thus 1211 isn't divisible by 2
how to achieve 1210:
1. starts with 2
2. 2 + 2 = 11
3. 11 + 2 = 20
4. 20 + 2 = 22
5. 22 + 2 = 101
6. 101 + 2 = 110
7. 110 + 2 = 112
8. 112 + 2 = 121
9. 121 + 2 = 200
10. 200 + 2 = 202
... // repeat until you get the biggest number <= 1211
It's basically the same as base 10; it's just that the carry happens at 3 instead of 10.

Compress two or more numbers into one byte

I think this is not really possible but worth asking anyway. Say I have two small numbers (Each ranges from 0 to 11). Is there a way that I can compress them into one byte and get them back later. How about with four numbers of similar sizes.
What I need is something like: a1 + a2 = x. I only know x and from that get a1, a2
For the second part: a1 + a2 + a3 + a4 = x. I only know x and from that get a1, a2, a3, a4
Note: I know you cannot unadd, just illustrating my question.
x must be one byte. a1, a2, a3, a4 range [0, 11].
That's trivial with bit masks. The idea is to divide the byte into smaller units and dedicate them to different elements.
For 2 numbers, it can be like this: first 4 bits are number1, rest are number2. You would use number1 = (x & 0b11110000) >> 4, number2 = (x & 0b00001111) to retrieve values, and x = (number1 << 4) | number2 to compress them.
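As a quick illustration of the masks and shifts just described, here is a minimal Python sketch. The function names are invented, and the range check is an addition of this sketch so that a value above 15 cannot spill into the other half of the byte.

```python
def pack2(number1, number2):
    """Pack two values in 0..15 into one byte: high 4 bits, then low 4 bits."""
    if not (0 <= number1 <= 15 and 0 <= number2 <= 15):
        raise ValueError("each value must fit in 4 bits")
    return (number1 << 4) | number2


def unpack2(x):
    """Recover the two 4-bit values from a packed byte."""
    return (x & 0b11110000) >> 4, x & 0b00001111


packed = pack2(11, 7)
print(packed, unpack2(packed))
```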
For two numbers, sure. Each one has 12 possible values, so the pair has a total of 12^2 = 144 possible values, and that's less than the 256 possible values of a byte. So you could do e.g.
x = 12*a1 + a2
a1 = x / 12
a2 = x % 12
(If you only have signed bytes, e.g. in Java, it's a little trickier)
For four numbers from 0 to 11, there are 12^4 = 20736 values, so you couldn't fit them in one byte, but you could do it with two.
x = 12^3*a1 + 12^2*a2 + 12*a3 + a4
a1 = x / 12^3
a2 = (x / 12^2) % 12
a3 = (x / 12) % 12
a4 = x % 12
EDIT: the other answers talk about storing one number per four bits and using bit-shifting. That's faster.
The 0-11 example is pretty easy -- you can store each number in four bits, so putting them into a single byte is just a matter of shifting one 4 bits to the left, and oring the two together.
Four numbers of similar sizes won't fit -- four bits apiece times four gives a minimum of 16 bits to hold them.
Let's say it in general: suppose you want to mix N numbers a1, a2, ... aN, a1 ranging from 0..k1-1, a2 from 0..k2-1, ... and aN from 0 .. kN-1.
Then, the encoded number is:
encoded = a1 + k1*a2 + k1*k2*a3 + ... k1*k2*..*k(N-1)*aN
The decoding is then more tricky, stepwise:
rest = encoded
a1 = rest mod k1
rest = rest div k1
a2 = rest mod k2
rest = rest div k2
...
a(N-1) = rest mod k(N-1)
rest = rest div k(N-1)
aN = rest # rest is already < kN
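The general encoding and the stepwise decoding above could look like this in Python. The names `encode` and `decode` and the list-based interface are assumptions of this sketch.

```python
def encode(values, radices):
    """Combine values[i] in 0..radices[i]-1 into one integer, values[0] least significant."""
    encoded = 0
    # Horner's scheme from the last value down gives a1 + k1*a2 + k1*k2*a3 + ...
    for a, k in zip(reversed(values), reversed(radices)):
        if not 0 <= a < k:
            raise ValueError("value out of range for its radix")
        encoded = encoded * k + a
    return encoded


def decode(encoded, radices):
    """Invert encode(): peel off one value per radix with mod and integer division."""
    values = []
    rest = encoded
    for k in radices:
        values.append(rest % k)
        rest //= k
    return values


x = encode([3, 11], [12, 12])     # two numbers in 0..11 fit in one byte
print(x, decode(x, [12, 12]))
```

With four radices of 12 the largest encoded value is 12^4 - 1 = 20735, which needs two bytes.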
If the numbers 0-11 aren't evenly distributed you can do even better by using shorter bit sequences for common values and longer ones for rarer values. It costs at least one bit to code which length you are using so there is a whole branch of CS devoted to proving when it's worth doing.
So a byte can hold up to 256 values (0x00 to 0xFF in hex). So you can encode two numbers from 0-15 in a byte.
byte a1 = 0xf;
byte a2 = 0x9;
byte compress = a1 << 4 | (0x0F & a2); // should yield 0xf9 in one byte.
4 numbers you can do if you reduce them to only the 0-3 range (2 bits each).
Since a single byte is 8 bits, you can easily subdivide it, with smaller ranges of values. The extreme limit of this is when you have 8 single bit integers, which is called a bit field.
If you want to store two 4-bit integers (which gives you 0-15 for each), you simply have to do this:
value = a * 16 + b;
As long as you do proper bounds checking, you will never lose any information here.
To get the two values back, you just have to do this:
a = floor(value / 16)
b = value MOD 16
MOD is modulus, it's the "remainder" of a division.
If you want to store four 2-bit integers (0-3), you can do this:
value = a * 64 + b * 16 + c * 4 + d
And, to get them back:
a = floor(value / 64)
b = floor(value / 16) MOD 4
c = floor(value / 4) MOD 4
d = value MOD 4
I leave the last division as an exercise for the reader ;)
@Mike Caron
Your last example (4 integers between 0-3) is much faster with bit-shifting. No need for floor().
value = (a << 6) | (b << 4) | (c << 2) | d;
a = (value >> 6);
b = (value >> 4) % 4;
c = (value >> 2) % 4;
d = (value) % 4;
Use bit masking or bit shifting. The latter is faster.
Test out BinaryTrees for some fun. (It will be handy later on in dev life regarding data and all sorts of dev voodoo lol)
Packing four values into one number will require at least 15 bits. This doesn't fit in a single byte, but in two.
What you need to do is a conversion from base 12 to base 65536 and conversely.
B = A1 + 12.(A2 + 12.(A3 + 12.A4))
A1 = B % 12
A2 = (B / 12) % 12
A3 = (B / 144) % 12
A4 = B / 1728
As this takes 2 bytes anyway, conversion from base 12 to (packed) base 16 is by far preferable.
B1 = A1 + 16.A2
B2 = A3 + 16.A4
A1 = B1 % 16
A2 = B1 / 16
A3 = B2 % 16
A4 = B2 / 16
The modulos and divisions are implemented by maskings and shifts.
0-9 works much more easily. You can easily store 11 random order decimals in 4 1/2 bytes. Which is tighter compression than log(256)÷log(10). Just by creative mapping. Remember that not all compression has to do with dictionaries, redundancies, or sequences.
If you are talking of random numbers 0 - 9 you can have 4 digits per 14 bits not 15.
