Edge case for finding strictly increasing squares of a number - algorithm

I'm trying to solve this codewars kata, Square into Squares.
I'm passing most of the tests, but there are two inputs for which my algorithm exceeds the maximum call stack size.
I feel like I'm taking care of all the edge conditions, and I can't figure out what I'm missing.
function sumSquares (n) {
function decompose (num, whatsLeft, result) {
if (whatsLeft === 0) return result
else {
if (whatsLeft < 0 || num === 0) return null
else {
return decompose(num-1, whatsLeft - num * num, [num].concat(result)) || decompose(num-1, whatsLeft, result)
}
}
}
return decompose(n-1, n*n, [])
}
const resA = sumSquares(50) //[1,3,5,8,49]
console.log(resA)
const resB = sumSquares(7) //[2,3,6]
console.log(resB)
const resC = sumSquares(11) //[ 1, 2, 4, 10 ]
console.log(resC)
const res1 = sumSquares(90273)
console.log(res1)
const res2 = sumSquares(123456)
console.log(res2)

It looks like your code is correct, but has two problems: first, your call stack will eventually reach size "num" (which may be causing your failure for large inputs), and second, it may recompute the same values multiple times.
The first problem is easy to fix: you can skip num values which give a negative whatsLeft result. Like this:
while(num * num > whatsLeft) num = num - 1;
You can insert this after the first if statement. This also enables you to remove the check for negative whatsLeft. As a matter of style, I removed the else{} cases for your if statements after a return -- this reduces the indentation and (I think) makes the code easier to read. But that's just a matter of personal taste.
function sumSquares (n) {
function decompose (num, whatsLeft, result) {
if (whatsLeft === 0) return result;
while (num * num > whatsLeft) num -= 1;
if (num === 0) return null;
return decompose(num-1, whatsLeft - num * num, [num].concat(result)) || decompose(num-1, whatsLeft, result);
}
return decompose(n-1, n*n, []);
}
Your test cases run instantly for me with these changes, so the second problem (which would be solved by memoization) isn't necessary to address. I also tried submitting it on the codewars site, and with a little tweaking (the outer function needs to be called decompose, so both the outer and inner functions need renaming), all 113 test cases pass in 859ms.
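When trying variations of the algorithm it helps to check results mechanically rather than by eye. The following is a minimal sketch of such a check; `isValidDecomposition` is a hypothetical helper name, not part of the kata, and it assumes the answer is an array of integers in ascending order.

```javascript
// Hypothetical helper (not part of the kata): verifies a candidate answer.
function isValidDecomposition(n, parts) {
  if (!Array.isArray(parts) || parts.length === 0) return false;
  let sum = 0;
  for (let i = 0; i < parts.length; i++) {
    // Every term must be a positive integer smaller than n ...
    if (!Number.isInteger(parts[i]) || parts[i] < 1 || parts[i] >= n) return false;
    // ... and strictly larger than the term before it.
    if (i > 0 && parts[i] <= parts[i - 1]) return false;
    sum += parts[i] * parts[i];
  }
  // The squares must add up to n squared.
  return sum === n * n;
}

console.log(isValidDecomposition(50, [1, 3, 5, 8, 49])); // true
console.log(isValidDecomposition(50, [1, 1, 4, 9, 49])); // false (1 is repeated)
```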

@PaulHankin's answer offers good insight
Let's look at sumSquares (n) where n = 100000
decompose (1e5 - 1, 1e5 * 1e5, ...)
In the first frame,
num = 99999
whatsLeft = 10000000000
Which spawns
decompose (99999 - 1, 1e10 - 99999 * 99999, ...)
Where the second frame is
num = 99998
whatsLeft = 199999
And here's the problem: num * num is significantly larger than whatsLeft, yet each time we recur to try a new num we only decrease it by 1 per frame. Without fixing anything, the next process spawned will be
decompose (99998 - 1, 199999 - 99998 * 99998, ...)
Where the third frame is
num = 99997
whatsLeft = -9999400005
See how whatsLeft is significantly negative? It means we'll have to decrease num by a lot before the next value doesn't cause whatsLeft to drop below zero
// [frame #4]
num = 99996
whatsLeft = -9999200010
// [frame #5]
num = 99995
whatsLeft = -9999000017
...
// [frame #99553]
num = 447
whatsLeft = -705
// [frame #99554]
num = 446
whatsLeft = 190
As we can see above, it takes almost 100000 frames just to find the second number of sumSquares (100000). This is exactly what Paul Hankin describes as your first problem.
We can also visualize it a little easier if we only look at decompose with num. Below, if a solution cannot be found, the stack will grow to size num, so it cannot be used to compute solutions where num exceeds the stack limit
// imagine num = 100000
function decompose (num, ...) {
...
decompose (num - 1 ...) || decompose (num - 1, ...)
}
Paul's solution uses a while loop to decrement num until it is small enough. Another solution would involve calculating the next guess by finding the square root of the remaining whatsLeft
const sq = num * num
const next = whatsLeft - sq
const guess = Math.floor (Math.sqrt (next))
return decompose (guess, next, ...) || decompose (num - 1, whatsLeft, ...)
Now it can be used to calculate values where num is huge
console.log (sumSquares(123456))
// [ 1, 2, 7, 29, 496, 123455 ]
But notice there's a bug for certain inputs. The squares of the solution still sum to the correct amount, but it's allowing some numbers to be repeated
console.log (sumSquares(50))
// [ 1, 1, 4, 9, 49 ]
To enforce the strictly increasing requirement, we have to ensure that a calculated guess is still lower than the previous. We can do that using Math.min
// before: const guess = Math.floor (Math.sqrt (next))
const guess = Math.min (num - 1, Math.floor (Math.sqrt (next)))
Now the bug is fixed
console.log (sumSquares(50))
// before: [ 1, 1, 4, 9, 49 ]
// after:  [ 1, 3, 5, 8, 49 ]
Full program demonstration
function sumSquares (n) {
function decompose (num, whatsLeft, result) {
if (whatsLeft === 0)
return result;
if (whatsLeft < 0 || num === 0)
return null;
const sq = num * num
const next = whatsLeft - sq
const guess = Math.min (num - 1, Math.floor (Math.sqrt (next)))
return decompose(guess, next, [num].concat(result)) || decompose(num-1, whatsLeft, result);
}
return decompose(n-1, n*n, []);
}
console.log (sumSquares(50))
// [ 1, 3, 5, 8, 49 ]
console.log (sumSquares(123456))
// [ 1, 2, 7, 29, 496, 123455 ]

Related

How many PR numbers exist in a given range?

It is not a homework problem. I am just curious about this problem. And my approach is simple brute-force :-)
My brute-force C++ code:
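#include <algorithm> // __gcd (GCC extension)
#include <iostream>
#include <string>
using namespace std;
typedef long long ll;
int main()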
{
ll l,r;
cin>>l>>r;
ll f=0;
ll i=l;
while(i<=r)
{
ll j=0;
string s;
ll c=0;
s=to_string(i);
// cout<<s<<" ";
ll x=s.length();
if(x==1)
{
c=0;
}
else
{
j=0;
//whil
while(j<=x-2)
{
string b,g;
b="1";
g="1";
b=s[j];
g=s[j+1];
ll k1,k2;
k1=stoi(b);
k2=stoi(g);
if(__gcd(k1,k2)==1)
{
c=1;
break;
}
j++;
}
}
ll d=0;
j=0;
while(j<=x-1)
{
if( s[j]=='2' || s[j]=='3' || s[j]=='5' || s[j]=='7')
{
string b;
b="1";
b=s[j];
ll k1=stoi(b);
if(i%k1==0)
{
//d=0;
}
else
{
d=1;
break;
}
}
j++;
}
if(c==1 || d==1)
{
// cout<<"NO";
}
else
{
f++;
// cout<<"PR";
}
// cout<<"\n";
i++;
}
cout<<f;
return 0;
}
You are given 2 integers 'L' and 'R'. You are required to find the count of all the PR numbers in the range 'L' to 'R' inclusive. PR numbers are the numbers which satisfy the following properties:
No pair of adjacent digits are co-prime i.e. adjacent digits in a PR number will not be co-prime to each other.
PR number is divisible by all the single digit prime numbers which occur as a digit in the PR number.
Note: Two numbers 'a' and 'b' are co-prime, if gcd(a,b)=1.
Also, gcd(0,a)=a;
Example:
Input: [2,5].
Output: '4'.
(Note: '1' is not a prime number, though it is very commonly mistaken for one)
(All the integers: '2','3','4','5') satisfy the condition of PR numbers :-)
Constraints on 'L','R': 1 <= L, R <= 10^18
What would be the most efficient algorithm to solve this?
Note: This will solve only part 1 which is No pair of adjacent digits are co-prime i.e. adjacent digits in a PR number will not be co-prime to each other.
Here is a constructive approach in Python: instead of going through all numbers in the range and filtering by the conditions, we just construct all numbers that satisfy the condition. Note that if we have a valid sequence of digits, only the rightmost digit matters when deciding what the next digit can be.
def ways(max_number, prev_digit, current_number):
    if current_number > max_number:
        return 0
    count = 1
    if prev_digit == 0:
        if current_number != 0:
            count += ways(max_number, 0, current_number * 10)
        for i in range(2, 10):
            count += ways(max_number, i, current_number * 10 + i)
    if prev_digit == 2 or prev_digit == 4 or prev_digit == 8:
        for i in [0, 2, 4, 6, 8]:
            count += ways(max_number, i, current_number * 10 + i)
    if prev_digit == 3 or prev_digit == 9:
        for i in [0, 3, 6, 9]:
            count += ways(max_number, i, current_number * 10 + i)
    if prev_digit == 5 or prev_digit == 7:
        count += ways(max_number, 0, current_number * 10)
        count += ways(max_number, prev_digit, current_number * 10 + prev_digit)
    if prev_digit == 6:
        for i in [0, 2, 3, 4, 6, 8, 9]:
            count += ways(max_number, i, current_number * 10 + i)
    return count
As we are generating all valid numbers up to max_number without any repeats, the complexity of this function is O(amount of numbers between 0 and max_number that satisfy condition 1). To calculate the count for the range a to b, we just need to do ways(b, 0, 0) - ways(a - 1, 0, 0).
It takes less than 1 second to calculate these numbers from 0 to 1 million, as there are only 42935 numbers that satisfy the condition. As there are few numbers that satisfy the condition, we can then check whether each one is a multiple of its prime digits to also satisfy condition 2. I leave this part up to the reader as there are multiple ways to do it.
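One possible way to do the condition 2 check that the answer leaves open is sketched below. The function name `divisibleByOwnPrimeDigits` is made up for illustration, and the sketch assumes the candidate fits in a regular JavaScript number (below 2^53); for the full 10^18 range it would need BigInt.

```javascript
// Condition 2: the number must be divisible by every single-digit prime
// (2, 3, 5, 7) that occurs among its digits.
// Assumption: n is a non-negative integer below 2^53.
function divisibleByOwnPrimeDigits(n) {
  const primeDigits = [2, 3, 5, 7];
  for (const ch of String(n)) {
    const d = Number(ch);
    if (primeDigits.includes(d) && n % d !== 0) return false;
  }
  return true;
}

console.log(divisibleByOwnPrimeDigits(36)); // true  (36 is divisible by 3)
console.log(divisibleByOwnPrimeDigits(25)); // false (25 is not divisible by 2)
```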
TL;DR: This is more commonly called "digit dynamic programming with bitmask"
In more competitive-programming-familiar terms, you'd compute dp[n_digit][mod_2357][is_less_than_r][digit_appeared][last_digit] = the number of numbers with n_digit digits (including leading zeroes) that are less than the number formed by the first n_digit digits of R and whose other properties match the indices. Do it twice, with R and L-1, then take the difference. The number of operations required would be about 19 (number of digits) * 210 (mod) * 2 * 2^4 (it's only necessary to track the appearance of single-digit primes) * 10 * 10, which is easily manageable by today's computers.
Think about how you'd check whether a number is valid.
Not the normal way, but using a finite state automaton that takes the input from left to right, digit by digit.
For simplicity, assume the input has a fixed number of digits, so that comparison with L/R is easier. (This is possible because the number has at most as many digits as R.)
It's necessary for each state to keep track of:
which single-digit primes have appeared in the number (use a bit mask; there are four 1-digit primes)
whether the number is in range [L..R] (either the prefix already guarantees this to be true or false, or the prefix still matches that of L/R)
the value of the prefix mod each single-digit prime
the most recent digit (to check that no pair of adjacent digits is coprime)
After the finite state automaton is constructed, the rest is simple. Just use dynamic programming to count the number of paths from the starting state to any accepting state.
Remark: This method can be used to count the number of any type of object that can be verified using a finite state automaton (roughly speaking, you can check whether the property is satisfied using a program with constant memory usage, and takes the object piece-by-piece in some order)
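The state described above can be turned into a memoized digit DP. The sketch below is one possible reading, not a tuned implementation: the names `countPRUpTo` and `countPR` are invented here, the upper bound is passed as a decimal string, the prefix value is tracked mod 210 (= 2 * 3 * 5 * 7) instead of four separate remainders, and the single digit 1 is counted as valid because it has no adjacent pair (this matches the brute-force code in the question). Counts are plain numbers; switching to BigInt would be the safer choice if exactness near 2^53 matters.

```javascript
function gcd(a, b) {
  return b === 0 ? a : gcd(b, a % b);
}

// Bit assigned to each single-digit prime in the "seen" mask.
const PRIME_BIT = { 2: 1, 3: 2, 5: 4, 7: 8 };

// Counts PR numbers in [1, upper]; `upper` is a decimal string such as "1000".
function countPRUpTo(upper) {
  const digits = upper.split('').map(Number);
  const memo = new Map();

  // pos:   index of the digit being placed
  // mod:   value of the prefix mod 210
  // mask:  which single-digit primes have appeared so far
  // last:  previous digit, or -1 while only leading zeros were placed
  // tight: true while the prefix equals the prefix of `upper`
  function go(pos, mod, mask, last, tight) {
    if (pos === digits.length) {
      if (last === -1) return 0; // only zeros placed: that is 0, outside the range
      for (const p of [2, 3, 5, 7]) {
        if ((mask & PRIME_BIT[p]) && mod % p !== 0) return 0;
      }
      return 1;
    }
    const key = `${pos},${mod},${mask},${last}`;
    if (!tight && memo.has(key)) return memo.get(key);
    const limit = tight ? digits[pos] : 9;
    let total = 0;
    for (let d = 0; d <= limit; d++) {
      const nextTight = tight && d === limit;
      if (last === -1 && d === 0) {
        // still in the leading zeros: the number has not started yet
        total += go(pos + 1, 0, 0, -1, nextTight);
      } else if (last === -1 || gcd(last, d) !== 1) {
        // first real digit, or a digit that is not coprime to the previous one
        total += go(pos + 1, (mod * 10 + d) % 210, mask | (PRIME_BIT[d] || 0), d, nextTight);
      }
    }
    if (!tight) memo.set(key, total);
    return total;
  }

  return go(0, 0, 0, -1, true);
}

// Count in [l, r] as a difference of two prefix counts (l, r below 2^53 here).
function countPR(l, r) {
  return countPRUpTo(String(r)) - countPRUpTo(String(l - 1));
}

console.log(countPR(2, 5)); // 4
```

For bounds beyond 2^53, `countPRUpTo` can be called directly with decimal strings for R and L-1.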
We need a table where we can look up the count of suffixes that would match a prefix to construct valid numbers. Given a prefix's
right digit
prime combination
mod combination
and a suffix length, we'd like the count of suffixes that have searchable:
left digit
length
prime combination
mod combination
I started coding in Python, then switched to JavaScript to be able to offer a snippet. Comments in the code describe each lookup table. There are a few of them to allow for faster enumeration. There are samples of prefix-suffix calculations to illustrate how one can build an arbitrary upper-bound using the table, although at least some, maybe all of the prefix construction and aggregation could be made during the tabulation.
function gcd(a,b){
if (!b)
return a
else
return gcd(b, a % b)
}
// (Started writing in Python,
// then switched to JavaScript...
// 'xrange(4)' -> [0, 1, 2, 3]
// 'xrange(2, 4)' -> [2, 3]
function xrange(){
let l = 0
let r = arguments[1] || arguments[0]
if (arguments.length > 1)
l = arguments[0]
return new Array(r - l).fill(0).map((_, i) => i + l)
}
// A lookup table and its reverse,
// mapping each of the 210 mod combinations,
// [n % 2, n % 3, n % 5, n % 7], to a key
// from 0 to 209.
// 'mod_combs[0]' -> [0, 0, 0, 0]
// 'mod_combs[209]' -> [1, 2, 4, 6]
// 'mod_keys[[0,0,0,0]]' -> 0
// 'mod_keys[[1,2,4,6]]' -> 209
let mod_combs = {}
let mod_keys = {}
let mod_key_count = 0
for (let m2 of xrange(2)){
for (let m3 of xrange(3)){
for (let m5 of xrange(5)){
for (let m7 of xrange(7)){
mod_keys[[m2, m3, m5, m7]] = mod_key_count
mod_combs[mod_key_count] = [m2, m3, m5, m7]
mod_key_count += 1
}
}
}
}
// The main lookup table built using the
// dynamic program
// [mod_key 210][l_digit 10][suffix length 20][prime_comb 16]
let table = new Array(210)
for (let mk of xrange(210)){
table[mk] = new Array(10)
for (let l_digit of xrange(10)){
table[mk][l_digit] = new Array(20)
for (let sl of xrange(20)){
table[mk][l_digit][sl] = new Array(16).fill(0)
}
}
}
// We build prime combinations from 0 (no primes) to
// 15 (all four primes), using a bitmask of up to four bits.
let prime_set = [0, 0, 1<<0, 1<<1, 0, 1<<2, 0, 1<<3, 0, 0]
// The possible digits that could
// follow a digit
function get_valid_digits(digit){
if (digit == 0)
return [0, 2, 3, 4, 5, 6, 7, 8, 9]
else if ([2, 4, 8].includes(digit))
return [0, 2, 4, 6, 8]
else if ([3, 9].includes(digit))
return [0, 3, 6, 9]
else if (digit == 6)
return [0, 2, 3, 4, 6, 8, 9]
else if (digit == 5)
return [0, 5]
else if (digit == 7)
return [0, 7]
}
// Build the table bottom-up
// Single digits
for (let i of xrange(10)){
let mod_key = mod_keys[[i % 2, i % 3, i % 5, i % 7]]
let length = 1
let l_digit = i
let prime_comb = prime_set[i]
table[mod_key][l_digit][length][prime_comb] = 1
}
// Everything else
// For demonstration, we just table up to 6 digits
// since either JavaScript, this program, or both seem
// to be too slow for a full demo.
for (let length of xrange(2, 6)){
// We're appending a new left digit
for (let new_l_digit of xrange(0, 10)){
// The digit 1 is never valid
if (new_l_digit == 1)
continue
// The possible digits that could
// be to the right of our new left digit
let ds = get_valid_digits(new_l_digit)
// For each possible digit to the right
// of our new left digit, iterate over all
// the combinations of primes and remainder combinations.
// The ones that are populated are valid paths, the
// sum of which can be aggregated for each resulting
// new combination of primes and remainders.
for (let l_digit of ds){
for (let p_comb of xrange(16)){
for (let m_key of xrange(210)){
let new_prime_comb = prime_set[new_l_digit] | p_comb
// suffix's remainder combination
let [m2, m3, m5, m7] = mod_combs[m_key]
// new remainder combination
let m = Math.pow(10, length - 1) * new_l_digit
let new_mod_key = mod_keys[[(m + m2) % 2, (m + m3) % 3, (m + m5) % 5, (m + m7) % 7]]
// Aggregate any populated entries into the new
// table entry
table[new_mod_key][new_l_digit][length][new_prime_comb] += table[m_key][l_digit][length - 1][p_comb]
}
}
}
}
}
// If we need only a subset of the mods set to
// zero, we need to check all instances where
// this subset is zero. For example,
// for the prime combination, [2, 3], we need to
// check all mod combinations where the first two
// are zero since we don't care about the remainders
// for 5 and 7: [0,0,0,0], [0,0,0,1],... [0,0,4,6]
// Return all needed combinations given some
// predetermined, indexed remainders.
function prime_comb_to_mod_keys(remainders){
let mod_map = [2, 3, 5, 7]
let mods = []
for (let i of xrange(4))
mods.push(!remainders.hasOwnProperty(i) ? mod_map[i] - 1 : 0)
function f(ms, i){
if (i == ms.length){
for (let idx in remainders)
ms[idx] = remainders[idx]
return [mod_keys[ms]]
}
let result = []
for (let m=ms[i] - 1; m>=0; m--){
let _ms = ms.slice()
_ms[i] = m
result = result.concat(f(_ms, i + 1))
}
return result.concat(f(ms, i + 1))
}
return f(mods, 0)
}
function get_matching_mods(prefix, len_suffix, prime_comb){
let ps = [2, 3, 5, 7]
let actual_prefix = Math.pow(10, len_suffix) * prefix
let remainders = {}
for (let i in xrange(4)){
if (prime_comb & (1 << i))
remainders[i] = (ps[i] - (actual_prefix % ps[i])) % ps[i]
}
return prime_comb_to_mod_keys(remainders)
}
// A brute-force function to check the
// table is working. Returns a list of
// valid numbers of 'length' digits
// given a prefix.
function confirm(prefix, length){
let result = [0, []]
let ps = [0, 0, 2, 3, 0, 5, 0, 7, 0, 0]
let p_len = String(prefix).length
function check(suffix){
let num = Math.pow(10, length - p_len) * prefix + suffix
let temp = num
let prev = 0
while (temp){
let d = temp % 10
if (d == 1 || gcd(prev, d) == 1 || (ps[d] && num % d))
return [0, []]
prev = d
temp = ~~(temp / 10)
}
return [1, [num]]
}
for (let suffix of xrange(Math.pow(10, length - p_len))){
let [a, b] = check(suffix)
result[0] += a
result[1] = result[1].concat(b)
}
return result
}
function get_prime_comb(prefix){
let prime_comb = 0
while (prefix){
let d = prefix % 10
prime_comb |= prime_set[d]
prefix = ~~(prefix / 10)
}
return prime_comb
}
// A function to test the table
// against the brute-force method.
// To match a prefix with the number
// of valid suffixes of a chosen length
// in the table, we want to aggregate all
// prime combinations for all valid digits,
// where the remainders for each combined
// prime combination (prefix with suffix)
// sum to zero (with the appropriate mod).
function test(prefix, length, show=false){
let r_digit = prefix % 10
let len_suffix = length - String(prefix).length
let prefix_prime_comb = get_prime_comb(prefix)
let ds = get_valid_digits(r_digit)
let count = 0
for (let l_digit of ds){
for (let prime_comb of xrange(16)){
for (let i of get_matching_mods(prefix, len_suffix, prefix_prime_comb | prime_comb)){
let v = table[i][l_digit][len_suffix][prime_comb]
count += v
}
}
}
let c = confirm(prefix, length)
return `${ count }, ${ c[0] }${ show ? ': ' + c[1] : '' }`
}
// Arbitrary prefixes
for (let length of [3, 4]){
for (let prefix of [2, 30]){
console.log(`prefix, length: ${ prefix }, ${ length }`)
console.log(`tabled, brute-force: ${ test(prefix, length, true) }\n\n`)
}
}
let length = 6
for (let l_digit=2; l_digit<10; l_digit++){
console.log(`prefix, length: ${ l_digit }, ${ length }`)
console.log(`tabled, brute-force: ${ test(l_digit, length) }\n\n`)
}

Fastest way to check if a number is a vampire number?

A vampire number is defined here https://en.wikipedia.org/wiki/Vampire_number. A number V is a vampire number if:
It can be expressed as X*Y such that X and Y have N/2 digits each where N is the number of digits in V
X & Y should not both have trailing zeros
X & Y together should have the same digits as V
I came up with a solution,
strV = sort(toString(V))
for factor <- pow(10, N/2 - 1) to sqrt(V)
    if factor divides V
        X <- factor
        Y <- V/factor
        if X and Y both have trailing zeros
            continue
        checkStr = sort(toString(X) + toString(Y))
        if checkStr equals strV return true
return false
Another possible solution is to permute the string represented by V and split it into half and check if its a vampire number. Which one is the best way to do so?
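For reference, the factor-scanning idea from the pseudocode can be written as runnable JavaScript roughly as follows. This is a sketch under a few assumptions: `isVampireByFactors` is a made-up name, the input is a positive integer below 2^53, and the scan starts at the smallest number with N/2 digits.

```javascript
// Factor scan: try every X with N/2 digits up to sqrt(V) and compare digits.
// Assumption: v is a positive integer below 2^53.
function isVampireByFactors(v) {
  const s = String(v);
  if (s.length % 2 !== 0) return false;
  const half = s.length / 2;
  const sortedV = s.split('').sort().join('');
  const smallest = Math.pow(10, half - 1); // smallest number with N/2 digits
  for (let x = smallest; x * x <= v; x++) {
    if (v % x !== 0) continue;
    const y = v / x;
    if (String(y).length !== half) continue;      // Y must also have N/2 digits
    if (x % 10 === 0 && y % 10 === 0) continue;   // both fangs end in 0
    const sortedXY = (String(x) + String(y)).split('').sort().join('');
    if (sortedXY === sortedV) return true;
  }
  return false;
}

console.log(isVampireByFactors(1260)); // true  (21 * 60)
console.log(isVampireByFactors(1023)); // false
```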
The algorithm I propose here will not go through all permutations of digits. It will eliminate possibilities as fast as possible so that only a fraction of permutations will actually be tested.
Algorithm explained by example
Here is how it works based on example number 125460. If you are fine with reading the code directly, then you can skip this (long) part:
At first the two fangs (i.e. vampire factors) are obviously not known, and the problem can be represented as follows:
?**
X ?**
-------
=125460
For the left most digit of the first factor (marked with ?) we could choose any of the digits 0,1,2,5,4, or 6. But on closer analysis 0 would not be a viable possibility, as the product would never reach more than a 5-digit number. So it would be a waste of time to go through all permutations of digits that start with a zero.
For the left most digit of the second factor (also marked with ?), the same is true. However, when looking at the combinations, we can again filter out some pairs that cannot contribute to reaching the target product. For instance, this combination should be discarded:
1**
X 2**
-------
=125460
The greatest number that can be achieved with these digits is 199x299 = 59501 (ignoring the fact that we don't even have a 9), which is not even half of the desired number. So we should reject the combination (1, 2). For the same reason, the pair (1, 5) can be discarded for taking these positions. Similarly, the pairs (4, 5), (4, 6), and (5, 6) can be rejected as well, because they yield a too large product (>= 200000). I will call this kind of a test -- where it is determined whether the target number is within reach for a certain chosen digit pair, the "range test".
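As a rough illustration, the range test can be expressed as a small function. The name `passesRangeTest` and its parameters are hypothetical; `unknown` is the number of digits per fang that have not been chosen yet.

```javascript
// Range test: with the leading digits of both fangs fixed, can the product
// still reach the target? The unknown trailing digits are filled with all 0s
// for the minimum and all 9s for the maximum.
function passesRangeTest(target, prefixA, prefixB, unknown) {
  const scale = Math.pow(10, unknown);
  const min = (prefixA * scale) * (prefixB * scale);
  const max = (prefixA * scale + scale - 1) * (prefixB * scale + scale - 1);
  return min <= target && target <= max;
}

console.log(passesRangeTest(125460, 1, 2, 2)); // false: 199 * 299 = 59501 is too small
console.log(passesRangeTest(125460, 4, 5, 2)); // false: 400 * 500 = 200000 is too large
console.log(passesRangeTest(125460, 2, 5, 2)); // true
```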
At this stage there is no difference between the first and the second fang, so we should also not have to investigate pairs where the second digit is smaller than the first, because they mirror a pair that would already have been investigated (or rejected).
So of all the possible pairs that could take up this first position (there are 30 possibilities to take 2 digits from a set of 6 digits), only the following 4 need to be investigated:
(1, 6), (2, 4), (2, 5), (2, 6)
In a more elaborate notation this means we are limiting the search to these number patterns:
1** 2** 2** 2**
X 6** X 4** X 5** X 6**
------- ------- ------- -------
=125460 =125460 =125460 =125460
A B C D
It is clear that this reduction of possibilities before even looking at the other positions greatly reduces the search tree.
The algorithm will take each of these 4 possibilities in order, and for each will check the possibilities for the next digit position. So first configuration A is analysed:
1?*
X 6?*
-------
=125460
The pairs that are available for the ?-marked positions are these 12:
(0, 2), (0, 4), (0, 5)
(2, 0), (2, 4), (2, 5)
(4, 0), (4, 2), (4, 5)
(5, 0), (5, 2), (5, 4)
Again, we can eliminate pairs by applying the range test. Let's take for instance the pair (5, 4). This would mean we had factors 15* and 64* (where * is an unknown digit at this point). The product of these two will be maximised with 159 * 649, i.e. 103191 (again ignoring the fact we do not even have a 9 available): this is too low for reaching the target, so this pair can be ignored. By further applying the range test, all these 12 pairs can be discarded, and so the search within configuration A stops here: there is no solution there.
Then the algorithm moves to configuration B:
2?*
X 4?*
-------
=125460
Again, the range test is applied to the possible pairs for the second position, and again it turns out none of these pairs passes the test: for instance (5, 6) can never represent a greater product than 259 * 469 = 121471, which is (only just) too small.
Then the algorithm moves to option C:
2?*
X 5?*
-------
=125460
Of all 12 possible pairs, only the following survive the range test: (4, 0), (4, 1), (6, 0), (6, 1). So now we have the following second-level configurations:
24* 24* 26* 26*
X 50* X 51* X 50* X 51*
------- ------- ------- -------
=125460 =125460 =125460 =125460
Ca Cb Cc Cd
In configuration Ca, there is no pair that passes the range test.
In configuration Cb, the pair (6, 0) passes, and leads to a solution:
246
X 510
-------
=125460
At this point the algorithm stops searching. The outcome is clear. In total the number of configurations looked at is very small compared to a brute force permutation checking algorithm. Here is a visualisation of the search tree:
*-+-- (1, 6)
|
+-- (2, 4)
|
+-- (2, 5) -+-- (4, 0)
| |
| +-- (4, 1) ---- (6, 0) = success: 246 * 510
/ /
| +-- (6, 0)
| |
| +-- (6, 1)
|
+-- (2, 6) ---- (0, 1) ---- (4, 5) = success: 204 * 615
The variants below / are only for showing what else the algorithm would have done, if there had not been a solution found. But in this actual case, that part of the search tree was actually never followed.
I have no clear idea of the time complexity, but it seems to run quite well for larger numbers, showing that the elimination of digits at an early stage makes the width of the search tree quite narrow.
Here is a live JavaScript implementation, which also runs some test cases when it is activated (and it has a few other optimisations -- see code comments).
/*
Function: vampireFangs
Arguments:
vampire: number to factorise into two fangs, if possible.
Return value:
Array with two fangs if indeed the argument is a vampire number.
Otherwise false (not a vampire number) or null (argument too large to
compute)
*/
function vampireFangs(vampire) {
/* Function recurse: for the recursive part of the algorithm.
prevA, prevB: partial, potential fangs based on left-most digits of the given
number
counts: array of ten numbers representing the occurrence of still
available digits
divider: power of 100, is divided by 100 each next level in the search tree.
Determines the number of right-most digits of the given number that
are ignored at first in the algorithm. They will be considered in
deeper levels of recursion.
*/
function recurse(vampire, prevA, prevB, counts, divider) {
if (divider < 1) { // end of recursion
// Product of fangs must equal original number and fangs must not both
// end with a 0.
return prevA * prevB === vampire && (prevA % 10 + prevB % 10 > 0)
? [prevA, prevB] // Solution found
: false; // It's not a solution
}
// Get left-most digits (multiple of 2) of potential vampire number
var v = Math.floor(vampire/divider);
// Shift decimal digits of partial fangs to the left to make room for
// the next digits
prevA *= 10;
prevB *= 10;
// Calculate the min/max A digit that can potentially contribute to a
// solution
var minDigA = Math.floor(v / (prevB + 10)) - prevA;
var maxDigA = prevB ? Math.floor((v + 1) / prevB) - prevA : 9;
if (maxDigA > 9) maxDigA = 9;
for (var digA = minDigA; digA <= maxDigA; digA++) {
if (!counts[digA]) continue; // this digit is not available
var fangA = prevA + digA;
counts[digA]--;
// Calculate the min/max B digit that can potentially contribute to
// a solution
var minDigB = Math.floor(v / (fangA + 1)) - prevB;
var maxDigB = fangA ? (v + 1) / fangA - prevB : 9;
// Don't search mirrored A-B digits when both fangs are equal until now.
if (prevA === prevB && digA > minDigB) minDigB = digA;
if (maxDigB > 9) maxDigB = 9;
for (var digB = minDigB; digB <= Math.min(maxDigB, 9); digB++) {
if (!counts[digB]) continue; // this digit is not available
var fangB = prevB + digB;
counts[digB]--;
// Recurse by considering the next two digits of the potential
// vampire number, for finding the next digits to append to
// both partial fangs.
var result = recurse(vampire, fangA, fangB, counts, divider / 100);
// When one solution is found: stop searching & exit search tree.
if (result) return result; // solution found
// Restore counts
counts[digB]++;
}
counts[digA]++;
}
}
// Validate argument
if (typeof vampire !== 'number') return false;
if (vampire < 0 || vampire % 1 !== 0) return false; // not positive and integer
if (vampire > 9007199254740991) return null; // beyond JavaScript precision
var digits = vampire.toString(10).split('').map(Number);
// A vampire number has an even number of digits
if (!digits.length || digits.length % 2 > 0) return false;
// Register per digit (0..9) the frequency of that digit in the argument
var counts = [0,0,0,0,0,0,0,0,0,0];
for (var i = 0; i < digits.length; i++) {
counts[digits[i]]++;
}
return recurse(vampire, 0, 0, counts, Math.pow(10, digits.length - 2));
}
function Timer() {
function now() { // try performance object, else use Date
return typeof performance !== 'undefined' ? performance.now() : new Date().getTime();
}
var start = now();
this.spent = function () { return Math.round(now() - start); }
}
// I/O
var button = document.querySelector('button');
var input = document.querySelector('input');
var output = document.querySelector('pre');
button.onclick = function () {
var str = input.value;
// Convert to number
var vampire = parseInt(str);
// Measure performance
var timer = new Timer();
// Input must be valid number
var result = vampire.toString(10) !== str ? null
: vampireFangs(vampire);
output.textContent = (result
? 'Vampire number. Fangs are: ' + result.join(', ')
: result === null
? 'Input is not an integer or too large for JavaScript'
: 'Not a vampire number')
+ '\nTime spent: ' + timer.spent() + 'ms';
}
// Tests (numbers taken from wiki page)
var tests = [
// Negative test cases:
[1, 999, 126000, 1023],
// Positive test cases:
[1260, 1395, 1435, 1530, 1827, 2187, 6880,
102510, 104260, 105210, 105264, 105750, 108135,
110758, 115672, 116725, 117067, 118440,
120600, 123354, 124483, 125248, 125433, 125460, 125500,
13078260,
16758243290880,
24959017348650]
];
tests.forEach(function (vampires, shouldBeVampire) {
vampires.forEach(function (vampire) {
var isVampire = vampireFangs(vampire);
if (!isVampire !== !shouldBeVampire) {
output.textContent = 'Unexpected: vampireFangs('
+ vampire + ') returns ' + JSON.stringify(isVampire);
throw 'Test failed';
}
});
});
output.textContent = 'All tests passed.';
N: <input value="1047527295416280"><button>Vampire Check</button>
<pre></pre>
As JavaScript uses a 64-bit floating point representation, the above snippet only accepts numbers up to 2^53-1. Above that limit there would be loss of precision and consequently unreliable results.
As Python does not have such limitation, I also put a Python implementation on eval.in. That site has a limitation on execution times, so you'd have to run it elsewhere if that becomes an issue.
In pseudocode:
if digitcount is odd return false
if digitcount is 2 return false
for A = each permutation of length digitcount/2 selected from all the digits,
    for B = each permutation of the remaining digits,
        if either A or B starts with a zero, continue
        if both A and B end in a zero, continue
        if A*B == the number, return true
There are a number of optimizations that could still be performed here, mostly in terms of ensuring that each possible pair of factors is tried only once. In other words, how to best check for repeating digits when selecting permutations?
But that's the gist of the algorithm I would use.
P.S.: You're not looking for primes, so why use a primality test? You just care about whether these are vampire numbers; there are only a very few possible factors. No need to check all the numbers up to sqrt(number).
Here are some suggestions:
First a simple improvement: if the number of digits is < 4 or odd return false (or if v is negative too).
You don't need to sort v, it is enough to count how many times each digit occurs O(n).
You don't have to check each number, only the combinations that are possible with the digits. This could be done by backtracking and significantly reduces the amount of numbers that have to be checked.
The final sort to check if all digits were used isn't needed either; just add up the used digits of both numbers and compare with the occurrences in v.
Here is the code for a JS-like language with integers that never overflow, the V parameter is an integer string without leading 0s:
Edit: As it turns out the code is not only JS-like, but valid JS code and it had no problem to decide that 1047527295416280 is indeed a vampire number (jsfiddle).
var V, v, isVmp, digits, len;
function isVampire(numberString) {
V = numberString;
if (V.length < 4 || V.length % 2 == 1 )
return false;
v = parseInt(V);
if (v < 0)
return false;
digits = countDigits(V);
len = V.length / 2;
isVmp = false;
checkNumbers();
return isVmp;
}
function countDigits(s) {
var offset = "0".charCodeAt(0);
var ret = [0,0,0,0,0,0,0,0,0,0];
for (var i = 0; i < s.length; i++)
ret[s.charCodeAt(i) - offset]++;
return ret;
}
function checkNumbers(number, depth) {
if (isVmp)
return;
if (typeof number == 'undefined') {
for (var i = 1; i < 10; i++) {
if (digits[i] > 0) {
digits[i]--;
checkNumbers(i, len - 1);
digits[i]++;
}
}
} else if (depth == 0) {
if (v % number == 0) {
var b = v / number;
if (number % 10 != 0 || b % 10 != 0) {
var d = countDigits('' + b);
if (d[0] == digits[0] && d[1] == digits[1] && d[2] == digits[2] &&
d[3] == digits[3] && d[4] == digits[4] && d[5] == digits[5] &&
d[6] == digits[6] && d[7] == digits[7] && d[8] == digits[8] &&
d[9] == digits[9])
isVmp = true;
}
}
} else {
for (var i = 0; i < 10; i++) {
if (digits[i] > 0) {
digits[i]--;
checkNumbers(number * 10 + i, depth - 1);
digits[i]++;
}
}
}
}
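For comparison, the digit-counting check on its own can be sketched in Python. This is a deliberately simple variant, not a port of the backtracking search above: it tries every candidate factor of the right length, so it is only reasonable for small inputs. The name is_vampire is made up for this sketch.

```python
from collections import Counter

def is_vampire(v: int) -> bool:
    """Hypothetical helper: brute-force vampire test using digit counts."""
    if v < 0:
        return False
    s = str(v)
    if len(s) < 4 or len(s) % 2 == 1:
        return False
    half = len(s) // 2
    digits = Counter(s)                      # how often each digit occurs in v
    lo, hi = 10 ** (half - 1), 10 ** half    # factors must have exactly `half` digits
    for a in range(lo, hi):
        if a * a > v:                        # only look at pairs with a <= b
            break
        if v % a != 0:
            continue
        b = v // a
        if not (lo <= b < hi):
            continue
        if a % 10 == 0 and b % 10 == 0:      # both factors may not end in 0
            continue
        if Counter(str(a) + str(b)) == digits:
            return True
    return False
```

For example, 1260 = 21 * 60 and 1395 = 15 * 93 both pass, while 1261 = 13 * 97 does not because the digits differ.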

Getting numbers around a number

So, I'm trying to do something similar to a paginator (list of page numbers) where the current number is in the middle or as close as can be
Every way I solve it is hard and weird, just wondering if there is a nice mathy way to do it :)
given:
a: current page number
x: first page number
y: last page number
n: number required
I want to generate a list of numbers where a is as close to the center as can be, while staying within x and y
so f(5, 1, 10, 5) would return [3, 4, 5, 6, 7]
but f(1, 1, 10, 5) would return [1, 2, 3, 4, 5]
and f(9, 1, 10, 5) would return [6, 7, 8, 9, 10]
Can anyone think of a nice way of getting that kind of thing?
Implemented in a probably complicated way in ruby, can it be done simpler?
def numbers_around(current:, total:, required: 5)
required_before = (required - 1) / 2
required_after = (required - 1) / 2
before_x = current - required_before
after_x = current + required_after
if before_x < 1
after_x += before_x.abs + 1
before_x = 1
end
if after_x > total
before_x -= (after_x - total)
after_x = total
end
(before_x..after_x)
end
Here's something kind of mathy that returns the first number in the list (JavaScript code):
function f(a,x,y,n){
var m = n >> 1;
return x * (n > y - x) || a - m
+ Math.max(0,m - a + x)
- Math.max(0,m - y + a);
}
Output:
console.log(f(5,1,10,5)); // 3
console.log(f(1,1,10,5)); // 1
console.log(f(9,1,10,5)); // 6
console.log(f(2,1,10,5)); // 1
console.log(f(11,1,10,5)); // 6
console.log(f(7,3,12,10)); // 3
Since you don't mention which language you want this in, here is some explained code I put together in C++:
#include <iostream>
#include <vector>
std::vector<int> getPageNumbers(int first, int last, int page, int count) {
int begin = page - (count / 2);
if (begin < first) {
begin = first;
}
int cur = 0;
std::vector<int> result;
while (begin + cur <= last && cur < count) {
result.push_back(begin + cur);
++cur;
}
cur = 0;
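// Strictly greater than first: cur is incremented before inserting, so >= would insert first - 1
while (begin - cur > first && static_cast<int>(result.size()) < count) {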
++cur;
result.insert(result.begin(), begin-cur);
}
return result;
}
int main() {
std::vector<int> foo = getPageNumbers(1,10,10,4);
std::vector<int>::iterator it;
for (it = foo.begin(); it != foo.end(); ++it) {
std::cout << *it << " " << std::endl;
}
return 0;
}
What it does is basically:
Start at the element page - (count/2) (count/2 is fine as is: integer division truncates, so e.g. 2.5 becomes 2).
If the start element is below first, start at first.
Keep adding elements to the result as long as the current page number is less than or equal to the last page and fewer than count elements have been inserted.
Keep inserting elements at the beginning as long as there are fewer than count elements in the result vector and the current element has not dropped below the first page.
That is my basic attempt now. The code is executable.
After writing this, I realized it's very similar to @Nidhoegger's answer, but maybe it will help? In PHP:
<?php
// Pages are numbered from $first to $last
$current = 2;
$first = 1;
$last = 10;
$limit = 5;
$page_counter = -intdiv($limit, 2); // start half the limit below current, so if the limit is 5, start at current - 2 (page 0) and move up
$pages = array();
for ($i = 0; $i < $limit;) { // $i is only advanced when a page is actually added
$page_to_add = $current + $page_counter;
$page_counter++;
if ($page_to_add > $last)
break;
if ($page_to_add >= $first) { // skip candidates before the first page
$i++;
$pages[] = $page_to_add;
}
}
?>
I think it's just one of those problems with a lot of annoying corner cases.
start = a - (n / 2);
if (start < x) start = x; // don't go past first page.
end = start + (n - 1); // wherever you start, proceed n pages
if (end > y) { // also don't go past last page.
end = y;
start = end - (n - 1); // if you hit the end, go back n pages
if (start < x) start = x; // but _still_ don't go past first page (fewer than n pages)
}
// make some kind of vector [start..end] inclusive.
or, assuming higher-level primitives, if you prefer:
start = max(x, a - (n / 2)) // (n/2) pages before but don't pass x
end = min(start + (n - 1), y) // n pages long, but don't pass y
start = max(x, end - (n - 1)) // really n pages long, but really don't pass x
// make some kind of vector [start..end] inclusive.
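The three max/min lines translate almost directly into Python. A minimal sketch, assuming integer page numbers and using the question's parameter names (the function name pages_around is made up here):

```python
def pages_around(a, x, y, n):
    """Return up to n page numbers centred on a, clamped to [x, y]."""
    start = max(x, a - n // 2)        # n // 2 pages before a, but not before x
    end = min(start + (n - 1), y)     # n pages long, but not past y
    start = max(x, end - (n - 1))     # pull start back if we hit y, still not before x
    return list(range(start, end + 1))

print(pages_around(5, 1, 10, 5))
print(pages_around(1, 1, 10, 5))
print(pages_around(9, 1, 10, 5))
```

With the question's inputs this gives [3, 4, 5, 6, 7], [1, 2, 3, 4, 5] and [6, 7, 8, 9, 10]. When the whole range is shorter than n, it returns every page from x to y.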
Here's what seems to be the most efficient way to me. Using an array from 1 to n, find the index for the a value. First find the center point of the indexes of the array, then check to see if the number is close to one end or the other, and modify it by the difference. Then fill in the values.
It should be quick since instead of iterating, it uses algorithms to arrive at the index numbers.
Pseudocode:
centerindex = Ceiling(n/2, 1)
If (y-a) < (n - centerindex) Then centerindex = 2 * centerindex - (y - a) - 1
If (a-x) < (n - centerindex) Then centerindex = (a - x) + 1
For i = 1 to n
pages(i) = a - (centerindex - i)
Next i

Split a random value into four that sum up to it

I have one value like 24, and I have four textboxes. How can I dynamically generate four values that add up to 24?
All the values must be integers and can't be negative, and the result cannot be 6, 6, 6, 6; they must be different like: 8, 2, 10, 4. (But 5, 6, 6, 7 would be okay.)
For your stated problem, it is possible to generate an array of all possible solutions and then pick one randomly. There are in fact 1,770 possible solutions.
var solutions = [[Int]]()
for i in 1...21 {
for j in 1...21 {
for k in 1...21 {
let l = 24 - (i + j + k)
if l > 0 && !(i == 6 && j == 6 && k == 6) {
solutions.append([i, j, k, l])
}
}
}
}
// Now generate 20 solutions
for _ in 1...20 {
let rval = Int(arc4random_uniform(UInt32(solutions.count)))
println(solutions[rval])
}
This avoids any bias at the cost of initial setup time and storage.
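The figure of 1,770 can be checked independently. A small Python sketch, assuming the usual stars-and-bars count for ordered splits of 24 into 4 positive integers, compared against a brute-force count over the same loop bounds as the code above:

```python
from math import comb

total, parts = 24, 4

# Stars and bars: ordered splits of `total` into `parts` positive integers.
all_splits = comb(total - 1, parts - 1)
allowed = all_splits - 1              # exclude 6, 6, 6, 6

# Brute-force count over the same bounds as the nested loops.
brute = sum(
    1
    for i in range(1, 22)
    for j in range(1, 22)
    for k in range(1, 22)
    if 24 - (i + j + k) > 0 and (i, j, k) != (6, 6, 6)
)

print(allowed, brute)
```

Both counts come out to 1770.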
This could be improved by:
Reducing storage space by only storing the first 3 numbers. The 4th one is always 24 - (sum of first 3)
Reducing storage space by storing each solution as a single integer: (i * 10000 + j * 100 + k)
Speeding up the generation of solutions by realizing that each loop doesn't need to go to 21.
Here is the solution that stores each solution as a single integer and optimizes the loops:
var solutions = [Int]()
for i in 1...21 {
for j in 1...22-i {
for k in 1...23-i-j {
if !(i == 6 && j == 6 && k == 6) {
solutions.append(i * 10000 + j * 100 + k)
}
}
}
}
// Now generate 20 solutions
for _ in 1...20 {
let rval = Int(arc4random_uniform(UInt32(solutions.count)))
let solution = solutions[rval]
// unpack the values
let i = solution / 10000
let j = (solution % 10000) / 100
let k = solution % 100
let l = 24 - (i + j + k)
// print the solution
println("\([i, j, k, l])")
}
Here is a Swift implementation of the algorithm given in https://stackoverflow.com/a/8064754/1187415, with a slight
modification because all numbers are required to be positive.
The method for producing N positive random integers with sum M is:
Build an array containing the number 0, followed by N-1 different
random numbers in the range 1 .. M-1, and finally the number M.
Compute the differences of subsequent array elements.
In the first step, we need a random subset of N-1 elements out of
the set { 1, ..., M-1 }. This can be achieved by iterating over this
set and choosing each element with probability n/m, where
m is the remaining number of elements we can choose from and
n is the remaining number of elements to choose.
Instead of storing the chosen random numbers in an array, the
difference to the previously chosen number is computed immediately
and stored.
This gives the following function:
func randomNumbers(#count : Int, withSum sum : Int) -> [Int] {
precondition(sum >= count, "`sum` must not be less than `count`")
var diffs : [Int] = []
var last = 0 // last number chosen
var m = UInt32(sum - 1) // remaining # of elements to choose from
var n = UInt32(count - 1) // remaining # of elements to choose
for i in 1 ..< sum {
// Choose this number `i` with probability n/m:
if arc4random_uniform(m) < n {
diffs.append(i - last)
last = i
n--
}
m--
}
diffs.append(sum - last)
return diffs
}
println(randomNumbers(count: 4, withSum: 24))
If a solution with all elements equal (e.g 6+6+6+6=24) is not
allowed, you can repeat the method until a valid solution is found:
func differentRandomNumbers(#count : Int, withSum sum : Int) -> [Int] {
precondition(count >= 2, "`count` must be at least 2")
var v : [Int]
do {
v = randomNumbers(count: count, withSum: sum)
} while (!contains(v, { $0 != v[0]} ))
return v
}
Here is a simple test. It computes 1,000,000 random representations
of 7 as the sum of 3 positive integers, and counts the distribution
of the results.
let set = NSCountedSet()
for i in 1 ... 1_000_000 {
let v = randomNumbers(count: 3, withSum: 7)
set.addObject(v)
}
for (_, v) in enumerate(set) {
let count = set.countForObject(v)
println("\(v as! [Int]) \(count)")
}
Result:
[1, 4, 2] 66786
[1, 5, 1] 67082
[3, 1, 3] 66273
[2, 2, 3] 66808
[2, 3, 2] 66966
[5, 1, 1] 66545
[2, 1, 4] 66381
[1, 3, 3] 67153
[3, 3, 1] 67034
[4, 1, 2] 66423
[3, 2, 2] 66674
[2, 4, 1] 66418
[4, 2, 1] 66292
[1, 1, 5] 66414
[1, 2, 4] 66751
Update for Swift 3:
func randomNumbers(count : Int, withSum sum : Int) -> [Int] {
precondition(sum >= count, "`sum` must not be less than `count`")
var diffs : [Int] = []
var last = 0 // last number chosen
var m = UInt32(sum - 1) // remaining # of elements to choose from
var n = UInt32(count - 1) // remaining # of elements to choose
for i in 1 ..< sum {
// Choose this number `i` with probability n/m:
if arc4random_uniform(m) < n {
diffs.append(i - last)
last = i
n -= 1
}
m -= 1
}
diffs.append(sum - last)
return diffs
}
print(randomNumbers(count: 4, withSum: 24))
Update for Swift 4.2 (and later), using the unified random API:
func randomNumbers(count : Int, withSum sum : Int) -> [Int] {
precondition(sum >= count, "`sum` must not be less than `count`")
var diffs : [Int] = []
var last = 0 // last number chosen
var m = sum - 1 // remaining # of elements to choose from
var n = count - 1 // remaining # of elements to choose
for i in 1 ..< sum {
// Choose this number `i` with probability n/m:
if Int.random(in: 0..<m) < n {
diffs.append(i - last)
last = i
n -= 1
}
m -= 1
}
diffs.append(sum - last)
return diffs
}
func getRandomValues(amountOfValues:Int, totalAmount:Int) -> [Int]?{
if amountOfValues < 1{
return nil
}
if totalAmount < 1{
return nil
}
if totalAmount < amountOfValues{
return nil
}
var values:[Int] = []
var valueLeft = totalAmount
for i in 0..<amountOfValues{
if i == amountOfValues - 1{
values.append(valueLeft)
break
}
var value = Int(arc4random_uniform(UInt32(valueLeft - (amountOfValues - i))) + 1)
valueLeft -= value
values.append(value)
}
var shuffledArray:[Int] = []
for i in 0..<values.count {
var rnd = Int(arc4random_uniform(UInt32(values.count)))
shuffledArray.append(values[rnd])
values.removeAtIndex(rnd)
}
return shuffledArray
}
getRandomValues(4, 24)
This is not a final answer, but it should be a (good) starting point.
How it works: It takes 2 parameters. The amount of random values (4 in your case) and the total amount (24 in your case).
It takes a random value between the total amount and 0, stores it in an array, subtracts it from a variable that holds the amount left, and stores the new value.
Then it takes a new random value between the amount that is left and 0, stores it in the array, and again subtracts it from the amount that is left.
When it reaches the last number needed, it takes whatever amount is left and adds that to the array.
EDIT:
Adding a +1 to the random value removes the problem of having 0 in your array.
EDIT 2:
Shuffling the array does remove the increased chance of having a high value as the first value.
One solution that is unfortunately non-deterministic but completely random is as follows:
For a total of 24 in 4 numbers:
pick four random numbers between 1 and 21
repeat until the total of the numbers equals 24 and they are not all 6.
This will, on average, loop about 100 times before finding a solution.
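A minimal Python sketch of that retry loop, assuming "not all 6" generalises to "not all values equal" (the function name and the rng parameter are made up for illustration):

```python
import random

def split_by_rejection(total=24, count=4, rng=random):
    """Draw `count` values in 1..total-(count-1) until they sum to `total`.

    Assumes count >= 2; with a single value the 'not all equal' test
    could never pass and the loop would not terminate.
    """
    hi = total - (count - 1)          # 21 for total=24, count=4
    while True:
        vals = [rng.randint(1, hi) for _ in range(count)]
        if sum(vals) == total and len(set(vals)) > 1:
            return vals

print(split_by_rejection())
```

Every accepted tuple is equally likely, because each candidate tuple is drawn with the same probability and only invalid ones are thrown away.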
Here's a solution which should have significantly* less bias than some of the other methods. It works by generating the requested number of random floating point numbers, multiplying or dividing all of them until they add up to the target total, and then rounding them into integers. The rounding process changes the total, so we need to correct for that by adding or subtracting from random terms until they add up to the right amount.
func getRandomDoubles(#count: Int, #total: Double) -> [Double] {
var nonNormalized = [Double]()
nonNormalized.reserveCapacity(count)
for i in 0..<count {
nonNormalized.append(Double(arc4random()) / 0xFFFFFFFF)
}
let nonNormalizedSum = reduce(nonNormalized, 0) { $0 + $1 }
let normalized = nonNormalized.map { $0 * total / nonNormalizedSum }
return normalized
}
func getRandomInts(#count: Int, #total: Int) -> [Int] {
let doubles = getRandomDoubles(count: count, total: Double(total))
var ints = [Int]()
ints.reserveCapacity(count)
for double in doubles {
if double < 1 || double % 1 >= 0.5 {
// round up
ints.append(Int(ceil(double)))
} else {
// round down
ints.append(Int(floor(double)))
}
}
let roundingErrors = total - (reduce(ints, 0) { $0 + $1 })
let directionToAdjust: Int = roundingErrors > 0 ? 1 : -1
var corrections = abs(roundingErrors)
while corrections > 0 {
let index = Int(arc4random_uniform(UInt32(count)))
if directionToAdjust == -1 && ints[index] <= 1 { continue }
ints[index] += directionToAdjust
corrections--
}
return ints
}
*EDIT: Martin R has correctly pointed out that this is not nearly as uniform as one might expect, and is in fact highly biased towards numbers in the middle of the 1-24 range. I would not recommend using this solution, but I'm leaving it up so that others can know not to make the same mistake.
As a recursive function the algorithm is very nice:
func getRandomValues(amount: Int, total: Int) -> [Int] {
if amount == 1 { return [total] }
if amount == total { return Array(count: amount, repeatedValue: 1) }
let number = Int(arc4random()) % (total - amount + 1) + 1
return [number] + getRandomValues(amount - 1, total - number)
}
And with safety check:
func getRandomValues(amount: Int, total: Int) -> [Int]? {
if !(1...total ~= amount) { return nil }
if amount == 1 { return [total] }
if amount == total { return Array(count: amount, repeatedValue: 1) }
let number = Int(arc4random()) % (total - amount + 1) + 1
return [number] + getRandomValues(amount - 1, total - number)!
}
As @MartinR pointed out, the code above is extremely biased. So in order to have a uniform distribution of the output values you should use this piece of code:
func getRandomValues(amount: Int, total: Int) -> [Int] {
var numberSet = Set<Int>()
// add splitting points to numberSet
for _ in 1...amount - 1 {
var number = Int(arc4random()) % (total - 1) + 1
while numberSet.contains(number) {
number = Int(arc4random()) % (total - 1) + 1
}
numberSet.insert(number)
}
// sort numberSet and return the differences between the splitting points
let sortedArray = (Array(numberSet) + [0, total]).sort()
return sortedArray.enumerate().flatMap{
indexElement in
if indexElement.index == amount { return nil }
return sortedArray[indexElement.index + 1] - indexElement.element
}
}
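The same splitting-point idea is short in Python, since random.sample already draws distinct values. A sketch under that assumption (random_composition is a made-up name, not something from the answers above):

```python
import random

def random_composition(count, total, rng=random):
    """Split `total` into `count` positive integers, uniformly over ordered splits."""
    if not 1 <= count <= total:
        raise ValueError("need 1 <= count <= total")
    # count - 1 distinct cut points strictly between 0 and total
    cuts = sorted(rng.sample(range(1, total), count - 1))
    points = [0] + cuts + [total]
    # differences between neighbouring points are the parts
    return [b - a for a, b in zip(points, points[1:])]

print(random_composition(4, 24))
```

Rejecting the all-equal result (6, 6, 6, 6) can be layered on top by calling it again until the parts are not all the same.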
A javascript implementation for those who may be looking for such case:
const numbersSumTo = (length, value) => {
const fourRandomNumbers = Array.from({ length: length }, () => Math.floor(Math.random() * 6) + 1);
const res = fourRandomNumbers.map(num => (num / fourRandomNumbers.reduce((a, b) => a + b, 0)) * value).map(num => Math.trunc(num));
res[0] += Math.abs(res.reduce((a, b) => a + b, 0) - value);
return res;
}
// Gets an array with 4 items which sum to 100
const res = numbersSumTo(4, 100);
const resSum = res.reduce((a, b) => a + b, 0);
console.log({
res,
resSum
});
Also plenty of different methods of approach can be found here on this question: https://math.stackexchange.com/questions/1276206/method-of-generating-random-numbers-that-sum-to-100-is-this-truly-random

An interview question: About Probability

An interview question:
Given a function f(x) that 1/4 times returns 0, 3/4 times returns 1.
Write a function g(x) using f(x) that 1/2 times returns 0, 1/2 times returns 1.
My implementation is:
function g(x) = {
if (f(x) == 0){ // 1/4
var s = f(x)
if( s == 1) {// 3/4 * 1/4
return s // 3/16
} else {
g(x)
}
} else { // 3/4
var k = f(x)
if( k == 0) {// 1/4 * 3/4
return k // 3/16
} else {
g(x)
}
}
}
Am I right? What's your solution?(you can use any language)
If you call f(x) twice in a row, the following outcomes are possible (assuming that
successive calls to f(x) are independent, identically distributed trials):
00 (probability 1/4 * 1/4)
01 (probability 1/4 * 3/4)
10 (probability 3/4 * 1/4)
11 (probability 3/4 * 3/4)
01 and 10 occur with equal probability. So iterate until you get one of those
cases, then return 0 or 1 appropriately:
do
a=f(x); b=f(x);
while (a == b);
return a;
It might be tempting to call f(x) only once per iteration and keep track of the two
most recent values, but that won't work. Suppose the very first roll is 1,
with probability 3/4. You'd loop until the first 0, then return 1 (with probability 3/4).
The problem with your algorithm is that it repeats itself with high probability. My code:
function g(x) = {
var s = f(x) + f(x) + f(x);
// s = 0, probability: 1/64
// s = 1, probability: 9/64
// s = 2, probability: 27/64
// s = 3, probability: 27/64
if (s == 2) return 0;
if (s == 3) return 1;
return g(x); // probability to go into recursion = 10/64, with only 1 additional f(x) calculation
}
I've measured average number of times f(x) was calculated for your algorithm and for mine. For yours f(x) was calculated around 5.3 times per one g(x) calculation. With my algorithm this number reduced to around 3.5. The same is true for other answers so far since they are actually the same algorithm as you said.
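Those measured averages agree with the exact expectations. A small Python sketch, assuming independent calls to f() and using the fact that a retry loop with success probability p per round needs 1/p rounds on average:

```python
from fractions import Fraction

p0, p1 = Fraction(1, 4), Fraction(3, 4)   # P(f() == 0), P(f() == 1)

# Pair method: 2 calls per round, succeeds on 01 or 10.
pair_success = 2 * p0 * p1                # 3/8
pair_calls = 2 / pair_success             # expected calls to f()

# Triple method: 3 calls per round, succeeds when the sum is 2 or 3.
triple_success = 3 * p1**2 * p0 + p1**3   # 27/64 + 27/64
triple_calls = 3 / triple_success

print(float(pair_calls), float(triple_calls))
```

This gives 16/3 (about 5.33) calls for the pair method and 32/9 (about 3.56) for the three-call version.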
P.S.: your definition doesn't mention 'random' at the moment, but probably it is assumed. See my other answer.
Your solution is correct, if somewhat inefficient and with more duplicated logic. Here is a Python implementation of the same algorithm in a cleaner form.
def g ():
while True:
a = f()
if a != f():
return a
If f() is expensive you'd want to get more sophisticated with using the match/mismatch information to try to return with fewer calls to it. Here is the most efficient possible solution.
def g ():
lower = 0.0
upper = 1.0
while True:
if 0.5 < lower:
return 1
elif upper < 0.5:
return 0
else:
middle = 0.25 * lower + 0.75 * upper
if 0 == f():
lower = middle
else:
upper = middle
This takes about 2.6 calls to f() on average.
The way that it works is this. We're trying to pick a random number from 0 to 1, but we stop as soon as we know whether the number is above or below 0.5. We start knowing that the number is in the interval (0, 1). 3/4 of the numbers are in the bottom 3/4 of the interval, and 1/4 are in the top 1/4 of the interval. We decide which based on a call to f(x). This means that we are now in a smaller interval.
If we wash, rinse, and repeat enough times we can determine our finite number as precisely as possible, and will have an absolutely equal probability of winding up in any region of the original interval. In particular we have an even probability of winding up bigger than or less than 0.5.
If you wanted you could repeat the idea to generate an endless stream of bits one by one. This is, in fact, provably the most efficient way of generating such a stream, and is the source of the idea of entropy in information theory.
Given a function f(x) that 1/4 times returns 0, 3/4 times returns 1
Taking this statement literally, f(x) if called four times will always return 0 once and 1 three times. This is different from saying f(x) is a probabilistic function and the 0 to 1 ratio will approach 1 to 3 (1/4 vs 3/4) over many iterations. If the first interpretation is valid, then the only valid function for f(x) that will meet the criteria regardless of where in the sequence you start from is the sequence 0111 repeating (or 1011 or 1101 or 1110, which are the same sequence from a different starting point). Given that constraint,
g()= (f() == f())
should suffice.
As already mentioned your definition is not that good regarding probability. Usually it means that not only probability is good but distribution also. Otherwise you can simply write g(x) which will return 1,0,1,0,1,0,1,0 - it will return them 50/50, but numbers won't be random.
Another cheating approach might be:
var invert = false;
function g(x) {
invert = !invert;
if (invert) return 1-f(x);
return f(x);
}
This solution will be better than all others since it calls f(x) only one time. But the results will not be very random.
A refinement of the same approach used in btilly's answer, achieving an average ~1.85 calls to f() per g() result (further refinement documented below achieves ~1.75, btilly's ~2.6, Jim Lewis's accepted answer ~5.33). Code appears lower in the answer.
Basically, I generate random integers in the range 0 to 3 with even probability: the caller can then test bit 0 for the first 50/50 value, and bit 1 for a second. Reason: the f() probabilities of 1/4 and 3/4 map onto quarters much more cleanly than halves.
Description of algorithm
btilly explained the algorithm, but I'll do so in my own way too...
The algorithm basically generates a random real number x between 0 and 1, then returns a result depending on which "result bucket" that number falls in:
result bucket        result
x < 0.25             0
0.25 <= x < 0.5      1
0.5 <= x < 0.75      2
0.75 <= x            3
But, generating a random real number given only f() is difficult. We have to start with the knowledge that our x value should be in the range 0..1 - which we'll call our initial "possible x" space. We then hone in on an actual value for x:
each time we call f():
if f() returns 0 (probability 1 in 4), we consider x to be in the lower quarter of the "possible x" space, and eliminate the upper three quarters from that space
if f() returns 1 (probability 3 in 4), we consider x to be in the upper three-quarters of the "possible x" space, and eliminate the lower quarter from that space
when the "possible x" space is completely contained by a single result bucket, that means we've narrowed x down to the point where we know which result value it should map to and have no need to get a more specific value for x.
It may or may not help to consider this diagram :-):
"result bucket" cut-offs 0,.25,.5,.75,1
0=========0.25=========0.5==========0.75=========1 "possible x" 0..1
| | . . | f() chooses x < vs >= 0.25
| result 0 |------0.4375-------------+----------| "possible x" .25..1
| | result 1| . . | f() chooses x < vs >= 0.4375
| | | . ~0.58 . | "possible x" .4375..1
| | | . | . | f() chooses < vs >= ~.58
| | ||. | | . | 4 distinct "possible x" ranges
Code
int g() // return 0, 1, 2, or 3
{
if (f() == 0) return 0;
if (f() == 0) return 1;
double low = 0.25 + 0.25 * (1.0 - 0.25);
double high = 1.0;
while (true)
{
double cutoff = low + 0.25 * (high - low);
if (f() == 0)
high = cutoff;
else
low = cutoff;
if (high < 0.50) return 1;
if (low >= 0.75) return 3;
if (low >= 0.50 && high < 0.75) return 2;
}
}
If helpful, an intermediary to feed out 50/50 results one at a time:
int h()
{
static int i;
if (!i)
{
int x = g();
i = x | 4;
return x & 1;
}
else
{
int x = i & 2;
i = 0;
return x ? 1 : 0;
}
}
NOTE: This can be further tweaked by having the algorithm switch from considering an f()==0 result to hone in on the lower quarter, to having it hone in on the upper quarter instead, based on which on average resolves to a result bucket more quickly. Superficially, this seemed useful on the third call to f() when an upper-quarter result would indicate an immediate result of 3, while a lower-quarter result still spans probability point 0.5 and hence results 1 and 2. When I tried it, the results were actually worse. A more complex tuning was needed to see actual benefits, and I ended up writing a brute-force comparison of lower vs upper cutoff for second through eleventh calls to f(). The best result I found was an average of ~1.75, resulting from the 1st, 2nd, 5th and 8th calls to f() seeking low (i.e. setting low = cutoff).
Here is a solution based on central limit theorem, originally due to a friend of mine:
/*
Given a function f(x) that 1/4 times returns 0, 3/4 times returns 1. Write a function g(x) using f(x) that 1/2 times returns 0, 1/2 times returns 1.
*/
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <cstdio>
using namespace std;
int f() {
if (rand() % 4 == 0) return 0;
return 1;
}
int main() {
srand(time(0));
int cc = 0;
for (int k = 0; k < 1000; k++) { //number of different runs
int c = 0;
int limit = 10000; //the bigger the limit, the more we will approach %50 percent
for (int i=0; i<limit; ++i) c+= f();
cc += c < limit*0.75 ? 0 : 1; // c will be 0, with probability %50
}
printf("%d\n",cc); //cc is gonna be around 500
return 0;
}
Since each return of f() represents a 3/4 chance of TRUE, with some algebra we can just properly balance the odds. What we want is another function x() which returns a balancing probability of TRUE, so that
function g() {
return f() && x();
}
returns true 50% of the time.
So let's find the probability of x (p(x)), given p(f) and our desired total probability (1/2):
p(f) * p(x) = 1/2
3/4 * p(x) = 1/2
p(x) = (1/2) / (3/4)
p(x) = 2/3
So x() should return TRUE with a probability of 2/3, since 2/3 * 3/4 = 6/12 = 1/2;
Thus the following should work for g():
function g() {
return f() && (rand() < 2/3);
}
Assuming
P(f[x] == 0) = 1/4
P(f[x] == 1) = 3/4
and requiring a function g[x] with the following assumptions
P(g[x] == 0) = 1/2
P(g[x] == 1) = 1/2
I believe the following definition of g[x] is sufficient (Mathematica)
g[x_] := If[f[x] + f[x + 1] == 1, 1, 0]
or, alternatively in C
int g(int x)
{
return f(x) + f(x+1) == 1
? 1
: 0;
}
This is based on the idea that invocations of {f[x], f[x+1]} would produce the following outcomes
{
{0, 0},
{0, 1},
{1, 0},
{1, 1}
}
Summing each of the outcomes we have
{
0,
1,
1,
2
}
where a sum of 1 represents 1/2 of the possible sum outcomes, with any other sum making up the other 1/2.
Edit.
As bdk says - {0,0} is less likely than {1,1} because
1/4 * 1/4 < 3/4 * 3/4
However, I am confused myself because given the following definition for f[x] (Mathematica)
f[x_] := Mod[x, 4] > 0 /. {False -> 0, True -> 1}
or alternatively in C
int f(int x)
{
return (x % 4) > 0
? 1
: 0;
}
then the results obtained from executing f[x] and g[x] seem to have the expected distribution.
Table[f[x], {x, 0, 20}]
{0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0}
Table[g[x], {x, 0, 20}]
{1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1}
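The tables come out even only because this particular f[x] is deterministic, so f[x] and f[x + 1] are not independent. If the two calls are assumed to be independent with P(0) = 1/4 and P(1) = 3/4, the distribution of the sum can be worked out exactly. A short Python sketch of that calculation (the variable names are made up here):

```python
from fractions import Fraction
from itertools import product

p = {0: Fraction(1, 4), 1: Fraction(3, 4)}   # distribution of one call to f

# Distribution of f + f for two independent calls.
dist = {}
for a, b in product((0, 1), repeat=2):
    dist[a + b] = dist.get(a + b, 0) + p[a] * p[b]

for s in sorted(dist):
    print(s, dist[s])
```

Under independence the sum is 1 with probability 3/8, not 1/2, so a g based on "sum equals 1" returns 1 only 37.5% of the time.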
This is much like the Monty Hall paradox.
In general.
Public Class Form1
'the general case
'
'twiceThis = 2 is 1 in four chance of 0
'twiceThis = 3 is 1 in six chance of 0
'
'twiceThis = x is 1 in 2x chance of 0
Const twiceThis As Integer = 7
Const numOf As Integer = twiceThis * 2
Private Sub Button1_Click(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles Button1.Click
Const tries As Integer = 1000
y = New List(Of Integer)
Dim ct0 As Integer = 0
Dim ct1 As Integer = 0
Debug.WriteLine("")
''show all possible values of fx
'For x As Integer = 1 To numOf
' Debug.WriteLine(fx)
'Next
'test that gx returns 50% 0's and 50% 1's
Dim stpw As New Stopwatch
stpw.Start()
For x As Integer = 1 To tries
Dim g_x As Integer = gx()
'Debug.WriteLine(g_x.ToString) 'used to verify that gx returns 0 or 1 randomly
If g_x = 0 Then ct0 += 1 Else ct1 += 1
Next
stpw.Stop()
'the results
Debug.WriteLine((ct0 / tries).ToString("p1"))
Debug.WriteLine((ct1 / tries).ToString("p1"))
Debug.WriteLine((stpw.ElapsedTicks / tries).ToString("n0"))
End Sub
Dim prng As New Random
Dim y As New List(Of Integer)
Private Function fx() As Integer
'1 in numOf chance of zero being returned
If y.Count = 0 Then
'reload y
y.Add(0) 'fx has only one zero value
Do
y.Add(1) 'the rest are ones
Loop While y.Count < numOf
End If
'return a random value
Dim idx As Integer = prng.Next(y.Count)
Dim rv As Integer = y(idx)
y.RemoveAt(idx) 'remove the value selected
Return rv
End Function
Private Function gx() As Integer
'a function g(x) using f(x) that 50% of the time returns 0
' that 50% of the time returns 1
Dim rv As Integer = 0
For x As Integer = 1 To twiceThis
fx()
Next
For x As Integer = 1 To twiceThis
rv += fx()
Next
If rv = twiceThis Then Return 1 Else Return 0
End Function
End Class

Resources