Generate Random(a, b) making calls to Random(0, 1) - algorithm

There is a known Random(0,1) function. It is a uniform random function, which means it returns 0 or 1, each with probability 50%. Implement Random(a, b) so that it only makes calls to Random(0,1).
What I have thought of so far: map the range a..b onto a 0-based array, so that I have indices 0, 1, 2, ..., b-a.
Then call RANDOM(0,1) b-a times, sum the results to get the index, and return that element.
However, since there is no answer in the book, I don't know whether this approach is correct or the best. How can I prove that the probability of returning each element is exactly the same, namely 1/(b-a+1)?
And what is the right/better way to do this?

If your RANDOM(0, 1) returns either 0 or 1, each with probability 0.5 then you can generate bits until you have enough to represent the number (b-a+1) in binary. This gives you a random number in a slightly too large range: you can test and repeat if it fails. Something like this (in Python).
def rand_pow2(bit_count):
    """Return a random number with the given number of bits."""
    result = 0
    for _ in range(bit_count):
        result = 2 * result + RANDOM(0, 1)
    return result
def random_range(a, b):
    """Return a random integer in the closed interval [a, b]."""
    # Number of bits needed to represent b - a, i.e. ceil(log2(b - a + 1)),
    # computed with integers to avoid floating-point rounding.
    bit_count = (b - a).bit_length()
    while True:
        r = rand_pow2(bit_count)
        if a + r <= b:
            return a + r

When you sum random numbers, the result is no longer evenly distributed. The sum of n fair bits follows a binomial distribution, which looks like a Gaussian function for large n. Look up the "central limit theorem" or read any probability book or article. It's just like flipping a coin 100 times: getting 100 heads is extremely unlikely, while getting close to 50 heads and 50 tails is likely.
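To make this concrete, here is a minimal sketch that computes the exact distribution of the sum of n fair bits. The function name `sum_of_bits_distribution` is made up for this illustration, and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def sum_of_bits_distribution(n):
    """Exact probability of each possible sum (0..n) of n fair 0/1 bits."""
    return [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]

# With 4 bits the middle sum is six times as likely as either extreme.
for total, probability in enumerate(sum_of_bits_distribution(4)):
    print(total, probability)
```

The probabilities peak in the middle, so using the sum as an array index favors the middle elements.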

Your inclination to first map the range to 0 through b-a is correct. However, you cannot do it as you stated. This question asks exactly how to do that, and the answer utilizes unique factorization. Let m = b-a+1 be the number of values, write it in base 2, and keep track of the largest needed exponent, say e. Then find the largest k such that k*m is not larger than 2^e. Finally, generate e bits with RANDOM(0,1) and take them as the base 2 expansion of some number x; if x < k*m, return x mod m, otherwise try again. The program looks something like this (simple case where 2^e fits in an int):
int random_below(int m) {
    // find the smallest exponent e such that 2^e >= m
    int e = 0;
    while (m > (1 << e)) {   // 1 << e is 2^e; in C, ^ is XOR
        ++e;
    }
    // find the largest k such that k*m <= 2^e
    int k = (1 << e) / m;
    while (1) {
        // generate a random number in base 2
        int x = 0;
        for (int i = 0; i < e; ++i) {
            x = x*2 + RANDOM(0,1);
        }
        // if x isn't too large, return x modulo m
        if (x < m*k)
            return (x % m);
    }
}
Now you can simply add a to the result to get uniformly distributed numbers between a and b.

Divide and conquer can help in generating a random number in the range [a,b] using random(0,1). Note that the result is uniform only when b-a+1 is a power of 2; otherwise the two halves differ in size but are still chosen with equal probability. The idea is:
If a is equal to b, then the random number is a.
Find the mid of the range [a,b].
Generate random(0,1).
If the result is 0, return a random number in the range [a,mid] using recursion;
otherwise return a random number in the range [mid+1, b] using recursion.
The C code is as follows.
int random(int a, int b)
{
    if(a == b)
        return a;
    int c = RANDOM(0,1); // Returns 0 or 1 with probability 0.5
    int mid = a + (b-a)/2;
    if(c == 0)
        return random(a, mid);
    else
        return random(mid + 1, b);
}
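The following sketch computes the exact output distribution of this halving scheme, which makes it easy to check for which ranges it is uniform. The helper name `split_distribution` is hypothetical; it mirrors the recursion above rather than calling a coin.

```python
from fractions import Fraction

def split_distribution(a, b):
    """Exact probability of each result of the recursive halving on [a, b]."""
    if a == b:
        return {a: Fraction(1)}
    mid = a + (b - a) // 2
    dist = {}
    # Each half is entered with probability 1/2, whatever its size.
    for lo, hi in ((a, mid), (mid + 1, b)):
        for value, probability in split_distribution(lo, hi).items():
            dist[value] = probability / 2
    return dist

print(split_distribution(0, 3))  # four values
print(split_distribution(0, 2))  # three values
```

For a range of three values, the single value in the right half receives probability 1/2, so the scheme would need a rejection step to be uniform there.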

If you have an RNG that returns {0, 1} with equal probability, you can easily create an RNG that returns the numbers {0, ..., 2^n - 1} with equal probability.
To do this, you just use your original RNG n times and get a binary number like 0010110111. Each of the numbers from 0 to 2^n - 1 is equally likely.
Now it is easy to get an RNG from a to b, where b - a + 1 = 2^n. You just use the previous RNG and add a to its result.
Now the last question is: what should you do if b - a + 1 is not a power of 2?
The good thing is that you have to do almost nothing: rely on the rejection sampling technique. It tells you that if you have an RNG over a big set and need to select an element from a subset of this set, you can just keep selecting elements from the bigger set and discarding them until you get one that is in your subset.
So all you do is find the first n such that b - a + 1 <= 2^n. Then use rejection sampling until you have picked a number not larger than b - a. Then you just add a.
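As a sketch of why rejection sampling gives exactly equal probabilities: write N = b - a + 1 and let n be the number of bits, so 2^n >= N. A particular value r < N is returned in a given round with probability 1/2^n, and a round is rejected with probability 1 - N/2^n. Summing over the number k of rejected rounds that come first:

```latex
\[
P(\text{return } a + r)
  = \sum_{k=0}^{\infty} \left(1 - \frac{N}{2^n}\right)^{k} \frac{1}{2^n}
  = \frac{1}{2^n} \cdot \frac{2^n}{N}
  = \frac{1}{N},
  \qquad N = b - a + 1 .
\]
```

Because n is the smallest exponent with 2^n >= N, each round succeeds with probability greater than 1/2, so fewer than two rounds are needed on average.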

Related

Given a number, produce another random number that is the same every time and distinct from all other results

Basically, I would like help designing an algorithm that takes a given number and returns a random number that is unrelated to the first number. The stipulations being that a) the output number will always be the same for the same input number, and b) within a certain range (e.g. 1-100), all output numbers are distinct, i.e. no two different input numbers under 100 will give the same output number.
I know it's easy to do by creating an ordered list of numbers, shuffling them randomly, and then returning the input's index. But I want to know if it can be done without any caching at all. Perhaps with some kind of hashing algorithm? Mostly the reason for this is that if the range of possible outputs were much larger, say 10000000000, then it would be ludicrous to generate an entire range of numbers and then shuffle them randomly, if you were only going to get a few results out of it.
Doesn't matter what language it's done in, I just want to know if it's possible. I've been thinking about this problem for a long time and I can't think of a solution besides the one I've already come up with.
Edit: I just had another idea; it would be interesting to have another algorithm that returned the reverse of the first one. Whether or not that's possible would be an interesting challenge to explore.
This sounds like a non-repeating random number generator. There are several possible approaches to this.
As described in this article, we can generate them by selecting a prime number p that satisfies p % 4 = 3 and is large enough (greater than the maximum value in the output range), and then generating them this way:
int randomNumberUnique(int range_len, int p, int x)
{
    // range_len is not used here; x must lie in [0, p)
    long long sq = ((long long)x * x) % p;  // widen to avoid overflow
    if (2LL * x < p)
        return (int)sq;
    else
        return (int)(p - sq);
}
This algorithm will cover all values in [0 , p) for an input in range [0 , p).
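A quick way to convince yourself of that claim is to run the mapping over every input and check that every output appears exactly once. This is a sketch only: the name `permute_qr` and the choice p = 103 (a prime with 103 % 4 == 3) are assumptions made for the illustration.

```python
def permute_qr(x, p):
    """Map x in [0, p) to a unique value in [0, p); p must be prime with p % 4 == 3."""
    residue = (x * x) % p
    # Lower half maps to the squares, upper half to their negations.
    return residue if 2 * x < p else p - residue

p = 103
outputs = [permute_qr(x, p) for x in range(p)]
print(sorted(outputs) == list(range(p)))
```

The check passes because, for such primes, the negation of a nonzero square is never itself a square, so the two halves cannot collide.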
Here's an example in C#:
private void DoIt()
{
    const long m = 101;
    const long x = 387420489; // must be coprime to m
    var multInv = MultiplicativeInverse(x, m);
    var nums = new HashSet<long>();
    for (long i = 0; i < 100; ++i)
    {
        var encoded = i*x%m;
        var decoded = encoded*multInv%m;
        Console.WriteLine("{0} => {1} => {2}", i, encoded, decoded);
        if (!nums.Add(encoded))
        {
            Console.WriteLine("Duplicate");
        }
    }
}
private long MultiplicativeInverse(long x, long modulus)
{
    // The Bezout coefficient may be negative; normalize it into [0, modulus).
    var inverse = ExtendedEuclideanDivision(x, modulus).Item1%modulus;
    return inverse < 0 ? inverse + modulus : inverse;
}
private static Tuple<long, long> ExtendedEuclideanDivision(long a, long b)
{
    if (a < 0)
    {
        var result = ExtendedEuclideanDivision(-a, b);
        return Tuple.Create(-result.Item1, result.Item2);
    }
    if (b < 0)
    {
        var result = ExtendedEuclideanDivision(a, -b);
        return Tuple.Create(result.Item1, -result.Item2);
    }
    if (b == 0)
    {
        return Tuple.Create(1L, 0L);
    }
    var q = a/b;
    var r = a%b;
    var rslt = ExtendedEuclideanDivision(b, r);
    var s = rslt.Item1;
    var t = rslt.Item2;
    return Tuple.Create(t, s - q*t);
}
That generates numbers in the range 0-100, from input in the range 0-100. Each input results in a unique output.
It also shows how to reverse the process, using the multiplicative inverse.
You can extend the range by increasing the value of m. x must be coprime with m.
Code cribbed from Eric Lippert's article, A practical use of multiplicative inverses, and a few of the previous articles in that series.
You cannot get a completely unrelated number (particularly if you want the reverse mapping as well).
There is the concept of the modular inverse of a number, but this works for every nonzero number in the range only if the range bound is a prime, e.g. 100 will not work; you would need 101 (a prime). This can provide you a pseudo-random number if you want.
Here is the concept of modulo inverse:
If there are two numbers a and b, such that
(a * b) % p = 1
where p is any number, then
a and b are modular inverses of each other.
For this to be true, if we have to find the modular inverse of a wrt a number p, then a and p must be co-prime, ie. gcd(a,p) = 1
So, for all numbers in a range to have modular inverses, the range bound must be a prime number.
A few outputs for range bound 101 will be:
1 == 1
2 == 51
3 == 34
4 == 76
etc.
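The pairs listed above can be reproduced with Python's built-in three-argument `pow`, which accepts a negative exponent from Python 3.8 on. The wrapper name `mod_inverse` is just for illustration.

```python
def mod_inverse(a, p):
    """Return b in [1, p) with (a * b) % p == 1; requires gcd(a, p) == 1."""
    return pow(a, -1, p)

p = 101
for a in (1, 2, 3, 4):
    b = mod_inverse(a, p)
    print(a, "==", b, "check:", (a * b) % p)
```

Since the inverse of the inverse is the original number, the same function also reverses the mapping.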
EDIT:
Hey... actually, you can use the combined approach of the modular inverse and the method defined by @Paul. Since every pair will be unique and all numbers will be covered, your random number can be:
random(k) = randomUniqueNumber(ModuloInverse(k), p) //this is Paul's function

Better algorithm to find the maximum number whose square divides K

Given a number K which is a product of two different numbers (A, B), find the maximum number (<= A and <= B) whose square divides K.
E.g.: K = 54 (6*9). Both numbers are available, i.e. 6 and 9.
My approach is fairly simple, even trivial:
Take the smaller of the two (6 in this case); call it A.
Square the number and divide K by it; if it divides exactly, that's the number.
Else set A = A-1 and repeat, until A = 1.
For the given example, 3*3 = 9 divides K, and hence 3 is the answer.
I am looking for a better algorithm than the trivial solution.
Note: The test cases number in the thousands, so the best possible approach is needed.
I am sure someone else will come up with a nice answer involving modulus arithmetic. Here is a naive approach...
Each of the factors can themselves be factored (though it might be an expensive operation).
Given the factors, you can then look for groups of repeated factors.
For instance, using your example:
Prime factors of 9: 3, 3
Prime factors of 6: 2, 3
All prime factors: 2, 3, 3, 3
There are two 3s, so you have your answer (the square of 3 divides 54).
Second example of 36 x 9 = 324
Prime factors of 36: 2, 2, 3, 3
Prime factors of 9: 3, 3
All prime factors: 2, 2, 3, 3, 3, 3
So you have two 2s and four 3s, which means 2x3x3 is repeated. 2x3x3 = 18, so the square of 18 divides 324.
Edit: python prototype
import math
def factors(num, counts):
    """Find the prime factors of a number recursively.
    It is not the most efficient algorithm, and I
    have not tested it a lot. You should probably
    use another one. counts is a dictionary which looks
    like {factor: occurrences, factor: occurrences, ...}
    It must contain at least {2: 0} but need not have
    any other pre-populated elements. Factors will be added
    to this dictionary as they are found.
    """
    while num % 2 == 0:
        num //= 2  # integer division keeps num an int in Python 3
        counts[2] += 1
    i = 3
    found = False
    while not found and i <= int(math.sqrt(num)):
        if num % i == 0:
            found = True
            factors(i, counts)
            factors(num // i, counts)
        else:
            i += 2
    if not found and num > 1:  # num is prime; 1 is not a factor
        counts[num] = counts.get(num, 0) + 1
    return 0
# MAIN ROUTINE IS HERE
n1 = 37  # first number (6 in your example)
n2 = 41  # second number (9 in your example)
counts = {2: 0}  # initialise factors (start with "no factors of 2")
factors(n1, counts)  # find the factors of n1 and add them to the dictionary
factors(n2, counts)  # find the factors of n2 and add them to the dictionary
sqfac = 1
# now find all factors repeated twice and multiply them together
for k in counts:
    counts[k] //= 2
    sqfac *= k ** counts[k]
# here is the result
print(sqfac)
Answer in C++
#include <iostream>
using namespace std;
// Tests whether the square of i divides K; the other factor j is tried at most once.
void func(int i, int j, bool retried = false)
{
    const int k = 54;
    if (k % (i * i) == 0)
    {
        if (i < j && !retried)
        {
            func(j, i, true);
        }
        else
        {
            cout << "Number is correct: " << i << endl;
        }
    }
    else
    {
        cout << "Number is wrong" << endl;
        if (!retried)
        {
            func(j, i, true);
        }
    }
}
Explanation:
The function first tests whether the square of i divides K. If it does, it checks whether the other factor is larger; if so, it recurses to try the other factor, and if not, i is reported as correct. If the square of i does not divide K, it prints "Number is wrong" and recurses to test j. The retried flag stops the recursion once both factors have been tried.
If I understand the problem correctly, you have a rectangle of length A, width B, and area K,
and you want to convert it to a square while losing the minimum possible area.
If this is the case, the problem with your algorithm is not the cost of iterating multiple times until you get the output.
Rather, the problem is that your algorithm depends heavily on the length A and width B of the input rectangle,
while it should depend only on the area K.
For example:
Assume A = 1, B = 25.
Then K = 25 (the rectangle's area).
Your algorithm will take the minimum value, which is A, and accept it as the answer after a single iteration. That is fast but leads to a wrong answer, as it results in a square of area 1 and wastes the remaining 24 (whatever the unit, cm or m).
The correct answer here should be 5, which will never be reached by your algorithm.
So, in my solution I assume a single input K.
My idea is as follows:
x = sqrt(K)
if x is an integer, x is the answer
else loop from floor(x) down to 1
if K/x^2 is an integer, x is the answer
This might take extra iterations but guarantees an accurate answer.
Also, there might be some concerns about the cost of sqrt(K),
but it is called just once, and it avoids the misleading length and width inputs.
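A minimal sketch of that idea, depending only on K. The function name `largest_square_root_divisor` is made up here, and `math.isqrt` assumes Python 3.8 or later; it avoids floating-point square roots altogether.

```python
from math import isqrt

def largest_square_root_divisor(k):
    """Largest x such that x * x divides k, for k >= 1."""
    # Start at floor(sqrt(k)) and walk down; x = 1 always succeeds.
    for x in range(isqrt(k), 0, -1):
        if k % (x * x) == 0:
            return x

print(largest_square_root_divisor(54))
print(largest_square_root_divisor(25))
```

Note that this ignores the question's extra condition that the result be at most A and B; adding a cap of min(A, B) to the starting value would be one way to respect it.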

Find the sum of least common multiples of all subsets of a given set

Given: set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
Asked: Find the sum of all least common multiples (LCM) of all subsets of A of size at least 2.
The LCM of a set B = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin, for all 0 ≤ i < k.
Example:
Let N = 3 and A = {2, 6, 7}, then:
LCM({2, 6}) = 6
LCM({2, 7}) = 14
LCM({6, 7}) = 42
LCM({2, 6, 7}) = 42
----------------------- +
answer 104
The naive approach would be to simply calculate the LCM for all O(2^N) subsets, which is not feasible for reasonably large N.
Solution sketch:
The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.
The solution reads (modulo some small fixed grammar issues):
The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:
2 8
3 5
5 3
7 3
11 2
13 2
17 2
19 2
Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.
I have the following concrete questions:
What is meant by these states?
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
How is dp[i] computed from dp[i-1]?
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
*The original problem description can be found from this source (problem F). This question is a simplified version of that description.
Discussion
After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.
The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.
Note that this ought to include the LCM of sets with only one item (in which case we'd assume the LCM to be the element itself). The solution sketch seems to sabotage, perhaps because they handled it in a less elegant manner.
What is meant by these states?
The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.
Does dp stand for dynamic programming?
I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.
If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?
All I can think is that in their solution, state i represents a time to failure , T(i), with the number of times this time to failure has been counted, dp[i]. The resulting sum would be to sum all dp[i] * T(i).
dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..
Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).
To populate dp[i][n] from dp[i][n-1] for each component c:
For each i, copy dp[i][n-1] into dp[i][n].
Add 1 to dp[T(c)][n].
For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].
What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A union B) = LCM(LCM(A), LCM(B)).
Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.
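The counting idea above can be sketched as follows. This is my reading of the recurrence, not the contest's reference solution: a dictionary keyed by the LCM value stands in for the dp table, and the name `sum_of_subset_lcms` is hypothetical. Single-element subsets are subtracted at the end to match the "size at least 2" version of the question.

```python
from math import gcd

def sum_of_subset_lcms(values):
    """Sum of LCMs over all subsets of size >= 2 (elements distinguished by position)."""
    counts = {}  # LCM value -> number of non-empty subsets seen so far with that LCM
    for c in values:
        updated = dict(counts)                # subsets that do not use c
        updated[c] = updated.get(c, 0) + 1    # the subset {c} itself
        for value, n in counts.items():       # every earlier subset, extended by c
            merged = value // gcd(value, c) * c
            updated[merged] = updated.get(merged, 0) + n
        counts = updated
    total = sum(value * n for value, n in counts.items())
    return total - sum(values)                # drop the single-element subsets

print(sum_of_subset_lcms([2, 6, 7]))
```

The dictionary only stays small when few distinct LCM values occur, which is why the solution sketch needs the separate treatment of large primes.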
Why do the big primes not contribute to the number of states?
Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.
What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.
The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).
Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is
(1 + 2) (1 + 3) (1 + 5) (1 + 7),
because the LCM of a subset is exactly equal to the product here, so just multiply it out.
Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like
(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),
because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.
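The product formula can be checked against brute force on small inputs. In this sketch both function names are made up, the input is assumed to consist of primes only, and the empty set counts with LCM 1 as in the text.

```python
from collections import Counter
from itertools import combinations
from math import gcd

def lcm_sum_brute(values):
    """Sum of LCMs over all subsets, including the empty set (LCM 1)."""
    total = 0
    for size in range(len(values) + 1):
        for subset in combinations(values, size):
            current = 1
            for v in subset:
                current = current // gcd(current, v) * v
            total += current
    return total

def lcm_sum_primes(primes):
    """Same sum via the product formula; every input must be prime."""
    result = 1
    for prime, multiplicity in Counter(primes).items():
        result *= 1 + (2 ** multiplicity - 1) * prime
    return result

example = [2, 2, 3, 3, 3, 5]
print(lcm_sum_brute(example), lcm_sum_primes(example))
```

`combinations` treats equal values at different positions as different elements, which matches counting subsets of a multiset input.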
Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.
For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM has a small portion that properly divides the state instead of being exactly equal to it. Maybe I'll work a complete example later.
What is meant by these states?
I think here, states refer to if the number is in set B = {b0, b1, ..., bk-1} of LCMs of set A.
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
dp in the solution sketch stands for dynamic programming, I believe.
How is dp[i] computed from dp[i-1]?
It's feasible that we can figure out the state of next group of LCMs from previous states. So, we only need array of 2, and toggle back and forth.
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
We can use Prime Factorization and exponents only to present the number.
Here is one example.
6 = (2^1)(3^1)(5^0) -> state "1 1 0" to represent 6
18 = (2^1)(3^2)(5^0) -> state "1 2 0" to represent 18
Here is how we can get the LCM of 6 and 18 using prime factorization:
LCM(6,18) = (2^max(1,1))(3^max(1,2))(5^max(0,0)) = (2^1)(3^2)(5^0) = 18
2^9 > 500, 3^6 > 500, 5^4 > 500, 7^4>500, 11^3 > 500, 13^3 > 500, 17^3 > 500, 19^3 > 500
we can use only count of exponents of prime number 2,3,5,7,11,13,17,19 to represent the LCMs in the set B = {b0, b1, ..., bk-1}
for the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 <= 70000, so we only need two of dp[9][6][4][4][3][3][3][3] to keep tracks of all LCMs' states. So, dp[70000][2] is enough.
I put together a small C++ program to illustrate how we can get sum of LCMs of the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500. In the solution sketch, we need to loop through 70000 max possible of LCMs.
#include <iostream>
#include <vector>
using namespace std;
int gcd(int a, int b) {
    int remainder = 0;
    do {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (b != 0);
    return a;
}
int lcm(int a, int b) {
    if (a == 0 || b == 0) {
        return 0;
    }
    return (a / gcd(a, b)) * b;  // divide first to limit overflow
}
long sum_of_lcm(int A[], int N) {
    // the largest LCM that can occur is the LCM of the whole array
    int max = A[0];
    for (int i = 1; i < N; i++) {
        max = lcm(max, A[i]);
    }
    max++;
    // dp[v][k] = number of non-empty subsets seen so far whose LCM is v;
    // only two columns are kept and toggled between
    vector<vector<long> > dp(max, vector<long>(2, 0));
    int pri = 0;
    int cur = 1;
    for (int i = 0; i < N; i++) {
        // subsets without A[i] keep their LCM
        for (int v = 1; v < max; v++) {
            dp[v][cur] = dp[v][pri];
        }
        // adding A[i] to a subset with LCM v gives LCM lcm(A[i], v)
        for (int v = 1; v < max; v++) {
            if (dp[v][pri] > 0) {
                dp[lcm(A[i], v)][cur] += dp[v][pri];
            }
        }
        dp[A[i]][cur]++;  // the subset {A[i]} itself
        pri = cur;
        cur = (pri + 1) % 2;
    }
    // drop the single-element subsets
    for (int i = 0; i < N; i++) {
        dp[A[i]][pri] -= 1;
    }
    long total = 0;
    for (int j = 0; j < max; j++) {
        if (dp[j][pri] > 0) {
            total += dp[j][pri] * j;
        }
    }
    cout << "total:" << total << endl;
    return total;
}
int test() {
    int a[] = {2, 6, 7 };
    int n = sizeof(a)/sizeof(a[0]);
    long total = sum_of_lcm(a, n);
    return 0;
}
Output
total:104
The states are one more than the powers of primes. You have numbers up to 2^8, so the power of 2 is in [0..8], which is 9 states. Similarly for the other states.
"dp" could well stand for dynamic programming, I'm not sure.
The recurrence relation is the heart of the problem, so you will learn more by solving it yourself. Start with some small, simple examples.
For the large primes, try solving a reduced problem without using them (or their equivalents) and then add them back in to see their effect on the final result.

Randomly Generate a set of numbers of n length totaling x

I'm working on a project for fun and I need an algorithm to do as follows:
Generate a list of numbers of Length n which add up to x
I would settle for list of integers, but ideally, I would like to be left with a set of floating point numbers.
I would be very surprised if this problem wasn't heavily studied, but I'm not sure what to look for.
I've tackled similar problems in the past, but this one is decidedly different in nature. Before I've generated different combinations of a list of numbers that will add up to x. I'm sure that I could simply bruteforce this problem but that hardly seems like the ideal solution.
Anyone have any idea what this may be called, or how to approach it? Thanks all!
Edit: To clarify, I mean that the list should be length N while the numbers themselves can be of any size.
edit2: Sorry for my improper use of 'set', I was using it as a catch all term for a list or an array. I understand that it was causing confusion, my apologies.
This is how to do it in Python
import random
def random_values_with_prescribed_sum(n, total):
    x = [random.random() for i in range(n)]
    k = total / sum(x)
    return [v * k for v in x]
Basically you pick n random numbers, compute their sum and compute a scale factor so that the sum will be what you want it to be.
Note that this approach will not produce "uniform" slices, i.e. the distribution you get will tend to be more "egalitarian" than it would be if it were picked at random among all distributions with the given sum.
To see the reason, you can picture what the algorithm does in the case of two numbers with a prescribed sum (e.g. 1):
The point P is a generic point obtained by picking two random numbers, and it is uniform inside the square [0,1]x[0,1]. The point Q is the point obtained by scaling P so that the sum is 1. Points close to the center of the segment from (1,0) to (0,1) have a higher probability: for example, the exact center of the segment is reached by projecting any point on the diagonal (0,0)-(1,1), while the point (0,1) is reached only by projecting points from (0,0)-(0,1). The diagonal length is sqrt(2) = 1.4142..., while the square side is only 1.0.
Actually, you need to generate a partition of x into n parts. The partition of x into n non-negative parts can be represented in the following way: reserve n + x - 1 free places, put n - 1 borders in some arbitrary places, and stones in the rest. The n groups of stones add up to x, thus the number of possible partitions is the binomial coefficient C(n + x - 1, n - 1).
So your algorithm could be as follows: choose an arbitrary (n - 1)-subset of an (n + x - 1)-set; it uniquely determines a partition of x into n parts.
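A minimal sketch of the stones-and-borders idea for integers. It places n - 1 borders among n + x - 1 places, which is the count that yields exactly n groups; the function name `random_composition` is made up, and `random.sample` is used in place of a hand-written subset selection.

```python
import random

def random_composition(x, n):
    """Split the non-negative integer x into n non-negative integer parts."""
    places = n + x - 1
    borders = sorted(random.sample(range(places), n - 1))
    parts = []
    previous = -1
    # A final virtual border after the last place closes the last group.
    for border in borders + [places]:
        parts.append(border - previous - 1)  # stones between two borders
        previous = border
    return parts

print(random_composition(10, 4))
```

Every choice of border positions is equally likely, so every ordered split of x into n non-negative parts is equally likely.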
In Knuth's TAOCP, section 3.4.2 discusses random sampling. See Algorithm S there.
Algorithm S (choose n arbitrary records from a total of N):
1. Set t = 0, m = 0.
2. Generate u, a random number uniformly distributed on (0, 1).
3. If (N - t)*u >= n - m, skip the t-th record and increase t by 1; otherwise include the t-th record in the sample and increase m and t by 1.
4. If m < n, return to step 2; otherwise the algorithm is finished.
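The steps of Algorithm S translate into a short loop. This is a sketch based on the description above, not a transcription from the book; the name `selection_sample` is made up, records are numbered from 0, and it assumes n <= N.

```python
import random

def selection_sample(n, N):
    """Choose n of the records 0..N-1 in increasing order, as in selection sampling."""
    chosen = []
    t = 0  # records examined so far
    m = 0  # records selected so far
    while m < n:
        u = random.random()
        if (N - t) * u >= n - m:
            t += 1              # skip record t
        else:
            chosen.append(t)    # include record t in the sample
            m += 1
            t += 1
    return chosen

print(selection_sample(3, 10))
```

Once the number of remaining records equals the number still needed, the test always selects, so the loop ends with exactly n records.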
The solution for non-integers is algorithmically trivial: you just select n arbitrary numbers that don't sum to 0, divide them by their sum, and multiply by x.
If you want to sample uniformly in the region of N-1-dimensional space defined by x1 + x2 + ... + xN = x, then you're looking at a special case of sampling from a Dirichlet distribution. The sampling procedure is a little more involved than generating uniform deviates for the xi. Here's one way to do it, in Python:
import random
xs = [random.gammavariate(1, 1) for _ in range(N)]
s = sum(xs)  # compute the sum once instead of once per element
xs = [x * v / s for v in xs]
If you don't care too much about the sampling properties of your results, you can just generate uniform deviates and correct their sum afterwards.
Here is a version of the above algorithm in Javascript
function getRandomArbitrary(min, max) {
    return Math.random() * (max - min) + min;
}
function getRandomArray(min, max, n) {
    var arr = [];
    for (var i = 0; i < n; i++) {
        arr.push(getRandomArbitrary(min, max));
    }
    return arr;
}
function randomValuesPrescribedSum(min, max, n, total) {
    var arr = getRandomArray(min, max, n);
    var sum = arr.reduce(function(pv, cv) { return pv + cv; }, 0);
    var k = total/sum;
    var delays = arr.map(function(x) { return k*x; });
    return delays;
}
You can call it with
var myarray = randomValuesPrescribedSum(0,1,3,3);
And then check it with
var sum = myarray.reduce(function(pv, cv) { return pv + cv;},0);
This code does a reasonable job. I think it produces a different distribution than 6502's answer, but I am not sure which is better or more natural. Certainly his code is clearer/nicer.
import random
def parts(total_sum, num_parts):
    points = [random.random() for i in range(num_parts-1)]
    points.append(0)
    points.append(1)
    points.sort()
    ret = []
    for i in range(1, len(points)):
        ret.append((points[i] - points[i-1]) * total_sum)
    return ret
def test(total_sum, num_parts):
    ans = parts(total_sum, num_parts)
    assert abs(sum(ans) - total_sum) < 1e-7
    print(ans)
test(5.5, 3)
test(10, 1)
test(10, 5)
In python:
a: create a list of (random #'s 0 to 1) times total; append 0 and total to the list
b: sort the list, measure the distance between each element
c: round the list elements
import random
import time
TOTAL = 15
PARTS = 4
PLACES = 3
def random_sum_split(parts, total, places):
    a = [0, total] + [random.random()*total for i in range(parts-1)]
    a.sort()
    b = [(a[i] - a[i-1]) for i in range(1, (parts+1))]
    if places is None:
        return b
    else:
        b.pop()
        c = [round(x, places) for x in b]
        c.append(round(total-sum(c), places))
        return c
def tick():
    # info and log are not defined in this snippet; they come from the
    # environment the code was run in
    if info.tick == 1:
        start = time.time()
        alpha = random_sum_split(PARTS, TOTAL, PLACES)
        end = time.time()
        log('alpha: %s' % alpha)
        log('total: %.7f' % sum(alpha))
        log('parts: %s' % PARTS)
        log('places: %s' % PLACES)
        log('elapsed: %.7f' % (end-start))
yields:
[2014-06-13 01:00:00] alpha: [0.154, 3.617, 6.075, 5.154]
[2014-06-13 01:00:00] total: 15.0000000
[2014-06-13 01:00:00] parts: 4
[2014-06-13 01:00:00] places: 3
[2014-06-13 01:00:00] elapsed: 0.0005839
to the best of my knowledge this distribution is uniform

Algorithm to calculate the number of divisors of a given number

What would be the most optimal algorithm (performance-wise) to calculate the number of divisors of a given number?
It'll be great if you could provide pseudocode or a link to some example.
EDIT: All the answers have been very helpful, thank you. I'm implementing the Sieve of Atkin and then I'm going to use something similar to what Jonathan Leffler indicated. The link posted by Justin Bozonier has further information on what I wanted.
Dmitriy is right that you'll want the Sieve of Atkin to generate the prime list but I don't believe that takes care of the whole issue. Now that you have a list of primes you'll need to see how many of those primes act as a divisor (and how often).
Here's some Python for the algorithm: look here and search for "Subject: math - need divisors algorithm". Just count the number of items in the list instead of returning them, however.
Here's a Dr. Math page that explains what exactly you need to do mathematically.
Essentially it boils down to if your number n is:
n = a^x * b^y * c^z
(where a, b, and c are n's prime divisors and x, y, and z are the number of times that divisor is repeated)
then the total count for all of the divisors is:
(x + 1) * (y + 1) * (z + 1).
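A minimal sketch of that formula, using plain trial division to find the exponents instead of a precomputed prime list. The function name `count_divisors` is made up for the illustration.

```python
def count_divisors(n):
    """Number of divisors of n >= 1, from the exponents of its prime factorization."""
    count = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:   # only primes can divide here; smaller factors are gone
            n //= p
            exponent += 1
        count *= exponent + 1
        p += 1
    if n > 1:
        count *= 2          # one remaining prime factor with exponent 1
    return count

print(count_divisors(12))   # 12 = 2^2 * 3, so (2 + 1) * (1 + 1) divisors
```

For large inputs the cost is dominated by finding the factors, which is where a sieve or a dedicated factoring library would come in.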
Edit: BTW, to find a,b,c,etc you'll want to do what amounts to a greedy algo if I'm understanding this correctly. Start with your largest prime divisor and multiply it by itself until a further multiplication would exceed the number n. Then move to the next lowest factor and times the previous prime ^ number of times it was multiplied by the current prime and keep multiplying by the prime until the next will exceed n... etc. Keep track of the number of times you multiply the divisors together and apply those numbers into the formula above.
Not 100% sure about my algo description, but if that isn't it, it's something similar.
There are a lot more techniques for factoring than the sieve of Atkin. For example, suppose we want to factor 5893. Its square root is 76.76... Now we'll try to write 5893 as a difference of two squares. (77*77 - 5893) = 36, which is 6 squared, so 5893 = 77*77 - 6*6 = (77 + 6)(77 - 6) = 83*71. If that hadn't worked, we'd have looked at whether 78*78 - 5893 was a perfect square, and so on. With this technique you can test for factors near the square root of n much faster than by testing individual primes. If you combine this technique for ruling out large primes with a sieve, you will have a much better factoring method than with the sieve alone.
And this is just one of a large number of techniques that have been developed. This is a fairly simple one. It would take you a long time to learn, say, enough number theory to understand the factoring techniques based on elliptic curves. (I know they exist. I don't understand them.)
Therefore unless you are dealing with small integers, I wouldn't try to solve that problem myself. Instead I'd try to find a way to use something like the PARI library that already has a highly efficient solution implemented. With that I can factor a random 40 digit number like 124321342332143213122323434312213424231341 in about .05 seconds. (Its factorization, in case you wondered, is 29*439*1321*157907*284749*33843676813*4857795469949. I am quite confident that it didn't figure this out using the sieve of Atkin...)
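The difference-of-squares search described above for 5893 can be sketched in a few lines of Python. This is only a minimal illustration, assuming Python 3.8+ for math.isqrt; the name fermat_factor is made up for this example, and it is nowhere near what a library like PARI does.

```python
from math import isqrt

def fermat_factor(n):
    """Split an odd n > 1 into (p, q) with p * q == n by looking for
    a and b such that a*a - n == b*b, so that n == (a - b) * (a + b)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch only handles odd n > 1")
    a = isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:             # a*a - n is a perfect square
            return a - b, a + b
        a += 1                      # otherwise try the next a

print(fermat_factor(5893))
```

For a prime n the loop only stops at the trivial split (1, n), after roughly n/2 steps, so the search is only attractive when n has two factors close to its square root.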
@Yasky
Your divisors function has a bug in that it does not work correctly for perfect squares.
Try:
int divisors(int x) {
    int limit = x;
    int numberOfDivisors = 0;
    if (x == 1) return 1;
    for (int i = 1; i < limit; ++i) {
        if (x % i == 0) {
            limit = x / i;
            if (limit != i) {
                numberOfDivisors++;
            }
            numberOfDivisors++;
        }
    }
    return numberOfDivisors;
}
I disagree that the sieve of Atkin is the way to go, because it could easily take longer to check every number in [1,n] for primality than it would to reduce the number by divisions.
Here's some code that, although slightly hackier, is generally much faster:
import operator
from functools import reduce  # a builtin on Python 2, needs this import on Python 3

# A slightly efficient superset of primes.
def PrimesPlus():
    yield 2
    yield 3
    i = 5
    while True:
        yield i
        if i % 6 == 1:
            i += 2
        i += 2

# Returns a dict d with n = product p ^ d[p]
def GetPrimeDecomp(n):
    d = {}
    primes = PrimesPlus()
    for p in primes:
        while n % p == 0:
            n //= p  # integer division, so n stays an int
            d[p] = d.setdefault(p, 0) + 1
        if n == 1:
            return d

def NumberOfDivisors(n):
    d = GetPrimeDecomp(n)
    powers_plus = map(lambda x: x+1, d.values())
    return reduce(operator.mul, powers_plus, 1)
P.S. That's working Python code to solve this problem.
Here is a straightforward O(sqrt(n)) algorithm. I used this for Project Euler.
def divisors(n):
    if n == 1:
        return 1  # 1 has a single divisor
    count = 2  # accounts for 'n' and '1'
    i = 2
    while i ** 2 < n:
        if n % i == 0:
            count += 2
        i += 1
    if i ** 2 == n:
        count += 1
    return count
This interesting question is much harder than it looks, and it has not been answered. The question can be factored into 2 very different questions.
1. Given N, find the list L of N's prime factors.
2. Given L, calculate the number of unique combinations.
All answers I see so far refer to #1 and fail to mention it is not tractable for enormous numbers. For moderately sized N, even 64-bit numbers, it is easy; for enormous N, the factoring problem can take "forever". Public key encryption depends on this.
Question #2 needs more discussion. If L contains only unique numbers, it is a simple calculation using the combination formula for choosing k objects from n items. Actually, you need to sum the results from applying the formula while varying k from 1 to sizeof(L). However, L will usually contain multiple occurrences of multiple primes. For example, L = {2,2,2,3,3,5} is the factorization of N = 360. Now this problem is quite difficult!
Restating #2, given collection C containing k items, such that item a has a' duplicates, and item b has b' duplicates, etc. how many unique combinations of 1 to k-1 items are there? For example, {2}, {2,2}, {2,2,2}, {2,3}, {2,2,3,3} must each occur once and only once if L = {2,2,2,3,3,5}. Each such unique sub-collection is a unique divisor of N by multiplying the items in the sub-collection.
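For what it's worth, the count in #2 can be checked on the L = {2,2,2,3,3,5} example. The sketch below assumes that the empty selection (the divisor 1) and the full selection (N itself) are counted too, and compares the (x + 1) * (y + 1) * (z + 1) product quoted in another answer against a brute-force enumeration. The function names are made up for this illustration.

```python
from collections import Counter
from itertools import combinations

def count_sub_multisets(items):
    """Product of (multiplicity + 1) over the distinct items."""
    result = 1
    for multiplicity in Counter(items).values():
        result *= multiplicity + 1
    return result

def count_sub_multisets_brute(items):
    """Enumerate every selection and keep only the distinct ones."""
    ordered = sorted(items)
    seen = set()
    for k in range(len(ordered) + 1):
        for combo in combinations(ordered, k):
            seen.add(combo)
    return len(seen)

L = [2, 2, 2, 3, 3, 5]  # prime factors of 360
print(count_sub_multisets(L), count_sub_multisets_brute(L))
```

Both counts agree on this example; subtract 2 if the empty and full selections should be excluded, as in the "1 to k-1 items" wording above.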
An answer to your question depends greatly on the size of the integer. Methods for small numbers, e.g. less than 100 bits, and for numbers of ~1000 bits (such as those used in cryptography) are completely different.
general overview: http://en.wikipedia.org/wiki/Divisor_function
values for small n and some useful references: A000005: d(n) (also called tau(n) or sigma_0(n)), the number of divisors of n.
real-world example: factorization of integers
JUST one line
I have thought very carefully about your question and have tried to write a highly efficient and performant piece of code.
To print all the divisors of a given number on screen we need just one line of code!
(Use the option -std=c99 when compiling with gcc.)
for(int i=1,n=9;((!(n%i)) && printf("%d is a divisor of %d\n",i,n)) || i<=(n/2);i++);//n is your number
To find the number of divisors you can use the following very fast function (it works correctly for all integers except 1 and 2):
int number_of_divisors(int n)
{
    int counter,i;
    for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
    return counter;
}
Or, if you treat the given number itself as a divisor (works correctly for all integers except 1 and 2):
int number_of_divisors(int n)
{
    int counter,i;
    for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
    return ++counter;
}
NOTE: the two functions above work correctly for all positive integers except 1 and 2,
so they are usable for all numbers greater than 2.
But if you need to cover 1 and 2, you can use one of the following functions (a little slower):
int number_of_divisors(int n)
{
    int counter,i;
    for(counter=0,i=1;(!(n%i) && (counter++)) || i<=(n/2);i++);
    if (n==2 || n==1)
    {
        return counter;
    }
    return ++counter;
}
OR
int number_of_divisors(int n)
{
    int counter,i;
    for(counter=0,i=1;(!(i==n) && !(n%i) && (counter++)) || i<=(n/2);i++);
    return ++counter;
}
small is beautiful :)
The sieve of Atkin is an optimized version of the sieve of Eratosthenes which gives all prime numbers up to a given integer. You should be able to google this for more detail.
Once you have that list, it's a simple matter to divide your number by each prime to see if it's an exact divisor (i.e., remainder is zero).
The basic steps for calculating the prime divisors of a number (n) are [this is pseudocode converted from real code so I hope I haven't introduced errors]:
for z in 1..n:
    prime[z] = false
prime[2] = true;
prime[3] = true;
for x in 1..sqrt(n):
    xx = x * x
    for y in 1..sqrt(n):
        yy = y * y
        z = 4*xx+yy
        if (z <= n) and ((z mod 12 == 1) or (z mod 12 == 5)):
            prime[z] = not prime[z]
        z = z-xx
        if (z <= n) and (z mod 12 == 7):
            prime[z] = not prime[z]
        z = z-yy-yy
        if (z <= n) and (x > y) and (z mod 12 == 11):
            prime[z] = not prime[z]
for z in 5..sqrt(n):
    if prime[z]:
        zz = z*z
        x = zz
        while x <= n:
            prime[x] = false
            x = x + zz
for z in 2,3,5..n:
    if prime[z]:
        if n modulo z == 0 then print z
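The pseudocode above can be turned into runnable Python fairly directly. The following is a sketch under a few assumptions: Python 3.8+ (for math.isqrt), and the names sieve_of_atkin and prime_divisors are chosen for this example only. It follows the same three quadratic forms and the same final pass that removes multiples of squares.

```python
from math import isqrt

def sieve_of_atkin(limit):
    """Return the list of primes <= limit."""
    if limit < 2:
        return []
    is_prime = [False] * (limit + 1)
    is_prime[2] = True
    if limit >= 3:
        is_prime[3] = True
    root = isqrt(limit)
    for x in range(1, root + 1):
        xx = x * x
        for y in range(1, root + 1):
            yy = y * y
            z = 4 * xx + yy
            if z <= limit and z % 12 in (1, 5):
                is_prime[z] = not is_prime[z]
            z = 3 * xx + yy
            if z <= limit and z % 12 == 7:
                is_prime[z] = not is_prime[z]
            z = 3 * xx - yy
            if x > y and z <= limit and z % 12 == 11:
                is_prime[z] = not is_prime[z]
    # Candidates that survive but are divisible by a prime square are composite.
    for z in range(5, root + 1):
        if is_prime[z]:
            for multiple in range(z * z, limit + 1, z * z):
                is_prime[multiple] = False
    return [i for i in range(2, limit + 1) if is_prime[i]]

def prime_divisors(n):
    """Primes that divide n exactly, as in the last loop of the pseudocode."""
    return [p for p in sieve_of_atkin(n) if n % p == 0]

print(prime_divisors(360))
```

Sieving all the way up to n mirrors the pseudocode; it is only practical for fairly small n.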
You might try this one. It's a bit hackish, but it's reasonably fast.
def factors(n):
    for x in xrange(2,n):
        if n%x == 0:
            return (x,) + factors(n/x)
    return (n,1)
Once you have the prime factorization, there is a way to find the number of divisors. Add one to each of the exponents on each individual factor and then multiply the exponents together.
For example:
36
Prime Factorization: 2^2*3^2
Divisors: 1, 2, 3, 4, 6, 9, 12, 18, 36
Number of Divisors: 9
Add one to each exponent 2^3*3^3
Multiply exponents: 3*3 = 9
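The worked example for 36 can be reproduced with a short Python sketch. It assumes plain trial division is fast enough for the numbers involved, and the names prime_exponents and count_divisors are invented for this illustration.

```python
def prime_exponents(n):
    """Return {prime: exponent} for n >= 1, found by trial division."""
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                       # whatever is left is a single prime
        exponents[n] = exponents.get(n, 0) + 1
    return exponents

def count_divisors(n):
    """Add one to each exponent and multiply the results together."""
    total = 1
    for exponent in prime_exponents(n).values():
        total *= exponent + 1
    return total

print(prime_exponents(36), count_divisors(36))
```

For 36 the exponents are 2 and 2, so the product is 3 * 3, matching the nine divisors listed above.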
Before you commit to a solution consider that the Sieve approach might not be a good answer in the typical case.
A while back there was a question about primes and I did a timing test: for 32-bit integers at least, determining whether a number was prime with a sieve was slower than brute force. There are two factors going on:
1) While a human takes a while to do a division, divisions are very quick on the computer, similar to the cost of looking up the answer.
2) If you do not have a prime table, you can make a loop that runs entirely in the L1 cache. This makes it faster.
This is an efficient solution:
#include <iostream>
int main() {
    int num = 20;
    int numberOfDivisors = 1;
    for (int i = 2; i <= num; i++)
    {
        int exponent = 0;
        while (num % i == 0) {
            exponent++;
            num /= i;
        }
        numberOfDivisors *= (exponent+1);
    }
    std::cout << numberOfDivisors << std::endl;
    return 0;
}
Divisors do something spectacular: they divide completely. If you want to check the number of divisors for a number, n, it clearly is redundant to span the whole spectrum, 1...n. I have not done any in-depth research for this, but I solved Project Euler's problem 12 on triangular numbers. My solution for the greater than 500 divisors test ran for 309504 microseconds (~0.3s). I wrote this divisor function for the solution.
int divisors (int x) {
    int limit = x;
    int numberOfDivisors = 1;
    for (int i(0); i < limit; ++i) {
        if (x % i == 0) {
            limit = x / i;
            numberOfDivisors++;
        }
    }
    return numberOfDivisors * 2;
}
For every algorithm, there is a weak point. I thought this was weak against prime numbers. But since triangular numbers are not prime, it served its purpose flawlessly. From my profiling, I think it did pretty well.
Happy Holidays.
You want the Sieve of Atkin, described here: http://en.wikipedia.org/wiki/Sieve_of_Atkin
Number theory textbooks call the divisor-counting function tau. The first interesting fact is that it's multiplicative, i.e. τ(ab) = τ(a)τ(b) when a and b have no common factor. (Proof: each pair of divisors of a and b gives a distinct divisor of ab.)
Now note that for p a prime, τ(p**k) = k+1 (the powers of p). Thus you can easily compute τ(n) from its factorisation.
However, factorising large numbers can be slow (the security of RSA cryptography depends on the product of two large primes being hard to factorise). That suggests this optimised algorithm:
Test if the number is prime (fast)
If so, return 2
Otherwise, factorise the number (slow if multiple large prime factors)
Compute τ(n) from the factorisation
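The four steps above might look like this in Python. This is a sketch under stated assumptions: the primality test is Miller-Rabin with the first twelve primes as fixed bases, which is known to be deterministic for 64-bit inputs, and the factorisation step is plain trial division standing in for whatever real factoring method you would use. The names is_prime and tau are chosen for this example.

```python
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n):
    """Miller-Rabin with fixed bases (assumed sufficient for n < 2**64)."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:               # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False            # a is a witness that n is composite
    return True

def tau(n):
    """Number of divisors of n >= 1."""
    if n == 1:
        return 1
    if is_prime(n):                 # fast path: a prime has exactly 2 divisors
        return 2
    count, p = 1, 2
    while p * p <= n:               # slow path: factorise by trial division
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        count *= exponent + 1       # tau(p**k) = k + 1, combined multiplicatively
        p += 1
    if n > 1:
        count *= 2                  # one remaining prime factor
    return count

print(tau(360), tau(2**61 - 1))
```

The fast path pays off for inputs like the Mersenne prime 2**61 - 1, where trial division alone would need on the order of a billion iterations.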
This is the most basic way of computing the divisors of a number:
import java.util.Scanner;

class PrintDivisors
{
    public static void main(String args[])
    {
        System.out.println("Enter the number");
        // Create Scanner object for taking input
        Scanner s=new Scanner(System.in);
        // Read an int
        int n=s.nextInt();
        // Loop from 1 to 'n'
        for(int i=1;i<=n;i++)
        {
            // If remainder is 0 when 'n' is divided by 'i',
            if(n%i==0)
            {
                System.out.print(i+", ");
            }
        }
        // Print [not necessary]
        System.out.print("are divisors of "+n);
    }
}
The prime number method is very clear here.
P[] is a list of the prime numbers less than or equal to sq = sqrt(n); count starts at 1.
for (int i = 0 ; i < size && P[i]<=sq ; i++){
    nd = 1;
    while(n%P[i]==0){
        n/=P[i];
        nd++;
    }
    count*=nd;
    if (n==1)break;
}
if (n!=1)count*=2;//the confusing line :D :P .
I will leave understanding it to the reader.
I now look forward to a more optimized method.
The following is a C program to find the number of divisors of a given number.
The complexity of this algorithm is O(sqrt(n)).
It works correctly for perfect squares as well as for numbers that are not perfect squares.
Note that the upper limit of the loop is set to the square root of the number, which makes the algorithm most efficient.
Note also that storing the upper limit in a separate variable saves time: you should not call the sqrt function in the condition of the for loop.
#include<stdio.h>
#include<math.h>
int main()
{
    int i,n,limit,numberOfDivisors=0;
    printf("Enter the number : ");
    scanf("%d",&n);
    limit=(int)sqrt((double)n);
    /* every divisor i <= sqrt(n) pairs with n/i; starting at 1 counts both 1 and n */
    for(i=1;i<=limit;i++)
        if(n%i==0)
        {
            if(i!=n/i)
                numberOfDivisors+=2;
            else
                numberOfDivisors++;
        }
    printf("%d\n",numberOfDivisors);
    return 0;
}
Instead of the for loop above you can also use the following loop, which is even more efficient as it removes the need to compute the square root of the number.
for(i=1;i*i<=n;i++)
{
    ...
}
Here is a function that I wrote. Its worst-case time complexity is O(sqrt(n)); its best case, on the other hand, is O(log(n)). It returns all the prime factors of n, each repeated as many times as it occurs.
import java.util.ArrayList;
import java.util.List;

public static List<Integer> divisors(int n) {
    List<Integer> aList = new ArrayList<>();
    int top_count = (int) Math.round(Math.sqrt(n));
    int new_n = n;
    for (int i = 2; i <= top_count; i++) {
        if (new_n == (new_n / i) * i) {
            aList.add(i);
            new_n = new_n / i;
            top_count = (int) Math.round(Math.sqrt(new_n));
            i = 1; // restart from 2 on the reduced number
        }
    }
    aList.add(new_n);
    return aList;
}
@Kendall
I tested your code and made some improvements; now it is even faster.
I also tested it against @هومن جاویدپور's code, and it is faster than that as well.
#include <math.h>

long long int FindDivisors(long long int n) {
    long long int count = 0;
    long long int i, m = (long long int)sqrt(n);
    for(i = 1;i <= m;i++) {
        if(n % i == 0)
            count += 2;
    }
    if(n / m == m && n % m == 0)
        count--;
    return count;
}
Isn't this just a question of factoring the number - determining all the factors of the number? You can then decide whether you need all combinations of one or more factors.
So, one possible algorithm would be:
factor(N)
    divisor = first_prime
    list_of_factors = { 1 }
    while (N > 1)
        while (N % divisor == 0)
            add divisor to list_of_factors
            N /= divisor
        divisor = next_prime
    return list_of_factors
It is then up to you to combine the factors to determine the rest of the answer.
I think this is what you are looking for. It does exactly what you asked for.
Copy and paste it into Notepad, save it as *.bat, run it, and enter a number. Multiply the number of results by 2 and that's the number of divisors. I made it that way on purpose so that it determines the divisors faster:
Please note that a CMD variable can't support values over 999999999.
@echo off
mode con:cols=100 lines=100
:start
title Enter the Number to Determine
cls
echo Determine a number as a product of 2 numbers
echo.
echo Ex1 : C = A * B
echo Ex2 : 8 = 4 * 2
echo.
echo Max Number length is 9
echo.
echo If there is only 1 proces done it
echo means the number is a prime number
echo.
echo Prime numbers take time to determine
echo Number not prime are determined fast
echo.
set /p number=Enter Number :
if %number% GTR 999999999 goto start
echo.
set proces=0
set mindet=0
set procent=0
set B=%Number%
:Determining
set /a mindet=%mindet%+1
if %mindet% GTR %B% goto Results
set /a solution=%number% %%% %mindet%
if %solution% NEQ 0 goto Determining
if %solution% EQU 0 set /a proces=%proces%+1
set /a B=%number% / %mindet%
set /a procent=%mindet%*100/%B%
if %procent% EQU 100 set procent=%procent:~0,3%
if %procent% LSS 100 set procent=%procent:~0,2%
if %procent% LSS 10 set procent=%procent:~0,1%
title Progress : %procent% %%%
if %solution% EQU 0 echo %proces%. %mindet% * %B% = %number%
goto Determining
:Results
title %proces% Results Found
echo.
@pause
goto start
I guess this one will be handy as well as precise:
script.python
factors = [x for x in range(1, n + 1) if n % x == 0]
print(len(factors))
Try something along these lines:
int divisors(int myNum) {
    int limit = myNum;
    int divisorCount = 0;
    if (myNum == 1)
        return 1;
    for (int i = 1; i < limit; ++i) {
        if (myNum % i == 0) {
            limit = myNum / i;
            if (limit != i)
                divisorCount++;
            divisorCount++;
        }
    }
    return divisorCount;
}
I don't know the MOST efficient method, but I'd do the following:
Create a table of primes to find all primes less than or equal to the square root of the number (Personally, I'd use the Sieve of Atkin)
Count all primes less than or equal to the square root of the number and multiply that by two. If the square root of the number is an integer, then subtract one from the count variable.
Should work \o/
If you need, I can code something up tomorrow in C to demonstrate.

Resources