I wrote a Haskell function that produces the prime factorizations of all numbers up to a certain threshold that are built from a given set of prime factors. A minimal working example can be found here:
http://lpaste.net/117263
The problem: it works very well for "threshould <= 10^9" on my computer. But starting with "threshould = 10^10" the method doesn't produce any results on my computer; I never see even the first list element on my screen. The name of the critical function is "exponentSets". For every prime in the list 'factors', it computes the possible exponents (taking into account the exponents already chosen for other primes). Further comments are in the code. If 10^10 works well on your machine, try it with a higher exponent (10^11 ...).
My question: what is responsible for that? How can I improve the function "exponentSets"? (I'm still not very experienced in Haskell, so someone more experienced might have an idea.)
Even though you are using 64-bit integers, you still do not have enough capacity to store a temporary integer which is created in intLog:
intLog base num =
let searchExtend lower@(e, n) =
let upper@(e', n') = (2 * e, n^2) -- this line is what causes the problems
-- some code
in (some if) searchExtend (1, base)
rawLists is defined like this:
rawLists = recCall 1 threshould
Which in turn sets remaining_threshould in recCall to
threshould `quot` 1 -- same as threshould
Now intLog gets called by recCall like this:
intLog p remaining_threshould
which is the same as
intLog p threshould
Now comes the interesting part: Since num p is smaller than your base threshold, you call searchExtend (1, base), which then in turn does this:
searchExtend (e, n) =
let (e', n') = (2 * e, n ^ 2)
Since n is remaining_threshould, which is the same as threshould, you essentially square 2^32 + 1 and store this in an Int, which overflows and causes rawLists to give bogus results.
(2 ^ 32 + 1) ^ 2 :: Int is 8589934593
(2 ^ 32 + 1) ^ 2 :: Integer is 18446744082299486209
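The wraparound can be reproduced outside Haskell. The following Python sketch emulates a signed 64-bit value with a hypothetical helper `wrap_int64`. It also shows `int_log`, an illustrative integer logarithm (an assumption for demonstration, not the code from the paste) that never forms a value larger than `num` and therefore cannot overflow a fixed-width integer.

```python
def wrap_int64(x: int) -> int:
    """Reduce x to the value a signed 64-bit two's-complement integer would hold."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x


def int_log(base: int, num: int) -> int:
    """Largest e with base**e <= num, assuming base >= 2 and num >= 1."""
    e, n = 0, 1
    # n <= num // base is equivalent to n * base <= num,
    # but the product is only formed once it is known to fit.
    while n <= num // base:
        n *= base
        e += 1
    return e


print(wrap_int64((2**32 + 1) ** 2))  # the wrapped 64-bit result
print((2**32 + 1) ** 2)              # the exact result
print(int_log(2, 10**10))
```

In Haskell the usual alternatives are to do this arithmetic in `Integer`, or to compare against a quotient as above instead of squaring.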
I would like an algorithm that uses only shift, add, or subtract operations to find whether a number is a multiple of 6. So, basically just binary operations.
So far I think I should logically right-shift the number twice to divide by 4 and then subtract 6 once from it. But I know something is wrong with my approach and cannot figure out what.
1) Simple (N & 1) == 0 to check if number is divisible by 2.
2) Use the bit hack answer (from this thread) to check for divisibility by 3.
If both are true, your number is divisible by 6.
How about repeatedly subtracting 6 from the number until it reaches zero or goes negative?
If you reach exactly zero, the number is divisible by 6; otherwise it is not.
OR
Keep dividing the number by 2 (a right shift in binary) until the number is less than 12.
Then subtract 6 from it. If the result is less than zero, the number is not divisible;
if it is zero, the number is divisible.
If neither, subtract 3.
If the result is less than zero, the number is not divisible;
if it is zero, the number is divisible.
You could try implementing a division algorithm with the primitive operations available to you. The basic long-division algorithm from 4th grade might be enough (just do things in base 2 instead of base 10, with bitshifting instead of multiplication)
OK. This is how I would go about it (just a first thought) :
A multiple of 6 is both a multiple of 2 and 3, so it should satisfy the divisibility criteria of 2 and 3 at the same time... So...
Check divisibility by 2 :
1. Right shift the number.
2. If remainder>1, repeat 1.
3. If remainder=1, then FALSE, else continue.
Checking divisibility by 2 could obviously also be implemented as ((N & 1) == 0), as stated above. This simply checks the last digit of N's binary representation: if it's 1, N is odd (thus NOT divisible by 2); if it's 0, N is divisible by 2.
Check divisibility by 3 :
1. Subtract 3.
2. If remainder >= 3, repeat 1.
3. If remainder > 0, then FALSE, else TRUE.
Reference: http://wiki.answers.com/Q/How_can_you_tell_if_a_number_is_a_multiple_of_6
It is a multiple of six if BOTH of the following statements are true:
1) The last digit (ones place) is 0, 2, 4, 6, or 8.
2) When you add all the digits together, you get a multiple of 3.
Reference: http://wiki.answers.com/Q/How_can_you_tell_if_a_number_is_a_multiple_of_3
1) Start with a number N.
2) Sum the digits of the number, and get M.
3) If M is 10 or larger, set N=M and return to stage 2.
4) Otherwise, M is now smaller than 10. If M is 0,3,6 or 9, then N is a multiple of 3
If we extend the range of operations to "bit-masking" and "bit-shifting", it's simple.
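As quite a few have stated, divisibility by two is equivalent to (n & 1) == 0. Divisibility by 3 is (relatively) easy in binary. Initialize an accumulator a to 0, then repeat a += (n & 3); n = (n >> 2); until n is 0. Because 4 leaves remainder 1 when divided by 3, a has the same remainder modulo 3 as the original n. If a is still larger than 3, apply the same reduction to a; once the value is at most 3, n is divisible by 3 if and only if that value is 0 or 3.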
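A minimal Python sketch of the mask-and-shift test, assuming n is a non-negative integer. Because 4 leaves remainder 1 when divided by 3, the sum of the base-4 digits of n has the same remainder modulo 3 as n itself; the sketch repeats that digit sum until the value is at most 3. The function names are illustrative.

```python
def is_divisible_by_3(n: int) -> bool:
    """Divisibility by 3 using only masking, shifting, and addition (n >= 0)."""
    while n > 3:
        a = 0
        while n:
            a += n & 3   # add the lowest base-4 digit
            n >>= 2      # drop that digit
        n = a            # the digit sum is smaller and congruent mod 3
    return n == 0 or n == 3


def is_multiple_of_6(n: int) -> bool:
    """A multiple of 6 is even and divisible by 3."""
    return (n & 1) == 0 and is_divisible_by_3(n)


print([n for n in range(40) if is_multiple_of_6(n)])
```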
In the book Introduction to Algorithms, there is an exercise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The result of Random(a,b) should be uniformly distributed, just like that of Random(0,1).
For the Random function, the results are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b.
My solution is like this:
for i = 1 to b-a
    r = a + Random(0,1)
return r
The running time is T=b-a.
Is this correct? Are the results of my solution uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a //including b-a
    r += Random(0,1)
return r
If it is not correct, why does r += Random(0,1) make r non-uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r=0
for i = 1 to p
    r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
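The four steps of this answer can be sketched in Python as follows. `random01` is a stand-in for Random(0,1), and the optional `coin` parameter is an assumption added only so the function can be driven by a fixed bit sequence when testing.

```python
import random


def random01() -> int:
    """Stand-in for Random(0,1): a fair coin returning 0 or 1."""
    return random.getrandbits(1)


def random_ab(a: int, b: int, coin=random01) -> int:
    """Uniform integer in [a, b], built only from calls to coin()."""
    span = b - a
    p = span.bit_length()        # smallest p with 2**p > span
    while True:
        r = 0
        for _ in range(p):
            r = 2 * r + coin()   # append one random bit
        if r <= span:            # otherwise reject and draw p new bits
            return a + r


print([random_ab(1, 3) for _ in range(10)])
```

Since 2^p is at most 2(b-a), each round is accepted with probability greater than 1/2, so the expected number of calls to Random(0,1) is below 2p, which is O(log(b-a)).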
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? It would make more sense if it produced floating-point values between 0 and 1. Then the solution would be an affine transformation, with running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct. This sum will have a binomial distribution.
However, you can generate a pure random sequence of 0, 1 and treat it as a binary number.
repeat
    result = a
    steps = ceiling(log2(b - a))
    for i = 0 to steps
        result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a+1 elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it comes up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probability theory you can prove that your solution is not uniformly distributed. The chance that the resulting value r is (a+b)/2 is greatest. For instance, if a is 0 and b is 7, the chance that you get a value of 4 is (7 choose 4) divided by 2 raised to the power 7. The reason is that no matter which 4 of the 7 values are 1, the result will still be 4.
The running time you estimate is correct.
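To make the non-uniformity concrete, here is a small Python sketch that computes the exact distribution of r = a + (sum of b-a coin flips), the accumulating variant of the question's code. The function name is illustrative; exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from math import comb


def sum_of_flips_distribution(a: int, b: int) -> dict:
    """Exact probability of each result of a + (b - a fair coin flips)."""
    n = b - a
    return {a + k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}


dist = sum_of_flips_distribution(0, 7)
for value, prob in dist.items():
    print(value, prob)
```

For a = 0 and b = 7 the value 4 has probability 35/128 while the value 0 has probability 1/128; a uniform result would give each of the 8 values probability 1/8.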
Your solution's pseudocode should look like:
r=a
for i = 1 to b-a
    r+=Random(0,1)
return r
As for uniform distribution, assuming that the random implementation this random number generator is based on is perfectly uniform, the odds of getting 0 or 1 are 50% each. Therefore getting the number you want is the result of that choice made over and over again.
So for a=1, b=5, there are 4 choices made.
The odds of getting 1 involve 4 decisions, all 0; the odds of that are 0.5^4 = 6.25%.
The odds of getting 5 involve 4 decisions, all 1; the odds of that are 0.5^4 = 6.25%.
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
In the algorithm you created, it is really not equally distributed.
The result "r" will always be either "a" or "a+1". It will never go beyond that.
It should look something like this:
r = a;
for i = 1 to b-a
    r = r + Random(0,1)
return r;
By including "r" into your computation, you are including the "randomness" of all the previous "for" loop runs.
Say S = 5 and N = 3; the solutions would look like <0,0,5>, <0,1,4>, <0,2,3>, <0,3,2>, <5,0,0>, <2,3,0>, <3,2,0>, <1,2,2>, etc.
In the general case, N nested loops can be used to solve the problem. Run N nested loops, and inside them check whether the loop variables add up to S.
If we do not know N ahead of time, we can use a recursive solution. At each level, run a loop from 0 to S, and then call the function itself again. When we reach a depth of N, check whether the numbers obtained add up to S.
Any other dynamic programming solution?
Try this recursive function:
f(s, n) = 1 if s = 0
= 0 if s != 0 and n = 0
= sum f(s - i, n - 1) over i in [0, s] otherwise
To use dynamic programming you can cache the value of f after evaluating it, and check if the value already exists in the cache before evaluating it.
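A minimal Python sketch of this recurrence with caching, using `functools.lru_cache` as the cache (one possible choice; a hand-written dictionary would work equally well). It assumes s and n are non-negative integers.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(s: int, n: int) -> int:
    """Number of ways to write s as an ordered sum of n non-negative integers."""
    if s == 0:
        return 1          # only the all-zero tuple remains
    if n == 0:
        return 0          # a positive sum with no terms left is impossible
    # choose the first term i, then fill the remaining n - 1 positions
    return sum(f(s - i, n - 1) for i in range(s + 1))


print(f(5, 3))
```

With the cache, each pair (s, n) is evaluated once, so the work is bounded by the number of distinct pairs times the length of the inner sum.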
There is a closed-form formula: binomial(s + n - 1, s), or equivalently binomial(s + n - 1, n - 1).
Those numbers are the simplex numbers.
If you want to compute them, use the log gamma function or arbitrary precision arithmetic.
See https://math.stackexchange.com/questions/2455/geometric-proof-of-the-formula-for-simplex-numbers
I have my own formula for this. My friend Gio and I wrote an investigative report on it. The formula we got is 2^(n-1) - 1, where n is the number whose count of sums we are looking for.
Let's try.
If n is 1, it has 0 such sums: there are no two or more numbers that add up to 1 (excluding 0). Let's try a higher number.
Let's try 4. The sums for 4 are: 1+1+1+1, 1+2+1, 1+1+2, 2+1+1, 1+3, 2+2, 3+1. That is 7 in total.
Let's check with the formula: 2^(4-1) - 1 = 2^3 - 1 = 8 - 1 = 7.
Let's try 15: 2^(15-1) - 1 = 2^14 - 1 = 16384 - 1 = 16383. Therefore, there are 16383 ways to add numbers that sum to 15.
(Note: addends are positive integers only.)
(You can try other numbers to check whether our formula is correct.)
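The formula can be checked by brute force. The Python sketch below enumerates every ordered sum of positive integers equal to n and counts those with at least two addends; the helper names are illustrative. Note that this counts ordered sums of any length, with positive addends, which is a different count from the fixed-length sums with zeros allowed discussed in the other answers.

```python
def compositions(n: int):
    """Yield every ordered tuple of positive integers that sums to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def count_sums_with_two_or_more_addends(n: int) -> int:
    return sum(1 for c in compositions(n) if len(c) >= 2)


for n in range(1, 8):
    print(n, count_sums_with_two_or_more_addends(n), 2 ** (n - 1) - 1)
```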
This can be calculated in O(s+n) (or O(1) if you don't mind an approximation) in the following way:
Imagine we have a string with n-1 X's in it and s o's. So for your example of s=5, n=3, one example string would be
oXooXoo
Notice that the X's divide the o's into three distinct groupings: one of length 1, length 2, and length 2. This corresponds to your solution of <1,2,2>. Every possible string gives us a different solution, by counting the number of o's in a row (a 0 is possible: for example, XoooooX would correspond to <0,5,0>). So by counting the number of possible strings of this form, we get the answer to your question.
There are s+(n-1) positions to choose for s o's, so the answer is Choose(s+n-1, s).
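The string argument translates directly into code. This Python sketch, assuming n >= 1, picks the n-1 positions of the X's among the s+n-1 characters and reads off the lengths of the runs of o's between them; `solutions` is an illustrative name.

```python
from itertools import combinations
from math import comb


def solutions(s: int, n: int) -> list:
    """All n-tuples of non-negative integers summing to s (n >= 1)."""
    total = s + n - 1                       # length of the o/X string
    result = []
    for bars in combinations(range(total), n - 1):
        sizes, prev = [], -1
        for bar in bars:
            sizes.append(bar - prev - 1)    # o's between consecutive X's
            prev = bar
        sizes.append(total - prev - 1)      # o's after the last X
        result.append(tuple(sizes))
    return result


sols = solutions(5, 3)
print(len(sols), comb(5 + 3 - 1, 5))
print(sols[:5])
```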
There is a fixed formula for the answer. If you want to find the number of ways to get N as the sum of R elements, the answer is always:
(N+R-1)!/((R-1)!*(N)!)
or in other words:
(N+R-1) C (R-1)
This actually looks a lot like a Towers of Hanoi problem, without the constraint of stacking disks only on larger disks. You have S disks that can be in any combination on N towers. So that's what got me thinking about it.
What I suspect is that there is a formula we can deduce that doesn't require the recursive programming. I'll need a bit more time though.
Just out of curiosity, I tried to do the following, which turned out not to be so obvious to me.
Suppose I have nested loops with runtime bounds, for example:
t = 0 // trip count
for l in 0:N
    for k in 0:N
        for j in max(l,k):N
            for i in k:j+1
                t += 1
t is the loop trip count.
Is there a general algorithm or method (better than N^4, obviously) to calculate the loop trip count?
If not, I would be curious to know how you would approach just this particular loop. The above loop is symmetric (it loops over a symmetric rank-4 tensor), and I am also interested in methods to detect loop symmetry.
I am working on the assumption that the iteration bounds depend only on constants or previous loop variables. A link or journal article, if you know of one, would be great.
I believe the inner loop will run
t = 1/8 * (N^4 + 6 * N^3 + 7 * N^2 + 2 * N)
times.
I did not really solve the problem directly, I fitted a 4-th order polynomial expression to exactly calculated t for N from 1 to 50 hoping that I'll get exact fit.
To calculate exact t I used
sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),N),k,1,N),l,1,N)
which should be the equivalent of actually running your loops.
data fit, log scale http://img714.imageshack.us/img714/2313/plot3.png
The fit for N from 1 to 50 matches exactly and calculating it for N=100 gives 13258775 using both methods.
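The same comparison can be sketched in Python. The brute-force function below mirrors the Maxima sum used in this answer, so it assumes closed ranges and indices l and k starting at 1; under a different reading of the question's range notation the polynomial would differ. Function names are illustrative.

```python
def trip_count_brute(n: int) -> int:
    """Run the four nested loops with closed ranges, l and k from 1 to n."""
    t = 0
    for l in range(1, n + 1):
        for k in range(1, n + 1):
            for j in range(max(l, k), n + 1):
                for i in range(k, j + 2):   # i runs from k to j+1 inclusive
                    t += 1
    return t


def trip_count_closed(n: int) -> int:
    """The fitted polynomial (n^4 + 6n^3 + 7n^2 + 2n) / 8."""
    return (n**4 + 6 * n**3 + 7 * n**2 + 2 * n) // 8


for n in range(1, 9):
    print(n, trip_count_brute(n), trip_count_closed(n))
```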
EDIT:
The exercise was done using open source algebra system maxima, here's the actual source (output discarded):
nr(n):=sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),n),k,1,n),l,1,n);
M : genmatrix( lambda([i,j],if j=1 then i else nr(i)), 50, 2 );
coefs : lsquares_estimates(M, [x,y], y = A*x^4+B*x^3+C*x^2+D*x+E, [A,B,C,D,E]);
sol(x):=ev(A*x^4+B*x^3+C*x^2+D*x+E, coefs);
sol(N);
S : genmatrix( lambda([i,j], if j=1 then i else sol(i)), 50, 2);
M-S;
plot2d([[discrete,makelist([M[N][1],M[N][2]],N,1,50)], sol(N)], [N, 1, 60], [style, points, lines], [color, red, blue], [legend, "simulation", sol(N)], [logy]);
compare(nr(100),sol(100));
If you want to know how many times the inner loop:
for j in max(l,k):N
would be executed, just compute N - max(l, k) assuming an open range, or N + 1 - max(l, k) assuming a closed range.
For example, if:
l = 2
k = 7
N = 10
then it will run on 7, 8, 9, 10 (closed range), so indeed 10 + 1 - 7 = 4 times.
The answer is no, as long as the loop bounds can depend on the outer variables in an arbitrary fashion, as this would provide a general means of getting closed-form formulations of arbitrary series.
To see this, consider the following:
for x in 0:N
    for y in 0:f(x)
        t += 1
The trip count t(N) equals the sum t(N) = f(0)+f(1)+f(2)+f(3)+...+f(N-1).
So if you can get a closed-form formulation for t(N) regardless of f(), you have found a very general method of producing closed forms, too general I would say, because what you have here corresponds to an integral, and it's known that not all integrals admit closed-form formulations.
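A small Python sketch of this argument, assuming half-open ranges (x from 0 to N-1, y from 0 to f(x)-1) so that the trip count matches f(0)+f(1)+...+f(N-1). The bound function `f` passed in is an arbitrary, hypothetical example.

```python
def trip_count(N: int, f) -> int:
    """Count iterations of the two nested loops with inner bound f(x)."""
    t = 0
    for x in range(N):
        for y in range(f(x)):
            t += 1
    return t


def square(x: int) -> int:
    # hypothetical inner bound; any non-negative integer function works
    return x * x


print(trip_count(6, square), sum(square(x) for x in range(6)))
```

For this particular f a closed form for the sum of squares happens to exist, but nothing in the loop structure guarantees one for an arbitrary f.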