smallest integral divisor in OCaml - algorithm

I want to get a smallest integral divisor of a given number n
let rec smallest_divisor : int -> int
=fun n -> let i = ref 2 if n mod !i = 0 then i else (i :!i+1; smallest_divisor(n))
What I mean above that code is first variable i is 2, and if (n mod i) is 0 then return i, else i+1 and repeat that function
I really have no idea how to get this

For very large number it might be better to try testing only prime numbers. Your approach can work, but be aware that it might not be suitable for big numbers.
Another note: while writing recursive functions you should pay attention to the base case.
This is a minimal example that can achieve what you described, but keep in mind that it is not the best way to go about it:
let smallest_divisor n =
let rec aux n i =
if i > (n / 2) then n (* base case, should use sqrt here *) else
if (n mod i == 0) then i else aux n (i + 1) in
aux n 2;;

Related

Find the value of f(T) for big value T

I am trying to solve a problem which is described below,
Given value of f(0) and k , which are integers.
I need to find value of f( T ). where T<=1010
Recursive function is,
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
My efforts,
#include<iostream>
using namespace std;
int main(){
long k,f0,i;
cin>>k>>f0;
long operation ;
cin>>operation;
long answer=f0;
for(i=1;i<=operation;i++){
answer=(4*answer <= k )?(2*answer):(k-(2*answer));
}
cout<<answer;
return 0;
}
My code gives me right answer. But, The code will run 1010 time in worst case that gives me Time Limit Exceed. I need more efficient solution for this problem. Please help me. I don't know the correct algorithm.
If 2f(0) < k then you can compute this function in O(log n) time (using exponentiation by squaring modulo k).
r = f(0) * 2^n mod k
return 2 * r >= k ? k - r : r
You can prove this by induction. The induction hypothesis is that 0 <= f(n) < k/2, and that the above code fragment computes f(n).
Here's a Python program which checks random test cases, comparing a naive implementation (f) with an optimized one (g).
def f(n, k, z):
r = z
for _ in xrange(n):
if 4*r <= k:
r = 2 * r
else:
r = k - 2 * r
return r
def g(n, k, z):
r = (z * pow(2, n, k)) % k
if 2 * r >= k:
r = k - r
return r
import random
errs = 0
while errs < 20:
k = random.randrange(100, 10000000)
n = random.randrange(100000)
z = random.randrange(k//2)
a1 = f(n, k, z)
a2 = g(n, k, z)
if a1 != a2:
print n, k, z, a1, a2
errs += 1
print '.',
Can you use methmetical solution before progamming and compulating?
Actually,
f(n) = f0*2^(n-1) , if f(n-1)*4 <= k
k - f0*2^(n-1) , if f(n-1)*4 > k
thus, your code will write like this:
condition = f0*pow(2, operation-2)
answer = condition*4 =< k? condition*2: k - condition*2
For a simple loop, your answer looks pretty tight; one could optimise a little bit using answer<<2 instead of 4*answer, and answer<<1 for 2*answer, but quite possibly your compiler is already doing that. If you're blowing the time with this, it might be necessary to reduce the loop itself somehow.
I can't figure out a mathematical pattern that #Shannon was going for, but I'm thinking we could exploit the fact that this function will sooner or later cycle. If the cycle is short enough, then we could short the loop by just getting the answer at the same point in the cycle.
So let's get some cycle detection equipment in the form of Brent's algorithm, and see if we can cut the loop to reasonable levels.
def brent(f, x0):
# main phase: search successive powers of two
power = lam = 1
tortoise = x0
hare = f(x0) # f(x0) is the element/node next to x0.
while tortoise != hare:
if power == lam: # time to start a new power of two?
tortoise = hare
power *= 2
lam = 0
hare = f(hare)
lam += 1
# Find the position of the first repetition of length λ
mu = 0
tortoise = hare = x0
for i in range(lam):
# range(lam) produces a list with the values 0, 1, ... , lam-1
hare = f(hare)
# The distance between the hare and tortoise is now λ.
# Next, the hare and tortoise move at same speed until they agree
while tortoise != hare:
tortoise = f(tortoise)
hare = f(hare)
mu += 1
return lam, mu
f0 = 2
k = 198779
t = 10000000000
def f(x):
if 4 * x <= k:
return 2 * x
else:
return k - 2 * x
lam, mu = brent(f, f0)
t2 = t
if t >= mu + lam: # if T is past the cycle's first loop,
t2 = (t - mu) % lam + mu # find the equivalent place in the first loop
x = f0
for i in range(t2):
x = f(x)
print("Cycle start: %d; length: %d" % (mu, lam))
print("Equivalent result at index: %d" % t2)
print("Loop iterations skipped: %d" % (t - t2))
print("Result: %d" % x)
As opposed to the other proposed answers, this approach actually could use a memo array to speed up the process, since the start of the function is actually calculated multiple times (in particular, inside brent), or it may be irrelevant, depending on how big the cycle happens to be.
The algorithm you proposed already has O(n).
To come up with more efficient algorithms, there is not that much direction we can go about. Some typical options we have
1.Decease the coefficients of the linear term( but I doubt it would make a difference in this case
2.Change to O(Logn)(typically use some sort of divide and conquer technique)
3.Change to O(1)
In this case, we can do the last one.
The recursion function is a piece-wise function
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
Let's tackle it by case:
case 1: if 4*f(n-1) <= k (1)(assuming the starting index is zero)
this is a obvious a geometry series
a_n = 2*a_n-1
Therefore, have the formula
Sn = 2^(n-1)f(0) ----()
Case 2: if 4*f(n-1) > k (2), we have
a_n = -2a_n-1 + k
Assuming, a_j is the element in the sequence which just satisfy condition (2)
Nestedly sub in an_1 to the formula, you will obtain the equation
an = k -2k +4k -8k... +(-2)^(n-j)* a_j
k -2k 4k -8... is another gemo series
Sn = k*(1-2^(n-j))/(1-2) ---gemo series sum formula with starting value k and ratio = -2
Therefore, we have a formula for an in the case 2
an = k * (1-2^(n-j))/(1-2) + (-2)^(n-j) * a_j ----(**)
All we left to do it to find aj which just dissatisfy condition (1) and satisfy (2)
This can be obtained in constant time again using the formula we have for case 1:
find n such that, 4*an = 4*Sn = 4*2^(n-1)*f(0)
solve for n: 4*2^(n-1)*f(0) = k, if n is not integer, take ceiling of n
In my first attempt to solve this question, I had wrong assumption that the value of the sequence is monotonically increasing but in fact the sequence might jump between case 1 and case 2. Therefore, there might not be constant algorithm to solve the problem.
However, we can use utilize the result above to skip iterative update complexity.
The overall algorithm will look something like:
start with T, K, and f(0)
compute n that make the condition switch using either (*) or (**)
update f(0) with f(n), update T - n
repeat
terminate when T-n = 0(the last iteration might over compute causing T-n<0, therefore, you need to go back a little bit if that happen)
Create a map that can store your results. Before finding f(n) check in that map, if solution is already existed or not.
If exists, use that solution.
Otherwise find it, store it for future use.
For C++:
Definition:
map<long,long>result;
Insertion:
result[key]=value
Accessing:
value=result[key];
Checking:
map<long,long>::iterator it=result.find(key);
if(it==result.end())
{
//key was not found, find the solution and insert into result
}
else
{
return result[key];
}
Use above technique for better solution.

How to find Inverse Modulus of a number i.e (a%m) when m is not prime

I searched the answer for this question, i got various useful links but when i implemented the idea, i am getting wrong answer.
This is what I understood :
If m is prime, then it is very simple. Inverse modulus of any number 'a' can be calculated as:inverse_mod(a) = (a^(m-2))%m
but when m is not prime, the we have to find the prime factors of m ,
i.e. m= (p1^a1)*(p2^a2)*....*(pk^ak). Here p1,p2,....,pk are the prime factors of m and a1,a2,....,ak are their respective powers.
then we have to calculate :
m1 = a%(p1^a1),
m2 = a%(p2^a2),
.......
mk = a%(pk^ak)
then we have to combine all these remainders using Chinese Remainder Theorem (https://en.wikipedia.org/wiki/Chinese_remainder_theorem)
I implemented this idea for m=1000,000,000,but still i am getting Wrong Answer.
Here is my explanation for m=1000,000,000 which is not prime
m= (2^9)*(5^9) where 2 and 5 are m's prime factors.
let a is the number for which have to calculate inverse modulo m.
m1 = a%(2^9) = a^512
m2 = a%(5^9) = a^1953125
Our answer will be = m1*e1 + m2*e2
where e1= { 1 (mod 512)
0 (mod 1953125)
}
and e2= { 1 (mod 1953125)
0 (mod 512)
}
Now to calculate 'e1' and 'e2' , I used Extended Euclidean Algorithm.
https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
The Code is :
void extend_euclid(lld a,lld b,lld& x,lld& y)
{
if(a%b==0)
{
x=0;
y=1;
return ;
}
extend_euclid(b,a%b,x,y);
int tmp=x;
x=y;
y=tmp-(a/b)*y;
}
Now e1= 1953125*y and e2=512*y;
So, Our final answer will be = m1*e1 + m2*e2 .
But after doing all this, I am getting wrong answer.
please explain and point out any mistakes which I have made while understanding Chinese Remainder Theorem .
Thank You Very Much.
The inverse of a modulo m only exists if a and m are coprime. If they are not coprime, nothing will help. For example: what is the inverse of 2 mod 4?
2*0 = 0 mod 4
2*1 = 2 mod 4
2*2 = 0 mod 4
2*3 = 2 mod 4
So no inverse.
This can indeed be computed by using the extended euclidean algorithm (although I'm not sure if you're doing it right), but the simplest way, in my opinion, is by using Euler's theorem:
a^phi(m) = 1 (mod m)
a*a^(phi(m) - 1) = 1 (mod m)
=> a^(phi(m) - 1) is the invers of a (mod m)
Where phi is the totient function:
phi(x) = x * (1 - 1/f1)(1 - 1/f2)...(1 - 1/fk)
where fi > 1 is a divisor of x (not necessarily a prime divisor)
phi(36) = 36(1 - 1/2)(1 - 1/3)(1 - 1/4)(1 - 1/6)(1 - 1/9)(1 - 1/12)(1 - 1/18)(1 - 1/36)
So it can be computed in O(sqrt n).
The exponentiation can then be computed using exponentiation by squaring.
If you want to read about how you can use the extended Euclidean algorithm to find the inverse faster, read this. I don't think the Chinese remainder theorem can help here.
I believe the following function will do what you want. Change from long to int if appropriate. It returns -1 if there is no inverse, otherwise returns a positive number in the range [0..m).
public static long inverse(long a, long m) { // mult. inverse of a mod m
long r = m;
long nr = a;
long t = 0;
long nt = 1;
long tmp;
while (nr != 0) {
long q = r/nr;
tmp = nt; nt = t - q*nt; t = tmp;
tmp = nr; nr = r - q*nr; r = tmp;
}
if (r > 1) return -1; // no inverse
if (t < 0) t += m;
return t;
}
I can't follow your algorithm to see exactly what is wrong with it, but I have a few general comments: Euler's totient function is rather slow to calculate in general, depending as it does on prime factorizations. The Chinese Remainder Theorem is useful in many contexts for combining results mod coprimes but it's not necessary here and again overcomplicates this particular issue because you end up having to factor your modulus, a very slow operation. And it's faster to implement GCD and modular inverse in a loop, rather than using recursion, though of course the two methods are equally effective.
If you're trying to compute a^(-1) mod p^k for p prime, first compute a^(-1) mod p. Given an x such that ax = 1 (mod p^(k-1)), you can "Hensel lift"---you're looking for the y between 0 and p-1 such that a(x + y p^(k-1)) = 1 (mod p^k). Doing some algebra, you find that you're looking for the y such that a y p^(k-1) = 1 - ax (mod p^k)---i.e. a y = (1 - ax)/p^(k-1) (mod p), where the division by p^(k-1) is exact. You can work this out using a modular inverse for a (mod p).
(Alternatively, simply notice that a^(p^(k-1)(p-1) - 1) = 1 (mod p^k). I mention Hensel lifting because it works in much greater generality.)

What are the time complexities for finding pairs of numbers that when cubed add up to a certain value?

The problem is as follows:
Given an integer n, find all pairs that, when cubed, add up to n. (i.e. 1^3 + 12^3 = 1729, 9^3 + 10^3 = 1729, etc.)
I found two solutions for this:
1)
k = ceiling(cube root of n)
for i = 1 to k
if (cube root of (n - i^3) % 1 == 0)
print [i, j]
end
end
2)
i = 1
j = ceiling(cube root of n)
while (i <= j)
if (i^3 + j^3 == n)
print [i++, j--]
else if (i^3 + j^3 < n)
i++
else if (i^3 + j^3 > n)
j--
end
end
I believe both of these run at O(n), are there any advantages to one over the other?
In general, computing roots and similar exotic functions is an expensive operation, and coming up with exact numerical results so you can run a direct equality test is not always possible, so you need additional tests to see if you've come up with an exact integer solution. So your second piece of code is preferable. Technically it should start with I = 0.
Also, your optimal running time bound is O(n^(1/3)), not O(n).

Algorithm to partition a number

Given a positive integer X, how can one partition it into N parts, each between A and B where A <= B are also positive integers? That is, write
X = X_1 + X_2 + ... + X_N
where A <= X_i <= B and the order of the X_is doesn't matter?
If you want to know the number of ways to do this, then you can use generating functions.
Essentially, you are interested in integer partitions. An integer partition of X is a way to write X as a sum of positive integers. Let p(n) be the number of integer partitions of n. For example, if n=5 then p(n)=7 corresponding to the partitions:
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
The the generating function for p(n) is
sum_{n >= 0} p(n) z^n = Prod_{i >= 1} ( 1 / (1 - z^i) )
What does this do for you? By expanding the right hand side and taking the coefficient of z^n you can recover p(n). Don't worry that the product is infinite since you'll only ever be taking finitely many terms to compute p(n). In fact, if that's all you want, then just truncate the product and stop at i=n.
Why does this work? Remember that
1 / (1 - z^i) = 1 + z^i + z^{2i} + z^{3i} + ...
So the coefficient of z^n is the number of ways to write
n = 1*a_1 + 2*a_2 + 3*a_3 +...
where now I'm thinking of a_i as the number of times i appears in the partition of n.
How does this generalize? Easily, as it turns out. From the description above, if you only want the parts of the partition to be in a given set A, then instead of taking the product over all i >= 1, take the product over only i in A. Let p_A(n) be the number of integer partitions of n whose parts come from the set A. Then
sum_{n >= 0} p_A(n) z^n = Prod_{i in A} ( 1 / (1 - z^i) )
Again, taking the coefficient of z^n in this expansion solves your problem. But we can go further and track the number of parts of the partition. To do this, add in another place holder q to keep track of how many parts we're using. Let p_A(n,k) be the number of integer partitions of n into k parts where the parts come from the set A. Then
sum_{n >= 0} sum_{k >= 0} p_A(n,k) q^k z^n = Prod_{i in A} ( 1 / (1 - q*z^i) )
so taking the coefficient of q^k z^n gives the number of integer partitions of n into k parts where the parts come from the set A.
How can you code this? The generating function approach actually gives you an algorithm for generating all of the solutions to the problem as well as a way to uniformly sample from the set of solutions. Once n and k are chosen, the product on the right is finite.
Here is a python solution to this problem, This is quite un-optimised but I have tried to keep it as simple as I can to demonstrate an iterative method of solving this problem.
The results of this method will commonly be a list of max values and min values with maybe 1 or 2 values inbetween. Because of this, there is a slight optimisation in there, (using abs) which will prevent the iterator constantly trying to find min values counting down from max and vice versa.
There are recursive ways of doing this that look far more elegant, but this will get the job done and hopefully give you an insite into a better solution.
SCRIPT:
# iterative approach in-case the number of partitians is particularly large
def splitter(value, partitians, min_range, max_range, part_values):
# lower bound used to determine if the solution is within reach
lower_bound = 0
# upper bound used to determine if the solution is within reach
upper_bound = 0
# upper_range used as upper limit for the iterator
upper_range = 0
# lower range used as lower limit for the iterator
lower_range = 0
# interval will be + or -
interval = 0
while value > 0:
partitians -= 1
lower_bound = min_range*(partitians)
upper_bound = max_range*(partitians)
# if the value is more likely at the upper bound start from there
if abs(lower_bound - value) < abs(upper_bound - value):
upper_range = max_range
lower_range = min_range-1
interval = -1
# if the value is more likely at the lower bound start from there
else:
upper_range = min_range
lower_range = max_range+1
interval = 1
for i in range(upper_range, lower_range, interval):
# make sure what we are doing won't break solution
if lower_bound <= value-i and upper_bound >= value-i:
part_values.append(i)
value -= i
break
return part_values
def partitioner(value, partitians, min_range, max_range):
if min_range*partitians <= value and max_range*partitians >= value:
return splitter(value, partitians, min_range, max_range, [])
else:
print ("this is impossible to solve")
def main():
print(partitioner(9800, 1000, 2, 100))
The basic idea behind this script is that the value needs to fall between min*parts and max*parts, for each step of the solution, if we always achieve this goal, we will eventually end up at min < value < max for parts == 1, so if we constantly take away from the value, and keep it within this min < value < max range we will always find the result if it is possable.
For this code's example, it will basically always take away either max or min depending on which bound the value is closer to, untill some non min or max value is left over as remainder.
A simple realization you can make is that the average of the X_i must be between A and B, so we can simply divide X by N and then do some small adjustments to distribute the remainder evenly to get a valid partition.
Here's one way to do it:
X_i = ceil (X / N) if i <= X mod N,
floor (X / N) otherwise.
This gives a valid solution if A <= floor (X / N) and ceil (X / N) <= B. Otherwise, there is no solution. See proofs below.
sum(X_i) == X
Proof:
Use the division algorithm to write X = q*N + r with 0 <= r < N.
If r == 0, then ceil (X / N) == floor (X / N) == q so the algorithm sets all X_i = q. Their sum is q*N == X.
If r > 0, then floor (X / N) == q and ceil (X / N) == q+1. The algorithm sets X_i = q+1 for 1 <= i <= r (i.e. r copies), and X_i = q for the remaining N - r pieces. The sum is therefore (q+1)*r + (N-r)*q == q*r + r + N*q - r*q == q*N + r == X.
If floor (X / N) < A or ceil (X / N) > B, then there is no solution.
Proof:
If floor (X / N) < A, then floor (X / N) * N < A * N, and since floor(X / N) * N <= X, this means that X < A*N, so even using only the smallest pieces possible, the sum would be larger than X.
Similarly, if ceil (X / N) > B, then ceil (X / N) * N > B * N, and since ceil(X / N) * N >= X, this means that X > B*N, so even using only the largest pieces possible, the sum would be smaller than X.

Generating shuffled range using a PRNG rather than shuffling

Is there any known algorithm that can generate a shuffled range [0..n) in linear time and constant space (when output produced iteratively), given an arbitrary seed value?
Assume n may be large, e.g. in the many millions, so a requirement to potentially produce every possible permutation is not required, not least because it's infeasible (the seed value space would need to be huge). This is also the reason for a requirement of constant space. (So, I'm specifically not looking for an array-shuffling algorithm, as that requires that the range is stored in an array of length n, and so would use linear space.)
I'm aware of question 162606, but it doesn't present an answer to this particular question - the mappings from permutation indexes to permutations given in that question would require a huge seed value space.
Ideally, it would act like a LCG with a period and range of n, but the art of selecting a and c for an LCG is subtle. Simply satisfying the constraints for a and c in a full period LCG may satisfy my requirements, but I am wondering if there are any better ideas out there.
Based on Jason's answer, I've made a simple straightforward implementation in C#. Find the next largest power of two greater than N. This makes it trivial to generate a and c, since c needs to be relatively prime (meaning it can't be divisible by 2, aka odd), and (a-1) needs to be divisible by 2, and (a-1) needs to be divisible by 4. Statistically, it should take 1-2 congruences to generate the next number (since 2N >= M >= N).
class Program
{
IEnumerable<int> GenerateSequence(int N)
{
Random r = new Random();
int M = NextLargestPowerOfTwo(N);
int c = r.Next(M / 2) * 2 + 1; // make c any odd number between 0 and M
int a = r.Next(M / 4) * 4 + 1; // M = 2^m, so make (a-1) divisible by all prime factors, and 4
int start = r.Next(M);
int x = start;
do
{
x = (a * x + c) % M;
if (x < N)
yield return x;
} while (x != start);
}
int NextLargestPowerOfTwo(int n)
{
n |= (n >> 1);
n |= (n >> 2);
n |= (n >> 4);
n |= (n >> 8);
n |= (n >> 16);
return (n + 1);
}
static void Main(string[] args)
{
Program p = new Program();
foreach (int n in p.GenerateSequence(1000))
{
Console.WriteLine(n);
}
Console.ReadKey();
}
}
Here is a Python implementation of the Linear Congruential Generator from FryGuy's answer. Because I needed to write it anyway and thought it might be useful for others.
import random
import math
def lcg(start, stop):
N = stop - start
# M is the next largest power of 2
M = int(math.pow(2, math.ceil(math.log(N+1, 2))))
# c is any odd number between 0 and M
c = random.randint(0, M/2 - 1) * 2 + 1
# M=2^m, so make (a-1) divisible by all prime factors and 4
a = random.randint(0, M/4 - 1) * 4 + 1
first = random.randint(0, M - 1)
x = first
while True:
x = (a * x + c) % M
if x < N:
yield start + x
if x == first:
break
if __name__ == "__main__":
for x in lcg(100, 200):
print x,
Sounds like you want an algorithm which is guaranteed to produce a cycle from 0 to n-1 without any repeats. There are almost certainly a whole bunch of these depending on your requirements; group theory would be the most helpful branch of mathematics if you want to delve into the theory behind it.
If you want fast and don't care about predictability/security/statistical patterns, an LCG is probably the simplest approach. The wikipedia page you linked to contains this (fairly simple) set of requirements:
The period of a general LCG is at most
m, and for some choices of a much less
than that. The LCG will have a full
period if and only if:
c and m are relatively prime,
a - 1 is divisible by all prime factors of m
a - 1 is a multiple of 4 if m is a multiple of 4
Alternatively, you could choose a period N >= n, where N is the smallest value that has convenient numerical properties, and just discard any values produced between n and N-1. For example, the lowest N = 2k - 1 >= n would let you use linear feedback shift registers (LFSR). Or find your favorite cryptographic algorithm (RSA, AES, DES, whatever) and given a particular key, figure out the space N of numbers it permutes, and for each step apply encryption once.
If n is small but you want the security to be high, that's probably the trickiest case, as any sequence S is likely to have a period N much higher than n, but is also nontrivial to derive a nonrepeating sequence of numbers with a shorter period than N. (e.g. if you could take the output of S mod n and guarantee nonrepeating sequence of numbers, that would give information about S that an attacker might use)
See my article on secure permutations with block ciphers for one way to do it.
Look into Linear Feedback Shift Registers, they can be used for exactly this.
The short way of explaining them is that you start with a seed and then iterate using the formula
x = (x << 1) | f(x)
where f(x) can only return 0 or 1.
If you choose a good function f, x will cycle through all values between 1 and 2^n-1 (where n is some number), in a good, pseudo-random way.
Example functions can be found here, e.g. for 63 values you can use
f(x) = ((x >> 6) & 1) ^ ((x >> 5) & 1)

Resources