Optimize a recursive function in Julia

I wrote some Julia code that computes integrals over Gaussian functions, and it has a kernel function that is called over and over again.
According to Julia's built-in Profile module, this is where most of the time goes during the actual computation, so I would like to see whether it can be improved.
It is a recursive function, and I implemented it in a fairly straightforward way. As I am not very used to recursive functions, maybe somebody has ideas or suggestions on how to improve it (from a purely algorithmic point of view and/or by exploiting special optimizations of the JIT compiler).
Here you have it:
"""Returns the integral of an Hermite Gaussian divided by the Coulomb operator."""
function Rtuv{T<:Real}(t::Int, u::Int, v::Int, n::Int, p::Real, RPC::Vector{T})
if t == u == v == 0
return (-2.0*p)^n * boys(n,p*norm(RPC)^2)
elseif u == v == 0
if t > 1
return (t-1)*Rtuv(t-2, u, v, n+1, p, RPC) +
RPC[1]*Rtuv(t-1, u, v, n+1, p, RPC)
else
return RPC[1]*Rtuv(t-1, u, v, n+1, p, RPC)
end
elseif v == 0
if u > 1
return (u-1)*Rtuv(t, u-2, v, n+1, p, RPC) +
RPC[2]*Rtuv(t, u-1, v, n+1, p, RPC)
else
return RPC[2]*Rtuv(t, u-1, v, n+1, p ,RPC)
end
else
if v > 1
return (v-1)*Rtuv(t, u, v-2, n+1, p, RPC) +
RPC[3]*Rtuv(t, u, v-1, n+1, p, RPC)
else
return RPC[3]*Rtuv(t, u, v-1, n+1, p, RPC)
end
end
end
Don't pay that much attention to the function boys, since according to the profiler it is not that heavy.
Just to give an idea of the range of numbers: usually the first call comes from t+u+v ranging from 0 to 3, while n always starts at 0.
Cheers
EDIT -- New information
The generated version is slower for small values of t, u, v; I believe the reason is that the expressions are not optimized by the compiler.
I was benchmarking badly for this case, without interpolating the argument passed. By doing it properly I am always faster with the approach explained in the accepted answer, so hurray!
More generally, does the compiler identify trivial cases such as multiplication by zeros and ones and optimize those away?
Answer to myself: from a quick check of simple code with @code_llvm, it seems not to be the case.

Maybe this works in your case: you can "memoize" whole compiled methods using generated functions and get rid of all recursion after the first call.
Since t, u, and v will stay small, you could generate the fully expanded code for the recursions. Assume, for simplicity, a bogus implementation of
boys(n::Int, x::Real) = n + x
Then
function Rtuv_expr(t::Int, u::Int, v::Int, n, p, RPC)
ninc = :($n + 1)
if t == u == v == 0
:((-2.0 * $p)^$n * boys($n, $p * norm($RPC)^2))
elseif u == v == 0
if t > 1
:($(t-1) * $(Rtuv_expr(t-2, u, v, ninc, p, RPC)) +
$RPC[1] * $(Rtuv_expr(t-1, u, v, ninc, p, RPC)))
else
:($RPC[1] * $(Rtuv_expr(t-1, u, v, ninc, p, RPC)))
end
elseif v == 0
if u > 1
:($(u-1) * $(Rtuv_expr(t, u-2, v, ninc, p, RPC)) +
$RPC[2] * $(Rtuv_expr(t, u-1, v, ninc, p, RPC)))
else
:($RPC[2] * $(Rtuv_expr(t, u-1, v, ninc, p, RPC)))
end
else
if v > 1
:($(v-1) * $(Rtuv_expr(t, u, v-2, ninc, p, RPC)) +
$RPC[3] * $(Rtuv_expr(t, u, v-1, ninc, p, RPC)))
else
:($RPC[3] * $(Rtuv_expr(t, u, v-1, ninc, p, RPC)))
end
end
end
This generates fully expanded expressions like this:
julia> Rtuv_expr(1, 2, 1, 0, 0.1, rand(3))
:(([0.868194, 0.928591, 0.295344])[3] * (1 * (([0.868194, 0.928591, 0.295344])[1] * ((-2.0 * 0.1) ^ (((0 + 1) + 1) + 1) * boys(((0 + 1) + 1) + 1, 0.1 * norm([0.868194, 0.928591, 0.295344]) ^ 2))) + ([0.868194, 0.928591, 0.295344])[2] * (([0.868194, 0.928591, 0.295344])[2] * (([0.868194, 0.928591, 0.295344])[1] * ((-2.0 * 0.1) ^ ((((0 + 1) + 1) + 1) + 1) * boys((((0 + 1) + 1) + 1) + 1, 0.1 * norm([0.868194, 0.928591, 0.295344]) ^ 2))))))
We can stuff that into a generated function Rtuv taking Val types. For each different combination of T, U, and V, this function will use Rtuv_expr to compile the respective expression and from then on use this method -- no recursion anymore:
@generated function Rtuv{T, U, V, X<:Real}(::Type{Val{T}}, ::Type{Val{U}}, ::Type{Val{V}},
n::Int, p::Real, RPC::Vector{X})
Rtuv_expr(T, U, V, :n, :p, :RPC)
end
You have to call it with t, u, v wrapped in Val, though:
julia> Rtuv(Val{1}, Val{2}, Val{1}, 0, 0.1, rand(3))
-0.0007782250832001092
If you test a small loop like this,
for t = 0:3, u = 0:3, v = 0:3
println(Rtuv(Val{t}, Val{u}, Val{v}, 0, 0.1, [1.0, 2.0, 3.0]))
end
it will need some time for the first run, but then go pretty fast, since the used methods are already compiled.
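For comparison, a different and simpler technique than the generated function is to cache intermediate values of the recursion. The sketch below is in Python rather than Julia and reuses the bogus boys placeholder from above; the names rtuv_direct and make_rtuv_cached are hypothetical. It only illustrates that different branches of the recursion reach the same (t, u, v, n) states, so each state can be computed once for a fixed p and RPC. Whether this pays off in Julia for t+u+v up to 3 is not something this sketch measures.

```python
from functools import lru_cache


def boys(n, x):
    # Bogus placeholder from the answer above, NOT the real Boys function.
    return n + x


def rtuv_direct(t, u, v, n, p, rpc):
    """Plain recursion transcribed from the question (0-based indexing)."""
    if t == u == v == 0:
        r2 = sum(c * c for c in rpc)  # norm(RPC)^2
        return (-2.0 * p) ** n * boys(n, p * r2)
    if u == v == 0:
        val = rpc[0] * rtuv_direct(t - 1, u, v, n + 1, p, rpc)
        if t > 1:
            val += (t - 1) * rtuv_direct(t - 2, u, v, n + 1, p, rpc)
        return val
    if v == 0:
        val = rpc[1] * rtuv_direct(t, u - 1, v, n + 1, p, rpc)
        if u > 1:
            val += (u - 1) * rtuv_direct(t, u - 2, v, n + 1, p, rpc)
        return val
    val = rpc[2] * rtuv_direct(t, u, v - 1, n + 1, p, rpc)
    if v > 1:
        val += (v - 1) * rtuv_direct(t, u, v - 2, n + 1, p, rpc)
    return val


def make_rtuv_cached(p, rpc):
    """Return R(t, u, v, n) that caches every state for this fixed p and rpc."""
    r2 = sum(c * c for c in rpc)  # computed once instead of at every leaf

    @lru_cache(maxsize=None)
    def rec(t, u, v, n):
        if t == u == v == 0:
            return (-2.0 * p) ** n * boys(n, p * r2)
        if u == v == 0:
            val = rpc[0] * rec(t - 1, u, v, n + 1)
            if t > 1:
                val += (t - 1) * rec(t - 2, u, v, n + 1)
            return val
        if v == 0:
            val = rpc[1] * rec(t, u - 1, v, n + 1)
            if u > 1:
                val += (u - 1) * rec(t, u - 2, v, n + 1)
            return val
        val = rpc[2] * rec(t, u, v - 1, n + 1)
        if v > 1:
            val += (v - 1) * rec(t, u, v - 2, n + 1)
        return val

    return rec
```

A cached function is tied to one (p, RPC) pair, so it would have to be rebuilt whenever those change; the generated-function approach above does not have that restriction.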

Related

Mathematical modeling from gurobipy to pyomo: How to enumerate over a set in pyomo?

I implemented a course planning problem in gurobipy. It all works well. My next task is to rewrite it in pyomo. I had difficulties with one specific equation (written in gurobipy):
model.addConstrs((quicksum(gamma[l, k, s, T[tau]] for tau in range(index_t, index_t + dur[m]) if tau < len(T))
>= dur[m] * start[l, k, s, t] for index_t, t in enumerate(T) for m in M
for k in KM[m] for l in LM[m] for s in SM[m]), name='Hintereinander')
gamma[l,k,s,t] and start[l,k,s,t] are binary decision variables. l, k, s, t are indices, where t ranges over periods, so the set T is the list of periods I have. In this equation I need the position of t in T (its ord(t)) to do the calculations in the sum. Therefore I call enumerate(T) at the end of the equation (when looping over all needed indices).
My data is given beforehand, so I formulate a ConcreteModel() in pyomo. I have difficulty including the enumeration of the set T in pyomo.
What I already have:
def gamma_hintereinander_rule(model,m,k,l,s):
    for index_t,t in enumerate(T):
        if k in KM[m]:
            if l in LM[m]:
                if s in SM[m]:
                    return sum(model.gamma[l, k, s, T[tau]] for tau in range(index_t, index_t + dur[m]) if tau< len(T)) >= dur[m] * model.start[l, k, s, t]
                else:
                    return Constraint.Skip
            else:
                return Constraint.Skip
        else:
            return Constraint.Skip
model.gamma_hintereinander = Constraint(M, K, L, S,rule=gamma_hintereinander_rule)
It doesn't work correctly.
I'd be really happy and thankful if someone could help me!
Best regards!
Zeineb

The problem is the for-loop inside of the constraint rule. You are exiting the rule as soon as the first return statement is encountered, so only one constraint or Constraint.Skip is returned despite the for-loop. I think the best approach is to index your Constraint by T, something like:
def gamma_hintereinander_rule(model,m,k,l,s,t):
    index_t = T.index(t)
    if k in KM[m]:
        if l in LM[m]:
            if s in SM[m]:
                return sum(model.gamma[l, k, s, T[tau]] for tau in range(index_t, index_t + dur[m]) if tau< len(T)) >= dur[m] * model.start[l, k, s, t]
            else:
                return Constraint.Skip
        else:
            return Constraint.Skip
    else:
        return Constraint.Skip
model.gamma_hintereinander = Constraint(M, K, L, S, T, rule=gamma_hintereinander_rule)
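The difference between the two rules can be seen without pyomo. The following is a minimal plain-Python sketch, with a made-up period list and hypothetical helper names (rule_with_loop, rule_indexed); it only mimics the control flow and the index window, not the actual constraint expressions.

```python
# Hypothetical period labels standing in for the set T.
T = ["p1", "p2", "p3", "p4"]


def rule_with_loop(periods):
    # Mimics the original rule: the return inside the loop exits on the
    # first period, so later periods are never visited.
    for index_t, t in enumerate(periods):
        return index_t, t


def rule_indexed(periods, t, dur):
    # Mimics the corrected rule: called once per period t, it looks up the
    # position of t and collects the periods in the window of length dur,
    # dropping positions past the end of the list.
    index_t = periods.index(t)
    window = [periods[tau]
              for tau in range(index_t, index_t + dur)
              if tau < len(periods)]
    return t, window


print(rule_with_loop(T))                        # only the first period
print([rule_indexed(T, t, 3) for t in T])       # one result per period
```

Note that periods.index(t) is a linear search; for long period lists a precomputed dictionary from period to position may be preferable.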

Algorithm to precisely compare two exponentiations for very large integers (order of 1 billion)

We want to compare a^b to c^d, and tell if the first is smaller, greater, or equal (where ^ denotes exponentiation).
Obviously, for very large numbers, we cannot explicitly compute these values.
The most common approach in this situation is to apply log on both sides and compare b * log(a) to d * log(c). The issue here is that logs are floating-point operations, and as such we cannot trust our answer with 100% confidence (there might be some values which are incredibly close, and because of floating-point error we get a wrong answer).
Is there an algorithm for solving this problem? I've been scouring the internet for this, but I can only find solutions which work for particular cases (e.g. in which one exponent is a multiple of another), or which use floating point in some way (logarithms, division, etc.).

This is sort of two questions in one:
Are they equal?
If not, which one is greater?
As Peter O. observes, it's easiest to build in a language that provides an arbitrary-precision fraction type. I'll use Python 3.
Let's assume without loss of generality that a ≤ c (swap if necessary) and b is relatively prime to d (divide both by the greatest common divisor).
To get at the core of the question, I'm going to assume that a, c > 0 and b, d ≥ 0. Removing this assumption is tedious but not difficult.
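The two normalizations can be sketched as follows, assuming a, c > 0 and b, d ≥ 0 as above. The helper name normalize and the returned swapped flag are hypothetical. Dividing both exponents by their greatest common divisor g preserves the comparison because x ↦ x^(1/g) is strictly increasing on the positive reals.

```python
from math import gcd


def normalize(a, b, c, d):
    """Return (a, b, c, d, swapped) with a <= c and coprime exponents.

    swapped tells the caller that the two sides were exchanged, so the
    result of a later comparison has to be flipped.
    """
    swapped = a > c
    if swapped:
        a, b, c, d = c, d, a, b
    g = gcd(b, d)
    if g > 1:  # g == 0 only when b == d == 0; nothing to divide then
        b, d = b // g, d // g
    return a, b, c, d, swapped


# 8^4 vs 2^6 becomes 2^3 vs 8^2 with the sides exchanged.
print(normalize(8, 4, 2, 6))
```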
Equality test
There are some easy cases where a = 1 or b = 0 or c = 1 or d = 0.
Separately, necessary conditions for a^b = c^d are
i. b ≥ d, since otherwise b < d, which together with a ≤ c implies a^b < c^d;
ii. a is a divisor of c, since we know from (i) that a^b = c^d is a divisor of c^b = c^(b−d) c^d.
When these conditions hold, we can divide through by a^d to reduce the problem to testing whether a^(b−d) = (c/a)^d.
In Python 3:
def equal_powers(a, b, c, d):
    while True:
        lhs_is_one = a == 1 or b == 0
        rhs_is_one = c == 1 or d == 0
        if lhs_is_one or rhs_is_one:
            return lhs_is_one and rhs_is_one
        if a > c:
            a, b, c, d = c, d, a, b
        if b < d:
            return False
        q, r = divmod(c, a)
        if r != 0:
            return False
        b -= d
        c = q

def test_equal_powers():
    for a in range(1, 25):
        for b in range(25):
            for c in range(1, 25):
                for d in range(25):
                    assert equal_powers(a, b, c, d) == (a ** b == c ** d)

test_equal_powers()
Inequality test
Once we've established that the two quantities are not equal, it's time to figure out which one is greater. (Without the equality test, the code here could run forever.)
If you're doing this for real, you should consult an actual reference on computing elementary functions. I'm just going to try to do the simplest thing that works.
Time for a calculus refresher. We have the Taylor series
−log x = (1−x) + (1−x)^2/2 + (1−x)^3/3 + (1−x)^4/4 + ...
To get a lower bound, truncate the series. To get an upper bound, we can truncate but replace the final term (1−x)^n/n with (1−x)^n/n (1/x), since
(1−x)^n/n (1/x)
= (1−x)^n/n (1 + (1−x) + (1−x)^2 + ...)
= (1−x)^n/n + (1−x)^(n+1)/n + (1−x)^(n+2)/n + ...
> (1−x)^n/n + (1−x)^(n+1)/(n+1) + (1−x)^(n+2)/(n+2) + ...
To get a good convergence rate, we're going to want 0.5 ≤ x < 1, which we can achieve by dividing x by a power of two.
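As a quick sanity check of the two bounds, here is a small sketch using exact fractions. The function name neg_log_bounds is hypothetical, and the comparison against math.log is only a floating-point spot check, not a proof.

```python
import math
from fractions import Fraction


def neg_log_bounds(x, n):
    """Return (lower, upper) bounds on -log(x) for 0 < x < 1 using n terms.

    lower: the series truncated after the n-th term.
    upper: the same sum with the n-th term multiplied by 1/x.
    """
    y = 1 - x
    head = sum(y ** k / k for k in range(1, n))  # terms 1 .. n-1
    last = y ** n / n                            # n-th term
    return head + last, head + last / x


lo, hi = neg_log_bounds(Fraction(3, 4), 5)
print(float(lo), -math.log(0.75), float(hi))
```

The gap between the bounds is (1−x)^(n+1) / (n x), so for x ≥ 0.5 it shrinks at least geometrically with ratio 1/2.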
In Python, we'll represent a real number as an infinite generator of shrinking intervals that contain the true value. Once the intervals for b log a and d log c are disjoint, we can determine how they compare.
import fractions

def minus(x, y):
    while True:
        x_lo, x_hi = next(x)
        y_lo, y_hi = next(y)
        yield x_lo - y_hi, x_hi - y_lo

def times(b, x):
    for lo, hi in x:
        yield b * lo, b * hi

def restricted_log(a):
    series = 0
    n = 0
    numerator = 1
    while True:
        n += 1
        numerator *= 1 - a
        series += fractions.Fraction(numerator, n)
        yield -(series + fractions.Fraction(numerator * (1 - a), (n + 1) * a)), -series

def log(a):
    n = 0
    while a >= 1:
        a = fractions.Fraction(a, 2)
        n += 1
    return minus(restricted_log(a), times(n, restricted_log(fractions.Fraction(1, 2))))

def less_powers(a, b, c, d):
    lhs = times(b, log(a))
    rhs = times(d, log(c))
    while True:
        lhs_lo, lhs_hi = next(lhs)
        rhs_lo, rhs_hi = next(rhs)
        if lhs_hi < rhs_lo:
            return True
        if rhs_hi < lhs_lo:
            return False

def test_less_powers():
    for a in range(1, 10):
        for b in range(10):
            for c in range(1, 10):
                for d in range(10):
                    if a ** b != c ** d:
                        assert less_powers(a, b, c, d) == (a ** b < c ** d)

test_less_powers()

Mathematica, solving non linear system of equations with lot of equations and variables

I need to find a square matrix A satisfying the equation
A.L.A = -17/18A -2(A.L.L + L.A.L + (L.L).A) + 3(A.L + L.A) -4L.L.L + 8L.L - 44/9L + 8/9*(ID)
where L is a diagonal matrix L = {{2/3,0,0,0},{0,5/12,0,0},{0,0,11/12,0},{0,0,0,2/3}}.
I can find the answers when A has dimension 2 or 3, but there is a problem with dimension 4 and above.
Actually, the matrix A has to satisfy the equation A.A = A too, but with a suitable matrix L the equation above alone is enough.
This is my code:
A = Table[a[i,j],{i,1,4},{j,1,4}]
B = A.L.A
ID = IdentityMatrix[4]
M = -17/18A -2(A.L.L + L.A.L + (L.L).A) + 3(A.L + L.A) -4L.L.L + 8L.L - 44/9L + 8/9*(ID)
diff = (B - M)//ExpandAll//Flatten (* so I get a system of 16 nonlinear equations here *)
A1 = A/.Solve[diff == 0][[1]]
After running this code for quite some time, an error comes up saying there is not enough memory to complete the computation.
In this case there are 16 equations and 16 variables. Some of the entries are free parameters, but I do not know which ones until I get the result.
I am not sure whether there is any way to solve this problem. I need the answer to be rational (probably integer), which is theoretically possible.
Could this problem be solved as a matrix equation or by some other method? One problem I see is that there are too many equations and variables.

This evaluates fairly quickly and with modest memory for a problem this size.
L = {{2/3, 0, 0, 0}, {0, 5/12, 0, 0}, {0, 0, 11/12, 0}, {0, 0, 0, 2/3}};
A = {{a, b, c, d}, {e, f, g, h}, {i, j, k, l}, {m, n, o, p}};
Reduce[{A.L.A == -17/18 A - 2 (A.L.L + L.A.L + (L.L).A) + 3 (A.L + L.A) -
4 L.L.L + 8 L.L - 44/9 L + 8/9*IdentityMatrix[4]},
{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}, Backsubstitution->True
]
Then you just have to sort through the 143 potential solutions that it returns.
You might be able to Select from those that satisfy your A.A==A. You can also use ToRules on the result returned from Reduce to put this into a form similar to that returned from Solve, but check this carefully to make certain it is doing what you expect.
Check this very carefully to make certain I haven't made any mistakes.
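One way to double-check a candidate solution outside Mathematica is to substitute it back using exact rational arithmetic. The sketch below is in Python; the helper names (matmul, combine, residual, is_zero) are hypothetical, and the 2 x 2 matrices in the usage lines are a made-up toy case, not the L from the question and not one of the solutions returned by Reduce.

```python
from fractions import Fraction as Fr


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(terms, n):
    """Entrywise sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]


def residual(A, L):
    """Left-hand side minus right-hand side of the matrix equation.

    An all-zero result means A satisfies the equation for this L.
    """
    n = len(L)
    I = [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    LL = matmul(L, L)
    AL, LA = matmul(A, L), matmul(L, A)
    terms = [
        (Fr(1), matmul(AL, A)),      # A.L.A
        (Fr(17, 18), A),             # right-hand side terms, signs flipped
        (Fr(2), matmul(AL, L)),      # A.L.L
        (Fr(2), matmul(LA, L)),      # L.A.L
        (Fr(2), matmul(LL, A)),      # L.L.A
        (Fr(-3), AL),
        (Fr(-3), LA),
        (Fr(4), matmul(LL, L)),      # L.L.L
        (Fr(-8), LL),
        (Fr(44, 9), L),
        (Fr(-8, 9), I),
    ]
    return combine(terms, n)


def is_zero(M):
    return all(x == 0 for row in M for x in row)


# Toy check: for L = diag(2/3, 1) the zero matrix satisfies the equation.
L2 = [[Fr(2, 3), Fr(0)], [Fr(0), Fr(1)]]
Z2 = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
print(is_zero(residual(Z2, L2)), matmul(Z2, Z2) == Z2)
```

The additional condition A.A == A can be tested the same way with matmul(A, A) == A.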

Ordered set and natural bijection (combinatorial species)

Let A be some set (e.g. 1000, 1001, 1002, ..., 1999).
Let lessThan be some order relation function (e.g. (a lessThan b) <-> (a > b)).
Let index be a function (with inverse index') mapping an element of A to the naturals.
Example:
index a = 2000 - a
index' n = 2000 - n
Is there some way to construct the index (and index') functions for all (or some kinds of) (A, lessThan) pairs in P (polynomial time)?
Best regards, and thanks in advance!
EDITED: A could be a set by definition (eg. all combinations with repetition of another big subset), then, we can't suppose A is completely traversable (in P).
EDITED: another nontrivial example: let An be a set (with elements like (x, y, p)) whose elements are ordered clockwise in an n x n square, like this
1 2 3 4
12 13 14 5
11 16 15 6
10 9 8 7
then we can map each triplet in An to Bn = [1..n^2] in O(1) (which is polynomial).
Given an element of An, index maps it to Bn in O(1).
Given an element of Bn, index' maps it to An in O(1).
// Square perimeter; square x = 1, 2, 3, ...
Func<int, int, int> perimeter = ( x, n ) => 4 * ( n - 2 * x + 1 );
// Given main diagonal coordinates (1, 1), (2, 2), ... return cell number
Func<int, int, int> diagonalPos = ( x, n ) => -4 * x * x + ( 4 * n + 8 ) * x - 4 * n - 3;
// Given a cell number, return the square it belongs to
Func<int, int, int> inSquare = ( z, n ) => (int) Math.Floor(n * 0.5 - 0.5 * Math.Sqrt(n * n - z + 1.0) + 1.0);
Func<int, int, Point> coords = ( z, n ) => {
var s = inSquare(z, n);
var l = perimeter(s, n) / 4; // length sub-square edge -1
var l2 = l + l;
var l3 = l2 + l;
var d = diagonalPos(s, n);
if( z <= d + l )
return new Point(s + z - d, s);
if( z <= d + l2 )
return new Point(s + l, s + z - d - l);
if( z <= d + l3 )
return new Point(s + d + l3 - z, s + l);
return new Point(s, s + d + l2 + l2 - z);
};
(I have read about "Combinatorial species", "Ordered construction of combinatorial objects", "species" haskell package and others)

I may be misunderstanding what you want, but in case I'm not:
If lessThan defines a total order on the set, you can create the index and index' functions by
converting the set to a list (or an array/vector)
sorting that according to lessThan
constructing index' as Data.Map.fromDistinctAscList $ zip [1 .. ] sortedList
constructing index as Data.Map.fromDistinctAscList $ zip (map NTC sortedList) [1 .. ]
where NTC is a newtype constructor wrapping the type of elements of the set in a newtype whose Ord instance is given by lessThan.
newtype Wrapped = NTC typeOfElements
instance Eq Wrapped where
(NTC x) /= (NTC y) = x `lessThan` y || y `lessThan` x
-- that can usually be done more efficiently
instance Ord Wrapped where
(NTC x) <= (NTC y) = not $ y `lessThan` x
EDITED: A could be a set by definition (eg. all combinations with repetition of another big subset), then, we can't suppose A is completely traversable (in P).
In that case, unless I'm missing something fundamental, it's impossible in principle, because the index' function would provide a complete traversal of the set.
So you can create the index and index' functions in polynomial time if and only if the set is traversable in polynomial time.
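The sort-based construction can be sketched in Python as well, using the first example from the question. This is an illustration under the same assumptions (lessThan is a total order and the set can be traversed); build_index is a hypothetical name, and dictionaries stand in for Data.Map, so the elements must be hashable.

```python
from functools import cmp_to_key


def build_index(elements, less_than):
    """Return (index, index_inv) dictionaries for a totally ordered finite set.

    index maps each element to its 1-based rank under less_than;
    index_inv maps each rank back to the element.
    """
    def compare(x, y):
        if less_than(x, y):
            return -1
        if less_than(y, x):
            return 1
        return 0

    ordered = sorted(elements, key=cmp_to_key(compare))  # O(n log n) comparisons
    index = {e: i for i, e in enumerate(ordered, start=1)}
    index_inv = {i: e for i, e in enumerate(ordered, start=1)}
    return index, index_inv


# Example from the question: A = 1000..1999 with (a lessThan b) <-> (a > b).
index, index_inv = build_index(range(1000, 2000), lambda a, b: a > b)
print(index[1999], index[1000], index_inv[1])
```

For this example the result agrees with the closed forms index a = 2000 - a and index' n = 2000 - n, but the construction costs time and memory proportional to the size of A, which is exactly the limitation described above.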

Poincare return map for the logistic map f_c(x)=cx(1-x)

Suppose:
f[c_,x_]:= c x (1-x)
Fix an interval (a,b) in [0,1]. The Poincare return map of f (to the interval (a,b)) is a function
R[c,x]=f^k[c,x],
where k is the first iterate such that a<f^k[c,x]<b (here f^k[c,x] means the k times composition of f with itself, i.e. f[c,f[c,...f[c,x]...]] )
So, I would like to write a function (or module)
R[f_,a_,b_,n_,x_]
which considers the first n iterates of f and returns the value of the iterate of f[c,x] that first falls into the interval (a,b).
Here is what I attempted:
R[f_[x___ ], a_,b_, n_, x0_] :=
Module[{i, y=x0},
Catch[
For[i = 0, i <= n, i++,
If[a < f[{x}[[1]], y] < b,
Throw[f[{x}[[1]], y]], y = f[{x}[[1]], y]
]
]
]
]
The code does not work because where f[{x}[[1]], y] is written, f is understood as a multiplication by the number {x}[[1]] rather than the logistic function defined above.
Please note that I am looking for a simple piece of code and please, if possible, do not change the number of inputs to the function R in your answer.
EDIT: I would like to call R as follows.
R[f[3.5, t], 0.4, 0.7, 100, 0.2]
This should return the value of the iterate of x0=0.2 the first time it falls into the interval (0.4,0.7) upon applying the function f[3.5,x]=3.5x(1-x). n is just the maximum iterations that we try before giving up.

The problem you're having is that you don't actually return anything from the Module. To fix your code, as written, I'd use
R[f_[x___], a_, b_, n_, x0_] :=
Module[{i, y = x0},
Catch[
For[i = 0, i <= n, i++, Print["i= ", i];
If[
a < f[{x}[[1]], y] < b,
y= f[{x}[[1]], y]; Throw[y],
y = f[{x}[[1]], y]
](*If*)
](*For*)
];(*Catch*)
y (* returns y *)
](*Module*)
However, this can be rewritten more succinctly as
g[c_][x_] := c x(1 - x)
(* I used Q to differentiate it from R, above. *)
Q[f_, a_, b_, n_, x0_] :=
Module[{i},
(* -- FIXED THIS, See below. -- *)
NestWhile[f, f[x0], Not[a < # < b]&, 1, n]
](*Module*)
Note, I changed the invocation method of f from f[c,x] to g[c][x]. The main advantage is in being able to pass g[c] to Q instead of f[c,t] where t is a dummy variable. Then it is invoked like
Q[g[3.5], 0.4, 0.7, 100, 0.2]
which works just like R.
Edit: I was looking at extending the above code, and I noticed a flaw. I had a < f[#] < b& for the condition, which says that the loop continues only if the next iteration's value is within the boundaries. Instead, we want to continue only if the current iteration is outside of the range, so I changed it to Not[ a < # < b ]&.
As to the changes I was considering, sometimes it is nice with a calculation like this to be able to view the full list of iterations. To do that, we need to make a few small changes to the above code.
Clear[Q]
Options[Q] = {FullList -> False};
Q[f_, a_, b_, n_, x0_, opts : OptionsPattern[]] :=
Module[{i, nst},
nst = If[OptionValue[FullList],
NestWhileList, NestWhile];
nst[f, f[x0], Not[a < # < b] &, 1, n]
](*Module*)
This introduces the option FullList; when it is set to True, Q uses NestWhileList instead of NestWhile.
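For readers who want to cross-check the numbers outside Mathematica, here is a minimal Python sketch of the same first-return idea. The names logistic and first_return are hypothetical, and returning None when no iterate lands in the interval is a choice made for this sketch, not something the code above does.

```python
def logistic(c):
    """Return the map x -> c*x*(1-x) for a fixed parameter c."""
    return lambda x: c * x * (1 - x)


def first_return(f, a, b, n, x0):
    """Apply f up to n times starting from x0.

    Returns the first iterate that falls strictly inside (a, b), or None
    if none of the first n iterates does.
    """
    y = x0
    for _ in range(n):
        y = f(y)
        if a < y < b:
            return y
    return None


# Same call as in the question: c = 3.5, interval (0.4, 0.7), x0 = 0.2.
# The first iterate is 3.5 * 0.2 * 0.8, which already lies in the interval.
print(first_return(logistic(3.5), 0.4, 0.7, 100, 0.2))
```

Passing logistic(3.5) mirrors the g[3.5] style of invocation: the parameter is fixed first, and the resulting one-argument function is handed to the iteration.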
