Optimizing computational cost on a task involving a multi-nested loop - algorithm

I am a beginner at programming, so I apologize in advance for asking a (presumably) basic question.
I would like to perform the following task: given a vector x of length n, evaluate Function(y) for every binary vector y such that y(i) = 1 whenever x(i) > 0, while y(i) may be either 0 or 1 whenever x(i) <= 0. (I apologize for the inconvenience; I don't know how to enter a TeX-style formula on Stack Overflow.) I am primarily considering an implementation in MATLAB or Scilab, but the language does not matter much.
The most naive approach, I think, is to form an n-nested for loop; for example, the case n = 2 in MATLAB:
n = 2;
x = [x1, x2];
for u = 0:1
    y(1) = u;
    if x(1) > 0
        y(1) = 1;
    end
    for v = 0:1
        y(2) = v;
        if x(2) > 0
            y(2) = 1;
        end
        z = Function(y);
    end
end
However, this implementation is too laborious for large n, and more importantly, it performs 2^n - 2^k redundant evaluations of the function, where k is the number of non-positive elements in x. Naively forming a k-nested for loop using knowledge of which elements of x are non-positive, e.g.
n = 2;
x = [-1, 2];
y = [1, 1];
for u = 0:1
    y(1) = u;
    z = Function(y);
end
doesn't seem to be a good way either; if we want to perform the task for a different x, we have to rewrite the code.
I would be grateful for any idea for implementing this such that (a) the function is evaluated only 2^k times (the minimum possible number of evaluations) and (b) the code does not have to be rewritten when x changes.

You can evaluate Function on every y in Ax easily using recursion:
function eval(Function, x, y, i, n) {
    if (i == n) {
        // base case: y is complete, evaluate Function
        Function(y);
    } else {
        // always evaluate with y(i) == 1
        y(i) = 1;
        eval(Function, x, y, i + 1, n);
        // evaluate with y(i) == 0 only if x(i) <= 0
        if (x(i) <= 0) {
            y(i) = 0;
            eval(Function, x, y, i + 1, n);
        }
    }
}
(The indices here are 0-based; the initial call is eval(Function, x, y, 0, n).)
Turning that into efficient MATLAB code is another problem.
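For reference, here is a minimal runnable sketch of the same recursion in Python (a sketch under the assumption that Function accepts a list of 0/1 values; the name enumerate_ax is mine):

def enumerate_ax(func, x, y=None, i=0):
    """Call func on every admissible y: y[i] is forced to 1 where x[i] > 0."""
    if y is None:
        y = [1] * len(x)
    if i == len(x):
        func(list(y))  # base case: y is complete, pass a copy
        return
    y[i] = 1                        # always try y[i] == 1
    enumerate_ax(func, x, y, i + 1)
    if x[i] <= 0:                   # try y[i] == 0 only for non-positive x[i]
        y[i] = 0
        enumerate_ax(func, x, y, i + 1)
        y[i] = 1                    # restore for the caller

enumerate_ax(print, [-1, 2])  # prints [1, 1] then [0, 1]: 2^k = 2 evaluations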
As you've stated, the number of evaluations is 2^k. Let's sort x so that only the last k elements are non-positive. To evaluate Function, index y with the inverse of the sorting permutation: Function(y(perm)). Even better, the same method allows us to build Ax directly using dec2bin:
% every column of the resulting matrix is a member of Ax: y_i = Ax(:,i)
function Ax = getAx(x)
    n = length(x);
    % find the k indices of non-positive entries in x
    is = find(x <= 0);
    k = length(is);
    % construct Y (the last k rows run through all combinations of [0 1])
    Y = [ones(n - k, 2 ^ k); (dec2bin(0:2^k-1)' - '0')];
    % re-order the rows of Y to get Ax, using the inverse of the
    % permutation [setdiff(1:n, is) is]
    perm([setdiff(1:n, is) is]) = 1:n;
    Ax = Y(perm, :);
end
Now rewrite Function to accept a matrix, or iterate over the columns of Ax = getAx(x) to evaluate every Function(y).
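If a non-MATLAB reference helps, here is the same construction in Python, with itertools.product standing in for dec2bin (a sketch; it returns the members of Ax as a list of tuples rather than as matrix columns):

from itertools import product

def get_ax(x):
    free = [i for i, v in enumerate(x) if v <= 0]   # indices that may be 0 or 1
    ys = []
    for bits in product((0, 1), repeat=len(free)):  # all 2^k combinations
        y = [1] * len(x)
        for idx, b in zip(free, bits):
            y[idx] = b
        ys.append(tuple(y))
    return ys

print(get_ax([-1, 2]))  # [(0, 1), (1, 1)]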

Evenly space n items over m iterations

For context, this is to control multiple stepper motors simultaneously in a high-accuracy application.
Problem statement
Say I have a loop that will run i iterations. Over the course of those iterations, expression E_x should evaluate to true x times (x <= i is guaranteed).
Requirements
- E_x must evaluate to true exactly x times
- E_x must evaluate to true at more or less evenly spaced intervals*
* "evenly spaced intervals" means that the maximum interval size is minimized
Examples
For: i = 10, x = 7
E_x will be true on iterations marked 1: 1101101101
For: i = 10, x = 3
E_x will be true on iterations marked 1: 0010010010
For: i = 10, x = 2
E_x will be true on iterations marked 1: 0001000100
What is the best (or even "a good") way to have E_x evaluate to true at evenly spaced intervals while guaranteeing that it is true exactly x times?
This question is close to mine; however, it assumes that E_x will always evaluate to true in the first and last iterations, which does not meet my requirements (see the second example above).
I'll use a slightly different naming convention: let there be T time steps [1..T] and N events to be fired. Also, let's treat the problem as cyclic: add one fake step at the end at which we are guaranteed to fire an event (this is also the event at time 0, i.e., before the cycle). So my T is your i+1 and my N is your x+1.
If you divide T by N with remainder, you get T = w*N + r. If r = 0 the case is trivial. If r != 0, the best you can achieve is r intervals of size w+1 and N-r intervals of size w. A fast and simple but good-enough solution looks like this (pseudocode; / is integer division):
events = []
w = T / N
r = T % N
current = 0
for (i = 1; i <= N; i++) {
    current += w;
    if (i <= r)
        current += 1;
    events[i] = current;
}
You can see that the last value in the array will be T, as promised by our restatement as a cyclic problem: over the cycle we add w to current N times and add 1 r times, so the sum is w*N + r, which is T.
The main drawback of this solution is that all the "long" intervals sit at the start while all the "short" intervals sit at the end.
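Here is the same pseudocode as runnable Python, just to make the output concrete (event times are 1-based step indices, and the last one is always T):

def event_times(T, N):
    w, r = divmod(T, N)
    events, current = [], 0
    for i in range(1, N + 1):
        current += w + (1 if i <= r else 0)  # the first r intervals get the extra step
        events.append(current)
    return events

print(event_times(11, 8))  # [2, 4, 6, 7, 8, 9, 10, 11] - long intervals first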
You can spread the intervals more evenly if you are a bit smarter, and the resulting logic is essentially the same as the one behind Bresenham's line algorithm referenced in the comments. Imagine you are drawing a line on a plane, where the X-axis represents time and the Y-axis represents events, from (0,0) (the 0-th event, just before your timeframe) to (i+1, x+1) (the (x+1)-th event, just after your timeframe). The moment to raise an event is when you switch to the next Y, i.e. when you draw the first pixel at a given Y.
If you want to do x increments over n iterations, you can do it like this:
int incCount = 0;
int iterCount = 0;

boolean step() {
    ++iterCount;
    int nextCount = (iterCount * x + n / 2) / n; // this is rounding division
    if (nextCount > incCount) {
        ++incCount;
        return true;
    } else {
        return false;
    }
}
That's the easy-to-understand way. If you're on an embedded CPU where division is more expensive, you can accomplish exactly the same thing like this:
int accum = n / 2;

boolean step() {
    accum += x;
    if (accum >= n) {
        accum -= n;
        return true;
    } else {
        return false;
    }
}
The total amount added to accum here is iterCount*x + n/2, just like in the first example, but the division is replaced by incremental repeated subtraction. This is the way Bresenham's line-drawing algorithm works.
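A small Python check of the accumulator variant shows the spacing it produces; for i = 10, x = 7 it gives 1011101101 - a slightly different, but equally valid, spacing than the example pattern in the question:

def even_steps(n, x):
    accum = n // 2
    for _ in range(n):
        accum += x
        if accum >= n:       # time to fire: consume n from the accumulator
            accum -= n
            yield True
        else:
            yield False

pattern = ''.join('1' if s else '0' for s in even_steps(10, 7))
print(pattern, pattern.count('1'))  # 1011101101 7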

How to loop over matrix in Octave to generate cross-term polynomial of order n

What I am trying to do is the following: I have an n x m matrix, with n rows of data and m columns. Each of these columns is a different variable (think X, Y, Z, etc.).
What I want is to output an n x (m + f(m, i)) matrix, where i is the requested order of the polynomial and f(m, i) is the number of added terms, including the cross terms of the polynomial.
I'll give an example, say I have a matrix with one row and three columns, and I want to return the polynomial terms up to order 3.
input = [x, y, z]
I want to get to
output = [x, y, z, x^2, y^2, z^2, x*y, x*z, y*z, x^3, y^3, z^3, x^2*y, x^2*z, x*y^2, y^2*z, x*z^2, y*z^2, x*y*z]
From this we see f(3, 3) = 16.
I know I can do this with m nested loops, and I believe I can vectorize any algorithm over the number of rows, but it would be helpful to have a more efficient algorithm than brute force.
This can be done numerically with the following code; it should be fairly easy to do symbolically as well.
function MatrixWithPolynomialTerms = GeneratePolynomialTerms (InputDataMatrix, n)
    resultMatrix = InputDataMatrix;
    [nr, nc] = size (InputDataMatrix);
    % enumerate all exponent combinations: one column of combs per candidate term
    cart = nthargout ([1:nc], @ndgrid, [0:n]);
    combs = cell2mat (cellfun (@(c) c(:), cart, "UniformOutput", false))';
    for i = 1:size (combs, 2)
        % keep terms of total order 2..n (order 0 and 1 are already in the input)
        s = sum (combs(:, i));
        if (s >= 2 && s <= n)
            resultColumn = ones (nr, 1);
            for j = 1:nc
                resultColumn .*= InputDataMatrix(:, j) .^ combs(j, i);
            end
            resultMatrix = [resultMatrix, resultColumn];
        end
    end
    MatrixWithPolynomialTerms = resultMatrix;
endfunction
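For comparison, here is a Python/NumPy sketch of the same idea, enumerating exponent tuples with itertools.product and keeping those of total order 2..n (the column ordering differs from the Octave version):

import numpy as np
from itertools import product

def polynomial_terms(X, n):
    nr, nc = X.shape
    cols = [X]
    for exps in product(range(n + 1), repeat=nc):   # one exponent per variable
        if 2 <= sum(exps) <= n:                     # skip constant and first-order terms
            cols.append(np.prod(X ** np.array(exps), axis=1, keepdims=True))
    return np.hstack(cols)

print(polynomial_terms(np.array([[2.0, 3.0, 5.0]]), 3).shape)  # (1, 19): m + f(3, 3)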

Improving performance of interpolation (Barycentric formula)

I have been given an assignment in which I am supposed to write an algorithm which performs polynomial interpolation by the barycentric formula. The formula states that:
p(x) = ( sum_{j=0..n} w_j * f_j / (x - x_j) ) / ( sum_{j=0..n} w_j / (x - x_j) )
I have written an algorithm which works just fine, and I get the polynomial output I desire. However, it requires some quite long loops, and for a large number of grid points a lot of slow loop operations have to be done. Thus, I would greatly appreciate any hints on how I might improve this so that I can avoid all these loops.
In the algorithm, x and f stand for the given points we are supposed to interpolate, w stands for the barycentric weights, which have been calculated before running the algorithm, and grid is the linspace over which the interpolation should take place:
function p = barycentric_formula(x,f,w,grid)
    % Assert that the x and f vectors have the same length.
    if length(x) ~= length(f)
        fprintf('Not equal amounts of x- and y-values. Function is terminated.\n')
        return;
    end
    n = length(x);
    m = length(grid);
    p = zeros(1,m);
    % Loops for finding polynomial values at the grid points. All values are
    % calculated by the barycentric formula.
    for i = 1:m
        var = 0;
        sum1 = 0;
        sum2 = 0;
        for j = 1:n
            if grid(i) == x(j)
                p(i) = f(j);
                var = 1;
            else
                sum1 = sum1 + (w(j)*f(j))/(grid(i) - x(j));
                sum2 = sum2 + (w(j)/(grid(i) - x(j)));
            end
        end
        if var == 0
            p(i) = sum1/sum2;
        end
    end
This is a classical case for MATLAB 'vectorization'. I would say: just remove the loops. It is almost that simple. First, have a look at this code:
function p = bf2(x, f, w, grid)
    m = length(grid);
    p = zeros(1,m);
    for i = 1:m
        var = grid(i)==x;
        if any(var)
            p(i) = f(var);
        else
            sum1 = sum((w.*f)./(grid(i) - x));
            sum2 = sum(w./(grid(i) - x));
            p(i) = sum1/sum2;
        end
    end
end
I have removed the inner loop over j. All I did here was remove the (j) indexing and change the arithmetic operators from / to ./ and from * to .* - the same operations, but with a dot in front to signify that they are performed element by element. These are called array operators, in contrast to the ordinary matrix operators. Also note that treating the special case where a grid point falls onto x is very similar to the original implementation, only using a logical vector var such that x(var)==grid(i).
Now, you can also remove the outermost loop. This is a bit more tricky, and there are two major approaches to doing it in MATLAB. I will do it the simpler way, which can be less efficient but is clearer to read - using repmat:
function p = bf3(x, f, w, grid)
    % Find grid points that coincide with x.
    % The below compares all grid values with all x values
    % and returns a matrix of 0/1 in which the entry (row,col)
    % is 1 iff grid(row)==x(col).
    var = bsxfun(@eq, grid', x);
    % find the logical indices of those x entries
    varx = sum(var, 1)~=0;
    % and of those grid entries
    varp = sum(var, 2)~=0;
    % Outermost-loop removal - use repmat to
    % replicate the vectors into matrices.
    % Thus, instead of having a loop over j,
    % you have matrices of values that would be
    % referenced in the loop.
    ww = repmat(w, numel(grid), 1);
    ff = repmat(f, numel(grid), 1);
    xx = repmat(x, numel(grid), 1);
    gg = repmat(grid', 1, numel(x));
    % perform the calculations element-wise on the matrices
    sum1 = sum((ww.*ff)./(gg - xx),2);
    sum2 = sum(ww./(gg - xx),2);
    p = sum1./sum2;
    % fix the cases where grid==x and return
    p(varp) = f(varx);
end
The fully vectorized version can be implemented with bsxfun rather than repmat. This can potentially be a bit faster, since the matrices are not explicitly formed. However, the speed difference may not be large for small problem sizes.
Also, the first solution with one loop is not too bad performance-wise. I suggest you test both and see which is better. Maybe full vectorization is not worth it? The first code looks a bit more readable.
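If a non-MATLAB reference is useful, here is a hedged NumPy sketch of the fully vectorized formula; it handles coincident grid points the same way as bf3 (it assumes x, f, w, and grid are 1-D NumPy arrays):

import numpy as np

def barycentric(x, f, w, grid):
    diff = grid[:, None] - x[None, :]       # (m, n) matrix of grid(i) - x(j)
    exact = diff == 0                       # grid points that coincide with a node
    with np.errstate(divide='ignore', invalid='ignore'):
        c = w / diff                        # w(j) / (grid(i) - x(j))
        p = (c @ f) / c.sum(axis=1)
    hit = exact.any(axis=1)                 # overwrite the coincident rows
    p[hit] = f[exact.argmax(axis=1)[hit]]
    return p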

Generate Random(a, b) making calls to Random(0, 1)

There is a known Random(0,1) function; it is a uniform random function, which means it returns 0 or 1, each with probability 50%. Implement Random(a, b) that only makes calls to Random(0,1).
What I have thought of so far: put the range [a, b] in a 0-based array, so that I have indices 0, 1, 2, ..., b-a.
Then call RANDOM(0,1) b-a times, sum the results to get an index, and return the element at that index.
However, since there is no answer in the book, I don't know if this approach is correct or the best one. How can I prove that the probability of returning each element is exactly 1/(b-a+1)?
And what is the right/better way to do this?
If your RANDOM(0, 1) returns either 0 or 1, each with probability 0.5, then you can generate bits until you have enough to represent the number b-a+1 in binary. This gives you a random number in a slightly-too-large range: you can test and repeat if it falls outside. Something like this (in Python):
import math

def rand_pow2(bit_count):
    """Return a random number with the given number of bits."""
    result = 0
    for i in range(bit_count):
        result = 2 * result + RANDOM(0, 1)
    return result

def random_range(a, b):
    """Return a random integer in the closed interval [a, b]."""
    bit_count = math.ceil(math.log2(b - a + 1))
    while True:
        r = rand_pow2(bit_count)
        if a + r <= b:
            return a + r
When you sum random numbers, the result is no longer evenly distributed - it tends toward a Gaussian (bell-shaped) distribution. Look up the central limit theorem or read any probability book or article. Just as flipping a coin 100 times is highly unlikely to give 100 heads - it is likely to give close to 50 heads and 50 tails.
Your inclination to reduce the problem to the range 0 to b-a first is correct. However, you cannot do it as you stated. This question asks exactly how to do that, and the answer is rejection sampling. Let m = b-a+1, the number of values needed. Find the smallest e such that 2^e >= m, and let k be the largest integer with k*m <= 2^e. Then repeatedly generate e bits with RANDOM(0,1), interpret them as the base-2 expansion of a number x, and if x < k*m, return x % m; otherwise try again. The program looks something like this:
int random_m(int m) { // uniform on 0, 1, ..., m-1
    // find the smallest e such that 2^e >= m
    // (in C, ^ is XOR, so powers of two are computed with shifts)
    int e = 0;
    while ((1 << e) < m) {
        ++e;
    }
    // find the largest k such that k*m <= 2^e
    int k = (1 << e) / m;
    while (1) {
        // generate a random e-bit number
        int x = 0;
        for (int i = 0; i < e; ++i) {
            x = x * 2 + RANDOM(0, 1);
        }
        // if x isn't too large, return x modulo m
        if (x < k * m)
            return (x % m);
    }
}
Now you can simply compute random_m(b - a + 1) and add a to the result to get uniformly distributed numbers between a and b.
Divide and conquer can help us generate a random number in the range [a,b] using random(0,1). The idea is:
if a is equal to b, then the random number is a
find the mid of the range [a,b]
generate random(0,1)
if the above is 0, return a random number in the range [a,mid] using recursion
else return a random number in the range [mid+1, b] using recursion
Note, however, that the two halves are chosen with equal probability even when their sizes differ, so this is only exactly uniform when b-a+1 is a power of two.
The working 'C' code is as follows.
int random(int a, int b)
{
    if (a == b)
        return a;
    int c = RANDOM(0, 1); // returns 0 or 1 with probability 0.5
    int mid = a + (b - a) / 2;
    if (c == 0)
        return random(a, mid);
    else
        return random(mid + 1, b);
}
If you have an RNG that returns {0, 1} with equal probability, you can easily create an RNG that returns numbers 0 to 2^n - 1 with equal probability.
To do this you just use your original RNG n times and read the results as a binary number like 0010110111. Each of the numbers from 0 to 2^n - 1 is equally likely.
Now it is easy to get an RNG from a to b when b - a + 1 = 2^n: you just create the previous RNG and add a to the result.
Now the last question is what to do when b - a + 1 is not a power of 2.
The good thing is that you have to do almost nothing extra: rely on the rejection sampling technique. It tells you that if you have an RNG over a big set and you need to select an element from a subset of that set, you can keep drawing elements from the bigger set and discarding them until one lands in your subset.
So all you do is find the first n such that b - a + 1 <= 2^n, then use rejection sampling until you pick an element smaller than b - a + 1, and add a to it.
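This is essentially the same scheme as the Python answer above; here is a compact sketch for reference (RANDOM is the given 0/1 primitive, and the name random_ab is mine):

def random_ab(a, b):
    m = b - a + 1
    n = (m - 1).bit_length()   # the first n such that m <= 2^n
    while True:
        x = 0
        for _ in range(n):     # concatenate n fair bits
            x = 2 * x + RANDOM(0, 1)
        if x < m:              # rejection: retry values outside [0, m-1]
            return a + x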

Randomly Generate a set of numbers of n length totaling x

I'm working on a project for fun and I need an algorithm that does the following:
Generate a list of numbers of Length n which add up to x
I would settle for a list of integers, but ideally I would like to be left with a set of floating point numbers.
I would be very surprised if this problem hasn't been heavily studied, but I'm not sure what to look for.
I've tackled similar problems in the past, but this one is decidedly different in nature. Before, I've generated different combinations of a list of numbers that add up to x. I'm sure I could simply brute-force this problem, but that hardly seems like the ideal solution.
Anyone have any idea what this may be called, or how to approach it? Thanks all!
Edit: To clarify, I mean that the list should be length N while the numbers themselves can be of any size.
Edit 2: Sorry for my improper use of 'set'; I was using it as a catch-all term for a list or an array. I understand that it was causing confusion, my apologies.
This is how to do it in Python
import random
def random_values_with_prescribed_sum(n, total):
x = [random.random() for i in range(n)]
k = total / sum(x)
return [v * k for v in x]
Basically you pick n random numbers, compute their sum and compute a scale factor so that the sum will be what you want it to be.
Note that this approach will not produce "uniform" slices, i.e. the distribution you get tends to be more "egalitarian" than it would be if the list were picked uniformly at random among all lists with the given sum.
To see the reason, picture what the algorithm does in the case of two numbers with a prescribed sum (e.g. 1):
A generic point P, obtained by picking two random numbers, is uniform inside the square [0,1]x[0,1]. The point Q obtained by scaling P so that its coordinates sum to 1 is the projection of P, along the ray from the origin, onto the segment from (1,0) to (0,1). Points near the middle of that segment have a higher probability: for example, its midpoint (0.5,0.5) receives every point on the square's diagonal (0,0)-(1,1), while the endpoint (0,1) receives only the points of the side (0,0)-(0,1) - and the diagonal has length sqrt(2)=1.4142..., while the side has length only 1.0.
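A quick numerical check of this bias (an illustrative sketch, not part of the original answer): sample many scaled pairs and compare how often the first coordinate lands near the middle versus near an end, using two windows of equal width:

import random

def scaled_first_coordinate():
    a, b = random.random(), random.random()
    return a / (a + b)  # first coordinate after scaling the pair to sum to 1

samples = [scaled_first_coordinate() for _ in range(100000)]
near_middle = sum(0.4 < t < 0.6 for t in samples) / len(samples)
near_end = sum(t < 0.2 for t in samples) / len(samples)
print(near_middle, near_end)  # about 0.33 vs 0.125: the middle is favored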
Actually, you need to generate a random composition of x into n non-negative parts (for integer x). This is usually done in the following way ("stars and bars"): reserve x + n - 1 free places, put bars (borders) on n - 1 arbitrarily chosen places, and stones on the rest. The stone groups between consecutive bars add up to x, so the number of possible compositions is the binomial coefficient C(x + n - 1, n - 1).
So your algorithm could be as follows: choose a uniformly random (n-1)-subset of an (x + n - 1)-set; it uniquely determines a composition of x into n parts.
Knuth's TAOCP discusses random sampling in chapter 3.4.2; see Algorithm S there.
Algorithm S (choose n records at random from a total of N):
1. Set t = 0, m = 0 (t records examined so far, m selected).
2. Let u be random, uniformly distributed on (0, 1).
3. If (N - t)*u >= n - m, skip the t-th record and increase t by 1; otherwise include the t-th record in the sample and increase both m and t by 1.
4. If m < n, return to step 2; otherwise, the algorithm is finished.
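A brief Python sketch of that subset construction (using the standard library's random.sample in place of Algorithm S, purely for brevity):

import random

def random_composition(x, n):
    # choose n-1 bar positions among x + n - 1 places (stars and bars)
    bars = sorted(random.sample(range(x + n - 1), n - 1))
    prev, parts = -1, []
    for b in bars:
        parts.append(b - prev - 1)  # stars between consecutive bars
        prev = b
    parts.append(x + n - 2 - prev)  # stars after the last bar
    return parts

print(random_composition(10, 4))  # e.g. [3, 0, 5, 2] - always n parts summing to x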
The solution for non-integers is algorithmically trivial: you just select n arbitrary numbers that don't sum up to 0, and normalize them by their sum.
If you want to sample uniformly in the region of (N-1)-dimensional space defined by x1 + x2 + ... + xN = x with all xi >= 0, then you're looking at a special case of sampling from a Dirichlet distribution. The sampling procedure is a little more involved than generating uniform deviates for the xi. Here's one way to do it, in Python:
import random

xs = [random.gammavariate(1, 1) for a in range(N)]
xs = [x * v / sum(xs) for v in xs]  # x is the desired total
If you don't care too much about the sampling properties of your results, you can just generate uniform deviates and correct their sum afterwards.
Here is a version of the above algorithm in JavaScript:
function getRandomArbitrary(min, max) {
    return Math.random() * (max - min) + min;
}

function getRandomArray(min, max, n) {
    var arr = [];
    for (var i = 0; i < n; i++) {
        arr.push(getRandomArbitrary(min, max));
    }
    return arr;
}

function randomValuesPrescribedSum(min, max, n, total) {
    var arr = getRandomArray(min, max, n);
    var sum = arr.reduce(function(pv, cv) { return pv + cv; }, 0);
    var k = total / sum;
    return arr.map(function(x) { return k * x; });
}
You can call it with
var myarray = randomValuesPrescribedSum(0,1,3,3);
And then check it with
var sum = myarray.reduce(function(pv, cv) { return pv + cv;},0);
This code does a reasonable job. I think it produces a different distribution than 6502's answer (sorting uniform break points like this samples uniformly among all lists with the given sum), but I am not sure which is better or more natural. Certainly his code is clearer/nicer.
import random

def parts(total_sum, num_parts):
    points = [random.random() for i in range(num_parts - 1)]
    points.append(0)
    points.append(1)
    points.sort()
    ret = []
    for i in range(1, len(points)):
        ret.append((points[i] - points[i - 1]) * total_sum)
    return ret

def test(total_sum, num_parts):
    ans = parts(total_sum, num_parts)
    assert abs(sum(ans) - total_sum) < 1e-7
    print(ans)

test(5.5, 3)
test(10, 1)
test(10, 5)
In Python:
a: create a list of (random numbers from 0 to 1) multiplied by total; append 0 and total to the list
b: sort the list, then measure the distance between consecutive elements
c: round the list elements
import random
import time

TOTAL = 15
PARTS = 4
PLACES = 3

def random_sum_split(parts, total, places):
    a = [0, total] + [random.random() * total for i in range(parts - 1)]
    a.sort()
    b = [(a[i] - a[i - 1]) for i in range(1, parts + 1)]
    if places is None:
        return b
    else:
        b.pop()
        c = [round(x, places) for x in b]
        c.append(round(total - sum(c), places))
        return c

def tick():
    # info and log come from the poster's environment
    if info.tick == 1:
        start = time.time()
        alpha = random_sum_split(PARTS, TOTAL, PLACES)
        end = time.time()
        log('alpha: %s' % alpha)
        log('total: %.7f' % sum(alpha))
        log('parts: %s' % PARTS)
        log('places: %s' % PLACES)
        log('elapsed: %.7f' % (end - start))
yields:
[2014-06-13 01:00:00] alpha: [0.154, 3.617, 6.075, 5.154]
[2014-06-13 01:00:00] total: 15.0000000
[2014-06-13 01:00:00] parts: 4
[2014-06-13 01:00:00] places: 3
[2014-06-13 01:00:00] elapsed: 0.0005839
To the best of my knowledge, this distribution is uniform.
