Numerical optimization and minimization - algorithm

Lets say I have a reference node R and several test nodes T1, T2.... Tn.
Any particular node has a set of properties Rp1, Rp2, ... Rpn and T1p1, T1p2, T1p3, ... T1pn, and T2p1, T2p2, T2p3, ... T2pn, and so on. So, any node can have n properties, each of a particular type.
I have my own method of defining the distance between any two properties of the same kind between any two nodes. Furthermore, I would weigh the distances between properties and then sum them up. Thus, the distance between R and T1 would be:
dRT1 = w1*dRT1p1 + w2*dRT1p2 + w3*dRT1p3 + w4*dRT1p4 + ... wn*dRT1pn.
Now, given the reference node R, and the test nodes T1, T2 .... Tn, and given that I know the distance is the least between R and a particular node Tm (1<m<n), and if the weights are actually variables and the distances are actually constants, how do I calculate the weights such that dRTm is the minimal among all the distances between R and every other test nodes.
We have the distances dRT1, dRT2, dRT3, dRT4, ... dRTn and we know that dRTm is minimum. What algorithm should we use to determine the weights?

It seems what you want to do is to set the weights so that a particular distance (dRTm) gets a lower numerical value than any other distance, i.e. set the weights so that the inequalities
dRTm <= dRT1
...
dRTm <= dRTn
are all fulfilled. Setting all the weights to zero as mentioned in one of the comments is a trivial solution because all distances will be identically zero and all the inequalities trivially fulfilled (with equality replacing inequality) so it makes more sense to consider the stronger problem
dRTm < dRT1
...
dRTm < dRT(m-1)
dRTm < dRT(m+1)
...
dRTm < dRTn
In any case, this a simple linear programming problem. Put in the above constraints and then minimize
min : dRTm
It is linear because the calculated individual distances are constants at the time of solving for minimum dRTm. You can solve this with any linear programming package, or if you want to cook up a slow but easy to implement solution by yourself, e.g. with Fourier-Motzkin elimination. Simplex would be the actual method of choice, however.

This looks like http://en.wikipedia.org/wiki/Linear_regression, where after the article says "Thus the model takes the form" y is dRt1, the x variables are e.g. dRT1p1, and the beta variables they are trying to work out - their parameter vector - is your w1, w2...

Related

Finding algorithm that minimizes a cost function

I have a restroom that I need to place at some point. I want the restroom's placement to minimize the total distance people have to travel to get there.
So I have x apartments, and each house has n people living in each apartment, so the apartments would be like a_1, a_2, a_3, ... a_x and the number of people in a_1 would be n_1, a_2 would be n_2, etc. No two apartments can be in the same space and each apartment has a positive number of people.
So I know the distance between an apartment a_1 and the proposed bathroom, placed at a, would be |a_1 - a|.
MY WORKING:
I defined a cost function, C(a) = SUM[from i = 1 to x] (n_i)|a_i - a|. I want to find the location a that minimizes this cost function, given two arrays - one for the location of the apartments and one for the number of people in each apartment. I want my algorithm to be in O(n) time.
I was thinking of representing this as a graph and using MSTs or Djikstra's but that would not meet the O(n) runtime. Clearly, there must be something I can do without graphs, but I am unsure.
My understanding of your problem:
You have a once dimensional line with points a1,...,an. Each point has a value n1,....n, and you need to pick a point a that minimizes the cost function
SUM[from i = 1 to x] (n_i)|a_i - a|.
Lets assume our input a1...an is sorted.
Our strategy will be a sweep from left to right, calculating possible a on the way.
Things we will keep track of:
total_n : the total number of people
left_n : the number of people living to the left or at our current position
right_n : the number of people living to the right of our current position
a: our current postition
Calculate:
C(a)
left_n = n1
right_n = total_n - left_n
Now we consider what happens to the sum if we move our restroom to the right 1 step. The people on the left get 1 step further away, but the people on the right get 1 step closer.
We can say that C(a+1) = C(a) +left_n -right_n
If the range an-a1 is fairly small, we can use this and just step through the range using this formula to update the sum. Note that when this sum starts increasing we have gone too far and can safely stop.
However, if the apartments are very far apart we cannot step 1 by 1 unit. We need instead to step apartment by apartment. Note
C(a[i]) = C(a[i-1]) + left_n*(a[i]-a[a-1]) - right_n*(a[i]-a[i-1])
If at any point C(a[i]) > C(a[i-1]) we know that the correct position of the restroom is somewhere between i and i-1.
We can calculate that position, lets call it x.
The sum at x is C(a[i-1]) + left_n*(x-a[i-1]) - right_n*(x-a[i-1]) and we want to minimize this. Note that everything but x is known values.
We can simplify to
f(x) = C(a[i-1]) + left_n*x-left_n*a[i-1]) - right_n*x-left_n*a[i-1])
Constant terms cannot affect our decision so we are actually looking to minize
f(x) = x*(left_n-right_n)
We see that if left_n < right_n we want the restroom to be at i+1, but if left_n > right_n we want the restroom to be at i.
We need to at most do this calculation at each apartment, so the running time is O(n).

Shortest Path for n entities

I'm trying to solve a problem that is about minimizing the distance traveled by a group of n entities who have to go trough a group of x points in a given order.
The n entities all start in the same position (1,1) and then I'm given x points that are in a queue and have to be "answered" in the correct order. However, I want the distance to be minimal.
My approach to the algorithm so far was to order the entities in increasing order of their distance to the x that is next in line. Then, I'd check, from the ones that are closer to the ones that are the furthest away, if this distance to the next in line is bigger than the distance to the one that comes afterwards to minimize the distance. The closest one to not fulfill this condition went to answer. If all were closer to the x that came afterwards, I'd reorder them in increasing order of distance to the one that came afterwards and send the furthest away from this to answer the x. Since this is a test problem I'm doing as practice for a competition I know what the result should be for my test case and it seems I'm doing this wrong.
How should I implement such an algorithm that guarantees that the distance is minimal?
The algorithm you've described sounds like a greedy search algorithm. Greedy algorithms are not guaranteed to find optimal solutions except under specific conditions that don't seem to hold here.
This looks like a candidate for a dynamic programming formulation. Alternatively, you can use heuristic search such as the A* search algorithm. I'd go with the latter if I was in the competition. See the link for a description of how the algorithm works, how to implement it, and how you might apply it to your problem.
Although since the points need to be visited in order, there is a bound on the number of possible arrangements, I couldn't think of a more efficient way than the following formulation. Let f(ns, i) represent the optimal arrangement up to the ith point, where ns is the list of the last chosen point for each entity that has at least one point. Then we have at most two choices: either start a new entity if we haven't run out, or try the current point as the next visit for each entity.
Python recursion:
import math
def d(p1, p2):
return math.sqrt(math.pow(p1[0] - p2[0], 2) + math.pow(p1[1] - p2[1], 2))
def f(ps, n):
def g(ns, i):
if i == len(ps):
return 0
# start a new entity if we haven't run out
best = d((0,0), ps[i]) + g(ns[:] + [i], i + 1) if len(ns) < n else float('inf')
# try the current point as the next visit for each entity
for entity_idx, point_idx in enumerate(ns):
_ns = ns[:]
_ns[entity_idx] = i
best = min(best, d(ps[point_idx], ps[i]) + g(_ns, i + 1))
return best
return d((0,0), ps[0]) + g([0], 1)
Output:
"""
p5
p3
p1
(0,0) p4 p6
p2
"""
points = [(0,1), (0,-1), (0,2), (2,0), (0,3), (3,0)]
print f(points, 3) # 7.0

implementing stochastic ACO algorithm

I am trying to implement a stochastic ant colony optimisation algorithm, and I'm having trouble working out how to implement movement choices based on probabilities.
the standard (greedy) version that I have implemented so far is that an ant m at a vertex i on a graph G = (V,E) where E is the set of edges (i, j), will choose the next vertex j based on the following criteria:
j = argmax(<fitness function for j>)
such that j is connected to i
the problem I am having is in trying to implement a stochastic version of this, so that now the criteria for choosing a new vertex, j is:
P(j) = <fitness function for j>/sum(<fitness function for J>)
where P(j) is the probability of choosing vertex j,
such j is connected to i,
and J is the set of all vertices connected to i
I understand the mathematics behind it, I am just having trouble working out how i should actually implement it.
if, say, i have 3 vertices connected to i, each with a probability of 0.2, 0.3, 0.5 - what is the best way to make the selection? should I just randomly select a vertex j, then generate a random number r in the range (0,1) and if r >= P(j), select vertex j? or is there a better way?
Looking at the problem statement, I think you are not trying to visit all nodes (connected to i (say) ), but some of the nodes based on some probability distribution. Lets take an example:
You have a node i and connected to it are 5 nodes, a1...a5, with probabilities p1...p5, such that sum(p_i) = 1. No, say the precision of probabilities that you consider is 2 places after decimal. Also, you dont want to visit all 5 nodes, but only k of them. Lets say, in this example, k = 2. So, since 2 places of decimal is your probability precision, add 3 to it to increase normality of probability distribution in the random function. (You can change this 3 to any number of your choice, as far as performance is concerned) (Since you have not tagged any language, I'll take example of java's nextInt() function to generate random numbers.)
Lets give some values:
p1...p5 = {0.17, 0.11, 0.45, 0.03, 0.24}
Now, in a loop from 1 to k, generate a random number from (0...10^5). {5 = 2 + 3, ie. precision + 3}. If the generated number is from 0 to 16999, go with node a1, 17000 to 27999, go with a2, 28000 to 72999, go with a3...and so on. You get the idea.
What you're trying to implement is a weighted random choice depending on the probabilities for the components of the solution, or a random proportional selection rule on ACO terms. Here is an snippet of the implementation of this rule on the Isula Framework:
double value = random.nextDouble();
while (componentWithProbabilitiesIterator.hasNext()) {
Map.Entry<C, Double> componentWithProbability = componentWithProbabilitiesIterator
.next();
Double probability = componentWithProbability.getValue();
total += probability;
if (total >= value) {
nextNode = componentWithProbability.getKey();
getAnt().visitNode(nextNode);
return true;
}
}
You just need to generate a random value between 0 and 1 (stored in value), and start accumulating the probabilities of the components (on the total variable). When the total exceeds the threshold defined in value, we have found the component to add to the solution.

How many thresholds and distance matrix are in Eigenface?

I edited my question trying to make it as short and precise.
I am developing a prototype of a facial recognition system for my Graduation Project. I use Eigenface and my main source is the document Turk and Pentland. It is available here: http://www.face-rec.org/algorithms/PCA/jcn.pdf.
My doubts focus on step 4 and 5.
I can not correctly interpret the number of thresholds: If two types of thresholds, or only one (Notice that the text speaks of two types but uses the same symbol). And again, my question is whether this (or these) threshold(s) is unique and global for all person or if each person has their own default.
I understand the steps to be calculated until an matrix O() of classes with weights or weighted. So this matrix O() is of dimension M'x P. Since M' equal to the amount of eigenfaces chosen and P the number of people.
What follows and confuses me. He speaks of two distances: the distance of a class against another, and also from a distance of one face to another. I call it D1 and D2 respectively. NOTE: In the training set there are M images in total, with F = M / P the number of images per person.
I understand that threshold(s) should be chosen empirically. But there must be a way to approximate. I was initially designing a matrix of distances D1() of dimension PxP. Where the row vector D(i) has the distances from the vector average class O(i) to each O(j), j = 1..P. Ie a "all vs all."
Until I came here, and what follows depends on whether I should actually choose a single global threshold for all. Or if I should be chosen for each individual value. Also not if they are 2 types: one for distance classes, and one for distance faces.
I have a theory as could proceed but not so supported by the concepts of Turk:
Stage Pre-Test:
Gender two matrices of distances D1 and D2:
In D1 would be stored distances between classes, and in D2 distances between faces. This basis of the matrices W and A respectively.
Then, as indeed in the training set are P people, taking the F vectors columns D1 for each person and estimate a threshold T1 was in range [Min, Max]. Thus I will have a T1(i), i = 1..P
Separately have a T2 based on the range [Min, Max] out of all the matrix D2. This define is a face or not.
Step Test:
Buid a test set of image with a 1 image for each known person
Itest = {Itest(1) ... Itest(P)}
For every image Itest(i) test:
Calculate the space face Atest = Itest - Imean
Calculate the weight vector Otest = UT * Atest
Calculating distances:
dist1(j) = distance(Otest, O (j)), j = 1..P
Af = project(Otest, U)
dist2 = distance(Atest, Af)
Evaluate recognition:
MinDist = Min(dist1)
For each j = 1..P
If dist2 > T2 then "not is face" else:
If MinDist <= T1(j) then "Subject identified as j" else "subject unidentified"
Then I take account of TFA and TFR and repeat the test process with different threshold values until I find the best approach gives to each person.
Already defined thresholds can put the system into operation unknown images. The algorithm is similar to the test.
I know I get out of "script" of the official documentation but at least this reasoning is the most logical place my head. I wondered if I could give guidance.
EDIT:
i No more to say that has not already been said and that may help clarify things.
Could anyone tell me if I'm okay tackled with my "theory"? I'm moving into my project, and if this is not the right way would appreciate some guidance and does not work and you wrong.

Dynamic programming question

I am stuck with one of the algorithm homework problem. Can anyone give me some hint to solve it? Here is the question:
Consider a chain structured computation represented by a weighted graph G = (V;E) where
V = {v1; v2; ... ; vn} and E = {(vi; vi+1) such that 1<= i <= n-1. We are also given a chain-structure m identical processors P = {P1; ... ; Pm} (i.e., there exists a communication link between Pk and Pk+1 for 1 <= k <= m - 1).
The set of vertices V represents computation modules, and the set of edges E represents
communication between the two modules. Each node vi is assigned a weight wi denoting the
execution time of the module on a single processor. Each edge (vi; vi+1) is assigned a weight ci denoting the amount of communication time between the two modules if they are assigned two different processors. If multiple modules are assigned to the same processor, the modules assigned to the same processor must be consecutive. Suppose modules va; va+1; .. ; vb are assigned to Processor Pk. Then, the time taken by Pk, denoted by Tk, is the time to compute assigned modules plus the time to communicate between neighboring processors. Hence, Tk = wa+...+ wb + ca-1 + cb. Note here that ca-1 = 0 if a = 1 and cb = 0 if b = n.
The objective of the problem is to find an assignment V to P such that max1<=k<=m Tk
is minimized, where we assume that each processor must take at least one module. (This
assumption can be relaxed by adding m dummy modules with zero weight on computational
and communication time.)
Develop a dynamic programming algorithm to solve this problem in polynomial time(i.e O(mn))
I tried to find the minimum execution time for each Pk and then find the max, but I doubt my solution is dynamic programming since there is no recursive formula. Please give me some hints!
Thanks!
I think you might be able to modify the Viterbi algorithm to solve this problem.
okay. this is easy.
decompose your problem to be a function you need to minimise, say F(n,k). which results into the minimum assignment of the first n nodes to k first processors.
Then derive your formula like this, collecting the number of nodes on the kth processor.
F(n,k) = min[i=0..n]( max(F(i,k-1), w[i]+...+w[n]+c[i-1]+c[n]) )
c[0] = 0
F(*,0) = inf
F(0,*) = inf

Resources