Implementing a perceptron with backpropagation algorithm

I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer and 1 output in the second layer. I am using this for reference, but am having problems with convergence.
First, I will note that I am using a sigmoid function for activation, and so the derivative is (from what I understand) the sigmoid(v) * (1 - sigmoid(v)). So, that is used when calculating the delta value.
So, basically I set up the network and run for just a few epochs (go through each possible pattern -- in this case, 16 patterns of input). After the first epoch, the weights are changed slightly. After the second, the weights do not change and remain so no matter how many more epochs I run. I am using a learning rate of 0.1 and a bias of +1 for now.
The process of training the network is below in pseudocode (which I believe to be correct according to sources I've checked):
Feed Forward Step:
v = SUM[weight connecting input to hidden * input value] + bias
y = Sigmoid(v)
set hidden.values to y
v = SUM[weight connecting hidden to output * hidden value] + bias
y = Sigmoid(v)
set output value to y
Backpropagation of Output Layer:
error = desired - output.value
outputDelta = error * output.value * (1 - output.value)
Backpropagation of Hidden Layer:
for each hidden neuron h:
error = outputDelta * weight connecting h to output
hiddenDelta[i] = error * h.value * (1 - h.value)
Update Weights:
for each hidden neuron h connected to the output layer
h.weight connecting h to output = learningRate * outputDelta * h.value
for each input neuron x connected to the hidden layer
x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value
This process is of course looped through the epochs and the weight changes persist. So, my question is, are there any reasons that the weights remain constant after the second epoch? If necessary I can post my code, but at the moment I am hoping for something obvious that I'm overlooking. Thanks all!
EDIT: Here are the links to my code as suggested by sarnold:

I think I spotted the problem; funny enough, what I found is visible in your high-level description, but I only found what looked odd in the code. First, the description:
for each hidden neuron h connected to the output layer
h.weight connecting h to output = learningRate * outputDelta * h.value
for each input neuron x connected to the hidden layer
x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value
I believe the h.weight should be updated with respect to the previous weight. Your update mechanism sets it based only on the learning rate, the output delta, and the value of the node. Similarly, the x.weight is also being set based on the learning rate, the hidden delta, and the value of the node:
/*** Weight updates ***/
// update weights connecting hidden neurons to output layer
for (i = 0; i < output.size(); i++) {
    for (Neuron h : output.get(i).left) {
        h.weights[i] = learningRate * outputDelta[i] * h.value;
    }
}
// update weights connecting input neurons to hidden layer
for (i = 0; i < hidden.size(); i++) {
    for (Neuron x : hidden.get(i).left) {
        x.weights[i] = learningRate * hiddenDelta[i] * x.value;
    }
}
I do not know what the correct solution is, but I have two suggestions:
1. Replace these lines:
h.weights[i] = learningRate * outputDelta[i] * h.value;
x.weights[i] = learningRate * hiddenDelta[i] * x.value;
with these lines:
h.weights[i] += learningRate * outputDelta[i] * h.value;
x.weights[i] += learningRate * hiddenDelta[i] * x.value;
(+= instead of =.)
2. Replace these lines:
h.weights[i] = learningRate * outputDelta[i] * h.value;
x.weights[i] = learningRate * hiddenDelta[i] * x.value;
with these lines:
h.weights[i] *= learningRate * outputDelta[i];
x.weights[i] *= learningRate * hiddenDelta[i];
(Ignore the value and simply scale the existing weight. The learning rate should be 1.05 instead of .05 for this change.)
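The first suggestion (accumulating the change with += rather than overwriting the weight) can be sketched in Python as follows. This is only an illustration under assumptions: the names `sigmoid`, `train_epoch`, `w_hidden` and `w_out` are hypothetical and not taken from the asker's Java code, the bias is modelled as an extra constant input of 1.0, and the weights are drawn at random from [-1, 1].

```python
import math
import random
from itertools import product

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_epoch(w_hidden, w_out, patterns, lr=0.1):
    """One pass of online backpropagation over all patterns (updates in place)."""
    for inputs, target in patterns:
        x = list(inputs) + [1.0]                     # constant bias input
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
        h = hidden + [1.0]                           # bias unit for the output layer
        out = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))

        out_delta = (target - out) * out * (1.0 - out)
        # hidden deltas are computed from the output weights before they change
        hid_delta = [out_delta * w_out[j] * hidden[j] * (1.0 - hidden[j])
                     for j in range(len(hidden))]

        for j in range(len(w_out)):
            w_out[j] += lr * out_delta * h[j]        # accumulate, do not overwrite
        for j, row in enumerate(w_hidden):
            for i in range(len(row)):
                row[i] += lr * hid_delta[j] * x[i]

random.seed(0)
# all 16 four-bit patterns with their parity as the target
patterns = [(bits, float(sum(bits) % 2)) for bits in product((0.0, 1.0), repeat=4)]
w_hidden = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
w_out = [random.uniform(-1, 1) for _ in range(5)]

train_epoch(w_hidden, w_out, patterns)
after_first = list(w_out)
train_epoch(w_hidden, w_out, patterns)
weights_still_changing = after_first != w_out
```

With the accumulating update the weights keep moving from epoch to epoch. Whether a 4-4-1 network then actually converges on parity is a separate question that this sketch does not settle.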


Finite difference method for solving the Klein-Gordon equation in Matlab

I am trying to numerically solve the Klein-Gordon equation that can be found here. To make sure I solved it correctly, I am comparing it with an analytical solution that can be found on the same link. I am using the finite difference method and Matlab. The initial spatial conditions are known, not the initial time conditions.
I start off by initializing the constants and the space-time coordinate system:
close all
%% Constant parameters
A = 2;
B = 3;
lambda = 2;
mu = 3;
a = 4;
b = - (lambda^2 / a^2) + mu^2;
%% Coordinate system
number_of_discrete_time_steps = 300;
t = linspace(0, 2, number_of_discrete_time_steps);
dt = t(2) - t(1);
number_of_discrete_space_steps = 100;
x = transpose( linspace(0, 1, number_of_discrete_space_steps) );
dx = x(2) - x(1);
Next, I define and plot the analytical solution:
%% Analytical solution
Wa = cos(lambda * x) * ( A * cos(mu * t) + B * sin(mu * t) );
figure('Name', 'Analytical solution');
surface(t, x, Wa, 'edgecolor', 'none');
title('Wa(x, t) - analytical solution');
The plot of the analytical solution is shown here.
In the end, I define the initial spatial conditions, execute the finite difference method algorithm and plot the solution:
%% Numerical solution
Wn = zeros(number_of_discrete_space_steps, number_of_discrete_time_steps);
Wn(1, :) = Wa(1, :);
Wn(2, :) = Wa(2, :);
for j = 2 : (number_of_discrete_time_steps - 1)
    for i = 2 : (number_of_discrete_space_steps - 1)
        Wn(i + 1, j) = dx^2 / a^2 ...
            * ( ( Wn(i, j + 1) - 2 * Wn(i, j) + Wn(i, j - 1) ) / dt^2 + b * Wn(i - 1, j - 1) ) ...
            + 2 * Wn(i, j) - Wn(i - 1, j);
    end
end
figure('Name', 'Numerical solution');
surface(t, x, Wn, 'edgecolor', 'none');
title('Wn(x, t) - numerical solution');
The plot of the numerical solution is shown here.
The two plotted graphs are not the same, which is proof that I did something wrong in the algorithm. The problem is, I can't find the errors. Please help me find them.
To summarize, please help me change the code so that the two plotted graphs become approximately the same. Thank you for your time.
The finite difference discretization of w_tt = a^2 * w_xx - b*w is
( w(i,j+1) - 2*w(i,j) + w(i,j-1) ) / dt^2
= a^2 * ( w(i+1,j) - 2*w(i,j) + w(i-1,j) ) / dx^2 - b*w(i,j)
In your order this gives the recursion equation
w(i,j+1) = dt^2 * ( (a/dx)^2 * ( w(i+1,j) - 2*w(i,j) + w(i-1,j) ) - b*w(i,j) )
+2*w(i,j) - w(i,j-1)
The stability condition requires at least a*dt/dx < 1. The present parameters do not satisfy it: they give a ratio of about 2.6. Increasing the time discretization to 1000 points is sufficient.
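The ratio can be checked with a few lines of Python. This is a small sketch using the grid sizes from the question; the helper name `courant_number` is hypothetical.

```python
def courant_number(a, t_end, n_t, x_end, n_x):
    """Return a*dt/dx for linspace-style grids with n_t and n_x points."""
    dt = t_end / (n_t - 1)
    dx = x_end / (n_x - 1)
    return a * dt / dx

ratio_300 = courant_number(4, 2.0, 300, 1.0, 100)    # grid from the question
ratio_1000 = courant_number(4, 2.0, 1000, 1.0, 100)  # finer time grid
print(round(ratio_300, 2), round(ratio_1000, 2))     # 2.65 0.79
```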
Next up are the boundary conditions. Besides the two leading columns for times 0 and dt, one also needs to set the values at the boundaries x=0 and x=1. Copy them from the exact solution as well.
Wn(:,1:2) = Wa(:,1:2);
Wn([1 end],:) = Wa([1 end],:);
Then also correct the definition (and use) of b to that in the source
b = - (lambda^2 * a^2) + mu^2;
and the resulting numerical image looks identical to the analytical image in the color plot. The difference plot confirms the closeness.
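For readers without Matlab, the corrected scheme can be sketched in plain Python. It is written from the recursion and the fixes described in this answer and is not the answerer's own script. The name `exact` and the list-of-lists layout `w[j][i]` (time level j, space node i) are choices made for this sketch.

```python
import math

A, B, lam, mu, a = 2.0, 3.0, 2.0, 3.0, 4.0
b = mu**2 - (lam * a)**2            # corrected coefficient: -(lambda^2 * a^2) + mu^2
n_t, n_x = 1000, 100                # 1000 time points keep a*dt/dx below 1
dt = 2.0 / (n_t - 1)
dx = 1.0 / (n_x - 1)

def exact(i, j):
    """Analytical solution at space node i and time level j."""
    x, t = i * dx, j * dt
    return math.cos(lam * x) * (A * math.cos(mu * t) + B * math.sin(mu * t))

w = [[0.0] * n_x for _ in range(n_t)]
for j in range(n_t):                # boundary values at x = 0 and x = 1
    w[j][0] = exact(0, j)
    w[j][n_x - 1] = exact(n_x - 1, j)
for i in range(n_x):                # first two time levels
    w[0][i] = exact(i, 0)
    w[1][i] = exact(i, 1)

c2 = (a * dt / dx) ** 2
for j in range(1, n_t - 1):         # march forward in time
    for i in range(1, n_x - 1):
        w[j + 1][i] = (c2 * (w[j][i + 1] - 2 * w[j][i] + w[j][i - 1])
                       - dt**2 * b * w[j][i]
                       + 2 * w[j][i] - w[j - 1][i])

max_error = max(abs(w[j][i] - exact(i, j))
                for j in range(n_t) for i in range(n_x))
```

Here `max_error` plays the role of the difference plot: it should be small compared with the solution amplitude, which is of order 3.6.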

Why does the code terminate with a "Solution Not Found" error and "EXIT: Converged to a point of local infeasibility. Problem may be infeasible"?

I cannot seem to figure out why IPOPT cannot find a solution to this. Initially, I thought the problem was totally infeasible, but when I reduce the value of col_total to any number below 161000 or comment out the last constraint equation that contains col_total, it solves and exits with "Optimal Solution Found" and a final objective function value of -161775.256826753. I have solved the same maximization problem using Artificial Bee Colony and Particle Swarm Optimization techniques, and they return optimal objective function values of at least 225000 and 226000 respectively. Could it be that another solver is required? I have also tried APOPT, BPOPT, and IPOPT and have tinkered with the tolerance values, but no combination seems to work yet. The code is posted below. Any guidance will be hugely appreciated.
from gekko import GEKKO
import numpy as np
distances = np.array([[[0, 0],[0,0],[0,0],[0,0]],\
alpha = 0.5 / np.log(30/0.075)
diam = 31
free = 7
rho = 1.2253
area = np.pi * (diam / 2)**2
min_v = 5.5
axi_max = 0.32485226746
col_total = 176542.96546512868
rat = 14
nn = 5
u_hub_lowerbound = 5.777777777777778
c_pow = 0.59230249
p_max = 0.5 * rho * area * c_pow * free**3
# Initialize Model
m = GEKKO(remote=True)
#initialize variables, Set lower and upper bounds
x = [m.Var(value = 0.03902278, lb = 0, ub = axi_max) \
for i in range(nn)]
# i = 0
b = 1
c = 0
v_s = list()
for i in range(nn-1): # Loop runs for nn-1 times
# print(i)
# print(i,b,c)
squared_defs = list()
while i < b:
d = distances[b][c][0]
r = distances[b][c][1]
ss = (2 * (alpha * d) / diam)
tt = r / ((diam/2) + (alpha * d))
squared_defs.append((2 * x[i] / (1 + ss**2)) * np.exp(-(tt**2)) ** 2)
m.Equation((free * (1 - (sum(squared_defs))**0.5)) - rat <= 0)
m.Equation((free * (1 - (sum(squared_defs))**0.5)) - u_hub_lowerbound >= 0)
v_s.append(free * (1 - (sum(squared_defs))**0.5))
# Inserts free as the first item on the v_s list to
# increase len(v_s) to nn, so that 'v_s' and 'x'
# are of same length
v_s.insert(0, free)
gamma = list()
for i in range(len(x)):
bet = (4*x[i]*((1-x[i])**2) * rho * area) / 2
gam = bet * v_s[i]**3
m.Equation(x[i] - axi_max <= 0)
m.Equation((((4*x[i]*((1-x[i])**2) * rho * area) / 2) \
* v_s[i]**3) - p_max <= 0)
m.Equation((((4*x[i]*((1-x[i])**2) * rho * area) / 2) * \
v_s[i]**3) > 0)
m.Equation(col_total - sum(gamma) <= 0)
y = sum(gamma)
m.Maximize(y) # Maximize
#Set global options
m.options.IMODE = 3 #steady state optimization
#Solve simulation
m.options.SOLVER = 3
m.solver_options = ['linear_solver ma27','mu_strategy adaptive','max_iter 2500', 'tol 1.0e-5' ]
Build the equations without .value in the expressions. The x[i].value is only needed at the end to view the solution after the solve is complete, or to initialize the value of x[i]. The expression m.Maximize(y) is more readable than m.Obj(-y), although they are equivalent.
from gekko import GEKKO
import numpy as np
distances = np.array([[[0, 0],[0,0],[0,0],[0,0]],\
alpha = 0.5 / np.log(30/0.075)
diam = 31
free = 7
rho = 1.2253
area = np.pi * (diam / 2)**2
min_v = 5.5
axi_max = 0.069262150781
col_total = 20000
p_max = 4000
rat = 14
nn = 5
# Initialize Model
m = GEKKO(remote=True)
#initialize variables, Set lower and upper bounds
x = [m.Var(value = 0.03902278, lb = 0, ub = axi_max) \
for i in range(nn)]
i = 0
b = 1
c = 0
v_s = list()
for turbs in range(nn-1): # Loop runs for nn-1 times
squared_defs = list()
while i < b:
d = distances[b][c][0]
r = distances[b][c][1]
ss = (2 * (alpha * d) / diam)
tt = r / ((diam/2) + (alpha * d))
squared_defs.append((2 * x[i] / (1 + ss**2)) \
* m.exp(-(tt**2)) ** 2)
m.Equation((free * (1 - (sum(squared_defs))**0.5)) - rat <= 0)
m.Equation(min_v - (free * (1 - (sum(squared_defs))**0.5)) <= 0 )
v_s.append(free * (1 - (sum(squared_defs))**0.5))
# Inserts free as the first item on the v_s list to
# increase len(v_s) to nn, so that 'v_s' and 'x'
# are of same length
v_s.insert(0, free)
beta = list()
gamma = list()
for i in range(len(x)):
bet = (4*x[i]*((1-x[i])**2) * rho * area) / 2
gam = bet * v_s[i]**3
m.Equation((((4*x[i]*((1-x[i])**2) * rho * area) / 2) \
* v_s[i]**3) - p_max <= 0)
m.Equation((((4*x[i]*((1-x[i])**2) * rho * area) / 2) \
* v_s[i]**3) > 0)
m.Equation(col_total - sum(gamma) <= 0)
y = sum(gamma)
m.Maximize(y) # Maximize
#Set global options
m.options.IMODE = 3 #steady state optimization
#Solve simulation
m.options.SOLVER = 3
This gives a successful solution with maximized objective 20,000:
Number of Iterations....: 12
(scaled) (unscaled)
Objective...............: -4.7394814741924645e+00 -1.9999999999929641e+04
Dual infeasibility......: 4.4698510326511536e-07 1.8862194343304290e-03
Constraint violation....: 3.8275766582203308e-11 1.2941979026166479e-07
Complementarity.........: 2.1543608536533588e-09 9.0911246952931704e-06
Overall NLP error.......: 4.6245685940749926e-10 1.8862194343304290e-03
Number of objective function evaluations = 80
Number of objective gradient evaluations = 13
Number of equality constraint evaluations = 80
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 13
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 12
Total CPU secs in IPOPT (w/o function evaluations) = 0.010
Total CPU secs in NLP function evaluations = 0.011
EXIT: Optimal Solution Found.
The solution was found.
The final value of the objective function is -19999.9999999296
Solver : IPOPT (v3.12)
Solution time : 3.210000000399305E-002 sec
Objective : -19999.9999999296
Successful solution

subscript indices must be either positive integers less than 2^31 or logicals

I keep getting errors in the loop when solving by the finite difference method.
I either get the following error when I start with i = 2 : N:
diffusion: A(I,J): row index out of bounds; value 2 out of bound 1
error: called from
diffusion at line 37 column 10 % note line change due to edit!
or I get the following error when I do i = 2 : N:
subscript indices must be either positive integers less than 2^31 or logicals
error: called from
diffusion at line 37 column 10 % note line change due to edit!
Please help
clear all; close all;
% mesh in space
dx = 0.1;
x = 0 : dx : 1;
% mesh in time
dt = 1 / 50;
t0 = 0;
tf = 10;
t = t0 : dt : tf;
% diffusivity
D = 0.5;
% number of nodes
N = 11;
% number of iterations
M = 10;
% initial conditions
if x <= .5 && x >= 0 % note, in octave, you don't need parentheses around the test expression
  u0 = x;
elseif
  u0 = 1-x;
end
u = u0;
alpha = D * dt / (dx^2);
for j = 1 : M
  for i = 1 : N
    u(i, j+1) = u(i, j ) ...
    + alpha ...
    * ( u(i-1, j) ...
    + u(i+1, j) ...
    - 2 ...
    * u(i, j) ...
    ) ;
  end
  u(N+1, j+1) = u(N+1, j) ...
  + alpha ...
  * ( ...
  u(N, j) ...
  - 2 ...
  * u(N+1, j) ...
  + u(N, j) ...
  ) ;
end
% boundary conditions
u(0, :) = u0;
u(1, :) = u1;
u1 = u0;
u0 = 0;
% exact solution with 14 terms
v = (4 / ((k * pi) .^ 2)) ...
* sin( (k * pi) / 2 ) ...
* sin( k * pi * x ) ...
* exp .^ (D * ((k * pi) ^ 2) * t) ;
exact = symsum( v, k, 1, 14 );
error = exact - u;
% plot stuff
plot( t, error );
xlabel( 'time' );
ylabel( 'error' );
legend( 't = 1 / 50' );
Have a look at the edited code I cleaned up for you above and study it.
Don't underestimate the importance of clean, readable code when hunting for bugs.
It will save you more time than it will cost. Especially a week from now when you will need to revisit this code and you will not remember at all what you were trying to do.
Now regarding your errors. (all line references are with respect to the cleaned up code above)
Scenario 1:
In line 29 you initialise u as a single value.
If you start your loop in line 35 starting with i = 2, then as soon as you try to do u(i, j+1), i.e. u(2,2) in the next line, octave will complain that you're trying to index the second row, in an array that so far only contains one row. (in fact, the same will apply for j at this point, since at this point you only have one column as well)
Scenario 2:
I assume the second scenario was a typo and you meant to say i = 1 : N.
If you start with i=1 in the loop, then have a look at line 38: you are trying to get element u(i-1, j), i.e. u(0,1). Therefore octave will complain that you're trying to get the zero element, but in octave arrays start from one and zero is not defined. Attempting to access any array with a zero will result in the error you see (try it in a terminal!).
Also, now that the code is clean, you can spot another bug, which octave helpfully warns you about if you try to run the code.
Look at line 26. There is NO condition in the elseif leg, so octave looks for the next statement as the test condition.
This means that the elseif condition will always succeed as long as the result of u0 = 1-x is non-zero.
This is clearly a bug. Either you forgot to put the condition for the elseif, or more likely, you probably just meant to say else, rather than elseif.
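Both indexing errors go away once the solution array is allocated up front and the update loop only visits interior nodes, so that no neighbour index falls outside the array. Below is a minimal Python sketch of that structure. Python is 0-based, so the interior nodes are 1 .. n_nodes-2 (in Octave they would be 2 .. N-1). The function name `explicit_diffusion`, the fixed zero end values and the smaller time step are assumptions of this sketch. With the question's dt = 1/50 the ratio D*dt/dx^2 equals 1, above the 0.5 limit usually quoted for this explicit scheme.

```python
def explicit_diffusion(n_nodes=11, n_steps=10, D=0.5, dt=0.005):
    """Forward-time, centred-space diffusion on [0, 1] with both ends held at 0."""
    dx = 1.0 / (n_nodes - 1)
    alpha = D * dt / dx**2
    x = [i * dx for i in range(n_nodes)]

    # preallocate every time level: u[j][i] is time level j, node i
    u = [[0.0] * n_nodes for _ in range(n_steps + 1)]
    u[0] = [min(xi, 1.0 - xi) for xi in x]   # tent-shaped initial condition
    u[0][0] = u[0][-1] = 0.0                 # boundary values

    for j in range(n_steps):
        # interior nodes only, so i-1 and i+1 always exist
        for i in range(1, n_nodes - 1):
            u[j + 1][i] = u[j][i] + alpha * (u[j][i - 1] - 2 * u[j][i] + u[j][i + 1])
    return alpha, u

alpha, u = explicit_diffusion()
```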

inverse of a matrix by gauss elimination method

How to find the inverse of a matrix? I am trying to use the Gauss elimination method. I know how to solve it by hand, but unable to understand how to code.
Gauss-Jordan elimination is explained clearly here:
Also here is a C++ method implementation which is more aligned to finding the inverse of the matrix:
Note, please attempt to understand the reasoning behind the method. If I were learning this topic, I may try to write the code from the description myself first, then only look at the coded solution if I got stuck.
Also, there are likely other implementations in other languages - if you simply do a meaningful search on Google.
Good luck!
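As a rough illustration of the by-hand procedure, here is a minimal Python sketch of Gauss-Jordan inversion with partial pivoting. It is not taken from either linked page. The function name `invert` and the tolerance `eps` are assumptions of this sketch.

```python
def invert(matrix, eps=1e-12):
    """Invert a square matrix (list of rows) by Gauss-Jordan elimination."""
    n = len(matrix)
    # build the augmented matrix [A | I]
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # partial pivoting: pick the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            raise ValueError("matrix is singular (or nearly so)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # scale the pivot row so the pivot becomes 1
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # eliminate this column from every other row
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    # the right half now holds the inverse
    return [row[n:] for row in aug]

inverse = invert([[2, 1], [5, 3]])
```

For the 2x2 example the exact inverse is [[3, -1], [-5, 2]], which makes a convenient check against a hand calculation.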
This should have been answered a billion times, but OK. First of all, I don't think the Gauss-Jordan method is the best (for performance). I assume the matrix is of fixed size (3x3) in column notation. The following code is JavaScript but is easily transposable to any other language.
Matrix.prototype.inverse = function() {
    var c, l, det, ret = new Matrix();
    ret._M[0][0] = (this._M[1][1] * this._M[2][2] - this._M[2][1] * this._M[1][2]);
    ret._M[0][1] = -(this._M[0][1] * this._M[2][2] - this._M[2][1] * this._M[0][2]);
    ret._M[0][2] = (this._M[0][1] * this._M[1][2] - this._M[1][1] * this._M[0][2]);
    ret._M[1][0] = -(this._M[1][0] * this._M[2][2] - this._M[2][0] * this._M[1][2]);
    ret._M[1][1] = (this._M[0][0] * this._M[2][2] - this._M[2][0] * this._M[0][2]);
    ret._M[1][2] = -(this._M[0][0] * this._M[1][2] - this._M[1][0] * this._M[0][2]);
    ret._M[2][0] = (this._M[1][0] * this._M[2][1] - this._M[2][0] * this._M[1][1]);
    ret._M[2][1] = -(this._M[0][0] * this._M[2][1] - this._M[2][0] * this._M[0][1]);
    ret._M[2][2] = (this._M[0][0] * this._M[1][1] - this._M[1][0] * this._M[0][1]);
    det = this._M[0][0] * ret._M[0][0] + this._M[0][1] * ret._M[1][0] + this._M[0][2] * ret._M[2][0];
    for (c = 0; c < 3; c++) {
        for (l = 0; l < 3; l++) {
            ret._M[c][l] = ret._M[c][l] / det;
        }
    }
    this._M = ret._M;
};

How can I find the center of a cluster of data points?

Let's say I plotted the position of a helicopter every day for the past year and came up with the following map:
Any human looking at this would be able to tell me that this helicopter is based out of Chicago.
How can I find the same result in code?
I'm looking for something like this:
$geoCodeArray = array([GET=]);
function findHome($geoCodeArray) {
    // magic
    return $geoCode;
}
Ultimately generating something like this:
UPDATE: Sample Dataset
Here's a map with a sample dataset:
Here's a pastebin of 150 geocodes:
The above contains 150 geocodes. The first 50 are in a few clusters close to Chicago. The remaining are scattered throughout the country, including some small clusters in New York, Los Angeles, and San Francisco.
I have about a million (seriously) datasets like this that I'll need to iterate through and identify the most likely "home". Your help is greatly appreciated.
UPDATE 2: Airplane switched to Helicopter
The airplane concept was drawing too much attention toward physical airports. The coordinates can be anywhere in the world, not just airports. Let's assume it's a super helicopter not bound by physics, fuel, or anything else. It can land where it wants. ;)
The following solution works even if the points are scattered all over the Earth, by converting latitude and longitude to Cartesian coordinates. It does a kind of KDE (kernel density estimation), but in a first pass the sum of kernels is evaluated only at the data points. The kernel should be chosen to fit the problem. In the code below it is what I could jokingly/presumptuously call a Trossian, i.e., 2-d²/h² for d≤h and h²/d² for d>h (where d is the Euclidean distance and h is the "bandwidth" $global_kernel_radius), but it could also be a Gaussian (e^(-d²/2h²)), an Epanechnikov kernel (1-d²/h² for d<h, 0 otherwise), or another kernel. An optional second pass refines the search locally, either by summing an independent kernel on a local grid, or by calculating the centroid, in both cases in a surrounding defined by $local_grid_radius.
In essence, each point sums all the points it has around (including itself), weighing them more if they are closer (by the bell curve), and also weighing them by the optional weight array $w_arr. The winner is the point with the maximum sum. Once the winner has been found, the "home" we are looking for can be found by repeating the same process locally around the winner (using another bell curve), or it can be estimated to be the "center of mass" of all points within a given radius from the winner, where the radius can be zero.
The algorithm must be adapted to the problem by choosing the appropriate kernels, by choosing how to refine the search locally, and by tuning the parameters. For the example dataset, the Trossian kernel for the first pass and the Epanechnikov kernel for the second pass, with all 3 radii set to 30 mi and a grid step of 1 mi could be a good starting point, but only if the two sub-clusters of Chicago should be seen as one big cluster. Otherwise smaller radii must be chosen.
function find_home($lat_arr, $lng_arr, $global_kernel_radius,
                                       $local_kernel_radius,
                                       $local_grid_radius, // 0 for no 2nd pass
                                       $local_grid_step,   // 0 for centroid
                                       $units,
                                       $w_arr = null)
{
    // for lat,lng <-> x,y,z see
    // for K and h see
    switch (strtolower($units)) {
        /*  */case 'nm' :
        /*or*/case 'nmi': $m_divisor = 1852;
        break;case 'mi': $m_divisor = 1609.344;
        break;case 'km': $m_divisor = 1000;
        break;case 'm': $m_divisor = 1;
        break;default: return false;
    }
    $a = 6378137 / $m_divisor; // Earth semi-major axis (WGS84)
    $e2 = 6.69437999014E-3; // First eccentricity squared (WGS84)
    $lat_lng_count = count($lat_arr);
    if ( !$w_arr) {
        $w_arr = array_fill(0, $lat_lng_count, 1.0);
    }
    $x_arr = array();
    $y_arr = array();
    $z_arr = array();
    $rad = M_PI / 180;
    $one_e2 = 1 - $e2;
    for ($i = 0; $i < $lat_lng_count; $i++) {
        $lat = $lat_arr[$i];
        $lng = $lng_arr[$i];
        $sin_lat = sin($lat * $rad);
        $sin_lng = sin($lng * $rad);
        $cos_lat = cos($lat * $rad);
        $cos_lng = cos($lng * $rad);
        // height = 0 (!)
        $N = $a / sqrt(1 - $e2 * $sin_lat * $sin_lat);
        $x_arr[$i] = $N * $cos_lat * $cos_lng;
        $y_arr[$i] = $N * $cos_lat * $sin_lng;
        $z_arr[$i] = $N * $one_e2 * $sin_lat;
    }
    $h = $global_kernel_radius;
    $h2 = $h * $h;
    $max_K_sum = -1;
    $max_K_sum_idx = -1;
    for ($i = 0; $i < $lat_lng_count; $i++) {
        $xi = $x_arr[$i];
        $yi = $y_arr[$i];
        $zi = $z_arr[$i];
        $K_sum = 0;
        for ($j = 0; $j < $lat_lng_count; $j++) {
            $dx = $xi - $x_arr[$j];
            $dy = $yi - $y_arr[$j];
            $dz = $zi - $z_arr[$j];
            $d2 = $dx * $dx + $dy * $dy + $dz * $dz;
            $K_sum += $w_arr[$j] * ($d2 <= $h2 ? (2 - $d2 / $h2) : $h2 / $d2); // Trossian ;-)
            // $K_sum += $w_arr[$j] * exp(-0.5 * $d2 / $h2); // Gaussian
        }
        if ($max_K_sum < $K_sum) {
            $max_K_sum = $K_sum;
            $max_K_sum_i = $i;
        }
    }
    $winner_x = $x_arr [$max_K_sum_i];
    $winner_y = $y_arr [$max_K_sum_i];
    $winner_z = $z_arr [$max_K_sum_i];
    $winner_lat = $lat_arr[$max_K_sum_i];
    $winner_lng = $lng_arr[$max_K_sum_i];
    $sin_winner_lat = sin($winner_lat * $rad);
    $cos_winner_lat = cos($winner_lat * $rad);
    $sin_winner_lng = sin($winner_lng * $rad);
    $cos_winner_lng = cos($winner_lng * $rad);
    $east_x = -$local_grid_step * $sin_winner_lng;
    $east_y = $local_grid_step * $cos_winner_lng;
    $east_z = 0;
    $north_x = -$local_grid_step * $sin_winner_lat * $cos_winner_lng;
    $north_y = -$local_grid_step * $sin_winner_lat * $sin_winner_lng;
    $north_z = $local_grid_step * $cos_winner_lat;
    if ($local_grid_radius > 0 && $local_grid_step > 0) {
        $r = intval($local_grid_radius / $local_grid_step);
        $r2 = $r * $r;
        $h = $local_kernel_radius;
        $h2 = $h * $h;
        $max_L_sum = -1;
        $max_L_sum_idx = -1;
        for ($i = -$r; $i <= $r; $i++) {
            $winner_east_x = $winner_x + $i * $east_x;
            $winner_east_y = $winner_y + $i * $east_y;
            $winner_east_z = $winner_z + $i * $east_z;
            $j_max = intval(sqrt($r2 - $i * $i));
            for ($j = -$j_max; $j <= $j_max; $j++) {
                $x = $winner_east_x + $j * $north_x;
                $y = $winner_east_y + $j * $north_y;
                $z = $winner_east_z + $j * $north_z;
                $L_sum = 0;
                for ($k = 0; $k < $lat_lng_count; $k++) {
                    $dx = $x - $x_arr[$k];
                    $dy = $y - $y_arr[$k];
                    $dz = $z - $z_arr[$k];
                    $d2 = $dx * $dx + $dy * $dy + $dz * $dz;
                    if ($d2 < $h2) {
                        $L_sum += $w_arr[$k] * ($h2 - $d2); // Epanechnikov
                    }
                }
                if ($max_L_sum < $L_sum) {
                    $max_L_sum = $L_sum;
                    $max_L_sum_i = $i;
                    $max_L_sum_j = $j;
                }
            }
        }
        $x = $winner_x + $max_L_sum_i * $east_x + $max_L_sum_j * $north_x;
        $y = $winner_y + $max_L_sum_i * $east_y + $max_L_sum_j * $north_y;
        $z = $winner_z + $max_L_sum_i * $east_z + $max_L_sum_j * $north_z;
    } else if ($local_grid_radius > 0) {
        $r = $local_grid_radius;
        $r2 = $r * $r;
        $wx_sum = 0;
        $wy_sum = 0;
        $wz_sum = 0;
        $w_sum = 0;
        for ($k = 0; $k < $lat_lng_count; $k++) {
            $xk = $x_arr[$k];
            $yk = $y_arr[$k];
            $zk = $z_arr[$k];
            $dx = $winner_x - $xk;
            $dy = $winner_y - $yk;
            $dz = $winner_z - $zk;
            $d2 = $dx * $dx + $dy * $dy + $dz * $dz;
            if ($d2 <= $r2) {
                $wk = $w_arr[$k];
                $wx_sum += $wk * $xk;
                $wy_sum += $wk * $yk;
                $wz_sum += $wk * $zk;
                $w_sum += $wk;
            }
        }
        $x = $wx_sum / $w_sum;
        $y = $wy_sum / $w_sum;
        $z = $wz_sum / $w_sum;
        $max_L_sum_i = false;
        $max_L_sum_j = false;
    } else {
        return array($winner_lat, $winner_lng, $max_K_sum_i, false, false);
    }
    $deg = 180 / M_PI;
    $a2 = $a * $a;
    $e4 = $e2 * $e2;
    $p = sqrt($x * $x + $y * $y);
    $zeta = (1 - $e2) * $z * $z / $a2;
    $rho = ($p * $p / $a2 + $zeta - $e4) / 6;
    $rho3 = $rho * $rho * $rho;
    $s = $e4 * $zeta * $p * $p / (4 * $a2);
    $t = pow($s + $rho3 + sqrt($s * ($s + 2 * $rho3)), 1 / 3);
    $u = $rho + $t + $rho * $rho / $t;
    $v = sqrt($u * $u + $e4 * $zeta);
    $w = $e2 * ($u + $v - $zeta) / (2 * $v);
    $k = 1 + $e2 * (sqrt($u + $v + $w * $w) + $w) / ($u + $v);
    $lat = atan($k * $z / $p) * $deg;
    $lng = atan2($y, $x) * $deg;
    return array($lat, $lng, $max_K_sum_i, $max_L_sum_i, $max_L_sum_j);
}
The fact that distances are Euclidean and not great-circle should have negligible effects for the task at hand. Calculating great-circle distances would be much more cumbersome, and would cause only the weight of very far points to be significantly lower - but these points already have a very low weight. In principle, the same effect could be achieved by a different kernel. Kernels that have a complete cut-off beyond some distance, like the Epanechnikov kernel, don't have this problem at all (in practice).
The conversion between lat,lng and x,y,z for the WGS84 datum is given exactly (although without guarantee of numerical stability) more as a reference than because of a true need. If the height is to be taken into account, or if a faster back-conversion is needed, please refer to the Wikipedia article.
The Epanechnikov kernel, besides being "more local" than the Gaussian and Trossian kernels, has the advantage of being the fastest for the second loop, which is O(ng), where g is the number of points of the local grid, and can also be employed in the first loop, which is O(n²), if n is big.
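The first pass of this approach (summing a kernel over all points, evaluated only at the data points, and keeping the point with the largest sum) can be sketched compactly in Python. The sketch is deliberately simplified and is not a port of the PHP function: it assumes a spherical Earth of radius 6371 km instead of WGS84, uses a Gaussian kernel, ignores weights, and skips the second pass. The names `to_xyz` and `densest_point` are hypothetical.

```python
import math

def to_xyz(lat_deg, lng_deg, radius=6371.0):
    """Latitude/longitude in degrees to Cartesian km on a spherical Earth."""
    lat, lng = math.radians(lat_deg), math.radians(lng_deg)
    return (radius * math.cos(lat) * math.cos(lng),
            radius * math.cos(lat) * math.sin(lng),
            radius * math.sin(lat))

def densest_point(latlngs, h):
    """Index of the data point with the largest Gaussian kernel sum (bandwidth h, km)."""
    pts = [to_xyz(lat, lng) for lat, lng in latlngs]
    h2 = h * h
    best_i, best_sum = -1, -1.0
    for i, (xi, yi, zi) in enumerate(pts):
        k_sum = 0.0
        for xj, yj, zj in pts:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            k_sum += math.exp(-0.5 * d2 / h2)   # Gaussian kernel, Euclidean distance
        if k_sum > best_sum:
            best_i, best_sum = i, k_sum
    return best_i, best_sum

sample = [(41.88, -87.63), (41.90, -87.65), (41.85, -87.70), (42.00, -87.90),  # near Chicago
          (40.71, -74.00), (34.05, -118.24), (37.77, -122.42)]                 # elsewhere
winner, score = densest_point(sample, h=50.0)
```

Like the first loop of the PHP code, this is O(n²) in the number of points.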
This can be solved by finding a jeopardy surface. See Rossmo's Formula.
This is the predator problem. Given a set of geographically-located carcasses, where is the lair of the predator? Rossmo's formula solves this problem.
Find the point with the largest density estimate.
Should be pretty much straightforward. Use a kernel radius that roughly covers a large airport in diameter. A 2D Gaussian or Epanechnikov kernel should be fine.
This is similar to computing a heat map:
and then finding the brightest spot there. Except it computes the brightness right away.
For fun I read a 1% sample of the Geocoordinates of DBpedia (i.e. Wikipedia) into ELKI, projected it into 3D space and enabled the density estimation overlay (hidden in the visualizer's scatterplot menu). You can see there is a hotspot on Europe, and to a lesser extent in the US. The hotspot in Europe is Poland, I believe. Last I checked, someone apparently had created a Wikipedia article with Geocoordinates for pretty much any town in Poland. The ELKI visualizer, unfortunately, does not allow you to zoom in, rotate, or reduce the kernel bandwidth to visually find the most dense point. But it's straightforward to implement yourself; you probably also don't need to go into 3D space, but can just use latitudes and longitudes.
Kernel Density Estimation should be available in tons of applications. The one in R is probably much more powerful. I just recently discovered this heatmap in ELKI, so I knew how to quickly access it. See e.g. for a related R function.
On your data, in R, try for example:
smoothScatter(data, nbin=512, bandwidth=c(.25,.25))
this should show a strong preference for Chicago.
dens=bkde2D(data, gridsize=c(512, 512), bandwidth=c(.25,.25))
contour(dens$x1, dens$x2, dens$fhat)
maxpos = which(dens$fhat == max(dens$fhat), arr.ind=TRUE)
c(dens$x1[maxpos[1]], dens$x2[maxpos[2]])
yields [1] 42.14697 -88.09508, which is less than 10 miles from Chicago airport.
To get better coordinates try:
rerunning on a 20x20 miles area around the estimated coordinates
a non-binned KDE in that area
better bandwidth selection with dpik
higher grid resolution
In astrophysics we use the so-called "half mass radius". Given a distribution and its center, the half mass radius is the minimum radius of a circle that contains half of the points of your distribution.
This quantity is a characteristic length of a distribution of points.
If you take the home of the helicopter to be where the points are maximally concentrated, then it is the point that has the minimum half mass radius!
My algorithm is as follows: for each point, compute the half mass radius with the distribution centred on that point. The "home" of the helicopter will be the point with the minimum half mass radius.
I've implemented it and the computed center is 42.149994 -88.133698 (which is in Chicago).
I've also used 0.2 of the total mass instead of the 0.5 (half) usually used in astrophysics.
This is my algorithm (in Python) that finds the home of the helicopter:
import math
import numpy
def inside(points,center,radius):
return points[ids]
points = numpy.loadtxt(open('points.txt'),comments='#')
for i in xrange(0,npoints):
while stayHere:
#print 'point',i,'r',radius,'in',ninside,'center',center
if(halfrmin==None or radius<halfrmin):
print 'point',i,halfrmin,idcenter,points[idcenter]
#print halfrmin,idcenter
print points[idcenter]
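A minimal Python 3 sketch of the half-mass-radius idea, written from the description in this answer rather than taken from the author's script. It sorts distances instead of growing the radius step by step, treats the coordinates as planar, and counts the centre point itself as enclosed. The function name `half_mass_home` and the `fraction` parameter are hypothetical.

```python
import math

def half_mass_home(points, fraction=0.2):
    """Return (index, radius) of the point whose circle enclosing
    `fraction` of all points is smallest."""
    n = len(points)
    k = max(1, math.ceil(fraction * n))    # number of points the circle must contain
    best_i, best_r = None, None
    for i, (xi, yi) in enumerate(points):
        dists = sorted(math.hypot(xi - xj, yi - yj) for xj, yj in points)
        r = dists[k - 1]                   # radius that just encloses k points
        if best_r is None or r < best_r:
            best_i, best_r = i, r
    return best_i, best_r

cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
scattered = [(5.0, 5.0), (-7.0, 3.0), (9.0, -4.0), (-3.0, -8.0), (6.0, -6.0)]
home_index, home_radius = half_mass_home(cluster + scattered, fraction=0.5)
```

In this toy dataset the middle point of the tight cluster wins, because the circle around it reaches half of all points at the smallest radius.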
You can use DBSCAN for that task.
DBSCAN is a density based clustering with a notion of noise. You need two parameters:
First the number of points a cluster should have at minimum "minpoints".
And second a neighbourhood parameter called "epsilon" that sets a distance threshold to the surrounding points that should be included in your cluster.
The whole algorithm works like this:
1. Start with an arbitrary point in your set that hasn't been visited yet.
2. Retrieve the points in its epsilon neighbourhood and mark them all as visited.
3. If you have found enough points in this neighbourhood (> minpoints parameter), start a new cluster and assign those points to it. Now recurse into step 2 for every point in this cluster.
4. If you haven't, declare this point as noise.
5. Repeat until you've visited all points.
It is really simple to implement and there are lots of frameworks that support this algorithm already. To find the mean of your cluster, you can simply take the mean of all the assigned points from its neighbourhood.
However, unlike the method that @TylerDurden proposes, this needs parameterization, so you need to find some hand-tuned parameters that fit your problem.
In your case, you can try to set minpoints to 10% of your total points if the plane is likely to stay at an airport for 10% of the time you track it. The density parameter epsilon depends on the resolution of your geographic sensor and the distance metric you use. I would suggest the haversine distance for geographic data.
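The procedure described above can be sketched in plain Python as follows. This is a simplified illustration, not a library implementation: it uses planar Euclidean distance rather than the haversine distance suggested here, counts a point as part of its own neighbourhood, and uses an explicit stack instead of recursion. The function names are hypothetical.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]

def dbscan(points, eps, min_points):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = -1                 # noise, unless a later cluster claims it
            continue
        cluster += 1
        labels[i] = cluster
        stack = [j for j in neighbours if j != i]
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster        # border point reached from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = region_query(points, j, eps)
            if len(more) >= min_points:    # only dense points expand the cluster
                stack.extend(more)
    return labels

def largest_cluster_mean(points, labels):
    """Mean position of the biggest cluster, or None if everything is noise."""
    ids = [label for label in labels if label != -1]
    if not ids:
        return None
    top = max(set(ids), key=ids.count)
    members = [p for p, label in zip(points, labels) if label == top]
    return (sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members))

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.05, 5.0), (9.0, -4.0)]
labels = dbscan(points, eps=0.2, min_points=3)
home = largest_cluster_mean(points, labels)
```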
How about dividing the map into many zones and then finding the center of the planes in the zone with the most planes? The algorithm would be something like this:
set Zones[40]
foreach Plane in Planes
    add Plane to the Zone that contains it
set MaxZone = Zones[0]
foreach Zone in Zones
    if MaxZone.Length() < Zone.Length()
        MaxZone = Zone
set Center
foreach Plane in MaxZone
    Center.X += Plane.X
    Center.Y += Plane.Y
Center.X /= MaxZone.Length()
Center.Y /= MaxZone.Length()
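A minimal Python sketch of the same idea, assuming the positions are plain `(x, y)` pairs and the zones are squares of side `zone_size` (both the function name and that parameter are made up for this example):

```python
from collections import defaultdict

def densest_zone_center(points, zone_size):
    """Bucket (x, y) points into square zones and average the fullest zone."""
    zones = defaultdict(list)
    for x, y in points:
        # The zone key is the integer grid cell the point falls into.
        key = (int(x // zone_size), int(y // zone_size))
        zones[key].append((x, y))
    best = max(zones.values(), key=len)   # raises ValueError if points is empty
    n = len(best)
    return (sum(x for x, _ in best) / n, sum(y for _, y in best) / n)
```

One known weakness of a fixed grid is that a tight group of points can straddle a zone border and be split between two zones, so the result depends on where the grid lines happen to fall.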
All I have on this machine is an old compiler so I made an ASCII version of this. It "draws" (in ASCII) a map - dots are points, X is where the real source is, G is where the guessed source is. If the two overlap, only X is shown.
Examples (DIFFICULTY 1.5 and 3 respectively):
The points are generated by picking a random point as the source, then randomly distributing points, making them more likely to be closer to the source.
DIFFICULTY is a floating point constant that regulates the initial point generation - how much more likely the points are to be closer to the source - if it is 1 or less, the program should be able to guess the exact source, or very close. At 2.5, it should still be pretty decent. At 4+, it will start to guess worse, but I think it still guesses better than a human would.
It could be optimized by using binary search over X, then Y - this would make the guess worse, but would be much, much faster. Or by starting with larger blocks, then splitting the best block further (or the best block and the 8 surrounding it). For a higher resolution system, one of these would be necessary. This is quite a naive approach, though, but it seems to work well in an 80x24 system. :D
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#define Y 24
#define X 80
#define DIFFICULTY 1 // Try different values...
static int point[Y][X];
double dist(int x1, int y1, int x2, int y2)
{
    return sqrt((y1 - y2)*(y1 - y2) + (x1 - x2)*(x1 - x2));
}

int main(void)
{
    srand((unsigned)time(NULL));
    // Pick the real source at random
    int y = rand()%Y;
    int x = rand()%X;
    // Generate points: the farther from the source, the less likely a point is
    for (int i = 0; i < Y; i++)
        for (int j = 0; j < X; j++)
        {
            double u = DIFFICULTY * pow(dist(x, y, j, i), 1.0 / DIFFICULTY);
            if ((int)u == 0)
                u = 1;
            point[i][j] = !(rand()%(int)u);
        }
    // Find best source: score every cell by the sum of 1000/distance to each point
    int maxX = -1;
    int maxY = -1;
    double maxScore = -1;
    for (int cy = 0; cy < Y; cy++)
        for (int cx = 0; cx < X; cx++)
        {
            double score = 0;
            for (int i = 0; i < Y; i++)
                for (int j = 0; j < X; j++)
                    if (point[i][j] == 1)
                    {
                        double d = dist(cx, cy, j, i);
                        if (d == 0)
                            d = 0.5; // avoid dividing by zero on the cell itself
                        score += 1000 / d;
                    }
            if (score > maxScore || maxScore == -1)
            {
                maxScore = score;
                maxX = cx;
                maxY = cy;
            }
        }
    // Print out results
    for (int i = 0; i < Y; i++)
    {
        for (int j = 0; j < X; j++)
        {
            if (i == y && j == x)
                printf("X");
            else if (i == maxY && j == maxX)
                printf("G");
            else if (point[i][j] == 0)
                printf(" ");
            else if (point[i][j] == 1)
                printf(".");
        }
        printf("\n"); // end each row explicitly instead of relying on an 80-column wrap
    }
    printf("Distance from real source: %f", dist(maxX, maxY, x, y));
    getchar(); // wait for Enter before exiting (the original scanf("%d", 0) wrote through a null pointer)
    return 0;
}
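The "start with larger blocks, then split the best block" optimisation mentioned in this answer could look roughly like the Python sketch below. It is an assumption-laden illustration, not a port of the C program: `score` reuses the same 1000/distance scoring with the 0.5 floor, `coarse_to_fine` and its `block` parameter are names invented here, and `width` and `height` are assumed to be multiples of `block`.

```python
import math

def score(cell, pts):
    """Sum of 1000/distance from a candidate cell to every point (distance floored at 0.5)."""
    return sum(1000.0 / max(math.hypot(cell[0] - px, cell[1] - py), 0.5)
               for px, py in pts)

def coarse_to_fine(pts, width, height, block=8):
    """Score block centres first, then refine around the best one with a halving step."""
    # Coarse pass: one candidate per block, at the block's centre.
    cands = [(x + block / 2, y + block / 2)
             for x in range(0, width, block)
             for y in range(0, height, block)]
    best = max(cands, key=lambda c: score(c, pts))
    step = block / 2
    while step >= 1:
        # Fine pass: the current best plus its 8 neighbours at the current step.
        cands = [(best[0] + dx * step, best[1] + dy * step)
                 for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
        cands = [c for c in cands if 0 <= c[0] < width and 0 <= c[1] < height]
        best = max(cands, key=lambda c: score(c, pts))
        step /= 2
    return best
```

This evaluates a few dozen candidates instead of every cell, at the price of possibly settling on a local maximum when the coarse pass picks the wrong block.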
Virtual Earth has a very good explanation of how you can do it relatively quickly. They have also provided code examples. Please have a look at
A simple mixture model seems to work pretty well for this problem.
In general, to get a point that minimizes the distance to all other points in a dataset, you can just take the mean. In this case, you want to find a point that minimizes the distance from a subset of concentrated points. If you postulate that a point can either come from the concentrated set of points of interest or from a diffuse set of background points, then this gives a mixture model.
I have included some Python code below. The concentrated area is modeled by a high-precision normal distribution and the background points are modeled by either a low-precision normal distribution or a uniform distribution over a bounding box on the dataset (there is a line of code that can be commented out to switch between these options). Also, mixture models can be somewhat unstable, so running the EM algorithm a few times with random initial conditions and choosing the run with the highest log-likelihood gives better results.
If you are actually looking at airplanes, then adding some sort of time dependent dynamics will probably improve your ability to infer the home base immensely.
I would also be wary of Rossmo's formula because it includes some pretty strong assumptions about crime distributions.
#the dataset
import StringIO
import numpy as np
import re
import matplotlib.pyplot as plt
def lp(l):
return map(lambda m: float(m.group()),re.finditer('[^, \n]+',l))
# area of the point set bounding box
M_ITER=100 #maximum number of iterations
THRESH=1e-10 # stopping threshold
def em(x):
print '\nSTART EM'
mu0=np.mean( data , 0 ) # the sample mean of the data - use this as the mean of the low-precision gaussian
# the mean of the high-precision Gaussian - this is what we are looking for
mu=np.random.rand( 2 )*np.array([xmx-xmn,ymx-ymn])+np.array([xmn,ymn])
lam_lo=.001 # precision of the low-precision Gaussian
lam_hi=.1 # precision of the high-precision Gaussian
prz=np.random.rand( 1 ) # probability of choosing the high-precision Gaussian mixture component
for i in xrange(M_ITER):
#low-precision normal background distribution
#uncomment for the uniform background distribution
#expectation step
#compute bound on the likelihood
print i,lh
#maximization step
mu=np.sum(zs[:,None]*x,0)/np.sum(zs) #mean
lam_hi=np.sum(zs)/np.sum(zs*.5*np.sum((x-mu)**2,1)) #precision
prz=1.0/(1.0+np.sum(1.0-zs)/np.sum(zs)) #mixture component probability
if np.abs((lh-old_lh)/lh)<THRESH:
return lh,lam_hi,mlst
if __name__=='__main__':
#repeat the EM algorithm a number of times and get the run with the best log likelihood
for i in xrange(4):
if prm[0]>mx_prm[0]:
print prm[0]
print mx_prm[0]
print 'best loglikelihood:', lh
#print 'final precision value:', lam_hi
print 'point of interest:', mu
for m in mlst:
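For readers who want something self-contained to experiment with, the mixture model described in this answer can also be sketched in plain Python 3. This is not the author's code: it only covers the uniform-background variant, `fit_hotspot` and its parameters are names invented for this example, the restarts are seeded from points passed in by the caller rather than from random positions, and the precision is capped at `lam_max` (an assumption made here so the Gaussian cannot collapse onto a single point).

```python
import math

def fit_hotspot(points, starts, lam0=1.0, lam_max=100.0, iters=100, tol=1e-9):
    """EM for an isotropic 2-D Gaussian 'hot spot' plus a uniform background.

    Returns (log_likelihood, (mu_x, mu_y)) for the best start in `starts`.
    The log-likelihood is the value computed in the last E-step of that run.
    """
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    area = (max(xs) - min(xs)) * (max(ys) - min(ys))  # bounding box, assumed non-zero
    n = len(points)
    best = None
    for mx, my in starts:
        lam, pi, old_ll = lam0, 0.5, None
        for _ in range(iters):
            # E-step: responsibility of the Gaussian component for each point.
            resp, ll = [], 0.0
            for x, y in points:
                d2 = (x - mx) ** 2 + (y - my) ** 2
                g = pi * lam / (2 * math.pi) * math.exp(-0.5 * lam * d2)
                u = (1 - pi) / area
                resp.append(g / (g + u))
                ll += math.log(g + u)
            total = sum(resp)
            if total <= 0.0:
                break
            # M-step: weighted mean, precision and mixing weight.
            mx = sum(r * p[0] for r, p in zip(resp, points)) / total
            my = sum(r * p[1] for r, p in zip(resp, points)) / total
            spread = sum(r * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                         for r, p in zip(resp, points))
            lam = min(lam_max, 2 * total / spread) if spread > 0 else lam_max
            pi = min(max(total / n, 1e-6), 1 - 1e-6)
            if old_ll is not None and abs(ll - old_ll) <= tol * abs(ll):
                break
            old_ll = ll
        if best is None or ll > best[0]:
            best = (ll, (mx, my))
    return best
```

A call such as `fit_hotspot(points, starts=points)` tries every data point as an initial mean and keeps the run with the highest log-likelihood, which follows the "several restarts, keep the best" advice above at the cost of more computation.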
You can easily adapt Rossmo's formula, quoted by Tyler Durden, to your case with a few simple notes:
The formula :
This formula gives something close to a probability of presence of the base of operations for a predator or a serial killer. In your case it could give the probability that a base is at a certain point. I'll explain later how to use it. You can write it this way:
Proba(base on point A) = Sum{on all spots} ( Phi/(dist^f) + (1-Phi)*(B^(g-f))/(2B-dist)^g )
Using Euclidean distance
You want the Euclidean distance and not the Manhattan distance, because an airplane or helicopter is not bound to roads or streets. So the Euclidean distance is the correct choice if you are tracking an airplane and not a serial killer. "dist" in the formula is therefore the Euclidean distance between the point you are testing and the spot being considered.
Choosing a reasonable value for B
Variable B was used to represent the rule "a reasonably smart killer will not kill his neighbor". In your case this also applies, because no one uses an airplane/roflcopter to get to the next street corner. We can suppose that the minimal journey is, for example, 10 km, or anything reasonable for your case.
Exponential factor f
Factor f is used to weight the distance. For example, if all the spots are in a small area you may want a large f, because the probability of the airport/base/HQ will decrease quickly with distance if all your data points are in the same sector. g works in a similar way: it lets you choose the size of the area in which "the base is unlikely to be right next to the spot".
Factor Phi:
Again, this factor has to be determined using your knowledge of the problem. It lets you weight the two terms, "base is close to spots" and "I'll not use the plane to travel 5 m". If, for example, you think the second one is almost irrelevant, you can set Phi to 0.95 (0<Phi<1). If both matter, Phi will be around 0.5.
How to implement it as something useful:
First you want to divide your map into little squares, i.e. mesh the map (just like invisal did); the smaller the squares, the more accurate the result (in general). Then use the formula to find the most probable location. In fact the mesh is just an array of all possible locations. (If you want to be accurate you increase the number of possible spots, but it will require more computation time, and PHP is not well known for its amazing speed.)
Algorithm :
//define all the factors you need (B, f, g, phi)
for(i=0..mesh_size) // compute the probability of presence for each square of the mesh
    geocode squarePosition; //GeoCode of the square's center
    P(i)=0;
    for(j=0..geocodearray_size) //sum over all the known spots
        dist=Distance(geocodearray[j],squarePosition); //small function returning the distance between two geocodes
        P(i)+=the formula's term for this dist;
return geocode corresponding to max(P(i))
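As a rough Python sketch of this loop, under several assumptions that are mine and not part of the answer: coordinates are planar `(x, y)` pairs so the Euclidean distance is a plain `math.hypot` (real geocodes would need a geographic distance), distances are floored at `min_dist` to avoid dividing by zero, and the buffer term is only added while `dist < 2B`, where its denominator is still positive. The name `rossmo_best_cell` is hypothetical.

```python
import math

def rossmo_best_cell(spots, cells, B, f, g, phi, min_dist=1e-6):
    """Score every candidate cell with the weighted formula and return (best_cell, score)."""
    best_cell, best_score = None, -math.inf
    for cell in cells:
        total = 0.0
        for sx, sy in spots:
            d = max(math.hypot(cell[0] - sx, cell[1] - sy), min_dist)
            # Distance-decay term: the base is probably close to the spots.
            total += phi / d ** f
            # Buffer term: only defined here while 2B - d stays positive.
            if d < 2 * B:
                total += (1 - phi) * B ** (g - f) / (2 * B - d) ** g
        if total > best_score:
            best_cell, best_score = cell, total
    return best_cell, best_score
```

The cost is the number of cells times the number of spots, so halving the square size roughly quadruples the running time.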
Hope that it will help you
First, I would like to say how much I like the way you illustrated and explained the problem.
If I were in your shoes, I would go for a density-based algorithm like DBSCAN.
After clustering the areas and removing the noise points, a few areas (choices) will remain. Then I'd take the cluster with the highest density of points, calculate the average point, and find the nearest real point to it. Done, found the place! :)
Why not something like this:
For each point, calculate its distance from all other points and sum the total.
The point with the smallest sum is your center.
Maybe sum isn't the best metric to use. Possibly the point with the most "small distances"?
Sum over the distances. Take the point with the smallest summed distance.
function () {
    for i in points P:
        S[i] = 0
        for j in points P:
            S[i] += distance(P[i], P[j])
    return the point P[i] with the smallest S[i];
}
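In Python the "smallest summed distance" idea is a few lines, together with the "most small distances" variant suggested a little earlier. Both function names and the `radius` parameter are made up for this sketch, and points are assumed to be planar `(x, y)` pairs:

```python
import math

def _d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def min_sum_point(points):
    """Return the input point whose summed distance to all points is smallest."""
    return min(points, key=lambda p: sum(_d(p, q) for q in points))

def most_neighbours_point(points, radius):
    """Return the input point with the most points within `radius` of it."""
    return max(points, key=lambda p: sum(1 for q in points if _d(p, q) <= radius))
```

Both are O(n^2). The summed-distance version is pulled towards far-away outliers, which is the concern raised above about the sum not being the best metric; the neighbour-count version ignores them but needs a radius to be chosen.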
You can take a minimum spanning tree and remove the longest edges. The smaller trees that remain give you the centroids to look up. The algorithm is called single-link k-clustering. There is a post here:
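A compact way to try this is Kruskal's algorithm stopped early: merging the shortest edges until `k` components remain gives the same clusters as building the full minimum spanning tree and removing its `k - 1` longest edges. The sketch below assumes planar `(x, y)` points, and the names `single_link_clusters` and `centroid` are invented for the example.

```python
import math
from itertools import combinations

def single_link_clusters(points, k):
    """Split points into k clusters by single-link (MST-based) clustering."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def length(edge):
        a, b = points[edge[0]], points[edge[1]]
        return math.hypot(a[0] - b[0], a[1] - b[1])

    components = len(points)
    for a, b in sorted(combinations(range(len(points)), 2), key=length):
        if components <= k:
            break                      # the remaining, longer edges stay "removed"
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())

def centroid(cluster):
    """Arithmetic mean of a list of (x, y) points."""
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)
```

To get a single location, one option is to take the centroid of the largest cluster, for example `centroid(max(single_link_clusters(points, k), key=len))`; choosing `k` is the hand-tuned part.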
