Given a set of points on a plane, find the shortest line segment formed by any two of these points.
How can I do that? The trivial way is obviously to calculate each distance, but I need another algorithm to compare.
The problem can be solved in O(n log n) time using the recursive divide and conquer approach, e.g., as follows:
Sort points along the x-coordinate
Split the set of points into two equal-sized subsets by a vertical line x = xmid
Solve the problem recursively in the left and right subsets. This will give the left-side and right-side minimal distances dLmin and dRmin respectively.
Find the minimal distance dLRmin among the pair of points in which one point lies on the left of the dividing vertical and the second point lies to the right.
The final answer is the minimum among dLmin, dRmin, and dLRmin.

I can't immediately think of a quicker alternative than the brute force technique (although there must be plenty), but whatever algorithm you choose, don't calculate the actual distance between each pair of points. If you only need to compare distances, compare the squares of the distances to avoid the expensive and entirely redundant square root.
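As a rough illustration of the squared-distance comparison, here is a minimal brute-force sketch in Python. The function name and the tuple-based point representation are assumptions made for this example only.

```python
from itertools import combinations


def closest_pair_brute_force(points):
    """Return (squared_distance, p, q) for the closest pair, or None
    if there are fewer than two points. Runs in O(n^2) time."""
    best = None
    for p, q in combinations(points, 2):
        dx = p[0] - q[0]
        dy = p[1] - q[1]
        d2 = dx * dx + dy * dy  # squared distance: enough for comparisons
        if best is None or d2 < best[0]:
            best = (d2, p, q)
    return best
```

The square root only needs to be taken once, on the final result, if the actual length of the segment is required.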

One possibility would be to sort the points by their X coordinates (or the Y -- doesn't really matter which, just be consistent). You can then use that to eliminate comparisons to many of the other points. When you're looking at the distance between point[i] and point[j], if the X distance alone is greater than your current shortest distance, then point[j+1]...point[N] can be eliminated as well (assuming i<j -- if j<i, then it's point[0]...point[j-1] that are eliminated).
If your points start out as polar coordinates, you can use a variation of the same thing -- sort by distance from the origin, and if the difference in distance from the origin is greater than your current shortest distance, you can eliminate that point, and all the others that are farther from (or closer to) the origin than the one you're currently considering.
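The X-sorted pruning described above could be sketched as follows. This is only an illustration under the assumption that points are (x, y) tuples; the function name is made up for the example, and the worst case is still quadratic.

```python
def closest_pair_sorted_x(points):
    """Return (squared_distance, (p, q)) using an x-sorted scan with early exit."""
    pts = sorted(points)  # sort by x, ties broken by y
    best_d2 = float("inf")
    best_pair = None
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            dx = pts[j][0] - pts[i][0]
            if dx * dx >= best_d2:
                # pts[j], pts[j+1], ... are all at least this far away in x
                break
            dy = pts[j][1] - pts[i][1]
            d2 = dx * dx + dy * dy
            if d2 < best_d2:
                best_d2, best_pair = d2, (pts[i], pts[j])
    return best_d2, best_pair
```

Only the forward direction (i<j) is scanned here, because each pair is visited once from its left-hand point.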

You can extract the closest pair in linear time from the Delaunay triangulation and conversely from the Voronoi diagram.

There is a standard algorithm for this problem, here you can find it:
And here is my implementation of this algo, sorry it's without comments:
static long distSq(Point a, Point b) {
    return ((long) (a.x - b.x) * (long) (a.x - b.x) + (long) (a.y - b.y) * (long) (a.y - b.y));
}

static long ccw(Point p1, Point p2, Point p3) {
    return (long) (p2.x - p1.x) * (long) (p3.y - p1.y) - (long) (p2.y - p1.y) * (long) (p3.x - p1.x);
}
static List<Point> convexHull(List<Point> P) {
    if (P.size() < 3) {
        return null;
    }
    // Find the lowest point (leftmost on ties) and use it as the pivot.
    int k = 0;
    for (int i = 0; i < P.size(); i++) {
        if (P.get(i).y < P.get(k).y || (P.get(i).y == P.get(k).y && P.get(i).x < P.get(k).x)) {
            k = i;
        }
    }
    Collections.swap(P, k, P.size() - 1);
    final Point o = P.get(P.size() - 1);
    P.remove(P.size() - 1);
    // Sort the remaining points by polar angle around the pivot, nearest first on ties.
    Collections.sort(P, new Comparator<Point>() {
        public int compare(Point a, Point b) {
            long t1 = (long) (a.y - o.y) * (long) (b.x - o.x) - (long) (a.x - o.x) * (long) (b.y - o.y);
            if (t1 == 0) {
                long tt = distSq(o, a);
                tt -= distSq(o, b);
                if (tt > 0) {
                    return 1;
                } else if (tt < 0) {
                    return -1;
                }
                return 0;
            }
            if (t1 < 0) {
                return -1;
            }
            return 1;
        }
    });
    // Graham scan: drop every point that would make a clockwise (or straight) turn.
    List<Point> hull = new ArrayList<Point>();
    hull.add(o);
    hull.add(P.get(0));
    for (int i = 1; i < P.size(); i++) {
        while (hull.size() >= 2 &&
                ccw(hull.get(hull.size() - 2), hull.get(hull.size() - 1), P.get(i)) <= 0) {
            hull.remove(hull.size() - 1);
        }
        hull.add(P.get(i));
    }
    return hull;
}
// Returns the squared distance of the closest pair in the half-open range [l, r).
static long nearestPoints(List<Point> P, int l, int r) {
    if (r - l == P.size()) {
        // Top-level call: sort by x, breaking ties by y.
        Collections.sort(P, new Comparator<Point>() {
            public int compare(Point o1, Point o2) {
                int t = Integer.compare(o1.x, o2.x);
                if (t == 0) {
                    return Integer.compare(o1.y, o2.y);
                }
                return t;
            }
        });
    }
    if (r - l <= 100) {
        // Small range: brute force over all pairs.
        long ret = Long.MAX_VALUE;
        for (int i = l; i < r; i++) {
            for (int j = i + 1; j < r; j++) {
                ret = Math.min(ret, distSq(P.get(i), P.get(j)));
            }
        }
        return ret;
    }
    int c = (l + r) / 2;
    long lD = nearestPoints(P, l, c);
    long lR = nearestPoints(P, c, r);
    long ret = Math.min(lD, lR);
    // Order the points of this range by y, breaking ties by x.
    Point[] Q = P.subList(l, r).toArray(new Point[0]);
    Arrays.sort(Q, new Comparator<Point>() {
        public int compare(Point o1, Point o2) {
            int t = Integer.compare(o1.y, o2.y);
            if (t == 0) {
                return Integer.compare(o1.x, o2.x);
            }
            return t;
        }
    });
    int x = P.get(c).x;
    double theta = Math.sqrt(ret);
    // Keep only the points inside the vertical strip around the dividing line.
    Point[] T = new Point[Q.length];
    int pos = 0;
    for (int i = 0; i < Q.length; i++) {
        if (Math.abs((long) Q[i].x - x) > theta) {
            continue;
        }
        T[pos++] = Q[i];
    }
    // Inside the strip, each point only has to be compared with a few successors in y order.
    for (int i = 0; i < pos; i++) {
        for (int j = 1; j < 7 && i + j < pos; j++) {
            ret = Math.min(ret, distSq(T[i], T[j + i]));
        }
    }
    return ret;
}

From your question it is not clear if you are looking for the distance of the segment, or the segment itself. Assuming you are looking for the distance (finding the segment is then a simple modification, once you know which two points have the minimal distance), given 5 points, numbered from 1 to 5, you need to
compare 1 with 2,3,4,5, then
compare 2, with 3,4,5, then
compare 3 with 4,5, then
compare 4 with 5.
If I am not wrong, given the commutativity of the distance you do not need to perform other comparisons.
In Python, it may look something like this:
import numpy as np


def find_min_distance_of_a_cloud(cloud):
    """
    Given a cloud of points in the n-dim space, provides the minimal distance.
    :param cloud: list of nX1-d vectors, as ndarray.
    """
    dist_min = None
    for i, p_i in enumerate(cloud[:-1]):
        new_dist_min = np.min([np.linalg.norm(p_i - p_j) for p_j in cloud[(i + 1):]])
        if dist_min is None or dist_min > new_dist_min:
            dist_min = new_dist_min
    return dist_min
That can be tested with something like the following code:
from import assert_equal
def test_find_min_distance_of_a_cloud_1pt():
    cloud = [np.array((1, 1, 1)), np.array((0, 0, 0))]
    min_out = find_min_distance_of_a_cloud(cloud)
    assert_equal(min_out, np.sqrt(3))


def test_find_min_distance_of_a_cloud_5pt():
    cloud = [np.array((0, 0, 0)),
             np.array((1, 1, 0)),
             np.array((2, 1, 4)),
             np.array((3, 4, 4)),
             np.array((5, 3, 4))]
    min_out = find_min_distance_of_a_cloud(cloud)
    assert_equal(min_out, np.sqrt(2))
If more than one pair of points can have the same minimal distance, and you are looking for the segments, you need to modify the proposed code again, and the output will be the list of pairs of points whose distance is minimal. Hope it helps!

Here is a code example demonstrating how to implement the divide and conquer algorithm. For the algorithm to work, the points' x-values must be unique. The non-obvious part of the algorithm is that you must sort along both the x-axis and the y-axis. Otherwise you can't find minimum distances across the split seam in linear time.
from collections import namedtuple
from itertools import combinations
from math import sqrt

IxPoint = namedtuple('IxPoint', ['x', 'y', 'i'])
ClosestPair = namedtuple('ClosestPair', ['distance', 'i', 'j'])


def check_distance(cp, p1, p2):
    # Return a new ClosestPair if p1 and p2 are closer than the current best.
    xd = p1.x - p2.x
    yd = p1.y - p2.y
    dist = sqrt(xd * xd + yd * yd)
    if dist < cp.distance:
        return ClosestPair(dist, p1.i, p2.i)
    return cp


def closest_helper(cp, xs, ys):
    # xs and ys hold the same points, sorted by x and by y respectively.
    n = len(xs)
    if n <= 3:
        for p1, p2 in combinations(xs, 2):
            cp = check_distance(cp, p1, p2)
        return cp

    # Divide
    mid = n // 2
    mid_x = xs[mid].x
    xs_left = xs[:mid]
    xs_right = xs[mid:]
    ys_left = [p for p in ys if p.x < mid_x]
    ys_right = [p for p in ys if p.x >= mid_x]

    # Conquer
    cp_left = closest_helper(cp, xs_left, ys_left)
    cp_right = closest_helper(cp, xs_right, ys_right)
    if cp_left.distance < cp_right.distance:
        cp = cp_left
    else:
        cp = cp_right

    # Combine: only points within cp.distance of the seam can improve the result.
    ys_strip = [p for p in ys if abs(p.x - mid_x) < cp.distance]
    n_strip = len(ys_strip)
    for i in range(n_strip):
        for j in range(i + 1, n_strip):
            p1, p2 = ys_strip[j], ys_strip[i]
            if not p1.y - p2.y < cp.distance:
                break
            cp = check_distance(cp, p1, p2)
    return cp


def closest_pair(points):
    points = [IxPoint(p[0], p[1], i)
              for (i, p) in enumerate(points)]
    xs = sorted(points, key = lambda p: p.x)
    # Nudge the x-values slightly so that they are all unique.
    xs = [IxPoint(p.x + i * 1e-8, p.y, p.i)
          for (i, p) in enumerate(xs)]
    ys = sorted(xs, key = lambda p: p.y)
    cp = ClosestPair(float('inf'), -1, -1)
    return closest_helper(cp, xs, ys)


Find max apples you can pick ensuring you reach bottom right cell by time T

You start at the top-left cell of a given grid. Some cells are walls, some
can be walked on, and some contain an apple. You are given a time limit
T, and you must reach the bottom-right cell within at most T time units. Find
the maximum number of apples you can collect. You cannot visit a cell
twice. N, M, T <= 14.
I tried a lot of ideas; the most promising one is this: rephrase the problem as finding the shortest time to reach the destination while collecting at least X apples. Then we could binary search on the number of apples.
But I have not been able to pin down a solution in the last 6 hours.
"You cannot visit a cell twice." is what causes me problems.
Any other idea or hint is appreciated.
Have you tried backtracking? Something like this?
// Heuristic: Manhattan distance to end
function dist(y, x, n, m){
  return n - y + m - x - 2;
}

function getNext(i, j, n, m){
  const ways = [];
  if (i + 1 < n)
    ways.push([i+1, j]);
  if (i > 0)
    ways.push([i-1, j]);
  if (j + 1 < m)
    ways.push([i, j+1]);
  if (j > 0)
    ways.push([i, j-1]);
  return ways;
}

function f(M, T){
  const WALL = 2;
  const n = M.length;
  const m = M[0].length;
  const visited = new Array(n);
  for (let i=0; i<n; i++)
    visited[i] = new Array(m).fill(0);
  // The start cell counts as visited from the beginning.
  visited[0][0] = 1;
  let best = 0;

  // (i, j): current cell, t: time used so far, k: apples collected before this cell
  function backtrack(i, j, t, k){
    if (i == n-1 && j == m-1){
      best = Math.max(best, k + M[i][j]);
      return;
    }
    for (const [ii, jj] of getNext(i, j, n, m)){
      // The move itself costs one time unit; prune if the end is then out of reach.
      if (!visited[ii][jj] &&
          M[ii][jj] != WALL &&
          t + 1 + dist(ii, jj, n, m) <= T){
        visited[ii][jj] = 1;
        backtrack(ii, jj, t + 1, k + M[i][j]);
        visited[ii][jj] = 0;
      }
    }
  }

  backtrack(0, 0, 0, 0);
  return best;
}

var N = 8;
var M = 8;
var T = 14;
var matrix = new Array(N);
for (let i=0; i<N; i++)
  matrix[i] = new Array(M).fill(0);
// Apples
matrix[5][5] = 1;
matrix[5][6] = 1;
// Walls
matrix[5][7] = 2;
matrix[5][4] = 2;
matrix[4][4] = 2;
matrix[4][5] = 2;
console.log(f(matrix, T));
matrix[5][4] = 0;
console.log(f(matrix, T));
Given the constraints, you can use a simple recursive function to complete the problem.
Let solve(i,j,t,vis) be the function, where (i,j) are the current coordinates, t is the time remaining, and vis is the set of currently visited cells. The answer will be solve(0,0,T,[]).
The simple recursion would be (using pseudo-code):
def solve(i,j,t,vis):
    if (i<0 or i>=n or j<0 or j>=m) return -1
    if ((i,j) in vis) return -1
    if (cell[i][j] == WALL) return -1
    if (t==0){
        if (i==n-1 and j==m-1) return cell[i][j]
        else return -1
    }
    if (i==n-1 and j==m-1) return cell[i][j]
    max_here = cell[i][j]
    temp = max(solve(i,j+1,t-1,vis+(i,j)), solve(i,j-1,t-1,vis+(i,j)),
               solve(i+1,j,t-1,vis+(i,j)), solve(i-1,j,t-1,vis+(i,j))) #assuming movement in 4 directions
    if (temp==-1) return -1 # since none of the neighbours lead to destination
    return max_here+temp
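For reference, the recursion above could be turned into runnable Python roughly as follows. This is a sketch under a few assumptions that are not in the question: cells are encoded as 0 = empty, 1 = apple, 2 = wall, each move costs one time unit, and the function name max_apples is made up for the example. It also adds a Manhattan-distance pruning step.

```python
WALL = 2  # assumed encoding: 0 = empty, 1 = apple, 2 = wall


def max_apples(grid, T):
    """Max apples on a path from top-left to bottom-right that uses at most
    T moves and never revisits a cell; -1 if the destination is unreachable."""
    n, m = len(grid), len(grid[0])
    if grid[0][0] == WALL or grid[n - 1][m - 1] == WALL:
        return -1
    visited = [[False] * m for _ in range(n)]
    best = -1

    def dfs(i, j, t_left, apples):
        nonlocal best
        apples += grid[i][j]
        if i == n - 1 and j == m - 1:
            best = max(best, apples)
            return
        visited[i][j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if not (0 <= ni < n and 0 <= nj < m):
                continue
            if visited[ni][nj] or grid[ni][nj] == WALL:
                continue
            # Prune: one move to the neighbour plus its Manhattan distance to the goal.
            if 1 + (n - 1 - ni) + (m - 1 - nj) > t_left:
                continue
            dfs(ni, nj, t_left - 1, apples)
        visited[i][j] = False  # undo the mark when backtracking

    dfs(0, 0, T, 0)
    return best
```

Keeping a single visited grid and undoing the mark on the way back avoids copying the visited set at every call, which is the main practical difference from the pseudo-code.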

Clarification of Answer... find the max possible two equal sum in a SET

I need a clarification of the answer of this question but I can not comment (not enough rep) so I ask a new question. Hope it is ok.
The problem is this:
Given an array, you have to find the max possible two equal sum, you
can exclude elements.
i.e 1,2,3,4,6 is given array we can have max two equal sum as 6+2 = 4+3+1
i.e 4,10,18, 22, we can get two equal sum as 18+4 = 22
what would be your approach to solve this problem apart from brute
force to find all computation and checking two possible equal sum?
edit 1: max no of array elements are N <= 50 and each element can be
up to 1<= K <=1000
edit 2: Total elements sum cannot be greater than 1000.
The approved answer says:
I suggest solving this using DP where instead of tracking A,B (the
size of the two sets), you instead track A+B,A-B (the sum and
difference of the two sets).
Then for each element in the array, try adding it to A, or B, or neither.
The advantage of tracking the sum/difference is that you only need to
keep track of a single value for each difference, namely the largest
value of the sum you have seen for this difference.
What I do not understand is:
If this was the subset sum problem I could solve it with DP, having a memoization matrix of (N x P), where N is the size of the set and P is the target sum...
But I can not figure out how I should keep track of A+B,A-B (as suggested by the author of the approved answer). What should the dimensions of the memoization matrix be, and how does that help to solve the problem?
The author of the answer was kind enough to provide a code example, but it is hard for me to understand since I do not know Python (I know Java).
I think thinking how this solution relates to the single subset problem might be misleading for you. Here we are concerned with a maximum achievable sum, and what's more, we need to distinguish between two disjoint sets of numbers as we traverse. Clearly tracking specific combinations would be too expensive.
Looking at the difference between sets A and B, we can say:
A - B = d
A = d + B
Clearly, we want the highest sum when d = 0. How do we know that sum? It's (A + B) / 2!
For the transition in the dynamic program, we'd like to know if it's better to place the current element in A, B or neither. This is achieved like this:
e <- current element
d <- difference between A and B
(1) add e to A -> d + e
A = d + B
(A + e) = d + e + B
(2) add e to B -> d - e
A = d + B
A = d - e + (B + e)
(3) don't use e -> that's simply
what we already have stored for d
Let's look at Peter de Rivas' code for the transition:
# update a copy of our map, so
# we can reference previous values,
# while assigning new values
# d is A - B
# s is A + B
for d,s in D.items():
    # a new sum that includes element a
    # we haven't decided if a
    # will be in A or B
    s2 = s + a
    # d2 will take on each value here
    # in turn, once d - a (adding a to B),
    # and once d + a (adding a to A)
    for d2 in [d-a, d+a]:
        # The main transition:
        # the two new differences,
        # (d-a) and (d+a) as keys in
        # our map get the highest sum
        # seen so far, either (1) the
        # new sum, s2, or (2) what we
        # already stored (meaning `a`
        # will be excluded here)
        # so all three possibilities
        # are covered.
        D2[abs(d2)] = max(D2[abs(d2)], s2)
In the end we have stored the highest A + B seen for d = 0, where the elements in A and B form disjoint sets. Return (A + B) / 2.
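To make the bookkeeping concrete, here is a self-contained sketch of the same idea. It is not Peter de Rivas' original code: the function name and the use of a plain dict instead of his map are assumptions for this example. The "memoization matrix" collapses to a one-dimensional map from the difference |A - B| to the best A + B seen for it.

```python
def max_equal_sum(values):
    """Largest S such that two disjoint subsets of values both sum to S."""
    # best[d] = largest A + B seen so far with |A - B| == d
    best = {0: 0}
    for a in values:
        nxt = dict(best)  # option 3: leave a out, keep what we had
        for d, s in best.items():
            # option 1: add a to the larger side  -> difference d + a
            # option 2: add a to the smaller side -> difference |d - a|
            for d2 in (d + a, abs(d - a)):
                if nxt.get(d2, -1) < s + a:
                    nxt[d2] = s + a
        best = nxt
    # For d == 0 we have A == B, so each side is (A + B) / 2.
    return best[0] // 2
```

With the limits from the question (total sum at most 1000) the map never holds more than about 1000 keys, so the whole run is roughly N * 1000 steps.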
Try this DP approach: it works fine.
i/p ::
1 2 3 4 6
o/p : 8
4 10 18 22
o/p : 22
4 118 22 3
o/p : 0
import java.util.Scanner;

public class TwoPipesOfMaxEqualLength {

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int t = sc.nextInt();
        while (t-- > 0) {
            int n = sc.nextInt();
            int[] arr = new int[n + 1];
            for (int i = 1; i <= n; i++) {
                arr[i] = sc.nextInt();
            }
            MaxLength(arr, n);
        }
    }

    private static void MaxLength(int[] arr, int n) {
        // dp[j][k] == 1 means two disjoint subsets with sums j and k can be formed.
        int dp[][] = new int[1005][1005];
        int dp1[][] = new int[1005][1005];
        // initialize dp with values as 0.
        for (int i = 0; i <= 1000; i++) {
            for (int j = 0; j <= 1000; j++)
                dp[i][j] = 0;
        }
        // make (0,0) as 1.
        dp[0][0] = 1;
        for (int i = 1; i <= n; i++) {
            for (int j = 0; j <= 1000; j++) {
                for (int k = 0; k <= 1000; k++) {
                    // put arr[i] into the first subset
                    if (j >= arr[i]) {
                        if (dp[j - arr[i]][k] == 1) {
                            dp1[j][k] = 1;
                        }
                    }
                    // put arr[i] into the second subset
                    if (k >= arr[i]) {
                        if (dp[j][k - arr[i]] == 1) {
                            dp1[j][k] = 1;
                        }
                    }
                    // leave arr[i] out
                    if (dp[j][k] == 1) {
                        dp1[j][k] = 1;
                    }
                }
            }
            for (int j = 0; j <= 1000; j++) {
                for (int k = 0; k <= 1000; k++) {
                    dp[j][k] = dp1[j][k];
                    dp1[j][k] = 0;
                }
            }
        }
        int ans = 0;
        for (int i = 1; i <= 1000; i++) {
            if (dp[i][i] == 1) {
                ans = i;
            }
        }
        System.out.println(ans);
    }
}
#include <bits/stdc++.h>
using namespace std;

/*
Brute force recursive solve.
*/
void solve(vector<int>&arr, int &ans, int p1, int p2, int idx, int mx_p){
    // if p1 == p2, we have a potential answer
    if(p1 == p2){
        ans = max(ans, p1);
    }
    //base case 1:
    if((p1>mx_p) || (p2>mx_p) || (idx >= (int)arr.size())){
        return;
    }
    // leave the current element
    solve(arr, ans, p1, p2, idx+1, mx_p);
    // add the current element to p1
    solve(arr, ans, p1+arr[idx], p2, idx+1, mx_p);
    // add the current element to p2
    solve(arr, ans, p1, p2+arr[idx], idx+1, mx_p);
}

/*
Recursive solve with memoization.
*/
int solve(vector<vector<vector<int>>>&memo, vector<int>&arr,
          int p1, int p2, int idx, int mx_p){
    //base case 1:
    if((p1>mx_p) || (p2>mx_p) || (idx>(int)arr.size())){
        return -1;
    }
    // memo'ed answer
    if(memo[p1][p2][idx] != -1){
        return memo[p1][p2][idx];
    }
    // if p1 == p2, we have a potential answer
    if(p1 == p2){
        memo[p1][p2][idx] = max(memo[p1][p2][idx], p1);
    }
    //base case 2: no elements left to place
    if(idx == (int)arr.size()){
        return memo[p1][p2][idx];
    }
    // leave the current element
    memo[p1][p2][idx] = max(memo[p1][p2][idx], solve(memo, arr, p1, p2,
                                                      idx+1, mx_p));
    // add the current element to p1
    memo[p1][p2][idx] = max(memo[p1][p2][idx],
                            solve(memo, arr, p1+arr[idx], p2, idx+1, mx_p));
    // add the current element to p2
    memo[p1][p2][idx] = max(memo[p1][p2][idx],
                            solve(memo, arr, p1, p2+arr[idx], idx+1, mx_p));
    return memo[p1][p2][idx];
}

int main(){
    vector<int>arr = {1, 2, 3, 4, 7};
    int ans = 0;
    // neither sum can exceed half of the total
    int mx_p = 0;
    for(auto i:arr){
        mx_p += i;
    }
    mx_p /= 2;
    vector<vector<vector<int>>>memo(mx_p+1, vector<vector<int>>(mx_p+1,
                                    vector<int>(arr.size()+1, -1)));
    ans = solve(memo, arr, 0, 0, 0, mx_p);
    ans = (ans>=0)?ans:0;
    // solve(arr, ans, 0, 0, 0, mx_p);
    cout << ans << endl;
    return 0;
}

Drawing ellipse with Bresenham's algorithm

I'm trying to draw an ellipse whose axes are parallel to the coordinate axes, using Bresenham's algorithm. I want to draw the top-left (W, SW, S) quarter of the ellipse, and then deduce the others.
To do this, I'm using an incremental algorithm with second-order logic. I adapted it from another algorithm that draws the top-right quarter first, but what I'm doing isn't working.
The problem appears when the second region is drawn, and I don't know where it comes from.
You can see what I have (black), and what I expect (green):
(center of the ellipse (xc, yc) and the upper right button (x2,y2) that is ~(xc+30,yc+20) in this example)
(a is abs(x2-xc), and b is abs(y2-yc))
The first parameter is the center of the ellipse (xc, yc); the second is the upper-right point, which establishes the x and y radii. You can see the ellipse goes too far (2 points on the left and on the right). You can see another example:
(center of the ellipse (xc, yc) and the upper right button (x2,y2) that is ~(xc+15,yc+18) in this example)
The algorithm is deduced from the incremental algorithm with the second-order logic.
Here is my code, (a is abs(x2-xc), and b is abs(y2-yc))
ellipse(int a, int b, int xc, int yc) {
int a2 = a*a, b2 = b*b;
int x = 0, y = b; //Starting point
int incSW = b2*2 + a2*2;
int deltaW = b2*(-2*x + 3); //deduced from incremental algorithm with the second-order logic
int deltaS = a2*(-2*y + 3);
int deltaSW = deltaW + deltaS;
int d1 = b2 - a2*b + a2/4; //dp starting value in the first region
int d2 = b2*(x - 0.5)*(x - 0.5) + a2*(y - 1)*(y - 1) - a2*b2; //dp starting value in the second region
//First region
while(a2*(y-0.5) >= b2*(-x-1)) {
DrawPixel(g,-x+xc, -y+yc); // 1st case
DrawPixel(g,-x+xc, y+yc); // 2nd case
DrawPixel(g,x+xc, y+yc); // 3rd case
DrawPixel(g,x+xc, -y+yc); // 4th case
if(d1>0) {
else {
deltaSW = b2*(2 - 2*x) + a2*(-2*y + 3);
//Second region
while(y>=0) {
DrawPixel(g,-x+xc, -y+yc); // 1st case
DrawPixel(g,-x+xc, y+yc); // 2nd case
DrawPixel(g,x+xc, y+yc); // 3rd case
DrawPixel(g,x+xc, -y+yc); // 4th case
if(d2>0) {
else {
I hope you can help me, thanks.
Using the error term e = a x^2 + b y^2 - r^2, it's pretty easy to show that a step from (x,y) to (x,y+1) changes the error by 2by + b, a step to (x+1,y+1) by 2ax + a + 2by + b, and a step to (x+1,y) by 2ax + a.
Starting from a point (-x0, 0), choose the least absolute error step from these three. The first two cases are the norm for the "first region" as you call it.
The first time a step right, (x,y) to (x+1,y), produces least error, you know you're in the second region. At this point the first case is no longer needed. The quarter ellipse can be finished using only the second two cases.
Note this check avoids the floating point operations you've used. The whole point of Bresenham-ish algorithms is to avoid floating point.
The last bit to notice is that you don't want to compute 2ax or 2by each iteration. The multiplications can be avoided by maintaining variables, say dx=2ax and dy=2by, and updating them. A step from x to x+1 increments dx by 2a, a constant. Similarly a step from y to y+1 increments dy by 2b.
Putting all this together, you get the (rough) code below.
Note that you can check the incremental error computation by verifying it against the original error term. If (x0,0) is the initial point, then you know x0^2 = r^2. So the actual error in every iteration is a * x^2 + b * y^2 - x0^2. This ought to equal e in the code below, and it does.
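The consistency check described above can be sketched in a few lines of Python. This is an illustration only: the function name trace_quarter is an assumption, and for brevity it always picks the least-error step among all three candidates instead of splitting the loop into two regions.

```python
def trace_quarter(x0, a, b):
    """Trace a quarter ellipse from (-x0, 0) until x reaches 0.

    Yields (x, y, e) after every step, where e is the error term
    a*x^2 + b*y^2 - a*x0^2 maintained using additions only.
    """
    x, y, e = -x0, 0, 0
    dx, dy = 2 * a * x, 2 * b * y  # running values of 2ax and 2by
    yield x, y, e
    while x < 0:
        e_up = e + dy + b            # step to (x, y+1)
        e_rt = e + dx + a            # step to (x+1, y)
        e_dg = e + dx + a + dy + b   # step to (x+1, y+1)
        # Choose the candidate with the least absolute error.
        step = min((abs(e_up), "up"), (abs(e_dg), "dg"), (abs(e_rt), "rt"))[1]
        if step == "up":
            y += 1
            e = e_up
            dy += 2 * b
        elif step == "dg":
            x += 1
            y += 1
            e = e_dg
            dx += 2 * a
            dy += 2 * b
        else:
            x += 1
            e = e_rt
            dx += 2 * a
        yield x, y, e


# Verify the incremental error against the closed form at every point.
for x, y, e in trace_quarter(100, 1, 4):
    assert e == 1 * x * x + 4 * y * y - 1 * 100 * 100
```

If the assertion ever fails, one of the increments (dx, dy, or the constant a or b) has been applied to the wrong step.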
import static java.lang.Math.abs;
import java.util.Arrays;
import java.util.function.BiConsumer;

public class EllipseTracer {
    static char [] [] raster = new char[51][101];

    static void trace(int x, int y, int a, int b, BiConsumer<Integer, Integer> emitter) {
        emitter.accept(x, y);
        int e = 0;
        int dx = 2 * a * x;
        int dy = 2 * b * y;
        // First region: stepping north and northeast.
        while (x < 0) {
            int dxa = dx + a;
            int dyb = dy + b;
            int eUp = e + dyb;
            int eRt = e + dxa;
            int eDg = e + dxa + dyb;
            if (abs(eUp) < abs(eDg)) {
                emitter.accept(x, ++y);
                e = eUp;
                dy += 2 * b;
            } else {
                if (abs(eRt) < abs(eDg)) {
                    // Step east is least error. Found second region.
                    emitter.accept(++x, y);
                    e = eRt;
                    dx += 2 * a;
                    break;
                }
                emitter.accept(++x, ++y);
                e = eDg;
                dy += 2 * b;
                dx += 2 * a;
            }
        }
        // Second region: step northeast and east.
        while (x < 0) {
            int dxa = dx + a;
            int dyb = dy + b;
            int eRt = e + dxa;
            int eDg = e + dxa + dyb;
            if (abs(eRt) < abs(eDg)) {
                emitter.accept(++x, y);
                e = eRt;
                dx += 2 * a;
            } else {
                emitter.accept(++x, ++y);
                e = eDg;
                dy += 2 * b;
                dx += 2 * a;
            }
        }
    }

    static void emit(int x, int y) {
        raster[y][x + 100] = '*';
    }

    public static void main(String [] args) {
        for (int i = 0; i < raster.length; ++i) {
            Arrays.fill(raster[i], ' ');
        }
        trace(-100, 0, 1, 4, EllipseTracer::emit);
        // Print the raster row by row.
        for (int i = 0; i < raster.length; ++i) {
            System.out.println(raster[i]);
        }
    }
}
You can add more tricks to avoid the absolute values, but I'll let you look for those.

Discover long patterns

Given a sorted list of numbers, I would like to find the longest subsequence where the differences between successive elements are geometrically increasing. So if the list is
1, 2, 3, 4, 7, 15, 27, 30, 31, 81
then the subsequence is 1, 3, 7, 15, 31. Alternatively consider 1, 2, 5, 6, 11, 15, 23, 41, 47 which has subsequence 5, 11, 23, 47 with a = 3 and k = 2.
Can this be solved in O(n^2) time, where n is the length of the list?
I am interested both in the general case where the progression of differences is ak, ak^2, ak^3, etc., where both a and k are integers, and in the special case where a = 1, so the progression of differences is k, k^2, k^3, etc.
I have improved the algorithm so that it takes O(M + N^2) time on average and needs O(M+N) memory. It is mainly the same as the protocol described below, but to calculate the possible factors A,K for each difference D, I preload a table. This table takes less than a second to construct for M=10^7.
I have made a C implementation that takes less than 10 minutes to solve N=10^5 different random integer elements.
Here is the source code in C. To compile it, just do: gcc -O3 -o findgeo findgeo.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <memory.h>
#include <time.h>
struct Factor {
int a;
int k;
struct Factor *next;
struct Factor *factors = 0;
int factorsL=0;
void ConstructFactors(int R) {
int a,k,C;
int R2;
struct Factor *f;
float seconds;
clock_t end;
clock_t start = clock();
if (factors) free(factors);
factors = malloc (sizeof(struct Factor) *((R>>1) + 1));
R2 = R>>1 ;
for (a=0;a<=R2;a++) {
factors[a].a= a;
R2 = floor(sqrt(R));
for (k=2; k<=R2; k++) {
while (C<R) {
C >>= 1;
f=malloc(sizeof(struct Factor));
end = clock();
seconds = (float)(end - start) / CLOCKS_PER_SEC;
printf("Construct Table: %f\n",seconds);
void DestructFactors() {
int i;
struct Factor *f;
for (i=0;i<factorsL;i++) {
while (factors[i].next) {
int ipow(int base, int exp)
int result = 1;
while (exp)
if (exp & 1)
result *= base;
exp >>= 1;
base *= base;
return result;
void findGeo(int **bestSolution, int *bestSolutionL,int *Arr, int L) {
int i,j,D;
int mustExistToBeBetter;
int R=Arr[L-1]-Arr[0];
int *possibleSolution;
int possibleSolutionL=0;
int exp;
int NextVal;
int idx;
int kMax,aMax;
float seconds;
clock_t end;
clock_t start = clock();
kMax = floor(sqrt(R));
aMax = floor(R/2);
possibleSolution = malloc(sizeof(int)*(R+1));
struct Factor *f;
int *H=malloc(sizeof(int)*(R+1));
memset(H,0, sizeof(int)*(R+1));
for (i=0;i<L;i++) {
H[ Arr[i]-Arr[0] ]=1;
for (i=0; i<L-2;i++) {
for (j=i+2; j<L; j++) {
if (D & 1) continue;
f = factors + (D >>1);
while (f) {
idx=Arr[i] + f->a * f->k - Arr[0];
if ((f->k <= kMax)&& (f->a<aMax)&&(idx<=R)&&H[idx]) {
if (f->k ==1) {
mustExistToBeBetter = Arr[i] + f->a * (*bestSolutionL);
} else {
mustExistToBeBetter = Arr[i] + f->a * f->k * (ipow(f->k,*bestSolutionL) - 1)/(f->k-1);
if (mustExistToBeBetter< Arr[L-1]+1) {
idx= floor(mustExistToBeBetter - Arr[0]);
} else {
idx = R+1;
if ((idx<=R)&&H[idx]) {
possibleSolution[1]=Arr[i] + f->a*f->k;
exp = f->k * f->k * f->k;
NextVal = Arr[j] + f->a * exp;
idx=NextVal - Arr[0];
while ( (idx<=R) && H[idx]) {
exp = exp * f->k;
NextVal = NextVal + f->a * exp;
idx=NextVal - Arr[0];
if (possibleSolutionL > *bestSolutionL) {
*bestSolution = possibleSolution;
possibleSolution = malloc(sizeof(int)*(R+1));
kMax= floor( pow (R, 1/ (*bestSolutionL) ));
aMax= floor(R / (*bestSolutionL));
if (*bestSolutionL == 2) {
for (i=0; (i<2)&&(i<L); i++ ) {
*bestSolution = possibleSolution;
} else {
end = clock();
seconds = (float)(end - start) / CLOCKS_PER_SEC;
printf("findGeo: %f\n",seconds);
int compareInt (const void * a, const void * b)
return *(int *)a - *(int *)b;
int main(void) {
int N=100000;
int R=10000000;
int *A = malloc(sizeof(int)*N);
int *Sol;
int SolL;
int i;
int *S=malloc(sizeof(int)*R);
for (i=0;i<R;i++) S[i]=i+1;
for (i=0;i<N;i++) {
int r = rand() % (R-i);
int step = floor(R/N);
for (i=1;i<N;i++) {
for (i=0;i<SolL;i++) {
if (i>0) printf(",");
printf("Size: %d\n",SolL);
I will try to demonstrate that the algorithm that I proposed is O(M + N^2) on average for a uniformly distributed random sequence. I'm not a mathematician and I am not used to doing this kind of demonstration, so please feel free to correct any error that you see.
There are 4 nested loops; the first two give the N^2 factor. (The M is for the calculation of the possible factors table.)
The third loop is executed only once on average for each pair. You can see this by checking the size of the pre-calculated factors table. Its size is M when N->inf. So the average number of steps for each pair is M/M=1.
So the proof comes down to checking that the fourth loop (the one that traverses the well-formed sequences) is executed O(N^2) times or fewer over all the pairs.
To demonstrate that, I will consider two cases: one where M>>N and another where M ~= N, where M is the maximum difference of the initial array: M = S(n)-S(1).
For the first case, (M>>N) the probability to find a coincidence is p=N/M. To start a sequence, it must coincide the second and the b+1 element where b is the length of the best sequence until now. So the loop will enter times. And the average length of this series (supposing an infinite series) is . So the total number of times that the loop will be executed is . And this is close to 0 when M>>N. The problem here is when M~=N.
Now lets consider this case where M~=N. Lets consider that b is the best sequence length until now. For the case A=k=1, then the sequence must start before N-b, so the number of sequences will be N-b, and the times that will go for the loop will be a maximum of (N-b)*b.
For A>1 and k=1 we can extrapolate to where d is M/N (the average distance between numbers). If we add for all A’s from 1 to dN/b then we see a top limit of:
For the cases where k>=2, we see that the sequence must start before , So the loop will enter an average of and adding for all As from 1 to dN/k^b, it gives a limit of
Here, the worst case is when b is minimum. Because we are considering minimum series, lets consider a very worst case of b= 2 so the number of passes for the 4th loop for a given k will be less than
And if we add all k’s from 2 to infinite will be:
So adding all the passes for k=1 and k>=2, we have a maximum of:
Note that d=M/N=1/p.
So we have two limits, one that goes to infinity when d=1/p=M/N goes to 1 and another that goes to infinity when d goes to infinity. So our limit is the minimum of both, and the worst case is where both equations cross. So if we solve the equation:
we see that the maximum is when d=1.353
So it is demonstrated that the fourth loop will be executed fewer than 1.55N^2 times in total.
Of course, this is for the average case. For the worst case I am not able to find a way to generate series whose fourth loop runs more than O(N^2) times, and I strongly believe that they do not exist, but I am not a mathematician to prove it.
Old Answer
Here is a solution that takes O((n^2)*cube_root(M)) on average, where M is the difference between the first and last element of the array, with memory requirements of O(M+N).
1.- Construct an array H of length M so that H[i - S[0]]=true if i exists in the initial array and false if it does not.
2.- For each pair S[j], S[i] in the array do:
2.1 Check if they can be the first and third elements of a possible solution. To do so, calculate all possible A,K pairs that satisfy the equation S(i) = S(j) + AK + AK^2. Check this SO question to see how to solve this problem. Then check that the second element exists: S[j] + A*K
2.2 Check also that the element one position further than the best solution found so far exists. For example, if the best solution so far is 4 elements long, then check that the element S[j] + AK + AK^2 + AK^3 + AK^4 exists.
2.3 If 2.1 and 2.2 are true, then iterate to see how long this series is, and set it as the bestSolution if it is longer than the previous one.
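The steps above can be sketched compactly in Python. This is a simplified illustration, not the answer's implementation: it assumes integer input with K >= 2, uses a set in place of the array H, tries every K directly instead of the cube-root factor search (so it is slower), and leaves out the pruning of step 2.2. The function name is made up for the example.

```python
def longest_geometric_diff_subsequence(S):
    """Longest subsequence of the sorted integer list S whose successive
    differences are A*K, A*K^2, A*K^3, ... for integers A >= 1, K >= 2."""
    present = set(S)  # plays the role of the lookup array H
    best = []
    for j in range(len(S)):
        for i in range(j + 1, len(S)):
            # S[j] and S[i] as first and third elements:
            # S[i] - S[j] must equal A*K + A*K^2 = A*K*(K+1)
            D = S[i] - S[j]
            K = 2
            while K * (K + 1) <= D:
                if D % (K * (K + 1)) == 0:
                    A = D // (K * (K + 1))
                    if S[j] + A * K in present:  # the second element exists
                        seq = [S[j], S[j] + A * K, S[i]]
                        step = A * K ** 3
                        while seq[-1] + step in present:
                            seq.append(seq[-1] + step)
                            step *= K
                        if len(seq) > len(best):
                            best = seq
                K += 1
    return best
```

On the two lists from the question this returns [1, 3, 7, 15, 31] and [5, 11, 23, 47] respectively.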
Here is the code in javascript:
// Returns all [A, K] pairs such that D = A*K + A*K^2 = A*K*(K+1).
function getAKs(D) {
    // K*(K+1) is always even, so an odd difference has no solution.
    if (D / 2 != Math.floor(D / 2)) return [];
    var solution = [];
    var i;
    var SR3 = Math.pow(D, 1 / 3);
    for (i = 1; i <= SR3; i++) {
        var B, C;
        // Try i as K and solve for A.
        C = i;
        B = D / (C * (C + 1));
        if (B == Math.floor(B)) {
            solution.push([B, C]);
        }
        // Try i as A and solve for K.
        B = i;
        C = (-1 + Math.sqrt(1 + 4 * D / B)) / 2;
        if (C == Math.floor(C)) {
            solution.push([B, C]);
        }
    }
    return solution;
}

function getBestGeometricSequence(S) {
    var i, j, k;
    var bestSolution = [];
    var H = Array(S[S.length-1]-S[0]);
    for (i = 0; i < S.length; i++) H[S[i] - S[0]] = true;
    for (i = 0; i < S.length; i++) {
        for (j = 0; j < i; j++) {
            var PossibleAKs = getAKs(S[i] - S[j]);
            for (k = 0; k < PossibleAKs.length; k++) {
                var A = PossibleAKs[k][0];
                var K = PossibleAKs[k][1];
                var mustExistToBeBetter;
                if (K==1) {
                    mustExistToBeBetter = S[j] + A * bestSolution.length;
                } else {
                    mustExistToBeBetter = S[j] + A * K * (Math.pow(K,bestSolution.length) - 1)/(K-1);
                }
                if ((H[S[j] + A * K - S[0]]) && (H[mustExistToBeBetter - S[0]])) {
                    var possibleSolution=[S[j],S[j] + A * K,S[i]];
                    var exp = K * K * K;
                    var NextVal = S[i] + A * exp;
                    while (H[NextVal - S[0]] === true) {
                        possibleSolution.push(NextVal);
                        exp = exp * K;
                        NextVal = NextVal + A * exp;
                    }
                    if (possibleSolution.length > bestSolution.length) {
                        bestSolution = possibleSolution;
                    }
                }
            }
        }
    }
    return bestSolution;
}
//var A= [ 1, 2, 3,5,7, 15, 27, 30,31, 81];
var A=[];
for (i=1;i<=3000;i++) {
var sol=getBestGeometricSequence(A);
You can check the code here:
I maintain the other solution because I believe that it is still better when M is very big compared to N.
Just to start with something, here is a simple solution in JavaScript:
var input = [0.7, 1, 2, 3, 4, 7, 15, 27, 30, 31, 81],
output = [], indexes, values, i, index, value, i_max_length,
i1, i2, i3, j1, j2, j3, difference12a, difference23a, difference12b, difference23b,
scale_factor, common_ratio_a, common_ratio_b, common_ratio_c,
error, EPSILON = 1e-9, common_ratio_is_integer,
resultDiv = $("#result");
for (i1 = 0; i1 < input.length - 2; ++i1) {
for (i2 = i1 + 1; i2 < input.length - 1; ++i2) {
scale_factor = difference12a = input[i2] - input[i1];
for (i3 = i2 + 1; i3 < input.length; ++i3) {
difference23a = input[i3] - input[i2];
common_ratio_1a = difference23a / difference12a;
common_ratio_2a = Math.round(common_ratio_1a);
error = Math.abs((common_ratio_2a - common_ratio_1a) / common_ratio_1a);
common_ratio_is_integer = error < EPSILON;
if (common_ratio_2a > 1 && common_ratio_is_integer) {
indexes = [i1, i2, i3];
j1 = i2;
j2 = i3
difference12b = difference23a;
for (j3 = j2 + 1; j3 < input.length; ++j3) {
difference23b = input[j3] - input[j2];
common_ratio_1b = difference23b / difference12b;
common_ratio_2b = Math.round(common_ratio_1b);
error = Math.abs((common_ratio_2b - common_ratio_1b) / common_ratio_1b);
common_ratio_is_integer = error < EPSILON;
if (common_ratio_is_integer && common_ratio_2a === common_ratio_2b) {
j1 = j2;
j2 = j3
difference12b = difference23b;
values = [];
for (i = 0; i < indexes.length; ++i) {
index = indexes[i];
value = input[index];
if (output.length > 0) {
i_max_length = 0;
for (i = 1; i < output.length; ++i) {
if (output[i_max_length].length < output[i].length)
i_max_length = i;
for (i = 0; i < output.length; ++i) {
if (output[i_max_length].length == output[i].length)
resultDiv.append("<p>[" + output[i] + "]</p>");
[1, 3, 7, 15, 31]
I find the first three items of every subsequence candidate, calculate the scale factor and the common ratio from them, and if the common ratio is an integer, I iterate over the remaining elements after the third one and add to the subsequence those that fit the geometric progression defined by the first three items. As a last step, I select the subsequence(s) with the greatest length.
In fact, this is exactly the same question as Longest equally-spaced subsequence; you just have to consider the logarithm of your data. If the sequence is a, ak, ak^2, ak^3, the logarithmic values are ln(a), ln(a) + ln(k), ln(a) + 2ln(k), ln(a) + 3ln(k), so they are equally spaced. The converse is of course also true. There is a lot of different code in the question above.
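To make the logarithm argument concrete, here is a minimal Python sketch. The helper name `log_gaps` and the sample values a = 3, k = 2 are assumptions chosen for illustration only. Because logarithms are floating-point values, the gaps are compared with a tolerance rather than for exact equality.

```python
import math

def log_gaps(seq):
    """Return the differences between consecutive natural logs of seq."""
    logs = [math.log(v) for v in seq]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical example values: a = 3, k = 2 gives 3, 6, 12, 24, 48.
a, k = 3, 2
geometric = [a * k ** i for i in range(5)]
gaps = log_gaps(geometric)

# Every gap equals ln(k), up to floating-point rounding.
assert all(math.isclose(g, math.log(k)) for g in gaps)
```

Any equally-spaced-subsequence routine applied to the logs would need the same kind of tolerance, which is one practical cost of this reduction.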
I don't think the special case a=1 can be resolved more efficiently than an adaptation from an algorithm above.
Here is my solution in JavaScript. It should be close to O(n^2), except maybe in some pathological cases.
function bsearch(Arr,Val, left,right) {
if (left == right) return left;
var m=Math.floor((left + right) /2);
if (Val <= Arr[m]) {
return bsearch(Arr,Val,left,m);
} else {
return bsearch(Arr,Val,m+1,right);
function findLongestGeometricSequence(S) {
var bestSolution=[];
var i,j,k;
var H={};
for (i=0;i<S.length;i++) H[S[i]]=true;
for (i=0;i<S.length;i++) {
for (j=0;j<i;j++) {
for (k=j+1;k<i;) {
var possibleSolution=[S[j],S[k],S[i]];
var K = (S[i] - S[k]) / (S[k] - S[j]);
var A = (S[k] - S[j]) * (S[k] - S[j]) / (S[i] - S[k]);
if ((Math.floor(K) == K) && (Math.floor(A)==A)) {
var exp= K*K*K;
var NextVal= S[i] + A * exp;
while (H[NextVal] === true) {
exp = exp * K;
NextVal= NextVal + A * exp;
if (possibleSolution.length > bestSolution.length)
} else {
if (K>0) {
var NextPossibleMidValue= (S[i] + K*S[j]) / (K +1);
if (S[k]<NextPossibleMidValue) {
k=bsearch(S,NextPossibleMidValue, k+1, i);
} else {
return bestSolution;
function Run() {
var MyS= [0.7, 1, 2, 3, 4, 5,6,7, 15, 27, 30,31, 81];
var sol = findLongestGeometricSequence(MyS);
Small Explanation
If we take 3 numbers of the array, S(j) < S(k) < S(i), then you can calculate a and k so that S(k) = S(j) + a*k and S(i) = S(k) + a*k^2 (2 equations and 2 unknowns). With that in mind, you can check whether the array contains a number S(next) = S(i) + a*k^3. If it does, continue checking for S(next2) = S(next) + a*k^4, and so on.
This would be an O(n^3) solution, but you can take advantage of the fact that k must be an integer to limit the S(k) points selected.
If a is known, then you can calculate a(k) and you need to check only one number in the third loop, so this case is clearly O(n^2).
I think this task is related to the recently posted Longest equally-spaced subsequence. I've just modified my algorithm in Python a little bit:
from math import sqrt

def add_precalc(precalc, end, step, count, res, N):
    # step is the pair (a, k); record that x can continue a sequence of this step
    a, k = step
    if end + a * k ** res[1]["count"] > N: return

    x = end + a * k ** count
    if x > N or x < 0: return

    if precalc[x] is None: return

    if (a, k) not in precalc[x]:
        precalc[x][(a, k)] = count

def factors(n):
    # all ordered pairs (x, y) with x * y == n
    res = []
    for x in range(1, int(sqrt(n)) + 1):
        if n % x == 0:
            y = n // x
            res.append((x, y))
            res.append((y, x))
    return res

def work(input):
    precalc = [None] * (max(input) + 1)
    for x in input: precalc[x] = {}

    N = max(input)
    res = ((0, 0), {"end":0, "count":0})
    for i, x in enumerate(input):
        for y in input[i::-1]:
            for a, k in factors(x - y):
                if (a, k) in precalc[x]: continue
                add_precalc(precalc, x, (a, k), 2, res, N)

        for step, count in precalc[x].items():
            count += 1
            if count > res[1]["count"]: res = (step, {"end":x, "count":count})
            add_precalc(precalc, x, step, count, res, N)
        precalc[x] = None

    # rebuild the best sequence backwards from its last element
    d = [res[1]["end"]]
    for x in range(res[1]["count"] - 1, 0, -1):
        d.append(d[-1] - res[0][0] * res[0][1] ** x)

    return d
Traversing the array
For each previous element of the array, calculate the factors of the difference between the current element and that previous element, then precalculate the next possible element of the sequence and save it to the precalc array
So when arriving at element i, all possible sequences containing element i are already in the precalc array, so we only have to calculate the next possible element and save it to precalc.
Currently there's one place in the algorithm that could be slow: the factorization of the difference for each previous number. I think it could be made faster with two optimizations:
a more efficient factorization algorithm
finding a way to avoid looking at every element of the array, using the fact that the array is sorted and that there are already precalculated sequences
def subseq(a):
    seq = []
    aset = set(a)
    for i, x in enumerate(a):
        # elements after x
        for j, x2 in enumerate(a[i+1:]):
            j += i + 1  # enumerate starts j at 0, we want a[j] = x2
            bk = x2 - x  # b*k (assuming k and k's exponent start at 1)
            # given b*k, bruteforce values of k
            for k in range(1, bk + 1):
                items = [x, x2]  # our subsequence so far
                nextdist = bk * k  # what x3 - x2 should look like
                while items[-1] + nextdist in aset:
                    items.append(items[-1] + nextdist)
                    nextdist *= k
                if len(items) > len(seq):
                    seq = items
    return seq
Running time is O(dn^3), where d is the (average?) distance between two elements,
and n is of course len(a).

Greatest linear dimension 2d set of points

Given an ordered set of 2D pixel locations (adjacent or adjacent-diagonal) that form a complete path with no repeats, how do I determine the Greatest Linear Dimension of the polygon whose perimeter is that set of pixels? (where the GLD is the greatest linear distance of any pair of points in the set)
For my purposes, the obvious O(n^2) solution is probably not fast enough for figures of thousands of points. Are there good heuristics or lookup methods that bring the time complexity nearer to O(n) or O(log(n))?
An easy way is to first find the convex hull of the points, which can be done in O(n log n) time in many ways. [I like Graham scan (see animation), but the incremental algorithm is also popular, as are others, although some take more time.]
Then you can find the farthest pair (the diameter) by starting with any two points (say x and y) on the convex hull, moving y clockwise until it is furthest from x, then moving x, moving y again, etc. You can prove that this whole thing takes only O(n) time (amortized). So it's O(n log n)+O(n)=O(n log n) in all, and possibly O(nh) if you use gift-wrapping as your convex hull algorithm instead. This idea is called rotating calipers, as you mentioned.
Here is code by David Eppstein (computational geometry researcher; see also his Python Algorithms and Data Structures for future reference).
All this is not very hard to code (should be a hundred lines at most; is less than 50 in the Python code above), but before you do that -- you should first consider whether you really need it. If, as you say, you have only "thousands of points", then the trivial O(n^2) algorithm (that compares all pairs) will run in less than a second in any reasonable programming language. Even with a million points it shouldn't take more than an hour. :-)
You should pick the simplest algorithm that works.
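For comparison, here is a minimal Python sketch of both options discussed in this answer: the hull-plus-rotating-calipers approach and the trivial all-pairs check. It follows the same upper/lower hull idea but is not the linked code itself. All function names are hypothetical, and points are assumed to be `(x, y)` tuples.

```python
from itertools import combinations

def orientation(p, q, r):
    """Negated cross product of (q - p) and (r - p); positive for a clockwise turn."""
    return (q[1] - p[1]) * (r[0] - p[0]) - (q[0] - p[0]) * (r[1] - p[1])

def hulls(points):
    """Return (upper, lower) hull chains, each ordered left to right."""
    upper, lower = [], []
    for p in sorted(set(points)):
        while len(upper) > 1 and orientation(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        while len(lower) > 1 and orientation(lower[-2], lower[-1], p) >= 0:
            lower.pop()
        upper.append(p)
        lower.append(p)
    return upper, lower

def antipodal_pairs(points):
    """Yield candidate farthest pairs by walking the two chains in opposite directions."""
    upper, lower = hulls(points)
    i, j = 0, len(lower) - 1
    while i < len(upper) - 1 or j > 0:
        yield upper[i], lower[j]
        if i == len(upper) - 1:
            j -= 1
        elif j == 0:
            i += 1
        elif ((upper[i + 1][1] - upper[i][1]) * (lower[j][0] - lower[j - 1][0])
              > (lower[j][1] - lower[j - 1][1]) * (upper[i + 1][0] - upper[i][0])):
            i += 1  # the upper edge has the steeper slope, so advance on the upper chain
        else:
            j -= 1

def dist2(pair):
    """Squared distance between the two points of a pair (no square root needed)."""
    (x1, y1), (x2, y2) = pair
    return (x1 - x2) ** 2 + (y1 - y2) ** 2

def diameter(points):
    """Farthest pair via hull + calipers; None if fewer than two distinct points."""
    return max(antipodal_pairs(points), key=dist2, default=None)

def diameter_brute_force(points):
    """Farthest pair by comparing all pairs, O(n^2)."""
    return max(combinations(points, 2), key=dist2, default=None)

pts = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
assert dist2(diameter(pts)) == dist2(diameter_brute_force(pts))
```

The two functions can return different pairs when several pairs tie for the maximum, so the check compares squared distances rather than the pairs themselves.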
On this page:
it shows that you can determine the maximum diameter of a convex polygon in O(n). I just need to turn my point set into a convex polygon first (probably using Graham scan).
Here is some C# code I came across for computing the convex hull:
I ported the Python code to C#. It seems to work.
using System;
using System.Collections.Generic;
using System.Drawing;

// Based on code here:
// Jared Updike ported it to C# 3 December 2008
public class Convexhull
{
    // given a polygon formed by pts, return the subset of those points
    // that form the convex hull of the polygon
    // for integer Point structs, not float/PointF
    public static Point[] ConvexHull(Point[] pts)
    {
        PointF[] mpts = FromPoints(pts);
        PointF[] result = ConvexHull(mpts);
        int n = result.Length;
        Point[] ret = new Point[n];
        for (int i = 0; i < n; i++)
            ret[i] = new Point((int)result[i].X, (int)result[i].Y);
        return ret;
    }

    // given a polygon formed by pts, return the subset of those points
    // that form the convex hull of the polygon
    public static PointF[] ConvexHull(PointF[] pts)
    {
        PointF[][] l_u = ConvexHull_LU(pts);
        PointF[] lower = l_u[0];
        PointF[] upper = l_u[1];
        // Join the lower and upper hull
        int nl = lower.Length;
        int nu = upper.Length;
        PointF[] result = new PointF[nl + nu];
        for (int i = 0; i < nl; i++)
            result[i] = lower[i];
        for (int i = 0; i < nu; i++)
            result[i + nl] = upper[i];
        return result;
    }

    // returns the two points that form the diameter of the polygon formed by points pts
    // takes and returns integer Point structs, not PointF
    public static Point[] Diameter(Point[] pts)
    {
        PointF[] fpts = FromPoints(pts);
        PointF[] maxPair = Diameter(fpts);
        return new Point[] { new Point((int)maxPair[0].X, (int)maxPair[0].Y), new Point((int)maxPair[1].X, (int)maxPair[1].Y) };
    }

    // returns the two points that form the diameter of the polygon formed by points pts
    public static PointF[] Diameter(PointF[] pts)
    {
        IEnumerable<Pair> pairs = RotatingCalipers(pts);
        double max2 = Double.NegativeInfinity;
        Pair maxPair = null;
        foreach (Pair pair in pairs)
        {
            PointF p = pair.a;
            PointF q = pair.b;
            double dx = p.X - q.X;
            double dy = p.Y - q.Y;
            double dist2 = dx * dx + dy * dy;
            if (dist2 > max2)
            {
                maxPair = pair;
                max2 = dist2;
            }
        }
        // return Math.Sqrt(max2);
        return new PointF[] { maxPair.a, maxPair.b };
    }

    private static PointF[] FromPoints(Point[] pts)
    {
        int n = pts.Length;
        PointF[] mpts = new PointF[n];
        for (int i = 0; i < n; i++)
            mpts[i] = new PointF(pts[i].X, pts[i].Y);
        return mpts;
    }

    private static double Orientation(PointF p, PointF q, PointF r)
    {
        return (q.Y - p.Y) * (r.X - p.X) - (q.X - p.X) * (r.Y - p.Y);
    }

    private static void Pop<T>(List<T> l)
    {
        int n = l.Count;
        l.RemoveAt(n - 1);
    }

    // negative indexes count from the end of the list, as in Python
    private static T At<T>(List<T> l, int index)
    {
        int n = l.Count;
        if (index < 0)
            return l[n + index];
        return l[index];
    }

    private static PointF[][] ConvexHull_LU(PointF[] arr_pts)
    {
        List<PointF> u = new List<PointF>();
        List<PointF> l = new List<PointF>();
        List<PointF> pts = new List<PointF>(arr_pts.Length);
        // copy the input and sort by X, then Y, so the scan sweeps left to right
        pts.AddRange(arr_pts);
        pts.Sort(Compare);
        foreach (PointF p in pts)
        {
            while (u.Count > 1 && Orientation(At(u, -2), At(u, -1), p) <= 0) Pop(u);
            while (l.Count > 1 && Orientation(At(l, -2), At(l, -1), p) >= 0) Pop(l);
            u.Add(p);
            l.Add(p);
        }
        return new PointF[][] { l.ToArray(), u.ToArray() };
    }

    private class Pair
    {
        public PointF a, b;
        public Pair(PointF a, PointF b)
        {
            this.a = a;
            this.b = b;
        }
    }

    private static IEnumerable<Pair> RotatingCalipers(PointF[] pts)
    {
        PointF[][] l_u = ConvexHull_LU(pts);
        PointF[] lower = l_u[0];
        PointF[] upper = l_u[1];
        int i = 0;
        int j = lower.Length - 1;
        while (i < upper.Length - 1 || j > 0)
        {
            yield return new Pair(upper[i], lower[j]);
            // advance along whichever hull edge the calipers touch first
            if (i == upper.Length - 1) j--;
            else if (j == 0) i += 1;
            else if ((upper[i + 1].Y - upper[i].Y) * (lower[j].X - lower[j - 1].X) >
                (lower[j].Y - lower[j - 1].Y) * (upper[i + 1].X - upper[i].X))
                i += 1;
            else
                j--;
        }
    }

    // orders points by X, then by Y
    private static int Compare(PointF a, PointF b)
    {
        if (a.X < b.X)
        {
            return -1;
        }
        else if (a.X == b.X)
        {
            if (a.Y < b.Y)
                return -1;
            else if (a.Y == b.Y)
                return 0;
        }
        return 1;
    }
}
You could maybe draw a circle that is bigger than the polygon and slowly shrink it, checking whether you've intersected any points yet. Then your diameter is the number you're looking for.
Not sure if this is a good method, it sounds somewhere between O(n) and O(n^2)
My off-the-cuff solution is to try a binary partitioning approach, where you draw a line somewhere in the middle and check distances of all points from the middle of that line.
That would provide you with 2 Presumably Very Far points. Then check the distance of those two and repeat the above distance check. Repeat this process for a while.
My gut says this is an n log n heuristic that will get you Pretty Close.
