I have a sorted array of N intervals of different length. I am plotting these intervals with alternating colors blue/green.
I am trying to find a method or algorithm to "downsample" the array of intervals to produce a visually similar plot, but with fewer elements.
Ideally I could write some function where I can pass the target number of output intervals as an argument. The output length only has to come close to the target.
input = [
  [0, 5, "blue"],
  [5, 6, "green"],
  [6, 10, "blue"],
  // ...etc
]
output = downsample(input, 25)
// [[0, 10, "blue"], ... ]
Below is a picture of what I am trying to accomplish. In this example the input has about 250 intervals, and the output has about 25 intervals. The input length can vary a lot.

Update 1:
Below is my original post, which I initially deleted because there were issues with displaying the equations and I wasn't confident that it made sense. Later I figured out that the optimisation problem I described can actually be solved efficiently with DP (dynamic programming).
So I did a sample C++ implementation. Here are some results:
Here is a live demo that you can play with in your browser (make sure your browser supports WebGL2, like Chrome or Firefox). It takes a bit to load the page.
Here is the C++ implementation: link
Update 2:
Turns out the proposed solution has the following nice property - we can easily control the importance of the two parts F1 and F2 of the cost function. Simply change the cost function to F(α)=F1 + αF2, where α >= 1.0 is a free parameter. The DP algorithm remains the same.
Here are some results for different α values using the same number of intervals N:
Live demo (WebGL2 required)
As can be seen, higher α means it is more important to cover the original input intervals even if this means covering more of the background in-between.
Original post
Even though some good algorithms have already been proposed, I would like to propose a slightly unusual approach: interpreting the task as an optimisation problem. Although I don't know how to solve the optimisation problem efficiently (or even whether it can be solved in reasonable time at all), it might be useful to someone purely as a concept.
First, without loss of generality, let's declare the blue color to be the background. We will be painting N green intervals on top of it (N is the number provided to the downsample() function in OP's description). The i-th interval is defined by its starting coordinate 0 <= xi < xmax and width wi >= 0 (xmax is the maximum coordinate from the input).
Let's also define the array G(x) to be the number of green cells in the interval [0, x) in the input data. This array can easily be pre-calculated. We will use it to quickly calculate the number of green cells in an arbitrary interval [x, y), namely G(y) - G(x).
We can now introduce the first part of the cost function for our optimisation problem:
The smaller F1 is, the better our generated intervals cover the input intervals, so we will be searching for xi, wi that minimise it. Ideally we want F1=0 which would mean that the intervals do not cover any of the background (which of course is not possible because N is less than the number of input intervals).
However, this function alone is not enough to describe the problem, because we can trivially minimise it by taking empty intervals: F1(x, 0)=0. Instead, we want to cover as much as possible of the input intervals. Let's introduce the second part of the cost function, which corresponds to this requirement:
The smaller F2 is, the more input intervals are covered. Ideally we want F2=0 which would mean that we covered all of the input rectangles. However, minimising F2 competes with minimising F1.
Finally, we can state our optimisation problem: find xi, wi that minimize F=F1 + F2
How to solve this problem? Not sure. Maybe use some metaheuristic approach for global optimisation such as Simulated annealing or Differential evolution. These are typically easy to implement, especially for this simple cost function.
The best case would be that some kind of DP algorithm exists for solving it efficiently, but that seems unlikely.
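To make the two cost terms concrete, here is a minimal Python sketch. It rests on my reading of the description above rather than on the C++ implementation: F1 is taken to be the number of background cells covered by the chosen intervals, F2 the number of green input cells left uncovered, cells are unit-width at integer coordinates, and the chosen intervals are assumed not to overlap. The names green_prefix and cost are illustrative.

```python
from itertools import accumulate

def green_prefix(intervals, xmax):
    """Return G where G[x] is the number of green cells in [0, x)."""
    cells = [0] * xmax
    for start, end, color in intervals:
        if color == "green":
            for x in range(start, end):
                cells[x] = 1
    return [0] + list(accumulate(cells))

def cost(G, chosen, alpha=1.0):
    """F = F1 + alpha * F2 for non-overlapping chosen intervals (x, w)."""
    covered_green = sum(G[x + w] - G[x] for x, w in chosen)
    f1 = sum(w for _, w in chosen) - covered_green  # background cells painted over
    f2 = G[-1] - covered_green                      # green cells left unpainted
    return f1 + alpha * f2

data = [(0, 5, "blue"), (5, 6, "green"), (6, 10, "blue"), (10, 12, "green")]
G = green_prefix(data, 12)
print(cost(G, [(5, 7)]))           # one wide interval: covers all green plus 4 background cells
print(cost(G, [(5, 1), (10, 2)]))  # exact cover
```

With alpha above 1, leaving green uncovered is penalised more heavily than painting over background, which matches the behaviour described for higher α.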

I would advise you to use the Haar wavelet. It is a very simple algorithm that was often used to provide progressive loading of big images on websites.
Here you can see how it works with a 2D function. That is what you can use. Alas, the document is in Ukrainian, but the code is in C++, so it is readable :)
This document provides an example of a 3D object:
You can find pseudocode for compressing with the Haar wavelet in Wavelets for Computer Graphics: A Primer, Part 1.

You could do the following:
1) Write out the points that divide the whole strip into intervals as the array [a[0], a[1], a[2], ..., a[n-1]]. In your example, the array would be [0, 5, 6, 10, ... ].
2) Calculate the double-interval lengths a[2]-a[0], a[3]-a[1], a[4]-a[2], ..., a[n-1]-a[n-3] and find the least of them. Let it be a[k+2]-a[k]. If two or more lengths share the lowest value, choose one of them randomly. In your example, you should get the array [6, 5, ... ] and search it for the minimum value.
3) Swap the intervals (a[k], a[k+1]) and (a[k+1], a[k+2]). Basically, you need to assign a[k+1]=a[k]+a[k+2]-a[k+1] to keep the lengths, and then remove the points a[k] and a[k+2] from the array, because each swapped interval now touches a neighbour of the same color and the two pairs merge into two larger intervals. Thus, the numbers of blue and green intervals each decrease by one after this step.
4) If you're satisfied with the current number of intervals, end the process; otherwise go to step 1.
You perform step 2 in order to decrease the "color shift" because, at step 3, the left interval is moved a[k+2]-a[k+1] to the right and the right interval is moved a[k+1]-a[k] to the left. The sum of these distances, a[k+2]-a[k], can be considered a measure of the change you're introducing into the whole picture.
Main advantages of this approach:
It is simple.
It doesn't give a preference to either of the two colors. You don't need to assign one of the colors to be the background and the other to be the painting color. The picture can be considered both as "green-on-blue" and "blue-on-green". This reflects a quite common use case where two colors just describe two opposite states (like the bit 0/1, or a "yes/no" answer) of some process extended in time or in space.
It always keeps the balance between colors, i.e. the total length of the intervals of each color remains the same during the reduction process. Thus the total brightness of the picture doesn't change. This is important because the total brightness can be considered an "indicator of completeness" in some cases.
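The swap-and-merge steps of this answer can be sketched in Python as follows. This is a minimal sketch under two assumptions that the answer does not state: ties are broken by taking the first minimum rather than a random one, and only interior positions k are considered, so the two outer breakpoints of the strip are never removed. The name downsample_breakpoints is illustrative.

```python
def downsample_breakpoints(a, target_intervals):
    """a: sorted breakpoints; interval i is (a[i], a[i+1]) and colors alternate.

    Repeatedly swaps the cheapest pair of adjacent intervals and merges each
    with its same-colored neighbour, removing two intervals per step.
    """
    a = list(a)
    while len(a) - 1 > target_intervals and len(a) >= 5:
        # Interior k only (1 <= k <= len(a) - 4), so a[0] and a[-1] survive.
        k = min(range(1, len(a) - 3), key=lambda i: a[i + 2] - a[i])
        # New boundary between the swapped intervals; a[k] and a[k+2] disappear.
        mid = a[k] + a[k + 2] - a[k + 1]
        a[k:k + 3] = [mid]
    return a

# blue (0,5), green (5,6), blue (6,10), green (10,12), blue (12,20)
print(downsample_breakpoints([0, 5, 6, 10, 12, 20], 3))  # [0, 9, 12, 20]
```

In the example the total blue length (17) and green length (3) are the same before and after, as the answer claims. Because each step removes two intervals, the result can overshoot or undershoot the target by one.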

Here's another attempt at dynamic programming that's slightly different than Georgi Gerganov's, although the idea to try and formulate a dynamic program may have been inspired by his answer. Neither the implementation nor the concept is guaranteed to be sound but I did include a code sketch with a visual example :)
The search space in this case is not reliant on the total unit width but rather on the number of intervals. It's O(N * n^2) time and O(N * n) space, where N and n are the target and given number of (green) intervals, respectively, because we assume that any newly chosen green interval must be bound by two green intervals (rather than extend arbitrarily into the background).
The idea also utilises the prefix sum idea used to calculate runs with a majority element. We add 1 when we see the target element (in this case green) and subtract 1 for others (that algorithm is also amenable to multiple elements with parallel prefix sum tracking). (I'm not sure that restricting candidate intervals to sections with a majority of the target colour is always warranted but it may be a useful heuristic depending on the desired outcome. It's also adjustable -- we can easily adjust it to check for a different part than 1/2.)
Where Georgi Gerganov's program seeks to minimise, this dynamic program seeks to maximise two ratios. Let h(i, k) represent the best sequence of green intervals up to the ith given interval, utilising k intervals, where each is allowed to stretch back to the left edge of some previous green interval. We speculate that
h(i, k) = max(r + C*r1 + h(i-l, k-1))
where, in the current candidate interval, r is the ratio of green to the length of the stretch, and r1 is the ratio of green to the total given green. r1 is multiplied by an adjustable constant to give more weight to the volume of green covered. l is the length of the stretch.
JavaScript code (for debugging, it includes some extra variables and log lines):
function rnd(n, d=2){
  let m = Math.pow(10,d)
  return Math.round(m*n) / m;
}

function f(A, N, C){
  // ps: running sum (+1 per green unit, -1 per other unit) recorded at the
  // left ('l') and right ('r') edge of each green interval.
  // Index 0 of ps and psBG is a sentinel; scanning starts at A[1].
  let ps = [[0,0]];
  // psBG[i]: total background length up to and including interval i
  let psBG = [0];
  let totalG = 0;

  for (let i=1; i<A.length; i++){
    let [l,r,c] = A[i];
    if (c == 'g'){
      totalG += r - l;
      let prevI = ps[ps.length-1][1];
      let d = l - A[prevI][1];
      let prevS = ps[ps.length-1][0];
      ps.push(
        [prevS - d, i, 'l'],
        [prevS - d + r - l, i, 'r']
      );
      psBG[i] = psBG[i-1];
    } else {
      psBG[i] = psBG[i-1] + r - l;
    }
  }

  // m[i][j]: best record using i output intervals among the first j green ones
  let m = new Array(N + 1);
  m[0] = new Array((ps.length >> 1) + 1);
  for (let i=0; i<m[0].length; i++)
    m[0][i] = [0,0];

  // for each in N
  for (let i=1; i<=N; i++){
    m[i] = new Array((ps.length >> 1) + 1);
    for (let ii=0; ii<m[0].length; ii++)
      m[i][ii] = [0,0];

    // for each interval
    for (let j=i; j<m[0].length; j++){
      m[i][j] = m[i][j-1];

      for (let k=j; k>i-1; k--){
        // our anchors are the right
        // side of each interval, k's are the left
        let jj = 2*j;
        let kk = 2*k - 1;

        // positive means green
        // is a majority
        if (ps[jj][0] - ps[kk][0] > 0){
          let bg = psBG[ps[jj][1]] - psBG[ps[kk][1]];
          let s = A[ps[jj][1]][1] - A[ps[kk][1]][0] - bg;
          let r = s / (bg + s);
          let r1 = C * s / totalG;
          let candidate = r + r1 + m[i-1][j-1][0];
          if (candidate > m[i][j][0]){
            // record layout: [score, 'left,right' indices, debug values..., k, previous score]
            m[i][j] = [
              candidate,
              ps[kk][1] + ',' + ps[jj][1],
              bg, s, r, r1,k,m[i-1][j-1][0]
            ];
          }
        }
      }
    }
  }

  for (const row of m)
    console.log(JSON.stringify(row.map(l => l.map(x => typeof x != 'number' ? x : rnd(x)))));

  // walk back through the table to recover the chosen intervals
  let result = new Array(N);
  let j = m[0].length - 1;
  for (let i=N; i>0; i--){
    let [_,idxs,w,x,y,z,k] = m[i][j];
    let [l,r] = idxs.split(',');
    result[i-1] = [A[l][0], A[r][1], 'g'];
    j = k - 1;
  }
  return result;
}
function show(A, last){
  // pad with the last input interval so that both strings have the same length
  if (last[1] != A[A.length-1][1])
    A = A.concat([last]);

  let s = '';
  for (let i=A.length-1; i>=0; i--){
    let [l, r, c] = A[i];
    let cc = c == 'g' ? 'X' : '.';
    for (let j=r-1; j>=l; j--)
      s = cc + s;
    // fill the gap to the previous interval with background
    if (i > 0)
      for (let j=l-1; j>=A[i-1][1]; j--)
        s = '.' + s
  }
  for (let j=A[0][0]-1; j>=0; j--)
    s = '.' + s

  console.log(s);
  return s;
}

function g(A, N, C){
  const ts = f(A, N, C);
  show(A, A[A.length-1]);
  show(ts, A[A.length-1]);
}

var a = [
  // input intervals as [start, end, 'g' or 'b'] go here
];

// (input, N, C)
g(a, 2, 2);
g(a, 3, 2);
g(a, 4, 2);
g(a, 4, 5);

I would suggest using K-means, an algorithm used to group data (a more detailed explanation here and here).
This is a brief sketch of how the function could look; I hope it is helpful.
from sklearn.cluster import KMeans
import numpy as np

def downsample(input, cluster=25):
    # group your intervals in a numpy array, one row [left, right] per interval
    X = np.array([[entry[0], entry[1]] for entry in input], dtype=float)
    # n_clusters will be the same as the desired output length
    # (it cannot exceed the number of input intervals)
    cluster = min(cluster, len(X))
    kmeans = KMeans(n_clusters=cluster, random_state=0).fit(X)
    # then iterate through the labels that were assigned to every entry of your input,
    # in our case the intervals
    kmeans_list = [[] for _ in range(cluster)]
    for i in range(0, X.shape[0]):
        kmeans_list[kmeans.labels_[i]].append(X[i])
    # after that you basically have a list of lists, and every inner list contains all
    # intervals that correspond to a specific label
    ret = []  # return list
    for label_list in kmeans_list:
        left = float("inf")  # larger than anything you will get as an input
        right = -left  # same here
        for entry in label_list:
            left = min(left, entry[0])
            right = max(right, entry[1])
        ret.append([left, right])
    return ret


Proving that there are no overlapping sub-problems?

I just got the following interview question:
Given a list of float numbers, insert “+”, “-”, “*” or “/” between each consecutive pair of numbers to find the maximum value you can get. For simplicity, assume that all operators are of equal precedence order and evaluation happens from left to right.
(1, 12, 3) -> 1 + 12 * 3 = 39
If we built a recursive solution, we would get an O(4^N) solution. I tried to find overlapping sub-problems (to increase the efficiency of this algorithm) and wasn't able to find any. The interviewer then told me that there weren't any overlapping subsolutions.
How can we detect when there are overlapping solutions and when there aren't? I spent a lot of time trying to "force" subsolutions to appear, and eventually the interviewer told me that there weren't any.
My current solution looks as follows:
def maximumNumber(array, current_value=None):
    if current_value is None:
        current_value = array[0]
        array = array[1:]
    if len(array) == 0:
        return current_value
    return max(
        maximumNumber(array[1:], current_value * array[0]),
        maximumNumber(array[1:], current_value - array[0]),
        maximumNumber(array[1:], current_value / array[0]),
        maximumNumber(array[1:], current_value + array[0])
    )
Looking for "overlapping subproblems" sounds like you're trying to do bottom up dynamic programming. Don't bother with that in an interview. Write the obvious recursive solution. Then memoize. That's the top down approach. It is a lot easier to get working.
You may get challenged on that. Here was my response the last time that I was asked about that.
There are two approaches to dynamic programming, top down and bottom up. The bottom up approach usually uses less memory but is harder to write. Therefore I do the top down recursive/memoize and only go for the bottom up approach if I need the last ounce of performance.
It is a perfectly true answer, and I got hired.
Now you may notice that tutorials about dynamic programming spend more time on bottom up. They often even skip the top down approach. They do that because bottom up is harder. You have to think differently. It does provide more efficient algorithms because you can throw away parts of that data structure that you know you won't use again.
Coming up with a working solution in an interview is hard enough already. Don't make it harder on yourself than you need to.
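As an illustration of "write the obvious recursion, then memoize", here is a minimal Python sketch for the question's problem. It is not the asker's code: the name maximum_number, the choice of functools.lru_cache as the memo, and skipping division when the operand is zero are my assumptions, and the input is assumed to be non-empty. The memo only helps when different operator choices reach the same (index, running value) state, for example via a 0 or 1 in the input.

```python
from functools import lru_cache

def maximum_number(nums):
    """Maximum value reachable with left-to-right evaluation of +, -, *, /."""
    nums = tuple(nums)

    @lru_cache(maxsize=None)
    def best(i, current):
        # All numbers consumed: the running value is the result.
        if i == len(nums):
            return current
        x = nums[i]
        candidates = [
            best(i + 1, current + x),
            best(i + 1, current - x),
            best(i + 1, current * x),
        ]
        if x != 0:  # assumption: division by zero is simply not allowed
            candidates.append(best(i + 1, current / x))
        return max(candidates)

    return best(1, nums[0])

print(maximum_number([1, 12, 3]))  # 39, from (1 + 12) * 3
```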
EDIT Here is the DP solution that the interviewer thought didn't exist.
def find_best(floats):
    # Maps every reachable value to the path (nested tuples of operators) that produced it.
    current_answers = {floats[0]: ()}
    floats = floats[1:]
    for f in floats:
        next_answers = {}
        for v, path in current_answers.items():
            next_answers[v + f] = (path, '+')
            next_answers[v * f] = (path, '*')
            next_answers[v - f] = (path, '-')
            if 0 != f:
                next_answers[v / f] = (path, '/')
        current_answers = next_answers
    best_val = max(current_answers.keys())
    return (best_val, current_answers[best_val])
Generally, the overlapping subproblem approach applies when the problem can be broken down into smaller subproblems whose solutions, when combined, solve the big problem. When these subproblems exhibit optimal substructure, DP is a good way to solve it.
The decision about what you do with a new number that you encounter has little to do with the numbers you have already processed, other than accounting for signs of course.
So I would say this is an overlapping subproblem solution but not a dynamic programming problem. You could use divide and conquer or even more straightforward recursive methods.
Initially let's forget about negative floats.
Process each new float according to the following rules:
If the new float is less than 1, insert a / before it
If the new float is more than 1 insert a * before it
If it is 1 then insert a +.
If you see a zero just don't divide or multiply
This would solve it for all positive floats.
Now let's handle the case of negative numbers thrown into the mix.
Scan the input once to figure out how many negative numbers you have.
Isolate all the negative numbers in a list and convert all the numbers whose absolute value is less than 1 to their multiplicative inverse. Then sort them by magnitude. If you have an even number of elements, we are all good. If you have an odd number of elements, store the head of this list in a special variable, say k, associate a processed flag with it, and set the flag to False.
Proceed as before with some updated rules:
If you see a negative number less than 0 but more than -1, insert a / before it
If you see a negative number less than -1, insert a * before it
If you see the special variable and the processed flag is False, insert a - before it. Set processed to True.
There is one more optimization you can perform, which is removing pairs of negative ones as candidates for blanket subtraction from our initial negative numbers list, but this is just an edge case and I'm pretty sure your interviewer won't care.
Now the sum is only a function of the number you are adding and not the sum you are adding to :)
Computing max/min results for each operation from previous step. Not sure about overall correctness.
Time complexity O(n), space complexity O(n)
const max_value = (nums) => {
    const ops = [(a, b) => a+b, (a, b) => a-b, (a, b) => a*b, (a, b) => a/b]
    // dp[i] holds one [max, min] pair per operator applied at position i
    const dp = Array.from({length: nums.length}, _ => [])
    dp[0] = Array.from({length: ops.length}, _ => [nums[0],nums[0]])
    for (let i = 1; i < nums.length; i++) {
        for (let j = 0; j < ops.length; j++) {
            // If current number is zero, skip division
            if (nums[i] === 0 && j === 3) continue
            let mx = -Infinity
            let mn = Infinity
            // Each operator is monotonic in its left operand, so only the
            // previous extremes can produce the new extremes.
            for (const [prevMax, prevMin] of dp[i-1]) {
                const opMax = ops[j](prevMax, nums[i])
                const opMin = ops[j](prevMin, nums[i])
                mx = Math.max(opMax, opMin, mx)
                mn = Math.min(opMax, opMin, mn)
            }
            dp[i].push([mx, mn])
        }
    }
    return Math.max(...dp[nums.length-1].map(v => Math.max(...v)))
}
// Tests
console.log(max_value([1, 12, 3]))
console.log(max_value([1, 0, 3]))
console.log(max_value([59, 60, -0.000001]))
console.log(max_value([0, 1, -0.0001, -1.00000001]))

How to select a uniformly distributed subset of a partially dense dataset?

P is an n*d matrix holding n d-dimensional samples. In some areas P is several times denser than in others. I want to select a subset of P in which the distance between any pair of samples is more than d0, and I need it to be spread all over the area. All samples have the same priority and there's no need to optimize anything (e.g. covered area or sum of pairwise distances).
Here is sample code that does so, but it's really slow. I need more efficient code since I need to call it several times.
%% generating sample data
n_4 = 1000; n_2 = n_4*2;n = n_4*4;
x1=[ randn(n_4, 1)*10+30; randn(n_4, 1)*3 + 60];
y1=[ randn(n_4, 1)*5 + 35; randn(n_4, 1)*20 + 80 ];
x2 = rand(n_2, 1)*(max(x1)-min(x1)) + min(x1);
y2 = rand(n_2, 1)*(max(y1)-min(y1)) + min(y1);
P = [x1,y1;x2, y2];
%% eliminating close ones
d0 = 1.5;
D = pdist2(P, P);D(1:n+1:end) = inf;
E = zeros(n, 1); % eliminated ones
for i=1:n-1
    if ~E(i)
        CloseOnes = (D(i,:)<d0) & ((1:n)>i) & (~E');
        E(CloseOnes) = 1;
    end
end
P2 = P(~E, :);
%% plotting samples
subplot(121); scatter(P(:, 1), P(:, 2)); axis equal;
subplot(122); scatter(P2(:, 1), P2(:, 2)); axis equal;
Edit: How big the subset should be?
As j_random_hacker pointed out in the comments, one could say that P(1, :) is the fastest answer if we don't define a constraint on the number of selected samples. It neatly exposes the incoherence of the title! But I think the current title better describes the purpose. So let's define a constraint: "Try to select m samples if possible". With the implicit assumption m=n, we get the biggest possible subset. As I mentioned before, a faster method is preferable to one that finds the optimal answer.
Finding closest points over and over suggests a different data structure that is optimized for spatial searches. I suggest a Delaunay triangulation.
The solution below is "approximate" in the sense that it will likely remove more points than strictly necessary. I'm batching all the computations and, in each iteration, removing all points that contribute to distances that are too short, and in many cases removing one point may remove an edge that appears later in the same iteration. If this matters, the edge list can be further processed to avoid duplicates, or even to find points to remove that will impact the greatest number of distances.
This is fast.
dt = delaunayTriangulation(P(:,1), P(:,2));
d0 = 1.5;
while 1
    edge = edges(dt); % vertex ids in pairs
    % Lookup the actual locations of each point and reorganize
    pwise = reshape(dt.Points(edge.', :), 2, size(edge,1), 2);
    % Compute length of each edge
    difference = pwise(1,:,:) - pwise(2,:,:);
    edge_lengths = sqrt(difference(1,:,1).^2 + difference(1,:,2).^2);
    % Find edges less than minimum length
    idx = find(edge_lengths < d0);
    if isempty(idx)
        break;
    end
    % pick first vertex of each too-short edge for deletion
    % This could be smarter to avoid overdeleting
    points_to_delete = unique(edge(idx, 1));
    % remove them. triangulation auto-updates
    dt.Points(points_to_delete, :) = [];
    % repeat until no edge is too short
end
P2 = dt.Points;
You don't specify how many points you want to select. This is crucial to the problem.
I don't readily see a way to optimise your method.
Assuming that Euclidean distance is acceptable as a distance measure, the following implementation is much faster when selecting only a small number of points, and still faster when trying to find the subset with 'all' valid points (note that finding the maximum possible number of points is hard).
subplot(121); scatter(P(:, 1), P(:, 2)); axis equal;
d0 = 1.5;
m_range = linspace(1, 2000, 100);
m_time = NaN(size(m_range));
for m_i = 1:length(m_range);
    m = m_range(m_i)
    a = tic;
    % Test points in random order.
    r = randperm(n);
    r_i = 1;
    S = false(n, 1); % selected ones
    for i=1:m
        found = false;
        while ~found
            j = r(r_i);
            r_i = r_i + 1;
            if r_i > n
                % We have tried all points. Nothing else can be valid.
                break;
            end
            if sum(S) == 0
                % This is the first point.
                found = true;
            else
                % Get the points already selected
                P_selected = P(S, :);
                % Exclude points >= d0 along either axis - they cannot have
                % a Euclidean distance less than d0.
                P_valid = (abs(P_selected(:, 1) - P(j, 1)) < d0) & (abs(P_selected(:, 2) - P(j, 2)) < d0);
                if sum(P_valid) == 0
                    % There are no points that can be < d0.
                    found = true;
                else
                    % Implement Euclidean distance explicitly rather than
                    % using pdist - this makes a large difference to
                    % timing.
                    found = min(sqrt(sum((P_selected(P_valid, :) - repmat(P(j, :), sum(P_valid), 1)) .^ 2, 2))) >= d0;
                end
            end
        end
        if found
            % We found a valid point - select it.
            S(j) = true;
        else
            % Nothing found, so we must have exhausted all points.
            break;
        end
    end
    P2 = P(S, :);
    m_time(m_i) = toc(a);
    subplot(122); scatter(P2(:, 1), P2(:, 2)); axis equal;
end
plot(m_range, m_time);
hold on;
plot(m_range([1 end]), ones(2, 1) * original_time);
hold off;
where original_time is the time taken by your method. This gives the following timings, where the red line is your method, and the blue is mine, with the number of points selected along the x axis. Note that the line flattens when 'all' points meeting the criteria have been selected.
As you say in your comment, performance is highly dependent on the value of d0. In fact, as d0 is reduced, the method above appears to have even greater improvement in performance (this is for d0=0.1):
Note however that this is also dependent on other factors such as the distribution of your data. This method exploits specific properties of your data set, and reduces the number of expensive calculations by filtering out points where calculating the Euclidean distance is pointless. This works particularly well for selecting fewer points, and it is actually faster for smaller d0 because there are fewer points in the data set that match the criteria (so there are fewer computations of the Euclidean distance required). The optimal solution for a problem like this will usually be specific to the exact data set used.
Also note that in my code above, manually calculating the Euclidean distance is much faster than calling pdist. The flexibility and generality of the Matlab built-ins is often detrimental to performance in simple cases.
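For readers who prefer NumPy, the same greedy idea (visit points in random order, reject a candidate that lies within d0 of an already selected point, and use a cheap per-axis test before computing exact distances) might look like the sketch below. It is a loose translation, not the answer's MATLAB code: the function name select_spread, the seed argument, and the optional cap m are assumptions.

```python
import numpy as np

def select_spread(P, d0, m=None, seed=0):
    """Greedily pick rows of P so that all pairwise distances are >= d0."""
    rng = np.random.default_rng(seed)
    selected = []  # indices of accepted points
    for j in rng.permutation(len(P)):
        if m is not None and len(selected) >= m:
            break
        if selected:
            S = P[selected]
            # Cheap prefilter: points at least d0 away along any axis cannot
            # be closer than d0 in Euclidean distance.
            near = np.all(np.abs(S - P[j]) < d0, axis=1)
            if near.any():
                dist = np.sqrt(((S[near] - P[j]) ** 2).sum(axis=1))
                if dist.min() < d0:
                    continue  # too close to an accepted point
        selected.append(j)
    return P[selected]

P = np.random.default_rng(1).random((500, 2)) * 10
P2 = select_spread(P, 1.0)
```

As in the answer, the result is a valid subset but not necessarily the largest possible one, and the running time depends strongly on d0 and on the distribution of the data.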

Randomly pick elements from a vector of counts

I'm currently trying to optimize some MATLAB/Octave code by means of an algorithmic change, but can't figure out how to deal with some randomness here. Suppose that I have a vector V of integers, with each element representing a count of some things, photons in my case. Now I want to randomly pick some amount of those "things" and create a new vector of the same size, but with the counts adjusted.
Here's how I do this at the moment:
function W = photonfilter(V, eff)
% W = photonfilter(V, eff)
% Randomly takes photons from V according to the given efficiency.
% Args:
% V: Input vector containing the number of emitted photons in each
% timeslot (one element is one timeslot). The elements are rounded
% to integers before processing.
% eff: Filter efficiency. On the average, every 1/eff photon will be
% taken. This value must be in the range 0 < eff <= 1.
% W: Output row vector with the same length as V and containing the number
% of received photons in each timeslot.
% WARNING: This function operates on a photon-by-photon basis in that it
% constructs a vector with one element per photon. The storage requirements
% therefore directly depend on sum(V), not only on the length of V.
% Round V and make it flat.
Ntot = length(V);
V = round(V);
V = V(:);
% Initialize the photon-based vector, so that each element contains
% the original index of the photon.
idxV = zeros(1, sum(V), 'uint32');
iout = 1;
for i = 1:Ntot
  N = V(i);
  idxV(iout:iout+N-1) = i;
  iout = iout + N;
end
% Take random photons.
idxV = idxV(randperm(length(idxV)));
idxV = idxV(1:round(length(idxV)*eff));
% Generate the output vector by placing the remaining photons back
% into their timeslots.
[W, trash] = hist(idxV, 1:Ntot);
This is a rather straightforward implementation of the description above. But it has an obvious performance drawback: The function creates a vector (idxV) containing one element per single photon. So if my V has only 1000 elements but an average count of 10000 per element, the internal vector will have 10 million elements making the function slow and heavy.
What I'd like to achieve now is not to directly optimize this code, but to use some other kind of algorithm which immediately calculates the new counts without giving each photon some kind of "identity". This must be possible somehow, but I just can't figure out how to do it.
The output vector W must have the same number of elements as the input vector V.
W(i) must be an integer and bounded by 0 <= W(i) <= V(i).
The expected value of sum(W) must be sum(V)*eff.
The algorithm must somehow implement this "random picking" of photons, i.e. there should not be some deterministic part like "run through V dividing all counts by the stepsize and propagating the remainders", as the whole point of this function is to bring randomness into the system.
An explicit loop over V is allowed if unavoidable, but a vectorized approach is preferable.
Any ideas how to implement something like this? A solution using only a random vector and then some trickery with probabilities and rounding would be ideal, but I haven't had any success with that so far.
Thanks! Best regards, Philipp
The method you employ to compute W is called the Monte Carlo method, and indeed some optimizations are possible. One of them: instead of calculating indices of photons, let's imagine a set of bins. Each bin has some probability, and the probabilities of all bins add up to 1. We divide the segment [0, 1] into parts whose lengths are proportional to the probabilities of the bins. Now for every random number within [0, 1) that we generate, we can quickly find the bin that it belongs to. Finally, we count the numbers in the bins to obtain the final result. The code below illustrates the idea.
% Population size (number of photons).
N = 1000000;
% Sample size, size of V and W as well.
% For convenience of plotting, V and W are of the same size, but
% the algorithm doesn't enforce this constraint.
M = 10000;
% Number of Monte Carlo iterations, greater numbers give better quality.
K = 100000;
% Generate population of counts, use gaussian distribution to test the method.
% If implemented correctly histograms should have the same shape eventually.
V = hist(randn(1, N), M);
P = cumsum(V / sum(V));
% For every generated random value find its bin and then count the bins.
% Finally we normalize counts by the ratio N / K.
W = hist(lookup(P, rand(1, K)), M) * N / K;
% Compare distribution plots, they should be the same.
hold on;
plot(W, '+r');
plot(V, '*b');
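The same bin-lookup idea can be sketched in Python with NumPy, where np.searchsorted plays the role of Octave's lookup. This is an illustrative translation rather than the answer's code: the name photonfilter_mc and the seed argument are assumptions, and the cumulative counts are left unnormalized so that no rounding can push a draw past the last bin.

```python
import numpy as np

def photonfilter_mc(V, eff, seed=None):
    """Draw round(sum(V) * eff) photons, each landing in bin i with probability V[i] / sum(V)."""
    rng = np.random.default_rng(seed)
    V = np.round(np.asarray(V, dtype=float)).astype(np.int64).ravel()
    cum = np.cumsum(V)                    # right edges of the bins on [0, sum(V))
    k = int(round(cum[-1] * eff))         # number of photons to keep
    u = rng.random(k) * cum[-1]           # uniform draws on [0, sum(V))
    bins = np.searchsorted(cum, u, side="right")  # bin i when cum[i-1] <= u < cum[i]
    return np.bincount(bins, minlength=len(V))

W = photonfilter_mc([0, 10, 0, 30], 0.5, seed=0)
print(W, W.sum())
```

Empty timeslots never receive photons and sum(W) is exactly round(sum(V) * eff), but the draws are made with replacement, so an individual W(i) can exceed V(i) when eff is close to 1.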
Based on the answer from Alexander Solovets, this is how the code now looks:
function W = photonfilter(V, eff, impl)
  % impl defaults to 1 (Monte Carlo approach).
  if nargin < 3
    impl = 1;
  end
  Ntot = length(V);
  V = V(:);
  if impl == 0
    % Original "straightforward" solution.
    V = round(V);
    idxV = zeros(1, sum(V), 'uint32');
    iout = 1;
    for i = 1:Ntot
      N = V(i);
      idxV(iout:iout+N-1) = i;
      iout = iout + N;
    end
    idxV = idxV(randperm(length(idxV)));
    idxV = idxV(1:round(length(idxV)*eff));
    [W, trash] = hist(idxV, 1:Ntot);
  else
    % Monte Carlo approach.
    Nphot = sum(V);
    P = cumsum(V / Nphot);
    W = hist(lookup(P, rand(1, round(Nphot * eff))), 0:Ntot-1);
  end
end
The results are quite comparable, as long as eff is not too close to 1 (with eff=1, the original solution yields W=V while the Monte Carlo approach still has some randomness, thereby violating the upper bound constraints).
Test in the interactive Octave shell:
octave:1> T=linspace(0,10*pi,10000);
octave:2> V=100*(1+sin(T));
octave:3> W1=photonfilter(V, 0.1, 0);
octave:4> W2=photonfilter(V, 0.1, 1);
octave:5> plot(T,V,T,W1,T,W2);
octave:6> legend('V','Random picking','Monte Carlo')
octave:7> sum(W1)
ans = 100000
octave:8> sum(W2)
ans = 100000
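If the upper bound W(i) <= V(i) matters, one alternative that is not taken from either answer is to draw the counts directly from a multivariate hypergeometric distribution. Picking round(sum(V)*eff) photons without replacement, as the original per-photon code does, follows exactly this distribution, so no per-photon vector is needed. The sketch below assumes Python with NumPy 1.18 or newer; the name photonfilter_exact and the seed argument are illustrative.

```python
import numpy as np

def photonfilter_exact(V, eff, seed=None):
    """Counts of photons kept per timeslot when sampling without replacement."""
    rng = np.random.default_rng(seed)
    V = np.round(np.asarray(V, dtype=float)).astype(np.int64).ravel()
    nsample = int(round(V.sum() * eff))
    # Each timeslot is a "color" with V[i] items; draw nsample items in total.
    return rng.multivariate_hypergeometric(V, nsample)

V = [0, 10, 0, 30]
W = photonfilter_exact(V, 0.5, seed=0)
print(W, W.sum())
```

By construction every W(i) is an integer between 0 and V(i), and sum(W) equals round(sum(V)*eff).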

How to calculate iteratively the running weighted average so that last values to weight most?

I want to implement an iterative algorithm that calculates a weighted average. The specific weighting law does not matter, but the weight should be close to 1 for the newest values and close to 0 for the oldest.
The algorithm should be iterative, i.e. it should not remember all previous values. It should know only the newest value and some aggregate information about the past, like previous values of the average, sums, counts etc.
Is it possible?
For example, the algorithm could be the following:
void iterate(double value) {
   sum *= 0.99;
   sum += value;
   count++;
   avg = sum / count;
}
It gives exponentially decreasing weights, which may not be good. Is it possible to have step-wise decreasing weights or something similar?
The requirements for the weighting law are as follows:
1) The weight decreases into the past
2) It has some mean or characteristic duration, so that values older than this duration matter much less than newer ones
3) I should be able to set this duration
I need the following. Suppose v_i are the values, where v_1 is the first. Also suppose w_i are the weights, but w_0 is the weight of THE LAST value.
So, after the first value arrives, I have the first average
a_1 = v_1 * w_0
After the second value v_2 arrives, I should have the average
a_2 = v_1 * w_1 + v_2 * w_0
With next value I should have
a_3 = v_1 * w_2 + v_2 * w_1 + v_3 * w_0
Note that the weight profile moves with me as I move along the value sequence.
I.e. each value does not keep its own weight all the time. My goal is to have this weight decrease while going into the past.
First a bit of background. If we were keeping a normal average, it would go like this:
average(a) = a
average(a,b) = (average(a)+b)/2
average(a,b,c) = (average(a,b)*2 + c)/3
average(a,b,c,d) = (average(a,b,c)*3 + d)/4
As you can see here, this is an "online" algorithm and we only need to keep track of two pieces of data: 1) the total count of numbers in the average, and 2) the average itself. Then we can undivide the average by the total, add in the new number, and divide it by the new total.
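A minimal Python sketch of this plain online average, keeping only the count and the current average. The name make_running_average is illustrative and not part of the original answer.

```python
def make_running_average():
    count = 0
    average = 0.0

    def update(x):
        nonlocal count, average
        # "Undivide" by the old count, add the new value, divide by the new count.
        average = (average * count + x) / (count + 1)
        count += 1
        return average

    return update

update = make_running_average()
print([update(x) for x in (2, 4, 9)])  # [2.0, 3.0, 5.0]
```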
Weighted averages are a bit different. It depends on what kind of weighted average. For example if you defined:
weightedAverage(a,wa, b,wb, c,wc, ..., z,wz) = a*wa + b*wb + c*wc + ... + z*wz
weightedAverage(elements, weights) = elements·weights
...then you don't need to do anything besides add the new element*weight! If however you defined the weighted average akin to an expected-value from probability:
weightedAverage(elements,weights) = elements·weights / sum(weights)
...then you'd need to keep track of the total weights. Instead of undividing by the total number of elements, you undivide by the total weight, add in the new element*weight, then divide by the new total weight.
Alternatively you don't need to undivide, as demonstrated below: you can merely keep track of the temporary dot product and weight total in a closure or an object, and divide it as you yield (this can help a lot with avoiding numerical inaccuracy from compounded rounding errors).
In python this would be:
def makeAverager():
    dotProduct = 0
    totalWeight = 0
    def averager(newValue, weight):
        nonlocal dotProduct,totalWeight
        dotProduct += newValue*weight
        totalWeight += weight
        return dotProduct/totalWeight
    return averager
>>> averager = makeAverager()
>>> [averager(value,w) for value,w in [(100,0.2), (50,0.5), (100,0.1)]]
[100.0, 64.28571428571429, 68.75]
>>> averager(10,1.1)
>>> averager(10,1.1)
>>> averager(30,2.0)
> But my task is to have average recalculated each time new value arrives having old values reweighted. –OP
Your task is almost always impossible, even with exceptionally simple weighting schemes.
You are asking to, with O(1) memory, yield averages with a changing weighting scheme. For example, {values·weights1, (values+[newValue2])·weights2, (values+[newValue2,newValue3])·weights3, ...} as new values are being passed in, for some nearly arbitrarily changing weights sequence. This is impossible because the aggregation is not injective: once you merge the numbers together, you lose a massive amount of information. For example, even if you had the weight vector, you could not recover the original value vector, or vice versa. There are only two cases I can think of where you could get away with this:
Constant weights such as [2,2,2,...2]: this is equivalent to an on-line averaging algorithm, which you don't want because the old values are not being "reweighted".
The relative weights of previous answers do not change. For example you could do weights of [8,4,2,1], and add in a new element with arbitrary weight like ...+[1], but you must increase all the previous by the same multiplicative factor, like [16,8,4,2]+[1]. Thus at each step, you are adding a new arbitrary weight, and a new arbitrary rescaling of the past, so you have 2 degrees of freedom (only 1 if you need to keep your dot-product normalized). The weight-vectors you'd get would look like:
[w0*(s1), w1]
[w0*(s1*s2), w1*(s2), w2]
[w0*(s1*s2*s3), w1*(s2*s3), w2*(s3), w3]
Thus any weighting scheme you can make look like that will work (unless you need to keep the thing normalized by the sum of weights, in which case you must then divide the new average by the new sum, which you can calculate by keeping only O(1) memory). Merely multiply the previous average by the new s (which will implicitly distribute over the dot-product into the weights), and tack on the new +w*newValue.
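As a rough illustration of the "rescale the past, then tack on the new term" rule, here is a sketch that compares the O(1) update against a brute-force dot product over the full weight vector. The function names and the sample numbers are made up for this example; scales[k] plays the role of s_k and new_weights[k] the role of w_k above.

```python
def rescaling_sums(values, new_weights, scales):
    """Yield the running weighted sum after each value, using O(1) memory.

    At step k the whole past is multiplied by scales[k] and the new value
    enters with weight new_weights[k].
    """
    acc = 0.0
    for v, w, s in zip(values, new_weights, scales):
        acc = acc * s + w * v
        yield acc


def brute_force_sum(values, new_weights, scales, k):
    """Reference: rebuild the full weight vector for step k, then take the dot product."""
    total = 0.0
    for i in range(k + 1):
        weight = new_weights[i]
        for s in scales[i + 1:k + 1]:   # every later rescaling hits this weight
            weight *= s
        total += weight * values[i]
    return total


values = [3.0, 1.0, 4.0, 1.0, 5.0]
new_weights = [1.0, 2.0, 1.0, 3.0, 1.0]
scales = [1.0, 0.5, 0.25, 0.5, 2.0]   # scales[0] only ever multiplies the empty past

online = list(rescaling_sums(values, new_weights, scales))
reference = [brute_force_sum(values, new_weights, scales, k)
             for k in range(len(values))]
```

If a normalized average is needed, the same update applied to a running sum of weights (total = total * s + w) should give the denominator, still in O(1) memory.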
I think you are looking for something like this:
void iterate(double value) {
    count++;
    weight = max(0.0, 1 - (count / 1000.0));
    // avg * total_weight is the weighted sum accumulated so far
    avg = (avg * total_weight + weight * value) / (total_weight + weight);
    total_weight += weight;
}
Here I'm assuming you want the weights to sum to 1. As long as you can generate a relative weight without it changing in the future, you can end up with a solution which mimics this behavior.
That is, suppose you defined your weights as a sequence {s_0, s_1, s_2, ..., s_n, ...} and defined the input as sequence {i_0, i_1, i_2, ..., i_n}.
Consider the form: sum(s_0*i_0 + s_1*i_1 + s_2*i_2 + ... + s_n*i_n) / sum(s_0 + s_1 + s_2 + ... + s_n). Note that it is trivially possible to compute this incrementally with a couple of aggregation counters:
int counter = 0;
double numerator = 0;
double denominator = 0;
void addValue(double val)
{
    double weight = calculateWeightFromCounter(counter);
    numerator += weight * val;
    denominator += weight;
    counter++;  // advance so the next value gets the next weight
}
double getAverage()
{
    if (denominator == 0.0) return 0.0;
    return numerator / denominator;
}
Of course, calculateWeightFromCounter() in this case shouldn't generate weights that sum to one -- the trick here is that we average by dividing by the sum of the weights so that in the end, the weights virtually seem to sum to one.
The real trick is how you do calculateWeightFromCounter(). You could simply return the counter itself, for example, however note that the last weighted number would not be near the sum of the counters necessarily, so you may not end up with the exact properties you want. (It's hard to say since, as mentioned, you've left a fairly open problem.)
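To make the choice of calculateWeightFromCounter() more concrete, here is a small sketch with two candidate weight functions. The class name and both functions are assumptions chosen for illustration: a linear weight (counter + 1) and a geometrically growing weight (2 ** counter), both of which favour newer values once divided by the sum of weights.

```python
class CounterWeightedAverage:
    """Weighted average where each weight depends only on the arrival index."""

    def __init__(self, weight_from_counter):
        self.weight_from_counter = weight_from_counter
        self.counter = 0
        self.numerator = 0.0
        self.denominator = 0.0

    def add_value(self, value):
        weight = self.weight_from_counter(self.counter)
        self.numerator += weight * value
        self.denominator += weight
        self.counter += 1

    def get_average(self):
        if self.denominator == 0.0:
            return 0.0
        return self.numerator / self.denominator


linear = CounterWeightedAverage(lambda n: n + 1)       # weights 1, 2, 3, ...
growing = CounterWeightedAverage(lambda n: 2.0 ** n)   # weights 1, 2, 4, ...
for v in [10.0, 20.0, 30.0]:
    linear.add_value(v)
    growing.add_value(v)

linear_result = linear.get_average()    # (10 + 40 + 90) / 6
growing_result = growing.get_average()  # (10 + 40 + 120) / 7
```

Growing weights like these increase without bound, so on a long stream the geometric variant will eventually overflow; rescaling numerator and denominator by the same factor from time to time is one way around that.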
This is too long to post in a comment, but it may be useful to know.
Suppose you have:
w_0*v_n + ... + w_n*v_0 (we'll call this w[0..n]*v[n..0] for short)
Then the next step is:
w_0*v_(n+1) + ... + w_(n+1)*v_0 (and this is w[0..n+1]*v[n+1..0] for short)
This means we need a way to calculate w[1..n+1]*v[n..0] from w[0..n]*v[n..0].
It's certainly possible that v[n..0] is 0, ..., 0, z, 0, ..., 0 where z is at some location x.
If we don't have any 'extra' storage, then f(z*w(x))=z*w(x + 1) where w(x) is the weight for location x.
Rearranging the equation, w(x + 1) = f(z*w(x))/z. Well, w(x + 1) better be constant for a constant x, so f(z*w(x))/z better be constant. Hence, f must let z propagate -- that is, f(z*w(x)) = z*f(w(x)).
But here again we have an issue. Note that if z (which could be any number) can propagate through f, then w(x) certainly can. So f(z*w(x)) = w(x)*f(z). Combined with f(z*w(x)) = z*f(w(x)), this gives f(w(x)) = w(x)*f(z)/z.
But for a constant x, w(x) is constant, and thus f(w(x)) better be constant, too. w(x) is constant, so f(z)/z better be constant so that w(x)*f(z)/z is constant. Thus f(w(x)) = c*w(x) where c is a constant.
So, f(x)=c*x where c is a constant when x is a weight value.
So w(x+1) = c*w(x).
That is, each weight is a multiple of the previous. Thus, the weights take the form w(x)=m*b^x.
Note that this assumes the only information f has is the last aggregated value. Note that at some point you will be reduced to this case unless you're willing to store a non-constant amount of data representing your input. You cannot represent an infinite length vector of real numbers with a real number, but you can approximate them somehow in a constant, finite amount of storage. But this would merely be an approximation.
Although I haven't rigorously proven it, it is my conclusion that what you want is impossible to do with a high degree of precision, but you may be able to use log(n) space (which may as well be O(1) for many practical applications) to generate a quality approximation. You may be able to use even less.
I tried to practically code something (in Java). As has been said, your goal is not achievable. You can only compute the average from some number of the last remembered values. If you don't need to be exact, you can approximate the older values. I tried to do it by remembering the last 5 values exactly and the older values only SUMmed in groups of 5, remembering the last 5 SUMs. Then, the complexity is O(2n) for remembering the last n+n*n values. This is a very rough approximation.
You can modify the "lastValues" and "lastAggregatedSums" array sizes as you want. See this ascii-art picture trying to display a graph of the last values, showing that the first columns (older data) are remembered as aggregated values (not individually), and only the most recent 5 values are remembered individually.
##### ##### #
##### ##### ##### # #
##### ##### ##### ##### ## ##
##### ##### ##### ##### ##### #####
time: --->
Challenge 1: My example doesn't use weights, but I think it shouldn't be a problem for you to add weights for the "lastAggregatedSums" appropriately. The only problem is that if you want lower weights for older values, it would be harder, because the array is rotating, so it is not straightforward to know which weight belongs to which array member. Maybe you can modify the algorithm to always "shift" values in the array instead of rotating? Then adding weights shouldn't be a problem.
Challenge 2: The arrays are initialized with 0 values, and those values count towards the average from the beginning, even when we haven't received enough values. If you are running the algorithm for a long time, you probably don't mind that it is learning for some time at the beginning. If you do, you can post a modification ;-)
public class AverageCounter {
    private float[] lastValues = new float[5];
    private float[] lastAggregatedSums = new float[5];
    private int valIdx = 0;
    private int aggValIdx = 0;
    private float avg;

    public void add(float value) {
        lastValues[valIdx++] = value;
        if(valIdx == lastValues.length) {
            // sum the last values and save the sum into the aggregated array.
            float sum = 0;
            for(float v: lastValues) {sum += v;}
            lastAggregatedSums[aggValIdx++] = sum;
            if(aggValIdx >= lastAggregatedSums.length) {
                // rotate aggregated values index
                aggValIdx = 0;
            }
            valIdx = 0;
        }
        float sum = 0;
        for(float v: lastValues) {sum += v;}
        for(float v: lastAggregatedSums) {sum += v;}
        avg = sum / (lastValues.length + lastAggregatedSums.length * lastValues.length);
    }

    public float getAvg() {
        return avg;
    }
}
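For what it's worth, here is one possible take on the two challenges, sketched in Python rather than Java. It is a variation on the idea, not the code above: the class name, the decay parameter and the use of collections.deque are my assumptions. Older blocks are shifted instead of rotated, so each block can get a weight based on its age, and only values that actually arrived are counted.

```python
from collections import deque


class BlockedAverage:
    """Keeps the newest values exactly and older history as per-block sums."""

    def __init__(self, block=5, max_blocks=5, decay=0.5):
        self.block = block
        self.decay = decay        # weight ratio between neighbouring blocks
        self.recent = []          # values not yet folded into a block sum
        # Index 0 is the newest block; the oldest one falls off the right end.
        self.block_sums = deque(maxlen=max_blocks)

    def add(self, value):
        self.recent.append(value)
        if len(self.recent) == self.block:
            self.block_sums.appendleft(sum(self.recent))
            self.recent = []

    def average(self):
        # Recent values get weight 1; each older block gets `decay` times less.
        numerator = float(sum(self.recent))
        denominator = float(len(self.recent))
        weight = self.decay
        for block_sum in self.block_sums:
            numerator += weight * block_sum
            denominator += weight * self.block
            weight *= self.decay
        return numerator / denominator if denominator else 0.0


b = BlockedAverage(block=2, max_blocks=2, decay=0.5)
for v in [1, 2, 3, 4, 5]:
    b.add(v)
after_five = b.average()   # (5 + 0.5*7 + 0.25*3) / (1 + 0.5*2 + 0.25*2)
b.add(6)
after_six = b.average()    # (0.5*11 + 0.25*7) / (0.5*2 + 0.25*2)
```

Memory stays bounded by block + max_blocks numbers, and the weight profile is a staircase: flat inside a block, dropping by the decay factor from one block to the next.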
You can combine (as a weighted sum) exponential means with different effective window sizes (N) in order to get the desired weights.
Use more exponential means to define your weight profile in more detail.
(More exponential means also means more values to store and calculate, so that is the trade-off.)
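A small sketch of what such a combination could look like. The class name, the smoothing constants and the mixing coefficients are illustrative assumptions; each exponential mean is updated as mean = (1 - P) * mean + P * value, and the output is a fixed weighted sum of the means.

```python
class CombinedEma:
    """Weighted sum of several exponential means, each with its own smoothing constant."""

    def __init__(self, smoothing, mix):
        self.smoothing = list(smoothing)  # one P per exponential mean, 0 < P <= 1
        self.mix = list(mix)              # how much each mean contributes
        self.means = [0.0] * len(self.smoothing)

    def add(self, value):
        self.means = [(1 - p) * m + p * value
                      for p, m in zip(self.smoothing, self.means)]
        return sum(c * m for c, m in zip(self.mix, self.means))

    def weight(self, age):
        """Effective weight of the value that arrived `age` steps ago."""
        return sum(c * p * (1 - p) ** age
                   for c, p in zip(self.mix, self.smoothing))


values = [3.0, 1.0, 4.0, 1.0, 5.0]
ema = CombinedEma(smoothing=[0.5, 0.1], mix=[0.5, 0.5])
for v in values:
    combined = ema.add(v)

# Brute-force check: apply the effective weight profile to the stored history.
brute = sum(ema.weight(age) * v for age, v in enumerate(reversed(values)))
```

The fast mean (P = 0.5) dominates the weight of recent values and the slow one (P = 0.1) supplies a long tail, so the combined profile is no longer a single exponential. Memory is one number per exponential mean.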
A memoryless solution is to calculate the new average from a weighted combination of the previous average and the new value:
average = (1 - P) * average + P * value
where P is an empirical constant, 0 <= P <= 1
expanding gives:
average = sum over i of (weight[i] * value[i])
where value[0] is the newest value, and
weight[i] = P * (1 - P) ^ i
When P is low, historical values are given higher weighting.
The closer P gets to 1, the more quickly it converges to newer values.
When P = 1, it's a regular assignment and ignores previous values.
If you want to maximise the contribution of value[N], maximize
weight[N] = P * (1 - P) ^ N
where 0 <= P <= 1
I discovered weight[N] is maximized when
P = 1 / (N + 1)
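The result follows from setting the derivative of P * (1 - P) ^ N with respect to P, which is (1 - P) ^ N - N * P * (1 - P) ^ (N - 1), to zero, giving 1 - P = N * P. A quick numerical sanity check, with hypothetical helper names and an arbitrary N = 9:

```python
def ema_weight(p, n):
    """Weight that the value from n steps ago carries in the exponential average."""
    return p * (1 - p) ** n


def best_p_by_grid(n, steps=10000):
    """Brute-force search for the P in [0, 1] that maximizes ema_weight(P, n)."""
    candidates = (i / steps for i in range(steps + 1))
    return max(candidates, key=lambda p: ema_weight(p, n))


n = 9
closed_form = 1 / (n + 1)
grid = best_p_by_grid(n)
print(closed_form, grid)
```

The grid search only confirms the location of the maximum to within the grid spacing; it is a check, not a proof.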

Suggestion on algorithm to distribute objects of different value

I have the following problem:
Given N objects (N < 30) of different values multiple of a "k" constant i.e. k, 2k, 3k, 4k, 6k, 8k, 12k, 16k, 24k and 32k, I need an algorithm that will distribute all items to M players (M <= 6) in such a way that the total value of the objects each player gets is as even as possible (in other words, I want to distribute all objects to all players in the fairest way possible).
EDIT: By fairest distribution I mean that the difference between the value of the objects any two players get is minimal.
Another similar case would be: I have N coins of different values and I need to divide them equally among M players; sometimes they don't divide exactly and I need to find the next best case of distribution (where no player is angry because another one got too much money).
I don't need (pseudo)code to solve this (also, this is not a homework :) ), but I'll appreciate any ideas or links to algorithms that could solve this.
The problem is strongly NP-complete. This means no known algorithm is guaranteed to find the optimal solution in reasonable (polynomial) time for every input. (See 3-partition-problem, thanks Paul).
Instead you'll wanna go for a good approximate solution generator. These can often get very close to the optimal answer in very short time. I can recommend the Simulated Annealing technique, which you will also be able to use for a ton of other NP-complete problems.
The idea is this:
Distribute the items randomly.
Continually make random swaps between two random players, as long as it makes the system more fair, or only a little less fair (see the wiki for details).
Stop when you have something fair enough, or you have run out of time.
This solution is much stronger than the 'greedy' algorithms many suggest. The greedy algorithm is the one where you continuously add the largest item to the 'poorest' player. An example of a testcase where greedy fails is [10,9,8,7,7,5,5].
I did an implementation of SA for you. It follows the wiki article strictly, for educational purposes. If you optimize it, I would say a 100x improvement wouldn't be unrealistic.
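import random, math
values = [10,9,8,7,7,5,5]
M = 3          # number of players
kmax = 1000    # maximum number of iterations
emax = 0       # stop early once the energy is this low
def s0():
    # initial state: give every item to a random player
    s = [[] for i in range(M)]
    for v in values:
        random.choice(s).append(v)
    return s
def E(s):
    # energy: squared deviation of each player's total from the ideal share
    avg = sum(values)/M
    return sum(abs(avg-sum(p))**2 for p in s)
def neighbour(s):
    # move one random item from a non-empty player p1 to another player p2
    snew = [p[:] for p in s]
    while True:
        p1, p2 = random.sample(range(M),2)
        if s[p1]: break
    item = random.randrange(len(s[p1]))
    snew[p2].append(snew[p1].pop(item))
    return snew
def P(e, enew, T):
    # acceptance probability: always take improvements, sometimes take worse states
    if enew < e: return 1
    return math.exp((e - enew) / T)
def temp(r):
    # temperature falls linearly from 100 towards 0 as r goes from 0 to 1
    return (1-r)*100
s = s0()
e = E(s)
sbest = s
ebest = e
k = 0
while k < kmax and e > emax:
    snew = neighbour(s)
    enew = E(snew)
    if enew < ebest:
        sbest = snew; ebest = enew
    if P(e, enew, temp(k/kmax)) > random.random():
        s = snew; e = enew
    k += 1
print(sbest)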
Update: After playing around with Branch'n'Bound, I now believe this method to be superior, as it gives perfect results for the N=30, M=6 case within a second. However I guess you could play around with the simulated annealing approach just as much.
The greedy solution suggested by a few people seems like the best option, I ran it a bunch of times with some random values, and it seems to get it right every time.
If it's not optimal, it's at the very least very close, and it runs in O(nm) or so (I can't be bothered to do the math right now)
C# Implementation:
static List<List<int>> Dist(int n, IList<int> values)
{
    var result = new List<List<int>>();
    for (int i = 1; i <= n; i++)
        result.Add(new List<int>());
    var sortedValues = values.OrderByDescending(val => val);
    foreach (int val in sortedValues)
    {
        // give the largest remaining value to the player with the lowest total
        var lowest = result.OrderBy(a => a.Sum()).First();
        lowest.Add(val);
    }
    return result;
}
how about this:
order the k values.
order the players.
loop over the k values giving the next one to the next player.
when you get to the end of the players, turn around and continue giving the k values to the players in the opposite direction.
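A short sketch of this back-and-forth dealing, assuming the values are ordered from largest to smallest (the steps above don't say which direction) and using a made-up function name:

```python
def snake_distribute(values, m):
    """Deal values to m players, reversing direction at the end of each pass."""
    players = [[] for _ in range(m)]
    ordered = sorted(values, reverse=True)   # assumption: largest first
    for i, v in enumerate(ordered):
        round_no, pos = divmod(i, m)
        # Even passes go left to right, odd passes right to left.
        idx = pos if round_no % 2 == 0 else m - 1 - pos
        players[idx].append(v)
    return players


hands = snake_distribute([10, 9, 8, 7, 7, 5, 5], 3)
totals = [sum(h) for h in hands]
print(hands)   # [[10, 5, 5], [9, 7], [8, 7]]
print(totals)  # [20, 16, 15]
```

On this input the totals come out as 20, 16 and 15, while an even 17/17/17 split exists ([10,7], [9,8], [7,5,5]), so this is a quick heuristic rather than an optimal method.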
Repeatedly give the available object with the largest value to the player who has the least total value of objects assigned to him.
This is a straight-forward implementation of Justin Peel's answer:
M = 3
players = [[] for i in range(M)]
values = [10,4,3,1,1,1]
for v in values:
    # pick the player with the smallest total so far and give them the next value
    lowest=sorted(players, key=lambda x: sum(x))[0]
    lowest.append(v)
print(players)
print([sum(p) for p in players])
I am a beginner with Python, but it seems to work okay. This example will print
[[10], [4, 1], [3, 1, 1]]
[10, 5, 5]
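30 ^ 6 isn't that large (it's less than 1 billion), but the number of possible allocations of N objects to M players is M ^ N, which is 6 ^ 30 (about 2.2 * 10^23) in the largest case. For small N, go through every possible allocation, and pick the one that's the fairest by whatever measure you define.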
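A sketch of such an exhaustive search, assuming "fairest" means the smallest difference between the richest and the poorest player (the function name is made up). The loop visits m ** len(values) assignments, which is 3 ^ 7 = 2187 for the small example below, so it is only practical for small inputs.

```python
from itertools import product


def fairest_by_enumeration(values, m):
    """Try every assignment of objects to players; keep the smallest max-min spread."""
    best_spread, best_assignment = None, None
    for assignment in product(range(m), repeat=len(values)):
        totals = [0] * m
        for value, player in zip(values, assignment):
            totals[player] += value
        spread = max(totals) - min(totals)
        if best_spread is None or spread < best_spread:
            best_spread, best_assignment = spread, assignment
    # Turn the winning assignment back into one list of values per player.
    players = [[] for _ in range(m)]
    for value, player in zip(values, best_assignment):
        players[player].append(value)
    return best_spread, players


spread, players = fairest_by_enumeration([10, 9, 8, 7, 7, 5, 5], 3)
print(spread, [sum(p) for p in players])
```

For this input a perfectly even split exists (every player gets 17), so the reported spread is 0.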
The purpose was to use the greedy solution with a small improvement in the implementation, which is perhaps clearer in C#:
static List<List<int>> Dist(int n, IList<int> values)
{
    var result = new List<List<int>>();
    for (int i = 1; i <= n; i++)
        result.Add(new List<int>());
    var sortedValues = values.OrderByDescending(val => val);//Assume the most efficient sorting algorithm - O(N log(N))
    foreach (int val in sortedValues)
    {
        var lowest = result.OrderBy(a => a.Sum()).First();//This can be done in O(M * log(n)) [M - size of sortedValues, n - size of result]
        lowest.Add(val);
    }
    return result;
}
Regarding this stage:
var lowest = result.OrderBy(a => a.Sum()).First();//This can be done in O(M * log(n)) [M - size of sortedValues, n - size of result]
The idea is that the list is always kept sorted (in this code that is done by OrderBy). With a sorted list, this step won't take more than O(log(n)), because we just need to INSERT at most one item into a sorted list, and finding its position should take the same as a binary search.
Because we need to repeat this phase sortedValues.Length times, the whole algorithm runs in O(M * log(n)).
So, in words, it can be rephrased as:
Repeat the steps below until all values have been handed out:
1. Add the biggest remaining value to the smallest player
2. Check if this player still has the smallest sum
3. If yes, go to step 1.
4. Otherwise, re-insert this player into the sorted players list at its new position
Step 4 is the O(log(n)) step, as the list is always sorted.
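One way to get the "always sorted" behaviour without re-sorting is a binary heap keyed on each player's current total. This is a sketch in Python using the standard heapq module, not a translation of the C# above; the function name is hypothetical.

```python
import heapq


def greedy_with_heap(values, m):
    """Give each value, largest first, to the player with the smallest total."""
    # Heap entries are (current total, player index), so the poorest player is on top.
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    players = [[] for _ in range(m)]
    for v in sorted(values, reverse=True):
        total, i = heapq.heappop(heap)          # O(log m)
        players[i].append(v)
        heapq.heappush(heap, (total + v, i))    # O(log m)
    return players


players = greedy_with_heap([10, 9, 8, 7, 7, 5, 5], 3)
totals = [sum(p) for p in players]
print(players)  # [[10, 5, 5], [9, 7], [8, 7]]
print(totals)   # [20, 16, 15]
```

Each value costs one pop and one push, so handing out the values takes O(N log m) after the initial sort. The heap only speeds the greedy rule up; it does not change its result, which on this input is 20/16/15 even though a 17/17/17 split exists.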
