Aggregating HSL Values - algorithm

I have written a piece of code that takes an input HSL color value, and categorizes it as one of eight predefined colors. Because the color of the object I'm measuring is not perfectly "smooth" (the exact H, S, and L values vary pixel to pixel, but there is a limited range each can fall into depending on the color), I want to aggregate the H's, S's, and L's of several pixels on the object, before identifying the resulting HSL value as a particular color.
To give an example, if the object is in fact black, then the H of any of its pixels should be in the range 24 - 33, while the H range is 33 - 37 for yellow. Aggregating several measured H's, instead of relying on a single measurement, will tend to yield a result closer to the middle of whichever range corresponds to the correct color, which desirably reduces the likelihood of needing to interpret the ambiguous 33 case.
I have been using something similar to median as my aggregation algorithm (exact algorithm shown below), but I've come across one case where this doesn't work well. In particular, the H range for a purple object includes both 231 - 240 and 0 - 10 (240 is the maximum H value, so it wraps around). All other colors' possible H's, S's, and L's are single, continuous ranges, for which the median approach (or my modified version) works well, but it isn't ideal in the purple H case, because it yields results which are actually closer to the edge of its range, so the input HSL value is more likely to be mistaken for another color (red's H range is 9 - 14).
Is there a better aggregation algorithm than median for this task, one that will tend to produce results nearer the middle of a color's H, S, and L ranges, even in the wrap around purple H case?
The algorithm:
private hslColor aggregateEachAttribute(hslColor[] hslData)
{
List<double> hAttributes = new List<double>();
List<double> sAttributes = new List<double>();
List<double> lAttributes = new List<double>();
for (int i = 0; i < hslData.Length; i++)
{
hAttributes.Add(hslData[i].H);
}
for (int i = 0; i < hslData.Length; i++)
{
sAttributes.Add(hslData[i].S);
}
for (int i = 0; i < hslData.Length; i++)
{
lAttributes.Add(hslData[i].L);
}
hAttributes.Sort();
sAttributes.Sort();
lAttributes.Sort();
while (hAttributes.Distinct().Count() >= 3)
{
hAttributes.RemoveAll(h => h == hAttributes[0]);
hAttributes.RemoveAll(h => h == hAttributes[hAttributes.Count() - 1]);
}
while (sAttributes.Distinct().Count() >= 3)
{
sAttributes.RemoveAll(s => s == sAttributes[0]);
sAttributes.RemoveAll(s => s == sAttributes[sAttributes.Count() - 1]);
}
while (lAttributes.Distinct().Count() >= 3)
{
lAttributes.RemoveAll(l => l == lAttributes[0]);
lAttributes.RemoveAll(l => l == lAttributes[lAttributes.Count() - 1]);
}
return new hslColor(hAttributes[0], sAttributes[0], lAttributes[0]);
}

If you just have wrap-around to worry about, and your points are either < X or > 240-X for some reasonably small X, you could calculate a median by adding X to your data mod 240, calculating a median, and then subtracting X from the result mod 240.
More generally, you could look for a median or medoid by finding the value that minimizes a sum of distances, where d(x, y) = min(|x-y|, |240 + x - y|, |240 + y - x|). I think you could calculate this in O(n) time after an O(n log n) sort, with some rather fiddly code: calculate the sum of distances associated with each candidate median or medoid, divide the points into two sets according to whether they are clockwise or anticlockwise from the current point, and update the sum of distances incrementally as you move from one candidate to the next.
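A minimal Python sketch of both ideas, assuming H values on a scale that wraps at 240. The function names are illustrative, and the medoid search is a plain O(n^2) brute force rather than the incremental scheme described above:

```python
from statistics import median

PERIOD = 240  # assumed hue scale: 240 wraps around to 0


def circular_distance(x, y, period=PERIOD):
    """Shortest distance between two hues on the wrap-around scale."""
    d = abs(x - y) % period
    return min(d, period - d)


def shifted_median(hues, shift, period=PERIOD):
    """Median after rotating the data by `shift` so it no longer straddles 0."""
    rotated = [(h + shift) % period for h in hues]
    return (median(rotated) - shift) % period


def circular_medoid(hues, period=PERIOD):
    """Input value with the smallest total circular distance to all inputs."""
    return min(hues, key=lambda c: sum(circular_distance(c, h, period) for h in hues))
```

For purple-like samples such as [235, 238, 2, 5, 239], both shifted_median(hues, 10) and circular_medoid(hues) land inside the wrapped range instead of near its edge.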

Related

Algorithm for downsampling array of intervals

I have a sorted array of N intervals of different length. I am plotting these intervals with alternating colors blue/green.
I am trying to find a method or algorithm to "downsample" the array of intervals to produce a visually similar plot, but with fewer elements.
Ideally I could write some function where I can pass the target number of output intervals as an argument. The output length only has to come close to the target.
input = [
[0, 5, "blue"],
[5, 6, "green"],
[6, 10, "blue"],
// ...etc
]
output = downsample(input, 25)
// [[0, 10, "blue"], ... ]
Below is a picture of what I am trying to accomplish. In this example the input has about 250 intervals, and the output about 25 intervals. The input length can vary a lot.
Update 1:
Below is my original post which I initially deleted, because there were issues with displaying the equations and also I wasn't very confident if it really makes sense. But later, I figured that the optimisation problem that I described can be actually solved efficiently with DP (Dynamic programming).
So I did a sample C++ implementation. Here are some results:
Here is a live demo that you can play with in your browser (make sure your browser supports WebGL2, as Chrome and Firefox do). The page takes a bit to load.
Here is the C++ implementation: link
Update 2:
Turns out the proposed solution has the following nice property - we can easily control the importance of the two parts F1 and F2 of the cost function. Simply change the cost function to F(α)=F1 + αF2, where α >= 1.0 is a free parameter. The DP algorithm remains the same.
Here are some results for different α values using the same number of intervals N:
Live demo (WebGL2 required)
As can be seen, higher α means it is more important to cover the original input intervals even if this means covering more of the background in-between.
Original post
Even though some good algorithms have already been proposed, I would like to propose a slightly unusual approach: interpreting the task as an optimisation problem. Although I don't know how to solve the optimisation problem efficiently (or even if it can be solved in reasonable time at all), it might be useful to someone purely as a concept.
First, without loss of generality, let's declare the blue color to be the background. We will be painting N green intervals on top of it (N is the number provided to the downsample() function in OP's description). The ith interval is defined by its starting coordinate 0 <= xi < xmax and width wi >= 0 (xmax is the maximum coordinate from the input).
Let's also define the array G(x) to be the number of green cells in the interval [0, x) in the input data. This array can easily be pre-calculated. We will use it to quickly calculate the number of green cells in an arbitrary interval [x, y), namely G(y) - G(x).
We can now introduce the first part of the cost function for our optimisation problem:
The smaller F1 is, the better our generated intervals cover the input intervals, so we will be searching for xi, wi that minimise it. Ideally we want F1=0, which would mean that the intervals do not cover any of the background (which of course is not possible because N is less than the number of input intervals).
However, this function is not enough to describe the problem, because obviously we can minimise it by taking empty intervals: F1(x, 0)=0. Instead, we want to cover as much as possible of the input intervals. Let's introduce the second part of the cost function, which corresponds to this requirement:
The smaller F2 is, the more input intervals are covered. Ideally we want F2=0 which would mean that we covered all of the input rectangles. However, minimising F2 competes with minimising F1.
Finally, we can state our optimisation problem: find xi, wi that minimize F=F1 + F2
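The two cost terms are not preserved in the text above. One formulation consistent with the prose (F1 counts background cells covered by the chosen intervals, F2 counts green cells left uncovered), assuming the N chosen intervals do not overlap, would be:

```latex
F_1(x, w) = \sum_{i=1}^{N} \Bigl( w_i - \bigl[ G(x_i + w_i) - G(x_i) \bigr] \Bigr)
\qquad
F_2(x, w) = G(x_{\max}) - \sum_{i=1}^{N} \bigl[ G(x_i + w_i) - G(x_i) \bigr]
```

With these definitions F1(x, 0) = 0 for empty intervals and F2 = 0 exactly when every green cell is covered, which matches the properties stated in the text.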
How to solve this problem? Not sure. Maybe use some metaheuristic approach for global optimisation such as Simulated annealing or Differential evolution. These are typically easy to implement, especially for this simple cost function.
The best case would be some kind of DP algorithm that solves it efficiently, but that seems unlikely.
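As a hedged illustration of how such a DP could look (not necessarily what the linked C++ implementation does): if F1 is read as the number of background cells covered and F2 as the number of green cells left uncovered, then minimising F1 + αF2 is the same as picking at most N disjoint runs of cells that maximise a score of +α per green cell and -1 per blue cell. The function name and the cell-based input below are assumptions made for this sketch:

```python
def best_green_cover(cells, n_intervals, alpha=1.0):
    """cells: sequence of booleans (True = green). Returns a tuple of (start, end) pairs."""
    neg_inf = float("-inf")
    # closed[j]: best (score, segments) with at most j segments among the cells seen so far
    closed = [(0.0, ())] * (n_intervals + 1)
    # open_[j]: best (score, segments) with at most j segments, the last ending at this cell
    open_ = [(neg_inf, ())] * (n_intervals + 1)
    for x, green in enumerate(cells):
        gain = alpha if green else -1.0
        # Go downwards so closed[j - 1] still describes the cells before x.
        for j in range(n_intervals, 0, -1):
            ext_score, ext_segs = open_[j]
            new_score, new_segs = closed[j - 1]
            if ext_score >= new_score:
                # Extend the segment that ended at the previous cell.
                segs = ext_segs[:-1] + ((ext_segs[-1][0], x + 1),)
                score = ext_score + gain
            else:
                # Start a new one-cell segment at x.
                segs = new_segs + ((x, x + 1),)
                score = new_score + gain
            open_[j] = (score, segs)
            if score > closed[j][0]:
                closed[j] = (score, segs)
    return closed[n_intervals][1]
```

The sketch copies segment tuples for readability, so it is slower than a DP with back-pointers; a larger alpha makes it more willing to bridge blue gaps between green runs.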
I would advise you to use a Haar wavelet. It is a very simple algorithm which was often used to provide progressive loading for big images on websites.
Here you can see how it works with a 2D function. That is what you can use. Alas, the document is in Ukrainian, but the code is in C++, so it is readable :)
This document provides an example of a 3D object:
Pseudocode for compressing with a Haar wavelet can be found in Wavelets for Computer Graphics: A Primer, Part 1.
You could do the following:
1. Write out the points that divide the whole strip into intervals as the array [a[0], a[1], a[2], ..., a[n-1]]. In your example, the array would be [0, 5, 6, 10, ... ].
2. Calculate double-interval lengths a[2]-a[0], a[3]-a[1], a[4]-a[2], ..., a[n-1]-a[n-3] and find the least of them. Let it be a[k+2]-a[k]. If there are two or more equal lengths having the lowest value, choose one of them randomly. In your example, you should get the array [6, 5, ... ] and search for the minimum value through it.
3. Swap the intervals (a[k], a[k+1]) and (a[k+1], a[k+2]). Basically, you need to assign a[k+1]=a[k]+a[k+2]-a[k+1] to keep the lengths, and then remove the points a[k] and a[k+2] from the array, because two pairs of intervals of the same color are now merged into two larger intervals. Thus, the numbers of blue and green intervals each decrease by one after this step.
4. If you're satisfied with the current number of intervals, end the process; otherwise go to step 1.
You performed step 2 in order to decrease the "color shift" because, at step 3, the left interval is moved a[k+2]-a[k+1] to the right and the right interval is moved a[k+1]-a[k] to the left. The sum of these distances, a[k+2]-a[k], can be considered a measure of the change you're introducing into the whole picture.
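The steps above can be sketched in Python as follows. Two choices here are assumptions of the sketch rather than part of the answer: ties are broken by taking the first minimum instead of a random one, and k is restricted to interior positions so that both swapped intervals have a same-colored neighbour to merge with (which also leaves the ends of the strip untouched):

```python
def downsample_boundaries(points, target_intervals):
    """points: sorted boundary coordinates; colors alternate between consecutive intervals."""
    a = list(points)
    # len(a) - 1 intervals remain; every pass removes two of them.
    while len(a) - 1 > target_intervals and len(a) >= 5:
        # Interior k with the smallest double-interval length a[k+2] - a[k].
        k = min(range(1, len(a) - 3), key=lambda i: a[i + 2] - a[i])
        # Swap the two intervals while keeping their lengths.
        a[k + 1] = a[k] + a[k + 2] - a[k + 1]
        # Both outer boundaries now separate same-colored intervals, so drop them.
        del a[k + 2]
        del a[k]
    return a
```

For example, downsample_boundaries([0, 5, 6, 10, 12, 20], 3) gives [0, 9, 12, 20]: the first color still covers 17 units in total and the second 3, as before.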
Main advantages of this approach:
It is simple.
It doesn't give a preference to either of the two colors. You don't need to assign one of the colors to be the background and the other to be the painting color. The picture can be considered both as "green-on-blue" and "blue-on-green". This reflects a quite common use case where two colors just describe two opposite states (like the bit 0/1, a "yes/no" answer) of some process extended in time or in space.
It always keeps the balance between colors, i.e. the sum of intervals of each color remains the same during the reduction process. Thus the total brightness of the picture doesn't change. This is important because the total brightness can be considered an "indicator of completeness" in some cases.
Here's another attempt at dynamic programming that's slightly different than Georgi Gerganov's, although the idea to try and formulate a dynamic program may have been inspired by his answer. Neither the implementation nor the concept is guaranteed to be sound but I did include a code sketch with a visual example :)
The search space in this case is not reliant on the total unit width but rather on the number of intervals. It's O(N * n^2) time and O(N * n) space, where N and n are the target and given number of (green) intervals, respectively, because we assume that any newly chosen green interval must be bound by two green intervals (rather than extend arbitrarily into the background).
The idea also utilises the prefix sum idea used to calculate runs with a majority element. We add 1 when we see the target element (in this case green) and subtract 1 for others (that algorithm is also amenable to multiple elements with parallel prefix sum tracking). (I'm not sure that restricting candidate intervals to sections with a majority of the target colour is always warranted but it may be a useful heuristic depending on the desired outcome. It's also adjustable -- we can easily adjust it to check for a different part than 1/2.)
Where Georgi Gerganov's program seeks to minimise, this dynamic program seeks to maximise two ratios. Let h(i, k) represent the best sequence of green intervals up to the ith given interval, utilising k intervals, where each is allowed to stretch back to the left edge of some previous green interval. We speculate that
h(i, k) = max(r + C*r1 + h(i-l, k-1))
where, in the current candidate interval, r is the ratio of green to the length of the stretch, and r1 is the ratio of green to the total given green. r1 is multiplied by an adjustable constant to give more weight to the volume of green covered. l is the length of the stretch.
JavaScript code (for debugging, it includes some extra variables and log lines):
function rnd(n, d=2){
let m = Math.pow(10,d)
return Math.round(m*n) / m;
}
function f(A, N, C){
let ps = [[0,0]];
let psBG = [0];
let totalG = 0;
A.unshift([0,0]);
for (let i=1; i<A.length; i++){
let [l,r,c] = A[i];
if (c == 'g'){
totalG += r - l;
let prevI = ps[ps.length-1][1];
let d = l - A[prevI][1];
let prevS = ps[ps.length-1][0];
ps.push(
[prevS - d, i, 'l'],
[prevS - d + r - l, i, 'r']
);
psBG[i] = psBG[i-1];
} else {
psBG[i] = psBG[i-1] + r - l;
}
}
//console.log(JSON.stringify(A));
//console.log('');
//console.log(JSON.stringify(ps));
//console.log('');
//console.log(JSON.stringify(psBG));
let m = new Array(N + 1);
m[0] = new Array((ps.length >> 1) + 1);
for (let i=0; i<m[0].length; i++)
m[0][i] = [0,0];
// for each in N
for (let i=1; i<=N; i++){
m[i] = new Array((ps.length >> 1) + 1);
for (let ii=0; ii<m[0].length; ii++)
m[i][ii] = [0,0];
// for each interval
for (let j=i; j<m[0].length; j++){
m[i][j] = m[i][j-1];
for (let k=j; k>i-1; k--){
// our anchors are the right
// side of each interval, k's are the left
let jj = 2*j;
let kk = 2*k - 1;
// positive means green
// is a majority
if (ps[jj][0] - ps[kk][0] > 0){
let bg = psBG[ps[jj][1]] - psBG[ps[kk][1]];
let s = A[ps[jj][1]][1] - A[ps[kk][1]][0] - bg;
let r = s / (bg + s);
let r1 = C * s / totalG;
let candidate = r + r1 + m[i-1][j-1][0];
if (candidate > m[i][j][0]){
m[i][j] = [
candidate,
ps[kk][1] + ',' + ps[jj][1],
bg, s, r, r1,k,m[i-1][j-1][0]
];
}
}
}
}
}
/*
for (row of m)
console.log(JSON.stringify(
row.map(l => l.map(x => typeof x != 'number' ? x : rnd(x)))));
*/
let result = new Array(N);
let j = m[0].length - 1;
for (let i=N; i>0; i--){
let [_,idxs,w,x,y,z,k] = m[i][j];
let [l,r] = idxs.split(',');
result[i-1] = [A[l][0], A[r][1], 'g'];
j = k - 1;
}
return result;
}
function show(A, last){
if (last[1] != A[A.length-1])
A.push(last);
let s = '';
let j;
for (let i=A.length-1; i>=0; i--){
let [l, r, c] = A[i];
let cc = c == 'g' ? 'X' : '.';
for (let j=r-1; j>=l; j--)
s = cc + s;
if (i > 0)
for (let j=l-1; j>=A[i-1][1]; j--)
s = '.' + s
}
for (let j=A[0][0]-1; j>=0; j--)
s = '.' + s
console.log(s);
return s;
}
function g(A, N, C){
const ts = f(A, N, C);
//console.log(JSON.stringify(ts));
show(A, A[A.length-1]);
show(ts, A[A.length-1]);
}
var a = [
[0,5,'b'],
[5,9,'g'],
[9,10,'b'],
[10,15,'g'],
[15,40,'b'],
[40,41,'g'],
[41,43,'b'],
[43,44,'g'],
[44,45,'b'],
[45,46,'g'],
[46,55,'b'],
[55,65,'g'],
[65,100,'b']
];
// (input, N, C)
g(a, 2, 2);
console.log('');
g(a, 3, 2);
console.log('');
g(a, 4, 2);
console.log('');
g(a, 4, 5);
I would suggest using K-means, an algorithm used to group data (a more detailed explanation here: https://en.wikipedia.org/wiki/K-means_clustering and here: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html).
This is a brief sketch of what the function could look like; I hope it is helpful.
from sklearn.cluster import KMeans
import numpy as np

def downsample(input, cluster=25):
    # Group the interval endpoints in a NumPy array, one [left, right] row per interval.
    X = np.array([[interval[0], interval[1]] for interval in input])
    # n_clusters is the desired number of outputs; it cannot exceed the number of rows.
    cluster = min(cluster, len(X))
    kmeans = KMeans(n_clusters=cluster, random_state=0).fit(X)
    # Collect every input interval under the label that was assigned to it.
    kmeans_list = [[] for _ in range(cluster)]
    for i in range(X.shape[0]):
        kmeans_list[kmeans.labels_[i]].append(X[i])
    # kmeans_list is now a list of lists; every inner list contains all the
    # intervals that correspond to a specific label.
    ret = []  # return list
    for label_list in kmeans_list:
        if not label_list:
            continue  # skip a label that received no interval
        left = min(entry[0] for entry in label_list)
        right = max(entry[1] for entry in label_list)
        ret.append([left, right])
    return ret

O(n) algorithm for two identical points

The Problem Statement
Given n points in a 2D plane, each with an x and a y coordinate. Two points are identical if one can be obtained from the other by multiplying both coordinates by the same number. Example: (10,15) and (2,3) are identical whereas (10,15) and (10,20) are not. Suggest an O(n) algorithm which determines whether the input n points contain two identical points or not.
The simple approach is to check every pair of points, i.e. if there are 5 points, the first one needs 4 comparisons, the second one needs 3 comparisons and so on. But that isn't an O(n) time complexity solution. I really can't think of anything beyond that. Any suggestions?
One obvious (but possibly inadequate) possibility would be to reduce each point to a floating point number representing the ratio, so (2,3) and (10,15) both become 0.66667, and (10, 20) becomes 0.5.
The reason this wouldn't work is that floating point numbers tend to be approximate, so you'd just about need to use an approximate comparison, and put up with the fact that it would show points as identical as long as they were equal to (say) 15 decimal places.
If you don't want that, you could create a rational number class that supported comparison (e.g., reduced each ratio to lowest terms).
Either way, once you've reduced a point to a single number, you just insert each into (for one possibility) a hash table. As you insert each you check whether that ratio is already in the hash table--if it is, you have an identical point. If not, insert it normally.
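A small Python sketch of the lowest-terms variant, assuming integer coordinates, no (0, 0) point, and that a negative multiplier also counts (so (-2, -3) matches (4, 6)). The function names are illustrative, and each gcd costs a logarithmic number of steps, so this is O(n) only if that is treated as constant:

```python
from math import gcd


def direction_key(x, y):
    """Reduce (x, y) to lowest terms with a canonical sign."""
    g = gcd(x, y)  # non-negative; zero only for (0, 0), which is assumed absent
    x, y = x // g, y // g
    if x < 0 or (x == 0 and y < 0):
        x, y = -x, -y
    return (x, y)


def has_identical_points(points):
    """True if two points reduce to the same key, using a hash set of keys."""
    seen = set()
    for x, y in points:
        key = direction_key(x, y)
        if key in seen:
            return True
        seen.add(key)
    return False
```

Because the key is an exact pair of integers, there is no floating point tolerance to choose.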
One way to reduce a point to a single number is to multiply the first co-ordinate of the point by the product of all the second co-ordinates of the other points.
For example:
(10, 20) -> 10 * 10 * 4 = 400
(5, 10) -> 5 * 20 * 4 = 400
(3, 4) -> 3 * 20 * 10 = 600
The first and second point match. For large sets of points the products would be very large, and would require using a BigNumber (which will be more than O(n)) but you could keep the numbers within a reasonable limit by taking a modulo after each multiplication. Then use a hash table as suggested in Jerry Coffin's answer.
You can easily compute the product of all the second co-ordinates by doing a single forward pass then a single backwards pass over the array and keeping running products:
e.g. in Java:
long m = 2147483647L; // prime 2^31 - 1, small enough that the products below cannot overflow a long
int[][] points = {{1, 2}, {1, 3}, {1, 4}, {1, 5}, {2, 6}};
long[] mods = new long[points.length];
long prod = 1;
for(int i = 0; i < points.length; i++)
{
mods[i] = prod;
prod = (points[i][1] * prod) % m;
}
prod = 1;
for(int i = points.length - 1; i >= 0 ; i--)
{
mods[i] = (mods[i] * prod) % m;
prod = (points[i][1] * prod) % m;
}
HashSet<Long> set = new HashSet<Long>();
for(int i = 0; i < points.length; i++)
{
prod = (mods[i] * points[i][0]) % m;
if(set.contains(prod))
System.out.println("Found a match");
set.add(prod);
}
This algorithm assumes all the co-ordinates are integers != 0. Zeroes can be handled as special cases: all points with zero in the first place match each other, likewise for those with zero in the second place, and (0, 0) matches all points. As an optimization, the second and third pass through the array could be merged into a single pass.

Generating Random Numbers for RPG games

I'm wondering if there is an algorithm to generate random numbers that will most likely be low within a range from min to max. For instance, if you generate a random number between 1 and 100, it should be below 30 most of the time if you call the function with f(min: 1, max: 100, avg: 30), but if you call it with f(min: 1, max: 200, avg: 10), the average should be 10. A lot of games do this, but I simply can't find a way to do it with a formula. Most of the examples I have seen use a "drop table" or something like that.
I have come up with a fairly simple way to weight the outcome of a roll, but it is not very efficient and you don't have a lot of control over it:
var pseudoRand = function(min, max, n) {
    if (n > 0) {
        return pseudoRand(min, Math.random() * (max - min) + min, n - 1)
    }
    return max;
}
rands = []
for (var i = 0; i < 20000; i++) {
    rands.push(pseudoRand(0, 100, 1))
}
avg = rands.reduce(function(x, y) { return x + y } ) / rands.length
console.log(avg); // ~50
The function simply picks a random number between min and max N times, updating the max with the last roll on every iteration. So if you call it with N = 2 and max = 100, it must roll 100 two times in a row in order to return 100.
I have looked at some distributions on wikipedia, but I don't quite understand them enough to know how I can control the min and max outputs etc.
Any help is very much welcomed
A simple way to generate a random number with a given distribution is to pick a random number from a list where the numbers that should occur more often are repeated according with the desired distribution.
For example if you create a list [1,1,1,2,2,2,3,3,3,4] and pick a random index from 0 to 9 to select an element from that list you will get a number <4 with 90% probability.
Alternatively, using the distribution from the example above, generate an array [2,5,8,9] and pick a random integer from 0 to 9, if it's ≤2 (this will occur with 30% probability) then return 1, if it's >2 and ≤5 (this will also occur with 30% probability) return 2, etc.
Explained here: https://softwareengineering.stackexchange.com/a/150618
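The threshold variant can be sketched in Python with a binary search over the cumulative array. The constant and function names below are illustrative, not taken from the linked answer:

```python
import bisect
import random

VALUES = [1, 2, 3, 4]
THRESHOLDS = [2, 5, 8, 9]  # highest roll (inclusive) that still maps to each value


def value_for_roll(roll, values=VALUES, thresholds=THRESHOLDS):
    """Map a roll in 0..thresholds[-1] to the first value whose threshold is >= roll."""
    return values[bisect.bisect_left(thresholds, roll)]


def weighted_pick(values=VALUES, thresholds=THRESHOLDS, rng=random):
    """Draw one value with the probabilities implied by the thresholds."""
    return value_for_roll(rng.randint(0, thresholds[-1]), values, thresholds)
```

Rolls 0 to 9 map to 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, which is the same distribution as the repeated list but without storing every repetition.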
A probability density function (PDF) describes how likely each value X is: for a discrete distribution it returns the probability of getting exactly X, and for a continuous one it returns the relative likelihood (density) around X. A cumulative distribution function (CDF) gives the probability of getting a number less than or equal to X. A CDF is the integral of a PDF. For the continuous distributions used here, the CDF is strictly increasing wherever the PDF is positive, so it has an inverse there.
To generate a PDF, plot the value on the x-axis and the probability on the y-axis. The sum (discrete) or integral (continuous) of all the probabilities should add up to 1. Find some function that models that equation correctly. To do this, you may have to look up some PDFs.
Basic Algorithm
https://en.wikipedia.org/wiki/Inverse_transform_sampling
This algorithm is based off of Inverse Transform Sampling. The idea behind ITS is that you are randomly picking a value on the y-axis of the CDF and finding the x-value it corresponds to. This makes sense because the more likely a value is to be randomly selected, the more "space" it will take up on the y-axis of the CDF.
1. Come up with some probability distribution formula. For instance, if you want the odds of a number being chosen to increase as the numbers get higher, you could use something like f(x)=x or f(x)=x^2. If you want something that bulges in the middle, you could use the Gaussian Distribution or 1/(1+x^2). If you want a bounded formula, you can use the Beta Distribution or the Kumaraswamy Distribution.
2. Integrate the PDF to get the Cumulative Distribution Function.
3. Find the inverse of the CDF.
4. Generate a uniform random number between 0 and 1 and plug it into the inverse of the CDF.
5. Multiply that result by (max-min) and then add min.
6. Round the result to the nearest integer.
Steps 1 to 3 are things you have to hard code into the game. The only way around it, for any PDF, is to solve for the shape parameters that correspond to its mean while satisfying the constraints you want those parameters to meet. If you want to use the Kumaraswamy Distribution, you would set it up so that the shape parameters a and b are always greater than one.
I would suggest using the Kumaraswamy Distribution because it is bounded and it has a very nice closed form and closed form inverse. It only has two parameters, a and b, and it is extremely flexible, as it can model many different scenarios, including polynomial behavior, bell curve behavior, and a basin-like behavior that has a peak at both edges. Also, modeling isn't too hard with this function. The higher the shape parameter b is, the more tilted it will be to the left, and the higher the shape parameter a is, the more tilted it will be to the right. If a and b are both less than one, the distribution will look like a trough or basin. If a or b is equal to one, the distribution will be a polynomial that does not change concavity from 0 to 1. If both a and b equal one, the distribution is a straight line. If a and b are greater than one, then the function will look like a bell curve. The best thing you can do to learn this is to actually graph these functions or just run the Inverse Transform Sampling algorithm.
https://en.wikipedia.org/wiki/Kumaraswamy_distribution
For instance, if I want to have a probability distribution shaped like this with a=2 and b=5 going from 0 to 100:
https://www.wolframalpha.com/input/?i=2*5*x%5E(2-1)*(1-x%5E2)%5E(5-1)+from+x%3D0+to+x%3D1
Its CDF would be:
CDF(x)=1-(1-x^2)^5
Its inverse would be:
CDF^-1(x)=(1-(1-x)^(1/5))^(1/2)
The General Inverse of the Kumaraswamy Distribution is:
CDF^-1(x)=(1-(1-x)^(1/b))^(1/a)
I would then generate a number from 0 to 1, put it into the CDF^-1(x), and multiply the result by 100.
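A minimal Python sketch of this sampler, using the CDF and inverse CDF given above. The function names and the final rounding to an integer roll are assumptions made for illustration:

```python
import random


def kumaraswamy_cdf(x, a, b):
    """CDF(x) = 1 - (1 - x^a)^b for x in [0, 1]."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_inverse_cdf(u, a, b):
    """CDF^-1(u) = (1 - (1 - u)^(1/b))^(1/a) for u in [0, 1]."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kumaraswamy_roll(min_value, max_value, a, b, rng=random):
    """Inverse transform sampling: uniform u -> inverse CDF -> scale to [min, max]."""
    u = rng.random()
    x = kumaraswamy_inverse_cdf(u, a, b)
    return round(min_value + (max_value - min_value) * x)
```

Calling kumaraswamy_roll(0, 100, 2, 5) repeatedly gives rolls that cluster in the lower part of the range, following the a=2, b=5 shape.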
Pros
Very accurate
Continuous, not discrete
Uses one formula and very little space
Gives you a lot of control over exactly how the randomness is spread out
Many of these formulas have CDFs with inverses of some sort
There are ways to bound the functions on both ends. For instance, the Kumaraswamy Distribution is bounded from 0 to 1, so you just input a float between zero and one, then multiply the result by (max-min) and add min. The Beta Distribution is bounded differently based on what values you pass into it. For something like PDF(x)=x, the CDF(x)=(x^2)/2, so you can generate a random value from CDF(0) to CDF(max-min).
Cons
You need to come up with the exact distributions and their shapes you plan on using
Every single general formula you plan on using needs to be hard coded into the game. In other words, you can program the general Kumaraswamy Distribution into the game and have a function that generates random numbers based on the distribution and its parameters, a and b, but not a function that generates a distribution for you based on the average. If you wanted to use Distribution x, you would have to find out what values of a and b best fit the data you want to see and hard code those values into the game.
I would use a simple mathematical function for that. From what you describe, you need a power curve like y = x^2. At the midpoint x = 0.5 (the median of rand, which returns a number from 0 to 1) you would get 0.25. If you want lower numbers, you can use a higher exponent like y = x^3, which results in y = 0.125 at x = 0.5. Note that this midpoint value is the median of the output, not its mean: the mean of x^n for uniform x is 1/(n+1), so 1/3 for x^2 and 1/4 for x^3.
Example:
http://www.meta-calculator.com/online/?panel-102-graph&data-bounds-xMin=-2&data-bounds-xMax=2&data-bounds-yMin=-2&data-bounds-yMax=2&data-equations-0=%22y%3Dx%5E2%22&data-rand=undefined&data-hideGrid=false
PS: I adjusted the function to calculate the needed exponent to get the average result.
Code example:
function expRand (min, max, exponent) {
    return Math.round( Math.pow( Math.random(), exponent) * (max - min) + min);
}
function averageRand (min, max, average) {
    var exponent = Math.log(((average - min) / (max - min))) / Math.log(0.5);
    return expRand(min, max, exponent);
}
alert(averageRand(1, 100, 10));
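The exponent computed in averageRand places the median of the output at the requested value, because 0.5^exponent is the output at the midpoint of the uniform input. If the mean is what should match, a variant can use the fact that the mean of u^e for uniform u in [0, 1] is 1/(e+1). The Python sketch below makes that substitution; the function names are illustrative, and it returns an unrounded value and assumes min < average <= max:

```python
import random


def exponent_for_mean(min_value, max_value, average):
    """Solve 1 / (e + 1) = (average - min) / (max - min) for e."""
    return (max_value - min_value) / (average - min_value) - 1.0


def mean_rand(min_value, max_value, average, rng=random):
    """Power-curve roll whose expected value is `average`."""
    e = exponent_for_mean(min_value, max_value, average)
    return min_value + (max_value - min_value) * rng.random() ** e
```

For min 1, max 100 and average 10 the exponent is 10, noticeably steeper than the one the median-based formula produces.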
You may combine 2 random processes. For example:
first rand R1 = f(min: 1, max: 20, avg: 10);
second rand R2 = f(min:1, max : 10, avg : 1);
and then multiply R1*R2 to have a result between [1-200] and average around 10 (the average will be shifted a bit)
Another option is to find the inverse of the random function you want to use. This option has to be initialized when your program starts but doesn't need to be recomputed. The math used here can be found in a lot of Math libraries. I will explain point by point by taking the example of an unknown random function where only four points are known:
First, fit the four point curve with a polynomial function of order 3 or higher.
You should then have a parametrized function of type : ax+bx^2+cx^3+d.
Find the indefinite integral of the function (the form of the integral is of type a/2x^2+b/3x^3+c/4x^4+dx, which we will call quarticEq).
Compute the integral of the polynomial from your min to your max.
Take a uniform random number between 0 and 1, then multiply it by the value of the integral computed in the previous step. (We name the result "R".)
Now solve the equation R = quarticEq for x.
Hopefully the last part is well known, and you should be able to find a library that can do this computation (see wiki). If the inverse of the integrated function does not have a closed form solution (like in any general polynomial with degree five or higher), you can use a root finding method such as Newton's Method.
This kind of computation may be used to create any kind of random distribution.
Edit :
You may find the Inverse Transform Sampling described above in wikipedia and I found this implementation (I haven't tried it.)
You can keep a running average of what the function has returned so far. Then, in a while loop, draw random numbers until you get one that moves the running average toward the target, update the running average, and return that number.
Using a drop table permits a very fast roll, which matters in a real-time game. It takes only one random number from a range, then an if statement with multiple branches chosen according to a table of probabilities (a Gaussian distribution over that range, for example). Something like this:
import random

num = random.randint(1, 100)
if num <= 10:
    pass  # case 1
elif num <= 20:
    pass  # case 2
...
It is not very clean, but when you have a finite number of choices it can be very fast.
There are lots of ways to do so, all of which basically boil down to generating from a right-skewed (a.k.a. positive-skewed) distribution. You didn't make it clear whether you want integer or floating point outcomes, but there are both discrete and continuous distributions that fit the bill.
One of the simplest choices would be a discrete or continuous right-triangular distribution, but while that will give you the tapering off you desire for larger values, it won't give you independent control of the mean.
Another choice would be a truncated exponential (for continuous) or geometric (for discrete) distribution. You'd need to truncate because the raw exponential or geometric distribution has a range from zero to infinity, so you'd have to lop off the upper tail. That would in turn require you to do some calculus to find a rate λ which yields the desired mean after truncation.
A third choice would be to use a mixture of distributions, for instance choose a number uniformly in a lower range with some probability p, and in an upper range with probability (1-p). The overall mean is then p times the mean of the lower range + (1-p) times the mean of the upper range, and you can dial in the desired overall mean by adjusting the ranges and the value of p. This approach will also work if you use non-uniform distribution choices for the sub-ranges. It all boils down to how much work you're willing to put into deriving the appropriate parameter choices.
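The mixture idea can be sketched in Python by solving the mean equation for p. The split point between the two ranges and the function names are assumptions of this sketch, and both sub-ranges are continuous uniforms:

```python
import random


def lower_range_probability(min_value, split, max_value, average):
    """Solve p * low_mean + (1 - p) * high_mean = average for p."""
    low_mean = (min_value + split) / 2
    high_mean = (split + max_value) / 2
    p = (high_mean - average) / (high_mean - low_mean)
    if not 0.0 <= p <= 1.0:
        raise ValueError("average is not reachable with this split")
    return p


def mixture_roll(min_value, split, max_value, average, rng=random):
    """Uniform on [min, split] with probability p, otherwise uniform on [split, max]."""
    p = lower_range_probability(min_value, split, max_value, average)
    if rng.random() < p:
        return rng.uniform(min_value, split)
    return rng.uniform(split, max_value)
```

With min 1, split 30, max 100 and average 30, about 71% of rolls come from the lower range. Moving the split changes the shape while the mean stays at the requested value.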
This is not the most precise method, but it could be considered "good enough" depending on your needs.
The algorithm picks a number between a min and a sliding max. There is a guaranteed max g_max and a potential max p_max. The true max slides depending on the result of another random call. This gives you the skewed distribution you are looking for. Below is the solution in Python.
import random

def get_roll(min, g_max, p_max):
    # The effective max slides uniformly between g_max and p_max on every call.
    max = g_max + (random.random() * (p_max - g_max))
    return random.randint(min, int(max))

get_roll(1, 10, 20)
Below is a histogram of the function ran 100,000 times with (1, 10, 20).
private int roll(int minRoll, int avgRoll, int maxRoll) {
// Generating random number #1
int firstRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);
// Iterating 3 times will result in the roll being relatively close to
// the average roll.
if (firstRoll > avgRoll) {
// If the first roll is higher than the (set) average roll:
for (int i = 0; i < 3; i++) {
int verificationRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);
if (firstRoll > verificationRoll && verificationRoll >= avgRoll) {
// The verification roll is closer to avgRoll than the first roll
// (and not below avgRoll), so keep it.
firstRoll = verificationRoll;
}
}
} else if (firstRoll < avgRoll) {
// If the first roll is lower than the (set) average roll:
for (int i = 0; i < 3; i++) {
int verificationRoll = ThreadLocalRandom.current().nextInt(minRoll, maxRoll + 1);
if (firstRoll < verificationRoll && verificationRoll <= avgRoll) {
// The verification roll is closer to avgRoll than the first roll
// (and not above avgRoll), so keep it.
firstRoll = verificationRoll;
}
}
}
return firstRoll;
}
Explanation:
roll
check if the roll is above, below or exactly 30
if above, reroll 3 times & set the roll according to the new roll, if lower but >= 30
if below, reroll 3 times & set the roll according to the new roll, if higher but <= 30
if exactly 30, don't set the roll anew
return the roll
Pros:
simple
effective
performs well
Cons:
You'll naturally have more results in the range of 30-40 than in the range of 20-30, simply due to the 30-70 relation.
Testing:
You can test this by using the following method in conjunction with the roll()-method. The data is saved in a hashmap (to map each number to its number of occurrences).
public void rollTheD100() {
int maxNr = 100;
int minNr = 1;
int avgNr = 30;
Map<Integer, Integer> numberOccurenceMap = new HashMap<>();
// "Initialization" of the map (please don't hit me for calling it initialization)
for (int i = 1; i <= 100; i++) {
numberOccurenceMap.put(i, 0);
}
// Rolling (100k times)
for (int i = 0; i < 100000; i++) {
int dummy = roll(minNr, avgNr, maxNr);
numberOccurenceMap.put(dummy, numberOccurenceMap.get(dummy) + 1);
}
int numberPack = 0;
for (int i = 1; i <= 100; i++) {
numberPack = numberPack + numberOccurenceMap.get(i);
if (i % 10 == 0) {
System.out.println("<" + i + ": " + numberPack);
numberPack = 0;
}
}
}
The results (100.000 rolls):
These were as expected. Note that you can always fine-tune the results simply by modifying the iteration count in the roll()-method: the closer to 30 the average should be, the more iterations should be included (note that this could hurt performance to a certain degree). Also note that 30 was (as expected) the number with the highest number of occurrences, by far.
<10: 4994
<20: 9425
<30: 18184
<40: 29640
<50: 18283
<60: 10426
<70: 5396
<80: 2532
<90: 897
<100: 223
Try this,
generate a random number for the range of numbers below the average and generate a second random number for the range of numbers above the average.
Then randomly select one of those, each range will be selected 50% of the time.
var psuedoRand = function(min, max, avg) {
var upperRand = Math.floor(Math.random() * (max - avg) + avg);
var lowerRand = Math.floor(Math.random() * (avg - min) + min);
if (Math.random() < 0.5)
return lowerRand;
else
return upperRand;
}
Having seen many good explanations and some good ideas, I still think this could help you:
You can take any distribution function f around 0 and map its interval of interest onto your desired interval [1,100]: f -> f'.
Then feed the C++ discrete_distribution with the results of f'.
I've got an example with the normal distribution below, but I can't get my result into this function :-S
#include <iostream>
#include <random>
#include <chrono>
#include <cmath>
#include <string>
using namespace std;
double p1(double x, double mean, double sigma); // p(x|x_avg,sigma)
double p2(int x, int x_min, int x_max, double x_avg, double z_min, double z_max); // transform ("stretch") it to the interval
int plot_ps(int x_avg, int x_min, int x_max, double sigma);
int main()
{
int x_min = 1;
int x_max = 20;
int x_avg = 6;
double sigma = 5;
/*
int p[]={2,1,3,1,2,5,1,1,1,1};
default_random_engine generator (chrono::system_clock::now().time_since_epoch().count());
discrete_distribution<int> distribution {p*};
for (int i=0; i< 10; i++)
cout << i << "\t" << distribution(generator) << endl;
*/
plot_ps(x_avg, x_min, x_max, sigma);
return 0; //*/
}
// Normal distribution function
double p1(double x, double mean, double sigma)
{
return 1/(sigma*sqrt(2*M_PI))
* exp(-(x-mean)*(x-mean) / (2*sigma*sigma));
}
// Transforms intervals to your wishes ;)
// z_min and z_max are the desired values f'(x_min) and f'(x_max)
double p2(int x, int x_min, int x_max, double x_avg, double z_min, double z_max)
{
double y;
double sigma = 1.0;
double y_min = -sigma*sqrt(-2*log(z_min));
double y_max = sigma*sqrt(-2*log(z_max));
if(x < x_avg)
y = -(x-x_avg)/(x_avg-x_min)*y_min;
else
y = -(x-x_avg)/(x_avg-x_max)*y_max;
return p1(y, 0.0, sigma);
}
//plots both distribution functions
int plot_ps(int x_avg, int x_min, int x_max, double sigma)
{
double z = (1.0+x_max-x_min);
// plot p1
for (int i=1; i<=20; i++)
{
cout << i << "\t" <<
string(int(p1(i, x_avg, sigma)*(sigma*sqrt(2*M_PI)*20.0)+0.5), '*')
<< endl;
}
cout << endl;
// plot p2
for (int i=1; i<=20; i++)
{
cout << i << "\t" <<
string(int(p2(i, x_min, x_max, x_avg, 1.0/z, 1.0/z)*(20.0*sqrt(2*M_PI))+0.5), '*')
<< endl;
}
return 0;
}
With the following result if I let them plot:
1 ************
2 ***************
3 *****************
4 ******************
5 ********************
6 ********************
7 ********************
8 ******************
9 *****************
10 ***************
11 ************
12 **********
13 ********
14 ******
15 ****
16 ***
17 **
18 *
19 *
20
1 *
2 ***
3 *******
4 ************
5 ******************
6 ********************
7 ********************
8 *******************
9 *****************
10 ****************
11 **************
12 ************
13 *********
14 ********
15 ******
16 ****
17 ***
18 **
19 **
20 *
So - if you could give this result to the discrete_distribution<int> distribution {}, you got everything you want...
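For what it is worth, std::discrete_distribution also has a constructor taking an iterator range, so a vector of weights can be passed as distribution(w.begin(), w.end()). The same idea can be sketched in Python, where random.choices accepts a weight table directly. This is only an illustration under assumptions: stretched_normal_weights is a made-up helper that mirrors the unnormalized shape of p2 above (1.0 at the peak, edge_weight at both ends), and the peak is the mode of the result, not necessarily its mean.

```python
import math
import random

def stretched_normal_weights(x_min, x_max, x_peak, edge_weight):
    """Unnormalized weights for x_min..x_max: 1.0 at x_peak, edge_weight at both ends.

    Assumes x_min < x_peak < x_max and 0 < edge_weight < 1.
    """
    y_edge = math.sqrt(-2.0 * math.log(edge_weight))
    weights = []
    for x in range(x_min, x_max + 1):
        # Stretch each side of the peak separately so both ends reach y_edge.
        span = (x_peak - x_min) if x < x_peak else (x_max - x_peak)
        y = (x - x_peak) / span * y_edge
        weights.append(math.exp(-0.5 * y * y))
    return weights

values = list(range(1, 21))
weights = stretched_normal_weights(1, 20, 6, edge_weight=1.0 / 20)
# random.choices normalizes the weights itself.
rolls = random.choices(values, weights=weights, k=10)
```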
Well, from what I can see of your problem, I would want for the solution to meet these criteria:
a) Belong to a single distribution: If we need to "roll" (call Math.random) more than once per function call and then aggregate or discard some results, it stops being truly distributed according to the given function.
b) Not be computationally intensive: Some of the solutions use Integrals, (Gamma distribution, Gaussian Distribution), and those are computationally intensive. In your description, you mention that you want to be able to "calculate it with a formula", which fits this description (basically, you want an O(1) function).
c) Be relatively "well distributed", e.g. not have peaks and valleys, but instead have most results cluster around the mean, and have nice predictable slopes downwards towards the ends, and yet have the probability of the min and the max to be not zero.
d) Not to require to store a large array in memory, as in drop tables.
I think this function meets the requirements:
var pseudoRand = function(min, max, avg )
{
var randomFraction = Math.random();
var head = (avg - min);
var tail = (max - avg);
var skewdness = tail / (head + tail);
if (randomFraction < skewdness)
return min + (randomFraction / skewdness) * head;
else
return avg + (1 - randomFraction) / (1 - skewdness) * tail;
}
This will return floats, but you can easily turn them to ints by calling
Math.round(pseudoRand(...))
It returned the correct average in all of my tests, and it is also nicely distributed towards the ends. Hope this helps. Good luck.

Randomly pick elements from a vector of counts

I'm currently trying to optimize some MATLAB/Octave code by means of an algorithmic change, but can't figure out how to deal with some randomness here. Suppose that I have a vector V of integers, with each element representing a count of some things, photons in my case. Now I want to randomly pick some amount of those "things" and create a new vector of the same size, but with the counts adjusted.
Here's how I do this at the moment:
function W = photonfilter(V, eff)
% W = photonfilter(V, eff)
% Randomly takes photons from V according to the given efficiency.
%
% Args:
% V: Input vector containing the number of emitted photons in each
% timeslot (one element is one timeslot). The elements are rounded
% to integers before processing.
% eff: Filter efficiency. On average, one photon out of every 1/eff will
% be taken. This value must be in the range 0 < eff <= 1.
% W: Output row vector with the same length as V and containing the number
% of received photons in each timeslot.
%
% WARNING: This function operates on a photon-by-photon basis in that it
% constructs a vector with one element per photon. The storage requirements
% therefore directly depend on sum(V), not only on the length of V.
% Round V and make it flat.
Ntot = length(V);
V = round(V);
V = V(:);
% Initialize the photon-based vector, so that each element contains
% the original index of the photon.
idxV = zeros(1, sum(V), 'uint32');
iout = 1;
for i = 1:Ntot
N = V(i);
idxV(iout:iout+N-1) = i;
iout = iout + N;
end;
% Take random photons.
idxV = idxV(randperm(length(idxV)));
idxV = idxV(1:round(length(idxV)*eff));
% Generate the output vector by placing the remaining photons back
% into their timeslots.
[W, trash] = hist(idxV, 1:Ntot);
This is a rather straightforward implementation of the description above. But it has an obvious performance drawback: The function creates a vector (idxV) containing one element per single photon. So if my V has only 1000 elements but an average count of 10000 per element, the internal vector will have 10 million elements making the function slow and heavy.
What I'd like to achieve now is not to directly optimize this code, but to use some other kind of algorithm which immediately calculates the new counts without giving each photon some kind of "identity". This must be possible somehow, but I just can't figure out how to do it.
Requirements:
The output vector W must have the same number of elements as the input vector V.
W(i) must be an integer and bounded by 0 <= W(i) <= V(i).
The expected value of sum(W) must be sum(V)*eff.
The algorithm must somehow implement this "random picking" of photons, i.e. there should not be some deterministic part like "run through V dividing all counts by the stepsize and propagating the remainders", as the whole point of this function is to bring randomness into the system.
An explicit loop over V is allowed if unavoidable, but a vectorized approach is preferable.
Any ideas how to implement something like this? A solution using only a random vector and then some trickery with probabilities and rounding would be ideal, but I haven't had any success with that so far.
Thanks! Best regards, Philipp
The method you employ to compute W is a Monte Carlo method, and there is indeed room for optimization. One option: instead of calculating the indices of photons, imagine a set of bins. Each bin has some probability, and the bins' probabilities add up to 1. We divide the segment [0, 1] into parts whose lengths are proportional to the probabilities of the bins. Now, for every random number in [0, 1) that we generate, we can quickly find the bin it belongs to. Finally, we count the numbers in each bin to obtain the final result. The code below illustrates the idea.
% Population size (number of photons).
N = 1000000;
% Sample size, size of V and W as well.
% For convenience of plotting, V and W are of the same size, but
% the algorithm doesn't enforce this constraint.
M = 10000;
% Number of Monte Carlo iterations, greater numbers give better quality.
K = 100000;
% Generate population of counts, use gaussian distribution to test the method.
% If implemented correctly histograms should have the same shape eventually.
V = hist(randn(1, N), M);
P = cumsum(V / sum(V));
% For every generated random value find its bin and then count the bins.
% Finally we normalize counts by the ratio of N / K.
W = hist(lookup(P, rand(1, K)), M) * N / K;
% Compare distribution plots, they should be the same.
hold on;
plot(W, '+r');
plot(V, '*b');
pause
Based on the answer from Alexander Solovets, this is how the code now looks:
function W = photonfilter(V, eff, impl)
if nargin < 3, impl = 1; end % default: Monte Carlo approach
Ntot = length(V);
V = V(:);
if impl == 0
% Original "straightforward" solution.
V = round(V);
idxV = zeros(1, sum(V), 'uint32');
iout = 1;
for i = 1:Ntot
N = V(i);
idxV(iout:iout+N-1) = i;
iout = iout + N;
end;
idxV = idxV(randperm(length(idxV)));
idxV = idxV(1:round(length(idxV)*eff));
[W, trash] = hist(idxV, 1:Ntot);
else
% Monte Carlo approach.
Nphot = sum(V);
P = cumsum(V / Nphot);
W = hist(lookup(P, rand(1, round(Nphot * eff))), 0:Ntot-1);
end;
The results are quite comparable, as long as eff is not too close to 1 (with eff=1, the original solution yields W=V while the Monte Carlo approach still has some randomness, thereby violating the upper bound constraint).
Test in the interactive Octave shell:
octave:1> T=linspace(0,10*pi,10000);
octave:2> V=100*(1+sin(T));
octave:3> W1=photonfilter(V, 0.1, 0);
octave:4> W2=photonfilter(V, 0.1, 1);
octave:5> plot(T,V,T,W1,T,W2);
octave:6> legend('V','Random picking','Monte Carlo')
octave:7> sum(W1)
ans = 100000
octave:8> sum(W2)
ans = 100000
Plot:
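On the upper-bound issue mentioned above: one alternative, swapped in here and not taken from either answer, is binomial thinning, where each photon in a timeslot is kept independently with probability eff. That keeps 0 <= W(i) <= V(i) by construction and gives an expected sum(W) of sum(V)*eff, although sum(W) itself then varies from run to run. A minimal sketch with a made-up function name, using plain Python so it needs no per-photon storage:

```python
import random

def photonfilter_binomial(counts, eff, rng=random):
    """Keep each photon independently with probability eff.

    counts: photons per timeslot (rounded to integers).
    Returns a list of the same length with 0 <= kept[i] <= counts[i].
    """
    if not 0.0 < eff <= 1.0:
        raise ValueError("eff must be in the range 0 < eff <= 1")
    kept = []
    for n in counts:
        n = int(round(n))
        # One Bernoulli trial per photon; only the running count is stored.
        kept.append(sum(1 for _ in range(n) if rng.random() < eff))
    return kept

W = photonfilter_binomial([0, 5, 100, 1000], 0.1)
```

This still does work proportional to the number of photons. A vectorized binomial sampler (for example NumPy's Generator.binomial, if NumPy is an option) draws the same per-timeslot counts without the inner loop.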

How to calculate iteratively the running weighted average so that last values to weight most?

I want to implement an iterative algorithm which calculates a weighted average. The specific weight law does not matter, but it should be close to 1 for the newest values and close to 0 for the oldest.
The algorithm should be iterative, i.e. it should not remember all previous values. It should know only the newest value and some aggregate information about the past, like previous values of the average, sums, counts etc.
Is it possible?
For example, the following algorithm can be:
void iterate(double value) {
sum *= 0.99;
sum += value;
count++;
avg = sum / count;
}
It will give exponentially decreasing weights, which may not be good. Is it possible to have step-decreasing weights or something similar?
EDIT 1
The requirements for the weighting law are as follows:
1) The weight decreases into the past
2) It has some mean or characteristic duration, so that values older than this duration matter much less than newer ones
3) I should be able to set this duration
EDIT 2
I need the following. Suppose v_i are values, where v_1 is the first. Also suppose w_i are weights. But w_0 is THE LAST.
So, after first value came I have first average
a_1 = v_1 * w_0
After the second value v_2 came, I should have average
a_2 = v_1 * w_1 + v_2 * w_0
With next value I should have
a_3 = v_1 * w_2 + v_2 * w_1 + v_3 * w_0
Note that the weight profile moves with me as I move along the value sequence.
I.e. each value does not keep its own weight all the time. My goal is to have this weight get lower going into the past.
First a bit of background. If we were keeping a normal average, it would go like this:
average(a) = a
average(a,b) = (average(a)+b)/2
average(a,b,c) = (average(a,b)*2 + c)/3
average(a,b,c,d) = (average(a,b,c)*3 + d)/4
As you can see here, this is an "online" algorithm and we only need to keep track of two pieces of data: 1) the total count of numbers in the average, and 2) the average itself. Then we can undivide the average by the total, add in the new number, and divide it by the new total.
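The undivide, add, divide step can be sketched as a small closure. The name make_running_mean is made up for this illustration.

```python
def make_running_mean():
    """Online mean that stores only the count and the current average."""
    count = 0
    mean = 0.0
    def update(x):
        nonlocal count, mean
        # Undivide by the old count, add the new value, divide by the new count.
        mean = (mean * count + x) / (count + 1)
        count += 1
        return mean
    return update

update = make_running_mean()
history = [update(x) for x in [2, 4, 9]]
```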
Weighted averages are a bit different. It depends on what kind of weighted average. For example if you defined:
weightedAverage(a,wa, b,wb, c,wc, ..., z,wz) = a*wa + b*wb + c*wc + ... + z*wz
or
weightedAverage(elements, weights) = elements·weights
...then you don't need to do anything besides add the new element*weight! If however you defined the weighted average akin to an expected-value from probability:
weightedAverage(elements,weights) = elements·weights / sum(weights)
...then you'd need to keep track of the total weights. Instead of undividing by the total number of elements, you undivide by the total weight, add in the new element*weight, then divide by the new total weight.
Alternatively you don't need to undivide, as demonstrated below: you can merely keep track of the temporary dot product and weight total in a closure or an object, and divide it as you yield (this can help a lot with avoiding numerical inaccuracy from compounded rounding errors).
In python this would be:
def makeAverager():
    dotProduct = 0
    totalWeight = 0
    def averager(newValue, weight):
        nonlocal dotProduct, totalWeight
        dotProduct += newValue*weight
        totalWeight += weight
        return dotProduct/totalWeight
    return averager
Demo:
>>> averager = makeAverager()
>>> [averager(value,w) for value,w in [(100,0.2), (50,0.5), (100,0.1)]]
[100.0, 64.28571428571429, 68.75]
>>> averager(10,1.1)
34.73684210526316
>>> averager(10,1.1)
25.666666666666668
>>> averager(30,2.0)
27.4
> But my task is to have average recalculated each time new value arrives having old values reweighted. –OP
Your task is almost always impossible, even with exceptionally simple weighting schemes.
You are asking to, with O(1) memory, yield averages with a changing weighting scheme. For example, {values·weights1, (values+[newValue2])·weights2, (values+[newValue2,newValue3])·weights3, ...} as new values are being passed in, for some nearly arbitrarily changing weights sequence. This is impossible because the merge is not injective: once you merge the numbers together, you lose a massive amount of information. For example, even if you had the weight vector, you could not recover the original value vector, or vice versa. There are only two cases I can think of where you could get away with this:
Constant weights such as [2,2,2,...2]: this is equivalent to an on-line averaging algorithm, which you don't want because the old values are not being "reweighted".
The relative weights of previous answers do not change. For example you could do weights of [8,4,2,1], and add in a new element with arbitrary weight like ...+[1], but you must increase all the previous by the same multiplicative factor, like [16,8,4,2]+[1]. Thus at each step, you are adding a new arbitrary weight, and a new arbitrary rescaling of the past, so you have 2 degrees of freedom (only 1 if you need to keep your dot-product normalized). The weight-vectors you'd get would look like:
[w0]
[w0*(s1), w1]
[w0*(s1*s2), w1*(s2), w2]
[w0*(s1*s2*s3), w1*(s2*s3), w2*(s3), w3]
...
Thus any weighting scheme you can make look like that will work (unless you need to keep the thing normalized by the sum of weights, in which case you must then divide the new average by the new sum, which you can calculate by keeping only O(1) memory). Merely multiply the previous average by the new s (which will implicitly distribute over the dot-product into the weights), and tack on the new +w*newValue.
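A minimal sketch of the rescale-and-add scheme in its normalized form, assuming the caller supplies the new weight and the rescaling factor at each step (make_rescaling_averager is a made-up name). Both the weighted sum and the weight total are rescaled, so only two numbers are stored.

```python
def make_rescaling_averager():
    """O(1)-memory weighted average where past weights are rescaled each step."""
    weighted_sum = 0.0
    weight_total = 0.0
    def update(value, weight, scale):
        nonlocal weighted_sum, weight_total
        # Multiplying by scale rescales every past weight at once.
        weighted_sum = weighted_sum * scale + weight * value
        weight_total = weight_total * scale + weight
        return weighted_sum / weight_total
    return update

update = make_rescaling_averager()
# Constant weight 1 and scale 0.5 give the weights [0.25, 0.5, 1] after three values.
outputs = [update(v, 1.0, 0.5) for v in [1.0, 2.0, 3.0]]
```

With a constant scale this reduces to an exponentially weighted average; varying weight and scale per step uses the two degrees of freedom described above.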
I think you are looking for something like this:
void iterate(double value) {
count++;
weight = max(0.0, 1.0 - count / 1000.0);
avg = (avg * total_weight + weight * value) / (total_weight + weight);
total_weight += weight;
}
Here I'm assuming you want the weights to sum to 1. As long as you can generate a relative weight without it changing in the future, you can end up with a solution which mimics this behavior.
That is, suppose you defined your weights as a sequence {s_0, s_1, s_2, ..., s_n, ...} and defined the input as sequence {i_0, i_1, i_2, ..., i_n}.
Consider the form: sum(s_0*i_0 + s_1*i_1 + s_2*i_2 + ... + s_n*i_n) / sum(s_0 + s_1 + s_2 + ... + s_n). Note that it is trivially possible to compute this incrementally with a couple of aggregation counters:
int counter = 0;
double numerator = 0;
double denominator = 0;
void addValue(double val)
{
double weight = calculateWeightFromCounter(counter);
numerator += weight * val;
denominator += weight;
counter++; // advance so the next value gets the next weight
}
double getAverage()
{
if (denominator == 0.0) return 0.0;
return numerator / denominator;
}
Of course, calculateWeightFromCounter() in this case shouldn't generate weights that sum to one -- the trick here is that we average by dividing by the sum of the weights so that in the end, the weights virtually seem to sum to one.
The real trick is how you do calculateWeightFromCounter(). You could simply return the counter itself, for example, however note that the last weighted number would not be near the sum of the counters necessarily, so you may not end up with the exact properties you want. (It's hard to say since, as mentioned, you've left a fairly open problem.)
This is too long to post in a comment, but it may be useful to know.
Suppose you have:
w_0*v_n + ... w_n*v_0 (we'll call this w[0..n]*v[n..0] for short)
Then the next step is:
w_0*v_(n+1) + ... w_(n+1)*v_0 (and this is w[0..n+1]*v[n+1..0] for short)
This means we need a way to calculate w[1..n+1]*v[n..0] from w[0..n]*v[n..0].
It's certainly possible that v[n..0] is 0, ..., 0, z, 0, ..., 0 where z is at some location x.
If we don't have any 'extra' storage, then f(z*w(x))=z*w(x + 1) where w(x) is the weight for location x.
Rearranging the equation, w(x + 1) = f(z*w(x))/z. Well, w(x + 1) better be constant for a constant x, so f(z*w(x))/z better be constant. Hence, f must let z propagate -- that is, f(z*w(x)) = z*f(w(x)).
But here again we have an issue. Note that if z (which could be any number) can propagate through f, then w(x) certainly can. So f(z*w(x)) = w(x)*f(z). Combining the two, z*f(w(x)) = w(x)*f(z), thus f(w(x)) = w(x)*f(z)/z.
But for a constant x, w(x) is constant, and thus f(w(x)) better be constant, too. w(x) is constant, so f(z)/z better be constant so that w(x)*f(z)/z is constant. Thus f(w(x)) = c*w(x) where c is a constant.
So, f(x)=c*x where c is a constant when x is a weight value.
So w(x+1) = c*w(x).
That is, each weight is a multiple of the previous. Thus, the weights take the form w(x)=m*b^x.
Note that this assumes the only information f has is the last aggregated value. Note that at some point you will be reduced to this case unless you're willing to store a non-constant amount of data representing your input. You cannot represent an infinite length vector of real numbers with a real number, but you can approximate them somehow in a constant, finite amount of storage. But this would merely be an approximation.
Although I haven't rigorously proven it, it is my conclusion that what you want is impossible to do with a high degree of precision, but you may be able to use log(n) space (which may as well be O(1) for many practical applications) to generate a quality approximation. You may be able to use even less.
I tried to practically code something (in Java). As has been said, your goal is not achievable. You can only compute the average from some number of the most recently remembered values. If you don't need to be exact, you can approximate the older values. I tried to do it by remembering the last 5 values exactly and older values only SUMmed in groups of 5, remembering the last 5 SUMs. Then, the complexity is O(2n) for remembering the last n+n*n values. This is a very rough approximation.
You can modify the "lastValues" and "lastAggregatedSums" array sizes as you want. See this ascii-art picture trying to display a graph of last values, showing that the first columns (older data) are remembered as aggregated values (not individually), and only the most recent 5 values are remembered individually.
values:
#####
##### ##### #
##### ##### ##### # #
##### ##### ##### ##### ## ##
##### ##### ##### ##### ##### #####
time: --->
Challenge 1: My example doesn't apply weights, but I think it shouldn't be a problem for you to add weights for the "lastAggregatedSums" appropriately. The only difficulty is that lower weights for older values are harder to apply, because the array is rotating, so it is not straightforward to know which weight belongs to which array member. Maybe you can modify the algorithm to always "shift" values in the array instead of rotating? Then adding weights shouldn't be a problem.
Challenge 2: The arrays are initialized with 0 values, and those values count towards the average from the beginning, even when we haven't received enough values. If you are running the algorithm for a long time, you probably don't mind that it is learning for some time at the beginning. If you do, you can post a modification ;-)
public class AverageCounter {
private float[] lastValues = new float[5];
private float[] lastAggregatedSums = new float[5];
private int valIdx = 0;
private int aggValIdx = 0;
private float avg;
public void add(float value) {
lastValues[valIdx++] = value;
if(valIdx == lastValues.length) {
// count average of last values and save into the aggregated array.
float sum = 0;
for(float v: lastValues) {sum += v;}
lastAggregatedSums[aggValIdx++] = sum;
if(aggValIdx >= lastAggregatedSums.length) {
// rotate aggregated values index
aggValIdx = 0;
}
valIdx = 0;
}
float sum = 0;
for(float v: lastValues) {sum += v;}
for(float v: lastAggregatedSums) {sum += v;}
avg = sum / (lastValues.length + lastAggregatedSums.length * lastValues.length);
}
public float getAvg() {
return avg;
}
}
You can combine (as a weighted sum) exponential means with different effective window sizes (N) in order to get the desired weights.
Use more exponential means to define your weight profile in more detail.
(More exponential means also mean more values to store and calculate, so that is the trade-off.)
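A rough sketch of that idea, under assumptions: each exponential mean is parameterized directly by a smoothing factor alpha instead of a window size N, the blend coefficients sum to 1, and the class and function names are made up for illustration.

```python
class BlendedEma:
    """Weighted sum of several exponential means sharing one input stream."""

    def __init__(self, alphas, mix):
        # alphas: smoothing factor of each exponential mean, each in (0, 1].
        # mix: blend coefficients; they should sum to 1 to keep an average.
        self.alphas = list(alphas)
        self.mix = list(mix)
        self.states = None  # filled in from the first value

    def update(self, value):
        if self.states is None:
            self.states = [float(value)] * len(self.alphas)
        else:
            self.states = [(1.0 - a) * s + a * value
                           for a, s in zip(self.alphas, self.states)]
        return sum(m * s for m, s in zip(self.mix, self.states))

def blended_weight(alphas, mix, age):
    """Weight the blend gives to the value seen `age` steps ago (0 = newest)."""
    return sum(m * a * (1.0 - a) ** age for a, m in zip(alphas, mix))

ema = BlendedEma(alphas=[0.5, 0.1], mix=[0.7, 0.3])
smoothed = [ema.update(v) for v in [5.0, 5.0, 5.0]]
```

A fast and a slow mean together give a profile that drops quickly at first and then has a long tail; the memory cost is one number per exponential mean.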
A memoryless solution is to calculate the new average from a weighted combination of the previous average and the new value:
average = (1 - P) * average + P * value
where P is an empirical constant, 0 <= P <= 1
expanding gives:
average = sum i (weight[i] * value[i])
where value[0] is the newest value, and
weight[i] = P * (1 - P) ^ i
When P is low, historical values are given higher weighting.
The closer P gets to 1, the more quickly it converges to newer values.
When P = 1, it's a regular assignment and ignores previous values.
If you want to maximise the contribution of value[N], maximize
weight[N] = P * (1 - P) ^ N
where 0 <= P <= 1
I discovered weight[N] is maximized when
P = 1 / (N + 1)
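The result follows from setting the derivative of P * (1 - P)^N to zero, which gives (1 - P)^(N-1) * (1 - (N + 1) * P) = 0. The sketch below shows the update rule and checks the maximum numerically with a simple grid search; the function names are made up for illustration, and seeding the average with the first value is an assumption.

```python
def make_ema(p):
    """Return an updater for average = (1 - p) * average + p * value."""
    average = None
    def update(value):
        nonlocal average
        # The first value seeds the average (an assumption of this sketch).
        average = value if average is None else (1.0 - p) * average + p * value
        return average
    return update

def ema_weight(p, age):
    """Weight of the value seen `age` steps ago: p * (1 - p) ** age."""
    return p * (1.0 - p) ** age

def best_p_on_grid(age, steps=1000):
    """Grid search for the p in [0, 1] that maximizes ema_weight(p, age)."""
    return max((k / steps for k in range(steps + 1)),
               key=lambda p: ema_weight(p, age))

best = best_p_on_grid(4)  # should land close to 1 / (4 + 1)
```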

Resources