The problem is pretty much what the title says. There is an n-element (n < 10^5) array, which initially consists of n zeros. There are q operations (q < 2*10^5), each of which is one of the two below:
1. Add x to all elements in a range [l,r] (x can also be negative)
2. Ask for the number of zeros in a range [l,r]
Note that it is guaranteed that the absolute values in the array will never exceed 10^5.
I am asking this question because I was reading a solution to another problem in which my question was a subproblem. The author said that it can be solved using a segment tree with lazy propagation, but I cannot figure out how to do that. The brute-force solution (O(q*n)) is too slow...
What is the most efficient way to implement answering the query while supporting the first operation? O(q*log(n)) is what I would guess.
Example:
The array is: 0 0 0 0
-> Add 3 from index 2 to index 3:
The array is now: 0 0 3 3
-> Ask for the number of zeros on [1,2]
The answer is 1
-> Add -3 from index 3 to index 3
The array is now: 0 0 3 0
-> Ask for the number of zeros on [0,3]
The answer is 3
OK, I have solved this task. All we have to do is build a segment tree of minimums with lazy propagation that also counts the number of those minimums.
In each node of our segment tree we will store 3 values:
1. The minimum of the segment covered by the node.
2. The number of elements equal to that minimum in the segment.
3. The lazy propagation value (the value that tells us what to pass down to the node's children the next time we visit it).
When reading from a segment we will get:
1. The minimum on this segment
2. How many numbers are equal to that minimum on this segment.
If the segment's minimum is 0, we simply return the second value. If the minimum is greater than 0, the answer is 0 (there are no zeros on this segment, because the smallest number is greater than 0). Since the read operation, as well as the update operation, is O(log(n)), and we have q operations, the complexity of this algorithm is O(q*log(n)), which is sufficient.
Pseudocode:
min_count[2*MAX_N]
val[2*MAX_N]
lazy[2*MAX_N]
values_from_sons(node)
{
if(node has no children) stop the function
val[node]=min(val[2*node],val[2*node+1]) //it is a segment tree of minimums
if(val[2*node]<val[2*node+1]) //minimum from the left son < minimum from the right son
{
min_count[node]=min_count[2*node]
stop the function
}
if(val[2*node]>val[2*node+1]) //minimum from the left son > minimum from the right son
{
min_count[node]=min_count[2*node+1]
stop the function
}
if(val[2*node]==val[2*node+1])
{
min_count[node]=min_count[2*node]+min_count[2*node+1];
//we have x minimums in the left son, and y non-intersecting with x minimums on the right, so we can sum them up
}
}
pass(node)
{
if(node has no children) stop the function
//we are passing values to our children when visiting node,
// remember that array "lazy" stores values which belong to node's sons
val[2*node]+=lazy[node];
lazy[2*node]+=lazy[node];
val[2*node+1]+=lazy[node];
lazy[2*node+1]+=lazy[node];
lazy[node]=0;
}
update(node,left,right,s1,s2,add)
//node-number of a node, [left,right]-segment operated by this node, [s1,s2]-segment on which we want to add "add" value
{
pass(node)
if([left,right] and [s1,s2] have no intersections) stop the function
if([left,right] is fully contained in [s1,s2]) // add the "add" value to this node's lazy and val
{
val[node]+=add
lazy[node]+=add
stop the function
}
update(values of the left son)
update(values of the right son)
values_from_sons(node)
//placing this function here updates this node's values when some of its descendants were changed
}
read(node,left,right,s1,s2)
//node-number of a node, [left,right]-segment operated by this node, [s1,s2]-segment for which we want an answer
// this function returns 2 values - minimum from a [s1,s2] segment, and number of values equal to this minimum
{
pass(node)
if([left,right] and [s1,s2] have no intersections) return {INF,0}; //return neutral value of min operation
if([left,right] is fully contained in [s1,s2]) return {val[node],min_count[node]}
vl=read(values of the left son)
vr=read(values of the right son)
if(vl's minimum < vr's minimum)
{
//vl has the lower minimum, so the answer for this node is vl
return vl
}
else if(vl's minimum > vr's minimum)
{
//vr has the lower minimum, so the answer for this node is vr
return vr
}
else
{
//left and right son have the same minimum, and their minimums come from non-intersecting positions, hence we can add the counts
return {vl's minimum, vl's count of minimums + vr's count of minimums};
}
}
ini()
//builds the tree. Remember that you have to call it before using any of the functions above
{
//Since all elements are set to 0 at the beginning, we don't have to worry about the initial values,
// we just have to set the min_count table properly
for(each leaf[node that has no sons])
{
min_count[leaf]=1;
}
for(x=MAX_N-1; x>0; x--)
{
min_count[x]=min_count[2*x]+min_count[2*x+1]
}
}
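For reference, here is a compact Python sketch of the same idea (not the original code; it assumes 0-indexed, inclusive ranges and that the array values never drop below 0, so a minimum of 0 really does mean a zero):

class MinCountTree:
    def __init__(self, n):
        self.n = n
        self.mn = [0] * (4 * n)      # minimum of the node's segment
        self.cnt = [0] * (4 * n)     # how many elements equal that minimum
        self.lazy = [0] * (4 * n)    # pending addition for the whole subtree
        self._build(1, 0, n - 1)

    def _build(self, node, l, r):
        if l == r:
            self.cnt[node] = 1
            return
        m = (l + r) // 2
        self._build(2 * node, l, m)
        self._build(2 * node + 1, m + 1, r)
        self._pull(node)

    def _pull(self, node):
        a, b = 2 * node, 2 * node + 1
        self.mn[node] = min(self.mn[a], self.mn[b])
        self.cnt[node] = (self.cnt[a] if self.mn[a] == self.mn[node] else 0) \
                       + (self.cnt[b] if self.mn[b] == self.mn[node] else 0)

    def _push(self, node):
        if self.lazy[node]:
            for ch in (2 * node, 2 * node + 1):
                self.mn[ch] += self.lazy[node]
                self.lazy[ch] += self.lazy[node]
            self.lazy[node] = 0

    def add(self, s1, s2, x, node=1, l=0, r=None):
        if r is None:
            r = self.n - 1
        if s2 < l or r < s1:                      # no intersection
            return
        if s1 <= l and r <= s2:                   # fully contained
            self.mn[node] += x
            self.lazy[node] += x
            return
        self._push(node)
        m = (l + r) // 2
        self.add(s1, s2, x, 2 * node, l, m)
        self.add(s1, s2, x, 2 * node + 1, m + 1, r)
        self._pull(node)

    def read(self, s1, s2, node=1, l=0, r=None):  # returns (minimum, count)
        if r is None:
            r = self.n - 1
        if s2 < l or r < s1:
            return (float('inf'), 0)              # neutral element for min
        if s1 <= l and r <= s2:
            return (self.mn[node], self.cnt[node])
        self._push(node)
        m = (l + r) // 2
        vl = self.read(s1, s2, 2 * node, l, m)
        vr = self.read(s1, s2, 2 * node + 1, m + 1, r)
        if vl[0] != vr[0]:
            return min(vl, vr)
        return (vl[0], vl[1] + vr[1])

t = MinCountTree(4)
t.add(2, 3, 3)                    # array: 0 0 3 3
mn, c = t.read(1, 2)
print(c if mn == 0 else 0)        # 1 zero on [1,2]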
The details are a bit cringe, fair warning lol:
I want to set up meters on the floor of my building to catch someone; assume my floor is a number line from 0 to length L. The specific type of meter I am designing has a radius of detection of 4.7 meters in the -x and +x directions (a detection diameter of 9.4 meters). I want to set them up in such a way that if the person I am trying to find sets foot anywhere on the floor, I will know. However, I can't just set up a meter anywhere (it may annoy other residents); therefore, there are only n valid locations where I can set up a meter. Additionally, these meters are expensive and time-consuming to make, so I would like to use as few as possible.
For simplicity, you can assume the meter has 0 width, and that each valid location is just a point on the aforementioned number line. What is a greedy algorithm that places as few meters as possible, while being able to detect the entire hallway of length L like I want it to, or, if detecting the entire hallway is not possible, will output false for the set of n locations I have (and, if it isn't able to detect the whole hallway, still uses as few meters as possible while attempting to do so)?
Edit: some clarification on being able to detect the entire hallway or not
Given:
L (hallway length)
a list of N valid positions (p_0 ... p_{N-1}) at which to place a meter of radius 4.7
You can determine in O(N) either a valid and minimal ("good") covering of the whole hallway or a proof that no such covering exists given the constraints as follows (pseudo-code):
// total = total length;
// start = current starting position, initially 0
// possible = list of possible meter positions
// placed = list of (optimal) meter placements, initially empty
boolean solve(float total, float start, List<Float> possible, List<Float> placed):
if (total-start <= 0):
return true; // problem solved with no additional meters - woo!
else:
Float next = extractFurthestWithinRange(start, possible, 4.7);
if (next == null):
return false; // no way to cover end of hall: report failure
else:
placed.add(next); // placement decided
return solve(total, next + 4.7, possible, placed);
Where extractFurthestWithinRange(float start, List<Float> candidates, float range) returns null if there are no candidates within range of start, or returns the last position p in candidates such that p <= start + range -- and also removes p and every candidate c such that c <= p.
The key here is that, by always choosing to place a meter in the next position that a) leaves no gaps and b) is furthest from the previously-placed position, we are simultaneously creating a valid covering (= no gaps) and an optimal covering (= no valid covering could have used fewer meters, because our gaps are already as wide as possible). At each iteration, we either completely solve the problem, or take a greedy bite to reduce it to a (guaranteed) smaller problem.
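For illustration, a minimal Python sketch of this greedy (it assumes the candidate positions are already sorted ascending; the function and variable names are mine, and 4.7 is just the radius from the question):

RADIUS = 4.7

def cover_hallway(length, positions):
    # returns a minimal list of meter positions covering [0, length], or None
    placed = []
    covered = 0.0               # everything in [0, covered] is already detected
    i, n = 0, len(positions)
    while covered < length:
        best = None
        # take the furthest candidate that still overlaps the covered prefix
        while i < n and positions[i] - RADIUS <= covered:
            best = positions[i]
            i += 1
        if best is None:
            return None         # a gap that no remaining meter can close
        placed.append(best)
        covered = best + RADIUS
    return placed

print(cover_hallway(20.0, [3.0, 9.0, 14.0, 18.0]))   # [3.0, 9.0, 18.0]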
Note that there can be other optimal coverings with different meter positions, but they will use the exact same number of meters as those returned from this pseudo-code. For example, if you adapt the code to start from the end of the hallway instead of from the start, the covering would still be good, but the gaps could be rearranged. Indeed, if you need the lexicographically minimal optimal covering, you should use the adapted algorithm that places meters starting from the end:
// remaining = length (starts at hallway length)
// possible = positions to place meters at, starting by closest to end of hallway
// placed = positions where meters have been placed
boolean solve(float remaining, List<Float> possible, Queue<Float> placed):
if (remaining <= 0):
return true; // problem solved with no additional meters - woo!
else:
// extracts and returns the smallest position p such that p >= remaining - range (removing p and all larger positions)
Float next = extractFurthestWithinRange2(remaining, possible, 4.7);
if (next == null):
return false; // no way to cover start of hall: report failure
else:
placed.add(next); // placement decided
return solve(next - 4.7, possible, placed);
To prove that your solution is optimal if it is found, you merely have to prove that it finds the lexicographically last optimal solution.
And you do that by induction on the size of the lexicographically last optimal solution. The case of a zero length floor and no monitor is trivial. Otherwise you demonstrate that you found the first element of the lexicographically last solution. And covering the rest of the line with the remaining elements is your induction step.
Technical note: for this to work, you have to be allowed to place monitoring stations outside of the line.
I have written a piece of code that takes an input HSL color value, and categorizes it as one of eight predefined colors. Because the color of the object I'm measuring is not perfectly "smooth" (the exact H, S, and L values vary pixel to pixel, but there is a limited range each can fall into depending on the color), I want to aggregate the H's, S's, and L's of several pixels on the object, before identifying the resulting HSL value as a particular color.
To give an example, if the object is in fact black, then the H of any of its pixels should be in the range 24 - 33, while the H range is 33 - 37 for yellow. Aggregating several measured H's, instead of relying on a single measurement, will tend to yield a result closer to the middle of whichever range corresponds to the correct color, which desirably reduces the likelihood of needing to interpret the ambiguous 33 case.
I have been using something similar to median as my aggregation algorithm (exact algorithm shown below), but I've come across one case where this doesn't work well. In particular, the H range for a purple object includes both 231 - 240 and 0 - 10 (240 is the maximum H value, so it wraps around). All other colors' possible H's, S's, and L's are single, continuous ranges, for which the median approach (or my modified version) works well, but it isn't ideal in the purple H case, because it yields results which are actually closer to the edge of its range, so the input HSL value is more likely to be mistaken for another color (red's H range is 9 - 14).
Is there a better aggregation algorithm than median for this task, one that will tend to produce results nearer the middle of a color's H, S, and L ranges, even in the wrap around purple H case?
The algorithm:
private hslColor aggregateEachAttribute(hslColor[] hslData)
{
List<double> hAttributes = new List<double>();
List<double> sAttributes = new List<double>();
List<double> lAttributes = new List<double>();
for (int i = 0; i < hslData.Length; i++)
{
hAttributes.Add(hslData[i].H);
}
for (int i = 0; i < hslData.Length; i++)
{
sAttributes.Add(hslData[i].S);
}
for (int i = 0; i < hslData.Length; i++)
{
lAttributes.Add(hslData[i].L);
}
hAttributes.Sort();
sAttributes.Sort();
lAttributes.Sort();
while (hAttributes.Distinct().Count() >= 3)
{
hAttributes.RemoveAll(h => h == hAttributes[0]);
hAttributes.RemoveAll(h => h == hAttributes[hAttributes.Count() - 1]);
}
while (sAttributes.Distinct().Count() >= 3)
{
sAttributes.RemoveAll(s => s == sAttributes[0]);
sAttributes.RemoveAll(s => s == sAttributes[sAttributes.Count() - 1]);
}
while (lAttributes.Distinct().Count() >= 3)
{
lAttributes.RemoveAll(l => l == lAttributes[0]);
lAttributes.RemoveAll(l => l == lAttributes[lAttributes.Count() - 1]);
}
return new hslColor(hAttributes[0], sAttributes[0], lAttributes[0]);
}
If you just have wrap-around to worry about, and your points are either < X or > 240-X for some reasonably small X, you could calculate a median by adding X to your data mod 240, calculating a median, and then subtracting X from the result mod 240.
More generally, you could look for a median or medoid by finding the value that minimizes a sum of distances, where d(x, y) = min(|x-y|, |240 + x - y|, |240 + y - x|). I think you could calculate this in O(n) time after a sort that takes O(n log n), with some rather fiddly code in which you calculate the sum of distances associated with each candidate median or medoid, dividing the points into two sets according to whether they are clockwise or anticlockwise from the current point, and updating the sum of distances incrementally as you move from one candidate median or medoid to the next.
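For example, a small Python sketch of the shift-then-median idea (the split point X, the modulus of 240 and the sample hues are illustrative):

import statistics

def wrapped_median(hues, modulus=240, X=20):
    # assumes every hue is either < X or > modulus - X, i.e. the data
    # clusters around the wrap point
    shifted = [(h + X) % modulus for h in hues]
    return (statistics.median(shifted) - X) % modulus

print(wrapped_median([235, 238, 2, 5, 8]))   # 2, the circular "middle" value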
I got asked the other day about "Outlining a general algorithm for solving a maze with n balls, where the goal is to get all the balls to a given position in the maze (the maze has no exit)". The only rules are that the algorithm must be effective (better than randomly moving the balls) and that every command issued affects all the balls, so that if one ball is moved north, so are all the others, unless they are blocked.
To do this I made some assumptions, namely that
The maze is standard/perfect
The balls cannot block each other
The balls can get to the position asked
A command will let the balls roll until they hit a wall
A command can only be N/S/E/W
The maze is randomly constructed and the balls randomly distributed each time it is reset
All of the maze can be observed at once by the maze-operator
And, to make my algorithm work
The balls are not identical (e.g. the have a number or something on them)
The maze-operator has a photographic memory
Given this, I thought the best idea would be to
Randomly (or in a smart way) move the balls until two balls get to the target position
Save the path from their starting position to their end position.
Identify those balls and where they came from, and for each of them, do 1.
The "break" in this recursive algorithm would be when all the balls have a way to reach the given target (O(log(n)) recursions I think?)
Does this work? Does anyone else have a better algorithm for doing this?
I had another idea involving moving all the balls to the same random position and then moving them all as one ball, but that seemed like a worse algorithm.
Another idea would be to generate a graph (graph theory that is) where all the stable points for a ball would be a node, and the move would be an edge, but I can't see how that doesn't need a lot of brute force to be done.
I would suggest the following algorithm:
Create a data structure for the maze where for each free cell (square) the following is known:
a. the coordinates (row, column)
b. the target cells for the 4 moves (north, east, south, west)
c. the reverse of b: the cells from where marbles can come to this cell (if any).
Perform a BFS starting from the target cell, performing reverse moves with one imaginary marble, and assigning to each visited cell the least number of moves necessary from that cell to reach the target cell. Note that some cells might not get visited this way, meaning that if a marble were placed there, there would be no way to get it to the target cell by performing legal moves. These cells get an infinite distance attributed to them (the initial value).
Create an evaluation function that will assign a cost to a certain configuration of marbles. The suggested evaluation function sums up the squares of the distances of each of the cells occupied by at least one marble. By taking the square, higher distances bring a relatively high cost, so the algorithm will favour moves that improve the position of the marble that has the worst position. This function does not double-count cells that are occupied by more than one marble. That way, configurations where marbles share a cell are favoured.
From the starting position, generate the 4 possible moves with their evaluated cost. Sort them ascending by their cost, and in that order perform a DFS, repeating this step recursively. When the cost becomes zero, a solution is found, and during the immediate backtracking out of recursion, the "path" of moves is returned. When the cost is infinite, then the search is stopped, and the next move is tried, ...etc.
During the search keep a list of visited positions. When a position is visited again, the evaluation function will give it a value of infinity, so that the search will backtrack when this happens.
Here is a JavaScript implementation of the above algorithm:
"use strict";
function createMaze(mazeStr) {
var maze, lines, cell, row, ch, id, r, c, n, m;
maze = {
nodesRowCol: [],
nodes: [],
target: null,
marbles: []
};
id = 0;
lines = mazeStr.split("\n");
for (r = 0; r < lines.length; r++) {
maze.nodesRowCol[r] = row = [];
for (c = 0; c < lines[r].length; c++) {
ch = lines[r].charAt(c);
if (ch !== '#') {
maze.nodes[id] = row[c] = cell = {
row: r,
col: c,
id: id++,
comeFrom: [],
};
// Keep track of target and marbles
if (ch === '*') maze.target = cell;
if (ch === '.') maze.marbles.push(cell);
}
}
}
// Add neighbours
for (n = 0; n < maze.nodes.length; n++) {
cell = maze.nodes[n];
cell.neighbours = [
maze.nodesRowCol[cell.row-1][cell.col], /* north */
maze.nodesRowCol[cell.row][cell.col+1], /* east */
maze.nodesRowCol[cell.row+1][cell.col], /* south */
maze.nodesRowCol[cell.row][cell.col-1] /* west */
];
}
// Add marble moves in two steps
for (n = 0; n < maze.nodes.length; n++) {
cell = maze.nodes[n];
cell.moves = [
cell.neighbours[0] ? cell.neighbours[0].moves[0] : cell, /* north */
null,
null,
cell.neighbours[3] ? cell.neighbours[3].moves[3] : cell, /* west */
];
}
for (n = maze.nodes.length - 1; n >= 0; n--) {
cell = maze.nodes[n];
cell.moves[1] = cell.neighbours[1] ? cell.neighbours[1].moves[1] : cell; /* east */
cell.moves[2] = cell.neighbours[2] ? cell.neighbours[2].moves[2] : cell; /* south */
}
// add reverse-move ("marble can come from") data
for (n = maze.nodes.length - 1; n >= 0; n--) {
cell = maze.nodes[n];
for (m = 0; m < 4; m++) {
if (cell.moves[m] !== cell) cell.moves[m].comeFrom.push(cell);
}
}
return maze;
}
function setDistances(maze) {
var n, cell, distance, stack, newStack, i;
// clear distance information
for (n = 0; n < maze.nodes.length; n++) {
maze.nodes[n].distance = Number.POSITIVE_INFINITY;
}
// set initial distance
cell = maze.target;
cell.distance = distance = 0;
// BFS loop to set the distance for each cell that can be reached
stack = cell.comeFrom.slice();
while (stack.length) {
distance++;
newStack = [];
for (i = 0; i < stack.length; i++) {
cell = stack[i];
if (distance < cell.distance) {
cell.distance = distance;
newStack = newStack.concat(cell.comeFrom);
}
}
stack = newStack;
}
}
function evaluatedPosition(position, visited) {
// Assign heuristic cost to position
var m, ids;
position.cost = 0;
ids = []; // keep track of marble positions
for (m = 0; m < position.marbles.length; m++) {
// If multiple marbles are at the same cell, only account for that cell once.
// This will favour such positions:
if (ids[position.marbles[m].id] === undefined) {
// Make higher distances cost a lot, so that the incentive
// is to improve the position of the worst-placed marble
position.cost += Math.pow(position.marbles[m].distance, 2);
ids[position.marbles[m].id] = position.marbles[m].id;
}
}
// Assign some unique string, identifying the marble configuration
position.id = ids.join(',');
// If position was already visited before, give it the maximum cost
if (visited[position.id]) position.cost = Number.POSITIVE_INFINITY;
// Mark position as visited
visited[position.id] = 1;
return position;
}
function createMove(dir, marbles, visited) {
var m, movedMarbles;
movedMarbles = [];
for (m = 0; m < marbles.length; m++) {
movedMarbles[m] = marbles[m].moves[dir];
}
return evaluatedPosition({
dir: dir,
marbles: movedMarbles,
}, visited);
}
function solve(maze) {
var visited = {}; // nothing visited yet
function recurse (position) {
var ids, m, moves, i, path;
if (position.cost == 0) return []; // marbles are all on target spot.
if (!isFinite(position.cost)) return false; // no solution
// get move list
moves = [];
for (i = 0; i < 4; i++) {
moves[i] = createMove(i, position.marbles, visited);
}
// apply heuristic: sort the 4 moves by ascending cost
moves.sort(function (a,b) { return a.cost - b.cost });
for (i = 0; i < 4; i++) {
//console.log('=== move === ' + moves[i].dir);
path = recurse(moves[i]);
if (path !== false) return [moves[i].dir].concat(path);
}
return false; // no solution found
}
// Enrich initial position with cost, and start recursive search
return recurse(evaluatedPosition({
marbles: maze.marbles
}, visited));
}
// # = wall
// * = target
// . = marble
var mazeStr = `
###########
# # #*#
# # #.# .#
# #. #.# #
# # # ### #
# # #
###########
`.trim();
var maze = createMaze(mazeStr);
setDistances(maze);
console.log('#=wall, .=marble, *=target\n\n' + mazeStr);
var moves = solve(maze);
console.log('moves (0=north,1=east,2=south,3=west): ' + moves);
The found solution is not necessarily optimal. It performs an evaluation with depth 1. For better solutions, the algorithm could do an evaluation at greater depths.
The maze and allowed movements can be modelled as a deterministic finite automaton (DFA) on an alphabet of four symbols. Each cell in the maze is a DFA state, and cell x has a transition to cell y on symbol s whenever a ball at cell x would move to cell y when issued the command s.
The algorithm has three stages:
Construct a DFA consisting of only those states reachable by any ball in the maze, by some sequence of commands.
Find any synchronizing word for the DFA. A synchronizing word, or "reset word", is any sequence of symbols for which all initial states end at the same state. Note that we really only need a word which synchronizes all the initial states of the balls, not every state in the DFA.
Find a shortest sequence of moves from the reset word's end state to the target position in the maze. This can be done using any shortest-path algorithm, e.g. breadth-first search (BFS).
This needs some explanation.
Firstly, not every DFA has a reset word, but if the DFA constructed in step 1 has no reset word then, by definition, no sequence of commands can bring all balls to the same target cell. So this algorithm will solve every solvable instance of the problem.
Secondly, finding a minimum-length reset word is a hard problem which takes exponential time in the worst case. But the question says only that "the algorithm must be effective (better than randomly moving the balls)", so any reset word will do.
The simplest way to construct a reset word is probably using breadth-first search on the Cartesian product of the DFA with itself. For a DFA with n states, it takes O(n²) time to find a word which synchronizes two states; this must be repeated up to k - 1 times to synchronize the k initial states of the balls, giving an O(kn²) running time, and a reset word of length O(kn²).
Put in plainer language, the simplest form of this algorithm uses BFS to get two of the balls into the same place, then BFS again to get a third ball into the same place as those two, and so on, until all the balls are in the same place. Then it uses BFS to get them all to the target in unison. But the algorithm can be improved by plugging in a better reset-word-finding algorithm; generally, a reset word shorter than n² symbols should exist even in the worst case (this is believed but unproven), which is much better than kn².
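Here is a rough Python sketch of that pairwise-merging procedure, assuming the maze has already been preprocessed into a table move[cell][direction] that gives the cell a ball ends up in after one command (the table and names are illustrative, not part of the original answer):

from collections import deque

def merge_word(move, a, b):
    # BFS on the product automaton: shortest command word sending both a and b
    # to the same cell, or None if the pair cannot be synchronized
    start = (a, b)
    prev = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == state[1]:            # synchronized: rebuild the word
            word = []
            while prev[state] is not None:
                state, d = prev[state]
                word.append(d)
            return word[::-1]
        for d in range(4):                  # 0..3 = N/E/S/W
            nxt = (move[state[0]][d], move[state[1]][d])
            if nxt not in prev:
                prev[nxt] = (state, d)
                queue.append(nxt)
    return None

def reset_word(move, balls):
    # synchronize all balls by repeatedly merging two of the distinct positions
    balls, word = list(balls), []
    while len(set(balls)) > 1:
        a, b = sorted(set(balls))[:2]
        w = merge_word(move, a, b)
        if w is None:
            return None                     # the instance is unsolvable
        word += w
        for d in w:                         # replay the word on every ball
            balls = [move[c][d] for c in balls]
    return word

A final BFS from the single shared cell to the target (treating the problem as an ordinary one-ball maze) then finishes the job, as described in step 3.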
I want to implement an iterative algorithm that calculates a weighted average. The specific weight law does not matter, but it should be close to 1 for the newest values and close to 0 for the oldest.
The algorithm should be iterative, i.e. it should not remember all previous values. It should know only the newest value and some aggregate information about the past, like previous values of the average, sums, counts, etc.
Is it possible?
For example, the following algorithm can be:
void iterate(double value) {
sum *= 0.99;
sum += value;
count++;
avg = sum / count;
}
It will give exponentially decreasing weights, which may not be good. Is it possible to have step-decreasing weights or something?
EDIT 1
The requirements for the weighting law are as follows:
1) The weight decreases into the past
2) It has some mean or characteristic duration, so that values older than this duration matter much less than newer ones
3) I should be able to set this duration
EDIT 2
I need the following. Suppose v_i are the values, where v_1 is the first. Also suppose w_i are the weights, but w_0 is the weight of THE LAST (newest) value.
So, after first value came I have first average
a_1 = v_1 * w_0
After the second value v_2 came, I should have average
a_2 = v_1 * w_1 + v_2 * w_0
With next value I should have
a_3 = v_1 * w_2 + v_2 * w_1 + v_3 * w_0
Note that the weight profile moves with me as I move along the value sequence.
I.e. each value does not keep its own weight for all time. My goal is to have this weight get lower as the value recedes into the past.
First a bit of background. If we were keeping a normal average, it would go like this:
average(a) = a
average(a,b) = (average(a)+b)/2
average(a,b,c) = (average(a,b)*2 + c)/3
average(a,b,c,d) = (average(a,b,c)*3 + d)/4
As you can see here, this is an "online" algorithm and we only need to keep track of two pieces of data: 1) the total count of numbers in the average, and 2) the average itself. Then we can un-divide the average by the total, add in the new number, and divide by the new total.
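As a tiny sketch of that bookkeeping (illustrative names):

def update_average(avg, count, new_value):
    # un-divide by the old count, add the new number, divide by the new count
    return (avg * count + new_value) / (count + 1)

avg = 4.0                          # average of [4]
avg = update_average(avg, 1, 8)    # average of [4, 8]    -> 6.0
avg = update_average(avg, 2, 9)    # average of [4, 8, 9] -> 7.0
print(avg)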
Weighted averages are a bit different. It depends on what kind of weighted average. For example if you defined:
weightedAverage(a,wa, b,wb, c,wc, ..., z,wz) = a*wa + b*wb + c*wc + ... + z*wz
or
weightedAverage(elements, weights) = elements·weights
...then you don't need to do anything besides add the new element*weight! If however you defined the weighted average akin to an expected-value from probability:
weightedAverage(elements,weights) = elements·weights / sum(weights)
...then you'd need to keep track of the total weights. Instead of undividing by the total number of elements, you undivide by the total weight, add in the new element*weight, then divide by the new total weight.
Alternatively you don't need to undivide, as demonstrated below: you can merely keep track of the temporary dot product and weight total in a closure or an object, and divide it as you yield (this can help a lot with avoiding numerical inaccuracy from compounded rounding errors).
In python this would be:
def makeAverager():
    dotProduct = 0
    totalWeight = 0
    def averager(newValue, weight):
        nonlocal dotProduct, totalWeight
        dotProduct += newValue*weight
        totalWeight += weight
        return dotProduct/totalWeight
    return averager
Demo:
>>> averager = makeAverager()
>>> [averager(value,w) for value,w in [(100,0.2), (50,0.5), (100,0.1)]]
[100.0, 64.28571428571429, 68.75]
>>> averager(10,1.1)
34.73684210526316
>>> averager(10,1.1)
25.666666666666668
>>> averager(30,2.0)
27.4
> But my task is to have average recalculated each time new value arrives having old values reweighted. –OP
Your task is almost always impossible, even with exceptionally simple weighting schemes.
You are asking to, with O(1) memory, yield averages with a changing weighting scheme. For example, {values·weights1, (values+[newValue2])·weights2, (values+[newValue2,newValue3])·weights3, ...} as new values are being passed in, for some nearly arbitrarily changing weights sequence. This is impossible because the aggregation is not injective: once you merge the numbers together, you lose a massive amount of information. For example, even if you had the weight vector, you could not recover the original value vector, or vice versa. There are only two cases I can think of where you could get away with this:
Constant weights such as [2,2,2,...2]: this is equivalent to an on-line averaging algorithm, which you don't want because the old values are not being "reweighted".
The relative weights of previous answers do not change. For example you could do weights of [8,4,2,1], and add in a new element with arbitrary weight like ...+[1], but you must increase all the previous by the same multiplicative factor, like [16,8,4,2]+[1]. Thus at each step, you are adding a new arbitrary weight, and a new arbitrary rescaling of the past, so you have 2 degrees of freedom (only 1 if you need to keep your dot-product normalized). The weight-vectors you'd get would look like:
[w0]
[w0*(s1), w1]
[w0*(s1*s2), w1*(s2), w2]
[w0*(s1*s2*s3), w1*(s2*s3), w2*(s3), w3]
...
Thus any weighting scheme you can make look like that will work (unless you need to keep the thing normalized by the sum of weights, in which case you must then divide the new average by the new sum, which you can calculate by keeping only O(1) memory). Merely multiply the previous average by the new s (which will implicitly distribute over the dot-product into the weights), and tack on the new +w*newValue.
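A small Python sketch of that update, where s is the per-step rescaling of the past and w is the weight of the new value (the names are illustrative):

def make_reweighter():
    dot = 0.0            # running values-times-weights sum
    total = 0.0          # running sum of weights (for normalization)
    def step(new_value, w, s):
        nonlocal dot, total
        dot = dot * s + w * new_value     # rescale the past, append the new term
        total = total * s + w
        return dot / total
    return step

step = make_reweighter()
print(step(10.0, 1.0, 1.0))   # 10.0
print(step(20.0, 1.0, 0.5))   # old weight shrinks to 0.5 -> (5 + 20) / 1.5 = 16.66...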
I think you are looking for something like this:
void iterate(double value) {
    count++;
    weight = max(0.0, 1.0 - (count / 1000.0));
    // un-divide by the old total weight, add the new weighted value,
    // then divide by the new total weight
    avg = (avg * total_weight + weight * value) / (total_weight + weight);
    total_weight += weight;
}
Here I'm assuming you want the weights to sum to 1. As long as you can generate a relative weight without it changing in the future, you can end up with a solution which mimics this behavior.
That is, suppose you defined your weights as a sequence {s_0, s_1, s_2, ..., s_n, ...} and defined the input as sequence {i_0, i_1, i_2, ..., i_n}.
Consider the form: sum(s_0*i_0 + s_1*i_1 + s_2*i_2 + ... + s_n*i_n) / sum(s_0 + s_1 + s_2 + ... + s_n). Note that it is trivially possible to compute this incrementally with a couple of aggregation counters:
int counter = 0;
double numerator = 0;
double denominator = 0;
void addValue(double val)
{
    double weight = calculateWeightFromCounter(counter);
    counter++;   // advance so the next value gets the next weight
    numerator += weight * val;
    denominator += weight;
}
double getAverage()
{
if (denominator == 0.0) return 0.0;
return numerator / denominator;
}
Of course, calculateWeightFromCounter() in this case shouldn't generate weights that sum to one -- the trick here is that we average by dividing by the sum of the weights so that in the end, the weights virtually seem to sum to one.
The real trick is how you implement calculateWeightFromCounter(). You could simply return the counter itself, for example; however, note that the newest weight would not necessarily be close to the sum of all the weights, so you may not end up with the exact properties you want. (It's hard to say since, as mentioned, you've left a fairly open problem.)
This is too long to post in a comment, but it may be useful to know.
Suppose you have:
w_0*v_n + ... + w_n*v_0 (we'll call this w[0..n]*v[n..0] for short)
Then the next step is:
w_0*v_{n+1} + ... + w_{n+1}*v_0 (and this is w[0..n+1]*v[n+1..0] for short)
This means we need a way to calculate w[1..n+1]*v[n..0] from w[0..n]*v[n..0].
It's certainly possible that v[n..0] is 0, ..., 0, z, 0, ..., 0 where z is at some location x.
If we don't have any 'extra' storage, then f(z*w(x))=z*w(x + 1) where w(x) is the weight for location x.
Rearranging the equation, w(x + 1) = f(z*w(x))/z. Well, w(x + 1) better be constant for a constant x, so f(z*w(x))/z better be constant. Hence, f must let z propagate -- that is, f(z*w(x)) = z*f(w(x)).
But here again we have an issue. Note that if z (which could be any number) can propagate through f, then w(x) certainly can. So f(z*w(x)) = w(x)*f(z). Thus f(w(x)) = w(x)/f(z).
But for a constant x, w(x) is constant, and thus f(w(x)) better be constant, too. w(x) is constant, so f(z) better be constant so that w(x)/f(z) is constant. Thus f(w(x)) = w(x)/c where c is a constant.
So, f(x)=c*x where c is a constant when x is a weight value.
So w(x+1) = c*w(x).
That is, each weight is a multiple of the previous. Thus, the weights take the form w(x)=m*b^x.
Note that this assumes the only information f has is the last aggregated value. Note that at some point you will be reduced to this case unless you're willing to store a non-constant amount of data representing your input. You cannot represent an infinite length vector of real numbers with a real number, but you can approximate them somehow in a constant, finite amount of storage. But this would merely be an approximation.
Although I haven't rigorously proven it, it is my conclusion that what you want is impossible to do with a high degree of precision, but you may be able to use log(n) space (which may as well be O(1) for many practical applications) to generate a quality approximation. You may be able to use even less.
I tried to practically code something (in Java). As has been said, your goal is not achievable exactly. You can only compute the average from some number of last remembered values. If you don't need to be exact, you can approximate the older values. I tried to do it by remembering the last 5 values exactly, and older values only SUMmed in groups of 5, remembering the last 5 such SUMs. Then the memory is O(2n) for covering the last n + n*n values. This is a very rough approximation.
You can modify the "lastValues" and "lastAggregatedSums" array sizes as you want. See this ASCII-art picture, which tries to display a graph of the last values, showing that the first columns (older data) are remembered only as aggregated values (not individually), and only the 5 most recent values are remembered individually.
values:
#####
##### ##### #
##### ##### ##### # #
##### ##### ##### ##### ## ##
##### ##### ##### ##### ##### #####
time: --->
Challenge 1: My example doesn't count weights, but I think it shouldn't be a problem for you to add weights for the "lastAggregatedSums" appropriately - the only problem is that if you want lower weights for older values, it would be harder, because the array is rotating, so it is not straightforward to know which weight goes with which array member. Maybe you can modify the algorithm to always "shift" values in the array instead of rotating? Then adding weights shouldn't be a problem.
Challenge 2: The arrays are initialized with 0 values, and those values count towards the average from the beginning, even when we haven't received enough values. If you are running the algorithm for a long time, you probably don't mind that it is warming up for some time at the beginning. If you do, you can post a modification ;-)
public class AverageCounter {
private float[] lastValues = new float[5];
private float[] lastAggregatedSums = new float[5];
private int valIdx = 0;
private int aggValIdx = 0;
private float avg;
public void add(float value) {
lastValues[valIdx++] = value;
if(valIdx == lastValues.length) {
// count average of last values and save into the aggregated array.
float sum = 0;
for(float v: lastValues) {sum += v;}
lastAggregatedSums[aggValIdx++] = sum;
if(aggValIdx >= lastAggregatedSums.length) {
// rotate aggregated values index
aggValIdx = 0;
}
valIdx = 0;
// clear the just-aggregated values so they are not counted twice below
java.util.Arrays.fill(lastValues, 0);
}
float sum = 0;
for(float v: lastValues) {sum += v;}
for(float v: lastAggregatedSums) {sum += v;}
avg = sum / (lastValues.length + lastAggregatedSums.length * lastValues.length);
}
public float getAvg() {
return avg;
}
}
You can combine (as a weighted sum) exponential means with different effective window sizes (N) in order to get the desired weights.
Use more exponential means to define your weight profile in more detail.
(More exponential means also means storing and calculating more values, so this is the trade-off.)
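For instance, a small Python sketch that mixes two exponential means with different effective windows (the smoothing factors and the mixing weight are arbitrary illustrative choices):

def make_blended_ema(alpha_fast=0.5, alpha_slow=0.05, mix=0.5):
    fast = slow = None
    def update(value):
        nonlocal fast, slow
        fast = value if fast is None else alpha_fast * value + (1 - alpha_fast) * fast
        slow = value if slow is None else alpha_slow * value + (1 - alpha_slow) * slow
        return mix * fast + (1 - mix) * slow
    return update

ema = make_blended_ema()
for v in [1, 2, 3, 10, 10, 10]:
    print(ema(v))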
A memoryless solution is to calculate the new average from a weighted combination of the previous average and the new value:
average = (1 - P) * average + P * value
where P is an empirical constant, 0 <= P <= 1
expanding gives:
average = sum over i of (weight[i] * value[i])
where value[0] is the newest value, and
weight[i] = P * (1 - P) ^ i
When P is low, historical values are given higher weighting.
The closer P gets to 1, the more quickly it converges to newer values.
When P = 1, it's a regular assignment and ignores previous values.
If you want to maximise the contribution of value[N], maximize
weight[N] = P * (1 - P) ^ N
where 0 <= P <= 1
I discovered weight[N] is maximized when
P = 1 / (N + 1)
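A quick numeric check of that claim (just scanning a grid of P values):

N = 9
best_P = max((p / 1000 for p in range(1, 1000)),
             key=lambda p: p * (1 - p) ** N)
print(best_P, 1 / (N + 1))   # both are 0.1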