How to implement a counter that decays over time? - algorithm

Requirements of special counter
I want to implement a special counter: all increment operations time out after a fixed period of time (say 30 days).
An example:
Day 0: counter = 0. TTL = 30 days
Day 1: increment counter (+1)
Day 2: increment counter (+1)
Day 3: value of counter == 2
Day 31: value of counter == 1
Day 32: value of counter == 0
Naive solution
A naïve implementation is to maintain a set of timestamps, where each timestamp equals the time of an increment. The value of the counter equals the size of the set after subtracting all timestamps that have timed out.
This naïve counter uses O(n) space (the size of the set), with O(n) lookup and O(1) insert. The values are exact.
Better solution (for me)
Trade accuracy for speed and memory.
I want a counter with O(1) lookup and insert and O(1) space; the accuracy may be less than exact.
Alternatively, I would accept O(log n) space and lookup.
The counter representation should be suited for storage in a database field, i.e., I should be able to update and poll the counter rapidly without too much (de)serialization overhead.
I'm essentially looking for a counter that resembles a HyperLogLog counter, but for a different type of approximate count: decaying increments vs. number of distinct elements
How could I implement such a counter?

If you can live with 24 hour granularity then you can bucket your counter into k buckets where k is the number of days in your longest TTL.
Incrementing is an O(1) operation - simply increment the value in the bucket with index (k-TTL), as well as the current sum total.
Reading is another O(1) operation as you simply read the current sum total.
A cronjob pops off the now-expired bucket each night (and adds a bucket with value 0 at the opposite end) and decreases your counter by the sum in that bucket (this is a background task so it would not affect your insert or read operations)
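For illustration, here is a minimal Python sketch of this bucketed counter, assuming a single fixed TTL of k days so every increment lands in today's bucket (the class and method names are placeholders, and rotate() stands in for the nightly cronjob):

from collections import deque

class BucketedCounter:
    """Approximate decaying counter with 1-day granularity and a fixed TTL of k days."""
    def __init__(self, k=30):
        self.buckets = deque([0] * k, maxlen=k)  # buckets[-1] is today, buckets[0] expires next
        self.total = 0

    def increment(self, amount=1):
        # O(1): add to today's bucket and to the running sum total
        self.buckets[-1] += amount
        self.total += amount

    def value(self):
        # O(1): just read the maintained sum total
        return self.total

    def rotate(self):
        # Run once per day: drop the expired bucket, subtract its contents
        # from the total, and open a fresh bucket for the new day.
        self.total -= self.buckets[0]
        self.buckets.append(0)  # with maxlen=k this discards buckets[0] automatically

Storage is a fixed k integers plus the running total, so it fits easily in a database field.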

Decaying counter based on annealing
Here is a counter that is based on annealing (implemented in Python).
The counter exponentially decays over time; controlled by the rate alpha
When you read and write the counter, you provide a time index (increment or read the counter at time t)
You can read the counter in the present and future (w.r.t. index of last increment), but not in the past
Time indices of sequential increments must be weakly monotonically increasing
The algorithm is exact w.r.t. the alternative formulation (annealing vs. TTL). It has O(1) increment and read. It consumes O(1) space, in fact just three floating point fields.
class AnnealingCounter():

    def __init__(self, alpha=0.9):
        self.alpha = alpha   # rate of decay
        self.last_t = .0     # time of last increment
        self.heat = .0       # value of counter at last_t

    def increment(self, t=None, amount=1.0):
        """
        t is a floating point temporal index.
        If t is not provided, the value of last_t is used
        """
        if t is None: t = self.last_t
        elapsed = t - self.last_t
        if elapsed < .0:
            raise ValueError('Cannot increment the counter in the past, i.e. before the last increment')
        self.heat = amount + self.heat * (self.alpha ** elapsed)
        self.last_t = t

    def get_value(self, t=None):
        """
        t is a floating point temporal index.
        If t is not provided, the value of last_t is used
        """
        if t is None: t = self.last_t
        elapsed = t - self.last_t
        if elapsed < .0:
            raise ValueError('Cannot read the counter in the past, i.e. before the last increment')
        return self.heat * (self.alpha ** elapsed)

    def __str__(self):
        return 'Counter has value {} at time {}'.format(self.heat, self.last_t)

    def __repr__(self):
        return self.__str__()
Here is how to use it:
>>> c = AnnealingCounter(alpha=0.9)
Counter has value 0.0 at time 0.0
>>> c.increment() # increment by 1.0, but don't move time forward
Counter has value 1.0 at time 0.0
>>> c.increment(amount=3.2, t=0.5) # increment by 3.2 and move time forward (t=0.5)
Counter has value 4.14868329805 at time 0.5
>>> c.increment() # increment by 1.0, but don't move time forward
Counter has value 5.14868329805 at time 0.5
>>> c.get_value() # get value as after last increment (t=0.5)
5.148683298050514
>>> c.get_value(t=2.0)  # get future value (t=2.0)
4.396022866630942

Since the increments expire in the same order as they happen, the timestamps form a simple queue.
The current value of the counter can be stored separately in O(1) additional memory. At the start of each operation (insert or query), while the front of the queue is expired, it's popped out of the queue, and the counter is decreased.
Note that each of the n timestamps is created and popped out once. Thus you have O(1) amortized time to access the current value, and O(n) memory to store the non-expired timestamps. The actual peak memory usage is also bounded by the TTL multiplied by the rate of new timestamp insertions.
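For completeness, a short Python sketch of this queue-based counter (the TTL in seconds and the names are only illustrative):

import time
from collections import deque

class ExpiringCounter:
    """Exact counter whose increments expire after a fixed TTL; amortized O(1) per operation."""
    def __init__(self, ttl_seconds=30 * 24 * 3600):
        self.ttl = ttl_seconds
        self.timestamps = deque()  # increments in insertion order, oldest at the left
        self.count = 0

    def _expire(self, now):
        # Each timestamp is pushed and popped exactly once, so this loop
        # costs O(1) amortized over all operations.
        while self.timestamps and now - self.timestamps[0] > self.ttl:
            self.timestamps.popleft()
            self.count -= 1

    def increment(self, now=None):
        now = time.time() if now is None else now
        self._expire(now)
        self.timestamps.append(now)
        self.count += 1

    def value(self, now=None):
        now = time.time() if now is None else now
        self._expire(now)
        return self.count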

Related

How to implement a time interval as an exponential function?

I need to implement an event (a stock loss error) that occurs between time intervals as a renewal process. With every day of non-occurrence, the probability of occurrence on the following day increases, based on an exponential distribution: "The time intervals are based on an exponential distribution with a mean time between stock loss events (TBSLE). The frequency of (stock loss-)occurrence is the reciprocal of TBSLE. The expected value for the mean stock loss quantity can be estimated as 2.05."
First try:
def stockLossError(self):
    stockLossErrorProbability = 0
    inverseLambda =
    errors = 0
    randomnumber = np.random.exponential(scale=inverseLambda, size=(1,1))
    if (randomnumber > stockLossErrorProbability):
        self.daysSinceLastError += 1
        self.errors += 2.05
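The question does not include a working solution; as a minimal sketch of how exponentially distributed inter-event times are usually drawn with NumPy (TBSLE and the mean loss quantity 2.05 come from the quote above; the function name and structure are only illustrative):

import numpy as np

def simulate_stock_loss(horizon_days, tbsle_days, mean_loss=2.05, rng=None):
    """Renewal process: exponential inter-event times with mean TBSLE,
    accumulating the expected loss quantity at every event."""
    rng = np.random.default_rng() if rng is None else rng
    t, events, total_loss = 0.0, 0, 0.0
    while True:
        t += rng.exponential(scale=tbsle_days)  # waiting time until the next stock loss event
        if t > horizon_days:
            break
        events += 1
        total_loss += mean_loss
    return events, total_loss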

Verilog: Minimal (hardware) algorithm for multiplying a binary input to its delayed form

I have a binary input in (1 bit serial input) which I want to delay by M clock pulses and then multiply (AND) the 2 signals. In other words, I want to evaluate the sum:
sum(in[n]*in[n+M])
where n is expressed in terms of number of clock pulses.
The most straightforward way is to store in a memory buffer in_dly the latest M samples of in. In Verilog, this would be something like:
always @(posedge clock ...)
  ...
  in_dly[M-1:0] <= {in_dly[M-2:0], in};
  if (in_dly[M-1] & in)
    sum <= sum + 'd1;
  ...
While this works in theory, with large values of M (can be ~2000), the size of the buffer is not practical. However, I was thinking to take advantage of the fact that the input signal is 1 bit and it is expected to toggle only a few times (~1-10) during M samples.
This made me think of storing the toggle times from 2k*M to (2k+1)*M in an array a and from (2k+1)*M to (2k+2)*M in an array b (k is just an integer used to generalize the idea):
reg [10:0] a[0:9]; //2^11 > max(M)=2000 and "a" has max 10 elements
reg [10:0] b[0:9]; //same as "a"
Therefore, during M samples, in = 'b1 during intervals [a[1],a[2]], [a[3],a[4]], etc. Similarly, during the next M samples, the input is high during [b[1],b[2]], [b[3],b[4]], etc. Now, the sum is the "overlapping" of these intervals:
min(b[2],a[2])-max(b[1],a[1]), if b[2]>a[1] and b[1]<a[2]; 0 otherwise
Finally, the array b becomes the new array a and the next M samples are evaluated and stored into b. The process is repeated until the end of in.
Comparing this "optimized" method to the initial one, there is a significant gain in hardware: initially 2000 bits were stored, whereas now only 220 bits are stored (for this example). However, the number is still large and not very practical.
I would greatly appreciate if somebody could suggest a more optimal (hardware-wise) way or a simpler way (algorithm-wise) of doing this operation. Thank you in advance!
Edit:
Thanks to Alexey's idea, I optimized the algorithm as follows:
Given a set of delays M[i] for i=1 to 10 with M[1]<M[2]<..<M[10], and an input binary array in, we need to compute the outputs:
y[i] = sum(in[n]*in[n+M[i]]) for n=1 to length(in).
We then define 2 empty arrays a[j] and b[j] with j = 1 to ~5. Whenever in has a 0->1 transition, the lowest-index empty element a[j] is "activated" and will increment at each clock cycle. Same goes for b[j] at 1->0 transitions. Basically, the pairs (a[j],b[j]) represent the portions of in equal to 1.
Whenever a[j] equals M[i], the sum y[i] will increment by 1 at each cycle while in = 1, until b[j] equals M[i]. Once a[j] equals M[10], a[j] is cleared. Same goes for b[j]. This is repeated until the end of in.
Based on the same numerical assumptions as the initial question, a total of 10 arrays (a and b) of 11 bits allow the computation of the 10 sums, corresponding to 10 different delays M[i]. This is almost 20 times better (in terms of resources used) than my initial approach. Any further optimization or idea is welcomed!
Try this:
make an array A,
every time in==1, get a free element of A and write M to it,
every clock, decrement all non-zero elements of A,
once a decremented element becomes zero, test in; if in==1 - sum++.
Edit: the algorithm above was intended for input like
- 00000000000010000000100010000000, while LLDinu really needs
- 11111111111110000000011111000000, so here is the modified algorithm:
make an array (ring buffer) A,
every time in toggles, get a free element of A and write M to it,
every clock, decrement all non-zero elements of A,
every clock, test in; if in==1 and the number of non-zero elements of A is even - sum++.
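For reference, here is a rough Python model of this modified algorithm (software only, not synthesizable; the delay line is assumed to start at 0): a countdown of M starts at every toggle of in, and sum increments whenever in == 1 and the number of running countdowns is even.

def delayed_and_sum(bits, M):
    """Reference model: computes sum(in[n] * in[n-M]) by counting toggles
    in the last M samples instead of storing M delayed samples."""
    countdowns = []   # models the ring buffer A: one countdown of M cycles per recent toggle
    prev = 0          # delay line assumed to start at 0
    total = 0
    for bit in bits:
        # decrement all running countdowns and drop the ones that expire
        countdowns = [c - 1 for c in countdowns if c > 1]
        if bit != prev:               # in toggled: start a new countdown of M cycles
            countdowns.append(M)
            prev = bit
        # an even number of toggles within the last M samples means in[n-M] == in[n]
        if bit == 1 and len(countdowns) % 2 == 0:
            total += 1
    return total

This can be checked against a brute-force sum over a random bit stream before committing to the hardware version.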

Interview q: Data structure and algorithm for O(1) retrieval of avg. response time in client server architecture

Intv Q:
In a client-server architecture, there are multiple requests from multiple clients to the server. The server should maintain the response times of all the requests in the previous hour. What data structure and algo will be used for this? Also, the average response time needs to be maintained and has to be retrieved in O(1).
My take:
algo: maintain a running mean
mean = (mean_prev * n + current_response_time) / (n + 1)
DS: a set (using order statistic tree).
My question is whether there is a better answer. I felt that my answer was very trivial, while the answers to the questions (in the interview) before and after this one were non-trivial.
EDIT:
Based on what amit suggested:
cleanup()
    while (queue is not empty and curr_time - queue.front().timestamp > 1hr)
        (timestamp, val) = queue.pop();
        sum = sum - val;
        n = n - 1;

insert(timestamp, val)
    queue.push(timestamp, val);
    sum = sum + val;
    n = n + 1;
    cleanup();

query_average()
    cleanup();
    return sum / n;
And if we can ensure that cleanup() is triggered once every hour or half hour, then query_average() will not take very long. But if someone were to implement a timer trigger for a function call, how would they do it?
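In Python (just as a sketch; the interview question itself is language-agnostic), one simple way to get such a periodic trigger is a self-rescheduling threading.Timer:

import threading

def schedule_periodic(interval_seconds, fn):
    """Call fn() every interval_seconds on a background thread."""
    def run():
        fn()
        timer = threading.Timer(interval_seconds, run)  # re-arm so the callback keeps firing
        timer.daemon = True
        timer.start()
    timer = threading.Timer(interval_seconds, run)
    timer.daemon = True
    timer.start()
    return timer

# e.g. schedule_periodic(1800, cleanup)  # run cleanup() every half hour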
The problem with your solution is that it maintains the total average since the beginning of time, not the average over the last hour, as you are supposed to.
To do so, you need to maintain 2 variables and a queue of entries (timestamp,value).
The 2 variables will be n (the number of elements that are relevant to the last hour) and sum (the sum of the elements from the last hour).
When a new element arrives:

    queue.add(timestamp, value)
    sum = sum + value
    n = n + 1

When you have a query for average:

    while (queue is not empty and queue.front().timestamp < currentTimestamp() - 1 hour):
        (timestamp, value) = queue.pop()
        sum = sum - value
        n = n - 1
    return sum / n
Note that the above is still O(1) on average, because for every insertion to the queue - you do exactly one deletion. You might add the above loop to the insertion procedure as well.
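A compact Python sketch of this scheme, assuming each response time arrives together with its timestamp (the names are illustrative):

from collections import deque

class SlidingAverage:
    """Average of values seen within the last window_seconds; amortized O(1) per operation."""
    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.entries = deque()   # (timestamp, value) pairs, oldest at the left
        self.total = 0.0
        self.n = 0

    def _evict(self, now):
        # Drop entries older than the window; each entry is pushed and popped once.
        while self.entries and self.entries[0][0] < now - self.window:
            _, old = self.entries.popleft()
            self.total -= old
            self.n -= 1

    def add(self, timestamp, value):
        self._evict(timestamp)
        self.entries.append((timestamp, value))
        self.total += value
        self.n += 1

    def average(self, now):
        self._evict(now)
        return self.total / self.n if self.n else 0.0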

Limit for quadratic probing a hash table

I was writing a program to compare the average and maximum accesses required for linear probing, quadratic probing and separate chaining in a hash table.
I have done the element insertion part for the 3 cases. When looking up an element in the hash table, I need a limit for ending the search.
In the case of separate chaining, I can stop when next pointer is null.
For linear probing, I can stop when probed the whole table (ie size of table).
What should I use as limit in quadratic probing? Will table size do?
My quadratic probing function is like this
newKey = (key + i*i) % size;
where i varies from 0 to infinity. Please help me..
For such problems analyse the growth of i in two pieces:
First Interval : i goes from 0 to size-1
In this case, I haven't got the solution for now. Hopefully will update.
Second Interval : i goes from size to infinity
In this case i can be expressed as i = size + k, then
newKey = (key + i*i) % size
= (key + (size+k)*(size+k)) % size
= (key + size*size + 2*k*size + k*k) % size
= (key + k*k) % size
So it is certain that we will start probing previously probed cells after i reaches size. So you only need to consider the situation where i goes from 0 to size-1, because the rest is just the same story again and again.
What the story tells us so far: a simple analysis showed me that I need to probe at most size times, because beyond size probes I start hitting the same cells again.
See this link. If your table size is a power of 2 and you are using the reprobe function f(i)=i*(i+1)/2, you are guaranteed to traverse the entire table. If your table size is a prime number, you are guaranteed to traverse at least half of the table. In general, you can check whether at some point you are back at the original position. If that happens, you need to rehash.
After doing some simulations in Excel, it appears that iterating up to i = size / 2 would be all that needs to be tested. This is when using the standard method of adding sequential perfect squares to the single-hashed position.
The answer that you can quit if a position is revisited would not allow testing of all possible positions that could be reached by the quadratic-probe method, at least not for all array sizes. (I tested array size 21 and found that i=5 revisits the same position as i=2, but i=6 yields a previously not-calculated position.)
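These claims are easy to check empirically; here is a small Python sketch (not taken from any of the answers above) that enumerates the slots visited:

def probed_slots(key, size, step):
    """Distinct slots visited by size probes: (key + step(i)) % size for i = 0..size-1."""
    return {(key + step(i)) % size for i in range(size)}

# i*i probing with a prime table size reaches at least half of the table:
print(len(probed_slots(0, 31, lambda i: i * i)))             # 16 distinct slots out of 31

# i*(i+1)/2 probing with a power-of-2 table size reaches every slot:
print(len(probed_slots(0, 32, lambda i: i * (i + 1) // 2)))  # 32 distinct slots out of 32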

Algorithm for collecting total events within the last certain time

I am facing an algorithm problem.
We have a task that runs every 10 ms, and during each run an event may or may not happen. Is there any simple algorithm that allows us to keep track of how many times an event was triggered within the last, say, 1 second?
The only idea that I have is to implement an array and save all the events. As we are programming embedded systems, there is not enough space...
Thanks in advance.
An array of 13 bytes holds a second's worth of events in 10 ms steps.
Consider it an array of 104 bits, covering 0 ms to 1040 ms in 10 ms steps.
If the event occurs, mark the bit and advance to the next position; otherwise just advance to the next bit/byte.
If you want, run-length encode after each second to offload the event bits into another value.
Or treat it as a circular buffer and keep the count available for query.
Or both.
You could reduce the array size to match the space available.
It is not clear if an event could occur multiple times while your task was running, or if it is always 10ms between events.
This is more-or-less what Dtyree and Weeble have suggested, but an example implementation may help (C code for illustration):
#include <stdint.h>
#include <stdbool.h>

#define HISTORY_LENGTH 100 // 1 second when called every 10ms

int rollingcount( bool event )
{
    static uint8_t event_history[(HISTORY_LENGTH + 7) / 8] ;
    static int next_history_bit = 0 ;
    static int event_count = 0 ;

    // Get history byte index and bit mask
    int history_index = next_history_bit >> 3 ;             // ">> 3" is same as "/ 8" but often faster
    uint8_t history_mask = 1 << (next_history_bit & 0x7) ;  // "& 0x7" is same as "% 8" but often faster

    // Get current bit value
    bool history_bit = (event_history[history_index] & history_mask) != 0 ;

    // If oldest history event is not the same as new event, adjust count
    if( history_bit != event )
    {
        if( event )
        {
            // Increment count for 0->1
            event_count++ ;

            // Replace oldest bit with 1
            event_history[history_index] |= history_mask ;
        }
        else
        {
            // Decrement count for 1->0
            event_count-- ;

            // Replace oldest bit with 0
            event_history[history_index] &= ~history_mask ;
        }
    }

    // Advance to oldest history bit
    next_history_bit++ ;
    if( next_history_bit >= HISTORY_LENGTH ) // Could use "next_history_bit %= HISTORY_LENGTH" here, but % may be expensive on some processors
    {
        next_history_bit = 0 ;
    }

    return event_count ;
}
For a 100 sample history, it requires 13 bytes plus two integers of statically allocated memory, I have used int for generality, but in this case uint8_t counters would suffice. In addition there are three stack variables, and again the use of int is not necessary if you need to really optimise memory use. So in total it is possible to use as little as 15 bytes plus three bytes of stack. The event argument may or may not be passed on the stack, then there is the function call return address, but again that depends on the calling convention of your compiler/processor.
You need some kind of list/queue etc., but a ring buffer probably has the best performance.
You need to store 100 counters (1 for each time period of 10 ms during the last second) and a current counter.
Ring buffer solution (in pseudocode):
Create a counter_array of 100 counters (initially filled with 0's) and an index for the current slot:

    int[100] counter_array;
    current_counter = 0;

During each 10 ms cycle, advance to the next slot (wrapping around) and clear it:

    current_counter = (current_counter + 1) % 100;
    counter_array[current_counter] = 0;

For every event:

    counter_array[current_counter]++;

To check the number of events during the last second, take the sum of counter_array.
Can you afford an array of 100 booleans? Perhaps as a bit field? As long as you can afford the space cost, you can track the number of events in constant time:
Store:
A counter C, initially 0.
The array of booleans B, of size equal to the number of intervals you want to track, i.e. 100, initially all false.
An index I, initially 0.
Each interval:
Read the boolean at B[I], and decrement C if it's true.
Set the boolean at B[I] to true if the event occurred in this interval, false otherwise.
Increment C if the event occurred in this interval.
Increment I; when I reaches 100, reset it to 0.
That way you at least avoid scanning the whole array every interval.
EDIT - Okay, so you want to track events over the last 3 minutes (180s, 18000 intervals). Using the above algorithm and cramming the booleans into a bit-field, that requires total storage:
2 byte unsigned integer for C
2 byte unsigned integer for I
2250 byte bit-field for B
That's pretty much unavoidable if you require to have a precise count of the number of events in the last 180.0 seconds at all times. I don't think it would be hard to prove that you need all of that information to be able to give an accurate answer at all times. However, if you could live with knowing only the number of events in the last 180 +/- 2 seconds, you could instead reduce your time resolution. Here's a detailed example, expanding on my comment below.
The above algorithm generalizes:
Store:
A counter C, initially 0.
The array of counters B, of size equal to the number of intervals you want to track, i.e. 100, initially all 0.
An index I, initially 0.
Each interval:
Read B[I], and decrement C by that amount.
Write the number of events that occurred this interval into B[I].
Increment C by the number of events that occurred this interval.
Increment I; when I reaches the length of B, reset it to 0.
If you switch your interval to 2s, then in that time 0-200 events might occur. So each counter in the array could be a one-byte unsigned integer. You would have 90 such intervals over 3 minutes, so your array would need 90 elements = 90 bytes.
If you switch your interval to 150ms, then in that time 0-15 events might occur. If you are pressed for space, you could cram this into a half-byte unsigned integer. You would have 1200 such intervals over 3 minutes, so your array would need 1200 elements = 600 bytes.
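A minimal Python sketch of this generalized scheme (interval length and history size are parameters; the names are illustrative):

class IntervalEventCounter:
    """Events counted over the last len(buckets) intervals; updated once per interval."""
    def __init__(self, num_intervals):
        self.buckets = [0] * num_intervals  # events per interval, used as a ring buffer (B)
        self.index = 0                      # current slot (I)
        self.count = 0                      # running total over all buckets (C)

    def end_interval(self, events_this_interval):
        # Replace the oldest interval's count with this interval's count.
        self.count -= self.buckets[self.index]
        self.buckets[self.index] = events_this_interval
        self.count += events_this_interval
        self.index = (self.index + 1) % len(self.buckets)

    def total(self):
        return self.count

Call end_interval() once per interval with the number of events observed during it; total() then returns the count over the last num_intervals intervals in O(1).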
Will the following work for your application?
A rolling event counter that increments every event.
In the routine that runs every 10ms, you compare the current event counter value with the event counter value stored the last time the routine ran.
That tells you how many events occurred during the 10ms window.
