Logarithmic score decay based on time - algorithm

I'm looking for an algorithm whose score gets smaller logarithmically over time. This is similar to this question, but the algorithm should have a nice curve instead of being linear. A time of 1 should have a score of 1, with the score diminishing as the time value increases, and ideally there would be a configurable value at which the score crosses the X axis and becomes 0.

This function satisfies your criteria
score(t) = -A log(t) + 1
where A > 0
The score crosses the X-axis at
T = exp(1/A)
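A minimal Python sketch of the function above, assuming the natural logarithm and t >= 1. The names score, zero_crossing and a_for_zero_at are illustrative only; the last one simply inverts T = exp(1/A) so the zero crossing can be configured directly.

```python
import math

def score(t, a):
    """Score that equals 1 at t = 1 and decays logarithmically (a > 0)."""
    return 1.0 - a * math.log(t)

def zero_crossing(a):
    """Time T at which the score reaches 0, i.e. T = exp(1/A)."""
    return math.exp(1.0 / a)

def a_for_zero_at(t_zero):
    """Hypothetical helper: choose A so the score hits 0 at t_zero (t_zero > 1)."""
    return 1.0 / math.log(t_zero)

if __name__ == "__main__":
    a = a_for_zero_at(100.0)  # score should reach 0 at t = 100
    for t in (1, 10, 50, 100):
        print(t, round(score(t, a), 4))
```

Scores past the zero crossing become negative, so clamping with max(0.0, score(t, a)) may be wanted if negative scores are not meaningful.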


What's the Difference Between Kendall's Distance and Kendall tau Distance?

I'm now trying to use Kendall's distance to improve sets of rankings based on Borda counts method.
I'm asked to follow a specific document's instructions. In the document it states that :
"The Kendall's distance counts the pairwise disagreements between items from two rankings as :
where
The Kendall's distance is normalized by its maximum value C_n^2. The less the Kendall’s distance is, the greater the similarity degree between the rankings is.
The Kendall's tau is another method for measuring the similarity degree between rankings, which is easy to be confused with the Kendall's distance.
The Kendall's tau is defined as:
The Kendall's tau is defined based on the normalized Kendall's distance. Note that the greater the Kendall's tau is, the greater the similarity degree between the compared rankings is. In this paper, we use the Kendall's distance rather than the Kendall's tau."
My goal is to improve the following ranking by using Kendall's distance :
x1 x2 x3 x4
A1 4 1 3 2
A2 4 1 3 2
A3 4 3 2 1
A4 1 4 3 2
A5 1 2 4 3
In this ranking, the ith row represents the ranking obtained based on Ai, and each column represents the ranking position of the corresponding item in each ranking. (i.e. xn represents the items to be ranked, Ai represents the ones who rank the items.)
I don't understand the difference between the two distances despite the document's explanation. What does the "(j,s), j != s" beneath the sigma symbol stand for? And finally, how do I implement Kendall's distance for the ranking provided above?
Distance and similarity are two related concepts, but for distance, exact identity means distance 0, and as things get more different, the distance between them gets greater, with no very obvious fixed limit. A well-behaved distance will obey the rules for a metric - see https://en.wikipedia.org/wiki/Metric_(mathematics). For a similarity, exact identity means similarity 1, and similarity decreases as things get more different, but usually never decreases below 0. Kendall's tau seems to be a way of turning Kendall's distance into a similarity.
"(j,s), j != s" means consider all possibilities for j and s except those for which j = s.
You can compute Kendall's distance by simply summing over all possibilities for j not equal to s - but the time taken for this goes up with the square of the number of items. There are ways for which the time taken only goes up as n * log(n) where n is the number of items - for this and much other stuff on Kendall see https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient
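A rough Python sketch of the simple quadratic approach, applied to the rankings from the question. The formula from the quoted document is not reproduced here, so this assumes the usual definition: count each unordered pair of items once whenever the two rankings order that pair oppositely, and optionally divide by n(n-1)/2. Summing over ordered pairs (j,s) with j != s would count every disagreement twice. The function name kendall_distance is illustrative.

```python
from itertools import combinations

def kendall_distance(r1, r2, normalize=False):
    """Count item pairs that the two rankings order oppositely.

    r1[k] and r2[k] are the rank positions of item k in each ranking.
    Assumes no ties within a ranking.
    """
    n = len(r1)
    disagreements = sum(
        1
        for j, s in combinations(range(n), 2)
        if (r1[j] - r1[s]) * (r2[j] - r2[s]) < 0
    )
    if normalize:
        return disagreements / (n * (n - 1) / 2)
    return disagreements

# Rankings from the question: one row per ranker, one column per item x1..x4.
rankings = {
    "A1": [4, 1, 3, 2],
    "A2": [4, 1, 3, 2],
    "A3": [4, 3, 2, 1],
    "A4": [1, 4, 3, 2],
    "A5": [1, 2, 4, 3],
}

if __name__ == "__main__":
    for a, b in combinations(rankings, 2):
        print(a, b, kendall_distance(rankings[a], rankings[b], normalize=True))
```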

Is this an accurate average or an exponential moving average formula?

I'm trying to calculate the average of a value that is changing, and I would like to do so without storing all the previous values in an array and iterating over them.
I found this formula
avg = avg + (value - avg) / n
where n is the number of changes to value.
TL;DR
My question is whether this formula is identical to the normal way of calculating an average (which it seems to be when I compare them), or whether it might give different results under certain circumstances.
I'm not sure what the correct name of this formula is - I've seen "running average", "rolling average", "moving average", etc. The results of it seem to be exactly the same as storing each historical value, summing them up and dividing by n - i.e. a "normal average".
What's confusing is that people sometimes call this formula a "moving average", which in my mind sounds more like you're using a subset of the historical values to calculate an average. Others say it's an exponential moving average (see comment by Julia on OP).
Is this formula identical to the normal way of calculating an average?
With infinite precision, this formula does indeed compute the average of the first n samples if avg is set equal to 0 at the start.
It is clearly true when n=1 because the average of 1 sample works out as:
avg' = avg + (value - avg) / n
= 0 + (value - 0) / 1
= value
For larger values, assume it is true for n-1 (i.e. avg=(x[1]+..+x[n-1])/(n-1) ).
Then:
avg' = avg + (x[n] - avg) / n
= (n-1)*avg/n + x[n]/n
= (x[1]+...+x[n-1])/n + x[n]/n
= (x[1]+...+x[n])/n
So the new value of avg is also equal to the average of the first n samples.
Is this a moving average?
Normally by "moving average" people are referring to a simple moving average.
This formula is actually known as a cumulative moving average.
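A small Python sketch of the cumulative moving average update, compared against the ordinary sum-and-divide average. The function name cumulative_average is illustrative. With floating-point arithmetic the two results can differ in the last few bits, which is why the comparison uses a tolerance.

```python
def cumulative_average(values):
    """Incrementally apply avg = avg + (value - avg) / n without storing history."""
    avg = 0.0
    for n, value in enumerate(values, start=1):
        avg += (value - avg) / n
    return avg

if __name__ == "__main__":
    samples = [3, 5, 10, 2]
    incremental = cumulative_average(samples)
    ordinary = sum(samples) / len(samples)
    print(incremental, ordinary)
    print(abs(incremental - ordinary) < 1e-12)
```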

differential equation VS Algorithms complexity

I don't know if it's the right place to ask because my question is about how to calculate a computer science algorithm complexity using differential equation growth and decay method.
The algorithm that I would like to prove is Binary search for a sorted array, which has a complexity of log2(n)
The algorithm says: if the target value you are searching for is equal to the middle element, return its index. If it's less, search the left sub-array; if it's greater, search the right sub-array.
As you can see, each time N(t) [the number of nodes at time t] is halved. Therefore, we can say that it takes O(log2(n)) to find an element.
Now using differential equation growth and decay method.
dN(t)/dt = N(t)/2
dN(t): How fast the number of elements is increasing or decreasing
dt: With respect to time
N(t): Number of elements at time t
The above equation says that the number of cells is being divided by 2 with time.
Solving the above equations gives us:
dN(t)/N(t) = dt/2
ln(N(t)) = t/2 + c
t = ln(N(t))*2 + d
Even though we got t = ln(N(t)) and not log2(N(t)), we can still say that it's logarithmic.
Unfortunately, although this method seems to make sense for deriving the complexity of binary search, it turns out not to work for all algorithms. Here's a counterexample:
Searching an array linearly: O(n)
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
So according to this method, the complexity of searching linearly takes O(ln(n)) which is NOT true of course.
This differential equation method is called growth and decay and it's very popular. So I would like to know whether this method can be applied to computer science algorithms like the one I picked, and if so, what I did wrong to get an incorrect result for the linear search. Thank you
The time an algorithm takes to execute is proportional to the number of steps covered (here, the steps by which the remaining elements are reduced).
In your linear searching of the array, you have assumed that dN(t)/dt = N(t).
Incorrect Assumption :-
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
Following your earlier reasoning, binary search halves the number of remaining search terms on each pass, so your point that dN(t)/dt = N(t)/2 was fine. But when you search an array linearly, you access one element per pass, so the number of remaining search terms decreases by one item per pass, not by an amount proportional to N(t). That is why the assumption above cannot hold.
Correct Assumption :-
dN(t)/dt = 1
dN(t)/1 = dt
t = N(t) + d
I hope you got my point. The array elements are accessed sequentially, one per pass (iteration). So the number of remaining elements changes not in the order of N(t), but in the order of a constant 1. Hence t grows linearly with N, which gives the O(N) result.
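The difference between the two models can be illustrated with a discrete simulation. This is only a sketch, assuming one "time step" per loop iteration: one function shrinks the remaining count by half each step (the binary search model), the other shrinks it by a constant 1 each step (the linear search model). Both function names are illustrative.

```python
def steps_halving(n):
    """Steps until one element remains when the count is halved each step."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

def steps_decrement(n):
    """Steps until no element remains when the count drops by 1 each step."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps

if __name__ == "__main__":
    for n in (16, 1024, 1_000_000):
        print(n, steps_halving(n), steps_decrement(n))
```

The first count grows like log2(n) and the second like n, matching a rate proportional to N(t) and a constant rate respectively.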

Calculate a Confidence measure for Image similarity

I am using Euclidean distance between Histograms of 2 images for calculating image similarity.
The Histogram is of 15 bins and is normalized with respect to the image size (Thus, sum of all bins = 1).
Now, for the user, the distance value is not of any use and I want to convert it to a more tangible value - such as a % Confidence measure.
So, if the distance is 0, the confidence is 100% and if the distance is maximum, i.e 1 (is this correct?), then the confidence is 0%.
However, the scaling is not linear because of the properties of the histogram and the distance metric i.e. distance = 0.5 doesn't equal a confidence measure of 50%.
Can someone suggest a scaling function to convert distance to a confidence measure?
You could give more weight to the results where the distance is closer to 0 with an inverse exponential. Something to the effect of the following might work, where d is the distance:
((2 - d) ^ 2 - 1) / 3
A distance of 0 would result in a confidence score of 1 (i.e. 100%), and a distance of 1 would result in a confidence of 0. You'd also get at .5 a confidence of ~0.417. You can weight the lower distances higher by increasing the exponent and the divisor. An exponent of 3 instead of 2 would mean you'd want to divide the whole thing by 7 instead of 3, and would pull down a distance of .5 to ~0.339.
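A short Python sketch of this scaling, assuming the distance has already been scaled to the range 0 to 1. The divisor is written as 2^p - 1 so that it gives 3 for an exponent of 2 and 7 for an exponent of 3, as described above; the function name confidence and the parameter exponent are illustrative.

```python
def confidence(d, exponent=2):
    """Map a distance d in [0, 1] to a confidence in [0, 1].

    d = 0 gives 1.0 and d = 1 gives 0.0; a larger exponent pulls
    mid-range distances further down.
    """
    return ((2.0 - d) ** exponent - 1.0) / (2.0 ** exponent - 1.0)

if __name__ == "__main__":
    for d in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(d, round(confidence(d), 3), round(confidence(d, exponent=3), 3))
```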

Algorithm for making two histograms proportional, minimizing units removed

Imagine you have two histograms with an equal number of bins. N observations are distributed among the bins. Each bin now has between 0 and N observations.
What algorithm would be appropriate for determining the minimum number of observations to remove from both histograms in order to make them proportional? They do not need to be equal in absolute number, only proportional to each other. That is, there must be a common factor by which all the bins in one histogram can be multiplied in order to make it equal to the other histogram.
For example, imagine the following two histograms, where the item i in each histogram refers to the number of observations in bin i for the respective histogram.
Histogram 1: 4, 7, 4, 9
Histogram 2: 2, 0, 2, 1
For these histograms, the solution would be to remove from histogram 1 all 7 observations in bin 2 and another 7 observations from bin 4, such that (histogram 2)*2 = histogram 1.
But what general algorithm could be used to find the subsets of the two histograms that maximize the number of total observations between them while making them proportional? You can drop observations from both histograms or just one.
Thanks!
Seems to me that the problem is equivalent (if you consider each histogram as a N-dimensional vector), to minimizing the Manhattan length |R|, where R=xA-B, A and B are your 'vectors' and x is your proportional scale.
|R| has a single minimum (not necessarily an integer) so you can find it fairly rapidly using a simple bisection algorithm (or something akin to Newton's method).
Then, assuming you want a solution where the proportion is an integer, test the two cases ceil(x), and floor(x), to find which has the smallest Manhattan length (and that is the number of observations you need to remove).
Proof that the problem is not NP-hard:
Consider an inefficient 'solution' whereby you removed all N observations from all the bins. Now both A and B are equal to the 'zero' histogram 0 = (0,0,0,...). The two histograms are equal and thus proportional as 0 = s * 0 for all proportional values s, so a hard maximum for the number of observations to remove is N.
Now assume a more efficient solution exists with additions/removals < N and a proportional scale s > 2*N (i.e. after removal of some observations A = s * B or B = s * A). If both A = 0 and B = 0, we have the previous solution with N removals (which contradicts the assumption that there are fewer than N removals). If A = 0 and B ≠ 0 then there is no s ≠ 0 such that 0 = s * B and no s such that s * 0 = B (with a similar argument for B = 0 and A ≠ 0). So it must be the case that both A ≠ 0 and B ≠ 0. Assume for a moment that A is the histogram to be scaled (so A * s = B). A must have at least one non-zero entry A[i] with minimum value 1 (after removal of extra observations), so when scaled it will have minimum value ≥ 2*N. Therefore the equivalent entry B[i] must also have at least 2*N observations. But the total number of observations was initially N, so we would have needed to add at least N observations to B[i], which contradicts the assumption that the improved solution had fewer than N additions/removals. So no 'efficient' solution requires a proportional scale greater than N.
So to find an efficient solution requires, at worst, testing the 'best fit' solution for scaling factors in the range 0-N.
The 'best fit' solution for scaling factor s in A = s * B, where A and B have M bins each requires
Sum(i=1 to M) of { Abs(A[i]- s * B[i]) mod s + Abs(A[i]- s * B[i]) div s } additions/removals.
This is an order M operation, so to test for each scaling factor in the range 0-N will be an algorithm of order O(M*N)
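The O(M*N) scan can be sketched in Python as follows. This is a direct transcription of the cost expression above under a few assumptions: s is a positive integer (s = 0 is skipped because of the mod and div), the upper bound for s is taken as the larger of the two total observation counts, and ties are broken in favour of the smallest s. Note that this cost counts additions as well as removals, so it does not reproduce the removal-only count of 14 from the question's example. The function names are illustrative.

```python
def adjustment_cost(a, b, s):
    """Cost of making A = s * B using the per-bin expression from the answer:
    Abs(A[i] - s*B[i]) mod s + Abs(A[i] - s*B[i]) div s, summed over bins."""
    total = 0
    for a_i, b_i in zip(a, b):
        diff = abs(a_i - s * b_i)
        total += diff % s + diff // s
    return total

def best_integer_scale(a, b):
    """Try every integer scale 1..N and return (scale, cost) with the lowest cost."""
    n = max(sum(a), sum(b))  # assumed upper bound on the scale
    candidates = ((s, adjustment_cost(a, b, s)) for s in range(1, n + 1))
    return min(candidates, key=lambda pair: pair[1])

if __name__ == "__main__":
    hist1 = [4, 7, 4, 9]
    hist2 = [2, 0, 2, 1]
    print(best_integer_scale(hist1, hist2))
```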
I am fairly certain (but haven't got a formal proof) that the scale factor cannot exceed the number of observations in the most filled bin. In practice it is typically very much smaller. For two histograms with two hundred bins and 30-300 randomly chosen observations per bin, with Na > Nb total observations in all the bins of A and B respectively, the scaling factor was almost always found in the range Na/Nb - 4 < s < Na/Nb + 4 (or s = 0 if Na >> Nb).
