Smoothly make a number approach zero - animation

I have a floating point value X which is animated. When at rest it's at zero, but at times an outside source may change it to somewhere between -1 and 1.
If that happens I want it to go smoothly back to 0. I currently do something like
addToXspeed(-x * FACTOR);
// below is out of my control
function addToXspeed(bla) {
    xspeed += bla;
    x += xspeed;
}
every step of the animation, but that only causes X to oscillate. I want it to come to rest at 0, however.
(I've explained the problem in abstracts. The specific thing I'm trying to do is make a jumping game character balance himself upright in the air by applying rotational force)

Interesting problem.
What you are asking for is the stabilization of the following discrete-time linear system:
|      x(t+1) | = | 1  dt | |      x(t) | + | 0 | u(t)
| xspeed(t+1) |   | 0   1 | | xspeed(t) |   | 1 |
where dt is the sampling time and u(t) is the quantity you addToXspeed(). (Further, the system is subject to random disturbances on the first variable x, which I don't show in the equation above.) Now if you "set the control input equal to a linear feedback of the state", i.e.
u(t) = [a  b] |      x(t) | = a*x(t) + b*xspeed(t)
              | xspeed(t) |
then the "closed-loop" system becomes
|      x(t+1) | = | 1   dt | |      x(t) |
| xspeed(t+1) |   | a  b+1 | | xspeed(t) |
Now, in order to obtain "asymptotic stability" of the system, we stipulate that the eigenvalues of the closed-loop matrix are placed "inside the complex unit circle", and we do this by tuning a and b. We place the eigenvalues, say, at 0.5.
Therefore the characteristic polynomial of the closed-loop matrix, which is
(s - 1)(s - (b+1)) - a*dt = s^2 -(2+b)*s + (b+1-a*dt)
should equal
(s - 0.5)^2 = s^2 - s + 0.25
This is easily attained if we choose
b = -1 a = -0.25/dt
or
u(t) = a*x(t) + b*xspeed(t) = -(0.25/dt)*x(t) - xspeed(t)
addToXspeed(u(t))
which is more or less what appears in your own answer
targetxspeed = -x * FACTOR;
addToXspeed(targetxspeed - xspeed);
where, if we are asked to place the eigenvalues at 0.5, we should set FACTOR = (0.25/dt).
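As a quick illustration (a hypothetical simulation loop, not code from the question), plugging that feedback law into the addToXspeed update shows x settling toward zero instead of oscillating:

# Sketch: simulate the feedback u(t) = -(0.25/dt)*x - xspeed with dt = 1.
dt = 1.0
FACTOR = 0.25 / dt
x, xspeed = 0.8, 0.0            # a disturbance has pushed x away from rest

def add_to_xspeed(bla):          # stand-in for the out-of-our-control function
    global x, xspeed
    xspeed += bla
    x += xspeed

for step in range(10):
    u = -FACTOR * x - xspeed     # a*x(t) + b*xspeed(t) with a = -0.25/dt, b = -1
    add_to_xspeed(u)
    print(step, round(x, 6))     # x shrinks geometrically toward 0, no oscillation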

x = x*FACTOR
This should do the trick when FACTOR is between 0 and 1.
The lower the factor, the quicker you'll go to 0.

Why don't you define a fixed step to be decremented from x?
You just have to be sure to make it small enough that the character doesn't seem to be traveling in small bursts, but not so small that she doesn't seem to move at a perceptible rate.

Writing the question often results in realising the answer.
targetxspeed = -x * FACTOR;
addToXspeed(targetxspeed - xspeed);
// below is out of my control
function addToXspeed(bla) {
    xspeed += bla;
    x += xspeed;
}
So simple, too.

If you want to scale it but can only add, then you have to figure out which value to add in order to get the desired scaling:
Let's say x = 0.543, and we want to cause it to rapidly go towards 0, i.e. by dropping it by 95%.
We want to do:
scaled_x = x * (1.0 - 0.95);
This would leave x at 0.543 * 0.05, or 0.02715. The difference between this value and the original is then what you need to add to get this value:
delta = scaled_x - x;
This would make delta equal -0.51585, which is what you need to add to simulate scaling x down to 5% of its value (i.e. a 95% drop).
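As a small sketch (the add_to_x helper below is hypothetical, standing in for the add-only API):

x = 0.543

def add_to_x(delta):             # stand-in for the add-only interface
    global x
    x += delta

factor = 0.05                    # keep 5% of x, i.e. drop it by 95%
delta = x * factor - x           # -0.51585 for x = 0.543
add_to_x(delta)
print(x)                         # 0.02715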

Related

Matthews Correlation Coefficient yielding values outside of [-1,1]

I'm using the formula found on Wikipedia for calculating Matthew's Correlation Coefficient. It works fairly well, most of the time, but I'm running into problems in my tool's implementation, and I'm not seeing the problem.
MCC = ((TP*TN) - (FP*FN)) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))
Where TP, TN, FP, and FN are the non-negative, integer counts of the appropriate fields.
which should only return values in [-1, 1].
My implementation is as follows:
double ret;
if ((TruePositives + FalsePositives) == 0 || (TruePositives + FalseNegatives) == 0 ||
    (TrueNegatives + FalsePositives) == 0 || (TrueNegatives + FalseNegatives) == 0)
    // To avoid dividing by zero
    ret = (double)(TruePositives * TrueNegatives -
                   FalsePositives * FalseNegatives);
else {
    double num = (double)(TruePositives * TrueNegatives -
                          FalsePositives * FalseNegatives);
    double denom = (TruePositives + FalsePositives) *
                   (TruePositives + FalseNegatives) *
                   (TrueNegatives + FalsePositives) *
                   (TrueNegatives + FalseNegatives);
    denom = Math.Sqrt(denom);
    ret = num / denom;
}
return ret;
When I use this, as I said it works properly most of the time, but for instance if TP=280, TN = 273, FP = 67, and FN = 20, then we get:
MCC = ((280*273) - (67*20)) / sqrt(347*300*340*293) = 75100 / 42196.06 = (approx) 1.78
Is this normal behavior of Matthews Correlation Coefficient? I'm a programmer by trade, so statistics aren't a part of my formal training. Also, I've looked at questions with answers, and none of them discuss this behavior. Is it a bug in my code or in the formula itself?
The code is clear and looks correct. (But one's eyes can always deceive.)
One issue is a concern whether the output is guaranteed to lie between -1 and 1. Assuming all inputs are nonnegative, though, we can round the numerator up and the denominator down, thereby overestimating the result, by zeroing out all the "False*" terms, producing
TP*TN / Sqrt(TP*TN*TP*TN) = 1.
The lower limit is obtained similarly by zeroing out all the "True*" terms. Therefore, working code cannot produce a value larger than 1 in size unless it is presented with invalid input.
I therefore recommend placing a guard (such as an Assert statement) to assure the inputs are nonnegative. (Clearly it matters not in the preceding argument whether they are integral.) Place another assertion to check that the output is in the interval [-1,1]. Together, these will detect either or both of (a) invalid inputs or (b) an error in the calculation.
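A minimal sketch of that guarded computation (in Python rather than the question's C#, with a hypothetical mcc helper; the zero-denominator convention shown is just one common choice, whereas the question's code returns the raw numerator there):

import math

def mcc(tp, tn, fp, fn):
    # Guard the inputs, as recommended: counts must be non-negative.
    assert tp >= 0 and tn >= 0 and fp >= 0 and fn >= 0
    num = float(tp) * tn - float(fp) * fn
    denom = float(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return 0.0                        # or whatever convention you prefer
    ret = num / math.sqrt(denom)
    # Guard the output, as recommended: MCC must lie in [-1, 1].
    assert -1.0 <= ret <= 1.0
    return ret

print(mcc(280, 273, 67, 20))              # ~0.737, safely inside [-1, 1]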

implementing a simple big bang big crunch (BB-BC) in matlab

I want to implement a simple BB-BC in MATLAB, but there is some problem.
Here is the code to generate the initial population:
pop = zeros(N,m);
for j = 1:m
    % formula used to generate random number between a and b
    % a + (b-a) .* rand(N,1)
    pop(:,j) = const(j,1) + (const(j,2) - const(j,1)) .* rand(N,1);
end
const is an (m x 2) matrix which holds the constraints for the control variables; m is the number of control variables. A random initial population is generated.
Here is the code to compute the center of mass in each iteration:
sum = zeros(1,m);
sum_f = 0;
for i = 1:N
    f = fitness(new_pop(i,:));
    %keyboard
    sum = sum + (1 / f) * new_pop(i,:);
    %keyboard
    sum_f = sum_f + 1/f;
    %keyboard
end
CM = sum / sum_f;
new_pop holds the newly generated population at each iteration, and is initialized with pop.
CM is a 1xm matrix.
fitness is a function that gives the fitness value for each particle in the generation; the lower the fitness, the better the particle.
Here is the code to generate the new population in each iteration:
for i = 1:N
    new_pop(i,:) = CM + rand(1) * alpha1 / (n_itr+1) .* (const(:,2)' - const(:,1)');
end
alpha1 is 0.9.
The problem is that I run the code for 100 iterations, but the fitness just decreases and becomes negative. This shouldn't happen at all, because all particles are in the search space and CM should be there too, but it goes way beyond the limits.
For example, if these are the limits (m=4):
const = [1 10;
         1  9;
         0  5;
         1  4];
then running yields this CM:
57.6955 -2.7598 15.3098 20.8473
which is beyond all limits.
I tried limiting CM in my code, but then it just goes and sticks at all the top boundaries, which in this example gives CM =
10 9 5 4
I am confused. Is there something wrong in my implementation, or have I misunderstood something about BB-BC?

Determining Floating Point Square Root

How do I determine the square root of a floating point number? Is the Newton-Raphson method a good way? I have no hardware square root either. I also have no hardware divide (but I have implemented floating point divide).
If possible, I would prefer to reduce the number of divides as much as possible since they are so expensive.
Also, what should the initial guess be to reduce the total number of iterations?
Thank you so much!
When you use Newton-Raphson to compute a square-root, you actually want to use the iteration to find the reciprocal square root (after which you can simply multiply by the input--with some care for rounding--to produce the square root).
More precisely: we use the function f(x) = x^-2 - n. Clearly, if f(x) = 0, then x = 1/sqrt(n). This gives rise to the newton iteration:
x_(i+1) = x_i - f(x_i)/f'(x_i)
        = x_i - (x_i^-2 - n)/(-2*x_i^-3)
        = x_i + (x_i - n*x_i^3)/2
        = x_i*(3/2 - 1/2*n*x_i^2)
Note that (unlike the iteration for the square root), this iteration for the reciprocal square root involves no divisions, so it is generally much more efficient.
I mentioned in your question on divide that you should look at existing soft-float libraries, rather than re-inventing the wheel. That advice applies here as well. This function has already been implemented in existing soft-float libraries.
Edit: the questioner seems to still be confused, so let's work an example: sqrt(612). 612 is 1.1953125 x 2^9 (or b1.0011001 x 2^9, if you prefer binary). Pull out the even portion of the exponent (8 of the 9) to write the input as f * 2^(2m), where m is an integer and f is in the range [1,4). Then we will have:
sqrt(n) = sqrt(f * 2^2m) = sqrt(f)*2^m
applying this reduction to our example gives f = 1.1953125 * 2 = 2.390625 (b10.011001) and m = 4. Now do a newton-raphson iteration to find x = 1/sqrt(f), using a starting guess of 0.5 (as I noted in a comment, this guess converges for all f, but you can do significantly better using a linear approximation as an initial guess):
x_0 = 0.5
x_1 = x_0*(3/2 - 1/2 * 2.390625 * x_0^2)
= 0.6005859...
x_2 = x_1*(3/2 - 1/2 * 2.390625 * x_1^2)
= 0.6419342...
x_3 = 0.6467077...
x_4 = 0.6467616...
So even with a (relatively bad) initial guess, we get rapid convergence to the true value of 1/sqrt(f) = 0.6467616600226026.
Now we simply assemble the final result:
sqrt(f) = x_n * f = 1.5461646...
sqrt(n) = sqrt(f) * 2^m = 24.738633...
And check: sqrt(612) = 24.738633...
Obviously, if you want correct rounding, careful analysis is needed to ensure that you carry sufficient precision at each stage of the computation. This requires careful bookkeeping, but it isn't rocket science. You simply keep careful error bounds and propagate them through the algorithm.
If you want correct rounding without explicitly checking a residual, you need to compute sqrt(f) to a precision of 2p + 2 bits (where p is the precision of the source and destination type). However, you can also take the strategy of computing sqrt(f) to a little more than p bits, squaring that value, and adjusting the trailing bit by one if necessary (which is often cheaper).
sqrt is nice in that it is a unary function, which makes exhaustive testing for single-precision feasible on commodity hardware.
You can find the OS X soft-float sqrtf function on opensource.apple.com, which uses the algorithm described above (I wrote it, as it happens). It is licensed under the APSL, which may or not be suitable for your needs.
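For illustration, here is a rough Python sketch of the recipe described above (range reduction to [1,4), the division-free reciprocal-square-root iteration, then one multiply). A real soft-float routine would work on the bit representation and track rounding as discussed; the function name and iteration count here are arbitrary:

import math

def soft_sqrt(n, iterations=5):
    # Sketch: reduce n to f in [1, 4), iterate x <- x*(3/2 - 1/2*f*x*x)
    # toward 1/sqrt(f), then multiply back by f and the even power of two.
    assert n > 0
    m, e = math.frexp(n)                 # n = m * 2**e with m in [0.5, 1)
    if e % 2:                            # keep the remaining exponent even
        f, half_e = m * 2.0, (e - 1) // 2
    else:
        f, half_e = m * 4.0, (e - 2) // 2
    x = 0.5                              # crude guess; converges for all f in [1, 4)
    for _ in range(iterations):
        x = x * (1.5 - 0.5 * f * x * x)
    return (x * f) * 2.0 ** half_e       # sqrt(f) = f/sqrt(f), scaled by 2**m

print(soft_sqrt(612.0))                  # ~24.7386...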
Probably (still) the fastest implementation for finding the inverse square root, and the 10 lines of code that I adore the most. It's based on Newton approximation, but with a few quirks. There's even a great story around this.
Easiest to implement (you can even implement this in a calculator):
def sqrt(x, TOL=0.000001):
y=1.0
while( abs(x/y -y) > TOL ):
y= (y+x/y)/2.0
return y
This is exactly equal to Newton-Raphson:
y(new) = y - f(y)/f'(y)
f(y) = y^2-x and f'(y) = 2y
Substituting these values:
y(new) = y - (y^2-x)/(2y) = (y^2+x)/(2y) = (y+x/y)/2
If division is expensive you should consider: http://en.wikipedia.org/wiki/Shifting_nth-root_algorithm .
Shifting algorithms:
Let us assume you have two numbers a and b such that the least significant set bit of a is larger than b, and b has only one bit set (e.g. a=1000 and b=10 in binary). Let s(b) = log_2(b) (which is just the position of the 1 bit in b).
Assume we already know the value of a^2. Now (a+b)^2 = a^2 + 2ab + b^2. a^2 is already known; 2ab is a shifted left by s(b)+1; b^2 is b shifted left by s(b).
Algorithm:
Initialize a such that a has only one bit equal to one and a^2 <= n < (2*a)^2.
Let q = s(a).
b = a
sqra = a*a
for i = q-1 down to -10 (or whatever significance you want):
    b = b/2
    sqrab = sqra + 2*a*b + b^2
    if sqrab > n:
        continue
    sqra = sqrab
    a = a+b
n=612
a=10000 (16)
sqra = 256
Iteration 1:
b=01000 (8)
sqrab = (a+b)^2 = 24^2 = 576
sqrab < n => a=a+b = 24
Iteration 2:
b = 4
sqrab = (a+b)^2 = 28^2 = 784
sqrab > n => a=a
Iteration 3:
b = 2
sqrab = (a+b)^2 = 26^2 = 676
sqrab > n => a=a
Iteration 4:
b = 1
sqrab = (a+b)^2 = 25^2 = 625
sqrab > n => a=a
Iteration 5:
b = 0.5
sqrab = (a+b)^2 = 24.5^2 = 600.25
sqrab < n => a=a+b = 24.5
Iteration 6:
b = 0.25
sqrab = (a+b)^2 = 24.75^2 = 612.5625
sqrab > n => a=a
Iteration 7:
b = 0.125
sqrab = (a+b)^2 = 24.625^2 = 606.390625
sqrab < n => a=a+b = 24.625
and so on.
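A direct Python transcription of this scheme might look as follows (a sketch only; the multiplications by b here stand in for the shifts described above, and the precision count is arbitrary):

def shifting_sqrt(n, fractional_bits=10):
    # Build the root one candidate bit at a time, as in the worked example above.
    a = 1.0
    while (2 * a) * (2 * a) <= n:          # single-bit a with a^2 <= n < (2a)^2
        a *= 2
    sqra = a * a
    b = a
    for _ in range(int(a).bit_length() + fractional_bits):
        b /= 2                              # next candidate bit
        sqrab = sqra + 2 * a * b + b * b    # (a + b)^2 without re-squaring from scratch
        if sqrab <= n:                      # keep the bit only if it doesn't overshoot
            sqra = sqrab
            a += b
    return a

print(shifting_sqrt(612))                   # ~24.738...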
A good approximation to square root on the range [1,4) is
def sqrt(x):
    y = x*-0.000267
    y = x*(0.004686+y)
    y = x*(-0.034810+y)
    y = x*(0.144780+y)
    y = x*(-0.387893+y)
    y = x*(0.958108+y)
    return y+0.315413
Normalise your floating point number so the mantissa is in the range [1,4), use the above algorithm on it, and then divide the exponent by 2. No floating point divisions anywhere.
With the same CPU time budget you can probably do much better, but that seems like a good starting point.
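A sketch of that normalization step, assuming the sqrt polynomial defined above is in scope (frexp is used here only to split off the exponent; the helper name is made up):

import math

def normalized_sqrt(x):
    # Pull x apart as f * 2**(2k) with f in [1, 4), apply the polynomial
    # approximation above to f, then scale by 2**k.
    assert x > 0
    m, e = math.frexp(x)                 # x = m * 2**e, m in [0.5, 1)
    if e % 2:
        f, k = m * 2.0, (e - 1) // 2     # f in [1, 2)
    else:
        f, k = m * 4.0, (e - 2) // 2     # f in [2, 4)
    return sqrt(f) * 2.0 ** k            # 'sqrt' is the polynomial defined above

print(normalized_sqrt(612.0))            # ~24.74 (it is only an approximation)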

Rolling variance algorithm

I'm trying to find an efficient, numerically stable algorithm to calculate a rolling variance (for instance, a variance over a 20-period rolling window). I'm aware of the Welford algorithm that efficiently computes the running variance for a stream of numbers (it requires only one pass), but am not sure if this can be adapted for a rolling window. I would also like the solution to avoid the accuracy problems discussed at the top of this article by John D. Cook. A solution in any language is fine.
I've run across this problem as well. There are some great posts out there on computing the running cumulative variance, such as John D. Cook's Accurately computing running variance post and the post from Digital Explorations, Python code for computing sample and population variances, covariance and correlation coefficient. I just could not find any that were adapted to a rolling window.
The Running Standard Deviations post by Subluminal Messages was critical in getting the rolling window formula to work. Jim takes the power sum average of the squared values, versus Welford's approach of using the sum of squared differences from the mean. Formula as follows:
PSA(today) = PSA(yesterday) + (x(today) * x(today) - PSA(yesterday)) / n
x = value in your time series
n = number of values you've analyzed so far.
But, to convert the Power Sum Average formula to a windowed variety, you need to tweak the formula to the following:
PSA(today) = PSA(yesterday) + (x(today) * x(today) - x(today - n) * x(today - n)) / n
x = value in your time series
n = period used for your rolling window.
You'll also need the Rolling Simple Moving Average formula:
SMA(today) = SMA(yesterday) + (x(today) - x(today - n)) / n
x = value in your time series
n = period used for your rolling window.
From there you can compute the Rolling Population Variance:
Population Var today = (PSA today * n - n * SMA today * SMA today) / n
Or the Rolling Sample Variance:
Sample Var today = (PSA today * n - n * SMA today * SMA today) / (n - 1)
I've covered this topic along with sample Python code in a blog post a few years back, Running Variance.
Hope this helps.
Please note: I provided links to all the blog posts and math formulas
in Latex (images) for this answer. But, due to my low reputation (<
10); I'm limited to only 2 hyperlinks and absolutely no images. Sorry
about this. Hope this doesn't take away from the content.
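A minimal Python sketch of those windowed PSA/SMA formulas (hypothetical names; it keeps the last n samples around so the value dropping out of the window is available):

from collections import deque

def rolling_variance(series, n):
    # Yields the rolling population variance, per the PSA/SMA formulas above.
    window = deque(maxlen=n)
    sma = psa = 0.0
    for x in series:
        if len(window) < n:
            window.append(x)
            if len(window) == n:                 # first full window: compute directly
                sma = sum(window) / n
                psa = sum(v * v for v in window) / n
        else:
            x_old = window[0]                    # value dropping out of the window
            window.append(x)                     # deque evicts x_old automatically
            sma += (x - x_old) / n               # rolling simple moving average
            psa += (x * x - x_old * x_old) / n   # rolling power sum average
        if len(window) == n:
            yield (psa * n - n * sma * sma) / n  # population variance, as above

# list(rolling_variance([1, 2, 3, 4, 5, 6], 3)) -> [0.666..., 0.666..., 0.666..., 0.666...]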
I have been dealing with the same issue.
Mean is simple to compute iteratively, but you need to keep the complete history of values in a circular buffer.
next_index = (index + 1) % window_size; // oldest x value is at next_index, wrapping if necessary.
new_mean = mean + (x_new - xs[next_index])/window_size;
I have adapted Welford's algorithm and it works for all the values that I have tested with.
varSum = varSum + (x_new - mean) * (x_new - new_mean) - (xs[next_index] - mean) * (xs[next_index] - new_mean);
xs[next_index] = x_new;
mean = new_mean;
index = next_index;
To get the current variance just divide varSum by the window size: variance = varSum / window_size;
If you prefer code over words (heavily based on DanS' post):
http://calcandstuff.blogspot.se/2014/02/rolling-variance-calculation.html
public IEnumerable<double> RollingSampleVariance(IEnumerable<double> data, int sampleSize)
{
    double mean = 0;
    double accVar = 0;
    int n = 0;
    var queue = new Queue<double>(sampleSize);

    foreach (var observation in data)
    {
        queue.Enqueue(observation);
        if (n < sampleSize)
        {
            // Calculating first variance
            n++;
            double delta = observation - mean;
            mean += delta / n;
            accVar += delta * (observation - mean);
        }
        else
        {
            // Adjusting variance
            double then = queue.Dequeue();
            double prevMean = mean;
            mean += (observation - then) / sampleSize;
            accVar += (observation - prevMean) * (observation - mean) - (then - prevMean) * (then - mean);
        }

        if (n == sampleSize)
            yield return accVar / (sampleSize - 1);
    }
}
Actually, Welford's algorithm can AFAICT easily be adapted to compute weighted variance.
And by setting weights to -1, you should be able to effectively cancel out elements. I haven't checked the math on whether it allows negative weights, but at a first look it should!
I did perform a small experiment using ELKI:
void testSlidingWindowVariance() {
    MeanVariance mv = new MeanVariance(); // ELKI implementation of weighted Welford!
    MeanVariance mc = new MeanVariance(); // Control.
    Random r = new Random();
    double[] data = new double[1000];
    for (int i = 0; i < data.length; i++) {
        data[i] = r.nextDouble();
    }
    // Pre-roll:
    for (int i = 0; i < 10; i++) {
        mv.put(data[i]);
    }
    // Compare to window approach
    for (int i = 10; i < data.length; i++) {
        mv.put(data[i-10], -1.); // Remove
        mv.put(data[i]);
        mc.reset(); // Reset statistics
        for (int j = i - 9; j <= i; j++) {
            mc.put(data[j]);
        }
        assertEquals("Variance does not agree.", mv.getSampleVariance(),
                     mc.getSampleVariance(), 1e-14);
    }
}
I get around ~14 digits of precision compared to the exact two-pass algorithm; this is about as much as can be expected from doubles. Note that Welford does come at some computational cost because of the extra divisions - it takes about twice as long as the exact two-pass algorithm. If your window size is small, it may be much more sensible to actually recompute the mean and then in a second pass the variance every time.
I have added this experiment as unit test to ELKI, you can see the full source here: http://elki.dbs.ifi.lmu.de/browser/elki/trunk/test/de/lmu/ifi/dbs/elki/math/TestSlidingVariance.java
it also compares to the exact two-pass variance.
However, on skewed data sets, the behaviour might be different. This data set obviously is uniformly distributed; but I've also tried a sorted array and it worked.
Update: we published a paper with details on different weighting schemes for (co-)variance:
Schubert, Erich, and Michael Gertz. "Numerically stable parallel computation of (co-) variance." Proceedings of the 30th International Conference on Scientific and Statistical Database Management. ACM, 2018. (Won the SSDBM best-paper award.)
This also discusses how weighting can be used to parallelize the computation, e.g., with AVX, GPUs, or on clusters.
Here's a divide and conquer approach that has O(log k)-time updates, where k is the number of samples. It should be relatively stable for the same reasons that pairwise summation and FFTs are stable, but it's a bit complicated and the constant isn't great.
Suppose we have a sequence A of length m with mean E(A) and variance V(A), and a sequence B of length n with mean E(B) and variance V(B). Let C be the concatenation of A and B. We have
p = m / (m + n)
q = n / (m + n)
E(C) = p * E(A) + q * E(B)
V(C) = p * (V(A) + (E(A) + E(C)) * (E(A) - E(C))) + q * (V(B) + (E(B) + E(C)) * (E(B) - E(C)))
Now, stuff the elements in a red-black tree, where each node is decorated with mean and variance of the subtree rooted at that node. Insert on the right; delete on the left. (Since we're only accessing the ends, a splay tree might be O(1) amortized, but I'm guessing amortized is a problem for your application.) If k is known at compile-time, you could probably unroll the inner loop FFTW-style.
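A sketch of that merge step (a hypothetical helper; each tree node would store a (length, mean, variance) triple and combine its children like this):

def merge_stats(m, mean_a, var_a, n, mean_b, var_b):
    # Combine (length, mean, population variance) of two blocks, per the formulas above.
    p = m / (m + n)
    q = n / (m + n)
    mean_c = p * mean_a + q * mean_b
    var_c = (p * (var_a + (mean_a + mean_c) * (mean_a - mean_c))
             + q * (var_b + (mean_b + mean_c) * (mean_b - mean_c)))
    return m + n, mean_c, var_c

# merge_stats(2, 1.5, 0.25, 1, 3.0, 0.0) -> (3, 2.0, 0.666...), matching [1, 2, 3]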
I know this question is old, but in case someone else is interested, here follows the Python code. It is inspired by johndcook's blog post, Joachim's and DanS's code, and Jaime's comments. The code below still gives small imprecisions for small data window sizes. Enjoy.
from __future__ import division
import collections
import math


class RunningStats:
    def __init__(self, WIN_SIZE=20):
        self.n = 0
        self.mean = 0
        self.run_var = 0
        self.WIN_SIZE = WIN_SIZE
        self.windows = collections.deque(maxlen=WIN_SIZE)

    def clear(self):
        self.n = 0
        self.mean = 0
        self.run_var = 0
        self.windows.clear()

    def push(self, x):
        if self.n < self.WIN_SIZE:
            # Calculating first variance
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.run_var += delta * (x - self.mean)
        else:
            # Adjusting variance: the oldest value is about to drop out of the window
            x_removed = self.windows[0]
            old_m = self.mean
            self.mean += (x - x_removed) / self.WIN_SIZE
            self.run_var += (x + x_removed - old_m - self.mean) * (x - x_removed)
        self.windows.append(x)

    def get_mean(self):
        return self.mean if self.n else 0.0

    def get_var(self):
        return self.run_var / (self.WIN_SIZE - 1) if self.n > 1 else 0.0

    def get_std(self):
        return math.sqrt(self.get_var())

    def get_all(self):
        return list(self.windows)

    def __str__(self):
        return "Current window values: {}".format(list(self.windows))
I look forward to being proven wrong on this, but I don't think this can be done "quickly." That said, a large part of the calculation is keeping track of the EV over the window, which can be done easily.
I'll leave with the question: are you sure you need a windowed function? Unless you are working with very large windows, it is probably better to just use a well-known predefined algorithm.
I guess keeping track of your 20 samples, Sum(X^2 from 1..20), and Sum(X from 1..20) and then successively recomputing the two sums at each iteration isn't efficient enough? It's possible to recompute the new variance without adding up, squaring, etc., all of the samples each time.
As in:
Sum(X^2 from 2..21) = Sum(X^2 from 1..20) - X_1^2 + X_21^2
Sum(X from 2..21) = Sum(X from 1..20) - X_1 + X_21
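In code, that might look like the sketch below (it assumes at least k samples and uses the plain E(x^2) - E(x)^2 form, so it inherits that formula's cancellation issues):

def rolling_var_sums(samples, k=20):
    # Maintain Sum(X) and Sum(X^2) over the last k samples, updating both
    # incrementally as described above.
    s = sum(samples[:k])
    ss = sum(x * x for x in samples[:k])
    variances = [(ss - s * s / k) / k]      # population variance of the first window
    for i in range(k, len(samples)):
        s += samples[i] - samples[i - k]    # drop the sample leaving, add the new one
        ss += samples[i] ** 2 - samples[i - k] ** 2
        variances.append((ss - s * s / k) / k)
    return variances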
Here's another O(log k) solution: square the original sequence, then sum pairs, then quadruples, etc. (You'll need a bit of a buffer to be able to find all of these efficiently.) Then add up the values you need to get your answer. For example:
||||||||||||||||||||||||| // Squares
| | | | | | | | | | | | | // Sum of squares for pairs
| | | | | | | // Pairs of pairs
| | | | // (etc.)
| |
^------------------^ // Want these 20, which you can get with
| | // one...
| | | | // two, three...
| | // four...
|| // five stored values.
Now you use your standard E(x^2) - E(x)^2 formula and you're done. (This won't do if you need good stability for small sets of numbers; it assumes that only the accumulation of rolling error was causing issues.)
That said, summing 20 squared numbers is very fast these days on most architectures. If you were doing more--say, a couple hundred--a more efficient method would clearly be better. But I'm not sure that brute force isn't the way to go here.
For only 20 values, it's trivial to adapt the method exposed here (I didn't say fast, though).
You can simply pick up an array of 20 of these RunningStat classes.
The first 20 elements of the stream are somewhat special; however, once this is done, it's much simpler:
when a new element arrives, clear the current RunningStat instance, add the element to all 20 instances, and increment the "counter" (modulo 20) which identifies the new "full" RunningStat instance;
at any given moment, you can consult the current "full" instance to get your running variance.
You will obviously note that this approach isn't really scalable...
You can also note that there is some redundancy in the numbers we keep (if you go with the RunningStat full class). An obvious improvement would be to keep the last 20 Mk and Sk directly.
I cannot think of a better formula using this particular algorithm; I am afraid that its recursive formulation somewhat ties our hands.
This is just a minor addition to the excellent answer provided by DanS. The following equations are for removing the oldest sample from the window and updating the mean and variance. This is useful, for example, if you want to take smaller windows near the right edge of your input data stream (i.e. just remove the oldest window sample without adding a new sample).
window_size -= 1; % decrease window size by 1 sample
new_mean = prev_mean + (prev_mean - x_old) / window_size
varSum = varSum - (prev_mean - x_old) * (new_mean - x_old)
Here, x_old is the oldest sample in the window you wish to remove.
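As a small Python sketch of the same update (hypothetical names; it assumes the window still has more than one sample after removal):

def remove_oldest(window_size, mean, var_sum, x_old):
    # Shrink the window by one sample, per the equations above.
    window_size -= 1                                   # must remain >= 1
    new_mean = mean + (mean - x_old) / window_size
    var_sum -= (mean - x_old) * (new_mean - x_old)
    return window_size, new_mean, var_sum

# remove_oldest(3, 2.0, 2.0, 1) -> (2, 2.5, 0.5), matching the window [2, 3]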
For those coming here now, here's a reference containing the full derivation, with proofs, of DanS's answer and Jaime's related comment.
DanS and Jaime's response in concise C.
typedef struct {
    size_t n, i;
    float *samples, mean, var;
} rolling_var_t;

void rolling_var_init(rolling_var_t *c, size_t window_size) {
    size_t ss;

    memset(c, 0, sizeof(*c));
    c->n = window_size;
    c->samples = (float *) malloc(ss = sizeof(float)*window_size);
    memset(c->samples, 0, ss);
}

void rolling_var_add(rolling_var_t *c, float x) {
    float nmean; // new mean
    float xold;  // oldest x
    float dx;

    c->i = (c->i + 1) % c->n;
    xold = c->samples[c->i];
    dx = x - xold;
    nmean = c->mean + dx / (float) c->n; // walk mean
    //c->var += ((x - c->mean)*(x - nmean) - (xold - c->mean) * (xold - nmean)) / (float) c->n;
    c->var += ((x + xold - c->mean - nmean) * dx) / (float) c->n;

    c->mean = nmean;
    c->samples[c->i] = x;
}

What is the correct algorithm for a logarithmic distribution curve between two points?

I've read a bunch of tutorials about the proper way to generate a logarithmic distribution of tagcloud weights. Most of them group the tags into steps. This seems somewhat silly to me, so I developed my own algorithm based on what I've read, so that it dynamically distributes the tag's count along the logarithmic curve between the threshold and the maximum. Here's the essence of it in Python:
from math import log

count = [1, 3, 5, 4, 7, 5, 10, 6]

def logdist(count, threshold=0, maxsize=1.75, minsize=.75):
    countdist = []
    # mincount is either the threshold or the minimum if it's over the threshold
    mincount = threshold < min(count) and min(count) or threshold
    maxcount = max(count)
    spread = maxcount - mincount
    # the slope of the line (rise over run) between (mincount, minsize) and (maxcount, maxsize)
    delta = (maxsize - minsize) / float(spread)
    for c in count:
        logcount = log(c - (mincount - 1)) * (spread + 1) / log(spread + 1)
        size = delta * logcount - (delta - minsize)
        countdist.append({'count': c, 'size': round(size, 3)})
    return countdist
Basically, without the logarithmic calculation of the individual count, it would generate a straight line between the points, (mincount, minsize) and (maxcount, maxsize).
The algorithm does a good approximation of the curve between the two points, but suffers from one drawback. The mincount is a special case, and the logarithm of it produces zero. This means the size of the mincount would be less than minsize. I've tried cooking up numbers to try to solve this special case, but can't seem to get it right. Currently I just treat the mincount as a special case and add "or 1" to the logcount line.
Is there a more correct algorithm to draw a curve between the two points?
Update Mar 3: If I'm not mistaken, I am taking the log of the count and then plugging it into a linear equation. To put the description of the special case in other words: in y = ln(x), at x = 1, y = 0. This is what happens at the mincount. But the mincount can't be zero; the tag has not been used 0 times.
Try the code and plug in your own numbers to test. Treating the mincount as a special case is fine by me, I have a feeling it would be easier than whatever the actual solution to this problem is. I just feel like there must be a solution to this and that someone has probably come up with a solution.
UPDATE Apr 6: A simple google search turns up many of the tutorials I've read, but this is probably the most complete example of stepped tag clouds.
UPDATE Apr 28: In response to antti.huima's solution: When graphed, the curve that your algorithm creates lies below the line between the two points. I've been trying to juggle the numbers around but still can't seem to come up with a way to flip that curve to the other side of the line. I'm guessing that if the function was changed to some form of logarithm instead of an exponent it would do exactly what I'd need. Is that correct? If so, can anyone explain how to achieve this?
Thanks to antti.huima's help, I re-thought out what I was trying to do.
Taking his method of solving the problem, I want an equation where the logarithm of the mincount is equal to the linear equation between the two points.
weight(MIN) = ln(MIN-(MIN-1)) + min_weight
min_weight = ln(1) + min_weight
While this gives me a good starting point, I need to make it pass through the point (MAX, max_weight). It's going to need a constant:
weight(x) = ln(x-(MIN-1))/K + min_weight
Solving for K we get:
K = ln(MAX-(MIN-1))/(max_weight - min_weight)
So, to put this all back into some python code:
from math import log
count = [1, 3, 5, 4, 7, 5, 10, 6]
def logdist(count, threshold=0, maxsize=1.75, minsize=.75):
countdist = []
# mincount is either the threshold or the minimum if it's over the threshold
mincount = threshold<min(count) and min(count) or threshold
maxcount = max(count)
constant = log(maxcount - (mincount - 1)) / (maxsize - minsize)
for c in count:
size = log(c - (mincount - 1)) / constant + minsize
countdist.append({'count': c, 'size': round(size, 3)})
return countdist
Let's begin with your mapping from the logged count to the size. That's the linear mapping you mentioned:
size
|
max |_____
| /
| /|
| / |
min |/ |
| |
/| |
0 /_|___|____
0 a
where min and max are the min and max sizes, and a = log(maxcount) - b. The line is of the form y = mx + c, where x = log(count) - b.
From the graph, we can see that the gradient, m, is (maxsize-minsize)/a.
We need x=0 at y=minsize, so log(mincount)-b=0 -> b=log(mincount)
This leaves us with the following python:
mincount = min(count)
maxcount = max(count)
xoffset = log(mincount)
gradient = (maxsize - minsize) / (log(maxcount) - log(mincount))
for c in count:
    x = log(c) - xoffset
    size = gradient * x + minsize
If you want to make sure that the minimum count is always at least 1, replace the first line with:
mincount = min(count+[1])
which appends 1 to the count list before doing the min. The same goes for making sure the maxcount is always at least 1. Thus your final code per above is:
from math import log

count = [1, 3, 5, 4, 7, 5, 10, 6]

def logdist(count, maxsize=1.75, minsize=.75):
    countdist = []
    mincount = min(count + [1])
    maxcount = max(count + [1])
    xoffset = log(mincount)
    gradient = (maxsize - minsize) / (log(maxcount) - log(mincount))
    for c in count:
        x = log(c) - xoffset
        size = gradient * x + minsize
        countdist.append({'count': c, 'size': round(size, 3)})
    return countdist
What you have is tags whose counts run from MIN to MAX; the threshold issue can be ignored here because it amounts to setting every count below the threshold to the threshold value and taking the minimum and maximum only afterwards.
You want to map the tag counts to "weights", but in a "logarithmic fashion", which basically means (as I understand it) the following. First, the tags with count MAX get max_weight weight (in your example, 1.75):
weight(MAX) = max_weight
Secondly, the tags with the count MIN get min_weight weight (in your example, 0.75):
weight(MIN) = min_weight
Finally, it holds that when your count decreases by 1, the weight is multiplied with a constant K < 1, which indicates the steepness of the curve:
weight(x) = weight(x + 1) * K
Solving this, we get:
weight(x) = weight_max * (K ^ (MAX - x))
Note that with x = MAX, the exponent is zero and the multiplicand on the right becomes 1.
Now we have the extra requirement that weight(MIN) = min_weight, and we can solve:
weight_min = weight_max * (K ^ (MAX - MIN))
from which we get
K ^ (MAX - MIN) = weight_min / weight_max
and taking logarithm on both sides
(MAX - MIN) ln K = ln weight_min - ln weight_max
i.e.
ln K = (ln weight_min - ln weight_max) / (MAX - MIN)
The right hand side is negative as desired, because K < 1. Then
K = exp((ln weight_min - ln weight_max) / (MAX - MIN))
So now you have the formula to calculate K. After this you just apply for any count x between MIN and MAX:
weight(x) = max_weight * (K ^ (MAX - x))
And you are done.
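Putting that into a small Python sketch (a hypothetical helper; it assumes MAX > MIN and positive weights):

from math import exp, log

def log_weights(counts, min_weight=0.75, max_weight=1.75):
    # Compute K from the boundary conditions above, then weight(x) = max_weight * K**(MAX - x).
    lo, hi = min(counts), max(counts)    # assumes hi > lo
    K = exp((log(min_weight) - log(max_weight)) / (hi - lo))
    return {x: max_weight * K ** (hi - x) for x in counts}

print(log_weights([1, 3, 5, 4, 7, 5, 10, 6]))   # count 1 -> 0.75, count 10 -> 1.75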
On a log scale, you just plot the log of the numbers linearly (in other words, pretend you're plotting linearly, but take the log of the numbers to be plotted first).
The zero problem can't be solved analytically--you have to pick a minimum order of magnitude for your scale, and no matter what you can't ever reach zero. If you want to plot something at zero, your choices are to arbitrarily give it the minimum order of magnitude of the scale, or to omit it.
I don't have the exact answer, but I think you want to look up linearizing exponential data. Start by calculating the equation of the line passing through the points and take the log of both sides of that equation.
