Let's say I have a data set of {10, 20, 30}. My mean and variance here are mean = 20 and variance = 66.667. Is there a formula that lets me calculate the new variance value if I was to remove 10 from the data set turning it into {20, 30}?
This is a similar question to https://math.stackexchange.com/questions/3112650/formula-to-recalculate-variance-after-removing-a-value-and-adding-another-one-gi which deals with the case when there is replacement. https://math.stackexchange.com/questions/775391/can-i-calculate-the-new-standard-deviation-when-adding-a-value-without-knowing-t is also a similar question, except that it deals with adding a value instead of removing one. "Removing a prior sample while using Welford's method for computing single pass variance" deals with removing a sample, but I cannot figure out how to modify it for the population variance.
To compute Mean and Variance we want 3 parameters:
N - number of items
Sx - sum of items
Sxx - sum of items squared
Having all these values we can find mean and variance as
Mean = Sx / N
Variance = Sxx / N - Sx * Sx / N / N
In your case
items = {10, 20, 30}
N = 3
Sx = 60 = 10 + 20 + 30
Sxx = 1400 = 100 + 400 + 900 = 10 * 10 + 20 * 20 + 30 * 30
Mean = 60 / 3 = 20
Variance = 1400 / 3 - 60 * 60 / 3 / 3 = 66.666667
If you want to remove an item, just update N, Sx, Sxx values and compute a new variance:
item = 10
N' = N - 1 = 3 - 1 = 2
Sx' = Sx - item = 60 - 10 = 50
Sxx' = Sxx - item * item = 1400 - 10 * 10 = 1300
Mean' = Sx' / N' = 50 / 2 = 25
Variance' = Sxx' / N' - Sx' * Sx' / N' / N' = 1300 / 2 - 50 * 50 / 2 / 2 = 25
So if you remove item = 10 the new mean and variance will be
Mean' = 25
Variance' = 25
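The bookkeeping above can be sketched in a few lines of Python. This is only a minimal sketch: the class name RunningStats and its method names are made up for illustration, and it computes the population variance (dividing by N), which is what the numbers in this answer use.

```python
class RunningStats:
    """Tracks N, Sx and Sxx so items can be added or removed in O(1)."""

    def __init__(self, items=()):
        self.n = 0      # N   - number of items
        self.sx = 0.0   # Sx  - sum of items
        self.sxx = 0.0  # Sxx - sum of squared items
        for item in items:
            self.add(item)

    def add(self, item):
        self.n += 1
        self.sx += item
        self.sxx += item * item

    def remove(self, item):
        # Assumption: the item really is in the data set; this is not checked.
        self.n -= 1
        self.sx -= item
        self.sxx -= item * item

    def mean(self):
        return self.sx / self.n

    def variance(self):
        # Population variance: Sxx / N - (Sx / N)^2
        return self.sxx / self.n - (self.sx / self.n) ** 2


stats = RunningStats([10, 20, 30])
print(stats.mean(), stats.variance())   # mean 20, variance about 66.667
stats.remove(10)
print(stats.mean(), stats.variance())   # mean 25, variance 25
```

One caveat on the design: Sxx / N and (Sx / N)^2 can be large and nearly equal, so this form can lose floating-point precision when the values are large compared with their spread.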
I got stuck on the task below and spent about 3 hours trying to figure it out.
Task description: A man has a rather old car being worth $2000. He saw a secondhand car being worth $8000. He wants to keep his old car until he can buy the secondhand one.
He thinks he can save $1000 each month but the prices of his old car and of the new one decrease of 1.5 percent per month. Furthermore this percent of loss increases by 0.5 percent at the end of every two months. Our man finds it difficult to make all these calculations.
How many months will it take him to save up enough money to buy the car he wants, and how much money will he have left over?
My code so far:
def nbMonths(startPriceOld, startPriceNew, savingperMonth, percentLossByMonth)
  dep_value_old = startPriceOld
  mth_count = 0
  total_savings = 0
  dep_value_new = startPriceNew
  mth_count_new = 0
  while startPriceOld != startPriceNew do
    if startPriceOld >= startPriceNew
      return mth_count = 0, startPriceOld - startPriceNew
    end
    dep_value_new = dep_value_new - (dep_value_new * percentLossByMonth / 100)
    mth_count_new += 1
    if mth_count_new % 2 == 0
      dep_value_new = dep_value_new - (dep_value_new * 0.5) / 100
    end
    dep_value_old = dep_value_old - (dep_value_old * percentLossByMonth / 100)
    mth_count += 1
    total_savings += savingperMonth
    if mth_count % 2 == 0
      dep_value_old = dep_value_old - (dep_value_old * 0.5) / 100
    end
    affordability = total_savings + dep_value_old
    if affordability >= dep_value_new
      return mth_count, affordability - dep_value_new
    end
  end
end
print nbMonths(2000, 8000, 1000, 1.5) # Expected result: [6, 766]
The data are as follows.
op = 2000.0 # current old car value
np = 8000.0 # current new car price
sv = 1000.0 # monthly savings
dr = 0.015 # monthly depreciation, both cars (1.5%)
cr = 0.005 # additional depreciation every two months, both cars (0.5%)
After n >= 0 months the man's (let's call him "Rufus") savings plus the value of his car equal
sv*n + op*(1 - n*dr - (cr + 2*cr + 3*cr +...+ (n/2)*cr))
where n/2 is integer division. As
cr + 2*cr + 3*cr +...+ (n/2)*cr = cr*((1+2+..+n)/2) = cr*(1+n/2)*(n/2)
the expression becomes
sv*n + op*(1 - n*dr - cr*(1+(n/2))*(n/2))
Similarly, after n months the cost of the car he wants to purchase will fall to
np * (1 - n*dr - cr*(1+(n/2))*(n/2))
If we set these two expressions equal we obtain the following.
sv*n + op - op*dr*n - op*cr*(n/2) - op*cr*(n/2)**2 =
np - np*dr*n - np*cr*(n/2) - np*cr*(n/2)**2
which reduces to
cr*(np-op)*(n/2)**2 + (sv + dr*(np-op))*n + cr*(np-op)*(n/2) - (np-op) = 0
or
cr*(n/2)**2 + (sv/(np-op) + dr)*n + cr*(n/2) - 1 = 0
If we momentarily treat (n/2) as a float division, this expression reduces to a quadratic.
(cr/4)*n**2 + (sv/(np-op) + dr + cr/2)*n - 1 = 0
= a*n**2 + b*n + c = 0
where
a = cr/4 = 0.005/4 = 0.00125
b = sv/(np-op) + dr + cr/2 = 1000.0/(8000-2000) + 0.015 + 0.005/2 = 0.18417
c = -1
Incidentally, Rufus doesn't have a computer, but he does have an HP 12c calculator his grandfather gave him when he was a kid, which is perfectly adequate for these simple calculations.
The roots are computed as follows.
(-b + Math.sqrt(b**2 - 4*a*c))/(2*a) #=> 5.24
(-b - Math.sqrt(b**2 - 4*a*c))/(2*a) #=> -152.58
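For anyone who wants to re-check the arithmetic, here is a small Python sketch of the same root computation. It simply plugs the values of op, np, sv, dr and cr listed above into the quadratic; rounding the positive root up to a whole month is an assumption about how the result is meant to be read.

```python
import math

# Values from the problem statement (np here is just the new car price).
op, np, sv = 2000.0, 8000.0, 1000.0
dr, cr = 0.015, 0.005

# Coefficients of a*n**2 + b*n + c = 0
a = cr / 4
b = sv / (np - op) + dr + cr / 2
c = -1.0

disc = math.sqrt(b * b - 4 * a * c)
root_hi = (-b + disc) / (2 * a)   # the positive root, roughly 5.24
root_lo = (-b - disc) / (2 * a)   # the negative root, not meaningful here

# First whole number of months at or beyond the positive root.
months = math.ceil(root_hi)
print(round(root_hi, 2), round(root_lo, 2), months)
```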
It appears that Rufus can purchase the new vehicle (if it's still for sale) in six months, since 5.24 rounds up to 6. Had we been able to solve the above equation for n/2 using integer division it might have turned out that Rufus would have had to wait longer. That's because for a given n both cars would have depreciated less (or at least not more), and because the car to be purchased is more expensive than the current car, the difference in values would be greater than that obtained with the float approximation for n/2. We need to check that, however. After n months, Rufus' savings and the value of his beater will equal
sv*n + op*(1 - dr*n - cr*(1+(n/2))*(n/2))
= 1000*n + 2000*(1 - 0.015*n - 0.005*(1+(n/2))*(n/2))
For n = 6 this equals
1000*6 + 2000*(1 - 0.015*6 - 0.005*(1+(6/2))*(6/2))
= 1000*6 + 2000*(1 - 0.015*6 - 0.005*(1+3)*3)
= 1000*6 + 2000*0.85
= 7700
The cost of Rufus' dream car after n months will be
np * (1 - dr*n - cr*(1+(n/2))*(n/2))
= 8000 * (1 - 0.015*n - 0.005*(1+(n/2))*(n/2))
For n=6 this becomes
8000 * (1 - 0.015*6 - 0.005*(1+(6/2))*(6/2))
= 8000*0.85
= 6800
(Notice that the factor 0.85 is the same in both calculations.)
Yes, Rufus will be able to buy the car in 6 months.
def nbMonths(old, new, savings, percent)
  percent = percent.fdiv(100)
  current_savings = 0
  months = 0
  loop do
    break if current_savings + old >= new
    current_savings += savings
    old -= old * percent
    new -= new * percent
    months += 1
    percent += 0.005 if months.odd?
  end
  [months, (current_savings + old - new).round]
end
~Why the hell has this had down votes.... you people are weird!
OK, so this is a very simple HTML5, jQuery and PHP game. Sorry to the people who have answered, I forgot to say this is a PHP script; I have updated the question to reflect that.
The first level takes 1 minute. Every level after that takes 10 seconds longer than the previous level, like so:
level 1 = 60 seconds
level 2 = 70 seconds
level 3 = 80 seconds
level 4 = 90 seconds
and so on infinitely.
I need an equation that gives the total number of seconds played based on the user's level.
level = n
I started with (n * 10) + (n * 60) but soon realized that it doesn't account for each level already being 10 seconds longer than the one before it. I have temporarily fixed it using a function with a loop that stops at the level number and returns the value, but I really want an actual equation.
SO, I know you won't let me down :-)
Thanks in advance.
This is what I am using:
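function getnumberofsecondsfromlevel($level){
    $lastlevelseconds = 60; // duration of the level about to be added
    $totalseconds = 0;      // running total, initialised before first use
    $counter = 0;           // number of levels added so far
    while($counter < $level){
        $totalseconds = $lastlevelseconds+$totalseconds;
        $lastlevelseconds = $lastlevelseconds + 10;
        $counter++;
    }
    return $totalseconds;
}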
$level = $_SESSION['**hidden**']['thelevel'];
$totaldureationinseconds = getnumberofsecondsfromlevel($level);
but I want to replace it with an actual equation, like so (of course this is wrong; it is just an example of the format I want, i.e. an equation):
$n = $_SESSION['**hidden**']['thelevel']; // level to get the total value of, in seconds
$s = 60; // seconds for the first level
$totaldureationinseconds = ($n * 10) + ($s * $n);
SOLVED by Gopalkrishna Narayan Prabhu :-)
$totalseconds = 60 * $level + 5* (($level-1) * $level);
var total_secs = 0;
for (var i = 1; i <= n; i++) {
    total_secs = total_secs + (i * 10) + 50;
}
for n= 1, total_secs = 0 + 10 + 50 = 60
for n= 2, total_secs = 60 + 20 + 50 = 130
and so on...
For a single equation:
var n = level_number;
total_secs = 60 * n + 5* ((n-1) * n);
Hope this helps.
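As a quick sanity check, the closed form can be compared against the level-by-level loop. The sketch below is in Python rather than PHP or JavaScript, and the function names and the first/step parameters are only illustrative.

```python
def total_seconds(level, first=60, step=10):
    """Closed form: first * n + step * (n - 1) * n / 2."""
    # (level - 1) * level is always even, so integer division is exact.
    return first * level + step * (level - 1) * level // 2


def total_seconds_loop(level, first=60, step=10):
    """Reference version: add up each level's duration one by one."""
    total = 0
    duration = first
    for _ in range(level):
        total += duration
        duration += step
    return total


for n in (1, 2, 3, 4):
    print(n, total_seconds(n), total_seconds_loop(n))
```

With first = 60 and step = 10 the closed form reduces to 60 * n + 5 * (n - 1) * n, the same expression as above.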
It seems as though you're just looking for the equation
60 + ((levelN - 1) * 10)
Where levelN is the current level, starting at 1. If you make the first level 0, you can get rid of the - 1 part and make it just
60 + (levelN * 10)
Thought process:
What's the base/first number? What's the lowest it can ever be? 60. That means your equation will start with
60 + ...
Every time you increase the level, you add 10, so at some point you'll need something like levelN * 10. Then, it's just some fiddling. In this case, since you don't add any on the first level, and the first level is level 1, you just need to subtract 1 from the level number to fix that.
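You can solve this with a really simple mathematical expression. It needs the sum 1 + 2 + ... + (n-1), which is the triangular number (n-1) * n / 2, not the factorial (a factorial multiplies the terms instead of adding them, and gives 70 rather than 60 for n = 1).
(((n-1) * n / 2) * 10) + (60 * n)
n is the level, of course.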
Recently I found this in some code I wrote a few years ago. It was used to rationalize a real value (within a tolerance) by determining a suitable denominator and then checking if the difference between the original real and the rational was small enough.
Edit to clarify: I actually don't want to convert all real values. For instance I could choose a max denominator of 14, and a real value that equals 7/15 would stay as-is. This isn't obvious from the algorithms I wrote here because the max denominator is an outside variable.
The algorithm to get the denominator was this (pseudocode):
denominator(x)
  frac = fractional part of x
  recip = 1/frac
  if (frac < tol)
    return 1
  else
    return recip * denominator(recip)
  end
end
It seems to be based on continued fractions, although it became clear on looking at it again that it was wrong. (It worked for me because it would eventually just spit out infinity, which I handled outside, but it would often be really slow.) The value for tol doesn't really do anything except in the case of termination or for numbers that end up close. I don't think it's relatable to the tolerance for the real - rational conversion.
I've replaced it with an iterative version that is not only faster but I'm pretty sure it won't fail theoretically (d = 1 to start with and fractional part returns a positive, so recip is always >= 1) :
denom_iter(x, d)
  return d if d > maxd
  frac = fractional part of x
  if (frac == 0)
    return d
  else
    recip = 1/frac
    return denom_iter(recip, d*recip)
  end
end
What I'm curious to know is whether there's a way to pick the maxd that will ensure that it converts all values that are possible for a given tolerance. I'm assuming 1/tol but don't want to miss something. I'm also wondering if there's a way in this approach to actually limit the denominator size, since it currently allows some denominators larger than maxd.
This can be considered a 2D minimization problem on error:
ArgMin ( r - q / p ), where r is real, q and p are integers
I suggest using the Gradient Descent algorithm. The gradient of this objective function is:
f'(q, p) = (-1/p, q/p^2)
The initial guess r_o can be q being the closest integer to r, and p being 1.
The stopping condition can be thresholding of the error.
The pseudo-code of GD can be found in wiki: http://en.wikipedia.org/wiki/Gradient_descent
If the initial guess is close enough, the objective function should be convex.
As Jacob suggested, this problem can be better solved by minimizing the following error function:
ArgMin ( p * r - q ), where r is real, q and p are integers
This is a linear objective, so it can be solved efficiently by any ILP (Integer Linear Programming) solver. GD works on non-linear cases, but lacks efficiency on linear problems.
The initial guess and stopping condition can be similar to those stated above. A better choice may be available depending on the individual solver.
I suggest you still assume convexity near the local minimum, which can greatly reduce cost. You can also try the Simplex method, which works well on linear programming problems.
I give credit to Jacob on this.
A problem similar to this is solved in the Approximations section beginning ca. page 28 of Bill Gosper's Continued Fraction Arithmetic document. (Ref: postscript file; also see text version, from line 1984.) The general idea is to compute continued-fraction approximations of the low-end and high-end range limiting numbers, until the two fractions differ, and then choose a value in the range of those two approximations. This is guaranteed to give a simplest fraction, using Gosper's terminology.
The Python code below (program "simpleden") implements a similar process. (It probably is not as good as Gosper's suggested implementation, but is good enough that you can see what kind of results the method produces.) The amount of work done is similar to that for Euclid's algorithm, i.e. O(n) for numbers with n bits, so the program is reasonably fast. Some example test cases (i.e. the program's output) are shown after the code itself. Note that the function simpleratio(vlo, vhi) as shown here returns -1 if vhi is smaller than vlo.
#!/usr/bin/env python
def simpleratio(vlo, vhi):
    # Returns the denominator of a simplest fraction in [vlo, vhi].
    rlo, rhi, eps = vlo, vhi, 0.0000001
    if vhi < vlo: return -1
    num = denp = 1
    nump = den = 0
    while 1:
        # Next continued-fraction terms of the low and high limits
        klo, khi = int(rlo), int(rhi)
        if klo != khi or rlo-klo < eps or rhi-khi < eps:
            # The expansions differ (or one has ended): pick a denominator
            tlo = denp + klo * den
            thi = denp + khi * den
            if tlo < thi:
                return tlo + (rlo-klo > eps)*den
            elif thi < tlo:
                return thi + (rhi-khi > eps)*den
            else:
                return tlo
        # Terms agree: advance the convergents and continue
        nump, num = num, nump + klo * num
        denp, den = den, denp + klo * den
        rlo, rhi = 1/(rlo-klo), 1/(rhi-khi)

def test(vlo, vhi):
    den = simpleratio(vlo, vhi)
    fden = float(den)
    ilo, ihi = int(vlo*den), int(vhi*den)
    rlo, rhi = ilo/fden, ihi/fden
    izok = 'ok' if rlo <= vlo <= rhi <= vhi else 'wrong'
    # print(...) with a single argument runs under both Python 2 and 3
    print('{:4d}/{:4d} = {:0.8f} vlo:{:0.8f} {:4d}/{:4d} = {:0.8f} vhi:{:0.8f} {}'.format(ilo,den,rlo,vlo, ihi,den,rhi,vhi, izok))

test(0.685, 0.695)
test(0.685, 0.7)
test(0.685, 0.71)
test(0.685, 0.75)
test(0.685, 0.76)
test(0.75, 0.76)
test(2.173, 2.177)
test(2.373, 2.377)
test(3.484, 3.487)
test(4.0, 4.87)
test(4.0, 8.0)
test(5.5, 5.6)
test(5.5, 6.5)
test(7.5, 7.3)
test(7.5, 7.5)
test(8.534537, 8.534538)
test(9.343221, 9.343222)
Output from program:
> ./simpleden
8/ 13 = 0.61538462 vlo:0.68500000 9/ 13 = 0.69230769 vhi:0.69500000 ok
6/ 10 = 0.60000000 vlo:0.68500000 7/ 10 = 0.70000000 vhi:0.70000000 ok
6/ 10 = 0.60000000 vlo:0.68500000 7/ 10 = 0.70000000 vhi:0.71000000 ok
2/ 4 = 0.50000000 vlo:0.68500000 3/ 4 = 0.75000000 vhi:0.75000000 ok
2/ 4 = 0.50000000 vlo:0.68500000 3/ 4 = 0.75000000 vhi:0.76000000 ok
3/ 4 = 0.75000000 vlo:0.75000000 3/ 4 = 0.75000000 vhi:0.76000000 ok
36/ 17 = 2.11764706 vlo:2.17300000 37/ 17 = 2.17647059 vhi:2.17700000 ok
18/ 8 = 2.25000000 vlo:2.37300000 19/ 8 = 2.37500000 vhi:2.37700000 ok
114/ 33 = 3.45454545 vlo:3.48400000 115/ 33 = 3.48484848 vhi:3.48700000 ok
4/ 1 = 4.00000000 vlo:4.00000000 4/ 1 = 4.00000000 vhi:4.87000000 ok
4/ 1 = 4.00000000 vlo:4.00000000 8/ 1 = 8.00000000 vhi:8.00000000 ok
11/ 2 = 5.50000000 vlo:5.50000000 11/ 2 = 5.50000000 vhi:5.60000000 ok
5/ 1 = 5.00000000 vlo:5.50000000 6/ 1 = 6.00000000 vhi:6.50000000 ok
-7/ -1 = 7.00000000 vlo:7.50000000 -7/ -1 = 7.00000000 vhi:7.30000000 wrong
15/ 2 = 7.50000000 vlo:7.50000000 15/ 2 = 7.50000000 vhi:7.50000000 ok
8030/ 941 = 8.53347503 vlo:8.53453700 8031/ 941 = 8.53453773 vhi:8.53453800 ok
24880/2663 = 9.34284641 vlo:9.34322100 24881/2663 = 9.34322193 vhi:9.34322200 ok
If, rather than the simplest fraction in a range, you seek the best approximation given some upper limit on denominator size, consider code like the following, which replaces all the code from def test(vlo, vhi) forward.
def smallden(target, maxden):
    global pas
    pas = 0
    tol = 1/float(maxden)**2
    while 1:
        den = simpleratio(target-tol, target+tol)
        pas += 1  # count every call to simpleratio, so the first pass is 1
        if den <= maxden: return den
        tol *= 2

# Test driver for smallden(target, maxden) routine
import random
totalpass, trials, passes = 0, 20, [0 for i in range(20)]
print('Maxden Num Den Num/Den Target Error Passes')
for i in range(trials):
    target = random.random()
    maxden = 10 + round(10000*random.random())
    den = smallden(target, maxden)
    num = int(round(target*den))
    got = float(num)/den
    print('{:4d} {:4d}/{:4d} = {:10.8f} = {:10.8f} + {:12.9f} {:2}'.format(
        int(maxden), num, den, got, target, got - target, pas))
    totalpass += pas
    passes[pas-1] += 1
print('Average pass count: {:0.3}\nPass histo: {}'.format(
    float(totalpass)/trials, passes))
In production code, drop out all the references to pas (etc.), ie, drop out pass-counting code.
The routine smallden is given a target value and a maximum value for allowed denominators. Given maxden possible choices of denominators, it's reasonable to suppose that a tolerance on the order of 1/maxden² can be achieved. The pass-counts shown in the following typical output (where target and maxden were set via random numbers) illustrate that such a tolerance was reached immediately more than half the time, but in other cases tolerances 2 or 4 or 8 times as large were used, requiring extra calls to simpleratio. Note that the last two lines of output from a 100000-number test run (the histogram entries sum to 100000) are shown following the complete output of a 20-number test run.
Maxden Num Den Num/Den Target Error Passes
1198 32/ 509 = 0.06286837 = 0.06286798 + 0.000000392 1
2136 115/ 427 = 0.26932084 = 0.26932103 + -0.000000185 1
4257 839/2670 = 0.31423221 = 0.31423223 + -0.000000025 1
2680 449/ 509 = 0.88212181 = 0.88212132 + 0.000000486 3
2935 440/1853 = 0.23745278 = 0.23745287 + -0.000000095 1
6128 347/1285 = 0.27003891 = 0.27003899 + -0.000000077 3
8041 1780/4243 = 0.41951449 = 0.41951447 + 0.000000020 2
7637 3926/7127 = 0.55086292 = 0.55086293 + -0.000000010 1
3422 27/ 469 = 0.05756930 = 0.05756918 + 0.000000113 2
1616 168/1507 = 0.11147976 = 0.11147982 + -0.000000061 1
260 62/ 123 = 0.50406504 = 0.50406378 + 0.000001264 1
3775 52/3327 = 0.01562970 = 0.01562750 + 0.000002195 6
233 6/ 13 = 0.46153846 = 0.46172772 + -0.000189254 5
3650 3151/3514 = 0.89669892 = 0.89669890 + 0.000000020 1
9307 2943/7528 = 0.39094049 = 0.39094048 + 0.000000013 2
962 206/ 225 = 0.91555556 = 0.91555496 + 0.000000594 1
2080 564/1975 = 0.28556962 = 0.28556943 + 0.000000190 1
6505 1971/2347 = 0.83979548 = 0.83979551 + -0.000000022 1
1944 472/ 833 = 0.56662665 = 0.56662696 + -0.000000305 2
3244 291/1447 = 0.20110574 = 0.20110579 + -0.000000051 1
Average pass count: 1.85
Pass histo: [12, 4, 2, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
The last two lines of output from a 100000-number test run:
Average pass count: 1.77
Pass histo: [56659, 25227, 10020, 4146, 2072, 931, 497, 233, 125, 39, 33, 17, 1, 0, 0, 0, 0, 0, 0, 0]
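If you only need the closest fraction under a denominator cap, Python's standard library already offers Fraction.limit_denominator. The sketch below swaps that in; it is not Gosper's method and not the simpleratio/smallden code above. The helper name rationalize and its tol parameter are hypothetical, chosen to mirror the "leave the value as-is when no fraction is close enough" behaviour described in the question.

```python
from fractions import Fraction
import math


def rationalize(x, maxden, tol):
    """Return the closest fraction to x with denominator <= maxden,
    or None when even that fraction is further than tol from x."""
    candidate = Fraction(x).limit_denominator(maxden)
    if abs(float(candidate) - x) <= tol:
        return candidate
    return None  # caller keeps the original real value


print(rationalize(0.75, 14, 1e-9))        # 3/4
print(rationalize(7 / 15, 14, 1e-6))      # None: no denominator <= 14 is close enough
print(rationalize(math.pi, 1000, 1e-6))   # 355/113
```

Unlike the recursive denominator search in the question, the denominator returned here never exceeds maxden.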
So I have a counter. It is supposed to calculate the current amount of something. To calculate this, I know the start date, and start amount, and the amount to increment the counter by each second. Easy peasy. The tricky part is that the growth is not quite linear. Every day, the increment amount increases by a set amount. I need to recreate this algorithmically - basically figure out the exact value at the current date based on the starting value, the amount incremented over time, and the amount the increment has increased over time.
My target language is Javascript, but pseudocode is fine too.
Based on AB's solution:
var now = new Date();
var startDate1 = new Date("January 1 2010");
var days1 = (now - startDate1) / 1000 / 60 / 60 / 24;
var startNumber1 = 9344747520;
var startIncrement1 = 463;
var dailyIncrementAdjustment1 = .506;
var currentIncrement = startIncrement1 + (dailyIncrementAdjustment1 * days1);
startNumber1 = startNumber1 + (days1 / 2) * (2 * startIncrement1 + (days1 - 1) * dailyIncrementAdjustment1);
Does that look reasonable to you guys?
It's a quadratic function. If t is the time passed, then it's the usual a*t^2 + b*t + c, and you can figure out a, b, c by substituting the results for the first 3 seconds.
Or: use the formula for the arithmetic progression sum, where a_1 is the initial increment, and the common difference is the "set amount" you refer to. Just don't forget to add your "start amount" to what the formula gives you.
If x0 is the initial amount, d is the initial increment, and e is the "set amount" by which the increment increases, it comes to
x0 + (t/2)*(2d + (t-1)*e)
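Here is a small Python sketch that checks this closed form against a plain loop. It assumes t is a whole number of steps and that the increment grows by e after every step; the function names are only illustrative.

```python
def counter_after(x0, d, e, t):
    """Closed form: x0 plus the sum of the arithmetic progression
    d, d + e, d + 2e, ... with t terms."""
    return x0 + (t / 2) * (2 * d + (t - 1) * e)


def counter_after_loop(x0, d, e, t):
    """Reference version: apply the increments one step at a time."""
    value, increment = x0, d
    for _ in range(t):
        value += increment
        increment += e
    return value


# 100 + (5 + 7 + 9 + 11) = 132 either way
print(counter_after(100, 5, 2, 4), counter_after_loop(100, 5, 2, 4))
```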
If I understand your question correctly, you have an initial value x_0, an initial increment per second of d_0 and an increment adjustment of e per day. That is, on day one the increment per second is d_0, on day two the increment per second is d_0 + e, etc.
Then, we note that the increment per second at time t is
d(t) = d_0 + floor(t / S) * e
where S is the number of seconds per day and t is the number of seconds that have elapsed since the start time t_0. Then, writing D = floor(t / S) for the number of complete days,
x = x_0 + sum_{0 <= k < D} S * (d_0 + k * e) + (t - S * D) * (d_0 + D * e)
is the formula that you are seeking: the sum covers the complete days and the last term covers the current, partial day. From here, you can simplify this to
x = x_0 + S * D * d_0 + S * e * (D - 1) * D / 2 + (t - S * D) * (d_0 + D * e).
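To make the day-by-day reading concrete, here is a Python sketch that assumes the per-second increment is constant within a day and grows by e at each day boundary. It adds the contribution of the current, incomplete day on top of the completed days and compares the result with a second-by-second sum. The function names are hypothetical, and the example uses a shortened "day" of 10 seconds so the brute-force check stays small.

```python
def counter_value(x0, d0, e, t, seconds_per_day=86400):
    """Closed form for whole seconds t elapsed since the start."""
    full_days, remainder = divmod(t, seconds_per_day)
    # Completed days: S*D*d0 + S*e*(D-1)*D/2
    complete = (seconds_per_day * full_days * d0
                + seconds_per_day * e * (full_days - 1) * full_days / 2)
    # Current partial day runs at the rate d0 + D*e
    partial = remainder * (d0 + full_days * e)
    return x0 + complete + partial


def counter_value_brute(x0, d0, e, t, seconds_per_day=86400):
    """Reference version: add the increment for every elapsed second."""
    total = x0
    for second in range(t):
        total += d0 + (second // seconds_per_day) * e
    return total


# Three full 10-second "days" plus 7 seconds of the fourth.
print(counter_value(100, 3, 0.5, 37, seconds_per_day=10))
print(counter_value_brute(100, 3, 0.5, 37, seconds_per_day=10))
```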
use strict; use warnings;
my $start = 0;
my $stop = 100;
my $current = $start;
for my $day ( 1 .. 100 ) {
    $current += ($day / 10);
    last unless $current < $stop;
    printf "Day: %d\tLeft %.2f\n", $day, (1 - $current/$stop);
}
Output:
Day: 1 Left 1.00
Day: 2 Left 1.00
Day: 3 Left 0.99
Day: 4 Left 0.99
Day: 5 Left 0.98
...
Day: 42 Left 0.10
Day: 43 Left 0.05
Day: 44 Left 0.01