Formula to recalculate population variance after removing a value - algorithm

Let's say I have a data set of {10, 20, 30}. My mean and variance here are mean = 20 and variance = 66.667. Is there a formula that lets me calculate the new variance value if I were to remove 10 from the data set, turning it into {20, 30}?
This is a similar question to https://math.stackexchange.com/questions/3112650/formula-to-recalculate-variance-after-removing-a-value-and-adding-another-one-gi which deals with the case when there is replacement. https://math.stackexchange.com/questions/775391/can-i-calculate-the-new-standard-deviation-when-adding-a-value-without-knowing-t is also a similar question, except that it deals with adding a value instead of removing one. "Removing a prior sample while using Welford's method for computing single pass variance" deals with removing a sample, but I cannot figure out how to modify it for the population variance.

To compute Mean and Variance we want 3 parameters:
N - number of items
Sx - sum of items
Sxx - sum of items squared
Having all these values we can find mean and variance as
Mean = Sx / N
Variance = Sxx / N - Sx * Sx / N / N
In your case
items = {10, 20, 30}
N = 3
Sx = 60 = 10 + 20 + 30
Sxx = 1400 = 100 + 400 + 900 = 10 * 10 + 20 * 20 + 30 * 30
Mean = 60 / 3 = 20
Variance = 1400 / 3 - 60 * 60 / 3 / 3 = 66.666667
If you want to remove an item, just update N, Sx, Sxx values and compute a new variance:
item = 10
N' = N - 1 = 3 - 1 = 2
Sx' = Sx - item = 60 - 10 = 50
Sxx' = Sxx - item * item = 1400 - 10 * 10 = 1300
Mean' = Sx' / N' = 50 / 2 = 25
Variance' = Sxx' / N' - Sx' * Sx' / N' / N' = 1300 / 2 - 50 * 50 / 2 / 2 = 25
So if you remove item = 10 the new mean and variance will be
Mean' = 25
Variance' = 25
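As a rough illustration of the update described above, here is a minimal Python sketch. The class name `RunningStats` and its method names are hypothetical, chosen only for this example, and it assumes that any value passed to `remove` was previously added.

```python
class RunningStats:
    """Keeps N, Sx and Sxx so mean and population variance update in O(1)."""

    def __init__(self, items=()):
        self.n = 0      # N: number of items
        self.sx = 0.0   # Sx: sum of items
        self.sxx = 0.0  # Sxx: sum of squared items
        for x in items:
            self.add(x)

    def add(self, x):
        self.n += 1
        self.sx += x
        self.sxx += x * x

    def remove(self, x):
        # Assumption: x is a value that was added earlier.
        self.n -= 1
        self.sx -= x
        self.sxx -= x * x

    def mean(self):
        return self.sx / self.n

    def variance(self):
        # Population variance: Sxx / N - (Sx / N)^2
        m = self.mean()
        return self.sxx / self.n - m * m


stats = RunningStats([10, 20, 30])
print(stats.mean(), stats.variance())
stats.remove(10)
print(stats.mean(), stats.variance())
```

Because the variance is computed as a difference of two potentially large numbers, this form can lose floating-point precision when the mean is much larger than the spread of the data, so it is best read as a sketch rather than a numerically robust implementation.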

Related

How to write a program in gwbasic for adding the natural numbers for 1 to 100?

I am trying to write a program for adding the natural numbers from 1 to n (1 + 2 + 3 + ... + n). However, the sum appears 1 when I use if statement. And when I use for-next statement there is a syntax error that I don't understand.
if:
30 let s = 0
40 let i = 1
50 s = s + i
60 i = i + 1
70 if i<=n, then goto 50
80 print s
for-next:
30 let i, s
40 s = 0
50 for i = 1 to n
60 s = s + i
70 next i
80 print n
When I take n = 10, the if statement code gives a result of 1, but it should be 55.
When I try to use the for-next statement, it gives no result saying that there is a syntax error in 30.
Why is this happening?
The following code works in this online Basic interpreter.
10 let n = 100
30 let s = 0
40 let i = 1
50 s = s + i
60 i = i + 1
70 if i <= n then goto 50 endif
80 print s
I initialised n on the line labelled 10, removed the comma on the line labelled 70 and added an endif on the same line.
This is the for-next version:
30 let n = 100
40 let s = 0
50 for i = 1 to n
60 s = s + i
70 next i
80 print s
(btw, the sum of the first n natural numbers is n(n+1)/2:
10 let n = 100
20 let s = n * (n + 1) / 2
30 print s
)
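For a quick cross-check of the closed form outside of BASIC, the following small Python sketch compares the loop with n(n+1)/2. The function names are made up for this illustration.

```python
def sum_with_loop(n):
    """Add 1 + 2 + ... + n one term at a time, like the for-next version."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_closed_form(n):
    """Same sum via n(n+1)/2; the product is always even, so // is exact."""
    return n * (n + 1) // 2


print(sum_with_loop(10), sum_closed_form(10))
print(sum_with_loop(100), sum_closed_form(100))
```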
Why is this happening? Where am I mistaking?
30 let s = 0
40 let i = 1
50 s = s + i
60 i = i + 1
70 if i<=n, then goto 50
80 print s
Fix #1: Initialize variable 'n':
20 let n = 10
Fix #2: Remove comma from line 70:
70 if i<=n then goto 50
30 let i, s
40 s = 0
50 for i = 1 to n
60 s = s + i
70 next i
80 print n
Fix #1: Initialize variable 'n':
30 let n = 10
Fix #2: Print 's' instead of 'n':
80 print s
10 cls
20 let x=1
30 for x=1 to 100
40 print x
50 next x
60 end

Ruby algorithms loops codewars

I got stuck with below task and spent about 3 hours trying to figure it out.
Task description: A man has a rather old car being worth $2000. He saw a secondhand car being worth $8000. He wants to keep his old car until he can buy the secondhand one.
He thinks he can save $1000 each month but the prices of his old car and of the new one decrease of 1.5 percent per month. Furthermore this percent of loss increases by 0.5 percent at the end of every two months. Our man finds it difficult to make all these calculations.
How many months will it take him to save up enough money to buy the car he wants, and how much money will he have left over?
My code so far:
def nbMonths(startPriceOld, startPriceNew, savingperMonth, percentLossByMonth)
dep_value_old = startPriceOld
mth_count = 0
total_savings = 0
dep_value_new = startPriceNew
mth_count_new = 0
while startPriceOld != startPriceNew do
if startPriceOld >= startPriceNew
return mth_count = 0, startPriceOld - startPriceNew
end
dep_value_new = dep_value_new - (dep_value_new * percentLossByMonth / 100)
mth_count_new += 1
if mth_count_new % 2 == 0
dep_value_new = dep_value_new - (dep_value_new * 0.5) / 100
end
dep_value_old = dep_value_old - (dep_value_old * percentLossByMonth / 100)
mth_count += 1
total_savings += savingperMonth
if mth_count % 2 == 0
dep_value_old = dep_value_old - (dep_value_old * 0.5) / 100
end
affordability = total_savings + dep_value_old
if affordability >= dep_value_new
return mth_count, affordability - dep_value_new
end
end
end
print nbMonths(2000, 8000, 1000, 1.5) # Expected result: [6, 766]
The data are as follows.
op = 2000.0 # current old car value
np = 8000.0 # current new car price
sv = 1000.0 # monthly savings
dr = 0.015 # monthly depreciation, both cars (1.5%)
cr = 0.005 # additional depreciation every two months, both cars (0.5%)
After n >= 0 months the man's (let's call him "Rufus") savings plus the value of his car equal
sv*n + op*(1 - n*dr - (cr + 2*cr + 3*cr +...+ (n/2)*cr))
where n/2 is integer division. As
cr + 2*cr + 3*cr +...+ (n/2)*cr = cr*((1+2+..+n)/2) = cr*(1+n/2)*(n/2)
the expression becomes
sv*n + op*(1 - n*dr - cr*(1+(n/2))*(n/2))
Similarly, after n months the cost of the car he wants to purchase will fall to
np * (1 - n*dr - cr*(1+(n/2))*(n/2))
If we set these two expressions equal we obtain the following.
sv*n + op - op*dr*n - op*cr*(n/2) - op*cr*(n/2)**2 =
np - np*dr*n - np*cr*(n/2) - np*cr*(n/2)**2
which reduces to
cr*(np-op)*(n/2)**2 + (sv + dr*(np-op))*n + cr*(np-op)*(n/2) - (np-op) = 0
or
cr*(n/2)**2 + (sv/(np-op) + dr)*n + cr*(n/2) - 1 = 0
If we momentarily treat (n/2) as a float division, this expression reduces to a quadratic.
(cr/4)*n**2 + (sv/(np-op) + dr + cr/2)*n - 1 = 0
that is, a*n**2 + b*n + c = 0
where
a = cr/4 = 0.005/4 = 0.00125
b = sv/(np-op) + dr + cr/2 = 1000.0/(8000-2000) + 0.015 + 0.005/2 = 0.18417
c = -1
Incidentally, Rufus doesn't have a computer, but he does have an HP 12c calculator his grandfather gave him when he was a kid, which is perfectly adequate for these simple calculations.
The roots are computed as follows.
(-b + Math.sqrt(b**2 - 4*a*c))/(2*a) #=> 5.24
(-b - Math.sqrt(b**2 - 4*a*c))/(2*a) #=> -152.58
It appears that Rufus can purchase the new vehicle (if it's still for sale) in six months, since n must be a whole number and the positive root is 5.24. Had we been able to solve the above equation for n/2 using integer division it might have turned out that Rufus would have had to wait longer. That's because for a given n both cars would have depreciated less (or at least not more), and because the car to be purchased is more expensive than the current car, the difference in values would be greater than that obtained with the float approximation for n/2. We need to check that, however. After n months, Rufus' savings and the value of his beater will equal
sv*n + op*(1 - dr*n - cr*(1+(n/2))*(n/2))
= 1000*n + 2000*(1 - 0.015*n - 0.005*(1+(n/2))*(n/2))
For n = 6 this equals
1000*6 + 2000*(1 - 0.015*6 - 0.005*(1+(6/2))*(6/2))
= 1000*6 + 2000*(1 - 0.015*6 - 0.005*(1+3)*3)
= 1000*6 + 2000*0.85
= 7700
The cost of Rufus' dream car after n months will be
np * (1 - dr*n - cr*(1+(n/2))*(n/2))
= 8000 * (1 - 0.015*n - 0.005*(1+(n/2))*(n/2))
For n=6 this becomes
8000 * (1 - 0.015*6 - 0.005*(1+(6/2))*(6/2))
= 8000*0.85
= 6800
(Notice that the factor 0.85 is the same in both calculations.)
Yes, Rufus will be able to buy the car in 6 months.
def nbMonths(old, new, savings, percent)
percent = percent.fdiv(100)
current_savings = 0
months = 0
loop do
break if current_savings + old >= new
current_savings += savings
old -= old * percent
new -= new * percent
months += 1
percent += 0.005 if months.odd?
end
[months, (current_savings + old - new).round]
end
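For comparison, here is a minimal Python sketch of the same month-by-month simulation. The function and parameter names (`nb_months`, `saving_per_month`, `percent_loss`) are my own, and it assumes the rule stated in the task: the monthly loss starts at 1.5 percent and grows by 0.5 percentage points at the end of every second month, applied to both cars with compounding.

```python
def nb_months(old, new, saving_per_month, percent_loss):
    """Return (months needed, money left over rounded to the nearest integer)."""
    months = 0
    savings = 0.0
    # Keep going until savings plus the old car's value cover the new car.
    while savings + old < new:
        months += 1
        if months % 2 == 0:
            # The loss rate increases by 0.5 points at the end of every 2 months.
            percent_loss += 0.5
        factor = 1 - percent_loss / 100
        old *= factor   # both cars depreciate at the same rate
        new *= factor
        savings += saving_per_month
    return months, round(savings + old - new)


print(nb_months(2000, 8000, 1000, 1.5))  # the task expects [6, 766]
```

If the old car is already worth at least as much as the new one, the loop body never runs and the function returns zero months with the price difference as the leftover.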

work out how many seconds have expired in total during game play

~Why the hell has this had down votes.... you people are weird!
Ok so this is a very simple HTML5 and jQuery and PHP game. Sorry to the people who have answered, I forgot to say this is a PHP script; I have updated the question to reflect that.
The first level takes 1 minute. Every level after that takes 10 seconds longer than the previous level, like so:
level 1 = 60 seconds
level 2 = 70 seconds
level 3 = 80 seconds
level 4 = 90 seconds
and so on infinitely.
I need an equation that can figure out what is the total amount of seconds played based on the users level.
level = n
I started with (n * 10) + (n * 60) but soon realized that this doesn't account for each level already being 10 seconds longer than the previous one. I have temporarily fixed it using a function that loops up to the level number and returns the value, but I really want an actual equation.
SO i know you wont let me down :-)
Thanks in advance.
this is what i am using;
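function getnumberofsecondsfromlevel($level){
// Initialize the accumulators before the loop uses them.
$counter = 0;
$totalseconds = 0;
$lastlevelseconds = 60; // level 1 lasts 60 seconds
while($counter < $level){
$totalseconds = $lastlevelseconds + $totalseconds;
$lastlevelseconds = $lastlevelseconds + 10; // each level is 10 seconds longer
$counter++;
}
return $totalseconds;
}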
$level = $_SESSION['**hidden**']['thelevel'];
$totaldureationinseconds = getnumberofsecondsfromlevel($level);
but i want to replace with an actual equation
like so;(of course this is wrong, this is just the example of the format i want it in i.e an equation)
$n = $_SESSION['**hidden**']['thelevel']; // level to get total value of, in seconds
$s = 60; // start level
$totaldureationinseconds = ($n * 10) + ($s * $n);
SOLVED by Gopalkrishna Narayan Prabhu :-)
$totalseconds = 60 * $level + 5* (($level-1) * $level);
var total_secs = 0;
for(var i = 1; i<= n ;i++){
total_secs = total_secs + (i*10) + 50;
}
for n= 1, total_secs = 0 + 10 + 50 = 60
for n= 2, total_secs = 60 + 20 + 50 = 130
and so on...
For a single equation:
var n = level_number;
total_secs = 60 * n + 5* ((n-1) * n);
Hope this helps.
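To see that the single equation agrees with the loop, here is a short Python sketch. The function names and the `first` and `step` parameters are assumptions added for illustration; the defaults match the 60-second first level and the 10-second increase from the question.

```python
def total_seconds_loop(level, first=60, step=10):
    """Add up the duration of every level from 1 to `level`."""
    total = 0
    duration = first
    for _ in range(level):
        total += duration
        duration += step  # the next level is `step` seconds longer
    return total


def total_seconds_closed(level, first=60, step=10):
    """Closed form: first*n + step*(n-1)*n/2, i.e. 60*n + 5*(n-1)*n by default."""
    return first * level + step * (level - 1) * level // 2


for n in (1, 2, 3, 4):
    print(n, total_seconds_loop(n), total_seconds_closed(n))
```

The term (n-1)*n/2 is the sum 1 + 2 + ... + (n-1), which counts how many extra 10-second steps have accumulated by level n.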
It seems as though you're just looking for the equation
60 + ((levelN - 1) * 10)
Where levelN is the current level, starting at 1. If you make the first level 0, you can get rid of the - 1 part and make it just
60 + (levelN * 10)
Thought process:
What's the base/first number? What's the lowest it can ever be? 60. That means your equation will start with
60 + ...
Every time you increase the level, you add 10, so at some point you'll need something like levelN * 10. Then, it's just some fiddling. In this case, since you don't add any on the first level, and the first level is level 1, you just need to subtract 1 from the level number to fix that.
You can solve this with a really simple mathematical expression (using the triangular number (n-1)*n/2, which is the sum 1 + 2 + ... + (n-1)).
(((n-1) * n / 2) * 10) + (60 * n)
n is the level, of course.

Non-linear counter

So I have a counter. It is supposed to calculate the current amount of something. To calculate this, I know the start date, and start amount, and the amount to increment the counter by each second. Easy peasy. The tricky part is that the growth is not quite linear. Every day, the increment amount increases by a set amount. I need to recreate this algorithmically - basically figure out the exact value at the current date based on the starting value, the amount incremented over time, and the amount the increment has increased over time.
My target language is Javascript, but pseudocode is fine too.
Based on AB's solution:
var now = new Date();
var startDate1 = new Date("January 1 2010");
var days1 = (now - startDate1) / 1000 / 60 / 60 / 24;
var startNumber1 = 9344747520;
var startIncrement1 = 463;
var dailyIncrementAdjustment1 = .506;
var currentIncrement = startIncrement1 + (dailyIncrementAdjustment1 * days1);
startNumber1 = startNumber1 + (days1 / 2) * (2 * startIncrement1 + (days1 - 1) * dailyIncrementAdjustment1);
Does that look reasonable to you guys?
It's a quadratic function. If t is the time passed, then it's the usual at^2 + bt + c, and you can figure out a, b, c by substituting the results for the first 3 seconds.
Or: use the formula for the arithmetic progression sum, where a1 is the initial increment, and d is the "set amount" you refer to. Just don't forget to add your "start amount" to what the formula gives you.
If x0 is the initial amount, d is the initial increment, and e is the "set amount" by which the increment increases, it comes to
x0 + (t/2)*(2d + (t-1)*e)
If I understand your question correctly, you have an initial value x_0, an initial increment per second of d_0 and an increment adjustment of e per day. That is, on day one the increment per second is d_0, on day two the increment per second is d_0 + e, etc.
Then, we note that the increment per second at time t is
d(t) = d_0 + floor(t / S) * e
where S is the number of seconds per day and t is the number of seconds that have elapsed since t = t_0. Then
x = x_0 + sum_{k < floor(t / S)} S * (d_0 + k * e) + S * (t / S - floor(t / S)) * d(t)
is the formula that you are seeking. From here, writing D = floor(t / S), you can simplify this to
x = x_0 + S * D * d_0 + S * e * (D - 1) * D / 2 + (t - S * D) * (d_0 + D * e).
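As a sanity check on the formula for x above, here is a hedged Python sketch that evaluates it in closed form, keeping the term for the partially elapsed day, and compares it with a second-by-second sum. The function names and the `seconds_per_day` parameter are assumptions for illustration; the model is the one described in this answer, where the per-second increment is d_0 on the first day and grows by e each day.

```python
def counter_value(t, x0, d0, e, seconds_per_day=86400):
    """Counter after t whole seconds, with the increment growing by e per day."""
    full_days = t // seconds_per_day
    partial = t - full_days * seconds_per_day  # seconds into the current day
    return (
        x0
        + seconds_per_day * full_days * d0
        + seconds_per_day * e * (full_days - 1) * full_days / 2
        + partial * (d0 + full_days * e)
    )


def counter_value_brute_force(t, x0, d0, e, seconds_per_day=86400):
    """Same value, adding one second's increment at a time (slow, for checking)."""
    x = x0
    for second in range(t):
        x += d0 + (second // seconds_per_day) * e
    return x


# Small made-up numbers: a "day" of 10 seconds keeps the brute force cheap.
print(counter_value(37, 0, 3, 2, seconds_per_day=10))
print(counter_value_brute_force(37, 0, 3, 2, seconds_per_day=10))
```

In JavaScript, t could be obtained as the whole number of seconds between the start date and now.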
use strict; use warnings;
my $start = 0;
my $stop = 100;
my $current = $start;
for my $day ( 1 .. 100 ) {
$current += ($day / 10);
last unless $current < $stop;
printf "Day: %d\tLeft %.2f\n", $day, (1 - $current/$stop);
}
Output:
Day: 1 Left 1.00
Day: 2 Left 1.00
Day: 3 Left 0.99
Day: 4 Left 0.99
Day: 5 Left 0.98
...
Day: 42 Left 0.10
Day: 43 Left 0.05
Day: 44 Left 0.01

Programming an algebra equation

In another post, MSN gave me a good guide on solving my algebra problem (Calculating bid price from total cost). Now, even though I can calculate it by hand, I'm completely stuck on how to write this in pseudocode or code. Could anyone give me a quick hint? By the way, I want to calculate the bid given the final cost.
usage cost(bid) = PIN(bid*0.10, 10, 50)
seller cost(bid) = bid*.02
added cost(bid) = PIN(ceiling(bid/500)*5, 5, 10) + PIN(ceiling((bid - 1000)/2000)*5, 0, 10)
storing cost(bid) = 100
So the final cost is something like:
final cost(bid) = PIN(bid*.1, 10, 50) + PIN(ceiling(bid/500)*5, 5, 10) + PIN(ceiling((bid - 1000)/2000)*5, 0, 10) + bid*.02 + 100 + bid
Solve for a particular value and you're done.
For example, if you want the total cost to be $2000:
2000 = PIN(bid*.1, 10, 50) + PIN(ceiling(bid/500)*5, 5, 10) + PIN(ceiling((bid - 1000)/2000)*5, 0, 10) + bid*.02 + 100 + bid
Bid must be > 1500 and < 2000, which works out nicely since we can make those PIN sections constant:
2000 = 50 + 10 + 5 + 100 + bid*1.02
1835 = bid*1.02
bid = 1799.0196078431372549019607843137
The function simplifies to:
                  / 1.02 * bid + 115    bid < 100
                  | 1.12 * bid + 105    bid <= 500
final cost(bid) = | 1.02 * bid + 160    bid <= 1000
                  | 1.02 * bid + 165    bid <= 3000
                  \ 1.02 * bid + 170    otherwise
If you consider each piece as a separate function, they can be inverted:
bid_a(cost) = (cost - 115) / 1.02
bid_b(cost) = (cost - 105) / 1.12
bid_c(cost) = (cost - 160) / 1.02
bid_d(cost) = (cost - 165) / 1.02
bid_e(cost) = (cost - 170) / 1.02
If you plug your cost into each function you get an estimated bid value for that range. You must check that this value is indeed within that function's valid range.
Example:
cost = 2000
bid_a(2000) = (2000 - 115) / 1.02 = 1848 Too big! Need to be < 100
bid_b(2000) = (2000 - 105) / 1.12 = 1692 Too big! Need to be <= 500
bid_c(2000) = (2000 - 160) / 1.02 = 1804 Too big! Need to be <= 1000
bid_d(2000) = (2000 - 165) / 1.02 = 1799 Good. It is <= 3000
bid_e(2000) = (2000 - 170) / 1.02 = 1794 Too small! Need to be > 3000
Just to check:
final cost(1799) = 1.02 * 1799 + 165 = 2000 Good!
Since the original function is strictly increasing, at most one of those functions will give an acceptable value. But for some inputs none of them will give a good value. This is because the original function jumps over those values.
final cost(1000) = 1.02 * 1000 + 160 = 1180
final cost(1001) = 1.02 * 1001 + 165 = 1186
So no function will give an acceptable value for cost = 1182 for example.
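The inversion described above can be sketched in Python roughly as follows. The names `PIECES`, `final_cost` and `bid_for_cost` are hypothetical, and the slopes, offsets and range limits are taken from the piecewise function in this answer. Instead of checking each range by hand, the sketch plugs every candidate bid back into the forward function and keeps the one that reproduces the cost.

```python
import math

# (slope, offset, upper limit of the bid range), in increasing order of bid.
# At bid = 100 the first two pieces agree, so "<=" is safe for every limit.
PIECES = [
    (1.02, 115, 100),
    (1.12, 105, 500),
    (1.02, 160, 1000),
    (1.02, 165, 3000),
    (1.02, 170, math.inf),
]


def final_cost(bid):
    """Forward function: pick the first piece whose range contains the bid."""
    for slope, offset, upper in PIECES:
        if bid <= upper:
            return slope * bid + offset


def bid_for_cost(cost):
    """Invert each piece and keep the candidate that maps back to `cost`.

    Returns None when the cost falls into one of the jumps of final_cost.
    """
    for slope, offset, _ in PIECES:
        candidate = (cost - offset) / slope
        if candidate >= 0 and math.isclose(final_cost(candidate), cost):
            return candidate
    return None


print(bid_for_cost(2000))
print(bid_for_cost(1182))
```

For a cost of 2000 only the fourth piece survives the check, and for 1182 no piece does, which matches the examples above.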
Due to the use of PIN and ceiling, I don't see an easy way to invert the calculation. Assuming that bid has a fixed precision (I'd guess two decimals behind the dot) you can always use a binary search (as the functions are monotone).
Edit: After thinking about it some more, I observed that, taking x = bid*1.02 + 100, we have that the final costs are between x+15 and x+70, both inclusive (i.e. x+15 <= final cost <= x+70). Given the size of this range (70-15=55) and the fact that the special values (see note below) for bid are all further apart than this, you can take x+15 = final cost and x+70 = final cost, get the right cases/values of usage and added costs and simply solve that equation (which no longer has either PIN or ceiling in it).
To illustrate, let the final cost be 222. From x+15 = 222 it follows that bid = 107/1.02 = 104.90. Then we have that the usage costs are given by bid*0.1 and that the additional costs are 5. In other words, we get final cost = bid*0.1 + bid*0.02 + 5 + 100 + bid = bid*1.12 + 105 and therefore bid = (222-105)/1.12 = 104.46. As this value of bid means the right values for usage and additional costs were taken, we know that this is the solution.
However, if we would have first looked at x+70 = 222, we would get the following. First we get that for this assumption that bid = 52/1.02 = 50.98. This means that usage costs are 10 and the additional costs are 5. So we get final costs = 10 + bid*0.02 + 5 + 100 + bid = bid*1.02 + 115 and therefore bid = (222-115)/1.02 = 104.90. But if bid is 104.90 then the usage costs are not 10 but bid*0.1, so this isn't the right solution.
I hope I explained it clearly enough. If not, please let me know.
N.B.: With special values I mean those for which the function defining the values of usage and added costs change. For example, for usage cost these values are 100 and 500: below 100 you use 10, above 500 you use 50 and in between you use bid*0.1.

Resources