probability n choose k - probability

I tried to solve this TopCoder problem: http://community.topcoder.com/stat?c=problem_statement&pm=10863&rd=14150
But, my solution is not good, and I don't understand why.
I understood the solution given there (down page: look for LotteryPyaterochka): http://apps.topcoder.com/wiki/display/tc/SRM+466
So, to sum up my problem:
We are playing a special kind of lottery:
Each ticket in this lottery is a rectangular grid with N rows and 5 columns, where each cell contains an integer between 1 and 5*N, inclusive. All integers within a single ticket are distinct.
The lottery organizers randomly choose 5 distinct integers, each between 1 and 5*N, inclusive. Each possible subset of 5 integers has the same probability of being chosen. These integers are called the winning numbers. A ticket is considered a winner if and only if it has a row which contains at least 3 winning numbers.
We want to know the number of winning tickets (that is, tickets with at least 3 winning numbers in the same row).
I am stuck on the following step:
the number of ways of choosing the 5 numbers which appear in the 'winning row'.
The topCoder solution says:
(#ways of choosing the 5 numbers which appear in the 'winning row') =
(#ways of choosing the x winning numbers which appear in the 'winning row') * (#ways of choosing 5-x 'non-winning numbers') =
(5 choose x) * ((5N-5) choose (5-x))
Since the number of winning numbers in this row is at least 3, x can be 3 or 4 or 5. So, we have
(#ways of choosing the 5 numbers which appear in the 'winning row') =
(5 choose 3) * ((5N-5) choose 2) + (5 choose 4) * ((5N-5) choose 1) + (5 choose 5) * ((5N-5) choose 0)
And what I say:
(#ways of choosing the 5 numbers which appear in the 'winning row') =
(#ways of choosing 3 numbers among the 5 winning numbers) * (#ways of choosing the 2 numbers that complete the row among the 5N-5 non-winning numbers plus the 2 winning numbers not chosen before) =
(5 choose 3) * ((5N-3) choose 2)
For N = 10 my method gives: (5 choose 3) * (47 choose 2) = 10810
And the TopCoder method gives: (5 choose 3) * (45 choose 2) + (5 choose 4) * (45 choose 1) + (5 choose 5) * (45 choose 0) = 10126
Why is my method wrong?
Thanks

Let's say the winning numbers are 1, 2, 3, 4 and 5. Now let's look at the ticket that contains all five numbers in the winning row.
Your method counts that ticket many times, since it's included in the following counts:
1 2 3 + two other numbers
1 2 4 + two other numbers
1 2 5 + two other numbers
1 3 4 + two other numbers
...
The same thing happens to tickets with four winning numbers.
This is the reason why these cases need to be counted separately.
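To see the double counting concretely, here is a small Python sketch that compares the two formulas with a direct enumeration of all possible rows. The function names are made up for this illustration, and the winning numbers are assumed to be 1 to 5, as in the example above. The enumeration is only practical for small N.

```python
from itertools import combinations
from math import comb


def rows_by_cases(n):
    """Case-by-case count: exactly x winning numbers in the row, x = 3, 4, 5."""
    return sum(comb(5, x) * comb(5 * n - 5, 5 - x) for x in (3, 4, 5))


def rows_naive(n):
    """The over-counting formula: pick 3 winners, then any 2 other numbers."""
    return comb(5, 3) * comb(5 * n - 3, 2)


def rows_by_enumeration(n):
    """Brute force: try every 5-number row and keep those with >= 3 winners."""
    winners = {1, 2, 3, 4, 5}  # assumed winning numbers
    return sum(
        1
        for row in combinations(range(1, 5 * n + 1), 5)
        if len(winners.intersection(row)) >= 3
    )


print(rows_by_cases(10), rows_naive(10))        # 10126 10810
print(rows_by_cases(2), rows_by_enumeration(2))  # 126 126
```

For N = 2 the enumeration agrees with the case-by-case count, while the naive formula gives 210, because rows with 4 or 5 winning numbers are counted several times.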


Number of ways to change coins in constant time?

Let's say I have three types of coins -- a penny (0.01), a nickel (0.05), and a dime (0.10) and I want to find the number of ways to make change of a certain amount. For example to change 27 cents:
change(amount=27, coins=[1,5,10])
One of the more common ways to approach this problem is recursively/dynamically: to find the number of ways to make that change without a particular coin, and then deduct that coin amount and find the ways to do it with that coin.
But, I'm wondering if there is a way to do it using a cached value and mod operator. For example:
10 cents can be changed 4 ways:
10 pennies
1 dime
2 nickels
1 nickel, 5 pennies
5 cents can be changed 2 ways:
1 nickel
5 pennies
1-4 cents can be changed 1 way:
1-4 pennies
For example, this is wrong, but my idea was along the lines of:
def change(amount, coins=[1,5,10]):
    cache = {10: 4, 5: 2, 1: 1}
    for coin in sorted(coins, reverse=True):
        # yes this will give zerodivision
        # and a penny shouldn't be multiplied
        # but this is just to demonstrate the basic idea
        ways = (amount % coin) * cache[coin]
        amount = amount % ways
    return ways
If so, how would that algorithm work? Any language (or pseudo-language) is fine.
Precomputing the number of change possibilities for 10 cents and 5 cents cannot be applied to bigger values in a straightforward way. For special cases like the given example of pennies, nickels and dimes, however, a formula for the number of change possibilities can be derived by looking in more detail at how the different ways of changing 5 and 10 cents can be combined.
Let's first look at multiples of 10. With e.g. n=20 cents, the first 10 cents can be changed in 4 ways, and so can the second group of 10 cents. That would make 4x4 = 16 ways of change. But not all combinations are different: a dime for the first 10 cents and 10 pennies for the other 10 cents is the same as 10 pennies for the first 10 cents and a dime for the second 10 cents. So we have to count the possibilities in an ordered way: that would give (n/10+3) choose 3 possibilities. But still not all possibilities in this counting are different: choosing a nickel and 5 pennies for both the first and the second group of 10 cents gives the same change as choosing two nickels for the first group and 10 pennies for the second group. Thinking about this a little more, one finds that the possibility of 1 nickel and 5 pennies should be chosen at most once. So we get (n/10+2) choose 2 ways of change without the nickel/pennies split (i.e. the total number of nickels will be even) and ((n-10)/10+2) choose 2 ways of change with one nickel/pennies split (i.e. the total number of nickels will be odd).
For an arbitrary number n of cents let [n/10] denote the value n/10 rounded down, i.e. the maximal number of dimes that can be used in the change. The cents exceeding the largest multiple of 10 in n can be changed in at most two ways: either they are all pennies or, if at least 5 cents remain, one nickel and pennies for the rest. To avoid counting the same way of change several times, one can forbid any more pennies (for the groups of 10 cents) if there is a nickel in the change of the 'excess' cents, so only dimes and nickels are used for the groups of 10 cents, giving [n/10]+1 ways.
Altogether one arrives at the following formula for N, the total number of ways of changing n cents:
N1 = (([n/10]+2) choose 2) + (([n/10]+1) choose 2) = ([n/10]+1)^2
N2 = [n/10]+1 if n mod 10 >= 5, and N2 = 0 otherwise
N = N1 + N2
Or as Python code:
def change_1_5_10_count(n):
    n_10 = n // 10
    N1 = (n_10+1)**2
    N2 = (n_10+1) if n % 10 >= 5 else 0
    return N1 + N2
btw, the computation can be further simplified: N = [([n/5]+2)^2/4], or in Python notation: (n // 5 + 2)**2 // 4.
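As a sanity check, the simplified closed form can be compared against a plain counting DP for pennies, nickels and dimes. This is only a sketch: the function names are chosen for this illustration, and the closed form applies to the coin set (1, 5, 10) only.

```python
def change_closed_form(n):
    """Closed form for coins (1, 5, 10) only: floor((floor(n/5) + 2)^2 / 4)."""
    return (n // 5 + 2) ** 2 // 4


def change_dp(n, coins=(1, 5, 10)):
    """Reference count: classic one-dimensional coin-change DP."""
    ways = [1] + [0] * n  # ways[a] = number of ways to change amount a
    for coin in coins:
        for amount in range(coin, n + 1):
            ways[amount] += ways[amount - coin]
    return ways[n]


print(change_closed_form(27), change_dp(27))  # 12 12
print(all(change_closed_form(n) == change_dp(n) for n in range(500)))  # True
```

The closed form runs in constant time, whereas the DP is O(amount * number of coins) but works for any coin set.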
Almost certainly not for the general case. That's why recursive and bottom-up dynamic programs are used. The modulus operator would provide us with a remainder when dividing the amount by the coin denomination -- meaning we would be using the maximum count of that coin that we can -- but for our solution, we need to count ways of making change when different counts of each coin denomination are used.
Identical intermediate amounts can be reached by using different combinations of coins, and that is what the classic method uses a cache for. O(amount * num_coins):
# Adapted from https://algorithmist.com/wiki/Coin_change#Dynamic_Programming
def coin_change_bottom_up(amount, coins):
    cache = [[None] * len(coins) for _ in range(amount + 1)]
    for m in range(amount+1):
        for i in range(len(coins)):
            # There is one way to return
            # zero change with the ith coin.
            if m == 0:
                cache[m][i] = 1
            # Base case: the first
            # coin (which would be last
            # in a top-down recursion).
            elif i == 0:
                # If this first/last coin
                # divides m, there's one
                # way to make change;
                if m % coins[i] == 0:
                    cache[m][i] = 1
                # otherwise, no way to make change.
                else:
                    cache[m][i] = 0
            else:
                # Add the number of ways to
                # make change for this amount
                # without this particular coin.
                cache[m][i] = cache[m][i - 1]
                # If this coin's denomination is less
                # than or equal to the amount we're
                # making change for, add the number
                # of ways we can make change for the
                # amount reduced by the coin's denomination
                # (thus using the coin), again considering
                # this and previously seen coins.
                if coins[i] <= m:
                    cache[m][i] += cache[m - coins[i]][i]
    return cache[amount][len(coins)-1]
With Python you can leverage the @cache decorator (or @lru_cache) and automatically make a recursive solution into a cached one. For example:
from functools import cache
@cache
def change(amount, coins=(1, 5, 10)):
    if coins==(): return amount==0
    C = coins[-1]
    return sum([change(amount - C*x, coins[:-1]) for x in range(1+(amount//C))])
print(change(27, (1, 5, 10))) # 12
print(change(27, (1, 5))) # 6
print(change(17, (1, 5))) # 4
print(change(7, (1, 5))) # 2
# ch(27, (1, 5, 10)) == ch(27, (1, 5)) + ch(17, (1, 5)) + ch(7, (1, 5))
This will invoke the recursion only for those parameter values whose result hasn't already been computed and stored. With @lru_cache, you can even specify the maximum number of elements you allow in the cache.
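A minimal sketch of the bounded variant, assuming a cache size of 1024 entries (an arbitrary value picked for this illustration). The function is a lightly restated version of the recursive solution, and `cache_info()` is the standard way to inspect hits and misses.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)  # 1024 is an arbitrary bound chosen for this sketch
def change(amount, coins=(1, 5, 10)):
    """Number of ways to change `amount`; `coins` must be a hashable tuple."""
    if not coins:
        return 1 if amount == 0 else 0
    coin = coins[-1]
    # Use the last coin 0, 1, 2, ... times and recurse on the remaining coins.
    return sum(
        change(amount - coin * k, coins[:-1])
        for k in range(amount // coin + 1)
    )


print(change(27))           # 12
print(change.cache_info())  # hits, misses, maxsize and current size
```

Note that the arguments must be hashable, which is why the coins are passed as a tuple rather than a list.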
This is one DP approach to this problem. Note that it stores every combination, not just the counts:
def coin_ways(coins, amount):
    dp = [[] for _ in range(amount+1)]
    dp[0].append([]) # or dp[0] = [[]], if you prefer
    for coin in coins:
        for x in range(coin, amount+1):
            dp[x].extend(ans + [coin] for ans in dp[x-coin])
    #print(dp)
    return len(dp[amount])
if __name__ == '__main__':
    coins = [1, 5, 10] # 2, 5, 10, 25]
    print(coin_ways(coins, 27)) # 12

Count integers up to n that contain the digits 2018 in order [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
Given an integer n between 0 and 1,000,000,000, count the number of integers not greater than n which contain the digits [2,0,1,8] in order.
So e.g. the number 9,230,414,587 should be counted, because removing the digits [9,3,4,4,5,7] leaves us with [2,0,1,8].
Example input and output:
n = 2018 -> count = 1
n = 20182018 -> count = 92237
My general thought is this: the maximum length of n is 10 digits, and in the worst case we have to insert 6 digits into [2,0,1,8], then remove the duplicates and the numbers greater than n.
I don't see any attempt of your own, so I'll only give a hint:
You have 9-digit numbers (small numbers might be represented as 000002018) containing the digit sequence 2,0,1,8.
Call them 'good' ones.
Let's number the digit places from 1 to 9, right to left:
number 532705183
digits 5 3 2 7 0 5 1 8 3
index  9 8 7 6 5 4 3 2 1
The leftmost '2' digit can occupy places from 4 to 9. How many good numbers contain the first 2 at the k-th place? Let's define a function F2(l, k) for the quantity of good numbers, where 2 refers to the digit 2, l is the number length, and k is the place of the leftmost 2.
. . . . 2 . . . .
        ^
        |
left part (without 2's) | k | right part (should contain the 0 1 8 sequence)
F2(9, k) = 9^(9-k) * Sum(F0(k-1, j) for j=1..k-1)
Overall quantity of good numbers is sum of F2(9, k) for all possible k.
GoodCount = Sum(F2(9, k) for k=4..9)
Explanation:
There are 9-k places at the left. We can put any digit but 2 there, so there are 9^(9-k) possible left parts.
Now we can place 0 at the right part and count possible variants for 018 subsequences. F0(...) will of course depend on F1(...) and F1 will depend on F8(...) for shorter numbers.
So fill tables for values for F8, F0, F1 step-by-step and finally calculate result for digit 2.
Hand-made example for 4-digit numbers containing the subsequence 1 8, with k = position of the first '1':
k=2: there are 81 numbers of kind xx18
k=3: there are numbers of kind x1x8 and x18x
     there are 9 subnumbers like x8, 10 subnumbers 8x, so (10+9)*9=171
k=4: there are numbers of kind
     1xx8 (9*9=81 such numbers),
     1x8x (9*10=90 numbers),
     18xx (100 numbers),
     so 81+90+100=271
Overall: 81+171+271=523
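The table-filling idea can also be written as a memoized digit DP. The sketch below is not the F2/F0/F1/F8 formulation from the hint but a swapped-in equivalent: it walks the digits of n from left to right and tracks how many pattern digits have been matched so far. The function name is made up for this illustration, n is treated as inclusive, and the pattern is assumed not to start with 0 (so leading zeros can never match).

```python
from functools import lru_cache


def count_with_subsequence(n, pattern="2018"):
    """Count integers in [0, n] whose digits contain `pattern` in order.

    Assumes n >= 0 and that `pattern` does not start with '0'.
    """
    digits = str(n)

    @lru_cache(maxsize=None)
    def go(pos, matched, tight):
        # pos: index of the next digit to place
        # matched: number of pattern digits already found (greedy matching)
        # tight: True while the prefix placed so far equals the prefix of n
        if pos == len(digits):
            return 1 if matched == len(pattern) else 0
        limit = int(digits[pos]) if tight else 9
        total = 0
        for d in range(limit + 1):
            m = matched
            if m < len(pattern) and str(d) == pattern[m]:
                m += 1
            total += go(pos + 1, m, tight and d == limit)
        return total

    return go(0, 0, True)


print(count_with_subsequence(2018))        # 1
print(count_with_subsequence(9999, "18"))  # 523
```

The second call reproduces the hand-made total for 4-digit numbers containing the subsequence 1 8.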
This is actually a relatively small problem set. If the numbers were much bigger, I'd opt to use optimised techniques to just generate all numbers that meet your criteria (those containing the digits in that order) rather than generating all possible numbers and checking each to ensure it meets the criteria.
However, the brute force method does your 20182018 variant in about ten seconds and the full 1,000,000,000 range in a little under eight minutes.
So, unless you need it faster than that, you may find the brute-force method more than adequate:
import re
num = 1000000000 # or 20182018 or something else.
lookfor = re.compile("2.*0.*1.*8")
count = 0
for i in range(num + 1):
    if lookfor.search(str(i)) is not None:
        count += 1
        #print(count, i) # For checking.
print(count)

Most efficient way to add individual digits of a number

I am working on an algorithm to determine whether a given number is prime and came across this website. But then I thought of trying my own logic. I can easily eliminate numbers ending in 2,4,5,6,8 (and 0 for numbers above 5), so I am left with 1,3,7 and 9 as the possible last digit. Now, if the last digit is 3, I can add up the individual digits to check if it is divisible by 3. I don't want to perform modulus(%) operation and add them. Is there a much more efficient way to sum the digits in a decimal number? Maybe using bitwise operations... ?
The % (modulus) operator would be faster than adding individual digits. But if you really want to do this, you can partly unroll your loop in such a way that multiples of 3 are skipped automatically.
For example:
2 is prime
3 is prime
candidate = 5
while (candidate <= limit - 2 * 3) // Unrolled loop: each pass covers the next 2 * 3 numbers
{
    if (CheckPrime(candidate)) candidate is prime;
    candidate += 2;
    if (CheckPrime(candidate)) candidate is prime;
    candidate += 4; // candidate + 2 is a multiple of 3 (9, 15, 21 etc)
}
// Tail: up to two candidates may remain below limit
if (candidate < limit) CheckPrime(candidate);
if (candidate + 2 < limit) CheckPrime(candidate + 2);
In the above method we skip multiples of 3 altogether instead of checking divisibility by 3 by adding the digits.
You made a good observation. Incidentally, this technique for finding primes is called wheel factorization. I have done it for wheel size = 6 (2*3), but you can do the same for a larger wheel size, for example 30 (2*3*5). The snippet above is also known as the rule that all primes greater than 3 are of the form 6N±1
(because 6N+3 is a multiple of 3, and 6N, 6N+2 and 6N+4 are even).
p.s. Not all numbers ending in 2 or 5 are composite. The numbers 2 and 5 themselves are exceptions.
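The wheel of size 6 described above could be sketched in Python as follows. Both function names are made up for this illustration, and trial division stands in for the unspecified CheckPrime; this is one possible reading of the idea, not a tuned implementation.

```python
from math import isqrt


def wheel_candidates(limit):
    """Yield 2, 3, then every number below `limit` of the form 6k-1 or 6k+1."""
    for p in (2, 3):
        if p < limit:
            yield p
    candidate, step = 5, 2
    while candidate < limit:
        yield candidate
        candidate += step
        step = 6 - step  # alternate +2, +4 so multiples of 2 and 3 are skipped


def is_prime(n):
    """Trial division, using only wheel candidates as divisors."""
    if n < 2:
        return False
    for d in wheel_candidates(isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


print(list(wheel_candidates(30)))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 25, 29]
print([n for n in wheel_candidates(50) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

Note that the wheel only removes multiples of 2 and 3, so candidates such as 25 still have to be tested.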
You might consider the following, but I think modulus is the fastest way:
1. 2^n mod 3 = 1 if n is even and = 2 (i.e. -1) if n is odd
2. so a set bit at an odd position and a set bit at an even position cancel each other out, as their sum is zero modulo 3
3. so the number is congruent, modulo 3, to the count of set even bits minus the count of set odd bits; the absolute difference is a multiple of 3 exactly when the number is
4. As the difference might again be 3 or greater, you need to reduce it modulo 3 again
5. step 4 can be done recursively
Pseudocode:
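// Returns 0 exactly when num is a multiple of 3. Because of abs(),
// a non-zero result is not always the true remainder.
int modulo3(int num) {
    if (num < 3)
        return num;
    int odd_bits = cal_odd(num);   // number of set bits at odd positions
    int even_bits = cal_even(num); // number of set bits at even positions
    return modulo3(abs(even_bits - odd_bits));
}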
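A runnable Python sketch of the same bit-counting idea, written as a divisibility test rather than a remainder function. The function name is made up for this illustration and a non-negative input is assumed; as noted in the answer, a plain `n % 3` is almost certainly faster in practice.

```python
def is_multiple_of_3(num):
    """Divisibility by 3 via alternating bit sums (assumes num >= 0)."""
    while num > 3:
        diff = 0
        sign = 1  # +1 for even bit positions, -1 for odd ones
        while num:
            diff += sign * (num & 1)
            sign = -sign
            num >>= 1
        # num is congruent to diff modulo 3, so |diff| is a multiple of 3
        # exactly when the original number is.
        num = abs(diff)
    return num in (0, 3)


print([n for n in range(20) if is_multiple_of_3(n)])
# [0, 3, 6, 9, 12, 15, 18]
```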

Check if a number N is sum of multiple of 3 and 5 given that N could be as big as 100,000 [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
How do I check whether a number is a sum of a multiple of 3 and a multiple of 5, given that the number could be as big as 100,000? I need an optimized way to break a number into two parts such that one part is a multiple of 3, the other is a multiple of 5, and the part which is a multiple of 3 is greater than the part which is a multiple of 5. If that kind of splitting is not possible, I need to reject the number.
E.g.:
1 => can't be split, so rejected,
35 => 30 + 5,
65 => 60 + 5 (30 + 35 would also be a split, but the part which is a multiple of 3 has to be greater than the part which is a multiple of 5),
11 => 6 + 5
Every (integer) number modulo 3 yields 0, 1 or 2.
So let's examine all cases (n >= 3 must hold for obvious reasons):
1. n % 3 == 0. Easy: We just take 0 == 0 * 5 as the multiple of 5 and n itself (factor n / 3) as the multiple of 3.
2. n % 3 == 2. Easy again: One number will be 5 and the other n-5 (factor (n-5) / 3). When subtracting 5 from n, we create a second number (n-5), which falls under the first case.
3. n % 3 == 1. Same as in case 2, but this time we subtract 10 == 2*5.
A small problem is the requirement that the multiple of 3 has to be larger than the multiple of 5. For this to hold in case 3, n has to be at least 22 (22 == 2 * 5 + 3 * 4).
So all numbers smaller than 22 with the property n % 3 == 1 have to be rejected: 4, 7, 10, 13, 16 and 19. Likewise, in case 2, n has to be at least 11, so 5 and 8 are rejected as well. (This assumes the factors of the multiples have to be non-negative.)
If you mean to find a way to split a number into two parts, where the first part is a multiple of 3 and the second is a multiple of 5, with the extra requirement that the first (multiple of 3) part is greater than the second (multiple of 5) part, then it's rather trivial:
Every number from 20 and above can be split that way.
Proof: For a given number N, exactly one of the three numbers N, N-5, N-10 will be a multiple of 3 (consider modulo 3 arithmetic). So, one of these three splits satisfies the requirements:
N     0
N-5   5
N-10  10
and since N >= 20, the 1st part is greater than or equal to the 2nd.
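The case analysis above fits in a few lines of Python. This is a sketch under two assumptions: 0 counts as a multiple of 5 (as in both answers above), and the multiple of 3 must be strictly greater than the multiple of 5. The function name is made up for this illustration.

```python
def split_3_5(n):
    """Return (multiple_of_3, multiple_of_5) with the first strictly larger,
    or None if no such split exists."""
    # Exactly one of n, n - 5, n - 10 is a multiple of 3, so only the
    # 5-parts 0, 5 and 10 ever need to be tried.
    for five_part in (0, 5, 10):
        three_part = n - five_part
        if three_part % 3 == 0 and three_part > five_part:
            return three_part, five_part
    return None


print(split_3_5(1))   # None
print(split_3_5(35))  # (30, 5)
print(split_3_5(65))  # (60, 5)
print(split_3_5(11))  # (6, 5)
print([n for n in range(1, 25) if split_3_5(n) is None])
# [1, 2, 4, 5, 7, 8, 10, 13, 16, 19]
```

The check runs in constant time, so the upper bound of 100,000 is not a concern.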
Off the top of my head:
Make Q = N / 3, integer division, rounding down. Make R the remainder.
If R == 0 you're done.
If R == 2, decrement Q (this leaves 5 for the other part).
Else R must be 1, so subtract 3 from Q (this leaves 10 for the other part).
Your answer is Q * 3 and N - (Q * 3). Check that all results are non-negative and that the restriction "3s multiple > 5s multiple" is satisfied.
(Note that this is essentially the same as Sirko's answer, but I felt it worthwhile to think it through separately, so I didn't attempt to analyze his first.)
The greatest common divisor of 3 and 5 is 1,
so when N is 3, 5, 6, or N >= 8, it can be written as a sum of non-negative multiples of 3 and 5.
Just use this code:
$num = 0; // Large Number
$arr = array();
if (isset($_POST['number'])) $num = (int) $_POST['number'];
// Assuming you post the number to be checked.
for ($i = 0; $i < $num; $i++)
{
    if (($num - $i) % 3 == 0 || ($num - $i) % 5 == 0) { $arr[] = $num - $i; }
}
// This creates an array of all candidate parts (multiples of 3 or 5 up to $num).
$keepLooping = true;
while ($keepLooping)
{
    // array_rand returns keys, not values; it needs at least 2 elements in $arr.
    $rand = array_rand($arr, 2);
    if (($arr[$rand[0]] + $arr[$rand[1]]) == $num)
    {
        // Do whatever you like with them. :)
        $keepLooping = false;
    }
}
I haven't tested it, though; it is just to give you the idea. Note that the loop never ends if no matching pair exists. Instead of the for loop to select the possibilities, you can choose some other way, whichever suits you.

Using one probability set to generate another [duplicate]

This question already has answers here:
Expand a random range from 1–5 to 1–7
(78 answers)
Closed 8 years ago.
How can I generate a bigger probability set from a smaller probability set?
This is from Algorithm Design Manual -Steven Skiena
Q:
Use a random number generator (rng04) that generates numbers from {0,1,2,3,4} with equal probability to write a random number generator that generates numbers from 0 to 7 (rng07) with equal probability.
I have tried for around 3 hours now, mostly based on summing two rng04 outputs. The problem is that in that case the probability of each value is different: 4 comes up with probability 5/25 while 0 comes up with probability 1/25. I tried some ways to mask it, but could not.
Can somebody solve this?
You have to find a way to combine the two sets of random numbers (the first and second random {0,1,2,3,4} ) and make n*n distinct possibilities. Basically the problem is that with addition you get something like this
        X
    0 1 2 3 4
  0 0 1 2 3 4
Y 1 1 2 3 4 5
  2 2 3 4 5 6
  3 3 4 5 6 7
  4 4 5 6 7 8
Which has duplicates, which is not what you want. One possible way to combine the two sets would be the Z = X + Y*5 where X and Y are the two random numbers. That would give you a set of results like this
          X
     0  1  2  3  4
  0  0  1  2  3  4
Y 1  5  6  7  8  9
  2 10 11 12 13 14
  3 15 16 17 18 19
  4 20 21 22 23 24
So now that you have a bigger set of random numbers, you need to do the reverse and make it smaller. This set has 25 distinct values (because you started with 5, and used two random numbers, so 5*5=25). The set you want has 8 distinct values. A naïve way to do this would be
x = rnd(5) // {0,1,2,3,4}
y = rnd(5) // {0,1,2,3,4}
z = x+y*5 // {0-24}
random07 = z mod 8
This would indeed have a range of 0 to 7. But the values 1 to 7 would each appear with probability 3/25, and the value 0 with probability 4/25. This is because 0 mod 8 = 0, 8 mod 8 = 0, 16 mod 8 = 0 and 24 mod 8 = 0.
To fix this, you can modify the code above to this.
do {
    x = rnd(5) // {0,1,2,3,4}
    y = rnd(5) // {0,1,2,3,4}
    z = x+y*5 // {0-24}
} while (z == 24)
random07 = z mod 8
This will take the one value (24) that is throwing off your probabilities and discard it. Generating a new random number if you get a 'bad' value like this will make your algorithm run very slightly longer (in this case 1/25 of the time it will take 2x as long to run, 1/625 it will take 3x as long, etc). But it will give you the right probabilities.
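The rejection step can be sketched in Python as follows. Here `rng04` is a stand-in built on `random.randrange`, since the original generator is not given, and the `source` parameter is an assumption added so that a different generator can be plugged in.

```python
import random
from collections import Counter


def rng04():
    """Stand-in for the given generator: uniform on {0, 1, 2, 3, 4}."""
    return random.randrange(5)


def rng07(source=rng04):
    """Uniform on {0, ..., 7}, built from two draws of `source`."""
    while True:
        z = source() + 5 * source()  # uniform on 0..24
        if z != 24:                  # reject the one surplus value
            return z % 8             # 24 remaining values, 3 per output


# Every output is produced by exactly 3 of the 24 accepted (x, y) pairs.
pairs = Counter(
    (x + 5 * y) % 8 for x in range(5) for y in range(5) if x + 5 * y != 24
)
print(sorted(pairs.items()))
# [(0, 3), (1, 3), (2, 3), (3, 3), (4, 3), (5, 3), (6, 3), (7, 3)]
```

The exhaustive count at the end shows why the result is uniform without relying on random sampling.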
The real problem, of course, is the fact that the numbers in the middle of the sum (4 in this case) occur in many combinations (0+4, 1+3, etc.) whereas 0 and 8 have exactly one way to be produced.
I don't know how to solve this problem, but I'm going to try to reduce it a bit for you. Some points to consider:
The 0-7 range has 8 possible values, so ultimately the total number of possible situations that you should aim for has to be a multiple of 8. That way you can have an integral number of distributions per value in that codomain.
When you take the sum of two density functions, the number of possible situations (not necessarily distinct when you evaluate the sum, just in terms of different permutations of inputs) is equal to the product of the size of each of the input sets.
Thus, given two {0,1,2,3,4} sets summed together, you have 5*5=25 possibilities.
It will not be possible to get a multiple of eight (see first point) from powers of 5 (see second point, but extrapolate it to any number of sets > 1), so you will need to have a surplus of possible situations in your function and ignore some of them if they occur.
The simplest way to do that, as far as I can see at this point, is to use the sum of two {0,1,2,3,4} sets (25 possibilities) and ignore 1 (to leave 24, a multiple of 8).
Thus the challenge now has been reduced to this: Find a way to distribute the remaining 24 possibilities among the 8 output values. For this, you'll probably NOT want to use the sum, but rather just the input values.
One way to do that is, imagine a number in base 5 constructed from your input. Ignore 44 (that's your 25th, superfluous value; if you get it, synthesize a new set of inputs) and take the others, modulo 8, and you'll get your 0-7 across 24 different input combinations (3 each), which is an equal distribution.
My logic would be this:
rn07 = 0;
do {
    num = rng04;
} while (num == 4);
rn07 = num * 2;
do {
    num = rng04;
} while (num == 4);
rn07 += num % 2
