random number with ratio 1:2 - algorithm

I have to generate two random sets of matrices
Each containing 3 digit numbers ranging from 2 - 10
like that
matrix 1: 994,878,129,121
matrix 2: 272,794,378,212
the numbers in both matrices have to be greater then 100 and less then 999
BUT
the mean for both matrices has to be in the ratio of 1:2 or 2:3 what ever constraint the user inputs
my math skills are kind of limited so any ideas how do i make this happen?

In order to do this, you have to know how many numbers are in each list. I'm assuming from your example that there are four numbers in each.
Fill the first list with four random numbers.
Calculate the mean of the first list.
Multiply the mean by 2 or by 3/2, whichever the user input. This is the required mean of the second list.
Multiply by 4. This is the required total of the second list.
Generate 3 random numbers.
Subtract the total of the three numbers in step 5 from the total in step 4. This is the fourth number for the second list.
If the number in step 6 is not in the correct range, start over from step 5.
Note that the last number in the second list is not truly random, since it's based on the other values in the list.

You have a set of random numbers, s1.
s1= [ random.randint(100,999) for i in range(n) ]
For some other set, s2, to have a different mean it's simply got to have a different range. Either you select values randomly from a different range, or you filter random values to get a different range.
No matter how many random numbers you select from the range 100 to 999, the mean is always just about 550. The odds of being a different value are exactly the normal distribution probabilities on either side of the mean.
You can't have a radically different mean with values selected from the same range.

Related

How to pick two random numbers with no carry when adding

I'm trying to make this algorithm which inputs a lower and upper limit for two numbers (the two numbers may have different lower and upper limits) and outputs two random numbers within that range
The catch is however that when the two numbers are added, no "carry" should be there. This means the sum of the digits in each place should be no more than 9.
How can I make sure that the numbers are truly random and that no carrying occurs when adding the two numbers
Thanks a lot!
Edit: The ranges can vary, the widest range can be 0 to 999. Also, I'm using VBA (Excel)
An easy and distributionally correct way of doing this is to use Rejection Sampling, a.k.a. "Acceptance/Rejection". Generate the values independently, and if the carry constraint is violated repeat. In pseudocode
do {
generate x, y
} while (x + y > threshold)
The number of times the loop will iterate has a geometric distribution with an expected value of (proportion of sums below the threshold)-1. For example, if you're below the threshold 90% of the time then the long term number of iterations will average out to 10/9, 1.11... iterations per pair generated. For lower likelihoods of acceptance, it will take more attempts on average.

Specific knapsack-like

I am looking for some ideas how to deal with this specific knapsack problem (I believe it is knapsack-like problem although I might be mistaken).
As input we get set of numbers, and each can be positive or negative - we don't know that.
We have to find minimum possible absolute value of sum of some these numbers.
We don't have to use all numbers. We have to do additions (or subtractions) in same order in which numbers are given and we have to start with first number (and add or subtract following ones).
Example would be:
4 11 5 5 => 0
because 4+11-5-5 = 0
10 3 9 4 100 => 2
because 10-3-9 = -2
In second example we skipped two last numbers - because adding next numbers wouldn't give us smaller absolute number.
Amount of numbers can be up to 5,000
, and the sum of them won't over exceed 10,000
They are integers
If you were to explore all combinations of addition and subtraction of 5000 numbers, you would have to go through 25000−1 ≈ 1.4⋅101505 alternatives. That's obviously not reasonable. However, since the sum of the numbers is at most 10000, we know that all partial sums (including subtraction) must lie between -10000 and 10000, so there can be less than 20000 different sums. If you only consider different sums when you work through the 5000 positions you have less than 100 million sums to consider, which is not that much work for a computer.
Example: suppose the first three numbers are 5,1,1. The possible sums that include exactly three numbers are
5+1+1=7
5+1-1=5
5-1+1=5
5-1-1=3
Before adding the fourth number it is important to recognize that you have only three unique results from the four computations.

Given a true random number generator which outputs either a 1 or 0 per call, how do you use this to pick a number from an arbitrary range?

If I have a true random number generator (TRNG) which can give me either a 0 or a 1 each time I call it, then it is trivial to then generate any number in a range with a length equal to a power of 2. For example, if I wanted to generate a random number between 0 and 63, I would simply poll the TRNG 5 times, for a maximum value of 11111 and a minimum value of 00000. The problem is when I want a number in a rangle not equal to 2^n. Say I wanted to simulate the roll of a dice. I would need a range between 1 and 6, with equal weighting. Clearly, I would need three bits to store the result, but polling the TRNG 3 times would introduce two eroneous values. We could simply ignore them, but then that would give one side of the dice a much lower odds of being rolled.
My question of ome most effectively deals with this.
The easiest way to get a perfectly accurate result is by rejection sampling. For example, generate a random value from 1 to 8 (3 bits), rejecting and generating a new value (3 new bits) whenever you get a 7 or 8. Do this in a loop.
You can get arbitrarily close to accurate just by generating a large number of bits, doing the mod 6, and living with the bias. In cases like 32-bit values mod 6, the bias will be so small that it will be almost impossible to detect, even after simulating millions of rolls.
If you want a number in range 0 .. R - 1, pick least n such that R is less or equal to 2n. Then generate a random number r in the range 0 .. 2n-1 using your method. If it is greater or equal to R, discard it and generate again. The probability that your generation fails in this manner is at most 1/2, you will get a number in your desired range with less than two attempts on the average. This method is balanced and does not impair the randomness of the result in any fashion.
As you've observed, you can repeatedly double the range of a possible random values through powers of two by concatenating bits, but if you start with an integer number of bits (like zero) then you cannot obtain any range with prime factors other than two.
There are several ways out; none of which are ideal:
Simply produce the first reachable range which is larger than what you need, and to discard results and start again if the random value falls outside the desired range.
Produce a very large range, and distribute that as evenly as possible amongst your desired outputs, and overlook the small bias that you get.
Produce a very large range, distribute what you can evenly amongst your desired outputs, and if you hit upon one of the [proportionally] few values which fall outside of the set which distributes evenly, then discard the result and start again.
As with 3, but recycle the parts of the value that you did not convert into a result.
The first option isn't always a good idea. Numbers 2 and 3 are pretty common. If your random bits are cheap then 3 is normally the fastest solution with a fairly small chance of repeating often.
For the last one; supposing that you have built a random value r in [0,31], and from that you need to produce a result x [0,5]. Values of r in [0,29] could be mapped to the required output without any bias using mod 6, while values [30,31] would have to be dropped on the floor to avoid bias.
In the former case, you produce a valid result x, but there's some more randomness left over -- the difference between the ranges [0,5], [6,11], etc., (five possible values in this case). You can use this to start building your new r for the next random value you'll need to produce.
In the latter case, you don't get any x and are going to have to try again, but you don't have to throw away all of r. The specific value picked from the illegal range [30,31] is left-over and free to be used as a starting value for your next r (two possible values).
The random range you have from that point on needn't be a power of two. That doesn't mean it'll magically reach the range you need at the time, but it does mean you can minimise what you throw away.
The larger you make r, the more bits you may need to throw away if it overflows, but the smaller the chances of that happening. Adding one bit halves your risk but increases the cost only linearly, so it's best to use the largest r you can handle.

Randomly choosing from a list with weighted probabilities

I have an array of N elements (representing the N letters of a given alphabet), and each cell of the array holds an integer value, that integer value meaning the number of occurrences in a given text of that letter. Now I want to randomly choose a letter from all of the letters in the alphabet, based on his number of appearances with the given constraints:
If the letter has a positive (nonzero) value, then it can be always chosen by the algorithm (with a bigger or smaller probability, of course).
If a letter A has a higher value than a letter B, then it has to be more likely to be chosen by the algorithm.
Now, taking that into account, I've come up with a simple algorithm that might do the job, but I was just wondering if there was a better thing to do. This seems to be quite fundamental, and I think there might be more clever things to do in order to accomplish this more efficiently. This is the algorithm i thought:
Add up all the frequencies in the array. Store it in SUM
Choosing up a random value from 0 to SUM. Store it in RAN
[While] RAN > 0, Starting from the first, visit each cell in the array (in order), and subtract the value of that cell from RAN
The last visited cell is the chosen one
So, is there a better thing to do than this? Am I missing something?
I'm aware most modern computers can compute this so fast I won't even notice if my algorithm is inefficient, so this is more of a theoretical question rather than a practical one.
I prefer an explained algorithm rather than just code for an answer, but If you're more comfortable providing your answer in code, I have no problem with that.
The idea:
Iterate through all the elements and set the value of each element as the cumulative frequency thus far.
Generate a random number between 1 and the sum of all frequencies
Do a binary search on the values for this number (finding the first value greater than or equal to the number).
Example:
Element A B C D
Frequency 1 4 3 2
Cumulative 1 5 8 10
Generate a random number in the range 1-10 (1+4+3+2 = 10, the same as the last value in the cumulative list), do a binary search, which will return values as follows:
Number Element returned
1 A
2 B
3 B
4 B
5 B
6 C
7 C
8 C
9 D
10 D
The Alias Method has amortized O(1) time per value generated, but requires two uniforms per lookup. Basically, you create a table where each column contains one of the values to be generated, a second value called an alias, and a conditional probability of choosing between the value and its alias. Use your first uniform to pick any of the columns with equal likelihood. Then choose between the primary value and the alias based on your second uniform. It takes a O(n log n) work to initially set up a valid table for n values, but after the table's built generating values is constant time. You can download this Ruby gem to see an actual implementation.
Two other very fast methods by Marsaglia et al. are described here. They have provided C implementations.

Maximum number of equal elements in array

I was solving the problems from codeforces practice problem achieve.
I am not able to find efficient solution.
How to solve the following problem?
I can only think of a brute force solution
Polycarpus has an array, consisting of n integers a1, a2, ..., an. Polycarpus likes it when numbers in an array match. That's why he wants the array to have as many equal numbers as possible. For that Polycarpus performs the following operation multiple times:
he chooses two elements of the array ai, aj (i ≠ j);
he simultaneously increases number ai by 1 and decreases number aj by 1, that is, executes ai = ai + 1 and aj = aj - 1.
The given operation changes exactly two distinct array elements. Polycarpus can apply the described operation an infinite number of times.
Now he wants to know what maximum number of equal array elements he can get if he performs an arbitrary number of such operation. Help Polycarpus.
Input
The first line contains integer n (1 ≤ n ≤ 105) — the array size. The second line contains space-separated integers a1, a2, ..., an (|ai| ≤ 104) — the original array.
Output
Print a single integer — the maximum number of equal array elements he can get if he performs an arbitrary number of the given operation.
Sample test(s)
input
2
2 1
output
1
input
3
1 4 1
output
3
find the sum of all the elements.
If the sum%n==0 then n else n-1
EDIT: Explanations :
First of all it is very easy to spot that the answer is minimum n-1.It cannot be lesser .
Proof: Choose any number that you wish to make as your target.And suppose the last index n.Now you make a1=target by applying operation on a1 and an.Similarly on a2 and an and so on.So all numbers except the last one are equal to target.
Now we need to see that if sum%n==0 then all numbers are possible.Clearly you can choose your target as the mean of all the numbers here.You can apply operation by choosing a index with value less than mean and other with value greater than mean and make one of them (possibly both) equal to mean.

Resources