Is there a way to force RATIO_TO_REPORT to generate ratios rounded to two decimal places (cents)? If you round the cents manually, the rounded values sometimes do not add up to the exact sum of the records the RATIO_TO_REPORT column was based on, and this is causing a big issue in my company's accounting reports.
Please advise.
There's nothing that works out of the box, no. You're going to have to build your own procedure to allocate any differences in pennies.
The way my company does it is with the following algorithm (a sketch in code follows the list):
1. Round the values using your rounding method of choice, and compute the difference between each original value and its rounded value (this will always be less than a penny).
2. Sum the rounded values, and find the difference between that new sum and the "goal" sum. If the difference is positive (the new sum is higher), you need to remove pennies; if it is negative, you need to add pennies.
3. Since you (presumably) want to minimize the differences between the unrounded and rounded values, then, depending on the sign of that difference:
a. Give whole pennies to the rounded values with the largest difference between unrounded and rounded values, until your sum matches the goal sum, or
b. Take away whole pennies from the rounded values with the smallest difference between unrounded and rounded values, until your sum matches the goal sum.
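For illustration, here is a minimal sketch of that allocation in Python. The allocate helper and its signature are inventions for this example (this is not a built-in Oracle feature), and it assumes half-up rounding and pre-computed ratios:

import decimal
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def allocate(total, ratios):
    """Round each share to the cent, then push the leftover pennies onto
    the shares whose rounding error most justifies the adjustment."""
    raw = [total * r for r in ratios]                      # unrounded shares
    out = [v.quantize(CENT, rounding=ROUND_HALF_UP) for v in raw]
    pennies = int((total - sum(out)) / CENT)               # to add (+) or remove (-)
    # When adding, favor the shares rounded down the most;
    # when removing, favor the shares rounded up the most.
    key = (lambda i: out[i] - raw[i]) if pennies > 0 else (lambda i: raw[i] - out[i])
    order = sorted(range(len(raw)), key=key)
    step = CENT if pennies > 0 else -CENT
    for k in range(abs(pennies)):
        out[order[k % len(order)]] += step
    return out

# 100.00 split three ways: one share absorbs the extra penny.
print(allocate(Decimal("100.00"), [Decimal(1) / 3] * 3))   # 33.34, 33.33, 33.33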
I'm trying to write an algorithm that takes a lower and upper limit for each of two numbers (the two numbers may have different ranges) and outputs two random numbers within those ranges.
The catch, however, is that when the two numbers are added, no "carry" should occur; that is, the sum of the digits in each place must be no more than 9.
How can I make sure that the numbers are truly random and that no carrying occurs when adding the two numbers?
Thanks a lot!
Edit: The ranges can vary; the widest can be 0 to 999. Also, I'm using VBA (Excel).
An easy and distributionally correct way of doing this is to use Rejection Sampling, a.k.a. "Acceptance/Rejection". Generate the values independently, and if the carry constraint is violated, repeat. In pseudocode:
do {
    generate x, y
} while (carry constraint violated for x, y)
The number of times the loop iterates has a geometric distribution whose expected value is the reciprocal of the acceptance probability. For example, if the constraint is satisfied 90% of the time, then the long-term number of iterations will average out to 10/9 = 1.11... iterations per pair generated. For lower likelihoods of acceptance, it will take more attempts on average.
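Here is a runnable sketch of that loop in Python (the question uses VBA, but the structure carries over one-to-one); the zfill(3) padding assumes the stated 0 to 999 range:

import random

def no_carry_pair(lo1, hi1, lo2, hi2):
    """Rejection sampling: draw x and y independently and retry until
    adding them produces no carry in any decimal place."""
    while True:
        x = random.randint(lo1, hi1)
        y = random.randint(lo2, hi2)
        # Compare digit by digit; a carry means some column sums past 9.
        if all(int(a) + int(b) <= 9
               for a, b in zip(str(x).zfill(3), str(y).zfill(3))):
            return x, y

print(no_carry_pair(0, 999, 0, 999))

Because rejected pairs are simply redrawn, every accepted pair is uniformly distributed over the set of valid pairs.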
This is a purely theoretical question.
We all know that most, if not all, random-number generators actually only generate pseudo-random numbers.
Let's say I want a random number from 10 to 20. I can do this as follows (myRandomNumber being an integer-type variable):
myRandomNumber = rand(10, 20);
However, if I execute this statement:
myRandomNumber = rand(5, 10) + rand(5, 10);
Is this method more random?
No.
The randomness is not cumulative. The rand() function uses a uniform distribution between your two defined endpoints.
Adding two uniform distributions does not give you another uniform distribution. The sum follows a triangular, pyramid-shaped distribution, with most of the probability concentrated toward the center, because the density of a sum of independent variables is the convolution of their individual densities.
I urge you to read this:
Uniform Distribution
and this:
Convolution
Pay special attention to what happens with the two uniform distributions on the top right of the screen.
You can prove this to yourself by writing all the sums to a file and then plotting them in Excel. Make sure you give yourself a large enough sample size; 25,000 should be sufficient.
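If you'd rather not round-trip through a file and Excel, the same experiment takes a few lines of Python; the counts for the direct draw come out roughly flat, while the summed draw peaks at 15:

import random
from collections import Counter

N = 25000
direct = Counter(random.randint(10, 20) for _ in range(N))
summed = Counter(random.randint(5, 10) + random.randint(5, 10) for _ in range(N))

for v in range(10, 21):
    print(f"{v:3d}  direct {direct[v]:5d}  summed {summed[v]:5d}")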
The best way to understand this is by considering the popular fairground game "Lucky Seven".
If we roll a six sided die, we know that the probability of obtaining any of the six numbers is the same - 1/6.
What if we roll two dice and add the numbers that appear on the two?
The sum can range from 2 (both dice show 'one') up to 12 (both dice show 'six').
The probabilities of obtaining different numbers from 2 to 12 are no longer uniform. The probability of obtaining a 'seven' is the highest. There can be a 1+6, a 6+1, a 2+5, a 5+2, a 3+4 and a 4+3. Six ways of obtaining a 'seven' out of 36 possibilities.
If we plot the distribution we get a pyramid. The probabilities would be 1,2,3,4,5,6,5,4,3,2,1 (of course each of these has to be divided by 36).
The pyramidal figure (and the probability distribution) of the sum can be obtained by 'convolution'.
If we know the 'expected value' and standard deviation ('sigma') of each of the two random numbers, we can perform a quick and ready calculation of the expected value and sigma of their sum.
The expected value of the sum is simply the sum of the two individual expected values.
The sigma of the sum (for independent variables) is obtained by applying the "Pythagorean theorem" to the two individual sigmas: the square root of the sum of the squares of each sigma.
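Worked through for the two-dice example (the variance of one fair die is 35/12, and the dice are assumed independent):

$$E[X] = 3.5, \qquad \sigma_X = \sqrt{35/12} \approx 1.71$$
$$E[X+Y] = 3.5 + 3.5 = 7, \qquad \sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2} = \sqrt{35/6} \approx 2.42$$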
Given a 2-D array starting at (0,0) and extending to infinity along the positive x and y axes, and a number k > 0, find the number of cells reachable from (0,0) such that at every moment sum_of_digits(x) + sum_of_digits(y) <= k. Moves can be up, down, left, or right, with x, y >= 0. DFS gives the answer, but it is not feasible for large values of k. Can anyone help me with a better algorithm?
I think they asked you to calculate the number of cells (x,y) reachable with k >= x + y. If x = 1, for example, then y can take any value between 0 and k-1 and the sum would be <= k. The total number of possibilities can be calculated by
sum(sum(1, y=0..k-x), x=0..k) = (k+1)(k+2)/2 = 1/2*k² + 3/2*k + 1
That should be able to do the trick for large k.
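A quick brute-force check of that closed form in Python (under the x + y <= k reading):

def brute(k):
    return sum(1 for x in range(k + 1) for y in range(k + 1) if x + y <= k)

for k in (1, 5, 100):
    assert brute(k) == (k + 1) * (k + 2) // 2   # == 1/2*k^2 + 3/2*k + 1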
I am somewhat confused by the "digits" in your question. The digits make up the index, like three 9s make 999. The sum of digits for the cell (999,888) would be 51. If you allowed the sum of digits to reach 10^9, an index could have around 10^8 digits, resulting in something around 10^(10^8) entries, well beyond normal sizes for a table. I am therefore assuming my first interpretation. If that's not correct, could you explain it a bit more?
EDIT:
Okay, so my answer is not going to solve it. I'm afraid I don't see a nice formula or answer. I would approach it as a coloring/marking problem: mark all valid cells, then use some other technique to make sure all the parts are connected and to count them.
I have tried to come up with something, but it's too messy. Basically, I would try to mark large parts at once based on the index and k. If k=20, you can mark the cell range (0, 0..299) at once (as any lower index will have a lower digit sum) and continue to check the rest of the range. I start with 299 by fixing the last two digits to their maximum value and looking for the maximum value of the first digit. Then I continue that process for the remaining hundreds (300-999), fixing only the last digit, to end up with 300..389 and 390..398. However, you can already see that it's a mess... (nevertheless, I wanted to give it to you; you might get some better idea from it).
Another thing you can see immediately is that your problem is symmetric in its indices, so any valid cell (x,y) tells you there's another valid cell (y,x). This can be exploited in a marking scheme / DFS / BFS.
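To make the marking idea concrete, here is a plain BFS/marking sketch in Python for the digit-sum reading. The limit cap is an assumption for practicality; for small k the reachable region is bounded anyway, since reaching a large coordinate forces a high digit sum somewhere along the path:

from collections import deque

def digit_sum(n):
    return sum(map(int, str(n)))

def count_reachable(k, limit=1000):
    """Mark every cell reachable from (0, 0) while keeping
    digit_sum(x) + digit_sum(y) <= k, then count the marks."""
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < limit and 0 <= ny < limit
                    and (nx, ny) not in seen
                    and digit_sum(nx) + digit_sum(ny) <= k):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return len(seen)

print(count_reachable(5))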
Scenario:
Drawing a graph. I have data points which range from A to B, and I want to decide on a granularity for drawing the axis scales. E.g., for 134 to 151 the scale might run from 130 to 155, to start and end on "round" numbers in the decimal system. But the numbers might run from 134.31 to 134.35, in which case a scale from 130 to 135 would (visually) compress out the "significance" in the data; it would be better to draw the scale from 134 to 135, or maybe even 134.3 to 134.4. And the data values might instead run from 0.013431 to 0.013435, or from 1343100 to 1343500.
So I'm trying to figure out an elegant way to calculate the "granularity" to round the low bound down to and the upper bound up to, to produce a "pleasing" chart. One could just "hack" it somehow, but that produces little confidence that "odd" cases will be handled well.
Any ideas?
Just an idea (a sketch in code follows the list):
Add about 10% to your range, tune this figure empirically
Divide size of range by number of tick marks you want to have
Take the base 10 logarithm of that number
Multiply the result by three, then round to the nearest integer
The remainder modulo 3 will tell you whether you want the least significant decimal to change in steps of 1, 2, or 5
The result of an integer division by 3 will tell you the power of ten to use
Take the (extended) range and compute the extremal tick points it contains, according to the tick frequency just computed
Ensure that all data points actually lie within that range, add ticks if not
If needed, add minor ticks by decreasing the integer above by one
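Here is a sketch of those steps in Python; the 10% padding and the tick-count target are the tunable figures mentioned above:

import math

def nice_ticks(lo, hi, target_ticks=5, pad=0.10):
    """1-2-5 tick placement following the recipe above."""
    span = (hi - lo) * (1 + pad)                 # widen the range by ~10%
    raw_step = span / target_ticks               # rough size per tick
    t = round(3 * math.log10(raw_step))          # thirds of a decade, rounded
    power, third = divmod(t, 3)
    step = (1, 2, 5)[third] * 10 ** power        # 1, 2 or 5 times a power of ten
    start = math.floor(lo / step + 1e-9) * step  # first "round" tick at/below lo
    end = math.ceil(hi / step - 1e-9) * step     # last tick covers all the data
    n = int(round((end - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]

print(nice_ticks(134, 151))        # 130.0, 135.0, ..., 155.0
print(nice_ticks(134.31, 134.35))  # 134.31, 134.32, ..., 134.35

The 1e-9 nudges only guard against floating-point noise in the divisions; minor ticks would come from decreasing t by one, as in the last step above.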
I found a very helpful calculation which is very similar to the axis scaling of Excel graphs.
It is written for Excel, but I used it and transformed it into Objective-C code for setting up my graph axes.
Is there an elegant method to create a number that does not exist in a given list of floating point numbers? It would be nice if this number were not close to the existing values in the array.
For example, in the list [-1.5, 1e+38, -1e38, 1e-12] it might be nice to pick a number like 20 that's "far" away from the existing numbers as opposed to 0.0 which is not in the list, but very close to 1e-12.
The only algorithm I've been able to come up with involves creating a random number and testing to see whether it is in the array. If it is, regenerate. Is there a better deterministic approach?
Here's a way to select a random number not in the list, where the probability is higher the further away from an existing point you get.
Create a probability distribution function f as follows:
f(x) = <the absolute distance to the point closest to x>
Such a function gives a higher probability the further away you are from any given point. (Note that it should be normalized so that the area below the function is 1.)
Create the primitive function F of f (i.e., the accumulated area below f up to a given point).
Generate a uniformly random number, x, between 0 and 1 (that's easy! :)
Get the final result by applying the inverse of F to that value: F⁻¹(x).
Here's a picture describing a situation with 1.5, 2.2 and 2.9 given as existing numbers:
Here's the intuition of why it works:
The higher the probability (the higher the blue line is), the steeper the red line is.
The steeper the red line is, the more probable it is that x hits the red line at that point.
For example: at the given points, the blue line is 0, so the red line is horizontal. If the red line is horizontal, the probability that x hits that point is zero.
(If you want the full range of doubles, you could set min / max to -Double.MAX_VALUE and Double.MAX_VALUE respectively.)
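Here is a numeric sketch of that recipe in Python, assuming a bounded [lo, hi] search interval; it builds f from nearest-point distances, accumulates it into F, and inverts F on a grid:

import bisect
import random

def sample_far_from(points, lo, hi, grid=10_000):
    """Draw a number in [lo, hi] with density proportional to the
    distance to the nearest point in `points` (numeric inverse CDF)."""
    xs = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    f = [min(abs(x - p) for p in points) for x in xs]   # unnormalized density
    F = [0.0]                                           # accumulate area under f
    for i in range(1, len(xs)):
        F.append(F[-1] + (f[i - 1] + f[i]) / 2 * (xs[i] - xs[i - 1]))
    F = [v / F[-1] for v in F]                          # normalize to [0, 1]
    u = random.random()                                 # uniform in [0, 1)
    return xs[bisect.bisect_left(F, u)]                 # apply F⁻¹ numerically

print(sample_far_from([1.5, 2.2, 2.9], 0.0, 5.0))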
If you have the constraint that the new value must lie somewhere within [min, max], then you could sort your values and insert the mean of the two adjacent values with the largest absolute difference.
In your sample case, [-1e38, -1.5, 1e-12, 1e+38] is the ordered list. As you calculate the absolute differences, you'll find the maximum difference for the pair (1e-12, 1e+38), so you calculate the new value as ((n[i+1] - n[i]) / 2) + n[i] (a simple mean-value calculation).
Update:
Additionally, you could also check whether the FLOAT_MAX or FLOAT_MIN values would give good candidates. Simply check their distances to max and min, and if the resulting values are larger than the maximum difference between two adjacent values, pick them.
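A small sketch of the gap-midpoint idea in Python; note that with the example list the two astronomically wide gaps tie at 1e38 in double precision, so max simply keeps the first one:

def midpoint_of_largest_gap(values):
    """Sort, find the widest gap between neighbours, return its midpoint."""
    s = sorted(values)
    lo, hi = max(zip(s, s[1:]), key=lambda p: p[1] - p[0])
    return lo + (hi - lo) / 2

print(midpoint_of_largest_gap([-1.5, 1e+38, -1e38, 1e-12]))  # ≈ -5e+37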
If there is no upper bound, just sum up the absolute values of all the numbers, or subtract them all.
Another possible solution would be to get the smallest number and the greatest number in the list, and choose something outside their bounds (maybe double the greatest number).
Or probably the best way would be to compute the average, the smallest and the biggest number, as well as the standard deviation. Then, with all this data, you know how the numbers are structured and can choose accordingly (all clustered around a given negative value? Choose a positive one. All small numbers? Choose a big one. Etc.)
Something along the lines of
number := 1
multiplier := random(1000) + 1
if avg > 0
    number := -number
if min < 1 and max > 1
    multiplier := 1 / (random(1000) + 1)
if stdDev > 1000
    number := avg + random(500) - 250
    multiplier := multiplier / (random(1000) + 1)
(just an example from the top of my head)
Or another possibility would be to XOR all the numbers together (treating their bit patterns as integers); that should yield a good result.