Python - Which rounding mode is applied in percentage formatting? - precision

I don't understand how rounding (up or down) works when formatting a number as a percentage.
n1 = 125.5
n2 = 1.255
print(f'{n1:.0f}')  # ==> 126
print(f'{n2:.0%}')  # ==> 125%
Why does n2 print as 125%, not 126%?
Is rounding done differently for percentages than for floats?
Thank you.
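One way to look at this, as a sketch rather than a definitive explanation: the `%` presentation type multiplies the float by 100 before rounding, and the literal 1.255 is stored as a binary double slightly below 1.255, so the product lands just under 125.5. The helper `percent_half_up` is a hypothetical name introduced here; it works from the decimal digits of `str(value)` instead of the binary value.

```python
from decimal import Decimal, ROUND_HALF_UP

n1 = 125.5
n2 = 1.255

# Decimal(float) shows the exact binary value stored for each literal.
print(Decimal(n1))   # 125.5 is exactly representable
print(Decimal(n2))   # slightly below 1.255
print(n2 * 100)      # just under 125.5, so '.0%' rounds down

def percent_half_up(value):
    """Hypothetical helper: whole percent, halves rounded up,
    computed from the decimal digits of str(value)."""
    scaled = Decimal(str(value)) * 100
    return f"{scaled.quantize(Decimal('1'), rounding=ROUND_HALF_UP)}%"

print(percent_half_up(n2))
```

Whether half-up on the printed digits is the right rule depends on the application; it is shown only to contrast with the binary behaviour.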

Related

How to round to an arbitrary (non power-of-ten) precision [duplicate]

Let’s imagine I would like to round a number (e.g. x = 7.4355) to a given arbitrary precision (e.g. p = 0.002). In this case, I would expect to see:
round_arbitrary(x, p) = 7.436
What would be the best approach to design such a rounding function? Ideas in pseudocode or Rust are welcome.
What would be the best approach to design such a rounding function?
An approach that gets near to OP's goal:
// Pseudo code (p != 0)
round_arbitrary(x, p)
  x /= p
  x = round(x)
  return x*p
A key point is that floating point numbers are finite in size and so can represent at most about 2^64 different values exactly, whereas code values like 7.4355, 0.002 and the math quotient 1/7.0 come from a much bigger set. Thus the above will get close to, but is not certain to produce, the exact mathematically rounded value.
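The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation; note that Python's built-in `round` resolves exact ties to the nearest even integer, which may differ from the `round` the pseudocode has in mind.

```python
import math

def round_arbitrary(x, p):
    """Round x to the nearest multiple of p (p != 0)."""
    return round(x / p) * p

result = round_arbitrary(7.4355, 0.002)
print(result)

# The result is close to 7.436 but need not be bit-for-bit equal to it,
# so compare with a tolerance instead of ==.
print(math.isclose(result, 7.436, rel_tol=1e-12))
```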
More advanced code would avoid overflow by not rounding large values that do not need rounding.
// Assume 0 < |p| < 1.0
round_arbitrary_2(x, p)
  if (round(x) != x)
    x /= p
    x = round(x)
    x *= p
  return x
Deeper
The issue lies with floating point numbers being encoded as an integer times a power of 2. So the question is not so much "How to round to an arbitrary (non power-of-ten) precision", but "How to round to an arbitrary (non power-of-2) precision".

Round-off to zero behavior in Maxima float coefficients

So I am working on a Maxima program that involves a bunch of iterations (the Souriau-Frame Drazin Inverse Algorithm, to be specific), each step of which yields a polynomial. I need to check and stop my iterations when the polynomial goes to zero (i.e., all coefficients go to zero).
Maxima seems to never truncate small numbers to zero up until it reaches something absurd like $10^{-323}$ and so on.
The following code snippet gives an idea of what I need:
(%i3) rat(1e-300);
rat: replaced 1.0E-300 by 1/999999999999999903803069407426113968898218766118103141789833949572356552411722264192305659040010509526872994217248819197070144216063125530186267630296136203765329090687113225440746189048800695790727969805197112921161540803823920273299782054992133678869364753954248541633605124057805104488924519071744 = 1.0E-300
(%o3)/R/ 1/9999999999999999038030694074261139688982187661181031417898339495723\
565524117222641923056590400105095268729942172488191970701442160631255301862676\
302961362037653290906871132254407461890488006957907279698051971129211615408038\
23920273299782054992133678869364753954248541633605124057805104488924519071744
(%i4) rat(1e-323);
rat: replaced 9.0E-324 by^C
Maxima encountered a Lisp error:
SIMPLE-ERROR: Console interrupt.
Automatically continuing.
To enable the Lisp debugger set *debugger-hook* to nil.
(%i5) rat(1e-325);
rat: replaced 0.0 by 0/1 = 0.0
(%o5)/R/ 0
(%i6)
As one can see, it's not truncating $10^{-300}$ to zero, it hangs (and I had to sigkill it) for $10^{-323}$ and everything smaller than $10^{-325}$ is set to zero.
I don't know where this 324 is coming from. And I'd like to know if it's possible to reduce this for my code.
Edit 1: Here's the output if I used rationalize instead of rat:
(%i3) rationalize(1e-300);
(%o3) 6032057205060441/6032057205060440848842124543157735677050252251748505781\
796615064961622344493727293370973578138265743708225425014400837164813540499979\
063179105919597766951022193355091707896034850684039059079180396788349106095584\
290087446076413771468940477241550670753145517602931224392424029547429993824129\
889235158145614364972941312
(%i4) rationalize(1e-323);
(%o4) 1/1012011266536553091762476733594586535247783248820710591784506790137151\
697839976734459801918507185622475935389321584059556949043686928967384335066999\
703692549607587121382831806822334538710466081706198838392363725342810037417123\
463493090516778245797781704050282561793847761667073076152512660931637543230031\
31653853870546747392
(%i5) rationalize(1e-324);
(%o5) 0
Edit 2: Here's the output to "build_info();":
(%i6) build_info();
(%o6)
Maxima version: "5.43.2"
Maxima build date: "2020-02-21 05:22:38"
Host type: "x86_64-pc-linux-gnu"
Lisp implementation type: "GNU Common Lisp (GCL)"
Lisp implementation version: "GCL 2.6.12"
User dir: "/home/nidish/.maxima"
Temp dir: "/tmp"
Object dir: "/home/nidish/.maxima/binary/5_43_2/gcl/GCL_2_6_12"
Frontend: false
I gather that the goal is to replace floats that are small (in absolute value) with zero. There doesn't appear to be a built-in function for that. Here's an attempt at an implementation via the pattern matching machinery.
First define a rule to replace small floats, and define a function which applies the rule to an expression.
(%i4) matchdeclare(xx,floatnump) $
(%i5) defrule(squashing_rule,xx, if abs(xx) <= squashing_tolerance then 0 else xx);
(%o5) squashing_rule : xx -> (if abs(xx) <= squashing_tolerance then 0 else xx)
(%i6) squashing_tolerance:0.01 $
(%i7) squash_floats(expr):=applyb1(expr,squashing_rule) $
Now create a random polynomial.
(%i8) e:makelist(float((((2*random(2)-1)*(1+random(8)))/8) *10^-random(4)) *x^k,k,1,6);
(%o8) [- 3.75e-4 x, - 0.00625 x^2, - 0.05 x^3, 0.00625 x^4, 0.005 x^5, 0.5 x^6]
(%i9) e1:apply("+",e);
(%o9) 0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2 - 3.75e-4 x
Apply squash_floats to the generated polynomial.
(%i10) squash_floats(e1);
(%o10) 0.5 x^6 - 0.05 x^3
Change the squashing tolerance.
(%i11) squashing_tolerance:0.001;
(%i12) squash_floats(e1);
(%o12) 0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2
Verify the replacement happens in nested expressions.
(%i13) squash_floats(sin(1+1/e1));
(%o13) sin(1/(0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2) + 1)
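For readers without Maxima at hand, the same thresholding idea can be sketched in Python. This is not the Maxima pattern-matching machinery; it assumes a polynomial stored as a hypothetical `{power: coefficient}` dictionary and simply drops coefficients whose absolute value is at or below the tolerance.

```python
def squash_floats(coeffs, tolerance):
    """Drop terms whose coefficient satisfies abs(c) <= tolerance."""
    return {power: c for power, c in coeffs.items() if abs(c) > tolerance}

def is_zero_polynomial(coeffs, tolerance):
    """True when every coefficient is squashed, i.e. the polynomial is
    zero to within the given absolute tolerance."""
    return not squash_floats(coeffs, tolerance)

# Same coefficients as the example polynomial e1 above.
e1 = {6: 0.5, 5: 0.005, 4: 0.00625, 3: -0.05, 2: -0.00625, 1: -3.75e-4}

print(squash_floats(e1, 0.01))    # keeps only the x^6 and x^3 terms
print(squash_floats(e1, 0.001))   # drops only the x term
print(is_zero_polynomial(e1, 1.0))
```

An iteration could then stop as soon as `is_zero_polynomial` returns True for the current step's polynomial.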
First let's step back a moment. What is the behavior you are hoping to find? If you need to convert very small floats to rational numbers accurately, try rationalize instead of rat. Does that work correctly for 1e-323?
If you want floats smaller than a tolerance to be converted to zero, we'll need to take a different approach. I'll hold off on that for the moment.
About the specific behavior you have observed, it appears to be implementation-dependent; I get a different (still buggy) behavior with Maxima + SBCL, which reports a floating point overflow. What does build_info(); report?
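I don't know if it matters, but 1e-323 is a so-called denormalized float -- it is smaller than the smallest normalized (full precision) float, which is about 2.2e-308. The smallest positive denormalized double is about 4.9e-324, which is where the 324 comes from: anything much smaller than that is read as 0.0.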
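The limits involved are properties of IEEE 754 double precision, not of Maxima, so they can be checked from any language that uses doubles. A small Python sketch, for illustration only:

```python
import sys

smallest_normal = sys.float_info.min   # about 2.2e-308
smallest_denormal = 5e-324             # parsed as the smallest positive double

print(smallest_normal, smallest_denormal)

# 1e-323 is nonzero but below the normalized range, i.e. denormalized.
print(0.0 < 1e-323 < smallest_normal)

# Literals well below the smallest denormal are read as exactly zero.
print(1e-324 == 0.0, 1e-325 == 0.0)
```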
First you say "I want to know when a polynomial is exactly going to zero." And then you say "if a coefficient in a polynomial drops below a threshold, I want that terms to be completely thrown out of the polynomial". So you don't want the polynomial to be exactly zero, you want it to be zero within some threshold (relative? absolute?).
I'm afraid I'm not familiar with the Souriau-Frame Drazin algorithm, but looking at the Greville paper about it, it seems that all the calculations are rational (no square roots etc.), so I wonder if it's feasible to perform your calculations with completely exact rational numbers instead of using floating-point numbers. Then presumably exact means exact, and you don't need to worry about thresholds at all.

Nested loop with dependent bounds trip count

Just out of curiosity I tried to do the following, which turned out to be less obvious than I expected.
Suppose I have nested loops with runtime bounds, for example:
t = 0 // trip count
for l in 0:N
  for k in 0:N
    for j in max(l,k):N
      for i in k:j+1
        t += 1
where t is the loop trip count.
Is there a general algorithm (better than O(N^4), obviously) to calculate the loop trip count?
If not, I would be curious to know how you would approach just this particular loop. The loop above is symmetric (it loops over a symmetric rank-4 tensor), and I am also interested in methods to detect loop symmetry.
I am working on the assumption that the iteration bounds depend only on constants or previous loop variables. A link or journal article, if you know of one, would be great.
I believe the inner loop will run
t = 1/8 * (N^4 + 6 * N^3 + 7 * N^2 + 2 * N)
times.
I did not really solve the problem directly, I fitted a 4-th order polynomial expression to exactly calculated t for N from 1 to 50 hoping that I'll get exact fit.
To calculate exact t I used
sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),N),k,1,N),l,1,N)
which should be the equivalent of actually running your loops.
data fit, log scale http://img714.imageshack.us/img714/2313/plot3.png
The fit for N from 1 to 50 matches exactly and calculating it for N=100 gives 13258775 using both methods.
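The fitted formula can be cross-checked by brute force. The sketch below assumes the same convention as the sum above: every bound is inclusive and l and k start at 1. The function names are introduced here for illustration.

```python
def trip_count_brute(n):
    """Run the four loops with inclusive bounds, l and k starting at 1."""
    t = 0
    for l in range(1, n + 1):
        for k in range(1, n + 1):
            for j in range(max(l, k), n + 1):
                for i in range(k, j + 2):   # i = k .. j+1 inclusive
                    t += 1
    return t

def trip_count_closed(n):
    """Fitted closed form; the numerator factors as n(n+1)(n^2+5n+2)."""
    return (n**4 + 6 * n**3 + 7 * n**2 + 2 * n) // 8

for n in range(1, 11):
    assert trip_count_brute(n) == trip_count_closed(n)

print(trip_count_closed(100))   # 13258775, the value quoted above
```

With half-open or 0-based bounds, as in the question's pseudocode, the counts would differ and the polynomial would need to be refitted.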
EDIT:
The exercise was done using the open source computer algebra system Maxima; here's the actual source (output discarded):
nr(n):=sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),n),k,1,n),l,1,n);
M : genmatrix( lambda([i,j],if j=1 then i else nr(i)), 50, 2 );
coefs : lsquares_estimates(M, [x,y], y = A*x^4+B*x^3+C*x^2+D*x+E, [A,B,C,D,E]);
sol(x):=ev(A*x^4+B*x^3+C*x^2+D*x+E, coefs);
sol(N);
S : genmatrix( lambda([i,j], if j=1 then i else sol(i)), 50, 2);
M-S;
plot2d([[discrete,makelist([M[N][1],M[N][2]],N,1,50)], sol(N)], [N, 1, 60], [style, points, lines], [color, red, blue], [legend, "simulation", sol(N)], [logy]);
compare(nr(100),sol(100));
If you want to know how many times the inner loop
for j in max(l,k):N
would be executed, just compute N - max(l, k) for a half-open range, or N + 1 - max(l, k) for a closed range.
For example, if:
l = 2
k = 7
N = 10
then it will run on 7, 8, 9, 10 (closed range), so indeed 10 + 1 - 7 = 4 times.
The answer is no, as long as the loop bounds can depend on the outer variables in an arbitrary fashion, as this would provide a general means of getting closed-form formulations of arbitrary series.
To see this, consider the following:
for x in 0:N
for y in 0:f(x)
t += 1
The trip count t(N) equals the sum t(N) = f(0)+f(1)+f(2)+f(3)+...+f(N-1).
So if you could get a closed-form formulation for t(N) regardless of f(), you would have found a very general method of producing closed forms. Too general, I would say, because what you have here corresponds to an integral, and it's known that not all integrals admit closed-form formulations.

Derive integer factors of float value?

I have a difficult mathematical question that is breaking my brain, my whiteboard, and all my pens. I am working with a file that expresses 2 values, a multiplicand and a percentage. Both of those values must be integers. These two values are multiplied together to produce a range. Range is a float value.
My users edit the range, and I have to calculate a new percentage and multiplicand value. Confused yet? Here's an example:
Multiplicand: 25000 Apples
Percentage: 400 (This works out to .4% or .004)
Range: 100.0 Apples (Calculated by Multiplicand * Percentage)
To complicate things, the allowable values for Percentage are 0-100000. (Meaning 0-100%) Multiplicand is a value between 1 and 32bit int max (presumably unsigned).
I need to allow for users to input a range, like so:
Range: .04 Apples
And calculate the appropriate Percentage and Multiplicand. Using the first example:
OriginalMultiplicand: 25000 Apples
OriginalPercentage: 400 (This works out to .4% or .004)
OriginalRange: 100.0 Apples (Calculated by Multiplicand * Percentage)
NewRange: .01 Apples
NewPercentage: 40
NewMultiplicand: 25 Apples
The example calculation is easy: all that was required was scaling the multiplicand and percentage down so that their product shrinks by the ratio of the new range to the old one. The problem arises when the user changes the value to something like 1400.00555. Suddenly I don't have a clean way to adjust the two values.
I need an algorithmic approach to getting values for M & P that produce the closest possible value to the desired range. Any suggestions?
To maximize the number of decimal places stored, you should use a P of 1, i.e. 0.001% on the question's scale (P counts hundred-thousandths). If that overflows M, then increment P.
So for your example of 1400.00555, P is 1 and M is 140000555.
Your algorithm would search for the lowest P such that M does not overflow. And you can do a binary search here.
// R is the desired range; M = R / (P / 100000) must fit in a 32-bit int.
public int binarySearch(int P0, int P1) {
    int P = P0 + (P1 - P0) / 2;  // midpoint of [P0, P1]
    if (P == P0) {
        if (R / (P0 / 100000f) does not overflow a 32-bit int) {
            return P0;
        } else {
            return P1;
        }
    }
    if (R / (P / 100000f) does not overflow a 32-bit int) {
        return binarySearch(P0, P);
    } else {
        return binarySearch(P, P1);
    }
}
P = binarySearch(1, 100000);
M = round(R / (P / 100000f));
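A runnable Python sketch of the same search follows. It assumes the question's scale (P counts hundred-thousandths, so M * P should equal R * 100000) and an unsigned 32-bit limit for M; `MAX_M`, `SCALE` and the function names are introduced here, and M's lower bound of 1 is not checked.

```python
MAX_M = 4_294_967_295   # assumed: unsigned 32-bit maximum
SCALE = 100_000         # P counts hundred-thousandths, per the question

def multiplicand_for(target, p):
    """Nearest integer to target / p, using integer arithmetic only."""
    return (target + p // 2) // p

def smallest_percentage(r):
    """Return (P, M) with the smallest P in 1..SCALE whose M fits in MAX_M."""
    target = round(r * SCALE)          # M * P should be close to this
    lo, hi = 1, SCALE
    if multiplicand_for(target, hi) > MAX_M:
        raise ValueError("range too large to represent")
    while lo < hi:
        mid = (lo + hi) // 2
        if multiplicand_for(target, mid) <= MAX_M:
            hi = mid                   # mid works; try a smaller P
        else:
            lo = mid + 1               # M overflows; P must be larger
    return lo, multiplicand_for(target, lo)

print(smallest_percentage(1400.00555))
print(smallest_percentage(1_000_000.0))
```

The search relies on M shrinking as P grows, which is what makes bisection valid here.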
(I had a bad method here, but I erased it because it sucked.)
EDIT:
There's got to be a better way than that. Let's rephrase the problem:
What you have is an arbitrary floating-point number. You want to represent this floating-point number with two integers. The integers, when multiplied together and then divided by 100000.0, are equal to the floating-point number. The only other constraint is that one of the integers must be equal to or less than 100000.
It's clear that you can't actually represent floating-point numbers accurately. In fact, you can ONLY represent numbers that are expressible in 1/100000s accurately, even if you have an infinite number of digits of precision in "multiplicand". You can represent 333.33333 accurately, with 33333333 as one number and 1 as the other; you just can't get any more 3s.
Given this limitation, I think your best bet is the following:
Multiply your float by 100000 in an integer format, probably a long or some variant of BigNumber.
Factor it. Record all the factors. It doesn't matter if you store them as 2^3 or 2*2*2 or what.
Grab as many factors as you can without the multiplication of them all exceeding 100000. That becomes your percent. (Don't try to do this perfectly; finding the optimal solution is an NP-hard problem.)
Take the rest of the factors and multiply them together. That's your multiplicand.
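The four steps above can be sketched in Python as follows. This is a greedy illustration under stated assumptions, not an optimal split: factors are found by plain trial division, the largest prime factors are tried first, and the multiplicand is not checked against its 32-bit limit. The names `prime_factors` and `split_range` are introduced here.

```python
def prime_factors(n):
    """Prime factors of n (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def split_range(r, limit=100_000):
    """Return (multiplicand, percent) with multiplicand * percent
    equal to round(r * 100000) and percent <= limit."""
    target = round(r * 100_000)
    percent, multiplicand = 1, 1
    for f in sorted(prime_factors(target), reverse=True):
        if percent * f <= limit:
            percent *= f          # still fits: goes into the percent
        else:
            multiplicand *= f     # everything else goes into the multiplicand
    return multiplicand, percent

print(split_range(100.0))
```

For a range of 100.0 the target is 2^7 * 5^7; the greedy pass takes all the 5s into the percent and leaves the 2s for the multiplicand.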
As I understand from your example, you could represent the range in 100000 different multiplicand * percentage. any choice of multiplicand will give you a satisfying value of percentage, and vice versa. So you have this equation in two variables:
Multiplicand * Percentage = 100.0
You should figure out another equation(constraint), to get a specific value of Multiplicand OR Percentage to solve this equation. Otherwise, you could choose Percentage to be any number between 0-100000 and just substitute it in the first equation to get the value of Multiplicand. I hope I understood the question correctly :)
Edit: OK, then you should be able to factorize the range easily. Get the range, then try to factorize it by dividing the range by each candidate percentage (2-100000). Once the remainder of the division is zero, you have the factors. This is quick pseudo-code:
get range;
percentage = 2;
while(range % percentage != 0)
{
percentage++;
}
multiplicand = range / percentage;
All you have to do now is calculate your limits:
max of percentage = 100000;
max of multiplicand = 4294967295;
Max of range = 4294967295 * 100000 = 429496729500000 (15-digit);
Your max range has at most 15 digits. The double data type in most programming languages can represent integers of that size exactly. Do the calculation using doubles and just convert the Multiplicand and Percentage to int at the end.
It seems you want to choose M and P such that R = (M * P) / 100000.
So M * P = 100000 * R, where you have to round the right-hand side to an integer.
I'd multiply the range by 100000, and then choose M and P as factors of the result so that they don't overflow their allowed ranges.
Say you have
1) M * P = A
Then you have a second value for A, and therefore new values for M and P; let's call them M2, P2 and A2:
2) M2 * P2 = A2
I don't know this for sure, but it is what you seem to be saying, imho: the ratio has to stay the same, so
3) M/P = M2/P2
Now we have 3 equations and 2 unknowns, M2 and P2.
One way to solve it:
3) becomes
M/P = M2/P2
=> M2 = (M/P)*P2
Then substitute that into 2):
(M/P)*P2*P2 = A2
=> P2*P2 = A2 * (P/M)
=> P2 = sqrt(A2 * (P/M))
So first solve for P2, then M2, if I didn't make any mistakes.
There will have to be some rounding if M2 and P2 have to be integers.
EDIT: I forgot about the integer percentage, so say
P = percentage/100000 or P*100000 = percentage
P2 = percentage2/100000 or P2*100000 = percentage2
So just solve for P2 and M2, and multiply P2 by 100000.
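As a worked check of the derivation, here is a small Python sketch using the question's numbers. It assumes, as this answer does, that the ratio M/P is to be preserved and that P is a fraction (percentage / 100000); the function name is introduced here, and the final rounding to integers is left out.

```python
import math

def rescale_keeping_ratio(m, p, a2):
    """Solve M2 * P2 = a2 with M2 / P2 = m / p (p given as a fraction)."""
    p2 = math.sqrt(a2 * p / m)
    m2 = (m / p) * p2
    return m2, p2

# Original: M = 25000, percentage = 400 (P = 0.004), range = 100.0.
# New range requested: 0.01.
m2, p2 = rescale_keeping_ratio(25000, 400 / 100000, 0.01)
print(m2, p2 * 100000)   # new multiplicand and new integer-scale percentage
```

Here the result is approximately M2 = 250 with percentage2 = 4, which differs from the question's 25 and 40 because the question's example did not keep M/P constant.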

How to map a point on a non-linear range to a linear one and back?

I have a list of linear ranges which represent one big range:
X'
100 200 300 400 500 600 700 | 900 (X)
|----------|----------|----------|--------+----------|
0 | 100 (Y)
Y'
X consists of the following ranges (even and round numbers are just examples for ease of comprehension, they could be anything, no proportions here at all):
From 100 to 200
From 300 to 400
From 500 to 600
From 700 to 900
On the flip side, Y has just one range:
From 0 to 100
Both X and Y are of the same length, just different units. Let's say one is dollars and another is percents (or any other similarly unrelated units). So Y'0 == X'100 and Y'100 == X'900.
Given any point in Y, what is the equivalent point in X, and vice versa: given a point in X, what is it in Y?
Is this a typical math problem? Does it have a name?
How many ranges do you have? Is it acceptable that the algorithm is O(number of ranges)?
If so, below is the description of the algorithm. Let me explain it on your (original) example.
100 200 300 400 500 600 700 800
|----------|----------|----------|----------|
0% 100%
1) What you're going to do is map the value X in range A (100-800) to the value Y in the continuous range B (0-399) (as the total number of elements in your range is 400). Then it's easy to convert a position in B to a percentage; I will omit this part.
2) Create a list of records, where each record represents one range mapping.
struct RangeRecord {
int start_in_a;
int start_in_b;
};
In your case, you will get the following list:
{100, 0}, {300, 100}, {500, 200}, {700, 300}
3) When you need to map a number X from A to B, you iterate the list to find the last record with start_in_a <= X. Then your value Y is
Y = X + start_in_b - start_in_a;
4) The algorithm is symmetric: you just iterate the list to find the last record with start_in_b <= Y, and then
X = Y + start_in_a - start_in_b.
Note 1. For error checking purposes, you might keep the range size in RangeRecord, as well.
Note 2. If O(number of ranges) is not good enough, keep the records as a tree instead of a list. You will then need O(log(number of ranges)) operations.
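A Python sketch of this record lookup, using the list from step 2. It picks the last record whose start does not exceed the value, via binary search on a sorted list (in the spirit of Note 2, though with `bisect` instead of a tree). The function names are introduced here, and the range-size check from Note 1 is not included, so values falling inside a gap are not rejected.

```python
from bisect import bisect_right

# (start_in_a, start_in_b) records, sorted by both fields.
RECORDS = [(100, 0), (300, 100), (500, 200), (700, 300)]
STARTS_A = [a for a, _ in RECORDS]
STARTS_B = [b for _, b in RECORDS]

def a_to_b(x):
    """Map X in the discontinuous range A to Y in the continuous range B."""
    i = bisect_right(STARTS_A, x) - 1    # last record with start_in_a <= x
    if i < 0:
        raise ValueError("x is below the first range")
    start_a, start_b = RECORDS[i]
    return x + start_b - start_a

def b_to_a(y):
    """Map Y in the continuous range B back to X in range A."""
    i = bisect_right(STARTS_B, y) - 1    # last record with start_in_b <= y
    if i < 0:
        raise ValueError("y is below the first range")
    start_a, start_b = RECORDS[i]
    return y + start_a - start_b

print(a_to_b(350), b_to_a(150))
```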
Say you have one range (a, b) and another one (c, d). Now you have a number i for which a < i < b. You can "normalize" it by subtracting a and dividing by b - a - this gives you a value between 0 and 1. You can then use this value to transfer it into the other range by reversing this calculation with the other bounds, so to speak multiply it by (d - c) and add c.
Say the corresponding point in the other range is i'. Then,
i' = (i - a) / (b - a) * (d - c) + c
The term you are searching for is scaling and translation.
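The formula above, written as a small Python function for a single pair of ranges. The name `rescale` is introduced here; the sketch assumes a != b.

```python
def rescale(i, a, b, c, d):
    """Map i from the range (a, b) onto the range (c, d) linearly."""
    normalized = (i - a) / (b - a)     # 0 at a, 1 at b
    return normalized * (d - c) + c

print(rescale(150, 100, 200, 0, 100))   # midpoint maps to midpoint
print(rescale(50, 0, 100, 100, 900))
```

For a range made of several pieces, this would be applied per piece once the piece containing the point is known.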
This is not really solvable because the problem is underspecified. Even for the same ranges, there can be different sliders like this:
1 100 101 1000
|-----|-----------|
1 100 101 1000
|-----------|-----|
For each range like [1..100] you need to know which percent points on the slider correspond to it. In the above examples this could be something like [0%..33%] or [0%..66%]. Once you have this information, it's easy to determine in which of the ranges, and at which position of that range, a given data point lies and what value it corresponds to.
You have three things you need to adjust for in converting from some X' to Y' and vice versa:
The ranges start at different places.
One of them is discontinuous.
The size of each step is different between the two ranges.
It might be helpful (at least while developing your solutions) to consider a similar range Z, which is the range 0 to 503 and has a one-to-one mapping with the 504 possible values in X. That is, for each discontinuity, if the X value is greater than the upper end of the discontinuity, subtract 99 (the size of the discontinuity). Then X'100 = Z'0, X'200 = Z'100, X'300 = Z'101, X'400 = Z'201, X'500 = Z'202, etc. The introduction of the Z range resolves problems 1 and 2 in the list above.
To convert from Z to Y, you just multiply by 101/504, which scales Z onto Y.
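The intermediate Z range can be sketched in Python like this. It assumes integer X values in the four ranges from the question, and it assumes the scaling to Y is done with truncating integer division, which the text above does not specify. The constant and function names are introduced here.

```python
RANGE_STARTS_AFTER_GAP = [300, 500, 700]   # first X value after each discontinuity
GAP = 99                                   # values missing in each discontinuity

def x_to_z(x):
    """Collapse the discontinuities so that valid X values map to 0..503."""
    skipped = sum(1 for start in RANGE_STARTS_AFTER_GAP if x >= start)
    return x - 100 - GAP * skipped

def z_to_y(z):
    """Scale Z (0..503) onto Y (0..100), truncating to an integer."""
    return z * 101 // 504

for x in (100, 200, 300, 400, 500, 900):
    print(x, x_to_z(x), z_to_y(x_to_z(x)))
```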
Assuming the piecewise linear arrangement you imply, you can find X by:
X = 4*Y + 100*int(1 + Y/25.)
and the reverse for Y:
X2 = int(X/100.)
X3 = X2-int(X2/2.)
Y = (X-100*X3)/4.
edit: This solution works for the original range you gave:
100 200 300 400 500 600 700 800
|----------|----------|----------|----------|
0% 100%
And of course, the reverse formula only holds for valid values of X.
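The two formulas can be checked with a short Python round trip. This sketch covers the original 100-800 example only and uses `int()` for the truncation; the function names are introduced here. The forward formula is exercised for Y below 100, since at Y = 100 exactly it steps into a fifth segment.

```python
def y_to_x(y):
    """Forward formula: percent Y to a point X on the piecewise range."""
    return 4 * y + 100 * int(1 + y / 25.0)

def x_to_y(x):
    """Reverse formula; only meaningful for valid X values."""
    x2 = int(x / 100.0)
    x3 = x2 - int(x2 / 2.0)
    return (x - 100 * x3) / 4.0

for y in (0, 10, 25, 30, 60, 99):
    x = y_to_x(y)
    print(y, x, x_to_y(x))
```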
Here's a figure of the two curves. The green is your original specification and the blue is the reverse curve (again, only valid for the valid x-values).
alt text http://img523.imageshack.us/img523/8858/66945008.png
