Handling Massive (high precision) Floats in Ruby - ruby

I'm making an application which is listening to prices being updated regularly, but occasionally my data source throws me something like "1.79769313486232e+308". The numbers that get sent will never by really big numbers (eg. "179769313486232e+308"), but as with the example above, they come with a lot of precision.
I'd be happy to drop everything after the first couple of decimal places, and end up with something like "1.798", but the following code isn't working for me:
irb(main):001:0> s = '1.79769313486232e+308'
=> "1.79769313486232e+308"
irb(main):002:0> ("%.3f" % s).to_f
(irb):2: warning: Float 1.79769313486232e+30... out of range
=> 0.0
Any graceful ways to handle these kind of numbers in Ruby?

You need to find out from your data source what those really big numbers mean. The price isn't actually 1.797e+308, but it probably also isn't 1.797. How you deal with those numbers depends entirely on what value you should be interpreting them as.
Update: I'm not sure you understand what this number means. 1.79769313486232e+308 is 1.79769313486232 times ten to the 308th power. It's a number with more than 300 digits to the left of the decimal point. This isn't a price, it's a mistake. It's the upper limit of what double-precision floating point can represent.
In other words, you're getting the equivalent of 0xFFFFFFFF or something like that, but interpreted as a floating point number.

That's right at the top of the range of "doubles". In Python for example it's converted to the special float called "inf".
The graceful way to handle this one might be to treat it as infinity

you could strip out extra decimals using regexp
>> s =~ /\.\d\d\d(\d+)/
>> s.gsub($1, '')
=> "1.797e+308"

Related

why does Ruby's Rational class treat string arguments differently from numeric arguments?

I'm using ruby's Rational library to convert the width & height of images to aspect ratios.
I've noticed that string arguments are treated differently than numeric arguments.
>> Rational('1.91','1')
=> (191/100)
>> Rational(1.91,1)
=> (8601875288277647/4503599627370496)
>> RUBY_VERSION
=> "2.1.5"
>> RUBY_ENGINE
=> "ruby"
FYI 1.91:1 is an aspect ratio recommended by Facebook for images on their platform.
Values like 191 and 100 are much more convenient to store in my database than 8601875288277647 and 4503599627370496. But I'd like to understand where this different originates before deciding which approach to use.
The Rational test suite doesn't seem to cover this exact case.
Disclaimer: This is only an educated guess, based on some knowledge on how to implement such a feat.
As Kent Dahl already said, Floats are not precise, they have a fixed precision, which means 1.91 is really 1.910000000000000000001 or something like that, which ruby "knows" should be displayed as 1.91.
"1.91" on the other hand is a string, basically an array of characters: '1', '.', '9', '1'.
This said, here is what you need to do, to build the rational out of floats:
Get rid of the . (mathematically by multiplying the numerator and denominator with 10^x, or multiplying with ten as many times, as there are numbers behind the .)
Find the greatest common denominator (gcd)
Divide num and denom with the gcd
Step 1 however, is a little different for Float and String:
The Float, we will have to multiply with 10^x, where x is (because of the precision) not 2 (as one would think with 1.91), but more something like 16 (remember: 1.9100...1).
For the String, we COULD cast it into a float and do the same trick, but hey, there is an easier way: We just count the number of characters behind the dot (which is 2), remove the dot and multiply the denom with 10^2... This is not only the easier, but also the more precise way.
The big numbers might disappear again, when applying step 3, that's why you will not always get those strange results when dealing with rationals from floats.
TLDR: The numbers will be built differently based on the arguments being String, or FLoat. FLoats can produce long-ass numbers, because precision.
The Float 1.91 is stored as a double which has a given amount of precision, limited by binary presentation. The equivalent Rational object retains this precision a such as possible, so it is huge. There is no way of storing 1.91 exactly in a double, but the value you get is close enough for most uses.
As for the String, it represents a different value - the exact value of 1.91 - and as you create a Rational it retains it better. It is more correct than the Float, UT takes longer to use for calculations.
This is similar to the problem with 1.0/3 as it "goes on forever" 0.333333...etc, but Rational can represent it exactly.

JDBC / Oracle Double value insertion fails [duplicate]

double r = 11.631;
double theta = 21.4;
In the debugger, these are shown as 11.631000000000000 and 21.399999618530273.
How can I avoid this?
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
I liked Joel's explanation, which deals with a similar binary floating point precision issue in Excel 2007:
See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact.
So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong.
If you have a value like:
double theta = 21.4;
And you want to do:
if (theta == 21.4)
{
}
You have to be a bit clever, you will need to check if the value of theta is really close to 21.4, but not necessarily that value.
if (fabs(theta - 21.4) <= 1e-6)
{
}
This is partly platform-specific - and we don't know what platform you're using.
It's also partly a case of knowing what you actually want to see. The debugger is showing you - to some extent, anyway - the precise value stored in your variable. In my article on binary floating point numbers in .NET, there's a C# class which lets you see the absolutely exact number stored in a double. The online version isn't working at the moment - I'll try to put one up on another site.
Given that the debugger sees the "actual" value, it's got to make a judgement call about what to display - it could show you the value rounded to a few decimal places, or a more precise value. Some debuggers do a better job than others at reading developers' minds, but it's a fundamental problem with binary floating point numbers.
Use the fixed-point decimal type if you want stability at the limits of precision. There are overheads, and you must explicitly cast if you wish to convert to floating point. If you do convert to floating point you will reintroduce the instabilities that seem to bother you.
Alternately you can get over it and learn to work with the limited precision of floating point arithmetic. For example you can use rounding to make values converge, or you can use epsilon comparisons to describe a tolerance. "Epsilon" is a constant you set up that defines a tolerance. For example, you may choose to regard two values as being equal if they are within 0.0001 of each other.
It occurs to me that you could use operator overloading to make epsilon comparisons transparent. That would be very cool.
For mantissa-exponent representations EPSILON must be computed to remain within the representable precision. For a number N, Epsilon = N / 10E+14
System.Double.Epsilon is the smallest representable positive value for the Double type. It is too small for our purpose. Read Microsoft's advice on equality testing
I've come across this before (on my blog) - I think the surprise tends to be that the 'irrational' numbers are different.
By 'irrational' here I'm just referring to the fact that they can't be accurately represented in this format. Real irrational numbers (like π - pi) can't be accurately represented at all.
Most people are familiar with 1/3 not working in decimal: 0.3333333333333...
The odd thing is that 1.1 doesn't work in floats. People expect decimal values to work in floating point numbers because of how they think of them:
1.1 is 11 x 10^-1
When actually they're in base-2
1.1 is 154811237190861 x 2^-47
You can't avoid it, you just have to get used to the fact that some floats are 'irrational', in the same way that 1/3 is.
One way you can avoid this is to use a library that uses an alternative method of representing decimal numbers, such as BCD
If you are using Java and you need accuracy, use the BigDecimal class for floating point calculations. It is slower but safer.
Seems to me that 21.399999618530273 is the single precision (float) representation of 21.4. Looks like the debugger is casting down from double to float somewhere.
You cant avoid this as you're using floating point numbers with fixed quantity of bytes. There's simply no isomorphism possible between real numbers and its limited notation.
But most of the time you can simply ignore it. 21.4==21.4 would still be true because it is still the same numbers with the same error. But 21.4f==21.4 may not be true because the error for float and double are different.
If you need fixed precision, perhaps you should try fixed point numbers. Or even integers. I for example often use int(1000*x) for passing to debug pager.
Dangers of computer arithmetic
If it bothers you, you can customize the way some values are displayed during debug. Use it with care :-)
Enhancing Debugging with the Debugger Display Attributes
Refer to General Decimal Arithmetic
Also take note when comparing floats, see this answer for more information.
According to the javadoc
"If at least one of the operands to a numerical operator is of type double, then the
operation is carried out using 64-bit floating-point arithmetic, and the result of the
numerical operator is a value of type double. If the other operand is not a double, it is
first widened (§5.1.5) to type double by numeric promotion (§5.6)."
Here is the Source

How to do high precision float point arithmetics in mathematica

In Mma, for example, I want to calculate
1.0492843824838929890231*0.2323432432432432^3
But it does not show the full precision. I tried N or various other functions but none seemed to work. How to achieve this? Many thanks.
When you specify numbers using decimal point, it takes them to have MachinePrecision, roughly 16 digits, hence the results typically have less than 16 meaningful digits. You can do infinite precision by using rational/algebraic numbers. If you want finite precision that's better than default, specify your numbers like this
123.23`100
This makes Mathematica interpret the number as having 100 digits of precision. So you can do
ans=1.0492843824838929890231`100*0.2323432432432432`100^3
Check precision of the final answer using Precision
Precision[ans]
Check tutorial/ArbitraryPrecisionNumbers for more details
You may do:
r[x_]:=Rationalize[x,0];
n = r#1.0492843824838929890231 (r#0.2323432432432432)^3
Out:
228598965838025665886943284771018147212124/17369643723462006556253010609136949809542531
And now, for example
N[n,100]
0.01316083216659453615093767083090600540780118249299143245357391544869\
928014026433963352910151464006549
Sometimes you just want to see more of the machine precision result. These are a few methods.
(1) Put the cursor at the end of the output line, and press Enter (not on the numeric keypad) to copy the output to a new input line, showing all digits.
(2) Use InputForm as in InputForm[1.0/7]
(3) Change the setting of PrintPrecision using the Options Inspector.

Arithmetic in ruby

Why this code 7.30 - 7.20 in ruby returns 0.0999999999999996, not 0.10?
But if i'll write 7.30 - 7.16, for example, everything will be ok, i'll get 0.14.
What the problem, and how can i solve it?
What Every Computer Scientist Should Know About Floating-Point Arithmetic
The problem is that some numbers we can easily write in decimal don't have an exact representation in the particular floating point format implemented by current hardware. A casual way of stating this is that all the integers do, but not all of the fractions, because we normally store the fraction with a 2**e exponent. So, you have 3 choices:
Round off appropriately. The unrounded result is always really really close, so a rounded result is invariably "perfect". This is what Javascript does and lots of people don't even realize that JS does everything in floating point.
Use fixed point arithmetic. Ruby actually makes this really easy; it's one of the only languages that seamlessly shifts to Class Bignum from Fixnum as numbers get bigger.
Use a class that is designed to solve this problem, like BigDecimal
To look at the problem in more detail, we can try to represent your "7.3" in binary. The 7 part is easy, 111, but how do we do .3? 111.1 is 7.5, too big, 111.01 is 7.25, getting closer. Turns out, 111.010011 is the "next closest smaller number", 7.296875, and when we try to fill in the missing .003125 eventually we find out that it's just 111.010011001100110011... forever, not representable in our chosen encoding in a finite bit string.
The problem is that floating point is inaccurate. You can solve it by using Rational, BigDecimal or just plain integers (for example if you want to store money you can store the number of cents as an int instead of the number of dollars as a float).
BigDecimal can accurately store any number that has a finite number of digits in base 10 and rounds numbers that don't (so three thirds aren't one whole).
Rational can accurately store any rational number and can't store irrational numbers at all.
That is a common error from how float point numbers are represented in memory.
Use BigDecimal if you need exact results.
result=BigDecimal.new("7.3")-BigDecimal("7.2")
puts "%2.2f" % result
It is interesting to note that a number that has few decimals in one base may typically have a very large number of decimals in another. For instance, it takes an infinite number of decimals to express 1/3 (=0.3333...) in the base 10, but only one decimal in the base 3. Similarly, it takes many decimals to express the number 1/10 (=0.1) in the base 2.
Since you are doing floating point math then the number returned is what your computer uses for precision.
If you want a closer answer, to a set precision, just multiple the float by that (such as by 100), convert it to an int, do the math, then divide.
There are other solutions, but I find this to be the simplest since rounding always seems a bit iffy to me.
This has been asked before here, you may want to look for some of the answers given before, such as this one:
Dealing with accuracy problems in floating-point numbers

Why is BigDecimal returning a weird value?

I am writing code that will deal with currencies, charges, etc. I am going to use the BigDecimal class for math and storage, but we ran into something weird with it.
This statement:
1876.8 == BigDecimal('1876.8')
returns false.
If I run those values through a formatting string "%.13f" I get:
"%.20f" % 1876.8 => 1876.8000000000000
"%.20f" % BigDecimal('1876.8') => 1876.8000000000002
Note the extra 2 from the BigDecimal at the last decimal place.
I thought BigDecimal was supposed to counter the inaccuracies of storing real numbers directly in the native floating point of the computer. Where is this 2 coming from?
It won't give you as much control over the number of decimal places, but the conventional format mechanism for BigDecimal appears to be:
a.to_s('F')
If you need more control, consider using the Money gem, assuming your domain problem is mostly about currency.
gem install money
You are right, BigDecimal should be storing it correctly, my best guess is:
BigDecimal is storing the value correctly
When passed to a string formatting function, BigDecimal is being cast as a lower precision floating point value, creating the ...02.
When compared directly with a float, the float has an extra decimal place far beyond the 20 you see (classic floats can't be compared behavoir).
Either way, you are unlikely to get accurate results comparing a float to a BigDecimal.
Don't compare FPU decimal string fractions for equality
The problem is that the equality comparison of a floating or double value with a decimal constant that contains a fraction is rarely successful.
Very few decimal string fractions have exact values in the binary FP representation, so equality comparisons are usually doomed.*
To answer your exact question, the 2 is coming from a slightly different conversion of the decimal string fraction into the Float format. Because the fraction cannot be represented exactly, it's possible that two computations will consider different amounts of precision in intermediate calculations and ultimately end up rounding the result to a 52-bit IEEE 754 double precision mantissa differently. It hardly matters because there is no exact representation anyway, but one is probably more wrong than the other.
In particular, your 1876.8 cannot be represented exactly by an FP object, in fact, between 0.01 and 0.99, only 0.25, 0.50, and 0.75 have exact binary representations. All the others, include 1876.8, repeat forever and are rounded to 52 bits. This is about half of the reason that BigDecimal even exists. (The other half of the reason is the fixed precision of FP data: sometimes you need more.)
So, the result that you get when comparing an actual machine value with a decimal string constant depends on every single bit in the binary fraction ... down to 1/252 ... and even then requires rounding.
If there is anything even the slightest bit (hehe, bit, sorry) imperfect about the process that produced the number, or the input conversion code, or anything else involved, they won't look exactly equal.
An argument could even be made that the comparison should always fail because no IEEE-format FPU can even represent that number exactly. They really are not equal, even though they look like it. On the left, your decimal string has been converted to a binary string, and most of the numbers just don't convert exactly. On the right, it's still a decimal string.
So don't mix floats with BigDecimal, just compare one BigDecimal with another BigDecimal. (Even when both operands are floats, testing for equality requires great care or a fuzzy test. Also, don't trust every formatted digit: output formatting will carry remainders way off the right side of the fraction, so you don't generally start seeing zeroes, you will just see garbage values.)
*The problem: machine numbers are x/2n, but decimal constants are x/(2n * 5m). Your value as sign, exponent, and mantissa is the infinitely repeating 0 10000001001 1101010100110011001100110011001100110011001100110011... Ironically, FP arithmetic is perfectly precise and equality comparisons work perfectly well when the value has no fraction.
as David said, BigDecimal is storing it right
p (BigDecimal('1876.8') * 100000000000000).to_i
returns 187680000000000000
so, yes, the string formatting is ruining it
If you don't need fractional cents, consider storing and manipulating the currency as an integer, then dividing by 100 when it's time to display. I find that easier than dealing with the inevitable precision issues of storing and manipulating in floating point.
On Mac OS X, I'm running ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]
irb(main):004:0> 1876.8 == BigDecimal('1876.8') => true
However, being Ruby, I think you should think in terms of messages sent to objects. What does this return to you:
BigDecimal('1876.8') == 1876.8
The two aren't equivalent, and if you're trying to use BigDecimal's ability to determine precise decimal equality, it should be the receiver of the message asking about the equality.
For the same reason I don't think formatting the BigDecimal by sending a format message to the format string is the right approach either.

Resources