In my ruby code, I'm talking to a server that responds with 128bit long doubles ("128 bit long doubles", "binary128" or "quadruple precision floating points") as strings in binary format.
Is there a way to unpack these strings for use in ruby? According to the documentation on String.unpack, the maximum precision seems to be doubles. Is it possible to represent these 128bit floats in ruby at all?
Related
This question already has answers here:
Floating point arithmetic and reproducibility
(1 answer)
Reproducibility of floating point operation result
(2 answers)
Is specifying floating-point type sufficient to guarantee same results?
(1 answer)
Closed 1 year ago.
Hash functions like SHA-2 are widely used to validate the transfer of data or to validate cryptographic integer calculations.
Is it possible to use hash functions to validate floating point calculations, e.g. a physics simulation. In other words is it possible and viable to make floating point calculations consistent across multiple platforms?
If the answer to the previous question is "no", given the calculations are carefully crafted, it might be possible to hash data with reduced precision and still get high confidence in the validation. Is there any ongoing project which tries to achieve that?
Hashes work on binary input, usually bytes. So if you want to use hashes to compare results instead of comparing the results themselves then you need to create a canonical binary encoding of the values. For floating point, the common standard that specifies how numbers can be encoded is IEEE 754, and most languages will specify how they use floats compared to that standard.
For instance, Java adheres to the "round-to-nearest rule of IEEE 754 floating-point arithmetic" and starts with this part:
The floating-point types are float and double, which are conceptually associated with the single-precision 32-bit and double-precision 64-bit format IEEE 754 values and operations as specified in IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985 (IEEE, New York).
Please that you should also understand the strictfp flag for Java. Similar tricks may be required for other languages as well.
I have the following string:
"1.273,08"
And I need to convert to float and the result must be:
1273.08
I tried some code using gsub but I can't solve this.
How can I do this conversion?
You have already received two good answers how to massage your String into your desired format using String#delete and String#tr.
But there is a deeper problem.
The decimal value 1 273.0810 cannot be accurately represented as an IEEE 754-2019 / ISO/IEC 60559:2020 binary64 floating point value.
Just like the value 1/3rd can easily be represented in ternary (0.13) but has an infinite representation in decimal (0.33333333…10, i.e. 0.[3]…10) and thus cannot be accurately represented, the value 8/100th can easily be represented in decimal (0.0810) but has an infinite representation in binary (0.0001010001111010111000010100011110101110000101…2, i.e. 0.[00010100011110101110]…2). In other words, it is impossible to express 1 273.0810 as a Ruby Float.
And that's not specific to Ruby, or even to programming, that is just basic high school maths: you cannot represent this number in binary, period, just like you cannot represent 1/3rd in decimal, or π in any integer base.
And of course, computers don't have infinite memory, so not only does 1 273.0810 have an infinite representation in binary, but as a Float, it will also be cut off after 64 bits. The closest possible value to 1 273.0810 as an IEEE 754-2019 / ISO/IEC 60559:2020 binary64 floating point value is 1 273.079 999 999 999 927 240 423 858 1710, which is less than 1 273.0810.
That is why you should never represent money using binary numbers: everybody will expect it to be decimal, not binary; if I write a cheque, I write it in decimal, not binary. People will expect that it is impossible to represent $ 1/3rd, they will expect that it is impossible to represent $ π, but they will not expect and not accept that if they put $ 1273.08 into their account, they will actually end up with slightly less than that.
The correct way to represent money would be to use a specialized Money datatype, or at least using the bigdecimal library from the standard library:
require 'bigdecimal'
BigDecimal('1.273,08'.delete('.').tr(',', '.'))
#=> 0.127308e4
I would do
"1.273,08".delete('.') # delete '.' from the string
.tr(',', '.') # replace ',' with '.'
.to_f # translate to float
#=> 1273.08
So, we're using . as a thousands separator and , instead of a dot:
str = "1.273,08"
str.gsub('.','').gsub(',', '.').to_f
int types have a very low range of number it supports as compared to double. For example I want to use a integer number with a high range. Should I use double for this purpose. Or is there an alternative for this.
Is arithmetic slow in doubles ?
Whether double arithmetic is slow as compared to integer arithmetic depends on the CPU and the bit size of the integer/double.
On modern hardware floating point arithmetic is generally not slow. Even though the general rule may be that integer arithmetic is typically a bit faster than floating point arithmetic, this is not always true. For instance multiplication & division can even be significantly faster for floating point than the integer counterpart (see this answer)
This may be different for embedded systems with no hardware support for floating point. Then double arithmetic will be extremely slow.
Regarding your original problem: You should note that a 64 bit long long int can store more integers exactly (2^63) while double can store integers only up to 2^53 exactly. It can store higher numbers though, but not all integers: they will get rounded.
The nice thing about floating point is that it is much more convenient to work with. You have special symbols for infinity (Inf) and a symbol for undefined (NaN). This makes division by zero for instance possible and not an exception. Also one can use NaN as a return value in case of error or abnormal conditions. With integers one often uses -1 or something to indicate an error. This can propagate in calculations undetected, while NaN will not be undetected as it propagates.
Practical example: The programming language MATLAB has double as the default data type. It is used always even for cases where integers are typically used, e.g. array indexing. Even though MATLAB is an intepreted language and not so fast as a compiled language such as C or C++ is is quite fast and a powerful tool.
Bottom line: Using double instead of integers will not be slow. Perhaps not most efficient, but performance hit is not severe (at least not on modern desktop computer hardware).
Is there any way to tell the default ruby JSON library to parse non-integer numeric values as string (or BigDecimal?) instead of floats?
ie JSON.parse('{"foo": 123.45}')['foo'].class outputs Float, which may lead to precision issues.
PD: the oj library supports loading these values as BigDecimals.
PD2: seems there isn't: https://github.com/flori/json/blob/76f41a84e2bace20c3076aba53887537e37dfdb2/lib/json/pure/parser.rb#L196
In theory JSON as a container could hold highly precise numbers, but in practice one end is generally limited to IEEE 754 double precision floating point numbers as that is what JavaScript itself is limited to. Any precision loss will already be incurred if the values are encoded in JavaScript or almost any JSON implementation.
Hence, converting to BigDecimal from the parsed Float will almost always result in no additional loss of precision:
data = JSON.parse("[1.025]")
# Float can't represent decimal values precisely, so `round` fails
data.first.round(2) # => 1.02
# Converting to big decimal improves the precision of future operations
BigDecimal.new(data.first.to_s).round(2).to_s # => "1.03"
You are much better off transporting your highly precise values as strings.
Lastly, if you really need to ruby libraries can always be monkey patched to behave how you want.
I am writing code that will deal with currencies, charges, etc. I am going to use the BigDecimal class for math and storage, but we ran into something weird with it.
This statement:
1876.8 == BigDecimal('1876.8')
returns false.
If I run those values through a formatting string "%.13f" I get:
"%.20f" % 1876.8 => 1876.8000000000000
"%.20f" % BigDecimal('1876.8') => 1876.8000000000002
Note the extra 2 from the BigDecimal at the last decimal place.
I thought BigDecimal was supposed to counter the inaccuracies of storing real numbers directly in the native floating point of the computer. Where is this 2 coming from?
It won't give you as much control over the number of decimal places, but the conventional format mechanism for BigDecimal appears to be:
a.to_s('F')
If you need more control, consider using the Money gem, assuming your domain problem is mostly about currency.
gem install money
You are right, BigDecimal should be storing it correctly, my best guess is:
BigDecimal is storing the value correctly
When passed to a string formatting function, BigDecimal is being cast as a lower precision floating point value, creating the ...02.
When compared directly with a float, the float has an extra decimal place far beyond the 20 you see (classic floats can't be compared behavoir).
Either way, you are unlikely to get accurate results comparing a float to a BigDecimal.
Don't compare FPU decimal string fractions for equality
The problem is that the equality comparison of a floating or double value with a decimal constant that contains a fraction is rarely successful.
Very few decimal string fractions have exact values in the binary FP representation, so equality comparisons are usually doomed.*
To answer your exact question, the 2 is coming from a slightly different conversion of the decimal string fraction into the Float format. Because the fraction cannot be represented exactly, it's possible that two computations will consider different amounts of precision in intermediate calculations and ultimately end up rounding the result to a 52-bit IEEE 754 double precision mantissa differently. It hardly matters because there is no exact representation anyway, but one is probably more wrong than the other.
In particular, your 1876.8 cannot be represented exactly by an FP object, in fact, between 0.01 and 0.99, only 0.25, 0.50, and 0.75 have exact binary representations. All the others, include 1876.8, repeat forever and are rounded to 52 bits. This is about half of the reason that BigDecimal even exists. (The other half of the reason is the fixed precision of FP data: sometimes you need more.)
So, the result that you get when comparing an actual machine value with a decimal string constant depends on every single bit in the binary fraction ... down to 1/252 ... and even then requires rounding.
If there is anything even the slightest bit (hehe, bit, sorry) imperfect about the process that produced the number, or the input conversion code, or anything else involved, they won't look exactly equal.
An argument could even be made that the comparison should always fail because no IEEE-format FPU can even represent that number exactly. They really are not equal, even though they look like it. On the left, your decimal string has been converted to a binary string, and most of the numbers just don't convert exactly. On the right, it's still a decimal string.
So don't mix floats with BigDecimal, just compare one BigDecimal with another BigDecimal. (Even when both operands are floats, testing for equality requires great care or a fuzzy test. Also, don't trust every formatted digit: output formatting will carry remainders way off the right side of the fraction, so you don't generally start seeing zeroes, you will just see garbage values.)
*The problem: machine numbers are x/2n, but decimal constants are x/(2n * 5m). Your value as sign, exponent, and mantissa is the infinitely repeating 0 10000001001 1101010100110011001100110011001100110011001100110011... Ironically, FP arithmetic is perfectly precise and equality comparisons work perfectly well when the value has no fraction.
as David said, BigDecimal is storing it right
p (BigDecimal('1876.8') * 100000000000000).to_i
returns 187680000000000000
so, yes, the string formatting is ruining it
If you don't need fractional cents, consider storing and manipulating the currency as an integer, then dividing by 100 when it's time to display. I find that easier than dealing with the inevitable precision issues of storing and manipulating in floating point.
On Mac OS X, I'm running ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]
irb(main):004:0> 1876.8 == BigDecimal('1876.8') => true
However, being Ruby, I think you should think in terms of messages sent to objects. What does this return to you:
BigDecimal('1876.8') == 1876.8
The two aren't equivalent, and if you're trying to use BigDecimal's ability to determine precise decimal equality, it should be the receiver of the message asking about the equality.
For the same reason I don't think formatting the BigDecimal by sending a format message to the format string is the right approach either.