If I run the following piece of Go code:
fmt.Println(float32(0.1) + float32(0.2))
fmt.Println(float64(0.1) + float64(0.2))
the output is:
0.3
0.30000000000000004
It appears the result of the float32 sum is more exact than the result of the float64 sum. Why? I thought float64 was always more precise than float32. How do I decide which one to pick to get the most accurate result?
It isn't. fmt.Println is just making it look more precise. Println uses %g for floating point and complex numbers. The docs say...
The default precision for... %g it is the smallest number of digits necessary to identify the value uniquely.
0.3 is sufficient to identify this float32 value uniquely. A float64, being much more precise, needs more digits.
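A minimal sketch of that "smallest number of digits" rule, using strconv.FormatFloat with precision -1, which requests the shortest digits that identify the value at the given bit size. The helper name shortest is made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// shortest (hypothetical helper) returns the shortest decimal string that
// parses back to f when f is treated as a value of the given bit size.
func shortest(f float64, bitSize int) string {
	return strconv.FormatFloat(f, 'g', -1, bitSize)
}

func main() {
	a, b := float32(0.1), float32(0.2)
	sum := float64(a + b) // the float32 sum, widened without changing its value

	// Digits needed to tell it apart from neighboring float32 values.
	fmt.Println(shortest(sum, 32))
	// Digits needed to tell the very same value apart from neighboring float64 values.
	fmt.Println(shortest(sum, 64))
}
```

The same stored number needs far more digits once it has to be distinguished from the much denser float64 neighbors.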
We can use fmt.Printf and %0.20g to force both numbers to display the same precision.
f32 := float32(0.1) + float32(0.2)
f64 := float64(0.1) + float64(0.2)
fmt.Printf("%0.20g\n", f32)
fmt.Printf("%0.20g\n", f64)
0.30000001192092895508
0.30000000000000004441
float64 is more precise. Neither is exact; that is the nature of floating point numbers.
We can use strconv.FormatFloat to see what these numbers really are.
fmt.Println(strconv.FormatFloat(float64(f32), 'b', -1, 32))
fmt.Println(strconv.FormatFloat(f64, 'b', -1, 64))
10066330p-25
5404319552844596p-54
That is 10066330 * 2^-25 and 5404319552844596 * 2^-54.
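As a quick check of that reading, the sketch below rebuilds both values from their mantissa and exponent with math.Ldexp and compares them to the sums. The function name fromBinary is an assumption made for this example.

```go
package main

import (
	"fmt"
	"math"
)

// fromBinary (hypothetical helper) rebuilds mantissa * 2^exp as a float64.
// Both mantissas used here are below 2^53, so the conversion is exact.
func fromBinary(mantissa int64, exp int) float64 {
	return math.Ldexp(float64(mantissa), exp)
}

func main() {
	a32, b32 := float32(0.1), float32(0.2)
	a64, b64 := 0.1, 0.2

	// Each comparison checks that the 'b' format output describes the stored value.
	fmt.Println(fromBinary(10066330, -25) == float64(a32+b32))
	fmt.Println(fromBinary(5404319552844596, -54) == a64+b64)
}
```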
Related
In Go, it seems that when a float64 variable is first converted to float32 and then converted back to float64, its value changes.
a := -8888.95
fmt.Println(a) // -8888.95
fmt.Println(float32(a)) // -8888.95
fmt.Println(float64(float32(a))) // -8888.9501953125
How can I keep the value unchanged?
The way you have described the problem is perhaps misleading.
The precision is not lost "when converting float32 to float64"; rather, it is lost when converting from float64 to float32.
So how can you avoid losing precision when converting from float64 to float32? You can't. This task is impossible, and it's quite easy to see the reason why:
float64 has twice as many bits as float32
multiple different float64 values will map to the same float32 value due to the pigeonhole principle
the conversion is therefore not reversible.
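The pigeonhole argument can be seen directly. This is a small sketch, assuming math.Nextafter is an acceptable way to get a neighboring float64; the helper name neighbor is made up.

```go
package main

import (
	"fmt"
	"math"
)

// neighbor (hypothetical helper) returns the next representable float64 above x.
func neighbor(x float64) float64 {
	return math.Nextafter(x, math.Inf(1))
}

func main() {
	a := -8888.95
	b := neighbor(a)

	fmt.Println(a == b)                   // two distinct float64 values...
	fmt.Println(float32(a) == float32(b)) // ...that land on the same float32
	fmt.Println(float64(float32(a)) == a) // so the round trip cannot restore a
}
```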
Adjusting your program to show a more precise output of each value, you'll see exactly where the precision is lost:
package main

import (
	"fmt"
)

func main() {
	a := -8888.95
	fmt.Printf("%.20f\n", a)
	fmt.Printf("%.20f\n", float32(a))
	fmt.Printf("%.20f\n", float64(float32(a)))
}
Output:
-8888.95000000000072759576
-8888.95019531250000000000
-8888.95019531250000000000
That is, after the float32 conversion (as is expected).
It's also worth noting that neither float64 nor float32 can represent your value -8888.95 exactly. If you convert this number to a fraction, you will get -177779/20. Notice the denominator, 20. The prime factorization of 20 is 2 * 2 * 5.
If you reduce a number to a fraction in lowest terms and the prime factorization of the denominator contains any factor other than 2, then you can rest assured that this number is not representable exactly in binary floating point form. You may discover that the probability of any given decimal number passing this test is quite low.
If you parse a string into a big.Float with f.SetString("0.001") and then multiply it, I'm seeing a loss of precision. If I use f.SetFloat64(0.001), I don't lose precision. Even doing a strconv.ParseFloat("0.001", 64) and then calling f.SetFloat64() works.
Full example of what I'm seeing here:
https://play.golang.org/p/_AyTHJJBUeL
Expanded from this question: https://stackoverflow.com/a/47546136/105562
The difference in output is due to imprecise representation of base 10 floating point numbers in float64 (IEEE-754 format) and the default precision and rounding of big.Float.
See this simple code to verify:
fmt.Printf("%.30f\n", 0.001)
f, ok := new(big.Float).SetString("0.001")
fmt.Println(f.Prec(), ok)
Output of the above (try it on the Go Playground):
0.001000000000000000020816681712
64 true
So what we see is that the float64 value 0.001 is not exactly 0.001, and the default precision of big.Float is 64.
If you increase the precision of the number you set via a string value, you will see the same output:
s := "0.001"
f := new(big.Float)
f.SetPrec(100)
f.SetString(s)
fmt.Println(s)
fmt.Println(BigFloatToBigInt(f))
Now output will also be the same (try it on the Go Playground):
0.001
1000000000000000
As long as floating point is used, 0.1 cannot be represented exactly in memory, so we know that 0.1 + 0.2 usually comes out to 0.30000000000000004.
But when I add 0.1 and 0.2 in Go, I get 0.3.
fmt.Println(0.1 + 0.2)
// Output : 0.3
Why is 0.3 coming out instead of 0.30000000000000004 ?
It is because when you print it (e.g. with the fmt package), the printing function already rounds to a certain amount of fraction digits.
See this example:
const ca, cb = 0.1, 0.2
fmt.Println(ca + cb)
fmt.Printf("%.20f\n", ca+cb)
var a, b float64 = 0.1, 0.2
fmt.Println(a + b)
fmt.Printf("%.20f\n", a+b)
Output (try it on the Go Playground):
0.3
0.29999999999999998890
0.30000000000000004
0.30000000000000004441
First we used constants because that's different than using (non-constant) values of type float64. Numeric constants represent exact values of arbitrary precision and do not overflow.
But to print the result of ca+cb, the constant value has to be converted to a non-constant, typed value so it can be passed to fmt.Println(). This value will be of type float64, which cannot represent 0.3 exactly. fmt.Println() prints the shortest decimal that uniquely identifies that float64, which is 0.3. But when we explicitly ask for 20 fraction digits, we see it's not exact. Note that only 0.3 is converted to float64, because the constant arithmetic 0.1+0.2 is evaluated by the compiler (at compile time).
Next we started with variables of type float64, and to no surprise, the output wasn't exactly 0.3, but this time even the default formatting gave a result different from 0.3. The reason is that in the first case (constants) it was 0.3 that was converted, but this time both 0.1 and 0.2 were converted to float64, neither of which is exact, and adding them resulted in a number farther from 0.3, far enough to show up even with the default formatting of the fmt package.
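The difference between the two cases can also be checked with plain comparisons. A short sketch; the names exactSum and runtimeSum are made up for illustration.

```go
package main

import "fmt"

// exactSum is untyped constant arithmetic, evaluated exactly by the compiler.
const exactSum = 0.1 + 0.2

// runtimeSum (hypothetical helper) adds two float64 values at run time,
// after each argument has already been rounded to float64.
func runtimeSum(a, b float64) float64 {
	return a + b
}

func main() {
	fmt.Println(exactSum == 0.3)             // compared as exact constants
	fmt.Println(runtimeSum(0.1, 0.2) == 0.3) // compared as rounded float64 values
}
```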
Check out similar / relevant questions+answers to know more about the topic:
Why do these two float64s have different values?
How does Go perform arithmetic on constants?
Golang converting float64 to int error
Does go compiler's evaluation differ for constant expression and other expression
Why does adding 0.1 multiple times remain lossless?
Golang Round to Nearest 0.05
Go: Converting float64 to int with multiplier
What is the correct way to store and do arithmetic on currency in Go? There doesn't seem to be a corresponding decimal type and using floats is a big no.
I'd say a way to go is to store amounts of money using properly sized integer type, normalized to the lowest possible amount. Say, if you need to store amounts in US dollars down to one cent, multiply your values by 100 and hence store them in full cents.
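A small sketch of why whole cents are safer than float dollars: ten additions of ten cents. The function names here are assumptions for the example, and formatCents only handles non-negative amounts.

```go
package main

import "fmt"

// addDimesFloat (hypothetical helper) adds $0.10 n times using float64 dollars.
func addDimesFloat(n int) float64 {
	var total float64
	for i := 0; i < n; i++ {
		total += 0.10
	}
	return total
}

// addDimesCents (hypothetical helper) adds 10 cents n times using integer cents.
func addDimesCents(n int) int64 {
	var total int64
	for i := 0; i < n; i++ {
		total += 10
	}
	return total
}

// formatCents (hypothetical helper) renders a non-negative number of cents as dollars.
func formatCents(c int64) string {
	return fmt.Sprintf("$%d.%02d", c/100, c%100)
}

func main() {
	fmt.Println(addDimesFloat(10) == 1.0) // float rounding error accumulates
	fmt.Println(addDimesCents(10) == 100) // integer arithmetic stays exact
	fmt.Println(formatCents(addDimesCents(10)))
}
```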
Another way is to implement a custom type which would model what is "decimal" in some other languages, that is, it would use two integer numbers to represent amount of money.
This seems like a great opportunity to create a type, which stores the value in a safe and precise integer-based way, but gives you extra behavior you'd want from a decimal type. For instance, a quick implementation might look like this (https://play.golang.org/p/nYbLiadQOc):
// USD represents US dollar amount in terms of cents
type USD int64

// ToUSD converts a float64 to USD
// e.g. 1.23 to $1.23, 1.345 to $1.35
// math.Round (requires import "math") also rounds negative amounts correctly.
func ToUSD(f float64) USD {
	return USD(math.Round(f * 100))
}

// Float64 converts a USD to float64
func (m USD) Float64() float64 {
	return float64(m) / 100
}

// Multiply safely multiplies a USD value by a float64, rounding
// to the nearest cent.
func (m USD) Multiply(f float64) USD {
	return USD(math.Round(float64(m) * f))
}

// String returns a formatted USD value
func (m USD) String() string {
	return fmt.Sprintf("$%.2f", m.Float64())
}
The given type behaves the way one might expect, especially given tricky use-cases.
fmt.Println("Product costs $9.09. Tax is 9.75%.")
f := 9.09
t := 0.0975
ft := f * t
fmt.Printf("Floats: %.18f * %.18f = %.18f\n", f, t, ft)
u := ToUSD(9.09)
ut := u.Multiply(t)
fmt.Printf("USD: %v * %v = %v\n", u, t, ut)
Product costs $9.09. Tax is 9.75%.
Floats: 9.089999999999999858 * 0.097500000000000003 = 0.886275000000000035
USD: $9.09 * 0.0975 = $0.89
Rational numbers are quite a good solution for representing money values. That is, a type that has a numerator and a denominator.
Often monetary data structures are overly complex - Java's BigDecimal being an example. A more mathematically-consistent approach is to define a type that handles rational numbers. When 64bit integers are used, a huge range of numbers can be accurately and efficiently represented. Errors and rounding issues are less of a problem than for any solution that needs to convert binary fractions to/from decimal fractions.
Edit: The Go standard library includes arbitrary-precision integers and rational numbers. The Rat type will work well for currency, especially for those cases that require arbitrary precision, e.g. foreign exchange. Here's an example.
Edit 2: I have used the decimal.Decimal Shopspring package extensively. Under the hood, this combines big.Int with an exponent to provide a fixed-point decimal with a nearly-unlimited range of values. The Decimal type is a rational number where the denominator is always a power of ten, which works very well in practice.
There are actually a few packages implementing a decimal type, though there's no clear leader among them.
Hi, I am new to the Go programming language.
I am learning from http://www.golang-book.com/
In chapter 4, under Exercises, there is a question on converting from Fahrenheit to Centigrade.
I coded up the answer as follows
package main
import "fmt"
func main(){
fmt.Println("Enter temperature in Farentheit ");
var input float64
fmt.Scanf("%f",&input)
var outpu1 float64 = ( ( (input-32)* (5) ) /9)
var outpu2 float64= (input-32) * (5/9)
var outpu3 float64= (input -32) * 5/9
var outpu4 float64= ( (input-32) * (5/9) )
fmt.Println("the temperature in Centigrade is ",outpu1)
fmt.Println("the temperature in Centigrade is ",outpu2)
fmt.Println("the temperature in Centigrade is ",outpu3)
fmt.Println("the temperature in Centigrade is ",outpu4)
}
The output was as follows
sreeprasad:projectsInGo sreeprasad$ go run convertFarentheitToCentigrade.go
Enter temperature in Farentheit
12.234234
the temperature in Centigrade is -10.980981111111111
the temperature in Centigrade is -0
the temperature in Centigrade is -10.980981111111111
the temperature in Centigrade is -0
My question is about outpu2 and outpu4. The parentheses are correct, but how or why does it print -0?
Could anyone please explain?
Quite simply, the expression (5/9) is evaluated as (int(5)/int(9)) which equals 0. Try (5./9)
To clarify why this happens: it comes down to the order in which the types in the expression are determined.
Because (5/9) does not involve input in cases 2 and 4 above, the compiler treats both operands as untyped integer constants, performs integer division at compile time, and replaces the expression with 0. Only then is that 0 converted to float64 to match the other operand.
Generally speaking, Go does not convert numeric types for you implicitly, so this is the explanation that fits.
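The three spellings can be compared in a short sketch. The function names are made up for illustration.

```go
package main

import "fmt"

// truncated (hypothetical helper) uses (5 / 9), an integer constant division
// that the compiler evaluates to 0 before any float64 is involved.
func truncated(f float64) float64 {
	return (f - 32) * (5 / 9)
}

// leftToRight (hypothetical helper) multiplies first, so every step is float64.
func leftToRight(f float64) float64 {
	return (f - 32) * 5 / 9
}

// floatConst (hypothetical helper) makes the constant division a floating point one.
func floatConst(f float64) float64 {
	return (f - 32) * (5.0 / 9)
}

func main() {
	fmt.Println(truncated(212))   // always zero
	fmt.Println(leftToRight(212)) // boiling point in Centigrade
	fmt.Println(floatConst(212))  // same, up to floating point rounding
}
```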
The Go language spec indicates that float32 and float64 are signed floating point numbers that follow the IEEE-754 standard. The following text is quoted from Wikipedia - Signed zero:
The IEEE 754 standard for floating point arithmetic (presently used by most computers and programming languages that support floating point numbers) requires both +0 and −0. The zeroes can be considered as a variant of the extended real number line such that 1/−0 = −∞ and 1/+0 = +∞, division by zero is only undefined for ±0/±0 and ±∞/±∞.
Clearly, input, as a float64, turns into another, negative float64 when 32 is subtracted from it. 5/9 evaluates to 0. A negative float64 multiplied by 0 is -0.
Interestingly, if you replace input with an integer, e.g. 1, you'll get 0 instead of -0. It seems that in Go, floating point numbers have both +0 and -0, but integers don't.
EDIT: PhiLho explains in a comment why floating point numbers have this while integers don't: normalized floating point numbers have special representations of +0, -0, NaN, +Infinity and -Infinity, while you cannot reserve some bit combinations of an integer number to have such meanings.
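The signed zero behavior described above can be observed with math.Signbit. A minimal sketch; the helper names timesZero and isNegZero are assumptions for this example.

```go
package main

import (
	"fmt"
	"math"
)

// timesZero (hypothetical helper) multiplies x by zero at run time.
func timesZero(x float64) float64 {
	return x * 0
}

// isNegZero (hypothetical helper) reports whether x is the IEEE-754 negative zero.
func isNegZero(x float64) bool {
	return x == 0 && math.Signbit(x)
}

func main() {
	z := timesZero(-20)
	fmt.Println(z)            // printed with its sign
	fmt.Println(z == 0)       // still compares equal to zero
	fmt.Println(isNegZero(z)) // but carries the sign bit
	fmt.Println(1 / z)        // which shows up when dividing by it

	n := -20
	fmt.Println(n * 0) // integers have a single zero
}
```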