In Go/Golang I have a variable of type big.Float with an (arbitrary) precision of 3,324,000 bits to represent a decimal number of 1,000,000 digits. It's the result of an iteration to calculate pi.
Now I want to print out the least significant 100 digits, i.e. digits 999,900 to 1,000,000.
I tried to convert the variable to a string using fmt.Sprintf() and big.Float.Text(). However, both functions consume a lot of processing time, which becomes unacceptable (many hours or even days) when the precision is raised further.
I'm searching for a function that extracts the last 100 (decimal) digits of the variable.
Thanks in advance for your kind support.
The standard library doesn't provide a function to return those digits efficiently, but you can calculate them.
It is more efficient to isolate the digits you are interested in and print only those. This avoids the excessive work of formatting an extremely large number just to determine each individual digit.
The code below shows a way it can be done. You will need to ensure you have enough precision to generate them accurately.
package main

import (
	"fmt"
	"math"
	"math/big"
)

func main() {
	// Replace with larger calculation.
	pi := big.NewFloat(math.Pi)

	const (
		// Pi: 3.1415926535897932...
		// Output: 5926535897
		digitOffset = 3
		digitLength = 10
	)

	// Move the desired digits to the right side of the decimal point.
	mult := pow(10, digitOffset)
	digits := new(big.Float).Mul(pi, mult)

	// Remove the integer component.
	digits.Sub(digits, trunc(digits))

	// Move the digits to the left of the decimal point, and truncate
	// to an integer representing the desired digits.
	// This avoids undesirable rounding if you simply print the N
	// digits after the decimal point.
	mult = pow(10, digitLength)
	digits.Mul(digits, mult)
	digits = trunc(digits)

	// Display the next 'digitLength' digits. Zero padded.
	fmt.Printf("%0*.0f\n", digitLength, digits)
}

// trunc returns the integer component.
func trunc(n *big.Float) *big.Float {
	intPart, _ := n.Int(nil)
	return new(big.Float).SetInt(intPart)
}

// pow calculates n^idx.
func pow(n, idx int64) *big.Float {
	if idx < 0 {
		panic("invalid negative exponent")
	}
	result := new(big.Int).Exp(big.NewInt(n), big.NewInt(idx), nil)
	return new(big.Float).SetInt(result)
}
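As a rough check on "enough precision": each decimal digit costs about log2(10) ≈ 3.32 bits of mantissa, so you can size the big.Float precision up front. This is only a back-of-the-envelope sketch; the guard value below is an arbitrary safety margin, not a derived bound (your 3,324,000 bits for 1,000,000 digits is in the same ballpark).
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		digitOffset = 999900 // first digit position of interest
		digitLength = 100    // how many digits to extract
		guard       = 64     // extra bits as a safety margin (assumption, not a derived bound)
	)
	// Each decimal digit needs about log2(10) ≈ 3.32 bits of precision.
	bits := uint(math.Ceil(float64(digitOffset+digitLength)*math.Log2(10))) + guard
	fmt.Println("suggested big.Float precision in bits:", bits) // roughly 3.32 million bits
}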
Somehow I happened to be looking at the Go source code to see how it implements the random function when it is passed the length of an array.
Here's the calling code
func randomFormat() string {
	formats := []string{
		"Hi, %v. Welcome!",
		"Great to see you, %v!",
		"Hail, %v! Well met!",
	}
	return formats[rand.Intn(len(formats))]
}
Go Source code: main part
func (r *Rand) Intn(n int) int {
	if n <= 0 {
		panic("invalid argument to Intn")
	}
	if n <= 1<<31-1 {
		return int(r.Int31n(int32(n)))
	}
	return int(r.Int63n(int64(n)))
}
Go source code: reference part (most devs already have this on their machines, in the Go repo).
// Int31n returns, as an int32, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int31n(n int32) int32 {
	if n <= 0 {
		panic("invalid argument to Int31n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int31() & (n - 1)
	}
	max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
	v := r.Int31()
	for v > max {
		v = r.Int31()
	}
	return v % n
}

// Int63n returns, as an int64, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int63n(n int64) int64 {
	if n <= 0 {
		panic("invalid argument to Int63n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int63() & (n - 1)
	}
	max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
	v := r.Int63()
	for v > max {
		v = r.Int63()
	}
	return v % n
}

func (r *Rand) Int31() int32 { return int32(r.Int63() >> 32) }

func (r *Rand) Int63() int64 { return r.src.Int63() }

type Source interface {
	Int63() int64
	Seed(seed int64)
}
I want to understand how the random function works, including all the inner functions it calls. I am overwhelmed by the code; if someone were to lay the steps out in plain English, what would they be?
For example, I don't get the logic behind the minus 1 in
if n <= 1<<31-1
Then, I can't make head or tail of the Int31n function:
if n&(n-1) == 0 { // n is power of two, can mask
	return r.Int31() & (n - 1)
}
max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
v := r.Int31()
for v > max {
	v = r.Int31()
}
return v % n
This is more of a question about algorithms than it is about Go, but there are some Go parts. In any case I'll start with the algorithm issues.
Shrinking the range of a uniform random number generator
Suppose that we have a uniform-distribution random number generator that returns a number between, say, 0 and 7 inclusive. That is, it will, over time, return about the same number of 0s, 1s, 2s, ..., 7s, but with no apparent pattern between them.
Now, if we want a uniformly distributed random number between 0 and 7, this thing is perfect. That's what it returns. We just use it. But what if we want a uniformly distributed random number between 0 and 6 instead?
We could write:
func randMod7() int {
	return generate() % 7
}
so that if generate() returns 7 (which it has a 1 out of 8 chance of doing), we convert that value to zero. But then we'll get zero back 2 out of 8 times, instead of 1 out of 8 times. We'll get 1, 2, 3, 4, 5, and 6 back 1 out of 8 times, and zero 2 out of 8 times, on average: once for each actual zero, and once for each 7.
What we need to do, then, is throw away any occurrences of 7:
func randMod7() int {
	for {
		if i := generate(); i < 7 {
			return i
		}
		// oops, got 7, try again
	}
}
Now, if we had a uniform random-number generator named generate() that returned a value between 0 and (say) 11 (12 possible values) and we wanted a value between 0 and 3 (four possible values), we could just use generate() % 4, because the 12 possible results would fall into 3 groups of four with equal probability. If we wanted a value between 0 and 5 inclusive, we could use generate() % 6, because the 12 possible results would fall into two groups of 6 with equal probability. In fact, all we need to do is examine the prime factorization of the range of our uniform number generator to see what moduli work. The factors of 12 are 2, 2, 3; so 2, 3, 4, and 6 all work here. Any other modulus, such as generate() % 10, produces a biased result: 0 and 1 occur 2 out of 12 times, but 2 through 9 occur 1 out of 12 times. (Note: generate() % 12 also works, but is kind of pointless.)
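To make the bias concrete, here is a small throwaway sketch (not from the answer above) that enumerates all 12 equally likely outputs of a hypothetical generate() over [0..11] and counts the residues for % 6 versus % 10:
package main

import "fmt"

func main() {
	count6 := make([]int, 6)
	count10 := make([]int, 10)
	for v := 0; v < 12; v++ { // every equally likely output of generate()
		count6[v%6]++
		count10[v%10]++
	}
	fmt.Println("v % 6 :", count6)  // [2 2 2 2 2 2]         -- every residue equally often
	fmt.Println("v % 10:", count10) // [2 2 1 1 1 1 1 1 1 1] -- 0 and 1 are favored
}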
In our particular case, we have two different uniform random number generators available. One, Int31(), produces values between 0 and 0x7fffffff (2147483647 decimal, or 2³¹ - 1, or 1<<31 - 1) inclusive. The other, Int63(), produces values between 0 and 0x7fffffffffffffff (9223372036854775807, or 2⁶³ - 1, or 1<<63 - 1). These are ranges that hold 2³¹ and 2⁶³ values respectively, and hence their prime factorization is 31 2s, or 63 2s.
What this means is that we can compute Int31() mod 2ᵏ, for any integer k from zero to 31 inclusive, without messing up our uniformity. With Int63(), we can do the same with k ranging all the way up to 63.
Introducing the computer
Now, mathematically-and-computer-ly speaking, given any nonnegative integer n in [0..0x7fffffff] or [0..0x7fffffffffffffff], and a non-negative integer k in the right range (no more than 31 or 63 respectively), computing n mod 2ᵏ produces the same result as doing a bit-mask operation with k bits set. To get that number of set bits, we take 1<<k and subtract 1. If k is, say, 4, we get 1<<4 or 16. Subtracting 1, we get 15, or 0xf, which has four 1 bits in it.
So:
n % (1 << k)
and:
n & (1<<k - 1)
produce the same result. Concretely, when k==4, this is n%16 or n&0xf. When k==5 this is n%32 or n&0x1f. Try it for k==0 and k==63.
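A quick throwaway check of that claim, if you want to see it run (the particular k and n values below are arbitrary):
package main

import "fmt"

func main() {
	for _, k := range []uint{0, 4, 5, 63} {
		for _, n := range []uint64{0, 1, 15, 16, 12345, 1<<40 + 7} {
			mod := n % (1 << k)
			and := n & (1<<k - 1)
			fmt.Printf("k=%2d n=%d: %d %d (equal: %v)\n", k, n, mod, and, mod == and)
		}
	}
}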
Introducing Go-the-language
We're now ready to consider doing all of this in Go. We note that int (plain, unadorned int) is guaranteed to be able to hold values between -2147483648 and +2147483647 (-0x80000000 through +0x7fffffff). It may extend all the way to -0x8000000000000000 through +0x7fffffffffffffff.
Meanwhile, int32 always handles the smaller range and int64 always handles the larger range. The plain int is a different type from these other two, but implements the same range as one of the two. We just don't know which one.
Our Int31 implementation returns a uniformly distributed random number in the 0..0x7fffffff range. (It does this by returning the upper 32 bits of r.Int63(), though this is an implementation detail.) Our Int63 implementation returns a uniformly distributed random number in the 0..0x7fffffffffffffff range.
The Intn function you show here:
func (r *Rand) Intn(n int) int {
	if n <= 0 {
		panic("invalid argument to Intn")
	}
	if n <= 1<<31-1 {
		return int(r.Int31n(int32(n)))
	}
	return int(r.Int63n(int64(n)))
}
just picks one of the two functions, based on the value of n: if it's less than or equal to 0x7fffffff (1<<31 - 1), the result fits in int32, so it uses int32(n) to convert n to int32, calls r.Int31n, and converts the result back to int. Otherwise, the value of n exceeds 0x7fffffff, implying that int has the larger range and we must use the larger-range generator, r.Int63n. The rest is the same except for types.
The code could just do:
return int(r.Int63n(int64(n)))
every time, but on 32-bit machines, where 64-bit arithmetic may be slow, this might be slow. (There's a lot of may and might here and if you were writing this yourself today, you should start by profiling / benchmarking the code. The Go authors did do this, though this was many years ago; at that time it was worth doing this fancy stuff.)
More bit-manipulation
The insides of both functions Int31n and Int63n are quite similar; the main difference is the types involved, and then in a few places, the maximum values. Again, the reason for this is at least partly historical: on some (mostly old now) computers, the Int63n variant is significantly slower than the Int31n variant. (In some non-Go language, we might write these as generics and then have the compiler generate a type-specific version automatically.) So let's just look at the Int63n variant:
func (r *Rand) Int63n(n int64) int64 {
	if n <= 0 {
		panic("invalid argument to Int63n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int63() & (n - 1)
	}
	max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
	v := r.Int63()
	for v > max {
		v = r.Int63()
	}
	return v % n
}
The argument n has type int64, so that its value will not exceed 2⁶³-1 or 0x7fffffffffffffff or 9223372036854775807. But it could be negative, and negative values won't work right, so the first thing we do is test for that and panic if so. We also panic if the input is zero (this is something of a choice, but it's useful to note it now).
Next we have the n&(n-1) == 0 test. This is a test for powers of two, with one slight flaw, and it works in many languages (those that have bit-masking):
A power of two is always represented as a single set bit in the binary representation of a number. For instance, 2 itself is 00000010₂, 4 is 00000100₂, 8 is 00001000₂, and so on, through 128 being 10000000₂. (Since I only "drew" eight bits, this series maxes out at 128.)
Subtracting 1 from such a number causes a borrow: that bit goes to zero, and all the lesser bits become 1. For instance, 10000000₂ - 1 is 01111111₂.
AND-ing these two together produces zero if there was just the single bit set initially. If not (for instance, if we have the value 130 or 10000010₂ initially, subtracting 1 produces 10000001₂), there's no borrow out of the top bit, so the top bit is set in both inputs and therefore is set in the AND-ed result.
The slight flaw is that if the initial value is zero, then we have 0-1, which produces all-1s; 0&0xffffffffffffffff is zero too, but zero is not an integer power of two. (2⁰ is 1, not 0.) This minor flaw is not important for our purpose here, because we already made sure to panic for this case: it just doesn't happen.
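As an aside, here is a tiny sketch of that test in action, including the zero corner case (this is illustration only, not code from the rand package):
package main

import "fmt"

// isPow2 applies the n&(n-1) == 0 test described above.
func isPow2(n int64) bool { return n&(n-1) == 0 }

func main() {
	for _, n := range []int64{1, 2, 3, 4, 8, 128, 130, 0} {
		fmt.Printf("%4d -> %v\n", n, isPow2(n))
	}
	// Note: 0 reports true here even though it is not a power of two;
	// Int63n never reaches this case because it panics for n <= 0.
}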
Now we have the most complicated line of all:
max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
The recurring 63s here are because we have a value range going from zero to 2⁶³-1. 1<<63 - 1 is (still, again, always) 9223372036854775807 or 0x7fffffffffffffff. Meanwhile, 1<<63, without 1 subtracted from it, is 9223372036854775808 or 0x8000000000000000. This value does not fit into int64 but it does fit into uint64. So if we turn n into a uint64, we can compute uint64(9223372036854775808) % uint64(n), which is what the % expression does. By using uint64 for this calculation, we ensure that it doesn't overflow.
But: what is this calculation all about? Well, go back to our example with a generate() that produces values in [0..7]. When we wanted a number in [0..5], we had to discard both 6 and 7. That's what we're going for here: we want to find the value above which we should discard values.
If we were to take 8%6, we'd get 2. 8 is one bigger than the maximum that our 3-bit generate() would generate. 8%6 == 2 is the number of "high values" that we have to discard: 8-2 = 6 and we want to discard values that are 6 or more. Subtract 1 from this, and we get 7-2 = 5; we can accept numbers in this input range, from 0 to 5 inclusive.
So, this somewhat fancy calculation for setting max is just a way to find out what the maximum value we like is. Values that are greater than max need to be tossed out.
This particular calculation works nicely even if n is much less than our generator returns. For instance, suppose we had a four-bit generator, returning values in the [0..15] range, and we wanted a number in [0..2]. Our n is therefore 3 (to indicate that we want a number in [0..2]). We compute 16%3 to get 1. We then take 15 (one less than our maximum output value) - 1 to get 14 as our maximum acceptable value. That is, we would allow numbers in [0..14], but exclude 15.
With a 63-bit generator returning values in [0..9223372036854775807], and n==3, we would set max to 9223372036854775805. That's what we want: it throws out the two biasing values, 9223372036854775806 and 9223372036854775807.
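Here is a small verification sketch of the max formula for both worked examples, the 4-bit toy generator and the real 63-bit one (again, not code from the rand package):
package main

import "fmt"

func main() {
	// 4-bit generator (values 0..15), n = 3: 15 - 16%3 should be 14.
	var n4 uint64 = 3
	max4 := uint64(1<<4) - 1 - (1<<4)%n4
	fmt.Println(max4) // 14

	// 63-bit generator, n = 3: should be 9223372036854775805.
	var n63 uint64 = 3
	max63 := uint64(1<<63) - 1 - (1<<63)%n63
	fmt.Println(max63) // 9223372036854775805
}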
The remainder of the code simply does that:
v := r.Int63()
for v > max {
	v = r.Int63()
}
return v % n
We pick one Int63-range number. If it exceeds max, we pick another one and check again, until we pick one that is in the [0..max] range, inclusive of max.
Once we get a number that is in range, we use % n to shrink the range if needed. For instance, if the range is [0..2], we use v % 3. If v is (say) 14, 14%3 is 2. Our actual max is, again, 9223372036854775805, and whatever v is, between 0 and that, v%3 is between 0 and 2 and remains uniformly distributed, with no slight bias to 0 and 1 (9223372036854775806 would give us that one extra 0, and 9223372036854775807 would give us that one extra 1).
(Now repeat the above for int32 and 31 and 1<<31, for the Int31n function.)
I want a function getNthFloat(n uint32) float32 such that for each n,m < 2³²-4 with n≠m, getNthFloat(n) and getNthFloat(m) return distinct floats that are real numbers (neither NaN nor ±∞). 2³²-4 is chosen because, if I understand IEEE 754 correctly, there are two binary representations of NaN, one for ∞ and one for -∞.
I imagine I should convert my uint32 into bits and convert bits into float32, but I can't figure out how to avoid the four values efficiently.
You can't get 2^32-4 valid floating point numbers in a float32. IEEE 754 binary32 numbers have two infinities (negative and positive) and 2^24-2 possible NaN values.
A 32 bit floating point number has the following bits:
bit:  31     30...23    22...0
      sign   exponent   mantissa
All exponents with the value 0xff are either infinity (when mantissa is 0) or NaN (when mantissa isn't 0). So you can't generate those exponents.
Then it's just a simple matter of mapping your allowed integers into this format and then using math.Float32frombits to generate a float32. How you do that is your choice. I'd probably be lazy and just use the lowest bit for the sign, reject all numbers higher than 2^32 - 2^24 - 1, and then shift the bits around.
So something like this (untested):
func foo(n uint32) float32 {
	if n >= 0xff000000 {
		panic("xxx")
	}
	return math.Float32frombits((n&1)<<31 | (n >> 1))
}
N.B. I'd probably also avoid denormal numbers, that is, numbers with the exponent 0 and a non-zero mantissa. They can be slow and might not be handled correctly. For example, they could all be mapped to zero; there's nothing in the Go spec that talks about how denormal numbers are handled, so I'd be careful.
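For what it's worth, here is a quick sanity check of the sketch above; it only confirms that a few sample inputs map to finite, non-NaN values, and it does not prove that all the resulting float values are pairwise distinct:
package main

import (
	"fmt"
	"math"
)

// foo is copied from the sketch above.
func foo(n uint32) float32 {
	if n >= 0xff000000 {
		panic("xxx")
	}
	return math.Float32frombits((n&1)<<31 | (n >> 1))
}

func main() {
	for _, n := range []uint32{0, 1, 2, 3, 1000, 0xfeffffff} {
		f := foo(n)
		fmt.Printf("%d -> %g (NaN: %v, Inf: %v)\n",
			n, f, math.IsNaN(float64(f)), math.IsInf(float64(f), 0))
	}
}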
I think you are looking for the math.Float32frombits function: https://golang.org/pkg/math/#Float32frombits
Imagine that, for printing in a fixed-width table with 12-character columns, we need to print float64 numbers:
fmt.Printf("%12.6g\n", 9.405090880450127e+119) //"9.40509e+119"
fmt.Printf("%12.6g\n", 0.1234567890123) //" 0.123457"
fmt.Printf("%12.6g\n", 123456789012.0) //" 1.23457e+11"
We would prefer 0.1234567890 to " 0.123457", where we lose 6 significant digits.
We would prefer 123456789012 to " 1.23457e+11", where we lose 6 significant digits.
Is there anything in the standard library to convert a float64 to a string with a fixed width and the maximum number of significant digits?
Thanks in advance.
Basically you have 2 output formats: either a scientific notation or a regular form. The turning point between those 2 formats is 1e12.
So you can branch on x >= 1e12. In both branches you can do a first formatting with 0 fraction digits to see how long the number will be, calculate how many fraction digits fit into the 12-character width, and then construct the final format string using the calculated precision.
The pre-check is required for the scientific notation too (%g), because the width of the exponent may vary (e.g. e+1, e+10, e+100).
Here is an example implementation. This is to get you started, but it does not mean to handle all cases, and it is not the most efficient solution (but relatively simple and does the job):
// format12 formats x to be 12 chars long.
func format12(x float64) string {
	if x >= 1e12 {
		// Check to see how many fraction digits fit in:
		s := fmt.Sprintf("%.g", x)
		format := fmt.Sprintf("%%12.%dg", 12-len(s))
		return fmt.Sprintf(format, x)
	}
	// Check to see how many fraction digits fit in:
	s := fmt.Sprintf("%.0f", x)
	if len(s) == 12 {
		return s
	}
	format := fmt.Sprintf("%%%d.%df", len(s), 12-len(s)-1)
	return fmt.Sprintf(format, x)
}
Testing it:
fs := []float64{0, 1234.567890123, 0.1234567890123, 123456789012.0, 1234567890123.0,
	9.405090880450127e+9, 9.405090880450127e+19, 9.405090880450127e+119}
for _, f := range fs {
	fmt.Println(format12(f))
}
Output (try it on the Go Playground):
0.0000000000
1234.5678901
0.1234567890
123456789012
1.234568e+12
9405090880.5
9.405091e+19
9.40509e+119
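Not part of the original answer, but if other column widths are needed, the same branching idea can be parameterized. A rough, hypothetical sketch (formatWidth is an invented name, and it inherits the same unhandled edge cases as format12):
package main

import (
	"fmt"
	"math"
)

// formatWidth is a hypothetical generalization of format12 to an arbitrary
// width, following the same branching idea; it assumes a non-negative x and
// a width large enough to hold the value at all.
func formatWidth(x float64, width int) string {
	if x >= math.Pow(10, float64(width)) {
		// Scientific notation: measure the bare exponent form first.
		s := fmt.Sprintf("%.g", x)
		format := fmt.Sprintf("%%%d.%dg", width, width-len(s))
		return fmt.Sprintf(format, x)
	}
	// Regular form: measure the integer part first.
	s := fmt.Sprintf("%.0f", x)
	if len(s) >= width {
		return s
	}
	format := fmt.Sprintf("%%%d.%df", len(s), width-len(s)-1)
	return fmt.Sprintf(format, x)
}

func main() {
	fmt.Println(formatWidth(0.1234567890123, 10)) // 0.12345679
	fmt.Println(formatWidth(123456789012.0, 15))  // 123456789012.00
}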
I am trying to get some floats formatted with the same width using fmt.Printf().
For example, given the float values 0.0606060606060606, 0.3333333333333333, 0.05, 0.4 and 0.1818181818181818, I would like to get each value formatted in, say, 10 runes:
0.06060606
0.33333333
0.05
0.4
0.18181818
But I can't understand how it's done. The documentation says that
For floating-point values, width sets the minimum width of the field
and precision sets the number of places after the decimal, if
appropriate, except that for %g/%G it sets the total number of digits.
For example, given 123.45 the format %6.2f prints 123.45 while %.4g
prints 123.5. The default precision for %e and %f is 6; for %g it is
the smallest number of digits necessary to identify the value
uniquely.
So, if I use %f, a larger number will not fit within the 10-character constraint, therefore %g is required. To get a minimum width of 10 it's %10g, and to get a maximum of 9 digits (+1 for the dot) it's %.9g, but combining them in %10.9g does not behave as I expect:
0.0606060606
0.333333333
      0.05
       0.4
0.181818182
How come some of the strings I get are 10 runes, others 11 runes, and others 12 runes?
In particular, it seems that %.9g does not produce 9 digits in total. See for example: http://play.golang.org/p/ie9k8bYC7r
Firstly, we need to understand the documentation correctly:
width sets the minimum width of the field and precision sets the number of places after the decimal, if appropriate, except that for %g/%G it sets the total number of digits.
This sentence is grammatically correct, but the "it" in the last part is really confusing: it actually refers to the precision, not the width.
Therefore, let's look at some examples:
123.45
12312.2
1.6069
0.6069
0.0006069
and print them with fmt.Printf("%.4g\n", x); it gives you
123.5
1.231e+04
1.607
0.6069
0.0006069
only 4 digits, excluding the decimal point and exponent. But wait, what happens with the last two examples? Aren't those more than 4 digits?
This is the confusing part of the printing: leading 0s won't be counted as significant digits, and the value won't be shrunk into exponent form as long as there are fewer than 4 leading zeros after the decimal point.
Let's look at 0 behavior using the example below:
package main
import "fmt"
func main() {
fmt.Printf("%.4g\n", 0.12345)
fmt.Printf("%.4g\n", 0.012345)
fmt.Printf("%.4g\n", 0.0012345)
fmt.Printf("%.4g\n", 0.00012345)
fmt.Printf("%.4g\n", 0.000012345)
fmt.Printf("%.4g\n", 0.0000012345)
fmt.Printf("%.4g\n", 0.00000012345)
fmt.Printf("%g\n", 0.12345)
fmt.Printf("%g\n", 0.012345)
fmt.Printf("%g\n", 0.0012345)
fmt.Printf("%g\n", 0.00012345)
fmt.Printf("%g\n", 0.000012345)
fmt.Printf("%g\n", 0.0000012345)
fmt.Printf("%g\n", 0.00000012345)
}
and the output:
0.1235
0.01235
0.001234
0.0001234
1.234e-05
1.234e-06
1.235e-07
0.12345
0.012345
0.0012345
0.00012345
1.2345e-05
1.2345e-06
1.2345e-07
So you can see: when there are fewer than 4 leading 0s after the decimal point, they are kept and printed as-is; with 4 or more, the value is shrunk into exponent form.
Ok, the next thing is the width. From the documentation, width only specifies the minimum width, including the decimal point and exponent. Which means, if you have more digits than the width specifies, the output will simply exceed the width.
Remember, width is taken into account as the last step, which means the precision field has to be satisfied first.
Let's go back to your case. You specified %10.9g, which means you want a total of 9 digits, excluding leading 0s, and a minimum width of 10 including the decimal point and exponent, and the precision takes priority.
0.0606060606060606: taking 9 digits, not counting the leading 0s, gives you 0.0606060606; since that's already 12 wide, it already satisfies the minimum width of 10;
0.3333333333333333: taking 9 digits gives you 0.333333333; since that's already 11 wide, it already satisfies the minimum width of 10;
0.05: taking 9 digits gives you 0.05; since that's less than 10 wide, it is padded with another 6 characters to reach a width of 10;
0.4: same as above;
0.1818181818181818: taking 9 digits gives you 0.181818182, with rounding; since that's already 11 wide, it already satisfies the minimum width of 10.
So this explains why you got the funny printing.
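If you want to see those widths directly, here is a quick sketch that just reproduces the values quoted in the question:
package main

import "fmt"

func main() {
	values := []float64{0.0606060606060606, 0.3333333333333333, 0.05, 0.4, 0.1818181818181818}
	for _, v := range values {
		s := fmt.Sprintf("%10.9g", v)
		// len(s) equals the rune count here, since the output is all ASCII.
		fmt.Printf("%q -> %d runes\n", s, len(s))
	}
}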
Yes, I agree: it gives precedence to the "precision" field, not to the "width".
So when we need fixed-width columns for printing, we need to write a new formatting function.