I have an unsigned 64-bit number representing a mantissa, or fraction (it represents the range [0..1), where 0.0 maps to 0 and 0xffffff.. maps to a number "just before 1.0").
Now I want to split this range into equal buckets and answer: given a random number key, which part of the range does it fall into?
It's easier to see from the following code:
func BucketIndex(key, buckets uint64) uint64 {
	return uint64(float64(key) / (math.Pow(2, 64) / float64(buckets)))
}
My attempt to "hack around this" was to split 2^64 in two: reduce the range to 32 bits, and operate in 64 bits in order to do the math:
// ~=key / ((1 << 64) / buckets)
return ((key >> 32) * buckets) >> 32
but the ranges are no longer equal:
e.g. one third (buckets == 3) will be at 0x5555555600000000, instead of at 0x5555555555555556.
That's a sad story, so I'm asking: do you know of a better method of finding (1 << 64) / buckets?
If buckets is a (compile-time) constant, you may use a constant expression to calculate the bucket size: constants are of arbitrary size. Otherwise, you may use big.Int to calculate it at runtime, and store the result (so you don't have to perform big.Int calculations all the time).
Using a constant expression, at compile-time
To achieve an integer division rounding up, add divisor - 1 to the dividend:
const (
	max        = math.MaxUint64 + 1
	buckets    = 3
	bucketSize = uint64((max + buckets - 1) / buckets)
)
Using big.Int, at runtime
We can use the same logic with big.Int too. An alternative would be to use Int.DivMod() (instead of adding buckets - 1) and, if the mod is greater than zero, increment the result by 1; see the sketch after the code below.
func calcBucketSize(max, buckets *big.Int) uint64 {
	// max holds math.MaxUint64 (2^64 - 1); the true range size is max + 1, so
	// ceil((max+1) / buckets) == (max + buckets) / buckets, matching the constant version.
	max = max.Add(max, buckets)
	return max.Div(max, buckets).Uint64()
}
var bucketSize = calcBucketSize(new(big.Int).SetUint64(math.MaxUint64), big.NewInt(3))
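For illustration, here is a sketch of the Int.DivMod() variant mentioned above, together with a helper that answers the original question. calcBucketSizeDivMod and BucketIndex are names made up for this sketch, and max is assumed to hold the full range size 2^64 (e.g. new(big.Int).Lsh(big.NewInt(1), 64)):
// calcBucketSizeDivMod divides and, if there is a nonzero remainder,
// rounds the quotient up by one (the Int.DivMod() alternative).
func calcBucketSizeDivMod(max, buckets *big.Int) uint64 {
	q, m := new(big.Int), new(big.Int)
	q.DivMod(max, buckets, m)
	if m.Sign() > 0 {
		q.Add(q, big.NewInt(1))
	}
	return q.Uint64()
}

// BucketIndex answers the original question with one integer division.
func BucketIndex(key uint64) uint64 {
	return key / bucketSize
}
With buckets == 3 this yields bucketSize == 0x5555555555555556, so BucketIndex(0x5555555555555556) == 1: the one-third boundary lands exactly where the question wanted it.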
Somehow, I happened to look at the Go source code to see how it implements the random function when passed the length of an array.
Here's the calling code:
func randomFormat() string {
	formats := []string{
		"Hi, %v. Welcome!",
		"Great to see you, %v!",
		"Hail, %v! Well met!",
	}
	return formats[rand.Intn(len(formats))]
}
Go Source code: main part
func (r *Rand) Intn(n int) int {
	if n <= 0 {
		panic("invalid argument to Intn")
	}
	if n <= 1<<31-1 {
		return int(r.Int31n(int32(n)))
	}
	return int(r.Int63n(int64(n)))
}
Go source code: reference part. Most devs already have this on their machines, or in the Go repo.
// Int31n returns, as an int32, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int31n(n int32) int32 {
	if n <= 0 {
		panic("invalid argument to Int31n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int31() & (n - 1)
	}
	max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
	v := r.Int31()
	for v > max {
		v = r.Int31()
	}
	return v % n
}
// Int63n returns, as an int64, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (r *Rand) Int63n(n int64) int64 {
	if n <= 0 {
		panic("invalid argument to Int63n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int63() & (n - 1)
	}
	max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
	v := r.Int63()
	for v > max {
		v = r.Int63()
	}
	return v % n
}
func (r *Rand) Int31() int32 { return int32(r.Int63() >> 32) }
func (r *Rand) Int63() int64 { return r.src.Int63() }
type Source interface {
	Int63() int64
	Seed(seed int64)
}
I want to understand how the random function works, including all the inner functions it encapsulates. I am overwhelmed by the code; if someone were to lay the steps out in plain English, what would they be?
For example, I don't get the logic of the minus 1 in
if n <= 1<<31-1
Then, I can't make head or tail of the Int31n function:
if n&(n-1) == 0 { // n is power of two, can mask
	return r.Int31() & (n - 1)
}
max := int32((1 << 31) - 1 - (1<<31)%uint32(n))
v := r.Int31()
for v > max {
	v = r.Int31()
}
return v % n
This is more of a question about algorithms than it is about Go, but there are some Go parts. In any case I'll start with the algorithm issues.
Shrinking the range of a uniform random number generator
Suppose that we have a uniform-distribution random number generator that returns a number between, say, 0 and 7 inclusive. That is, it will, over time, return about the same number of 0s, 1s, 2s, ..., 7s, but with no apparent pattern between them.
Now, if we want a uniformly distributed random number between 0 and 7, this thing is perfect. That's what it returns. We just use it. But what if we want a uniformly distributed random number between 0 and 6 instead?
We could write:
func randMod7() int {
	return generate() % 7
}
so that if generate() returns 7 (which it has a 1 out of 8 chance of doing), we convert that value to zero. But then we'll get zero back 2 out of 8 times, instead of 1 out of 8 times. We'll get 1, 2, 3, 4, 5, and 6 back 1 out of 8 times, and zero 2 out of 8 times, on average: once for each actual zero, and once for each 7.
What we need to do, then, is throw away any occurrences of 7:
func randMod7() int {
	for {
		if i := generate(); i < 7 {
			return i
		}
		// oops, got 7, try again
	}
}
Now, if we had a uniform-random-number generator named generate() that returned a value between 0 and (say) 11 (12 possible values) and we wanted a value between 0 and 3 (four possible values), we could just use generate() % 4, because the 12 possible results would fall into 3 groups of four with equal probability. If we wanted a value between 0 and 5 inclusive, we could use generate() % 6, because the 12 possible results would fall into two groups of 6 with equal probability. In fact, all we need to do is examine the prime factorization of the range of our uniform number generator to see which moduli work. The factors of 12 are 2, 2, 3; so 2, 3, 4, and 6 all work here. Any other modulus, such as generate() % 10, produces a biased result: 0 and 1 occur 2 out of 12 times, but 2 through 9 occur 1 out of 12 times. (Note: generate() % 12 also works, but is kind of pointless.)
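To make the bias concrete, here is a quick standalone sketch (my addition, not part of the original answer): feed all 12 equally likely outputs of a [0..11] generator through % 10 and count what comes out.
package main

import "fmt"

func main() {
	counts := make([]int, 10)
	for v := 0; v < 12; v++ { // each v in [0..11] is equally likely
		counts[v%10]++
	}
	fmt.Println(counts) // [2 2 1 1 1 1 1 1 1 1]: 0 and 1 occur twice as often
}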
In our particular case, we have two different uniform random number generators available. One, Int31(), produces values between 0 and 0x7fffffff (2147483647 decimal, or 2^31 - 1, or 1<<31 - 1) inclusive. The other, Int63(), produces values between 0 and 0x7fffffffffffffff (9223372036854775807, or 2^63 - 1, or 1<<63 - 1). These are ranges that hold 2^31 and 2^63 values respectively, and hence their prime factorizations are 31 2s and 63 2s.
What this means is that we can compute Int31() mod 2^k, for any integer k from zero to 31 inclusive, without messing up our uniformity. With Int63(), we can do the same with k ranging all the way up to 63.
Introducing the computer
Now, mathematically-and-computer-ly speaking, given any nonnegative integer n in [0..0x7fffffff] or [0..0x7fffffffffffffff], and a non-negative integer k in the right range (no more than 31 or 63 respectively), computing n mod 2^k produces the same result as taking n and doing a bit-mask operation with k bits set. To get that number of set bits, we take 1<<k and subtract 1. If k is, say, 4, we get 1<<4 or 16. Subtracting 1, we get 15, or 0xf, which has four 1 bits in it.
So:
n % (1 << k)
and:
n & (1<<k - 1)
produce the same result. Concretely, when k==4, this is n%16 or n&0xf. When k==5 this is n%32 or n&0x1f. Try it for k==0 and k==63.
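If you want to convince yourself, here is a tiny sketch (mine, for illustration) comparing the two expressions:
package main

import "fmt"

func main() {
	n := uint64(123456789)
	for _, k := range []uint{0, 4, 5, 31, 63} {
		mod := n % (1 << k)
		mask := n & (1<<k - 1)
		fmt.Println(k, mod, mask, mod == mask) // always true
	}
}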
Introducing Go-the-language
We're now ready to consider doing all of this in Go. We note that int (plain, unadorned int) is guaranteed to be able to hold values between -2147483648 and +2147483647 (-0x80000000 through +0x7fffffff). It may extend all the way to -0x8000000000000000 through +0x7fffffffffffffff.
Meanwhile, int32 always handles the smaller range and int64 always handles the larger range. Plain int is a different type from these other two, but covers the same range as one of them. We just don't know which one.
Our Int31 implementation returns a uniformly distributed random number in the 0..0x7fffffff range. (It does this by taking the upper bits of r.Int63(), though this is an implementation detail.) Our Int63 implementation returns a uniformly distributed random number in the 0..0x7fffffffffffffff range.
The Intn function you show here:
func (r *Rand) Intn(n int) int {
	if n <= 0 {
		panic("invalid argument to Intn")
	}
	if n <= 1<<31-1 {
		return int(r.Int31n(int32(n)))
	}
	return int(r.Int63n(int64(n)))
}
just picks one of the two functions, based on the value of n: if it's less than or equal to 0x7fffffff (1<<31 - 1), the result fits in int32, so it uses int32(n) to convert n to int32, calls r.Int31n, and converts the result back to int. Otherwise, the value of n exceeds 0x7fffffff, implying that int has the larger range and we must use the larger-range generator, r.Int63n. The rest is the same except for types.
The code could just do:
return int(r.Int63n(int64(n)))
every time, but on 32-bit machines, where 64-bit arithmetic may be slow, this might be slow. (There's a lot of may and might here and if you were writing this yourself today, you should start by profiling / benchmarking the code. The Go authors did do this, though this was many years ago; at that time it was worth doing this fancy stuff.)
More bit-manipulation
The insides of both functions Int31n and Int63n are quite similar; the main difference is the types involved, and then, in a few places, the maximum values. Again, the reason for this is at least partly historical: on some (mostly old now) computers, the Int63n variant is significantly slower than the Int31n variant. (In some non-Go language, we might write these as generics and then have the compiler generate a type-specific version automatically.) So let's just look at the Int63n variant:
func (r *Rand) Int63n(n int64) int64 {
	if n <= 0 {
		panic("invalid argument to Int63n")
	}
	if n&(n-1) == 0 { // n is power of two, can mask
		return r.Int63() & (n - 1)
	}
	max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
	v := r.Int63()
	for v > max {
		v = r.Int63()
	}
	return v % n
}
The argument n has type int64, so its value will not exceed 2^63-1, or 0x7fffffffffffffff, or 9223372036854775807. But it could be negative, and negative values won't work right, so the first thing we do is test for that and panic if so. We also panic if the input is zero (this is something of a choice, but it's useful to note it now).
Next we have the n&(n-1) == 0 test. This is a test for powers of two, with one slight flaw, and it works in many languages (those that have bit-masking):
A power of two is always represented as a single set bit in the binary representation of a number. For instance, 2 itself is 00000010₂, 4 is 00000100₂, 8 is 00001000₂, and so on, through 128 being 10000000₂. (Since I only "drew" eight bits, this series maxes out at 128.)
Subtracting 1 from that number causes a borrow: that bit goes to zero, and all the lesser bits become 1. For instance, 10000000₂ - 1 is 01111111₂.
AND-ing these two together produces zero if there was just the single bit set initially. If not (for instance, if we have the value 130 or 10000010₂ initially, subtracting 1 produces 10000001₂), there's no borrow out of the top bit, so the top bit is set in both inputs and therefore is set in the AND-ed result.
The slight flaw is that if the initial value is zero, then we have 0-1, which produces all-1s; 0&0xffffffffffffffff is zero too, but zero is not an integer power of two. (2^0 is 1, not 0.) This minor flaw is not important for our purpose here, because we already made sure to panic for this case: it just doesn't happen.
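Wrapped up as a function, the test might look like this (a sketch; the n > 0 guard also covers the zero flaw just described):
// isPowerOfTwo reports whether n is a positive integer power of two.
func isPowerOfTwo(n int64) bool {
	return n > 0 && n&(n-1) == 0 // n > 0 rules out zero and negative values
}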
Now we have the most complicated line of all:
max := int64((1 << 63) - 1 - (1<<63)%uint64(n))
The recurring 63s here are because we have a value range going from zero to 2^63-1. 1<<63 - 1 is (still, again, always) 9223372036854775807 or 0x7fffffffffffffff. Meanwhile, 1<<63, without 1 subtracted from it, is 9223372036854775808 or 0x8000000000000000. This value does not fit into int64, but it does fit into uint64. So if we convert n to a uint64, we can compute uint64(9223372036854775808) % uint64(n), which is what the % expression does. By using uint64 for this calculation, we ensure that it doesn't overflow.
But: what is this calculation all about? Well, go back to our example with a generate() that produces values in [0..7]. When we wanted a number in [0..5], we had to discard both 6 and 7. That's what we're going for here: we want to find the value above which we should discard values.
If we were to take 8%6, we'd get 2. 8 is one bigger than the maximum that our 3-bit generate() would generate. 8%6 == 2 is the number of "high values" that we have to discard: 8-2 = 6 and we want to discard values that are 6 or more. Subtract 1 from this, and we get 7-2 = 5; we can accept numbers in this input range, from 0 to 5 inclusive.
So, this somewhat fancy calculation for setting max is just a way to find out what the maximum value we like is. Values that are greater than max need to be tossed out.
This particular calculation works nicely even if n is much less than our generator returns. For instance, suppose we had a four-bit generator, returning values in the [0..15] range, and we wanted a number in [0..2]. Our n is therefore 3 (to indicate that we want a number in [0..2]). We compute 16%3 to get 1. We then take 15 (one less than our maximum output value) - 1 to get 14 as our maximum acceptable value. That is, we would allow numbers in [0..14], but exclude 15.
With a 63-bit generator returning values in [0..9223372036854775807], and n==3, we would set max to 9223372036854775805. That's what we want: it throws out the two biasing values, 9223372036854775806 and 9223372036854775807.
The remainder of the code simply does that:
v := r.Int63()
for v > max {
	v = r.Int63()
}
return v % n
We pick one Int63-range number. If it exceeds max, we pick another one and check again, until we pick one that is in the [0..max] range, inclusive of max.
Once we get a number that is in range, we use % n to shrink the range if needed. For instance, if the range is [0..2], we use v % 3. If v is (say) 14, 14%3 is 2. Our actual max is, again, 9223372036854775805, and whatever v is, between 0 and that, v%3 is between 0 and 2 and remains uniformly distributed, with no slight bias to 0 and 1 (9223372036854775806 would give us that one extra 0, and 9223372036854775807 would give us that one extra 1).
(Now repeat the above reasoning with int32, 31, and 1<<31, for the Int31n function.)
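To see the whole scheme end to end, here is a toy simulation (my own sketch, shrinking the [0..7] generate() from the earlier example down to six outcomes); each of the six counts comes out near 100000:
package main

import (
	"fmt"
	"math/rand"
)

func generate() int { return rand.Intn(8) } // stands in for a uniform [0..7] source

func randMod6() int {
	max := 7 - 8%6 // = 5: accept only [0..5], discarding the biasing 6 and 7
	for {
		if v := generate(); v <= max {
			return v % 6
		}
	}
}

func main() {
	counts := make([]int, 6)
	for i := 0; i < 600000; i++ {
		counts[randMod6()]++
	}
	fmt.Println(counts) // roughly [100000 100000 100000 100000 100000 100000]
}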
After doing some calculations using big.Float in Go, I am setting the precision to 2.
And even though the number is just a simple 10, after setting the precision it becomes 8.
package main

import (
	"fmt"
	"math/big"
)

func main() {
	cost := big.NewFloat(10)
	fmt.Println("COST NOW", cost)
	perKWh := big.NewFloat(0)
	cost.Add(cost, perKWh)
	fmt.Println("COST ", cost.String())
	perMinute := big.NewFloat(0)
	cost.Add(cost, perMinute)
	fmt.Println("COST ", cost.String())
	discountAmount := big.NewFloat(0)
	cost.Sub(cost, discountAmount)
	floatCos, _ := cost.Float64()
	fmt.Println(fmt.Sprintf("COST FLOAT %v", floatCos))
	cost.SetPrec(2)
	fmt.Println("COST ", cost.String())
}
Check playground example here: https://play.golang.org/p/JmCRXkD5u49
I would like to understand why.
From the fine manual:
type Float
[...]
Each Float value also has a precision, rounding mode, and accuracy. The precision is the maximum number of mantissa bits available to represent the value. The rounding mode specifies how a result should be rounded to fit into the mantissa bits, and accuracy describes the rounding error with respect to the exact result.
And big.Float is represented internally as:
sign × mantissa × 2**exponent
When you call SetPrec you're setting the number of bits available for the mantissa, not the number of digits of precision in the decimal representation of the number.
You can't represent decimal 10 (1010 binary) in two bits of mantissa so it rounds to decimal 8 (1000 binary) which can fit into 2 bits. You need at least three bits to store the 101 part of decimal 10. 8 can fit into a single bit of mantissa so you'll see the same 8 if you say cost.SetPrec(1).
You need to be thinking in terms of binary when using the big package.
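A small sketch confirms this: decimal 10 is binary 1010, whose mantissa needs three bits (101, with the trailing zero absorbed by the exponent), so precision 3 keeps it exact while precision 2 rounds it:
package main

import (
	"fmt"
	"math/big"
)

func main() {
	x := big.NewFloat(10) // 1010 in binary
	x.SetPrec(3)
	fmt.Println(x.String()) // 10: the mantissa 101 fits in 3 bits
	x.SetPrec(2)
	fmt.Println(x.String()) // 8: 10 is not representable, so it is rounded
}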
First, discard all the irrelevant code. Next, print useful diagnostic information.
package main

import (
	"fmt"
	"math/big"
)

func main() {
	cost := big.NewFloat(10)
	fmt.Println("Cost ", cost.String())
	fmt.Println("Prec", cost.Prec())
	fmt.Println("MinPrec", cost.MinPrec())
	fmt.Println("Mode", cost.Mode())
	cost.SetPrec(2)
	fmt.Println("Prec", cost.Prec())
	fmt.Println("Accuracy", cost.Acc())
	fmt.Println("Cost ", cost.String())
}
Output:
Cost 10
Prec 53
MinPrec 3
Mode ToNearestEven
Prec 2
Accuracy Below
Cost 8
Round 10 to the nearest value that can be represented with a sign, an exponent, and a 2-bit mantissa, and you get 8: the nearest representable neighbors are 8 (mantissa 10₂) and 12 (mantissa 11₂), and 10 falls exactly midway between them.
Rounding ToNearestEven is IEEE 754 rounding: round to nearest, ties to even. It rounds to the nearest value; if the number falls midway, it is rounded to the nearest value with an even (zero) least significant bit, which here is 8.
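Here is a short sketch (mine, for illustration) showing how the rounding mode picks between the two representable neighbors 8 and 12 when 10 falls exactly midway; note that SetMode is called before SetPrec performs the rounding:
package main

import (
	"fmt"
	"math/big"
)

func main() {
	for _, mode := range []big.RoundingMode{
		big.ToNearestEven, big.ToNearestAway, big.ToZero, big.ToPositiveInf,
	} {
		x := big.NewFloat(10)
		x.SetMode(mode)
		x.SetPrec(2)
		fmt.Println(mode, x.String()) // 8, 12, 8, 12 respectively
	}
}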
Imagine that, for printing in a fixed-width table of width 12, we need to print float64 numbers:
fmt.Printf("%12.6g\n", 9.405090880450127e+119) //"9.40509e+119"
fmt.Printf("%12.6g\n", 0.1234567890123) //" 0.123457"
fmt.Printf("%12.6g\n", 123456789012.0) //" 1.23457e+11"
We prefer 0.1234567890 to " 0.123457", where we needlessly lose 4 significant digits.
We prefer 123456789012 to " 1.23457e+11", where we needlessly lose 6 significant digits.
Is there any standard library function to convert a float64 to a string with a fixed width and the maximum number of significant digits?
Thanks in advance.
Basically you have 2 output formats: either a scientific notation or a regular form. The turning point between those 2 formats is 1e12.
So you can branch on x >= 1e12. In both branches you may first do a formatting with 0 fraction digits to see how long the number will be, so you can calculate how many fraction digits fit into a width of 12, and then construct the final format string using the calculated precision.
The pre-check is required in the scientific notation too (%g), because the width of exponent may vary (e.g. e+1, e+10, e+100).
Here is an example implementation. It is meant to get you started; it does not handle all cases, and it is not the most efficient solution (but it is relatively simple and does the job):
// format12 formats x to be 12 chars long.
func format12(x float64) string {
	if x >= 1e12 {
		// Check to see how many fraction digits fit in:
		s := fmt.Sprintf("%.g", x)
		format := fmt.Sprintf("%%12.%dg", 12-len(s))
		return fmt.Sprintf(format, x)
	}

	// Check to see how many fraction digits fit in:
	s := fmt.Sprintf("%.0f", x)
	if len(s) == 12 {
		return s
	}
	format := fmt.Sprintf("%%%d.%df", len(s), 12-len(s)-1)
	return fmt.Sprintf(format, x)
}
Testing it:
fs := []float64{0, 1234.567890123, 0.1234567890123, 123456789012.0, 1234567890123.0,
	9.405090880450127e+9, 9.405090880450127e+19, 9.405090880450127e+119}
for _, f := range fs {
	fmt.Println(format12(f))
}
Output (try it on the Go Playground):
0.0000000000
1234.5678901
0.1234567890
123456789012
1.234568e+12
9405090880.5
9.405091e+19
9.40509e+119
I was wondering: can we tell the random generator how many digits should be generated after the decimal point?
Example of default behaviour:
fmt.Println(rand.Float64())
would print out the number 0.6046602879796196.
Desired behaviour:
fmt.Println(rand.Float64(4))
would then print out the number 0.6047.
Does this functionality already exist in Go, or would I have to implement it myself?
Thank you!
It sounds like only the string representation is important to you, and the fmt package does provide that for you:
fmt.Printf("%1.4f", rand.Float64())
So yes, you would still need to wrap this call to specify the number of digits after the decimal point.
func RandomDigits(number int) string {
	return fmt.Sprintf("%1."+strconv.Itoa(number)+"f", rand.Float64())
}
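Hypothetical usage of the helper above (output varies with the generator's state):
fmt.Println(RandomDigits(4)) // e.g. 0.6047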
I don't know of such a function; however, it is easy to implement yourself (play):
// Truncate the number x to n decimal places
//
// +- Inf -> +- Inf; NaN -> NaN
func truncate(x float64, n int) float64 {
	return math.Trunc(x*math.Pow(10, float64(n))) * math.Pow(10, -float64(n))
}
Shift the number n decimal places to the left, truncate decimal places, shift the number n places to the right.
In case you want to present your number to the user, you will at some point convert the number to a string. When you do that, you should not use this method; instead, use string formatting as pointed out by Tyson. For example, as floating point numbers are imprecise, there might be rounding errors:
truncate(0.9405090880450124,3) // 0.9400000000000001
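If a string is the end goal anyway, formatting directly avoids such artifacts; note that %f rounds rather than truncates:
fmt.Printf("%.3f\n", 0.9405090880450124) // 0.941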