Why can I not use the MAX and MIN values of f32 and f64 in a call to rand::rngs::ThreadRng::gen_range() in Rust

I have a utility I've written to generate dummy data. For example, you might write:
giveme 12000 u32
... to get an array of 12000 32-bit unsigned integers.
It is possible to set a maximum and/or minimum allowed value, so you might write:
giveme 100 f32 --max=155.25
If one or the other is not given, the programme uses type::MAX and type::MIN.
With the floating point types, however, one cannot pass those values to rand::rngs::ThreadRng::gen_range().
Here is my code for the f64:
fn generate_gift(gift_type: &GiftType,
                 generator: &mut rand::rngs::ThreadRng,
                 min: Option<f64>,
                 max: Option<f64>) -> Gift
{
    match gift_type
    {
        ...
        GiftType::Float64 =>
        {
            let _min: f64 = min.unwrap_or(f64::MIN);
            let _max: f64 = max.unwrap_or(f64::MAX);
            let x: f64 = generator.gen_range(_min..=_max);
            Gift::Float64(x)
        },
    }
}
If one or both of the limits is missing for a floating-point type, 32- or 64-bit, then I get this error:
thread 'main' panicked at 'UniformSampler::sample_single: range overflow', /home/jack/.cargo/registry/src/github.com-1ecc6299db9ec823/rand-0.8.5/src/distributions/uniform.rs:998:1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
It is not obvious to me at all why this error arises. Can you shed any light upon it?
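For reference, here is a minimal standalone reproduction of the panic (a sketch assuming rand 0.8, the version shown in the backtrace):

use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    // Full-range bounds: this panics with the 'range overflow' error quoted above.
    let x: f64 = rng.gen_range(f64::MIN..=f64::MAX);
    println!("{}", x);
}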

It's a limitation of the method used to generate the float number; see issue #1090.
gen_range(f64::MIN..f64::MAX) results in high - low overflowing.
See the Rand book:
f64: we treat this as an approximation of the real numbers, and, by convention, restrict to the range 0 to 1 (if not otherwise specified). We will come back to the conversions used later; for now note that these produce 52-53 bits of precision (depending on which conversion is used, output will be in steps of ε or ε/2, where 1+ε is the smallest representable value greater than 1).
For f32 and f64 the range 0.0 .. 1.0 is used (exclusive of 1.0), for two reasons: (a) this is common practice for random-number generators and (b) because for many purposes having a uniform distribution of samples (along the Real number line) is important, and this is only possible for floating-point representations by restricting the range.
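For example, the unconstrained sampler for f64 produces values in [0, 1) (a small sketch with rand 0.8):

use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    // The Standard distribution maps f64 samples into [0.0, 1.0), as described above.
    let x: f64 = rng.gen();
    println!("{}", x); // always >= 0.0 and < 1.0
}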

It's a limitation in rand.
In the source code it calculates high - low, which is inf in your case; the implementation checks for this and raises this error.
https://docs.rs/rand/0.8.5/src/rand/distributions/uniform.rs.html#811-814
The algorithm first calculates a random number in [1, 2) and then scales it by the length of the interval.
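If you really do want near-full-range defaults, one possible workaround (purely a sketch; the halved defaults are my own convention, not something rand prescribes) is to choose bounds whose difference still fits in an f64:

use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    // f64::MAX / 2.0 - f64::MIN / 2.0 == f64::MAX, which is finite, so the
    // 'range overflow' assertion is never triggered.
    let min = f64::MIN / 2.0;
    let max = f64::MAX / 2.0;
    let x: f64 = rng.gen_range(min..=max);
    println!("{}", x);
}

In the question's code this would amount to changing the unwrap_or defaults. Keep in mind, though, that a uniform distribution over (almost) the whole f64 range is rarely useful: nearly every sample will have a magnitude on the order of 1e306 to 1e308.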

Related

How does Go compare overflow int? [duplicate]


How does the play method work in the "NCD.L1.sample--lottery" contract?

Here is the contract repo. https://github.com/Learn-NEAR/NCD.L1.sample--lottery
I don't understand the play method here
https://github.com/Learn-NEAR/NCD.L1.sample--lottery/blob/2bd11bc1092004409e32b75736f78adee821f35b/src/lottery/assembly/lottery.ts#L11-L16
play(): bool {
    const rng = new RNG<u32>(1, u32.MAX_VALUE);
    const roll = rng.next();
    logging.log("roll: " + roll.toString());
    return roll <= <u32>(<f64>u32.MAX_VALUE * this.chance);
}
I don't understand the winning process but I'm sure it is hidden inside this method. So can someone explain how this play method works in detail?
To understand the winning process we should take a look at the play method in the lottery.ts file in the contract.
https://github.com/Learn-NEAR/NCD.L1.sample--lottery/blob/2bd11bc1092004409e32b75736f78adee821f35b/src/lottery/assembly/lottery.ts#L11-L16
play(): bool {
    const rng = new RNG<u32>(1, u32.MAX_VALUE);
    const roll = rng.next();
    logging.log("roll: " + roll.toString());
    return roll <= <u32>(<f64>u32.MAX_VALUE * this.chance);
}
There are a couple of things we should know about before we read this code.
bool
u32
f64
RNG<u32>
bool means that our play method should only return true or false.
u32 is a 32-bit unsigned integer. It is a positive integer stored using 32 bits.
u8 has a max value of 255. u16 has a max value of 65535. u32 has a max value of 4294967295. u64 has a max value of 18446744073709551615. So, these unsigned integers can't be negative values.
f64 is a 64-bit floating-point number. This type can represent a wide range of decimal numbers, like 3.5, 27, -113.75, 0.0078125, 34359738368, 0, -1. So unlike integer types (such as i32), floating-point types can represent non-integer numbers, too.
RNG stands for Random Number Generator. It basically gives you a random number in the range of u32, and it takes two parameters that define the range of values it can produce. In this case, the range is between 1 and u32.MAX_VALUE; in other words, between 1 and 4294967295.
The next line creates a variable called roll and assigns it the value of rng.next().
So, what does next() do? Think of rng as a big machine which only has one big red button on it. Every time you hit that big red button, it gives you a number this machine is capable of producing, i.e. a number between 1 and u32.MAX_VALUE.
The third line just logs the roll to the console. You should see something like this in your console: roll: 3845432649
The last line looks confusing at first, but let's take it piece by piece.
Here, in u32.MAX_VALUE * this.chance, we multiply the max value by a variable called chance, which is defined as 0.2 in the Lottery class.
Then, we put <f64> in front of this calculation because the result will always be a floating-point number due to the 0.2.
Then, we put <u32> in front of the whole expression to convert that floating-point number back into an unsigned integer, because we need to compare it with roll, which is an unsigned integer. You can't compare floating-point numbers with unsigned integers directly.
Finally, if roll is less than or equal to <u32>(<f64>u32.MAX_VALUE * this.chance), the player wins.
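To see the probability this encodes, here is a small sketch of the same check translated to Rust (rand 0.8) rather than AssemblyScript; the play function and the 0.2 chance mirror the contract, but the code itself is only an illustration:

use rand::Rng; // rand 0.8

// Same idea as the contract's play(): roll a u32 in 1..=u32::MAX and win when
// the roll falls at or below chance * u32::MAX, i.e. with probability ~chance.
fn play(rng: &mut impl Rng, chance: f64) -> bool {
    let roll: u32 = rng.gen_range(1..=u32::MAX);
    roll <= (u32::MAX as f64 * chance) as u32
}

fn main() {
    let mut rng = rand::thread_rng();
    let wins = (0..10_000).filter(|_| play(&mut rng, 0.2)).count();
    println!("won {} of 10000 rolls (expect roughly 2000)", wins);
}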

How to generate a random Rust integer in a range without introducing bias?

How do I generate a random dice roll in Rust?
I know I can use rand::random, but that requires me to generate a value of an integer type. Using rand::random::<u8>() % 6 introduces a bias.
Use Rng::gen_range for a one-off value:
use rand::{self, Rng}; // 0.8.0

fn main() {
    let mut rng = rand::thread_rng();
    let die = rng.gen_range(1..=6);
    println!("The die was: {}", die);
}
Under the hood, this creates a Uniform struct. Create this struct yourself if you will be getting multiple random numbers:
use rand::{
    self,
    distributions::{Distribution, Uniform},
}; // 0.8.0

fn main() {
    let mut rng = rand::thread_rng();
    let die_range = Uniform::new_inclusive(1, 6);
    let die = die_range.sample(&mut rng);
    println!("{}", die);
}
Uniform does some precomputation to figure out how to map the complete range of random values to your desired range without introducing bias. It translates and resizes your original range to most closely match the range of the random number generator, discards any random numbers that fall outside this new range, then resizes and translates back to the original range.
See also:
Why do people say there is modulo bias when using a random number generator?
You're correct that a bias is introduced; whenever you want to map from set A to set B, where the cardinality of set B is not a factor or multiple of the cardinality of set A, you will have bias.
In your case, 42*6=252. So you can just throw away any u8 values of 252 or greater (and call random again).
Your output can then be safely mapped with the modulus operator. Finally add 1 to achieve the standard [1,6] dice output.
It might seem unclean to call random again but there is no way of mapping a set of 256 values to a set of 6 without introducing bias.
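A minimal sketch of that rejection loop (the helper name d6 is mine, just for illustration; it uses rand::random from rand 0.8):

fn d6() -> u8 {
    loop {
        let x: u8 = rand::random();
        // 252 = 42 * 6 is the largest multiple of 6 that fits in 0..=255, so the
        // four values 252..=255 are rejected and rerolled to avoid modulo bias.
        if x < 252 {
            return x % 6 + 1;
        }
    }
}

fn main() {
    println!("The die was: {}", d6());
}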
Edit: looks like the rand crate has something which takes bias into account: https://docs.rs/rand/latest/rand/distributions/uniform/struct.Uniform.html

How does Go perform arithmetic on constants?

I've been reading this post on constants in Go, and I'm trying to understand how they are stored and used in memory. You can perform operations on very large constants in Go, and as long as the result fits in memory, you can coerce that result to a type. For example, this code prints 10, as you would expect:
const Huge = 1e1000
fmt.Println(Huge / 1e999)
How does this work under the hood? At some point, Go has to store 1e1000 and 1e999 in memory, in order to perform operations on them. So how are constants stored, and how does Go perform arithmetic on them?
Short summary (TL;DR) is at the end of the answer.
Untyped arbitrary-precision constants don't live at runtime; constants live only at compile time (during compilation). That being said, Go does not have to represent constants with arbitrary precision at runtime, only when compiling your application.
Why? Because constants do not get compiled into the executable binaries. They don't have to be. Let's take your example:
const Huge = 1e1000
fmt.Println(Huge / 1e999)
There is a constant Huge in the source code (and it will be in the package object), but it won't appear in your executable. Instead, a function call to fmt.Println() will be recorded with a value passed to it, whose type will be float64. So in the executable only a float64 value of 10.0 will be recorded. There is no sign of any number being 1e1000 in the executable.
This float64 type is derived from the default type of the untyped constant Huge. 1e1000 is a floating-point literal. To verify it:
const Huge = 1e1000
x := Huge / 1e999
fmt.Printf("%T", x) // Prints float64
Back to the arbitrary precision:
Spec: Constants:
Numeric constants represent exact values of arbitrary precision and do not overflow.
So constants represent exact values of arbitrary precision. As we saw, there is no need to represent constants with arbitrary precision at runtime, but the compiler still has to do something at compile time. And it does!
Obviously "infinite" precision cannot be dealt with. But there is no need, as the source code itself is not "infinite" (the size of the source is finite). Still, it's not practical to allow truly arbitrary precision. So the spec gives some freedom to compilers regarding this:
Implementation restriction: Although numeric constants have arbitrary precision in the language, a compiler may implement them using an internal representation with limited precision. That said, every implementation must:
Represent integer constants with at least 256 bits.
Represent floating-point constants, including the parts of a complex constant, with a mantissa of at least 256 bits and a signed exponent of at least 32 bits.
Give an error if unable to represent an integer constant precisely.
Give an error if unable to represent a floating-point or complex constant due to overflow.
Round to the nearest representable constant if unable to represent a floating-point or complex constant due to limits on precision.
These requirements apply both to literal constants and to the result of evaluating constant expressions.
However, note that even with all of the above said, the standard library still provides you the means to represent and work with values (constants) of "arbitrary" precision; see package go/constant. You may look into its source to get an idea how it's implemented.
Implementation is in go/constant/value.go. Types representing such values:
// A Value represents the value of a Go constant.
type Value interface {
    // Kind returns the value kind.
    Kind() Kind

    // String returns a short, human-readable form of the value.
    // For numeric values, the result may be an approximation;
    // for String values the result may be a shortened string.
    // Use ExactString for a string representing a value exactly.
    String() string

    // ExactString returns an exact, printable form of the value.
    ExactString() string

    // Prevent external implementations.
    implementsValue()
}
type (
    unknownVal struct{}
    boolVal    bool
    stringVal  string
    int64Val   int64                    // Int values representable as an int64
    intVal     struct{ val *big.Int }   // Int values not representable as an int64
    ratVal     struct{ val *big.Rat }   // Float values representable as a fraction
    floatVal   struct{ val *big.Float } // Float values not representable as a fraction
    complexVal struct{ re, im Value }
)
As you can see, the math/big package is used to represent untyped arbitrary precision values. big.Int is for example (from math/big/int.go):
// An Int represents a signed multi-precision integer.
// The zero value for an Int represents the value 0.
type Int struct {
    neg bool // sign
    abs nat  // absolute value of the integer
}
Where nat is (from math/big/nat.go):
// An unsigned integer x of the form
//
// x = x[n-1]*_B^(n-1) + x[n-2]*_B^(n-2) + ... + x[1]*_B + x[0]
//
// with 0 <= x[i] < _B and 0 <= i < n is stored in a slice of length n,
// with the digits x[i] as the slice elements.
//
// A number is normalized if the slice contains no leading 0 digits.
// During arithmetic operations, denormalized values may occur but are
// always normalized before returning the final result. The normalized
// representation of 0 is the empty or nil slice (length = 0).
//
type nat []Word
And finally, Word is (from math/big/arith.go):
// A Word represents a single digit of a multi-precision unsigned integer.
type Word uintptr
Summary
At runtime: predefined types provide limited precision, but you can "mimic" arbitrary precision with certain packages, such as math/big and go/constant. At compile time: constants seemingly provide arbitrary precision, but in reality a compiler may not live up to this (it doesn't have to); still, the spec mandates a minimum precision for constants that all compilers must support, e.g. integer constants must be represented with at least 256 bits, which is 32 bytes (compared to int64, which is "only" 8 bytes).
When an executable binary is created, results of constant expressions (with arbitrary precision) have to be converted and represented with values of finite precision types – which may not be possible and thus may result in compile-time errors. Note that only results –not intermediate operands– have to be converted to finite precision, constant operations are carried out with arbitrary precision.
How this arbitrary or enhanced precision is implemented is not defined by the spec; math/big, for example, stores the "digits" of the number in a slice, where a "digit" is not a digit of the base-10 representation but an uintptr, which effectively gives a base-2^32 (4294967296) representation on 32-bit architectures and an even larger base on 64-bit architectures.
Go constants are not allocated to memory. They are used in context by the compiler. The blog post you refer to gives the example of Pi:
Pi = 3.14159265358979323846264338327950288419716939937510582097494459
If you assign Pi to a float32 it will lose precision to fit, and if you assign it to a float64 it will lose less precision; either way, the compiler determines what type to use from the context.

hashing a small number to a random looking 64 bit integer

I am looking for a hash-function which operates on a small integer (say in the range 0...1000) and outputs a 64 bit int.
The result-set should look like a random distribution of 64 bit ints: a uniform distribution with no linear correlation between the results.
I was hoping for a function that only takes a few CPU-cycles to execute. (the code will be in C++).
I considered multiplying the input by a big prime number and taking the result modulo 2**64 (something like a linear congruential generator), but there are obvious dependencies between the outputs (in the lower bits).
Googling did not show up anything, but I am probably using wrong search terms.
Does such a function exist?
Some Background-info:
I want to avoid using a big persistent table with pseudo random numbers in an algorithm, and calculate random-looking numbers on the fly.
Security is not an issue.
I tested the 64-bit finalizer of MurmurHash3 (suggested by #aix and this SO post). This gives zero if the input is zero, so I increased the input parameter by 1 first:
typedef unsigned long long uint64;

inline uint64 fasthash(uint64 i)
{
    i += 1ULL;
    i ^= i >> 33ULL;
    i *= 0xff51afd7ed558ccdULL;
    i ^= i >> 33ULL;
    i *= 0xc4ceb9fe1a85ec53ULL;
    i ^= i >> 33ULL;
    return i;
}
Here the input argument i is a small integer, for example an element of {0, 1, ..., 1000}. The output looks random:
i    fasthash(i) (decimal)      fasthash(i) (hex)
0    12994781566227106604       0xB456BCFC34C2CB2C
1    4233148493373801447        0x3ABF2A20650683E7
2    815575690806614222         0x0B5181C509F8D8CE
3    5156626420896634997        0x47900468A8F01875
...  ...                        ...
There is no linear correlation between subsequent elements of the series (the original post includes a scatter plot of consecutive outputs; the range of both axes is 0..2^64-1).
Why not use an existing hash function, such as MurmurHash3 with a 64-bit finalizer? According to the author, the function takes tens of CPU cycles per key on current Intel hardware.
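For reference, here is the same finalizer sketched in Rust; the wrapping_* calls reproduce the modulo-2^64 arithmetic the C++ version gets from unsigned overflow, and the +1 mirrors the questioner's tweak for a zero input:

fn fasthash(mut i: u64) -> u64 {
    i = i.wrapping_add(1);
    i ^= i >> 33;
    i = i.wrapping_mul(0xff51afd7ed558ccd);
    i ^= i >> 33;
    i = i.wrapping_mul(0xc4ceb9fe1a85ec53);
    i ^= i >> 33;
    i
}

fn main() {
    // Print the first few outputs; they should match the table in the question.
    for i in 0..4u64 {
        println!("{} -> {:#018x}", i, fasthash(i));
    }
}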
Given: an input i in the range 0 to 1,000,
a constant MaxInt, which is the maximum value that can be contained in a 64-bit int (you did not say if it is signed or unsigned; 2^64 = 18446744073709551616),
and a function rand() that returns a value between 0 and 1 (most languages have such a function):
compute hashvalue = i * rand() * (MaxInt / 1000)
1,000 * 1,000 = 1,000,000. That fits well within an Int32.
Subtract the low bound of your range from the number.
Square it, and use it as a direct subscript into some sort of bitmap.
