Rust: Mocking rand functions with gen_range

I am writing a small program that randomly selects an entry from an enum. Sample code:
#[derive(Debug)]
enum SettlementSize {
VILLAGE,
TOWN,
CITY
}
impl Distribution<SettlementSize> for Standard {
fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> SettlementSize {
let res = rng.gen_range(0, 3);
match res {
0 => SettlementSize::VILLAGE,
1 => SettlementSize::TOWN,
2 => SettlementSize::CITY,
_ => panic!("Unknown value!")
}
}
}
fn get_settlement_size(mut rng: impl RngCore) -> SettlementSize {
let size: SettlementSize = rng.gen();
size
}
Now, of course I want to test it. That's why get_settlement_size takes the rng value.
#[test]
fn random_human_readable() {
let rng = StepRng::new(1, 1);
assert_eq!("Town", get_settlement_size(rng).human_readable());
}
Unfortunately, this doesn't work. When I added some printlns, the value returned from:
rng.gen_range(0, 3);
is always 0. I copied StepRng code into my test module to add println inside and I see next_u32 and next_u64 called. However, later the code disappears into UniformSampler and at that point it becomes too hard for me to follow. What am I doing wrong? Can I somehow retain the testability (which means being able to set fixed results for random in my mind)?

You're right: it is easy to mock the primitive functions of the RngCore trait,
but the way they are used to avoid bias from low-order bits and
bias from modulus calculations makes it very tricky to mock the
higher-level functions built on top of them, such as gen_range in Rng.
IMO the simplest way to approach this is to place a layer between your use of the Rng and the code you want to test.
So instead of this:
fn get_settlement_size(mut rng: impl RngCore) -> SettlementSize {
let size: SettlementSize = rng.gen();
size
}
you have
trait RngWrapper {
fn get_settlement_size(&mut self) -> SettlementSize;
}
fn get_settlement_size(rng: &mut impl RngWrapper) -> SettlementSize {
rng.get_settlement_size()
}
Now your "real" implementation looks like this
impl Distribution<SettlementSize> for Standard {
fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> SettlementSize {
let res = rng.gen_range(0..3);
match res {
0 => SettlementSize::VILLAGE,
1 => SettlementSize::TOWN,
2 => SettlementSize::CITY,
_ => panic!("Unknown value!"),
}
}
}
struct Random<'a, R: RngCore> {
rng: &'a mut R,
}
impl<'a, R> RngWrapper for Random<'a, R>
where
R: RngCore,
{
fn get_settlement_size(&mut self) -> SettlementSize {
self.rng.gen()
}
}
And your mock could look something like this:
struct AlwaysTown {}
impl RngWrapper for AlwaysTown {
fn get_settlement_size(&mut self) -> SettlementSize {
SettlementSize::TOWN
}
}
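A test against such a mock might look like the sketch below. It is self-contained and needs no rand dependency. The human_readable method is a hypothetical stand-in for the one used in the question (its real body is not shown there), and the enum is repeated only so the block compiles on its own.

```rust
// Variant names follow the question; the allow silences the naming lint.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SettlementSize {
    VILLAGE,
    TOWN,
    CITY,
}

impl SettlementSize {
    // Hypothetical helper: the question calls human_readable() but never shows it.
    fn human_readable(self) -> &'static str {
        match self {
            SettlementSize::VILLAGE => "Village",
            SettlementSize::TOWN => "Town",
            SettlementSize::CITY => "City",
        }
    }
}

// The layer between the random source and the code under test.
trait RngWrapper {
    fn get_settlement_size(&mut self) -> SettlementSize;
}

// Code under test: it only knows about the wrapper trait.
fn get_settlement_size(rng: &mut impl RngWrapper) -> SettlementSize {
    rng.get_settlement_size()
}

// Mock with a fixed answer, so the test is fully deterministic.
struct AlwaysTown {}

impl RngWrapper for AlwaysTown {
    fn get_settlement_size(&mut self) -> SettlementSize {
        SettlementSize::TOWN
    }
}
```

In a test you would then write something like assert_eq!("Town", get_settlement_size(&mut AlwaysTown {}).human_readable()).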
Now you can test anything that uses get_settlement_size(), but you have not yet addressed testing Random<'_, ThreadRng>::get_settlement_size. However, that is now an isolated issue, and it doesn't require the "being able to set fixed results for random" that you mention in your question. Instead, I'd do a statistical test: check that each of the expected cases comes out roughly the expected number of times.
If you want this to be rigorous, you need a bit of stats (which I'm not going to put here), but you should be able to put together a rough test that will pass 99.99+% of the time.
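A rough version of such a statistical check could be sketched as follows. To keep it free of dependencies it uses a small SplitMix64 generator as a stand-in for the real rand RNG, and a plain modulo instead of gen_range. Both are assumptions made for illustration, as are the names count_samples and roughly_uniform; in real code you would draw from your Random wrapper instead.

```rust
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SettlementSize {
    VILLAGE,
    TOWN,
    CITY,
}

// Stand-in generator (SplitMix64); NOT part of the rand crate.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Maps a raw random number onto one of the three variants.
// The modulo bias is negligible for a 64-bit input and 3 outcomes.
fn sample_size(rng: &mut SplitMix64) -> SettlementSize {
    match rng.next_u64() % 3 {
        0 => SettlementSize::VILLAGE,
        1 => SettlementSize::TOWN,
        _ => SettlementSize::CITY,
    }
}

// Draws n samples and counts how often each variant appears.
fn count_samples(seed: u64, n: usize) -> [usize; 3] {
    let mut rng = SplitMix64(seed);
    let mut counts = [0usize; 3];
    for _ in 0..n {
        let idx = match sample_size(&mut rng) {
            SettlementSize::VILLAGE => 0,
            SettlementSize::TOWN => 1,
            SettlementSize::CITY => 2,
        };
        counts[idx] += 1;
    }
    counts
}

// True if every count is within `tolerance` (a fraction) of the uniform expectation.
fn roughly_uniform(counts: &[usize; 3], tolerance: f64) -> bool {
    let total: usize = counts.iter().sum();
    let expected = total as f64 / 3.0;
    counts
        .iter()
        .all(|&c| (c as f64 - expected).abs() <= tolerance * expected)
}
```

With 30,000 samples the standard deviation of each count is about 82, so a 10% tolerance (1,000 samples either way) leaves a very wide safety margin. Tighten it only if you are prepared to do the statistics properly.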
A full example, except the statistical tests on RngCore, is on the playground.

Related

How to merge two elements of a list in Rust?

I've been working to try to optimize a section of my code and I've hit an area where I think I could use some community wisdom. I'm essentially trying to merge two elements of a list without moving the elements in the list (via two removes and an insert), because as far as I can tell in Rust doing so to a vector costs O(n) time.
Take a glance at the code that captures the essence of my problem:
use std::cell::RefCell;
use std::rc::Rc;
use std::collections::BinaryHeap;
#[derive(PartialOrd, Ord, PartialEq, Eq)]
pub struct Num {
pub num: usize
}
impl Num {
pub fn new(num: usize) -> Num {
Num {
num
}
}
}
fn main() {
let mut a = vec![];
for i in 0..10 {
a.push(Rc::new(RefCell::new(Num::new(i))));
}
let mut b = BinaryHeap::with_capacity(a.len());
for i in 0..a.len() - 1 {
b.push((i, Rc::clone(&a[i]), Rc::clone(&a[i + 1])));
}
drop(a);
while !b.is_empty() {
let c = b.pop().unwrap();
let first = c.1;
let next = c.2;
println!("c: c.0: {}", c.0);
println!("c: first.num before: {}", RefCell::borrow(&first).num);
println!("c: next.num before: {}", RefCell::borrow(&next).num);
// Here I want to replace the two structs referenced in first and next
// with a single new struct that first and next both point to.
// e.g. first -> new_num <- next
println!("c: first.num after: {}", RefCell::borrow(&first).num);
println!("c: next.num after: {}", RefCell::borrow(&next).num);
assert_eq!(RefCell::borrow(&first).num, RefCell::borrow(&next).num);
}
}
I want to be able to take two elements within a list, merge them into one pseudo-element, where the two previous "elements" are actually just pointers to the same new element. However, I'm having trouble finding a way to do this without copying memory or structures around in the list.
My understanding of your requirement is that you need the Vec to be able to hold items that are either a value or a reference to another item, while keeping the structure similar to what you have presented.
We can model that by changing your item type to an enum, which can hold either a value or a reference to another item:
pub enum Num {
Raw(usize),
Ref(Rc<RefCell<Num>>),
}
And add methods to include abstractions for constructing the different variants and for accessing the underlying numeric value:
impl Num {
pub fn new(num: usize) -> Num {
Num::Raw(num)
}
pub fn new_ref(other: Rc<RefCell<Num>>) -> Num {
Num::Ref(other)
}
pub fn get_num(&self) -> usize {
match &self {
Num::Raw(n) => *n,
Num::Ref(r) => r.borrow().get_num()
}
}
}
If you create a new value like this:
let new_num = Rc::new(RefCell::new(Num::new(100)));
You can reference it in other nodes like this:
*first.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
*next.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
The full code then looks like this:
use std::cell::RefCell;
use std::rc::Rc;
use std::collections::BinaryHeap;
#[derive(PartialOrd, Ord, PartialEq, Eq)]
pub enum Num {
Raw(usize),
Ref(Rc<RefCell<Num>>),
}
impl Num {
pub fn new(num: usize) -> Num {
Num::Raw(num)
}
pub fn new_ref(other: Rc<RefCell<Num>>) -> Num {
Num::Ref(other)
}
pub fn get_num(&self) -> usize {
match &self {
Num::Raw(n) => *n,
Num::Ref(r) => r.borrow().get_num()
}
}
}
fn main() {
let mut a = vec![];
for i in 0..10 {
a.push(Rc::new(RefCell::new(Num::new(i))));
}
let mut b = BinaryHeap::with_capacity(a.len());
for i in 0..a.len() - 1 {
b.push((i, Rc::clone(&a[i]), Rc::clone(&a[i + 1])));
}
drop(a);
let new_num = Rc::new(RefCell::new(Num::new(100)));
while !b.is_empty() {
let c = b.pop().unwrap();
let first = c.1;
let next = c.2;
println!("c: c.0: {}", c.0);
println!("c: first.num before: {}", RefCell::borrow(&first).get_num());
println!("c: next.num before: {}", RefCell::borrow(&next).get_num());
*first.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
*next.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
println!("c: first.num after: {}", RefCell::borrow(&first).get_num());
println!("c: next.num after: {}", RefCell::borrow(&next).get_num());
assert_eq!(RefCell::borrow(&first).get_num(), RefCell::borrow(&next).get_num());
}
}
As for whether this will prove to be better performance than a different approach, it's hard to say. Your starting point seems quite complicated, and if you can simplify that and use a different underlying data structure, then you should try it and benchmark. I have often been surprised at the actual speed of O(n) operations on a Vec, even when the size is around 1000 items or more.
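For comparison, the plain Vec approach is short enough to benchmark directly. The sketch below assumes that "merging" two adjacent elements means replacing them with their sum, which is only an illustration since the question does not say how the merged value is computed. The names merge_adjacent and merge_all are hypothetical.

```rust
use std::time::{Duration, Instant};

// Replaces v[i] and v[i + 1] with their sum, stored at index i.
// Vec::remove shifts the tail left, so this is O(n) in the worst case.
// Returns the merged value, or None if there is no pair at index i.
fn merge_adjacent(v: &mut Vec<usize>, i: usize) -> Option<usize> {
    if i + 1 >= v.len() {
        return None;
    }
    let right = v.remove(i + 1);
    v[i] += right;
    Some(v[i])
}

// Repeatedly merges at the front (the worst case for shifting) until one
// element is left, and reports the final value plus the elapsed time.
fn merge_all(n: usize) -> (usize, Duration) {
    let mut v: Vec<usize> = (0..n).collect();
    let start = Instant::now();
    while v.len() > 1 {
        merge_adjacent(&mut v, 0);
    }
    (v.first().copied().unwrap_or(0), start.elapsed())
}
```

Calling merge_all(1000) in a release build and printing the returned Duration gives a baseline to compare against the Rc<RefCell<...>> version before committing to the more complicated structure.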

"Expected type parameter, found reference to type parameter" when implementing a generic cache struct

In the Closures chapter of the second edition of The Rust Programming Language, the writer implements a Cache struct and leaves it with a few problems for the reader to fix up, such as:
Accepting generic parameters and return values on the closure function
Allowing more than one value to be cached
I've attempted to fix those problems but I am quite stuck and can't make it work.
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
calculation: T,
results: HashMap<X, Y>,
}
impl<T, X, Y> Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
fn new(calculation: T) -> Cacher<T, X, Y> {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value<'a>(&'a mut self, arg: &'a X) -> &'a Y {
match self.results.get(arg) {
Some(v) => v,
None => {
let res = (self.calculation)(arg);
self.results.insert(*arg, res);
res
}
}
}
}
Where T is the closure function type, X is the argument type and Y is the return value type.
The error I get:
error[E0308]: mismatched types
--> src/main.rs:30:43
|
30 | self.results.insert(*arg, res);
| ^^^ expected type parameter, found &Y
|
= note: expected type `Y`
found type `&Y`
I understand this, but I can't think of an elegant solution for the whole ordeal.
You've stated that your closure returns a reference:
T: Fn(&X) -> &Y,
but then you are trying to store something that isn't a reference:
results: HashMap<X, Y>,
This is fundamentally incompatible; you need to unify the types.
In many cases, there's no reason to have a reference to a generic type because a generic type can already be a reference. Additionally, forcing the closure to return a reference means that a closure like |_| 42 would not be valid. Because of that, I'd say you should return and store the value type.
Next you need to apply similar logic to value, as it needs to take the argument by value in order to store it. Additionally, remove all the lifetimes from it as elision does the right thing: fn value(&mut self, arg: X) -> &Y.
Once you've straightened that out, apply the knowledge from How to lookup from and insert into a HashMap efficiently?:
fn value(&mut self, arg: X) -> &Y {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
Round it off with some tests that assert it's only called once, and you are good to go. Note that we had to make decisions along the way, but they aren't the only ones we could have chosen. For example, we could have made it so that the cached value is cloned when returned.
use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::hash::Hash;
struct Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
calculation: F,
results: HashMap<I, O>,
}
impl<F, I, O> Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
fn new(calculation: F) -> Self {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value(&mut self, arg: I) -> &O {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
}
#[test]
fn called_once() {
use std::sync::atomic::{AtomicUsize, Ordering};
let calls = AtomicUsize::new(0);
let mut c = Cacher::new(|&()| {
calls.fetch_add(1, Ordering::SeqCst);
()
});
c.value(());
c.value(());
c.value(());
assert_eq!(1, calls.load(Ordering::SeqCst));
}
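The alternative mentioned above, where the cached value is cloned when returned, could look roughly like this. CloningCacher is a hypothetical name, and the extra O: Clone bound is the price of returning an owned value instead of a reference.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Variant of the cache that hands out clones instead of references.
struct CloningCacher<F, I, O>
where
    F: Fn(&I) -> O,
    I: Eq + Hash,
    O: Clone,
{
    calculation: F,
    results: HashMap<I, O>,
}

impl<F, I, O> CloningCacher<F, I, O>
where
    F: Fn(&I) -> O,
    I: Eq + Hash,
    O: Clone,
{
    fn new(calculation: F) -> Self {
        CloningCacher {
            calculation,
            results: HashMap::new(),
        }
    }

    // Returns an owned copy, so the caller does not keep the cache borrowed.
    fn value(&mut self, arg: I) -> O {
        if let Some(cached) = self.results.get(&arg) {
            return cached.clone();
        }
        let res = (self.calculation)(&arg);
        self.results.insert(arg, res.clone());
        res
    }
}
```

This does two hash lookups on a miss, unlike the entry-based version. It trades that for a simpler signature, which may be worthwhile when O is cheap to clone.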

How to match enum variants dynamically in Rust? [duplicate]

I have an enum with the following structure:
enum Expression {
Add(Add),
Mul(Mul),
Var(Var),
Coeff(Coeff)
}
where the 'members' of each variant are structs.
Now I want to compare if two enums have the same variant. So if I have
let a = Expression::Add({something});
let b = Expression::Add({somethingelse});
cmpvariant(a, b) should be true. I can imagine a simple double match code that goes through all the options for both enum instances. However, I am looking for a fancier solution, if it exists. If not, is there overhead for the double match? I imagine that internally I am just comparing two ints (ideally).
As of Rust 1.21.0, you can use std::mem::discriminant:
fn variant_eq(a: &Op, b: &Op) -> bool {
std::mem::discriminant(a) == std::mem::discriminant(b)
}
This is nice because it can be very generic:
fn variant_eq<T>(a: &T, b: &T) -> bool {
std::mem::discriminant(a) == std::mem::discriminant(b)
}
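A small self-contained sketch of the generic version follows. The Expression enum here is a simplified stand-in for the one in the question, with made-up payloads. Note that the payloads are ignored entirely; only the variant is compared.

```rust
use std::mem::discriminant;

// Simplified stand-in for the question's enum; the payload types are illustrative.
#[derive(Debug)]
enum Expression {
    Add(i32, i32),
    Var(String),
    Zero,
}

// True when both values are the same variant, regardless of their contents.
fn variant_eq<T>(a: &T, b: &T) -> bool {
    discriminant(a) == discriminant(b)
}

// Example use: counts how many expressions share a variant with `probe`.
fn count_same_variant(items: &[Expression], probe: &Expression) -> usize {
    items.iter().filter(|e| variant_eq(*e, probe)).count()
}
```

Because variant_eq is generic over any T, it will also accept non-enum types, for which the standard library documents the result as unspecified. Restricting it to your own enum type (as in the first version) avoids that.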
Before Rust 1.21.0, I'd match on the tuple of both arguments and ignore the contents of the tuple with _ or ..:
struct Add(u8);
struct Sub(u8);
enum Op {
Add(Add),
Sub(Sub),
}
fn variant_eq(a: &Op, b: &Op) -> bool {
match (a, b) {
(&Op::Add(..), &Op::Add(..)) => true,
(&Op::Sub(..), &Op::Sub(..)) => true,
_ => false,
}
}
fn main() {
let a = Op::Add(Add(42));
let b = Op::Add(Add(42));
let c = Op::Add(Add(21));
let d = Op::Sub(Sub(42));
println!("{}", variant_eq(&a, &b));
println!("{}", variant_eq(&a, &c));
println!("{}", variant_eq(&a, &d));
}
I took the liberty of renaming the function though, as the components of enums are called variants, and really you are testing to see if they are equal, not comparing them (which is usually used for ordering / sorting).
For performance, let's look at the LLVM IR generated by Rust 1.60.0 in release mode (with variant_eq marked as #[inline(never)]). The Rust Playground can show you this:
; playground::variant_eq
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind nonlazybind readonly uwtable willreturn
define internal fastcc noundef zeroext i1 @_ZN10playground10variant_eq17hc64d59c7864eb861E(i8 %a.0.0.val, i8 %b.0.0.val) unnamed_addr #2 {
start:
%_8.not = icmp eq i8 %a.0.0.val, %b.0.0.val
ret i1 %_8.not
}
This code directly compares the variant discriminant.
If you wanted to have a macro to generate the function, something like this might be a good start.
struct Add(u8);
struct Sub(u8);
macro_rules! foo {
(enum $name:ident {
$($vname:ident($inner:ty),)*
}) => {
enum $name {
$($vname($inner),)*
}
impl $name {
fn variant_eq(&self, b: &Self) -> bool {
match (self, b) {
$((&$name::$vname(..), &$name::$vname(..)) => true,)*
_ => false,
}
}
}
}
}
foo! {
enum Op {
Add(Add),
Sub(Sub),
}
}
fn main() {
let a = Op::Add(Add(42));
let b = Op::Add(Add(42));
let c = Op::Add(Add(21));
let d = Op::Sub(Sub(42));
println!("{}", Op::variant_eq(&a, &b));
println!("{}", Op::variant_eq(&a, &c));
println!("{}", Op::variant_eq(&a, &d));
}
The macro does have limitations though: every variant needs to hold exactly one type. Supporting unit variants, variants with more than one type, struct variants, visibility, etc. is really hard. Perhaps a procedural macro would make it a bit easier.

Using "moved" values in a function

I want to learn Rust and am making a small program to deal with sound ques. I have a function with this signature:
fn edit_show(mut show: &mut Vec<Que>) {
show.sort_by(|a, b| que_ordering(&a.id, &b.id));
loop {
println!("Current ques");
for l in show {
println!("{}", que_to_line(&l));
}
}
}
I get an error:
use of moved value: 'show'
I cannot find anything on how to fix this. This seems like an odd error to get around the sort, since (I assume) I would hit the same thing if I did this in the main function where I pass in the value, which would make the function seem quite useless.
Solution
Your problem is in this line:
for l in show {
...
}
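This moves show into the loop: show is a &mut Vec<Que>, and a mutable reference is not Copy, so it is no longer available on the next pass of the outer loop. If you just want to borrow its elements, you should write:
for l in show.iter() {
...
}
If you want to borrow them mutably, write for l in show.iter_mut(). Had show been an owned Vec<Que>, the equivalent forms would be for l in &show and for l in &mut show.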
Explanation
The Rust for loop expects a type that implements IntoIterator. First thing to note: IntoIterator is implemented for every Iterator. See:
impl<I> IntoIterator for I where I: Iterator
Now let's search for the Vec impls:
impl<T> IntoIterator for Vec<T> {
type Item = T;
...
}
impl<'a, T> IntoIterator for &'a Vec<T> {
type Item = &'a T;
...
}
impl<'a, T> IntoIterator for &'a mut Vec<T> {
type Item = &'a mut T;
...
}
Here you can see that it's implemented for the Vec directly, but also for references to it. I hope these three impl blocks speak for themselves.
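To see the three forms in action when the function only holds a &mut Vec (as in the question), here is a minimal sketch. The function name shout_all and the use of String elements are made up for illustration. The explicit reborrows &*items and &mut *items pick the second and third impl without giving up the original reference.

```rust
// Iterates over the same `&mut Vec` several times without moving it.
// Returns the total length of all strings seen across both rounds.
fn shout_all(items: &mut Vec<String>) -> usize {
    let mut total = 0;
    for _round in 0..2 {
        // Shared reborrow: uses `impl IntoIterator for &Vec<T>`, yields &String.
        for s in &*items {
            total += s.len();
        }
        // Mutable reborrow: uses `impl IntoIterator for &mut Vec<T>`, yields &mut String.
        for s in &mut *items {
            s.push('!');
        }
    }
    // `items` is still usable here because it was only reborrowed, never moved.
    total
}
```

Writing items.iter() and items.iter_mut() is equivalent and often reads better. Writing for s in items would move the reference itself and fail on the second round.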

Borrow mutable and immutable reference in the same block [duplicate]

Why does the call self.f2() in the following code trip the borrow checker? Isn't the else block in a different scope? This is quite a conundrum!
use std::str::Chars;
struct A;
impl A {
fn f2(&mut self) {}
fn f1(&mut self) -> Option<Chars> {
None
}
fn f3(&mut self) {
if let Some(x) = self.f1() {
} else {
self.f2()
}
}
}
fn main() {
let mut a = A;
}
error[E0499]: cannot borrow `*self` as mutable more than once at a time
--> src/main.rs:16:13
|
13 | if let Some(x) = self.f1() {
| ---- first mutable borrow occurs here
...
16 | self.f2()
| ^^^^ second mutable borrow occurs here
17 | }
| - first borrow ends here
Doesn't the scope of the borrow for self begin and end with the self.f1() call? Once the call from f1() has returned f1() is not using self anymore hence the borrow checker should not have any problem with the second borrow. Note the following code fails too...
// ...
if let Some(x) = self.f1() {
self.f2()
}
// ...
I think the second borrow should be fine here since f1 and f3 are not using self at the same time as f2.
I put together an example to show off the scoping rules here:
struct Foo {
a: i32,
}
impl Drop for Foo {
fn drop(&mut self) {
println!("Foo: {}", self.a);
}
}
fn generate_temporary(a: i32) -> Option<Foo> {
if a != 0 { Some(Foo { a: a }) } else { None }
}
fn main() {
{
println!("-- 0");
if let Some(foo) = generate_temporary(0) {
println!("Some Foo {}", foo.a);
} else {
println!("None");
}
println!("-- 1");
}
{
println!("-- 0");
if let Some(foo) = generate_temporary(1) {
println!("Some Foo {}", foo.a);
} else {
println!("None");
}
println!("-- 1");
}
{
println!("-- 0");
if let Some(Foo { a: 1 }) = generate_temporary(1) {
println!("Some Foo {}", 1);
} else {
println!("None");
}
println!("-- 1");
}
{
println!("-- 0");
if let Some(Foo { a: 2 }) = generate_temporary(1) {
println!("Some Foo {}", 1);
} else {
println!("None");
}
println!("-- 1");
}
}
This prints:
-- 0
None
-- 1
-- 0
Some Foo 1
Foo: 1
-- 1
-- 0
Some Foo 1
Foo: 1
-- 1
-- 0
None
Foo: 1
-- 1
In short, it seems that the expression in the if clause lives through both the if block and the else block.
On the one hand it is not surprising since it is indeed required to live longer than the if block, but on the other hand it does indeed prevent useful patterns.
If you prefer a visual explanation:
if let pattern = foo() {
if-block
} else {
else-block
}
desugars into:
{
let x = foo();
match x {
pattern => { if-block }
_ => { else-block }
}
}
while you would prefer that it desugars into:
let mut bypass = true;
{
let x = foo();
match x {
pattern => { if-block }
_ => { bypass = false; }
}
}
if !bypass {
else-block
}
You are not the first one being tripped by this, so this may be addressed at some point, despite changing the meaning of some code (guards, in particular).
It's annoying, but you can work around this by introducing an inner scope and changing the control flow a bit:
fn f3(&mut self) {
{
if let Some(x) = self.f1() {
// ...
return;
}
}
self.f2()
}
As pointed out in the comments, this works without the extra braces. This is because an if or if let expression has an implicit scope, and the borrow lasts only for that scope:
fn f3(&mut self) {
if let Some(x) = self.f1() {
// ...
return;
}
self.f2()
}
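A self-contained version of that early-return shape is sketched below. The fields text and fallback_calls, and the return value of f3, are additions made purely so the behaviour can be observed; they are not part of the original code.

```rust
use std::str::Chars;

struct A {
    text: String,
    fallback_calls: u32,
}

impl A {
    // Borrows self mutably and returns an iterator tied to that borrow.
    fn f1(&mut self) -> Option<Chars<'_>> {
        if self.text.is_empty() {
            None
        } else {
            Some(self.text.chars())
        }
    }

    // Needs its own mutable borrow of self.
    fn f2(&mut self) {
        self.fallback_calls += 1;
    }

    // Early return keeps the borrow from f1 inside the `if let` statement,
    // so f2 can borrow self again afterwards.
    fn f3(&mut self) -> usize {
        if let Some(chars) = self.f1() {
            return chars.count();
        }
        self.f2();
        0
    }
}
```

With text set to "abc", f3 returns 3 and never calls f2. With an empty string it falls through, calls f2 once, and returns 0.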
Here's a log of an IRC chat between Sandeep Datta and mbrubeck:
mbrubeck: std::str::Chars contains a borrowed reference to the string that created it. The full type name is Chars<'a>. So f1(&mut self) -> Option<Chars> without elision is f1(&'a mut self) -> Option<Chars<'a>> which means that self remains borrowed as long as the return value from f1 is in scope.
Sandeep Datta: Can I use 'b for self and 'a for Chars to avoid this problem?
mbrubeck: Not if you are actually returning an iterator over something from self. Though if you can make a function from &self -> Chars (instead of &mut self -> Chars) that would fix the issue.
As of Rust 2018, available in Rust 1.31, the original code will work as-is. This is because Rust 2018 enables non-lexical lifetimes.
A mutable reference is a very strong guarantee: that there's only one pointer to a particular memory location. Since you've already had one &mut borrow, you can't also have a second. That would introduce a data race in a multithreaded context, and iterator invalidation and other similar issues in a single-threaded context.
At the time of writing (before non-lexical lifetimes), borrows are based on lexical scope, and so the first borrow lasts until the end of the function, period. Eventually, we hope to relax this restriction, but it will take some work.
Here is how you can get rid of the spurious errors. I am new to Rust so there may be serious errors in the following explanation.
use std::str::Chars;
struct A<'a> {
chars: Chars<'a>,
}
The 'a here is a lifetime parameter (just like template parameters in C++). Types can be parameterised by lifetimes in Rust.
The Chars type also takes a lifetime parameter. What this implies is that the Chars type probably has a member element which needs a lifetime parameter. Lifetime parameters only make sense on references (since lifetime here actually means "lifetime of a borrow").
We know that Chars needs to keep a reference to the string from which it was created, 'a will probably be used to denote the source string's lifetime.
Here we simply supply 'a as the lifetime parameter to Chars telling the Rust compiler that the lifetime of Chars is the same as the lifetime of the struct A. IMO "lifetime 'a of type A" should be read as "lifetime 'a of the references contained in the struct A".
I think the struct implementation can be parameterised independently from the struct itself hence we need to repeat the parameters with the impl keyword. Here we bind the name 'a to the lifetime of the struct A.
impl<'a> A<'a> {
The name 'b is introduced in the context of the function f2. Here it is used to bind with the lifetime of the reference &mut self.
fn f2<'b>(&'b mut self) {}
The name 'b is introduced in the context of the function f1. This 'b does not have a direct relationship with the 'b introduced by f2 above.
Here it is used to bind with the lifetime of the reference &mut self. Needless to say this reference also does not have any relationship with the &mut self in the previous function, this is a new independent borrow of self.
Had we not used explicit lifetime annotation here Rust would have used its lifetime elision rules to arrive at the following function signature...
//fn f1<'a>(&'a mut self) -> Option<Chars<'a>>
As you can see, this binds the lifetime of the &mut self reference to the lifetime of the Chars object being returned from this function (this Chars object need not be the same as self.chars). This is absurd, since the returned Chars will outlive the &mut self reference. Hence we need to separate the two lifetimes as follows...
fn f1<'b>(&'b mut self) -> Option<Chars<'a>> {
self.chars.next();
Remember &mut self is a borrow of self and anything referred to by &mut self is also a borrow. Hence we cannot return Some(self.chars) here. self.chars is not ours to give (Error: cannot move out of borrowed content.).
We need to create a clone of self.chars so that it can be given out.
Some(self.chars.clone())
Note here the returned Chars has the same lifetime as the struct A.
And now here is f3 unchanged and without compilation errors!
fn f3<'b>(&'b mut self) {
if let Some(x) = self.f1() { //This is ok now
} else {
self.f2() //This is also ok now
}
}
The main function just for completeness...
fn main() {
let mut a = A { chars:"abc".chars() };
a.f3();
for c in a.chars {
print!("{}", c);
}
}
I have updated the code to make the lifetime relationships clearer.
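Since the pieces above are interleaved with commentary, here they are assembled into one block. The only changes are the _x binding (to avoid an unused-variable warning) and the helper remaining_after_f3, which is a hypothetical addition to make the effect of f3 visible.

```rust
use std::str::Chars;

struct A<'a> {
    chars: Chars<'a>,
}

impl<'a> A<'a> {
    // 'b is the lifetime of this particular borrow of self.
    fn f2<'b>(&'b mut self) {}

    // The returned Chars is tied to 'a (the source string),
    // not to 'b (the short borrow of self).
    fn f1<'b>(&'b mut self) -> Option<Chars<'a>> {
        self.chars.next();
        // self.chars cannot be moved out of a borrow, so hand out a clone.
        Some(self.chars.clone())
    }

    fn f3<'b>(&'b mut self) {
        if let Some(_x) = self.f1() {
            // ok: the borrow taken by f1 has already ended
        } else {
            self.f2() // also ok
        }
    }
}

// Hypothetical helper: runs f3 once and returns whatever is left in the iterator.
fn remaining_after_f3(text: &str) -> String {
    let mut a = A { chars: text.chars() };
    a.f3();
    a.chars.collect()
}
```

Because f1 advances self.chars by one character before cloning it, remaining_after_f3("abc") yields "bc".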
