Deriving std::hash::Hash for enums [duplicate] - enums

I would like to use a HashSet as the key to a HashMap. Is this possible?
use std::collections::{HashMap, HashSet};
fn main() {
let hmap: HashMap<HashSet<usize>, String> = HashMap::new();
}
gives the following error:
error[E0277]: the trait bound `std::collections::HashSet<usize>: std::hash::Hash` is not satisfied
--> src/main.rs:4:49
|
4 | let hmap: HashMap<HashSet<usize>, String> = HashMap::new();
| ^^^^^^^^^^^^ the trait `std::hash::Hash` is not implemented for `std::collections::HashSet<usize>`
|
= note: required by `<std::collections::HashMap<K, V>>::new`

To make something the key of a HashMap, you need to satisfy 3 traits:
Hash — How do you calculate a hash value for the type?
PartialEq — How do you decide if two instances of a type are the same?
Eq — Can you guarantee that the equality is reflexive, symmetric, and transitive? This requires PartialEq.
This is based on the definition of HashMap:
impl<K: Hash + Eq, V> HashMap<K, V, RandomState> {
pub fn new() -> HashMap<K, V, RandomState> { /* ... */ }
}
Checking out the docs for HashSet, you can see what traits it implements (listed at the bottom of the page).
There isn't an implementation of Hash for HashSet, so it cannot be used as a key in a HashMap. That being said, if you have a rational way of computing the hash of a HashSet, then you could create a "newtype" around the HashSet and implement these three traits on it.
Here's an example for the "newtype":
use std::{
collections::{HashMap, HashSet},
hash::{Hash, Hasher},
};
struct Wrapper<T>(HashSet<T>);
impl<T> PartialEq for Wrapper<T>
where
T: Eq + Hash,
{
fn eq(&self, other: &Wrapper<T>) -> bool {
self.0 == other.0
}
}
impl<T> Eq for Wrapper<T> where T: Eq + Hash {}
impl<T> Hash for Wrapper<T> {
fn hash<H>(&self, _state: &mut H)
where
H: Hasher,
{
// do something smart here!!!
}
}
fn main() {
let hmap: HashMap<Wrapper<u32>, String> = HashMap::new();
}
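One possible way to fill in the hash method, offered as a sketch rather than a definitive answer: a HashSet has no stable iteration order, so each element is hashed on its own and the results are combined with an order-independent operation. The names SetKey, set_key, and lookup_demo, and the choice of wrapping addition over DefaultHasher, are assumptions made for illustration and are not part of the original answer.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};

// Hypothetical newtype; equality is delegated to the inner HashSet.
#[derive(PartialEq, Eq)]
struct SetKey<T: Eq + Hash>(HashSet<T>);

impl<T: Eq + Hash> Hash for SetKey<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash every element separately, then combine with wrapping addition.
        // Addition is commutative, so iteration order does not matter and
        // equal sets always feed the same value into `state`.
        let mut combined: u64 = 0;
        for item in &self.0 {
            let mut element_hasher = DefaultHasher::new();
            item.hash(&mut element_hasher);
            combined = combined.wrapping_add(element_hasher.finish());
        }
        state.write_usize(self.0.len());
        state.write_u64(combined);
    }
}

// Illustrative helper: build a key from a slice of numbers.
fn set_key(items: &[usize]) -> SetKey<usize> {
    SetKey(items.iter().copied().collect())
}

// Insert under one ordering of the elements, look up under another.
fn lookup_demo() -> Option<String> {
    let mut hmap: HashMap<SetKey<usize>, String> = HashMap::new();
    hmap.insert(set_key(&[1, 2, 3]), "one-two-three".to_string());
    hmap.get(&set_key(&[3, 2, 1])).cloned()
}
```

If the element type is also Ord, a BTreeSet may be simpler, since BTreeSet already implements Hash and can be used as a key directly.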

Related

How do I filter a vector of an enum of different structs using a common field?

I found you can create a vector of different types of structs using an enum. When filtering the vector on a common field, such as id, the compiler doesn't know the type while iterating:
use chrono::{DateTime, Utc}; // 0.4.19
use serde::{Serialize, Deserialize}; // 1.0.126
#[derive(Deserialize, Debug, Serialize, Clone)]
pub enum TransactionsEnum {
TransactionOrderA(TransactionOrderA),
TransactionOrderB(TransactionOrderB),
}
#[derive(Deserialize, Debug, Serialize, Clone)]
pub struct TransactionOrderA {
pub id: i64,
pub accountID: String,
}
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct TransactionOrderB {
pub id: i64,
pub time: DateTime<Utc>,
}
fn transactions_filter(
transactions_vector: Vec<TransactionsEnum>,
x: i64,
) -> Vec<TransactionsEnum> {
transactions_vector
.into_iter()
.filter(|e| e.id >= x)
.collect()
}
error[E0609]: no field `id` on type `&TransactionsEnum`
--> src/lib.rs:27:23
|
27 | .filter(|e| e.id >= x)
| ^^
The questions "Sharing a common value in all enum values" and "Is there a way to directly access a field value in an enum struct without pattern matching?" indirectly answer my question, but the answer provided here helped me understand why a match statement is necessary.
Those are not a common field; they are completely unrelated fields. The fact that they share a name is, as far as the Rust compiler is concerned, an insignificant coincidence. You need to use pattern matching to get the field from either case:
impl TransactionsEnum {
pub fn id(&self) -> i64 {
match self {
TransactionsEnum::TransactionOrderA(value) => value.id,
TransactionsEnum::TransactionOrderB(value) => value.id,
}
}
}
transactions_vector
.into_iter()
.filter(|e| e.id() >= x) // Note the parentheses since we're calling a function now
.collect()
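For reference, here is a stripped-down version of the same idea that compiles without serde or chrono. The type and field names (Transaction, OrderA, OrderB, account_id, note, filter_by_min_id) are simplified stand-ins chosen for this sketch, not the question's exact definitions. It also shows that the two match arms can be merged into a single or-pattern, because both bind id with the same type.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct OrderA {
    pub id: i64,
    pub account_id: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct OrderB {
    pub id: i64,
    pub note: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Transaction {
    A(OrderA),
    B(OrderB),
}

impl Transaction {
    // One accessor hides the pattern matching from every caller.
    pub fn id(&self) -> i64 {
        match self {
            // Both alternatives bind `id` as `&i64`, so they can share an arm.
            Transaction::A(OrderA { id, .. }) | Transaction::B(OrderB { id, .. }) => *id,
        }
    }
}

// Keep only the transactions whose id is at least `min_id`.
pub fn filter_by_min_id(transactions: Vec<Transaction>, min_id: i64) -> Vec<Transaction> {
    transactions
        .into_iter()
        .filter(|t| t.id() >= min_id)
        .collect()
}
```

Adding a new variant later makes the match non-exhaustive, so the compiler points at the accessor that needs updating.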

How to merge two elements of a list in Rust?

I've been working to try to optimize a section of my code and I've hit an area where I think I could use some community wisdom. I'm essentially trying to merge two elements of a list without moving the elements in the list (via two removes and an insert), because as far as I can tell in Rust doing so to a vector costs O(n) time.
Take a glance at the code that captures the essence of my problem:
use std::cell::RefCell;
use std::rc::Rc;
use std::collections::BinaryHeap;
#[derive(PartialOrd, Ord, PartialEq, Eq)]
pub struct Num {
pub num: usize
}
impl Num {
pub fn new(num: usize) -> Num {
Num {
num
}
}
}
fn main() {
let mut a = vec![];
for i in 0..10 {
a.push(Rc::new(RefCell::new(Num::new(i))));
}
let mut b = BinaryHeap::with_capacity(a.len());
for i in 0..a.len() - 1 {
b.push((i, Rc::clone(&a[i]), Rc::clone(&a[i + 1])));
}
drop(a);
while !b.is_empty() {
let c = b.pop().unwrap();
let first = c.1;
let next = c.2;
println!("c: c.0: {}", c.0);
println!("c: first.num before: {}", RefCell::borrow(&first).num);
println!("c: next.num before: {}", RefCell::borrow(&next).num);
// Here I want to replace the two structs referenced in first and next
// with a single new struct that first and next both point to.
// e.g. first -> new_num <- next
println!("c: first.num after: {}", RefCell::borrow(&first).num);
println!("c: next.num after: {}", RefCell::borrow(&next).num);
assert_eq!(RefCell::borrow(&first).num, RefCell::borrow(&next).num);
}
}
I want to be able to take two elements within a list, merge them into one pseudo-element, where the two previous "elements" are actually just pointers to the same new element. However, I'm having trouble finding a way to do this without copying memory or structures around in the list.
My understanding of your requirement is that you need the Vec to be able to hold items that are either a value or a reference to another item, while keeping the structure similar to what you have presented.
We can model that by changing your item type to an enum, which can hold either a value or a reference to another item:
pub enum Num {
Raw(usize),
Ref(Rc<RefCell<Num>>),
}
And add methods to include abstractions for constructing the different variants and for accessing the underlying numeric value:
impl Num {
pub fn new(num: usize) -> Num {
Num::Raw(num)
}
pub fn new_ref(other: Rc<RefCell<Num>>) -> Num {
Num::Ref(other)
}
pub fn get_num(&self) -> usize {
match &self {
Num::Raw(n) => *n,
Num::Ref(r) => r.borrow().get_num()
}
}
}
If you create a new value like this:
let new_num = Rc::new(RefCell::new(Num::new(100)));
You can reference it in other nodes like this:
*first.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
*next.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
The full code then looks like this:
use std::cell::RefCell;
use std::rc::Rc;
use std::collections::BinaryHeap;
#[derive(PartialOrd, Ord, PartialEq, Eq)]
pub enum Num {
Raw(usize),
Ref(Rc<RefCell<Num>>),
}
impl Num {
pub fn new(num: usize) -> Num {
Num::Raw(num)
}
pub fn new_ref(other: Rc<RefCell<Num>>) -> Num {
Num::Ref(other)
}
pub fn get_num(&self) -> usize {
match &self {
Num::Raw(n) => *n,
Num::Ref(r) => r.borrow().get_num()
}
}
}
fn main() {
let mut a = vec![];
for i in 0..10 {
a.push(Rc::new(RefCell::new(Num::new(i))));
}
let mut b = BinaryHeap::with_capacity(a.len());
for i in 0..a.len() - 1 {
b.push((i, Rc::clone(&a[i]), Rc::clone(&a[i + 1])));
}
drop(a);
let new_num = Rc::new(RefCell::new(Num::new(100)));
while !b.is_empty() {
let c = b.pop().unwrap();
let first = c.1;
let next = c.2;
println!("c: c.0: {}", c.0);
println!("c: first.num before: {}", RefCell::borrow(&first).get_num());
println!("c: next.num before: {}", RefCell::borrow(&next).get_num());
*first.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
*next.borrow_mut() = Num::new_ref(Rc::clone(&new_num));
println!("c: first.num after: {}", RefCell::borrow(&first).get_num());
println!("c: next.num after: {}", RefCell::borrow(&next).get_num());
assert_eq!(RefCell::borrow(&first).get_num(), RefCell::borrow(&next).get_num());
}
}
As for whether this will prove to be better performance than a different approach, it's hard to say. Your starting point seems quite complicated, and if you can simplify that and use a different underlying data structure, then you should try it and benchmark. I have often been surprised at the actual speed of O(n) operations on a Vec, even when the size is around 1000 items or more.

"Expected type parameter, found reference to type parameter" when implementing a generic cache struct

In the Closures chapter of the second edition of The Rust Programming Language, the writer implements a Cache struct and leaves it with a few problems for the reader to fix up, such as:
Accepting generic parameters and return values on the closure function
Allowing more than one value to be cached
I've attempted to fix those problems but I am quite stuck and can't make it work.
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
calculation: T,
results: HashMap<X, Y>,
}
impl<T, X, Y> Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
fn new(calculation: T) -> Cacher<T, X, Y> {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value<'a>(&'a mut self, arg: &'a X) -> &'a Y {
match self.results.get(arg) {
Some(v) => v,
None => {
let res = (self.calculation)(arg);
self.results.insert(*arg, res);
res
}
}
}
}
Where T is the closure function type, X is the argument type and Y is the return value type.
The error I get:
error[E0308]: mismatched types
--> src/main.rs:30:43
|
30 | self.results.insert(*arg, res);
| ^^^ expected type parameter, found &Y
|
= note: expected type `Y`
found type `&Y`
I understand this, but I can't think of an elegant solution for the whole ordeal.
You've stated that your closure returns a reference:
T: Fn(&X) -> &Y,
but then you are trying to store something that isn't a reference:
results: HashMap<X, Y>,
This is fundamentally incompatible; you need to unify the types.
In many cases, there's no reason to have a reference to a generic type because a generic type can already be a reference. Additionally, forcing the closure to return a reference means that a closure like |_| 42 would not be valid. Because of that, I'd say you should return and store the value type.
Next you need to apply similar logic to value, as it needs to take the argument by value in order to store it. Additionally, remove all the lifetimes from it as elision does the right thing: fn value(&mut self, arg: X) -> &Y.
Once you've straightened that out, apply the knowledge from "How to lookup from and insert into a HashMap efficiently?":
fn value(&mut self, arg: X) -> &Y {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
Round it off with some tests that assert it's only called once, and you are good to go. Note that we had to make decisions along the way, but they aren't the only ones we could have chosen. For example, we could have made it so that the cached value is cloned when returned.
use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::hash::Hash;
struct Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
calculation: F,
results: HashMap<I, O>,
}
impl<F, I, O> Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
fn new(calculation: F) -> Self {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value(&mut self, arg: I) -> &O {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
}
#[test]
fn called_once() {
use std::sync::atomic::{AtomicUsize, Ordering};
let calls = AtomicUsize::new(0);
let mut c = Cacher::new(|&()| {
calls.fetch_add(1, Ordering::SeqCst);
()
});
c.value(());
c.value(());
c.value(());
assert_eq!(1, calls.load(Ordering::SeqCst));
}
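The alternative mentioned above, returning a clone of the cached value instead of a reference, could look roughly like this. It is a sketch under the assumption that the output type implements Clone; the name CloningCacher is made up for this example.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical variant of the cacher that hands out owned copies.
struct CloningCacher<F, I, O>
where
    F: Fn(&I) -> O,
    I: Eq + Hash,
    O: Clone,
{
    calculation: F,
    results: HashMap<I, O>,
}

impl<F, I, O> CloningCacher<F, I, O>
where
    F: Fn(&I) -> O,
    I: Eq + Hash,
    O: Clone,
{
    fn new(calculation: F) -> Self {
        CloningCacher {
            calculation,
            results: HashMap::new(),
        }
    }

    // Returns an owned value, so the caller does not keep the cache borrowed.
    fn value(&mut self, arg: I) -> O {
        if let Some(cached) = self.results.get(&arg) {
            return cached.clone();
        }
        let res = (self.calculation)(&arg);
        self.results.insert(arg, res.clone());
        res
    }
}
```

The trade-off is one extra clone per call in exchange for a return value that is not tied to the lifetime of the cacher.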

How to match enum variants dynamically in Rust? [duplicate]

I have an enum with the following structure:
enum Expression {
Add(Add),
Mul(Mul),
Var(Var),
Coeff(Coeff)
}
where the 'members' of each variant are structs.
Now I want to compare if two enums have the same variant. So if I have
let a = Expression::Add({something});
let b = Expression::Add({somethingelse});
cmpvariant(a, b) should be true. I can imagine a simple double match code that goes through all the options for both enum instances. However, I am looking for a fancier solution, if it exists. If not, is there overhead for the double match? I imagine that internally I am just comparing two ints (ideally).
As of Rust 1.21.0, you can use std::mem::discriminant:
fn variant_eq(a: &Op, b: &Op) -> bool {
std::mem::discriminant(a) == std::mem::discriminant(b)
}
This is nice because it can be very generic:
fn variant_eq<T>(a: &T, b: &T) -> bool {
std::mem::discriminant(a) == std::mem::discriminant(b)
}
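A self-contained sketch of the discriminant approach, reusing the Op, Add, and Sub types that appear in the match-based example in this answer. Note that the generic version also compiles for types that are not enums; the standard library documents the result in that case as unspecified, so it is only meaningful for enums.

```rust
use std::mem::discriminant;

struct Add(u8);
struct Sub(u8);

enum Op {
    Add(Add),
    Sub(Sub),
}

// True when both values are the same variant, whatever data they carry.
fn variant_eq<T>(a: &T, b: &T) -> bool {
    discriminant(a) == discriminant(b)
}

// Small demonstration: same variant with different payloads, then
// different variants with the same payload.
fn demo() -> (bool, bool) {
    let a = Op::Add(Add(42));
    let c = Op::Add(Add(21));
    let d = Op::Sub(Sub(42));
    (variant_eq(&a, &c), variant_eq(&a, &d))
}
```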
Before Rust 1.21.0, I'd match on the tuple of both arguments and ignore the contents of the tuple with _ or ..:
struct Add(u8);
struct Sub(u8);
enum Op {
Add(Add),
Sub(Sub),
}
fn variant_eq(a: &Op, b: &Op) -> bool {
match (a, b) {
(&Op::Add(..), &Op::Add(..)) => true,
(&Op::Sub(..), &Op::Sub(..)) => true,
_ => false,
}
}
fn main() {
let a = Op::Add(Add(42));
let b = Op::Add(Add(42));
let c = Op::Add(Add(21));
let d = Op::Sub(Sub(42));
println!("{}", variant_eq(&a, &b));
println!("{}", variant_eq(&a, &c));
println!("{}", variant_eq(&a, &d));
}
I took the liberty of renaming the function though, as the components of enums are called variants, and really you are testing to see if they are equal, not comparing them (which is usually used for ordering / sorting).
For performance, let's look at the LLVM IR generated by Rust 1.60.0 in release mode (with variant_eq marked as #[inline(never)]). The Rust Playground can show you this:
; playground::variant_eq
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind nonlazybind readonly uwtable willreturn
define internal fastcc noundef zeroext i1 @_ZN10playground10variant_eq17hc64d59c7864eb861E(i8 %a.0.0.val, i8 %b.0.0.val) unnamed_addr #2 {
start:
%_8.not = icmp eq i8 %a.0.0.val, %b.0.0.val
ret i1 %_8.not
}
This code directly compares the variant discriminant.
If you wanted to have a macro to generate the function, something like this might be a good start:
struct Add(u8);
struct Sub(u8);
macro_rules! foo {
(enum $name:ident {
$($vname:ident($inner:ty),)*
}) => {
enum $name {
$($vname($inner),)*
}
impl $name {
fn variant_eq(&self, b: &Self) -> bool {
match (self, b) {
$((&$name::$vname(..), &$name::$vname(..)) => true,)*
_ => false,
}
}
}
}
}
foo! {
enum Op {
Add(Add),
Sub(Sub),
}
}
fn main() {
let a = Op::Add(Add(42));
let b = Op::Add(Add(42));
let c = Op::Add(Add(21));
let d = Op::Sub(Sub(42));
println!("{}", Op::variant_eq(&a, &b));
println!("{}", Op::variant_eq(&a, &c));
println!("{}", Op::variant_eq(&a, &d));
}
The macro does have limitations though - every variant needs to hold exactly one value. Supporting unit variants, variants with more than one type, struct variants, visibility, etc. is hard. Perhaps a procedural macro would make it a bit easier.

Using "moved" values in a function

I want to learn Rust and am making a small program to deal with sound ques. I have a function with this signature:
fn edit_show(mut show: &mut Vec<Que>) {
show.sort_by(|a, b| que_ordering(&a.id, &b.id));
loop {
println!("Current ques");
for l in show {
println!("{}", que_to_line(&l));
}
}
}
I get an error:
use of moved value: 'show'
I cannot find anything on how to fix this. This seems like an odd error for sort since (I assume) if I was to do this in the main function where I pass in the value which seems quite useless.
Solution
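Your problem is in this line:
for l in show {
...
}
Because show is a &mut Vec<Que>, this moves the mutable reference itself into the for loop, so show can no longer be used on the next pass of the outer loop. If you want to just borrow its elements, reborrow instead:
for l in show.iter() {
...
}
Writing for l in &*show is equivalent. If you want to borrow the elements mutably, write for l in show.iter_mut() or for l in &mut *show. If show were an owned Vec<Que> rather than a reference, for l in &show and for l in &mut show would be the forms to use.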
Explanation
The Rust for loop expects a type that implements IntoIterator. First thing to note: IntoIterator is implemented for every Iterator. See:
impl<I> IntoIterator for I where I: Iterator
Now let's look at the Vec impls:
impl<T> IntoIterator for Vec<T> {
type Item = T;
...
}
impl<'a, T> IntoIterator for &'a Vec<T> {
type Item = &'a T;
...
}
impl<'a, T> IntoIterator for &'a mut Vec<T> {
type Item = &'a mut T;
...
}
Here you can see that it's implemented for the Vec directly, but also for references to it. I hope these three impl blocks speak for themselves.
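To make the three impls concrete, here is a small sketch. The function names and the use of String elements are assumptions made for illustration. The first function mirrors the question's situation, where the vector arrives as a &mut Vec parameter and has to stay usable across repeated loops; the second walks one owned vector through all three impls in turn.

```rust
// Hypothetical helper: `show` is a `&mut Vec<_>` parameter, as in the question.
fn count_lines_twice(show: &mut Vec<String>) -> usize {
    let mut seen = 0;
    for _round in 0..2 {
        // `show.iter()` reborrows the vector, so `show` itself is not moved
        // and is still available on the second round.
        for _line in show.iter() {
            seen += 1;
        }
    }
    seen
}

// Uses each of the three IntoIterator impls on one owned vector.
fn three_ways() -> (usize, Vec<String>) {
    let mut v = vec![String::from("a"), String::from("bc")];

    // &Vec<T>: items are `&String`, the vector is only read.
    let mut total_len = 0;
    for s in &v {
        total_len += s.len();
    }

    // &mut Vec<T>: items are `&mut String`, elements can be changed in place.
    for s in &mut v {
        s.push('!');
    }

    // Vec<T>: items are `String`; the vector is consumed by this loop.
    let mut moved = Vec::new();
    for s in v {
        moved.push(s);
    }

    (total_len, moved)
}
```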

Resources