Am I using inheritance, borrowed pointers, and explicit lifetime annotations correctly? - memory-management

I'm learning Rust, coming from an almost exclusively garbage collected background. So I want to make sure I'm getting off on the right foot as I write my first program.
The tutorial on the Rust site said I should be dubious of using pointers that aren't &. With that in mind, here's what I ended up with in my little class hierarchy (names changed to protect the innocent). The gist is, I have two different entities, let's say Derived1 and Derived2, which share some behavior and structure. I put the common data into a Foo struct and the common behavior into a Fooish trait:
struct Foo<'a> {
name: &'a str,
an_array: &'a [AnEnumType],
/* etc. */
}
struct Derived1<'a> {
foo: &'a Foo<'a>,
other_stuff: &'a str,
}
struct Derived2<'a> {
foo: &'a Foo<'a>,
/* etc. */
}
trait Fooish<'a> {
fn new(foo: &'a Foo<'a>) -> Self;
/* etc. */
}
impl<'a> Fooish<'a> for Derived1<'a> {
fn new(foo: &'a Foo<'a>) -> Derived1<'a> {
Derived1 { foo: foo, other_stuff: "bar" }
}
/* etc. */
}
/* and so forth for Derived2 */
My questions:
1. Am I "doing inheritance in Rust" more-or-less idiomatically?
2. Is it correct to use & pointers as struct fields here (such as for string data, and array fields whose sizes vary from instance to instance)? What about for Foo in Derived?
3. If the answer to #2 is 'yes', then I need explicit lifetime annotations, right?
4. Is it common to have so many lifetime annotations everywhere as in my example?
Thanks!

I'd say that this is not idiomatic at all. Sometimes a task does require stepping away from idiomatic approaches, but it is not clear that this is such a case.
I'd suggest you refrain from using ideas from OO languages that operate in terms of classes and inheritance; they don't carry over well to Rust. Instead, think of your data in terms of ownership and ask yourself: should the given struct own the data? In other words, does the data naturally belong to the struct, or can it be used independently somewhere else?
If you apply this reasoning to your structures:
struct Foo<'a> {
name: &'a str,
an_array: &'a [AnEnumType],
/* etc. */
}
struct Derived1<'a> {
foo: &'a Foo<'a>,
other_stuff: &'a str,
}
struct Derived2<'a> {
foo: &'a Foo<'a>,
/* etc. */
}
you will see that it doesn't really make sense to encode inheritance using references. If Derived1 holds a reference to Foo, that implies the Foo is created somewhere else and Derived1 is only borrowing it for a while. That may be what you really want, but it is not how inheritance works: inherited structures/classes usually contain their "parent" contents inside them. In other words, they own their parent data, so this is a more appropriate structure:
struct Foo<'a> {
name: &'a str,
an_array: &'a [AnEnumType],
/* etc. */
}
struct Derived1<'a> {
foo: Foo<'a>,
other_stuff: &'a str,
}
struct Derived2<'a> {
foo: Foo<'a>,
/* etc. */
}
Note that the Derived* structures now contain Foo directly.
As for strings and arrays (string slices and array slices, in fact): yes, if you want to hold them in structures you do have to use lifetime annotations. However, this does not come up that often, and, again, designing structures around ownership usually helps you decide whether a field should be a slice or a dynamically allocated String or Vec. There is a nice tutorial on strings, which explains, among other things, when you need owned strings and when you need slices. The same reasoning applies to &[T]/Vec<T>. In short, if your struct owns the string/array, you have to use String/Vec. Otherwise, consider using slices.
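To make the ownership-based design concrete, here is a minimal sketch in which every struct owns its data, so no lifetime annotations are needed. The variants of AnEnumType, the accessor methods foo() and name(), and the values in main are assumptions made for illustration; only the struct and trait names come from the question.

```rust
// Hypothetical stand-in for the question's enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AnEnumType {
    A,
    B,
}

// Common data, fully owned: String and Vec instead of &str and &[T].
struct Foo {
    name: String,
    an_array: Vec<AnEnumType>,
}

// The "derived" struct owns its Foo instead of borrowing it.
struct Derived1 {
    foo: Foo,
    other_stuff: String,
}

// Shared behavior lives in the trait; default methods play the role
// of "inherited" methods.
trait Fooish {
    fn new(foo: Foo) -> Self;
    fn foo(&self) -> &Foo;
    fn name(&self) -> &str {
        &self.foo().name
    }
}

impl Fooish for Derived1 {
    fn new(foo: Foo) -> Derived1 {
        Derived1 { foo, other_stuff: String::from("bar") }
    }
    fn foo(&self) -> &Foo {
        &self.foo
    }
}

fn main() {
    let foo = Foo {
        name: String::from("first"),
        an_array: vec![AnEnumType::A, AnEnumType::B],
    };
    let d = Derived1::new(foo);
    println!("{} / {} / {:?}", d.name(), d.other_stuff, d.foo().an_array);
}
```

Because Derived1 owns everything it contains, it can be returned from functions and stored in collections without tying it to the scope of some other value.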

Related

generic callback with data

There is already a very popular question about this topic, but I don't fully understand the answer.
The goal is:
I need a list (read a Vec) of "function pointers" that modify data stored elsewhere in a program. The simplest example I can come up with are callbacks to be called when a key is pressed. So when any key is pressed, all functions passed to the object will be called in some order.
Reading the answer, it is not clear to me how I would be able to make such a list. It sounds like I would need to restrict the type of the callback to something known, else I don't know how you would be able to make an array of it.
It's also not clear to me how to store the data pointers/references.
Say I have
struct Processor<CB>
where
CB: FnMut(),
{
callback: CB,
}
Like the answer suggests, I can't make an array of processors, can I? Each Processor is technically a different type depending on the generic instantiation.
Indeed, you can't make a vector of processors. Each closure has its own distinct type that cannot be named. What you want instead are trait objects, which give you dynamic dispatch of the callback calls. Since trait objects are not Sized, you'd probably want to put them in a Box. The final type is Vec<Box<dyn FnMut()>>.
fn add_callback(list: &mut Vec<Box<dyn FnMut()>>, cb: impl FnMut() + 'static) {
list.push(Box::new(cb))
}
fn run_callback(list: &mut [Box<dyn FnMut()>]) {
for cb in list {
cb()
}
}
see the playground
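With the 'static bound used above, one way to still let several callbacks modify data stored elsewhere is shared ownership with interior mutability. The following is only a sketch under that assumption: the Rc<Cell<i32>> counter and the helper count_key_presses are hypothetical and not part of the original answer.

```rust
use std::cell::Cell;
use std::rc::Rc;

fn add_callback(list: &mut Vec<Box<dyn FnMut()>>, cb: impl FnMut() + 'static) {
    list.push(Box::new(cb));
}

fn run_callbacks(list: &mut [Box<dyn FnMut()>]) {
    for cb in list {
        cb();
    }
}

// Hypothetical demo: simulate `presses` key presses and return the
// final value of a counter shared by two callbacks.
fn count_key_presses(presses: usize) -> i32 {
    let counter = Rc::new(Cell::new(0));
    let mut list: Vec<Box<dyn FnMut()>> = Vec::new();

    // Each closure owns its own Rc handle, so it satisfies 'static.
    let c1 = Rc::clone(&counter);
    add_callback(&mut list, move || c1.set(c1.get() + 1));
    let c2 = Rc::clone(&counter);
    add_callback(&mut list, move || c2.set(c2.get() + 10));

    for _ in 0..presses {
        run_callbacks(&mut list);
    }
    counter.get()
}

fn main() {
    println!("{}", count_key_presses(2)); // each press adds 1 + 10
}
```

The trade-off is that the borrow rules for the shared value are no longer checked at compile time in the usual way; Cell sidesteps this for Copy values, while RefCell would check at runtime.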
If you do it like that, however, you might run into lifetime issues, because you're forced either to move everything into the closures or to only modify values that live for 'static, which isn't very convenient. Instead, the following might be better:
#[derive(Default)]
struct Producer<'a> {
list: Vec<Box<dyn FnMut() + 'a>>,
}
impl<'a> Producer<'a> {
fn add_callback(&mut self, cb: impl FnMut() + 'a) {
self.list.push(Box::new(cb))
}
fn run_callbacks(&mut self) {
for cb in &mut self.list {
cb()
}
}
}
fn callback_1() {
println!("Hello!");
}
fn main() {
let mut modified = 0;
let mut prod = Producer::default();
prod.add_callback(callback_1);
prod.add_callback(
|| {
modified += 1;
println!("World!");
}
);
prod.run_callbacks();
drop(prod);
println!("{}", modified);
}
see the playground
Just a few things to note:
You have to drop the producer manually. Otherwise Rust will complain: the producer would only be dropped at the end of the scope, yet it holds (through the closure) an exclusive reference to modified, which conflicts with reading modified afterwards.
Currently, run_callbacks takes &mut self because the callbacks are only required to be FnMut. If you wanted it to take &self, you'd need to replace FnMut with Fn, which means the callbacks could no longer mutate their captured state directly; they could only change outside data through interior mutability (Cell, RefCell, Mutex and so on).
Yes, all closures have different types, so if you want a vec of different closures you will need to make them trait objects. This can be achieved with Box<dyn Trait> (or any smart pointer). Box<dyn FnMut()> implements FnMut(), so you can have Processor<Box<dyn FnMut()>>, make a vec of them, and call the callbacks on them: playground
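Since the playground link did not survive, here is a minimal sketch of the Processor<Box<dyn FnMut()>> idea. The process method, the boxed helper, the demo function and the RefCell-based log are assumptions added for illustration; only the Processor struct comes from the question.

```rust
use std::cell::RefCell;

struct Processor<CB>
where
    CB: FnMut(),
{
    callback: CB,
}

impl<CB: FnMut()> Processor<CB> {
    fn process(&mut self) {
        (self.callback)();
    }
}

// Hypothetical helper: erase the closure type so that every processor
// has the same type, Processor<Box<dyn FnMut() + 'a>>.
fn boxed<'a>(cb: impl FnMut() + 'a) -> Processor<Box<dyn FnMut() + 'a>> {
    Processor { callback: Box::new(cb) }
}

// Runs two different closures stored in one Vec and returns what they logged.
fn demo() -> Vec<&'static str> {
    let log: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    {
        let mut processors = vec![
            boxed(|| log.borrow_mut().push("first")),
            boxed(|| log.borrow_mut().push("second")),
        ];
        for p in processors.iter_mut() {
            p.process();
        }
    } // processors (and their borrows of `log`) are dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", demo());
}
```

The RefCell is only there so that both closures can write to the same log; a single closure could simply capture a &mut reference instead.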

Why can method taking immutable `&self` modify data in field with mutex?

Consider the following code (also on the playground):
use std::{collections::HashMap, sync::Mutex};
struct MyStruct {
dummy_map: Mutex<HashMap<i64, i64>>,
}
impl MyStruct {
pub fn new() -> Self {
MyStruct {
dummy_map: Mutex::new(Default::default()),
}
}
pub fn insert(&self, key: i64, val: i64) { // <-- immutable &self
self.dummy_map.lock().unwrap().insert(key, val); // <-- insert in dummy_map
}
}
fn main() {
let s = MyStruct::new();
let key = 1;
s.insert(key, 1);
assert!(s.dummy_map.lock().unwrap().get(&key).is_some());
}
The code runs without panicking: insert takes an immutable &self, yet it can still insert into the map (which is wrapped in a Mutex).
How come this is possible?
Would it be better for insert to take &mut self? To indicate that a field is modified...
If dummy_map is not wrapped in a Mutex, the code does not compile (as I'd expect). See this playground.
mut is kind of a misnomer in Rust; it actually means "exclusive access" (which you need in order to mutate a value, but which is slightly more general). In this case, you obviously can't get exclusive access to the Mutex itself (because the whole point is to share it between threads), and therefore you can't get exclusive access to self either. However, you can get temporary exclusive access to the data inside the Mutex, at which point it becomes possible to mutate it. Kinda like what happens with Cell and RefCell too.
A Mutex is one of the Rust types that implements "interior mutability" (see the docs for Cell for discussion on that).
In short, types that implement "interior mutability" circumvent the compile-time ownership checks in favor of runtime checks. In this case, Mutex enforces the mutability rules at runtime by ensuring only one thread can access the data using its lock() and try_lock() methods.
Both locking methods return an owned MutexGuard which can provide read-only and mutable access to the data through the Deref and DerefMut traits, respectively.
In the end, this means that the variable that needs to be mut is the returned MutexGuard, not the Mutex itself.
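A small sketch of that last point: the Mutex binding below is never declared mut, only the guard returned by lock() is. The function name insert_from_threads and the use of Arc with spawned threads are assumptions chosen to show the shared-access case the answers describe.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical demo: n threads each insert one entry through a shared,
// non-mut Mutex; returns the number of entries afterwards.
fn insert_from_threads(n: i64) -> usize {
    let map = Arc::new(Mutex::new(HashMap::new())); // note: not `mut`

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                // The guard is the thing that must be `mut`: it hands out
                // exclusive access to the data via DerefMut.
                let mut guard = map.lock().unwrap();
                guard.insert(i, i * 2);
            }) // guard dropped at the end of the closure, releasing the lock
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    let len = map.lock().unwrap().len();
    len
}

fn main() {
    println!("{}", insert_from_threads(4));
}
```

So taking &self in insert is the expected signature here; requiring &mut self would defeat the purpose of putting the map behind a Mutex.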

Can 'static be placed along the struct declaration?

In the Rust programming language, can there be a case where 'static is placed here:
struct Abc <'static> {
...
This is a bit like asking if you can specify i32 as a type parameter in a struct declaration:
struct Abc<i32> {}
It doesn't make sense[1].
A type parameter lets fields of a struct be generic:
struct Abc<T> {
foo: T, // a user of this struct can choose the type of T
}
In the same way, a lifetime parameter allows the lifetime of a reference to be generic:
struct Abc<'a> {
foo: &'a str, // a user of this struct can choose the lifetime bound
}
If you want the lifetime to be always static, then just specify that, rather than making it generic:
struct Abc {
foo: &'static str, // this must always be static
}
[1] Actually it declares a type parameter, which just happens to have the same name as the i32 type—but that is unlikely to be what a person who tried writing something like this would have intended.
No.
Every type has an implicit 'static lifetime if no generic lifetime is specified. Lifetimes in the declaration of a structure as in
struct Abc<'here, 'and, 'there>;
are meant to specify that the structure contains shorter lifetimes and give them a name. 'static being a special lifetime doesn't need to be specified here or to have a local name.
This doesn't mean however that those lifetimes can't be 'static for a particular instance of the structure.
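As a short sketch of that last remark: the struct is declared with a generic lifetime, and 'static appears only where the struct is used. The function needs_static and the values in main are hypothetical names for illustration.

```rust
struct Abc<'a> {
    foo: &'a str,
}

// 'static is written at the use site, not in the struct declaration.
fn needs_static(abc: Abc<'static>) -> &'static str {
    abc.foo
}

fn main() {
    // A string literal is &'static str, so this instance is an Abc<'static>.
    let from_literal = Abc { foo: "literal" };
    let s: &'static str = needs_static(from_literal);

    // The same struct can also hold a shorter-lived borrow.
    let owned = String::from("temporary");
    let short_lived = Abc { foo: &owned };

    println!("{} {}", s, short_lived.foo);
}
```

Passing short_lived to needs_static would be rejected by the compiler, because owned does not live for 'static.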

Can traits be used on enum types?

I read through the trait documentation and found a neat definition for using traits on structs. Is it possible to use traits on enum types? I have seen answers that say no, but they are 3 years old and don't quite do what I'm trying to do.
I tried to do this:
#[derive(Debug, Copy, Clone)]
pub enum SceneType {
Cutscene,
Game,
Menu,
Pause,
Credits,
Exit,
}
//We want to guarantee every SceneType can be played statically
trait Playable {
fn play();
}
impl Playable for SceneType::Cutscene {
fn play() {}
}
error[E0573]: expected type, found variant `SceneType::Cutscene`
  --> src/main.rs:16:19
   |
16 | impl Playable for SceneType::Cutscene {
   |                   ^^^^^^^^^^^^^^^^^^^
   |                   |
   |                   not a type
   |                   help: you can try using the variant's enum: `SceneType`
I don't understand this error because the enum it references is in the same file. If I really can't use traits on enum variants, is there any way I can guarantee any enum trait must implement certain methods?
Can traits be used on enum types?
Yes. In fact, you already have multiple traits defined for your enum; the traits Debug, Copy and Clone:
#[derive(Debug, Copy, Clone)]
pub enum SceneType
The problem is that you aren't attempting to implement Playable for your enum, you are trying to implement it for one of the enum's variants. Enum variants are not types.
As the error message tells you:
help: you can try using the variant's enum: `SceneType`
impl Playable for SceneType {
fn play() {}
}
See also:
Can struct-like enums be used as types?
Is there a way to use existing structs as enum variants?
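If the goal is to guarantee that every variant is handled, one option is to give the method a receiver and match on it: the compiler rejects a match that misses a variant. This is a sketch under assumptions, since the question's play() takes no arguments; the &self receiver, the String return type and the messages are all made up for illustration.

```rust
#[derive(Debug, Copy, Clone)]
pub enum SceneType {
    Cutscene,
    Game,
    Menu,
    Pause,
    Credits,
    Exit,
}

// Assumed variation of the question's trait: play takes &self so the
// implementation can branch on the variant.
trait Playable {
    fn play(&self) -> String;
}

impl Playable for SceneType {
    fn play(&self) -> String {
        // No wildcard arm: adding a new variant without handling it
        // here becomes a compile error.
        let what = match self {
            SceneType::Cutscene => "the cutscene",
            SceneType::Game => "the game",
            SceneType::Menu => "the menu",
            SceneType::Pause => "the pause screen",
            SceneType::Credits => "the credits",
            SceneType::Exit => "the exit sequence",
        };
        format!("playing {}", what)
    }
}

fn main() {
    for scene in [SceneType::Cutscene, SceneType::Game, SceneType::Exit] {
        println!("{:?}: {}", scene, scene.play());
    }
}
```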
If you want to implement Playable for the whole enum (i.e. for all enum variants), then the answer is quite simply: yes, you can, and Shepmaster's answer details how to do that.
However, if you really only want one enum variant to be Playable and not the others, then Rust doesn't directly support that, but there's an idiom I've seen used to emulate it. Instead of
enum MyEnum {
A(i32, i32),
B(String),
}
you explicitly implement each enum variant as a separate struct, so
enum MyEnum {
A(A),
B(B),
}
struct A {
x: i32,
y: i32,
}
struct B {
name: String,
}
And then you can impl Playable for A without impl Playable for B. Whenever you want to call it, pattern match on the MyEnum and, if you get an A, call play on the value bound by the pattern match.
I don't recommend using this pattern for every enum you write, as it does make the code a decent bit more verbose and requires some boilerplate constructor methods to make it palatable. But for complicated enums with a lot of options, this sort of pattern can make the code easier to reason about, especially if you have a lot of traits or functions that only really apply to a couple of the enum possibilities.
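Put together, the idiom could look roughly like this. It is a sketch only: the play(&self) signature, the helper play_if_possible and the Result-based return value are assumptions, not something the original answer specifies.

```rust
// Assumed trait shape for the demo (the question's play() has no receiver).
trait Playable {
    fn play(&self) -> String;
}

struct A {
    x: i32,
    y: i32,
}

struct B {
    name: String,
}

// Each variant wraps its own struct, so each struct is a real type
// that can carry its own trait impls.
enum MyEnum {
    A(A),
    B(B),
}

// Only A is playable; B deliberately has no impl.
impl Playable for A {
    fn play(&self) -> String {
        format!("playing A at ({}, {})", self.x, self.y)
    }
}

// Hypothetical helper: pattern match, and call play only when we hold an A.
fn play_if_possible(e: &MyEnum) -> Result<String, String> {
    match e {
        MyEnum::A(a) => Ok(a.play()),
        MyEnum::B(b) => Err(format!("{} is not playable", b.name)),
    }
}

fn main() {
    let items = [
        MyEnum::A(A { x: 1, y: 2 }),
        MyEnum::B(B { name: String::from("b") }),
    ];
    for item in &items {
        println!("{:?}", play_if_possible(item));
    }
}
```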
Edit: My apologies; this answer doesn't address the requirement that "every SceneType can be played statically".
Old answer
Try generics:
#[derive(Debug, Copy, Clone)]
pub enum SceneType <Cutscene>
where
Cutscene: Playable
{
Cutscene(Cutscene),
Game,
Menu,
Pause,
Credits,
Exit,
}
//We want to guarantee every SceneType can be played statically
// Notice: add `pub` as enum
pub trait Playable {
fn play();
}
// create struct for inner of SceneType::Cutscene
struct Cutscene {
// ...
}
// impl to specific Cutscene
impl Playable for Cutscene {
fn play() {}
}
Test it:
fn main () {
let cutscene = Cutscene{};
let scenetype = SceneType::Cutscene(cutscene);
}
A downside I realized is that the generics are static. When an enum has more than one generic parameter, all of them must be specified.
enum E <A, B>
where
A: SomeTrait1,
B: SomeTrait2,
{
Enum1(A),
Enum2(B),
}
trait SomeTrait1 {}
trait SomeTrait2 {}
struct S1 {}
impl SomeTrait1 for S1{}
struct S2 {}
impl SomeTrait2 for S2{}
struct X1 {}
impl SomeTrait1 for X1{}
fn main () {
// specify the generics
E::<S1, S2>::Enum1(S1{});
E::<X1, S2>::Enum1(X1{});
//error[E0282]: type annotations needed
// --> src/main.rs:26:5
// |
//33 | E::Enum1(S1{});
// | ^^^^^^^^ cannot infer type for type parameter `B` declared on the enum `E`
// E::Enum1(S1{});
// E::Enum1(X1{});
}

Implementing a Merkle-tree data structure in Go

I'm currently attempting to implement a merkle-tree data structure in Go. Basically, my end goal is to store a small set of structured data (10MB max) and allow this "database" to be easily synchronised with other nodes distributed over the network.
I've implemented this reasonably effectively in Node, as there are no type-checks. Herein lies the problem with Go: I'd like to make use of Go's compile-time type checks, but I also want one library that works with any provided tree.
In short, I'd like to use structs as merkle nodes and I'd like to have one Merkle.Update() method which is embedded in all types. I'm trying to avoid writing an Update() for every struct (though I'm aware this might be the only/best way).
My idea was to use embedded types:
//library
type Merkle struct {
Initialised bool
Container interface{} //in example this references foo
Fields []reflect.Type
//... other merkle state
}
//Merkle methods... Update()... etc...
//userland
type Foo struct {
Merkle
A int
B bool
C string
D map[string]*Bazz
E []*Bar
}
type Bazz struct {
Merkle
S int
T int
U int
}
type Bar struct {
Merkle
X int
Y int
Z int
}
In this example, Foo will be the root, which will contain Bazzs and Bars. This relationship could be inferred by reflecting on the types. The problem is the usage:
foo := &Foo{
A: 42,
B: true,
C: "foo",
D: map[string]*Bazz{
"b1": &Bazz{},
"b2": &Bazz{},
},
E: []*Bar{
&Bar{},
&Bar{},
&Bar{},
},
}
merkle.Init(foo)
foo.Hash //Initial hash => abc...
foo.A = 35
foo.E = append(foo.E, &Bar{})
foo.Update()
foo.Hash //Updated hash => def...
I think we need to merkle.Init(foo) since foo.Init() would actually be foo.Merkle.Init() and would not be able to reflect on foo. The uninitialised Bars and Bazzs could be detected and initialised by the parent foo.Update(). Some reflection is acceptable as correctness is more important than performance at the moment.
Another problem is, when we Update() a node, all struct fields (child nodes) would need to be Update()d as well (rehashed) since we aren't sure what was changed. We could do foo.SetInt("A", 35) to implement an auto-update, though then we lose compile time type-checks.
Would this be considered idiomatic Go? If not, how could this be improved? Can anyone think of an alternative way to store a dataset in memory (for fast reads) with concise dataset comparison (for efficient delta transfers over the network)?
Edit: And also a meta-question: Where is the best place to ask this kind of question, StackOverflow, Reddit or go-nuts? Originally posted on reddit with no answer :(
Some goals seem like:
Hash anything -- make it easy to use by hashing lots of things out of the box
Cache hashes -- make updates just rehash what they need to
Be idiomatic -- fit in well among other Go code
I think you can attack hashing anything roughly the way that serialization tools like the built-in encoding/gob or encoding/json do, which is three-pronged: use a special method if the type implements it (for JSON that's MarshalJSON), use a type switch for basic types, and fall back to a nasty default case using reflection. Here's an API sketch that provides a helper for hash caching and lets types either implement Hash or not:
package merkle
type HashVal uint64
const MissingHash HashVal = 0
// Hasher provides a custom hash implementation for a type. Not
// everything needs to implement it, but doing so can speed
// updates.
type Hasher interface {
Hash() HashVal
}
// HashCacher is the interface for items that cache a hash value.
// Normally implemented by embedding HashCache.
type HashCacher interface {
CachedHash() *HashVal
}
// HashCache implements HashCacher; it's meant to be embedded in your
// structs to make updating hash trees more efficient.
type HashCache struct {
h HashVal
}
// CachedHash implements HashCacher.
func (h *HashCache) CachedHash() *HashVal {
return &h.h
}
// Hash returns something's hash, using a cached hash or Hash() method if
// available.
func Hash(i interface{}) HashVal {
if hashCacher, ok := i.(HashCacher); ok {
if cached := *hashCacher.CachedHash(); cached != MissingHash {
return cached
}
}
switch i := i.(type) {
case Hasher:
return i.Hash()
case uint64:
return HashVal(i * 8675309) // or, you know, use a real hash
case []byte:
// CRC the bytes, say
return 0xdeadbeef
default:
return 0xdeadbeef
// terrible slow recursive case using reflection
// like: iterate fields using reflect, then hash each
}
// instead of panic()ing here, you could live a little
// dangerously and declare that changes to unhashable
// types don't invalidate the tree
panic("unhashable type passed to Hash()")
}
// Item is a node in the Merkle tree, which must know how to find its
// parent Item (the root node should return nil) and should usually
// embed HashCache for efficient updates. To avoid using reflection,
// Items might benefit from being Hashers as well.
type Item interface {
Parent() Item
HashCacher
}
// Update updates the chain of items between i and the root, given the
// leaf node that may have been changed.
func Update(i Item) {
for i != nil {
cached := i.CachedHash()
*cached = MissingHash // invalidate
*cached = Hash(i)
i = i.Parent()
}
}
Go doesn't have inheritance in the same way other languages do.
The "parent" can't modify items in the child. You'd have to implement Update on each struct, do the struct-specific work there, and have it call the parent's Update.
func (b *Bar) Update() {
b.Merkle.Update()
//do stuff related to b and b.Merkle
//stuff
}
func (f *Foo) Update() {
f.Merkle.Update()
for _, b := range f.E {
b.Update()
}
//etc
}
I think you will have to re-implement your tree in a different way.
Also please provide a testable case the next time.
Have you seen https://github.com/xsleonard/go-merkle? It allows you to create a binary Merkle tree. You could append a type byte to the end of your data to identify it.

Resources