How do Rust's Trait Objects handle methods which move the instance?

Here are my underlying assumptions of how Rust's methods work:
foo.method();
where method is defined as method(&self) and foo is an instance of Foo, is the same as
Foo::method(&foo);
My understanding is that a Trait Object is a struct with two void pointers: one to the data, and another to a table of function pointers (the vtable).
A polymorphic function that takes in a Trait Object and calls a method on that Trait Object will compile down to looking up the offset of the method in the vtable and passing in the data pointer.
But what if the method moves the instance? Correct me if I am wrong, but to call a virtual move method, the function would have to push the actual data stored inside the Trait Object onto the stack instead of just the data pointer. The data size obviously cannot be known at compile-time, so what is going on here? Is this a VLA sort of situation, or do I misunderstand how a move works?

The answer is very simple: it's impossible to call a self-consuming method on a trait object.
The key words are object safety. Essentially, this property combines two requirements:
every method must work with self through some kind of indirection;
every method must be fully identifiable via original type (i.e. no generics).
To see this in more detail, let's actually try to code something and ask the compiler's opinion. First of all, just trying to define the trait:
trait Consumer {
fn consume(self);
}
Compiler is already unhappy:
error[E0277]: the size for values of type `Self` cannot be known at compilation time
--> src/lib.rs:2:16
|
2 | fn consume(self);
| ^^^^ doesn't have a size known at compile-time
|
help: consider further restricting `Self`
|
2 | fn consume(self) where Self: Sized;
| ^^^^^^^^^^^^^^^^^
help: function arguments must have a statically known size, borrowed types always have a known size
|
2 | fn consume(&self);
| ^
OK, we can be even more conservative than the compiler's advice and add the restriction on the trait itself. Then, add a stub for trait object creation:
trait Consumer where Self: Sized {
fn consume(self);
}
fn main() {
let _: Box<dyn Consumer> = todo!();
}
Now, the error is slightly more complex:
error[E0038]: the trait `Consumer` cannot be made into an object
--> src/main.rs:6:12
|
6 | let _: Box<dyn Consumer> = todo!();
| ^^^^^^^^^^^^^^^^^ `Consumer` cannot be made into an object
|
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
--> src/main.rs:1:28
|
1 | trait Consumer where Self: Sized {
| -------- ^^^^^ ...because it requires `Self: Sized`
| |
| this trait cannot be made into an object...
There is one workaround, however: it's not necessary to restrict the whole trait, just the offending method, as we were told from the start. Moving the where clause, as here:
trait Consumer {
fn consume(self) where Self: Sized;
}
...makes the code above compile.
Now, what about actually using this trait object? Let's implement the trait, for example, for the unit type, and use it from main:
trait Consumer {
fn consume(self) where Self: Sized;
}
impl Consumer for () {
fn consume(self) {}
}
fn main() {
let consumer: Box<dyn Consumer> = Box::new(());
consumer.consume();
}
Another compiler error!
error: the `consume` method cannot be invoked on a trait object
--> src/main.rs:11:14
|
2 | fn consume(self) where Self: Sized;
| ----- this has a `Sized` requirement
...
11 | consumer.consume();
| ^^^^^^^
Again, the restriction we've placed on the method forbids code which would make no sense if it compiled.
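For completeness, here is a minimal sketch of how the "indirection" requirement can still be combined with consuming the value: the receiver is declared as self: Box<Self>, so what gets moved is the box (a pointer of known size), not the unsized data. The Message type and its field are hypothetical names used only for illustration.

```rust
trait Consumer {
    // The receiver is the box itself, so only a pointer is moved at the
    // call site; the size of the pointee does not have to be known there.
    fn consume(self: Box<Self>) -> String;
}

// Hypothetical implementor, used only for this sketch.
struct Message(String);

impl Consumer for Message {
    fn consume(self: Box<Self>) -> String {
        // Inside the impl the concrete type is known, so the value can be
        // moved out of the box.
        let Message(text) = *self;
        text
    }
}

fn main() {
    let consumer: Box<dyn Consumer> = Box::new(Message(String::from("hello")));
    let text = consumer.consume(); // `consumer` is moved here
    println!("{text}");
}
```

After the call, consumer can no longer be used, which is the same ownership behaviour a by-value self method would have on a sized type.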

Related

How can we call a function mutable for self inside a for loop? [duplicate]

I'm having a hard time with the borrow checker.
for item in self.xxx.iter() {
self.modify_self_but_not_xxx(item);
}
The above code worked before I refactored some code into modify_self_but_not_xxx():
error: cannot borrow `*self` as mutable because `self.xxx` is also borrowed as immutable
How can I call a mutating method while holding a reference to self (e.g. from within a for-loop)?
You can't, that's exactly what the borrowing rules prevent.
The main idea is that in your code, the borrow checker cannot possibly know that self.modify_self_but_not_xxx(..) will not modify xxx.
However, you can still mutate self.yyy or any other field that is not borrowed, so you can either:
do the computations of modify_self_but_not_xxx(..) directly in your loop body
define a helper function taking mutable references to update them:
fn do_computations(item: &Foo, a: &mut Bar, b: &mut Baz) { /* ... */ }
/* ... */
for item in self.xxx.iter() {
do_computations(item, &mut self.bar, &mut self.baz);
}
define a helper struct that has helper methods
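The helper-function option can be sketched as follows. Tracker, its fields and record are hypothetical names; items plays the role of xxx. Because the helper only receives the fields it mutates, the borrow checker can see that items is left alone.

```rust
// Hypothetical type for this sketch; `items` plays the role of `xxx`.
struct Tracker {
    items: Vec<u32>,
    total: u32,
    count: usize,
}

// Free helper: it takes mutable references to exactly the fields it
// updates, so it cannot touch `items`.
fn record(item: &u32, total: &mut u32, count: &mut usize) {
    *total += *item;
    *count += 1;
}

impl Tracker {
    fn process(&mut self) {
        // `self.items` is borrowed immutably while `self.total` and
        // `self.count` are borrowed mutably; the fields are disjoint.
        for item in self.items.iter() {
            record(item, &mut self.total, &mut self.count);
        }
    }
}

fn main() {
    let mut tracker = Tracker { items: vec![1, 2, 3], total: 0, count: 0 };
    tracker.process();
    println!("total = {}, count = {}", tracker.total, tracker.count);
}
```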

What is the difference between method call syntax `foo.method()` and UFCS `Foo::method(&foo)`?

Is there any difference in Rust between calling a method on a value, like this:
struct A { e: u32 }
impl A {
fn show(&self) {
println!("{}", self.e)
}
}
fn main() {
A { e: 0 }.show();
}
...and calling it on the type, like this:
fn main() {
A::show(&A { e: 0 })
}
Summary: The most important difference is that the universal function call syntax (UFCS) is more explicit than the method call syntax.
With UFCS there is basically no ambiguity about which function you want to call (there is still a longer form of the UFCS for trait methods, but let's ignore that for now). The method call syntax, on the other hand, requires more work in the compiler to figure out which method to call and how to call it. This shows up mainly in two things:
Method resolution: figure out if the method is inherent (bound to the type, not a trait) or a trait method. And in the latter case, also figure out which trait it belongs to.
Figure out the correct receiver type (self) and potentially use type coercions to make the call work.
Receiver type coercions
Let's take a look at this example to understand the type coercions to the receiver type:
struct Foo;
impl Foo {
fn on_ref(&self) {}
fn on_mut_ref(&mut self) {}
fn on_value(self) {}
}
fn main() {
let reference = &Foo; // type `&Foo`
let mut_ref = &mut Foo; // type `&mut Foo`
let mut value = Foo; // type `Foo`
// ...
}
So we have three methods that take Foo, &Foo and &mut Foo receivers, and three variables with those types. Let's try out all 9 combinations with both method call syntax and UFCS.
UFCS
Foo::on_ref(reference);
//Foo::on_mut_ref(reference); error: mismatched types
//Foo::on_value(reference); error: mismatched types
//Foo::on_ref(mut_ref); error: mismatched types
Foo::on_mut_ref(mut_ref);
//Foo::on_value(mut_ref); error: mismatched types
//Foo::on_ref(value); error: mismatched types
//Foo::on_mut_ref(value); error: mismatched types
Foo::on_value(value);
As we can see, only the calls where the types match exactly succeed. To make the other calls work we would have to manually add & or &mut or * in front of the argument. That's the standard behavior for all function arguments.
Method call syntax
reference.on_ref();
//reference.on_mut_ref(); error: cannot borrow `*reference` as mutable
//reference.on_value(); error: cannot move out of `*reference`
mut_ref.on_ref();
mut_ref.on_mut_ref();
//mut_ref.on_value(); error: cannot move out of `*mut_ref`
value.on_ref();
value.on_mut_ref();
value.on_value();
Only three of the method calls lead to an error while the others succeed. Here, the compiler automatically inserts deref (dereferencing) or autoref (adding a reference) coercions to make the call work. Also note that the three errors are not "type mismatch" errors: the compiler already tried to adjust the type correctly, but this led to other errors.
There are some additional coercions:
Unsize coercions, described by the Unsize trait. Allows you to call slice methods on arrays and to coerce types into trait objects of traits they implement.
Advanced deref coercions via the Deref trait. This allows you to call slice methods on Vec, for example.
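The coercions listed above can be seen in a small sketch. The function names are hypothetical; each function isolates one kind of adjustment the compiler performs for the method call syntax.

```rust
use std::fmt::Display;

// Autoref: both calls pass `&mut s`; only the UFCS form spells it out.
fn push_both_ways(mut s: String) -> String {
    s.push('c'); // method call syntax, `&mut` inserted by the compiler
    String::push(&mut s, 'd'); // UFCS, `&mut` written by hand
    s
}

// Deref coercion: `first` is a slice method, reached through
// `Vec<T>: Deref<Target = [T]>`.
fn first_of_vec(v: &Vec<i32>) -> Option<i32> {
    v.first().copied()
}

// Unsizing: `contains` is a slice method, reached by treating the
// array `[u8; 3]` as the slice `[u8]`.
fn array_has_two(arr: [u8; 3]) -> bool {
    arr.contains(&2)
}

// Unsize coercion to a trait object: callers may pass `&i32` where
// `&dyn Display` is expected.
fn show(value: &dyn Display) -> String {
    value.to_string()
}

fn main() {
    println!("{}", push_both_ways(String::from("ab")));
    println!("{:?}", first_of_vec(&vec![3, 1, 2]));
    println!("{}", array_has_two([1, 2, 3]));
    println!("{}", show(&42i32));
}
```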
Method resolution: figuring out what method to call
When you write lhs.method_name(), the method method_name could be an inherent method of the type of lhs or it could belong to a trait that's in scope (imported). The compiler has to figure out which one to call and has a number of rules for this. When getting into the details, these rules are actually really complex and can lead to some surprising behavior. Luckily, most programmers will never have to deal with that and it "just works" most of the time.
To give a coarse overview of how it works, the compiler tries the following things in order, using the first method that is found.
Is there an inherent method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there a trait method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there an inherent method with the name method_name? (type coercions will be performed)
Is there a trait method with the name method_name? (type coercions will be performed)
(Again, note that this is still a simplification. Different types of coercions are preferred over others, for example.)
This shows one rule that most programmers know: inherent methods have a higher priority than trait methods. Less well known is that whether the receiver type fits exactly is an even more important factor. There is a quiz that nicely demonstrates this: Rust Quiz #23. More details on the exact method resolution algorithm can be found in this StackOverflow answer.
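A small sketch in the spirit of that quiz (the names S and Trait are placeholders): for f both candidates take &self, so the inherent method wins; for g the inherent method needs &mut self while the trait method needs only &self, and since the &S receiver is tried before &mut S, the trait method is picked.

```rust
struct S;

trait Trait {
    fn f(&self) -> &'static str;
    fn g(&self) -> &'static str;
}

impl S {
    fn f(&self) -> &'static str {
        "inherent"
    }

    // Takes `&mut self`, so it only matches at the later `&mut S` step.
    fn g(&mut self) -> &'static str {
        "inherent"
    }
}

impl Trait for S {
    fn f(&self) -> &'static str {
        "trait"
    }

    fn g(&self) -> &'static str {
        "trait"
    }
}

fn main() {
    let mut s = S;
    println!("s.f()            -> {}", s.f()); // same receiver type: inherent wins
    println!("s.g()            -> {}", s.g()); // `&S` fits the trait method first
    println!("S::g(&mut s)     -> {}", S::g(&mut s)); // path form names the inherent method
    println!("<S as Trait>::f  -> {}", <S as Trait>::f(&s)); // fully qualified trait call
}
```

The last two calls show how UFCS can be used to select a specific method when the method call syntax picks a different one.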
This set of rules can turn a number of seemingly harmless API changes into breaking changes. We currently have to deal with that in the attempt to add an IntoIterator impl for arrays.
Another – minor and probably very obvious – difference is that for the method call syntax, the type name does not have to be imported.
Apart from that it's worth pointing out what is not different about the two syntaxes:
Runtime behavior: no difference whatsoever.
Performance: the method call syntax is "converted" (desugared) into basically the UFCS pretty early inside the compiler, meaning that there aren't any performance differences either.

How do I constrain associated types on a non-owned trait? [duplicate]

I have this code:
extern crate serde;
use serde::de::DeserializeOwned;
use serde::Serialize;
trait Bar<'a, T: 'a>
where
T: Serialize,
&'a T: DeserializeOwned,
{
}
I would like to write this using an associated type, because the type T is unimportant to the users of this type. I got this far:
trait Bar {
type T: Serialize;
}
I cannot figure out how to specify the other bound.
Ultimately, I want to use a function like this:
extern crate serde_json;
fn test<I: Bar>(t: I::T) -> String {
serde_json::to_string(&t).unwrap()
}
The "correct" solution is to place the bounds on the trait, but referencing the associated type. In this case, you can also use higher ranked trait bounds to handle the reference:
trait Bar
where
Self::T: Serialize,
// ^^^^^^^ Bounds on an associated type
for<'a> &'a Self::T: DeserializeOwned,
// ^^^^^^^^^^^ Higher-ranked trait bounds
{
type T;
}
However, this doesn't work yet.
I believe that you will need to either:
wait for issue 20671 and/or issue 50346 to be fixed.
wait for the generic associated types feature which introduces where clauses on associated types.
In the meantime, the workaround is to duplicate the bound everywhere it's needed:
fn test<I: Bar>(t: I::T) -> String
where
for<'a> &'a I::T: DeserializeOwned,
{
serde_json::to_string(&t).unwrap()
}
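To try the workaround without external crates, here is a sketch that swaps serde's traits for the standard library's IntoIterator; this substitution, along with the names Numbers and total, is an assumption made only to keep the example self-contained. The shape is the same: the trait declares the associated type, and the higher-ranked bound on &'a I::T is restated on the function that needs it.

```rust
trait Bar {
    type T;
}

// Hypothetical implementor for this sketch.
struct Numbers;

impl Bar for Numbers {
    type T = Vec<u32>;
}

// The bound on `&'a I::T` is not part of the trait, so it has to be
// repeated on every function that relies on it.
fn total<I: Bar>(t: I::T) -> u32
where
    for<'a> &'a I::T: IntoIterator<Item = &'a u32>,
{
    let mut sum = 0;
    // `&t` is iterable thanks to the higher-ranked bound above.
    for x in &t {
        sum += *x;
    }
    sum
}

fn main() {
    println!("{}", total::<Numbers>(vec![1, 2, 3]));
}
```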

Why can't `Self` be used to refer to an enum's variant in a method body?

This question is now obsolete because this feature has been implemented. Related answer.
The following Rust code fails to compile:
enum Foo {
Bar,
}
impl Foo {
fn f() -> Self {
Self::Bar
}
}
The error message confuses me:
error[E0599]: no associated item named `Bar` found for type `Foo` in the current scope
--> src/main.rs:7:9
|
7 | Self::Bar
| ^^^^^^^^^
The problem can be fixed by using Foo instead of Self, but this strikes me as strange since Self is supposed to refer to the type that is being implemented (ignoring traits), which in this case is Foo.
enum Foo {
Bar,
}
impl Foo {
fn f() -> Self {
Foo::Bar
}
}
Why can't Self be used in this situation? Where exactly can Self be used*? Is there anything else I can use to avoid repeating the type name in the method body?
* I'm ignoring usage in traits, where Self refers to whatever type implements the trait.
An important thing to note is that the error said associated item. enum Foo { Baz } doesn't have associated items. A trait can have an associated item:
trait FooBaz { type Baz; }
// ^~~~~~~~ - associated item
To summarize:
Why can't Self be used in this situation?
Because of this issue. RFC 2338 has not been implemented yet.
Self seems to act as a type alias, albeit with some modifications.
Where exactly can Self be used?
Self can only be used in traits and impls. This code:
struct X {
f: i32,
x: &Self,
}
Outputs the following:
error[E0411]: cannot find type `Self` in this scope
--> src/main.rs:3:9
|
3 | x: &Self,
| ^^^^ `Self` is only available in traits and impls
This is possibly a temporary situation and might change in the future!
More precisely, Self should be used only as part of a method signature (e.g. fn self_in_self_out(&self) -> Self) or to access an associated type:
enum Foo {
Baz,
}
trait FooBaz {
type Baz;
fn b(&self) -> Self::Baz; // Valid use of `Self` as method argument and method output
}
impl FooBaz for Foo {
type Baz = Foo;
fn b(&self) -> Self::Baz {
let x = Foo::Baz as Self::Baz; // You can use associated type, but it's just a type
x
}
}
I think user4815162342 covered the rest of the answer best.
If the enum name Foo is in reality long and you want to avoid repeating it across the implementation, you have two options:
use LongEnumName as Short at module level. This will allow you to return Short::Bar at the end of f.
use LongEnumName::* at module level, allowing an even shorter Bar.
If you omit pub, the imports will be internal and won't affect the public API of the module.
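Both options might look like this in practice. This is only a sketch: the variant Baz and the method names are added for illustration, and the two imports are shown together only to compare them.

```rust
#[derive(Debug, PartialEq)]
enum LongEnumName {
    Bar,
    Baz,
}

// Option 1: a shorter alias for the enum itself.
use LongEnumName as Short;
// Option 2: bring the variants into scope directly.
use LongEnumName::*;

impl LongEnumName {
    fn via_alias() -> Self {
        Short::Bar
    }

    fn via_glob() -> Self {
        Baz
    }
}

fn main() {
    println!("{:?} {:?}", LongEnumName::via_alias(), LongEnumName::via_glob());
}
```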
This is now possible as of version 1.37.
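On such a compiler, the code from the question can be checked as written. The derives and main are added here only so the result can be printed and compared.

```rust
#[derive(Debug, PartialEq)]
enum Foo {
    Bar,
}

impl Foo {
    fn f() -> Self {
        // `Self` can be used to name the variant.
        Self::Bar
    }
}

fn main() {
    println!("{:?}", Foo::f());
}
```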
Enum constructors != associated items.
It is a known issue, but it's not expected to be fixed, at least not in the foreseeable future. From what I have gathered it is not trivial to just allow this to work; at this point it is more likely that the related documentation or the error message will be improved.
There is little documentation I could find on the topic of associated items in general; The Rust Book has a chapter on associated types, though. In addition, there are plenty of good answers about Self in this related question.
There is an experimental feature that would make your example work without any other changes. You can try it out in a nightly build of Rust by adding this in your main file:
#![feature(type_alias_enum_variants)]
You can follow the progress of the feature towards stabilisation in its tracking issue.

What does the 'where' clause within a trait do?

If I have this code:
trait Trait {
fn f(&self) -> i32 where Self: Sized;
fn g(&self) -> i32;
}
fn object_safety_dynamic(x: &Trait) {
x.f(); // error
x.g(); // works
}
What does the where clause actually do?
Naively, I was thinking where Self: Sized; dictates something about the type implementing Trait, like 'if you implement Trait for type A your type A must be sized, i.e., it can be i32 but not [i32]'.
However, such a constraint would rather go as trait Trait: Sized (correct me if I am wrong)?
Now I noticed where Self: Sized; actually determines if I can call f or g from within object_safety_dynamic.
My questions:
What happens here behind the scenes?
What (in simple English) am I actually telling the compiler by where Self: Sized; that makes g() work but f() not?
In particular: Since &self is a reference anyway, what compiled difference exists between f and g for various (sized or unsized) types. Wouldn't it always boil down to something like _vtable_f_or_g(*self) -> i32, regardless of where or if the type is sized or not?
Why can I implement Trait for both u8 and [u8]. Shouldn't the compiler actually stop me from implementing f() for [u8], instead of throwing an error at the call site?
fn f(&self) -> i32 where Self: Sized;
This says that f is only defined for types that also implement Sized. Unsized types may still implement Trait, but f will not be available.
Inside object_safety_dynamic, calling x.f() is actually doing: (*x).f(). While x is sized because it's a pointer, *x might not be because it could be any implementation of Trait. But code inside the function has to work for any valid argument, so you are not allowed to call x.f() there.
What does the where clause actually do?
Naively, I was thinking where Self: Sized; dictates something about the type implementing Trait, like 'if you implement Trait for type A your type A must be sized, i.e., it can be i32 but not [i32]'.
However, such a constraint would rather go as trait Trait: Sized
This is correct.
However, in this case, the bound applies only to the function. where bounds on functions are only checked at the callsite.
What happens here behind the scenes?
There is a confusing bit about Rust's syntax, which is that Trait can refer to either
The trait Trait; or
The "trait object" Trait, which is actually a type, not an object.
Sized is a trait, and any type T that is Sized may have its size taken as a constant, by std::mem::size_of::<T>(). Examples of types that are not sized are str and [u8], whose contents do not have a fixed size.
The type Trait is also unsized. Intuitively, this is because Trait as a type consists of all values of types that implement the trait Trait, which may have varying size. This means you can never have a value of type Trait - you can only refer to one via a "fat pointer" such as &Trait or Box<Trait> and so on. These have the size of 2 pointers - one for a vtable, one for the data. It looks roughly like this:
struct &Trait {
pub data: *mut (),
pub vtable: *mut (),
}
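The two-pointer layout can be observed from safe code. The sketch below uses the newer dyn Trait spelling for the trait object type, and the impls for u8 and u64 are placeholders; it compares the size of a plain reference with that of a trait object reference, and shows that the size of the pointee is only known at run time.

```rust
use std::mem::{size_of, size_of_val};

trait Trait {
    fn g(&self) -> i32;
}

impl Trait for u8 {
    fn g(&self) -> i32 {
        1
    }
}

impl Trait for u64 {
    fn g(&self) -> i32 {
        2
    }
}

fn main() {
    // A reference to a sized type is one pointer wide.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());
    // References and boxes to a trait object carry data pointer + vtable pointer.
    assert_eq!(size_of::<&dyn Trait>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Trait>>(), 2 * size_of::<usize>());

    // The pointee's size differs per concrete type and is read at run time.
    let small: &dyn Trait = &1u8;
    let large: &dyn Trait = &1u64;
    println!("{} {}", size_of_val(small), size_of_val(large));
}
```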
There is automatically an impl of the form:
impl Trait /* the trait */ for Trait /* the type */ {
fn f(&self) -> i32 where Self: Sized { .. }
fn g(&self) -> i32 {
/* vtable magic: something like (self.vtable.g)(self.data) */
}
}
What (in simple English) am I actually telling the compiler by where Self: Sized; that makes g() work but f() not?
Note that since, as I mentioned, Trait is not Sized, the bound Self: Sized is not satisfied and so the function f cannot be called where Self == Trait.
In particular: Since &self is a reference anyway, what compiled difference exists between f and g for various (sized or unsized) types. Wouldn't it always boil down to something like _vtable_f_or_g(*self) -> i32, regardless of where or if the type is sized or not?
The type Trait is always unsized. It doesn't matter which type has been coerced to Trait. The way you call the function with a Sized variable is to use it directly:
fn generic<T: Trait + Sized>(x: &T) { // the `Sized` bound is implicit, added here for clarity
x.f(); // compiles just fine
x.g();
}
Why can I implement Trait for both u8 and [u8]. Shouldn't the compiler actually stop me from implementing f() for [u8], instead of throwing an error at the call site?
Because the trait is not bounded by Self: Sized - the function f is. So there is nothing stopping you from implementing the function - it's just that the bounds on the function can never be satisfied, so you can never call it.
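Putting the pieces together, here is a sketch written with the newer dyn Trait spelling; the function names via_object and via_generic are hypothetical. g is reachable through a trait object, while f is only reachable where the concrete type is known to be Sized.

```rust
trait Trait {
    fn f(&self) -> i32
    where
        Self: Sized;
    fn g(&self) -> i32;
}

impl Trait for u8 {
    fn f(&self) -> i32 {
        1
    }

    fn g(&self) -> i32 {
        2
    }
}

// `dyn Trait` is unsized, so only `g` may be called here.
// Writing `x.f()` in this function is rejected by the compiler.
fn via_object(x: &dyn Trait) -> i32 {
    x.g()
}

// `T` is implicitly `Sized`, so both methods are available.
fn via_generic<T: Trait>(x: &T) -> i32 {
    x.f() + x.g()
}

fn main() {
    println!("{}", via_object(&7u8));
    println!("{}", via_generic(&7u8));
}
```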

Resources