Why can't `Self` be used to refer to an enum's variant in a method body? - enums

This question is now obsolete because this feature has been implemented. Related answer.
The following Rust code fails to compile:
enum Foo {
Bar,
}
impl Foo {
fn f() -> Self {
Self::Bar
}
}
The error message confuses me:
error[E0599]: no associated item named `Bar` found for type `Foo` in the current scope
--> src/main.rs:7:9
|
7 | Self::Bar
| ^^^^^^^^^
The problem can be fixed by using Foo instead of Self, but this strikes me as strange since Self is supposed to refer to the type that is being implemented (ignoring traits), which in this case is Foo.
enum Foo {
Bar,
}
impl Foo {
fn f() -> Self {
Foo::Bar
}
}
Why can't Self be used in this situation? Where exactly can Self be used*? Is there anything else I can use to avoid repeating the type name in the method body?
* I'm ignoring usage in traits, where Self refers to whatever type implements the trait.

An important thing to note is that the error said associated item. enum Foo { Baz } doesn't have associated items. A trait can have an associated item:
trait FooBaz { type Baz }
// ^~~~~~~~ - associated item
To summarize:
Why can't Self be used in this situation?
Because of this issue. RFC 2338 has not been implemented yet.
Self seems to act as a type alias, albeit with some modifications.
Where exactly can Self be used?
Self can only be used in traits and impls. This code:
struct X {
f: i32,
x: &Self,
}
Outputs the following:
error[E0411]: cannot find type `Self` in this scope
--> src/main.rs:3:9
|
3 | x: &Self,
| ^^^^ `Self` is only available in traits and impls
This is possibly a temporary situation and might change in the future!
More precisely, Self should be used only as part of method signature (e.g. fn self_in_self_out(&self) -> Self) or to access an associated type:
enum Foo {
Baz,
}
trait FooBaz {
type Baz;
fn b(&self) -> Self::Baz; // Valid use of `Self` as method argument and method output
}
impl FooBaz for Foo {
type Baz = Foo;
fn b(&self) -> Self::Baz {
let x = Foo::Baz as Self::Baz; // You can use associated type, but it's just a type
x
}
}
I think user4815162342 covered the rest of the answer best.

If the enum name Foo is in reality long and you want to avoid repeating it across the implementation, you have two options:
use LongEnumName as Short at module level. This will allow you to return Short::Bar at the end of f.
use LongEnumName::* at module level, allowing an even shorter Bar.
If you omit pub, the imports will be internal and won't affect the public API of the module.

This is now possible as of version 1.37.

Enum constructors != associated items.
It is a known issue, but it's not expected to be fixed, at least not in the foreseeable future. From what I have gathered it is not trivial to just allow this to work; at this point it is more likely that the related documentation or the error message will be improved.
There is little documentation I could find on the topic of associated items in general; The Rust Book has a chapter on associated types, though. In addition, there are plenty of good answers about Self in this related question.

There is an experimental feature that would make your example work without any other changes. You can try it out in a nightly build of Rust by adding this in your main file:
#![feature(type_alias_enum_variants)]
You can follow the progress of the feature towards stabilisation in its tracking issue.

Related

Rust cannot infer an appropriate lifetime for autoref due to conflicting requirements [duplicate]

I have a value and I want to store that value and a reference to
something inside that value in my own type:
struct Thing {
count: u32,
}
struct Combined<'a>(Thing, &'a u32);
fn make_combined<'a>() -> Combined<'a> {
let thing = Thing { count: 42 };
Combined(thing, &thing.count)
}
Sometimes, I have a value and I want to store that value and a reference to
that value in the same structure:
struct Combined<'a>(Thing, &'a Thing);
fn make_combined<'a>() -> Combined<'a> {
let thing = Thing::new();
Combined(thing, &thing)
}
Sometimes, I'm not even taking a reference of the value and I get the
same error:
struct Combined<'a>(Parent, Child<'a>);
fn make_combined<'a>() -> Combined<'a> {
let parent = Parent::new();
let child = parent.child();
Combined(parent, child)
}
In each of these cases, I get an error that one of the values "does
not live long enough". What does this error mean?
Let's look at a simple implementation of this:
struct Parent {
count: u32,
}
struct Child<'a> {
parent: &'a Parent,
}
struct Combined<'a> {
parent: Parent,
child: Child<'a>,
}
impl<'a> Combined<'a> {
fn new() -> Self {
let parent = Parent { count: 42 };
let child = Child { parent: &parent };
Combined { parent, child }
}
}
fn main() {}
This will fail with the error:
error[E0515]: cannot return value referencing local variable `parent`
--> src/main.rs:19:9
|
17 | let child = Child { parent: &parent };
| ------- `parent` is borrowed here
18 |
19 | Combined { parent, child }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `parent` because it is borrowed
--> src/main.rs:19:20
|
14 | impl<'a> Combined<'a> {
| -- lifetime `'a` defined here
...
17 | let child = Child { parent: &parent };
| ------- borrow of `parent` occurs here
18 |
19 | Combined { parent, child }
| -----------^^^^^^---------
| | |
| | move out of `parent` occurs here
| returning this value requires that `parent` is borrowed for `'a`
To completely understand this error, you have to think about how the
values are represented in memory and what happens when you move
those values. Let's annotate Combined::new with some hypothetical
memory addresses that show where values are located:
let parent = Parent { count: 42 };
// `parent` lives at address 0x1000 and takes up 4 bytes
// The value of `parent` is 42
let child = Child { parent: &parent };
// `child` lives at address 0x1010 and takes up 4 bytes
// The value of `child` is 0x1000
Combined { parent, child }
// The return value lives at address 0x2000 and takes up 8 bytes
// `parent` is moved to 0x2000
// `child` is ... ?
What should happen to child? If the value was just moved like parent
was, then it would refer to memory that no longer is guaranteed to
have a valid value in it. Any other piece of code is allowed to store
values at memory address 0x1000. Accessing that memory assuming it was
an integer could lead to crashes and/or security bugs, and is one of
the main categories of errors that Rust prevents.
This is exactly the problem that lifetimes prevent. A lifetime is a
bit of metadata that allows you and the compiler to know how long a
value will be valid at its current memory location. That's an
important distinction, as it's a common mistake Rust newcomers make.
Rust lifetimes are not the time period between when an object is
created and when it is destroyed!
As an analogy, think of it this way: During a person's life, they will
reside in many different locations, each with a distinct address. A
Rust lifetime is concerned with the address you currently reside at,
not about whenever you will die in the future (although dying also
changes your address). Every time you move it's relevant because your
address is no longer valid.
It's also important to note that lifetimes do not change your code; your
code controls the lifetimes, your lifetimes don't control the code. The
pithy saying is "lifetimes are descriptive, not prescriptive".
Let's annotate Combined::new with some line numbers which we will use
to highlight lifetimes:
{ // 0
let parent = Parent { count: 42 }; // 1
let child = Child { parent: &parent }; // 2
// 3
Combined { parent, child } // 4
} // 5
The concrete lifetime of parent is from 1 to 4, inclusive (which I'll
represent as [1,4]). The concrete lifetime of child is [2,4], and
the concrete lifetime of the return value is [4,5]. It's
possible to have concrete lifetimes that start at zero - that would
represent the lifetime of a parameter to a function or something that
existed outside of the block.
Note that the lifetime of child itself is [2,4], but that it refers
to a value with a lifetime of [1,4]. This is fine as long as the
referring value becomes invalid before the referred-to value does. The
problem occurs when we try to return child from the block. This would
"over-extend" the lifetime beyond its natural length.
This new knowledge should explain the first two examples. The third
one requires looking at the implementation of Parent::child. Chances
are, it will look something like this:
impl Parent {
fn child(&self) -> Child { /* ... */ }
}
This uses lifetime elision to avoid writing explicit generic
lifetime parameters. It is equivalent to:
impl Parent {
fn child<'a>(&'a self) -> Child<'a> { /* ... */ }
}
In both cases, the method says that a Child structure will be
returned that has been parameterized with the concrete lifetime of
self. Said another way, the Child instance contains a reference
to the Parent that created it, and thus cannot live longer than that
Parent instance.
This also lets us recognize that something is really wrong with our
creation function:
fn make_combined<'a>() -> Combined<'a> { /* ... */ }
Although you are more likely to see this written in a different form:
impl<'a> Combined<'a> {
fn new() -> Combined<'a> { /* ... */ }
}
In both cases, there is no lifetime parameter being provided via an
argument. This means that the lifetime that Combined will be
parameterized with isn't constrained by anything - it can be whatever
the caller wants it to be. This is nonsensical, because the caller
could specify the 'static lifetime and there's no way to meet that
condition.
How do I fix it?
The easiest and most recommended solution is to not attempt to put
these items in the same structure together. By doing this, your
structure nesting will mimic the lifetimes of your code. Place types
that own data into a structure together and then provide methods that
allow you to get references or objects containing references as needed.
There is a special case where the lifetime tracking is overzealous:
when you have something placed on the heap. This occurs when you use a
Box<T>, for example. In this case, the structure that is moved
contains a pointer into the heap. The pointed-at value will remain
stable, but the address of the pointer itself will move. In practice,
this doesn't matter, as you always follow the pointer.
Some crates provide ways of representing this case, but they
require that the base address never move. This rules out mutating
vectors, which may cause a reallocation and a move of the
heap-allocated values.
rental (no longer maintained or supported)
owning_ref (has multiple soundness issues)
ouroboros
self_cell
Examples of problems solved with Rental:
Is there an owned version of String::chars?
Returning a RWLockReadGuard independently from a method
How can I return an iterator over a locked struct member in Rust?
How to return a reference to a sub-value of a value that is under a mutex?
How do I store a result using Serde Zero-copy deserialization of a Futures-enabled Hyper Chunk?
How to store a reference without having to deal with lifetimes?
In other cases, you may wish to move to some type of reference-counting, such as by using Rc or Arc.
More information
After moving parent into the struct, why is the compiler not able to get a new reference to parent and assign it to child in the struct?
While it is theoretically possible to do this, doing so would introduce a large amount of complexity and overhead. Every time that the object is moved, the compiler would need to insert code to "fix up" the reference. This would mean that copying a struct is no longer a very cheap operation that just moves some bits around. It could even mean that code like this is expensive, depending on how good a hypothetical optimizer would be:
let a = Object::new();
let b = a;
let c = b;
Instead of forcing this to happen for every move, the programmer gets to choose when this will happen by creating methods that will take the appropriate references only when you call them.
A type with a reference to itself
There's one specific case where you can create a type with a reference to itself. You need to use something like Option to make it in two steps though:
#[derive(Debug)]
struct WhatAboutThis<'a> {
name: String,
nickname: Option<&'a str>,
}
fn main() {
let mut tricky = WhatAboutThis {
name: "Annabelle".to_string(),
nickname: None,
};
tricky.nickname = Some(&tricky.name[..4]);
println!("{:?}", tricky);
}
This does work, in some sense, but the created value is highly restricted - it can never be moved. Notably, this means it cannot be returned from a function or passed by-value to anything. A constructor function shows the same problem with the lifetimes as above:
fn creator<'a>() -> WhatAboutThis<'a> { /* ... */ }
If you try to do this same code with a method, you'll need the alluring but ultimately useless &'a self. When that's involved, this code is even more restricted and you will get borrow-checker errors after the first method call:
#[derive(Debug)]
struct WhatAboutThis<'a> {
name: String,
nickname: Option<&'a str>,
}
impl<'a> WhatAboutThis<'a> {
fn tie_the_knot(&'a mut self) {
self.nickname = Some(&self.name[..4]);
}
}
fn main() {
let mut tricky = WhatAboutThis {
name: "Annabelle".to_string(),
nickname: None,
};
tricky.tie_the_knot();
// cannot borrow `tricky` as immutable because it is also borrowed as mutable
// println!("{:?}", tricky);
}
See also:
Cannot borrow as mutable more than once at a time in one code - but can in another very similar
What about Pin?
Pin, stabilized in Rust 1.33, has this in the module documentation:
A prime example of such a scenario would be building self-referential structs, since moving an object with pointers to itself will invalidate them, which could cause undefined behavior.
It's important to note that "self-referential" doesn't necessarily mean using a reference. Indeed, the example of a self-referential struct specifically says (emphasis mine):
We cannot inform the compiler about that with a normal reference,
since this pattern cannot be described with the usual borrowing rules.
Instead we use a raw pointer, though one which is known to not be null,
since we know it's pointing at the string.
The ability to use a raw pointer for this behavior has existed since Rust 1.0. Indeed, owning-ref and rental use raw pointers under the hood.
The only thing that Pin adds to the table is a common way to state that a given value is guaranteed to not move.
See also:
How to use the Pin struct with self-referential structures?
A slightly different issue which causes very similar compiler messages is object lifetime dependency, rather than storing an explicit reference. An example of that is the ssh2 library. When developing something bigger than a test project, it is tempting to try to put the Session and Channel obtained from that session alongside each other into a struct, hiding the implementation details from the user. However, note that the Channel definition has the 'sess lifetime in its type annotation, while Session doesn't.
This causes similar compiler errors related to lifetimes.
One way to solve it in a very simple way is to declare the Session outside in the caller, and then for annotate the reference within the struct with a lifetime, similar to the answer in this Rust User's Forum post talking about the same issue while encapsulating SFTP. This will not look elegant and may not always apply - because now you have two entities to deal with, rather than one that you wanted!
Turns out the rental crate or the owning_ref crate from the other answer are the solutions for this issue too. Let's consider the owning_ref, which has the special object for this exact purpose:
OwningHandle. To avoid the underlying object moving, we allocate it on the heap using a Box, which gives us the following possible solution:
use ssh2::{Channel, Error, Session};
use std::net::TcpStream;
use owning_ref::OwningHandle;
struct DeviceSSHConnection {
tcp: TcpStream,
channel: OwningHandle<Box<Session>, Box<Channel<'static>>>,
}
impl DeviceSSHConnection {
fn new(targ: &str, c_user: &str, c_pass: &str) -> Self {
use std::net::TcpStream;
let mut session = Session::new().unwrap();
let mut tcp = TcpStream::connect(targ).unwrap();
session.handshake(&tcp).unwrap();
session.set_timeout(5000);
session.userauth_password(c_user, c_pass).unwrap();
let mut sess = Box::new(session);
let mut oref = OwningHandle::new_with_fn(
sess,
unsafe { |x| Box::new((*x).channel_session().unwrap()) },
);
oref.shell().unwrap();
let ret = DeviceSSHConnection {
tcp: tcp,
channel: oref,
};
ret
}
}
The result of this code is that we can not use the Session anymore, but it is stored alongside with the Channel which we will be using. Because the OwningHandle object dereferences to Box, which dereferences to Channel, when storing it in a struct, we name it as such. NOTE: This is just my understanding. I have a suspicion this may not be correct, since it appears to be quite close to discussion of OwningHandle unsafety.
One curious detail here is that the Session logically has a similar relationship with TcpStream as Channel has to Session, yet its ownership is not taken and there are no type annotations around doing so. Instead, it is up to the user to take care of this, as the documentation of handshake method says:
This session does not take ownership of the socket provided, it is
recommended to ensure that the socket persists the lifetime of this
session to ensure that communication is correctly performed.
It is also highly recommended that the stream provided is not used
concurrently elsewhere for the duration of this session as it may
interfere with the protocol.
So with the TcpStream usage, is completely up to the programmer to ensure the correctness of the code. With the OwningHandle, the attention to where the "dangerous magic" happens is drawn using the unsafe {} block.
A further and a more high-level discussion of this issue is in this Rust User's Forum thread - which includes a different example and its solution using the rental crate, which does not contain unsafe blocks.
I've found the Arc (read-only) or Arc<Mutex> (read-write with locking) patterns to be sometimes quite useful tradeoff between performance and code complexity (mostly caused by lifetime-annotation).
Arc:
use std::sync::Arc;
struct Parent {
child: Arc<Child>,
}
struct Child {
value: u32,
}
struct Combined(Parent, Arc<Child>);
fn main() {
let parent = Parent { child: Arc::new(Child { value: 42 }) };
let child = parent.child.clone();
let combined = Combined(parent, child.clone());
assert_eq!(combined.0.child.value, 42);
assert_eq!(child.value, 42);
// combined.0.child.value = 50; // fails, Arc is not DerefMut
}
Arc + Mutex:
use std::sync::{Arc, Mutex};
struct Child {
value: u32,
}
struct Parent {
child: Arc<Mutex<Child>>,
}
struct Combined(Parent, Arc<Mutex<Child>>);
fn main() {
let parent = Parent { child: Arc::new(Mutex::new(Child {value: 42 }))};
let child = parent.child.clone();
let combined = Combined(parent, child.clone());
assert_eq!(combined.0.child.lock().unwrap().value, 42);
assert_eq!(child.lock().unwrap().value, 42);
child.lock().unwrap().value = 50;
assert_eq!(combined.0.child.lock().unwrap().value, 50);
}
See also RwLock (When or why should I use a Mutex over an RwLock?)
As a newcomer to Rust, I had a case similar to your last example:
struct Combined<'a>(Parent, Child<'a>);
fn make_combined<'a>() -> Combined<'a> {
let parent = Parent::new();
let child = parent.child();
Combined(parent, child)
}
In the end, I solved it by using this pattern:
fn make_parent_and_child<'a>(anchor: &'a mut DataAnchorFor1<Parent>) -> Child<'a> {
// construct parent, then store it in anchor object the caller gave us a mut-ref to
*anchor = DataAnchorFor1::holding(Parent::new());
// now retrieve parent from storage-slot we assigned to in the previous line
let parent = anchor.val1.as_mut().unwrap();
// now proceed with regular code, except returning only the child
// (the parent can already be accessed by the caller through the anchor object)
let child = parent.child();
child
}
// this is a generic struct that we can define once, and use whenever we need this pattern
// (it can also be extended to have multiple slots, naturally)
struct DataAnchorFor1<T> {
val1: Option<T>,
}
impl<T> DataAnchorFor1<T> {
fn empty() -> Self {
Self { val1: None }
}
fn holding(val1: T) -> Self {
Self { val1: Some(val1) }
}
}
// for my case, this was all I needed
fn main_simple() {
let anchor = DataAnchorFor1::empty();
let child = make_parent_and_child(&mut anchor);
let child_processing_result = do_some_processing(child);
println!("ChildProcessingResult:{}", child_processing_result);
}
// but if access to parent-data later on is required, you can use this
fn main_complex() {
let anchor = DataAnchorFor1::empty();
// if you want to use the parent object (which is stored in anchor), you must...
// ...wrap the child-related processing in a new scope, so the mut-ref to anchor...
// ...gets dropped at its end, letting us access anchor.val1 (the parent) directly
let child_processing_result = {
let child = make_parent_and_child(&mut anchor);
// do the processing you want with the child here (avoiding ref-chain...
// ...back to anchor-data, if you need to access parent-data afterward)
do_some_processing(child)
};
// now that scope is ended, we can access parent data directly
// so print out the relevant data for both parent and child (adjust to your case)
let parent = anchor.val1.unwrap();
println!("Parent:{} ChildProcessingResult:{}", parent, child_processing_result);
}
This is far from a universal solution! But it worked in my case, and only required usage of the main_simple pattern above (not the main_complex variant), because in my case the "parent" object was just something temporary (a database "Client" object) that I had to construct to pass to the "child" object (a database "Transaction" object) so I could run some database commands.
Anyway, it accomplished the encapsulation/simplification-of-boilerplate that I needed (since I had many functions that needed creation of a Transaction/"child" object, and now all they need is that generic anchor-object creation line), while avoiding the need for using a whole new library.
These are the libraries I'm aware of that may be relevant:
owning-ref
rental
ouroboros
reffers
self_cell
escher
rust-viewbox
However, I scanned through them, and they all seem to have issues of one kind or another (not being updated in years, having multiple unsoundness issues/concerns raised, etc.), so I was hesitant to use them.
So while this isn't as generic of a solution, I figured I would mention it for people with similar use-cases:
Where the caller only needs the "child" object returned.
But the called-function needs to construct a "parent" object to perform its functions.
And the borrowing rules requires that the "parent" object be stored somewhere that persists beyond the "make_parent_and_child" function. (in my case, this was a start_transaction function)

How can we call a function mutable for self inside a for loop? [duplicate]

I'm having a hard time with the borrow checker.
for item in self.xxx.iter() {
self.modify_self_but_not_xxx(item);
}
The above code worked before I refactored some code into modify_self_but_not_xxx():
error: cannot borrow `*self` as mutable because `self.xxx` is also borrowed as immutable
How can I call a mutating method while holding a reference to self (e.g. from within a for-loop)?
How can I call a mutating method while holding a reference to self (e.g. from within a for-loop)?
You can't, that's exactly what the borrowing rules prevent.
The main idea is that in your code, the borrow checker cannot possibly know that self.modify_self_but_not_xxx(..) will not modify xxx.
However, you can mutate self.yyy or any other parameters, so either you can:
do the computations of modify_self_but_not_xxx(..) directly in your loop body
define a helper function taking mutable references to update them:
fn do_computations(item: Foo, a: &mut Bar, b: &mut Baz) { /* ... */ }
/* ... */
for item in self.xxx.iter() {
do_computations(item, &mut self.bar, &mut self.baz);
}
define a helper struct that has helper methods

How do Rust's Trait Objects handle methods which moves the instance?

Here are my underlying assumptions of how Rust's methods work:
foo.method();
where method is defined as method(&self) and foo is an instance of Foo, is the same as
Foo::method(&foo);
My understanding of Trait Objects is a struct with two void pointers, one to the data, and another to the function pointers (vtable)
A polymorphic function that takes in a Trait Object and calls a method on that Trait object will compile down to looking at the offset of the method in the Trait Object and passing in the data pointer
But what if the method moves the instance? Correct me if I am wrong, but to call a virtual move method, the function would have to push the actual data stored inside the Trait Object onto the stack instead of just the data pointer. The data size obviously cannot be known at compile-time, so what is going on here? Is this a VLA sort of situation, or do I misunderstand how a move works?
The answer is very simple - it's impossible to call the self-consuming method on trait object.
The key words are object safety. Essentially, this property combines two requirements:
every method must work with self through some kind of indirection;
every method must be fully identifiable via original type (i.e. no generics).
To see this in more detail, let's actually try to code something and ask compiler's opinion. First of all, just trying to define the trait:
trait Consumer {
fn consume(self);
}
Compiler is already unhappy:
error[E0277]: the size for values of type `Self` cannot be known at compilation time
--> src/lib.rs:2:16
|
2 | fn consume(self);
| ^^^^ doesn't have a size known at compile-time
|
help: consider further restricting `Self`
|
2 | fn consume(self) where Self: Sized;
| ^^^^^^^^^^^^^^^^^
help: function arguments must have a statically known size, borrowed types always have a known size
|
2 | fn consume(&self);
| ^
OK, we can be even more conservative then the compiler's advice and add the restriction on trait. Then, add a stub for trait object creation:
trait Consumer where Self: Sized {
fn consume(self);
}
fn main() {
let _: Box<dyn Consumer> = todo!();
}
Now, the error is slightly more complex:
error[E0038]: the trait `Consumer` cannot be made into an object
--> src/main.rs:6:12
|
6 | let _: Box<dyn Consumer> = todo!();
| ^^^^^^^^^^^^^^^^^ `Consumer` cannot be made into an object
|
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
--> src/main.rs:1:28
|
1 | trait Consumer where Self: Sized {
| -------- ^^^^^ ...because it requires `Self: Sized`
| |
| this trait cannot be made into an object...
There is one workaround, however: it's not necessary to restrict the whole trait - just the offending method, as we were told from the start. Moving where clause, as here:
trait Consumer {
fn consume(self) where Self: Sized;
}
...makes the code above compile.
Now, what about actually using this trait object? Let's implement it, for example, for unit type, and use this from main:
trait Consumer {
fn consume(self) where Self: Sized;
}
impl Consumer for () {
fn consume(self) {}
}
fn main() {
let consumer: Box<dyn Consumer> = Box::new(());
consumer.consume();
}
Another compiler error!
error: the `consume` method cannot be invoked on a trait object
--> src/main.rs:11:14
|
2 | fn consume(self) where Self: Sized;
| ----- this has a `Sized` requirement
...
11 | consumer.consume();
| ^^^^^^^
Again, the restriction we've placed on method forbids the code which would make no sense if it compiled.

What is the difference between method call syntax `foo.method()` and UFCS `Foo::method(&foo)`?

Is there any difference in Rust between calling a method on a value, like this:
struct A { e: u32 }
impl A {
fn show(&self) {
println!("{}", self.e)
}
}
fn main() {
A { e: 0 }.show();
}
...and calling it on the type, like this:
fn main() {
A::show(&A { e: 0 })
}
Summary: The most important difference is that the universal function call syntax (UFCS) is more explicit than the method call syntax.
With UFCS there is basically no ambiguity what function you want to call (there is still a longer form of the UFCS for trait methods, but let's ignore that for now). The method call syntax, on the other hand, requires more work in the compiler to figure out which method to call and how to call it. This manifests in mostly two things:
Method resolution: figure out if the method is inherent (bound to the type, not a trait) or a trait method. And in the latter case, also figure out which trait it belongs to.
Figure out the correct receiver type (self) and potentially use type coercions to make the call work.
Receiver type coercions
Let's take a look at this example to understand the type coercions to the receiver type:
struct Foo;
impl Foo {
fn on_ref(&self) {}
fn on_mut_ref(&mut self) {}
fn on_value(self) {}
}
fn main() {
let reference = &Foo; // type `&Foo`
let mut_ref = &mut Foo; // type `&mut Foo`
let mut value = Foo; // type `Foo`
// ...
}
So we have three methods that take Foo, &Foo and &mut Foo receiver and we have three variables with those types. Let's try out all 9 combinations with each, method call syntax and UFCS.
UFCS
Foo::on_ref(reference);
//Foo::on_mut_ref(reference); error: mismatched types
//Foo::on_value(reference); error: mismatched types
//Foo::on_ref(mut_ref); error: mismatched types
Foo::on_mut_ref(mut_ref);
//Foo::on_value(mut_ref); error: mismatched types
//Foo::on_ref(value); error: mismatched types
//Foo::on_mut_ref(value); error: mismatched types
Foo::on_value(value);
As we can see, only the calls succeed where the types are correct. To make the other calls work we would have to manually add & or &mut or * in front of the argument. That's the standard behavior for all function arguments.
Method call syntax
reference.on_ref();
//reference.on_mut_ref(); error: cannot borrow `*reference` as mutable
//reference.on_value(); error: cannot move out of `*reference`
mut_ref.on_ref();
mut_ref.on_mut_ref();
//mut_ref.on_value(); error: cannot move out of `*mut_ref`
value.on_ref();
value.on_mut_ref();
value.on_value();
Only three of the method calls lead to an error while the others succeed. Here, the compiler automatically inserts deref (dereferencing) or autoref (adding a reference) coercions to make the call work. Also note that the three errors are not "type mismatch" errors: the compiler already tried to adjust the type correctly, but this lead to other errors.
There are some additional coercions:
Unsize coercions, described by the Unsize trait. Allows you to call slice methods on arrays and to coerce types into trait objects of traits they implement.
Advanced deref coercions via the Deref trait. This allows you to call slice methods on Vec, for example.
Method resolution: figuring out what method to call
When writing lhs.method_name(), then the method method_name could be an inherent method of the type of lhs or it could belong to a trait that's in scope (imported). The compiler has to figure out which one to call and has a number of rules for this. When getting into the details, these rules are actually really complex and can lead to some surprising behavior. Luckily, most programmers will never have to deal with that and it "just works" most of the time.
To give a coarse overview how it works, the compiler tries the following things in order, using the first method that is found.
Is there an inherent method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there a trait method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there an inherent method with the name method_name? (type coercions will be performed)
Is there a trait method with the name method_name? (type coercions will be performed)
(Again, note that this is still a simplification. Different type of coercions are preferred over others, for example.)
This shows one rule that most programmers know: inherent methods have a higher priority than trait methods. But a bit unknown is the fact that whether or not the receiver type fits perfectly is a more important factor. There is a quiz that nicely demonstrates this: Rust Quiz #23. More details on the exact method resolution algorithm can be found in this StackOverflow answer.
This set of rules can actually make a bunch of changes to an API to be breaking changes. We currently have to deal with that in the attempt to add an IntoIterator impl for arrays.
Another – minor and probably very obvious – difference is that for the method call syntax, the type name does not have to be imported.
Apart from that it's worth pointing out what is not different about the two syntaxes:
Runtime behavior: no difference whatsoever.
Performance: the method call syntax is "converted" (desugared) into basically the UFCS pretty early inside the compiler, meaning that there aren't any performance differences either.

universal reference as parameter and return type

Universal references as parameter or return type
I read a few articles about universal references but I still don't understand in which cases I might need to use this as a parameter type besides the move constructor. Could someone enlighten me?
void Foo(Bar&& x);
Bar&& Foo();
Under which circumstances would I ever want to have this which I couldn't solve with a simple Bar& to move something?
When to use std::move
Could someone explain me when an explicit std::move is necessary (for parameters and return types) under which circumstances I can expect that the compiler uses it automatically during the optimization phase? For example
struct A { A(A&& src) ... };
A Foo()
{
A a;
...
return a;
}
In this case I might benefit from RVO, so should I even ever consider using std::move for a result? Thanks a lot!
Universal references
The example you've provided doesn't actually use universal references, those are just r-value references. Syntactically, the universal reference is an r-value reference to a parameter of a deduce templated type:
template <typename Bar>
void foo(Bar &&bar);
This is actually different then a regular r-value reference and it is used to solve a perfect forwarding problem. But I assume this isn't what your question is about.
R-value references
In most cases when you want to move the value to or from the function you can simply do it by value:
void foo(Bar b);
...
Bar somebar;
foo(std::move(somebar)); //function argument is move-constructed
/**************************************************************/
Bar foo()
{
Bar somebar;
return somebar; //return value is move-constructed
}
Doing this using l-value reference is actually incorrect:
void foo(Bar &b)
{
Bar somebar = std::move(b); //you "stole" passed value
}
...
Bar somebar;
foo(somebar); //but the caller didn't intend to move his value
Also returning any reference to a local variable is wrong.
The only reason one would use r-value reference instead of passing by value is to allow moving the value without actually moving it one extra time:
Bar &&Foo::foo()
{
return memberBar;
}
...
Foo f;
Bar b = f.foo(); //"b" will be move-constructed
...
f.foo().doBar(); //returned "Bar" value is just used and not moved at all
When to use std::move
You need to use std::move every time you want to move a variable even if it's already an r-value reference:
Foo::Foo(Bar &&bar)
: memberBar(std::move(bar)) //still need to move explicitly!
{
}
You don't need to use std::move when:
Returning a local variable by value
Passing a temporary to a function, e.g. foo(Bar())
Passing non-movable types (those without move-constructor) including primitive types
A common mistake:
Bar *bar = new Bar();
foo(std::move(bar)); //not needed! nothing to move since the pointer is passed and not the object itself
However when using a conditional operator:
Bar foo()
{
Bar somebar;
Bar otherbar;
return std::move(true ? somebar : otherbar); //need to move explicitly!
}

Resources