What is the difference between method call syntax `foo.method()` and UFCS `Foo::method(&foo)`?

Is there any difference in Rust between calling a method on a value, like this:
struct A { e: u32 }
impl A {
fn show(&self) {
println!("{}", self.e)
}
}
fn main() {
A { e: 0 }.show();
}
...and calling it on the type, like this:
fn main() {
A::show(&A { e: 0 })
}

Summary: The most important difference is that the universal function call syntax (UFCS) is more explicit than the method call syntax.
With UFCS there is basically no ambiguity about which function you want to call (there is an even longer form of UFCS for trait methods, but let's ignore that for now). The method call syntax, on the other hand, requires more work from the compiler to figure out which method to call and how to call it. This mostly shows up in two things:
Method resolution: figure out if the method is inherent (bound to the type, not a trait) or a trait method. And in the latter case, also figure out which trait it belongs to.
Figure out the correct receiver type (self) and potentially use type coercions to make the call work.
Receiver type coercions
Let's take a look at this example to understand the type coercions to the receiver type:
struct Foo;
impl Foo {
fn on_ref(&self) {}
fn on_mut_ref(&mut self) {}
fn on_value(self) {}
}
fn main() {
let reference = &Foo; // type `&Foo`
let mut_ref = &mut Foo; // type `&mut Foo`
let mut value = Foo; // type `Foo`
// ...
}
So we have three methods that take a &Foo, &mut Foo and Foo receiver, and we have three variables with those types. Let's try out all 9 combinations with both the method call syntax and UFCS.
UFCS
Foo::on_ref(reference);
//Foo::on_mut_ref(reference); error: mismatched types
//Foo::on_value(reference); error: mismatched types
Foo::on_ref(mut_ref); // works: `&mut Foo` coerces to `&Foo`, as for any function argument
Foo::on_mut_ref(mut_ref);
//Foo::on_value(mut_ref); error: mismatched types
//Foo::on_ref(value); error: mismatched types
//Foo::on_mut_ref(value); error: mismatched types
Foo::on_value(value);
As we can see, the calls only succeed where the argument type matches the parameter type. The one exception is that a &mut Foo is accepted where a &Foo is expected, a coercion that applies to every function argument. To make the other calls work we would have to manually add & or &mut or * in front of the argument. That's the standard behavior for all function arguments.
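To make the "manually add & or &mut" part concrete, here is a minimal sketch. The Counter type and its methods are made up for illustration and are not part of the example above.

```rust
// Hypothetical type, used only to illustrate explicit receivers with UFCS.
struct Counter {
    hits: u32,
}

impl Counter {
    fn peek(&self) -> u32 {
        self.hits
    }
    fn bump(&mut self) {
        self.hits += 1;
    }
    fn finish(self) -> u32 {
        self.hits
    }
}

/// Calls all three methods through UFCS, starting from an owned value.
fn ufcs_demo() -> (u32, u32) {
    let mut value = Counter { hits: 0 };
    Counter::bump(&mut value); // `&mut` written by hand, nothing is inserted for us
    Counter::bump(&mut value);
    let seen = Counter::peek(&value); // `&` written by hand
    let total = Counter::finish(value); // passed by value, `value` is moved here
    (seen, total)
}

fn main() {
    let (seen, total) = ufcs_demo();
    println!("seen = {}, total = {}", seen, total);
}
```

Dropping any of the hand-written & or &mut here turns the call into a "mismatched types" error.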
Method call syntax
reference.on_ref();
//reference.on_mut_ref(); error: cannot borrow `*reference` as mutable
//reference.on_value(); error: cannot move out of `*reference`
mut_ref.on_ref();
mut_ref.on_mut_ref();
//mut_ref.on_value(); error: cannot move out of `*mut_ref`
value.on_ref();
value.on_mut_ref();
value.on_value();
Only three of the method calls lead to an error while the others succeed. Here, the compiler automatically inserts deref (dereferencing) or autoref (adding a reference) coercions to make the call work. Also note that the three errors are not "type mismatch" errors: the compiler already adjusted the receiver type, but that adjustment led to other errors.
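A small sketch of the same adjustments on a type that actually holds state, so the effect is observable. Light and its methods are hypothetical names chosen for this illustration.

```rust
// Hypothetical type, used only to show autoref/deref on method calls.
struct Light {
    on: bool,
}

impl Light {
    fn is_on(&self) -> bool {
        self.on
    }
    fn toggle(&mut self) {
        self.on = !self.on;
    }
    fn into_state(self) -> bool {
        self.on
    }
}

fn method_call_demo() -> (bool, bool) {
    let mut value = Light { on: false };
    value.toggle(); // autoref: treated like `Light::toggle(&mut value)`

    let mut_ref = &mut value;
    mut_ref.toggle(); // receiver already is a `&mut Light`
    let seen = mut_ref.is_on(); // reborrowed as `&Light`
    // mut_ref.into_state(); // error: cannot move out of `*mut_ref`

    value.toggle(); // fine again, `mut_ref` is no longer used
    (seen, value.into_state()) // by value: `value` is moved
}

fn main() {
    println!("{:?}", method_call_demo());
}
```

None of the calls spells out & or &mut on the receiver; the compiler picks the adjustment from the method's self parameter.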
There are some additional coercions:
Unsize coercions, described by the Unsize trait. Allows you to call slice methods on arrays and to coerce types into trait objects of traits they implement.
Advanced deref coercions via the Deref trait. This allows you to call slice methods on Vec, for example.
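As a rough illustration of these two coercions, the sketch below calls slice methods on an array and on a Vec. The function name slice_methods is made up; first and contains are the standard slice methods.

```rust
fn slice_methods() -> (Option<i32>, bool, bool) {
    let array = [3, 1, 2]; // type `[i32; 3]`
    let vector = vec![3, 1, 2]; // type `Vec<i32>`

    // `first` is defined on slices `[T]`; the array receiver is unsized to a slice.
    let first = array.first().copied();

    // `contains` is a slice method too; `Vec<i32>` reaches it through `Deref<Target = [i32]>`.
    let has_two = vector.contains(&2);

    // With UFCS the receiver is written out explicitly. `&vector` is a `&Vec<i32>`,
    // which deref-coerces to `&[i32]` like any other function argument.
    let has_nine = <[i32]>::contains(&vector, &9);

    (first, has_two, has_nine)
}

fn main() {
    println!("{:?}", slice_methods());
}
```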
Method resolution: figuring out what method to call
When writing lhs.method_name(), then the method method_name could be an inherent method of the type of lhs or it could belong to a trait that's in scope (imported). The compiler has to figure out which one to call and has a number of rules for this. When getting into the details, these rules are actually really complex and can lead to some surprising behavior. Luckily, most programmers will never have to deal with that and it "just works" most of the time.
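A minimal sketch of such a name clash, with made-up names (Ruler, Metric, Imperial): the method call syntax picks the inherent method, while the explicit forms let you name the trait you mean.

```rust
// Hypothetical traits and type: one inherent method and two trait methods share a name.
trait Metric {
    fn unit(&self) -> &'static str;
}
trait Imperial {
    fn unit(&self) -> &'static str;
}

struct Ruler;

impl Ruler {
    fn unit(&self) -> &'static str {
        "ticks"
    }
}
impl Metric for Ruler {
    fn unit(&self) -> &'static str {
        "cm"
    }
}
impl Imperial for Ruler {
    fn unit(&self) -> &'static str {
        "inch"
    }
}

fn units() -> [&'static str; 3] {
    let ruler = Ruler;
    [
        ruler.unit(),                     // method call syntax: the inherent method is chosen
        Metric::unit(&ruler),             // trait named explicitly
        <Ruler as Imperial>::unit(&ruler), // fully qualified form
    ]
}

fn main() {
    println!("{:?}", units());
}
```

Without the inherent method, ruler.unit() would be rejected as ambiguous, because both traits are in scope and fit equally well.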
To give a coarse overview of how it works: the compiler tries the following things in order and uses the first method it finds.
Is there an inherent method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there a trait method with the name method_name where the receiver type fits exactly (does not need coercions)?
Is there an inherent method with the name method_name? (type coercions will be performed)
Is there a trait method with the name method_name? (type coercions will be performed)
(Again, note that this is still a simplification. Some kinds of coercions are preferred over others, for example.)
This shows one rule that most programmers know: inherent methods have a higher priority than trait methods. Less well known is that whether or not the receiver type fits exactly is an even more important factor. There is a quiz that nicely demonstrates this: Rust Quiz #23. More details on the exact method resolution algorithm can be found in this StackOverflow answer.
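The following sketch is written in the spirit of that quiz (it is not a copy of it, and the names are made up). The inherent g needs a &mut S receiver, the trait g only a &S. Since the shared-reference adjustment is tried before the mutable one, the trait method is selected for g even though an inherent method of the same name exists.

```rust
trait Describe {
    fn f(&self) -> &'static str;
    fn g(&self) -> &'static str;
}

struct S;

#[allow(dead_code)] // the inherent `g` is deliberately never selected below
impl S {
    fn f(&self) -> &'static str {
        "inherent"
    }
    fn g(&mut self) -> &'static str {
        "inherent"
    }
}

impl Describe for S {
    fn f(&self) -> &'static str {
        "trait"
    }
    fn g(&self) -> &'static str {
        "trait"
    }
}

fn resolve() -> (&'static str, &'static str) {
    // `S.f()`: at receiver type `&S` the inherent `f` fits, and inherent beats trait.
    // `S.g()`: at receiver type `&S` only the trait `g` fits; `&mut S` is never tried.
    (S.f(), S.g())
}

fn main() {
    println!("{:?}", resolve());
}
```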
This set of rules can actually turn a number of seemingly harmless API changes into breaking changes. We currently have to deal with that in the attempt to add an IntoIterator impl for arrays.
Another – minor and probably very obvious – difference is that for the method call syntax, the type name does not have to be imported.
Apart from that it's worth pointing out what is not different about the two syntaxes:
Runtime behavior: no difference whatsoever.
Performance: the method call syntax is "converted" (desugared) into basically the UFCS pretty early inside the compiler, meaning that there aren't any performance differences either.

Related

How to write several implementations of the same method that have different signatures

I have several implementations of the same method SetRateForMeasure:
package repartition
type Repartition interface {
Name() string
Compute(meters []models.Meter, totalsProd, totalsConso map[string]float64) []models.Meter
SetRateForMeasure(meter models.Meter, measure models.Measure, total float64) float64
}
Then, in my code (in repartition.go), I call it:
rate := repartition.SetRateForMeasure(meter, measure, total)
where repartition is the interface defined before.
The thing is, when I add a new implementation of this method, the arguments of my functions might differ.
For example, the static repartition uses a static percentage that is only needed in this case.
I end up adding parameters so that all methods share a common interface, but the result is a lot of unused parameters, depending on the implementation.
If I add the percentage to the common interface, it will be unused by the other definitions.
I tried to remove this method from my interface definition, but now
rate := repartition.SetRateForMeasure()
is no longer defined.
How should I organize my code?
There is no function overloading in Go, so you cannot declare the same function with different arguments. There are a few ways you can implement this, though:
You can add multiple functions with different names and signatures
You can change the function to accept a struct instead of arguments
SetRateForMeasure(args SetRateOptions) float64
type SetRateOptions struct {
Meter models.Meter
Measure models.Measure
Total float64
Percentage *float64 // If nil, use default percentage
// ... more parameters as needed
}
Go doesn't support method overloading. You can either define methods with different names that take different parameters, or you can declare the method to accept a parameter struct.
type SetRateParams struct {
Meter models.Meter
Measure models.Measure
Total float64
}
type Repartition interface {
SetRateForMeasure(params SetRateParams) float64
}
Optionally, you can declare params in your structs as pointers, so you can represent "not-provided" semantics with nil instead of using the zero-value. This might be relevant in case of numerical params where 0 could be a valid value.
Using a struct param has also the advantage that you don't have to change all the call sites in case you decide to add an additional param 6 months from now (you just add it to the struct).
There are also worse solutions with interface{} varargs, for the sake of stating what is possible, but unless you loathe type safety, I wouldn't recommend that.

What does the 'where' clause within a trait do?

If I have this code:
trait Trait {
fn f(&self) -> i32 where Self: Sized;
fn g(&self) -> i32;
}
fn object_safety_dynamic(x: &Trait) {
x.f(); // error
x.g(); // works
}
What does the where clause actually do?
Naively, I was thinking where Self: Sized; dictates something about the type implementing Trait, like 'if you implement Trait for type A, your type A must be sized', i.e., it can be i32 but not [i32].
However, such a constraint would rather go as trait Trait: Sized (correct me if I am wrong)?
Now I noticed where Self: Sized; actually determines if I can call f or g from within object_safety_dynamic.
My questions:
What happens here behind the scenes?
What (in simple English) am I actually telling the compiler by where Self: Sized; that makes g() work but f() not?
In particular: Since &self is a reference anyway, what compiled difference exists between f and g for various (sized or unsized) types. Wouldn't it always boil down to something like _vtable_f_or_g(*self) -> i32, regardless of where or if the type is sized or not?
Why can I implement Trait for both u8 and [u8]. Shouldn't the compiler actually stop me from implementing f() for [u8], instead of throwing an error at the call site?
fn f(&self) -> i32 where Self: Sized;
This says that f is only defined for types that also implement Sized. Unsized types may still implement Trait, but f will not be available.
Inside object_safety_dynamic, calling x.f() is actually doing: (*x).f(). While x is sized because it's a pointer, *x might not be because it could be any implementation of Trait. But code inside the function has to work for any valid argument, so you are not allowed to call x.f() there.
What does the where clause actually do?
Naively, I was thinking where Self: Sized; dictates something about the type implementing Trait, like 'if you implement Trait for type A, your type A must be sized', i.e., it can be i32 but not [i32].
However, such a constraint would rather go as trait Trait: Sized
This is correct.
However, in this case, the bound applies only to the function. where bounds on functions are only checked at the callsite.
What happens here behind the scenes?
There is a confusing bit about Rust's syntax, which is that Trait can refer to either
The trait Trait; or
The "trait object" Trait, which is actually a type, not an object.
Sized is a trait, and any type T that is Sized has a size known at compile time, which is available as a constant via std::mem::size_of::<T>(). Examples of types that are not sized are str and [u8], whose contents do not have a fixed size.
The type Trait is also unsized. Intuitively, this is because Trait as a type consists of all values of types that implement the trait Trait, which may have varying size. This means you can never have a value of type Trait - you can only refer to one via a "fat pointer" such as &Trait or Box<Trait> and so on. These have the size of 2 pointers - one for a vtable, one for the data. It looks roughly like this:
struct &Trait {
pub data: *mut (),
pub vtable: *mut (),
}
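The "two pointers" claim can be checked with std::mem::size_of. This is only a sketch: the exact layout of trait object pointers is an implementation detail, and the code uses the dyn keyword that current Rust requires when writing trait object types. The helper name pointer_sizes is made up.

```rust
use std::mem::size_of;

#[allow(dead_code)] // `g` only exists so that the trait has a vtable entry
trait Trait {
    fn g(&self) -> i32;
}

/// Returns the sizes of a thin reference, a trait object reference and a boxed trait object.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),            // thin pointer: data only
        size_of::<&dyn Trait>(),     // fat pointer: data + vtable
        size_of::<Box<dyn Trait>>(), // fat pointer as well
    )
}

fn main() {
    let (thin, fat_ref, fat_box) = pointer_sizes();
    println!("thin = {}, &dyn = {}, Box<dyn> = {}", thin, fat_ref, fat_box);
}
```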
There is automatically an impl of the form:
impl Trait /* the trait */ for Trait /* the type */ {
fn f(&self) -> i32 where Self: Sized { .. }
fn g(&self) -> i32 {
/* vtable magic: something like (self.vtable.g)(self.data) */
}
}
What (in simple English) am I actually telling the compiler by where Self: Sized; that makes g() work but f() not?
Note that since, as I mentioned, Trait is not Sized, the bound Self: Sized is not satisfied and so the function f cannot be called where Self == Trait.
In particular: Since &self is a reference anyway, what compiled difference exists between f and g for various (sized or unsized) types. Wouldn't it always boil down to something like _vtable_f_or_g(*self) -> i32, regardless of where or if the type is sized or not?
The type Trait is always unsized. It doesn't matter which type has been coerced to Trait. The way you call the function with a Sized variable is to use it directly:
fn generic<T: Trait + Sized>(x: &T) { // the `Sized` bound is implicit, added here for clarity
x.f(); // compiles just fine
x.g();
}
Why can I implement Trait for both u8 and [u8]. Shouldn't the compiler actually stop me from implementing f() for [u8], instead of throwing an error at the call site?
Because the trait is not bounded by Self: Sized - the function f is. So there is nothing stopping you from implementing the function - it's just that the bounds on the function can never be satisfied, so you can never call it.
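To see both sides in one place, here is a small sketch with hypothetical names (Shape, Square, doubled_area), written with the dyn keyword that current Rust requires for trait object types. Unlike f in the question, the bounded method has a default body here so the example stays short.

```rust
trait Shape {
    fn area(&self) -> i32;

    // Only available when the concrete type is known and sized.
    fn doubled_area(&self) -> i32
    where
        Self: Sized,
    {
        2 * self.area()
    }
}

struct Square(i32);

impl Shape for Square {
    fn area(&self) -> i32 {
        self.0 * self.0
    }
}

// `T` is implicitly `Sized`, so the bound on `doubled_area` holds.
fn via_generic<T: Shape>(x: &T) -> i32 {
    x.doubled_area()
}

// Here `Self` is `dyn Shape`, which is unsized.
fn via_object(x: &dyn Shape) -> i32 {
    // x.doubled_area(); // rejected: `Self: Sized` does not hold for `dyn Shape`
    x.area()
}

fn main() {
    let square = Square(3);
    println!("{} {}", via_generic(&square), via_object(&square));
}
```

The same Square value works in both functions; only the static type of x decides whether the bounded method may be called.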

Use map[string]SpecificType with method of map[string]SomeInterface into

I get cannot use map[string]MyType literal (type map[string]MyType) as type map[string]IterableWithID in argument to MapToList with the code below. How do I pass a concrete map type to a method that expects an interface type?
https://play.golang.org/p/G7VzMwrRRw
Go's interface convention doesn't quite work the same way as in, say, Java (and the designers apparently didn't like the idea of getters and setters very much :-/ ). So you've got two core problems:
A map[string]Foo is not the same as a map[string]Bar, even if Bar implements Foo, so you have to break it out a bit (use make() beforehand, then assign in a single assignment).
Methods with a value receiver (as used here) operate on a copy of the value, so you really need to do foo = foo.Method(bar) in your callers or get really pointer-happy to implement something like this.
What you can do to more-or-less simulate what you want:
type IterableWithID interface {
SetID(id string) IterableWithID // use as foo = foo.SetID(bar)
}
func (t MyType) SetID(id string) IterableWithID {
t.ID = id
return t
}
...and to deal with the typing problem
t := make(map[string]IterableWithID)
t["foo"] = MyType{}
MapToList(t) // This is a map[string]IterableWithID, so compiler's happy.
...and finally...
value = value.SetID(key) // We set back the copy of the value we mutated
The final value= deals with the fact that the method gets a fresh copy of the value object, so the original would be untouched by your method (the change would simply vanish).
Updated code on the Go Playground
...but it's not particularly idiomatic Go--they really want you to just reference struct members rather than use Java-style mutators in interfaces (though TBH I'm not so keen on that little detail--mutators are supes handy to do validation).
You can't do what you want to do because the two map types are different. It doesn't matter that the element type of one is a type that implements the interface which is the element type of the other. The map type that you pass into the function has to be map[string]IterableWithID. You could create a map of that type, assign values of type MyType to the map, and pass that to the function.
See https://play.golang.org/p/NfsTlunHkW
Also, you probably don't want to be returning a pointer to a slice in MapToList. Just return the slice itself. A slice contains a reference to the underlying array.

Cannot use Rayon's `.par_iter()`

I have a struct which implements Iterator and it works fine as an iterator. It produces values, and using .map(), I download each item from a local HTTP server and save the results. I now want to parallelize this operation, and Rayon looks friendly.
I am getting a compiler error when trying to follow the example in the documentation.
This is the code that works sequentially. generate_values returns the struct which implements Iterator. dl downloads the values and saves them (i.e. it has side effects). Since iterators are lazy in Rust, I have put a .count() at the end so that it will actually run it.
generate_values(14).map(|x| { dl(x, &path, &upstream_url); }).count();
Following the Rayon example I tried this:
generate_values(14).par_iter().map(|x| { dl(x, &path, &upstream_url); }).count();
and got the following error:
src/main.rs:69:27: 69:37 error: no method named `par_iter` found for type `MyIterator` in the current scope
Interestingly, when I use .iter(), which many Rust things use, I get a similar error:
src/main.rs:69:27: 69:33 error: no method named `iter` found for type `MyIterator` in the current scope
src/main.rs:69 generate_values(14).iter().map(|tile| { dl_tile(tile, &tc_path, &upstream_url); }).count();
Since I implement Iterator, I should get .iter() for free right? Is this why .par_iter() doesn't work?
Rust 1.6 and Rayon 0.3.1
$ rustc --version
rustc 1.6.0 (c30b771ad 2016-01-19)
Rayon 0.3.1 defines par_iter as:
pub trait IntoParallelRefIterator<'data> {
type Iter: ParallelIterator<Item=&'data Self::Item>;
type Item: Sync + 'data;
fn par_iter(&'data self) -> Self::Iter;
}
There is only one type that implements this trait in Rayon itself: [T]:
impl<'data, T: Sync + 'data> IntoParallelRefIterator<'data> for [T] {
type Item = T;
type Iter = SliceIter<'data, T>;
fn par_iter(&'data self) -> Self::Iter {
self.into_par_iter()
}
}
That's why Lukas Kalbertodt's answer to collect to a Vec will work; Vec dereferences to a slice.
Generally, Rayon could not assume that any iterator would be amenable to parallelization, so it cannot default to including all Iterators.
Since you have defined generate_values, you could implement the appropriate Rayon trait for it as well:
IntoParallelIterator
IntoParallelRefIterator
IntoParallelRefMutIterator
That should allow you to avoid collecting into a temporary vector.
No, the Iterator trait has nothing to do with the iter() method. Yes, this is slightly confusing.
There are a few different concepts here. An Iterator is a type that can spit out values; it only needs to implement next() and has many other methods, but none of these is iter(). Then there is IntoIterator which says that a type can be transformed into an Iterator. This trait has the into_iter() method. Now the iter() method is not really related to any of those two traits. It's just a normal method of many types, that often works similar to into_iter().
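A short sketch of that distinction, using a made-up Countdown iterator: the type implementing Iterator is consumed directly (it has no iter() method), whereas Vec offers iter() as an ordinary method that hands out a borrowing iterator.

```rust
// Hypothetical iterator that counts down from its start value to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0 + 1)
        }
    }
}

fn iterate() -> (Vec<u32>, Vec<u32>) {
    // An `Iterator` is used as is: adapters like `map` and `collect` come with the trait.
    // Countdown(3).iter(); // error: no method named `iter` found
    let from_iterator: Vec<u32> = Countdown(3).collect();

    // `iter()` is a plain method of the collection; it borrows and returns an iterator.
    let numbers = vec![3, 2, 1];
    let doubled: Vec<u32> = numbers.iter().map(|x| x * 2).collect();

    (from_iterator, doubled)
}

fn main() {
    println!("{:?}", iterate());
}
```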
Now to your Rayon problem: it looks like you can't just take any normal iterator and turn it into a parallel one. However, I have never used this library, so take this with a grain of salt. To me it looks like you need to collect your iterator into a Vec to be able to use par_iter().
And just as a note: when using normal iterators, you shouldn't use map() and count(), but rather use a standard for loop.

Explicit lifetime error in Rust

I have a Rust enum that I want to use, however I receive the error:
error: explicit lifetime bound required
numeric(Num),
~~~
The enum in question:
enum expr{
numeric(Num),
symbol(String),
}
I don't think I understand what is being borrowed here. My intent was for the Num or String to have the same lifetime as the containing expr allowing me to return them from functions.
The error message is somewhat misleading. Num is a trait and it is a dynamically sized type, so you can't have values of it without some kind of indirection (a reference or a Box). The reason for this is simple; just ask yourself a question: what size (in bytes) must expr enum values have? It is certainly at least as large as String, but what about Num? Arbitrary types can implement this trait, so in order to be sound expr has to have infinite size!
Hence you can use traits as types only with some kind of pointer: &Num or Box<Num>. Pointers always have fixed size, and trait objects are "fat" pointers, keeping additional information within them to help with method dispatching.
Also traits are usually used as bounds for generic type parameters. Because generics are monomorphized, they turn into static types in the compiled code, so their size is always statically known and they don't need pointers. Using generics should be the default approach, and you should switch to trait objects only when you know why generics won't work for you.
These are possible variants of your type definition. With generics:
enum Expr<N: Num> {
Numeric(N),
Symbol(String)
}
Trait object through a reference:
enum Expr<'a> {
Numeric(&'a Num + 'a),
Symbol(String)
}
Trait object with a box:
enum Expr {
Numeric(Box<Num + 'static>), // I used 'static because numbers usually don't contain references inside them
Symbol(String)
}
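For comparison, here is a sketch of the three variants in current Rust syntax, where trait object types are written with dyn and the reference form needs parentheses. Num is a stand-in trait defined locally for this example (with a made-up as_f64 method), not the trait from the question.

```rust
// Stand-in for the `Num` trait from the question; `as_f64` is hypothetical.
trait Num {
    fn as_f64(&self) -> f64;
}
impl Num for i32 {
    fn as_f64(&self) -> f64 {
        f64::from(*self)
    }
}
impl Num for f64 {
    fn as_f64(&self) -> f64 {
        *self
    }
}

#[allow(dead_code)]
enum GenericExpr<N: Num> {
    Numeric(N), // size known after monomorphization
    Symbol(String),
}

#[allow(dead_code)]
enum RefExpr<'a> {
    Numeric(&'a (dyn Num + 'a)), // trait object behind a reference
    Symbol(String),
}

#[allow(dead_code)]
enum BoxedExpr {
    Numeric(Box<dyn Num + 'static>), // owned trait object on the heap
    Symbol(String),
}

fn eval_generic<N: Num>(e: &GenericExpr<N>) -> Option<f64> {
    match e {
        GenericExpr::Numeric(n) => Some(n.as_f64()),
        GenericExpr::Symbol(_) => None,
    }
}

fn eval_ref(e: &RefExpr<'_>) -> Option<f64> {
    match e {
        RefExpr::Numeric(n) => Some(n.as_f64()),
        RefExpr::Symbol(_) => None,
    }
}

fn eval_boxed(e: &BoxedExpr) -> Option<f64> {
    match e {
        BoxedExpr::Numeric(n) => Some(n.as_f64()),
        BoxedExpr::Symbol(_) => None,
    }
}

fn main() {
    let x = 1.5_f64;
    println!("{:?}", eval_generic(&GenericExpr::Numeric(2_i32)));
    println!("{:?}", eval_ref(&RefExpr::Numeric(&x)));
    println!("{:?}", eval_boxed(&BoxedExpr::Numeric(Box::new(2_i32))));
}
```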
You can read more about generics and traits in the official guide, though at the moment it lacks information on trait objects. Please do ask if you don't understand something.
Update
'a in
enum Expr<'a> {
Numeric(&'a Num + 'a),
Symbol(String)
}
is a lifetime parameter. It defines both the lifetime of a reference and of trait object internals inside Numeric variant. &'a Num + 'a is a type that you can read as "a trait object behind a reference which lives at least as long as 'a with references inside it which also live at least as long as 'a". That is, first, you specify 'a as a reference lifetime: &'a, and second, you specify the lifetime of trait object internals: Num + 'a. The latter is needed because traits can be implemented for any types, including ones which contain references inside them, so you need to put the minimum lifetime of these references into trait object type too, otherwise borrow checking won't work correctly with trait objects.
With Box the situation is very similar. Box<Num + 'static> is "a trait object inside a heap-allocated box with references inside it which live at least as long as 'static". The Box type is a smart pointer for heap-allocated owned data. Because it owns the data it holds, it does not need a lifetime parameter like references do. However, the trait object can still contain references inside it, and that's why a bound like Num + 'a is still needed; I just chose to use the 'static lifetime instead of adding another lifetime parameter. This is because numerical types are usually simple and don't have references inside them, which is equivalent to a 'static bound. You are free to add a lifetime parameter if you want, of course.
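A sketch of why the bound on the trait object matters, in current dyn syntax and with made-up names (Label, Borrowed, boxed_label): the implementing type holds a reference, so the box has to carry that lifetime.

```rust
// Hypothetical trait and implementor; the implementor borrows its data.
trait Label {
    fn text(&self) -> String;
}

struct Borrowed<'a>(&'a str);

impl<'a> Label for Borrowed<'a> {
    fn text(&self) -> String {
        self.0.to_uppercase()
    }
}

// The `+ 'a` records that the boxed value may hold references valid for `'a`.
fn boxed_label<'a>(s: &'a str) -> Box<dyn Label + 'a> {
    Box::new(Borrowed(s))
}

// Without the bound, `Box<dyn Label>` means `Box<dyn Label + 'static>`, so this is rejected:
// fn boxed_static(s: &str) -> Box<dyn Label> { Box::new(Borrowed(s)) }

fn main() {
    let owned = String::from("pi");
    println!("{}", boxed_label(&owned).text());
}
```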
Note that all of these variants are correct:
&'a SomeTrait + 'a
&'a SomeTrait + 'static
Box<SomeTrait + 'a>
Box<SomeTrait + 'static>
Even this is correct, with 'a and 'b as different lifetime parameters:
&'a SomeTrait + 'b
though this is rarely useful, because 'b must be at least as long as 'a (otherwise internals of the trait object could be invalidated while it itself is still alive), so you can just as well use &'a SomeTrait + 'a.

Resources