What is the difference between &Trait and impl Trait when used as method arguments? - syntax

In my project so far, I use many traits to permit mocking/stubbing in unit tests for injected dependencies. However, one detail of what I'm doing so far seems so suspicious that I'm surprised it even compiles. I'm worried that something dangerous is going on that I don't see or understand. It's based on the difference between these two method signatures:
fn confirm<T>(subject: &MyTrait<T>) ...
fn confirm<T>(subject: impl MyTrait<T>) ...
I only just discovered the impl ... syntax in method arguments, and it seems like the only documented way to do this, but my tests pass using the other way already, which I came to by intuition based on how Go solves the same problem (size of method argument at compile time, when argument can be any implementer of an interface, and references can come to the rescue).
What is the difference between these two? And why are they both allowed? Do they both represent legitimate use cases, or is my reference syntax (&MyTrait<T>) strictly a worse idea?

The two are different, and serve different purposes. Both are useful, and depending on circumstances one or the other may be the best choice.
The first case, &MyTrait<T>, should be written &dyn MyTrait<T> in modern Rust; the bare form is deprecated and is a hard error in the 2021 edition. It is a so-called trait object. The reference points to any type implementing MyTrait<T>, and method calls are dispatched dynamically at runtime. To make this possible, the reference is actually a fat pointer; apart from a pointer to the object it also stores a pointer to the virtual method table of the type of the object, to allow dynamic dispatch. If the actual type of your object only becomes known at runtime, this is the only version you can use, since you need to use dynamic dispatch in that case. The downside of the approach is that each method call pays the runtime cost of an indirect call through the vtable, and that it only works for traits that are object-safe.
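To make the trait-object case concrete, here is a minimal sketch. The trait body, the value method and the Real and Mock types are made up for illustration; only the shape of confirm follows the question.

```rust
// Hypothetical stand-in for MyTrait<T> from the question.
trait MyTrait<T> {
    fn value(&self) -> T;
}

struct Real;
struct Mock(i32);

impl MyTrait<i32> for Real {
    fn value(&self) -> i32 {
        42
    }
}

impl MyTrait<i32> for Mock {
    fn value(&self) -> i32 {
        self.0
    }
}

// Dynamic dispatch: `confirm` is compiled once per T, and the call to
// `value` goes through the vtable stored in the fat pointer.
fn confirm<T>(subject: &dyn MyTrait<T>) -> T {
    subject.value()
}

fn main() {
    // The concrete type is only chosen at runtime, so a trait object is needed.
    let use_mock = std::env::args().count() > 1;
    let subject: Box<dyn MyTrait<i32>> = if use_mock {
        Box::new(Mock(7))
    } else {
        Box::new(Real)
    };
    println!("{}", confirm(subject.as_ref()));

    // A trait-object reference is two pointers wide: data pointer + vtable pointer.
    assert_eq!(
        std::mem::size_of::<&dyn MyTrait<i32>>(),
        2 * std::mem::size_of::<usize>()
    );
}
```

Passing &Real or &Mock(7) directly also works, because a plain reference coerces to &dyn MyTrait<i32> at the call site.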
The second case, impl MyTrait<T>, denotes any type implementing MyTrait<T> again, but in this case the exact type needs to be known at compile time. The prototype
fn confirm<T>(subject: impl MyTrait<T>);
is equivalent to
fn confirm<M, T>(subject: M)
where
M: MyTrait<T>;
For each type M that is used in your code, the compiler creates a separate version of confirm in the binary, and method calls are dispatched statically at compile time. This version is preferable if all types are known at compile time, since you don't need to pay the runtime cost of dynamically dispatching to the concrete types.
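A small sketch of that equivalence, under the assumption of a one-method trait (value, Real and Mock are illustrative names, not from the question). The generic variant also returns std::any::type_name::<M>() to show that each call site is instantiated with its own concrete M.

```rust
use std::any::type_name;

// Hypothetical stand-in for MyTrait<T>.
trait MyTrait<T> {
    fn value(&self) -> T;
}

struct Real;
struct Mock;

impl MyTrait<u8> for Real {
    fn value(&self) -> u8 {
        1
    }
}

impl MyTrait<u8> for Mock {
    fn value(&self) -> u8 {
        2
    }
}

// Argument-position `impl Trait`: an anonymous type parameter.
fn confirm_impl<T>(subject: impl MyTrait<T>) -> T {
    subject.value()
}

// The same function with the type parameter spelled out.
// It additionally reports which concrete M it was instantiated with.
fn confirm_generic<M, T>(subject: M) -> (T, &'static str)
where
    M: MyTrait<T>,
{
    (subject.value(), type_name::<M>())
}

fn main() {
    let a: u8 = confirm_impl(Real);
    let (b, name_real) = confirm_generic::<Real, u8>(Real);
    let (_, name_mock) = confirm_generic::<Mock, u8>(Mock);
    assert_eq!(a, b);
    // Two different instantiations, one per concrete type.
    println!("{} / {}", name_real, name_mock);
}
```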
Another difference between the two prototypes is that the first version accepts subject by reference, while the second version consumes the argument that is passed in. This isn't a conceptual difference, though. A bare trait object cannot be taken by value, because its size is not known at compile time (consuming one requires an owning pointer such as Box<dyn MyTrait<T>>), but the second version can easily be written to accept subject by reference:
fn confirm<T>(subject: &impl MyTrait<T>);
Given that you introduced the traits to facilitate testing, it is likely that you should prefer &impl MyTrait<T>.
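For the testing use case, a sketch could look like the following. The fetch method, the Production and Stub types and the expected parameter are assumptions made up for this example.

```rust
// Hypothetical dependency trait; `fetch` is an illustrative method.
trait MyTrait<T> {
    fn fetch(&self) -> T;
}

// Stand-in for the real dependency.
struct Production;

impl MyTrait<String> for Production {
    fn fetch(&self) -> String {
        "live".to_string()
    }
}

// Test double returning a canned value.
struct Stub {
    canned: String,
}

impl MyTrait<String> for Stub {
    fn fetch(&self) -> String {
        self.canned.clone()
    }
}

// Borrows the dependency and is statically dispatched:
// one copy of `confirm` is generated per concrete subject type.
fn confirm<T: PartialEq>(subject: &impl MyTrait<T>, expected: T) -> bool {
    subject.fetch() == expected
}

fn main() {
    let stub = Stub { canned: "ok".to_string() };
    assert!(confirm(&stub, "ok".to_string()));
    // `stub` is still usable here because it was only borrowed.
    assert!(!confirm(&stub, "nope".to_string()));
    assert!(confirm(&Production, "live".to_string()));
}
```

With the by-value signature (subject: impl MyTrait<T>), the second call on stub would not compile, since the first call would have moved it.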

It is indeed different. The impl version is equivalent to the following:
fn confirm<T, M: MyTrait<T>>(subject: M) ...
so unlike the first version, subject is moved (passed by value) into confirm, rather than passed by reference. So in the impl version, confirm takes ownership of this value.

Related

How does GoLand of JetBrains find the implementations of interface?

As far as I know, vim-go relies on Guru to find implementations and usages, which requires compiling the whole project first. GoLand doesn't need to do that, but how does it work?
While this task may look rather trivial, GoLand uses some tricks to perform it more efficiently. Let's go step by step to explore how it works.
When one opens a project for the first time, the IDE performs so-called indexing. In particular, it stores all the method and method spec names as well as the number of their parameters.
At the beginning of the search, GoLand takes method specs of the interface and finds one with the biggest number of parameters. This is a performance optimization. The idea behind it is that methods with many parameters occur less often in the code, so the IDE needs to check just a few of them.
It's time to use the index. For the chosen method spec, the IDE finds all corresponding methods. The scope is taken into account, so for a private interface, for instance, it's much smaller.
For each method, GoLand resolves its type and checks whether it implements the interface. This is the moment when all interface's method specs are taken into account.
Not only structures can implement interfaces but other interfaces, too. As the next steps, the IDE looks for all corresponding method specifications, that is, method specifications with the same name and number of parameters. There's a separate index that's responsible for this.
For each method spec, its interface is taken and checked. No resolution is involved this time as it's enough to traverse a syntax tree up to find an interface of an arbitrary method spec.
That's basically the algorithm. There are a few more implementation details to make it work faster, but they don't affect results.
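The struct part of the steps above can be modelled with a toy index. This is only a sketch in Rust of the idea as described, not GoLand's actual code: Index, Sig, add_method and implementations are invented names, "resolving a type" is reduced to a lookup of its method set, and the interface-to-interface search is left out.

```rust
use std::collections::{HashMap, HashSet};

/// (method name, parameter count): the key the answer says is indexed.
type Sig = (String, usize);

#[derive(Default)]
struct Index {
    /// signature -> names of types declaring a method with that signature
    by_sig: HashMap<Sig, Vec<String>>,
    /// type name -> its full method set (stands in for resolving the type)
    methods_of: HashMap<String, HashSet<Sig>>,
}

impl Index {
    fn add_method(&mut self, ty: &str, name: &str, params: usize) {
        let sig: Sig = (name.to_string(), params);
        self.by_sig.entry(sig.clone()).or_default().push(ty.to_string());
        self.methods_of.entry(ty.to_string()).or_default().insert(sig);
    }

    fn implementations(&self, interface: &[Sig]) -> Vec<String> {
        // Step 1: pick the method spec with the most parameters,
        // assuming it is the rarest one in the code base.
        // (An empty interface yields no results in this toy model.)
        let rarest = match interface.iter().max_by_key(|sig| sig.1) {
            Some(sig) => sig,
            None => return Vec::new(),
        };
        // Step 2: a single index lookup gives the candidate types.
        let candidates = match self.by_sig.get(rarest) {
            Some(types) => types,
            None => return Vec::new(),
        };
        // Step 3: only now check every method spec of the interface.
        let mut found: Vec<String> = candidates
            .iter()
            .filter(|ty| {
                self.methods_of
                    .get(ty.as_str())
                    .map_or(false, |methods| interface.iter().all(|sig| methods.contains(sig)))
            })
            .cloned()
            .collect();
        found.sort();
        found.dedup();
        found
    }
}

fn main() {
    let mut index = Index::default();
    index.add_method("File", "Read", 1);
    index.add_method("File", "Seek", 2);
    index.add_method("Pipe", "Read", 1);
    let read_seeker: Vec<Sig> = vec![("Read".to_string(), 1), ("Seek".to_string(), 2)];
    // Only "File" is a candidate for Seek/2, and it also has Read/1.
    println!("{:?}", index.implementations(&read_seeker)); // prints ["File"]
}
```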
GoLand relies on the IntelliJ Platform and a set of custom written tooling bundled as a custom language plugin on top of it to handle indexing, parsing, navigation and editing code.

How to split up really long match on enum with many variants?

What's the usual best practice to split up a really long match on an enum with dozens of variants to handle, each with dozens or hundreds of lines of code?
I've started to create helper functions for each case and just call those functions passing in the enum's fields (or whatever they're called). But it seems a bit redundant to have MyEnum::MyCase{a,b,c} => handle_mycase(a,b,c) many times.
And if that is the best practice, is it possible to destructure MyEnum::MyCase directly in that helper function's parameters, despite the fact that technically it's refutable, since realistically I already know I'm calling it with the right case?
Maybe the crate enum_dispatch helps you.
IIRC, on a high level: It assumes that all your enum variants implement a trait with a function handle_mycase. Then handle_mycase can be called on the enum directly and will be dispatched to the concrete struct.
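Without pulling in the crate, the same pattern can be written by hand. This is a rough sketch of the idea rather than what enum_dispatch literally generates; the payload structs, the Handle trait and the helper bodies are invented for illustration. Moving each variant's fields into its own struct also makes the destructuring in the helper's parameter list irrefutable.

```rust
// Hypothetical payload structs, one per variant.
struct MyCase {
    a: i32,
    b: i32,
    c: i32,
}

struct OtherCase {
    name: String,
}

enum MyEnum {
    MyCase(MyCase),
    OtherCase(OtherCase),
}

// Destructuring a struct in parameter position is irrefutable, so it is allowed.
fn handle_mycase(MyCase { a, b, c }: MyCase) -> String {
    format!("sum = {}", a + b + c)
}

fn handle_othercase(OtherCase { name }: OtherCase) -> String {
    format!("hello {}", name)
}

// Illustrative trait implemented by every payload type and by the enum itself.
trait Handle {
    fn handle(self) -> String;
}

impl Handle for MyCase {
    fn handle(self) -> String {
        handle_mycase(self)
    }
}

impl Handle for OtherCase {
    fn handle(self) -> String {
        handle_othercase(self)
    }
}

// The only match left: one short forwarding arm per variant.
impl Handle for MyEnum {
    fn handle(self) -> String {
        match self {
            MyEnum::MyCase(inner) => inner.handle(),
            MyEnum::OtherCase(inner) => inner.handle(),
        }
    }
}

fn main() {
    let value = MyEnum::MyCase(MyCase { a: 1, b: 2, c: 3 });
    println!("{}", value.handle()); // prints "sum = 6"
}
```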

Is there a difference between fun(n::Integer) and fun(n::T) where T<:Integer in performance/code generation?

In Julia, I most often see code written like fun(n::T) where T<:Integer, when the function works for all subtypes of Integer. But sometimes, I also see fun(n::Integer), which some guides claim is equivalent to the above, whereas others say it's less efficient because Julia doesn't specialize on the specific subtype unless the subtype T is explicitly referred to.
The latter form is obviously more convenient, and I'd like to be able to use that if possible, but are the two forms equivalent? If not, what are the practical differences between them?
Yes, Bogumił Kamiński is correct in his comment: f(n::T) where T<:Integer and f(n::Integer) will behave exactly the same, with the exception that the former method will have the name T already defined in its body. Of course, in the latter case you can just explicitly assign T = typeof(n) and it'll be computed at compile time.
There are a few other cases where using a TypeVar like this is crucially important, though, and it's probably worth calling them out:
f(::Array{T}) where T<:Integer is indeed very different from f(::Array{Integer}). This is the common parametric invariance gotcha (docs and another SO question about it).
f(::Type) will generate just one specialization for all DataTypes. Because types are so important to Julia, the Type type itself is special and allows parameterization like Type{Integer} to allow you to specify just the Integer type. You can use f(::Type{T}) where T<:Integer to require Julia to specialize on the exact type of Type it gets as an argument, allowing Integer or any subtypes thereof.
Both definitions are equivalent. Normally you will use the fun(n::Integer) form and use fun(n::T) where T<:Integer only if you need to refer to the specific type T directly in your code. For example, consider the following definitions from Base (all following definitions are also from Base), where it has a natural use:
zero(::Type{T}) where {T<:Number} = convert(T,0)
or
(+)(x::T, y::T) where {T<:BitInteger} = add_int(x, y)
And even if you need type information, in many cases it is enough to use the typeof function. Again, an example definition is:
oftype(x, y) = convert(typeof(x), y)
Even if you are using a parametric type, you can often avoid a where clause (which is a bit verbose), as in:
median(r::AbstractRange{<:Real}) = mean(r)
because you do not care about the actual value of the parameter in the body of the function.
Now, if you are a Julia user like me, the question is how to convince yourself that this works as expected. There are the following methods:
you can check that one definition overwrites the other in methods table (i.e. after evaluating both definitions only one method is present for this function);
you can check the code generated for both functions using @code_typed, @code_warntype, @code_llvm or @code_native etc. and find out that it is the same;
finally you can benchmark the code for performance using BenchmarkTools
A nice plot explaining what Julia does with your code is here http://slides.com/valentinchuravy/julia-parallelism#/1/1 (I also recommend the whole presentation to any Julia user; it is excellent). You can see on it that, after lowering the AST, Julia applies a type inference step to specialize the function call before the LLVM codegen step.
You can hint to the Julia compiler that it should avoid specialization. This is done using the @nospecialize macro on Julia 0.7 (it is only a hint, though).

In Java 8, can the method reference part of a call always be parenthesized?

Say I have a type correct Java method call, such as
f.g(5)
Java 8 now allows method references, so in most cases one can now write
(f::g)(5)
where f::g turns into a lambda function which is then called.
Question: Is this always possible even in cases where f::g is overloaded, or can overloading interfere with the two step process? This would happen if the overload determination must happen at the level of the method reference, before the argument types are known.
Motivation: I am writing compiler-like code, which is why I need to understand these subtleties. I am aware that parenthesizing method references in calls is not a necessary software engineering practice.
No.
However, I think you are confused about what this feature is. The utterance of f.g in the above is not a method reference; it's not even an expression.
A method reference is an expression that looks like Foo::bar and can be converted to a functional interface type.

How to achieve a recursive deftype

I'm curious as to how to do a Clojure deftype that contains a reference to itself, e.g.
(deftype BinaryTree [^BinaryTree left ^BinaryTree right])
This doesn't work... however I see no intrinsic reason why it shouldn't be possible since the underlying Java class is perfectly capable of referring to itself.
What am I doing wrong here?
Mike.
Currently ^Class hints on fields (as opposed to ^primitive hints) are discarded, so there's no gain in trying to put them. This may change in the future.
However, self-reference in a type definition (e.g. in method bodies, not in fields) somewhat works, but the implementation is a bit of a hack. There's little incentive to fix self-reference in the current Java compiler given the promise of the rewrite of the compiler in Clojure.
