If I have a data structure like this:
struct Point {
x: i32,
y: i32,
}
impl Point {
fn setX(&mut self, x: i32) -> &mut Point {
self.x = x;
self
}
}
Is it possible to iterate through Point and see both each member and the name of each member?
Is it also possible to go through the implementation and see what each function's name is?
Is it possible to do the above two tasks at runtime, without special implementations?
In fact, there is a way to (ab)use Encodable or Serialize traits to obtain reflection-like information about structure contents (not methods, though).
Encodable/Serialize are used primarily for writing a structure to some serialized representation, e.g. a JSON object. Their implementations can be automatically generated (e.g. with #[derive(RustcEncodable)] for Encodable) for any structure whose contents also implement the corresponding trait.
Implementations of these traits capture information about the structure and they pass it to an implementation of Encoder or Serializer. Implementors of the latter traits usually use this information (field names, types and values) to serialize objects but of course you can write your own implementation of Encoder/Serializer which will do with this information whatever you want. I'm not providing an example of such implementation here because they tend to be boilerplate-y, but you can find some through the links above.
The limitation is that you always need a value of a structure in order to get information about fields. You can't just get a list of fields of an arbitrary type, like e.g. Java reflection allows. I think it is possible to write an internally unsafe implementation of Encoder/Serializer and a function like fn type_info<T: Encodable>() -> TypeInfo which collects information about a type by creating an uninitialized piece of memory of the corresponding type and running its Encodable methods, but I'm not 100% sure about this.
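To make the mechanism concrete, here is a minimal std-only sketch of the same pattern. Inspect and FieldVisitor are hypothetical stand-ins for Serialize and Serializer (they are not the real API of either crate), and the impl for Point is written by hand where a derive macro would normally generate it.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for `Serializer`: receives each field's name and value.
trait FieldVisitor {
    fn visit_field(&mut self, name: &'static str, value: &dyn Debug);
}

// Hypothetical stand-in for `Serialize`: a type reports its own fields.
trait Inspect {
    fn inspect(&self, visitor: &mut dyn FieldVisitor);
}

struct Point {
    x: i32,
    y: i32,
}

// Written by hand here; a derive macro would normally generate this.
impl Inspect for Point {
    fn inspect(&self, visitor: &mut dyn FieldVisitor) {
        visitor.visit_field("x", &self.x);
        visitor.visit_field("y", &self.y);
    }
}

// A visitor that collects "name = value" strings instead of serializing.
struct Collector(Vec<String>);

impl FieldVisitor for Collector {
    fn visit_field(&mut self, name: &'static str, value: &dyn Debug) {
        self.0.push(format!("{name} = {value:?}"));
    }
}

// Runs the visitor over any inspectable value and returns what it saw.
fn describe(value: &dyn Inspect) -> Vec<String> {
    let mut collector = Collector(Vec::new());
    value.inspect(&mut collector);
    collector.0
}

fn main() {
    for line in describe(&Point { x: 1, y: 2 }) {
        println!("{line}");
    }
}
```

As with the real traits, this only works on a value: describe needs an actual Point to walk, which is the limitation described above.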
Rust does not really support this kind of reflection at runtime, no.
In theory, you might be able to write a syntax extension that would let you generate some code that would do something like this, maybe...
You can accomplish the first thing (iterating over the fields) with my crate fields-iter:
#[derive(fields_iter::FieldsInspect)]
struct Point {
x: i32,
y: i32,
}
let point = Point { x: 1, y: 2 };
for (name, value) in fields_iter::FieldsIter::new(&point) {
// Do something with it.
}
In the following example of passing a trait as a parameter, what's the need of sending impl in the function signature?
I understand that traits are more generic types and not concrete types, but since the Rust compiler doesn't allow sharing names across structs and traits, why is there a need to provide impl in the function signature to represent the type?
pub fn notify(item: impl Summary) {
println!("Breaking news! {}", item.summarize());
}
The documentation mentions that the above signature is just syntactic sugar for the below signature. Wouldn't it make sense to use trait Summary instead of impl Summary as impl can also be used to define methods on structs?
pub fn notify<T: Summary>(item: T) {
println!("Breaking news! {}", item.summarize());
}
Is there any hidden concept around it that I'm missing?
Contrary to languages such as Go or Java, Rust allows for both static and dynamic dispatch, and some syntax was required to let programmers choose between the two.
As dynamic dispatch must work on objects which might not be Sized, you need a reference or another pointer type to use it. That is, you would use &dyn Trait or Box<dyn Trait> (note: for historical reasons, the dyn keyword was optional in older editions; it is required as of the 2021 edition). In C++, dynamic dispatch also requires a reference or pointer.
Static dispatch is not something Go or Java have. In C++, it works with templates and duck-typing. In Rust, it works with generics and traits, and its original syntax was:
fn some_function<T: Trait>(foo: T) { … }
Later, the following syntax was added to the language:
fn some_function(foo: impl Trait) { … }
which is equivalent to the above.
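As a small sketch of that equivalence, the two functions below accept exactly the same arguments and are both statically dispatched. The Summary trait mirrors the one in the question; Tweet and the function names are made up for illustration.

```rust
trait Summary {
    fn summarize(&self) -> String;
}

// Hypothetical type used only for this example.
struct Tweet {
    text: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("tweet: {}", self.text)
    }
}

// Explicit generic parameter with a trait bound.
fn notify_generic<T: Summary>(item: &T) -> String {
    format!("Breaking news! {}", item.summarize())
}

// `impl Trait` in argument position: an anonymous generic parameter.
fn notify_impl(item: &impl Summary) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet { text: "hello".to_string() };
    println!("{}", notify_generic(&tweet));
    println!("{}", notify_impl(&tweet));
    // Only the named form accepts an explicit type argument (turbofish).
    println!("{}", notify_generic::<Tweet>(&tweet));
}
```

The one practical difference I know of is the last line: because the parameter in notify_impl has no name, the caller cannot spell it out with a turbofish.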
This syntax was originally invented to be used in return types, where there is no generic equivalent:
fn some_function() -> impl Trait { … }
This means that some_function can return any single type that implements Trait, but this type must be known at compile time. This has some performance benefits over returning Box<dyn Trait>, for example. In C++, the closest equivalent would be returning auto or decltype(auto).
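A minimal sketch of the difference, using Iterator as the trait (the function names are invented for the example): the first function returns one concrete but unnameable iterator type, while the second allocates a box and dispatches dynamically.

```rust
// Returns one concrete (unnameable) iterator type, fixed at compile time.
fn evens_static(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

// Returns a boxed trait object: a heap allocation plus dynamic dispatch.
fn evens_boxed(limit: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new((0..limit).filter(|n| n % 2 == 0))
}

fn main() {
    let a: Vec<u32> = evens_static(7).collect();
    let b: Vec<u32> = evens_boxed(7).collect();
    println!("{a:?} {b:?}");
}
```

Both produce the same values; only the way the caller reaches the iterator's methods differs.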
The syntax in parameter position was added for symmetry.
You might wonder why not simply make the generics implicit and have:
fn some_function(foo: Trait) { … }
But this would be slightly confusing. Trait by itself is not sized, and therefore cannot be used as a parameter type unless the function is implicitly made generic. This would make traits stand out in the realm of unsized types. For example, if (foo: Trait) worked, you might wonder why (foo: str) doesn't, but what would that one mean? There are also other problems with making generics implicit; for example, generic methods in a trait make the trait non-object-safe.
Later, Rust will likely extend those existential types and allow this at a module level:
type Foo = impl Bar;
(which is currently allowed on nightly, guarded by the type_alias_impl_trait feature)
Finally, you are asking why the syntax is impl Foo, rather than trait Foo. This reads well as "a type that implements Foo". The original RFC doesn't discuss alternative syntaxes much. Another RFC discusses the syntax more, in particular whether the syntax should have been any Foo in parameter position, and some Foo in return position. The syntax trait Foo was never considered as far as I am aware.
I have recently seen code using the dyn keyword:
fn foo(arg: &dyn Display) {}
fn bar() -> Box<dyn Display> {}
What does this syntax mean?
TL;DR: It's a syntax for specifying the type of a trait object and must be specified for clarity reasons.
Since Rust 1.0, traits have led a double life. Once a trait has been declared, it can be used either as a trait or as a type:
// As a trait
impl MyTrait for SomeType {}
// As a type!
impl MyTrait {}
impl AnotherTrait for MyTrait {}
As you can imagine, this double meaning can cause some confusion. Additionally, since the MyTrait type is an unsized / dynamically-sized type, this can expose people to very complex error messages.
To ameliorate this problem, RFC 2113 introduced the dyn syntax. This syntax is available starting in Rust 1.27:
use std::{fmt::Display, sync::Arc};
fn main() {
let display_ref: &dyn Display = &42;
let display_box: Box<dyn Display> = Box::new(42);
let display_arc: Arc<dyn Display> = Arc::new(42);
}
This new keyword parallels the impl Trait syntax and strives to make the type of a trait object more obviously distinct from the "bare" trait syntax.
dyn is short for "dynamic" and refers to the fact that trait objects perform dynamic dispatch. This means that the decision of exactly which function is called will occur at program run time. Contrast this to static dispatch which uses the impl Trait syntax.
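To illustrate the contrast, here is a short sketch with hypothetical function names. show_static is compiled once per concrete type it is called with, whereas show_dynamic is compiled once and finds the right fmt implementation through the trait object at run time.

```rust
use std::fmt::Display;

// Static dispatch: the compiler generates one copy per concrete type.
fn show_static(value: impl Display) -> String {
    format!("<{value}>")
}

// Dynamic dispatch: one copy; the method is looked up through a vtable at run time.
fn show_dynamic(value: &dyn Display) -> String {
    format!("<{value}>")
}

fn main() {
    println!("{}", show_static(42));
    println!("{}", show_static("text"));

    // The concrete types differ, yet every element has the type `&dyn Display`.
    let items: [&dyn Display; 2] = [&42, &"text"];
    for item in items {
        println!("{}", show_dynamic(item));
    }
}
```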
The syntax without dyn is deprecated in the 2015 and 2018 editions and is a hard error starting with the 2021 edition.
See also:
Why would I implement methods on a trait instead of as part of the trait?
What makes something a "trait object"?
TL;DR: "dyn" allows you to store a mix of Apples and Oranges in the same collection of Boxes, because they both implement the Fruit trait, and the Box uses that trait as its type (Box<dyn Fruit>) instead of a generic parameter. A generic parameter stands for exactly ONE type per instance, either Apple OR Orange, but not both:
Vec<Box<T>> --> Vector can hold boxes of Apples OR boxes of Oranges, never a mix
Vec<Box<dyn Fruit>> --> Vector can hold a mix of boxed Apples AND boxed Oranges
If you want to store multiple types in the same instance of a data structure, you have to use a trait object, written dyn Trait, behind a pointer such as Box. Method calls on it are then resolved at run time, each time they are made.
Sometimes, rather than using a concrete type (String, &str, i32, etc.) or a generic (T, Vec<T>, etc.), we use a trait as the type (e.g. dyn Fruit). This allows us to store multiple types, all implementing the required trait, in the same data-structure instance (you will need to put them behind a pointer such as Box<>).
"dyn" basically tells the compiler that we don't know at compile time which concrete type will stand in for the trait, and that it will be determined at run time. This allows the stored values to be a mixture of types that all implement the trait.
For generics, the compiler substitutes a concrete type for the generic parameter at compile time, based on how the data structure is used. Every value stored in that same data-structure instance must then have that same type.
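The Apple/Orange description above can be sketched as follows. Fruit, Apple, Orange and both helper functions are illustrative names, not part of any library.

```rust
trait Fruit {
    fn name(&self) -> String;
}

struct Apple;
struct Orange;

impl Fruit for Apple {
    fn name(&self) -> String {
        "apple".to_string()
    }
}

impl Fruit for Orange {
    fn name(&self) -> String {
        "orange".to_string()
    }
}

// Generic: every element of the slice has the same concrete type T.
fn names_generic<T: Fruit>(basket: &[Box<T>]) -> Vec<String> {
    basket.iter().map(|fruit| fruit.name()).collect()
}

// Trait objects: elements may have different concrete types.
fn names_dyn(basket: &[Box<dyn Fruit>]) -> Vec<String> {
    basket.iter().map(|fruit| fruit.name()).collect()
}

fn main() {
    // T is inferred as Apple; pushing a Box<Orange> here would not compile.
    let apples = vec![Box::new(Apple), Box::new(Apple)];
    println!("{:?}", names_generic(&apples));

    // The annotation makes each box a Box<dyn Fruit>, so mixing is allowed.
    let mixed: Vec<Box<dyn Fruit>> = vec![Box::new(Apple), Box::new(Orange)];
    println!("{:?}", names_dyn(&mixed));
}
```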
WARNING
This flexibility has a performance cost: each method call on a trait object goes through a vtable lookup and generally cannot be inlined, and every Box is a separate heap allocation.
I found this blog post to explain this feature really clearly: https://medium.com/digitalfrontiers/rust-dynamic-dispatching-deep-dive-236a5896e49b
Relevant excerpt:
struct Service<T:Backend>{
backend: Vec<T> // Either Vec<TypeA> or Vec<TypeB>, not both
}
...
let mut backends = Vec::new();
backends.push(TypeA);
backends.push(TypeB); // <---- Type error here
vs
struct Service{
backends: Vec<Box<dyn Backend>>
}
...
let mut backends = Vec::new();
backends.push( Box::new(PositiveBackend{}) as Box<dyn Backend>);
backends.push( Box::new(NegativeBackend{}) as Box<dyn Backend>);
The dyn keyword is used to indicate that a type is a trait object. According to the Rust docs:
A trait object is an opaque value of another type that implements a set of traits.
In other words, we do not know the specific type of the object at compile time, we just know that the object implements the trait.
Because the size of a trait object is unknown at compile time, it must be placed behind a pointer. For example, if Trait is your trait name, then you can use your trait objects in the following manner:
Box<dyn Trait>
&dyn Trait
and other pointer types
The variables/parameters which hold the trait objects are fat pointers, which consist of the following components:
a pointer to the object in memory
a pointer to that object’s vtable; a vtable is a table of pointers to the actual method implementation(s).
See my answer on What makes something a “trait object”? for further details.
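A quick way to observe the fat pointer is to compare pointer sizes. This sketch assumes the layout used by current compilers (data pointer plus vtable pointer); the helper names are made up for the example.

```rust
use std::fmt::Display;
use std::mem::size_of;

// Size of an ordinary (thin) reference to a sized type.
fn thin_pointer_size() -> usize {
    size_of::<&i32>()
}

// Size of a reference to a trait object: data pointer plus vtable pointer.
fn fat_pointer_size() -> usize {
    size_of::<&dyn Display>()
}

fn main() {
    println!("&i32:             {} bytes", thin_pointer_size());
    println!("&dyn Display:     {} bytes", fat_pointer_size());
    println!("Box<dyn Display>: {} bytes", size_of::<Box<dyn Display>>());
}
```

On current compilers the trait-object pointers come out twice as wide as the thin reference, for both &dyn Display and Box<dyn Display>.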
I have a matrix struct written in Go. That matrix struct has a bunch of methods. I want to be able to efficiently compute matrix operations but I also want to be able to send it over the wire in order to distribute the computation.
I currently have the matrix and its methods separate from the protobuf definition. When I need to send it over the wire I have to create a new pb.Matrix{} from the existing Matrix{} struct and then make my grpc call. That seems like a waste. So, is it a waste? And should I just be defining my matrix struct as a protobuf definition and then use embedding to define operations on it? Or is it better to keep them separate from each other?
In terms of architecture, I'd keep them separate. That would agree with the Single Responsibility Principle. In one of my projects we use this form:
type Foo struct { ... }
func NewFooFromProto(f *myproto.Foo) *Foo { ... }
func (f *Foo) ToProto() *myproto.Foo { ... }
Suppose that I have a type, type T int, and I want to define some logic that operates on this type.
Which abstraction should I use, and when?
Defining a method on that type:
func (T t) someLogic() {
// ...
}
Defining a function:
func somelogic(T t) {
// ...
}
Some situations where you tend to use methods:
Mutating the receiver: Things that modify fields of the objects are often methods. It's less surprising to your users that x.Foo() will modify x than that Foo(x) will.
Side effects through the receiver: Things are often methods on a type if they have side effects on/through the object in subtler ways, like writing to a network connection that's part of the struct, or writing via pointers or slices or so on in the struct.
Accessing private fields: In theory, anything within the same package can see unexported fields of an object, but more commonly, just the object's constructor and methods do. Having other things look at unexported fields is sort of like having C++ friends.
Necessary to satisfy an interface: Only methods can be part of interfaces, so you may need to make something a method to just satisfy an interface. For example, Peter Bourgon's Go intro defines type openWeatherMap as an empty struct with a method, rather than a function, just to satisfy the same weatherProvider interface as other implementations that aren't empty structs.
Test stubbing: As a special case of the above, sometimes interfaces help stub out objects for testing, so your stub implementations might have to be methods even if they have no state.
Some situations where you tend to use functions:
Constructors: func NewFoo(...) *Foo is a function, not a method. Go has no notion of a constructor, so that's how it has to be.
Running on interfaces or basic types: You can't add methods on interfaces or basic types (unless you use type to make them a new type). So, strings.Split and reflect.DeepEqual must be functions. Also, io.Copy has to be a function because it can't just define a method on Reader or Writer. Note that these don't declare a new type (e.g., strings.MyString) to get around the inability to do methods on basic types.
Moving functionality out of oversized types or packages: Sometimes a single type (think User or Page in some Web apps) accumulates a lot of functionality, and that hurts readability or organization or even causes structural problems (like if it becomes harder to avoid cyclic imports). Making a non-method out of a method that isn't mutating the receiver, accessing unexported fields, etc. might be a refactoring step towards moving its code "up" to a higher layer of the app or "over" to another type/package, or the standalone function is just the most natural long-term place for it. (Hat tip Steve Francia for including an example of this from hugo in a talk about his Go mistakes.)
Convenience "just use the defaults" functions: If your users might want a quick way to use "default" object values without explicitly creating an object, you can expose functions that do that, often with the same name as an object method. For instance, http.ListenAndServe() is a package-level function that makes a trivial http.Server and calls ListenAndServe on it.
Functions for passing behavior around: Sometimes you don't need to define a type and interface just to pass functionality around and a bare function is sufficient, as in http.HandleFunc() or template.Funcs() or for registering go vet checks and so on. Don't force it.
Functions if object-orientation would be forced: Say your main() or init() are cleaner if they call out to some helpers, or you have private functions that don't look at any object fields and never will. Again, don't feel like you have to force OO (à la type Application struct{...}) if, in your situation, you don't gain anything by it.
When in doubt, if something is part of your exported API and there's a natural choice of what type to attach it to, make it a method. However, don't warp your design (pulling concerns into your type or package that could be separate) just so something can be a method. Writers don't WriteJSON; it'd be hard to implement one if they did. Instead you have JSON functionality added to Writers via a function elsewhere, json.NewEncoder(w io.Writer).
If you're still unsure, first write so that the documentation reads clearly, then so that code reads naturally (o.Verb() or o.Attrib()), then go with what feels right without sweating over it too much, because often you can rearrange it later.
Use the method if you are manipulating internal secrets of your object
func (t *T) someLogic() {
t.mu.Lock()
...
}
Use the function if you are using the public interface of the object
func somelogic(t *T) {
t.DoThis()
t.DoThat()
}
If you want to change the T object, use
func (t *T) someLogic() {
// ...
}
If you don't change the T object and would like an object-oriented style, use
func (t T) someLogic() {
// ...
}
but remember that this creates a temporary copy of T on which someLogic is called.
If you like the way the C language does it, use
func somelogic(t T) {
t.DoThis()
t.DoThat()
}
or
func somelogic(t *T) {
t.DoThis()
t.DoThat()
}
One more thing: in Go, the type comes after the variable name.
For a given type Data , I would like to define a set of filters, each processing Data in a certain way. Some filters only need the data to be processed, other may need additional parameters.
type Data struct {
...
}
I want to be able to define a list of filters, and apply them sequentially to an instance of Data. To achieve this, I defined a Filter interface:
type Filter interface {
Apply (d *Data) error
}
To define a filter, all I have to do is create a new type and define the Apply method for it.
Now, let's say I have a filter that does not need any additional information. Is it good practice to define it as an empty struct?
type MySimpleFilter struct {}
func (f *MySimpleFilter) Apply(d *Data) error {
...
}
I'd argue this is good practice if you have no use for any fields, especially compared to using another type (e.g. type MySimpleFilter int), because an empty struct uses no space:
https://codereview.appspot.com/4634124
and it can still fulfill interface contracts (hence can be more useful than a functional approach in some cases).
This can also be a good idiom when using a map whose values you have no use for (e.g. map[string]struct{}). See this discussion for details:
https://groups.google.com/forum/?fromgroups=#!topic/golang-nuts/lb4xLHq7wug
This is a question that doesn't have a clear answer since it's a matter of taste. I'd say it is good practice because it makes MySimpleFilter symmetrical to the other filters, which makes it easier to understand the code.