Document duck types with multiple methods in YARD - ruby

YARD allows me to specify types for method parameters and return values. As I really like to duck type it is nice to see that YARD also supports defining types by specifying methods they must support.
As you can see here, expressions like #first_method, #second_method are interpreted as a logical disjunctions. This means an object needs to support #first_method or #second_method or both. This is not what I need.
I would like to be able to specify that an object is required to support both #first_method and #second_method for my parameter. Is there a way to specify this?

There is no idiomatic syntax for specifying a compound duck-type interface. That said, note that all type specifications are really just free-form text; the YARD types parser link given in the question is simply a set of conventional or idiomatic syntaxes for specifying common interfaces. If you think of a smart way to specify these types (and it makes sense to your users), you are free to do so. Perhaps something like #first#second might work, or #first&#second.
My recommendation, however, would be to create a concrete type to wrap this ad-hoc interface. Even if you don't actually use the type in your runtime, it would serve as object in your domain to explain the interface to your users. For example, by creating a class Foo with the two methods #first and #second, and then documenting those two methods, you could now use Foo as your type. You could explain in the Foo class documentation that it is never actually used in code and just exists as an "interface", if that's the case. You could also make Foo a "module" to denote that it is more of an "interface" (or "protocol", if you prefer that language) than a concrete type.

Related

'to_a' is not a method in 'Range', yet, it works on 'Range'?

The following code is valid:
(1..5).to_a
(1..5) is a Range. The method to_a appears to convert a range to an array.
However, the documentation for Range does not document. Since this documentation is presumably auto-generated from the source with Yard, I doubt it could not be in the list of methods. Is there auto conversion going on?
How is the above legal Ruby?
The following code is valid ruby:
b = (1..5).to_a
(1..5) is a Range object, and b is an Array object. The official(?) documentation for the Class Range does not document the method to_a, which appears to convert a range to an array.
So, how is the above legal Ruby?
Ruby has something called "inheritance". Inheritance is a method for differential code-reuse that actually does not only exist in Ruby, but is in fact quite popular in many languages such as Java, C♯, C++, Python, PHP, Scala, Kotlin, Ceylon, and so on and so forth.
Inheritance allows you to define methods in one place, and then inherit them in another place, overriding and defining only the methods whose behavior differs. Hence, "differential code re-use".
In this particular case, the method you are looking at is Enumerable#to_a.
Note: Ruby actually has two forms of inheritance, mixin inheritance and class inheritance. Mixin inheritance is like class inheritance where the mixin doesn't know its superclass. (The definitive resource about mixin inheritance is Gilad Bracha's PhD Thesis The Programming Language Jigsaw –􏰀 Mixins,􏰁 Modularity, and Multiple Inheritance.)
The official(?) documentation
Actually, ruby-doc is a third-party site. There is no official documentation site. (However, the documentation on ruby-doc is generated from documentation comments in YARV, one of the major Ruby implementations, and one that Yukihiro Matsumoto actively contributes to.)
Since this documentation is presumably autogenerated from the source with Yard,
It is actually autogenerated from the YARV source with RDoc, not YARD.
I'm confused how it could not be in the list of methods,
Note that the explanation I gave is the correct one, but there could be many other explanations. The most simple one: the reason why the method is not in the documentation is that it is not documented. Another reason could be a bug in the documentation generator that prevents the documentation for that method being displayed, a typo in the source code that prevents the documentation generator from recognizing the documentation, and many, many others.
so is there auto conversion going on?
No. Ruby does not have automatic conversions in the language. There are some methods which for reasons of efficiency are implemented in a non-object-oriented manner, and require an object to be an instance of a specific class (as opposed to the object-oriented manner, which would only require the object to conform to a specific protocol). Because it is a severe restriction to require an object to be an instance of a specific class, those methods will usually offer an "escape hatch" to the programmer, by sending a specific message and allowing the object to respond with an object of the required class.
For example, all methods printing something to the console, require an instance of String, but they will try sending to_s first, before rejecting the argument.
These are sometimes called "implicit conversions", but they have nothing to do with implicit conversions as in Scala or implicit casts as in C♯. They are in fact not even part of the language, they are just a convention for library writers.

Using design pattern with modules

As I understand: in Ruby, we use modules instead of classes when we don't need a state; if it's just a function that takes an input an produces an output.
Some design patterns depend on inheritance, like the template design pattern.
I am making a crawling library that takes a link as an input and produces an object containing the data.
It don't need a state so a module seems to be suitable for me instead of a class. I also need to use the template design pattern as I am using different algorithms to produce data with the same structure. I need to use inheritance to implement template design pattern, but I don't need a state; it just returns the data from the website.
Is implementing a design pattern a good reason to use classes (for inheritance)? What is the best practice in such cases?
In Ruby, we use modules instead of classes when we don't need a state; if it's just a function that takes an input an produces an output.
That's one use of modules. Another use of modules is to provide what other languages would call "traits", sets of functionality which can be plugged into any class without inheritance. This can and often does include storing state.
Be sure to remain flexible, Ruby is a very flexible language.
It don't need a state so a module seems to be suitable for me instead of a class. I also need to use the template design pattern as I am using different algorithms to produce data with the same structure.
Design patterns provide a pattern for solving certain general problems, not a set of rules which must be strictly adhered to. Many were written with C++ or Java in mind, extremely class-centric languages, and require some adapting to other languages.
In your case, all that matters is the interface remains the same. Rather than creating a whole abstract class, all you need to do is define an interface. Some languages have formal interfaces, like Rust or Java. Ruby does not. Instead it uses duck-typing. Basically if it has the methods you need and says it works like you need, use it.
module DoThatThing
module ThisProvider
def link_to_object(link)
...code to convert a link to ThisProvider into an object...
return object
end
end
module ThatProvider
def link_to_object(link)
...code to convert a link to ThatProvider into an object...
return object
end
end
end
Now it doesn't matter if you have DoThatThing::ThisProvider or DoThatThing::ThatProvider, you use them the same.
obj = provider_module.link_to_object(link)
You can, if you really want to, implement formal abstract modules in Ruby, but in Ruby it's generally a wasted effort and very much goes against how one does things in Ruby (in other languages, absolutely create an interface).
Instead, enforce this with testing. RSpec shared examples, or a similar feature of your testing system, are a good way to do that. You write one set of tests which should work for any implementation of DoThatThing and run them for all implementations of DoThatThing.
RSpec.shared_examples 'DoThatThing' do
...shared tests for any DoThatThing...
end
RSpec.describe DoThatThing::ThisProvider do
include_examples "DoThatThing"
...tests specific to ThisProvider...
end
RSpec.describe DoThatThing::ThatProvider do
include_examples "DoThatThing"
...tests specific to ThatProvider...
end

How to Work with Ruby Duck Typing

I am learning Ruby and I'm having a major conceptual problem concerning typing. Allow me to detail why I don't understand with paradigm.
Say I am method chaining for concise code as you do in Ruby. I have to precisely know what the return type of each method call in the chain, otherwise I can't know what methods are available on the next link. Do I have to check the method documentation every time?? I'm running into this constantly running tutorial exercises. It seems I'm stuck with a process of reference, infer, run, fail, fix, repeat to get code running rather then knowing precisely what I'm working with during coding. This flies in the face of Ruby's promise of intuitiveness.
Say I am using a third party library, once again I need to know what types are allow to pass on the parameters otherwise I get a failure. I can look at the code but there may or may not be any comments or declaration of what type the method is expecting. I understand you code based on methods are available on an object, not the type. But then I have to be sure whatever I pass as a parameter has all the methods the library is expect, so I still have to do type checking. Do I have to hope and pray everything is documented properly on an interface so I know if I'm expected to give a string, a hash, a class, etc.
If I look at the source of a method I can get a list of methods being called and infer the type expected, but I have to perform analysis.
Ruby and duck typing: design by contract impossible?
The discussions in the preceding stackoverflow question don't really answer anything other than "there are processes you have to follow" and those processes don't seem to be standard, everyone has a different opinion on what process to follow, and the language has zero enforcement. Method Validation? Test-Driven Design? Documented API? Strict Method Naming Conventions? What's the standard and who dictates it? What do I follow? Would these guidelines solve this concern https://stackoverflow.com/questions/616037/ruby-coding-style-guidelines? Is there editors that help?
Conceptually I don't get the advantage either. You need to know what methods are needed for any method called, so regardless you are typing when you code anything. You just aren't informing the language or anyone else explicitly, unless you decide to document it. Then you are stuck doing all type checking at runtime instead of during coding. I've done PHP and Python programming and I don't understand it there either.
What am I missing or not understanding? Please help me understand this paradigm.
This is not a Ruby specific problem, it's the same for all dynamically typed languages.
Usually there are no guidelines for how to document this either (and most of the time not really possible). See for instance map in the ruby documentation
map { |item| block } → new_ary
map → Enumerator
What is item, block and new_ary here and how are they related? There's no way to tell unless you know the implementation or can infer it from the name of the function somehow. Specifying the type is also hard since new_ary depends on what block returns, which in turn depends on the type of item, which could be different for each element in the Array.
A lot of times you also stumble across documentation that says that an argument is of type Object, Which again tells you nothing since everything is an Object.
OCaml has a solution for this, it supports structural typing so a function that needs an object with a property foo that's a String will be inferred to be { foo : String } instead of a concrete type. But OCaml is still statically typed.
Worth noting is that this can be a problem in statically typed lanugages too. Scala has very generic methods on collections which leads to type signatures like ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Array[T], B, That]): That for appending two collections.
So most of the time, you will just have to learn this by heart in dynamically typed languages, and perhaps help improve the documentation of libraries you are using.
And this is why I prefer static typing ;)
Edit One thing that might make sense is to do what Scala also does. It doesn't actually show you that type signature for ++ by default, instead it shows ++[B](that: GenTraversableOnce[B]): Array[B] which is not as generic, but probably covers most of the use cases. So for Ruby's map it could have a monomorphic type signature like Array<a> -> (a -> b) -> Array<b>. It's only correct for the cases where the list only contains values of one type and the block only returns elements of one other type, but it's much easier to understand and gives a good overview of what the function does.
Yes, you seem to misunderstand the concept. It's not a replacement for static type checking. It's just different. For example, if you convert objects to json (for rendering them to client), you don't care about actual type of the object, as long as it has #to_json method. In Java, you'd have to create IJsonable interface. In ruby no overhead is needed.
As for knowing what to pass where and what returns what: memorize this or consult docs each time. We all do that.
Just another day, I've seen rails programmer with 6+ years of experience complain on twitter that he can't memorize order of parameters to alias_method: does new name go first or last?
This flies in the face of Ruby's promise of intuitiveness.
Not really. Maybe it's just badly written library. In core ruby everything is quite intuitive, I dare say.
Statically typed languages with their powerful IDEs have a small advantage here, because they can show you documentation right here, very quickly. This is still accessing documentation, though. Only quicker.
Consider that the design choices of strongly typed languages (C++,Java,C#,et al) enforce strict declarations of type passed to methods, and type returned by methods. This is because these languages were designed to validate that arguments are correct (and since these languages are compiled, this work can be done at compile time). But some questions can only be answered at run time, and C++ for example has the RTTI (Run Time Type Interpreter) to examine and enforce type guarantees. But as the developer, you are guided by syntax, semantics and the compiler to produce code that follows these type constraints.
Ruby gives you flexibility to take dynamic argument types, and return dynamic types. This freedom enables you to write more generic code (read Stepanov on the STL and generic programming), and gives you a rich set of introspection methods (is_a?, instance_of?, respond_to?, kind_of?, is_array?, et al) which you can use dynamically. Ruby enables you to write generic methods, but you can also explicity enforce design by contract, and process failure of contract by means chosen.
Yes, you will need to use care when chaining methods together, but learning Ruby is not just a few new keywords. Ruby supports multiple paradigms; you can write procedural, object oriend, generic, and functional programs. The cycle you are in right now will improve quickly as you learn about Ruby.
Perhaps your concern stems from a bias towards strongly typed languages (C++, Java, C#, et al). Duck typing is a different approach. You think differently. Duck typing means that if an object looks like a , behaves like a , then it is a . Everything (almost) is an Object in Ruby, so everything is polymorphic.
Consider templates (C++ has them, C# has them, Java is getting them, C has macros). You build an algorithm, and then have the compiler generate instances for your chosen types. You aren't doing design by contract with generics, but when you recognize their power, you write less code, and produce more.
Some of your other concerns,
third party libraries (gems) are not as hard to use as you fear
Documented API? See Rdoc and http://www.ruby-doc.org/
Rdoc documentation is (usually) provided for libraries
coding guidelines - look at the source for a couple of simple gems for starters
naming conventions - snake case and camel case are both popular
Suggestion - approach an online tutorial with an open mind, do the tutorial (http://rubymonk.com/learning/books/ is good), and you will have more focused questions.

Do you use nouns for classnames that represent callable objects?

There is a more generic question asked here. Consider this as an extension to that.
I understand that classes represent type of objects and we should use nouns as their names. But there are function objects supported in many language that acts more like functions than objects. Should I name those classes also as nouns, or verbs are ok in that case. doSomething(), semantically, makes more sense than Something().
Update / Conclusion
The two top voted answers I got on this shares a mixed opinion:
Attila
In the case of functors, however, they represent "action", so naming them with a verb (or some sort of noun-verb combination) is more appropriate -- just like you would name a function based on what it is doing.
Rook
The instance of a functor on the other hand is effectively a function, and would perhaps be better named accordingly. Following Jim's suggestion above, SomethingDoer doSomething; doSomething();
Both of them, after going through some code, seems to be the common practice. In GNU implementation of stl I found classes like negate, plus, minus (bits/stl_function.h) etc. And variate_generator, mersenne_twister (tr1/random.h). Similarly in boost I found classes like base_visitor, predecessor_recorder (graph/visitors.hpp) as well as inverse, inplace_erase (icl/functors.hpp)
Objects (and thus classes) usually represent "things", therefore you want to name them with nouns. In the case of functors, however, they represent "action", so naming them with a verb (or some sort of noun-verb combination) is more appropriate -- just like you would name a function based on what it is doing.
In general, you would want to name things (objects, functions, etc.) after their purpose, that is, what they represent in the program/the world.
You can think of functors as functions (thus "action") with state. Since the clean way of achieving this (having a state associated with your "action") is to create an object that stores the state, you get an object, which is really an "action" (a fancy function).
Note that the above applies to languages where you can make a pure functor, that is, an object where the invocation is the same as for a regular function (e.g. C++). In languages where this is not supported (that is, you have to have a method other than operator() called, like command.do()), you would want to name the command-like class/object a noun and name the method called a verb.
The type of the functor is a class, which you may as well name in your usual style (for instance, a noun) because it doesn't really do anything on its own. The instance of a functor on the other hand is effectively a function, and would perhaps be better named accordingly. Following Jim's suggestion above, SomethingDoer doSomething; doSomething();
A quick peek at my own code shows I've managed to avoid this issue entirely by using words which are both nouns and verbs... Filter and Show for example. Might be tricky to apply that naming scheme in all circumstances, however!
I suggest you give them a suffix like Function, Callback, Action, Command, etc. so that you will have classes called SortFunction, SearchAction, ExecuteCommand instead of Sort, Search, Execute which sound more like names of actual functions.
I think there are two ways to see this:
1) The class which could be named sensibly so that it can be invoked as a functor directly upon construction:
Doer()(); // constructed and invoked in one shot, compiler could optimize
2) The instance in cases where we want the functor to be invoked multiple times on a stateless functor object, or perhaps for stylistic reasons when #1's syntax is not preferable.
Doer _do;
_do(); // operator () invocation after construction
I like #Rook's suggestion above of naming the class with words that have the same verb and noun forms such as Search or Filter.

How to name factory like methods?

I guess that most factory-like methods start with create. But why are they called "create"? Why not "make", "produce", "build", "generate" or something else? Is it only a matter of taste? A convention? Or is there a special meaning in "create"?
createURI(...)
makeURI(...)
produceURI(...)
buildURI(...)
generateURI(...)
Which one would you choose in general and why?
Some random thoughts:
'Create' fits the feature better than most other words. The next best word I can think of off the top of my head is 'Construct'. In the past, 'Alloc' (allocate) might have been used in similar situations, reflecting the greater emphasis on blocks of data than objects in languages like C.
'Create' is a short, simple word that has a clear intuitive meaning. In most cases people probably just pick it as the first, most obvious word that comes to mind when they wish to create something. It's a common naming convention, and "object creation" is a common way of describing the process of... creating objects.
'Construct' is close, but it is usually used to describe a specific stage in the process of creating an object (allocate/new, construct, initialise...)
'Build' and 'Make' are common terms for processes relating to compiling code, so have different connotations to programmers, implying a process that comprises many steps and possibly a lot of disk activity. However, the idea of a Factory "building" something is a sensible idea - especially in cases where a complex data-structure is built, or many separate pieces of information are combined in some way.
'Generate' to me implies a calculation which is used to produce a value from an input, such as generating a hash code or a random number.
'Produce', 'Generate', 'Construct' are longer to type/read than 'Create'. Historically programmers have favoured short names to reduce typing/reading.
Joshua Bloch in "Effective Java" suggests the following naming conventions
valueOf — Returns an instance that has, loosely speaking, the same value
as its parameters. Such static factories are effectively
type-conversion methods.
of — A concise alternative to valueOf, popularized by EnumSet (Item 32).
getInstance — Returns an instance that is described by the parameters
but cannot be said to have the same value. In the case of a singleton,
getInstance takes no parameters and returns the sole instance.
newInstance — Like getInstance, except that newInstance guarantees that
each instance returned is distinct from all others.
getType — Like getInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
newType — Like newInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
Wanted to add a couple of points I don't see in other answers.
Although traditionally 'Factory' means 'creates objects', I like to think of it more broadly as 'returns me an object that behaves as I expect'. I shouldn't always have to know whether it's a brand new object, in fact I might not care. So in suitable cases you might avoid a 'Create...' name, even if that's how you're implementing it right now.
Guava is a good repository of factory naming ideas. It is popularising a nice DSL style. examples:
Lists.newArrayListWithCapacity(100);
ImmutableList.of("Hello", "World");
"Create" and "make" are short, reasonably evocative, and not tied to other patterns in naming that I can think of. I've also seen both quite frequently and suspect they may be "de facto standards". I'd choose one and use it consistently at least within a project. (Looking at my own current project, I seem to use "make". I hope I'm consistent...)
Avoid "build" because it fits better with the Builder pattern and avoid "produce" because it evokes Producer/Consumer.
To really continue the metaphor of the "Factory" name for the pattern, I'd be tempted by "manufacture", but that's too long a word.
I think it stems from “to create an object”. However, in English, the word “create” is associated with the notion “to cause to come into being, as something unique that would not naturally evolve or that is not made by ordinary processes,” and “to evolve from one's own thought or imagination, as a work of art or an invention.” So it seems as “create” is not the proper word to use. “Make,” on the other hand, means “to bring into existence by shaping or changing material, combining parts, etc.” For example, you don’t create a dress, you make a dress (object). So, in my opinion, “make” by meaning “to produce; cause to exist or happen; bring about” is a far better word for factory methods.
Partly convention, partly semantics.
Factory methods (signalled by the traditional create) should invoke appropriate constructors. If I saw buildURI, I would assume that it involved some computation, or assembly from parts (and I would not think there was a factory involved). The first thing that I thought when I saw generateURI is making something random, like a new personalized download link. They are not all the same, different words evoke different meanings; but most of them are not conventionalised.
I'd call it UriFactory.Create()
Where,
UriFactory is the name of the class type which is provides method(s) that create Uri instances.
and Create() method is overloaded for as many as variations you have in your specs.
public static class UriFactory
{
//Default Creator
public static UriType Create()
{
}
//An overload for Create()
public static UriType Create(someArgs)
{
}
}
I like new. To me
var foo = newFoo();
reads better than
var foo = createFoo();
Translated to english we have foo is a new foo or foo is create foo. While I'm not a grammer expert I'm pretty sure the latter is grammatically incorrect.
I'd point out that I've seen all of the verbs but produce in use in some library or other, so I wouldn't call create being an universal convention.
Now, create does sound better to me, evokes the precise meaning of the action.
So yes, it is a matter of (literary) taste.
Personally I like instantiate and instantiateWith, but that's just because of my Unity and Objective C experiences. Naming conventions inside the Unity engine seem to revolve around the word instantiate to create an instance via a factory method, and Objective C seems to like with to indicate what the parameter/s are. This only really works well if the method is in the class that is going to be instantiated though (and in languages that allow constructor overloading, this isn't so much of a 'thing').
Just plain old Objective C's initWith is also a good'un!

Resources