Why does rubocop or the ruby style guide prefer not to use get_ or set_? - ruby

I was running rubocop on my project and fixing the complaints it raised.
One particular complaint bothered me
Do not prefix reader method names with get_
I could not understand much from this complaint so I looked at source code in github.
I found this snippet
def bad_reader_name?(method_name, args)
method_name.start_with?('get_') && args.to_a.empty?
end
def bad_writer_name?(method_name, args)
method_name.start_with?('set_') && args.to_a.one?
end
So the advice or convention is as follows:
1) Actually they advice us not to use get_ when the method does not have arguments . otherwise they allow get_
2) And they advice us not to use set_ when the method has only one argument .otherwise they allow set_
What is the reason behind this convention or rule or advice?

I think the point here is ruby devs prefer to always think of methods as getters since they returns something and use the equals "syntactic sugar" (like in def self.dog=(params) which lets you do Class.dog = something). In essence the point I've always seen made is that the get and set are redundant and verbose.
In opposition to this you have get and set with multiple args which are like finder methods (particularly get; think of ActiveRecord's where).
Keep in mind that 'style guide' = pure opinion. Consistency is the higher goal of style guides in general so unless something is arguably wrong or difficult to read, your goal should be more on having everything the same than of a certain type. Which is why rubocop let's you turn this off.

Another way to see it: the getter/setter paradigm was, as far as I can tell, largely a specific convention in Java/C++ etc.; at least I knew quite a few Java codebases in the very foggy past where Beans were littered with huge amounts of get_-getters and set_-setters. In that time, the private attribute would likely be called "name" with "set_name()" and "get_name()"; as the attribute itself was called "name", the getter could not be "name()" as well.
Hence the very existence of "get_" and "set_" is due to a (trivial) technical shortcoming of languages that do not allow the "=" in method names.
In Ruby, we have quite a different array of possibilities:
First and foremost, we have name() and name=(), which immediately removes the need for the get_ and set_ syntax. So, we do have getters and setters, we just call them differently from Java.
Also, the attribute then is not name but #name, hence solving this conflict as well.
Actually, you don't have attributes with the plain "obj.name" syntax at all! For eaxmple; while Rails/ActiveRecord pretends that "person.name" is a "attribute", it is in fact simply a pair of auto-generated getter name() and setter name=(). Conceptionally, the caller is not supposed to care about what "name" is (attribute or method), it is an implementation detail of the class.
It saves 4 or 3 key presses for each call. This might seem like a joke, but writing concise yet easily comprehensible code is a trademark of Ruby, after all. :-)

The way I understand it is that it's because foo.get_value is imperative and foo.value is declarative.

Related

What is special about boolean?

In Ruby, there is a convention to have a method name end with a question mark to indicate that its return value is boolean. Why is boolean considered so special? Is there anything convenient if you know that a method's return value is particularly boolean? After all, in Ruby, you can insert all kinds of value returning (getter) methods into a conditional without caring whether it is boolean or not.
I think it is a waste to use the question mark just for indicating a boolean value. There should be more useful uses. I have plenty of use case where I want to have a pair of getter and setter methods, where the setter method should return self so that I can use it in a method chain. And naming them something like get_foo and set_foo looks cumbersome. Rather than following the convention, I am tempted to name a pair of getter and setter methods like this:
def foo?; #foo end
def foo v; #foo = v end
where the value of #foo is not (necessarily) boolean. (Besides potential criticism that breaking the convention will confuse other programmers), is there something wrong with doing that?
There is nothing special at all, it's just a convention. A question can be answered with "yes" or "no", but also with another stuff like someone's name.
By returning a boolean on methods with a question mark, it indicates it to be an explicit behavior.
If you make the answer be "yes" or "no", it's easy for the reader of your code to identify the behavior of your method without even looking at the implementation. On the other hand, if you make it return any other type, it is more difficult for the reader to understand your code without reading your class and method definition.
With a boolean there are only two possible answers. If the return value is not boolean it can be anything, which would not help at all. You would still need to look at the method implementation. You should always look further to understand some piece of code, but using this convention makes it simpler.
There is a convention to use question mark in method names to indicate that a method is a predicate. AFAIK, this predicate is not required (by the convention) to return a boolean value, thanks to simple rules for truthy/falsey values.
Besides potential criticism that breaking the convention will confuse other programmers, is there something wrong with doing that?
Confusing and surprising fellow programmers is bad. Ruby couldn't care less. It's just a convention. And conventions exist for a reason.
You can put anything in a flow control construct, but semantically booleans are appropriate. "If" in real human language typically takes a boolean, and the same is true of the construct in many programming languages. Ruby likes to make things convenient and assigns a "truthiness" value to everything in the language, which affects how it behaves in a boolean context.
In other words, booleans are the only things that are almost exclusively used for flow control, so the convention is to make them look "right" for flow-control constructs. It's their native environment.
(Besides potential criticism that breaking the convention will confuse other programmers), is there something wrong with doing that?
In the same sense that there is nothing wrong with naming all your variables after 1920s comedians, no, there's nothing wrong with that. But also in the same sense as naming all your variables after 1920s comedians, it isn't a very good idea. Nowhere in any language that I know of -- human or computer -- does the question mark mean "get." So the semantics of your code are off with that convention.
This question and the answers boil down to "POLS" AKA "Principle of Least Surprise".
A method name can be a random choice of letters and numbers separated by underscores, with '!', '?' and '=' sprinkled through them, if we chose to do so. They could be randomly created by the code at run time, and, as long as the rest of the code used the same arrangement of characters, the program would run and Ruby would be happy.
We humans, the programmers, determine the name of the methods used, to represent something, a characteristic or an action. Trying to use randomly named methods would lead to madness, or at least a very hard to maintain program. So, instead, we try to use sensible names for things. Sometimes they're verbs or adjectives, sometimes they're more descriptive because the method does several things.
As part of that naming, sometimes we want to provide additional hints about the behavior of the method. By convention in Ruby, we use "!" to warn the coder that the method changes something or is destructive. "=" indicates the method takes a parameter and assigns it to the receiver/object. It's a setter method and in many other languages it'd be idiomatic to use "set_flag..." or "set_value..." as the name. It's just a convention in that language, and followed by developers in the language.
We use "?" in Ruby to ask a question about an object, whether it is, or isn't, true about that object. We could say "is_true?" or "true?" and indicate we are testing whether something is true about it. If it's true, or false, it's a Boolean response so we return a true/false value.

Naming convention for syntactic sugar methods

I'm build a library for generic reporting, Excel(using Spreadsheet), and most of the time I'll be writing things out on the last created worksheet (or active as I mostly refer to it).
So I'm wondering if there's a naming convention for methods that are mostly sugar to the normal/unsugared method.
For instance I saw a blog post, scroll down to Composite, a while ago where the author used the #method for the sugared, and #method! when unsugared/manual.
Could this be said to be a normal way of doing things, or just an odd implementation?
What I'm thinking of doing now is:
add_row(data)
add_row!(sheet, data)
This feels like a good fit to me, but is there a consensus on how these kinds of methods should be named?
Edit
I'm aware that the ! is used for "dangerous" methods and ? for query/boolean responses. Which is why I got curious whether the usage in Prawn (the blog post) could be said to be normal.
I think it's fair to say that your definitions:
add_row(data)
add_row!(sheet, data)
are going to confuse Ruby users. There is a good number of naming conventions is the Ruby community that are considered like a de-facto standard for naming. For example, the bang methods are meant to modify the receiver, see map and map!. Another convention is add the ? as a suffix to methods that returns a boolean. See all? or any? for a reference.
I used to see bang-methods as more dangerous version of a regular named method:
Array#reverse! that modifies array itself instead of returning new array with reversed order of elements.
ActiveRecord::Base#save! (from Rails) validates model and save it if it's valid. But unlike regular version that return true or false depending on whether the model was saved or not raises an exception if model is invalid.
I don't remember seeing bang-methods as sugared alternatives for regular methods. May be I'd give such methods their own distinct name other then just adding a bang to regular version name.
Why have two separate methods? You could for example make the sheet an optional parameter, for example
def add_row(sheet = active_sheet, data)
...
end
default values don't have to just be static values - in this case it's calling the active_sheet method. If my memory is correct prior to ruby 1.9 you'd have to swap the parameters as optional parameters couldn't be followed by non optional ones.
I'd agree with other answers that ! has rather different connotations to me.

How to name factory like methods?

I guess that most factory-like methods start with create. But why are they called "create"? Why not "make", "produce", "build", "generate" or something else? Is it only a matter of taste? A convention? Or is there a special meaning in "create"?
createURI(...)
makeURI(...)
produceURI(...)
buildURI(...)
generateURI(...)
Which one would you choose in general and why?
Some random thoughts:
'Create' fits the feature better than most other words. The next best word I can think of off the top of my head is 'Construct'. In the past, 'Alloc' (allocate) might have been used in similar situations, reflecting the greater emphasis on blocks of data than objects in languages like C.
'Create' is a short, simple word that has a clear intuitive meaning. In most cases people probably just pick it as the first, most obvious word that comes to mind when they wish to create something. It's a common naming convention, and "object creation" is a common way of describing the process of... creating objects.
'Construct' is close, but it is usually used to describe a specific stage in the process of creating an object (allocate/new, construct, initialise...)
'Build' and 'Make' are common terms for processes relating to compiling code, so have different connotations to programmers, implying a process that comprises many steps and possibly a lot of disk activity. However, the idea of a Factory "building" something is a sensible idea - especially in cases where a complex data-structure is built, or many separate pieces of information are combined in some way.
'Generate' to me implies a calculation which is used to produce a value from an input, such as generating a hash code or a random number.
'Produce', 'Generate', 'Construct' are longer to type/read than 'Create'. Historically programmers have favoured short names to reduce typing/reading.
Joshua Bloch in "Effective Java" suggests the following naming conventions
valueOf — Returns an instance that has, loosely speaking, the same value
as its parameters. Such static factories are effectively
type-conversion methods.
of — A concise alternative to valueOf, popularized by EnumSet (Item 32).
getInstance — Returns an instance that is described by the parameters
but cannot be said to have the same value. In the case of a singleton,
getInstance takes no parameters and returns the sole instance.
newInstance — Like getInstance, except that newInstance guarantees that
each instance returned is distinct from all others.
getType — Like getInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
newType — Like newInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
Wanted to add a couple of points I don't see in other answers.
Although traditionally 'Factory' means 'creates objects', I like to think of it more broadly as 'returns me an object that behaves as I expect'. I shouldn't always have to know whether it's a brand new object, in fact I might not care. So in suitable cases you might avoid a 'Create...' name, even if that's how you're implementing it right now.
Guava is a good repository of factory naming ideas. It is popularising a nice DSL style. examples:
Lists.newArrayListWithCapacity(100);
ImmutableList.of("Hello", "World");
"Create" and "make" are short, reasonably evocative, and not tied to other patterns in naming that I can think of. I've also seen both quite frequently and suspect they may be "de facto standards". I'd choose one and use it consistently at least within a project. (Looking at my own current project, I seem to use "make". I hope I'm consistent...)
Avoid "build" because it fits better with the Builder pattern and avoid "produce" because it evokes Producer/Consumer.
To really continue the metaphor of the "Factory" name for the pattern, I'd be tempted by "manufacture", but that's too long a word.
I think it stems from “to create an object”. However, in English, the word “create” is associated with the notion “to cause to come into being, as something unique that would not naturally evolve or that is not made by ordinary processes,” and “to evolve from one's own thought or imagination, as a work of art or an invention.” So it seems as “create” is not the proper word to use. “Make,” on the other hand, means “to bring into existence by shaping or changing material, combining parts, etc.” For example, you don’t create a dress, you make a dress (object). So, in my opinion, “make” by meaning “to produce; cause to exist or happen; bring about” is a far better word for factory methods.
Partly convention, partly semantics.
Factory methods (signalled by the traditional create) should invoke appropriate constructors. If I saw buildURI, I would assume that it involved some computation, or assembly from parts (and I would not think there was a factory involved). The first thing that I thought when I saw generateURI is making something random, like a new personalized download link. They are not all the same, different words evoke different meanings; but most of them are not conventionalised.
I'd call it UriFactory.Create()
Where,
UriFactory is the name of the class type which is provides method(s) that create Uri instances.
and Create() method is overloaded for as many as variations you have in your specs.
public static class UriFactory
{
//Default Creator
public static UriType Create()
{
}
//An overload for Create()
public static UriType Create(someArgs)
{
}
}
I like new. To me
var foo = newFoo();
reads better than
var foo = createFoo();
Translated to english we have foo is a new foo or foo is create foo. While I'm not a grammer expert I'm pretty sure the latter is grammatically incorrect.
I'd point out that I've seen all of the verbs but produce in use in some library or other, so I wouldn't call create being an universal convention.
Now, create does sound better to me, evokes the precise meaning of the action.
So yes, it is a matter of (literary) taste.
Personally I like instantiate and instantiateWith, but that's just because of my Unity and Objective C experiences. Naming conventions inside the Unity engine seem to revolve around the word instantiate to create an instance via a factory method, and Objective C seems to like with to indicate what the parameter/s are. This only really works well if the method is in the class that is going to be instantiated though (and in languages that allow constructor overloading, this isn't so much of a 'thing').
Just plain old Objective C's initWith is also a good'un!

Do you use articles in your variable names?

Edit: There appears to be at least two valid reasons why Smalltalkers do this (readability during message chaining and scoping issues) but perhaps the question can remain open longer to address general usage.
Original: For reasons I've long forgotten, I never use articles in my variable names. For instance:
aPerson, theCar, anObject
I guess I feel like articles dirty up the names with meaningless information. When I'd see a coworker's code using this convention, my blood pressure would tick up oh-so-slightly.
Recently I've started learning Smalltalk, mostly because I want to learn the language that Martin Fowler, Kent Beck, and so many other greats grew up on and loved.
I noticed, however, that Smalltalkers appear to widely use indefinite articles (a, an) in their variable names. A good example would be in the following Setter method:
name: aName address: anAddress.
self name: aName.
self address: anAddress
This has caused me to reconsider my position. If a community as greatly respected and influential as Smalltalkers has widely adopted articles in variable naming, maybe there's a good reason for it.
Do you use it? Why or why not?
This naming convention is one of the patterns in Kent Beck's book Smalltalk Best Practice Patterns. IMHO this book is a must-have even for non-smalltalkers, as it really helps naming things and writing self-documenting code. Plus it's probably one of the few pattern langages to exhibit Alexander's quality without a name.
Another good book on code patterns is Smalltalk with Style, which is available as a free PDF.
Generally, the convention is that instance variables and accessors use the bare noun, and parameters use the indefinite article plus either a role or a type, or a combination. Temporary variables can use bare nouns because they rarely duplicate the instance variable; alternatively, it's quite frequent to name them with more precision than just an indefinite article, in order to indicate their role in the control flow: eachFoo, nextFoo, randomChild...
It is in common use in Smalltalk as a typeless language because it hints the type of an argument in method call. The article itself signals that you are dealing with an instance of some object of specified class.
But remember that in Smalltalk the methods look differently, we use so called keyword messages and it this case the articles actually help the readability:
anAddressBook add: aPerson fromTownNamed: aString
I think I just found an answer. As Konrad Rudolph said, they use this convention because of a technical reason:
...this means it [method variable] cannot duplicate the name of an instance variable, a temporary variable defined in the interface, or another temporary variable.
-IBM Smalltalk Tutorial
Basically a local method variable cannot be named the same as an object/class variable. Coming from Java, I assumed a method's variables would be locally scoped, and you'd access the instance variables using something like:
self address
I still need to learn more about the method/local scoping in Smalltalk, but it appears they have no other choice; they must use a different variable name than the instance one, so anAddress is probably the simplest approach. Using just address results in:
Name is already defined ->address
if you have an instance variable address defined already...
I always felt the articles dirtied up the names with meaningless information.
Exactly. And this is all the reason necessary to drop articles: they clutter the code needlessly and provide no extra information.
I don’t know Smalltalk and can't talk about the reasons for “their” conventions but everywhere else, the above holds. There might be a simple technical reason behind the Smalltalk convention (such as ALL_CAPS in Ruby, which is a constant not only by convention but because of the language semantics).
I wobble back and forth on using this. I think that it depends on the ratio of C++ to Objective C in my projects at any given time. As for the basis and reasoning, Smalltalk popularized the notion of objects being "things". I think that it was Yourdon and Coad that strongly pushed describing classes in the first person. In Python it would be something like the following snippet. I really wish that I could remember enough SmallTalk to put together a "proper" example.
class Rectangle:
"""I am a rectangle. In other words, I am a polygon
of four sides and 90 degree vertices."""
def __init__(self, aPoint, anotherPoint):
"""Call me to create a new rectangle with the opposite
vertices defined by aPoint and anotherPoint."""
self.myFirstCorner = aPoint
self.myOtherCorner = anotherPoint
Overall, it is a conversational approach to program readability. Using articles in variable names was just one portion of the entire idiom. There was also an idiom surrounding the naming of parameters and message selectors IIRC. Something like:
aRect <- [Rectangle createFromPoint: startPoint
toPoint: otherPoint]
It was just another passing fad that still pops up every so often. Lately I have been noticing that member names like myHostName are popping up in C++ code as an alternative to m_hostName. I'm becoming more enamored with this usage which I think hearkens back to SmallTalk's idioms a little.
Never used, maybe because in my main language there are not any articles :P
Anyway i think that as long as variable's name is meaningful it's not important if there are articles or not, it's up to the coder's own preference.
Nope. I feel it is waste of characters space and erodes the readability of your code. I might use variations of the noun, for example Person vs People depending on the context. For example
ArrayList People = new ArrayList();
Person newPerson = new Person();
People.add(newPerson);
No I do not. I don't feel like it adds anything to the readability or maintainability of my code base and it does not distinguish the variable for me in any way.
The other downside is if you encourage articles in variable names, it's just a matter of time before someone does this in your code base.
var person = new Person();
var aPerson = GetSomeOtherPerson();
Where I work, the standard is to prefix all instance fields with "the-", local variables with "my-" and method parameters with "a-". I believe this came about because many developers were using text editors like vi instead of IDE's that can display different colors per scope.
In Java, I'd have to say I prefer it over writing setters where you dereference this.
Compare
public void setName(String name) {
this.name = name;
}
versus
public void setName(String aName) {
theName = aName;
}
The most important thing is to have a standard and for everyone to adhere to it.

What does 'Monkey Patching' exactly Mean in Ruby?

According to Wikipedia, a monkey patch is:
a way to extend or modify the runtime
code of dynamic languages [...]
without altering the original source
code.
The following statement from the same entry confused me:
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
I would like to know the exact meaning of monkey patching in Ruby. Is it doing something like the following, or is it something else?
class String
def foo
"foo"
end
end
The best explanation I heard for Monkey patching/Duck-punching is by Patrick Ewing in RailsConf 2007
...if it walks like a duck and talks like a duck, it’s a duck, right? So
if this duck is not giving you the noise that you want, you’ve got to
just punch that duck until it returns what you expect.
The short answer is that there is no "exact" meaning, because it's a novel term, and different folks use it differently. That much at least can be discerned from the Wikipedia article. There are some who insist that it only applies to "runtime" code (built-in classes, I suppose) while some would use it to refer to the run-time modification of any class.
Personally, I prefer the more inclusive definition. After all, if we were to use the term for modification of built-in classes only, how would we refer to the run-time modification of all the other classes? The important thing to me is that there's a difference between the source code and the actual running class.
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
The above statement asserts that the Ruby usage is incorrect - but terms evolve, and that's not always a bad thing.
Monkey patching is when you replace methods of a class at runtime (not adding new methods as others have described).
In addition to being a very un-obvious and difficult to debug way to change code, it doesn't scale; as more and more modules start monkey patching methods, the likelihood of the changes stomping each other grow.
You are correct; it's when you modify or extend an existing class rather than subclass it.
This is monkey patching:
class Float
def self.times(&block)
self.to_i.times { |i| yield(i) }
remainder = self - self.to_i
yield(remainder) if remainder > 0.0
end
end
Now I imagine this might be useful sometimes, but imagine if you saw routine.
def my_method(my_special_number)
sum = 0
my_special_number.times { |num| sum << some_val ** num }
sum
end
And it breaks only occasionally when it gets called. To those paying attention you already know why, but imagine that you didn't know about the float type having a .times class-method and you automatically assumed that my_special_number is an integer. Every time the parameter is a whole number, integer or float, it would work fine (whole ints are passed back except when there is a floating-point remainder). But pass a number with anything in the decimal area in and it'll break for sure!
Just imagine how often this might happen with your gems, Rails plugins, and even by your own co-workers in your projects. If there's one or two little methods in there like this and it could take some time to find and correct.
If you wonder why it breaks, note that sum is an integer and a floating-point remainder could be passed back; in addition, the exponential sign only works when types are the same. So you might think it's fixed, because you converted bother numbers to floats ... only to find that the sum can't take the floating-point result.
In Python monkeypatching is referred to a lot as a sign of embarrassment: "I had to monkeypatch this class because..." (I encountered it first when dealing with Zope, which the article mentions). It's used to say that it was necessary to take hold of an upstream class and fix it at runtime instead of lobbying to have the unwanted behaviors fixed in the actual class or fixing them in a subclass. In my experience Ruby people don't talk about monkeypatching that much, because it's not considered especially bad or even noteworthy (hence "duck punching"). Obviously you have to be careful about changing the return values of a method that will be used in other dependencies, but adding methods to a class the way that active_support and facets do is perfectly safe.
Update 10 years later: I would amend the last sentence to say "is relatively safe". Extending a core library class with new methods can lead to problems if somebody else gets the same idea and adds the same method with a different implementation or method signature, or if people confuse extended methods for core language functionality. Both cases often happen in Ruby (especially regarding active_support methods).
Explanation of the concept without code:
It means you can "dynamically" modify code. Wanna add a method "dynamically" to a particular class known only at "runtime"? No problem. It's powerful, yes: but can be misused. The concept "dynamically" might be a little too esoteric to understand, so I have prepared an example below (no code, I promise):
How to monkey patch a car:
Normal Car Operations
How do you normally start a car? It’s simple: you turn the ignition, the car starts!
Great, but how can we "monkey patch" the car class?
This is what Fabrizzio did to poor Michael Corleone. Normally, if you want to change how a car operates, you would have to make those changes in the car manufacturing plant (i.e. at "compile" time, within the Car class ^**). Fabrizzio ain't got no time for that: he monkey patches cars by getting under the bonnet to surreptitiously and sneakily rewire things. In other words, he re-opens the Car class, makes the changes he wants, and he's done: he's just monkey patched a car. he done this "dynamically".
You have to really know what you are doing when you monkey patch otherwise the results could be quite explosive.
“Fabrizzio, where are you going?”
Boom!
Like Confucius Say:
"Keep your source code close, but your monkey patches closer."
It can be dangerous.
^** yes i know, dynamic languages.
Usually it is meant about ad-hoc changes, using Ruby open classes, frequently with low quality code.
Here's a good follow-up on the subject.

Resources