How to name factory like methods? - coding-style

I guess that most factory-like methods start with create. But why are they called "create"? Why not "make", "produce", "build", "generate" or something else? Is it only a matter of taste? A convention? Or is there a special meaning in "create"?
createURI(...)
makeURI(...)
produceURI(...)
buildURI(...)
generateURI(...)
Which one would you choose in general and why?

Some random thoughts:
'Create' fits the feature better than most other words. The next best word I can think of off the top of my head is 'Construct'. In the past, 'Alloc' (allocate) might have been used in similar situations, reflecting the greater emphasis on blocks of data than objects in languages like C.
'Create' is a short, simple word that has a clear intuitive meaning. In most cases people probably just pick it as the first, most obvious word that comes to mind when they wish to create something. It's a common naming convention, and "object creation" is a common way of describing the process of... creating objects.
'Construct' is close, but it is usually used to describe a specific stage in the process of creating an object (allocate/new, construct, initialise...)
'Build' and 'Make' are common terms for processes relating to compiling code, so have different connotations to programmers, implying a process that comprises many steps and possibly a lot of disk activity. However, the idea of a Factory "building" something is a sensible idea - especially in cases where a complex data-structure is built, or many separate pieces of information are combined in some way.
'Generate' to me implies a calculation which is used to produce a value from an input, such as generating a hash code or a random number.
'Produce', 'Generate', 'Construct' are longer to type/read than 'Create'. Historically programmers have favoured short names to reduce typing/reading.

Joshua Bloch in "Effective Java" suggests the following naming conventions
valueOf — Returns an instance that has, loosely speaking, the same value
as its parameters. Such static factories are effectively
type-conversion methods.
of — A concise alternative to valueOf, popularized by EnumSet (Item 32).
getInstance — Returns an instance that is described by the parameters
but cannot be said to have the same value. In the case of a singleton,
getInstance takes no parameters and returns the sole instance.
newInstance — Like getInstance, except that newInstance guarantees that
each instance returned is distinct from all others.
getType — Like getInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
newType — Like newInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.

Wanted to add a couple of points I don't see in other answers.
Although traditionally 'Factory' means 'creates objects', I like to think of it more broadly as 'returns me an object that behaves as I expect'. I shouldn't always have to know whether it's a brand new object, in fact I might not care. So in suitable cases you might avoid a 'Create...' name, even if that's how you're implementing it right now.
Guava is a good repository of factory naming ideas. It is popularising a nice DSL style. examples:
Lists.newArrayListWithCapacity(100);
ImmutableList.of("Hello", "World");

"Create" and "make" are short, reasonably evocative, and not tied to other patterns in naming that I can think of. I've also seen both quite frequently and suspect they may be "de facto standards". I'd choose one and use it consistently at least within a project. (Looking at my own current project, I seem to use "make". I hope I'm consistent...)
Avoid "build" because it fits better with the Builder pattern and avoid "produce" because it evokes Producer/Consumer.
To really continue the metaphor of the "Factory" name for the pattern, I'd be tempted by "manufacture", but that's too long a word.

I think it stems from “to create an object”. However, in English, the word “create” is associated with the notion “to cause to come into being, as something unique that would not naturally evolve or that is not made by ordinary processes,” and “to evolve from one's own thought or imagination, as a work of art or an invention.” So it seems as “create” is not the proper word to use. “Make,” on the other hand, means “to bring into existence by shaping or changing material, combining parts, etc.” For example, you don’t create a dress, you make a dress (object). So, in my opinion, “make” by meaning “to produce; cause to exist or happen; bring about” is a far better word for factory methods.

Partly convention, partly semantics.
Factory methods (signalled by the traditional create) should invoke appropriate constructors. If I saw buildURI, I would assume that it involved some computation, or assembly from parts (and I would not think there was a factory involved). The first thing that I thought when I saw generateURI is making something random, like a new personalized download link. They are not all the same, different words evoke different meanings; but most of them are not conventionalised.

I'd call it UriFactory.Create()
Where,
UriFactory is the name of the class type which is provides method(s) that create Uri instances.
and Create() method is overloaded for as many as variations you have in your specs.
public static class UriFactory
{
//Default Creator
public static UriType Create()
{
}
//An overload for Create()
public static UriType Create(someArgs)
{
}
}

I like new. To me
var foo = newFoo();
reads better than
var foo = createFoo();
Translated to english we have foo is a new foo or foo is create foo. While I'm not a grammer expert I'm pretty sure the latter is grammatically incorrect.

I'd point out that I've seen all of the verbs but produce in use in some library or other, so I wouldn't call create being an universal convention.
Now, create does sound better to me, evokes the precise meaning of the action.
So yes, it is a matter of (literary) taste.

Personally I like instantiate and instantiateWith, but that's just because of my Unity and Objective C experiences. Naming conventions inside the Unity engine seem to revolve around the word instantiate to create an instance via a factory method, and Objective C seems to like with to indicate what the parameter/s are. This only really works well if the method is in the class that is going to be instantiated though (and in languages that allow constructor overloading, this isn't so much of a 'thing').
Just plain old Objective C's initWith is also a good'un!

Related

Why does rubocop or the ruby style guide prefer not to use get_ or set_?

I was running rubocop on my project and fixing the complaints it raised.
One particular complaint bothered me
Do not prefix reader method names with get_
I could not understand much from this complaint so I looked at source code in github.
I found this snippet
def bad_reader_name?(method_name, args)
method_name.start_with?('get_') && args.to_a.empty?
end
def bad_writer_name?(method_name, args)
method_name.start_with?('set_') && args.to_a.one?
end
So the advice or convention is as follows:
1) Actually they advice us not to use get_ when the method does not have arguments . otherwise they allow get_
2) And they advice us not to use set_ when the method has only one argument .otherwise they allow set_
What is the reason behind this convention or rule or advice?
I think the point here is ruby devs prefer to always think of methods as getters since they returns something and use the equals "syntactic sugar" (like in def self.dog=(params) which lets you do Class.dog = something). In essence the point I've always seen made is that the get and set are redundant and verbose.
In opposition to this you have get and set with multiple args which are like finder methods (particularly get; think of ActiveRecord's where).
Keep in mind that 'style guide' = pure opinion. Consistency is the higher goal of style guides in general so unless something is arguably wrong or difficult to read, your goal should be more on having everything the same than of a certain type. Which is why rubocop let's you turn this off.
Another way to see it: the getter/setter paradigm was, as far as I can tell, largely a specific convention in Java/C++ etc.; at least I knew quite a few Java codebases in the very foggy past where Beans were littered with huge amounts of get_-getters and set_-setters. In that time, the private attribute would likely be called "name" with "set_name()" and "get_name()"; as the attribute itself was called "name", the getter could not be "name()" as well.
Hence the very existence of "get_" and "set_" is due to a (trivial) technical shortcoming of languages that do not allow the "=" in method names.
In Ruby, we have quite a different array of possibilities:
First and foremost, we have name() and name=(), which immediately removes the need for the get_ and set_ syntax. So, we do have getters and setters, we just call them differently from Java.
Also, the attribute then is not name but #name, hence solving this conflict as well.
Actually, you don't have attributes with the plain "obj.name" syntax at all! For eaxmple; while Rails/ActiveRecord pretends that "person.name" is a "attribute", it is in fact simply a pair of auto-generated getter name() and setter name=(). Conceptionally, the caller is not supposed to care about what "name" is (attribute or method), it is an implementation detail of the class.
It saves 4 or 3 key presses for each call. This might seem like a joke, but writing concise yet easily comprehensible code is a trademark of Ruby, after all. :-)
The way I understand it is that it's because foo.get_value is imperative and foo.value is declarative.

Do you use nouns for classnames that represent callable objects?

There is a more generic question asked here. Consider this as an extension to that.
I understand that classes represent type of objects and we should use nouns as their names. But there are function objects supported in many language that acts more like functions than objects. Should I name those classes also as nouns, or verbs are ok in that case. doSomething(), semantically, makes more sense than Something().
Update / Conclusion
The two top voted answers I got on this shares a mixed opinion:
Attila
In the case of functors, however, they represent "action", so naming them with a verb (or some sort of noun-verb combination) is more appropriate -- just like you would name a function based on what it is doing.
Rook
The instance of a functor on the other hand is effectively a function, and would perhaps be better named accordingly. Following Jim's suggestion above, SomethingDoer doSomething; doSomething();
Both of them, after going through some code, seems to be the common practice. In GNU implementation of stl I found classes like negate, plus, minus (bits/stl_function.h) etc. And variate_generator, mersenne_twister (tr1/random.h). Similarly in boost I found classes like base_visitor, predecessor_recorder (graph/visitors.hpp) as well as inverse, inplace_erase (icl/functors.hpp)
Objects (and thus classes) usually represent "things", therefore you want to name them with nouns. In the case of functors, however, they represent "action", so naming them with a verb (or some sort of noun-verb combination) is more appropriate -- just like you would name a function based on what it is doing.
In general, you would want to name things (objects, functions, etc.) after their purpose, that is, what they represent in the program/the world.
You can think of functors as functions (thus "action") with state. Since the clean way of achieving this (having a state associated with your "action") is to create an object that stores the state, you get an object, which is really an "action" (a fancy function).
Note that the above applies to languages where you can make a pure functor, that is, an object where the invocation is the same as for a regular function (e.g. C++). In languages where this is not supported (that is, you have to have a method other than operator() called, like command.do()), you would want to name the command-like class/object a noun and name the method called a verb.
The type of the functor is a class, which you may as well name in your usual style (for instance, a noun) because it doesn't really do anything on its own. The instance of a functor on the other hand is effectively a function, and would perhaps be better named accordingly. Following Jim's suggestion above, SomethingDoer doSomething; doSomething();
A quick peek at my own code shows I've managed to avoid this issue entirely by using words which are both nouns and verbs... Filter and Show for example. Might be tricky to apply that naming scheme in all circumstances, however!
I suggest you give them a suffix like Function, Callback, Action, Command, etc. so that you will have classes called SortFunction, SearchAction, ExecuteCommand instead of Sort, Search, Execute which sound more like names of actual functions.
I think there are two ways to see this:
1) The class which could be named sensibly so that it can be invoked as a functor directly upon construction:
Doer()(); // constructed and invoked in one shot, compiler could optimize
2) The instance in cases where we want the functor to be invoked multiple times on a stateless functor object, or perhaps for stylistic reasons when #1's syntax is not preferable.
Doer _do;
_do(); // operator () invocation after construction
I like #Rook's suggestion above of naming the class with words that have the same verb and noun forms such as Search or Filter.

Naming convention for syntactic sugar methods

I'm build a library for generic reporting, Excel(using Spreadsheet), and most of the time I'll be writing things out on the last created worksheet (or active as I mostly refer to it).
So I'm wondering if there's a naming convention for methods that are mostly sugar to the normal/unsugared method.
For instance I saw a blog post, scroll down to Composite, a while ago where the author used the #method for the sugared, and #method! when unsugared/manual.
Could this be said to be a normal way of doing things, or just an odd implementation?
What I'm thinking of doing now is:
add_row(data)
add_row!(sheet, data)
This feels like a good fit to me, but is there a consensus on how these kinds of methods should be named?
Edit
I'm aware that the ! is used for "dangerous" methods and ? for query/boolean responses. Which is why I got curious whether the usage in Prawn (the blog post) could be said to be normal.
I think it's fair to say that your definitions:
add_row(data)
add_row!(sheet, data)
are going to confuse Ruby users. There is a good number of naming conventions is the Ruby community that are considered like a de-facto standard for naming. For example, the bang methods are meant to modify the receiver, see map and map!. Another convention is add the ? as a suffix to methods that returns a boolean. See all? or any? for a reference.
I used to see bang-methods as more dangerous version of a regular named method:
Array#reverse! that modifies array itself instead of returning new array with reversed order of elements.
ActiveRecord::Base#save! (from Rails) validates model and save it if it's valid. But unlike regular version that return true or false depending on whether the model was saved or not raises an exception if model is invalid.
I don't remember seeing bang-methods as sugared alternatives for regular methods. May be I'd give such methods their own distinct name other then just adding a bang to regular version name.
Why have two separate methods? You could for example make the sheet an optional parameter, for example
def add_row(sheet = active_sheet, data)
...
end
default values don't have to just be static values - in this case it's calling the active_sheet method. If my memory is correct prior to ruby 1.9 you'd have to swap the parameters as optional parameters couldn't be followed by non optional ones.
I'd agree with other answers that ! has rather different connotations to me.

Why isn't DRY considered a good thing for type declarations?

It seems like people who would never dare cut and paste code have no problem specifying the type of something over and over and over. Why isn't it emphasized as a good practice that type information should be declared once and only once so as to cause as little ripple effect as possible throughout the source code if the type of something is modified? For example, using pseudocode that borrows from C# and D:
MyClass<MyGenericArg> foo = new MyClass<MyGenericArg>(ctorArg);
void fun(MyClass<MyGenericArg> arg) {
gun(arg);
}
void gun(MyClass<MyGenericArg> arg) {
// do stuff.
}
Vs.
var foo = new MyClass<MyGenericArg>(ctorArg);
void fun(T)(T arg) {
gun(arg);
}
void gun(T)(T arg) {
// do stuff.
}
It seems like the second one is a lot less brittle if you change the name of MyClass, or change the type of MyGenericArg, or otherwise decide to change the type of foo.
I don't think you're going to find a lot of disagreement with your argument that the latter example is "better" for the programmer. A lot of language design features are there because they're better for the compiler implementer!
See Scala for one reification of your idea.
Other languages (such as the ML family) take type inference much further, and create a whole style of programming where the type is enormously important, much more so than in the C-like languages. (See The Little MLer for a gentle introduction.)
It isn't considered a bad thing at all. In fact, C# maintainers are already moving a bit towards reducing the tiring boilerplate with the var keyword, where
MyContainer<MyType> cont = new MyContainer<MyType>();
is exactly equivalent to
var cont = new MyContainer<MyType>();
Although you will see many people who will argue against var usage, which kind of shows that many people is not familiar with strong typed languages with type inference; type inference is mistaken for dynamic/soft typing.
Repetition may lead to more readable code, and sometimes may be required in the general case. I've always seen the focus of DRY being more about duplicating logic than repeating literal text. Technically, you can eliminate 'var' and 'void' from your bottom code as well. Not to mention you indicate scope with indentation, why repeat yourself with braces?
Repetition can also have practical benefits: parsing by a program is easier by keeping the 'void', for example.
(However, I still strongly agree with you on prefering "var name = new Type()" over "Type name = new Type()".)
It's a bad thing. This very topic was mentioned in Google's Go language Techtalk.
Albert Einstein said, "Everything should be made as simple as possible, but not one bit simpler."
Your complaint makes no sense in the case of a dynamically typed language, so you must intend this to refer to statically typed languages. In that case, your replacement example implicitly uses Generics (aka Template Classes), which means that any time that fun or gun is used, a new definition based upon the type of the argument. That could result in dozens of extra methods, regardless of the intent of the programmer. In particular, you're throwing away the benefit of compiler-checked type-safety for a runtime error.
If your goal was to simply pass through the argument without checking its type, then the correct type would be Object not T.
Type declarations are intended to make the programmer's life simpler, by catching errors at compile-time, instead of failing at runtime. If you have an overly complex type definition, then you probably don't understand your data. In your example, I would have suggested adding fun and gun to MyClass, instead of defining them separately. If fun and gun don't apply to all possible template types, then they should be defined in an explicit subclass, not as separate functions that take a templated class argument.
Generics exist as a way to wrap behavior around more specific objects. List, Queue, Stack, these are fine reasons for Generics, but at the end of the day, the only thing you should be doing with a bare Generic is creating an instance of it, and calling methods on it. If you really feel the need to do more than that with a Generic, then you probably need to embed your Generic class as an instance object in a wrapper class, one that defines the behaviors you need. You do this for the same reason that you embed primitives into a class: because by themselves, numbers and strings do not convey semantic information about their contents.
Example:
What semantic information does List convey? Just that you're working with multiple triples of integers. On the other hand, List, where a color has 3 integers (red, blue, green) with bounded values (0-255) conveys the intent that you're working with multiple Colors, but provides no hint as to whether the List is ordered, allows duplicates, or any other information about the Colors. Finally a Palette can add those semantics for you: a Palette has a name, contains multiple Colors, but no duplicates, and order isn't important.
This has gotten a bit far afield from the original question, but what it means to me is that DRY (Don't Repeat Yourself) means specifying information once, but that specification should be as precise as is necessary.

Do you use articles in your variable names?

Edit: There appears to be at least two valid reasons why Smalltalkers do this (readability during message chaining and scoping issues) but perhaps the question can remain open longer to address general usage.
Original: For reasons I've long forgotten, I never use articles in my variable names. For instance:
aPerson, theCar, anObject
I guess I feel like articles dirty up the names with meaningless information. When I'd see a coworker's code using this convention, my blood pressure would tick up oh-so-slightly.
Recently I've started learning Smalltalk, mostly because I want to learn the language that Martin Fowler, Kent Beck, and so many other greats grew up on and loved.
I noticed, however, that Smalltalkers appear to widely use indefinite articles (a, an) in their variable names. A good example would be in the following Setter method:
name: aName address: anAddress.
self name: aName.
self address: anAddress
This has caused me to reconsider my position. If a community as greatly respected and influential as Smalltalkers has widely adopted articles in variable naming, maybe there's a good reason for it.
Do you use it? Why or why not?
This naming convention is one of the patterns in Kent Beck's book Smalltalk Best Practice Patterns. IMHO this book is a must-have even for non-smalltalkers, as it really helps naming things and writing self-documenting code. Plus it's probably one of the few pattern langages to exhibit Alexander's quality without a name.
Another good book on code patterns is Smalltalk with Style, which is available as a free PDF.
Generally, the convention is that instance variables and accessors use the bare noun, and parameters use the indefinite article plus either a role or a type, or a combination. Temporary variables can use bare nouns because they rarely duplicate the instance variable; alternatively, it's quite frequent to name them with more precision than just an indefinite article, in order to indicate their role in the control flow: eachFoo, nextFoo, randomChild...
It is in common use in Smalltalk as a typeless language because it hints the type of an argument in method call. The article itself signals that you are dealing with an instance of some object of specified class.
But remember that in Smalltalk the methods look differently, we use so called keyword messages and it this case the articles actually help the readability:
anAddressBook add: aPerson fromTownNamed: aString
I think I just found an answer. As Konrad Rudolph said, they use this convention because of a technical reason:
...this means it [method variable] cannot duplicate the name of an instance variable, a temporary variable defined in the interface, or another temporary variable.
-IBM Smalltalk Tutorial
Basically a local method variable cannot be named the same as an object/class variable. Coming from Java, I assumed a method's variables would be locally scoped, and you'd access the instance variables using something like:
self address
I still need to learn more about the method/local scoping in Smalltalk, but it appears they have no other choice; they must use a different variable name than the instance one, so anAddress is probably the simplest approach. Using just address results in:
Name is already defined ->address
if you have an instance variable address defined already...
I always felt the articles dirtied up the names with meaningless information.
Exactly. And this is all the reason necessary to drop articles: they clutter the code needlessly and provide no extra information.
I don’t know Smalltalk and can't talk about the reasons for “their” conventions but everywhere else, the above holds. There might be a simple technical reason behind the Smalltalk convention (such as ALL_CAPS in Ruby, which is a constant not only by convention but because of the language semantics).
I wobble back and forth on using this. I think that it depends on the ratio of C++ to Objective C in my projects at any given time. As for the basis and reasoning, Smalltalk popularized the notion of objects being "things". I think that it was Yourdon and Coad that strongly pushed describing classes in the first person. In Python it would be something like the following snippet. I really wish that I could remember enough SmallTalk to put together a "proper" example.
class Rectangle:
"""I am a rectangle. In other words, I am a polygon
of four sides and 90 degree vertices."""
def __init__(self, aPoint, anotherPoint):
"""Call me to create a new rectangle with the opposite
vertices defined by aPoint and anotherPoint."""
self.myFirstCorner = aPoint
self.myOtherCorner = anotherPoint
Overall, it is a conversational approach to program readability. Using articles in variable names was just one portion of the entire idiom. There was also an idiom surrounding the naming of parameters and message selectors IIRC. Something like:
aRect <- [Rectangle createFromPoint: startPoint
toPoint: otherPoint]
It was just another passing fad that still pops up every so often. Lately I have been noticing that member names like myHostName are popping up in C++ code as an alternative to m_hostName. I'm becoming more enamored with this usage which I think hearkens back to SmallTalk's idioms a little.
Never used, maybe because in my main language there are not any articles :P
Anyway i think that as long as variable's name is meaningful it's not important if there are articles or not, it's up to the coder's own preference.
Nope. I feel it is waste of characters space and erodes the readability of your code. I might use variations of the noun, for example Person vs People depending on the context. For example
ArrayList People = new ArrayList();
Person newPerson = new Person();
People.add(newPerson);
No I do not. I don't feel like it adds anything to the readability or maintainability of my code base and it does not distinguish the variable for me in any way.
The other downside is if you encourage articles in variable names, it's just a matter of time before someone does this in your code base.
var person = new Person();
var aPerson = GetSomeOtherPerson();
Where I work, the standard is to prefix all instance fields with "the-", local variables with "my-" and method parameters with "a-". I believe this came about because many developers were using text editors like vi instead of IDE's that can display different colors per scope.
In Java, I'd have to say I prefer it over writing setters where you dereference this.
Compare
public void setName(String name) {
this.name = name;
}
versus
public void setName(String aName) {
theName = aName;
}
The most important thing is to have a standard and for everyone to adhere to it.

Resources