Question about object oriented design with Ruby - ruby

I'm thinking of writing a CLI Monopoly game in Ruby. This would be the first large project I've done in Ruby. Most of my experience in programming has been with functional programming languages like Clojure and Haskell and such. I understand Object Orientation pretty well, but I have no experience with designing object oriented programs.
Now, here's the deal.
In Monopoly, there are lots of spaces spread around the board. Most spaces are properties, and others do other things.
Would it be smart to have a class for each of the spaces? I was thinking of having a Space class that all other spaces inherit from, and having a Property class that inherits from Space, and then having a class for every property that inherits from Property. This would mean a lot of classes, which leads me to believe that this is an awful way to do what I'm trying to do.
What I also intended to do, was use the 'inherited' hook method to keep track of all the properties, so that I could search them and remove them from the unbought list when needed.
This sort of problem seems to arise in a lot of programs, so my question is: Is there a better way to do this, and am I missing something very crucial to Object Oriented design?
I'm sorry if this is a stupid question, I'm just clueless when it comes to OOPD.
Thanks.

You're on the right track, but you've gone too far, which is a common beginner's mistake with OOP. Every property should not be a separate class; they should all be instances of the Property class. I'd do classes with attributes along these lines:
Space
Name
Which pieces are on it
Which space is next (and maybe previous?)
Any special actions which take place when landing on it
Property (extends Space)
Who owns it
How many houses/hotels are on it
Property value
Property monopoly group
Rent rates
Whether it is mortgaged
So, for example, Boardwalk would be an object of type Property, with specific values that apply to it, such as belonging to the dark blue monopoly group.

Classes for spaces is probably not a bad idea; but let's see why that's the case.
If you were writing this in a procedural language, where would the majority of the 'if' and 'switch' statements be? They'd be in determining what happened to each player when they landed on a space. In OO, we want to prevent as many if/switch statements as possible, and replace them with polymorphism. So, clearly, if we create many derivatives of Space, we will prevent a large number of if/switch statements.
But before we make that leap, let's look at the metaphor of the game. The game is all about a city. Addresses, streets, utilities, public works, jails, etc. Each square on the board represents some kind of location in the city. Indeed the players travel through the city paying rent (or some other action) anywhere they happen to land.
So let's call the spaces, CityBlocks. Each CityBlock has an address, like "Boardwalk" or "Electric Company", or "Community Chest". Are there different types of CityBlock objects? Or should we simply consider CityBlocks to be locations that have a certain kind of ZoningOrdinance?
I kind of like that. |CityBlock|---->|ZoningOrdinance|
Now we could have several different derivatives of ZoningOrdinance. This is the "Strategy" Pattern. I rather like the flexibility this give us, and the separation of 'where' and 'what' that it provides.
You'll also need a Game class that understand the rules of dice, tokens, FreeParking, Passing Go, etc. That class will manipulate the players, and invoke the methods on CityBlock. CityBlock will invoke methods on ZoningOrdinance.
Anyway, that's one way...

Related

Is there such a thing as ‘class bloat’ - i.e. too many classes causing inefficiencies?

E.g. let’s consider I have the following classes:
Item
ItemProperty which would include objects such as Colour and Size. There's a relation-property of the Item class which lists all of the ItemProperty objects applicable to this Item (i.e. for one item you might need to specify the Colour and for another you might want to specify the Size).
ItemPropertyOption would include objects such as Red, Green (for Colour) and Big, Small (for Size).
Then an Item Object would relate to an ItemProperty, whereas an ItemChoice Object would relate to an ItemPropertyOption (and the ItemProperty which the ItemPropertyOption refers to could be inferred).
The reason for this is so I could then make use of queries much more effectively. i.e. give me all item-choices which are Red. It would also allow me to use the Parse Dashboard to quickly add elements to the site as I could easily specify more ItemPropertys and ItemPropertyOptions, rather than having to add them in the codebase.
This is just a small example and there's many more instances where I'd like to use classes so that 'options' for various drop-downs in forms are in the database and can easily be added and edited by me, rather than hard-coded.
1) I’ll probably be doing this in a similar way for 5+ more similar kinds of class-structures
2) there could be hundreds of nested properties that I want to access via ‘inverse querying’
So, I can think of 2 potential causes of inefficiency and wanted to know if they’re founded:
Is having lots of classes inefficient?
Is back-querying against nested classes inefficient?
The other option I can think of — if ‘class-bloat’ really is a problem — is to make fields on parent classes that, instead of being nested across other classes (that represent further properties, as above), just representing them as a nested JSON property directly.
The job of designing is to render in object descriptions truths about the world that are relevent to the system's requirements. In the world of the OP's "items", it's a fact that items have color, and it's a relevant fact because users care about an item's color. You'd only call a system inefficient if it consumes computing resources that it doesn't need to consume.
So, for something like a configurator, the fact that we have items, and that those items have properties, and those properties have an enumerable set of possible values sounds like a perfectly rational design.
Is it inefficient or "bloated"? The only place I'd raise doubt is in the explicit assertion that items have properties. Of course they do, but that's natively true of javascript objects and parse entities.
In other words, you might be able to get along with just item and several flavors of propertyOptions: e.g. Item has an attribute called "colorProperty" that is a pointer to an instance of "ColorProperty" (whose instances have a name property like 'red', 'green', etc. and maybe describe other pertinent facts, like a more precise description in RGB form).
There's nothing wrong with lots of classes if they represent relevant truth. Do that first. You might discover empirically that your design is too resource consumptive (I doubt you will in this case), at which point we'd start looking for cheats to be somehow skinnier. But do it the right way first, cheat later only if you must.
Is having lots of classes inefficient?
It's certainly inefficient for poor humans who have to remember what all those classes do and how they're related to each other. It takes time to write all those classes in the first place, and every line that you write is a line that has to be maintained.
Beyond that, there's certainly some cost for each class in any OOP language, and creating more classes than you really need will mean that you're paying more than you need to for the work that you're doing, which is pretty much the definition of inefficient.
I’ll probably be doing this in a similar way for 5+ more similar kinds of class-structures
Maybe you could spend some time thinking about the similarity between these cases and come up with a single set of more flexible classes that you can use in all those cases. Writing general code is harder than writing very specific code, but if you do a good job you'll recoup the extra effort many times over through reuse.

how to organize classes in ruby if they are literal subclasses

I know that title didn't make sense, Im sorry! Its hard to word what I am trying to ask. I had trouble googling it for the same reason.
So this isn't even Ruby specific, but I am working in ruby and I am new to it, so bear with me.
So you have a class that is a document. Inside each document, you have sentences, and each sentence has words. Words will have properties, like "noun" or a count of how many times they are used in the document, etc.
I would like each of the elements, document, sentence, word be an object.
Now, if you think literally - sentences are in documents, and words are in sentences.
Should this be organized literally like this as well? Like inside the document class you will define and instantiate the sentence objects, and inside the sentence class you will define and instantiate the words?
Or, should everything be separate and reference each other? Like the word class would sit outside the sentence class but the sentence class would be able to instantiate and work with words?
This is a basic OOP question I guess, and I suppose you could argue to do it either way. What do you guys think?
Each sentence in the document could be stored in a hash of sentence objects inside the document object, and each word in the sentence could be stored in a hash of word objects inside the sentence.
I dont want to code myself into a corner here, thats why I am asking, plus I have wondered this before in other situations.
Thank you!
Looks like you are mixing 2 concepts here: nesting objects and nesting classes. These are different things. Your objects, likely, need to be nested. However, your classes don't have to be nested. They might, but it's a matter of style (and some languages don't allow nested classes at all). Also, it depends on how you plan to use these classes. For example, in your case, do words have meaning as a separate concept? not just as a part of a sentence? If so, I would define Word as a separate class.
For languages that allow a class to be defined within the definition of another class, you could run into visibility problems where the internal class cannot be used by an outside class. In your case, if you think you might want to use one of these internal classes outside of the whole document-sentence-word system, then internal classes is not a good idea. In fact, I try to avoid using internal classes as a general rule, and only do it when the internal class is one that does not make sense in any other context except that of the containing class.
...so to answer your question: if it were me, I'd have them all as separate classes that reference each other.
You can create an anonymous class. You might even be able to control who has access to those anonymous classes by using access control on a method that creates / returns the anonymous class - I don't know. But I don't think it's something you really need.
Rather than totally prohibiting other parts of the code from using those classes, it'd probably be best to merely document that they're for internal use only. One way of indicating they're for internal use is to use namespaces.
If the class is named DocumentParser, then it sounds like a class everyone can use. However, if you have something named DocumentParser::Noun or DocumentParser::Sentence then they might realize that they may be using implementation details that shouldn't be relied upon.

How to name factory like methods?

I guess that most factory-like methods start with create. But why are they called "create"? Why not "make", "produce", "build", "generate" or something else? Is it only a matter of taste? A convention? Or is there a special meaning in "create"?
createURI(...)
makeURI(...)
produceURI(...)
buildURI(...)
generateURI(...)
Which one would you choose in general and why?
Some random thoughts:
'Create' fits the feature better than most other words. The next best word I can think of off the top of my head is 'Construct'. In the past, 'Alloc' (allocate) might have been used in similar situations, reflecting the greater emphasis on blocks of data than objects in languages like C.
'Create' is a short, simple word that has a clear intuitive meaning. In most cases people probably just pick it as the first, most obvious word that comes to mind when they wish to create something. It's a common naming convention, and "object creation" is a common way of describing the process of... creating objects.
'Construct' is close, but it is usually used to describe a specific stage in the process of creating an object (allocate/new, construct, initialise...)
'Build' and 'Make' are common terms for processes relating to compiling code, so have different connotations to programmers, implying a process that comprises many steps and possibly a lot of disk activity. However, the idea of a Factory "building" something is a sensible idea - especially in cases where a complex data-structure is built, or many separate pieces of information are combined in some way.
'Generate' to me implies a calculation which is used to produce a value from an input, such as generating a hash code or a random number.
'Produce', 'Generate', 'Construct' are longer to type/read than 'Create'. Historically programmers have favoured short names to reduce typing/reading.
Joshua Bloch in "Effective Java" suggests the following naming conventions
valueOf — Returns an instance that has, loosely speaking, the same value
as its parameters. Such static factories are effectively
type-conversion methods.
of — A concise alternative to valueOf, popularized by EnumSet (Item 32).
getInstance — Returns an instance that is described by the parameters
but cannot be said to have the same value. In the case of a singleton,
getInstance takes no parameters and returns the sole instance.
newInstance — Like getInstance, except that newInstance guarantees that
each instance returned is distinct from all others.
getType — Like getInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
newType — Like newInstance, but used when the factory method is in a
different class. Type indicates the type of object returned by the
factory method.
Wanted to add a couple of points I don't see in other answers.
Although traditionally 'Factory' means 'creates objects', I like to think of it more broadly as 'returns me an object that behaves as I expect'. I shouldn't always have to know whether it's a brand new object, in fact I might not care. So in suitable cases you might avoid a 'Create...' name, even if that's how you're implementing it right now.
Guava is a good repository of factory naming ideas. It is popularising a nice DSL style. examples:
Lists.newArrayListWithCapacity(100);
ImmutableList.of("Hello", "World");
"Create" and "make" are short, reasonably evocative, and not tied to other patterns in naming that I can think of. I've also seen both quite frequently and suspect they may be "de facto standards". I'd choose one and use it consistently at least within a project. (Looking at my own current project, I seem to use "make". I hope I'm consistent...)
Avoid "build" because it fits better with the Builder pattern and avoid "produce" because it evokes Producer/Consumer.
To really continue the metaphor of the "Factory" name for the pattern, I'd be tempted by "manufacture", but that's too long a word.
I think it stems from “to create an object”. However, in English, the word “create” is associated with the notion “to cause to come into being, as something unique that would not naturally evolve or that is not made by ordinary processes,” and “to evolve from one's own thought or imagination, as a work of art or an invention.” So it seems as “create” is not the proper word to use. “Make,” on the other hand, means “to bring into existence by shaping or changing material, combining parts, etc.” For example, you don’t create a dress, you make a dress (object). So, in my opinion, “make” by meaning “to produce; cause to exist or happen; bring about” is a far better word for factory methods.
Partly convention, partly semantics.
Factory methods (signalled by the traditional create) should invoke appropriate constructors. If I saw buildURI, I would assume that it involved some computation, or assembly from parts (and I would not think there was a factory involved). The first thing that I thought when I saw generateURI is making something random, like a new personalized download link. They are not all the same, different words evoke different meanings; but most of them are not conventionalised.
I'd call it UriFactory.Create()
Where,
UriFactory is the name of the class type which is provides method(s) that create Uri instances.
and Create() method is overloaded for as many as variations you have in your specs.
public static class UriFactory
{
//Default Creator
public static UriType Create()
{
}
//An overload for Create()
public static UriType Create(someArgs)
{
}
}
I like new. To me
var foo = newFoo();
reads better than
var foo = createFoo();
Translated to english we have foo is a new foo or foo is create foo. While I'm not a grammer expert I'm pretty sure the latter is grammatically incorrect.
I'd point out that I've seen all of the verbs but produce in use in some library or other, so I wouldn't call create being an universal convention.
Now, create does sound better to me, evokes the precise meaning of the action.
So yes, it is a matter of (literary) taste.
Personally I like instantiate and instantiateWith, but that's just because of my Unity and Objective C experiences. Naming conventions inside the Unity engine seem to revolve around the word instantiate to create an instance via a factory method, and Objective C seems to like with to indicate what the parameter/s are. This only really works well if the method is in the class that is going to be instantiated though (and in languages that allow constructor overloading, this isn't so much of a 'thing').
Just plain old Objective C's initWith is also a good'un!

Refactoring methods in existing code base with huge number of parameters

I have inherited an existing code base where the "features" are as follows:
huge monolithic classes with
(literally) 100's of member variables
and methods that go one for pages
(er. screens)
public and private methods with a large number of arguments.
I am trying to clean up and refactor the code, to leave it a little better
than how I found it. So my questions
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable ?
are there best practices on how long methods should be ? How long do you usually keep them?
are monolithic classes bad ?
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable ?
Yes, it is worth it. It is typically more important to refactor methods that are not "reasonable" than ones that already are nice, short, and have a small argument list.
Typically, if you have many arguments, it's because a method does too much - most likely, it should be a class of it's own, not a method.
That being said, in those cases when many parameters are required, it's best to encapsulate the parameters into a single class (ie: SpecificAlgorithmOptions), and pass one instance of that class. This way, you can provide clean defaults, and its very obvious which methods are essential vs. optional (based on what is required to construct the options class).
are there best practices on how long methods should be ? How long do you usually keep them?
A method should be as short as possible. It should have one purpose, and be used for one task, whenver possible. If it's possible to split it into separate methods, where each as a real, qualitative "task", then do so when refactoring.
are monolithic classes bad ?
Yes.
if the code is working and there is no need to touch it, i wouldn't refactor. i only refactor very problematic cases if i anyway have to touch them (either for extending them for functionality or bug-fixing). I favor the pragmatic way: Only (in 95%) touch, what you change.
Some first thoughts on your specific problem (though in detail it is difficult without knowing the code):
start to group instance variables, these groups will then be target to do 'extract class'
when having grouped these variables you hopefully can group some methods, which also be moved when doing 'extract class'
often there are many methods which aren't using any fields. make them static (they most likely are helper methods, which can be extracted to helper-classes.
in case non-related instance fields are mixed in many methods, do loads of 'extract method'
use automatic refactoring tools as much as possible, because you most likely have no tests in place and automation is more safe.
Regarding your other concrete questions.
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable?
definetely. 10 parameters are too many to grasp for us humans. most likely the method is doing too much.
are there best practices on how long methods should be ? How long do you usually keep them?
it depends... on preferences. i stated some things on this thread (though the question was PHP). still i would apply these numbers/metrics to any language.
are monolithic classes bad ?
it depends, what you mean with monolithic. if you mean many instance variables, endless methods, a lot of if/else complexity, yes.
also have a look at a real gem (to me a must have for every developer): working effectively with legacy code
Assuming the code is functioning I would suggest you think about these questions first:
is the code well documented?
do you understand the code?
how often are new features being added?
how often are bugs reported and fixed?
how difficult is it to modify and fix the code?
what is the expected life of the code?
how many versions of the compiler are you behind (if at all)?
is the OS it runs on expected to change during its lifetime?
If the system will be replaced in five years, is documented well, will undergo few changes, and bugs are easy to fix - leave it alone regardless of the size of the classes and the number of parameters. If you are determined to refactor make a list of your refactoring proposals in the order of maximum benefit with minimum changes and attack it incrementally.

How do you feel about code folding? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
For those of you in the Visual Studio environment, how do you feel about wrapping any of your code in #regions? (or if any other IDE has something similar...)
9 out of 10 times, code folding means that you have failed to use the SoC principle for what its worth.
I more or less feel the same thing about partial classes. If you have a piece of code you think is too big you need to chop it up in manageable (and reusable) parts, not hide or split it up.It will bite you the next time someone needs to change it, and cannot see the logic hidden in a 250 line monster of a method.
Whenever you can, pull some code out of the main class, and into a helper or factory class.
foreach (var item in Items)
{
//.. 100 lines of validation and data logic..
}
is not as readable as
foreach (var item in Items)
{
if (ValidatorClass.Validate(item))
RepositoryClass.Update(item);
}
My $0.02 anyways.
This was talked about on Coding Horror.
My personal belief is that is that they are useful, but like anything in excess can be too much.
I use it to order my code blocks into:
Enumerations
Declarations
Constructors
Methods
Event Handlers
Properties
Sometimes you might find yourself working on a team where #regions are encouraged or required. If you're like me and you can't stand messing around with folded code you can turn off outlining for C#:
Options -> Text Editor -> C# -> Advanced Tab
Uncheck "Enter outlining mode when files open"
I use #Region to hide ugly and useless automatically generated code, which really belongs in the automatically generated part of the partial class. But, when working with old projects or upgraded projects, you don't always have that luxury.
As for other types of folding, I fold Functions all the time. If you name the function well, you will never have to look inside unless you're testing something or (re-)writing it.
While I understand the problem that Jeff, et. al. have with regions, what I don't understand is why hitting CTRL+M,CTRL+L to expand all regions in a file is so difficult to deal with.
I use Textmate (Mac only) which has Code folding and I find it really useful for folding functions, I know what my "getGet" function does, I don't need it taking up 10 lines of oh so valuable screen space.
I never use it to hide a for loop, if statement or similar unless showing the code to someone else where I will hide code they have seen to avoid showing the same code twice.
I prefer partial classes as opposed to regions.
Extensive use of regions by others also give me the impression that someone, somewhere, is violating the Single Responsibility Principle and is trying to do too many things with one object.
#Tom
Partial classes are provided so that you can separate tool auto-generated code from any customisations you may need to make after the code gen has done its bit. This means your code stays intact after you re-run the codegen and doesn't get overwritten. This is a good thing.
I'm not a fan of partial classes - I try to develop my classes such that each class has a very clear, single issue for which it's responsible. To that end, I don't believe that something with a clear responsibility should be split across multiple files. That's why I don't like partial classes.
With that said, I'm on the fence about regions. For the most part, I don't use them; however, I work with code every day that includes regions - some people go really heavy on them (folding up private methods into a region and then each method folded into its own region), and some people go light on them (folding up enums, folding up attributes, etc). My general rule of thumb, as of now, is that I only put code in regions if (a) the data is likely to remain static or will not be touched very often (like enums), or (b) if there are methods that are implemented out of necessity because of subclassing or abstract method implementation, but, again, won't be touched very often.
Regions must never be used inside methods. They may be used to group methods but this must be handled with extreme caution so that the reader of the code does not go insane. There is no point in folding methods by their modifiers. But sometimes folding may increase readability. For e.g. grouping some methods that you use for working around some issues when using an external library and you won't want to visit too often may be helpful. But the coder must always seek for solutions like wrapping the library with appropriate classes in this particular example. When all else fails, use folding for improving readibility.
This is just one of those silly discussions that lead to nowhere. If you like regions, use them. If you don't, configure your editor to turn them off. There, everybody is happy.
I generally find that when dealing with code like Events in C# where there's about 10 lines of code that are actually just part of an event declaration (the EventArgs class the delegate declaration and the event declaration) Putting a region around them and then folding them out of the way makes it a little more readable.
Region folding would be fine if I didn't have to manually maintain region groupings based on features of my code that are intrinsic to the language. For example, the compiler already knows it's a constructor. The IDE's code model already knows it's a constructor. But if I want to see a view of the code where the constructors are grouped together, for some reason I have to restate the fact that these things are constructors, by physically placing them together and then putting a group around them. The same goes for any other way of slicing up a class/struct/interface. What if I change my mind and want to see the public/protected/private stuff separated out into groups first, and then grouped by member kind?
Using regions to mark out public properties (for example) is as bad as entering a redundant comment that adds nothing to what is already discernible from the code itself.
Anyway, to avoid having to use regions for that purpose, I wrote a free, open source Visual Studio 2008 IDE add-in called Ora. It provides a grouped view automatically, making it far less necessary to maintain physical grouping or to use regions. You may find it useful.
I think that it's a useful tool, when used properly. In many cases, I feel that methods and enumerations and other things that are often folded should be little black boxes. Unless you must look at them for some reason, their contents don't matter and should be as hidden as possible. However, I never fold private methods, comments, or inner classes. Methods and enums are really the only things I fold.
My approach is similar to a few others here, using regions to organize code blocks into constructors, properties, events, etc.
There's an excellent set of VS.NET macros by Roland Weigelt available from his blog entry, Better Keyboard Support for #region ... #endregion. I've been using these for years, mapping ctrl+. to collapse the current region and ctrl++ to expand it. Find that it works a lot better that the default VS.NET functionality which folds/unfolds everything.
I personally use #Regions all the time. I find that it helps me to keep things like properties, declarations, etc separated from each other.
This is probably a good answer, too!
Coding Horror
Edit: Dang, Pat beat me to this!
The Coding Horror article actual got me thinking about this as well.
Generally, I large classes I will put a region around the member variables, constants, and properties to reduce the amount of text I have to scroll through and leave everything else outside of a region. On forms I will generally group things into "member variables, constants, and properties", form functions, and event handlers. Once again, this is more so I don't have to scroll through a lot of text when I just want to review some event handlers.
I prefer #regions myself, but an old coworker couldn't stand to have things hidden. I understood his point once I worked on a page with 7 #regions, at least 3 of which had been auto-generated and had the same name, but in general I think they're a useful way of splitting things up and keeping everything less cluttered.
I really don't have a problem with using #region to organize code. Personally, I'll usually setup different regions for things like properties, event handlers, and public/private methods.
Eclipse does some of this in Java (or PHP with plugins) on its own. Allows you to fold functions and such. I tend to like it. If I know what a function does and I am not working on it, I dont need to look at it.
Emacs has a folding minor mode, but I only fire it up occasionally. Mostly when I'm working on some monstrosity inherited from another physicist who evidently had less instruction or took less care about his/her coding practices.
Using regions (or otherwise folding code) should have nothing to do with code smells (or hiding them) or any other idea of hiding code you don't want people to "easily" see.
Regions and code folding is really all about providing a way to easily group sections of code that can be collapsed/folded/hidden to minimize the amount of extraneous "noise" around what you are currently working on. If you set things up correctly (meaning actually name your regions something useful, like the name of the method contained) then you can collapse everything except for the function you are currently editing and still maintain some level of context without having to actually see the other code lines.
There probably should be some best practice type guidelines around these ideas, but I use regions extensively to provide a standard structure to my code files (I group events, class-wide fields, private properties/methods, public properties/methods). Each method or property also has a region, where the region name is the method/property name. If I have a bunch of overloaded methods, the region name is the full signature and then that entire group is wrapped in a region that is just the function name.
I personally hate regions. The only code that should be in regions in my opinion is generated code.
When I open file I always start with Ctrl+M+O. This folds to method level. When you have regions you see nothing but region names.
Before checking in I group methods/fields logically so that it looks ok after Ctrl+M+O.
If you need regions you have to much lines in your class. I also find that this is very common.
region ThisLooksLikeWellOrganizedCodeBecauseIUseRegions
// total garbage, no structure here
endregion
Enumerations
Properties
.ctors
Methods
Event Handlers
That's all I use regions for. I had no idea you could use them inside of methods.
Sounds like a terrible idea :)

Resources