I am getting rubocop error 'Class definition is too long. [236/100]'.
My class looks like below:
class SomeClassName
include HelperModule
attr_accessor :aaa, :bbb, :ccc
.... methods .....
end
What might go wrong?
The rubocop docs ClassLength says "length of a class exceeds some maximum value".
What does it mean?
Yes, this is because the overall lines are considered to be too many by rubucop. I agree classes shouldn't get too long but think that should ultimately be determined by: does the class have a single responsibility, are the methods concise enough, are there methods that could be shared via module, etc... The number/alert is a great warning though. If things in the class look OK, you can add # rubocop:disable ClassLength immediately above the class definition.
This likely means that your class definition takes more than 100 lines of code.
(There's lots of good info here already, but I came to this answer looking for the syntax for specifying the max lines per class in Rubocop, and I figure others might come here for that as well.)
In .rubocop.yml
# Allow classes longer than 100 lines of code
ClassLength:
Max: 250 # or whatever ends up being appropriate
General answer
Do I need to disable this cop if I want my class to be longer than 100 here or need to configure the maximum length. What would you suggest?
I use rubocop with this workflow, assuming I encounter a pre-existing code base with loads of warnings:
Run rubocop --auto-gen-config to create a "TODO" file. Include that file in your primary rubocop configuration file. See their documentation for details. In your example, it would generate a configuration that allows long classes (at least 236 lines, or more if you have larger classes).
Now, if you run rubocop, it will ignore all offences that you put in your TODO file - i.e., everything will looke fine now. Only if you introduce more errors/warnings (like a class with 237 lines) will it fire up again. So, at this point, rubocop will not do anything for you except prevent you to make it worse.
Once in a while, when I got some time to kill, I pick a rule from the TODO file and work on it. There are three possibilities:
Remove the rule from the TODO file. This will restore the original behaviour of rubocop.
Relax the number associated with it. Say, the class length is limited to 250 right now; I would like to set it to (say) 100 lines, but I know that I do not have time to refactor many classes right now. So I set it to 240. This will trigger all classes that are between 240 and 250 lines; likely only a few that I can handle easily. I fix them and move on. Another day, I maybe return to it and go from 240 to 230 etc.
Sometimes I decide that I do not bother about a particular warning. Then I move the configuration from the TODO file to the proper .rubocop file, allowing it permanently.
So, there are no hard and fast rules about all of this. You are supposed to find your own values. Some things that rubocop objects to are totally fine with me since they are more down to coding style than correctness or whatever.
Specific answer
Do I need to disable this cop if I want my class to be longer than 100 here or need to configure the maximum length. What would you suggest?
I most certainly do have a maximum amount of lines configured for my class files (as well as for methods). The length of a unit of code (be it a class or a method) is a very simple yet effective "code smell" that points to grown code and candidates for refactoring/splitting.
I have picked some numbers for me, and I stick with them. I.e., I don't constantly move them up or down to suit the particular code around, but if a piece of code grows "1 line too large", I take action. Most times my target is to split it roughly in half, which will, in the long run, lead to the minimum required effort.
If a class is very long, it usually breaks the "one responsibility per class" rule. It is usually beneficial to break it down into parts. Not just random subclasses, but actully OO-justifiable, patterned, constructs.
If a method is very long, it can sometimes point to OO measures to be taken (i.e., classes introduced or the method being split over existing classes; especially if the method consists of large if/else constructs or especially case statements), but more often it calls for simple old-fashioned refactoring into smaller (likely private) methods.
Have fun finding your favourite settings in rubocop, it's awesome.
Related
I am working on a huge project and we have decided to make all code conform to 80 characters per line. It is a ruby based project with wrappers for C. For Ruby, I decided to use Rubocop and with the command:
rubocop --only LineLength
I got 1,714 errors, where length of the line was greater than 80 characters. Aside from that, there were many other errors detected by Rubocop which I want to ignore for now.
I am looking for the easiest way to auto-correct all the line length violations only, to satisfy the 80 character limit both in C and Ruby.
Please don't change line length automatically.
Linelength is a metric, not a style. Styles can often be exchanged, like double vs. single quotes, using hashrockets vs. the new hash syntax introduced in ruby 2 etc. Choosing a style is usually a matter of taste and has few (if any) impact on the program itself. That's why there are auto corrections for styles: Changing them does not change semantics.
Metrics include linelength, classlength and ABC-Size, among others. Checking metrics using static code analysis is something completely different than checking styles. Setting a maximum to linelength for instance is not a matter of taste, you'd rather use it to enforce a certain style of programming. A programmer would have to avoid long variable names and deep nesting to keep the linelength under the limit.
Metrics can also indicate problems in the code. For example, too high ABC-size indicates a method might be doing too much.
Aside from that, it would be very difficult, if not impossible, to shorten all lines of code automatically, since ruby is a very complex language.
Instead of reducing linelength automatically, here's some alternatives:
see if you can reduce the number of violations by enabling the AllowHeredoc and AllowURI options. Read about them here https://rubocop.readthedocs.io/en/latest/cops_metrics/#metricslinelength.
run rubocop --only LineLength --auto-gen-config and use rubocops configuration to stop checking linelength.
ask yourself: What value do I gain by reducing linelength?
don't think of too long lines as style violation, but rather as a possible indicator of an underlying problem. Try to find that problem, and solve it.
I've discovered that I can split a class into multiple files by calling class <sameclassname>; <code> ;end from within each file. I've decided to divide up a very large class this way. The advantages I see:
I can have separate spec files called by guard to reduce spec time.
Forces me to organize and compartmentalize my code
Are there any pitfalls to this method? I can't find any information about people doing it.
I often do this in my gem files to manage documentation and avoid long files.
The only issues I found (which are solvable) are:
Initialization of class or module data should be thoughtfully managed. As each 'compartment' (file) is updated, the initialization data might need updating, but that data is often in a different file and we are (after all) fallible.
In my GReactor project (edit: deprecated), I wrote a different initialization method for each section and called all of them in the main initialization method.
Since each 'compartment' of the class or module is in a different file, it is easy to forget that they all share the same namespace, so more care should be taken when naming variables and methods.
The code in each file is executed in the order of the files being loaded (much like it would be if you were writing one long file)... but since you 'close' the class/module between each file, than your method declaration order might be important. Care should be taken when requiring the files, so that the desired order of the code execution is preserved.
The GReactor is a good example for managing a Mega-Module with a large API by compartmentalizing the different aspects of the module in different files.
There are no other pitfalls or issues that I have experienced.
Defining / reopening the same class in many different files makes it harder to locate the source of any given method, since there's no one clear place for it.
This also opens up the possibility of nasty loading sequence bugs, eg. file A is trying to call a method in file B, but file B has not loaded yet.
Having a very large class is a sign that the class is trying to do too much, and should be split up into smaller modules/subclasses. Sandi Metz's POODR recommends limiting classes to under 100 lines, among other guidelines.
In Ruby classes are never closed. What you call "splitting" is actually just reopening the class. You can reopen classes and add methods to them at any time. If you define a class in file A and include it in file B, even if you reopen the class in file B it'll still contain all the code from file A. I personally prefer only to reopen a class when I have to. It sounds like in your case, I would define my class in one file. I think this method is better organized and has a lower risk of interfering with previously defined methods. More on the subject at rubylearning.
Here's a good collection of Ruby design patters, or actually design pattern examples in Ruby: https://github.com/nslocum/design-patterns-in-ruby
Take a look at decorator as a good way to achieve modularity without a rigid parent<->child tree of classes.
The only pitfall is that your class is split in multiple files, that you need to menage. User of your class would only need to require the second file, so if your class is part of gem or some package, they probably wouldn't even be aware that it was ever reopened.
I'm working on a code visualization tool and I'd like to be able to display the size(in lines) of each Class, Method, and Module in a project. It seems like existing parsers(such as Ripper) could make this info easy to get. Is there a preferred way to do this? Is there a method of assessing size for classes that have been re-opened in separate locations? How about for dynamically (Class.new {}, Module.new {}) defined structures?
I think what you're asking for is not possible in general without actually running the whole Ruby program the classes are part of (and then you run into the halting problem). Ruby is extremely dynamic, so lines could be added to a class' definition anywhere, at any time, without necessarily referring to the particular class by name (e.g. using class_eval on a class passed into a method as an argument). Not that the source code of a class' definition is saved anyway... I think the closest you could get to that is the source_locations of the methods of the class.
You could take the difference of the maximum and minimum line numbers of those source_locations for each file. Then you'd have to assume that the class is opened only once per file, and that the size of the last method in a file is negligible (as well as any non-method parts of the class definition that happen before the first method definition or after the last one).
If you want something more accurate maybe you could run the program, get method source_locations, and try to correlate those with a separate parse of the source file(s), looking for enclosing class blocks etc.
But anything you do will most likely involve assumptions about how classes are generally defined, and thus not always be correct.
EDIT: Just saw that you were asking about methods and modules too, not just classes, but I think similar arguments apply for those.
I've created a gem that handles this problem in the fashion suggested by wdebaum. class_source. It certainly doesn't cover all cases but is a nice 80% solution for folks that need this type of thing. Patches welcome!
I have inherited an existing code base where the "features" are as follows:
huge monolithic classes with
(literally) 100's of member variables
and methods that go one for pages
(er. screens)
public and private methods with a large number of arguments.
I am trying to clean up and refactor the code, to leave it a little better
than how I found it. So my questions
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable ?
are there best practices on how long methods should be ? How long do you usually keep them?
are monolithic classes bad ?
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable ?
Yes, it is worth it. It is typically more important to refactor methods that are not "reasonable" than ones that already are nice, short, and have a small argument list.
Typically, if you have many arguments, it's because a method does too much - most likely, it should be a class of it's own, not a method.
That being said, in those cases when many parameters are required, it's best to encapsulate the parameters into a single class (ie: SpecificAlgorithmOptions), and pass one instance of that class. This way, you can provide clean defaults, and its very obvious which methods are essential vs. optional (based on what is required to construct the options class).
are there best practices on how long methods should be ? How long do you usually keep them?
A method should be as short as possible. It should have one purpose, and be used for one task, whenver possible. If it's possible to split it into separate methods, where each as a real, qualitative "task", then do so when refactoring.
are monolithic classes bad ?
Yes.
if the code is working and there is no need to touch it, i wouldn't refactor. i only refactor very problematic cases if i anyway have to touch them (either for extending them for functionality or bug-fixing). I favor the pragmatic way: Only (in 95%) touch, what you change.
Some first thoughts on your specific problem (though in detail it is difficult without knowing the code):
start to group instance variables, these groups will then be target to do 'extract class'
when having grouped these variables you hopefully can group some methods, which also be moved when doing 'extract class'
often there are many methods which aren't using any fields. make them static (they most likely are helper methods, which can be extracted to helper-classes.
in case non-related instance fields are mixed in many methods, do loads of 'extract method'
use automatic refactoring tools as much as possible, because you most likely have no tests in place and automation is more safe.
Regarding your other concrete questions.
is worth it (or do you) refactor methods with 10 or so arguments so that they are more readable?
definetely. 10 parameters are too many to grasp for us humans. most likely the method is doing too much.
are there best practices on how long methods should be ? How long do you usually keep them?
it depends... on preferences. i stated some things on this thread (though the question was PHP). still i would apply these numbers/metrics to any language.
are monolithic classes bad ?
it depends, what you mean with monolithic. if you mean many instance variables, endless methods, a lot of if/else complexity, yes.
also have a look at a real gem (to me a must have for every developer): working effectively with legacy code
Assuming the code is functioning I would suggest you think about these questions first:
is the code well documented?
do you understand the code?
how often are new features being added?
how often are bugs reported and fixed?
how difficult is it to modify and fix the code?
what is the expected life of the code?
how many versions of the compiler are you behind (if at all)?
is the OS it runs on expected to change during its lifetime?
If the system will be replaced in five years, is documented well, will undergo few changes, and bugs are easy to fix - leave it alone regardless of the size of the classes and the number of parameters. If you are determined to refactor make a list of your refactoring proposals in the order of maximum benefit with minimum changes and attack it incrementally.
According to Wikipedia, a monkey patch is:
a way to extend or modify the runtime
code of dynamic languages [...]
without altering the original source
code.
The following statement from the same entry confused me:
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
I would like to know the exact meaning of monkey patching in Ruby. Is it doing something like the following, or is it something else?
class String
def foo
"foo"
end
end
The best explanation I heard for Monkey patching/Duck-punching is by Patrick Ewing in RailsConf 2007
...if it walks like a duck and talks like a duck, it’s a duck, right? So
if this duck is not giving you the noise that you want, you’ve got to
just punch that duck until it returns what you expect.
The short answer is that there is no "exact" meaning, because it's a novel term, and different folks use it differently. That much at least can be discerned from the Wikipedia article. There are some who insist that it only applies to "runtime" code (built-in classes, I suppose) while some would use it to refer to the run-time modification of any class.
Personally, I prefer the more inclusive definition. After all, if we were to use the term for modification of built-in classes only, how would we refer to the run-time modification of all the other classes? The important thing to me is that there's a difference between the source code and the actual running class.
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
The above statement asserts that the Ruby usage is incorrect - but terms evolve, and that's not always a bad thing.
Monkey patching is when you replace methods of a class at runtime (not adding new methods as others have described).
In addition to being a very un-obvious and difficult to debug way to change code, it doesn't scale; as more and more modules start monkey patching methods, the likelihood of the changes stomping each other grow.
You are correct; it's when you modify or extend an existing class rather than subclass it.
This is monkey patching:
class Float
def self.times(&block)
self.to_i.times { |i| yield(i) }
remainder = self - self.to_i
yield(remainder) if remainder > 0.0
end
end
Now I imagine this might be useful sometimes, but imagine if you saw routine.
def my_method(my_special_number)
sum = 0
my_special_number.times { |num| sum << some_val ** num }
sum
end
And it breaks only occasionally when it gets called. To those paying attention you already know why, but imagine that you didn't know about the float type having a .times class-method and you automatically assumed that my_special_number is an integer. Every time the parameter is a whole number, integer or float, it would work fine (whole ints are passed back except when there is a floating-point remainder). But pass a number with anything in the decimal area in and it'll break for sure!
Just imagine how often this might happen with your gems, Rails plugins, and even by your own co-workers in your projects. If there's one or two little methods in there like this and it could take some time to find and correct.
If you wonder why it breaks, note that sum is an integer and a floating-point remainder could be passed back; in addition, the exponential sign only works when types are the same. So you might think it's fixed, because you converted bother numbers to floats ... only to find that the sum can't take the floating-point result.
In Python monkeypatching is referred to a lot as a sign of embarrassment: "I had to monkeypatch this class because..." (I encountered it first when dealing with Zope, which the article mentions). It's used to say that it was necessary to take hold of an upstream class and fix it at runtime instead of lobbying to have the unwanted behaviors fixed in the actual class or fixing them in a subclass. In my experience Ruby people don't talk about monkeypatching that much, because it's not considered especially bad or even noteworthy (hence "duck punching"). Obviously you have to be careful about changing the return values of a method that will be used in other dependencies, but adding methods to a class the way that active_support and facets do is perfectly safe.
Update 10 years later: I would amend the last sentence to say "is relatively safe". Extending a core library class with new methods can lead to problems if somebody else gets the same idea and adds the same method with a different implementation or method signature, or if people confuse extended methods for core language functionality. Both cases often happen in Ruby (especially regarding active_support methods).
Explanation of the concept without code:
It means you can "dynamically" modify code. Wanna add a method "dynamically" to a particular class known only at "runtime"? No problem. It's powerful, yes: but can be misused. The concept "dynamically" might be a little too esoteric to understand, so I have prepared an example below (no code, I promise):
How to monkey patch a car:
Normal Car Operations
How do you normally start a car? It’s simple: you turn the ignition, the car starts!
Great, but how can we "monkey patch" the car class?
This is what Fabrizzio did to poor Michael Corleone. Normally, if you want to change how a car operates, you would have to make those changes in the car manufacturing plant (i.e. at "compile" time, within the Car class ^**). Fabrizzio ain't got no time for that: he monkey patches cars by getting under the bonnet to surreptitiously and sneakily rewire things. In other words, he re-opens the Car class, makes the changes he wants, and he's done: he's just monkey patched a car. he done this "dynamically".
You have to really know what you are doing when you monkey patch otherwise the results could be quite explosive.
“Fabrizzio, where are you going?”
Boom!
Like Confucius Say:
"Keep your source code close, but your monkey patches closer."
It can be dangerous.
^** yes i know, dynamic languages.
Usually it is meant about ad-hoc changes, using Ruby open classes, frequently with low quality code.
Here's a good follow-up on the subject.