As far as I know, vim-go relies on guru to find implementations or usages, which requires compiling the whole project first. GoLand doesn't seem to need that, but how?
While this task may look rather trivial, GoLand uses some tricks to perform it more efficiently. Let's go step by step to explore how it works.
When one opens a project for the first time, the IDE performs so-called indexing. In particular, it stores all the method and method spec names as well as the number of their parameters.
At the beginning of the search, GoLand takes method specs of the interface and finds one with the biggest number of parameters. This is a performance optimization. The idea behind it is that methods with many parameters occur less often in the code, so the IDE needs to check just a few of them.
It's time to use the index. For the chosen method spec, the IDE finds all corresponding methods. The scope is taken into account, so for a private interface, for instance, it's much smaller.
For each method, GoLand resolves its type and checks whether it implements the interface. This is the moment when all of the interface's method specs are taken into account.
Not only structs can implement interfaces; other interfaces can, too. As the next step, the IDE looks for all corresponding method specifications, that is, method specifications with the same name and parameter count. A separate index is responsible for this.
For each method spec, its interface is taken and checked. No resolution is involved this time, as it's enough to traverse the syntax tree upwards to find the interface that owns an arbitrary method spec.
That's basically the algorithm. There are a few more implementation details that make it work faster, but they don't affect the results.
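To make the strategy above concrete, here is a toy model in Ruby (the language used by the other examples on this page). It is purely illustrative: the index layout and every name in it are invented, and a real IDE resolves receiver types and full signatures rather than comparing arities.

# Toy model of the search strategy described above (illustrative only).
# "Index": method name => list of [arity, owner_type] entries.
INDEX = {
  "Read" => [[1, "File"], [1, "Buffer"], [1, "NetConn"]],
  "Seek" => [[2, "File"], [2, "Buffer"]],
}

# Method specs of the interface we are searching implementations for.
INTERFACE_SPECS = [
  { name: "Read", arity: 1 },
  { name: "Seek", arity: 2 },
]

# Step 1: pick the spec with the most parameters; high-arity names are
# rarer, so there are fewer candidates to verify.
pivot = INTERFACE_SPECS.max_by { |spec| spec[:arity] }

# Step 2: index lookup - methods matching the pivot's name and arity.
candidates = INDEX.fetch(pivot[:name], []).select { |arity, _| arity == pivot[:arity] }

# Step 3: keep only the owner types on which ALL interface specs match.
implementors = candidates.map { |_, owner| owner }.select do |owner|
  INTERFACE_SPECS.all? do |spec|
    INDEX.fetch(spec[:name], []).any? { |arity, o| arity == spec[:arity] && o == owner }
  end
end

p implementors  # => ["File", "Buffer"]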
GoLand relies on the IntelliJ Platform and custom-written tooling, bundled as a language plugin on top of it, to handle indexing, parsing, navigation, and code editing.
Related
Uncle Bob says:
"Defensive programming, in non-public APIs, is a smell, and a symptom, of teams that don't do TDD."
I am wondering how TDD can prevent an (internal) function from being used in an unintended way? I think TDD can't prevent it. It merely shows that the function is used correctly, because a calling function is covered by its passing unit tests.
When a new feature is developed using the (undefended) function, that feature is also developed with TDD. So unintended use of the function will fail the new feature's tests.
So using TDD to drive new features will force you to correctly use (internal) functions.
Do you think that is what is meant by Uncle Bob's tweet?
So using TDD to drive new features will force you to correctly use (internal) functions.
Exactly. But keep in mind the subtle "gap" here: you should use TDD to write (unit) tests that test the contract of your public methods. You do not care about the implementation of these methods - that is all internal implementation detail.
Therefore: if your "new" code uses an existing method in an unintended way you are "told" because an exception is thrown or you receive an unexpected result.
That is what I mean by "gap": you see, the above describes a black-box testing approach. You have a public method X, and you verify its public contract. Compare that to white-box testing, where you write tests to cover all paths taken within X. When doing that, you might notice: "OK, to test that one condition in my internal method, I would have to drive in this special data."
But as said, I think you should go for black-box testing; white-box tests tend to break easily when refactoring internal methods.
And there is an additional dimension here: keep in mind that, ideally, new features are implemented by adding new code, not by modifying existing code. This means that adding new features only takes place by writing new classes and methods. This means that your new code has no chance of using private internal methods, because you are within a new class. In other words: if you regularly run into situations where your internal methods are used in many different ways, then you are probably doing something wrong.
The ideal path is: you implement a new requirement by creating a set of new classes. Later on, you have to add other requirements - by writing more classes.
In that ideal path - there is no need for defensive programming within internal methods. Because you exactly understand each use case for such internal methods!
Thus, the conclusion is: avoid defensive programming in internal methods. Make sure that your public APIs check all pre-conditions, so they fail (as fast as possible) if there is a problem. Try to avoid these internal consistency checks - as they tend to bloat your code - and rest assured: in 5 weeks or 5 months you will not remember if you really needed that check, or if it is just "defensive".
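As a small Ruby sketch of that advice (the Account class and its methods are invented for illustration): validate at the public boundary and fail fast, and let private internals trust their callers.

class Account
  def initialize
    @balance = 0
  end

  # Public API: check all preconditions and fail as fast as possible.
  def deposit(amount)
    raise ArgumentError, "amount must be a positive number" unless amount.is_a?(Numeric) && amount.positive?
    apply(amount)
  end

  private

  # Internal method: no defensive re-checking; it trusts the public boundary.
  def apply(amount)
    @balance += amount
  end
end

account = Account.new
account.deposit(25)  # fine

begin
  account.deposit(-5)
rescue ArgumentError => e
  puts e.message  # => amount must be a positive number
end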
One way to answer this is to look at what else Uncle Bob has had to say on the topic. For example:
In a system with meager code coverage, few tests, and lots of tangled legacy code, defensive programming should be the rule.
In a system born of TDD, with 90+% coverage and highly reliable, well-maintained unit tests, defensive programming should be the exception.
From this, we can infer his main argument -- if the defensive checks are actually providing a benefit, then that is a hint that we are missing some constraints. If we are missing some constraints, and all the tests are passing, then we must also be missing some tests.
Or, to express the same idea in a slightly different way -- the constraints implied by the defensive patterns in your implementation belong closer to the boundary (ie, in the public API).
If there are constraints, for example, to limit what data is allowed to pass through the boundary, then there should be tests to ensure that the boundary actually implements the constraints.
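A hedged sketch of such a boundary test, using Minitest (the Account stand-in is invented, mirroring the earlier example):

require "minitest/autorun"

# Minimal stand-in for a public boundary.
class Account
  def deposit(amount)
    raise ArgumentError, "amount must be a positive number" unless amount.is_a?(Numeric) && amount.positive?
    amount
  end
end

class AccountBoundaryTest < Minitest::Test
  def test_rejects_non_positive_amounts
    assert_raises(ArgumentError) { Account.new.deposit(-5) }
  end

  def test_accepts_positive_amounts
    assert_equal 10, Account.new.deposit(10)
  end
end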
When you use TDD properly, you cover all the possible cases and assert that your public functions, which call the private ones, respond as expected, not only for the happy scenario but for all the different possible scenarios. When you use defensive programming in your private methods, you are preparing for those same scenarios a second time.
Personally, I do not think defensive programming is bad, even in private methods. However, based on the description above, I see it as an unnecessary double effort, and it also undermines the point of TDD, because you handle these special cases by complicating the code instead of writing it in a way that is proof against them.
I am starting to work on a project and for one of the tasks I need to analyze the source code in order to gather information about the classes and their methods. More specifically, for each method I need to know which internal attributes and external objects (references) it uses throughout the entire method body.
I discussed it with my supervisors and they think that bytecode manipulation libraries are the way to go. I have already looked at BCEL, ASM, and Javassist, but I'm not sure which one I should use. Do they all provide access to the method body, where I can see all the instructions and get the information I need?
Any advice would be appreciated. Thank you!
If you really "need to analyze the source code", then libraries that let you inspect the bytecode are not the way to go.
Otherwise, you need to define your task precisely. Either you are about to analyze classes, regardless of whether you look at their source code or byte code, or you want to analyze source code and consider doing so by compiling first and then analyzing the compiled result. In the latter case, you have to compare the effort of those two steps with alternative solutions, e.g., direct source code analysis.
Parsing byte code is rather easy, easier than analyzing source code, which is the reason why byte code is produced prior to the execution of Java programs in the first place. To answer your concrete question: yes, all three libraries offer a way to analyze the instructions and associated information. Which one best fits your needs is a question beyond the scope of Stack Overflow.
Whether analyzing the byte code helps depends on your exact requirements. When it comes to field and method access, you can get most of them precisely using that approach; only inlined compile-time constants lack their origins. When it comes to type use, you have to consider that not every source code artifact has a counterpart in the byte code. For example, widening casts produce no actual code, and local variables usually don't have a declared type (debugging information aside), only an implied type that depends on how they are actually used. Local variables also carry no information about generics, unless debugging information has been included.
I am learning Ruby and I'm having a major conceptual problem concerning typing. Allow me to detail why I don't understand this paradigm.
Say I am method chaining for concise code, as you do in Ruby. I have to know precisely what the return type of each method call in the chain is; otherwise I can't know what methods are available on the next link. Do I have to check the method documentation every time? I run into this constantly while doing tutorial exercises. It seems I'm stuck with a cycle of reference, infer, run, fail, fix, repeat to get code running, rather than knowing precisely what I'm working with while coding. This flies in the face of Ruby's promise of intuitiveness.
Say I am using a third-party library. Once again, I need to know what types the parameters are allowed to be; otherwise I get a failure. I can look at the code, but there may or may not be any comments or declarations of what type a method is expecting. I understand that you code against the methods available on an object, not its type. But then I have to be sure that whatever I pass as a parameter has all the methods the library expects, so I still have to do type checking. Do I have to hope and pray everything is documented properly on an interface so I know whether I'm expected to give a string, a hash, a class, etc.?
If I look at the source of a method I can get a list of methods being called and infer the type expected, but I have to perform analysis.
Ruby and duck typing: design by contract impossible?
The discussions in the preceding Stack Overflow question don't really answer anything other than "there are processes you have to follow", and those processes don't seem to be standard; everyone has a different opinion on which process to follow, and the language has zero enforcement. Method validation? Test-driven design? Documented API? Strict method naming conventions? What's the standard, and who dictates it? What do I follow? Would these guidelines solve this concern: https://stackoverflow.com/questions/616037/ruby-coding-style-guidelines? Are there editors that help?
Conceptually, I don't get the advantage either. You need to know which methods are required by anything you call, so you are reasoning about types whenever you write code regardless; you just aren't telling the language, or anyone else, explicitly, unless you decide to document it. And then you are stuck doing all type checking at runtime instead of while coding. I've done PHP and Python programming, and I don't understand it there either.
What am I missing or not understanding? Please help me understand this paradigm.
This is not a Ruby-specific problem; it's the same for all dynamically typed languages.
Usually there are no guidelines for how to document this either (and most of the time it's not really possible). See, for instance, map in the Ruby documentation:
map { |item| block } → new_ary
map → Enumerator
What are item, block, and new_ary here, and how are they related? There's no way to tell unless you know the implementation or can somehow infer it from the name of the function. Specifying the type is also hard, since new_ary depends on what block returns, which in turn depends on the type of item, which can be different for each element in the Array.
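A quick illustration of why new_ary's element type can't be pinned down once and for all; it simply follows whatever the block returns:

[1, 2, 3].map { |item| item.to_s }    # => ["1", "2", "3"]   block returns String
[1, 2, 3].map { |item| item * 2.0 }   # => [2.0, 4.0, 6.0]   block returns Float
["a", 1].map { |item| item.class }    # => [String, Integer] mixed input, Class output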
A lot of times you also stumble across documentation that says an argument is of type Object, which again tells you nothing, since everything is an Object.
OCaml has a solution for this: it supports structural typing, so a function that needs an object with a method foo returning a string will be inferred to have the structural type < foo : string > rather than some concrete nominal type. But OCaml is still statically typed.
Worth noting is that this can be a problem in statically typed languages, too. Scala has very generic methods on collections, which leads to type signatures like ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Array[T], B, That]): That for appending two collections.
So most of the time, you will just have to learn this by heart in dynamically typed languages, and perhaps help improve the documentation of libraries you are using.
And this is why I prefer static typing ;)
Edit: One thing that might make sense is to do what Scala does. It doesn't actually show you that type signature for ++ by default; instead it shows ++[B](that: GenTraversableOnce[B]): Array[B], which is not as generic but probably covers most of the use cases. So Ruby's map could have a monomorphic type signature like Array<a> -> (a -> b) -> Array<b>. It's only correct for the cases where the list contains values of one type and the block only returns elements of one other type, but it's much easier to understand and gives a good overview of what the function does.
Yes, you seem to misunderstand the concept. It's not a replacement for static type checking; it's just different. For example, if you convert objects to JSON (for rendering them to the client), you don't care about the actual type of the object, as long as it has a #to_json method. In Java, you'd have to create an IJsonable interface. In Ruby, no such overhead is needed.
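A minimal sketch of that idea (the User and Invoice classes are invented for illustration):

require "json"

# Two unrelated classes; neither declares a shared interface.
class User
  def to_json(*args)
    { type: "user" }.to_json(*args)
  end
end

class Invoice
  def to_json(*args)
    { type: "invoice" }.to_json(*args)
  end
end

# The renderer only cares that the object quacks like JSON.
def render(object)
  object.to_json
end

puts render(User.new)     # => {"type":"user"}
puts render(Invoice.new)  # => {"type":"invoice"}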
As for knowing what to pass where and what returns what: memorize this or consult docs each time. We all do that.
Just the other day, I saw a Rails programmer with 6+ years of experience complain on Twitter that he can't memorize the order of parameters to alias_method: does the new name go first or last?
This flies in the face of Ruby's promise of intuitiveness.
Not really. Maybe it's just a badly written library. In core Ruby, everything is quite intuitive, I dare say.
Statically typed languages with their powerful IDEs have a small advantage here, because they can show you documentation right here, very quickly. This is still accessing documentation, though. Only quicker.
Consider that the design choices of strongly typed languages (C++, Java, C#, et al.) enforce strict declarations of the types passed to methods and the types returned by them. This is because these languages were designed to validate that arguments are correct, and since they are compiled, this work can be done at compile time. Some questions can only be answered at run time, though, and C++, for example, has RTTI (Run-Time Type Information) to examine and enforce type guarantees. But as the developer, you are guided by syntax, semantics, and the compiler to produce code that follows these type constraints.
Ruby gives you the flexibility to take dynamic argument types and return dynamic types. This freedom enables you to write more generic code (read Stepanov on the STL and generic programming), and gives you a rich set of introspection methods (is_a?, instance_of?, respond_to?, kind_of?, et al.) which you can use dynamically. Ruby enables you to write generic methods, but you can also explicitly enforce design by contract and handle contract failures however you choose.
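For example, a small sketch of enforcing such a contract explicitly with those introspection methods (the write_all method is invented for illustration):

require "stringio"

# Explicit design-by-contract checks via Ruby's introspection methods.
def write_all(sink, lines)
  raise ArgumentError, "sink must respond to #puts" unless sink.respond_to?(:puts)
  raise TypeError, "lines must be an Array" unless lines.is_a?(Array)
  lines.each { |line| sink.puts(line) }
end

buffer = StringIO.new
write_all(buffer, ["hello", "world"])  # StringIO quacks like IO
print buffer.string                    # => hello\nworld\n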
Yes, you will need to take care when chaining methods together, but learning Ruby is not just a matter of a few new keywords. Ruby supports multiple paradigms; you can write procedural, object-oriented, generic, and functional programs. The cycle you are in right now will improve quickly as you learn more Ruby.
Perhaps your concern stems from a bias towards strongly typed languages (C++, Java, C#, et al.). Duck typing is a different approach; you think differently. Duck typing means that if an object looks like a duck and behaves like a duck, then it is a duck. Everything (almost) is an Object in Ruby, so everything is polymorphic.
Consider templates (C++ has them, C# and Java have generics, C has macros). You build an algorithm, and then have the compiler generate instances for your chosen types. You aren't doing design by contract with generics, but when you recognize their power, you write less code and produce more.
Some of your other concerns,
third party libraries (gems) are not as hard to use as you fear
Documented API? See Rdoc and http://www.ruby-doc.org/
Rdoc documentation is (usually) provided for libraries
coding guidelines - look at the source for a couple of simple gems for starters
naming conventions - snake_case for methods and variables, CamelCase for class names
Suggestion - approach an online tutorial with an open mind, do the tutorial (http://rubymonk.com/learning/books/ is good), and you will have more focused questions.
I'm working on a code visualization tool and I'd like to be able to display the size (in lines) of each class, method, and module in a project. It seems like existing parsers (such as Ripper) could make this info easy to get. Is there a preferred way to do this? Is there a method of assessing size for classes that have been re-opened in separate locations? How about for dynamically defined structures (Class.new {}, Module.new {})?
I think what you're asking for is not possible in general without actually running the whole Ruby program the classes are part of (and then you run into the halting problem). Ruby is extremely dynamic, so lines could be added to a class' definition anywhere, at any time, without necessarily referring to the particular class by name (e.g. using class_eval on a class passed into a method as an argument). Not that the source code of a class' definition is saved anyway... I think the closest you could get to that is the source_locations of the methods of the class.
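A tiny example of the dynamism in question; a class gains a method at runtime without its name ever appearing at the call site:

# The class to extend arrives as an argument; static analysis of this file
# cannot attribute the added lines to any particular class.
def extend_anonymously(klass)
  klass.class_eval do
    def injected
      "added at runtime"
    end
  end
end

class Target; end

extend_anonymously(Target)
puts Target.new.injected  # => added at runtime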
You could take the difference of the maximum and minimum line numbers of those source_locations for each file. Then you'd have to assume that the class is opened only once per file, and that the size of the last method in a file is negligible (as well as any non-method parts of the class definition that happen before the first method definition or after the last one).
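A rough sketch of that estimate using source_location (assuming, as above, that the class is opened at most once per file, and counting instance methods only):

# Rough per-file extent of a class, from its instance methods' definition
# sites. The body of the last method in each file (and any non-method code)
# is not counted.
def rough_extents(klass)
  locations = klass.instance_methods(false)
                   .map { |name| klass.instance_method(name).source_location }
                   .compact  # C-defined methods have no source_location

  locations.group_by(&:first).transform_values do |locs|
    lines = locs.map(&:last)
    lines.min..lines.max
  end
end

class Example
  def a; end

  def b; end
end

p rough_extents(Example)  # => {"this_file.rb" => 17..19} (numbers vary)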
If you want something more accurate maybe you could run the program, get method source_locations, and try to correlate those with a separate parse of the source file(s), looking for enclosing class blocks etc.
But anything you do will most likely involve assumptions about how classes are generally defined, and thus not always be correct.
EDIT: Just saw that you were asking about methods and modules too, not just classes, but I think similar arguments apply for those.
I've created a gem that handles this problem in the fashion suggested by wdebaum: class_source. It certainly doesn't cover all cases, but it is a nice 80% solution for folks that need this type of thing. Patches welcome!
I have inherited an existing code base where the "features" are as follows:
huge monolithic classes with (literally) hundreds of member variables and methods that go on for pages (er, screens)
public and private methods with a large number of arguments
I am trying to clean up and refactor the code, to leave it a little better than how I found it. So my questions are:
is it worth it (or do you) to refactor methods with 10 or so arguments so that they are more readable?
are there best practices on how long methods should be? How long do you usually keep them?
are monolithic classes bad?
is it worth it (or do you) to refactor methods with 10 or so arguments so that they are more readable?
Yes, it is worth it. It is typically more important to refactor methods that are not "reasonable" than ones that already are nice, short, and have a small argument list.
Typically, if you have many arguments, it's because the method does too much; most likely, it should be a class of its own, not a method.
That being said, in those cases when many parameters are required, it's best to encapsulate them in a single class (e.g., SpecificAlgorithmOptions) and pass one instance of that class. This way, you can provide clean defaults, and it's very obvious which parameters are essential vs. optional (based on what is required to construct the options class).
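Sketched in Ruby (SpecificAlgorithmOptions is the name from above; the fields and the run_algorithm method are invented):

# Parameter object: defaults live in one place, and required vs. optional
# parameters are visible from the constructor instead of a long argument list.
class SpecificAlgorithmOptions
  attr_reader :threshold, :max_iterations, :verbose

  def initialize(threshold:, max_iterations: 100, verbose: false)
    @threshold = threshold  # required: no default
    @max_iterations = max_iterations
    @verbose = verbose
  end
end

def run_algorithm(data, options)
  puts "running..." if options.verbose
  data.take(options.max_iterations).select { |x| x > options.threshold }
end

options = SpecificAlgorithmOptions.new(threshold: 5)
p run_algorithm([3, 7, 9, 2], options)  # => [7, 9]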
are there best practices on how long methods should be? How long do you usually keep them?
A method should be as short as possible. It should have one purpose and be used for one task whenever possible. If it's possible to split it into separate methods, where each has a real, qualitative task, then do so when refactoring.
are monolithic classes bad?
Yes.
If the code is working and there is no need to touch it, I wouldn't refactor. I only refactor very problematic cases if I have to touch them anyway (either to extend their functionality or to fix bugs). I favor the pragmatic way: only touch (in 95% of cases) what you change.
Some first thoughts on your specific problem (though in detail it is difficult without knowing the code):
start to group instance variables; these groups will then be targets for 'extract class' (see the sketch after this list)
once you have grouped these variables, you can hopefully group some methods as well, which can then be moved when doing 'extract class'
often there are many methods which aren't using any fields; make them static (they are most likely helper methods, which can be extracted to helper classes)
in case unrelated instance fields are mixed together in many methods, do lots of 'extract method'
use automatic refactoring tools as much as possible, because you most likely have no tests in place, and automation is safer
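A before/after sketch of the 'extract class' moves from the list above (all class and field names are invented):

# Before: address fields tangled into the monolithic class.
class MonolithicCustomer
  def initialize(name, street, city, zip)
    @name, @street, @city, @zip = name, street, city, zip
  end

  def shipping_label
    "#{@name}\n#{@street}\n#{@zip} #{@city}"
  end
end

# After 'extract class': the grouped fields move out together with the
# methods that use them.
class Address
  def initialize(street:, city:, zip:)
    @street, @city, @zip = street, city, zip
  end

  def label_lines
    "#{@street}\n#{@zip} #{@city}"
  end
end

class Customer
  def initialize(name:, address:)
    @name, @address = name, address
  end

  def shipping_label
    "#{@name}\n#{@address.label_lines}"
  end
end

address = Address.new(street: "1 Main St", city: "Springfield", zip: "12345")
puts Customer.new(name: "Ada", address: address).shipping_label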
Regarding your other concrete questions.
is it worth it (or do you) to refactor methods with 10 or so arguments so that they are more readable?
Definitely. Ten parameters are too many for us humans to grasp; most likely the method is doing too much.
are there best practices on how long methods should be? How long do you usually keep them?
It depends... on preferences. I stated some things in this thread (though the question was about PHP); still, I would apply those numbers/metrics to any language.
are monolithic classes bad?
It depends on what you mean by monolithic. If you mean many instance variables, endless methods, and a lot of if/else complexity, then yes.
Also have a look at a real gem (to me, a must-have for every developer): Working Effectively with Legacy Code.
Assuming the code is functioning I would suggest you think about these questions first:
is the code well documented?
do you understand the code?
how often are new features being added?
how often are bugs reported and fixed?
how difficult is it to modify and fix the code?
what is the expected life of the code?
how many versions of the compiler are you behind (if at all)?
is the OS it runs on expected to change during its lifetime?
If the system will be replaced in five years, is well documented, will undergo few changes, and has bugs that are easy to fix, leave it alone regardless of the size of the classes and the number of parameters. If you are determined to refactor, make a list of your refactoring proposals in order of maximum benefit for minimum change, and attack it incrementally.