What does it mean to "pollute the global namespace"? - ruby

In ruby, some gems choose to "pollute the global namespace".
What does this mean?
How can I see where it's happening?
Why would a gem need to do this?
When faced with two gems that are polluting the global namespace and conflicting, what tradeoffs am I making when I choose to "isolate" one?
For example:
I'm using two gems that both pollute the global namespace, pry and gli, so I can no longer place my binding.pry calls where I want.
One solution is to wrap the entire cli in a module:
module Wrapper
  include GLI::App
  extend self
  program_desc "..."
  ...
  exit run ARGV
end
Now I'm able to use my binding.prys wherever I want.
Why did this work?
What tradeoffs am I making when I choose to "isolate gli"? Or is it "isolate the GLI::App module"?

Ruby has a single root namespace shared by all code, and any constants and globals you define there are visible throughout the whole application. This makes conflicts inevitable if you're not careful about namespacing things.
The module construct is the namespace primitive: constants and classes defined inside a module are local to it. You can also use a class as a namespace if you prefer.
Forcing the include of something into the root namespace is a big problem. That's usually only done in quick scripts that are fairly tiny and self-contained. That's a bad habit to get into when you're doing anything non-trivial as it mashes together all the constants and methods in those two contexts, potentially over-writing them.
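A minimal sketch of the difference, using a hypothetical CliHelpers module as a stand-in for a DSL module such as GLI::App (the module name and the body of program_desc are illustrative assumptions, not GLI's real code). At the top level of a script, include mixes a module into Object; the sketch spells that out as Object.include so the effect is explicit:

```ruby
# Hypothetical stand-in for a DSL module such as GLI::App.
module CliHelpers
  def program_desc(text)
    "desc: #{text}"
  end
end

# Isolated: the DSL methods are reachable only through Wrapper.
module Wrapper
  include CliHelpers
  extend self
end

isolated_call = Wrapper.program_desc("demo")
leaked_before = "any string".respond_to?(:program_desc)

# What a bare top-level `include CliHelpers` amounts to:
# every object in the process gains the module's methods.
Object.include(CliHelpers)
leaked_after = "any string".respond_to?(:program_desc)

puts isolated_call
puts leaked_before
puts leaked_after
```

Once the module is mixed into Object, unrelated objects respond to its methods, which is how two libraries that each do this can end up overriding one another.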

Related

Splitting a ruby class into multiple files

I've discovered that I can split a class into multiple files by calling class <sameclassname>; <code> ;end from within each file. I've decided to divide up a very large class this way. The advantages I see:
I can have separate spec files called by guard to reduce spec time.
Forces me to organize and compartmentalize my code
Are there any pitfalls to this method? I can't find any information about people doing it.
I often do this in my gem files to manage documentation and avoid long files.
The only issues I found (which are solvable) are:
Initialization of class or module data should be thoughtfully managed. As each 'compartment' (file) is updated, the initialization data might need updating, but that data is often in a different file and we are (after all) fallible.
In my GReactor project (edit: deprecated), I wrote a different initialization method for each section and called all of them in the main initialization method.
Since each 'compartment' of the class or module is in a different file, it is easy to forget that they all share the same namespace, so more care should be taken when naming variables and methods.
The code in each file is executed in the order in which the files are loaded (much like it would be if you were writing one long file), but since you 'close' the class/module between each file, your method declaration order might be important. Care should be taken when requiring the files, so that the desired order of code execution is preserved.
The GReactor is a good example for managing a Mega-Module with a large API by compartmentalizing the different aspects of the module in different files.
There are no other pitfalls or issues that I have experienced.
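The per-section initialization approach described above might look roughly like this. The Reactor class and its method names are hypothetical and are not taken from GReactor; the three class bodies stand in for three separate files:

```ruby
# "File" 1 (hypothetical reactor/core.rb): the main initializer
# delegates to one init method per compartment.
class Reactor
  attr_reader :handlers, :timers

  def initialize
    init_handlers
    init_timers
  end
end

# "File" 2 (hypothetical reactor/handlers.rb)
class Reactor
  private

  def init_handlers
    @handlers = []
  end
end

# "File" 3 (hypothetical reactor/timers.rb)
class Reactor
  private

  def init_timers
    @timers = {}
  end
end

# All three "files" must be loaded before the first Reactor.new.
reactor = Reactor.new
puts reactor.handlers.inspect
puts reactor.timers.inspect
```

Each compartment owns its own setup, so adding state to one file only means touching that file's init method plus one line in the main initializer.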
Defining / reopening the same class in many different files makes it harder to locate the source of any given method, since there's no one clear place for it.
This also opens up the possibility of nasty loading sequence bugs, eg. file A is trying to call a method in file B, but file B has not loaded yet.
Having a very large class is a sign that the class is trying to do too much, and should be split up into smaller modules/subclasses. Sandi Metz's POODR recommends limiting classes to under 100 lines, among other guidelines.
In Ruby, classes are never closed. What you call "splitting" is actually just reopening the class. You can reopen classes and add methods to them at any time. If you define a class in file A and require it in file B, then even if you reopen the class in file B it will still contain all the code from file A. I personally prefer to reopen a class only when I have to. In your case, I would define the class in one file. I think that is better organized and has a lower risk of interfering with previously defined methods. More on the subject at rubylearning.
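A short sketch of what reopening does, including the interference risk mentioned above. The Invoice class is a made-up example; the second class body could just as well live in another file:

```ruby
class Invoice
  def total
    100
  end

  def currency
    "USD"
  end
end

# Reopening the class adds `label` and, without raising any error,
# replaces the earlier definition of `total`.
class Invoice
  def total
    120
  end

  def label
    "#{total} #{currency}"
  end
end

invoice = Invoice.new
puts invoice.label
```

The methods from the first definition are still there (currency), but the later definition of total wins, which is easy to miss when the two bodies sit in different files.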
Here's a good collection of Ruby design patterns, or actually design pattern examples in Ruby: https://github.com/nslocum/design-patterns-in-ruby
Take a look at decorator as a good way to achieve modularity without a rigid parent<->child tree of classes.
The only pitfall is that your class is split across multiple files that you need to manage. A user of your class would only need to require the second file, so if your class is part of a gem or some package, they probably wouldn't even be aware that it was ever reopened.

Ruby: What is considered global, and how do you avoid it?

I am having a tough time grasping the idea of completely avoiding globals in Ruby.
To my understanding, if I define a method, the method would be considered global because I can call the method later on in the script. Same goes for classes. Can you completely avoid globals?
Research has pointed me inconclusively towards closures and singleton methods but I am still having trouble understanding how I would 'completely avoid globals.'
EDIT: I have also programmed a bit in JavaScript and used closure as follows to avoid the use of any globals: (function(){...})(); Can something similar be done in Ruby?
It is important to understand the reasoning behind avoiding globals. The main reason is avoiding global state. By storing variable information in a global variable, you are allowing components of the program to behave differently when used in the same way at different times. This usually results in unintended side-effects, causing testing and maintenance issues. Global classes or methods are unable to change (not considering reflection) and are not an issue because of that.
Another thing you may have associated with globals is namespace pollution, which can be partially resolved by nesting namespaces in a way that groups components semantically. Those are still global, though and thus not really avoidable.
Modules and Classes as globals are not a problem.
You could use a module to conceal a class, but ultimately you're going to have some globals.
Avoiding global variables and methods would be advised, though.
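As a sketch of using a module to conceal a class: the Billing and TaxTable names below are invented for illustration. private_constant hides the inner class from outside code, while the outer module itself remains a global constant, as the answers above point out:

```ruby
module Billing
  # Helper class that only Billing itself should use.
  class TaxTable
    def percent
      20
    end
  end
  private_constant :TaxTable

  def self.tax_for(amount)
    amount * TaxTable.new.percent / 100
  end
end

tax = Billing.tax_for(100)

# Referencing the hidden class from outside raises NameError.
hidden = begin
  Billing::TaxTable
  false
rescue NameError
  true
end

puts tax
puts hidden
```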

How do I add Ruby methods to a rake file 'the right way'?

I'm creating a complicated Rakefile, and have some logic which is used in various places, and want to package it up in some 'helper' methods. I see three possibilities:
Put the methods after the tasks that invoke them.
Put them in a separate rake_helpers.rb file and include that at the start.
Use some rake feature I don't know about to handle just that case.
What's the best practice or convention here?
I just stick them in a lib/rake subdirectory, and only include them for the purpose of the rake tasks. If I need to, I can also separately include those files in my Rails (or whatever else) environment.
I actually have a whole library of special functions like this. When I'm not using Rails, for example, I have my own say_with_time("message") do; block; end logger.
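One possible shape for such a helper file, as a sketch only: the path lib/rake/helpers.rb, the RakeHelpers module name, and this particular say_with_time body are assumptions, since the answer does not show its actual implementation.

```ruby
# lib/rake/helpers.rb (hypothetical path and module name)
module RakeHelpers
  module_function

  # Prints the message, runs the block, reports the elapsed time,
  # and returns whatever the block returned.
  def say_with_time(message)
    puts "-- #{message}"
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    puts format("   -> %.4fs", elapsed)
    result
  end
end

# In the Rakefile, one would then write something like:
#   require_relative "lib/rake/helpers"
#   task :build do
#     RakeHelpers.say_with_time("building") { run_build_steps }  # run_build_steps is hypothetical
#   end

value = RakeHelpers.say_with_time("demo step") { 21 * 2 }
puts value
```

Keeping the helpers in a module rather than as bare top-level methods also avoids adding them to every object, which ties back to the namespace discussion above.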

Ruby class loading mechanism

I'm beginning with the Ruby programming language and I'm interested in understanding it in depth before I start studying the Rails framework.
I'm currently a little disappointed because everybody seems to care only about the Rails framework, and other aspects of the language are just not discussed in depth, such as its class loading mechanism.
Considering that I'm starting by doing some desktop/console experiments, I would like to better understand the following matters:
Is it a good practice to place each Ruby class in a separate Ruby file? (*.rb)
If I have, let's say .. 10 classes .. and all of them reference each other, by instantiating one another and calling each other's methods, should I add a 'require' statement in each file to state which classes are required by the class in that file? (just like we do with 'import' statements in each Java class file?)
Is there a difference in placing a 'require' statement before or after (inside) a class declaration?
What could be considered a proper Ruby program's 'entry point'? It seems to me that any .rb script will suffice, since the language doesn't have a convention like C or Java where we always need a 'main' function or method.
Is class loading considered a 'phase' in the execution of a Ruby program? Are we supposed to load all the classes that are needed by the application right at the start?
Shouldn't the interpreter itself be responsible for finding and loading classes as we run the code that needs them? By searching the paths in the $LOAD_PATH variable, like Java does with its $CLASSPATH?
Thank you.
In general terms, it's a good practice to create a separate .rb file for each Ruby class unless the classes are of a utility nature and are too trivial to warrant separation. An instance of this would be a custom Exception derived class where putting it in a separate file would be more trouble than it's worth.
Tradition holds that the name of the class and the filename are related. Where the class is called ExampleClass, the file is called example_class, the "underscored" version of same. There are occasions when you'll buck this convention, but so long as you're consistent about it there shouldn't be problems. The Rails ActiveSupport auto-loader will help you out a lot if you follow convention, so a lot of people follow this practice.
Likewise, you'll want to organize your application into folders like lib and bin to separate command-line scripts from back-end libraries. The command-line scripts do not usually have a .rb extension, whereas the libraries should.
When it comes to require, this should be used sparingly. If you structure your library files correctly they can all load automatically once you've called require on the top-level one. This is done with the autoload feature.
For example, lib/example_class.rb might look like:
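class ExampleClass
  # Custom errors normally derive from StandardError so a bare rescue catches them.
  class SpecialException < StandardError
  end
  # Load example_class/foo.rb the first time ExampleClass::Foo is referenced.
  autoload(:Foo, 'example_class/foo')
  # ...
end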
You would organize other things under separate directories or files, like lib/example_class/foo.rb which could contain:
class ExampleClass::Foo
  # ...
end
You can keep chaining autoloads all the way down. This has the advantage of only loading modules that are actually referenced.
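To see the lazy behaviour in action, here is a self-contained sketch that writes a throwaway file to a temporary directory and registers it with autoload. The Demo module and LazyWidget class are invented for the demonstration; Module#autoload and Module#autoload? are the standard Ruby methods:

```ruby
require "tmpdir"

module Demo
end

pending_path, widget_name, still_pending = Dir.mktmpdir do |dir|
  path = File.join(dir, "lazy_widget.rb")
  File.write(path, <<~RUBY)
    module Demo
      class LazyWidget
        def name
          "widget"
        end
      end
    end
  RUBY

  Demo.autoload(:LazyWidget, path)
  before = Demo.autoload?(:LazyWidget)  # the registered path while the file is unloaded
  name = Demo::LazyWidget.new.name      # first reference triggers the require
  after = Demo.autoload?(:LazyWidget)   # nil once the constant has been loaded
  [before, name, after]
end

puts pending_path
puts widget_name
puts still_pending.inspect
```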
Sometimes you'll want to defer a require to somewhere inside the class implementation. This is useful if you want to avoid loading in a heavy library unless a particular feature is used, where this feature is unlikely to be used under ordinary circumstances.
For example, you might not want to load the YAML library unless you're doing some debugging:
def debug_export_to_yaml
  # The YAML library is loaded only when this method is first called.
  require 'yaml'
  YAML.dump(some_stuff)
end
If you look at the structure of common Ruby gems, the "entry point" is often the top-level of your library or a utility script that includes this library. So for an example ExampleLibrary, your entry point would be lib/example_library.rb which would be structured to include the rest on demand. You might also have a script bin/library_tool that would do this for you.
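A common convention for making one file serve both as a library and as a runnable script is to guard the script part with a check on __FILE__. This is a sketch of that idiom, not something the answer prescribes; the run method and its greeting are placeholders:

```ruby
# lib/example_library.rb (hypothetical contents)
module ExampleLibrary
  # Returns the text the tool would print; it has no side effects,
  # which keeps it easy to call from other code or from tests.
  def self.run(argv)
    name = argv.first || "world"
    "Hello, #{name}!"
  end
end

# Runs only when this file is executed directly
# (ruby lib/example_library.rb), not when another file requires it.
puts ExampleLibrary.run(ARGV) if __FILE__ == $PROGRAM_NAME
```

A script such as bin/library_tool would then just require the library and call ExampleLibrary.run(ARGV) itself.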
As for when to load things, if there's a very high chance of something getting used, load it up front to pay the price early, so called "eager loading". If there's a low chance of it getting used, load it on demand, or leave it "lazy loaded" as it's called.
Have a look at the source of some simple but popular gems to get a sense of how most people structure their applications.
I'll try to help you with the first one:
Is it a good practice to place each Ruby class in a separate Ruby file? (*.rb)
It comes down to how closely related those classes are. Let's see a few examples. Look at this class: https://github.com/resque/resque/blob/master/lib/resque.rb. It "imports" the functionality of several classes that, although they work together, are not closely related enough to be bundled together.
On the other hand, take a look at this module: https://github.com/resque/resque/blob/master/lib/resque/errors.rb. It bundles 5 different classes, but these do belong together since they all represent essentially the same kind of thing.
Additionally, from a design standpoint, a good rule of thumb is to ask yourself who else is using this class or functionality (meaning which other parts of the code base need it).
Let's say that you want to represent a Click and WheelScroll performed by a Mouse. It would make more sense in this trivial example, that those classes be bundled together:
module ComputerPart
  class Mouse; end
  class WheelScroll; end
  class Click; end
end
Finally, I would recommend that you peruse the code of some of these popular projects to get a feeling for how the community usually makes these decisions.
1.) I follow this practice, but it is not necessary, you can put a bunch of classes in one file if you want.
2.) If the classes are in the same file, no, they will all be accessible when you run the script. If they are in separate files then you should require them, you can also require the entire directory that the file(self) is in.
3.)Yes, it should be at the top of the file.
4.) In Ruby, top-level code runs in the context of an object called main, which the interpreter creates for you. If you are writing OO Ruby and not just scripts, then the entry point will be the initialize method of the first class you instantiate.
5.) Effectively yes: each require runs when execution reaches it, so dependencies required at the top of your files are loaded before the code that uses them runs.
6.) I think it does this. All you have to do is require the proper files at the top of your files; after that you can use them as you wish without having to explicitly load them again.

Running another ruby script with globals intact?

For reasons that are a little hard to explain, I need to do the following: I have a master.rb file that sets some global like: a = 1. I want to call another file other_file.rb that will run with the globals that were set in the master file. In python I'd use runpy.run_module( 'other_module', globals() ).
Can anyone think of an equivalent in Ruby? I've looked at require, include, and load, but none seem to do quite what I need, specifically they don't pull the globals into the other_file.rb. Note that I am not trying to fork a new process, just hand execution over to "other_module" while maintaining the state of the globals.
a=1 is not a global variable, it is a local variable that gets scoped to the file. If you really need this behavior, use $a=1 to set global variables.
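The difference can be checked with a small sketch that stands in for master.rb. It writes a throwaway other_file.rb to a temporary file and runs it with load; the variable names $shared_value, $seen_global and $sees_local are made up for the demonstration:

```ruby
require "tempfile"

a = 1               # local variable: invisible to other files
$shared_value = 1   # global variable: visible everywhere in this process

Tempfile.create(["other_file", ".rb"]) do |file|
  file.write(<<~RUBY)
    $seen_global = $shared_value
    $sees_local  = defined?(a) ? true : false
  RUBY
  file.flush
  load file.path    # runs the other file inside the current process
end

puts $seen_global   # the global crossed the file boundary
puts $sees_local    # the local did not
```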
If you absolutely must, you can use globals, and they're declared with the $ prefix. They are highly discouraged because there is only one global namespace, which makes collisions possible. Generally they are used for interpreter configuration, like $LOAD_PATH.
A better approach is to use a module that has instance variables:
module MyContainer
  # Module-level instance variable, lazily initialized to an empty hash.
  def self.settings
    @settings ||= {}
  end
end
MyContainer.settings[:foo] = :bar
This has the advantage of keeping your variables contained in a namespace while not preventing other sub-programs from accessing them.
Keep in mind this will only work within the context of the same Ruby process or children created using fork, so using system or exec will not work. Remember also that forked processes need to use IPC to communicate with their parent.
