Global variable vs. constant vs. class instance variable for global cache - ruby

In general which is better for a global cache: global variable, constant, or class instance variable?
Here is an example of each:
# Global variable
module Foo
  $FOO_CACHE = {}
  def self.access_to_cache
    $FOO_CACHE
  end
end
# Constant
module Foo
  CACHE = {}
  def self.access_to_cache
    CACHE
  end
end
# Class (module) instance variable
module Foo
  @cache = {}
  def self.access_to_cache
    @cache
  end
end

This is ultimately pretty subjective, but I’ll address each option one-by-one:
Global variable: no …because putting a global variable inside a module (or a class, or anything for that matter) doesn’t make much sense; it’s going to be in scope everywhere anyway. Besides, if you can use something other than a global variable, you should always do so.
Constant: no …because the cache is not constant! While Ruby doesn't enforce that constants can’t change, that doesn’t mean you should do it. There’s a reason they’re called constants.
Class instance variable: yes …because it’s the only one here that makes any sense (though the name might not, technically here it’s a module instance variable, but that’s being rather pedantic). This is the only one of the three that both makes semantic sense to modify and is encapsulated by some scope.
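As a rough sketch of the third option, the cache can live in a module instance variable and be reached only through module methods. The method names fetch and clear_cache are illustrative and not from the question, and the sketch makes no attempt at thread safety.

```ruby
module Foo
  @cache = {}   # instance variable of the module object Foo itself

  class << self
    # Returns the cached value for key; on a miss, runs the block,
    # stores its result, and returns it.
    def fetch(key)
      return @cache[key] if @cache.key?(key)

      @cache[key] = yield
    end

    # Empties the cache (hypothetical helper, handy in tests).
    def clear_cache
      @cache.clear
    end
  end
end

Foo.fetch(:answer) { 42 }   # block runs, result is stored
Foo.fetch(:answer) { 0 }    # cache hit, block is not run
```

Nothing outside Foo can touch @cache without going through these methods (short of reflection), which is the encapsulation the answer is arguing for.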

Related

How can I determine what objects a call to ruby require added to the global namespace?

Suppose I have a file example.rb like so:
# example.rb
class Example
def foo
5
end
end
that I load with require or require_relative. If I didn't know that example.rb defined Example, is there a list (other than ObjectSpace) that I could inspect to find any objects that had been defined? I've tried checking global_variables but that doesn't seem to work.
Thanks!
Although Ruby offers a lot of reflection methods, it doesn't really give you a top-level view that can identify what, if anything, has changed. It's only if you have a specific target you can dig deeper.
For example:
def tree(root, seen = { })
seen[root] = true
root.constants.map do |name|
root.const_get(name)
end.reject do |object|
seen[object] or !object.is_a?(Module)
end.map do |object|
seen[object] = true
puts object
[ object.to_s, tree(object, seen) ]
end.to_h
end
p tree(Object)
Now if anything changes in that tree structure you have new things. Writing a diff method for this is possible using seen as a trigger.
The problem is that evaluating Ruby code may not necessarily create all the classes that it will or could create. Ruby allows extensive modification to any and all classes, and it's common that at run-time it will create more, or replace and remove others. Only libraries that forcibly declare all of their modules and classes up front will work with this technique, and I'd argue that's a small portion of them.
It depends on what you mean by "the global namespace". Ruby doesn't really have a "global" namespace (except for global variables). It has a sort-of "root" namespace, namely the Object class. (Although note that Object may have a superclass and mixes in Kernel, and stuff can be inherited from there.)
"Global" constants are just constants of Object. "Global functions" are just private instance methods of Object.
So, you can get reasonably close by examining global_variables, Object.constants, and Object.private_instance_methods before and after the call to require/require_relative. (Top-level methods are private, so Object.instance_methods, which lists only public and protected methods, will not show them.)
Note, however, that, depending on your definition of "global namespace", (private) singleton methods of main might also count, so you may want to check for those as well.
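The before/after comparison could be sketched like this. The helper names global_snapshot and added_since are made up for illustration, and Object.const_set stands in for the require call so the snippet runs on its own.

```ruby
# Capture the names currently visible at the "root" of the program.
def global_snapshot
  {
    globals: global_variables,
    constants: Object.constants,
    methods: Object.private_instance_methods
  }
end

# Return only the names that appeared since the earlier snapshot.
def added_since(before)
  global_snapshot.map { |kind, names| [kind, names - before[kind]] }.to_h
end

before = global_snapshot
# A real script would call `require_relative "example"` here; defining a
# constant by hand stands in for that.
Object.const_set(:LoadedByExample, Class.new)
p added_since(before)
```

One caveat: Ruby registers a global variable when the code mentioning it is parsed, so a global referenced in the same file as the snapshot call may already appear in the "before" list.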
Of course, any of the methods the script added could, when called at a later time, themselves add additional things to the global scope. For example, the following script adds nothing to the scope, but calling the method will:
class String
module MyNonGlobalModule
def self.my_non_global_method
Object.const_set(:MY_GLOBAL_CONSTANT, 'Haha, gotcha!')
end
end
end
Strictly speaking, however, you asked about adding "objects" to the global namespace, and neither constants nor methods nor variables are objects, soooooo … the answer is always "none"?

Understanding the access of a variable assigned in initialize in Ruby

As a beginner, I've not quite got my head around self so I'm having trouble understanding how the self.blogs in initialize, and blogs then self.blogs on the next line after in the add_blog method, are all working together in the below code.
Why does blogs in the add_blog method access the same variable as self.blogs in initialize?
And then why is self.blogs used afterwards to sort the blogs array?
Also, would it matter if I used @blogs in initialize, instead of self.blogs?
class User
attr_accessor :username, :blogs
def initialize(username)
self.username = username
self.blogs = []
end
def add_blog(date, text)
added_blog = Blog.new(date, self, text)
blogs << added_blog
self.blogs = blogs.sort_by { |blog| blog.date }.reverse
added_blog
end
end
To answer your question, we have to reveal the true nature of attr_accessor.
class Foo
attr_accessor :bar
end
is completely equivalent to
class Foo
  def bar
    @bar
  end
  def bar=(value)
    @bar = value
  end
end
You can see that attr_accessor :bar defines two instance methods Foo#bar and Foo#bar= that access an instance variable @bar.
Let's then look at your code.
self.blogs = [] in initialize is actually calling the method User#blogs=, and through it sets the instance variable @blogs to an empty array. It can be written as self.blogs=([]) but it's noisy, isn't it? By the way, you can't omit self. here, otherwise it just sets a local variable.
blogs << added_blog calls the method User#blogs, which returns the value of @blogs. It can also be written as self.blogs().push(added_blog), but again that's not Rubyish. You can omit self. because there is no local variable named blogs in User#add_blog, so Ruby falls back to calling the instance method.
self.blogs = blogs.sort_by { |blog| blog.date }.reverse mixes calls to User#blogs= and User#blogs.
For most method calls on self, self.method_name is equivalent to just method_name. That's not the case for methods whose name ends with an =, though.
The first thing to note, then, is that self.blogs = etc doesn't call a method named blogs and then somehow 'assign etc to it'; that line calls the method blogs=, and passes etc to it as an argument.
The reason you can't shorten that to just blogs = etc, like you can with other method calls, is because blogs = etc is indistinguishable from creating a new local variable named blogs.
When, on the previous line, you see a bare blogs, that is also a method call, and could just as easily have been written self.blogs. Writing it with an implicit receiver is just shorter. Of course, blogs is also potentially ambiguous as the use of a local variable, but in this case the parser can tell it's not, since there's no local variable named blogs assigned previously in the method (and if there had been, a bare blogs would have the value of that local variable, and self.blogs would be necessary if you had meant the method call).
As for using @blogs = instead of self.blogs =, in this case it would have the same effect, but there is a subtle difference: if you later redefine the blogs= method to have additional effects (say, writing a message to a log), the call to self.blogs = will pick up those changes, whereas the direct assignment to @blogs will not. In the extreme case, if you redefine blogs= to store the value in a database rather than an instance variable, @blogs = won't even be similar anymore (though obviously that sort of major change in infrastructure will probably have knock-on effects internal to the class regardless).
@variable directly accesses the instance variable of the current object. Writing self.variable sends the message variable to the object. With attr_accessor that method simply returns the instance variable, but it could do other things depending on how you set up your object: it could be a hand-written method, one overridden in a subclass, or anything else.
The difference between calling blogs and self.blogs is purely a matter of style. If you use an opinionated style checker like RuboCop, it will tell you that you have a redundant use of self.
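To make the three forms concrete, here is a small sketch. The Counter class and its method names are invented for illustration; the hand-written setter counts how often it is called so the difference is observable.

```ruby
class Counter
  attr_reader :count, :writes

  def initialize
    @writes = 0
    self.count = 0            # calls count= below
  end

  # Hand-written setter with a side effect: it tracks the number of writes.
  def count=(value)
    @writes += 1
    @count = value
  end

  def bump_with_local
    count = self.count + 1    # creates a local variable; the setter is NOT called
    count
  end

  def bump_with_setter
    self.count = count + 1    # calls count=; the bare count on the right is the getter
  end

  def bump_directly
    @count += 1               # bypasses count=, so writes stays unchanged
  end
end

c = Counter.new
c.bump_with_local    # returns 1, but c.count is still 0
c.bump_with_setter   # c.count is now 1, c.writes is 2
c.bump_directly      # c.count is now 2, c.writes is still 2
```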

When to use constants instead of instance variables in Ruby?

I understand that instance variables are meant to hold state, and constants are meant to be constant. Is there any reason (besides convention) to use a constant instead of an instance variable? Is there a memory/speed advantage to using constants?
There are a few things to consider here:
Will the value change within the life-cycle of an object?
Will you need to override the value in sub-classes?
Do you need to configure the value at run-time?
The best kind of constants are those that don't really change short of updating the software:
class ExampleClass
STATES = %i[
off
on
broken
].freeze
end
Generally you use these constants internally in the class and avoid sharing them. When you share them you're limited in how they're used. For example, if another class referenced ExampleClass::STATES then you can't change that structure without changing other code.
You can make this more abstract by providing an interface:
class ExampleClass
def self.states
STATES
end
end
If you change the structure of that constant in the future you can always preserve the old behaviour:
class ExampleClass
STATES = {
on: 'On',
off: 'Off',
broken: 'Broken'
}.freeze
def self.states
STATES.keys
end
end
When you're talking about instance variables you mean things you can configure:
class ConfigurableClass
  INITIAL_STATE_DEFAULT = :off
  # Falls back to the default until a value has been configured.
  def self.initial_state
    @initial_state || INITIAL_STATE_DEFAULT
  end
  # Accepts nil to reset to the default.
  def self.initial_state=(value)
    @initial_state = value&.to_sym
  end
end
Constants are great in that they're defined once and used for the duration of the process, so technically they're faster. Instance variables are still pretty quick, and are often a necessity as illustrated above.
Constants, unlike instance variables, can be reached from anywhere through their namespace (for example ExampleClass::STATES). And Ruby will at least warn you if you try to re-assign one.
While there might be a theoretical difference in memory/speed, it will be irrelevant in practice.
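One practical point about constants is that the name and the object it refers to are protected separately: re-assigning the name only produces a warning, and the object stays mutable unless it is frozen. A minimal sketch, using an invented Light class and assuming Ruby 2.5 or later for FrozenError:

```ruby
class Light
  STATES = %i[off on broken].freeze   # frozen: the array cannot be modified
  LABELS = { off: "Off" }             # not frozen, shown for contrast
end

# Mutating an unfrozen constant's object succeeds without any warning.
Light::LABELS[:on] = "On"

# Mutating a frozen one raises.
frozen_error =
  begin
    Light::STATES << :dimmed
    nil
  rescue FrozenError => e
    e
  end

puts frozen_error.class   # FrozenError
```

This is why the STATES examples above end in .freeze.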
You may not realize this, but classes and modules are considered constants.
pry(main)> Foo
NameError: uninitialized constant Foo
The best advice I can give is to use constants when they are exactly that: constant. For example, if I were making a scope in Rails to find all of the recent Foos, I would create a constant that defines what recent means.
class Foo < ActiveRecord::Base
DAYS_TILL_OLD = 7.days
scope :recent, -> { where "created_at > ?", DateTime.now - DAYS_TILL_OLD }
end

Which is better? Creating a instance variable or passing around a local variable in Ruby?

In general, what is the best practice, and what are the pros and cons, of creating an instance variable that can be accessed from multiple methods versus creating a local variable that is simply passed as an argument to those methods? Functionally they are equivalent since the methods are still able to do the work using the variable. I could see a benefit if you were updating the variable and wanted to return the updated value, but in my specific case the variable is never updated, only read by each method to decide how to operate.
Example code to be clear:
class Test
  @foo = "something"
  def self.a
    if @foo == "something"
      puts "do #{@foo}"
    end
  end
  a()
end
vs
class Test
  foo = "something"
  def self.a(foo)
    if foo == "something"
      puts "do #{foo}"
    end
  end
  a(foo)
end
I don't pass instance variables around. They are state values for the instance.
Think of them as part of the DNA of that particular object, so they'll always be part of what makes the object be what it is. If I call a method of that object, it will already know how to access its own DNA and will do it internally, not through some parameter being passed in.
If I want to apply something that is foreign to the object, then I'll have to pass it in via the parameters.
As you mentioned, this is a non-functional issue about the code. With that in mind...
It's hard to give a definitive rule about it since it depends entirely on the context. Is the variable set once and forgotten about, or constantly updated? How many methods share the same variable? How will the code be used?
In my experience, variables that drive behavior of the object but are seldom (if at all) modified are set in the initialize method, or given to the method that will cascade behavior. Libraries and leaf methods tend to have the variable passed in, as it's likely somebody will want to call it in isolation.
I'd suggest you start by passing everything first, and then refactoring if you notice the same variable being passed around all over the class.
If I need a variable that is scoped at the instance level, I use an instance variable, set in the initialize method.
If I need a variable that is scoped at the method level (that is, a value that is passed from one method to another method) I create the variable at the method level.
So the answer to your question comes down to "When should my variable be in scope?", and I can't really answer that without seeing all of your code and knowing what you plan to do with it.
If your object behavior should be statically set in the initialization phase, I would use an instance variable.
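A minimal sketch of that split, with an invented Greeter class: the value that defines the object's behavior is set once in initialize, while the value that is foreign to the object arrives as an argument.

```ruby
class Greeter
  def initialize(greeting)
    @greeting = greeting        # part of the object's own state, set once
  end

  # name is foreign to the object, so it is passed in on each call.
  def greet(name)
    "#{@greeting}, #{name}!"
  end
end

formal = Greeter.new("Good evening")
puts formal.greet("Ada")   # prints "Good evening, Ada!"
```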

How to make instance variables private in Ruby?

Is there any way to make instance variables "private"(C++ or Java definition) in ruby? In other words I want following code to result in an error.
class Base
  def initialize
    @x = 10
  end
end
class Derived < Base
  def x
    @x = 20
  end
end
d = Derived.new
Like most things in Ruby, instance variables aren't truly "private" and can be accessed by anyone with d.instance_variable_get :@x.
Unlike in Java/C++, though, instance variables in Ruby are always private. They are never part of the public API like methods are, since they can only be accessed with that verbose getter. So if there's any sanity in your API, you don't have to worry about someone abusing your instance variables, since they'll be using the methods instead. (Of course, if someone wants to go wild and access private methods or instance variables, there isn’t a way to stop them.)
The only concern is if someone accidentally overwrites an instance variable when they extend your class. That can be avoided by using unlikely names, perhaps calling it @base_x in your example.
Never use instance variables directly. Only ever use accessors. You can define the reader as public and the writer private by:
class Foo
attr_reader :bar
private
attr_writer :bar
end
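However, keep in mind that private and protected do not mean what you think they mean. Public methods can be called with any receiver: named, self, or implicit (x.baz, self.baz, or baz). Protected methods can be called with an implicit receiver or self, and also with an explicit receiver (other.baz), but only from within a method of an object of the same class or a subclass. Private methods may only be called with an implicit receiver (baz); the exceptions are setters (self.baz = value), which have always been allowed, and, since Ruby 2.7, any call whose receiver is the literal self (self.baz).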
Long story short, you're approaching the problem from a non-Ruby point of view. Always use accessors instead of instance variables. Use public/protected/private to document your intent, and assume consumers of your API are responsible adults.
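As a hedged illustration of how the visibility levels behave in practice, the sketch below uses an invented Wallet class with a private writer and a protected reader. In Ruby, a private setter may be called through self, and a protected method may be called on another object of the same class.

```ruby
class Wallet
  attr_reader :owner

  def initialize(owner, cents)
    @owner = owner
    self.cents = cents        # private setter: the self. receiver is allowed
  end

  def richer_than?(other)
    cents > other.cents       # protected reader called on another Wallet
  end

  protected

  attr_reader :cents

  private

  attr_writer :cents
end

a = Wallet.new("a", 500)
b = Wallet.new("b", 300)
a.richer_than?(b)   # => true
# a.cents           # NoMethodError: protected method called from outside
# a.cents = 1       # NoMethodError: private method called from outside
```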
It is possible (but inadvisable) to do exactly what you are asking.
There are two different elements of the desired behavior. The first is storing x in a read-only value, and the second is protecting the getter from being altered in subclasses.
Read-only value
It is possible in Ruby to store read-only values at initialization time. To do this, we use the closure behavior of Ruby blocks.
class Foo
  def initialize(x)
    define_singleton_method(:x) { x }
  end
end
The initial value of x is now locked up inside the block we used to define the getter #x and can never be accessed except by calling foo.x, and it can never be altered.
foo = Foo.new(2)
foo.x # => 2
foo.instance_variable_get(:@x) # => nil
Note that it is not stored as the instance variable @x, yet it is still available via the getter we created using define_singleton_method.
Protecting the getter
In Ruby, almost any method of any class can be overwritten at runtime. There is a way to prevent this using the method_added hook.
class Foo
  def self.method_added(name)
    raise(NameError, "cannot change x getter") if name == :x
  end
end
class Bar < Foo
  def x
    20
  end
end
# => NameError: cannot change x getter
This is a very heavy-handed method of protecting the getter.
It requires that we add each protected getter to the method_added hook individually, and even then, you will need to add another level of method_added protection to Foo and its subclasses to prevent a coder from overwriting the method_added method itself.
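One further wrinkle is that method_added runs after the method has already been defined, so raising by itself does not undo the definition if the caller rescues the error. The sketch below, using an invented Guarded class, removes the new method before raising; treat it as an illustration of the hook's timing rather than a recommended design.

```ruby
class Guarded
  def x
    10
  end

  # Called after an instance method is defined on Guarded or a subclass.
  def self.method_added(name)
    return unless name == :x

    remove_method(name)   # undo the definition that has just happened
    raise NameError, "cannot change x getter"
  end
end

child = Class.new(Guarded)
error =
  begin
    child.class_eval do
      def x
        20
      end
    end
    nil
  rescue NameError => e
    e
  end

puts error.message   # cannot change x getter
puts child.new.x     # 10, because the override was removed again
```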
Better to come to terms with the fact that code replacement at runtime is a fact of life when using Ruby.
Unlike methods having different levels of visibility, Ruby instance variables are always private (from outside of objects). However, inside objects instance variables are always accessible, either from parent, child class, or included modules.
Since there is probably no way to alter how Ruby accesses @x, I don't think you can have any control over it. Writing @x just picks that instance variable directly, and since Ruby doesn't provide visibility control over variables, you have to live with it, I guess.
As @marcgg says, if you don't want derived classes to touch your instance variables, don't use them at all, or find a clever way to hide them from derived classes.
It isn't possible to do what you want, because instance variables aren't defined by the class, but by the object.
If you use composition rather than inheritance, then you won't have to worry about overwriting instance variables.
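A minimal sketch of the composition approach, with invented Engine and Car classes: each object keeps its own @x, so neither can overwrite the other's by accident.

```ruby
class Engine
  def initialize
    @x = 10
  end

  def x
    @x
  end
end

# Car holds an Engine instead of inheriting from it.
class Car
  def initialize
    @engine = Engine.new
    @x = 20               # a separate variable on a separate object
  end

  def x
    @x
  end

  def engine_x
    @engine.x             # reaches the engine's state only through its method
  end
end

car = Car.new
car.x          # => 20
car.engine_x   # => 10
```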
If you want protection against accidental modification, I think attr_accessor can be a good fit.
class Data
  attr_accessor :id
  private :id=
end
That makes the writer id= private, so id cannot be assigned from outside the object but can still be read. You can also use a public attr_reader together with a private attr_writer, like so:
class Data
attr_reader :id
private
attr_writer :id
end
I know this is old, but I ran into a case where I didn't so much want to prevent access to @x as exclude it from any methods that use reflection for serialization. Specifically, I often use YAML::dump for debugging purposes, and in my case @x was of class Class, which YAML::dump refuses to dump.
In this case I had considered several options:
Addressing this just for YAML by redefining "to_yaml_properties"
def to_yaml_properties
  super - ["@x"]
end
but this would only have worked for YAML, and other dumpers (to_xml?) would still not be happy.
Addressing this for all reflection users by redefining "instance_variables"
def instance_variables
  super - [:@x] # instance_variables returns symbols on Ruby 1.9 and later
end
Also, I found this in one of my searches, but have not tested it as the above seem simpler for my needs
So while these may not be exactly what the OP said he needed, if others find this posting while looking for the variable to be excluded from listing, rather than access - then these options may be of value.
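For what the instance_variables override does and does not do, here is a small sketch with an invented Session class. Whether a particular serializer honours the override depends on how that serializer enumerates variables, which is not verified here.

```ruby
class Session
  def initialize
    @user = "alice"
    @handler = String     # a Class object we would rather not list
  end

  # Hide @handler from code that enumerates instance variables.
  def instance_variables
    super - [:@handler]
  end
end

s = Session.new
p s.instance_variables                     # lists only :@user
p s.instance_variable_get(:@handler)       # the variable itself is still reachable
```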

Resources