I have been a Java programmer for a while and I am trying to switch to ruby for a while. I was just trying to develop a small test program in ruby and my intention is something like following.
I want to create a simple linked list type of an object in ruby; where an instance variable in class points to another instance of same type.
I want to populate and link all nodes; before the constructor is called and only once. Something that we'd usually do in Java Static block.
Initialize method is a constructor signature in ruby. Are there any rules around them? Like in Java you cannot call another constructor from a constructor if its not the first line (or after calling the class code?)
Thanks for the help.
-Priyank
I want to create a simple linked list type of an object in ruby; where an instance variable in class points to another instance of same type.
Just a quick note: the word type is a very dangerous word in Ruby, especially if you come from Java. Due to an historic accident, the word is used both in dynamic typing and in static typing to mean two only superficially related, but very different things.
In dynamic typing, a type is a label that gets attached to a value (not a reference).
Also, in Ruby the concept of type is much broader than in Java. In Java programmer's minds, "type" means the same thing as "class" (although that's not true, since Interfaces and primitives are also types). In Ruby, "type" means "what can I do with it".
Example: in Java, when I say something is of type String, I mean it is a direct instance of the String class. In Ruby, when I say something is of type String, I mean it is either
a direct instance of the String class or
an instance of a subclass of the String class or
an object which responds to the #to_str method or
an object which behaves indistinguishably from a String.
I want to populate and link all nodes; before the constructor is called and only once. Something that we'd usually do in Java Static block.
In Ruby, everything is executable. In particular, there is no such thing as a "class declaration": a class body is just exectuable code, just like any other. If you have a list of method definitions in your class body, those are not declarations that are read by the compiler and then turned into a class object. Those are expressions that get executed by the evaluator one by one.
So, you can put any code you like into a class body, and that code will be evaluated when the class is created. Within the context of a class body, self is bound to the class (remember, classes are just objects like any other).
Initialize method is a constructor signature in ruby. Are there any rules around them? Like in Java you cannot call another constructor from a constructor if its not the first line (or after calling the class code?)
Ruby doesn't have constructors. Constructors are just factory methods (with stupid restrictions); there is no reason to have them in a well-designed language, if you can just use a (more powerful) factory method instead.
Object construction in Ruby works like this: object construction is split into two phases, allocation and initialization. Allocation is done by a public class method called allocate, which is defined as an instance method of class Class and is generally never overriden. It just allocates the memory space for the object and sets up a few pointers, however, the object is not really usable at this point.
That's where the initializer comes in: it is an instance method called initialize, which sets up the object's internal state and brings it into a consistent, fully defined state which can be used by other objects.
So, in order to fully create a new object, what you need to do is this:
x = X.allocate
x.initialize
[Note: Objective-C programmers may recognize this.]
However, because it is too easy to forget to call initialize and as a general rule an object should be fully valid after construction, there is a convenience factory method called Class#new, which does all that work for you and looks something like this:
class Class
def new(*args, &block)
obj = alloc
obj.initialize(*args, &block)
return obj
end
end
[Note: actually, initialize is private, so reflection has to be used to circumvent the access restrictions like this: obj.send(:initialize, *args, &block)]
That, by the way, is the reason why to construct an object you call a public class method Foo.new but you implement a private instance method Foo#initialize, which seems to trip up a lot of newcomers.
To answer your question: since an initializer method is just a method like any other, there are absolutely no restrictions as to what you can do whithin an initializer, in particular you can call super whenever, wherever, however and how often you want.
BTW: since initialize and new are just normal methods, there is no reason why they need to be called initialize and new. That's only a convention, although a pretty strong one, since it is embodied in the core library. In your case, you want to write a collection class, and it is quite customary for a collection class to offer an alternative factory method called [], so that I can call List[1, 2, 3] instead of List.new(1, 2, 3).
Just as a side note: one obvious advantage of using normal methods for object construction is that you can construct instances of anonymous classes. This is not possible in Java, for absolutely no sensible reason whatsoever. The only reason why it doesn't work is that the constructor has the same name as the class, and anonymous classes don't have a name, ergo there cannot be a constructor.
Although I am not quite sure why you would need to run anything before object creation. Unless I am missing something, shouldn't a list basically be
class List
def initialize(head=nil, *tail)
#head = head
#tail = List.new(*tail) unless tail.empty?
end
end
for a Lisp-style cons-list or
class List
def initialize(*elems)
elems.map! {|el| Element.new(el)}
elems.zip(elems.drop(1)) {|prv, nxt| prv.instance_variable_set(:#next, nxt)}
#head = elems.first
end
class Element
def initialize(this)
#this = this
end
end
end
for a simple linked list?
You can simply initialize your class variables in the class body, outside of any method declaration. It will behave like a static initializer in Java:
class Klass
##foo = "bar"
def sayFoo
puts ##foo
end
def self.sayFoo
puts ##foo
end
end
The class field ##foo is here initialized to "bar".
In ruby object creation works like this
class Class
def new(*args)
obj= self.allocate # get some memory
obj.send(:initialize) # call the private method initialize
end
end
Object#initialize is just an ordinary private method.
If you wan't something to happen before Object#initialize you have to write your own Class#new. But I see no reason why you would want to do that.
This is basically the same answer paradigmatic gave back in '09.
Here I want to illustrate that the "static initializer" can call other code. I'm simulating a scenario of loading a special user once, upon class initialization.
class Foo
def user
"Thomas"
end
end
class Bar
##my_user = Foo.new.user
def my_statically_defined_user
##my_user
end
end
b = Bar.new
puts b.my_statically_defined_user # ==> Thomas
Related
Some open source code I'm integrating in my application has some classes that include code to that effect:
class SomeClass < SomeParentClass
def self.new(options = {})
super().tap { |o|
# do something with `o` according to `options`
}
end
def initialize(options = {})
# initialize some data according to `options`
end
end
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction", and it looks to me like a horrible pattern to use - why split up the object initialization into two parts where one is obviously "The Wrong Think(tm)"?
Ideally, I'd like to see what is inside the super().tap { |o| block, because although this looks like bad practice, just maybe there is some interaction required before or after initialize is called.
Without context, it is possible that you are just looking at something that works but is not considered good practice in Ruby.
However, maybe the approach of separate self.new and initialize methods allows the framework designer to implement a subclass-able part of the framework and still ensure setup required for the framework is completed without slightly awkward documentation that requires a specific use of super(). It would be a slightly easier to document and cleaner-looking API if the end user gets functionality they expect with just the subclass class MyClass < FrameworkClass and without some additional note like:
When you implement the subclass initialize, remember to put super at the start, otherwise the magic won't work
. . . personally I'd find that design questionable, but I think there would at least be a clear motivation.
There might be deeper Ruby language reasons to have code run in a custom self.new block - for instance it may allow constructor to switch or alter the specific object (even returning an object of a different class) before returning it. However, I have very rarely seen such things done in practice, there is nearly always some other way of achieving the goals of such code without customising new.
Examples of custom/different Class.new methods raised in the comments:
Struct.new which can optionally take a class name and return objects of that dynamically created class.
In-table inheritance for ActiveRecord, which allows end user to load an object of unknown class from a table and receive the right object.
The latter one could possibly be avoided with a different ORM design for inheritance (although all such schemes have pros/cons).
The first one (Structs) is core to the language, so has to work like that now (although the designers could have chosen a different method name).
It's impossible to tell why that code is there without seeing the rest of the code.
However, there is something in your question I want to address:
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction"
They do not do the same thing.
Object construction in Ruby is performed in two steps: Class#allocate allocates a new empty object from the object space and sets its internal class pointer to self. Then, you initialize the empty object with some default values. Customarily, this initialization is performed by a method called initialize, but that is just a convention; the method can be called anything you like.
There is an additional helper method called Class#new which does nothing but perform the two steps in sequence, for the programmer's convenience:
class Class
def new(*args, &block)
obj = allocate
obj.send(:initialize, *args, &block)
obj
end
def allocate
obj = __MagicVM__.__allocate_an_empty_object_from_the_object_space__
obj.__set_internal_class_pointer__(self)
obj
end
end
class BasicObject
private def initialize(*) end
end
The constructor new has to be a class method since you start from where there is no instance; you can't be calling that method on a particular instance. On the other hand, an initialization routine initialize is better defined as an instance method because you want to do something specifically with a certain instance. Hence, Ruby is designed to internally call the instance method initialize on a new instance right after its creation by the class method new.
I am new to Ruby and I saw methods defined like:
def method_one
puts "method 1"
end
class MyClass
method_one
def method_two
puts "method 2"
end
end
The way method_one is used reminds me of Python decorators.The output of
c = MyClass.new
c.method_two
is
method 1
method 2
I have been trying to search for more information about this syntax/language feature in the Ruby documentation on the web but I don't know what keywords to search for.
What this is thing called?
TL;DR
This code doesn't do what you think it does. Don't do stuff like this.
Ruby's Top-Level Object
Ruby lets you define methods outside a class. These methods exist on a top-level object, which you can (generally) treat as a sort of catch-all namespace. You can see various posts like What is the Ruby Top-Level? for more details, but you shouldn't really need to care.
In your original post, method_one is just a method defined in the top-level. It is therefore available to classes and methods nested within the top-level, such as MyClass.
Methods in Classes
Despite what you think, the following doesn't actually declare a :method_one class or instance method on MyClass:
class MyClass
method_one
def method_two; end
end
Instead, Ruby calls the top-level ::method_one during the definition of the class, but it never becomes a class method (e.g. MyClass::method_one) or an instance method (e.g. MyClass.new.method_one). There might be a few use cases for doing this (e.g. printing debugging information, test injection, etc.) but it's confusing, error-prone, and generally to be avoided unless you have a really strong use case for it.
Better Options
In general, when you see something like this outside an academic lesson, the programmer probably meant to do one of the following:
Extend a class.
Add a singleton method to a class.
Include a module in a class.
Set up a closure during class definition.
The last gets into murky areas of metaprogramming, at which point you should probably be looking at updating your class initializer, or passing Proc or lambda objects around instead. Ruby lets you do all sorts of weird and wonderful things, but that doesn't mean you should.
I think you're a little mislead; the output of:
c = MyClass.new
c.method_two
is
#<MyClass:0x007feda41acf18>
"method 2"
You're not going to see method one until the class is loaded or if you're in IRB you enter the last end statement.
I would suggest looking into ruby's initialize method.
This class takes in a hash, and depending on the input, it converts temperatures.
class Temp
def initialize(opt={})
if opt.include?(:cold)
#colddegree=opt[:cold]
end
end
def self.from_cold(cel)
Temp.new(:cold => cel) <= instance of class created in class method
end
end
An instance of a class is created inside a class method. Why is it necessary to do so, and what it does it do, what is the reasoning behind it?
Why would we need to create an instance of a class inside the class instead of the main?
Why would it be used inside a class method? Can there be a time when it would be required inside a regular object methods?
What is it calling and what is happening when it is creating an instance inside a class method? what difference does it make?
Rubyists don't always use the word, but self.from_cold is a factory. This allows you to expose a Temp.from_cold(-40) method signature that programmers consuming your API can understand readily without having to concern themselves with the boilerplate of, say, learning that you have an implicitly required parameter named :cold.
It becomes extra useful when you have a work-performing object that needs to be initialized and then invoked, such as TempConverter.new(cel: -40).to_fahrenheit. Sometimes it's cleaner to expose a TempConverter.cel_to_fahr(-40) option to be consumed by other libraries. It's mostly just a way of hiding complexity inside of this class so that other classes with temp conversion needs don't have to violate the Law of Demeter.
An important thing to understand is that unlike C#, JavaScript, or C++, new is not a keyword in Ruby. It's just a message which objects of class Class understand. The default behaviour is to allocate and initialize a new object of that class, but there's nothing stopping you overriding it, for example:
class Foo
def self.new
puts "OMG i'm initializing an object"
super
end
end
So to answer your last question, it makes no difference where Temp.new is called. In all cases, it sends the message new to the object Temp (which is also a class, but remember that almost everything in Ruby is an object, including classes), which creates and returns a new instance.
I'm not going to attempt to answer your other two questions, because the other answer already does.
Do I need to explicitly initialize an object if an initialize method is included in class definition?
No, Ruby does not call initialize automatically.
The default implementation of Class#new looks a bit like this:
class Class
def new(*args, &block)
obj = allocate
obj.initialize(*args, &block)
obj
end
end
[Actually, initialize is private by default so you need to use obj.send(:initialize, *args, &block).]
So, the default implementation of Class#new does call initialize, but it would be perfectly possible (albeit extremely stupid) to override or overwrite it with an implementation that does not.
So, it's not Ruby that calls initialize, it's Class#new. You may think that's splitting hairs, because Class#new is an integral part of Ruby, but the important thing here is: it's not some kind of language magic. It's a method like any other, and like any other method it can be overridden or overwritten to do something completely different.
And, of course, if you don't use new to create an object but instead do it manually with allocate, then initialize wouldn't be called either.
There are some cases where objects are created without calling initialize. E.g. when duping or cloneing, initialize_dup and initialize_clone are called instead of initialize (both of which, in turn, call initialize_copy). And when deserializing an object via, say, Marshal, its internal state is reconstructed directly (i.e. the instance variables are set reflectively) instead of through initialize.
Yes, it's called from new method, which you use to create objects.
It depends on your definition of "explicit". Usually you need to, even if there are no arguments:
object = MyClass.new(...)
In some cases there are factory methods that produce instances you can use, creating a form of implicit initialization:
object = MyClass.factory_method(...)
This would have the effect of calling MyObject.new internally.
There are some libraries which have rather unusual method signatures, like:
object = MyClass(...)
object = MyClass[...]
The effect is the same, as these might look odd but are just method calls.
In Ruby, when making a new class, we will define the constructor method like so:
class Thing
def initialize
do_stuff
end
end
However, when actually making an instance of the object, we find ourselves not calling initialize on the instance but new on the class.
That being the case, why don't we instead define ::new?
class Thing
def self.new
do_stuff
end
end
Is there something ::new does beind the scenes that initalize doesn't define? Are those two different at all? Would defining ::new work? Or is it just that def initialize is shorter (not) than def self.new?
I'm thinking there must be a good reason for the disparity.
New allocates space for the new object and creates it. It then calls the Objects initialize method to create a new Object using the allocated memory. Usually, the only part you want to customize is the actual creation, and are happy to leave the behind memory allocation to the Object.new method, so you write an initialize method. What new is doing under the hood looks something like this (except in C):
class Object
def self.new(*args, &block)
object = allocate
object.send(:initialize, *args, &block)
return object
end
end
So when you call Object.new, what actually happens is:
1) Memory is allocated
2) The objects initialize method is called.
To provide access to instance variables.
Instance variables, such as #value, are accessible from instances only but not from within a class method. This differs from languages such as Java where private instance variables have class rather than instance scope and are thus accessible from the static constructor.
class Thing
def initialize
#value = 42
end
end
class Thing
def self.new
# no way to set the value of #value !!!!!!!!
end
end
For those interested in the history of Ruby, the object model with instance-private instance variables goes back to Smalltalk. And you'll find the same pattern in modern Smalltalk dialect like Pharo, new is implemented in Object to call self initialize such that subclasses can easily initialize instance variables.
To elaborate on Abraham's point, in effect, that is reversing the wrapping relation. If you had allocate as a primitive and usually define new, then you would always have to do common things like calling allocate and returning the object at the end, which is a redundant thing to do. By having new and initialize, the former will wrap the latter, so you only need to define what is wrapped, not the wrapper.
new is a class method, so when you define that, you do not have access the instance methods and instance variables by default, and you need to rely on accessors. On the other hand, initialize is an instance method, so it will be easier to work with instance variables and instance methods.