What does left shift mean in Ruby?

Can anyone explain what the "shift left" syntax in Ruby means?
For instance, I have this
File.open( folder, 'w' ){ |f| f << datavalue }
I know that it means to write each datavalue to folder, but the |f| f << datavalue part does not make sense to me. Why is f inside the braces, and how does it relate to shifting left and writing datavalue to folder?
Basically, I don't understand the meaning of this line:
{ |f| f << datavalue }

File.open( folder, 'w' ){ |f| f << datavalue } is the same as writing:
File.open( folder, 'w' ) do |f|
f << datavalue
end
Both are examples of Ruby block notation. Blocks in Ruby are anonymous methods. The variables the block expects are declared between vertical bars. In this case the variable f represents the file object that File.open passes to the block.
As for the << operator: here it is being used to append. I believe it is called the append operator when used on objects such as strings, arrays, or, in this case, a file. The exception is numeric objects, where it is the shift-left operator that shifts the bits of a number.

some_text = "world!"
hello = "Hello, "
hello << some_text
puts hello # prints "Hello, world!"
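To tie the two halves of this answer together, here is a minimal sketch of the block form writing to a file. It assumes a throwaway file created with Tempfile; the method name write_with_shovel and the sample values are made up for illustration. It relies on IO#<< returning the file object itself, which is why the calls can be chained.

```ruby
require "tempfile"

# Hypothetical helper: writes each value with << and returns the file contents.
def write_with_shovel(*values)
  Tempfile.create("shovel_demo") do |tmp|
    File.open(tmp.path, "w") do |f|
      # f << v returns f, so a second << can follow immediately.
      # Non-string values are converted with to_s before being written.
      values.each { |v| f << v << "\n" }
    end
    File.read(tmp.path)
  end
end

puts write_with_shovel("alpha", 42)
```

The block variable f is only valid inside the block; File.open closes the file when the block finishes.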

The answer is: it depends. I don't want to nitpick, but Ruby hardly has any real operators. In Ruby most "operators" are in fact methods, which is possible because in Ruby everything is an object.
E.g. consider this code
o.x = a + b
There are no operators here because in fact this is only an alternative way for writing this code
o.x=(a.+(b))
And x= is the name of a setter method and + is also just the name of a method of the object a. In Ruby characters that are operators in other languages can be used as a part of method name (just think of ? which is commonly used in Ruby method names).
So this code
a = b << c
Is in fact the same as writing
a = b.<<(c)
So what << does depends on how b implements this method.
E.g. for a String the << method means append.
a = "Hello, " << "Word"
# a == "Hello, Word"
But in the case of a Fixnum (Integer since Ruby 2.4) the << method just means shift left:
a = 5 << 2
# a == 20
So there is no single answer to what << means; you have to look up in the documentation what it means for the object you are calling this method on. If you write your own class, you can implement this operator in any way you like:
class MyClass
# If you prefer, can also be written as
# def << x
def << ( x )
# do something with x
end
end
o = MyClass.new()
x = o << a
Your method << is called and you decide what it does with a.
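As a slightly more concrete sketch of the same idea, the class below gives << an "append" meaning and returns self so that calls can be chained, following the convention used by Array and String. The class name Playlist and its tracks attribute are hypothetical, chosen only for illustration.

```ruby
# Hypothetical class, made up for illustration.
class Playlist
  attr_reader :tracks

  def initialize
    @tracks = []
  end

  # Returning self is what lets callers write: playlist << a << b
  def <<(track)
    @tracks << track
    self
  end
end

playlist = Playlist.new
playlist << "Intro" << "Outro"
p playlist.tracks
```

If << returned something else (for example the track), the second << in the chain would be sent to that object instead of the playlist.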

Related

Why does double shovel in Ruby not mutate state?

I ran into this weird side effect that caused a bug or confusion. So imagine that this isn't a trivial example but an example of a gotcha perhaps.
name = "Zorg"
def say_hello(name)
greeting = "Hi there, " << name << "?"
puts greeting
end
say_hello(name)
puts name
# Hi there, Zorg?
# Zorg
This doesn't mutate name. Name is still Zorg.
But now look at a very subtle difference. in this next example:
name = "Zorg"
def say_hello(name)
greeting = name << "?"
puts "Hi there, #{greeting}"
end
say_hello(name)
puts name
# Hi there, Zorg?
# Zorg? <-- name has mutated
Now name is Zorg?. Crazy. Very subtle difference in the greeting = assignment. Ruby is doing something internally with the parsing (?) or message passing chaining? I thought this would just chain the shovels like name.<<("?") along but I guess this isn't happening.
This is why I avoid the shovel operator when trying to do concatenation. I generally try to avoid mutating state when I can but Ruby (currently) isn't optimized for this (yet). Maybe Ruby 3 will change things. Sorry for scope-creep / side discussion about the future of Ruby.
I think this is particularly weird since the example with less side effects (first one) has two shovel operators where the example with more side effects has fewer shovel operators.
Update
You are correct DigitalRoss, I'm making it too complicated.
Reduced example:
one = "1"
two = "2"
three = "3"
message = one << two << three
Now what do you think everything is set to? (no peeking!)
If I had to guess I'd say:
one is 123
two is 23
three is 3
message is 123
But I'd be wrong about two. Two is 2.
If we convert your a << b << c construct to a more method-ish form and throw in a bunch of implicit parentheses the behavior should be clearer. Rewriting:
greeting = "Hi there, " << name << "?"
yields:
greeting = ("Hi there, ".<<(name)).<<("?")
String#<< is modifying things, but name never appears as the target/LHS of <<; the "Hi there, " << name string does, but name doesn't. If you replace the first string literal with a variable:
hi_there = 'Hi there, '
greeting = hi_there << name << '?'
puts hi_there
you'll see that << changed hi_there; in your "Hi there, " case, this change was hidden because you were modifying something (a string literal) that you couldn't look at afterwards.
You are making it too complicated.
The operator returns the left-hand side and so in the first case it's just reading name (because "Hi there, " << name is evaluated first) but in the second example it is writing it.
Now, many Ruby operators are right-associative, but << is not one of them. See: https://stackoverflow.com/a/21060235/140740
The right-hand side of your = is evaluated left to right.
When you are doing
"Hello" << name << "?"
The operation starts with "Hello", adds name to it, then adds "?" to the mutated "Hello".
When you do
name << "?"
The operation starts with name and adds "?" to it, mutating name (which exists outside the internal scope of the method).
So in your example of one << two << three, you are mutating only one.
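The behaviour described in these answers can be checked with a short sketch. The method name shovel_chain is made up for illustration, and String.new is used only so the example does not depend on how string literals are treated by a particular Ruby version. The second method shows one way to build the greeting without touching the argument.

```ruby
# Returns the state of every variable after the chained shovel.
def shovel_chain
  one = String.new("1")
  two = String.new("2")
  three = String.new("3")
  message = one << two << three   # evaluated as (one << two) << three
  { one: one, two: two, three: three, message_is_one: message.equal?(one) }
end

# Interpolation builds a new string, so the caller's string is left alone.
def say_hello(name)
  "Hi there, #{name}?"
end

p shovel_chain
name = String.new("Zorg")
puts say_hello(name)
puts name
```

Only the leftmost receiver is mutated, and message is the very same object as one, not a copy.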

Appending strings using << does not work as expected, but using + does

I have the following code which is causing me problems around the line I've marked.
arr = 'I wish I may I wish I might'.split
dictionary = Hash.new
arr.each_with_index do |word, index|
break if arr[index + 2] == nil
key = word << " " << arr[index + 1] #This is the problem line
value = arr[index + 2]
dictionary.merge!( { key => value } ) { |key, v1, v2| [v1] << v2 }
end
puts dictionary
Running this code, I would expect the following output:
{"I wish"=>["I", "I"], "wish I"=>["may", "might"], "I may"=>"I", "may I"=>"wish"}
However, what I instead get is
{"I wish"=>["I may", "I"], "wish I"=>["may I", "might"], "I may"=>"I wish", "may I"=>"wish I"}
I've found that if I replace the problem line with
key = word + " " + arr[index + 1]
Everything works as expected. What is it about the first version of my line that was causing the unexpected behaviour?
The String#<< method modifies the original object on which it is called.
Here that is the object referred to by your word variable which is just
another reference to one of the Strings in the arr Array. You can see this
effect with the code:
a = 'Hello'
b = a << ' ' << 'World'
puts a.__id__
puts b.__id__
So when you use that method in one pass through the iterator it affects the
following passes as well.
On the other hand the String#+ method creates a new String object to hold
the combined strings. With this method one pass through the iterator has no
effect on other passes.
key = word << " " << arr[index + 1]
The problem is that String#<< performs an in-place operation so the string is modified the next time it's used. On the other hand String#+ returns a new copy.
You have been bitten by an imperative side effect, which is not unusual, since side effects are a huge source of bugs. Unless there are very compelling performance reasons, a functional approach yields better code. For example, this is how it could be written using each_cons and map_by from Facets:
words = 'I wish I may I wish I might'.split
dictionary = words.each_cons(3).map_by do |word1, word2, word3|
["#{word1} #{word2}", word3]
end
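For readers who do not have Facets installed, the same idea can be sketched with core Ruby only. This is not the Facets implementation; build_dictionary is a hypothetical name, and it assumes that storing every value as an array is acceptable (the expected output in the question keeps single values as bare strings).

```ruby
# Maps each pair of adjacent words to the list of words that follow the pair.
def build_dictionary(text)
  dictionary = Hash.new { |hash, key| hash[key] = [] }
  text.split.each_cons(3) do |word1, word2, word3|
    # Interpolation creates a fresh key string, so no word is mutated.
    dictionary["#{word1} #{word2}"] << word3
  end
  dictionary
end

p build_dictionary("I wish I may I wish I might")
```

Here << is only ever applied to the arrays owned by the hash, never to the words produced by split, so one iteration cannot leak into the next.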

Clarification on the Ruby << Operator

I am quite new to Ruby and am wondering about the << operator. When I googled this operator, it says that it is a Binary Left Shift Operator given this example:
a << 2 will give 240 which is 1111 0000
however, it does not seem to be a "Binary Left Shift Operator" in this code:
class TextCompressor
  attr_reader :unique, :index
  def initialize(text)
    @unique = []
    @index = []
    add_text(text)
  end
  def add_text(text)
    words = text.split
    words.each { |word| add_word(word) }
  end
  def add_word(word)
    i = unique_index_of(word) || add_unique_word(word)
    @index << i
  end
  def unique_index_of(word)
    @unique.index(word)
  end
  def add_unique_word(word)
    @unique << word
    unique.size - 1
  end
end
and this question does not seem to apply in the code I have given. So with the code I have, how does the Ruby << operator work?
Ruby is an object-oriented language. The fundamental principle of object orientation is that objects send messages to other objects, and the receiver of the message can respond to the message in whatever way it sees fit. So,
a << b
means whatever a decides it should mean. It's impossible to say what << means without knowing what a is.
As a general convention, << in Ruby means "append", i.e. it appends its argument to its receiver and then returns the receiver. So, for Array it appends the argument to the array, for String it performs string concatenation, for Set it adds the argument to the set, for IO it writes to the file descriptor, and so on.
As a special case, for Fixnum and Bignum, it performs a bitwise left-shift of the twos-complement representation of the Integer. This is mainly because that's what it does in C, and Ruby is influenced by C.
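The convention described above can be seen side by side in a small sketch. The method name shovel_results is hypothetical, and StringIO stands in for a real file so the example has no side effects on disk.

```ruby
require "set"
require "stringio"

# Collects what << does for several core receivers.
def shovel_results
  io = StringIO.new
  io << "written " << "to IO"            # IO-style objects: write, return self
  {
    array: [1, 2] << 3,                   # append an element
    string: String.new("ab") << "c",      # append to the string in place
    set: (Set.new([1]) << 2).to_a,        # add a member
    io: io.string,
    integer: 5 << 2                       # bitwise shift: 5 * 2**2
  }
end

p shovel_results
```

In every case except the integer one, the receiver is modified and returned; integers are immutable, so the shift produces a new number.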
<< is just a method. It usually means "append" in some sense, but can mean anything. For strings and arrays it means append/add. For integers it's bitwise shift.
Try this:
class Foo
def << (message)
print "hello " + message
end
end
f = Foo.new
f << "john" # => hello john
In Ruby, operators are just methods. Depending on the class of your variable, << can do different things:
# For integers it means bitwise left shift:
5 << 1 # gives 10
17 << 3 # gives 136
# arrays and strings, it means append:
"hello, " << "world" # gives "hello, world"
[1, 2, 3] << 4 # gives [1, 2, 3, 4]
It all depends on what the class defines << to be.
<< is an operator that is syntactic sugar for calling the << method on the given object. On Fixnum it is defined to bitshift left, but it has different meanings depending on the class it's defined on. For example, for Array it adds (or, rather, "shovels") the object into the array.
We can see here that << is indeed just syntactic sugar for a method call:
[] << 1 # => [1]
[].<<(1) # => [1]
and thus in your case it just calls << on @unique, which in this case is an Array.
The << function, according to http://ruby-doc.org/core-1.9.3/Array.html#method-i-3C-3C, is an append function. It appends the passed-in value to the array and then returns the array itself. Ruby objects can often have functions defined on them that, in other languages, would look like an operator.

What does << mean in Ruby?

I have code:
def make_all_thumbs(source)
sizes = ['1000','1100','1200','800','600']
threads = []
sizes.each do |s|
threads << Thread.new(s) {
create_thumbnail(source+'.png', source+'-'+s+'.png', s)
}
end
end
what does << mean?
It can have 3 distinct meanings:
'<<' as an ordinary method
In most cases '<<' is a method defined like the rest of them; in your case it means "add to the end of this array".
That's in your particular case, but there are also a lot of other occasions where you'll encounter the "<<" method. I won't call it 'operator' since it's really a method that is defined on some object that can be overridden by you or implemented for your own objects. Other cases of '<<'
String concatenation: "a" << "b"
Writing output to an IO: io << "A line of text\n"
Writing data to a message digest, HMAC or cipher: sha << "Text to be hashed"
left-shifting of an OpenSSL::BN: bn << 2
...
Singleton class definition
Then there is the mysterious shift of the current scope (=change of self) within the program flow:
class A
class << self
puts self # self is the singleton class of A
end
end
a = A.new
class << a
puts self # now it's the singleton class of object a
end
The mystery class << self made me wonder and investigate about the internals there. Whereas in all the examples I mentioned << is really a method defined in a class, i.e.
obj << stuff
is equivalent to
obj.<<(stuff)
the class << self (or any object in place of self) construct is truly different. It is really a builtin feature of the language itself, in CRuby it's defined in parse.y as
k_class tLSHFT expr
k_class is the 'class' keyword, where tLSHFT is a '<<' token and expr is an arbitrary expression. That is, you can actually write
class << <any expression>
and will get shifted into the singleton class of the result of the expression. The tLSHFT sequence will be parsed as a 'NODE_SCLASS' expression, which is called a Singleton Class definition (cf. node.c)
case NODE_SCLASS:
ANN("singleton class definition");
ANN("format: class << [nd_recv]; [nd_body]; end");
ANN("example: class << obj; ..; end");
F_NODE(nd_recv, "receiver");
LAST_NODE;
F_NODE(nd_body, "singleton class definition");
break;
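Seen from the Ruby side, the effect of this construct is that methods defined inside it land on the singleton class of whatever the expression evaluates to. A minimal sketch, with the hypothetical names Greeter, default_greeting, build_robot, and beep:

```ruby
class Greeter
  class << self
    # Defined on Greeter's singleton class, so it is called as Greeter.default_greeting
    def default_greeting
      "hello"
    end
  end
end

# Opens the singleton class of one particular object.
def build_robot
  robot = Object.new
  class << robot
    def beep
      "beep"
    end
  end
  robot
end

puts Greeter.default_greeting
puts build_robot.beep
```

Other instances of Object do not gain beep; only the one object whose singleton class was opened responds to it.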
Here Documents
Here Documents use '<<' in a way that is again totally different. You can define a string that spans over multiple lines conveniently by declaring
here_doc = <<_EOS_
The quick brown fox jumps over the lazy dog.
...
_EOS_
To distinguish the 'here doc operator', an arbitrary String delimiter has to immediately follow the '<<'. Everything in between that initial delimiter and the second occurrence of that same delimiter will be part of the final string. It is also possible to use '<<-'; the difference is that the closing delimiter may then be indented, while the whitespace inside the body is kept as written. Since Ruby 2.3 there is also '<<~', which strips the common leading indentation from the body.
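A short sketch of two indented here-document forms, using made-up method names. The dash form lets the closing delimiter be indented, and the squiggly form (Ruby 2.3 and later) additionally removes the common leading indentation from the body.

```ruby
# '<<-' allows an indented terminator but keeps the body's whitespace.
def dash_heredoc
  <<-EOS
    keeps its indentation
  EOS
end

# '<<~' strips the smallest common indentation from every body line.
def squiggly_heredoc
  <<~EOS
    first line
      nested line
  EOS
end

p dash_heredoc
p squiggly_heredoc
```

Relative indentation survives in the squiggly form, which is why "nested line" still starts with two spaces.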
Mostly used in arrays to append the value to the end of the array.
a = ["orange"]
a << "apple"
p a
prints ["orange", "apple"].
'a << b' means append b to the end of a
It's the operator which allows you to feed existing arrays, by appending new items.
In the example above you are just populating the empty array threads with 5 new threads.
In Ruby there is always more than one way to do things, and Ruby has some nice shortcuts for common method names. This one is a shortcut for .push: instead of typing out the .push method name, you can simply use <<, the concatenation operator. In some cases, .push, +, and << can be used for the same kind of operation.
Like you can see in this example:
alphabet = ["a", "b", "c"]
alphabet << "d" # Update me!
alphabet.push("e") # Update me!
print alphabet
caption = "the boy is surrounded by "
caption << "weezards!" # Me, too!
caption += " and more. " # Me, too!
# .push cannot be used to concatenate strings
print caption
so you see the result is:
["a", "b", "c", "d", "e"]
the boy is surrounded by weezards! and more.
You can use the << operator to push an element into an array or to concatenate one string to another.
So what this is doing is creating a new object of type Thread and pushing it into the array.
threads << Thread.new(s) {
create_thumbnail(source+'.png', source+'-'+s+'.png', s)
}
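Collecting the threads in an array is usually done so that they can be waited on afterwards. The sketch below assumes a stand-in for create_thumbnail that just builds a file name; run_in_threads is a hypothetical helper, not part of the original code.

```ruby
# Starts one thread per size, collects them with <<, then waits for all of them.
def run_in_threads(sizes)
  threads = []
  sizes.each do |s|
    # Thread.new(s) passes s into the block as its own local variable.
    threads << Thread.new(s) { |size| "thumb-#{size}.png" }
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value)
end

p run_in_threads(%w[1000 800 600])
```

Without a join (or value) on each thread, the program could exit before the threads have finished their work.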
In Ruby, the '<<' operator is basically used for:
Appending a value to an array (at the last position)
[2, 4, 6] << 8
It will give [2, 4, 6, 8]
It is also used for some Active Record operations in Ruby. For example, suppose we have Cart and LineItem models associated so that a cart has_many line_items. Cart.find(A).line_items will return an ActiveRecord::Associations object with the line items that belong to cart 'A'.
Now, to add (or associate) another line_item (X) to cart (A):
Cart.find(A).line_items << LineItem.find(X)
Now to add another LineItem to the same cart 'A', this time without manually creating and saving the Active Record object beforehand:
Cart.find(A).line_items << LineItem.new
In the code above, << will save the object and append it to the Active Record association on the left side.
And many others which are already covered in the answers above.
Also, since Ruby 2.6, the << method is defined on Proc as well.
Proc#<< allows composing two or more procs.
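A minimal sketch of proc composition, with the hypothetical names composed_results, double, and increment. With f << g, the right-hand proc g runs first and its result is passed to f; Proc#>> composes in the opposite order.

```ruby
# Shows the call order of << and >> on procs (requires Ruby 2.6 or later).
def composed_results(x)
  double = ->(n) { n * 2 }
  increment = ->(n) { n + 1 }
  {
    shovel: (double << increment).call(x),   # double.(increment.(x))
    reverse: (double >> increment).call(x)   # increment.(double.(x))
  }
end

p composed_results(5)
```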
It means add to the end (append).
a = [1,2,3]
a << 4
# a is now [1, 2, 3, 4]

Implicit return values in Ruby

I am somewhat new to Ruby and although I find it to be a very intuitive language I am having some difficulty understanding how implicit return values behave.
I am working on a small program to grep Tomcat logs and generate pipe-delimited CSV files from the pertinent data. Here is a simplified example that I'm using to generate the lines from a log entry.
class LineMatcher
class << self
def match(line, regex)
output = ""
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
return output
end
end
end
puts LineMatcher.match("00:00:13,207 06/18 INFO stateLogger - TerminationRequest[accountId=AccountId#66679198[accountNumber=0951714636005,srNumber=20]",
/^(\d{2}:\d{2}:\d{2},\d{3}).*?(\d{2}\/\d{2}).*?\[accountNumber=(\d*?),srNumber=(\d*?)\]/)
When I run this code I get back the following, which is what is expected when explicitly returning the value of output.
00:00:13,207|06/18|0951714636005|20
However, if I change LineMatcher to the following and don't explicitly return output:
class LineMatcher
class << self
def match(line, regex)
output = ""
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
end
end
end
Then I get the following result:
00:00:13,207
06/18
0951714636005
20
Obviously, this is not the desired outcome. It feels like I should be able to get rid of the output variable, but it's unclear where the return value is coming from. Also, any other suggestions/improvements for readability are welcome.
Any statement in ruby returns the value of the last evaluated expression.
You need to know the implementation and the behavior of the most used method in order to exactly know how your program will act.
#each returns the collection you iterated on. That said, the following code will return the value of line.scan(regex).
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
If you want to return the result of the execution, you can use map, which works like each but returns a new array containing the block's results.
class LineMatcher
class << self
def match(line, regex)
line.scan(regex).map do |matched|
matched.join("|")
end.join("\n") # remember the final join
end
end
end
There are several useful methods you can use depending on your very specific case. In this one you might want to use inject unless the number of results returned by scan is high (working on arrays then merging them is more efficient than working on a single string).
class LineMatcher
class << self
def match(line, regex)
line.scan(regex).inject("") do |output, matched|
output << matched.join("|") << "\n"
end
end
end
end
In Ruby the return value of a method is the value returned by the last statement. You can opt to have an explicit return too.
In your example, the first snippet returns the string output. The second snippet, however, returns the value returned by the each method (which is now the last statement), which turns out to be an array of matches.
irb(main):014:0> "StackOverflow Meta".scan(/[aeiou]\w/).each do |match|
irb(main):015:1* s << match
irb(main):016:1> end
=> ["ac", "er", "ow", "et"]
Update: However that still doesn't explain your output on a single line. I think it's a formatting error, it should print each of the matches on a different line because that's how puts prints an array. A little code can explain it better than me..
irb(main):003:0> one_to_three = (1..3).to_a
=> [1, 2, 3]
irb(main):004:0> puts one_to_three
1
2
3
=> nil
Personally I find your method with the explicit return more readable (in this case)
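The difference between the two versions can be reproduced without the log-parsing details. In this sketch the method names and the rows argument (an array of arrays, like the result of scan with capture groups) are assumptions made for illustration.

```ruby
# The last expression is the each call, so the method returns rows itself.
def each_result(rows)
  output = String.new
  rows.each { |row| output << row.join("|") << "\n" }
end

# The last expression is the joined string, so no accumulator is needed.
def joined_result(rows)
  rows.map { |row| row.join("|") }.join("\n")
end

rows = [["00:00:13,207", "06/18"], ["a", "b"]]
p each_result(rows)
puts joined_result(rows)
```

The output string built inside each_result is simply discarded, which is what happened in the version of match without the explicit return.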
