Are strings mutable in Ruby? - ruby

Are Strings mutable in Ruby? According to the documentation doing
str = "hello"
str = str + " world"
creates a new string object with the value "hello world" but when we do
str = "hello"
str << " world"
It does not mention that it creates a new object, so does it mutate the str object, which will now have the value "hello world"?

Yes, << mutates the same object, and + creates a new one. Demonstration:
irb(main):011:0> str = "hello"
=> "hello"
irb(main):012:0> str.object_id
=> 22269036
irb(main):013:0> str << " world"
=> "hello world"
irb(main):014:0> str.object_id
=> 22269036
irb(main):015:0> str = str + " world"
=> "hello world world"
irb(main):016:0> str.object_id
=> 21462360
irb(main):017:0>

Just to complement, one implication of this mutability is seem below:
ruby-1.9.2-p0 :001 > str = "foo"
=> "foo"
ruby-1.9.2-p0 :002 > ref = str
=> "foo"
ruby-1.9.2-p0 :003 > str = str + "bar"
=> "foobar"
ruby-1.9.2-p0 :004 > str
=> "foobar"
ruby-1.9.2-p0 :005 > ref
=> "foo"
and
ruby-1.9.2-p0 :001 > str = "foo"
=> "foo"
ruby-1.9.2-p0 :002 > ref = str
=> "foo"
ruby-1.9.2-p0 :003 > str << "bar"
=> "foobar"
ruby-1.9.2-p0 :004 > str
=> "foobar"
ruby-1.9.2-p0 :005 > ref
=> "foobar"
So, you should choose wisely the methods you use with strings in order to avoid unexpected behavior.
Also, if you want something immutable and unique throughout your application you should go with symbols:
ruby-1.9.2-p0 :001 > "foo" == "foo"
=> true
ruby-1.9.2-p0 :002 > "foo".object_id == "foo".object_id
=> false
ruby-1.9.2-p0 :003 > :foo == :foo
=> true
ruby-1.9.2-p0 :004 > :foo.object_id == :foo.object_id
=> true

While above answers are perfect, Just adding this answer for future readers.
In most languages, string literals are also immutable, just like
numbers and symbols. In Ruby versions that are less than 3, all
strings are mutable by default. This changed in Ruby 3. Now all string
literals are immutable by default in Ruby 3++ versions.

Related

Why a dangerous method doesn't work with a character element of String in Ruby?

When I apply the upcase! method I get:
a="hello"
a.upcase!
a # Shows "HELLO"
But in this other case:
b="hello"
b[0].upcase!
b[0] # Shows h
b # Shows hello
I don't understand why the upcase! applied to b[0] doesn't have any efect.
b[0] returns a new String every time. Check out the object id:
b = 'hello'
# => "hello"
b[0].object_id
# => 1640520
b[0].object_id
# => 25290780
b[0].object_id
# => 24940620
When you are selecting an individual character in a string, you're not referencing the specific character, you're calling a accessor/mutator function which performs the evaluation:
2.0.0-p643 :001 > hello = "ruby"
=> "ruby"
2.0.0-p643 :002 > hello[0] = "R"
=> "R"
2.0.0-p643 :003 > hello
=> "Ruby"
In the case when you run a dangerous method, the value is requested by the accessor, then it's manipulated and the new variable is updated, but because there is no longer a connection between the character and the string, it will not update the reference.
2.0.0-p643 :004 > hello = "ruby"
=> "ruby"
2.0.0-p643 :005 > hello[0].upcase!
=> "R"
2.0.0-p643 :006 > hello
=> "ruby"

Print memory address for Ruby array

irb> class A; end
=> nil
irb> a=A.new
=> "#<A:0x3094638>"
irb> a.inspect
=> "#<A:0x3094638>"
irb> b=[]
=> []
irb> b.inspect
=> "[]"
How to get memory address of an array object?
Use the method Object#object_id.
Returns an integer identifier for obj. The same number will be returned on all calls to id for a given object, and no two active objects will share an id. #object_id is a different concept from the :name notation, which returns the symbol id of name.
Example :-
Arup-iMac:arup_ruby $ irb
2.1.2 :001 > s = "I am a string"
=> "I am a string"
2.1.2 :002 > obj_id = s.object_id
=> 2156122060
2.1.2 :003 > ObjectSpace._id2ref obj_id
=> "I am a string"
2.1.2 :004 >

Unexpected substitution of SQL placeholder in this ruby database query

Can someone explain what's happening here?
It seems that the placeholding syntax in SQL statement string doesn't work as expected (or, to say it in a different way, it violates the principle of least surprise), and during runtime an unexpected substitution/escaping is done for var2:
ruby-1.9.2-p290 :001 > puts RUBY_VERSION
1.9.2
=> nil
ruby-1.9.2-p290 :002 > require 'ipaddr'
=> true
ruby-1.9.2-p290 :003 > require 'sqlite3'
=> true
ruby-1.9.2-p290 :004 > var1 = Addrinfo.ip("1.2.3.4")
=> #<Addrinfo: 1.2.3.4>
ruby-1.9.2-p290 :005 > var2 = var1.ip_address
=> "1.2.3.4"
ruby-1.9.2-p290 :006 > var3 = "1.2.3.4"
=> "1.2.3.4"
ruby-1.9.2-p290 :007 > var2 == var3
=> true
ruby-1.9.2-p290 :008 > var2 === var3
=> true
ruby-1.9.2-p290 :009 > var2.eql?(var3)
=> true
ruby-1.9.2-p290 :010 > db = SQLite3::Database.open( "test.db" )
=> #<SQLite3::Database:0x00000100bcfce0>
ruby-1.9.2-p290 :011 > db.execute( "SELECT * FROM devices WHERE deviceaddr=?", var2 )
=> []
ruby-1.9.2-p290 :011 > db.execute( "SELECT * FROM devices WHERE deviceaddr=?", var2.to_s )
=> []
ruby-1.9.2-p290 :012 > db.execute( "SELECT * FROM devices WHERE deviceaddr=?", var3 )
=> [["TEST_DEVICE", "1.2.3.4"]]
Without the SQL placeholder it works (but exposes the db to SQL injections!):
ruby-1.9.2-p290 :013 > db.execute( "SELECT * FROM devices WHERE deviceaddr='#{var2}'" )
=> [["TEST_DEVICE", "1.2.3.4"]]
So what is a safe way to make this work?
TL;DR: SQLite uses UTF; convert Addrinfo's 8-bit ASCII output.
One "safe" way is to use force_encoding("UTF-8") on the output from Addrinfo, so:
> var1.ip_address.encoding
=> #<Encoding:ASCII-8BIT>
> var3.encoding
=> #<Encoding:UTF-8>
> db.execute("SELECT * FROM foo WHERE ip=?", var2.force_encoding("UTF-8"))
=> [["1.2.3.4"]] 

Ruby Symbol#to_proc leaks references in 1.9.2-p180?

Ok, this is my second attempt at debugging the memory issues with my Sinatra app. I believe I have it nailed down into simple sample code this time.
It seems when I filter an array through .map(&:some_method), it causes the items in that array to not get garbage collected. Running the equivalent .map{|x| x.some_method} is totally fine.
Demonstration: Given a simple sample class:
class C
def foo
"foo"
end
end
If I run the following in IRB, it gets collected normally:
ruby-1.9.2-p180 :001 > a = 10.times.map{C.new}
=> [...]
ruby-1.9.2-p180 :002 > b = a.map{|x| x.foo}
=> ["foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo"]
ruby-1.9.2-p180 :003 > ObjectSpace.each_object(C){}
=> 10
ruby-1.9.2-p180 :004 > a = nil
=> nil
ruby-1.9.2-p180 :005 > b = nil
=> nil
ruby-1.9.2-p180 :006 > GC.start
=> nil
ruby-1.9.2-p180 :007 > ObjectSpace.each_object(C){}
=> 0
So no references to C exist anymore. Good. But substituting map{|x| x.foo} with map(&:foo) (which is advertised as equivalent), it doesn't get collected:
ruby-1.9.2-p180 :001 > a = 10.times.map{C.new}
=> [...]
ruby-1.9.2-p180 :002 > b = a.map(&:foo)
=> ["foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo"]
ruby-1.9.2-p180 :003 > ObjectSpace.each_object(C){}
=> 10
ruby-1.9.2-p180 :004 > a = nil
=> nil
ruby-1.9.2-p180 :005 > b = nil
=> nil
ruby-1.9.2-p180 :006 > GC.start
=> nil
ruby-1.9.2-p180 :007 > ObjectSpace.each_object(C){}
=> 10
ruby-1.9.2-p180 :008 >
Is this a ruby bug? I'll try in more versions of ruby to be sure but this seems like an obvious issue. Anyone know what I'm doing wrong?
Edit:
I've tried this in 1.8.7-p352 and it doesn't have the issue. 1.9.3-preview1 does however still have the issue. Is a bug report in order or am I doing something wrong?
Edit2: formatting (why does putting four spaces before each line produce syntax highlighting while <pre> tags don't?)
As a.map(&:foo) should be the exact equivalent to a.map{|x| x.foo}, it seems like you really hit a bug in the Ruby code here. It cannot hurt to file a bug report on (http://redmine.ruby-lang.org/), the worst that can happen is that its being ignored. You can decrease the chances of that by providing a patch for the issue.
EDIT: I threw on my IRB and tried your code. I can reproduce the issue you describe on ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]. However, explicitely calling to_proc on the symbol does not suffer from the same problem:
irb(main):001:0> class C; def foo; end; end
=> nil
irb(main):002:0> a = 10.times.map { C.new }
=> [...]
irb(main):004:0> b = a.map(&:foo.to_proc)
=> [nil, nil, nil, nil, nil, nil, nil, nil, nil, nil]
irb(main):005:0> ObjectSpace.each_object(C){}
=> 10
irb(main):006:0> a = b = nil
=> nil
irb(main):007:0> GC.start
=> nil
irb(main):008:0> ObjectSpace.each_object(C){}
=> 0
It seems we are facing an issue with the implicit Symbol -> Proc conversion here. Maybe I will try to dive a bit into the Ruby source later. If so, I will keep you updated.
EDIT 2:
Simple workaround for the problem:
class Symbol
def to_proc
lambda { |x| x.send(self) }
end
end
class C
def foo; "foo"; end
end
a = 10.times.map { C.new }
b = a.map(&:foo)
p b
a = b = nil
GC.start
p ObjectSpace.each_object(C) {}
prints 0.

Why is String#index returning nil here?

On the last two lines:
$ ruby -v
ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-darwin10]
$ irb
irb(main):001:0> def t(str)
irb(main):002:1> str.index str
irb(main):003:1> end
=> nil
irb(main):004:0> t 'abc'
=> 0
irb(main):005:0> t "\x01\x11\xfe"
=> nil
irb(main):006:0> t "\x01\x11\xfe".force_encoding(Encoding::UTF_8)
=> nil
Why does str.index str return nil?
"\x01\x11\xfe" is not valid UTF8.
If you call t "\x01\x11\xfe".force_encoding(Encoding::BINARY), you'll get the expected 0.
The behavior is different on my older Ruby:
$ ruby -v ; irb
ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
irb(main):001:0> def t(str)
irb(main):002:1> str.index str
irb(main):003:1> end
=> nil
irb(main):004:0> t 'abc'
=> 0
irb(main):005:0> t "\x01\x11\xfe"
=> 0
I suspect it has something to do with the difference between single-quotes and double-quotes with string literals in ruby:
ruby-1.9.1-p378 > def t(str) ; str.index(str) ; end
=> nil
ruby-1.9.1-p378 > t 'abc'
=> 0
ruby-1.9.1-p378 > t "\x01\x11\xfe"
=> nil
ruby-1.9.1-p378 > t '\x01\x11\xfe'
=> 0
The short answer is that using single-quotes does minimal text processing, but double-quotes allows interpolation, character escaping, and a few other things.
Some examples:
#interpolation
ruby-1.9.1-p378 > x = 5 ; 'number: #{x}'
=> "number: \#{x}"
ruby-1.9.1-p378 > x = 5 ; "number: #{x}"
=> "number: 5"
#character escaping
ruby-1.9.1-p378 > puts 'tab\tseparated'
tab\tseparated
=> nil
ruby-1.9.1-p378 > puts "tab\tseparated"
tab separated
=> nil
#hex characters
ruby-1.9.1-p378 > puts '\x01\x11\xfe'
\x01\x11\xfe
=> nil
ruby-1.9.1-p378 > puts "\x01\x11\xfe"
�
=> nil
I'm sure someone can explain better why this happens, this is just what I've experienced in my rubying.

Resources