This question already has answers here:
Are strings in Ruby mutable? [duplicate]
(6 answers)
Closed 8 years ago.
How come concatenating to a string does not change its object_id? My understanding was that Strings are immutable because Strings are essentially Arrays of characters, and Arrays cannot be changed in memory since they are contiguous. Yet, as demonstrated below, instantiating a String and then adding characters does not change its object_id. How does concatenation affect the String in memory?
2.1.2 :131 > t1 = "Hello "
=> "Hello "
2.1.2 :132 > t1.object_id
=> 70282949828720
2.1.2 :133 > t2 = t1
=> "Hello "
2.1.2 :134 > t2.object_id
=> 70282949828720
2.1.2 :135 > t2 << "HEY THERE MATE"
=> "Hello HEY THERE MATE"
2.1.2 :136 > t2.object_id
=> 70282949828720
2.1.2 :137 > t1.object_id
=> 70282949828720
How come concatenating to a string does not change its object_id?
Because it's still the same string it was before.
My understanding was that Strings are immutable
No, they are not immutable. In Ruby, strings are mutable.
because Strings are essentially Arrays of characters,
They are not. In Ruby, strings are mostly a factory for iterators (each_line, each_char, each_codepoint, each_byte). String implements a subset of the Array protocol, but that does not mean that a string is an array.
and Arrays cannot be changed in memory since they are contiguous.
Wrong, arrays are mutable in Ruby.
Yet, as demonstrated below, instantiating a String and then adding characters does not change its object_id. How does concatenation affect the String in memory?
The Ruby Language Specification does not prescribe any particular in-memory representation of strings. Any representation is fine, as long as it supports the semantics specified in the Ruby Language Specification.
Here are a couple of examples from some Ruby implementations:
Rubinius:
kernel/common/string.rb
kernel/bootstrap/string.rb
vm/builtin/string.cpp
Topaz:
topaz/objects/stringobject.py
Cardinal:
src/classes/String.pir
IronRuby:
Ruby/Builtins/MutableString.cs
JRuby:
core/src/main/java/org/jruby/RubyString.java
Ruby strings are not immutable, in contrast to languages like Python and Java. The underlying char array is internally resized to accommodate the appended characters.
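As a rough illustration of the point above, here is a minimal sketch contrasting in-place concatenation (<<) with String#+, which builds a new object. The variable names are only illustrative, and String.new is used so the example does not depend on any frozen-string-literal setting.

```ruby
# String.new gives an explicitly mutable string.
t1 = String.new("Hello ")
t2 = t1                       # a second reference to the same object

id_before = t1.object_id
t2 << "there"                 # mutates the receiver in place
same_object = (t1.object_id == id_before)    # true
# t1 is now "Hello there", because t1 and t2 name the same object

t3 = t1 + "!"                 # String#+ returns a new String
new_object = (t3.object_id != t1.object_id)  # true

t4 = t1
t4 += "?"                     # sugar for t4 = t4 + "?", so t4 is rebound
rebound = !t4.equal?(t1)      # true, and t1 itself is unchanged
```

This is why the object_id in the question stays the same: << changes the contents of the existing object rather than creating a new one.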
If you want an immutable string in Ruby (for example, Bad Things can happen if you use a mutable value as a hash key), use a symbol:
my_sym = :foo
or
my_sym = my_string.to_sym
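Besides symbols, a String object can also be made immutable with freeze. That method is not mentioned in the answer above, so treat the following as a supplementary sketch with illustrative variable names.

```ruby
key = String.new("foo").freeze   # freeze makes this String object immutable

begin
  key << "bar"                   # any in-place modification now raises
  raised = false
rescue RuntimeError              # FrozenError (Ruby 2.5+) is a RuntimeError
  raised = true
end

# Equal symbols are always the same object; two separately built strings
# with the same content are distinct objects.
same_symbol = :foo.equal?("foo".to_sym)                     # true
same_string = String.new("foo").equal?(String.new("foo"))   # false
```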
Related
In Python, I found rstr, which can generate a string from a regex pattern.
Python also has this method, which returns the character ranges in a pattern:
re.sre_parse.parse(pattern)
#..... ('range', (97, 122)) ....
But in Ruby I didn't find anything.
So how do I generate a string from a regex pattern in Ruby (a reverse regex)?
I want to do something like this:
"/[a-z0-9]+/".example
#tvvd
"/[a-z0-9]+/".example
#yt
"/[a-z0-9]+/".example
#bgdf6
"/[a-z0-9]+/".example
#564fb
"/[a-z0-9]+/" is my input.
Each output must be a valid string that matches my regex pattern.
Here the outputs were tvvd, yt, bgdf6 and 564fb, generated by the "example" method.
I need that method.
Thanks for your advice.
You can also use the Faker gem https://github.com/stympy/faker and then use this call:
Faker::Base.regexify(/[a-z0-9]{10}/)
In Ruby:
/qweqwe/.to_s
# => "(?-mix:qweqwe)"
When you write a Regexp literal, you get a Regexp object. To convert it to a String, you can use Regexp#to_s. During the conversion the option flags are spelled out, as you can see in the example, because to_s:
returns a string containing the regular expression and its options (using the (?opts:source) notation). This string can be fed back in to Regexp::new to produce a regular expression with the same semantics as the original.
Also, you can use Regexp's method #inspect, which:
produces a generally more readable version of rxp.
/ab+c/ix.inspect #=> "/ab+c/ix"
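A small sketch of the round trip described above, assuming Ruby 2.4 or later for match?. The variable names are illustrative.

```ruby
original  = /ab+c/ix
as_string = original.to_s           # "(?ix-m:ab+c)"
readable  = original.inspect        # "/ab+c/ix"

# Feeding the to_s form back into Regexp.new keeps the semantics:
# the i and x options travel inside the (?ix-m:...) group.
rebuilt = Regexp.new(as_string)

matches_upper = rebuilt.match?("xxABBCxx")   # true, case-insensitive
rejects_other = !rebuilt.match?("ac")        # true, b+ needs at least one b
```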
Note that the above methods only convert a Regexp into a String. To match against or select from a set of strings, use other methods. For example, if you have a source array (or a string that you split with #split), you can grep it and get a result array:
array = "test,ab,yr,OO".split( ',' )
# => ['test', 'ab', 'yr', 'OO']
array = array.grep(/[a-z]/)
# => ["test", "ab", "yr"]
And then convert the array back into a string:
array.join(',')
# => "test,ab,yr"
Or just use the #scan method with a slightly changed regexp:
"test,ab,yr,OO".scan( /[a-z]+/ )
# => ["test", "ab", "yr"]
However, if you really need a random string that matches the regexp, you have to write your own method (please refer to the post) or use the ruby-string-random library. The library:
generates a random string based on Regexp syntax or Patterns.
The code will look like the following:
pattern = '[aw-zX][123]'
result = StringRandom.random_regex(pattern)
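If you go the "write your own method" route, a hand-rolled version for the one pattern in the question could look roughly like this. random_alnum is a hypothetical helper, not part of any gem. It only handles the shape [a-z0-9]+, and the max_length cap on the unbounded + quantifier is an assumption.

```ruby
# Hypothetical helper: returns a random string of lowercase letters and
# digits, i.e. something that /\A[a-z0-9]+\z/ accepts.
def random_alnum(max_length = 8)
  alphabet = [*'a'..'z', *'0'..'9']
  length = rand(1..max_length)               # "+" needs at least one char
  Array.new(length) { alphabet.sample }.join
end

samples = Array.new(5) { random_alnum }
all_match = samples.all? { |s| s =~ /\A[a-z0-9]+\z/ }   # true
```

A general solution has to parse the regexp itself, which is what the gems mentioned in this thread do.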
A bit late to the party, but (originally inspired by this Stack Overflow thread) I have created a powerful Ruby gem which solves the original problem:
https://github.com/tom-lord/regexp-examples
/this|is|awesome/.examples #=> ['this', 'is', 'awesome']
/https?:\/\/(www\.)?github\.com/.examples #=> ['http://github.com', 'http://www.github.com', 'https://github.com', 'https://www.github.com']
UPDATE: Regular expressions are now supported in the string_pattern gem, and it is 30 times faster than other gems:
require 'string_pattern'
/[a-z0-9]+/.generate
For a speed comparison, see https://repl.it/#tcblues/Comparison-generating-random-string-from-regular-expression
I created a simple way to generate strings using a pattern without the mess of regular expressions. Take a look at the string_pattern gem project: https://github.com/MarioRuiz/string_pattern
To install it: gem install string_pattern
This is an example of use:
# four characters. optional: capitals and numbers, required: lower
"4:XN/x/".gen # aaaa, FF9b, j4em, asdf, ADFt
What is the colon operator in Ruby?
While learning Ruby I've come across the ":" operator on occasion. Usually I see it in the form of
:symbol => value
What does it mean?
It just indicates that it is a symbol instead of a string. In Ruby, it is common to use symbols instead of strings.
{:foo => value}
{'foo' => value}
It's basically a lightweight, immutable stand-in for a string, typically used as a name or a hash key. A plain symbol literal cannot contain spaces, so symbols usually use underscores.
Try this on your own:
foo = :bar
foo.to_s # means to string
baz = 'goo'
baz.to_sym # means to symbol
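To make the conversions above concrete, here is a short sketch. The hash contents are made up for illustration, and the name: shorthand assumes Ruby 1.9 or later.

```ruby
foo = :bar
foo_string = foo.to_s            # "bar"
baz = 'goo'
baz_symbol = baz.to_sym          # :goo

round_trip = (foo_string.to_sym == foo)   # true

# Both literal styles create a Symbol key; the second is shorthand.
old_style = { :name => 'Ruby' }
new_style = { name: 'Ruby' }
same_hash = (old_style == new_style)      # true

# A String key and a Symbol key are different keys.
lookup = { 'name' => 'Ruby' }[:name]      # nil
```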
Given a string that may contain English, Japanese (wide characters), or other languages,
how can I get the first character or a leading substring of the string?
ex: "Give" => "G"
"日本" => "日"
Thanks!
This is built into Ruby, as long as the string has the correct encoding set:
$ ruby -ve 'p "日本".encoding, "日本"[0]'
ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]
#<Encoding:UTF-8>
"日"
There is no need to use mb_chars or ActiveRecord.
You can use ActiveSupport's Chars class
string = "日本"
string.mb_chars[0]
=> "日"
If you have ActiveSupport loaded (for example through ActiveRecord), you can use mb_chars.
Or you can use the standard library:
str = '日本'
str.codepoints.take(1).pack('U*')
# => "日"
codepoints gives the string's Unicode code points as integers (an enumerator in Ruby 1.9, an array from Ruby 2.0 on), take keeps as many of them as you want, and pack('U*') turns those code points back into a UTF-8 string. Or you can use
[str.codepoints.to_a[0]].pack('U')
Here to_a forces all of the string's code points into an array first. That is fine for short strings but wasteful for big ones.
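For comparison, here is a sketch of the character-level and byte-level views of the same string on Ruby 1.9 or later. The string is written with \u escapes only so that the literal is UTF-8 regardless of the source file's encoding.

```ruby
str = "\u65E5\u672C"                 # "日本"

first_char = str[0]                  # "日"
first_two  = str[0, 2]               # "日本"
via_chars  = str.each_char.first     # "日", stops after one character
code_point = str.codepoints.first    # 26085, an Integer rather than a String
char_count = str.length              # 2
byte_count = str.bytesize            # 6, three bytes per character in UTF-8
```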
What is the Ruby idiomatic way for retrieving a single character from a string as a one-character string? There is the str[n] method of course, but (as of Ruby 1.8) it returns a character code as a fixnum, not a string. How do you get to a single-character string?
In Ruby 1.9, it's easy. In Ruby 1.9, Strings are encoding-aware sequences of characters, so you can just index into one and you will get a single-character string back:
'µsec'[0] # => 'µ'
However, in Ruby 1.8, Strings are sequences of bytes and thus completely unaware of the encoding. If you index into a string and that string uses a multibyte encoding, you risk indexing right into the middle of a multibyte character (in this example, the 'µ' is encoded in UTF-8):
'µsec'[0] # => 194
'µsec'[0].chr # => Garbage
'µsec'[0,1] # => Garbage
However, Regexps and some specialized string methods support at least a small subset of popular encodings, among them some Japanese encodings (e.g. Shift-JIS) and (in this example) UTF-8:
'µsec'.split('')[0] # => 'µ'
'µsec'.split(//u)[0] # => 'µ'
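The byte-level view that Ruby 1.8 exposed can still be reproduced on Ruby 1.9.3 or later, which may help to show why indexing by byte is risky. This is only a sketch, and the string is written with a \u escape so the literal is UTF-8.

```ruby
s = "\u00B5sec"                      # "µsec"

first_byte = s.getbyte(0)            # 194, what s[0] returned in Ruby 1.8
half_char  = s.byteslice(0, 1)       # only the first of µ's two bytes
broken     = !half_char.valid_encoding?   # true, it is not a whole character

whole_char = s[0]                    # "µ"
whole_size = whole_char.bytesize     # 2
```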
Before Ruby 1.9:
'Hello'[1].chr # => "e"
Ruby 1.9+:
'Hello'[1] # => "e"
A lot has changed in Ruby 1.9 including string semantics.
Should work for Ruby before and after 1.9:
'Hello'[2,1] # => "l"
Please see Jörg Mittag's comment: this is correct only for single-byte character sets.
'abc'[1..1] # => "b"
'abc'[1].chr # => "b"
In Ruby, trying to print out the individual elements of a String is giving me trouble. Instead of seeing each character, I'm seeing its ASCII value:
>> a = "0123"
=> "0123"
>> a[0]
=> 48
I've looked online but can't find any way to get the original "0" back out of it. I'm a little new to Ruby, so I know it has to be something simple, but I just can't seem to find it.
Or you can convert the integer to its character value:
a[0].chr
You want a[0,1] instead of a[0].
I believe this is changing in Ruby 1.9 such that "asdf"[2] yields "d" rather than the character code.
To summarize:
This behavior goes away in version 1.9, in which the character itself is returned. In previous versions, referencing a single character of a string by its position returns its character code (so "ABC"[2] returns 67).
There are a number of methods that return a range of characters from a string (see the Ruby docs on the String slice method). All of the following return "C":
"ABC"[2,1]
"ABC"[2..2]
"ABC".slice(2,1)
I find the range selector to be the easiest to read. Can anyone speak to whether it is less efficient?
@Chris,
That's just how [] and [,] are defined for the String class.
Check out the String API.
The [,] operator returns a string back to you; it is a substring operator, whereas the [] operator returns the character, which Ruby treats as a number when printing it out.
I think each_char or chars describes better what you want.
irb(main):001:0> a = "0123"
=> "0123"
irb(main):002:0> Array(a.each_char)
=> ["0", "1", "2", "3"]
irb(main):003:0> puts Array(a.each_char)
0
1
2
3
=> nil
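To tie the answers in this thread together, here is a brief sketch of how the same operations behave on Ruby 1.9 and later. The variable names are illustrative.

```ruby
a = "0123"

first  = a[0]              # "0" on Ruby 1.9 and later
code   = a[0].ord          # 48, the value Ruby 1.8 returned for a[0]
back   = code.chr          # "0"
digits = a.chars.to_a      # ["0", "1", "2", "3"]

# The substring forms return a one-character string on old and new Rubies.
sliced = [a[2, 1], a[2..2], a.slice(2, 1)]   # ["2", "2", "2"]
```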