I'm looking at ruby's replace: http://www.ruby-doc.org/core/classes/String.html#M001144
It doesn't make sense to me: you call replace and it replaces the entire string.
I was expecting:
replace(old_value, new_value)
Is what I am looking for gsub then?
replace seems to be different than in most other languages.
I agree that replace is generally used as some sort of pattern replace in other languages, but Ruby is different :)
Yes, you are thinking of gsub:
ruby-1.9.2-p136 :001 > "Hello World!".gsub("World", "Earth")
=> "Hello Earth!"
One thing to note is that String#replace may seem pointless, but it does remove "taintedness". You can read more about tainted objects here.
I suppose the reason you feel that replace does not make sense is that there is already the assignment operator = (which has little to do with gsub).
The important point is that String instances are mutable objects. By using replace, you can change the content of the string while retaining its identity as an object. Compare:
a = 'Hello' # => 'Hello'
a.object_id # => 84793190
a.replace('World') # => 'World'
a.object_id # => 84793190
a = 'World' # => 'World'
a.object_id # => 84768100
Notice that replace did not change the string object's id, whereas simple assignment did. This difference has consequences. For example, suppose you set some instance variables on the string instance. With replace, that information is retained, but if you simply assign a different string to the same variable, all that information is gone.
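A minimal sketch of that consequence, assuming a reasonably recent Ruby (2.3+ for the unary +). The second reference b and the @note instance variable are made up for illustration:

```ruby
a = +'Hello'   # unary + gives an unfrozen string even if literals are frozen
b = a          # b refers to the same object as a
a.instance_variable_set(:@note, 'greeting')  # hypothetical metadata on the object

a.replace('World')                           # mutates the object in place
replaced_same_object = a.equal?(b)           # still one object
b_after_replace      = b.dup                 # b sees the new content
note_after_replace   = a.instance_variable_get(:@note)  # metadata survives

a = 'World again'                            # rebinds a to a brand-new object
rebound_same_object  = a.equal?(b)           # a and b are now different objects
note_after_rebinding = a.instance_variable_get(:@note)  # new object has no @note
```

So replace matters mostly when something else (another variable, a hash value, an instance variable) still points at the same string.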
Yes, it is gsub and it is taken from awk syntax. I guess replace stands for the internal representation of the string, since, according to documentation, tainted-ness is removed too.
Related
When I had a look into the ActiveRecord source today, I stumbled upon these lines
name = -name.to_s
https://github.com/rails/rails/blob/2459c20afb508c987347f52148210d874a9af4fa/activerecord/lib/active_record/reflection.rb#L24
and
ar.aggregate_reflections = ar.aggregate_reflections.merge(-name.to_s => reflection)
https://github.com/rails/rails/blob/2459c20afb508c987347f52148210d874a9af4fa/activerecord/lib/active_record/reflection.rb#L29
What purpose does the - operator serve on the symbol name?
That's String#-@:
Returns a frozen, possibly pre-existing copy of the string.
Example:
a = "foo"
b = "foo"
a.object_id #=> 6980
b.object_id #=> 7000
vs:
a = -"foo"
b = -"foo"
a.object_id #=> 6980
b.object_id #=> 6980
What purpose does the - operator serve on the symbol name?
You have your precedence rules wrong: the binary message sending operator (.) has higher precedence than everything else, which means - is not applied to the expression name but to the expression name.to_s.
In other words, you seem to think that this expression is parsed like this:
(-name).to_s
# which is the same as
name.-@().to_s()
but it is actually parsed as
-(name.to_s)
# which is the same as
name.to_s().-@()
Now, we don't know what name is, but unless someone is seriously messing with you, #to_s should return a String. In other words, the operator is not applied to a Symbol, as you thought.
Hence, we know that we are sending the message -@ to a String and can thus look up what String#-@ does in the documentation:
-string → frozen_string
Returns a frozen, possibly pre-existing copy of the string.
The returned String will be deduplicated as long as it does not have any instance variables set on it.
Dynamically created Strings are not frozen by default. Only static String literals are, depending on your setting of the magic comment # frozen_string_literal: true. String#-@ was added to allow you to freeze and de-duplicate a String with as little syntactic noise as possible. Unlike String#freeze, it does not freeze an unfrozen receiver in place; it returns a frozen copy.
The opposite operation is also available as String#+@.
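A small sketch of both unary operators, assuming Ruby 2.5 or later (where -@ de-duplicates). The variable names are only for illustration:

```ruby
dynamic = %w[f o o].join      # built at runtime, so it is not frozen
frozen  = -dynamic            # frozen, possibly pre-existing copy
other   = -%w[f o o].join     # a second string with the same content

dynamic_is_frozen = dynamic.frozen?   # the receiver itself is left untouched
frozen_is_frozen  = frozen.frozen?
shared = frozen.equal?(other)         # typically true on MRI 2.5+, not guaranteed

thawed = +frozen                      # unfrozen duplicate of a frozen string
thawed << '!'                         # safe to mutate; frozen is unaffected
```

That is the point of the Rails lines in the question: the many identical column and reflection names end up sharing one frozen String instead of allocating a new one each time.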
How do I print/display just the part of a regular expression that is between the slashes?
irb> re = /\Ahello\z/
irb> puts "re is /#{re}/"
The result is:
re is /(?-mix:\Ahello\z)/
Whereas I want:
re is /\Ahello\z/
...but not by doing this craziness:
puts "re is /#{re.to_s.gsub( /.*:(.*)\)/, '\1' )}/"
If you want to see the original pattern between the delimiters, use source:
IP_PATTERN = /(?:\d{1,3}\.){3}\d{1,3}/
IP_PATTERN # => /(?:\d{1,3}\.){3}\d{1,3}/
IP_PATTERN.inspect # => "/(?:\\d{1,3}\\.){3}\\d{1,3}/"
IP_PATTERN.to_s # => "(?-mix:(?:\\d{1,3}\\.){3}\\d{1,3})"
Here's what source shows:
IP_PATTERN.source # => "(?:\\d{1,3}\\.){3}\\d{1,3}"
From the documentation:
Returns the original string of the pattern.
/ab+c/ix.source #=> "ab+c"
Note that escape sequences are retained as is.
/\x20\+/.source #=> "\\x20\\+"
NOTE:
It's common to build a complex pattern from small patterns, and it's tempting to use interpolation to insert the simple ones, but that doesn't work as most people think it will. Consider this:
foo = /foo/
bar = /bar/imx
foo_bar = /#{ foo }_#{ bar }/
foo_bar # => /(?-mix:foo)_(?mix:bar)/
Notice that foo_bar has the pattern flags for each of the sub-patterns. Those can REALLY mess you up when trying to match things if you're not aware of their existence. Inside the (?-...) block the pattern can have totally different settings for i, m or x in relation to the outer pattern. Debugging that can make you nuts, worse than trying to debug a complex pattern normally would. How do I know this? I'm a veteran of that particular war.
This is why source is important. It injects the exact pattern, without the flags:
foo_bar = /#{ foo.source}_#{ bar.source}/
foo_bar # => /foo_bar/
Use .inspect instead of .to_s:
> puts "re is #{re.inspect}"
re is /\Ahello\z/
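To put the three conversions side by side, here is a short sketch using the pattern from the question. The practical difference between the last two is that inspect adds the delimiters and any flags, while source gives only the text between the slashes:

```ruby
re = /\Ahello\z/

via_to_s    = "re is /#{re}/"          # interpolation calls to_s
via_source  = "re is /#{re.source}/"   # pattern text only, delimiters added by hand
via_inspect = "re is #{re.inspect}"    # delimiters (and flags) included

flagged         = /ab+c/ix
flagged_source  = flagged.source       # flags are dropped
flagged_inspect = flagged.inspect      # flags are kept after the closing slash
```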
In Python I found rstr, which can generate a string from a regex pattern.
Python also has this method, which returns the character ranges in a pattern:
re.sre_parse.parse(pattern)
#..... ('range', (97, 122)) ....
But in Ruby I didn't find anything.
So how do I generate a string from a regex pattern in Ruby (reverse regex)?
I want to do something like this:
"/[a-z0-9]+/".example
#tvvd
"/[a-z0-9]+/".example
#yt
"/[a-z0-9]+/".example
#bgdf6
"/[a-z0-9]+/".example
#564fb
"/[a-z0-9]+/" is my input.
The outputs must be valid strings that match my regex pattern.
Here the outputs were tvvd, yt, bgdf6 and 564fb, generated by the "example" method.
I need that method.
Thanks for your advice.
You can also use the Faker gem https://github.com/stympy/faker and then use this call:
Faker::Base.regexify(/[a-z0-9]{10}/)
In Ruby:
/qweqwe/.to_s
# => "(?-mix:qweqwe)"
When you declare a Regexp, you get a Regexp object. To convert it to a String object, you can use Regexp's #to_s method. During the conversion the option flags are spelled out, as you can see in the example. The documentation describes the result as:
(using the (?opts:source) notation. This string can be fed back in to Regexp::new to a regular expression with the same semantics as the original.
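A quick sketch of that round trip, using the /ab+c/ix pattern from the Ruby docs. The test string is made up for illustration:

```ruby
original = /ab+c/ix
as_text  = original.to_s          # flags are embedded in the string
rebuilt  = Regexp.new(as_text)    # no flags passed; they travel inside the pattern

original_hit = original =~ 'xABBBc'   # index of the match, or nil
rebuilt_hit  = rebuilt  =~ 'xABBBc'
same_regexp  = (rebuilt == original)  # sources differ, so they are not ==
```

The rebuilt regexp matches the same strings, but it is not equal to the original because its source is now the whole (?ix-m:...) group.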
Also, you can use Regexp's method #inspect, which:
produces a generally more readable version of rxp.
/ab+c/ix.inspect #=> "/ab+c/ix"
Note that the above methods only convert a Regexp into a String. To match against or select from a set of strings, we use other methods. For example, if you have a source array (or a string that you split with the #split method), you can grep it and get a result array:
array = "test,ab,yr,OO".split( ',' )
# => ['test', 'ab', 'yr', 'OO']
array = array.grep(/[a-z]/)
# => ["test", "ab", "yr"]
And then convert the array into string as:
array.join(',')
# => "test,ab,yr"
Or just use #scan method, with slightly changed regexp:
"test,ab,yr,OO".scan( /[a-z]+/ )
# => ["test", "ab", "yr"]
However, if you really need a random string that matches the regexp, you have to write your own method (please refer to the post) or use the ruby-string-random library. The library:
generates a random string based on Regexp syntax or Patterns.
The code looks like the following:
pattern = '[aw-zX][123]'
result = StringRandom.random_regex(pattern)
A bit late to the party, but - originally inspired by this stackoverflow thread - I have created a powerful ruby gem which solves the original problem:
https://github.com/tom-lord/regexp-examples
/this|is|awesome/.examples #=> ['this', 'is', 'awesome']
/https?:\/\/(www\.)?github\.com/.examples #=> ['http://github.com', 'http://www.github.com', 'https://github.com', 'https://www.github.com']
UPDATE: Regular expressions are now supported in the string_pattern gem, and it is 30 times faster than other gems.
require 'string_pattern'
/[a-z0-9]+/.generate
To see a comparison of speed https://repl.it/#tcblues/Comparison-generating-random-string-from-regular-expression
I created a simple way to generate strings from a pattern without the mess of regular expressions. Take a look at the string_pattern gem project: https://github.com/MarioRuiz/string_pattern
To install it: gem install string_pattern
This is an example of use:
# four characters. optional: capitals and numbers, required: lower
"4:XN/x/".gen # aaaa, FF9b, j4em, asdf, ADFt
Maybe you can find what you are looking for over here.
I'm sure I can do this with a regex, but I can't find any explanation for this behavior using just normal delete!:
#1.9.2
>> "helllom<em>".delete!"<em>"
=> "hlllo"
The docs don't have anything to say about this. Seems to me that it's treating '<em>' as a set. Where is this documented?
Edit: in my defense I was looking for special treatment of < and > in the docs under delete. Didn't see anything about it and tried google, which also didn't have anything to say about that -- because it doesn't exist.
String#delete is one of those unfortunate methods that is difficult to explain (I have no idea what the use case is). In practice, I've always used gsub with an empty string as the second argument.
'helllom<em>'.gsub '<em>', '' # => "helllom"
Note that String#gsub! has its own quirk: it returns nil if it does not alter the string, so you should not depend on its return value. It is best to use gsub if you depend on the return value; if you want to mutate the string, use gsub! but don't use anything else on that line.
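A short sketch of that nil return, reusing the string from the question (the variable names are only for illustration):

```ruby
text = +'helllom<em>'               # unary + so the literal is mutable

first  = text.gsub!('<em>', '')     # changed something: returns the string itself
second = text.gsub!('<em>', '')     # nothing left to change: returns nil

# Chaining on gsub! would raise NoMethodError when it returns nil,
# so chain on the non-bang version instead.
shouted = text.gsub('<em>', '').upcase
```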
You cannot use String#delete to remove substrings.
Check the API. It removes all the characters from given parameters from the given string.
In your case it removes all occurrences of e, m, < and >.
Straight from the docs:
delete([other_str]+) → new_str
Returns a copy of str with all characters in the intersection of its
arguments deleted. Uses the same rules for building the set of
characters as String#count.
ex:
"hello".delete "l","lo" #=> "heo"
"hello".delete "lo" #=> "he"
"hello".delete "aeiou", "^e" #=> "hell"
"hello".delete "ej-m" #=> "ho"
So every character in the intersection of the two strings is removed.
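To make the set behaviour concrete, here is a sketch with the string from the question. delete_suffix is my addition and assumes Ruby 2.5 or later; on older versions sub or gsub does the job:

```ruby
s = 'helllom<em>'

as_set       = s.delete('<em>')          # removes every e, m, < and >
how_many     = s.count('<em>')           # count uses the same set rules
as_substring = s.sub('<em>', '')         # removes the literal substring once
as_suffix    = s.delete_suffix('<em>')   # Ruby 2.5+: removes it only at the end
```

Here count reports how many characters delete would remove, which is a handy way to check what a character set actually covers.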
In Ruby 1.8.7, Array("hello\nhello") gives you ["hello\n", "hello"]. This does two things that I don't expect:
It splits the string on newlines. I'd expect it simply to give me an array with the string I pass in as its single element without modifying the data I pass in.
Even if you accept that it's reasonable to split a string when passing it to Array, why does it retain the newline character when "foo\nbar".split does not?
Additionally:
>> Array.[] "foo\nbar"
=> ["foo\nbar"]
>> Array.[] *"foo\nbar"
=> ["foo\n", "bar"]
It splits the string on newlines. I'd expect it simply to give me an array with the string I pass in as its single element without modifying the data I pass in.
That's a convention as good as any other. For example, the list constructor in Python does something entirely different:
>>> list("foo")
['f', 'o', 'o']
So long as it's consistent I don't see the problem.
Even if you accept that it's reasonable to split a string when passing it to Array, why does it retain the newline character when "foo\nbar".split does not?
My wild guess here (supported by quick googling and TryRuby) is that the .split method for strings does so to make it the "inverse" operation of the .join method for arrays.
>> "foospambar".split("spam").join("spam")
=> "foospambar"
By the way, I cannot replicate your behaviour on TryRuby:
>> x = Array("foo\nbar")
=> ["foo\nbar"]
>> Array.[] *"foo\nbar"
=> ["foo\nbar"]
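The difference is most likely the Ruby version. As far as I know, in 1.8 String mixed in Enumerable and had a to_a that returned its lines, which is what Array() and the splat picked up; 1.9 removed that, so the string is wrapped as a single element. A sketch of what you get on 1.9 or later, and how to ask for each result explicitly:

```ruby
text = "foo\nbar"

wrapped   = Array(text)          # one element, string untouched
splatted  = [*text]              # same on 1.9+, no line splitting
by_line   = text.each_line.to_a  # line by line, keeps the trailing newline
by_split  = text.split("\n")     # splits on the separator and drops it
rejoined  = by_split.join("\n")  # split and join round-trip
```

each_line keeps the newline because it yields whole lines, while split treats the newline as a separator, which is why the two disagree.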
If you replace the double-quotes with single-quotes it works as expected:
>> Array.[] "foo\nbar"
=> ["foo\nbar"]
>> Array.[] 'foo\nbar'
=> ["foo\\nbar"]
You may try:
"foo\nbar".split(/w/)
"foo\nbar".split(/^/)
"foo\nbar".split(/$/)
and other regular expressions.