What is the difference between %w and %W - ruby

I'm looking at the documentation for Ruby. I'm confused about when to use %w() and when to use %W() (the latter with an uppercase W). What is the difference between the two? Can you point me to some documentation?

When capitalized, the array is constructed from strings that are interpolated, as would happen in a double-quoted string; when lowercased, it is constructed from strings that are not interpolated, as would happen in a single-quoted string. For example:
irb(main):001:0> foo = "bar"
=> "bar"
irb(main):002:0> %w(#{foo} bar baz)
=> ["\#{foo}", "bar", "baz"]
irb(main):003:0> %W(#{foo} bar baz)
=> ["bar", "bar", "baz"]
irb(main):004:0> ^D
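The same distinction applies to backslash escapes, not only to interpolation. A minimal sketch, assuming a hypothetical variable `name` and a tab escape chosen purely for illustration:

```ruby
name = "bar" # hypothetical variable, for illustration only

lower = %w(#{name} a\tb) # like single quotes: no interpolation, \t stays two characters
upper = %W(#{name} a\tb) # like double quotes: interpolated, \t becomes a tab

p lower
p upper
```

Printing both arrays with p should make the difference visible, since p shows escapes explicitly.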


Checking if an element exists inside an array created with %w(), Ruby

I know I can use %w() as a shortcut to create an array. E.g. these should be equivalent:
FOO = %w(dog, cat)
BAR = ["dog", "cat"]
But when I use include? to check what's in these arrays:
FOO = %w(dog, cat)
BAR = ["dog", "cat"]
puts FOO.include? "dog" #false
puts BAR.include? "dog" #true
The first puts returns false while the second one returns true. Why is this happening?
No, these arrays are not equivalent:
FOO = %w(dog, cat)
BAR = ["dog", "cat"]
But these are:
FOO = %w(dog cat) # note that the `%w` syntax doesn't need commas.
BAR = ["dog", "cat"]
That is why FOO.include? "dog" returns false in your first version: the array contains "dog," (with the trailing comma) and "cat".
In addition to @spickermann's answer, printing the array itself is often more than enough to identify the issue.
Ruby has a built-in method p that prints out any object in a "representative" format, similar to Python's repr().
FOO = %w(dog, cat)
p FOO # => ["dog,", "cat"]
#               ^ note the stray comma inside the string
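To make the difference concrete, here is a small sketch comparing the two literals side by side. The variable names are made up for illustration.

```ruby
with_commas    = %w(dog, cat) # the comma becomes part of the first string
without_commas = %w(dog cat)  # whitespace alone separates the elements

p with_commas
p with_commas.include?("dog")
p without_commas.include?("dog")
p without_commas == ["dog", "cat"]
```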

How do I split a string on capitals unless preceded by a '+'

I have a CamelCased string, which I would like to split into individual words at the capitals, unless the capital is preceded by a '+':
Splitting on the caps is fairly simple in Ruby: s.split(/(?=[A-Z])/)
But I can't figure out how to add the "except after '+'" part.
For example:
s = "FooBashFizz+BuzzXBar"
p s.split(/(?=[A-Z])/)
=> ["Foo", "Bash", "Fizz+", "Buzz", "X", "Bar"]
desired:
=> ["Foo", "Bash", "Fizz+Buzz", "X", "Bar"]
Add a negative lookbehind at the start.
irb(main):001:0> s = "FooBashFizz+BuzzXBar"
=> "FooBashFizz+BuzzXBar"
irb(main):002:0> s.split(/(?<!\+)(?=[A-Z])/)
=> ["Foo", "Bash", "Fizz+Buzz", "X", "Bar"]
Explanation:
(?<!\+) Asserts that the preceding character is not a + symbol.
(?=[A-Z]) Asserts that the following character must be an uppercase letter.
Alternative using String#scan. This also works in Ruby 1.8.
s = "FooBashFizz+BuzzXBar"
s.scan(/[A-Z][a-z]*(?:\+[A-Z][a-z]*)*/)
# => ["Foo", "Bash", "Fizz+Buzz", "X", "Bar"]
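The two approaches agree on the sample string, but they are not interchangeable in general: split keeps every character of the input, while scan keeps only what the pattern matches. A rough sketch, with hypothetical helper names wrapping the two expressions from the answers:

```ruby
# Hypothetical helpers wrapping the two approaches shown above.
def split_on_caps(str)
  # Split before each capital that is not preceded by "+".
  str.split(/(?<!\+)(?=[A-Z])/)
end

def scan_words(str)
  # Collect capitalized words, joining those connected by "+".
  str.scan(/[A-Z][a-z]*(?:\+[A-Z][a-z]*)*/)
end

s = "FooBashFizz+BuzzXBar"
p split_on_caps(s) == scan_words(s)

# A leading lowercase run is kept by split but dropped by scan.
p split_on_caps("fooBar")
p scan_words("fooBar")
```

Which behavior is preferable depends on whether input outside the CamelCase shape should be preserved or discarded.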

Is there a version of Ruby's Regexp.match that responds to the order of the matches within the string?

I want to use regexes to check if a given string is composed of certain substrings.
For example, given the regular expression (anchored so that it has to match the whole string)
> regex = /\A(?:(foo)|(bar)|(baz))*\z/
I can determine whether a given string matches the pattern:
> regex === "bazbar"
=> true
> regex === "qux"
=> false
But I want to know how to break the string into substrings. I can almost do this with
> regex.match("barbazfoo").captures
=> ["foo", "bar", "baz"]
But here they appear in the order in which I specified them within the regex. I want to return
["bar", "baz", "foo"]
In the order in which they appeared in the string.
You can use String#scan with a modified regular expression:
regex = /foo|bar|baz/
"barbazfoo".scan(regex)
# => ["bar", "baz", "foo"]
UPDATE according to OP's comment.
If some of the strings are substrings of others, you need to order the alternatives so that the longer strings come first and their substrings go last.
"barfoo".scan(/ba|bar|foo/) # without ordering
# => ["ba", "foo"]
words = ['ba', 'bar', 'foo']
pattern = words.map { |word| Regexp.escape(word) }.sort_by { |x| -x.size }.join('|')
"barfoo".scan(Regexp.new(pattern))
# => ["bar", "foo"]
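Putting the pieces together, one possible sketch of a helper that returns the substrings in order, or nil when the string is not made up entirely of the given words. The name `tokenize` is hypothetical, and Regexp.union is used here in place of the manual escape-and-join. Note that longest-first matching is greedy, so it can reject some strings that a backtracking approach would accept.

```ruby
# Hypothetical helper: returns the words in the order they appear in `string`,
# or nil if `string` is not composed entirely of the given words.
def tokenize(string, words)
  # Regexp.union escapes each string; longer words go first so that
  # a shorter word never shadows a longer one that starts the same way.
  pattern = Regexp.union(words.sort_by { |word| -word.size })
  tokens = string.scan(pattern)
  tokens.join == string ? tokens : nil
end

p tokenize("barbazfoo", %w[foo bar baz])
p tokenize("barfoo", %w[ba bar foo])
p tokenize("barquxfoo", %w[foo bar baz])
```

Comparing the joined tokens with the original string is what detects leftover characters such as "qux".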

Passing a block variable to grep?

When I do:
["bar","quax"].grep(/bar/)
The output is:
=> ["bar"]
If I do:
["bar","quax"].grep(/fish/)
The output is:
=> []
I decided to take it further by attempting to pass a block to grep, but it did not work.
["foo", "bar", "baz", "quax"].each do |some_word|
["fish","jones"].grep(/some_word/)
end
The output is:
=> ["foo", "bar", "baz", "quax"]
I am curious why my extension doesn't work, as it seems fairly straightforward. Or, is it simply illegal to do?
each iterates over your array but doesn't modify it. So it's just looping a few times, calling grep, discarding whatever grep returns, and then returning the original array.
You need to be doing something with the return value of grep - but it's not clear to me what that is supposed to be (I don't understand why you are using each at all?)
Also, some_word in this case is taken literally as "some_word". You should interpolate your regex like .grep(/#{some_word}/)
["foo", "bar", "baz", "quax"].each do |some_word|
["fish","jones"].grep(/some_word/)
end
isn't correct. some_word is the parameter to the block, but /some_word/ is a regular expression that matches the string 'some_word', whether that is the entire string or just a sub-string:
%w[some_word not_some_word_anymore].grep(/some_word/)
# => ["some_word", "not_some_word_anymore"]
If you want to use the variable/parameter some_word inside the regular expression, you have to substitute it in somehow. A simple way to do it is:
/#{ some_word }/
or:
Regexp.new( some_word )
For instance:
foo = 'some_word'
/#{ foo }/ # => /some_word/
Regexp.new(foo) # => /some_word/
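One caveat when substituting a string into a regular expression: any regex metacharacters in the string keep their special meaning unless they are escaped with Regexp.escape. A small sketch, with a hypothetical input value:

```ruby
word = "a.b" # hypothetical input containing a regex metacharacter

loose  = /#{word}/                 # "." matches any character
strict = /#{Regexp.escape(word)}/  # "." matches only a literal dot

p loose === "axb"
p strict === "axb"
p strict === "a.b"
```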
The reason:
["foo", "bar", "baz", "quax"].each do |some_word|
end
returns the same array is that this is how each behaves: it always returns its receiver, and that return value is usually ignored. If you want to transform the array, use map:
["foo", "bar", "baz", "quax"].map { |some_word| some_word.size }
# => [3, 3, 3, 4]
If you're trying to filter the array, use something like grep, select, or reject:
["foo", "bar", "baz", "quax"].reject { |some_word| some_word['oo'] }
# => ["bar", "baz", "quax"]
["foo", "bar", "baz", "quax"].select { |some_word| some_word['oo'] }
# => ["foo"]
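If the goal was to run grep once per word and keep the results, a sketch along these lines may be closer to the intent. The haystack data is invented for illustration, and the guess about what the question wanted is only an assumption.

```ruby
needles  = %w[foo bar baz quax]
haystack = %w[foobar bazaar fish] # hypothetical data to search

# map keeps each block's return value; each would discard it.
matches = needles.map { |word| [word, haystack.grep(/#{Regexp.escape(word)}/)] }.to_h

p matches
```

Here the block parameter is interpolated into the regex, and map collects one result per word instead of returning the original array.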

Right way to extract multiple values from string using regex in ruby 1.8

I'm relatively new to ruby and I'm trying to figure out the "ruby" way of extracting multiple values from a string, based on grouping in regexes. I'm using ruby 1.8 (so I don't think I have named captures).
I could just match and then assign $1,$2 - but I feel like there's got to be a more elegant way (this is ruby, after all).
I've also got something working with grep, but it seems hackish since I'm using an array and just grabbing the first element:
input="FOO: 1 BAR: 2"
foo, bar = input.grep(/FOO: (\d+) BAR: (\d+)/){[$1,$2]}[0]
p foo
p bar
I've tried searching online and browsing the ruby docs, but haven't been able to figure anything better out.
Ruby's String#match method returns a MatchData object, whose captures method returns an Array of the captured groups.
>> string = "FOO: 1 BAR: 2"
=> "FOO: 1 BAR: 2"
>> string.match /FOO: (\d+) BAR: (\d+)/
=> #<MatchData "FOO: 1 BAR: 2" 1:"1" 2:"2">
>> _.captures
=> ["1", "2"]
>> foo, bar = _
=> ["1", "2"]
>> foo
=> "1"
>> bar
=> "2"
To summarize:
foo, bar = input.match(/FOO: (\d+) BAR: (\d+)/).captures
Either:
foo, bar = string.scan(/[A-Z]+: (\d+)/).flatten
or:
foo, bar = string.match(/FOO: (\d+) BAR: (\d+)/).captures
Use scan instead:
input="FOO: 1 BAR: 2"
input.scan(/FOO: (\d+) BAR: (\d+)/) #=> [["1", "2"]]
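One thing to watch with the match version: match returns nil when the pattern does not match, so calling captures directly raises NoMethodError on unexpected input. A minimal sketch of a guard, using a hypothetical helper name and converting the captures to integers as an extra assumption:

```ruby
# Hypothetical helper: returns [foo, bar] as integers,
# or nil when the input does not match the pattern.
def extract_foo_bar(input)
  match = input.match(/FOO: (\d+) BAR: (\d+)/)
  return nil unless match
  match.captures.map { |digits| digits.to_i }
end

p extract_foo_bar("FOO: 1 BAR: 2")
p extract_foo_bar("no numbers here")
```

The scan version does not need this guard, since it returns an empty array when nothing matches.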
