How do I get each comma-separated token inside parentheses - ruby

How do I achieve the following result using regular expressions?
"(apple, banana, _orange)" # => ['apple', 'banana', '_orange']
"apple, banana, _orange" # => []
"(apple)" # => ['apple']
"()" # => []
"(apple,sauce)" # => ['apple', 'sauce']
This is what I have so far but I am only able to capture the last token:
\|(?:(?:,\s)?(\w+))*\|

You can try this:
/\b\w+\b(?=.*\))/m
It works for all of the samples you provided:
re = /\b\w+\b(?=.*\))/m
str1 = '(apple, banana, _orange)'
str2 = 'apple, banana, _orange'
str3 = '(apple)'
str4 = '()'
str5 = '(apple,sauce)'
p str1.scan(re)
p str2.scan(re)
p str3.scan(re)
p str4.scan(re)
p str5.scan(re)
Sample Output:
["apple", "banana", "_orange"]
[]
["apple"]
[]
["apple", "sauce"]
However, this is not the best solution, because it doesn't check whether the string starts with a ( or not.
If you really have to use regex, then you cannot do it perfectly with one regex. It takes two steps:
First, check that the string starts and ends with parentheses.
Then scan the string with a second regex, \b\w+\b
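As a rough sketch of those two steps, assuming the whole string must be wrapped in a single pair of parentheses (paren_tokens is a hypothetical helper name, not something from the question):

```ruby
# Hypothetical helper: returns the word tokens of a string that is fully
# wrapped in parentheses, or [] otherwise.
def paren_tokens(str)
  # Step 1: capture everything between a leading "(" and a trailing ")".
  inner = str[/\A\((.*)\)\z/m, 1]
  return [] if inner.nil?

  # Step 2: pull out each run of word characters.
  inner.scan(/\b\w+\b/)
end

p paren_tokens('(apple, banana, _orange)') # => ["apple", "banana", "_orange"]
p paren_tokens('apple, banana, _orange')   # => []
p paren_tokens('()')                       # => []
```

Because the first step anchors on \A and \z, a string like 'not properly closed)' is rejected before any tokens are scanned.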

You could use this regex:
/(?<=\().*?(?=\))/
to scan for text between parens, and then split it around ','.
strings = [
'(apple, banana, _orange)',
'apple, banana, _orange',
'(apple)',
'()',
'(apple,sauce)',
'(apple) orange (sauce)',
'not properly closed)'
]
strings.each do |string|
p string.scan(/(?<=\().*?(?=\))/).flat_map { |s| s.split(',') }
end
# =>
# ["apple", " banana", " _orange"]
# []
# ["apple"]
# []
# ["apple", "sauce"]
# ["apple", "sauce"]
# []
It requires 2 steps, but it should be more resilient than just a single regex.
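Note that the first result above keeps the space after each comma (" banana"), while the question asks for 'banana'. One possible tweak, shown here as a sketch, is to strip each piece after splitting (tokens_in_parens is just an illustrative name):

```ruby
# Illustrative helper: same scan as above, plus a strip on every piece.
def tokens_in_parens(string)
  string
    .scan(/(?<=\().*?(?=\))/)                    # text between "(" and the next ")"
    .flat_map { |s| s.split(',').map(&:strip) }  # split on commas, trim whitespace
end

p tokens_in_parens('(apple, banana, _orange)') # => ["apple", "banana", "_orange"]
p tokens_in_parens('(apple) orange (sauce)')   # => ["apple", "sauce"]
p tokens_in_parens('()')                       # => []
```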

Related

How to match given characters in string

Given:
fruits = %w[Banana Apple Orange Grape]
chars = 'ep'
How can I print all elements of fruits that contain every character in chars? I tried the following:
fruits.each { |fruit| puts fruit if !(fruit =~ /["#{chars}"]/i).nil? }
but I see 'Orange' in the result, which does not have the 'p' character in it.
p fruits.select { |fruit| chars.delete(fruit.downcase).empty? }
["Apple", "Grape"]
String#delete returns a copy of chars with all characters in delete's argument deleted.
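To see why this works, it may help to look at what delete leaves behind for a single word. The sketch below wraps the same check in a hypothetical helper (all_chars_in? is not part of the original answer):

```ruby
# Hypothetical helper wrapping the check used in the answer above.
def all_chars_in?(word, chars)
  # Remove from chars every character that occurs in word;
  # nothing is left over exactly when word contains them all.
  chars.delete(word.downcase).empty?
end

p 'ep'.delete('orange')         # => "p"  ('p' is missing from "orange")
p 'ep'.delete('apple')          # => ""   (both characters were found)
p all_chars_in?('Orange', 'ep') # => false
p all_chars_in?('Grape', 'ep')  # => true
```

One caveat: String#delete gives ^ and - a special meaning inside its argument (negation and ranges), so this trick is safest when the words contain only plain letters.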
Just for fun, here's how you might do this with a regular expression, thanks to the magic of positive lookahead:
fruits = %w[Banana Apple Orange Grape]
p fruits.grep(/(?=.*e)(?=.*p)/i)
# => ["Apple", "Grape"]
This is nice and succinct, but the regex is a bit occult, and it gets worse if you want to generalize it:
def match_chars(arr, chars)
expr_parts = chars.chars.map {|c| "(?=.*#{Regexp.escape(c)})" }
arr.grep(Regexp.new(expr_parts.join, true))
end
p match_chars(fruits, "ar")
# => ["Orange", "Grape"]
Also, I'm pretty sure this would be outperformed by most or all of the other answers.
fruits = ["Banana", "Apple", "Orange", "Grape"]
chars = 'ep'.chars
fruits.select { |fruit| (fruit.split('') & chars).length == chars.length }
#=> ["Apple", "Grape"]
chars.each_char.with_object(fruits.dup){|e, a| a.select!{|s| s.include?(e)}}
# => ["Apple", "Grape"]
To print:
puts chars.each_char.with_object(fruits.dup){|e, a| a.select!{|s| s.include?(e)}}
I'm an absolute beginner, but here's what worked for me
fruits = %w[Banana Apple Orange Grape]
chars = 'ep'
fruits.each {|fruit| puts fruit if fruit.include?('e') && fruit.include?('p')}
Here is one more way to do this:
fruits.select {|f| chars.downcase.chars.all? {|c| f.downcase.include?(c)} }
Try this: first split chars into an Array of single characters ( chars.split("") ), then check that all of them are present in the word.
fruits.select{|fruit| chars.split("").all? {|char| fruit.include?(char)}}
#=> ["Apple", "Grape"]

Get first index of any character among array from a string in Ruby?

I have a string like this:
hi, i am not coming today!
and I have an array of characters like this:
['a','e','i','o','u']
Now I want to find the first occurrence in the string of any character from the array.
If it were only one character, I would be able to do it like this:
'string'.index 'c'
s = 'hi, i am not coming today!'
['a','e','i','o','u'].map { |c| [c, s.index(c)] }.to_h
#⇒ {
# "a" => 6,
# "e" => nil,
# "i" => 1,
# "o" => 10,
# "u" => nil
# }
To find the first occurrence of any character from an array:
['a','e','i','o','u'].map { |c| s.index(c) }.compact.min
#⇒ 1
UPD Something different:
idx = s.split('').each_with_index do |c, i|
  break i if ['a','e','i','o','u'].include? c
end
idx.is_a?(Numeric) ? idx : nil
s =~ /#{['a','e','i','o','u'].join('|')}/
s.index Regexp.union(['a','e','i','o','u']) # credits @steenslag
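The Regexp.union variant is probably the safest of these, because it escapes regex metacharacters, whereas joining with '|' interpolates them raw. A small sketch to illustrate the difference (first_index_of_any is a made-up helper name):

```ruby
# Made-up helper: index of the first character in str that appears in chars,
# or nil when none of them occurs.
def first_index_of_any(str, chars)
  str.index(Regexp.union(chars))
end

s = 'hi, i am not coming today!'
p first_index_of_any(s, %w[a e i o u]) # => 1
p first_index_of_any(s, %w[x z])       # => nil

# With a metacharacter in the list the two approaches disagree:
p first_index_of_any('a.b', ['.'])     # => 1 (literal dot)
p 'a.b' =~ /#{['.'].join('|')}/        # => 0 (unescaped dot matches any character)
```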

How to break a string into two arrays in Ruby

Is there a way to extract the strings removed by String#split into a separate array?
s = "This is a simple, uncomplicated sentence."
a = s.split( /,|\./ ) #=> [ "This is a simple", "uncomplicated sentence" ]
x = ... => should contain [ ",", "." ]
Note that the actual regex I need to use is much more complex than this example.
Something like this ?
a = s.scan( /,|\./ )
When you want both the matched delimiters and the substrings in between as in Stefan's comment, then you should use split with captures.
"This is a simple, uncomplicated sentence."
.split(/([,.])/)
# => ["This is a simple", ",", " uncomplicated sentence", "."]
If you want to separate them into different arrays, then do:
a, x =
"This is a simple, uncomplicated sentence."
.split(/([,.])/).each_slice(2).to_a.transpose
a # => ["This is a simple", " uncomplicated sentence"]
x # => [",", "."]
or
a =
"This is a simple, uncomplicated sentence."
.split(/([,.])/)
a.select.with_index{|_, i| i.even?}
# => ["This is a simple", " uncomplicated sentence"]
a.select.with_index{|_, i| i.odd?}
# => [",", "."]
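One thing to watch with the each_slice(2) / transpose version: if the string does not end with a delimiter, the last slice has only one element and transpose raises an IndexError. The index-based selection does not have that problem. A sketch that packages it up, assuming the pattern contains no capture groups of its own (split_with_delimiters is a hypothetical name):

```ruby
# Hypothetical helper: returns [pieces, delimiters] for the given pattern.
# Assumes pattern has no capture groups of its own, since split would
# add those captures to the result as well.
def split_with_delimiters(str, pattern)
  # Wrapping the pattern in a capture group makes split keep the delimiters.
  parts = str.split(/(#{pattern})/)
  # Even positions hold the pieces, odd positions hold the delimiters.
  parts.partition.with_index { |_, i| i.even? }
end

pieces, delimiters = split_with_delimiters('one, two. three', /[,.]/)
p pieces     # => ["one", " two", " three"]
p delimiters # => [",", "."]
```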
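Try this (note that it returns the text after the first comma, not the removed delimiters):
a = s.split(/,/)[1..-1] #=> [" uncomplicated sentence."]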

Ruby: Matching a delimiter with Regex

I'm trying to solve this with a regex pattern, and even though my test passes with this solution, I would like split to only have ["1", "2"] inside the array. Is there a better way of doing this?
irb testing:
s = "//;\n1;2" # when given a delimiter of ';'
s2 = "1,2,3" # should read between commas
s3 = "//+\n2+2" # should read between delimiter of '+'
s.split(/[,\n]|[^0-9]/)
=> ["", "", "", "", "1", "2"]
Production:
module StringCalculator
def self.add(input)
solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
input.end_with?("\n") ? nil : solution
end
end
Test:
context 'when given a newline delimiter' do
it 'should read between numbers' do
expect(StringCalculator.add("1\n2,3")).to eq(6)
end
it 'should not end in a newline' do
expect(StringCalculator.add("1,\n")).to be_nil
end
end
context 'when given different delimiter' do
it 'should support that delimiter' do
expect(StringCalculator.add("//;\n1;2")).to eq(3)
end
end
This is very simple using String#scan:
s = "//;\n1;2"
s.scan(/\d/) # => ["1", "2"]
/\d/ - A digit character ([0-9])
Note: if a number can have more than one digit, as in the string below, you should use /\d+/.
s = "//;\n11;2"
s.scan(/\d+/) # => ["11", "2"]
You're getting data that looks like this string: //1\n212
If you're getting the data as a file, then treat it as two separate lines. If it's a string, then, again, treat it as two separate lines. In either case it'd look like
//1
212
when output.
If it's a string:
input = "//1\n212".split("\n")
delimiter = input.first[2] # => "1"
values = input.last.split(delimiter) # => ["2", "2"]
If it's a file:
line = File.foreach('foo.txt')
delimiter = line.next[2] # => "1"
values = line.next.chomp.split(delimiter) # => ["2", "2"]
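Putting the header-line idea together with the default delimiters from the question, a parser might look roughly like this. It is only a sketch: parse_numbers is a hypothetical name, and it assumes the optional header has the form "//<delimiter>" on its own first line, with commas and newlines as the default separators.

```ruby
# Hypothetical helper: extracts the integers from a string-calculator input.
# Assumes an optional first line of the form "//<delimiter>".
def parse_numbers(input)
  if input.start_with?('//')
    header, body = input.split("\n", 2) # header line, then the rest
    delimiter = header[2..-1]           # everything after the leading "//"
  else
    delimiter = ','
    body = input
  end
  # Escape the delimiter so characters such as "+" are taken literally;
  # a newline always counts as a separator too.
  body.to_s.split(/#{Regexp.escape(delimiter)}|\n/).map(&:to_i)
end

p parse_numbers("//;\n1;2") # => [1, 2]
p parse_numbers("1\n2,3")   # => [1, 2, 3]
p parse_numbers("//+\n2+2") # => [2, 2]
```

Escaping matters for the "//+\n2+2" case, since an unescaped + is a quantifier in a regex.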

Ruby: Insert Multiple Values Into String

Suppose we have the string "aaabbbccc" and want to use String#insert to convert it to "aaa<strong>bbb</strong>ccc". Is the following the best way to insert multiple values into a Ruby string using String#insert, or can multiple values be added simultaneously?
string = "aaabbbccc"
opening_tag = '<strong>'
opening_index = 3
closing_tag = '</strong>'
closing_index = 6
string.insert(opening_index, opening_tag)
closing_index = 6 + opening_tag.length # I don't really like this
string.insert(closing_index, closing_tag)
Is there a way to simultaneously insert multiple substrings into a Ruby string so the closing tag does not need to be offset by the length of the first substring that is added? I would like something like this one liner:
string.insert(3 => '<strong>', 6 => '</strong>') # => "aaa<strong>bbb</strong>ccc"
Let's have some fun. How about
class String
def splice h
self.each_char.with_index.inject('') do |accum,(c,i)|
accum + h.fetch(i,'') + c
end
end
end
"aaabbbccc".splice(3=>"<strong>", 6=>"</strong>")
=> "aaa<strong>bbb</strong>ccc"
(you can encapsulate this however you want, I just like messing with built-ins because Ruby lets me)
How about inserting from right to left?
string = "aaabbbccc"
string.insert(6, '</strong>')
string.insert(3, '<strong>')
string # => "aaa<strong>bbb</strong>ccc"
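The right-to-left idea generalizes to something close to the one-liner asked for in the question. A minimal sketch, assuming all indexes are non-negative and refer to positions in the original string (insert_all is a made-up name, not a built-in):

```ruby
# Made-up helper: inserts every text at its index, where the indexes are
# positions in the ORIGINAL string. Returns a new string.
def insert_all(str, insertions)
  result = str.dup # leave the caller's string untouched
  # Work from the highest index down, so earlier insertions
  # never shift the positions that are still to be processed.
  insertions.sort_by { |index, _text| -index }.each do |index, text|
    result.insert(index, text)
  end
  result
end

p insert_all('aaabbbccc', { 3 => '<strong>', 6 => '</strong>' })
# => "aaa<strong>bbb</strong>ccc"
```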
opening_tag = '<strong>'
opening_index = 3
closing_tag = '</strong>'
closing_index = 6
string = "aaabbbccc"
string[opening_index...closing_index] =
opening_tag + string[opening_index...closing_index] + closing_tag
#=> "<strong>bbb</strong>"
string
#=> "aaa<strong>bbb</strong>ccc"
If your string is comprised of three groups of consecutive characters, and you'd like to insert the opening tag between the first two groups and the closing tag between the last two groups, regardless of the size of each group, you could do that like this:
def stuff_tags(str, tag)
str.scan(/((.)\2*)/)
.map(&:first)
.insert( 1, "<#{tag}>")
.insert(-2, "<\/#{tag}>")
.join
end
stuff_tags('aaabbbccc', 'strong') #=> "aaa<strong>bbb</strong>ccc"
stuff_tags('aabbbbcccccc', 'weak') #=> "aa<weak>bbbb</weak>cccccc"
I will explain the regex used by scan, but first would like to show how the calculations proceed for the string 'aaabbbccc':
a = 'aaabbbccc'.scan(/((.)\2*)/)
#=> [["aaa", "a"], ["bbb", "b"], ["ccc", "c"]]
b = a.map(&:first)
#=> ["aaa", "bbb", "ccc"]
c = b.insert( 1, "<strong>")
#=> ["aaa", "<strong>", "bbb", "ccc"]
d = c.insert(-2, "<\/strong>")
#=> ["aaa", "<strong>", "bbb", "</strong>", "ccc"]
d.join
#=> "aaa<strong>bbb</strong>ccc"
We need two capture groups in the regex. The first (having the first left parenthesis) captures the string we want. The second captures the first character, (.). This is needed so that we can require that it be followed by zero or more copies of that character, \2*.
Here's another way this can be done:
def stuff_tags(str, tag)
str.chars.chunk {|c| c}
.map {|_,a| a.join}
.insert( 1, "<#{tag}>")
.insert(-2, "<\/#{tag}>")
.join
end
The calculations of a and b above change to the following:
a = 'aaabbbccc'.chars.chunk {|c| c}
#=> #<Enumerator: #<Enumerator::Generator:0x000001021622d8>:each>
# a.to_a => [["a",["a","a","a"]],["b",["b","b","b"]],["c",["c","c","c"]]]
b = a.map {|_,a| a.join }
#=> ["aaa", "bbb", "ccc"]

Resources