How to match given characters in string - ruby

Given:
fruits = %w[Banana Apple Orange Grape]
chars = 'ep'
how can I print all elements of fruits that have all characters of chars? I tried the following:
fruits.each{|fruit| puts fruit if !(fruit=~/["#{chars}"]/i).nil?}
but I see 'Orange' in the result, which does not have the 'p' character in it.

p fruits.select { |fruit| chars.delete(fruit.downcase).empty? }
["Apple", "Grape"]
String#delete returns a copy of chars with all characters in delete's argument deleted.
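To see why the empty? check works, here is a minimal sketch. The helper name has_all_chars? is hypothetical and not part of the answer. Deleting every character of the fruit from chars leaves an empty string only when each required character occurs in the fruit. One caveat: String#delete gives ^, - and \ a special meaning in its argument, so this assumes plain alphabetic words.

```ruby
# Hypothetical helper wrapping the one-liner above.
def has_all_chars?(word, chars)
  # Remove from chars every character that appears in word;
  # anything left over is a required character that word lacks.
  chars.downcase.delete(word.downcase).empty?
end

p 'ep'.delete('apple')            # => ""
p 'ep'.delete('orange')           # => "p"
p has_all_chars?('Grape', 'ep')   # => true
```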

Just for fun, here's how you might do this with a regular expression, thanks to the magic of positive lookahead:
fruits = %w[Banana Apple Orange Grape]
p fruits.grep(/(?=.*e)(?=.*p)/i)
# => ["Apple", "Grape"]
This is nice and succinct, but the regex is a bit occult, and it gets worse if you want to generalize it:
def match_chars(arr, chars)
expr_parts = chars.chars.map {|c| "(?=.*#{Regexp.escape(c)})" }
arr.grep(Regexp.new(expr_parts.join, true))
end
p match_chars(fruits, "ar")
# => ["Orange", "Grape"]
Also, I'm pretty sure this would be outperformed by most or all of the other answers.

fruits = ["Banana", "Apple", "Orange", "Grape"]
chars = 'ep'.chars
fruits.select { |fruit| (fruit.split('') & chars).length == chars.length }
#=> ["Apple", "Grape"]

chars.each_char.with_object(fruits.dup){|e, a| a.select!{|s| s.include?(e)}}
# => ["Apple", "Grape"]
To print:
puts chars.each_char.with_object(fruits.dup){|e, a| a.select!{|s| s.include?(e)}}

I'm an absolute beginner, but here's what worked for me
fruits = %w[Banana Apple Orange Grape]
chars = 'ep'
fruits.each {|fruit| puts fruit if fruit.include?('e') && fruit.include?('p')}

Here is one more way to do this:
fruits.select {|f| chars.downcase.chars.all? {|c| f.downcase.include?(c)} }

Try this: first split chars into an array of characters (chars.split("")), then check that all of them are present in the word.
fruits.select{|fruit| chars.split("").all? {|char| fruit.include?(char)}}
#=> ["Apple", "Grape"]
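As for why the attempt in the question printed 'Orange': a character class such as [ep] matches any one of the listed characters, so it tests "contains e or p" rather than "contains e and p". A small sketch of the difference follows; the method names matches_any? and matches_all? are made up for illustration.

```ruby
# "Any" semantics: what a single character class gives you.
def matches_any?(word, chars)
  word.match?(/[#{Regexp.escape(chars)}]/i)
end

# "All" semantics: what the question actually asks for.
def matches_all?(word, chars)
  chars.downcase.each_char.all? { |c| word.downcase.include?(c) }
end

p matches_any?('Orange', 'ep')  # => true  (it contains an "e")
p matches_all?('Orange', 'ep')  # => false (it has no "p")
```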

Related

How do I get each comma-separated tokens inside parenthesis

How do I achieve the following result using regular expressions?
"(apple, banana, _orange)" # => ['apple', 'banana', '_orange']
"apple, banana, _orange" # => []
"(apple)" # => ['apple']
"()" # => []
"(apple,sauce)" # => ['apple', 'sauce']
This is what I have so far but I am only able to capture the last token:
\|(?:(?:,\s)?(\w+))*\|
You can try this:
/\b\w+\b(?=.*\))/m
it works for all your provided sample:
re = /\b\w+\b(?=.*\))/m
str1 = '(apple, banana, _orange)'
str2 = 'apple, banana, _orange'
str3 = '(apple)'
str4 = '()'
str5 = '(apple,sauce)'
p str1.scan(re)
p str2.scan(re)
p str3.scan(re)
p str4.scan(re)
p str5.scan(re)
Sample Output:
["apple", "banana", "_orange"]
[]
["apple"]
[]
["apple", "sauce"]
But ideally this is not the best solution as it doesn't check whether it starts with a ( or not.
If you really have to use regex, you cannot do it perfectly with a single regex.
You will first need to check that the string starts and ends with parentheses.
Then you need to scan the string with a second regex, \b\w+\b.
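The two steps just described could be sketched like this. The method name paren_tokens is hypothetical, and the sketch assumes the whole string is a single parenthesized group; a string such as '(apple) orange (sauce)' would be treated as one group and also yield 'orange'.

```ruby
# Hypothetical helper: step 1 checks the outer parentheses,
# step 2 scans the inner text for word tokens.
def paren_tokens(str)
  # Capture everything between a leading "(" and a trailing ")".
  inner = str[/\A\((.*)\)\z/m, 1]
  return [] unless inner

  inner.scan(/\b\w+\b/)
end

p paren_tokens('(apple, banana, _orange)')  # => ["apple", "banana", "_orange"]
p paren_tokens('apple, banana, _orange')    # => []
p paren_tokens('()')                        # => []
```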
You could use this regex :
/(?<=\().*?(?=\))/
to scan for text between parens, and then split it around ','.
strings = [
'(apple, banana, _orange)',
'apple, banana, _orange',
'(apple)',
'()',
'(apple,sauce)',
'(apple) orange (sauce)',
'not properly closed)'
]
strings.each do |string|
p string.scan(/(?<=\().*?(?=\))/).flat_map { |s| s.split(',') }
end
# =>
# ["apple", " banana", " _orange"]
# []
# ["apple"]
# []
# ["apple", "sauce"]
# ["apple", "sauce"]
# []
It requires 2 steps, but it should be more resilient than just a single regex.
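Note that splitting on ',' alone keeps the leading spaces (" banana", " _orange"). If the exact output from the question is wanted, one possible tweak is to strip each piece. The method name tokens_in_parens is only an illustration.

```ruby
# Hypothetical variant of the scan-then-split approach that also
# trims whitespace around each token.
def tokens_in_parens(string)
  string.scan(/(?<=\().*?(?=\))/).flat_map do |inner|
    inner.split(',').map(&:strip)
  end
end

p tokens_in_parens('(apple, banana, _orange)')  # => ["apple", "banana", "_orange"]
p tokens_in_parens('(apple) orange (sauce)')    # => ["apple", "sauce"]
```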

Ruby: Sorting an array of strings, in alphabetical order, that includes some arrays of strings

Say I have:
a = ["apple", "pear", ["grapes", "berries"], "peach"]
and I want to sort by:
a.sort_by do |f|
f.class == Array ? f.to_s : f
end
I get:
[["grapes", "berries"], "apple", "peach", "pear"]
Where I actually want the items in alphabetical order, with array items being sorted on their first element:
["apple", ["grapes", "berries"], "peach", "pear"]
or, preferably, I want:
["apple", "grapes, berries", "peach", "pear"]
If the example isn't clear enough, I'm looking to sort the items in alphabetical order.
Any suggestions on how to get there?
I've tried a few things so far yet can't seem to get it there. Thanks.
I think this is what you want:
a.sort_by { |f| f.class == Array ? f.first : f }
I would do
a = ["apple", "pear", ["grapes", "berries"], "peach"]
a.map { |e| Array(e).join(", ") }.sort
# => ["apple", "grapes, berries", "peach", "pear"]
Array#sort_by clearly is the right method, but here's a reminder of how Array#sort would be used here:
a.sort do |s1,s2|
t1 = (s1.is_a? Array) ? s1.first : s1
t2 = (s2.is_a? Array) ? s2.first : s2
t1 <=> t2
end.map {|e| (e.is_a? Array) ? e.join(', ') : e }
#=> ["apple", "grapes, berries", "peach", "pear"]
@theTinMan pointed out that sort is quite a bit slower than sort_by here, and gave a reference that explains why. I've been meaning to see how the Benchmark module is used, so took the opportunity to compare the two methods for the problem at hand. I used @Rafa's solution for sort_by and mine for sort.
For testing, I constructed an array of 100 random samples (each with 10,000 random elements to be sorted) in advance, so the benchmarks would not include the time needed to construct the samples (which was not insignificant). 8,000 of the 10,000 elements were random strings of 8 lowercase letters. The other 2,000 elements were 2-tuples of the form [str1, str2], where str1 and str2 were each random strings of 8 lowercase letters. I benchmarked with other parameters, but the bottom-line results did not vary significantly.
require 'benchmark'
# n: total number of items to sort
# m: number of two-tuples [str1, str2] among n items to sort
# n-m: number of strings among n items to sort
# k: length of each string in samples
# s: number of sorts to perform when benchmarking
def make_samples(n, m, k, s)
s.times.with_object([]) { |_, a| a << test_array(n,m,k) }
end
def test_array(n,m,k)
a = ('a'..'z').to_a
r = []
(n-m).times { r << a.sample(k).join }
m.times { r << [a.sample(k).join, a.sample(k).join] }
r.shuffle!
end
# Here's what the samples look like:
make_samples(6,2,4,4)
#=> [["bloj", "izlh", "tebz", ["lfzx", "rxko"], ["ljnv", "tpze"], "ryel"],
# ["jyoh", "ixmt", "opnv", "qdtk", ["jsve", "itjw"], ["pnog", "fkdr"]],
# ["sxme", ["emqo", "cawq"], "kbsl", "xgwk", "kanj", ["cylb", "kgpx"]],
# [["rdah", "ohgq"], "bnup", ["ytlr", "czmo"], "yxqa", "yrmh", "mzin"]]
n = 10000 # total number of items to sort
m = 2000 # number of two-tuples [str1, str2] (n-m strings)
k = 8 # length of each string
s = 100 # number of sorts to perform
samples = make_samples(n,m,k,s)
Benchmark.bm('sort_by'.size) do |bm|
bm.report 'sort_by' do
samples.each do |s|
s.sort_by { |f| f.class == Array ? f.first : f }
end
end
bm.report 'sort' do
samples.each do |s|
s.sort do |s1,s2|
t1 = (s1.is_a? Array) ? s1.first : s1
t2 = (s2.is_a? Array) ? s2.first : s2
t1 <=> t2
end
end
end
end
              user     system      total        real
sort_by   1.360000   0.000000   1.360000 (  1.364781)
sort      4.050000   0.010000   4.060000 (  4.057673)
Though it was never in doubt, @theTinMan was right! I did a few other runs with different parameters, but sort_by consistently thumped sort by similar performance ratios.
Note the "system" time is zero for sort_by. In other runs it was sometimes zero for sort. The values were always zero or 0.010000, leading me to wonder what's going on there. (I ran these on a Mac.)
For readers unfamiliar with Benchmark: Benchmark.bm takes the label width as its argument, that is, how much room to reserve on the left for the row labels so that the header row (user system...) lines up with the columns. bm.report takes a row label as an argument.
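A stripped-down sketch of the same Benchmark.bm pattern, for anyone who wants to try it on a small array first. The helper name compare_sorts is hypothetical, and timings on a tiny input are not meaningful; the point is only the shape of the calls.

```ruby
require 'benchmark'

# Hypothetical helper: times sort_by and sort on the same data and
# returns the list of Benchmark::Tms results, one per report.
def compare_sorts(data)
  labels = %w[sort_by sort]
  width = labels.map(&:size).max  # label width passed to Benchmark.bm

  Benchmark.bm(width) do |bm|
    bm.report(labels[0]) do
      data.sort_by { |f| f.is_a?(Array) ? f.first : f }
    end
    bm.report(labels[1]) do
      data.sort do |x, y|
        (x.is_a?(Array) ? x.first : x) <=> (y.is_a?(Array) ? y.first : y)
      end
    end
  end
end

compare_sorts(["apple", "pear", ["grapes", "berries"], "peach"])
```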
You are really close. Just switch .to_s to .first.
irb(main):005:0> b = ["grapes", "berries"]
=> ["grapes", "berries"]
irb(main):006:0> b.to_s
=> "[\"grapes\", \"berries\"]"
irb(main):007:0> b.first
=> "grapes"
Here is one that works:
a.sort_by do |f|
f.class == Array ? f.first : f
end
Yields:
["apple", ["grapes", "berries"], "peach", "pear"]
a.map { |b| b.is_a?(Array) ? b.join(', ') : b }.sort
# => ["apple", "grapes, berries", "peach", "pear"]
Replace to_s with join.
a.sort_by do |f|
f.class == Array ? f.join : f
end
# => ["apple", ["grapes", "berries"], "peach", "pear"]
Or more concisely:
a.sort_by {|x| [*x].join }
# => ["apple", ["grapes", "berries"], "peach", "pear"]
The problem with to_s is that it converts your Array to a string that starts with "[":
"[\"grapes\", \"berries\"]"
which comes alphabetically before the rest of your strings.
join actually creates the string that you had expected to sort by:
"grapesberries"
which is alphabetized correctly, according to your logic.
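To make the difference concrete, here is a small sketch that prints the sort key each approach produces for every element. The helper names key_with_to_s and key_with_join are made up for this illustration.

```ruby
# Key produced by the original attempt (Array#to_s).
def key_with_to_s(item)
  item.is_a?(Array) ? item.to_s : item
end

# Key produced by the join-based fix.
def key_with_join(item)
  Array(item).join
end

a = ["apple", "pear", ["grapes", "berries"], "peach"]
a.each do |item|
  puts "#{key_with_to_s(item).inspect}  vs  #{key_with_join(item).inspect}"
end

# "[" sorts before every lowercase letter, which is why the array came first.
p key_with_to_s(["grapes", "berries"]) < "apple"  # => true
```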
If you don't want the arrays to remain arrays, then it's a slightly different operation, but you will still use join.
a.map {|x| [*x].join(", ") }.sort
# => ["apple", "grapes, berries", "peach", "pear"]
Sort a Flattened Array
If you just want all elements of your nested array flattened and then sorted in alphabetical order, all you need to do is flatten and sort. For example:
["apple", "pear", ["grapes", "berries"], "peach"].flatten.sort
#=> ["apple", "berries", "grapes", "peach", "pear"]

Split a string in Ruby

I have a hash returned to me in ruby
test_string = "{cat=6,bear=2,mouse=1,tiger=4}"
I need to get a list of these items in this form ordered by the number.
animals = [cat, tiger, bear, mouse]
My thoughts were to_s this in ruby and split on the '=' character. Then try to order them and put in a new list. Is there an easy way to do this in ruby? Sample code would be greatly appreciated.
s = "{cat=6,bear=2,mouse=1,tiger=4}"
a = s.scan(/(\w+)=(\d+)/)
p a.sort_by { |x| x[1].to_i }.reverse.map(&:first)
a = test_string.split('{')[1].split('}').first.split(',')
# => ["cat=6", "bear=2", "mouse=1", "tiger=4"]
a.map{|s| s.split('=')}.sort_by{|p| p[1].to_i}.reverse.map(&:first)
# => ["cat", "tiger", "bear", "mouse"]
Not the most elegant way to do it, but it works:
test_string.gsub(/[{}]/, "").split(",").map {|x| x.split("=")}.sort_by {|x| x[1].to_i}.reverse.map {|x| x[0].strip}
The below code should do it.
Explained the steps inline
test_string.gsub!(/{|}/, "") # Remove the curly braces
array = test_string.split(",") # Split on comma
array1= []
array.each {|word|
array1<<word.split("=") # Create an array of arrays
}
h1 = Hash[*array1.flatten] # Convert Array into Hash
puts h1.keys.sort {|a, b| h1[b].to_i <=> h1[a].to_i} # Print keys of the hash sorted by numeric value, descending
test_string = "{cat=6,bear=2,mouse=1,tiger=4}"
Hash[*test_string.scan(/\w+/)].sort_by{|k,v| v.to_i }.map(&:first).reverse
#=> ["cat", "tiger", "bear", "mouse"]
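Pulling the pieces from the answers above together, a compact sketch might look like the following. The method name animals_by_count is hypothetical. It converts the counts with to_i before comparing, since comparing them as strings would put "9" after "10".

```ruby
# Hypothetical helper: returns the names ordered by count, highest first.
def animals_by_count(str)
  str.scan(/(\w+)=(\d+)/)                         # [["cat", "6"], ["bear", "2"], ...]
     .sort_by { |_name, count| -count.to_i }      # numeric, descending
     .map(&:first)
end

p animals_by_count("{cat=6,bear=2,mouse=1,tiger=4}")
# => ["cat", "tiger", "bear", "mouse"]
```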

Ruby regex matching overlapping terms

I'm using:
r = /(hell|hello)/
"hello".scan(r) #=> [["hell"]]
but I would like to get [ "hell", "hello" ].
http://rubular.com/r/IxdPKYSUAu
You can use a fancier capture:
'hello'.match(/((hell)o)/).captures
=> ["hello", "hell"]
No, regexes don't work like that. But you can do something like this:
terms = %w{hell hello}.map{|t| /#{t}/}
str = "hello"
matches = terms.map{|t| str.scan t}
puts matches.flatten.inspect # => ["hell", "hello"]
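A variation on the same idea, as a sketch: wrapping each term in a zero-width lookahead with a capture group also finds occurrences of a term that overlap with each other. The method name overlapping_matches is made up here, and Regexp.escape is added on the assumption that the terms are literal strings rather than patterns.

```ruby
# Hypothetical helper: scans str once per term and collects every hit.
def overlapping_matches(str, terms)
  terms.flat_map do |term|
    # (?=(...)) matches at a position without consuming text, so the next
    # attempt starts one character later and overlapping hits are all found.
    str.scan(/(?=(#{Regexp.escape(term)}))/).flatten
  end
end

p overlapping_matches("hello", %w[hell hello])  # => ["hell", "hello"]
p overlapping_matches("aaa", %w[aa])            # => ["aa", "aa"]
```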
Well, you can always take out common subexpression. I.e., the following works:
r = /hello{0,1}/
"hello".scan(r) #=> ["hello"]
"hell".scan(r) #=> ["hell"]
You could do something like this:
r = /hell|(?<=hell)o/
"hello".scan(r) #=> ["hell","o"]
It won't give you ["hell", "hello"], but rather ["hell", "o"]

Find just part of string with a regex

I have a string like so:
"#[30:Larry Middleton]"
I want to return just 30. Where 30 will always be digits, and can be of 1 to infinity in length.
I've tried:
user_id = result.match(/#\[(\d+):.*]/)
But that returns everything. How can I get back just 30?
If that's really all your string, you don't need to match the rest of the pattern; just match the consecutive integers:
irb(main):001:0> result = "#[30:Larry Middleton]"
#=> "#[30:Larry Middleton]"
irb(main):002:0> result[/\d+/]
#=> "30"
However, if you need to match this as part of a larger string that might have digits elsewhere:
irb(main):004:0> result[/#\[(\d+):.*?\]/]
#=> "#[30:Larry Middleton]"
irb(main):005:0> result[/#\[(\d+):.*?\]/,1]
#=> "30"
irb(main):006:0> result[/#\[(\d+):.*?\]/,1].to_i
#=> 30
If you need the name also:
irb(main):002:0> m = result.match /#\[(\d+):(.*?)\]/
#=> #<MatchData "#[30:Larry Middleton]" 1:"30" 2:"Larry Middleton">
irb(main):003:0> m[1]
#=> "30"
irb(main):004:0> m[2]
#=> "Larry Middleton"
In Ruby 1.9 you can even name the matches, instead of using the capture number:
irb(main):005:0> m = result.match /#\[(?<id>\d+):(?<name>.*?)\]/
#=> #<MatchData "#[30:Larry Middleton]" id:"30" name:"Larry Middleton">
irb(main):006:0> m[:id]
#=> "30"
irb(main):007:0> m[:name]
#=> "Larry Middleton"
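If this is needed in more than one place, the named-capture version could be wrapped in a small method. This is only a sketch; the name parse_mention and the choice to return a hash (or nil when nothing matches) are assumptions, not something from the question.

```ruby
# Hypothetical helper built on the named captures shown above.
def parse_mention(str)
  m = str.match(/#\[(?<id>\d+):(?<name>.*?)\]/)
  return nil unless m  # no "#[id:name]" token found

  { id: m[:id].to_i, name: m[:name] }
end

mention = parse_mention("#[30:Larry Middleton]")
puts mention[:id]    # 30
puts mention[:name]  # Larry Middleton
```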
And if you need to find many of these:
irb(main):008:0> result = "First there was #[30:Larry Middleton], age 17, and then there was #[42:Phrogz], age unknown."
irb(main):015:0> result.scan /#\[(\d+):.*?\]/
#=> [["30"], ["42"]]
irb(main):016:0> result.scan(/#\[(\d+):.*?\]/).flatten.map(&:to_i)
#=> [30, 42]
irb(main):017:0> result.scan(/#\[(\d+):(.*?)\]/).each{ |id,name| puts "#{name} is #{id}" }
Larry Middleton is 30
Phrogz is 42
Try this:
user_id = result.match(/#\[(\d+):.*]/)[1]
You forgot to escape ']':
user_id = result.match(/#\[(\d+):.*\]/)[1]
I don't know ruby, but if it supports lookbehinds and lookaheads:
user_id = result.match(/(?<=#\[)\d+(?=:)/)
If not, you should have some way of retrieving subpattern from the match - again, I wouldn't know how.
I prefer String#scan for most of my regex needs, here's what I would do:
results.scan(/#\[(\d+):/).flatten.map(&:to_i).first
For your second question about getting the name:
results.scan(/(\d+):([A-Za-z ]+)\]$/).flatten[1]
Scan will always return an array of sub string matches:
"#[123:foo bars]".scan(/\d+/) #=> ['123']
If you include a pattern in parens, then each match becomes a sub array holding the text captured by each of those "sub-patterns":
"#[123:foo bars]".scan(/(\d+):(\w+)/) #=> [['123', 'foo']]
That's why we have to do flatten on results involving sub-patterns:
[['123', 'foo']].flatten #=> ['123', 'foo']
Also it always returns strings, that's why conversion to integer is needed in the first example:
['123'].map(&:to_i) #=> [123]
Hope this is helpful.
