Search an array for a string and its substrings in Ruby

I want to search an array for a certain string and(!) its substrings. For example my array is:
array = ["hello", "hell", "goodbye", "he"]
So when I search for "hello" and its substrings (but only from the beginning: "he", "hell", "hello"), it should return
=> ["hello", "hell", "he"]
What I've tried so far: Using a regular expression with the #grep and/or the #include? method like this:
array.grep("hello"[/\w+/])
or
array.select {|i| i.include?("hello"[/\w+/])}
but in both cases it only returns
=> ["hello"]
By the way, array.select { |i| i.include?("he") } works, but as I said, I want it the other way around: search for "hello" and get back every element that is a substring of it, starting from the beginning.

require "abbrev"
arr = ["hello", "hell", "goodbye", "he"]
p arr & ["hello"].abbrev.keys # => ["hello", "hell", "he"]

array = ["hello", "hell", "goodbye", "he", "he"]
# define search word:
search = "hello"
# find all substrings of this word:
substrings = (0..search.size - 1).each_with_object([]) { |i, subs| subs << search[0..i] }
#=> ["h", "he", "hel", "hell", "hello"]
# find intersection between array and substrings(will exclude duplicates):
p array & substrings
#=> ["hello", "hell", "he"]
# or select array elements that match any substring(will keep duplicates):
p array.select { |elem| substrings.include?(elem) }
#=> ["hello", "hell", "he", "he"]

I'd use String#[] :
array = ["hello", "hell", "goodbye", "he", "he"]
search = "hello"
array.select { |s| search[/\A#{s}/] }
# => ["hello", "hell", "he", "he"]
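One caveat with interpolating array elements into a regexp is that any metacharacters in them (such as . or *) are read as pattern syntax. A minimal sketch of the same check using String#start_with?, which compares literally; the extra "h.l" element is an assumption added only to show the difference:

```ruby
array = ["hello", "hell", "goodbye", "he", "he", "h.l"]
search = "hello"

# Regexp version: "h.l" is read as a pattern and matches "hel"
with_regexp = array.select { |s| search[/\A#{s}/] }

# Literal version: keep only elements that are a prefix of the search word
with_prefix = array.select { |s| search.start_with?(s) }

p with_regexp # => ["hello", "hell", "he", "he", "h.l"]
p with_prefix # => ["hello", "hell", "he", "he"]
```

Both versions would also select an empty string, since "" is a prefix of every string.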

Turn all the characters other than h in hello to optional.
> array = ["hello", "hell", "goodbye", "he"]
> array.select{|i| i[/^he?l?l?o?/]}
=> ["hello", "hell", "he"]

You could still use a regular expression like this
# define the array
arr = ["hello", "hell", "goodbye", "he"]
# define the search term as an array of its characters
search = "hello".split(//)
#=> ['h','e','l','l','o']
# treat the first character as mandatory: search.shift
# the rest are optional: ['e?','l?','l?','o?'].join
search = search.shift << search.map{|a| "#{a}?"}.join
#=> "he?l?l?o?"
# start at the beginning of the string: \A
arr.grep(/\A#{search}/)
#=> ["hello", "hell", "he"]
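Note that making each character independently optional also accepts strings that skip characters (for example "ho"), and without an end anchor it accepts longer words such as "help". A possible sketch that nests the optional groups and anchors both ends; prefix_regexp is a hypothetical helper name, and it assumes a non-empty search word:

```ruby
# Build /\Ah(?:e(?:l(?:l(?:o)?)?)?)?\z/ for "hello":
# each later character is optional only if all earlier ones are present.
def prefix_regexp(word)
  chars = word.chars
  nested = chars.drop(1).reverse.inject("") do |inner, c|
    "(?:#{Regexp.escape(c)}#{inner})?"
  end
  /\A#{Regexp.escape(chars.first)}#{nested}\z/
end

arr = ["hello", "hell", "goodbye", "he", "ho", "help"]
matches = arr.grep(prefix_regexp("hello"))
p matches # => ["hello", "hell", "he"]
```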

Just as the question reads:
array.select { |w| "hello" =~ /^#{w}/ }
#=> ["hello", "hell", "he"]

Use Array#keep_if (note that it modifies the array in place):
array = ["hello", "hell", "he"]
substrings = array.keep_if{|a| a.start_with?('h')}
=> ["hello", "hell", "he"]

Related

Ruby string char chunking

I have a string "wwwggfffw" and want to break it up into an array as follows:
["www", "gg", "fff", "w"]
Is there a way to do this with regex?
"wwwggfffw".scan(/((.)\2*)/).map(&:first)
scan is a little funny, as it will return either the match or the subgroups depending on whether there are subgroups; we need to use subgroups to ensure repetition of the same character ((.)\1), but we'd prefer it if it returned the whole match and not just the repeated letter. So we need to make the whole match into a subgroup so it will be captured, and in the end we need to extract just the match (without the other subgroup), which we do with .map(&:first).
EDIT to explain the regexp ((.)\2*) itself:
( start group #1, consisting of
( start group #2, consisting of
. any one character
) and nothing else
\2 followed by the content of the group #2
* repeated any number of times (including zero)
) and nothing else.
So in wwwggfffw, (.) captures w into group #2; then \2* captures any additional number of w. This makes group #1 capture www.
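To make the role of the two groups concrete, this small sketch shows what scan returns before the map step. The variable names pairs and runs are only for illustration:

```ruby
# With capture groups, scan returns one array per match:
# [group #1 (the whole run), group #2 (the single character)]
pairs = "wwwggfffw".scan(/((.)\2*)/)
p pairs # => [["www", "w"], ["gg", "g"], ["fff", "f"], ["w", "w"]]

# Keeping only group #1 gives the runs
runs = pairs.map(&:first)
p runs # => ["www", "gg", "fff", "w"]
```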
You can use back references, something like
'wwwggfffw'.scan(/((.)\2*)/).map{ |s| s[0] }
will work
Here's one that's not using regex but works well:
def chunk(str)
  chars = str.chars
  chars.inject([chars.shift]) do |arr, char|
    if arr[-1].include?(char)
      arr[-1] << char
    else
      arr << char
    end
    arr
  end
end
In my benchmarks it's faster than the regex answers here (with the example string you gave, at least).
Another non-regex solution, this one using Enumerable#slice_when, which made its debut in Ruby v.2.2:
str.each_char.slice_when { |a,b| a!=b }.map(&:join)
#=> ["www", "gg", "fff", "w"]
Another option is:
str.scan(Regexp.new(str.squeeze.each_char.map { |c| "(#{c}+)" }.join)).first
#=> ["www", "gg", "fff", "w"]
Here the steps are as follows
s = str.squeeze
#=> "wgfw"
a = s.each_char
#=> #<Enumerator: "wgfw":each_char>
This enumerator generates the following elements:
a.to_a
#=> ["w", "g", "f", "w"]
Continuing
b = a.map { |c| "(#{c}+)" }
#=> ["(w+)", "(g+)", "(f+)", "(w+)"]
c = b.join
#=> "(w+)(g+)(f+)(w+)"
r = Regexp.new(c)
#=> /(w+)(g+)(f+)(w+)/
d = str.scan(r)
#=> [["www", "gg", "fff", "w"]]
d.first
#=> ["www", "gg", "fff", "w"]
Here's one more way of doing it without a regex:
'wwwggfffw'.chars.chunk(&:itself).map{ |s| s[1].join }
# => ["www", "gg", "fff", "w"]

Ruby search for word in string

Given input = "helloworld"
The output should be output = ["hello", "world"]
I have a method called is_in_dict? which returns true if the given string is a word in the dictionary.
So far I tried:
ar = []
input.split("").each do |f|
  ar << f if is_in_dict? f
  # here need to check given char
end
How to achieve it in Ruby?
Instead of splitting the input into characters, you have to inspect all combinations, i.e. "h", "he", "hel", ... "helloworld", "e", "el" , "ell", ... "elloworld" and so on.
Something like this should work:
(0..input.size).to_a.combination(2).each do |a, b|
  word = input[a...b]
  ar << word if is_in_dict?(word)
end
#=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ar
#=> ["hello", "world"]
Or, using each_with_object, which returns the array:
(0..input.size).to_a.combination(2).each_with_object([]) do |(a, b), array|
  word = input[a...b]
  array << word if is_in_dict?(word)
end
#=> ["hello", "world"]
Another approach is to build a custom Enumerator:
class String
  def each_combination
    return to_enum(:each_combination) unless block_given?
    (0..size).to_a.combination(2).each do |a, b|
      yield self[a...b]
    end
  end
end
String#each_combination yields all combinations (instead of just the indices):
input.each_combination.to_a
#=> ["h", "he", "hel", "hell", "hello", "hellow", "hellowo", "hellowor", "helloworl", "helloworld", "e", "el", "ell", "ello", "ellow", "ellowo", "ellowor", "elloworl", "elloworld", "l", "ll", "llo", "llow", "llowo", "llowor", "lloworl", "lloworld", "l", "lo", "low", "lowo", "lowor", "loworl", "loworld", "o", "ow", "owo", "owor", "oworl", "oworld", "w", "wo", "wor", "worl", "world", "o", "or", "orl", "orld", "r", "rl", "rld", "l", "ld", "d"]
It can be used with select to easily filter specific words:
input.each_combination.select { |word| is_in_dict?(word) }
#=> ["hello", "world"]
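The snippets above assume that is_in_dict? already exists. For trying them out, here is a minimal self-contained sketch with a stand-in dictionary; DICTIONARY and its two entries are assumptions for illustration only:

```ruby
require "set"

# Hypothetical dictionary; a real one would be loaded from a word list
DICTIONARY = Set["hello", "world"]

def is_in_dict?(word)
  DICTIONARY.include?(word)
end

input = "helloworld"

# Every pair of indices a < b gives one substring input[a...b]
words = (0..input.size).to_a.combination(2)
                       .map { |a, b| input[a...b] }
                       .select { |word| is_in_dict?(word) }
p words # => ["hello", "world"]
```

Checking every substring takes a number of lookups that grows quadratically with the input length, which is fine for short strings like this one.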
This seems to be a task for recursion. In short, you want to take letters one by one until you get a word which is in the dictionary. This alone, however, does not guarantee that the result is correct, as the remaining letters may not form a word ('hell' + 'oworld'?). This is what I would do:
def split_words(string)
  return [[]] if string == ''
  chars = string.chars
  word = ''
  (1..string.length).map do
    word += chars.shift
    next unless is_in_dict?(word)
    other_splits = split_words(chars.join)
    next if other_splits.empty?
    other_splits.map {|split| [word] + split }
  end.compact.inject([], :+)
end
split_words('helloworld') #=> [['hello', 'world']] No hell!
It will also give you all possible splits, so ambiguous URLs like penisland can be spotted and avoided:
split_words('penisland') #=> [['pen', 'island'], [<the_other_solution>]]
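To see the "all possible splits" behaviour on a runnable example, here is a sketch of the same recursive idea with a stand-in dictionary. WORDS, in_dict? and all_splits are hypothetical names, and the word list is chosen so that "helloworld" has two valid splits:

```ruby
require "set"

# Hypothetical dictionary, chosen to make the input ambiguous
WORDS = Set["hell", "hello", "o", "world"]

def in_dict?(word)
  WORDS.include?(word)
end

# Returns every way to split the string into dictionary words
def all_splits(string)
  return [[]] if string.empty?
  (1..string.length).flat_map do |len|
    head = string[0, len]
    next [] unless in_dict?(head)
    # Prepend the head to every valid split of the remainder
    all_splits(string[len..-1]).map { |rest| [head] + rest }
  end
end

p all_splits("helloworld")
# => [["hell", "o", "world"], ["hello", "world"]]
p all_splits("helloworlds")
# => []
```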

Simple array sort and capitalize

New to Ruby and trying some stuff.
The code below is meant to sort the array, convert it to a string, and display the sorted results. Where I'm struggling is using the capitalize method to capitalize all the sorted words.
the_data = ["dog", "cat", "fish", "zebra", "swan", "rabbit", "horse", "albatros", "frog", "mouse", "duck"]
puts "\nThe array:\n"
puts the_data
puts "\n"
puts "\nThe sorted array, capitalized:\n"
to_display = the_data.sort.join(("\n").capitalize)
puts to_display
You can use Array#map to capitalize each word of the Array
to_display = the_data.sort.map(&:capitalize).join("\n")
# => "Albatros\nCat\nDog\nDuck\nFish\nFrog\nHorse\nMouse\nRabbit\nSwan\nZebra"
If you want to capitalize all the letters, you can use upcase
to_display = the_data.sort.map(&:upcase).join("\n")
# => "ALBATROS\nCAT\nDOG\nDUCK\nFISH\nFROG\nHORSE\nMOUSE\nRABBIT\nSWAN\nZEBRA"
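For context on why the original attempt had no visible effect: in the_data.sort.join(("\n").capitalize), capitalize is applied to the separator string "\n", not to the words. A short sketch of the difference, using a shortened word list as an assumption to keep the output small:

```ruby
words = ["dog", "cat", "fish"]

# capitalize runs on the separator "\n", which has no letters to change
separator_only = words.sort.join("\n".capitalize)
p separator_only # => "cat\ndog\nfish"

# capitalize runs on each word before joining
per_word = words.sort.map(&:capitalize).join("\n")
p per_word # => "Cat\nDog\nFish"
```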

Move elements of an array to a different array in Ruby

Simple Ruby question. Let's say I have an array of 10 strings and I want to move the elements at array[3] and array[5] into a totally new array. The new array would then only have the two elements I moved from the first array, and the first array would then only have 8 elements since two of them have been moved out.
Use Array#slice! to remove the elements from the first array, and append them to the second array with Array#<<:
arr1 = ['Foo', 'Bar', 'Baz', 'Qux']
arr2 = []
arr2 << arr1.slice!(1)
arr2 << arr1.slice!(2)
puts arr1.inspect
puts arr2.inspect
Output:
["Foo", "Baz"]
["Bar", "Qux"]
Depending on your exact situation, you may find other methods on array to be even more useful, such as Enumerable#partition:
arr = ['Foo', 'Bar', 'Baz', 'Qux']
starts_with_b, does_not_start_with_b = arr.partition{|word| word[0] == 'B'}
puts starts_with_b.inspect
puts does_not_start_with_b.inspect
Output:
["Bar", "Baz"]
["Foo", "Qux"]
a = (0..9).map { |i| "el##{i}" }
x = [3, 5].sort_by { |i| -i }.map { |i| a.delete_at(i) }
puts x.inspect
# => ["el#5", "el#3"]
puts a.inspect
# => ["el#0", "el#1", "el#2", "el#4", "el#6", "el#7", "el#8", "el#9"]
As noted in comments, there is some magic to make indices stay in place. This can be avoided by first getting all the desired elements using a.values_at(*indices), then deleting them as above.
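A minimal sketch of that two-step variant, assuming the indices are collected in a hypothetical indices array: the elements are read first with values_at (so they come back in the requested order), and only then deleted from the highest index down so earlier deletions do not shift later ones.

```ruby
a = (0..9).map { |i| "el##{i}" }
indices = [3, 5]

# Step 1: copy the wanted elements without touching the array
moved = a.values_at(*indices)

# Step 2: delete from the highest index down so positions stay valid
indices.sort.reverse_each { |i| a.delete_at(i) }

p moved # => ["el#3", "el#5"]
p a     # => ["el#0", "el#1", "el#2", "el#4", "el#6", "el#7", "el#8", "el#9"]
```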
Code:
arr = ["null","one","two","three","four","five","six","seven","eight","nine"]
p "Array: #{arr}"
third_el = arr.delete_at(3)
fifth_el = arr.delete_at(4)
first_arr = arr
p "First array: #{first_arr}"
concat_el = third_el + "," + fifth_el
second_arr = concat_el.split(",")
p "Second array: #{second_arr}"
Output:
c:\temp>C:\case.rb
"Array: [\"null\", \"one\", \"two\", \"three\", \"four\", \"five\", \"six\", \"seven\", \"eight\", \"nine\"]"
"First array: [\"null\", \"one\", \"two\", \"four\", \"six\", \"seven\", \"eight\", \"nine\"]"
"Second array: [\"three\", \"five\"]"
Why not start deleting from the highest index?
arr = ['Foo', 'Bar', 'Baz', 'Qux']
index_array = [2, 1]
new_ary = index_array.map { |index| arr.delete_at(index) }
new_ary # => ["Baz", "Bar"]
arr # => ["Foo", "Qux"]
Here's one way:
vals = arr.values_at *pulls
arr = arr.values_at *([*(0...arr.size)] - pulls)
Try it.
arr = %w[Now is the time for all Rubyists to code]
pulls = [3,5]
vals = arr.values_at *pulls
#=> ["time", "all"]
arr = arr.values_at *([*(0...arr.size)] - pulls)
#=> ["Now", "is", "the", "for", "Rubyists", "to", "code"]
arr = %w[Now is the time for all Rubyists to code]
pulls = [5,3]
vals = arr.values_at *pulls
#=> ["all", "time"]
arr = arr.values_at *([*(0...arr.size)] - pulls)
#=> ["Now", "is", "the", "for", "Rubyists", "to", "code"]

Finding the product of a variable number of Ruby arrays

I'm looking to find all combinations of single items from a variable number of arrays. How do I do this in Ruby?
Given two arrays, I can use Array#product like this:
groups = []
groups[0] = ["hello", "goodbye"]
groups[1] = ["world", "everyone"]
combinations = groups[0].product(groups[1])
puts combinations.inspect
# [["hello", "world"], ["hello", "everyone"], ["goodbye", "world"], ["goodbye", "everyone"]]
How could this code work when groups contains a variable number of arrays?
groups = [
  %w[hello goodbye],
  %w[world everyone],
  %w[here there]
]
combinations = groups.first.product(*groups.drop(1))
p combinations
# [
# ["hello", "world", "here"],
# ["hello", "world", "there"],
# ["hello", "everyone", "here"],
# ["hello", "everyone", "there"],
# ["goodbye", "world", "here"],
# ["goodbye", "world", "there"],
# ["goodbye", "everyone", "here"],
# ["goodbye", "everyone", "there"]
# ]
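One thing to watch with groups.first.product(...) is that it raises NoMethodError when groups is empty, because groups.first is nil. If that case can occur, a possible alternative is to fold over the groups with inject; cartesian is a hypothetical helper name used only for this sketch:

```ruby
# Build the combinations one group at a time, starting from a single
# empty combination, so an empty list of groups yields [[]].
def cartesian(groups)
  groups.inject([[]]) do |combos, group|
    combos.product(group).map { |combo, item| combo + [item] }
  end
end

groups = [
  %w[hello goodbye],
  %w[world everyone],
  %w[here there]
]

p cartesian(groups) == groups.first.product(*groups.drop(1)) # => true
p cartesian([])                                              # => [[]]
```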

Resources