Ruby array of strings: split into smallest pieces - ruby

I would like to know if there is a method in Ruby that splits an array of strings into the smallest possible pieces. Consider:
['Cheese crayon', 'horse', 'elephant a b c']
Is there a method that turns this into:
['Cheese', 'crayon', 'horse', 'elephant', 'a', 'b', 'c']

p ['Cheese crayon', 'horse', 'elephant a b c'].flat_map(&:split)
# => ["Cheese", "crayon", "horse", "elephant", "a", "b", "c"]

None that I know of. But you can split each string individually and then flatten the results into a single array:
p ['Cheese crayon', 'horse', 'elephant a b c'].map(&:split).flatten

You can do it this way:
array.map { |s| s.split(/\s+/) }.flatten
This splits your string by any number of whitespace characters. As far as I know, it's the default behavior of split without any arguments, so you can shorten it to:
array.map(&:split).flatten
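One small caveat to the equivalence above: split with no arguments also ignores leading whitespace, while an explicit /\s+/ leaves an empty string at the front. A minimal sketch, using a made-up padded string (not from the question) to show the difference:

```ruby
# Hypothetical input with leading, repeated, and trailing whitespace.
padded = "  Cheese   crayon "

# An explicit regex keeps an empty first element for the leading whitespace.
with_regex = padded.split(/\s+/)   # ["", "Cheese", "crayon"]

# No argument: leading and trailing whitespace are ignored.
no_args = padded.split             # ["Cheese", "crayon"]
```

For the strings in the question neither form produces empty elements, so both give the same result there.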

['Cheese crayon', 'horse', 'elephant a b c'].join(' ').split
# => ["Cheese", "crayon", "horse", "elephant", "a", "b", "c"]

Related

How do I split a string without keeping the delimiter?

In Ruby, how do I split a string and not keep the delimiter in the resulting split array? I thought this was the default, but when I try
2.4.0 :016 > str = "a b c"
=> "a b c"
2.4.0 :017 > str.split(/([[:space:]]|,)+/)
=> ["a", " ", "b", " ", "c"]
I see the spaces included in my result. I would like the result to simply be
["a", "b", "c"]
From the String#split documentation:
If pattern contains groups, the respective matches will be returned in the array as well.
Answering your explicitly stated question: do not match the group:
# ⇓⇓ HERE
str.split(/(?:[[:space:]]|,)+/)
or, even without groups:
str.split(/[[:space:],]+/)
or, in more Rubyish way:
'a b, c,d e'.split(/[\p{Space},]+/)
#⇒ ["a", "b", "c", "d", "e"]
String#split splits on whitespace by default, so don't bother with a regex:
"a b c".split # => ["a", "b", "c"]
Try this please
str.split(' ')
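To compare the suggestions in this thread side by side, here is a small sketch using the string from the question. The variable names are only for illustration:

```ruby
str = "a b c"

# Capturing group: the captured delimiter is returned along with the pieces.
capturing = str.split(/([[:space:]]|,)+/)        # ["a", " ", "b", " ", "c"]

# Non-capturing group (?:...): only the pieces are returned.
non_capturing = str.split(/(?:[[:space:]]|,)+/)  # ["a", "b", "c"]

# A character class needs no group at all.
char_class = str.split(/[[:space:],]+/)          # ["a", "b", "c"]

# A single-space string is a special case: it splits on runs of whitespace.
single_space = "a   b c".split(' ')              # ["a", "b", "c"]
```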

Set multiple keys to the same value at once for a Ruby hash

I'm trying to create this huge hash, where there are many keys but only a few values.
So far I have it like so...
du_factor = {
"A" => 1,
"B" => 1,
"C" => 1,
"D" => 2,
"E" => 2,
"F" => 2,
...etc., etc., etc., on and on and on for longer than you even want to know. What's a shorter and more elegant way of creating this hash without flipping its structure entirely?
Edit: Hey so, I realized there was a waaaay easier and more elegant way to do this than the answers given. Just declare an empty hash, then declare some arrays with the keys you want, then use a for statement to insert them into the hash, like so:
du1 = ["A", "B", "C"]
du2 = ["D", "E", "F"]
dufactor = {}
for i in du1
dufactor[i] = 1
end
for i in du2
dufactor[i] = 2
end
...but the fact that nobody suggested that makes me, the extreme Ruby n00b, think that there must be a reason why I shouldn't do it this way. Performance issues?
Combining Ranges with a case block might be another option (depending on the problem you are trying to solve):
case foo
when ('A'..'C') then 1
when ('D'..'F') then 2
# ...
end
Especially if you focus on your source code's readability.
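As a rough sketch of how the case approach could replace the hash lookup, the snippet below wraps it in a method. The method name du_factor_for is hypothetical, and the ranges mirror the A-C and D-F groups from the question:

```ruby
# Hypothetical helper: returns the factor for a single-letter key,
# or nil when the letter falls outside every range.
def du_factor_for(letter)
  case letter
  when 'A'..'C' then 1
  when 'D'..'F' then 2
  end
end

du_factor_for('B')  # 1
du_factor_for('F')  # 2
du_factor_for('Z')  # nil
```

This computes the value on demand instead of storing it, so it only fits when the keys form contiguous ranges.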
How about:
vals_to_keys = {
1 => [*'A'..'C'],
2 => [*'D'..'F'],
3 => [*'G'..'L'],
4 => ['dog', 'cat', 'pig'],
5 => [1,2,3,4]
}
vals_to_keys.each_with_object({}) { |(v,arr),h| arr.each { |k| h[k] = v } }
#=> {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2, "G"=>3, "H"=>3, "I"=>3,
# "J"=>3, "K"=>3, "L"=>3, "dog"=>4, "cat"=>4, "pig"=>4, 1=>5, 2=>5, 3=>5, 4=>5}
What about something like this:
du_factor = Hash.new
["A", "B", "C"].each {|ltr| du_factor[ltr] = 1}
["D", "E", "F"].each {|ltr| du_factor[ltr] = 2}
# Result:
du_factor # => {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2}
Create an empty hash, then for each group of keys that share a value, create an array literal containing the keys, and use the array's '.each' method to batch enter them into the hash. Basically the same thing you did above with for loops, but it gets it done in three lines.
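The same idea can also be written without mutating a hash step by step. This is only a sketch of one alternative; the groups hash is an illustrative name, not something from the question:

```ruby
# Each value maps to the list of keys that should share it.
groups = { 1 => %w[A B C], 2 => %w[D E F] }

# Build [key, value] pairs for every group, then turn the pairs into a hash.
du_factor = groups.flat_map { |value, keys| keys.map { |key| [key, value] } }.to_h
# {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2}
```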
keys = %w(A B C D E F)
values = [1, 1, 1, 2, 2, 2]
du_factor = Hash[*[keys, values].transpose.flatten]
If there are more than 100 of these, storing them in a CSV file might be better.
keys = [%w(A B C), %w(D E F)]
values = [1,2]
values.map!.with_index{ |value, idx| Array(value) * keys[idx].size }.flatten!
keys.flatten!
du_factor = Hash[keys.zip(values)]
Notice that I used destructive methods (methods ending with !). They modify the arrays in place instead of allocating new ones, which helps performance and memory usage.

Ignoring capture group in Regex that is used for repeating the pattern

/((\w)\2)/ finds repeating letters. I was hoping to avoid the two dimensional array that is produced by ignoring the letter matching second capture group like this: /((?:\w)\2)/. It seems that's not possible. Any ideas why?
Rubular example
You don't need any capture groups:
str = [*'a+'..'z+', *'A+'..'Z+', *'0+'..'9+', '_+'].join('|')
#=> "a+|b+| ... |z+|A+|B+| ... |Z+|0+|1+| ... |9+|_+"
"aaabbcddd".scan(/#{str}/)
#=> ["aaa", "bb", "c", "ddd"]
but if you insist on having one:
"aaabbcddd".scan(/(#{str})/).flatten(1)
#=> ["aaa", "bb", "c", "ddd"]
Is this cheating? You did ask if it was possible.
If you mean you're using String#scan, you can post-process the result with Enumerable#map to keep only the first item of each match:
'helloo'.scan(/((\w)\2)/)
# => [["ll", "l"], ["oo", "o"]]
'helloo'.scan(/((\w)\2)/).map { |m| m[0] }
# => ["ll", "oo"]
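As for the "why": a backreference has to point at a capturing group, so the letter group cannot be made non-capturing while \2 still refers to it. One possible workaround, sketched here as an alternative rather than the answerers' method, is String#gsub called with only a pattern. It returns an enumerator over the whole matches, so the inner group does not show up in the result:

```ruby
# One capture group for the letter, referenced as \1.
# gsub with a pattern and no block returns an Enumerator of full matches.
pairs = 'helloo'.gsub(/(\w)\1/).to_a
# ["ll", "oo"]
```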

Format data in string to array?

I need to convert data from a string to an array. The string looks like this:
{a,b,c{1,2,3},d,e,f{11,22,33},g}
The array that I want to receive should look like this:
[a, b, c1, c2, c3, d, e, f11, f22, f33, g]
I tried to use the split method but it works poorly.
arr = str.split(' ');
keys = arr[0][2..-2]
keys = keys.split(',')
Do you have any ideas how it could be implemented?
Here's what I'd use:
string = '{a,b,c{1,2,3},d,e,f{11,22,33},g}'
array = string.scan(/[a-z](?:{.+?})?/).flat_map{ |s|
if s['{']
prefix = s[0]
values = s.scan(/\d+/)
([prefix] * values.size).zip(values).map(&:join)
else
s
end
}
array # => ["a", "b", "c1", "c2", "c3", "d", "e", "f11", "f22", "f33", "g"]
Here's how it works:
string.scan(/[a-z](?:{.+?})?/) # => ["a", "b", "c{1,2,3}", "d", "e", "f{11,22,33}", "g"]
returns the string broken into chunks, looking for a single letter followed by an optional string of { with some text then }.
values = s.scan(/\d+/) # => ["1", "2", "3"], ["11", "22", "33"]
As it's running in flat_map, if { is found, the numbers are scanned out.
([prefix] * values.size).zip(values).map(&:join) # => ["c1", "c2", "c3"], ["f11", "f22", "f33"]
And then an array of the prefix, with the same number of elements as there are values is created and zipped together, resulting in:
[["c", "1"], ["c", "2"], ["c", "3"]], [["f", "11"], ["f", "22"], ["f", "33"]]
The join glues those sub-arrays together. And flat_map flattens any subarrays created so the resulting output is a single array.
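To make the intermediate values concrete, here is the same sequence run by hand on a single chunk. The chunk variable is just an illustration. The last line shows a shorter variant that prepends the prefix directly instead of zipping:

```ruby
chunk  = "c{1,2,3}"
prefix = chunk[0]                # "c"
values = chunk.scan(/\d+/)       # ["1", "2", "3"]

# Pair the prefix with each value, then join each pair.
zipped = ([prefix] * values.size).zip(values)
# [["c", "1"], ["c", "2"], ["c", "3"]]
joined = zipped.map(&:join)      # ["c1", "c2", "c3"]

# Shorter variant with the same result.
shorter = values.map { |v| prefix + v }
```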
You need to use arr = str.split(',') in the first step, because there is no whitespace between the values.
Also keep in mind you have {} to handle too.
This worked for me with simple regex and gsubing (though Tin Man's solution is better ruby):
def my_string_to_array(input_string)
groups = input_string.scan(/\w+\{.*?\}/)
groups.each do |group|
modified = group.gsub(',', ",#{group.match(/\w+/)[0]}").delete("{}")
input_string.gsub!(group, modified)
end
created_array = input_string.delete("{}").split(',')
end
string = '{a,b,c{1,2,3},d,e,f{11,22,33},g}'
my_string_to_array(string)
=> ["a", "b", "c1", "c2", "c3", "d", "e", "f11", "f22", "f33", "g"]
The way it works is that it first finds the groups consisting of letters followed by braces and digits (like c{1,2,3}).
For each such group, it modifies it by gsubing ',' with ',<letter>' and removing the braces.
Next, it replaces these groups with the modified ones in the original string.
And finally it removes the starting and ending braces in the original string, and converts it into an array.

How to grep elements in array that match patterns from another array?

I have two arrays:
a = ["X2", "X3/X4", "X5/X6/X7", "X8/X9/X10/X11"]
b = ["X9/X10", "X3/X4"]
Now I need to select the entries from array 'a' that match any of the entries from array 'b'.
Expected result is:
["X3/X4", "X8/X9/X10/X11"]
How can I do this in Ruby?
I'd do:
a.grep(Regexp.union(b))
# => ["X3/X4", "X8/X9/X10/X11"]
This should work:
a.grep(/#{b.join('|')}/)
# => ["X3/X4", "X8/X9/X10/X11"]
Try the below:
a = ["X2", "X3/X4", "X5/X6/X7", "X8/X9/X10/X11"]
b = ["X9/X10", "X3/X4"]
p a.select{|i| b.any?{|j| i.include? j }}
#>> ["X3/X4", "X8/X9/X10/X11"]
The safest way is to build a regular expression and then select elements of your array matching this expression:
Regexp.union("a", "b", "c")
# => /a|b|c/
Regexp.union(["a", "b", "c"])
# => /a|b|c/
("b".."e").to_a.grep(Regexp.union("a", "b", "c"))
# => ["b", "c"]
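The reason Regexp.union is the safer choice is that it escapes regex metacharacters in string arguments, while interpolating a joined string does not. A small sketch with made-up data (not the arrays from the question) where the two approaches disagree:

```ruby
# Hypothetical data: the pattern contains a dot, which is a regex metacharacter.
candidates = ["X.1", "XY1"]
patterns   = ["X.1"]

# Interpolation: the dot matches any character, so "XY1" matches too.
joined = candidates.grep(/#{patterns.join('|')}/)   # ["X.1", "XY1"]

# Regexp.union escapes the dot, so only the literal "X.1" matches.
escaped = candidates.grep(Regexp.union(patterns))   # ["X.1"]
```

For the arrays in the question both approaches return the same result, since those strings contain no metacharacters.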

Resources