Get index of string scan results in Ruby

I want to get the index as well as the results of a scan
"abab".scan(/a/)
I would like to have not only
=> ["a", "a"]
but also the index of those matches
[1, 3]
any suggestion?

Try this:
res = []
"abab".scan(/a/) do |c|
res << [c, $~.offset(0)[0]]
end
res.inspect # => [["a", 0], ["a", 2]]

There's a gotcha to look out for here, depending on the behaviour you expect.
If you search for /dad/ in "dadad" you only get [["dad", 0]], because scan resumes at the end of each match, so overlapping matches are skipped (which seems wrong to me).
I came up with this alternative:
def scan_str(str, pattern)
  res = []
  (0..str.length).each do |i|
    # \A anchors at the start of the substring; ^ would also match after a newline
    res << [Regexp.last_match.to_s, i] if str[i..-1] =~ /\A#{pattern}/
  end
  res
end
If you wanted you could also do a similar thing with StringScanner from the standard library, it might be faster for long strings.
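A minimal sketch of the StringScanner idea, assuming single-byte (ASCII) input, since StringScanner#pos counts bytes. The method name overlapping_scan is hypothetical. It tests the pattern at every position without advancing the scanner, so overlapping matches are reported:

```ruby
require 'strscan'

# Returns [matched_text, index] pairs, including overlapping matches.
# Assumes ASCII input: StringScanner#pos= is a byte offset.
def overlapping_scan(str, pattern)
  scanner = StringScanner.new(str)
  results = []
  (0...str.length).each do |i|
    scanner.pos = i
    # match? tests at the current position and does not move the pointer
    results << [scanner.matched, i] if scanner.match?(pattern)
  end
  results
end

overlapping_scan("dadad", /dad/) # => [["dad", 0], ["dad", 2]]
```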

Very similar to what @jim has said, and it works a bit better for longer strings:
def matches str, pattern
arr = []
while (str && (m = str.match pattern))
offset = m.offset(0).first
arr << offset + (arr[-1] ? arr[-1] + 1 : 0)
str = str[(offset + 1)..-1]
end
arr
end

It surprised me that there isn't any method similar to String#scan that returns an array of MatchData objects, the way String#match returns one. So, if you like monkey-patching, you can combine this with Todd's solution (Enumerator was introduced in 1.9):
class Regexp
def scan str
Enumerator.new do |y|
str.scan(self) do
y << Regexp.last_match
end
end
end
end
#=> nil
/a/.scan('abab').map{|m| m.offset(0)[0]}
#=> [0, 2]
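If reopening Regexp is not desirable, roughly the same result can be sketched with Object#to_enum, which wraps String#scan in an enumerator. This is only an illustration; the variable names are made up for the example:

```ruby
# Wrap String#scan in an enumerator; inside the block,
# Regexp.last_match is the MatchData for the current match.
matches = "abab".to_enum(:scan, /a/).map { Regexp.last_match }

indexes = matches.map { |m| m.begin(0) } # => [0, 2]
strings = matches.map { |m| m[0] }       # => ["a", "a"]
```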

Related

word_count(s) > Homework for counting letters in a text

My homework is to count the letters in a string regardless of upper or lower case. So far I have this, which I still can't get to work. Any ideas?
def self.word_count_from_file(filename)
s = File.open(filename) { |file| file.read }
word_count(s)
end
def self.words_from_string(s)
s.downcase.scan(/[\w']+/)
end
def self.count_frequency(character)
counts = Hash.new(0)
for chatacter in characters
counts[character] += 1
end
# counts.to_a.sort {|a,b| b[1] <=> a[1]}
# sort by decreasing count, then lexicographically
counts.to_a.sort do |a,b|
[b[1],a[0]] <=> [a[1],b[0]]
end
end
Supposing you need to count words and not characters, I guess you expect to call the class as:
WordCount.word_count_from_string('Words from this string of words')
or
WordCount.word_count_from_file('filename.txt')
Then you need two class methods calling other methods in order to get the result. So, this is one option to make it work:
class WordCount
def self.word_count_from_file(filename)
s = File.open(filename) { |file| file.read }
count_frequency(s)
end
def self.word_count_from_string(s)
count_frequency(s)
end
def self.words_array(s)
s.downcase.scan(/[\w']+/)
end
def self.count_frequency(s)
counts = Hash.new(0)
for character in words_array(s) # <-- there was a typo here
counts[character] += 1
end
counts.to_a.sort do |a,b|
[b[1],a[0]] <=> [a[1],b[0]]
end
end
end
WordCount.word_count_from_string('Words from this string of words')
#=> [["words", 2], ["from", 1], ["of", 1], ["string", 1], ["this", 1]]
WordCount.word_count_from_file('word-count.txt')
#=> [["words", 2], ["this", 1], ["in", 1], ["of", 1], ["string", 1], ["a", 1], ["from", 1], ["file", 1]]
Note that both word_count_from_file and word_count_from_string call count_frequency which calls words_array in order to get and return the result.
To be more Ruby-ish (each) and less Pythonic (for), here is an alternative version that also uses an instance variable (@s) to avoid passing parameters (count_frequency instead of count_frequency(s), etc.).
class WordCount
  def self.word_count_from_file(filename)
    @s = File.open(filename) { |file| file.read }
    count_frequency
  end
  def self.word_count_from_string(str)
    @s = str
    count_frequency
  end
  def self.count_frequency
    words_array.each_with_object(Hash.new(0)) { |word, cnt| cnt[word] += 1 }.sort_by(&:last).reverse
  end
  def self.words_array
    @s.downcase.scan(/[\w']+/)
  end
end
Call as before.
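On Ruby 2.7 or later, the counting step can also be sketched with Enumerable#tally. The method name word_frequencies is hypothetical; the sort key reproduces the "decreasing count, then lexicographic" order from the first version:

```ruby
# Count words case-insensitively, then sort by descending count
# and alphabetically within the same count. Requires Ruby 2.7+ for tally.
def word_frequencies(text)
  text.downcase.scan(/[\w']+/).tally.sort_by { |word, count| [-count, word] }
end

word_frequencies('Words from this string of words')
# => [["words", 2], ["from", 1], ["of", 1], ["string", 1], ["this", 1]]
```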

Ruby difference in array including duplicates

[1,2,3,3] - [1,2,3] produces the empty array []. Is it possible to retain duplicates so it returns [3]?
I am so glad you asked. I would like to see such a method added to the class Array in some future version of Ruby, as I have found many uses for it:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
A description of the method and links to some of its applications are given here.
By way of example:
a = [1,2,3,4,3,2,4,2]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1,3,2,2]
Ruby v2.7 gave us the method Enumerable#tally, allowing us to replace the first line of the method with
h = other.tally
h.default = 0 # tally returns a plain hash, so restore the zero default for missing keys
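For readers who would rather not reopen Array, the same idea can be sketched as a standalone method. The name multiset_difference is hypothetical, and Ruby 2.7+ is assumed for tally:

```ruby
# Remove from `a` one occurrence for each occurrence in `b`,
# keeping the remaining elements of `a` in their original order.
def multiset_difference(a, b)
  remaining = b.tally
  remaining.default = 0 # missing keys count as zero
  a.reject do |e|
    next false if remaining[e].zero? # nothing left to cancel: keep e
    remaining[e] -= 1
    true                             # cancel this occurrence
  end
end

multiset_difference([1, 2, 3, 3], [1, 2, 3])                      # => [3]
multiset_difference([1, 2, 3, 4, 3, 2, 4, 2], [2, 3, 4, 4, 4])    # => [1, 3, 2, 2]
```

Neither input array is modified, because reject returns a new array and tally builds a fresh hash.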
As far as I know, you can't do this with a built-in operation. Can't see anything in the ruby docs either. Simplest way to do this would be to extend the array class like this:
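class Array
  def difference(array2)
    remaining = array2.dup # work on a copy so the caller's array is not mutated
    final_array = []
    each do |item|
      if (index = remaining.find_index(item))
        remaining.delete_at(index)
      else
        final_array << item
      end
    end
    final_array # return the collected items, not the receiver
  end
end
There may well be a more efficient way to do this.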
EDIT:
As suggested by user2864740 in question comments, using Array#slice! is a much more elegant solution
def arr_sub(a,b)
a = a.dup #if you want to preserve the original array
b.each {|del| a.slice!(a.index(del)) if a.include?(del) }
return a
end
Credit:
My original answer
def arr_sub(a,b)
b = b.each_with_object(Hash.new(0)){ |v,h| h[v] += 1 }
a = a.each_with_object([]) do |v, arr|
arr << v if b[v] < 1
b[v] -= 1
end
end
arr_sub([1,2,3,3],[1,2,3]) # => [3]
arr_sub([1,2,3,3,4,4,4],[1,2,3,4,4]) # => [3, 4]
arr_sub([4,4,4,5,5,5,5],[4,4,5,5,5,5,6,6]) # => [4]

Find all the possible permutations using Ruby and recursion

I've been trying to solve a simple quiz question to find all the possible permutation of a string using Ruby and recursion.
I have the following Ruby code:
def permutation(string)
return [string] if string.size < 2
chr = string.chars.first
perms = permutation(string[1..-1])
result = []
for perm in perms
for i in (0..perm.size)
result << (perm[0..i] + chr + perm[i..-1])
end
end
return result
end
Whenever I try to test the code with puts permutation("abc") I get the following output:
cacbc
cbabc
cbcac
cbca
cacb
cbab
cba
Theoretically speaking it's supposed to be a very simple and straightforward problem, but I'm sure I'm doing something wrong. Most probably it's something with the ranges of the loops. And I know that Ruby Array class has instance method permutation to do that but I'm trying to solve it for practising.
Please note that the complexity is O(N!) for the current implementation. Is there any way to improve the performance further?
To see what the difficulty may be, let's try it with an even simpler example:
string = "ab"
Your desired result is ["ab", "ba"]. Let's see what you get:
string.size #=> 2
so we don't return when
return [string] if string.size < 2
#=> return ["ab"] if "ab".size < 2
is executed.
Next we calculate:
chr = string.chars.first #=> "a"
Notice that a more direct way of making this calculation is as follows:
chr = string[0] #=> "a"
or, better, using String#chr,
chr = string.chr #=> "a"
The latter illustrates why chr is not the best choice for the variable name.
Next
perms = permutation(string[1..-1])
#=> perms = permutation("b")
I will now indent the return values to emphasize that we are calling permutation a second time. permutation's argument is:
string #=> "b"
Now when we execute:
return [string] if string.size < 2
#=> return ["b"] if "b".size < 2
we return ["b"], so (back to original call to permutation):
perms = ["b"]
to go with chr => "a", calculated earlier. Next:
result = []
for perm in perms
for i in (0..perm.size)
result << (perm[0..i] + chr + perm[i..-1])
end
end
As perms contains only the single element "b", the two for loops simplify to:
for i in (0.."b".size)
result << ("b"[0..i] + "a" + "b"[i..-1])
end
which is:
for i in (0..1)
result << ("b"[0..i] + "a" + "b"[i..-1])
end
Notice that "b"[0..0], "b"[0..1] and "b"[0..-1] all equal "b"[0], which is just "b", and "b"[1..-1] #=> ''. Therefore, when i => 0, we execute:
result << ("b"[0..0] + "a" + "b"[0..-1])
#=> result << ("b" + "a" + "b")
#=> result << "bab"
and when i => 1:
result << ("b"[0..1] + "a" + "b"[1..-1])
#=> result << ("b" + "a" + "")
#=> result << "ba"
so:
result => ["bab", "ba"]
which clearly is not what you want.
What you need to do is change the double for loops to:
for perm in perms
result << chr + perm
for i in (1..perm.size-1)
result << (perm[0..i-1] + chr + perm[i..-1])
end
result << perm + chr
end
which could be written more compactly by employing the method String#insert:
for perm in perms
for i in (0..perm.size)
result << perm.dup.insert(i,chr)
end
end
which you would normally see written like this:
perms.each_with_object([]) do |perm, result|
(0..perm.size).each { |i| result << perm.dup.insert(i,chr) }
end
Notice that we have to .dup the string before sending insert, as insert modifies the string.
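A small illustration of why the dup matters. The variable names are only for this example, and String.new is used so the string is certainly mutable:

```ruby
perm = String.new("bc")

copy = perm.dup.insert(0, "a")   # insert on a copy => "abc"
after_dup = perm.dup             # snapshot: perm is still "bc"

returned = perm.insert(0, "a")   # insert on perm itself => "abc"
same_object = returned.equal?(perm) # => true, insert returns the mutated receiver
```

Without the dup, every pass through the loop would keep inserting into the same string, so later results would contain extra characters.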
Doing it like this, you don't need result = []. Neither do you need return result, as perms.each_with_object returns result and, if there is no return statement, the method returns the last value calculated. Also, you don't need the temporary variable perms (or ch, if desired).
Putting this all together, we have:
def permutation(string)
return [string] if string.size < 2
ch = string[0]
permutation(string[1..-1]).each_with_object([]) do |perm, result|
(0..perm.size).each { |i| result << perm.dup.insert(i,ch) }
end
end
Let's try it:
permutation("ab")
#=> ["ab", "ba"]
permutation("abc")
#=> ["abc", "bac", "bca", "acb", "cab", "cba"]
permutation("abcd")
#=> ["abcd", "bacd", "bcad", "bcda", "acbd", "cabd",
# "cbad", "cbda", "acdb", "cadb", "cdab", "cdba",
# "abdc", "badc", "bdac", "bdca", "adbc", "dabc",
# "dbac", "dbca", "adcb", "dacb", "dcab", "dcba"]
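One way to gain confidence in the corrected method is to compare it with the built-in Array#permutation. This is just a sanity check under the assumption that the input has distinct characters:

```ruby
def permutation(string)
  return [string] if string.size < 2
  ch = string[0]
  permutation(string[1..-1]).each_with_object([]) do |perm, result|
    (0..perm.size).each { |i| result << perm.dup.insert(i, ch) }
  end
end

# Compare against Array#permutation (order differs, so sort both sides).
expected = "abcd".chars.permutation.map(&:join).sort
matches_builtin = permutation("abcd").sort == expected # => true
count = permutation("abcd").size                        # => 24
```

As for performance: a string of n distinct characters has n! permutations, so any method that returns all of them has to do at least that much work.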
You can use Array#permutation:
def permutation(string)
string.permutation(string.size).to_a
end
permutation('abc'.chars)
# => [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"], ["b", "c", "a"],
# ["c", "a", "b"], ["c", "b", "a"]]
UPDATE Without using Array#permutation:
def permutation(string)
return [''] if string.empty?
(0...string.size).flat_map { |i|
chr, rest = string[i], string[0...i] + string[i+1..-1]
permutation(rest).map { |sub|
chr + sub
}
}
end
permutation('abc')
# => ["abc", "acb", "bac", "bca", "cab", "cba"]

What is the best way to split a string to get all the substrings by Ruby?

For example, from the word "stack", I want to get an array like:
['s', 'st', 'sta', ... 'stack', 't', 'ta', ... , 'c', 'ck', 'k']
I did this with the following code:
def split_word(str)
result = []
chas = str.split("")
len = chas.size
(0..len-1).each do |i|
(i..len-1).each do |j|
result.push(chas[i..j].join)
end
end
result.uniq
end
Is there a better and cleaner way to do that? Thanks.
def split_word s
(0..s.length).inject([]){|ai,i|
(1..s.length - i).inject(ai){|aj,j|
aj << s[i,j]
}
}.uniq
end
And you can also consider using Set instead of Array for the result.
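A minimal sketch of the Set suggestion; the method name unique_substrings is hypothetical. A Set ignores repeated insertions, so no final uniq is needed:

```ruby
require 'set'

# Collect every substring of s into a Set, which drops duplicates on insert.
def unique_substrings(s)
  result = Set.new
  (0...s.length).each do |start|
    (1..s.length - start).each { |len| result << s[start, len] }
  end
  result
end

unique_substrings("aba").to_a # => ["a", "ab", "aba", "b", "ba"]
```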
PS: Here's another idea, based on array product:
def split_word s
indices = (0...s.length).to_a
indices.product(indices).reject{|i,j| i > j}.map{|i,j| s[i..j]}.uniq
end
I'd write:
def split_word(s)
0.upto(s.length - 1).flat_map do |start|
1.upto(s.length - start).map do |length|
s[start, length]
end
end.uniq
end
groups = split_word("stack")
# ["s", "st", "sta", "stac", "stack", "t", "ta", "tac", "tack", "a", "ac", "ack", "c", "ck", "k"]
It's usually more clear and more compact to use map (functional) instead of the pattern init empty + each + append + return (imperative).
def substrings(str)
output = []
(0...str.length).each do |i|
(i...str.length).each do |j|
output << str[i..j]
end
end
output
end
This is just a cleaned-up version of your method, and it works in fewer steps =)
Don't think so.
Here's my attempted version:
def split_word(str)
length = str.length - 1
[].tap do |result|
0.upto(length) do |i|
length.downto(i) do |j|
substring = str[i..j]
result << substring unless result.include?(substring)
end
end
end
end
def substrings(str)
  (0...str.length).flat_map do |i|
    # map (not each) so the substrings are returned instead of the range
    (i...str.length).map { |j| str[i..j] }
  end
end
Just another way to do it, that reads a little clearer to me.
Here is the recursive way to get all the possible sub strings.
def substrings str
return [] if str.size < 1
((0..str.size-1).map do |pos|
str[0..pos]
end) + substrings(str[1..])
end
Way later, but this is what I got from reformatting your code a bit.
def substrings(string)
siz = string.length
answer = []
(0..siz-1).each do |n|
(n..siz-1).each do |i|
answer << string[n..i]
end
end
answer
end

Map an array modifying only elements matching a certain condition

In Ruby, what is the most expressive way to map an array in such a way that certain elements are modified and the others left untouched?
This is a straight-forward way to do it:
old_a = ["a", "b", "c"] # ["a", "b", "c"]
new_a = old_a.map { |x| (x=="b" ? x+"!" : x) } # ["a", "b!", "c"]
Omitting the "leave-alone" case is of course not enough:
new_a = old_a.map { |x| x+"!" if x=="b" } # [nil, "b!", nil]
What I would like is something like this:
new_a = old_a.map_modifying_only_elements_where (Proc.new {|x| x == "b"})
do |y|
y + "!"
end
# ["a", "b!", "c"]
Is there some nice way to do this in Ruby (or maybe Rails has some kind of convenience method that I haven't found yet)?
Thanks everybody for replying. While you collectively convinced me that it's best to just use map with the ternary operator, some of you posted very interesting answers!
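For reference, the helper imagined in the question could be sketched roughly as follows. The name map_where and its signature are hypothetical, not part of Ruby or Rails:

```ruby
# Apply the block only to elements for which `condition` returns true;
# all other elements pass through unchanged.
def map_where(collection, condition)
  collection.map { |x| condition.call(x) ? yield(x) : x }
end

map_where(["a", "b", "c"], ->(x) { x == "b" }) { |y| y + "!" }
# => ["a", "b!", "c"]
```

It is only a thin wrapper around map with a ternary, which is why most answers recommend writing the ternary directly.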
Because an array holds references to its elements, and select returns those same string objects, this also works:
a = ["hello", "to", "you", "dude"]
a.select {|i| i.length <= 3 }.each {|i| i << "!" }
puts a.inspect
# => ["hello", "to!", "you!", "dude"]
In the loop, make sure you use a method that alters the object rather than creating a new object. E.g. upcase! compared to upcase.
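A short illustration of that difference, using upcase and upcase!. The strings are dup'ed only so the example also runs where string literals are frozen:

```ruby
words = %w[hello to you dude].map(&:dup)
short = words.select { |w| w.length <= 3 } # same String objects as in words

short.each { |w| w.upcase }       # builds new strings that are thrown away
after_upcase = words.join(" ")    # => "hello to you dude"

short.each { |w| w.upcase! }      # mutates the shared String objects
after_bang = words.join(" ")      # => "hello TO YOU dude"
```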
The exact procedure depends on what exactly you are trying to achieve. It's hard to nail a definite answer with foo-bar examples.
old_a.map! { |a| a == "b" ? a + "!" : a }
gives
=> ["a", "b!", "c"]
map! modifies the receiver in place, so old_a is now that returned array.
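To make the difference between map and map! concrete, here is a small sketch (variable names are illustrative):

```ruby
old_a = ["a", "b", "c"]

new_a = old_a.map { |x| x == "b" ? x + "!" : x }
untouched = old_a.dup             # old_a is still ["a", "b", "c"] here

result = old_a.map! { |x| x == "b" ? x + "!" : x }
in_place = result.equal?(old_a)   # => true, map! returns the receiver itself
```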
I agree that the map statement is good as it is. It's clear and simple, and would be easy for anyone to maintain.
If you want something more complex, how about this?
module Enumerable
  def enum_filter(&filter)
    FilteredEnumerator.new(self, &filter)
  end
  alias :on :enum_filter
  class FilteredEnumerator
    include Enumerable
    def initialize(enum, &filter)
      @enum, @filter = enum, filter
      if enum.respond_to?(:map!)
        def self.map!
          @enum.map! { |elt| @filter[elt] ? yield(elt) : elt }
        end
      end
    end
    def each
      @enum.each { |elt| yield(elt) if @filter[elt] }
    end
    def each_with_index
      @enum.each_with_index { |elt,index| yield(elt, index) if @filter[elt] }
    end
    def map
      @enum.map { |elt| @filter[elt] ? yield(elt) : elt }
    end
    alias :and :enum_filter
    def or
      FilteredEnumerator.new(@enum) { |elt| @filter[elt] || yield(elt) }
    end
  end
end
%w{ a b c }.on { |x| x == 'b' }.map { |x| x + "!" } #=> [ 'a', 'b!', 'c' ]
require 'set'
Set.new(%w{ He likes dogs}).on { |x| x.length % 2 == 0 }.map! { |x| x.reverse } #=> #<Set: {"likes", "eH", "sgod"}>
('a'..'z').on { |x| x.ord % 6 == 0 }.or { |x| 'aeiouy'[x] }.to_a.join #=> "aefiloruxy"
Your map solution is the best one. I'm not sure why you think map_modifying_only_elements_where is somehow better. Using map is cleaner, more concise, and doesn't require multiple blocks.
One liner:
["a", "b", "c"].inject([]) { |cumulative, i| i == "b" ? (cumulative << "#{i}!") : cumulative }
In the code above, you start with [] "cumulative". As you enumerate through an Enumerator (in our case the array, ["a", "b", "c"]), cumulative as well as "the current" item get passed to our block (|cumulative, i|) and the result of our block's execution is assigned to cumulative. What I do above is keep cumulative unchanged when the item isn't "b" and append "b!" to cumulative array and return it when it is a b.
There is an answer above that uses select, which is the easiest way to do (and remember) it.
You can combine select with map in order to achieve what you're looking for:
arr = ["a", "b", "c"].select { |i| i == "b" }.map { |i| "#{i}!" }
=> ["b!"]
Inside the select block, you specify the conditions for an element to be "selected". This will return an array. You can call "map" on the resulting array to append the exclamation mark to it.
Ruby 2.7+
As of 2.7 there's a definitive answer.
Ruby 2.7 introduced filter_map for this exact purpose. It's idiomatic and performant, and I'd expect it to become the norm very soon.
For example:
numbers = [1, 2, 5, 8, 10, 13]
numbers.filter_map { |i| i * 2 if i.even? }
# => [4, 16, 20]
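Note that filter_map drops the elements for which the block returns nil or false, so it answers a slightly different question from the one asked. A small comparison (Ruby 2.7+ assumed, variable names illustrative):

```ruby
letters = ["a", "b", "c"]

# filter_map keeps only the transformed matches
only_matches = letters.filter_map { |x| x + "!" if x == "b" }
# => ["b!"]

# map with a ternary keeps the non-matching elements untouched
all_elements = letters.map { |x| x == "b" ? x + "!" : x }
# => ["a", "b!", "c"]
```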
Here's a good read on the subject.
Hope that's useful to someone!
If you don't need the old array, I prefer map! in this case, because the ! signals that you are changing the array in place.
self.answers.map!{ |x| (x=="b" ? x+"!" : x) }
I prefer this over:
new_map = self.old_map.map{ |x| (x=="b" ? x+"!" : x) }
It's a few lines long, but here's an alternative for the hell of it:
oa = %w| a b c |
na = oa.partition { |a| a == 'b' }
na.first.collect! { |a| a+'!' }
na.flatten! #Add .sort! here if you wish
p na
# >> ["b!", "a", "c"]
The collect with ternary seems best in my opinion.
I've found that the best way to accomplish this is by using tap
arr = [1,2,3,4,5,6]
[].tap do |a|
arr.each { |x| a << x if x%2==0 }
end
