Ruby Counting chars in a sequence not using regex - ruby

Need help with this code on counting chars in a sequence.
This is what I want:
word("aaabbcbbaaa") == [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
word("aaaaaaaaaa") == [["a", 10]]
word("") == []
Here is my code:
def word(str)
words=str.split("")
count = Hash.new(0)
words.map {|char| count[char] +=1 }
return count
end
I got word("aaabbcbbaaa") => [["a", 6], ["b", 4], ["c", 1]], which is not what I want: it counts every occurrence of a character, whereas I want to count each consecutive run separately. I'd prefer a non-regex solution. Thanks.

Split string by chars, then group chunks by char, then count chars in chunks:
def word str
str
.chars
.chunk{ |e| e }
.map{|(e,ar)| [e, ar.length] }
end
p word "aaabbcbbaaa"
p word("aaaaaaaaaa")
p word ""
Result:
[["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
[["a", 10]]
[]

If you don't want to use a regex, you may just have to do something like:
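def word(str)
  return [] if str.empty? # without this guard, "" would yield [[nil, 0]]
  last, n, result = str.chars.first, 0, []
  str.chars.each do |char|
    if char != last
      result << [last, n] # the run of `last` has ended, so record it
      last, n = char, 1
    else
      n += 1
    end
  end
  result << [last, n] # record the final run
end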
I'd like to use some higher-order function to make this more concise, but there's no appropriate one in the Ruby standard library. Enumerable#partition almost does it, but not quite.
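For readers on newer Rubies, here is a minimal sketch of the higher-order version this answer was looking for. It assumes Ruby 2.3 or later, where Enumerable#chunk_while is available; the method name runs_via_chunk_while is made up for this example.

```ruby
# Assumes Ruby >= 2.3 (Enumerable#chunk_while).
# runs_via_chunk_while is a hypothetical name used only for this sketch.
def runs_via_chunk_while(str)
  str.chars
     .chunk_while { |a, b| a == b }        # extend a run while neighbours match
     .map { |run| [run.first, run.size] }  # [character, run length]
end

p runs_via_chunk_while("aaabbcbbaaa")
#=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
p runs_via_chunk_while("")
#=> []
```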

I'd do the following. Note that each_char is a newer method (Ruby 1.9?) that might not be available on your version, so stick with words=str.split("") in that case.
def word(str)
return [] if str.length == 0
seq_count = []
last_char = nil
count = 0
str.each_char do |char|
if last_char == char
count += 1
else
seq_count << [last_char, count] unless last_char.nil?
count = 1
end
last_char = char
end
seq_count << [last_char, count]
end
[52] pry(main)> word("hello")
=> [["h", 1], ["e", 1], ["l", 2], ["o", 1]]
[54] pry(main)> word("aaabbcbbaaa")
=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
[57] pry(main)> word("")
=> []

Another non-regexp-version.
x = "aaabbcbbaaa"
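def word(str)
  str = str.dup # chomp! mutates its receiver, so work on a copy
  str.squeeze.reverse.chars.each_with_object([]) do |char, list|
    count = 0
    count += 1 until str.chomp!(char).nil?
    list.unshift [char, count] # runs are found back to front, so prepend
  end
end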
p word(x) #=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]

If the world were without regex and chunk:
def word(str)
a = str.chars
b = []
loop do
return b if a.empty?
c = a.slice_before {|e| e != a.first}.first
b << [c.first, c.size]
a = a[c.size..-1]
end
end
word "aaabbcbbaaa" # => [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
word "aaa" # => [["a",3]]
word "" # => []
Here's another way. Initially I tried to find a solution that didn't require converting the string to an array of its characters. I couldn't come up with anything decent until I saw @hirolau's answer, which I modified (note that it consumes str):
def word(str)
list = []
char = str[-1]
loop do
return list if str.empty?
count = 0
count += 1 until str.chomp!(char).nil?
list.unshift [char, count]
char = str[-1]
end
end

You can use this pattern with scan:
"aaabbcbbaaa".scan(/((.)\2*)/)
Group 1 captures each whole run and group 2 the repeated character, so the length of group 1 is the count for that run.
For example:
"aaabbcbbaaaa".scan(/((.)\2*)/).map do |x,y| [y, x.length] end

Related

Do something in middle of recursive function, then return as needed

How do I do something in the middle of a recursion, and return as needed? In other words, maybe no more recursion is needed because I have found a "solution" in which case to save resources, the recursion can stop.
For example, let's say I have a working permute method that does this
permute([["a","b"],[1,2]])
>>> [["a", 1], ["a", 2], ["b", 1], ["b", 2]]
Rather than have the method generate all 4 possibilities, if one meets my requirements, I'd like it to stop. For example, let's say I'm searching for ["a",2], then the method can stop after it creates the second possibility.
This is my current permute method that is working
def permute(arr)
if arr.length == 1
return arr.first
else
first = arr.shift
return first.product(permute(arr)).uniq
end
end
I feel like I need to inject a do block somewhere with something like the below, but not sure how/where...
if result_of_permutation_currently == ["a",2]
return ...
else
# continuing the permutations
end
You could write your method as follows.
def partial_product(arr, last_element)
  @a = []
  @last_element = last_element
  recurse(arr)
  @a
end
def recurse(arr, element = [])
  first, *rest = arr
  if rest.empty?
    first.each do |e|
      el = element + [e]
      @a << el
      return true if el == @last_element
    end
  else
    first.each do |e|
      rv = recurse(rest, element + [e])
      return true if rv
    end
  end
  false
end
arr = [["a","b"], [1,2,3], ["cat","dog"]]
partial_product(arr, ["b",2,"dog"])
#=> [["a", 1, "cat"], ["a", 1, "dog"], ["a", 2, "cat"],
# ["a", 2, "dog"], ["a", 3, "cat"], ["a", 3, "dog"],
# ["b", 1, "cat"], ["b", 1, "dog"], ["b", 2, "cat"],
# ["b", 2, "dog"]]
partial_product(arr, ["a",1,"dog"])
#=> [["a", 1, "cat"], ["a", 1, "dog"]]
partial_product(arr, ["b",2,"pig"])
#=> [["a", 1, "cat"], ["a", 1, "dog"], ["a", 2, "cat"],
# ["a", 2, "dog"], ["a", 3, "cat"], ["a", 3, "dog"],
# ["b", 1, "cat"], ["b", 1, "dog"], ["b", 2, "cat"],
# ["b", 2, "dog"], ["b", 3, "cat"], ["b", 3, "dog"]]
If you prefer to avoid using instance variables, you could carry a and last_element as arguments in recurse, but there would be inefficiencies by doing so, particularly in terms of memory use.
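As a rough sketch of that alternative, the accumulator and the target can be threaded through the recursion as arguments. The names partial_product_args and recurse_args are hypothetical, chosen here so they don't clash with the methods above; the early-exit logic is otherwise the same.

```ruby
# Hypothetical variant: no instance variables, state is passed as arguments.
def partial_product_args(arr, last_element)
  acc = []
  recurse_args(arr, last_element, acc)
  acc
end

def recurse_args(arr, last_element, acc, element = [])
  first, *rest = arr
  first.each do |e|
    el = element + [e]
    if rest.empty?
      acc << el                          # a complete combination
      return true if el == last_element  # target reached: stop everything
    elsif recurse_args(rest, last_element, acc, el)
      return true                        # propagate the stop signal upwards
    end
  end
  false
end

arr = [["a","b"], [1,2,3], ["cat","dog"]]
p partial_product_args(arr, ["a",1,"dog"])
#=> [["a", 1, "cat"], ["a", 1, "dog"]]
```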
Here are two ways that could be done without using recursion.
Use each to generate elements of the desired array until the target pair is reached
def permute(arr1, arr2, last_pair = [])
arr1.each_with_object([]) do |e1,a|
arr2.each do |e2|
a << [e1, e2]
break a if [e1, e2] == last_pair
end
end
end
permute(["a","b"],[1,2],["b", 1])
#=> [["a", 1], ["a", 2], ["b", 1]]
permute(["a","b"],[1,2],["b", 99])
#=> [["a", 1], ["a", 2], ["b", 1], ["b", 2]]
permute(["a","b"],[1,2])
#=> [["a", 1], ["a", 2], ["b", 1], ["b", 2]]
permute(["a","b"],[],["b", 1])
#=> []
permute([],[1,2],["b", 1])
#=> []
permute([],[],["b", 1])
#=> []
Map a sequence of the indices of the desired array
def permute(arr1, arr2, last_pair = [])
n1 = arr1.size
n2 = arr2.size
idx1 = arr1.index(last_pair.first)
idx2 = idx1.nil? ? nil : arr2.index(last_pair.last)
return arr1.product(arr2) if idx2.nil?
0.step(to: idx1*n2+idx2).
map {|i| [arr1[(i % (n1*n2))/n2], arr2[i % n2]]}
end
permute(["a","b"],[1,2],["b", 1])
See Numeric#step
idx1*n2 + idx2, the index of the last element of the array to be returned (so the array has idx1*n2 + idx2 + 1 elements), is computed as follows.
last_pair = ["b", 1]
n2 = arr2.size
#=> 2
idx1 = arr1.index(last_pair.first)
#=> 1
idx2 = idx1.nil? ? nil : arr2.index(last_pair.last)
#=> 0
idx1*n2 + idx2
#=> 2
The element at index i of the array returned is:
n1 = arr1.size
#=> 2
[arr1[(i % (n1*n2))/n2], arr2[i % n2]]
#=> [["a","b"][(i % (2*2))/2], [1,2][i % 2]]
For i = 1 this is
[["a","b"][(1 % 4)/2], [1,2][1 % 2]]
#=> [["a","b"][0], [1,2][1]]
#=> ["a", 2]
For i = 2 this is
[["a","b"][(2 % 4)/2], [1,2][2 % 2]]
#=> [["a","b"][1], [1,2][0]]
#=> ["b", 1]
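The index arithmetic above can be checked against Array#product directly. This is only a sanity-check sketch; pair_at is a hypothetical helper that applies the formula from the text.

```ruby
# pair_at is a hypothetical helper: it rebuilds the pair at flat index i.
def pair_at(arr1, arr2, i)
  n1 = arr1.size
  n2 = arr2.size
  [arr1[(i % (n1 * n2)) / n2], arr2[i % n2]]
end

arr1 = ["a", "b"]
arr2 = [1, 2]
rebuilt = (0...arr1.size * arr2.size).map { |i| pair_at(arr1, arr2, i) }
p rebuilt == arr1.product(arr2) #=> true
```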
Note that we cannot write
arr1.lazy.product(arr2).first(idx1*n2+idx2+1)
because arr1.lazy returns a lazy enumerator (arr1.lazy #=> #<Enumerator::Lazy: ["a", "b"]>), but Array#product requires its receiver to be an array. It's for that reason that some Rubyists would like to see product made an Enumerable method (with a lazy version), but don't hold your breath.
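A lazy cross product can still be approximated with Enumerator::Lazy#flat_map. The sketch below assumes only two arrays, and lazy_pairs is a hypothetical helper rather than anything built into Ruby. Each row over arr2 is still built eagerly; only the iteration over arr1 is lazy.

```ruby
# lazy_pairs is a hypothetical helper, not part of Ruby.
def lazy_pairs(arr1, arr2)
  # flat_map flattens one level, so each yielded item is an [e1, e2] pair.
  arr1.lazy.flat_map { |e1| arr2.map { |e2| [e1, e2] } }
end

p lazy_pairs(["a", "b"], [1, 2]).first(3)
#=> [["a", 1], ["a", 2], ["b", 1]]
```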

Consecutive letter frequency

I am trying to write code to determine consecutive frequency of letters within a string.
For example:
"aabbcbb" => ["a",2],["b",2],["c", 1], ["b", 2]
My code gives me the first letter frequency but doesn't move on to the next.
def encrypt(str)
array = []
count = 0
str.each_char do |letter|
if array.empty?
array << letter
count += 1
elsif array.last == letter
count += 1
else
return [array, count]
array = []
end
end
end
p "aabbcbb".chars.chunk{|c| c}.map{|c, a| [c, a.size]}
# => [["a", 2], ["b", 2], ["c", 1], ["b", 2]]
"aabbcbb".chars.slice_when(&:!=).map{|a| [a.first, a.length]}
# => [["a", 2], ["b", 2], ["c", 1], ["b", 2]]
There's a simple regular expression-based solution involving back-references:
"aabbbcbb".scan(/((.)\2*)/).map { |m,c| [c, m.length] }
# => [["a", 2], ["b", 3], ["c", 1], ["b", 2]]
But I would prefer the chunk method for clarity (and almost certainly efficiency).
Actually out of curiosity, I wrote a quick benchmark and scan is a little more than four times faster than chunk.map, but I'd still use chunk.map for clarity unless you're actually doing this hundreds of thousands of times:
require 'benchmark'
N = 10000
data = ('a'..'z').map { |c| c * 10 }.join("")
Benchmark.bm do |bm|
bm.report do
N.times { data.chars.chunk{ |c| c }.map { |c, a| [c, a.size] } }
end
bm.report do
N.times { data.scan(/((.)\2*)/).map { |m,c| [c, m.size] } }
end
end
user system total real
0.800000 0.010000 0.810000 ( 0.803824)
0.190000 0.000000 0.190000 ( 0.192915)
You need to build up an array of results, rather than simply stopping at the first one:
def consecutive_frequencies(str)
str.each_char.reduce([]) do |frequencies_arr, char|
if frequencies_arr.last && frequencies_arr.last[0] == char
frequencies_arr.last[1] += 1
else
frequencies_arr << [char, 1]
end
frequencies_arr
end
end
@steenslag gave the answer I would have given, so I'll try something different.
"aabbcbb".each_char.with_object([]) { |c,a| (a.any? && c == a.last.first) ?
a.last[-1] += 1 : a << [c, 1] }
#=> [["a", 2], ["b", 2], ["c", 1], ["b", 2]]
def encrypt(str)
count = 0
array = []
str.chars do |letter|
if array.empty?
array << letter
count += 1
elsif array.last == letter
count += 1
else
puts "[#{array}, #{count}]"
array.clear
count = 0
array << letter
count += 1
end
end
puts "[#{array}, #{count}]"
end
There are several errors with your implementation, I would try with a hash (rather than an array) and use something like this:
def encrypt(str)
count = 0
hash = {}
str.each_char do |letter|
if hash.key?(letter)
hash[letter] += 1
else
hash[letter] = 1
end
end
return hash
end
puts encrypt("aabbcbb")

Count consecutives

I need to write a method that does the following
consecutive_count("aaabbcbbaaa") == [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
I have working code, but it looks ugly and I'd like to see a better solution. Please advise.
Here is my code:
def consecutive_count(str)
el = str[0]; count = 0; result = []
str.split("").each do |l|
if (el != l)
result << [el, count]
count = 1
el = l
else
count +=1
end
end
result << [el, count] if !el.nil?
return result
end
Here is one way :
s = "aaabbcbbaaa"
s.chars.chunk{|e| e }.map{|item,ary| [item,ary.size]}
# => [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
"aaabbcbbaaa".scan(/(?<s>(?<c>.)\k<c>*)/).map{|s, c| [c, s.length]}
# => [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
or
"aaabbcbbaaa".scan(/((.)\2*)/).map{|s, c| [c, s.length]}
# => [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
A solution which does not involve regex magic (although those are a bit shorter and probably faster) is this:
str.each_char.each_with_object([]) do |char, result|
if (result.last || [])[0] == char
result.last[1] += 1
else
result << [char, 1]
end
end
Depending on your level of understanding, it might convey your intent more clearly, which may help when you have to debug it in six months :)
Regexp solution:
my_s = "aaabbcbbaaa"
p my_s.scan(/(.)(\1*)/).map{|x,y| [x, y.size + 1]}
#=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
or
a, result = "aaabbcbbaaa", []
result << a.slice!(/(\w)\1*/) until a.empty?
and then map the result with counts.
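The "map the result with counts" step might look like the sketch below. Two things here are my own changes, not the answer's: the pattern is (.) with the /m flag anchored at \A instead of (\w), so that non-word characters and newlines cannot leave the loop stuck, and the string is duplicated so the caller's copy survives. runs_with_counts is a hypothetical name.

```ruby
# runs_with_counts is a hypothetical name used only for this sketch.
def runs_with_counts(str)
  rest = str.dup   # slice! mutates its receiver, so work on a copy
  runs = []
  # Repeatedly cut the leading run (one character plus its repeats) off the front.
  runs << rest.slice!(/\A(.)\1*/m) until rest.empty?
  runs.map { |run| [run[0], run.size] }
end

p runs_with_counts("aaabbcbbaaa")
#=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
```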
You can try:
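def consecutive_count(str)
  str = str.dup # the loop below consumes the string, so work on a copy
  result = []
  until str.empty?
    char = str[0]
    count = 0
    # Core Ruby spells this start_with?; starts_with? comes from ActiveSupport.
    while str.start_with?(char)
      count += 1
      str[0] = "" # drop the leading character
    end
    result << [char, count]
  end
  result
end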

Ruby String Encode Consecutive Letter Frequency

I want to encode a string in Ruby so that the output is a list of pairs that I can later decode. Each pair should contain the next distinct letter in the string and the number of consecutive repeats.
E.g. if I encode "aaabbcbbaaa", the output should be
[["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
Here is my code:
def encode( s )
b = 0
e = s.length - 1
ret = []
while ( s <= e )
m = s.match( /(\w)\1*/ )
l = m[0][0]
n = m[0].length
ret << [l, n]
end
ret
end
"aaabbcbbaaa".chars.chunk{|i| i}.map{|m,n| [m,n.count(m)]}
#=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
"aaabbcbbaaa".scan(/((.)\2*)/).map{|s, c| [c, s.length]}
You could also do this procedurally.
def group_consecutive(input)
groups = []
input.each_char do |c|
if groups.empty? || groups.last[0] != c
groups << [c, 1]
else
groups.last[1] += 1
end
end
groups
end
'aaabbcbbaaa'.scan(/((.)\2*)/).map {|e| [e[1], e[0].size]}
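Since the question mentions decoding, here is a minimal round-trip sketch built on the chunk approach above. The names rle_encode and rle_decode are hypothetical, picked so they don't clash with the question's encode.

```ruby
# rle_encode / rle_decode are hypothetical names used only for this sketch.
def rle_encode(str)
  str.chars.chunk { |c| c }.map { |char, run| [char, run.size] }
end

def rle_decode(pairs)
  # String#* repeats the character; join glues the runs back together.
  pairs.map { |char, count| char * count }.join
end

pairs = rle_encode("aaabbcbbaaa")
p pairs
#=> [["a", 3], ["b", 2], ["c", 1], ["b", 2], ["a", 3]]
p rle_decode(pairs)
#=> "aaabbcbbaaa"
```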

How to count duplicates in Ruby Arrays

How do you count duplicates in a ruby array?
For example, if my array had three a's, how could I count that?
Another version of a hash with a key for each element in your array and value for the count of each element
a = [ 1, 2, 3, 3, 4, 3]
h = Hash.new(0)
a.each { | v | h.store(v, h[v]+1) }
# h = { 3=>3, 2=>1, 1=>1, 4=>1 }
Given:
arr = [ 1, 2, 3, 2, 4, 5, 3]
My favourite way of counting elements is:
counts = arr.group_by{|i| i}.map{|k,v| [k, v.count] }
# => [[1, 1], [2, 2], [3, 2], [4, 1], [5, 1]]
If you need a hash instead of an array:
Hash[*counts.flatten]
# => {1=>1, 2=>2, 3=>2, 4=>1, 5=>1}
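On Ruby 2.1 or later, Array#to_h does the same conversion without flattening, which also keeps it safe when the elements are themselves arrays. A small sketch, with counts_hash as a hypothetical wrapper name:

```ruby
# Assumes Ruby >= 2.1 (Array#to_h). counts_hash is a hypothetical name.
def counts_hash(arr)
  arr.group_by { |i| i }.map { |k, v| [k, v.count] }.to_h
end

arr = [1, 2, 3, 2, 4, 5, 3]
p counts_hash(arr) == { 1 => 1, 2 => 2, 3 => 2, 4 => 1, 5 => 1 } #=> true
```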
This will yield the duplicate elements as a hash with the number of occurrences for each duplicate item. Let the code speak:
#!/usr/bin/env ruby
class Array
# monkey-patched version
def dup_hash
inject(Hash.new(0)) { |h,e| h[e] += 1; h }.select {
|k,v| v > 1 }.inject({}) { |r, e| r[e.first] = e.last; r }
end
end
# non-monkey-patched version
def dup_hash(ary)
ary.inject(Hash.new(0)) { |h,e| h[e] += 1; h }.select {
|_k,v| v > 1 }.inject({}) { |r, e| r[e.first] = e.last; r }
end
p dup_hash([1, 2, "a", "a", 4, "a", 2, 1])
# {"a"=>3, 1=>2, 2=>2}
p [1, 2, "Thanks", "You're welcome", "Thanks",
"You're welcome", "Thanks", "You're welcome"].dup_hash
# {"You're welcome"=>3, "Thanks"=>3}
Simple.
arr = [2,3,4,3,2,67,2]
repeats = arr.length - arr.uniq.length
puts repeats
arr = %w( a b c d c b a )
# => ["a", "b", "c", "d", "c", "b", "a"]
arr.count('a')
# => 2
Another way to count array duplicates is:
arr= [2,2,3,3,2,4,2]
arr.group_by{|x| x}.map{|k,v| [k,v.count] }
result is
[[2, 4], [3, 2], [4, 1]]
requires 1.8.7+ for group_by
ary = %w{a b c d a e f g a h i b}
ary.group_by{|elem| elem}.select{|key,val| val.length > 1}.map{|key,val| key}
# => ["a", "b"]
with 1.9+ this can be slightly simplified because Hash#select will return a hash.
ary.group_by{|elem| elem}.select{|key,val| val.length > 1}.keys
# => ["a", "b"]
To count instances of a single element use inject
array.inject(0){|count,elem| elem == value ? count+1 : count}
arr = [1, 2, "a", "a", 4, "a", 2, 1]
arr.group_by(&:itself).transform_values(&:size)
#=> {1=>2, 2=>2, "a"=>3, 4=>1}
Ruby >= 2.7 solution here:
A new method .tally has been added.
Tallies the collection, i.e., counts the occurrences of each element. Returns a hash with the elements of the collection as keys and the corresponding counts as values.
So now, you will be able to do:
["a", "b", "c", "b"].tally #=> {"a"=>1, "b"=>2, "c"=>1}
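To keep only the elements that actually repeat, tally can be combined with Hash#select. This is a sketch assuming Ruby 2.7 or later; duplicate_counts is a hypothetical name.

```ruby
# Assumes Ruby >= 2.7 (Enumerable#tally). duplicate_counts is a hypothetical name.
def duplicate_counts(arr)
  arr.tally.select { |_elem, count| count > 1 }
end

p duplicate_counts([1, 2, "a", "a", 4, "a", 2, 1]) == { 1 => 2, 2 => 2, "a" => 3 } #=> true
```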
What about a grep?
arr = [1, 2, "Thanks", "You're welcome", "Thanks", "You're welcome", "Thanks", "You're welcome"]
arr.grep('Thanks').size # => 3
It's easy:
words = ["aa","bb","cc","bb","bb","cc"]
One line simple solution is:
words.each_with_object(Hash.new(0)) { |word,counts| counts[word] += 1 }
It works for me.
Thanks!!
I don't think there's a built-in method. If all you need is the total count of duplicates, you could take a.length - a.uniq.length. If you're looking for the count of a single particular element, try
a.select {|e| e == my_element}.length.
Improving @Kim's answer:
arr = [1, 2, "a", "a", 4, "a", 2, 1]
Hash.new(0).tap { |h| arr.each { |v| h[v] += 1 } }
# => {1=>2, 2=>2, "a"=>3, 4=>1}
Ruby code to get the repeated elements in the array:
numbers = [1,2,3,1,2,0,8,9,0,1,2,3]
seen = [] # elements encountered so far
similar = numbers.each_with_object([]) do |n, dups|
  dups << n if seen.include?(n)
  seen << n
end
print "similar --> ", similar
Another way to do it is to use each_with_object:
a = [ 1, 2, 3, 3, 4, 3]
hash = a.each_with_object({}) {|v, h|
h[v] ||= 0
h[v] += 1
}
# hash = { 3=>3, 2=>1, 1=>1, 4=>1 }
With this approach, looking up a missing key such as hash[5] returns nil, rather than 0 as with Kim's solution.
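The difference in missing-key behaviour can be seen side by side in the sketch below. Both method names are hypothetical; the first mirrors the Hash.new(0) style used in earlier answers, the second the plain {} style shown here.

```ruby
# Both helper names are hypothetical, used only for this comparison.
def count_with_default(arr)
  arr.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
end

def count_without_default(arr)
  arr.each_with_object({}) do |v, h|
    h[v] ||= 0
    h[v] += 1
  end
end

a = [1, 2, 3, 3, 4, 3]
p count_with_default(a)[5]    #=> 0
p count_without_default(a)[5] #=> nil
# The stored counts are identical; only the lookup of missing keys differs.
p count_with_default(a) == count_without_default(a) #=> true
```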
I've used reduce/inject for this in the past, like the following
array = [1,5,4,3,1,5,6,8,8,8,9]
array.reduce(Hash.new(0)) {|counts, el| counts[el]+=1; counts}
produces
=> {1=>2, 5=>2, 4=>1, 3=>1, 6=>1, 8=>3, 9=>1}
