Ruby element match - ruby

I am trying to find the index of the first and second instance of a string variable. I want to be able to use any predefined string variable but when I try to do that it gives me an error. I want to be able to declare multiple string variables like ss, aa, ff, etc and use them in place of xx. Can someone help me out?
#aa is a predefined array
xx = "--help--"
find_xx_instance = aa.each_with_index.select{|i,idx| i =~ /xx/}
#/--help--/works but not /xx/
find_xx_instance.map! {|i| i[1]}
#gives me info between the first two instances of string
puts aa[find_xx_instance[0]+1..find_xx_instance[1]-1]

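As far as I understand, you just need to interpolate the variable into the regular expression. /xx/ matches the literal characters "xx", whereas /#{xx}/ inserts the value of xx. Try this: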
find_xx_instance = aa.each_with_index.select{|i,idx| i =~ /#{xx}/}
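If the string may contain characters that are special in a regex (such as "." or "*"), it can help to escape it before interpolating. A minimal sketch, assuming a hypothetical helper named between_first_two and made-up sample data, neither of which is in the original question:

```ruby
# Hypothetical helper: returns the elements strictly between the first
# two elements of `lines` that contain `marker`.
def between_first_two(lines, marker)
  # Regexp.escape makes characters such as "." or "*" match literally.
  pattern = /#{Regexp.escape(marker)}/
  hits = lines.each_index.select { |i| lines[i] =~ pattern }
  return [] if hits.size < 2

  lines[(hits[0] + 1)..(hits[1] - 1)]
end

# Sample data invented for illustration.
aa = ["intro", "--help--", "usage", "options", "--help--", "tail"]
xx = "--help--"
p between_first_two(aa, xx)
#=> ["usage", "options"]
```

Because the marker is a method argument, any other string variable (ss, ff, and so on) can be passed in the same way.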

I have assumed you are given an array of strings, arr, a string str, and an integer nbr, and wish to return an array containing the indices of the first nbr instances of str in arr.
For example:
arr = %w| Now is the time for the Zorgs to attack the Borgs |
#=> ["Now", "is", "the", "time", "for", "the", "Zorgs", "to", "attack", "the", "Borgs"]
str = "the"
nbr = 2
This is one way:
b = arr.each_index.select { |i| arr[i]==str }
#=> [2, 5, 9]
b.first(nbr)
#=> [2, 5]
which can be written
arr.each_index.select { |i| arr[i]==str }.first(nbr)
For small problems like this one, that's fine, but if arr is large, it would be better to terminate the calculations after nbr instances of str have been found. We can do that by creating a Lazy enumerator:
arr.each_index.lazy.select { |i| arr[i]==str }.first(nbr)
#=> [2, 5]
Here's a second example that clearly illustrates that lazy is stopping the calculations after nbr strings str in arr have been found:
(0..Float::INFINITY).lazy.select { |i| arr[i] == str }.first(nbr)
#=> [2, 5]
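One way to see the early termination is to count how many elements each version examines. This is only an illustrative sketch; the counters eager_checked and lazy_checked are names introduced here, not part of the answer above:

```ruby
arr = %w[Now is the time for the Zorgs to attack the Borgs]
str = "the"
nbr = 2

# Eager version: select visits every index before first(nbr) is applied.
eager_checked = 0
eager = arr.each_index.select { |i|
  eager_checked += 1
  arr[i] == str
}.first(nbr)

# Lazy version: elements are pulled one at a time until nbr matches are found.
lazy_checked = 0
lazy = arr.each_index.lazy.select { |i|
  lazy_checked += 1
  arr[i] == str
}.first(nbr)

p [eager, eager_checked] #=> [[2, 5], 11]
p [lazy, lazy_checked]   #=> [[2, 5], 6]
```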

Related

Optimizing Bird Migration Challenge

I am currently working on a challenge where the guidelines are as follows:
You have been asked to help study the population of birds migrating across the continent. Each type of bird you are interested in will be identified by an integer value. Each time a particular kind of bird is spotted, its id number will be added to your array of sightings. You would like to be able to find out which type of bird is most common given a list of sightings. Your task is to print the type number of that bird and if two or more types of birds are equally common, choose the type with the smallest ID number.
For example, assume your bird sightings are of types arr = [1, 1, 2, 2, 3]. There are two each of types 1 and 2, and one sighting of type 3. Pick the lower of the two types seen twice: type 1.
I have written code that passes most of the tests but times out on the exceedingly large inputs. I would like your advice on how to optimize it.
My code is as follows:
def migratoryBirds(arr)
sorted = Hash[arr.map { |x| [x, arr.select { |y| y==x }.count] }]
return sorted.max_by { |k,v| v }[0]
end
Your sorted hash can be written a bit more concisely as:
sorted = arr.map { |x| [x, arr.count(x)] }.to_h
For the example array [1, 1, 2, 2, 3] this is equivalent to:
[
[1, arr.count(1)], # counts all 1's in arr
[1, arr.count(1)], # counts all 1's in arr (again)
[2, arr.count(2)], # counts all 2's in arr
[2, arr.count(2)], # counts all 2's in arr (again)
[3, arr.count(3)] # counts all 3's in arr
].to_h
Not only does it count 1 and 2 twice, it also has to traverse the entire array again for each count call (or select in your code), so the running time grows quadratically with the size of the array.
A better approach is to traverse the array once and use a hash for counting the occurrences:
arr = [1, 1, 2, 2, 3]
sorted = Hash.new(0)
arr.each { |x| sorted[x] += 1 }
sorted #=> {1=>2, 2=>2, 3=>1}
This can also be written in a single line via each_with_object:
sorted = arr.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
#=> {1=>2, 2=>2, 3=>1}
Ruby 2.7 and later even have a dedicated method, tally, to count occurrences:
sorted = arr.tally
#=> {1=>2, 2=>2, 3=>1}
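Putting the single-pass count together with the tie-breaking rule from the challenge (smallest ID wins) might look like the sketch below. The method name migratory_birds and the [count, -id] comparison key are my own choices for illustration, not something given in the challenge or in the answer above:

```ruby
# Sketch: returns the most frequently sighted bird type; ties go to the
# smallest ID, as the challenge requires.
def migratory_birds(sightings)
  # Single pass over the array to count each type.
  counts = sightings.each_with_object(Hash.new(0)) { |id, h| h[id] += 1 }
  # Compare by count first; negating the ID makes the smaller ID win ties.
  counts.max_by { |id, count| [count, -id] }.first
end

p migratory_birds([1, 1, 2, 2, 3]) #=> 1
p migratory_birds([2, 2, 1, 1, 3]) #=> 1
```

Comparing on the count alone would return whichever tied type was inserted into the hash first, which is not necessarily the smallest ID.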

How do I figure out the indexes of where a string was split?

I'm using Rails 4.2.7 (Ruby 2.3). I use the following to split a string based on a regular expression:
my_string.split(/\W+/)
My question is, how can I get an equivalent array of numbers in which each number represents the index of where the splits occurred? If no splits occurred, I would expect the array to only contain an element with "-1".
So, you don't want to split, you just want the indices?
a = []
"Hello stack overflow".scan(/\W+/){a<<Regexp.last_match.begin(0)}
a
#=> [5, 11]
An empty array would mean that no split occurred.
Edit: A shorter version could be
"Hello stack overflow".enum_for(:scan, /\W+/).map{Regexp.last_match.begin(0)}
#=> [5, 11]
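If the [-1] convention from the question is really needed, the scan approach can be wrapped in a small method. A minimal sketch, assuming a hypothetical method name split_indices:

```ruby
# Sketch: returns the start index of every separator match, or [-1] when
# the pattern never matches (the convention requested in the question).
def split_indices(str, pattern)
  indices = []
  str.scan(pattern) { indices << Regexp.last_match.begin(0) }
  indices.empty? ? [-1] : indices
end

p split_indices("Hello stack overflow", /\W+/) #=> [5, 11]
p split_indices("Hello", /\W+/)                #=> [-1]
```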
In some cases String#split will split a string on substrings having multiple characters. The offset into the string of the beginning of each such substring may not be sufficient; both the start and end indices may be needed.
The following method returns an array of tuples, the first element of each tuple being a portion of the string returned by split, the second being the index where that portion begins. Not only does this provide a way to compute the desired indices, but it also returns the substrings returned by split.
def split_with_offsets(str, r)
indices = []
str.scan(r) { indices << Regexp.last_match.begin(0) }
str.split(r).zip(indices << str.size).map { |s,ndx| [s, ndx-s.size] }
end
With this information it is straightforward to obtain an array of ranges, each range being the offsets of a substring on which the string is split.
def gaps(str, r)
split_with_offsets(str, r).each_cons(2).map { |(s,f), (_,lp1)| f+s.size..lp1-1 }
end
Here are two examples.
#1
str = "Hello stack overflow"
r = /\W+/
split_with_offsets(str, r)
#=> [["Hello", 0], ["stack", 6], ["overflow", 12]]
gaps(str, r)
#=> [5..5, 11..11]
#2
str = "I | cannot | wait | for | the election |to |be | over"
r = /\s*\|\s*/
split_with_offsets(str, r)
#=> [["I", 0], ["cannot", 4], ["wait", 13], ["for", 20], ["the election", 29],
# ["to", 43], ["be", 47], ["over", 53]]
gaps(str, r)
#=> [1..3, 10..12, 17..19, 23..28, 41..42, 45..46, 49..52]

Most common words in string

I am new to Ruby and trying to write a method that will return an array of the most common word(s) in a string. If there is one word with a high count, that word should be returned. If there are two words tied for the high count, both should be returned in an array.
The problem is that when I pass through the 2nd string, the code only counts "words" twice instead of three times. When the 3rd string is passed through, it returns "it" with a count of 2, which makes no sense, as "it" should have a count of 1.
def most_common(string)
counts = {}
words = string.downcase.tr(",.?!",'').split(' ')
words.uniq.each do |word|
counts[word] = 0
end
words.each do |word|
counts[word] = string.scan(word).count
end
max_quantity = counts.values.max
max_words = counts.select { |k, v| v == max_quantity }.keys
puts max_words
end
most_common('a short list of words with some words') #['words']
most_common('Words in a short, short words, lists of words!') #['words']
most_common('a short list of words with some short words in it') #['words', 'short']
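Your method of counting instances of the word is the problem. string.scan(word) counts substring matches, and "it" also occurs inside "with", so it is counted twice. scan is also run on the original string rather than the downcased one, so "Words" in the second string is not matched and "words" is counted only twice.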
[1] pry(main)> 'with some words in it'.scan('it')
=> ["it", "it"]
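The difference between substring matching and whole-word matching can be checked directly. This is only an illustrative sketch; the variable names are introduced here:

```ruby
text = 'a short list of words with some short words in it'

substring_hits  = text.scan('it').size      # also finds the "it" inside "with"
whole_word_hits = text.scan(/\bit\b/).size  # \b restricts matches to word boundaries
token_hits      = text.split.count('it')    # compares whole tokens only

p [substring_hits, whole_word_hits, token_hits] #=> [2, 1, 1]

# scan is case-sensitive, so the capitalized "Words" is missed
# unless the string is downcased first.
mixed = 'Words in a short, short words, lists of words!'
p mixed.scan('words').size          #=> 2
p mixed.downcase.scan('words').size #=> 3
```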
It can be done more simply, though: you can count the occurrences of each word with an each_with_object call, like so:
counts = words.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
This goes through each entry in the array and adds 1 to the value for each word's entry in the hash.
So the following should work for you:
def most_common(string)
words = string.downcase.tr(",.?!",'').split(' ')
counts = words.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
max_quantity = counts.values.max
counts.select { |k, v| v == max_quantity }.keys
end
p most_common('a short list of words with some words') #['words']
p most_common('Words in a short, short words, lists of words!') #['words']
p most_common('a short list of words with some short words in it') #['short', 'words']
As Nick has answered your question, I will just suggest another way this can be done. As "high count" is vague, I suggest you return a hash with downcased words and their respective counts. Since Ruby 1.9, hashes retain the order that key-value pairs have been entered, so we may want to make use of that and return the hash with key-value pairs ordered in decreasing order of values.
Code
def words_by_count(str)
str.gsub(/./) do |c|
case c
when /\w/ then c.downcase
when /\s/ then c
else ''
end
end.split
.group_by {|w| w}
.map {|k,v| [k,v.size]}
.sort_by(&:last)
.reverse
.to_h
end
words_by_count('Words in a short, short words, lists of words!')
The method Array#to_h was introduced in Ruby 2.1. For earlier Ruby versions, one must use:
Hash[str.gsub(/./)... .reverse]
Example
words_by_count('a short list of words with some words')
#=> {"words"=>2, "of"=>1, "some"=>1, "with"=>1,
# "list"=>1, "short"=>1, "a"=>1}
words_by_count('Words in a short, short words, lists of words!')
#=> {"words"=>3, "short"=>2, "lists"=>1, "a"=>1, "in"=>1, "of"=>1}
words_by_count('a short list of words with some short words in it')
#=> {"words"=>2, "short"=>2, "it"=>1, "with"=>1,
# "some"=>1, "of"=>1, "list"=>1, "in"=>1, "a"=>1}
Explanation
Here is what's happening in the second example, where:
str = 'Words in a short, short words, lists of words!'
str.gsub(/./) do |c|... matches each character in the string and sends it to the block to decide what to do with it. As you see, word characters are downcased, whitespace is left alone and everything else is removed (replaced with an empty string).
s = str.gsub(/./) do |c|
case c
when /\w/ then c.downcase
when /\s/ then c
else ''
end
end
#=> "words in a short short words lists of words"
This is followed by
a = s.split
#=> ["words", "in", "a", "short", "short", "words", "lists", "of", "words"]
h = a.group_by {|w| w}
#=> {"words"=>["words", "words", "words"], "in"=>["in"], "a"=>["a"],
# "short"=>["short", "short"], "lists"=>["lists"], "of"=>["of"]}
b = h.map {|k,v| [k,v.size]}
#=> [["words", 3], ["in", 1], ["a", 1], ["short", 2], ["lists", 1], ["of", 1]]
c = b.sort_by(&:last)
#=> [["of", 1], ["in", 1], ["a", 1], ["lists", 1], ["short", 2], ["words", 3]]
d = c.reverse
#=> [["words", 3], ["short", 2], ["lists", 1], ["a", 1], ["in", 1], ["of", 1]]
d.to_h # or Hash[d]
#=> {"words"=>3, "short"=>2, "lists"=>1, "a"=>1, "in"=>1, "of"=>1}
Note that c = b.sort_by(&:last), d = c.reverse can be replaced by:
d = b.sort_by { |_,k| -k }
#=> [["words", 3], ["short", 2], ["a", 1], ["in", 1], ["lists", 1], ["of", 1]]
but sort followed by reverse is generally faster.
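Since sort_by is not guaranteed to be stable, the order of words with equal counts can vary. If a predictable order matters, one option is to sort on a compound key. The sketch below is a variation, not the method above: the name words_by_count_sorted is hypothetical, and it tokenizes with scan(/\w+/), which treats an apostrophe as a word break (unlike the gsub version, which simply deletes it):

```ruby
# Sketch: word counts ordered by descending count, then alphabetically.
def words_by_count_sorted(str)
  str.downcase
     .scan(/\w+/)                                         # runs of word characters
     .each_with_object(Hash.new(0)) { |w, h| h[w] += 1 }  # single-pass count
     .sort_by { |word, count| [-count, word] }            # deterministic tie order
     .to_h
end

p words_by_count_sorted('Words in a short, short words, lists of words!')
#=> {"words"=>3, "short"=>2, "a"=>1, "in"=>1, "lists"=>1, "of"=>1}
```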
def count_words string
word_list = Hash.new(0)
words = string.downcase.delete(',.?!').split
words.each { |word| word_list[word] += 1 }
word_list
end
def most_common_words string
hash = count_words string
max_value = hash.values.max
hash.select { |k, v| v == max_value }.keys
end
most_common_words 'a short list of words with some words'
#=> ["words"]
most_common_words 'Words in a short, short words, lists of words!'
#=> ["words"]
most_common_words 'a short list of words with some short words in it'
#=> ["short", "words"]
Assuming string is a string containing multiple words.
words = string.split(/[.!?,\s]/)
words.sort_by{|x|words.count(x)}
Here we split the string into an array of words. We then sort the array by how often each word occurs in it. The most common words will appear at the end.
The same thing can be done in the following way too:
def most_common(string)
counts = Hash.new 0
string.downcase.tr(",.?!",'').split(' ').each{|word| counts[word] += 1}
# For "Words in a short, short words, lists of words!"
# counts ---> {"words"=>3, "in"=>1, "a"=>1, "short"=>2, "lists"=>1, "of"=>1}
max_value = counts.values.max
#max_value ---> 3
return counts.select{|key , value| value == max_value}
#returns ---> {"words"=>3}
end
This is just a shorter solution, which you might want to use. Hope it helps :)
This is the kind of question programmers love, isn't it :) How about a functional approach?
# returns array of words after removing certain English punctuation marks
def english_words(str)
str.downcase.delete(',.?!').split
end
# returns hash mapping element to count
def element_counts(ary)
ary.group_by { |e| e }.inject({}) { |a, e| a.merge(e[0] => e[1].size) }
end
def most_common(ary)
ary.empty? ? nil :
element_counts(ary)
.group_by { |k, v| v }
.sort
.last[1]
.map(&:first)
end
most_common(english_words('a short list of words with some short words in it'))
#=> ["short", "words"]
def firstRepeatedWord(string)
h_data = Hash.new(0)
string.split(" ").each{|x| h_data[x] +=1}
h_data.key(h_data.values.max)
end
def common(string)
counts=Hash.new(0)
words=string.downcase.delete('.,!?').split(" ")
words.each {|k| counts[k]+=1}
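# counts.sort would order the pairs by word, not by count,
# so pick the pair with the highest count instead
p counts.max_by { |_word, count| count }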
end

How to create two separate arrays from one input?

DESCRIPTION:
The purpose of my code is to take as input a sequence of R's and C's and to store each number that comes after a character in its proper array.
For example, the input format is as follows: R1C4R2C5
Column Array: [4, 5] Row Array: [1, 2]
My problem is I am getting the output like this:
[" ", 1]
[" ", 4]
[" ", 2]
[" ", 5]
How do I get all the Row integers following R in one array, and all the Column integers following C in another separate array? I do not want to create multiple arrays, just two.
Help!
CODE:
puts 'Please input: '
input = gets.chomp
word2 = input.scan(/.{1,2}/)
col = []
row = []
word2.each {|a| col.push(a.split(/C/)) if a.include? 'C' }
word2.each {|a| row.push(a.split(/R/)) if a.include? 'R' }
col.each do |num|
puts num.inspect
end
row.each do |num|
puts num.inspect
end
x = "R1C4R2C5"
col = []
row = []
x.chars.each_slice(2) { |u| u[0] == "R" ? row << u[1] : col << u[1] }
p col
p row
The main problem with your code is that you replicate operations for rows and columns. You want to write "DRY" code, which stands for "don't repeat yourself".
Starting with your code as the model, you can DRY it out by writing a method like this to extract the information you want from the input string, and invoke it once for rows and once for columns:
def doit(s, c)
...
end
Here s is the input string and c is the string "R" or "C". Within the method you want to extract substrings that begin with the value of c and are followed by digits. Your decision to use String#scan was a good one, but you need a different regex:
def doit(s, c)
s.scan(/#{c}\d+/)
end
I'll explain the regex, but let's first try the method. Suppose the string is:
s = "R1C4R2C5"
Then
rows = doit(s, "R") #=> ["R1", "R2"]
cols = doit(s, "C") #=> ["C4", "C5"]
This is not quite what you want, but easily fixed. First, though, the regex. The regex first looks for the character given by #{c}. #{c} interpolates the value of the variable c into the regex, which in this case will be "R" or "C". \d+ means that character must be followed by one or more digits 0-9, as many as are present before the next non-digit (here an "R" or "C") or the end of the string.
Now let's fix the method:
def doit(s, c)
a = s.scan(/#{c}\d+/)
b = a.map {|str| str[1..-1]}
b.map(&:to_i)
end
rows = doit(s, "R") #=> [1, 2]
cols = doit(s, "C") #=> [4, 5]
Success! As before, a => ["R1", "R2"] if c => "R" and a => ["C4", "C5"] if c => "C". a.map {|str| str[1..-1]} maps each element of a into a string comprising all characters but the first (e.g., "R12"[1..-1] => "12"), so we have b => ["1", "2"] or b => ["4", "5"]. We then apply map once again to convert those strings to their Integer equivalents (Fixnum in Ruby versions before 2.4). The expression b.map(&:to_i) is shorthand for
b.map {|str| str.to_i}
The last computed quantity is returned by the method, so if it is what you want, as it is here, there is no need for a return statement at the end.
This can be simplified, however, in a couple of ways. Firstly, we can combine the last two statements by dropping the last one and changing the one above to:
a.map {|str| str[1..-1].to_i}
which also gets rid of the local variable b. The second improvement is to "chain" the two remaining statements, which also rids us of the other temporary variable:
def doit(s, c)
s.scan(/#{c}\d+/).map { |str| str[1..-1].to_i }
end
This is typical Ruby code.
Notice that by doing it this way, there is no requirement for row and column references in the string to alternate, and the numeric values can have arbitrary numbers of digits.
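To illustrate the point about non-alternating references and multi-digit values, here is a small sketch. It is a variant of doit rather than the method itself: the name extract_numbers is hypothetical, and it uses a capture group so the leading letter never has to be sliced off:

```ruby
# Sketch: collects the integers that follow each occurrence of `prefix`.
def extract_numbers(s, prefix)
  # With one capture group, scan returns [["1"], ["2"], ...], hence flatten.
  s.scan(/#{Regexp.escape(prefix)}(\d+)/).flatten.map(&:to_i)
end

s = "R1C4R2C5R32R4C7R18C6C12"
rows = extract_numbers(s, "R")
cols = extract_numbers(s, "C")
p rows #=> [1, 2, 32, 4, 18]
p cols #=> [4, 5, 7, 6, 12]
```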
Here's another way to do the same thing, that some may see as being more Ruby-like:
s.scan(/[RC]\d+/).each_with_object([[],[]]) {|n,(r,c)|
(n[0]=='R' ? r : c) << n[1..-1].to_i}
Here's what's happening. Suppose:
s = "R1C4R2C5R32R4C7R18C6C12"
Then
a = s.scan(/[RC]\d+/)
#=> ["R1", "C4", "R2", "C5", "R32", "R4", "C7", "R18", "C6", "C12"]
scan uses the regex /[RC]\d+/ to extract substrings that begin with 'R' or 'C' followed by one or more digits up to the next letter or end of the string.
b = a.each_with_object([[],[]]) {|n,(r,c)|(n[0]=='R' ? r : c) << n[1..-1].to_i}
#=> [[1, 2, 32, 4, 18], [4, 5, 7, 6, 12]]
The row values are given by [1, 2, 32, 4, 18]; the column values by [4, 5, 7, 6, 12].
Enumerable#each_with_object (v1.9+) is passed an array comprised of two empty arrays, [[],[]], which it yields to the block together with each element and returns at the end. The first subarray will contain the row values, the second, the column values. These two subarrays are represented by the block variables r and c, respectively.
The first element of a is "R1". This is represented in the block by the variable n. Since
"R1"[0] #=> "R"
"R1"[1..-1] #=> "1"
we execute
r << "1".to_i #=> [1]
so now
[r,c] #=> [[1],[]]
The next element of a is "C4", so we will execute:
c << "4".to_i #=> [4]
so now
[r,c] #=> [[1],[4]]
and so on.
rows, cols = "R1C4R2C5".scan(/R(\d+)C(\d+)/).flatten.partition.with_index {|_, index| index.even? }
> rows
=> ["1", "2"]
> cols
=> ["4", "5"]
Or
rows = "R1C4R2C5".scan(/R(\d+)/).flatten
=> ["1", "2"]
cols = "R1C4R2C5".scan(/C(\d+)/).flatten
=> ["4", "5"]
And to fix your code use:
word2.each {|a| col.push(a.delete('C')) if a.include? 'C' }
word2.each {|a| row.push(a.delete('R')) if a.include? 'R' }

Use regular expression to evaluate and modify string in Ruby?

I want to modify part of a string I have using Ruby.
The string is [x, y] where y is an integer that I want to change to its alphabetical letter. So say [1, 1] would become [1, A] and [1, 26] would become [1, Z].
Would a regular expression help me do this? Or is there an easier way? I am not too strong with regular expressions; I am reading up on those now.
The shortest way I can think of is the following
string = "[1,1]"
array = string.chop.reverse.chop.reverse.split(',')
new_string="[#{array.first},#{(array.last.to_i+64).chr}]"
Maybe this helps:
Because we do not yet have an alphabet in which to look up the position, create one.
This is a range converted to an array so you don't need to specify it yourself.
alphabet = ("A".."Z").to_a
Then we try to get the integer/position out of the string:
string_to_match = "[1,5]"
/(\d+)\]$/.match(string_to_match)
Maybe the regexp can be improved, however for this example it is working.
The first reference in the MatchData is holding the second integer in your "string_to_match".
Or you can get it via "$1".
Do not forget to convert it to an integer.
position_in_alphabet = $1.to_i
Also we need to remember that the index of arrays starts at 0 and not 1
position_in_alphabet -= 1
Finally, we can look up which character we actually get
char = alphabet[position_in_alphabet]
Example:
alphabet = ("A".."Z").to_a #=> ["A", "B", "C", ..*snip*.. "Y", "Z"]
string_to_match = "[1,5]" #=> "[1,5]"
/(\d+)\]$/.match(string_to_match) #=> #<MatchData "5]" 1:"5">
position_in_alphabet = $1.to_i #=> 5
position_in_alphabet -= 1 #=> 4
char = alphabet[position_in_alphabet] #=> "E"
Greetings~
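The steps above can also be folded into a single substitution that rewrites the string in place. A minimal sketch, assuming a hypothetical method name column_to_letter and assuming that values outside 1..26 should be rejected (the question does not say what should happen to them):

```ruby
# Sketch: replaces the trailing integer in "[x, y]" with its letter (1 => A).
def column_to_letter(str)
  str.sub(/(\d+)(\s*\])\z/) do
    number  = Regexp.last_match(1).to_i
    closing = Regexp.last_match(2)
    # Assumption: only 1..26 map to a letter; anything else is an error.
    raise ArgumentError, "expected 1..26, got #{number}" unless (1..26).cover?(number)

    ("A".ord + number - 1).chr + closing
  end
end

p column_to_letter("[1, 1]")  #=> "[1, A]"
p column_to_letter("[1, 26]") #=> "[1, Z]"
p column_to_letter("[1,5]")   #=> "[1,E]"
```

A string that does not end in an integer followed by "]" is returned unchanged, because sub only substitutes when the pattern matches.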

Resources