Method not counting number of times the string is used - ruby

I'm trying to count the number of times a word from a dictionary is used. This is my code:
def substrings(words, dictionary)
hash = {}
substrings.downcase!
dictionary.each do |substring|
words.each do |word|
if word.include? substring +=1
end
end
end
hash.to_s
end
dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]
substrings = "below", dictionary
This is the result:
["below", ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]]
But I'm looking for something like this:
=> {"below"=>1, "low"=>1}

You are not calling the method: that line assigns a two-element array to a local variable named substrings. Replace the last line with:
substrings("below", dictionary)
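As a minimal sketch of the difference between assigning and calling (shout is a hypothetical method, not part of the question's code), a comma-separated list on the right-hand side of = becomes an array, whereas parentheses after the method name pass arguments to the method:

```ruby
# Hypothetical method, used only to illustrate the difference.
def shout(word, suffix)
  word.upcase + suffix
end

# Assignment: Ruby packs the comma-separated values into an array
# and stores it in a local variable. No method is called.
assigned = "hey", "!"
p assigned            #=> ["hey", "!"]

# Call: the values are passed to the method as arguments.
p shout("hey", "!")   #=> "HEY!"
```

Note that once the call works, the method body in the question still needs fixing before it returns a hash of counts.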

Related

Split a string delimited by a list of substrings

I have data like:
str = "CODEA text for first item CODEB text for next item CODEB2 some"\
"more text CODEC yet more text"
and a list:
arr = ["CODEA", "CODEB", "CODEB2", "CODEC", ... ]
I want to divide this string into a hash. The keys of the hash will be CODEA, CODEB, etc. The values of the hash will be the text that follows, until the next CODE. The output should look like this:
"CODEA" => "text for first item",
"CODEB" => "text for next item",
"CODEB2" => "some more text",
"CODEC" => "yet more text"
We are given a string and an array.
str = "CODEA text for first item CODEB text for next item " +
"CODEB2 some more text CODEC yet more text"
arr= %w|CODEC CODEB2 CODEA CODEB|
#=> ["CODEC", "CODEB2", "CODEA", "CODEB"]
This is one way to obtain the desired hash.
str.split.
slice_before { |word| arr.include?(word) }.
map { |word, *rest| [word, rest.join(' ')] }.
to_h
#=> {"CODEA" =>"text for first item",
# "CODEB" =>"text for next item",
# "CODEB2"=>"some more text",
# "CODEC" =>"yet more text"}
See Enumerable#slice_before.
The steps are as follows.
a = str.split
#=> ["CODEA", "text", "for", "first", "item", "CODEB",
# "text", "for", "next", "item", "CODEB2", "some",
# "more", "text", "CODEC", "yet", "more", "text"]
b = a.slice_before { |word| arr.include?(word) }
#=> #<Enumerator:
# #<Enumerator::Generator:0x00005cbdec2b5eb0>:each>
We can see the four elements (arrays) that this enumerator will generate and pass to map by converting it to an array.
b.to_a
#=> [["CODEA", "text", "for", "first", "item"],
# ["CODEB", "text", "for", "next", "item"],
# ["CODEB2", "some", "more", "text"],
# ["CODEC", "yet", "more", "text"]]
Continuing,
c = b.map { |word, *rest| [word, rest.join(' ')] }
#=> [["CODEA", "text for first item"],
# ["CODEB", "text for next item"],
# ["CODEB2", "some more text"],
# ["CODEC", "yet more text"]]
c.to_h
#=> {"CODEA"=>"text for first item",
# "CODEB"=>"text for next item",
# "CODEB2"=>"some more text",
# "CODEC"=>"yet more text"}
The following is perhaps a better way of doing this.
str.split.
slice_before { |word| arr.include?(word) }.
each_with_object({}) { |(word, *rest),h|
h[word] = rest.join(' ') }
When I was a kid this might be done as follows.
last_word = ''
str.split.each_with_object({}) do |word,h|
if arr.include?(word)
h[word]=''
last_word = word
else
h[last_word] << ' ' unless h[last_word].empty?
h[last_word] << word
end
end
last_word must be initialized (to any value) outside the block, so that it keeps its value from one iteration to the next.
Code:
str = 'CODEA text for first item CODEB text for next item ' +
'CODEB2 some more text CODEC yet more text'
puts Hash[str.scan(/(CODE\S*) (.*?(?= CODE|$))/)]
Result:
{"CODEA"=>"text for first item", "CODEB"=>"text for next item", "CODEB2"=>"some more text", "CODEC"=>"yet more text"}
Another option.
string.split.reverse
.slice_when { |word| word.start_with? 'CODE' }
.map{ |(*v, k)| [k, v.reverse.join(' ')] }.to_h
Enumerable#slice_when returns an enumerator; converted to an array, it contains the following in this case:
[["text", "more", "yet", "CODEC"], ["text", "more", "some", "CODEB2"], ["item", "next", "for", "text", "CODEB"], ["item", "first", "for", "text", "CODEA"]]
Then the array is mapped to build the required hash (I did not reverse the hash, so the keys appear in reverse order):
#=> {"CODEC"=>"yet more text", "CODEB2"=>"some more text", "CODEB"=>"text for next item", "CODEA"=>"text for first item"}
Adding parentheses to the pattern in String#split lets you get both the separators and the fields.
str.split(/(#{Regexp.union(*arr)})/).drop(1).each_slice(2).to_h
# =>
# {
# "CODEA"=>" text for first item ",
# "CODEB"=>"2 somemore text ",
# "CODEC"=>" yet more text"
# }
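The output above loses "CODEB2" because the alternation tries "CODEB" first. One possible adjustment, sketched here under the assumption that the codes never occur inside the ordinary text, is to put the longest codes first in the pattern and strip the surrounding spaces. The name split_by_codes is hypothetical:

```ruby
# Hypothetical helper: returns a hash of code => following text.
def split_by_codes(str, codes)
  # Longest codes first, so "CODEB2" is tried before its prefix "CODEB".
  pattern = Regexp.union(codes.sort_by { |code| -code.length })
  # The capture group makes split keep the codes; drop(1) discards
  # whatever precedes the first code (an empty string here).
  pieces = str.split(/(#{pattern})/).drop(1)
  # Pair each code with the text after it and trim the spaces.
  pieces.each_slice(2).map { |code, text| [code, text.to_s.strip] }.to_h
end

str = "CODEA text for first item CODEB text for next item " +
      "CODEB2 some more text CODEC yet more text"
p split_by_codes(str, %w[CODEA CODEB CODEB2 CODEC])
```

The pattern has no word boundaries, so a code appearing in the middle of a longer word would still be treated as a separator.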

wrong number of arguments and hash issues

I am trying to write a method that counts how many times each word from a dictionary appears in a string and returns the counts as a hash. Here's my code now:
def substrings(words, dictionary)
hash = {}
substrings.downcase!
dictionary.each do |substring|
words.each do |word|
if word.include? substring +=1
end
end
end
hash.to_s
end
dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]
words = "below"
substrings(words, dictionary)
And I get this error:
wrong number of arguments (given 0, expected 2)
I'm looking for something like this:
=> {"below"=>1, "low"=>1}
I have tried multiple things but it never gives me that hash. I either get an undefined method error or this:
=> ["below", ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]]
Your error is caused by the line "substrings.downcase!". This is a recursive call to your substrings method, which takes two arguments, and you are providing none. If it did not fail on the argument count, you would still get an error: a stack overflow caused by the infinite recursion.
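A minimal reproduction of that error, using a hypothetical method name (broken_substrings) so it does not clash with the working versions:

```ruby
# Inside the method there is no local variable called broken_substrings,
# so the bare name is treated as a method call with zero arguments.
def broken_substrings(words, dictionary)
  broken_substrings.downcase!
end

begin
  broken_substrings("below", ["low"])
rescue ArgumentError => e
  puts e.message  # wrong number of arguments (given 0, expected 2)
end
```

The intent was presumably to downcase the words parameter, for example with words = words.downcase.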
This will produce the desired result, but I'm exchanging words in favor of word:
def substrings(word, dictionary)
word = word.downcase
dictionary.select { |entry| word.include?(entry.downcase) }
.group_by(&:itself)
.map { |k, v| [k, v.size] }.to_h
end
This results in:
>> dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]
>> word = 'below'
>> substrings(word, dictionary)
=> {"below"=>1, "low"=>1}
It also counts multiple copies of words, which, although not explicitly stated, is presumably what you are after:
>> dictionary = ["below", "be", "below", "below", "low", "be", "pizza"]
>> word = 'below'
>> substrings(word, dictionary)
=> {"below"=>3, "be"=>2, "low"=>1}
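If Ruby 2.7 or later is available (an assumption; the answer above does not state a version), the group_by/map/to_h chain can be shortened with Enumerable#tally. The method name substrings_with_tally is hypothetical:

```ruby
# Variant of the method above; tally counts equal elements (Ruby 2.7+).
def substrings_with_tally(word, dictionary)
  word = word.downcase
  # Keep only the dictionary entries found in the word, then count them.
  dictionary.select { |entry| word.include?(entry.downcase) }.tally
end

dictionary = ["below", "be", "below", "below", "low", "be", "pizza"]
p substrings_with_tally("below", dictionary)
#=> {"below"=>3, "be"=>2, "low"=>1}
```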
You can use #reduce:
def substrings(sentence, dictionary)
sentence = sentence.downcase
dictionary.reduce(Hash.new(0)) do |counts,word|
counts[word] +=1 if sentence.include?(word.downcase)
counts
end
end
dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]
sentence = "below"
substrings(sentence, dictionary) #=> {"below"=>1, "low"=>1}
Or #each:
def substrings(sentence, dictionary)
sentence = sentence.downcase
counts = Hash.new(0) # Makes the default value `0` instead of `nil`
dictionary.each do |word|
if sentence.include?(word.downcase)
counts[word] += 1
end
end
counts
end
dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", "it", "i", "low", "own", "part", "partner", "sit"]
sentence = "below"
substrings(sentence, dictionary) #=> {"below"=>1, "low"=>1}

Method to front capitalized words

I am trying to move capitalized words to the front of the sentence. I expect to get this:
capsort(["a", "This", "test.", "Is"])
#=> ["This", "Is", "a", "test."]
capsort(["to", "return", "I" , "something", "Want", "It", "like", "this."])
#=> ["I", "Want", "It", "to", "return", "something", "like", "this."]
The key is maintaining the word order.
I feel like I'm very close.
def capsort(words)
array_cap = []
array_lowcase = []
words.each { |x| x.start_with? ~/[A-Z]/ ? array_cap.push(x) : array_lowcase.push(x) }
words= array_cap << array_lowcase
end
Curious to see what other elegant solutions might be.
The question was changed radically, making my earlier answer completely wrong. Now, the answer is:
def capsort(strings)
strings.partition(&/\p{Upper}/.method(:match)).flatten
end
capsort(["a", "This", "test.", "Is"])
# => ["This", "Is", "a", "test."]
My earlier answer was:
def capsort(strings)
strings.sort
end
capsort(["a", "This", "test.", "Is"])
# => ["Is", "This", "a", "test."]
'Z' < 'a' # => true, there's nothing to be done.
def capsort(words)
words.partition{|s| s =~ /\A[A-Z]/}.flatten
end
capsort(["a", "This", "test.", "Is"])
# => ["This", "Is", "a", "test."]
capsort(["to", "return", "I" , "something", "Want", "It", "like", "this."])
# => ["I", "Want", "It", "to", "return", "something", "like", "this."]
def capsort(words)
caps = words.select{ |x| x =~ /^[A-Z]/ }
lows = words.select{ |x| x !~ /^[A-Z]/ }
caps.concat(lows)
end
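The partition-based answers above differ in one detail: the unanchored /\p{Upper}/ matches an uppercase letter anywhere in the string, while /\A[A-Z]/ only recognizes ASCII capitals. A sketch that combines the two, assuming "capitalized" means the first character is an uppercase letter in any script (capsort_anchored is a hypothetical name):

```ruby
# partition keeps the relative order within each group, so the
# original word order is preserved on both sides.
def capsort_anchored(strings)
  strings.partition { |s| s.match?(/\A\p{Upper}/) }.flatten
end

p capsort_anchored(["a", "This", "test.", "Is"])
#=> ["This", "Is", "a", "test."]
p capsort_anchored(["iPhone", "Is", "a"])
#=> ["Is", "iPhone", "a"]
```

String#match? requires Ruby 2.4 or later; on older versions s =~ /\A\p{Upper}/ behaves the same inside the block.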

Ruby separating an array of strings by checking whether or not the object inherits from a certain class

I have an array of strings that I read from a file x
I have an empty array y
Some string objects are integers
How do I separate the integers from the strings, specifically by using a call to_a?
Right now I'm trying
x.each do |s|
if s.to_i.is_a?(Integer)
y << s
end
end
But this just converts everything to an integer and stuffs it in y. Is there a way to see if an object is truly from the Integer class?
Edit: adding sample input/output
x = [ "This", "is", "a", "random", "amalgamation", "of", "text", "and", "a",
"bunch", "of", "numbers", "111113087403957304739703975", "how", "can", "I",
"read", "this", "in." ]
y = [ 111113087403957304739703975 ]
x = [ "This", "is", "a", "random", "amalgamation", "of", "text", "and", "a",
"bunch", "of", "numbers", "111113087403957304739703975", "how", "can", "I",
"read", "this", "in." ]
y = [ 111113087403957304739703975 ]
def extract_integers(array)
array.select { |v| v.match(/\A\d+\z/) }.map(&:to_i)
# or (simpler, as suggested by #theTinMan)
array.reject { |v| v[/\D/] }.map(&:to_i)
end
p extract_integers(x) #=> [111113087403957304739703975]
p extract_integers(x) == y #=> true
s.match(/^\d+$/) will match a string containing only digits, so you can use this to test your strings.
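One caveat, since the strings come from a file: in Ruby, ^ and $ anchor at line boundaries, whereas \A and \z anchor at the start and end of the whole string. A small sketch of the difference (the helper names are hypothetical):

```ruby
# True if at least one line of s consists only of digits.
def digits_on_some_line?(s)
  s.match?(/^\d+$/)
end

# True only if the entire string consists of digits.
def digits_only?(s)
  s.match?(/\A\d+\z/)
end

p digits_on_some_line?("12\nabc")  #=> true
p digits_only?("12\nabc")          #=> false
p digits_only?("12")               #=> true
```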
You might use Enumerable#grep:
arr = %w[9 cats on 33 hot tin roofs]
#=> ["9", "cats", "on", "33", "hot", "tin", "roofs"]
arr.grep /^\d+$/
#=> ["9", "33"]
arr.grep(/^\d+$/).map(&:to_i)
#=> [9, 33]
x.each do |s|
begin
Integer(s)
rescue ArgumentError
else
y << s
end
end
If applied on a string that doesn't parse as an integer, Integer() raises an ArgumentError. You can use this to find integer strings.
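On Ruby 2.6 or later (an assumption about the environment), Integer() also accepts exception: false and returns nil instead of raising, which avoids the begin/rescue block. A sketch with a hypothetical method name:

```ruby
# Converts the strings that Integer() accepts and drops the rest.
def integers_via_integer(strings)
  strings.map { |s| Integer(s, exception: false) }.compact
end

p integers_via_integer(%w[9 cats on 33 hot tin roofs])  #=> [9, 33]
```

Integer() is more permissive than a /\A\d+\z/ test: it also accepts a sign, surrounding whitespace, and prefixes such as 0x, so "0x1A" converts to 26.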
It's always interesting and useful to run benchmarks:
require 'fruity'
x = [ "This", "is", "a", "random", "amalgamation", "of", "text", "and", "a",
"bunch", "of", "numbers", "111113087403957304739703975", "how", "can", "I",
"read", "this", "in." ]
def extract_integers(array)
array.select { |v| v.match(/\A\d+\z/) }.map(&:to_i)
end
def extract_integers_reject(array)
array.reject { |v| v[/\D/] }.map(&:to_i)
end
compare do
use_exception {
y = []
x.each do |s|
begin
Integer(s)
rescue ArgumentError
else
y << s.to_i
end
end
y
}
use_extract_integers {
extract_integers(x)
}
use_extract_integers_reject {
extract_integers_reject(x)
}
end
Running that results in the following on my machine:
Running each test 256 times. Test will take about 1 second.
use_extract_integers_reject is faster than use_extract_integers by 30.000000000000004% ± 10.0%
use_extract_integers is faster than use_exception by 6x ± 0.1
Note, y << s was changed to y << s.to_i to make the outputs all match.
I'd probably simplify the code using the ArgumentError rescue like this:
x.each do |s|
begin
y << Integer(s)
rescue ArgumentError
end
end

Ruby string split into words ignoring all special characters: Simpler query

I need a query to be split into words everywhere a non word character is used. For example:
query = "I am a great, boy's and I like! to have: a lot-of-fun and #do$$nice&acti*vities+enjoy good ?times."
Should output:
["I", "am", "a", "great", "", "boy", "s", "and", "I", "like", "", "to", "have", "", "a", "lot", "of", "fun", "and", "", "do", "", "nice", "acti", "vities", "enjoy", "good", "", "times"]
This does the trick but is there a simpler way?
query.split(/[ ,'!:\\#\\$\\&\\*+?.-]/)
query.split(/\W+/)
# => ["I", "am", "a", "great", "boy", "s", "and", "I", "like", "to", "have", "a", "lot", "of", "fun", "and", "do", "nice", "acti", "vities", "enjoy", "good", "times"]
query.scan(/\w+/)
# => ["I", "am", "a", "great", "boy", "s", "and", "I", "like", "to", "have", "a", "lot", "of", "fun", "and", "do", "nice", "acti", "vities", "enjoy", "good", "times"]
This is different from the expected output in that it does not include empty strings.
I am adding this answer because sawa's answer did not exactly reproduce the desired output:
#Split using any single non-word character:
query.split(/\W/) #=> ["I", "am", "a", "great", "", "boy", "s", "and", "I", "like", "", "to", "have", "", "a", "lot", "of", "fun", "and", "", "do", "", "nice", "acti", "vities", "enjoy", "good", "", "times"]
If you do not want the empty strings in the result, just use sawa's answer.
Splitting on /\W/ also produces many empty strings when the string contains runs of spaces, because each extra space is matched separately and creates a new splitting point. To avoid that, we can add an alternation:
# Split using any number of spaces or a single non-word character:
query.split(/\s+|\W/)
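As a small illustration on a made-up input (the sample string is not from the question), here is how the three patterns treat a punctuation mark followed by two spaces:

```ruby
sample = "like!  to"  # two spaces after the exclamation mark

# Every single non-word character is a separate splitting point.
p sample.split(/\W/)      #=> ["like", "", "", "to"]

# A run of whitespace counts as one splitting point.
p sample.split(/\s+|\W/)  #=> ["like", "", "to"]

# Any run of non-word characters counts as one splitting point.
p sample.split(/\W+/)     #=> ["like", "to"]
```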

Resources