Bug in my Ruby counter - ruby

It is only counting once for each word. I want it to tell me how many times each word appears.
dictionary = ["to","do","to","do","to","do"]
string = "just do it to"
def machine(word,list)
initialize = Hash.new
swerve = word.downcase.split(" ")
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
end
end
initialize[i]=counter
end
return initialize
end
machine(string,dictionary)

I assume that, for each word in string, you wish to determine the number of instances of that word in dictionary. If so, the first step is to create a counting hash.
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
#=> {"to"=>3, "do"=>3}
(I will explain this code later.)
Now split string on whitespace and create a hash whose keys are the words in string and whose values are the numbers of times that the value of word appears in dictionary.
string.split.each_with_object({}) { |word,h| h[word] = dict_hash.fetch(word, 0) }
#=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
This of course assumes that each word in string is unique. If not, depending on the desired behavior, one possibility would be to use another counting hash.
string = "to just do it to"
string.split.each_with_object(Hash.new(0)) { |word,h|
h[word] += dict_hash.fetch(word, 0) }
#=> {"to"=>6, "just"=>0, "do"=>3, "it"=>0}
Now let me explain some of the constructs above.
I created two hashes with the form of the class method Hash::new that takes a parameter equal to the desired default value, which here is zero. What that means is that if
h = Hash.new(0)
and h does not have a key equal to the value word, then h[word] will return h's default value (and the hash h will not be changed). After creating the first hash that way, I wrote h[word] += 1. Ruby expands that to
h[word] = h[word] + 1
before she does any further processing. The first word in dictionary that is passed to the block is "to" (which is assigned to the block variable word). Since the hash h is initially empty (has no keys), h[word] on the right side of the above equality returns the default value of zero, giving us
h["to"] = h["to"] + 1
#=> = 0 + 1 => 1
Later, when word again equals "to" the default value is not used because h now has a key "to".
h["to"] = h["to"] + 1
#=> = 1 + 1 => 2
I used the well-worn method Enumerable#each_with_object. To a newbie this might seem complex. It isn't. The line
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
is effectively [1] the same as the following.
h = Hash.new(0)
dictionary.each { |word| h[word] += 1 }
dict_hash = h
In other words, the method allows one to write a single line that creates, constructs and returns the hash, rather than three lines that do the same.
Notice that I used the method Hash#fetch for retrieving values from the hash:
dict_hash.fetch(word, 0)
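fetch's second argument (here 0) is returned if dict_hash does not have a key equal to the value of word. By contrast, a hash created without a default value returns nil from h[word] in that case. (Since dict_hash was created with Hash.new(0), dict_hash[word] would also return 0 here; fetch simply makes the fallback explicit.)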
[1] The reason for "effectively" is that when using each_with_object, the variable h's scope is confined to the block, which is generally a good programming practice. Don't worry if you haven't learned about "scope" yet.
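To make the difference between the lookups concrete, here is a minimal sketch. The method name lookup_results and its keys are made up for illustration; only Hash.new, Hash#[], Hash#fetch and Hash#key? are real Ruby.

```ruby
# Hypothetical helper: compares the lookups discussed above for a key
# that is absent from every hash involved.
def lookup_results(key)
  plain   = {}            # no default value
  counted = Hash.new(0)   # default value is 0

  {
    plain_index:   plain[key],           # nil: no default value
    counted_index: counted[key],         # 0: the default value
    plain_fetch:   plain.fetch(key, 0),  # 0: fallback supplied per call
    key_added:     counted.key?(key)     # false: reading the default adds no key
  }
end

results = lookup_results("missing")
results[:plain_index]   #=> nil
results[:counted_index] #=> 0
results[:plain_fetch]   #=> 0
results[:key_added]     #=> false
```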

You can actually do this using Array#count rather easily:
def machine(word,list)
word.downcase.split(' ').collect do |w|
# for every word in `word`, count how many appearances in `list`
[w, list.count { |l| l.include?(w) }]
end.to_h
end
machine("just do it to", ["to","do","to","do","to","do"]) # => {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
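One thing to be aware of: include? is a substring test, so the block form of count also matches longer entries that merely contain the word. A small sketch of the difference, with helper names and sample data invented for illustration:

```ruby
# Hypothetical helpers contrasting substring matching with exact matching.
def substring_count(list, word)
  list.count { |entry| entry.include?(word) } # "today" contains "to"
end

def exact_count(list, word)
  list.count(word) # counts only elements equal to word
end

sample = ["to", "today", "do"]
substring_count(sample, "to") #=> 2
exact_count(sample, "to")     #=> 1
```

Which one you want depends on whether partial matches should count.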

I think this is what you're looking for, but it seems like you're approaching this backwards.
Convert your string "string" into an array, remove duplicate values and iterate through each element, counting the number of matches in your array "dictionary". The enumerable method :count is useful here.
A good data structure to output here would be a hash, where we store the unique words in our string "string" as keys and the number of occurrences of these words in array "dictionary" as the values. Hashes allow one to store more information about the data in a collection than an array or string, so this fits here.
dictionary = [ "to","do","to","do","to","do" ]
string = "just do it to"
def group_by_matches( match_str, list_of_words )
## trim leading and trailing whitespace and split string into array of words, remove duplicates.
to_match = match_str.strip.split.uniq
groupings = {}
## for each element in array of words, count the amount of times it appears *exactly* in the list of words array.
## store that in the groupings hash
to_match.each do | word |
groupings[ word ] = list_of_words.count( word )
end
groupings
end
group_by_matches( string, dictionary ) #=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
On a side note, you should consider using more descriptive variable and method names to help yourself and others follow what's going on.
This also seems like you have it backwards. Typically, you'd want to use the array to count the number of occurrences in the string. This seems to more closely fit a real-world application where you'd examine a sentence/string of data for matches from a list of predefined words.
Arrays are also useful because they're flexible collections of data, easily iterated through and mutated with enumerable methods. To work with the words in our string, as you can see, it's easiest to immediately convert it to an array of words.
There are many alternatives. If you wanted to shorten the method, you could replace the more verbose each loop with an each_with_object call or a map call which will return a new object rather than the original object like each. In the case of using map.to_h, be careful as to_h will work on a two-dimensional array [["key1", "val1"], ["key2", "val2"]] but not on a single dimensional array.
## each_with_object
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
each_with_object( {} ) { | word, groupings | groupings[ word ] = list_of_words.count( word ) }
end
## map
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
map { | word | [ word, list_of_words.count( word ) ] }.to_h
end
Gauge your method preferences depending on performance, readability, and reliability.
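The to_h caveat mentioned above can be seen in a short sketch. The wrapper safe_to_h is a made-up name that only exists to capture the error; Array#to_h itself is standard.

```ruby
# Hypothetical wrapper: returns the hash, or a message if to_h rejects the input.
def safe_to_h(array)
  array.to_h
rescue TypeError => error
  "TypeError: #{error.message}"
end

safe_to_h([["just", 0], ["do", 3]]) #=> {"just"=>0, "do"=>3}
safe_to_h(["just", 0])              # returns a "TypeError: ..." message string
```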

list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
needs to be changed to
swerve.each do |i|
counter = 0
list.each do |j|
if i.include? j
counter += 1

Your code uses each dictionary entry as the key and counts how many words of the string match it, so with this input every key ends up with a count of 1.
If you want to know how many times each word of the string appears in the dictionary, swap the list.each and swerve.each loops. The method then returns the hash {"just"=>0, "do"=>3, "it"=>0, "to"=>3}.
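Put together, the swapped version might look like the sketch below. The local names counts and words are my own replacements for initialize and swerve; the logic is otherwise the loop swap described above.

```ruby
def machine(word, list)
  counts = Hash.new
  words = word.downcase.split(" ")
  words.each do |i|          # outer loop: words of the string
    counter = 0
    list.each do |j|         # inner loop: dictionary entries
      counter += 1 if i.include? j
    end
    counts[i] = counter
  end
  counts
end

machine("just do it to", ["to", "do", "to", "do", "to", "do"])
#=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
```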

Related

Is it possible to keep the non letter symbols intact?

I am a Ruby beginner and I am working on a cypher program.
It takes a phrase, transforms the string to numbers, increments the numbers by a given value, then transforms them back into a string.
I would like to know how I can keep the non-letter symbols unchanged, like the "!" or the space.
The code I have written is below:
def caesar_cypher ( phrase, number=0)
letters = phrase.downcase.split(//)
letters_to_numbers= letters.map { |idx| idx.ord }
incrementor = letters_to_numbers.map { |idx| idx+number}
numbers_to_letters = incrementor.map { |idx| idx.chr}.join.capitalize
p numbers_to_letters
#binding.pry
end
caesar_cypher("Hello world!", 4)
caesar_cypher("What a string!", 6)
Solution Using Array#rotate and Hash#fetch
Yes, you can pass characters through unmodified, but to do so you'll need to define what's a "letter" and what you want to include or exclude from the encoding within your #map block. Here's a slightly different way to approach the problem that does those things, but is also much shorter and adds some additional flexibility.
Create an Array of all uppercase and lowercase characters in the English alphabet, and assign each to a replacement value using the inverted hashed value of Array#rotate, where the rotation value is your reproducible cypher key.
Warn when you won't have an encrypted value because the rotation is key % 26 == 0, but allow it anyway. This helps with testing. Otherwise, you could simply raise an exception if you don't want to allow plaintext results, or set a default value for key.
Don't capitalize your sentences. That limits your randomness, and prevents you from having separate values for capital letters.
Using a default value with Hash#fetch allows you to return any character that isn't in your Hash without encoding it, so UTF-8 or punctuation will simply be passed through as-is.
Spaces are not part of the defined encoding in the Hash, so you can use String#join without having to treat them specially.
Using Ruby 3.0.2:
def caesar_cypher phrase, key
warn "no encoding when key=#{key}" if (key % 26).zero?
letters = [*(?A..?Z), *(?a..?z)]
encoding = letters.rotate(key).zip(letters).to_h.invert
phrase.chars.map { encoding.fetch _1, _1 }.join
end
You can verify that this gives you repeatable outputs with some of the following examples:
# verify your mapping with key=0,
# which returns the phrase as-is
caesar_cypher "foo bar", 0
#=> "foo bar"
caesar_cypher "foo bar", 5
#=> "ktt gfw"
caesar_cypher "Et tu, Brute?", 43
#=> "vk kl, silkV?"
# use any other rotation value you like;
# you aren't limited to just 0..51
caesar_cypher "Veni, vidi, vici", 152
#=> "Raje, reZe, reYe"
# UTF-8 and punctuation (actually, anything
# not /[A-Za-z]/) will simply pass through
# unencoded since it's not defined in the
# +encoding+ Hash
caesar_cypher "î.ô.ú.", 10
#=> "î.ô.ú."
Syntax Note for Numbered Arguments
The code above should work on most recent Ruby versions, but on versions older than 2.7 you may need to replace the _1 variables inside the block with something like:
phrase.chars.map { |char| encoding.fetch(char, char) }.join
instead of relying on numbered positional arguments. I can't think of anything else that would prevent this code from running on any Ruby version that's not past end-of-life, but if you find something specific please add a comment.
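If you also need to decode, one option (my own addition, not part of the code above) is to call the same method with the negated key, since Array#rotate accepts negative counts. The sketch repeats the method without the warning and with an explicit block parameter so that it runs on older Rubies; caesar_decypher is a hypothetical name.

```ruby
def caesar_cypher(phrase, key)
  letters = [*("A".."Z"), *("a".."z")]
  encoding = letters.rotate(key).zip(letters).to_h.invert
  phrase.chars.map { |char| encoding.fetch(char, char) }.join
end

# Hypothetical helper: rotating by -key undoes a rotation by key.
def caesar_decypher(phrase, key)
  caesar_cypher(phrase, -key)
end

secret = caesar_cypher("Et tu, Brute?", 43)
#=> "vk kl, silkV?"
caesar_decypher(secret, 43)
#=> "Et tu, Brute?"
```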
A_ORD = 'A'.ord
def caesar_cypher(str, offset)
h = ('A'..'Z').each_with_object(Hash.new(&:last)) do |ch,h|
h[ch] = (A_ORD + (ch.ord - A_ORD + offset) % 26).chr
h[ch.downcase] = (h[ch].ord + 32).chr
end
str.gsub(/./, h)
end
Try it.
caesar_cypher("Hello world!", 4)
#=> "Lipps asvph!"
caesar_cypher("What a string!", 6)
#=> "Cngz g yzxotm!"
In executing the first example the hash held by the variable h equals
{"A"=>"E", "a"=>"e", "B"=>"F", "b"=>"f", "C"=>"G", "c"=>"g", "D"=>"H",
"d"=>"h", "E"=>"I", "e"=>"i", "F"=>"J", "f"=>"j", "G"=>"K", "g"=>"k",
"H"=>"L", "h"=>"l", "I"=>"M", "i"=>"m", "J"=>"N", "j"=>"n", "K"=>"O",
"k"=>"o", "L"=>"P", "l"=>"p", "M"=>"Q", "m"=>"q", "N"=>"R", "n"=>"r",
"O"=>"S", "o"=>"s", "P"=>"T", "p"=>"t", "Q"=>"U", "q"=>"u", "R"=>"V",
"r"=>"v", "S"=>"W", "s"=>"w", "T"=>"X", "t"=>"x", "U"=>"Y", "u"=>"y",
"V"=>"Z", "v"=>"z", "W"=>"A", "w"=>"a", "X"=>"B", "x"=>"b", "Y"=>"C",
"y"=>"c", "Z"=>"D", "z"=>"d"}
The snippet
Hash.new(&:last)
is the same as
Hash.new { |h,k| k }
where the block variable h is the (initially-empty) hash that is being created and k is a key. If a hash is defined
hash = Hash.new { |h,k| k }
then (possibly after adding key-value pairs) if hash does not have a key k, hash[k] returns k (that is, the character k is left unchanged).
See the form of Hash::new that takes a block but no argument.
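As a small illustration of how such a hash behaves with String#gsub, here is a sketch using the explicit block form. The method substitute and the sample mapping are invented for this example.

```ruby
# Hypothetical helper: replaces characters found in mapping and
# returns every other character unchanged via the default block.
def substitute(str, mapping)
  table = Hash.new { |_hash, key| key } # a missing key yields the key itself
  table.update(mapping)                 # add the real replacements
  str.gsub(/./, table)
end

substitute("Hello world!", { "l" => "L", "o" => "0" })
#=> "HeLL0 w0rLd!"
```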
We can easily create a decrypting method.
def caesar_decrypt(str, offset)
caesar_cypher(str, 26-offset)
end
offset = 4
s = caesar_cypher("Hello world!", offset)
#=> "Lipps asvph!"
caesar_decrypt(s, offset)
#=> "Hello world!"
offset = 24
s = caesar_cypher("Hello world!", offset)
#=> "Fcjjm umpjb!"
caesar_decrypt(s, offset)
#=> "Hello world!"

Given a string, how do I compare the characters to see if there are duplicates?

I'm trying to compare characters in a given string to see if there are duplicates, and if there are, I want to remove the two characters to reduce the string to be as small as possible, e.g. ("ttyyzx") would become ("zx").
I've tried converting the characters in an array and then using an #each_with_index to iterate over the characters.
arr = ("xxyz").split("")
arr.each_with_index do |idx1, idx2|
if idx1[idx2] == idx1[idx2 + 1]
p idx1[idx2]
p idx1[idx2 + 1]
end
end
At this point I just want to be able to print the next character in the array within the loop so I know I can move on to the next step, but no matter what code I use it will only print out the first character "x".
To only keep the unique characters (ggorlen's answer is "b"): count all characters, find only those that appear once. We rely on Ruby's Hash producing keys in insertion order.
def keep_unique_chars(str)
str.each_char.
with_object(Hash.new(0)) { |element, counts| counts[element] += 1 }.
select { |_, count| count == 1 }.
keys.
join
end
To remove adjacent dupes only (ggorlen's answer is "aba"): a regular expression replacing adjacent repetitions is probably the go-to method.
def remove_adjacent_dupes(str)
str.gsub(/(.)\1+/, '')
end
Without regular expressions, we can use slice_when to cut the array when the character changes, then drop the groups that are too long. One might think a flatten would be required before join, but join doesn't care:
def remove_adjacent_dupes_without_regexp(str)
str.each_char.
slice_when { |prev, curr| prev != curr }.
select { |group| group.size == 1 }.
join
end
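The two interpretations give different results as soon as a character repeats non-adjacently. A quick comparison on a sample input of my own choosing ("abbac"), repeating the two methods in condensed form so the snippet is self-contained:

```ruby
def keep_unique_chars(str)
  counts = str.each_char.with_object(Hash.new(0)) { |char, tally| tally[char] += 1 }
  counts.select { |_, count| count == 1 }.keys.join
end

def remove_adjacent_dupes(str)
  str.gsub(/(.)\1+/, '')
end

keep_unique_chars("abbac")     #=> "c"   ("a" and "b" each occur twice)
remove_adjacent_dupes("abbac") #=> "aac" (only the adjacent "bb" is dropped)
```

Note that the single gsub pass leaves the two "a" characters, which only become adjacent after "bb" is removed. Whether those should be reduced as well depends on what the OP needs.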
While amadan's and user's solution definitely solve the problem I felt like writing a solution closer to the OP's attempt:
def clean(string)
return string if string.length == 1
array = string.split('')
array.select.with_index do |value, index|
array[index - 1] != value && array[index + 1] != value
end.join
end
Here are a few examples:
puts clean("aaaaabccccdeeeeeefgggg")
#-> bdf
puts clean("m")
#-> m
puts clean("ttyyzx")
#-> zx
puts clean("aab")
#-> b
The method makes use of the fact that the characters are sorted, so any duplicates sit either directly before or directly after the character being checked by the select method. The method is slower than the solutions posted above, but as the OP mentioned that he does not work with hashes yet, I thought this might be useful.
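As for the original attempt: each_with_index yields the element first and the index second, so the neighbour has to be looked up in the array, not in the element. A minimal sketch of that comparison (the method name adjacent_pairs is made up for illustration):

```ruby
# Hypothetical helper: collects every pair of equal neighbouring characters.
def adjacent_pairs(str)
  chars = str.chars
  pairs = []
  chars.each_with_index do |char, index|
    following = chars[index + 1] # nil once we run past the last element
    pairs << [char, following] if char == following
  end
  pairs
end

adjacent_pairs("xxyz")   #=> [["x", "x"]]
adjacent_pairs("ttyyzx") #=> [["t", "t"], ["y", "y"]]
```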
If speed is not an issue,
require 'set'
...
Set.new(("xxyz").split("")).to_a.join # => xyz
Making it a Set removes duplicates.
The OP does not want to remove duplicates and keep just a single copy, but to completely remove every character that occurs more than once. So here is a new approach, again compact, but not fast:
"xxyz".split('').sort.join.gsub(/(.)\1+/,'')
The idea is to sort the letters; hence, identical letters will end up next to each other. The regexp /(.)\1+/ describes a repetition of a letter.

A way to specify and initialize the type of a map's values?

I want to count all the words in a line of text. I'm using a map to do this, with the words for keys and integers for values. I don't know how to tell Ruby that all the values will be integers. It forces me to put an ugly branching inside my iterator's block:
# in the constructor
@individual_words = {}
def count_words_from( text_line )
  text_line.each do |line|
    line.scan(/\p{Word}+/)
      .reject{ |string| string =~ /\d/ }
      .each do |word|
        if @individual_words[ word ] == nil then # This is ugly
          @individual_words[ word ] = 1 # This is ugly as well
        else
          @individual_words[ word ] += 1
        end
      end
  end
end
Put simply, I'd like to do something like this Java line:
Map<String, Integer> individualWords;
to avoid having to change the type of the first occurrence of a word from Nil to Integer.
You can set a default value in your hash like this:
individual_words = Hash.new(0)
Then when you come across a word, whether its key is in the hash or not, all you have to do is:
individual_words[word] += 1
You can also do something like this
@individual_words[word] ||= 0
@individual_words[word] += 1
||= ensures that the value gets set if it's not truthy (i.e. nil or false)
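Applied to the method from the question, the default-value approach could look roughly like this. It is a sketch that takes the lines as an argument and returns the hash, where the original used an instance variable; the name count_words is mine.

```ruby
# Hypothetical rewrite of count_words_from using a default value of 0.
def count_words(lines)
  counts = Hash.new(0)
  lines.each do |line|
    line.scan(/\p{Word}+/)
        .reject { |string| string =~ /\d/ }   # skip tokens containing digits
        .each { |word| counts[word] += 1 }    # no nil check needed
  end
  counts
end

count_words(["a b a", "b2 c"])
#=> {"a"=>2, "b"=>1, "c"=>1}
```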

ruby string array iteration. Array of arrays

I have a Ruby problem.
Here's what I'm trying to do:
def iterate1 # define method in given class
  @var3 = @var2.split(" ") # split string to array
  @var4 = @var3
  @var4.each do |i| # for each array item do i
    ra = []
    i.each_char { |d| ra << counter1(d) } # for each char in i, apply def counter1
    @sum = ra.inject(:+)
    @sum2 = @sum.inject(:+) # have to do the inject twice to get values
  end
  @sum2
end
I know I have overcomplicated this.
Basically the input is a string of letters and values like "14556 this word 398".
I am trying to sum the digits in each value, separated by whitespace (" ").
When I use the iterate1 method the block calls the counter1 method just fine, but I can only get the value for the last word or value in the string.
In this case that's 398, which when summed would be 27.
If I include a break I get the first value, which would be 21.
I'm looking to output an array with all of the summed values.
Any help would be greatly appreciated.
I think you're after:
"10 d 20 c".scan(/\b\d+\b/).map(&:to_i).inject(:+) # Returns 30
scan(/\b\d+\b/) will extract all numbers that are made up of digits only in an array, map(&:to_i) will convert them to integers and I guess you already know what inject(:+) will do.
I'm not sure if I understand what you're after correctly, though, so it might help if you provide the answer you expect to this input.
EDIT:
If you want to sum the digits in each number, you can do it with:
"12 d 34 c".scan(/\b\d+\b/).map { |x| x.chars.map(&:to_i).inject(:+) }
x.chars will return the digits as one-character strings, map(&:to_i) will convert them to integers and inject(:+) will sum them.
The simplest answer is to use map instead of each because the former collects the results and returns an array. e.g:
def iterate1 # define method in given class
  @var3 = @var2.split(" ") # split string to array
  @var4 = @var3
  @var4.map do |i| # for each array item do i
    ra = []
    i.each_char { |d| ra << counter1(d) } # for each char in i, apply def counter1
    @sum = ra.inject(:+)
    @sum2 = @sum.inject(:+) # have to do the inject twice to get values
  end
end
You could write it a lot cleaner though and I think Stefan was a big help. You could solve the issue with a little modification of his code
# when you call iterate, you should pass in the value
# even if you have an instance variable available (e.g. @var2)
def iterate(thing)
thing.scan(/\b\d+\b/).map do |x|
x.chars.map{|d| counter1(d)}.inject(:+)
end
end
The above assumes that the counter1 method returns the value as an integer.
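Since counter1 isn't shown in the question, here is a self-contained sketch that assumes it simply converts a digit character to an integer. The method name digit_sums is made up for the example.

```ruby
# Assumption: counter1(d) is equivalent to d.to_i.
def digit_sums(text)
  text.scan(/\b\d+\b/).map do |number|
    number.chars.map(&:to_i).sum # add up the digits of one number
  end
end

digit_sums("14556 this word 398")
#=> [21, 20]
```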

Explaining a Ruby code snippet

I'm in that uncomfortable position again, where somebody has left me with a code snippet in a language I don't know and I have to maintain it. While I haven't introduced Ruby to myself some parts of it are quite simple, but I'd like to hear your explanations nonetheless.
Here goes:
words = File.open("lengths.txt") {|f| f.read }.split # read all lines of a file in 'words'?
values = Array.new(0)
words.each { |value| values << value.to_i } # looked this one up, it's supposed to convert to an array of integers, right?
values.sort!
values.uniq!
diffs = Array.new(0) # this looks unused, unless I'm missing something obvious
sum = 0
s = 0 # another unused variable
# this looks like it's computing the sum of differences between successive
# elements, but that sum also remains unused, or does it?
values.each_index { |index| if index.to_i < values.length-1 then sum += values.at(index.to_i + 1) - values.at(index.to_i) end } # could you also explain the syntax here?
puts "delta has the value of\n"
# this will eventually print the minimum of the original values divided by 2
puts values.at(0) / 2
The above script was supposed to figure out the average of the differences between every two successive elements (integers, essentially) in a list. Am I right in saying this is nowhere near what it actually does, or am I missing something fundamental, which is likely considering I have no Ruby knowledge?
Explanation + refactor (unused variables removed, functional approach, each_cons):
# Read integer numbers from file, sort them ASC and remove duplicates
values = File.read("lengths.txt").split.map(&:to_i).sort.uniq
# Take pairwise combinations and get the total sum of partial differences
partial_diffs = values.each_cons(2).map { |a, b| b - a }.inject(0, :+)
That guy surely didn't grasp Ruby himself. I wonder why he chose to use that language.
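If the goal really is the average of the differences, as the question says, the same each_cons idea extends naturally. A sketch that works on an in-memory array so it can be tried without lengths.txt; the name average_gap and the nil return for short input are my own choices:

```ruby
# Hypothetical helper: average difference between successive sorted, unique values.
def average_gap(values)
  sorted = values.sort.uniq
  diffs = sorted.each_cons(2).map { |a, b| b - a }
  return nil if diffs.empty?     # fewer than two distinct values
  diffs.sum.to_f / diffs.size    # to_f avoids integer division
end

average_gap([10, 40, 20, 20]) #=> 15.0
```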
Here's an annotated explanation:
# Yes, it reads all lines of a file in words (an array)
words = File.open("lengths.txt") {|f| f.read }.split
values = Array.new(0)
# Yes, to_i converts a string into an integer
words.each { |value| values << value.to_i }
values.sort!
values.uniq!
# diffs and s seem unused
diffs = Array.new(0)
sum = 0
s = 0
# The immediate line below can be read as `for(int index = 0; index < values.length; index++)`
values.each_index { |index|
# index is integer, to_i is unnecessary
if index.to_i < values.length-1 then
# The `sum` variable is used here
# The following can be rewritten as sum += values[index+1] - values[index]
sum += values.at(index.to_i + 1) - values.at(index.to_i)
end
}
puts "delta has the value of\n"
# Yes, this will eventually print the minimum of the original values divided by 2
puts values.at(0) / 2
To help you get a better grasp of what "real" (idiomatic) Ruby looks like, I've written what you wanted, with some annotations
values = open("lengths.txt") do |f|
# Read it like this:
#
# Take the list of all lines in a file,
# apply a function to each line
# The function is stripping the line and turning it
# into an integer
# (This means the resultant list is a list of integers)
#
# And then sort it and unique the resultant list
#
# The eventual resultant list is assigned to `values`
# by being the return value of this "block"
f.each_line.map { |l| l.strip.to_i }.sort.uniq
end
# Assign `diffs` to an empty array (instead of using Array.new())
diffs = []
values.each_index do |i|
# Syntactic sugar for `if`
# It applies the 1st part if the 2nd part is true
diffs << (values[i+1] - values[i]) if i < values.length - 1
end
# You can almost read it like this:
#
# Take the list `diffs`, put all the elements in a sentence, like this
# 10 20 30 40 50
#
# We want to inject the function `plus` in between every element,
# so it becomes
# 10 + 20 + 30 + 40 + 50
#
# The colon `:+` is used to refer to the function `plus` as a symbol
#
# Take the result of the above summation, divided by length,
# which gives us average
delta = diffs.inject(:+) / diffs.length
# `delta` should now contain the "average of differences" between
# the original `values`
# String formatting using the % operator
# No \n needed since `puts` already adds one for us
puts "delta has the value of %d" % delta
That is by no means pushing the true power of Ruby, but you see why Rubyists get so enthusiastic about expressiveness and such :P
values.each_index { |index| if index.to_i < values.length-1 then sum += values.at(index.to_i + 1) - values.at(index.to_i) end }
The above line sums the differences between consecutive values. The test index.to_i < values.length-1 keeps values.at(index.to_i + 1) from reading past the end of the array.
You are right, this code does not do much. It only prints half of the minimum value from the file.
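As a side note, the sum of consecutive differences telescopes: every intermediate value is added once and subtracted once, so the total is simply the last value minus the first. A short sketch with sample numbers of my own (sum_of_gaps is a made-up name):

```ruby
# Hypothetical helper: sums the differences between neighbouring values.
def sum_of_gaps(values)
  values.each_cons(2).sum { |a, b| b - a }
end

sample = [3, 7, 12, 20]
sum_of_gaps(sample)          #=> 17
sample.last - sample.first   #=> 17
```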

Resources