How to reset value of local variable within loop? - ruby

I'd like to point out I tried quite extensively to find a solution for this and the closest I got was this. However I couldn't see how I could use map to solve my issue here. I'm brand new to Ruby so please bear that in mind.
Here's some code I'm playing with (simplified):
def base_word input
  input_char_array = input.split('') # split string to array of chars
  @file.split("\n").each do |dict_word|
    input_text = input_char_array
    dict_word.split('').each do |char|
      if input_text.include? char.downcase
        input_text.slice!(input_text.index(char))
      end
    end
  end
end
I need to reset the value of input_text back to the original value of input_char_array after each cycle, but from what I gather since Ruby is reference-based, the modifications I make with the line input_text.slice!(input_text.index(char)) are reflected back in the original reference, and I end up assigning input_text to an empty array fairly quickly as a result.
How do I mitigate that? As mentioned I've tried to use .map but maybe I haven't fully wrapped my head around how I ought to go about it.

You can get an independent copy by duplicating the array. This, obviously, costs some extra memory.
input_text = input_char_array.dup
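A minimal sketch of the difference, assuming nothing beyond core Ruby (the two method names are made up for illustration): plain assignment copies the reference, so both names point at one array, while dup creates a separate array that can be modified freely.

```ruby
# Plain assignment: working and original are the same Array object.
def remove_first_via_alias(original)
  working = original
  working.slice!(0)          # also removes the element from original
  original
end

# dup: working is a new Array, so original is left alone.
def remove_first_via_dup(original)
  working = original.dup
  working.slice!(0)          # only the copy shrinks
  original
end

p remove_first_via_alias(%w[c a t])  # the caller's array lost its first element
p remove_first_via_dup(%w[c a t])    # the caller's array is intact
```

Note that dup is a shallow copy: the array is new, but the strings inside it are still shared. That is fine here because only the array itself is modified.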

The Short and Quite Frankly Not Very Good Answer
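Using slice! modifies the array in place: it removes the element at the given index from input_text and returns it, e.g.
input_text.slice!(0) # input_text is now one element shorter
If you use plain old slice instead, it returns the element without modifying input_text.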
The Longer and Quite Frankly Much Better Answer
In Ruby, code nested four levels deep is often a smell. Let's refactor, and avoid the need to reset a loop at all.
Instead of splitting the file by newline, we'll use Ruby's built-in file handling module to read through the lines. Memoizing it (the ||= operator) may prevent it from reloading the file each time it's referenced, if we're running this more than once.
def dictionary
  @dict ||= File.open('/path/to/dictionary')
end
We could also immediately make all the words lowercase when we open the file, since every character is downcased individually in the original example.
def downcased_dictionary
  # map (unlike each) returns the downcased lines rather than the File object
  @dict ||= File.foreach('/path/to/dictionary').map(&:downcase)
end
Next, we'll use Ruby's built-in file and string functions, including #each_char, to do the comparisons and output the results. We don't need to convert any inputs into Arrays (at all!), because #include? works on strings, and #each_char iterates over the characters of a string.
We'll decompose the string-splitting into its own method, so the loop logic and string logic can be understood more clearly.
Lastly, by using #slice instead of #slice!, we don't overwrite input_text and entirely avoid the need to reset the variable later.
def base_word(input)
  input_text = input.to_s # Coerce in case it's not a string
  # Read through each line in the dictionary
  dictionary.each do |word|
    word.each_char {|char| slice_base_word(input_text, char) }
  end
end
def slice_base_word(input, char)
  input.slice(input.index(char)) if input.include?(char)
end
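For comparison, here is one way the original loop could look with the copy reset on every pass. This is only a sketch under assumptions: the method name leftover_letters, the words parameter (standing in for the dictionary lines), and the returned hash are all hypothetical, since the question does not say what base_word should return.

```ruby
# Hypothetical helper: for each dictionary word, remove the word's letters
# from a fresh copy of the input's characters and report what is left over.
def leftover_letters(input, words)
  input_chars = input.to_s.downcase.chars
  words.each_with_object({}) do |word, result|
    remaining = input_chars.dup            # reset for every word
    word.downcase.each_char do |char|
      index = remaining.index(char)
      remaining.delete_at(index) if index  # remove one occurrence only
    end
    result[word] = remaining.join
  end
end

p leftover_letters("stale", %w[tea eat])
```

Both words start from the full set of letters, so each one leaves "sl" behind. Looking up the index once and checking it for nil also avoids passing nil to a removal method when a character is missing.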

Related

Handle ARGV in Ruby without if...else block

In a blog post about unconditional programming Michael Feathers shows how limiting if statements can be used as a tool for reducing code complexity.
He uses a specific example to illustrate his point. Now, I've been thinking about other specific examples that could help me learn more about unconditional/ifless/forless programming.
For example in this cat clone there is an if..else block:
#!/usr/bin/env ruby
if ARGV.length > 0
ARGV.each do |f|
puts File.read(f)
end
else
puts STDIN.read
end
It turns out ruby has ARGF which makes this program much simpler:
#!/usr/bin/env ruby
puts ARGF.read
I'm wondering if ARGF didn't exist how could the above example be refactored so there is no if..else block?
Also interested in links to other illustrative specific examples.
Technically you can,
inputs = { ARGV => ARGV.map { |f| File.open(f) }, [] => [STDIN] }[ARGV]
inputs.map(&:read).map(&method(:puts))
Though that's code golf and too clever for its own good.
Still, how does it work?
It uses a hash to store two alternatives.
Map ARGV to an array of open files
Map [] to an array with STDIN, effectively overwriting the ARGV entry if it is empty
Access ARGV in the hash, which returns [STDIN] if it is empty
Read all open inputs and print them
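The trick can be tried in isolation. In this sketch pick_inputs is a hypothetical helper; args stands in for ARGV, opened for the array of open files, and fallback for STDIN.

```ruby
# When args is empty it is equal to [], so the second pair replaces the
# first and the lookup returns [fallback]. Otherwise the two keys differ
# and the lookup returns opened.
def pick_inputs(args, opened, fallback)
  { args => opened, [] => [fallback] }[args]
end

p pick_inputs(["a.txt"], [:file_a], :stdin)
p pick_inputs([], [], :stdin)
```

It works because arrays with equal contents are equal hash keys, and a later key in a hash literal overwrites an earlier equal one.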
Don't write that code though.
As mentioned in my answer to your other question, unconditional programming is not about avoiding if expressions at all costs but about striving for readable and intention revealing code. And sometimes that just means using an if expression.
You can't always get rid of a conditional (maybe with an insane number of classes) and Michael Feathers isn't advocating that. Instead it's sort of a backlash against overuse of conditionals. We've all seen nightmare code that's endless chains of nested if/elsif/else and so has he.
Moreover, people do routinely nest conditionals inside of conditionals. Some of the worst code I've ever seen is a cavernous nightmare of nested conditions with odd bits of work interspersed within them. I suppose that the real problem with control structures is that they are often mixed with the work. I'm sure there's some way that we can see this as a form of single responsibility violation.
Rather than slavishly try to eliminate the condition, you could simplify your code by first creating an array of IO objects from ARGV, and use STDIN if that list is empty.
io = ARGV.map { |f| File.new(f) }
io = [STDIN] if io.empty? # 0 is truthy in Ruby, so test emptiness explicitly
Then your code can do what it likes with io.
While this has strictly the same number of conditionals, it eliminates the if/else block and thus a branch: the code is linear. More importantly, since it separates gathering data from using it, you can put it in a function and reuse it further reducing complexity. Once it's in a function, we can take advantage of early return.
# I don't have a really good name for this, but it's a
# common enough idiom. Perl provides the same feature as <>
def arg_files
  return ARGV.map { |f| File.new(f) } unless ARGV.empty?
  [STDIN]
end
Now that it's in a function, your code to cat all the files or stdin becomes very simple.
arg_files.each { |f| puts f.read }
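To try the idea without touching real files, the function can take its inputs as parameters. This is a sketch, not the answer's exact code: input_streams, cat, and the opener keyword are hypothetical names, and StringIO objects stand in for files and STDIN. It checks emptiness with empty? because 0 is truthy in Ruby, so a bare length test would never select the fallback.

```ruby
require "stringio"

# Hypothetical variant of arg_files: paths stands in for ARGV,
# fallback for STDIN, and opener for File.new.
def input_streams(paths, fallback, opener: ->(path) { File.new(path) })
  return [fallback] if paths.empty?

  paths.map { |path| opener.call(path) }
end

# Reads every stream in order and joins the contents.
def cat(streams)
  streams.map(&:read).join
end

fake_files = { "a.txt" => "alpha\n", "b.txt" => "beta\n" }
fake_open  = ->(path) { StringIO.new(fake_files.fetch(path)) }

print cat(input_streams(%w[a.txt b.txt], StringIO.new("ignored"), opener: fake_open))
print cat(input_streams([], StringIO.new("from stdin\n")))
```

Separating "which streams" from "what to do with them" is the point of the refactoring: cat never needs to know where its input came from.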
First, although the principle is good, you have to consider other things that are more important, such as readability and perhaps speed of execution.
That said, you could monkeypatch the String class to add a read method, then put the arguments and STDIN in one array and read a slice of it. With arguments, the slice 0..ARGV.length-1 stops just before STDIN; with no arguments it becomes 0..-1, which runs to the end of the array and so includes STDIN.
class String
  def read
    File.read self if File.exist? self
  end
end
puts [*ARGV, STDIN][0..ARGV.length-1].map{|a| a.read}
Before someone notices that I still use an if to check whether a File exists: you should have used two ifs in your example to check this as well, and if you don't, use a rescue to properly inform the user.
EDIT: if you would use the patch, read about the possible problems at these links
http://blog.jayfields.com/2008/04/alternatives-for-redefining-methods.html
http://www.justinweiss.com/articles/3-ways-to-monkey-patch-without-making-a-mess/
Since read isn't already a method of String, the solutions using alias and super are not necessary. If you plan to use a Module, here is how to do that:
module ReadString
  def read
    File.read self if File.exist? self
  end
end
class String
  include ReadString
end
EDIT: just read about a safe way to monkey patch, for your documentation see https://solidfoundationwebdev.com/blog/posts/writing-clean-monkey-patches-fixing-kaminari-1-0-0-argumenterror-comparison-of-fixnum-with-string-failed?utm_source=rubyweekly&utm_medium=email

Datatype conversion error in Ruby for-loop

I'm looking for some help understanding why I get an error (no implicit conversion of nil into String) when attempting to use a for-loop to search through an array of letters (and add them to a resulting string, which seems to be the real problem), but not when I use a while-loop or 'each' for the same purposes. I've looked through a lot of documentation, but haven't been able to find an answer as to why this is happening. I understand that I could just use the "each" method and call it a day, but I'd prefer to comprehend the cause as well as the effect (and hopefully avoid this problem in the future).
The following method works as desired: printing "result" which is the original string, only with "!" in place of any vowels.
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
string_array.each do |i|
  if vowels.include?(i)
    result+="!"
  else
    result+=i
  end
end
puts result
However, my initial attempt (posted below) raises the error mentioned above: "no implicit conversion of nil into String" citing lines 5 and 9.
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
for i in 0..string_array.length
  if vowels.include?(string_array[i])
    result+= "!"
  else
    result+=string_array[i]
  end
end
puts result
Through experimentation, I managed to get it working; and I determined--through printing to screen rather than storing in "result"--that the problem occurs during concatenation of the target letter to the string "result". But why is "string_array[i]" (line #9) seen as NIL rather than as a String? I feel like I'm missing something very obvious.
If it matters: This is just a kata on CodeWars that led me to a fundamental question about data types and the mechanics of the for..in loop. This seemed very relevant, but not 100% on the mark for my question: "for" vs "each" in Ruby.
Thanks in advance for the help.
EDIT:
Okay, I think I figured it out. I'd still love some answers though, to confirm, clarify, or downright refute.
I realized that if I wanted to use the for-loop, I should use the array itself as the "range" rather than "0..array.length", like so:
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
for i in string_array
  if vowels.include?(i)
    result+= "!"
  else
    result+=i
  end
end
puts result
So, is it that since the "each" method variable (in this case, "i") doesn't exist outside the scope of the main block, its datatype becomes nil after evaluating whether it's included in the 'vowels' array?
You got bitten by the classic off-by-one error: when iterating over an array by index starting at 0, the end position should be length-1, not length.
But it seems like you come from some other programming language; your code is not very Ruby-like. A for loop, for example, is seldom used.
Ruby is a higher-level language than most others and has many solutions built in. We call it 'sugared' because Ruby is meant to make us programmers happy. What you are trying to achieve can be done in just one line.
"helloHELLO".scan(/[aeoui]/i).count
Some explanation: the literal "helloHELLO" is a String, meaning an object of the String class, and as such it has a lot of methods you can use, like scan, which scans the string for the regular expression /[aeoui]/, meaning any of the characters enclosed in the []. The i at the end makes it case insensitive, so you don't have to add AEOUI. scan returns an array with the matching characters, and an object of the Array class has the method count, which gives us the ... Yeah, once you get the drift it's easy: you can string together methods which act upon each other.
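Since the code in the question replaces vowels rather than counting them, a one-liner closer to that goal might use gsub. A small sketch (mask_vowels and count_vowels are names made up here):

```ruby
# Replaces every vowel, upper or lower case, with "!".
def mask_vowels(text)
  text.gsub(/[aeiou]/i, "!")
end

# Counts the vowels, as the scan/count chain does.
def count_vowels(text)
  text.scan(/[aeiou]/i).count
end

puts mask_vowels("helloHELLO")   # prints h!ll!H!LL!
puts count_vowels("helloHELLO")  # prints 4
```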
Your for loop:
for i in 0..string_array.length
loops from 0 to 10.
But string_array[10] #=> nil because there is no element at index 10. And then on line 9 you try to add nil to result
result = result + string_array[i] #expanded
You can't add nil to a string like this; nil would have to be converted to a string explicitly, hence the error. The best way to fix this issue is to change your for loop to:
for i in 0..string_array.length-1
Then your loop will finish at the last element, string_array[9].
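The off-by-one can be seen directly by comparing the two range forms. In this sketch the helper names are hypothetical; the three-dot range excludes its end, and each_index avoids writing the range at all.

```ruby
# Two dots: the end value is included, so there is one index too many.
def indices_inclusive(array)
  (0..array.length).to_a
end

# Three dots: the end value is excluded, matching the valid indices.
def indices_exclusive(array)
  (0...array.length).to_a
end

# Index-based version of the loop that never leaves the array.
def mask_by_index(string_array, vowels)
  result = ""
  string_array.each_index do |i|
    char = string_array[i]
    result += vowels.include?(char) ? "!" : char
  end
  result
end

p indices_inclusive(%w[a b c])
p indices_exclusive(%w[a b c])
puts mask_by_index("helloHELLO".chars, %w[a e i o u A E I O U])
```

Reading one past the end of an array returns nil rather than raising, which is why the error only shows up later, at the string concatenation.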

Why is there no `.split!` in Ruby?

It just seems pretty logical to have it when there's even a downcase!. Has anyone else run into this use case in Ruby?
For the curious, I'm trying to do this:
def some_method(foo)
  foo.downcase!.split!(" ")
  ## do some stuff with foo later. ##
end
some_method("A String like any other")
Instead of this:
def some_method(foo)
  foo = foo.downcase.split(" ")
  ## do some stuff with foo later. ##
end
some_method("A String like any other")
Which isn't a really big deal...but ! just seems cooler.
Why is there no .split! in Ruby?
It just seems pretty logical to have it when there's even a downcase!.
It may be logical, but it is impossible: objects cannot change their class or their identity in Ruby. You may be thinking of Smalltalk's become: which doesn't and cannot exist in Ruby. become: changes the identity of an object and thus can also change its class.
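A small sketch of the distinction, using only core methods (identity_report is a hypothetical name): downcase! changes the contents of the same object, while split has to return a different object of a different class.

```ruby
# Reports whether the string kept its identity and what split produced.
def identity_report(text)
  before = text.object_id
  text.downcase!                  # mutates the receiver in place
  words = text.split(" ")         # builds a brand-new Array
  {
    same_string: text.object_id == before,
    string_class: text.class,
    words_class: words.class,
    words: words
  }
end

p identity_report("A String like any other".dup)
```

A related trap in the chained form from the question: downcase! returns nil when the string is already lower case, so calling another method on its result can fail with NoMethodError.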
I don't see this "use case" as very important.
The only thing a "bang method" is doing is saving you the trouble of assigning a variable.
The reason "bang methods" are the exception instead of the rule is they can produce confusing results if you don't understand them.
For example, if you write
a = "string"
def my_upcase(string)
  string.upcase!
end
b = my_upcase(a)
then both a and b will have transformed value even if you didn't intend to change a. Removing the exclamation point fixes this example, but if you're using mutable objects such as hashes and arrays you'll have to look out for this in other situations as well.
a = [1,2,3]
def get_last_element(array)
  array.pop
end
b = get_last_element(a)
Since Array#pop has side effects, a is now [1, 2]. It has the last element removed, which might not have been what you intended. You could replace .pop here with [-1] or .last to get rid of the side effect.
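The two variants side by side, as a short sketch (both method names are made up for illustration):

```ruby
# Returns the last element and removes it from the caller's array.
def last_element_destructive(array)
  array.pop
end

# Returns the last element and leaves the caller's array alone.
def last_element_safe(array)
  array.last
end

numbers = [1, 2, 3]
p last_element_safe(numbers)         # numbers still has three elements
p last_element_destructive(numbers)  # numbers now has two elements
p numbers
```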
The exclamation point in a method name is essentially warning you that there are side effects. This is important in functional programming, which favours side-effect-free code. Ruby borrows a good deal from functional programming (although it's very object oriented as well).
If your "use case" boils down to avoiding assigning a variable, that seems like a really minor discomfort.
For a more technical reason, though, see Jorg Mittag's answer. It's impossible to write a method which changes the class of self
This
def some_method(foo)
  foo = foo.downcase.split(" ")
end
some_method("A String like any other")
is the same as this
def some_method(foo)
  foo.downcase.split
end
some_method("A String like any other")
Actually, both of your methods return the same result. We can look at a few examples of methods that modify the caller.
array.map! returns the original array, modified
string.upcase! returns the original string, modified
However,
split returns an object of a different class: it turns a string into an array.
Notice how the above examples only modify the content of the object, instead of changing its class.
This is most likely why there isn't a split! method, although it's pretty easy to define one yourself.
#split creates an array out of a string; you can't permanently mutate(!) the string into being an array. Because the method creates a new object from the source information (the string), the only thing you need to do to make it permanent is to bind it to a variable.

Selecting key words in a string (that are included in an Array) to change their format in Ruby

Select key words in a string to change their format in Ruby
I have a big string (text) and an Array of strings (key_words) as below:
text = 'So in this election, we cannot sit back and hope that everything works out for the best. We cannot afford to be tired or frustrated or cynical. No, hear me. Between now and November, we need to do what we did eight years ago and four years ago…'
key_words = ['frustrated', 'tired', 'hope']
My objective is to print each word in ‘text’ while changing the colour and case of the words that are included in key_words. I’ve been able to do that by doing:
require 'colorize'
text.split(/\b/).each do |x|
  if key_words.include?(x.downcase)
    print x.colorize(:red)
  else
    print x
  end
end
However, since I don’t want to include many words in key_words I want to make the selection more sensitive going beyond an exact match. Such as if, for example:
key_words = ['frustrat', 'tire', 'hope'] => the algorithm would select both 'Frustration', 'Frustrated' or 'Tiring' and 'Tired' or 'Hope' and 'Hopeful'.
I’ve tried playing with word lengths in both the string and the array as below, but it seems a very inefficient solution and I’m getting very confused with the usage of the .any? and .include? methods in this scenario.
key_words = ['frustrated', 'tired', 'hope']
key_words_abb = []
key_words.each { |x| key_words_abb << x.downcase[0][0..x.length-2]}
text.split(/\b/).each do |x|
  if key_words_abb.include?(x.downcase[0][0..x.length-2]); print x.colorize(:red)
  else print x
  end
end
Since I can’t find a specific solution online I would appreciate your help.
It's worth noting that when doing repeated substitutions on strings, especially longer ones, you'll want your substitution method to be as efficient as possible. Spinning through an array of things to switch out is painfully expensive, especially as that list grows.
Here's a variation on your approach:
replacement = Regexp.new('\b%s\b' % [ Regexp.union(key_words) ])
replaced = text.gsub(replacement) do |s|
  s.colorize(:red)
end
puts replaced
If you're using that substitution repeatedly you should persist the Regexp object into a constant. That avoids having to compile it for each string you're adjusting. If the list changes based on factors hard to predict, leave it like this and produce it dynamically.
One thing to note about using Ruby is it's often best to express your code as a series of transformations with output as a final step. Putting things like print in the middle of a loop complicates things unnecessarily. If you want to add an additional step to your loop you have to do a lot of extra work to move that print to a later stage. With the approach here you can just chain on the end and do whatever you want.
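To get the prefix matching the question asks for, the pattern can allow trailing word characters after each stem. A sketch, assuming a hypothetical helper highlight_stems that yields each match to a block so the formatting step stays separate (with the colorize gem the block could be { |w| w.colorize(:red) }):

```ruby
# Hypothetical helper: finds every word that starts with one of the stems
# (case-insensitively) and replaces it with whatever the block returns.
def highlight_stems(text, stems)
  pattern = /\b(?:#{Regexp.union(stems).source})\w*/i
  text.gsub(pattern) { |word| yield word }
end

sample = "We cannot afford to be tired or Frustrated; be hopeful."
puts highlight_stems(sample, %w[frustrat tire hope]) { |word| "[#{word.upcase}]" }
```

A stem only matches words that literally begin with it, so 'tire' catches "tired" but not "Tiring"; a shorter stem such as 'tir' would be needed for both.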

ruby 1.9: how do I get a byte-index-based slice of a String?

I'm working with UTF-8 strings. I need to get a slice using byte-based indexes, not char-based.
I found references on the web to String#subseq, which is supposed to be like String#[], but for bytes. Alas, it seems not to have made it to 1.9.1.
Now, why would I want to do that? There's a chance I'll end up with an invalid string should I slice in the middle of a multi-byte char. This sounds like a terrible idea.
Well, I'm working with StringScanner, and it turns out its internal pointers are byte-based. I accept other options here.
Here's what I'm working with right now, but it's rather verbose:
s.dup.force_encoding("ASCII-8BIT")[ix...pos].force_encoding("UTF-8")
Both ix and pos come from StringScanner, so are byte-based.
You can do this too: s.bytes.to_a[ix...pos].pack("C*").force_encoding("UTF-8"), but that looks even more esoteric to me.
If you're calling the line several times, a nicer way to do it could be this:
class String
  def byteslice(*args)
    self.dup.force_encoding("ASCII-8BIT").slice(*args).force_encoding("UTF-8")
  end
end
s.byteslice(ix...pos)
Doesn't String#bytes do what you want? It returns an enumerator to the bytes in a string (as numbers, since they might not be valid characters, as you pointed out)
str.bytes.to_a.slice(...)
Use this monkeypatch until String#byteslice() is added to Ruby 1.9.
class String
  unless method_defined? :byteslice
    ##
    # Does the same thing as String#slice but
    # operates on bytes instead of characters.
    #
    def byteslice(*args)
      unpack('C*').slice(*args).pack('C*')
    end
  end
end
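String#byteslice did become a core method in Ruby 1.9.3, so on any later Ruby the patch is unnecessary. A sketch combining it with StringScanner, whose pos is a byte offset (scanned_slice is a hypothetical helper):

```ruby
require "strscan"

# Scans pattern from the start of text and returns the matched region,
# cut out by byte offsets rather than character offsets.
def scanned_slice(text, pattern)
  scanner = StringScanner.new(text)
  start = scanner.pos             # byte offset before the scan
  scanner.scan(pattern)
  text.byteslice(start...scanner.pos)
end

word = "héllo"                    # "é" takes two bytes in UTF-8
puts scanned_slice("héllo wörld", /\S+/)
puts word.byteslice(1, 2)                   # the two bytes of "é"
puts word.byteslice(1, 1).valid_encoding?   # half a character is invalid
```

The last line shows the risk mentioned in the question: a byte range that ends inside a multi-byte character yields a string that is not valid UTF-8.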

Resources