Ruby refactoring sentence manipulation

I have a bit of a brain buster: I am trying to refactor this method. There are several goals behind doing this. The first is that anyone who reads this code once should be able to move on without having questions about it.
Second, I am hoping there is a faster way of doing this. In my mind it's almost like a P vs NP thing, but I am sure there are neater ways of accomplishing it. Applying the SOLID principles is the goal.
This method receives a string and breaks it down into individual words. Then it inspects each word for the following suffixes: -er, -ers, -ed, -ant, -and and -anned. Each suffix has a replacement: -er becomes -xor, -ers becomes -xors, -ed becomes -d, and -ant, -and and -anned become -& (e.g. banned becomes b&).
This is what I have so far for the -er and -ers suffixes. It's ugly and, I bet, really slow.
def reconstruct_sentence
  s = @sentence.split(/\W+/)
  s.each_with_index do |word, i|
    if word.end_with?("er")
      s[i] = word.chomp("er") + ("xor")
    elsif word.end_with?("ers")
      s[i] = word.chomp("ers") + ("xors")
    else
      return
    end
    s.join(" ")
  end
end
I think what I am asking for is far out there, and refactoring takes, I think, a few years of coding experience to get used to. But as I go along I can see that this method has more than one purpose, which breaks SOLID, so I broke it up like this:
def edit_sentence
  split_sentence(@sentence) # this would be the sentence that is initialized
  sentence.each_with_index do |word, i|
    if word.end_with?("er")
      sentence[i] = word.chomp("er") + ("xor")
    elsif word.end_with?("ers")
      sentence[i] = word.chomp("ers") + ("xors")
    else
      return
    end
    reconstruct_sentence
  end
end
def split_sentence(sentence)
  sentence.split(/\W+/)
end
def reconstruct_sentence
  sentence.join(" ")
end
The part that I am struggling to refactor is handling all the suffixes in one method call. With all the repetition, I first thought I should use a hash with the old suffixes as keys and the new ones as values, but I reckon it's going to get complicated, like metaprogramming to the max. Any advice? And does anyone know a good book on refactoring patterns?
Thanks in advance.

First, your code doesn't work properly, since the return keyword returns from the whole method, instead of continuing.
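To make the point about return concrete, here is a minimal sketch. The method names stops_early and keeps_going are made up for illustration and are not from the question. A return inside a block leaves the enclosing method, while next only skips to the following element:

```ruby
# Hypothetical helper: `return` inside the block exits the whole method.
def stops_early(words)
  words.each do |word|
    return :gave_up unless word.end_with?("er")
  end
  :checked_all
end

# Hypothetical helper: `next` only skips the current word and keeps iterating.
def keeps_going(words)
  converted = []
  words.each do |word|
    next unless word.end_with?("er")
    converted << word.chomp("er") + "xor"
  end
  converted
end

p stops_early(%w[a tester])  # => :gave_up
p keeps_going(%w[a tester])  # => ["testxor"]
```

In the question's code the else branch is hit by the first word without a matching suffix, so the method returns nil before any sentence is rebuilt.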
For a short version of what you are trying to do, you can use gsub with a hash of replacements:
def reconstruct_sentence
  @sentence.gsub(/ers?\b/, 'er' => 'xor', 'ers' => 'xors')
end
@sentence = 'this is a tester without any eaters'
reconstruct_sentence
# => "this is a testxor without any eatxors"
More generically you can do:
def reconstruct_sentence
  replacements = {'er' => 'xor', 'ers' => 'xors', 'ed' => 'd',
                  'ant' => '&', 'and' => '&', 'anned' => '&'}
  @sentence.gsub(Regexp.union(replacements.keys), replacements)
end
@sentence = 'this is a tester without any planned fighters'
reconstruct_sentence
# => "this is a testxor without any pl& fightxors"
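One caveat with a plain Regexp.union is that it also matches in the middle of a word, so "handy" would come out as "h&y". If only word endings should change, a possible variation is to append \b to the generated pattern, as the shorter version does. This is a sketch under that assumption; REPLACEMENTS, SUFFIX_PATTERN and replace_suffixes are names invented here:

```ruby
# Suffix table from the question, kept in one place.
REPLACEMENTS = {
  'er' => 'xor', 'ers' => 'xors', 'ed' => 'd',
  'ant' => '&', 'and' => '&', 'anned' => '&'
}.freeze

# Interpolating the union wraps it in a group; \b pins the match to the end of a word.
SUFFIX_PATTERN = /#{Regexp.union(REPLACEMENTS.keys)}\b/

# Hypothetical helper that takes the sentence as an argument instead of reading @sentence.
def replace_suffixes(sentence)
  sentence.gsub(SUFFIX_PATTERN, REPLACEMENTS)
end

puts replace_suffixes('this is a tester without any planned fighters')
# => this is a testxor without any pl& fightxors
puts replace_suffixes('handy banned')
# => handy b&
```

Passing the sentence in as a parameter also keeps the method free of instance state, which makes it easier to test on its own.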

Related

Convert a letter to its corresponding control code

Given a single letter (string), say "a", I want to convert this into its corresponding control code, i.e. "\ca" - or equivalently (in alternate syntax) - "\C-a", ?\ca, "\x01", "\u0001"
I was hoping there'd be some "nice", clean way of doing this conversion, but I can't figure it out.
An obvious first attempt might be to try something like:
def convert_to_control_code(letter)
  "\c#{letter}"
end
...But this does not work, since it will always return "\u0003{letter}" (where "\u0003" is the control code "\c#").
My current solution is simply to "brute force" it by doing the following:
def convert_to_control_code(letter)
  (0..255).detect { |x| x.chr =~ Regexp.new("\\c#{letter}") }.chr
end
However, I can't help but feel there's a "right" way of doing this!
Edit:
Here's another, non brute-force solution I've come up with, that seems to work:
def convert_to_control_code(letter)
  (letter.ord % 32).chr
end
This looks much nicer, but also very hacky!
You can write it as:
def convert_to_control_code(letter)
  eval "?\\C-#{letter.chr}"
end
convert_to_control_code(97) # => "\u0001"
convert_to_control_code(98) # => "\u0002"
One possibility is to do the same as Ruby itself does. It might look something like this:
def convert_to_control(letter)
  letter = letter.chr # ensure we are only dealing with a single char
  return 0177.chr if letter == '?'
  raise 'an error' unless letter.ascii_only? # or do something else
  (letter.ord & 0x9f).chr
end
You might want to change the encoding of the result depending on what you are doing.
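For what it's worth, the % 32 version and the bit-mask version agree for plain letters because the control code of an ASCII letter is just its low five bits. Here is a small sketch that checks this against the literal escapes. The name control_code_for and the strict input check are my own additions, not part of either answer:

```ruby
# Hypothetical variant that only accepts a single ASCII letter.
def control_code_for(letter)
  unless letter.match?(/\A[A-Za-z]\z/)
    raise ArgumentError, "expected a single ASCII letter, got #{letter.inspect}"
  end
  # Keep only the low five bits: "a" (0x61) and "A" (0x41) both map to 0x01.
  (letter.ord & 0x1f).chr
end

p control_code_for("a") == "\ca"   # => true
p control_code_for("A") == "\x01"  # => true
p control_code_for("z").ord        # => 26
```

Masking with 0x1f gives the same result as % 32 for any non-negative code point; it just makes the intent a bit more visible.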

Ruby - Populate an Array with returned method values

So, pretend we have the following three methods that check a grid to determine if there is a winner, and will return true if there is.
def win_diagonal?
  # Code here to check for diagonal win.
end
def win_horizontal?
  # Code here to check for horizontal win.
end
def win_vertical?
  # Code here to check for vertical win.
end
I would like to push the returned values of each method into an Array instead of literally using the method names. Is this possible?
def game_status
  check_wins = [win_vertical?, win_diagonal?, win_horizontal?]
  if check_wins.uniq.length != 1 # When we don't have only false returns from methods
    return :game_over
  end
end
What you are looking for will indeed work in ruby.
def hello_world?
  "hello world!"
end
a = [hello_world?]
Prints out
=> ["hello world!"]
Hope that helps. IRB is your friend when you wonder if something is possible in Ruby :-)
A simpler (and very readable) way:
def game_status
  win_vertical? || win_diagonal? || win_horizontal?
end
If, for example, win_vertical? returns true, the other algorithms won't even need to run. You return immediately.
Or, if you need to know in which way the user won, I mean, if you need to preserve the results of all methods after they ran, you can use a hash, like:
{:vertical => win_vertical?, :diagonal => win_diagonal?, :horizontal => win_horizontal?}
This solution, like the array one, is worse than the first one above because it always runs all the algorithms. If they are expensive, you may have a problem. =)
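If you want both things at once, knowing which check passed and not running the rest, one option is to keep the method names in a list and stop at the first one that returns true. This is only a sketch: the Game class, the outcomes hash and winning_check are invented here so the example runs on its own, and the three win_* methods stand in for your real grid checks:

```ruby
class Game
  CHECKS = %i[win_vertical? win_diagonal? win_horizontal?].freeze

  # `outcomes` is a hypothetical stand-in for inspecting a real grid.
  def initialize(outcomes)
    @outcomes = outcomes
  end

  def win_vertical?
    @outcomes[:vertical]
  end

  def win_diagonal?
    @outcomes[:diagonal]
  end

  def win_horizontal?
    @outcomes[:horizontal]
  end

  # Name of the first check that passes, or nil; later checks are never called.
  def winning_check
    CHECKS.find { |check| public_send(check) }
  end

  def game_status
    :game_over if winning_check
  end
end

game = Game.new(vertical: false, diagonal: true, horizontal: true)
p game.winning_check  # => :win_diagonal?
p game.game_status    # => :game_over
```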
You can do something like this when you really want to store all return values in an array:
def game_status
  check_wins = [win_vertical?, win_diagonal?, win_horizontal?]
  return :game_over if check_wins.any?
end
For readability I would prefer:
def game_status
  return :game_over if win_vertical? || win_diagonal? || win_horizontal?
end

Functionally find mapping of first value that passes a test

In Ruby, I have an array of simple values (possible encodings):
encodings = %w[ utf-8 iso-8859-1 macroman ]
I want to keep reading a file from disk until the results are valid. I could do this:
good = encodings.find{ |enc| IO.read(file, "r:#{enc}").valid_encoding? }
contents = IO.read(file, "r:#{good}")
...but of course this is dumb, since it reads the file twice for the good encoding. I could program it in gross procedural style like so:
contents = nil
encodings.each do |enc|
  if (s=IO.read(file, "r:#{enc}")).valid_encoding?
    contents = s
    break
  end
end
But I want a functional solution. I could do it functionally like so:
contents = encodings.map{|e| IO.read(f, "r:#{e}")}.find{|s| s.valid_encoding? }
…but of course that keeps reading files for every encoding, even if the first was valid.
Is there a simple pattern that is functional, but does not keep reading the file after the first success is found?
If you sprinkle a lazy in there, map will only consume those elements of the array that are used by find - i.e. once find stops, map stops as well. So this will do what you want:
possible_reads = encodings.lazy.map {|e| IO.read(f, "r:#{e}")}
contents = possible_reads.find {|s| s.valid_encoding? }
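To see that the lazy version really stops early, here is a self-contained sketch that records each attempt. It does not touch the disk: raw_bytes is a made-up stand-in for the file contents, and force_encoding stands in for reading the file with a given encoding:

```ruby
encodings = %w[utf-8 iso-8859-1 macroman]
raw_bytes = "caf\xE9".b  # hypothetical file contents: Latin-1 bytes, not valid UTF-8

attempted = []
contents = encodings.lazy.map { |enc|
  attempted << enc                   # record every decoding attempt
  raw_bytes.dup.force_encoding(enc)  # stand-in for reading the file as `enc`
}.find(&:valid_encoding?)

p attempted               # => ["utf-8", "iso-8859-1"]
p contents.encoding.name  # => "ISO-8859-1"
```

The third encoding is never tried, because find stops pulling from the lazy map as soon as one result is valid.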
Hopping on sepp2k's answer: If you can't use 2.0, lazy enums can be easily implemented in 1.9:
class Enumerator
  def lazy_find
    self.class.new do |yielder|
      self.each do |element|
        if yield(element)
          yielder.yield(element)
          break
        end
      end
    end
  end
end
a = (1..100).to_enum
p a.lazy_find { |i| i.even? }.first
# => 2
You want to use the break statement:
contents = encodings.each do |e|
  s = IO.read( f, "r:#{e}" )
  s.valid_encoding? and break s
end
The best I can come up with is with our good friend inject:
contents = encodings.inject(nil) do |s, enc|
  # Keep the first valid read; otherwise try the next encoding.
  s || ((c = File.read(f, mode: "r:#{enc}")).valid_encoding? && c)
end
This is still sub-optimal because it continues to loop through encodings after finding a match, though it doesn't do anything with them, so it's a minor ugliness. Most of the ugliness comes from...well, the code itself. :/

Is adding nowiki-tags to this parser feasible?

Update: for the record, here's the implementation I ended up using.
Here's a trimmed down version of a parser I'm working on. There's still some code, but it should be quite easy to grasp the basic concepts of this parser.
class Markup
  def initialize(markup)
    @markup = markup
  end
  def to_html
    @html ||= @markup.split(/(\r\n){2,}|\n{2,}/).map {|p| Paragraph.new(p).to_html }.join("\n")
  end
  class Paragraph
    def initialize(paragraph)
      @p = paragraph
    end
    def to_html
      @p.gsub!(/'{3}([^']+)'{3}/, "<strong>\\1</strong>")
      @p.gsub!(/'{2}([^']+)'{2}/, "<em>\\1</em>")
      @p.gsub!(/`([^`]+)`/, "<code>\\1</code>")
      case @p
      when /^=/
        level = (@p.count("=") / 2) + 1 # Starting on h2
        @p.gsub!(/^[= ]+|[= ]+$/, "")
        "<h#{level}>" + @p + "</h#{level}>"
      when /^(\*|\#)/
        # I'm parsing lists here. Quite a lot of code, and not relevant, so
        # I'm leaving it out.
      else
        @p.gsub!("\n", "\n<br/>")
        "<p>" + @p + "</p>"
      end
    end
  end
end
p Markup.new("Here is `code` and ''emphasis'' and '''bold'''!\n\nBaz").to_html
# => "<p>Here is <code>code</code> and <em>emphasis</em> and <strong>bold</strong>!</p>\n<p>Baz</p>"
So, as you can see, I'm breaking the text into paragraphs, and each paragraph is either a header, a list or a regular paragraph.
Is it feasible to add support for nowiki tags (where everything between <nowiki></nowiki> is not being parsed) for a parser like this? Feel free to answer "no", and suggest alternative methods of creating a parser :)
As a side note, you can see the actual parser code on GitHub: markup.rb and paragraph.rb
If you make use of a simple tokenizer, it's much easier to manage this sort of thing. One approach is to create a single regular expression that can capture your entire grammar, but this might prove to be problematic. An alternative is to split up the document into sections that need to be rewritten, and sections that should be skipped, which is likely the easier approach here.
Here's a simple framework you can extend as required:
def wiki_subst(string)
  buffer = string.dup
  result = ''
  while (m = buffer.match(/<\s*nowiki\s*>.*?<\s*\/\s*nowiki\s*>/i))
    result << yield(m.pre_match)
    result << m.to_s
    buffer = m.post_match
  end
  result << yield(buffer)
  result
end
example = "replace me<nowiki>but not me</nowiki>replace me too<NOWIKI>but not me either</nowiki>and me"
puts wiki_subst(example) { |s| s.upcase }
# => REPLACE ME<nowiki>but not me</nowiki>REPLACE ME TOO<NOWIKI>but not me either</nowiki>AND ME
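The same "rewrite these parts, skip those parts" idea can also be sketched with String#split, since a capture group in the pattern keeps the delimiters in the result. This is just an alternative shape, not a replacement for the framework above; NOWIKI_SPAN and wiki_subst_split are names I made up, and I added the m flag on the assumption that a nowiki section may span several lines:

```ruby
# The capture group makes split return the nowiki spans as well as the text between them.
NOWIKI_SPAN = /(<\s*nowiki\s*>.*?<\s*\/\s*nowiki\s*>)/im

# Hypothetical alternative to wiki_subst: even-indexed chunks are ordinary text,
# odd-indexed chunks are nowiki spans that are passed through untouched.
def wiki_subst_split(string)
  string.split(NOWIKI_SPAN, -1).each_with_index.map do |chunk, index|
    index.odd? ? chunk : yield(chunk)
  end.join
end

example = "replace me<nowiki>but not me</nowiki>replace me too<NOWIKI>but not me either</nowiki>and me"
puts wiki_subst_split(example) { |s| s.upcase }
# => REPLACE ME<nowiki>but not me</nowiki>REPLACE ME TOO<NOWIKI>but not me either</nowiki>AND ME
```

In the parser from the question, the block would be where a chunk gets run through the existing markup rules.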

can you define a block inline with ruby?

Is it possible to define a block in an inline statement with ruby? Something like this:
tasks.collect(&:title).to_block{|arr| "#{arr.slice(0, arr.length - 1).join(", ")} and #{arr.last}" }
Instead of this:
titles = tasks.collect(&:title)
"#{titles.slice(0, titles.length - 1).join(", ")} and #{titles.last}"
If you said tasks.collect(&:title).slice(0, this.length-1) how can you make 'this' refer to the full array that was passed to slice()?
Basically I'm just looking for a way to pass the object returned from one statement into another one, not necessarily iterating over it.
You're kind of confusing passing a return value to a method/function and calling a method on the returned value. The way to do what you described is this:
lambda {|arr| "#{arr.slice(0, arr.length - 1).join(", ")} and #{arr.last}"}.call(tasks.collect(&:title))
If you want to do it the way you were attempting, the closest match is instance_eval, which lets you run a block within the context of an object. So that would be:
tasks.collect(&:title).instance_eval {"#{slice(0, length - 1).join(", ")} and #{last}"}
However, I would not do either of those, as it's longer and less readable than the alternative.
I'm not sure exactly what you're trying to do, but:
If you said tasks.collect(&:title).slice(0, this.length-1) how can you make 'this' refer to the full array that was passed to slice()?
Use a negative number:
tasks.collect(&:title)[0..-2]
Also, in:
"#{titles.slice(0, titles.length - 1).join(", ")} and #{titles.last}"
you've got something weird going on with your quotes, I think.
I don't really understand why you would want to, but you could add a function to the ruby classes that takes a block, and passes itself as a parameter...
class Object
  def to_block
    yield self
  end
end
At this point you would be able to call:
tasks.collect(&:title).to_block{|it| it.slice(0, it.length-1)}
Of course, modifying the Object class should not be taken lightly as there can be serious consequences when combining with other libraries.
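Note that on newer Rubies you may not need the monkeypatch at all: Ruby 2.6 and later ship Object#then (an alias of yield_self, which was added in 2.5), and it passes the receiver to the block in the same way. A small sketch, where the Task struct and the sample titles are made up to stand in for the question's tasks:

```ruby
Task = Struct.new(:title)  # hypothetical stand-in for the question's task objects
tasks = [Task.new("wash"), Task.new("dry"), Task.new("fold")]

# `then` yields the array to the block and returns the block's result.
summary = tasks.collect(&:title).then { |arr| "#{arr[0..-2].join(', ')} and #{arr.last}" }
puts summary  # => wash, dry and fold
```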
Although there are many good answers here, perhaps you're looking for something more like this in terms of an objective:
class Array
  def andjoin(separator = ', ', word = ' and ')
    case (length)
    when 0
      ''
    when 1
      last.to_s
    when 2
      join(word)
    else
      slice(0, length - 1).join(separator) + word + last.to_s
    end
  end
end
puts %w[ think feel enjoy ].andjoin # => "think, feel and enjoy"
puts %w[ mitchell webb ].andjoin # => "mitchell and webb"
puts %w[ yes ].andjoin # => "yes"
puts %w[ happy fun monkeypatch ].andjoin(', ', ', and ') # => "happy, fun, and monkeypatch"
