Converting an array of morphemes to a sentence in Ruby

I want to convert an array of morphemes produced by a PTB-style tokenizer:
["The", "house", "is", "n't", "on", "fire", "."]
To a sentence:
"The house isn't on fire."
What is a sensible way to accomplish this?

If we take @sawa's advice on the apostrophe and make your array this:
["The", "house", "isn't", "on", "fire", "."]
You can get what you're looking for (with punctuation support!) with this:
def sentence(array)
  str = ""
  array.each_with_index do |w, i|
    case w
    when '.', '!', '?' # Sentence enders; also insert a space if more words follow.
      str << w
      str << ' ' unless i == array.length - 1
    when ',', ';' # Inline separators
      str << w
      str << ' '
    when '--' # Dash
      str << ' -- '
    else # It's a word
      str << ' ' unless str[-1] == ' ' || str.length == 0
      str << w
    end
  end
  str
end
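If you would rather keep the tokenizer output as it is, a minimal sketch along these lines may work. It assumes that clitics such as "n't" and "'s" and closing punctuation always attach to the preceding token; the method name detokenize and the clitic list are illustrative assumptions, not part of any tokenizer library.

```ruby
# Hypothetical helper: rebuilds a sentence from PTB-style tokens.
def detokenize(tokens)
  # Assumed list of clitics that attach to the previous word.
  clitic = /\A(?:n't|'(?:s|m|d|ll|re|ve))\z/i
  # Punctuation that attaches to the previous word.
  closing_punctuation = /\A[.,;:!?]\z/

  tokens.each_with_object(+"") do |token, sentence|
    attach = token.match?(clitic) || token.match?(closing_punctuation)
    # Insert a space before ordinary words, except at the very start.
    sentence << " " unless sentence.empty? || attach
    sentence << token
  end
end

puts detokenize(["The", "house", "is", "n't", "on", "fire", "."])
```

Opening quotes and brackets would need their own rule (attach to the following token), which this sketch leaves out.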

Related

How to do string slicing in Ruby

This is a Pig Latin translate practice in Ruby.
Why am I getting different results from these two versions of code? In other words, why is word = word[i..-1] not taking effect in the second code block?
def translate(input)
  output_array = input.split(" ").each do |word|
    i=0
    while !['a', 'e', 'i', 'o', 'u'].include?(word[i])
      i += 1
    end
    unless i == 0
      word << word[0..i-1]
      word[0..i-1] = ''
    end
    word << "ay"
  end
  return output_array.join(" ")
end
puts translate('apple')
puts translate('banana')
puts translate('trash')
puts translate('eat pie')
which outputs:
appleay
ananabay
ashtray
eatay iepay
And:
def translate(input)
  output_array = input.split(" ").each do |word|
    i=0
    while !['a', 'e', 'i', 'o', 'u'].include?(word[i])
      i += 1
    end
    unless i == 0
      word << word[0..i-1]
      word = word[i..-1]
    end
    word << "ay"
  end
  return output_array.join(" ")
end
puts translate('apple')
puts translate('banana')
puts translate('trash')
puts translate('eat pie')
prints out:
appleay
bananab
trashtr
eatay piep

output_array = input.split(" ").each do |word|
  i=0
  while !['a', 'e', 'i', 'o', 'u'].include?(word[i])
    i += 1
  end
  unless i == 0
    word << word[0..i-1] # Good
    word = word[i..-1] # Bad
  end
  word << "ay"
end
The line
word << word[0..i-1]
changes the string in place, while
word = word[i..-1]
creates a new string and assigns the new string to word. Changing the new string does not affect the old string in the array, so the words in the array stay what they were after
word << word[0..i-1]
Either make every modification in place (as you did in your first version), or use Array#map, which is more idiomatic Ruby.
This is off-topic, but your while loop can be replaced by
i = word.index(/[aeiou]/)
if you happen to know regular expressions.
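Putting the two suggestions together, a sketch of the Array#map version might look like this. The fallback to index 0 for words without a vowel is my own assumption; the original while loop never terminates for such words.

```ruby
# Sketch: build new strings with map instead of mutating inside each.
def translate(input)
  input.split(" ").map do |word|
    # Index of the first vowel; fall back to 0 when there is none (assumption).
    i = word.index(/[aeiou]/) || 0
    # Everything from the first vowel on, then the leading consonants, then "ay".
    word[i..-1] + word[0...i] + "ay"
  end.join(" ")
end

puts translate('apple')
puts translate('banana')
puts translate('trash')
puts translate('eat pie')
```

Because map collects the value of the block, it does not matter whether word is mutated or rebound inside it.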

Use single quote in string inspection

I have the following program:
args = ["a", "b"]
cmd_args = args.map{|x| x.inspect}
str = cmd_args.join(' ')
puts str
The output is:
"a" "b"
I expect the output to be like the following (sub-string quoted with ' instead of "):
'a' 'b'
I don't want to do a gsub after string inspect because, in my real system, substring might contain ". For example:
args = ['a"c', "b"]
cmd_args = args.map{|x| x.inspect.gsub('"', '\'')}
str = cmd_args.join(' ')
puts str
will output:
'a\'c' 'b'
The " between a and c is wrongly replaced. My expected output is:
'a"c' 'b'
How can I make String#inspect quote strings with ' instead of "?

s = 'a"c'.inspect
s[0] = s[-1] = "'"
puts s.gsub("\\\"", "\"") #=> 'a"c'

You can't force String#inspect to use a single quote without rewriting or overwriting it.
Instead of x.inspect, you could substitute "'#{x}'", but then you would have to make sure you escape any ' characters that appear in x.
Here it is, working:
args = ["a", "b"]
cmd_args = args.map{|x| "'#{x}'" }
str = cmd_args.join(' ')
puts str
The output is:
'a' 'b'
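As a rough sketch of the escaping step mentioned above, assuming backslash-escaping is the convention you want inside single quotes (the helper name single_quote is hypothetical, not part of Ruby):

```ruby
# Hypothetical helper: wraps a string in single quotes, backslash-escaping
# any single quote or backslash inside it. Double quotes are left untouched.
def single_quote(str)
  # The block form of gsub avoids the special meaning of \' in replacement strings.
  "'" + str.gsub(/['\\]/) { |char| "\\#{char}" } + "'"
end

args = ['a"c', "b", "it's"]
puts args.map { |x| single_quote(x) }.join(' ')
```

If the result is meant for a POSIX shell rather than for display, the escaping rules differ, so treat this only as an illustration of the idea.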

Split a string into a single element array in ruby

So I wrote this code:
def translate_word word
  vowel = ["a", "e", "i", "o", "u"]
  if vowel.include? word[0]
    word = word + "ay"
  elsif vowel.include? word[1]
    word = word[1..-1] + word[0] + "ay"
  else
    word = word[2..-1] + word[0..1] + "ay"
  end
end
This translates a word into Pig Latin. For my purposes, it works great. But what if we want to translate more than one word?
def translate string
  vowel = ["a", "e", "i", "o", "u"]
  words = string.split(" ")
  words.each do |word|
    if vowel.include? word[0]
      word = word + "ay"
    elsif vowel.include? word[1]
      word = word[1..-1] + word[0] + "ay"
    else
      word = word[2..-1] + word[0..1] + "ay"
    end
  end
  words.join(" ")
end
Except that if we try to do this with one word, it'll notice there aren't any spaces, say screw that, and return a string. It won't even throw an error when I try to .each it, but the .each won't do anything.
puts "apple".split
#=>apple
puts translate "apple"
#=>apple
This isn't an insurmountable problem. I could just run string.include? " " and then run one of two slightly different programs depending on whether a space was there or not. But this seems very inelegant. What would be a better or more idiomatic way to deal with the string and the loop?

Assigning another value to the block argument doesn't change the array element:
words.each do |word|
  word = word + "ay" # <- this doesn't work as expected
end
To change the element, you have to call a method that changes the receiver, e.g.:
words.each do |word|
  word << "ay"
end
However, instead of repeating the algorithm, you could just call translate_word for each word:
def translate(string)
  string.split.map { |word| translate_word(word) }.join(" ")
end
translate("apple orange")
#=> "appleay orangeay"
I've used split and join here, but you could also use gsub:
def translate(string)
  string.gsub(/\w+/) { |word| translate_word(word) }
end
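The difference between rebinding the block argument and mutating the receiver can be checked with a small, self-contained experiment. The variable names here are just for illustration.

```ruby
words = %w[apple pie].map(&:dup)

# Rebinding the block variable: the array still holds the old strings.
words.each { |word| word = word + "ay" }
after_rebinding = words.map(&:dup)

# Mutating the receiver in place: the array elements themselves change.
words.each { |word| word << "ay" }
after_mutation = words.map(&:dup)

# map builds a new array from the block's return values.
mapped = %w[apple pie].map { |word| word + "ay" }

p after_rebinding
p after_mutation
p mapped
```

The snapshots are taken with map(&:dup) rather than Array#dup because Array#dup is shallow: the copy would share the same string objects and show the later mutation too.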

As far as I can see you're not manipulating your original array words.
You would need something like this:
def translate string
  vowel = ["a", "e", "i", "o", "u"]
  words = string.split(" ")
  words.each_with_index do |word, index|
    if vowel.include? word[0]
      word = word + "ay"
    elsif vowel.include? word[1]
      word = word[1..-1] + word[0] + "ay"
    else
      word = word[2..-1] + word[0..1] + "ay"
    end
    words[index] = word
  end
  words.join(" ")
end

Replacing a char in Ruby with another character

I'm trying to replace all spaces in a string with '%20', but it's not producing the result I want.
I'm splitting the string, then going through each character. If the character is " " I want to replace it with '%20', but for some reason it is not being replaced. What am I doing wrong?
def twenty(string)
  letters = string.split("")
  letters.each do |char|
    if char == " "
      char = '%20'
    end
  end
  letters.join
end
p twenty("Hello world is so played out")

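Use URI.escape(...) for proper URI encoding (note that URI.escape was long deprecated and was removed in Ruby 3.0, so this only works on older Rubies):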
require 'uri'
URI.escape('a b c') # => "a%20b%20c"
Or, if you want to roll your own as a fun learning exercise, here's my solution:
def uri_escape(str, encode=/\W/)
  str.gsub(encode) { |c| '%' + c.ord.to_s(16) }
end
uri_escape('a  b!c') # => "a%20%20b%21c"
Finally, to answer your specific question, your snippet doesn't behave as expected because the each iterator does not mutate the target; try using map with assignment (or map!) instead:
def twenty(string)
  letters = string.split('')
  letters.map! { |c| (c == ' ') ? '%20' : c }
  letters.join
end
twenty('a b c') # => "a%20b%20c"
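If the goal is real percent-encoding rather than the exercise, here is a hedged sketch. ERB::Util.url_encode is in the standard library and encodes a space as %20. The percent_encode method is a hypothetical hand-rolled variant that works byte by byte and zero-pads each byte to two hex digits, which a plain to_s(16) does not do for characters such as a newline.

```ruby
require 'erb'

# Standard-library option: spaces become %20.
puts ERB::Util.url_encode('a b c')

# Hypothetical hand-rolled variant: encodes every byte of every character
# outside the unreserved set (letters, digits, "-", ".", "_", "~").
def percent_encode(str)
  str.gsub(/[^A-Za-z0-9\-._~]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

puts percent_encode('a b!c')
```

Going through bytes rather than ord also keeps multi-byte UTF-8 characters valid, since each byte gets its own %XX group.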

If you want to first split the string on spaces, you could do this:
def twenty(string)
  string.split(' ').join('%20')
end
p twenty("Hello world is so played out")
#=> "Hello%20world%20is%20so%20played%20out"
Note that this is not the same as
def twenty_with_gsub(string)
  string.gsub(' ', '%20')
end
for if
string = 'hi                there'
then
twenty(string)
#=> "hi%20there"
twenty_with_gsub(string)
#=> "hi%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20there"

Pig-Latin method translation

I'm trying to write a method in Ruby that translates a string into Pig Latin. The rules:
Rule 1: If a word begins with a vowel sound, add an "ay" sound to the end of the word.
Rule 2: If a word begins with a consonant sound, move it to the end of the word, and then add an "ay" sound to the end of the word. When the word begins with two consonants, move both to the end of the word and add "ay".
As a newbie, my problem is the second rule: it works when the word begins with only one consonant, but I can't get it to work for more than one. Can somebody look at the code and let me know how I could write it differently and what my mistake is? The code probably needs refactoring. Thanks. So far I have come up with this code:
def translate (str)
str1="aeiou"
str2=(/\A[aeiou]/)
vowel = str1.scan(/\w/)
alpha =('a'..'z').to_a
con = (alpha - vowel).join
word = str.scan(/\w/)
if #first rule
str =~ str2
str + "ay"
elsif # second rule
str != str2
s = str.slice!(/^./)
str + s + "ay"
elsif
word[0.1]=~(/\A[con]/)
s = str.slice!(/^../)
str + s + "ay"
else
word[0..2]=~(/\A[con]/)
s = str.slice!(/^.../)
str + s + "ay"
end
end
translate("apple") should == "appleay"
translate("cherry") should == "errychay"
translate("three") should == "eethray"

No need for all those fancy regexes. Keep it simple.
def translate str
  alpha = ('a'..'z').to_a
  vowels = %w[a e i o u]
  consonants = alpha - vowels
  if vowels.include?(str[0])
    str + 'ay'
  elsif consonants.include?(str[0]) && consonants.include?(str[1])
    str[2..-1] + str[0..1] + 'ay'
  elsif consonants.include?(str[0])
    str[1..-1] + str[0] + 'ay'
  else
    str # return unchanged
  end
end
translate 'apple' # => "appleay"
translate 'cherry' # => "errychay"
translate 'dog' # => "ogday"

This will handle multiple words, punctuation, and words like 'queer' = 'eerquay' and 'school' = 'oolschay'.
def translate (sent)
  vowels = %w{a e i o u}
  sent.gsub(/(\A|\s)\w+/) do |str|
    str.strip!
    while not vowels.include? str[0] or (str[0] == 'u' and str[-1] == 'q')
      str += str[0]
      str = str[1..-1]
    end
    str = ' ' + str + 'ay'
  end.strip
end

Okay, this is an epic Pig Latin translator that I'm sure could use a bit of refactoring, but it passes the tests:
def translate(sent)
  sent = sent.downcase
  vowels = ['a', 'e', 'i', 'o', 'u']
  words = sent.split(' ')
  result = []
  words.each_with_index do |word, i|
    translation = ''
    qu = false
    if vowels.include? word[0]
      translation = word + 'ay'
      result.push(translation)
    else
      word = word.split('')
      count = 0
      word.each_with_index do |char, index|
        if vowels.include? char
          # handle words that start with 'qu'
          if char == 'u' and translation[-1] == 'q'
            qu = true
            translation = words[i][count + 1..words[i].length] + translation + 'uay'
            result.push(translation)
            next
          end
          break
        else
          # handle words with 'qu' in middle
          if char == 'q' and word[i+1] == 'u'
            qu = true
            translation = words[i][count + 2..words[i].length] + 'quay'
            result.push(translation)
            next
          else
            translation += char
          end
          count += 1
        end
      end
      # translation of consonant words without qu
      if not qu
        translation = words[i][count..words[i].length] + translation + 'ay'
        result.push(translation)
      end
    end
  end
  result.join(' ')
end
So this will give the following:
puts translate('apple') # "appleay"
puts translate("quiet") # "ietquay"
puts translate("square") # "aresquay"
puts translate("the quick brown fox") # "ethay ickquay ownbray oxfay"

def translate(sentence)
  sentence.split(" ").map do |word|
    word = word.gsub("qu", " ")
    word.gsub!(/^([^aeiou]*)(.*)/,'\2\1ay')
    word = word.gsub(" ", "qu")
  end.join(" ") # join the translated words back into a single string
end
That was fun! I don't like the hack for qu, but I couldn't find a nice way to do that.
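One possible way around the placeholder hack, offered as a sketch rather than as the answerer's method, is to let the regex itself treat "qu" as part of the leading consonant cluster. The name pig_latin is hypothetical, and the sketch assumes lowercase words separated by single spaces.

```ruby
# Sketch: "qu" is tried before the single-consonant class, so the "u" in
# "qu" moves to the back together with the "q".
def pig_latin(sentence)
  sentence.split(" ").map do |word|
    # \1 = leading run of consonants and "qu" pairs, \2 = the rest of the word.
    word.sub(/\A((?:qu|[^aeiou])*)(.*)\z/, '\2\1ay')
  end.join(" ")
end

puts pig_latin("the quick brown fox")
puts pig_latin("square")
```

Because the alternation lists qu first, "square" is split as "squ" + "are" rather than stopping at the "u".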

In this Pig Latin version I deliberately skip the words and/an/in and single-letter words such as a and I. I know that wasn't the main question, and you can just leave that logic out if it doesn't fit your use case. It also handles triple consonants; if you only want to move one or two consonants, change the quantifier in the expression from {1,3} to {1,2}.
All Pig Latin variants are similar, so just alter it for your use case. This is a good opportunity to use MatchData objects. Also, vowel?(first_letter=word[0].downcase) is a style choice made to be more literate, so I don't have to remember that word[0] is the first letter.
My answer is originally based on Sergio Tulentsev's answer in this thread.
def to_pig_latin(sentence)
  sentence.gsub('.','').split(' ').collect do |word|
    translate word
  end.compact.join(' ')
end
def translate(word)
  if word.length > 1
    if word == 'and' || word == 'an' || word == 'in'
      word
    elsif capture = consonant_expression.match(word)
      capture.post_match.to_s + capture.to_s + 'ay'
    elsif vowel?(first_letter=word[0].downcase)
      word + 'ay'
    elsif vowel?(last_letter=word[-1].downcase)
      move_last_letter(word) + 'ay'
    end
  else
    word
  end
end
# Move last letter to beginning of word
def move_last_letter(word)
  word[-1] + word[0..-2]
end
private
def consonant_expression
  # at the beginning of a String
  # capture anything not a vowel (consonants)
  # capture 1, 2 or 3 occurrences
  # ignore case and whitespace
  /^ [^aeiou] {1,3}/ix
end
def vowel?(letter)
  vowels.include?(letter)
end
def vowels
  %w[a e i o u]
end
Also just for the heck of it I'll include my dump from a pry session so you all can see how to use MatchData. MINSWAN. It's stuff like this that makes ruby great.
pry > def consonant_expression
pry * /^ [^aeiou] {1,3}/ix
pry * end
=> :consonant_expression
pry > consonant_expression.match('Stream')
=> #<MatchData "Str">
pry > capture = _
=> #<MatchData "Str">
pry > ls capture
MatchData#methods:
== begin end hash length offset pre_match regexp string to_s
[] captures eql? inspect names post_match pretty_print size to_a values_at
pry >
pry > capture.post_match
=> "eam"
pry > capture
=> #<MatchData "Str">
pry > capture.to_s
=> "Str"
pry > capture.post_match.to_s
=> "eam"
pry > capture.post_match.to_s + capture.to_s + 'ay'
=> "eamStray"
pry >

If I understood your question correctly, you can just directly check if a character is a vowel or consonant and then use array ranges to get the part of the string you want.
def translate(str)
  vowels = ['a', 'e', 'i', 'o', 'u']
  consonants = ('a'..'z').to_a - vowels
  return str + "ay" if vowels.include?(str[0])
  if consonants.include?(str[0])
    return str[2..-1] + str[0..1] + "ay" if consonants.include?(str[1])
    return str[1..-1] + str[0] + "ay"
  end
  str
end

Here's a solution that handles the "qu" phoneme as well as other irregular characters. Had a little trouble putting the individual words back into a string with the proper spacing. Would appreciate any feedback!
def translate(str)
  vowels = ["a", "e", "i", "o", "u"]
  new_word = ""
  str.split.each do |word|
    vowel_idx = 0
    if vowels.include? word[0]
      vowel_idx = 0
    elsif word.include? "qu"
      until word[vowel_idx-2]+word[vowel_idx-1] == "qu"
        vowel_idx += 1
      end
    else
      until vowels.include? word[vowel_idx]
        vowel_idx += 1
      end
    end
    idx_right = vowel_idx
    while idx_right < word.length
      new_word += word[idx_right]
      idx_right += 1
    end
    idx_left = 0
    while idx_left < vowel_idx
      new_word += word[idx_left]
      idx_left += 1
    end
    new_word += "ay "
  end
  new_word.chomp(" ")
end

I done gone did one too
def translate(string)
  vowels = %w{a e i o u}
  phrase = string.split(" ")
  phrase.map! do |word|
    letters = word.split("")
    find_vowel = letters.index do |letter|
      vowels.include?(letter)
    end
    #turn "square" into "aresquay", but leave a leading "u" (as in "under") alone
    if find_vowel > 0 && letters[find_vowel - 1] == "q" && letters[find_vowel] == "u"
      find_vowel += 1
    end
    letters.rotate!(find_vowel)
    letters.push("ay")
    letters.join
  end
  return phrase.join(" ")
end

def piglatinize(word)
  vowels = %w{a e i o u}
  word.each_char do |chr|
    index = word.index(chr)
    if index != 0 && vowels.include?(chr.downcase)
      consonants = word.slice!(0..index-1)
      return word + consonants + "ay"
    elsif index == 0 && vowels.include?(chr.downcase)
      return word + "ay"
    end
  end
end
def to_pig_latin(sentence)
  sentence.split(" ").collect { |word| piglatinize(word) }.join(" ")
end

This seems to handle all that I've thrown at it including the 'qu' phoneme rule...
def translate str
  letters = ('a'..'z').to_a
  vowels = %w[a e i o u]
  consonants = letters - vowels
  str2 = str.gsub(/\w+/) do|word|
    if vowels.include?(word.downcase[0])
      word+'ay'
    elsif (word.include? 'qu')
      idx = word.index(/[aeio]/)
      word = word[idx, word.length-idx] + word[0,idx]+ 'ay'
    else
      idx = word.index(/[aeiou]/)
      word = word[idx, word.length-idx] + word[0,idx]+'ay'
    end
  end
end
I grab the words containing the 'qu' phoneme and, for those, look for the first vowel other than 'u'.
Then I split the word at the index of the first vowel (or the first vowel other than 'u' in the 'qu' cases), move the part before that index to the back of the word, and add 'ay' ftw.

Many of the examples here are fairly long. Here's some relatively short code I came up with. It handles all cases including the "qu" problem! Feedback always appreciated (I'm pretty new to coding).
$vowels = "aeiou"
#First, I define a method that handles a word starting with a consonant
def consonant(s)
  n = 0
  while n < s.length
    if $vowels.include?(s[n]) && s[n-1..n] != "qu"
      return "#{s[n..-1]}#{s[0..n-1]}ay"
    else
      n += 1
    end
  end
end
#Then, I write the main translate method that decides how to approach the word.
def translate(s)
  s.split.map{ |s| $vowels.include?(s[0]) ? "#{s}ay" : consonant(s) }.join(" ")
end
