I have a string that I want to split into an array, reverse each word in it (without reversing the order of the words in the sentence), then join the words back together and output the result.
For instance, I want to change "Hello there, and how are you?" to "olleH ,ereht dna woh era ?uoy"
This is the string:
sentence1="Hello there, and how are you?"
And this is my code, in which I have to incorporate .each (I know it is wrong, but I don't know how to fix it):
def reverse_each_word(sentence1)
  split_array = sentence1.split
  reversed_array = split_array.reverse
  reversed_array.each do |joined_array|
    joined_array.join(' ')
  end
end
and as mentioned, the desired result has to be:
"olleH ,ereht dna woh era ?uoy"
You're calling join on a string: you're iterating over each element in reversed_array, and every one of those elements is a String object:
p sentence1.split.first.join(' ')
# undefined method `join' for "Hello":String (NoMethodError)
It can work if you store the value from each iteration somewhere. That can be a variable declared outside the iteration or, better, the result of map. After that, you just reverse each string and then join everything:
def reverse_each_word(sentence1)
  sentence1.split.map do |joined_array|
    joined_array.reverse
  end.join(' ')
end
p reverse_each_word(sentence1) # "olleH ,ereht dna woh era ?uoy"
Notice this can be written as sentence1.split.map(&:reverse).join(' ') too.
If you need each to solve this problem, you'll need a variable in which to store each "modified" string while you iterate over the elements:
memo = ''
sentence1.split.each { |joined_array| memo << "#{joined_array.reverse} " }
p memo.rstrip # "olleH ,ereht dna woh era ?uoy"
Here memo starts as an empty string whose only purpose is to be filled with the reversed strings: you reverse each string and append it followed by a space. The last string therefore ends with an extra space, so rstrip is used to remove it.
For collect you can use the map approach, because they're aliases.
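As a quick check of the alias claim, here is a minimal sketch using the sentence from the question. The variable names with_map and with_collect are only illustrative.

```ruby
sentence = "Hello there, and how are you?"

# map and collect are two names for the same Array method.
with_map     = sentence.split.map(&:reverse).join(' ')
with_collect = sentence.split.collect(&:reverse).join(' ')

p with_map                  # "olleH ,ereht dna woh era ?uoy"
p with_map == with_collect  # true
```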
I would be inclined to use String#gsub with a regular expression.
str = "Hello there, and how are you?"
str.gsub(/\S+/) { |s| s.reverse }
#=> "olleH ,ereht dna woh era ?uoy"
The regular expression reads, "match one or more characters other than whitespace characters".
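One practical difference from the split/join approach, sketched below with a made-up input: gsub leaves the original whitespace untouched, while split followed by join(' ') collapses every run of whitespace into a single space.

```ruby
str = "Hello   there,\tand how"   # irregular spacing (hypothetical input)

via_split = str.split.map(&:reverse).join(' ')
via_gsub  = str.gsub(/\S+/) { |s| s.reverse }

p via_split  # "olleH ,ereht dna woh"
p via_gsub   # "olleH   ,ereht\tdna woh"
```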
Related
I was recently asked this in an interview and have been trying to figure out a way to do it in Ruby without regex, since I was told it would be a bonus to solve it that way.
Question: Assume that the hash has 1 million key/value pairs and that we have to substitute the variables in the string that are enclosed in % signs, like %var_1%. How could I do this without regex?
We have a string str = "%greet%! Hi there, %var_1% that can be any other %var_2% injected to the %var_3%. Nice!, goodbye)"
we have a hash called dict = { greet: 'Hi there', var_1: 'FIRST VARIABLE', var_2: 'values', var_3: 'string', }
This was my solution:
def template(str, dict)
  # Collect every name that appears between a pair of % signs
  vars = str.scan(/%(.*?)%/).flatten
  vars.each do |var|
    str = str.gsub("%#{var}%", dict[var.to_sym])
  end
  str
end
There are many ways to solve this, but you will probably need some kind of parsing and / or lexical analysis if you don't want to use built-in pattern matching.
Let's keep it very simple and say that your string's content falls into two categories: text and variable which are separated by %, e.g. (you could also think of the variables being enclosed by %, but that's harder to implement)
str = "Hello %name%, hope to see you %when%!"
# TTTTTT VVVV TTTTTTTTTTTTTTTTTT VVVV T
As you can see, the categories are alternating. We can utilize this and write a little helper method that turns a string into a list of [type, value] pairs, something like this:
def each_part(str)
  return enum_for(__method__, str) unless block_given?
  type = [:text, :var].cycle
  buf = ''
  str.each_char do |char|
    if char != '%'
      buf << char
    else
      yield type.next, buf
      buf = ''
    end
  end
  yield type.next, buf
end
It starts by defining an enumerator that will cycle between the two types and an empty buffer. It will then read each_char from the string. If the char is not %, it will just append it to the buffer and keep reading. Once it encounters a %, it will yield the current buffer along with the type and start a new buffer (next will also switch the type). After the loop ends, it will yield once more to output the remaining characters.
It outputs this kind of data:
each_part(str).to_a
#=> [[:text, "Hello "],
# [:var, "name"],
# [:text, ", hope to see you "],
# [:var, "when"],
# [:text, "!"]]
We can use this to convert the string:
dict = { name: 'Tom', when: 'soon' }
output = ''
each_part(str) do |type, value|
  case type
  when :text
    output << value
  when :var
    output << dict[value.to_sym]
  end
end
p output
#=> "Hello Tom, hope to see you soon!"
You could of course combine parsing and evaluation, but I like the separation. A full-fledged parser might involve even more steps.
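For comparison, here is a rough sketch of what combining parsing and evaluation in a single pass might look like. It is not part of the answer above: the method name render is hypothetical, it finds the % signs with String#index (called with a plain string, so no regex is involved), and it assumes every key is present in the hash (fetch raises KeyError otherwise).

```ruby
# Hypothetical helper: substitutes %name% markers in one pass, without regex.
def render(str, dict)
  output = +''
  pos = 0
  while (start = str.index('%', pos))
    stop = str.index('%', start + 1)
    break unless stop                        # unbalanced marker: stop substituting
    output << str[pos...start]               # literal text before the marker
    output << dict.fetch(str[(start + 1)...stop].to_sym).to_s
    pos = stop + 1
  end
  output << str[pos..-1]                     # whatever literal text remains
end

p render("Hello %name%, hope to see you %when%!", { name: 'Tom', when: 'soon' })
# "Hello Tom, hope to see you soon!"
```

The trade-off is the one the answer hints at: this version is shorter, but the scanning logic and the lookup logic can no longer be tested separately.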
A very simple approach:
First, split the string on '%':
str = "%greet%! Hi there, %var_1% that can be any other %var_2% injected to the %var_3%. Nice!, goodbye)"
chunks = str.split('%')
Given the way the problem has been specified, we can now assume that every other "chunk" will be a key to replace. Iterating with the index makes it easy to tell which ones those are.
chunks.each_with_index { |c, i| chunks[i] = (i.even? ? c : dict[c.to_sym]) }.join
Result:
"Hi there! Hi there, FIRST VARIABLE that can be any other values injected to the string. Nice!, goodbye)"
Note: this does not handle malformed input well at all.
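To make that caveat concrete, here is one possible way to harden the split approach. This is only a sketch: the method name fill is hypothetical, unknown keys are left in place as %key%, and an odd number of % signs raises ArgumentError. Other policies would be just as defensible.

```ruby
# Hypothetical, slightly more defensive variant of the split approach.
def fill(str, dict)
  parts = str.split('%', -1)   # -1 keeps trailing empty strings
  # A balanced template has an even number of % signs, hence an odd number of parts.
  raise ArgumentError, "unbalanced '%' in template" if parts.size.even?

  parts.each_with_index.map { |part, i|
    next part if i.even?                       # even positions are plain text
    dict.fetch(part.to_sym) { "%#{part}%" }    # unknown key: keep the marker
  }.join
end

p fill("%greet%! %nope%", { greet: 'Hi there' })
# "Hi there! %nope%"
```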
I want to convert all the (alphabetic) words in a string to abbreviations, the way "i18n" abbreviates "internationalization". In other words, I want to change "extraordinary" into "e11y" because there are 11 characters between the first and the last letter in "extraordinary". It works with a single word in the string, but how can I do the same for a multi-word string? And of course, if a word has 4 characters or fewer there is no point in abbreviating it.
class Abbreviator
  def self.abbreviate(x)
    x.gsub(/\w+/, "#{x[0]}#{(x.length-2)}#{x[-1]}")
  end
end
Test.assert_equals( Abbreviator.abbreviate("banana"), "b4a", Abbreviator.abbreviate("banana") )
Test.assert_equals( Abbreviator.abbreviate("double-barrel"), "d4e-b4l", Abbreviator.abbreviate("double-barrel") )
Test.assert_equals( Abbreviator.abbreviate("You, and I, should speak."), "You, and I, s4d s3k.", Abbreviator.abbreviate("You, and I, should speak.") )
Your mistake is that your second parameter is a substitution string operating on x (the original entire string) as a whole.
Instead of using the form of gsub where the second parameter is a substitution string, use the form of gsub where the second parameter is a block (listed, for example, third on this page). Now you are receiving each substring into your block and can operate on that substring individually.
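The difference between the two forms can be seen on one of the test strings. This is a small sketch, and the result variable names are only illustrative.

```ruby
x = "double-barrel"

# String form: the replacement is built once, from the whole of x,
# and that same string is then used for every match.
string_form = x.gsub(/\w+/, "#{x[0]}#{x.length - 2}#{x[-1]}")

# Block form: the block runs once per match and receives that match.
block_form = x.gsub(/\w+/) { |w| "#{w[0]}#{w.length - 2}#{w[-1]}" }

p string_form  # "d11l-d11l"
p block_form   # "d4e-b4l"
```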
def short_form(str)
  str.gsub(/[[:alpha:]]{4,}/) { |s| "%s%d%s" % [s[0], s.size-2, s[-1]] }
end
The regex reads, "match four or more alphabetic characters".
short_form "abc" # => "abc"
short_form "a-b-c" #=> "a-b-c"
short_form "cats" #=> "c2s"
short_form "two-ponies-c" #=> "two-p4s-c"
short_form "Humpty-Dumpty, who sat on a wall, fell over"
#=> "H4y-D4y, who sat on a w2l, f2l o2r"
I would recommend something along the lines of this:
class Abbreviator
  def self.abbreviate(x)
    x.gsub(/\w+/) do |word|
      # Skip the word unless it's long enough
      next word unless word.length > 4
      # Do the same I18n conversion you did before
      "#{word[0]}#{(word.length-2)}#{word[-1]}"
    end
  end
end
The accepted answer isn't bad, but it can be made a lot simpler by not matching words that are too short in the first place:
def abbreviate(str)
  str.gsub(/([[:alpha:]])([[:alpha:]]{3,})([[:alpha:]])/i) { "#{$1}#{$2.size}#{$3}" }
end
abbreviate("You, and I, should speak.")
# => "You, and I, s4d s3k."
Alternatively, we can use lookbehind and lookahead, which makes the Regexp more complex but the substitution simpler:
def abbreviate(str)
  str.gsub(/(?<=[[:alpha:]])[[:alpha:]]{3,}(?=[[:alpha:]])/i, &:size)
end
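A quick way to convince yourself that the two variants agree is to run both against the test strings from the question. The method names below are hypothetical and exist only so that both versions can live side by side.

```ruby
# Hypothetical names, used only to keep the two variants side by side.
def abbreviate_captures(str)
  str.gsub(/([[:alpha:]])([[:alpha:]]{3,})([[:alpha:]])/i) { "#{$1}#{$2.size}#{$3}" }
end

def abbreviate_lookaround(str)
  # The block returns an Integer; gsub converts it to a string.
  str.gsub(/(?<=[[:alpha:]])[[:alpha:]]{3,}(?=[[:alpha:]])/i, &:size)
end

samples = ["banana", "double-barrel", "You, and I, should speak."]
samples.each do |s|
  p [abbreviate_captures(s), abbreviate_lookaround(s)]
end
# ["b4a", "b4a"]
# ["d4e-b4l", "d4e-b4l"]
# ["You, and I, s4d s3k.", "You, and I, s4d s3k."]
```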
Say that we want to count the number of words in a document. I know we can do the following:
text.each_line(){ |line| totalWords = totalWords + line.split.size }
Say, that I just want to add some exceptions, such that, I don't want to count the following as words:
(1) numbers
(2) standalone letters
(3) email addresses
How can we do that?
Thanks.
You can wrap this up pretty neatly:
total_words = 0
text.each_line do |line|
  total_words += line.split.reject do |word|
    word.match(/\A(\d+|\w|\S*\#\S+\.\S+)\z/)
  end.length
end
Roughly speaking that defines an approximate email address.
Remember Ruby strongly encourages the use of variables with names like total_words and not totalWords.
Assuming you can represent all the exceptions in a single regular expression regex_variable, you could do:
text.each_line(){ |line| totalWords = totalWords + line.split.count {|wrd| wrd !~ regex_variable } }
Your regular expression could look something like:
regex_variable = /\d.|^[a-z]{1}$|\A([^#\s]+)#((?:[-a-z0-9]+\.)+[a-z]{2,})\Z/i
I don't claim to be a regex expert, so you may want to double check that, particularly the email validation part
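If a single large regex feels hard to check, one option is to write each exception as its own small pattern and combine them with Regexp.union. The sketch below is an illustration only: the constant names and count_words are hypothetical, the single-letter rule is my own assumption about what "standalone letters" means, and the address pattern is copied from the first answer (including its '#', which matches the sample addresses used in this thread).

```ruby
NUMBER = /\A\d+\z/
LETTER = /\A[[:alpha:]]\z/      # assumption: exactly one alphabetic character
EMAIL  = /\A\S*\#\S+\.\S+\z/    # rough pattern copied from the first answer
SKIP   = Regexp.union(NUMBER, LETTER, EMAIL)

# Hypothetical helper: counts whitespace-separated tokens that match none of the patterns.
def count_words(text)
  text.each_line.sum { |line| line.split.count { |word| word !~ SKIP } }
end

text = "You can reach my dog, a 10 year old golden, at fido#dogs.org\nI have 2 cats\n"
p count_words(text)  # 11
```

Each pattern can then be tested on its own, which makes it easier to see which rule excluded a given token.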
In addition to the other answers, a little gem hunting came up with this:
WordsCounted Gem
Get the following data from any string or readable file:
Word count
Unique word count
Word density
Character count
Average characters per word
A hash map of words and the number of times they occur
A hash map of words and their lengths
The longest word(s) and its length
The most occurring word(s) and its number of occurrences.
Count individual strings for occurrences.
A flexible way to exclude words (or anything) from the count. You can pass a string, a regexp, an array, or a lambda.
Customisable criteria. Pass your own regexp rules to split strings if you prefer. The default regexp has two features:
Filters special characters but respects hyphens and apostrophes.
Plays nicely with diacritics (UTF and unicode characters): "São Paulo" is treated as ["São", "Paulo"] and not ["S", "", "o", "Paulo"].
Opens and reads files. Pass in a file path or a url instead of a string.
Have you ever started answering a question and found yourself wandering, exploring interesting, but tangential issues, or concepts you didn't fully understand? That's what happened to me here. Perhaps some of the ideas might prove useful in other settings, if not for the problem at hand.
For readability, we might define some helpers in the class String, but to avoid contamination, I'll use Refinements.
Code
module StringHelpers
  refine String do
    def count_words
      remove_punctuation.split.count { |w|
        !(w.is_number? || w.size == 1 || w.is_email_address?) }
    end
    def remove_punctuation
      gsub(/[.!?,;:)](?:\s|$)|(?:^|\s)\(|\-|\n/,' ')
    end
    def is_number?
      self =~ /\A-?\d+(?:\.\d+)?\z/
    end
    def is_email_address?
      include?('#') # for testing only
    end
  end
end
module CountWords
  using StringHelpers
  def self.count_words_in_file(fname)
    IO.foreach(fname).reduce(0) { |t,l| t+l.count_words }
  end
end
Note that using must be in a module (possibly a class). It does not work in main, presumably because that would make the methods available in the class self.class #=> Object, which would defeat the purpose of Refinements. (Readers: please correct me if I'm wrong about the reason using must be in a module.)
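To illustrate the scoping that makes Refinements attractive here, the following minimal sketch shows a refined method being visible inside the module that calls using and invisible elsewhere. The names Shouting, Announcer and shout are hypothetical and unrelated to the helpers above.

```ruby
module Shouting            # hypothetical refinement module
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Announcer           # the refinement is active only inside this module body
  using Shouting

  def self.announce(text)
    text.shout
  end
end

p Announcer.announce("dinner")   # "DINNER!"
p "dinner".respond_to?(:shout)   # false: String itself was not reopened
```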
Example
Let's first informally check that the helpers are working correctly:
module CheckHelpers
using StringHelpers
s = "You can reach my dog, a 10-year-old golden, at fido#dogs.org."
p s = s.remove_punctuation
#=> "You can reach my dog a 10 year old golden at fido#dogs.org."
p words = s.split
#=> ["You", "can", "reach", "my", "dog", "a", "10",
# "year", "old", "golden", "at", "fido#dogs.org."]
p '123'.is_number? #=> 0
p '-123'.is_number? #=> 0
p '1.23'.is_number? #=> 0
p '123.'.is_number? #=> nil
p "fido#dogs.org".is_email_address? #=> true
p "fido(at)dogs.org".is_email_address? #=> false
p s.count_words #=> 9 (`'a'`, `'10'` and "fido#dogs.org" excluded)
s = "My cat, who has 4 lives remaining, is at abbie(at)felines.org."
p s = s.remove_punctuation
p s.count_words
end
All looks OK. Next, I'll put some text in a file:
FName = "pets"
text =<<_
My cat, who has 4 lives remaining, is at abbie(at)felines.org.
You can reach my dog, a 10-year-old golden, at fido#dogs.org.
_
File.write(FName, text)
#=> 125
and confirm the file contents:
File.read(FName)
#=> "My cat, who has 4 lives remaining, is at abbie(at)felines.org.\n
# You can reach my dog, a 10-year-old golden, at fido#dogs.org.\n"
Now, count the words:
CountWords.count_words_in_file(FName)
#=> 18 (9 in each line)
Note that there is at least one problem with the removal of punctuation. It has to do with the hyphen. Any idea what that might be?
Something like...?
def is_countable(word)
  return false if word.size < 2
  return false if word =~ /^[0-9]+$/
  return false if is_an_email_address(word) # you need a gem for this...
  return true
end
# The block must always return the running count, even when the word is skipped
wordCount = text.split().inject(0) {|count,word| is_countable(word) ? count + 1 : count }
Or, since I am jumping to the conclusion that you can just split your entire text into an array with split(), you might need:
wordCount = 0
text.each_line do |line|
  line.split.each{|word| wordCount += 1 if is_countable(word) }
end
In the book I'm reading to learn Rails (RailsSpace) , the author creates two functions (below) to turn all caps city names like LOS ANGELES into Los Angeles. There's something I don't get about the first function, below, however.
Namely, where does "word" come from? I understand that "word" is a local/block variable that disappears after the function has been completed, but what is being passed into/assigned to "word"? In other words, what is being split?
I would have expected there to have been some kind of argument taking an array or hash passed into this function...and then the "each" function run over that..
def capitalize_each
  space = " "
  split(space).each{ |word| word.capitalize! }.join(space)
end
# Capitalize each word in place.
def capitalize_each!
  replace capitalize_each
end
Let's break this up.
split(space)
turns the string into a list of would-be words. (Actually, with a separator other than a single space, two separators in a row would put an empty string in the list; with " " Ruby treats a run of whitespace as one separator. Either way, that doesn't matter for this purpose.) I assume this is an instance method in String; otherwise, split wouldn't be defined.
.each { |word| word.capitalize! }
.each takes each thing in the list (returned by split), and runs the following block on it, passing the thing as an arg to the block. The |word| says that this block is going to call the arg "word". So effectively, what this does is capitalize each word in the string (and each blank string and lonely bit of punctuation too, but again, that's not important -- capitalization doesn't change characters that have no concept of case).
.join(space)
glues the words back together, reinserting the space that was used to separate them before. The string it returns is the return value of the function as well.
At first I thought the method was incomplete because there is no self at the beginning, but even without it split is called on the string itself, since self is the implicit receiver. The separator could also be made a parameter that defaults to a space. This is how the method could look with an explicit self:
class String
  def capitalize_each(separator = ' ')
    self.split(separator).each{|word| word.capitalize!}.join(separator)
  end
end
puts "LOS ANGELES".capitalize_each #=> Los Angeles
puts "LOS_ANGELES".capitalize_each('_') #=> Los_Angeles
The string is being split by spaces, i.e. into words.
So the 'each' iterator goes through all the words, one by one, each time the word is in the 'word' object. So then for that object (word) it uses the capitalize function for it. Finally it all gets joined back together With Spaces. So The End Result is Capitalized.
These methods are meant to be defined in the String class, so what is being split is whatever string you are calling the capitalize_each method on.
Some example usage (and a slightly better implementation):
class String
  def capitalize_each
    split(/\s+/).each{ |word| word.capitalize! }.join " "
  end
  def capitalize_each!
    replace capitalize_each
  end
end
puts "hi, i'm a sentence".capitalize_each #=> Hi, I'm A Sentence
Think of |word| word.capitalize! as a function which you're passing into the each method. The function has one argument (word) and simply evaluates .capitalize! on it.
Now what the each method is doing is taking each item in split(space) and evaluating your function on it. So:
["a", "b", "c"].each{|x| print x}
will evaluate, in order, print "a", print "b", print "c".
http://www.ruby-doc.org/core/classes/Array.html#M000231
To demystify this behavior a bit, it helps to understand exactly what it means to "take each item in __". Basically, any object which is enumerable can be .eached in this way.
If you're referring to how it gets into your block in the first place, it's yielded into the block. #split returns an Array, and its #each method is doing something along the lines of:
for object in stored_objects
  yield object
end
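To see the yielding in action, here is a small, self-contained sketch of a class that defines its own #each in roughly that way. WordList is a hypothetical class made up for this illustration. It is not how Array is actually implemented.

```ruby
class WordList                 # hypothetical container, for illustration only
  include Enumerable           # builds map, select, etc. on top of #each

  def initialize(*words)
    @words = words
  end

  def each
    return enum_for(:each) unless block_given?
    for word in @words
      yield word               # hands one element at a time to the caller's block
    end
    self
  end
end

list = WordList.new("los", "angeles")
list.each { |word| puts word.capitalize }   # prints Los, then Angeles
p list.map(&:capitalize)                    # ["Los", "Angeles"]
```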
This works, but if you want to turn one array into another array, it's idiomatically better to use map instead of each, like this:
words.map{|word|word.capitalize}
(Without the trailing !, capitalize makes a new string instead of modifying the old string, and map collects those new strings into a new array. In contrast, each returns the old array.)
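The difference in return values can be checked directly. This is a minimal sketch, and the variable names are only illustrative.

```ruby
words = %w[LOS ANGELES]

from_each = words.each { |word| word.capitalize }  # block results are discarded
from_map  = words.map  { |word| word.capitalize }  # block results are collected

p from_each                # ["LOS", "ANGELES"]
p from_each.equal?(words)  # true: each hands back the receiver itself
p from_map                 # ["Los", "Angeles"]
```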
Or, following gunn's lead:
class String
  def capitalize_each
    self.split(/\s/).map{|word|word.capitalize}.join(' ')
  end
end
"foo bar baz".capitalize_each #=> "Foo Bar Baz"
By default, split splits on runs of whitespace, but when you pass the regular expression /\s/ it splits at each individual whitespace character, even when several occur in a row.
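A short sketch of that difference, using a made-up string with two spaces in a row:

```ruby
s = "foo  bar"   # two spaces between the words

default_split = s.split        # runs of whitespace act as one separator
regex_split   = s.split(/\s/)  # every whitespace character is a separator

p default_split  # ["foo", "bar"]
p regex_split    # ["foo", "", "bar"]

# Because the empty string survives, joining with ' ' restores the original spacing.
p regex_split.map(&:capitalize).join(' ')  # "Foo  Bar"
```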
So how can I still be able to write beautiful code such as:
'im a string meing!'.pop
Note: str.chop isn't a sufficient answer.
It is not clear what an enumerable string actually enumerates. Is a string a sequence of ...
lines,
characters,
codepoints or
bytes?
The answer is: all of those, any of those, either of those or neither of those, depending on the context. Therefore, you have to tell Ruby which of those you actually want.
There are several methods in the String class which return enumerators for any of the above. If you want the pre-1.9 behavior, your code sample would be
'im a string meing!'.bytes.to_a.pop
This looks kind of ugly, but there is a reason for it: a string is a sequence. You are treating it as a stack. A stack is not a sequence, in fact it pretty much is the opposite of a sequence.
That's not beautiful :)
Also #pop is not part of Enumerable, it's part of Array.
String is not enumerable because there is no 'natural' unit to enumerate: should it be characters or lines? For that reason, String does not have an #each method.
Instead, String provides the #each_char, #each_byte and #each_line methods so that you can iterate in the way you choose.
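As a small illustration of how the three iterators differ, here they are applied to the same two-line string (the string is a made-up example):

```ruby
s = "hi\nyo"

chars = s.each_char.to_a
bytes = s.each_byte.to_a
lines = s.each_line.to_a

p chars  # ["h", "i", "\n", "y", "o"]
p bytes  # [104, 105, 10, 121, 111]
p lines  # ["hi\n", "yo"]

p s.respond_to?(:each)  # false on Ruby 1.9 and later
```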
Since you don't like str[str.length], how about
'im a string meing!'[-1] # returns last character as a character value
or
'im a string meing!'[-1,1] # returns last character as a string
or, if you need it modified in place as well, while keeping it an easy one-liner:
class String
  def pop
    last = self[-1,1]
    self.chop!
    last
  end
end
#!/usr/bin/ruby1.8
s = "I'm a string meing!"
s, last_char = s.rpartition(/./)
p [s, last_char] # => ["I'm a string meing", "!"]
String#rpartition is new in 1.9, but it has been back-ported to 1.8.7. It searches a string for a regular expression, starting at the end and working backwards. It returns the part of the string before the match, the match, and the part of the string after the match (which we discard here).
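To make the three-part return value visible, here is a short sketch. The second call, with a plain string separator, is an extra illustration that is not part of the answer above.

```ruby
parts = "I'm a string meing!".rpartition(/./)
p parts  # ["I'm a string meing", "!", ""]

# Assigning to two variables silently drops the third element.
head, last_char = parts
p [head, last_char]  # ["I'm a string meing", "!"]

# rpartition also accepts a plain string and splits at its last occurrence.
p "a-b-c".rpartition("-")  # ["a-b", "-", "c"]
```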
String#slice! and String#insert are going to get you much closer to what you want without converting your strings to arrays.
For example, to simulate Array#pop you can do:
text = '¡Exclamation!'
mark = text.slice! -1
mark == '!' #=> true
text #=> "¡Exclamation"
Likewise, for Array#shift:
text = "¡Exclamation!"
inverted_mark = text.slice! 0
inverted_mark == '¡' #=> true
text #=> "Exclamation!"
Naturally, to do an Array#push you just use one of the concatenation methods:
text = 'Hello'
text << '!' #=> "Hello!"
text.concat '!' #=> "Hello!!"
To simulate Array#unshift you use String#insert instead; it's a lot like the inverse of slice, really:
text = 'World!'
text.insert 0, 'Hello, ' #=> "Hello, World!"
You can also grab chunks from the middle of a string in multiple ways with slice.
First you can pass a start position and length:
text = 'Something!'
thing = text.slice 4, 5
And you can also pass a Range object to grab absolute positions:
text = 'This is only a test.'
only = text.slice (8..11)
In Ruby 1.9 using String#slice like this is identical to String#[], but if you use the bang method String#slice! it will actually remove the substring you specify.
text = 'This is only a test.'
only = text.slice! (8..12)
text == 'This is a test.' #=> true
Here's a slightly more complex example where we reimplement a simple version of String#gsub! to do a search and replace:
text = 'This is only a test.'
search = 'only'
replace = 'not'
index = text =~ /#{search}/
text.slice! index, search.length
text.insert index, replace
text == 'This is not a test.' #=> true
Of course, 99.999% of the time you're going to want to use the aforementioned String#gsub!, which will do the exact same thing:
text = 'This is only a test.'
text.gsub! 'only', 'not'
text == 'This is not a test.' #=> true