Reverse a string in Ruby - ruby

How do you reverse a string in Ruby? I know about String#reverse. I'm interested in understanding how to write it in pure Ruby, preferably as an in-place solution.

There's already an inplace reverse method, called "reverse!":
$ a = "abc"
$ a.reverse!
$ puts a
cba
If you want to do this manually try this (but it will probably not be multibyte-safe, eg UTF-8), and it will be slower:
class String
def reverse_inplace!
half_length = self.length / 2
half_length.times {|i| self[i], self[-i-1] = self[-i-1], self[i] }
self
end
end
This swaps each element from the beginning with the matching element from the end until both indexes meet at the center (String#[] indexes bytes in Ruby 1.8 and characters in Ruby 1.9 and later):
$ a = "abcd"
$ a.reverse_inplace!
$ puts a
dcba

Just for discussion, with that many alternates, it is good to see if there are major differences in speed/efficiency. I cleaned up the code a bit as the code showing output was repeatedly reversing the outputs.
# encoding: utf-8
require "benchmark"
reverse_proc = Proc.new { |reverse_me| reverse_me.chars.inject([]){|r,c| r.unshift c}.join }
class String
def reverse # !> method redefined; discarding old reverse
each_char.to_a.reverse.join
end
def reverse! # !> method redefined; discarding old reverse!
replace reverse
end
def reverse_inplace!
half_length = self.length / 2
half_length.times {|i| self[i], self[-i-1] = self[-i-1], self[i] }
end
end
def reverse(a)
(0...(a.length/2)).each {|i| a[i], a[a.length-i-1]=a[a.length-i-1], a[i]}
return a
end
def reverse_string(string) # method reverse_string with parameter 'string'
loop = string.length # int loop is equal to the string's length
word = '' # this is what we will use to output the reversed word
while loop > 0 # while loop is greater than 0, subtract loop by 1 and add the string's index of loop to 'word'
loop -= 1 # subtract 1 from loop
word += string[loop] # add the index with the int loop to word
end # end while loop
return word # return the reversed word
end # end the method
lorum = <<EOT
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent quis magna eu
lacus pulvinar vestibulum ut ac ante. Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Suspendisse et pretium orci. Phasellus congue iaculis
sollicitudin. Morbi in sapien mi, eget faucibus ipsum. Praesent pulvinar nibh
vitae sapien congue scelerisque. Aliquam sed aliquet velit. Praesent vulputate
facilisis dolor id ultricies. Phasellus ipsum justo, eleifend vel pretium nec,
pulvinar a justo. Phasellus erat velit, porta sit amet molestie non,
pellentesque a urna. Etiam at arcu lorem, non gravida leo. Suspendisse eu leo
nibh. Mauris ut diam eu lorem fringilla commodo. Aliquam at augue velit, id
viverra nunc.
EOT
And the results:
RUBY_VERSION # => "1.9.2"
name = "Marc-André"; reverse_proc.call(name) # => "érdnA-craM"
name = "Marc-André"; name.reverse! # => "érdnA-craM"
name = "Marc-André"; name.chars.inject([]){|s, c| s.unshift(c)}.join # => "érdnA-craM"
name = "Marc-André"; name.reverse_inplace!; name # => "érdnA-craM"
name = "Marc-André"; reverse(name) # => "érdnA-craM"
name = "Marc-André"; reverse_string(name) # => "érdnA-craM"
n = 5_000
Benchmark.bm(7) do |x|
x.report("1:") { n.times do; reverse_proc.call(lorum); end }
x.report("2:") { n.times do; lorum.reverse!; end }
x.report("3:") { n.times do; lorum.chars.inject([]){|s, c| s.unshift(c)}.join; end }
x.report("4:") { n.times do; lorum.reverse_inplace!; end }
x.report("5:") { n.times do; reverse(lorum); end }
x.report("6:") { n.times do; reverse_string(lorum); end }
end
# >> user system total real
# >> 1: 4.540000 0.000000 4.540000 ( 4.539138)
# >> 2: 2.080000 0.010000 2.090000 ( 2.084456)
# >> 3: 4.530000 0.010000 4.540000 ( 4.532124)
# >> 4: 7.010000 0.000000 7.010000 ( 7.015833)
# >> 5: 5.660000 0.010000 5.670000 ( 5.665812)
# >> 6: 3.990000 0.030000 4.020000 ( 4.021468)
It's interesting to me that the C-style version ("reverse_string()") is the fastest of the versions that avoid the built-in reverse. #2 ("reverse!") is the fastest overall, but it takes advantage of Array#reverse, which is implemented in C.
Edit by Marc-André Lafortune:
Adding an extra test case (7):
def alt_reverse(string)
word = ""
chars = string.each_char.to_a
chars.size.times{word << chars.pop}
word
end
If the string is longer (lorum *= 10, n/=10), we can see that the difference widens because some functions are in O(n^2) while others (mine :-) are O(n):
user system total real
1: 10.500000 0.030000 10.530000 ( 10.524751)
2: 0.960000 0.000000 0.960000 ( 0.954972)
3: 10.630000 0.080000 10.710000 ( 10.721388)
4: 6.210000 0.060000 6.270000 ( 6.277207)
5: 4.210000 0.070000 4.280000 ( 4.268857)
6: 10.470000 3.540000 14.010000 ( 15.012420)
7: 1.600000 0.010000 1.610000 ( 1.601219)

The Ruby equivalent of the builtin reverse could look like:
# encoding: utf-8
class String
def reverse
each_char.to_a.reverse.join
end
def reverse!
replace reverse
end
end
str = "Marc-André"
str.reverse!
str # => "érdnA-craM"
str.reverse # => "Marc-André"
Note: this assumes Ruby 1.9, or else require "backports" and set $KCODE for UTF-8.
For a solution not involving reverse, one could do:
def alt_reverse(string)
word = ""
chars = string.each_char.to_a
chars.size.times{word << chars.pop}
word
end
Note: any solution using [] to access individual letters will be of order O(n^2); to access the 1000th letter, Ruby must go through the first 999 one by one to check for multibyte characters. It is thus important to use an iterator like each_char for a solution in O(n).
Another thing to avoid is to build intermediate values of increasing length; using += instead of << in alt_reverse would also make the solution O(n^2) instead of O(n).
Building an array with unshift will also make the solution O(n^2), because it implies recopying all existing elements one index higher each time one does an unshift.
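The three notes above can be combined into a short sketch. It assumes a hypothetical helper name, two_pointer_reverse, that does not appear in any of the answers: the string is split into an array of characters once, so every index lookup is O(1), and the familiar swap loop then runs over that array rather than over the string itself.

```ruby
# Hypothetical helper (name chosen for illustration only).
# Splits the string into characters once, then swaps from both ends.
def two_pointer_reverse(string)
  chars = string.each_char.to_a   # single O(n) pass; multibyte-safe
  left = 0
  right = chars.size - 1
  while left < right
    # Array indexing is O(1), unlike String#[] on multibyte strings
    chars[left], chars[right] = chars[right], chars[left]
    left += 1
    right -= 1
  end
  chars.join                      # builds the result in one step
end

puts two_pointer_reverse("Marc-André")
```

This returns a new string. Wrapping the call in replace, as the reverse! example above does, would give the in-place behaviour.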

Here's one way to do it with inject and unshift:
"Hello world".chars.inject([]) { |s, c| s.unshift(c) }.join

str = "something"
reverse = ""
str.length.times do |i|
reverse.insert(i, str[-1-i].chr)
end

"abcde".chars.reduce{|s,c| c + s } # => "edcba"

Use
def reverse_string(string) # Method reverse_string with parameter 'string'.
loop = string.length # int loop is equal to the string's length.
word = '' # This is what we will use to output the reversed word.
while loop > 0 # while loop is greater than 0, subtract loop by 1 and add the string's index of loop to 'word'.
loop -= 1 # Subtract 1 from loop.
word += string[loop] # Add the index with the int loop to word.
end # End while loop.
return word # Return the reversed word.
end # End the method.

def reverse(string)
result = ""
idx = string.length - 1
while idx >= 0
result << string [idx]
idx = idx - 1
end
result
end

The solution is described below. There is no need to go beyond half of the array size:
class ReverseString
def initialize(array)
@array = array
@size = @array.size
end
def process
(0...@size/2).each do |i|
@array[i], @array[@size-i-1] = @array[@size-i-1], @array[i]
end
@array
end
end
require 'minitest/autorun'
class ReverseStringTest < Minitest::Unit::TestCase
def test_process
assert_equal "9876543210", ReverseString.new("0123456789").process
end
end

This is the solution that made the most sense to me as a ruby beginner
def reverse(string)
reversed_string = ''
i = 0
while i < string.length
reversed_string = string[i] + reversed_string
i += 1
end
reversed_string
end
p reverse("helter skelter")

Also, using Procs ...
Proc.new {|reverse_me| reverse_me.chars.inject([]){|r,c| r.unshift c}.join}.call("The house is blue")
=> "eulb si esuoh ehT"
Proc.new would be handy here because you could then nest your reversing algorithm (and still keep things on one line). This would be handy if, for instance, you needed to reverse each word in an already-reversed sentence:
# Define your reversing algorithm
reverser = Proc.new{|rev_me| rev_me.chars.inject([]){|r,c| r.unshift c}.join}
# Run it twice - first on the entire sentence, then on each word
reverser.call("The house is blue").split.map {|w| reverser.call(w)}.join(' ')
=> "blue is house The"

Hard to read one-liner,
def reverse(a)
(0...(a.length/2)).each {|i| a[i], a[a.length-i-1]=a[a.length-i-1], a[i]}
return a
end

Consider looking at how Rubinius implements the method. They implement much of the core library in Ruby itself, and I wouldn't be surprised if String#reverse and String#reverse! are implemented in Ruby.

def palindrome(string)
s = string.gsub(/\W+/,'').downcase
t = s.chars.inject([]){|a,b| a.unshift(b)}.join
return true if(s == t)
false
end

If you have the sentence "The greatest victory is that" and you want "that is victory greatest The", you should use this method:
def solution(sentance)
sentance.split.reverse.join(" ")
end
solution("The greatest victory is that")

Here's an alternative using the xor bitwise operations:
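class String
# Byte-wise XOR swap; only safe for single-byte (e.g. ASCII) strings.
def xor_reverse
count = 0
len = bytesize - 1
while count < len
setbyte(count, getbyte(count) ^ getbyte(len))
setbyte(len, getbyte(len) ^ getbyte(count))
setbyte(count, getbyte(count) ^ getbyte(len))
count += 1
len -= 1
end
self
end
end
"foobar".xor_reverse
=> "raboof"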

In Ruby:
name = "Hello World"; reverse_proc.call(name)
name = "Hello World"; name.reverse!
name = "Hello World"; name.chars.inject([]){|s, c| s.unshift(c)}.join
name = "Hello World"; name.reverse_inplace!;
name = "Hello World"; reverse(name)
name = "Hello World"; reverse_string(name)

I believe this would work also
def reverse(str)
string = ''
(0..str.size-1).each do |i|
string << str[str.size - 1 - i]
end
string
end

def reverse(string)
reversed_string = ""
idx = 0
while idx < string.length
reversed_string = string[idx] + reversed_string
idx += 1
end
return reversed_string
end

string = "This is my string"
string_arr = string.split('')
n = string_arr.length
new_arr = Array.new
n.times do |i|
new_arr << string_arr[n - 1 - i]
end
reversed_string = new_arr.flatten.join('')
=> "gnirts ym si sihT"

Here is a simple alternative. It splits the string into an array of characters, takes the array length minus one (because Ruby arrays are indexed from 0), and creates an empty string. It then iterates over the array's indexes, appending the element at position (length - 1 - current index) to that string, and stops once it has reached index zero. Hope this helps.
class String
def rString
arr = self.split("")
len = arr.count - 1
final = ""
arr.each_index do |i|
final += arr[len - i]
end
final
end
end

A simple classic way that needs only n/2 swaps:
str = "Hello World!"
puts str
for i in 0...(str.length / 2)
mid = str.length - 1 - i
temp = str[i]
str[i] = str[mid]
str[mid] = temp
end
puts str

We can use the inject method to make it simple:
def reverse_str(str)
(1..str.length).inject('') {|rev_str, i| rev_str.concat(str[str.length-i])}
end
Note: I used concat instead of += because concat modifies the string in place (the reference stays the same), whereas += creates a new object.
For example:
str = 'sanjay' # object_id: 69080
str.concat('choudhary') # object_id: 69080 (same object as before)
str += 'choudhary' # object_id: 78909 (a new object)
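The difference can be checked directly rather than by reading IDs off the screen. The following is a small sketch, with variable names chosen only for illustration; it uses String.new so the example also works when string literals are frozen.

```ruby
# Compare object identity before and after each operation.
str = String.new("sanjay")        # explicitly mutable string
id_before = str.object_id

str.concat("choudhary")           # mutates the receiver
same_after_concat = (str.object_id == id_before)

str += "!"                        # builds a new String and rebinds str
same_after_plus = (str.object_id == id_before)

puts "concat kept the same object: #{same_after_concat}"
puts "+= kept the same object: #{same_after_plus}"
```

The actual ID values vary from run to run; only the comparison matters.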

Related

Ignore Lorem Ipsum text in a file Ruby

I have a .txt file that has last name, first name on one line and on every other line I have Lorem Ipsum text. I need to detect the Lorem Ipsum in every other line and skip it.
example txt.file
Spade, Kate
Voluptatem ipsam et at.
Vuitton, Louis
Facere et necessitatibus animi.
Bucks, Star
Eveniet temporibus ducimus amet eaque.
Cage, Nicholas
Unde voluptas sit fugit.
Brown, James
Maiores ab officia sed.
expected output:
#Spade, Kate
#Vuitton, Louis
#Bucks, Star
#Cage, Nicholas
#Brown, James
Reading 2 lines and ignoring the second:
File.open("test.txt", "r") do |f|
f.each_slice(2) do |odd, _even|
puts odd
end
end
If you just want to skip every second line you can do something like this:
File.open("text.txt", "r") do |f|
f.each_line.with_index do |line, i|
next unless i.even?
puts line
end
end
#Spade, Kate
#Vuitton, Louis
#Bucks, Star
#Cage, Nicholas
#Brown, James
Now I'm not really good with regexp, but you could also do something like this to process only the lines that are two words, both starting with a capital letter separated by a comma and space (basically first name and last name):
File.open("text.txt", "r") do |f|
f.each_line do |line|
next unless line =~ /[A-Z][a-z]+, [A-Z][a-z]+/
puts line
end
end
#Spade, Kate
#Vuitton, Louis
#Bucks, Star
#Cage, Nicholas
#Brown, James
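One possible refinement, offered as a sketch rather than something the answer itself proposes: the pattern above is unanchored, so it would also accept a longer line that merely contains a "Word, Word" pair. Anchoring it with \A and \z restricts the match to whole lines. The variable names and the sample lines below are assumptions for illustration.

```ruby
# Anchored variant: the whole line must be "Last, First" (plus a newline).
name_line = /\A[A-Z][a-z]+, [A-Z][a-z]+\s*\z/

lines = [
  "Spade, Kate\n",
  "Voluptatem ipsam et at.\n",
  "Vuitton, Louis\n",
  "Then Bucks, Star arrived.\n"   # contains a name pair but is not a name line
]

names = lines.grep(name_line).map(&:chomp)
p names
```

Names with hyphens, apostrophes, or non-ASCII letters would need a broader character class.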
You could also load the full Lorem Ipsum text from a file like this:
lorem = File.open("lorem.txt", "r").map(&:chomp).join(" ")
And then check each line if it's contained in the Lorem Ipsum text:
File.open("text.txt", "r") do |f|
f.each_line do |line|
next if lorem.include?(line[0...-1]) #removing the last character because you seem to have a dot at the end even though in the lorem text there's no dot on these positions.
puts line
end
end
#Spade, Kate
#Vuitton, Louis
#Bucks, Star
#Cage, Nicholas
#Brown, James
Now, depending on what you want to do with the data, you can replace the "puts line" line with something else.
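As one illustration of "something else", the name lines could be turned into structured records. This sketch uses StringIO in place of the real file so that it runs on its own; the sample data and the hash keys :first and :last are assumptions.

```ruby
require "stringio"

# Stand-in for File.open("text.txt"): same each_line interface.
input = StringIO.new(<<~TEXT)
  Spade, Kate
  Voluptatem ipsam et at.
  Vuitton, Louis
  Facere et necessitatibus animi.
TEXT

# Take the lines in pairs, keep the first of each pair, split on ", ".
people = input.each_line.each_slice(2).map do |name_line, _filler|
  last, first = name_line.chomp.split(", ", 2)
  { first: first, last: last }
end

p people
```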
Your description is unclear. If you just want to skip every other line, you can do something like this:
File.foreach("test.txt").with_index(1) do |l, i|
next if i.even?
puts l
end
Let's first create a file.
FName = 'temp.txt'
IO.write(FName,
<<~END
Spade, Kate
Voluptatem ipsam et at.
Vuitton, Louis
Facere et necessitatibus animi.
Bucks, Star
Eveniet temporibus ducimus amet eaque.
Cage, Nicholas
Unde voluptas sit fugit.
Brown, James
Maiores ab officia sed.
END
)
#=> 211
Here's one way to return every other line.
IO.foreach(FName).each_slice(2).map(&:first)
#=> ["Spade, Kate\n", "Vuitton, Louis\n", "Bucks, Star\n",
# "Cage, Nicholas\n", "Brown, James\n"]
See IO::write, IO::foreach, Enumerable#each_slice and Array#map.
Note that foreach, each_slice and map all return enumerators when they are not given a block. We therefore obtain the following:
enum0 = IO.foreach(FName)
#=> #<Enumerator: IO:foreach("temp.txt")>
enum1 = enum0.each_slice(2)
#=> #<Enumerator: #<Enumerator: IO:foreach("temp.txt")>:each_slice(2)>
enum2 = enum1.map
#=> #<Enumerator: #<Enumerator: #<Enumerator: IO:foreach("temp.txt")>
# :each_slice(2)>:map>
enum2.each(&:first)
#=> ["Spade, Kate\n", "Vuitton, Louis\n", "Bucks, Star\n",
# "Cage, Nicholas\n", "Brown, James\n"]
Examine the return values for the calculation of enum1 and enum2. It may be helpful to think of these as compound enumerators.
Two other ways:
enum = [true, false].cycle
#=> #<Enumerator: [true, false]:cycle>
IO.foreach(FName).select { enum.next }
#=> <as above>
keep = false
IO.foreach(FName).select { keep = !keep }
#=> <as above>

Recursive string modification in Ruby [closed]

I have a string. I need to make every other line reversed. If count is 2, it should display:
"Hola mi amigo \nogima im aloH"
If count is 3, it should display:
"Hola mi amigo \nogima im aloH \nHola mi amigo"
and so on. What is the shortest way to perform string modification between every other "\n"?
My hunch is to use regex. In addition to regex solution, is there any non-regex solution? I would like to see both for comparison.
def hola(count)
if COUNT IS LESS THAN 2
return false
elsif ODD
(("Hola mi amigo ")*((count-1)/2) + "\n") * count
elsif EVEN
(("Hola mi amigo ")*(count/2) + "\n") * count
end
end
If there are no restrictions in the implementation, then try something like this:
string = "Hola mi amigo"
reversed_string = string.reverse
count.times.map { |i| i.odd? ? reversed_string : string }.join "\n"
count.times without a block returns an enumerator, and map runs the block once per index. The block checks whether the current index is odd or even: for odd indexes it returns the pre-reversed string, otherwise the original. Finally, join concatenates the resulting strings with newlines between them.
Also, here is another (and probably less efficient) recursive solution:
def hola(count)
return '' if count <= 0
# Odd-numbered lines are the original, even-numbered lines are reversed.
line = count.even? ? "Hola mi amigo".reverse : "Hola mi amigo"
count == 1 ? line : hola(count - 1) + "\n" + line
end
Solution 1
I'm surprised no one else has proposed a solution with Enumerable#cycle (perhaps @CarySwoveland is on vacation):
def hola(str, count)
[ str, str.reverse ].cycle.take(count).join("\n")
end
puts hola("Hola mi amigo", 1)
# => Hola mi amigo
puts hola("Hola mi amigo", 2)
# => Hola mi amigo
# ogima im aloH
puts hola("Hola mi amigo", 3)
# => Hola mi amigo
# ogima im aloH
# Hola mi amigo
puts hola("Hola mi amigo", 8)
# => Hola mi amigo
# ogima im aloH
# Hola mi amigo
# ogima im aloH
# Hola mi amigo
# ogima im aloH
# Hola mi amigo
# ogima im aloH
Solution 2
It occurred to me that perhaps OP was actually looking for a recursive solution, as they wrote in the title:
def hola(str, count)
return "" if count <= 0
str + "\n" + hola(str.reverse, count - 1).chomp
end
Output is the same as above. Obviously it's not as efficient, but otherwise I rather like it.
def alternate(str, count)
(("%s\n%s\n" % [str, str.reverse])*(count/2) << str*(count % 2)).chomp
end
str = "Hola mi amigo"
puts alternate(str, 4)
Hola mi amigo
ogima im aloH
Hola mi amigo
ogima im aloH
puts alternate(str, 5)
Hola mi amigo
ogima im aloH
Hola mi amigo
ogima im aloH
Hola mi amigo
Just out of curiosity:
str = '¡Hola mi amigo!'
([str, str.reverse] * (count / 2 + 1)).take(count).join $/
([str] * count).map.with_index { |s, i| i.even? ? s : s.reverse }.join $/
([str] * count).each.with_index.with_object("") do |(s, i), acc|
acc << (i.even? ? s : s.reverse) << $/
end.strip
To make it more robust (credits to Cary), one might prepare the reversed string in advance.
I'm not experienced with Ruby in particular, but from some light searching, I'd use the .reverse method and for-loops inside your elsif statements. It shouldn't be too hard to put together.
A bit of .reverse documentation can be found here:
http://www.informit.com/articles/article.aspx?p=2314083&seqNum=29

Find common words in sentences with Ruby

I have a task to find words that are in each sentence.
Given a string, we want to divide it into sentences and then determine which words, if any, appear in all of the sentences.
Here is my solution:
# encoding: utf-8
text = ''
File.foreach("lab2.in") do |line|
text += line
end
hash = Hash.new
text = text.gsub(/[\n,]/,'').split(/[!.?]/)
number = 0
text.each do |sen|
number += 1
words = sen.split(/ /)
words.each do |word|
if hash[word]
hash[word] += "#{number}"
else
hash[word] = "#{number}"
end
end
end
flag = false
needle = ''
count = text.length
for i in 1..count
needle += "#{i}"
end
hash.each do |word|
if word[1].squeeze == needle
puts "this word is \"#{word[0]}\""
flag = true
end
end
if !flag
puts "There no such word"
end
How can this task be solved more elegantly? I'm interested in Ruby library methods. I already know the simple solution of looping character by character.
For example, with input like:
lorem ipsum dolor and another lorem! sit amet lorem? and another lorem.
The output will be:
this word is "lorem"
You could do this (I modified your example slightly):
str = "a lorem ipsum lorem dolor sit amet. a tut toje est lorem! a i tuta toje lorem?"
str.split(/[.!?]/).map(&:split).reduce(:&)
#=> ["a", "lorem"]
We have:
d = str.split(/[.!?]/)
#=> ["a lorem ipsum lorem dolor sit amet",
# " a tut toje est lorem",
# " a i tuta toje lorem"]
e = d.map(&:split)
#=> [["a", "lorem", "ipsum", "lorem", "dolor", "sit", "amet"],
# ["a", "tut", "toje", "est", "lorem"],
# ["a", "i", "tuta", "toje", "lorem"]]
e.reduce(:&)
#=> ["a", "lorem"]
To make it case-insensitive, change str.split... to str.downcase.split....
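Putting the pieces together, a case-insensitive version might look like the sketch below. The method name common_words is hypothetical, and the use of scan(/\w+/) to drop commas and other punctuation is an assumption that goes slightly beyond the answer above.

```ruby
# Hypothetical helper: words that occur in every sentence of the text.
def common_words(text)
  sentences = text.split(/[.!?]/).map(&:strip).reject(&:empty?)
  # Each sentence becomes an array of lowercase words; Array#& intersects them.
  sentences.map { |s| s.downcase.scan(/\w+/) }.reduce(:&) || []
end

p common_words("lorem ipsum dolor and another lorem! sit amet lorem? and another lorem.")
```

The trailing `|| []` covers the case of empty input, where reduce returns nil.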

Ruby getting the longest word of a sentence

I'm trying to create a method named longest_word that takes a sentence as an argument and returns the longest word of the sentence.
My code is:
def longest_word(str)
words = str.split(' ')
longest_str = []
return longest_str.max
end
The shortest way is to use Enumerable's max_by:
def longest(string)
string.split(" ").max_by(&:length)
end
Using regexp will allow you to take into consideration punctuation marks.
s = "lorem ipsum, loremmm ipsummm? loremm ipsumm...."
first longest word:
s.split(/[^\w]+/).max_by(&:length)
# => "loremmm"
# or using scan
s.scan(/\b\w+\b/).max_by(&:length)
# => "loremmm"
Also you may be interested in getting all longest words:
s.scan(/\b\w+\b/).group_by(&:length).sort.last.last
# => ["loremmm", "ipsummm"]
It depends on how you want to split the string. If you are happy with using a single space, then this works:
def longest(source)
arr = source.split(" ")
arr.sort! { |a, b| b.length <=> a.length }
arr[0]
end
Otherwise, use a regular expression to catch whitespace and punctuation.
def longest_word(sentence)
longest_word = ""
words = sentence.split(" ")
words.each do |word|
longest_word = word unless word.length < longest_word.length
end
longest_word
end
That's a simple way to approach it. You could also strip the punctuation using a gsub method.
Functional Style Version
str.split(' ').reduce { |r, w| w.length > r.length ? w : r }
Another solution using max
str.split(' ').max { |a, b| a.length <=> b.length }
sort_by! and reverse!
def longest_word(sentence)
longw = sentence.split(" ")
longw.sort_by!(&:length).reverse!
p longw[0]
end
longest_word("once upon a time long ago a very longword")
If you truly want to do it in the Ruby way it would be:
def longest(sentence)
sentence.split(' ').sort! { |a, b| b.length <=> a.length }[0]
end
This is to strip the word from the extra chars
sen.gsub(/[^0-9a-z ]/i, '').split(" ").max_by(&:length)
Find Longest word in a string
sentence = "Hi, my name is Mesut. There is longestword here!"
def longest_word(string)
long = ""
string.split(" ").each do |sent|
if sent.length >= long.length
long = sent
end
end
return long
end
p longest_word(sentence)

How to get words frequency in efficient way with ruby?

Sample input:
"I was 09809 home -- Yes! yes! You was"
and output:
{ 'yes' => 2, 'was' => 2, 'i' => 1, 'home' => 1, 'you' => 1 }
My code that does not work:
def get_words_f(myStr)
myStr=myStr.downcase.scan(/\w/).to_s;
h = Hash.new(0)
myStr.split.each do |w|
h[w] += 1
end
return h.to_a;
end
print get_words_f('I was 09809 home -- Yes! yes! You was');
This works but I am kinda new to Ruby too. There might be a better solution.
def count_words(string)
words = string.split(' ')
frequency = Hash.new(0)
words.each { |word| frequency[word.downcase] += 1 }
return frequency
end
Instead of .split(' '), you could also do .scan(/\w+/); however, .scan(/\w+/) would separate aren and t in "aren't", while .split(' ') won't.
Output of your example code:
print count_words('I was 09809 home -- Yes! yes! You was');
#{"i"=>1, "was"=>2, "09809"=>1, "home"=>1, "yes"=>2, "you"=>1}
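The difference between the two tokenizers mentioned above is easy to see on a contraction. This is a small sketch; the sample sentence and the third pattern, which keeps apostrophes inside words, are assumptions added for comparison.

```ruby
text = "They aren't home"

by_scan  = text.scan(/\w+/)      # apostrophe is not a word character
by_split = text.split(' ')       # splits on whitespace only
with_apostrophes = text.scan(/[\w']+/)  # word characters plus apostrophes

p by_scan
p by_split
p with_apostrophes
```

Unlike split(' '), the third pattern still drops punctuation such as "!" or "--".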
def count_words(string)
string.scan(/\w+/).reduce(Hash.new(0)){|res,w| res[w.downcase]+=1;res}
end
Second variant:
def count_words(string)
string.scan(/\w+/).each_with_object(Hash.new(0)){|w,h| h[w.downcase]+=1}
end
def count_words(string)
Hash[
string.scan(/[a-zA-Z]+/)
.group_by{|word| word.downcase}
.map{|word, words|[word, words.size]}
]
end
puts count_words 'I was 09809 home -- Yes! yes! You was'
This code will ask you for input and then find the word frequency for you:
puts "enter some text man"
text = gets.chomp
words = text.split(" ")
frequencies = Hash.new(0)
words.each { |word| frequencies[word.downcase] += 1 }
frequencies = frequencies.sort_by {|a, b| b}
frequencies.reverse!
frequencies.each do |word, frequency|
puts word + " " + frequency.to_s
end
This works, and ignores the numbers:
def get_words(my_str)
my_str = my_str.scan(/\w+/)
h = Hash.new(0)
my_str.each do |s|
s = s.downcase
if s !~ /^[0-9]*\.?[0-9]+$/
h[s] += 1
end
end
return h
end
print get_words('I was there 1000 !')
puts "\n" # double quotes, so that \n is a newline rather than the two literal characters
You can look at my code that splits the text into words. The basic code would look as follows:
sentence = "Ala ma kota za 5zł i 10$."
splitter = SRX::Polish::WordSplitter.new(sentence)
histogram = Hash.new(0)
splitter.each do |word,type|
histogram[word.downcase] += 1 if type == :word
end
p histogram
You should be careful if you wish to work with languages other than English, since in Ruby 1.9 the downcase won't work as you expected for letters such as 'Ł'.
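On Ruby 2.4 and later, String#downcase applies Unicode case mapping by default, so the caveat above mainly concerns older versions. Here is a minimal sketch under that assumption. It uses \p{L}+ (any run of Unicode letters) in place of the SRX splitter, and the method name word_histogram is hypothetical.

```ruby
# Assumes Ruby >= 2.4, where downcase handles non-ASCII letters such as "Ł".
def word_histogram(text)
  text.scan(/\p{L}+/).each_with_object(Hash.new(0)) do |word, counts|
    counts[word.downcase] += 1
  end
end

p word_histogram("Łódź łódź ŁÓDŹ i 10$")
```

Digits and currency symbols are skipped because \p{L} matches letters only.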
class String
def frequency
self.scan(/[a-zA-Z]+/).each.with_object(Hash.new(0)) do |word, hash|
hash[word.downcase] += 1
end
end
end
puts "I was 09809 home -- Yes! yes! You was".frequency
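On Ruby 2.7 and later, Enumerable#tally does the counting step directly. The sketch below assumes that version; the method name word_frequency and the ordering (most frequent first, then alphabetical) are choices made for illustration, not part of the answers above.

```ruby
# Assumes Ruby >= 2.7 for Enumerable#tally.
def word_frequency(text)
  counts = text.scan(/[a-zA-Z]+/).map(&:downcase).tally
  # Sort by descending count, breaking ties alphabetically.
  counts.sort_by { |word, n| [-n, word] }.to_h
end

p word_frequency("I was 09809 home -- Yes! yes! You was")
```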

Resources