Ruby -- capitalize first letter of every sentence in a paragraph


Using Ruby, I would like to capitalize the first letter of every sentence and also remove any space before the period at the end of every sentence. Nothing else should change.
Input = "this is the First Sentence . this is the Second Sentence ."
Output = "This is the First Sentence. This is the Second Sentence."
Thank you folks.

Using regular expression (String#gsub):
Input = "this is the First Sentence . this is the Second Sentence ."
Input.gsub(/[a-z][^.?!]*/) { |match| match[0].upcase + match[1..-1].rstrip }
# => "This is the First Sentence. This is the Second Sentence."
Input.gsub(/([a-z])([^.?!]*)/) { $1.upcase + $2.rstrip } # Using capturing group
# => "This is the First Sentence. This is the Second Sentence."
I assumed that a sentence ends with ., ?, or !.
UPDATE
input = "TESTest me is agreat. testme 5 is awesome"
input.gsub(/([a-z])((?:[^.?!]|\.(?=[a-z]))*)/i) { $1.upcase + $2.rstrip }
# => "TESTest me is agreat. Testme 5 is awesome"
input = "I'm headed to stackoverflow.com"
input.gsub(/([a-z])((?:[^.?!]|\.(?=[a-z]))*)/i) { $1.upcase + $2.rstrip }
# => "I'm headed to stackoverflow.com"
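The UPDATE regex can be wrapped in a small method so it is easy to reuse and test. This is only a sketch: the method name capitalize_sentences is hypothetical, and it keeps the same assumption that a sentence ends with ., ?, or !.

```ruby
# Hypothetical helper wrapping the regex from the UPDATE above.
# $1 is the first letter of a sentence; $2 is the rest of the sentence,
# where a period directly followed by a letter (as in "stackoverflow.com")
# does not end the sentence.
def capitalize_sentences(text)
  text.gsub(/([a-z])((?:[^.?!]|\.(?=[a-z]))*)/i) { $1.upcase + $2.rstrip }
end

puts capitalize_sentences("this is the First Sentence . this is the Second Sentence .")
# prints: This is the First Sentence. This is the Second Sentence.
puts capitalize_sentences("is this ok ? yes it is !")
# prints: Is this ok? Yes it is!
```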

Input.split('.').map(&:strip).map { |s|
s[0].upcase + s[1..-1] + '.'
}.join(' ')
=> "This is the First Sentence. This is the Second Sentence."
My second approach is cleaner but produces a slightly different output:
Input.split('.').map(&:strip).map(&:capitalize).join('. ') + '.'
=> "This is the first sentence. This is the second sentence."
I'm not sure if you're fine with it.
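The difference between the two approaches comes from String#capitalize, which upcases the first character and downcases the rest. A short illustration (the variable names are only for this example):

```ruby
sentence = "this is the First Sentence"

# capitalize also lowercases everything after the first character.
lowered = sentence.capitalize
# => "This is the first sentence"

# Upcasing only the first character leaves the rest untouched.
preserved = sentence[0].upcase + sentence[1..-1]
# => "This is the First Sentence"

puts lowered
puts preserved
```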

Related

replace words in a string and re join them

Hi I'm building a function that has me take any instance of "u" or "you" in a string and replace it with a specific word. I can go in and isolate the instances no problem but I cannot get the words to output properly. So far I have.
def autocorrect(input)
#replace = [['you','u'], ['your sister']]
#replace.each{|replaced| input.gsub!(replaced[0], replaced[1])}
input.split(" ")
if (input == "u" && input.length == 1) || input == "you"
input.replace("your sister")
end
input.join(" ")
end
The ideal output would be:
autocorrect("I am so smitten with you")
"I am so smitten with your sister"
I don't know how to get the last part correct, I can't think of a good method to use. Any help would be greatly appreciated.

The problem with your code is that you call input.split(" ") but don't save the result anywhere. When you then check input == "u" # ..., input is still the entire string. So autocorrect('u') or autocorrect('you') would give you "your sister" back, except that the next line, input.join(" "), throws an error.
The error occurs because input is still the original string, not an array of words, and strings don't have a join method.
To get your code working with the fewest changes possible, you can change it to:
def autocorrect(input)
#replace = [['you','u'], ['your sister']]
#replace.each{|replaced| input.gsub!(replaced[0], replaced[1])}
input.split(" ").map do |word|
if (word == "u" && word.length == 1) || word == "you"
"your sister"
else
word
end
end.join(" ")
end
So, now, you are doing something with each word after you split(" ") the input, and you are checking each word against "u" and "you", instead of the entire input string. You then map either the replacement word, or the original, and then join them back into a single string to return them.
As an alternative, shorter way, you can use String#gsub which can take a Hash as the second parameter to do substitutions:
If the second argument is a Hash, and the matched text is one of its keys, the corresponding value is the replacement string.
def autocorrect(input)
replace = { 'you' => 'your sister',
'u' => 'your sister',
'another word' => 'something else entirely' }
input.gsub(/\b(#{replace.keys.join('|')})\b/, replace)
end
autocorrect("I am u so smitten with utopia you and another word")
# => "I am your sister so smitten with utopia your sister and something else entirely"
The regex in that example comes out looking like:
/\b(you|u|another word)\b/
with \b being any word boundary.
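If a key could ever contain a regex metacharacter, joining the keys with '|' would change the meaning of the pattern. A possible variation, sketched here under the assumption that the keys are plain strings, builds the alternation with Regexp.union, which escapes each key. The names REPLACEMENTS and autocorrect_escaped are hypothetical.

```ruby
# Hypothetical replacement table, same entries as the example above.
REPLACEMENTS = {
  'you'          => 'your sister',
  'u'            => 'your sister',
  'another word' => 'something else entirely'
}.freeze

def autocorrect_escaped(input, replacements = REPLACEMENTS)
  # Regexp.union escapes every key, so metacharacters are matched literally.
  pattern = /\b(?:#{Regexp.union(replacements.keys).source})\b/
  # With a Hash as the second argument, the matched text is looked up as a key.
  input.gsub(pattern, replacements)
end

puts autocorrect_escaped("I am u so smitten with utopia you and another word")
# prints: I am your sister so smitten with utopia your sister and something else entirely
```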

Simple array mapping would do the job:
"I am u so smuitten with utopia you".split(' ').map{|word| %w(you u).include?(word) ? 'your sister' : word}.join(' ')
#=> "I am your sister so smuitten with utopia your sister"
Your method would be:
def autocorrect(input)
input.split(' ').map{|word| %w(you u).include?(word) ? 'your sister' : word}.join(' ')
end
autocorrect("I am so smitten with you")
#=> "I am so smitten with your sister"
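One thing to keep in mind with the split-and-map approach: a word with punctuation attached, such as "you!", is not equal to "you", so it is left alone. A word-boundary regex handles that case. A small comparison, using a made-up input:

```ruby
text = "I miss you!"

# Splitting on spaces yields the token "you!", which is not in the list.
by_split = text.split(' ').map { |word| %w(you u).include?(word) ? 'your sister' : word }.join(' ')
# => "I miss you!"

# \b matches between "u" and "!", so the word itself is replaced.
by_regex = text.gsub(/\b(?:you|u)\b/, 'your sister')
# => "I miss your sister!"

puts by_split
puts by_regex
```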

Count capitalized of each sentence in a paragraph Ruby

I answered my own question. Forgot to initialize count = 0
I have a bunch of sentences in a paragraph.
a = "Hello there. this is the best class. but does not offer anything." as an example.
To figure out if the first letter is capitalized, my thought is to .split the string so that a_sentence = a.split(".")
I know I can call "hello world".capitalize!, and if it returns nil, that tells me the string was already capitalized.
EDIT
Now I can use an array method to go through each value and call .capitalize!
And I know I can check whether something is .strip.capitalize!.nil?
But I can't seem to output how many were capitalized.
EDIT
count = 0 # the counter must be initialized before the loop
a_sentence.each do |sentence|
if (sentence.strip.capitalize!.nil?)
count += 1
puts "#{count} capitalized"
end
end
It outputs:
1 capitalized
Thanks for all your help. I'll stick with the above code I can understand within the framework I only know in Ruby. :)

Try this:
b = []
a.split(".").each do |sentence|
b << sentence.strip.capitalize
end
b = b.join(". ") + "."
# => "Hello there. This is the best class. But does not offer anything."

Your post's title is misleading because from your code, it seems that you want to get the count of capitalized letters at the beginning of a sentence.
Assuming that every sentence is finishing on a period (a full stop) followed by a space, the following should work for you:
split_str = ". "
regex = /^[A-Z]/
paragraph_text.split(split_str).count do |sentence|
regex.match(sentence)
end
And if you want to simply ensure that each starting letter is capitalized, you could try the following:
paragraph_text.split(split_str).map(&:capitalize).join(split_str) + split_str

There's no need to split the string into sentences:
str = "It was the best of times. sound familiar? Out, damn spot! oh, my."
str.scan(/(?:^|[.!?]\s)\s*\K[A-Z]/).length
#=> 2
The regex could be written with documentation by adding x after the closing /:
r = /
(?: # start a non-capture group
^|[.!?]\s # match ^ or (|) any of ([]) ., ! or ?, then one whitespace char
) # end non-capture group
\s* # match any number of whitespace chars
\K # forget the preceding match
[A-Z] # match one capital letter
/x
a = str.scan(r)
#=> ["I", "O"]
a.length
#=> 2
Instead of Array#length, you could use its alias, size, or Array#count.

You can count how many were capitalized, like this:
a = "Hello there. this is the best class. but does not offer anything."
a_sentence = a.split(".")
a_sentence.inject(0) { |sum, s| s.strip!; s.capitalize!.nil? ? sum += 1 : sum }
# => 1
a_sentence
# => ["Hello there", "This is the best class", "But does not offer anything"]
And then put it back together, like this:
"#{a_sentence.join('. ')}."
# => "Hello there. This is the best class. But does not offer anything."
EDIT
As @Humza suggested, you could use count:
a_sentence.count { |s| s.strip!; s.capitalize!.nil? }
# => 1
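A caveat on the capitalize!.nil? test: capitalize! returns nil only when the string is already in its fully capitalized form (first letter upper case, everything else lower case). A sentence such as "Hello World" is changed to "Hello world", so it is not counted even though it starts with a capital. If only the first letter matters, checking it directly may be closer to the intent. A sketch with a made-up paragraph:

```ruby
paragraph = "Hello World. this is the best class. But does not offer anything."
sentences = paragraph.split(".").map(&:strip)

# dup keeps the original sentences unmodified while testing capitalize!.
by_capitalize = sentences.count { |s| s.dup.capitalize!.nil? }
# => 1 ("Hello World" is not counted because capitalize! lowercases the W)

# Checks only whether the sentence starts with an upper-case ASCII letter.
by_first_letter = sentences.count { |s| s.match?(/\A[A-Z]/) }
# => 2

puts by_capitalize
puts by_first_letter
```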

Regex to match exact word in string

I've looked around but haven't been able to find a working solution to my problem.
I have an array of two strings input and want to test which element of the array contains an exact substring Test.
One thing I have tried (among numerous other attempts):
input = ["Test's string", "Test string"]
# Alternative input array that it needs to work on:
# ["Testing string", "some Test string"]
substring = "Test"
if (input[0].match(/\b#{substring}\b/))
puts "Test 0 "
# Do something...
elsif (input[1].match(/\b#{substring}\b/))
puts "Test 1"
# Do something different...
end
The desired result is a print of "Test 1". The input can be more complex but overall I am looking for a way to find an exact match of a substring in a longer string.
I feel like this should be a rather trivial regex but I haven't been able to come up with the correct pattern. Any help would be greatly appreciated!

The following code may be what you are looking for.
input = ["Testing string", "Test string"]
substring = "Test"
if input[0].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 0 "
elsif input[1].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 1"
end
The meaning of the pattern /(?:^|\s)#{substring}(?:\s|$)/ is
(?:^|\s) : the left side of the substring is the beginning of a line (^) or white space,
#{substring} : the substring is matched exactly,
(?:\s|$) : the right side of the substring is white space or the end of a line ($).
The alternation has to go in a group: inside square brackets, ^ and | do not act as an anchor or as "or", so [^|\s] would mean "any character other than | or white space".
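A grouped alternation and a bracketed character class are easy to confuse, so here is a quick check of how each behaves. Inside square brackets a leading ^ negates the class and | and $ are literal characters. The variable names are only for illustration.

```ruby
substring = "Test"

# Group: start of line OR whitespace, then the word, then whitespace OR end of line.
grouped = /(?:^|\s)#{Regexp.escape(substring)}(?:\s|$)/

# Character classes: one char that is not "|" or whitespace, then the word,
# then one char that is whitespace, "|" or a literal "$".
bracketed = /[^|\s]#{Regexp.escape(substring)}[\s|$]/

puts grouped.match?("Test string")     # true
puts grouped.match?("Testing string")  # false
puts grouped.match?("Test's string")   # false
puts bracketed.match?("Test string")   # false, nothing precedes "Test"
puts bracketed.match?("xTest string")  # true
```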

One way to do that is as follows:
input = ["Testing string", "Test"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 1"
input = ["Test", "Testing string"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 0"
\b in the regex denotes a word boundary.
Maybe you want a method to return the index of the first element of input that contains the word? That could be:
def matching_index(input, word)
input.index { |s| s[/\b#{word}\b/i] }
end
input = ["Testing string", "Test"]
matching_index(input, "Test") #=> 1
matching_index(input, "test") #=> 1
matching_index(input, "Testing") #=> 0
matching_index(input, "Testy") #=> nil
Then you could use it like this, for example:
word = 'Test'
puts "The matching element for '#{word}' is at index #{ matching_index(input, word) }"
#=> The matching element for 'Test' is at index 1
word = "Testing"
puts "The matching element for '#{word}' is '#{ input[matching_index(input, word)] }'"
#=> The matching element for 'Testing' is 'Testing string'

The problem is with your boundaries. In your original question, the word Test matches the first string because the apostrophe in "Test's" counts as a \b word boundary. It's a perfect match, so "Test 0" is the correct response. You need to decide how the match should terminate. If your input contains special characters, I don't think the regex will work properly: /\bTest my $money.*/ will never match because of the $ in your substring.
What happens if you have multiple matches in your input array? Do you want to do something to all of them or just the first one?
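For substrings that contain special characters, Regexp.escape can be applied before interpolating, so that characters like $ are matched literally instead of acting as anchors. A minimal sketch with made-up input:

```ruby
substring = "Test my $money"
text = "Please Test my $money now"

# Interpolated as-is, "$" is an end-of-line anchor, so "money" can never follow it.
unescaped = /\b#{substring}/
puts unescaped.match?(text)  # false

# Regexp.escape turns "$" (and the spaces) into literal characters.
escaped = /\b#{Regexp.escape(substring)}\b/
puts escaped.match?(text)    # true
```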

Check if string contains one word or more

When looping through lines of text, what is the neatest way (most 'Ruby') to do an if else statement (or similar) to check if the string is a single word or not?
def check_if_single_word(string)
# code here
end
s1 = "two words"
s2 = "hello"
check_if_single_word(s1) -> false
check_if_single_word(s2) -> true

Since you're asking about the 'most Ruby' way, I'd rename the method to single_word?
One way is to check for the presence of a space character.
def single_word?(string)
!string.strip.include? " "
end
But if you want to allow a particular set of characters that meet your definition of word, perhaps including apostrophes and hyphens, use a regex:
def single_word?(string)
string.scan(/[\w'-]+/).length == 1
end
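Checking only for a space character misses other whitespace such as tabs or newlines. Another option, sketched here with the hypothetical name single_word_by_split?, relies on String#split with no arguments, which splits on any run of whitespace and ignores leading and trailing whitespace:

```ruby
# Hypothetical alternative: exactly one whitespace-separated token.
def single_word_by_split?(string)
  string.split.size == 1
end

puts single_word_by_split?("hello")        # true
puts single_word_by_split?("  hello  ")    # true
puts single_word_by_split?("two words")    # false
puts single_word_by_split?("two\twords")   # false
puts single_word_by_split?("")             # false

# For comparison, the space-only check treats a tab-separated pair as one word.
puts !"two\twords".strip.include?(" ")     # true
```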

Following your definition of a word given in the comment:
[A] stripped string that doesn't [include] whitespace
the code would be
def check_if_single_word(string)
string.strip == string and string.include?(" ").!
end
check_if_single_word("two words") # => false
check_if_single_word("New York") # => false
check_if_single_word("hello") # => true
check_if_single_word(" hello") # => false

Here is some code that may help you out:
def check_if_single_word(string)
ar = string.scan(/\w+/)
ar.size == 1 ? "only one word" : "more than one word"
end
s1 = "two words"
s2 = "hello"
check_if_single_word s1 # => "more than one word"
check_if_single_word s2 # => "only one word"
def check_if_single_word(string)
string.scan(/\w+/).size == 1
end
s1 = "two words"
s2 = "hello"
check_if_single_word s1 # => false
check_if_single_word s2 # => true

I would check if a space exists in the string.
def check_if_single_word(string)
return !(string.strip =~ / /)
end
.strip will remove excess white space that may exist at the start and the end of the string.
!(string =~ / /) means that the string does not match the regular expression of a single space.
Likewise you could also use !string.strip[/ /].

A Ruby way: extend the String class.
class String
def one?
!self.strip.include? " "
end
end
Then use "Hello world".one? to check whether the string contains one word or more.

How to split a string and skip whitespace?

I have a string like " This is a test ". I want to split the string by the space character. I do it like this:
puts " This is a test ".strip.each(' ') {|s| puts s.strip}
The result is:
This
is
a
test
This is a test
Why is there the last line "This is a test"?
I also need two or more space characters between two words not to produce an extra "row".
I only want the words of the given string, split apart.
Does anyone have an idea?

irb(main):002:0> " This is a test ".split
=> ["This", "is", "a", "test"]
irb(main):016:0* puts " This is a test ".split
This
is
a
test
str.split(pattern=$;, [limit]) => anArray
If pattern is omitted, the value of $;
is used. If $; is nil (which is the
default), str is split on whitespace
as if ' ' were specified.
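To illustrate the quoted behaviour, here is how splitting on whitespace compares with splitting on a regex that matches a single space. The input, with a doubled space in the middle, is made up for this example.

```ruby
text = " This  is a test "

# No argument (or ' '): leading whitespace and runs of whitespace are ignored.
words = text.split
# => ["This", "is", "a", "test"]

# A single-space regex produces empty strings for the leading and doubled spaces
# (trailing empty strings are dropped).
pieces = text.split(/ /)
# => ["", "This", "", "is", "a", "test"]

p words
p pieces
```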

You should do
" This is a test ".strip.each(' ') {|s| puts s.strip}
If you don't want the last "this is a test"
Because
irb>>> puts " This is a test ".strip.each(' ') {}
This is a test

The first "puts" prints the return value of each after the block has executed.
Omit the first "puts" and you are done.
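String#each was removed in Ruby 1.9, so on current Ruby versions the same idea would be written with split and Array#each. The point about the return value still holds: each returns its receiver, so an outer puts would print the words a second time. A sketch:

```ruby
text = " This is a test "

# each returns the array it was called on, not the result of the block.
returned = text.split.each { |word| puts word }

p returned
# prints: ["This", "is", "a", "test"]
```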
