Capitalization of strings - ruby

Imagine a simple input form that accepts a string, which may consist of any characters.
string = "mystical characters"
We need to process this string by uppercasing its first character; that is our main goal. We then need to display the converted string in some view template. So the question is: do we really need to check whether the first character is already uppercase, or can we just write this?
theresult = string.capitalize
=> "Mystical characters"
Which approach is better: check and then capitalize (if needed), or always capitalize?

Check first whether you need to process anything, because String#capitalize doesn't only convert the first character to uppercase; it also converts all other characters to lowercase. So:
"First Lastname".capitalize == "First lastname"
That might not be the desired result.

If I understood correctly, you are going to capitalize the string anyway, so why bother checking whether it's already capitalized?

Based on Tonttu's answer, I would suggest not worrying too much and just capitalizing like this:
new_string = string[0...1].capitalize + string[1..-1]
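A minimal sketch of the same idea as a reusable method, assuming you only want to touch the first character. The name upcase_first_char is hypothetical (it is not a core Ruby method). It also guards against the empty string, where string[1..-1] is nil and the one-liner above would raise a TypeError.

```ruby
# Hypothetical helper (not part of core Ruby): uppercase only the first
# character and leave the rest of the string untouched.
def upcase_first_char(string)
  # For "", string[1..-1] is nil, so return early instead of concatenating.
  return "" if string.empty?

  string[0].upcase + string[1..-1]
end

puts upcase_first_char("first Lastname")      # prints: First Lastname
puts upcase_first_char("mystical characters") # prints: Mystical characters
```

Unlike capitalize, this keeps the "L" in "Lastname" uppercase.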

I ran into Tonttu's problem while importing a bunch of names. I went with:
strs = "first lastname".split(" ")
return_string = ""
strs.each do |str|
  return_string += "#{str[0].upcase}#{str[1..str.length].downcase} "
end
return_string.chop
EDIT: The inevitable refactor, over a year later:
"first lastname".split(" ").map do |str|
  "#{str[0].upcase}#{str[1..str.length].downcase}"
end.join(' ')
While definitely not easier to read, it gets the same result while declaring fewer temporary variables.
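Since both versions downcase the rest of each word anyway, the same result can probably be had by calling capitalize on each word. A sketch under that assumption; titleize_words is a made-up name used only for illustration.

```ruby
# Hypothetical helper: capitalize every space-separated word.
# Note that capitalize also downcases the rest of each word.
def titleize_words(string)
  string.split(" ").map(&:capitalize).join(" ")
end

puts titleize_words("first lastname") # prints: First Lastname
puts titleize_words("JOHN SMITH")     # prints: John Smith
```

The caveat from the answer above still applies: a name like "McDonald" would come out as "Mcdonald".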

I guess you could write something like:
string.capitalize unless string =~ /^[A-Z].*/
Personally I would just do string.capitalize

Unless you have a flag marking already-capitalized strings that you are going to check, just capitalize without checking.
Also, capitalize itself probably performs some checking internally.
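For what it's worth, capitalize is safe to call on a string that is already capitalized: it simply returns an equal string. The bang version, capitalize!, is the one that reports whether anything changed, returning nil when the string was left as is. A quick illustration (the variable names are arbitrary):

```ruby
already = "Mystical characters"
puts already.capitalize == already # prints: true

# dup gives us a mutable copy even if string literals are frozen.
text = "mystical characters".dup
first_call  = text.capitalize! # modifies text in place
second_call = text.capitalize! # nothing left to change

p first_call  # prints: "Mystical characters"
p second_call # prints: nil
```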

Related

Ruby: issues with if-statement output

I am VERY new to Ruby. I was trying to make a simple "How's your day?" kind of thing. When I answer with "Good", it's supposed to return "Good to hear", but it skips that and goes to my else statement, which returns "Not valid". The same thing happens when I enter "Bad": it's supposed to give me "Oh no", but instead it gives me "Not valid". I know you're usually supposed to use ==, but I don't know what I am missing here. Thank you for your help.
puts "How are you?"
answer = gets
if (answer == "Good");
  print("Good to hear")
elsif (answer == "Bad");
  print("Oh no")
else;
  print("Not valid")
end
gets captures a string including the trailing newline character (\n). You're comparing the string "Good" (for example) against "Good\n", and that doesn't match.
You can observe this by adding a line after you populate answer, like puts answer.inspect (or more tersely, p answer). This will show you the string along with any not-normally-visible characters.
The easiest way to fix this is to use answer = gets.strip, which removes whitespace characters (including spaces, tabs, and newlines) from both ends of the captured input string.
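A sketch of the corrected logic. The comparison is pulled into a method (reply_for is a hypothetical name) so it can be tried without typing input; in the real script you would pass it the result of gets. chomp, which removes only the trailing line ending, should work here as well.

```ruby
# Hypothetical helper: takes the raw line as returned by gets
# (trailing "\n" included) and returns the reply text.
def reply_for(raw_answer)
  answer = raw_answer.strip # drop the newline and any surrounding spaces

  if answer == "Good"
    "Good to hear"
  elsif answer == "Bad"
    "Oh no"
  else
    "Not valid"
  end
end

puts reply_for("Good\n") # prints: Good to hear
puts reply_for("Bad\n")  # prints: Oh no
puts reply_for("Meh\n")  # prints: Not valid

# In the interactive script this would be:
#   puts "How are you?"
#   puts reply_for(gets)
```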

opposite of sub in ruby

I want to replace (or delete) the content that does not match my filter.
I think the perfect description would be an opposite sub. I cannot find anything similar in the docs, and I'm not sure how to invert the regex, but I think a method would probably be more convenient.
An example of how it would work (I've just changed the words to make it clearer):
"bird.cats.dogs".opposite_sub(/(dogs|cats)\.(dogs|cats)/, '')
#=> "cats.dogs"
I hope it's easy enough to understand.
Thanks in advance.
String#[] can take a regular expression as its parameter:
▶ "bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#⇒ "cats.dogs"
For multiple matches one can use String#scan:
▶ "bird.cats.dogs.bird.cats.dogs".scan /(?:dogs|cats)\.(?:dogs|cats)/
#⇒ ["cats.dogs", "cats.dogs"]
So you want to extract the part that matches your regex?
You can use String#slice, for example:
"bird.cats.dogs".slice(/(dogs|cats)\.(dogs|cats)/)
#=> "cats.dogs"
And String#[] does the same.
"bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#=> "cats.dogs"
You cannot have a single replacement string, because the part of the string that matches the regex might not be at the beginning or end of the string, in which case it's not clear whether the replacement string should precede or follow the matching part. I've therefore written the following with two replacement strings, one for the pre-match and the other for the post-match. I've made this a method of the String class, as that's what you asked for (though I've given the method a less-perfect name :-) ).
class String
  def replace_non_matching(regex, replace_before, replace_after)
    _before, match, _after = partition(regex)
    replace_before + match + replace_after
  end
end
r = /(dogs|cats)\.(dogs|cats)/
"birds.cats.dogs.pigs".replace_non_matching(r, "", "")
#=> "cats.dogs"
"birds.cats.dogs".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
"birds.cats.dogs.mice.cats.dogs.bats".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
Regarding the last example, the method could be modified to replace "birds.", ".mice." and ".bats", but in that case three replacement strings would be needed. In general, determining in advance the number of replacement strings needed could be problematic.
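If the goal is simply to drop everything that does not match, one possible way to handle several matches is to collect every full match and join the pieces with a separator of your choice. A sketch under that assumption: keep_matches is a hypothetical name, and it walks the string with Regexp#match and an offset so that capture groups in the pattern don't affect what is collected.

```ruby
# Hypothetical helper: return every full match of regex in string, in order.
def keep_matches(string, regex)
  matches = []
  pos = 0
  while (m = regex.match(string, pos))
    matches << m[0]
    # Continue after this match; always advance by at least one character
    # so that an empty match cannot cause an endless loop.
    pos = [m.end(0), m.begin(0) + 1].max
  end
  matches
end

r = /(dogs|cats)\.(dogs|cats)/
found = keep_matches("birds.cats.dogs.mice.cats.dogs.bats", r)
p found           # prints: ["cats.dogs", "cats.dogs"]
p found.join(".") # prints: "cats.dogs.cats.dogs"
```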

ruby/regex getting the first letter of each word

I want to get the first letter of each word put together, making something like "I need help" turn into "Inh". I was thinking of trimming everything off and going from there, or grabbing each first letter right away.
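You could simply use split, map and join together here. (In plain Ruby, take the first character with word[0] or chr; String#first is only available with ActiveSupport, e.g. in Rails, where map(&:first) works too.)
string = 'I need help'
result = string.split.map { |word| word[0] }.join
puts result #=> "Inh"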
How about regular expressions? Using split forces you to deal with parts of the string you don't need for this problem, and then to take another step to extract the first letter of each word (chr). That's why I think a regular expression is better for this case. Note that this also works if you have a - or another special character in the string. And, of course, you can add .upcase at the end to get a proper acronym.
string = 'something - something and something else'
string.scan(/\b\w/).join
#=> "ssase"
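Putting the scan approach and the suggested upcase together, a small sketch (acronym is a hypothetical helper name):

```ruby
# Hypothetical helper: first character of every word, uppercased.
def acronym(string)
  # \b\w matches a word character that directly follows a word boundary.
  string.scan(/\b\w/).join.upcase
end

puts acronym("I need help")                              # prints: INH
puts acronym("something - something and something else") # prints: SSASE
```

One caveat: \b also matches after an apostrophe, so a word like "don't" contributes two letters ("D" and "T").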
Alternative solution using regex
string = 'I need help'
result = string.scan(/(\A\w|(?<=\s)\w)/).flatten.join
puts result
This basically says "look for either the first letter or any letter directly preceded by a space". Because the pattern contains a capture group, scan returns an array of arrays of matches, which is flattened (made into one array) and joined (made into a string).
string = 'I need help'
result = string.split.map(&:chr).join
puts result
http://ruby-doc.org/core-2.0/String.html#method-i-chr

Regex to match a specific parenthesis among multiple

Take the String:
"The only true (wisdom) is in knowing you know (nothing)"
I want to extract "nothing".
What I know about it:
It will always be inside parentheses
The parentheses will always be the last element before the line end: $
I first attempted to match it with
/\(.*\)$/, but that obviously returned
(wisdom) is in knowing you know (nothing).
You want to use a negated character class, like [^...]:
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/\(([^)]+)\)$/).captures
In this case, "nothing" is in the first capture group, but the entire regex technically matches "(nothing)". To match exactly "nothing" as the entire match, use lookarounds:
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/(?<=\()([^)]+)(?=\)$)/).captures
I would do
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/\(([^)]+)\)$/).captures # => ["nothing"]
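You could use scan to find all matches and then take the last one. Because the pattern has a capture group, scan returns an array of arrays, so flatten it first:
str = "The only true (wisdom) is in knowing you know (nothing)"
str.scan(/\((.+?)\)/).flatten.last
#=> "nothing"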
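You can use \z, which matches the end of the string. Try:
\([a-z]+\)\z
It is simpler and ignores everything but the last parenthesized word. Note that the match includes the parentheses, and that [a-z] only allows lowercase letters.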
Test it here:
http://rubular.com/
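To get only the text between the parentheses with the \z approach, one option is to add a capture group and use String#[], which accepts the group number as a second argument. A sketch (last_parenthesized is a hypothetical name):

```ruby
# Hypothetical helper: text inside a parenthesized group that ends the string,
# or nil if the string does not end with one.
def last_parenthesized(string)
  # [^()]+ cannot cross another parenthesis, so only the final group can
  # reach \z (the very end of the string).
  string[/\(([^()]+)\)\z/, 1]
end

s = "The only true (wisdom) is in knowing you know (nothing)"
p last_parenthesized(s)                     # prints: "nothing"
p last_parenthesized("no parentheses here") # prints: nil
```

Because \z is strict about the end of the string, input that still carries a trailing newline would need chomp first (or \Z instead of \z).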
It's even trickier if there's any chance of nesting. In that case you need some recursion:
"...knowing you know ((almost) nothing)"[/\(((?:[^()]*|\(\g<1>\))*)\)$/, 1]
#=> "(almost) nothing"
Look ma, no regex!
s = 'The only true (wisdom) is in knowing you know (nothing)'
r = s.reverse
r[(r.index(')') + 1)...(r.index('('))].reverse
#=> "nothing"

Remove character from string if it starts with that character?

How can I remove the very first "1" from any string if that string starts with a "1"?
"1hello world" => "hello world"
"112345" => "12345"
I'm thinking of doing
string.sub!('1', '') if string =~ /^1/
but I'm wondering if there's a better way. Thanks!
Why not just include the regex in the sub! method?
string.sub!(/^1/, '')
As of Ruby 2.5 you can use delete_prefix or delete_prefix! to achieve this in a readable manner.
In this case "1hello world".delete_prefix("1").
More info here:
https://blog.jetbrains.com/ruby/2017/10/10-new-features-in-ruby-2-5/
https://bugs.ruby-lang.org/issues/12694
'invisible'.delete_prefix('in') #=> "visible"
'pink'.delete_prefix('in') #=> "pink"
N.B. you can also use this to remove items from the end of a string with delete_suffix and delete_suffix!
'worked'.delete_suffix('ed') #=> "work"
'medical'.delete_suffix('ed') #=> "medical"
https://bugs.ruby-lang.org/issues/13665
I've answered in a little more detail (with benchmarks) here: What is the easiest way to remove the first character from a string?
If you're going to use a regex for the match, you may as well use it for the replacement:
string.sub!(%r{^1},"")
BTW, the %r{} is just an alternate syntax for regular expressions. You can use %r followed by any character e.g. %r!^1!.
Be careful with sub!(/^1/, ''): if the string doesn't match /^1/, it returns nil. You should probably use sub (without the bang).
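To illustrate the difference: sub! returns nil when it changes nothing, which bites as soon as you use its return value (in an assignment or a method chain), while sub always returns a string. The variable names below are arbitrary.

```ruby
unchanged = "hello world".dup
bang_result = unchanged.sub!(/^1/, "") # no leading "1", so nothing to replace
p bang_result # prints: nil
p unchanged   # prints: "hello world"

# sub returns a string whether or not anything was replaced.
p "hello world".sub(/^1/, "")  # prints: "hello world"
p "1hello world".sub(/^1/, "") # prints: "hello world"
```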
This answer might be more optimised: What is the easiest way to remove the first character from a string?
string[0] = '' if string[0] == '1'
I'd like to post a tiny improvement to the otherwise excellent answer by Zach. The ^ matches the beginning of every line in Ruby regex. This means there can be multiple matches per string. Kenji asked about the beginning of the string which means they have to use this regex instead:
string.sub!(/\A1/, '')
With ^, a "1" at the start of any line of a multi-line string can match; with \A, only a "1" at the very start of the string matches.
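The difference is easy to check in irb. A short illustration with a made-up two-line string:

```ruby
multiline = "0\n1abc"

with_caret = multiline.sub(/^1/, "")  # ^ matches at the start of line two
with_a     = multiline.sub(/\A1/, "") # \A only matches at the start of the string

p with_caret # prints: "0\nabc"
p with_a     # prints: "0\n1abc"

# For the examples from the question, \A behaves as intended:
p "112345".sub(/\A1/, "")       # prints: "12345"
p "1hello world".sub(/\A1/, "") # prints: "hello world"
```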

Resources