How to get gsub! to recognize different characters (uppercase, lowercase) - ruby

print "Enter your text here:"
user_text = gets.chomp
user_text_2 = user_text.gsub! "Damn", "Darn"
user_text_3 = user_text.gsub! "Shit", "Crap"
puts "Here is your edited text: #{user_text}"
I would like that code to also recognize when I use the lowercase versions of Shit and Damn and replace them with the substitute words. Right now it only recognizes when I type the words in with an uppercase first word. Is there any way to get it to recognize the lowercase words too, without adding more gsub! lines of code?

You can specify the i flag on your patten to ignore case:
user_text_2 = user_text.gsub! /Damn/i, "Darn"

Just a very short solution:
user_text.gsub!(/[Dd]amn/, 'Darn')
The more general approach, if this is what you want, is with an i which makes the regex case-insensitive.
user_text.gsub!(/damn/i, 'Darn')

You can do that with a Regexp and the i option, that makes the Regexp case insensitive:
foo = 'foo Foo'
foo.gsub(/foo/, 'bar')
#=> "bar Foo"
foo.gsub(/foo/i, 'bar') # with i
#=> "bar bar"

Use regular expressions instead of strings. So /damn/i instead of "damn". The 'i' at the end of a regular expression means ignore casing.

Related

Delete all the whitespaces that occur after a word in ruby

I have a string " hello world! How is it going?"
The output I need is " helloworld!Howisitgoing?"
So all the whitespaces after hello should be removed. I am trying to do this in ruby using regex.
I tried strip and delete(' ') methods but I didn't get what I wanted.
some_string = " hello world! How is it going?"
some_string.delete(' ') #deletes all spaces
some_string.strip #removes trailing and leading spaces only
Please help. Thanks in advance!
There are numerous ways this could be accomplished without without a regular expressions, but using them could be the "cleanest" looking approach without taking sub-strings, etc. The regular expression I believe you are looking for is /(?!^)(\s)/.
" hello world! How is it going?".gsub(/(?!^)(\s)/, '')
#=> " helloworld!Howisitgoing?"
The \s matched any whitespace character (including tabs, etc), and the ^ is an "anchor" meaning the beginning of the string. The ! indicates to reject a match with following criteria. Using those together to your goal can be accomplished.
If you are not familiar with gsub, it is very similar to replace, but takes a regular expression. It additionally has a gsub! counter-part to mutate the string in place without creating a new altered copy.
Note that strictly speaking, this isn't all whitespace "after a word" to quote the exact question, but I gathered from your examples that your intentions were "all whitespace except beginning of string", which this will do.
def remove_spaces_after_word(str, word)
i = str.index(/\b#{word}\b/i)
return str if i.nil?
i += word.size
str.gsub(/ /) { Regexp.last_match.begin(0) >= i ? '' : ' ' }
end
remove_spaces_after_word("Hey hello world! How is it going?", "hello")
#=> "Hey helloworld!Howisitgoing?"

Multiple Ruby chomp! statements

I'm writing a simple method to detect and strip tags from text strings. Given this input string:
{{foobar}}
The function has to return
foobar
I thought I could just chain multiple chomp! methods, like so:
"{{foobar}}".chomp!("{{").chomp!("}}")
but this won't work, because the first chomp! returns a NilClass. I can do it with regular chomp statements, but I'm really looking for a one-line solution.
The String class documentation says that chomp! returns a Str if modifications have been made - therefore, the second chomp! should work. It doesn't, however. I'm at a loss at what's happening here.
For the purposes of this question, you can assume that the input string is always a tag which begins and ends with double curly braces.
You can definitely chain multiple chomp statements (the non-bang version), still having a one-line solution as you wanted:
"{{foobar}}".chomp("{{").chomp("}}")
However, it will not work as expected because both chomp! and chomp removes the separator only from the end of the string, not from the beginning.
You can use sub
"{{foobar}}".sub(/{{(.+)}}/, '\1')
# => "foobar"
"alfa {{foobar}} beta".sub(/{{(.+)}}/, '\1')
# => "alfa foobar beta"
# more restrictive
"{{foobar}}".sub(/^{{(.+)}}$/, '\1')
# => "foobar"
Testing this out, it's clear that chomp! will return nil if the separator it's provided as an argument is not present at the end of the string.
So "{{text}}".chomp!("}}") returns a string, but "{{text}}".chomp!("{{") reurns nil.
See here for an answer of how to chomp at the beginning of a string. But recognize that chomp only looks at the end of the string. So you can call str.reverse.chomp!("{{").reverse to remove the opening brackets.
You could also use a regex:
string = "{{text}}"
puts [/^\{\{(.+)\}\}$/, 1]
# => "text"
Try tr:
'{{foobar}}'.tr('{{', '').tr('}}', '')
You can also use gsub or sub but if the replacement is not needed as pattern, then tr should be faster.
If there are always curly braces, then you can just slice the string:
'{{foobar}}'[2...-2]
If you plan to make a method which returns the string without curly braces then DO NOT use bang versions. Modifying the input parameter of a method will be suprising!
def strip(string)
string.tr!('{{', '').tr!('}}', '')
end
a = '{{foobar}}'
b = strip(a)
puts b #=> foobar
puts a #=> foobar

How to change case of letters in string using RegEx in Ruby

Say I have a string : "hEY "
I want to convert it to "Hey "
string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)
That is the idea I have, but it seems like the upcase method does nothing when I use it within the gsub method. Why is that?
EDIT: I came up with this method:
string.gsub!(/([a-z])([A-Z]+ )/) { |str| str.downcase!.capitalize! }
Is there a way to do this within the regex though? I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work
#sawa Has the simple answer, and you've edited your question with another mechanism. However, to answer two of your questions:
Is there a way to do this within the regex though?
No, Ruby's regex does not support a case-changing feature as some other regex flavors do. You can "prove" this to yourself by reviewing the official Ruby regex docs for 1.9 and 2.0 and searching for the word "case":
https://github.com/ruby/ruby/blob/ruby_1_9_3/doc/re.rdoc
https://github.com/ruby/ruby/blob/ruby_2_0_0/doc/re.rdoc
I don't really understand the '\1' '\2' thing. Is that backreferencing? How does that work?
Your use of \1 is a kind of backreference. A backreference can be when you use \1 and such in the search pattern. For example, the regular expression /f(.)\1/ will find the letter f, followed by any character, followed by that same character (e.g. "foo" or "f!!").
In this case, within a replacement string passed to a method like String#gsub, the backreference does refer to the previous capture. From the docs:
"If replacement is a String it will be substituted for the matched text. It may contain back-references to the pattern’s capture groups of the form \d, where d is a group number, or \k<n>, where n is a group name. If it is a double-quoted string, both back-references must be preceded by an additional backslash."
In practice, this means:
"hello world".gsub( /([aeiou])/, '_\1_' ) #=> "h_e_ll_o_ w_o_rld"
"hello world".gsub( /([aeiou])/, "_\1_" ) #=> "h_\u0001_ll_\u0001_ w_\u0001_rld"
"hello world".gsub( /([aeiou])/, "_\\1_" ) #=> "h_e_ll_o_ w_o_rld"
Now, you have to understand when code runs. In your original code…
string.gsub!(/([a-z])([A-Z]+ )/, '\1'.upcase)
…what you are doing is calling upcase on the string '\1' (which has no effect) and then calling the gsub! method, passing in a regex and a string as parameters.
Finally, another way to achieve this same goal is with the block form like so:
# Take your pick of which you prefer:
string.gsub!(/([a-z])([A-Z]+ )/){ $1.upcase << $2.downcase }
string.gsub!(/([a-z])([A-Z]+ )/){ [$1.upcase,$2.downcase].join }
string.gsub!(/([a-z])([A-Z]+ )/){ "#{$1.upcase}#{$2.downcase}" }
In the block form of gsub the captured patterns are set to the global variables $1, $2, etc. and you can use those to construct the replacement string.
I don't know why you are trying to do it in a complicated way, but the usual way is:
"hEY".capitalize # => "Hey"
If you insist in using a regex and upcase, then you would also need downcase:
"hEY".downcase.sub(/\w/){$&.upcase} # => "Hey"
If you really want to just swap the case of every letter in the string, you can avoid the complexity of regex entirely because There's A Method For That™.
"hEY".swapcase # => "Hey"
"HellO thERe".swapcase # => "hELLo THerE"
There's also swapcase! to do it destructively.

Ruby - How to keep \n when I print out strings

I would like to keep \n when I print out strings in ruby,
Like now, if I use puts or print, \n will end up with a newline:
pry(main)> print "abc\nabc"
abc
abc
is there a way to let ruby print it out like: abc\nabc ?
UPDATE
Sorry that maybe I didn't make it more clear. I am debugging my regexps, so when I output a string, if a \n is displayed as a \n, not a newline, it would be easier for me to check. So #slivu 's answer is exactly what I want. Thanks guys.
i would suggest to use p instead of puts/print:
p "abc\nabc"
=> "abc\nabc"
and in fact with pry you do not need to use any of them, just type your string and enter:
str = "abc\nabc"
=> "abc\nabc"
str
=> "abc\nabc"
In Ruby, strings in a single quote literal, like 'abc\nabc' will not be interpolated.
1.9.3p194 :001 > print 'abc\nabc'
abc\nabc => nil
vs with double quotes
1.9.3p194 :002 > print "abc\nabc"
abc
abc => nil
Because \n is a special keyword, you have to escape it, by adding a backslash before the special keyword (and because of that, backslashes must also be escaped), like so: abs\\nabc. If you wanted to print \\n, you would have to replace it to be abs\\\\n, and so on (two backslashes to display a backslash).
You can also just use single quotes instead of double quotes so that special keywords will not be interpreted. This, IMHO, is bad practice, but if it makes your code look nicer, I guess it's worth it :)
Here are some examples of ways you can escape (sort of like a TL;DR version):
puts 'abc\nabc' # Single quotes ignore special keywords
puts "abc\\nabc" # Escaping a special keyword (preferred technique, IMHO)
p "abc\nabc" # The "p" command does not interpret special keywords
You can also do it like this
puts 'abc\nabc'
Use %q to create a string for example:
str = %q{abc\nklm}
puts str #=> 'abc\nklm'
or
puts %q{abc\nklm}
or use escape character
puts "abc\\nklm"
To print the string: abc\nabc
The coded string must be: abc\\nabc which will not generate a newline.

Remove all non-alphabetical, non-numerical characters from a string?

If I wanted to remove things like:
.!,'"^-# from an array of strings, how would I go about this while retaining all alphabetical and numeric characters.
Allowed alphabetical characters should also include letters with diacritical marks including à or ç.
You should use a regex with the correct character property. In this case, you can invert the Alnum class (Alphabetic and numeric character):
"◊¡ Marc-André !◊".gsub(/\p{^Alnum}/, '') # => "MarcAndré"
For more complex cases, say you wanted also punctuation, you can also build a set of acceptable characters like:
"◊¡ Marc-André !◊".gsub(/[^\p{Alnum}\p{Punct}]/, '') # => "¡MarcAndré!"
For all character properties, you can refer to the doc.
string.gsub(/[^[:alnum:]]/, "")
The following will work for an array:
z = ['asfdå', 'b12398!', 'c98347']
z.each { |s| s.gsub! /[^[:alnum:]]/, '' }
puts z.inspect
I borrowed Jeremy's suggested regex.
You might consider a regular expression.
http://www.regular-expressions.info/ruby.html
I'm assuming that you're using ruby since you tagged that in your post. You could go through the array, put it through a test using a regexp, and if it passes remove/keep it based on the regexp you use.
A regexp you might use might go something like this:
[^.!,^-#]
That will tell you if its not one of the characters inside the brackets. However, I suggest that you look up regular expressions, you might find a better solution once you know their syntax and usage.
If you truly have an array (as you state) and it is an array of strings (I'm guessing), e.g.
foo = [ "hello", "42 cats!", "yöwza" ]
then I can imagine that you either want to update each string in the array with a new value, or that you want a modified array that only contains certain strings.
If the former (you want to 'clean' every string the array) you could do one of the following:
foo.each{ |s| s.gsub! /\p{^Alnum}/, '' } # Change every string in place…
bar = foo.map{ |s| s.gsub /\p{^Alnum}/, '' } # …or make an array of new strings
#=> [ "hello", "42cats", "yöwza" ]
If the latter (you want to select a subset of the strings where each matches your criteria of holding only alphanumerics) you could use one of these:
# Select only those strings that contain ONLY alphanumerics
bar = foo.select{ |s| s =~ /\A\p{Alnum}+\z/ }
#=> [ "hello", "yöwza" ]
# Shorthand method for the same thing
bar = foo.grep /\A\p{Alnum}+\z/
#=> [ "hello", "yöwza" ]
In Ruby, regular expressions of the form /\A………\z/ require the entire string to match, as \A anchors the regular expression to the start of the string and \z anchors to the end.

Resources