Check for a substring at the end of string - ruby

Let's say I have two strings:
"This-Test has a "
"This has a-Test"
How do I match the "Test" at the end of string and only get the second as a result and not the first string. I am using include? but it will match all occurrences and not just the ones where the substring occurs at the end of string.

You can do this very simply using end_with?, e.g.
"Test something Test".end_with? 'Test'
Or, you can use a regex that matches the end of the string:
/Test$/ === "Test something Test"

"This-Test has a ".end_with?("Test") # => false
"This has a-Test".end_with?("Test") # => true

Oh, the possibilities are many...
Let's say we have two strings, a = "This-Test has a" and b = "This has a-Test.
Because you want to match any string that ends exactly in "Test", a good RegEx would be /Test$/ which means "capital T, followed by e, then s, then t, then the end of the line ($)".
Ruby has the =~ operator which performs a RegEx match against a string (or string-like object):
a =~ /Test$/ # => nil (because the string does not match)
b =~ /Test$/ # => 11 (as in one match, starting at character 11)
You could also use String#match:
a.match(/Test$/) # => nil (because the string does not match)
b.match(/Test$/) # => a MatchData object (indicating at least one hit)
Or you could use String#scan:
a.scan(/Test$/) # => [] (because there are no matches)
b.scan(/Test$/) # => ['Test'] (which is the matching part of the string)
Or you could just use ===:
/Test$/ === a # => false (because there are no matches)
/Test$/ === b # => true (because there was a match)
Or you can use String#end_with?:
a.end_with?('Test') # => false
b.end_with?('Test') # => true
...or one of several other methods. Take your pick.

You can use the regex /Test$/ to test:
"This-Test has a " =~ /Test$/
#=> nil
"This has a-Test" =~ /Test$/
#=> 11

You can use a range:
"Your string"[-4..-1] == "Test"
You can use a regex:
"Your string " =~ /Test$/

String's [] makes it nice and easy and clean:
"This-Test has a "[/Test$/] # => nil
"This has a-Test"[/Test$/] # => "Test"
If you need case-insensitive:
"This-Test has a "[/test$/i] # => nil
"This has a-Test"[/test$/i] # => "Test"
If you want true/false:
str = "This-Test has a "
!!str[/Test$/] # => false
str = "This has a-Test"
!!str[/Test$/] # => true

Related

Replace characters from string without changing its object_id in Ruby

How can I replace characters from string without changing its object_id?
For example:
string = "this is a test"
The first 7 characters need to be replaced with capitalized characters like: "THIS IS a Test" and the object_id needs to be the same. In which way can I sub or replace the characters to make it happen?
You can do it like this:
string = "this is a test"
string[0, 7] = string[0, 7].upcase
With procedural languages, one might write the equivalent of:
string = "this is in jest"
string.object_id
#=> 70309969974760
(1..7).each { |i| string[i] = string[i].upcase }
#=> 1..7
string
#=> "tHIS IS in jest"
string.object_id
#=> 70309969974760
This is not very Ruby-like, but it does offer the advantage over #sawa's solution that it does not create a temporary 7-character string. (Well, it does create a one-character string.) This is unimportant for strings of reasonable length (and for those I'd certainly concur with sawa), but it could be significant for really, really, really long strings.
Another way to do this is as follows:
string.each_char.with_index { |c,i|
string[i] = string[i].upcase if (1..7).cover?(i) }
#=> "tHIS IS in jest"
string.object_id
#=> 70309969974760
This second way might be more efficient if string is not much larger than string[start_index..end_index].
Edit:
In a comment the OP indicates that the string is to be stripped, squeeze and reversed as well as certain characters converted to upper case. That could be done on the string in place, without creating a copy, as follows:
def strip_upcase_squeeze_reverse_whew(string, upcase_range, squeeze_str=nil)
string.strip!
upcase_range.each { |i| string[i] = string[i].upcase }
squeeze_str.nil? ? string.squeeze! : string.squeeze!(squeeze_str)
string.reverse!
end
I have assumed the four operations would be performed in a particular order, but if the order should be different, that's an easy fix.
string = " this may bee inn jest, butt it's alsoo a test "
string.object_id
#=> 70309970103280
strip_upcase_squeeze_reverse_whew(string, (1..7))
#=> "tset a osla s'ti tub ,tsej ni eb YAM SIHt"
string.object_id
#=> 70309970103280
The steps:
string = "this may bee inn jest, butt it's alsoo a test"
#=> "this may bee inn jest, butt it's alsoo a test"
upcase_range = (1..7)
#=> 1..7
string.strip!
#=> nil
string
#=> "this may bee inn jest, butt it's alsoo a test"
upcase_range.each { |i| string[i] = string[i].upcase }
#=> 1..7
string
#=> "tHIS MAY bee inn jest, butt it's alsoo a test"
squeeze_str.nil? ? string.squeeze! : string.squeeze!(squeeze_str)
#=> "tHIS MAY be in jest, but it's also a test"
string
#=> "tHIS MAY be in jest, but it's also a test"
string.reverse!
#=> "tset a osla s'ti tub ,tsej ni eb YAM SIHt"
Notice that in this example, strip! does not remove any characters, and therefore returns nil. Similarly, squeeze! would return nil if there is nothing to squeeze. It is for that reason that strip! and squeeze cannot be chained.
A second example:
string = " thiiiis may beeee in jeeest"
strip_upcase_squeeze_reverse_whew(string, (12..14), "aeiouAEIOU")
Adding onto a string without changing its object id:
foo = "foo"
# => "foo"
foo.object_id
# => 70196045363960
foo << "bar"
# => "foobar"
foo.object_id
# => 70196045363960
Replace an entire string without changing its object id
foo
# => "foo"
foo.object_id
# => 70196045363960
foo.gsub!(/./, '') << 'bar'
# => 'bar'
foo.object_id
# => 70196045363960
Replace part of a string without changing its object id
foo
# => "foo"
foo.object_id
# => 70196045363960
foo.gsub!(/o/, 'z')
# => 'fzz'
foo.object_id
# => 70196045363960

What's the difference between match method and the =~ operator?

Two expressions:
puts "String has vowels" if "This is a test".match(/[aeiou]/)
and
puts "String has vowels" if "This is a test" =~ /[aeiou]/
seem identical. Are they not? I did some testing below:
"This is a test" =~ /[aeiou]/
# => 2
"This is a test".match(/[aeiou]/)
# => MatchData "i"
So it seems like =~ gives you the position of the first match and match method gives you the first character that matches. Is this correct? They both return true and so what's the difference here?
They just differ on what they return if there is a match. If there is no match, both return nil.
~= returns the numerical index of the character in the string where the match started
.match returns an instance of the class MatchData
You're correct.
Expanding on Nobita's answer, match is less efficient if you want to just check to see if a string matches a regexp (like in your case). In that case, you should use =~. See the answer to "Fastest way to check if a string matches or not a regexp in ruby?", which contains these benchmarks:
require 'benchmark'
"test123" =~ /1/
=> 4
Benchmark.measure{ 1000000.times { "test123" =~ /1/ } }
=> 0.610000 0.000000 0.610000 ( 0.578133)
...
irb(main):019:0> "test123".match(/1/)
=> #<MatchData "1">
Benchmark.measure{ 1000000.times { "test123".match(/1/) } }
=> 1.703000 0.000000 1.703000 ( 1.578146)
So, in this case, =~ is a little less than three times faster than match

Regexp to match repeated substring

I would like to verify a string containing repeated substrings. The substrings have a particular structure. Whole string has a particular structure (substring split by "|"). For instance, the string can be:
1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**
How can I check that all repeated substrings match a regexp? I tried to check it with:
"1=23.00|6=22.12|12=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)
But checking gives true even when several substrings do not match the regexp:
"1=23.00|6=ass|=21.34".match(/([1-9][0-9]*[=][0-9\.]+)+/)
# => #<MatchData "1=23.00" 1:"1=23.00">
The question is whether every repeated substring matches a regex. I understand that the substrings are separated by the character | or $/, the latter being the end of a line. We first need to obtain the repeated substrings:
a = str.split(/[#{$/}\|]/)
.map(&:strip)
.group_by {|s| s}
.select {|_,v| v.size > 1 }
.keys
Next we specify whatever regex you wish to use. I am assuming it is this:
REGEX = /[1-9][0-9]*=[1-9]+\.[0-9]+/
but it could be altered if you have other requirements.
As we wish to determine if all repeated substrings match the regex, that is simply:
a.all? {|s| s =~ REGEX}
Here are the calculations:
str =<<_
1=23.00|6=22.12|12=21.34|112=20.34
1=23.00|6=22.12|12=21.34
1=23.00|12=21.34
1=23.00**
_
c = str.split(/[#{$/}\|]/)
#=> ["1=23.00", "6=22.12", "12=21.34", "112=20.34", "1=23.00",
# "6=22.12", "12=21.34", "1=23.00", "12=21.34", "1=23.00**"]
d = c.map(&:strip)
# same as c, possibly not needed or not wanted
e = d.group_by {|s| s}
# => {"1=23.00" =>["1=23.00", "1=23.00", "1=23.00"],
# "6=22.12" =>["6=22.12", "6=22.12"],
# "12=21.34" =>["12=21.34", "12=21.34", "12=21.34"],
# "112=20.34"=>["112=20.34"], "1=23.00**"=>["1=23.00**"]}
f = e.select {|_,v| v.size > 1 }
#=> {"1=23.00"=>["1=23.00", "1=23.00" , "1=23.00"],
# "6=22.12"=>["6=22.12", "6=22.12"],
# "12=21.34"=>["12=21.34", "12=21.34", "12=21.34"]}
a = f.keys
#=> ["1=23.00", "6=22.12", "12=21.34"]
a.all? {|s| s =~ REGEX}
#=> true
This will return true if there are any duplicates, false if there are not:
s = "1=23.00|6=22.12|12=21.34|112=20.34|3=23.00"
arr = s.split(/\|/).map { |s| s.gsub(/\d=/, "") }
arr != arr.uniq # => true
If you want to resolve it through regexp (not ruby), you should match whole string, not substrings. Well, I added [|] symbol and line ending to your regexp and it should works like you want.
([1-9][0-9]*[=][0-9\.]+[|]*)+$
Try it out.

Outputs string if string contains vowels and is in alphabetical order

I'm working on the following exercise below:
Write a method, ovd(str) that takes a string of lowercase words and returns a string with just the words containing all their vowels (excluding "y") in alphabetical order. Vowels may be repeated ("afoot" is an ordered vowel word). The method does not return the word if it is not in alphabetical order.
Example output is:
ovd("this is a test of the vowel ordering system") #output=> "this is a test of the system"
ovd("complicated") #output=> ""
Below is code I wrote that will do the job but I am looking to see if there is a shorter more clever way to do this. My solution seems too lengthy.Thanks in advance for helping.
def ovd?(str)
u=[]
k=str.split("")
v=["a","e","i","o","u"]
w=k.each_index.select{|i| v.include? k[i]}
r={}
for i in 0..v.length-1
r[v[i]]=i+1
end
w.each do |s|
u<<r[k[s]]
end
if u.sort==u
true
else
false
end
end
def ovd(phrase)
l=[]
b=phrase.split(" ")
b.each do |d|
if ovd?(d)==true
l<<d
end
end
p l.join(" ")
end
def ovd(str)
str.split.select { |word| "aeiou".match(/#{word.gsub(/[^aeiou]/, "").chars.uniq.join(".*")}/) }.join(" ")
end
ovd("this is a test of the vowel ordering system") # => "this is a test of the system"
ovd("complicated") # => ""
ovd("alooft") # => "alooft"
ovd("this is testing words having something in them") # => "this is testing words having in them"
EDIT
As requested by the OP, explanation
String#gsub word.gsub(/[^aeiou]/, "") removes the non-vowel characters e.g
"afloot".gsub(/[^aeiou]/, "") # => "aoo"
String#chars converts the new word to an array of characters
"afloot".gsub(/[^aeiou]/, "").chars # => ["a", "o", "o"]
Array#uniq converts returns only unique elements from the array e.g
"afloot".gsub(/[^aeiou]/, "").chars.uniq # => ["a", "o"]
Array#join converts an array to a string merging it with the supplied parameter e.g
"afloot".gsub(/[^aeiou]/, "").chars.uniq.join(".*") # => "a.*o"
#{} is simply String interpolation and // converts the interpolated string into a Regular Expression
/#{"afloot".gsub(/[^aeiou]/, "").chars.uniq.join(".*")}/ # => /a.*o/
A non-regex solution:
V = %w[a e i o u] # => ["a", "e", "i", "o", "u"]
def ovd(str)
str.split.select{|w| (x = w.downcase.chars.select \
{|c| V.include?(c)}) == x.sort}.join(' ')
end
ovd("this is a test of the vowel ordering system")
# => "this is a test of the system"
ovd("") # => ""
ovd("camper") # => "camper"
ovd("Try singleton") # => "Try"
ovd("I like leisure time") # => "I"
ovd("The one and only answer is '42'") # => "The and only answer is '42'"
ovd("Oil and water don't mix") # => "and water don't mix"
Edit to add an alternative:
NV = (0..127).map(&:chr) - %w(a e i o u) # All ASCII chars except lc vowels
def ovd(str)
str.split.select{|w| (x = w.downcase.chars - NV) == x.sort}.join(' ')
end
Note x = w.downcase.chars & V does not work. While it spears out the vowels from w, and preserves their order, it removes duplicates.

How do you strip substrings in ruby?

I'd like to replace/duplicate a substring, between two delimeters -- e.g.,:
"This is (the string) I want to replace"
I'd like to strip out everything between the characters ( and ), and set that substr to a variable -- is there a built in function to do this?
I would just do:
my_string = "This is (the string) I want to replace"
p my_string.split(/[()]/) #=> ["This is ", "the string", " I want to replace"]
p my_string.split(/[()]/)[1] #=> "the string"
Here are two more ways to do it:
/\((?<inside_parenthesis>.*?)\)/ =~ my_string
p inside_parenthesis #=> "the string"
my_new_var = my_string[/\((.*?)\)/,1]
p my_new_var #=> "the string"
Edit - Examples to explain the last method:
my_string = 'hello there'
capture = /h(e)(ll)o/
p my_string[capture] #=> "hello"
p my_string[capture, 1] #=> "e"
p my_string[capture, 2] #=> "ll"
var = "This is (the string) I want to replace"[/(?<=\()[^)]*(?=\))/]
var # => "the string"
str = "This is (the string) I want to replace"
str.match(/\((.*)\)/)
some_var = $1 # => "the string"
As I understand, you want to remove or replace a substring as well as set a variable equal to that substring (sans the parentheses). There are many ways to do this, some of which are slight variants of the other answers. Here's another way that also allows for the possibility of multiple substrings within parentheses, picking up from #sawa's comments:
def doit(str, repl)
vars = []
str.gsub(/\(.*?\)/) {|m| vars << m[1..-2]; repl}, vars
end
new_str, vars = doit("This is (the string) I want to replace", '')
new_str # => => "This is I want to replace"
vars # => ["the string"]
new_str, vars = doit("This is (the string) I (really) want (to replace)", '')
new_str # => "This is I want"
vars # => ["the string", "really, "to replace"]
new_str, vars = doit("This (short) string is a () keeper", "hot dang")
new_str # => "This hot dang string is a hot dang keeper"
vars # => ["short", ""]
In the regex, the ? in .*? makes .* "lazy". gsub passes each match m to the block; the block strips the parens and adds it to vars, then returns the replacement string. This regex also works:
/\([^\(]*\)/

Resources