How to replace the last occurrence of a substring in ruby? - ruby

I want to replace the last occurrence of a substring in Ruby. What's the easiest way?
For example, in abc123abc123, I want to replace the last abc with ABC. How do I do that?

How about
new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
For instance:
irb(main):001:0> old_str = "abc123abc123"
=> "abc123abc123"
irb(main):002:0> pattern="abc"
=> "abc"
irb(main):003:0> replacement="ABC"
=> "ABC"
irb(main):004:0> new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
=> "abc123ABC123"

"abc123abc123".gsub(/(.*(abc.*)*)(abc)(.*)/, '\1ABC\4')
#=> "abc123ABC123"
But probably there is a better way...
Edit:
...which Chris kindly provided in the comment below.
So, as * is a greedy operator, the following is enough:
"abc123abc123".gsub(/(.*)(abc)(.*)/, '\1ABC\3')
#=> "abc123ABC123"
Edit2:
There is also a solution which neatly illustrates parallel array assignment in Ruby:
*a, b = "abc123abc123".split('abc', -1)
a.join('abc')+'ABC'+b
#=> "abc123ABC123"

Since Ruby 2.0 we can use \K which removes any text matched before it from the returned match. Combine with a greedy operator and you get this:
'abc123abc123'.sub(/.*\Kabc/, 'ABC')
#=> "abc123ABC123"
This is about 1.4 times faster than using capturing groups as Hirurg103 suggested, but that speed comes at the cost of lowering readability by using a lesser-known pattern.
more info on \K: https://www.regular-expressions.info/keep.html
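A minimal sketch that wraps the \K approach in a helper. The name replace_last is hypothetical, and Regexp.escape is used on the assumption that the search term is a literal string rather than a pattern. The /m flag lets .* cross newlines, and the block form keeps backslashes in the replacement from being read as backreferences.

```ruby
# Hypothetical helper: replace the last literal occurrence of `search`.
def replace_last(str, search, replacement)
  # .* is greedy, so it runs up to the final occurrence; \K drops it from the match.
  # /m lets . match newlines; the block form inserts `replacement` verbatim.
  str.sub(/.*\K#{Regexp.escape(search)}/m) { replacement }
end

replace_last("abc123abc123", "abc", "ABC") #=> "abc123ABC123"
replace_last("a.b.c", ".", "-")            #=> "a.b-c"
replace_last("no match", "abc", "ABC")     #=> "no match"
```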

Here's another possible solution:
>> s = "abc123abc123"
=> "abc123abc123"
>> s[s.rindex('abc')...(s.rindex('abc') + 'abc'.length)] = "ABC"
=> "ABC"
>> s
=> "abc123ABC123"

When searching huge streams of data, using reverse will definitely lead to performance issues. I use string.rpartition*:
sub_or_pattern = "!"
replacement = "?"
string = "hello!hello!hello"
head, match, tail = string.rpartition sub_or_pattern
# match is empty when nothing was found, so the string is left untouched
p(match.empty? ? string : "#{head}#{replacement}#{tail}")
# "hello!hello?hello"
The same code also works on a string with no occurrences of sub_or_pattern:
string = "hello_hello_hello"
# ...
# "hello_hello_hello"
*rpartition uses rb_str_subseq() internally. I didn't check whether that function returns a copy of the string, but I think it preserves the chunk of memory used by that part of the string. reverse uses rb_enc_cr_str_copy_for_substr(), which suggests that copies are made every time. Maybe a smarter String class will be implemented in the future (with a reversed flag that makes all of its functions operate backwards when set), but as of now it is inefficient.
Moreover, regex patterns can't simply be reversed. The question only asks about replacing the last occurrence of a substring, so that's OK, but readers who need something more robust won't benefit from the most voted answer (as of this writing).

You can achieve this with String#sub and greedy regexp .* like this:
'abc123abc123'.sub(/(.*)abc/, '\1ABC')

simple and efficient:
s = "abc123abc123abc"
p = "123"
s.slice!(s.rindex(p), p.size)
s == "abc123abcabc"

string = "abc123abc123"
pattern = /abc/
replacement = "ABC"
matches = string.scan(pattern).length
index = 0
string.gsub(pattern) do |match|
  index += 1
  index == matches ? replacement : match
end
#=> "abc123ABC123"

I've used this handy helper method quite a bit:
def gsub_last(str, source, target)
  return str unless str.include?(source)
  top, middle, bottom = str.rpartition(source)
  "#{top}#{target}#{bottom}"
end
If you want to make it more Rails-y, extend it on the String class itself:
class String
  def gsub_last(source, target)
    return self unless self.include?(source)
    top, middle, bottom = self.rpartition(source)
    "#{top}#{target}#{bottom}"
  end
end
Then you can just call it directly on any String instance, eg "fooBAR123BAR".gsub_last("BAR", "FOO") == "fooBAR123FOO"

.gsub /abc(?=[^abc]*$)/, 'ABC'
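Matches an "abc" and then asserts ((?=) is a positive lookahead) that none of the characters a, b, or c appear between it and the end of the string. Note that [^abc] is a character class, so this finds the last "abc" only when no stray a, b, or c follows it.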
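A hedged alternative to the character-class lookahead is a negative lookahead that rejects any "abc" followed by another "abc" later in the string. The sample strings below are made up for illustration.

```ruby
# (?!.*abc) succeeds only when no further "abc" follows; /m lets . cross newlines.
last_abc = /abc(?!.*abc)/m

"abc123abc123".sub(last_abc, "ABC") #=> "abc123ABC123"
"abc123abcab".sub(last_abc, "ABC")  #=> "abc123ABCab"

# The character-class version finds nothing here, because "ab" follows the last "abc":
"abc123abcab".sub(/abc(?=[^abc]*$)/, "ABC") #=> "abc123abcab"
```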

Related

Ruby regex checks string for variations of pattern of same length

I was wondering how you construct the regular expression to check if the string has a variation of a pattern with the same length. Say the string is "door boor robo omanyte" how do I return the words that have the variation of [door]?
You can easily get all the possible words using Array#permutation. Then you can scan for them in provided string. Here:
possible_words = %w[d o o r].permutation.map &:join
# => ["door", "doro", "door", "doro", "droo", "droo", "odor", "odro", "oodr", "oord", "ordo", "orod", "odor", "odro", "oodr", "oord", "ordo", "orod", "rdoo", "rdoo", "rodo", "rood", "rodo", "rood"]
string = "door boor robo omanyte"
string.scan(Regexp.union(possible_words))
# => ["door"]
string = "door close rood example ordo"
string.scan(Regexp.union(possible_words))
# => ["door", "rood", "ordo"]
UPDATE
You can improve scan further by looking for word boundary. Here:
string = "doorrood example ordo"
string.scan(/\b(?:#{possible_words.join('|')})\b/)
# => ["ordo"]
NOTE
As Cary correctly pointed out in the comments below, this process is quite inefficient if you intend to find the permutations of a fairly long string. However, it should work fine for the OP's example.
If the comment I left on your question correctly interprets the question, you could do this:
str = "door sit its odor to"
str.split
   .group_by { |w| w.chars.sort.join }
   .values
   .select { |a| a.size > 1 }
#=> [["door", "odor"], ["sit", "its"]]
This assumes all the letters are the same case.
If case is not important, just make a small change:
str = "dooR sIt itS Odor to"
str.split
   .group_by { |w| w.downcase.chars.sort.join }
   .values
   .select { |a| a.size > 1 }
#=> [["dooR", "Odor"], ["sIt", "itS"]]
In my opinion the fastest way to find this will be
word_a.chars.sort == word_b.chars.sort
since we are using the same characters inside the words
IMO, some kind of iteration is definitely necessary to build a regular expression to match this one. Not using a regular expression is better too.
def variations_of_substr(str, sub)
  # Creates a regexp that matches words of the same length as sub,
  # built only from the characters of sub.
  patt = "\\b" + ( [ "[#{sub}]{1}" ] * sub.size ).join + "\\b"
  # The above alone won't be enough; the characters in both words
  # should match exactly.
  str.scan( Regexp.new(patt) ).select do |m|
    m.chars.sort == sub.chars.sort
  end
end
variations_of_substr("door boor robo omanyte", "door")
# => ["door"]
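The sorted-characters comparison mentioned above can be wrapped into a small helper that avoids regular expressions entirely. This is only a sketch: the method name anagrams_of is hypothetical, and words are assumed to be whitespace-separated.

```ruby
# Hypothetical helper: return the words in `text` that use exactly the same
# letters as `word` (ignoring case), in their original order.
def anagrams_of(text, word)
  signature = word.downcase.chars.sort
  text.split.select { |w| w.downcase.chars.sort == signature }
end

anagrams_of("door boor robo omanyte", "door")       #=> ["door"]
anagrams_of("door close rood example Ordo", "door") #=> ["door", "rood", "Ordo"]
```

Sorting each word once keeps the work proportional to the input, rather than to the number of permutations.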

Splitting string based on word

I have a string composed of words separated by '#', for instance 'this#is#an#example'. I need to extract the last word or the last two words, depending on the second-to-last word.
If the second to last is 'myword' I need the last two words otherwise just the last one.
'this#is#an#example' => 'example'
'this#is#an#example#using#myword#also' => 'myword#also'
Is there a better way than splitting and checking the second to last? perhaps using regular expression?
Thanks.
You can use the end-of-line anchor $ and make the myword# prefix optional:
str = 'this#is#an#example'
str[/(?:#)((myword#)?[^#]+)$/, 1]
#=> "example"
str = 'this#is#an#example#using#myword#also'
str[/(?:#)((myword#)?[^#]+)$/, 1]
#=> "myword#also"
However, I don't think using a regular expression is "better" in this case. I would use something like Santosh's (deleted) answer: split the line by # and use an if clause.
def foo(str)
  *, a, b = str.split('#')
  if a == 'myword'
    "#{a}##{b}"
  else
    b
  end
end
str = 'this#is#an#example#using#myword#also'
array = str.split('#')
array[-2] == 'myword' ? array[-2..-1].join('#') : array[-1]
With regex:
'this#is#an#example'[/(myword\#)*\w+$/]
# => "example"
'this#is#an#example#using#myword#also'[/(myword\#)*\w+$/]
# => "myword#also"

How do I remove a substring after a certain character in a string using Ruby?

How do I remove a substring after a certain character in a string using Ruby?
new_str = str.slice(0..(str.index('blah')))
I find that "Part1?Part2".split('?')[0] is easier to read.
I'm surprised nobody suggested to use 'gsub'
irb> "truncate".gsub(/a.*/, 'a')
=> "trunca"
The bang version of gsub can be used to modify the string.
str = "Hello World"
stopchar = 'W'
str.sub /#{stopchar}.+/, stopchar
#=> "Hello W"
A special case is if you have multiple occurrences of the same character and you want to delete from the last occurrence to the end (not the first one).
Following what Jacob suggested, you just have to use rindex instead of index as rindex gets the index of the character in the string but starting from the end.
Something like this:
str = '/path/to/some_file'
puts str.slice(0, str.index('/')) # => ""
puts str.slice(0, str.rindex('/')) # => "/path/to"
We can also use partition and rpartition, depending on whether we want to use the first or last instance of the specified character:
string = "abc-123-xyz"
last_char = "-"
string.partition(last_char)[0..1].join #=> "abc-"
string.rpartition(last_char)[0..1].join #=> "abc-123-"
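As a rough sketch, the two variants can be folded into one helper. The name keep_through and the last: keyword are made up for this example. Returning the string unchanged when the delimiter is missing is an assumption about the desired behaviour; note that the one-liners above differ in that case, since partition leaves the whole string in the first slot while rpartition leaves it in the last.

```ruby
# Hypothetical helper: keep everything up to and including the first
# (or, with last: true, the last) occurrence of `delimiter`.
def keep_through(str, delimiter, last: false)
  index = last ? str.rindex(delimiter) : str.index(delimiter)
  return str.dup if index.nil? # assumption: no delimiter means nothing to cut

  str[0, index + delimiter.length]
end

keep_through("abc-123-xyz", "-")             #=> "abc-"
keep_through("abc-123-xyz", "-", last: true) #=> "abc-123-"
keep_through("abcxyz", "-")                  #=> "abcxyz"

# For comparison, the one-liners when nothing matches:
"abcxyz".partition("-")[0..1].join  #=> "abcxyz"
"abcxyz".rpartition("-")[0..1].join #=> ""
```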

Ruby, remove last N characters from a string?

What is the preferred way of removing the last n characters from a string?
irb> 'now is the time'[0...-4]
=> "now is the "
If the characters you want to remove are always the same characters, then consider chomp:
'abc123'.chomp('123') # => "abc"
The advantages of chomp are: no counting, and the code more clearly communicates what it is doing.
With no arguments, chomp removes the DOS or Unix line ending, if either is present:
"abc\n".chomp # => "abc"
"abc\r\n".chomp # => "abc"
From the comments, there was a question of the speed of using #chomp versus using a range. Here is a benchmark comparing the two:
require 'benchmark'
S = 'asdfghjkl'
SL = S.length
T = 10_000
A = 1_000.times.map { |n| "#{n}#{S}" }
GC.disable
Benchmark.bmbm do |x|
  x.report('chomp') { T.times { A.each { |s| s.chomp(S) } } }
  x.report('range') { T.times { A.each { |s| s[0...-SL] } } }
end
Benchmark Results (using CRuby 2.1.3p242):
Rehearsal -----------------------------------------
chomp 1.540000 0.040000 1.580000 ( 1.587908)
range 1.810000 0.200000 2.010000 ( 2.011846)
-------------------------------- total: 3.590000sec
user system total real
chomp 1.550000 0.070000 1.620000 ( 1.610362)
range 1.970000 0.170000 2.140000 ( 2.146682)
So chomp is faster than using a range, by ~22%.
Ruby 2.5+
As of Ruby 2.5 you can use delete_suffix or delete_suffix! to achieve this in a fast and readable manner.
The docs on the methods are here.
If you know what the suffix is, this is idiomatic (and I'd argue, even more readable than other answers here):
'abc123'.delete_suffix('123') # => "abc"
'abc123'.delete_suffix!('123') # => "abc"
It's even significantly faster (almost 40% with the bang method) than the top answer. Here's the result of the same benchmark:
user system total real
chomp 0.949823 0.001025 0.950848 ( 0.951941)
range 1.874237 0.001472 1.875709 ( 1.876820)
delete_suffix 0.721699 0.000945 0.722644 ( 0.723410)
delete_suffix! 0.650042 0.000714 0.650756 ( 0.651332)
I hope this is useful - note the method doesn't currently accept a regex so if you don't know the suffix it's not viable for the time being. However, as the accepted answer (update: at the time of writing) dictates the same, I thought this might be useful to some people.
str = str[0..-1-n]
Unlike the [0...-n], this handles the case of n=0.
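The difference shows up when n is zero: -0 is just 0, so the exclusive range 0...-n becomes 0...0 and yields an empty string. A quick illustration, using a made-up sample string:

```ruby
str = "abcdef"

n = 2
str[0...-n]   #=> "abcd"
str[0..-1-n]  #=> "abcd"

n = 0
str[0...-n]   #=> "" (the range is 0...0)
str[0..-1-n]  #=> "abcdef"
```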
I would suggest chop. I think it has been mentioned in one of the comments but without links or explanations so here's why I think it's better:
It simply removes the last character from a string and you don't have to specify any values for that to happen.
If you need to remove more than one character then chomp is your best bet. This is what the ruby docs have to say about chop:
Returns a new String with the last character removed. If the string
ends with \r\n, both characters are removed. Applying chop to an empty
string returns an empty string. String#chomp is often a safer
alternative, as it leaves the string unchanged if it doesn’t end in a
record separator.
Although this is used mostly to remove separators such as \r\n I've used it to remove the last character from a simple string, for example the s to make the word singular.
name = "my text"
x.times do name.chop! end
Here in the console:
>name = "Nabucodonosor"
=> "Nabucodonosor"
> 7.times do name.chop! end
=> 7
> name
=> "Nabuco"
Dropping the last n characters is the same as keeping the first length - n characters.
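In plain Ruby, that idea can be sketched with String#[] taking a start and a length. The helper name drop_last is hypothetical; the sketch assumes n is not negative and clamps n to the string length.

```ruby
# Hypothetical helper: keep the first (length - n) characters.
def drop_last(str, n)
  keep = [str.length - n, 0].max # clamp so an n larger than the string yields ""
  str[0, keep]
end

drop_last("foobarbaz", 3) #=> "foobar"
drop_last("foobarbaz", 0) #=> "foobarbaz"
drop_last("foo", 10)      #=> ""
```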
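Active Support includes String#first and String#last methods which provide a convenient way to keep or drop the first/last n characters. Note that negative limits, as in the second and fourth examples, were deprecated in Rails 6.0 and raise ArgumentError as of Rails 6.1, so they only work on older versions: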
require 'active_support/core_ext/string/access'
"foobarbaz".first(3) # => "foo"
"foobarbaz".first(-3) # => "foobar"
"foobarbaz".last(3) # => "baz"
"foobarbaz".last(-3) # => "barbaz"
If you are using Rails, try:
"my_string".last(2) # => "ng"
[EDITED]
To get the string WITHOUT the last 2 chars:
n = "my_string".size
"my_string"[0..n-3] # => "my_stri"
Note: the last string char is at n-1. So, to remove the last 2, we use n-3.
Check out the slice() method:
http://ruby-doc.org/core-2.5.0/String.html#method-i-slice
You can always use something like
"string".sub!(/.{X}$/,'')
Where X is the number of characters to remove.
Or with assigning/using the result:
myvar = "string"[0..-X]
where X is the number of characters plus one to remove.
If you're OK with adding methods to the String class and want the characters you chop off, try this:
class String
  def chop_multiple(amount)
    amount.times.inject([self, '']){ |(s, r)| [s.chop, r.prepend(s[-1])] }
  end
end
hello, world = "hello world".chop_multiple 5
hello #=> 'hello '
world #=> 'world'
Using regex:
str = 'string'
n = 2 #to remove last n characters
str[/\A.{#{str.size-n}}/] #=> "stri"
x = "my_test"
last_char = x.split('').last

Ruby: String no longer mixes in Enumerable in 1.9

So how can I still be able to write beautiful code such as:
'im a string meing!'.pop
Note: str.chop isn't sufficient answer
It is not clear what an enumerable string actually enumerates. Is a string a sequence of ...
lines,
characters,
codepoints or
bytes?
The answer is: all of those, any of those, either of those or neither of those, depending on the context. Therefore, you have to tell Ruby which of those you actually want.
There are several methods in the String class which return enumerators for any of the above. If you want the pre-1.9 behavior, your code sample would be
'im a string meing!'.bytes.to_a.pop
This looks kind of ugly, but there is a reason for it: a string is a sequence. You are treating it as a stack. A stack is not a sequence, in fact it pretty much is the opposite of a sequence.
That's not beautiful :)
Also #pop is not part of Enumerable, it's part of Array.
String is not enumerable because there is no 'natural' unit to enumerate: should it be on a character basis or a line basis? Because of this, String does not have an #each method.
String instead provides the #each_char and #each_byte and #each_line methods for iteration in the way that you choose.
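To make the distinction concrete, here is how one string enumerates under each view. The sample string is arbitrary, and the byte count assumes UTF-8 encoding.

```ruby
s = "n\u00e9\nok" # "né", a newline, then "ok"

s.each_char.to_a #=> ["n", "é", "\n", "o", "k"]
s.each_line.to_a #=> ["né\n", "ok"]
s.codepoints     #=> [110, 233, 10, 111, 107]
s.bytes.length   #=> 6 (é takes two bytes in UTF-8)
```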
Since you don't like str[str.length - 1], how about
'im a string meing!'[-1] # returns last character as a character value
or
'im a string meing!'[-1,1] # returns last character as a string
or, if you need it modified in place as well, while keeping it an easy one-liner:
class String
  def pop
    last = self[-1,1]
    self.chop!
    last
  end
end
#!/usr/bin/ruby1.8
s = "I'm a string meing!"
s, last_char = s.rpartition(/./)
p [s, last_char] # => ["I'm a string meing", "!"]
String#rpartition is new in 1.9, but it has been back-ported to 1.8.7. It searches a string for a regular expression, starting at the end and working backwards. It returns the part of the string before the match, the match, and the part of the string after the match (which we discard here).
String#slice! and String#insert are going to get you much closer to what you want without converting your strings to arrays.
For example, to simulate Array#pop you can do:
text = '¡Exclamation!'
mark = text.slice! -1
mark == '!' #=> true
text #=> "¡Exclamation"
Likewise, for Array#shift:
text = "¡Exclamation!"
inverted_mark = text.slice! 0
inverted_mark == '¡' #=> true
text #=> "Exclamation!"
Naturally, to do an Array#push you just use one of the concatenation methods:
text = 'Hello'
text << '!' #=> "Hello!"
text.concat '!' #=> "Hello!!"
To simulate Array#unshift you use String#insert instead, it's a lot like the inverse of slice really:
text = 'World!'
text.insert 0, 'Hello, ' #=> "Hello, World!"
You can also grab chunks from the middle of a string in multiple ways with slice.
First you can pass a start position and length:
text = 'Something!'
thing = text.slice 4, 5
And you can also pass a Range object to grab absolute positions:
text = 'This is only a test.'
only = text.slice (8..11)
In Ruby 1.9 using String#slice like this is identical to String#[], but if you use the bang method String#slice! it will actually remove the substring you specify.
text = 'This is only a test.'
only = text.slice! (8..12)
text == 'This is a test.' #=> true
Here's a slightly more complex example where we reimplement a simple version of String#gsub! to do a search and replace:
text = 'This is only a test.'
search = 'only'
replace = 'not'
index = text =~ /#{search}/
text.slice! index, search.length
text.insert index, replace
text == 'This is not a test.' #=> true
Of course 99.999% of the time, you're going to want to use the aforementioned String#gsub!, which gives the same result here:
text = 'This is only a test.'
text.gsub! 'only', 'not'
text == 'This is not a test.' #=> true
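One caveat worth illustrating: the slice!/insert version replaces only the first match, which corresponds to sub!, whereas gsub! replaces every match. A short comparison with a made-up string (dup is used only so the example also runs where string literals are frozen):

```ruby
first_only = "only this, only that".dup
first_only.sub!("only", "not") # replaces the first occurrence
first_only #=> "not this, only that"

every = "only this, only that".dup
every.gsub!("only", "not")     # replaces all occurrences
every #=> "not this, not that"
```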
references:
Ruby String Documentation

Resources