How do you strip substrings in Ruby?

I'd like to replace/duplicate a substring between two delimiters, e.g.:
"This is (the string) I want to replace"
I'd like to strip out everything between the characters ( and ), and assign that substring to a variable. Is there a built-in function to do this?

I would just do:
my_string = "This is (the string) I want to replace"
p my_string.split(/[()]/) #=> ["This is ", "the string", " I want to replace"]
p my_string.split(/[()]/)[1] #=> "the string"
Here are two more ways to do it:
/\((?<inside_parenthesis>.*?)\)/ =~ my_string
p inside_parenthesis #=> "the string"
my_new_var = my_string[/\((.*?)\)/,1]
p my_new_var #=> "the string"
Edit - Examples to explain the last method:
my_string = 'hello there'
capture = /h(e)(ll)o/
p my_string[capture] #=> "hello"
p my_string[capture, 1] #=> "e"
p my_string[capture, 2] #=> "ll"
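The capture argument of String#[] can also be the name of a named group, and the MatchData returned by String#match gives access to the text around the match. A minimal sketch, where the group name inside is chosen here purely for illustration:

```ruby
my_string = "This is (the string) I want to replace"

# String#[] accepts a capture-group name as its second argument.
inside = my_string[/\((?<inside>.*?)\)/, "inside"]

# MatchData exposes the capture plus the text on either side of the match.
m = my_string.match(/\((?<inside>.*?)\)/)
before = m.pre_match
after  = m.post_match

p inside      #=> "the string"
p m[:inside]  #=> "the string"
p before      #=> "This is "
p after       #=> " I want to replace"
```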

var = "This is (the string) I want to replace"[/(?<=\()[^)]*(?=\))/]
var # => "the string"

str = "This is (the string) I want to replace"
str.match(/\((.*)\)/)
some_var = $1 # => "the string"

As I understand it, you want to remove or replace a substring as well as set a variable equal to that substring (sans the parentheses). There are many ways to do this, some of which are slight variants of the other answers. Here's another way that also allows for the possibility of multiple substrings within parentheses, picking up from @sawa's comments:
def doit(str, repl)
  vars = []
  # Return both the rewritten string and the captured substrings as a pair.
  [str.gsub(/\(.*?\)/) { |m| vars << m[1..-2]; repl }, vars]
end
new_str, vars = doit("This is (the string) I want to replace", '')
new_str # => "This is  I want to replace"
vars # => ["the string"]
new_str, vars = doit("This is (the string) I (really) want (to replace)", '')
new_str # => "This is  I  want "
vars # => ["the string", "really", "to replace"]
new_str, vars = doit("This (short) string is a () keeper", "hot dang")
new_str # => "This hot dang string is a hot dang keeper"
vars # => ["short", ""]
In the regex, the ? in .*? makes .* "lazy". gsub passes each match m to the block; the block strips the parens and adds it to vars, then returns the replacement string. This regex also works:
/\([^\(]*\)/
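To make the lazy/greedy distinction concrete, here is a small sketch comparing the two on a string with several parenthesized parts. The variable names are illustrative only, and the negated class [^)]* is shown as a common alternative to the lazy quantifier, not as the pattern used in the answer above:

```ruby
str = "This is (the string) I (really) want (to replace)"

greedy  = str[/\(.*\)/]     # .* runs to the LAST closing paren
lazy    = str[/\(.*?\)/]    # .*? stops at the FIRST closing paren
negated = str[/\([^)]*\)/]  # "anything but )" gives the same first match

# scan with a capture group collects every parenthesized substring.
all_inside = str.scan(/\((.*?)\)/).flatten

p greedy      #=> "(the string) I (really) want (to replace)"
p lazy        #=> "(the string)"
p negated     #=> "(the string)"
p all_inside  #=> ["the string", "really", "to replace"]
```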

Related

Replace matched lines in a file but ignore commented-out lines using Ruby

How to replace a file in Ruby, but do not touch commented-out lines? To be more specific I want to change variable in configuration file. An example would be:
irb(main):014:0> string = "#replaceme\n\t\s\t\s# replaceme\nreplaceme\n"
=> "#replaceme\n\t \t # replaceme\nreplaceme\n"
irb(main):015:0> puts string.gsub(%r{replaceme}, 'replaced')
#replaced
# replaced
replaced
=> nil
irb(main):016:0>
Desired output:
#replaceme
# replaceme
replaced
I don't fully understand the question. To do a find and replace in each line, disregarding text following a pound sign, one could do the following.
def replace_em(str, source, replacement)
  str.split(/(\#.*?$)/).
    map { |s| s[0] == '#' ? s : s.gsub(source, replacement) }.
    join
end
str = "It was known that # that dog has fleas, \nbut who'd know that that dog # wouldn't?"
replace_em(str, "that", "the")
#=> "It was known the # that dog has fleas, \nbut who'd know the the dog # wouldn't?"
str = "#replaceme\n\t\s\t\s# replaceme\nreplaceme\n"
replace_em(str, "replaceme", "replaced")
#=> "#replaceme\n\t \t # replaceme\nreplaced\n"
For the string
str = "It was known that # that dog has fleas, \nbut who'd know that that dog # wouldn't?"
source = "that"
replacement = "the"
the steps are as follows.
a = str.split(/(\#.*?$)/)
#=> ["It was known that ", "# that dog has fleas, ",
# "\nbut who'd know that that dog ", "# wouldn't?"]
Note that the body of the regular expression must be put in a capture group in order that the text used to split the string be included as elements in the resulting array. See String#split.
b = a.map { |s| s[0] == '#' ? s : s.gsub(source, replacement) }
#=> ["It was known the ", "# that dog has fleas, ",
# "\nbut who'd know the the dog ", "# wouldn't?"]
b.join
#=> "It was known the # that dog has fleas, \nbut who'd know the the dog # wouldn't?"
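The note about the capture group in the split pattern can be checked in isolation. This is a small sketch on a made-up sample string, showing that the comment text is dropped without the group and kept with it:

```ruby
sample = "a # b\nc"

# Without a capture group the matched comment is discarded.
without_group = sample.split(/\#.*?$/)

# With a capture group the matched comment becomes its own element.
with_group = sample.split(/(\#.*?$)/)

p without_group  #=> ["a ", "\nc"]
p with_group     #=> ["a ", "# b", "\nc"]
```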
How about this?
puts string.gsub(%r{^replaceme}, 'replaced')
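The ^ anchor only helps when the text to replace sits at the very start of a line. A line-by-line variant is sketched below; it assumes a commented-out line is one whose first non-whitespace character is #, and the method name replace_uncommented is hypothetical. Unlike replace_em above, it does not protect trailing comments on otherwise active lines.

```ruby
# Replace source with replacement on every line that is not commented out.
# Assumption: a line is "commented out" if it starts with optional
# whitespace followed by '#'.
def replace_uncommented(text, source, replacement)
  text.each_line.map { |line|
    line.lstrip.start_with?("#") ? line : line.gsub(source, replacement)
  }.join
end

string = "#replaceme\n\t \t # replaceme\nreplaceme\n"
result = replace_uncommented(string, "replaceme", "replaced")
p result  #=> "#replaceme\n\t \t # replaceme\nreplaced\n"
```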

How do I escape a variable when using it in a regular expression? [duplicate]

Is it possible to create/use a regular expression pattern in Ruby that is based on the value of a variable?
For instance, we all know we can do the following with Ruby strings:
str = "my string"
str2 = "This is #{str}" # => "This is my string"
I'd like to do the same thing with regular expressions:
var = "Value"
str = "a test Value"
str.gsub( /#{var}/, 'foo' ) # => "a test foo"
Obviously that doesn't work as listed; I only put it there as an example to show what I'd like to do. I need a regexp match based on the contents of a variable.
The code you think doesn't work, does:
var = "Value"
str = "a test Value"
p str.gsub( /#{var}/, 'foo' ) # => "a test foo"
Things get more interesting if var can contain regular expression metacharacters. If it does and you want those metacharacters to do what they usually do in a regular expression, then the same gsub will work:
var = "Value|a|test"
str = "a test Value"
str.gsub( /#{var}/, 'foo' ) # => "foo foo foo"
However, if your search string contains metacharacters and you do not want them interpreted as metacharacters, then use Regexp.escape like this:
var = "*This*"
str = "*This* is a string"
p str.gsub( /#{Regexp.escape(var)}/, 'foo' )
# => "foo is a string"
Or just give gsub a string instead of a regular expression. In MRI >= 1.8.7, gsub will treat a string pattern argument as a plain string, not a regular expression:
var = "*This*"
str = "*This* is a string"
p str.gsub(var, 'foo' ) # => "foo is a string"
(It used to be that a string pattern argument to gsub was automatically converted to a regular expression. I know it was that way in 1.6. I don't recall which version introduced the change.)
As noted in other answers, you can use Regexp.new as an alternative to interpolation:
var = "*This*"
str = "*This* is a string"
p str.gsub(Regexp.new(Regexp.escape(var)), 'foo' )
# => "foo is a string"
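To see what the escaping actually does, the sketch below prints the escaped pattern and shows that building a regexp from the raw value fails. Regexp.union is included as another standard way to build a regexp from a plain string, since it escapes string arguments itself; it is not part of the answers above.

```ruby
var = "*This*"
str = "*This* is a string"

# Regexp.escape backslash-escapes the metacharacters.
escaped = Regexp.escape(var)

# Without escaping, a leading * has nothing to repeat, so the pattern is invalid.
unescaped_raises = begin
  Regexp.new(var)
  false
rescue RegexpError
  true
end

# Regexp.union quotes string arguments, so no explicit escape is needed.
union  = Regexp.union(var)
result = str.gsub(union, "foo")

p escaped           #=> "\\*This\\*"
p unescaped_raises  #=> true
p result            #=> "foo is a string"
```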
It works, but you need to use gsub! or assign the return value to another variable:
var = "Value"
str = "a test Value"
str.gsub!( /#{var}/, 'foo' ) # Or this: new_str = str.gsub( /#{var}/, 'foo' )
puts str
Yes
str.gsub Regexp.new(var), 'foo'
You can use regular expressions through variables in ruby:
var = /Value/
str = "a test Value"
str.gsub( /#{var}/, 'foo' )
str.gsub( Regexp.new("#{var}"), 'foo' )

Why does capitalizing the first letter of string elements alter an array?

The following code is intended to capitalize the first letter of each word in a string, and it works:
def capitalize_words(string)
  words = string.split(" ")
  idx = 0
  while idx < words.length
    word = words[idx]
    word[0] = word[0].upcase
    idx += 1
  end
  return words.join(" ")
end
capitalize_words("this is a sentence") # => "This Is A Sentence"
capitalize_words("mike bloomfield") # => "Mike Bloomfield"
I do not understand why it works. In the while loop, I did not set any element in the words array to anything new. I understand that it might work if I added the following line before the index iteration:
words[idx] = word
I would then be altering the elements of words. However, the code works even without that line.
Quoting the question: "yet in no place in the while loop that I am using to capitalize the first letter of each word do I actually set any of the elements in the 'words' array to anything new."
You do, actually, right here:
word = words[idx]
word[0] = word[0].upcase # This changes words[idx][0]!
The upcase method does just that: it returns an upcased copy of a given string. For example:
'example'.upcase
# => "EXAMPLE"
'example'[0].upcase
# => "E"
The method String#[]= that you are using in:
word[0] = ...
is not variable assignment. It alters the content of the receiver string at the given index, retaining the identity of the string as an object. And since word is not a copy but is the original string taken from words, in turn, you are modifying words.
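The aliasing described here can be observed directly with equal?, which compares object identity. A minimal sketch, using dup to show the contrasting case where a copy is modified instead:

```ruby
words = "mike bloomfield".split

word = words[0]
same_object = word.equal?(words[0])  # word and words[0] are one object

# String#[]= mutates that object, so the array sees the change.
word[0] = word[0].upcase

# Working on a dup leaves the array element untouched.
copy = words[1].dup
copy[0] = copy[0].upcase

p same_object  #=> true
p words        #=> ["Mike", "bloomfield"]
p copy         #=> "Bloomfield"
```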
You're doing a lot of work that you don't have to:
def capitalize_words(string)
  string.split.map { |w|
    [w[0].upcase, w[1..-1]].join # => "Foo", "Bar"
  }.join(' ')
end
capitalize_words('foo bar')
# => "Foo Bar"
Breaking it down:
'foo'[0] # => "f"
'foo'[0].upcase # => "F"
'foo'[1..-1] # => "oo"
['F', 'oo'].join # => "Foo"

Replace characters from string without changing its object_id in Ruby

How can I replace characters from string without changing its object_id?
For example:
string = "this is a test"
The first 7 characters need to be replaced with capitalized characters, like "THIS IS a test", and the object_id needs to stay the same. How can I sub or replace the characters to make that happen?
You can do it like this:
string = "this is a test"
string[0, 7] = string[0, 7].upcase
With procedural languages, one might write the equivalent of:
string = "this is in jest"
string.object_id
#=> 70309969974760
(1..7).each { |i| string[i] = string[i].upcase }
#=> 1..7
string
#=> "tHIS IS in jest"
string.object_id
#=> 70309969974760
This is not very Ruby-like, but it does offer the advantage over @sawa's solution that it does not create a temporary 7-character string. (Well, it does create a one-character string on each iteration.) This is unimportant for strings of reasonable length (and for those I'd certainly concur with sawa), but it could be significant for really, really, really long strings.
Another way to do this is as follows:
string.each_char.with_index { |c, i|
  string[i] = string[i].upcase if (1..7).cover?(i)
}
#=> "tHIS IS in jest"
string.object_id
#=> 70309969974760
This second way might be more efficient if string is not much larger than string[start_index..end_index].
Edit:
In a comment the OP indicates that the string is to be stripped, squeezed and reversed, as well as having certain characters converted to upper case. That could be done on the string in place, without creating a copy, as follows:
def strip_upcase_squeeze_reverse_whew(string, upcase_range, squeeze_str=nil)
  string.strip!
  upcase_range.each { |i| string[i] = string[i].upcase }
  squeeze_str.nil? ? string.squeeze! : string.squeeze!(squeeze_str)
  string.reverse!
end
I have assumed the four operations would be performed in a particular order, but if the order should be different, that's an easy fix.
string = " this may bee inn jest, butt it's alsoo a test "
string.object_id
#=> 70309970103280
strip_upcase_squeeze_reverse_whew(string, (1..7))
#=> "tset a osla s'ti tub ,tsej ni eb YAM SIHt"
string.object_id
#=> 70309970103280
The steps:
string = "this may bee inn jest, butt it's alsoo a test"
#=> "this may bee inn jest, butt it's alsoo a test"
upcase_range = (1..7)
#=> 1..7
squeeze_str = nil
#=> nil
string.strip!
#=> nil
string
#=> "this may bee inn jest, butt it's alsoo a test"
upcase_range.each { |i| string[i] = string[i].upcase }
#=> 1..7
string
#=> "tHIS MAY bee inn jest, butt it's alsoo a test"
squeeze_str.nil? ? string.squeeze! : string.squeeze!(squeeze_str)
#=> "tHIS MAY be in jest, but it's also a test"
string
#=> "tHIS MAY be in jest, but it's also a test"
string.reverse!
#=> "tset a osla s'ti tub ,tsej ni eb YAM SIHt"
Notice that in this example, strip! does not remove any characters, and therefore returns nil. Similarly, squeeze! would return nil if there is nothing to squeeze. It is for that reason that strip! and squeeze! cannot be chained.
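The nil return value is easy to trip over, so here is a short sketch of what happens when the bang methods are chained and how calling them as separate statements avoids the problem. The sample strings are made up for illustration:

```ruby
clean = "no padding here".dup
strip_result = clean.strip!   # nothing to strip, so nil is returned

# Chaining calls squeeze! on that nil and raises NoMethodError.
chain_failed = begin
  "no padding here".dup.strip!.squeeze!
  false
rescue NoMethodError
  true
end

# Separate statements always operate on the string itself.
messy = "  too   many   spaces  ".dup
id_before = messy.object_id
messy.strip!
messy.squeeze!(" ")

p strip_result                   #=> nil
p chain_failed                   #=> true
p messy                          #=> "too many spaces"
p messy.object_id == id_before   #=> true
```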
A second example:
string = " thiiiis may beeee in jeeest"
strip_upcase_squeeze_reverse_whew(string, (12..14), "aeiouAEIOU")
Adding onto a string without changing its object id:
foo = "foo"
# => "foo"
foo.object_id
# => 70196045363960
foo << "bar"
# => "foobar"
foo.object_id
# => 70196045363960
Replace an entire string without changing its object id
foo
# => "foo"
foo.object_id
# => 70196045363960
foo.gsub!(/./, '') << 'bar'
# => 'bar'
foo.object_id
# => 70196045363960
Replace part of a string without changing its object id
foo
# => "foo"
foo.object_id
# => 70196045363960
foo.gsub!(/o/, 'z')
# => 'fzz'
foo.object_id
# => 70196045363960
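Besides gsub!, Ruby's String#replace swaps the whole content of a string while keeping the same object, and indexed assignment or other bang methods change parts of it in place. A minimal sketch that compares object_id values instead of printing them, since the actual ids differ from run to run:

```ruby
foo = "foo".dup
id_before = foo.object_id

foo.replace("bar")   # whole content replaced in place
after_replace = foo.dup

foo[0, 1] = "B"      # one character replaced in place
after_index = foo.dup

foo.upcase!          # bang methods also mutate the receiver
same_object = foo.object_id == id_before

# A non-mutating method returns a new object instead.
fresh = foo + "!"
new_object = fresh.object_id != id_before

p after_replace  #=> "bar"
p after_index    #=> "Bar"
p foo            #=> "BAR"
p same_object    #=> true
p new_object     #=> true
```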

Check for a substring at the end of string

Let's say I have two strings:
"This-Test has a "
"This has a-Test"
How do I match the "Test" at the end of the string, so that only the second string matches and not the first? I am using include?, but it matches all occurrences and not just the ones where the substring occurs at the end of the string.
You can do this very simply using end_with?, e.g.
"Test something Test".end_with? 'Test'
Or, you can use a regex anchored with $ (end of line; use \z for the strict end of the string):
/Test$/ === "Test something Test"
"This-Test has a ".end_with?("Test") # => false
"This has a-Test".end_with?("Test") # => true
Oh, the possibilities are many...
Let's say we have two strings, a = "This-Test has a" and b = "This has a-Test".
Because you want to match any string that ends exactly in "Test", a good RegEx would be /Test$/ which means "capital T, followed by e, then s, then t, then the end of the line ($)".
Ruby has the =~ operator which performs a RegEx match against a string (or string-like object):
a =~ /Test$/ # => nil (because the string does not match)
b =~ /Test$/ # => 11 (the index at which the match starts)
You could also use String#match:
a.match(/Test$/) # => nil (because the string does not match)
b.match(/Test$/) # => a MatchData object (indicating at least one hit)
Or you could use String#scan:
a.scan(/Test$/) # => [] (because there are no matches)
b.scan(/Test$/) # => ['Test'] (which is the matching part of the string)
Or you could just use ===:
/Test$/ === a # => false (because there are no matches)
/Test$/ === b # => true (because there was a match)
Or you can use String#end_with?:
a.end_with?('Test') # => false
b.end_with?('Test') # => true
...or one of several other methods. Take your pick.
You can use the regex /Test$/ to test:
"This-Test has a " =~ /Test$/
#=> nil
"This has a-Test" =~ /Test$/
#=> 11
You can use a range:
"Your string"[-4..-1] == "Test"
You can use a regex:
"Your string " =~ /Test$/
String's [] makes it nice and easy and clean:
"This-Test has a "[/Test$/] # => nil
"This has a-Test"[/Test$/] # => "Test"
If you need case-insensitive:
"This-Test has a "[/test$/i] # => nil
"This has a-Test"[/test$/i] # => "Test"
If you want true/false:
str = "This-Test has a "
!!str[/Test$/] # => false
str = "This has a-Test"
!!str[/Test$/] # => true
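One caveat for the regex variants: in Ruby, $ matches at the end of any line, while \z matches only at the very end of the string. For single-line input the two behave the same. The sketch below uses a made-up two-line string to show where they differ, along with end_with? for comparison:

```ruby
multi = "ends with Test\nbut has a second line"

dollar_match  = multi.match?(/Test$/)    # $ matches before the newline
strict_match  = multi.match?(/Test\z/)   # \z requires the true end of string
suffix_match  = multi.end_with?("Test")  # plain suffix check, no regex

single = "This has a-Test"
single_dollar = single.match?(/Test$/)
single_strict = single.match?(/Test\z/)

p dollar_match   #=> true
p strict_match   #=> false
p suffix_match   #=> false
p single_dollar  #=> true
p single_strict  #=> true
```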
