How to delete specific characters from a string in Ruby? - ruby

I have several strings that look like this:
"((String1))"
They are all different lengths. How could I remove the parentheses from all these strings in a loop?

Use String#tr with an empty replacement string:
"((String1))".tr('()', '')
# => "String1"

If you just want to remove the first two characters and the last two, then you can use negative indexes on the string:
s = "((String1))"
s = s[2...-2]
p s # => "String1"
If you want to remove all parentheses from the string you can use the delete method on the string class:
s = "((String1))"
s.delete! '()'
p s # => "String1"

For those coming across this and looking for performance, it looks like #delete and #tr are about the same in speed and 2-4x faster than gsub.
require 'benchmark'
text = "Here is a string with / some forwa/rd slashes"
tr = Benchmark.measure { 10000.times { text.tr('/', '') } }
# tr.total => 0.01
delete = Benchmark.measure { 10000.times { text.delete('/') } }
# delete.total => 0.01
gsub = Benchmark.measure { 10000.times { text.gsub('/', '') } }
# gsub.total => 0.02 - 0.04

Using String#gsub with regular expression:
"((String1))".gsub(/^\(+|\)+$/, '')
# => "String1"
"(((((( parentheses )))".gsub(/^\(+|\)+$/, '')
# => " parentheses "
This will remove surrounding parentheses only.
"(((((( This (is) string )))".gsub(/^\(+|\)+$/, '')
# => " This (is) string "
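One caveat, offered as a side note rather than a correction: in Ruby, ^ and $ anchor at line boundaries, not at the ends of the whole string, so on multi-line input the pattern above also strips parentheses at the start and end of every line. A minimal sketch of the difference, using \A and \z instead (the sample string is made up for illustration):

```ruby
# Hypothetical multi-line input, not taken from the original question.
multi = "(a)\n(b)"

# ^ and $ match at the start and end of every line.
per_line = multi.gsub(/^\(+|\)+$/, '')

# \A and \z match only at the very start and very end of the string.
whole = multi.gsub(/\A\(+|\)+\z/, '')

p per_line # => "a\nb"
p whole    # => "a)\n(b"
```

For single-line strings like the ones in the question, both patterns behave the same.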

Here is an even shorter way of achieving this:
1) Using a negated character class with String#[]:
irb(main)> "((String1))"[/[^()]+/]
=> "String1"
The leading ^ negates the character class, so [^()] matches any character that is NOT ( or ).
2) Or with global substitution (gsub), as others have mentioned:
irb(main)> "((String1))".gsub(/[)(]/, '')
=> "String1"
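Note that the two forms are only equivalent when the parentheses sit at the edges. A small sketch, with an input invented for illustration, of how they differ when parentheses also appear in the middle:

```ruby
# Hypothetical input with parentheses in the middle of the text.
s = "((a)(b))"

first_run = s[/[^()]+/]         # only the first run of non-parenthesis characters
all_kept  = s.gsub(/[()]/, '')  # every parenthesis removed, everything else kept

p first_run # => "a"
p all_kept  # => "ab"
```

So String#[] with a regex is the shorter option when there is a single run of wanted text, while gsub (or delete) is probably the safer choice otherwise.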

Use String#delete:
"((String1))".delete "()"
=> "String1"
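Since the question asks about doing this in a loop, here is a minimal sketch of applying delete to a whole collection. The array contents are assumptions standing in for the asker's real strings:

```ruby
# Hypothetical sample data standing in for the asker's strings.
strings = ["((String1))", "(String2)", "(((Another one)))"]

# Non-destructive: build a new array of cleaned copies.
cleaned = strings.map { |s| s.delete("()") }
p cleaned # => ["String1", "String2", "Another one"]

# Destructive: modify each string in place (done on copies here so the
# originals stay intact). delete! returns nil when nothing was removed,
# so its return value should not be relied on.
copies = strings.map(&:dup)
copies.each { |s| s.delete!("()") }
p copies # => ["String1", "String2", "Another one"]
```

The map version is usually the easier one to reason about, because it never touches the original strings.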

Related

How do I escape a variable when using it in a regular expression? [duplicate]

Is it possible to create/use a regular expression pattern in Ruby that is based on the value of a variable?
For instance, we all know we can do the following with Ruby strings:
str = "my string"
str2 = "This is #{str}" # => "This is my string"
I'd like to do the same thing with regular expressions:
var = "Value"
str = "a test Value"
str.gsub( /#{var}/, 'foo' ) # => "a test foo"
Obviously that doesn't work as listed, I only put it there as an example to show what I'd like to do. I need to regexp match based on the value of a variable's content.
The code you think doesn't work, does:
var = "Value"
str = "a test Value"
p str.gsub( /#{var}/, 'foo' ) # => "a test foo"
Things get more interesting if var can contain regular expression metacharacters. If it does and you want those metacharacters to do what they usually do in a regular expression, then the same gsub will work:
var = "Value|a|test"
str = "a test Value"
str.gsub( /#{var}/, 'foo' ) # => "foo foo foo"
However, if your search string contains metacharacters and you do not want them interpreted as metacharacters, then use Regexp.escape like this:
var = "*This*"
str = "*This* is a string"
p str.gsub( /#{Regexp.escape(var)}/, 'foo' )
# => "foo is a string"
Or just give gsub a string instead of a regular expression. In MRI >= 1.8.7, gsub treats a string pattern argument as a plain string, not a regular expression:
var = "*This*"
str = "*This* is a string"
p str.gsub(var, 'foo' ) # => "foo is a string"
(It used to be that a string pattern argument to gsub was automatically converted to a regular expression. I know it was that way in 1.6. I don't recall which version introduced the change.)
As noted in other answers, you can use Regexp.new as an alternative to interpolation:
var = "*This*"
str = "*This* is a string"
p str.gsub(Regexp.new(Regexp.escape(var)), 'foo' )
# => "foo is a string"
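To make the escaping step concrete, here is a small sketch. The helper name replace_literal is hypothetical (it is not part of Ruby); only Regexp.escape and gsub are real library calls:

```ruby
# Hypothetical helper: replace every literal occurrence of `needle`,
# even when it contains regex metacharacters. The block form of gsub is
# used so that backslashes in `replacement` are not treated as backreferences.
def replace_literal(text, needle, replacement)
  text.gsub(/#{Regexp.escape(needle)}/) { replacement }
end

escaped = Regexp.escape("*This*")
p escaped                                                 # => "\\*This\\*"
p replace_literal("*This* is a string", "*This*", "foo")  # => "foo is a string"
p replace_literal("1+1=2", "1+1", "two")                  # => "two=2"
```

Without the escape, /*This*/ would not even compile, and /1+1/ would match "11" rather than "1+1".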
It works, but you need to use gsub! or assign the return to another variable
var = "Value"
str = "a test Value"
str.gsub!( /#{var}/, 'foo' ) # Or this: new_str = str.gsub( /#{var}/, 'foo' )
puts str
Yes
str.gsub Regexp.new(var), 'foo'
You can use regular expressions through variables in ruby:
var = /Value/
str = "a test Value"
str.gsub( /#{var}/, 'foo' )
str.gsub( Regexp.new("#{var}"), 'foo' )

What's the difference between match method and the =~ operator?

Two expressions:
puts "String has vowels" if "This is a test".match(/[aeiou]/)
and
puts "String has vowels" if "This is a test" =~ /[aeiou]/
seem identical. Are they not? I did some testing below:
"This is a test" =~ /[aeiou]/
# => 2
"This is a test".match(/[aeiou]/)
# => MatchData "i"
So it seems like =~ gives you the position of the first match and match method gives you the first character that matches. Is this correct? They both return true and so what's the difference here?
They just differ on what they return if there is a match. If there is no match, both return nil.
=~ returns the numerical index of the character in the string where the match started
.match returns an instance of the class MatchData
You're correct.
Expanding on Nobita's answer, match is less efficient if you want to just check to see if a string matches a regexp (like in your case). In that case, you should use =~. See the answer to "Fastest way to check if a string matches or not a regexp in ruby?", which contains these benchmarks:
require 'benchmark'
"test123" =~ /1/
=> 4
Benchmark.measure{ 1000000.times { "test123" =~ /1/ } }
=> 0.610000 0.000000 0.610000 ( 0.578133)
...
"test123".match(/1/)
=> #<MatchData "1">
Benchmark.measure{ 1000000.times { "test123".match(/1/) } }
=> 1.703000 0.000000 1.703000 ( 1.578146)
So, in this case, =~ is a little less than three times faster than match
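As a rough illustration of the return values discussed above, the sketch below puts the two side by side. It also shows String#match?, which was added in Ruby 2.4 and returns a plain boolean; it is not mentioned in the answers above, so treat it as an extra option rather than part of the original advice:

```ruby
str = "This is a test"

index = str =~ /[aeiou]/       # Integer index of the first match, or nil
md    = str.match(/[aeiou]/)   # MatchData object, or nil
bool  = str.match?(/[aeiou]/)  # true or false (Ruby 2.4 and later)

p index         # => 2
p md[0]         # => "i"
p md.begin(0)   # => 2
p md.pre_match  # => "Th"
p bool          # => true
p("xyz" =~ /[aeiou]/) # => nil
```

If all you need is a yes/no answer, =~ or match? avoids building a MatchData object you never use.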

Ruby Interpolation

Could someone explain why doing this:
%{#$"}
in irb produces the following?
=> "[\"enumerator.so\", \"enc/encdb.so\", \"enc/big5.so\", \"enc/cp949.so\", \"enc/emacs_mule.so\", \"enc/euc_jp.so\", \"enc/euc_kr.so\", \"enc/euc_tw.so\", \"enc/gb2312.so\", \"enc/gb18030.so\", \"enc/gbk.so\", \"enc/iso_8859_1.so\" ... ]
Thanks!
%{ ... } is a string literal. It's similar to "...".
%{a string} == "a string"
# => true
#{expr} inside such a string literal is interpolation: the expression expr is replaced with its value. For a global variable you can omit the { and }.
"#{1 + 2}"
# => "3"
%{#$"} == $".to_s
# => true
$" is one of the pre-defined variables: an array of the names of the files loaded so far by require.
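A short sketch of the brace-less form with an ordinary global, which may be easier to read than $". The variable $greeting is made up for illustration; $LOADED_FEATURES is the longer built-in name for $":

```ruby
$greeting = "hello" # hypothetical global, for illustration only

braces    = "#{$greeting} world"
shorthand = "#$greeting world" # braces omitted for a global variable

same_text     = (braces == shorthand)
same_features = ($" == $LOADED_FEATURES)
same_string   = (%{#$"} == $LOADED_FEATURES.to_s)

p same_text     # => true
p same_features # => true
p same_string   # => true
```

So %{#$"} is just "#{$LOADED_FEATURES}" written with a different string delimiter and without the braces.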

Check for a substring at the end of string

Let's say I have two strings:
"This-Test has a "
"This has a-Test"
How do I match the "Test" at the end of the string, so that only the second string matches and not the first? I am using include? but it matches every occurrence, not just the ones where the substring occurs at the end of the string.
You can do this very simply using end_with?, e.g.
"Test something Test".end_with? 'Test'
Or, you can use a regex that matches the end of the string:
/Test$/ === "Test something Test"
"This-Test has a ".end_with?("Test") # => false
"This has a-Test".end_with?("Test") # => true
Oh, the possibilities are many...
Let's say we have two strings, a = "This-Test has a" and b = "This has a-Test".
Because you want to match any string that ends exactly in "Test", a good RegEx would be /Test$/ which means "capital T, followed by e, then s, then t, then the end of the line ($)".
Ruby has the =~ operator which performs a RegEx match against a string (or string-like object):
a =~ /Test$/ # => nil (because the string does not match)
b =~ /Test$/ # => 11 (the index at which the match starts)
You could also use String#match:
a.match(/Test$/) # => nil (because the string does not match)
b.match(/Test$/) # => a MatchData object (indicating at least one hit)
Or you could use String#scan:
a.scan(/Test$/) # => [] (because there are no matches)
b.scan(/Test$/) # => ['Test'] (which is the matching part of the string)
Or you could just use ===:
/Test$/ === a # => false (because there are no matches)
/Test$/ === b # => true (because there was a match)
Or you can use String#end_with?:
a.end_with?('Test') # => false
b.end_with?('Test') # => true
...or one of several other methods. Take your pick.
You can use the regex /Test$/ to test:
"This-Test has a " =~ /Test$/
#=> nil
"This has a-Test" =~ /Test$/
#=> 11
You can use a range:
"Your string"[-4..-1] == "Test"
You can use a regex:
"Your string " =~ /Test$/
String's [] makes it nice and easy and clean:
"This-Test has a "[/Test$/] # => nil
"This has a-Test"[/Test$/] # => "Test"
If you need case-insensitive:
"This-Test has a "[/test$/i] # => nil
"This has a-Test"[/test$/i] # => "Test"
If you want true/false:
str = "This-Test has a "
!!str[/Test$/] # => false
str = "This has a-Test"
!!str[/Test$/] # => true
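One difference between the regex answers and end_with? that may matter in practice: $ also matches just before a trailing newline, while \z and end_with? only accept the very end of the string. A minimal sketch, assuming a string that still carries its newline (for example a line read from a file without chomp):

```ruby
# Hypothetical input: the second example string with a trailing newline.
with_newline = "This has a-Test\n"

dollar  = with_newline =~ /Test$/           # $ matches just before the newline
strict  = with_newline =~ /Test\z/          # \z is the very end of the string
plain   = with_newline.end_with?("Test")
chomped = with_newline.chomp.end_with?("Test")

p dollar  # => 11
p strict  # => nil
p plain   # => false
p chomped # => true
```

Which behaviour is right depends on whether a trailing newline should count as part of the text.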

How to read characters from a text file, then store them into a hash in Ruby

I am working on an assignment, and can't figure it out. We have to first parse a text file, and then feed the results into a hash. I have done this:
code = File.open(WORKING_DIR + '/code.txt','r')
char_count = {'a' => 0,'b' => 0,'c' => 0,'d' => 0,'e' => 0,'f' => 0,'g' => 0,'h' => 0,'i' => 0,
'j' => 0,'k' => 0,'l' => 0,'m' => 0,'n' => 0,'o' => 0,'p' => 0,'q' => 0,'r' => 0,
's' => 0,'t' => 0,'u' => 0,'v' => 0,'w' => 0,'x' => 0,'y' => 0,'z' => 0
}
# Step through each line in the file.
code.readlines.each do |line|
# Print each character of this particular line.
line.split('').each do
|ch|
char_count.has_key?('ch')
char_count['ch'] +=1
end
My line of thinking: open the file to a variable named code
read the individual lines
break the lines into each character.
I know this works, I can puts out the characters to screen.
Now I need to feed the characters into the hash, and it isn't working. I am struggling with the syntax (at least) and basic concepts (at most). I only want the alphabet characters, not the punctuation, etc. from the file.
Any help would be greatly appreciated.
Thanks.
I would directly do :
File.open(WORKING_DIR + '/code.txt','r') do |f|
  char_count = Hash.new(0) # create a hash where 0 is the default value
  f.each_char do |c| # iterate on each character
    # ... some filter on the characters you want to reject
    char_count[c] += 1
  end
end
PS : you wrote 'ch' the string instead of ch the variable name
EDIT : the filter could be
f.each_char do |c| # iterate on each character
next if c =~ /\W/ # skip non-word characters using a regexp
....
Try this, using Enumerable class methods:
open("file").each_char.grep(/\w/).group_by { |char|
char
}.each { |char,num|
p [char, num.count]
}
(The grep filter uses the regex \w, which matches any letter, digit or underscore; you can change it to [A-Za-z] to keep only letters.)
I think the problem is here:
char_count.has_key?('ch')
char_count['ch'] +=1
end
You're not using the variable ch but the string 'ch'; change it to ch in both places.
Also the hash could be created using range, for example:
char_count = {}
('a'..'z').each{|l| char_count[l] = 0}
or:
char_count = ('a'..'z').inject({}){|hash,l| hash[l] = 0 ; hash}
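Putting the suggestions from these answers together, a minimal end-to-end sketch might look like the following. The helper name count_letters is hypothetical, and a StringIO stands in for the file so the example runs without code.txt; with a real file you could pass the object yielded by File.open instead:

```ruby
require 'stringio'

# Hypothetical helper: count the letters a-z (case-insensitively) read from
# any IO-like object that responds to each_char.
def count_letters(io)
  counts = Hash.new(0)           # missing keys default to 0
  io.each_char do |ch|
    next unless ch =~ /[a-z]/i   # skip digits, punctuation and whitespace
    counts[ch.downcase] += 1     # note: the variable ch, not the string 'ch'
  end
  counts
end

# StringIO is an assumption made so the sketch is self-contained.
counts = count_letters(StringIO.new("Hello, World!"))
p counts["l"] # => 3
p counts["o"] # => 2
p counts["z"] # => 0
```

Because the hash is created with Hash.new(0), there is no need to pre-fill it with every letter; letters that never appear simply read as 0.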
