Delete specific parts of a String in Ruby

I have a String str = "abcdefghij", and I want to set str2 to str minus the 4th to 6th character (assuming 0 based index).
Is it possible to do this in one go? slice! seems to do it, but it requires at least 3 statements (duplicating, slicing, and then using the string).

A common way is to do it like this:
str = "abcdefghij"
str2 = str.dup
str2[4..6] = ''
# => "abcdhij"
but it still requires two steps.
If what you want to keep is a contiguous range of the string, you can extract it in one step:
str2 = str[2..5]
# => "cdef"
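For the original goal (a copy of str without characters 4 through 6), the duplicate-then-delete steps can be folded into a single expression. This is a minimal sketch, assuming Ruby 1.9 or later for Object#tap; the helper name without_range is hypothetical and not part of the answers here:

```ruby
# Hypothetical helper: returns a copy of str with the characters in
# range removed, leaving the original string untouched.
def without_range(str, range)
  # dup makes the copy, slice! deletes from the copy, tap returns the copy.
  str.dup.tap { |copy| copy.slice!(range) }
end

str  = "abcdefghij"
str2 = without_range(str, 4..6)
p str2 # => "abcdhij"
p str  # => "abcdefghij"
```

tap is used because slice! returns the removed portion, not the string that remains.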

Depending on what exactly you're deleting, http://ruby-doc.org/core/classes/String.html#M001201 might be an option.
You could probably do obscene things with regexes:
"abcdefghij".sub(/(.{4}).{3}/) { $1 }
But that's gross.

I went ahead with using the following:
str = "abcdefghij"
str2 = str[0, 4] + str[7..-1]
It turned out to be faster and cleaner than the other solutions presented. Here's a mini benchmark.
require 'benchmark'
str = "abcdefghij"
times = 1_000_000
Benchmark.bmbm do |bm|
bm.report("1 step") do
times.times do
str2 = str[0, 4] + str[7..-1]
end
end
bm.report("3 steps") do
times.times do
str2 = str.dup
str2[4..6] = ''
str2
end
end
end
Output on Ruby 1.9.2
Rehearsal -------------------------------------------
1 step 0.950000 0.010000 0.960000 ( 0.955288)
3 steps 1.250000 0.000000 1.250000 ( 1.250415)
---------------------------------- total: 2.210000sec
user system total real
1 step 0.960000 0.000000 0.960000 ( 0.950541)
3 steps 1.250000 0.010000 1.260000 ( 1.254416)
Edit: Update for <<.
Script:
require 'benchmark'
str = "abcdefghij"
times = 1_000_000
Benchmark.bmbm do |bm|
bm.report("1 step") do
times.times do
str2 = str[0, 4] + str[7..-1]
end
end
bm.report("3 steps") do
times.times do
str2 = str.dup
str2[4..6] = ''
str2
end
end
bm.report("1 step using <<") do
times.times do
str2 = str[0, 4] << str[7..-1]
end
end
end
Output on Ruby 1.9.2
Rehearsal ---------------------------------------------------
1 step 0.980000 0.010000 0.990000 ( 0.979944)
3 steps 1.270000 0.000000 1.270000 ( 1.265495)
1 step using << 0.910000 0.010000 0.920000 ( 0.909705)
------------------------------------------ total: 3.180000sec
user system total real
1 step 0.980000 0.000000 0.980000 ( 0.985154)
3 steps 1.280000 0.000000 1.280000 ( 1.281310)
1 step using << 0.930000 0.000000 0.930000 ( 0.916511)

Related

Why is storing a value in an instance variable more expensive than looking up a hash?

I ran a benchmark to see whether memoizing attributes was faster than reading from a configuration hash. The code below is an example. Can anybody explain the results?
The Test
require 'benchmark'
class MyClassWithStuff
DEFAULT_VALS = { one: '1', two: 2, three: 3, four: 4 }
def memoized_fetch
@value ||= DEFAULT_VALS[:one]
end
def straight_fetch
DEFAULT_VALS[:one]
end
end
TIMES = 10000
CALL_TIMES = 1000
Benchmark.bmbm do |test|
test.report("Memoized") do
TIMES.times do
instance = MyClassWithStuff.new
CALL_TIMES.times { |i| instance.memoized_fetch }
end
end
test.report("Fetched") do
TIMES.times do
instance = MyClassWithStuff.new
CALL_TIMES.times { |i| instance.straight_fetch }
end
end
end
Results
Rehearsal --------------------------------------------
Memoized 1.500000 0.010000 1.510000 ( 1.510230)
Fetched 1.330000 0.000000 1.330000 ( 1.342800)
----------------------------------- total: 2.840000sec
user system total real
Memoized 1.440000 0.000000 1.440000 ( 1.456937)
Fetched 1.260000 0.000000 1.260000 ( 1.269904)

Random string generation with same pattern

I want to generate a random string with the pattern:
number-number-letter-SPACE-letter-number-number
for example "81b t15", "12a x13". How can I generate something like this? I tried generating each char and joining them into one string, but it does not look efficient.
Nums = (0..9).to_a
Ltrs = ("A".."Z").to_a + ("a".."z").to_a
def rand_num; Nums.sample end
def rand_ltr; Ltrs.sample end
"#{rand_num}#{rand_num}#{rand_ltr} #{rand_ltr}#{rand_num}#{rand_num}"
# => "71P v33"
Have you looked at the randexp gem?
It works like this:
> /\d\d\w \w\d\d/.gen
=> "64M c82"
Ok here's another entry for the competition :D
module RandomString
LETTERS = (("A".."Z").to_a + ("a".."z").to_a)
LETTERS_SIZE = LETTERS.size
SPACE = " "
FORMAT = [:number, :number, :letter, :space, :letter, :number, :number]
class << self
def generate
chars.join
end
def generate2
"#{number}#{number}#{letter} #{letter}#{number}#{number}"
end
private
def chars
FORMAT.collect{|char_class| send char_class}
end
def letter
LETTERS[rand(LETTERS_SIZE)]
end
def number
rand 10
end
def space
SPACE
end
end
end
And you use it like:
50.times { puts RandomString.generate }
Out of curiosity, I made a benchmark of all the solutions presented here. Here are the results:
JRuby:
user system total real
kimmmo 1.490000 0.000000 1.490000 ( 0.990000)
kimmmo2 0.600000 0.010000 0.610000 ( 0.479000)
sawa 0.960000 0.040000 1.000000 ( 0.533000)
hp4k 2.050000 0.230000 2.280000 ( 1.234000)
brian 17.700000 0.170000 17.870000 ( 14.867000)
MRI 2.0
user system total real
kimmmo 0.900000 0.000000 0.900000 ( 0.908601)
kimmmo2 0.410000 0.000000 0.410000 ( 0.406443)
sawa 0.570000 0.000000 0.570000 ( 0.568935)
hp4k 4.940000 0.000000 4.940000 ( 4.945404)
brian 25.860000 0.010000 25.870000 ( 25.870011)
You can do it this way
(0..9).to_a.sample(2).join + ('a'..'z').to_a.sample + " " + ('a'..'z').to_a.sample + (0..9).to_a.sample(2).join
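Because Array#sample(2) draws two distinct elements, the one-liner above never produces a repeated digit such as "11". Here is a minimal sketch of a variant that draws each digit independently; the names LOWERCASE, two_digits and random_code are made up for illustration:

```ruby
LOWERCASE = ('a'..'z').to_a

# Two independent random digits, so repeats such as "11" are possible.
def two_digits
  Array.new(2) { rand(10) }.join
end

# Builds a string shaped like number-number-letter-SPACE-letter-number-number.
def random_code
  "#{two_digits}#{LOWERCASE.sample} #{LOWERCASE.sample}#{two_digits}"
end

puts random_code
```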

Most efficient way to compare arrays in Ruby

The code below is supposed to find the numbers in arr_1 that are missing in arr_2.
def compare_1 (arr_1, arr_2)
output = []
temp = arr_2.each_with_object(Hash.new(0)) { |val, hsh| hsh[val] = 0 }
arr_1.each do |element|
if !temp.has_key? (element)
output << element
end
end
puts output
end
def compare_2 (arr_1, arr_2)
out = []
arr_1.each do |num|
if (!arr_2.include?(num))
out << num
end
end
puts out
end
According to 'benchmark', the first method is faster, presumably because it uses a hash. Is there a neater way to write these or achieve this?
compare_1 times:
0.000000 0.000000 0.000000 ( 0.003001)
compare_2 times:
0.047000 0.000000 0.047000 ( 0.037002)
The above code is supposed to find the numbers in array_1 that are
missing in array_2
As SteveTurczyn said you could do array_1 - array_2
Here is the definition of Array Difference
Returns a new array that is a copy of the original array, removing any
items that also appear in other_ary. The order is preserved from the
original array.
It compares elements using their hash and eql? methods for efficiency.
[ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ] #=> [ 3, 3, 5 ]
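As a small sketch of what that means in practice, the built-in difference and a hand-rolled Set lookup give the same answer on sample data. The arrays and variable names below are invented for illustration:

```ruby
require 'set'

arr_1 = [1, 2, 3, 4, 5, 5]
arr_2 = [2, 4]

# Built-in array difference: keeps order and duplicates from arr_1.
missing = arr_1 - arr_2

# Equivalent manual version: build a Set once, then test membership.
lookup = arr_2.to_set
missing_via_set = arr_1.reject { |element| lookup.include?(element) }

p missing         # => [1, 3, 5, 5]
p missing_via_set # => [1, 3, 5, 5]
```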
EDIT
Regarding performance, I made a benchmark that gathers the approaches from this thread.
################################################
# $> ruby -v
# ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin12.0]
################################################
require 'benchmark'
def compare_1 arr_1, arr_2
output = []
temp = arr_2.each_with_object(Hash.new(0)) { |val, hsh| hsh[val] = 0 }
arr_1.each do |element|
if !temp.has_key? (element)
output << element
end
end
output
end
def compare_2 arr_1, arr_2
out = []
arr_1.each do |num|
if (!arr_2.include?(num))
out << num
end
end
out
end
require 'set'
def compare_3 arr_1, arr_2
temp = Set.new arr_2
arr_1.reject { |e| temp.include? e }
end
def native arr_1, arr_2
arr_1 - arr_2
end
a1 = (0..50000).to_a
a2 = (0..49999).to_a
Benchmark.bmbm(11) do |x|
x.report("compare_1:") {compare_1(a1, a2)}
x.report("compare_2:") {compare_2(a1, a2)}
x.report("compare_3:") {compare_3(a1, a2)}
x.report("native:") {native(a1, a2)}
end
################################################
# $> ruby array_difference.rb
# Rehearsal -----------------------------------------------
# compare_1: 0.030000 0.000000 0.030000 ( 0.031663)
# compare_2: 71.300000 0.040000 71.340000 ( 71.436027)
# compare_3: 0.040000 0.000000 0.040000 ( 0.042202)
# native: 0.030000 0.010000 0.040000 ( 0.030908)
# ------------------------------------- total: 71.450000sec
#
# user system total real
# compare_1: 0.030000 0.000000 0.030000 ( 0.030870)
# compare_2: 71.090000 0.030000 71.120000 ( 71.221141)
# compare_3: 0.030000 0.000000 0.030000 ( 0.034612)
# native: 0.030000 0.000000 0.030000 ( 0.030670)
################################################

How do I test if a string contains two or more vowels in ruby?

How do I test if a string contains two or more vowels?
I have the following code, but it only tests 2 vowels adjacent to each other. I just want to know if the string contains two or more vowels regardless of where they appear in the string.
if /[aeiouy]{2,}/.match(word)
puts word
end
You could use scan which returns an array with all the matches:
if word.scan(/[aeiou]/).count >= 2
puts word
end
You could use something like:
/[aeiouy].*?[aeiouy]/
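To illustrate the difference between the two patterns, here is a short sketch comparing the original adjacent-only regex with the one above. The constant names and sample words are chosen for illustration only:

```ruby
ADJACENT = /[aeiouy]{2,}/          # two or more vowels in a row
ANYWHERE = /[aeiouy].*?[aeiouy]/   # two vowels with anything in between

%w[test teste beet].each do |word|
  adjacent = !(word =~ ADJACENT).nil?
  anywhere = !(word =~ ANYWHERE).nil?
  puts "#{word}: adjacent=#{adjacent} anywhere=#{anywhere}"
end
```

Only the second pattern matches "teste", where the two vowels are separated by consonants.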
First some questions:
What is a vowel? Your example includes y; in my view, y is not a vowel. What about umlauts?
Only lowercase letters, or capitals as well?
In my example you can adapt the constant VOWELS to your definition.
I think the easiest way is to count the vowels with String#count.
Below is an example with three variants, a to c.
You wrote about two vowels, not two different vowels. Variants a and b match any two vowels, even if both are the same letter. Variant c matches only if there are at least two different vowels in the word.
VOWELS = 'aeiouyAEIOUY'
%w{
test
teste
testa
}.each{|word|
puts 'a: ' + word if word.count(VOWELS) > 1
puts 'b: ' + word if /[#{VOWELS}].*?[#{VOWELS}]/ =~ word
puts 'c: ' + word if word.scan(/[#{VOWELS}]/).uniq.count > 1
}
I made a benchmark. The count solution is the fastest.
require 'benchmark'
N = 10_000 #Number of Test loops
VOWELS = 'aeiouyAEIOUY'
TESTDATA = %w{
test
teste
testa
}
Benchmark.bmbm(10) {|b|
b.report('count') { N.times { TESTDATA.each{|word| word.count(VOWELS) > 1} } }
b.report('regex') { N.times { TESTDATA.each{|word| /[#{VOWELS}].*?[#{VOWELS}]/ =~ word} } }
b.report('scab') { N.times { TESTDATA.each{|word| word =~ /[#{VOWELS}].*?[#{VOWELS}]/ } } }
b.report('scan/uniq') { N.times { TESTDATA.each{|word| word.scan(/[#{VOWELS}]/).uniq.count > 1 } } }
} #Benchmark
Result:
Rehearsal ---------------------------------------------
count 0.031000 0.000000 0.031000 ( 0.031250)
regex 0.562000 0.000000 0.562000 ( 0.562500)
scab 0.516000 0.000000 0.516000 ( 0.515625)
scan/uniq 0.437000 0.000000 0.437000 ( 0.437500)
------------------------------------ total: 1.546000sec
user system total real
count 0.031000 0.000000 0.031000 ( 0.031250)
regex 0.500000 0.000000 0.500000 ( 0.515625)
scab 0.500000 0.000000 0.500000 ( 0.500000)
scan/uniq 0.422000 0.000000 0.422000 ( 0.437500)

What is the easiest way to remove the first character from a string?

Example:
[12,23,987,43
What is the fastest, most efficient way to remove the "[",
using maybe a chop() but for the first character?
Similar to Pablo's answer above, but a shade cleaner:
str[1..-1]
This returns the substring from index 1 to the last character.
'Hello World'[1..-1]
=> "ello World"
I kind of favor using something like:
asdf = "[12,23,987,43"
asdf[0] = ''
p asdf
# >> "12,23,987,43"
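One practical difference between the two approaches shown so far is that str[1..-1] returns a new string, while str[0] = '' modifies the receiver. A brief sketch of that difference, with variable names chosen for illustration:

```ruby
original = "[12,23,987,43"

# Non-mutating: slicing returns a new string and leaves original alone.
rest = original[1..-1]

# Mutating: index assignment edits the receiver in place, so work on a
# copy if the original is still needed.
copy = original.dup
copy[0] = ''

p rest     # => "12,23,987,43"
p copy     # => "12,23,987,43"
p original # => "[12,23,987,43"
```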
I'm always looking for the fastest and most readable way of doing things:
require 'benchmark'
N = 1_000_000
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
end
Running on my Mac Pro:
1.9.3
user system total real
[0] 0.840000 0.000000 0.840000 ( 0.847496)
sub 1.960000 0.010000 1.970000 ( 1.962767)
gsub 4.350000 0.020000 4.370000 ( 4.372801)
[1..-1] 0.710000 0.000000 0.710000 ( 0.713366)
slice 1.020000 0.000000 1.020000 ( 1.020336)
length 1.160000 0.000000 1.160000 ( 1.157882)
Updating to incorporate one more suggested answer:
require 'benchmark'
N = 1_000_000
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
b.report('eat!') { N.times { "[12,23,987,43".eat! } }
b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
2.1.2
user system total real
[0] 0.300000 0.000000 0.300000 ( 0.295054)
sub 0.630000 0.000000 0.630000 ( 0.631870)
gsub 2.090000 0.000000 2.090000 ( 2.094368)
[1..-1] 0.230000 0.010000 0.240000 ( 0.232846)
slice 0.320000 0.000000 0.320000 ( 0.320714)
length 0.340000 0.000000 0.340000 ( 0.341918)
eat! 0.460000 0.000000 0.460000 ( 0.452724)
reverse 0.400000 0.000000 0.400000 ( 0.399465)
And another using /^./ to find the first character:
require 'benchmark'
N = 1_000_000
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('[/^./]') { N.times { "[12,23,987,43"[/^./] = '' } }
b.report('[/^\[/]') { N.times { "[12,23,987,43"[/^\[/] = '' } }
b.report('sub+') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
b.report('eat!') { N.times { "[12,23,987,43".eat! } }
b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
# >> 2.1.5
# >> user system total real
# >> [0] 0.270000 0.000000 0.270000 ( 0.270165)
# >> [/^./] 0.430000 0.000000 0.430000 ( 0.432417)
# >> [/^\[/] 0.460000 0.000000 0.460000 ( 0.458221)
# >> sub+ 0.590000 0.000000 0.590000 ( 0.590284)
# >> sub 0.590000 0.000000 0.590000 ( 0.596366)
# >> gsub 1.880000 0.010000 1.890000 ( 1.885892)
# >> [1..-1] 0.230000 0.000000 0.230000 ( 0.223045)
# >> slice 0.300000 0.000000 0.300000 ( 0.299175)
# >> length 0.320000 0.000000 0.320000 ( 0.325841)
# >> eat! 0.410000 0.000000 0.410000 ( 0.409306)
# >> reverse 0.390000 0.000000 0.390000 ( 0.393044)
Here's another update on faster hardware and a newer version of Ruby:
2.3.1
user system total real
[0] 0.200000 0.000000 0.200000 ( 0.204307)
[/^./] 0.390000 0.000000 0.390000 ( 0.387527)
[/^\[/] 0.360000 0.000000 0.360000 ( 0.360400)
sub+ 0.490000 0.000000 0.490000 ( 0.492083)
sub 0.480000 0.000000 0.480000 ( 0.487862)
gsub 1.990000 0.000000 1.990000 ( 1.988716)
[1..-1] 0.180000 0.000000 0.180000 ( 0.181673)
slice 0.260000 0.000000 0.260000 ( 0.266371)
length 0.270000 0.000000 0.270000 ( 0.267651)
eat! 0.400000 0.010000 0.410000 ( 0.398093)
reverse 0.340000 0.000000 0.340000 ( 0.344077)
Why is gsub so slow?
After doing a search/replace, gsub has to check for possible additional matches before it can tell whether it's finished. sub only does one and finishes. Think of gsub as a minimum of two sub calls.
Also, it's important to remember that gsub and sub can be handicapped by a poorly written regex, which matches much more slowly than a substring search. If possible, anchor the regex to get the most speed from it. There are answers here on Stack Overflow demonstrating that, so search around if you want more information.
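The behavioural difference behind that extra work can be seen in a small sketch. The sample string is invented so that it contains more than one bracket:

```ruby
str = "[[12,23,[987],43"

first_only = str.sub(/\[/, "")   # replaces the first match, then stops
every_one  = str.gsub(/\[/, "")  # keeps scanning to the end of the string
anchored   = str.gsub(/^\[/, "") # anchored: can only match at a line start

p first_only # => "[12,23,[987],43"
p every_one  # => "12,23,987],43"
p anchored   # => "[12,23,[987],43"
```

Even when the anchored gsub finds only one match, it still has to look past that match to confirm there are no more.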
We can use slice! to do this:
val = "abc"
=> "abc"
val.slice!(0)
=> "a"
val
=> "bc"
Using slice! we can delete any character by specifying its index.
Ruby 2.5+
As of Ruby 2.5 you can use delete_prefix or delete_prefix! to achieve this in a readable manner.
In this case "[12,23,987,43".delete_prefix("[").
More info here:
Official docs
https://blog.jetbrains.com/ruby/2017/10/10-new-features-in-ruby-2-5/
https://bugs.ruby-lang.org/issues/12694
'invisible'.delete_prefix('in') #=> "visible"
'pink'.delete_prefix('in') #=> "pink"
N.B. you can also use this to remove items from the end of a string with delete_suffix and delete_suffix!
'worked'.delete_suffix('ed') #=> "work"
'medical'.delete_suffix('ed') #=> "medical"
Docs
https://bugs.ruby-lang.org/issues/13665
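One detail to keep in mind is that the bang version returns nil when nothing was removed, which matters if you chain calls on the result. A short sketch, assuming Ruby 2.5 or later; the variable names are illustrative:

```ruby
str = "[12,23,987,43".dup # dup so the example also works with frozen literals

copy   = str.delete_prefix("[")  # new string, str is unchanged
first  = str.delete_prefix!("[") # modifies str in place and returns it
second = str.delete_prefix!("[") # nothing left to remove

p copy   # => "12,23,987,43"
p first  # => "12,23,987,43"
p second # => nil
p str    # => "12,23,987,43"
```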
Edit:
Using the Tin Man's benchmark setup, it looks pretty quick too (under the last two entries delete_p and delete_p!). Doesn't quite pip the previous faves for speed, though is very readable.
2.5.0
user system total real
[0] 0.174766 0.000489 0.175255 ( 0.180207)
[/^./] 0.318038 0.000510 0.318548 ( 0.323679)
[/^\[/] 0.372645 0.001134 0.373779 ( 0.379029)
sub+ 0.460295 0.001510 0.461805 ( 0.467279)
sub 0.498351 0.001534 0.499885 ( 0.505729)
gsub 1.669837 0.005141 1.674978 ( 1.682853)
[1..-1] 0.199840 0.000976 0.200816 ( 0.205889)
slice 0.279661 0.000859 0.280520 ( 0.285661)
length 0.268362 0.000310 0.268672 ( 0.273829)
eat! 0.341715 0.000524 0.342239 ( 0.347097)
reverse 0.335301 0.000588 0.335889 ( 0.340965)
delete_p 0.222297 0.000832 0.223129 ( 0.228455)
delete_p! 0.225798 0.000747 0.226545 ( 0.231745)
I prefer this:
str = "[12,23,987,43"
puts str[1..-1]
>> 12,23,987,43
If you always want to strip leading brackets:
"[12,23,987,43".gsub(/^\[/, "")
If you just want to remove the first character, and you know it won't be in a multibyte character set:
"[12,23,987,43"[1..-1]
or
"[12,23,987,43".slice(1..-1)
Inefficient alternative:
str.reverse.chop.reverse
For example : a = "One Two Three"
1.9.2-p290 > a = "One Two Three"
=> "One Two Three"
1.9.2-p290 > a = a[1..-1]
=> "ne Two Three"
1.9.2-p290 > a = a[1..-1]
=> "e Two Three"
1.9.2-p290 > a = a[1..-1]
=> " Two Three"
1.9.2-p290 > a = a[1..-1]
=> "Two Three"
1.9.2-p290 > a = a[1..-1]
=> "wo Three"
In this way you can remove the first character of the string, one character at a time.
Easy way:
str = "[12,23,987,43"
removed = str[1..str.length]
Awesome way:
class String
def reverse_chop()
self[1..self.length]
end
end
"[12,23,987,43".reverse_chop()
(Note: prefer the easy way :) )
Thanks to @the-tin-man for putting together the benchmarks!
Alas, I don't really like any of those solutions. Either they require an extra step to get the result ([0] = '', .strip!) or they aren't very semantic/clear about what's happening ([1..-1]: "Um, a range from 1 to negative 1? Yearg?"), or they are slow or lengthy to write out (.gsub, .length).
What we are attempting is a 'shift' (in Array parlance), but returning the remaining characters, rather than what was shifted off. Let's use our Ruby to make this possible with strings! We can use the speedy bracket operation, but give it a good name, and take an arg to specify how much we want to chomp off the front:
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
But there is more we can do with that speedy-but-unwieldy bracket operation. While we are at it, for completeness, let's write a #shift and #first for String (why should Array have all the fun‽‽), taking an arg to specify how many characters we want to remove from the beginning:
class String
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
Ok, now we have a good clear way of pulling characters off the front of a string, with a method that is consistent with Array#first and Array#shift (which really should be a bang method??). And we can easily get the modified string as well with #eat!. Hm, should we share our new eat!ing power with Array? Why not!
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
Now we can:
> str = "[12,23,987,43" #=> "[12,23,987,43"
> str.eat! #=> "12,23,987,43"
> str #=> "12,23,987,43"
> str.eat!(3) #=> "23,987,43"
> str #=> "23,987,43"
> str.first(2) #=> "23"
> str #=> "23,987,43"
> str.shift!(3) #=> "23,"
> str #=> "987,43"
> arr = [1,2,3,4,5] #=> [1, 2, 3, 4, 5]
> arr.eat! #=> [2, 3, 4, 5]
> arr #=> [2, 3, 4, 5]
That's better!
str = "[12,23,987,43"
str[0] = ""
class String
def bye_felicia()
felicia = self.strip[0] #first char, not first space.
self.sub(felicia, '')
end
end
Using regex:
str = 'string'
n = 1 #to remove first n characters
str[/.{#{str.size-n}}\z/] #=> "tring"
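I find str.delete(str[0]) a nice solution for its readability, though I cannot attest to its performance. Be aware that String#delete removes every occurrence of that character from the string, not just the first one.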
list = [1,2,3,4]
list.drop(1)
# => [2,3,4]
Array#drop removes one or more elements from the start of the array without mutating it, and returns a new array of the remaining elements instead of the dropped ones.

Resources