ruby fast reading from std

ruby fast reading from std - ruby

What is the fastest way to read from STDIN a number of 1000000 characters (integers), and split it into an array of one character integers (not strings) ?
123456 > [1,2,3,4,5,6]

The quickest method I have found so far is as follows :-
gets.unpack("c*").map { |c| c-48}
Here are some results from benchmarking most of the provided solutions. These tests were run with a 100,000 digit file but with 10 reps for each test.
user system total real
each_char_full_array: 1.780000 0.010000 1.790000 ( 1.788893)
each_char_empty_array: 1.560000 0.010000 1.570000 ( 1.572162)
map_byte: 0.760000 0.010000 0.770000 ( 0.773848)
gets_scan 2.220000 0.030000 2.250000 ( 2.250076)
unpack: 0.510000 0.020000 0.530000 ( 0.529376)
And here is the code that produced them
#!/usr/bin/env ruby
require "benchmark"
MAX_ITERATIONS = 100000
FILE_NAME = "1_million_digits"
def build_test_file
File.open(FILE_NAME, "w") do |f|
MAX_ITERATIONS.times {|x| f.syswrite rand(10)}
end
end
def each_char_empty_array
STDIN.reopen(FILE_NAME)
a = []
STDIN.each_char do |c|
a << c.to_i
end
a
end
def each_char_full_array
STDIN.reopen(FILE_NAME)
a = Array.new(MAX_ITERATIONS)
idx = 0
STDIN.each_char do |c|
a[idx] = c.to_i
idx += 1
end
a
end
def map_byte()
STDIN.reopen(FILE_NAME)
a = STDIN.bytes.map { |c| c-48 }
a[-1] == -38 && a.pop
a
end
def gets_scan
STDIN.reopen(FILE_NAME)
gets.scan(/\d/).map(&:to_i)
end
def unpack
STDIN.reopen(FILE_NAME)
gets.unpack("c*").map { |c| c-48}
end
reps = 10
build_test_file
Benchmark.bm(10) do |x|
x.report("each_char_full_array: ") { reps.times {|y| each_char_full_array}}
x.report("each_char_empty_array:") { reps.times {|y| each_char_empty_array}}
x.report("map_byte: ") { reps.times {|y| map_byte}}
x.report("gets_scan ") { reps.times {|y| gets_scan}}
x.report("unpack: ") { reps.times {|y| unpack}}
end

This should be reasonably fast:
a = []
STDIN.each_char do |c|
a << c.to_i
end
although some rough benchmarking shows this hackish version is considerably faster:
a = STDIN.bytes.map { |c| c-48 }

scan(/\d/).map(&:to_i)
This will split any string into an array of integers, ignoring any non-numeric characters. If you want to grab user input from STDIN add gets:
gets.scan(/\d/).map(&:to_i)

Related

Retrieving from an array an object that satisfies some characteristics

I have some objects in an array objects. Given a certain property-value pair, I need a function that returns the first object that matches this. For example, given objects.byName "John", it should return the first object with name: "John".
Currently I'm doing this:
def self.byName name
ID_obj_by_name = {}
##objects.each_with_index do |o, index|
ID_obj_by_name[o.name] = index
end
##objects[ID_obj_by_name[name]]
end
But it seems very slow, and is using a lot of memory. How can I improve this?

If you need performance, you should consider this approach:
require 'benchmark'
class Foo
def initialize(name)
#name = name
end
def name
#name
end
end
# Using array ######################################################################
test = []
500000.times do |i|
test << Foo.new("ABC" + i.to_s + "#!###!DS")
end
puts "using array"
time = Benchmark.measure {
result = test.find { |o| o.name == "ABC250000#!###!DS" }
}
puts time
####################################################################################
# Using a hash #####################################################################
test = {}
i_am_your_object = Object.new
500000.times do |i|
test["ABC" + i.to_s + "#!###!DS"] = i_am_your_object
end
puts "using hash"
time = Benchmark.measure {
result = test["ABC250000#!###!DS"]
}
puts time
####################################################################################
Results:
using array
0.060000 0.000000 0.060000 ( 0.060884)
using hash
0.000000 0.000000 0.000000 ( 0.000005)

Try something like
def self.by_name name
##objects.find { |o| o.name == name }
end

Ruby chunk and compare two large files

Looking for direction on how to chunk and compare two large text files using ruby. Any help is appreciated. Something like 100 lines at a time.
tried:
file(file1).foreach.each_slice(100) do |lines|
pp lines
end
getting confused how to include the second file to this loop.

CHUNK_SIZE = 256 # bytes
def same? path1, path2
return false unless [path1, path2].map { |f| File.size f }.reduce &:==
f1, f2 = [path1, path2].map { |f| File.new f }
loop do
s1, s2 = [f1, f2].map { |f| f.read(CHUNK_SIZE) }
break false if s1 != s2
break true if s1.nil? || s1.length < CHUNK_SIZE
end
ensure
[f1, f2].each &:close
end
UPD: credits for fixed typo and file size comparison goes to #tadman.

Just "Process two files at the same time in Ruby" and compare by chunks, like this:
f1 = File.open('file1.txt', 'r')
f2 = File.open('file2.txt', 'r')
f1.each_slice(10).zip(f2.each_slice(10)).each do |line1, line2|
return false unless line1 == line2
end
return true
Or, as suggested by #meagar (in this case line by line):
f1.each_line.zip(f2.each_line).all? { |a,b| a == b }
This will return true if files identical.

Just compare those files line by line:
def same_file?(path1, path2)
file1 = File.open(path1)
file2 = File.open(path2)
return true if File.absolute_path(path1) == File.absolute_path(path2)
return false unless file1.size == file2.size
enum1 = file1.each
enum2 = file2.each
loop do
# It's a mystery that the loop really ends
# when any of the 2 files has nothing to read
return false unless enum1.next == enum2.next
end
return true
ensure
file1.close
file2.close
end
I did my homework and found in the Kernel#loop documentation:
StopIteration raised in the block breaks the loop. In this case, loop returns the "result" value stored in the exception.
And, in the Enumerator#next documentation:
When the position reached at the end, StopIteration is raised.
So the mystery is no longer a mystery for me.

Here's another one, the approach is similar to mudasobwa's answer:
def same?(file_1, file_2)
return true if File.identical?(file_1, file_2)
return false unless File.size(file_1) == File.size(file_2)
buf_size = 2 ** 15 # 32 K
buf_1 = ''
buf_2 = ''
File.open(file_1) do |f1|
File.open(file_2) do |f2|
while f1.read(buf_size, buf_1) && f2.read(buf_size, buf_2)
return false unless buf_1 == buf_2
end
end
end
true
end
In the first two lines perform quick checks for identical files (e.g. hard and soft links) and same size using File.identical? and File.size.
File.open opens each file in read-only mode. The while loop then keeps calling read to read 32K chunks from each file into the buffers buf_1 and buf_2 until EOF. If the buffers differ, false is returned. Otherwise, i.e. without encountering any differences, true is returned.

To determine if two files have the exact same content, without comparing the actual content of the same chunk of each file, you can use a checksum function that turns the data into a hash string in a deterministic way. And while you have to read the contents to checksum it, you can get checksums for each slice, and end up with an array of checksums for each file.
You can then compare the collection of checksums. If the two files have the exact same content, the two collections will be equal.
require 'digest/md5'
hashes1 = File.foreach('./path_to_file').each_slice(100).map do |slice|
Digest::MD5.hexdigest(slice)
end
hashes2 = File.read('./path_to_duplicate').each_slice(100).map do |slice|
Digest::MD5.hexdigest(slice)
end
hashes1.join == hashes2.join
#=> true, meaning the two files contain the same content

Benchmark time
(Matt's answer is not included because I couldn't get it working)
Results 1 KB file size (N = 10000)
user system total real
aetherus 0.510000 0.300000 0.810000 ( 0.823201)
meagar 0.350000 0.160000 0.510000 ( 0.512755)
mudasobwa 0.290000 0.200000 0.490000 ( 0.500831)
stefan 0.150000 0.160000 0.310000 ( 0.312743)
yevgeniy_anfilofyev 0.320000 0.170000 0.490000 ( 0.497157)
Results 1 MB file size (N = 100)
user system total real
aetherus 1.540000 0.110000 1.650000 ( 1.667937)
meagar 1.170000 0.130000 1.300000 ( 1.310278)
mudasobwa 1.470000 0.830000 2.300000 ( 2.313481)
stefan 0.010000 0.030000 0.040000 ( 0.045577)
yevgeniy_anfilofyev 0.570000 0.100000 0.670000 ( 0.677226)
Results 1 GB file size (N = 1)
user system total real
aetherus 15.570000 0.920000 16.490000 ( 16.525826)
meagar 24.170000 1.910000 26.080000 ( 26.190057)
mudasobwa 16.260000 8.160000 24.420000 ( 24.471977)
stefan 0.120000 0.330000 0.450000 ( 0.443074)
yevgeniy_anfilofyev 12.940000 1.310000 14.250000 ( 14.295736)
Notes
mudasobwa's code runs significantly faster with larger CHUNK_SIZE
with identical chunk sizes, stefan's code seems to be ~2x faster than mudasobwa's code
"fastest" chunk size is somewhere between 16 K and 512 K
I couldn't use fruity because the 1 GB test would have taken too long
Code
def aetherus_same?(f1, f2)
enum1 = f1.each
enum2 = f2.each
loop do
return false unless enum1.next == enum2.next
end
return true
end
def meagar_same?(f1, f2)
f1.each_line.zip(f2.each_line).all? { |a,b| a == b }
end
CHUNK_SIZE = 256 # bytes
def mudasobwa_same?(f1, f2)
loop do
s1, s2 = [f1, f2].map { |f| f.read(CHUNK_SIZE) }
break false if s1 != s2
break true if s1.nil? || s1.length < CHUNK_SIZE
end
end
def stefan_same?(f1, f2)
buf_size = 2 ** 15 # 32 K
buf_1 = ''
buf_2 = ''
while f1.read(buf_size, buf_1) && f2.read(buf_size, buf_2)
return false unless buf_1 == buf_2
end
true
end
def yevgeniy_anfilofyev_same?(f1, f2)
f1.each_slice(10).zip(f2.each_slice(10)).each do |line1, line2|
return false unless line1 == line2
end
return true
end
FILE1 = ARGV[0]
FILE2 = ARGV[1]
N = ARGV[2].to_i
def with_files
File.open(FILE1) { |f1| File.open(FILE2) { |f2| yield f1, f2 } }
end
require 'benchmark'
Benchmark.bm(19) do |x|
x.report('aetherus') { N.times { with_files { |f1, f2| aetherus_same?(f1, f2) } } }
x.report('meagar') { N.times { with_files { |f1, f2| meagar_same?(f1, f2) } } }
x.report('mudasobwa') { N.times { with_files { |f1, f2| mudasobwa_same?(f1, f2) } } }
x.report('stefan') { N.times { with_files { |f1, f2| stefan_same?(f1, f2) } } }
x.report('yevgeniy_anfilofyev') { N.times { with_files { |f1, f2| yevgeniy_anfilofyev_same?(f1, f2) } } }
end

Replace keys from the hash by values from the hash

I have a hash (with hundreds of pairs) and I have a string.
I want to replace in this string all occurrences of keys from the hash to according values from the hash.
I understand that I can do something like this
some_hash.each { |key, value| str = str.gsub(key, value) }
However, I am wondering whether there is some better (performance wise) method to do this.

You only need to run gsub once. Since regex (oniguruma) is implemented in C, it should be faster than looping within Ruby.
some_hash = {
"a" => "A",
"b" => "B",
"c" => "C",
}
"abcdefgabcdefg".gsub(Regexp.union(some_hash.keys), some_hash)
# => "ABCdefgABCdefg"

Some benchmarks:
require 'benchmark'
SOME_HASH = Hash[('a'..'z').zip('A'..'Z')]
SOME_REGEX = Regexp.union(SOME_HASH.keys)
SHORT_STRING = ('a'..'z').to_a.join
LONG_STRING = SHORT_STRING * 100
N = 10_000
def sub1(str)
SOME_HASH.each { |key, value|
str = str.gsub(key, value)
}
str
end
def sub2(str)
SOME_HASH.each { |key, value|
str.gsub!(key, value)
}
str
end
def sub_regex(str)
str.gsub(SOME_REGEX, SOME_HASH)
end
puts RUBY_VERSION
puts "#{ N } loops"
puts
puts "sub1: #{ sub1(SHORT_STRING) }"
puts "sub2: #{ sub2(SHORT_STRING) }"
puts "sub_regex: #{ sub_regex(SHORT_STRING) }"
puts
Benchmark.bm(10) do |b|
b.report('gsub') { N.times { sub1(LONG_STRING) } }
b.report('gsub!') { N.times { sub2(LONG_STRING) } }
b.report('regex') { N.times { sub_regex(LONG_STRING) } }
end
Which outputs:
1.9.3
10000 loops
sub1: ABCDEFGHIJKLMNOPQRSTUVWXYZ
sub2: ABCDEFGHIJKLMNOPQRSTUVWXYZ
sub_regex: ABCDEFGHIJKLMNOPQRSTUVWXYZ
user system total real
gsub 14.360000 0.030000 14.390000 ( 14.412178)
gsub! 1.940000 0.010000 1.950000 ( 1.957591)
regex 0.080000 0.000000 0.080000 ( 0.075038)

Convert Input Value to Integer or Float, as Appropriate Using Ruby

I believe I have a good answer to this issue, but I wanted to make sure ruby-philes didn't have a much better way to do this.
Basically, given an input string, I would like to convert the string to an integer, where appropriate, or a float, where appropriate. Otherwise, just return the string.
I'll post my answer below, but I'd like to know if there is a better way out there.
Ex:
to_f_or_i_or_s("0523.49") #=> 523.49
to_f_or_i_or_s("0000029") #=> 29
to_f_or_i_or_s("kittens") #=> "kittens"

I would avoid using regex whenever possible in Ruby. It's notoriously slow.
def to_f_or_i_or_s(v)
((float = Float(v)) && (float % 1.0 == 0) ? float.to_i : float) rescue v
end
# Proof of Ruby's slow regex
def regex_float_detection(input)
input.match('\.')
end
def math_float_detection(input)
input % 1.0 == 0
end
n = 100_000
Benchmark.bm(30) do |x|
x.report("Regex") { n.times { regex_float_detection("1.1") } }
x.report("Math") { n.times { math_float_detection(1.1) } }
end
# user system total real
# Regex 0.180000 0.000000 0.180000 ( 0.181268)
# Math 0.050000 0.000000 0.050000 ( 0.048692)
A more comprehensive benchmark:
def wattsinabox(input)
input.match('\.').nil? ? Integer(input) : Float(input) rescue input.to_s
end
def jaredonline(input)
((float = Float(input)) && (float % 1.0 == 0) ? float.to_i : float) rescue input
end
def muistooshort(input)
case(input)
when /\A\s*[+-]?\d+\.\d+\z/
input.to_f
when /\A\s*[+-]?\d+(\.\d+)?[eE]\d+\z/
input.to_f
when /\A\s*[+-]?\d+\z/
input.to_i
else
input
end
end
n = 1_000_000
Benchmark.bm(30) do |x|
x.report("wattsinabox") { n.times { wattsinabox("1.1") } }
x.report("jaredonline") { n.times { jaredonline("1.1") } }
x.report("muistooshort") { n.times { muistooshort("1.1") } }
end
# user system total real
# wattsinabox 3.600000 0.020000 3.620000 ( 3.647055)
# jaredonline 1.400000 0.000000 1.400000 ( 1.413660)
# muistooshort 2.790000 0.010000 2.800000 ( 2.803939)

def to_f_or_i_or_s(v)
v.match('\.').nil? ? Integer(v) : Float(v) rescue v.to_s
end

A pile of regexes might be a good idea if you want to handle numbers in scientific notation (which String#to_f does):
def to_f_or_i_or_s(v)
case(v)
when /\A\s*[+-]?\d+\.\d+\z/
v.to_f
when /\A\s*[+-]?\d+(\.\d+)?[eE]\d+\z/
v.to_f
when /\A\s*[+-]?\d+\z/
v.to_i
else
v
end
end
You could mash both to_f cases into one regex if you wanted.
This will, of course, fail when fed '3,14159' in a locale that uses a comma as a decimal separator.

Depends on security requirements.
def to_f_or_i_or_s s
eval(s) rescue s
end

I used this method
def to_f_or_i_or_s(value)
return value if value[/[a-zA-Z]/]
i = value.to_i
f = value.to_f
i == f ? i : f
end

CSV has converters which do this.
require "csv"
strings = ["0523.49", "29","kittens"]
strings.each{|s|p s.parse_csv(converters: :numeric).first}
#523.49
#29
#"kittens"
However for some reason it converts "00029" to a float.

What is the easiest way to remove the first character from a string?

Example:
[12,23,987,43
What is the fastest, most efficient way to remove the "[",
using maybe a chop() but for the first character?

Similar to Pablo's answer above, but a shade cleaner :
str[1..-1]
Will return the array from 1 to the last character.
'Hello World'[1..-1]
=> "ello World"

I kind of favor using something like:
asdf = "[12,23,987,43"
asdf[0] = ''
p asdf
# >> "12,23,987,43"
I'm always looking for the fastest and most readable way of doing things:
require 'benchmark'
N = 1_000_000
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
end
Running on my Mac Pro:
1.9.3
user system total real
[0] 0.840000 0.000000 0.840000 ( 0.847496)
sub 1.960000 0.010000 1.970000 ( 1.962767)
gsub 4.350000 0.020000 4.370000 ( 4.372801)
[1..-1] 0.710000 0.000000 0.710000 ( 0.713366)
slice 1.020000 0.000000 1.020000 ( 1.020336)
length 1.160000 0.000000 1.160000 ( 1.157882)
Updating to incorporate one more suggested answer:
require 'benchmark'
N = 1_000_000
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
b.report('eat!') { N.times { "[12,23,987,43".eat! } }
b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
2.1.2
user system total real
[0] 0.300000 0.000000 0.300000 ( 0.295054)
sub 0.630000 0.000000 0.630000 ( 0.631870)
gsub 2.090000 0.000000 2.090000 ( 2.094368)
[1..-1] 0.230000 0.010000 0.240000 ( 0.232846)
slice 0.320000 0.000000 0.320000 ( 0.320714)
length 0.340000 0.000000 0.340000 ( 0.341918)
eat! 0.460000 0.000000 0.460000 ( 0.452724)
reverse 0.400000 0.000000 0.400000 ( 0.399465)
And another using /^./ to find the first character:
require 'benchmark'
N = 1_000_000
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
b.report('[/^./]') { N.times { "[12,23,987,43"[/^./] = '' } }
b.report('[/^\[/]') { N.times { "[12,23,987,43"[/^\[/] = '' } }
b.report('sub+') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
b.report('sub') { N.times { "[12,23,987,43".sub(/^\[/, "") } }
b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
b.report('eat!') { N.times { "[12,23,987,43".eat! } }
b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
# >> 2.1.5
# >> user system total real
# >> [0] 0.270000 0.000000 0.270000 ( 0.270165)
# >> [/^./] 0.430000 0.000000 0.430000 ( 0.432417)
# >> [/^\[/] 0.460000 0.000000 0.460000 ( 0.458221)
# >> sub+ 0.590000 0.000000 0.590000 ( 0.590284)
# >> sub 0.590000 0.000000 0.590000 ( 0.596366)
# >> gsub 1.880000 0.010000 1.890000 ( 1.885892)
# >> [1..-1] 0.230000 0.000000 0.230000 ( 0.223045)
# >> slice 0.300000 0.000000 0.300000 ( 0.299175)
# >> length 0.320000 0.000000 0.320000 ( 0.325841)
# >> eat! 0.410000 0.000000 0.410000 ( 0.409306)
# >> reverse 0.390000 0.000000 0.390000 ( 0.393044)
Here's another update on faster hardware and a newer version of Ruby:
2.3.1
user system total real
[0] 0.200000 0.000000 0.200000 ( 0.204307)
[/^./] 0.390000 0.000000 0.390000 ( 0.387527)
[/^\[/] 0.360000 0.000000 0.360000 ( 0.360400)
sub+ 0.490000 0.000000 0.490000 ( 0.492083)
sub 0.480000 0.000000 0.480000 ( 0.487862)
gsub 1.990000 0.000000 1.990000 ( 1.988716)
[1..-1] 0.180000 0.000000 0.180000 ( 0.181673)
slice 0.260000 0.000000 0.260000 ( 0.266371)
length 0.270000 0.000000 0.270000 ( 0.267651)
eat! 0.400000 0.010000 0.410000 ( 0.398093)
reverse 0.340000 0.000000 0.340000 ( 0.344077)
Why is gsub so slow?
After doing a search/replace, gsub has to check for possible additional matches before it can tell if it's finished. sub only does one and finishes. Consider gsub like it's a minimum of two sub calls.
Also, it's important to remember that gsub, and sub can also be handicapped by poorly written regex which match much more slowly than a sub-string search. If possible anchor the regex to get the most speed from it. There are answers here on Stack Overflow demonstrating that so search around if you want more information.

We can use slice to do this:
val = "abc"
=> "abc"
val.slice!(0)
=> "a"
val
=> "bc"
Using slice! we can delete any character by specifying its index.

Ruby 2.5+
As of Ruby 2.5 you can use delete_prefix or delete_prefix! to achieve this in a readable manner.
In this case "[12,23,987,43".delete_prefix("[").
More info here:
Official docs
https://blog.jetbrains.com/ruby/2017/10/10-new-features-in-ruby-2-5/
https://bugs.ruby-lang.org/issues/12694
'invisible'.delete_prefix('in') #=> "visible"
'pink'.delete_prefix('in') #=> "pink"
N.B. you can also use this to remove items from the end of a string with delete_suffix and delete_suffix!
'worked'.delete_suffix('ed') #=> "work"
'medical'.delete_suffix('ed') #=> "medical"
Docs
https://bugs.ruby-lang.org/issues/13665
Edit:
Using the Tin Man's benchmark setup, it looks pretty quick too (under the last two entries delete_p and delete_p!). Doesn't quite pip the previous faves for speed, though is very readable.
2.5.0
user system total real
[0] 0.174766 0.000489 0.175255 ( 0.180207)
[/^./] 0.318038 0.000510 0.318548 ( 0.323679)
[/^\[/] 0.372645 0.001134 0.373779 ( 0.379029)
sub+ 0.460295 0.001510 0.461805 ( 0.467279)
sub 0.498351 0.001534 0.499885 ( 0.505729)
gsub 1.669837 0.005141 1.674978 ( 1.682853)
[1..-1] 0.199840 0.000976 0.200816 ( 0.205889)
slice 0.279661 0.000859 0.280520 ( 0.285661)
length 0.268362 0.000310 0.268672 ( 0.273829)
eat! 0.341715 0.000524 0.342239 ( 0.347097)
reverse 0.335301 0.000588 0.335889 ( 0.340965)
delete_p 0.222297 0.000832 0.223129 ( 0.228455)
delete_p! 0.225798 0.000747 0.226545 ( 0.231745)

I prefer this:
str = "[12,23,987,43"
puts str[1..-1]
>> 12,23,987,43

If you always want to strip leading brackets:
"[12,23,987,43".gsub(/^\[/, "")
If you just want to remove the first character, and you know it won't be in a multibyte character set:
"[12,23,987,43"[1..-1]
or
"[12,23,987,43".slice(1..-1)

Inefficient alternative:
str.reverse.chop.reverse

For example : a = "One Two Three"
1.9.2-p290 > a = "One Two Three"
=> "One Two Three"
1.9.2-p290 > a = a[1..-1]
=> "ne Two Three"
1.9.2-p290 > a = a[1..-1]
=> "e Two Three"
1.9.2-p290 > a = a[1..-1]
=> " Two Three"
1.9.2-p290 > a = a[1..-1]
=> "Two Three"
1.9.2-p290 > a = a[1..-1]
=> "wo Three"
In this way you can remove one by one first character of the string.

Easy way:
str = "[12,23,987,43"
removed = str[1..str.length]
Awesome way:
class String
def reverse_chop()
self[1..self.length]
end
end
"[12,23,987,43".reverse_chop()
(Note: prefer the easy way :) )

Thanks to #the-tin-man for putting together the benchmarks!
Alas, I don't really like any of those solutions. Either they require an extra step to get the result ([0] = '', .strip!) or they aren't very semantic/clear about what's happening ([1..-1]: "Um, a range from 1 to negative 1? Yearg?"), or they are slow or lengthy to write out (.gsub, .length).
What we are attempting is a 'shift' (in Array parlance), but returning the remaining characters, rather than what was shifted off. Let's use our Ruby to make this possible with strings! We can use the speedy bracket operation, but give it a good name, and take an arg to specify how much we want to chomp off the front:
class String
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
But there is more we can do with that speedy-but-unwieldy bracket operation. While we are at it, for completeness, let's write a #shift and #first for String (why should Array have all the fun‽‽), taking an arg to specify how many characters we want to remove from the beginning:
class String
def first(how_many = 1)
self[0...how_many]
end
def shift(how_many = 1)
shifted = first(how_many)
self.replace self[how_many..-1]
shifted
end
alias_method :shift!, :shift
end
Ok, now we have a good clear way of pulling characters off the front of a string, with a method that is consistent with Array#first and Array#shift (which really should be a bang method??). And we can easily get the modified string as well with #eat!. Hm, should we share our new eat!ing power with Array? Why not!
class Array
def eat!(how_many = 1)
self.replace self[how_many..-1]
end
end
Now we can:
> str = "[12,23,987,43" #=> "[12,23,987,43"
> str.eat! #=> "12,23,987,43"
> str #=> "12,23,987,43"
> str.eat!(3) #=> "23,987,43"
> str #=> "23,987,43"
> str.first(2) #=> "23"
> str #=> "23,987,43"
> str.shift!(3) #=> "23,"
> str #=> "987,43"
> arr = [1,2,3,4,5] #=> [1, 2, 3, 4, 5]
> arr.eat! #=> [2, 3, 4, 5]
> arr #=> [2, 3, 4, 5]
That's better!

str = "[12,23,987,43"
str[0] = ""

class String
def bye_felicia()
felicia = self.strip[0] #first char, not first space.
self.sub(felicia, '')
end
end

Using regex:
str = 'string'
n = 1 #to remove first n characters
str[/.{#{str.size-n}}\z/] #=> "tring"

I find a nice solution to be str.delete(str[0]) for its readability, though I cannot attest to it's performance.

list = [1,2,3,4]
list.drop(1)
# => [2,3,4]
List drops one or more elements from the start of the array, does not mutate the array, and returns the array itself instead of the dropped element.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

ruby fast reading from std - ruby

What is the fastest way to read from STDIN a number of 1000000 characters (integers), and split it into an array of one character integers (not strings) ? 123456 > [1,2,3,4,5,6]

This should be reasonably fast: a = [] STDIN.each_char do |c| a << c.to_i end although some rough benchmarking shows this hackish version is considerably faster: a = STDIN.bytes.map { |c| c-48 }

scan(/\d/).map(&:to_i) This will split any string into an array of integers, ignoring any non-numeric characters. If you want to grab user input from STDIN add gets: gets.scan(/\d/).map(&:to_i)

Related

Retrieving from an array an object that satisfies some characteristics

Ruby chunk and compare two large files

Replace keys from the hash by values from the hash

Convert Input Value to Integer or Float, as Appropriate Using Ruby

What is the easiest way to remove the first character from a string?

Categories

Resources