What does "nil#" mean in profiler results? - ruby

Here is a very simplified part of the code:
srand 0
WIDTH, HEIGHT = 10, 10
array = Array.new(HEIGHT){ [0]*WIDTH }
require "profile"
10000.times do
  y, x = rand(HEIGHT), rand(WIDTH)
  g = array[y][x] + [-1,+1].sample
  array[y][x] = g unless [[y-1,x],[y+1,x],[y,x-1],[y,x+1]].any?{ |y, x|
    y>=0 && y<HEIGHT && x>=0 && x<WIDTH && 1 < (array[y][x] - g).abs
  }
end
We see (ruby 2.0.0p451 (2014-02-24) [i386-mingw32]):
% cumulative self self total
time seconds seconds calls ms/call ms/call name
39.33 0.86 0.86 47471 0.02 0.05 nil#
If we remove the unless clause entirely:
53.48 0.48 0.48 10000 0.05 0.09 nil#
It looks like we get this nil# operation:
on every comparison or boolean operation?
on every Array object creation or block invocation?
It would be nice to get an in-depth answer.

Related

How can I create a specific time interval in Ruby?

What I have tried so far ...
start_hour = 7
start_minute = 0 * 0.01
end_hour = 17
end_minute = 45 * 0.01
step_time = 25
start_time = start_hour + start_minute
end_time = end_hour + end_minute
if step_time > 59
  step_time = 1 if step_time == 60
  step_time = 1.3 if step_time == 90
  step_time = 2 if step_time == 120
else
  step_time *= 0.01
end
hours = []
(start_time..end_time).step(step_time).map do |x|
  next if (x-x.to_i) > 0.55
  hours << '%0.2f' % x.round(2).to_s
end
puts hours
If I enter the step interval 0, 5, 10, 20, I can get the time interval I want. But if I enter 15, 25, 90, I can't get the right range.
You currently have:
end_hour = 17
end_minute = 45 * 0.01
end_time = end_hour + end_minute
#=> 17.45
Although 17.45 looks like the correct value, it isn't. 45 minutes is 3 quarters (or 75%) of an hour, so the correct decimal value is 17.75.
You could change your code accordingly, but working with decimal hours is a bit strange. It's much easier to just work with minutes. Instead of turning the minutes into hours, you turn the hours into minutes:
start_hour = 7
start_minute = 0
start_time = start_hour * 60 + start_minute
#=> 420
end_hour = 17
end_minute = 45
end_time = end_hour * 60 + end_minute
#=> 1065
The total number of minutes can easily be converted back to hour-minute pairs via divmod:
420.divmod(60) #=> [7, 0]
1065.divmod(60) #=> [17, 45]
Using the above, we can traverse the range without having to convert the step interval:
def hours(start_time, end_time, step_time)
  (start_time..end_time).step(step_time).map do |x|
    '%02d:%02d' % x.divmod(60)
  end
end
hours(start_time, end_time, 25)
#=> ["07:00", "07:25", "07:50", "08:15", "08:40", "09:05", "09:30", "09:55",
# "10:20", "10:45", "11:10", "11:35", "12:00", "12:25", "12:50", "13:15",
# "13:40", "14:05", "14:30", "14:55", "15:20", "15:45", "16:10", "16:35",
# "17:00", "17:25"]
hours(start_time, end_time, 90)
#=> ["07:00", "08:30", "10:00", "11:30", "13:00", "14:30", "16:00", "17:30"]
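If the inputs arrive as "HH:MM" strings rather than separate hour and minute values, the same minutes-based approach still applies. The following is a minimal sketch and not part of the answer above; the helper names to_minutes, to_clock and time_slots are hypothetical, and it assumes well-formed, zero-padded input:

```ruby
# Hypothetical helper: "07:30" -> 450 (minutes since midnight).
# Base 10 is passed explicitly so that "08" and "09" are not read as octal.
def to_minutes(clock)
  hours, minutes = clock.split(':').map { |part| Integer(part, 10) }
  hours * 60 + minutes
end

# Hypothetical helper: 450 -> "07:30".
def to_clock(minutes)
  format('%02d:%02d', *minutes.divmod(60))
end

# Hypothetical helper: all slots from `from` to `to`, `step` minutes apart.
def time_slots(from, to, step)
  (to_minutes(from)..to_minutes(to)).step(step).map { |m| to_clock(m) }
end

puts time_slots('07:00', '17:45', 90).inspect
```

Because everything is an integer number of minutes, no special-casing of steps such as 60, 90 or 120 is needed.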

Is my Theano program actually using the GPU?

Theano claims it's using the GPU; it says what device when it starts up, etc. Furthermore nvidia-smi says it's being used.
But the running time seems to be exactly the same regardless of whether or not I use it.
Could it have something to do with integer arithmetic?
import sys
import numpy as np
import theano
import theano.tensor as T
def ariths(v, ub):
    """Given a sorted vector v and scalar ub, returns multiples of elements in v.
    Specifically, returns a vector containing all numbers j * k < ub where j is in
    v and k >= j. Some elements may occur more than once in the output.
    """
    lp = v[0]
    v = T.shape_padright(v)
    a = T.shape_padleft(T.arange(0, (ub + lp - 1) // lp - lp, 1, 'int64'))
    res = v * (a + v)
    return res[(res < ub).nonzero()]

def filter_composites(pv, using_primes):
    a = ariths(using_primes, pv.size)
    return T.set_subtensor(pv[a], 0)

def _iterfn(prev_bnds, pv):
    bstart = prev_bnds[0]
    bend = prev_bnds[1]
    use_primes = pv[bstart:bend].nonzero()[0] + bstart
    pv = filter_composites(pv, use_primes)
    return pv

def primes_to(n):
    if n <= 2:
        return np.asarray([])
    elif n <= 3:
        return np.asarray([2])
    res = T.ones(n, 'int8')
    res = T.set_subtensor(res[:2], 0)
    ubs = [[2, 4]]
    ub = 4
    while ub ** 2 < n:
        prevub = ub
        ub *= 2
        ubs.append([prevub, ub])
    (r, u5) = theano.scan(fn=_iterfn,
                          outputs_info=res, sequences=[np.asarray(ubs)])
    return r[-1].nonzero()[0]

def main(n):
    print(primes_to(n).size.eval())

if __name__ == '__main__':
    main(int(sys.argv[1]))
The answer is yes. And no. If you profile your code in a GPU enabled Theano installation using nvprof, you will see something like this:
==16540== Profiling application: python ./theano_test.py
==16540== Profiling result:
Time(%) Time Calls Avg Min Max Name
49.22% 12.096us 1 12.096us 12.096us 12.096us kernel_reduce_ccontig_node_c8d7bd33dfef61705c2854dd1f0cb7ce_0(unsigned int, float const *, float*)
30.60% 7.5200us 3 2.5060us 832ns 5.7600us [CUDA memcpy HtoD]
13.93% 3.4240us 1 3.4240us 3.4240us 3.4240us [CUDA memset]
6.25% 1.5350us 1 1.5350us 1.5350us 1.5350us [CUDA memcpy DtoH]
That is, there is at least one reduce operation being performed on your GPU. However, if you modify your main like this:
def main():
    n = 100000000
    print(primes_to(n).size.eval())

if __name__ == '__main__':
    import cProfile, pstats
    cProfile.run("main()", "{}.profile".format(__file__))
    s = pstats.Stats("{}.profile".format(__file__))
    s.strip_dirs()
    s.sort_stats("time").print_stats(10)
and use cProfile to profile your code, you will see something like this:
Thu Mar 10 14:35:24 2016 ./theano_test.py.profile
486743 function calls (480590 primitive calls) in 17.444 seconds
Ordered by: internal time
List reduced from 1138 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function)
1 6.376 6.376 16.655 16.655 {theano.scan_module.scan_perform.perform}
13 6.168 0.474 6.168 0.474 subtensor.py:2084(perform)
27 2.910 0.108 2.910 0.108 {method 'nonzero' of 'numpy.ndarray' objects}
30 0.852 0.028 0.852 0.028 {numpy.core.multiarray.concatenate}
27 0.711 0.026 0.711 0.026 {method 'astype' of 'numpy.ndarray' objects}
13 0.072 0.006 0.072 0.006 {numpy.core.multiarray.arange}
1 0.034 0.034 17.142 17.142 function_module.py:482(__call__)
387 0.020 0.000 0.052 0.000 graph.py:486(stack_search)
77 0.016 0.000 10.731 0.139 op.py:767(rval)
316 0.013 0.000 0.066 0.000 graph.py:715(general_toposort)
The slowest operation (just) is the scan call, and looking at the source for scan, you can see that presently, GPU execution of scan is disabled.
So the answer is: yes, the GPU is being used for something in your code, but no, the most time-consuming operations are being run on the CPU, because GPU execution of scan appears to be hard-disabled in the code at present.

sub-millisecond time measurement in Ruby

I'm running Ruby 2.2 in Rubymine 8.0.3
My machine is running Windows 7 Pro with an Intel core i7-4710MQ
I've been able to achieve ~411 ns precision with C++, Java, Python and JS on this machine, but can't seem to find a way to attain this performance in Ruby, as the built in Time library is good for ms only.
I can program my tests to tolerate this reduced precision, but is it possible to incorporate the windows QPC API for improved evaluation of execution time?
My test code for determining clock tick precision is below:
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
  times[i] = Time.new
end
durations = []
(0...(numTimes - 1)).each do |i|
  durations[i] = times[i+1] - times[i]
end
# Output duration only if the clock ticked over
durations.each do |duration|
  if duration != 0
    p duration.to_s + ','
  end
end
The below code incorporates the QPC as found here
require "Win32API"
QueryPerformanceCounter = Win32API.new("kernel32",
  "QueryPerformanceCounter", 'P', 'I')
QueryPerformanceFrequency = Win32API.new("kernel32",
  "QueryPerformanceFrequency", 'P', 'I')
def get_ticks
  tick = ' ' * 8
  get_ticks = QueryPerformanceCounter.call(tick)
  tick.unpack('q')[0]
end
def get_freq
  freq = ' ' * 8
  get_freq = QueryPerformanceFrequency.call(freq)
  freq.unpack('q')[0]
end
def get_time_diff(a, b)
  # This function takes two QPC ticks
  (b - a).abs.to_f / (get_freq)
end
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
  times[i] = get_ticks
end
durations = []
(0...(numTimes - 1)).each do |i|
  durations[i] = get_time_diff(times[i+1], times[i])
end
durations.each do |duration|
  p (duration * 1000000000).to_s + ','
end
This code returns durations between ticks of ~22-75 microseconds on my machine
You can get higher precision by using Process::clock_gettime:
Returns a time returned by POSIX clock_gettime() function.
Here's an example with Time.now
times = Array.new(1000) { Time.now }
durations = times.each_cons(2).map { |a, b| b - a }
durations.sort.group_by(&:itself).each do |time, elements|
  printf("%5d ns x %d\n", time * 1_000_000_000, elements.count)
end
Output:
0 ns x 686
1000 ns x 296
2000 ns x 12
3000 ns x 2
12000 ns x 2
18000 ns x 1
And here's the same example with Process.clock_gettime:
times = Array.new(1000) { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
Output:
163 ns x 1
164 ns x 1
164 ns x 9
165 ns x 6
165 ns x 22
166 ns x 39
166 ns x 174
167 ns x 13
167 ns x 129
168 ns x 95
168 ns x 32
169 ns x 203
169 ns x 141
170 ns x 23
170 ns x 37
171 ns x 30
171 ns x 3
172 ns x 24
172 ns x 10
174 ns x 1
175 ns x 2
180 ns x 1
194 ns x 1
273 ns x 1
2565 ns x 1
And here's a quick side-by-side comparison:
array = Array.new(12) { [Time.now, Process.clock_gettime(Process::CLOCK_MONOTONIC)] }
array.shift(2) # first elements are always inaccurate
base_t, base_p = array.first # baseline
printf("%-11.11s %-11.11s\n", 'Time.now', 'Process.clock_gettime')
array.each do |t, p|
  printf("%.9f %.9f\n", t - base_t, p - base_p)
end
Output:
Time.now Process.clo
0.000000000 0.000000000
0.000000000 0.000000495
0.000001000 0.000000985
0.000001000 0.000001472
0.000002000 0.000001960
0.000002000 0.000002448
0.000003000 0.000002937
0.000003000 0.000003425
0.000004000 0.000003914
0.000004000 0.000004403
This is Ruby 2.3 on OS X running on an Intel Core i7, not sure about Windows.
To avoid precision loss due to floating point conversion, you can specify another unit, e.g.:
Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
#=> 191519383463873
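To show how the integer nanosecond unit can be used for measuring a piece of code, here is a minimal sketch. The helper name elapsed_ns is hypothetical and the workload in the block is only a placeholder; the actual resolution depends on the platform and Ruby version, so treat the printed value as indicative only:

```ruby
# Hypothetical helper: runs the given block and returns the elapsed
# time as an Integer number of nanoseconds (no Float conversion involved).
def elapsed_ns
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond) - start
end

# Placeholder workload, chosen only for illustration.
ns = elapsed_ns { 1000.times { |i| i * i } }
puts "#{ns} ns"
```

Using the monotonic clock rather than Time.now also avoids negative durations if the system clock is adjusted during the measurement.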
Time#nsec:
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
  # nsec ⇓⇓⇓⇓
  times[i] = Time.new.nsec
end
durations = (0...(numTimes - 1)).inject([]) do |memo, i|
  memo << times[i+1] - times[i]
end
puts durations.reject(&:zero?).join $/
Ruby Time objects store the number of nanoseconds since the epoch.
Since Ruby 1.9.2, Time implementation uses a signed 63 bit integer, Bignum or Rational. The integer is a number of nanoseconds since the Epoch which can represent 1823-11-12 to 2116-02-20.
You can access the nanosecond part most accurately with Time#nsec.
$ ruby -e 't1 = Time.now; puts t1.to_f; puts t1.nsec'
1457079791.351686
351686000
As you can see, on my OS X machine it's only precise down to the microsecond. This could be because OS X lacks clock_gettime().

Infinite loop in algorithm to match clocks running at different speeds

I'm trying to solve this problem:
Two clocks, which show the time in hours and minutes using the 24 hour clock, are running at different
speeds. Each clock is an exact number of minutes per hour fast. Both clocks start showing the same time
(00:00) and are checked regularly every hour (starting after one hour) according to an accurate timekeeper.
What time will the two clocks show on the first occasion when they are checked and show the same time?
NB: For this question we only care about the clocks matching when they are checked.
For example, suppose the first clock runs 1 minute fast (per hour) and the second clock runs 31 minutes
fast (per hour).
• When the clocks are first checked after one hour, the first clock will show 01:01 and the second clock
will show 01:31;
• When the clocks are checked after two hours, they will show 02:02 and 03:02;
• After 48 hours the clocks will both show 00:48.
Here is my code:
def add_delay(min,hash)
  hash[:minutes] = (hash[:minutes] + min)
  if hash[:minutes] > 59
    hash[:minutes] %= 60
    if min < 60
      add_hour(hash)
    end
  end
  hash[:hour] += (min / 60)
  hash
end
def add_hour(hash)
  hash[:hour] += 1
  if hash[:hour] > 23
    hash[:hour] %= 24
  end
  hash
end
def compare(hash1,hash2)
  (hash1[:hour] == hash2[:hour]) && (hash1[:minutes] == hash2[:minutes])
end
#-------------------------------------------------------------------
first_clock = Integer(gets) rescue nil
second_clock = Integer(gets) rescue nil
#hash1 = if first_clock < 60 then {:hour => 1,:minutes => first_clock} else {:hour => 1 + (first_clock/60),:minutes => (first_clock%60)} end
#hash2 = if second_clock < 60 then {:hour => 1,:minutes => second_clock} else {:hour => 1 + (second_clock/60),:minutes => (second_clock%60)} end
hash1 = {:hour => 0, :minutes => 0}
hash2 = {:hour => 0, :minutes => 0}
begin
  hash1 = add_hour(hash1)
  hash1 = add_delay(first_clock,hash1)
  hash2 = add_hour(hash2)
  p hash2.to_s
  hash2 = add_delay(second_clock,hash2)
  p hash2.to_s
end while !compare(hash1,hash2)
#making sure print is good
if hash1[:hour] > 9
  if hash1[:minutes] > 9
    puts hash1[:hour].to_s + ":" + hash1[:minutes].to_s
  else
    puts hash1[:hour].to_s + ":0" + hash1[:minutes].to_s
  end
else
  if hash1[:minutes] > 9
    puts "0" + hash1[:hour].to_s + ":" + hash1[:minutes].to_s
  else
    puts "0" + hash1[:hour].to_s + ":0" + hash1[:minutes].to_s
  end
end
#-------------------------------------------------------------------
For 1 and 31 the code runs as expected. For anything bigger, such as 5 and 100, it seems to get into an infinite loop and I don't see where the bug is. What is going wrong?
The logic in your add_delay function is flawed.
def add_delay(min,hash)
  hash[:minutes] = (hash[:minutes] + min)
  if hash[:minutes] > 59
    hash[:minutes] %= 60
    if min < 60
      add_hour(hash)
    end
  end
  hash[:hour] += (min / 60)
  hash
end
If hash[:minutes] is greater than 59, you should increment the hour no matter what. Observe that an increment less than 60 can cause the minutes to overflow.
Also, you may have to increment the hour more than once if the increment exceeds 60 minutes.
Finally, it is wrong to do hash[:hour] += (min / 60) because min is not necessarily over 60 and because you have already done add_hour(hash).
Here is a corrected version of the function:
def add_delay(minutes, time)
  time[:minutes] += minutes
  while time[:minutes] > 59   # If the minutes overflow,
    time[:minutes] -= 60      # subtract 60 minutes and
    add_hour(time)            # increment the hour.
  end                         # Repeat as necessary.
  time
end
You can plug this function into your existing code. I have merely taken the liberty of renaming min to minutes and hash to time inside the function.
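As a quick check that the corrected function terminates for the inputs from the question, the sketch below wires it into a loop. It is an illustration rather than the answerer's code: first_match is a hypothetical helper, add_hour is rewritten here with a modulo, and the returned time is formatted as a string for convenience.

```ruby
# Advance the clock by one hour, wrapping at 24.
def add_hour(time)
  time[:hour] = (time[:hour] + 1) % 24
  time
end

# Corrected add_delay from above: carry every full hour out of the minutes.
def add_delay(minutes, time)
  time[:minutes] += minutes
  while time[:minutes] > 59
    time[:minutes] -= 60
    add_hour(time)
  end
  time
end

# Hypothetical helper: returns the number of hourly checks until both
# clocks agree, together with the time they show as "HH:MM".
def first_match(fast1, fast2)
  clock1 = { hour: 0, minutes: 0 }
  clock2 = { hour: 0, minutes: 0 }
  checks = 0
  loop do
    add_delay(fast1, add_hour(clock1))
    add_delay(fast2, add_hour(clock2))
    checks += 1
    break if clock1 == clock2
  end
  [checks, format('%02d:%02d', clock1[:hour], clock1[:minutes])]
end

p first_match(1, 31)   # => [48, "00:48"]
p first_match(5, 100)  # => [288, "00:00"]
```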
Your code
Let's look at your code and at the same time make some small improvements.
add_delay takes a given number of minutes to add to the hash, after converting the number of minutes to hours and minutes and then the number of hours to the number of hours within a day. One problem is that if a clock gains more than 59 minutes per hour, you may have to increment hours by more than one. Try writing it and add_hours like this:
def add_delay(min_to_add, hash)
  mins = hash[:minutes] + min_to_add
  hrs, mins = mins.divmod 60
  hash[:minutes] = mins
  add_hours(hash, hrs)
end
def add_hours(hash, hours=1)
  hash[:hours] = (hash[:hours] + hours) % 24
end
We do not necessarily care what either of these methods returns, as they modify the argument hash.
This uses the very handy method Integer#divmod (Fixnum#divmod before Ruby 2.4) to convert minutes to hours and minutes.
(Aside: some Rubyists don't use hash as the name of a variable because it is also the name of a Ruby method.)
Next, compare determines if two hashes with keys :hour and :minutes are equal. Rather than checking if both the hours and minutes match, you can just see if the hashes are equal:
def compare(hash1, hash2)
  hash1 == hash2
end
Get the minutes per hour by which the clocks are fast:
first_clock = Integer(gets) rescue nil
second_clock = Integer(gets) rescue nil
and now initialize the hashes and step by hour until a match is found, then return either hash:
def find_matching_time(first_clock, second_clock)
  hash1 = {:hours => 0, :minutes => 0}
  hash2 = {:hours => 0, :minutes => 0}
  begin
    add_delay(first_clock, hash1)
    add_hours(hash1)
    add_delay(second_clock, hash2)
    add_hours(hash2)
  end until compare(hash1, hash2)
  hash1
end
Let's try it:
find_matching_time(1, 31)
# => {:hours=>0, :minutes=>48}
find_matching_time(5, 100)
#=> {:hours=>0, :minutes=>0}
find_matching_time(5, 5)
#=> {:hours=>1, :minutes=>5}
find_matching_time(0, 59)
#=> {:hours=>0, :minutes=>0}
These results match those I obtained below with an alternative method. You do not return the number of hours from the present until the times are the same, but you may not need that.
I have not identified why you were getting the infinite loop, but perhaps with this analysis you will be able to find it.
There are two other small changes I would suggest: 1) incorporating add_hours in add_delay and renaming the latter, and 2) getting rid of compare because it is so simple and only used in one place:
def add_hour_and_delay(min_to_add, hash)
  mins = hash[:minutes] + min_to_add
  hrs, mins = mins.divmod 60
  hash[:minutes] = mins
  hash[:hours] = (hash[:hours] + 1 + hrs) % 24
end
def find_matching_time(first_clock, second_clock)
  hash1 = {:hours => 0, :minutes => 0}
  hash2 = {:hours => 0, :minutes => 0}
  begin
    add_hour_and_delay(first_clock, hash1)
    add_hour_and_delay(second_clock, hash2)
  end until hash1 == hash2
  hash1
end
Alternative method
Here's another way to write the method. Let:
f0: minutes per hour the first clock is fast
f1: minutes per hour the second clock is fast
Then we can compute the next time they will show the same time as follows.
Code
MINS_PER_DAY = (24*60)
def find_matching_time(f0, f1)
  elapsed_hours = (1..Float::INFINITY).find { |i|
    (i*(60+f0)) % MINS_PER_DAY == (i*(60+f1)) % MINS_PER_DAY }
  [elapsed_hours, "%d:%02d" % ((elapsed_hours*(60+f0)) % MINS_PER_DAY).divmod(60)]
end
Examples
find_matching_time(1, 31)
#=> [48, "0:48"]
After 48 hours both clocks will show a time of "0:48".
find_matching_time(5, 100)
#=> [288, "0:00"]
find_matching_time(5, 5)
#=> [1, "1:05"]
find_matching_time(0, 59)
#=> [1440, "0:00"]
Explanation
After i hours have elapsed, the two clocks will respectively display a time that is the following number of minutes within a day:
(i*(60+f0)) % MINS_PER_DAY # clock 0
(i*(60+f1)) % MINS_PER_DAY # clock 1
Enumerable#find is then used to determine the first number of elapsed hours i when these two values are equal. We don't know how long that may take, so I've enumerated over all positive integers beginning with 1. (It can take no more than MINS_PER_DAY hours, since after 1440 hours each clock has advanced by a whole number of days, so I could have written (1..MINS_PER_DAY).find.) The value returned by find is assigned to the variable elapsed_hours.
Both clocks will display the same time after elapsed_hours, so we can compute the time either clock will show. I've chosen to do that for clock 0. For the first example (f0=1, f1=31)
elapsed_hours #=> 48
so
mins_clock0_advances = elapsed_hours*(60+1)
#=> 2928
mins_clock_advances_within_day = mins_clock0_advances % MINS_PER_DAY
#=> 48
We then convert this to hours and minutes:
mins_clock_advances_within_day.divmod(60)
#=> [0, 48]
We can then use the method String#% to format this result appropriately:
"%d:%02d" % mins_clock_advances_within_day.divmod(60)
#=> "0:48"
See Kernel#sprintf for information on formatting when using %. In "%02d", d is for "decimal", 2 is the field width and 0 means pad left with zeroes.
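As a cross-check on the search, the number of elapsed hours can also be computed directly. The clocks agree after i hours exactly when i*(f0 - f1) is a multiple of MINS_PER_DAY, and the smallest positive i with that property is MINS_PER_DAY divided by the greatest common divisor of |f0 - f1| and MINS_PER_DAY. The sketch below is an addition rather than part of the answer, and the names elapsed_hours_closed_form and matching_time are hypothetical:

```ruby
MINS_PER_DAY = 24 * 60

# Smallest positive i with i * (f0 - f1) divisible by MINS_PER_DAY.
# When f0 == f1 the gcd is MINS_PER_DAY itself, so the result is 1.
def elapsed_hours_closed_form(f0, f1)
  MINS_PER_DAY / (f0 - f1).abs.gcd(MINS_PER_DAY)
end

# Same return shape as find_matching_time: [elapsed hours, displayed time].
def matching_time(f0, f1)
  hours = elapsed_hours_closed_form(f0, f1)
  [hours, '%d:%02d' % ((hours * (60 + f0)) % MINS_PER_DAY).divmod(60)]
end

p matching_time(1, 31)   # => [48, "0:48"]
p matching_time(5, 100)  # => [288, "0:00"]
p matching_time(5, 5)    # => [1, "1:05"]
p matching_time(0, 59)   # => [1440, "0:00"]
```

This avoids the open-ended enumeration, at the cost of being less obviously tied to the problem statement.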

Pretty file size in Ruby?

I'm trying to make a method that converts an integer that represents bytes to a string with a 'prettied up' format.
Here's my half-working attempt:
class Integer
  def to_filesize
    {
      'B' => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{s / self}#{e}" if self < s }
  end
end
What am I doing wrong?
If you are using Rails, what about the standard Rails number helper?
number_to_human_size(number, options = {})
http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human_size
How about the Filesize gem? It seems to be able to convert from bytes (and other formats) into pretty-printed values.
Example:
Filesize.from("12502343 B").pretty # => "11.92 MiB"
http://rubygems.org/gems/filesize
I agree with @David that it's probably best to use an existing solution, but to answer your question about what you're doing wrong:
The primary error is dividing s by self rather than the other way around.
You really want to divide by the previous s, so divide s by 1024.
Doing integer arithmetic will give you confusing results, so convert to float.
Perhaps round the answer.
So:
class Integer
  def to_filesize
    {
      'B' => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
  end
end
lets you:
1.to_filesize
# => "1.0B"
1020.to_filesize
# => "1020.0B"
1024.to_filesize
# => "1.0KB"
1048576.to_filesize
# => "1.0MB"
Again, I don't recommend actually doing that, but it seems worth correcting the bugs.
This is my solution:
def filesize(size)
  units = %w[B KiB MiB GiB TiB PiB EiB ZiB]
  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp += 1 if (size.to_f / 1024 ** exp >= 1024 - 0.05)
  exp = units.size - 1 if exp > units.size - 1
  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end
Compared to the other solutions, it is simpler, more efficient, and produces more correct output.
Format
All other methods have the problem that they report 1023.95 GiB wrong. Moreover, to_filesize simply fails with big numbers (it returns the units hash instead of a string).
- method: [ filesize, Filesize, number_to_human, to_filesize ]
- 0 B: [ 0.0 B, 0.00 B, 0 Bytes, 0.0B ]
- 1 B: [ 1.0 B, 1.00 B, 1 Byte, 1.0B ]
- 10 B: [ 10.0 B, 10.00 B, 10 Bytes, 10.0B ]
- 1000 B: [ 1000.0 B, 1000.00 B, 1000 Bytes, 1000.0B ]
- 1 KiB: [ 1.0 KiB, 1.00 KiB, 1 KB, 1.0KB ]
- 1.5 KiB: [ 1.5 KiB, 1.50 KiB, 1.5 KB, 1.5KB ]
- 10 KiB: [ 10.0 KiB, 10.00 KiB, 10 KB, 10.0KB ]
- 1000 KiB: [ 1000.0 KiB, 1000.00 KiB, 1000 KB, 1000.0KB ]
- 1 MiB: [ 1.0 MiB, 1.00 MiB, 1 MB, 1.0MB ]
- 1 GiB: [ 1.0 GiB, 1.00 GiB, 1 GB, 1.0GB ]
- 1023.95 GiB: [ 1.0 TiB, 1023.95 GiB, 1020 GB, 1023.95GB ]
- 1 TiB: [ 1.0 TiB, 1.00 TiB, 1 TB, 1.0TB ]
- 1 EiB: [ 1.0 EiB, 1.00 EiB, 1 EB, ERROR ]
- 1 ZiB: [ 1.0 ZiB, 1.00 ZiB, 1020 EB, ERROR ]
- 1 YiB: [ 1024.0 ZiB, 1024.00 ZiB, 1050000 EB, ERROR ]
Performance
Also, it has the best performance (seconds to process 1 million numbers):
- filesize: 2.15
- Filesize: 15.53
- number_to_human: 139.63
- to_filesize: 2.41
Here is a method using log10:
def number_format(d)
  e = Math.log10(d).to_i / 3
  return '%.3f' % (d / 1000 ** e) + ['', ' k', ' M', ' G'][e]
end
s = number_format(9012345678.0)
puts s == '9.012 G'
https://ruby-doc.org/core/Math.html#method-c-log10
You get points for adding a method to Integer, but this seems more File specific, so I would suggest monkeying around with File, say by adding a method to File called .prettysize().
But here is an alternative solution that uses iteration, and avoids printing single bytes as float :-)
def format_mb(size)
  conv = [ 'b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb' ];
  scale = 1024;
  ndx=1
  if( size < 2*(scale**ndx) ) then
    return "#{(size)} #{conv[ndx-1]}"
  end
  size=size.to_f
  [2,3,4,5,6,7].each do |ndx|
    if( size < 2*(scale**ndx) ) then
      return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
    end
  end
  ndx=7
  return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end
@Darshan Computing's solution is only partial here. In Ruby 1.8 the hash keys are not guaranteed to be ordered (since Ruby 1.9, hashes preserve insertion order), so this approach will not work reliably there. You could fix this by doing something like this inside the to_filesize method:
conv={
  1024=>'B',
  1024*1024=>'KB',
  ...
}
conv.keys.sort.each { |s|
  next if self >= s
  e=conv[s]
  return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s
}
This is what I ended up doing for a similar method inside Float,
class Float
  def to_human
    conv={
      1024=>'B',
      1024*1024=>'KB',
      1024*1024*1024=>'MB',
      1024*1024*1024*1024=>'GB',
      1024*1024*1024*1024*1024=>'TB',
      1024*1024*1024*1024*1024*1024=>'PB',
      1024*1024*1024*1024*1024*1024*1024=>'EB'
    }
    conv.keys.sort.each { |mult|
      next if self >= mult
      suffix=conv[mult]
      return "%.2f %s" % [ self / (mult / 1024), suffix ]
    }
  end
end
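An ordered Array of unit labels sidesteps the hash-ordering concern altogether and also gives a defined result for values beyond the largest unit. The following is only a sketch under stated assumptions: the method name human_size and the constant UNITS are hypothetical, the units are 1024-based with the labels used in the answers above, and oversized values are simply reported in the largest unit.

```ruby
# Assumed unit labels, ordered from smallest to largest (1024-based).
UNITS = %w[B KB MB GB TB PB EB].freeze

# Hypothetical helper: divide by 1024 until the value fits the current
# unit, or until the largest unit is reached.
def human_size(size)
  value = size.to_f
  unit = UNITS.first
  UNITS.each do |candidate|
    unit = candidate
    break if value < 1024 || candidate == UNITS.last
    value /= 1024
  end
  format('%.2f %s', value, unit)
end

puts human_size(1)          # => 1.00 B
puts human_size(1536)       # => 1.50 KB
puts human_size(1024**2)    # => 1.00 MB
puts human_size(1024**7)    # => 1024.00 EB
```

Being a plain method rather than a patch on Integer or Float, it works for both kinds of argument.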
FileSize may be dead, but now there is ByteSize.
require 'bytesize'
ByteSize.new(1210000000) #=> (1.21 GB)
ByteSize.new(1210000000).to_s #=> 1.21 GB

Resources