Pretty file size in Ruby? - ruby

I'm trying to make a method that converts an integer that represents bytes to a string with a 'prettied up' format.
Here's my half-working attempt:
class Integer
def to_filesize
{
'B' => 1024,
'KB' => 1024 * 1024,
'MB' => 1024 * 1024 * 1024,
'GB' => 1024 * 1024 * 1024 * 1024,
'TB' => 1024 * 1024 * 1024 * 1024 * 1024
}.each_pair { |e, s| return "#{s / self}#{e}" if self < s }
end
end
What am I doing wrong?

If you are using Rails, what about the standard Rails number helper?
http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human_size
number_to_human_size(number, options = {})

How about the Filesize gem? It seems to be able to convert from bytes (and other formats) into pretty-printed values.
Example:
Filesize.from("12502343 B").pretty # => "11.92 MiB"
http://rubygems.org/gems/filesize

I agree with @David that it's probably best to use an existing solution, but to answer your question about what you're doing wrong:
The primary error is dividing s by self rather than the other way around.
You really want to divide by the previous s, so divide s by 1024.
Doing integer arithmetic will give you confusing results, so convert to float.
Perhaps round the answer.
So:
class Integer
def to_filesize
{
'B' => 1024,
'KB' => 1024 * 1024,
'MB' => 1024 * 1024 * 1024,
'GB' => 1024 * 1024 * 1024 * 1024,
'TB' => 1024 * 1024 * 1024 * 1024 * 1024
}.each_pair { |e, s| return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
end
end
lets you:
1.to_filesize
# => "1.0B"
1020.to_filesize
# => "1020.0B"
1024.to_filesize
# => "1.0KB"
1048576.to_filesize
# => "1.0MB"
Again, I don't recommend actually doing that, but it seems worth correcting the bugs.
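One gap remains in the corrected version: for a value at or above the largest threshold no branch returns, so the method falls through and returns the hash itself. Below is a minimal sketch of one way to add a fallback, written as a plain method rather than a monkey patch. The constant name FILESIZE_UNITS and the choice to report oversized values in the largest unit are assumptions, not part of the answer above.

```ruby
# Units in ascending order; the index doubles as the power of 1024.
FILESIZE_UNITS = %w[B KB MB GB TB].freeze

def to_filesize(bytes)
  FILESIZE_UNITS.each_with_index do |unit, i|
    limit = 1024**(i + 1)
    # Divide by the previous threshold, using float arithmetic.
    return "#{(bytes.to_f / 1024**i).round(2)}#{unit}" if bytes < limit
  end
  # Assumed fallback: anything larger is expressed in the largest unit.
  "#{(bytes.to_f / 1024**(FILESIZE_UNITS.size - 1)).round(2)}#{FILESIZE_UNITS.last}"
end

puts to_filesize(1024)     # 1.0KB
puts to_filesize(1024**5)  # 1024.0TB
```

Iterating over an array also sidesteps any question about hash ordering.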

This is my solution:
def filesize(size)
units = %w[B KiB MiB GiB TiB PiB EiB ZiB]
return '0.0 B' if size == 0
exp = (Math.log(size) / Math.log(1024)).to_i
exp += 1 if (size.to_f / 1024 ** exp >= 1024 - 0.05)
exp = units.size - 1 if exp > units.size - 1
'%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end
Compared to the other solutions it is simpler, faster, and rounds correctly at unit boundaries.
Format
All the other methods mishandle values just below a unit boundary (see the 1023.95 GiB row). Moreover, to_filesize fails on big numbers: no branch matches, so it returns the hash itself instead of a string.
- method: [ filesize, Filesize, number_to_human, to_filesize ]
- 0 B: [ 0.0 B, 0.00 B, 0 Bytes, 0.0B ]
- 1 B: [ 1.0 B, 1.00 B, 1 Byte, 1.0B ]
- 10 B: [ 10.0 B, 10.00 B, 10 Bytes, 10.0B ]
- 1000 B: [ 1000.0 B, 1000.00 B, 1000 Bytes, 1000.0B ]
- 1 KiB: [ 1.0 KiB, 1.00 KiB, 1 KB, 1.0KB ]
- 1.5 KiB: [ 1.5 KiB, 1.50 KiB, 1.5 KB, 1.5KB ]
- 10 KiB: [ 10.0 KiB, 10.00 KiB, 10 KB, 10.0KB ]
- 1000 KiB: [ 1000.0 KiB, 1000.00 KiB, 1000 KB, 1000.0KB ]
- 1 MiB: [ 1.0 MiB, 1.00 MiB, 1 MB, 1.0MB ]
- 1 GiB: [ 1.0 GiB, 1.00 GiB, 1 GB, 1.0GB ]
- 1023.95 GiB: [ 1.0 TiB, 1023.95 GiB, 1020 GB, 1023.95GB ]
- 1 TiB: [ 1.0 TiB, 1.00 TiB, 1 TB, 1.0TB ]
- 1 EiB: [ 1.0 EiB, 1.00 EiB, 1 EB, ERROR ]
- 1 ZiB: [ 1.0 ZiB, 1.00 ZiB, 1020 EB, ERROR ]
- 1 YiB: [ 1024.0 ZiB, 1024.00 ZiB, 1050000 EB, ERROR ]
Performance
Also, it has the best performance (seconds to process 1 million numbers):
- filesize: 2.15
- Filesize: 15.53
- number_to_human: 139.63
- to_filesize: 2.41
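For anyone who wants to reproduce a timing like the ones above, here is a rough sketch using the standard Benchmark module. The method body is copied from this answer; the workload (100,000 pseudo-random sizes below 1 TiB, fixed seed) is an assumption and not the setup the original numbers came from, so absolute figures will differ.

```ruby
require 'benchmark'

# Copy of the filesize method from the answer above.
def filesize(size)
  units = %w[B KiB MiB GiB TiB PiB EiB ZiB]
  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  # Bump to the next unit when the value would print as 1024.0.
  exp += 1 if size.to_f / 1024**exp >= 1024 - 0.05
  exp = units.size - 1 if exp > units.size - 1
  '%.1f %s' % [size.to_f / 1024**exp, units[exp]]
end

# Hypothetical workload: a fixed seed keeps runs comparable.
rng = Random.new(42)
sizes = Array.new(100_000) { rng.rand(1024**4) }

seconds = Benchmark.realtime { sizes.each { |s| filesize(s) } }
puts format('filesize: %.3f s for %d numbers', seconds, sizes.size)
```

Swapping the block body for another implementation gives a like-for-like comparison on the same input.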

Here is a method using log10:
def number_format(d)
e = Math.log10(d).to_i / 3
return '%.3f' % (d / 1000 ** e) + ['', ' k', ' M', ' G'][e]
end
s = number_format(9012345678.0)
puts s == '9.012 G'
https://ruby-doc.org/core/Math.html#method-c-log10
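As written, number_format only covers inputs from 1 up to just under 10^12: zero makes Math.log10 return -Infinity, very small values produce a negative index, and larger values index past the suffix list. A hedged sketch of one way to guard those cases follows; the name number_format_clamped and the decision to clamp to the smallest and largest suffix are assumptions.

```ruby
# Suffixes for 10^0, 10^3, 10^6 and 10^9.
SI_SUFFIXES = ['', ' k', ' M', ' G'].freeze

def number_format_clamped(d)
  return '0.000' if d.zero?
  # Power-of-1000 bucket, clamped to the suffixes that exist.
  e = (Math.log10(d.abs).floor / 3).clamp(0, SI_SUFFIXES.size - 1)
  '%.3f' % (d.to_f / 1000**e) + SI_SUFFIXES[e]
end

puts number_format_clamped(9012345678.0)  # 9.012 G
puts number_format_clamped(5e13)          # 50000.000 G
```

Comparable#clamp needs Ruby 2.4 or later.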

You get points for adding a method to Integer, but this seems more File specific, so I would suggest monkeying around with File, say by adding a method to File called .prettysize().
But here is an alternative solution that uses iteration, and avoids printing single bytes as float :-)
def format_mb(size)
conv = [ 'b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb' ];
scale = 1024;
ndx=1
if( size < 2*(scale**ndx) ) then
return "#{(size)} #{conv[ndx-1]}"
end
size=size.to_f
[2,3,4,5,6,7].each do |ndx|
if( size < 2*(scale**ndx) ) then
return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end
end
ndx=7
return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end

@Darshan Computing's solution is only partial here. Hash keys are not guaranteed to be ordered on Ruby 1.8 (insertion order is only preserved from Ruby 1.9 on), so there this approach will not work reliably. You could fix this by doing something like this inside the to_filesize method:
conv={
1024=>'B',
1024*1024=>'KB',
...
}
conv.keys.sort.each { |s|
next if self >= s
e=conv[s]
return "#{(self.to_f / (s / 1024)).round(2)}#{e}"
}
This is what I ended up doing for a similar method inside Float,
class Float
def to_human
conv={
1024=>'B',
1024*1024=>'KB',
1024*1024*1024=>'MB',
1024*1024*1024*1024=>'GB',
1024*1024*1024*1024*1024=>'TB',
1024*1024*1024*1024*1024*1024=>'PB',
1024*1024*1024*1024*1024*1024*1024=>'EB'
}
conv.keys.sort.each { |mult|
next if self >= mult
suffix=conv[mult]
return "%.2f %s" % [ self / (mult / 1024), suffix ]
}
end
end

FileSize may be dead, but now there is ByteSize.
require 'bytesize'
ByteSize.new(1210000000) #=> (1.21 GB)
ByteSize.new(1210000000).to_s #=> 1.21 GB

Related

Convert human readable file size to bytes in ruby

I went through this link. My requirement is the exact reverse: for example, the string "10KB" needs to be converted to 10240 (its size in bytes). Is there a gem for this, or a built-in method in Ruby? I did my research but wasn't able to find one.
There's filesize (rubygems)
It's quite trivial to write your own:
module ToBytes
def to_bytes
md = match(/^(?<num>\d+)\s?(?<unit>\w+)?$/)
md[:num].to_i *
case md[:unit]
when 'KB'
1024
when 'MB'
1024**2
when 'GB'
1024**3
when 'TB'
1024**4
when 'PB'
1024**5
when 'EB'
1024**6
when 'ZB'
1024**7
when 'YB'
1024**8
else
1
end
end
end
size_string = "10KB"
size_string.extend(ToBytes).to_bytes
=> 10240
String.include(ToBytes)
"1024 KB".to_bytes
=> 1048576
If you need KiB, MiB etc then you just add multipliers.
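To illustrate "just add multipliers", here is a table-driven sketch that accepts both spellings (KB and KiB) and decimal values such as "1.5 MiB". It follows the convention used above, where KB means 1024 bytes. The names BYTE_MULTIPLIERS and parse_size are hypothetical, and raising ArgumentError on unknown input is an assumption (the module above silently treats unknown units as bytes).

```ruby
# Build 'KB' => 1024, 'KIB' => 1024, 'MB' => 1024**2, ... from the prefixes.
BYTE_MULTIPLIERS = { 'B' => 1 }
%w[K M G T P E Z Y].each_with_index do |prefix, i|
  BYTE_MULTIPLIERS["#{prefix}B"]  = 1024**(i + 1)
  BYTE_MULTIPLIERS["#{prefix}IB"] = 1024**(i + 1)
end
BYTE_MULTIPLIERS.freeze

def parse_size(str)
  md = str.strip.match(/\A(?<num>\d+(?:\.\d+)?)\s*(?<unit>[A-Za-z]+)?\z/)
  raise ArgumentError, "unparseable size: #{str.inspect}" unless md
  unit = (md[:unit] || 'B').upcase
  factor = BYTE_MULTIPLIERS.fetch(unit) { raise ArgumentError, "unknown unit: #{unit}" }
  # Float arithmetic is fine for everyday sizes but loses precision on huge values.
  (md[:num].to_f * factor).round
end

puts parse_size('10KB')     # 10240
puts parse_size('1.5 MiB')  # 1572864
```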
Here is a method using while:
def number_format(n)
n2, n3 = n, 0
while n2 >= 1e3
n2 /= 1e3
n3 += 1
end
return '%.3f' % n2 + ['', ' k', ' M', ' G'][n3]
end
s = number_format(9012345678)
puts s == '9.012 G'
https://ruby-doc.org/core/doc/syntax/control_expressions_rdoc.html#label-while+Loop

sub-millisecond time measurement in Ruby

I'm running Ruby 2.2 in Rubymine 8.0.3
My machine is running Windows 7 Pro with an Intel core i7-4710MQ
I've been able to achieve ~411 ns precision with C++, Java, Python and JS on this machine, but can't seem to find a way to attain this performance in Ruby, as the built in Time library is good for ms only.
I can program my tests to tolerate this reduced precision, but is it possible to incorporate the windows QPC API for improved evaluation of execution time?
My test code for determining clock tick precision is below:
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
times[i] = Time.new
end
durations = []
(0...(numTimes - 1)).each do |i|
durations[i] = times[i+1] - times[i]
end
# Output duration only if the clock ticked over
durations.each do |duration|
if duration != 0
p duration.to_s + ','
end
end
The below code incorporates the QPC as found here
require "Win32API"
QueryPerformanceCounter = Win32API.new("kernel32",
"QueryPerformanceCounter", 'P', 'I')
QueryPerformanceFrequency = Win32API.new("kernel32",
"QueryPerformanceFrequency", 'P', 'I')
def get_ticks
tick = ' ' * 8
get_ticks = QueryPerformanceCounter.call(tick)
tick.unpack('q')[0]
end
def get_freq
freq = ' ' * 8
get_freq = QueryPerformanceFrequency.call(freq)
freq.unpack('q')[0]
end
def get_time_diff(a, b)
# This function takes two QPC ticks
(b - a).abs.to_f / (get_freq)
end
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
times[i] = get_ticks
end
durations = []
(0...(numTimes - 1)).each do |i|
durations[i] = get_time_diff(times[i+1], times[i])
end
durations.each do |duration|
p (duration * 1000000000).to_s + ','
end
This code returns durations between ticks of ~22-75 microseconds on my machine
You can get higher precision by using Process::clock_gettime:
Returns a time returned by POSIX clock_gettime() function.
Here's an example with Time.now
times = Array.new(1000) { Time.now }
durations = times.each_cons(2).map { |a, b| b - a }
durations.sort.group_by(&:itself).each do |time, elements|
printf("%5d ns x %d\n", time * 1_000_000_000, elements.count)
end
Output:
0 ns x 686
1000 ns x 296
2000 ns x 12
3000 ns x 2
12000 ns x 2
18000 ns x 1
And here's the same example with Process.clock_gettime:
times = Array.new(1000) { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
Output:
163 ns x 1
164 ns x 1
164 ns x 9
165 ns x 6
165 ns x 22
166 ns x 39
166 ns x 174
167 ns x 13
167 ns x 129
168 ns x 95
168 ns x 32
169 ns x 203
169 ns x 141
170 ns x 23
170 ns x 37
171 ns x 30
171 ns x 3
172 ns x 24
172 ns x 10
174 ns x 1
175 ns x 2
180 ns x 1
194 ns x 1
273 ns x 1
2565 ns x 1
And here's a quick side-by-side comparison:
array = Array.new(12) { [Time.now, Process.clock_gettime(Process::CLOCK_MONOTONIC)] }
array.shift(2) # first elements are always inaccurate
base_t, base_p = array.first # baseline
printf("%-11.11s %-11.11s\n", 'Time.now', 'Process.clock_gettime')
array.each do |t, p|
printf("%.9f %.9f\n", t - base_t, p - base_p)
end
Output:
Time.now Process.clo
0.000000000 0.000000000
0.000000000 0.000000495
0.000001000 0.000000985
0.000001000 0.000001472
0.000002000 0.000001960
0.000002000 0.000002448
0.000003000 0.000002937
0.000003000 0.000003425
0.000004000 0.000003914
0.000004000 0.000004403
This is Ruby 2.3 on OS X running on an Intel Core i7, not sure about Windows.
To avoid precision loss due to floating point conversion, you can specify another unit, e.g.:
Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
#=> 191519383463873
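Putting the pieces together, a small timing helper could look like the sketch below. The helper name elapsed_ns and the sample workload are made up for illustration; the resolution you actually get depends on the platform's monotonic clock.

```ruby
# Returns the wall-clock time taken by the block, in integer nanoseconds.
def elapsed_ns
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond) - start
end

ns = elapsed_ns { 1000.times { Math.sqrt(2) } }
puts "took #{ns} ns"
```

Working in integer nanoseconds keeps the subtraction exact, which matters once the counter value itself is large.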
Time#nsec:
numTimes = 10000
times = Array.new(numTimes)
(0...(numTimes)).each do |i|
# nsec ⇓⇓⇓⇓
times[i] = Time.new.nsec
end
durations = (0...(numTimes - 1)).inject([]) do |memo, i|
memo << times[i+1] - times[i]
end
puts durations.reject(&:zero?).join $/
Ruby Time objects store the number of nanoseconds since the epoch.
Since Ruby 1.9.2, Time implementation uses a signed 63 bit integer, Bignum or Rational. The integer is a number of nanoseconds since the Epoch which can represent 1823-11-12 to 2116-02-20.
You can access the nanosecond part most accurately with Time#nsec.
$ ruby -e 't1 = Time.now; puts t1.to_f; puts t1.nsec'
1457079791.351686
351686000
As you can see, on my OS X machine it's only precise down to the microsecond. This could be because OS X lacks clock_gettime().
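To separate the two effects, the following sketch builds a Time with a known nanosecond part, which suggests that Time itself keeps nanoseconds and that the limit seen above comes from the clock source. It assumes Ruby 2.5 or later, where Time.at accepts a unit argument; the timestamp is arbitrary.

```ruby
# 0 seconds plus 123_456_789 nanoseconds after the epoch.
t = Time.at(0, 123_456_789, :nsec)

puts t.nsec  # 123456789
puts t.usec  # 123456
```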

Ruby SHA2 digest incorrect doc or issue in my IRB?

The doc # http://www.ruby-doc.org/stdlib-1.9.3/libdoc/digest/rdoc/Digest/SHA2.html shows:
Digest::SHA256.new.digest_length * 8
#=> 512
Digest::SHA384.new.digest_length * 8
#=> 1024
Digest::SHA512.new.digest_length * 8
#=> 1024
Here's my output in 2.1.3:
Digest::SHA256.new.digest_length * 8
#=> 256
Digest::SHA384.new.digest_length * 8
#=> 384
Digest::SHA512.new.digest_length * 8
#=> 512
Why does my block length output differ from Ruby docs?
Seems like there is a typo in docs, look,
block_length → Integer
Returns the block length of the digest in bytes.
Digest::SHA256.new.digest_length * 8
# => 512
Digest::SHA384.new.digest_length * 8
# => 1024
Digest::SHA512.new.digest_length * 8
# => 1024
digest_length → Integer
Returns the length of the hash value of the digest in bytes.
Digest::SHA256.new.digest_length * 8
# => 256
Digest::SHA384.new.digest_length * 8
# => 384
Digest::SHA512.new.digest_length * 8
# => 512
Both are using digest_length in examples.
But instead it should be,
block_length → Integer
Returns the block length of the digest in bytes.
Digest::SHA256.new.block_length * 8
# => 512
Digest::SHA384.new.block_length * 8
# => 1024
Digest::SHA512.new.block_length * 8
# => 1024
digest_length → Integer
Returns the length of the hash value of the digest in bytes.
Digest::SHA256.new.digest_length * 8
# => 256
Digest::SHA384.new.digest_length * 8
# => 384
Digest::SHA512.new.digest_length * 8
# => 512
This has been fixed in the 2.0.0 documentation (commit)
There appears to be a mistake in the Ruby 1.9.3 documentation for the method Digest::SHA2#block_length: the examples call digest_length instead of block_length.
Using block_length actually gets the shown values 512, 1024, and 1024:
Digest::SHA256.new.block_length * 8
# => 512
Digest::SHA384.new.block_length * 8
# => 1024
Digest::SHA512.new.block_length * 8
# => 1024
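A quick way to see both values side by side on your own interpreter is a loop like this sketch; the output format is arbitrary.

```ruby
require 'digest'

algorithms = {
  'SHA256' => Digest::SHA256,
  'SHA384' => Digest::SHA384,
  'SHA512' => Digest::SHA512
}

algorithms.each do |name, klass|
  d = klass.new
  # Both methods return bytes, so multiply by 8 for bits.
  puts format('%s: block %4d bits, digest %3d bits',
              name, d.block_length * 8, d.digest_length * 8)
end
```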

Getting accurate file size in megabytes?

How can I get the accurate file size in MB? I tried this:
compressed_file_size = File.size("Compressed/#{project}.tar.bz2") / 1024000
puts "file size is #{compressed_file_size} MB"
But it chopped the 0.9 and showed 2 MB instead of 2.9 MB
Try:
compressed_file_size = File.size("Compressed/#{project}.tar.bz2").to_f / 2**20
formatted_file_size = '%.2f' % compressed_file_size
One-liner:
compressed_file_size = '%.2f' % (File.size("Compressed/#{project}.tar.bz2").to_f / 2**20)
or:
compressed_file_size = (File.size("Compressed/#{project}.tar.bz2").to_f / 2**20).round(2)
Further information on %-operator of String:
http://ruby-doc.org/core-1.9/classes/String.html#M000207
BTW: I prefer "MiB" instead of "MB" if I use base2 calculations (see: http://en.wikipedia.org/wiki/Mebibyte)
You're doing integer division (which drops the fractional part). Try dividing by 1024000.0 so ruby knows you want to do floating point math.
Try:
compressed_file_size = File.size("Compressed/#{project}.tar.bz2").to_f / 1024000
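The difference between the two kinds of division is easy to check in isolation. In this sketch the byte count is a made-up stand-in for File.size, and Integer#fdiv is used as an alternative to calling to_f first.

```ruby
bytes = 3_040_870                       # hypothetical file size in bytes

truncated = bytes / 2**20               # Integer / Integer drops the fraction
rounded   = bytes.fdiv(2**20).round(2)  # float division, two decimals
label     = '%.2f MiB' % bytes.fdiv(2**20)

puts truncated  # 2
puts rounded    # 2.9
puts label      # 2.90 MiB
```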
You might find a formatting function useful (pretty print file size), and here is my example,
def format_mb(size)
conv = [ 'b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb' ];
scale = 1024;
ndx=1
if( size < 2*(scale**ndx) ) then
return "#{(size)} #{conv[ndx-1]}"
end
size=size.to_f
[2,3,4,5,6,7].each do |ndx|
if( size < 2*(scale**ndx) ) then
return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end
end
ndx=7
return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end
Test it out,
tries = [ 1,2,3,500,1000,1024,3000,99999,999999,999999999,9999999999,999999999999,99999999999999,3333333333333333,555555555555555555555]
tries.each { |x|
print "size #{x} -> #{format_mb(x)}\n"
}
Which produces,
size 1 -> 1 b
size 2 -> 2 b
size 3 -> 3 b
size 500 -> 500 b
size 1000 -> 1000 b
size 1024 -> 1024 b
size 3000 -> 2.930 kb
size 99999 -> 97.655 kb
size 999999 -> 976.562 kb
size 999999999 -> 953.674 mb
size 9999999999 -> 9.313 gb
size 999999999999 -> 931.323 gb
size 99999999999999 -> 90.949 tb
size 3333333333333333 -> 2.961 pb
size 555555555555555555555 -> 481.868 eb

Round a ruby float up or down to the nearest 0.05

I'm getting numbers like
2.36363636363636
4.567563
1.234566465448465
10.5857447736
How would I get Ruby to round these numbers up (or down) to the nearest 0.05?
[2.36363636363636, 4.567563, 1.23456646544846, 10.5857447736].map do |x|
(x*20).round / 20.0
end
#=> [2.35, 4.55, 1.25, 10.6]
Check this link out, I think it's what you need.
Ruby rounding
class Float
def round_to(x)
(self * 10**x).round.to_f / 10**x
end
def ceil_to(x)
(self * 10**x).ceil.to_f / 10**x
end
def floor_to(x)
(self * 10**x).floor.to_f / 10**x
end
end
In general the algorithm for “rounding to the nearest x” is:
round(x / precision) * precision
Sometimes it is better to multiply by 1 / precision, because that is an integer (and thus it works a bit faster):
round(x * (1 / precision)) / (1 / precision)
In your case that would be:
round(x * (1 / 0.05)) / (1 / 0.05)
which would evaluate to:
round(x * 20) / 20;
I don’t know any Python, though, so the syntax might not be correct but I’m sure you can figure it out.
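In Ruby the general formula could be sketched as follows. The name round_to_step and the extra round(digits) call, which trims floating point noise left over from the multiplication, are assumptions rather than part of the answer above.

```ruby
# Round x to the nearest multiple of step.
def round_to_step(x, step, digits = 10)
  ((x / step).round * step).round(digits)
end

values = [2.36363636363636, 4.567563, 1.23456646544846, 10.5857447736]
puts values.map { |x| '%.2f' % round_to_step(x, 0.05) }.join(', ')
# 2.35, 4.55, 1.25, 10.60
```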
Less precise, but this method is what most people are googling this page for:
(5.65235534).round(2)
#=> 5.65
Here's a general function that rounds by any given step value:
place in lib:
lib/rounding.rb
class Numeric
# round a given number to the nearest step
def round_by(increment)
(self / increment).round * increment
end
end
and the spec:
require 'rounding'
describe 'nearest increment by 0.5' do
{0=>0.0,0.5=>0.5,0.60=>0.5,0.75=>1.0, 1.0=>1.0, 1.25=>1.5, 1.5=>1.5}.each_pair do |val, rounded_val|
it "#{val}.round_by(0.5) ==#{rounded_val}" do val.round_by(0.5).should == rounded_val end
end
end
and usage:
require 'rounding'
2.36363636363636.round_by(0.05)
hth.
It’s possible to round numbers with String class’s % method.
For example
"%.2f" % 5.555555555
would give "5.56" as result (a string).
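Float#round has been around for a long time; what Ruby 2.4 added is a half: option that controls how ties are rounded. The default still rounds half away from zero:
(2.5).round
3
# Ruby 2.4+: round half to even (banker's rounding)
(2.5).round(half: :even)
2
The accepted values are :up, :even and :down, e.g.: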
(4.5).round(half: :up)
5
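To compare the three tie-breaking modes, a small loop such as this sketch prints them side by side (Ruby 2.4 or later; the sample values are arbitrary ties).

```ruby
[0.5, 1.5, 2.5, 3.5].each do |x|
  puts format('%.1f  up: %d  even: %d  down: %d',
              x, x.round(half: :up), x.round(half: :even), x.round(half: :down))
end
# 0.5  up: 1  even: 0  down: 0
# 1.5  up: 2  even: 2  down: 1
# 2.5  up: 3  even: 2  down: 2
# 3.5  up: 4  even: 4  down: 3
```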
To get a rounding result without decimals, use Float's .round
5.44.round
=> 5
5.54.round
=> 6
I know the question is old, but I'd like to share my approach in case it helps others: this method rounds a float with a step, i.e. to the closest multiple of a given number. It's useful for rounding product prices, for example:
def round_with_step(value, rounding)
decimals = rounding.to_i
rounded_value = value.round(decimals)
step_number = (rounding - rounding.to_i) * 10
if step_number != 0
step = step_number * 10**(0-decimals)
rounded_value = ((value / step).round * step)
end
return (decimals > 0 ? "%.2f" : "%g") % rounded_value
end
# For example, the value is 234.567
#
# | ROUNDING | RETURN | STEP
# | 1 | 234.60 | 0.1
# | -1 | 230 | 10
# | 1.5 | 234.50 | 5 * 0.1 = 0.5
# | -1.5 | 250 | 5 * 10 = 50
# | 1.3 | 234.60 | 3 * 0.1 = 0.3
# | -1.3 | 240 | 3 * 10 = 30
