How to parse ruby BigDecimal inspect? - ruby

In the following code:
x = BigDecimal(10)
s = x.inspect # "#<BigDecimal:6fe4790,'0.1E2',9(36)>"
Is there a way to parse s and get the original value? The reason is that I have some text files with BigDecimal values written to them using inspect, and I need to parse these values.

Use .to_s to get the value as a string. .inspect prints a debugging representation of the object instead.
x = BigDecimal(10)
x.to_s
# => "0.1E2"

The documentation for BigDecimal#inspect is incomplete. Consider the following:
require 'bigdecimal'
BigDecimal.new("1.2345").inspect
#=> "#<BigDecimal:7fb06a110298,'0.12345E1',18(18)>"
...
BigDecimal.new("1.234567890").inspect
#=> "#<BigDecimal:7fb06a16ab58,'0.123456789E1',18(27)>"
BigDecimal.new("1.2345678901").inspect
#=> "#<BigDecimal:7fb06a14a6a0,'0.1234567890 1E1',27(27)>"
BigDecimal.new("1.23456789012").inspect
#=> "#<BigDecimal:7fb06a1393a0,'0.1234567890 12E1',27(27)>"
BigDecimal.new("1.234567890123").inspect
#=> "#<BigDecimal:7fb06a123780,'0.1234567890 123E1',27(27)>"
It can be seen from the source code for inspect that, if there are more than 10 significant digits, the digits are split into groups of 10 separated by a space (for readability, presumably):
BigDecimal.new("123.456789012345678901234567").inspect
#=> "#<BigDecimal:7fb06a0ac8b0,'0.1234567890 1234567890 1234567E3',36(36)>"
I suggest retrieving the string representation of the BigDecimal value as follows:
str = "#<BigDecimal:7fb06a14a6a0,'0.1234567890 1E1',27(27)>"
str.delete(' ').split(?')[1]
#=> "0.12345678901E1"
We are not finished. We must still convert the extracted string to a numeric object. We cannot rely on to_f, however, if the value has more significant digits than a Float can represent:
"1.23456789012345678".to_f
#=> 1.2345678901234567
The safest course of action is to return a BigDecimal object, using the method BigDecimal::new, which takes two arguments:
the value to be converted to a BigDecimal object, which can be an Integer, Float, Rational, BigDecimal, or String. If a String, which is what we will supply, "spaces are ignored and unrecognized characters terminate the value" (similar to "123.4cat".to_f #=> 123.4).
the number of significant digits. If omitted or zero, the number of significant digits is determined from the value. I will omit this argument. (For example, BigDecimal.new("0.1234E2").precs #=> [18, 18], where the array contains the current and maximum numbers of significant digits.)
Note the second argument is required if the first is a Float or Rational, else it is optional.
We therefore can write:
require 'bigdecimal'
def convert(str)
BigDecimal.new(str.delete(' ').split(?')[1])
end
convert "#<BigDecimal:7facd39d7ee8,'0.1234E4',9(18)>"
#=> #<BigDecimal:7facd39c7de0,'0.1234E4',9(18)>
convert "#<BigDecimal:7facd39b7be8,'0.1234E2',18(18)>"
#=> #<BigDecimal:7facd39ae610,'0.1234E2',18(18)>
convert "#<BigDecimal:7facd3990638,'0.1234E0',9(18)>"
#=> #<BigDecimal:7facd3980aa8,'0.1234E0',9(18)>
convert "#<BigDecimal:7facd3970e28,'0.1234E-2',9(18)>"
#=> #<BigDecimal:7facd39625d0,'0.1234E-2',9(18)>
v = convert "#<BigDecimal:7fb06a123780,'0.1234567890 123E1',27(27)>"
#=> #<BigDecimal:7fb069851d78,'0.1234567890 123E1',27(27)>
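A slightly stricter variant of the same idea, as a sketch only: it checks that the input really looks like the old inspect format before extracting the quoted value. The method name parse_bigdecimal_inspect and the exact regular expression are my own assumptions, and it calls Kernel#BigDecimal rather than BigDecimal.new, since BigDecimal.new is deprecated and absent from recent releases of the bigdecimal library.

```ruby
require 'bigdecimal'

# Assumed shape of the old inspect output: #<BigDecimal:ADDRESS,'VALUE',SS(MM)>
INSPECT_RE = /\A#<BigDecimal:\h+,'([^']+)',\d+\(\d+\)>\z/

# Hypothetical helper: returns a BigDecimal, or raises if the text
# does not look like an inspect string.
def parse_bigdecimal_inspect(str)
  m = INSPECT_RE.match(str.strip)
  raise ArgumentError, "not a BigDecimal inspect string: #{str.inspect}" unless m
  # inspect groups long digit runs with spaces, so strip them first
  BigDecimal(m[1].delete(' '))
end

v = parse_bigdecimal_inspect("#<BigDecimal:7fb06a14a6a0,'0.1234567890 1E1',27(27)>")
puts v.to_s('F')
```

Raising on unexpected input is a design choice for files that may contain other text; returning nil instead would work just as well.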
An easy way to see if the BigDecimal object can be converted to a float without loss of accuracy is:
def convert_bd_to_float(bd)
f = bd.to_f
(bd==BigDecimal.new(f.to_s)) ? f : nil
end
convert_bd_to_float BigDecimal.new('1234567890123456')
#=> 1.234567890123456e+15
convert_bd_to_float BigDecimal.new('12345678901234567')
#=> nil

"#<BigDecimal:6fe4790,'0.1E2',9(36)>"[/(?<=').+(?=')/]
# => "0.1E2"

I don't know which version of Ruby you are using, so I checked some MRI source code for BigDecimal:
/* Returns debugging information about the value as a string of comma-separated
 * values in angle brackets with a leading #:
 *
 * BigDecimal.new("1234.5678").inspect ->
 * "#<BigDecimal:b7ea1130,'0.12345678E4',8(12)>"
 *
 * The first part is the address, the second is the value as a string, and
 * the final part ss(mm) is the current number of significant digits and the
 * maximum number of significant digits, respectively.
 */
static VALUE
BigDecimal_inspect(VALUE self)
{
    ENTER(5);
    Real *vp;
    volatile VALUE obj;
    size_t nc;
    char *psz, *tmp;

    GUARD_OBJ(vp, GetVpValue(self, 1));
    nc = VpNumOfChars(vp, "E");
    nc += (nc + 9) / 10;

    obj = rb_str_new(0, nc+256);
    psz = RSTRING_PTR(obj);
    sprintf(psz, "#<BigDecimal:%"PRIxVALUE",'", self);
    tmp = psz + strlen(psz);
    VpToString(vp, tmp, 10, 0);
    tmp += strlen(tmp);
    sprintf(tmp, "',%"PRIuSIZE"(%"PRIuSIZE")>", VpPrec(vp)*VpBaseFig(), VpMaxPrec(vp)*VpBaseFig());
    rb_str_resize(obj, strlen(psz));
    return obj;
}
So what you want seems to be the second part of the inspect string, 0.1E2 in your case, which equals 10. The comment above is quite clear that this is the full numeric value of the object. A simple regex will be enough.
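To make the three parts described in that comment concrete, here is a rough sketch that pulls them apart with named captures. The helper name inspect_fields and the hash keys are my own labels, not anything defined by BigDecimal.

```ruby
# Assumed layout, taken from the source comment:
#   #<BigDecimal:ADDRESS,'VALUE',CURRENT(MAXIMUM)>
FIELDS_RE = /\A#<BigDecimal:(?<address>\h+),'(?<value>[^']*)',(?<current>\d+)\((?<maximum>\d+)\)>\z/

# Hypothetical helper: returns a hash of the parts, or nil if the
# string does not have the expected shape.
def inspect_fields(str)
  m = FIELDS_RE.match(str)
  return nil unless m
  {
    address: m[:address],
    value:   m[:value].delete(' '),  # drop the 10-digit grouping spaces
    current: m[:current].to_i,
    maximum: m[:maximum].to_i
  }
end

p inspect_fields("#<BigDecimal:b7ea1130,'0.12345678E4',8(12)>")
```

Only the value field is needed to rebuild the number; the address and precision figures are specific to the process that produced the string.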

Another option (using delete rather than tr!, because tr! returns nil when it changes nothing):
"#<BigDecimal:95915c4,'0.1E2',9(27)>".split(",")[1].delete("'")
=> "0.1E2"

Related

Ruby - Unpack array with mixed types

I am trying to use unpack to decode a binary file. The binary file has the following structure:
ABCDEF\tFFFABCDEF\tFFFF....
where
ABCDEF -> String of fixed length
\t -> tab character
FFF -> 3 Floats
.... -> repeat thousands of times
I know how to do it when types are all the same or with only numbers and fixed length arrays, but I am struggling in this situation. For example, if I had a list of floats I would do
s.unpack('F*')
Or if I had integers and floats like
[1, 3.4, 5.2, 4, 2.3, 7.8]
I would do
s.unpack('CF2CF2')
But in this case I am a bit lost. I was hoping to use a format string such as '(CF2)*' with brackets, but it does not work.
I need to use Ruby 2.0.0-p247, if that matters.
Example
ary = ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
s = ary.pack('P7fffP7fff')
then
s.scan(/.{19}/)
["\xA8lf\xF9\xD4\x7F\x00\x00\x9A\x99Y#33\xB3#\x9A\x99\x11", "A\x80lf\xF9\xD4\x7F\x00\x00\x00\x00 #ff\x0EAff"]
Finally
s.scan(/.{19}/).map{ |item| item.unpack('P7fff') }
Error: #<ArgumentError: no associated pointer>
<main>:in `unpack'
<main>:in `block in <main>'
<main>:in `map'
<main>:in `<main>'
You could read the file in small chunks of 19 bytes and use 'A7fff' to pack and unpack. Do not use the pointer directives ('p' and 'P'): they store a memory address rather than the bytes of the string, so they need more than 19 bytes per record and cannot be read back from a file.
You could also use 'A6xfff' to ignore the 7th byte and get a string with 6 chars.
Here's an example, which is similar to the documentation of IO.read:
data = [["ABCDEF\t", 3.4, 5.6, 9.1],
["FEDCBA\t", 2.5, 8.9, 3.1]]
binary_file = 'data.bin'
chunk_size = 19
pattern = 'A7fff'
File.open(binary_file, 'wb') do |o|
data.each do |row|
o.write row.pack(pattern)
end
end
raise "Something went wrong. Please check data, pattern and chunk_size." unless File.size(binary_file) == data.length * chunk_size
File.open(binary_file, 'rb') do |f|
while record = f.read(chunk_size)
puts '%s %g %g %g' % record.unpack(pattern)
end
end
# =>
# ABCDEF 3.4 5.6 9.1
# FEDCBA 2.5 8.9 3.1
You could use a multiple of 19 to speed up the process if your file is large.
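A rough sketch of that last suggestion, assuming the same 19-byte 'A7fff' records as above. The method name each_record and the batch size of 1024 records are arbitrary choices of mine.

```ruby
RECORD_SIZE = 19          # 7 bytes of string + 3 * 4 bytes of float
PATTERN = 'A7fff'
RECORDS_PER_READ = 1024   # arbitrary batch size (an assumption)

# Hypothetical helper: reads the file in large chunks that are a
# multiple of the record size, then yields one unpacked record at a time.
def each_record(path)
  File.open(path, 'rb') do |f|
    while (chunk = f.read(RECORD_SIZE * RECORDS_PER_READ))
      (0...chunk.bytesize).step(RECORD_SIZE) do |offset|
        record = chunk.byteslice(offset, RECORD_SIZE)
        break if record.bytesize < RECORD_SIZE  # ignore a truncated tail
        yield record.unpack(PATTERN)
      end
    end
  end
end
```

Usage would look like each_record('data.bin') { |name, a, b, c| puts name }. Slicing by byte offset avoids running a regular expression over binary data.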
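When dealing with mixed formats that repeat and have a known fixed size, it is often easier to split the string into records first.
A quick example would be (the /m flag lets . match newline bytes as well):
binary.scan(/.{#{LENGTH_OF_DATA}}/m).map { |item| item.unpack(FORMAT) }
Considering your example above, take the length of the string including the tab character (in bytes), plus the size of the 3 floats. If your strings are literally 'ABCDEF\t', you would use a size of 19 (7 for the string, 12 for the 3 floats).
Your final product would look like this, using 'A7' for the string because the pointer directives raise "no associated pointer" on data that was not packed in the same process:
str.scan(/.{19}/m).map { |item| item.unpack('A7fff') }
For example, here is a round trip inside a single irb session, where the 'p' directive does work: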
irb(main):001:0> ary = ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
=> ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
irb(main):002:0> s = ary.pack('pfffpfff')
=> "\xE8Pd\xE4eU\x00\x00\x9A\x99Y#33\xB3#\x9A\x99\x11A\x98Pd\xE4eU\x00\x00\x00\x00 #ff\x0EAffF#"
irb(main):003:0> s.unpack('pfffpfff')
=> ["ABCDEF\t", 3.4000000953674316, 5.599999904632568, 9.100000381469727, "FEDCBA\t", 2.5, 8.899999618530273, 3.0999999046325684]
The minor differences in precision are unavoidable, but do not worry about them: they come from the difference between a 32-bit float (the 'f' directive) and the 64-bit double that Ruby uses internally, and they are smaller than the precision a 32-bit float can represent.
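A small sketch of that precision point, comparing the 'f' and 'd' directives on the same value. The helper name round_trip is mine; whether 'd' is an option at all depends on how the file was actually written.

```ruby
# Hypothetical helper: pack a Float with the given directive and
# return the packed size in bytes plus the value read back.
def round_trip(value, directive)
  packed = [value].pack(directive)
  [packed.bytesize, packed.unpack(directive).first]
end

single_size, single_value = round_trip(3.4, 'f')  # native single precision
double_size, double_value = round_trip(3.4, 'd')  # native double precision

puts "f: #{single_size} bytes, #{single_value}"
puts "d: #{double_size} bytes, #{double_value}"
```

Only the 'd' round trip returns exactly 3.4, at the cost of 8 bytes per number instead of 4.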

Ruby convert string array to string

I have a Ruby string array value and I want to get it as a string value. I am using Ruby in a Chef recipe, running on Windows. Code:
version_string = Mixlib::ShellOut.new('some.exe -version').run_command
Log.info(version_string.stdout.to_s)
extract_var = version_string.stdout.to_s.lines.grep(/ver/)
Log.info('version:'+ extract_var.to_s)
The output is:
version 530
[2016-06-08T07:03:49+00:00] INFO: version ["version 530\r\n"]
I want to extract 530 string only.
long time no see since Rot :)
You can use some Chef helper methods and regular expressions to make this a little easier.
output = shell_out!('saphostexec.exe -version', cwd: 'C:\\Program Files\\hostctrl\\exe').stdout
if output =~ /kernel release\s+(\d+)/
kernel_version = $1
else
raise "unable to parse kernel version"
end
Chef::Log.info(kernel_version)
As you want val = 720 and not val = "720" you can write
val = strvar.first.to_i
#=> 720
You can return the first series of digits found as an integer from the current_kernel string with String#[regexp] :
current_kernel[/\d+/].to_i
#=> 720
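Applied to the output shown in the question, the same String#[regexp] idea could be sketched as follows. The helper name extract_version and the sample text are made up for illustration; only the "version 530" line comes from the question.

```ruby
# Hypothetical helper: find the first line mentioning "ver" and
# return the first run of digits on it as a string, or nil.
def extract_version(output)
  line = output.lines.grep(/ver/).first
  line && line[/\d+/]
end

sample = "some banner text\r\nversion 530\r\n"  # made-up sample output
puts extract_version(sample)
```

Calling first on the grep result is what turns the one-element array from the question into a plain string; add .to_i if an integer is wanted.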

How to unpack 7-bits at a time in ruby?

I'm trying to format a UUIDv4 into a url friendly string. The typical format in base16 is pretty long and has dashes:
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
To avoid dashes and underscores I was going to use base58 (like bitcoin does) so each character fully encodes sqrt(58).floor = 7 bits.
I can pack the uuid into binary with:
[ uuid.delete('-') ].pack('H*')
To get 8-bit unsigned integers, it's:
binary.unpack('C*')
How can I unpack every 7 bits into 8-bit unsigned integers? Is there a pattern to scan 7 bits at a time and set the high bit to 0?
require 'base58'
uuid ="123e4567-e89b-12d3-a456-426655440000"
Base58.encode(uuid.delete('-').to_i(16))
=> "3fEgj34VWmVufdDD1fE1Su"
and back again
Base58.decode("3fEgj34VWmVufdDD1fE1Su").to_s(16)
=> "123e4567e89b12d3a456426655440000"
A handy pattern to reconstruct the uuid format from a template
template = 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
src = "123e4567e89b12d3a456426655440000".each_char
template.each_char.reduce(''){|acc, e| acc += e=='-' ? e : src.next}
=> "123e4567-e89b-12d3-a456-426655440000"
John La Rooy's answer is great, but I just wanted to point out how simple the Base58 algorithm is because I think it's neat. (Loosely based on the base58 gem, plus bonus original int_to_uuid function):
ALPHABET = "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ".chars
BASE = ALPHABET.size
def base58_to_int(base58_val)
base58_val.chars
.reverse_each.with_index
.reduce(0) do |int_val, (char, index)|
int_val + ALPHABET.index(char) * BASE ** index
end
end
def int_to_base58(int_val)
''.tap do |base58_val|
while int_val > 0
int_val, mod = int_val.divmod(BASE)
base58_val.prepend ALPHABET[mod]
end
end
end
def int_to_uuid(int_val)
base16_val = int_val.to_s(16).rjust(32, '0') # pad so that leading zeros in the UUID survive
[ 8, 4, 4, 4, 12 ].map do |n|
base16_val.slice!(0...n)
end.join('-')
end
uuid = "123e4567-e89b-12d3-a456-426655440000"
int_val = uuid.delete('-').to_i(16)
base58_val = int_to_base58(int_val)
int_val2 = base58_to_int(base58_val)
uuid2 = int_to_uuid(int_val2)
printf <<END, uuid, int_val, base58_val, int_val2, uuid2
Input UUID: %s
Input UUID as integer: %d
Integer encoded as base 58: %s
Integer decoded from base 58: %d
Decoded integer as UUID: %s
END
Output:
Input UUID: 123e4567-e89b-12d3-a456-426655440000
Input UUID as integer: 24249434048109030647017182302883282944
Integer encoded as base 58: 3fEgj34VWmVufdDD1fE1Su
Integer decoded from base 58: 24249434048109030647017182302883282944
Decoded integer as UUID: 123e4567-e89b-12d3-a456-426655440000
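A side note on the "7 bits per character" idea in the question: a base58 character carries log2(58) bits, which is a little under 6, so the digits do not line up with fixed bit groups. That is why the answers above convert the whole UUID to one integer. The sketch below (the helper name chars_needed is mine) works out how many characters a 128-bit value needs in a given base, using exact integer arithmetic.

```ruby
# Hypothetical helper: smallest number of digits in the given base
# that can represent every value of the given bit width.
def chars_needed(bits, alphabet_size)
  length = 0
  capacity = 1
  while capacity < 2**bits
    capacity *= alphabet_size
    length += 1
  end
  length
end

puts Math.log2(58)          # bits carried by one base58 character
puts chars_needed(128, 58)  # length of a base58-encoded UUID, at most
puts chars_needed(128, 16)  # the usual hex form, without dashes
```

This agrees with the 22-character result "3fEgj34VWmVufdDD1fE1Su" shown above, against 32 hex digits.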

Match Multiple Patterns in a String and Return Matches as Hash

I'm working with some log files, trying to extract pieces of data.
Here's an example of a file which, for the purposes of testing, I'm loading into a variable named sample. NOTE: The column layout of the log files is not guaranteed to be consistent from one file to the next.
sample = "test script result
Load for five secs: 70%/50%; one minute: 53%; five minutes: 49%
Time source is NTP, 23:25:12.829 UTC Wed Jun 11 2014
D
MAC Address IP Address MAC RxPwr Timing I
State (dBmv) Offset P
0000.955c.5a50 192.168.0.1 online(pt) 0.00 5522 N
338c.4f90.2794 10.10.0.1 online(pt) 0.00 3661 N
990a.cb24.71dc 127.0.0.1 online(pt) -0.50 4645 N
778c.4fc8.7307 192.168.1.1 online(pt) 0.00 3960 N
"
Right now, I'm just looking for IPv4 and MAC address; eventually the search will need to include more patterns. To accomplish this, I'm using two regular expressions and passing them to Regexp.union
patterns = Regexp.union(/(?<mac_address>\h{4}\.\h{4}\.\h{4})/, /(?<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)
As you can see, I'm using named groups to identify the matches.
The result I'm trying to achieve is a Hash. The key should equal the capture group name, and the value should equal what was matched by the regular expression.
Example:
{"mac_address"=>"0000.955c.5a50", "ip_address"=>"192.168.0.1"}
{"mac_address"=>"338c.4f90.2794", "ip_address"=>"10.10.0.1"}
{"mac_address"=>"990a.cb24.71dc", "ip_address"=>"127.0.0.1"}
{"mac_address"=>"778c.4fc8.7307", "ip_address"=>"192.168.1.1"}
Here's what I've come up with so far:
sample.split(/\r?\n/).each do |line|
hashes = []
line.split(/\s+/).each do |val|
match = val.match(patterns)
if match
hashes << Hash[match.names.zip(match.captures)].delete_if { |k,v| v.nil? }
end
end
results = hashes.reduce({}) { |r,h| h.each {|k,v| r[k] = v}; r }
puts results if results.length > 0
end
I feel like there should be a more "elegant" way to do this. My chief concern, though, is performance.
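One possible simplification, offered as a sketch rather than a benchmarked answer: keep the patterns in a hash keyed by field name and apply each one to the whole line with String#[], which avoids splitting every line into tokens and merging partial hashes. The constant PATTERNS and the method extract_fields are names I made up, and the IP pattern below is deliberately as loose as the one in the question.

```ruby
# Field name => pattern. More patterns can be added here later.
PATTERNS = {
  'mac_address' => /\h{4}\.\h{4}\.\h{4}/,
  'ip_address'  => /\b\d{1,3}(?:\.\d{1,3}){3}\b/
}

# Hypothetical helper: one hash per line that matched at least one pattern.
def extract_fields(text)
  text.each_line.map do |line|
    PATTERNS.each_with_object({}) do |(name, re), row|
      value = line[re]          # first match on the line, or nil
      row[name] = value if value
    end
  end.reject(&:empty?)
end

sample = "header line\n0000.955c.5a50 192.168.0.1 online(pt) 0.00 5522 N\n"
extract_fields(sample).each { |row| puts row }
```

This only picks up the first match of each pattern per line, which fits the log layout shown above; whether it is actually faster would need measuring on real files.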

Calculating the difference between durations with milliseconds in Ruby

TL;DR: I need to get the difference between HH:MM:SS.ms and HH:MM:SS.ms as HH:MM:SS.ms
What I need:
Here's a tricky one. I'm trying to calculate the difference between two timestamps such as the following:
In: 00:00:10.520
Out: 00:00:23.720
Should deliver:
Diff: 00:00:13.200
I thought I'd parse the times into actual Time objects and use the difference there. This works great in the previous case, and returns 00:0:13.200.
What doesn't work:
However, for some, this doesn't work right, as Ruby uses usec instead of msec:
In: 00:2:22.760
Out: 00:2:31.520
Diff: 00:0:8.999760
Obviously, the difference should be 00:00:8.760 and not 00:00:8.999760. I'm really tempted to just tdiff.usec.to_s.gsub('999','') ……
My code so far:
Here's my code so far (these are parsed from the input strings like "0:00:10.520").
tin_first, tin_second = ins.split(".")
tin_hours, tin_minutes, tin_seconds = tin_first.split(":")
tin_usec = tin_second * 1000
tin = Time.gm(0, 1, 1, tin_hours, tin_minutes, tin_seconds, tin_usec)
The same happens for tout. Then:
tdiff = Time.at(tout-tin)
For the output, I use:
"00:#{tdiff.min}:#{tdiff.sec}.#{tdiff.usec}"
Is there any faster way to do this? Remember, I just want to have the difference between two times. What am I missing?
I'm using Ruby 1.9.3p6 at the moment.
Using Time:
require 'time' # Needed for Time.parse
def time_diff(time1_str, time2_str)
t = Time.at( Time.parse(time2_str) - Time.parse(time1_str) )
(t - t.gmt_offset).strftime("%H:%M:%S.%L")
end
out_time = "00:00:24.240"
in_time = "00:00:14.520"
p time_diff(in_time, out_time)
#=> "00:00:09.720"
Here's a solution that doesn't rely on Time:
def slhck_diff( t1, t2 )
ms_to_time( time_as_ms(t2) - time_as_ms(t1) )
end
# Converts "00:2:22.760" to 142760
def time_as_ms( time_str )
re = /(\d+):(\d+):(\d+)(?:\.(\d+))?/
parts = time_str.match(re).to_a.map(&:to_i)
parts[4]+(parts[3]+(parts[2]+parts[1]*60)*60)*1000
end
# Converts 142760 to "00:02:22.760"
def ms_to_time(ms)
m = ms.floor / 60000
"%02i:%02i:%06.3f" % [ m/60, m%60, ms/1000.0 % 60 ]
end
t1 = "00:00:10.520"
t2 = "01:00:23.720"
p slhck_diff(t1,t2)
#=> "01:00:13.200"
t1 = "00:2:22.760"
t2 = "00:2:31.520"
p slhck_diff(t1,t2)
#=> "00:00:08.760"
I figured the following could work:
out_time = "00:00:24.240"
in_time = "00:00:14.520"
diff = Time.parse(out_time) - Time.parse(in_time)
Time.at(diff).strftime("%H:%M:%S.%L")
# => "01:00:09.720"
It does print 01 for the hour, which I don't really understand.
In the meantime, I used:
Time.at(diff).strftime("00:%M:%S.%L")
# => "00:00:09.720"
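The stray 01 is most likely a time zone effect: Time.at returns a Time in the local zone, so in a UTC+1 zone the epoch plus nine seconds is displayed as 01:00:09. Converting to UTC before formatting removes the offset. Here is a sketch of that, with a method name (utc_diff) of my own choosing; it subtracts Rational values to sidestep Float rounding in the milliseconds, and it assumes both times fall on the same day and the difference is positive and under 24 hours.

```ruby
require 'time'  # needed for Time.parse

# Hypothetical helper: difference between two "HH:MM:SS.mmm" strings,
# formatted the same way.
def utc_diff(in_str, out_str)
  # to_r gives exact Rational seconds, so no Float rounding creeps in
  diff = Time.parse(out_str).to_r - Time.parse(in_str).to_r
  # .utc makes the formatted hour independent of the local time zone
  Time.at(diff).utc.strftime("%H:%M:%S.%L")
end

puts utc_diff("00:00:14.520", "00:00:24.240")
```

Hard-coding "00:" in the format string gives the same output for short intervals but silently drops any real hour component.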
Any answer that does this better will get an upvote or the accept, of course.
in_time = "00:02:22.760"
out_time = "00:02:31.520"
diff = (Time.parse(out_time) - Time.parse(in_time))*1000
puts diff
OUTPUT:
8760.0 milliseconds
Time.parse(out_time) - Time.parse(in_time) gives the result in seconds, so it is multiplied by 1000 to convert it into milliseconds.

Resources