Ruby Difference Between Integer Methods - ruby

What is the difference between
10.6.to_i
and
10.6.to_int
?

This is excellently explained here:
First of all, neither to_i nor to_int is meant to do something fancier than the other. Generally, the two methods don't differ much in their implementation; they differ only in what they announce to the outside world. The to_int method is generally used with conditional expressions and should be understood the following way: "Can this object be considered a real integer at all times?"
The to_i method is generally used to do the actual conversion and should be understood this way: "Give me the most accurate integer representation of this object, please."
String, for example, is a class that implements to_i but does not implement to_int. That makes sense because a string cannot be seen as an integer in its own right. However, on some occasions it can have a representation in integer form. If we write x = "123", we can very well call x.to_i and continue working with the resulting Fixnum instance. But that only worked because the characters in x could be translated into a numerical value. What if we had written x = "the horse outside is funny"? That's right: a string just cannot be considered an Integer all the time.
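The distinction described above can be checked directly. The following is a minimal sketch, not a definitive reference; the variable names are made up for illustration, and array indexing is used only as one example of a core method that relies on implicit conversion through to_int.

```ruby
# Float defines both methods, and both truncate toward zero.
float_to_i   = 10.6.to_i
float_to_int = 10.6.to_int

# String defines only the explicit conversion, to_i.
numeric_text = "123".to_i
prose_text   = "the horse outside is funny".to_i # no leading digits
string_has_to_int = "123".respond_to?(:to_int)

# Core methods that need an Integer fall back on to_int (implicit conversion).
letters = %w[a b c]
float_index = letters[1.9] # 1.9.to_int is 1

string_index_error =
  begin
    letters["1"] # String has no to_int, so this raises
    nil
  rescue TypeError => e
    e.class
  end

puts float_to_i, float_to_int, numeric_text, prose_text
puts string_has_to_int, float_index, string_index_error
```

The string index fails even though "1".to_i would succeed, which is the "announcement to the outside world" the quoted answer refers to.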

For a Float such as 10.6 there is no difference: Float#to_i and Float#to_int share the same implementation, and both return 10.

Related

Ruby: Help improving hashing algorithm

I am still relatively new to Ruby as a language, but I know there are a lot of convenience methods built into the language. I am trying to generate a "hash" to check against in a low-level block-chain verifier, and I am wondering if there are any "convenience methods" that I could use to make this hashing algorithm more efficient. I think I can make this more efficient by utilizing Ruby's max integer size, but I'm not sure.
Below is the current code, which takes in a string to hash, unpacks it into an array of Unicode code points, does computationally intensive math on each of those values, adds up the results, takes that sum modulo 65,536, and then returns the hex representation of that value.
def generate_hash(string)
  unpacked_string = string.unpack('U*')
  sum = 0
  unpacked_string.each do |x|
    sum += (x**2000) * ((x + 2)**21) - ((x + 5)**3)
  end
  new_val = sum % 65_536 # Gives a number from 0 to 65,535
  new_val.to_s(16)
end
On very large block-chains there is a very large performance hit which I am trying to get around. Any help would be great!
First and foremost, it is extremely unlikely that you are going to create anything that is more efficient than simply using String#hash. This is a case of you trying to build a better mousetrap.
Honestly, your hashing algorithm is very inefficient. The entire point of a hash is to be a fast, low-overhead way of quickly getting a "unique" (as unique as possible) integer to represent any object to avoid comparing by values.
Using that as a premise, if you start doing any type of intense computation in a hash algorithm, it is already counter-productive. Once you start implementing modulo and pow functions, it is inefficient.
Usually best practice involves taking a value(s) of the object that can be represented as integers, and performing bit operations on them, typically with prime numbers to help reduce hash collisions.
def hash
  h = value1 ^ 393
  h += value2 ^ 17
  h
end
In your example, you are for some reason forcing the hash to the maximum value of a 16-bit unsigned integer, when 32 bits is typical, although if you are comparing on the Ruby side this would be 31 bits (63 on a 64-bit build) due to how Ruby tags Fixnum values. Fixnum was deprecated on the Ruby side, as it should have been, but internally the same threshold still exists between how a Bignum and a Fixnum are handled. The Integer class simply provides one interface on the Ruby side, as those two really should never have been exposed outside of the C code.
In your specific example using strings, I would simply symbolize them. This gives a quick and efficient way to determine whether two strings are equal with hardly any overhead, and comparing two symbols is exactly the same as comparing two integers. There is a caveat to this method if you are comparing a vast number of strings. In Ruby versions before 2.2, once a symbol is created it stays alive for the life of the program. Any additional strings equal to it will return the same symbol, but you cannot reclaim the memory of the symbol (just a few bytes) for as long as the program runs. That is not good if you use this method to compare thousands and thousands of unique strings. Ruby 2.2 and later can garbage-collect symbols created dynamically from strings, which softens this caveat.
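If the existing hash values have to be preserved, most of the cost in the question's code comes from building the huge intermediate number x**2000 before reducing it. A possible sketch, assuming Ruby 2.5 or later for the two-argument Integer#pow, is to reduce modulo 65,536 at every step instead. The name generate_hash_mod and the constant MODULUS are made up for this example; generate_hash is the question's method, repeated only for comparison.

```ruby
MODULUS = 65_536

# Original algorithm from the question, kept for comparison.
def generate_hash(string)
  sum = 0
  string.unpack('U*').each do |x|
    sum += (x**2000) * ((x + 2)**21) - ((x + 5)**3)
  end
  (sum % MODULUS).to_s(16)
end

# Same arithmetic, but reduced modulo MODULUS at every step, so no
# intermediate value grows to thousands of bits.
def generate_hash_mod(string)
  sum = string.unpack('U*').sum do |x|
    (x.pow(2000, MODULUS) * (x + 2).pow(21, MODULUS) - (x + 5)**3) % MODULUS
  end
  (sum % MODULUS).to_s(16)
end

sample = "hello block-chain"
puts generate_hash(sample)
puts generate_hash_mod(sample)
```

Both methods should print the same hex string, because reducing each factor modulo 65,536 before multiplying does not change the final remainder.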

How does integer-float comparison work?

The following expression evaluated to true on Ruby 1.9:
31964252037939931209 == 31964252037939933000.0
# => true
but I have no clue how this is happening. Am I missing something here?
The explanation is simply that the standard floating-point representations used on computers have finite precision and can only approximate most values. A 64-bit double carries a 53-bit significand, so it cannot distinguish every integer above 2**53. This is not specific to Ruby; errors of the type you show in your question crop up in virtually every language and on every platform, and you simply need to be aware they can happen.
Trying to convert the large integer value in your example to floating-point illustrates the problem a little better—you can see the interpreter is unable to provide an exact representation:
irb(main):008:0> 31964252037939931209.to_f
=> 31964252037939933000.0
Wikipedia's article on floating point has a more thorough discussion of accuracy problems with further examples.
Ruby used to convert bignums to floats in such comparisons, and in the conversion precision was lost. The issue is solved in more recent versions.
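The loss of precision can be made visible with a short sketch. The variable names are illustrative only; the direct Integer/Float comparison is printed without a stated result because, as noted above, it depends on the Ruby version.

```ruby
big = 31964252037939931209

# Converting to Float rounds to the nearest representable double.
as_float   = big.to_f
round_trip = as_float.to_i # the exact integer value of that double

# A double has Float::MANT_DIG significand bits, so neighbouring
# integers above 2**Float::MANT_DIG collapse onto the same Float.
limit     = 2**Float::MANT_DIG
collapsed = (limit + 1).to_f == limit.to_f

puts as_float
puts round_trip - big # how far the conversion moved the value
puts collapsed
puts big.to_f == 31964252037939933000.0
puts big == as_float # depends on the Ruby version
```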
Here you can see the source code of the comparer for Ruby:
http://www.ruby-doc.org/core-1.9.3/Comparable.html#method-i-3D-3D
And seems to be using this actual comparer:
https://github.com/p12tic/libsimdpp/blob/master/simdpp/core/cmp_eq.h
The method seems to be comparing using this:
/** Compares 8-bit values for equality.
@code
r0 = (a0 == b0) ? 0xff : 0x0
...
rN = (aN == bN) ? 0xff : 0x0
@endcode
@par 256-bit version:
@icost{SSE2-AVX, NEON, ALTIVEC, 2}
*/
My guess might be that the value is the same for both numbers.

Ruby unary tilde (`~`) method

I was goofing around in a pry REPL and found some very interesting behavior: the tilde method.
It appears Ruby syntax has a built-in literal unary operator, ~, just sitting around.
This means ~Object.new sends the message ~ to an instance of Object:
class Object
  def ~
    puts 'what are you doing, ruby?'
  end
end
~Object.new #=> what are you doing, ruby?
This seems really cool, but mysterious. Is Matz essentially trying to give us our own customizable unary operator?
The only reference I can find to this in the rubydocs is in the operator precedence notes, where it's ranked as the number one highest precedence operator, alongside ! and unary +. This makes sense for unary operators. (For interesting errata about the next two levels of precedence, ** then unary -, check out this question.) Aside from that, no mention of this utility.
The two notable references to this operator I can find by searching, amidst the ~=, !~, and ~> questions, are this and this. They both note its usefulness, oddity, and obscurity without going into its history.
Just as I was about to write off ~ as a cool way to provide custom unary operator behavior for your objects, I found a place where it's actually used in Ruby: Fixnum (integers).
~2 returns -3. ~-1 returns 0. So it negates an integer and subtracts one... for some reason?
Can anyone enlighten me as to the purpose of the tilde operator's unique and unexpected behavior in Ruby at large?
Using pry to inspect the method:
show-method 1.~
From: numeric.c (C Method):
Owner: Fixnum
Visibility: public
Number of lines: 5
static VALUE
fix_rev(VALUE num)
{
    return ~num | FIXNUM_FLAG;
}
While this is impenetrable to me, it prompted me to look for a C unary ~ operator. One exists: it's the bitwise NOT operator, which flips the bits of a binary integer (~1010 => 0101). For some reason this translates to one less than the negation of a decimal integer in Ruby.
More importantly, since ruby is an object oriented language, the proper way to encode the behavior of ~0b1010 is to define a method (let's call it ~) that performs bitwise negation on a binary integer object. To realize this, the ruby parser (this is all conjecture here) has to interpret ~obj for any object as obj.~, so you get a unary operator for all objects.
This is just a hunch, anyone with a more authoritative or elucidating answer, please enlighten me!
--EDIT--
As @7stud points out, the Regexp class makes use of it as well, essentially matching the regex against $_, the last string received by gets in the current scope.
As @Daiku points out, the bitwise negation of Fixnums is also documented.
I think my parser explanation solves the bigger question of why ruby allows ~ as global unary operator that calls Object#~.
For fixnum, it's the one's complement, which in binary, flips all the ones and zeros to the opposite value. Here's the doc: http://www.ruby-doc.org/core-2.0/Fixnum.html#method-i-7E. To understand why it gives the values it does in your examples, you need to understand how negative numbers are represented in binary. Why ruby provides this, I don't know. Two's complement is generally the one used in modern computers. It has the advantage that the same rules for basic mathematical operations work for both positive and negative numbers.
The ~ is the binary one's complement operator in Ruby. One's complement just flips every bit of a number, which also flips its sign bit.
For example, 2 written as a 16-bit binary value (a Fixnum is wider, but the pattern is the same) is 0000 0000 0000 0010, thus ~2 would be equal to 1111 1111 1111 1101 in one's complement.
However, as you have noticed and this article discusses in further detail, Ruby's version of one's complement seems to be implemented differently, in that it not only makes the integer negative but also subtracts 1 from it. I have no idea why this is, but it does seem to be the case.
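The extra subtraction falls out of two's complement arithmetic: there, -x equals ~x + 1, so ~x equals -x - 1. A minimal sketch of that identity follows; the 16-bit mask is only an assumption made to keep the printed bit patterns short, since Ruby integers are not limited to 16 bits.

```ruby
# ~x is the bitwise NOT; under two's complement it equals -x - 1.
samples = [2, -1, 0, 41, -100]
identity_holds = samples.all? { |x| ~x == -x - 1 }

# Ruby integers are arbitrarily wide, so mask to 16 bits to see the pattern.
mask16 = 0xFFFF
bits_of_2     = format('%016b', 2 & mask16)
bits_of_not_2 = format('%016b', ~2 & mask16)

puts identity_holds
puts bits_of_2
puts bits_of_not_2
puts(~2)
puts(~-1)
```

Read as an unsigned pattern, the flipped bits of 2 are 1111 1111 1111 1101; read as a two's complement signed value, the same pattern is -3.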
It's mentioned in several places in pickaxe 1.8, e.g. the String class. However, in ruby 1.8.7 it doesn't work on the String class as advertised. It does work for the Regexp class:
print "Enter something: "
input = gets
pattern = 'hello'
puts ~ /#{pattern}/
--output:--
Enter something: 01hello
2
It is supposed to work similarly for the String class.
~ (Bignum)
~ (Complex)
~ (Fixnum)
~ (Regexp)
~ (IPAddr)
~ (Integer)
Each of these is documented.
This list is from the documentation for Ruby 2.6.
The behavior of this method "at large" is basically anything you want it to be, as you described yourself with your definition of a method called ~ on the Object class. The behaviors on the core classes, as defined by the implementation's maintainers, seem to be pretty well documented, so the method should not behave unexpectedly for those objects.
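As a sketch of both points, the snippet below defines ~ on a small class of its own rather than on Object, and then uses the built-in Regexp#~ discussed earlier in the thread. The Permission class and its attribute are hypothetical names invented for this example.

```ruby
# Hypothetical value object; the name Permission is not from the thread.
class Permission
  attr_reader :allowed

  def initialize(allowed)
    @allowed = allowed
  end

  # Unary ~ is an ordinary method: `~obj` is dispatched as `obj.~`.
  def ~
    Permission.new(!allowed)
  end
end

granted = Permission.new(true)
denied  = ~granted
puts denied.allowed
puts granted.respond_to?(:~)

# Regexp#~ matches against $_ (normally the last line read by gets).
$_ = "01hello"
position = ~/hello/
puts position
```

Defining the operator on a dedicated class keeps the custom meaning local, whereas reopening Object changes the behavior of ~ for every object that does not define its own.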

The difference and use of strings and string arrays?

Okay, so as far as I know, a string is basically an array of characters. So why would there be string arrays in VB? And what are the differences between them?
Just the basics; the way they operate is what I'm interested in.
At times it is very useful to think of a String as an array of characters. It can also be useful to think of it as an array of bytes, which is of course not the same thing at all.
See The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) for better understanding of the differences between bytes and the characters held by Strings (UTF-16LE) as well as other character encodings commonly used.
But all of that aside, a String is really a higher level abstraction that you should not think of as an array of any kind.
After all, by that sort of logic an Integer or Long is an array as well.
So considering that a String is meant to be viewed as a primitive scalar value type the purpose of String arrays should be pretty clear. Arrays of Strings have pretty much the same sorts of uses as arrays of any other data type.
The fact that you have operations you can perform on Strings that root around inside them (substring operations) isn't much different conceptually than the operations that operate on the data inside any other simple type.
Say you need to store a list of names. It might be 100 names or 200 names; it depends on the case. What will you do?
A String array can solve such a case.
Try this:
Dim Names() As String
ReDim Names(3) As String
Names(0) = "First"
Names(1) = "Second"
Names(2) = "Third"
Names(3) = "Fourth"
Dim l As Long
For l = LBound(Names) To UBound(Names)
    MsgBox Names(l)
Next

Why does Ruby's Fixnum#/ method round down when it is called on another Fixnum?

Okay, so what's up with this?
irb(main):001:0> 4/3
=> 1
irb(main):002:0> 7/8
=> 0
irb(main):003:0> 5/2
=> 2
I realize Ruby is doing integer division here, but why? With a language as flexible as Ruby, why couldn't 5/2 return the actual, mathematical result of 5/2? Is there some common use for integer division that I'm missing? It seems to me that making 7/8 return 0 would cause more confusion than any good that might come from it is worth. Is there any real reason why Ruby does this?
Because most languages (even advanced/high-level ones) do it? You will see the same behaviour on integers in C, C++, Java, and Python 2 (Python 3 moved it to the // operator, and Perl needs use integer). This is integer division with a remainder (hence the corresponding modulo % operator); Ruby rounds the quotient toward negative infinity, while C and Java truncate toward zero.
Integer division is even implemented at the hardware level on many architectures. Others have asked this question, and one reason is symmetry: in statically typed languages such as C, this allows all integer operations to return integers, without loss of precision. It also allows easy access to the corresponding low-level assembler operation, since C was designed as a sort of extension layer over it.
Moreover, as explained in one comment to the linked article, floating point operations were costly (or not supported on all architectures) for many years, and not required for processes such as splitting a dataset in fixed lots.
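When the fractional result is what you want, Ruby offers several explicit ways to ask for it. The sketch below uses only core methods; the variable names are illustrative.

```ruby
floor_quotient = 7 / 8          # Integer / Integer stays an Integer
float_quotient = 7.fdiv(8)      # explicit floating-point division
mixed_quotient = 7 / 8.0        # a Float operand promotes the result
exact_quotient = Rational(7, 8) # exact fraction, no rounding

# Quotient and remainder together, as used when splitting data into lots.
quotient, remainder = 5.divmod(2)

# Ruby floors toward negative infinity, so the remainder stays non-negative
# for a positive divisor.
negative_quotient  = -7 / 2
negative_remainder = -7 % 2

puts floor_quotient, float_quotient, mixed_quotient, exact_quotient
puts quotient, remainder, negative_quotient, negative_remainder
```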
