How can Floats and Symbols have immediate value in Ruby?

I was informed by a fellow StackOverflow user that Floats now have immediate value in Ruby. However, I am confused as to how this is implemented.
I am also confused as to how Symbols can have immediate value.
I understand that objects with immediate value are objects whose entire state can be encapsulated in an unsigned long C variable called VALUE.
I can intuitively understand how this is possible for small integers (Fixnums) and for trivial things like true, false, and nil.
But, with no length restriction on Floats and Symbols, how can these objects be represented without their own structs?

First, a Float does have a length restriction - a Float is basically a native double-precision floating point value, i.e. 64 bits. That's still too much, though, so a Float is an immediate value (a "flonum") only if its exponent falls within a reduced range (you can see the exact condition in the MRI source).
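To make the scheme concrete, here is a rough Python model of the flonum test (the rotate-by-3 trick and the constants mirror what MRI does on 64-bit builds, but treat this as an illustrative sketch, not MRI's actual code):

import struct

def to_flonum(d):
    # Sketch of MRI's flonum test: only doubles whose top exponent bits
    # are 011 or 100 qualify; the bit pattern is rotated left by 3 so a
    # tag can live in the low bits (low bits 0b10 mark a flonum).
    bits = struct.unpack("<Q", struct.pack("<d", d))[0]
    if bits == 0:                          # +0.0 is special-cased in MRI
        return 0x8000000000000002
    top3 = (bits >> 60) & 0x7              # bits 62..60 (sign bit excluded)
    if top3 in (0b011, 0b100) and bits != 0x3000000000000000:
        rot = ((bits << 3) | (bits >> 61)) & 0xFFFFFFFFFFFFFFFF
        return (rot & ~0x01) | 0x02        # dropped low bit is redundant
    return None                            # exponent out of range: heap Float

print(hex(to_flonum(1.5)))    # an immediate flonum VALUE
print(to_flonum(1e300))       # None - exponent too large, needs a heap object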
As for Symbols, there is a data structure enumerating all created symbols, so they can be referred to as offsets into that table (on Ruby 2.2, only some symbols are like this - the rest are "normal", garbage-collectible objects).
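The interning idea behind symbols fits in a few lines of Python; the 8-bit shift and the 0x0C tag imitate MRI's static-symbol layout, but take the exact numbers as illustrative:

SYMBOL_TAG = 0x0C      # low byte marking a static symbol (illustrative)
_table = []            # id -> name
_ids = {}              # name -> id

def intern(name):
    # Return an immediate VALUE for a symbol: an index into a global table.
    if name not in _ids:
        _ids[name] = len(_table)
        _table.append(name)
    return (_ids[name] << 8) | SYMBOL_TAG

def sym_name(value):
    return _table[value >> 8]

a, b = intern("foo"), intern("foo")
assert a == b                  # same name -> same immediate value
print(sym_name(a))             # 'foo': the state lives in the table, not the VALUE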

Related

What is the purpose of arbitrary precision constants in Go?

Go features untyped exact numeric constants with arbitrary size and precision. The spec requires all compilers to support integers to at least 256 bits, and floats to at least 272 bits (256 bits for the mantissa and 16 bits for the exponent). So compilers are required to faithfully and exactly represent expressions like this:
const (
    PI       = 3.1415926535897932384626433832795028841971
    Prime256 = 84028154888444252871881479176271707868370175636848156449781508641811196133203
)
This is interesting...and yet I cannot find any way to actually use any such constant that exceeds the maximum precision of the 64 bit concrete types int64, uint64, float64, complex128 (which is just a pair of float64 values). Even the standard library big number types big.Int and big.Float cannot be initialized from large numeric constants -- they must instead be deserialized from string constants or other expressions.
The underlying mechanics are fairly obvious: the constants exist only at compile time and must be coerced to some representable value to be used at runtime. They are a language construct that exists only in code and during compilation. You cannot retrieve the raw value of a constant at runtime; it is not stored at some address in the compiled program itself.
So the question remains: Why does the language make such a point of supporting enormous constants when they cannot be used in practice?
TL;DR: Go's arbitrary-precision constants let you work with "real" numbers rather than "boxed" numbers, so artifacts like overflow, underflow, and infinity corner cases are avoided. You can work at higher precision, and only the result has to be converted to a limited-precision type, mitigating the effect of intermediate errors.
The Go Blog: Constants (emphases mine, addressing your question):
Numeric constants live in an arbitrary-precision numeric space; they are just regular numbers. But when they are assigned to a variable the value must be able to fit in the destination. We can declare a constant with a very large value:
const Huge = 1e1000
—that's just a number, after all—but we can't assign it or even print it. This statement won't even compile:
fmt.Println(Huge)
The error is, "constant 1.00000e+1000 overflows float64", which is true. But Huge might be useful: we can use it in expressions with other constants and use the value of those expressions if the result can be represented in the range of a float64. The statement,
fmt.Println(Huge / 1e999)
prints 10, as one would expect.
In a related way, floating-point constants may have very high precision, so that arithmetic involving them is more accurate. The constants defined in the math package are given with many more digits than are available in a float64. Here is the definition of math.Pi:
Pi = 3.14159265358979323846264338327950288419716939937510582097494459
When that value is assigned to a variable, some of the precision will be lost; the assignment will create the float64 (or float32) value closest to the high-precision value. This snippet
pi := math.Pi
fmt.Println(pi)
prints 3.141592653589793.
Having so many digits available means that calculations like Pi/2 or other more intricate evaluations can carry more precision until the result is assigned, making calculations involving constants easier to write without losing precision. It also means that there is no occasion in which the floating-point corner cases like infinities, soft underflows, and NaNs arise in constant expressions. (Division by a constant zero is a compile-time error, and when everything is a number there's no such thing as "not a number".)
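The Huge / 1e999 behaviour can be mimicked at runtime with exact rational arithmetic; here is a small Python analogy of what the Go compiler does in its constant space (Fraction plays the role of the arbitrary-precision constant):

from fractions import Fraction

# Analogy for Go's constant arithmetic: evaluate exactly, round only
# when the result is assigned to a concrete (here: float) type.
huge = Fraction(10) ** 1000            # like: const Huge = 1e1000
result = huge / Fraction(10) ** 999    # exact division in "constant space"
print(float(result))                   # 10.0 - only the result is converted

# Converting huge itself would overflow, just as in Go:
# float(huge) raises OverflowError.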
See related: How does Go perform arithmetic on constants?

64 bit integer and 64 bit float homogeneous representation

Assume we have some sequence as input. For performance reasons we may want to convert it to a homogeneous representation, i.e. convert every element to the same type. Let's consider only two input types here - int64 and float64 (in my simple code I will use numpy and Python; that is incidental to the question - one may think purely in terms of 64-bit integers and 64-bit floats).
First we may try to cast everything to float64.
So we want input like:
31 1.2 -1234
to be converted to float64. If the input were all int64 we could leave it unchanged ("already homogeneous"), and if something else were found we would return "not homogeneous". Pretty straightforward.
But here is the problem. Consider a bit modified input:
31000000 1.2 -1234
The idea is clear - we need to check that our "caster" can handle int64 values of large absolute value properly:
format(np.float64(31000000), '.0f') # just convert to float64 and print
'31000000'
Seems like no problem at all. So let's get to the point right away:
im = np.iinfo(np.int64).max # maximum of int64 type
format(np.float64(im), '.0f')
'9223372036854775808'
format(np.float64(im-100), '.0f')
'9223372036854775808'
Now this is really undesirable - we lose information that may be needed. That is, we want to preserve all the information provided in the input sequence.
So our im and im-100 values cast to the same float64 representation. The reason is clear - float64 has only a 53-bit significand out of its 64 bits, so its precision covers log10(2^53) ≈ 15.95 digits, i.e. essentially every int64 of up to 16 digits can be represented without information loss, while the int64 type holds values of up to 19 digits.
So we end up with a range of roughly [10^16; 10^19] (more precisely [2^53; int64.max]) in which an int64 can only be represented with possible information loss.
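The threshold is easy to confirm empirically - 2^53 still round-trips through float64, while 2^53 + 1 does not:

import numpy as np

exact = 2**53
assert np.int64(np.float64(exact)) == exact            # round-trips losslessly
assert np.int64(np.float64(exact + 1)) != exact + 1    # first lossy integer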
Q: What decision should one make in such a situation in order to represent int64 and float64 homogeneously?
I see several options for now:
Just convert the whole int64 range to float64 and "forget" about the possible information loss.
The motivation here is "the majority of inputs will rarely contain int64 values > 10^16".
EDIT: This clause was misleading. In the clear formulation we don't consider such solutions (but I left it for completeness).
Do not make such automatic conversions at all; convert only if explicitly told to.
I.e. we accept the performance drawbacks for any mixed int-float arrays, even ones as simple as the first case above.
Calculate a threshold below which conversion to float64 is lossless, and use it when making the casting decision: if an int64 above this threshold is found, do not convert (return "not homogeneous").
We have already calculated this threshold: it is 2^53, i.e. about 16 decimal digits (see the sketch after this list).
Create a new type, "fint64". This is an exotic option, but I'm considering even it for completeness.
The motivation consists of two points. First, it is a frequent situation that a user wants to store int and float types together. Second, the structure of the float64 type: I don't quite understand why one needs a ~308-digit value range when the significand holds only ~16 of those digits and the other ~292 are effectively noise. So we might use one of float64's exponent bits to indicate whether a float or an int is stored. For int64 it would definitely be a drawback to lose 1 bit, since that halves the integer range, but we would gain the ability to store ints alongside floats without any additional overhead.
EDIT: While I initially thought of this as an "exotic" option, it is in fact just a variant of another alternative - a composite type for our representation (see option 5). But I should add that my first composition has a definite drawback - it loses some range for both float64 and int64. What we should rather do is not subtract a bit but add one bit that flags whether an int or a float is stored in the following 64 bits.
As proposed by @Brendan, one may use a composite type consisting of a "combination of 2 or more primitive types". Using additional primitives we could cover the "problem" range of int64, for example, and get a homogeneous representation in this "new" type.
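For concreteness, here is a small sketch of option 3's decision rule (the function name to_homogeneous is mine, and the threshold is the 2^53 bound derived above):

import numpy as np

SAFE_MAX = 2**53  # largest magnitude float64 represents exactly

def to_homogeneous(values):
    # Option 3: cast a mixed int64/float64 sequence to float64 only if lossless.
    if all(isinstance(v, int) for v in values):
        return np.array(values, dtype=np.int64)      # already homogeneous
    if any(isinstance(v, int) and abs(v) > SAFE_MAX for v in values):
        return None                                  # refuse a lossy cast
    return np.array(values, dtype=np.float64)

print(to_homogeneous([31, 1.2, -1234]))                # safe cast to float64
print(to_homogeneous([np.iinfo(np.int64).max, 1.2]))   # None: "not homogeneous"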
EDITs:
Because questions arose I need to be very specific: the devised application does the following - it converts a sequence of int64 or float64 values to some homogeneous representation, losslessly if possible. The solutions are compared by performance (e.g. total excess RAM needed for the representation). That is all. No other requirements are considered here (we should consider the problem in its minimal state, not write a whole application). Accordingly, any algorithm that represents our data homogeneously and losslessly (so that we are sure no information was lost) fits our purposes.
I've decided to remove the words "app" and "user" from the question - they were also misleading.
When choosing a data type there are 3 requirements:
whether values may have different signs
needed precision
needed range
Of course hardware doesn't provide a lot of types to choose from, so you'll need to select the next largest provided type. For example, if you want to store values ranging from 0 to 500 with 8 bits of precision, hardware won't provide anything like that, and you will need to use either a 16-bit integer or a 32-bit floating point type.
When choosing a homogeneous representation there are 3 requirements:
whether values may have different signs, determined from the requirements of all of the original types being represented
needed precision, determined from the requirements of all of the original types being represented
needed range, determined from the requirements of all of the original types being represented
For example, if you have integers from -10 to +10000000000 you need a 35-bit integer type, which doesn't exist, so you'll use a 64-bit integer; and if you need floating point values from -2 to +2 with 31 bits of precision you'll need a 33-bit floating point type, which doesn't exist, so you'll use a 64-bit floating point type. From the requirements of these two original types you'll know that a homogeneous representation needs a sign flag, a 33-bit significand (with an implied bit), and a 1-bit exponent; that doesn't exist either, so you'll use a 64-bit floating point type as the homogeneous representation.
However, if you don't know anything about the requirements of the original data types (and only know that, whatever they were, they led to the selection of a 64-bit integer type and a 64-bit floating point type), then you'll have to assume "worst cases". This leads to needing a homogeneous representation with a sign flag, 62 bits of stored precision (plus an implied 1 bit, to cover int64's 63 magnitude bits) and an 11-bit exponent (to cover float64's full range). Of course this 74-bit floating point type doesn't exist, so you need to select the next largest type.
Also note that sometimes there is no "next largest type" that hardware supports. When this happens you need to resort to "composed types" - a combination of 2 or more primitive types. This can include anything up to and including "big rational numbers" (numbers represented by 3 big integers in "numerator / divisor * (1 << exponent)" form).
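Python's Fraction, which is built on big integers, can stand in for such a composed type - it holds any int64 and any float64 value exactly:

from fractions import Fraction

# A composed "big rational" representation: exact for both source types.
exact_int = Fraction(2**63 - 1)   # int64 max, no precision loss
exact_flt = Fraction(1.2)         # the exact value of the double nearest 1.2
print(exact_int)                  # 9223372036854775807
print(exact_flt)                  # 5404319552844595/4503599627370496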
Of course if the original types (the 64-bit integer type and 64-bit floating point type) were primitive types and your homogeneous representation needs a "composed type", then the "for performance reasons we may want to convert it to a homogeneous representation" assumption is likely to be false (it's likely that, for performance reasons, you want to avoid a homogeneous representation).
In other words:
If you don't know anything about the requirements of the original data types, it's likely that, for performance reasons, you want to avoid using a homogeneous representation.
Now...
Let's rephrase your question as "How to deal with design failures (choosing the wrong types which don't meet requirements)?". There is only one answer, and that is to avoid the design failure. Run-time checks (e.g. throwing an exception if the conversion to the homogeneous representation caused precision loss) serve no purpose other than to notify developers of design failures.
It is actually very basic: use 64-bit floating point. Floating point is an approximation, and you will lose precision for many ints. But there are no uncertainties other than "might this originally have been integral?" and "does the original value deviate by more than 1.0?".
I know of one non-standard floating point representation that would be more powerful (it can be found on the net). That might (or might not) help cover the ints.
The only way to get an exact int mapping would be to reduce the int range - guarantee that (say) 60-bit ints are exact, with the remaining range approximated by floating point. The floating point side would have to be reduced too, in either exponent range (as mentioned) or precision (the mantissa).

What can you do with integers in GLSL ES?

There seems to be an extremely limited number of things you can do with integers in GLSL ES 2.0.
For example:
No bitwise operations are supported.
No intrinsic functions seem to accept int types
So it seems the only things you can really do with integers are:
Assign integer to integer
Access array element by index
Basic arithmetic (-, +, *, /), but only with other integers. No operator exists that lets you add a float and an int together.
Compare them between each other.
Convert them to other types
vec constructors allow you to pass integers
Is that really it? Are there any intrinsic functions that take int parameters other than vec constructors? Are there any operations you can do between float and int?
You are correct; the use cases for integers are limited. Per the OpenGL ES Shading Language spec, section 4.1.3:
Integers are mainly supported as a programming aid. At the hardware level, real integers would aid efficient implementation of loops and array indices, and referencing texture units. However, there is no requirement that integers in the language map to an integer type in hardware. It is not expected that underlying hardware has full support for a wide range of integer operations. An OpenGL ES Shading Language implementation may convert integers to floats to operate on them. Hence, there is no portable wrapping behavior.
Since there's no guarantee that a GLSL ES integer type maps to a hardware integer type, it's mainly useful for compile-time type checking, with little to no effect at runtime. All the places that require integers are listed in the spec.

How does Ruby tell whether a VALUE holds an immediate value or a pointer?

For values such as true, nil, or small integers, Ruby performs an optimization: instead of using VALUE as a pointer, it uses the VALUE itself to store the data.
I wonder how Ruby makes a difference between these uses:
def foo(x)
...
with x that will be associated with a VALUE. In low-level terms, a VALUE is just a number. How can Ruby tell whether a given number is a pointer to an object or a direct value? All that comes to my mind is to restrict pointers to have the MSB set to 0 and direct values to have the MSB set to 1, but that is just my guess. How is it done in Ruby?
There are many different implementations of Ruby. The Ruby Language Specification doesn't prescribe any particular internal representation for objects – why should it? It's an internal representation, after all!
For example, JRuby doesn't represent objects as C pointers at all, it represents them as Java objects. IronRuby represents them as .NET objects. Opal represents them as ECMAScript objects. MagLev represents them as Smalltalk objects.
However, there are indeed some implementations that use the strategy you describe: the now-abandoned MRI did it that way, and YARV and Rubinius do it too.
This is actually a very old trick, dating back to at least the 1960s. It's called a tagged pointer representation, and like the name suggests, you need to tag the pointer with some additional metadata in order to know whether or not it is actually a pointer to an object or an encoding of some other datatype.
Some CPUs have special tag bits specifically for that purpose. (For example, on the AS/400, the CPU doesn't even have pointers; it has 128-bit object references, even though the original CPU was only 48 bits wide and the newer POWER-based CPUs are 64 bits; the extra bits are used to encode all sorts of metadata like type, owner, access restrictions, etc.) Some CPUs have tag bits for other purposes that can be "abused" for this purpose. However, most modern mainstream CPUs don't have tag bits.
But you can use a trick! On many modern CPUs, unaligned memory accesses (accessing an address that does not start at a word boundary) are really slow (on some, they aren't even possible), which means that on a 32-bit CPU all pointers realistically in use end in two 0 bits, and on a 64-bit CPU in three 0 bits. You can use these bits as tag bits: values that end in 00 are indeed pointers, while values that end in 01, 10, or 11 are encodings of some other data type.
In MRI, the pointers ending in 1 were used to encode 31/63-bit Fixnums. In YARV, they are used the same way: integers are encoded as actual machine integers, according to the formula 2n+1 (arithmetically speaking) or (n << 1) | 1 (as a bit pattern). On 64-bit platforms, YARV also uses pointers ending in 10 to encode 62-bit flonums via a similar scheme. (If you ever wondered why the object_id of a Fixnum in YARV is 2n+1, now you know: YARV uses the memory address for the object ID, and 2n+1 is the "memory address" of n.)
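A toy Python model of that Fixnum tagging (heap pointers are word-aligned, so their low bit is always 0 and the two cases cannot collide):

def int_to_value(n):
    return (n << 1) | 1          # 2n + 1: the tag bit marks an immediate Fixnum

def value_is_fixnum(v):
    return v & 1 == 1            # aligned pointers always end in 0

def value_to_int(v):
    return v >> 1                # shifting out the tag bit recovers n

v = int_to_value(21)
print(v, value_is_fixnum(v), value_to_int(v))   # 43 True 21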
Now, what about nil, false and true? Well, there is no space for them in our current scheme. However, the very low memory addresses are usually reserved for the operating system kernel, which means that a pointer like 0 or 2 or 4 cannot realistically occur in a program. YARV uses that space to encode nil, false and true: false is encoded as 0 (which is convenient because that's also the encoding of false in C), nil is encoded as 0b1000 and true is encoded as 0b10100 (it used to be 0, 0b10 and 0b100 in older versions before the introduction of flonums).
Theoretically, there is a lot of space there to encode other objects as well, but YARV doesn't do that. Some Smalltalk or Lisp VMs, for example, encode ASCII or BMP Unicode character objects there, or some often used objects such as the empty list, empty array, or empty string.
There is still some piece missing, though: without an object header, with just the bare bit pattern, how can the VM access the class, the methods, the instance variables, etc.? Well, it can't. Those have to be special-cased and hardcoded into the VM. The VM simply has to know that a pointer ending in 1 is an encoded Fixnum and has to know that the class is Fixnum and the methods can be found there. And as for instance variables? Well, you could store them separately from the objects in a dictionary on the side. Or you go the Ruby route and simply disallow them altogether.
This answer is merely a distillation of @Jörg's always-excellent treatise above.
In MRI, true, false, nil and Fixnums are mapped to fixed object_id's; all other objects are assigned dynamically-generated values. The object_id for false is 0. For true and nil they are 20 and 8 (2 and 4 prior to v2.0), respectively. The integer i has object_id i*2+1. Dynamically-generated object_id's cannot be any of these values. Therefore, (in MRI) one can merely check to see if the object_id is one of these values to determine if the associated object has a fixed object_id.
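To make that mapping concrete, here is a toy Python model of the fixed object_id scheme (post-2.0 MRI values; an illustration, not MRI's code):

def fixed_object_id(obj):
    # Fixed object_id mapping described above; anything else gets a dynamic id.
    if obj is False:
        return 0
    if obj is None:
        return 8
    if obj is True:
        return 20
    if isinstance(obj, int):
        return 2 * obj + 1        # Fixnum i maps to i*2 + 1
    return None

# Dynamically generated ids never collide with these values, so checking
# against them tells you whether the object is one of the fixed ones.
print([fixed_object_id(x) for x in (False, None, True, 42)])   # [0, 8, 20, 85]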
Incidentally, objects can be obtained from their object_id's with the method ObjectSpace#_id2ref.
For more on this, see @sepp2k's answer here.

How do I trim the zero value after decimal

As I tried to debug, I found that just as I type in
Dim value As Double
value = 0.90000
then hit enter, and it automatically converts to 0.9
Shouldn't it keep the precision in double in visual basic?
For my calculation, I absolutely need to show the precision
If precision is required then the Currency data type is what you want to use.
There are at least two representations of your value in play. One is the value you see on the screen -- a string -- and one is the internal representation -- a binary value. In dealing with fractional values, the two are often not equivalent and where they aren't, it's because they can't be.
If you stick with doubles, VB will maintain 53 bits of mantissa throughout your calculations, no matter how they might appear when printed. If you transition through the string domain, say by saving to a file or DB and later retrieving, it often has to leave some of that precision behind. It's inevitable, because the interface between the two domains is not perfect. Some values that can be exactly represented as strings (or Decimals, that is, powers of ten) can't be exactly represented as fractional powers of 2.
This has nothing to do with VB, it's the nature of floating point. The best you can do is control where the rounding occurs. For this purpose your friend is the Format function, which controls how a value appears in string form.
? Format$(0.9, "0.00000") will show you an example.
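To see why rounding is unavoidable, you can print the exact value the binary domain actually stores; Python's Decimal exposes it (VB6's Double is the same IEEE 754 double format):

from decimal import Decimal

# The double closest to 0.9 is not exactly 0.9:
print(Decimal(0.9))
# 0.90000000000000002220446049250313080847263336181640625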
You are getting what you see on the screen confused with what bits are being set in the Double to make that number.
VB is simply being "helpful" and knocking off the excess zeros. But for all intents and purposes,
0.9
is identical to
0.90000
If you don't believe me, try doing this comparison:
Debug.Print CDbl("0.9") = CDbl("0.90000")
As has already been said, displayed precision can be shown using the Format$() function, e.g.
Debug.Print Format$(0.9, "0.00000")
No, it shouldn't keep the precision. Binary floating point values don't retain this information... and it would be somewhat odd to do so, given that you're expressing the value in one base even though it's being represented in another.
I don't know whether VB6 has a decimal floating point type, but that's probably what you want - or a fixed point decimal type, perhaps. Certainly in .NET, System.Decimal has retained extra 0s from .NET 1.1 onwards. If this doesn't help you, you could think about remembering two integers - e.g. "90000" and "100000" in this case, so that the value you're representing is one integer divided by another, with the associated level of precision.
EDIT: I thought that Currency may be what you want, but according to this article, that's fixed at 4 decimal places, and you're trying to retain 5. You could potentially just multiply by 10, if you always want 5 decimal places - but it's an awkward thing to remember to do everywhere... and you'd have to work out how to format it appropriately. It would also always be 4 decimal places, I suspect, even if you'd specified fewer - so if you want "0.300" to be different to "0.3000" then Currency may not be appropriate. I'm entirely basing this on articles online though...
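For comparison, the "retained extra 0s" behaviour mentioned above for System.Decimal is a property of decimal types generally; Python's Decimal shows the same thing:

from decimal import Decimal

print(Decimal("0.90000"))                    # 0.90000 - significance is kept
print(Decimal("0.9") == Decimal("0.90000"))  # True: equal value, different display
print(0.90000)                               # 0.9 - a binary double has no such notion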
You can also enter the value as 0.9# instead. This helps avoid implicit coercion within an expression that may truncate the precision you expect. In most cases the compiler won't require this hint though because floating point literals default to Double (indeed, the IDE typically deletes the # symbol unless the value was an integer, e.g. 9#).
Contrast the results of these:
MsgBox TypeName(0.9)
MsgBox TypeName(0.9!)
MsgBox TypeName(0.9#)
