I'm trying to understand how Ruby's Fiddle library works. I understand mostly how it interacts with libffi, but there's just one thing that's still baffling me: what on earth do Fiddle.dlwrap and Fiddle.dlunwrap do exactly?
The documentation just says
dlunwrap(addr)
Returns the hexadecimal representation of a memory pointer address addr
dlwrap(val)
Returns a memory pointer of a function’s hexadecimal address location val
(from ruby-doc.org)
I've tried experimenting with them passing in various different types of objects and strings. The methods always return values no matter what you pass in (not just hexadecimal strings of pointers to C functions). It seems that dlwrap is merely returning the memory address of the passed-in object, but two things don't make too much sense in this case:
If I pass in a short string, and create a pointer using the result as the address, the memory it points to is not the string.
If I pass in a number, it just returns the object ID of the number.
If anyone has some secret knowledge of the inner workings of Fiddle, and is willing to share, please help :)
From what I have read online, ints and strings are immutable by nature.
But the following code shows that after changing the values of variables of these types, they still point to the same address. This seems to contradict the idea behind the nature of types in Python.
Can anyone please explain this to me?
Thanks in advance.
package main
import (
"fmt"
)
func main() {
num := 2
fmt.Println(&num)
num = 3
fmt.Println(&num) // address value of the num does not change
str := "2"
fmt.Println(&str)
str = "34"
fmt.Println(&str) // address value of the str does not change
}
A number is immutable by nature. 7 is 7, and it won't be 8 tomorrow. That doesn't mean that which number is stored in a variable cannot change. Variables are variable. They're mutable containers for values which may be mutable or immutable.
A Go string is immutable by language design; the string type doesn't support any mutating operators (like appending or replacing a character in the middle of the string). But, again, assignment can change which string a variable contains.
In Python (CPython at least), a number is implemented as a kind of object, with an address and fields like any other object. When you do tricks with id(), you're looking at the address of the object "behind" the variable, which may or may not change depending on what you do to it, and whether or not it was originally an interned small integer or something like that.
In Go, an integer is an integer. It's stored as an integer. The address of the variable is the address of the variable. The address of the variable might change if the garbage collector decides to move it (making the numeric value of the address more or less useless), but it doesn't reveal to you any tricks about the implementation of arithmetic operators, because there aren't any.
Strings are more complicated than integers; they are kind of object-ish internally, being a structure containing a pointer and a size. But taking the address of a string variable with &str doesn't tell you anything about that internal structure, and it doesn't tell you whether the Go compiler decided to use a de novo string value for an assignment, or to modify the old one in place (which it could, without breaking any rules, if it could prove that the old one would never be seen again by anything else). All it tells you is the address of str. If you wanted to find out whether that internal pointer changed you would have to use reflection... but there's hardly ever any practical reason to do so.
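To make the distinction between the address of the variable and the data behind it concrete, here is a minimal sketch. It uses unsafe.StringData rather than reflection to peek at the internal pointer, which assumes Go 1.20 or newer; the helper names sameVariableAddress and sameBackingData are made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameVariableAddress reports whether the address of a string variable
// is unchanged after assigning a different string to it.
func sameVariableAddress() bool {
	s := "2"
	before := &s
	s = "34"
	after := &s
	return before == after
}

// sameBackingData reports whether the string's internal data pointer
// is unchanged after assigning a different literal to the variable.
// unsafe.StringData is used only to inspect the pointer, never to write.
func sameBackingData() bool {
	s := "2"
	before := unsafe.StringData(s)
	s = "34"
	after := unsafe.StringData(s)
	return before == after
}

func main() {
	fmt.Println(sameVariableAddress()) // the variable itself stays where it is
	fmt.Println(sameBackingData())     // the bytes it refers to are different ones
}
```

The first check says nothing about immutability: it only shows that the container did not move. The second shows that the assignment made the variable refer to other bytes instead of editing the old ones.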
When you read about a string being immutable, it means you cannot modify it by index, for example:
x := "hello"
x[2] = 'r'
// does not compile: cannot assign to x[2]
As a comment says, assigning a whole new value to the variable (rather than changing part of it through an index) has nothing to do with the value being mutable or not, and you can always do that.
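If a changed copy of the string is what you actually need, one common approach is to go through a byte slice. This is only a sketch: replaceByte is a hypothetical helper, and it works on bytes, so it is only appropriate for single-byte (ASCII) characters.

```go
package main

import "fmt"

// replaceByte returns a copy of s with the byte at index i set to b.
// The original string is left untouched.
func replaceByte(s string, i int, b byte) string {
	buf := []byte(s) // copies the string's bytes into a mutable slice
	buf[i] = b
	return string(buf) // copies the bytes into a new string
}

func main() {
	x := "hello"
	y := replaceByte(x, 2, 'r')
	fmt.Println(x, y) // hello herlo
}
```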
I'm currently trying to learn Go. I mainly know and work with Java, ASP.NET and some Python, and I have no experience working with C-like pointers, which is the cause of my current confusion.
A library I'm currently using to write my first Go project is called Commando.
There I have the struct CommandRegistry and the variable of interest is called Commands.
In the struct the variable is described as the following:
// registered command configurations
Commands map[string]*Command
On a first glimpse I would understand this as a Map object containing a list of Strings, however it also shows the pointer reference to the actual Command object.
All I can see is that it is a map I can loop over which returns the name of the command ( the string ),
however I'm wondering if the *Command in the type description means I can somehow dereference the pointer and retrieve the object itself to extract the additional information of it.
As far as I know, the & operator is used to create a pointer to an existing object, which basically gives pass-by-reference instead of pass-by-value.
And the * operator generally signals that the object is a pointer, or is used to require a pointer in a function signature.
Is there a way I can retrieve the Command object, and why does the type contain *Command in its declaration?
Commands is a map (dictionary) which has strings as keys, and pointers to Commands as values. By passing it a key, you will get a pointer to the command it belongs to. You can then dereference the pointer to an actual Command object by using the * operator. Something like dereferencedCommand := *Commands["key"].
The * operator can be quite confusing, at least it was for me. When used as a type it denotes that we are receiving the memory address of some variable. But to dereference a memory address to a concrete type, you also use the * operator.
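Here is a small sketch of both uses of * with a map of pointers. The Command struct below is a hypothetical stand-in with made-up fields, not the library's real type, and describe is a helper invented for the example.

```go
package main

import "fmt"

// Command is a hypothetical stand-in for the library's type.
type Command struct {
	Name        string
	Description string
}

// describe looks up a command by name and returns its description.
// The second result is false when the key is missing or the pointer is nil,
// which avoids dereferencing a nil pointer.
func describe(commands map[string]*Command, name string) (string, bool) {
	cmd, ok := commands[name]
	if !ok || cmd == nil {
		return "", false
	}
	// cmd.Description is shorthand for (*cmd).Description.
	return cmd.Description, true
}

func main() {
	commands := map[string]*Command{
		"build": &Command{Name: "build", Description: "compile the project"},
	}

	copied := *commands["build"] // dereference: copied is a separate Command value
	copied.Description = "changed on the copy only"

	commands["build"].Description = "changed through the pointer"

	fmt.Println(describe(commands, "build"))
	fmt.Println(copied.Description)
}
```

Note that dereferencing with * produces a copy, so changes to copied do not reach the map, while changes made through the pointer do.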
I was under the impression that using the unsafe package allows you to read/write arbitrary data. I'm trying to change the value the interface{} points to without changing the pointer itself.
Assuming that interface{} is implemented as
type _interface struct {
type_info *typ
value unsafe.Pointer
}
setting fails with a SIGSEGV, although reading is successful.
func data(i interface{}) unsafe.Pointer {
return unsafe.Pointer((*((*[2]uintptr)(unsafe.Pointer(&i))))[1])
}
func main() {
var i interface{}
i = 2
fmt.Printf("%v, %v\n", (*int)(data(i)), *(*int)(data(i)))
*((*int)(data(i))) = 3
}
Am I doing something wrong, or is this not possible in golang?
Hm... Here's how I understand your second code example currently, in case I've made an error (if you notice anything amiss in what I'm describing, my answer is probably irredeemably wrong and you should ignore the rest of what I have to say).
1. Allocate memory for interface i in main.
2. Set the value of i to an integer type with the value 2.
3. Allocate memory for interface i in data.
4. Copy the value of main's i to data's i; that is, set the value of the new interface to an integer type with the value 2.
5. Cast the address of the new variable into a pointer to length-2 array of uintptr (with unsafe.Pointer serving as the intermediary that forces the compiler to accept this cast).
6. Cast the second element of the array (whose value is the address of the value-part of i in data) back into an unsafe.Pointer and return it.
I've made an attempt at doing the same thing in more steps, but unfortunately I encountered all the same problems: the program recognizes that I have a non-nil pointer and it's able to dereference the pointer for reading, but using the same pointer for writing produces a runtime error.
It's step 6 that go vet complains about, and I think it's because, according to the package docs,
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
More to the point, from what I can tell (though I'll admit I'm having trouble digging up explicit confirmation without scanning the compiler and runtime source), the runtime doesn't appear to track the value-part of an interface{} type as a discrete pointer with its own reference count; you can, of course, trample over both the interface{}'s words by writing another interface value into the whole thing, but that doesn't appear to be what you wanted to do at all (write to the memory address of a pointer that is inside an interface type, all without moving the pointer).
What's interesting is that we seem to be able to approximate this behavior by just defining our own structured type that isn't given special treatment by the compiler (interfaces are clearly somewhat special, with type-assertion syntax and all). That is, we can use unsafe.Pointer to maintain a reference that points to a particular point in memory, and no matter what we cast it to, the memory address never moves even if the value changes (and the value can be reinterpreted by casting it to something else). The part that surprises me a bit is that, at least in my own example, and at least within the Playground environment, the value that is pointed to does not appear to have a fixed size; we can establish an address to write to once, and repeated writes to that address succeed even with huge (or tiny) amounts of data.
Of course, with at least this implementation, we lose a bunch of the other nice-to-have things we associate with interface types, especially non-empty interface types (i.e. with methods). So, there's no way to use this to (for example) make a super-sneaky "generic" type. It seems that an interface is its own value, and part of that value's definition is an address in memory, but it's not entirely the same thing as a pointer.
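For comparison, the conventional way to get "change the value, keep the interface" behaviour without unsafe is to store a pointer in the interface and write through it. This is a different technique from the one in the question, shown only as a sketch; setThroughInterface is a hypothetical helper.

```go
package main

import "fmt"

// setThroughInterface writes v into the int that box points to.
// It returns false when box does not hold a non-nil *int.
func setThroughInterface(box interface{}, v int) bool {
	p, ok := box.(*int)
	if !ok || p == nil {
		return false
	}
	*p = v
	return true
}

func main() {
	n := 2
	var i interface{} = &n // the interface holds a pointer to n, not a copy of 2

	setThroughInterface(i, 3)
	fmt.Println(n, *i.(*int)) // 3 3
}
```

The interface value itself is never reassigned here; only the int it points to changes.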
In Ruby, the to_s on an object includes an encoding of the object's id.
[2] pry(main)> shape = Shape.new(4,4)
=> #<Shape:0x00007fac5eb6afc8 @num_sides=4, @side_length=4>
In the documentation it says
Returns a string representing obj. The default to_s prints the object’s class and an encoding of the object id.
https://apidock.com/ruby/Object/to_s
In the example above, the encoding of the object id is 0x00007fac5eb6afc8.
In How does object_id assignment work? they explain
In MRI the object_id of an object is the same as the VALUE that represents the object on the C level.
So I compared to the object_id and it is not the same as the encoding of the object id.
[2] pry(main)> shape = Shape.new(4,4)
=> #<Shape:0x00007fac5eb6afc8 @num_sides=4, @side_length=4>
[3] pry(main)> shape.object_id
=> 70189150066660
What exactly is the encoding of the object id? It does not appear to be the object_id.
Think of the object_id, or __id__ as the "pointer" for the object. It is not technically a pointer, but does contain a unique value that can be used to retrieve the internal C VALUE.
There are patterns to the value it has for some data types, as you can see with its hexadecimal representation in to_s. I will not go into all the details, as there are already numerous answers on SO explaining them, some already linked from the comments, but integers (up to FIXNUM_MAX) have predictable values, and special constants like true, false, and nil will always have the same object_id in every run.
To put simply, it is nothing more than a number, and shown as a hexadecimal (base 16) value, not any actual "encoding" or cypher.
Going to expand upon this a bit more in light of your latest edits to the question. As you posted, the hexadecimal number you see in to_s is the value of the internal C VALUE of the object. VALUE is a C data type (an unsigned, pointer-sized number) that every Ruby object is represented as in C code. As @Stefan pointed out in a comment, for non-integer types (I speak only for the MRI version), it is twice the value of the object_id. Not that you probably care, but you can shift the bits of an integer to predict the value for those.
Therefore, using your example.
A value of 0x00007fac5eb6afc8 is simple hexadecimal notation for a number. It uses a base 16 counting system as opposed to the base 10 decimal system we are more used to in everyday life. It is simply a different way of looking at the same number.
So, using that logic.
a = 0x00007fac5eb6afc8
#=> 140378300133320 # Decimal representation
a /= 2 # Remember, non-integers are half of this value
#=> 70189150066660 # Your object_id
The best answer you can get is: You don't know, and you shouldn't need to.
Ruby guarantees exactly three things about object IDs:
An object has the same ID during its lifetime.
No two objects have the same ID at the same time.
IDs are integers.
In particular, this means that you cannot rely on a specific object having a specific ID (for example, nil having ID 8). It also means that IDs can be re-used. You should think of it as nothing but an opaque identifier.
And, as you quoted, the default Object#to_s uses "some" encoding of the ID.
And that is all you know, and all you should ever rely on. In particular, you should never try to parse IDs or Object#to_s.
So, the ID part of Object#to_s is "some unspecified encoding" of the ID, which itself is "some opaque identifier".
Everything else is deliberately left unspecified, so that different implementations can make different choices that make sense for their specific needs. For example, it would be stupid to tie object IDs to memory addresses, because implementations like JRuby, Opal, IronRuby, MagLev, and Topaz run on platforms where the concept of "memory address" doesn't even exist! And Rubinius uses a moving garbage collector, where objects can move around in memory and thus their address changes.
I was working on a simple task yesterday, just needed to sum the values in a handful of dropdown menus to display in a textbox via Javascript. Unexpectedly, it was just building a string so instead of giving me the value 4 it gave me "1111". I understand what was happening; but I don't understand how.
With a loosely typed language like Javascript or PHP, how does the computer "know" what type to treat something as? If I just type everything as a var, how does it differentiate a string from an int from an object?
What the + operator will do in Javascript is determined at runtime, when both actual arguments (and their types) are known.
If the runtime sees that one of the arguments is a string, it will do string concatenation. Otherwise it will do numeric addition (if necessary coercing the arguments into numbers).
This logic is coded into the implementation of the + operator (or any other function like it). If you looked at it, you would see if typeof(a) === 'string' statements (or something very similar) in there.
"If I just type everything as a var"
Well, you don't type it at all. The variable has no type, but any actual value that ends up in that variable has a type, and code can inspect that.
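The idea that the value carries its type, and that + inspects it at runtime, can be imitated in a statically typed language. The sketch below is a loose analogy written in Go, not how any JavaScript engine is actually implemented; plus is a hypothetical function, and it only handles strings and float64 numbers.

```go
package main

import "fmt"

// plus mimics a dynamically typed "+": the variables are untyped
// (interface{}), but each value still carries its own type, which the
// function inspects at runtime.
func plus(a, b interface{}) interface{} {
	_, aIsString := a.(string)
	_, bIsString := b.(string)
	if aIsString || bIsString {
		// At least one string: convert both to text and concatenate.
		return fmt.Sprint(a) + fmt.Sprint(b)
	}
	x, xOK := a.(float64)
	y, yOK := b.(float64)
	if xOK && yOK {
		return x + y // both numbers: numeric addition
	}
	return nil // other types are out of scope for this sketch
}

func main() {
	fmt.Println(plus("1", "1")) // concatenation
	fmt.Println(plus(1.0, 1.0)) // addition
	fmt.Println(plus("1", 1.0)) // one string is enough to concatenate
}
```

This is also why values read from dropdown menus, which arrive as strings, end up concatenated unless they are converted to numbers first.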