Enums in computer memory - enums

A quote from Wikipedia's article on enumerated types would be the best opening for this question:
In other words, an enumerated type has values that are different from each other, and that can be compared and assigned, but which are not specified by the programmer as having any particular concrete representation in the computer's memory; compilers and interpreters can represent them arbitrarily.
While I understand the definition and uses of enums, I can't yet grasp the interaction between enums and memory — when an enum type is declared without creating an instance of enum type variable, is the type definition stored in memory as a union or a structure? And what is the meaning behind the aforementioned Wiki excerpt?

The Wikipedia excerpt isn't talking specifically about C's enum types. The C standard has some specific requirements for how enums work.
An enumerated type is compatible with either char or some signed or unsigned integer type. The choice of representation is up to the compiler, which must document its choice (it's implementation-defined), but the type must be able to represent all the values of the enumeration.
The values of the enumeration constants start at 0 by default, and increment by 1 for each successive constant:
enum foo {
zero, // equal to 0
one, // equal to 1
two // equal to 2
};
The constants are always of type int, regardless of what the enum type itself is compatible with. (It would have made more sense for the constants to be of the enumerated type; they're of type int for historical reason.)
You can specify values for some or all of the constants -- which means that the values are not necessarily distinct:
enum bar {
two = 2,
deux = 2,
zwei = 2,
one = 1,
dos // implicitly equal to 2
};
Defining an enumerated type doesn't result in anything being stored in memory at run time. If you define an object of the enumerated type, that object's value will be stored in memory (unless it's optimized away), and will occupy sizeof (enum whatever) bytes. It's the same as for objects of any other type.
An enumeration constant is treated as a constant expression. The expression two is treated almost identically to a constant 2.
Note that C++ has some different rules for enum types. Your question is tagged C, so I won't go into details.

It means that the enum constants are not required to be located in memory. You cannot take the addresses of them.
This allows the compiler to replace all references to enum constants with their actual values. For example, the code:
enum { x = 123; }
int y = x;
may compile as if it were:
int y = 123;
When an enum type is declared without creating an instance of enum type variable, is the type definition stored in memory as a union or a structure?
In C, types are mostly compile-time constructs; once the program has been compiled to machine code, all the type information disappears*. Accessing a struct member is instead "access the memory n bytes past this pointer".
So if the compiler inlines all the enums as shown above, then enums do not exist at all in compiled code.
* Except optionally in the debugging info section, but that's usually only read by debuggers.

Related

How to do several enumerations type in Fortran?

I tried to declare several enumeration types in Fortran.
This funny simple example illustrates well my problem :
program Main
enum, bind(c)
enumerator :: Colors = 0
enumerator :: Blue = 1
enumerator :: Red = 2
enumerator :: Green = 3
end enum
enum, bind(c)
enumerator :: Size = 0
enumerator :: Small = 1
enumerator :: Medium = 2
enumerator :: Large = 3
end enum
integer(kind(Colors)) :: myColor
myColor = Green
if (myColor == Large) then
write(*,*) 'MyColor is Large'
end if
end program Main
I also tried to enclose this enumeration in a type and many others things but none works.
Here I can compare Colors with Size. In C, for example, when I declare color and a size typedef enum, I have no such problem, because the two types are different.
Does it exist a simple solution to have several enumerated type in Fortran?
Otherwise, I imagine to declare several types with one integer member that holds the value and, after, to create interface to overload the operators I need (comparison, affectation and so on). I am not sure that solution is possible and also, I can do it.
Fortran does not have enumerated types in the sense that you wish to use.1
An enumeration in Fortran is a set of enumerators. The program of the question has two of them.
Enumerators themselves are named (integer) constants of a kind interoperable with C's corresponding enumeration type. They exist for the purposes of C interoperability and not to provide a similar functionality within Fortran.
The enumerators Green and Large in the question are two named integer constants with value 3 (of some, possibly different kind). Green==Large is a true expression whatever the kind parameters of the constants.
There is no mechanism in Fortran to restrict a variable to values of an enumeration. The constants could equivalently be declared as
integer(kind=enum_kind1) :: Green = 3_enum_kind1
integer(kind=enum_kind2) :: Large = 3_enum_kind2
for the appropriate kind values (which are quite likely in this case to be the same: C_INT) and the Fortran program would know no difference.
If you wish to use enumerated types in the sense that they exist in C and similar languages, you will have to use a non-intrinsic approach (as intimated in the question).
1 This is the case for the current, 2018, revision of the language. At this time, there is a proposal for the next revision (provisionally 2023) to include enumerated types closer to what is desired here. This specification is given in 7.6.2 of one particular working draft.

Constraint with slice and map union: invalid operation: cannot index c (variable of type T constrained by sliceOrMap) [duplicate]

I decided to dive into Go since 1.18 introduced generics. I want to implement an algorithm that only accepts sequential types — arrays, slice, maps, strings, but I'm not able to crack how.
Is there a method that can be targeted involving indexability?
You can use a constraint with a union, however the only meaningful one you can have is:
type Indexable interface {
~[]byte | ~string
}
func GetAt[T Indexable](v T, i int) byte {
return v[i]
}
And that's all, for the time being. Why?
The operations allowed on types with union constraint are only those allowed for all types in the constraint type set.
To allow indexing, the types in the union must have equal key type and equal element type.
The type parameter proposal suggests that map[int]T could be used in a union with []T, however this has been disallowed. The specs now mention this in Index expressions: "If there is a map type in the type set of P, all types in that type set must be map types, and the respective key types must be all identical".
For arrays, the length is part of the type, so a union would have to specify all possible lengths you want to handle, e.g. [1]T | [2]T etc. Quite impractical, and prone to out-of-bounds issues (There's a proposal to improve this).
So the only union with diverse types that supports indexing appears to be []byte | string (possibly approximated ~). Since byte is an alias of uint8, you can also instantiate with []uint8.
Other than that, there's no other way to define a constraint that supports indexing on all possible indexable types.
NOTE that []byte | string supports indexing but not range, because this union doesn't have a core type.
Playground: https://gotipplay.golang.org/p/uatvtMo_mrZ

Why does unsafe.Sizeof return a uintptr?

As per the documentation (https://golang.org/pkg/unsafe/#Sizeof) unsafe.Sizeof returns the size of the given expression in bytes. A size of any given expression can ideally be denoted by a uint32 or uint64. Then why does Golang return a uintptr instead? Isn't that confusing? A uintptr is supposed to hold a pointer to some data value but in this case it is not actually a pointer it is just a number right?
There are a lot of good answers in the comments, which boil down to "because that's big enough, yet not too big". I think, though, it might be helpful to view this from a historical perspective, with particular attention to how this all came about in the C programming language.
In very old (pre-standard) C, if you go far back enough in time, there was not even an explicit unsigned integer type. The PDP-11 had:
char, which was 8 bits and signed;
int, which was 16 bits and signed; and
pointers, which were 16 bits and unsigned.
That is:
int i;
int *u;
was how you made two integers, i being signed, and u being unsigned. Setting i to 32767 (0x7fff) and then incrementing it gave you -32768 (0x8000), which gradually increased to -1 (0xffff) and then zero. Setting u to 32767 and then incrementing it gave you 32768, which gradually increased to 65535, and then rolled over to zero.
The lack of distinction between integers and pointers meant that device drivers could read:
struct {
int csr;
int blk;
int bar;
int bcr;
};
0177440->bcr = count;
0177440->blk = block;
0177440->bar = addr;
0177440->csr = READ | GO;
which might be how one told a device to read some bytes or blocks.
(This is also why struct member names, like st_ino in struct stat, were all prefixed like this: st_ino just meant "some integer offset" and you could use the st_ino member with any pointer, or even with an ordinary variable. The prefix meant you could #include multiple headers without having their struct member names collide.)
All of this turned untenable when C was made to work on 32-bit and other machines. C grew an unsigned integer type, rather than pressing pointers into service as unsigned integers, and Steve Johnson's PCC compiler turned unsigned into a modifier, that could be applied to char and short as well as int. A lot of experimentation occurred. Eventually, in 1989, C was first standardized with most of the syntax and semantics that we have now (though new standards have added new types, and many functions, and so on).
Some of the early C pioneers were involved with creating Go, with particular influence from Ken Thompson. There is a quote on the Wikipedia page that is appropriate here:
When the three of us [Thompson, Rob Pike, and Robert Griesemer] got started, it was pure research. The three of us got together and decided that we hated C++. [laughter] ... [Returning to Go,] we started off with the idea that all three of us had to be talked into every feature in the language, so there was no extraneous garbage put into the language for any reason.
As we see from the early days of C, a pointer-as-integer is a suitable unsigned type that can not only hold any pointer, but, if treated as unsigned, can also hold any object size. A pointer-as-integer is not directly usable as a pointer, of course, and with a GC system and concurrency, we need the language itself to have pointers. But we also need to be able to write the runtime support for the language,1 for which we need integer-ized pointers, which also covers all of our needs for object sizes. So one type, built in to the compiler, covers all the requirements. That is as simple as possible, but no simpler.
1I say "we" as if I had anything to do with it. It's just obvious, once you have implemented a few runtime systems.

How to check two structs for equality

I have two instances of this struct with references inside (as properties):
type ST struct {
some *float64
createdAt *time.Time
}
How can I preform a check for equality for two different instances of this struct? Is it only by using reflect?
While you could use reflection, as Corey Ogburn suggested, I would not do so for a simple struct like that. Per the official Go Blog, reflection is
a powerful tool that should be used with care and avoided unless strictly necessary
-- The Laws of Reflection
It should be a simple exercise for you to write a function that takes two pointers to values of your struct type and returns a boolean true/false as to whether they are equal, first by testing for nil pointers and then by testing for equality of each of the fields of the struct.
time.Time values already have an equality test method with signature
func (t Time) Equal(u Time) bool
Depending on your use cases, the bigger problem may be comparing two floating point values for equality. While == comparisons work on float64 values, for many applications you want two float values to be considered equal when they are close, as well as when they are exactly the same. If that is the case for your application, I recommend defining an equal function that accepts a precision and verifies that the difference between the two values is not greater than the precision. To learn more, research floating point representations of decimal values.
Note that time package documentation has this to say about using pointers:
Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time.
So you should probably change the type of createdAt in your struct.
You can use reflect.DeepEqual.
DeepEqual reports whether x and y are “deeply equal,” defined as follows. Two values of identical type are deeply equal if one of the following cases applies. Values of distinct types are never deeply equal.
The documentation then goes on to describe how arrays, structs, functions, pointers and other types are considered to be deeply equal.

Heterogeneous vs Homogeneous

I'm little bit confused about heterogeneous and homogeneous list and array. In OOP context, if I define base class and it's derived class, why is array of base class homogeneous, if I can store derived classes as well? It's same principle, as void pointer in C (i.e. https://gist.github.com/rawcoder/9720851 ). Every literature says, that homogeneous structure is of same type (semantically), so can you please explain this to me little bit more further?
The simplest way of putting it is that instances of a derived type are instances of a base type, so can be stored in a collection of the base type.
It has more to do with the type of the collection than what's stored in it. If a collection has type array[B], and D < B (type D derives from B), then storing a D in an array[B] doesn't violate the type of the array or operations on the array/elements of the array. However, if the array were defined to hold only descendants of B but not B itself (e.g. array[ D < B ]; array<D extends B> in Java-speak), then it would be heterogenous, because it's declared to hold multiple types, rather than a single type. Note that in some programming languages, "heterogeneous" means holds any type (e.g. array[Any?], array<*>), not just multiple different types.
Outside of such languages, union types muddy the waters, because they allow disparate types to be treated as one. array[A | B | C] could be viewed as heterogeneous, or as the homogeneous array[U], where U = A | B | C.
The point of a type system in programming is to ensure a certain level of correctness by restricting what is allowed, based on type. Homogeneity strengthens the typing system, while unions weaken typing, as they:
can allow an instance of one type to be treated as another, causing a run-time type violation, and
allow for some types to not be handled, which is potentially another error.
Some type systems avoid these by using the more restricted sum type (aka "variant type"), which includes type information along with instances, instead of using union types. This prevents both the above issues with union types, restoring homogeneity. However, type systems that have sum types are usually so strong that the concepts of homogeneity and heretogeneity aren't as useful (basically, every collection is homogeneous, even the heterogeneous ones, which are collections of the maximal type from which all other types descend; consider the array[Any?] example above).
Storing subtypes in a homogeneous collection is a consequence of the LSP, which states that a property of a type should be true of a subtype (i.e. subtypes should be substitutable in contexts in place of a supertype). If the method (e.g. operator[], add) that adds an element to an array takes type B, it should also take subtypes. If code operates on elements of a collection expecting they're of type B, it should operate just as correctly on subtypes.
Note this is distinct from the matter of subtype relationships between collection types (e.g. if D < B, is array[D] < array[B]?), which is a matter of (type) variance.

Resources