In Go is it possible to define a custom type with a number of bits other than those offered by byte uint uint16 or any of the other built-in types?
I'm planning on using "just enough bits" to represent variables and wanted a 6-bit and a 4-bit type. Perhaps a composite bool type?
type fourbit struct{
ones bool
twos bool
fours bool
eights bool
}
Though this sort of thing is quite messy and it would be nice to have a more general solution for n-bit types.
No. The minimum size of a Go type in current implementations, including type bool, is one byte, .
References:
The Go Programming Language Specification
Related
As per the documentation (https://golang.org/pkg/unsafe/#Sizeof) unsafe.Sizeof returns the size of the given expression in bytes. A size of any given expression can ideally be denoted by a uint32 or uint64. Then why does Golang return a uintptr instead? Isn't that confusing? A uintptr is supposed to hold a pointer to some data value but in this case it is not actually a pointer it is just a number right?
There are a lot of good answers in the comments, which boil down to "because that's big enough, yet not too big". I think, though, it might be helpful to view this from a historical perspective, with particular attention to how this all came about in the C programming language.
In very old (pre-standard) C, if you go far back enough in time, there was not even an explicit unsigned integer type. The PDP-11 had:
char, which was 8 bits and signed;
int, which was 16 bits and signed; and
pointers, which were 16 bits and unsigned.
That is:
int i;
int *u;
was how you made two integers, i being signed, and u being unsigned. Setting i to 32767 (0x7fff) and then incrementing it gave you -32768 (0x8000), which gradually increased to -1 (0xffff) and then zero. Setting u to 32767 and then incrementing it gave you 32768, which gradually increased to 65535, and then rolled over to zero.
The lack of distinction between integers and pointers meant that device drivers could read:
struct {
int csr;
int blk;
int bar;
int bcr;
};
0177440->bcr = count;
0177440->blk = block;
0177440->bar = addr;
0177440->csr = READ | GO;
which might be how one told a device to read some bytes or blocks.
(This is also why struct member names, like st_ino in struct stat, were all prefixed like this: st_ino just meant "some integer offset" and you could use the st_ino member with any pointer, or even with an ordinary variable. The prefix meant you could #include multiple headers without having their struct member names collide.)
All of this turned untenable when C was made to work on 32-bit and other machines. C grew an unsigned integer type, rather than pressing pointers into service as unsigned integers, and Steve Johnson's PCC compiler turned unsigned into a modifier, that could be applied to char and short as well as int. A lot of experimentation occurred. Eventually, in 1989, C was first standardized with most of the syntax and semantics that we have now (though new standards have added new types, and many functions, and so on).
Some of the early C pioneers were involved with creating Go, with particular influence from Ken Thompson. There is a quote on the Wikipedia page that is appropriate here:
When the three of us [Thompson, Rob Pike, and Robert Griesemer] got started, it was pure research. The three of us got together and decided that we hated C++. [laughter] ... [Returning to Go,] we started off with the idea that all three of us had to be talked into every feature in the language, so there was no extraneous garbage put into the language for any reason.
As we see from the early days of C, a pointer-as-integer is a suitable unsigned type that can not only hold any pointer, but, if treated as unsigned, can also hold any object size. A pointer-as-integer is not directly usable as a pointer, of course, and with a GC system and concurrency, we need the language itself to have pointers. But we also need to be able to write the runtime support for the language,1 for which we need integer-ized pointers, which also covers all of our needs for object sizes. So one type, built in to the compiler, covers all the requirements. That is as simple as possible, but no simpler.
1I say "we" as if I had anything to do with it. It's just obvious, once you have implemented a few runtime systems.
At first glance, it seems like one might opt for uint when you need an int that you don't want to be negative. However, in practice it seems that int is nearly always preferred.
I see general recommendations like:
"Generally if you are working with integers you should just use the int type."
"uint should generally only be used for doing binary operations"
"Don't use unsigned types to enforce or suggest that a number must be positive. That's not what they're for."
"this is what The Go Programming Language recommends, with the specific example of uints being useful when you want to do bitwise operations"
I also noticed that Go will let you convert a negative int to uint and give some odd results:
x := -5
y := uint(x)
fmt.Println(y)
>> 18446744073709551611
So, my understanding is that I should always use int when dealing with whole numbers, regardless of sign, unless I find myself needing uint, and I'll know it when that's the case (I think???).
My questions:
Is this the right takeaway?
If so, why is this the case?
What's an example of when one should use uint? -- maybe a specific example, as opposed to "when doing binary operations", as I'm not sure I know what that means :)
Also, I'm asking specific to Go's implementation.
This answer is for C but it's relevant here.
Generally if you are working with integers you should just use the int type.
This is recommended generallly because most of the code that we "generally" encounter deals with type int. And it's also not generally required for you to choose between the use of an int and a uint type.
Don't use unsigned types to enforce or suggest that a number must be positive. That's not what they're for.
This is quite subjective. You can very well use it to keep your program and data type-safe and needn't be bothered with dealing with the occasional errors that come due to the case of a negative integer.
"this is what The Go Programming Language recommends, with the specific example of uints being useful when you want to do bitwise operations"
This looks vague. Please add the source for this, I would like to read up on it.
x := -5
y := uint(x)
fmt.Println(y)
>> 18446744073709551611
This is typical of a number of languages. Logic behind this is that when you convert an int type to a uint, the binary representation used for the int is kind of shoved into the uint type. In the end, everything is just an abstraction over binary.
For example, take a look at this code and it's output:
a := int64(-123)
byteSliceRev := *(*[8]byte)(unsafe.Pointer(&a)) // The byte slice representation we get is LTR in increasing order of significance
u := uint(a)
byteSliceRevU := *(*[8]byte)(unsafe.Pointer(&u))
byteSlice, byteSliceU := make([]byte, 8), make([]byte, 8)
for i := 0; i < 8; i++ {
byteSlice[i], byteSliceU[i] = byteSliceRev[7-i], byteSliceRevU[7-i]
}
fmt.Println(u)
// 18446744073709551493
fmt.Printf("%b\n", byteSlice)
// [11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000101]
fmt.Printf("%b\n", byteSliceU)
// [11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000101]
The byte representation of both the int64 type of -5 is the same as for uint type of 18446744073709551493.
So, my understanding is that I should always use int when dealing with whole numbers, regardless of sign, unless I find myself needing uint, and I'll know it when that's the case (I think???).
But isn't this more or less true of every code that "we" write.?!
Is this the right takeaway?
If so, why is this the case?
I hope I have answered these two questions. Feel free to ask me if you still have any doubts.
What's an example of when one should use uint? -- maybe a specific example, as opposed to "when doing binary operations", as I'm not sure I know what that means :)
Imagine a scenario in which you have a table in your database with a lot of entries with an integer for an id, which is always positive. If you store this data as an int one bit of every entry is effectively useless and when you scale this, you are losing a lot of space when you could have just used a uint and saved it. Similar scenario can be thought of while transmitting data, transmitting tons of integers to be precise. Also, uint has double the range for positive integers compared to their counterpart signed integers due to the extra bit, so it will take you longer to run out of numbers. Storage is cheap now so people generally ignore this supposedly minor gain.
The other usecase is type-safety. A uint can never be negative so if a part of your code is delicate to negative numbers, it can prove to be pretty handy. It's better to get the error before wasting resource on the data just to find out it's impermissible because it's negative.
Package Image uses uint and so crypto/tls, so when you use these packages you must use uint.
I use it logically at first but I don't fight about it and if it became an issue I use a practical approach.
like why using int for len()
I have two instances of this struct with references inside (as properties):
type ST struct {
some *float64
createdAt *time.Time
}
How can I preform a check for equality for two different instances of this struct? Is it only by using reflect?
While you could use reflection, as Corey Ogburn suggested, I would not do so for a simple struct like that. Per the official Go Blog, reflection is
a powerful tool that should be used with care and avoided unless strictly necessary
-- The Laws of Reflection
It should be a simple exercise for you to write a function that takes two pointers to values of your struct type and returns a boolean true/false as to whether they are equal, first by testing for nil pointers and then by testing for equality of each of the fields of the struct.
time.Time values already have an equality test method with signature
func (t Time) Equal(u Time) bool
Depending on your use cases, the bigger problem may be comparing two floating point values for equality. While == comparisons work on float64 values, for many applications you want two float values to be considered equal when they are close, as well as when they are exactly the same. If that is the case for your application, I recommend defining an equal function that accepts a precision and verifies that the difference between the two values is not greater than the precision. To learn more, research floating point representations of decimal values.
Note that time package documentation has this to say about using pointers:
Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time.
So you should probably change the type of createdAt in your struct.
You can use reflect.DeepEqual.
DeepEqual reports whether x and y are “deeply equal,” defined as follows. Two values of identical type are deeply equal if one of the following cases applies. Values of distinct types are never deeply equal.
The documentation then goes on to describe how arrays, structs, functions, pointers and other types are considered to be deeply equal.
I am using a 64Bit server. My golang program needs integer type.
SO, If I use uint16 and uint32 type in source code, does it cost more than use most regular int type?
I am considering both computing cost and developing cost.
For the vast majority of cases using int makes more sense.
Here are some reasons:
Go doesn't implicitly convert between the numeric types, even when you think it should. If you start using some unsigned type instead of int, you should expect to pepper your code with multiple type conversions, because of other libraries or APIs preferring not to bother with unsigned types, because of untyped constant numerical expressions returning int values, etc.
Unsigned types are more prone to underflowing than signed types, because 0 (an unsigned type's boundary value) is much more of a naturally occurring value in computer programs than, for example, -9223372036854775808.
If you want to use an unsigned type because it restricts the values that you can put in it, keep in mind that when you combine silent underflow and compile time-only constant propagation, you probably aren't getting the bargain you were looking for. For example, while you cannot convert the constant math.MinInt64 to a uint, you can easily convert an int variable with value math.MinInt64 to a uint. And arguably it's not a bad Go style to have an if check whether the value you're trying to assign is valid for your program.
Unless you are experiencing significant memory pressure and your value space is somewhere slightly over what a smaller signed type would offer you, I'd think that using int will be much more efficient even if only because of development cost.
And even then, chances are that either there's a problem somewhere else in your program's memory footprint, or a managed language like Go is not the best fit for your needs.
I am some 'memory allocator' type code, by using an array and indexes rather than pointers. I'm hoping that the size of the index of the array is smaller than a pointer. I care because I am storing 'pointers' as integer indexes in an array rather than 64-bit pointers.
I can't see anything in the Go spec that says what an array is indexed by. Obviously it's some kind of integer. Passing very large values makes the runtime complain that I can't pass negative numbers, so I'm guessing that it's somehow cast to a signed integer. So is it an int32? I'm guessing it's not an int64 because I didn't touch the top bit (which would have been 2's compliment for a negative number).
Arrays may be indexed by any integer type.
The Array types section of the Go Programming Language Specification says that in an array type definition,
The length is part of the array's type and must be a constant
expression that evaluates to a non-negative integer value.
In an index expression such as a[x]:
x must be an integer value and 0 <= x < len(a)
But there is a limitation on the magnitude of an index; the description of Length and capacity says:
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
So the declared size of an array, or the index in an index expression, can be of any integer type (int, uint, uintptr, int8, int16, int32, int64, uint8, uint16, uint32, uint64), but it must be non-negative and within the range of type int (which is the same size as either int32 or int64 -- though it's a distinct type from either).
It's a very interesting question indeed. I have not found any direct rules in documentation too; instead I've found two great discussions in Groups.
In the first one, among many things, I've found an answer why indexes are implemented as int - but not uint:
Algorithms can benefit from the ability to express negative offsets
and such. If indexes were unsigned you'd always need a conversion in
these cases.
The second one specifically talks about possibility (but possibility only!) of using int64 for large arrays, mentioning limitations of len and cap functions (which limitations are actually mentioned in the doc):
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
I do agree, though, that more... official point of view wouldn't hurt. )
Arrays and slices are indexed by ints. An int is defined as being a 32 or 64 bit signed integer. The most common implementation (6g) uses 32 bit integers regardless of the architecture at this point in time. However, it is planed that eventually an int will be 64bit on 64bit machines and therefore the same length as a pointer.
The language spec defines 3 implementation dependent numeric types:
uint either 32 or 64 bits
int same size as uint
uintptr an unsigned integer large enough to store the uninterpreted bits of a pointer value