In Go, when should you use uint vs int?

At first glance, it seems like you might opt for uint when you need an integer that should never be negative. In practice, though, int is almost always preferred.
I see general recommendations like:
"Generally if you are working with integers you should just use the int type."
"uint should generally only be used for doing binary operations"
"Don't use unsigned types to enforce or suggest that a number must be positive. That's not what they're for."
"this is what The Go Programming Language recommends, with the specific example of uints being useful when you want to do bitwise operations"
I also noticed that Go will let you convert a negative int to uint and give some odd results:
x := -5
y := uint(x)
fmt.Println(y)
>> 18446744073709551611
So, my understanding is that I should always use int when dealing with whole numbers, regardless of sign, unless I find myself needing uint, and I'll know it when that's the case (I think???).
My questions:
Is this the right takeaway?
If so, why is this the case?
What's an example of when one should use uint? -- maybe a specific example, as opposed to "when doing binary operations", as I'm not sure I know what that means :)
Also, I'm asking specific to Go's implementation.

This answer is for C but it's relevant here.
Generally if you are working with integers you should just use the int type.
This is recommended because most of the code we "generally" encounter deals with type int, and you are rarely forced to choose between int and uint in the first place.
Don't use unsigned types to enforce or suggest that a number must be positive. That's not what they're for.
This is quite subjective. You can very well use unsigned types to keep your program and its data type-safe, and spare yourself from handling the occasional errors that arise from a negative integer.
"this is what The Go Programming Language recommends, with the specific example of uints being useful when you want to do bitwise operations"
This looks vague. Please add the source for this; I would like to read up on it.
x := -5
y := uint(x)
fmt.Println(y)
>> 18446744073709551611
This is typical of a number of languages. The logic behind it is that when you convert an int to a uint, the int's binary (two's complement) representation is reinterpreted as an unsigned value. In the end, everything is just an abstraction over binary.
For example, take a look at this code and its output:
a := int64(-123)
byteSliceRev := *(*[8]byte)(unsafe.Pointer(&a)) // on a little-endian machine such as x86, the bytes appear in increasing order of significance
u := uint(a)
byteSliceRevU := *(*[8]byte)(unsafe.Pointer(&u))
byteSlice, byteSliceU := make([]byte, 8), make([]byte, 8)
for i := 0; i < 8; i++ {
    byteSlice[i], byteSliceU[i] = byteSliceRev[7-i], byteSliceRevU[7-i] // reverse into most-significant-first order
}
fmt.Println(u)
// 18446744073709551493
fmt.Printf("%b\n", byteSlice)
// [11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000101]
fmt.Printf("%b\n", byteSliceU)
// [11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000101]
The byte representation of the int64 value -123 is identical to that of the uint value 18446744073709551493.
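Put differently, the conversion is just arithmetic modulo 2^64 on a 64-bit platform, so you can predict the result without any unsafe tricks. A minimal check:
package main

import (
    "fmt"
    "math"
)

func main() {
    a := int64(-123)
    u := uint64(a) // the conversion wraps modulo 2^64, so u == 2^64 - 123
    fmt.Println(u) // 18446744073709551493
    // math.MaxUint64 is 2^64 - 1, so 2^64 - 123 == math.MaxUint64 - 122:
    fmt.Println(u == math.MaxUint64-122) // true
}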
So, my understanding is that I should always use int when dealing with whole numbers, regardless of sign, unless I find myself needing uint, and I'll know it when that's the case (I think???).
But isn't this more or less true of all the code that "we" write?
Is this the right takeaway?
If so, why is this the case?
I hope I have answered these two questions. Feel free to ask me if you still have any doubts.
What's an example of when one should use uint? -- maybe a specific example, as opposed to "when doing binary operations", as I'm not sure I know what that means :)
Imagine a scenario in which you have a database table with a lot of entries, each identified by an integer id that is always positive. If you store that id as an int, one bit of every entry is effectively wasted, and at scale that adds up to a lot of space a uint would have saved. The same reasoning applies when transmitting data, tons of integers to be precise. Also, a uint can represent twice as many positive integers as a signed integer of the same width, because the sign bit is repurposed as a value bit, so it will take you longer to run out of numbers. Storage is cheap now, so people generally ignore this supposedly minor gain.
The other use case is type safety. A uint can never be negative, so if a part of your code is sensitive to negative numbers, it can prove pretty handy. It's better to get the error before wasting resources on the data, only to find out it's impermissible because it's negative.
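As a rough sketch of that idea (processID and fromUserInput are hypothetical names, made up for illustration): a uint parameter pushes the negativity check to the boundary of your code, before any real work is done.
package main

import (
    "errors"
    "fmt"
)

// processID never has to worry about negative ids: the type rules them out.
func processID(id uint) string {
    return fmt.Sprintf("record-%d", id)
}

// fromUserInput validates once, at the boundary, before converting.
func fromUserInput(n int) (string, error) {
    if n < 0 {
        return "", errors.New("id must be non-negative")
    }
    return processID(uint(n)), nil
}

func main() {
    fmt.Println(fromUserInput(42)) // record-42 <nil>
    fmt.Println(fromUserInput(-1)) // error before any processing happens
}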

The image package uses uint, and so does crypto/tls, so when you use these packages you must use uint.
I use whichever seems logical at first, but I don't fight over it; if it becomes an issue, I take the practical approach. It's like asking why len() returns an int.

Related

Why does unsafe.Sizeof return a uintptr?

As per the documentation (https://golang.org/pkg/unsafe/#Sizeof), unsafe.Sizeof returns the size of the given expression in bytes. The size of any given expression could ideally be denoted by a uint32 or uint64. Then why does Go return a uintptr instead? Isn't that confusing? A uintptr is supposed to hold a pointer to some data value, but in this case it is not actually a pointer, it is just a number, right?
There are a lot of good answers in the comments, which boil down to "because that's big enough, yet not too big". I think, though, it might be helpful to view this from a historical perspective, with particular attention to how this all came about in the C programming language.
In very old (pre-standard) C, if you go far back enough in time, there was not even an explicit unsigned integer type. The PDP-11 had:
char, which was 8 bits and signed;
int, which was 16 bits and signed; and
pointers, which were 16 bits and unsigned.
That is:
int i;
int *u;
was how you made two integers, i being signed, and u being unsigned. Setting i to 32767 (0x7fff) and then incrementing it gave you -32768 (0x8000), which gradually increased to -1 (0xffff) and then zero. Setting u to 32767 and then incrementing it gave you 32768, which gradually increased to 65535, and then rolled over to zero.
The lack of distinction between integers and pointers meant that device drivers could read:
struct {
int csr;
int blk;
int bar;
int bcr;
};
0177440->bcr = count;
0177440->blk = block;
0177440->bar = addr;
0177440->csr = READ | GO;
which might be how one told a device to read some bytes or blocks.
(This is also why struct member names, like st_ino in struct stat, were all prefixed like this: st_ino just meant "some integer offset" and you could use the st_ino member with any pointer, or even with an ordinary variable. The prefix meant you could #include multiple headers without having their struct member names collide.)
All of this became untenable when C was made to work on 32-bit and other machines. C grew an unsigned integer type, rather than pressing pointers into service as unsigned integers, and Steve Johnson's PCC compiler turned unsigned into a modifier that could be applied to char and short as well as int. A lot of experimentation occurred. Eventually, in 1989, C was first standardized with most of the syntax and semantics that we have now (though new standards have added new types, many functions, and so on).
Some of the early C pioneers were involved with creating Go, with particular influence from Ken Thompson. There is a quote on the Wikipedia page that is appropriate here:
When the three of us [Thompson, Rob Pike, and Robert Griesemer] got started, it was pure research. The three of us got together and decided that we hated C++. [laughter] ... [Returning to Go,] we started off with the idea that all three of us had to be talked into every feature in the language, so there was no extraneous garbage put into the language for any reason.
As we see from the early days of C, a pointer-as-integer is a suitable unsigned type that can not only hold any pointer, but, if treated as unsigned, can also hold any object size. A pointer-as-integer is not directly usable as a pointer, of course, and with a GC system and concurrency, we need the language itself to have pointers. But we also need to be able to write the runtime support for the language,1 for which we need integer-ized pointers, which also covers all of our needs for object sizes. So one type, built in to the compiler, covers all the requirements. That is as simple as possible, but no simpler.
1I say "we" as if I had anything to do with it. It's just obvious, once you have implemented a few runtime systems.
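To make the answer concrete, here is a minimal check of Sizeof's result type (output assumes a 64-bit platform):
package main

import (
    "fmt"
    "unsafe"
)

func main() {
    var x int64
    s := unsafe.Sizeof(x)       // s has type uintptr, not uint32 or uint64
    fmt.Printf("%T %d\n", s, s) // uintptr 8
    // Convert explicitly before using it as an ordinary count:
    n := int(s)
    fmt.Println(n * 4) // 32
}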

Why can't Go floats overflow but integers can?

I've been testing a few things in Go, and noticed integers can overflow, but float64 and float32 apparently can't.
f64 := math.MaxFloat64
fmt.Printf("%f\n", f64)
fmt.Printf("%f\n", f64+1)
f32 := math.MaxFloat32
fmt.Printf("%f\n", f32)
fmt.Printf("%f\n", f32+1)
i := math.MaxInt64
fmt.Printf("%d\n", i)
fmt.Printf("%d\n", i+1)
Result:
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
340282346638528859811704183484516925440.000000
340282346638528859811704183484516925440.000000
9223372036854775807
-9223372036854775808
Integer overflows are apparently not checked for performance reasons, but why can't I make floats overflow? Are they checked?
Because the data structures are fundamentally different. The two's complement representation used by most programming languages (including Go) for (at least most of) their integral types overflows as a by-product of how it works. The IEEE-754 floating point used by most programming languages (including Go) for (at least most of) their floating-point types doesn't wrap around like that: as the magnitude of the number grows, the format trades precision for range, and once it's past a certain point the number starts losing precision even at the integer level, so adding 1 to math.MaxFloat64 is simply absorbed by rounding.
It's just that the two mechanisms for storing numeric data in a fixed-size set of bits work fundamentally differently.
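You can watch that integer-level precision loss happen directly; a minimal sketch:
package main

import "fmt"

func main() {
    // Above 2^53, float64 can no longer represent every integer exactly,
    // so adding 1 is absorbed by rounding rather than overflowing:
    a := float64(1 << 53)
    fmt.Println(a == a+1) // true
}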
There are other structures. For instance, some languages have "big integer" and/or "big decimal" types that aren't fixed size; instead, they take up however much room they need to hold the number. (Java's BigInteger and BigDecimal, JavaScript's BigInt, ...) Go has Int, Rat, and Float in the math/big package. (Thanks Adrian!) The fixed-size ones are very useful because they're very fast; but sometimes you want something other than speed (extended range, better precision in floating point, etc.), in which case you sacrifice some speed for the other thing you need.
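For instance, a minimal sketch contrasting fixed-size wraparound with math/big's arbitrary precision:
package main

import (
    "fmt"
    "math"
    "math/big"
)

func main() {
    // A fixed-size int64 wraps around at its boundary...
    i := int64(math.MaxInt64)
    fmt.Println(i + 1) // -9223372036854775808

    // ...while a big.Int simply grows to hold the exact result:
    b := big.NewInt(math.MaxInt64)
    b.Add(b, big.NewInt(1))
    fmt.Println(b) // 9223372036854775808
}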

In Go, uint16 vs int: which costs less?

I am using a 64-bit server, and my Go program needs an integer type.
So, if I use the uint16 and uint32 types in my source code, does it cost more than using the regular int type?
I am considering both computing cost and development cost.
For the vast majority of cases using int makes more sense.
Here are some reasons:
Go doesn't implicitly convert between the numeric types, even when you think it should. If you start using some unsigned type instead of int, you should expect to pepper your code with type conversions, because other libraries and APIs prefer not to bother with unsigned types, because untyped constant numerical expressions return int values, and so on.
Unsigned types are more prone to underflowing than signed types, because 0 (an unsigned type's boundary value) is much more of a naturally occurring value in computer programs than, for example, -9223372036854775808.
If you want to use an unsigned type because it restricts the values that you can put in it, keep in mind that combining silent wraparound with compile-time-only constant checking means you probably aren't getting the bargain you were looking for. For example, while you cannot convert the constant math.MinInt64 to a uint, you can easily convert an int variable with value math.MinInt64 to a uint. And arguably it's not bad Go style to have an if check on whether the value you're trying to assign is valid for your program.
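A minimal sketch of that pitfall, assuming a 64-bit platform:
package main

import (
    "fmt"
    "math"
)

func main() {
    // u := uint(math.MinInt64) // rejected at compile time: constant overflows uint
    v := math.MinInt64 // v is an int variable holding the same value
    u := uint(v)       // compiles fine and wraps silently
    fmt.Println(u)     // 9223372036854775808

    // Underflow past the far more common boundary value, 0:
    var n uint
    n--
    fmt.Println(n) // 18446744073709551615
}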
Unless you are experiencing significant memory pressure and your value space is somewhere slightly over what a smaller signed type would offer you, I'd think that using int will be much more efficient even if only because of development cost.
And even then, chances are that either there's a problem somewhere else in your program's memory footprint, or a managed language like Go is not the best fit for your needs.

Avoid too much conversion

I have some parts in my current Go code that look like this:
i := int(math.Floor(float64(len(l)/4)))
The verbosity seems necessary because of some function type signatures like the one in math.Floor, but can it be simplified?
In general, the strict typing of Go leads to some verbose expressions. Verbose doesn't mean stuttering though. Type conversions do useful things and it's valuable to have those useful things explicitly stated.
The trick to simplification is to not write unneeded type conversions, and for that you need to refer to documentation such as the language definition.
In your specific case, you need to know that len() returns int, and further, a value >= 0. You need to know that 4 is a constant that will take on the type int in this expression, and you need to know that integer division will return the integer quotient, which in this case will be a non-negative int and in fact exactly the answer you want.
i := len(l)/4
This case is an easy one.
Go's integer division truncates toward zero, and len(l) already returns an int, so:
i := len(l) / 4
If len(l) were some other integer type, i := int(len(l)) / 4 or i := int(len(l)/4) would work, with the first being theoretically slightly faster than the second.
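To confirm the truncation behaviour, a quick sketch:
package main

import "fmt"

func main() {
    fmt.Println(7 / 4)  // 1: integer division truncates toward zero
    fmt.Println(-7 / 4) // -1, not -2: truncation, not flooring

    l := make([]byte, 10)
    i := len(l) / 4 // len returns int, so no conversion is needed
    fmt.Println(i)  // 2
}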

What are Go arrays indexed by?

I am writing some 'memory allocator' type code, using an array and indexes rather than pointers. I'm hoping that the size of an index into the array is smaller than a pointer. I care because I am storing 'pointers' as integer indexes in an array rather than as 64-bit pointers.
I can't see anything in the Go spec that says what an array is indexed by. Obviously it's some kind of integer. Passing very large values makes the runtime complain that I can't pass negative numbers, so I'm guessing that it's somehow cast to a signed integer. So is it an int32? I'm guessing it's not an int64, because I didn't touch the top bit (which would have been two's complement for a negative number).
Arrays may be indexed by any integer type.
The Array types section of the Go Programming Language Specification says that in an array type definition,
The length is part of the array's type and must be a constant
expression that evaluates to a non-negative integer value.
In an index expression such as a[x]:
x must be an integer value and 0 <= x < len(a)
But there is a limitation on the magnitude of an index; the description of Length and capacity says:
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
So the declared size of an array, or the index in an index expression, can be of any integer type (int, uint, uintptr, int8, int16, int32, int64, uint8, uint16, uint32, uint64), but it must be non-negative and within the range of type int (which is the same size as either int32 or int64 -- though it's a distinct type from either).
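A small sketch showing indexes of several different integer types:
package main

import "fmt"

func main() {
    a := [4]string{"w", "x", "y", "z"}

    var i uint8 = 1
    var j int64 = 3
    fmt.Println(a[i], a[j]) // x z: any integer type works as an index

    // The value still has to be in range for the array:
    // a[-1]  // rejected at compile time: constant index is negative
    // a[j+1] // compiles, but panics at run time: index out of range
}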
It's a very interesting question indeed. I have not found any direct rules in the documentation either; instead I've found two great discussions in the Go discussion groups.
In the first one, among many other things, I found an answer to why indexes are implemented as int rather than uint:
Algorithms can benefit from the ability to express negative offsets
and such. If indexes were unsigned you'd always need a conversion in
these cases.
The second one specifically talks about the possibility (but only the possibility!) of using int64 for large arrays, mentioning the limitations of the len and cap functions (limitations that are in fact documented):
The built-in functions len and cap take arguments of various types and
return a result of type int. The implementation guarantees that the
result always fits into an int.
I do agree, though, that a more official point of view wouldn't hurt.
Arrays and slices are indexed by ints. An int is defined as being a 32- or 64-bit signed integer. The most common implementation at the time (6g) used 32-bit integers regardless of the architecture, with the plan that an int would eventually be 64 bits on 64-bit machines and therefore the same length as a pointer; that has since happened (as of Go 1.1, int is 64 bits on 64-bit platforms).
The language spec defines 3 implementation-dependent numeric types:
uint either 32 or 64 bits
int same size as uint
uintptr an unsigned integer large enough to store the uninterpreted bits of a pointer value
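You can check what your own platform uses; a minimal sketch:
package main

import (
    "fmt"
    "unsafe"
)

func main() {
    var i int
    var u uint
    var p uintptr
    // On a 64-bit platform this prints 8 8 8; on 32-bit, typically 4 4 4.
    fmt.Println(unsafe.Sizeof(i), unsafe.Sizeof(u), unsafe.Sizeof(p))
}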
