cap vs len of slice in golang - go

What is the difference between cap and len of a slice in golang?
According to definition:
A slice has both a length and a capacity.
The length of a slice is the number of elements it contains.
The capacity of a slice is the number of elements in the underlying array, counting from the first element in the slice.
x := make([]int, 0, 5) // len(x)=0, cap(x)=5
Does the len mean non null values only?

A slice is an abstraction that uses an array under the covers.
cap tells you how many elements the underlying array can hold, counting from the slice's first element. len tells you how many elements are currently in the slice.
The slice abstraction in Go is very nice because append grows the backing storage for you, allocating a larger array when needed. Arrays in Go cannot be resized, so slices are almost always used instead.
Example:
s := make([]int, 0, 3)
for i := 0; i < 5; i++ {
s = append(s, i)
fmt.Printf("cap %v, len %v, %p\n", cap(s), len(s), s)
}
Will output something like this:
cap 3, len 1, 0x1040e130
cap 3, len 2, 0x1040e130
cap 3, len 3, 0x1040e130
cap 6, len 4, 0x10432220
cap 6, len 5, 0x10432220
As you can see once the capacity is met, append will return a new slice with a larger capacity. On the 4th iteration you will notice a larger capacity and a new pointer address.
I realize you did not ask about arrays and append but they are pretty foundational in understanding the slice and the reason for the builtins.
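To address the original question directly: len does not mean "non-null values only". It counts every element in the slice, including elements that still hold the zero value. The sketch below illustrates this; the helper name lenAndCap is made up for the example.

```go
package main

import "fmt"

// lenAndCap is a hypothetical helper that returns the length and capacity of s.
func lenAndCap(s []int) (int, int) {
	return len(s), cap(s)
}

func main() {
	// Three elements exist, all holding the zero value 0.
	zeros := make([]int, 3, 5)
	l, c := lenAndCap(zeros)
	fmt.Println(l, c, zeros) // 3 5 [0 0 0]

	// No elements exist yet, but there is room for five.
	empty := make([]int, 0, 5)
	l, c = lenAndCap(empty)
	fmt.Println(l, c, empty) // 0 5 []
}
```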

From the source code:
// The len built-in function returns the length of v, according to its type:
// Array: the number of elements in v.
// Pointer to array: the number of elements in *v (even if v is nil).
// Slice, or map: the number of elements in v; if v is nil, len(v) is zero.
// String: the number of bytes in v.
// Channel: the number of elements queued (unread) in the channel buffer;
// if v is nil, len(v) is zero.
func len(v Type) int
// The cap built-in function returns the capacity of v, according to its type:
// Array: the number of elements in v (same as len(v)).
// Pointer to array: the number of elements in *v (same as len(v)).
// Slice: the maximum length the slice can reach when resliced;
// if v is nil, cap(v) is zero.
// Channel: the channel buffer capacity, in units of elements;
// if v is nil, cap(v) is zero.
func cap(v Type) int
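As a rough illustration of the doc comments above, the following sketch calls len and cap on several of the listed types. The function name describe and the sample values are assumptions made for this example only.

```go
package main

import "fmt"

// describe is a hypothetical helper that reports len and cap for a few
// built-in types, following the rules quoted from the doc comments.
func describe() []string {
	arr := [4]int{1, 2, 3, 4}
	var nilSlice []int
	var nilMap map[string]int
	ch := make(chan int, 3)
	ch <- 42 // one element queued in a buffer of three

	return []string{
		fmt.Sprintf("array: len=%d cap=%d", len(arr), cap(arr)),
		fmt.Sprintf("pointer to array: len=%d cap=%d", len(&arr), cap(&arr)),
		fmt.Sprintf("nil slice: len=%d cap=%d", len(nilSlice), cap(nilSlice)),
		fmt.Sprintf("nil map: len=%d", len(nilMap)),
		// len of a string counts bytes; U+00E9 takes two bytes in UTF-8.
		fmt.Sprintf("string: len=%d", len("h\u00e9llo")),
		fmt.Sprintf("channel: len=%d cap=%d", len(ch), cap(ch)),
	}
}

func main() {
	for _, line := range describe() {
		fmt.Println(line)
	}
}
```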

Simple explanation
A slice is a self-growing form of array, and it has two main properties.
Length is the total number of elements the slice currently holds. It is what you use when looping over the elements stored in the slice, and when you print a slice, only the elements up to its length are printed.
Capacity is the total number of elements available in the underlying array, counting from the slice's first element. When you append elements, the length increases until it reaches the capacity. After that, any further append causes the capacity to increase automatically (approximately doubling for small slices), and the length grows by the number of elements appended.
The real magic happens when you slice sub-slices out of a slice: all the actual reads and writes happen on the underlying array. Any change made through a sub-slice therefore also shows up in the original slice, because both view the same array, even though each sub-slice has its own length and capacity.
Go through the program below carefully. It is a modified version of an example from the Go tour.
package main
import "fmt"
func main() {
sorig := []int{2, 3, 5, 7, 11, 13}
printSlice(sorig)
// Slice the slice to give it zero length.
s := sorig[:0]
printSlice(s)
// Extend its length.
s = s[:4]
s[2] = 555
printSlice(s)
// Drop its first two values.
s = s[2:]
printSlice(s)
printSlice(sorig)
}
func printSlice(s []int) {
fmt.Printf("len=%d cap=%d %v\n", len(s), cap(s), s)
}
// Output:
//len=6 cap=6 [2 3 5 7 11 13]
//len=0 cap=6 []
//len=4 cap=6 [2 3 555 7]
//len=2 cap=4 [555 7]
//len=6 cap=6 [2 3 555 7 11 13]
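A consequence of the sharing described above is that append on a sub-slice can also overwrite the original, as long as the sub-slice still has spare capacity. This is a minimal sketch; appendWithinCap is a hypothetical helper name, not something from the tour.

```go
package main

import "fmt"

// appendWithinCap is a hypothetical helper: it appends v to the first n
// elements of orig. When n < cap(orig), append reuses orig's backing array
// and overwrites the element at index n.
func appendWithinCap(orig []int, n, v int) []int {
	return append(orig[:n], v)
}

func main() {
	orig := []int{2, 3, 5, 7, 11, 13}
	sub := appendWithinCap(orig, 2, 99)
	fmt.Println(sub)  // [2 3 99]
	fmt.Println(orig) // [2 3 99 7 11 13]
}
```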

Related

cap returns different value for same underlying array

For
package main
import "fmt"
func main() {
a := make([]int, 5)
printSlice("a", a)
b := make([]int, 1, 5)
b[0]=1
printSlice("b", b)
c := b[:2]
printSlice("c", c)
d := b[2:5]
printSlice("d", d)
}
func printSlice(s string, x []int) {
fmt.Printf("%s len=%d cap=%d %v\n",
s, len(x), cap(x), x)
}
The output is
a len=5 cap=5 [0 0 0 0 0]
b len=1 cap=5 [1]
c len=2 cap=5 [1 0]
d len=3 cap=3 [0 0 0]
Why does c have cap=5 while d has cap=3? Both of them share the same underlying array as b (which has cap=5).
Spec: Slice types:
The array underlying a slice may extend past the end of the slice. The capacity is a measure of that extent: it is the sum of the length of the slice and the length of the array beyond the slice; a slice of length up to that capacity can be created by slicing a new one from the original slice. The capacity of a slice a can be discovered using the built-in function cap(a).
Slices can be extended beyond the length (if the capacity allows), but not before the first element. The capacity therefore only includes elements that may be "claimed" after the last element with a slice expression.
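Put differently, slicing at a low index greater than zero gives up the elements before that index, so the capacity shrinks by low. A small sketch, assuming a helper named capAfterSlice that exists only for this example:

```go
package main

import "fmt"

// capAfterSlice is a hypothetical helper that returns the capacity of
// s[low:high]. For a simple slice expression this equals cap(s) - low.
func capAfterSlice(s []int, low, high int) int {
	return cap(s[low:high])
}

func main() {
	b := make([]int, 1, 5)
	fmt.Println(capAfterSlice(b, 0, 2)) // 5: starts at b's first element
	fmt.Println(capAfterSlice(b, 2, 5)) // 3: the two elements before index 2 are out of reach
}
```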

Slice can access another slice out of range but indexing out of range causes panic

My code:
package main
import (
"fmt"
)
func main() {
a := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
b := a[1:4]
fmt.Println("a:", a)
fmt.Println("b:", b)
// Works fine even though c is indexing past the end of b.
c := b[4:7]
fmt.Println("c:", c)
// This fails with panic: runtime error: index out of range [4] with length 3
// d := b[4]
}
Output:
a: [0 1 2 3 4 5 6 7 8 9]
b: [1 2 3]
c: [5 6 7]
If I uncomment the line that contains d := b[4], it leads to this error:
panic: runtime error: index out of range [4] with length 3
My question:
Why is it okay to access b[4:7] even though the index 4 is out of range for b which has a length 3 but it is not okay to access b[4]? What Go language rules explain this behavior?
Relevant rules: Spec: Index expressions and Spec: Slice expressions.
In short: when indexing, index must be less than the length. When slicing, upper index must be less than or equal to the capacity.
When indexing: a[x]
the index x is in range if 0 <= x < len(a), otherwise it is out of range
When slicing: a[low: high]
For arrays or strings, the indices are in range if 0 <= low <= high <= len(a), otherwise they are out of range. For slices, the upper index bound is the slice capacity cap(a) rather than the length.
When you do this:
b := a[1:4]
b will be a slice sharing the backing array with a, b's length will be 3 and its capacity will be 9. So later it is perfectly valid to slice b even beyond its length, up to its capacity which is 9. But when indexing, you can always index only the part covered by the slice's length.
We use indexing to access current elements of a slice or array, and we use slicing if we want to create a fragment of an array or slice, or if we want to extend it. Extending it means we want a bigger portion that is still covered by the backing array.
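The two bounds rules can be checked at run time. The sketch below uses recover to observe whether an expression panics; the helper names indexPanics and slicePanics are invented for this illustration.

```go
package main

import "fmt"

// indexPanics is a hypothetical helper reporting whether s[i] panics.
func indexPanics(s []int, i int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = s[i]
	return false
}

// slicePanics is a hypothetical helper reporting whether s[low:high] panics.
func slicePanics(s []int, low, high int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = s[low:high]
	return false
}

func main() {
	a := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	b := a[1:4] // len(b) == 3, cap(b) == 9

	fmt.Println(indexPanics(b, 2))     // false: 2 < len(b)
	fmt.Println(indexPanics(b, 4))     // true: 4 >= len(b)
	fmt.Println(slicePanics(b, 4, 7))  // false: 7 <= cap(b)
	fmt.Println(slicePanics(b, 4, 10)) // true: 10 > cap(b)
}
```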

Trying to initialize 2D slice, but the element reference changes all the rows in the slice [duplicate]

I created a 2D slice using the code below. Say I created a 3x3 slice: [[1 2 3] [4 5 6] [7 8 9]].
If I then update the slice with, say, s[1][1]=99, every row changes: [[1 99 3] [4 99 6] [7 99 9]].
However, the second slice, which I initialized below with the variable cost, does behave correctly. I am not sure what is wrong:
func CreateSparseM() *SparseM{
var m,n,nz int
fmt.Println("Enter the row count of matrix ")
fmt.Scan(&m)
fmt.Println("Enter the column count of matrix ")
fmt.Scan(&n)
fmt.Println("Enter the count of Non Zero elements in the matrix ")
fmt.Scan(&nz)
r:=make([][]int,m)
c:=make([]int,n)
for i:=0;i<m;i++{
r[i] = c
}
fmt.Println(" r ", r)
r[1][1] = 99
fmt.Println(r[1][1])
fmt.Println(r[0][1])
//enter the non-zero elements
var row,col,elem int
for i:=0;i<nz;i++{
fmt.Println("Enter row ")
fmt.Scan(&row)
fmt.Println("Enter col ")
fmt.Scan(&col)
fmt.Println("Enter element ")
fmt.Scan(&elem)
r[row][col] = elem
}
fmt.Println(r)
cost:= [][]int{ {1,1,2,2,3,4,4,5,5},
{2,6,3,7,4,5,7,6,7},
{25,5,12,10,8,16,14,20,18}}
fmt.Println(cost)
cost[1][2]= 777
fmt.Println(cost)
sparseM := &SparseM{m,n,nz,r}
return sparseM
}
A slice contains a reference to an array, the capacity, and the length of the slice. So the following code:
r:=make([][]int,m)
c:=make([]int,n)
for i:=0;i<m;i++{
r[i] = c
}
sets all of r[i] to the same slice c. That is, all r[i] share the same backing array. So if you set r[i][j]=x, you set j'th element of all slices r[i] to x.
The slice you initialized using a literal has three distinct slices, so it does not behave like this.
If you do:
for i:=0;i<m;i++{
r[i] = make([]int,n)
}
then you'll have distinct slices for the first case as well.
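Wrapped into a small function, the fix might look like the following sketch. The name newMatrix is an assumption for this example and does not appear in the question's code.

```go
package main

import "fmt"

// newMatrix is a hypothetical helper that returns an m-by-n matrix in which
// every row has its own backing array.
func newMatrix(m, n int) [][]int {
	rows := make([][]int, m)
	for i := range rows {
		rows[i] = make([]int, n) // a fresh slice per row, not a shared one
	}
	return rows
}

func main() {
	r := newMatrix(3, 3)
	r[1][1] = 99
	fmt.Println(r) // [[0 0 0] [0 99 0] [0 0 0]]
}
```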

golang combination generation made an error

I'm dealing with a programming problem
Given two integers n and k, return all possible combinations of k numbers out of 1 ... n.
and with input n = 5, k = 4, the output should be [[1,2,3,4],[1,2,3,5],[1,2,4,5],[1,3,4,5],[2,3,4,5]], the following is my golang solution
func combine(n int, k int) [][]int {
result := [][]int{}
comb := []int{}
subcom(0, k, n, &comb, &result)
return result
}
func subcom(s, k, n int, comb *[]int, result *[][]int) {
if k > 0 {
for i := s + 1; i <= n-k+1; i++ {
c := append(*comb, i)
subcom(i, k-1, n, &c, result)
}
} else {
*result = append(*result, *comb)
}
}
I think my solution is right, but it returns [[1 2 3 5] [1 2 3 5] [1 2 4 5] [1 3 4 5] [2 3 4 5]].
After debugging, I found [1 2 3 4] was added to the result slice at the beginning, but later changed to [1 2 3 5], resulting in the repetition of two [1 2 3 5]s. But I can't figure out what's wrong here.
This is a common mistake when using append.
When your code runs c := append(*comb, i), append first tries to store the new item in the spare capacity of the existing underlying array, and it only allocates a new array when there is no room left. This is what changes [1 2 3 4] to [1 2 3 5]: both slices share the same underlying memory.
To fix this, copy when you want to append into result:
now := make([]int,len(*comb))
copy(now,*comb)
*result = append(*result,now)
Or use a shortcut of copying:
*result = append(*result, append([]int{},*comb...))
Update:
To understand what I mean by underlying memory, one should understand the internal model of Go's slice.
In Go, a slice is a small data structure described by SliceHeader in the reflect package. It is this header that unsafe.Sizeof measures and that you get when you take the address of a slice variable.
The SliceHeader holds three fields: Data, Len and Cap. The last two are trivial: they are what len() and cap() return. Data is a uintptr that points to the memory holding the slice's elements.
When you shallow-copy a slice, a new SliceHeader is created with the same content, including Data. So the underlying memory is not copied, but shared.
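For completeness, here is the question's code with the copy-based fix applied in the else branch. Only the variable name snapshot is new; everything else follows the original solution.

```go
package main

import "fmt"

func combine(n int, k int) [][]int {
	result := [][]int{}
	comb := []int{}
	subcom(0, k, n, &comb, &result)
	return result
}

func subcom(s, k, n int, comb *[]int, result *[][]int) {
	if k > 0 {
		for i := s + 1; i <= n-k+1; i++ {
			// c may share its backing array with a sibling combination.
			c := append(*comb, i)
			subcom(i, k-1, n, &c, result)
		}
	} else {
		// Copy before storing so later appends cannot overwrite this entry.
		snapshot := make([]int, len(*comb))
		copy(snapshot, *comb)
		*result = append(*result, snapshot)
	}
}

func main() {
	fmt.Println(combine(5, 4))
	// [[1 2 3 4] [1 2 3 5] [1 2 4 5] [1 3 4 5] [2 3 4 5]]
}
```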

`append` complexity

What is the computational complexity of this loop in the Go programming language?
var a []int
for i := 0 ; i < n ; i++ {
a = append(a, i)
}
Does append operate in linear time (reallocating memory and copying everything on each append), or in amortized constant time (like the way vector classes in many languages are implemented)?
The Go Programming Language Specification says that the append built-in function reallocates if necessary.
Appending to and copying slices
If the capacity of s is not large enough to fit the additional values,
append allocates a new, sufficiently large slice that fits both the
existing slice elements and the additional values. Thus, the returned
slice may refer to a different underlying array.
The precise algorithm to grow the target slice, when necessary, for an append is implementation dependent. For the current gc compiler algorithm, see the growslice function in the Go runtime package slice.go source file. It's amortized constant time.
In part, the amount-to-grow slice computation reads:
newcap := old.cap
doublecap := newcap + newcap
if cap > doublecap {
newcap = cap
} else {
if old.len < 1024 {
newcap = doublecap
} else {
for newcap < cap {
newcap += newcap / 4
}
}
}
ADDENDUM
The Go Programming Language Specification allows implementors of the language to implement the append built-in function in a number of ways.
For example, new allocations only have to be "sufficiently large". The amount allocated may be parsimonious, allocating the minimum necessary amount, or generous, allocating more than the minimum necessary amount to minimize the cost of resizing many times. The Go gc compiler uses a generous dynamic array amortized constant time algorithm.
The following code illustrates two legal implementations of the append built-in function. The generous constant function implements the same amortized constant time algorithm as the Go gc compiler. The parsimonious variable function, once the initial allocation is filled, reallocates and copies everything every time. The Go append function and the Go gccgo compiler are used as controls.
package main
import "fmt"
// Generous reallocation
func constant(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
newcap := len(s) + len(x)
m := cap(s)
if m+m < newcap {
m = newcap
} else {
for {
if len(s) < 1024 {
m += m
} else {
m += m / 4
}
if !(m < newcap) {
break
}
}
}
tmp := make([]int, len(s), m)
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
// Parsimonious reallocation
func variable(s []int, x ...int) []int {
if len(s)+len(x) > cap(s) {
tmp := make([]int, len(s), len(s)+len(x))
copy(tmp, s)
s = tmp
}
if len(s)+len(x) > cap(s) {
panic("unreachable")
}
return append(s, x...)
}
func main() {
s := []int{0, 1, 2}
x := []int{3, 4}
fmt.Println("data ", len(s), cap(s), s, len(x), cap(x), x)
a, c, v := s, s, s
for i := 0; i < 4096; i++ {
a = append(a, x...)
c = constant(c, x...)
v = variable(v, x...)
}
fmt.Println("append ", len(a), cap(a), len(x))
fmt.Println("constant", len(c), cap(c), len(x))
fmt.Println("variable", len(v), cap(v), len(x))
}
Output:
gc:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
gccgo:
data 3 3 [0 1 2] 2 2 [3 4]
append 8195 9152 2
constant 8195 9152 2
variable 8195 8195 2
To summarize, depending on the implementation, once the initial capacity is filled, the append built-in function may or may not reallocate on every call.
References:
Dynamic array
Amortized analysis
Appending to and copying slices
If the capacity of s is not large enough to fit the additional values,
append allocates a new, sufficiently large slice that fits both the
existing slice elements and the additional values. Thus, the returned
slice may refer to a different underlying array.
Append to a slice specification discussion
The spec (at tip and 1.0.3) states:
"If the capacity of s is not large enough to fit the additional
values, append allocates a new, sufficiently large slice that fits
both the existing slice elements and the additional values. Thus, the
returned slice may refer to a different underlying array."
Should this be an "If and only if"? For example, if I know the
capacity of my slice is sufficiently long, am I assured that I will
not change the underlying array?
Rob Pike
Yes you are so assured.
runtime slice.go source file
Arrays, slices (and strings): The mechanics of 'append'
It doesn't reallocate on every append and it is quite explicitly stated in the docs:
If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large slice that fits both the existing slice elements and the additional values. Thus, the returned slice may refer to a different underlying array.
Amortized constant time is thus the complexity asked about.
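One way to observe this empirically is to count how often the capacity changes while appending one element at a time. The sketch below does that; countReallocs is a hypothetical helper, and the exact counts depend on the compiler and version, so no expected output is shown.

```go
package main

import "fmt"

// countReallocs is a hypothetical helper: it appends n ints one at a time
// and counts how often the capacity changes, which is how often append had
// to allocate a new underlying array.
func countReallocs(n int) int {
	var a []int
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(a)
		a = append(a, i)
		if cap(a) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	for _, n := range []int{1000, 1000000} {
		fmt.Printf("n=%d reallocations=%d\n", n, countReallocs(n))
	}
}
```

With a geometric growth policy, the number of reallocations grows roughly logarithmically in n rather than linearly, which is what makes each append amortized constant time.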
