golang: Insert to a sorted slice - go

What's the most efficient way of inserting an element to a sorted slice?
I tried a couple of things, but they all ended up using at least two appends, which as I understand it makes a new copy of the slice.

Here is how to insert into a sorted slice of strings:
Go Playground Link to full example: https://play.golang.org/p/4RkVgEpKsWq
func Insert(ss []string, s string) []string {
i := sort.SearchStrings(ss, s)
ss = append(ss, "")
copy(ss[i+1:], ss[i:])
ss[i] = s
return ss
}

If the slice has enough capacity then there's no need for a new copy.
The elements after the insert position can be shifted to the right.
Only when the slice doesn't have enough capacity,
a new slice and copying all values will be necessary.
Keep in mind that slices are not designed for fast insertion.
So there won't be a miracle solution here using slices.
You could create a custom data structure to make this more efficient,
but obviously there will be other trade-offs.
One point that can be optimized in the process is finding the insertion point quickly. If the slice is sorted, then you can use binary search to perform this in O(log n) time.
However, this might not matter much,
considering the expensive operation of copying the end of the slice,
or reallocating when necessary.
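The point about capacity can be checked with a small sketch. The function name insertSortedString and the capacity of 8 are chosen for illustration only; the program compares the address of the first element before and after an insert to show that no new backing array was allocated.

```go
package main

import (
	"fmt"
	"sort"
)

// insertSortedString inserts s into the sorted slice ss and returns the result.
func insertSortedString(ss []string, s string) []string {
	i := sort.SearchStrings(ss, s) // O(log n) lookup of the insert position
	ss = append(ss, "")            // reallocates only when len(ss) == cap(ss)
	copy(ss[i+1:], ss[i:])         // O(n) shift of the tail to the right
	ss[i] = s
	return ss
}

func main() {
	// Reserve room for 8 elements up front (an arbitrary choice).
	ss := make([]string, 0, 8)
	for _, s := range []string{"pear", "apple", "fig"} {
		ss = insertSortedString(ss, s)
	}
	first := &ss[0]
	ss = insertSortedString(ss, "kiwi")
	// The backing array is unchanged because there was spare capacity.
	fmt.Println(ss, first == &ss[0]) // [apple fig kiwi pear] true
}
```

Once the spare capacity is used up, the next append allocates a larger array and copies everything, which is the expensive case described above.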

I like @likebike's answer, but it only works for strings. Here is a generic version that works for a slice of any ordered type (requires Go 1.18):
func Insert[T constraints.Ordered](ts []T, t T) []T {
i, _ := slices.BinarySearch(ts, t) // find slot while ts is still sorted
var zero T
ts = append(ts, zero) // extend the slice
copy(ts[i+1:], ts[i:]) // make room
ts[i] = t
return ts
}
Note that this uses the packages golang.org/x/exp/slices and golang.org/x/exp/constraints. Since Go 1.21 the standard library provides slices.BinarySearch and cmp.Ordered, which can be used instead.
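With Go 1.21 or later, the same idea can be written against the standard library alone. This is a minimal sketch, not the answer author's code; the name InsertSorted is made up here, and it relies on slices.BinarySearch, slices.Insert and cmp.Ordered from the standard library.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// InsertSorted inserts t into the sorted slice ts and keeps it sorted.
func InsertSorted[T cmp.Ordered](ts []T, t T) []T {
	i, _ := slices.BinarySearch(ts, t) // search before the slice is modified
	return slices.Insert(ts, i, t)     // grows the slice and shifts the tail
}

func main() {
	fmt.Println(InsertSorted([]int{1, 3, 5}, 4))       // [1 3 4 5]
	fmt.Println(InsertSorted([]string{"a", "c"}, "b")) // [a b c]
}
```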

There are two parts to the problem: finding where to insert the value and inserting the value.
Use the sort package search functions to efficiently find the insertion index using binary search.
Use a single call to append to efficiently insert a value into a slice:
// insertAt inserts v into data at index i and returns the new slice.
func insertAt(data []int, i int, v int) []int {
if i == len(data) {
// Insert at end is the easy case.
return append(data, v)
}
// Make space for the inserted element by shifting
// values at the insertion index up one index. The call
// to append does not allocate memory when cap(data) is
// greater than len(data).
data = append(data[:i+1], data[i:]...)
// Insert the new element.
data[i] = v
// Return the updated slice.
return data
}
Here's the code for inserting a value into a sorted slice:
func insertSorted(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
return insertAt(data, i, v)
}
The code in this answer uses a slice of int. Adjust the type to match your actual data.
The call to sort.Search in this answer can be replaced with a call to the helper function sort.SearchInts. I show sort.Search in this answer because the function applies to a slice of any type.
If you do not want to add duplicate values, check the value at the search index before inserting:
func insertSortedNoDups(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
if i < len(data) && data[i] == v {
return data
}
return insertAt(data, i, v)
}
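Because sort.Search takes a predicate rather than a typed slice, the same pattern carries over to other element types. The sketch below assumes a hypothetical user record sorted by Age; the type and function names are illustrative and do not come from the answer.

```go
package main

import (
	"fmt"
	"sort"
)

// user is a hypothetical record type, kept ordered by Age.
type user struct {
	Name string
	Age  int
}

// insertUser inserts u into users, which must already be sorted by Age.
func insertUser(users []user, u user) []user {
	// Find the first element whose Age is not smaller than u.Age.
	i := sort.Search(len(users), func(i int) bool { return users[i].Age >= u.Age })
	users = append(users, user{}) // grow by one
	copy(users[i+1:], users[i:])  // shift the tail right
	users[i] = u
	return users
}

func main() {
	users := []user{{"Ann", 25}, {"Bob", 40}}
	users = insertUser(users, user{"Cy", 31})
	fmt.Println(users) // [{Ann 25} {Cy 31} {Bob 40}]
}
```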

You could use a heap:
package main
import (
"container/heap"
"sort"
)
type slice struct { sort.IntSlice }
func (s slice) Pop() interface{} { return 0 }
func (s *slice) Push(x interface{}) {
(*s).IntSlice = append((*s).IntSlice, x.(int))
}
func main() {
s := &slice{
sort.IntSlice{11, 10, 14, 13},
}
heap.Init(s)
heap.Push(s, 12)
println(s.IntSlice[0] == 10)
}
Note that a heap is not strictly sorted, but the minimum element is guaranteed
to be the first element. Also, I did not implement the Pop function in my
example; you would want to do that.
https://golang.org/pkg/container/heap
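For completeness, here is one way the missing Pop could look. This is a sketch under the usual container/heap convention that heap.Pop first swaps the minimum to the end and then calls the type's own Pop; the names intHeap, newIntHeap and drain are made up for this example.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// intHeap is a min-heap of ints. Len, Less and Swap come from sort.IntSlice.
type intHeap struct{ sort.IntSlice }

// Push appends x; heap.Push then restores the heap order.
func (h *intHeap) Push(x any) { h.IntSlice = append(h.IntSlice, x.(int)) }

// Pop removes and returns the last element; heap.Pop has already
// swapped the minimum into that position.
func (h *intHeap) Pop() any {
	old := h.IntSlice
	n := len(old)
	x := old[n-1]
	h.IntSlice = old[:n-1]
	return x
}

// newIntHeap copies vals and arranges them as a heap.
func newIntHeap(vals ...int) *intHeap {
	h := &intHeap{append(sort.IntSlice(nil), vals...)}
	heap.Init(h)
	return h
}

// drain pops every element, which yields them in ascending order.
func drain(h *intHeap) []int {
	out := make([]int, 0, h.Len())
	for h.Len() > 0 {
		out = append(out, heap.Pop(h).(int))
	}
	return out
}

func main() {
	h := newIntHeap(11, 10, 14, 13)
	heap.Push(h, 12)
	fmt.Println(drain(h)) // [10 11 12 13 14]
}
```

Each push and pop costs O(log n), but reading the elements in sorted order requires popping them, so a heap suits "give me the smallest" workloads better than "keep a sorted slice" ones.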

There are two approaches mentioned here to insert into the slice when the position i is known:
data = append(data, "")
copy(data[i+1:], data[i:])
data[i] = s
and
data = append(data[:i+1], data[i:]...)
data[i] = s
I just benchmarked both with go1.18beta2, and the first solution is approximately 10% faster.
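A comparison like this can be reproduced with a small harness such as the sketch below. It is not the benchmark that was actually run: the slice size, the insertion index and the function names are assumptions, and timings will differ between Go versions and machines.

```go
package main

import (
	"fmt"
	"testing"
)

// insertCopy grows the slice by one, then shifts the tail with copy.
func insertCopy(data []int, i, v int) []int {
	data = append(data, 0)
	copy(data[i+1:], data[i:])
	data[i] = v
	return data
}

// insertAppend shifts the tail with a single self-overlapping append.
func insertAppend(data []int, i, v int) []int {
	if i == len(data) {
		return append(data, v)
	}
	data = append(data[:i+1], data[i:]...)
	data[i] = v
	return data
}

// bench times insert on a 1024-element slice with one spare slot,
// so only the shifting cost is measured, not reallocation.
func bench(insert func([]int, int, int) []int) testing.BenchmarkResult {
	base := make([]int, 1024, 1025)
	return testing.Benchmark(func(b *testing.B) {
		for n := 0; n < b.N; n++ {
			_ = insert(base[:1024], 512, n)
		}
	})
}

func main() {
	fmt.Println("copy:  ", bench(insertCopy))
	fmt.Println("append:", bench(insertAppend))
}
```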

Generic data type with an option to allow duplicates (Go 1.18); the only dependencies are the golang.org/x/exp packages.
Time complexity: O(log n) to find the position, plus O(n) for slices.Insert to shift the elements that follow it.
import "golang.org/x/exp/constraints"
import "golang.org/x/exp/slices"
func InsertionSort[T constraints.Ordered](array []T, value T, canDuplicate bool) []T {
pos, isFound := slices.BinarySearch(array, value)
if canDuplicate || !isFound {
array = slices.Insert(array, pos, value)
}
return array
}
full version : https://go.dev/play/p/P2_ou2Fqs37

play : https://play.golang.org/p/dUGmPurouxA
array1 := []int{1, 3, 4, 5}
//want to insert at index 1
insertAtIndex := 1
temp := append([]int{}, array1[insertAtIndex:]...)
array1 = append(array1[0:insertAtIndex], 2)
array1 = append(array1, temp...)
fmt.Println(array1)

You can try the code below. It uses Go's sort package:
package main
import "sort"
import "fmt"
func main() {
data := []int{20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32}
var items = []int{23, 27}
for _, x := range items {
i := sort.Search(len(data), func(i int) bool { return data[i] >= x })
if i < len(data) && data[i] == x {
fmt.Println(i)
} else {
data = append(data, 0)
copy(data[i+1:], data[i:])
data[i] = x
}
fmt.Println(data)
}
}

Related

Sort 2D array of structs Golang

I want to create a consistent ordering for a 2D slice of structs, I am creating the 2D slice from a map so the order is always different.
My structs look like
// Hit contains the data for a hit.
type Hit struct {
Key string `json:"key"`
Data []Field `json:"data"`
}
// Hits stores a list of hits.
type Hits [][]Hit
I want to provide a consistent order for the contents of my Hits type.
I have tried:
func (c Hits) Len() int { return len(c) }
func (c Hits) Swap(i, j int) { c[i], c[j] = c[j], c[i] }
func (c Hits) Less(i, j int) bool { return strings.Compare(c[i][0].Key, c[j][0].Key) == -1 }
But the results still seem to come back in random order.
I was thinking of possibly hashing each item in the slice but thought there might be an easier option
The order of iteration over a map is not specified in Go, and the runtime deliberately randomizes it, so you cannot rely on it being the same from one loop to the next, even for two maps built with the same keys in the same sequence.
Assuming that your map is a map[string]Hit, to iterate it over in a determinate order, I would enumerate the set of keys in the map, sort that, and use that sorted set to enumerate the map.
Something like this:
package main
import (
"fmt"
"sort"
)
type Hit struct {
Key string `json:"key"`
Data []Field `json:"data"`
}
type Field struct {
Value string `json:"value"`
}
func main() {
var mapOfHits = getSomeHits()
var sortedHits = sortHits(mapOfHits)
for _, h := range sortedHits {
fmt.Println(h.Key)
}
}
func getSomeHits() map[string]Hit {
return make(map[string]Hit, 0)
}
func sortHits(m map[string]Hit) []Hit {
keys := make([]string, 0, len(m))
sorted := make([]Hit, 0, len(m))
for k := range m {
keys = append(keys, k)
}
sort.Strings(keys)
for _, k := range keys {
sorted = append(sorted, m[k])
}
return sorted
}
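If the source map holds groups of hits rather than single hits, the 2D slice itself can be sorted after it is built. The sketch below assumes a map[string][]Hit and a reduced Hit type with only a Key field; both are guesses at the asker's data, and sortGroups is a made-up name.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is reduced to the one field used for ordering.
type Hit struct {
	Key string
}

// sortGroups orders groups by the Key of each group's first Hit.
// Empty groups sort first so the comparison never indexes out of range.
func sortGroups(groups [][]Hit) {
	sort.Slice(groups, func(i, j int) bool {
		a, b := groups[i], groups[j]
		if len(a) == 0 || len(b) == 0 {
			return len(a) < len(b)
		}
		return a[0].Key < b[0].Key
	})
}

func main() {
	m := map[string][]Hit{
		"b": {{Key: "b"}},
		"a": {{Key: "a"}, {Key: "z"}},
		"c": {{Key: "c"}},
	}
	var groups [][]Hit
	for _, g := range m { // map iteration order varies between runs
		groups = append(groups, g)
	}
	sortGroups(groups)
	for _, g := range groups {
		fmt.Println(g[0].Key)
	}
}
```

Groups whose first keys are equal may still appear in either order; sort.SliceStable or a secondary comparison would be needed to pin those down.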

How to pass an accumulator to a recursive func?

(I'm new to Go.)
I am working on this leetcode problem: https://leetcode.com/problems/pascals-triangle/
package main
import "fmt"
func main() {
arrRes := [][]int{}
gen(5, arrRes)
fmt.Println(arrRes)
}
func gen(numRows int, arrRes [][]int) {
build(numRows, 0, arrRes)
}
func build(n int, level int, arrRes [][]int) {
if(n == level) {
return
}
arr := []int{}
if level == 0 {
arr = append(arr, 1)
} else if level == 1 {
arr = append(arr, 1, 1)
} else {
// get it out
tmp := arrRes[level-1]
arr = comb(tmp)
}
arrRes = append(arrRes, arr)
build(n, level+1, arrRes)
}
func comb(arr []int) []int{
// arr type init
tmpArr := []int{1}
for i:=1; i<len(arr); i++ {
sum := arr[i-1] + arr[i]
tmpArr = append(tmpArr, sum)
}
// go use val, not ref
tmpArr = append(tmpArr, 1)
return tmpArr;
}
I want to define an accumulated variable arrRes := [][]int{} and keep passing into the recursive function. I think Go is pass-by-value instead of pass-by-reference. Is there a way to keep this pattern?
I've got two alternative methods:
passing a global var.
pass a 2D array into the func then return the new 2D array.
https://github.com/kenpeter/go_tri/blob/master/tri_global.go
https://github.com/kenpeter/go_tri/blob/master/tri.go
A slice is (basically) three things: a length, a capacity, and a pointer to an underlying array. Everything in Go is pass-by-value, so when you pass a slice to a function you are passing copies of its current length, its current capacity, and that pointer. Changes made to the length and capacity inside the function are made to the copy and will not affect the length and capacity of the slice that was passed as an argument in the function call.
Printing a slice doesn't print its underlying array, it prints the part of the underlying array that is visible in the slice (which could be none of it if len = 0), based on (1) the pointer to the first element in the underlying array that's supposed to be visible to the slice; and (2) the length in the slice variable.
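A short sketch makes the difference visible. The function names are illustrative: one function appends to its own copy of the slice header, one writes through the shared pointer, and one returns the updated header.

```go
package main

import "fmt"

// appendByValue appends to its own copy of the slice header;
// the caller's length is unchanged.
func appendByValue(s []int) { s = append(s, 99) }

// setFirst writes through the shared pointer, so the caller sees it.
func setFirst(s []int) { s[0] = 42 }

// appendAndReturn hands the updated header back to the caller.
func appendAndReturn(s []int) []int { return append(s, 99) }

func main() {
	nums := []int{1, 2, 3}

	appendByValue(nums)
	fmt.Println(nums) // [1 2 3]

	setFirst(nums)
	fmt.Println(nums) // [42 2 3]

	nums = appendAndReturn(nums)
	fmt.Println(nums) // [42 2 3 99]
}
```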
If you are modifying the length or capacity of a slice inside a function and you want those changes to be visible outside the function, you can either return the slice to update the context outside the function, like append does:
numbers = append(numbers, 27)
Or you can pass in a pointer to a slice:
func ChangeNumbersLenOrCap(numbers *[]int) {
// make your changes, no return value required
}
For your program, it looks like you could get away with a pointer to a slice of int slices:
var arrRes *[][]int
...because you're not modifying the int slice across another function boundary. Some programs would need a pointer to a slice of pointers to int slices:
var arrRes *[]*[]int
Here are some simple edits to get you started:
arrRes := [][]int{}
gen(5, &arrRes)
fmt.Println(arrRes)
}
func gen(numRows int, arrRes *[][]int) {
// ...
func build(n int, level int, arrRes *[][]int) {
// ...
tmp := (*arrRes)[level-1]
// ...
*arrRes = append(*arrRes, arr)
build(n, level+1, arrRes)
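Put together, a complete version of the program using a pointer to the accumulator might look like the sketch below. It folds the asker's comb helper into build and drops the gen wrapper for brevity; that restructuring is a choice made for this example, not part of the answer.

```go
package main

import "fmt"

// build appends rows level..n-1 of Pascal's triangle to *rows.
func build(n, level int, rows *[][]int) {
	if level == n {
		return
	}
	row := []int{1}
	if level > 0 {
		// Parentheses are required: indexing binds tighter than *.
		prev := (*rows)[level-1]
		for i := 1; i < len(prev); i++ {
			row = append(row, prev[i-1]+prev[i])
		}
		row = append(row, 1)
	}
	*rows = append(*rows, row) // visible to the caller through the pointer
	build(n, level+1, rows)
}

func main() {
	var rows [][]int
	build(5, 0, &rows)
	fmt.Println(rows) // [[1] [1 1] [1 2 1] [1 3 3 1] [1 4 6 4 1]]
}
```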

How to extract x top int values from a map in Golang?

I have a map[string]int
I want to get the x top values from it and store them in another data structure, another map or a slice.
From https://blog.golang.org/go-maps-in-action#TOC_7. I understood that:
When iterating over a map with a range loop, the iteration order is
not specified and is not guaranteed to be the same from one iteration
to the next.
so the result structure will be a slice then.
I had a look at several related topics but none fits my problem:
related topic 1
related topic 2
related topic 3
What would be the most efficient way to do this please?
Thanks,
Edit:
My solution would be to turn my map into a slice and sort it, then extract the first x values.
But is there a better way ?
package main
import (
"fmt"
"sort"
)
func main() {
// I want the x top values
x := 3
// Here is the map
m := make(map[string]int)
m["k1"] = 7
m["k2"] = 31
m["k3"] = 24
m["k4"] = 13
m["k5"] = 31
m["k6"] = 12
m["k7"] = 25
m["k8"] = -8
m["k9"] = -76
m["k10"] = 22
m["k11"] = 76
// Turning the map into this structure
type kv struct {
Key string
Value int
}
var ss []kv
for k, v := range m {
ss = append(ss, kv{k, v})
}
// Then sorting the slice by value, higher first.
sort.Slice(ss, func(i, j int) bool {
return ss[i].Value > ss[j].Value
})
// Print the x top values
for _, kv := range ss[:x] {
fmt.Printf("%s, %d\n", kv.Key, kv.Value)
}
}
If I want to have a map at the end with the x top values, then with my solution I would have to turn the slice into a map again. Would this still be the most efficient way to do it?
Creating a slice and sorting is a fine solution; however, you could also use a heap. The Big O performance should be equal for both implementations (n log n) so this is a viable alternative with the advantage that if you want to add new entries you can still efficiently access the top N items without repeatedly sorting the entire set.
To use a heap, you would implement the heap.Interface for the kv type with a Less function that compares Values as greater than (h[i].Value > h[j].Value), add all of the entries from the map, and then pop the number of items you want to use.
For example (Go Playground):
func main() {
m := getMap()
// Create a heap from the map and print the top N values.
h := getHeap(m)
for i := 1; i <= 3; i++ {
fmt.Printf("%d) %#v\n", i, heap.Pop(h))
}
// 1) main.kv{Key:"k11", Value:76}
// 2) main.kv{Key:"k2", Value:31}
// 3) main.kv{Key:"k5", Value:31}
}
func getHeap(m map[string]int) *KVHeap {
h := &KVHeap{}
heap.Init(h)
for k, v := range m {
heap.Push(h, kv{k, v})
}
return h
}
// See https://golang.org/pkg/container/heap/
type KVHeap []kv
// Note that "Less" is greater-than here so we can pop *larger* items.
func (h KVHeap) Less(i, j int) bool { return h[i].Value > h[j].Value }
func (h KVHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h KVHeap) Len() int { return len(h) }
func (h *KVHeap) Push(x interface{}) {
*h = append(*h, x.(kv))
}
func (h *KVHeap) Pop() interface{} {
old := *h
n := len(old)
x := old[n-1]
*h = old[0 : n-1]
return x
}
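When only the top x entries are needed and x is much smaller than the map, a different technique from the one above is a bounded min-heap: keep at most x entries and evict the smallest whenever a larger value arrives, which costs O(n log x) instead of O(n log n). The sketch below is an alternative under that assumption, and the names minHeap and topN are made up here.

```go
package main

import (
	"container/heap"
	"fmt"
)

type kv struct {
	Key   string
	Value int
}

// minHeap keeps the entry with the smallest Value at index 0.
type minHeap []kv

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].Value < h[j].Value }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(kv)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topN returns the n entries of m with the largest values, largest first.
func topN(m map[string]int, n int) []kv {
	if n <= 0 {
		return nil
	}
	h := &minHeap{}
	for k, v := range m {
		if h.Len() < n {
			heap.Push(h, kv{k, v})
		} else if v > (*h)[0].Value {
			(*h)[0] = kv{k, v} // replace the current minimum
			heap.Fix(h, 0)     // and restore the heap order
		}
	}
	out := make([]kv, h.Len())
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = heap.Pop(h).(kv) // pops ascending, so fill from the back
	}
	return out
}

func main() {
	m := map[string]int{"k1": 7, "k2": 31, "k3": 24, "k7": 25, "k11": 76}
	for _, e := range topN(m, 3) {
		fmt.Println(e.Key, e.Value)
	}
}
```

With equal values, which of the tied keys survives depends on map iteration order, the same as with the sort-based approach.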

Sorting array of i by v[i]/w[i] in Go

I would like to sort an array of indices in descending order of v[i]/w[i], where v and w are two other arrays of integers. Here is what I have tried in Go:
package main
import "fmt"
import "sort"
func main() {
v := [3]int{5, 6, 3}
w := [3]int{4, 5, 2}
indices := make([]int, 3)
for i := range indices {
indices[i] = i
}
sort.Slice(indices, func(a, b int) bool {
return float32(v[a])/float32(w[a]) > float32(v[b])/float32(w[b])
})
fmt.Println(indices)
}
I expect the output to be [2,0,1] because 3/2 > 5/4 > 6/5, but the actual output is [0,2,1]. Could anyone help me find where the problem is? Thank you.
To not mutate v and w arrays which can be expensive, we can just add another level of indirection into the Less function
sort.Slice(indices, func(a, b int) bool {
return float32(v[indices[a]])/float32(w[indices[a]]) > float32(v[indices[b]])/float32(w[indices[b]])
})
Sorting by definition moves items you're sorting around in the slice, therefore changing their respective indexes. However you are not moving the actual values you are sorting, which are in w and v, only the indices slice.
Since the indices slice contains the sorted "indices", you can use that to lookup the actual value for comparison.
sort.Slice(indices, func(i, j int) bool {
return float64(v[indices[i]])/float64(w[indices[i]]) > float64(v[indices[j]])/float64(w[indices[j]])
})
https://play.golang.org/p/6oFBM27bVR-
Or you could implement a type to sort all 3 values at once for example:
type indexSorter struct {
indices, w, v []int
}
func (a indexSorter) Len() int { return len(a.indices) }
func (a indexSorter) Swap(i, j int) {
a.indices[i], a.indices[j] = a.indices[j], a.indices[i]
a.w[i], a.w[j] = a.w[j], a.w[i]
a.v[i], a.v[j] = a.v[j], a.v[i]
}
func (a indexSorter) Less(i, j int) bool {
return float64(a.v[i])/float64(a.w[i]) > float64(a.v[j])/float64(a.w[j])
}
https://play.golang.org/p/EFUkHWgjo5U
This is happening because the sort function changes the indices while it is sorting but your accompanying arrays v and w remain constant.
The best way to do what you want is to create a single array with v and w both contained in a struct and then order that array.
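A sketch of that struct-based approach follows. The item type and the byRatioDesc function are illustrative names; the code assumes v and w have the same length and that no w value is zero.

```go
package main

import (
	"fmt"
	"sort"
)

// item bundles one original index with its v and w values.
type item struct {
	Index, V, W int
}

// byRatioDesc returns the indices of v ordered by v[i]/w[i], largest first.
func byRatioDesc(v, w []int) []int {
	items := make([]item, len(v))
	for i := range v {
		items[i] = item{i, v[i], w[i]}
	}
	// The values move together with their index, so a and b always
	// refer to the data that is currently at those positions.
	sort.Slice(items, func(a, b int) bool {
		return float64(items[a].V)/float64(items[a].W) > float64(items[b].V)/float64(items[b].W)
	})
	indices := make([]int, len(items))
	for i, it := range items {
		indices[i] = it.Index
	}
	return indices
}

func main() {
	fmt.Println(byRatioDesc([]int{5, 6, 3}, []int{4, 5, 2})) // [2 0 1]
}
```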

How to check the uniqueness inside a for-loop?

Is there a way to check slices/maps for the presence of a value?
I would like to add a value to a slice only if it does not exist in the slice.
This works, but it seems verbose. Is there a better way to do this?
orgSlice := []int{1, 2, 3}
newSlice := []int{}
newInt := 2
newSlice = append(newSlice, newInt)
for _, v := range orgSlice {
if v != newInt {
newSlice = append(newSlice, v)
}
}
// newSlice == [2 1 3]
Your approach would take linear time for each insertion. A better way would be to use a map[int]struct{}. Alternatively, you could also use a map[int]bool or something similar, but the empty struct{} has the advantage that it doesn't occupy any additional space. Therefore map[int]struct{} is a popular choice for a set of integers.
Example:
set := make(map[int]struct{})
set[1] = struct{}{}
set[2] = struct{}{}
set[1] = struct{}{}
// ...
for key := range set {
fmt.Println(key)
}
// each value will be printed only once, in no particular order
// you can use the ,ok idiom to check for existing keys
if _, ok := set[1]; ok {
fmt.Println("element found")
} else {
fmt.Println("element not found")
}
Most efficient is likely to be iterating over the slice and appending if you don't find it.
func AppendIfMissing(slice []int, i int) []int {
for _, ele := range slice {
if ele == i {
return slice
}
}
return append(slice, i)
}
It's simple and obvious and will be fast for small lists.
Further, it will always be faster than your current map-based solution. The map-based solution iterates over the whole slice no matter what; this solution returns immediately when it finds that the new value is already present. Both solutions compare elements as they iterate. (Each map assignment statement certainly does at least one map key comparison internally.) A map would only be useful if you could maintain it across many insertions. If you rebuild it on every insertion, then all advantage is lost.
If you truly need to efficiently handle large lists, consider maintaining the lists in sorted order. (I suspect the order doesn't matter to you because your first solution appended at the beginning of the list and your latest solution appends at the end.) If you always keep the lists sorted, then you can use the sort.Search function to do efficient binary insertions.
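The remark about maintaining a map across many insertions can be sketched as a small type that pairs the slice with a set. The uniqueInts type and its Add method are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// uniqueInts is a hypothetical helper: a slice that preserves insertion
// order plus a set that remembers which values it already holds.
type uniqueInts struct {
	items []int
	seen  map[int]struct{}
}

// Add appends v if it is new and reports whether it was added.
func (u *uniqueInts) Add(v int) bool {
	if _, ok := u.seen[v]; ok { // reading from a nil map is safe
		return false
	}
	if u.seen == nil {
		u.seen = make(map[int]struct{})
	}
	u.seen[v] = struct{}{}
	u.items = append(u.items, v)
	return true
}

func main() {
	var u uniqueInts
	for _, v := range []int{1, 2, 3, 2, 1, 4} {
		u.Add(v)
	}
	fmt.Println(u.items) // [1 2 3 4]
}
```

Each Add is an average-case constant-time map lookup, at the cost of storing every value twice.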
Another option:
package main
import "golang.org/x/tools/container/intsets"
func main() {
var (
a intsets.Sparse
b bool
)
b = a.Insert(9)
println(b) // true
b = a.Insert(9)
println(b) // false
}
https://pkg.go.dev/golang.org/x/tools/container/intsets
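Use this option if the number of values to add is not known in advance:
AppendIfMissing := func(sl []int, n ...int) []int {
// index the existing elements once
cache := make(map[int]struct{}, len(sl))
for _, elem := range sl {
cache[elem] = struct{}{}
}
for _, elem := range n {
if _, ok := cache[elem]; !ok {
cache[elem] = struct{}{} // remember it so repeats within n are skipped
sl = append(sl, elem)
}
}
return sl
}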
Removing duplicates from a slice of structs:
func distinctObjects(objs []ObjectType) (distinctedObjs [] ObjectType){
var output []ObjectType
for i:= range objs{
if output==nil || len(output)==0{
output=append(output,objs[i])
} else {
founded:=false
for j:= range output{
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {
founded=true
}
}
if !founded{
output=append(output,objs[i])
}
}
}
return output
}
where the struct is something like:
type ObjectType struct {
fieldname1 string
fieldname2 string
.........
}
Two objects count as duplicates when all of the fields compared here are equal:
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {
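When every field of the struct is comparable and all fields should take part in the comparison, the struct value itself can serve as a map key, which avoids the nested loop. The sketch below assumes a two-field ObjectType (a simplification of the struct above) and a made-up function name distinct.

```go
package main

import "fmt"

// ObjectType is reduced here to two comparable fields.
type ObjectType struct {
	fieldname1 string
	fieldname2 string
}

// distinct returns objs without duplicates, keeping first occurrences.
func distinct(objs []ObjectType) []ObjectType {
	seen := make(map[ObjectType]struct{}, len(objs))
	var out []ObjectType
	for _, o := range objs {
		if _, ok := seen[o]; ok {
			continue // an equal struct was already emitted
		}
		seen[o] = struct{}{}
		out = append(out, o)
	}
	return out
}

func main() {
	objs := []ObjectType{{"a", "x"}, {"b", "y"}, {"a", "x"}}
	fmt.Println(distinct(objs)) // [{a x} {b y}]
}
```

If only some fields should decide equality, or the struct contains slices or maps, a separate key struct built from the relevant fields would be needed instead.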
