Problems understanding usage of `interface{}` in Go - go

I'm trying to port an algorithm from Python to Go. The central part of it is a tree built using dicts, which should stay this way since each node can have an arbitrary number of children. All leaves are at the same level, so down to the lowest level the dicts contain other dicts, while the lowest-level ones contain floats. Like this:
tree = {}
insert(tree, ['a', 'b'], 1.0)
print tree['a']['b']
So while trying to port the code to Go while learning the language at the same time, this is what I started with to test the basic idea:
func main() {
tree := make(map[string]interface{})
tree["a"] = make(map[string]float32)
tree["a"].(map[string]float32)["b"] = 1.0
fmt.Println(tree["a"].(map[string]float32)["b"])
}
This works as expected, so the next step was to turn this into a routine that would take a "tree", a path, and a value. I chose the recursive approach and came up with this:
func insert(tree map[string]interface{}, path []string, value float32) {
node := path[0]
l := len(path)
switch {
case l > 1:
if _, ok := tree[node]; !ok {
if l > 2 {
tree[node] = make(map[string]interface{})
} else {
tree[node] = make(map[string]float32)
}
}
insert(tree[node], path[1:], value) //recursion
case l == 1:
leaf := tree
leaf[node] = value
}
}
This is how I imagine the routine should be structured, but I can't get the line marked with "recursion" to work. There is either a compiler error, or a runtime error if I try to perform a type assertion on tree[node]. What would be the correct way to do this?

Go is perhaps not the ideal solution for generic data structures like this. Type assertions make it possible, but manipulating data in such a structure requires more work than you are used to from Python and other scripting languages.
About your specific issue: You are missing a type assertion in the insert() call. The value of tree[node] is of type interface{} at that point. The function expects type map[string]interface{}. A type assertion will solve that.
Here's a working example:
package main
import "fmt"
type Tree map[string]interface{}
func main() {
t := make(Tree)
insert(t, []string{"a", "b"}, 1.0)
v := t["a"].(Tree)["b"]
fmt.Printf("%T %v\n", v, v)
// This prints: float32 1
}
func insert(tree Tree, path []string, value float32) {
node := path[0]
n := len(path) // avoid shadowing the builtin len
switch {
case n == 1:
tree[node] = value
case n > 1:
if _, ok := tree[node]; !ok {
tree[node] = make(Tree)
}
insert(tree[node].(Tree), path[1:], value) //recursion
}
}
Note that I created a new type for the map. This makes the code a little easier to follow. I also use the same `map[string]interface{}` for both tree nodes and leaves. If you want to get a float out of the resulting tree, another type assertion is needed:
leaf := t["a"].(Tree)["b"] // leaf is of type `interface{}`.
val := leaf.(float32)

Well... the problem is that you're trying to code Go using Python idioms, and you're making a tree with... hashtables? Huh? Then you have to maintain that the keys are unique and do a bunch of other stuff, when if you just made the set of children a slice, you get that sort of thing for free.
I wouldn't make a Tree an explicit map[string]interface{}. A tree and a node on a tree are really the same thing, since it's a recursive datatype.
type Tree struct {
Children []*Tree
Value interface{}
}
func NewTree(v interface{}) *Tree {
return &Tree{
Children: []*Tree{},
Value: v,
}
}
so to add a child...
func (t *Tree) AddChild(child interface{}) {
switch c := child.(type) {
case *Tree:
t.Children = append(t.Children, c)
default:
t.Children = append(t.Children, NewTree(c))
}
}
and if you wanted to implement some recursive function...
func (t *Tree) String() string {
return fmt.Sprint(t.Value)
}
func (t *Tree) PrettyPrint(w io.Writer, prefix string) {
var inner func(int, *Tree)
inner = func(depth int, child *Tree) {
for i := 0; i < depth; i++ {
io.WriteString(w, prefix)
}
io.WriteString(w, child.String()+"\n") // you should really observe the return value here.
for _, grandchild := range child.Children {
inner(depth+1, grandchild)
}
}
inner(0, t)
}
something like that. Any node can be made the root of some tree, since a subtree is just a tree itself. See here for a working example: http://play.golang.org/p/rEx43vOnXN
There are some articles out there like "Python is not Java" (http://dirtsimple.org/2004/12/python-is-not-java.html), and to that effect, Go is not Python.
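If lookup by key is still wanted (the question reads `tree['a']['b']`), the same recursive-struct idea can keep its children in a map instead of a slice, and no type assertions are needed. This is only a sketch under that assumption, not the code from the answer above; `Node`, `Insert`, and `Get` are hypothetical names.

```go
package main

import "fmt"

// Node is a hypothetical struct-based replacement for the nested maps
// in the question: inner nodes use Children, leaves use Value.
type Node struct {
	Children map[string]*Node
	Value    float32
}

// Insert walks path from n, creating missing nodes on the way, and
// stores value at the final node.
func (n *Node) Insert(path []string, value float32) {
	cur := n
	for _, key := range path {
		if cur.Children == nil {
			cur.Children = make(map[string]*Node)
		}
		child, ok := cur.Children[key]
		if !ok {
			child = &Node{}
			cur.Children[key] = child
		}
		cur = child
	}
	cur.Value = value
}

// Get returns the value stored at path and whether the path exists.
func (n *Node) Get(path []string) (float32, bool) {
	cur := n
	for _, key := range path {
		child, ok := cur.Children[key] // reading a nil map is safe
		if !ok {
			return 0, false
		}
		cur = child
	}
	return cur.Value, true
}

func main() {
	root := &Node{}
	root.Insert([]string{"a", "b"}, 1.0)
	fmt.Println(root.Get([]string{"a", "b"})) // 1 true
	fmt.Println(root.Get([]string{"a", "c"})) // 0 false
}
```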

Related

Traversing tree and extracting information with reusable components

I have a tree of nested structs in a Go project. I would like to walk through the tree and perform different actions, such as picking out certain structs at different levels in the tree and appending them to a list, or modifying the structs in place.
I would like to do this using reusable components so that I can focus on implementing the functions that perform the tasks, without having to reimplement the walker for every such function. So far the only thing I can think of is this API:
type applyFunc func(*Node)
func walker(node *Node, f applyFunc) {
....
for _, child := range node.children() {
walker(child, f)
}
}
The function walker can clearly be used to modify the tree because it is passed pointers to the tree nodes. I like it because I can write applyFunc functions separately without having to bother with the actual recursive walker code. However, extracting nodes or deleting them is more difficult.
For extracting information from nodes, perhaps I can use a closure:
values := []int{}
f := func(node *Node) {
	values = append(values, node.val) // the closure captures values
}
walker(root, f)
// values now holds the information I am interested in
Would this be a good solution? Are there better ones?
You could also add the walk function to your tree type, add a pointer to the parent in each node, and add a deleteChild method that takes the index of the child as its argument. That would let you manipulate the tree easily.
Example (here I called walk apply):
type node struct {
children []*node
parent *node
value int
}
func (n *node) deleteChild(index int) {
n.children = append(n.children[:index], n.children[index+1:]...)
}
func (n *node) delete(index int) {
if n.parent != nil {
n.parent.deleteChild(index)
}
}
func (n *node) apply(index int, f func(int, *node)) {
f(index, n)
for childIndex, child := range n.children {
child.apply(childIndex, f)
}
}
func main() {
t := &node{}
t.children = []*node{
&node{
children: []*node{
&node{value: 2},
},
value: 1,
parent: t,
},
}
// extract all values in nodes
values := []int{}
t.apply(0, func(index int, n *node) {
values = append(values, n.value)
})
fmt.Println(values) // [0 1 2]
// delete a node
fmt.Println(t.children) // [0xc4.....]
t.apply(0, func(index int, n *node) {
n.delete(index)
})
fmt.Println(t.children) // []
}

golang: Insert to a sorted slice

What's the most efficient way of inserting an element to a sorted slice?
I tried a couple of things, but all ended up using at least two appends, which as I understand makes a new copy of the slice.
Here is how to insert into a sorted slice of strings:
Go Playground Link to full example: https://play.golang.org/p/4RkVgEpKsWq
func Insert(ss []string, s string) []string {
i := sort.SearchStrings(ss, s)
ss = append(ss, "")
copy(ss[i+1:], ss[i:])
ss[i] = s
return ss
}
If the slice has enough capacity then there's no need for a new copy.
The elements after the insert position can be shifted to the right.
Only when the slice doesn't have enough capacity,
a new slice and copying all values will be necessary.
Keep in mind that slices are not designed for fast insertion.
So there won't be a miracle solution here using slices.
You could create a custom data structure to make this more efficient,
but obviously there will be other trade-offs.
One point that can be optimized in the process is finding the insertion point quickly. If the slice is sorted, then you can use binary search to perform this in O(log n) time.
However, this might not matter much,
considering the expensive operation of copying the end of the slice,
or reallocating when necessary.
I like @likebike's answer but it only works for strings. Here is the generic version that will work for a slice of any ordered type (requires Go 1.18):
func Insert[T constraints.Ordered](ts []T, t T) []T {
	// Find the slot first, while the slice is still fully sorted.
	i, _ := slices.BinarySearch(ts, t)
	var zero T
	ts = append(ts, zero)  // extend the slice by one element
	copy(ts[i+1:], ts[i:]) // make room
	ts[i] = t
	return ts
}
Note that this uses the package golang.org/x/exp/slices. Since Go 1.21 the standard library has its own slices package, and cmp.Ordered takes the place of constraints.Ordered.
There are two parts to the problem: finding where to insert the value and inserting the value.
Use the sort package search functions to efficiently find the insertion index using binary search.
Use a single call to append to efficiently insert a value into a slice:
// insertAt inserts v into s at index i and returns the new slice.
func insertAt(data []int, i int, v int) []int {
if i == len(data) {
// Insert at end is the easy case.
return append(data, v)
}
// Make space for the inserted element by shifting
// values at the insertion index up one index. The call
// to append does not allocate memory when cap(data) is
// greater than len(data).
data = append(data[:i+1], data[i:]...)
// Insert the new element.
data[i] = v
// Return the updated slice.
return data
}
Here's the code for inserting a value into a sorted slice:
func insertSorted(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
return insertAt(data, i, v)
}
The code in this answer uses a slice of int. Adjust the type to match your actual data.
The call to sort.Search in this answer can be replaced with a call to the helper function sort.SearchInts. I show sort.Search in this answer because the function applies to a slice of any type.
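To illustrate that equivalence, this short sketch computes the insertion index both ways. `searchBoth` is a hypothetical helper written only for the comparison.

```go
package main

import (
	"fmt"
	"sort"
)

// searchBoth returns the insertion index for v in the sorted slice
// data, computed with sort.Search and with sort.SearchInts.
func searchBoth(data []int, v int) (int, int) {
	// Both calls return the smallest index i with data[i] >= v,
	// or len(data) if there is no such index.
	a := sort.Search(len(data), func(i int) bool { return data[i] >= v })
	b := sort.SearchInts(data, v)
	return a, b
}

func main() {
	data := []int{10, 20, 30, 40}
	fmt.Println(searchBoth(data, 25)) // 2 2
	fmt.Println(searchBoth(data, 50)) // 4 4
}
```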
If you do not want to add duplicate values, check the value at the search index before inserting:
func insertSortedNoDups(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
if i < len(data) && data[i] == v {
return data
}
return insertAt(data, i, v)
}
You could use a heap:
package main
import (
"container/heap"
"sort"
)
type slice struct { sort.IntSlice }
func (s slice) Pop() interface{} { return 0 }
func (s *slice) Push(x interface{}) {
(*s).IntSlice = append((*s).IntSlice, x.(int))
}
func main() {
s := &slice{
sort.IntSlice{11, 10, 14, 13},
}
heap.Init(s)
heap.Push(s, 12)
println(s.IntSlice[0] == 10)
}
Note that a heap is not strictly sorted, but the "minimum element" is guaranteed
to be the first element. Also I did not implement the Pop function in my
example, you would want to do that.
https://golang.org/pkg/container/heap
There are two approaches mentioned here to insert into the slice when the position i is known:
data = append(data, "")
copy(data[i+1:], data[i:])
data[i] = s
and
data = append(data[:i+1], data[i:]...)
data[i] = s
I just benchmarked both with go1.18beta2, and the first solution is approximately 10% faster.
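Generic version with an option to allow duplicates (Go 1.18); it depends only on the golang.org/x/exp packages.
Time complexity: O(log n) for the binary search, plus up to O(n) for shifting elements in slices.Insert.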
import "golang.org/x/exp/constraints"
import "golang.org/x/exp/slices"
func InsertionSort[T constraints.Ordered](array []T, value T, canDuplicate bool) []T {
pos, isFound := slices.BinarySearch(array, value)
if canDuplicate || !isFound {
array = slices.Insert(array, pos, value)
}
return array
}
full version : https://go.dev/play/p/P2_ou2Fqs37
play : https://play.golang.org/p/dUGmPurouxA
array1 := []int{1, 3, 4, 5}
//want to insert at index 1
insertAtIndex := 1
temp := append([]int{}, array1[insertAtIndex:]...)
array1 = append(array1[0:insertAtIndex], 2)
array1 = append(array1, temp...)
fmt.Println(array1)
You can try the below code. It basically uses the golang sort package
package main
import "sort"
import "fmt"
func main() {
data := []int{20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32}
var items = []int{23, 27}
for _, x := range items {
i := sort.Search(len(data), func(i int) bool { return data[i] >= x })
if i < len(data) && data[i] == x {
fmt.Println(i)
} else {
data = append(data, 0)
copy(data[i+1:], data[i:])
data[i] = x
}
fmt.Println(data)
}
}

Idiomatic way of implementing nested matrices in golang

I am trying to represent a hypergraph in memory. Are there any better data structures for this task beside nested matrices? A nested matrix is a matrix which can have elements of both the "native" type (let's say int for the sake of simplicity) and matrices.
This is the beginning of such a matrix. Are there any rough edges in the code? How can I make it look more idiomatic?
The code:
package main
import "fmt"
type Matricial interface {
Put(interface{}, ...int)
Get(...int) interface{}
}
type Matrix struct {
Matricial
values map[int]interface{}
}
func NewMatrix() *Matrix {
m := &Matrix{}
m.values = make(map[int]interface{})
return m
}
func (m *Matrix) Set(atom interface{}, pos ...int) {
firstdim := pos[0]
if val, ok := m.values[firstdim]; ok {
fmt.Println("map key exists", val)
switch converted := val.(type) {
case int:
m.values[firstdim] = converted
default:
fmt.Printf("ERR: unknown type: %T\n", val)
}
} else {
if len(pos[1:]) > 0 {
newm := NewMatrix()
m.values[firstdim] = newm
newm.Set(atom, pos[1:]...)
} else {
m.values[firstdim] = atom
}
}
}
func (m *Matrix) Get(pos ...int) interface{} {
if len(pos) == 1 {
return m.values[pos[0]]
} else {
switch accessor := m.values[pos[0]].(type) {
case Matricial:
return accessor.Get(pos[1:]...)
default:
return nil
}
}
return nil
}
func main() {
m := NewMatrix()
m.Set(42, 2, 3, 4)
m.Set(43, 0)
fmt.Println(m.Get(2, 3))
fmt.Println(m.Get(2, 3, 4))
fmt.Println(m.Get(0))
}
The data structure must allow connecting hyperedges with other hyperedges (i.e. handling hyperedges as though they were nodes).
A nested matrix (adopting your definition of the term) seems a reasonable representation for a hypergraph, without knowing anything more about your application. An example Go implementation is the power set example at Rosetta Code.
It is not idiomatic to embed an interface. For example, if you rename the Put method of Matricial to be Set, which is what I think you meant, then you can just delete the Matricial field of Matrix and your program produces the same output.
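A sketch of what that could look like: the interface method is renamed to Set, the embedded field is gone, and a compile-time assertion confirms that *Matrix still satisfies Matricial. Note that Set here is simplified and is not the question's version: it descends into an existing submatrix instead of printing a message, and it assumes at least one index is given.

```go
package main

import "fmt"

// Matricial is the interface from the question, with Put renamed to Set.
type Matricial interface {
	Set(atom interface{}, pos ...int)
	Get(pos ...int) interface{}
}

// Matrix has no embedded interface; it satisfies Matricial through
// its own methods.
type Matrix struct {
	values map[int]interface{}
}

// Compile-time check that *Matrix implements Matricial.
var _ Matricial = (*Matrix)(nil)

func NewMatrix() *Matrix {
	return &Matrix{values: make(map[int]interface{})}
}

// Set stores atom at pos, creating nested matrices as needed.
// It assumes len(pos) >= 1.
func (m *Matrix) Set(atom interface{}, pos ...int) {
	if len(pos) == 1 {
		m.values[pos[0]] = atom
		return
	}
	sub, ok := m.values[pos[0]].(*Matrix)
	if !ok { // missing, or an atom that gets replaced by a submatrix
		sub = NewMatrix()
		m.values[pos[0]] = sub
	}
	sub.Set(atom, pos[1:]...)
}

// Get returns the element at pos, or nil if the path does not exist.
func (m *Matrix) Get(pos ...int) interface{} {
	if len(pos) == 0 {
		return nil
	}
	v := m.values[pos[0]]
	if len(pos) == 1 {
		return v
	}
	if sub, ok := v.(Matricial); ok {
		return sub.Get(pos[1:]...)
	}
	return nil
}

func main() {
	m := NewMatrix()
	m.Set(42, 2, 3, 4)
	m.Set(43, 0)
	fmt.Println(m.Get(2, 3, 4)) // 42
	fmt.Println(m.Get(0))       // 43
	fmt.Println(m.Get(0, 1))    // <nil>
}
```

With the embedded interface left nil, as in the question, calling the promoted Put method on a Matrix would panic at run time, which is one more reason to drop the field.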

Overlapping in treap package from stathat?

According to this link stathat uses overlapping with their treap:
GoLLRB is great and there's no reason you should switch. We thought
the idea behind treaps was an elegant solution to our problem, so we
implemented it. We liked the interface that GoLLRB provided, so we
mimicked it in our implementation.
One thing we added to the treap package is to allow you to iterate
using an overlap function, so you can get all the keys in [3,9), for
example. We use this a lot, often with a struct as the key.
Patrick
I am playing with the following code and have no idea how to continue:
package main
import(
"reflect"
"fmt"
"github.com/stathat/treap"
)
func IntLess(p, q interface{}) bool {
return p.(int) < q.(int)
}
func BucketOverlap(a, b interface{}) bool {
return false
}
func main() {
tree := treap.NewOverlapTree(IntLess, BucketOverlap)
tree.Insert(5, "a")
tree.Insert(7, "b")
tree.Insert(2, "c")
tree.Insert(1, "d")
for v := range tree.IterateOverlap([]int{2,5}) {
fmt.Printf("val: %v\n", v)
}
}
let's say I want to get keys in range [2,5] => [c,a]
The first place I'd start would be the tests for the stathat treap code:
https://github.com/stathat/treap/blob/master/treap_test.go#L164
It seems that what you're doing is trying to pass a slice of keys when it is expecting a single one. You are also trying to do vector operations (i.e. range overlap) on a value that is scalar (i.e. int).
Maybe I am misunderstanding the point of the overlap, but my understanding is that the use for it is as an interval tree:
key1 := []int{1, 3}
key2 := []int{2, 4}
key3 := []int{5, 6}
These are intervals (low and high). key1 overlaps key2, and vice versa. Neither overlaps key3. In this case, the overlap would be useful (i.e. IterateOverlap([]int{2,3}) would give me key1 and key2, whereas IterateOverlap([]int{3,5}) would return all).
I'm not sure how you'd iterate over these entries. Maybe this:
for i := 2; i <= 5; i++ {
fmt.Printf("val: %v\n", tree.Get(i))
}
Again, I've not used this implementation, so forgive me if I'm barking up the wrong tree.
I have found a solution using GoLLRB:
package main
import (
"fmt"
"github.com/petar/GoLLRB/llrb"
)
type Item struct {
key int
value string
}
func lessInt(a, b interface{}) bool {
aa := a.(*Item)
bb := b.(*Item)
return aa.key < bb.key
}
func main() {
tree := llrb.New(lessInt)
tree.ReplaceOrInsert(&Item{5, "a"})
tree.ReplaceOrInsert(&Item{7, "b"})
tree.ReplaceOrInsert(&Item{2, "c"})
tree.ReplaceOrInsert(&Item{1, "d"})
//tree.DeleteMin()
c := tree.IterRangeInclusive(&Item{key: 2}, &Item{key: 5})
for item := <-c; item != nil; item = <-c {
i := item.(*Item)
fmt.Printf("%s\n", i.value)
}
}
Still I am wondering if this is also possible using stathat's treap.

How to check the uniqueness inside a for-loop?

Is there a way to check slices/maps for the presence of a value?
I would like to add a value to a slice only if it does not exist in the slice.
This works, but it seems verbose. Is there a better way to do this?
orgSlice := []int{1, 2, 3}
newSlice := []int{}
newInt := 2
newSlice = append(newSlice, newInt)
for _, v := range orgSlice {
if v != newInt {
newSlice = append(newSlice, v)
}
}
newSlice == [2 1 3]
Your approach would take linear time for each insertion. A better way would be to use a map[int]struct{}. Alternatively, you could also use a map[int]bool or something similar, but the empty struct{} has the advantage that it doesn't occupy any additional space. Therefore map[int]struct{} is a popular choice for a set of integers.
Example:
set := make(map[int]struct{})
set[1] = struct{}{}
set[2] = struct{}{}
set[1] = struct{}{}
// ...
for key := range set {
fmt.Println(key)
}
// each value will be printed only once, in no particular order
// you can use the ,ok idiom to check for existing keys
if _, ok := set[1]; ok {
fmt.Println("element found")
} else {
fmt.Println("element not found")
}
Most efficient is likely to be iterating over the slice and appending if you don't find it.
func AppendIfMissing(slice []int, i int) []int {
for _, ele := range slice {
if ele == i {
return slice
}
}
return append(slice, i)
}
It's simple and obvious and will be fast for small lists.
Further, it will always be faster than your current map-based solution. The map-based solution iterates over the whole slice no matter what; this solution returns immediately when it finds that the new value is already present. Both solutions compare elements as they iterate. (Each map assignment statement certainly does at least one map key comparison internally.) A map would only be useful if you could maintain it across many insertions. If you rebuild it on every insertion, then all advantage is lost.
If you truly need to efficiently handle large lists, consider maintaining the lists in sorted order. (I suspect the order doesn't matter to you because your first solution appended at the beginning of the list and your latest solution appends at the end.) If you always keep the lists sorted then you can use the sort.Search function to do efficient binary insertions.
Another option:
package main
import "golang.org/x/tools/container/intsets"
func main() {
var (
a intsets.Sparse
b bool
)
b = a.Insert(9)
println(b) // true
b = a.Insert(9)
println(b) // false
}
https://pkg.go.dev/golang.org/x/tools/container/intsets
Use this option if the number of values to add is not known in advance:
AppendIfMissing := func(sl []int, n ...int) []int {
cache := make(map[int]int)
for _, elem := range sl {
cache[elem] = elem
}
for _, elem := range n {
if _, ok := cache[elem]; !ok {
sl = append(sl, elem)
}
}
return sl
}
Removing duplicates from a slice of structs:
func distinctObjects(objs []ObjectType) (distinctedObjs [] ObjectType){
var output []ObjectType
for i:= range objs{
if output==nil || len(output)==0{
output=append(output,objs[i])
} else {
founded:=false
for j:= range output{
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {
founded=true
}
}
if !founded{
output=append(output,objs[i])
}
}
}
return output
}
where the struct here is something like :
type ObjectType struct {
fieldname1 string
fieldname2 string
.........
}
Objects are treated as distinct based on the fields compared here:
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {

Resources