Traversing tree and extracting information with reusable components - go

I have a tree of nested structs in a Go project. I would like to walk through the tree and perform different actions, such as picking out certain structs at different levels in the tree and appending them to a list, or modifying the structs in place.
I would like to do this using reusable components so that I can focus on implementing that perform the tasks, not having to reimplement the walker for every such function. So far the only thing I can think of is this API:
type applyFunc func(*Node)
func walker(node *Node, f applyFunc) {
....
for _, child := range node.children() {
walker(child, f)
}
}
The function walker can clearly be used to modify the tree because it is passed pointers to the tree nodes. I like it because I can write applyFunc functions separately without having to bother with the actual recursive walker code. However, extracting nodes or deleting them is more difficult.
For extracting information from nodes, perhaps I can use a closure:
values := &[]int{}
f := func(node *Node) {
values.append(node.val)
}
walker(root, f)
//values now hold the information I am interested in
Would this be a good solution? Are there better ones?

You could also add the walk function to your tree type, add a pointer to the parent in a node and add a deleteChild method to a node which takes the index of the child as argument which would allow you to manipulate easily.
Example (here i called walk apply):
type node struct {
children []*node
parent *node
value int
}
func (n *node) deleteChild(index int) {
n.children = append(n.children[:index], n.children[index+1:]...)
}
func (n *node) delete(index int) {
if n.parent != nil {
n.parent.deleteChild(index)
}
}
func (n *node) apply(index int, f func(int, *node)) {
f(index, n)
for childIndex, child := range n.children {
child.apply(childIndex, f)
}
}
func main() {
t := &node{}
t.children = []*node{
&node{
children: []*node{
&node{value: 2},
},
value: 1,
parent: t,
},
}
// extract all values in nodes
values := []int{}
t.apply(0, func(index int, n *node) {
values = append(values, n.value)
})
fmt.Println(values) // [0 1 2]
// delete a node
fmt.Println(t.children) // [0xc4.....]
t.apply(0, func(index int, n *node) {
n.delete(index)
})
fmt.Println(t.children) // []
}

Related

go method - pointer receiver and returning same pointer

A go method on struct receives pointer reference, made some modifications and returning same pointer. The struct has nested reference of same struct: when append method being called with values some reason it was loosing previous values.
package main
import (
"fmt"
)
type Node struct{
next *Node
val int
}
func newNode(val int) (*Node){
n := Node{
val: val,
}
return &n
}
func (n *Node) append(val int) (*Node){
for n.next != nil {
n = n.next
}
n.next = newNode(val)
return n
}
func (n *Node)printList(){
for n != nil {
fmt.Printf("%d,", n.val)
n = n.next
}
fmt.Println()
}
func main() {
n := newNode(3)
n.printList()
n = n.append(4)
n.printList()
n = n.append(5)
n.printList()
n = n.append(6)
n.printList()
}
output:
3,
3,4,
4,5,
5,6,
I was expecting 3,4,5,6, - Probably something I totally missing something fundamentals here. appreciate if you have some inputs.
https://play.golang.org/p/-zDH98UNFLa
I was getting expected results when I modify append method not return anything.
append() returns the pointer of the next node. Therefore printList() only print the nodes starting from the next node. If you'd like to print the all nodes in the list, you should add a variable to store the pointer referenced to the starting node of this list.
func main() {
n := newNode(3)
head := n
head.printList()
n = n.append(4)
head.printList()
n = n.append(5)
head.printList()
n = n.append(6)
head.printList() // 3,4,5,6
}
This function:
func (n *Node) append(val int) (*Node){
for n.next != nil {
n = n.next
}
n.next = newNode(val)
return n
}
does not return its original argument in general. It returns the node that is next-to-last in the (assumed to be non-empty) list. Hence:
n = n.append(4)
adds a node holding 4 to the original node holding 3, then returns the original node holding 3, but:
n = n.append(5)
adds a node holding 5 to the original list, but then returns a pointer to the node holding 4. That's why you see 4,5, at this point. Subsequent calls keep repeating the last two elements for the same reason.
You could modify your append function to save the original return value and return that:
func (n *Node) append(val int) *Node {
// find current tail
t := n
for t.next != nil {
t = t.next
}
t.next = newNode(val)
return n
}
but overall this is still not a great strategy: this append does not work when given a nil-valued n, for instance. Consider constructing a list type that does handle such cases. Alternatively, as in Hsaio's answer, you can have the caller hang on to the head node directly. If you do that you could have the append function return the tail pointer:
func (n *Node) append(val int) *Node {
n.next = newNode(val)
return n.next
}
and then use it like this:
head := newNode(3)
t := append(head, 4)
t = append(t, 5)
t = append(t, 6)
head.printList()
(There's already a List implementation in the standard Go packages, container/list, that does this stuff nicely for you. Instead of pointing directly to each element in your list, you create an overall list-container instance, which allows you to insert-at-front, insert-at-back, remove-from-anywhere, and so on. It's a little awkward in that it uses interface{} for the data, so it requires a type-assertion to get each node's value.)
The variable n in main function is overrided by n in append function.

golang: Insert to a sorted slice

What's the most efficient way of inserting an element to a sorted slice?
I tried a couple of things but all ended up using at least 2 appends which as I understand makes a new copy of the slice
Here is how to insert into a sorted slice of strings:
Go Playground Link to full example: https://play.golang.org/p/4RkVgEpKsWq
func Insert(ss []string, s string) []string {
i := sort.SearchStrings(ss, s)
ss = append(ss, "")
copy(ss[i+1:], ss[i:])
ss[i] = s
return ss
}
If the slice has enough capacity then there's no need for a new copy.
The elements after the insert position can be shifted to the right.
Only when the slice doesn't have enough capacity,
a new slice and copying all values will be necessary.
Keep in mind that slices are not designed for fast insertion.
So there won't be a miracle solution here using slices.
You could create a custom data structure to make this more efficient,
but obviously there will be other trade-offs.
One point that can be optimized in the process is finding the insertion point quickly. If the slice is sorted, then you can use binary search to perform this in O(log n) time.
However, this might not matter much,
considering the expensive operation of copying the end of the slice,
or reallocating when necessary.
I like #likebike's answer but it only works for strings. Here is the generic version that will work for a slice of any ordered type (requires Go 1.18):
func Insert[T constraints.Ordered](ts []T, t T) []T {
var dummy T
ts = append(ts, dummy) // extend the slice
i, _ := slices.BinarySearch(ts, t) // find slot
copy(ts[i+1:], ts[i:]) // make room
ts[i] = t
return ts
}
Note that this uses the package golang.org/x/exp/slices but this will almost certainly be included in the std Go library in Go 1.19.
Try it in the Go Playground
There are two parts to the problem: finding where to insert the value and inserting the value.
Use the sort package search functions to efficiently find the insertion index using binary search.
Use a single call to append to efficiently insert a value into a slice:
// insertAt inserts v into s at index i and returns the new slice.
func insertAt(data []int, i int, v int) []int {
if i == len(data) {
// Insert at end is the easy case.
return append(data, v)
}
// Make space for the inserted element by shifting
// values at the insertion index up one index. The call
// to append does not allocate memory when cap(data) is
// greater ​than len(data).
data = append(data[:i+1], data[i:]...)
// Insert the new element.
data[i] = v
// Return the updated slice.
return data
}
Here's the code for inserting a value a sorted slice:
func insertSorted(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
return insertAt(data, i, v)
}
The code in this answer uses a slice of int. Adjust the type to match your actual data.
The call to sort.Search in this answer can be replaced with a call to the helper function sort.SearchInts. I show sort.Search in this answer because the function applies to a slice of any type.
If you do not want to add duplicate values, check the value at the search index before inserting:
func insertSortedNoDups(data []int, v int) []int {
i := sort.Search(len(data), func(i int) bool { return data[i] >= v })
if i < len(data) && data[i] == v {
return data
}
return insertAt(data, i, v)
}
You could use a heap:
package main
import (
"container/heap"
"sort"
)
type slice struct { sort.IntSlice }
func (s slice) Pop() interface{} { return 0 }
func (s *slice) Push(x interface{}) {
(*s).IntSlice = append((*s).IntSlice, x.(int))
}
func main() {
s := &slice{
sort.IntSlice{11, 10, 14, 13},
}
heap.Init(s)
heap.Push(s, 12)
println(s.IntSlice[0] == 10)
}
Note that a heap is not strictly sorted, but the "minimum element" is guaranteed
to be the first element. Also I did not implement the Pop function in my
example, you would want to do that.
https://golang.org/pkg/container/heap
There are two approaches mentioned here to insert into the slice when the position i is known:
data = append(data, "")
copy(data[i+1:], data[i:])
data[i] = s
and
data = append(data[:i+1], data[i:]...)
data[i] = s
I just benchmarked both with go1.18beta2, and the first solution is approximately 10% faster.
no dependency, generic data type with duplicated options. (go 1.18)
time complexity : Log2(n) + 1
import "golang.org/x/exp/constraints"
import "golang.org/x/exp/slices"
func InsertionSort[T constraints.Ordered](array []T, value T, canDupicate bool) []T {
pos, isFound := slices.BinarySearch(array, value)
if canDupicate || !isFound {
array = slices.Insert(array, pos, value)
}
return array
}
full version : https://go.dev/play/p/P2_ou2Fqs37
play : https://play.golang.org/p/dUGmPurouxA
array1 := []int{1, 3, 4, 5}
//want to insert at index 1
insertAtIndex := 1
temp := append([]int{}, array1[insertAtIndex:]...)
array1 = append(array1[0:insertAtIndex], 2)
array1 = append(array1, temp...)
fmt.Println(array1)
You can try the below code. It basically uses the golang sort package
package main
import "sort"
import "fmt"
func main() {
data := []int{20, 21, 22, 24, 25, 26, 28, 29, 30, 31, 32}
var items = []int{23, 27}
for _, x := range items {
i := sort.Search(len(data), func(i int) bool { return data[i] >= x })
if i < len(data) && data[i] == x {
fmt.Println(i)
} else {
data = append(data, 0)
copy(data[i+1:], data[i:])
data[i] = x
}
fmt.Println(data)
}
}

GO: Slice of unique structs effective reusable implementation

I often need to get rid of duplicates based on arbitrary equals function.
I need implementation that:
is fast and memory effective (does not create map)
is reusable and easy to use, think of slice.Sort() (github.com/bradfitz/slice)
it's not required to keep order of the original slice or preserve original slice
would be nice to minimize copying
Can this be implemented in go? Why this function is not part of some library I am aware of?
I was looking e.g. godash (github.com/zillow/godash) implementation uses map and does not allow arbitrary less and equal.
Here is how it should approximately look like.
Test:
import (
"reflect"
"testing"
)
type bla struct {
ID string
}
type blas []bla
func (slice blas) Less(i, j int) bool {
return slice[i].ID < slice[j].ID
}
func (slice blas) EqualID(i, j int) bool {
return slice[i].ID == slice[j].ID
}
func Test_Unique(t *testing.T) {
input := []bla{bla{ID: "d"}, bla{ID: "a"}, bla{ID: "b"}, bla{ID: "a"}, bla{ID: "c"}, bla{ID: "c"}}
expected := []bla{bla{ID: "a"}, bla{ID: "b"}, bla{ID: "c"}, bla{ID: "d"}}
Unique(input, blas(input).Less, blas(input).EqualID)
if !reflect.DeepEqual(expected, input) {
t.Errorf("2: Expected: %v but was %v \n", expected, input)
}
}
What I think will need to be used to implement this:
Only slices as data structure to keep it simple and for easy sorting.
Some reflection - the hard part for me! Since I am new to go.
Options
You can sort slice and check for adjacent nodes creation = O(n logn),lookup = O(log n) , insertion = O(n), deletion = O(n)
You can use a Tree and the original slice together creation = O(n logn),lookup = O(log n) , insertion = O(log n), deletion = O(log n)
In the tree implementation you may put only the index in tree nodes and evaluation of nodes will be done using the Equal/Less functions defined for the interface.
Here is an example with tree, here is the play link
You have to add more functions to make it usable ,and the code is not cache friendly so you may improve the code for make it cache friendly
How to use
Make the type representing slice implement Setter interface
set := NewSet(slice),creates a slice
now set.T has only unique values indexes
implement more functions to Set for other set operations
Code
type Set struct {
T Tree
Slice Setter
}
func NewSet(slice Setter) *Set {
set := new(Set)
set.T = Tree{nil, 0, nil}
set.Slice = slice
for i:=0;i < slice.Len();i++ {
insert(&set.T, slice, i)
}
return set
}
type Setter interface {
Len() int
At(int) (interface{},error)
Less(int, int) bool
Equal(int, int) bool
}
// A Tree is a binary tree with integer values.
type Tree struct {
Left *Tree
Value int
Right *Tree
}
func insert(t *Tree, Setter Setter, index int) *Tree {
if t == nil {
return &Tree{nil, index, nil}
}
if Setter.Equal(t.Value, index) {
return t
}
if Setter.Less(t.Value, index) {
t.Left = insert(t.Left, Setter, index)
return t
}
t.Right = insert(t.Right, Setter, index)
return t
}
Bloom filter is frequently used for equality test. There is https://github.com/willf/bloom for example, which awarded some stars on github. This particular implementation uses murmur3 for hashing and bitset for filter, so can be more efficient than map.

Idiomatic way of implementing nested matrices in golang

I am trying to represent a hypergraph in memory. Are there any better data structures for this task beside nested matrices? A nested matrix is a matrix which can have elements of both the "native" type (let's say int for the sake of simplicity) and matrices.
This is the beginning of such a matrix. Are there any rough edges in the code, to make it look more idiomatic? How to make it look more idiomatic?
The code:
package main
import "fmt"
type Matricial interface {
Put(interface{}, ...int)
Get(...int) interface{}
}
type Matrix struct {
Matricial
values map[int]interface{}
}
func NewMatrix() *Matrix {
m := &Matrix{}
m.values = make(map[int]interface{})
return m
}
func (m *Matrix) Set(atom interface{}, pos ...int) {
firstdim := pos[0]
if val, ok := m.values[firstdim]; ok {
fmt.Println("map key exists", val)
switch converted := val.(type) {
case int:
m.values[firstdim] = converted
default:
fmt.Println("ERR: unknown type: %T", val)
}
} else {
if len(pos[1:]) > 0 {
newm := NewMatrix()
m.values[firstdim] = newm
newm.Set(atom, pos[1:]...)
} else {
m.values[firstdim] = atom
}
}
}
func (m *Matrix) Get(pos ...int) interface{} {
if len(pos) == 1 {
return m.values[pos[0]]
} else {
switch accessor := m.values[pos[0]].(type) {
case Matricial:
return accessor.Get(pos[1:]...)
default:
return nil
}
}
return nil
}
func main() {
m := NewMatrix()
m.Set(42, 2, 3, 4)
m.Set(43, 0)
fmt.Println(m.Get(2, 3))
fmt.Println(m.Get(2, 3, 4))
fmt.Println(m.Get(0))
}
The data structure must allow connecting hyperedges with other hyperedges (i.e. handling hyperedges as though they were nodes).
A nested matrix (adopting your definition of the term) seems a reasonable representation for hypergraph, not knowing anything more about your application anyway. An example Go implementation is the power set example at Rosetta code.
It is not idiomatic to embed an interface. For example, if you rename the Put method of Matricial to be Set, which is what I think you meant, then you can just delete the Matricial field of Matrix and your program produces the same output.

Problems understanding usage of `interface{}` in Go

I'm trying to port an algorithm from Python to Go. The central part of it is a tree built using dicts, which should stay this way since each node can have an arbitrary number of children. All leaves are at the same level, so up the the lowest level the dicts contain other dicts, while the lowest level ones contain floats. Like this:
tree = {}
insert(tree, ['a', 'b'], 1.0)
print tree['a']['b']
So while trying to port the code to Go while learning the language at the same time, this is what I started with to test the basic idea:
func main() {
tree := make(map[string]interface{})
tree["a"] = make(map[string]float32)
tree["a"].(map[string]float32)["b"] = 1.0
fmt.Println(tree["a"].(map[string]float32)["b"])
}
This works as expected, so the next step was to turn this into a routine that would take a "tree", a path, and a value. I chose the recursive approach and came up with this:
func insert(tree map[string]interface{}, path []string, value float32) {
node := path[0]
l := len(path)
switch {
case l > 1:
if _, ok := tree[node]; !ok {
if l > 2 {
tree[node] = make(map[string]interface{})
} else {
tree[node] = make(map[string]float32)
}
}
insert(tree[node], path[1:], value) //recursion
case l == 1:
leaf := tree
leaf[node] = value
}
}
This is how I imagine the routine should be structured, but I can't get the line marked with "recursion" to work. There is either a compiler error, or a runtime error if I try to perform a type assertion on tree[node]. What would be the correct way to do this?
Go is perhaps not the ideal solution for generic data structures like this. The type assertions make it possible, but manipulating data in it requires more work that you are used to from python and other scripting languages.
About your specific issue: You are missing a type assertion in the insert() call. The value of tree[node] is of type interface{} at that point. The function expects type map[string]interface{}. A type assertion will solve that.
Here's a working example:
package main
import "fmt"
type Tree map[string]interface{}
func main() {
t := make(Tree)
insert(t, []string{"a", "b"}, 1.0)
v := t["a"].(Tree)["b"]
fmt.Printf("%T %v\n", v, v)
// This prints: float32 1
}
func insert(tree Tree, path []string, value float32) {
node := path[0]
len := len(path)
switch {
case len == 1:
tree[node] = value
case len > 1:
if _, ok := tree[node]; !ok {
tree[node] = make(Tree)
}
insert(tree[node].(Tree), path[1:], value) //recursion
}
}
Note that I created a new type for the map. This makes the code a little easier to follow. I also use the same 'map[string]interface{}` for both tree nodes and leaves. If you want to get a float out of the resulting tree, another type assertion is needed:
leaf := t["a"].(Tree)["b"] // leaf is of type 'interface{}`.
val := leaf.(float32)
well... the problem is that you're trying to code Go using Python idioms, and you're making a tree with... hashtables? Huh? Then you have to maintain that the keys are unique and do a bunch of otherstuff, when if you just made the set of children a slice, you get that sort of thing for free.
I wouldn't make a Tree an explicit map[string]interface{}. A tree and a node on a tree are really the same thing, since it's a recursive datatype.
type Tree struct {
Children []*Tree
Value interface{}
}
func NewTree(v interface{}) *Tree {
return &Tree{
Children: []*Tree{},
Value: v,
}
}
so to add a child...
func (t *Tree) AddChild(child interface{}) {
switch c := child.(type) {
case *Tree:
t.Children = append(t.Children, c)
default:
t.Children = append(t.Children, NewTree(c))
}
}
and if you wanted to implement some recursive function...
func (t *Tree) String() string {
return fmt.Sprint(t.Value)
}
func (t *Tree) PrettyPrint(w io.Writer, prefix string) {
var inner func(int, *Tree)
inner = func(depth int, child *Tree) {
for i := 0; i < depth; i++ {
io.WriteString(w, prefix)
}
io.WriteString(w, child.String()+"\n") // you should really observe the return value here.
for _, grandchild := range child.Children {
inner(depth+1, grandchild)
}
}
inner(0, t)
}
something like that. Any node can be made the root of some tree, since a subtree is just a tree itself. See here for a working example: http://play.golang.org/p/rEx43vOnXN
There are some articles out there like "Python is not Java" (http://dirtsimple.org/2004/12/python-is-not-java.html), and to that effect, Go is not Python.

Resources