How to write a vector - Go

I am using the Go FlatBuffers interface for the first time, and I find the instructions sparse.
I would like to write a vector of uint64s into a table. Ideally, I would like to store the numbers directly in a vector without knowing how many there are up front (I'm reading them from a sql.Rows iterator). I see the generated code for the table has these functions:
func DatasetGridAddDates(builder *flatbuffers.Builder, dates flatbuffers.UOffsetT) {
    builder.PrependUOffsetTSlot(2, flatbuffers.UOffsetT(dates), 0)
}

func DatasetGridStartDatesVector(builder *flatbuffers.Builder, numElems int) flatbuffers.UOffsetT {
    return builder.StartVector(8, numElems, 8)
}
Can I first write the vector using (??), then use DatasetGridAddDates to record the resulting vector in the containing "DatasetGrid" table?

(Caveat: I had not heard of FlatBuffers prior to reading your question.)
If you do know the length in advance, storing a vector is done as explained in the tutorial:
name := builder.CreateString("hello")

q55310927.DatasetGridStartDatesVector(builder, len(myDates))
for i := len(myDates) - 1; i >= 0; i-- {
    builder.PrependUint64(myDates[i])
}
dates := builder.EndVector(len(myDates))

q55310927.DatasetGridStart(builder)
q55310927.DatasetGridAddName(builder, name)
q55310927.DatasetGridAddDates(builder, dates)
grid := q55310927.DatasetGridEnd(builder)

builder.Finish(grid)
Now what if you don’t have len(myDates)? On a toy example I get exactly the same output if I replace StartDatesVector(builder, len(myDates)) with StartDatesVector(builder, 0). Looking at the source code, it seems like the numElems may be necessary for alignment and for growing the buffer. I imagine alignment might be moot when you’re dealing with uint64, and growing seems to happen automatically on PrependUint64, too.
So, try doing it without numElems:
q55310927.DatasetGridStartDatesVector(builder, 0)
var n int
for rows.Next() { // use ORDER BY to make them go in reverse order
    var date uint64
    if err := rows.Scan(&date); err != nil {
        // ...
    }
    builder.PrependUint64(date)
    n++
}
dates := builder.EndVector(n)
and see if it works on your data.
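If relying on numElems being zero feels too fragile, a more conservative sketch (still using the same generated q55310927 package, and assuming the builder and rows from above) is to buffer the dates in a plain slice first and then apply the documented pattern with a known length:
var myDates []uint64
for rows.Next() {
    var date uint64
    if err := rows.Scan(&date); err != nil {
        // ...
    }
    myDates = append(myDates, date)
}

// FlatBuffers vectors are built back to front, hence the reverse loop.
q55310927.DatasetGridStartDatesVector(builder, len(myDates))
for i := len(myDates) - 1; i >= 0; i-- {
    builder.PrependUint64(myDates[i])
}
dates := builder.EndVector(len(myDates))
This costs an extra copy of the values, but it no longer depends on the ORDER BY trick or on numElems being safe to omit.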

Related

In sync.Map is it necessary to use Load followed by LoadOrStore for complex values

In code where a global map with an expensive-to-generate value structure may be modified by multiple concurrent goroutines, which pattern is correct?
// equivalent to map[string]*activity where activity is a
// fairly heavyweight structure
var ipActivity sync.Map

// version 1: not safe with multiple goroutines, I think
func incrementIP(ip string) {
    val, ok := ipActivity.Load(ip)
    if !ok {
        val = buildComplexActivityObject()
        ipActivity.Store(ip, val)
    }
    updateTheActivityObject(val.(*activity), ip)
}

// version 2: inefficient, I think, because a complex object is built
// every time even though it's only needed the first time
func incrementIP(ip string) {
    tmp := buildComplexActivityObject()
    val, _ := ipActivity.LoadOrStore(ip, tmp)
    updateTheActivity(val.(*activity), ip)
}

// version 3: more complex but technically correct?
func incrementIP(ip string) {
    val, found := ipActivity.Load(ip)
    if !found {
        tmp := buildComplexActivityObject()
        // using LoadOrStore in case the mapping was already made
        // by another Store
        val, _ = ipActivity.LoadOrStore(ip, tmp)
    }
    updateTheActivity(val.(*activity), ip)
}
Is version three the correct pattern given Go's concurrency model?
Option 1 can obviously be entered by multiple goroutines with the same new ip concurrently; each would build its own object, and only the last Store in the if block would win. The chance of this grows the longer buildComplexActivityObject takes, since the window between Load and Store gets wider.
Option 2 works, but calls buildComplexActivityObject every time, which you state is not what you want.
Given that you want to call buildComplexActivityObject as infrequently as possible, the third option is the only one that makes sense.
The sync.Map, however, cannot protect the actual activity values referenced by the stored pointers; you also need synchronization there when updating an activity value.
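A minimal sketch of that last point, assuming a hypothetical activity struct with a counter (the real fields are not shown in the question): give each value its own mutex and hold it while mutating.
type activity struct {
    mu    sync.Mutex
    count int // stand-in for the real, heavyweight fields
}

func updateTheActivity(a *activity, ip string) {
    a.mu.Lock()
    defer a.mu.Unlock()
    a.count++ // mutate the shared value only while holding its lock
}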

Iterate through a go struct to get a csv string [duplicate]

This question already has answers here: Iterate through the fields of a struct in Go (8 answers). Closed 5 years ago.
I have a struct representing a dataset that I need to write to a CSV file as time-series data. This is what I have so far:
type DataFields struct {
    Field1 int
    Field2 string
    ...
    Fieldn int
}

func (d DataFields) String() string {
    return fmt.Sprintf("%v,%v,...,%v", d.Field1, d.Field2, ..., d.Fieldn)
}
Is there a way I can iterate through the members of the struct and construct a string object using it?
Performance is not really an issue here and I was wondering if there was a way I could generate the string without having to modify the String() function if the structure changed in the future.
EDITED to add my change below:
This is what I ended up with after looking at the answers below.
func (d DataFields) String() string {
    v := reflect.ValueOf(d)
    var csvString string
    for i := 0; i < v.NumField(); i++ {
        csvString = fmt.Sprintf("%v%v,", csvString, v.Field(i).Interface())
    }
    return csvString
}
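For illustration, assuming DataFields has just the three fields shown, the reflection-based String() above yields a comma-joined line with a trailing comma:
d := DataFields{Field1: 1, Field2: "two", Fieldn: 3}
fmt.Println(d) // fmt calls the String() method: prints 1,two,3,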
What you are looking for is called reflection. This answer explains how to use it to loop through a struct and get the values.
This is the example the author uses in the other answer:
package main

import (
    "fmt"
    "reflect"
)

func main() {
    x := struct{ Foo string; Bar int }{"foo", 2}
    v := reflect.ValueOf(x)
    values := make([]interface{}, v.NumField())
    for i := 0; i < v.NumField(); i++ {
        values[i] = v.Field(i).Interface()
    }
    fmt.Println(values)
}
You can see it working on the Go Playground.
One way would be to use the reflect package. It has a Value.Field(int) Value method that might be useful to you. You would essentially first call ValueOf(interface{}) Value with your DataFields, and then have a simple loop calling Field(int) Value on that Value.
Another approach is to use code generation, which would generate the serializer code for you.
The trade-offs are:
- Codegen is more complicated in that it usually relies on running an external program (though this can be made simpler by employing go run, as it is supposed to always be available). Every time you change your data type by adding or removing a field that has to be serialized, you need to run go generate to regenerate the serializer code. On the flip side, the resulting code is fast and robust, and changes to the data type are usually rare enough.
- Reflection is simpler in that it does not require thinking about regenerating the code. On the flip side, code which uses reflect is usually ugly and quite hard to understand, and of course it incurs a runtime performance penalty.
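As a sketch of how the codegen route is typically wired up (csvgen below is a hypothetical generator name, standing in for whatever tool you write or adopt), a go:generate directive next to the type keeps regeneration down to running go generate:
// csvgen is a hypothetical generator; substitute whatever tool you actually use.
//go:generate csvgen -type=DataFields -output=datafields_csv.go
type DataFields struct {
    Field1 int
    Field2 string
    Fieldn int
}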

Go polymorphism in function parameters

I found several questions with similar titles, but cannot find the answer to my question in them.
I have the following simple scenario:
types:
type intMappedSortable interface {
    getIntMapping() int
}

type Rectangle struct {
    length, width int
}

func (r Rectangle) getIntMapping() int {
    return r.Area();
}

func (r Rectangle) Area() int {
    return r.length * r.width;
}
main:
func main() {
    r := rand.New(rand.NewSource(time.Now().UnixNano()))

    var values []int
    values = make([]int, 0)
    for i := 0; i < 10; i++ {
        values = append(values, r.Intn(20))
    }

    var rects []Rectangle;
    rects = make([]intMappedSortable, len(values));
    for i, v := range values {
        r := Rectangle{v, v};
        rects[i] = r;
    }

    for i, v := range rects {
        fmt.Println(v.Area());
    }

    rectsRet := make(chan intMappedSortable, len(rects));
    sort(rects, rectsRet);
}
doWork:
func sort(values []intMappedSortable, out chan intMappedSortable) {...}
How do I manage to pass the Rectangles to the sorting function and then work with the sorted rectangles in main afterwards?
I tried:
var rects []*Rectangle;
rects = make([]*Rectangle, len(values));
As a habit from my C days, I don't want to copy the rectangles, just their addresses, so I can sort the original slice directly and avoid copying the whole data twice.
After this failed, I tried:
var rects []intMappedSortable;
rects = make([]*Rectangle, len(values));
I learned that Go handles "polymorphism" by holding a pointer to the original data which is not exposed, so I changed *Rectangle to Rectangle; both gave me the compiler error that Rectangle is not []intMappedSortable.
What obviously works is:
var rects []intMappedSortable;
rects = make([]intMappedSortable, len(values));
for i, v := range values {
    r := Rectangle{v, v};
    rects[i] = r;
}
But are these rectangles now copied, or is just the memory representation of the interface with their reference copied? Additionally, there is now no way to access length and width of the rectangles, as the slice is no longer explicitly of type Rectangle.
So, how would I implement this scenario?
I want to create a slice of ANY structure that implements the mapToInt(), sort the slice, and then keep working with the concrete type afterwards.
EDIT/FOLLOW-UP:
I know it's not good style, but I'm experimenting:
Can I somehow use a type assertion with a dynamic type, like:
func maptest(values []intMappedSortable, out interface{}) {
    oType := reflect.TypeOf(out);
    fmt.Println(oType); // --> chan main.intMappedSortable
    test := values[0].(oType) // I know this is not working AND wrong even in thought, because oType holds "chan intMappedSortable", but just for theory's sake
}
How could I do this, or is this not possible? I do not mean whether it is "meant to be done"; I know it is not. But is it possible?
But are these rectangles now copied, or is just the memory representation of the interface with their reference copied?
The latter, see "what is the meaning of interface{} in golang?"
An interface value is constructed of two words of data:
one word is used to point to a method table for the value’s underlying type,
and the other word is used to point to the actual data being held by that value.
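A small sketch of what that means for this question (the field values are arbitrary): assigning a Rectangle to an intMappedSortable copies the struct into the interface value, and a type assertion gets the concrete Rectangle back out.
var s intMappedSortable = Rectangle{length: 3, width: 4} // the struct is copied into the interface value
fmt.Println(s.getIntMapping()) // 12, dispatched through the method table

if r, ok := s.(Rectangle); ok { // a type assertion recovers the concrete copy
    fmt.Println(r.length, r.width) // the concrete fields are reachable again
}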
I want to create a slice of ANY structure that implements the mapToInt(), sort the slice, and then keep working with the concrete type afterwards.
That isn't possible, as there is no genericity in Go.
See "What would generics in Go be?"
That is why you have projects like "gen":
generates code for your types, at development time, using the command line.
gen is not an import; the generated source becomes part of your project and takes no external dependencies.

Can we write a generic array/slice deduplication in go?

Is there a way to write a generic array/slice deduplication in Go? For []int we can have something like this (from http://rosettacode.org/wiki/Remove_duplicate_elements#Go):
func uniq(list []int) []int {
    unique_set := make(map[int]bool, len(list))
    for _, x := range list {
        unique_set[x] = true
    }
    result := make([]int, len(unique_set))
    i := 0
    for x := range unique_set {
        result[i] = x
        i++
    }
    return result
}
But is there a way to extend it to support any slice, with a signature like:
func deduplicate(a []interface{}) []interface{}
I know that you can write a function with that signature, but then you can't actually use it on an []int: you have to create an []interface{}, copy everything from the []int into it, pass it to the function, get an []interface{} back, and then go through that and put everything into a new []int.
My question is, is there a better way to do this?
While VonC's answer probably comes closest to what you really want, the only real way to do it in native Go without gen is to define an interface:
type IDList interface {
    // Returns the id of the element at i
    ID(i int) int
    // Returns the element with the given id
    GetByID(id int) interface{}
    Len() int
    // Adds the element to the list
    Insert(interface{})
}

// Puts the deduplicated list in dst
func Deduplicate(dst, list IDList) {
    intList := make([]int, list.Len())
    for i := range intList {
        intList[i] = list.ID(i)
    }
    uniques := uniq(intList)
    for _, el := range uniques {
        dst.Insert(list.GetByID(el))
    }
}
Where uniq is the function from your OP.
This is just one possible example, and there are probably much better ones, but in general mapping each element to a unique "==able" ID and either constructing a new list or culling based on the deduplication of the IDs is probably the most intuitive way.
An alternate solution is to take in an []IDer where the IDer interface is just ID() int. However, that means that user code has to create the []IDer list and copy all the elements into that list, which is a bit ugly. It's cleaner for the user to wrap the list as an ID list rather than copy, but it's a similar amount of work either way.
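For concreteness, a sketch of that []IDer variant (the IDer name and the function below are illustrative, not an existing API):
type IDer interface {
    ID() int
}

// dedupeByID keeps the first element seen for each ID, in input order.
func dedupeByID(list []IDer) []IDer {
    seen := make(map[int]bool, len(list))
    out := make([]IDer, 0, len(list))
    for _, el := range list {
        if !seen[el.ID()] {
            seen[el.ID()] = true
            out = append(out, el)
        }
    }
    return out
}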
The only way I have seen that implemented in Go is with the clipperhouse/gen project,
gen is an attempt to bring some generics-like functionality to Go, with some inspiration from C#’s Linq and JavaScript’s underscore libraries
See this test:
// Distinct returns a new Thing1s slice whose elements are unique. See: http://clipperhouse.github.io/gen/#Distinct
func (rcv Thing1s) Distinct() (result Thing1s) {
    appended := make(map[Thing1]bool)
    for _, v := range rcv {
        if !appended[v] {
            result = append(result, v)
            appended[v] = true
        }
    }
    return result
}
But, as explained in clipperhouse.github.io/gen/:
gen generates code for your types, at development time, using the command line.
gen is not an import; the generated source becomes part of your project and takes no external dependencies.
You could do something close to this via an interface. Define an interface, say DeDupable, requiring a func, say UniqId() []byte, which you could then use to do the removal of duplicates. Your uniq func would then take a []DeDupable and work on that.
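A minimal sketch of that idea (names as suggested above; the body is illustrative): key a map on the string form of UniqId().
type DeDupable interface {
    UniqId() []byte
}

func uniq(list []DeDupable) []DeDupable {
    seen := make(map[string]bool, len(list))
    out := make([]DeDupable, 0, len(list))
    for _, v := range list {
        k := string(v.UniqId()) // []byte is not a valid map key, so use its string form
        if !seen[k] {
            seen[k] = true
            out = append(out, v)
        }
    }
    return out
}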

Is there an easy way to iterate over a map in order?

This is a variant of the venerable "why is my map printing out of order" question.
I have a (fairly large) number of maps of the form map[MyKey]MyValue, where MyKey and MyValue are (usually) structs. I've got "less" functions for all the key types.
I need to iterate over the maps in order. (Specifically, the order defined by the less function on that type.) Right now, my code looks like this:
type PairKeyValue struct {
    MyKey
    MyValue
}

type PairKeyValueSlice []PairKeyValue

func (ps PairKeyValueSlice) Len() int {
    return len(ps)
}

func (ps PairKeyValueSlice) Swap(i, j int) {
    ps[i], ps[j] = ps[j], ps[i]
}

func (ps PairKeyValueSlice) Less(i, j int) bool {
    return LessKey(ps[i].MyKey, ps[j].MyKey)
}

func NewPairKeyValueSlice(m map[MyKey]MyValue) (ps PairKeyValueSlice) {
    ps = make(PairKeyValueSlice, len(m))
    i := 0
    for k, v := range m {
        ps[i] = PairKeyValue{k, v}
        i++
    }
    sort.Sort(ps)
    return
}
And then, any time I want an in-order iteration, it looks like:
var m map[MyKey]MyValue
m = GetMapFromSomewhereUseful()
for _, kv := range NewPairKeyValueSlice(m) {
    key := kv.MyKey
    value := kv.MyValue
    DoUsefulWork(key, value)
}
And this appears to largely work. The problem is that it is terribly verbose, particularly since the problem at hand really has very little to do with implementing ordered maps and is really about the useful work in the loop.
Also, I have several different key and value types. So, every time I want to iterate over a map in order, I copy/paste all that code and find/replace MyKey with the new key type and MyValue with the new value type. Copy/paste on that scale is... "smelly". It has already become a hassle, since I've made a few errors that I had to go back and fix several times.
This technique also has the downside that it requires making a full copy of all the keys and values. That is undesirable, but I don't see a way around it. (I could reduce it to just the keys, but it doesn't change the primary nature of the problem.)
This question is attempting the same thing with strings. This question does it with strings and ints. This question implies that you need to use reflection and will have to have a switch statement that switches on every possible type, including all user-defined types.
But with the people who are puzzled that maps don't iterate deterministically, it seems that there has got to be a better solution to this problem. I'm from an OO background, so I'm probably missing something fundamental.
So, is there a reasonable way to iterate over a map in order?
Update: Editing the question to have more information about the source, in case there's a better solution than this.
I have a lot of things I need to group for output. Each grouping level is in a structure that looks like these:
type ObjTypeTree struct {
    Children   map[Type]*ObjKindTree
    TotalCount uint
}

type ObjKindTree struct {
    Children   map[Kind]*ObjAreaTree
    TotalCount uint
}

type ObjAreaTree struct {
    Children   map[Area]*ObjAreaTree
    TotalCount uint
    Objs       []*Obj
}
Then, I'd iterate over the children in the ObjTypeTree to print the Type groupings. For each of those, I iterate over the ObjKindTree to print the Kind groupings. The iterations are done with methods on the types, and each kind of type needs a little different way of printing its grouping level. Groups need to be printed in order, which causes the problem.
Don't use a map if key collating is required. Use a B-tree or any other/similar ordered container.
I second jnml's answer. But if you want something shorter than you have and are willing to give up compile time type safety, then my library might work for you. (It's built on top of reflect.) Here's a full working example:
package main

import (
    "fmt"
    "github.com/BurntSushi/ty/fun"
)

type OrderedKey struct {
    L1 rune
    L2 rune
}

func (k1 OrderedKey) Less(k2 OrderedKey) bool {
    return k1.L1 < k2.L1 || (k1.L1 == k2.L1 && k1.L2 < k2.L2)
}

func main() {
    m := map[OrderedKey]string{
        OrderedKey{'b', 'a'}: "second",
        OrderedKey{'x', 'y'}: "fourth",
        OrderedKey{'x', 'x'}: "third",
        OrderedKey{'a', 'b'}: "first",
        OrderedKey{'x', 'z'}: "fifth",
    }
    for k, v := range m {
        fmt.Printf("(%c, %c): %s\n", k.L1, k.L2, v)
    }
    fmt.Println("-----------------------------")

    keys := fun.QuickSort(OrderedKey.Less, fun.Keys(m)).([]OrderedKey)
    for _, k := range keys {
        v := m[k]
        fmt.Printf("(%c, %c): %s\n", k.L1, k.L2, v)
    }
}
Note that such a method will be slower, so if you need performance, this is not a good choice.
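For completeness, the "sort just the keys" idea the question itself mentions can be done with only the standard library; a sketch (Go 1.8+ for sort.Slice, reusing m, MyKey, LessKey, and DoUsefulWork from the question):
keys := make([]MyKey, 0, len(m))
for k := range m {
    keys = append(keys, k)
}
sort.Slice(keys, func(i, j int) bool { return LessKey(keys[i], keys[j]) })
for _, k := range keys {
    DoUsefulWork(k, m[k])
}
This still copies the keys, as the question notes, but it drops the Pair type and the three sort.Interface methods.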
