Best way to marshal map to struct fields in Go - go

I want to know the best way to create instances of a certain struct from a map[string]string.
My app has to process huge CSV files and create an instance of a struct for each row of the file.
I'm already using csv.Reader from Go's encoding/csv package to read the CSV file, and I create a map[string]string for each row in the file.
So given this file:
columnA, columnB, columnC
a, b, c
My own reader implementation will return this map (each row's values keyed by the header names):
myMap := map[string]string{
"columnA": "a",
"columnB": "b",
"columnC": "c",
}
(This is just an example; in real life the file contains a lot of columns and rows.)
At this point I need to create an instance of the struct that corresponds to the row contents, let's say:
type MyStruct struct {
AColumn string
BColumn string
CColumn string
}
My question is: what is the best way to create the instance of the struct from the given map? I have already implemented a version that just copies each value from the map to the struct, but the code ended up being very long and tedious:
s := &MyStruct{}
s.AColumn = m["columnA"]
s.BColumn = m["columnB"]
s.CColumn = m["columnC"]
...
I also considered using this library, https://github.com/mitchellh/mapstructure, but I don't know whether reflection is the best approach, considering that the file is huge and reflection would be used for each row.
Maybe there is no other option, but I'm asking in case someone knows a better approach.
Thanks in advance.

I would say that the idiomatic Go way is to just populate the struct's fields from your map. Go favors explicitness, and this approach is the most direct and the easiest to read. In other words, your approach is correct.
You could make it slightly nicer by initializing the struct directly:
s := &MyStruct{
AColumn: m["columnA"],
BColumn: m["columnB"],
CColumn: m["columnC"],
}
Now, if your structure has 100s of fields (which is an odd design choice), you may want to leverage some code generation. Otherwise, just go with the straightforward code - it's the best approach in the long term.

I made a library for some things I have needed from time to time. I wrote a MapToStruct function a few months ago, and I pushed it today so I could share the full library with you. The library is based on reflect. I'm still testing and implementing things, so you will find some odd comments and the like.
https://github.com/FedeMFernandez/goscript
I hope it is useful.

Related

Unexplained behavior with pointers and protobufs

I'm struggling to figure out a reason for this behavior, or maybe this is supposed to happen and I just wasn't aware.
For background, I'm using proto3 with Go 1.15. I know that packed is the default in proto3, and I'm relatively new to protobufs.
I defined the following message in a proto file:
message Response {
repeated uint32 points = 1 [packed=true];
}
This generates the following code using protoc-gen-go v1.25.0:
type Response struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
Points []uint32 `protobuf:"varint,3,rep,packed,name=points,json=points,proto3" json:"points,omitempty"`
}
When I use the new struct, it doesn't behave like I would normally expect a struct to behave. Here are some things I wrote, along with what was printed out.
newResponse := pb.Response{Points: []uint32{2,4,6,8}}
fmt.Println(newResponse)
// {{{} [] [] <nil>} 0 [] [2 4 6 8]} --> I expect this
refToNewResponse := &newResponse
fmt.Println(refToNewResponse)
// points:2 points:4 points:6 points:8 --> not what I expected
Now you might be thinking: it's just formatting, big deal.
But I expect a list... not numbers that each individually have a label. I've seen and used other protobufs... and when I see the response that they return, it doesn't look like this, it's one label to a list like:
points: [2 4 6 8]
I do need to use the reference version of this because I eventually want to expand and use a list of Responses which the generated code will spit out a slice of pointer Responses, but I can't understand why it's separating and labeling each element in the slice.
I'm hoping someone can point out something I'm doing or not doing that is causing this... thank you in advance.
This is indeed just formatting. Nothing has changed in the underlying data structure. You requested a repeated uint32 Points and it's literally printing them that way.
The marshaler in the protobuf implementation can really output whatever it likes; there is no reference version of the human-readable representation of a protobuf.
If you really must have a custom format for the .String() output, you can try a different proto library such as gogoprotobuf, or try various extensions. But ultimately, it's just human-readable output.
Note:
this has nothing to do with packed=true (which is indeed the default).
if you're confused about printing the pointer vs the basic type, it's because the String() method has a pointer receiver. See this question

protoc-gen-go struct xxx convert to map[string]interface{}

The struct in the .pb.go file generated from a .proto file has three additional fields and some other things, like this:
When converting this struct to JSON, a field that is empty does not appear in the output. I know this can be handled with jsonpb.Marshaler:
m := jsonpb.Marshaler{EmitDefaults: true}
Now I am converting the struct to map[string]interface{} in order to put it in InfluxDB. I have to convert it to map[string]interface{} because the NewPoint function requires it, like this:
I use the structs.Map(value) function in Go. The resulting map has three additional fields, and running the program causes errors like this:
{"error":"unable to parse 'txt,severity=1 CurrentValue=\"1002\",MetricAlias=\"CPU\",XXX_sizecache=0i,XXX_unrecognized= 1552551101': missing field value"}
When I remove these three fields, the program runs OK. These three fields are generated automatically, and I have a lot of structs.
What should I do? Thank you!
The protobuf generator adds some additional fields with names starting with XXX that are intended for optimizations. You can't change this behavior of protoc-gen-go.
The problem is in the way you convert the struct to map[string]interface{}. It's hard to tell exactly which package structs.Map comes from. It seems to come from here: https://github.com/fatih/structs/blob/master/structs.go#L89. This code uses reflect to iterate through all fields of the structure and put them into a map[string]interface{}. You just need to write your own slightly modified version of the FillMap routine that omits the XXX fields.
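A minimal sketch of such a routine, written from scratch with reflect rather than copied from fatih/structs. The Metric type and the toMap function are hypothetical; the XXX_ fields only imitate what the generator emits.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Metric is a hypothetical stand-in for a generated message.
type Metric struct {
	CurrentValue     string
	MetricAlias      string
	XXX_unrecognized []byte
	XXX_sizecache    int32
}

// toMap copies the exported fields of a struct into a map, skipping any
// field whose name starts with "XXX_". v must be a struct or a non-nil
// pointer to a struct; anything else makes reflect panic.
func toMap(v interface{}) map[string]interface{} {
	rv := reflect.Indirect(reflect.ValueOf(v))
	rt := rv.Type()
	out := make(map[string]interface{}, rt.NumField())
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if f.PkgPath != "" || strings.HasPrefix(f.Name, "XXX_") {
			continue // unexported field or generator bookkeeping
		}
		out[f.Name] = rv.Field(i).Interface()
	}
	return out
}

func main() {
	m := toMap(&Metric{CurrentValue: "1002", MetricAlias: "CPU"})
	fmt.Println(m) // map[CurrentValue:1002 MetricAlias:CPU]
}
```

Unlike structs.Map, this sketch does not recurse into nested structs and ignores struct tags, which may or may not matter for your messages.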

Why are map values not addressable?

While playing with Go code, I found out that map values are not addressable. For example,
package main
import "fmt"
func main(){
var mymap map[int]string = make(map[int]string)
mymap[1] = "One"
var myptr *string = &mymap[1]
fmt.Println(*myptr)
}
Generates error
mapaddressable.go:7: cannot take the address of mymap[1]
Whereas, the code,
package main
import "fmt"
func main(){
var mymap map[int]string = make(map[int]string)
mymap[1] = "One"
mystring := mymap[1]
var myptr *string = &mystring
fmt.Println(*myptr)
}
works perfectly fine.
Why is this so? Why have the Go developers chosen to make certain values not addressable? Is this a drawback or a feature of the language?
Edit:
Being from a C++ background, I am not used to this not addressable trend that seems to be prevalent in Go. For example, the following code works just fine:
#include<iostream>
#include<map>
#include<string>
using namespace std;
int main(){
map<int,string> mymap;
mymap[1] = "one";
string *myptr = &mymap[1];
cout<<*myptr;
}
It would be nice if somebody could point out why the same addressability cannot be achieved (or intentionally wasn't achieved) in Go.
Well, I do not know the internal Go implementation of maps, but most likely it is a kind of hash table. So if you take and save the address of one of its entries and afterwards put a bunch of other entries into it, your saved address may become invalid. This is due to the internal reorganization of a hash table when the load factor exceeds a certain threshold and the table needs to grow.
Therefore I guess it is not allowed to take the address of one of its entries in order to avoid such errors.
Being from a C++ background.
Why are [Go] map values not addressable?
If all other languages were like C++ there would be no point in having other languages.
C++ is a complex, hard-to-read language.
Remember the Vasa! - Bjarne Stroustrup
Go, by design, is a simple, readable language.
dotGo 2015 - Rob Pike - Simplicity is Complicated
A Go map is a hash map. A deterministic hash function is applied to a map key. The hash value is used to determine the primary map bucket for the entry (key-value pair). A bucket stores one or more map entries. A primary bucket may overflow to secondary buckets. Buckets are implemented as an array. As the number of map entries increases by insertion, the hash function adapts to provide more buckets. The map entries are copied incrementally to a new, larger bucket array. If the number of map entries decreases by deletion, space may be reclaimed.
In summary, a Go map is a dynamic, self-organizing data structure. The memory address of an entry (key-value pair) is not fixed. Therefore, map values are not addressable.
GopherCon 2016 Keith Randall - Inside the Map Implementation
In Go, map value addressability is not necessary.
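A small sketch of what this means in practice, using the map from the question. The function writeThroughCopy is made up for the illustration: a pointer obtained from a copy of the value never reaches the map entry, so changes have to be assigned through the map.

```go
package main

import "fmt"

// writeThroughCopy (hypothetical helper) changes a copy of m[k] through a
// pointer and returns what the map itself holds afterwards.
func writeThroughCopy(m map[int]string, k int, v string) string {
	tmp := m[k] // copy of the value; tmp is an ordinary addressable variable
	p := &tmp
	*p = v // modifies tmp only
	return m[k]
}

func main() {
	mymap := map[int]string{1: "One"}

	fmt.Println(writeThroughCopy(mymap, 1, "Uno")) // One

	// To change the entry, assign through the map.
	mymap[1] = "Uno"
	fmt.Println(mymap[1]) // Uno
}
```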

Golang assignment of []map[string]struct error

As you can probably tell from the code below, I am working on a project that creates CSV reports from data in MongoDB. After getting the data I need, I have to restructure it into something more sensible than how it exists in the db, which is fairly horrendous (not my doing) and nearly impossible to print the way I need it. The structure that makes the most sense to me is a slice (one element per document) of maps from the name of the data to a struct holding the data for that name. Then I would simply have to loop through the document and put the values into the structs where they belong.
My implementation of this is
type mongo_essential_data_t struct {
caution string
citation string
caution_note string
}
mongo_rows_struct := make([]map[string]mongo_essential_data_t, len(mongodata_rows))
//setting the values goes like this
mongo_rows_struct[i][data_name].caution_note = fmt.Sprint(k)
//"i" being the document, "k" being the data I want to store
This doesn't work, however. When I do "go run", it returns ./answerstest.go:140: cannot assign to mongo_rows_struct[i][data_name].caution_note. I am new to Go and not sure why I am not allowed to do this. I'm sure this is an invalid way to reference that particular data location, if it is even possible to reference it in Go. What is another way to accomplish this assignment? If it is too much work to do this the way I want, I am willing to use a different type of data structure and am open to suggestions.
This is a known limitation of Go, tracked as issue 3117. You can use a temporary variable to get around it:
var tmp = mongo_rows_struct[i][data_name]
tmp.caution_note = fmt.Sprint(k)
mongo_rows_struct[i][data_name] = tmp
As I understand it, when you write:
mongo_rows_struct[i][data_name]
the compiler generates code that returns a copy of the mongo_essential_data_t struct (since a struct in Go is a value type, not a reference type), and
mongo_rows_struct[i][data_name].caution_note = fmt.Sprint(k)
would write the new value to that copy, after which the copy would be discarded. Obviously, that's not what you expect, so the Go compiler reports an error to prevent this misunderstanding.
In order to solve this problem you can:
1. Change the definition of your data type to
[]map[string]*mongo_essential_data_t
2. Explicitly create a copy of your struct, make changes to that copy, and write it back to the map
data := mongo_rows_struct[i][data_name]
data.caution_note = fmt.Sprint(k)
mongo_rows_struct[i][data_name] = data
Of course, the first solution is preferable because you avoid unnecessary copying of data.
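A sketch of the first option, with one detail that is easy to miss: make on the slice only creates nil maps, and writing to a nil map panics, so each inner map needs its own make. The type and function names below (essentialData, newRows, entry) and the key "some_name" are made up for the example.

```go
package main

import "fmt"

// essentialData stands in for mongo_essential_data_t from the question.
type essentialData struct {
	caution     string
	citation    string
	cautionNote string
}

// newRows allocates n rows. Every inner map is created explicitly because
// the zero value of a map is nil and cannot be written to.
func newRows(n int) []map[string]*essentialData {
	rows := make([]map[string]*essentialData, n)
	for i := range rows {
		rows[i] = make(map[string]*essentialData)
	}
	return rows
}

// entry returns the struct stored under name, creating it on first use so
// callers never dereference a nil pointer.
func entry(row map[string]*essentialData, name string) *essentialData {
	e, ok := row[name]
	if !ok {
		e = &essentialData{}
		row[name] = e
	}
	return e
}

func main() {
	rows := newRows(2)

	// The map holds pointers, so fields can be set in place.
	entry(rows[0], "some_name").cautionNote = fmt.Sprint(42)
	entry(rows[0], "some_name").citation = "ref"

	fmt.Printf("%+v\n", *rows[0]["some_name"]) // {caution: citation:ref cautionNote:42}
}
```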

Accessing struct fields inside a map value (without copying)

Assuming the following
type User struct {
name string
}
users := make(map[int]User)
users[5] = User{"Steve"}
Why isn't it possible to access the struct instance now stored in the map?
users[5].name = "Mark"
Can anyone shed some light into how to access the map-stored struct, or the logic behind why it's not possible?
Notes
I know that you can achieve this by making a copy of the struct, changing the copy, and copying back into the map -- but that's a costly copy operation.
I also know this can be done by storing struct pointers in my map, but I don't want to do that either.
The fundamental problem is that you can't take the address of an item within a map. You might think the compiler would re-arrange users[5].name = "Mark" into this
(&users[5]).name = "Mark"
But that doesn't compile, giving this error
cannot take the address of users[5]
This gives maps the freedom to re-order things at will to use memory efficiently.
The only way to change something explicitly in a map is to assign a value to it, i.e.
t := users[5]
t.name = "Mark"
users[5] = t
So I think you either have to live with the copy above or live with storing pointers in your map. Storing pointers has the disadvantage of using more memory and more memory allocations, which may outweigh the cost of the copying above; only you and your application can tell.
A third alternative is to use a slice: your original syntax works perfectly if you change users := make(map[int]User) to users := make([]User, 10)
Maps are typically sparsely filled hash tables that are reallocated when they exceed a threshold. Reallocation would create issues when someone is holding pointers to the values.
If you are keen on not creating a copy of the object, you can store a pointer to the object as the map value.
When we index the map, the value is "returned by value", if I may borrow the terminology used for function parameters, so editing the returned struct has no impact on the contents of the map.
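The slice alternative and the copy-and-store-back pattern mentioned in the answers can be put side by side in a short sketch. The rename helper is invented for the example.

```go
package main

import "fmt"

type User struct {
	name string
}

// rename (hypothetical helper) updates a map entry with the copy, modify,
// store back pattern. If id is missing, a new entry is created.
func rename(users map[int]User, id int, name string) {
	u := users[id]
	u.name = name
	users[id] = u
}

func main() {
	// Slice elements are addressable, so the field can be assigned directly.
	list := make([]User, 10)
	list[5] = User{"Steve"}
	list[5].name = "Mark"

	// Map elements are not, so the whole value is replaced instead.
	m := map[int]User{5: {"Steve"}}
	rename(m, 5, "Mark")

	fmt.Println(list[5].name, m[5].name) // Mark Mark
}
```

A slice only fits when the keys are small, dense integers; with sparse IDs the map with copy-back or with pointer values is the more natural choice.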

Resources