I am trying to decode an Inv struct, but decoding the same encoded value returns a different value.
// inv struct
type Inv struct {
AddrFrom string
Type int
data [][]byte
}
inv := Inv{
AddrFrom: nodeAddress,
Type: kind,
data: inventories,
}
data := GobEncode(inv)
var payload Inv
gob.NewDecoder(bytes.NewBuffer(data)).Decode(&payload)
Here payload and inv have different values: after decoding, the data field of payload has length zero.
https://pkg.go.dev/encoding/gob
A struct field of chan or func type is treated exactly like an unexported field and is ignored.
https://go.dev/ref/spec#Exported_identifiers
An identifier may be exported to permit access to it from another package. An identifier is exported if both:
the first character of the identifier's name is a Unicode upper case letter (Unicode class "Lu"); and
the identifier is declared in the package block or it is a field name or method name.
All other identifiers are not exported.
https://pkg.go.dev/encoding/gob#hdr-Types_and_Values
Gob can encode a value of any type implementing the GobEncoder or encoding.BinaryMarshaler interfaces by calling the corresponding method, in that order of preference.
Internally, the gob package relies on the reflect package, which is designed to respect the visibility rules. The gob package therefore does not handle unexported fields automatically; it requires you to write a dedicated implementation.
https://pkg.go.dev/encoding/gob#GobEncoder
GobEncoder is the interface describing data that provides its own representation for encoding values for transmission to a GobDecoder. A type that implements GobEncoder and GobDecoder has complete control over the representation of its data and may therefore contain things such as private fields, channels, and functions, which are not usually transmissible in gob streams.
Example
package main
import (
"bytes"
"encoding/gob"
"fmt"
"log"
)
// The Vector type has unexported fields, which the package cannot access.
// We therefore write a BinaryMarshal/BinaryUnmarshal method pair to allow us
// to send and receive the type with the gob package. These interfaces are
// defined in the "encoding" package.
// We could equivalently use the locally defined GobEncode/GobDecoder
// interfaces.
type Vector struct {
x, y, z int
}
func (v Vector) MarshalBinary() ([]byte, error) {
// A simple encoding: plain text.
var b bytes.Buffer
fmt.Fprintln(&b, v.x, v.y, v.z)
return b.Bytes(), nil
}
// UnmarshalBinary modifies the receiver so it must take a pointer receiver.
func (v *Vector) UnmarshalBinary(data []byte) error {
// A simple encoding: plain text.
b := bytes.NewBuffer(data)
_, err := fmt.Fscanln(b, &v.x, &v.y, &v.z)
return err
}
// This example transmits a value that implements the custom encoding and decoding methods.
func main() {
var network bytes.Buffer // Stand-in for the network.
// Create an encoder and send a value.
enc := gob.NewEncoder(&network)
err := enc.Encode(Vector{3, 4, 5})
if err != nil {
log.Fatal("encode:", err)
}
// Create a decoder and receive a value.
dec := gob.NewDecoder(&network)
var v Vector
err = dec.Decode(&v)
if err != nil {
log.Fatal("decode:", err)
}
fmt.Println(v)
}
Since the field is already a slice of byte slices, the only thing you are really hitting is the visibility rule. The dedicated marshalling implementation should be straightforward to write, although it is arguably unnecessary, because you could just as well export the field.
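For the Inv struct from the question, such an implementation might look like the sketch below. It is only one possible approach, not the only way to do it: the invWire mirror type and the roundTrip helper are hypothetical names introduced here for illustration, and the node address in main is made up.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Inv keeps its data field unexported, as in the question.
type Inv struct {
	AddrFrom string
	Type     int
	data     [][]byte
}

// invWire is a hypothetical helper type: a mirror of Inv with only
// exported fields, which gob can encode without custom methods.
type invWire struct {
	AddrFrom string
	Type     int
	Data     [][]byte
}

// GobEncode implements gob.GobEncoder by encoding the mirror struct.
func (i Inv) GobEncode() ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(invWire{i.AddrFrom, i.Type, i.data})
	return buf.Bytes(), err
}

// GobDecode implements gob.GobDecoder. It modifies the receiver,
// so it must take a pointer receiver.
func (i *Inv) GobDecode(b []byte) error {
	var w invWire
	if err := gob.NewDecoder(bytes.NewReader(b)).Decode(&w); err != nil {
		return err
	}
	i.AddrFrom, i.Type, i.data = w.AddrFrom, w.Type, w.Data
	return nil
}

// roundTrip encodes inv and decodes the result into a fresh value.
func roundTrip(inv Inv) (Inv, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(inv); err != nil {
		return Inv{}, err
	}
	var out Inv
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	in := Inv{AddrFrom: "localhost:3000", Type: 1, data: [][]byte{[]byte("a"), []byte("bc")}}
	out, err := roundTrip(in)
	if err != nil {
		log.Fatal(err)
	}
	// The unexported field survives the round trip.
	fmt.Println(len(out.data), string(out.data[1]))
}
```

If nothing outside the package depends on data being private, renaming the field to Data and dropping both methods is the shorter fix.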
Related
I need some help with unmarshaling. I have this example code:
package main
import (
"encoding/json"
"fmt"
)
type Obj struct {
Id string `json:"id"`
Data []byte `json:"data"`
}
func main() {
byt := []byte(`{"id":"someID","data":["str1","str2"]}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Println(obj)
}
What I am trying to do here is convert bytes to a struct in which one field is of type []byte. The error I get:
panic: json: cannot unmarshal string into Go struct field Obj.data of type uint8
That's probably because the parser sees that the "data" field is a slice and tries to represent "str1" as a single element of it (type uint8?).
How do I store the whole data value as one byte array? I want to unmarshal the value into a slice of strings later. I don't put a slice of strings into the struct because this type can change (array of strings, int, string, etc.), and I want this to be universal.
My first recommendation would be for you to just use []string instead of []byte if you know the input type is going to be an array of strings.
If data is going to be a JSON array with various types, then your best option is to use []interface{} instead - Go will happily unmarshal the JSON for you and you can perform checks at runtime to cast those into more specific typed variables on an as-needed basis.
If []byte really is what you want, use json.RawMessage, which is of type []byte, but also implements the methods for JSON parsing. I believe this may be what you want, as it will accept whatever ends up in data. Of course, you then have to manually parse Data to figure out just what actually IS in there.
One possible bonus is that this skips any heavy parsing because it just copies the bytes over. When you want to use this data for something, you use a []interface{}, then use a type switch to use individual values.
https://play.golang.org/p/og88qb_qtpSGJ
package main
import (
"encoding/json"
"fmt"
)
type Obj struct {
Id string `json:"id"`
Data json.RawMessage `json:"data"`
}
func main() {
byt := []byte(`{"id":"someID","data":["str1","str2", 1337, {"my": "obj", "id": 42}]}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Printf("%+v\n", obj)
fmt.Printf("Data: %s\n", obj.Data)
// use it
var d []interface{}
if err := json.Unmarshal(obj.Data, &d); err != nil {
panic(err)
}
fmt.Printf("%+v\n", d)
for _, v := range d {
// you need a type switch to determine the type and be able to use most of these
switch real := v.(type) {
case string:
fmt.Println("I'm a string!", real)
case float64:
fmt.Println("I'm a number!", real)
default:
fmt.Printf("Unaccounted for: %+v\n", v)
}
}
}
Your question is:
convert bytes array to struct with a field of type []byte
But you do not have a byte array, you have a string array, so your question is not the same as your example. Let me answer the question anyway; several solutions are possible, depending on how far you want to diverge from your original requirements.
One string can be converted to one byte slice, but two strings first need to be combined into one string. That is the first problem. The second problem is the square brackets in your JSON string.
The following unmarshals without error. Note that encoding/json treats a JSON string destined for a []byte field as base64, so Data receives the base64-decoded bytes of "str1str2" rather than the text itself:
byt := []byte(`{"id":"someID","data":"str1str2"}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Println(obj)
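To make the base64 behaviour of []byte fields visible, here is a small sketch. The decodeData helper is a hypothetical name used only for this illustration; the Obj type is the one from the question.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"
)

type Obj struct {
	Id   string `json:"id"`
	Data []byte `json:"data"`
}

// decodeData unmarshals raw and returns the bytes stored in the Data field.
func decodeData(raw string) ([]byte, error) {
	var obj Obj
	err := json.Unmarshal([]byte(raw), &obj)
	return obj.Data, err
}

func main() {
	// "str1str2" happens to be valid base64, so it decodes to 6 raw bytes
	// instead of the 8 characters of text.
	plain, err := decodeData(`{"id":"someID","data":"str1str2"}`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(plain), string(plain) == "str1str2") // 6 false

	// Base64-encoding the text first makes it survive the round trip.
	enc := base64.StdEncoding.EncodeToString([]byte("str1str2"))
	text, err := decodeData(`{"id":"someID","data":"` + enc + `"}`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(text)) // str1str2
}
```

A string that is not valid base64 would make json.Unmarshal return an error for this field, which is another reason to prefer json.RawMessage or a string field when the sender does not encode the payload.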
I'm using gob to serialize structs to disk. The struct in question contains an interface field, so the concrete type needs to be registered using gob.Register(...).
The wrinkle here is that the library doing the gob-ing should be ignorant of the concrete type in use. I wanted the serialization to be possible even when callers have defined their own implementations of the interface.
I can successfully encode the data by registering the type on the fly (see trivial example below), but upon trying to re-read that data, gob refuses to accept the un-registered type. It's frustrating, because it feels like all the data is there. Why isn't gob just unpacking that as a main.UpperCaseTransformation struct if it's labelled as such?
package main
import (
"encoding/gob"
"fmt"
"os"
"strings"
)
type Transformation interface {
Transform(s string) string
}
type TextTransformation struct {
BaseString string
Transformation Transformation
}
type UpperCaseTransformation struct{}
func (UpperCaseTransformation) Transform(s string) string {
return strings.ToUpper(s)
}
func panicOnError(err error) {
if err != nil {
panic(err)
}
}
// Execute this twice to see the problem (it will tidy up files)
func main() {
file := os.TempDir() + "/so-example"
if _, err := os.Stat(file); os.IsNotExist(err) {
tt := TextTransformation{"Hello, World!", UpperCaseTransformation{}}
// Note: didn't need to refer to concrete type explicitly
gob.Register(tt.Transformation)
f, err := os.Create(file)
panicOnError(err)
defer f.Close()
enc := gob.NewEncoder(f)
err = enc.Encode(tt)
panicOnError(err)
fmt.Println("Run complete, run again for error.")
} else {
f, err := os.Open(file)
panicOnError(err)
defer os.Remove(f.Name())
defer f.Close()
var newTT TextTransformation
dec := gob.NewDecoder(f)
// Errors with: `gob: name not registered for interface: "main.UpperCaseTransformation"`
err = dec.Decode(&newTT)
panicOnError(err)
}
}
My work-around would be to require implementers of the interface to register their type with gob. But I don't like how that reveals my serialization choices to the callers.
Is there any route forward that avoids this?
Philosophical argumentation
The encoding/gob package cannot (or rather should not) make that decision on its own. Since the gob package creates a serialized form independent of / detached from the app, there is no guarantee that values of interface types will exist in the decoder; and even if they do (matched by the concrete type name), there is no guarantee that they represent the same type (or the same implementation of a given type).
By calling gob.Register() (or gob.RegisterName()) you make that intent clear: you give the gob package the green light to use that type. This also ensures that the type does exist, else you would not be able to pass a value of it when registering.
Technical requirement
There's also a technical point of view that dictates this requirement (that you must register prior): you cannot obtain the reflect.Type type descriptor of a type given by its string name. Not just you, the encoding/gob package can't do it either.
So by requiring you to call gob.Register() prior, the gob package will receive a value of the type in question, and therefore it can (and it will) access and store its reflect.Type descriptor internally, and so when a value of this type is detected, it is capable of creating a new value of this type (e.g. using reflect.New()) in order to store the value being decoded into it.
The reason why you can't "lookup" types by name is that they may not end up in your binary (they may get "optimized out") unless you explicitly refer to them. For details see Call all functions with special prefix or suffix in Golang; and Splitting client/server code. When registering your custom types (by passing values of them), you are making an explicit reference to them and thus ensuring that they won't get excluded from the binaries.
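One way to live with this requirement without exposing gob to callers is to hide the registration behind the library's own function. The sketch below is only an illustration under that assumption: RegisterTransformation, SuffixTransformation and roundTrip are hypothetical names, not part of the question's code or of the gob package.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

type Transformation interface {
	Transform(s string) string
}

type TextTransformation struct {
	BaseString     string
	Transformation Transformation
}

// SuffixTransformation is a hypothetical caller-defined implementation.
type SuffixTransformation struct {
	Suffix string
}

func (t SuffixTransformation) Transform(s string) string { return s + t.Suffix }

// RegisterTransformation is a hypothetical library entry point. Callers
// announce their implementation here; that gob is used underneath stays
// an internal detail of the library.
func RegisterTransformation(t Transformation) {
	gob.Register(t)
}

// roundTrip encodes tt and decodes it again, as a later run of the
// program would do after registering the same types.
func roundTrip(tt TextTransformation) (TextTransformation, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(tt); err != nil {
		return TextTransformation{}, err
	}
	var out TextTransformation
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	// Registration must happen before decoding, in every run of the program.
	RegisterTransformation(SuffixTransformation{})

	out, err := roundTrip(TextTransformation{"Hello", SuffixTransformation{Suffix: ", World!"}})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out.Transformation.Transform(out.BaseString))
}
```

Callers still have to register their types, but they do so through the library's vocabulary, so the serialization format can change later without touching their code.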
I have a listener which receives protobuf messages. However it doesn't know which type of message comes in when. So I tried to unmarshal into an interface{} so I can later type cast:
var data interface{}
err := proto.Unmarshal(message, data)
if err != nil {
log.Fatal("unmarshaling error: ", err)
}
log.Printf("%v\n", data)
However this code doesn't compile:
cannot use data (type interface {}) as type proto.Message in argument to proto.Unmarshal:
interface {} does not implement proto.Message (missing ProtoMessage method)
How can I unmarshal and later type cast an "unknown" protobuf message in go?
First, two words about the OP's question, as presented by them:
proto.Unmarshal can't unmarshal into an interface{}. The function signature makes this clear: you must pass a proto.Message argument, which is an interface implemented by concrete protobuffer types.
When handling a raw protobuffer []byte payload that didn't come in an Any, ideally you have at least something (a string, a number, etc...) coming together with the byte slice, that you can use to map to the concrete protobuf message.
You can then switch on that and instantiate the appropriate protobuf concrete type, and only then pass that argument to Unmarshal:
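var message proto.Message
switch atLeastSomething {
case "foo":
	message = &mypb.Foo{}
case "bar":
	message = &mypb.Bar{}
default:
	// No known mapping: stop here instead of passing a nil message to Unmarshal.
	log.Fatalf("unknown message kind %q", atLeastSomething)
}
if err := proto.Unmarshal(data, message); err != nil {
	log.Fatal("unmarshaling error: ", err)
}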
Now, what if the byte payload is truly unknown?
As a foreword, consider that this should seldom happen in practice. The schema used to generate the protobuffer types in your language of choice represents a contract, and by accepting protobuffer payloads you are, for some definitions of it, fulfilling that contract.
Anyway, if for some reason you must deal with a completely unknown, mysterious, protobuffer payload in wire format, you can extract some information from it with the protowire package.
Be aware that the wire representation of a protobuf message is ambiguous. A big source of uncertainty is the "length-delimited" type (2) being used for strings, bytes, repeated fields and... sub-messages (reference).
You can retrieve the payload content, but you are bound to have weak semantics.
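To see why so little semantic information survives, it helps to look at what a field actually carries on the wire: a varint tag holding only the field number and a 3-bit wire type, followed by the value. The sketch below decodes such a tag using only the standard library. The decodeTag helper is a hypothetical name for illustration; real code would normally use protowire.ConsumeTag instead.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeTag reads the leading varint of a field and splits it into the
// field number (upper bits) and the wire type (lowest three bits).
// n is the number of bytes consumed, or 0 if b does not start with a
// valid varint.
func decodeTag(b []byte) (num uint64, wireType uint64, n int) {
	tag, size := binary.Uvarint(b)
	if size <= 0 {
		return 0, 0, 0
	}
	return tag >> 3, tag & 7, size
}

func main() {
	// Wire bytes for a message whose field 1 is the string "A":
	// tag 0x0a, length 1, payload 0x41.
	num, wireType, n := decodeTag([]byte{0x0a, 0x01, 0x41})
	fmt.Println(num, wireType, n) // 1 2 1
}
```

Wire type 2 only says "length-delimited"; nothing in the bytes tells a string from a bytes field or a nested message, which is the ambiguity described above.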
The code
With that said, this is what a parser for unknown proto messages may look like. The idea is to leverage protowire.ConsumeField to read through the original byte slice.
The data model could be like this:
type Field struct {
Tag Tag
Val Val
}
type Tag struct {
Num int32
Type protowire.Type
}
type Val struct {
Payload interface{}
Length int
}
And the parser:
func parseUnknown(b []byte) []Field {
fields := make([]Field, 0)
for len(b) > 0 {
n, t, fieldlen := protowire.ConsumeField(b)
if fieldlen < 1 {
return nil
}
field := Field{
Tag: Tag{Num: int32(n), Type: t },
}
_, _, taglen := protowire.ConsumeTag(b[:fieldlen])
if taglen < 1 {
return nil
}
var (
v interface{}
vlen int
)
switch t {
case protowire.VarintType:
v, vlen = protowire.ConsumeVarint(b[taglen:fieldlen])
case protowire.Fixed64Type:
v, vlen = protowire.ConsumeFixed64(b[taglen:fieldlen])
case protowire.BytesType:
v, vlen = protowire.ConsumeBytes(b[taglen:fieldlen])
sub := parseUnknown(v.([]byte))
if sub != nil {
v = sub
}
case protowire.StartGroupType:
v, vlen = protowire.ConsumeGroup(n, b[taglen:fieldlen])
sub := parseUnknown(v.([]byte))
if sub != nil {
v = sub
}
case protowire.Fixed32Type:
v, vlen = protowire.ConsumeFixed32(b[taglen:fieldlen])
}
if vlen < 1 {
return nil
}
field.Val = Val{Payload: v, Length: vlen - taglen}
// fmt.Printf("%#v\n", field)
fields = append(fields, field)
b = b[fieldlen:]
}
return fields
}
Sample input and output
Given a proto schema like:
message Foo {
string a = 1;
string b = 2;
Bar bar = 3;
}
message Bar {
string c = 1;
}
initialized in Go as:
&test.Foo{A: "A", B: "B", Bar: &test.Bar{C: "C"}}
And by adding a fmt.Printf("%#v\n", field) statement at the end of the loop in the above code, it will output the following:
main.Field{Tag:main.Tag{Num:1, Type:2}, Val:main.Val{Payload:[]uint8{0x41}, Length:1}}
main.Field{Tag:main.Tag{Num:2, Type:2}, Val:main.Val{Payload:[]uint8{0x42}, Length:1}}
main.Field{Tag:main.Tag{Num:1, Type:2}, Val:main.Val{Payload:[]uint8{0x43}, Length:1}}
main.Field{Tag:main.Tag{Num:3, Type:2}, Val:main.Val{Payload:[]main.Field{main.Field{Tag:main.Tag{Num:1, Type:2}, Val:main.Val{Payload:[]uint8{0x43}, Length:1}}}, Length:3}}
About sub-messages
As you can see from the above, the idea for dealing with a protowire.BytesType that may or may not be a message field is to attempt to parse it recursively. If that succeeds, we keep the resulting message and store it in the field value; if it fails, we store the bytes as-is, which may then be a proto string or bytes. BTW, if I'm reading correctly, this seems to be what Marc Gravell does in the Protogen code.
About repeated fields
The code above doesn't deal with repeated fields explicitly, but after the parsing is done, repeated fields will have the same value for Field.Tag.Num. From that, packing the fields into a slice/array should be trivial.
About maps
The code above also doesn't deal with proto maps. I suspect that maps are semantically equivalent to a repeated k/v pair, e.g.:
message Pair {
string key = 1; // or whatever key type
string val = 2; // or whatever val type
}
If my assumption is correct, then maps can be parsed with the given code as sub-messages.
About oneofs
I haven't yet tested this, but I expect that information about the union type is completely lost. The byte payload will contain only the value that was actually set.
But what about Any?
The Any proto type doesn't fit in the picture. Contrary to what it may look like, Any is not analogous to, say, map[string]interface{} for JSON objects. And the reason is simple: Any is a proto message with a very well defined structure, namely (in Go):
type Any struct {
// unexported fields
TypeUrl string // struct tags omitted
Value []byte // struct tags omitted
}
So it is more similar to the implementation of a Go interface{} in that it holds some actual data and that data's type information.
It can itself hold arbitrary proto payloads (with their type information!), but it cannot be used to decode unknown messages, because Any has exactly those two fields: a type URL and a byte payload.
To wrap up, this answer doesn't provide a full-blown production-grade solution, but it shows how to decode arbitrary payloads while preserving as much original semantics as possible. Hopefully it will point you in the right direction.
As you've seen, and as the commenters have pointed out, you can't use proto.Unmarshal with an interface{} because the function expects a value of type Message, which implements the MessageV1 interface.
Protobuf messages are typed and correspond to method invocations ("comes in") and the implementation cannot take generic types of protobuf but specific protobufs:
func (s *server) M(ctx context.Context, _ *pb.Foo) (*pb.Bar, error)
The solution is to wrap your generic types as Any within a specific type, perhaps called Envelope:
message Envelope {
google.protobuf.Any content = 1;
...
}
The content is then transmitted as a []byte (see Golang anypb.Any), and the implementation (anypb) includes helpers to pack and unpack these.
The 'trick' with Any is that messages include a [TypeURL] that uniquely identifies the message so that the receiver knows how to e.g. Unmarshal it.
I am trying to understand the code that is used at my company. I am new to Go, and I have already gone through the tutorial on the official website. However, I am having a hard time wrapping my head around empty interfaces, i.e. interface{}. From various sources online, I figured out that the empty interface can hold any type. But I am having a hard time figuring out the codebase, especially some of the functions. I will not be posting the entire thing here, just the minimal functions in which it has been used. Please bear with me!
Function (I am trying to understand):
func (this *RequestHandler) CreateAppHandler(rw http.ResponseWriter, r *http.Request) *foo.ResponseError {
var data *views.Data = &views.Data{Attributes: &domain.Application{}}
var request *views.Request = &views.Request{Data: data}
if err := json.NewDecoder(r.Body).Decode(request); err != nil {
logrus.Error(err)
return foo.NewResponsePropogateError(foo.STATUS_400, err)
}
requestApp := request.Data.Attributes.(*domain.Application)
requestApp.CreatedBy = user
To set some context: RequestHandler is a struct defined in the same package as this code. domain and views are separate packages. Application is a struct in the package domain. The following two structs are part of the package views:
type Data struct {
Id string `json:"id"`
Type string `json:"type"`
Attributes interface{} `json:"attributes"`
}
type Request struct {
Data *Data `json:"data"`
}
The following are part of the package json:
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{r: r}
}
func (dec *Decoder) Decode(v interface{}) error {
if dec.err != nil {
return dec.err
}
if err := dec.tokenPrepareForDecode(); err != nil {
return err
}
if !dec.tokenValueAllowed() {
return &SyntaxError{msg: "not at beginning of value"}
}
// Read whole value into buffer.
n, err := dec.readValue()
if err != nil {
return err
}
dec.d.init(dec.buf[dec.scanp : dec.scanp+n])
dec.scanp += n
// Don't save err from unmarshal into dec.err:
// the connection is still usable since we read a complete JSON
// object from it before the error happened.
err = dec.d.unmarshal(v)
// fixup token streaming state
dec.tokenValueEnd()
return err
}
type Decoder struct {
r io.Reader
buf []byte
d decodeState
scanp int // start of unread data in buf
scan scanner
err error
tokenState int
tokenStack []int
}
Now, I understood that, in the struct Data in package views, Application is being set as a type for the empty interface. After that, a pointer to Request in the same package is created which points to the variable data.
I have the following doubts:
1. What exactly does the this keyword mean in Go? What is the purpose of writing this *RequestHandler?
2. A struct in Go can be initialized while assigning it to a variable by specifying the values of all its members. However, here, for the struct Data, only the empty interface value is assigned and the values of the other two fields are not. Why is that allowed?
3. What is the advantage of assigning the Application struct to an empty interface? Does it mean I can use the struct members through the interface variable directly?
4. Can someone help me figure out the meaning of this statement: json.NewDecoder(r.Body).Decode(request)?
I know this is a lot, but I am having a hard time figuring out the meaning of interfaces in Go. Please help!
1. this is not a keyword in Go; any variable name can be used there. That variable is called the receiver. A function declared in that way is a method and must be called like thing.func(params), where "thing" is an expression of the type of the receiver. Within the function, the receiver is set to the value of thing.
2. A struct literal does not have to contain values for all the fields (or any of them). Any fields not explicitly set will have the zero value for their types.
3. As you said, an empty interface can take on a value of any type. To use a value of type interface{}, you would use a type assertion or a type switch to determine the type of the value, or you could use reflection to use the value without having to have code for the specific type.
4. What specifically about that statement do you not understand? json is the name of a package in which the function NewDecoder is declared. That function is called, and then the Decode method (which is implemented by the type of the return value of NewDecoder) is called on that return value.
You may want to take a look at Effective Go and/or The Go Programming Language Specification for more information.
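The pattern in CreateAppHandler can be reproduced in a few lines. The sketch below is an approximation under stated assumptions: Application stands in for domain.Application with made-up fields, the decodeApplication helper and the "alice" value are hypothetical, and a strings.Reader replaces r.Body. It relies on encoding/json decoding into the pointer already stored in an interface{} field instead of replacing it with a map[string]interface{}.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// Application stands in for domain.Application; its fields are hypothetical.
type Application struct {
	Name      string `json:"name"`
	CreatedBy string `json:"created_by"`
}

type Data struct {
	Id         string      `json:"id"`
	Type       string      `json:"type"`
	Attributes interface{} `json:"attributes"`
}

type Request struct {
	Data *Data `json:"data"`
}

// decodeApplication pre-fills Attributes with a *Application so that the
// decoder fills that struct rather than building a generic map.
func decodeApplication(body string) (*Application, error) {
	// Id and Type are left out of the literal and start as zero values.
	request := &Request{Data: &Data{Attributes: &Application{}}}
	if err := json.NewDecoder(strings.NewReader(body)).Decode(request); err != nil {
		return nil, err
	}
	// The type assertion recovers the concrete pointer from the interface.
	app, ok := request.Data.Attributes.(*Application)
	if !ok {
		return nil, fmt.Errorf("attributes has unexpected type %T", request.Data.Attributes)
	}
	return app, nil
}

func main() {
	app, err := decodeApplication(`{"data":{"id":"1","type":"app","attributes":{"name":"demo"}}}`)
	if err != nil {
		log.Fatal(err)
	}
	app.CreatedBy = "alice" // fields are reachable only after the assertion
	fmt.Printf("%+v\n", *app)
}
```

The two-value form of the assertion avoids a panic when the JSON contains "attributes": null, in which case the interface ends up nil.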
package main
import (
"fmt"
"encoding/json"
"reflect"
)
type GeneralConfig map[string]interface{}
var data string = `
{
"key":"value",
"important_key":
{"foo":"bar"}
}`
func main() {
jsonData := &GeneralConfig{}
json.Unmarshal([]byte(data), jsonData)
fmt.Println(reflect.TypeOf(jsonData)) //main.GeneralConfig
jsonTemp := (*jsonData)["important_key"]
fmt.Println(reflect.TypeOf(jsonTemp)) //map[string]interface {}
//newGeneralConfig := GeneralConfig(jsonTemp)
//cannot convert jsonTemp (type interface {}) to type GeneralConfig:
//need type assertion
newGeneralConfig := jsonTemp.(GeneralConfig)
//fmt.Println(reflect.TypeOf(newGeneralConfig))
//panic: interface conversion: interface {} is map[string]interface {},
//not main.GeneralConfig
}
Available at the playground
I understand that I can use a nested struct in lieu of GeneralConfig, but that would require me knowing the exact structure of the payload, ie it wouldn't work for different keys (I would be locked into "important_key").
Is there a golang workaround for when I don't know what the value of "important_key" is? I say golang, because if possible, one could require all "important_keys" to have a constant parent key, which could resolve this issue.
To summarize, given an arbitrary json object, there must be a way that I can traverse its keys, and if a value is a custom type, convert the value to that type. Right now it seems that if I use type conversion, it tells me that the type is interface{} and I need to use type assertion; however, if I use type assertion, it tells me that interface{} is map[string]interface{} not main.GeneralConfig.
I agree with the comments about trying to utilise the expected structure of the incoming JSON in order to write well-defined structs, but I'll attempt to answer the question anyway.
The thing to take away from comparing what is printed with the error messages is that the compiler knows less about the type than the runtime, because the runtime can look at the actual value. To bring the compiler up to speed we must (i) assert that (*jsonData)["important_key"] is a map[string]interface{} (the compiler only knows it to be an interface{}) and then (ii) convert that to the GeneralConfig type. See:
package main
import (
"fmt"
"encoding/json"
)
type GeneralConfig map[string]interface{}
func main() {
jsonStruct := new(GeneralConfig)
json.Unmarshal([]byte(`{"parent_key": {"foo": "bar"}}`), jsonStruct)
fmt.Printf("%#v\n", jsonStruct)
// => &main.GeneralConfig{"parent_key":map[string]interface {}{"foo":"bar"}}
nestedStruct := (*jsonStruct)["parent_key"]
fmt.Printf("%#v\n", nestedStruct)
// => map[string]interface {}{"foo":"bar"}
// Whilst this shows the runtime knows its actual type is
// map[string]interface{}, the compiler only knows it to be an interface{}.
// First we assert for the compiler that it is indeed a
// map[string]interface{} we are working with. You can imagine the issues
// that might arise if we had passed in `{"parent_key": 123}`.
mapConfig, ok := nestedStruct.(map[string]interface{})
if !ok {
// TODO: Error-handling.
}
// Now that the compiler can be sure mapConfig is a map[string]interface{}
// we can type-cast it to GeneralConfig:
config := GeneralConfig(mapConfig)
fmt.Printf("%#v\n", config)
// => main.GeneralConfig{"foo":"bar"}
}
You are looking for json.RawMessage.
You can delay unmarshalling based upon some other value and then force it to unmarshal to a specific type.
This is not a good idea, but might be closer to what you are looking for.
http://play.golang.org/p/PWwAUDySE0
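A minimal sketch of that delayed-unmarshalling idea follows, assuming the payload carries a discriminator next to the raw value. Envelope, Server, Limits, the "kind" key and the decode helper are all hypothetical names chosen for this illustration; they do not come from the linked playground.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Envelope keeps the payload as raw bytes until the kind is known.
type Envelope struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"`
}

type Server struct {
	Host string `json:"host"`
}

type Limits struct {
	MaxConn int `json:"max_conn"`
}

// decode unmarshals the envelope first, then unmarshals the payload
// into the concrete type selected by the discriminator.
func decode(raw []byte) (interface{}, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, err
	}
	switch env.Kind {
	case "server":
		var s Server
		err := json.Unmarshal(env.Payload, &s)
		return s, err
	case "limits":
		var l Limits
		err := json.Unmarshal(env.Payload, &l)
		return l, err
	}
	return nil, fmt.Errorf("unknown kind %q", env.Kind)
}

func main() {
	v, err := decode([]byte(`{"kind":"server","payload":{"host":"example.org"}}`))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%#v\n", v)
}
```

This only works when something in the document tells you which type to expect; without such a hint you are back to walking a map[string]interface{}.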
This is a standard "workaround", if I get what you're after. When handling unknown data you can implement this pattern (modified from your example) of switching on the type recursively to get to the concrete values in an unknown body of JSON data.
package main
import (
"encoding/json"
"fmt"
"reflect"
)
var data = `
{
"key":"value",
"important_key":
{"foo":"bar"}
}`
func main() {
var jsonData interface{}
json.Unmarshal([]byte(data), &jsonData)
fmt.Println(reflect.TypeOf(jsonData))
parseArbitraryJSON(jsonData.(map[string]interface{}))
}
func parseArbitraryJSON(data map[string]interface{}) {
for k, v := range data {
switch a := v.(type) {
case string:
fmt.Printf("%v:%v\n", k, a)
case map[string]interface{}:
fmt.Printf("%v:%v\n", k, a)
parseArbitraryJSON(a)
}
}
}
The resulting output is:
map[string]interface {}
key:value
important_key:map[foo:bar]
foo:bar
This example only accounts for the base data being a string type but you can switch on any type that you expect to receive, and like any switch you can group your cases, so you can treat all numbers similarly for example.
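As a sketch of what such an extended switch might look like, the example below covers every type that json.Unmarshal produces when decoding into an interface{} and groups the two numeric representations into one case. The kindOf and kindsOf helpers and the sample document are hypothetical and only serve this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// kindOf names the JSON kind of a value produced by decoding into an
// interface{}.
func kindOf(v interface{}) string {
	switch v.(type) {
	case string:
		return "string"
	case float64, json.Number:
		// Grouped case: numbers arrive as float64 by default, or as
		// json.Number when a Decoder is configured with UseNumber.
		return "number"
	case bool:
		return "bool"
	case nil:
		return "null"
	case []interface{}:
		return "array"
	case map[string]interface{}:
		return "object"
	}
	return "unknown"
}

// kindsOf unmarshals a JSON object and reports the kind of each
// top-level value.
func kindsOf(raw string) (map[string]string, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &obj); err != nil {
		return nil, err
	}
	kinds := make(map[string]string, len(obj))
	for k, v := range obj {
		kinds[k] = kindOf(v)
	}
	return kinds, nil
}

func main() {
	kinds, err := kindsOf(`{"key":"value","n":1337,"ok":true,"none":null,"list":[1,2],"important_key":{"foo":"bar"}}`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(kinds)
}
```

Because Go randomizes map iteration order, code that ranges over the decoded map directly (as parseArbitraryJSON does) may visit keys in a different order on each run.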