Correctly log protobuf messages as unescaped JSON with zap logger - go

I have a Go project where I'm using Zap structured logging to log the contents of structs. This is how I initialise the logger:
zapLog, err := zap.NewProductionConfig().Build()
if err != nil {
panic(err)
}
Initially I started with my own structs with json tags and it all worked perfectly:
zapLog.Info("Event persisted", zap.Any("event", &event))
Result:
{"level":"info","ts":1626448680.69099,"caller":"persisters/log.go:56",
"msg":"Event persisted","event":{"sourceType":4, "sourceId":"some-source-id",
"type":"updated", "value":"{...}", "context":{"foo":"bar"}}}
I now switched to protobuf and I'm struggling to achieve the same result. Initially I just got the "reflected map" version, when using zap.Any():
zapLog.Info("Event persisted", zap.Any("event", &event))
{"level":"info","ts":1626448680.69099,"caller":"persisters/log.go:56",
"msg":"Event persisted","event":"sourceType:TYPE_X sourceId:\"some-source-id\",
type:\"updated\" value:{...}, context:<key: foo, value:bar>}
I tried marshalling the object with the jsonpb marshaller, which generated the correct output on its own. However, when I pass that output to zap.String(), the string is escaped, so I get an extra '\' in front of each quotation mark. Since the logs are processed at a later point, this causes problems there, so I want to avoid it:
m := jsonpb.Marshaler{}
var buf bytes.Buffer
if err := m.Marshal(&buf, msg); err != nil {
// handle error
}
zapLog.Info("Event persisted", zap.ByteString("event", buf.Bytes()))
Result:
{"level":"info","ts":1626448680.69099,"caller":"persisters/log.go:56",
"msg":"Event persisted","event":"{\"sourceType\":\"TYPE_X\", \"sourceId\":\"some-source-id\",
\"type\":\"updated\", \"value\":\"{...}\", \"context\":{\"foo\":\"bar\"}}"}
I then tried using zap.Reflect() instead of zap.Any() which was the closest thing I could get to what I need, except that enums are rendered as their numerical values (the initial solution did not have enums, so that didn't work in the pre-protobuf solution either):
zapLog.Info("Event persisted", zap.Reflect("event", &event))
Result:
{"level":"info","ts":1626448680.69099,"caller":"persisters/log.go:56",
"msg":"Event persisted","event":{"sourceType":4, "sourceId":"some-source-id",
"type":"updated", "value":"{...}", "context":{"foo":"bar"}}}
The only option I see so far is to write my own MarshalLogObject() function:
type ZapEvent struct {
event *Event
}
func (z *ZapEvent) MarshalLogObject(encoder zapcore.ObjectEncoder) error {
encoder.AddString("sourceType", z.event.SourceType.String())
// implement encoder for each attribute
return nil
}
func processEvent(e Event) {
...
zapLog.Info("Event persisted", zap.Object("event", &ZapEvent{event: &e}))
}
But since it's a complex struct, I would rather use a less error prone and maintenance heavy solution. Ideally, I would tell zap to use the jsonpb marshaller somehow, but I don't know if that's possible.

Use zap.Any with a json.RawMessage. You can convert the byte output of jsonpb.Marshaler directly:
foo := &pb.FooMsg{
Foo: "blah",
Bar: 1,
}
m := jsonpb.Marshaler{}
var buf bytes.Buffer
if err := m.Marshal(&buf, foo); err != nil {
// handle error
}
logger, _ := zap.NewDevelopment()
logger.Info("Event persisted", zap.Any("event", json.RawMessage(buf.Bytes())))
The bytes will be printed as:
Event persisted {"event": {"foo":"blah","bar":"1"}}
I believe that's the easiest way. However, I'm also aware of a package, kazegusuri/go-proto-zap-marshaler (I'm not affiliated with it), that generates MarshalLogObject() implementations as a protoc plugin. You may want to take a look at that too.
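To see why json.RawMessage avoids the extra escaping, here is a minimal sketch that uses only encoding/json. It assumes that zap's reflection-based JSON encoding treats json.RawMessage the same way encoding/json does; the helper names encodeAsString and encodeAsRaw are made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeAsString embeds the payload as a JSON string, so every quote inside
// it has to be escaped.
func encodeAsString(payload []byte) string {
	out, err := json.Marshal(map[string]any{"event": string(payload)})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

// encodeAsRaw embeds the payload as json.RawMessage, which the encoder
// copies into the output as already-encoded JSON.
func encodeAsRaw(payload []byte) string {
	out, err := json.Marshal(map[string]any{"event": json.RawMessage(payload)})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	// Stands in for the bytes produced by the jsonpb marshaller.
	payload := []byte(`{"foo":"blah","bar":"1"}`)

	fmt.Println(encodeAsString(payload)) // {"event":"{\"foo\":\"blah\",\"bar\":\"1\"}"}
	fmt.Println(encodeAsRaw(payload))    // {"event":{"foo":"blah","bar":"1"}}
}
```

Note that the payload must already be valid JSON; json.Marshal returns an error for a json.RawMessage that is not.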

I used another way to jsonify protos.
Since protos can be naturally marshaled, I just wrapped them in the strict-to-json marshaler.
And you can modify the internals to use protojson (newer jsonpb).
Unlike the marshaler in the previous solution, this one doesn't require ahead-of-logging processing.
type jsonObjectMarshaler struct {
obj any
}
func (j *jsonObjectMarshaler) MarshalJSON() ([]byte, error) {
bytes, err := json.Marshal(j.obj)
// bytes, err := protojson.Marshal(j.obj)
if err != nil {
return nil, fmt.Errorf("json marshaling failed: %w", err)
}
return bytes, nil
}
func ZapJsonable(key string, obj any) zap.Field {
return zap.Reflect(key, &jsonObjectMarshaler{obj: obj})
}
Then to use it, just
logger, _ := zap.NewDevelopment()
logger.Info("Event persisted", ZapJsonable("event", buf))
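The "no ahead-of-logging processing" point can be illustrated without zap. The sketch below is an assumption-laden stand-in: lazyJSON and marshalCalls are hypothetical names, and json.Marshal plays the role of the log encoder. It shows that a wrapper implementing json.Marshaler does no marshaling work until something actually encodes it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lazyJSON defers marshaling of obj until an encoder calls MarshalJSON.
type lazyJSON struct {
	obj   any
	calls *int // counts MarshalJSON invocations, for demonstration only
}

func (l lazyJSON) MarshalJSON() ([]byte, error) {
	*l.calls++
	return json.Marshal(l.obj)
}

// marshalCalls builds the wrapper, encodes it only when enabled is true
// (think: "log level is enabled"), and reports how often MarshalJSON ran.
func marshalCalls(enabled bool) int {
	calls := 0
	field := lazyJSON{obj: map[string]string{"foo": "bar"}, calls: &calls}
	if enabled {
		if _, err := json.Marshal(field); err != nil {
			return -1
		}
	}
	return calls
}

func main() {
	fmt.Println(marshalCalls(false)) // 0: wrapper was built but never encoded
	fmt.Println(marshalCalls(true))  // 1: marshaling happened at encode time
}
```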

Related

Generic Go code to retrieve multiple rows from BigQuery

I am writing some utils to retrieve multiple rows from BigQuery in a generic way using Go.
e.g.
type User struct { name string; surname string }
type Car struct { model string; platenumber string }
query1:="SELECT name, surname FROM UserTable"
query2:="SELECT model, platenumber FROM CarTable"
cars, _ := query2.GetResults()
users, _ := query1.GetResults()
OR
cars := []Car{}
query2.GetResults(cars) // and it would append to the slice
I am unsure about the signature of GetResults. I need somehow to pass the type to BigQuery library so it can retrieve the data and map it to the struct correctly. But at the same time I need to make it generic so it can be used for different types.
My current GetResults is shown below. It doesn't work; the error is:
bigquery: cannot convert *interface {} to ValueLoader (need pointer to []Value, map[string]Value, or struct)[]
But I cannot pass the struct directly, as I want to make it generic.
func (s *Query) GetResults() ([]interface{}, error) {
var result []interface{}
job, err := s.Run()
if err != nil {
s.log.Error(err, "error in running the query")
return nil, err
}
it, err := job.ReadData()
if err != nil {
s.log.Error(err, "error in reading the data")
return nil, err
}
var row interface{}
for {
err := it.Next(&row)
if err != nil {
fmt.Print(err)
break
}
result = append(result, row)
}
return result, nil
}
Is there another way to achieve that? Or is it better not to create a method like that at all?
I've tried quite a lot of different things: with or without a pointer, with or without a slice, modifying the arguments, or returning a new list. Nothing seems to work, and doing all of that feels a bit wrong given how "easy" the thing I am trying to achieve should be.
I've also looked into doing the following
GetResults[T any]() ([]T, error)
But it's "excluded" as GetResults is part of an interface (and we can't declare type parameters on a method of an interface). And I can't/don't want to add a type parameter to the whole interface, as it impacts other interfaces.
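One possible direction, sketched under assumptions: since methods cannot have type parameters, the generic part can live in a package-level function that takes the iterator as an argument. Everything below is hypothetical scaffolding so the sketch runs on its own: rowIterator, errDone and fakeCars stand in for the BigQuery row iterator and its end-of-iteration sentinel, and the fields are exported because struct loaders generally need that.

```go
package main

import (
	"errors"
	"fmt"
)

// errDone is a stand-in for the sentinel a real iterator returns when
// there are no more rows.
var errDone = errors.New("no more rows")

// rowIterator is a hypothetical interface with the same shape as an
// iterator whose Next fills a destination pointer.
type rowIterator interface {
	Next(dst any) error
}

// GetResults is a package-level generic function, so it does not have to be
// part of any interface. It passes a *T (not a *interface{}) to Next.
func GetResults[T any](it rowIterator) ([]T, error) {
	var out []T
	for {
		var row T
		err := it.Next(&row)
		if errors.Is(err, errDone) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, row)
	}
}

type Car struct{ Model, PlateNumber string }

// fakeCars is an in-memory iterator used only to exercise GetResults.
type fakeCars struct {
	rows []Car
	pos  int
}

func (f *fakeCars) Next(dst any) error {
	if f.pos >= len(f.rows) {
		return errDone
	}
	p, ok := dst.(*Car)
	if !ok {
		return fmt.Errorf("unsupported destination %T", dst)
	}
	*p = f.rows[f.pos]
	f.pos++
	return nil
}

func main() {
	it := &fakeCars{rows: []Car{{Model: "A", PlateNumber: "1"}, {Model: "B", PlateNumber: "2"}}}
	cars, err := GetResults[Car](it)
	fmt.Println(len(cars), err) // 2 <nil>
}
```

The non-generic interface method could then return the iterator, leaving the typed loop to the caller's call to GetResults[T].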

How can I compare read(1.proto) = read(2.proto) in Go (assuming there's just one message definition)?

Context: I'm trying to resolve this issue.
In other words, there's a NormalizeJsonString() for JSON strings (see this for more context):
// Takes a value containing JSON string and passes it through
// the JSON parser to normalize it, returns either a parsing
// error or normalized JSON string.
func NormalizeJsonString(jsonString interface{}) (string, error) {
that allows to have the following code:
return structure.NormalizeJsonString(old) == structure.NormalizeJsonString(new)
but it doesn't work for strings that are proto files (all proto files are guaranteed to have just one message definition). For example, I could see:
syntax = "proto3";
- package bar.proto;
+ package bar.proto;
option java_outer_classname = "FooProto";
message Foo {
...
- int64 xyz = 3;
+ int64 xyz = 3;
Is there NormalizeProtoString available in some Go SDKs? I found MessageDifferencer but it's in C++ only. Another option I considered was to replace all new lines / group of whitespaces with a single whitespace but it's a little bit hacky.
To do this in a semantic fashion, the proto definitions should really be parsed. Naively stripping and/or replacing whitespace may get you somewhere, but likely will have gotchas.
As far as I'm aware, the latest official Go protobuf package doesn't have anything to handle parsing protobuf definitions; the protoc compiler handles that side of affairs, and it is written in C++.
One option would be to execute the protoc compiler to get hold of the descriptor set output (e.g. protoc --descriptor_set_out=...). However, I'm guessing this would also be slightly haphazard, since it requires protoc to be available, and version differences could potentially cause problems too.
Assuming that is a no-go, a further option is to use a third-party parser written in Go; github.com/yoheimuta/go-protoparser seems to handle things quite well. One slight issue when making comparisons is that the parser records meta information about source line and column positions for each type. However, it is relatively easy to make a comparison that ignores these by using github.com/google/go-cmp.
For example:
package main
import (
"fmt"
"log"
"os"
"github.com/google/go-cmp/cmp"
"github.com/google/go-cmp/cmp/cmpopts"
"github.com/yoheimuta/go-protoparser/v4"
"github.com/yoheimuta/go-protoparser/v4/parser"
"github.com/yoheimuta/go-protoparser/v4/parser/meta"
)
func main() {
if err := run(); err != nil {
log.Fatal(err)
}
}
func run() error {
proto1, err := parseFile("example1.proto")
if err != nil {
return err
}
proto2, err := parseFile("example2.proto")
if err != nil {
return err
}
equal := cmp.Equal(proto1, proto2, cmpopts.IgnoreTypes(meta.Meta{}))
fmt.Printf("equal: %t", equal)
return nil
}
func parseFile(path string) (*parser.Proto, error) {
f, err := os.Open(path)
if err != nil {
return nil, err
}
defer f.Close()
return protoparser.Parse(f)
}
outputs:
equal: true
for the example you provided.
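For contrast, the naive whitespace approach mentioned earlier can be sketched with the standard library alone. The helper name collapseWhitespace is made up for this illustration, and the example inputs are invented; the point is only to show where the approach works and where it has gotchas.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseWhitespace replaces every run of whitespace with a single space
// and trims leading and trailing whitespace.
func collapseWhitespace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	// Indentation-only differences compare equal, as intended.
	a := "message Foo {\n    int64 xyz = 3;\n}"
	b := "message Foo {\n  int64   xyz = 3;\n}"
	fmt.Println(collapseWhitespace(a) == collapseWhitespace(b)) // true

	// Gotcha 1: whitespace inside string literals is collapsed too, so two
	// different literals compare equal.
	c := `option java_outer_classname = "Foo  Proto";`
	d := `option java_outer_classname = "Foo Proto";`
	fmt.Println(collapseWhitespace(c) == collapseWhitespace(d)) // true

	// Gotcha 2: equivalent definitions that differ in optional spacing
	// compare as different.
	e := "int64 xyz = 3;"
	f := "int64 xyz=3;"
	fmt.Println(collapseWhitespace(e) == collapseWhitespace(f)) // false
}
```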

Why is a nil error returned from strings.Builder WriteString in golang, and is it necessary?

When reviewing my colleague's code, I found that a returned err had been ignored, though we would not do that in general:
b := new(strings.Builder)
b.WriteString("Hello, World!") // ignore err
The source code for WriteString declares it may return an error, but in fact it never will (always returning nil for the error value):
// WriteString appends the contents of s to b's buffer.
// It returns the length of s and a nil error.
func (b *Builder) WriteString(s string) (int, error) {
b.copyCheck()
b.buf = append(b.buf, s...)
return len(s), nil
}
What would the issues be, if any, with removing the error return, as follows?
func (b *Builder) WriteString(s string) int {
b.copyCheck()
b.buf = append(b.buf, s...)
return len(s)
}
The changelist which introduces strings.Builder includes a lot of comments about trying to make this API similar to bytes.Buffer.
For instance,
That's how a bytes.Buffer behaves, after all, and we're supposed to be a subset of a bytes.Buffer.
Looking at the documentation for some bytes.Buffer functions, it mentions
WriteRune appends the UTF-8 encoding of Unicode code point r to the buffer, returning its length and an error, which is always nil but is included to match bufio.Writer's WriteRune.
It looks like they're basically trying to design an API that's similar to other interfaces in Golang's standard library. Even though the always-nil error is redundant, it allows the Builder to match existing interfaces that would accept bytes.Buffer or bufio.Writer. One such interface is io.StringWriter, which looks like
type StringWriter interface {
WriteString(s string) (n int, err error)
}
The err return value here is useful since other StringWriter implementations could possibly return errors.
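A small sketch of what that buys in practice: because both strings.Builder and bytes.Buffer have the (int, error) signature, either can be passed to code written against io.StringWriter. The functions greet and greetBoth are hypothetical names for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// greet writes to any io.StringWriter and propagates the error, since some
// implementations (files, network writers) can fail.
func greet(w io.StringWriter, name string) error {
	_, err := w.WriteString("Hello, " + name + "!")
	return err
}

// greetBoth passes a *strings.Builder and a *bytes.Buffer to the same
// function; both satisfy io.StringWriter.
func greetBoth(name string) (string, string, error) {
	var sb strings.Builder
	var bb bytes.Buffer
	if err := greet(&sb, name); err != nil {
		return "", "", err
	}
	if err := greet(&bb, name); err != nil {
		return "", "", err
	}
	return sb.String(), bb.String(), nil
}

func main() {
	a, b, err := greetBoth("World")
	fmt.Println(a, b, err) // Hello, World! Hello, World! <nil>
}
```

With the error return removed, *strings.Builder would no longer satisfy io.StringWriter and could not be passed to greet.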
In Go, it's quite common to return a value and an error. You check whether the error is nil and, if it is, you can safely use the returned value.
In other words, if it receives an error from a function then it indicates there was a problem with the function called.

How to stop json.Marshal from escaping < and >?

package main
import "fmt"
import "encoding/json"
type Track struct {
XmlRequest string `json:"xmlRequest"`
}
func main() {
message := new(Track)
message.XmlRequest = "<car><mirror>XML</mirror></car>"
fmt.Println("Before Marshal", message)
messageJSON, _ := json.Marshal(message)
fmt.Println("After marshal", string(messageJSON))
}
Is it possible to make json.Marshal not escape < and >? I currently get:
{"xmlRequest":"\u003ccar\u003e\u003cmirror\u003eXML\u003c/mirror\u003e\u003c/car\u003e"}
but I am looking for something like this:
{"xmlRequest":"<car><mirror>XML</mirror></car>"}
As of Go 1.7, you still cannot do this with json.Marshal(). The source code for json.Marshal shows:
> err := e.marshal(v, encOpts{escapeHTML: true})
The reason json.Marshal always does this is:
String values encode as JSON strings coerced to valid UTF-8,
replacing invalid bytes with the Unicode replacement rune.
The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e"
to keep some browsers from misinterpreting JSON output as HTML.
Ampersand "&" is also escaped to "\u0026" for the same reason.
This means you cannot even do it by writing a custom func (t *Track) MarshalJSON(), you have to use something that does not satisfy the json.Marshaler interface.
So the workaround is to write your own function:
func (t *Track) JSON() ([]byte, error) {
buffer := &bytes.Buffer{}
encoder := json.NewEncoder(buffer)
encoder.SetEscapeHTML(false)
err := encoder.Encode(t)
return buffer.Bytes(), err
}
https://play.golang.org/p/FAH-XS-QMC
If you want a generic solution for any struct, you could do:
func JSONMarshal(t interface{}) ([]byte, error) {
buffer := &bytes.Buffer{}
encoder := json.NewEncoder(buffer)
encoder.SetEscapeHTML(false)
err := encoder.Encode(t)
return buffer.Bytes(), err
}
https://play.golang.org/p/bdqv3TUGr3
In Go 1.7 they added a new option to fix this:
encoding/json:
add Encoder.DisableHTMLEscaping. This provides a way to disable the escaping of <, >, and & in JSON strings.
The relevant function is
func (*Encoder) SetEscapeHTML
That should be applied to an Encoder.
enc := json.NewEncoder(os.Stdout)
enc.SetEscapeHTML(false)
Simple example: https://play.golang.org/p/SJM3KLkYW-
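As a self-contained sketch of the difference, the program below encodes the same struct with json.Marshal and with an Encoder that has SetEscapeHTML(false). The helper name marshalNoEscape is an assumption for this example. One detail worth knowing is that Encoder.Encode appends a newline, which the helper trims so the result matches what json.Marshal would return.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type Track struct {
	XmlRequest string `json:"xmlRequest"`
}

// marshalNoEscape encodes v without escaping <, > and &.
func marshalNoEscape(v any) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode terminates the value with a newline; drop it.
	return bytes.TrimSuffix(buf.Bytes(), []byte("\n")), nil
}

func main() {
	t := Track{XmlRequest: "<car>XML</car>"}

	escaped, err := json.Marshal(t)
	if err != nil {
		panic(err)
	}
	plain, err := marshalNoEscape(t)
	if err != nil {
		panic(err)
	}

	fmt.Println(string(escaped)) // {"xmlRequest":"\u003ccar\u003eXML\u003c/car\u003e"}
	fmt.Println(string(plain))   // {"xmlRequest":"<car>XML</car>"}
}
```

Both outputs are valid JSON and decode to the same string; only the byte representation differs.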
This doesn't answer the question directly but it could be an answer if you're looking for a way how to deal with json.Marshal escaping < and >...
Another way to solve the problem is to replace those escaped characters in the json.RawMessage with plain UTF-8 characters after the json.Marshal() call.
It works for any characters, not just < and >. (I used to do this to make non-English letters human-readable in JSON :D)
func _UnescapeUnicodeCharactersInJSON(_jsonRaw json.RawMessage) (json.RawMessage, error) {
str, err := strconv.Unquote(strings.Replace(strconv.Quote(string(_jsonRaw)), `\\u`, `\u`, -1))
if err != nil {
return nil, err
}
return []byte(str), nil
}
func main() {
// Both are valid JSON.
var jsonRawEscaped json.RawMessage // json raw with escaped unicode chars
var jsonRawUnescaped json.RawMessage // json raw with unescaped unicode chars
// '\u263a' == '☺'
jsonRawEscaped = []byte(`{"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}`) // "\\u263a"
jsonRawUnescaped, _ = _UnescapeUnicodeCharactersInJSON(jsonRawEscaped) // "☺"
fmt.Println(string(jsonRawEscaped)) // {"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}
fmt.Println(string(jsonRawUnescaped)) // {"HelloWorld": "안녕, 세상(世上). ☺"}
}
https://play.golang.org/p/pUsrzrrcDG-
I hope this helps someone.
Here's my workaround:
// Marshal is a UTF-8 friendly marshaler. Go's json.Marshal is not UTF-8
// friendly because it replaces the valid UTF-8 and JSON characters "&", "<",
// ">" with the "slash u" unicode escaped forms (e.g. \u0026). It preemptively
// escapes for HTML friendliness. Where text may include any of these
// characters, json.Marshal should not be used. Playground of Go breaking a
// title: https://play.golang.org/p/o2hiX0c62oN
func Marshal(i interface{}) ([]byte, error) {
buffer := &bytes.Buffer{}
encoder := json.NewEncoder(buffer)
encoder.SetEscapeHTML(false)
err := encoder.Encode(i)
return bytes.TrimRight(buffer.Bytes(), "\n"), err
}
No, you can't.
A third-party json package might be the choice rather than the std json lib.
More detail: https://github.com/golang/go/issues/8592
I had a requirement to store xml inside json :puke:
At first I was having significant difficulty unmarshalling that xml after passing it via json, but my issue was actually due to trying to unmarshall the xml string as a json.RawMessage. I actually needed to unmarshall it as a string and then coerce it into []byte for the xml.Unmarshal.
type xmlInJson struct {
Data string `json:"data"`
}
var response xmlInJson
err := json.Unmarshal(xmlJsonData, &response)
var xmlData someOtherStructThatMatchesTheXmlFormat
err = xml.Unmarshal([]byte(response.Data), &xmlData)
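A runnable version of that two-step decode might look like the following sketch. The car struct and the decodeCar helper are hypothetical and only exist to give the XML a concrete shape.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// xmlInJSON is the JSON envelope; the XML travels as an ordinary string.
type xmlInJSON struct {
	Data string `json:"data"`
}

// car is a hypothetical struct matching the XML payload.
type car struct {
	XMLName xml.Name `xml:"car"`
	Mirror  string   `xml:"mirror"`
}

// decodeCar first decodes the JSON envelope (which also undoes any \u003c
// style escaping), then parses the contained string as XML.
func decodeCar(raw []byte) (car, error) {
	var wrapper xmlInJSON
	if err := json.Unmarshal(raw, &wrapper); err != nil {
		return car{}, fmt.Errorf("decoding JSON: %w", err)
	}
	var c car
	if err := xml.Unmarshal([]byte(wrapper.Data), &c); err != nil {
		return car{}, fmt.Errorf("decoding XML: %w", err)
	}
	return c, nil
}

func main() {
	raw := []byte(`{"data":"\u003ccar\u003e\u003cmirror\u003eXML\u003c/mirror\u003e\u003c/car\u003e"}`)
	c, err := decodeCar(raw)
	fmt.Println(c.Mirror, err) // XML <nil>
}
```

This also shows that the escaped form is harmless for consumers that parse the JSON properly: the escapes disappear during decoding.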
A custom function is not really the best solution.
How about using another library to solve this?
I use gabs
import
go get "github.com/Jeffail/gabs"
use
message := new(Track)
resultJson,_:=gabs.Consume(message)
fmt.Println(string(resultJson.EncodeJSON()))
That is how I solved the problem.

Arbitrary JSON data structure in go

I'm building an http api and every one of my handlers returns JSON data, so I built a wrapper function that handles the JSON marshalling and http response (I've included the relevant section from the wrapper as well as one of the sample handlers below).
What is the best way to pass arbitrarily nested structs (the structs also contain arbitrary types and numbers of fields)? Right now I've settled on a map with string keys and interface{} values. This works, but is this the most idiomatic Go way to do it?
result := make(map[string]interface{})
customerList(httpRequest, &result)
j, err := json.Marshal(result)
if err != nil {
log.Println(err)
errs := `{"error": "json.Marshal failed"}`
w.Write([]byte(errs))
return
}
w.Write(j)
func customerList(req *http.Request, result *map[string]interface{}) {
data, err := database.RecentFiftyCustomers()
if err != nil {
(*result)["error"] = stringifyErr(err, "customerList()")
return
}
(*result)["customers"] = data // data is a slice of arbitrarily nested structs
}
If you do not know in advance what types, what structure and which nesting you get, there is no option but to decode it into something generic like map[string]interface{}. So nothing "idiomatic" or "non-idiomatic" here.
(Personally I'd try to somehow fix the structs and not have "arbitrary" nestings, and combinations.)
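As a rough sketch of what working with the generic form looks like, the code below decodes unknown JSON into map[string]any and inspects a field with a type switch. The function names describe and describeField are made up for this example; the set of dynamic types in the switch is the one encoding/json produces when decoding into an interface value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe reports the dynamic type that encoding/json chose for a value
// decoded into an interface.
func describe(v any) string {
	switch x := v.(type) {
	case map[string]any:
		return fmt.Sprintf("object with %d keys", len(x))
	case []any:
		return fmt.Sprintf("array of %d items", len(x))
	case string:
		return "string"
	case float64:
		return "number"
	case bool:
		return "bool"
	case nil:
		return "null" // also reached when the key is absent from the map
	default:
		return fmt.Sprintf("unexpected %T", x)
	}
}

// describeField decodes raw into a generic map and describes one field.
func describeField(raw []byte, key string) (string, error) {
	var doc map[string]any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", err
	}
	return describe(doc[key]), nil
}

func main() {
	raw := []byte(`{"customers":[{"id":1},{"id":2}],"error":null}`)
	s, err := describeField(raw, "customers")
	fmt.Println(s, err) // array of 2 items <nil>
}
```

The type switches are the price of the generic form, which is one reason fixed structs are preferable when the shape is known.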
