BigQuery - Fetch table data from Go server in Cloud Run

I am able to fetch the BigQuery table output as JSON through a Golang server. But is there a way to fetch the schema directly instead of defining it as ColStatsRow? Also, is there any way to make this better?
type ColStatsRow struct {
    COLUMN_NAME  string `bigquery:"column_name"`
    COLUMN_VALUE string `bigquery:"column_value"`
    FREQUENCY    int64  `bigquery:"frequency"`
}
// getOutput prints results from a query
func getOutput(w http.ResponseWriter, iter *bigquery.RowIterator) error {
    var rows []ColStatsRow
    for {
        var row ColStatsRow
        err := iter.Next(&row)
        if err == iterator.Done {
            out, err := json.Marshal(rows)
            if err != nil {
                return fmt.Errorf("error marshalling results: %v", err)
            }
            w.Write([]byte(out))
            return nil
        }
        if err != nil {
            return fmt.Errorf("error iterating through results: %v", err)
        }
        rows = append(rows, row)
    }
}
Thank you.

If you're after the schema for the result, it's available on the RowIterator.
If you mean you want to process the rows more dynamically without a specific struct, usually some combination of checking the schema and/or leveraging a type switch is the way to go about this.
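For example, here's a minimal sketch of that dynamic approach (same imports as the question's handler): each row is loaded into a []bigquery.Value, and iter.Schema, which is populated after the first call to Next, supplies the column names.
func getDynamicOutput(w http.ResponseWriter, iter *bigquery.RowIterator) error {
    var rows []map[string]bigquery.Value
    for {
        var row []bigquery.Value
        err := iter.Next(&row)
        if err == iterator.Done {
            break
        }
        if err != nil {
            return fmt.Errorf("error iterating through results: %v", err)
        }
        // Pair each value with its column name from the result schema.
        m := make(map[string]bigquery.Value, len(row))
        for i, field := range iter.Schema {
            m[field.Name] = row[i]
        }
        rows = append(rows, m)
    }
    out, err := json.Marshal(rows)
    if err != nil {
        return fmt.Errorf("error marshalling results: %v", err)
    }
    _, err = w.Write(out)
    return err
}
A type switch on each bigquery.Value (string, int64, float64, bool, time.Time, and so on) is where per-type handling would go.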

According to the documentation you can specify a JSON schema file like this:
[
    {
        "description": "[DESCRIPTION]",
        "name": "[NAME]",
        "type": "[TYPE]",
        "mode": "[MODE]"
    },
    {
        "description": "[DESCRIPTION]",
        "name": "[NAME]",
        "type": "[TYPE]",
        "mode": "[MODE]"
    }
]
and you can write such a schema file from an existing table using the following command:
bq show --schema --format=prettyjson project_id:dataset.table > path_to_file
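If you then want that file back as a schema object in Go, the client library can parse it. A small sketch, assuming the file written above is named schema.json:
package main

import (
    "fmt"
    "log"
    "os"

    "cloud.google.com/go/bigquery"
)

func main() {
    data, err := os.ReadFile("schema.json") // the path_to_file from the bq command
    if err != nil {
        log.Fatal(err)
    }
    schema, err := bigquery.SchemaFromJSON(data)
    if err != nil {
        log.Fatal(err)
    }
    for _, field := range schema {
        fmt.Printf("%s %s %s\n", field.Name, field.Type, field.Description)
    }
}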

Related

BigQuery nullable types in golang when using BigQuery storage write API

I'm switching from the legacy streaming API to the storage write API, following this example in Go:
https://github.com/alexflint/bigquery-storage-api-example
In the old code I used bigquery's null types to indicate a field can be null:
type Person struct {
    Name bigquery.NullString `bigquery:"name"`
    Age  bigquery.NullInt64  `bigquery:"age"`
}

var persons = []Person{
    {
        Name: ToBigqueryNullableString(""), // this will be null in bigquery
        Age:  ToBigqueryNullableInt64("20"),
    },
    {
        Name: ToBigqueryNullableString("David"),
        Age:  ToBigqueryNullableInt64("60"),
    },
}

func main() {
    ctx := context.Background()
    bigqueryClient, _ := bigquery.NewClient(ctx, "project-id")
    inserter := bigqueryClient.Dataset("dataset-id").Table("table-id").Inserter()
    err := inserter.Put(ctx, persons)
    if err != nil {
        log.Fatal(err)
    }
}

func ToBigqueryNullableString(x string) bigquery.NullString {
    if x == "" {
        return bigquery.NullString{Valid: false}
    }
    return bigquery.NullString{StringVal: x, Valid: true}
}

func ToBigqueryNullableInt64(x string) bigquery.NullInt64 {
    if x == "" {
        return bigquery.NullInt64{Valid: false}
    }
    if s, err := strconv.ParseInt(x, 10, 64); err == nil {
        return bigquery.NullInt64{Int64: s, Valid: true}
    }
    return bigquery.NullInt64{Valid: false}
}
After switching to the new API:
var persons = []*personpb.Row{
    {
        Name: "",
        Age:  20,
    },
    {
        Name: "David",
        Age:  60,
    },
}

func main() {
    ctx := context.Background()
    client, _ := storage.NewBigQueryWriteClient(ctx)
    defer client.Close()
    stream, err := client.AppendRows(ctx)
    if err != nil {
        log.Fatal("AppendRows: ", err)
    }
    var row personpb.Row
    descriptor, err := adapt.NormalizeDescriptor(row.ProtoReflect().Descriptor())
    if err != nil {
        log.Fatal("NormalizeDescriptor: ", err)
    }
    var opts proto.MarshalOptions
    var data [][]byte
    for _, row := range persons {
        buf, err := opts.Marshal(row)
        if err != nil {
            log.Fatal("protobuf.Marshal: ", err)
        }
        data = append(data, buf)
    }
    err = stream.Send(&storagepb.AppendRowsRequest{
        WriteStream: fmt.Sprintf("projects/%s/datasets/%s/tables/%s/streams/_default", "project-id", "dataset-id", "table-id"),
        Rows: &storagepb.AppendRowsRequest_ProtoRows{
            ProtoRows: &storagepb.AppendRowsRequest_ProtoData{
                WriterSchema: &storagepb.ProtoSchema{
                    ProtoDescriptor: descriptor,
                },
                Rows: &storagepb.ProtoRows{
                    SerializedRows: data,
                },
            },
        },
    })
    if err != nil {
        log.Fatal("AppendRows.Send: ", err)
    }
    _, err = stream.Recv()
    if err != nil {
        log.Fatal("AppendRows.Recv: ", err)
    }
}
With the new API I need to define the types in a .proto file, so I need something else to mark nullable fields. I tried optional fields:
syntax = "proto3";
package person;
option go_package = "/personpb";
message Row {
optional string name = 1;
int64 age = 2;
}
but it gives me an error when trying to stream (not at compile time):
BqMessage.proto: person_Row.Name: The [proto3_optional=true] option may only be set on proto3fields, not person_Row.Name
Another option I tried is oneof, writing the proto file like this:
syntax = "proto3";
import "google/protobuf/struct.proto";
package person;
option go_package = "/personpb";
message Row {
NullableString name = 1;
int64 age = 2;
}
message NullableString {
oneof kind {
google.protobuf.NullValue null = 1;
string data = 2;
}
}
Then use it like this:
var persons = []*personpb.Row{
    {
        Name: &personpb.NullableString{Kind: &personpb.NullableString_Null{
            Null: structpb.NullValue_NULL_VALUE,
        }},
        Age: 20,
    },
    {
        Name: &personpb.NullableString{Kind: &personpb.NullableString_Data{
            Data: "David",
        }},
        Age: 60,
    },
}
...
But this gives me the following error:
Invalid proto schema: BqMessage.proto: person_Row.person_NullableString.null: FieldDescriptorProto.oneof_index 0 is out of range for type "person_NullableString".
I guess the API doesn't know how to handle the oneof type, and I need to tell it about this somehow.
How can I use something like the bigquery.Nullable types when using the new storage API? Any help will be appreciated.
Take a look at this sample for an end-to-end example using a proto2 syntax file in Go.
proto3 is still a bit of a special beast when working with the Storage API, for a couple of reasons:
The current behavior of the Storage API is to operate using proto2 semantics.
Currently, the Storage API doesn't understand wrapper types, which was the original way in which proto3 was meant to communicate optional presence (e.g. NULL in BigQuery fields). Because of this, it tends to treat wrapper fields as a submessage with a value field (in BigQuery, a STRUCT with a single leaf field).
Later in its evolution, proto3 reintroduced the optional keyword as a way of marking presence, but in the internal representation this meant adding another presence marker (the source of the proto3_optional warning you were observing in the backend error).
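Given those constraints, the usual workaround is to express the message in proto2, where optional scalar fields carry presence natively. A sketch of the Row message from the question rewritten that way (the assumption, per the linked sample, being that unset fields are then written as NULL to NULLABLE columns):
// proto2 sketch: optional fields carry presence without proto3_optional
syntax = "proto2";

package person;
option go_package = "/personpb";

message Row {
  optional string name = 1;
  optional int64 age = 2;
}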
It looks like you're using bits of the newer veneer, particularly adapt.NormalizeDescriptor(). I suspect that if you're using this, you may be on an older version of the module, as the normalization code was updated in this PR and released as part of bigquery/v1.33.0.
There's ongoing work to improve the Storage API and make the overall experience smoother, but there's still work to be done.

Unmarshal nested gRPC structure in Go

We want to unmarshal (in Go) a gRPC message and transform it into a map[string]interface{} to process it further. After using this code:
err := ptypes.UnmarshalAny(resource, config)
configMarshal, err := json.Marshal(config)
var configInterface map[string]interface{}
err = json.Unmarshal(configMarshal, &configInterface)
we get the following structure:
{
    "name": "envoy.filters.network.tcp_proxy",
    "ConfigType": {
        "TypedConfig": {
            "type_url": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy",
            "value": "ChBCbGFja0hvbGVDbHVzdGVyEhBCbGFja0hvbGVDbHVzdGVy"
        }
    }
}
Here the TypedConfig field remains encoded. How can we decode the TypedConfig field? We know the type_url and we know the value, but to unmarshal the field it needs to be of the pbany.Any type. Because the TypedConfig structure is a map[string]interface{}, our program either fails to compile or crashes, complaining that it is expecting a pbany.Any type but instead is getting a map[string]interface{}.
We have the following questions:
Is there a way to turn the structure under TypedConfig into a pbany.Any type that can be subsequently unmarshalled?
Is there a way to recursively unmarshal the entire GRPC message?
Edit (provide more information about the code, schemas/packages used)
We are looking at the code of xds_proxy.go here: https://github.com/istio/istio/blob/master/pkg/istio-agent/xds_proxy.go
This code uses a *discovery.DiscoveryResponse structure in this function:
func forwardToEnvoy(con *ProxyConnection, resp *discovery.DiscoveryResponse) {
The protobuf schema for discovery.DiscoveryResponse (and every other structure used in the code) is in the https://github.com/envoyproxy/go-control-plane/ repository in this file: https://github.com/envoyproxy/go-control-plane/blob/main/envoy/service/discovery/v3/discovery.pb.go
We added code to the forwardToEnvoy function to see the entire unmarshalled contents of the *discovery.DiscoveryResponse structure:
var config proto.Message
switch resp.TypeUrl {
case "type.googleapis.com/envoy.config.route.v3.RouteConfiguration":
    config = &route.RouteConfiguration{}
case "type.googleapis.com/envoy.config.listener.v3.Listener":
    config = &listener.Listener{}
// Six more cases here, truncated to save space
}
for _, resource := range resp.Resources {
    err := ptypes.UnmarshalAny(resource, config)
    if err != nil {
        proxyLog.Infof("UnmarshalAny err %v", err)
        return false
    }
    configMarshal, err := json.Marshal(config)
    if err != nil {
        proxyLog.Infof("Marshal err %v", err)
        return false
    }
    var configInterface map[string]interface{}
    err = json.Unmarshal(configMarshal, &configInterface)
    if err != nil {
        proxyLog.Infof("Unmarshal err %v", err)
        return false
    }
}
And this works well, except that now we have these TypedConfig fields that are still encoded:
{
    "name": "virtualOutbound",
    "address": {
        "Address": {
            "SocketAddress": {
                "address": "0.0.0.0",
                "PortSpecifier": {
                    "PortValue": 15001
                }
            }
        }
    },
    "filter_chains": [
        {
            "filter_chain_match": {
                "destination_port": {
                    "value": 15001
                }
            },
            "filters": [
                {
                    "name": "istio.stats",
                    "ConfigType": {
                        "TypedConfig": {
                            "type_url": "type.googleapis.com/udpa.type.v1.TypedStruct",
                            "value": "CkF0eXBlLmdvb2dsZWFwaXMuY29tL2Vudm95LmV4dGVuc2lvbnMuZmlsdG"
                        }
                    }
                },
One way to visualize the contents of the TypedConfig fields is to use this code:
for index1, filterChain := range listenerConfig.FilterChains {
    for index2, filter := range filterChain.Filters {
        proxyLog.Infof("Listener %d: Handling filter chain %d, filter %d", i, index1, index2)
        switch filter.ConfigType.(type) {
        case *listener.Filter_TypedConfig:
            proxyLog.Infof("Found TypedConfig")
            typedConfig := filter.GetTypedConfig()
            proxyLog.Infof("typedConfig.TypeUrl = %s", typedConfig.TypeUrl)
            switch typedConfig.TypeUrl {
            case "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy":
                tcpProxyConfig := &tcp_proxy.TcpProxy{}
                err := typedConfig.UnmarshalTo(tcpProxyConfig)
                if err != nil {
                    proxyLog.Errorf("Failed to unmarshal TCP proxy configuration")
                } else {
                    proxyLog.Infof("TcpProxy Config for filter chain %d filter %d: %s", index1, index2, prettyPrint(tcpProxyConfig))
                }
But then the code becomes very complex, as we have a large number of structures, and these structures can occur in different orders in the messages.
So we wanted a generic way of unmarshalling these TypedConfig messages by using pbAny, hence our questions.
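A generic route that may help here (an assumption based on the protobuf-go API, not Istio's code): protojson resolves google.protobuf.Any fields recursively against the registered message types, so marshalling with it instead of encoding/json expands every nested TypedConfig, provided the generated Envoy types from go-control-plane are linked into the binary (their packages register the types on import).
import (
    "encoding/json"

    "google.golang.org/protobuf/encoding/protojson"
    "google.golang.org/protobuf/proto"
)

// messageToMap renders a proto message to a map, expanding nested Any
// fields into {"@type": "...", ...} objects instead of raw bytes.
func messageToMap(msg proto.Message) (map[string]interface{}, error) {
    raw, err := protojson.Marshal(msg) // fails if an Any's type isn't registered
    if err != nil {
        return nil, err
    }
    var out map[string]interface{}
    if err := json.Unmarshal(raw, &out); err != nil {
        return nil, err
    }
    return out, nil
}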

Scan a DynamoDB table using contains on a list with the Go SDK

I have a DynamoDB field that looks like this:
[ { "S" : "test#gmail.com" }, { "S" : "test2#gmail.com" } ]
I am trying to run a scan to return any record that contains, e.g., test#gmail.com. I am not sure whether I should use contains to do this; it's currently not returning any records. Any pointers as to what I should use?
My Go is set up like this:
type Site struct {
    ID     string   `json:"id"`
    Site   string   `json:"site"`
    Emails []string `json:"emails,omitempty"`
}

func (ds *datastore) GetEmail(email string, out interface{}) error {
    filt := expression.Name("emails").Contains(strings.ToLower(email))
    fmt.Println("Get Email", filt)
    //filt := expression.Contains(expression.Name("emails"), expression.Value(email))
    proj := expression.NamesList(
        expression.Name("emails"),
        expression.Name("site"),
    )
    expr, err := expression.NewBuilder().
        WithFilter(filt).
        WithProjection(proj).
        Build()
    if err != nil {
        fmt.Println(err)
    }
    scanInput := &dynamodb.ScanInput{
        ExpressionAttributeNames:  expr.Names(),
        ExpressionAttributeValues: expr.Values(),
        FilterExpression:          expr.Filter(),
        ProjectionExpression:      expr.Projection(),
        TableName:                 aws.String(ds.TableName),
    }
    result, err := ds.DDB.Scan(scanInput)
    if err != nil {
        fmt.Println("what is the err", err)
        return err
    }
    if len(result.Items) == 0 {
        fmt.Println("No Email found")
        return errors.New(http.StatusText(http.StatusNotFound))
    }
    err = ds.Marshaler.UnmarshalMap(result.Items[0], out)
    return err
}
If you're doing a contains on a partial email, it won't match when the filter is applied to a set or list: it has to be an exact element match, i.e. the full email.
{
    "Email": "test#gmail.com"
}
// This will match a contains on "test#g"

{
    "Emails": ["test#gmail.com", "another#gmail.com"]
}
// This will not match a contains on "test#g", but will match a contains of "test#gmail.com"
See contains: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Condition.html
Also note you're doing a scan. Scans perform poorly in DynamoDB as soon as your data reaches any significant size. Consider storing your data in a different format so you can query it via partition keys, or use AWS RDS as an alternative.
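If you do restructure, the query path might look like this sketch (assumptions: one item per site/email pair, a hypothetical GSI named "email-index" whose partition key is a top-level "email" attribute, and aws-sdk-go v1 as in the question):
input := &dynamodb.QueryInput{
    TableName:              aws.String(ds.TableName),
    IndexName:              aws.String("email-index"), // hypothetical GSI
    KeyConditionExpression: aws.String("email = :email"),
    ExpressionAttributeValues: map[string]*dynamodb.AttributeValue{
        ":email": {S: aws.String(strings.ToLower(email))},
    },
}
// Query reads only the matching partition instead of scanning the whole table.
result, err := ds.DDB.Query(input)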

How to read config.json in Go?

I have used the following code in filLib.go:
func LoadConfiguration(filename string) (Configuration, error) {
    bytes, err := ioutil.ReadFile(filename)
    if err != nil {
        return Configuration{}, err
    }
    var c Configuration
    err = json.Unmarshal(bytes, &c)
    if err != nil {
        return Configuration{}, err
    }
    return c, nil
}
But ioutil.ReadFile(filename) returns an error of type *os.PathError.
Both files, config.json and filLib.go, are in the same folder.
The path of the *.go file is not directly relevant to the working directory of the executing compiled code. Verify where your code thinks it actually is (and compare that to where you think it should be :).
package main

import (
    "fmt"
    "log"
    "os"
)

func main() {
    // Print the process's current working directory.
    dir, err := os.Getwd()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(dir)
}
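If the binary must find config.json next to itself regardless of where it's started from, one option (a sketch, not the only fix) is to resolve the path relative to the executable:
package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"
)

func main() {
    exe, err := os.Executable() // absolute path of the running binary
    if err != nil {
        log.Fatal(err)
    }
    path := filepath.Join(filepath.Dir(exe), "config.json")
    fmt.Println(path) // pass this to LoadConfiguration
}
Note that with go run the binary lives in a temporary directory, so this is mainly useful for deployed builds.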
The issue might be with the filename you're providing. Below is a code sample that works fine for me.
func loadConfig() {
    var AppConfig Conf
    raw, err := ioutil.ReadFile("conf/conf.json")
    if err != nil {
        log.Println("Error occurred while reading config")
        return
    }
    if err := json.Unmarshal(raw, &AppConfig); err != nil {
        log.Println("Error occurred while parsing config")
        return
    }
}
I found the jsonconfig library for this.
It is a very simple and easy-to-use configuration library, allowing JSON-based config files for your Go application. The configuration provider reads configuration data from a config.json file. You can get the string value of a configuration setting, or bind an interface to a valid JSON section by passing the related section name as a parameter.
Consider the following config.json file:
{
    "ConnectionStrings": {
        "DbConnection": "Server=.;User Id=app;Password=123;Database=Db",
        "LogDbConnection": "Server=.;User Id=app;Password=123;Database=Log"
    },
    "Caching": {
        "ApplicationKey": "key",
        "Host": "127.0.01"
    },
    "Website": {
        "ActivityLogEnable": "true",
        "ErrorMessages": {
            "InvalidTelephoneNumber": "Invalid Telephone Number",
            "RequestNotFound": "Request Not Found",
            "InvalidConfirmationCode": "Invalid Confirmation Code"
        }
    },
    "Services": {
        "List": [
            {
                "Id": 1,
                "Name": "Service1"
            },
            {
                "Id": 2,
                "Name": "Service2"
            },
            {
                "Id": 3,
                "Name": "Service3"
            }
        ]
    }
}
The following code shows how to access some of the preceding configuration settings. You can get a config value via the GetSection function by specifying nested JSON sections as a string parameter split by ":":
c, err := jsonconfig.GetSection("ConnectionStrings:DbConnection")
Any valid JSON is a valid configuration type. You can also bind a struct via jsonconfig. For example, the Caching configuration can be bound to a struct:
type Caching struct {
    ApplicationKey string
    Host           string
}

var c Caching
err = jsonconfig.Bind(&c, "Caching")

How can I map complex JSON to other JSON?

I am trying to build an aggregation service for all the third-party APIs I use.
This aggregation service takes JSON values coming from my main system, puts each value under the key equivalent to the third-party API's key, and then sends a request to the third-party API in the new JSON format.
example-1:
package main

import (
    "encoding/json"
    "fmt"
    "log"

    "github.com/tidwall/gjson"
)

func main() {
    // mapping JSON
    mapB := []byte(`
    {
        "date": "createdAt",
        "clientName": "data.user.name"
    }
    `)
    // from my main system
    dataB := []byte(`
    {
        "createdAt": "2017-05-17T08:52:36.024Z",
        "data": {
            "user": {
                "name": "xxx"
            }
        }
    }
    `)
    mapJSON := make(map[string]interface{})
    dataJSON := make(map[string]interface{})
    newJSON := make(map[string]interface{})
    err := json.Unmarshal(mapB, &mapJSON)
    if err != nil {
        log.Panic(err)
    }
    err = json.Unmarshal(dataB, &dataJSON)
    if err != nil {
        log.Panic(err)
    }
    for i := range mapJSON {
        r := gjson.GetBytes(dataB, mapJSON[i].(string))
        newJSON[i] = r.Value()
    }
    newB, err := json.MarshalIndent(newJSON, "", " ")
    if err != nil {
        log.Println(err)
    }
    fmt.Println(string(newB))
}
output:
{
    "clientName": "xxx",
    "date": "2017-05-17T08:52:36.024Z"
}
I use the gjson package to get values from my main system's request in a simple way from a JSON document.
example-2:
package main

import (
    "encoding/json"
    "fmt"
    "log"

    "github.com/tidwall/gjson"
)

func main() {
    // mapping JSON
    mapB := []byte(`
    {
        "date": "createdAt",
        "clientName": "data.user.name",
        "server": {
            "google": {
                "date": "createdAt"
            }
        }
    }
    `)
    // from my main system
    dataB := []byte(`
    {
        "createdAt": "2017-05-17T08:52:36.024Z",
        "data": {
            "user": {
                "name": "xxx"
            }
        }
    }
    `)
    mapJSON := make(map[string]interface{})
    dataJSON := make(map[string]interface{})
    newJSON := make(map[string]interface{})
    err := json.Unmarshal(mapB, &mapJSON)
    if err != nil {
        log.Panic(err)
    }
    err = json.Unmarshal(dataB, &dataJSON)
    if err != nil {
        log.Panic(err)
    }
    for i := range mapJSON {
        r := gjson.GetBytes(dataB, mapJSON[i].(string))
        newJSON[i] = r.Value()
    }
    newB, err := json.MarshalIndent(newJSON, "", " ")
    if err != nil {
        log.Println(err)
    }
    fmt.Println(string(newB))
}
output:
panic: interface conversion: interface {} is map[string]interface {}, not string
I can handle this error by using type assertions (https://golang.org/ref/spec#Type_assertions), but what if this JSON object has an array, and inside this array there are more JSON objects...?
My problem is that I have different APIs, and every API has its own JSON schema. My way of mapping JSON only works if the third-party API has key/value pairs only, without nested JSON or arrays of JSON objects.
Is there a way to map complex JSON schemas, or a Golang package to help me do that?
EDIT:
After comment interaction and with the updated question, before we move forward I would like to mention a few things.
I just looked at your example-2. Remember one thing: mapping is from one form to another form, basically from one known format to a targeted format. Each data type has to be handled; you cannot do generic-to-generic mapping logically (it's technically feasible, though it would take more time and effort; you can play around with this).
I have created a sample working program of one approach; it does a mapping of source to targeted format. Refer to this program as a starting point and use your creativity to implement yours.
Playground link: https://play.golang.org/p/MEk_nGcPjZ
Explanation: The sample program maps two different source formats to one target format. The program consists of:
Targeted Mapping definition of Provider 1
Targeted Mapping definition of Provider 2
Provider 1 JSON
Provider 2 JSON
Mapping function
Targeted JSON marshal
Key elements from the program (refer to the play link for the complete program):
type MappingInfo struct {
    TargetKey     string
    SourceKeyPath string
    DataType      string
}
Map function:
func mapIt(mapping []*MappingInfo, parsedResult gjson.Result) map[string]interface{} {
    mappedData := make(map[string]interface{})
    for _, m := range mapping {
        switch m.DataType {
        case "time":
            mappedData[m.TargetKey] = parsedResult.Get(m.SourceKeyPath).Time()
        case "string":
            mappedData[m.TargetKey] = parsedResult.Get(m.SourceKeyPath).String()
        }
    }
    return mappedData
}
Output:
Provider 1 Result: map[date:2017-05-17 08:52:36.024 +0000 UTC clientName:provider1 username]
Provider 1 JSON: {
    "clientName": "provider1 username",
    "date": "2017-05-17T08:52:36.024Z"
}
Provider 2 Result: map[date:2017-05-12 06:32:46.014 +0000 UTC clientName:provider2 username]
Provider 2 JSON: {
    "clientName": "provider2 username",
    "date": "2017-05-12T06:32:46.014Z"
}
Good luck, happy coding!
Typically, converting/transforming one structure to another structure is something you will have to handle with application logic.
As you mentioned in the question:
my problem is I have different apis, every api have own json schema
This is true for every aggregation system.
One approach to handle this requirement effectively is to keep a mapping of keys between each provider's JSON structure and the targeted JSON structure.
For example (this is just one approach; go with whatever design you see fit):
JSON structures from various provider:
// Provider 1 : JSON structure
{
    "createdAt": "2017-05-17T08:52:36.024Z",
    "data": {
        "user": {
            "name": "xxx"
        }
    }
}

// Provider 2 : JSON structure
{
    "username": "yyy",
    "since": "2017-05-17T08:52:36.024Z"
}
Mapping for target JSON structure:
jsonMappingByProvider := make(map[string]string)

// Targeted Mapping for Provider 1
jsonMappingByProvider["provider1"] = `
{
    "date": "createdAt",
    "clientName": "data.user.name"
}
`

// Targeted Mapping for Provider 2
jsonMappingByProvider["provider2"] = `
{
    "date": "since",
    "clientName": "username"
}
`
Now, based on the provider you're handling, get the mapping and map the response JSON into the targeted structure.
// get the mapping info by provider
mapping := jsonMappingByProvider["provider1"]
// Parse the response JSON
// Do the mapping
This way you can control each provider and its mapping effectively.
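For completeness, a sketch that fills in the two commented steps above, reusing the gjson approach from example-1 (the function name mapResponse is mine, not from the question; it assumes the same encoding/json, fmt, and gjson imports):
func mapResponse(provider string, response []byte) (map[string]interface{}, error) {
    mappingJSON, ok := jsonMappingByProvider[provider]
    if !ok {
        return nil, fmt.Errorf("no mapping for provider %q", provider)
    }
    // The flat mappings above unmarshal cleanly into map[string]string.
    mapping := make(map[string]string)
    if err := json.Unmarshal([]byte(mappingJSON), &mapping); err != nil {
        return nil, err
    }
    target := make(map[string]interface{})
    for targetKey, sourcePath := range mapping {
        target[targetKey] = gjson.GetBytes(response, sourcePath).Value()
    }
    return target, nil
}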
