How can I parse XML in an awkward format like this?
<key>KEY1</key><string>VALUE OF KEY1</string>
<key>KEY2</key><string>VALUE OF KEY2</string>
<key>KEY3</key><integer>42</integer>
<key>KEY3</key><array>
<integer>1</integer>
<integer>2</integer>
</array>
Parsing would be very simple if all values had the same type, for example strings. But in my case each value can be a string, data, integer, boolean, array or dict.
This XML looks a lot like JSON, but unfortunately the format is fixed and I cannot change it. I would also prefer a solution without any external packages.
Use a lower-level parsing interface provided by encoding/xml which allows you to iterate over individual tokens in the XML stream (such as "start element", "end element" etc).
See the Token() method of the encoding/xml's Decoder type.
Since the data is not well structured and you can't modify the format, you can't use xml.Unmarshal directly. Instead, create a new Decoder, iterate over the tokens, and use DecodeElement to process the elements one by one. The sample code below puts everything in a map; note that its PlistArray type only handles arrays of integers. The code is also on github here...
package main
import (
"encoding/xml"
"fmt"
"io"
"strings"
)
type PlistArray struct {
Integer []int `xml:"integer"`
}
const in = "<key>KEY1</key><string>VALUE OF KEY1</string><key>KEY2</key><string>VALUE OF KEY2</string><key>KEY3</key><integer>42</integer><key>KEY3</key><array><integer>1</integer><integer>2</integer></array>"
func main() {
result := map[string]interface{}{}
dec := xml.NewDecoder(strings.NewReader(in))
dec.Strict = false
var workingKey string
for {
token, err := dec.Token()
if err != nil {
// io.EOF marks the normal end of input; report anything else.
if err != io.EOF {
fmt.Println(err.Error())
}
break
}
switch start := token.(type) {
case xml.StartElement:
fmt.Printf("startElement = %+v\n", start)
switch start.Name.Local {
case "key":
var k string
err := dec.DecodeElement(&k, &start)
if err != nil {
fmt.Println(err.Error())
}
workingKey = k
case "string":
var s string
err := dec.DecodeElement(&s, &start)
if err != nil {
fmt.Println(err.Error())
}
result[workingKey] = s
workingKey = ""
case "integer":
var i int
err := dec.DecodeElement(&i, &start)
if err != nil {
fmt.Println(err.Error())
}
result[workingKey] = i
workingKey = ""
case "array":
var ai PlistArray
err := dec.DecodeElement(&ai, &start)
if err != nil {
fmt.Println(err.Error())
}
result[workingKey] = ai
workingKey = ""
default:
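// fmt.Errorf only builds an error value; print the message instead of discarding it.
fmt.Printf("unrecognized element <%s>\n", start.Name.Local)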
}
}
}
fmt.Printf("%+v", result)
}
I have this JSON, which I unmarshal into:
var leerCHAT []interface{}
But I am jumping through crazy hoops to reach any value in that map-inside-map-inside-map structure, especially because some results have different content.
This is the JSON:
[
null,
null,
"hub:zWXroom",
"presence_diff",
{
"joins":{
"f718a187-6e96-4d62-9c2d-67aedea00000":{
"metas":[
{
"context":{},
"permissions":{},
"phx_ref":"zNDwmfsome=",
"phx_ref_prev":"zDMbRTmsome=",
"presence":"lobby",
"profile":{},
"roles":{}
}
]
}
},
"leaves":{}
}
]
I need to get to profile; inside it there is a "DisplayName" field.
So I have been doing crazy hacks, and even so I got stuck halfway.
The outer value is an array, so I can just index it with something[elementnumber].
Then the tricky mapping starts.
Sorry about all the prints; they are there to debug and to see the number of elements I am getting back.
if leerCHAT[3] == "presence_diff" {
var id string
presence := leerCHAT[4].(map[string]interface{})
log.Printf("algo: %v", len(presence))
log.Printf("algo: %s", presence["joins"])
vamos := presence["joins"].(map[string]interface{})
for i := range vamos {
log.Println(i)
id = i
}
log.Println(len(vamos))
vamonos := vamos[id].(map[string]interface{})
log.Println(vamonos)
log.Println(len(vamonos))
metas := vamonos["profile"].(map[string]interface{}) // I get an error here
log.Println(len(metas))
}
So far I can see all the way to metas:{...}, but I can't continue with my hacky code to reach what I need.
NOTE: the id after joins: and before metas: is dynamic, so I have to get it somehow. Since there is always just one element, I used the for range loop to grab it.
The array element at index 3 describes the type of the variant JSON at index 4.
Here's how to decode the JSON to Go values. First, declare Go types for each of the variant parts of the JSON:
type PresenceDiff struct {
Joins map[string]*Presence // declaration of Presence type to be supplied
Leaves map[string]*Presence
}
type Message struct {
Body string
}
Declare a map associating the type string to the Go type:
var messageTypes = map[string]reflect.Type{
"presence_diff": reflect.TypeOf(&PresenceDiff{}),
"message": reflect.TypeOf(&Message{}),
// add more types here as needed
}
Decode the variant part to a raw message. Use the name in the element at index 3 to create a value of the appropriate Go type and decode to that value:
func decode(data []byte) (interface{}, error) {
var messageType string
var raw json.RawMessage
v := []interface{}{nil, nil, nil, &messageType, &raw}
err := json.Unmarshal(data, &v)
if err != nil {
return nil, err
}
if len(raw) == 0 {
return nil, errors.New("no message")
}
t := messageTypes[messageType]
if t == nil {
return nil, fmt.Errorf("unknown message type: %q", messageType)
}
result := reflect.New(t.Elem()).Interface()
err = json.Unmarshal(raw, result)
return result, err
}
Use type switches to access the variant part of the message:
defer ws.Close()
for {
_, data, err := ws.ReadMessage()
if err != nil {
log.Printf("Read error: %v", err)
break
}
v, err := decode(data)
if err != nil {
log.Printf("Decode error: %v", err)
continue
}
switch v := v.(type) {
case *PresenceDiff:
fmt.Println(v.Joins, v.Leaves)
case *Message:
fmt.Println(v.Body)
default:
fmt.Printf("type %T not handled\n", v)
}
}
Run it on the playground.
I'm struggling to create a data structure to unmarshal the following JSON:
{
"asks": [
["2.049720", "183.556", 1576323009],
["2.049750", "555.125", 1576323009],
["2.049760", "393.580", 1576323008],
["2.049980", "206.514", 1576322995]
],
"bids": [
["2.043800", "20.691", 1576322350],
["2.039080", "755.396", 1576323007],
["2.036960", "214.621", 1576323006],
["2.036930", "700.792", 1576322987]
]
}
If I use the following struct with interfaces, there is no problem:
type OrderBook struct {
Asks [][]interface{} `json:"asks"`
Bids [][]interface{} `json:"bids"`
}
But I need stricter typing, so I've tried:
type BitfinexOrderBook struct {
Pair string `json:"pair"`
Asks [][]BitfinexOrder `json:"asks"`
Bids [][]BitfinexOrder `json:"bids"`
}
type BitfinexOrder struct {
Price string
Volume string
Timestamp time.Time
}
But unfortunately I had no success.
This is the code that I use to parse the Kraken API response and retrieve the order book:
// loadKrakenOrderBook is delegated to load the data related to pairs info
func loadKrakenOrderBook(data []byte) (datastructure.BitfinexOrderBook, error) {
var err error
// Creating the maps for the JSON data
m := map[string]interface{}{}
var orderbook datastructure.BitfinexOrderBook
// Parsing/Unmarshalling JSON
err = json.Unmarshal(data, &m)
if err != nil {
zap.S().Debugw("Error unmarshalling data: " + err.Error())
return orderbook, err
}
a := reflect.ValueOf(m["result"])
if a.Kind() == reflect.Map {
key := a.MapKeys()[0]
log.Println("KEY: ", key)
strct := a.MapIndex(key)
log.Println("MAP: ", strct)
m, _ := strct.Interface().(map[string]interface{})
log.Println("M: ", m)
data, err := json.Marshal(m)
if err != nil {
zap.S().Warnw("Panic on key: ", key.String(), " ERR: "+err.Error())
return orderbook, err
}
log.Println("DATA: ", string(data))
err = json.Unmarshal(data, &orderbook)
if err != nil {
zap.S().Warnw("Panic on key: ", key.String(), " during unmarshal. ERR: "+err.Error())
return orderbook, err
}
return orderbook, nil
}
return orderbook, errors.New("UNABLE_PARSE_VALUE")
}
The data that I use for testing is the following:
{
"error": [],
"result": {
"LINKUSD": {
"asks": [
["2.049720", "183.556", 1576323009],
["2.049750", "555.125", 1576323009],
["2.049760", "393.580", 1576323008],
["2.049980", "206.514", 1576322995]
],
"bids": [
["2.043800", "20.691", 1576322350],
["2.039080", "755.396", 1576323007],
["2.036960", "214.621", 1576323006],
["2.036930", "700.792", 1576322987]
]
}
}
}
EDIT
NOTE: the data that I receive as input is the last JSON that I posted, not the bare arrays of bids and asks.
I've tried to integrate the solution proposed by @chmike. Unfortunately some preprocessing is needed, because the input is the last JSON that I posted.
So I've changed the code as follows in order to extract the JSON data for asks and bids.
func order(data []byte) (datastructure.BitfinexOrderBook, error) {
var err error
// Creating the maps for the JSON data
m := map[string]interface{}{}
var orderbook datastructure.BitfinexOrderBook
// var asks datastructure.BitfinexOrder
// var bids datastructure.BitfinexOrder
// Parsing/Unmarshalling JSON
err = json.Unmarshal(data, &m)
if err != nil {
zap.S().Warn("Error unmarshalling data: " + err.Error())
return orderbook, err
}
// Extract the "result" json
a := reflect.ValueOf(m["result"])
if a.Kind() == reflect.Map {
key := a.MapKeys()[0]
log.Println("KEY: ", key)
log.Println()
strct := a.MapIndex(key)
log.Println("MAP: ", strct)
m, _ := strct.Interface().(map[string]interface{})
log.Println("M: ", m)
log.Println("Asks: ", m["asks"])
log.Println("Bids: ", m["bids"])
// Here i retrieve the asks array
asks_data, err := json.Marshal(m["asks"])
log.Println("OK: ", err)
log.Println("ASKS: ", string(asks_data))
var asks datastructure.BitfinexOrder
// here i try to unmarshal the data into the struct
asks, err = UnmarshalJSON(asks_data)
log.Println(err)
log.Println(asks)
}
return orderbook, errors.New("UNABLE_PARSE_VALUE")
}
Unfortunately, I receive the following error:
json: cannot unmarshal array into Go value of type json.Number
As suggested by @Flimzy, you need a custom Unmarshaler. Here it is.
Note that the BitfinexOrderBook definition is slightly different from yours. There was an error in it.
// BitfinexOrderBook is a book of orders.
type BitfinexOrderBook struct {
Asks []BitfinexOrder `json:"asks"`
Bids []BitfinexOrder `json:"bids"`
}
// BitfinexOrder is a bitfinex order.
type BitfinexOrder struct {
Price string
Volume string
Timestamp time.Time
}
// UnmarshalJSON decodes a BitfinexOrder from a [price, volume, timestamp] array.
func (b *BitfinexOrder) UnmarshalJSON(data []byte) error {
var packedData []json.Number
err := json.Unmarshal(data, &packedData)
if err != nil {
return err
}
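// Guard against short arrays so the indexing below cannot panic.
if len(packedData) != 3 {
return fmt.Errorf("expected 3 elements in order, got %d", len(packedData))
}
b.Price = packedData[0].String()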
b.Volume = packedData[1].String()
t, err := packedData[2].Int64()
if err != nil {
return err
}
b.Timestamp = time.Unix(t, 0)
return nil
}
Note also that this custom unmarshaler function allows you to convert the price or volume to a float, which is probably what you want.
While you can hack your way through with reflect, or maybe even write your own parser, the most efficient way is to implement a json.Unmarshaler.
There are a few problems remaining, though.
Each order is a JSON array that you are transforming into a struct, not into a slice of interface{} elements, so the fields should be Asks []BitfinexOrder and Bids []BitfinexOrder.
You need to wrap the BitfinexOrderBook struct so that it matches the shape of the data. That is trivial and much simpler than using reflect.
When decoding into interface{}, json.Unmarshal turns a JSON number into a float64 by default, which is not a good fit for timestamps. You can create a decoder with json.NewDecoder and call Decoder.UseNumber so that numbers are decoded as json.Number (a string type) instead.
For example,
func (bo *BitfinexOrder) UnmarshalJSON(data []byte) error {
dec := json.NewDecoder(bytes.NewReader(data))
dec.UseNumber()
var x []interface{}
err := dec.Decode(&x)
if err != nil {
return errParse(err.Error())
}
if len(x) != 3 {
return errParse("length is not 3")
}
price, ok := x[0].(string)
if !ok {
return errParse("price is not string")
}
volume, ok := x[1].(string)
if !ok {
return errParse("volume is not string")
}
number, ok := x[2].(json.Number)
if !ok {
return errParse("timestamp is not number")
}
tint64, err := strconv.ParseInt(string(number), 10, 64)
if err != nil {
return errParse(fmt.Sprintf("parsing timestamp: %s", err))
}
*bo = BitfinexOrder{
Price: price,
Volume: volume,
Timestamp: time.Unix(tint64, 0),
}
return nil
}
and main func (wrapping the struct):
func main() {
x := struct {
Result struct{ LINKUSD BitfinexOrderBook }
}{}
err := json.Unmarshal(data, &x)
if err != nil {
log.Fatalln(err)
}
bob := x.Result.LINKUSD
fmt.Println(bob)
}
Playground link: https://play.golang.org/p/pC124F-3M_S .
Note: the playground link uses a helper function to create errors. Some might argue it is better to name the helper NewErrInvalidBitfinexOrder, or to rename the error. That is outside the scope of this question, and to save typing I will keep the short name for now.
I have a program that parses a log file and returns a slice of structs populated with data from the file.
I have also written a function to add a struct item to the aforementioned list.
But there is an error that says "Cannot use 'sf' (type *SegmentationFault) as type SegmentationFault", which stems from this function. How can I solve this problem?
func (sfList *SegmentationFaultList) AddItem(item SegmentationFault) []SegmentationFault {
sfList.Items = append(sfList.Items, item)
return sfList.Items
}
func parseLogFile(logPath string) (s *SegmentationFaultList){
logFile, err := os.Open(logPath)
checkError(err, "Could not open your log file")
defer logFile.Close()
scanner := bufio.NewScanner(logFile)
parsing := false
sf := new(SegmentationFault)
sfs := []SegmentationFault{}
sfList := SegmentationFaultList{sfs}
var beginRegexp = regexp.MustCompile(`(?i).+\[err\]:F-(\d+): Dump: Segmentation fault at ([\da-z]+)$`)
var endRegexp = regexp.MustCompile(`(?i).+\[info\]:Engine child with pid \d+ terminated`)
var sfTextRegexp = regexp.MustCompile(`(?i).+\[err\]:F-\d+: Dump:(.+)`)
for scanner.Scan() {
beginMatch := beginRegexp.FindStringSubmatch(scanner.Text())
switch {
case beginMatch != nil:
sf.pid = beginMatch[1]
sf.sfAt = beginMatch[2]
parsing = true
case endRegexp.FindStringSubmatch(scanner.Text()) != nil:
parsing = false
sfList.AddItem(sf)
case parsing:
sf.sfText = append(sf.sfText, strings.TrimSpace(sfTextRegexp.FindStringSubmatch(scanner.Text())[1]))
}
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
return sfList
}
Your problem is that you are passing a pointer (*SegmentationFault) to AddItem, which expects a SegmentationFault value.
Instead of
sf := new(SegmentationFault)
You should do:
sf := SegmentationFault{}
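A minimal sketch of the value-based version. The field types are inferred from how the question's code uses them, and collect is a hypothetical stand-in for the scanner loop. Two related details: the function in the question is declared to return *SegmentationFaultList, so it would need to return &sfList (or build the list as a pointer, as below), and resetting sf after each AddItem keeps one record's sfText from leaking into the next:

```go
package main

import "fmt"

// Field types are inferred from the question's code, not confirmed.
type SegmentationFault struct {
	pid    string
	sfAt   string
	sfText []string
}

type SegmentationFaultList struct {
	Items []SegmentationFault
}

func (l *SegmentationFaultList) AddItem(item SegmentationFault) []SegmentationFault {
	l.Items = append(l.Items, item)
	return l.Items
}

// collect stands in for the log-scanning loop: one record per pid.
func collect(pids []string) *SegmentationFaultList {
	list := &SegmentationFaultList{}
	sf := SegmentationFault{} // a value, not new(SegmentationFault)
	for _, pid := range pids {
		sf.pid = pid
		sf.sfText = append(sf.sfText, "dump for "+pid)
		list.AddItem(sf)
		sf = SegmentationFault{} // start clean so sfText is not shared
	}
	return list
}

func main() {
	list := collect([]string{"101", "202"})
	for _, item := range list.Items {
		fmt.Println(item.pid, item.sfText)
	}
}
```

If you prefer to keep sf as a pointer, dereferencing at the call site with sfList.AddItem(*sf) also satisfies the signature.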
I have a configuration file in YAML format and I am trying to read it into a custom flattened format. I couldn't work out which pattern to use for this (a tree, JSON, etc.).
E.g. application.yaml:
organization:
  products:
    product1:
      manager: "Rob"
      engineer: "John"
    product2:
      manager: "Henry"
      lead: "patrick"
The configuration file can hold a large amount of information, and its contents vary from file to file. I want to construct data in the following format:
organization/products/product1/manager = Rob
organization/products/product1/engineer = John
organization/products/product2/lead = patrick
OR
{"organization/products/product1/manager":"Rob","organization/products/product2/lead":"patrick"}
Any idea how I can achieve this pattern?
This is essentially an exercise in printing trees. The exact implementation will depend on the particular YAML parser you pick, but pretty much all of them will have some kind of "map of anything" type. In the very popular gopkg.in/yaml.v2 this type is named MapSlice (don't let the name confuse you; it leaks its implementation which has to deal with flexible key types).
Just throw it at your favorite tree traversal algorithm to render the text file. Here is a simple example that works with only string keys and only some scalar leaf nodes:
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"path"

	"gopkg.in/yaml.v2"
)

// input holds the YAML document; a real program would read it from application.yaml.
var input = []byte(`
organization:
  products:
    product1:
      manager: "Rob"
      engineer: "John"
    product2:
      manager: "Henry"
      lead: "patrick"
`)

func main() {
	var tree yaml.MapSlice
	if err := yaml.Unmarshal(input, &tree); err != nil {
		log.Fatal(err)
	}
	var buf bytes.Buffer
	if err := render(&buf, tree, ""); err != nil {
		log.Fatal(err)
	}
	fmt.Print(buf.String())
}

// render walks the tree depth-first and writes one "path = value" line per scalar leaf.
func render(w io.Writer, tree yaml.MapSlice, prefix string) error {
	for _, branch := range tree {
		key, ok := branch.Key.(string)
		if !ok {
			return fmt.Errorf("unsupported key type: %T", branch.Key)
		}
		// path.Join always uses "/"; filepath.Join would use "\" on Windows.
		prefix := path.Join(prefix, key)
		switch x := branch.Value.(type) {
		default:
			return fmt.Errorf("unsupported value type: %T", branch.Value)
		case yaml.MapSlice:
			// recurse
			if err := render(w, x, prefix); err != nil {
				return err
			}
			continue
		// scalar values
		case string:
		case int:
		case float64:
			// ...
		}
		// print scalar
		if _, err := fmt.Fprintf(w, "%s = %v\n", prefix, branch.Value); err != nil {
			return err
		}
	}
	return nil
}
I have created an object mapping in Go that is not relational, it is very simple.
I have several structs that look like this:
type Message struct {
Id int64
Message string
ReplyTo sql.NullInt64 `db:"reply_to"`
FromId int64 `db:"from_id"`
ToId int64 `db:"to_id"`
IsActive bool `db:"is_active"`
SentTime int64 `db:"sent_time"`
IsViewed bool `db:"is_viewed"`
Method string `db:"-"`
AppendTo int64 `db:"-"`
}
To create a new message I just run this function:
func New() *Message {
return &Message{
IsActive: true,
SentTime: time.Now().Unix(),
Method: "new",
}
}
And then I have a message_crud.go file for this struct that looks like this:
To find a message by a unique column (for example by id) I run this function:
func ByUnique(column string, value interface{}) (*Message, error) {
query := fmt.Sprintf(`
SELECT *
FROM message
WHERE %s = ?
LIMIT 1;
`, column)
message := &Message{}
err := sql.DB.QueryRowx(query, value).StructScan(message)
if err != nil {
return nil, err
}
return message, nil
}
And to save a message (insert or update in the database) I run this method:
func (this *Message) save() error {
s := ""
if this.Id == 0 {
s = "INSERT INTO message SET %s;"
} else {
s = "UPDATE message SET %s WHERE id=:id;"
}
query := fmt.Sprintf(s, sql.PlaceholderPairs(this))
nstmt, err := sql.DB.PrepareNamed(query)
if err != nil {
return err
}
res, err := nstmt.Exec(*this)
if err != nil {
return err
}
if this.Id == 0 {
lastId, err := res.LastInsertId()
if err != nil {
return err
}
this.Id = lastId
}
return nil
}
The sql.PlaceholderPairs() function looks like this:
func PlaceholderPairs(i interface{}) string {
s := ""
val := reflect.ValueOf(i).Elem()
count := val.NumField()
for i := 0; i < count; i++ {
typeField := val.Type().Field(i)
tag := typeField.Tag
fname := strings.ToLower(typeField.Name)
if fname == "id" {
continue
}
if t := tag.Get("db"); t == "-" {
continue
} else if t != "" {
s += t + "=:" + t
} else {
s += fname + "=:" + fname
}
s += ", "
}
s = s[:len(s)-2]
return s
}
But every time I create a new struct, for example a User struct, I have to copy and paste the "crud section" above into a new user_crud.go file and replace "Message" with "User" and "message" with "user". I repeat a lot of code, which is not very DRY. Is there something I could do to avoid repeating the code I reuse? I always have a save() method and a ByUnique() function that searches by a unique column and returns a struct.
In PHP this was easy because PHP is not statically typed.
Is this possible to do in Go?
Your ByUnique is almost generic already. Just pull out the piece that varies (the table and destination):
func ByUnique(table string, column string, value interface{}, dest interface{}) error {
query := fmt.Sprintf(`
SELECT *
FROM %s
WHERE %s = ?
LIMIT 1;
`, table, column)
return sql.DB.QueryRowx(query, value).StructScan(dest)
}
func ByUniqueMessage(column string, value interface{}) (*Message, error) {
message := &Message{}
// message is already a pointer, so pass it directly rather than &message.
if err := ByUnique("message", column, value, message); err != nil {
return nil, err
}
return message, nil
}
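One caveat with building the query through fmt.Sprintf: the table and column names are spliced into the SQL text, so they should never come from untrusted input. A minimal sketch of one way to guard against that with an allow-list; uniqueColumns, byUniqueQuery and the "user" table with its "email" column are hypothetical names for illustration:

```go
package main

import "fmt"

// uniqueColumns lists, per table, the columns a caller may search by.
// The entries here are assumptions for illustration.
var uniqueColumns = map[string]map[string]bool{
	"message": {"id": true},
	"user":    {"id": true, "email": true},
}

// byUniqueQuery returns the SELECT statement only for allow-listed names.
func byUniqueQuery(table, column string) (string, error) {
	// Indexing a missing table yields a nil map, which reports false.
	if !uniqueColumns[table][column] {
		return "", fmt.Errorf("column %q is not searchable on table %q", column, table)
	}
	return fmt.Sprintf("SELECT * FROM %s WHERE %s = ? LIMIT 1;", table, column), nil
}

func main() {
	q, err := byUniqueQuery("message", "id")
	fmt.Println(q, err)
	_, err = byUniqueQuery("message", "id; DROP TABLE message")
	fmt.Println(err)
}
```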
Your save is very similar. You just need to make a generic save function along the lines of:
func Save(table string, identifier int64, source interface{}) { ... }
Then inside of (*Message)save, you'd just call the general Save() function. Looks pretty straightforward.
Side notes: do not use this as the name of the receiver inside a method. See the link from @OneOfOne for more on that. And do not get obsessed with DRY. It is not a goal in itself. Go focuses on code being simple, clear, and reliable. Do not create something complicated and fragile just to avoid typing a simple line of error handling. This doesn't mean that you shouldn't extract duplicated code. It just means that in Go it is usually better to repeat simple code a little bit rather than create complicated code to avoid it.
EDIT: If you want to implement Save using an interface, that's no problem. Just create an Ider interface. Note that a Go struct cannot have a field and a method with the same name, so the getter is called GetId here rather than Id:
type Ider interface {
	GetId() int64
	SetId(newId int64)
}

// GetId cannot be named Id because Message already has a field called Id.
func (msg *Message) GetId() int64 {
	return msg.Id
}

func (msg *Message) SetId(newId int64) {
	msg.Id = newId
}

func Save(table string, source Ider) error {
	s := ""
	if source.GetId() == 0 {
		s = fmt.Sprintf("INSERT INTO %s SET %%s;", table)
	} else {
		s = fmt.Sprintf("UPDATE %s SET %%s WHERE id=:id;", table)
	}
	query := fmt.Sprintf(s, sql.PlaceholderPairs(source))
	nstmt, err := sql.DB.PrepareNamed(query)
	if err != nil {
		return err
	}
	res, err := nstmt.Exec(source)
	if err != nil {
		return err
	}
	// The id is still zero here for a freshly inserted row.
	if source.GetId() == 0 {
		lastId, err := res.LastInsertId()
		if err != nil {
			return err
		}
		source.SetId(lastId)
	}
	return nil
}

func (msg *Message) save() error {
	return Save("message", msg)
}
The one piece that might blow up with this is the call to Exec. I don't know what package you're using, and it's possible that Exec won't work correctly if you pass it an interface rather than the actual struct, but it probably will work. That said, I'd probably just pass the identifier rather than adding this overhead.
You probably want to use an ORM.
They eliminate a lot of the boilerplate code you're describing.
See this question for "What is an ORM?"
Here is a list of ORMs for go: https://github.com/avelino/awesome-go#orm
I have never used one myself, so I can't recommend any. The main reason is that an ORM takes a lot of control from the developer and introduces a non-negligible performance overhead. You need to see for yourself if they fit your use-case and/or if you are comfortable with the "magic" that's going on in those libraries.
I don't recommend doing this; I personally prefer being explicit about scanning into structs and creating queries.
But if you really want to stick to reflection you could do:
func ByUnique(obj interface{}, column string, value interface{}) error {
// ...
return sql.DB.QueryRowx(query, value).StructScan(obj)
}
// Call with
message := &Message{}
ByUnique(message, ...)
And for your save:
type Identifiable interface {
Id() int64
}
// Implement Identifiable for message, etc.
func Save(obj Identifiable) error {
// ...
}
// Call with
Save(message)
The approach I use and would recommend to you:
type Redirect struct {
ID string
URL string
CreatedAt time.Time
}
func FindByID(db *sql.DB, id string) (*Redirect, error) {
var redirect Redirect
err := db.QueryRow(
`SELECT "id", "url", "created_at" FROM "redirect" WHERE "id" = $1`, id).
Scan(&redirect.ID, &redirect.URL, &redirect.CreatedAt)
switch {
case err == sql.ErrNoRows:
return nil, nil
case err != nil:
return nil, err
}
return &redirect, nil
}
func Save(db *sql.DB, redirect *Redirect) error {
redirect.CreatedAt = time.Now()
_, err := db.Exec(
`INSERT INTO "redirect" ("id", "url", "created_at") VALUES ($1, $2, $3)`,
redirect.ID, redirect.URL, redirect.CreatedAt)
return err
}
This has the advantage of using the type system and mapping only things it should actually map.