How to parse timestamp with underscores in Golang

I'm trying to parse an access log timestamp like "2020/11/06_18:17:25_455" in Filebeat, following the Golang layout spec.
Here is my test program to verify layout:
package main
import (
	"fmt"
	"log"
	"time"
)
func main() {
	eventDateLayout := "2006/01/02_15:04:05_000"
	eventCheckDate, err := time.Parse(eventDateLayout, "2020/11/06_18:17:25_455")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(eventCheckDate)
}
Result:
2009/11/10 23:00:00 parsing time "2020/11/06_18:17:25_455" as
"2006/01/02_15:04:05_000": cannot parse "455" as "_000"
As I understand it, the underscore has a special meaning in Go time layouts, but the documentation doesn't make it clear how to escape it.
Any ideas, please?

It doesn't seem possible to use any escape characters for the time layout (e.g. "\\_" doesn't work), so one would have to do something different.
This issue describes the same problem, but it was solved in a very non-general way that doesn't seem to apply to your format.
So your best bet seems to be replacing _ with something else (or stripping it from the string) and then using a layout without it. To make sure that the millisecond part is also parsed, it must be separated with a . instead of _; it is then recognized as a fractional part of the seconds (05) element, even though the layout itself contains no fractional digits.
eventDateLayout := "2006/01/02.15:04:05"
val := strings.Replace("2020/11/06_18:17:25_455", "_", ".", 2)
eventCheckDate, err := time.Parse(eventDateLayout, val)
if err != nil {
	panic(err)
}
fmt.Println(eventCheckDate)

From time.Format
A fractional second is represented by adding a period and zeros to the
end of the seconds section of layout string, as in "15:04:05.000" to
format a time stamp with millisecond precision.
You cannot specify millisecond precision with an underscore; you need 05.000 instead:
// eventDateLayout := "2006/01/02_15:04:05_000" // invalid format
eventDateLayout := "2006/01/02_15:04:05.000"
eventCheckDate, err := time.Parse(eventDateLayout, "2020/11/06_18:17:25.455")
So basically use a simple translate function to convert the final _ to a . and use the above parser.
https://play.golang.org/p/POPgXC_qe81
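The "convert the final _ to a ." step could be sketched roughly as follows. The helper name parseUnderscoreTimestamp is made up for this illustration, and the sketch assumes the milliseconds always follow the last underscore in the value.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// layout keeps the first underscore as a literal and uses ".000" for milliseconds.
const layout = "2006/01/02_15:04:05.000"

// parseUnderscoreTimestamp is a hypothetical helper: it swaps only the last
// underscore for a period so the fraction is recognized, then parses the result.
func parseUnderscoreTimestamp(s string) (time.Time, error) {
	i := strings.LastIndex(s, "_")
	if i < 0 {
		return time.Time{}, fmt.Errorf("no underscore in %q", s)
	}
	return time.Parse(layout, s[:i]+"."+s[i+1:])
}

func main() {
	t, err := parseUnderscoreTimestamp("2020/11/06_18:17:25_455")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(t.Format(time.RFC3339Nano))
}
```

Replacing only the last underscore leaves the one between date and time untouched, so the layout can keep it as a literal character.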

Related

How can I compare read(1.proto) = read(2.proto) in Go(assuming there's just one message definition)?

Context: I'm trying to resolve this issue.
In other words, there's a NormalizeJsonString() for JSON strings (see this for more context):
// Takes a value containing JSON string and passes it through
// the JSON parser to normalize it, returns either a parsing
// error or normalized JSON string.
func NormalizeJsonString(jsonString interface{}) (string, error) {
that allows to have the following code:
return structure.NormalizeJsonString(old) == structure.NormalizeJsonString(new)
but it doesn't work for strings that are proto files (all proto files are guaranteed to have just one message definition). For example, I could see:
syntax = "proto3";
- package bar.proto;
+ package bar.proto;
option java_outer_classname = "FooProto";
message Foo {
...
- int64 xyz = 3;
+ int64 xyz = 3;
Is there a NormalizeProtoString available in any Go SDK? I found MessageDifferencer, but it's in C++ only. Another option I considered was to replace all newlines and groups of whitespace with a single space, but that's a little bit hacky.

To do this in a semantic fashion, the proto definitions should really be parsed. Naively stripping and/or replacing whitespace may get you somewhere, but likely will have gotchas.
As far as I'm aware, the latest official Go protobuf package doesn't have anything to handle parsing protobuf definitions; the protoc compiler handles that side of affairs, and it is written in C++.
One option would be to execute the protoc compiler to get hold of the descriptor set output (e.g. protoc --descriptor_set_out=...); however, I'm guessing this would also be slightly haphazard, considering it requires one to have protoc available, and version differences could potentially cause problems too.
Assuming that is a no-go, one further option is to use a third-party parser written in Go; github.com/yoheimuta/go-protoparser seems to handle things quite well. One slight issue when making comparisons is that the parser records meta information about source line and column positions for each type; however, it is relatively easy to make a comparison that ignores these by using github.com/google/go-cmp.
For example:
package main
import (
	"fmt"
	"log"
	"os"
	"github.com/google/go-cmp/cmp"
	"github.com/google/go-cmp/cmp/cmpopts"
	"github.com/yoheimuta/go-protoparser/v4"
	"github.com/yoheimuta/go-protoparser/v4/parser"
	"github.com/yoheimuta/go-protoparser/v4/parser/meta"
)
func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
func run() error {
	proto1, err := parseFile("example1.proto")
	if err != nil {
		return err
	}
	proto2, err := parseFile("example2.proto")
	if err != nil {
		return err
	}
	equal := cmp.Equal(proto1, proto2, cmpopts.IgnoreTypes(meta.Meta{}))
	fmt.Printf("equal: %t", equal)
	return nil
}
func parseFile(path string) (*parser.Proto, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return protoparser.Parse(f)
}
outputs:
equal: true
for the example you provided.
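For contrast, here is a minimal sketch of the naive whitespace-collapsing idea mentioned above, using only the standard library. The helper name collapseWhitespace and the sample definitions are made up for illustration; this is not a semantic comparison, and it shows one of the gotchas: two definitions that differ only in optional spacing around = still compare as different.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseWhitespace is a hypothetical helper: it replaces every run of
// whitespace with a single space and trims both ends.
func collapseWhitespace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	a := `syntax = "proto3";
package  bar.proto;
message Foo {
    int64 xyz = 3;
}`
	b := `syntax = "proto3";
package bar.proto;
message Foo {
  int64   xyz = 3;
}`
	c := `syntax = "proto3";
package bar.proto;
message Foo {
  int64 xyz=3;
}`

	// Only the amount of whitespace differs, so these compare as equal.
	fmt.Println(collapseWhitespace(a) == collapseWhitespace(b))
	// "xyz=3" has no whitespace to collapse, so these compare as different
	// even though the definition is the same.
	fmt.Println(collapseWhitespace(a) == collapseWhitespace(c))
}
```

Comments and whitespace inside string literals would be affected in the same way, which is why parsing the definitions is the more robust route.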

Golang Equivalent of Python's `pd.to_datetime()`

I'm new to Go, and looking to create my own algo trading strategy backtesting library, an area I'm well experienced in with Python, to help learn the language.
I have a 5 minute OHLCV SPY5min.csv dataset, the head of which looks like this:
I use this code to read in the dataset from the file, converting everything to a list of lists of values:
package main
import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
)
func ReadCsvFile(filePath string) [][]string {
	f, err := os.Open(filePath)
	if err != nil {
		log.Fatal("Unable to read input file "+filePath, err)
	}
	defer f.Close()
	csvReader := csv.NewReader(f)
	records, err := csvReader.ReadAll()
	if err != nil {
		log.Fatal("Unable to parse file as CSV for "+filePath, err)
	}
	return records
}
func main() {
	records := ReadCsvFile("./SPY5min.csv")
	fmt.Println(records)
}
This returns a list of lists of string values. Cool. Now what I want to do is replicate a Pandas Dataframe like object, or perhaps separate each "column" into their own separate arrays/slices if that's easier, not sure yet.
Once that's done, I need a way to convert the strings of datetimes to actual datetime objects that I can run comparisons and loc's on. Can someone point me in the right direction?
My naive approach (pseudo) would be to:
Declare 6 array variables (datetime, open, high, low, close, volume) of len(records) in size
Iterate over the records list of lists
Insert each value into the current i of their respective arrays
Once iteration is done, mass convert the values in the datetime array to values of datetime objects?
Wondering if this is really the best way of doing this, or if there's a faster way than O(n) iteration?

You asked, "Once that's done, I need a way to convert the strings of datetimes to actual datetime objects ...". I recently answered a similar question here: https://stackoverflow.com/a/74491722/5739452
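Your timestamp looks like this: "2022-11-08 4:00". The time package contains parsing and other manipulation functions. The key detail is knowing the conventions for the layout format. A layout is written by showing how Go's reference time, Mon Jan 2 15:04:05 MST 2006, would appear in your format, so each element of a time is recognized by a specific number: the year is 2006, the month is 01, the day is 02, the hour is 15, the minute is 04, and so on.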
So, for your purpose something like this should work:
package main
import (
	"fmt"
	"time"
)
func main() {
	t := "2022-11-08 4:00"
	const layout = "2006-01-02 15:04"
	x, err := time.Parse(layout, t)
	fmt.Println(x, err)
}
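Putting the layout together with the CSV records from the question, a single pass over the rows might look like the sketch below. The Bar type, the parseBars helper, the header row, the column order (datetime, open, high, low, close, volume) and the sample values are all assumptions made for illustration, not details confirmed by the question.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// Bar is a hypothetical row type; the column order is assumed.
type Bar struct {
	Time                   time.Time
	Open, High, Low, Close float64
	Volume                 int64
}

const layout = "2006-01-02 15:04"

// parseBars converts CSV records (header row first) into typed bars.
func parseBars(records [][]string) ([]Bar, error) {
	if len(records) == 0 {
		return nil, nil
	}
	bars := make([]Bar, 0, len(records)-1)
	for i, rec := range records[1:] {
		if len(rec) < 6 {
			return nil, fmt.Errorf("row %d: want 6 columns, got %d", i+1, len(rec))
		}
		ts, err := time.Parse(layout, rec[0])
		if err != nil {
			return nil, fmt.Errorf("row %d: %w", i+1, err)
		}
		var prices [4]float64
		for j := range prices {
			if prices[j], err = strconv.ParseFloat(rec[j+1], 64); err != nil {
				return nil, fmt.Errorf("row %d: %w", i+1, err)
			}
		}
		vol, err := strconv.ParseInt(rec[5], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("row %d: %w", i+1, err)
		}
		bars = append(bars, Bar{
			Time: ts, Open: prices[0], High: prices[1],
			Low: prices[2], Close: prices[3], Volume: vol,
		})
	}
	return bars, nil
}

func main() {
	// Made-up sample rows standing in for SPY5min.csv.
	const sample = "datetime,open,high,low,close,volume\n" +
		"2022-11-08 4:00,381.50,381.75,381.40,381.60,12000\n" +
		"2022-11-08 4:05,381.60,381.90,381.55,381.80,8500\n"
	records, err := csv.NewReader(strings.NewReader(sample)).ReadAll()
	if err != nil {
		fmt.Println("csv error:", err)
		return
	}
	bars, err := parseBars(records)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// time.Time values can be compared directly with Before, After and Equal.
	fmt.Println(len(bars), bars[0].Time.Before(bars[1].Time))
}
```

Every timestamp has to be parsed once, so one pass over the rows is hard to avoid; converting while iterating at least avoids a second pass over a separate slice of strings.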

How to convert the string representation of a Terraform set of strings to a slice of strings

I've a terratest where I get an output from terraform like so s := "[a b]". The terraform output's value = toset([resource.name]), it's a set of strings.
Apparently fmt.Printf("%T", s) returns string. I need to iterate to perform further validation.
I tried the approach below, but it returns an error:
var v interface{}
if err := json.Unmarshal([]byte(s), &v); err != nil {
	fmt.Println(err)
}
My current implementation to convert to a slice is:
s := "[a b]"
s1 := strings.Fields(strings.Trim(s, "[]"))
for _, v := range s1 {
	fmt.Println("v -> " + v)
}
Looking for suggestions to current approach or alternative ways to convert to arr/slice that I should be considering. Appreciate any inputs. Thanks.

Actually your current implementation seems just fine.
You can't use JSON unmarshaling because JSON strings must be enclosed in double quotes ".
Instead, strings.Fields does just that: it splits a string on one or more characters that match unicode.IsSpace, which for ASCII input are \t, \n, \v, \f, \r and ' ' (space).
Moreover, this also works if terraform sends an empty set as [], as stated in the documentation:
returning [...] an empty slice if s contains only white space.
...which includes the case of s being empty "" altogether.
In case you need additional control over this, you can use strings.FieldsFunc, which accepts a function of type func(rune) bool so you can determine yourself what constitutes a "space". But since your input string comes from terraform, I guess it's going to be well-behaved enough.
There may be third-party packages that already implement this functionality, but unless your program already imports them, I think the native solution based on the standard lib is always preferable.
Note that unicode.IsSpace also includes the higher runes 0x85 and 0xA0; when the input contains non-ASCII bytes, strings.Fields falls back to calling FieldsFunc(s, unicode.IsSpace).
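As a small sketch of the points above, the conversion can be wrapped in a helper and tried against an empty set and irregular spacing. The name parseSet is made up for this illustration, and the sketch assumes the elements themselves contain no whitespace or brackets.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSet is a hypothetical helper for strings such as "[a b]": it strips one
// pair of surrounding brackets and splits the rest on runs of whitespace.
func parseSet(s string) []string {
	s = strings.TrimSpace(s)
	s = strings.TrimPrefix(s, "[")
	s = strings.TrimSuffix(s, "]")
	return strings.Fields(s)
}

func main() {
	for _, in := range []string{"[a b]", "[a   b]", "[]"} {
		out := parseSet(in)
		fmt.Printf("%q -> %d element(s): %q\n", in, len(out), out)
	}
}
```

Because strings.Fields never yields empty elements, the empty set produces a slice of length zero, so a range loop over the result simply does nothing.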

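package main
import (
	"fmt"
	"strings"
)
func main() {
	src := "[a b]"
	// Drop the surrounding brackets, then split on single spaces.
	// Note: for "[]" this yields one empty string rather than an empty slice.
	dst := strings.Split(src[1:len(src)-1], " ")
	fmt.Println(dst)
}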
https://play.golang.org/p/KVY4r_8RWv6

Why is time.Since returning negative durations on Windows?

I have been trying to work with some go, and have found some weird behavior on windows. If I construct a time object from parsing a time string in a particular format, and then use functions like time.Since(), I get negative durations.
Code sample:
package main
import (
	"fmt"
	"strconv"
	"time"
)
func convertToTimeObject(dateStr string) time.Time {
	layout := "2006-01-02T15:04:05.000Z"
	t, _ := time.Parse(layout, dateStr)
	return t
}
func main() {
	timeOlder := convertToTimeObject(time.Now().Add(-30 * time.Second).Format("2006-01-02T15:04:05.000Z"))
	duration := time.Since(timeOlder)
	fmt.Println("Duration in seconds: " + strconv.Itoa(int(duration.Seconds())))
}
If you run it on Linux or the Go Playground link, you get the result as Duration in seconds: 30 which is expected.
However, on Windows, running the same piece of code with Go 1.10.3 gives Duration in seconds: -19769.
I've banged my head on this for hours. Any help on what I might be missing?
The only leads I've had since now are that when go's time package goes to calculate the seconds for both time objects (time.Now() and my parsed time object), one of them has the property hasMonotonic and one doesn't, which results in go calculating vastly different seconds for both.
I'm not the expert in time, so would appreciate some help. I was going to file a bug for Go, but thought to ask here from the experts if there's something obvious I might be missing.

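I think I figured out the reason for the weird behavior of your code snippet and can provide a solution. The relevant docs read as follows:
Since returns the time elapsed since t. It is shorthand for time.Now().Sub(t).
But:
Now returns the current local time.
That means you format timeOlder, parse it back, and then subtract it from an unformatted local time. The trailing Z in your layout is a literal character, not a zone element, so the local wall-clock time is written out without its offset and then parsed as UTC. The parsed value is therefore shifted by the machine's UTC offset, which is presumably why a machine (or the Playground) running in UTC prints 30 while one in another time zone does not. A simple solution is to pass the current time through the same format-and-parse round trip before subtracting timeOlder from it, so that both values carry the same shift.
A solution that works on my machine (it probably does not make a lot of sense to give a playground example, though):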
package main
import (
	"fmt"
	"log"
	"strconv"
	"time"
)
func convertToTimeObject(dateStr string) time.Time {
	layout := "2006-01-02T15:04:05.000Z"
	t, err := time.Parse(layout, dateStr)
	// check the error!
	if err != nil {
		log.Fatalf("error while parsing time: %s\n", err)
	}
	return t
}
func main() {
	timeOlder := convertToTimeObject(time.Now().Add(-30 * time.Second).Format("2006-01-02T15:04:05.000Z"))
	duration := time.Since(timeOlder)
	// replace time.Since() with a correctly parsed time.Now(), because
	// time.Since() returns the time elapsed since the current LOCAL time.
	t := time.Now().Format("2006-01-02T15:04:05.000Z")
	timeNow := convertToTimeObject(t)
	// print the different results
	fmt.Println("duration in seconds:", strconv.Itoa(int(duration.Seconds())))
	fmt.Printf("duration: %v\n", timeNow.Sub(timeOlder))
}
Outputs:
duration in seconds: 14430
duration: 30s
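The offset effect can be reproduced on any machine with a fixed zone. The sketch below is only an illustration: the function names, the UTC+05:30 zone and the sample instant are assumptions chosen for the demo. It also shows an alternative to the double round trip, namely time.ParseInLocation, which tells the parser which zone the wall-clock digits belong to.

```go
package main

import (
	"fmt"
	"time"
)

// layout is the one from the question; its trailing Z is a literal character.
const layout = "2006-01-02T15:04:05.000Z"

// Assumed sample values: a fixed zone at UTC+05:30 and a fixed instant, so the
// result does not depend on the machine running the code.
var (
	zone      = time.FixedZone("UTC+5:30", 5*3600+30*60)
	sampleNow = time.Date(2018, time.July, 1, 12, 0, 0, 0, zone)
)

// roundTripUTC formats now, parses it back with time.Parse (which assumes UTC
// when the layout has no zone element) and returns now minus the parsed time.
func roundTripUTC(now time.Time) (time.Duration, error) {
	parsed, err := time.Parse(layout, now.Format(layout))
	if err != nil {
		return 0, err
	}
	return now.Sub(parsed), nil
}

// roundTripInLocation does the same, but interprets the digits in now's zone.
func roundTripInLocation(now time.Time) (time.Duration, error) {
	parsed, err := time.ParseInLocation(layout, now.Format(layout), now.Location())
	if err != nil {
		return 0, err
	}
	return now.Sub(parsed), nil
}

func main() {
	skew, err := roundTripUTC(sampleNow)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	same, err := roundTripInLocation(sampleNow)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("time.Parse:", skew)
	fmt.Println("time.ParseInLocation:", same)
}
```

With time.Parse the round trip is off by exactly the zone offset, while time.ParseInLocation returns the original instant. Formatting with time.Now().UTC() before parsing would be another way to keep the two sides consistent.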

Is there a way of cleaning up this Go code?

I am just beginning to learn Go, and have made a function which parses markdown files with a header, containing some metadata (the files are blog posts).
here is an example:
---
Some title goes here
19 September 2012
---
This is some content, read it.
I've written this function, which works, but I feel it's quite verbose and messy. I've had a look at the various strings packages, but I don't know enough about Go and its best practices to know what I should be doing differently. If I could get some tips to clean this up, I would appreciate it. (Also, I know that I shouldn't be neglecting that error.)
type Post struct {
	Title string
	Date  string
	Body  string
}
func loadPost(title string) *Post {
	filename := title + ".md"
	file, _ := ioutil.ReadFile("posts/" + filename)
	fileString := string(file)
	str := strings.Split(fileString, "---")
	meta := strings.Split(str[1], "\n")
	title = meta[1]
	date := meta[2]
	body := str[2]
	return &Post{Title: title, Date: date, Body: body}
}

I think it's not bad. A couple of suggestions:
The hard-coded slash in "posts/" is platform-dependent. You can use path/filepath.Join to avoid that.
There is bytes.Split, so you don't need the string(file).
You can create the Post without repeating the fields: &Post{title, date, body}
Alternatively, you could find out where the body starts with LastIndex(s, "--") and use that to index the file contents accordingly. This avoids the allocation of using Split.
const sep = "--"
func loadPost(content string) *Post {
	sepLength := len(sep)
	i := strings.LastIndex(content, sep)
	headers := content[sepLength:i]
	body := content[i+sepLength+1:]
	meta := strings.Split(headers, "\n")
	return &Post{meta[1], meta[2], body}
}

I agree that it's not bad. I'll add a couple of other ideas.
As Thomas showed, you don't need the intermediate variables title, date, and body. Try this, though:
return &Post{
	Title: meta[1],
	Date:  meta[2],
	Body:  body,
}
It's true that you can leave the field names out, but I sometimes like them to keep the code self-documenting. (I think go vet likes them too.)
I fuss over strings versus byte slices, but probably more than I should. Since you're reading the file in one gulp, you probably don't need to worry about this. Converting everything to one big string and then slicing up the string is a handy way of doing things, just remember that you're pinning the entire string in memory if you keep any part of it. If your files are large or you have lots of them and you only end up keeping, say, the meta for most of them, this might not be the way to go.
There's just one blog entry per file? If so, I think I'll propose a variant of Thomas's suggestion. Verify the first bytes are --- (or your file is corrupt), then use strings.Index(fileString[3:], "---"). Split is more appropriate when you have an unknown number of segments. In your case you're just looking for that single separator after the meta. Index will find it after searching the meta and be done, without searching through the whole body. (And anyway, what if the body contained the string "---"?)
Finally, some people would use regular expressions for this. I still haven't warmed up to regular expressions, but anyway, it's another approach.
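The Index-based suggestion above might look roughly like this. It is only a sketch: the parsePost name, the error messages, and the assumption that the header holds exactly a title line followed by a date line are mine, not part of the original function.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const sep = "---"

type Post struct {
	Title string
	Date  string
	Body  string
}

// parsePost is a hypothetical variant of loadPost that works on file contents.
// It checks the leading separator, then looks only for the first separator
// after it, so a "---" inside the body is left alone.
func parsePost(s string) (*Post, error) {
	if !strings.HasPrefix(s, sep) {
		return nil, errors.New("post does not start with ---")
	}
	rest := s[len(sep):]
	i := strings.Index(rest, sep)
	if i < 0 {
		return nil, errors.New("header is not terminated with ---")
	}
	meta := strings.Split(strings.TrimSpace(rest[:i]), "\n")
	if len(meta) < 2 {
		return nil, errors.New("header needs a title line and a date line")
	}
	return &Post{
		Title: strings.TrimSpace(meta[0]),
		Date:  strings.TrimSpace(meta[1]),
		Body:  strings.TrimSpace(rest[i+len(sep):]),
	}, nil
}

func main() {
	post, err := parsePost("---\nSome title goes here\n19 September 2012\n---\nThis is some content, read it.")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%q %q %q\n", post.Title, post.Date, post.Body)
}
```

Returning an error instead of indexing blindly also covers the corrupt-file case mentioned above, at the cost of a slightly longer function.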

Sonia has some great suggestions. Below is my take which accounts for problems you might encounter when parsing the header.
http://play.golang.org/p/w-XYyhPj9n
package main
import (
	"fmt"
	"strings"
)
const sep = "---"
type parseError struct {
	msg string
}
func (e *parseError) Error() string {
	return e.msg
}
func parse(s string) (header []string, content string, err error) {
	if !strings.HasPrefix(s, sep) {
		return header, content, &parseError{"content does not start with `---`!"}
	}
	arr := strings.SplitN(s, sep, 3)
	if len(arr) < 3 {
		return header, content, &parseError{"header was not terminated with `---`!"}
	}
	header = strings.Split(strings.TrimSpace(arr[1]), "\n")
	content = strings.TrimSpace(arr[2])
	return header, content, nil
}
func main() {
	// Well-formed post: header lines and content are both returned.
	f := `---
Some title goes here
19 September 2012
---
This is some content, read it. --Anonymous`
	header, content, err := parse(f)
	if err != nil {
		panic(err)
	}
	for i, val := range header {
		fmt.Println(i, val)
	}
	fmt.Println("---")
	fmt.Println(content)
	// Header is never terminated: parse reports an error.
	f = `---
Some title goes here
19 September 2012
This is some content, read it.`
	_, _, err = parse(f)
	fmt.Println("Error:", err)
	// Input does not start with the separator: parse reports an error.
	f = `
Some title goes here
19 September 2012
---
This is some content, read it.`
	_, _, err = parse(f)
	fmt.Println("Error:", err)
}
