Time format weird behavior [duplicate] - go

This is part of a larger program. I am a little confused: if I use any other digits in the format below, the result is wrong.
So instead of
fmt.Println(time.Now().Format("2006-01-02 15:04:05-07:00"))
if I try to use
fmt.Println(time.Now().Format("2006-01-02 12:04:05-04:00"))
the result is wrong, even though it is the same format with only the digits changed.
package main
import (
"fmt"
"time"
)
func main() {
fmt.Println(time.Now().Format("2006-01-02 15:04:05-07:00"))
}
So my question is: why is this so? I assumed the digits inside the format had no meaning and were just a representation of the format.

From https://golang.org/pkg/time/:
func (Time) Format
func (t Time) Format(layout string) string
Format returns a textual representation of the time value formatted
according to layout, which defines the format by showing how the
reference time, defined to be
Mon Jan 2 15:04:05 -0700 MST 2006
would be displayed if it were the value;
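So you must use the reference time exactly. Each group of digits identifies a field (01 is the month, 02 the day, 15 the hour, 04 the minute, 05 the second, 2006 the year, -07:00 the zone offset), so changing the digits changes which fields are printed.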


Why "Split" is not a member function of type "string" in Go? [duplicate]

Here are some snippets when you want to split a string with some specific delimiter in different languages:
# python
s = 'a,b,c,d,e'
tokens = s.split(',')
// javascript
let s = 'a,b,c,d,e'
let tokens = s.split(',')
// go
s := "a,b,c,d,e"
tokens := strings.Split(s, ",")
As you can see, "split" is a member function of type string in Python and Javascript, but not in Go.
I am wondering why that is. It resembles the STL in C++: why are the functions that manipulate an instance of a type not member functions of that type? It seems easy to implement them in Go, like:
// go
func (s *string) Split(d string) []string {
// here goes the code to split s with d given
}
What is the reason it is designed this way?
As you can see, "split" is a member function of type string in python and javascript, but not in golang.
That seems to have been so from the very beginning: commit 729bc5c, Sept 2008, for Go1 is the first commit with any mention of a string Split() function.
rudimentary string utilities.
Those functions were considered "utilities", and not part of the predeclared string type 'string' itself.
It was documented soon after in commit 0f7306b, March 2009, still Go1
// Split returns the array representing the substrings of s separated by string sep. Adjacent
// occurrences of sep produce empty substrings. If sep is empty, it is the same as Explode.
func Split(s, sep string) []string {
You can see it used for the first time in commit 5eae3b2, April 2009 in func LookPath(file string) (string, *os.Error) {
The same approach was used for byte slices with bytes: commit 7893322, June 2009; Go1, with a similar Split() function.
add a bytes package analogous to the strings package.
The general idea is: you can change that utility function without changing the value type itself.
See commit 30533d6, June 2009:
Change strings.Split, bytes.Split to take a maximum substring count argument.
func Split(s, sep []byte, n int) [][]byte
An even more drastic evolution: commit ebb1566, June 2011
strings.Split: make the default to split all.
Change the signature of Split to have no count, assuming a full split, and rename the existing Split with a count to SplitN.
The other idea is to keep using string, while possibly removing dependencies on those utility functions when you don't need them (as in commit 35ace1d, Nov. 2009: "remove dependencies on strconv and strings")
It also allows more related functions to be added without touching string itself.
See commit 5d436b9, Nov. 2009: lines := strings.SplitAfter(text, "\n", 0), which uses Split().
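As a small illustration of how the family looks today, the sketch below calls strings.Split, strings.SplitN and strings.SplitAfter on the same input. Note that the current SplitAfter takes no count argument, unlike the 2009 call quoted above. The helper name splitVariants is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// splitVariants applies the three related helpers to the same input.
func splitVariants(s, sep string) (all, firstTwo, keepSep []string) {
	all = strings.Split(s, sep)          // every substring
	firstTwo = strings.SplitN(s, sep, 2) // at most 2 pieces; the last holds the rest
	keepSep = strings.SplitAfter(s, sep) // like Split, but each piece keeps its separator
	return all, firstTwo, keepSep
}

func main() {
	all, firstTwo, keepSep := splitVariants("a,b,c", ",")
	fmt.Printf("%q\n", all)      // ["a" "b" "c"]
	fmt.Printf("%q\n", firstTwo) // ["a" "b,c"]
	fmt.Printf("%q\n", keepSep)  // ["a," "b," "c"]
}
```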
Another advantage: you can optimize those functions independently of string itself, allowing for duplicate 'Split' function to be replaced by strings.Split().
See commit f388119, March 2013, Go 1.1
go/printer: use strings.Split instead of specialized code
With the faster strings package, the difference between the specialized code and strings.Split is in the noise:
benchmark old ns/op new ns/op delta
BenchmarkPrint 16724291 16686729 -0.22%
The opposite case is also true: replacing strings.Split with simpler code, as in commit d0c9b40, Sept. 2015, Go 1.6
mime: Remove an allocation in word decoding.
This fixes a TODO in (*WordDecoder).Decode by replacing a call to strings.Split with simple prefix/suffix checking and some custom slicing.
Benchmark results:
benchmark old ns/op new ns/op delta
BenchmarkQEncodeWord-8 740 693 -6.35%
BenchmarkQDecodeWord-8 1291 727 -43.69%
BenchmarkQDecodeHeader-8 1194 767 -35.76%
(same idea in commit ecff943, Sept. 2017, Go 1.11)
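For readers who still want the method syntax from the question: Go only allows methods on types defined in the same package, so func (s *string) Split would not compile. One possible workaround, sketched here with a hypothetical named type Text, is to declare your own string-based type and delegate to the strings package.

```go
package main

import (
	"fmt"
	"strings"
)

// Text is a hypothetical named type whose underlying type is string.
// Because it is declared in this package, it may have methods.
type Text string

// Split provides the method syntax from the question by delegating
// to strings.Split.
func (t Text) Split(sep string) []string {
	return strings.Split(string(t), sep)
}

func main() {
	s := Text("a,b,c,d,e")
	fmt.Println(s.Split(",")) // [a b c d e]
}
```

The trade-off is that values must be converted to and from string at the boundaries, which is one reason the package-level functions are the more common choice.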

Understanding golang date formatting for time package

I have this function, which works well:
func Today()(result string){
current_time := time.Now().Local()
result = current_time.Format("01/02/2006")
return
}
It prints MM/DD/YYYY. I thought it would be more readable to have a value greater than 12 in the day position, to make it clear that the format is MM/DD/YYYY, so I changed it to the following:
func Today()(result string){
current_time := time.Now().Local()
result = current_time.Format("01/23/2004")
return
}
Which, to my chagrin, caused bad results: it prints MM/DDHH/DD0MM.
Realizing my mistake I see that the format is defined by the reference time...
Mon Jan 2 15:04:05 -0700 MST 2006
I'm wondering whether there are any other instances of this moment being used as a formatting reference for date-times, and whether this reference moment has a nickname (like Null Island).
The values in a date string are not arbitrary. You can't just change 02 to 03 and expect it to work. The date formatter looks for those specific values, and knows that 1 means month, 2 means day of month, etc.
Changing 01/02/2006 to 01/23/2004 is like changing a human-readable form that says First Name: ______ Last Name: ______ to one that says First Name: ______ Ice Cream: ______. You can't expect anyone to know that Ice Cream should mean Last Name.
The name
The only name provided for this is "reference time", here:
Parse parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, defined to be
Mon Jan 2 15:04:05 -0700 MST 2006
and here:
These are predefined layouts for use in Time.Format and Time.Parse. The reference time used in the layouts is the specific time:
Mon Jan 2 15:04:05 MST 2006
which is Unix time 1136239445. Since MST is GMT-0700, the reference time can be thought of as
01/02 03:04:05PM '06 -0700
To define your own format, write down what the reference time would look like formatted your way; see the values of constants like ANSIC, StampMicro or Kitchen for examples. The model is to demonstrate what the reference time looks like so that the Format and Parse methods can apply the same transformation to a general time value.
To specify that you're talking about Go's reference time, I'd say "Go's reference time." Or to be blatantly obvious, "Go's time.Parse reference time."
As an aside, your function can be greatly shortened:
func Today() string {
return time.Now().Local().Format("01/02/2006")
}

is there a way to iterate over constant used as enum

I am trying to use an enum in golang as below. I am struggling to find an easy way to iterate over the list of constant values. What is the common practice in golang for iterating over constant values used as an enum? Thanks!
type DayOfWeek int
const(
Monday DayOfWeek = iota
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
)
In Java, we can iterate as below.
public enum DayOfWeek {
MONDAY,
TUESDAY,
WEDNESDAY,
THURSDAY,
FRIDAY,
SATURDAY,
SUNDAY
}
for (DayOfWeek day: DayOfWeek.values()) {
// code logic
}
There is no direct way to enumerate the values/instances of named type at runtime, whether variables or constants, unless you specifically define a slice that lists them. This is left up to the definer or the user of the enumeration type.
package main
import (
"fmt"
"time"
)
var Weekdays = []time.Weekday{
time.Sunday,
time.Monday,
time.Tuesday,
time.Wednesday,
time.Thursday,
time.Friday,
time.Saturday,
}
func main() {
for _, day := range Weekdays {
fmt.Println(day)
}
}
In order to be able to generate this list dynamically at runtime, e.g. via reflection, the linker would have to retain all the symbols defined in all packages, like Java does. The golang-nuts group discussed this, regarding names and functions exported from a package, a superset of package constant definitions. https://groups.google.com/forum/#!topic/golang-nuts/M0ORoEU115o
It would be possible for the language to include syntactic sugar for generating this list at compile time if and only if it were referenced by the program. What should the iteration order be, though? If your week starts on Monday the list I defined is not very helpful; you will have to define your own slice to range through the days from Monday to Sunday.
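Applied to the DayOfWeek type from the question, the "define your own slice" approach might look like the sketch below. The names AllDays and dayNames are assumptions introduced for this example.

```go
package main

import "fmt"

type DayOfWeek int

const (
	Monday DayOfWeek = iota
	Tuesday
	Wednesday
	Thursday
	Friday
	Saturday
	Sunday
)

// dayNames is indexed by DayOfWeek value.
var dayNames = [...]string{
	"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday",
}

// String makes DayOfWeek print as its name instead of its number.
func (d DayOfWeek) String() string {
	if d < Monday || d > Sunday {
		return fmt.Sprintf("DayOfWeek(%d)", int(d))
	}
	return dayNames[d]
}

// AllDays lists every value in the order iteration should follow.
// It must be kept in sync with the const block by hand.
var AllDays = []DayOfWeek{Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday}

func main() {
	for _, day := range AllDays {
		fmt.Println(day)
	}
}
```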
You can do that without reflection.
First run the Go tool stringer using go generate. This creates a file [typename]_string.go which, for some sets of values, contains a map _[typename]_map from enum values to their names as strings. This map is unexported, so simply assign it to an exported map upon package initialization.
var EnumMap map[Enum]string
func init() {
EnumMap = _Enum_map
}
type Enum uint
//go:generate go run golang.org/x/tools/cmd/stringer -type=Enum
const (
One Enum = iota
Two
)
Then you can simply loop over the keys of the map.
The comment from @davec was great. This works perfectly when you have a count that increments by one.
You can do a simple loop such as for d := Monday; d <= Sunday; d++ {}
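Spelled out as a complete program, and assuming the DayOfWeek constants from the question, that loop could look like this. The helper name collect is made up for the example.

```go
package main

import "fmt"

type DayOfWeek int

const (
	Monday DayOfWeek = iota
	Tuesday
	Wednesday
	Thursday
	Friday
	Saturday
	Sunday
)

// collect walks the constants by incrementing. This is only valid
// because iota gives them consecutive values from Monday to Sunday.
func collect() []DayOfWeek {
	var days []DayOfWeek
	for d := Monday; d <= Sunday; d++ {
		days = append(days, d)
	}
	return days
}

func main() {
	fmt.Println(collect()) // [0 1 2 3 4 5 6]
}
```

If a constant is later added after Sunday, the loop bound has to be updated by hand.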
I had constant that jumped in bits (1,2,4,8,16 etc):
const (
Approved = 1 << iota
AlreadyApproved
NotApproved
OldTicket
Unknown
)
I avoided range and did a left shift one to move through my constant:
var score Bits
score = Set(score, AlreadyApproved)
for i := Approved; i <= Unknown; i = i << 1 {
fmt.Println(i)
}
Output:
1
2
4
8
16
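The snippet above uses Bits and Set without defining them. A self-contained sketch under the assumption that Bits is a small unsigned integer type and Set is a bitwise OR, plus a hypothetical Has helper, could look like this. It walks all flags with a left shift and reports which ones are set.

```go
package main

import "fmt"

// Bits is an assumed flag type; the answer uses the name but does not define it.
type Bits uint8

const (
	Approved Bits = 1 << iota
	AlreadyApproved
	NotApproved
	OldTicket
	Unknown
)

// Set returns b with flag turned on (assumed behavior).
func Set(b, flag Bits) Bits { return b | flag }

// Has reports whether flag is turned on in b (hypothetical helper).
func Has(b, flag Bits) bool { return b&flag != 0 }

// setFlags lists the flags present in b by shifting through every constant.
func setFlags(b Bits) []Bits {
	var out []Bits
	for f := Approved; f <= Unknown; f <<= 1 {
		if Has(b, f) {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	var score Bits
	score = Set(score, AlreadyApproved)
	score = Set(score, OldTicket)
	fmt.Println(setFlags(score)) // [2 8]
}
```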
Using stringer is preferable; it can help you keep your codebase up to date using code generators. Unfortunately, stringer does not always generate the map.
For anyone interested in continuing to use Go generators for this purpose, I wrote a small code generator called enumall. It produces a file for each provided type with a variable holding all values for the given type.
Use it by adding a code generator comment to your code like this:
//go:generate go run github.com/tomaspavlic/enumall@latest -type=Season
type Season uint8
const (
Spring Season = 1 << iota
Summer
Autumn
Winter
)
You can find more information here: https://github.com/tomaspavlic/enumall

Compact data structure for storing parsed log lines in Go (i.e. compact data structure for multiple enums in Go)

I'm working on a script that parses and graphs information from a database logfile. Some example loglines might be:
Tue Dec 2 03:21:09.543 [rsHealthPoll] DBClientCursor::init call() failed
Tue Dec 2 03:21:09.543 [rsHealthPoll] replset info example.com:27017 heartbeat failed, retrying
Thu Nov 20 00:05:13.189 [conn1264369] insert foobar.fs.chunks ninserted:1 keyUpdates:0 locks(micros) w:110298 110ms
Thu Nov 20 00:06:19.136 [conn1263135] update foobar.fs.chunks query: { files_id: ObjectId('54661657b23a225c1e4b00ac'), n: 0 } update: { $set: { data: BinData } } nscanned:1 nupdated:1 keyUpdates:0 locks(micros) w:675 137ms
Thu Nov 20 00:06:19.136 [conn1258266] update foobar.fs.chunks query: { files_id: ObjectId('54661657ae3a22741e0132df'), n: 0 } update: { $set: { data: BinData } } nscanned:1 nupdated:1 keyUpdates:0 locks(micros) w:687 186ms
Thu Nov 20 00:12:14.859 [conn1113639] getmore local.oplog.rs query: { ts: { $gte: Timestamp 1416453003000|74 } } cursorid:7965836327322142721 ntoreturn:0 keyUpdates:0 numYields: 15 locks(micros) r:351042 nreturned:3311 reslen:56307 188ms
Not every logline contains all fields, but some of the fields we parse out include:
Datetime
Query Duration
Name of Thread
Connection Number (e.g. 1234, 532434, 53433)
Logging Level (e.g. Warning, Error, Info, Debug etc.)
Logging Component (e.g. Storage, Journal, Commands, Indexing etc.)
Type of operation (e.g. Query, Insert, Delete etc.)
Namespace
The total logfile can often be fairly large (several hundred MBs up to a couple of GBs). Currently the script is in Python, and as well as the fields, it's also storing the original raw logline as well as a tokenised version. The resulting memory consumption, though, is actually several multiples of the original logfile size. Hence, memory consumption is one of the main things I'd like to improve.
For fun/learning, I thought I might try re-doing this in Go, and looking at whether we could use a more compact data structure.
Many of the fields are enumerations (enums). For some of them the set of values is known in advance (e.g. logging level, logging component). For others (e.g. name of thread, connection number, namespace), we'll work out the set at runtime as we parse the logfile.
Planned Changes
Firstly, many of these enums are stored as strings. So I'm guessing one improvement will be to move to something like a uint8 to store each one, and then either use consts (for the ones we know in advance) or have some kind of mapping table back to the original string (for the ones we work out). Or are there any other reasons I'd prefer consts versus some kind of mapping structure?
Secondly, rather than storing the original logline as a string, we can probably store an offset back to the original file on disk.
Questions
Do you see any issues with either of the two planned changes above? Are these a good starting point?
Do you have any other tips/suggestions for optimising the memory consumption of how we store the loglines?
I know for bitmaps, there's things like Roaring Bitmaps (http://roaringbitmap.org/), which are compressed bitmaps which you can still access/modify normally whilst compressed. Apparently the overall term for things like this is succinct data structures.
However, are there any equivalents to roaring bitmaps but for enumerations? Or any other clever way of storing this compactly?
I also thought of bloom filters, and maybe using those to store whether each logline was in a set (i.e. logging level warning, logging level error) - however, it can only be in one of those sets, so I don't know if that makes sense. Also, not sure how to handle the false positives.
Thoughts?
Do you see any issues with either of the two planned changes above? Are these a good starting point?
No problems with either. If the logs are definitely line-delimited you can just store the line number, but it may be more robust to store the byte-offset. The standard io.Reader interface returns the number of bytes read so you can use that to gain the offset.
Do you have any other tips/suggestions for optimising the memory consumption of how we store the loglines?
It depends on what you want to use them for, but once they've been tokenized (and you've got the data you want from the line), why hold onto the line in memory? It's already in the file, and you've now got an offset to look it up again quickly.
are there any equivalents to roaring bitmaps but for enumerations? Or any other clever way of storing this compactly?
I'd tend to just define each enum type as an int, and use iota. Something like:
package main
import (
"fmt"
"time"
)
type LogLevel int
type LogComponent int
type Operation int
const (
Info LogLevel = iota
Warning
Debug
Error
)
const (
Storage LogComponent = iota
Journal
Commands
Indexin
)
const (
Query Operation = iota
Insert
Delete
)
type LogLine struct {
DateTime time.Time
QueryDuration time.Duration
ThreadName string
ConNum uint
Level LogLevel
Comp LogComponent
Op Operation
Namespace string
}
func main() {
l := &LogLine{
time.Now(),
10 * time.Second,
"query1",
1000,
Info,
Journal,
Delete,
"ns1",
}
fmt.Printf("%v\n", l)
}
Produces &{2009-11-10 23:00:00 +0000 UTC 10s query1 1000 0 1 2 ns1}.
You could pack some of the struct fields, but then you need to define bit-ranges for each field and you lose some open-endedness. For example define LogLevel as the first 2 bits, Component as the next 2 bits etc.
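A minimal sketch of such packing, assuming two bits each for level, component and operation inside a single byte, might look as follows. The Packed type and its accessor names are hypothetical; the bit layout matches the shifts used for the map key later in this answer.

```go
package main

import "fmt"

type LogLevel uint8
type LogComponent uint8
type Operation uint8

const (
	Info LogLevel = iota
	Warning
	Debug
	Error
)

const (
	Storage LogComponent = iota
	Journal
	Commands
	Indexing
)

const (
	Query Operation = iota
	Insert
	Delete
)

// Packed holds three 2-bit fields:
// bits 0-1 level, bits 2-3 component, bits 4-5 operation.
type Packed uint8

// Pack combines the three enums into one byte, masking each to 2 bits.
func Pack(l LogLevel, c LogComponent, o Operation) Packed {
	return (Packed(l) & 3) | (Packed(c)&3)<<2 | (Packed(o)&3)<<4
}

func (p Packed) Level() LogLevel         { return LogLevel(p & 3) }
func (p Packed) Component() LogComponent { return LogComponent((p >> 2) & 3) }
func (p Packed) Op() Operation           { return Operation((p >> 4) & 3) }

func main() {
	p := Pack(Debug, Journal, Delete)
	fmt.Println(p)                                // 38
	fmt.Println(p.Level(), p.Component(), p.Op()) // 2 1 2
}
```

The cost is the loss of open-endedness mentioned above: a fifth log level would not fit in two bits without changing the layout.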
I also thought of bloom filters, and maybe using those to store whether each logline was in a set (i.e. logging level warning, logging level error) - however, it can only be in one of those sets, so I don't know if that makes sense. Also, not sure how to handle the false positives.
For your current example, bloom filters may be overkill. It may be easier to have a []int for each enum, or some other master "index" that keeps track of line-number to (for example) log level relationships. As you said, each log line can only be in one set. In fact, depending on the number of enum fields, it may be easier to use the packed enums as an identifier for something like a map[int][]int.
Set := make(map[int][]int)
Set[int(Delete) << 4 + int(Journal) << 2 + int(Debug)] = []int{7, 45, 900} // Line numbers in this set.
See here for a complete, although hackish example.
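For the fields whose values are only discovered while parsing (thread name, namespace), the "mapping table back to the original string" from the question can be sketched as a small interning table. This is one possible design, not something taken from the answer above, and the Interner type and its method names are hypothetical.

```go
package main

import "fmt"

// Interner assigns a small integer ID to each distinct string it sees,
// so a log line can store the ID instead of repeating the string.
type Interner struct {
	ids   map[string]uint32
	names []string
}

func NewInterner() *Interner {
	return &Interner{ids: make(map[string]uint32)}
}

// ID returns the existing ID for s, or assigns the next free one.
func (in *Interner) ID(s string) uint32 {
	if id, ok := in.ids[s]; ok {
		return id
	}
	id := uint32(len(in.names))
	in.ids[s] = id
	in.names = append(in.names, s)
	return id
}

// Name maps an ID back to the original string.
func (in *Interner) Name(id uint32) string { return in.names[id] }

func main() {
	threads := NewInterner()
	a := threads.ID("rsHealthPoll")
	b := threads.ID("conn1264369")
	c := threads.ID("rsHealthPoll") // seen before, so the ID is reused
	fmt.Println(a, b, c)            // 0 1 0
	fmt.Println(threads.Name(b))    // conn1264369
}
```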

How to format floating point numbers into a string using Go

Using Go I'm trying to find the "best" way to format a floating point number into a string. I've looked for examples however I cannot find anything that specifically answers the questions I have. All I want to do is use the "best" method to format a floating point number into a string. The number of decimal places may vary but will be known (eg. 2 or 4 or zero).
An example of what I want to achieve is below.
Based on the example below should I use fmt.Sprintf() or strconv.FormatFloat() or something else?
And, what is the normal usage of each and differences between each?
I also don't understand the significance of using either 32 or 64 in the following which currently has 32:
strconv.FormatFloat(float64(fResult), 'f', 2, 32)
Example:
package main
import (
"fmt"
"strconv"
)
func main() {
var (
fAmt1 float32 = 999.99
fAmt2 float32 = 222.22
)
var fResult float32 = float32(int32(fAmt1*100) + int32(fAmt2*100)) / 100
var sResult1 string = fmt.Sprintf("%.2f", fResult)
println("Sprintf value = " + sResult1)
var sResult2 string = strconv.FormatFloat(float64(fResult), 'f', 2, 32)
println("FormatFloat value = " + sResult2)
}
Both fmt.Sprintf and strconv.FormatFloat use the same string formatting routine under the covers, so should give the same results.
If the precision that the number should be formatted to is variable, then it is probably easier to use FormatFloat, since it avoids the need to construct a format string as you would with Sprintf. If it never changes, then you could use either.
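A short sketch of both options with a precision chosen at run time follows. With Sprintf, the precision can also be passed as an argument using the "%.*f" verb instead of building the format string. The helper names withSprintf and withFormatFloat are made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// withSprintf passes the precision as an argument via the * placeholder.
func withSprintf(v float64, prec int) string {
	return fmt.Sprintf("%.*f", prec, v)
}

// withFormatFloat passes the precision directly as a parameter.
func withFormatFloat(v float64, prec int) string {
	return strconv.FormatFloat(v, 'f', prec, 64)
}

func main() {
	v := 1234.5678
	for _, prec := range []int{0, 2, 4} {
		fmt.Println(withSprintf(v, prec), withFormatFloat(v, prec))
	}
	// 1235 1235
	// 1234.57 1234.57
	// 1234.5678 1234.5678
}
```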
The last argument to FormatFloat controls how values are rounded. From the documentation:
It rounds the
result assuming that the original was obtained from a floating-point
value of bitSize bits (32 for float32, 64 for float64)
So if you are working with float32 values as in your sample code, then passing 32 is correct.
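The difference is easiest to see with precision -1, which asks for the shortest string that converts back to the same value. In the sketch below (the helper name shortest is made up), the same float32 value is formatted once as if it came from 32 bits and once as if it came from 64 bits.

```go
package main

import (
	"fmt"
	"strconv"
)

// shortest formats f with the minimal number of digits needed to
// round-trip, once for each bitSize.
func shortest(f float32) (as32, as64 string) {
	as32 = strconv.FormatFloat(float64(f), 'f', -1, 32)
	as64 = strconv.FormatFloat(float64(f), 'f', -1, 64)
	return as32, as64
}

func main() {
	as32, as64 := shortest(0.1)
	fmt.Println(as32) // 0.1
	fmt.Println(as64) // 0.10000000149011612

	// With a fixed precision of 2, both bit sizes agree in this example.
	f := float32(0.1)
	fmt.Println(strconv.FormatFloat(float64(f), 'f', 2, 32)) // 0.10
	fmt.Println(strconv.FormatFloat(float64(f), 'f', 2, 64)) // 0.10
}
```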
With Go 1.12 (February 2019), the cespare/ryu project gives you a faster alternative to strconv:
Ryu is a Go implementation of Ryu, a fast algorithm for converting floating-point numbers to strings.
It is a fairly direct Go translation of Ulf Adams's C library.
The strconv.FormatFloat latency is bimodal because of an infrequently-taken slow path that is orders of magnitude more expensive (issue 15672).
The Ryu algorithm requires several lookup tables.
Ulf Adams's C library implements a size optimization (RYU_OPTIMIZE_SIZE) which greatly reduces the size of the float64 tables in exchange for a little more CPU cost.
For a small fraction of inputs, Ryu gives a different value than strconv does for the last digit.
This is due to a bug in strconv: issue 29491.
Go 1.12 might or might not include that new implementation directly in strconv, but if it does not, you can use this project for faster conversion.
