I am just beginning to learn Go and have written a function that parses Markdown files with a header containing some metadata (the files are blog posts).
Here is an example:
---
Some title goes here
19 September 2012
---
This is some content, read it.
I've written this function, which works, but I feel it's quite verbose and messy. I've had a look at the various strings packages, but I don't know enough about Go and its best practices to know what I should be doing differently. If I could get some tips to clean this up, I would appreciate it. (Also, I know that I shouldn't be neglecting that error.)
type Post struct {
Title string
Date string
Body string
}
func loadPost(title string) *Post {
filename := title + ".md"
file, _ := ioutil.ReadFile("posts/" + filename)
fileString := string(file)
str := strings.Split(fileString, "---")
meta := strings.Split(str[1], "\n")
title = meta[1]
date := meta[2]
body := str[2]
return &Post{Title: title, Date: date, Body: body}
}
I think it's not bad. A couple of suggestions:
The hard-coded slash in "posts/" is platform-dependent. You can use path/filepath.Join to avoid that.
There is bytes.Split, so you don't need the string(file).
You can create the Post without repeating the fields: &Post{title, date, body}
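A minimal sketch of the three suggestions above, under a few assumptions: parsePost is a hypothetical helper name, os.ReadFile (Go 1.16+) stands in for ioutil.ReadFile, and bytes.SplitN with a limit of 3 is used instead of plain bytes.Split so that a "---" inside the body is left alone. It also returns the read error instead of dropping it.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

type Post struct {
	Title string
	Date  string
	Body  string
}

// parsePost (hypothetical helper) splits the raw bytes on "---"
// without converting the whole file to a string first.
func parsePost(data []byte) (*Post, error) {
	parts := bytes.SplitN(data, []byte("---"), 3)
	if len(parts) < 3 {
		return nil, errors.New("missing --- header delimiters")
	}
	meta := bytes.Split(bytes.TrimSpace(parts[1]), []byte("\n"))
	if len(meta) < 2 {
		return nil, errors.New("header needs a title line and a date line")
	}
	// Unkeyed literal: fields are given in declaration order.
	return &Post{string(meta[0]), string(meta[1]), string(bytes.TrimSpace(parts[2]))}, nil
}

// loadPost builds the path with filepath.Join, so no separator is hard-coded.
func loadPost(title string) (*Post, error) {
	data, err := os.ReadFile(filepath.Join("posts", title+".md"))
	if err != nil {
		return nil, err
	}
	return parsePost(data)
}

func main() {
	src := "---\nSome title goes here\n19 September 2012\n---\nThis is some content, read it."
	p, err := parsePost([]byte(src))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *p)
}
```

Keeping the parsing separate from the file read makes it easy to try on an in-memory sample, as main does here.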
Alternatively, you could find out where the body starts with strings.LastIndex(s, "--") and use that to index the file contents accordingly. This avoids splitting the whole file; only the short header is split.
const sep = "--"
func loadPost(content string) *Post {
sepLength := len(sep)
i := strings.LastIndex(content, sep)
headers := content[sepLength:i]
body := content[i+sepLength+1:]
meta := strings.Split(headers, "\n")
return &Post{meta[1], meta[2], body}
}
I agree that it's not bad. I'll add a couple of other ideas.
As Thomas showed, you don't need the intermediate variables title, date, and body. Try this, though:
return &Post{
Title: meta[1],
Date: meta[2],
Body: body,
}
It's true that you can leave the field names out, but I sometimes like them to keep the code self-documenting. (I think go vet likes them too.)
I fuss over strings versus byte slices, but probably more than I should. Since you're reading the file in one gulp, you probably don't need to worry about this. Converting everything to one big string and then slicing up the string is a handy way of doing things, just remember that you're pinning the entire string in memory if you keep any part of it. If your files are large or you have lots of them and you only end up keeping, say, the meta for most of them, this might not be the way to go.
There's just one blog entry per file? If so, I think I'll propose a variant of Thomas's suggestion. Verify the first bytes are --- (or your file is corrupt), then use strings.Index(fileString[3:], "---"). Split is more appropriate when you have an unknown number of segments. In your case you're just looking for that single separator after the meta. Index will find it after searching the meta and be done, without searching through the whole body. (And anyway, what if the body contained the string "---"?)
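The Index-based variant described above could look roughly like this. It is a sketch, not the answerer's code: splitPost is a hypothetical name, and trimming whitespace around the header and body is an added assumption. Note that the index returned by strings.Index is relative to the slice that starts after the opening separator.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const sep = "---"

// splitPost (hypothetical helper) checks the opening separator, then looks
// only for the first separator after it, so the body is never searched.
func splitPost(s string) (meta, body string, err error) {
	if !strings.HasPrefix(s, sep) {
		return "", "", errors.New("post does not start with ---")
	}
	rest := s[len(sep):]
	i := strings.Index(rest, sep) // index is relative to rest, not to s
	if i < 0 {
		return "", "", errors.New("post header is not terminated by ---")
	}
	return strings.TrimSpace(rest[:i]), strings.TrimSpace(rest[i+len(sep):]), nil
}

func main() {
	src := "---\nSome title goes here\n19 September 2012\n---\nBody with --- inside."
	meta, body, err := splitPost(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Split(meta, "\n"))
	fmt.Println(body)
}
```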
Finally, some people would use regular expressions for this. I still haven't warmed up to regular expressions, but anyway, it's another approach.
Sonia has some great suggestions. Below is my take which accounts for problems you might encounter when parsing the header.
http://play.golang.org/p/w-XYyhPj9n
package main
import (
"fmt"
"strings"
)
const sep = "---"
type parseError struct {
msg string
}
func (e *parseError) Error() string {
return e.msg
}
func parse(s string) (header []string, content string, err error) {
if !strings.HasPrefix(s, sep) {
return header, content, &parseError{"content does not start with `---`!"}
}
arr := strings.SplitN(s, sep, 3)
if len(arr) < 3 {
return header, content, &parseError{"header was not terminated with `---`!"}
}
header = strings.Split(strings.TrimSpace(arr[1]), "\n")
content = strings.TrimSpace(arr[2])
return header, content, nil
}
func main() {
// Well-formed post: header lines and body are both returned.
f := `---
Some title goes here
19 September 2012
---
This is some content, read it. --Anonymous`
header, content, err := parse(f)
if err != nil {
panic(err)
}
for i, val := range header {
fmt.Println(i, val)
}
fmt.Println("---")
fmt.Println(content)
// The header is never closed by a second separator.
f = `---
Some title goes here
19 September 2012
This is some content, read it.`
_, _, err = parse(f)
fmt.Println("Error:", err)
// The opening separator is missing.
f = `
Some title goes here
19 September 2012
---
This is some content, read it.`
_, _, err = parse(f)
fmt.Println("Error:", err)
}
Related
I'm trying to parse access log timestamp like "2020/11/06_18:17:25_455" in Filebeat according to Golang spec.
Here is my test program to verify layout:
package main
import (
"fmt"
"log"
"time"
)
func main() {
eventDateLayout := "2006/01/02_15:04:05_000"
eventCheckDate, err := time.Parse(eventDateLayout, "2020/11/06_18:17:25_455")
if err != nil {
log.Fatal(err)
}
fmt.Println(eventCheckDate)
}
Result:
2009/11/10 23:00:00 parsing time "2020/11/06_18:17:25_455" as
"2006/01/02_15:04:05_000": cannot parse "455" as "_000"
As I understand underscore has a special meaning in Golang, but from documentation it's not clear how to escape it.
Any ideas, please?
It doesn't seem possible to use any escape characters for the time layout (e.g. "\\_" doesn't work), so one would have to do something different.
This issue describes the same problem, but it was solved in a very non-general way that doesn't seem to apply to your format.
So your best bet seems to be replacing _ with something else (or stripping it from the string), then using a layout without it. To make sure that the millisecond part is also parsed, it must be separated with a . instead of _; it is then recognized as part of the seconds (05) field.
eventDateLayout := "2006/01/02.15:04:05"
val := strings.Replace("2020/11/06_18:17:25_455", "_", ".", 2)
eventCheckDate, err := time.Parse(eventDateLayout, val)
if err != nil {
panic(err)
}
fmt.Println(eventCheckDate)
From time.Format
A fractional second is represented by adding a period and zeros to the
end of the seconds section of layout string, as in "15:04:05.000" to
format a time stamp with millisecond precision.
You cannot specify millisecond precision with an underscore; you need 05.000 instead:
// eventDateLayout := "2006/01/02_15:04:05_000" // invalid format
eventDateLayout := "2006/01/02_15:04:05.000"
eventCheckDate, err := time.Parse(eventDateLayout, "2020/11/06_18:17:25.455")
So basically, use a simple translation step to convert the final _ to a . and then parse with the layout above.
https://play.golang.org/p/POPgXC_qe81
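One way to write that translation step, as a small sketch: parseLogTime is a hypothetical helper name, and it assumes the milliseconds always follow the last underscore in the input.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseLogTime (hypothetical helper) turns the last "_" into "." so the
// milliseconds become a fractional-second field, then parses the result.
func parseLogTime(s string) (time.Time, error) {
	const layout = "2006/01/02_15:04:05.000"
	i := strings.LastIndex(s, "_")
	if i < 0 {
		return time.Time{}, fmt.Errorf("no underscore in %q", s)
	}
	return time.Parse(layout, s[:i]+"."+s[i+1:])
}

func main() {
	ts, err := parseLogTime("2020/11/06_18:17:25_455")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ts.Format("2006-01-02 15:04:05.000")) // 2020-11-06 18:17:25.455
}
```

Only the final underscore is rewritten, so the one between the date and the time can stay in the layout as a literal.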
I have a list downloaded from a website in XML. I am trying to filter the list by discarding lines that contain a certain string and building the same type of list without the lines containing the string.
I have a struct type that contains another struct.
I'm trying to use regexp and ReplaceAll, but the ReplaceAll call fails.
func (*Regexp) ReplaceAll
func (re *Regexp) ReplaceAll(src, repl []byte) []byte
There may be an entirely simpler way to filter a list into a new list that I'm missing, but this is the closest I've found so far. Please share other ways to grep and delete lines into a new list. The list is downloaded as XML and held as a byte slice in body.
type PeopleList struct {
Peoples []Person `xml:"peoples>person"`
}
type Person struct {
ADD string `xml:"add,attr"`
Loc string `xml:"loc,attr"`
Har string `xml:"har,attr"`
Name string `xml:"name,attr"`
Country string `xml:"country,attr"`
Num string `xml:"num,attr"`
ADD2 string `xml:"add2,attr"`
Distance float64
}
func fetchPeopleList(userinfo Userinfo) PeopleList {
var p byte
jam, err := http.Get(string(peoplelisturl))
iferror (err)
body, err := ioutil.ReadAll(jam.Body)
peeps := body
reg := regexp.MustCompile("(?m)[\r\n]+^.*BAD:.*$")
rep := reg.ReplaceAll(peeps, p) // Here fails probably because of my syntax. Error: cannot use p (variable of type byte) as []byte value in argument to re.ReplaceAll
fmt.Println(rep)
iferror (err)
defer jam.Body.Close()
Finally, I would like a new list in the same format as the first, only without the lines containing the string.
Your question says you want to "discard lines", but Replace/ReplaceAll, as their names suggest, are for replacing matched patterns. Your regex is also a simple substring match, so the obvious solution would seem to be reading the file line by line and - as your title says - discarding lines containing the substring.
func fetchPeopleList(userinfo Userinfo) PeopleList {
jam, err := http.Get(string(peoplelisturl))
iferror (err)
br := bufio.NewReader(jam.Body)
defer jam.Body.Close()
for {
line,err := br.ReadString('\n')
if !strings.Contains(line, "BAD:") {
fmt.Print(line) // line keeps its trailing '\n'; or do whatever you want with non-discarded lines
}
if err != nil {
break
}
}
I have a very simple Markdown app in Go which works great, but I am really struggling to sort the index posts on the page and would like a neat way to do this from within the file. Any help appreciated.
The HTML is:
<section>
{{range .}}
<h2 class="h2_home">{{.Title}} ({{.Date}})</h2>
<p>{{.Summary}}</p>
{{end}}
</section>
and the Go code for the index page is as follows:
func getPosts() []Post {
a := []Post{}
files, _ := filepath.Glob("posts/*")
for _, f := range files {
file := strings.Replace(f, "posts/", "", -1)
file = strings.Replace(file, ".md", "", -1)
fileread, _ := ioutil.ReadFile(f)
lines := strings.Split(string(fileread), "\n")
title := string(lines[0])
date := string(lines[1])
summary := string(lines[2])
body := strings.Join(lines[3:len(lines)], "\n")
htmlBody := template.HTML(blackfriday.MarkdownCommon([]byte(body)))
a = append(a, Post{title, date, summary, htmlBody, file, nil})
}
return a
}
I've not looked at it for a while as it just works, but I really want to put something into the file to support ordering. The .md file is formatted like this:
Hello Go lang markdown blog generator!
12th Jan 2015
This is a basic start to my own hosted go lang markdown static blog/ web generator.
### Here I am...
This entry is a no whistles Hello ... etc
See sort.Slice in the sort package. Here is an example of usage.
For your particular problem, you have a slice a []Post, so just call sort.Slice on it like this:
sort.Slice(a, func(i, j int) bool { return a[i].Title < a[j].Title })
This sorts the elements of the slice a by title. (The template's {{.Title}} implies the field is exported as Title; adjust the name if your struct differs.)
If you do that your slice a of posts will be sorted by title (or any other criteria you wish, perhaps you could give the user a choice of title or date?). You don't need to adjust your markdown files.
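If you want to sort on dates, your Post should first parse them with the time package so that you have real dates, not just strings.
So something like this (assuming a time.Time field named Date on the Post):
// parse dates; Go layouts cannot express ordinal suffixes, so "2nd Jan 2006"
// would only match days written with "nd". Drop the suffix from the day first.
day, rest, _ := strings.Cut(lines[1], " ")
date, err := time.Parse("2 Jan 2006", strings.TrimRight(day, "stndrh")+" "+rest)
if err != nil {
// deal with it
}
// later, sort the slice
sort.Slice(a, func(i, j int) bool { return a[i].Date.Before(a[j].Date) })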
I'm writing a simple program that takes input from a form, populates an instance of a struct with the received data, and then writes this data to a file.
I'm a bit stuck at the moment with figuring out the best way to iterate over the populated struct and write its contents to the file.
The struct in question contains 3 different types of fields (ints, strings, []strings).
I can iterate over them but I am unable to get their actual type.
Inspecting my posted code below with print statements reveals that each of their types is coming back as structs rather than the aforementioned string, int etc.
The desired output format is plain text.
For example:
field_1="value_1"
field_2=10
field_3=["a", "b", "c"]
Anyone have any ideas? Perhaps I'm going about this the wrong way entirely?
func (c *Config) writeConfigToFile(file *os.File) {
listVal := reflect.ValueOf(c)
element := listVal.Elem()
for i := 0; i < element.NumField(); i++ {
field := element.Field(i)
myType := reflect.TypeOf(field)
if myType.Kind() == reflect.Int {
file.Write(field.Bytes())
} else {
file.WriteString(field.String())
}
}
}
Instead of using the Bytes method on reflect.Value, which does not work the way you intended, you can use either the strconv package or the fmt package to format your fields.
Here's an example using fmt:
var s string
switch fi.Kind() {
case reflect.String:
s = fmt.Sprintf("%q", fi.String())
case reflect.Int:
s = fmt.Sprintf("%d", fi.Int())
case reflect.Slice:
if fi.Type().Elem().Kind() != reflect.String {
continue
}
s = "["
for j := 0; j < fi.Len(); j++ {
s = fmt.Sprintf("%s%q, ", s, fi.Index(j).String()) // index with j, the inner loop variable
}
s = strings.TrimRight(s, ", ") + "]"
default:
continue
}
sf := rv.Type().Field(i)
if _, err := fmt.Fprintf(file, "%s=%s\n", sf.Name, s); err!= nil {
panic(err)
}
Playground: https://play.golang.org/p/KQF3CicVzA
Why not use the built-in gob package to store your struct values?
I use it to store different structures, one per line, in files. During decoding, you can test the type conversion or provide a hint in a wrapper - whichever is faster for your given use case.
You'd treat each line as a buffer when encoding, and decode it when reading the line back. You can even gzip/zlib/compress, encrypt/decrypt, etc. the stream in real time.
No point in re-inventing the wheel when you have a polished and armorall'd wheel already at your disposal.
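A minimal round trip with encoding/gob might look like this. The Config type is a hypothetical stand-in and roundTrip is an illustrative helper. Note that gob is a binary encoding, so the stored data will not be the human-readable key=value text the question asks for.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Config is a hypothetical stand-in for the struct in the question.
type Config struct {
	Name  string
	Port  int
	Hosts []string
}

// roundTrip (hypothetical helper) encodes in to a buffer and decodes it
// back, standing in for writing to and reading from a file.
func roundTrip(in Config) (Config, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return Config{}, err
	}
	var out Config
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Config{Name: "value_1", Port: 10, Hosts: []string{"a", "b", "c"}})
	if err != nil {
		fmt.Println("gob error:", err)
		return
	}
	fmt.Printf("%+v\n", out)
}
```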
I am creating an IRC bot using Go as a first project to get to grips with the language. One of the bot functions is to grab data from the TVmaze API and display in the channel.
I have imported an env package which allows the bot admin to define how the output is displayed.
For example SHOWSTRING="#showname# - #status# – #network.name#"
I am trying to add functionality to it so that the admin can use IRC formatting functionality which is accessed with \u0002 this is bold \u0002 for example.
I have a function which generates the string that is being returned and displayed in the channel.
func generateString(show Show) string {
str := os.Getenv("SHOWSTRING")
r := strings.NewReplacer(
"#ID#", string(show.ID),
"#showname#", show.Name,
"#status#", show.Status,
"#network.name#", show.Network.Name,
)
result := r.Replace(str)
return result
}
From what I have read, I think that I need to use the rune data type instead of string and then convert the runes into a string before output.
I am using the https://github.com/thoj/go-irceven package for interacting with IRC.
Although I think that using rune is the correct way to go, I have tried a few things that have confused me.
If I add \u0002 to the SHOWSTRING from the env, it returns \u0002House\u0002 - Ended - Fox. I am doing this with con.Privmsg(roomName, tvmaze.ShowLookup("house"))
However, if I try con.Privmsg(roomName, "\u0002This should be bold\u0002"), it outputs bold text.
What is the best option here? If it is converting the string into runes and then back to a string, how do I go about doing that?
I needed to use strconv.Unquote() on my return in the function.
The new generateString function now outputs the correct string and looks like this
func generateString(show Show) string {
str := os.Getenv("SHOWSTRING")
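r := strings.NewReplacer(
// Assuming show.ID is an int: string(show.ID) would yield the rune with that code point, not its decimal digits.
"#ID#", strconv.Itoa(show.ID),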
"#showname#", show.Name,
"#status#", show.Status,
"#network.name#", show.Network.Name,
)
result := r.Replace(str)
ret, err := strconv.Unquote(`"` + result + `"`)
if err != nil {
fmt.Println("Error unquoting the string")
}
return ret
}
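The reason this works: the environment variable holds the six literal characters \u0002, and strconv.Unquote interprets Go escape sequences once the text is wrapped in double quotes. A small sketch follows, where unescape is a hypothetical helper name. Unquote returns an error if the text itself contains an unescaped double quote or a newline, so falling back to the raw string (an added choice here, not part of the answer above) avoids sending an empty message.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape (hypothetical helper) interprets Go-style escapes such as
// \u0002 in s by wrapping it in double quotes and unquoting it.
func unescape(s string) (string, error) {
	return strconv.Unquote(`"` + s + `"`)
}

func main() {
	raw := `\u0002House\u0002 - Ended` // as it would arrive from the environment
	out, err := unescape(raw)
	if err != nil {
		fmt.Println("unquote failed, using raw string:", err)
		out = raw
	}
	// %q makes the control characters visible when printed.
	fmt.Printf("%q\n", out)
}
```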