How to discard lines with certain strings in list and output to new list - go

I have a list downloaded from a website as XML. I am trying to filter it by discarding the lines that contain a certain string and building a list of the same type without those lines.
I have a struct type that contains another struct.
I'm trying to use regexp and ReplaceAll, and failing at ReplaceAll.
func (*Regexp) ReplaceAll
func (re *Regexp) ReplaceAll(src, repl []byte) []byte
There may well be a simpler way to filter a list into a new list that I'm missing, but this is the closest solution I've found so far. Please share other ways to grep and delete lines into a new list. The list is a []byte in body, downloaded as XML.
type PeopleList struct {
Peoples []Person `xml:"peoples>person"`
}
type Person struct {
ADD string `xml:"add,attr"`
Loc string `xml:"loc,attr"`
Har string `xml:"har,attr"`
Name string `xml:"name,attr"`
Country string `xml:"country,attr"`
Num string `xml:"num,attr"`
ADD2 string `xml:"add2,attr"`
Distance float64
}
func fetchPeopleList(userinfo Userinfo) PeopleList {
var p byte
jam, err := http.Get(string(peoplelisturl))
iferror (err)
body, err := ioutil.ReadAll(jam.Body)
peeps := body
reg := regexp.MustCompile("(?m)[\r\n]+^.*BAD:.*$")
rep := reg.ReplaceAll(peeps, p) // Here fails probably because of my syntax. Error: cannot use p (variable of type byte) as []byte value in argument to re.ReplaceAll
fmt.Println(rep)
iferror (err)
defer jam.Body.Close()
Finally, I would like a new list in the same format as the first, only without the lines containing the string.

Your question says you want to "discard lines", but Replace/ReplaceAll, as their names suggest, are for replacing matched patterns. Your regex is also a simple substring match, so the obvious solution would seem to be reading the file line by line and - as your title says - discarding lines containing the substring.
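func fetchPeopleList(userinfo Userinfo) PeopleList {
	jam, err := http.Get(string(peoplelisturl))
	iferror(err)
	defer jam.Body.Close()
	br := bufio.NewReader(jam.Body)
	for {
		// ReadString returns the line including its trailing '\n'.
		line, err := br.ReadString('\n')
		if !strings.Contains(line, "BAD:") {
			fmt.Print(line) // or whatever you want to do with non-discarded lines
		}
		if err != nil {
			break // io.EOF after the last line, or a read error
		}
	}
	return PeopleList{} // placeholder: build the list from the kept lines
}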
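If the goal is a filtered []byte that can still be handed to an XML decoder, the same line-by-line idea can be sketched roughly as below. The name filterLines and the sample input are made up for illustration, and the sketch assumes each record sits on its own line, as the question implies.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// filterLines (hypothetical helper) returns a copy of body without the
// lines that contain substr. Every kept line is written back with '\n'.
func filterLines(body []byte, substr string) ([]byte, error) {
	var out bytes.Buffer
	sc := bufio.NewScanner(bytes.NewReader(body))
	for sc.Scan() {
		line := sc.Bytes()
		if bytes.Contains(line, []byte(substr)) {
			continue // discard this line
		}
		out.Write(line)
		out.WriteByte('\n')
	}
	return out.Bytes(), sc.Err()
}

func main() {
	// Made-up sample data standing in for the downloaded body.
	body := []byte("<peoples>\n<person name=\"a\"/>\n<person name=\"BAD: b\"/>\n</peoples>\n")
	kept, err := filterLines(body, "BAD:")
	if err != nil {
		panic(err)
	}
	fmt.Print(string(kept))
}
```

The returned bytes can then be passed to xml.Unmarshal in place of the original body. Note that bufio.Scanner rejects lines longer than 64 KiB by default; for longer lines, the ReadString loop above is the safer choice.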

Related

Alternative to using strings.Builder in conjunction with fmt.Sprintf

I am learning about the strings package in Go and I am trying to build up a simple error message.
I read that strings.Builder is a very efficient way to join strings, and that fmt.Sprintf lets me do some string interpolation.
With that said, I want to understand the best way to join a lot of strings together. For example here is what I create:
func generateValidationErrorMessage(err error) string {
errors := []string{}
for _, err := range err.(validator.ValidationErrors) {
var b strings.Builder
b.WriteString(fmt.Sprintf("[%s] failed validation [%s]", err.Field(), err.ActualTag()))
if err.Param() != "" {
b.WriteString(fmt.Sprintf("[%s]", err.Param()))
}
errors = append(errors, b.String())
}
return strings.Join(errors, "; ")
}
Is there another/better way to do this? Is using s1 + s2 considered worse?
You can use fmt to print directly to the strings.Builder. Use fmt.Fprintf(&builder, "format string", args).
The fmt functions beginning with Fprint..., meaning "file print", allow you to print to any io.Writer, such as an *os.File or a *strings.Builder.
Also, rather than using multiple builders and joining all their strings at the end, just use a single builder and keep writing to it. If you want to add a separator, you can do so easily within the loop:
var builder strings.Builder
for i, v := range values {
if i > 0 {
// unless this is the first item, add the separator before it.
fmt.Fprint(&builder, "; ")
}
fmt.Fprintf(&builder, "some format %v", v)
}
var output = builder.String()
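Applied to the validation message from the question, the single-builder approach might look like the sketch below. The fieldError struct is a hypothetical stand-in for the validator library's error type, used only so that the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldError is a hypothetical stand-in for one validation failure.
type fieldError struct {
	Field, Tag, Param string
}

// formatErrors writes every message into one builder, separated by "; ".
func formatErrors(errs []fieldError) string {
	var b strings.Builder
	for i, e := range errs {
		if i > 0 {
			b.WriteString("; ")
		}
		fmt.Fprintf(&b, "[%s] failed validation [%s]", e.Field, e.Tag)
		if e.Param != "" {
			fmt.Fprintf(&b, "[%s]", e.Param)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(formatErrors([]fieldError{
		{Field: "Email", Tag: "required"},
		{Field: "Age", Tag: "min", Param: "18"},
	}))
}
```

As for s1 + s2: for a handful of short strings it is perfectly readable, and the builder mostly pays off when appending repeatedly in a loop.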

Get data from Twitter Library search into a struct in Go

How do I append the output of a Twitter search to the Data field of the SearchTwitterOutput{} struct?
Thanks!
I am using a Twitter library to search Twitter based on a query input. The search returns an array of strings (I believe). I am able to fmt.Println the data, but I need the data as a struct.
type SearchTwitterOutput struct {
Data string
}
func (SearchTwitter) execute(input SearchTwitterInput) (*SearchTwitterOutput, error) {
credentials := Credentials{
AccessToken: input.AccessToken,
AccessTokenSecret: input.AccessTokenSecret,
ConsumerKey: input.ConsumerKey,
ConsumerSecret: input.ConsumerSecret,
}
client, err := GetUserClient(&credentials)
if err != nil {
return nil, err
}
// search through the tweet and returns a
search, _ , err := client.Search.Tweets(&twitter.SearchTweetParams{
Query: input.Text,
})
if err != nil {
println("PANIC")
panic(err.Error())
return &SearchTwitterOutput{}, err
}
for k, v := range search.Statuses {
fmt.Printf("Tweet %d - %s\n", k, v.Text)
}
return &SearchTwitterOutput{
Data: "test", //data is a string for now it can be anything
}, nil
}
//Data field is a string type for now it can be anything
//I use "test" as a placeholder, bc IDK...
Result from fmt.Printf("Tweet %d - %s\n", k, v.Text):
Tweet 0 - You know I had to do it to them! #JennaJulien #Jenna_Marbles #juliensolomita #notjulen Got my first hydroflask ever…
Tweet 1 - RT #brenna_hinshaw: I was in J2 today and watched someone fill their hydroflask with vanilla soft serve... what starts here changes the wor…
Tweet 2 - I miss my hydroflask :(
This is my second week working with go and new to development. Any help would be great.
It doesn't look like the client is just returning you a slice of strings. The range syntax you're using (for k, v := range search.Statuses) returns two values for each iteration, the index in the slice (in this case k), and the object from the slice (in this case v). I don't know the type of search.Statuses - but I know that strings don't have a .Text field or method, which is how you're printing v currently.
To your question:
Is there any particular reason to return just a single struct with a Data field rather than directly returning the output of the twitter client?
Your function signature could look like this instead:
func (SearchTwitter) execute(input SearchTwitterInput) ([]<client response struct>, error)
And then you could operate on the text in those objects wherever this function is called.
If you're dead-set on placing the data in your own struct, you could return a slice of them ([]*SearchTwitterOutput), in which case you could build a single SearchTwitterOutput in the for loop you're currently printing the tweets in and append it to the output list. That might look like this:
var output []*SearchTwitterOutput
for k, v := range search.Statuses {
fmt.Printf("Tweet %d - %s\n", k, v.Text)
output = append(output, &SearchTwitterOutput{
Data: v.Text,
})
}
return output, nil
But if your goal really is to return all of the results concatenated together and placed inside a single struct, I would suggest building a slice of strings (containing the text you want), and then joining them with the delimiter of your choosing. Then you could place the single output string in your return object, which might look something like this:
var outputStrings []string
for k, v := range search.Statuses {
fmt.Printf("Tweet %d - %s\n", k, v.Text)
outputStrings = append(outputStrings, v.Text)
}
output := strings.Join(outputStrings, ",")
return &SearchTwitterOutput{
Data: output,
}, nil
Though I would caution that it might be tricky to find a delimiter that will never show up in a tweet.
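Since the question says Data "can be anything", one way around the delimiter problem is to make Data a slice of strings. This is only a sketch: the tweet struct is a hypothetical stand-in for whatever element type search.Statuses has, and changing Data to []string is an assumption about what the caller can accept.

```go
package main

import "fmt"

// tweet is a hypothetical stand-in for the elements of search.Statuses.
type tweet struct {
	Text string
}

// SearchTwitterOutput holds one entry per tweet (Data changed to []string
// for this illustration).
type SearchTwitterOutput struct {
	Data []string
}

// collectTexts copies the text of every status into the output struct.
func collectTexts(statuses []tweet) *SearchTwitterOutput {
	out := &SearchTwitterOutput{Data: make([]string, 0, len(statuses))}
	for _, s := range statuses {
		out.Data = append(out.Data, s.Text)
	}
	return out
}

func main() {
	out := collectTexts([]tweet{
		{Text: "I miss my hydroflask :("},
		{Text: "commas, semicolons; both are fine here"},
	})
	for i, text := range out.Data {
		fmt.Printf("Tweet %d - %s\n", i, text)
	}
}
```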

How can I convert a JSON string to a byte array?

I need some help with unmarshaling. I have this example code:
package main
import (
"encoding/json"
"fmt"
)
type Obj struct {
Id string `json:"id"`
Data []byte `json:"data"`
}
func main() {
byt := []byte(`{"id":"someID","data":["str1","str2"]}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Println(obj)
}
What I'm trying to do here is convert the bytes to a struct in which one field has type []byte. The error I get:
panic: json: cannot unmarshal string into Go struct field Obj.data of type uint8
That's probably because the parser sees that the "data" field is a slice and tries to store "str1" as a single byte (type uint8?).
How do I store the whole data value as one byte array? I want to unmarshal the value to a slice of strings later. I don't put a slice of strings in the struct because this type can change (array of strings, int, string, etc.); I want this to be universal.
My first recommendation would be for you to just use []string instead of []byte if you know the input type is going to be an array of strings.
If data is going to be a JSON array with various types, then your best option is to use []interface{} instead - Go will happily unmarshal the JSON for you and you can perform checks at runtime to cast those into more specific typed variables on an as-needed basis.
If []byte really is what you want, use json.RawMessage, which is of type []byte, but also implements the methods for JSON parsing. I believe this may be what you want, as it will accept whatever ends up in data. Of course, you then have to manually parse Data to figure out just what actually IS in there.
One possible bonus is that this skips any heavy parsing because it just copies the bytes over. When you want to use this data for something, you use a []interface{}, then use a type switch to use individual values.
https://play.golang.org/p/og88qb_qtpSGJ
package main
import (
"encoding/json"
"fmt"
)
type Obj struct {
Id string `json:"id"`
Data json.RawMessage `json:"data"`
}
func main() {
byt := []byte(`{"id":"someID","data":["str1","str2", 1337, {"my": "obj", "id": 42}]}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Printf("%+v\n", obj)
fmt.Printf("Data: %s\n", obj.Data)
// use it
var d []interface{}
if err := json.Unmarshal(obj.Data, &d); err != nil {
panic(err)
}
fmt.Printf("%+v\n", d)
for _, v := range d {
// you need a type switch to determine the type and be able to use most of these
switch real := v.(type) {
case string:
fmt.Println("I'm a string!", real)
case float64:
fmt.Println("I'm a number!", real)
default:
fmt.Printf("Unaccounted for: %+v\n", v)
}
}
}
Your question is:
convert bytes array to struct with a field of type []byte
But you do not have a byte array; you have a string array. Your question is not the same as your example. So let me answer your question; more than one solution is possible, depending on how far you want to diverge from your original requirements.
One string can be converted to one byte slice; two strings first need to be combined into one string. That is problem one. The second problem is the square brackets in your JSON string.
This unmarshals without error, but note that encoding/json treats a JSON string destined for a []byte field as base64 and decodes it, so obj.Data holds the decoded bytes rather than the literal characters of the string:
byt := []byte(`{"id":"someID","data":"str1str2"}`)
var obj Obj
if err := json.Unmarshal(byt, &obj); err != nil {
panic(err)
}
fmt.Println(obj)
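One detail worth checking before relying on a []byte field: encoding/json expects the JSON string to be base64 and decodes it during Unmarshal. The sketch below illustrates this; the decode helper and the sample payloads are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Obj struct {
	Id   string `json:"id"`
	Data []byte `json:"data"`
}

// decode is a small helper (hypothetical name) around json.Unmarshal.
func decode(raw string) (Obj, error) {
	var obj Obj
	err := json.Unmarshal([]byte(raw), &obj)
	return obj, err
}

func main() {
	// "c3RyMQ==" is the base64 encoding of "str1".
	obj, err := decode(`{"id":"someID","data":"c3RyMQ=="}`)
	fmt.Printf("%q %v\n", obj.Data, err)

	// A string that is not valid base64 makes Unmarshal return an error.
	_, err = decode(`{"id":"someID","data":"not base64!"}`)
	fmt.Println(err != nil)
}
```

If the raw characters of the string are what you need, a string field or json.RawMessage is usually the better fit.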

how to insert an array in sqlite?

I have struct like:
type Foo struct {
bars []string
}
Since sqlite3 doesn't support an array data type, can we store a []string as a string and return it as a slice of strings when retrieving? I was trying to implement it as below, but was getting an error because of a type mismatch. What needs to be done here?
Edit: I have changed the code and it looks like it is working.
type strArray []string
func (strarr strArray) Value() (driver.Value, error) {
if strarr != nil {
resarr := strings.Join(strarr, "")
return resarr, nil
}
return nil, nil
}
Complementary to database/sql/driver.Valuer, you also need to implement database/sql.Scanner to read your type from the database.
When you think about how to implement it, it's clear that in the Valuer you should Join your slice with some delimiter character or string (one that does not occur in the data, of course) so that you can Split it back when retrieving.
Assuming the delimiter is ; (my wild guess), the code for reading would look like:
func (a *strArray) Scan(value interface{}) error {
if value == nil {
return nil // case when value from the db was NULL
}
s, ok := value.(string)
if !ok {
return fmt.Errorf("failed to cast value to string: %v", value)
}
*a = strings.Split(s, ";")
return nil
}
For writing, you'd need to use strings.Join(strarr, ";") in the Valuer implementation.
A less trivial implementation would marshal the slice and encode the resulting bytes as a string somehow (base32/64? JSON?). In any case, you must not lose the information about which slice elements are distinct when saving them to the database.
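The JSON variant could be sketched as follows. It avoids picking a delimiter, because JSON escapes the element contents. Accepting both string and []byte in Scan is an assumption made here, since drivers differ in how they hand back text columns.

```go
package main

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

type strArray []string

// Value stores the slice as a JSON array in a single text value.
func (a strArray) Value() (driver.Value, error) {
	if a == nil {
		return nil, nil
	}
	b, err := json.Marshal([]string(a))
	if err != nil {
		return nil, err
	}
	return string(b), nil
}

// Scan restores the slice from the stored JSON text.
func (a *strArray) Scan(value interface{}) error {
	var raw []byte
	switch v := value.(type) {
	case nil:
		*a = nil // NULL in the database
		return nil
	case string:
		raw = []byte(v)
	case []byte:
		raw = v
	default:
		return fmt.Errorf("unsupported type %T for strArray", value)
	}
	return json.Unmarshal(raw, (*[]string)(a))
}

func main() {
	in := strArray{"a;b", "c"}
	stored, err := in.Value()
	if err != nil {
		panic(err)
	}
	var out strArray
	if err := out.Scan(stored); err != nil {
		panic(err)
	}
	fmt.Println(stored, len(out))
}
```

The element "a;b" survives the round trip even though it contains the delimiter used in the Split-based version.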

Is there a way of cleaning up this Go code?

I am just beginning to learn Go, and have written a function which parses markdown files with a header containing some metadata (the files are blog posts).
Here is an example:
---
Some title goes here
19 September 2012
---
This is some content, read it.
I've written this function, which works, but I feel it's quite verbose and messy. I've had a look at the various strings packages, but I don't know enough about Go and its best practices to know what I should be doing differently. If I could get some tips to clean this up, I would appreciate it. (Also, I know that I shouldn't be neglecting that error.)
type Post struct {
Title string
Date string
Body string
}
func loadPost(title string) *Post {
filename := title + ".md"
file, _ := ioutil.ReadFile("posts/" + filename)
fileString := string(file)
str := strings.Split(fileString, "---")
meta := strings.Split(str[1], "\n")
title = meta[1]
date := meta[2]
body := str[2]
return &Post{Title: title, Date: date, Body: body}
}
I think it's not bad. A couple of suggestions:
The hard-coded slash in "posts/" is platform-dependent. You can use path/filepath.Join to avoid that.
There is bytes.Split, so you don't need the string(file).
You can create the Post without repeating the fields: &Post{title, date, body}
Alternatively, you could find out where the body starts with strings.LastIndex(s, "--") and use that to index the file contents accordingly. This avoids the allocation from calling Split on the whole file.
const sep = "--"
func loadPost(content string) *Post {
sepLength := len(sep)
i := strings.LastIndex(content, sep)
headers := content[sepLength:i]
body := content[i+sepLength+1:]
meta := strings.Split(headers, "\n")
return &Post{meta[1], meta[2], body}
}
I agree that it's not bad. I'll add a couple of other ideas.
As Thomas showed, you don't need the intermediate variables title, date, and body. Try this, though:
return &Post{
Title: meta[1],
Date: meta[2],
Body: body,
}
It's true that you can leave the field names out, but I sometimes like them to keep the code self-documenting. (I think go vet likes them too.)
I fuss over strings versus byte slices, but probably more than I should. Since you're reading the file in one gulp, you probably don't need to worry about this. Converting everything to one big string and then slicing up the string is a handy way of doing things, just remember that you're pinning the entire string in memory if you keep any part of it. If your files are large or you have lots of them and you only end up keeping, say, the meta for most of them, this might not be the way to go.
There's just one blog entry per file? If so, I think I'll propose a variant of Thomas's suggestion. Verify the first bytes are --- (or your file is corrupt), then use strings.Index(fileString[3:], "---"). Split is more appropriate when you have an unknown number of segments. In your case you're just looking for that single separator after the meta. Index will find it after searching the meta and be done, without searching through the whole body. (And anyway, what if the body contained the string "---"?)
Finally, some people would use regular expressions for this. I still haven't warmed up to regular expressions, but anyway, it's another approach.
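The prefix check followed by a single Index call could be sketched like this. The function name parsePost, the error messages, and the trimming of surrounding whitespace are choices made for this example rather than anything taken from the original code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const sep = "---"

type Post struct {
	Title, Date, Body string
}

// parsePost expects "---", two header lines, "---", then the body.
func parsePost(s string) (*Post, error) {
	if !strings.HasPrefix(s, sep) {
		return nil, errors.New("missing opening separator")
	}
	rest := s[len(sep):]
	// Index stops at the first separator after the header, so a "---"
	// inside the body is left alone.
	end := strings.Index(rest, sep)
	if end < 0 {
		return nil, errors.New("missing closing separator")
	}
	meta := strings.Split(strings.TrimSpace(rest[:end]), "\n")
	if len(meta) < 2 {
		return nil, errors.New("header needs a title line and a date line")
	}
	return &Post{
		Title: strings.TrimSpace(meta[0]),
		Date:  strings.TrimSpace(meta[1]),
		Body:  strings.TrimSpace(rest[end+len(sep):]),
	}, nil
}

func main() {
	post, err := parsePost("---\nSome title goes here\n19 September 2012\n---\nThis is some content --- read it.\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *post)
}
```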
Sonia has some great suggestions. Below is my take which accounts for problems you might encounter when parsing the header.
http://play.golang.org/p/w-XYyhPj9n
package main
import (
"fmt"
"strings"
)
const sep = "---"
type parseError struct {
msg string
}
func (e *parseError) Error() string {
return e.msg
}
func parse(s string) (header []string, content string, err error) {
if !strings.HasPrefix(s, sep) {
return header, content, &parseError{"content does not start with `---`!"}
}
arr := strings.SplitN(s, sep, 3)
if len(arr) < 3 {
return header, content, &parseError{"header was not terminated with `---`!"}
}
header = strings.Split(strings.TrimSpace(arr[1]), "\n")
content = strings.TrimSpace(arr[2])
return header, content, nil
}
func main() {
//
f := `---
Some title goes here
19 September 2012
---
This is some content, read it. --Anonymous`
header, content, err := parse(f)
if err != nil {
panic(err)
}
for i, val := range header {
fmt.Println(i, val)
}
fmt.Println("---")
fmt.Println(content)
//
f = `---
Some title goes here
19 September 2012
This is some content, read it.`
_, _, err = parse(f)
fmt.Println("Error:", err)
//
f = `
Some title goes here
19 September 2012
---
This is some content, read it.`
_, _, err = parse(f)
fmt.Println("Error:", err)
}
