Golang index an array of strings - go

Hi, I have a string that contains this:
"Style: Saison
ABV: 7.7
IBU: 20"
I am trying to split it into an array so that I can get Saison.
Here is how I convert it to an array:
style := strings.Split(style, "Style:")
When I do
style[0]
I don't get Saison. I also tried style[1] and style[2] and nothing happens. What am I doing wrong?
style is a []string, so it is a list of strings, right?

You could use strings.FieldsFunc:
FieldsFunc splits the string s at each run of Unicode code points c
satisfying f(c) and returns an array of slices of s. If all code
points in s satisfy f(c) or the string is empty, an empty slice is
returned.
FieldsFunc makes no guarantees about the order in which it calls f(c)
and assumes that f always returns the same value for a given c.
package main
import (
"fmt"
"strconv"
"strings"
)
func main() {
str := `Style: Saison Drink
ABV: 7.7
IBU: 20`
f := func(c rune) bool {
return c == ':' || c == '\n'
}
strFields := strings.FieldsFunc(str, f)
fmt.Printf("%q\n", strFields)
styleValue := strings.TrimSpace(strFields[1])
fmt.Println(styleValue)
abvValue, err := strconv.ParseFloat(strings.TrimSpace(strFields[3]), 32)
if err != nil {
fmt.Println("Error parsing float!")
}
fmt.Printf("%.2f\n", abvValue)
ibuValue, err := strconv.ParseInt(strings.TrimSpace(strFields[5]), 10, 32)
if err != nil {
fmt.Println("Error parsing int!")
}
fmt.Printf("%d\n", ibuValue)
}
Output:
["Style" " Saison Drink" "ABV" " 7.7" "IBU" " 20"]
Saison Drink
7.70
20
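As for why the original indexing appeared to do nothing: strings.Split(s, "Style:") cuts the string at the separator, so element 0 is the empty string in front of "Style:" and element 1 is everything after it, including the ABV and IBU lines. There is no element 2. A minimal sketch of a line-by-line alternative, assuming every line has the form Key: Value (parseFields is a hypothetical helper name, and strings.Cut needs Go 1.18 or later):

```go
package main

import (
	"fmt"
	"strings"
)

// parseFields is a hypothetical helper: it splits the input into lines and
// each line at the first ':' into a trimmed key and a trimmed value.
func parseFields(s string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(s, "\n") {
		key, value, found := strings.Cut(line, ":")
		if !found {
			continue // skip lines that are not "Key: Value"
		}
		fields[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return fields
}

func main() {
	str := "Style: Saison\nABV: 7.7\nIBU: 20"

	// What the original Split call actually returns.
	parts := strings.Split(str, "Style:")
	fmt.Printf("%q\n", parts) // ["" " Saison\nABV: 7.7\nIBU: 20"]

	fields := parseFields(str)
	fmt.Println(fields["Style"]) // Saison
}
```

Looking values up by key avoids depending on the position of each field in the input.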

Related

How to convert strings to lower case in Go?

I am new to Go and working on an assignment where I should write code that returns the word frequencies of a text. The words 'Hello', 'HELLO' and 'hello' should all be counted as 'hello', so I need to convert all strings to lower case.
I know that I should use strings.ToLower(), but I don't know where to put it in my program. Can someone please help me?
package main
import (
"fmt"
"io/ioutil"
"log"
"strings"
"time"
)
const DataFile = "loremipsum.txt"
// Return the word frequencies of the text argument.
func WordCount(text string) map[string]int {
fregs := make(map[string]int)
words := strings.Fields(text)
for _, word := range words {
fregs[word] += 1
}
return fregs
}
// Benchmark how long it takes to count word frequencies in text numRuns times.
//
// Return the total time elapsed.
func benchmark(text string, numRuns int) int64 {
start := time.Now()
for i := 0; i < numRuns; i++ {
WordCount(text)
}
runtimeMillis := time.Since(start).Nanoseconds() / 1e6
return runtimeMillis
}
// Print the results of a benchmark
func printResults(runtimeMillis int64, numRuns int) {
fmt.Printf("amount of runs: %d\n", numRuns)
fmt.Printf("total time: %d ms\n", runtimeMillis)
average := float64(runtimeMillis) / float64(numRuns)
fmt.Printf("average time/run: %.2f ms\n", average)
}
func main() {
// read in DataFile as a string called data
data, err:= ioutil.ReadFile("loremipsum.txt")
if err != nil {
log.Fatal(err)
}
// Convert []byte to string and print to screen
text := string(data)
fmt.Println(text)
fmt.Printf("%#v",WordCount(string(data)))
numRuns := 100
runtimeMillis := benchmark(string(data), numRuns)
printResults(runtimeMillis, numRuns)
}
You should convert the words to lowercase when you use them as map keys:
for _, word := range words {
fregs[strings.ToLower(word)] += 1
}
I get [a:822 a.:110 and I want every a counted together. How do I change the code so that a and a. are treated as the same word? – hello123
You need to carefully define a word. For example, a string of consecutive letters and numbers converted to lowercase.
func WordCount(s string) map[string]int {
wordFunc := func(r rune) bool {
return !unicode.IsLetter(r) && !unicode.IsNumber(r)
}
counts := make(map[string]int)
for _, word := range strings.FieldsFunc(s, wordFunc) {
counts[strings.ToLower(word)]++
}
return counts
}
To remove all non-word characters, you could use a regular expression:
package main
import (
"bufio"
"fmt"
"log"
"regexp"
"strings"
)
func main() {
str1 := "This is some text! I want to count each word. Is it cool?"
re, err := regexp.Compile(`[^\w]`)
if err != nil {
log.Fatal(err)
}
str1 = re.ReplaceAllString(str1, " ")
scanner := bufio.NewScanner(strings.NewReader(str1))
scanner.Split(bufio.ScanWords)
for scanner.Scan() {
fmt.Println(strings.ToLower(scanner.Text()))
}
}
See strings.EqualFold.
Here is an example.
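A small sketch with made-up sample words: strings.EqualFold compares two strings case-insensitively without building a lowercased copy, which suits one-off comparisons. For map keys, as in WordCount, the key still has to be normalized with strings.ToLower. The helper name sameWord is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sameWord reports whether a and b are equal when case is ignored.
func sameWord(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(sameWord("Hello", "HELLO")) // true
	fmt.Println(sameWord("Hello", "help"))  // false
}
```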

How to parse a JSON string returned from scanner.Text() [duplicate]

Objects like the below can be parsed quite easily using the encoding/json package.
[
{"something":"foo"},
{"something-else":"bar"}
]
The trouble I am facing is when there are multiple dicts returned from the server like this:
{"something":"foo"}
{"something-else":"bar"}
This can't be parsed using the code below.
correct_format := strings.Replace(string(resp_body), "}{", "},{", -1)
json_output := "[" + correct_format + "]"
I am trying to parse Common Crawl data (see example).
How can I do this?
Assuming your input is really a series of valid JSON documents, use a json.Decoder to decode them:
package main
import (
"encoding/json"
"fmt"
"io"
"log"
"strings"
)
var input = `
{"foo": "bar"}
{"foo": "baz"}
`
type Doc struct {
Foo string
}
func main() {
dec := json.NewDecoder(strings.NewReader(input))
for {
var doc Doc
err := dec.Decode(&doc)
if err == io.EOF {
// all done
break
}
if err != nil {
log.Fatal(err)
}
fmt.Printf("%+v\n", doc)
}
}
Playground: https://play.golang.org/p/ANx8MoMC0yq
If your input really is what you've shown in the question, that's not JSON and you have to write your own parser.
It seems that each line is its own JSON object.
You may get away with the following code, which restructures the input into valid JSON:
package main
import (
"fmt"
"strings"
)
func main() {
base := `{"trolo":"lolo"}
{"trolo2":"lolo2"}`
delimited := strings.Replace(base, "\n", ",", -1)
final := "[" + delimited + "]"
fmt.Println(final)
}
You should be able to use the encoding/json package on final now.
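A sketch of that last step, assuming each object sits on exactly one line and has string values only (joinLines is a hypothetical helper). Trimming first avoids a trailing comma when the input ends with a newline:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// joinLines is a hypothetical helper that turns newline-separated JSON
// objects into a single JSON array string.
func joinLines(s string) string {
	trimmed := strings.TrimSpace(s)
	return "[" + strings.ReplaceAll(trimmed, "\n", ",") + "]"
}

func main() {
	base := "{\"trolo\":\"lolo\"}\n{\"trolo2\":\"lolo2\"}\n"

	var objs []map[string]string
	if err := json.Unmarshal([]byte(joinLines(base)), &objs); err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	fmt.Println(len(objs), objs[0]["trolo"], objs[1]["trolo2"]) // 2 lolo lolo2
}
```

This does not work for pretty-printed objects that span several lines; a json.Decoder handles those.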
Another option would be to parse the input line by line and add each parsed object to a collection in code (i.e. a slice). Go provides a line scanner for this:
yourCollection := []yourObject{}
scanner := bufio.NewScanner(YOUR_SOURCE)
for scanner.Scan() {
obj, err := PARSE_JSON_INTO_yourObject(scanner.Text())
if err != nil {
// something
}
yourCollection = append(yourCollection, obj)
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading standard input:", err)
}
You can read the NDJSON (newline-delimited JSON) file line by line, parse each line, and then apply your logic to the result. In the sample below I use a slice of JSON strings instead of reading from a file.
package main
import (
"encoding/json"
"fmt"
)
type NestedObject struct {
D string
E string
}
type OuterObject struct {
A string
B string
C []NestedObject
}
func main() {
myJsonString := []string{`{"A":"1","B":"2","C":[{"D":"100","E":"10"}]}`, `{"A":"11","B":"21","C":[{"D":"1001","E":"101"}]}`}
for index, each := range myJsonString {
fmt.Printf("Index value [%d] is [%v]\n", index, each)
var obj OuterObject
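// Skip entries that are not valid JSON instead of silently ignoring the error.
if err := json.Unmarshal([]byte(each), &obj); err != nil {
fmt.Println("unmarshal error:", err)
continue
}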
fmt.Printf("a: %v, b: %v, c: %v", obj.A, obj.B, obj.C)
fmt.Println()
}
}
Output:
Index value [0] is [{"A":"1","B":"2","C":[{"D":"100","E":"10"}]}]
a: 1, b: 2, c: [{100 10}]
Index value [1] is [{"A":"11","B":"21","C":[{"D":"1001","E":"101"}]}]
a: 11, b: 21, c: [{1001 101}]

Using regular expressions in Go to Identify a common pattern

I'm trying to parse this string: goats=1\r\nalligators=false\r\ntext=works.
contents := "goats=1\r\nalligators=false\r\ntext=works"
compile, err := regexp.Compile("([^#\\s=]+)=([a-zA-Z0-9.]+)")
if err != nil {
return
}
matchString := compile.FindAllStringSubmatch(contents, -1)
My output looks like [[goats=1 goats 1] [alligators=false alligators false] [text=works text works]]
What am I doing wrong in my expression that causes goats=1 to be included too? I only want [[goats 1]...]
For another approach, you can use the strings package instead:
package main
import (
"fmt"
"strings"
)
func parse(s string) map[string]string {
m := make(map[string]string)
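for _, kv := range strings.Split(s, "\r\n") {
// Split at the first "=" only, and skip lines without one
// so that a[1] cannot panic.
a := strings.SplitN(kv, "=", 2)
if len(a) != 2 {
continue
}
m[a[0]] = a[1]
}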
return m
}
func main() {
m := parse("goats=1\r\nalligators=false\r\ntext=works")
fmt.Println(m) // map[alligators:false goats:1 text:works]
}
https://golang.org/pkg/strings
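On the regular expression itself: nothing is wrong with the pattern. FindAllStringSubmatch returns, for each match, the whole match at index 0 followed by the capture groups, so goats=1 is the full match and goats and 1 are the groups. A sketch that keeps only the groups (pairs and kvPattern are hypothetical names):

```go
package main

import (
	"fmt"
	"regexp"
)

// kvPattern is the expression from the question, compiled once.
var kvPattern = regexp.MustCompile(`([^#\s=]+)=([a-zA-Z0-9.]+)`)

// pairs returns only the capture groups (key, value) of every match.
func pairs(contents string) [][]string {
	var out [][]string
	for _, m := range kvPattern.FindAllStringSubmatch(contents, -1) {
		out = append(out, m[1:]) // m[0] is the whole match, e.g. "goats=1"
	}
	return out
}

func main() {
	fmt.Println(pairs("goats=1\r\nalligators=false\r\ntext=works"))
	// [[goats 1] [alligators false] [text works]]
}
```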

Handling Unicode in string search

Suppose I have a string containing Unicode characters. For example:
s := "foo 日本 foo!"
I'm trying to find the last occurrence of foo in the string:
index := strings.LastIndex(s, "foo")
The expected result here would be 7 but this will return 11 as the index due to the Unicode in the string.
Is there a way to handle this using standard library functions?
You're encountering the difference between bytes and runes in Go. Strings are sequences of bytes, not runes, and strings.LastIndex returns a byte offset. If you haven't learned about this, you should read https://blog.golang.org/strings.
Here's my version of a quick function that calculates the number of runes preceding the last match of a substring in a string. The basic approach is to find the byte index, then iterate through the string's runes, counting them, until that number of bytes has been consumed.
I'm not aware of a standard library method that will do this directly.
package main
import (
"fmt"
"strings"
)
func LastRuneIndex(s, substr string) (int, error) {
byteIndex := strings.LastIndex(s, substr)
if byteIndex < 0 {
return byteIndex, nil
}
reader := strings.NewReader(s)
count := 0
for byteIndex > 0 {
_, bytes, err := reader.ReadRune()
if err != nil {
return 0, err
}
byteIndex = byteIndex - bytes
count += 1
}
return count, nil
}
func main() {
s := "foo 日本 foo!"
count, err := LastRuneIndex(s, "foo")
fmt.Println(count, err)
// outputs:
// 7 <nil>
}
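The same count can be obtained more compactly by slicing the string at the byte index and counting the runes in that prefix with utf8.RuneCountInString from unicode/utf8. A sketch (lastRuneIndex is a hypothetical name):

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// lastRuneIndex returns the rune offset of the last occurrence of substr
// in s, or -1 if substr is not present.
func lastRuneIndex(s, substr string) int {
	byteIndex := strings.LastIndex(s, substr)
	if byteIndex < 0 {
		return -1
	}
	return utf8.RuneCountInString(s[:byteIndex])
}

func main() {
	fmt.Println(lastRuneIndex("foo 日本 foo!", "foo")) // 7
	fmt.Println(lastRuneIndex("foo 日本 foo!", "bar")) // -1
}
```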
This gets pretty close:
package main
import (
"golang.org/x/text/language"
"golang.org/x/text/search"
)
func main() {
m := search.New(language.English)
start, end := m.IndexString("foo 日本 foo!", "foo")
println(start == 0, end == 3)
}
but it's searching forward. I tried this:
m.IndexString("foo 日本 foo!", "foo", search.Backwards)
but I get this result:
panic: TODO: implement
https://pkg.go.dev/golang.org/x/text/search
https://github.com/golang/text/blob/v0.3.6/search/search.go#L222-L223

Convert slice of string input from console to slice of numbers

I'm trying to write a Go script that takes in as many lines of comma-separated coordinates as the user wishes, split and convert the string of coordinates to float64, store each line as a slice, and then append each slice in a slice of slices for later usage.
Example inputs are:
1.1,2.2,3.3
3.14,0,5.16
Example outputs are:
[[1.1 2.2 3.3],[3.14 0 5.16]]
The equivalent in Python is
def get_input():
print("Please enter comma separated coordinates:")
lines = []
while True:
line = input()
if line:
line = [float(x) for x in line.replace(" ", "").split(",")]
lines.append(line)
else:
break
return lines
But what I wrote in Go seems way too long (pasted below), and I'm creating a lot of variables because I cannot change a variable's type as I can in Python. Since I have only just started learning Go, I fear my script is long because I'm translating Python thinking into Go. How can I write this more concisely and in idiomatic Go style? Thank you.
package main
import (
"fmt"
"os"
"bufio"
"strings"
"strconv"
)
func main() {
inputs := get_input()
fmt.Println(inputs)
}
func get_input() [][]float64 {
fmt.Println("Please enter comma separated coordinates: ")
var inputs [][]float64
scanner := bufio.NewScanner(os.Stdin)
for scanner.Scan() {
if len(scanner.Text()) > 0 {
raw_input := strings.Replace(scanner.Text(), " ", "", -1)
input := strings.Split(raw_input, ",")
converted_input := str2float(input)
inputs = append(inputs, converted_input)
} else {
break
}
}
return inputs
}
func str2float(records []string) []float64 {
var float_slice []float64
for _, v := range records {
if s, err := strconv.ParseFloat(v, 64); err == nil {
float_slice = append(float_slice, s)
}
}
return float_slice
}
Using only string functions:
package main
import (
"bufio"
"fmt"
"os"
"strconv"
"strings"
)
func main() {
scanner := bufio.NewScanner(os.Stdin)
var result [][]float64
var txt string
for scanner.Scan() {
txt = scanner.Text()
if len(txt) > 0 {
values := strings.Split(txt, ",")
var row []float64
for _, v := range values {
fl, err := strconv.ParseFloat(strings.Trim(v, " "), 64)
if err != nil {
panic(fmt.Sprintf("Incorrect value for float64 '%v'", v))
}
row = append(row, fl)
}
result = append(result, row)
}
}
fmt.Printf("Result: %v\n", result)
}
Run:
$ printf "1.1,2.2,3.3
3.14,0,5.16
2,45,76.0, 45 , 69" | go run experiment2.go
Result: [[1.1 2.2 3.3] [3.14 0 5.16] [2 45 76 45 69]]
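With the given input, you can concatenate the lines into a JSON string and then unmarshal (deserialize) that. Note that fmt.Scanln stops at whitespace, so this assumes each line contains no spaces:
package main
import (
"encoding/json"
"fmt"
"strings"
)
func main() {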
var lines []string
for {
var line string
fmt.Scanln(&line)
if line == "" {
break
}
lines = append(lines, "["+line+"]")
}
all := "[" + strings.Join(lines, ",") + "]"
inputs := [][]float64{}
if err := json.Unmarshal([]byte(all), &inputs); err != nil {
fmt.Println(err)
return
}
fmt.Println(inputs)
}
