Golang regular expression to separate a formula

Need to split something like 1-(a+b-b-d)*100 into 1, a+b-b-d, 100
I tried (\+|-|\*|\/) which will split the string into 1 (a b b d) 100

Regular expression pattern:
^(\d+)-\((.+)\)\*(\d+)$
package main
import (
	"log"
	"regexp"
)
func main() {
	// The parentheses are matched literally, outside group 2,
	// so the group holds a+b-b-d without the surrounding brackets.
	reg := regexp.MustCompile(`^(\d+)-\((.+)\)\*(\d+)$`)
	str := `1-(a+b-b-d)*100`
	// see: https://pkg.go.dev/regexp#Regexp.FindStringSubmatch
	ret := reg.FindStringSubmatch(str)
	if ret == nil {
		log.Fatalf("no match for %q", str)
	}
	log.Printf("%s %s %s", ret[1], ret[2], ret[3])
}
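A fixed pattern like the one above only fits formulas of exactly that shape. As a rough alternative, here is a minimal sketch that walks the string and splits on operators found outside parentheses. The helper names splitTopLevel and trimParens are made up for illustration, and the sketch assumes balanced parentheses and no unary minus.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevel splits expr on the operators + - * / that occur outside
// parentheses and strips one enclosing pair of parentheses from each operand.
// Hypothetical helper: it assumes balanced parentheses and no unary operators.
func splitTopLevel(expr string) []string {
	var parts []string
	depth, start := 0, 0
	for i, r := range expr {
		switch r {
		case '(':
			depth++
		case ')':
			depth--
		case '+', '-', '*', '/':
			if depth == 0 {
				parts = append(parts, trimParens(expr[start:i]))
				start = i + 1
			}
		}
	}
	return append(parts, trimParens(expr[start:]))
}

// trimParens removes surrounding whitespace and one outer pair of parentheses.
func trimParens(s string) string {
	s = strings.TrimSpace(s)
	if strings.HasPrefix(s, "(") && strings.HasSuffix(s, ")") {
		return s[1 : len(s)-1]
	}
	return s
}

func main() {
	fmt.Printf("%q\n", splitTopLevel("1-(a+b-b-d)*100")) // ["1" "a+b-b-d" "100"]
}
```

Operators inside the brackets are skipped because depth is non-zero there, which is the part a plain split on (\+|-|\*|\/) cannot express.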

Related

How can I clean the text for search using RegEx

I can use the code below to check whether the text str contains either or both of the keys, i.e. whether it contains "MS", "dynamics", or both:
package main
import (
"fmt"
"regexp"
)
func main() {
keys := []string{"MS", "dynamics"}
keysReg := fmt.Sprintf("(%s %s)|%s|%s", keys[0], keys[1], keys[0], keys[1]) // => "(MS dynamics)|MS|dynamics"
fmt.Println(keysReg)
str := "What is MS dynamics, is it a product from MS?"
re := regexp.MustCompile(`(?i)` + keysReg)
matches := re.FindAllString(str, -1)
fmt.Println("We found", len(matches), "matches, that are:", matches)
}
I want the user to enter a phrase, from which I trim unwanted words and characters and then do the search as above.
Let's say the user input was This,is,a,delimited,string. I need to build the keys variable dynamically as (delimited string)|delimited|string so that I can search my variable str for all the matches, so I wrote the following:
s := "This,is,a,delimited,string"
t := regexp.MustCompile(`(?i),|\.|this|is|a`) // backticks are used here to contain the expression, (?i) makes it case-insensitive
v := t.Split(s, -1)
fmt.Println(len(v))
fmt.Println(v)
But I got the output as:
8
[ delimited string]
What is wrong with the way I clean the input text? I'm expecting the output to be:
2
[delimited string]
To quote the famous quip from Jamie Zawinski,
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Two things:
Instead of trying to weed out garbage from the string ("cleaning" it), extract complete words from it instead.
Unicode is a complicated matter, so even after you have succeeded in extracting the words, you have to make sure they are properly "escaped", so that they contain no characters that might be interpreted as RE syntax, before building a regexp from them.
package main
import (
"errors"
"fmt"
"regexp"
"strings"
)
func build(words ...string) (*regexp.Regexp, error) {
var sb strings.Builder
switch len(words) {
case 0:
return nil, errors.New("empty input")
case 1:
return regexp.Compile(regexp.QuoteMeta(words[0]))
}
quoted := make([]string, len(words))
for i, w := range words {
quoted[i] = regexp.QuoteMeta(w)
}
sb.WriteByte('(')
for i, w := range quoted {
if i > 0 {
sb.WriteByte('\x20')
}
sb.WriteString(w)
}
sb.WriteString(`)|`)
for i, w := range quoted {
if i > 0 {
sb.WriteByte('|')
}
sb.WriteString(w)
}
return regexp.Compile(sb.String())
}
var words = regexp.MustCompile(`\pL+`)
func main() {
allWords := words.FindAllString("\tThis\v\x20\x20,\t\tis\t\t,?a!,¿delimited?,string‽", -1)
re, err := build(allWords...)
if err != nil {
panic(err)
}
fmt.Println(re)
}
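The program above keeps every word, including "This", "is" and "a". Below is a minimal sketch of how the unwanted words might be dropped after extraction instead of being used as split separators. The helper name keywords and the stop-word list are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordRE matches runs of Unicode letters, like the words pattern above.
var wordRE = regexp.MustCompile(`\pL+`)

// keywords extracts the words of input and drops those listed in stop,
// compared case-insensitively. Both the helper name and the stop-word
// list are hypothetical; adjust them to the application.
func keywords(input string, stop ...string) []string {
	skip := make(map[string]bool, len(stop))
	for _, w := range stop {
		skip[strings.ToLower(w)] = true
	}
	var out []string
	for _, w := range wordRE.FindAllString(input, -1) {
		if !skip[strings.ToLower(w)] {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	kw := keywords("This,is,a,delimited,string", "this", "is", "a")
	fmt.Println(len(kw), kw) // 2 [delimited string]
}
```

Because whole words are compared, a stop word such as "a" is not removed from the middle of a longer word, and no empty strings appear in the result, which is what produced the count of 8 in the question.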
Further reading:
https://pkg.go.dev/regexp/syntax
https://pkg.go.dev/regexp#QuoteMeta
https://pkg.go.dev/unicode#pkg-variables and https://pkg.go.dev/unicode#Categories

how to realize mismatch of regexp in golang?

This is an example of a multiple-choice question. I want to extract the Chinese option texts "英国、法国", "加拿大、墨西哥", "葡萄牙、加拿大" and "墨西哥、德国" from the text in the following Go code, but it does not work.
package main
import (
"fmt"
"regexp"
"testing"
)
func TestRegex(t *testing.T) {
text := `( B )38.目前,亚马逊美国站后台,除了有美国站点外,还有( )站点。
A.英国、法国B.加拿大、墨西哥
C.葡萄牙、加拿大D.墨西哥、德国
`
fmt.Printf("%q\n", regexp.MustCompile(`[A-E]\.(\S+)?`).FindAllStringSubmatch(text, -1))
fmt.Printf("%q\n", regexp.MustCompile(`[A-E]\.`).Split(text, -1))
}
text:
( B )38.目前,亚马逊美国站后台,除了有美国站点外,还有( )站点。
A.英国、法国B.加拿大、墨西哥
C.葡萄牙、加拿大D.墨西哥、德国
pattern: [A-E]\.(\S+)?
Actual result: [["A.英国、法国B.加拿大、墨西哥" "英国、法国B.加拿大、墨西哥"] ["C.葡萄牙、加拿大D.墨西哥、德国" "葡萄牙、加拿大D.墨西哥、德国"]].
Expect result: [["A.英国、法国" "英国、法国"] ["B.加拿大、墨西哥" "加拿大、墨西哥"] ["C.葡萄牙、加拿大" "葡萄牙、加拿大"] ["D.墨西哥、德国" "墨西哥、德国"]]
I think it might be a greedy-matching problem, because my code reads option A and option B as a single option.
Non-greedy matching won't solve this. You would need a positive lookahead, which RE2 (the syntax Go's regexp package implements) doesn't support.
As a workaround, you can search for the labels only and extract the text in between manually.
re := regexp.MustCompile(`[A-E]\.`)
res := re.FindAllStringIndex(text, -1)
results := make([][]string, len(res))
for i, m := range res {
if i < len(res)-1 {
results[i] = []string{text[m[0]:m[1]], text[m[1]:res[i+1][0]]}
} else {
results[i] = []string{text[m[0]:m[1]], text[m[1]:]}
}
}
fmt.Printf("%q\n", results)
Should print
[["A." "英国、法国"] ["B." "加拿大、墨西哥\n"] ["C." "葡萄牙、加拿大"] ["D." "墨西哥、德国\n"]]
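For this particular input there may also be a way to avoid lookahead entirely: a negated character class that stops the capture at whitespace or at the next capital letter A to E. This is only a sketch, and it assumes that the option texts themselves never contain the letters A-E; the function name options is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// optionRE matches a label such as "A." and captures the text after it,
// stopping at whitespace or at the next capital letter A-E.
// Assumption: option texts never contain the letters A-E themselves.
var optionRE = regexp.MustCompile(`[A-E]\.([^A-E\s]+)`)

// options returns, for every option, the full match and the captured text.
func options(text string) [][]string {
	return optionRE.FindAllStringSubmatch(text, -1)
}

func main() {
	text := "( B )38.目前,亚马逊美国站后台,除了有美国站点外,还有( )站点。\n" +
		"A.英国、法国B.加拿大、墨西哥\n" +
		"C.葡萄牙、加拿大D.墨西哥、德国\n"
	fmt.Printf("%q\n", options(text))
}
```

If option texts can contain Latin capitals, the index-based workaround above is the safer choice.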

How can I trim whitespaces in Go from a slice after Split

I have a string that is comma separated, so it could be
test1, test2, test3 or test1,test2,test3 or test1, test2, test3.
I currently split this in Go with strings.Split(s, ","), but that leaves me with a []string whose elements can carry an arbitrary amount of whitespace.
How can I easily trim them off? What is best practice here?
This is my current code
var property= os.Getenv(env.templateDirectories)
if property != "" {
var dirs = strings.Split(property, ",")
for index,ele := range dirs {
dirs[index] = strings.TrimSpace(ele)
}
return dirs
}
I come from Java and assumed that there is a map/reduce etc functionality in Go also, therefore the question.
You can use strings.TrimSpace in a loop. Ranging over the indexes rather than the values lets you update the slice in place, so the order is preserved:
package main
import (
"fmt"
"strings"
)
func main() {
input := "test1, test2, test3"
slc := strings.Split(input , ",")
for i := range slc {
slc[i] = strings.TrimSpace(slc[i])
}
fmt.Println(slc)
}
Easy way without looping (note that ReplaceAll also removes spaces inside the values, not only around them):
test := "2 , 123, 1"
result := strings.Split(strings.ReplaceAll(test," ","") , ",")
The encoding/csv package can handle this:
package main
import (
"encoding/csv"
"fmt"
"strings"
)
func main() {
for _, each := range []string{
"test1, test2, test3", "test1, test2, test3", "test1,test2,test3",
} {
r := csv.NewReader(strings.NewReader(each))
r.TrimLeadingSpace = true
s, e := r.Read()
if e != nil {
panic(e)
}
fmt.Printf("%q\n", s)
}
}
https://golang.org/pkg/encoding/csv#Reader.TrimLeadingSpace
If you already use regexp, you can also split with a regular expression:
regexp.MustCompile(`\s*,\s*`).Split(test, -1)
This solution is probably slower than the standard Split + TrimSpace, but it is more flexible. For example, if you want to skip empty fields you can use:
regexp.MustCompile(`(\s*,\s*)+`).Split(test, -1)
or to use multiple separators
regexp.MustCompile(`\s*[,;]\s*`).Split(test, -1)
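One caveat with the regexp approach: the separator pattern only consumes whitespace next to a comma, so whitespace at the very start or end of the input survives. A minimal sketch that trims the whole string first (splitList is a made-up name):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sep matches a comma together with any whitespace around it.
var sep = regexp.MustCompile(`\s*,\s*`)

// splitList trims the outer whitespace first, because sep only removes
// whitespace adjacent to a comma. An empty input yields a nil slice.
func splitList(s string) []string {
	s = strings.TrimSpace(s)
	if s == "" {
		return nil
	}
	return sep.Split(s, -1)
}

func main() {
	fmt.Printf("%q\n", splitList("  test1 ,test2,   test3  ")) // ["test1" "test2" "test3"]
}
```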

How do I parse URLs in the format of /id/123 not ?foo=bar

I'm trying to parse a URL like:
http://example.com/id/123
I've read through the net/url docs but it seems like it only parses strings like
http://example.com/blah?id=123
How can I parse the ID so I end up with the value of the id in the first example?
This is not one of my own routes but a http string returned from an openid request.
In your example, /id/123 is a path, and you can get the "123" part by using Base from the path package.
package main
import (
"fmt"
"path"
)
func main() {
fmt.Println(path.Base("/id/123"))
}
For easy reference, here are the docs for the path package: http://golang.org/pkg/path/#example_Base
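You can try using a regular expression as follows:
package main
import (
	"fmt"
	"regexp"
)
func main() {
	p := "/id/123"
	re := regexp.MustCompile(`/id/(.*)`)
	// values[0] is the whole match, values[1] the captured ID.
	values := re.FindStringSubmatch(p)
	if len(values) > 1 {
		fmt.Println("ID : ", values[1])
	}
}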
Here is a simple solution that works for URLs with the same structure as yours (you can improve to suit those with other structures)
package main
import (
"fmt"
"net/url"
)
var path = "http://localhost:8080/id/123"
func getFirstParam(path string) (ps string) {
// ignore first '/' and when it hits the second '/'
// get whatever is after it as a parameter
for i := 1; i < len(path); i++ {
if path[i] == '/' {
ps = path[i+1:]
}
}
return
}
func main() {
u, _ := url.Parse(path)
fmt.Println(u.Path) // -> "/id/123"
fmt.Println(getFirstParam(u.Path)) // -> "123"
}
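Or, as @gollipher suggested, use the path package:
import (
	"fmt"
	"net/url"
	"path"
)
// Named rawURL rather than path, so that it does not clash with the imported package.
const rawURL = "http://localhost:8080/id/123"
func main() {
	u, _ := url.Parse(rawURL)
	fmt.Println(path.Base(u.Path)) // -> "123"
}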
This method is faster than a regex, provided you know the structure of the URL beforehand.
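If the ID is needed as a number and the path shape should be checked, the pieces above can be combined. The following is a sketch only: the function name idFromURL and the strict /id/<number> layout are assumptions, not something the openid response is known to guarantee.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// idFromURL expects a path of the form /id/<number> and returns the number.
// Hypothetical helper; the expected layout is an assumption.
func idFromURL(raw string) (int, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return 0, err
	}
	// Drop the leading and trailing slashes, then split into segments.
	segs := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(segs) != 2 || segs[0] != "id" {
		return 0, fmt.Errorf("unexpected path %q", u.Path)
	}
	return strconv.Atoi(segs[1])
}

func main() {
	id, err := idFromURL("http://example.com/id/123")
	fmt.Println(id, err)
}
```

Unlike path.Base alone, this rejects URLs such as http://example.com/blah?id=123 instead of silently returning "blah".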

How to get a list of values into a flag in Golang?

What is Golang's equivalent of the Python commands below?
import argparse
parser = argparse.ArgumentParser(description="something")
parser.add_argument("-getList1",nargs='*',help="get 0 or more values")
parser.add_argument("-getList2",nargs='+',help="get 1 or more values")
I have seen that the flag package allows argument parsing in Golang.
But it seems to support only String, Int or Bool.
How to get a list of values into a flag in this format :
go run myCode.go -getList1 value1 value2
You can define your own flag.Value and bind it with flag.Var().
You can then pass the flag multiple times, like this:
go run your_file.go --list1 value1 --list1 value2
UPD: including the code snippet here, just in case.
package main
import "flag"
type arrayFlags []string
func (i *arrayFlags) String() string {
return "my string representation"
}
func (i *arrayFlags) Set(value string) error {
*i = append(*i, value)
return nil
}
var myFlags arrayFlags
func main() {
flag.Var(&myFlags, "list1", "Some description for this param.")
flag.Parse()
}
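To see the collected values without running from a shell, the same idea can be tried against an explicit argument slice using flag.NewFlagSet. This is a sketch under assumptions: the names listFlag and parseList are invented here, and String joins the values with commas purely for display.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// listFlag collects every occurrence of a repeated flag.
type listFlag []string

// String renders the collected values; the comma join is an arbitrary choice.
func (l *listFlag) String() string { return strings.Join(*l, ",") }

// Set is called by the flag package once per occurrence of the flag.
func (l *listFlag) Set(v string) error {
	*l = append(*l, v)
	return nil
}

// parseList parses args with a private FlagSet, so it can be exercised
// with any argument slice instead of os.Args.
func parseList(args []string) ([]string, error) {
	var values listFlag
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.Var(&values, "list1", "may be given several times")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return values, nil
}

func main() {
	got, err := parseList([]string{"--list1", "value1", "--list1", "value2"})
	fmt.Printf("%q %v\n", got, err)
}
```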
You can at least have a list of arguments at the end of your command by using the flag.Args() function.
package main
import (
"flag"
"fmt"
)
var one string
func main() {
flag.StringVar(&one, "o", "default", "arg one")
flag.Parse()
tail := flag.Args()
fmt.Printf("Tail: %+q\n", tail)
}
my-go-app -o 1 this is the rest will print Tail: ["this" "is" "the" "rest"]
Use flag.String() to get the entire list of values for the argument you need and then split it up into individual items with strings.Split().
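A minimal sketch of that approach follows. It assumes the values are passed as one comma-separated argument, for example -getList "value1, value2,value3"; the helper name parseCSVFlag is hypothetical, and a private FlagSet is used so the function can be called with any argument slice.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parseCSVFlag reads a single -getList flag and splits its value on commas,
// trimming whitespace around each item. Hypothetical helper for illustration.
func parseCSVFlag(args []string) ([]string, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	raw := fs.String("getList", "", "comma-separated list of values")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *raw == "" {
		return nil, nil // flag absent or empty: no values
	}
	parts := strings.Split(*raw, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts, nil
}

func main() {
	vals, err := parseCSVFlag([]string{"-getList", "value1, value2,value3"})
	fmt.Printf("%q %v\n", vals, err)
}
```

The trade-off compared with a custom flag.Value is that values containing commas cannot be expressed without choosing another separator.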
If you have a series of integer values at the end of the command line, this helper function will properly convert them and place them in a slice of ints:
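package main
import (
	"flag"
	"fmt"
	"strconv"
)
// GetIntSlice converts each argument to an int and reports the first
// argument that is not a valid integer instead of silently storing 0.
func GetIntSlice(args []string) ([]int, error) {
	ret := make([]int, 0, len(args))
	for _, str := range args {
		n, err := strconv.Atoi(str)
		if err != nil {
			return nil, err
		}
		ret = append(ret, n)
	}
	return ret, nil
}
func main() {
	flag.Parse()
	tail := flag.Args()
	fmt.Printf("Tail: %T, %+v\n", tail, tail)
	intSlice, err := GetIntSlice(tail)
	if err != nil {
		panic(err)
	}
	fmt.Printf("intSlice: %T, %+v\n", intSlice, intSlice)
}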
mac:demoProject sx$ go run demo2.go 1 2 3 4
Tail: []string, [1 2 3 4]
intSlice: []int, [1 2 3 4]
