How to access a capturing group from regexp.ReplaceAllFunc? - go

How can I access a capture group from inside ReplaceAllFunc()?
package main
import (
"fmt"
"regexp"
)
func main() {
body := []byte("Visit this page: [PageName]")
search := regexp.MustCompile("\\[([a-zA-Z]+)\\]")
body = search.ReplaceAllFunc(body, func(s []byte) []byte {
// How can I access the capture group here?
})
fmt.Println(string(body))
}
The goal is to replace [PageName] with <a href="/view/PageName">PageName</a>.
This is the last task under the "Other tasks" section at the bottom of the Writing Web Applications Go tutorial.

I agree that having access to the capture group inside your function would be ideal, but I don't think it's possible with regexp.ReplaceAllFunc.
The only way I can think of right now to do this with that function is to strip the brackets from the match yourself:
package main
import (
"fmt"
"regexp"
)
func main() {
body := []byte("Visit this page: [PageName] [OtherPageName]")
search := regexp.MustCompile("\\[[a-zA-Z]+\\]")
body = search.ReplaceAllFunc(body, func(s []byte) []byte {
m := string(s[1 : len(s)-1])
return []byte("" + m + "")
})
fmt.Println(string(body))
}
EDIT
There is one other way I know of to do what you want. First, you can specify a non-capturing group with the syntax (?:re), where re is your regular expression. This is not essential, but it reduces the number of uninteresting submatches.
Second, there is regexp.FindAllSubmatchIndex. It returns a slice of slices, where each inner slice holds the index ranges of all submatches for one match of the regexp.
With these two things, you can construct a somewhat generic solution:
package main
import (
"fmt"
"regexp"
)
func ReplaceAllSubmatchFunc(re *regexp.Regexp, b []byte, f func(s []byte) []byte) []byte {
idxs := re.FindAllSubmatchIndex(b, -1)
if len(idxs) == 0 {
return b
}
l := len(idxs)
ret := append([]byte{}, b[:idxs[0][0]]...)
for i, pair := range idxs {
// replace internal submatch with result of user supplied function
ret = append(ret, f(b[pair[2]:pair[3]])...)
if i+1 < l {
ret = append(ret, b[pair[1]:idxs[i+1][0]]...)
}
}
ret = append(ret, b[idxs[len(idxs)-1][1]:]...)
return ret
}
func main() {
body := []byte("Visit this page: [PageName] [OtherPageName][XYZ] [XY]")
search := regexp.MustCompile("(?:\\[)([a-zA-Z]+)(?:\\])")
body = ReplaceAllSubmatchFunc(search, body, func(s []byte) []byte {
m := string(s)
return []byte("" + m + "")
})
fmt.Println(string(body))
}

If you want to get a group inside ReplaceAllFunc, you can use ReplaceAllString on the matched text to extract the subgroup.
package main
import (
"fmt"
"regexp"
)
func main() {
body := []byte("Visit this page: [PageName]")
search := regexp.MustCompile("\\[([a-zA-Z]+)\\]")
body = search.ReplaceAllFunc(body, func(s []byte) []byte {
// How can I access the capture group here?
group := search.ReplaceAllString(string(s), `$1`)
fmt.Println(group)
// handle group as you wish
newGroup := "<a href='/view/" + group + "'>" + group + "</a>"
return []byte(newGroup)
})
fmt.Println(string(body))
}
When there are several groups, you can extract each one this way, then handle each group and return the desired value.
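When the replacement is a fixed template rather than something computed per match, the `$1` expansion can be used directly with ReplaceAll, without a callback. The following is a minimal sketch under that assumption; the helper name `linkify` and the variable `pageLink` are hypothetical, and the `/view/` prefix is borrowed from the example above.

```go
package main

import (
	"fmt"
	"regexp"
)

// pageLink matches [PageName] and captures the name without the brackets.
var pageLink = regexp.MustCompile(`\[([a-zA-Z]+)\]`)

// linkify is a hypothetical helper: ReplaceAll expands $1 in the
// template with the first capture group of every match.
func linkify(body []byte) []byte {
	return pageLink.ReplaceAll(body, []byte(`<a href="/view/$1">$1</a>`))
}

func main() {
	fmt.Println(string(linkify([]byte("Visit this page: [PageName]"))))
}
```

A callback is still needed when the replacement has to be computed in Go code, for example by looking the captured name up somewhere.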

You have to call ReplaceAllFunc first and within the function call FindStringSubmatch on the same regex again. Like:
func (p parser) substituteEnvVars(data []byte) ([]byte, error) {
var err error
substituted := p.envVarPattern.ReplaceAllFunc(data, func(matched []byte) []byte {
varName := p.envVarPattern.FindStringSubmatch(string(matched))[1]
value := os.Getenv(varName)
if len(value) == 0 {
log.Printf("Fatal error substituting environment variable %s\n", varName)
}
return []byte(value)
});
return substituted, err
}
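The snippet above depends on a parser type that is not shown. As a self-contained sketch of the same idea applied to the [PageName] pattern from the question, the callback below re-runs the pattern on the matched bytes to recover the group. The names `bracketed` and `shoutNames` are hypothetical, and upper-casing merely stands in for any replacement computed in Go code.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// bracketed matches [Name] and captures Name.
var bracketed = regexp.MustCompile(`\[([a-zA-Z]+)\]`)

// shoutNames is a hypothetical helper: it replaces each [Name]
// with the upper-cased captured name.
func shoutNames(body []byte) []byte {
	return bracketed.ReplaceAllFunc(body, func(match []byte) []byte {
		// Re-run the same pattern on the match alone to recover the groups.
		groups := bracketed.FindSubmatch(match)
		if groups == nil {
			return match // should not happen for this pattern; keep the text
		}
		return bytes.ToUpper(groups[1])
	})
}

func main() {
	fmt.Println(string(shoutNames([]byte("Visit [PageName] and [Other]"))))
}
```

One caveat with this approach: the second match sees only the matched fragment, so patterns that depend on surrounding context (such as `^`, `$` or `\b`) may behave differently on the re-match.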

Related

How to convert strings to lower case in GO?

I am new to Go and working on an assignment where I should write code that returns the word frequencies of a text. The words 'Hello', 'HELLO' and 'hello' should all be counted as 'hello', so I need to convert all strings to lower case.
I know that I should use strings.ToLower(), but I don't know where to include it in the program. Can someone please help me?
package main
import (
"fmt"
"io/ioutil"
"log"
"strings"
"time"
)
const DataFile = "loremipsum.txt"
// Return the word frequencies of the text argument.
func WordCount(text string) map[string]int {
fregs := make(map[string]int)
words := strings.Fields(text)
for _, word := range words {
fregs[word] += 1
}
return fregs
}
// Benchmark how long it takes to count word frequencies in text numRuns times.
//
// Return the total time elapsed.
func benchmark(text string, numRuns int) int64 {
start := time.Now()
for i := 0; i < numRuns; i++ {
WordCount(text)
}
runtimeMillis := time.Since(start).Nanoseconds() / 1e6
return runtimeMillis
}
// Print the results of a benchmark
func printResults(runtimeMillis int64, numRuns int) {
fmt.Printf("amount of runs: %d\n", numRuns)
fmt.Printf("total time: %d ms\n", runtimeMillis)
average := float64(runtimeMillis) / float64(numRuns)
fmt.Printf("average time/run: %.2f ms\n", average)
}
func main() {
// read in DataFile as a string called data
data, err:= ioutil.ReadFile("loremipsum.txt")
if err != nil {
log.Fatal(err)
}
// Convert []byte to string and print to screen
text := string(data)
fmt.Println(text)
fmt.Printf("%#v",WordCount(string(data)))
numRuns := 100
runtimeMillis := benchmark(string(data), numRuns)
printResults(runtimeMillis, numRuns)
}
You should convert words to lowercase when you are using them as map key
for _, word := range words {
fregs[strings.ToLower(word)] += 1
}
I get [a:822 a.:110. I want all occurrences of a counted together. How do I change the code so that a and a. are treated the same? – hello123
You need to carefully define a word. For example, a string of consecutive letters and numbers converted to lowercase.
func WordCount(s string) map[string]int {
wordFunc := func(r rune) bool {
return !unicode.IsLetter(r) && !unicode.IsNumber(r)
}
counts := make(map[string]int)
for _, word := range strings.FieldsFunc(s, wordFunc) {
counts[strings.ToLower(word)]++
}
return counts
}
To remove all non-word characters, you could use a regular expression:
package main
import (
"bufio"
"fmt"
"log"
"regexp"
"strings"
)
func main() {
str1 := "This is some text! I want to count each word. Is it cool?"
re, err := regexp.Compile(`[^\w]`)
if err != nil {
log.Fatal(err)
}
str1 = re.ReplaceAllString(str1, " ")
scanner := bufio.NewScanner(strings.NewReader(str1))
scanner.Split(bufio.ScanWords)
for scanner.Scan() {
fmt.Println(strings.ToLower(scanner.Text()))
}
}
See strings.EqualFold.
Here is an example.
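A small sketch of how strings.EqualFold behaves; the wrapper name `sameWord` is hypothetical and exists only to make the comparison easy to call.

```go
package main

import (
	"fmt"
	"strings"
)

// sameWord reports whether a and b are equal when case is ignored.
func sameWord(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(sameWord("Hello", "HELLO")) // equal under case folding
	fmt.Println(sameWord("hello", "help"))  // different words
}
```

EqualFold compares two strings but does not produce a canonical form, so for building map keys in a word count, strings.ToLower is likely still the simpler choice.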

How to recursively capture user input

I'm trying to capture the input of a bunch of numbers in Go. I am not allowed to use for loops. The user input spans multiple lines. However, the function below does not return the expected []int; it returns an empty slice instead. Why is this? Or is there another way to capture multi-line user input without for loops?
func input_to_list() []int {
fmt.Print("continuously enter text: ")
reader := bufio.NewReader(os.Stdin)
user_input, _ := reader.ReadString('\n')
print(user_input)
var result []int
if user_input == "\n" {
return result
}
return append(result, input_to_list()...)
}
How to recursively capture user input?
I am not allowed to do for loops.
For example,
package main
import (
"bufio"
"fmt"
"io"
"os"
"strconv"
"strings"
)
func readInt(rdr *bufio.Reader, n []int) []int {
line, err := rdr.ReadString('\n')
line = strings.TrimSpace(line)
if i, err := strconv.Atoi(line); err == nil {
n = append(n, i)
}
if err == io.EOF || strings.ToLower(line) == "end" {
return n
}
return readInt(rdr, n)
}
func ReadInts() []int {
fmt.Print("enter integers:\n")
var n []int
rdr := bufio.NewReader(os.Stdin)
return readInt(rdr, n)
}
func main() {
n := ReadInts()
fmt.Println(n)
}
Output:
enter integers:
42
7
end
[42 7]
Your function never assigns any value to result.
func input_to_list() []int {
/* ... */
var result []int // Create empty `result` slice
if user_input == "\n" {
return result // Return empty result slice
}
return append(result, input_to_list()...) // Combine two empty slices, and return the (still) empty slice
}
Let's step through:
You create an empty slice called result
If user_input is empty, you return the result immediately.
If user_input is not empty, you call input_to_list() recursively, and add its (empty) result to your empty result, then return that (still) empty result.
To get your desired behavior, you should be doing something (other than just checking for empty) with user_input. Probably something related to strconv.Atoi or similar, then adding that to result.

Setting struct fields from function

I'm sure there is a better way to do this, and I understand it's simple, but I am new to Go, so bear with me. I am trying to set the fields of a struct (playersObject) from two methods (setCalculations and Calculations). More specifically, I am passing the values of two slices (playerData and playerData2) from main to those methods, performing calculations there, and want to return the values so that they can be set within the struct.
package main
import (
"fmt"
"os"
"log"
"strings"
"bufio"
"strconv"
)
type playersObject struct {
firstname, lastname string
batting_average, slugging_percentage, OBP, teamaverage float64
}
func strToFloat(playerData []string, playerData2 []float64) []float64 {
for _, i := range playerData[2:] {
j, err := strconv.ParseFloat(i, 64)
if err != nil {
panic(err)
}
playerData2 = append(playerData2, j)
}
return playerData2
}
func (player *playersObject) setCalculations (playerData []string, playerData2 []float64) {
player.firstname = playerData[1]
player.lastname = playerData[0]
player.batting_average = (playerData2[2] + playerData2[3] + playerData2[4] + playerData2[5]) / (playerData2[1])
player.slugging_percentage = ((playerData2[2]) + (playerData2[3]*2) + (playerData2[4]*3) + (playerData2[5]*4) )/(playerData2[1])
player.OBP = (( playerData2[2] + playerData2[3] + playerData2[4] + playerData2[5] +playerData2[6] +playerData2[7])/ (playerData2[0]))
}
func (player *playersObject) Calculations () (string, string, float64, float64, float64, ) {
return player.firstname, player.lastname, player.batting_average, player.slugging_percentage, player.OBP
}
func main() {
reader := bufio.NewReader(os.Stdin)
fmt.Print("Enter file name: ")
fileName, err := reader.ReadString('\n')
if err != nil {
log.Fatalf("failed opening file: %s", err)
}
fileName = strings.TrimSuffix(fileName, "\n")
//fmt.Printf("%q\n", fileName)
file, err := os.Open(fileName)
scanner := bufio.NewScanner(file)
scanner.Split(bufio.ScanLines)
var fileOfPlayers []string
for scanner.Scan() {
fileOfPlayers = append(fileOfPlayers, scanner.Text())
}
file.Close()
// var total_Average_sum float64 = 0
var countofplayers float64 = 0
//var total_average float64 = 0
for _, player := range fileOfPlayers {
countofplayers ++
playerData := strings.Split(player, " ")
var playerData2 = []float64{}
playerData2 = strToFloat(playerData, playerData2)
player := playersObject{}
player.setCalculations(playerData, playerData2)
calcs := player.Calculations()
fmt.Println(firstname, lastname, batting_average, slugging_percentage, OBP)
}
}
I receive the errors multiple-value player.Calculations() in single-value context and undefined: firstname, lastname, batting_average, slugging_percentage, OBP
I know this is very incorrect but again I am new to go and OOP. If this can be done in any simpler way I am open to it and appreciate all help and tips. Thank you
Here, the error is thrown because Calculations() returns multiple values but you are trying to assign it to a single variable.
You need to change the player.Calculations() method invocation from
calcs := player.Calculations()
to
firstname, lastname, batting_average, slugging_percentage, OBP := player.Calculations()
Having said that, I would recommend reading more about Go. The code should be rewritten to follow Go best practices.
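A trimmed-down sketch of both options: receiving every returned value in its own variable, or reading the fields directly once they are set. The type `playerStats`, the constructor `newPlayerStats`, the method `summary` and the sample values are all hypothetical and much simpler than the code in the question.

```go
package main

import "fmt"

// playerStats is a hypothetical, reduced version of playersObject.
type playerStats struct {
	firstname, lastname string
	battingAverage      float64
}

// newPlayerStats builds the struct; hits and atBats are illustrative inputs.
func newPlayerStats(first, last string, hits, atBats float64) playerStats {
	return playerStats{firstname: first, lastname: last, battingAverage: hits / atBats}
}

// summary returns three values, so the caller needs three variables.
func (p playerStats) summary() (string, string, float64) {
	return p.firstname, p.lastname, p.battingAverage
}

func main() {
	p := newPlayerStats("Ada", "Smith", 30, 100) // made-up sample data

	// Option 1: one variable per returned value.
	first, last, avg := p.summary()
	fmt.Println(first, last, avg)

	// Option 2: the fields are already set, so read them directly.
	fmt.Printf("%s %s %.3f\n", p.firstname, p.lastname, p.battingAverage)
}
```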

Is there an equivalent of os.Args() for functions?

To help debug Go programs, I want to write two generic functions that will be called on entry and exit, which will print the values of input and output parameters respectively:
printInputParameters(input ...interface{})
printOutputParameters(output ...interface{})
Is there an equivalent of os.Args() for functions? I looked at runtime package and didn't find such functions.
For example, let's say I have two functions with different input and output parameters
func f1(i int, f float64) (e error) {
... some code here
}
func f2(s string, b []byte) (u uint64, e error) {
.. some code here
}
I want to be able to do the following
func f1(i int, f float64) (e error) {
printInputParameters( ? )
defer func() {
printOutputParameters( ? )
}()
... some code here
}
func f2(s string, b []byte) (u uint64, e error) {
printInputParameters( ? )
defer func() {
printOutputParameters( ? )
}()
... some code here
}
You cannot do this reliably in Go, since there is no public API for getting the stack frame of the currently active function in the current goroutine. It is not impossible, as I'll show further below, but nothing supported does it for you. That it can be done is visible in the stack traces printed when a panic is raised: all values on the stack are dumped in that case.
If you are interested in how the stack trace is actually generated, have a look at gentraceback in the runtime package.
As for a solution to your problem, you can take the source code parsing route as already suggested. If you feel adventurous, you can parse the stack trace provided by runtime.Stack. But beware: there are so many drawbacks that you will quickly realize that any other solution is better than this one.
To parse the stack trace, just get the line of the previously called function (from the viewpoint of printInputParameters), get the name of that function and parse the parameter values according to the parameter types provided by reflection. Some examples of stack trace outputs of various function invocations:
main.Test1(0x2) // Test1(int64(2))
main.Test1(0xc820043ed5, 0x3, 0x3) // Test1([]byte{'A','B','C'})
main.Test1(0x513350, 0x4) // Test1("AAAA")
You can see that complex types (those which do not fit into a register) may use more than one 'parameter'. A string for example is a pointer to the data and the length. So you have to use the unsafe package to access these pointers and reflection to create values from this data.
If you want to try yourself, here's some example code:
import (
"fmt"
"math"
"reflect"
"runtime"
"strconv"
"strings"
"unsafe"
)
// Parses the second call's parameters in a stack trace of the form:
//
// goroutine 1 [running]:
// main.printInputs(0x4c4c60, 0x539038)
// /.../go/src/debug/main.go:16 +0xe0
// main.Test1(0x2)
// /.../go/src/debug/main.go:23
//
func parseParams(st string) (string, []uintptr) {
line := 1
start, stop := 0, 0
for i, c := range st {
if c == '\n' {
line++
}
if line == 4 && c == '\n' {
start = i + 1
}
if line == 5 && c == '\n' {
stop = i
}
}
call := st[start:stop]
fname := call[0:strings.IndexByte(call, '(')]
param := call[strings.IndexByte(call, '(')+1 : strings.IndexByte(call, ')')]
params := strings.Split(param, ", ")
parsedParams := make([]uintptr, len(params))
for i := range params {
iv, err := strconv.ParseInt(params[i], 0, 64)
if err != nil {
panic(err.Error())
}
parsedParams[i] = uintptr(iv)
}
return fname, parsedParams
}
func fromAddress(t reflect.Type, addr uintptr) reflect.Value {
return reflect.NewAt(t, unsafe.Pointer(&addr)).Elem()
}
func printInputs(fn interface{}) {
v := reflect.ValueOf(fn)
vt := v.Type()
b := make([]byte, 500)
if v.Kind() != reflect.Func {
return
}
runtime.Stack(b, false)
name, params := parseParams(string(b))
pidx := 0
fmt.Print(name + "(")
for i := 0; i < vt.NumIn(); i++ {
t := vt.In(i)
switch t.Kind() {
case reflect.Int64, reflect.Int:
// Just use the value from the stack
fmt.Print(params[pidx], ",")
pidx++
case reflect.Float64:
fmt.Print(math.Float64frombits(uint64(params[pidx])), ",")
pidx++
case reflect.Slice:
// create []T pointing to slice content
data := reflect.ArrayOf(int(params[pidx+2]), t.Elem())
svp := reflect.NewAt(data, unsafe.Pointer(params[pidx]))
fmt.Printf("%v,", svp.Elem())
pidx += 3
case reflect.String:
sv := fromAddress(t, params[pidx])
fmt.Printf("%v,", sv)
pidx += 2
case reflect.Map:
// points to hmap struct
mv := fromAddress(t,params[pidx])
fmt.Printf("%v,", mv)
pidx++
} /* switch */
}
fmt.Println(")")
}
Test:
func Test1(in int, b []byte, in2 int, m string) {
printInputs(Test1)
}
func main() {
b := []byte{'A', 'B', 'C'}
s := "AAAA"
Test1(2, b, 9, s)
}
Output:
main.Test1(2,[65 66 67],9,"AAAA",)
A slightly advanced version of this can be found on github:
go get github.com/githubnemo/pdump
To generically print your functions' arguments, you can do this:
func printInputParameters(input ...interface{}) {
fmt.Printf("Args: %v", input)
}
printInputParameters is a variadic function, and input is of type []interface{}.
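A sketch of how such variadic helpers could be combined with named results and defer, so that the output values are printed on exit. The arguments still have to be listed by hand at each call site. The names `formatParams` and `divide` are hypothetical and used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// formatParams renders a label followed by all values passed to it.
func formatParams(label string, values ...interface{}) string {
	return fmt.Sprintf("%s: %v", label, values)
}

// divide uses named results so the deferred closure can read them.
func divide(a, b int) (q int, err error) {
	fmt.Println(formatParams("in", a, b))
	// The deferred closure runs after the return statement has set q and err.
	defer func() {
		fmt.Println(formatParams("out", q, err))
	}()
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

func main() {
	divide(6, 3)
	divide(1, 0)
}
```

The closure must reference q and err itself; writing `defer fmt.Println(formatParams("out", q, err))` would evaluate the arguments at the point of the defer statement, before the results are set.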

Cleaner way to iterate through array + create a string from values

With this code, is there a better way to loop through all the users and create a new string containing all their Nick values?
package main
import "fmt"
type User struct {
Nick string
}
func main() {
var users [2]User
users[0] = User{ Nick: "Radar" }
users[1] = User{ Nick: "NotRadar" }
names := ":"
for _, u := range users {
names += u.Nick + " "
}
fmt.Println(names)
}
For example,
package main
import (
"bytes"
"fmt"
)
type User struct {
Nick string
}
func main() {
var users [2]User
users[0] = User{Nick: "Radar"}
users[1] = User{Nick: "NotRadar"}
var buf bytes.Buffer
buf.WriteByte(':')
for _, u := range users {
buf.WriteString(u.Nick)
buf.WriteByte(' ')
}
names := buf.String()
fmt.Println(names)
}
This avoids a lot of allocations due to the concatenation of strings.
You could also write:
package main
import (
"fmt"
)
type User struct {
Nick string
}
func main() {
var users [2]User
users[0] = User{Nick: "Radar"}
users[1] = User{Nick: "NotRadar"}
var buf []byte
buf = append(buf, ':')
for _, u := range users {
buf = append(buf, u.Nick...)
buf = append(buf, ' ')
}
names := string(buf)
fmt.Println(names)
}
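On Go 1.10 and later, strings.Builder offers the same pattern without a final copy into a string. A minimal sketch, assuming a slice of users rather than the fixed-size array; the helper name `joinNicks` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type User struct {
	Nick string
}

// joinNicks is a hypothetical helper mirroring the loops above:
// a leading ':' followed by each nick and a trailing space.
func joinNicks(users []User) string {
	var sb strings.Builder
	sb.WriteByte(':')
	for _, u := range users {
		sb.WriteString(u.Nick)
		sb.WriteByte(' ')
	}
	return sb.String()
}

func main() {
	users := []User{{Nick: "Radar"}, {Nick: "NotRadar"}}
	fmt.Println(joinNicks(users))
}
```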
It really looks like you want a strings.Join here. You probably want to avoid that tight loop of repeated string concatenations in the original code; I'm fairly certain that Go doesn't implement a rope-like data structure for its primitive strings.
package main
import (
"fmt"
"strings"
)
type User struct {
Nick string
}
func main() {
var users [2]User
users[0] = User{Nick: "Radar"}
users[1] = User{Nick: "NotRadar"}
userNames := []string{}
for _, u := range users {
userNames = append(userNames, u.Nick)
}
names := ":" + strings.Join(userNames, " ")
fmt.Println(names)
}
Unfortunately, I do not know of a more elegant way to write that code.
Go does have a strings.Join function, so if you made a helper that converted your array of users to a slice of strings ([]string), then you could pass that to strings.Join.
I think that Go's static typing and lack of templates makes it hard to write a general purpose map function like Ruby has.
This is what I was talking about in the comments of dyoo's post. Effectively a rewrite of join to prevent having to iterate over the list an extra time and allocate an extra slice.
// Usernames joins all Nick values with a single space, sizing the
// result buffer up front so only one allocation is needed.
func Usernames(users []User) string {
if len(users) == 0 {
return ""
}
if len(users) == 1 {
return users[0].Nick
}
sep := " "
// len(sep) * (len(users)-1); sep always has length 1 here, unlike in Join
n := len(users) - 1
for i := 0; i < len(users); i++ {
n += len(users[i].Nick)
}
names := make([]byte, n)
// namesp tracks the next write position in names
namesp := copy(names, users[0].Nick)
for _, u := range users[1:] {
namesp += copy(names[namesp:], sep)
namesp += copy(names[namesp:], u.Nick)
}
return string(names)
}
For reference, strings.go with the strings.Join source:
http://golang.org/src/pkg/strings/strings.go
See line 356

Resources