Go regexp: match three asterisks - go

So I did this:
r, _ := regexp.Compile("* * *")
r2 := r.ReplaceAll(b, []byte("<hr>"))
and got:
panic: runtime error: invalid memory address or nil pointer dereference
So I figured I had to escape them:
r, _ := regexp.Compile("\* \* \*")
But got unknown escape sequence
I'm a Go Beginner. What am I doing wrong?

You are not checking errors.
regexp.Compile gives you two results:
the compiled pattern (or nil)
the error while compiling the pattern (or nil)
You are ignoring the error and accessing the nil result. Observe (on play):
r, err := regexp.Compile("* * *")
fmt.Println("r:", r)
fmt.Println("err:", err)
Running this code will show you that, indeed there is an error. The error is:
error parsing regexp: missing argument to repetition operator: *
So yes, you are right, you have to escape the repetition operator *. You tried the following:
r, err := regexp.Compile("\* \* \*")
And consequently you got the following error from the compiler:
unknown escape sequence: *
Since there are a number of escape sequences like \n or \r for special characters that you do not have on your keyboard but want to have in strings, the compiler tries to insert these characters. \* is not a valid escape sequence and thus the compiler fails to do the replacement. What you want to do is to escape the escape sequence so that the regexp parser can do its thing.
So, the correct code is:
r, err := regexp.Compile("\\* \\* \\*")
The simplest way of dealing with this kind of quirk is to use raw string literals (delimited by backquotes, ``) instead of normal quotes:
r, err := regexp.Compile(`\* \* \*`)
These raw strings ignore escape sequences altogether.
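To tie the points above together, here is a minimal sketch that checks the compile error, uses a raw string for the escaped pattern, and performs the replacement from the question. The helper name replaceRules is hypothetical, and regexp.MustCompile is used on the assumption that panicking at startup on a bad pattern is acceptable. regexp.QuoteMeta is shown as another way to get the escaped form of a literal string.

```go
package main

import (
	"fmt"
	"regexp"
)

// hrPattern matches the literal text "* * *". The raw string keeps the
// backslashes intact so the regexp parser receives \* \* \*.
var hrPattern = regexp.MustCompile(`\* \* \*`)

// replaceRules (hypothetical name) swaps every "* * *" for "<hr>".
func replaceRules(b []byte) []byte {
	return hrPattern.ReplaceAll(b, []byte("<hr>"))
}

func main() {
	// Check the error instead of discarding it with "_".
	if _, err := regexp.Compile("* * *"); err != nil {
		fmt.Println("compile failed:", err)
	}

	// QuoteMeta escapes all regexp metacharacters in a literal string.
	fmt.Println(regexp.QuoteMeta("* * *"))

	fmt.Println(string(replaceRules([]byte("intro * * * outro"))))
}
```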

Adding to @VonC's answer: regexps aren't always the answer, and they are generally slower than the strings.* functions.
For a complex expression, sure, regexp is awesome; however, if you just want to match a string and replace it, then strings.Replacer is the way to go:
var asterisksReplacer = strings.NewReplacer(`* * *`, `<hr>`)
func main() {
fmt.Println(asterisksReplacer.Replace(`xxx * * * yyy *-*-* zzz* * *`))
}

Try escaping your '*' (since '*' is a special character used for repetition in the re2 syntax)
r, err := regexp.Compile(`\* \* \*`)
// and yes, always check the error
// or at least use regexp.MustCompile() if you want to fail fast
Note the use of back quotes `` for the string literal.

Related

How to convert the string representation of a Terraform set of strings to a slice of strings

I've a terratest where I get an output from terraform like so s := "[a b]". The terraform output's value = toset([resource.name]), it's a set of strings.
Apparently fmt.Printf("%T", s) prints string. I need to iterate over the elements to perform further validation.
I tried the approach below, but it returns an error:
var v interface{}
if err := json.Unmarshal([]byte(s), &v); err != nil {
fmt.Println(err)
}
My current implementation to convert to a slice is:
s := "[a b]"
s1 := strings.Fields(strings.Trim(s, "[]"))
for _, v:= range s1 {
fmt.Println("v -> " + v)
}
Looking for suggestions to current approach or alternative ways to convert to arr/slice that I should be considering. Appreciate any inputs. Thanks.
Actually your current implementation seems just fine.
You can't use JSON unmarshaling because JSON strings must be enclosed in double quotes ".
Instead, strings.Fields does just that: it splits a string on one or more characters that match unicode.IsSpace, which include \t, \n, \v, \f, \r and ' ' (space).
Moreover, this also works if terraform sends an empty set as [], as stated in the documentation:
returning [...] an empty slice if s contains only white space.
...which includes the case of s being empty "" altogether.
In case you need additional control over this, you can use strings.FieldsFunc, which accepts a function of type func(rune) bool so you can determine yourself what constitutes a "space". But since your input string comes from terraform, I guess it's going to be well-behaved enough.
There may be third-party packages that already implement this functionality, but unless your program already imports them, I think the native solution based on the standard lib is always preferable.
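As an illustration of the strings.FieldsFunc option mentioned above, here is a minimal sketch. The helper name parseSet is hypothetical, and treating a comma as a separator is purely an assumption for the example; the terraform output in the question only uses spaces.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// parseSet (hypothetical name) turns "[a b]" into []string{"a", "b"}.
// It strips the surrounding brackets and then splits on whitespace.
// Splitting on commas as well is an assumption made for illustration.
func parseSet(s string) []string {
	inner := strings.Trim(s, "[]")
	return strings.FieldsFunc(inner, func(r rune) bool {
		return unicode.IsSpace(r) || r == ','
	})
}

func main() {
	for _, v := range parseSet("[a b]") {
		fmt.Println("v -> " + v)
	}
	// An empty set yields an empty slice, so the loop body never runs.
	fmt.Println(len(parseSet("[]")))
}
```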
unicode.IsSpace also includes the higher runes 0x85 and 0xA0; when the input contains non-ASCII characters, strings.Fields calls FieldsFunc(s, unicode.IsSpace).
package main
import (
"fmt"
"strings"
)
func main() {
src := "[a b]"
dst := strings.Split(src[1:len(src)-1], " ")
fmt.Println(dst)
}
https://play.golang.org/p/KVY4r_8RWv6

more than one character in rune literal

I have a string as just MyString and I want to append in this data something like this:
MyString ("1", "a"), ("1", "b") //END result
My code is something like this:
query := "MyString";
array := []string{"a", "b"}
for i , v := range array{
id := "1"
fmt.Println(v,i)
query += '("{}", "{}"), '.format(id, v)
}
but I am getting two errors:
./prog.go:15:23: more than one character in rune literal
./prog.go:15:39: '\u0000'.format undefined (type rune has no field or method format)
You can't use single quotes for strings in Go. You can only use double quotes or backticks.
Single quotes are used for single characters, called runes.
Change your line to:
query += "(\"{}\", \"{}\"), ".format(id, v)
or
query += `("{}", "{}"), `.format(id, v)
However, Go is not python. Go doesn't have a format method like that. But it has fmt.Sprintf.
So to really fix it, use:
query = fmt.Sprintf(`%s("%s", "%s"), `, query, id, v)
The issue here is the single quotes. The Go compiler expects exactly one character between ''. Use double quotes with escaped inner quotes instead, as explained in the example above.
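For completeness, here is one way the whole loop from the question could look, as a sketch rather than the only option. The helper name buildValues is hypothetical. It collects the tuples and joins them with strings.Join, on the assumption that the trailing ", " left by the += approach is not wanted.

```go
package main

import (
	"fmt"
	"strings"
)

// buildValues (hypothetical name) appends one ("id", "value") tuple per
// element to the prefix, separated by ", " with no trailing separator.
func buildValues(prefix, id string, values []string) string {
	tuples := make([]string, 0, len(values))
	for _, v := range values {
		// Backticks let the double quotes appear without escaping.
		tuples = append(tuples, fmt.Sprintf(`("%s", "%s")`, id, v))
	}
	return prefix + " " + strings.Join(tuples, ", ")
}

func main() {
	fmt.Println(buildValues("MyString", "1", []string{"a", "b"}))
}
```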

How to escape a string with single quotes

I am trying to unquote a string that uses single quotes in Go (the syntax is same as Go string literal syntax but using single quotes not double quotes):
'\'"Hello,\nworld!\r\n\u1F60ANice to meet you!\nFirst Name\tJohn\nLast Name\tDoe\n'
should become
'"Hello,
world!
😊Nice to meet you!
First Name John
Last Name Doe
How do I accomplish this?
strconv.Unquote doesn't work on \n newlines (https://github.com/golang/go/issues/15893 and https://golang.org/pkg/strconv/#Unquote), and simply using strings.ReplaceAll would be a pain, since it would have to support all Unicode code points and other backslash codes like \n, \r and \t.
I may be asking for too much, but it would be nice if it automatically validates the Unicode like how strconv.Unquote might be able to do/is doing (it knows that x Unicode code points may become one character), since I can do the same with unicode/utf8.ValidString.
@CeriseLimón came up with this answer; I just put it into a function with some extra handling to support \n. First, it swaps ' and " and changes each \n escape to an actual newline. Then it calls strconv.Unquote on each line, since strconv.Unquote cannot handle newlines, swaps ' and " back, and joins the lines together.
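// unquote decodes a single-quoted literal that otherwise follows Go's
// string-literal escape rules. It needs "fmt", "strconv" and "strings".
func unquote(s string) (string, error) {
	if len(s) < 2 {
		return "", fmt.Errorf("unquote: input %q is too short", s)
	}
	// Swap ' and " so strconv.Unquote sees a double-quoted body, and turn
	// the \n escape into a real newline so each line can be unquoted alone.
	replaced := strings.NewReplacer(`'`, `"`, `"`, `'`, `\n`, "\n").Replace(s[1 : len(s)-1])
	lines := strings.Split(replaced, "\n")
	for i, line := range lines {
		// strconv.Unquote rejects raw newlines, hence the per-line calls.
		tmp, err := strconv.Unquote(`"` + line + `"`)
		if err != nil {
			return "", fmt.Errorf("unquote: line %d: %w", i+1, err)
		}
		lines[i] = tmp
	}
	// Swap the quotes back now that the escapes are resolved.
	return strings.NewReplacer(`"`, `'`, `'`, `"`).Replace(strings.Join(lines, "\n")), nil
}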
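A different approach, sketched here as an alternative and not as the method used in the answer above, is to decode the body one character at a time with strconv.UnquoteChar, passing '\'' as the quote byte. That call permits the \' escape, rejects an unescaped ', and resolves \n, \t, \u and \U escapes itself, so no quote swapping or line splitting is needed. The helper name unquoteSingle is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// unquoteSingle (hypothetical name) decodes a single-quoted literal that
// otherwise follows Go's escape rules, one character at a time.
func unquoteSingle(s string) (string, error) {
	if len(s) < 2 || s[0] != '\'' || s[len(s)-1] != '\'' {
		return "", strconv.ErrSyntax
	}
	body := s[1 : len(s)-1]
	var b strings.Builder
	for len(body) > 0 {
		r, multibyte, tail, err := strconv.UnquoteChar(body, '\'')
		if err != nil {
			return "", err
		}
		// \x and octal escapes denote single bytes, not code points.
		if r < utf8.RuneSelf || !multibyte {
			b.WriteByte(byte(r))
		} else {
			b.WriteRune(r)
		}
		body = tail
	}
	return b.String(), nil
}

func main() {
	out, err := unquoteSingle(`'\'"Hello,\nworld!\r\nFirst Name\tJohn'`)
	fmt.Printf("%q %v\n", out, err)
}
```

Note that Go's \u escape takes exactly four hex digits, so a code point above U+FFFF has to be written with \U and eight digits.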

String format with errors with %e

I've encountered some go code that appears to use %e for formatting an error for display to the screen. A simplified version would be code like this
err := errors.New("La de da")
fmt.Printf("%e\n", err)
outputs
&{%!e(string=La de da)}
However, if I look at the go manual, it says %e is for formatting floating point numbers in scientific notation. That output doesn't look like scientific notation, so I'm wondering
If this is a specific notation, what is it? (i.e. is there a %. formatting option I could use to get that format)
If it's not a specific notation, what weird thing is going on under the hood that leads to an error being rendered in this way?
What silly, obvious thing am I missing that renders most of what I've said in this post wrong?
Read the Go documentation.
Package fmt
Printing
Format errors:
If an invalid argument is given for a verb, such as providing a string
to %d, the generated string will contain a description of the problem,
as in these examples:
Wrong type or unknown verb: %!verb(type=value)
Printf("%d", hi): %!d(string=hi)
Too many arguments: %!(EXTRA type=value)
Printf("hi", "guys"): hi%!(EXTRA string=guys)
Too few arguments: %!verb(MISSING)
Printf("hi%d"): hi%!d(MISSING)
Non-int for width or precision: %!(BADWIDTH) or %!(BADPREC)
Printf("%*s", 4.5, "hi"): %!(BADWIDTH)hi
Printf("%.*s", 4.5, "hi"): %!(BADPREC)hi
Invalid or invalid use of argument index: %!(BADINDEX)
Printf("%*[2]d", 7): %!d(BADINDEX)
Printf("%.[2]d", 7): %!d(BADINDEX)
All errors begin with the string "%!" followed sometimes by a single
character (the verb) and end with a parenthesized description.
For your example,
package main
import (
"errors"
"fmt"
)
func main() {
err := errors.New("La de da")
fmt.Printf("%e\n", err)
}
Playground: https://play.golang.org/p/NKC6WWePyxM
Output:
&{%!e(string=La de da)}
Documentation:
All errors begin with the string "%!" followed sometimes by a single
character (the verb) and end with a parenthesized description.
Wrong type or unknown verb: %!verb(type=value)
Printf("%d", hi): %!d(string=hi)
When formatting errors into strings using fmt.Printf()/Println(), you can do the following:
err := fnThatReturnsErr()
if err != nil {
fmt.Printf("The error is %v\n", err)
}
I believe %v is the formatting option you were looking for.
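To make the difference concrete, here is a small sketch comparing verbs on the error from the question. The helper names sample and describe are hypothetical. According to the fmt documentation, the Error method is invoked for verbs that are valid for strings (%s, %q, %v, %x, %X), which is why %e falls through to the "wrong type" output instead.

```go
package main

import (
	"errors"
	"fmt"
)

// sample (hypothetical name) returns the error used in the question.
func sample() error {
	return errors.New("La de da")
}

// describe (hypothetical name) formats an error with %v, which calls
// the error's Error method.
func describe(err error) string {
	return fmt.Sprintf("The error is %v", err)
}

func main() {
	err := sample()
	fmt.Println(describe(err))   // uses %v
	fmt.Printf("%s\n", err)      // also calls Error()
	fmt.Printf("%q\n", err)      // Error() result, quoted
	fmt.Printf("%e\n", err)      // not a string verb: prints the %!e(...) form
}
```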

Golang Sprintf formatting a string and using it multiple times

I try to generate a sql query using Sprintf() where I have to use the same variable two times
myStr := "test"
str := fmt.Sprintf("SELECT ... WHERE a = '%#[1]s' or b = '%#[1]s'", myStr)
fmt.Println(str)
This snippets outputs the expected string
SELECT ... WHERE a = 'test' or b = 'test'
but go vet says:
unrecognized printf flag for verb 's': '#' (vet)
And I am puzzled why. Switching the printf verb to v satisfies go vet but adds " around my string. And I honestly don't see a mistake in using %#[1]s.
Any thoughts?
Using printf to construct queries is a bad idea: it opens you up to SQL injection.
See named parameters in the sql package.
There is no # Sprintf flag for a string verb (the flag # is e.g. adding 0x for hex values: %#x). So remove it to make your go vet troubles disappear:
myStr := "test"
str := fmt.Sprintf("SELECT ... WHERE a = '%[1]s' or b = '%[1]s'", myStr)
fmt.Println(str)
But: If any part of your constructed query (myStr) comes from external input (i.e. user input), you really should follow Hein's advice and use named parameters.
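A rough sketch of the named-parameter approach with database/sql follows. Several things here are assumptions: the helper name buildQuery, the table and column names, and the @val placeholder syntax, which depends on the driver in use (not every driver supports named arguments). No database is opened, so the block only shows how the query text and its argument would be prepared.

```go
package main

import (
	"database/sql"
	"fmt"
)

// buildQuery (hypothetical name) returns a query that uses one named
// placeholder twice, plus the matching argument list. The value is never
// spliced into the SQL text, so quotes in it cannot change the query.
// Table name, column names and the @val syntax are assumptions.
func buildQuery(val string) (string, []interface{}) {
	const q = "SELECT id FROM items WHERE a = @val OR b = @val"
	return q, []interface{}{sql.Named("val", val)}
}

func main() {
	q, args := buildQuery("test' OR '1'='1")
	fmt.Println(q)
	fmt.Println(len(args))
	// With an open *sql.DB named db (not shown), this would be run as:
	//   rows, err := db.Query(q, args...)
}
```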

Resources