Differences between strings.Contains and strings.ContainsAny in Golang - go

In the source code:
// Contains returns true if substr is within s.
func Contains(s, substr string) bool {
return Index(s, substr) >= 0
}
// ContainsAny returns true if any Unicode code points in chars are within s.
func ContainsAny(s, chars string) bool {
return IndexAny(s, chars) >= 0
}
The only difference seems to be substr versus the Unicode code points in chars. I wrote some tests for both of them, and their behavior seems to be identical. I don't understand when to use which.

I think the two functions are totally different. Contains is used to detect whether a string contains a substring. ContainsAny is used to detect whether a string contains any of the characters in the provided string.

Contains reports whether a substring is within the string, whereas ContainsAny reports whether any of the Unicode code points in chars are within the string. Look at the documentation.
func main() {
fmt.Println(strings.Contains("seafood", "aes"))
fmt.Println(strings.ContainsAny("seafood", "aes"))
fmt.Println(strings.Contains("iiii", "ui"))
fmt.Println(strings.ContainsAny("iiii", "ui"))
}
The output is:
false
true
false
true
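One likely reason the two functions can look identical in a quick test is that they agree whenever the second argument is a single character. The sketch below compares them on a single character, on a multi-character argument, and on an empty argument. The helper name both is hypothetical and only exists for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// both is a hypothetical helper that returns the results of
// strings.Contains and strings.ContainsAny for the same arguments.
func both(s, arg string) (contains, containsAny bool) {
	return strings.Contains(s, arg), strings.ContainsAny(s, arg)
}

func main() {
	// Single character: the two functions agree.
	fmt.Println(both("seafood", "f")) // true true
	// "xf" is not a substring of "seafood", but the character 'f' occurs in it.
	fmt.Println(both("seafood", "xf")) // false true
	// The empty string is a substring of every string, but it holds no characters.
	fmt.Println(both("seafood", "")) // true false
}
```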

Related

when do we use rune function in golang work? [closed]

I am a beginner in Golang...
I found that rune(char) == "-" has been used to check whether a character in a word is a hyphen, instead of checking it as char == "-".
Here is the code:
package main
import (
"fmt"
"unicode"
)
func CodelandUsernameValidation(str string) bool {
// code goes here
if len(str) >= 4 && len(str) <= 25 {
if unicode.IsLetter(rune(str[0])) {
for _,char := range str {
if !unicode.IsLetter(rune(char)) && !unicode.IsDigit(rune(char)) && !(rune(char) == '_') {
return false
}
}
return true
}
}
return false
}
func main() {
// do not modify below here, readline is our function
// that properly reads in the input for you
var user string
fmt.Println("Enter Username")
fmt.Scan(&user)
fmt.Println(CodelandUsernameValidation(user))
}
Could you please clarify why rune is required here?
The code in the question must convert the byte str[0] to a rune for the call to unicode.IsLetter. Otherwise, the rune conversions are not needed.
The required byte to rune conversion hints at a problem: the application is treating a byte as a rune, but bytes are not runes.
Fix by using for range to iterate through the runes in the string. This eliminates conversions from the code:
func CodelandUsernameValidation(str string) bool {
if len(str) < 4 || len(str) > 25 {
return false
}
for i, r := range str {
if i == 0 && !unicode.IsLetter(r) {
// str must start with a letter
return false
} else if !unicode.IsLetter(r) && !unicode.IsDigit(r) && !(r == '_') {
// str is restricted to letters, digit and _.
return false
}
}
return true
}
The first thing to know is that rune is nothing but an alias of int32. Single quotes denote a rune and double quotes denote a string, so instead of rune(char) == "-" it should be rune(char) == '-'.
Comment from the builtin package:
// rune is an alias for int32 and is equivalent to int32 in all ways. It is
// used, by convention, to distinguish character values from integer values.
Second, indexing into a string returns individual bytes, not characters, as in unicode.IsLetter(rune(str[0])). str[0] returns a byte, which is an alias of uint8, not a character. This fails in some cases because UTF-8 encodes some characters in more than one byte. For example, the character ⌘ is represented by the bytes [e2 8c 98], which are its UTF-8 encoding. In your example code, str[0] would return e2, which is not a complete UTF-8 sequence on its own; converted to a rune, it represents a different character (U+00E2). So decode the first rune instead, like this:
strbytes := []byte(str)
firstChar, size := utf8.DecodeRune(strbytes)
A for range loop, by contrast, decodes one UTF-8-encoded rune on each iteration. Each time around the loop, the index is the starting position of the current rune, measured in bytes, and the value is its code point. So in the example code, for _,char := range str {, the type of char is already rune, and converting a rune to a rune is redundant.
If you want to learn more about how strings work in Go, there is a great post by Rob Pike.
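The difference between indexing and ranging described above can be sketched with the same ⌘ character. The helper runeStarts is a hypothetical name used only for this illustration; utf8.DecodeRuneInString is the string counterpart of utf8.DecodeRune and avoids the []byte conversion.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts is a hypothetical helper that returns the byte offset at which
// each rune of s begins, as reported by a for range loop.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "⌘a"

	// Indexing yields only the first byte of the three-byte encoding.
	fmt.Printf("%x\n", s[0]) // e2

	// Decoding yields the whole code point and its width in bytes.
	r, size := utf8.DecodeRuneInString(s)
	fmt.Printf("%c %d\n", r, size) // ⌘ 3

	// Ranging visits runes, so the indices jump from 0 to 3.
	fmt.Println(runeStarts(s)) // [0 3]

	// len counts bytes; RuneCountInString counts code points.
	fmt.Println(len(s), utf8.RuneCountInString(s)) // 4 2
}
```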
You need to translate from str to []rune
r := []rune(str)
This must be the first line in the function CodelandUsernameValidation.
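A minimal sketch of what the []rune approach could look like, assuming the 4 to 25 limit is meant in characters rather than bytes (the question does not say). The function name validUsername is hypothetical. Converting to []rune copies the string, so the for range version is cheaper when only one pass is needed.

```go
package main

import (
	"fmt"
	"unicode"
)

// validUsername is a hypothetical variant of CodelandUsernameValidation that
// converts the string to []rune first, so that both the length check and the
// first-character check operate on code points instead of bytes.
func validUsername(s string) bool {
	r := []rune(s)
	if len(r) < 4 || len(r) > 25 {
		return false
	}
	// The first character must be a letter.
	if !unicode.IsLetter(r[0]) {
		return false
	}
	// The remaining characters may be letters, digits or underscores.
	for _, c := range r[1:] {
		if !unicode.IsLetter(c) && !unicode.IsDigit(c) && c != '_' {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validUsername("été_1")) // true: five runes, starts with a letter
	fmt.Println(validUsername("éa1"))   // false: four bytes, but only three runes
	fmt.Println(validUsername("_abcd")) // false: does not start with a letter
}
```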

Character index in line with UTF-8 files

I'm writing a lexical analyzer for UTF-8 text. When an error is detected, I'm supposed to give the line number and the index position in the line.
The user is expected to identify the location in the line by counting the characters they see on the screen (or on paper) until they reach the given index value. They could also use the in-line cursor index shown by some editors.
I suppose I can't simply use the rune count as the index, because some Unicode characters have zero width and are meant to be hidden markers or to combine with a non-zero-width Unicode character.
How am I supposed to deal with this?
Is there a function that gives the visual Unicode index, given a byte slice containing runes?
Also, does the line index in a file start at 0 or at 1?
I couldn't find anything in the standard library, but this seems to do it:
package main
import "github.com/rivo/uniseg"
func index(s, substr string) int {
g := uniseg.NewGraphemes(s)
for n := 0; g.Next(); n++ {
if g.Str() == substr { return n }
}
return -1
}
func main() {
n := index("Z a̎ B", "B")
println(n == 4)
}
https://pkg.go.dev/github.com/rivo/uniseg
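If pulling in a dependency is not an option, a rough standard-library approximation is to skip runes that normally take no column of their own. The sketch below ignores the Unicode categories Mn, Me and Cf; this is an assumption about what "invisible" means, and it is not full grapheme cluster segmentation as uniseg performs (emoji sequences and wide characters, for example, are not handled). The name visibleColumn is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// visibleColumn is a hypothetical helper that returns the 0-based number of
// visible characters preceding byteOffset in line. Runes in the categories
// Mn (nonspacing mark), Me (enclosing mark) and Cf (format) are not counted.
// This is an approximation, not grapheme cluster segmentation.
func visibleColumn(line string, byteOffset int) int {
	col := 0
	for i, r := range line {
		if i >= byteOffset {
			break
		}
		if unicode.In(r, unicode.Mn, unicode.Me, unicode.Cf) {
			continue
		}
		col++
	}
	return col
}

func main() {
	// "a" followed by U+030E (a combining mark) occupies one column.
	line := "Z a\u030e B"
	// 'B' starts at byte offset 6; four visible characters precede it.
	fmt.Println(visibleColumn(line, 6)) // 4
}
```

Add 1 to the result if the error messages should use 1-based columns.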

Is that possible to create a comparison operator from string?

I'm trying to create a function that will produce an if condition from a predefined array.
for example:
package errors
type errorCase struct {
// This is the field I need to get in another struct
Field string
// The comparison operator
TestOperator string
// The value that the actual one should not be equal to...
WrongValue interface{}
}
var ErrorCases = []*errorCase{ {
"MinValue",
"<",
0,
}, {
"MaxValue",
"==",
0,
}}
So far I have written a new function with a for loop that iterates through all of these "error cases":
func isDirty(questionInterface models.QuestionInterface) bool {
for _, errorCase := range errors.ErrorCases {
s := reflect.ValueOf(&questionInterface).Elem()
value := s.Elem().FieldByName(errorCase.Field)
// At this point I need to create my if condition
// to compare the value of the value var and the wrong one
// With the given comparison operator
}
// Should return the comparison test value
return true
}
Is that possible to create an if condition like that?
With the reflect package?
I think this is possible but I don't find where I should start.
This is possible. I built a generic comparison library like this once before.
A comparison, in simple terms, contains 3 parts:
A value of some sort, on the left of the comparison.
An operator (=, <, >, ...).
A value of some sort, on the right of the comparison.
Those 3 parts contain only two different types: value and operator. I attempted to abstract those two types into their base forms.
value could be anything, so we use the empty interface - interface{}.
operator is part of a finite set, each with their own rules.
type Operator int
const (
Equals Operator = 1
)
Evaluating a comparison with an = sign has only one rule to be valid: both values should be of the same type. You can't compare 1 and "hello". After that, you just have to make sure the values are the same.
We can implement a new meta-type that wraps the requirement for evaluating an operator.
// Function signature for a "rule" of an operator.
type validFn func(left, right interface{}) bool
// Function signature for evaluating an operator comparison.
type evalFn func(left, right interface{}) bool
type operatorMeta struct {
valid []validFn
eval evalFn
}
Now that we've defined our types, we need to implement the rules and comparison functions for Equals.
func sameTypes(left, right interface{}) bool {
return reflect.TypeOf(left).Kind() == reflect.TypeOf(right).Kind()
}
func equals(left, right interface{}) bool {
return reflect.DeepEqual(left, right)
}
Awesome! So we can now validate that our two values are of the same type, and we can compare them against each other if they are. The last piece of the puzzle is mapping the operator to its appropriate rules and evaluation, and having a function to execute all of this logic.
var args = map[Operator]operatorMeta{
Equals: {
valid: []validFn{sameTypes},
eval: equals,
},
}
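// compare needs the fmt package imported for its error values.
func compare(o Operator, left, right interface{}) (bool, error) {
opArgs, ok := args[o]
if !ok {
// No logic has been implemented for this operator.
return false, fmt.Errorf("unsupported operator %d", o)
}
for _, isValid := range opArgs.valid {
if !isValid(left, right) {
// One of the rules was not satisfied.
return false, fmt.Errorf("invalid operands for operator %d", o)
}
}
return opArgs.eval(left, right), nil
}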
Let's summarize what we have so far:
Abstracted a basic comparison into a value and operator.
Created a way to validate whether a pair of values are valid for an operator.
Created a way to evaluate an operator, given two values.
(Go Playground)
I hope that I gave some insight into how you can approach this. It's a simple idea, but can take some boilerplate to get working properly.
Good luck!
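For the operator strings in the question ("<", "=="), one simple way to start is a map from the string to a comparison function. The following is only a sketch under the assumption that both operands are signed integers, as in the ErrorCases example; the names intOperators, toInt64 and compareInts are hypothetical and not part of the answer above.

```go
package main

import (
	"fmt"
	"reflect"
)

// intOperators maps an operator string to its comparison on int64 operands.
var intOperators = map[string]func(left, right int64) bool{
	"==": func(l, r int64) bool { return l == r },
	"!=": func(l, r int64) bool { return l != r },
	"<":  func(l, r int64) bool { return l < r },
	"<=": func(l, r int64) bool { return l <= r },
	">":  func(l, r int64) bool { return l > r },
	">=": func(l, r int64) bool { return l >= r },
}

// toInt64 extracts a signed integer of any width from v.
// It reports false for every other kind, including nil.
func toInt64(v interface{}) (int64, bool) {
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return rv.Int(), true
	default:
		return 0, false
	}
}

// compareInts looks up op and applies it to left and right.
func compareInts(op string, left, right interface{}) (bool, error) {
	fn, ok := intOperators[op]
	if !ok {
		return false, fmt.Errorf("unsupported operator %q", op)
	}
	l, ok := toInt64(left)
	if !ok {
		return false, fmt.Errorf("left operand %v is not a signed integer", left)
	}
	r, ok := toInt64(right)
	if !ok {
		return false, fmt.Errorf("right operand %v is not a signed integer", right)
	}
	return fn(l, r), nil
}

func main() {
	fmt.Println(compareInts("<", -1, 0)) // true <nil>
	fmt.Println(compareInts("==", 5, 0)) // false <nil>
	fmt.Println(compareInts("~", 1, 2))  // false, plus an "unsupported operator" error
}
```

In isDirty, the left operand would be the value obtained from FieldByName (via its Interface method) and the right operand would be errorCase.WrongValue.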

Is there a more efficient way to handle string escaping in this function?

I'm migrating some existing code from another language. In the following function it's more or less a 1-1 migration, but given the newness of the language to me I'd like to know if there's better / more efficient ways to handle how the escaped string gets built:
func influxEscape(str string) string {
var chars = map[string]bool{
"\\": true,
"\"": true,
",": true,
"=": true,
" ": true,
}
var escapeStr = ""
for i := 0; i < len(str); i++ {
var char = string(str[i])
if chars[char] == true {
escapeStr += "\\" + char
} else {
escapeStr += char
}
}
return escapeStr
}
This code performs escaping to make string values compatible with the InfluxDB line protocol.
This should be a comment, but it needs too much room for that.
One more thing to consider—which I mentioned in a comment on Burak Serdar's answer—is what happens when your input string is not valid UTF-8.
Remember that a Go string is a byte sequence. It need not be valid Unicode. It may be intended to represent valid Unicode, or it may not. For instance, it could be ISO-Latin-1 or something else that might not play well with UTF-8.
If it is non-UTF-8, using a range loop on it will translate each invalid sequence to the invalid rune. (See the linked Go blog post.) If it is intended to be valid UTF-8, this may be a plus, and of course, you can check for the resulting RuneError.
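Your original loop never escapes bytes above ASCII DEL (127 or 0x7f), but it does not leave them alone either: string(str[i]) converts the byte value to a code point and emits that code point's UTF-8 encoding. If the bytes in the string are something like ISO-Latin-1, this may be the correct behavior. If the string is already UTF-8, it re-encodes each byte of a multi-byte character separately. Either way, you may be passing invalid, un-sanitized input to this other program. If you are deliberately sanitizing input, you must find out what kind of input it expects, and do a complete job of sanitizing input.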
(I still have scars from being forced to cope with a really poor XML encoder coupled to an old database from some number of jobs ago, so I tend to be extra-cautious here.)
This should be somewhat equivalent to your code:
out := bytes.Buffer{}
for _, x := range str {
if strings.IndexRune(`\",= `, x)!=-1 {
out.WriteRune('\\')
}
out.WriteRune(x)
}
return out.String()
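A minimal sketch of the same escaping as a complete function built on strings.Builder, which avoids the repeated string concatenation in the question. It works byte by byte and copies every byte through unchanged; that is safe here because all five escaped characters are ASCII, so they never occur inside a multi-byte UTF-8 sequence. Whether the result is what InfluxDB expects for non-ASCII input is not settled by the discussion above.

```go
package main

import (
	"fmt"
	"strings"
)

// influxEscape prefixes backslash, double quote, comma, equals sign and space
// with a backslash. All other bytes are copied through unchanged.
func influxEscape(s string) string {
	var b strings.Builder
	// Reserve at least the input length; the builder grows if escapes are added.
	b.Grow(len(s))
	for i := 0; i < len(s); i++ {
		switch c := s[i]; c {
		case '\\', '"', ',', '=', ' ':
			b.WriteByte('\\')
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	fmt.Println(influxEscape(`cpu load,host=a "b"`)) // cpu\ load\,host\=a\ \"b\"
}
```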

Golang - ToUpper() on a single byte?

I have a []byte, b, and I want to select a single byte, b[pos], and change it to upper case (and then lower case). The bytes package has a function called ToUpper(). How can I use this for a single byte?
Calling ToUpper on single Byte
OneOfOne gave the most efficient answer (when calling thousands of times). I use
val = byte(unicode.ToUpper(rune(b[pos])))
in order to find the byte and change the value
b[pos] = val
Checking if byte is Upper
Sometimes, instead of changing the case of a byte, I want to check if a byte is upper or lower case; All the upper case roman-alphabet bytes are lower than the value of the lower case bytes.
func (b Board) isUpper(x int) bool {
return b.board[x] < []byte{0x5a}[0]
}
For a single byte/rune, you can use unicode.ToUpper.
b[pos] = byte(unicode.ToUpper(rune(b[pos])))
I want to remind OP that bytes.ToUpper() operates on unicode code points encoded using UTF-8 in a byte slice while unicode.ToUpper() operates on a single unicode code point.
By asking to convert a single byte to upper case, OP is implying that the "b" byte slice contains something other than UTF-8, perhaps 7-bit ASCII or some 8-bit encoding such as ISO Latin-1. In that case OP needs to write a ToUpper() function for that encoding, or OP must convert the bytes to UTF-8 or Unicode before using the bytes.ToUpper() or unicode.ToUpper() function.
Anything less creates a pending bug. Neither of the previously mentioned functions will properly convert all possible characters of an 8-bit encoding such as ISO Latin-1 to upper case.
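A small sketch of the pitfall, assuming the bytes are meant to be plain ASCII with possible stray high bytes. The helper name asciiUpper is hypothetical. It only touches 'a' through 'z', whereas byte(unicode.ToUpper(rune(c))) can silently truncate when the upper-case code point does not fit in a byte.

```go
package main

import (
	"fmt"
	"unicode"
)

// asciiUpper is a hypothetical helper that upper-cases c only if it is an
// ASCII lowercase letter and returns every other byte unchanged.
func asciiUpper(c byte) byte {
	if 'a' <= c && c <= 'z' {
		return c - ('a' - 'A')
	}
	return c
}

func main() {
	b := []byte("north")
	b[1] = asciiUpper(b[1])
	fmt.Println(string(b)) // nOrth

	// 0xff is 'ÿ' in ISO Latin-1. Its Unicode upper case is U+0178, which
	// does not fit in a byte, so the conversion truncates it to 0x78 ('x').
	fmt.Printf("%#x\n", byte(unicode.ToUpper(rune(0xff)))) // 0x78

	// The ASCII-only helper leaves the byte alone.
	fmt.Printf("%#x\n", asciiUpper(0xff)) // 0xff
}
```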
Use the following code to test if an element of the board is an ASCII uppercase letter:
func (b Board) isUpper(x int) bool {
v := b.board[x]
return 'A' <= v && v <= 'Z'
}
If the application only needs to distinguish between upper and lowercase letters, then there's no need for the lower bound test:
func (b Board) isUpper(x int) bool {
return b.board[x] <= 'Z'
}
The code in this answer improves on the code in the question in a few ways:
The code in the answer returns the correct value for a board element containing 'Z' (run playground example below for demonstration).
'Z' and 0x5a are the same value, but the code is easier to understand with 'Z'.
It's simpler to compare directly with the value 'Z'. No need to create a slice.
playground example
Edit: Revamped answer based on new information in the question since time of my original answer.
You can use bytes.ToUpper; you just need to pass the input as a slice and take a single byte from the output:
package main
import "bytes"
func main() {
b, pos := []byte("north"), 1
b[pos] = bytes.ToUpper(b)[pos]
println(string(b) == "nOrth")
}
