How to create a case insensitive map in Go? - go

I want a map whose string keys are case-insensitive.
Is it supported by the language or do I have to create it myself?
thank you
Edit: What I am looking for is a way to make it by default instead of having to remember to convert the keys every time I use the map.

Edit: My initial code actually still allowed map syntax and thus allowed the methods to be bypassed. This version is safer.
You can "derive" a type. In Go we just say declare. Then you define methods on your type. It just takes a very thin wrapper to provide the functionality you want. Note though, that you must call get and set with ordinary method call syntax. There is no way to keep the index syntax or optional ok result that built in maps have.
package main
import (
"fmt"
"strings"
)
type ciMap struct {
m map[string]bool
}
func newCiMap() ciMap {
return ciMap{m: make(map[string]bool)}
}
func (m ciMap) set(s string, b bool) {
m.m[strings.ToLower(s)] = b
}
func (m ciMap) get(s string) (b, ok bool) {
b, ok = m.m[strings.ToLower(s)]
return
}
func main() {
m := newCiMap()
m.set("key1", true)
m.set("kEy1", false)
k := "keY1"
b, _ := m.get(k)
fmt.Println(k, "value is", b)
}
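With Go 1.18 or later, the same wrapper can be made generic so it is not tied to bool values. This is only a sketch: the type parameter V and the reuse of the names ciMap, newCiMap, set and get are my assumptions, not part of the answer above.

```go
package main

import (
	"fmt"
	"strings"
)

// ciMap is a hypothetical generic variant of the wrapper above: keys are
// lowercased on every access, values can be of any type V.
type ciMap[V any] struct {
	m map[string]V
}

func newCiMap[V any]() ciMap[V] {
	return ciMap[V]{m: make(map[string]V)}
}

// set stores v under the lowercased key.
func (c ciMap[V]) set(k string, v V) {
	c.m[strings.ToLower(k)] = v
}

// get looks up the lowercased key and reports whether it was present.
func (c ciMap[V]) get(k string) (V, bool) {
	v, ok := c.m[strings.ToLower(k)]
	return v, ok
}

func main() {
	m := newCiMap[int]()
	m.set("Key1", 1)
	m.set("KEY1", 2) // overwrites the same entry
	v, ok := m.get("key1")
	fmt.Println(v, ok, len(m.m)) // 2 true 1
}
```

Because the inner map is an unexported field, code outside the package cannot bypass set and get with index syntax.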

Two possibilities:
Convert to uppercase/lowercase if your input set is guaranteed to be restricted to characters for which a conversion to uppercase/lowercase yields correct results (this may not be true for some Unicode characters).
Convert to Unicode fold case otherwise:
Use unicode.SimpleFold(rune) to fold a Unicode rune. It steps through the runes that are equivalent under simple case folding, so you can pick one of them as the canonical form. This is considerably more expensive than simple ASCII-style case mapping, but it also carries over better to other (human) languages. See the source code of strings.EqualFold to see how this is used, including how to extract Unicode runes from your source string.
Obviously you'd abstract this functionality into a separate package instead of re-implementing it everywhere you use the map. This should go without saying, but then you never know.
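A minimal sketch of the fold-case option, using only the standard library. The helpers foldRune and foldKey are hypothetical names; the idea is to walk each rune's SimpleFold orbit and keep the smallest rune as the canonical form. It covers simple folding only, not full case folding or normalization.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// foldRune returns the smallest rune in r's simple case-folding orbit,
// which serves as a canonical representative for all case variants of r.
func foldRune(r rune) rune {
	lowest := r
	for f := unicode.SimpleFold(r); f != r; f = unicode.SimpleFold(f) {
		if f < lowest {
			lowest = f
		}
	}
	return lowest
}

// foldKey (a hypothetical helper) normalizes a string for use as a map key.
func foldKey(s string) string {
	return strings.Map(foldRune, s)
}

func main() {
	m := map[string]int{}
	m[foldKey("Kelvin")] = 1
	m[foldKey("\u212Aelvin")] = 2 // U+212A KELVIN SIGN folds with 'K' and 'k'
	fmt.Println(len(m), m[foldKey("KELVIN")]) // 1 2
}
```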

Here is something more robust than just strings.ToLower: you can use
the golang.org/x/text/cases package. Example:
package main
import "golang.org/x/text/cases"
func main() {
s := cases.Fold().String("March")
println(s == "march")
}
If you want to use something from the standard library, I ran this test:
package main
import (
	"strings"
	"unicode"
)
func main() {
	var (
		lower, upper int
		m = make(map[string]bool)
	)
	for n := '\u0080'; n <= '\u07FF'; n++ {
		q, r := n, n
		for {
			q = unicode.SimpleFold(q)
			if q == n { break }
			for {
				r = unicode.SimpleFold(r)
				if r == n { break }
				s, t := string(q), string(r)
				if m[t + s] { continue }
				if strings.ToLower(s) == strings.ToLower(t) { lower++ }
				if strings.ToUpper(s) == strings.ToUpper(t) { upper++ }
				m[s + t] = true
			}
		}
	}
	println(lower == 951, upper == 989)
}
So as can be seen, ToUpper is the marginally better choice.

Related

Can 'map' and 'reduce' be implemented in Go with generics

I decided that now that generics have been introduced into Go that something like map/reduce should be possible. So, I took a naive stab at it and I get the error:
./prog.go:18:36: cannot use thing (variable of type int) as type I in argument to mapper
Which doesn't explain if the problem is fundamental or I am simply doing something wrong syntactically. Can generic map/reduce be implemented in Go?
package main
import "fmt"
func main() {
things := []int{1, 2, 3, 4}
results := Map(things, func(t int) int {
return t + 1
})
fmt.Printf("%v", results)
}
func Map[I interface{}, O interface{}](things []I, mapper func(thing I) O) []O {
results := make([]O, 0, len(things))
for thing := range things {
results = append(results, mapper(thing))
}
return results
}
You have incorrect use of range. A single variable extracted from range will be the index (type int), not the value (type I, which is only coincidentally int in this case).
Try
for _, thing := range things{...}
This can be done quite easily. You have an error in your code, though right here:
for thing := range things {
You are iterating over the index values (int), not the values of type I. You're also specifying 2 constraints (types I and O) both set to be interface{}. You can just use any instead (it's shorthand for interface{})
So simply write:
func Map[T any, O any](things []T, mapper func(thing T) O) []O {
result := make([]O, 0, len(things))
for _, thing := range things {
result = append(result, mapper(thing))
}
return result
}
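The question also asks about reduce, which the answers do not show. A possible sketch in the same style follows; the name Reduce, its parameter order, and the accumulator type parameter A are my own choices, not an established API.

```go
package main

import "fmt"

// Map applies mapper to each element, as in the answer above.
func Map[T, O any](things []T, mapper func(T) O) []O {
	result := make([]O, 0, len(things))
	for _, thing := range things {
		result = append(result, mapper(thing))
	}
	return result
}

// Reduce folds things into a single value, starting from initial.
func Reduce[T, A any](things []T, initial A, reducer func(A, T) A) A {
	acc := initial
	for _, thing := range things {
		acc = reducer(acc, thing)
	}
	return acc
}

func main() {
	things := []int{1, 2, 3, 4}
	incremented := Map(things, func(t int) int { return t + 1 })
	sum := Reduce(incremented, 0, func(acc, t int) int { return acc + t })
	fmt.Println(incremented, sum) // [2 3 4 5] 14
}
```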
This is quite closely related to some code I reviewed on codereview exchange here. After going through the code, and writing snippets with a ton of suggestions, I decided to just create a package and throw it up on github instead. You can find the repo here.
In it, there are some examples that may come in handy, or help you work through some other quirks WRT generics in golang. I was specifically thinking about this bit, where you can filter a generic map type using callbacks like so:
// given the sMap type
type sMap[K comparable, V any] struct {
mu *sync.RWMutex
m map[K]V
}
// Filter returns a map containing the elements that matched the filter callback argument
func (s *sMap[K, V]) Filter(cb func(K, V) bool) map[K]V {
s.mu.RLock()
defer s.mu.RUnlock()
ret := make(map[K]V, len(s.m))
for k, v := range s.m {
if cb(k, v) {
ret[k] = v
}
}
return ret
}
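A short usage sketch for the Filter method. The constructor newSMap is hypothetical (the snippet above does not show how the repo builds an sMap), so treat it as an assumption made only to get something runnable.

```go
package main

import (
	"fmt"
	"sync"
)

type sMap[K comparable, V any] struct {
	mu *sync.RWMutex
	m  map[K]V
}

// newSMap is a hypothetical constructor, not taken from the linked repo.
// It copies init so the caller cannot mutate the map behind the lock.
func newSMap[K comparable, V any](init map[K]V) *sMap[K, V] {
	m := make(map[K]V, len(init))
	for k, v := range init {
		m[k] = v
	}
	return &sMap[K, V]{mu: &sync.RWMutex{}, m: m}
}

// Filter returns a map containing the elements that matched the callback.
func (s *sMap[K, V]) Filter(cb func(K, V) bool) map[K]V {
	s.mu.RLock()
	defer s.mu.RUnlock()
	ret := make(map[K]V, len(s.m))
	for k, v := range s.m {
		if cb(k, v) {
			ret[k] = v
		}
	}
	return ret
}

func main() {
	s := newSMap(map[string]int{"a": 1, "b": 2, "c": 3})
	odd := s.Filter(func(_ string, v int) bool { return v%2 == 1 })
	fmt.Println(odd) // map[a:1 c:3]
}
```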

Golang Convert UTF-8 string to ASCII [duplicate]

How can I remove all diacritics from the given UTF8 encoded string using Go? e.g. transform the string "žůžo" => "zuzo". Is there a standard way?
You can use the libraries described in Text normalization in Go.
Here's an application of those libraries:
// Example derived from: http://blog.golang.org/normalization
package main
import (
"fmt"
"unicode"
"golang.org/x/text/transform"
"golang.org/x/text/unicode/norm"
)
func isMn(r rune) bool {
return unicode.Is(unicode.Mn, r) // Mn: nonspacing marks
}
func main() {
t := transform.Chain(norm.NFD, transform.RemoveFunc(isMn), norm.NFC)
result, _, _ := transform.String(t, "žůžo")
fmt.Println(result)
}
To expand a bit on the existing answer:
The internet standard for comparing strings of different character sets is called "PRECIS" (Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols) and is documented in RFC7564. There is also a Go implementation at golang.org/x/text/secure/precis.
None of the standard profiles will do what you want, but it would be fairly straightforward to define a new profile that did. You would want to apply Unicode Normalization Form D ("D" for "Decomposition", which means the accents will be split off and be their own combining character), and then remove any combining character as part of the additional mapping rule, then recompose with the normalization rule. Something like this:
package main
import (
"fmt"
"unicode"
"golang.org/x/text/secure/precis"
"golang.org/x/text/transform"
"golang.org/x/text/unicode/norm"
)
func main() {
loosecompare := precis.NewIdentifier(
precis.AdditionalMapping(func() transform.Transformer {
return transform.Chain(norm.NFD, transform.RemoveFunc(func(r rune) bool {
return unicode.Is(unicode.Mn, r)
}))
}),
precis.Norm(norm.NFC), // This is the default; be explicit though.
)
p, _ := loosecompare.String("žůžo")
fmt.Println(p, loosecompare.Compare("žůžo", "zuzo"))
// Prints "zuzo true"
}
This lets you expand your comparison with more options later (eg. width mapping, case mapping, etc.)
It's also worth noting that removing accents is almost never what you actually want to do when comparing strings like this, however, without knowing your use case I can't actually make that assertion about your project. To prevent the proliferation of precis profiles it's good to use one of the existing profiles where possible. Also note that no effort was made to optimize the example profile.
transform.RemoveFunc is deprecated.
Instead you can use the Remove function from runes package:
t := transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)
result, _, _ := transform.String(t, "žůžo")
fmt.Println(result)
For anyone looking how to remove (or replace / flatten) Polish diacritics in Go, you may define a mapping for runes:
package main
import (
"fmt"
"golang.org/x/text/runes"
"golang.org/x/text/secure/precis"
"golang.org/x/text/transform"
"golang.org/x/text/unicode/norm"
)
func main() {
trans := transform.Chain(
norm.NFD,
precis.UsernameCaseMapped.NewTransformer(),
runes.Map(func(r rune) rune {
switch r {
case 'ą':
return 'a'
case 'ć':
return 'c'
case 'ę':
return 'e'
case 'ł':
return 'l'
case 'ń':
return 'n'
case 'ó':
return 'o'
case 'ś':
return 's'
case 'ż':
return 'z'
case 'ź':
return 'z'
}
return r
}),
norm.NFC,
)
result, _, _ := transform.String(trans, "ŻóŁć")
fmt.Println(result)
}
On Go Playground: https://play.golang.org/p/3ulPnOd3L91

Function that takes map and only cares about key type

I have two maps, both of them are keyed by strings, but the values are of two different custom types.
map[string]type1
map[string]type2
Now I want to write a function which can take an argument of either of these two types, because that function only looks at the keys and doesn't care about the values at all. So I think it should look like this:
func takeTheMap(argument map[string]interface{}) {
...
But that doesn't work due to:
cannot use myVariable (type map[string]customType) as type map[string]interface {} in argument to takeTheMap
https://play.golang.org/p/4Xkhi4HekO5
Can I make that work somehow?
The only polymorphism in Go is interfaces. The only alternatives to that are reflection, duplication, or rethinking the broader design so that you don't need to do what you're trying to do here.
If the last option isn't a possibility, personally I would recommend duplication, since it's a whole four lines of code.
keys := make([]string, 0, len(myMap))
for key := range myMap {
keys = append(keys,key)
}
A big complicated generic helper seems kind of unnecessary.
A solution using an interface. This example may seem like overkill, and in your case it may be better (I'm not sure; there aren't enough details in your example) to just use a couple of for loops.
package main
import (
"fmt"
)
type foo bool
type bar string
type mapOne map[string]foo
type mapTwo map[string]bar
func (m mapOne) Keys() []string {
s := []string{}
for k := range m {
s = append(s, k)
}
return s
}
func (m mapTwo) Keys() []string {
s := []string{}
for k := range m {
s = append(s, k)
}
return s
}
type ToKeys interface {
Keys() []string
}
func main() {
m1 := mapOne{"one": true, "two": false}
m2 := mapTwo{"three": "foo", "four": "bar"}
doSomething(m1)
doSomething(m2)
}
func doSomething(m ToKeys) {
fmt.Println(m.Keys())
}
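The answers above predate generics. Since Go 1.18, a type parameter on the value type is another way to write a function that only cares about the keys. The sketch below is an assumption-laden illustration: type1 and type2 stand in for the question's custom types, and keys is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
)

// type1 and type2 are placeholders for the question's custom value types.
type type1 struct{ n int }
type type2 struct{ s string }

// keys returns the keys of any map keyed by strings, whatever the value type.
func keys[V any](m map[string]V) []string {
	out := make([]string, 0, len(m))
	for k := range m {
		out = append(out, k)
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	m1 := map[string]type1{"one": {1}, "two": {2}}
	m2 := map[string]type2{"three": {"x"}}
	fmt.Println(keys(m1), keys(m2)) // [one two] [three]
}
```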

Using default value in golang func

I'm trying to implement a default value according to the option 1 of the post Golang and default values. But when I try to do go install the following error pops up in the terminal:
not enough arguments in call to test.Concat1
have ()
want (string)
Code:
package test
func Concat1(a string) string {
if a == "" {
a = "default-a"
}
return fmt.Sprintf("%s", a)
}
// other package
package main
func main() {
test.Concat1()
}
Thanks in advance.
I don't think what you are trying to do will work that way. You may want to opt for option #4 from the page you cited, which uses variadic parameters. In your case it looks like you want just a string, so it'd be something like this:
func Concat1(a ...string) string {
if len(a) == 0 {
return "default-a"
}
return a[0]
}
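For completeness, a runnable sketch of the variadic approach with the calls from the question. Treating an explicit empty string the same as a missing argument is my assumption, made to match the original Concat1.

```go
package main

import "fmt"

// Concat1 returns its first argument, or "default-a" when called with no
// arguments or with an empty string. Extra arguments are ignored.
func Concat1(a ...string) string {
	if len(a) == 0 || a[0] == "" {
		return "default-a"
	}
	return a[0]
}

func main() {
	fmt.Println(Concat1())        // default-a
	fmt.Println(Concat1(""))      // default-a
	fmt.Println(Concat1("hello")) // hello
}
```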
Go does not have optional defaults for function arguments.
You may emulate them to some extent by having a special type
to contain the set of parameters for a function.
In your toy example that would be something like
type Concat1Args struct {
a string
}
func Concat1(args Concat1Args) string {
if args.a == "" {
args.a = "default-a"
}
return fmt.Sprintf("%s", args.a)
}
The "trick" here is that in Go each type has its respective
"zero value", and when producing a value of a composite type
with a composite literal, it's possible to initialize only some of the type's fields (or none), so in our example that would be
s := Concat1(Concat1Args{})
vs
s := Concat1(Concat1Args{"whatever"})
I know that looks clumsy, and I have shown this mostly for
demonstration purposes. In real production code, where a function
might have a dozen parameters or more, packing them
into a dedicated composite type is usually the only sensible way
to go, but for a case like yours it's better to just explicitly
pass "" to the function.
Golang does not support default parameters. Accordingly, variadic arguments by themselves are not analogous. However, variadic functions with the use of error handling can 'resemble' the pattern. Try the following as a simple example:
package main
import (
"errors"
"log"
)
func createSeries(p ...int) ([]int, error) {
usage := "Usage: createSeries(<length>, <optional starting value>), length should be > 0"
if len(p) == 0 {
return nil, errors.New(usage)
}
n := p[0]
if n <= 0 {
return nil, errors.New(usage)
}
var base int
if len(p) == 2 {
base = p[1]
} else if len(p) > 2 {
return nil, errors.New(usage)
}
vals := make([]int, n)
for i := 0; i < n; i++ {
vals[i] = base + i
}
return vals, nil
}
func main() {
answer, err := createSeries(4, -9)
if err != nil {
log.Fatal(err)
}
log.Println(answer)
}
Go has no default parameters. A function can have one variadic parameter (written with an ellipsis), always at the end, which collects any number of values of the same type into a slice, so in your case this would be:
func Concat1(a ...string) string {
but that means that the caller may pass in any number of arguments >= 0. You also need to check that the slice is not empty before reading from it, and then assign the default yourself; there is no special syntax for it in Go. You can do
s := "default value"
if len(a) > 0 && a[0] != "" {
	s = a[0]
}
If you want to make sure that the user passes either zero or one strings, just create two functions in your API, e.g.
func Concat(a string) string { // ...
func ConcatDefault() string {
	return Concat("default value")
}

strings - get characters before a digit

I have some strings such as E2 9NZ, N29DZ, EW29DZ. I need to extract the characters before the first digit; for the above examples: E, N, EW.
Am I supposed to use regex? The strings package looks really nice but doesn't seem to handle this case (extract everything before a specific type of character).
Edit:
To clarify the question: I'm wondering which method is more idiomatic in Go and likely to perform better.
For example,
package main
import (
"fmt"
"unicode"
)
func DigitPrefix(s string) string {
for i, r := range s {
if unicode.IsDigit(r) {
return s[:i]
}
}
return s
}
func main() {
fmt.Println(DigitPrefix("E2 9NZ"))
fmt.Println(DigitPrefix("N29DZ"))
fmt.Println(DigitPrefix("EW29DZ"))
fmt.Println(DigitPrefix("WXYZ"))
}
Output:
E
N
EW
WXYZ
If there is no digit, example "WXYZ", and you don't want anything returned, change return s to return "".
Not sure why almost everyone provided answers in everything but Go. Here is a regex-based Go version:
package main
import (
"fmt"
"regexp"
)
func main() {
pattern, err := regexp.Compile("^[^\\d]*")
if err != nil {
panic(err)
}
part := pattern.Find([]byte("EW29DZ"))
if part != nil {
fmt.Printf("Found: %s\n", string(part))
} else {
fmt.Println("Not found")
}
}
Running:
% go run main.go
Found: EW
We don't need regex for this problem. You can simply walk through a slice of runes and check each character with unicode.IsDigit(): if it's a digit, return everything before it; if it isn't, continue the loop. If there are no digits, return the argument unchanged.
Code
package main
import (
"fmt"
"unicode"
)
func UntilDigit(r []rune) []rune {
	for i, v := range r {
		if unicode.IsDigit(v) {
			return r[:i]
		}
	}
	return r
}
func main() {
fmt.Println(string(UntilDigit([]rune("E2 9NZ"))))
fmt.Println(string(UntilDigit([]rune("N29DZ"))))
fmt.Println(string(UntilDigit([]rune("EW29DZ"))))
}
I think the best option is to use the index returned by strings.IndexAny, which returns the index of the first character in a string that belongs to a given set of characters.
func BeforeNumbers(str string) string {
value := strings.IndexAny(str,"0123456789")
if value >= 0 && value <= len(str) {
return str[:value]
}
return str
}
This slices the string and returns the substring up to (but not including) the first character that appears in "0123456789", i.e. the first digit.
Way later edit:
It would probably be better to use IndexFunc rather than IndexAny:
func BeforeNumbers(str string) string {
indexFunc := func(r rune) bool {
return r >= '0' && r <= '9'
}
value := strings.IndexFunc(str,indexFunc)
if value >= 0 && value <= len(str) {
return str[:value]
}
return str
}
This is more or less equivalent to the loop version, and avoids what IndexAny did in my previous answer: searching the digit string for a match on every character of the input. I think it looks cleaner than the loop version, but that is obviously a matter of taste.
The code below (Java, for comparison) keeps grabbing characters until it reaches a digit or the end of the string.
int i = 0;
String string2test = "EW29DZ";
String stringOutput = "";
while (i < string2test.length() && !Character.isDigit(string2test.charAt(i)))
{
stringOutput = stringOutput + string2test.charAt(i);
i++;
}

Resources