Validation versus disjunction - validation

Suppose I want to write a method with the following signature:
def parse(input: List[(String, String)]):
ValidationNel[Throwable, List[(Int, Int)]]
For each pair of strings in the input, it needs to verify that both members can be parsed as integers and that the first is smaller than the second. It then needs to return the integers, accumulating any errors that turn up.
First I'll define an error type:
import scalaz._, Scalaz._
case class InvalidSizes(x: Int, y: Int) extends Exception(
s"Error: $x is not smaller than $y!"
)
Now I can implement my method as follows:
def checkParses(p: (String, String)):
ValidationNel[NumberFormatException, (Int, Int)] =
p.bitraverse[
({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
](
_.parseInt.toValidationNel,
_.parseInt.toValidationNel
)
def checkValues(p: (Int, Int)): Validation[InvalidSizes, (Int, Int)] =
if (p._1 >= p._2) InvalidSizes(p._1, p._2).failure else p.success
def parse(input: List[(String, String)]):
ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
checkParses(p).fold(_.failure, checkValues _ andThen (_.toValidationNel))
)
Or, alternatively:
def checkParses(p: (String, String)):
NonEmptyList[NumberFormatException] \/ (Int, Int) =
p.bitraverse[
({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
](
_.parseInt.toValidationNel,
_.parseInt.toValidationNel
).disjunction
def checkValues(p: (Int, Int)): InvalidSizes \/ (Int, Int) =
(p._1 >= p._2) either InvalidSizes(p._1, p._2) or p
def parse(input: List[(String, String)]):
ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
checkParses(p).flatMap(s => checkValues(s).leftMap(_.wrapNel)).validation
)
Now for whatever reason the first operation (validating that the pairs parse as strings) feels to me like a validation problem, while the second (checking the values) feels like a disjunction problem, and it feels like I need to compose the two monadically (which suggests that I should be using \/, since ValidationNel[Throwable, _] doesn't have a monad instance).
In my first implementation, I use ValidationNel throughout and then fold at the end as a kind of fake flatMap. In the second, I bounce back and forth between ValidationNel and \/ as appropriate depending on whether I need error accumulation or monadic binding. They produce the same results.
I've used both approaches in real code, and haven't yet developed a preference for one over the other. Am I missing something? Should I prefer one over the other?

This is probably not the answer you're looking, but I just noticed Validation has the following methods
/** Run a disjunction function and back to validation again. Alias for `#\/` */
def disjunctioned[EE, AA](k: (E \/ A) => (EE \/ AA)): Validation[EE, AA] =
k(disjunction).validation
/** Run a disjunction function and back to validation again. Alias for `disjunctioned` */
def #\/[EE, AA](k: (E \/ A) => (EE \/ AA)): Validation[EE, AA] =
disjunctioned(k)
When I saw them, I couldn't really see their usefulness until I remembered this question. They allow you to do a proper bind by converting to disjunction.
def checkParses(p: (String, String)):
ValidationNel[NumberFormatException, (Int, Int)] =
p.bitraverse[
({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
](
_.parseInt.toValidationNel,
_.parseInt.toValidationNel
)
def checkValues(p: (Int, Int)): InvalidSizes \/ (Int, Int) =
(p._1 >= p._2) either InvalidSizes(p._1, p._2) or p
def parse(input: List[(String, String)]):
ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
checkParses(p).#\/(_.flatMap(checkValues(_).leftMap(_.wrapNel)))
)

The following is a pretty close translation of the second version of my code for Cats:
import scala.util.Try
case class InvalidSizes(x: Int, y: Int) extends Exception(
s"Error: $x is not smaller than $y!"
)
def parseInt(input: String): Either[Throwable, Int] = Try(input.toInt).toEither
def checkValues(p: (Int, Int)): Either[InvalidSizes, (Int, Int)] =
if (p._1 >= p._2) Left(InvalidSizes(p._1, p._2)) else Right(p)
import cats.data.{EitherNel, ValidatedNel}
import cats.instances.either._
import cats.instances.list._
import cats.syntax.apply._
import cats.syntax.either._
import cats.syntax.traverse._
def checkParses(p: (String, String)): EitherNel[Throwable, (Int, Int)] =
(parseInt(p._1).toValidatedNel, parseInt(p._2).toValidatedNel).tupled.toEither
def parse(input: List[(String, String)]): ValidatedNel[Throwable, List[(Int, Int)]] =
input.traverse(fields =>
checkParses(fields).flatMap(s => checkValues(s).toEitherNel).toValidated
)
To update the question, this code is "bouncing back and forth between ValidatedNel and Either as appropriate depending on whether I need error accumulation or monadic binding".
In the almost six years since I asked this question, Cats has introduced a Parallel type class (improved in Cats 2.0.0) that solves exactly the problem I was running into:
import cats.data.EitherNel
import cats.instances.either._
import cats.instances.list._
import cats.instances.parallel._
import cats.syntax.either._
import cats.syntax.parallel._
def checkParses(p: (String, String)): EitherNel[Throwable, (Int, Int)] =
(parseInt(p._1).toEitherNel, parseInt(p._2).toEitherNel).parTupled
def parse(input: List[(String, String)]): EitherNel[Throwable, List[(Int, Int)]] =
input.parTraverse(fields =>
checkParses(fields).flatMap(checkValues(_).toEitherNel)
)
We can switch the the par version of our applicative operators like traverse or tupled when we want to accumulate errors, but otherwise we're working in Either, which gives us monadic binding, and we no longer have to refer to Validated at all.

Related

Spark repartitionAndSortWithinPartitions with tuples

I'm trying to follow this example to partition hbase rows: https://www.opencore.com/blog/2016/10/efficient-bulk-load-of-hbase-using-spark/
However, I have data already stored in (String, String, String) where the first is the rowkey, second is column name, and third is column value.
I tried writing an implicit ordering to achieve the OrderedRDD implicit
implicit val caseInsensitiveOrdering: Ordering[(String, String, String)] = new Ordering[(String, String, String)] {
override def compare(x: (String, String, String), y: (String, String, String)): Int = ???
}
but repartitionAndSortWithinPartitions is still not available. Is there a way I can use this method with this tuple?
RDD must have key and value, not only values, for ex.:
val data = List((("5", "6", "1"), (1)))
val rdd : RDD[((String, String, String), Int)] = sparkContext.parallelize(data)
implicit val caseInsensitiveOrdering = new Ordering[(String, String, String)] {
override def compare(x: (String, String, String), y: (String, String, String)): Int = 1
}
rdd.repartitionAndSortWithinPartitions(..)

"Expected type parameter, found reference to type parameter" when implementing a generic cache struct

In the Closures chapter of the second edition of The Rust Programming Language, the writer implements a Cache struct and leaves it with a few problems for the reader to fix up, such as:
Accepting generic parameters and return values on the closure function
Allowing more than one value to be cached
I've attempted to fix those problems but I am quite stuck and can't make it work.
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
calculation: T,
results: HashMap<X, Y>,
}
impl<T, X, Y> Cacher<T, X, Y>
where
T: Fn(&X) -> &Y,
X: Eq + Hash,
{
fn new(calculation: T) -> Cacher<T, X, Y> {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value<'a>(&'a mut self, arg: &'a X) -> &'a Y {
match self.results.get(arg) {
Some(v) => v,
None => {
let res = (self.calculation)(arg);
self.results.insert(*arg, res);
res
}
}
}
}
Where T is the closure function type, X is the argument type and Y is the return value type.
The error I get:
error[E0308]: mismatched types
--> src/main.rs:30:43
|
30 | self.results.insert(*arg, res);
| ^^^ expected type parameter, found &Y
|
= note: expected type `Y`
found type `&Y`
I understand this, but I can't think of an elegant solution for the whole ordeal.
You've stated that your closure returns a reference:
T: Fn(&X) -> &Y,
but then you are trying to store something that isn't a reference:
results: HashMap<X, Y>,
This is fundamentally incompatible; you need to unify the types.
In many cases, there's no reason to have a reference to a generic type because a generic type can already be a reference. Additionally, forcing the closure to return a reference means that a closure like |_| 42 would not be valid. Because of that, I'd say you should return and store the value type.
Next you need to apply similar logic to value, as it needs to take the argument by value in order to store it. Additionally, remove all the lifetimes from it as elision does the right thing: fn value(&mut self, arg: X) -> &Y.
Once you've straightened that out, apply the knowledge from How to lookup from and insert into a HashMap efficiently?:
fn value(&mut self, arg: X) -> &Y {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
Round it off with some tests that assert it's only called once, and you are good to go. Note that we had to make decisions along the way, but they aren't the only ones we could have chosen. For example, we could have made it so that the cached value is cloned when returned.
use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::hash::Hash;
struct Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
calculation: F,
results: HashMap<I, O>,
}
impl<F, I, O> Cacher<F, I, O>
where
F: Fn(&I) -> O,
I: Eq + Hash,
{
fn new(calculation: F) -> Self {
Cacher {
calculation,
results: HashMap::new(),
}
}
fn value(&mut self, arg: I) -> &O {
match self.results.entry(arg) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let res = (self.calculation)(e.key());
e.insert(res)
}
}
}
}
#[test]
fn called_once() {
use std::sync::atomic::{AtomicUsize, Ordering};
let calls = AtomicUsize::new(0);
let mut c = Cacher::new(|&()| {
calls.fetch_add(1, Ordering::SeqCst);
()
});
c.value(());
c.value(());
c.value(());
assert_eq!(1, calls.load(Ordering::SeqCst));
}

Map of methods in Go

I have several methods that I'm calling for some cases (like Add, Delete, etc..). However over time the number of cases is increasing and my switch-case is getting longer. So I thought I'd create a map of methods, like Go map of functions; here the mapping of functions is trivial. However, is it possible to create a map of methods in Go?
When we have a method:
func (f *Foo) Add(a string, b int) { }
The syntax below create compile-time error:
actions := map[string]func(a, b){
"add": f.Add(a,b),
}
Is it possible to create a map of methods in Go?
Yes. Currently:
actions := map[string]func(a string, b int){
"add": func(a string, b int) { f.Add(a, b) },
}
Later: see the go11func document guelfi mentioned.
There is currently no way to store both receiver and method in a single value (unless you store it in a struct). This is currently worked on and it may change with Go 1.1 (see http://golang.org/s/go11func).
You may, however, assign a method to a function value (without a receiver) and pass the receiver to the value later:
package main
import "fmt"
type Foo struct {
n int
}
func (f *Foo) Bar(m int) int {
return f.n + m
}
func main() {
foo := &Foo{2}
f := (*Foo).Bar
fmt.Printf("%T\n", f)
fmt.Println(f(foo, 42))
}
This value can be stored in a map like anything else.
I met with a similar question.
How can this be done today, 9 years later:
the thing is that the receiver must be passed to the method map as the first argument. Which is pretty unusual.
package main
import (
"fmt"
"log"
)
type mType struct {
str string
}
func (m *mType) getStr(s string) {
fmt.Println(s)
fmt.Println(m.str)
}
var (
testmap = make(map[string]func(m *mType, s string))
)
func main() {
test := &mType{
str: "Internal string",
}
testmap["GetSTR"] = (*mType).getStr
method, ok := testmap["GetSTR"]
if !ok {
log.Fatal("something goes wrong")
}
method(test, "External string")
}
https://go.dev/play/p/yy3aR_kMzHP
You can do this using Method Expressions:
https://golang.org/ref/spec#Method_expressions
However, this makes the function take the receiver as a first argument:
actions := map[string]func(Foo, string, int){
"add": Foo.Add
}
Similarly, you can get a function with the signature func(*Foo, string, int) using (*Foo).Add
If you want to use pointer to type Foo as receiver, like in:
func (f *Foo) Add(a string, b int) { }
then you can map string to function of (*Foo, string, int), like this:
var operations = map[string]func(*Foo, string, int){
"add": (*Foo).Add,
"delete": (*Foo).Delete,
}
Then you would use it as:
var foo Foo = ...
var op string = GetOp() // "add", "delete", ...
operations[op](&foo, a, b)
where GetOp() returns an operation as string, for example from a user input.
a and b are your string and int arguments to methods.
This assumes that all methods have the same signatures. They can also have return value(s), again of the same type(s).
It is also possible to do this with Foo as receiver instead of *Foo. In that case we don't have to de-reference it in the map, and we pass foo instead of &foo.

What is an idiomatic way of representing enums in Go?

I'm trying to represent a simplified chromosome, which consists of N bases, each of which can only be one of {A, C, T, G}.
I'd like to formalize the constraints with an enum, but I'm wondering what the most idiomatic way of emulating an enum is in Go.
Quoting from the language specs:Iota
Within a constant declaration, the predeclared identifier iota represents successive untyped integer constants. It is reset to 0 whenever the reserved word const appears in the source and increments after each ConstSpec. It can be used to construct a set of related constants:
const ( // iota is reset to 0
c0 = iota // c0 == 0
c1 = iota // c1 == 1
c2 = iota // c2 == 2
)
const (
a = 1 << iota // a == 1 (iota has been reset)
b = 1 << iota // b == 2
c = 1 << iota // c == 4
)
const (
u = iota * 42 // u == 0 (untyped integer constant)
v float64 = iota * 42 // v == 42.0 (float64 constant)
w = iota * 42 // w == 84 (untyped integer constant)
)
const x = iota // x == 0 (iota has been reset)
const y = iota // y == 0 (iota has been reset)
Within an ExpressionList, the value of each iota is the same because it is only incremented after each ConstSpec:
const (
bit0, mask0 = 1 << iota, 1<<iota - 1 // bit0 == 1, mask0 == 0
bit1, mask1 // bit1 == 2, mask1 == 1
_, _ // skips iota == 2
bit3, mask3 // bit3 == 8, mask3 == 7
)
This last example exploits the implicit repetition of the last non-empty expression list.
So your code might be like
const (
A = iota
C
T
G
)
or
type Base int
const (
A Base = iota
C
T
G
)
if you want bases to be a separate type from int.
Referring to the answer of jnml, you could prevent new instances of Base type by not exporting the Base type at all (i.e. write it lowercase). If needed, you may make an exportable interface that has a method that returns a base type. This interface could be used in functions from the outside that deal with Bases, i.e.
package a
type base int
const (
A base = iota
C
T
G
)
type Baser interface {
Base() base
}
// every base must fulfill the Baser interface
func(b base) Base() base {
return b
}
func(b base) OtherMethod() {
}
package main
import "a"
// func from the outside that handles a.base via a.Baser
// since a.base is not exported, only exported bases that are created within package a may be used, like a.A, a.C, a.T. and a.G
func HandleBasers(b a.Baser) {
base := b.Base()
base.OtherMethod()
}
// func from the outside that returns a.A or a.C, depending of condition
func AorC(condition bool) a.Baser {
if condition {
return a.A
}
return a.C
}
Inside the main package a.Baser is effectively like an enum now.
Only inside the a package you may define new instances.
You can make it so:
type MessageType int32
const (
TEXT MessageType = 0
BINARY MessageType = 1
)
With this code compiler should check type of enum
It's true that the above examples of using const and iota are the most idiomatic ways of representing primitive enums in Go. But what if you're looking for a way to create a more fully-featured enum similar to the type you'd see in another language like Java or Python?
A very simple way to create an object that starts to look and feel like a string enum in Python would be:
package main
import (
"fmt"
)
var Colors = newColorRegistry()
func newColorRegistry() *colorRegistry {
return &colorRegistry{
Red: "red",
Green: "green",
Blue: "blue",
}
}
type colorRegistry struct {
Red string
Green string
Blue string
}
func main() {
fmt.Println(Colors.Red)
}
Suppose you also wanted some utility methods, like Colors.List(), and Colors.Parse("red"). And your colors were more complex and needed to be a struct. Then you might do something a bit like this:
package main
import (
"errors"
"fmt"
)
var Colors = newColorRegistry()
type Color struct {
StringRepresentation string
Hex string
}
func (c *Color) String() string {
return c.StringRepresentation
}
func newColorRegistry() *colorRegistry {
red := &Color{"red", "F00"}
green := &Color{"green", "0F0"}
blue := &Color{"blue", "00F"}
return &colorRegistry{
Red: red,
Green: green,
Blue: blue,
colors: []*Color{red, green, blue},
}
}
type colorRegistry struct {
Red *Color
Green *Color
Blue *Color
colors []*Color
}
func (c *colorRegistry) List() []*Color {
return c.colors
}
func (c *colorRegistry) Parse(s string) (*Color, error) {
for _, color := range c.List() {
if color.String() == s {
return color, nil
}
}
return nil, errors.New("couldn't find it")
}
func main() {
fmt.Printf("%s\n", Colors.List())
}
At that point, sure it works, but you might not like how you have to repetitively define colors. If at this point you'd like to eliminate that, you could use tags on your struct and do some fancy reflecting to set it up, but hopefully this is enough to cover most people.
There is a way with struct namespace.
The benefit is all enum variables are under a specific namespace to avoid pollution.
The issue is that we could only use var not const
type OrderStatusType string
var OrderStatus = struct {
APPROVED OrderStatusType
APPROVAL_PENDING OrderStatusType
REJECTED OrderStatusType
REVISION_PENDING OrderStatusType
}{
APPROVED: "approved",
APPROVAL_PENDING: "approval pending",
REJECTED: "rejected",
REVISION_PENDING: "revision pending",
}
As of Go 1.4, the go generate tool has been introduced together with the stringer command that makes your enum easily debuggable and printable.
I am sure we have a lot of good answers here. But, I just thought of adding the way I have used enumerated types
package main
import "fmt"
type Enum interface {
name() string
ordinal() int
values() *[]string
}
type GenderType uint
const (
MALE = iota
FEMALE
)
var genderTypeStrings = []string{
"MALE",
"FEMALE",
}
func (gt GenderType) name() string {
return genderTypeStrings[gt]
}
func (gt GenderType) ordinal() int {
return int(gt)
}
func (gt GenderType) values() *[]string {
return &genderTypeStrings
}
func main() {
var ds GenderType = MALE
fmt.Printf("The Gender is %s\n", ds.name())
}
This is by far one of the idiomatic ways we could create Enumerated types and use in Go.
Edit:
Adding another way of using constants to enumerate
package main
import (
"fmt"
)
const (
// UNSPECIFIED logs nothing
UNSPECIFIED Level = iota // 0 :
// TRACE logs everything
TRACE // 1
// INFO logs Info, Warnings and Errors
INFO // 2
// WARNING logs Warning and Errors
WARNING // 3
// ERROR just logs Errors
ERROR // 4
)
// Level holds the log level.
type Level int
func SetLogLevel(level Level) {
switch level {
case TRACE:
fmt.Println("trace")
return
case INFO:
fmt.Println("info")
return
case WARNING:
fmt.Println("warning")
return
case ERROR:
fmt.Println("error")
return
default:
fmt.Println("default")
return
}
}
func main() {
SetLogLevel(INFO)
}
For a use case like this, it may be useful to use a string constant so it can be marshaled into a JSON string. In the following example, []Base{A,C,G,T} would get marshaled to ["adenine","cytosine","guanine","thymine"].
type Base string
const (
A Base = "adenine"
C = "cytosine"
G = "guanine"
T = "thymine"
)
When using iota, the values get marshaled into integers. In the following example, []Base{A,C,G,T} would get marshaled to [0,1,2,3].
type Base int
const (
A Base = iota
C
G
T
)
Here's an example comparing both approaches:
https://play.golang.org/p/VvkcWvv-Tvj
Here is an example that will prove useful when there are many enumerations. It uses structures in Golang, and draws upon Object Oriented Principles to tie them all together in a neat little bundle. None of the underlying code will change when a new enumeration is added or deleted. The process is:
Define an enumeration structure for enumeration items: EnumItem. It has an integer and string type.
Define the enumeration as a list of enumeration items: Enum
Build methods for the enumeration. A few have been included:
enum.Name(index int): returns the name for the given index.
enum.Index(name string): returns the name for the given index.
enum.Last(): returns the index and name of the last enumeration
Add your enumeration definitions.
Here is some code:
type EnumItem struct {
index int
name string
}
type Enum struct {
items []EnumItem
}
func (enum Enum) Name(findIndex int) string {
for _, item := range enum.items {
if item.index == findIndex {
return item.name
}
}
return "ID not found"
}
func (enum Enum) Index(findName string) int {
for idx, item := range enum.items {
if findName == item.name {
return idx
}
}
return -1
}
func (enum Enum) Last() (int, string) {
n := len(enum.items)
return n - 1, enum.items[n-1].name
}
var AgentTypes = Enum{[]EnumItem{{0, "StaffMember"}, {1, "Organization"}, {1, "Automated"}}}
var AccountTypes = Enum{[]EnumItem{{0, "Basic"}, {1, "Advanced"}}}
var FlagTypes = Enum{[]EnumItem{{0, "Custom"}, {1, "System"}}}
Refactored https://stackoverflow.com/a/17989915/863651 to make it a bit more readable:
package SampleEnum
type EFoo int
const (
A EFoo = iota
C
T
G
)
type IEFoo interface {
Get() EFoo
}
func(e EFoo) Get() EFoo { // every EFoo must fulfill the IEFoo interface
return e
}
func(e EFoo) otherMethod() { // "private"
//some logic
}
This is a safe way to implement enum in golang:
package main
import (
"fmt"
)
const (
MALE = _gender(1)
FEMALE = _gender(2)
RED = _color("RED")
GREEN = _color("GREEN")
BLUE = _color("BLUE")
)
type Gender interface {
_isGender()
Value() int
}
type _gender int
func (_gender) _isGender() {}
func (_g _gender) Value() int {
return int(_g)
}
type Color interface {
_isColor()
Value() string
}
type _color string
func (_color) _isColor() {}
func (_c _color) Value() string {
return string(_c)
}
func main() {
genders := []Gender{MALE, FEMALE}
colors := []Color{RED, GREEN, BLUE}
fmt.Println("Colors =", colors)
fmt.Println("Genders =", genders)
}
The output:
Colors = [RED GREEN BLUE]
Genders = [1 2]
Also, this is a pretty effective way to store different roles in one location in a byte, where the first value is set to 1, bit shifted by an iota.
package main
import "fmt"
const (
isCaptain = 1 << iota
isTrooper
isMedic
canFlyMars
canFlyJupiter
canFlyMoon
)
func main() {
var roles byte = isCaptain | isMedic | canFlyJupiter
//Prints a binary representation.
fmt.Printf("%b\n", roles)
fmt.Printf("%b\n", isCaptain)
fmt.Printf("%b\n", isTrooper)
fmt.Printf("%b\n", isMedic)
fmt.Printf("Is Captain? %v\n", isCaptain&roles == isCaptain)
fmt.Printf("Is Trooper? %v", isTrooper&roles == isTrooper)
}
I created the enum this way. Suppose we need an enum representing gender. Possible values are Male, Female, Others
package gender
import (
"fmt"
"strings"
)
type Gender struct {
g string
}
var (
Unknown = Gender{}
Male = Gender{g: "male"}
Female = Gender{g: "female"}
Other = Gender{g: "other"}
)
var genders = []Gender{
Unknown,
Male,
Female,
Other,
}
func Parse(code string) (parsed Gender, err error) {
for _, g := range genders {
if g.g == strings.ToLower(code) {
if g == Unknown {
err = fmt.Errorf("unknown gender")
}
parsed = g
return
}
}
parsed = Unknown
err = fmt.Errorf("unknown gender", code)
return
}
func (g Gender) Gender() string {
return g.g
}
A simpler way I have found to work.
const (
Stake TX = iota
Withdraw)
type TX int
func (t TX) String() string {
return [...]string{"STAKE", "WITHDRAW"}[t]}
log.Println(Stake.String()) --> STAKE

How to create a case insensitive map in Go?

I want to have a key insensitive string as key.
Is it supported by the language or do I have to create it myself?
thank you
Edit: What I am looking for is a way to make it by default instead of having to remember to convert the keys every time I use the map.
Edit: My initial code actually still allowed map syntax and thus allowed the methods to be bypassed. This version is safer.
You can "derive" a type. In Go we just say declare. Then you define methods on your type. It just takes a very thin wrapper to provide the functionality you want. Note though, that you must call get and set with ordinary method call syntax. There is no way to keep the index syntax or optional ok result that built in maps have.
package main
import (
"fmt"
"strings"
)
type ciMap struct {
m map[string]bool
}
func newCiMap() ciMap {
return ciMap{m: make(map[string]bool)}
}
func (m ciMap) set(s string, b bool) {
m.m[strings.ToLower(s)] = b
}
func (m ciMap) get(s string) (b, ok bool) {
b, ok = m.m[strings.ToLower(s)]
return
}
func main() {
m := newCiMap()
m.set("key1", true)
m.set("kEy1", false)
k := "keY1"
b, _ := m.get(k)
fmt.Println(k, "value is", b)
}
Two possiblities:
Convert to uppercase/lowercase if you're input set is guaranteed to be restricted to only characters for which a conversion to uppercase/lowercase will yield correct results (may not be true for some Unicode characters)
Convert to Unicode fold case otherwise:
Use unicode.SimpleFold(rune) to convert a unicode rune to fold case. Obviously this is dramatically more expensive an operation than simple ASCII-style case mapping, but it is also more portable to other languages. See the source code for EqualsFold to see how this is used, including how to extract Unicode runes from your source string.
Obviously you'd abstract this functionality into a separate package instead of re-implementing it everywhere you use the map. This should go without saying, but then you never know.
Here is something more robust than just strings.ToLower, you can use
the golang.org/x/text/cases package. Example:
package main
import "golang.org/x/text/cases"
func main() {
s := cases.Fold().String("March")
println(s == "march")
}
If you want to use something from the standard library, I ran this test:
package main
import (
"strings"
"unicode"
)
func main() {
var (
lower, upper int
m = make(map[string]bool)
)
for n := '\u0080'; n <= '\u07FF'; n++ {
q, r := n, n
for {
q = unicode.SimpleFold(q)
if q == n { break }
for {
r = unicode.SimpleFold(r)
if r == n { break }
s, t := string(q), string(r)
if m[t + s] { continue }
if strings.ToLower(s) == strings.ToLower(t) { lower++ }
if strings.ToUpper(s) == strings.ToUpper(t) { upper++ }
m[s + t] = true
}
}
}
println(lower == 951, upper == 989)
}
So as can be seen, ToUpper is the marginally better choice.

Resources