Solving wordle efficiently (for humans and for computers) is all the rage right now.
One particular way of solving a wordle made me curious. The idea is to select 5 words that have distinct letters so you'll end up with 25 characters. If you use these 5 words as your first 5 guesses in the game, you'll have a close to 100% chance of getting the correct word in your last guess (it's essentially an anagram of all the clues and you'll probably have a few green ones). There is a set of words that is suggested (all of the words are valid English words):
brick
glent
jumpy
vozhd
waqfs
But this made me wonder: how many of these 5-word combinations are out there? I started whipping up a recursive algorithm, but I am close to giving up.
My initial thought was:
Start with the first word
reduce overlapping words from the word list
pick the next remaining word in the word list
Repeat with the next word
But this only really works if you have a set of five distinct words in order.
For this list:
brick
feast
glent
jumpy
vozhd
waqfs
I will end up with: [brick, feast, jumpy, vozhd] because feast comes before glent and will filter it out but in the end glent would have been the better pick.
I wasn't able to find any algorithms for this specific problem so I was wondering if there is any existing algorithm that can be applied to this?
It's possible to brute-force this. For efficiency, one can discard all words with duplicate letters, and pre-process the words to use a bitmask of which letters they have (there are 26 letters, so this fits in a 32-bit unsigned integer).
Then just do a depth-first search, maintaining a list of words (bitmasks) that don't intersect with the words found so far.
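As a small illustration of the bitmask idea (a sketch only; the helper names wordMask, hasFiveDistinct and disjoint are made up here, and lowercase ASCII words are assumed), two words share no letter exactly when the bitwise AND of their masks is zero:

```go
package main

import (
	"fmt"
	"math/bits"
)

// wordMask sets bit i for every letter 'a'+i that occurs in w.
// Assumption: w contains only lowercase ASCII letters.
func wordMask(w string) uint32 {
	var m uint32
	for _, c := range w {
		m |= 1 << uint(c-'a')
	}
	return m
}

// hasFiveDistinct reports whether w is five letters long with no repeats.
func hasFiveDistinct(w string) bool {
	return len(w) == 5 && bits.OnesCount32(wordMask(w)) == 5
}

// disjoint reports whether a and b have no letter in common.
func disjoint(a, b string) bool {
	return wordMask(a)&wordMask(b) == 0
}

func main() {
	fmt.Println(disjoint("brick", "glent")) // true
	fmt.Println(disjoint("feast", "glent")) // false: both contain e and t
	fmt.Println(hasFiveDistinct("sleep"))   // false: e is repeated
}
```

The depth-first search then only ever needs one AND per candidate word against the accumulated mask of the words chosen so far.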
I've written some Go code that does this. It uses a shortened list of words that just contains the solution words (the full wordlist is too long to include here), but the code runs in a few seconds even with the full list.
Because it uses bitmasks to represent words, several words with the same set of letters can appear in one position of a solution. The program shows those with a | in between. There's just one such pair in the solutions, cylix|xylic:
bling treck waqfs jumpy vozhd
pling treck waqfs jumby vozhd
brick glent waqfs jumpy vozhd
kreng clipt waqfs jumby vozhd
fjord chunk vibex gymps waltz
fjord gucks vibex nymph waltz
prick glent waqfs jumby vozhd
kempt brung waqfs cylix|xylic vozhd
blunk waqfs cimex grypt vozhd
clunk waqfs bemix grypt vozhd
It can be run here: https://go.dev/play/p/wVEDjx3G1fE
package main
import (
"fmt"
"math/bits"
"sort"
"strings"
)
var allWords = []string{
"bemix", "bling", "blunk", "brick", "brung", "chunk", "cimex", "clipt", "clunk", "cylix", "fjord", "glent", "grypt", "gucks", "gymps", "jumby", "jumpy", "kempt", "kreng", "nymph", "pling", "prick", "treck", "vibex", "vozhd", "waltz", "waqfs", "xylic",
}
func printSol(res []uint32, masks map[uint32][]string) {
var b strings.Builder
for i, r := range res {
if i > 0 {
b.WriteString(" ")
}
b.WriteString(strings.Join(masks[r], "|"))
}
fmt.Println(b.String())
}
func find5(w []uint32, mask uint32, n int, res []uint32, masks map[uint32][]string) {
if n == 5 {
printSol(res, masks)
return
}
sub := []uint32{}
for _, x := range w {
if x&mask != 0 {
continue
}
sub = append(sub, x)
}
for i, x := range sub {
res[n] = x
find5(sub[i+1:], mask|x, n+1, res, masks)
}
}
func find5clique() {
masks := map[uint32][]string{}
for _, x := range allWords {
m := uint32(0)
for _, c := range x {
m |= 1 << (c - 'a')
}
if bits.OnesCount32(m) == 5 {
masks[m] = append(masks[m], x)
}
}
maskSlice := []uint32{}
for m := range masks {
maskSlice = append(maskSlice, m)
}
sort.Slice(maskSlice, func(i, j int) bool {
return maskSlice[i] < maskSlice[j]
})
find5(maskSlice, uint32(0), 0, make([]uint32, 5, 5), masks)
}
func main() {
find5clique()
}
My choice of 4 words: batch, field, wrong, musky
Works very well for all forms of *ordles
Can’t find a fifth word with the remaining letters, though.
I'm currently working on the same thing. (Implemented In Python)
Here's My Code (Explanation Below):
import requests,string
wordlist = str(requests.get('https://gist.githubusercontent.com/dracos/dd0668f281e685bad51479e5acaadb93/raw/ca9018b32e963292473841fb55fd5a62176769b5/valid-wordle-words.txt').content).split('\\n');wordlist[0] = 'aahed'
alphabet = [str(_) for _ in string.ascii_lowercase]
for word in wordlist:
if [char in word for char in alphabet].count(True) != 5:
wordlist.remove(word)
for i in range(len(wordlist) ** 2):
alphabet = [str(_) for _ in string.ascii_lowercase]
currentwords=[]
for word in wordlist:
for char in word:
alphabet.remove(char)
currentwords.append(word)
with open("out.txt", "a") as f:
f.write(";".join(currentwords))
f.write("\n")
wordlist.pop(wordlist.index(currentwords[0]))
Basically, we load the wordlist, remove the duplicates:
for word in wordlist:
if [char in word for char in alphabet].count(True) != 5:
wordlist.remove(word)
then loop over the entire wordlist len of wordlist amount of times.
we reset/initialize the variables. and loop over every word in the wordlist.
we remove each character in the current word from the alphabet (aka left over letters)
and add that word to the current words.
we then output that to the out.txt file.
after we finish the nested loop. we remove the word we just got (since we're done with it) and continue.
this method will output combinations of any length (ie. 1,2,3,4,5) and is incredibly inefficient and is still in the making.
Please comment if you have any ideas for optimizing this!
I have written a function and I can't seem to find where the bug is:
The function change works like this:
An input of 15 (target value) with possible values of [1, 5, 10, 25, 100] should return [5, 10]. That's because to reach a target value of 15, the least amount of numbers to make up that target number is to have a 10 and 5
I use a caching mechanism, as it is a recursive function and remembers the values that have already been calculated.
func Change(coins []int, target int, resultsCache map[int][]int) ([]int, error) {
if val, ok := resultsCache[target]; ok {
return val, nil
}
if target == 0 {
return make([]int, 0), nil
}
if target < 0 {
return nil, errors.New("Target can't be less than zero")
}
var leastNumOfCoinChangeCombinations []int
for _, coin := range coins {
remainder := target - coin
remainderCombination, _ := Change(coins, remainder, resultsCache)
if remainderCombination != nil {
combination := append(remainderCombination, coin)
if leastNumOfCoinChangeCombinations == nil || len(combination) < len(leastNumOfCoinChangeCombinations) {
leastNumOfCoinChangeCombinations = combination
}
}
}
if leastNumOfCoinChangeCombinations == nil {
return nil, errors.New("Can't find changes from coin combinations")
}
sort.Ints(leastNumOfCoinChangeCombinations)
resultsCache[target] = leastNumOfCoinChangeCombinations
return leastNumOfCoinChangeCombinations, nil
}
The cache, however, shows some abnormal behaviour: for example, if I want to use the value for 12 in the cache later, instead of getting [2,5,5] I get [1 2 5]. I'm not sure where I went wrong (initially it was calculated and stored correctly; I don't know how it got changed).
Here is a playground I used for troubleshooting:
https://play.golang.org/p/Rt8Sh_Ul-ge
You are encountering a fairly common, but sometimes difficult to spot, issue caused by the way slices work. Before reading further it's probably worth scanning the blog post Go Slices: usage and internals. The issue stems from the way append can reuse the slice's underlying array, as per this quote from the spec:
If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.
The below code provides a simple demonstration of what is occurring:
package main
import (
"fmt"
"sort"
)
func main() {
x := []int{2, 3}
x2 := append(x, 4)
x3 := append(x2, 1)
fmt.Println("x2 before sort", x2)
sort.Ints(x3)
fmt.Println("x2 after sort", x2)
fmt.Println("x3", x3)
fmt.Println("x2 cap", cap(x2))
}
The results are (playground):
x2 before sort [2 3 4]
x2 after sort [1 2 3]
x3 [1 2 3 4]
x2 cap 4
The result is probably not what you expected - why did x2 change when we sorted x3? The reason this happens is that the backing array for x2 has a capacity of 4 (length is 3) and when we append 1 the new slice x3 uses the same backing array (capacity 4, length 4). This only becomes an issue when we make a change to the portion of the backing array used by x2 and this happens when we call sort on x3.
So in your code you are adding a slice to the map, but its backing array is then being altered after that instance of Change returns (the append/sort ends up happening pretty much as in the example above).
There are a few ways you can fix this; removing the sort will do the trick but is probably not what you want. A better alternative is to take a copy of the slice; you can do this by replacing combination := append(remainderCombination, coin) with:
combination := make([]int, len(remainderCombination)+1)
copy(combination , remainderCombination)
combination[len(remainderCombination)] = coin
or the simpler (but perhaps not as easy to grasp - playground):
combination := append([]int{coin}, remainderCombination...)
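To see the effect of the copy, here is a sketch of the earlier demonstration with the second append replaced by a copying helper (appendCopy is a name invented for this example):

```go
package main

import (
	"fmt"
	"sort"
)

// appendCopy returns a freshly allocated slice holding s followed by v.
// The result never shares a backing array with s.
func appendCopy(s []int, v int) []int {
	out := make([]int, len(s)+1)
	copy(out, s)
	out[len(s)] = v
	return out
}

func main() {
	x := []int{2, 3}
	x2 := append(x, 4)
	x3 := appendCopy(x2, 1) // own backing array, unlike append(x2, 1)
	sort.Ints(x3)
	fmt.Println("x2", x2) // x2 [2 3 4]
	fmt.Println("x3", x3) // x3 [1 2 3 4]
}
```

Sorting x3 no longer touches x2, which is what the cached slices in Change need.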
Maybe it's not just a Go problem, but I have this problem:
I want to multiply two (or more) arrays, so for example:
a := [3]int{2, 3, 5}
b := [2]bool{true, false}
// desired output of "c" =>
// [[2 true] [2 false] [3 true] [3 false] [5 true] [5 false]]
I already found this library here: https://godoc.org/github.com/gonum/matrix/mat64 but i'm not seeing how to use something else than float64
The fallback-solution would be to use multiple for-range-loops but it'd be amazing if there is a "smoother" way to do this
Short answer: Go is not intended for this kind of problem. What you want is a Cartesian product, which some languages provide out of the box (e.g. itertools.product in Python, or a list comprehension in Haskell).
However, in Golang you'll have one big problem: you can't have dynamic types. That is: an array can contain only one type (int OR bool), not several. The workaround is to make an array of interface, but that means you'd have to make ugly type assertions to get the proper type back.
Also, you do have a general way to do that, but the type you'll get at the end will be [][]interface{} and no way of knowing what's inside.
For your example: here is the simplest way to do what you want (not general):
func main() {
a := [3]int{2, 3, 5}
b := [2]bool{true, false}
var c [6][2]interface{}
i := 0
for _, val1 := range a {
for _, val2 := range b {
c[i] = [2]interface{}{val1, val2}
i += 1
}
}
var a1 int = c[0][0].(int)
var b1 bool = c[0][1].(bool)
fmt.Printf("c[0] is %v, a1 is %d and b1 is %v\n", c[0], a1, b1)
fmt.Println(c)
}
As you can see, that's ugly and useless in practice (and very error-prone)
So, if you want to do this kind of transformation, you should use another language; Go was not (and will not be) designed for this purpose.
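For the two-slice case, type parameters (available since Go 1.18) offer a way around the interface{} assertions. The following is only a sketch; Pair and Product are names invented for this example, not part of any library:

```go
package main

import "fmt"

// Pair holds one element from each input slice with its static type intact.
type Pair[A, B any] struct {
	First  A
	Second B
}

// Product returns every combination of an element of as with an element of bs,
// iterating over bs fastest.
func Product[A, B any](as []A, bs []B) []Pair[A, B] {
	out := make([]Pair[A, B], 0, len(as)*len(bs))
	for _, a := range as {
		for _, b := range bs {
			out = append(out, Pair[A, B]{First: a, Second: b})
		}
	}
	return out
}

func main() {
	a := []int{2, 3, 5}
	b := []bool{true, false}
	c := Product(a, b)
	fmt.Println(c) // [{2 true} {2 false} {3 true} {3 false} {5 true} {5 false}]

	// No type assertion needed: c[0].First is an int, c[0].Second a bool.
	var a1 int = c[0].First
	var b1 bool = c[0].Second
	fmt.Println(a1, b1) // 2 true
}
```

This does not generalise to an arbitrary number of inputs with different types; each arity would need its own tuple type.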
This isn't a matrix multiplication, as pointed out above. The two for loops work if there are only two things, but if there are multiple ones it can clearly get tedious.
The way I would do it is to think of a multidimensional array. The total "number" of elements is the product of the sizes, and then use a function like SubFor https://godoc.org/github.com/btracey/meshgrid#SubFor
dims := []int{3,2}
sz := 1
for _,v := range dims {
sz *= v
}
sub := make([]int, len(dims))
for i := 0; i < sz; i++ {
meshgrid.SubFor(sub, i, dims)
fmt.Println(a[sub[0]], b[sub[1]])
}
There are some things with types to figure out (appending to a slice, etc.), but that should give you the general gist.
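If pulling in a dependency is not wanted, the index conversion can be sketched by hand. The unravel function below is a hypothetical stand-in written for this example, and it assumes row-major order (last dimension varies fastest); whether that matches meshgrid.SubFor exactly is not checked here:

```go
package main

import "fmt"

// unravel writes into sub the per-dimension indices that correspond to the
// flat index idx, treating dims as the shape of a row-major array.
func unravel(sub []int, idx int, dims []int) {
	for i := len(dims) - 1; i >= 0; i-- {
		sub[i] = idx % dims[i]
		idx /= dims[i]
	}
}

func main() {
	a := [3]int{2, 3, 5}
	b := [2]bool{true, false}

	dims := []int{len(a), len(b)}
	sz := 1
	for _, v := range dims {
		sz *= v
	}

	sub := make([]int, len(dims))
	for i := 0; i < sz; i++ {
		unravel(sub, i, dims)
		fmt.Println(a[sub[0]], b[sub[1]])
	}
}
```

Adding a third input only means appending its length to dims and indexing it with sub[2]; the loop itself stays the same.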
I would like to take random samples from very large lists while maintaining the order. I wrote the script below, but it requires .map(idx => ls(idx)) which is very wasteful. I can see a way of making this more efficient with a helper function and tail recursion, but I feel that there must be a simpler solution that I'm missing.
Is there a clean and more efficient way of doing this?
import scala.util.Random
def sampledList[T](ls: List[T], sampleSize: Int) = {
Random
.shuffle(ls.indices.toList)
.take(sampleSize)
.sorted
.map(idx => ls(idx))
}
val sampleList = List("t","h","e"," ","q","u","i","c","k"," ","b","r","o","w","n")
// imagine the list is much longer though
sampledList(sampleList, 5) // List(e, u, i, r, n)
EDIT:
It appears I was unclear: I am referring to maintaining the order of the values, not the original List collection.
If by
maintaining the order of the values
you mean keeping the elements in the sample in the same order as in the ls list, then with a small modification to your original solution the performance can be greatly improved:
import scala.util.Random
def sampledList[T](ls: List[T], sampleSize: Int) = {
Random.shuffle(ls.zipWithIndex).take(sampleSize).sortBy(_._2).map(_._1)
}
This solution has a complexity of O(n + k*log(k)), where n is the list's size, and k is the sample size, while your solution is O(n + k * log(k) + n*k).
Here is a (more complex) alternative that has O(n) complexity. You can't get any better in terms of complexity (though you could get better performance by using another collection, in particular a collection that has a constant time size implementation). I did a quick benchmark which indicated that the speedup is very substantial.
import scala.util.Random
import scala.annotation.tailrec
def sampledList[T](ls: List[T], sampleSize: Int) = {
@tailrec
def rec(list: List[T], listSize: Int, sample: List[T], sampleSize: Int): List[T] = {
require(listSize >= sampleSize,
s"listSize must be >= sampleSize, but got listSize=$listSize and sampleSize=$sampleSize"
)
list match {
case hd :: tl =>
if (Random.nextInt(listSize) < sampleSize)
rec(tl, listSize-1, hd :: sample, sampleSize-1)
else rec(tl, listSize-1, sample, sampleSize)
case Nil =>
require(sampleSize == 0, // Should never happen
s"sampleSize must be zero at the end of processing, but got $sampleSize"
)
sample
}
}
rec(ls, ls.size, Nil, sampleSize).reverse
}
The above implementation simply iterates over the list and keeps (or not) the current element according to a probability which is designed to give the same chance to each element. My logic may have a flaw, but at first blush it seems sound to me.
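The same selection rule can be sketched in Go for comparison. This is an illustration, not a port of the author's code: sampleOrdered is an invented name, and the random source is passed in as a function (rand.Intn fits) so that the behaviour can be checked with a fixed function. Each element is kept with probability needed/remaining:

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleOrdered picks k elements from items and keeps their original order.
// intn(n) must return an int in [0, n), as rand.Intn does.
func sampleOrdered[T any](items []T, k int, intn func(int) int) []T {
	if k < 0 || k > len(items) {
		panic("sampleOrdered: k must be between 0 and len(items)")
	}
	out := make([]T, 0, k)
	remaining := len(items) // elements not yet visited
	needed := k             // elements still to pick
	for _, it := range items {
		// Keep the current element with probability needed/remaining.
		if intn(remaining) < needed {
			out = append(out, it)
			needed--
		}
		remaining--
	}
	return out
}

func main() {
	letters := []string{"t", "h", "e", "q", "u", "i", "c", "k"}
	fmt.Println(sampleOrdered(letters, 3, rand.Intn))
}
```

Once remaining equals needed, the test intn(remaining) < needed always succeeds, so the sample is guaranteed to reach exactly k elements.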
Here's another O(n) implementation that should have a uniform probability for each element:
implicit class SampleSeqOps[T](s: Seq[T]) {
def sample(n: Int, r: Random = Random): Seq[T] = {
assert(n >= 0 && n <= s.length)
val res = ListBuffer[T]()
val length = s.length
var samplesNeeded = n
for { (e, i) <- s.zipWithIndex } {
val p = samplesNeeded.toDouble / (length - i)
if (p >= r.nextDouble()) {
res += e
samplesNeeded -= 1
}
}
res.toSeq
}
}
I'm using it frequently with collections > 100'000 elements and the performance seems reasonable.
It's probably the same idea as in Régis Jean-Gilles's answer but I think the imperative solution is slightly more readable in this case.
Perhaps I don't quite understand, but since Lists are immutable you don't really need to worry about 'maintaining the order' since the original List is never touched. Wouldn't the following suffice?
def sampledList[T](ls: List[T], sampleSize: Int) =
Random.shuffle(ls).take(sampleSize)
While my previous answer has linear complexity, it does have the drawback of requiring two passes, the first one corresponding to the need to compute the length before doing anything else. Besides affecting the running time, we might want to sample a very large collection for which it is not practical nor efficient to load the whole collection in memory at once, in which case we'd like to be able to work with a simple iterator.
As it happens, we don't need to invent anything to fix this. There is a simple and clever algorithm called reservoir sampling which does exactly this (building a sample as we iterate over a collection, all in one pass). With a minor modification we can also preserve the order, as required:
import scala.util.Random
def sampledList[T](ls: TraversableOnce[T], sampleSize: Int, preserveOrder: Boolean = false, rng: Random = new Random): Iterable[T] = {
val result = collection.mutable.Buffer.empty[(T, Int)]
for ((item, n) <- ls.toIterator.zipWithIndex) {
if (n < sampleSize) result += (item -> n)
else {
val s = rng.nextInt(n + 1) // n is zero-based, so n + 1 items have been seen
if (s < sampleSize) {
result(s) = (item -> n)
}
}
}
if (preserveOrder) {
result.sortBy(_._2).map(_._1)
}
else result.map(_._1)
}
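For readers more at home in Go, here is a sketch of the same order-preserving reservoir idea. The names entry and reservoirOrdered are invented for this example; a slice stands in for the stream, but the function only walks it once and never asks for its length. The random source is a parameter (rand.Intn fits) so the logic can be exercised deterministically:

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// entry remembers an item together with its position in the input.
type entry[T any] struct {
	item T
	pos  int
}

// reservoirOrdered keeps a uniform sample of up to k items in one pass and
// returns it in input order. intn(n) must return an int in [0, n).
func reservoirOrdered[T any](items []T, k int, intn func(int) int) []T {
	res := make([]entry[T], 0, k)
	for n, it := range items {
		if n < k {
			res = append(res, entry[T]{it, n}) // fill the reservoir first
			continue
		}
		// n+1 items have been seen: draw a slot in [0, n] and keep the
		// newcomer only if that slot lies inside the reservoir.
		if s := intn(n + 1); s < k {
			res[s] = entry[T]{it, n}
		}
	}
	// Restore input order using the remembered positions.
	sort.Slice(res, func(i, j int) bool { return res[i].pos < res[j].pos })
	out := make([]T, len(res))
	for i, e := range res {
		out[i] = e.item
	}
	return out
}

func main() {
	letters := []string{"t", "h", "e", "q", "u", "i", "c", "k"}
	fmt.Println(reservoirOrdered(letters, 3, rand.Intn))
}
```

The final sort costs O(k log k) on top of the single O(n) pass; if order does not matter it can be dropped.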
How can I check if two slices are equal, given that the operators == and != are not an option?
package main
import "fmt"
func main() {
s1 := []int{1, 2}
s2 := []int{1, 2}
fmt.Println(s1 == s2)
}
This does not compile with:
invalid operation: s1 == s2 (slice can only be compared to nil)
You should use reflect.DeepEqual()
DeepEqual is a recursive relaxation of Go's == operator.
DeepEqual reports whether x and y are “deeply equal,” defined as
follows. Two values of identical type are deeply equal if one of the
following cases applies. Values of distinct types are never deeply
equal.
Array values are deeply equal when their corresponding elements are
deeply equal.
Struct values are deeply equal if their corresponding fields, both
exported and unexported, are deeply equal.
Func values are deeply equal if both are nil; otherwise they are not
deeply equal.
Interface values are deeply equal if they hold deeply equal concrete
values.
Map values are deeply equal if they are the same map object or if they
have the same length and their corresponding keys (matched using Go
equality) map to deeply equal values.
Pointer values are deeply equal if they are equal using Go's ==
operator or if they point to deeply equal values.
Slice values are deeply equal when all of the following are true: they
are both nil or both non-nil, they have the same length, and either
they point to the same initial entry of the same underlying array
(that is, &x[0] == &y[0]) or their corresponding elements (up to
length) are deeply equal. Note that a non-nil empty slice and a nil
slice (for example, []byte{} and []byte(nil)) are not deeply equal.
Other values - numbers, bools, strings, and channels - are deeply
equal if they are equal using Go's == operator.
You need to loop over each of the elements in the slice and test. Equality for slices is not defined. However, there is a bytes.Equal function if you are comparing values of type []byte.
func testEq(a, b []Type) bool {
if len(a) != len(b) {
return false
}
for i := range a {
if a[i] != b[i] {
return false
}
}
return true
}
This is just an example using reflect.DeepEqual(), as given in @VictorDeryagin's answer.
package main
import (
"fmt"
"reflect"
)
func main() {
a := []int {4,5,6}
b := []int {4,5,6}
c := []int {4,5,6,7}
fmt.Println(reflect.DeepEqual(a, b))
fmt.Println(reflect.DeepEqual(a, c))
}
Result:
true
false
Try it in Go Playground
If you have two []byte, compare them using bytes.Equal. The Golang documentation says:
Equal returns a boolean reporting whether a and b are the same length and contain the same bytes. A nil argument is equivalent to an empty slice.
Usage:
package main
import (
"fmt"
"bytes"
)
func main() {
a := []byte {1,2,3}
b := []byte {1,2,3}
c := []byte {1,2,2}
fmt.Println(bytes.Equal(a, b))
fmt.Println(bytes.Equal(a, c))
}
This will print
true
false
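One practical difference between bytes.Equal and reflect.DeepEqual follows directly from the two documentation excerpts quoted in this thread: they disagree on nil versus empty slices. A small sketch (the wrapper names are made up for this example):

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
)

// deepEqualBytes compares with reflect.DeepEqual, which distinguishes a nil
// slice from a non-nil empty one.
func deepEqualBytes(a, b []byte) bool { return reflect.DeepEqual(a, b) }

// bytesEqual compares with bytes.Equal, which treats nil as an empty slice.
func bytesEqual(a, b []byte) bool { return bytes.Equal(a, b) }

func main() {
	var nilSlice []byte
	empty := []byte{}

	fmt.Println(deepEqualBytes(nilSlice, empty)) // false
	fmt.Println(bytesEqual(nilSlice, empty))     // true
}
```

Which behaviour is right depends on whether nil and empty mean different things in your data.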
And for now, here is https://github.com/google/go-cmp which
is intended to be a more powerful and safer alternative to reflect.DeepEqual for comparing whether two values are semantically equal.
package main
import (
"fmt"
"github.com/google/go-cmp/cmp"
)
func main() {
a := []byte{1, 2, 3}
b := []byte{1, 2, 3}
fmt.Println(cmp.Equal(a, b)) // true
}
You cannot use == or != with slices but if you can use them with the elements then Go 1.18 has a new function to easily compare two slices, slices.Equal:
Equal reports whether two slices are equal: the same length and all elements equal. If the lengths are different, Equal returns false. Otherwise, the elements are compared in increasing index order, and the comparison stops at the first unequal pair. Floating point NaNs are not considered equal.
The slices package import path is golang.org/x/exp/slices. Code inside the exp module is experimental and not covered by the Go compatibility promise. The package was later added to the standard library as slices in Go 1.21.
Nevertheless you can use it as soon as Go 1.18 (playground)
sliceA := []int{1, 2}
sliceB := []int{1, 2}
equal := slices.Equal(sliceA, sliceB)
fmt.Println(equal) // true
type data struct {
num float64
label string
}
sliceC := []data{{10.99, "toy"}, {500.49, "phone"}}
sliceD := []data{{10.99, "toy"}, {200.0, "phone"}}
equal = slices.Equal(sliceC, sliceD)
fmt.Println(equal) // false (500.49 != 200.0 in the second element)
If the elements of the slice don't allow == and !=, you can use slices.EqualFunc and define whatever comparator function makes sense for the element type.
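As a sketch of that last point, a slice of slices has elements that cannot be compared with ==, so slices.EqualFunc needs a comparator. The example assumes Go 1.21 or later, where slices is in the standard library (with the experimental package only the import path changes); equalNested is a name invented here:

```go
package main

import (
	"fmt"
	"slices"
)

// equalNested reports whether a and b contain equal inner slices in the same
// order. []int is not comparable, so each pair is compared with slices.Equal.
func equalNested(a, b [][]int) bool {
	return slices.EqualFunc(a, b, func(x, y []int) bool {
		return slices.Equal(x, y)
	})
}

func main() {
	a := [][]int{{1, 2}, {3}}
	b := [][]int{{1, 2}, {3}}
	c := [][]int{{1, 2}, {4}}

	fmt.Println(equalNested(a, b)) // true
	fmt.Println(equalNested(a, c)) // false
}
```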
In case that you are interested in writing a test, then github.com/stretchr/testify/assert is your friend.
Import the library at the very beginning of the file:
import (
"github.com/stretchr/testify/assert"
)
Then inside the test you do:
func TestEquality_SomeSlice(t *testing.T) {
a := []int{1, 2}
b := []int{2, 1}
assert.Equal(t, a, b)
}
The error prompted will be:
Diff:
--- Expected
+++ Actual
@@ -1,4 +1,4 @@
([]int) (len=2) {
+ (int) 1,
(int) 2,
- (int) 2,
(int) 1,
Test: TestEquality_SomeSlice
Thought of a neat trick and figured I'd share.
If what you are interested in knowing is whether two slices are identical (i.e. they alias the same region of data) instead of merely equal (the value at each index of one slice equals the value in the same index of the other) then you can efficiently compare them in the following way:
foo := []int{1,3,5,7,9,11,13,15,17,19}
// these two slices are exactly identical
subslice1 := foo[3:][:4]
subslice2 := foo[:7][3:]
slicesEqual := &subslice1[0] == &subslice2[0] &&
len(subslice1) == len(subslice2)
There are some caveats to this sort of comparison, in particular that you cannot compare empty slices in this way, and that the capacity of the slices isn't compared, so this "identicality" property is only really useful when reading from a slice or reslicing a strictly narrower subslice, as any attempt to grow the slice will be affected by the slices' capacity. Still, it's very useful to be able to efficiently declare, "these two huge blocks of memory are in fact the same block, yes or no."
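Wrapped into a function, the check might look like the sketch below. sameRegion is an invented name, and returning false for empty slices is a choice made for this example, since an empty slice has no first element whose address could be taken:

```go
package main

import "fmt"

// sameRegion reports whether a and b start at the same element and have the
// same length, i.e. whether they alias the same block of memory.
// Assumption: empty slices are reported as not identical.
func sameRegion[T any](a, b []T) bool {
	if len(a) == 0 || len(b) == 0 {
		return false
	}
	return len(a) == len(b) && &a[0] == &b[0]
}

func main() {
	foo := []int{1, 3, 5, 7, 9, 11, 13, 15, 17, 19}

	subslice1 := foo[3:][:4]
	subslice2 := foo[:7][3:]
	fmt.Println(sameRegion(subslice1, subslice2)) // true

	// A copy has equal contents but lives in different memory.
	clone := append([]int(nil), subslice1...)
	fmt.Println(sameRegion(subslice1, clone)) // false
}
```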
To have a complete set of answers: here is a solution with generics.
func IsEqual[A comparable](a, b []A) bool {
// Can't be equal if length differs
if len(a) != len(b) {
return false
}
// Empty arrays trivially equal
if len(a) == 0 {
return true
}
// Two pointers going towards each other at every iteration
left := 0
right := len(a) - 1
for left <= right { // <= so the middle element of an odd-length slice is checked too
if a[left] != b[left] || a[right] != b[right] {
return false
}
left++
right--
}
return true
}
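The code uses a "two pointers" strategy, which cuts the number of loop iterations to about n / 2. Each iteration performs two element comparisons, though, so the total number of comparisons is the same as in a plain one-by-one check, and the complexity is still O(n).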