How to group by then merge slices with duplicated values in Go - go

Excuse me, this is my first Stackoverflow question, so, any tips/advice on what I can do to improve it would be wonderful, in addition to some help.
The problem:
I have a slice that I am trying to group into smaller slices by certain criteria. I then need to merge the newly created slices with each other if they contain any of the same values in the slice. (Essentially, appending slices together that have "overlapping" values).
Some additional notes about the problem:
The number of items in the original slice will likely be between 1-50, in most cases, with outliers rarely exceeding 100.
Once grouped, the size of the 'inside' slices will be between 1-10 values.
Performance is a factor, as this operation will be run as part of a webservice where a single request will perform this operation 20+ times, and there can be many (hundreds - thousands) of requests per minute at peak times. However, clarity of code is also important.
My implementation is using ints, the final implementation would have more complex structs, though I was considering making a map and then use the implementation shown below based upon the keys. Is this a good idea?
I have broken the problem down into a few steps:
Create a 2D slice with groupings of values, based upon criteria (the initial grouping phase)
Attempt to merge slices in place if they include a duplicated value.
I am running into two problems:
First, I think my implementation might not scale super well, as it tends to have some nested loops (however, these loops will be iterating on small slices, so that might be ok)
Second, my implementation requires an extra step at the end to remove duplicated values; ideally that step would not be needed.
Input: [ 100, 150, 300, 350, 600, 700 ]
Expected Output: [[100 150 300 350] [600 700]]
This is with the 'selection criteria' of grouping values that are within 150 units of at least one other value in the slice.
And the code (Go Playground link) :
package main
import (
"fmt"
"sort"
)
func filter(vs []int, f func(int) bool) []int {
vsf := make([]int, 0)
for _, v := range vs {
if f(v) {
vsf = append(vsf, v)
}
}
return vsf
}
func unique(intSlice []int) []int {
keys := make(map[int]bool)
list := []int{}
for _, entry := range intSlice {
if _, value := keys[entry]; !value {
keys[entry] = true
list = append(list, entry)
}
}
return list
}
func contains(intSlice []int, searchInt int) bool {
for _, value := range intSlice {
if value == searchInt {
return true
}
}
return false
}
func compare(a, b []int) bool {
if len(a) != len(b) {
return false
}
if (a == nil) != (b == nil) {
return false
}
b = b[:len(a)]
for i, v := range a {
if v != b[i] {
return false
}
}
return true
}
func main() {
fmt.Println("phase 1 - initial grouping")
s := []int{100, 150, 300, 350, 600, 700}
g := make([][]int, 0)
// phase 1
for _, v := range s {
t := filter(s, func(i int) bool { return i - v >= -150 && i - v <= 150 })
for _, v1 := range t {
t1 := filter(s, func(i int) bool { return i - v1 >= -150 && i - v1 <= 150})
t = unique(append(t, t1...))
sort.Ints(t)
}
g = append(g, t)
fmt.Println(g)
}
// phase 2
fmt.Println("phase 2 - merge in place")
for i, tf := range g {
for _, death := range tf {
if i < len(g) - 1 && contains(g[i+1], death) {
g[i+1] = unique(append(g[i], g[i+1]...))
g = g[i+1:]
} else if i == len(g) - 1 {
fmt.Println(g[i], g[i-1])
// do some cleanup to make sure the last two items of the array don't include duplicates
if compare(g[i-1], g[i]) {
g = g[:i]
}
}
}
fmt.Println(i, g)
}
}

It's not clear exactly what you are asking, and the problem isn't fully defined.
So here's a version that is more efficient for the example you gave.
It assumes sorted input: if the input is not sorted and the output order matters, then this is a bad solution.
Here it is (on Play)
package main
import (
"fmt"
)
// Input: [ 100, 150, 300, 350, 600, 700 ] Expected Output: [[100 150 300 350] [600 700]]
func main() {
input := []int{100, 150, 300, 350, 600, 700}
fmt.Println("Input:", input)
fmt.Println("Output:", groupWithin150(input))
}
func groupWithin150(ints []int) [][]int {
var ret [][]int
// Your example input was sorted, if the inputs aren't actually sorted, then uncomment this
// sort.Ints(ints)
var group []int
for idx, i := range ints {
if idx > 0 && i-150 > group[len(group)-1] {
ret = append(ret, group)
group = make([]int, 0)
}
group = append(group, i)
}
if len(group) > 0 {
ret = append(ret, group)
}
return ret
}
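For the struct case mentioned in the question, the same single pass should work once the items are sorted by their numeric key. Here is a minimal sketch, assuming a hypothetical Item type with a Key field and a maxGap parameter (neither is from the original post). It sorts a copy, so the caller's slice is left untouched:

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a hypothetical stand-in for the "more complex structs" in the question.
type Item struct {
	Name string
	Key  int
}

// groupByGap sorts a copy of items by Key and starts a new group whenever
// the gap to the previous key exceeds maxGap.
func groupByGap(items []Item, maxGap int) [][]Item {
	sorted := make([]Item, len(items))
	copy(sorted, items)
	sort.Slice(sorted, func(a, b int) bool { return sorted[a].Key < sorted[b].Key })

	var groups [][]Item
	for i, it := range sorted {
		if i == 0 || it.Key-sorted[i-1].Key > maxGap {
			groups = append(groups, nil) // open a new group
		}
		last := len(groups) - 1
		groups[last] = append(groups[last], it)
	}
	return groups
}

func main() {
	in := []Item{{"f", 700}, {"a", 100}, {"c", 300}, {"b", 150}, {"e", 600}, {"d", 350}}
	for _, g := range groupByGap(in, 150) {
		fmt.Println(g)
	}
}
```

With the example keys this yields two groups, 100 to 350 and 600 to 700. Because adjacent sorted items are chained, no separate merge or de-duplication phase is needed.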

Related

Modifying receiver with a method on value?

package matrix
import (
"errors"
"strconv"
"strings"
)
// Matrix is the matrix interface
type Matrix interface {
Rows() [][]int
Cols() [][]int
Set(r, c, val int) bool
}
// matrix implements the interface Matrix
type matrix struct {
data [][]int
rows int
cols int
}
// New returns a valid matrix created from the input
func New(input string) (Matrix, error) {
var m matrix
rows := strings.Split(input, "\n")
for r, row := range rows {
rowElements := strings.Fields(row)
switch {
case r == 0:
m.rows, m.cols = len(rows), len(rowElements)
matrix, err := allocateMemory(m.rows, m.cols)
if err != nil {
return invalidMatrix()
}
m.data = matrix
case len(rowElements) != m.cols:
return invalidMatrix()
}
for c, element := range rowElements {
element, err := strconv.Atoi(element)
if err != nil {
return invalidMatrix()
}
m.data[r][c] = element
}
}
return m, nil
}
// invalidMatrix returns the error indicating the
// provided matrix is invalid
func invalidMatrix() (Matrix, error) {
return nil, errors.New("invalid matrix")
}
// allocateMemory allocates a 2D slice of int having size RxC
func allocateMemory(R, C int) ([][]int, error) {
if R < 1 || C < 1 {
return nil, errors.New("invalid matrix")
}
matrix := make([][]int, R)
for r := range matrix {
matrix[r] = make([]int, C)
}
return matrix, nil
}
// Set sets the given value at (r,c) in the matrix,
// if (r,c) belongs to the matrix.
func (m matrix) Set(r, c, val int) bool {
switch {
case r < 0 || c < 0:
return false
case r >= m.rows || c >= m.cols:
return false
default:
m.data[r][c] = val
return true
}
}
// order defines the order the matrix to export
// two useful values are columnMajor and rowMajor
type order int
const (
columnMajor order = iota
rowMajor
)
// Cols returns columns of the matrix.
func (m matrix) Cols() [][]int {
return m.export(columnMajor)
}
// Rows returns rows of the matrix.
func (m matrix) Rows() [][]int {
return m.export(rowMajor)
}
// export return the matrix in the required order;
// either columnMajor or rowMajor.
func (m matrix) export(o order) [][]int {
var matrix [][]int
var err error
switch o {
case columnMajor:
matrix, err = allocateMemory(m.cols, m.rows)
if err != nil {
return nil
}
for r, row := range m.data {
for c, element := range row {
matrix[c][r] = element
}
}
case rowMajor:
matrix, err = allocateMemory(m.rows, m.cols)
if err != nil {
return nil
}
for r, row := range m.data {
copy(matrix[r], row)
}
}
return matrix
}
I am having a hard time understanding why the method Set() is able to modify the data of the struct. I had an understanding that methods defined on values cannot do that. I have tried to compare it with another problem where I cannot modify the content of receiver but in this case it just works. A test file for this code is available at test file. Any idea what I am missing?
The reason Set can modify the contents of the slice is that the slice is a reference value. Your other example (in the comment) attempts to assign the field holding the slice, and this won't work - because it's working on a copy. See this code sample:
package main
import (
"fmt"
)
type Holder struct {
s []int
v []int
}
func (h Holder) Set() {
// This will successfully modify the `s` slice's contents
h.s[0] = 99
// This will assign a new slice to a copy of the v field,
// so it won't affect the actual value on which this
// method is invoked.
h.v = []int{1, 2, 3}
}
func main() {
var h Holder
h.s = []int{10, 20, 30}
h.v = []int{40, 50, 60}
fmt.Println("before Set:", h)
h.Set()
fmt.Println("after Set:", h)
}
You can run it on the playground, and it prints:
before Set: {[10 20 30] [40 50 60]}
after Set: {[99 20 30] [40 50 60]}
What happens here is that Set gets a copy of h, and hence h.s is a copy too, but both copies point to the same underlying array, so the contents can be modified. Read this post for all the details.
A slice value contains (ptr, len, cap) where ptr is a pointer to the slice's underlying array. The Set method modifies the slice's underlying array by dereferencing the pointer. The slice value, stored in the field, is not modified.
The Go Language blog post on slices describes the slice memory layout in more detail.
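To make the contrast concrete, the sketch below uses a hypothetical Holder type (mirroring the one above) with three illustrative methods. Writing through the shared backing array is visible to the caller, while a change to the slice header itself (here via append) only sticks with a pointer receiver:

```go
package main

import "fmt"

type Holder struct {
	s []int
}

// setFirst has a value receiver, yet the write goes through the slice's
// pointer into the shared backing array, so the caller sees it.
func (h Holder) setFirst(v int) { h.s[0] = v }

// appendValue has a value receiver: the updated slice header is stored in
// the copy of h and is lost when the method returns.
func (h Holder) appendValue(v int) { h.s = append(h.s, v) }

// appendPointer has a pointer receiver, so the new slice header is stored
// in the caller's Holder.
func (h *Holder) appendPointer(v int) { h.s = append(h.s, v) }

func main() {
	h := Holder{s: []int{10, 20, 30}}
	h.setFirst(99)
	h.appendValue(40)
	fmt.Println(h.s) // [99 20 30]
	h.appendPointer(50)
	fmt.Println(h.s) // [99 20 30 50]
}
```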

Find the minimum value in golang?

In the language there is a minimum function https://golang.org/pkg/math/#Min, but what if I have more than 2 numbers? Must I write a manual comparison in a for loop, or is there another way? The numbers are in a slice.
No, there isn't any better way than looping. Not only is it cleaner than any other approach, it's also the fastest.
values := []int{4, 20, 0, -11, -10}
min := values[0]
for _, v := range values {
if v < min {
min = v
}
}
fmt.Println(min)
EDIT
Since there has been some discussion in the comments about error handling and how to handle empty slices, here is a basic function that determines the minimum value. Remember to import errors.
func Min(values []int) (min int, e error) {
if len(values) == 0 {
return 0, errors.New("Cannot detect a minimum value in an empty slice")
}
min = values[0]
for _, v := range values {
if v < min {
min = v
}
}
return min, nil
}
General answer is: "Yes, you must use a loop, if you do not know exact number of items to compare".
In this package Min functions are implemented like:
// For 2 values
func Min(value_0, value_1 int) int {
if value_0 < value_1 {
return value_0
}
return value_1
}
// For 1+ values
func Mins(value int, values ...int) int {
for _, v := range values {
if v < value {
value = v
}
}
return value
}
You should write a loop. It does not make sense to create dozens of function in standard library to find min/max/count/count_if/all_of/any_of/none_of etc. like in C++ (most of them in 4 flavours according arguments).
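Since these answers were written, Go has gained generics (1.18), and Go 1.21 added the built-in min and max functions as well as slices.Min, which panics on an empty slice. On older toolchains the loop can still be written once for several ordered types. A sketch, where the ordered constraint and the minOf name are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

// ordered lists the element types minOf accepts (an illustrative subset).
type ordered interface {
	~int | ~int64 | ~float64 | ~string
}

// minOf returns the smallest element of values, or an error if values is empty.
func minOf[T ordered](values []T) (T, error) {
	var zero T
	if len(values) == 0 {
		return zero, errors.New("empty slice has no minimum")
	}
	m := values[0]
	for _, v := range values[1:] {
		if v < m {
			m = v
		}
	}
	return m, nil
}

func main() {
	fmt.Println(minOf([]int{4, 20, 0, -11, -10})) // -11 <nil>
	fmt.Println(minOf([]string{"pear", "apple"})) // apple <nil>
}
```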

check for equality on slices without order

I am trying to find a solution to check for equality in 2 slices. Unfortunately, the answers I have found require values in the slice to be in the same order. For example, http://play.golang.org/p/yV0q1_u3xR evaluates equality to false.
I want a solution that lets []string{"a","b","c"} == []string{"b","a","c"} evaluate to true.
MORE EXAMPLES
[]string{"a","a","c"} == []string{"c","a","c"} >>> false
[]string{"z","z","x"} == []string{"x","z","z"} >>> true
Here is an alternate solution, though perhaps a bit verbose:
func sameStringSlice(x, y []string) bool {
if len(x) != len(y) {
return false
}
// create a map of string -> int
diff := make(map[string]int, len(x))
for _, _x := range x {
// 0 value for int is 0, so just increment a counter for the string
diff[_x]++
}
for _, _y := range y {
// If the string _y is not in diff bail out early
if _, ok := diff[_y]; !ok {
return false
}
diff[_y] -= 1
if diff[_y] == 0 {
delete(diff, _y)
}
}
return len(diff) == 0
}
Try it on the Go Playground
You can use cmp.Diff together with cmpopts.SortSlices:
less := func(a, b string) bool { return a < b }
equalIgnoreOrder := cmp.Diff(x, y, cmpopts.SortSlices(less)) == ""
Here is a full example that runs on the Go Playground:
package main
import (
"fmt"
"github.com/google/go-cmp/cmp"
"github.com/google/go-cmp/cmp/cmpopts"
)
func main() {
x := []string{"a", "b", "c"}
y := []string{"a", "c", "b"}
less := func(a, b string) bool { return a < b }
equalIgnoreOrder := cmp.Diff(x, y, cmpopts.SortSlices(less)) == ""
fmt.Println(equalIgnoreOrder) // prints "true"
}
The other answers have better time complexity, O(N) versus the O(N log(N)) of my answer, and my solution will also take up more memory if elements in the slices are repeated frequently, but I wanted to add it because I think this is the most straightforward way to do it:
package main
import (
"fmt"
"sort"
"reflect"
)
func array_sorted_equal(a, b []string) bool {
if len(a) != len(b) {
return false
}
a_copy := make([]string, len(a))
b_copy := make([]string, len(b))
copy(a_copy, a)
copy(b_copy, b)
sort.Strings(a_copy)
sort.Strings(b_copy)
return reflect.DeepEqual(a_copy, b_copy)
}
func main() {
a := []string {"a", "a", "c"}
b := []string {"c", "a", "c"}
c := []string {"z","z","x"}
d := []string {"x","z","z"}
fmt.Println( array_sorted_equal(a, b))
fmt.Println( array_sorted_equal(c, d))
}
Result:
false
true
I would think the easiest way would be to map the elements in each array/slice to their number of occurrences, then compare the maps:
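// equalCounts reports whether every element of x occurs the same number of times in y.
// (main cannot return a value, so the comparison lives in its own function.)
func equalCounts(x, y []string) bool {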
xMap := make(map[string]int)
yMap := make(map[string]int)
for _, xElem := range x {
xMap[xElem]++
}
for _, yElem := range y {
yMap[yElem]++
}
for xMapKey, xMapVal := range xMap {
if yMap[xMapKey] != xMapVal {
return false
}
}
return true
}
You'll need to add some additional due diligence, like short circuiting if your arrays/slices contain elements of different types or are of different length.
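Folding the length check into the counting approach, and using generics (Go 1.18+) so that it works for any comparable element type, could look like the following sketch. sameElements is a hypothetical name:

```go
package main

import "fmt"

// sameElements reports whether x and y hold the same elements with the same
// multiplicities, regardless of order.
func sameElements[T comparable](x, y []T) bool {
	if len(x) != len(y) {
		return false
	}
	counts := make(map[T]int, len(x))
	for _, v := range x {
		counts[v]++
	}
	for _, v := range y {
		if counts[v] == 0 {
			return false // v occurs more often in y than in x
		}
		counts[v]--
	}
	// Equal lengths and no count below zero means every count is now zero.
	return true
}

func main() {
	fmt.Println(sameElements([]string{"a", "b", "c"}, []string{"b", "a", "c"})) // true
	fmt.Println(sameElements([]string{"a", "a", "c"}, []string{"c", "a", "c"})) // false
	fmt.Println(sameElements([]string{"z", "z", "x"}, []string{"x", "z", "z"})) // true
}
```

A single map is enough here: because the lengths match, any surplus of one element in y must show up as a count hitting zero early.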
Generalizing the code of testify ElementsMatch, solution to compare any kind of objects (in the example []map[string]string):
https://play.golang.org/p/xUS2ngrUWUl
Like adrianlzt wrote in his answer, an implementation of assert.ElementsMatch from testify can be used to achieve that. But how about reusing the actual testify module instead of copying that code when all you need is a bool result of the comparison? The implementation in testify is intended for test code and usually takes a testing.T argument.
It turns out that ElementsMatch can be quite easily used outside of testing code. All it takes is a dummy implementation of an interface with an Errorf method:
type dummyt struct{}
func (t dummyt) Errorf(string, ...interface{}) {}
func elementsMatch(listA, listB interface{}) bool {
return assert.ElementsMatch(dummyt{}, listA, listB)
}
Or test it on The Go Playground, which I've adapted from the adrianlzt's example.
Since I haven't got enough reputation to comment, I have to post yet another answer with a bit better code readability:
func AssertSameStringSlice(x, y []string) bool {
if len(x) != len(y) {
return false
}
itemAppearsTimes := make(map[string]int, len(x))
for _, i := range x {
itemAppearsTimes[i]++
}
for _, i := range y {
if _, ok := itemAppearsTimes[i]; !ok {
return false
}
itemAppearsTimes[i]--
if itemAppearsTimes[i] == 0 {
delete(itemAppearsTimes, i)
}
}
if len(itemAppearsTimes) == 0 {
return true
}
return false
}
The logic is the same as in this answer
I know it's been answered, but I would still like to add my answer. Following the code in stretchr/testify, we can have something like
func Elementsmatch(listA, listB []string) (string, bool) {
aLen := len(listA)
bLen := len(listB)
if aLen != bLen {
return fmt.Sprintf("Len of the lists don't match , len listA %v, len listB %v", aLen, bLen), false
}
visited := make([]bool, bLen)
for i := 0; i < aLen; i++ {
found := false
element := listA[i]
for j := 0; j < bLen; j++ {
if visited[j] {
continue
}
if element == listB[j] {
visited[j] = true
found = true
break
}
}
if !found {
return fmt.Sprintf("element %s appears more times in %s than in %s", element, listA, listB), false
}
}
return "", true
}
Now let's talk about the performance of this solution compared to the map-based ones. It really depends on the size of the lists you are comparing: if the list is large (I would say greater than 20 elements), the map approach is better; otherwise this one is sufficient.
On the Go Playground it always shows 0s, but run this on a local system and you can see the difference in time taken as the size of the list increases.
So the solution I propose is to add the map-based comparison from the solution above:
func Elementsmatch(listA, listB []string) (string, bool) {
aLen := len(listA)
bLen := len(listB)
if aLen != bLen {
return fmt.Sprintf("Len of the lists don't match , len listA %v, len listB %v", aLen, bLen), false
}
if aLen > 20 {
return elementsMatchByMap(listA, listB)
} else {
return elementsMatchByLoop(listA, listB)
}
}
func elementsMatchByLoop(listA, listB []string) (string, bool) {
aLen := len(listA)
bLen := len(listB)
visited := make([]bool, bLen)
for i := 0; i < aLen; i++ {
found := false
element := listA[i]
for j := 0; j < bLen; j++ {
if visited[j] {
continue
}
if element == listB[j] {
visited[j] = true
found = true
break
}
}
if !found {
return fmt.Sprintf("element %s appears more times in %s than in %s", element, listA, listB), false
}
}
return "", true
}
func elementsMatchByMap(x, y []string) (string, bool) {
// create a map of string -> int
diff := make(map[string]int, len(x))
for _, _x := range x {
// 0 value for int is 0, so just increment a counter for the string
diff[_x]++
}
for _, _y := range y {
// If the string _y is not in diff bail out early
if _, ok := diff[_y]; !ok {
return fmt.Sprintf(" %v is not present in list b", _y), false
}
diff[_y] -= 1
if diff[_y] == 0 {
delete(diff, _y)
}
}
if len(diff) == 0 {
return "", true
}
return "", false
}

Golang: Find two number index where the sum of these two numbers equals to target number

The problem is: find the indices of two numbers such that nums[index1] + nums[index2] == target. Here is my attempt in golang (index starts from 1):
package main
import (
"fmt"
)
var nums = []int{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 25182, 25184, 25186, 25188, 25190, 25192, 25194, 25196} // The number list is too long, I put the whole numbers in a gist: https://gist.github.com/nickleeh/8eedb39e008da8b47864
var target int = 16021
func twoSum(nums []int, target int) (int, int) {
if len(nums) <= 1 {
return 0, 0
}
hdict := make(map[int]int)
for i := 1; i < len(nums); i++ {
if val, ok := hdict[nums[i+1]]; ok {
return val, i + 1
} else {
hdict[target-nums[i+1]] = i + 1
}
}
return 0, 0
}
func main() {
fmt.Println(twoSum(nums, target))
}
The nums list is too long, I put it into a gist:
https://gist.github.com/nickleeh/8eedb39e008da8b47864
This code works fine, but I find the return 0,0 part ugly, and it runs ten times slower than the Julia translation. I would like to know whether any part of it is badly written and hurts performance.
Edit:
Julia's translation:
function two_sum(nums, target)
if length(nums) <= 1
return false
end
hdict = Dict()
for i in 1:length(nums)
if haskey(hdict, nums[i])
return [hdict[nums[i]], i]
else
hdict[target - nums[i]] = i
end
end
end
In my opinion, if no two elements add up to target, it is best to return values that are invalid indices, e.g. -1. Returning 0, 0 would be enough, since a valid index pair can't consist of two equal indices, but -1 is more convenient: if you forget to check the return values and try to use the invalid indices, you immediately get a run-time panic, which reminds you to check them. Accordingly, my solutions drop the i + 1 shifts, as they serve no purpose.
Benchmarking of different solutions can be found at the end of the answer.
If sorting allowed:
If the slice is big and not changing, and you have to call this twoSum() function many times, the most efficient solution would be to sort the numbers simply using sort.Ints() in advance:
sort.Ints(nums)
And then you don't have to build a map, you can use binary search implemented in sort.SearchInts():
func twoSumSorted(nums []int, target int) (int, int) {
for i, v := range nums {
v2 := target - v
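// SearchInts returns len(nums) when v2 is larger than every element, so check the bound first;
// j != i avoids pairing an element with itself.
if j := sort.SearchInts(nums, v2); j < len(nums) && j != i && nums[j] == v2 {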
return i, j
}
}
return -1, -1
}
Note: after sorting, the indices returned will be indices of values in the sorted slice. These may differ from the indices in the original (unsorted) slice, which may or may not be a problem. If you do need indices from the original, unsorted slice, you can store a mapping between sorted and unsorted indices so you can look up the original index. For details see this question:
Get the indices of the array after sorting in golang
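One way to keep the original indices, sketched here with hypothetical names, is to sort a slice of indices instead of the values and then walk it from both ends. Note that this is a two-pointer scan, not the binary search used above:

```go
package main

import (
	"fmt"
	"sort"
)

// twoSumOriginal returns indices into the unsorted nums whose values add up
// to target, or -1, -1 if there is no such pair. nums itself is not modified.
func twoSumOriginal(nums []int, target int) (int, int) {
	// idx[k] is the original position of the k-th smallest value.
	idx := make([]int, len(nums))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool { return nums[idx[a]] < nums[idx[b]] })

	lo, hi := 0, len(idx)-1
	for lo < hi {
		sum := nums[idx[lo]] + nums[idx[hi]]
		switch {
		case sum == target:
			return idx[lo], idx[hi]
		case sum < target:
			lo++ // need a larger sum
		default:
			hi-- // need a smaller sum
		}
	}
	return -1, -1
}

func main() {
	fmt.Println(twoSumOriginal([]int{30, 4, 18, 0, 12}, 22)) // 1 2
	fmt.Println(twoSumOriginal([]int{1, 2, 3}, 100))         // -1 -1
}
```

The sort still costs O(n log n) per call, so this only pays off over the map version if the index slice can be built once and reused.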
If sorting is not allowed:
Here is your solution without the i + 1 shifts, which serve no purpose: slice and array indices are zero-based in Go (unlike Julia, where they start at 1). It also uses for ... range:
func twoSum(nums []int, target int) (int, int) {
if len(nums) <= 1 {
return -1, -1
}
m := make(map[int]int)
for i, v := range nums {
if j, ok := m[v]; ok {
return j, i
}
m[target-v] = i
}
return -1, -1
}
If the nums slice is big and the solution is not found fast (meaning the i index grows big) that means a lot of elements will be added to the map. Maps start with small capacity, and they are internally grown if additional space is required to host many elements (key-value pairs). An internal growing requires rehashing and rebuilding with the already added elements. This is "very" expensive.
It does not seem significant but it really is. Since you know the max elements that will end up in the map (worst case is len(nums)), you can create a map with a big-enough capacity to hold all elements for the worst case. The gain will be that no internal growing and rehashing will be required. You can provide the initial capacity as the second argument to make() when creating the map. This speeds up twoSum2() big time if nums is big:
func twoSum2(nums []int, target int) (int, int) {
if len(nums) <= 1 {
return -1, -1
}
m := make(map[int]int, len(nums))
for i, v := range nums {
if j, ok := m[v]; ok {
return j, i
}
m[target-v] = i
}
return -1, -1
}
Benchmarking
Here's a little benchmarking code to test execution speed of the 3 solutions with the input nums and target you provided. Note that in order to test twoSumSorted(), you first have to sort the nums slice.
Save this into a file named xx_test.go and run it with go test -bench .:
package main
import (
"sort"
"testing"
)
func BenchmarkTwoSum(b *testing.B) {
for i := 0; i < b.N; i++ {
twoSum(nums, target)
}
}
func BenchmarkTwoSum2(b *testing.B) {
for i := 0; i < b.N; i++ {
twoSum2(nums, target)
}
}
func BenchmarkTwoSumSorted(b *testing.B) {
sort.Ints(nums)
b.ResetTimer()
for i := 0; i < b.N; i++ {
twoSumSorted(nums, target)
}
}
Output:
BenchmarkTwoSum-4 1000 1405542 ns/op
BenchmarkTwoSum2-4 2000 722661 ns/op
BenchmarkTwoSumSorted-4 10000000 133 ns/op
As you can see, creating the map with a big enough capacity speeds things up: it runs about twice as fast.
And as mentioned, if nums can be sorted in advance, that is ~10,000 times faster!
If nums is always sorted, you can do a binary search to see if the complement to whichever number you're on is also in the slice.
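func binary(haystack []int, needle, startsAt int) int {
if len(haystack) == 0 {
return -1 // an empty slice cannot contain needle; this also guards the index below
}
pivot := len(haystack) / 2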
switch {
case haystack[pivot] == needle:
return pivot + startsAt
case len(haystack) <= 1:
return -1
case needle > haystack[pivot]:
return binary(haystack[pivot+1:], needle, startsAt+pivot+1)
case needle < haystack[pivot]:
return binary(haystack[:pivot], needle, startsAt)
}
return -1 // code can never fall off here, but the compiler complains
// if you don't have any returns out of conditionals.
}
func twoSum(nums []int, target int) (int, int) {
for i, num := range nums {
adjusted := target - num
if j := binary(nums, adjusted, 0); j != -1 {
return i, j
}
}
return 0, 0
}
playground example
Or you can use sort.SearchInts which implements binary searching.
func twoSum(nums []int, target int) (int, int) {
for i, num := range nums {
adjusted := target - num
if j := sort.SearchInts(nums, adjusted); j < len(nums) && nums[j] == adjusted {
// sort.SearchInts returns the index where the searched number
// would be if it was there. If it's not, then nums[j] != adjusted.
return i, j
}
}
return 0, 0
}

Go: What is the fastest/cleanest way to remove multiple entries from a slice?

How would you implement the deleteRecords function in the code below:
Example:
type Record struct {
id int
name string
}
type RecordList []*Record
func deleteRecords( l *RecordList, ids []int ) {
// Assume the RecordList can contain several 100 entries.
// and the number of records to be removed is about 10.
// What is the fastest and cleanest ways to remove the records that match
// the id specified in the records list.
}
I did some micro-benchmarking on my machine, trying out most of the approaches given in the replies here, and this code comes out fastest when you've got up to about 40 elements in the ids list:
func deleteRecords(data []*Record, ids []int) []*Record {
w := 0 // write index
loop:
for _, x := range data {
for _, id := range ids {
if id == x.id {
continue loop
}
}
data[w] = x
w++
}
return data[:w]
}
You didn't say whether it's important to preserve the order of records in the list. If it isn't, then this function is faster than the above and still fairly clean.
func reorder(data []*Record, ids []int) []*Record {
n := len(data)
i := 0
loop:
for i < n {
r := data[i]
for _, id := range ids {
if id == r.id {
data[i] = data[n-1]
n--
continue loop
}
}
i++
}
return data[0:n]
}
As the number of ids rises, so does the cost of the linear search. At around 50 elements, using a map or doing a binary search to look up the id becomes more efficient, as long as you can avoid rebuilding the map (or resorting the list) every time. At several hundred ids, it becomes more efficient to use a map or a binary search even if you have to rebuild it every time.
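A sketch of the binary-search variant mentioned above, assuming the Record type from the question and an ids slice that the caller has already sorted once with sort.Ints. The deleteSorted name is hypothetical:

```go
package main

import (
	"fmt"
	"sort"
)

type Record struct {
	id   int
	name string
}

// deleteSorted removes, in place, every record whose id appears in
// sortedIDs, which must be in ascending order.
func deleteSorted(data []*Record, sortedIDs []int) []*Record {
	w := 0 // write index
	for _, r := range data {
		k := sort.SearchInts(sortedIDs, r.id)
		if k < len(sortedIDs) && sortedIDs[k] == r.id {
			continue // id is in the delete list
		}
		data[w] = r
		w++
	}
	// Clear the tail so the dropped records can be garbage collected.
	for i := w; i < len(data); i++ {
		data[i] = nil
	}
	return data[:w]
}

func main() {
	data := []*Record{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	ids := []int{4, 2}
	sort.Ints(ids)
	for _, r := range deleteSorted(data, ids) {
		fmt.Println(r.id, r.name)
	}
}
```

The relative order of the surviving records is preserved, and each lookup costs O(log len(ids)) instead of a linear scan.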
If you wish to preserve original contents of the slice, something like this is more appropriate:
func deletePreserve(data []*Record, ids []int) []*Record {
wdata := make([]*Record, len(data))
w := 0
loop:
for _, x := range data {
for _, id := range ids {
if id == x.id {
continue loop
}
}
wdata[w] = x
w++
}
return wdata[0:w]
}
For a personal project, I did something like this:
func filter(sl []int, fn func(int) bool) []int {
result := make([]int, 0, len(sl))
last := 0
for i, v := range sl {
if fn(v) {
result = append(result, sl[last:i]...)
last = i + 1
}
}
return append(result, sl[last:]...)
}
It doesn't mutate the original, but should be relatively efficient.
It's probably better to just do:
func filter(sl []int, fn func(int) bool) (result []int) {
for _, v := range sl {
if !fn(v) {
result = append(result, v)
}
}
return
}
Simpler and cleaner.
If you want to do it in-place, you probably want something like:
func filter(sl []int, fn func(int) bool) []int {
outi := 0
res := sl
for _, v := range sl {
if !fn(v) {
res[outi] = v
outi++
}
}
return res[0:outi]
}
You can optimize this to use copy to copy ranges of elements, but that's twice
the code and probably not worth it.
So, in this specific case, I'd probably do something like:
func deleteRecords(l []*Record, ids []int) []*Record {
outi := 0
L:
for _, v := range l {
for _, id := range ids {
if v.id == id {
continue L
}
}
l[outi] = v
outi++
}
return l[0:outi]
}
(Note: untested.)
No allocations, nothing fancy, and assuming the rough size of the list of Records and the list of ids you presented, a simple linear search is likely to do as well as fancier things but without any overhead. I realize that my version mutates the slice and returns a new slice, but that's not un-idiomatic in Go, and it avoids forcing the slice at the callsite to be heap allocated.
For the case you described, where len(ids) is approximately 10 and len(*l) is in the several hundreds, this should be relatively fast, since it minimizes memory allocations by updating in place.
package main
import (
"fmt"
"strconv"
)
type Record struct {
id int
name string
}
type RecordList []*Record
func deleteRecords(l *RecordList, ids []int) {
rl := *l
for i := 0; i < len(rl); i++ {
rid := rl[i].id
for j := 0; j < len(ids); j++ {
if rid == ids[j] {
copy(rl[i:], rl[i+1:])
rl[len(rl)-1] = nil
rl = rl[:len(rl)-1]
i-- // the next record has shifted into index i, so examine it too
break
}
}
}
*l = rl
}
func main() {
l := make(RecordList, 777)
for i := range l {
l[i] = &Record{int(i), "name #" + strconv.Itoa(i)}
}
ids := []int{0, 1, 2, 4, 8, len(l) - 1, len(l)}
fmt.Println(ids, len(l), cap(l), *l[0], *l[1], *l[len(l)-1])
deleteRecords(&l, ids)
fmt.Println(ids, len(l), cap(l), *l[0], *l[1], *l[len(l)-1])
}
Output:
[0 1 2 4 8 776 777] 777 777 {0 name #0} {1 name #1} {776 name #776}
[0 1 2 4 8 776 777] 771 777 {3 name #3} {5 name #5} {775 name #775}
Instead of repeatedly searching ids, you could use a map. This code preallocates the full size of the map, and then just moves array elements in place. There are no other allocations.
func deleteRecords(l *RecordList, ids []int) {
m := make(map[int]bool, len(ids))
for _, id := range ids {
m[id] = true
}
s, x := *l, 0
for _, r := range s {
if !m[r.id] {
s[x] = r
x++
}
}
*l = s[0:x]
}
Use the Delete method of the old container/vector package as a guide. (That package was removed before Go 1, so in current Go a plain slice is the way to go.)
Here is one option but I would hope there are cleaner/faster more functional looking ones:
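func deleteRecords(l *RecordList, ids []int) *RecordList {
var newList RecordList
// Range over the slice itself; a pointer to a slice cannot be ranged over.
for _, rec := range *l {
toRemove := false
for _, id := range ids {
if rec.id == id {
toRemove = true
break
}
}
// Decide once per record, after all ids have been checked.
if !toRemove {
newList = append(newList, rec)
}
}
return &newList
}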
With large enough l and ids it will be more effective to Sort() both lists first and then do a single loop over them instead of two nested loops
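A sketch of that idea, assuming the Record type from the question; deleteMerge is a hypothetical name. It reorders both inputs in place, so it only fits when the original record order does not matter:

```go
package main

import (
	"fmt"
	"sort"
)

type Record struct {
	id   int
	name string
}

// deleteMerge sorts data by id and ids ascending, then drops matching
// records in a single merge-style pass. Both inputs are reordered in place.
func deleteMerge(data []*Record, ids []int) []*Record {
	sort.Slice(data, func(a, b int) bool { return data[a].id < data[b].id })
	sort.Ints(ids)

	w, k := 0, 0 // w: write index into data, k: cursor into ids
	for _, r := range data {
		// Skip ids smaller than the current record's id; they cannot match
		// this or any later record.
		for k < len(ids) && ids[k] < r.id {
			k++
		}
		if k < len(ids) && ids[k] == r.id {
			continue // delete: do not copy r forward
		}
		data[w] = r
		w++
	}
	return data[:w]
}

func main() {
	data := []*Record{{3, "c"}, {1, "a"}, {4, "d"}, {2, "b"}}
	for _, r := range deleteMerge(data, []int{4, 2, 9}) {
		fmt.Println(r.id, r.name)
	}
}
```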

Resources