How do i remove duplicate valeus in push? - go

This might be noob question...
How to remove the duplicate values instead pushing values?
When the values was:("lorem", "ipsum", 1, 1, 1, "jack", "jill", "felix", "donking")
It should print:("lorem", "ipsum", 1, "jack", "jill", "felix", "donking")
How to remove this duplicated values in push function like above?
// Push values
func (q *Data) Push(n interface{}) *Data {
if q.Len() < q.size {
q.data = append(q.data, n)
if q.data[q.Len()] == q.data[q.Len()+1] {
q.Pop()
q.Push(n)
}
} else {
q.Pop()
q.Push(n)
}
return q
}

Every data structure uses an underlying primitive data structure for implementation, and it looks like you are using a slice. If you only want to save unique data, you should use a map. In order to be as efficient as possible when using a map only for finding duplicates, you can use a map[interface{}]struct{}.

One needs to check the data in the queue against the pushed value. If the data was already in the queue it should returned.
for i := range q.data {
if q.data[i] == n {
return q // return q when n value is found equal to one of q.data values.
}
}
q.data = append(q.data, n)

Related

Reading from a slice of unknown length in Golang

I'm trying to replicate this algorithm for finding duplicates in an array in Golang. Here's the javascript version:
function hasDuplicateValue(array) {
let existingNumbers = [];
for(let i = 0; i < array.length; i++) {
if(existingNumbers[array[i]] === 1) {
return true;
} else {
existingNumbers[array[i]] = 1;
}
}
return false;
}
On line 2, the algorithm creates an empty array of unknown length, and then adds 1 to an index in the array corresponding with each number that it finds (e.g. if it finds the number 3 in the array, it will add a 1 to index 3 in existing numbers.
I'm wondering — how do I replicate this in Golang (since we need to have slots allocated in the slice before reading it). Would I first need to find the max value in the array and then declare the existingNumbers slice to be of that same size?
Or is there a more efficient way of doing this (instead of searching through the array and finding the max value before constructing the slice).
Thanks!
Edit:
I realized that I can't do this with a slice because I can't read from an empty value. However, as #icza suggested, it will work with a map:
func findDuplicates(list []int)(bool) {
temp := make(map[int]int)
for _, elem := range list {
if temp[elem] == 1 {
return true
} else {
temp[elem] = 1
}
}
return false
}
As comments, I would also suggest using a map to keep the state of the duplications, but we can use map[int]struct{} because empty structs are not consumed any memory in Go.
And also I have simplified the code a bit and it is as follows.
func findDuplicates(list []int) bool {
temp := make(map[int]struct{})
for _, elem := range list {
if _, ok := temp[elem]; ok {
return true
}
temp[elem] = struct{}{}
}
return false
}
Full code can be executed here

Golang maps/hashmaps, optimizing for iteration speed

When migrating a production NodeJS application to Golang I've noticed that iteration of GO's native Map is actually slower than Node.
I've come up with an alternative solution that sacrifices removal/insertion speed with iteration speed instead, by exposing an array that can be iterated over and storing key=>index pairs inside a separate map.
While this solution works, and has a significant performance increase, I was wondering if there is a better solution to this that I could look into.
The setup I have is that its very rare something is removed from the hashmaps, only additions and replacements are common for which this implementation 'works', albeit feels like a workaround more than an actual solution.
The maps are always indexed by an integer, holding arbitrary data.
FastMap: 500000 Iterations - 0.153000ms
Native Map: 500000 Iterations - 4.988000ms
/*
Unordered hash map optimized for iteration speed.
Stores values in an array and holds key=>index mappings inside a separate hashmap
*/
type FastMapEntry[K comparable, T any] struct {
Key K
Value T
}
type FastMap[K comparable, T any] struct {
m map[K]int // Stores key => array index mappings
entries []FastMapEntry[K, T] // Array holding entries and their keys
len int // Total map size
}
func MakeFastMap[K comparable, T any]() *FastMap[K, T] {
return &FastMap[K, T]{
m: make(map[K]int),
entries: make([]FastMapEntry[K, T], 0),
}
}
func (m *FastMap[K, T]) Set(key K, value T) {
index, exists := m.m[key]
if exists {
// Replace if key already exists
m.entries[index] = FastMapEntry[K, T]{
Key: key,
Value: value,
}
} else {
// Store the key=>index pair in the map and add value to entries. Increase total len by one
m.m[key] = m.len
m.entries = append(m.entries, FastMapEntry[K, T]{
Key: key,
Value: value,
})
m.len++
}
}
func (m *FastMap[K, T]) Has(key K) bool {
_, exists := m.m[key]
return exists
}
func (m *FastMap[K, T]) Get(key K) (value T, found bool) {
index, exists := m.m[key]
if exists {
found = true
value = m.entries[index].Value
}
return
}
func (m *FastMap[K, T]) Remove(key K) bool {
index, exists := m.m[key]
if exists {
// Remove value from entries
m.entries = append(m.entries[:index], m.entries[index+1:]...)
// Remove key=>index mapping
delete(m.m, key)
m.len--
for i := index; i < m.len; i++ {
// Move all index mappings up, starting from current index
m.m[m.entries[i].Key] = i
}
}
return exists
}
func (m *FastMap[K, T]) Entries() []FastMapEntry[K, T] {
return m.entries
}
func (m *FastMap[K, T]) Len() int {
return m.len
}
The test code that was ran is:
// s.Variations is a native map holding ~500k records
start := time.Now()
iterations := 0
for _, variation := range s.Variations {
if variation.Id > 0 {
}
iterations++
}
log.Printf("Native Map: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)
// Copy data into FastMap
fm := helpers.MakeFastMap[state.VariationId, models.ItemVariation]()
for key, variation := range s.Variations {
fm.Set(key, variation)
}
start = time.Now()
iterations = 0
for _, variation := range fm.Entries() {
if variation.Value.Id > 0 {
}
iterations++
}
log.Printf("FastMap: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)
I think this kind of comparison and benchmarking is a little off-topic. Go implementation of map is quite different from your implementation, basically because it needs to cover a wider area of entries, the structs used in compile time are actually kind of heavy (not so much though, they basically store some information about the types you use in your map and so on), and the implementation approach is different! Go implementation of map is basically a hashmap (yours is not obviously, or it is, but the actual hashing implementation is delegated to the m map you hold internally).
One of the other factors makes you get this result is, if you take a look at this:
for _, variation := range fm.Entries() {
if variation.Value.Id > 0 {
}
iterations++
}
Basically, you're iterating over a slice, which is much easier and faster to iterate rather than a map, you have a view to an array, which holds elements of the same types next to each other, makes sense, right?
What you should do to make a better comparison would be something like this:
for _, y := range fastMap.m {
_ = fastMap.Entries()[y].Value + 1 // some simple calculation
}
If you're really looking for performance, a well written hash function and a fixed size array would be your best choice.

Generating combinatorial string from map

I have a map as such:
// map[int] position in string
// map[rune]bool characters possible at said position
func generateString(in map[int]map[rune]bool) []string {
// example: {0: {'A':true, 'C': true}, 1: {'E': true}, 2: {'I': true, 'X': true}}
result := []string{"AEI", "AEX", "CEI", "CEX"} // should generate these
return result
}
The difference with all possible permutations is that we are specifying which permutations are possible by index and I think that's the real head-breaker here.
First, we need to convert map[int]map[rune]bool to []map[rune]bool since map iteration isn't guaranteed to be sorted by key's
After that, this is a recursive approach
var res []string
func dfs(curString string, index int, in []map[rune]bool) {
if index == len(in) {
res = append(res, curString)
return
}
for ch, is := range in[index] {
if !is { // I assume booleans can be false
return
}
dfs(curString+string(ch), index+1, in)
}
}
and we can call it with dfs("", 0, arr) where arr is given map converted to slice and answer will be in res variable

GO - switch statement in a recursive function

I have an algorithm that I'm trying to implement but currently I have absolutely no clue how to do so, from a technical perspective.
We have a slice of 5 floats:
mySlice := [float1, float2, float3, float4, float5]
And a switch statement:
aFloat := mySlice[index]
switch aFloat {
case 1:
{
//do something
}
case 2:
{
//do something
}
case 3:
{
//do something
}
case 4:
{
//do something
}
case 5:
{
//do something
}
default:
{
//somehow go back to slice, take the next smallest and run
//through the switch statement again
}
}
What I want to do is as follows:
identify the smallest element of mySlice ex: smallestFloat
run smallestFloat through the switch statement
if smallestFloat gets to default case take the next smallest float from mySlice
do step 2 again.
I've managed to do the first step with a for loop and step 2, but I'm stuck on steps 3 and 4. I don't have an idea at the moment on how I might go about re-feeding the next smallest float from mySlice to the switch statement again...
I would appreciate any light shed on my problem.
EDIT: I figured that it would be good to put my solution to the algorithm presented above.
create another slice which will be a sorted version of mySlice
create a map[int]value where the index will correspond to the position of the value in the non-sorted slice, but the items of the map will be inserted in the same order as the sorted slice.
Result: a value sorted map with the respective indexes corresponding to the position of the original non-sorted slice
Here is an implementation using a Minimum Priority Queue. The original input slice of floats is not changed. It can be run on the Go playground
Note: When dealing with recursive functions, you need to be weary of stack overflows.
Go does tail recursion optimizations only in limited cases. For more information on that,
refer to this answer.
This particular example does even better than amortized O(log N) time, because it does not have to resize the priority queue halfway through. This makes it guaranteed O(log N).
package main
import (
"fmt"
)
func main() {
slice := []float64{2, 1, 13, 4, 22, 0, 5, 7, 3}
fmt.Printf("Order before: %v\n", slice)
queue := NewMinPQ(slice)
for !queue.Empty() {
doSmallest(queue)
}
fmt.Printf("Order after: %v\n", slice)
}
func doSmallest(queue *MinPQ) {
if queue.Empty() {
return
}
v := queue.Dequeue()
switch v {
case 1:
fmt.Println("Do", v)
case 2:
fmt.Println("Do", v)
case 3:
fmt.Println("Do", v)
case 4:
fmt.Println("Do", v)
case 5:
fmt.Println("Do", v)
default:
// No hit, do it all again with the next value.
doSmallest(queue)
}
}
// MinPQ represents a Minimum priority queue.
// It is implemented as a binary heap.
//
// Values which are enqueued can be dequeued, but will be done
// in the order where the smallest item is returned first.
type MinPQ struct {
values []float64 // Original input list -- Order is never changed.
indices []int // List of indices into values slice.
index int // Current size of indices list.
}
// NewMinPQ creates a new MinPQ heap for the given input set.
func NewMinPQ(set []float64) *MinPQ {
m := new(MinPQ)
m.values = set
m.indices = make([]int, 1, len(set))
// Initialize the priority queue.
// Use the set's indices as values, instead of the floats
// themselves. As these may not be re-ordered.
for i := range set {
m.indices = append(m.indices, i)
m.index++
m.swim(m.index)
}
return m
}
// Empty returns true if the heap is empty.
func (m *MinPQ) Empty() bool { return m.index == 0 }
// Dequeue removes the smallest item and returns it.
// Returns nil if the heap is empty.
func (m *MinPQ) Dequeue() float64 {
if m.Empty() {
return 0
}
min := m.indices[1]
m.indices[1], m.indices[m.index] = m.indices[m.index], m.indices[1]
m.index--
m.sink(1)
m.indices = m.indices[:m.index+1]
return m.values[min]
}
// greater returns true if element x is greater than element y.
func (m *MinPQ) greater(x, y int) bool {
return m.values[m.indices[x]] > m.values[m.indices[y]]
}
// sink reorders the tree downwards.
func (m *MinPQ) sink(k int) {
for 2*k <= m.index {
j := 2 * k
if j < m.index && m.greater(j, j+1) {
j++
}
if m.greater(j, k) {
break
}
m.indices[k], m.indices[j] = m.indices[j], m.indices[k]
k = j
}
}
// swim reorders the tree upwards.
func (m *MinPQ) swim(k int) {
for k > 1 && m.greater(k/2, k) {
m.indices[k], m.indices[k/2] = m.indices[k/2], m.indices[k]
k /= 2
}
}

How to check the uniqueness inside a for-loop?

Is there a way to check slices/maps for the presence of a value?
I would like to add a value to a slice only if it does not exist in the slice.
This works, but it seems verbose. Is there a better way to do this?
orgSlice := []int{1, 2, 3}
newSlice := []int{}
newInt := 2
newSlice = append(newSlice, newInt)
for _, v := range orgSlice {
if v != newInt {
newSlice = append(newSlice, v)
}
}
newSlice == [2 1 3]
Your approach would take linear time for each insertion. A better way would be to use a map[int]struct{}. Alternatively, you could also use a map[int]bool or something similar, but the empty struct{} has the advantage that it doesn't occupy any additional space. Therefore map[int]struct{} is a popular choice for a set of integers.
Example:
set := make(map[int]struct{})
set[1] = struct{}{}
set[2] = struct{}{}
set[1] = struct{}{}
// ...
for key := range(set) {
fmt.Println(key)
}
// each value will be printed only once, in no particular order
// you can use the ,ok idiom to check for existing keys
if _, ok := set[1]; ok {
fmt.Println("element found")
} else {
fmt.Println("element not found")
}
Most efficient is likely to be iterating over the slice and appending if you don't find it.
func AppendIfMissing(slice []int, i int) []int {
for _, ele := range slice {
if ele == i {
return slice
}
}
return append(slice, i)
}
It's simple and obvious and will be fast for small lists.
Further, it will always be faster than your current map-based solution. The map-based solution iterates over the whole slice no matter what; this solution returns immediately when it finds that the new value is already present. Both solutions compare elements as they iterate. (Each map assignment statement certainly does at least one map key comparison internally.) A map would only be useful if you could maintain it across many insertions. If you rebuild it on every insertion, then all advantage is lost.
If you truly need to efficiently handle large lists, consider maintaining the lists in sorted order. (I suspect the order doesn't matter to you because your first solution appended at the beginning of the list and your latest solution appends at the end.) If you always keep the lists sorted then you you can use the sort.Search function to do efficient binary insertions.
Another option:
package main
import "golang.org/x/tools/container/intsets"
func main() {
var (
a intsets.Sparse
b bool
)
b = a.Insert(9)
println(b) // true
b = a.Insert(9)
println(b) // false
}
https://pkg.go.dev/golang.org/x/tools/container/intsets
This option if the number of missing numbers is unknown
AppendIfMissing := func(sl []int, n ...int) []int {
cache := make(map[int]int)
for _, elem := range sl {
cache[elem] = elem
}
for _, elem := range n {
if _, ok := cache[elem]; !ok {
sl = append(sl, elem)
}
}
return sl
}
distincting a array of a struct :
func distinctObjects(objs []ObjectType) (distinctedObjs [] ObjectType){
var output []ObjectType
for i:= range objs{
if output==nil || len(output)==0{
output=append(output,objs[i])
} else {
founded:=false
for j:= range output{
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {
founded=true
}
}
if !founded{
output=append(output,objs[i])
}
}
}
return output
}
where the struct here is something like :
type ObjectType struct {
fieldname1 string
fieldname2 string
.........
}
the object will distinct by checked fields here :
if output[j].fieldname1==objs[i].fieldname1 && output[j].fieldname2==objs[i].fieldname2 &&......... {

Resources