Is there a way to map an array of objects in golang?

Coming from Node.js, I could do something like:
// given an array `list` of objects with a field `fruit`:
fruits = list.map(el => el.fruit) // returns an array of fruit strings
Any way to do that in an elegant one-liner in golang?
I know I can do it with a range loop, but I am looking for a one-liner solution.

In Go, arrays are inflexible (because their length is encoded in their type) and costly to pass to functions (because a function operates on copies of its array arguments). I'm assuming you'd like to operate on slices rather than on arrays.
Because methods cannot take additional type arguments, you cannot simply declare a generic Map method in Go. However, you can define Map as a generic top-level function:
func Map[T, U any](ts []T, f func(T) U) []U {
    us := make([]U, len(ts))
    for i := range ts {
        us[i] = f(ts[i])
    }
    return us
}
Then you can write the following code,
names := []string{"Alice", "Bob", "Carol"}
fmt.Println(Map(names, utf8.RuneCountInString))
which prints [5 3 5] to stdout (try it out in this Playground).
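Applied to the question's Node.js example, a minimal sketch (the Item struct and its Fruit field are assumptions, since the original element type isn't shown):

type Item struct{ Fruit string }

list := []Item{{Fruit: "apple"}, {Fruit: "banana"}}
fruits := Map(list, func(it Item) string { return it.Fruit }) // []string
fmt.Println(fruits) // [apple banana]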
Go 1.18 saw the addition of a golang.org/x/exp/slices package, which provides many convenient operations on slices, but a Map function is noticeably absent from it. The omission of that function was the result of a long discussion in the GitHub issue dedicated to the golang.org/x/exp/slices proposal; concerns included the following:
hidden cost (O(n)) of operations behind a one-liner
uncertainty about error handling inside Map
risk of encouraging a style that strays too far from Go's traditional style
Russ Cox ultimately elected to drop Map from the proposal because it's
probably better as part of a more comprehensive streams API somewhere else.

Related

Why should I use the & sign on structs?

In the Go Tour, there is a section on struct literals.
package main

import "fmt"

type Vertex struct {
    X, Y int
}

var (
    v1 = Vertex{1, 2}  // has type Vertex
    v2 = Vertex{X: 1}  // Y:0 is implicit
    v3 = Vertex{}      // X:0 and Y:0
    p  = &Vertex{1, 2} // has type *Vertex
)

func main() {
    fmt.Println(v1, p, v2, v3)
}
What's the difference between a struct with an ampersand and the one without? I know that the ones with ampersands point to the same reference, but why should I use them over the regular ones?
var p = &Vertex{} // why should I use this
var c = Vertex{} // over this
It's true that p (&Vertex{}) has type *Vertex and that c (Vertex{}) has type Vertex.
However, I don't believe that statement really answers the question of why one would choose one over the other. It'd be kind of like answering the question "why use planes over cars" with something like "planes are winged, cars aren't." (Obviously we all know what wings are, but you get the point).
But it also wouldn't be very helpful to simply say "go learn about pointers" (though I think it is a really good idea to do so nonetheless).
How you choose basically boils down to the following.
Realize that &Vertex{} and Vertex{} are fundamentally initialized in the same way.
There might be some low-level memory allocation differences (i.e. stack vs heap), but you really should just let the compiler worry about these details
What makes one more useful and performant than the other is determined by how they are used in the program.
"Do I want a pointer to the struct (p), or just the struct (c)?"
Note that you can get a pointer using the & operator (e.g. &c); you can dereference a pointer to get the value using * (e.g. *p)
So depending on what you choose, you may end up doing a lot of *p or &c (see the sketch after this list)
Bottom-line: create what you will use; if you don't need a pointer, don't make one (this will help more with "optimizations" in the long run).
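A minimal sketch of moving between the two (the variable names mirror p and c from the question):

c := Vertex{1, 2} // a value
p := &c           // p has type *Vertex and points at c
p.X = 10          // shorthand for (*p).X = 10; mutates c through the pointer
v := *p           // dereference: v is an independent copy
fmt.Println(c, v) // {10 2} {10 2}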
Should I use Vertex{} or &Vertex{}?
For the Vertex given in your example, I'd choose Vertex{} to get a simple value object.
Some reasons:
Vertex in the example is pretty small in terms of size. Copying is cheap.
Garbage collection is simplified, and the compiler may avoid creating garbage at all (values can live on the stack)
Pointers can get tricky and add unnecessary cognitive load to the programmer (i.e. gets harder to maintain)
Pointers shared across goroutines are something to avoid if you ever get into concurrency (mutating shared state through them is unsafe without synchronization)
Vertex doesn't really contain anything worth mutating (just return/create a new Vertex when needed).
Some reasons why you'd want &Struct{}:
If Struct has a member caching some state that needs to be changed inside the original object itself (see the sketch after this list).
Struct is huge, and you've done enough profiling to determine that the cost of copying is significant.
As an aside: you should try to keep your structs small, it's just good practice.
You find yourself making copies by accident (this is a bit of a stretch):
    v := Struct{}
    v2 := v // Copy happens

    v := &Struct{}
    v2 := v // Only a pointer is copied
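For the first point, a minimal sketch (the Counter type is a hypothetical example, not from the question):

type Counter struct{ n int }

// Inc has a pointer receiver, so it mutates the original Counter.
func (c *Counter) Inc() { c.n++ }

c := &Counter{}
c.Inc()
c.Inc()
fmt.Println(c.n) // 2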
The comments pretty much spell it out:
v1 = Vertex{1, 2} // has type Vertex
p = &Vertex{1, 2} // has type *Vertex
As in many other languages, & takes the address of the operand following it. This is useful when you want a pointer rather than a value.
If you need to understand more about pointers in programming, you could start with this for Go, or even the Wikipedia page.

What does ... mean when coming directly after a slice?

What does ... mean in this context in Go?
ids = append(ids[:index], ids[index+1:]...)
I have read this great Q&A:
Do three dots (which is called wildcard?) contain multiple meanings?
about what ... means in some situations, but I don't understand what it means in the situation above.
Some languages (C, Python, ...) accept variadic arguments. Basically, you allow the client of a function to pass a number of arguments, without specifying how many. Since the function will still need to process these arguments one way or another, they are usually converted to a collection of some sort. For instance, in Python:
def foo(x, *args):  # * is used for variadic arguments
    return len(args)
>>> foo(1) # Passed the required argument, but no varargs
0
>>> foo(1, 2, 3) # Passed the required, plus two variadic arguments
2
>>> foo(1, 2, 3, 4, 5, 6) # required + 5 args, etc...
5
Now, one obvious problem of that approach is that "a number of arguments" is quite a fuzzy concept as far as types are concerned. C uses pointers, Python doesn't really care much about types in the first place, and Go restricts it to a single, well-specified case: you pass a slice of a given type.
This is nice because it lets the type system do its thing, while still being quite flexible (in particular, the type in question can be an interface, so you can pass different 'actual types' as long as the function knows how to process them).
The typical example would be the Command function, which executes a program, passing it some arguments:
func Command(name string, arg ...string) *Cmd
It makes a lot of sense here, but recall that variadic arguments are just a convenient way to pass slices. You could have the exact same API with:
func Command(name string, args []string) *Cmd
The only advantage of the first is that it lets you pass no argument, one argument, several... without having to build the slice yourself.
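As an illustration, a minimal sketch of a variadic function (the sum helper is hypothetical):

func sum(nums ...int) int { // inside the function, nums has type []int
    total := 0
    for _, n := range nums {
        total += n
    }
    return total
}

sum()        // 0: no arguments passed, nums is an empty slice
sum(1, 2, 3) // 6: the compiler builds the slice for you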
What about your question, then?
Sometimes, you do have a slice, but need to call a variadic function. But if you do it naively:
my_args_slice := []string{"foo", "bar"}
cmd := Command("myprogram", my_args_slice)
The compiler will complain that it expects strings but got a slice instead! What you want to tell it is that it doesn't have to 'build the slice in the back', because you already have one. You do that by using the ellipsis:
my_args_slice := []string{"foo", "bar"}
cmd := Command("myprogram", my_args_slice...) // <- Interpret that as strings
The append function, despite being built-in and special, follows the same rules. You can append zero, one, or more elements to the slice. If you want to concatenate slices (i.e. you have a 'slice of arguments' already), you similarly use the ellipsis to make append use the slice directly.
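For instance, a quick sketch of concatenating slices this way:

a := []int{1, 2}
b := []int{3, 4}
a = append(a, b...) // a is now [1 2 3 4]
a = append(a, 5)    // appending a single element: [1 2 3 4 5]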
It unpacks the slice.
ids[:index] is a short form of ids[0:index]
ids[index+1:] is a short form of ids[index+1:len(ids)]
So, your example ids = append(ids[:index], ids[index+1:]...) translates to
//pseudocode
ids = append(ids[0:index], ids[index+1], ids[index+2], ids[index+3], ..., ids[len(ids)-2], ids[len(ids)-1])
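In other words, this is the common idiom for removing the element at position index. A quick sketch:

ids := []int{10, 20, 30, 40}
index := 1
ids = append(ids[:index], ids[index+1:]...)
fmt.Println(ids) // [10 30 40]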

Re-map a number from one range to another

Is there any equivalent in go for the Arduino map function?
map(value, fromLow, fromHigh, toLow, toHigh)
Description
Re-maps a number from one range to another. That is, a value of
fromLow would get mapped to toLow, a value of fromHigh to toHigh,
values in-between to values in-between, etc
If not, how would I implement this in go?
Is there any equivalent in go for the Arduino map function?
The standard library, or more specifically the math package, does not offer such a function, no.
If not, how would I implement this in go?
By taking the original code and translating it to Go. C and Go are syntactically close, so the translation is straightforward; the manual page for map that you linked gives you the code.
Original from the page you linked:
For the mathematically inclined, here's the whole function
long map(long x, long in_min, long in_max, long out_min, long out_max)
{
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
You would translate that to something like
func Map(x, in_min, in_max, out_min, out_max int64) int64 {
    return (x-in_min)*(out_max-out_min)/(in_max-in_min) + out_min
}
Here is an example on the go playground.
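For illustration, a couple of hypothetical calls:

fmt.Println(Map(5, 0, 10, 0, 100))     // 50
fmt.Println(Map(512, 0, 1023, 0, 255)) // 127 (integer division truncates)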
Note that map is not a valid function name in Go: map is a reserved keyword, used for declaring map types in the same way []T declares slice types.

Implement range in Go

I have a type List []string and I am implementing some standard functions like Insert, DeleteAt etc. I would like to implement range so I can iterate the list easily.
I cannot seem to find the way to do that.
There is no reason to re-implement range since the range keyword will work on type List.
var l List
for i, v := range l {
    /* whatever */
}
It is not possible in Go to implement range yourself for a given type. The range clause only works with Go's built-in types: arrays, pointers to arrays, slices, strings, maps, and channels.

Scala: Mutable vs. Immutable Object Performance - OutOfMemoryError

I wanted to compare the performance characteristics of immutable.Map and mutable.Map in Scala for a similar operation (namely, merging many maps into a single one. See this question). I have what appear to be similar implementations for both mutable and immutable maps (see below).
As a test, I generated a List containing 1,000,000 single-item Map[Int, Int] and passed this list into the functions I was testing. With sufficient memory, the results were unsurprising: ~1200ms for mutable.Map, ~1800ms for immutable.Map, and ~750ms for an imperative implementation using mutable.Map -- not sure what accounts for the huge difference there, but feel free to comment on that, too.
What did surprise me a bit, perhaps because I'm being a bit thick, is that with the default run configuration in IntelliJ 8.1, both mutable implementations hit an OutOfMemoryError, but the immutable collection did not. The immutable test did run to completion, but it did so very slowly -- it takes about 28 seconds. When I increased the max JVM memory (to about 200MB, not sure where the threshold is), I got the results above.
Anyway, here's what I really want to know:
Why do the mutable implementations run out of memory, but the immutable implementation does not? I suspect that the immutable version allows the garbage collector to run and free up memory before the mutable implementations do -- and all of those garbage collections explain the slowness of the immutable low-memory run -- but I'd like a more detailed explanation than that.
Implementations below. (Note: I don't claim that these are the best implementations possible. Feel free to suggest improvements.)
def mergeMaps[A,B](func: (B,B) => B)(listOfMaps: List[Map[A,B]]): Map[A,B] =
  (Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv)) { (acc, kv) =>
    acc + (if (acc.contains(kv._1)) kv._1 -> func(acc(kv._1), kv._2) else kv)
  }

def mergeMutableMaps[A,B](func: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A,B] =
  (mutable.Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv)) { (acc, kv) =>
    acc + (if (acc.contains(kv._1)) kv._1 -> func(acc(kv._1), kv._2) else kv)
  }

def mergeMutableImperative[A,B](func: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A,B] = {
  val toReturn = mutable.Map[A,B]()
  for (m <- listOfMaps; kv <- m) {
    if (toReturn contains kv._1) {
      toReturn(kv._1) = func(toReturn(kv._1), kv._2)
    } else {
      toReturn(kv._1) = kv._2
    }
  }
  toReturn
}
Well, it really depends on the actual type of Map you are using. Probably HashMap. Now, mutable structures like that gain performance by pre-allocating the memory they expect to use. You are joining one million maps, so the final map is bound to be somewhat big. Let's see how these key/values get added:
protected def addEntry(e: Entry) {
  val h = index(elemHashCode(e.key))
  e.next = table(h).asInstanceOf[Entry]
  table(h) = e
  tableSize = tableSize + 1
  if (tableSize > threshold)
    resize(2 * table.length)
}
See the 2 * in the resize line? The mutable HashMap grows by doubling each time it runs out of space, while the immutable one is pretty conservative in memory usage (though existing keys will usually occupy twice the space when updated).
Now, as for other performance problems, you are creating a list of keys and values in the first two versions. That means that, before you join any maps, you already have each Tuple2 (the key/value pairs) in memory twice! Plus the overhead of List, which is small, but we are talking about more than one million elements times the overhead.
You may want to use a projection, which avoids that. Unfortunately, projection is based on Stream, which isn't very reliable for our purposes on Scala 2.7.x. Still, try this instead:
for (m <- listOfMaps.projection; kv <- m) yield kv
A Stream doesn't compute a value until it is needed. The garbage collector ought to collect the already-consumed elements as well, as long as you don't keep a reference to the Stream's head, and your algorithm doesn't appear to keep one.
EDIT
Complementing: a for/yield comprehension takes one or more collections and returns a new collection. Whenever it makes sense, the returned collection is of the same type as the original. So, for example, in the following code, the for-comprehension creates a new list, which is then stored in l2. It is not the val l2 = that creates the new list, but the for-comprehension.
val l = List(1,2,3)
val l2 = for (e <- l) yield e*2
Now, let's look at the code being used in the first two algorithms (minus the mutable keyword):
(Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv))
The foldLeft operator, here written with its /: synonym, will be invoked on the object returned by the for-comprehension. Remember that a : at the end of an operator inverts the order of the object and the parameters.
Now, let's consider what object is this, on which foldLeft is being called. The first generator in this for-comprehension is m <- listOfMaps. We know that listOfMaps is a collection of type List[X], where X isn't really relevant here. The result of a for-comprehension on a List is always another List. The other generators aren't relevant.
So, you take this List, get all the key/values inside each Map which is a component of this List, and make a new List with all of that. That's why you are duplicating everything you have.
(in fact, it's even worse than that, because each generator creates a new collection; the collections created by the second generator are just the size of each element of listOfMaps though, and are immediately discarded after use)
The next question -- actually, the first one, but it was easier to invert the answer -- is how the use of projection helps.
When you call projection on a List, it returns a new object of type Stream (on Scala 2.7.x). At first you may think this will only make things worse, because you'll now have three copies of the List instead of a single one. But a Stream is not pre-computed. It is lazily computed.
What that means is that the resulting object, the Stream, isn't a copy of the List, but, rather, a function that can be used to compute the Stream when required. Once computed, the result will be kept so that it doesn't need to be computed again.
Also, map, flatMap and filter on a Stream all return a new Stream, which means you can chain them together without making a single copy of the List which created them. Since for-comprehensions with yield use these very functions, using a Stream inside the for-comprehension prevents unnecessary copies of the data.
Now, suppose you wrote something like this:
val kvs = for (m <- listOfMaps.projection; kv <- m) yield kv
(Map[A,B]() /: kvs) { ... }
In this case you aren't gaining anything. After assigning the Stream to kvs, the data hasn't been copied yet. Once the second line is executed, though, kvs will have computed each of its elements, and, therefore, will hold a complete copy of the data.
Now consider the original form:
(Map[A,B]() /: (for (m <- listOfMaps.projection; kv <- m) yield kv))
In this case, the Stream is used at the same time it is computed. Let's briefly look at how foldLeft for a Stream is defined:
override final def foldLeft[B](z: B)(f: (B, A) => B): B = {
  if (isEmpty) z
  else tail.foldLeft(f(z, head))(f)
}
If the Stream is empty, just return the accumulator. Otherwise, compute a new accumulator (f(z, head)) and then pass it and the function to the tail of the Stream.
Once f(z, head) has executed, though, there will be no remaining reference to the head. Or, in other words, nothing anywhere in the program will be pointing to the head of the Stream, and that means the garbage collector can collect it, thus freeing memory.
The end result is that each element produced by the for-comprehension will exist just briefly, while you use it to compute the accumulator. And this is how you save keeping a copy of your whole data.
Finally, there is the question of why the third algorithm does not benefit from it. Well, the third algorithm does not use yield, so no copy of any data whatsoever is being made. In this case, using projection only adds an indirection layer.
