How to modify field of a struct in a slice? - for-loop

I have a JSON file named test.json which contains:
[
{
"name" : "john",
"interests" : ["hockey", "jockey"]
},
{
"name" : "lima",
"interests" : ["eating", "poker"]
}
]
Now I have written a Go program which reads the JSON file into a slice of structs and then, on a condition check, modifies a struct's fields by iterating over the slice.
Here is what I've tried so far:
package main
import (
"log"
"strings"
"io/ioutil"
"encoding/json"
)
type subDB struct {
Name string `json:"name"`
Interests []string `json:"interests"`
}
var dbUpdate []subDB
func getJSON() {
// open the file
filename := "test.json"
val, err := ioutil.ReadFile(filename)
if err != nil {
log.Fatal(err)
}
err = json.Unmarshal(val, &dbUpdate)
if err != nil {
log.Fatal(err)
}
}
func (v *subDB) Change(newresponse []string) {
v.Interests = newresponse
}
func updater(name string, newinterest string) {
// iterating over the slice of structs
for _, item := range dbUpdate {
// checking if name supplied matches to the current struct
if strings.Contains(item.Name, name) {
flag := false // declare a flag variable
// item.Interests is a slice, so we iterate over it
for _, intr := range item.Interests {
// check if newinterest is within any one of slice value
if strings.Contains(intr, newinterest) {
flag = true
break // if we find one, we terminate the loop
}
}
// if flag is false, then we change the Interests field
// of the current struct
if !flag {
// Interests holds a slice of strings
item.Change([]string{newinterest}) // passing a slice of string
}
}
}
}
func main() {
getJSON()
updater("lima", "jogging")
log.Printf("%+v\n", dbUpdate)
}
The output I'm getting is:
[{Name:john Interests:[hockey jockey]} {Name:lima Interests:[eating poker]}]
However I should be getting an output like:
[{Name:john Interests:[hockey jockey]} {Name:lima Interests:[jogging]}]
My understanding was that since Change() has a pointer passed, it should directly modify the field. Can anyone point me out what I'm doing wrong?

The problem
Let's cite what the language specification says on the for ... range loops:
A "for" statement with a "range" clause iterates through all entries
of an array, slice, string or map, or values received on a channel.
For each entry it assigns iteration values to corresponding iteration
variables if present and then executes the block.
So, in
for _, item := range dbUpdate { ... }
the whole statement forms a scope in which a variable named item is declared, and it is assigned the value of each element of dbUpdate in turn, from the first to the last, as the statement performs its iterations.
Every assignment in Go copies the value of the expression being assigned into the variable receiving that value.
So, when you have
type subDB struct {
Name string `json:"name"`
Interests []string `json:"interests"`
}
var dbUpdate []subDB
you have a slice whose backing array contains a set of elements, each of which has type subDB.
Consequently, when for ... range iterates over your slice, on each iteration a shallow copy of the fields of a subDB value contained in the current slice element is done: the values of those fields are copied into the variable item.
We could rewrite what happens like this:
for i := 0; i < len(dbUpdate); i++ {
var item subDB
item = dbUpdate[i]
...
}
As you can see, if you mutate item in the loop's body, the changes you make to it do not in any way affect the collection's element currently being iterated over.
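To make the "shallow copy" point concrete, here is a minimal sketch (the function name mutateCopies is made up for illustration). It assumes the same subDB shape as in the question and shows which writes through the iteration variable are lost and which one is still visible, because the copied slice header shares its backing array with the original:

```go
package main

import "fmt"

type subDB struct {
	Name      string
	Interests []string
}

// mutateCopies changes every range copy in three ways; the caller inspects
// db afterwards to see which of the changes reached the slice.
func mutateCopies(db []subDB) {
	for _, item := range db {
		// Lost: item is a copy of the slice element.
		item.Name = "changed"
		if len(item.Interests) > 0 {
			// Visible: the copied slice header still points at the
			// same backing array as the element's Interests field.
			item.Interests[0] = "changed"
		}
		// Lost: only the copy's slice header is reassigned.
		item.Interests = []string{"replaced"}
	}
}

func main() {
	db := []subDB{{Name: "lima", Interests: []string{"eating", "poker"}}}
	mutateCopies(db)
	fmt.Printf("%+v\n", db) // [{Name:lima Interests:[changed poker]}]
}
```

So the struct itself is copied, but anything it merely points to (here, the strings behind Interests) is shared.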
The solutions
Broadly speaking, the solution is to become fully acquainted with the fact that Go is very simple in most of what it implements, so range is no magic either: the iteration variable is just a variable, and assignment to it is just an assignment.
As to solving the particular problem, there are multiple ways.
Refer to a collection element by its index
Do
for i := range dbUpdate {
dbUpdate[i].FieldName = value
}
A corollary to this is that sometimes, when the element is complex or you'd like to delegate its mutation to some function, you may take a pointer to it:
for i := range dbUpdate {
p := &dbUpdate[i]
mutateSubDB(p)
}
...
func mutateSubDB(p *subDB) {
p.SomeField = someValue
}
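Applied to the code from the question, the index-plus-pointer approach could look roughly like the sketch below. One assumption that differs from the original: the slice is passed in as a parameter (db) instead of living in the global dbUpdate, and the data is built inline rather than read from test.json, so that the example is self-contained:

```go
package main

import (
	"fmt"
	"strings"
)

type subDB struct {
	Name      string   `json:"name"`
	Interests []string `json:"interests"`
}

// Change replaces the Interests of the subDB that v points to.
func (v *subDB) Change(newresponse []string) {
	v.Interests = newresponse
}

// updater replaces the interests of every entry whose name contains name,
// unless one of its interests already contains newinterest.
func updater(db []subDB, name, newinterest string) {
	for i := range db {
		item := &db[i] // pointer to the element itself, not to a copy
		if !strings.Contains(item.Name, name) {
			continue
		}
		found := false
		for _, intr := range item.Interests {
			if strings.Contains(intr, newinterest) {
				found = true
				break
			}
		}
		if !found {
			item.Change([]string{newinterest})
		}
	}
}

func main() {
	db := []subDB{
		{Name: "john", Interests: []string{"hockey", "jockey"}},
		{Name: "lima", Interests: []string{"eating", "poker"}},
	}
	updater(db, "lima", "jogging")
	fmt.Printf("%+v\n", db)
	// [{Name:john Interests:[hockey jockey]} {Name:lima Interests:[jogging]}]
}
```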
Keep pointers in the slice
If your slice were declared like
var dbUpdate []*subDB
and you kept pointers to (usually heap-allocated) subDB values, the
for _, ptr := range dbUpdate { ... }
statement would naturally copy a pointer to an (anonymous) subDB variable into ptr, since the slice contains pointers and so the assignment copies a pointer.
Since all pointers containing the same address are pointing to the same value, mutating the target variable through the pointer kept in the iteration variable would mutate the same thing which is pointed to by the slice's element.
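A small sketch of the pointer-slice variant, again with a made-up helper (setInterests) and inline data instead of the JSON file. Note that encoding/json can also unmarshal into a []*subDB, but that is not shown here:

```go
package main

import "fmt"

type subDB struct {
	Name      string
	Interests []string
}

// setInterests replaces the interests of every entry named name.
// ptr is a copy of the pointer stored in the slice, so writing through it
// changes the same subDB value the slice element points to.
func setInterests(db []*subDB, name string, interests []string) {
	for _, ptr := range db {
		if ptr.Name == name {
			ptr.Interests = interests
		}
	}
}

func main() {
	db := []*subDB{
		{Name: "john", Interests: []string{"hockey", "jockey"}},
		{Name: "lima", Interests: []string{"eating", "poker"}},
	}
	setInterests(db, "lima", []string{"jogging"})
	for _, p := range db {
		fmt.Printf("%+v\n", *p)
	}
	// {Name:john Interests:[hockey jockey]}
	// {Name:lima Interests:[jogging]}
}
```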
Which approach to select should usually depend on considerations other than thinking about how one would iterate over the elements — simply because once you understand why your code did not work, you do not have this problem anymore.
As usual: if your values are really big, consider keeping pointers to them.
If your values need to be referenced from multiple places at the same time, keep pointers to them. In other cases keep the values directly: this greatly improves CPU data cache locality (simply put, by the time you're about to access the next element, its contents will most likely already have been fetched from memory, which does not happen when the CPU has to chase a pointer to some arbitrary memory location).

Related

Append content to slice into a nested struct does not work

I have two nested structs like this:
type Block struct {
ID string
Contents []string
}
type Package struct {
Name string
Blocks []Block
}
Original package (p) does not change when I try to append a new Content in a specific block.
for _, b := range p.Blocks {
if b.ID == "B1" {
fmt.Println("Adding a new content")
b.Contents = append(b.Contents, "c3")
}
}
Example:
https://play.golang.org/p/5hm6RjPFk8o
This is happening because this line:
for _, b := range p.Blocks {
creates a copy of each element in the slice, and in this case this means creating a copy of each Block in the slice. So when you then make the changes in the loop body, you are making them to the copy of the Block, instead of to the Block in the slice.
If you instead use the index to get a pointer to each Block, e.g.
for i := range p.Blocks {
b := &p.Blocks[i]
// modify b ...
}
it works as expected:
https://play.golang.org/p/h_nXEX9oWRT
Alternatively, you can make the changes to the copy (as in your original code), and then copy the modified value back to the slice:
for i, b := range p.Blocks {
// modify b ...
p.Blocks[i] = b
}
https://go.dev/play/p/kVHTk-OTyC3
Going further, you could store pointers to Block in the slice (instead of the Block values themselves), in which case your loop would be making a copy of the pointer, which is a valid way to access the Block that the original pointer points to:
https://go.dev/play/p/I9-EyV_iCNS
When you are looping over a slice, each of the individual values retrieved from the slice is a copy of the corresponding element in the slice. So to modify the element in the slice, instead of the copy, you can access the element directly using the indexing expression. Or you can use pointers. Note that pointers are also copied but the copied pointer will point to the same address as the element in the slice and therefore can be used to directly modify the same data.
You can use indexing:
for i := range p.Blocks {
if p.Blocks[i].ID == "B1" {
fmt.Println("Adding a new content")
p.Blocks[i].Contents = append(p.Blocks[i].Contents, "c3")
}
}
https://play.golang.org/p/di175k18YQ9
Or you can use pointers:
type Block struct {
ID string
Contents []string
}
type Package struct {
Name string
Blocks []*Block
}
for _, b := range p.Blocks {
if b.ID == "B1" {
fmt.Println("Adding a new content")
b.Contents = append(b.Contents, "c3")
}
}
https://play.golang.org/p/1RjWlCZkhYv

How to slice a slice for eliminating matching values from identical slice

I'm looking for an easy way to iterate through a slice and on every value that's present in the current slice, remove the element from another slice.
I have a struct:
a := enter{
uid: 1234,
status: []StatusEntry{
{
rank: 1,
iterate: ierationState_Ongoing,
},
{
rank: 2,
iterate: ierationState_Completed,
},
},
}
In my .go file, I have a constant
Steps = [5]int64{0,1,2,3,4}
According to my requirement I want to copy the Steps in another variable and perform remove operation :
Steps2 := Steps // Make a copy of Steps
for _, element := range enter.status {
// Remove that element from Steps
}
But I find it difficult to do so since Golang doesn't give me direct method to iterate and remove every element from enter.status from Steps.
I tried multiple things like creating a removeIndex function as posted on various stackoverflow answers like this:
for i, element := range enter.status {
Steps2 = removeIndex(enter.status, i)
}
func removeIndex(s []int, index int) []int {
ret := make([]int, 0)
ret = append(ret, s[:index]...)
return append(ret, s[index+1:]...)
}
But it doesn't make sense to use this because I'm trying to remove a matching value (element) and not a specific index (for eg index 5) from Steps2.
Basically, for every element that's in slice enter.status, I want to remove that element/value from slice Steps2
Careful:
[5]int64{0,1,2,3,4}
This is an array (of 5 int64 values), not a slice. And:
Steps2 := Steps
If Steps were a slice, this would copy the slice header without copying the underlying array.
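The difference between the two kinds of copy can be seen in a small experiment. The function names below are made up for illustration; each one modifies the "copy" and returns the first element of the original:

```go
package main

import "fmt"

// copyArray assigns an array: all five elements are copied, so the
// original is unaffected by the write.
func copyArray() int64 {
	steps := [5]int64{0, 1, 2, 3, 4}
	steps2 := steps
	steps2[0] = 99
	return steps[0]
}

// copySlice assigns a slice: only the header is copied, so both
// variables share one backing array.
func copySlice() int64 {
	steps := []int64{0, 1, 2, 3, 4}
	steps2 := steps
	steps2[0] = 99
	return steps[0]
}

// cloneSlice copies the elements explicitly with the built-in copy.
func cloneSlice() int64 {
	steps := []int64{0, 1, 2, 3, 4}
	steps2 := make([]int64, len(steps))
	copy(steps2, steps)
	steps2[0] = 99
	return steps[0]
}

func main() {
	fmt.Println(copyArray(), copySlice(), cloneSlice()) // 0 99 0
}
```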
In any case, given some slice s of type T and length len(s), if you are allowed to modify s in place and order is relevant, you generally want to use this algorithm:
func trim(s []T) []T {
out := 0
for i := range s {
if keep(s[i]) {
s[out] = s[i]
out++
}
}
return s[:out]
}
where keep is your boolean function to decide whether to keep an element. To make this produce a new slice, allocate an output slice of the appropriate length (len(s)) at the start and optionally shrink it later, or, if you expect to throw out most elements, make it empty at the start and use append.
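For the question as asked, this could be put together roughly as follows. It is a sketch under a few assumptions: it uses type parameters (Go 1.18 or later), StatusEntry is reduced to a rank field of type int64, and the names filterInPlace and removeRanks are invented here:

```go
package main

import "fmt"

// filterInPlace keeps the elements for which keep returns true, preserving
// their order and reusing the backing array of s.
func filterInPlace[T any](s []T, keep func(T) bool) []T {
	out := 0
	for i := range s {
		if keep(s[i]) {
			s[out] = s[i]
			out++
		}
	}
	return s[:out]
}

// StatusEntry is a simplified stand-in for the type in the question.
type StatusEntry struct {
	rank int64
}

// removeRanks returns a copy of steps without the values that occur as a
// rank in status. The caller's steps are left untouched.
func removeRanks(steps []int64, status []StatusEntry) []int64 {
	drop := make(map[int64]struct{}, len(status))
	for _, e := range status {
		drop[e.rank] = struct{}{}
	}
	steps2 := append([]int64(nil), steps...) // real copy of the elements
	return filterInPlace(steps2, func(v int64) bool {
		_, found := drop[v]
		return !found
	})
}

func main() {
	steps := [5]int64{0, 1, 2, 3, 4}
	status := []StatusEntry{{rank: 1}, {rank: 2}}
	fmt.Println(removeRanks(steps[:], status)) // [0 3 4]
	fmt.Println(steps)                         // [0 1 2 3 4]
}
```

Building the drop set first keeps the whole operation linear in len(steps) + len(status), instead of scanning status once per step.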
When the keep function is "the value of some field in the output slice does not match the value of any earlier kept field" and the type of that field is usable as a key type, you can use a simple map[T2]struct{} to determine whether the value has occurred yet:
seen := make(map[T2]struct{}, len(s))
and then the keep test and copy sequence becomes:
_, ok := seen[s[i].field]
if !ok {
seen[s[i].field] = struct{}{}
s[out] = s[i]
out++
}
The initial size of seen here is optimized on the theory that most values will be kept; if most values will be discarded, make the map initially empty, or small.
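Put together, the "first occurrence wins" variant might look like this. The entry type and its id field are hypothetical stand-ins for your element type and the field you compare on:

```go
package main

import "fmt"

// entry is a hypothetical element type; id plays the role of "field".
type entry struct {
	id   int
	name string
}

// dedupeByID keeps the first entry for each id, in place, preserving order.
func dedupeByID(s []entry) []entry {
	seen := make(map[int]struct{}, len(s))
	out := 0
	for i := range s {
		if _, ok := seen[s[i].id]; !ok {
			seen[s[i].id] = struct{}{}
			s[out] = s[i]
			out++
		}
	}
	return s[:out]
}

func main() {
	s := []entry{{1, "a"}, {2, "b"}, {1, "c"}, {3, "d"}, {2, "e"}}
	fmt.Println(dedupeByID(s)) // [{1 a} {2 b} {3 d}]
}
```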

Why is undefined: error thrown while passing custom struct as pointers?

package main
import (
"encoding/json"
"fmt"
)
func main() {
type CustomInfo struct {
Name string
Size int
}
type Error struct {
ErrorCode int
ErrorMsg string
}
type Product struct {
Fruit string
CInfo CustomInfo
Err Error
}
var pr1 = Product{
Fruit: "Orange",
CInfo: CustomInfo{
Name: "orango botanica",
Size: 3,
},
Err: Error{
ErrorMsg: "",
},
}
var pr2 = Product{
Fruit: "Apple",
CInfo: CustomInfo{
Name: "appleo botanica",
Size: 4,
},
Err: Error{
ErrorMsg: "",
},
}
var products []Product
products = append(products, pr1, pr2)
mrshl, _ := json.Marshal(products)
var productsRes []Product
err := json.Unmarshal([]byte(mrshl), &productsRes)
if err != nil {
fmt.Println(err)
}
//fmt.Println(productsRes[0].Fruit)
//fmt.Println(productsRes[1])
//fmt.Println(unmrshl)
validate(&productsRes)
}
func validate(bRes *Product){
fmt.Println(bRes[0].Fruit)
fmt.Println(bRes[1])
}
Why do I get ./prog.go:61:22: undefined: Product ?
I modified your updated playground example a bit here.
You don't want a pointer to the slice, you just want to pass the slice itself. It's not inherently wrong to pass a pointer, it's just unnecessary here. A slice means: "I (main) give you (validate) access to an array I have made." The slice header provides the user-of-the-slice:
access to the array (via indexing: bRes[i] is the i-th element of the array);
the length of the array: len(bRes)—the for loops use this implicitly; and
the capacity of the array (not used in this example).
By writing to bRes[i] we can update any or all of the fields of one of the Products in the underlying array. This is what the second loop I added to validate does.
Note: lines 47-48, which read:
var products []Product
products = append(products, pr1, pr2)
use append a little oddly: since we just have the two products, we could build the slice directly with:
products := []Product{pr1, pr2}
The value of products will be nil initially. The nil slice header says, in effect, that the length and capacity are both zero and that there is no underlying array at all. Appending to a nil slice always causes append to allocate a new underlying array. The append function returns the new slice, which uses the new array [1]. So there's a tiny bit of wasted effort in setting up this nil slice, only to throw it out. Again, it's not wrong, it's just unnecessary.
(Meanwhile, you get +1 point for checking for an error from json.Unmarshal, but -1 point, or maybe minus half a point, for not checking for an error from json.Marshal. 😀)
[1] append always constructs a new slice header. The new header may reuse the old array in some cases, or it may use a new array. The append operation will reuse the old, already-existing array if and only if the appended elements fit into it, based on the capacity indicated by the original slice header. Since a nil header has a capacity of zero, there is no existing array to reuse here.
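The reuse rule for append can be checked with a small experiment. The helper appendReuses is made up for this sketch; it compares the address of the first element before and after the append:

```go
package main

import "fmt"

// appendReuses reports whether appending one element to a slice of length 1
// and the given capacity keeps the original backing array.
func appendReuses(capacity int) bool {
	s := make([]int, 1, capacity)
	t := append(s, 2)
	return &s[0] == &t[0]
}

func main() {
	var s []int                           // nil slice: no array behind it
	fmt.Println(s == nil, len(s), cap(s)) // true 0 0

	fmt.Println(appendReuses(4)) // true: the new element fits in the spare capacity
	fmt.Println(appendReuses(1)) // false: the array is full, so a new one is allocated
}
```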
Your struct definitions are inside main and thus out of scope for validate; they can only be used inside main. It should work when you move the struct definitions out of main.
Also, your validate function should probably accept a []Product (slice of Product), not a *Product (pointer to a single Product).
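With both changes applied, a trimmed-down version of the program could look like the sketch below. The Error type is left out for brevity, and roundTrip is a helper name invented here to wrap the Marshal/Unmarshal pair with error checks:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Types declared at package level are visible to every function in the package.
type CustomInfo struct {
	Name string
	Size int
}

type Product struct {
	Fruit string
	CInfo CustomInfo
}

// roundTrip marshals products to JSON and unmarshals them again.
func roundTrip(products []Product) ([]Product, error) {
	data, err := json.Marshal(products)
	if err != nil {
		return nil, err
	}
	var res []Product
	if err := json.Unmarshal(data, &res); err != nil {
		return nil, err
	}
	return res, nil
}

// validate takes the slice itself; the slice header is copied, but the
// elements are shared with the caller.
func validate(products []Product) {
	for i := range products {
		fmt.Println(products[i].Fruit, products[i].CInfo.Size)
	}
}

func main() {
	products := []Product{
		{Fruit: "Orange", CInfo: CustomInfo{Name: "orango botanica", Size: 3}},
		{Fruit: "Apple", CInfo: CustomInfo{Name: "appleo botanica", Size: 4}},
	}
	res, err := roundTrip(products)
	if err != nil {
		fmt.Println(err)
		return
	}
	validate(res)
	// Orange 3
	// Apple 4
}
```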

Using Pointers in a for loop

I'm struggling to understand why I have a bug in my code in one state but not the other. It's been a while since I've covered pointers, so I'm probably rusty!
Basically I have a repository structure I'm using to store an object in memory, that has a Store function.
type chartsRepository struct {
mtx sync.RWMutex
charts map[ChartName]*Chart
}
func (r *chartsRepository) Store(c *Chart) error {
r.mtx.Lock()
defer r.mtx.Unlock()
r.charts[c.Name] = c
return nil
}
So all it does is take the write lock of the RWMutex and add the pointer to a map, keyed by an identifier.
Then I've got a function that will basically loop through a slice of these objects, storing them all in the repository.
type service struct {
charts Repository
}
func (svc *service) StoreCharts(arr []Chart) error {
hasError := false
for _, chart := range arr {
err := svc.charts.Store(&chart)
// ... error handling
}
if hasError {
// ... Deals with the error object
return me
}
return nil
}
The above doesn't work. Everything looks fine at first, but when I access the data later, the entries in the map all point to the same Chart object, despite having different keys.
If I do the following and move the pointer reference to another function, everything works as expected:
func (svc *service) StoreCharts(arr []Chart) error {
// ...
for _, chart := range arr {
err := svc.storeChart(chart)
}
// ...
}
func (svc *service) storeChart(c Chart) error {
return svc.charts.Store(&c)
}
I'm assuming the issue is that because the loop overwrites the reference to the chart in the for loop, the pointer reference also changes. When the pointer is generated in an independent function, that reference is never overwritten. Is that right?
I feel like I'm being stupid, but shouldn't the pointer be generated by &chart and that's independent of the chart reference? I also tried creating a new variable for the pointer p := &chart in the for loop and that didn't work either.
Should I just avoid generating pointers in loops?
This is because (in Go versions before 1.22) there is only a single loop variable chart, and in each iteration just a new value is assigned to it. So if you take the address of the loop variable, it will be the same in each iteration, so you will store the same pointer, and the pointed object (the loop variable) is overwritten in each iteration (and after the loop it will hold the value assigned in the last iteration). Since Go 1.22, each iteration has its own copy of the loop variable, so in a module that declares go 1.22 or later the original code stores distinct pointers.
This is mentioned in the Spec (in the wording used before Go 1.22), under For statements: For statements with range clause:
The iteration variables may be declared by the "range" clause using a form of short variable declaration (:=). In this case their types are set to the types of the respective iteration values and their scope is the block of the "for" statement; they are re-used in each iteration. If the iteration variables are declared outside the "for" statement, after execution their values will be those of the last iteration.
Your second version works, because you pass the loop variable to a function, so a copy will be made of it, and then you store the address of the copy (which is detached from the loop variable).
You can achieve the same effect without a function though: just create a local copy and use the address of that:
for _, chart := range arr {
chart2 := chart
err := svc.charts.Store(&chart2) // Address of the local var
// ... error handling
}
Also note that you may also store the address of the slice elements:
for i := range arr {
err := svc.charts.Store(&arr[i]) // Address of the slice element
// ... error handling
}
The disadvantage of this is that since you store pointers to the slice elements, the whole backing array of the slice has to be kept in memory for as long as you keep any of the pointers (the array cannot be garbage collected). Moreover, the pointers you store share the same Chart values with the slice, so if someone modified a chart value in the passed slice, that would affect the charts whose pointers you stored.
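The trade-off between the two fixes can be seen side by side in the following sketch. The Chart fields and the two helper names are assumptions made for illustration, and a plain map stands in for the repository:

```go
package main

import "fmt"

// Chart is a simplified stand-in; the Value field is hypothetical.
type Chart struct {
	Name  string
	Value int
}

// indexByCopy stores a pointer to a fresh copy of each element.
func indexByCopy(arr []Chart) map[string]*Chart {
	m := make(map[string]*Chart, len(arr))
	for _, chart := range arr {
		c := chart // explicit per-iteration copy; behaves the same on every Go version
		m[c.Name] = &c
	}
	return m
}

// indexByElement stores pointers into the backing array of arr.
func indexByElement(arr []Chart) map[string]*Chart {
	m := make(map[string]*Chart, len(arr))
	for i := range arr {
		m[arr[i].Name] = &arr[i]
	}
	return m
}

func main() {
	arr := []Chart{{"a", 1}, {"b", 2}}
	copies := indexByCopy(arr)
	elems := indexByElement(arr)

	arr[0].Value = 100 // the caller changes its slice afterwards

	fmt.Println(copies["a"].Value, elems["a"].Value) // 1 100
	fmt.Println(copies["b"].Value, elems["b"].Value) // 2 2
}
```

The copies are independent of the caller's slice; the element pointers are not.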
See related questions:
Golang: Register multiple routes using range for loop slices/map
Why do these two for loop variations give me different behavior?
I faced a similar issue today and creating this simple example helped me understand the problem.
// Input array of string values
inputList := []string {"1", "2", "3"}
// instantiate empty list
outputList := make([]*string, 0)
for _, value := range inputList {
// print memory address on each iteration
fmt.Printf("address of %v: %v\n", value, &value)
outputList = append(outputList, &value)
}
// show memory address of all variables
fmt.Printf("%v", outputList)
This printed out:
address of 1: 0xc00008e1e0
address of 2: 0xc00008e1e0
address of 3: 0xc00008e1e0
[0xc00008e1e0 0xc00008e1e0 0xc00008e1e0]
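As you can see, the address of value was the same in every iteration even though the actual value was different ("1", "2", and "3"). This is because value was reassigned rather than redeclared (the behavior of Go before 1.22; from Go 1.22 on, each iteration has its own value variable and the addresses differ).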
In the end, every value in the outputList was pointing to the same address which is now storing the value "3".

Writing generic data access functions in Go

I'm writing code that allows data access from a database. However, I find myself repeating the same code for similar types and fields. How can I write generic functions for the same?
e.g. what I want to achieve ...
type Person struct{ FirstName string }
type Company struct{ Industry string }
func getItems(typ string, field string, val string) []interface{} {
...
}
var persons []Person
persons = getItems("Person", "FirstName", "John")
var companies []Company
companies = getItems("Company", "Industry", "Software")
So you're definitely on the right track with the idea of returning a slice of empty interface (interface{}) values. However, you're going to run into problems when you try accessing specific members or calling specific methods, because you're not going to know what type you're looking at. This is where type assertions are going to come in very handy. To extend your code a bit:
func getPerson(typ string, field string, val string) []Person {
slice := getItems(typ, field, val)
output := make([]Person, 0)
for _, item := range slice {
// Type assertion!
thing, ok := item.(Person)
if ok {
output = append(output, thing)
}
}
return output
}
So what that does is it performs a generic search, and then weeds out only those items which are of the correct type. Specifically, the type assertion:
thing, ok := item.(Person)
checks to see if the variable item holds a value of type Person, and if it does, it returns that value and true; otherwise it returns the zero value of Person and false (thus checking ok tells us if the assertion succeeded).
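A quick way to see both outcomes of the two-value type assertion is the sketch below; asPerson is a throwaway wrapper invented for the example:

```go
package main

import "fmt"

type Person struct{ FirstName string }
type Company struct{ Industry string }

// asPerson returns the result of asserting item to Person.
func asPerson(item interface{}) (Person, bool) {
	p, ok := item.(Person)
	return p, ok
}

func main() {
	items := []interface{}{Person{"John"}, Company{"Software"}}
	for _, item := range items {
		p, ok := asPerson(item)
		fmt.Printf("%+v %v\n", p, ok)
	}
	// {FirstName:John} true
	// {FirstName:} false
}
```

The single-value form, item.(Person), panics instead when the dynamic type does not match, which is why the comma-ok form is used here.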
You can actually, if you want, take this a step further, and define the getItems() function in terms of another boolean function. Basically the idea would be to have getItems() run the function passed to it on each element in the database and only add that element to the results if the function returns true for it:
func getItems(criteria func(interface{}) bool) []interface{} {
output := make([]interface{}, 0)
for _, item := range database {
if criteria(item) {
output = append(output, item)
}
}
return output
}
(honestly, if it were me, I'd do a hybrid of the two which accepts a criteria function but also accepts the field and value strings)
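A runnable sketch of the criteria-function idea follows. The in-memory database slice and the personsNamed helper are assumptions that stand in for the real data store and for whatever typed wrapper you would write on top:

```go
package main

import "fmt"

type Person struct{ FirstName string }
type Company struct{ Industry string }

// database is a hypothetical in-memory stand-in for the real data store.
var database = []interface{}{
	Person{FirstName: "John"},
	Company{Industry: "Software"},
	Person{FirstName: "Jane"},
}

// getItems returns every item for which criteria returns true.
func getItems(criteria func(interface{}) bool) []interface{} {
	var output []interface{}
	for _, item := range database {
		if criteria(item) {
			output = append(output, item)
		}
	}
	return output
}

// personsNamed is a typed wrapper: the criteria function checks both the
// type and the field, so the assertion in the second loop cannot fail.
func personsNamed(name string) []Person {
	matches := getItems(func(v interface{}) bool {
		p, ok := v.(Person)
		return ok && p.FirstName == name
	})
	persons := make([]Person, 0, len(matches))
	for _, item := range matches {
		persons = append(persons, item.(Person))
	}
	return persons
}

func main() {
	fmt.Println(personsNamed("John")) // [{John}]
}
```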
joshlf13 has a great answer. I'd expand on it a little, though, to maintain some additional type safety. Instead of a criteria function I would use a collector function.
// typed output slice, no interfaces
var output []string
// collector that populates our output slice as needed
func collect(i interface{}) {
// The only non-type-safe part of the program is limited to this function
if val, ok := i.(string); ok {
output = append(output, val)
}
}
// getItems uses the collector
func getItems(collect func(interface{})) {
for _, item := range database {
collect(item)
}
}
getItems(collect) // perform our get and populate the output slice from above
This has the benefit of not requiring you to loop through your interface{} slice after a call to getItems and do yet another type assertion.
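Neither answer uses them, but on Go 1.18 or later type parameters offer another way to get a typed result without a second pass. The following is only a sketch under that assumption; itemsOf and the db slice are names invented here:

```go
package main

import "fmt"

type Person struct{ FirstName string }
type Company struct{ Industry string }

// itemsOf returns the entries of db that have type T and for which keep
// returns true. The type assertion is the only place that touches the
// untyped values.
func itemsOf[T any](db []any, keep func(T) bool) []T {
	var out []T
	for _, item := range db {
		if v, ok := item.(T); ok && keep(v) {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	// db is a hypothetical in-memory stand-in for the real data store.
	db := []any{Person{"John"}, Company{"Software"}, Person{"Jane"}}

	persons := itemsOf(db, func(p Person) bool { return p.FirstName == "John" })
	companies := itemsOf(db, func(c Company) bool { return c.Industry == "Software" })

	fmt.Println(persons, companies) // [{John}] [{Software}]
}
```

T is inferred from the parameter type of the function literal, so the call sites stay close to what the question was aiming for.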
