Does storing Vec as a value of StorageMap work fine? - substrate

We have a storage var declared like this:
#[pallet::storage]
pub(super) type SomeMap<T: Config> = StorageMap<_, Blake2_128Concat, DataId, Vec<MyData<T>>, ValueQuery>;
and we have concerns that adding values to (or removing values from) the Vec will copy the whole vector, and thus roughly double the space occupied by the storage item from block to block. Is this really true?
If so, then (again, by assumption) switching to a StorageDoubleMap like this
#[pallet::storage]
pub(super) type SomeMap<T: Config> = StorageDoubleMap<_, Blake2_128Concat, DataId, Blake2_128Concat, AnotherId, MyData<T>, ValueQuery>;
should fix the problem?
In general, what are the pros and cons of the first and the second implementation in terms of efficiency?

Why should I use the & sign on structs?

In the Go Tour, there is a section on struct literals.
package main

import "fmt"

type Vertex struct {
    X, Y int
}

var (
    v1 = Vertex{1, 2}  // has type Vertex
    v2 = Vertex{X: 1}  // Y:0 is implicit
    v3 = Vertex{}      // X:0 and Y:0
    p  = &Vertex{1, 2} // has type *Vertex
)

func main() {
    fmt.Println(v1, p, v2, v3)
}
What's the difference between a struct with an ampersand and one without? I know that the ones with ampersands are pointers to a struct, but why should I use them over the regular ones?
var p = &Vertex{} // why should I use this
var c = Vertex{} // over this
It's true that p (&Vertex{}) has type *Vertex and that c (Vertex{}) has type Vertex.
However, I don't believe that statement really answers the question of why one would choose one over the other. It'd be kind of like answering the question "why use planes over cars" with something like "planes are winged, cars aren't." (Obviously we all know what wings are, but you get the point).
But it also wouldn't be very helpful to simply say "go learn about pointers" (though I think it is a really good idea to do so nonetheless).
How you choose basically boils down to the following.
Realize that &Vertex{} and Vertex{} are fundamentally initialized in the same way.
There might be some low-level memory allocation differences (e.g. stack vs. heap), but you really should just let the compiler worry about those details.
What makes one more useful and performant than the other is determined by how they are used in the program.
"Do I want a pointer to the struct (p), or just the struct (c)?"
Note that you can get a pointer using the & operator (e.g. &c), and you can dereference a pointer to get the value using * (e.g. *p); both are shown in the short sketch after this list.
So depending on what you choose, you may end up writing a lot of *p or &c.
Bottom-line: create what you will use; if you don't need a pointer, don't make one (this will help more with "optimizations" in the long run).
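To make those operators concrete, here is a minimal sketch. It is written in Rust rather than Go, purely for comparison: the & and * operators behave analogously for this purpose, though Rust references are borrow-checked.

// A value, a pointer to it, and a dereference back to a value.
#[derive(Clone, Copy, Debug)]
struct Vertex { x: i32, y: i32 }

fn main() {
    let c = Vertex { x: 1, y: 2 }; // the struct value itself
    let p = &c;                    // & takes its address: p points at c
    let d = *p;                    // * dereferences: d is a copy of the value
    println!("{:?} {:?} {:?}", c, p, d);
}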
Should I use Vertex{} or &Vertex{}?
For the Vertex given in your example, I'd choose Vertex{} to get a simple value object.
Some reasons:
Vertex in the example is pretty small in terms of size. Copying is cheap.
Garbage collection is simplified, and garbage creation may be mitigated by the compiler (it can often keep such values off the heap entirely).
Pointers can get tricky and add unnecessary cognitive load for the programmer (i.e. the code gets harder to maintain).
Sharing data through pointers gets risky once concurrency is involved (shared mutable state is unsafe without synchronization).
Vertex doesn't really contain anything worth mutating (just return/create a new Vertex when needed).
Some reasons why you'd want &Struct{}:
If Struct has a field caching some state that needs to be changed inside the original object itself.
Struct is huge, and you've done enough profiling to determine that the cost of copying is significant.
As an aside: you should try to keep your structs small, it's just good practice.
You find yourself making copies by accident (this is a bit of a stretch):

v := Struct{}
v2 := v // a copy of the whole struct happens

v := &Struct{}
v2 := v // only a pointer is copied
The comments pretty much spell it out:
v1 = Vertex{1, 2} // has type Vertex
p = &Vertex{1, 2} // has type *Vertex
As in many other languages, & takes the address of its operand. This is useful when you want a pointer rather than a value.
If you need to understand more about pointers in programming, you could start with an introduction to pointers in Go, or even the Wikipedia page on the topic.

What is the difference between storing a Vec vs a Slice?

Rust provides a few ways to store a collection of elements inside a user-defined struct. The struct can be given a lifetime parameter and hold a reference to a slice:
struct Foo<'a> {
    elements: &'a [i32],
}

impl<'a> Foo<'a> {
    fn new(elements: &'a [i32]) -> Foo<'a> {
        Foo { elements }
    }
}
Or it can be given a Vec object:
struct Bar {
    elements: Vec<i32>,
}

impl Bar {
    fn new(elements: Vec<i32>) -> Bar {
        Bar { elements }
    }
}
What are the major differences between these two approaches?
Will using a Vec force the language to copy memory whenever I call Bar::new(vec![1, 2, 3, 4, 5])?
Will the contents of Vec be implicitly destroyed when the owner Bar goes out of scope?
Are there any dangers associated with passing a slice in by reference if it's used outside of the struct that it's being passed to?
A Vec is composed of three parts:
A pointer to a chunk of memory
A count of how much memory is allocated (the capacity)
A count of how many items are stored (the size)
A slice is composed of two parts:
A pointer to a chunk of memory
A count of how many items are stored (the size)
Whenever you move either of these, those fields are all that will be copied. As you might guess, that's pretty lightweight. The actual chunk of memory on the heap will not be copied or moved.
A Vec indicates ownership of the memory, and a slice indicates a borrow of memory. A Vec needs to deallocate all the items and the chunk of memory when it is itself deallocated (dropped in Rust-speak). This happens when it goes out of scope. The slice does nothing when it is dropped.
There are no dangers in using slices; that is exactly what Rust lifetimes handle. They make sure that you never use a reference after it would be invalidated.
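A short sketch of those rules (illustrative only; the commented-out line shows what the compiler rejects):

fn main() {
    let v: Vec<i32> = vec![1, 2, 3];

    let s: &[i32] = &v;     // borrowing copies only (pointer, length)
    println!("{:?}", s);

    let w = v;              // moving copies only (pointer, capacity, length);
                            // the heap buffer itself stays where it is
    // println!("{:?}", v); // error: v was moved; the compiler prevents
                            // any use of the old owner
    println!("{:?}", w);
}   // w is dropped here: the Vec frees its elements and its buffer.
    // Dropping a slice reference like s does nothing.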
A Vec is a collection that can grow or shrink in size. Its buffer is stored on the heap and is allocated and reallocated dynamically at runtime. A Vec can store any number of elements, and it is typically used when the number of elements is not known at compile time or may change during the execution of the program.
A slice is a reference to a contiguous sequence of elements in a Vec, an array, or another collection. It is written with the [T] syntax, where T is the type of the elements, and you usually handle it through a reference, &[T]. A slice does not store any elements itself; it only references elements stored elsewhere. A slice is typically used when you need to refer to a subset of the elements in a collection.
One of the main differences is that a Vec can have elements added and removed, while a slice has a fixed length: a shared slice (&[T]) is read-only, and even a mutable slice (&mut [T]) lets you modify existing elements but not add or remove any. Another difference is that a Vec owns its heap allocation, while a slice merely borrows a view into data owned by something else.
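A minimal sketch of that difference; note that a mutable slice can change elements in place, it just cannot change the length:

fn main() {
    let mut v: Vec<i32> = vec![1, 2, 3];
    v.push(4);                  // a Vec can grow...
    v.remove(0);                // ...and shrink; v is now [2, 3, 4]

    let view: &[i32] = &v[..2]; // a slice borrows a view into the Vec
    println!("{:?}", view);     // [2, 3]

    let m: &mut [i32] = &mut v[..];
    m[0] = 99;                  // a mutable slice can modify elements,
                                // but it has no push or remove
    println!("{:?}", v);        // [99, 3, 4]
}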

OOP much slower than structural programming: why, and how can it be fixed?

As I mentioned in the subject of this post, I found out the hard way that OOP is slower than structural programming (spaghetti code).
I wrote a simulated annealing program with OOP, then removed one class and rewrote its code structurally in the main form. Suddenly it got much faster. I was calling into my removed class in every iteration of the OOP program.
I also checked it with tabu search. Same result.
Can anyone tell me why this is happening, and how I can fix it in other OOP programs?
Are there any tricks? For example, caching my classes or something like that?
(The programs were written in C#.)
If you have a high-frequency loop, and inside that loop you create new objects and don't call other functions very much, then, yes, you will see that if you can avoid those new calls, say by re-using one copy of the object, you can save a large fraction of the total time.
Between new, constructors, destructors, and garbage collection, a very little code can waste a whole lot of time.
Use them sparingly.
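Here is a minimal sketch of the reuse idea, written in Rust to match the examples below (the same shape carries over to C#, where each new also feeds the garbage collector):

// Allocating a fresh buffer on every iteration of a hot loop:
fn slow(n: usize) -> usize {
    let mut total = 0;
    for i in 0..n {
        let buf: Vec<usize> = (0..64).map(|j| i + j).collect(); // new allocation each time
        total += buf.iter().sum::<usize>();
    }
    total
}

// Re-using one buffer across iterations:
fn fast(n: usize) -> usize {
    let mut total = 0;
    let mut buf: Vec<usize> = Vec::with_capacity(64); // allocated once
    for i in 0..n {
        buf.clear();                        // empties the buffer, keeps its capacity
        buf.extend((0..64).map(|j| i + j));
        total += buf.iter().sum::<usize>();
    }
    total
}

fn main() {
    assert_eq!(slow(1000), fast(1000)); // same answer, far fewer allocations
}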
Memory access is often overlooked. The way o.o. tends to lay out data in memory is not conducive to efficient memory access in practice in loops. Consider the following pseudocode:
adult_clients = 0
for client in list_of_all_clients:
    if client.age >= AGE_OF_MAJORITY:
        adult_clients++
It so happens that this access pattern is quite inefficient on modern architectures, because they like reading large contiguous runs of memory; but here we only care about client.age for each of the clients we have, and those fields will not be laid out in contiguous memory.
Focusing on objects that have fields results in data being laid out in memory in such a way that fields holding the same kind of information are not stored in consecutive memory. Performance-heavy code tends to involve loops that repeatedly look at data with the same conceptual meaning, and it is conducive to performance that such data be laid out in contiguous memory.
Consider these two examples in Rust:
// Struct that contains an id, and an optional value recording whether
// the id is divisible by three.
struct Foo {
    id: u32,
    divbythree: Option<bool>,
}

fn main() {
    // Create a pretty big vector of these structs, with increasing ids
    // and divbythree initialized to None.
    let mut vec_of_foos: Vec<Foo> =
        (0..100000000).map(|i| Foo { id: i, divbythree: None }).collect();

    // Loop over all these structs, determine whether each id is divisible
    // by three, and set divbythree accordingly.
    let mut divbythrees = 0;
    for foo in vec_of_foos.iter_mut() {
        if foo.id % 3 == 0 {
            foo.divbythree = Some(true);
            divbythrees += 1;
        } else {
            foo.divbythree = Some(false);
        }
    }

    // Print the number of times it was divisible by three.
    println!("{}", divbythrees);
}
On my system, the real time with rustc -O is 0m0.436s; now let us consider this example:
fn main() {
    // This time we create two vectors rather than a vector of structs...
    let vec_of_ids: Vec<u32> = (0..100000000).collect();
    let mut vec_of_divbythrees: Vec<Option<bool>> = vec![None; vec_of_ids.len()];

    // ...but we basically do the same thing.
    let mut divbythrees = 0;
    for i in 0..vec_of_ids.len() {
        if vec_of_ids[i] % 3 == 0 {
            vec_of_divbythrees[i] = Some(true);
            divbythrees += 1;
        } else {
            vec_of_divbythrees[i] = Some(false);
        }
    }
    println!("{}", divbythrees);
}
This runs in 0m0.254s at the same optimization level: close to half the time.
Despite having to allocate two vectors instead of one, storing similar values in contiguous memory almost halved the execution time. Though obviously the o.o. approach provides for much nicer and more maintainable code.
P.s.: it occurs to me that I should probably explain why this matters so much, given that the code in both cases still indexes memory one field at a time rather than, say, pulling a large swath onto the stack. The reason is c.p.u. caches: when the program asks for the memory at a certain address, it actually obtains, and caches, a significant chunk of memory around that address; if memory next to it is requested again soon, it can be served from the cache rather than from physical working memory. As a consequence, compilers will also vectorize the second version more efficiently.

Does an equivalent function in OCaml exist that works the same way as "set!" in Scheme?

I'm trying to make a function that defines a vector that varies based on the function's input, and set! works great for this in Scheme. Is there a functional equivalent for this in OCaml?
I agree with sepp2k that you should expand your question, and give more detailed examples.
Maybe what you need are references.
As a rough approximation, you can see them as variables to which you can assign:
let a = ref 5;;
!a;; (* This evaluates to 5 *)
a := 42;;
!a;; (* This evaluates to 42 *)
Here is a more detailed explanation from http://caml.inria.fr/pub/docs/u3-ocaml/ocaml-core.html:
The language we have described so far is purely functional. That is, several evaluations of the same expression will always produce the same answer. This prevents, for instance, the implementation of a counter whose interface is a single function next : unit -> int that increments the counter and returns its new value. Repeated invocation of this function should return a sequence of consecutive integers — a different answer each time.
Indeed, the counter needs to memorize its state in some particular location, with read/write accesses, but before all, some information must be shared between two calls to next. The solution is to use mutable storage and interact with the store by so-called side effects.
In OCaml, the counter could be defined as follows:
let new_count =
  let r = ref 0 in
  let next () = r := !r + 1; !r in
  next;;
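For comparison only, here is a rough sketch of the same counter in Rust rather than OCaml: the mutable state lives in a closure's captured variable instead of a ref cell.

// new_count returns a closure that owns a mutable counter;
// each call increments the counter and returns its new value.
fn new_count() -> impl FnMut() -> i32 {
    let mut r = 0;
    move || {
        r += 1;
        r
    }
}

fn main() {
    let mut next = new_count();
    println!("{}", next()); // 1
    println!("{}", next()); // 2
}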
Another, maybe more concrete, example of mutable storage is a bank account. In OCaml, record fields can be declared mutable, so that new values can be assigned to them later. Hence, a bank account could be a two-field record, its number, and its balance, where the balance is mutable.
type account = { number : int; mutable balance : float }

let retrieve account requested =
  let s = min account.balance requested in
  account.balance <- account.balance -. s;
  s;;
In fact, in OCaml, references are not primitive: they are special cases of mutable records. For instance, one could define:
type 'a ref = { mutable content : 'a }
let ref x = { content = x }
let deref r = r.content
let assign r x = r.content <- x; x
set! in Scheme assigns to a variable. You cannot assign to a variable in OCaml, at all. (So "variables" are not really "variable".) So there is no equivalent.
But OCaml is not a pure functional language. It has mutable data structures. The following things can be assigned to:
Array elements
Bytes elements (string elements were also mutable in older versions of OCaml; strings are immutable by default since OCaml 4.06)
Mutable fields of records
Mutable fields of objects
In these situations, the <- syntax is used for assignment.
The ref type mentioned by @jrouquie is a simple, built-in mutable record type that acts as a mutable container for one value. OCaml also provides ! and := operators for working with refs.

Scala: Mutable vs. Immutable Object Performance - OutOfMemoryError

I wanted to compare the performance characteristics of immutable.Map and mutable.Map in Scala for a similar operation (namely, merging many maps into a single one; see this question). I have what appear to be similar implementations for both mutable and immutable maps (see below).
As a test, I generated a List containing 1,000,000 single-item Map[Int, Int] and passed this list into the functions I was testing. With sufficient memory, the results were unsurprising: ~1200ms for mutable.Map, ~1800ms for immutable.Map, and ~750ms for an imperative implementation using mutable.Map -- not sure what accounts for the huge difference there, but feel free to comment on that, too.
What did surprise me a bit, perhaps because I'm being a bit thick, is that with the default run configuration in IntelliJ 8.1, both mutable implementations hit an OutOfMemoryError, but the immutable collection did not. The immutable test did run to completion, but it did so very slowly -- it takes about 28 seconds. When I increased the max JVM memory (to about 200MB, not sure where the threshold is), I got the results above.
Anyway, here's what I really want to know:
Why do the mutable implementations run out of memory, but the immutable implementation does not? I suspect that the immutable version allows the garbage collector to run and free up memory before the mutable implementations do -- and all of those garbage collections explain the slowness of the immutable low-memory run -- but I'd like a more detailed explanation than that.
Implementations below. (Note: I don't claim that these are the best implementations possible. Feel free to suggest improvements.)
def mergeMaps[A,B](func: (B,B) => B)(listOfMaps: List[Map[A,B]]): Map[A,B] =
  (Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv)) { (acc, kv) =>
    acc + (if (acc.contains(kv._1)) kv._1 -> func(acc(kv._1), kv._2) else kv)
  }

def mergeMutableMaps[A,B](func: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A,B] =
  (mutable.Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv)) { (acc, kv) =>
    acc + (if (acc.contains(kv._1)) kv._1 -> func(acc(kv._1), kv._2) else kv)
  }

def mergeMutableImperative[A,B](func: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A,B] = {
  val toReturn = mutable.Map[A,B]()
  for (m <- listOfMaps; kv <- m) {
    if (toReturn contains kv._1) {
      toReturn(kv._1) = func(toReturn(kv._1), kv._2)
    } else {
      toReturn(kv._1) = kv._2
    }
  }
  toReturn
}
Well, it really depends on the actual type of Map you are using; probably HashMap. Now, mutable structures like that gain performance by pre-allocating memory they expect to use. You are joining one million maps, so the final map is bound to be somewhat big. Let's see how these key/values get added:
protected def addEntry(e: Entry) {
  val h = index(elemHashCode(e.key))
  e.next = table(h).asInstanceOf[Entry]
  table(h) = e
  tableSize = tableSize + 1
  if (tableSize > threshold)
    resize(2 * table.length)
}
See the 2 * in the resize line? The mutable HashMap grows by doubling each time it runs out of space, while the immutable one is pretty conservative in memory usage (though existing keys will usually occupy twice the space when updated).
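For comparison, the same doubling growth policy is easy to observe with Rust's Vec (a cross-language aside; the exact capacities are an implementation detail):

fn main() {
    let mut v: Vec<i32> = Vec::new();
    let mut last_capacity = v.capacity(); // 0 for an empty Vec
    for i in 0..1000 {
        v.push(i);
        if v.capacity() != last_capacity {
            last_capacity = v.capacity();
            // On typical implementations this prints 4, 8, 16, 32, ...:
            // the buffer doubles each time it runs out of space.
            println!("len = {:4} -> capacity grew to {}", v.len(), last_capacity);
        }
    }
}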
Now, as for other performance problems, you are creating a list of keys and values in the first two versions. That means that, before you join any maps, you already have each Tuple2 (the key/value pairs) in memory twice! Plus the overhead of List, which is small, but we are talking about more than one million elements times the overhead.
You may want to use a projection, which avoids that. Unfortunately, projection is based on Stream, which isn't very reliable for our purposes on Scala 2.7.x. Still, try this instead:
for (m <- listOfMaps.projection; kv <- m) yield kv
A Stream doesn't compute a value until it is needed. The garbage collector ought to collect the unused elements as well, as long as you don't keep a reference to the Stream's head, which seems to be the case in your algorithm.
EDIT
Complementing the above: a for/yield comprehension takes one or more collections and returns a new collection. Whenever it makes sense, the resulting collection is of the same type as the original collection. So, for example, in the following code, the for-comprehension creates a new list, which is then stored in l2. It is not the val l2 = that creates the new list, but the for-comprehension.
val l = List(1,2,3)
val l2 = for (e <- l) yield e*2
Now, let's look at the code being used in the first two algorithms (minus the mutable keyword):
(Map[A,B]() /: (for (m <- listOfMaps; kv <- m) yield kv))
The foldLeft operator, here written with its /: synonym, will be invoked on the object returned by the for-comprehension. Remember that a : at the end of an operator inverts the order of the object and the parameters.
Now, let's consider what object is this, on which foldLeft is being called. The first generator in this for-comprehension is m <- listOfMaps. We know that listOfMaps is a collection of type List[X], where X isn't really relevant here. The result of a for-comprehension on a List is always another List. The other generators aren't relevant.
So, you take this List, get all the key/values inside each Map which is a component of this List, and make a new List with all of that. That's why you are duplicating everything you have.
(in fact, it's even worse than that, because each generator creates a new collection; the collections created by the second generator are just the size of each element of listOfMaps though, and are immediately discarded after use)
The next question -- actually, the first one, but it was easier to invert the answer -- is how the use of projection helps.
When you call projection on a List, it returns a new object of type Stream (on Scala 2.7.x). At first you may think this will only make things worse, because you'll now have three copies of the List instead of a single one. But a Stream is not pre-computed; it is lazily computed.
What that means is that the resulting object, the Stream, isn't a copy of the List, but, rather, a function that can be used to compute the Stream when required. Once computed, the result will be kept so that it doesn't need to be computed again.
Also, map, flatMap and filter on a Stream all return a new Stream, which means you can chain them together without making a single copy of the List which created them. Since for-comprehensions with yield use these very functions, the use of a Stream inside the for-comprehension prevents unnecessary copies of the data.
Now, suppose you wrote something like this:
val kvs = for (m <- listOfMaps.projection; kv <- m) yield kv
(Map[A,B]() /: kvs) { ... }
In this case you aren't gaining anything. After assigning the Stream to kvs, the data hasn't been copied yet. Once the second line is executed, though, kvs will have computed each of its elements, and, therefore, will hold a complete copy of the data.
Now consider the original form:
(Map[A,B]() /: (for (m <- listOfMaps.projection; kv <- m) yield kv))
In this case, the Stream is used at the same time it is computed. Let's briefly look at how foldLeft for a Stream is defined:
override final def foldLeft[B](z: B)(f: (B, A) => B): B = {
  if (isEmpty) z
  else tail.foldLeft(f(z, head))(f)
}
If the Stream is empty, just return the accumulator. Otherwise, compute a new accumulator (f(z, head)) and then pass it and the function to the tail of the Stream.
Once f(z, head) has executed, though, there will be no remaining reference to the head. Or, in other words, nothing anywhere in the program will be pointing to the head of the Stream, and that means the garbage collector can collect it, thus freeing memory.
The end result is that each element produced by the for-comprehension will exist just briefly, while you use it to compute the accumulator. And this is how you save keeping a copy of your whole data.
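As a cross-language aside, the same shape, a lazy producer folded one element at a time, can be sketched with Rust's iterators. This is an illustrative analogy, not the Scala code; merge_maps and its func parameter here mirror the mergeMaps signature above:

use std::collections::HashMap;

// Merge many maps into one without materializing the full list of pairs:
// flat_map is lazy, so fold consumes each (key, value) pair as it is
// produced, and each pair can be freed as soon as the accumulator
// has been updated.
fn merge_maps<F>(list_of_maps: Vec<HashMap<i32, i32>>, func: F) -> HashMap<i32, i32>
where
    F: Fn(i32, i32) -> i32,
{
    list_of_maps
        .into_iter()
        .flat_map(|m| m.into_iter())
        .fold(HashMap::new(), |mut acc, (k, v)| {
            let merged = match acc.get(&k) {
                Some(&old) => func(old, v),
                None => v,
            };
            acc.insert(k, merged);
            acc
        })
}

fn main() {
    let maps = vec![
        HashMap::from([(1, 10)]),
        HashMap::from([(1, 5), (2, 7)]),
    ];
    println!("{:?}", merge_maps(maps, |a, b| a + b)); // {1: 15, 2: 7}
}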
Finally, there is the question of why the third algorithm does not benefit from it. Well, the third algorithm does not use yield, so no copy of any data whatsoever is being made. In this case, using projection only adds an indirection layer.
