Can someone explain how std::greater is used to implement priority_queue - c++11

std::priority_queue<int, vector<int>, std::greater<int> > pq;
I cannot understand the role of std::greater in the priority queue. I am replacing a min-heap with the priority queue.
This code is taken from the
GeeksforGeeks implementation of Prim's algorithm using the STL

The std::priority_queue type is what’s called a container adapter. It works by starting with a type you can use to represent a sequence, then uses that type to build the priority queue (specifically, as a binary heap). By default, it uses a vector.
In order to do this, the priority queue type has to know how to compare elements against one another in a way that determines which elements are “smaller” than other elements. By default, it uses the less-than operator.
If you make a standard std::priority_queue<int>, you get back a priority queue that
uses a std::vector for storage, and
uses the less-than operator to compare elements.
In many cases, this is what you want. If you insert elements into a priority queue created this way, you’ll read them back out from greatest to least.
In some cases, though, this isn’t the behavior you want. In Prim’s algorithm and Dijkstra’s algorithm, for example, you want the values to come back in ascending order rather than descending order. To do this, you need to, in effect, reverse the order of comparisons by using the greater-than operator instead of the less-than operator.
To do this, you need to tell the priority queue to use a different comparison method. Unfortunately, the priority queue type is designed so that if you want to do that, you also need to specify which underlying container you want to use. I think this is a mistake in the design - it would be really nice to just be able to specify the comparator rather than the comparator and the container - but c’est la vie. The syntax for this is
std::priority_queue<int, // store integers...
std::vector<int>, // ... in a vector ...
std::greater<int>> // ... comparing using >

Related

Mutable data types that use stack allocation

Based on my earlier question, I understand the benefit of using stack allocation. Suppose I have an array of arrays. For example, A is a list of matrices and each element A[i] is a 1x3 matrix. The length of A and the dimensions of A[i] are known at run time (given by the user). Each A[i] is a matrix of Float64, and this is also known at run time. However, throughout the program, I will be modifying the values of A[i] element by element. What data structure would also allow me to use stack allocation? I tried StaticArrays, but it doesn't allow me to modify a static array.
StaticArrays defines MArray (MVector, MMatrix) types that are fixed-size and mutable. If you use these there's a higher chance of the compiler determining that they can be stack-allocated, but it's not guaranteed. Moreover, since the pattern you're using is that you're passing the mutable state vector into a function which presumably modifies it, it's not going to be valid or helpful to stack allocate that anyway. If you're going to allocate state once and modify it throughout the program, it doesn't really matter if it is heap or stack allocated—stack allocation is only a big win for objects that are allocated, used locally and then don't escape the local scope, so they can be “freed” simply by popping the stack.
From the code snippet you showed in the linked question, the state vector is allocated in the outer function, test_for_loop, which shouldn't be a big deal since it's done once at the beginning of execution. Using a variably sized state vector to index into an array with a splat (...) might be an issue, however, and that's done in test_function. Using something with fixed size like MVector might be better for that. It might, however, be better still, to use a state tuple and return a new rather than mutated state tuple at the end. The compiler is very good at turning that kind of thing into very efficient code because of immutability.
Note that by convention test_function should be called test_function! since it modifies its M argument and even more so if it modifies the state vector.
I would also note that this isn't a great question/answer pair since it's not standalone at all and really just a continuation of your other question. StackOverflow isn't very good for this kind of iterative question/discussion interaction, I'm afraid.

What is the fastest way to Initialize a priority_queue from an unordered_set

It is said in the constructor documentation of priority_queue, overload (12):
template< class InputIt >
priority_queue( InputIt first, InputIt last,
const Compare& compare = Compare(),
Container&& cont = Container() );
But I don't know how to use this.
I have a non-empty std::unordered_set<std::shared_ptr<MyStruct>> mySet, and I want to convert it to a priority queue. I also create a comparator struct MyComparator:
struct MyComparator {
    bool operator()(const std::shared_ptr<MyStruct>& a,
                    const std::shared_ptr<MyStruct>& b){...}
};
Now how can I construct a new priority_queue myQueue in a better way? I used the following and it works:
std::priority_queue<std::shared_ptr<MyStruct>, std::deque<std::shared_ptr<MyStruct>>, MyComparator>
myQueue(mySet.begin(), mySet.end());
I benchmarked both vector and deque, and I find deque will outperform vector when the size is relatively large (~30K).
Since we already know the size of mySet, I should create the deque with that size. But how can I construct this priority_queue with my own comparator and a predefined deque, say myDeque?
Since you have already determined that std::deque gives you better performance than std::vector, I don't think there is much more you can do in terms of how you construct the priority_queue. As you have probably seen, there is no std::deque::reserve() method, so it's simply not possible to create a deque with memory allocated ahead of time. For most use cases this is not a problem, because the main feature of deque vs vector is that deque does not need to copy elements as new ones are inserted.
If you are still not achieving the performance you desire, you might consider either storing raw pointers (keeping your smart pointers alive outside), or simply changing your unordered_set to an ordered std::set and relying on the ordering that container provides.

Constant-time list concatenation in OCaml

Is it possible to implement constant-time list concatenation in OCaml?
I imagine an approach where we deal directly with memory and concatenate lists by pointing the end of the first list to the beginning of the second list. Essentially, we're creating some type of linked-list like object.
With the normal list type, no, you can't. The algorithm you describe is essentially the one used, but you still have to actually find the end of the first list, which takes time linear in its length.
There are various methods to implement constant-time concatenation (see Okasaki's Purely Functional Data Structures for the fancy details). I will just give you the names of OCaml libraries that implement it: BatSeq, BatLazyList (both in Batteries), sequence, gen, Core.Sequence.
Pretty sure there is a diff-list implementation somewhere too.
Lists are already (singly) linked lists. But list nodes are immutable, so you can't change any node's pointer to point to anything different. In order to concatenate two lists you must therefore copy all the nodes in the first list.

IndexedSeq.last complexity

When working with indexed collections (most often immutable Vectors) I often use coll.last as what I supposed was a convenient shortcut for coll(coll.size - 1). While randomly inspecting my sources, I clicked through to see the implementation of last, and the IntelliJ IDE took me to TraversableLike.last, which traverses all elements to eventually reach the last one.
This was a surprise to me, and I am not sure now what is the reason for this. Is last really implemented this way? Is there some reason preventing last to be implemented for IndexedSeq (or perhaps for IndexedSeqLike) efficiently?
(Scala SDK used is 2.11.4)
IndexedSeq does not override last (it only inherits it from TraversableLike) - the fact that a particular sequence supports indexed access does not necessarily make indexed lookups faster than traversals. However, such optimized implementations are given in IndexedSeqOptimized, which I would expect many implementations to inherit from. In the specific case of Vector, last is overridden explicitly in the class itself.
IndexedSeq has constant access time for an arbitrary element; LinearSeq has linear time. TraversableLike is just the common interface, and you may find that last is overridden inside the IndexedSeqOptimized trait:
A template trait for indexed sequences of type IndexedSeq[A] which
optimizes the implementation of several methods under the
assumption of fast random access.
def last: A = if (length > 0) this(length - 1) else super.last
You may also find the quick random-access implementation inside Vector.getElem - it uses a tree of arrays with a high branching factor, so apply is usually O(1). Vector doesn't use IndexedSeqOptimized, but it has its own overridden last:
override /*TraversableLike*/ def last: A = {
if (isEmpty) throw new UnsupportedOperationException("empty.last")
apply(length-1)
}
So it's a little messy inside the Scala collections, which is very common for Scala internals. Anyway, last on IndexedSeqs is O(1) de facto, regardless of this tricky collections architecture.
The intricacy of the Scala collections is actually an active topic. A talk (and slides) criticizing Scala's collection framework can be found at Paul Phillips: Scala Collections: Why Not?, and Paul Phillips is developing his own alternate version of the standard library.

Use cases of std::multimap

I don't quite get the purpose of this data structure. What's the difference between std::multimap<K, V> and std::map<K, std::vector<V>>? The same goes for std::multiset - it could just be std::map<K, int> where the int counts the number of occurrences of K. Am I missing something about the uses of these structures?
A counter-example seems to be in order.
Consider a PhoneEntry in an AddressList grouped by name.
struct AddressListCompare {
    bool operator()(const PhoneEntry& p1, const PhoneEntry& p2) const {
        return p1.name < p2.name;
    }
};
multiset<PhoneEntry, AddressListCompare> addressList;
addressList.insert( PhoneEntry("Cpt.G", "123-456", "Cellular") );
addressList.insert( PhoneEntry("Cpt.G", "234-567", "Work") );
// Getting the entries
addressList.equal_range( PhoneEntry("Cpt.G") ); // All numbers
This would not be feasible with a set + count. Your object + count approach seems to be faster if this behavior is not required. For instance, the documentation of the multiset::count() member states: "Complexity: logarithmic in size, plus linear in count."
You could make the substitutions that you suggest and extract similar behavior. But the interfaces would be very different from those of the regular standard containers. A major design theme of these containers is that they share as much interface as possible, making them as interchangeable as possible so that the appropriate container can be chosen without having to change the code that uses it.
For instance, std::map<K, std::vector<V>> would have iterators that dereference to std::pair<K, std::vector<V>> instead of std::pair<K, V>. std::map<K, std::vector<V>>::count() wouldn't return the correct result, failing to account for the duplicates in the vector. Of course you could change your code to do the extra steps needed to correct for this, but now you are interfacing with the container in a much different way. You can't later drop in unordered_map or some other map implementation to see if it performs better.
In a broader sense, you are breaking the container abstraction by handling container implementation details in your code rather than having a container that handles its own business.
It's entirely possible that your compiler's implementation of std::multimap is really just a wrapper around std::map<K, std::vector<V>>. Or it might not be. It could be more efficient and friendly to object pool allocation (which vectors are not).
Using std::map<K, int> instead of std::multiset is the same case. count() would not return the expected value, iterators would not iterate over the duplicates, and iterators would dereference to std::pair<K, int> instead of directly to K.
A multimap or multiset allows you to have elements with duplicate keys.
i.e. a set is a non-ordered group of elements that are all unique, in that {A,B,C} == {B,C,A}.
