What is the fastest way to initialize a priority_queue from an unordered_set - C++11

The documentation for priority_queue construction lists overload (12):
template< class InputIt >
priority_queue( InputIt first, InputIt last,
                const Compare& compare = Compare(),
                Container&& cont = Container() );
But I don't know how to use this.
I have a non-empty std::unordered_set<std::shared_ptr<MyStruct>> mySet, and I want to convert it to a priority queue. I also create a comparator struct MyComparator:
struct MyComparator {
    bool operator()(const std::shared_ptr<MyStruct>& a,
                    const std::shared_ptr<MyStruct>& b) {...}
};
Now how can I construct a new priority_queue myQueue in a better way? I used the following and it works:
std::priority_queue<std::shared_ptr<MyStruct>, std::deque<std::shared_ptr<MyStruct>>, MyComparator>
    myQueue(mySet.begin(), mySet.end());
I benchmarked both vector and deque, and I find deque will outperform vector when the size is relatively large (~30K).
Since we already know the size of mySet, I should create the deque with that size. But how can I create this priority_queue with my own comparator and a predefined deque, say myDeque?

Since you have already determined that std::deque gives you better performance than std::vector, I don't think there is much more you can do in terms of how you construct the priority_queue. As you have probably seen, there is no std::deque::reserve() method, so it's simply not possible to create a deque with memory allocated ahead of time. For most use cases this is not a problem, because the main feature of deque vs vector is that deque does not need to copy elements as new ones are inserted.
If you are still not achieving the performance you desire, you might consider either storing raw pointers (keeping your smart pointers alive elsewhere), or simply changing your unordered_set to a regular std::set and relying on the ordering that container provides.

Related

Can someone explain how std::greater is used to implement priority_queue

std::priority_queue<int, vector<int>, std::greater<int> > pq;
I cannot understand the role of std::greater in the priority queue.
I am replacing a min-heap with the priority queue.
This code is taken from the GeeksforGeeks implementation of Prim's algorithm using the STL.
The std::priority_queue type is what’s called a container adapter. It works by starting with a type you can use to represent a sequence, then uses that type to build the priority queue (specifically, as a binary heap). By default, it uses a vector.
In order to do this, the priority queue type has to know how to compare elements against one another in a way that determines which elements are “smaller” than other elements. By default, it uses the less-than operator.
If you make a standard std::priority_queue<int>, you get back a priority queue that
uses a std::vector for storage, and
uses the less-than operator to compare elements.
In many cases, this is what you want. If you insert elements into a priority queue created this way, you’ll read them back out from greatest to least.
In some cases, though, this isn’t the behavior you want. In Prim’s algorithm and Dijkstra’s algorithm, for example, you want the values to come back in ascending order rather than descending order. To do this, you need to, in effect, reverse the order of comparisons by using the greater-than operator instead of the less-than operator.
To do this, you need to tell the priority queue to use a different comparison method. Unfortunately, the priority queue type is designed so that if you want to do that, you also need to specify which underlying container you want to use. I think this is a mistake in the design - it would be really nice to just be able to specify the comparator rather than the comparator and the container - but c’est la vie. The syntax for this is
std::priority_queue<int,               // store integers...
                    std::vector<int>,  // ... in a vector ...
                    std::greater<int>> // ... comparing using >

List storing pointers or "plain object"

I am designing a class which tracks user manipulations in a piece of software in order to restore previous application states (i.e. CTRL+Z/CTRL+Y). I simply wanted to clarify something about performance.
I am using the std::list container of the STL. This list is not meant to contain really huge objects, but a significant number of them. Should I use pointers or not?
For instance, here is the kinds of objects which will be stored:
struct ImagesState
{
    cv::Mat first;
    cv::Mat second;
};
struct StatusBarState
{
    std::string notification;
    std::string algorithm;
};
For now, I store the whole thing under the form of struct pointers, such as:
std::list<ImagesState*> stereoImages;
I know (I think) that the new and delete operators are time-consuming, but I don't want to cause a stack overflow with "plain objects". Is it a bad design?
If you are using a list, I would suggest not using pointers. The list items are on the heap anyway, and the pointer just adds an unnecessary layer of indirection.
If you are after performance, using std::list is most likely not the best solution. Using std::vector might boost your performance significantly, since contiguous objects are friendlier to your caches.
Even in a vector, the objects would lie on the heap, so the pointers are not needed (they would hurt you even more than with a list). You only have to worry about the stack if you declare an array on it,
like so:
Type arrayName[REALLY_HUGE_NUMBER];

How to compress pointer ? eg. arbitrary bit pointer

I'm coding a complex tree data structure, which stores lots of pointers. The pointers themselves occupy lots of space, and this is what I'm expecting to save.
So I'm here to ask whether there are examples of this. E.g.: for a 64-bit data type, can I use a 32-bit or smaller pointer if the data it points to is guaranteed to be contiguous?
I found a paper called Transparent Pointer Compression for Linked Data Structures, but I thought there could be a much simpler solution.
Update:
It is an octree. A paper about this on the GPU is GigaVoxels: A Voxel-Based Rendering Pipeline For Efficient Exploration Of Large And Detailed Scenes; they use 15-bit pointers on the GPU.
Instead of using pointers, use an index into an array. The index can be an unsigned short if the array is less than 65536 in length, or an int32_t if it's less than 2147483648.
An arbitrary pointer can really be anywhere in memory, so there's no way to shorten it by more than a couple of bits.
If pointer storage takes a lot of space:
use an array of pointers, and replace the pointers with indexes into that array. That adds just another indirection. With fewer than 64K pointers, a short index suffices.
Simple implementation
#define MAX_PTR 60000
void *aptr[MAX_PTR];
short nb = 0;

short ptr2index(void *ptr) {
    assert(nb < MAX_PTR);   // table is full otherwise
    aptr[nb] = ptr;
    return nb++;
}

void *index2ptr(short index) {
    return aptr[index];
}

... utilization ...
... short next;             // in Class, instead of a pointer
Class *c = new Class();
mystruct->next = ptr2index((void *)c);
...
Class *x = (Class *)index2ptr(otherstruct->next);
One option is to write a custom allocator to allocate big blocks of contiguous memory, and then store your nodes contiguously in there. Each of your nodes can then be referenced by a simple index that can be mapped back to memory using simple pointer arithmetic (e.g.: node_ptr = mem_block_ptr + node_index).
Soon you realise that having multiple of these memory blocks means you no longer know in which of them a specific node resides. This is where partitioning comes into the picture. You can opt for horizontal and/or vertical partitioning. Both considerably increase the level of complexity, and both have pros and cons (see [1] and [2]).
The key thing here is to ensure that the data is split up in a predictable manner.
References:
[1] Building Scalable Databases: Pros and Cons of Various Database Sharding Schemes
[2] 37signals - Mr. Moore gets to punt on sharding
In some cases, you could simply use an array to hold the nodes. A binary tree node at arr[i] would have children at arr[2*i + 1] and arr[2*i + 2]. Its parent would be at arr[(i-1)/2], if i != 0. And to get a real pointer, of course, you can say &arr[i]. This is actually a rather common implementation for trees that are full by specification, like the tree used for a heap.
In order for a node to find its own children, though, it would need either its index or a pointer to the container; and even with one of those pieces, you'd be doing some hoop-jumping, since you really need both to do things easily. Having to calculate stuff rather than remembering it is the price you pay for not storing much. To keep things space-efficient, you'd have to dumb down the nodes: make them plain structs, or even just the values, and let the tree class do all the node-finding. The tree would hand out pointers to nodes, and from such a pointer the container can compute the node's index (and thus where its children are). You'd also have to pass both a tree pointer and a node pointer to any function that wants to traverse the tree.
Note, though, this won't save much space unless your trees are consistently near full. Every leaf that sits above the bottom level wastes the array slots of its entire absent subtree.
If you can't count on the tree being full, or the nodes being in a certain restricted space, then there's not a whole lot to optimize here. The whole point of trees is nodes have pointers to their children; you can fake it with an array, but it must be easy to somehow find a node's children in order to make a tree worthwhile.
A very simple way of dealing with your issue is simply to use fewer pointers (sounds silly, right?).
Compare the two following approaches:
template <typename T>
struct OctreeNaiveNode {
    T value;
    Point center;
    OctreeNaiveNode* parent;
    std::unique_ptr<OctreeNaiveNode> children[8];
}; // struct OctreeNaiveNode
// sizeof(OctreeNaiveNode) >= sizeof(T) + sizeof(Point) + 9 * sizeof(void*)

template <typename T>
struct OctreeNode {
    T value;
    Point center;
    std::unique_ptr<OctreeNode[]> children; // allocate for 8 only when necessary
}; // struct OctreeNode
// sizeof(OctreeNode) >= sizeof(T) + sizeof(Point) + sizeof(void*)
How does it work:
A parent pointer is only necessary for simple iterators. If you have far fewer iterators than nodes, it is more economical to have deep iterators: iterators that keep a stack of parents up to the root. Note that in an RB-tree this does not work so well (rebalancing moves nodes around), but in an octree it should work better because the partitioning is fixed.
A single children pointer: instead of an array of pointers to children, use a pointer to an array of children. Not only does this mean 1 dynamic allocation instead of 8 (less heap fragmentation/overhead), it also means 1 pointer instead of 8 within your node.
Overhead:
Point = std::tuple<float,float,float> => sizeof(T) + sizeof(Point) >= 64 => +100%
Point = std::tuple<double,double,double> => sizeof(T) + sizeof(Point) >= 256 => +25%
So, rather than delving into pointer compression strategies, I advise you to just rework your data structures to have fewer pointers/indirections in the first place.

Sorting a Vector Containing Pointer to Struct VS Struct

I am sorting a large vector containing structs using heapsort and the runtime of my code is quite slow. Instead of storing a struct in the vector, I want to store a pointer to the struct now.
My question is, under the hood, what is actually happening when I sort things and would it be faster if I store a pointer to a struct as opposed to storing the struct itself?
Certainly yes. Storing objects by value in STL containers means the stored object's copy constructor runs.
In general, for performance, it is better to store pointers instead. However, you will need to be more careful about leaks and exception safety once you use pointers.
Anyway, the simplest thing that happens during sorting is the swap algorithm, which involves copying:
void swap(T & a, T & b)
{
    T c = a; // copy construction
    a = b;   // copy assignment
    b = c;   // copy assignment
}
It is definitely much faster to copy a pointer than a bigger object.

Use cases of std::multimap

I don't quite get the purpose of this data structure. What's the difference between std::multimap<K, V> and std::map<K, std::vector<V>>. The same goes for std::multiset- it could just be std::map<K, int> where the int counts the number of occurrences of K. Am I missing something on the uses of these structures?
A counter-example seems to be in order.
Consider a PhoneEntry in an AddressList grouped by name.
struct AddressListCompare {
    bool operator()(const PhoneEntry& p1, const PhoneEntry& p2) const {
        return p1.name < p2.name;
    }
};
std::multiset<PhoneEntry, AddressListCompare> addressList;
addressList.insert( PhoneEntry("Cpt.G", "123-456", "Cellular") );
addressList.insert( PhoneEntry("Cpt.G", "234-567", "Work") );
// Getting the entries
addressList.equal_range( PhoneEntry("Cpt.G") ); // All numbers
This would not be feasible with a set+count. Your object+count approach seems to be faster if this behavior is not required. For instance, the multiset::count() member is documented as:
"Complexity: logarithmic in size, plus linear in the number of matches."
You could make the substitutions that you suggest and extract similar behavior. But the interfaces would be very different from those of the regular standard containers. A major design theme of these containers is that they share as much interface as possible, making them as interchangeable as possible, so that the appropriate container can be chosen without having to change the code that uses it.
For instance, std::map<K, std::vector<V>> would have iterators that dereference to std::pair<K, std::vector<V>> instead of std::pair<K, V>. std::map<K, std::vector<V>>::count() wouldn't return the correct result, failing to account for the duplicates in the vector. Of course you could change your code to do the extra steps needed to correct for this, but now you are interfacing with the container in a much different way. You can't later drop in unordered_map or some other map implementation to see if it performs better.
In a broader sense, you are breaking the container abstraction by handling container implementation details in your code rather than having a container that handles its own business.
It's entirely possible that your compiler's implementation of std::multimap is really just a wrapper around std::map<K, std::vector<V>>. Or it might not be. It could be more efficient and friendly to object pool allocation (which vectors are not).
Using std::map<K, int> instead of std::multiset is the same case: count() would not return the expected value, iterators would not iterate over the duplicates, and they would dereference to std::pair<K, int> instead of directly to K.
A multimap or multiset allows you to have elements with duplicate keys.
i.e., a set (in the mathematical sense) is an unordered group of elements that are all unique, in that {A,B,C} == {B,C,A}.
