How to implement Boost::Multi-index on a list of lists
I have a hierarchical tree as follows:
typedef std::list<struct obj> objList; // the object list
typedef std::list<objList> topLevelList; // the list of top-level object lists
struct obj
{
int Id; // globally unique Id
std::string objType;
std::string objAttributes;
....
topLevelList childObjectlist;
};
At the top-level, I have a std::list of struct obj
Then, each of these top-level obj can have any number of child objects,
which are contained in a topLevelList list for that object. This can continue, with a child in the nested list also having its own children.
Some objects can only be children, while others are containers and can have children of their own. A container object has some number of sub-containers, each with its own list of child objects, which is why each obj struct holds a topLevelList rather than simply an objList.
I want to index this list of lists with Boost::Multi-index to obtain random access, by globally unique Id, to any object in either the top-level list or a descendant list.
Can this be accomplished? I have searched for examples with no success.
I think the only way to have a flattened master search index by object Ids is to make the lists above to be lists of pointers to the objects, then traverse the completed hierarchical list, and log into the master search index the pointer where each object is physically allocated in memory. Then any object can be located via the master search index.
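The flattened master index described above could be sketched roughly as follows. This is only an illustration using the standard library: Node, NodeList, Index, build_index and find_by_id are hypothetical names, and Node is a cut-down stand-in for the obj struct. It relies on std::list never relocating its elements, so the stored addresses stay valid until an element is erased.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for the question's obj struct.
struct Node;
using NodeList = std::list<Node>;          // one list of sibling objects
using NodeListList = std::list<NodeList>;  // the sub-containers of one object

struct Node
{
    int id = -1;            // globally unique Id
    std::string type;
    NodeListList children;  // empty for objects that cannot hold children
};

// Flat master index: Id -> address of the object inside the nested lists.
using Index = std::unordered_map<int, Node*>;

// Walk the whole hierarchy once and record where every object lives.
void build_index(NodeList& level, Index& index)
{
    for (Node& n : level) {
        const bool inserted = index.emplace(n.id, &n).second;
        assert(inserted && "Ids are expected to be globally unique");
        (void)inserted;  // silence unused-variable warnings in release builds
        for (NodeList& sub : n.children)
            build_index(sub, index);
    }
}

// Average O(1) lookup; returns nullptr when the Id is unknown.
Node* find_by_id(const Index& index, int id)
{
    const auto it = index.find(id);
    return it == index.end() ? nullptr : it->second;
}
```

In such a design the index has to be kept in sync by hand: an entry must be added whenever an object is inserted and removed before the object is erased from its list.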
With Boost::Multi-index, I'd still have to traverse the hierarchy, though hopefully with the ability to use random instead of sequential access in each list encountered, in order to find a desired object.
Using nested vectors instead of lists is a problem - as additions and deletions occur in the vectors, there is a performance penalty as well as the prospect of pointers to objects becoming invalidated as the vectors are reallocated.
I'm almost talking myself into implementing the flattened master objId search index of pointers, unless someone has a better solution that can leverage Boost::Multi-index.
Edit on 1/31/2020:
I'm having trouble with the implementation of nested lists below. I have cases where the code does not properly place top-level parent objects into the top level, and thus in the "bracketed" printout, we don't see the hierarchy for that parent. However, in the "Children of xxx" printout, the children of that parent do display correctly. Here is a section of main.cpp which demonstrates the problem:
auto it=c.insert({170}).first;
it=c.insert({171}).first;
it=c.insert({172}).first;
it=c.insert({173}).first;
auto it141=c.insert({141}).first;
auto it137=insert_under(c,it141,{137}).first;
insert_under(c,it137,{8});
insert_under(c,it137,{138});
auto it9=insert_under(c,it137,{9}).first;
auto it5=insert_under(c,it9,{5}).first;
insert_under(c,it5,{6});
insert_under(c,it5,{7});
insert_under(c,it137,{142});
auto it143=insert_under(c,it137,{143}).first;
insert_under(c,it143,{144});
If you place this code in Main.cpp instead of the demo code and run it you will see the problem. Object 141 is a parent object and is placed at the top level. But it does not print in the "Bracketed" hierarchy printout. Why is this?
Edit on 2/2/2020:
Boost::Serialize often delivers an exception on oarchive, complaining that re-creating a particular object would result in duplicate objects. Some archives save and re-load successfully, but many result in the error above. I have not been able yet to determine the exact conditions under which the error occurs, but I have proven that none of the content used to populate the nested_container and the flat object list contains duplicate object IDs. I am using text archive, not binary. Here is how I have modified the code for nested_container and also for another, separate flat object list in order to do Boost::Serialize:
struct obj
{
int id;
const obj * parent = nullptr;
obj()
:id(-1)
{ }
obj(int object)
:id(object)
{ }
int getObjId() const
{
return id;
}
bool operator==(const obj &obj2) const
{
return this->getObjId() == obj2.getObjId();
}
#if 1
private:
friend class boost::serialization::access;
friend std::ostream & operator<<(std::ostream &os, const obj &obj);
template<class Archive>
void serialize(Archive &ar, const unsigned int file_version)
{
ar & id & parent;
}
#endif
};
struct subtree_obj
{
const obj & obj_;
subtree_obj(const obj & ob)
:obj_(ob)
{ }
#if 1
private:
friend class boost::serialization::access;
friend std::ostream & operator<<(std::ostream &os, const subtree_obj &obj);
template<class Archive>
void serialize(Archive &ar, const unsigned int file_version)
{
ar & obj_;
}
#endif
};
struct path
{
int id;
const path *next = nullptr;
path(int ID, const path *nex)
:id(ID), next(nex)
{ }
path(int ID)
:id(ID)
{ }
#if 1
private:
friend class boost::serialization::access;
friend std::ostream & operator<<(std::ostream &os, const path &pathe);
template<class Archive>
void serialize(Archive &ar, const unsigned int file_version)
{
ar & id & next;
}
#endif
};
struct subtree_path
{
const path & path_;
subtree_path(const path & path)
:path_(path)
{ }
#if 1
private:
friend class boost::serialization::access;
friend std::ostream & operator<<(std::ostream &os, const subtree_path &pathe);
template<class Archive>
void serialize(Archive &ar, const unsigned int file_version)
{
ar & path_;
}
#endif
};
//
// My flattened object list
//
struct HMIObj
{
int objId;
std::string objType;
HMIObj()
:objId(-1), objType("")
{ }
bool operator==(const HMIObj &obj2) const
{
return this->getObjId() == obj2.getObjId()
&& this->getObjType() == obj2.getObjType();
}
int getObjId() const
{
return objId;
}
std::string getObjType() const
{
return objType;
}
#if 1
private:
friend class boost::serialization::access;
friend std::ostream & operator<<(std::ostream &os, const HMIObj &obj);
template<class Archive>
void serialize(Archive &ar, const unsigned int file_version)
{
ar & objId & objType;
}
#endif
};
In case it helps, you can use Boost.MultiIndex to implement a sort of hierarchical container using the notion of path ordering.
Suppose we have the following hierarchy of objects, identified by their IDs:
|-------
| |
0 4
|---- |----
| | | | | |
1 2 3 5 8 9
|--
| |
6 7
We define the path of each object as the sequence of IDs from the root down to the object:
0 --> 0
1 --> 0, 1
2 --> 0, 2
3 --> 0, 3
4 --> 4
5 --> 4, 5
6 --> 4, 5, 6
7 --> 4, 5, 7
8 --> 4, 8
9 --> 4, 9
These paths can be ordered lexicographically so that a sequence of objects sorted by path is actually a representation of the underlying hierarchy. If we add a parent pointer to objects to model parent-child relationships:
struct obj
{
int id;
const obj* parent=nullptr;
};
then we can define a multi_index_container with both O(1) access by ID and hierarchy-based indexing:
using nested_container=multi_index_container<
obj,
indexed_by<
hashed_unique<member<obj,int,&obj::id>>,
ordered_unique<identity<obj>,obj_less>
>
>;
where obj_less compares objects according to path ordering. All types of tree manipulations and visitations are possible as exemplified below (code is not entirely trivial, feel free to ask).
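To make the path ordering concrete before the full listing, here is a minimal sketch that is separate from the answer's implementation: it materializes each path as a std::vector<int> and compares the vectors lexicographically. Item, path_of, path_less and path_order_example are hypothetical names. The answer's obj_less reaches the same ordering without allocating, by building the path on the stack as a linked list.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the answer's obj.
struct Item
{
    int id;
    const Item* parent = nullptr;
};

// Path of an object: the ids from its root ancestor down to the object itself.
std::vector<int> path_of(const Item& x)
{
    std::vector<int> p;
    for (const Item* q = &x; q; q = q->parent) p.push_back(q->id);
    std::reverse(p.begin(), p.end());  // collected leaf-to-root, so flip it
    return p;
}

// std::vector's operator< is a lexicographic comparison, which is exactly
// the path ordering: a parent sorts before its descendants.
bool path_less(const Item& x, const Item& y)
{
    return path_of(x) < path_of(y);
}

// Part of the hierarchy from the answer: 0 -> {1}, 4 -> {5 -> {6}, 8}.
void path_order_example()
{
    Item i0{0}, i4{4};
    Item i1{1, &i0}, i5{5, &i4}, i8{8, &i4};
    Item i6{6, &i5};
    assert(path_less(i0, i1));  // 0      < 0,1
    assert(path_less(i1, i4));  // 0,1    < 4
    assert(path_less(i6, i8));  // 4,5,6  < 4,8
}
```

Sorting by this comparison yields a preorder traversal of the tree, which is why the second index of the container can double as a representation of the hierarchy.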
Live On Coliru
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/member.hpp>
#include <algorithm>
#include <iterator>
#include <utility>
struct obj
{
int id;
const obj* parent=nullptr;
};
struct subtree_obj
{
const obj& obj_;
};
struct path
{
int id;
const path* next=nullptr;
};
struct subtree_path
{
const path& path_;
};
inline bool operator<(const path& x,const path& y)
{
if(x.id<y.id)return true;
else if(y.id<x.id)return false;
else if(!x.next) return y.next;
else if(!y.next) return false;
else return *(x.next)<*(y.next);
}
inline bool operator<(const subtree_path& sx,const path& y)
{
const path& x=sx.path_;
if(x.id<y.id)return true;
else if(y.id<x.id)return false;
else if(!x.next) return false;
else if(!y.next) return false;
else return subtree_path{*(x.next)}<*(y.next);
}
inline bool operator<(const path& x,const subtree_path& sy)
{
return x<sy.path_;
}
struct obj_less
{
private:
template<typename F>
static auto apply_to_path(const obj& x,F f)
{
return apply_to_path(x.parent,path{x.id},f);
}
template<typename F>
static auto apply_to_path(const obj* px,const path& x,F f)
->decltype(f(x))
{
return !px?f(x):apply_to_path(px->parent,{px->id,&x},f);
}
public:
bool operator()(const obj& x,const obj& y)const
{
return apply_to_path(x,[&](const path& x){
return apply_to_path(y,[&](const path& y){
return x<y;
});
});
}
bool operator()(const subtree_obj& x,const obj& y)const
{
return apply_to_path(x.obj_,[&](const path& x){
return apply_to_path(y,[&](const path& y){
return subtree_path{x}<y;
});
});
}
bool operator()(const obj& x,const subtree_obj& y)const
{
return apply_to_path(x,[&](const path& x){
return apply_to_path(y.obj_,[&](const path& y){
return x<subtree_path{y};
});
});
}
};
using namespace boost::multi_index;
using nested_container=multi_index_container<
obj,
indexed_by<
hashed_unique<member<obj,int,&obj::id>>,
ordered_unique<identity<obj>,obj_less>
>
>;
template<typename Iterator>
inline auto insert_under(nested_container& c,Iterator it,obj x)
{
x.parent=&*it;
return c.insert(std::move(x));
}
template<typename Iterator,typename F>
void for_each_in_level(
nested_container& c,Iterator first,Iterator last, F f)
{
if(first==last)return;
const obj* parent=first->parent;
auto first_=c.project<1>(first),
last_=c.project<1>(last);
do{
f(*first_);
auto next=std::next(first_);
if(next!=last_&&next->parent!=parent){ // never dereference the end of the range
next=c.get<1>().upper_bound(subtree_obj{*first_});
}
first_=next;
}while(first_!=last_);
}
template<typename ObjPointer,typename F>
void for_each_child(nested_container& c,ObjPointer p,F f)
{
auto [first,last]=c.get<1>().equal_range(subtree_obj{*p});
for_each_in_level(c,std::next(first),last,f);
}
#include <iostream>
auto print=[](const obj& x){std::cout<<x.id<<" ";};
void print_subtree(nested_container& c,const obj& x)
{
std::cout<<x.id<<" ";
bool visited=false;
for_each_child(c,&x,[&](const obj& x){
if(!visited){
std::cout<<"[ ";
visited=true;
}
print_subtree(c,x);
});
if(visited)std::cout<<"] ";
}
int main()
{
nested_container c;
auto it=c.insert({0}).first;
insert_under(c,it,{1});
insert_under(c,it,{2});
insert_under(c,it,{3});
it=c.insert({4}).first;
auto it2=insert_under(c,it,{5}).first;
insert_under(c,it2,{6});
insert_under(c,it2,{7});
insert_under(c,it,{8});
insert_under(c,it,{9});
std::cout<<"preorder:\t";
std::for_each(c.get<1>().begin(),c.get<1>().end(),print);
std::cout<<"\n";
std::cout<<"top level:\t";
for_each_in_level(c,c.get<1>().begin(),c.get<1>().end(),print);
std::cout<<"\n";
std::cout<<"children of 0:\t";
for_each_child(c,c.find(0),print);
std::cout<<"\n";
std::cout<<"children of 4:\t";
for_each_child(c,c.find(4),print);
std::cout<<"\n";
std::cout<<"children of 5:\t";
for_each_child(c,c.find(5),print);
std::cout<<"\n";
std::cout<<"bracketed:\t";
for_each_in_level(c,c.get<1>().begin(),c.get<1>().end(),[&](const obj& x){
print_subtree(c,x);
});
std::cout<<"\n";
}
Output
preorder: 0 1 2 3 4 5 6 7 8 9
top level: 0 4
children of 0: 1 2 3
children of 4: 5 8 9
children of 5: 6 7
bracketed: 0 [ 1 2 3 ] 4 [ 5 [ 6 7 ] 8 9 ]
Update 2020/02/02:
When accessing top-level elements, I've changed the code from:
std::for_each(c.begin(),c.end(),...);
for_each_in_level(c,c.begin(),c.end(),...);
to
std::for_each(c.get<1>().begin(),c.get<1>().end(),...);
for_each_in_level(c,c.get<1>().begin(),c.get<1>().end(),...);
This is because index #0 is hashed and does not necessarily show elements sorted by ID.
For instance, if elements with IDs (170,171,172,173,141) are inserted in this order, index #0 lists them as
170,171,172,173,141 (coincidentally, same order as inserted),
while index #1 lists them as
141,170,171,172,173 (sorted by ID).
The way the code is implemented, for_each_in_level(c,c.begin(),c.end(),...); gets internally mapped to index #1 range [170,...,173], leaving out 141. The way to make sure all top elements are included is then to write for_each_in_level(c,c.get<1>().begin(),c.get<1>().end(),...);.
Let's say I am trying to implement some math vector class.
The vector interface will be used in several places: an array-based vector, and matrices that return their columns and rows as objects implementing the vector interface.
I would like to overload the + and - operators for my vectors. Each operator should return a newly constructed object of some vector implementation class.
But an overloaded operator has to return either a value or a reference. I cannot return a value, because I need runtime polymorphism, so I am left with references. And for a reference to outlive the function call, the object has to be created on the heap.
So how should I manage the situation?
P.S. I could create a shared_ptr and return a reference to the contained value, but that does not look like good practice.
typedef unsigned int vector_idx_t;
template <class T, vector_idx_t size>
class vector {
public:
virtual ~vector();
virtual T& operator[](const vector_idx_t idx) = 0;
virtual vector<T, size>& operator+ (const T& a) const = 0;
virtual vector<T, size>& operator- (const T& a) const = 0;
virtual vector<T, size>& operator* (const T& a) const = 0;
virtual vector<T, size>& operator/ (const T& a) const = 0;
virtual vector<T, size>& operator+ (const vector<T, size>& vec2) const = 0;
virtual vector<T, size>& operator- (const vector<T, size>& vec2) const = 0;
};
template <class T, vector_idx_t size>
class array_vector: public vector<T, size> {
private:
std::array<T, size> m_elements;
public:
array_vector();
array_vector(std::array<T, size> elements);
array_vector(const vector<T, size>& vec2);
array_vector(std::initializer_list<T> elems);
virtual ~array_vector();
virtual T& operator[](const vector_idx_t idx) {
return m_elements[idx];
}
virtual vector<T, size>& operator+ (const T& a) const {
std::array<T, size> e;
for (vector_idx_t i = 0; i < size; ++i) {
e[i] = m_elements[i] + a;
}
auto v = std::make_shared<array_vector<T, size>>(e);
return *v; // v is destroyed here, so the returned reference dangles
}
};
I suggest a slight modification to your design for accommodating the polymorphic nature of the implementation.
Don't make vector polymorphic.
Use a Data class to contain the implementation specific details of vector.
Make Data polymorphic.
That will allow you to return vectors by value or by reference, as appropriate to an interface.
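One way this suggestion might look is sketched below. All names (vec_data, array_data, strided_view_data, vec) are hypothetical, and the sketch makes assumptions the answer does not state: element access goes through virtual get/set, results always use array-backed storage, and vec is move-only because it owns its data through std::unique_ptr.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// The only polymorphic hierarchy: how the elements are stored.
template <class T, std::size_t N>
class vec_data
{
public:
    virtual ~vec_data() = default;
    virtual T get(std::size_t i) const = 0;
    virtual void set(std::size_t i, const T& v) = 0;
};

// Owns its elements.
template <class T, std::size_t N>
class array_data : public vec_data<T, N>
{
    std::array<T, N> elems_{};
public:
    T get(std::size_t i) const override { return elems_[i]; }
    void set(std::size_t i, const T& v) override { elems_[i] = v; }
};

// Views external memory with a stride, e.g. one column of a row-major matrix.
template <class T, std::size_t N>
class strided_view_data : public vec_data<T, N>
{
    T* base_;
    std::size_t stride_;
public:
    strided_view_data(T* base, std::size_t stride) : base_(base), stride_(stride) {}
    T get(std::size_t i) const override { return base_[i * stride_]; }
    void set(std::size_t i, const T& v) override { base_[i * stride_] = v; }
};

// Non-polymorphic vector that can be returned by value from operators.
template <class T, std::size_t N>
class vec
{
    std::unique_ptr<vec_data<T, N>> data_;
public:
    vec() : data_(std::make_unique<array_data<T, N>>()) {}
    explicit vec(std::unique_ptr<vec_data<T, N>> d) : data_(std::move(d)) {}

    T operator[](std::size_t i) const { assert(i < N); return data_->get(i); }
    void set(std::size_t i, const T& v) { assert(i < N); data_->set(i, v); }

    // The result is a fresh array-backed vec, whatever the operands' storage is.
    friend vec operator+(const vec& a, const vec& b)
    {
        vec r;
        for (std::size_t i = 0; i < N; ++i) r.set(i, a[i] + b[i]);
        return r;
    }
    friend vec operator+(const vec& a, const T& s)
    {
        vec r;
        for (std::size_t i = 0; i < N; ++i) r.set(i, a[i] + s);
        return r;
    }
};
```

Making vec copyable would need something like a virtual clone() on vec_data, which is left out of the sketch to keep it short.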
Polymorphism by subtype is not the answer to all problems. I understand what you are trying to do, but I don't understand why a template-based polymorphic solution is not enough, so that you need virtual operators (which don't mix well at all with polymorphism by subtype).
You want to be able to define operations on mixed types of vectors so that you can compute results between real containers and proxy to containers.
First of all, this requires settling on the concrete result type you need: a proxy to a matrix column is not a real container but rather a view of a container, so adding two of them should return a real container (e.g. a container backed by an actual std::array).
A similar design could be managed by something like
template<typename ContainerType, typename ElementType>
class vector_of : public ContainerType
{
public:
vector_of(const ContainerType& container) : ContainerType(container) { }
vector_of<ContainerType, ElementType> operator+(const ElementType& a) const
{
vector_of<ContainerType, ElementType> copy = vector_of<ContainerType,ElementType>(*this);
std::for_each(copy.begin(), copy.end(), [&a](ElementType& element) { element += a; });
return copy;
}
template<typename T>
vector_of<ContainerType, ElementType> operator+(const vector_of<T, ElementType>& a) const
{
vector_of<ContainerType, ElementType> copy(*this);
auto it = copy.begin();
auto it2 = a.begin();
while (it != copy.end() && it2 != a.end())
{
*it += *it2;
++it;
++it2;
}
return copy;
}
};
The trick here is that operator+ is a template method which accepts a generic container of ElementType elements. The code assumes that these kinds of containers provide begin and end methods which return an iterator (a sensible choice in any case, because it works well with the STL).
With this you can do things like:
class MatrixRowProxy
{
private:
int* data;
size_t length;
public:
MatrixRowProxy(int* data, size_t length) : data(data), length(length) { }
int* begin() const { return data; }
int* end() const { return data + length; }
};
vector_of<std::array<int, 5>, int> base = vector_of<std::array<int, 5>, int>({ 1, 2, 3, 4, 5 });
vector_of<std::vector<int>, int> element = vector_of<std::vector<int>, int>({ 2, 3, 4, 5, 6 });
int* data = new int[5] { 10, 20, 30, 40, 50};
vector_of<MatrixRowProxy, int> proxy = vector_of<MatrixRowProxy, int>(MatrixRowProxy(data, 5));
auto result = base + element + proxy;
for (const auto& t : result)
std::cout << t << std::endl;
So you can add heterogeneous kinds of vectors without the need of any virtual method.
Of course, these methods have to create a new result object. This is done by copying *this into a new vector_of<ContainerType, ElementType>. Nothing prevents you from adding a third template argument, such as a VectorFactory, to take care of this, so that vectors which are only wrappers could also be used on the LHS of such operators.
I've implemented a red-black tree based on this example, but I don't understand the meaning of the header. Is it the root of the tree? According to the description:
the header node is maintained with links not only to the root but also to the leftmost node of the tree, to enable constant time begin(), and to the rightmost node of the tree, to enable linear time performance when used with the generic set algorithms (set_union, etc.);
How can I access the root of my tree using the header node, and what is the complexity of that?
The header node in Boost Intrusive's RBTree implementations contains the link to the root, leftmost and rightmost nodes (see here).
So, parent_ is the pointer to the root node then.
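As a library-independent illustration of that layout, the sketch below wires a header node to a small hand-built tree using plain pointers. It does not use Boost.Intrusive at all: Node, attach, root_of and header_example are hypothetical names, and the root pointing back at the header through its own parent pointer is an assumption of this sketch.

```cpp
#include <cassert>

// Plain node with the same three links as the question's my_node.
struct Node
{
    Node* parent = nullptr;
    Node* left = nullptr;
    Node* right = nullptr;
    int value = 0;
};

// Hook a non-empty, already linked tree under a header node.
inline void attach(Node& header, Node& root)
{
    header.parent = &root;  // header's parent link -> root of the tree
    root.parent = &header;  // assumption: the root points back at the header

    Node* l = &root;
    while (l->left) l = l->left;    // walk down to the leftmost node
    Node* r = &root;
    while (r->right) r = r->right;  // walk down to the rightmost node
    header.left = l;
    header.right = r;
}

// Reaching the root from the header is a single pointer read, i.e. O(1).
inline Node* root_of(const Node& header) { return header.parent; }

// Three-node tree:  2
//                  / \
//                 1   3
inline void header_example()
{
    Node header, n1, n2, n3;
    n1.value = 1; n2.value = 2; n3.value = 3;
    n2.left = &n1;  n1.parent = &n2;
    n2.right = &n3; n3.parent = &n2;
    attach(header, n2);
    assert(root_of(header) == &n2);
    assert(header.left == &n1 && header.right == &n3);
}
```

The point of the sketch is only the cost: once the header is at hand, the root, the leftmost node and the rightmost node are each one pointer dereference away.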
You can use a container abstraction based on the "algorithm policy" shown in that example. You'd write custom value traits, like I linked in my previous answer: Accessing left child or right child of a node in avl_set
Here's a simple, self-contained example that shows how to use an actual rbtree container (not just the algorithms) built on your node type.
Note how you can still "drill through" and get at the nodes using the container's traits.
Live On Coliru
struct my_node
{
my_node(int i = 0) :
parent_(nullptr),
left_ (nullptr),
right_ (nullptr),
int_ (i)
{ }
my_node *parent_, *left_, *right_;
int color_;
//data members
int int_;
bool operator<(my_node const& other) const { return int_ < other.int_; }
};
//Define our own rbtree_node_traits
struct my_rbtree_node_traits
{
typedef my_node node;
typedef my_node * node_ptr;
typedef const my_node * const_node_ptr;
typedef int color;
static node_ptr get_parent(const_node_ptr n) { return n->parent_; }
static void set_parent(node_ptr n, node_ptr parent){ n->parent_ = parent; }
static node_ptr get_left(const_node_ptr n) { return n->left_; }
static void set_left(node_ptr n, node_ptr left) { n->left_ = left; }
static node_ptr get_right(const_node_ptr n) { return n->right_; }
static void set_right(node_ptr n, node_ptr right) { n->right_ = right; }
static color get_color(const_node_ptr n) { return n->color_; }
static void set_color(node_ptr n, color c) { n->color_ = c; }
static color black() { return color(0); }
static color red() { return color(1); }
};
#include <boost/intrusive/link_mode.hpp>
namespace bi = boost::intrusive;
struct my_value_traits
{
typedef my_rbtree_node_traits node_traits;
typedef node_traits::node value_type;
typedef node_traits::node_ptr node_ptr;
typedef node_traits::const_node_ptr const_node_ptr;
typedef value_type* pointer;
typedef value_type const* const_pointer;
static const bi::link_mode_type link_mode = bi::link_mode_type::normal_link;
static node_ptr to_node_ptr (value_type &value) { return &value; }
static const_node_ptr to_node_ptr (const value_type &value) { return &value; }
static pointer to_value_ptr (node_ptr n) { return n; }
static const_pointer to_value_ptr (const_node_ptr n) { return n; }
};
#include <boost/intrusive/rbtree.hpp>
using mytree = bi::rbtree<my_node, bi::value_traits<my_value_traits> >;
#include <iostream>
#include <vector>
int main() {
std::vector<my_node> storage { {1}, {3}, {4}, {2}, {3}, };
mytree container;
container.insert_equal(storage.begin(), storage.end());
// NOW for the "have your cake and eat it too" moment:
for (my_node& n : container) {
std::cout << n.int_
<< " (parent: " << n.parent_ << ")"
<< " (left: " << n.left_ << ")"
<< " (right: " << n.right_ << ")"
<< "\n";
}
}
Which prints (e.g.):
1 (parent: 0xb01c40) (left: 0) (right: 0xb01c80)
2 (parent: 0xb01c20) (left: 0) (right: 0)
3 (parent: 0x7fff6da3f058) (left: 0xb01c20) (right: 0xb01c60)
3 (parent: 0xb01c60) (left: 0) (right: 0)
4 (parent: 0xb01c40) (left: 0xb01ca0) (right: 0)
The structure of a Boost.Intrusive tree is explained in the documentation of bstree_algorithms (http://www.boost.org/boost/intrusive/bstree_algorithms.hpp). The "header" node is also explained:
"At the top of the tree a node is used specially. This node's parent pointer is pointing to the root of the tree. Its left pointer points to the leftmost node in the tree and the right pointer to the rightmost one. This node is used to represent the end-iterator."
So you can access the root node using:
root = rbtree_algorithms::get_parent(header);
If you are building your own container using value traits, as explained by sehe, since commit:
https://github.com/boostorg/intrusive/commit/bbb4f724d037a6ab5ee0d9bde292f0691564960c
tree-based containers have a root() function that returns an iterator to the root node (or end() if not present) with O(1) complexity, which might be easier to use:
#include <boost/intrusive/set.hpp>
#include <cassert>
using namespace boost::intrusive;
struct MyClass : public set_base_hook<>
{
friend bool operator<(const MyClass&, const MyClass&)
{ return true; }
};
int main()
{
set<MyClass> set;
//end() is returned when the tree is empty
assert(set.root() == set.end() );
//insert myobject, must be root
MyClass myobject;
set.insert(myobject);
assert(&*set.root() == &myobject);
//erase and check root is again end()
set.erase(set.root());
assert(set.croot() == set.cend());
return 0;
}