Performance tuning on Eigen sparse matrix

I've implemented something using Eigen's SparseMatrix; it basically looks like this:
SparseMatrix W;
...
W.row(i) += X.row(j); // X is another SparseMatrix, both W and X are row major.
...
I profiled the code with google-pprof, and I think the line above is the problem; see the figures below:
fig 1
then fig 2
finally fig 3
It looks like operator+= triggers a lot of memory copying.
I don't know much about the internals of SparseMatrix operations, but is there any recommended way to optimize the above code?

If the sparsity pattern of X is a subset of the sparsity pattern of W, then you can write your own function that does the addition in place:
namespace Eigen {
template<typename Dst, typename Src>
void inplace_sparse_add(Dst &dst, const Src &src)
{
EIGEN_STATIC_ASSERT( ((internal::evaluator<Dst>::Flags&RowMajorBit) == (internal::evaluator<Src>::Flags&RowMajorBit)),
THE_STORAGE_ORDER_OF_BOTH_SIDES_MUST_MATCH);
using internal::evaluator;
evaluator<Dst> dst_eval(dst);
evaluator<Src> src_eval(src);
assert(dst.rows()==src.rows() && dst.cols()==src.cols());
for (Index j=0; j<src.outerSize(); ++j)
{
typename evaluator<Dst>::InnerIterator dst_it(dst_eval, j);
typename evaluator<Src>::InnerIterator src_it(src_eval, j);
while(src_it)
{
while(dst_it && dst_it.index()!=src_it.index())
++dst_it;
assert(dst_it);
dst_it.valueRef() += src_it.value();
++src_it;
}
}
}
}
Here is a usage example:
int main()
{
int n = 10;
MatrixXd R = MatrixXd::Random(n,n);
SparseMatrix<double, RowMajor> A = R.sparseView(0.25,1), B = 0.5*R.sparseView(0.65,1);
cout << A.toDense() << "\n\n" << B.toDense() << "\n\n";
inplace_sparse_add(A, B);
cout << A.toDense() << "\n\n";
auto Ai = A.row(2);
inplace_sparse_add(Ai, B.row(2));
cout << A.toDense() << "\n\n";
}
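The core idea of the answer does not depend on Eigen, so it may help to see it in isolation. The following is a minimal sketch, not Eigen's API: `SparseRow` and `inplace_row_add` are hypothetical names, and a row is assumed to be stored as (column, value) pairs sorted by column. Because the source pattern is assumed to be a subset of the destination pattern, a single forward scan finds every matching slot and nothing is allocated or copied.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one compressed row:
// (column index, value) pairs sorted by column index.
using SparseRow = std::vector<std::pair<int, double>>;

// Adds src into dst in place. Assumes every column present in src
// is already present in dst (the "subset" precondition).
void inplace_row_add(SparseRow& dst, const SparseRow& src) {
    std::size_t d = 0;  // cursor into dst; only ever moves forward
    for (const auto& entry : src) {
        // Advance until the destination column matches the source column.
        while (d < dst.size() && dst[d].first != entry.first) {
            ++d;
        }
        assert(d < dst.size() && "src pattern must be a subset of dst pattern");
        dst[d].second += entry.second;
    }
}
```

For example, adding a row with entries (3, 0.5) and (7, 1.0) into a row with entries (0, 1.0), (3, 2.0), (7, 4.0) leaves the destination with values 1.0, 2.5 and 5.0 and the same three stored entries.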

Related

Displaying garbage value

When the friend function add is used to print the sum of the two classes' values, the correct output is printed. But when the display functions of the respective classes are used, a garbage value is printed.
What's wrong with the code?
#include <iostream>
using namespace std;
class DB;
class DM {
int m, cm;
float sum;
public:
void read()
{
cout << "Enter meters and centimeters respectively\n";
cin >> m >> cm;
}
void display(void)
{
cout << sum << " meters\n";
}
friend void add(DM p1, DB p2);
};
class DB {
int feet, inch;
float sum;
public:
void read()
{
cout << "Enter feets and inches respectively\n";
cin >> feet >> inch;
}
void display(void)
{
cout << sum << " feets\n";
}
friend void add(DM p1, DB p2);
};
void add(DM p1, DB p2)
{
float a = p2.feet * 12;
float b = a + p2.inch;
float c = b * .3048 + p1.m + p1.cm * .01;
//cout << c << endl;
p1.sum = c;
//cout << p1.sum << endl;
//p2.sum = (c/12)/.3048;
cout << p2.sum << endl;
}
int main()
{
DM obj1;
DB obj2;
obj1.read();
obj2.read();
add(obj1, obj2);
obj1.display();
obj2.display();
}
The problem is that you print sum, but you never initialize the sum members of obj1 and obj2.
You assign sum only inside add(), but add() takes its arguments by value, so you only modify temporary copies.
When you print obj1.sum and obj2.sum (through display()), those members are still uninitialized.
Hence the garbage values.
Try passing the objects by reference:
// ..........V........V
void add (DM & p1, DB & p2)
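This way the assignment to sum inside add() also takes effect in the caller's objects. Note that the assignment to p2.sum is commented out in your code, so you need to restore it for obj2.display() to print a meaningful value.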

Reading in from file with modern c++ - data is not stored

Maybe I'm getting something wrong with shared pointers, or there is some basic shortcoming of mine, but I couldn't get this right. I want to read in some data from a file. Each line of the data file holds position and momentum data, and the first line stores the number of data points.
I need to read this into my data structure, and for some reason my graph does not fill, although the data is read in correctly.
const int dim = 3; // dimension of problem
template <typename T, typename G>
// T is the type of the inputted locations and G is the type of the
// distance between them
// for example: int point with float/double distance
struct Node{
std::pair< std::array<T, dim>,std::pair< std::array<T, dim>, G > > pos; // position
std::pair< std::array<T, dim>,std::pair< std::array<T, dim>, G > > mom; // momentum
// a pair indexed by a position in space and has a pair of position
// and the distance between these points
};
template <typename T, typename G>
struct Graph{
int numOfNodes;
std::vector< Node<T,G> > nodes;
};
This is the data structure and here's my read function (std::cout-s are only for testing):
template <typename T, typename G>
std::istream& operator>>(std::istream& is, std::shared_ptr< Graph<T,G> >& graph){
is >> graph->numOfNodes; // there's the number of nodes on the first line of the data file
std::cout << graph->numOfNodes << "\n";
for(int k=0; k<graph->numOfNodes; k++){
Node<T,G> temp;
for(auto i : temp.pos.first){
is >> i;
std::cout << i << "\t";
}
std::cout << "\t";
for(auto i : temp.mom.first){
is >> i;
std::cout << i << "\t";
}
std::cout << "\n";
graph->nodes.push_back(temp);
}
return is;
}
I have an output function as well. When I output the graph that I intended to fill during read-in, it is zeroed out. The number of nodes is correct, but the positions and momenta are all zero. What did I do wrong? Thanks in advance.
for(auto i : temp.pos.first){
is >> i;
std::cout << i << "\t";
}
Think of this as similar to a function. If you have something like:
void doX(int i) { i = 42; }
int main() {
int j=5;
doX(j);
return j;
}
Running this code, you'll see the program returns the value 5. This is because the function doX takes i by value; it basically takes a copy of the variable.
If you replace doX's signature with
void doX(int &i)
and run the code, you'll see it returns 42. This is because the function is now taking the argument by reference, and so can modify it.
Your loops behave similarly. As written, each iteration copies one array element into i, so is >> i writes to the copy and the array itself is never modified.
As with the function, you can change your loops to look like
for(auto &i : temp.pos.first){
is >> i;
std::cout << i << "\t";
}
This should then let you change the values stored in the arrays.
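To make the difference concrete, here is a small self-contained sketch. `read_array` is a hypothetical helper, not part of the question's code; it reads a fixed number of values from any input stream using the by-reference loop.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical helper: reads N whitespace-separated values into an array.
template <typename T, std::size_t N>
std::array<T, N> read_array(std::istream& is) {
    std::array<T, N> values{};  // value-initialized, so all zero before reading
    for (auto& v : values) {    // auto& binds to the element itself
        is >> v;                // writes straight into the array
    }
    return values;
}
```

Feeding it a `std::istringstream` containing `1.5 2.5 3.5` and calling `read_array<double, 3>` yields those three values. With `auto v` instead of `auto& v` the returned array would still be all zeros, which matches the symptom in the question.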

Boost Mem_fn and accessing member function of derived class

I made a simple example to test boost bind's interaction with derived classes.
I created two subclasses with different getarea functions. I expected
g1 = boost::bind(boost::mem_fn(&Shape::getarea), Rec)
to print the area of Rectangle(10,20), but instead it printed '1'. I get the same when I write Rectangle::getarea instead. It prints the same even when I bind other functions, e.g. this member of Rectangle:
double sum(double h,double w){return h+w; }
and use
g1 = boost::bind(boost::mem_fn(&Rectangle::sum), Rec,2,3)
Question 1: Why does it return '1'? Is that a default response for an error?
My second problem is to do the same for g2, but with Rec replaced by **iter, i.e. an object of some derived class type from a list of objects. Since getarea is a virtual function, once I get the above working it should be fine to just write:
g2= boost::bind(boost::mem_fn(& Shape::getarea , &(**iter));
Question 2: However, I was wondering if there is a way to return the classtype of **iter eg. classof(**iter) and then put it in g2 i.e.
g2= boost::bind(boost::mem_fn(& classof(**iter)::getarea , &(**iter));
When I ran g2 by writing Shape::getarea, I got '1' again for all iter.
#include <memory>
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
#include <boost/bind.hpp>
using namespace std;
class Shape {
public:
Shape(double h, double w) :height(h), width(w) {};
virtual double getarea() = 0;
double height;
double width; };
class Rectangle: public Shape {
public:
Rectangle(double h, double w): Shape(h,w) {};
double getarea() override { return height*width; } };
class Triangle : public Shape {
public:
Triangle(double h, double w) :Shape(h,w) {};
double getarea() { return height*width*0.5; }};
int main() {
//create objects
Rectangle Rec(10, 20);
Triangle Tri(2, 3);
//create boost bind function
boost::function<double(double, double)> g1;
g1 = boost::bind(boost::mem_fn(&Shape::getarea), Rec);
//print area and g
cout << Rec.getarea()<<" should be equal to " << g1<< '\n';
//create list
vector<shared_ptr<Shape>> Plist;
Plist.push_back(make_shared<Rectangle>(Rec));
Plist.push_back(make_shared<Triangle>(Tri));
//print each element from the vector list
for (auto iter = Plist.begin(); iter != Plist.end(); iter ++ ) {
boost::function<double(double, double)> g2;
g2= boost::bind(boost::mem_fn(& .... , &(**iter));
//where in dots we need Classtype_of_**iter::getarea
cout << (**iter).getarea()<<"should be equal to " << g2<< '\n';
}
}
You forgot to invoke the functions:
for (auto iter = Plist.begin(); iter != Plist.end(); iter++) {
boost::function<double()> g2;
g2 = boost::bind(&Shape::getarea, iter->get());
cout << (*iter)->getarea() << " should be equal to " << g2() << '\n';
}
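What you saw was the implicit conversion to bool (http://www.boost.org/doc/libs/1_60_0/doc/html/boost/function.html#idm45507164686720-bb): streaming g1 without calling it prints whether the function object is non-empty, hence '1'.
Note that I also fixed the signature of g1 and g2: they take no arguments, so the type is boost::function<double()>.
Some further improvements (removing the need for g2 in the loop):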
auto getarea = boost::mem_fn(&Shape::getarea);
for (auto iter = Plist.begin(); iter != Plist.end(); iter++) {
cout << (*iter)->getarea() << " should be equal to " << getarea(**iter) << '\n';
}
Or, indeed, in C++11:
for (auto& s : Plist)
cout << s->getarea() << " should be equal to " << getarea(*s) << '\n';
By this time, you'd wonder why you have this accessor when you can just use the member.
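The same point can be shown with only the standard library. The sketch below swaps in `std::function` and `std::bind` for their Boost counterparts (an assumption made so the example has no external dependency), and `areas` is a hypothetical helper. The member function is bound through a base-class pointer and then actually called, so virtual dispatch selects the right override without needing to know the dynamic type.

```cpp
#include <functional>
#include <memory>
#include <vector>

struct Shape {
    Shape(double h, double w) : height(h), width(w) {}
    virtual ~Shape() = default;
    virtual double getarea() const = 0;
    double height;
    double width;
};

struct Rectangle : Shape {
    using Shape::Shape;
    double getarea() const override { return height * width; }
};

struct Triangle : Shape {
    using Shape::Shape;
    double getarea() const override { return height * width * 0.5; }
};

// Hypothetical helper: binds &Shape::getarea to each object and calls the result.
std::vector<double> areas(const std::vector<std::shared_ptr<Shape>>& shapes) {
    std::vector<double> out;
    for (const auto& s : shapes) {
        std::function<double()> g = std::bind(&Shape::getarea, s.get());
        out.push_back(g());  // the trailing () invokes the bound member function
    }
    return out;
}
```

Passing a vector holding a `Rectangle(10, 20)` and a `Triangle(2, 3)` gives 200 and 3. Binding `&Shape::getarea` is enough for both, which is why something like `classof(**iter)` is not needed.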

Is it possible in C++11 to combine functions into a new function?

This is more a kind of theoretical question. Is it possible in C++11 to combine functions into a new function? For example :
auto f = [](int i){return i * 2;};
auto g = [](int i){return i + 10;};
So this works:
auto c = f(g(20)); // = 60
But I want an object that stores the combination, like
auto c = f(g);
std::cout << c(20) << std::endl; //prints 60
Edit:
Additionally, what I want to create is a function a that takes a function b and an int n, and returns the n'th combination of the given function b. For example (not compilable):
template<typename T>
auto combine(T b, int i) -> decltype(T)
{
if (i == 0)
return b;
return combine(b, i - 1);
}
auto c = combine(f, 2); //c = f(f(f(int)))
A first attempt:
template<class First, class Second>
auto compose( Second&& second, First&& first ) {
return [second = std::forward<Second>(second), first=std::forward<First>(first)]
(auto&&...args)->decltype(auto) {
return second( first( decltype(args)(args)... ) );
};
}
template<class A, class B, class...Rest>
auto compose(A&& a, B&& b, Rest&&... rest) {
return compose( compose(std::forward<A>(a), std::forward<B>(b)), std::forward<Rest>(rest)... );
}
template<class A>
std::decay_t<A> compose(A&& a) {
return std::forward<A>(a);
}
This requires C++14. It isn't perfect, as the pattern doesn't work all that well in C++.
To do this perfectly, we'd have to take a look at compositional programming. Here, functions interact with an abstract stack of arguments. Each function pops some number of arguments off the stack, then pushes some number back on.
This would allow you to do this:
compose( print_coord, get_x, get_y )
where get_x and get_y consume nothing but return a coordinate, and print_coord takes two coordinates and prints them.
To emulate this in C++, we need some fancy machinery. Functions will return tuples (or tuple-likes?), and those values will be "pushed onto the argument stack" logically.
Functions will also consume things off this argument stack.
At each invocation, we unpack the current tuple of arguments, find the longest collection that the function can be called with, call it, get its return value, unpack it if it is a tuple, and then stick any such returned values back on the argument stack.
For this more advanced compose to compose with itself, it then needs SFINAE checks, and it needs to be able to take an invocable object and a tuple of arguments and find the right number of arguments to call the invocable object with, plus the left-over arguments.
This is a tricky bit of metaprogramming that I won't do here.
The second part, because I missed it the first time, looks like:
template<class F>
auto function_to_the_power( F&& f, unsigned count ) {
return [f=std::forward<F>(f),count](auto&& x)
-> std::decay_t< decltype( f(decltype(x)(x)) ) >
{
if (count == 0) return decltype(x)(x);
auto r = f(decltype(x)(x));
for (unsigned i = 1; i < count; ++i) {
r = f( std::move(r) );
}
return r;
};
}
This uses no type erasure.
Test code:
auto f = [](int x){ return x*3; };
auto fs = std::make_tuple(
function_to_the_power( f, 0 ),
function_to_the_power( f, 1 ),
function_to_the_power( f, 2 ),
function_to_the_power( f, 3 )
);
std::cout << std::get<0>(fs)(2) << "\n";
std::cout << std::get<1>(fs)(2) << "\n";
std::cout << std::get<2>(fs)(2) << "\n";
std::cout << std::get<3>(fs)(2) << "\n";
prints:
2
6
18
54
You can write something along the lines of:
#include <functional>
#include <iostream>
template<class F>
F compose(F f, F g)
{
return [=](int x) { return f(g(x)); };
}
int main()
{
std::function<int (int)> f = [](int i) { return i * 2; };
std::function<int (int)> g = [](int i) { return i + 10; };
auto c = compose(f, g);
std::cout << c(20) << '\n'; // prints 60
}
The code can be simply extended to cover the second half of the question:
template<class F>
F compose(F f, unsigned n)
{
auto g = f;
for (unsigned i = 0; i < n; ++i)
g = compose(g, f);
return g;
}
int main()
{
std::function<int (int)> h = [](int i) { return i * i; };
auto d = compose(h, 1);
auto e = compose(h, 2);
std::cout << d(3) << "\n" // prints 81
<< e(3) << "\n"; // prints 6561
}
NOTE: this uses std::function, which is not a lambda itself but a type-erasing wrapper around one, and that wrapper comes with a performance cost.
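For comparison, both helpers can be written without std::function by letting the compiler deduce the closure types. This is a minimal C++14 sketch, not a replacement for the more general versions above; `repeat` is a hypothetical name, and it assumes the function's result can be assigned back to its argument type.

```cpp
#include <utility>

// compose(f, g) returns a callable that computes f(g(args...)).
template <class F, class G>
auto compose(F f, G g) {
    return [f = std::move(f), g = std::move(g)](auto&&... args) -> decltype(auto) {
        return f(g(std::forward<decltype(args)>(args)...));
    };
}

// repeat(f, n) returns a callable that applies f n times;
// repeat(f, 0) behaves like the identity function.
template <class F>
auto repeat(F f, unsigned n) {
    return [f = std::move(f), n](auto x) {
        for (unsigned i = 0; i < n; ++i) {
            x = f(std::move(x));
        }
        return x;
    };
}
```

With the question's `f` (multiply by 2) and `g` (add 10), `compose(f, g)(20)` evaluates to 60, and `repeat(f, 3)(1)` evaluates to 8. Since no type erasure is involved, each composed object has its own unnamed type, so it has to be stored with `auto`.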

OpenCV perspectiveTransform broken function

I'm trying to use perspectiveTransform but I keep getting an error. I tried to follow the solution from this thread: http://answers.opencv.org/question/18252/opencv-assertion-failed-for-perspective-transform/
_players[i].get_Coordinates() is of type Point
_homography_matrix is a 3 x 3 Mat
Mat temp_Mat = Mat::zeros(2, 1, CV_32FC2);
for (int i = 0; i < _players.size(); i++)
{
cout << Mat(_players[i].get_Coordinates()) << endl;
perspectiveTransform(Mat(_players[i].get_Coordinates()), temp_Mat, _homography_matrix);
}
Also, how do I convert temp_Mat into type Point ?
OpenCV Error: Assertion failed (scn + 1 == m.cols) in cv::perspectiveTransform
Basically, you just need to change
Mat(_players[i].get_Coordinates()) ...
to
Mat2f(_players[i].get_Coordinates()) ...
In the first case you are creating a 2x1, 1-channel matrix; in the second (correct) case you create a 1x1, 2-channel float matrix.
You also don't need to initialize temp_Mat.
You can also use the Mat_ template to better control the types of your Mats. E.g. creating a Mat of type CV_32FC2 is equivalent to creating a Mat2f.
This sample code also shows how to convert back and forth between Mat and Point:
#include <opencv2/opencv.hpp>
#include <vector>
using namespace std;
using namespace cv;
int main()
{
// Some random points
vector<Point2f> pts = {Point2f(1,2), Point2f(5,10)};
// Some random transform matrix
Mat1f m(3,3, float(0.1));
for (size_t i = 0; i < pts.size(); ++i)
{
cout << "Point: " << pts[i] << endl;
Mat2f dst;
perspectiveTransform(Mat2f(pts[i]), dst, m);
cout << "Dst mat: " << dst << endl;
Point2f p(dst(0));
cout << "Dst point: " << p << endl;
}
return 0;
}
