I think this is more of a philosophical question about readability and tuple types in C++11.
I am writing some code to produce Gaussian Mixture Models (the details are mostly irrelevant, but it serves as a nice example). My code is below:
GMM.hpp
#pragma once
#include <opencv2/opencv.hpp>
#include <vector>
#include <tuple>
#include "../Util/Types.hpp"
namespace LocalDescriptorAndBagOfFeature
{
// Weighted gaussian is defined as a (weight, mean vector, covariance matrix)
typedef std::tuple<double, cv::Mat, cv::Mat> WeightedGaussian;
class GMM
{
public:
GMM(int numGaussians);
void Train(const FeatureSet &featureSet);
std::vector<double> Supervector(const BagOfFeatures &bof);
int NumGaussians(void) const;
double operator ()(const cv::Mat &x) const;
private:
static double ComputeWeightedGaussian(const cv::Mat &x, WeightedGaussian wg);
std::vector<WeightedGaussian> _Gaussians;
int _NumGaussians;
};
}
GMM.cpp
#include "GMM.hpp"
#include <cmath>
#include <numeric>
using namespace LocalDescriptorAndBagOfFeature;
double GMM::ComputeWeightedGaussian(const cv::Mat &x, WeightedGaussian wg)
{
double weight;
cv::Mat mean, covariance;
std::tie(weight, mean, covariance) = wg;
cv::Mat precision;
cv::invert(covariance, precision);
double detp = cv::determinant(precision);
double outter = std::sqrt(detp / 2.0 * M_PI);
cv::Mat meanDist = x - mean;
cv::Mat meanDistTrans;
cv::transpose(meanDist, meanDistTrans);
cv::Mat symmetricProduct = meanDistTrans * precision * meanDist; // This is a "1x1" matrix e.g. a scalar value
double inner = symmetricProduct.at<double>(0,0) / -2.0;
return weight * outter * std::exp(inner);
}
double GMM::operator ()(const cv::Mat &x) const
{
return std::accumulate(_Gaussians.begin(), _Gaussians.end(), 0.0, [&x](double val, const WeightedGaussian &wg) { return val + ComputeWeightedGaussian(x, wg); }); // initial value must be 0.0, not 0, or the sum is truncated to int
}
In this case, am I gaining anything (clarity, readability, speed, ...) by using a tuple representation for the weighted Gaussian distribution over using a struct, or even a class with its own operator()?
You're reducing the size of your source code a little bit, but I'd argue that you're reducing its overall readability and type safety. Specifically, if you defined:
struct WeightedGaussian {
double weight;
cv::Mat mean, covariance;
};
then you wouldn't have a chance of writing the incorrect
std::tie(weight, covariance, mean) = wg;
and you'd guarantee that your users would use wg.mean instead of std::get<0>(wg). The biggest downside is that std::tuple comes with definitions of operator< and operator==, while you have to implement them yourself for a custom struct:
bool operator<(const WeightedGaussian& lhs, const WeightedGaussian& rhs) {
return std::tie(lhs.weight, lhs.mean, lhs.covariance) <
std::tie(rhs.weight, rhs.mean, rhs.covariance);
}
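To see the difference at the call sites, here is the question's ComputeWeightedGaussian rewritten against the struct. This is a brief sketch that keeps the original math and OpenCV calls unchanged; only the member access changes:

double ComputeWeightedGaussian(const cv::Mat &x, const WeightedGaussian &wg)
{
    // Invert the covariance to get the precision matrix.
    cv::Mat precision;
    cv::invert(wg.covariance, precision);
    double detp = cv::determinant(precision);
    double outer = std::sqrt(detp / 2.0 * M_PI);

    // Quadratic form (x - mean)^T * precision * (x - mean).
    cv::Mat meanDist = x - wg.mean;
    cv::Mat meanDistTrans;
    cv::transpose(meanDist, meanDistTrans);
    cv::Mat symmetricProduct = meanDistTrans * precision * meanDist; // 1x1 matrix
    double inner = symmetricProduct.at<double>(0, 0) / -2.0;

    return wg.weight * outer * std::exp(inner);
}

Each field is accessed by name (wg.weight, wg.mean, wg.covariance), so there is no std::tie or std::get call to get wrong.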
Related
I am writing simple classes that implement vectors and matrices as part of trying to learn OpenGL. I have matrix and vector classes that look like this:
// Forward declarations
template <typename T, size_t N/*len*/> struct vec;
template<typename T, size_t N /*rows*/, size_t M /*cols*/> struct mat;
// Forward declare operator* for matrix:
// an (NxM) matrix multiplied by an (MxP) matrix yields an (NxP) matrix
template <typename T, size_t N, size_t M, size_t P>
mat<T, N, P> operator* (const mat<T, N, M>& A, const mat<T, M, P>& B);
template <typename T, size_t N>
struct vec {
public:
vec() {}
virtual ~vec() {}
private:
T m_data[N];
};
template <typename T, size_t N, size_t M>
struct mat {
public:
mat() {}
virtual ~mat() {}
// This is where it gets interesting. By my reading of the rules
// of C++11, this counts as a partial specialization of the
// operator template, and should not work.
// However, it compiles just fine!
template <size_t n, size_t m, size_t p>
friend mat<T, n, p> operator* (const mat<T, n, m>& A,
const mat<T, m, p> &B);
// Implementation appears later in the same header file.
private:
T m_data[N*M];
};
I declare the * operator as a friend because I want it to have access to the internal m_data member, but I don't want the users of 'mat' and 'vec' to know the internals.
This compiles and runs just fine. I have a unit test for that matrix multiplication, and it works. However, I don't know why it even compiles, let alone runs. By my reading of the rules of C++ templates, the declaration of the * operator counts as a partial specialization of a function template, which is illegal.
What am I missing here?
It turns out this does *NOT* compile. I only thought it was compiling because my unit test wasn't actually invoking the template stream operator when I thought it was.
Sorry for the stupid question!
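For reference, one way the friendship can be declared so that it actually compiles is to forward-declare the class template and the operator template, then befriend the operator template as a whole. A minimal sketch under those assumptions (only the mat part, with the product written out naively; not the original code):

#include <cstddef>

template <typename T, std::size_t N, std::size_t M> struct mat;

template <typename T, std::size_t N, std::size_t M, std::size_t P>
mat<T, N, P> operator*(const mat<T, N, M>& A, const mat<T, M, P>& B);

template <typename T, std::size_t N, std::size_t M>
struct mat {
public:
    mat() {}
private:
    T m_data[N * M] {};

    // Befriend every specialization of the operator template, so the
    // definition below can read m_data of both operands and the result.
    template <typename U, std::size_t n, std::size_t m, std::size_t p>
    friend mat<U, n, p> operator*(const mat<U, n, m>& A, const mat<U, m, p>& B);
};

template <typename T, std::size_t N, std::size_t M, std::size_t P>
mat<T, N, P> operator*(const mat<T, N, M>& A, const mat<T, M, P>& B) {
    mat<T, N, P> C;
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < P; ++j) {
            T acc = T();
            for (std::size_t k = 0; k < M; ++k)
                acc += A.m_data[i * M + k] * B.m_data[k * P + j];
            C.m_data[i * P + j] = acc;
        }
    return C;
}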
I have a problem with a template function inside a class. When I call Set in the something() function, VS shows me:
Error C2275 'T': illegal use of this type as an expression
The header is:
#include <vector>
#include <array>
#include <iostream>
using t_double = double;
template<typename T>
using t_vec = std::vector<T>;
class SuperPixel2
{
t_vec<double> distances;
template<typename T>
void Set(t_vec<T> &v,
size_t i,
size_t j,
const T &val);
void something();
};
And the cpp file:
#include "SuperPixel2.h"
template<typename T>
void SuperPixel2::Set(t_vec<T> &v,
size_t i,
size_t j,
const T &val)
{
v[i * cols + j] = T;
}
void SuperPixel2::something()
{
t_double d;
//..
Set(distances, k, l, (t_double)d);
//..
}
In addition to what sudo make install said, you generally cannot declare a template in a header file and put its implementation in a .cpp file (unless you explicitly instantiate it there). See here for an answer to your problem.
Well this line looks pretty odd:
v[i * cols + j] = T;
I think it's meant to be:
v[i * cols + j] = val;
As a side note (and maybe this would make more sense looking at the whole class), the type of distances is known (a vector of doubles), so it's unclear to me why the Set method needs to be a templated member.
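Putting both points together, here is a minimal sketch of a fixed version. It assumes the class really does have a cols member, and the indices in something() are placeholders since the originals are elided; the template definition is kept in the header and the assignment uses val rather than the type T:

#include <cstddef>
#include <vector>

using t_double = double;
template<typename T>
using t_vec = std::vector<T>;

class SuperPixel2
{
    t_vec<t_double> distances;
    std::size_t cols = 0; // assumed; the real class presumably has something like this

    // Defined in the header so every translation unit can instantiate it.
    template<typename T>
    void Set(t_vec<T> &v, std::size_t i, std::size_t j, const T &val)
    {
        v[i * cols + j] = val; // assign the value, not the type
    }

    void something()
    {
        t_double d = 0.0;
        std::size_t k = 0, l = 0; // placeholder indices
        Set(distances, k, l, d);
    }
};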
#include <iostream>
#include <Eigen/Core>
namespace Eigen {
// float op double -> double
template <typename BinaryOp>
struct ScalarBinaryOpTraits<float, double, BinaryOp> {
enum { Defined = 1 };
typedef double ReturnType;
};
// double op float -> double
template <typename BinaryOp>
struct ScalarBinaryOpTraits<double, float, BinaryOp> {
enum { Defined = 1 };
typedef double ReturnType;
};
}
int main() {
Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic> m1(2, 2);
m1 << 1, 2, 3, 4;
Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> m2(2, 2);
m2 << 1, 2, 3, 4;
std::cerr << m1 * m2 <<std::endl; // <- boom!!
}
I'd like to know why the above code does not compile. Here are the full error messages. Please note that if I define m1 and m2 to have fixed sizes, it works fine.
I'm using Eigen3.3.1. It's tested on a Mac running OSX-10.12 with Apple's clang-800.0.42.1.
This is because the general matrix-matrix product is highly optimized with aggressive manual vectorization, pipelining, multi-level caching, etc. This part does not support mixing float and double. You can bypass this heavily optimized implementation with m1.lazyProduct(m2), which corresponds to the implementation used for small fixed-size matrices, but there are only disadvantages to doing so: the ALUs do not support mixing float and double, so the float values have to be promoted to double anyway, and you will lose vectorization. Better to cast the float matrix to double explicitly:
m1.cast<double>() * m2
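For completeness, a minimal sketch of the question's example with that cast applied (using the convenience typedefs MatrixXf/MatrixXd; with the cast, both operands are double, so no custom ScalarBinaryOpTraits specializations are needed):

#include <iostream>
#include <Eigen/Core>

int main() {
    Eigen::MatrixXf m1(2, 2);           // float matrix
    m1 << 1, 2, 3, 4;
    Eigen::MatrixXd m2(2, 2);           // double matrix
    m2 << 1, 2, 3, 4;

    // Promote the float operand to double before the product; the result is double.
    Eigen::MatrixXd prod = m1.cast<double>() * m2;
    std::cout << prod << std::endl;
}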
I need to sum about 100000 values stored in an array, but with conditions.
Is there a way to do that in CUDA to produce fast results?
Can anyone post a small code to do that?
I think that, to perform a conditional reduction, you can directly introduce the condition as a multiplication of the addends by 0 (false) or 1 (true). In other words, suppose that the condition you would like to meet is that the addends be smaller than 10.f. In this case, borrowing the first code from Optimizing Parallel Reduction in CUDA by M. Harris, the above would mean:
__global__ void reduce0(int *g_idata, int *g_odata) {
extern __shared__ int sdata[];
// each thread loads one element from global to shared mem
unsigned int tid = threadIdx.x;
unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;
sdata[tid] = g_idata[i]*(g_idata[i]<10.f); // the condition becomes a 0/1 factor
__syncthreads();
// do reduction in shared mem
for(unsigned int s=1; s < blockDim.x; s *= 2) {
if (tid % (2*s) == 0) {
sdata[tid] += sdata[tid + s];
}
__syncthreads();
}
// write result for this block to global mem
if (tid == 0) g_odata[blockIdx.x] = sdata[0];
}
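For context, a hypothetical host-side launch (variable names and sizes are mine, not from the answer): the kernel needs the dynamic shared-memory size as the third launch parameter, and the per-block partial sums written to g_odata still have to be added up, e.g. on the host:

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void reduce0(int *g_idata, int *g_odata); // kernel defined above

int main() {
    const int N = 1 << 20, threads = 256, blocks = N / threads;
    std::vector<int> h_idata(N, 1), h_odata(blocks);

    int *d_idata, *d_odata;
    cudaMalloc(&d_idata, N * sizeof(int));
    cudaMalloc(&d_odata, blocks * sizeof(int));
    cudaMemcpy(d_idata, h_idata.data(), N * sizeof(int), cudaMemcpyHostToDevice);

    // Third launch parameter: bytes of dynamic shared memory per block (sdata[]).
    reduce0<<<blocks, threads, threads * sizeof(int)>>>(d_idata, d_odata);

    cudaMemcpy(h_odata.data(), d_odata, blocks * sizeof(int), cudaMemcpyDeviceToHost);
    long long total = 0;
    for (int b = 0; b < blocks; ++b) total += h_odata[b];
    printf("Conditional sum = %lld\n", total);

    cudaFree(d_idata);
    cudaFree(d_odata);
    return 0;
}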
If you wish to use CUDA Thrust to perform a conditional reduction, you can do the same by using thrust::transform_reduce. Alternatively, you can create a new vector d_b into which you copy all the elements of d_a satisfying the predicate with thrust::copy_if, and then apply thrust::reduce to d_b. I haven't checked which solution performs best; perhaps the second will perform better on sparse arrays. Below is an example implementing both approaches.
#include <cstdio>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <thrust/count.h>
#include <thrust/copy.h>
// --- Operator for the first approach
struct conditional_operator {
__host__ __device__ float operator()(const float a) const {
return a*(a<10.f);
}
};
// --- Operator for the second approach
struct is_smaller_than_10 {
__host__ __device__ bool operator()(const float a) const {
return (a<10.f);
}
};
int main(void)
{
int N = 20;
// --- Host side allocation and vector initialization
thrust::host_vector<float> h_a(N,1.f);
h_a[0] = 20.f;
h_a[1] = 20.f;
// --- Device side allocation and vector initialization
thrust::device_vector<float> d_a(h_a);
// --- First approach
float sum = thrust::transform_reduce(d_a.begin(), d_a.end(), conditional_operator(), 0.f, thrust::plus<float>());
printf("Result = %f\n",sum);
// --- Second approach
int N_prime = thrust::count_if(d_a.begin(), d_a.end(), is_smaller_than_10());
thrust::device_vector<float> d_b(N_prime);
thrust::copy_if(d_a.begin(), d_a.begin() + N, d_b.begin(), is_smaller_than_10());
sum = thrust::reduce(d_b.begin(), d_b.begin() + N_prime, 0.f);
printf("Result = %f\n",sum);
getchar();
}
Hello. I am writing a matrix class, but I have a problem with multiplying matrices that have different dimensions.
template< typename T, size_t Row, size_t Col >
class Matrix
{
public:
.....................................
template< typename MulT >
auto operator * (const MulT& other) -> Matrix<T, Row, other.getColNum()>
{
if (other.getRowNum() != getColNum())
throw std::logic_error("Multiplication are not possible");
Matrix<T, Row, other.getColNum()> temp;
// Some operations.
return temp; // Must return a matrix of type Matrix<T, Row, other.getColNum()>,
// but it doesn't work.
}
.....................................
}; // class Matrix
This code doesn't work. Is it possible to resolve this problem?
other.getColNum() is probably not a constexpr function and can therefore not be used as a template non-type argument.
Read up on constexpr here: http://en.cppreference.com/w/cpp/language/constexpr
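To illustrate the point (hypothetical accessors, not the original class): a static constexpr accessor can appear in a template argument when it is called through the type, while calling it through the function parameter other (a runtime reference) is not a constant expression:

#include <cstddef>

template <typename T, std::size_t Row, std::size_t Col>
class Matrix
{
public:
    static constexpr std::size_t getRowNum() { return Row; }
    static constexpr std::size_t getColNum() { return Col; }
};

int main()
{
    // The column count is recovered from the type, at compile time.
    Matrix<double, 2, Matrix<double, 3, 4>::getColNum()> m; // Matrix<double, 2, 4>
    static_assert(decltype(m)::getColNum() == 4, "known at compile time");
}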
You don't want to check if (other.getRowNum() != getColNum()) at runtime; this should be done at compile time. One way to do this is to define the operator only for the cases where the multiplication is valid. In this case:
template< typename T, size_t Row, size_t Col >
class Matrix
{
public:
.....................................
template<size_t RhsCols>
Matrix<T, Row, RhsCols> operator * (const Matrix<T, Col, RhsCols> &rhs) const
{
Matrix<T, Row, RhsCols> temp;
// Some operations.
return temp;
}
.....................................
}; // class Matrix
As a result, any attempt to multiply matrices whose dimensions do not agree will fail at compile time. For a complete example, I wrote a full matrix template a long time ago that uses very similar syntax: https://github.com/Enseed/GenericGeometry/blob/master/Matrix.h
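A minimal self-contained sketch of the idea (a stub Matrix, not the library linked above; the actual product is elided):

#include <cstddef>

template <typename T, std::size_t Row, std::size_t Col>
class Matrix
{
public:
    template <std::size_t RhsCols>
    Matrix<T, Row, RhsCols> operator*(const Matrix<T, Col, RhsCols> &rhs) const
    {
        (void)rhs;                    // product omitted in this sketch
        return Matrix<T, Row, RhsCols>();
    }
};

int main()
{
    Matrix<double, 2, 3> a;
    Matrix<double, 3, 4> b;
    auto c = a * b;                   // OK: yields Matrix<double, 2, 4>
    (void)c;
    // Matrix<double, 2, 2> d;
    // auto bad = a * d;              // would not compile: inner dimensions disagree
}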