I want to read a chunk of data which is just one frame of many frames stored in one dataset. The shape of the whole dataset is (10, 11214,3), 10 frames each frame has 11214 rows and 4 columns. Here is the file. The chunk I want to read would have the shape (11214,3). I can print the predefined array using, but I'm not sure how can I read data from a hdf5 file. Here is my code,
#include <h5xx/h5xx.hpp>
#include <boost/multi_array.hpp>
#include <iostream>
#include <vector>
#include <cstdio>
typedef boost::multi_array<int, 2> array_2d_t;
const int NI=10;
const int NJ=NI;
void print_array(array_2d_t const& array)
{
for (unsigned int j = 0; j < array.shape()[1]; j++)
{
for (unsigned int i = 0; i < array.shape()[0]; i++)
{
printf("%2d ", array[j][i]);
}
printf("\n");
}
}
void write_int_data(std::string const& filename, array_2d_t const& array)
{
h5xx::file file(filename, h5xx::file::trunc);
std::string name;
{
// --- create dataset and fill it with the default array data (positive values)
name = "integer array";
h5xx::create_dataset(file, name, array);
h5xx::write_dataset(file, name, array);
// --- create a slice object (aka hyperslab) to specify the location in the dataset to be overwritten
std::vector<int> offset; int offset_raw[2] = {4,4}; offset.assign(offset_raw, offset_raw + 2);
std::vector<int> count; int count_raw[2] = {2,2}; count.assign(count_raw, count_raw + 2);
h5xx::slice slice(offset, count);
}
}
void read_int_data(std::string const& filename)
{
h5xx::file file(filename, h5xx::file::in);
std::string name = "integer array";
// read and print the full dataset
{
array_2d_t array;
// --- read the complete dataset into array, the array is resized and overwritten internally
h5xx::read_dataset(file, name, array);
printf("original integer array read from file, negative number patch was written using a slice\n");
print_array(array);
printf("\n");
}
}
int main(int argc, char** argv)
{
std::string filename = argv[0];
filename.append(".h5");
// --- do a few demos/tests using integers
{
array_2d_t array(boost::extents[NJ][NI]);
{
const int nelem = NI*NJ;
int data[nelem];
for (int i = 0; i < nelem; i++)
data[i] = i;
array.assign(data, data + nelem);
}
write_int_data(filename, array);
read_int_data(filename);
}
return 0;
}
I'm using the h5xx — a template-based C++ wrapper for the HDF5 library link and boost library.
The datasets are stored in particles/lipids/box/positions path. The dataset name value holds the frames.
argv[0] is not what you want (arguments start at 1, 0 is the program name). Consider bounds checking as well:
std::vector<std::string> const args(argv, argv + argc);
std::string const filename = args.at(1) + ".h5";
the initialization can be done directly, without a temporary array (what is multi_array for, otherwise?)
for (size_t i = 0; i < array.num_elements(); i++)
array.data()[i] = i;
Or indeed, make it an algorithm:
std::iota(array.data(), array.data() + array.num_elements(), 0);
same with vectors:
std::vector<int> offset; int offset_raw[2] = {4,4}; offset.assign(offset_raw, offset_raw + 2);
std::vector<int> count; int count_raw[2] = {2,2}; count.assign(count_raw, count_raw + 2);
besides being a formatting mess can be simply
std::vector offset{4,4}, count{2,2};
h5xx::slice slice(offset, count);
On To The Real Question
The code has no relevance to the file. At all. I created some debug/tracing code to dump the file contents:
void dump(h5xx::group const& g, std::string indent = "") {
auto dd = g.datasets();
auto gg = g.groups();
for (auto it = dd.begin(); it != dd.end(); ++it) {
std::cout << indent << " ds:" << it.get_name() << "\n";
}
for (auto it = gg.begin(); it != gg.end(); ++it) {
dump(*it, indent + "/" + it.get_name());
}
}
int main()
{
h5xx::file xaa("xaa.h5", h5xx::file::mode::in);
dump(xaa);
}
Prints
/particles/lipids/box/edges ds:box_size
/particles/lipids/box/edges ds:step
/particles/lipids/box/edges ds:time
/particles/lipids/box/edges ds:value
/particles/lipids/box/positions ds:step
/particles/lipids/box/positions ds:time
/particles/lipids/box/positions ds:value
Now we can drill down to the dataset. Let's see whether we can figure out the correct type. It certainly is NOT array_2d_t:
h5xx::dataset ds(xaa, "particles/lipids/box/positions/value");
array_2d_t a;
h5xx::datatype detect(a);
std::cout << "type: " << std::hex << ds.get_type() << std::dec << "\n";
std::cout << "detect: " << std::hex << detect.get_type_id() << std::dec << "\n";
Prints
type: 30000000000013b
detect: 30000000000000c
That's a type mismatch. I guess I'll have to learn to read that gibberish as well...
Let's add some diagnostics:
void diag_type(hid_t type)
{
std::cout << " Class " << ::H5Tget_class(type) << std::endl;
std::cout << " Size " << ::H5Tget_size(type) << std::endl;
std::cout << " Sign " << ::H5Tget_sign(type) << std::endl;
std::cout << " Order " << ::H5Tget_order(type) << std::endl;
std::cout << " Precision " << ::H5Tget_precision(type) << std::endl;
std::cout << " NDims " << ::H5Tget_array_ndims(type) << std::endl;
std::cout << " NMembers " << ::H5Tget_nmembers(type) << std::endl;
}
int main()
{
h5xx::file xaa("xaa.h5", h5xx::file::mode::in);
// dump(xaa);
{
h5xx::group g(xaa, "particles/lipids/box/positions");
h5xx::dataset ds(g, "value");
std::cout << "dataset: " << std::hex << ds.get_type() << std::dec << std::endl;
diag_type(ds.get_type());
}
{
array_2d_t a(boost::extents[NJ][NI]);
h5xx::datatype detect(a);
std::cout << "detect: " << std::hex << detect.get_type_id() << std::dec << std::endl;
diag_type(detect.get_type_id());
}
}
Prints
dataset: 30000000000013b
Class 1
Size 4
Sign -1
Order 0
Precision 32
NDims -1
NMembers -1
detect: 30000000000000c
Class 0
Size 4
Sign 1
Order 0
Precision 32
NDims -1
NMembers -1
At least we know that HST_FLOAT (class 1) is required. Let's modify array_2d_t:
using array_2d_t = boost::multi_array<float, 2>;
array_2d_t a(boost::extents[11214][3]);
This at least makes the data appear similarly. Let's ... naively try to read:
h5xx::read_dataset(ds, a);
Oops, that predictably throws
terminate called after throwing an instance of 'h5xx::error'
what(): /home/sehe/Projects/stackoverflow/deps/h5xx/h5xx/dataset/boost_multi_array.hpp:176:read_dataset(): dataset "/particles/lipi
ds/box/positions/value" and target array have mismatching dimensions
No worries, we can guess:
using array_3d_t = boost::multi_array<float, 3>;
array_3d_t a(boost::extents[10][11214][3]);
h5xx::read_dataset(ds, a);
At least this does work. Adapting the print function:
template <typename T> void print_array(T const& array) {
for (auto const& row : array) {
for (auto v : row) printf("%5f ", v);
printf("\n");
}
}
Now we can print the first frame:
h5xx::read_dataset(ds, a);
print_array(*a.begin()); // print the first frame
This prints:
80.480003 35.360001 4.250000
37.450001 3.920000 3.960000
18.530001 -9.690000 4.680000
55.389999 74.339996 4.600000
22.110001 68.709999 3.850000
-4.130000 24.040001 3.730000
40.160000 6.390000 4.730000
-5.400000 35.730000 4.850000
36.669998 22.450001 4.080000
-3.680000 -10.660000 4.180000
(...)
That checks out with h5ls -r -d xaa.h5/particles/lipids/box/positions/value:
particles/lipids/box/positions/value Dataset {75/Inf, 11214, 3}
Data:
(0,0,0) 80.48, 35.36, 4.25, 37.45, 3.92, 3.96, 18.53, -9.69, 4.68,
(0,3,0) 55.39, 74.34, 4.6, 22.11, 68.71, 3.85, -4.13, 24.04, 3.73,
(0,6,0) 40.16, 6.39, 4.73, -5.4, 35.73, 4.85, 36.67, 22.45, 4.08, -3.68,
(0,9,1) -10.66, 4.18, 35.95, 36.43, 5.15, 57.17, 3.88, 5.08, -23.64,
(0,12,1) 50.44, 4.32, 6.78, 8.24, 4.36, 21.34, 50.63, 5.21, 16.29,
(0,15,1) -1.34, 5.28, 22.26, 71.25, 5.4, 19.76, 10.38, 5.34, 78.62,
(0,18,1) 11.13, 5.69, 22.14, 59.7, 4.92, 15.65, 47.28, 5.22, 82.41,
(0,21,1) 2.09, 5.24, 16.87, -11.68, 5.35, 15.54, -0.63, 5.2, 81.25,
(...)
The Home Stretch: Adding The Slice
array_2d_t read_frame(int frame_no) {
h5xx::file xaa("xaa.h5", h5xx::file::mode::in);
h5xx::group g(xaa, "particles/lipids/box/positions");
h5xx::dataset ds(g, "value");
array_2d_t a(boost::extents[11214][3]);
std::vector offsets{frame_no, 0, 0}, counts{1, 11214, 3};
h5xx::slice slice(offsets, counts);
h5xx::read_dataset(ds, a, slice);
return a;
}
There you have it. Now we can print any frame:
print_array(read_frame(0));
Printing the same as before. Let's try the last frame:
print_array(read_frame(9));
Prints
79.040001 36.349998 3.990000
37.250000 3.470000 4.140000
18.600000 -9.270000 4.900000
55.669998 75.070000 5.370000
21.920000 67.709999 3.790000
-4.670000 24.770000 3.690000
40.000000 6.060000 5.240000
-5.340000 36.320000 5.410000
36.369999 22.490000 4.130000
-3.520000 -10.430000 4.280000
(...)
Checking again with h5ls -r -d xaa.h5/particles/lipids/box/positions/value |& grep '(9' | head confirms:
(9,0,0) 79.04, 36.35, 3.99, 37.25, 3.47, 4.14, 18.6, -9.27, 4.9, 55.67,
(9,3,1) 75.07, 5.37, 21.92, 67.71, 3.79, -4.67, 24.77, 3.69, 40, 6.06,
(9,6,2) 5.24, -5.34, 36.32, 5.41, 36.37, 22.49, 4.13, -3.52, -10.43,
(9,9,2) 4.28, 35.8, 36.43, 4.99, 56.6, 4.09, 5.04, -23.37, 49.42, 3.81,
(9,13,0) 6.31, 8.83, 4.56, 22.01, 50.38, 5.43, 16.3, -2.92, 5.4, 22.02,
(9,16,1) 70.09, 5.36, 20.23, 11.12, 5.66, 78.48, 11.34, 6.09, 20.26,
(9,19,1) 61.45, 5.35, 14.25, 48.32, 5.35, 79.95, 1.71, 5.38, 17.56,
(9,22,1) -11.61, 5.39, 15.64, -0.19, 5.06, 80.43, 71.77, 5.29, 75.54,
(9,25,1) 35.14, 5.26, 22.45, 56.86, 5.56, 16.47, 52.97, 6.16, 20.62,
(9,28,1) 65.12, 5.26, 19.68, 71.2, 5.52, 23.39, 49.84, 5.28, 22.7,
Full Listing
#include <boost/multi_array.hpp>
#include <h5xx/h5xx.hpp>
#include <iostream>
using array_2d_t = boost::multi_array<float, 2>;
template <typename T> void print_array(T const& array)
{
for (auto const& row : array) { for (auto v : row)
printf("%5f ", v);
printf("\n");
}
}
void dump(h5xx::group const& g, std::string indent = "") {
auto dd = g.datasets();
auto gg = g.groups();
for (auto it = dd.begin(); it != dd.end(); ++it) {
std::cout << indent << " ds:" << it.get_name() << std::endl;
}
for (auto it = gg.begin(); it != gg.end(); ++it) {
dump(*it, indent + "/" + it.get_name());
}
}
array_2d_t read_frame(int frame_no) {
h5xx::file xaa("xaa.h5", h5xx::file::mode::in);
h5xx::group g(xaa, "particles/lipids/box/positions");
h5xx::dataset ds(g, "value");
array_2d_t arr(boost::extents[11214][3]);
std::vector offsets{frame_no, 0, 0}, counts{1, 11214, 3};
h5xx::slice slice(offsets, counts);
h5xx::read_dataset(ds, arr, slice);
return arr;
}
int main()
{
print_array(read_frame(9));
}
I'm playing with the constexpr keyword and coded the following simple program:
#include <iostream>
using namespace std;
template<typename T>
constexpr bool is_equal(T const* array1, T const* array2, size_t len)
{
return array1 == array2 || (len ? array1[len - 1] == array2[len - 1] && is_equal<T>(array1, array2, len - 1) : true);
}
template<typename T, size_t N1, size_t N2>
constexpr bool is_equal(T const (&array1)[N1], T const (&array2)[N2])
{
return N1 == N2 && is_equal<T>(array1, array2, N1);
}
template<typename T, size_t N>
constexpr size_t arraylength(T const (&array)[N])
{
return N;
}
constexpr size_t stringlength(char const* str, size_t len=0)
{
return str ? (*str ? stringlength(str + 1, len + 1) : len) : 0;
}
constexpr size_t min(size_t one, size_t another)
{
return one < another ? one : another;
}
constexpr size_t min_length(char const* str1, char const* str2)
{
return min(stringlength(str1), stringlength(str2));
}
template<typename T, size_t N1, size_t N2>
constexpr size_t min_length(T const (&array1)[N1], T const (&array2)[N2])
{
return min(N1, N2);
}
template<bool cond=false>
struct to_num
{
enum {value=0};
};
template<>
struct to_num<true>
{
enum {value=42};
};
template<size_t init>
struct two_times
{
enum {value=init*2};
};
static constexpr char x[]{"One string"};
static constexpr char y[]{"One string"};
static constexpr int a[]{1,2,3,4};
static constexpr int b[]{1,2,3,4};
static constexpr int const* const c = &a[0];
static constexpr int const* const d = &b[0];
int main()
{
cout << "The two literals are equal: " << to_num< is_equal("One string", "One string") >::value << endl; // COMPILES AND WORKS IN GCC BUT NOT IN CLANG
cout << "The two variables x & y are equal: " << to_num< is_equal(x, y) >::value << endl;
cout << "Pointers a & c are equal: " << to_num< a == c >::value << endl;
cout << "Pointers b & c are equal: " << to_num< b == c >::value << endl;
cout << "Pointers c & d are equal: " << to_num< c == d >::value << endl;
cout << "The contents of c & d is the same: " << to_num< is_equal(c, d, arraylength(a)) >::value << endl;
cout << "Pointers a & b are equal: " << to_num< a == b >::value << endl;
cout << "The contents of a & b is the same: " << to_num< is_equal(a, b) >::value << endl;
cout << "String x contains " << two_times< stringlength(x) >::value / 2 << " characters" << endl; // COMPILES AND WORKS IN CLANG BUT NOT IN GCC
cout << "String literal contains " << two_times< stringlength("literal") >::value / 2 << " characters" << endl; // COMPILES AND WORKS IN CLANG BUT NOT IN GCC
cout << "Array literal contains " << two_times< arraylength("literal") >::value / 2 << " values" << endl;
return 0;
}
As the comments remark, some code compiles and works ok with g++ but not with clang++ and other compiles and work ok with clang++ but not with g++
The gcc error is:
comaprison.cpp: In function ‘int main()’:
comaprison.cpp:97:62: in constexpr expansion of ‘stringlength(((const char*)(& x)), 0ul)’
comaprison.cpp:29:55: in constexpr expansion of ‘stringlength((str + 1u), (len + 1ul))’
comaprison.cpp:97:64: error: ‘((((const char*)(& x)) + 1u) != 0u)’ is not a constant expression
cout << "String x contains " << two_times< stringlength(x) >::value / 2 << " characters" << endl;
^
comaprison.cpp:97:64: note: in template argument for type ‘long unsigned int’
comaprison.cpp:98:76: in constexpr expansion of ‘stringlength(((const char*)"literal"), 0ul)’
comaprison.cpp:29:55: in constexpr expansion of ‘stringlength((str + 1u), (len + 1ul))’
comaprison.cpp:98:78: error: ‘((((const char*)"literal") + 1u) != 0u)’ is not a constant expression
cout << "String literal contains " << two_times< stringlength("literal") >::value / 2 << " characters" << endl; // COMPILES AND WORKS IN CLANG BUT NOT IN GCC
^
comaprison.cpp:98:78: note: in template argument for type ‘long unsigned int’
and the clang++ one is:
comaprison.cpp:89:55: error: non-type template argument is not a constant expression
cout << "The two literals are equal: " << to_num< is_equal("One string", "One string") >::value << ...
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
comaprison.cpp:8:19: note: subexpression not valid in a constant expression
return array1 == array2 || (len ? array1[len - 1] == array2[len - 1] && is_equal<T>(array1, array2...
^
comaprison.cpp:14:24: note: in call to 'is_equal(&"One string"[0], &"One string"[0], 11)'
return N1 == N2 && is_equal<T>(array1, array2, N1);
^
comaprison.cpp:89:55: note: in call to 'is_equal("One string", "One string")'
cout << "The two literals are equal: " << to_num< is_equal("One string", "One string") >::value << ...
^
1 error generated.
g++ is "gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)",
clang++ is "Ubuntu clang version 3.4-1ubuntu1 (trunk) (based on LLVM 3.4)"
Both in a x86_64 linux.
The command lines are:
clang++ -std=c++11 -o comaprison comaprison.cpp
g++ -std=c++11 -o comaprison comaprison.cpp
So, am I doing anything outside the c++11 standard with this code or there is something wrong in both compilers?.
Note that If I remove the g++ problematic lines but leave the clang++ problematic one, the code compiles and works with g++ and if I remove the clang++ problematic line and leave the g++ problematic ones, the code compiles and works ok with clang++
Note also that I'm using the two templated structs to force the compiler to resolve the functions at compile time.
Thank you for your time and experience.
Update:
GCC chokes on your function stringlength. Try this:
constexpr size_t stringlength(char const* str, size_t i=0)
{
return str ? (str[i] ? 1 + stringlength(str, i+1) : 0) : 0;
}
To GCC 4.6.3 this seems dubious:
static constexpr int const* c = &a[0];
static constexpr int const* d = &b[0];
Try this:
static constexpr int* c = &a[0];
static constexpr int* d = &b[0];
Old answer:
You code seems to be correct and with GCC 4.6.3 it also compiles.
int main()
{
const char* a = "hallo";
const char* b = "qallo";
std::cout << is_equal(a,b,5) << std::endl;
constexpr const char* c = "hallo";
std::cout << is_equal(a,c,5) << std::endl;
}
Be careful, that the strings you are giving to your functions are constants.
See here what is allowed with constexpr.
Your arraylength example that you say doesn't work with GCC, works just fine with GCC 4.7 and 4.8, and except for an unrelated error about constexpr const, also with GCC 4.6.
As for your stringlength example: this is a limitation of GCC. I know I've seen a bug report about it already. You can work around it by rewriting your stringlength function:
constexpr size_t stringlength(char const* str, size_t len=0)
{
return str ? (str[len] ? stringlength(str, len + 1) : len) : 0;
}
but as far as I can tell, what you had already was perfectly valid.
I wrote a cuda function for Matlab to perform a LU factorization of a batch of matrices using cublasDgetrfBatched(). The toolkit documentation of this function is here.
It works fine for matrices up to size 32x32. But it fails with status code CUBLAS_STATUS_INVALID_VALUE for bigger matrices. Below is my source code (gpuBatchedLU.cu):
#include "mex.h"
#include "gpu/mxGPUArray.h"
/* Includes, cuda */
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <string>
#include <sstream>
static std::string cublasGetErrorString(cublasStatus_t error) {
switch (error) {
case CUBLAS_STATUS_SUCCESS:
return "CUBLAS_STATUS_SUCCESS";
case CUBLAS_STATUS_NOT_INITIALIZED:
return "CUBLAS_STATUS_NOT_INITIALIZED";
case CUBLAS_STATUS_ALLOC_FAILED:
return "CUBLAS_STATUS_ALLOC_FAILED";
case CUBLAS_STATUS_INVALID_VALUE:
return "CUBLAS_STATUS_INVALID_VALUE";
case CUBLAS_STATUS_ARCH_MISMATCH:
return "CUBLAS_STATUS_ARCH_MISMATCH";
case CUBLAS_STATUS_MAPPING_ERROR:
return "CUBLAS_STATUS_MAPPING_ERROR";
case CUBLAS_STATUS_EXECUTION_FAILED:
return "CUBLAS_STATUS_EXECUTION_FAILED";
case CUBLAS_STATUS_INTERNAL_ERROR:
return "CUBLAS_STATUS_INTERNAL_ERROR";
}
return "<unknown>";
}
inline bool cublasAssert(cublasStatus_t code, const char* file, int line) {
if (code != CUBLAS_STATUS_SUCCESS) {
std::stringstream ss;
ss << "cublasAssert: " << cublasGetErrorString(code) << " in "
<< std::string(file) << ", line " << line << ".";
mexErrMsgTxt(ss.str().c_str());
}
return code == CUBLAS_STATUS_SUCCESS;
}
inline bool cudaAssert(cudaError_t code, const char* file, int line) {
if (code != cudaSuccess) {
std::stringstream ss;
ss << "cudaAssert: " << cudaGetErrorString(code) << " in "
<< std::string(file) << ", line " << line << ".";
mexErrMsgTxt(ss.str().c_str());
}
return code == cudaSuccess;
}
inline bool mexGPUAssert(int code, const char* file, int line) {
if (code != MX_GPU_SUCCESS) {
std::stringstream ss;
ss << "mexGPUAssert: could not initialize the Mathworks GPU API in "
<< std::string(file) << ", line " << line << ".";
mexErrMsgTxt(ss.str().c_str());
}
return code == MX_GPU_SUCCESS;
}
#define cublasErrchk(ans) { cublasAssert((ans), __FILE__, __LINE__); }
#define cudaErrchk(ans) { cudaAssert((ans), __FILE__, __LINE__); }
#define mxGPUErrchk(ans) { mexGPUAssert((ans), __FILE__, __LINE__); }
void mexFunction(int nlhs, mxArray *plhs[], /* Output variables */int nrhs,
const mxArray *prhs[]) /* Input variables */{
if (nrhs != 1) { /* end if not one function arguments */
mexErrMsgTxt("This function requires one input argument.");
return;
}
if (nlhs > 3) { /* take three outputs */
mexErrMsgTxt("This function takes a maximum of three output variables.");
return;
}
mxGPUErrchk(mxInitGPU());
const mxGPUArray* in1_gpu = mxGPUCreateFromMxArray(prhs[0]);
size_t ndims = mxGPUGetNumberOfDimensions(in1_gpu);
const size_t* dim = (const size_t*) mxGPUGetDimensions(in1_gpu);
if (ndims != 3) { /* end if input arguments are of different dimensions */
mexErrMsgTxt("The input argument must be a 3-dimensional array.");
return;
}
cublasHandle_t handle;
cublasErrchk(cublasCreate(&handle));
int no_matrices = dim[2];
int nrow = dim[0];
int ncol = dim[1];
int matrix_size = nrow * ncol;
size_t i;
std::stringstream ss;
ss << "dim[2] = " << dim[2] << "\nno_matrices = " << no_matrices << "\nnrow = " << nrow << "\nmatrix_size = " << nrow << " x " << ncol << " = " << matrix_size << std::endl;
mexPrintf(ss.str().c_str());
mxGPUArray* gpu_array_inout = mxGPUCopyFromMxArray(prhs[0]);
double* inout_storage = (double*) mxGPUGetData(gpu_array_inout);
size_t info_dimensions[1] = { no_matrices };
mxGPUArray* gpu_array_info = mxGPUCreateGPUArray(1, (mwSize*) info_dimensions, mxINT32_CLASS, mxREAL,
MX_GPU_INITIALIZE_VALUES);
int* out_info = (int*) mxGPUGetData(gpu_array_info);
mexPrintf("after defining gpu_array_info\n");
size_t pivot_dimensions[2] = { nrow, no_matrices };
mxGPUArray* gpu_array_pivot = mxGPUCreateGPUArray(2, (mwSize*) pivot_dimensions, mxINT32_CLASS, mxREAL,
MX_GPU_DO_NOT_INITIALIZE);
int* out_pivot = (int*) mxGPUGetData(gpu_array_pivot);
mexPrintf("after defining gpu_array_pivot\n");
double** inout_pointers_CPU = (double**) malloc(no_matrices * sizeof(double*));
for (i = 0; i < no_matrices; i++) {
inout_pointers_CPU[i] = (double*) ((char*) inout_storage + i * ((size_t) matrix_size) * sizeof(double));
}
double** inout_pointers_GPU;
cudaErrchk(cudaMalloc((void** )&inout_pointers_GPU, no_matrices * sizeof(double*)));
cudaErrchk(
cudaMemcpy(inout_pointers_GPU, inout_pointers_CPU, no_matrices * sizeof(double*), cudaMemcpyHostToDevice));
free(inout_pointers_CPU);
ss.clear();
ss << "check again before calling cublasDgetrfBatched:\nnrow = " << nrow << "\nno_matrices = " << no_matrices << std::endl;
mexPrintf(ss.str().c_str());
cublasErrchk(cublasDgetrfBatched(handle, nrow, inout_pointers_GPU, nrow, out_pivot, out_info, no_matrices));
cublasErrchk(cublasDestroy(handle));
cudaErrchk(cudaFree(inout_pointers_GPU));
if (mxIsGPUArray(prhs[0])) {
plhs[0] = mxGPUCreateMxArrayOnGPU(gpu_array_inout);
if (nlhs > 1) {
plhs[1] = mxGPUCreateMxArrayOnGPU(gpu_array_pivot);
if (nlhs > 2) {
plhs[2] = mxGPUCreateMxArrayOnGPU(gpu_array_info);
}
}
} else {
plhs[0] = mxGPUCreateMxArrayOnCPU(gpu_array_inout);
if (nlhs > 1) {
plhs[1] = mxGPUCreateMxArrayOnCPU(gpu_array_pivot);
if (nlhs > 2) {
plhs[2] = mxGPUCreateMxArrayOnCPU(gpu_array_info);
}
}
}
mxGPUDestroyGPUArray(gpu_array_inout);
mxGPUDestroyGPUArray(gpu_array_pivot);
mxGPUDestroyGPUArray(gpu_array_info);
mxFree((void*) dim);
return;
}
I compile as follows:
mex -L/usr/local/cuda/lib64 -lcudart -lcublas gpuBatchedLU.cu
And I call from MATLAB:
[a1,b1,c1]=gpuBatchedLU(randn(32,32,5)); %no problem
[a2,b2,c2]=gpuBatchedLU(randn(33,33,5)); %produces CUBLAS_STATUS_INVALID_VALUE
I use Matlab R2013b with the parallel toolbox, Cuda 5.5, and a NVS 5200M graphics chip.
Can anyone replicate this problem? I would appreciate any suggestions on how to solve this problem.
The problem seems to be with Matlab R2013b using libcublas.so in version 5.0. The file link is in /MATLAB/R2013b/bin/glnxa64/. Once I changed the link to the libcublas.so of my Cuda 5.5 installation it worked fine.
Xcode gives me 3 "Apple Mach-O Linker (Id) Error" errors. But, when I click them it doesn't direct me to a line in my code, so I don't know what/where the problem is. I know others have asked this question, but all of the solutions that I could find were specific to each individual's code. I am learning C++, so these errors are coming as part of a beginner program I'm working on.
Apple Mach-O Linker (Id) Error
"SensorNode::SensorNode(char*, float, float, float, int, float)", referenced from:
Apple Mach-O Linker (Id) Error
"LOCATION::LOCATION()", referenced from:
Apple Mach-O Linker (Id) Error
Linker command failed with exit code 1 (use -v to see invocation)
If it helps, I'll put my code here:
here's my "sensor_node.h"
#ifndef SENSORNODE_H
#define SENSORNODE_H
#include <iostream>
class LOCATION {
float lat, longi, height;
public:
LOCATION ();
void setx(float xx);
void sety(float yy);
void setz(float zz);
void print();
};
class SensorNode {
char* NodeName;
int NodeID;
LOCATION Node1;
float batt;
int func;
public:
SensorNode(char *n, float x, float y, float z, int i, float ah);
void print();
void setOK(int o);
int getOK();
void setLOC(float longi, float lat, float h);
};
#endif /* defined(__Project_3__sensor_node__) */
here's my sensor_node.cpp:
#include "sensor_node.h"
//#include <iostream>
using namespace std;
void LOCATION::setx(float xx) {
lat = xx;
if (lat > 180.0 || lat < -180.0) {
cout << "Latitude is not in the -180 to 180 degree range";
lat = 0.0;
}
}
void LOCATION::sety(float yy) {
longi = yy;
if (longi > 180.0 || longi < -180.0) {
cout << "Latitude is not in the -180 to 180 degree range";
longi = 0.0;
}
}
void LOCATION::setz(float zz) {
height = zz;
}
void LOCATION::print() {
cout << "(LONGITUDE: " << longi << " ,LATITUDE: " << lat << " ,HEIGHT: " << height << " )";
}
and here's my main.cpp:
#include <iostream>
using namespace std;
#include "sensor_node.h"
int main() {
LOCATION a; SensorNode s1("Pulse",15.9,-30.1,0,157,2.0);
cout << "Beginning LOCATION tests.\n\n";
cout << " After initial construction: ";
a.print();
cout << "\n";
a.setx(-45.3);
a.sety(27.6);
a.setz(3.5);
cout << " After setting x/y/z to -45.3/27.6/3.5: ";
a.print();
cout << "\n";
cout << " After attempting to set longitude to 180.1: ";
a.setx(180.1);
a.print();
cout << "\n";
cout << " After attempting to set longitude to -180.1: ";
a.setx(-180.1);
a.print();
cout << "\n";
cout << " After attempting to set latitude to 180.1: ";
a.sety(180.1);
a.print();
cout << "\n";
cout << " After attempting to set latitude to -180.1: ";
a.sety(-180.1);
a.print();
cout << "\n";
/*
cout << "\n\n\n\nBeginning sensor node tests.\n\n";
cout << " After initial construction:";
s1.print();
cout << "\n Printing the value returned by getOK: " << s1.getOK();
cout << "\n After changing location to 20/30/40:";
s1.setLOC(20,30,40);
s1.print();
cout << "\n After trying to set location illegally:";
s1.setLOC(181, -181, 10);
s1.print();
cout << "\n Node fails, then try to change location:";
s1.setOK(0);
s1.setLOC(5,10,15);
s1.print();
cout << "\n Printing the value returned by getOK: " << s1.getOK();
cout << "\n\n\n End of tests.\n";
cout << "Enter an integer to quit: ";
cin >> hold;
*/
return 0;
}
You haven't written the constructor for the LOCATION class. You declare a LOCATION named a and a SensorNode which contains a LOCATION, but the linker can't figure out where the code for the LOCATION constructor is, so it can't link. Write a constructor for the LOCATION class and you should be good.
You seem to have forgotten to implement SensorNode. In SensorNode.h you declare a class, SensorNode which has data and public methods, but in SensorNode.cpp you are not providing an implementation of SensorNode (constructor), print etc. The linker is unable to find implementations of these since they haven't been implemented, hence the linker error.
Here's some boilerplate you can start out with:
SensorNode::SensorNode(char *n, float x, float y, float z, int i, float ah)
{
}
void SensorNode::print()
{
}
void SensorNode::setOK(int o)
{
}
int SensorNode::getOK()
{
}
void SensorNode::setLOC(float longi, float lat, float h)
{
}