I have encountered a problem copying an element of a nested std::vector to another std::vector.
Example 1
std::vector<std::vector<int>> foo;
std::vector<int> temp;
std::vector<int> goo;
goo = foo[0]; //error
Example 2
for (size_t i = 0; i < foo[0].size(); i++) { temp.push_back(foo[0][i]); }
goo = temp; //error
So, may I know where the problem is, and what I should do to copy an element of a nested vector to another vector?
The actual scenario is that I have a nested vector of cv::Point:
std::vector<std::vector<cv::Point>> found_contour;
and I would like to copy each inner std::vector<cv::Point> into the contour member of a struct:
struct Contours
{
    std::vector<cv::Point> contour;
    cv::RotatedRect minRect;
    cv::RotatedRect minEllipse;
};
Code Snippet:
std::vector<Contours> contour_struct;
for (size_t i = 0; i < found_contour.size(); i++)
{
    contour_struct[i].contour = found_contour[i];
    contour_struct[i].minRect = cv::minAreaRect(cv::Mat(found_contour[i]));
}
std::vector<Contours> contour_struct;
contour_struct.reserve(found_contour.size()); //<-----problem
for (size_t i = 0; i < found_contour.size(); i++)
{
    contour_struct[i].contour = found_contour[i];
    contour_struct[i].minRect = cv::minAreaRect(cv::Mat(found_contour[i]));
}
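vector::reserve only acquires capacity internally so that later push_back calls do not have to reallocate. It does not add any objects to the vector, so contour_struct is still empty and contour_struct[i] indexes past its end. You can use this line instead:
contour_struct.resize(found_contour.size());
which will make sure that contour_struct is the right size.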
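To make the difference between the two calls concrete, here is a minimal sketch without OpenCV. Item is a hypothetical stand-in for the Contours struct (int instead of cv::Point), and the two function names are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Contours, using int instead of cv::Point.
struct Item {
    std::vector<int> contour;
};

// resize() creates the elements, so operator[] is valid afterwards.
std::vector<Item> copyWithResize(const std::vector<std::vector<int>>& found) {
    std::vector<Item> out;
    out.resize(found.size());
    for (std::size_t i = 0; i < found.size(); i++) {
        out[i].contour = found[i];
    }
    return out;
}

// reserve() only sets the capacity, so elements must be appended with push_back.
std::vector<Item> copyWithReserve(const std::vector<std::vector<int>>& found) {
    std::vector<Item> out;
    out.reserve(found.size());
    for (const std::vector<int>& inner : found) {
        Item item;
        item.contour = inner;
        out.push_back(item);
    }
    return out;
}
```

Either version should work; the reserve plus push_back form avoids default-constructing elements that are overwritten right away.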
To access a single pixel, I am using this line of code, and it works:
int intensity = gray_image.at<uchar>(Point(100, 100));
However, when I use this code to access all the pixels in the image, it gives a memory error:
for (int i = 0; i < gray_image.rows; i++) {
    for (int j = 0; j < gray_image.cols; j++) {
        intensity += gray_image.at<uchar>(Point(i, j));
    }
}
When I run the above code, it compiles without errors but throws a memory exception. Where am I going wrong?
You can just skip the use of Point and do the following.
for (int i = 0; i < gray_image.rows; i++) {
    for (int j = 0; j < gray_image.cols; j++) {
        intensity += gray_image.at<uchar>(i, j);
    }
}
You're requesting a pixel (j,i) that doesn't exist: Point(i, j) is read as column i, row j. This wouldn't have been an error in a square image (where the number of rows equals the number of columns), but you're using a rectangular image.
The Mat::at function has multiple prototypes, the two that you're concerned with are:
C++: template<typename T> T& Mat::at(int i, int j)
C++: template<typename T> T& Mat::at(Point pt)
The documentation for Mat::at states that Point pt is defined as the Element position specified as Point(j,i), so you've effectively swapped your rows and columns.
This goes wrong because the image is stored as a 1D array of pixels. For a continuous single-channel image, the pixel at row r, column c is found at offset p = r * image.cols + c, so a swapped row and column can end up past the end of the buffer.
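The index arithmetic can be sketched without OpenCV. Pt and GrayImage below are hypothetical stand-ins, not OpenCV types; they only mimic the convention that a point is (x, y) = (column, row), while the two-argument accessor takes (row, column).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration; these are not OpenCV types.
struct Pt {
    int x;  // column
    int y;  // row
};

struct GrayImage {
    int rows;
    int cols;
    std::vector<unsigned char> data;  // row-major, rows * cols entries

    GrayImage(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0) {}

    // (row, col) access with an explicit bounds check.
    unsigned char& at(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw std::out_of_range("pixel outside image");
        }
        return data[static_cast<std::size_t>(row) * cols + col];
    }

    // Point access: x is the column, y is the row.
    unsigned char& at(Pt p) { return at(p.y, p.x); }
};

// Sums every pixel using (row, col) order.
long sumPixels(GrayImage& img) {
    long total = 0;
    for (int i = 0; i < img.rows; i++) {
        for (int j = 0; j < img.cols; j++) {
            total += img.at(i, j);
        }
    }
    return total;
}
```

With a 2 x 5 image, at(1, 4) and at(Pt{4, 1}) name the same pixel, while at(Pt{1, 4}) asks for row 4 and throws in this sketch.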
In spite of reading many answers to similar questions on SO, I am not able to figure out a solution for my case. I have written the following code to implement a Thrust program. The program performs a simple copy and display operation.
#include <stdio.h>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
int main(void)
{
    // H has storage for 4 integers
    thrust::host_vector<int> H(4);
    H[0] = 14;
    H[1] = 20;
    H[2] = 38;
    H[3] = 46;
    // H.size() returns the size of vector H
    printf("\nSize of vector : %d",(int)H.size());
    printf("\nVector Contents : ");
    for (int i = 0; i < H.size(); ++i) {
        printf("\t%d",H[i]);
    }
    thrust::device_vector<int> D = H;
    printf("\nDevice Vector Contents : ");
    for (int i = 0; i < D.size(); i++) {
        printf("%d",D[i]); //This is where I get the warning.
    }
    return 0;
}
Thrust implements certain operations to facilitate using elements of a device_vector in host code, but this apparently isn't one of them.
There are many approaches to addressing this issue. The following code demonstrates 3 possible approaches:
1. Explicitly copy D[i] to a host variable; Thrust has an appropriate conversion defined for that.
2. Copy the Thrust device_vector back to a host_vector before printing.
3. Use thrust::copy to copy the elements of the device_vector directly to a stream.
#include <stdio.h>
#include <iostream>
#include <iterator>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>
int main(void)
{
    // H has storage for 4 integers
    thrust::host_vector<int> H(4);
    H[0] = 14;
    H[1] = 20;
    H[2] = 38;
    H[3] = 46;
    // H.size() returns the size of vector H
    printf("\nSize of vector : %d",(int)H.size());
    printf("\nVector Contents : ");
    for (int i = 0; i < H.size(); ++i) {
        printf("\t%d",H[i]);
    }
    thrust::device_vector<int> D = H;
    printf("\nDevice Vector Contents : ");
    //method 1
    for (int i = 0; i < D.size(); i++) {
        int q = D[i];
        printf("\t%d",q);
    }
    printf("\n");
    //method 2
    thrust::host_vector<int> Hnew = D;
    for (int i = 0; i < Hnew.size(); i++) {
        printf("\t%d",Hnew[i]);
    }
    printf("\n");
    //method 3
    thrust::copy(D.begin(), D.end(), std::ostream_iterator<int>(std::cout, ","));
    std::cout << std::endl;
    return 0;
}
Note that with methods like these, Thrust generates various kinds of device-to-host copy operations behind the scenes to make the device_vector usable in host code. This has performance implications, so for large vectors you might want to use the explicit copy operations.
I am a lazy programmer. I want to use C++ vector to create a multidimensional array. For example, this code creates a 3x2 2D array:
int nR = 3;
int nC = 2;
vector<vector<double> > array2D(nR);
for(int r = 0; r < nR; r++)
    array2D[r].resize(nC, 0);
However, I am too lazy to
declare array2D's data type: vector<vector<double> >
C++ auto could solve this problem.
However, I am too lazy to
write loop(s) to allocate the space(s) for each object like array2D.
Writing a function could solve this problem.
However, I am too lazy to
write each function for each N-dimensional array.
write nested N-1 loops for allocating spaces.
write each function for each data type.
The C++11 variadic template with function recursion could solve this problem.
Is it possible ...?
This is what you want. (Tested on Microsoft Visual C++ 2013 Update 1)
#include <iostream>
#include <vector>
using namespace std;
template<class elemType> inline vector<elemType> getArrayND(int dim) {
    // Allocate space and initialize all elements to 0s.
    return vector<elemType>(dim, 0);
}

template<class elemType, class... Dims> inline auto getArrayND(
    int dim, Dims... resDims
) -> vector<decltype(getArrayND<elemType>(resDims...))> {
    // Allocate space for this dimension.
    auto parent = vector<decltype(getArrayND<elemType>(resDims...))>(dim);
    // Recursive to next dimension.
    for (int i = 0; i < dim; i++) {
        parent[i] = getArrayND<elemType>(resDims...);
    }
    return parent;
}

int main() {
    auto test3D = getArrayND<double>(2, 3, 4);
    auto test4D = getArrayND<double>(2, 3, 4, 2);
    test3D[0][0][1] = 3;
    test4D[1][2][3][1] = 5;
    cout << test3D[0][0][1] << endl;
    cout << test4D[1][2][3][1] << endl;
    return 0;
}
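The trailing return type above refers to getArrayND inside its own declaration, which compilers other than the one it was tested on may reject for three or more dimensions. A possible variation, offered only as a sketch, computes the nested type with a small helper template instead. NDVector and makeArrayND are names made up for this example.

```cpp
#include <cstddef>
#include <vector>

// NDVector<T, N>::type is std::vector nested N levels deep around T.
template <class T, std::size_t N>
struct NDVector {
    using type = std::vector<typename NDVector<T, N - 1>::type>;
};

template <class T>
struct NDVector<T, 0> {
    using type = T;
};

// Base case: a 1D vector of value-initialized elements.
template <class T>
std::vector<T> makeArrayND(std::size_t dim) {
    return std::vector<T>(dim, T());
}

// Recursive case: at least two dimensions, so it never competes with the base case.
template <class T, class... Dims>
typename NDVector<T, sizeof...(Dims) + 2>::type
makeArrayND(std::size_t dim, std::size_t next, Dims... rest) {
    using Result = typename NDVector<T, sizeof...(Dims) + 2>::type;
    // Build one sub-array and let the vector constructor copy it dim times.
    return Result(dim, makeArrayND<T>(next, rest...));
}
```

Usage is the same as before: auto a = makeArrayND<double>(2, 3, 4); gives a 2 x 3 x 4 array of zeros, and a[1][2][3] = 5; changes a single element.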
Can anybody suggest the best way to parallelize this using OpenMP? The program gets aborted when I run this code.
void grayerode(int **img, int height, int width, int filterheight,
               int filterwidth, int iterations, int pixrange)
{
    int maxlabel=0;
    int fh, fw, iters, pixval=0, i, j, s, k;
    int fhlimit = filterheight/2;
    int fwlimit = filterwidth/2;
    int **smoothedlabels;
    allocate_2D_int_matrix ( &smoothedlabels, height, width );
    #pragma omp parallel for shared(smoothedlabels,height,width,k)
    for (i=0; i<height; i++)
        for (j=0; j<width; j++)
            smoothedlabels[i][j] = img[i][j];
    int *labeltemp = (int *)malloc(pixrange*sizeof(int));
    for (s=0; s<pixrange; s++)
        labeltemp[s] = 0;
    for (iters=0; iters<iterations; iters++) {
        #pragma omp parallel for private(i,j,labeltemp)
        for (i=fhlimit; i<height-fhlimit; i++) {
            for (j=fwlimit; j<width-fwlimit; j++) {
                for (fh=-fhlimit; fh<=fhlimit; fh++)
                    for (fw=-fwlimit; fw<=fwlimit; fw++) {
                        labeltemp[img[i+fh][j+fw]]++;
                    }
                for (s=0; s<pixrange; s++) {
                    if (labeltemp[s]>maxlabel) {
                        maxlabel = labeltemp[s];
                        pixval = s;
                    }
                }
                smoothedlabels[i][j] = pixval;
                for (s=0; s<pixrange; s++)
                    labeltemp[s] = 0;
                maxlabel = 0;
            }
        }
        for (i=0; i<height; i++)
            for (j=0; j<width; j++)
                img[i][j] = smoothedlabels[i][j];
    }
    free_2D_int_matrix ( &smoothedlabels );
}
A few things:
You are not declaring private variables correctly. One example of doing it the correct way in your code:
#pragma omp parallel for private(i,j) shared(smoothedlabels, img, width, height)
for(i=0; i<height; i++)
    for(j=0; j<width; j++)
        smoothedlabels[i][j] = img[i][j];
It is important that j is private; otherwise every thread will try to change the same j, giving you unexpected behaviour. (Note: i, as the loop variable of the parallelized loop, is implicitly private, but I always prefer to state it explicitly for better readability.)
Try to avoid 2D arrays (arrays of row pointers), because with them a parallel for only divides the outer loop among the threads. If smoothedlabels and img were stored as flat 1D arrays of width * height elements, the same example would become:
#pragma omp parallel for private(i) shared(width, height, smoothedlabels, img)
for(i=0; i<width * height; i++)
    smoothedlabels[i] = img[i];
This divides all width * height iterations among the threads rather than just the rows. You can lay out your 1D array either column-wise or row-wise.
The same thing goes for the rest of the loops: just apply the same concept.
Later in your code for example, you have the following:
for (fh=-fhlimit; fh<=fhlimit; fh++)
for (fw=-fwlimit; fw<=fwlimit; fw++) {
If you do not declare fh and fw private, you will get unexpected behaviour, for the same reason that leaving j shared would.
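One way to sidestep most of the private/shared bookkeeping is to declare every per-iteration variable inside the parallel loop, which makes it private automatically. The sketch below assumes the window loop counts labels into a histogram and picks the most frequent one, that the image is a flat row-major array, and that every label lies in [0, pixrange). The name modeFilter is made up for this example. Also note that private(labeltemp) gives each thread its own uninitialized copy of the pointer, which could well explain the abort; here each thread gets its own counts vector instead.

```cpp
#include <vector>

// One pass of a "most frequent label" window filter over a flat, row-major image.
// Assumes every label in img lies in [0, pixrange). Border pixels are copied unchanged.
std::vector<int> modeFilter(const std::vector<int>& img, int height, int width,
                            int filterheight, int filterwidth, int pixrange) {
    const int fhlimit = filterheight / 2;
    const int fwlimit = filterwidth / 2;
    std::vector<int> out(img);

    #pragma omp parallel for
    for (int i = fhlimit; i < height - fhlimit; i++) {
        // Declared inside the parallel loop, so each thread has its own histogram.
        std::vector<int> counts(pixrange);
        for (int j = fwlimit; j < width - fwlimit; j++) {
            counts.assign(pixrange, 0);
            for (int fh = -fhlimit; fh <= fhlimit; fh++) {
                for (int fw = -fwlimit; fw <= fwlimit; fw++) {
                    counts[img[(i + fh) * width + (j + fw)]]++;
                }
            }
            // Pick the label with the highest count (lowest label wins ties).
            int maxcount = 0;
            int pixval = 0;
            for (int s = 0; s < pixrange; s++) {
                if (counts[s] > maxcount) {
                    maxcount = counts[s];
                    pixval = s;
                }
            }
            out[i * width + j] = pixval;
        }
    }
    return out;
}
```

Each thread writes only to its own rows of out and reads only from img, so no two threads touch the same element. Without OpenMP enabled, the pragma is ignored and the function runs serially.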
The environment: I am working in a proprietary scripting language where there is no such thing as a user-defined function. I have various loops and local variables of primitive types that I can create and use.
I have two related arrays, "times" and "values". They both contain floating point values. I want to numerically sort the "times" array but have to be sure that the same operations are applied on the "values" array. What's the most efficient way I can do this without the benefit of things like recursion?
You could maintain an index table and sort the index table instead.
This way you will not have to worry about times and values being consistent.
And whenever you need a sorted value, you can lookup on the sorted index.
And if in the future you decided there was going to be a third value, the sorting code will not need any changes.
Here's a sample in C#, but it shouldn't be hard to adapt to your scripting language:
static void Main() {
    var r = new Random();
    // initialize random data
    var index = new int[10]; // the index table
    var times = new double[10]; // times
    var values = new double[10]; // values
    for (int i = 0; i < 10; i++) {
        index[i] = i;
        times[i] = r.NextDouble();
        values[i] = r.NextDouble();
    }
    // a naive bubble sort
    for (int i = 0; i < 10; i++)
        for (int j = 0; j < 10; j++)
            // compare time value at current index
            if (times[index[i]] < times[index[j]]) {
                // swap index value (times and values remain unchanged)
                var temp = index[i];
                index[i] = index[j];
                index[j] = temp;
            }
    // check if the result is correct
    for (int i = 0; i < 10; i++)
        Console.WriteLine("{0} {1}", times[index[i]], values[index[i]]);
}
Note: I used a naive bubble sort there, so watch out. In your case, an insertion sort is probably a good candidate, since you don't want complex recursion.
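For comparison, here is a rough C++ sketch of the same index-table idea with an insertion sort. The function name sortedIndex is made up; in a language without user-defined functions, the loop body would simply be written inline.

```cpp
#include <cstddef>
#include <vector>

// Returns an index table such that times[index[0]] <= times[index[1]] <= ...
// The times array itself is never modified.
std::vector<std::size_t> sortedIndex(const std::vector<double>& times) {
    std::vector<std::size_t> index(times.size());
    for (std::size_t i = 0; i < index.size(); i++) {
        index[i] = i;
    }
    // Insertion sort on the index table, comparing the times it points to.
    for (std::size_t i = 1; i < index.size(); i++) {
        std::size_t current = index[i];
        std::size_t j = i;
        while (j > 0 && times[index[j - 1]] > times[current]) {
            index[j] = index[j - 1];
            j--;
        }
        index[j] = current;
    }
    return index;
}
```

After the call, times[index[k]] and values[index[k]] give the k-th pair in time order, and equal times keep their original relative order.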
Just take your favourite sorting algorithm (e.g. Quicksort or Mergesort) and use it to sort the "times" array. Whenever two elements are swapped in "times", also swap the elements with the same indices in the "values" array.
So basically you can take any fast sorting algorithm and modify the swap() operation so that elements in both arrays are swapped.
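As a small illustration of the lockstep swap, here is a sketch using an insertion sort, chosen only because it needs no recursion. It assumes both arrays have the same length, and the function name sortInLockstep is made up.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts times in ascending order and applies every swap to values as well.
// Assumes both vectors have the same length.
void sortInLockstep(std::vector<double>& times, std::vector<double>& values) {
    for (std::size_t i = 1; i < times.size(); i++) {
        // Move element i left until it is in place, swapping both arrays together.
        for (std::size_t j = i; j > 0 && times[j - 1] > times[j]; j--) {
            std::swap(times[j - 1], times[j]);
            std::swap(values[j - 1], values[j]);
        }
    }
}
```

Any comparison sort can be adapted the same way, as long as the comparison looks only at times and every move is mirrored in values.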
Take a look at the Bottom-Up mergesort at Algorithmist. It's a non-recursive way of performing a mergesort. The version presented there uses function calls, but that can be inlined easily enough.
Like martinus said, every time you change a value in one array, do the exact same thing in the parallel array.
Here's a C-like version of a stable mergesort that uses no recursion and makes no function calls.
const int arrayLength = 40;
float times_array[arrayLength];
float values_array[arrayLength];
// Fill the two arrays....
// Allocate two buffers
float times_buffer[arrayLength];
float values_buffer[arrayLength];
int blockSize = 1;
while (blockSize <= arrayLength) {
    int i = 0;
    while (i < arrayLength-blockSize) {
        int begin1 = i;
        int end1 = begin1 + blockSize;
        int begin2 = end1;
        int end2 = begin2 + blockSize;
        // The last block may be shorter than blockSize
        if (end2 > arrayLength)
            end2 = arrayLength;
        int bufferIndex = begin1;
        // Merge the two blocks, comparing times only
        while (begin1 < end1 && begin2 < end2) {
            if ( times_array[begin1] > times_array[begin2] ) {
                times_buffer[bufferIndex] = times_array[begin2];
                values_buffer[bufferIndex++] = values_array[begin2++];
            } else {
                times_buffer[bufferIndex] = times_array[begin1];
                values_buffer[bufferIndex++] = values_array[begin1++];
            }
        }
        // Copy whatever is left of either block
        while ( begin1 < end1 ) {
            times_buffer[bufferIndex] = times_array[begin1];
            values_buffer[bufferIndex++] = values_array[begin1++];
        }
        while ( begin2 < end2 ) {
            times_buffer[bufferIndex] = times_array[begin2];
            values_buffer[bufferIndex++] = values_array[begin2++];
        }
        // Copy the merged run back into the arrays
        for (int k = i; k < end2; ++k) {
            times_array[k] = times_buffer[k];
            values_array[k] = values_buffer[k];
        }
        i += 2 * blockSize;
    }
    blockSize *= 2;
}
I wouldn't suggest writing your own sorting routine, as the sorting routines provided as part of the Java language are well optimized.
The way I'd solve this is to copy the code in the java.util.Arrays class into your own class, e.g. org.mydomain.util.Arrays, and add some comments telling yourself not to use the class except when you must have the additional functionality that you're going to add. The Arrays class is quite stable, so this is less risky than it would seem, but it's still less than ideal. However, the methods you need to change are private, so you've no real choice.
You then want to create an interface along the lines of:
public static interface SwapHook {
    void swap(int a, int b);
}
You then need to add this as a parameter to the sort method you're going to use, and to every subordinate method called in the sorting procedure that swaps elements in your primary array. You arrange for the hook to get called by your modified sorting routine, and you can then implement the SwapHook interface to achieve the behaviour you want in any secondary (e.g. parallel) arrays.