Efficient histogram computation from image ROIs - performance

I am looking for existing functions or tools to compute standard Bag of Visual Words histograms from multiple ROIs (Regions of Interest) in an image. Let me explain:
(1) Suppose you have an image where each "pixel" carries an integer 1 ... K.
Each such "pixel" has the following information:
x,y coordinate
a value from 1 to K
(2) Suppose a LARGE number of fixed-size regions are sampled from the whole image in the format:
(x1,y1) - top,left coordinate
(x2,y2) - bottom,right coordinate
(3) For every region: compute a K-bin histogram that counts the number of occurrences of the "pixel" values that fall in that region.
I have implemented the following function in MATLAB, but because of the multiple for loops in the code it is very slow:
function [H words] = sph_roi( wind, tree, desc, feat, bins )
% FUNCTION computes an SPH histogram for a collection of windows. Spatial
% information is captured by splitting the window in bins horizontally.
% [H words] = sph_roi( obj_wind, tree, desc, feat, [ bins ] );
% wind - sampled ROI windows
% [left_x, top_y, right_x, bottom_y] - see sample_roi()
% tree - vocabulary tree
% desc - descriptors matrix
% feat - features matrix
% bins - number of horizontal cells (1=BOVW, 2... SPH)
% by default set to the multiples of window height.
% H - SPH histograms
% words - word IDs found for every descriptor
verbose = 0;
% input argument number check
if nargin < 4
    error( 'At least 4 input arguments required.' );
end
% default number of horizontal cells
if nargin < 5
    bins = -1; % will be set in multiples of each window height corresp.
end
% number of windows
num_wind = size( wind, 1 );
% number of visual words
num_words = tree.K;
% pre-compute all visual words
words = vl_hikmeanspush( tree, desc );
% initialize SPH histograms matrix
H = zeros( num_words * bins, num_wind );
% compute BOVW for each ROI
for i = 1 : num_wind
    if verbose == 1
        fprintf( 'sph_roi(): processing %d / %d\n', i, num_wind );
    end
    % pick a window
    wind_i = wind( i, : );
    % get the dimensions of the window
    [w h] = wind_size( wind_i );
    % if was not set - the number of horizontal bins
    if bins == -1
        bins = round( w / h );
    end
    % return a list of subcell windows
    scw = create_sph_wind( wind_i, bins );
    for j = 1 : bins
        % pick a cell
        wind_tmp = scw( j, : );
        % get the descriptor ids falling in that cell
        ids = roi_feat_ids( wind_tmp, feat );
        % compute the BOVW histogram for the current cell
        h = vl_hikmeanshist( tree, words(ids) );
        % assemble the SPH histogram in the output matrix directly
        H( 1+(j-1)*num_words : j*num_words, i ) = h( 2:end );
    end
end
function ids = roi_feat_ids( w, f )
% FUNCTION returns those feature ids that fall in the window.
% ids = roi_feat_ids( w, f );
% w - window
% f - all feature points
% ids - feature ids
% input argument number check
if nargin ~= 2
    error( 'Two input arguments required.' );
end
left_x = 1;
top_y = 2;
right_x = 3;
bottom_y = 4;
% extract and round the interest point coordinates
x = round( f(1,:) );
y = round( f(2,:) );
% bound successively the interest points
s1 = ( x > w(left_x) ); % larger than left_x
s2 = ( x < w(right_x) ); % smaller than right_x
s3 = ( y > w(top_y) ); % larger than top_y
s4 = ( y < w(bottom_y) ); % smaller than bottom_y
% intersection of these 4 sets are the ROI enclosed interest points
ids = s1 & s2 & s3 & s4;
% convert ids to real
ids = find( ids );
I've looked at routines offered by OpenCV and even Intel's MKL but found nothing appropriate. Using MATLAB's profiler, I found that considerable time is spent in roi_feat_ids(), and the outer loop over each region in sph_roi() is slow too. Before trying to implement a MEX file, I would like to see whether I could reuse some existing code.

There are a few things that I would do to speed this up.
The very last line (ids = find( ids );) should be removed. Logical masks are much faster than find, and they work in almost every case where a find statement would. I suspect this will speed up your function considerably, at no loss of functionality or readability.
It might be quicker if you combined some of the s1, s2, s3, and s4 statements.
Try not to create large temporary variables inside the for loop unless they are required. Specifically, I would merge two lines into one: ids = roi_feat_ids( scw( j, : ), feat );
The latter two might save you a bit of time, but the first should be a huge time saver. Good luck!
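
A minimal sketch of the first two suggestions, assuming toy data in place of the real feat and words. The values below are made up, and accumarray is swapped in for vl_hikmeanshist purely for illustration (it counts flat word ids and does not reproduce the vocabulary-tree histogram). The four comparisons are combined into a single logical mask, which then indexes words directly without find:

```matlab
% Hypothetical toy data: 2xN feature coordinates and one word id (1..K) per feature
feat  = [1 2 3 4 5; 1 2 3 4 5];
words = [1 2 2 3 1];
K     = 3;
wind  = [0 0 4 4; 2 2 6 6];   % one row per ROI: [left_x, top_y, right_x, bottom_y]

% Round the coordinates once, outside the loop over windows
x = round(feat(1, :));
y = round(feat(2, :));

H = zeros(K, size(wind, 1));
for i = 1:size(wind, 1)
    % One combined logical mask instead of s1..s4 followed by find
    mask = x > wind(i, 1) & x < wind(i, 3) & y > wind(i, 2) & y < wind(i, 4);
    % Count how often each word id occurs among the masked features
    H(:, i) = accumarray(words(mask).', 1, [K 1]);
end
```

Rounding the coordinates once, rather than inside roi_feat_ids on every call, also removes repeated work from the inner loop.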


Octave: function doesn't return expected value?

This code is a programming assignment for Andrew Ng's machine learning course.
The function is expecting a row vector [J grad]. The code computes J (albeit wrongly, but that's not the issue here), and I put in a dummy value for grad (because I haven't written the code to compute it yet). When I run the code, it only outputs ans as a scalar with the value of J. Where did grad go?
function [J grad] = nnCostFunction(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
% [J grad] = NNCOSTFUNCTON(nn_params, hidden_layer_size, num_labels, ...
% X, y, lambda) computes the cost and gradient of the neural network. The
% parameters for the neural network are "unrolled" into the vector
% nn_params and need to be converted back into the weight matrices.
% The returned parameter grad should be a "unrolled" vector of the
% partial derivatives of the neural network.
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
% Setup some useful variables
m = size(X, 1);
% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
% following parts.
% Part 1: Feedforward the neural network and return the cost in the
% variable J. After implementing Part 1, you can verify that your
% cost function computation is correct by verifying the cost
% computed in ex4.m
% Part 2: Implement the backpropagation algorithm to compute the gradients
% Theta1_grad and Theta2_grad. You should return the partial derivatives of
% the cost function with respect to Theta1 and Theta2 in Theta1_grad and
% Theta2_grad, respectively. After implementing Part 2, you can check
% that your implementation is correct by running checkNNGradients
% Note: The vector y passed into the function is a vector of labels
% containing values from 1..K. You need to map this vector into a
% binary vector of 1's and 0's to be used with the neural network
% cost function.
% Hint: We recommend implementing backpropagation using a for-loop
% over the training examples if you are implementing it for the
% first time.
% Part 3: Implement regularization with the cost function and gradients.
% Hint: You can implement this around the code for
% backpropagation. That is, you can compute the gradients for
% the regularization separately and then add them to Theta1_grad
% and Theta2_grad from Part 2.
% PART 1
a1 = [ones(m,1) X]; % set a1 to equal X and add column of 1's
z2 = a1 * Theta1'; % matrix times matrix [5000*401 * 401*25 = 5000*25]
a2 = [ones(m,1),sigmoid(z2)]; % sigmoid function on matrix [5000*26]
z3 = a2 * Theta2'; % matrix times matrix [5000*26 * 26*10 = 5000 * 10]
hox = sigmoid(z3); % sigmoid function on matrix [5000*10]
for k = 1:num_labels
    yk = y == k; % using the correct column vector y each loop
    J = J + sum(-yk.*log(hox(:,k)) - (1-yk).*log(1-hox(:,k)));
end
J = 1/m * J;
% -------------------------------------------------------------
% =========================================================================
% Unroll gradients
% grad = [Theta1_grad(:) ; Theta2_grad(:)];
grad = 6.6735;
You have specified in your function declaration that the function can simultaneously return more than one output value:
function [J grad] = nnCostFunction(nn_params, ... % etc
You can capture both outputs if you 'request' them by assigning to a bracketed list of output variables instead of a single variable:
[a, b] = nnCostFunction(input1, input2, etc)
If you don't do this, you're essentially 'requesting' only the first of the returned variables:
a = nnCostFunction(input1, input2, etc) % output 'b' is discarded.
If you don't specify a variable to assign to at all, Octave by default assigns to the 'default' variable ans. So it's essentially equivalent to doing
ans = nnCostFunction(input1, input2, etc) % output 'b' is discarded.
See the documentation for the find function (i.e. type help find in your octave terminal) to see an example of such a function.
PS. If you only wanted the second output and did not want to 'waste' a variable name for the first one, you can do this by specifying ~ as the first output, e.g.:
[~, b] = nnCostFunction(input1, input2, etc) % output 'a' is discarded
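
The same three calling patterns can be tried on a built-in function that has two outputs. This is a small sketch using max, chosen here only as a convenient example; the variable names are arbitrary:

```matlab
v = [3 9 4];

[m, idx] = max(v);        % both outputs requested: m = 9, idx = 2
only_m = max(v);          % only the first output is kept: only_m = 9
[~, only_idx] = max(v);   % first output discarded, second kept: only_idx = 2
```

Calling max(v) with no assignment at all would store 9 in ans and silently drop the index, which is the behaviour seen with nnCostFunction.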

How to generate random numbers that follow a Poisson distribution

I want to generate 500000 random numbers from a Poisson distribution with lambda = 1 and T = 6, using the composition method, which can be described as follows:
Generate uniform r.v. z1, z2, …
Stop when z1*z2*...*zm <= exp(-lambda*T)
Assign k = m - 1
Then count how many numbers fall in each of 10 intervals ([0,1],[2,3],…, [16,17], [18,∞)).
I know that MATLAB has a built-in function poissrnd for this task. However, I want to implement the above algorithm myself. I tried to do it and compared the output with the result of the poissrnd function, but my code gives a wrong result. Could you look at my code and give me some comments?
num_generated = 500000;
k_vec=[]; %% Store k
for i=1:number_generated
for j=1:number_generated
%% Step 1: Generate uniform in the interval [0,1]: z1,z2...
%% Step 2: Stop when z1z2...zm<=exp(-lambda*T)
k_vec=[k_vec k]; % Record k in vec
range_1 = sum( k_vec(:)==0 )+sum(k_vec(:)==1) % # number with in range [0,1]
range_2 = sum( k_vec(:)==2 )+sum( k_vec(:)==3) % # number with in range [2,3]
range_3 = sum( k_vec(:)==4 )+sum( k_vec(:)==5) % # number with in range [4,5]
range_4 = sum( k_vec(:)==6 )+sum( k_vec(:)==7) % # number with in range [6,7]
range_5 = sum( k_vec(:)==8 )+sum( k_vec(:)==9) % # number with in range [8,9]
range_6 = sum( k_vec(:)==10 )+sum( k_vec(:)==11) % # number with in range [10,11]
range_7 = sum( k_vec(:)==12 )+sum( k_vec(:)==13) % # number with in range [12,13]
range_8 = sum( k_vec(:)==14 )+sum( k_vec(:)==15) % # number with in range [14,15]
range_9 = sum( k_vec(:)==16 )+sum( k_vec(:)==17) % # number with in range [16,17]
range_10 = sum(k_vec(:)>=18) % # number with in range [18,+infty)
You don't know how many random values it will take for multiple to converge, so you need to change your for loop over j to a while loop that continues as long as multiple > exp(-lambda*T).
By changing this to a while loop, you now need k to be a counter and to increment it on each iteration of the loop:
(Warning: Untested Code)
for i = 1:number_generated
    multiple = 1;
    k = 0; %// Initialize counter for each number generated
    while multiple > exp(-lambda*T) %// replace `for` loop
        k = k + 1; %// Increment counter
        %% Step 1: Generate uniform in the interval [0,1]: z1,z2...
        z = rand();
        %% Step 2: Stop when z1z2...zm<=exp(-lambda*T)
        multiple = multiple*z;
    end
    %// If we exit the loop, we know multiple <= exp(-lambda*T)
    k = k - 1;
    k_vec = [k_vec k]; % Record k in vec
end
You should also avoid at all costs using sequential variable names like range_1, range_2, ... MATLAB is designed to handle arrays and matrices, so you should use them. The simplest way to do this in your case, without even looping or vectorization, is:
range(1) = sum(...
range(2) = sum(...
range(10) = sum(...
Now you have one variable in your workspace rather than 10 and any operations you perform on this variable will be much easier.
I don't use Matlab so I can't give you the exact syntax for a fix. At a minimum, it looks like you're forgetting to reset multiple and k for each new Poisson. Also, you're only generating a single z.
A working implementation to get num_generated Poisson outcomes should look something like the following pseudocode:
threshold = Math.exp(-lambda * T)
loop num_generated times {
    %% Each time through this loop produces a single Poisson outcome
    count = 0
    product = 1.0
    while (product = product * rand()) >= threshold {
        count += 1
    }
    %% count now has a valid Poisson value, do what you want with it
}
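
The pseudocode above could be written in MATLAB roughly as follows. This is a sketch under stated assumptions: the sample count is reduced from 500000 to keep the demo fast, the names bin_idx and counts are made up for this example, and the binning via accumarray is one possible replacement for the ten range_N variables rather than the method used in the question:

```matlab
lambda = 1;
T = 6;
num_generated = 5000;               % assumption: reduced from 500000 for a quick demo
threshold = exp(-lambda * T);       % computed once, outside the loops

k_vec = zeros(1, num_generated);    % preallocate instead of growing inside the loop
for i = 1:num_generated
    product = rand();               % z1
    k = 0;                          % k = m - 1, where m counts the uniforms drawn
    while product > threshold
        k = k + 1;
        product = product * rand(); % multiply in the next uniform
    end
    k_vec(i) = k;
end

% Bins [0,1], [2,3], ..., [16,17], [18,inf): two values per bin, capped at bin 10
bin_idx = min(floor(k_vec / 2) + 1, 10);
counts = accumarray(bin_idx(:), 1, [10 1]).';
```

The sample mean of k_vec should land close to lambda*T = 6, which gives a quick sanity check against poissrnd(6, 1, num_generated).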

Halton sequence extension

I am trying to fill an area defined by 2 intervals [a,b] x [c,d] with points uniformly distributed and I am implementing the Halton sequence. I am using the following code (which generates subunitary numbers).
The number I is input.
The number H is output.
for i = 1:N
    H = 0
    half = 1 / 2
    I = rand() % MATLAB rand()
    do while ( I is not zero )
        digit = mod ( I, 2 )
        H = H + digit * half
        I = ( I - digit ) / 2
        half = half / 2
    end do
    x(i) = H
end
For the x-axis I use base 2 and for the y-axis I use base 3.
Because I divide by 2, 3 I seem to be unable to fill the whole [0,1] x [0,1] space completely. I have to fill [0,1] x [0,1] and I actually fill [0,0.5] x [0,0.35]. And when I try to extend the algorithm for [a,b] x [c,d] I get points in [a,b-0.5] x [c,d-1].
What can I do to fill the correct full intervals?
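
One likely cause, offered as a sketch rather than a definitive diagnosis: the Halton construction applies the digit loop to the integer index i, not to a value drawn from rand(). For I in (0,1), mod(I, 2) returns I itself, so the loop stops after one pass with H = I/2 (or I/3 in base 3), which matches the observed [0,0.5] x [0,0.35] coverage. The version below feeds the index into the loop and then rescales to [a,b] x [c,d]. The interval bounds, N, and the variable names are assumptions made for the example:

```matlab
N = 8;                         % number of points (hypothetical)
bases = [2 3];                 % base 2 for x, base 3 for y
a = 0; b = 2; c = 1; d = 4;    % hypothetical target rectangle [a,b] x [c,d]

P = zeros(N, 2);               % points in the unit square
for dim = 1:2
    base = bases(dim);
    for i = 1:N
        n = i;                 % the INTEGER index drives the sequence
        f = 1 / base;
        h = 0;
        while n > 0
            digit = mod(n, base);
            h = h + digit * f;
            n = (n - digit) / base;
            f = f / base;
        end
        P(i, dim) = h;
    end
end

% Affine map from the unit square to [a,b] x [c,d]
x = a + (b - a) * P(:, 1);
y = c + (d - c) * P(:, 2);
```

In base 2 the first values are 0.5, 0.25, 0.75, so the points spread over the whole unit interval as N grows.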

Pattern matching – Normalized Correlation

As part of my homework I need to implement this pattern matching.
The goal is to "Detect as many of the 0's (zeros) as you can in image coins4.tif."
I was given the NGC function, and I need to use it.
This is my main.m file:
Image = readImage('coins4.tif');
Pattern = readImage('zero.tif');
message = sprintf('Pattern matching Normalized Correlation');
PatternMatching(Image , Pattern);
uiwait(msgbox(message,'Done', 'help'));
close all
This is my PatternMatching function:
function [ output_args ] = PatternMatching( Image , Pattern )
% Pattern matching – Normalized Correlation
% Detect as many of the 0's (zeros) as you can in image coins4.tif.
% Use the 0 of the 10 coin as pattern.
% Use NGC_pm and find good threshold. Display original image with detected regions marked using drawRect.
% NGCpm(im,pattern);
% drawRect(rectCoors,color);
% rectCoors = [r0,c0,rsize,csize] - r0,c0 = top-left corner of rect.
% rsize = number of rows, csize = number of cols
% color = an integer >=1 representing a color in the color wheel
% (currently cycles through 8 different colors)
hold on
res = NGCpm(Image, Pattern);
for i = 1:size(res,1)
    for j = 1:size(res,2)
        if res(i,j) > 0.9999
            drawRect([i j size(Pattern,1) size(Pattern,2)], 5)
        end
    end
end
This is the given NGCpm.m file:
function res=NGC_PM(im,pattern)
[n m]=size(pattern);
if ~(var(pattern(:))==0)
    res = normxcorr2(pattern, im);
    res = 1-abs(res); % res = abs(res);
end
This is the pattern I'm trying to find and the results I'm getting.
I'm trying to find as many zeros as possible using the zero pattern of the 10 coin.
I'm trying to understand whether something is wrong with my algorithm in the PatternMatching function. Since the NGCpm function is already given to me, all I need to do is loop to find the best threshold, correct?
Or do I need to blur the image or the pattern?
This is the fixed version of this function:
function [ output_args ] = patternMatching( Image , Pattern )
% Pattern matching – Normalized Correlation
% Detect as many of the 0's (zeros) as you can in image coins4.tif.
% Use the 0 of the 10 coin as pattern.
% Use NGC_pm and find good threshold. Display original image with detected regions marked using drawRect.
% NGCpm(im,pattern);
% drawRect(rectCoors,color);
% rectCoors = [r0,c0,rsize,csize] - r0,c0 = top-left corner of rect.
% rsize = number of rows, csize = number of cols
% color = an integer >=1 representing a color in the color wheel
% (currently cycles through 8 different colors)
hold on
res = 1-NGCpm(Image, Pattern);
normalized_corellation = uint8(255*res/max(max(res)));
res_thresh = thresholdImage(normalized_corellation,100);
for i = 1:size(res_thresh,1)
    for j = 1:size(res_thresh,2)
        if res_thresh(i,j) > 0
            drawRect([i j size(Pattern,1) size(Pattern,2)], 5)
        end
    end
end
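
The double loop over the thresholded map can also be expressed with find, which returns the row and column of every entry above the threshold in one call. The sketch below uses a made-up 5x5 correlation map and a made-up 2x2 pattern size. It also assumes the map comes straight from normxcorr2 without cropping, in which case the map is larger than the image and each peak marks the bottom-right corner of a match, so the pattern size has to be subtracted to get the top-left corner. Whether the given NGCpm already compensates for this is not visible from the listing:

```matlab
% Hypothetical correlation map for a 4x4 image and a 2x2 pattern
res = zeros(5, 5);
res(2, 2) = 0.99;
res(3, 4) = 0.95;
pattern_size = [2 2];
thr = 0.9;                     % hypothetical threshold

% All locations above the threshold, without explicit loops
[rows, cols] = find(res > thr);

% Convert peak positions to top-left corners of the matched regions
top_left = [rows - pattern_size(1) + 1, cols - pattern_size(2) + 1];
```

Each row of top_left could then be passed to drawRect together with the pattern size.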

Algorithm to express elements of a matrix as a vector

Statement of Problem:
I have an array M with m rows and n columns. The array M is filled with non-zero elements.
I also have a vector t with n elements, and a vector omega with m elements.
The elements of t correspond to the columns of matrix M.
The elements of omega correspond to the rows of matrix M.
Goal of Algorithm:
Define chi as the multiplication of vector t and omega. I need to obtain a 1D vector a, where each element of a is a function of chi.
Each element of chi is unique (i.e. every element is different).
Using mathematics notation, this can be expressed as a(chi)
Each element of vector a corresponds to an element or elements of M.
Matlab code:
Here is a code snippet showing how the vectors t and omega are generated. The matrix M is pre-existing.
[m,n] = size(M);
t = linspace(0,5,n);
omega = linspace(0,628,m);
Conceptual Diagram:
This appears to be a type of integration (if this is the right word for it) along constant chi.
Link to reference
The algorithm is not explicitly stated in the reference. I only wish that this algorithm was described in a manner reminiscent of computer science textbooks!
Looking at Figure 11.5, the matrix M is Figure 11.5(a). The goal is to find an algorithm to convert Figure 11.5(a) into 11.5(b).
It appears that the algorithm is a type of integration (averaging, perhaps?) along constant chi.
It appears to me that reshape is the matlab function you need to use. As noted in the link:
B = reshape(A,siz) returns an n-dimensional array with the same elements as A, but reshaped to siz, a vector representing the dimensions of the reshaped array.
That is, create a vector siz with the number m*n in it, and say A = reshape(P,siz), where P is the product of vectors t and ω; or perhaps say something like A = reshape(t*ω,[m*n]). (I don't have matlab here, or would run a test to see if I have the product the right way around.) Note, the link does not show an example with one number (instead of several) after the matrix parameter to reshape, but I would expect from the description that A = reshape(t*ω,m*n) might also work.
You should add pseudocode or a link to the algorithm you want to implement. From what I could understand, I have developed the following code anyway:
M = [1 2 3 4; 5 6 7 8; 9 10 11 12]' % easy test M matrix
a = reshape(M, prod(size(M)), 1) % convert M to vector 'a' with reshape command
[m,n] = size(M); % Your sample code
t = linspace(0,5,n); % Your sample code
omega = linspace(0,628,m); % Your sample code
for i=1:length(t)
    for j=1:length(omega) % Access a(chi) in the desired order
        chi = length(omega)*(i-1)+j;
        t(i) % related t value
        omega(j) % related omega value
        a(chi) % related a(chi) value
    end
end
As you can see, I also think that the reshape() function is the solution to your problems. I hope that this code helps,
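
A small sketch of why the index formula in that loop works, assuming a made-up 2x3 matrix: reshape walks the matrix column by column, so the element in row j and column i lands at linear position m*(i-1)+j, where m is the number of rows.

```matlab
M = [1 2 3; 4 5 6];        % hypothetical matrix: m = 2 rows, n = 3 columns
a = reshape(M, [], 1);     % column-major flattening, same result as M(:)

[m, n] = size(M);
i = 2;                     % column index (t direction)
j = 1;                     % row index (omega direction)
chi = m * (i - 1) + j;     % linear index of M(j, i) inside a
```

Here a is [1; 4; 2; 5; 3; 6] and a(chi) equals M(1, 2), which is 2.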
The basic idea is to use two separate loops. The outer loop is over the chi variable values, whereas the inner loop is over the i variable values. Referring to the above diagram in the original question, the i variable corresponds to the x-axis (time), and the j variable corresponds to the y-axis (frequency). Assuming that the chi, i, and j variables can take on any real number, bilinear interpolation is then used to find an amplitude corresponding to an element in matrix M. The integration is just an averaging over elements of M.
The following code snippet provides an overview of the basic algorithm to express elements of a matrix as a vector using the spectral collapsing from 2D to 1D. I can't find any reference for this, but it is a solution that works for me.
% Amp = amplitude vector corresponding to Figure 11.5(b) in book reference
% M = matrix corresponding to the absolute value of the complex Gabor transform
% matrix in Figure 11.5(a) in book reference
% Nchi = number of chi in chi vector
% prod = product of timestep and frequency step
% dt = time step
% domega = frequency step
% omega_max = maximum angular frequency
% i = time array element along x-axis
% j = frequency array element along y-axis
% current_i = current time array element in loop
% current_j = current frequency array element in loop
% Nchi = number of chi
% Nivar = number of i variables
% ivar = i variable vector
% calculate for chi = 0, which only occurs when
% t = 0 and omega = 0, at i = 1
av0 = mean( M(1,:) );
av1 = mean( M(2:end,1) );
av2 = mean( [av0 av1] );
Amp(1) = av2;
% av_val holds the sum of all values that have been averaged
av_val_sum = 0;
% loop for rest of chi
for ccnt = 2:Nchi % 2:Nchi
    av_val_sum = 0; % reset av_val_sum
    current_chi = chi( ccnt ); % current value of chi
    % loop over i vector
    for icnt = 1:Nivar % 1:Nivar
        current_i = ivar( icnt );
        current_j = (current_chi / (prod * (current_i - 1))) + 1;
        current_t = dt * (current_i - 1);
        current_omega = domega * (current_j - 1);
        % values out of range
        if(current_omega > omega_max)
            continue; % skip this sample
        end
        % use bilinear interpolation to find an amplitude
        % at current_t and current_omega from matrix M
        % f_x_y is the bilinear interpolated amplitude
        % Insert bilinear interpolation code here
        % add to running sum
        av_val_sum = av_val_sum + f_x_y;
    end % icnt loop
    % compute the average over all i
    av = av_val_sum / Nivar;
    % assign the average to Amp
    Amp(ccnt) = av;
end % ccnt loop
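
For the bilinear interpolation placeholder, one option is the built-in interp2, which interpolates linearly by default and takes fractional column and row positions when no grid vectors are given. This is a sketch with a made-up 2x2 matrix, not the code used in the original answer. It assumes, as in the snippet above, that current_i runs along the columns (time) and current_j along the rows (frequency):

```matlab
M = [1 2; 3 4];            % hypothetical amplitude matrix
current_i = 1.5;           % fractional column (time) position
current_j = 1.5;           % fractional row (frequency) position

% Bilinear interpolation at the centre of the four entries gives 2.5
f_x_y = interp2(M, current_i, current_j);

% Positions outside the grid do not raise an error; the result fails isnan's
% complement test, so such samples can be detected and skipped
outside = interp2(M, 3, 1);
is_outside = isnan(outside);
```

Because out-of-grid queries come back as not-a-number, an isnan check could stand in for the explicit omega_max comparison.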
