octave matrix for loop performance - performance

I am new to Octave. I have two matrices. I have to compare a particular column of a one matrix with the other(my matrix A is containing more than 5 variables, similarly matrix B is containing the same.) and if elements in column one of matrix A is equal to elements in the second matrix B then I have to use the third column of second matrix B to compute certain values.I am doing this with octave by using for loop , but it consumes a lot of time to do the computation for single day , i have to do this for a year . Because size of matrices is very large.Please suggest some alternative way so that I can reduce my time and computation.
Thank you in advance.
Thanks for your quick response -hfs
continuation of the same problem,
Thank u, but this will work only if both elements in both the rows are equal.For example my matrices are like this,
A=[1 2 3;4 5 6;7 8 9;6 9 1]
B=[1 2 4; 4 2 6; 7 5 8;3 8 4]
here column 1 of first element of A is equal to column 1 of first element of B,even the second column hence I can take the third element of B, but for the second element of column 1 is equal in A and B ,but second element of column 2 is different ,here it should search for that element and print the element in the third column,and am doing this with for loop which is very slow because of larger dimension.In mine actual problem I have given for loop as written below:
for k=1:37651
for j=1:26018
if (s(k,1:2)==l(j,1:2))
z=sin((90-s(k,3))*pi/180) , break ,end
end
end
I want an alternative way to do this which should be faster than this.

You should work with complete matrices or vectors whenever possible. You should try commands and inspect intermediate results in the interactive shell to see how they fit together.
A(:,1)
selects the first column of a matrix. You can compare matrices/vectors and the result is a matrix/vector of 0/1 again:
> A(:,1) == B(:,1)
ans =
1
1
0
If you assign the result you can use it again to index into matrices:
I = A(:,1) == B(:,1)
B(I, 3)
This selects the third column of B of those rows where the first column of A and B is equal.
I hope this gets you started.

Related

A greedy solution for a matrix rearrangment

I am working on something which I feel an NP-hard problem. So, I am not looking for the optimal solution but I am looking for a better heuristics. An integer input matrix (matrix A in the following example) is given as input and I have to produce an integer output matrix (matrix B in the following example) whose number of rows are smaller than the input matrix and should obey the following two conditions:
1) Each column of the output matrix should contain integers in the same order as they appear in the input matrix. (In the example below, first column of the matrix A and matrix B have the same integers 1,3 in the same order.)
2) Same integers must not appear in the same row (In the example below, first row of the matrix B contains the integers 1,3 and 2 which are different from each other.)
Note that the input matrix always obey the 2nd condition.
What a greedy algorithm looks like to solve this problem?
Example:
In this example the output matrix 'Matrix B' contains all the integers as they appear in the input matrix 'Matrix A" but the output matrix has 5 rows and the input matrix has 6 rows. So, the output 'Matrix B' is a valid solution of the input 'Matrix A'.
I would produce the output one row at a time. When working out what to put in the row I would consider the next number from each input column, starting from the input column which has the most numbers yet to be placed, and considering the columns in decreasing order of numbers yet to be placed. Where a column can put a number in the current output row when its turn comes up it should do so.
You could extend this to a branch and bound solution to find the exact best answer. Recursively try all possible rows at each stage, except when you can see that the current row cannot possibly improve on the best answer so far. You know that if you have a column with k entries yet to be placed, in the best possible case you will need at least k more rows.
In practice I would expect that this will be too expensive to be practical, so you will need to ignore some possible rows which you cannot rule out, and so cannot guarantee to find the best answer. You could try using a heuristic search such as Limited Discrepancy search.
Another non-exact speedup is to multiply the estimate for the number of rows that the best possible answer derived from a partial solution will require by some factor F > 1. This will allow you to rule out some solutions earlier than branch and bound. The answer you find can be no more than F times more expensive than the best possible answer, because you only discard possibilities that cannot improve on the current answer by more than a factor of F.
A greedy solution to this problem would involve placing the numbers column by column, top down, as they appear.
Pseudocode:
For each column c in A:
r = 0 // row index of next element in A
nextRow = 0 // row index of next element to be placed in B
while r < A.NumRows()
while r < A.NumRows() && A[r, c] is null:
r++ // increment row to check in A
if r < A.NumRows() // we found a non-null entry in A
while nextRow < A.NumRows() && ~CheckConstraints(A[r,c], B[nextRow, c]):
nextRow++ // increment output row in B
if 'nextRow' >= A.NumRows()
return unsolvable // couldn't find valid position in B
B[nextRow, c] = v // successfully found position in B
++nextRow // increment output row in B
If there are no conflicts you end up "packing" B as tightly as possible. Otherwise you greedily search for the next non-conflicting row position in B. If none can be found, the problem is unsolvable.
The helper function CheckConstraints checks backwards in columns for the same row value in B to ensure the same value hasn't already been placed in a row.
If the problem statement is relaxed such that the output row count in B is <= the row count in A, then if we are unable to pack B any tighter, then we can return A as a solution.

What is the fast way to calculate this summation in MATLAB?

So I have the following constraints:
How to write this in MATLAB in an efficient way? The inputs are x_mn, M, and N. The set B={1,...,N} and the set U={1,...,M}
I did it like this (because I write x as the follwoing vector)
x=[x_11, x_12, ..., x_1N, X_21, x_22, ..., x_M1, X_M2, ..., x_MN]:
%# first constraint
function R1 = constraint_1(M, N)
ee = eye(N);
R1 = zeros(N, N*M);
for m = 1:M
R1(:, (m-1)*N+1:m*N) = ee;
end
end
%# second constraint
function R2 = constraint_2(M, N)
ee = ones(1, N);
R2 = zeros(M, N*M);
for m = 1:M
R2(m, (m-1)*N+1:m*N) = ee;
end
end
By the above code I will get a matrix A=[R1; R2] with 0-1 and I will have A*x<=1.
For example, M=N=2, I will have something like this:
And, I will create a function test(x) which returns true or false according to x.
I would like to get some help from you and optimize my code.
You should place your x_mn values in a matrix. After that, you can sum in each dimension to get what you want. Looking at your constraints, you will place these values in an M x N matrix, where M is the amount of rows and N is the amount of columns.
You can certainly place your values in a vector and construct your summations in the way you intended earlier, but you would have to write for loops to properly subset the proper elements in each iteration, which is very inefficient. Instead, use a matrix, and use sum to sum over the dimensions you want.
For example, let's say your values of x_mn ranged from 1 to 20. B is in the set from 1 to 5 and U is in the set from 1 to 4. As such:
X = vec2mat(1:20, 5)
X =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
vec2mat takes a vector and reshapes it into a matrix. You specify the number of columns you want as the second element, and it will create the right amount of rows to ensure that a proper matrix is built. In this case, I want 5 columns, so this should create a 4 x 5 matrix.
The first constraint can be achieved by doing:
first = sum(X,1)
first =
34 38 42 46 50
sum works for vectors as well as matrices. If you have a matrix supplied to sum, you can specify a second parameter that tells you in what direction you wish to sum. In this case, specifying 1 will sum over all of the rows for each column. It works in the first dimension, which is the rows.
What this is doing is it is summing over all possible values in the set B over all values of U, which is what we are exactly doing here. You are simply summing every single column individually.
The second constraint can be achieved by doing:
second = sum(X,2)
second =
15
40
65
90
Here we specify 2 as the second parameter so that we can sum over all of the columns for each row. The second dimension goes over the columns. What this is doing is it is summing over all possible values in the set U over all values of B. Basically, you are simply summing every single row individually.
BTW, your code is not achieving what you think it's achieving. All you're doing is simply replicating the identity matrix a set number of times over groups of columns in your matrix. You are actually not performing any summations as per the constraint. What you are doing is you are simply ensuring that this matrix will have the conditions you specified at the beginning of your post to be enforced. These are the ideal matrices that are required to satisfy the constraints.
Now, if you want to check to see if the first condition or second condition is satisfied, you can do:
%// First condition satisfied?
firstSatisfied = all(first <= 1);
%// Second condition satisfied
secondSatisfied = all(second <= 1);
This will check every element of first or second and see if the resulting sums after you do the above code that I just showed are all <= 1. If they all satisfy this constraint, we will have true. Else, we have false.
Please let me know if you need anything further.

How to get histogram data object from matlab

Lets say I have a matrix x=[ 1 2 1 2 1 2 1 2 3 4 5 ]. To look at its histogram, I can do h=hist(x).
Now, h with retrieve a matrix consisting only the number of occurrences and does not store the original value to which it occurred.
What I want is something like a function which takes a value from x and returns number of occurrences of it. Having said that, what one thing histeq does should we admire is, it automatically scales nearest values according!
How should solve this issue? How exactly people do it?
My reason of interest is in images:
Lets say I have an image. I want to find all number of occurrences of a chrominance value of image.
I'm not really shure what you are looking for, but if you ant to use hist to count the number of occurences, use:
[h,c]=hist(x,sort(unique(x)))
Otherwise hist uses ranges defined by centers. The second output argument returns the corresponding number.
hist has a second return value that will be the bin centers xc corresponding to the counts n returned in form of the first return value: [n, xc] = hist(x). You should have a careful look at the reference which describes a large number of optional arguments that control the behavior of hist. However, hist is way too mighty for your specific problem.
To simply count the number of occurrences of a specific value, you could simply use something like sum(x(:) == 42). The colon operator will linearize your image matrix, the equals operator will yield a list of boolean values with 1 for each element of x that was 42, and thus sum will yield the total number of these occurrences.
An alternative to hist / histc is to use bsxfun:
n = unique(x(:)).'; %'// values contained in x. x can have any number of dims
y = sum(bsxfun(#eq, x(:), n)); %// count for each value

On the formulation of the linear assignment problem

I'm looking at the standard definition of the assignment problem as defined here
My question is to do with the two constraints (latex notation follows):
\sum_{j=1}^n(x_{ij}) = 1 for all i = 1, ... , n
\sum_{i=1}^n(x_{ij}) = 1 for all j = 1, ... , n
Specifically, why the second constraint required? Doesn't the first already cover all pairs of x_{ij}?
Consider the matrix x_ij with the i ranging over the rows, and j ranging over the columns.
The first equation says that for each i (that is, for each row!) the sum of values in that row equals 1.
The second equations says thta for each j (that is, for each column!) the sum of values in that column equals 1.
No. Given that all the entries in X are 0 or 1, one constraint says 'there is exactly one 1 in each column' - the other says 'there is exactly one 1 in each row' (I always forget which way round matrix subscripts conventionally go). These statements have independent truth values.
This is not even remotely a programming problem. But I'll answer it anyway.
The first is a sum over j, for EACH value of i. The second is a sum over i, for EACH value of j.
So essentially, one of these constraint sets requires that the sum across the rows of the matrix x_{i,j} matrix must be unity. The other constraint is a requirement that the sum down the columns of that matrix must be unity.
(edit) It seems that we are still not being clear here. Consider the matrix
[0 1]
[0 1]
One must agree that the sum across the rows of this matrix is 1 for each row. However, when you form the sum of the elements of the first column, it is zero, and the sum of the elements in the second column, we find 2.
Now, consider a different matrix.
[0 1]
[1 0]
See that here, the sum over the rows or down the columns is always 1.

Random number generator that fills an interval

How would you implement a random number generator that, given an interval, (randomly) generates all numbers in that interval, without any repetition?
It should consume as little time and memory as possible.
Example in a just-invented C#-ruby-ish pseudocode:
interval = new Interval(0,9)
rg = new RandomGenerator(interval);
count = interval.Count // equals 10
count.times.do{
print rg.GetNext() + " "
}
This should output something like :
1 4 3 2 7 5 0 9 8 6
Fill an array with the interval, and then shuffle it.
The standard way to shuffle an array of N elements is to pick a random number between 0 and N-1 (say R), and swap item[R] with item[N]. Then subtract one from N, and repeat until you reach N =1.
This has come up before. Try using a linear feedback shift register.
One suggestion, but it's memory intensive:
The generator builds a list of all numbers in the interval, then shuffles it.
A very efficient way to shuffle an array of numbers where each index is unique comes from image processing and is used when applying techniques like pixel-dissolve.
Basically you start with an ordered 2D array and then shift columns and rows. Those permutations are by the way easy to implement, you can even have one exact method that will yield the resulting value at x,y after n permutations.
The basic technique, described on a 3x3 grid:
1) Start with an ordered list, each number may exist only once
0 1 2
3 4 5
6 7 8
2) Pick a row/column you want to shuffle, advance it one step. In this case, i am shifting the second row one to the right.
0 1 2
5 3 4
6 7 8
3) Pick a row/column you want to shuffle... I suffle the second column one down.
0 7 2
5 1 4
6 3 8
4) Pick ... For instance, first row, one to the left.
2 0 7
5 1 4
6 3 8
You can repeat those steps as often as you want. You can always do this kind of transformation also on a 1D array. So your result would be now [2, 0, 7, 5, 1, 4, 6, 3, 8].
An occasionally useful alternative to the shuffle approach is to use a subscriptable set container. At each step, choose a random number 0 <= n < count. Extract the nth item from the set.
The main problem is that typical containers can't handle this efficiently. I have used it with bit-vectors, but it only works well if the largest possible member is reasonably small, due to the linear scanning of the bitvector needed to find the nth set bit.
99% of the time, the best approach is to shuffle as others have suggested.
EDIT
I missed the fact that a simple array is a good "set" data structure - don't ask me why, I've used it before. The "trick" is that you don't care whether the items in the array are sorted or not. At each step, you choose one randomly and extract it. To fill the empty slot (without having to shift an average half of your items one step down) you just move the current end item into the empty slot in constant time, then reduce the size of the array by one.
For example...
class remaining_items_queue
{
private:
std::vector<int> m_Items;
public:
...
bool Extract (int &p_Item); // return false if items already exhausted
};
bool remaining_items_queue::Extract (int &p_Item)
{
if (m_Items.size () == 0) return false;
int l_Random = Random_Num (m_Items.size ());
// Random_Num written to give 0 <= result < parameter
p_Item = m_Items [l_Random];
m_Items [l_Random] = m_Items.back ();
m_Items.pop_back ();
}
The trick is to get a random number generator that gives (with a reasonably even distribution) numbers in the range 0 to n-1 where n is potentially different each time. Most standard random generators give a fixed range. Although the following DOESN'T give an even distribution, it is often good enough...
int Random_Num (int p)
{
return (std::rand () % p);
}
std::rand returns random values in the range 0 <= x < RAND_MAX, where RAND_MAX is implementation defined.
Take all numbers in the interval, put them to list/array
Shuffle the list/array
Loop over the list/array
One way is to generate an ordered list (0-9) in your example.
Then use the random function to select an item from the list. Remove the item from the original list and add it to the tail of new one.
The process is finished when the original list is empty.
Output the new list.
You can use a linear congruential generator with parameters chosen randomly but so that it generates the full period. You need to be careful, because the quality of the random numbers may be bad, depending on the parameters.

Resources