Where can I find the implementation of barrier function in Matlab?
I am trying to see how the algorithm interior-point is implemented, and this is what I found in the end of fmincon.m
elseif strcmpi(OUTPUT.algorithm,interiorPoint)
defaultopt.MaxIter = 1000; defaultopt.MaxFunEvals = 3000; defaultopt.TolX = 1e-10;
defaultopt.Hessian = 'bfgs';
mEq = lin_eq + sizes.mNonlinEq + nnz(xIndices.fixed); % number of equalities
% Interior-point-specific options. Default values for lbfgs memory is 10, and
% ldl pivot threshold is 0.01
options = getIpOptions(options,sizes.nVar,mEq,flags.constr,defaultopt,10,0.01);
[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN] = barrier(funfcn,X,A,B,Aeq,Beq,l,u,confcn,options.HessFcn, ...
initVals.f,initVals.g,initVals.ncineq,initVals.nceq,initVals.gnc,initVals.gnceq,HESSIAN, ...
xIndices,options,optionFeedback,finDiffFlags,varargin{:});
So I want to see what's in barrier but failed.
edit barrier.m
I got:
The barrier function is defined in a p-file (precisely located in MATLABROOT/toolbox/optim/optim/barrier.p).
Unfortunately the point of p-files is exactly that they are obfuscated, i.e. you cannot read the source code. This is a recurent questio on SO, see this thread for instance.
I'm afraid you cannot read what's inside barrier. Maybe if you ask the Mathworks kindly they can give you some information on the content.
Best
Related
I have tried this code:
model var
Real x;
Real y;
Real z;
equation
x=6*time;
when time>=6 then
z=x;
end when;
y=3*z;
end var;
But it will give me y = 3*x(at time = 6) but from time = 6 while i need it from the beginning.
Any direct method for this problem?
Based on folks comments you now know that Modelica is quite strict in its treatment of time behavior. You could argue that it is a more physical representation of time (quantum and other crazy physics aside), as in you can not time-travel in your code.
Depending on your application there may be ways around your issue. One possibility is to move the time behavior to the initialization. That way you capture the behavior prior to time=0 and start at time=0 with the expected behavior.
For example:
model var
parameter Modelica.SIunits.Time t_zero = 6;
parameter Real x(fixed=false);
Real y;
Real z;
initial equation
x = 6*t_zero; // or some more complicated set of equations/functions
equation
z = x;
y=3*z;
end var;
Recognized this restricts things, potentially too much, but you can have many parameters and have a more complex representation in the initial equation block. You can also call a function x=func() where you have performed integrations, etc. to get the value of x at time=0.
Hope that helps, either now or in the future.
Currently I'm using this code for MATLAB R2015b support vector machine(SVM) 10-fold cross validation.
indices = crossvalind('Kfold',output,10);
cp = classperf(binary_output);
for i = 1:10
test = (indices == i); train = ~test;
SVMModel = fitcsvm(INPUT(train,:), output(train,:),'KernelFunction','RBF',...
'KernelScale','auto');
class = predict(SVMModel, INPUT(test,:));
classperf(cp,class,test);
end
z = cp.ErrorRate;
sensitivity = cp.Sensitivity;
specificity = cp.Specificity;
I need to extract sensitivity and specificity of this binary classification. Otherwise I'm running this code in a loop.
This structure of cross validation is so slow. Any other implementation for faster execution?
An easy way is to use the multi-threading from the crossval function.
tic
%data partition
order = unique(y); % Order of the group labels
cp = cvpartition(y,'k',10); %10-folds
%prediction function
f = #(xtr,ytr,xte,yte)confusionmat(yte,...
predict(fitcsvm(xtr, ytr,'KernelFunction','RBF',...
'KernelScale','auto'),xte),'order',order);
% missclassification error
cfMat = crossval(f,INPUT,output,'partition',cp);
cfMat = reshape(sum(cfMat),2,2)
toc
With the confusion matrix you can get the sensitivity and specificity easily. For the record there is no improvement of time between your script and the crossval function provided by Matlab.
An alternative to the crossval function is to use parfor
to split the computation of each iteration between different cores.
Ps: for any parallel computation, parpool has to be run before (or that might be run by some of the functions) and it takes few seconds to set it.
I'm trying to speed up my code using parfor. The purpose of the code is to slide a 3D square window on a 3D image and for each block of mxmxm apply a function.
I wrote this code:
function [ o_image ] = SlidingWindow( i_image, i_padSize, i_fun, i_options )
%SLIDINGWINDOW Summary of this function goes here
% Detailed explanation goes here
o_image = zeros(size(i_image,1),size(i_image,2),size(i_image,3));
i_image = padarray(i_image,i_padSize,'symmetric');
i_padSize = num2cell(i_padSize);
[m,n,p] = deal(i_padSize{:});
[row,col,depth] = size(i_image);
windowShape = i_options.windowShape;
mask = i_options.mask;
parfor (i = m+1:row-m,i_options.cores)
temp = i_image(i-m:i+m,:,:);
for j = n+1:col-n
for h = p+1:depth-p
ii = i-m;
jj = j-n;
hh = h-p;
temp = temp(:,j-n:j+n, h-p:h+p);
o_image(ii,jj,hh) = parfeval(i_fun, temp, windowShape, mask);
end
end
end
end
I get one warning and one error that I don't understand how to solve.
The warning says:
the entire array or structure 'i_image' is a broadcast variable.
The error says:
the PARFOR loop can not run due to the way variable 'o_image' is used.
I don't understand how to fix these two things. Any help is greatly appreciated!
As far as I understand, parfeval takes care of running your function on the available number of workers, which is why it doesn't need to be surrounded by parfor. Assuming you already have an active parpool, changing the external parfor into for eliminates both problems.
Unfortunately, I can't support my answer with a benchmark or suggest a more fitting solution because your inputs are unknown.
It seems to me that the code can be optimized in other ways, mainly by vectorization. I would suggest you looked into the following resources:
This question, for additional info on parfeval.
Examples on how to use bsxfun and permute and benchmarks thereof: ex1, ex2, ex3.
P.S.: The 2nd part of (i = m+1:row-m,i_options.cores) seems out of place...
There is a example for Employ early bail-out in this book (http://www.amazon.com/Accelerating-MATLAB-Performance-speed-programs/dp/1482211297) (#YairAltman). for speed improvement we can convert this code:
data = [];
newData = [];
outerIdx = 1;
while outerIdx <= 20
outerIdx = outerIdx + 1;
for innerIdx = -100 : 100
if innerIdx == 0
continue % skips to next innerIdx (=1)
elseif outerIdx > 15
break % skips to next outerIdx
else
data(end+1) = outerIdx/innerIdx;
newData(end+1) = process(data);
end
end % for innerIdx
end % while outerIdx
to this code:
function bailableProcessing()
for outerIdx = 1 : 5
middleIdx = 10
while middleIdx <= 20
middleIdx = middleIdx + 1;
for innerIdx = -100 : 100
data = outerIdx/innerIdx + middleIdx;
if data == SOME_VALUE
return
else
process(data);
end
end % for innerIdx
end % while middleIdx
end % for outerIdx
end % bailableProcessing()
How we did this conversion? Why we have different middleIdx range in new code? Where is checking for innerIdx and outerIdx in new code? what is this new data = outerIdx/innerIdx + middleIdx calculation?
we have only this information for second code :
We could place the code segment that should be bailed-out within a
dedicated function and return from the function when the bail-out
condition occurs.
I am sorry that I did not clarify within the text that the second code segment is not a direct replacement of the first. If you reread the early bail-out section (3.1.3) perhaps you can see that it has two main parts:
The first part of the section (which includes the top code segment) illustrates the basic mechanism of using break/continue in order to bail-out from a complex processing loop, in order to save processing time in computing values that are not needed.
In contrast, the second part of the section deals with cases when we wish to break out of an ancestor loop that is not the direct parent loop. I mention in the text that there are three alternatives that we can use in this case, and the second code segment that you mentioned is one of them (the other alternatives are to use dedicated flags with break/continue and to use try/catch blocks). The three code segments that I provided in this second part of the section should all be equivalent to each other, but they are NOT equivalent to the code-segment at the top of the section.
Perhaps I should have clarified this in the text, or maybe I should have used the same example throughout. I will think about this for the second edition of the book (if and when it ever appears).
I have used a variant of these code segments in other sections of the book to illustrate various other aspects of performance speedups (for example, 3.1.4 & 3.1.6) - in all these cases the code segments are NOT equivalent to each other. They are merely used to illustrate the corresponding text.
I hope you like my book in general and think that it is useful. I would be grateful if you would place a positive feedback about it on Amazon (direct link).
p.s. - #SamRoberts was correct to surmise that mention of my name would act as a "bat-signal", attracting my attention :-)
it's all far more simple than you think!
How we did this conversion?
Irrationally. Those two codes are completely different.
Why we have different middleIdx range in new code?
Randomness. The point of the author is something different.
Where is checking for innerIdx and outerIdx in new code?
dont need that, as it's not intended to be the same code.
what is this new data = outerIdx/innerIdx + middleIdx calculation?
a random calculation as well as data(end+1) = outerIdx/innerIdx; in the original code.
i suppose the author wants to illustrate something far more profoundly: that if you wrap your code that does (possibly many) loops (fors/whiles, doesnt matter) inside a function and you issue a return statement if you somehow detect that you're done, it will result in an effectively "bailable" computation, e.g. the method that does the work returns earlier than it would normally do. that is illustrated here by the condition that checks on data == SOME_VALUE; you can have your favourite bailout condition there instead :-)
moreover, the keywords [continue/break] inside the first example are meant to illustrate that you can [skip the rest of/leave] the inner-most loop from whereever you call them. in principal, you can implement a bailout using these by e.g.
bailing = false;
for outer = 1:1000
for inner = 1:1000
if <somebailingcondition>
bailing = true;
break;
else
<do stuff>
end
end
if bailing
break;
end
end
but that would be very clumsy as that "cascade" of breaks will need to be as long as you have nested loops and messes up the code.
i hope that could clarify your issues.
Update: I've provided a brief analysis of the three answers at the bottom of the question text and explained my choices.
My Question: What is the most efficient method of building a fixed interval dataset from a random interval dataset using stale data?
Some background: The above is a common problem in statistics. Frequently, one has a sequence of observations occurring at random times. Call it Input. But one wants a sequence of observations occurring say, every 5 minutes. Call it Output. One of the most common methods to build this dataset is using stale data, i.e. set each observation in Output equal to the most recently occurring observation in Input.
So, here is some code to build example datasets:
TInput = 100;
TOutput = 50;
InputTimeStamp = 730486 + cumsum(0.001 * rand(TInput, 1));
Input = [InputTimeStamp, randn(TInput, 1)];
OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001)';
Output = [OutputTimeStamp, NaN(TOutput, 1)];
Both datasets start at close to midnight at the turn of the millennium. However, the timestamps in Input occur at random intervals while the timestamps in Output occur at fixed intervals. For simplicity, I have ensured that the first observation in Input always occurs before the first observation in Output. Feel free to make this assumption in any answers.
Currently, I solve the problem like this:
sMax = size(Output, 1);
tMax = size(Input, 1);
s = 1;
t = 2;
%#Loop over input data
while t <= tMax
if Input(t, 1) > Output(s, 1)
%#If current obs in Input occurs after current obs in output then set current obs in output equal to previous obs in input
Output(s, 2:end) = Input(t-1, 2:end);
s = s + 1;
%#Check if we've filled out all observations in output
if s > sMax
break
end
%#This step is necessary in case we need to use the same input observation twice in a row
t = t - 1;
end
t = t + 1;
if t > tMax
%#If all remaining observations in output occur after last observation in input, then use last obs in input for all remaining obs in output
Output(s:end, 2:end) = Input(end, 2:end);
break
end
end
Surely there is a more efficient, or at least, more elegant way to solve this problem? As I mentioned, this is a common problem in statistics. Perhaps Matlab has some in-built function I'm not aware of? Any help would be much appreciated as I use this routine a LOT for some large datasets.
THE ANSWERS: Hi all, I've analyzed the three answers, and as they stand, Angainor's is the best.
ChthonicDaemon's answer, while clearly the easiest to implement, is really slow. This is true even when the conversion to a timeseries object is done outside of the speed test. I'm guessing the resample function has a lot of overhead at the moment. I am running 2011b, so it is possible Mathworks have improved it in the intervening time. Also, this method needs an additional line for the case where Output ends more than one observation after Input.
Rody's answer runs only slightly slower than Angainor's (unsurprising given they both employ the histc approach), however, it seems to have some problems. First, the method of assigning the last observation in Output is not robust to the last observation in Input occurring after the last observation in Output. This is an easy fix. But there is a second problem which I think stems from having InputTimeStamp as the first input to histc instead of the OutputTimeStamp adopted by Angainor. The problem emerges if you change OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001)'; to OutputTimeStamp = 730486.002 + (0:0.0001:TOutput * 0.0001 - 0.0001)'; when setting up the example inputs.
Angainor's appears robust to everything I threw at it, plus it was the fastest.
I did a lot of speed tests for different input specifications - the following numbers are fairly representative:
My naive loop: Elapsed time is 8.579535 seconds.
Angainor: Elapsed time is 0.661756 seconds.
Rody: Elapsed time is 0.913304 seconds.
ChthonicDaemon: Elapsed time is 22.916844 seconds.
I'm +1-ing Angainor's solution and marking the question solved.
This "stale data" approach is known as a zero order hold in signal and timeseries fields. Searching for this quickly brings up many solutions. If you have Matlab 2012b, this is all built in to the timeseries class by using the resample function, so you would simply do
TInput = 100;
TOutput = 50;
InputTimeStamp = 730486 + cumsum(0.001 * rand(TInput, 1));
InputData = randn(TInput, 1);
InputTimeSeries = timeseries(InputData, InputTimeStamp);
OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001);
OutputTimeSeries = resample(InputTimeSeries, OutputTimeStamp, 'zoh'); % zoh stands for zero order hold
Here is my take on the problem. histc is the way to go:
% find Output timestamps in Input bins
N = histc(Output(:,1), Input(:,1));
% find counts in the non-empty bins
counts = N(find(N));
% find Input signal value associated with every bin
val = Input(find(N),2);
% now, replicate every entry entry in val
% as many times as specified in counts
index = zeros(1,sum(counts));
index(cumsum([1 counts(1:end-1)'])) = 1;
index = cumsum(index);
val_rep = val(index)
% finish the signal with last entry from Input, as needed
val_rep(end+1:size(Output,1)) = Input(end,2);
% done
Output(:,2) = val_rep;
I checked against your procedure for a few different input models (I changed the number of Output timestamps) and the results are the same. However, I am still not sure I understood your problem, so if something is wrong here let me know.