Wikipedia says we can approximate the Bark scale with the equation:
b(f) = 13*atan(0.00076*f) + 3.5*atan((f/7500)^2)
How can I divide the frequency spectrum into n intervals of the same length on the Bark scale (so that the interval division points are equidistant on the Bark scale)?
The best way would be to invert the function analytically (express f as a function of b). I tried doing it on paper but failed. WolframAlpha couldn't do it either. I tried Octave's finverse function, but I got an error.
Octave says (for simpler example):
octave:2> x = sym('x');
octave:3> finverse(2*x)
error: `finverse' undefined near line 3 column 1
This is the finverse description from MATLAB: http://www.mathworks.com/help/symbolic/finverse.html
There could also be a numerical way to do it. I can imagine that you'd start by dividing the y axis equally and then search for the ideal division by binary search. But maybe there are existing tools that do this?
You need to solve this equation numerically (there is no analytical inverse function). Set equally spaced values for b and solve the equation to find the corresponding values of f. Bisection is somewhat slow, but a very good alternative is Brent's method. See http://en.wikipedia.org/wiki/Brent%27s_method
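For instance, a minimal Octave/MATLAB sketch using fzero (which uses a Brent-type root finder); the bracketing interval [0, 25000] Hz is just an assumed audio range:
bark = @(f) 13*atan(0.00076*f) + 3.5*atan((f/7500).^2);  % Bark approximation
b_target = 5;                                            % desired Bark value
f = fzero(@(f) bark(f) - b_target, [0 25000])            % frequency in Hz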
This function can't be inverted analytically. You'll have to use some numerical procedure. Binary search would be fine, but there are more efficient ways to do these sorts of things: look into root-finding algorithms. You can apply your algorithm of choice to the equation b(f) = b_n for each of the equally spaced Bark values b_n; the solutions are the frequency interval endpoints.
Just so you know, in (say) Octave, to implement rpsmi's or David Zaslavsky's answer you'd do something like this:
global x0 = 0.

function res = b(f)
  global x0
  res = 13*atan(0.00076*f) + 3.5*atan((f/7500)^2) - x0;
end

function [intervals, barks] = barkintervals(left, right, n)
  global x0
  intervals = linspace(left, right, n);
  barks = intervals;
  for i = 1:n
    x0 = intervals(i);
    # 125*x0 is just a crude guess starting point given the values
    [barks(i), fval, info] = fsolve('b', 125*x0);
  endfor
end
and run it like so:
octave:1> barks
octave:2> [i,bx] = barkintervals(0, 10, 10)
[... lots of output from fsolve deleted...]
i =
Columns 1 through 8:
0.00000 1.11111 2.22222 3.33333 4.44444 5.55556 6.66667 7.77778
Columns 9 and 10:
8.88889 10.00000
bx =
Columns 1 through 6:
0.0000e+00 1.1266e+02 2.2681e+02 3.4418e+02 4.6668e+02 5.9653e+02
Columns 7 through 10:
7.3639e+02 8.8960e+02 1.0605e+03 1.2549e+03
I finally decided not to use the Bark approximation but the ideal values for the critical band centres (defined for n = 1..24). I plotted them with gnuplot, and on the same graph I plotted arbitrarily chosen values for the points of greater density (for the required n > 24). I adjusted the point values in Hz until the two curves were approximately the same.
Of course, rpsmi's and David Zaslavsky's answers are more general and scalable.
I have a curve-fitting problem. I have two functions:
y = ax+b
y = ax^2+bx-2.3
I have one set of data for each of the above functions. I need to find a and b using the least squares method, combining both functions. I was using fminsearch to minimize the sum of squared errors of the two functions, but I am unable to use this method with lsqcurvefit.
Kindly help me
Regards
Ram
I think you'll need to worry less about which library routine to use and more about the math. Assuming you mean vertical offset least squares, then you'll want
D = sum_{i=1..m} (y_Li - a x_Li - b)^2 + sum_{j=1..n} (y_Pj - a x_Pj^2 - b x_Pj + 2.3)^2
where there are m points (x_Li, y_Li) on the line and n points (x_Pj, y_Pj) on the parabola. Now find partial derivatives of D with respect to a and b. Setting them to zero provides two linear equations in 2 unknowns, a and b. Solve this linear system.
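For concreteness, here's a minimal MATLAB/Octave sketch of this; the data below are synthetic, made up just so the sketch runs:
% Both models share the same a and b, so stack the residuals of
%   line:      yL       = a*xL    + b
%   parabola:  yP + 2.3 = a*xP.^2 + b*xP
% into one linear least-squares problem and solve it with backslash
% (equivalent to setting the partial derivatives of D to zero).
a_true = 2; b_true = -1;
xL = (0:0.5:5)';  yL = a_true*xL    + b_true          + 0.1*randn(size(xL));
xP = (0:0.5:5)';  yP = a_true*xP.^2 + b_true*xP - 2.3 + 0.1*randn(size(xP));
M   = [xL,    ones(size(xL));
       xP.^2, xP];
rhs = [yL; yP + 2.3];
ab = M \ rhs;              % least-squares solution
a = ab(1), b = ab(2)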
y = ax+b
y = ax^2+bx-2.3
In order not to confuse the y of the first equation with the y of the second equation, we use distinct notations:
u = ax+b
v = ax^2+bx+c
The method of linear regression combined for the two functions was shown in an attached image (not reproduced here).
HINT: If you want to derive that matrix equation by yourself, follow Gene's answer.
I am going through my book, and it states: "Write a sampling algorithm for this density function"
y = x^2 + (2/3)*x + 1/3, 0 < x < 1
Or can I use Monte Carlo?
Any help would be appreciated!
I'm assuming you mean you want to generate random x values that have the distribution specified by density y(x).
It's often desirable to derive the cumulative distribution function by integrating the density, and use inverse transform sampling to generate x values. In your case the CDF is a third-order polynomial which doesn't factor to yield a simple cube-root solution, so you would have to use a numerical solver to find the inverse. Time to consider alternatives.
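(That said, if you did want the numerical route, it's only a few lines; a MATLAB/Octave sketch using fzero:)
F = @(x) (x.^3 + x.^2 + x)/3;     % CDF of y(x) = x^2 + (2/3)x + 1/3 on [0,1]
u = rand;                         % uniform random draw
x = fzero(@(x) F(x) - u, [0 1])   % solve F(x) = u; [0 1] brackets the root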
Another option is to use the acceptance/rejection method. After checking the derivative, it's clear that your density is convex, so it's easy to create a bounding function b(x) by drawing a straight line from f(0) to f(1). This yields b(x) = 1/3 + 5x/3. This bounding function has area 7/6, while your f(x) has an area of 1, since it is a valid density. Consequently, 6/7 of the points generated uniformly under b(x) will also fall under f(x), and only 1 out of 7 attempts will fail in the rejection scheme. Here's a plot of f(x) and b(x): [plot omitted]
Since b(x) is linear, it is easy to generate x values using it as a distribution after scaling by 6/7 to make it a valid distribution function. The algorithm, expressed in pseudocode, then becomes:
function generate():
    while TRUE:
        x <- (sqrt(1 + 35 * U(0,1)) - 1) / 5   # inverse CDF transform of b(x)
        if U(0, b(x)) <= f(x):
            return x
    end while
end function
where U(a,b) means generate a value uniformly distributed between a and b, f(x) is your density, and b(x) is the bounding function described above.
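A direct MATLAB/Octave translation of the pseudocode, as a runnable sketch:
f = @(x) x.^2 + (2/3)*x + 1/3;   % target density
b = @(x) 1/3 + 5*x/3;            % linear bounding function
N = 100000;
samples = zeros(N, 1);
for k = 1:N
    while true
        x = (sqrt(1 + 35*rand) - 1)/5;   % inverse-CDF draw from scaled b(x)
        if rand*b(x) <= f(x)             % i.e. U(0, b(x)) <= f(x): accept
            samples(k) = x;
            break
        end
    end
end
hist(samples, 50)                        % shape should match f(x)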
I implemented the algorithm described above to generate 100,000 candidate values, of which 14,199 (~1/7) were rejected, as expected. The end results are summarized in a histogram (omitted here), whose shape you can compare to f(x).
I'm assuming that you have a function y(x), which takes a value in [0,1] and returns the value of y. You just need to provide a random value of x and return the corresponding value of y.
import numpy

def getSample():
    # get a uniform random number
    x = numpy.random.random()
    # sample my custom function
    return y(x)
I have a scalar function f([x,y],[i,j])= exp(-norm([x,y]-[i,j])^2/sigma^2) which receives two 2-dimensional vectors as input (norm here implements the Euclidean norm). The values of x,i range in 1:w and the values y,j range in 1:h. I want to create a cell array X such that X{x,y} will contain a w x h matrix such that X{x,y}(i,j) = f([x,y],[i,j]). This can obviously be done using 4 nested loops like so:
for x = 1:w
    for y = 1:h
        X{x,y} = zeros(w, h);
        for i = 1:w
            for j = 1:h
                X{x,y}(i,j) = f([x,y],[i,j]);
            end
        end
    end
end
This is however extremely inefficient. I would very much appreciate an efficient way to create X.
One way to do this is to remove the two innermost loops and replace them with a vectorised version. By the look of your f function this shouldn't be too bad.
First we need to construct two w-by-h matrices, one containing the index i = 1:w down every column and the other containing the index j = 1:h along every row, like so:
wMat = repmat((1:w)', 1, h);   % wMat(i,j) = i
hMat = repmat(1:h, w, 1);      % hMat(i,j) = j
These two matrices represent the inner two loops: together they enumerate every (i,j) combination. Now we can vectorise the calculation (f([x,y],[i,j]) = exp(-norm([x,y]-[i,j])^2/sigma^2)):
X = cell(w, h);
for x = 1:w
    for y = 1:h
        sqdist = (x - wMat).^2 + (y - hMat).^2;   % squared Euclidean norms
        X{x,y} = exp(-sqdist/sigma^2);
    end
end
Here the squared Euclidean norm for all (i,j) pairs is computed at once, replacing the two inner loops.
Some discussion and code
The trick here is to perform the norm calculations on numeric arrays and convert the results into the cell-array version as late as possible. For the norm calculations you can take help of ndgrid and bsxfun, with some permute + reshape to give the result the "shape" needed for the final cell-array version. So, here's the vectorized approach to perform these tasks -
%// Create x-y/i-j values to be used for calculation of function values
[xi,yi] = ndgrid(1:w,1:h);
%// Get the norm values
normvals = sqrt(bsxfun(@minus,xi(:),xi(:).').^2 + ...
                bsxfun(@minus,yi(:),yi(:).').^2);
%// Get the actual function values
vals = exp(-normvals.^2/sigma^2);
%// Get the values into blocks of a 4D array and then re-arrange to match
%// with the shape of numeric array version of X
blks = reshape(permute(reshape(vals, w*h, h, []), [2 1 3]), h, w, h, w);
arranged_blks = reshape(permute(blks,[2 3 1 4]),w,h,w,h);
%// Finally get the cell array version
X = squeeze(mat2cell(arranged_blks,w,h,ones(1,w),ones(1,h)));
Benchmarking and runtimes
After improving the original loopy code with pre-allocation for X and inlining of f, runtime benchmarks were performed on it against the proposed vectorized approach, with datasize w = h = 60. The runtime results thus obtained were -
----------- With Improved loopy code
Elapsed time is 41.227797 seconds.
----------- With Vectorized code
Elapsed time is 2.116782 seconds.
This suggests a whopping speedup of close to 20x with the proposed solution!
For extremely huge datasizes
If you are dealing with huge datasizes, you may simply not have enough memory for bsxfun to work with; bsxfun is known to use a lot of memory in exchange for a performance-efficient vectorized solution. For such huge-datasize cases, you can use the following loopy approach to replace the normvals calculation listed in the earlier bsxfun-based solution -
%// Get the norm values
nx = numel(xi);
normvals = zeros(nx,nx);
for ii = 1:nx
normvals(:,ii) = sqrt( (xi(:) - xi(ii)).^2 + (yi(:) - yi(ii)).^2 );
end
Notice that f([x,y],[i,j]) depends only on the differences x-i and y-j, so the single iteration x=w, y=h already calculates every distinct value you will ever need. You don't need to recalculate them. Once you have this:
for i = 1:w
    for j = 1:h
        temp(i,j) = f([w,h],[i,j]);
    end
end
then every X{x,y} is just a lookup into temp: since temp(i,j) = f([w,h],[i,j]), you get X{x,y}(i,j) = temp(w-|x-i|, h-|y-j|), as in the sketch below. If you can vectorise the calculation of f (norm here is just the Euclidean norm of that vector?) then it will get even simpler.
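A runnable sketch of that lookup (the sizes w, h and sigma below are made-up example values):
w = 5; h = 4; sigma = 2;   % example values
temp = zeros(w, h);
for i = 1:w
    for j = 1:h
        temp(i,j) = exp(-((w-i)^2 + (h-j)^2)/sigma^2);   % f([w,h],[i,j])
    end
end
X = cell(w, h);
for x = 1:w
    for y = 1:h
        % X{x,y}(i,j) = temp(w-|x-i|, h-|y-j|): pure indexing, no calls to f
        X{x,y} = temp(w - abs(x - (1:w)), h - abs(y - (1:h)));
    end
end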
I want to make a linear fit to a few data points, as shown in the image. Since I know the intercept (in this case, say, 0.05), I want to fit only the points which are in the linear region with this particular intercept. In this case that will be, let's say, points 5:22 (but not 22:30).
I'm looking for a simple algorithm to determine this optimal range of points, based on... hmm, that's the question... R^2? Any ideas how to do it?
I was thinking about probing R^2 for fits using points 1 to 2:30, then 2 to 3:30, and so on, but I don't really know how to wrap it into a clear and simple function. For fits with a fixed intercept I'm using polyfit0 (http://www.mathworks.com/matlabcentral/fileexchange/272-polyfit0-m). Thanks for any suggestions!
EDIT:
sample data:
intercept = 0.043;
x = 0.01:0.01:0.3;
y = [0.0530642513911393,0.0600786706929529,0.0673485248329648,0.0794662409166333,0.0895915873196170,0.103837395346484,0.107224784565365,0.120300492775786,0.126318699218730,0.141508831492330,0.147135757370947,0.161734674733680,0.170982455701681,0.191799936622712,0.192312642057298,0.204771365716483,0.222689541632988,0.242582251060963,0.252582727297656,0.267390860166283,0.282890010610515,0.292381165948577,0.307990544720676,0.314264952297699,0.332344368808024,0.355781519885611,0.373277721489254,0.387722683944356,0.413648156978284,0.446500064130389];
What you have here is a rather difficult problem to find a general solution of.
One approach would be to compute the slopes/intercepts of all consecutive pairs of points, and then do cluster analysis on the intercepts:
slopes = diff(y)./diff(x);
intercepts = y(1:end-1) - slopes.*x(1:end-1);
idx = kmeans(intercepts(:), 3);
x([idx; 3] == 2)   % the points with the intercepts closest to the linear one
This requires the Statistics Toolbox (for kmeans). It is the best of all the methods I tried, although the range of points found this way might have a few small holes in it; e.g., when the slopes between two points near the start or the end happen to lie close to the slope of the line, those points will be detected as belonging to the line. This (and other factors) will require a bit more post-processing of the solution found this way.
Another approach (which I failed to construct successfully) is to do a linear fit in a loop, each time increasing the range of points from some point in the middle towards both endpoints, and checking whether the sum of squared errors remains small. I gave up on this very quickly, because defining what counts as "small" is very subjective and must be done in some heuristic way.
I tried a more systematic and robust approach of the above:
function test
    %% example data
    slope = 2;
    intercept = 1.5;
    x = linspace(0.1, 5, 100).';
    y = slope*x + intercept;
    y(1:12) = log(x(1:12)) + y(12) - log(x(12));
    y(74:100) = y(74:100) + (x(74:100) - x(74)).^8;
    y = y + 0.2*randn(size(y));

    %% simple algorithm
    [X, fn] = fminsearch(@(ii) P(ii, x, y, intercept), [0.5 0.5])
    [~, inds] = P(X, x, y, intercept)
end

function [C, inds] = P(ii, x, y, intercept)
    % ii represents the fraction of the range from the center to each end,
    % so ii lies between 0 and 1.
    N = numel(x);
    n = round(N/2);
    ii = round(ii*n);
    inds = min(max(1, n + (-ii(1):ii(2))), N);
    % Solve the linear system with fixed intercept
    A = x(inds);
    b = y(inds) - intercept;
    % and return the sum of squared errors, divided by the number of
    % points included in the set. This last step is required to prevent
    % fminsearch from reducing the set to 1 point (= minimum possible
    % squared error).
    C = sum(((A\b)*A - b).^2)/numel(inds);
end
which only finds a rough approximation to the desired indices (12 and 74 in this example).
When fminsearch is run a few dozen times with random starting values (really just rand(1,2)), it gets more reliable, but I still wouldn't bet my life on it.
If you have the statistics toolbox, use the kmeans option.
Depending on the number of data values, I would split the data into a relatively small number of overlapping segments, and for each segment calculate the linear fit, or rather the first-order coefficient (remember, you know the intercept, which will be the same for all segments).
Then, for each coefficient, calculate the MSE between the corresponding hypothetical line and the entire dataset, choosing the coefficient which yields the smallest MSE; a sketch follows.
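A rough MATLAB sketch of this idea (the segment length and overlap are arbitrary choices; x, y and intercept as in the question):
win = 8;                                 % segment length (arbitrary)
starts = 1:win/2:numel(x) - win + 1;     % 50% overlapping segments
bestMSE = inf;
for s = starts
    seg = s:s + win - 1;
    a = x(seg)' \ (y(seg)' - intercept);      % fixed-intercept slope of segment
    mse = mean((a*x + intercept - y).^2);     % MSE against the entire dataset
    if mse < bestMSE
        bestMSE = mse;
        bestSlope = a;
    end
end
bestSlope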
I have 24 values for Y, which are measured experimentally, and 24 corresponding values for t: t = [1, 2, 3, ..., 24].
I want to find the relationship between Y and t as an equation, using Fourier analysis. What I have tried and done is this: I wrote the following MATLAB code:
Y=[10.6534
9.6646
8.7137
8.2863
8.2863
8.7137
9.0000
9.5726
11.0000
12.7137
13.4274
13.2863
13.0000
12.7137
12.5726
13.5726
15.7137
17.4274
18.0000
18.0000
17.4274
15.7137
14.0297
12.4345];
ts=1; % step
t=1:ts:24; % the period is 24
f=[-length(t)/2:length(t)/2-1]/(length(t)*ts); % computing frequency interval
M=abs(fftshift(fft(Y)));
figure;plot(f,M,'LineWidth',1.5);grid % plot of harmonic components
figure;
plot(t,Y,'LineWidth',1.5);grid % plot of original data Y
figure;bar(f,M);grid % plot of harmonic components as bar shape
The resulting bar figure shows the magnitude of each harmonic component (plot omitted here).
Now, I want to find the equation for these harmonic components which represent the data. After that I want to draw the original data Y with the data found from the fitting function and the two curves should be close to each other.
Should I use cos or sin or -sin or -cos?
Put another way, what is the rule for representing these harmonics as a function Y = f(t)?
Here is an example done with your data in Mathematica, using the discrete sine transform. Hope you can extrapolate it to MATLAB:
n = 24;
xg = N[Range[n]]/n
fg = l (*your list *)
fp = ListPlot[Transpose[{xg, fg}], PlotRange -> All] (*points plot*)
coef = FourierDST[fg, 1]/Sqrt[n/2]; (*Fourier transform*)
Show[fp, Plot[Sum[coef[[r]]*Sin[Pi r x], {r, n - 1}], {x, -1, 1},
PlotRange -> All]]
The coefficients are:
{16.6411, -4.00062, 5.31557, -1.38863, 2.89762, 0.898562,
1.54402, -0.116046, 1.54847, 0.136079, 1.16729, 0.156489,
0.787476, -0.0879736, 0.747845, 0.00903859, 0.515012, 0.021791,
0.35001, 0.0159676, 0.215619, 0.0122281, 0.0943376, -0.00150218}
Edit
However, as an even function seems to fit better, I also did a discrete Fourier cosine transform of type 3, which works much better:
In this case the coefficients are:
{14.7384, -8.93197, 4.56404, -2.85262, 2.42847, -0.249488,
0.565181,-0.848594, 0.958699, -0.468337, 0.660136, -0.317903,
0.390689,-0.457621, 0.427875, -0.260669, 0.278931, -0.166846,
0.18547, -0.102438, 0.111731, -0.0425396, 0.0484102, -0.00559378}
The coefficients and the plot of the function are obtained by:
coef = FourierDCT[fg, 3]/Sqrt[n];(*Fourier transform*)
f[x_]:= Sum[coef[[r]]*Cos[Pi (r - 1/2) x], {r, n - 1}]
You'll have to experiment a little ...
It depends on what MATLAB gave you back: either sines and cosines, or complex exponentials.
Most radix-2 FFT algorithms demand that the number of data points be an integer power of two. The closest one for your data set is 32, so you should pad it out with zeros.
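(In MATLAB the padding can be done directly via fft's second argument, e.g.:)
Ypad = fft(Y, 32);            % Y is zero-padded to length 32
Mpad = abs(fftshift(Ypad));   % magnitudes, as in the code above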
Thanks for your help.
I found the solution I was aiming for, but for some reason everything is shifted by 1.
Here is the code:
ts = 1; % time step
t = [1:ts:24];
fs = 1/ts; % frequency step
f=[-length(t)/2:length(t)/2-1]/(length(t)*ts); % frequency formula
%data
P=[10.7083
9.7003
8.9780
8.4531
8.1653
8.2633
8.8795
9.9850
11.3289
12.5172
13.2012
13.2720
12.9435
12.6647
12.8940
13.8516
15.3819
17.0033
18.1227
18.3039
17.4531
15.8322
13.9056
12.1154];
plot(t,P,'LineWidth',1.5);grid
xlabel('time (hours)');ylabel('Power (MW)')
title('Power Profile for 2nd Feb, 1998')
% fourier transform analysis
P1 = fft(P)/length(t);
P2=fftshift(P1);
amp=abs(P2); % amplitude
phi = angle(P2); % phase angle
figure
subplot(211),stem(f,amp,'LineWidth',1.5),grid
xlabel('frequency (Hz)');ylabel('amplitude (MW)')
subplot(212),stem(f,phi,'LineWidth',1.5),grid
xlabel('frequency (Hz)');ylabel('phase angle (rad)')
% NOW, I WILL CONSTRUCT THE MODEL FROM THE FIGURE
% THE STRUCTURE IS:
% Pmodel=Ai*COS(i*w*t+phii)
% where, w=2*pi/24 and i is the harmonic order
% Here, up to the third harmonic is enough
% and using Parseval's Theorem, the model is:
% PP=12.6635+2*(1.9806*cos(w*tt+1.807)+0.86388*cos(2*w*tt+2.0769)+0.39686*cos(3*w*tt-1.8132));
w=2*pi/24;
Pmodel=12.6635+2*(1.9806*cos(w*t+1.807)+0.86388*cos(2*w*t+2.0769)+0.39686*cos(3*w*t-1.8132));
figure
plot(t,P,'LineWidth',1.5);grid on
hold on;
plot(t,Pmodel,'r','LineWidth',1.5)
legend('original','model');xlabel('time (hours )');ylabel('Power (MW)')
% But here is a problem: the modeled signal is shifted
% by 1 compared to the original one.
% I redrew the two figures together by plotting Pmodel vs t+1.
% Actually, I don't know why it is shifted, but they are
% exactly identical after shifting by 1.
figure
plot(t,P,'LineWidth',1.5);grid on
hold on;
plot(t+1,Pmodel,'r','LineWidth',1.5)
legend('original','model');xlabel('time (hours )');ylabel('Power (MW)')
Why has this shifting problem happened, and how can I solve it?
The problem is with line 2, "t = [1:ts:24];". It should be "t = 0:ts:23;". The FFT treats the first sample as occurring at t = 0, so the phase angles you extracted are relative to a time axis that starts at zero; evaluating the model at t = 1:24 therefore shifts everything by one sample.
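A sketch with the corrected time base (reusing P and the model coefficients from the code above):
ts = 1;
t = 0:ts:23;    % fft treats the first sample as t = 0
w = 2*pi/24;
Pmodel = 12.6635 + 2*(1.9806*cos(w*t+1.807) + 0.86388*cos(2*w*t+2.0769) ...
         + 0.39686*cos(3*w*t-1.8132));
plot(t, P, t, Pmodel); grid on
legend('original','model')    % the curves should now line up without a manual shift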