How to define matching axis notches from existing "step list" - algorithm

I need a way to align tick marks on two separate axes while being able to control the "step" value (the value between tick marks), where both axes start at mark 0 and end on different maximum values.
Why this problem:
Flot, the JS charting package, has an option to align tick marks, but when I use it I cannot control the step value. I can control the step value directly instead, but then I lose the ability to align tick marks. I can, however, fall back to defining my own max and step values to get what I need (aligned tick marks while maintaining the desired step value), but I need some help with that, which yields this question (read on for details).
Example
Let a be the maximum value on axis A and b be the maximum value on axis B.
In this example, let a = 30, and b = 82.
Let's say I want 6 tick marks (not counting the extra tick mark at the end of the axis). In reality I guessed at 6 after trying out a few.
Once I have a desired number of tick marks, I can do something like this:
30 / 6 = 5 (I just got the needed step value for axis A)
Now I need to figure out the tick alignment for axis B:
82 / 6 = 13.67 (not a good value, I prefer something more rounded)
Move the max value of B to 90, where 90 / 6 = 15 (good - I just got the needed step value for axis B)
End Result
Input:
a_max = 30, b_max = 82
(in reality a_max could be 28.5, 29.42, b_max could be 84, 85.345, etc)
Output:
a_adjusted_max = 30, b_adjusted_max = 90,
a_step = 5, b_step = 15
number of ticks = 6 (+1 if you count the end)
Visual:
|---------|---------|---------|---------|---------|---------> A
0         5         10        15        20        25        30
|---------|---------|---------|---------|---------|---------> B
0         15        30        45        60        75        90
Summary of "Demands"
Need the step value for each axis to be one of 1, 2, 5, 10, 15, 20, 25, 50, 100 (in the example it was 5 for A and 15 for B)
Need the adjusted max value for each axis (in the example it was 30 for A and 90 for B)
Need the number of ticks to match for both axes
(optional) The number of ticks is flexible but should be anywhere between 4 and 12, with that range as the sweet spot
The adjusted max value is at or greater than the original max value, and lands on a "round" number (e.g. 90 is preferred over 82, as in my example above)
Problems (Question)
I need to remove most of the guessing and automate tick mark generation.
i.e. first, I need a better way to get the number of tick marks: above I guessed at the number I wanted because I was after a good "step" value, which can be something like 1, 2, 5, 10, 15, 20, 25, 50, 100. Max values start at 4 and can go up to 100, in rarer cases up to 500. In most cases the max values stay between 30 and 90.
How can I do so?

Here's a procedure I came up with. I'm assuming you only want integers.
choose a number of ticks from 4 to 12
calculate the step size needed for the A and B axes using this number of ticks
find how much we would have to extend axis A and axis B using these step values; add these numbers together and remember the result
repeat from the start for the next tick value
choose the number of ticks that gives the minimal score; if there is a tie, choose the smaller number of ticks
Here are some example results:
a=30, b=82 gives 4 ticks
0 10 20 30
0 28 56 84
a=8, b=5 gives 6 ticks
0 2 4 6 8 10
0 1 2 3 4 5
Here's the pseudocode:
a = range of A axis
b = range of B axis
tickList[] = {4,5,6,7,8,9,10,11,12}

// calculate the score for each number of ticks
for i from 0 to length(tickList)-1
    ticks = tickList[i]
    // the step size we would use for this number of ticks
    Astep = ceiling(a/(ticks-1))
    Bstep = ceiling(b/(ticks-1))
    // how much we would need to extend the A axis
    if (a%Astep != 0)
        Aextend[i] = Astep - a%Astep
    else
        Aextend[i] = 0
    end
    // how much we would need to extend the B axis
    if (b%Bstep != 0)
        Bextend[i] = Bstep - b%Bstep
    else
        Bextend[i] = 0
    end
    // the score is the total extending we would need to do
    score[i] = Aextend[i] + Bextend[i]
end

// find the number of ticks that minimizes the score
bestIdx = 0
bestScore = 1000    // larger than any possible score
for i from 0 to length(tickList)-1
    if (score[i] < bestScore)
        bestIdx = i
        bestScore = score[i]
    end
end
bestTick = tickList[bestIdx]
bestAstep = ceiling(a/(bestTick-1))
bestBstep = ceiling(b/(bestTick-1))

A axis goes from 0 by bestAstep to bestAstep*(bestTick-1)
B axis goes from 0 by bestBstep to bestBstep*(bestTick-1)
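For what it's worth, here is a minimal MATLAB sketch of the same procedure (the function name alignTicks is mine; like the pseudocode, it assumes integer ranges a and b and does not enforce the "nice step" list from the question):
function [bestTick, Astep, Bstep] = alignTicks(a, b)
    % Try every tick count from 4 to 12 and score it by how far both axes
    % would have to be extended to end on a whole number of steps.
    tickList = 4:12;
    score = zeros(size(tickList));
    for i = 1:numel(tickList)
        ticks = tickList(i);
        As = ceil(a / (ticks - 1));             % candidate step for axis A
        Bs = ceil(b / (ticks - 1));             % candidate step for axis B
        Aextend = mod(As - mod(a, As), As);     % extension needed on axis A
        Bextend = mod(Bs - mod(b, Bs), Bs);     % extension needed on axis B
        score(i) = Aextend + Bextend;
    end
    [~, bestIdx] = min(score);                  % min returns the first (smallest) tie
    bestTick = tickList(bestIdx);
    Astep = ceil(a / (bestTick - 1));
    Bstep = ceil(b / (bestTick - 1));
end
Calling alignTicks(30, 82) gives 4 ticks with steps 10 and 28, and alignTicks(8, 5) gives 6 ticks with steps 2 and 1, matching the examples above; each axis then runs from 0 by its step up to step*(bestTick-1).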

Related

Matlab - Algorithm for calculating 1d consecutive line segment edges from midpoints?

So I have a rectilinear grid that can be described with 2 vectors: one for the x-coordinates of the cell centres and one for the y-coordinates. These are just points with varying spacing, e.g. the x spacing scales from 50 down to 10 and back up to 20 (55..45..30..10,10,10..10,12..20,20,20) and the y spacing scales from 60 down to 40 and back up to 60 (60,60,60,55..42,40,40,40..40,42..60,60), and the grid is made like this:
e.g. x = 1 2 3, y = 10 11 12, giving
gridx = 1 2 3      gridy = 10 10 10
        1 2 3              11 11 11
        1 2 3              12 12 12
so cell centre 1 is (1,10), cc2 is (2,10), etc.
Now I'm trying to formulate an algorithm to calculate the positions of the cell edges in the x and y directions. My first idea was to get the first edge using x(1) - [x(2)-x(1)]/2; in the real case x(2)-x(1) equals 60 and x(1) = 16348.95, so celledge1 = x(1) - 30 = 16318.95. Then, after calculating the first one, I go through a loop and calculate the rest like this:
for aa = 2:length(x)+1
    celledge1(aa) = x(aa-1) + [x(aa-1) - celledge1(aa-1)]
end
And I did the same for y. This, however, does not work: in the region where the edge spacing should be 40, my y vector goes approximately 35, 45, 35, 45, ...
Does anyone have any idea why this doesn't work and can point me in the right direction? Cheers
Edit: I tried to find a solution using geometry and algebra:
We are trying to find the points A,B,C,....H. From basic geometry we know:
c1 (centre 1) = [A+B]/2 and c2 = [B+C]/2 etc. etc.
So we have 7 equations and 8 variables. We also know the the first few distances between centres are equal (60,60,60,60) therefore the first segment is 60 too.
B - A = 60
So now we have 8 equations and 8 variables so I made this algorithm in Matlab:
edgex = zeros(length(DATA2.x)+1,1);
edgey = zeros(length(DATA2.y)+1,1);
edgex(1) = (DATA2.x(1)*2-diffx(1))/2;
edgey(1) = (DATA2.y(1)*2-diffy(1))/2;
for aa = 2:length(DATA2.x)+1
    edgex(aa) = DATA2.x(aa-1)*2-edgex(aa-1);
end
for aa = 2:length(DATA2.y)+1
    edgey(aa) = DATA2.y(aa-1)*2-edgey(aa-1);
end
And I still got the same answer as before, with the y spacing going 35, 45, 35, 45 where it should be 40, 40, 40, ... Could it be an accuracy error?
Edit: here are the numbers if you're interested; I did the same computation as above, only in Excel: http://www.filedropper.com/workoutedges
It seems you're just trying to interpolate your data. You can do this with the built-in interp1:
x = [30 24 19 16 8 7 16 22 29 31];
xi = interp1(2:2:numel(x)*2, x, 1:(numel(x)*2+1), 'linear', 'extrap');
This just sets up the original data as the even-indexed elements and interpolates the odd indices, including extrapolation for the two end points.
Results:
xi =
Columns 1 through 11:
33.0000 30.0000 27.0000 24.0000 21.5000 19.0000 17.5000 16.0000 12.0000 8.0000 7.5000
Columns 12 through 21:
7.0000 11.5000 16.0000 19.0000 22.0000 25.5000 29.0000 30.0000 31.0000 32.0000
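Applied to the cell-centre data from the question (assuming DATA2.x holds the centre x-coordinates; the same idea applies to DATA2.y), a rough sketch that keeps only the interpolated edge positions would be:
xc = DATA2.x(:).';                               % centres as a row vector
xi = interp1(2:2:2*numel(xc), xc, 1:2*numel(xc)+1, 'linear', 'extrap');
edgex = xi(1:2:end);                             % odd indices are the cell edges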

Make an n x n-1 matrix from 1 x n vector where the i-th row is the vector without the i-th element, without a for loop

I need this for Lagrange polynomials. I'm curious how one would do this without a for loop. The code currently looks like this:
tj = 1:n;
ti = zeros(n,n-1);
for i = 1:n
    ti(i,:) = tj([1:i-1, i+1:end]);
end
My tj is not really just a 1:n vector but that's not important. While this for loop gets the job done, I'd rather use some matrix operation. I tried looking for some appropriate matrices to multiply it with, but no luck so far.
Here's a way:
v = [10 20 30 40]; %// example vector
n = numel(v);
M = repmat(v(:), 1, n);
M = M(~eye(n));
M = reshape(M,n-1,n).';
gives
M =
20 30 40
10 30 40
10 20 40
10 20 30
This should generalize to any n
ti = flipud(reshape(repmat(1:n, [n-1 1]), [n n-1]));
Let's take a closer look at what's going on. If you look at the resulting matrix closely, you'll see that it consists of n-1 1's, n-1 2's, etc., from the bottom up.
For the case where n is 3:
ti =
2 3
1 3
1 2
So we can flip this vertically and get
f = flipud(ti);
1 2
1 3
2 3
Really this is [1, 2, 3; 1, 2, 3] reshaped to be 3 x 2 rather than 2 x 3.
In that line of thinking
a = repmat(1:3, [2 1])
1 2 3
1 2 3
b = reshape(a, [3 2]);
1 2
1 3
2 3
c = flipud(b);
2 3
1 3
1 2
Bringing it all together and replacing the 3's with n and the 2's with n-1, we are back where you started.
Here's another way. First create a matrix whose rows are copies of the vector tj stacked on top of each other. Next, extract the lower and upper triangular parts of the matrix without the diagonal, then add the results together, making sure to remove the last column of the lower triangular matrix and the first column of the upper triangular matrix.
n = numel(tj);
V = repmat(tj, n, 1);
L = tril(V,-1);
U = triu(V,1);
ti = L(:,1:end-1) + U(:,2:end);
numel finds the total number of values in tj which we store in n. repmat facilitates the stacking of the vector tj to create a matrix that is n x n large. After, we use tril and triu so that we extract the lower and upper triangular parts of the matrices without the diagonal. In addition, the rest of the matrix is all zero except for the relevant triangular parts. The -1 and 1 flags for tril and triu respectively extract this out successfully while ensuring that the diagonal is all zero. This creates a column of extra zeroes appearing at the last column when calling tril and the first column when calling triu. The last part is to simply add these two matrices together ignoring the last column of the tril result and the first column of the triu result.
Given that tj = [10 20 30 40]; (borrowed from Luis Mendo's example), we get:
ti =
20 30 40
10 30 40
10 20 40
10 20 30

Neighboring gray-level dependence matrix (NGLDM) in MATLAB

I would like to calculate a couple of texture features (namely: small/large number emphasis, number non-uniformity, second moment and entropy). Those can be computed from a neighboring gray-level dependence matrix. I'm struggling with understanding/implementing this. There is very little (publicly available) information on this method.
According to this paper:
This matrix takes the form of a two-dimensional array Q, where Q(i,j) can be considered as frequency counts of grayness variation of a processed image. It has a similar meaning as histogram of an image. This array is Ng×Nr where Ng is the number of possible gray levels and Nr is the number of possible neighbours to a pixel in an image.
If the image function f(i,j) is discrete, then it is easy to compute the Q matrix (for positive integer d, a) by counting the number of times the difference between each element in f(i,j) and its neighbours is less than or equal to a at a certain distance d.
Here is the example from the same paper (d = 1, a = 0):
Input (image) matrix and output matrix Q:
I've been looking at this example for hours now and still can't figure out how they got that Q matrix. Anyone?
The method was originally created by C. Sun and W. Wee and was described in a paper called: "Neighboring gray level dependence matrix for texture classification" to which I got access, but can't download (after pressing download the page reloads and that's it).
In the example that you have provided, d=1 and a=0. When d=1, we consider pixels in an 8-pixel neighbourhood. When a=0, this means that we look for pixels that have the same value as the centre of the neighbourhood.
The basic algorithm is the following:
Initialize your NGLDM matrix to all zeroes. The total number of rows corresponds to the total number of possible intensities / values in your image. The total number of columns corresponds to how many pixels are in your neighbourhood plus 1. As such for d=1, we have an 8-pixel neighbourhood and so 8 + 1 = 9. Because there are 4 possible intensities (0,1,2,3), we thus have a 4 x 9 matrix. Let's call this matrix M.
For each pixel in your matrix, take note of this pixel. This goes in the Ng row.
Write out how many valid neighbours there are that surround this pixel.
Count how many of the neighbouring pixels match the pixel from Step #1. This is your Nr column.
Once you have Ng and Nr, increment that location of M by 1.
Here's a slight gotcha: they ignore the border locations. As such, you don't do this procedure for the first row, last row, first column or last column. My guess is that they want to be sure that you have an 8-pixel neighbourhood all the time. This is also dictated by the distance d=1. You must be able to grab every valid pixel given a centre location at d=1. If d=2, then you would have to make sure that every pixel in the centre of the neighbourhood has a full 5 x 5 window around it (a 24-pixel neighbourhood), and so on.
Let's start from the second row, second column location of this matrix. Let's go through the steps:
Ng = 1 as the location is 1.
Valid neighbours - starting from the top left pixel in this neighbourhood, scanning left to right and omitting the centre, we have: 1, 1, 2, 0, 1, 0, 0, 2.
How many of these values are equal to 1? Three. Therefore Nr = 3.
M(Ng,Nr) += 1. Access row Ng = 1 and column Nr = 3, and increment this spot by 1.
Want to know how I figured out they don't count the borders? Let's do the bottom left pixel. That location is 0, so Ng = 0. If you repeat the algorithm that I just said, you would expect Ng = 0, Nr = 1, and so you would expect at least one entry in that location in your matrix... but you don't! If you do similar checks around the border of the image, you'll see that entries that are supposed to be there... aren't. Take a look at the third row, fifth column. You would think that Ng = 1 and Nr = 1, but we don't see that in the matrix.
One more example. Why is M(Ng,Nr) = 4 for Ng = 2, Nr = 4? Well, take a look at every pixel that has a 2 in it. The only valid locations where we can capture an 8-pixel neighbourhood successfully are row=2, col=4; row=3, col=3; row=3, col=4; row=4, col=3; and row=4, col=4. By applying the same algorithm that we have seen, you'll see that for all but one of those locations, Nr = 4. As such, we see the combination Ng = 2, Nr = 4 four times, and that's why that location is set to 4. The exception is row=3, col=4, where Nr = 5, as there are five 2s in the neighbourhood around that centre. That's why you see Ng = 2, Nr = 5, M(Ng,Nr) = 1.
As an example, let's do one of the locations. Let's do the 2 smack dab in the middle of the matrix (row=3, col=3):
Ng = 2
What are the valid neighbouring pixels? 1, 1, 2, 0, 2, 3, 2, 2 (omit the centre)
Count how many pixels are equal to 2. There are four of them, so Nr = 4
M(Ng,Nr) += 1. Take Ng = 2, Nr = 4 and increment this spot by 1.
If you do this with the other valid locations that have 2, you'll see that Nr = 4 each time with the exception of the third row and fourth column, where Nr = 5.
So how would we implement this in MATLAB? What you can do is use im2col to transform each valid neighbourhood into columns. What I'm also going to do is extract the centre of each neighbourhood. This is actually the middle row of the matrix. We will then figure out how many pixels for each neighbourhood equal the centre, sum them up, and this will determine our Nr values. The Ng values will be the middle row values themselves. Once we do this, we can compute a histogram based on these values just like how the algorithm is doing to get our matrix. In other words, try doing this:
% // Your example
A = [1 1 2 3 1; 0 1 1 2 2; 0 0 2 2 1; 3 3 2 2 1; 0 0 2 0 1];
B = im2col(A, [3 3]); %//Convert neighbourhoods to columns - 3 x 3 means d = 1
C = bsxfun(@eq, B, B(5,:)); %//Figure out a logical matrix where each column tells
%//you how many elements equals the one in each centre
D = sum(C, 1) - 1; %// Must subtract by 1 to discount centre pixel
Ng = B(5,:).' + 1; % // We must make this into a column vector, and we also must
% // offset by 1 as MATLAB starts indexing by 1.
%// Column vector is for accumarray input
Nr = D.' + 1; %// Do the same for Nr. We could have simply left out the + 1 here and
%// took out the subtraction of -1 for D, but I want to explicitly show
%// the steps
Q = accumarray([Ng Nr], 1, [4 9]); %// 4 unique intensities, 9 possible locations (0-8)
... and here is our matrix:
Q =
0 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 0 4 1 0 0 0
0 1 0 0 0 0 0 0 0
If you check this, you'll see this matches with Q.
Bonus
If you want to be able to accommodate the algorithm in general, where you specify d and a, we can simply follow the guidelines of your text. For each neighbourhood, you find the absolute difference between the centre pixel and all of the other pixels, and count how many of those differences are <= a, for any positive integer d. Note that this creates a (2*d + 1) x (2*d + 1) neighbourhood that we need to examine. We can also make this into a function. Without further ado:
%// Set A up yourself, then use a and d as inputs
%// Precondition - a and d are both integers. a can be 0 and d is positive!
function [Q] = calculateGrayDepMatrix(A, a, d)
neigh = 2*d + 1; % //Calculate rows/columns of neighbourhood
numTotalNeigh = neigh*neigh; % //Calculate total number of pixels in neighbourhood
middleRow = ceil(numTotalNeigh / 2); %// Figure out which index the middle row is
B = im2col(A, [neigh neigh]); %// Make into columns
Cdiff = abs(bsxfun(@minus, B, B(middleRow,:))); %// For each neighbourhood, subtract with its centre
C = Cdiff <= a; %// For each neighbourhood, figure out which differences are <= a
D = sum(C, 1) - 1; % //For each neighbourhood, add them up
Ng = B(middleRow,:).' + 1; % // Determine Ng and Nr, and find Q
Nr = D.' + 1;
Q = accumarray([Ng Nr], 1, [max(Ng) numTotalNeigh]);
end
We can recreate the scenario we showed above with the example matrix by:
A = [1 1 2 3 1; 0 1 1 2 2; 0 0 2 2 1; 3 3 2 2 1; 0 0 2 0 1];
Q = calculateGrayDepMatrix(A, 0, 1);
Q is thus:
Q =
0 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 0 4 1 0 0 0
0 1 0 0 0 0 0 0 0
Hope this helps!

How to calculate one certain value from a rolling-window estimation in Stata

I'm using Stata to estimate Value-at-Risk (VaR) with the historical simulation method. Basically, I will create a rolling window with 100 observations to estimate VaR for the next 250 days (repeated 250 times). Hence, as far as I know, Stata's rolling command for time series would be useful in this case. Here is the process:
Input: 350 values
1. Sort the very first 100 values in ascending order (by magnitude).
2. Then I need to take the 5th smallest for each window.
3. Repeat 250 times.
Output: a list of the 5th values (250 in total).
Sounds simple, but I cannot get it right. Here is my attempt:
program his, rclass
    sort lnreturn
    return scalar actual=lnreturn in 5
end
tsset stt
time variable: stt, 1 to 350
delta: 1 unit
rolling actual=r(actual), window(100) saving(C:\result100.dta, replace) : his
(running his on estimation sample)
And the result is:
Start   end   actual
    1   100   -.047856
    2   101   -.047856
    3   102   -.047856
    4   103   -.047856
  ...   ...   ...
  251   350   -.047856
What I want is 250 different 5th values in the "actual" column, not the same value repeated like that.
If I understand this correctly, you want the 5th percentile of values in a window of 100. That should yield to summarize, detail or centile. I see no need to write a program.
Your bug is that your program his calculates the same thing each time it is called. There is no communication about windows other than what is explicit in your code. It is like saying
move here: now add 2 + 2
move there: now add 2 + 2
move to New York: now add 2 + 2
The result is invariant to your supposed position.
Note that I doubt that
return scalar actual=lnreturn in 5
really is your code. lnreturn[5] should work.
UPDATE You don't even need rolling here. Looping over data is easy enough. The data in this example are clearly fake.
clear
* sandpit
set obs 500
set seed 2803
gen y = ceil(exp(rnormal(3,2)))
l y in 1/5
* initialise
gen p5 = .
* rolling windows of length 100: 1..100, 2..101, ..., 401..500
quietly forval j = 1/401 {
    local J = `j' + 99
    su y in `j'/`J', detail
    replace p5 = r(p5) in `j'
}
* check first calculation
su y in 1/100, detail
l in 1/5

How to scale down the values so they could fit inside the min and max values

I have 6 graph bars with prices.
Each price number determines its bar's height, respecting the min and max heights.
What I want is that a bar's height never goes below the min or above the max value.
So I have values of min = 55 and max = 110.
And price numbers are:
49
212
717
1081
93
With which mathematical algorithm could I achieve the expected result?
It's some sort of dynamically scalable bar graph.
Modified
So the min and max values from the price list will map as: 49 (min price) => 55 (min) and 1081 (max price) => 110 (max).
The solution is simple:
Pick the smallest and largest items and find the difference.
(largest_item - smallest_item) maps to (max-min).
Compute ratio = (max-min)/(largest_item-smallest_item)
final_value = min_value + ratio*(value-smallest_item)
As a mathematical function:
f(x,max,min,largest,smallest) = min + (max-min)/(largest-smallest)*(x-smallest)
where:
x : Input item's price
max: Maximum value (here, 110)
min: Minimum value (here, 55)
largest: Largest item in input (Here, 1081)
smallest: Smallest item in input (Here, 49)
One check, as @amit correctly points out: ensure the largest and smallest items are distinct.
So let x = 93. We have the other 4 values with us.
f(x,max,min,largest,smallest) = min + (max-min)/(largest-smallest)*(x-smallest)
value = 55 + ((110-55)/(1081-49)) * (93-49)
value = 57.344961
Further,
f(93,110,55,1081,49) = 57.344961
f(49,110,55,1081,49) = 55
f(1081,110,55,1081,49) = 110
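A small MATLAB sketch of this mapping, using the numbers from the question (the variable names are mine):
prices = [49 212 717 1081 93];              % input prices
lo = 55;  hi = 110;                         % allowed bar heights (min, max)
smallest = min(prices);
largest = max(prices);
ratio = (hi - lo) / (largest - smallest);   % assumes largest ~= smallest
heights = lo + ratio * (prices - smallest); % heights(1) = 55, heights(4) = 110, heights(5) is about 57.34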
The function:
[(x - min) / (max - min) * 55] + 55
ensures the boundaries you are after, but you should also consider: what should the graph show? What do you want the reader to understand from it?
Why?
(x-min) / (max-min) gives a number in the range [0,1]: 0 for min, 1 for max.
Multiplying it with 55 ensures a number in range [0,55].
Adding 55 ensures a number in the range [55,110], as expected.
(*) Note: for max = min the above fails because of division by 0; take care of these cases manually.
