I'm trying to figure out the best way to define a von Mises distribution wrapped on a half-circle (I'm using it to draw directionless lines at different concentrations). I'm currently using SciPy's vonmises.rvs(). Essentially, I want to be able to put in, say, a mean orientation of pi/2 and have the distribution truncated to no more than pi/2 either side.
I could use a truncated normal distribution, but I would lose the wrapping of the von Mises (say, if I want a mean orientation of 0).
I've seen this done in research papers looking at mapping fibre orientations, but I can't figure out how to implement it (in python). I'm a bit stuck on where to start.
If my von Mises is defined as (from numpy.random.vonmises):
np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa))
with:
mu, kappa = 0, 4.0
x = np.linspace(-np.pi, np.pi, num=51)
How would I alter it to use a wrap around a half-circle instead?
Could anyone with some experience with this offer some guidance?
It is useful to have direct numerical inverse CDF sampling here; it should work well for a distribution with a bounded domain. Here is a code sample that builds PDF and CDF tables and samples using the inverse CDF method. It could be optimized and vectorized, of course.
Code, Python 3.8, x64 Windows 10
import numpy as np
import matplotlib.pyplot as plt
import scipy.integrate as integrate
def PDF(x, μ, κ):
    return np.exp(κ*np.cos(x - μ))
N = 201
μ = np.pi/2.0
κ = 4.0
xlo = μ - np.pi/2.0
xhi = μ + np.pi/2.0
# PDF normalization
I = integrate.quad(lambda x: PDF(x, μ, κ), xlo, xhi)
print(I)
I = I[0]
x = np.linspace(xlo, xhi, N, dtype=np.float64)
step = (xhi-xlo)/(N-1)
p = PDF(x, μ, κ)/I # PDF table
# making CDF table
c = np.zeros(N, dtype=np.float64)
for k in range(1, N):
    c[k] = integrate.quad(lambda x: PDF(x, μ, κ), xlo, x[k])[0] / I
c[N-1] = 1.0 # so random() in [0...1) range would work right
#%%
# sampling from tabular CDF via inverse CDF method
def InvCDFsample(c, x, gen):
    r = gen.random()
    i = np.searchsorted(c, r, side='right')
    q = (r - c[i-1]) / (c[i] - c[i-1])
    return (1.0 - q) * x[i-1] + q * x[i]
# sampling test
RNG = np.random.default_rng()
s = np.empty(20000)
for k in range(0, len(s)):
    s[k] = InvCDFsample(c, x, RNG)
# plotting PDF, CDF and sampling density
plt.plot(x, p, 'b^') # PDF
plt.plot(x, c, 'r.') # CDF
n, bins, patches = plt.hist(s, x, density = True, color ='green', alpha = 0.7)
plt.show()
and here is the graph with the PDF, CDF, and sampling histogram:
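As noted above, this could be vectorized; here is a minimal sketch of one way to do it (my addition, assuming the same unnormalized PDF). It replaces the per-point quad calls with a cumulative trapezoid rule and the sampling loop with np.interp:
import numpy as np

def pdf(x, mu, kappa):
    return np.exp(kappa * np.cos(x - mu))  # unnormalized von Mises kernel

mu, kappa, N = np.pi / 2.0, 4.0, 201
x = np.linspace(mu - np.pi / 2.0, mu + np.pi / 2.0, N)
p = pdf(x, mu, kappa)

# cumulative trapezoid rule -> CDF table, normalized so c[-1] == 1
c = np.concatenate(([0.0], np.cumsum(0.5 * (p[1:] + p[:-1]) * np.diff(x))))
c /= c[-1]

# vectorized inverse-CDF sampling: interpolate x as a function of the CDF
rng = np.random.default_rng()
s = np.interp(rng.random(20000), c, x)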
You could discard the values outside the desired range via numpy's filtering (theta = theta[(theta >= 0) & (theta <= np.pi)], shortening the array of samples). So, you could first increase the number of generated samples, then filter, and then take a subarray of the desired size (a sketch of this follows the code below).
Or you could add/subtract pi to put them all into that range (via theta = np.where(theta < 0, theta + np.pi, np.where(theta > np.pi, theta - np.pi, theta))). As noted by @SeverinPappadeux, such wrapping changes the distribution and is probably not desired.
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
import numpy as np
from scipy.stats import vonmises
mu = np.pi / 2
kappa = 4
orig_theta = vonmises.rvs(kappa, loc=mu, size=(10000))
fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True, figsize=(12, 4))
for ax in axes:
    theta = orig_theta.copy()
    if ax == axes[0]:
        ax.set_title(f"$Von Mises, \\mu={mu:.2f}, \\kappa={kappa}$")
    else:
        theta = theta[(theta >= 0) & (theta <= np.pi)]
        print(len(theta))
        ax.set_title(f"$Von Mises, angles\\ filtered\\ ({100 * len(theta) / (len(orig_theta)):.2f}\\ \\%)$")
    segs = np.zeros((len(theta), 2, 2))
    segs[:, 1, 0] = np.cos(theta)
    segs[:, 1, 1] = np.sin(theta)
    line_segments = LineCollection(segs, linewidths=.1, colors='blue', alpha=0.5)
    ax.add_collection(line_segments)
    ax.autoscale()
    ax.set_aspect('equal')
plt.show()
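Here is a minimal sketch of the oversample-then-filter idea mentioned above (my addition; the 1.2 oversampling factor is an arbitrary choice that is comfortable for this kappa):
import numpy as np
from scipy.stats import vonmises

mu, kappa, n = np.pi / 2, 4, 10000
# draw more samples than needed, keep those inside [0, pi], then truncate
theta = vonmises.rvs(kappa, loc=mu, size=int(n * 1.2))
theta = theta[(theta >= 0) & (theta <= np.pi)]
assert len(theta) >= n, "increase the oversampling factor"
theta = theta[:n]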
I am trying to figure out how to sample for two random variables uniformly in the region where the sum of the two is greater than zero. I thought a solution might be to sample for X~U(-1,1) and then sample for Y~U(-x,1) where x would be the current sample for X.
But this resulted in a distribution that looks like this.
This doesn't look uniformly distributed as the density of points at the top left is higher and keeps reducing as we move to the right. Can someone point out where the flaw in my reasoning is and how to possibly fix this?
Thank you
You just need to adjust the density of the x points away from the "top-left" corner appropriately. I'd also suggest generating in [0,1] and then transforming into [-1,1] afterwards.
For example:
import numpy as np
# generate points, sqrt takes care of moving points away from zero
n = 50000
x = np.sqrt(np.random.uniform(size=n))
y = np.random.uniform(1-x)  # low=1-x, high defaults to 1, one draw per element
# transform to -1,1
x = x * 2 - 1
y = y * 2 - 1
plotting these gives:
which looks reasonable to me. Note I've colored the [-1,1] square to show where it should fit.
Could you please elaborate a bit on how you arrived at the answer?
Well, the main problem consists in getting a fair way to sample the non-uniform distribution of coordinate X.
From elementary geometry, the area of the part of the upper triangle with x < x0 is: (1/2) * (x0 + 1)^2. As the total area of this upper triangle is equal to 2, it follows that the cumulative probability P of (X < x0) within the upper triangle is: P = (1/4) * (x0 + 1)^2.
So, inverting the last formula, we have: x0 = 2*sqrt(P) - 1
Now, from the Inverse Transform Sampling theorem, we know that we can generate a fair sampling of X by reinterpreting P as a random variable U0 uniformly distributed between 0 and 1.
In Python, this gives us:
u0 = random.uniform(0.0, 1.0)
x = (2*math.sqrt(u0)) - 1.0
or equivalently:
u0 = random.random()
x = (2 * math.sqrt(u0)) - 1.0
Note that this is essentially the same maths as in the excellent answer by @SamMason. It comes from a general statistical principle, which can just as well be used to prove that a fair sampling of the latitude on a 3D sphere is given by arcsin(2*u - 1).
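As a quick aside illustrating that latitude example (my sketch, not part of the derivation above): sampling the latitude uniformly in angle over-represents the poles relative to surface area, while inverting the area CDF gives an area-uniform latitude.
import math
import random

# area-uniform latitude on a sphere: invert P = (sin(lat) + 1) / 2
u = random.random()
lat = math.asin(2.0 * u - 1.0)  # latitude in [-pi/2, pi/2]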
So now we have x, but we still need y. The underlying 2D density is a uniform one, so for a given x, all possible values of y are equidistributed.
The interval of possible values for y is [-x, 1]. So if U1 is yet another independent random variable uniformly distributed between 0 and 1, y can be drawn from the equation:
y = (1+x) * u1 - x
which in Python is rendered by:
u1 = random.random()
y = (1+x)*u1 - x
Overall, the Python code can be written like this:
import math
import random
import matplotlib.pyplot as plt
def mySampler():
    u0 = random.random()
    u1 = random.random()
    x = 2*math.sqrt(u0) - 1.0
    y = (1+x)*u1 - x
    return (x,y)
#--- Main program:
points = (mySampler() for _ in range(10000)) # an iterator object
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()
Graphically, the result looks good enough:
Side note: a cheaper, ad hoc solution:
There is always the possibility of sampling uniformly in the whole square, and rejecting the points whose x+y sum happens to be negative. But this is a bit wasteful. We can have a more elegant solution by noting that the “bad” region has the same shape and area as the “good” region.
So if we get a "bad" point, instead of just rejecting it, we can replace it by its symmetric point with respect to the x+y=0 dividing line. This can be done using the following Python code:
def mySampler2():
    x0 = random.uniform(-1.0, 1.0)
    y0 = random.uniform(-1.0, 1.0)
    s = x0+y0
    if (s >= 0):
        return (x0, y0) # good point
    else:
        return (x0-s, y0-s) # symmetric of bad point
This works fine too. And this is probably the cheapest possible solution regarding CPU time, as we reject nothing, and we don't need to compute a square root.
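A quick sanity check of the reflection sampler (my addition, reusing mySampler2 and plt from above): every returned point should satisfy x + y >= 0.
pts2 = [mySampler2() for _ in range(10000)]
assert all(x + y >= 0 for x, y in pts2)  # all points lie in the target region
plt.scatter(*zip(*pts2), s=0.2)
plt.show()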
Following Generate random locations within a triangular domain
Code, to sample uniformly in any triangle, Python 3.9.4, Win 10 x64
import math
import random
import matplotlib.pyplot as plt
def trisample(A, B, C):
    """
    Given three vertices A, B, C,
    sample a point uniformly in the triangle
    """
    r1 = random.random()
    r2 = random.random()
    s1 = math.sqrt(r1)
    x = A[0] * (1.0 - s1) + B[0] * (1.0 - r2) * s1 + C[0] * r2 * s1
    y = A[1] * (1.0 - s1) + B[1] * (1.0 - r2) * s1 + C[1] * r2 * s1
    return (x, y)
random.seed(312345)
A = (1, 0)
B = (1, 1)
C = (0, 1)
points = [trisample(A, B, C) for _ in range(10000)]
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()
Attached is a simple python Kalman filter example of a free-fall object (g=-9.8m/s^2)
Alas, I have a problem. The state vector x contains both the position and the velocity but the z vector (measurement) contains only the position.
If I set a wrong initial position value, the algorithm converges to the true value even with noisy measurements (see picture below).
However, if I set a wrong initial velocity value, the algorithm does not converge even though the motion model is defined correctly.
Attached is the python code:
kalman.py
In your code I see two problems.
You set the Q matrix to zero. It means you trust your model too much and give the filter no chance to improve the estimate through the measurements. Your filter becomes too stiff; you can think of it like a low-pass filter with a very big time constant.
In my code I set the Q-Matrix to
Q = np.array([[1,0],[0,0.1]])
The second issue is your measurement noise. You simulate the noisy measurements with R=100 but tell the filter R=4, so the filter trusts the measurement more than it should. This issue is not really relevant to your question, but it should still be corrected.
Now even if I set the initial velocity to 20, the position estimation works fine.
Here is the estimation for R = 4:
And for R = 100:
UPDATE
The velocity estimation goes wrong because you have some mistakes in your matrix operations. Please note that matrix multiplication goes through np.dot(), not through *.
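To make that distinction concrete, here is a generic illustration (my snippet, not your matrices): with 2-D arrays, * multiplies element-wise with broadcasting, while np.dot performs the matrix product the filter equations require.
import numpy as np

F = np.array([[1.0, 0.01], [0.0, 1.0]])
x = np.array([[100.0], [0.0]])
print(F * x)         # element-wise with broadcasting: (2,2)*(2,1) -> (2,2)
print(np.dot(F, x))  # matrix product: (2,2)@(2,1) -> (2,1), what the filter needs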
Here is a correct result for v0 = 20:
Many thanks, Anton.
Attached below is the corrected code for your convenience:
Roi
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib notebook  # Jupyter magic; uncomment when running in a notebook
from numpy.linalg import inv
N = 1000 # number of time steps
dt = 0.01 # Sampling time (s)
t = dt*np.arange(N)
F = np.array([[1, dt],[ 0, 1]])# system matrix - state
B = np.array([[-1/2*dt**2],[ -dt]])# system matrix - input
H = np.array([[1, 0]])#; % observation matrix
Q = np.array([[1,0],[0,1]])
u = 9.80665# % input = acceleration due to gravity (m/s^2)
I = np.array([[1,0],[0,1]]) #identity matrix
# Define the initial position and velocity
y0 = 100; # m
v0 = 0; # m/s
G2 = np.array([-1/2*dt**2, -dt])# system matrix - input
# Initialize the state vector (true state)
xt = np.zeros((2, N)) # True state vector
xt[:,0] = [y0,v0]
for k in range(1,N):
    xt[:,k] = np.dot(F,xt[:,k-1]) + G2*u
#Generate the noisy measurement from the true state
R = 4 # % m^2/s^2
v = np.sqrt(R)*np.random.randn(N) #% measurement noise
z = np.dot(H,xt) + v; #% noisy measurement
R2=4
#% Initialize the covariance matrix
P = np.array([[10, 0], [0, 0.1]])# Covariance for initial state error
#% Loop through and perform the Kalman filter equations recursively
x_list =[]
x_kalman= np.array([[117],[290]])
x_list.append(x_kalman)
print(-B*u)
for k in range(1,N):
    # predict
    x_kalman = np.dot(F,x_kalman) + B*u
    P = np.dot(np.dot(F,P),F.T) + Q
    # update
    S = np.dot(np.dot(H,P),H.T) + R2
    S2 = inv(S)
    K = np.dot(np.dot(P,H.T),S2)  # Kalman gain, shape (2,1)
    x_kalman = x_kalman + np.dot(K, z[:,k] - np.dot(H,x_kalman))
    P = np.dot(I - np.dot(K,H), P)
    x_list.append(x_kalman)
x_array = np.array(x_list)
print(x_array.shape)
plt.figure()
plt.plot(t,z[0,:], label="measurement", color='LIME', linewidth=1)
plt.plot(t,x_array[:,0,:],label="kalman",linewidth=5)
plt.plot(t,xt[0,:],linestyle='--', label = "Truth",linewidth=6)
plt.legend(fontsize=30)
plt.grid(True)
plt.xlabel("t[s]")
plt.title("Position Estimation", fontsize=20)
plt.ylabel("$X_t$ = h[m]")
plt.gca().set( ylim=(0, 110))
plt.gca().set(xlim=(0,6))
plt.figure()
#plt.plot(t,z, label="measurment", color='LIME')
plt.plot(t,x_array[:,1,:],label="kalman",linewidth=4)
plt.plot(t,xt[1,:],linestyle='--', label = "Truth",linewidth=2)
plt.legend()
plt.grid(True)
plt.xlabel("t[s]")
plt.title("Velocity Estimation")
plt.ylabel("$X_t$ = h[m]")
I'm wondering whether I can use GEKKO for the following problem. Please feel free to share your comments. Thank you in advance.
I'd like to approximate some nonlinear functions by piecewise-linear (PWL) segments. For instance, I'd like to use N PWL segments to approximate a Gaussian function. Is it possible to leverage GEKKO for this problem? What available examples do you suggest studying?
Thank you
The link that Junho sent is good if you have discontinuous functions that are linear or nonlinear with switching conditions. If you have data then there is a PWL function in Gekko that you can use without binary or MPCC switching conditions. Below is a simple PWL example in Python. Instead of the data points I included, you can use PWL segments to approximate the Gaussian function.
import matplotlib.pyplot as plt
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
m.options.SOLVER = 1
x = m.FV(value = 4.5)
y = m.Var()
xp = np.array([1, 2, 3, 3.5, 4, 5])
yp = np.array([1, 0, 2, 2.5, 2.8, 3])
m.pwl(x,y,xp,yp)
m.solve()
plt.plot(xp,yp,'rx-',label='PWL function')
plt.plot(x.value,y.value,'bo',label='Solution point')
plt.show()
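For the Gaussian case in the question, the same pwl object can be fed breakpoints taken from the function itself; here is a sketch (my addition; the grid and query point are arbitrary choices):
import numpy as np
from gekko import GEKKO

m = GEKKO(remote=False)
m.options.SOLVER = 1                 # APOPT handles the binary PWL segments
xp = np.linspace(-3, 3, 7)           # 6 PWL segments over [-3, 3]
yp = np.exp(-xp**2 / 2)              # Gaussian evaluated at the breakpoints
x = m.FV(value=1.3)                  # query point
y = m.Var()
m.pwl(x, y, xp, yp)                  # y = PWL approximation of the Gaussian at x
m.solve(disp=False)
print(y.value[0], np.exp(-1.3**2 / 2))  # PWL value vs true Gaussian value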
If there is a data set with many points, sometimes it is desirable to fit just a few points with a PWL segments. This is another example that shows how to fit a PWL approximation. In this case you can't use the PWL object in Gekko.
from scipy import optimize
import matplotlib.pyplot as plt
from gekko import GEKKO
import numpy as np
m = GEKKO()
m.options.SOLVER = 3
m.options.IMODE = 2
xzd = np.linspace(1,5,100)
yzd = np.sin(xzd)
xz = m.Param(value=xzd)
yz = m.CV(value=yzd)
yz.FSTATUS = 1
xp_val = np.array([1, 2, 3, 3.5, 4, 5])
yp_val = np.array([1, 0, 2, 2.5, 2.8, 3])
xp = [m.FV(value=xp_val[i],lb=xp_val[0],ub=xp_val[-1]) for i in range(6)]
yp = [m.FV(value=yp_val[i]) for i in range(6)]
for i in range(6):
    xp[i].STATUS = 0
    yp[i].STATUS = 1
for i in range(5):
    m.Equation(xp[i+1]>=xp[i]+0.05)
x = [m.Var(lb=xp[i],ub=xp[i+1]) for i in range(5)]
x[0].lower = -1e20
x[-1].upper = 1e20
# Variables
slk_u = [m.Var(value=1,lb=0) for i in range(4)]
slk_l = [m.Var(value=1,lb=0) for i in range(4)]
# Intermediates
slope = []
for i in range(5):
    slope.append(m.Intermediate((yp[i+1]-yp[i]) / (xp[i+1]-xp[i])))
y = []
for i in range(5):
    y.append(m.Intermediate((x[i]-xp[i])*slope[i]))
for i in range(4):
    m.Obj(1000*(slk_u[i] + slk_l[i]))
m.Equation(xz == x[0] + slk_u[0])
for i in range(3):
    m.Equation(xz == x[i+1] + slk_u[i+1] - slk_l[i])
m.Equation(xz == x[4] - slk_l[3])
m.Equation(yz == yp[0] + y[0] + y[1] + y[2] + y[3] + y[4])
m.solve()
#y_val = yz.value
#print(y_val)
plt.plot([p.value[0] for p in xp],[p.value[0] for p in yp],'rx-',label='PWL function')
plt.plot(xzd,yzd,'b.',label='Data')
plt.show()
Please check out the link below for examples of PWL using binary decision variables.
Logical conditions in Optimization
To sample a triangle ABC uniformly, I can use the following formula:
P = (1 - sqrt(r1)) * A + (sqrt(r1)*(1 - r2)) * B + (r2*sqrt(r1)) * C
where r1 and r2 are random numbers between 0 and 1. The more samples you take, the better. But what if I want to get a better distribution while keeping the number of samples low?
For example, if I had a square, I could implicitly divide it into an N x N grid and generate a random sample inside the smaller grid squares, like this:
float u = (x + rnd(seed)) / width;
float v = (y + rnd(seed)) / height;
The point is I force the sampling to cover the entire grid at a lower sample resolution.
How can I achieve this with a triangle? The only way I can think of is to explicitly subdivide it into a number of triangles using a library like Triangle. But is there a way to do this implicitly like with a square, without having to actually divide the triangle?
OK, I had some thoughts and believe using quasirandom numbers could improve "uniformity" of the points-in-the-triangle coverage without doing subdivision into smaller triangles. Quasirandom sampling using Sobol sequences could provide a lot better coverage as seen in the Wiki article.
Here are 200 points in the triangle using the standard RNG (whatever it is in Python):
And here is a picture with 200 points sampled from a Sobol 2D sequence:
Looks a lot better to me. Python code to play with:
import sys
import math
import random
import matplotlib.pyplot as plt
import sobol_seq
def trisample(A, B, C, r1, r2):
    s1 = math.sqrt(r1)
    x = A[0] * (1.0 - s1) + B[0] * (1.0 - r2) * s1 + C[0] * r2 * s1
    y = A[1] * (1.0 - s1) + B[1] * (1.0 - r2) * s1 + C[1] * r2 * s1
    return (x, y)
if __name__ == "__main__":
    N = 200
    A = (0.0, 0.0)
    B = (1.0, 0.0)
    C = (0.5, 1.0)
    seed = 1
    xx = list()
    yy = list()
    random.seed(312345)
    for k in range(0, N):
        pts, seed = sobol_seq.i4_sobol(2, seed)
        r1 = pts[0]
        r2 = pts[1]
        # uncomment if you want the standard rng
        #r1 = random.random()
        #r2 = random.random()
        pt = trisample(A, B, C, r1, r2)
        xx.append(pt[0])
        yy.append(pt[1])
    plt.scatter(xx, yy)
    plt.show()
    sys.exit(0)
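If sobol_seq is not available, newer SciPy versions (1.7+) ship a Sobol generator that can be swapped in; here is a sketch reusing trisample, A, B, C from above (my adaptation, not part of the original answer):
from scipy.stats import qmc

sobol = qmc.Sobol(d=2, scramble=False)
# SciPy warns when the sample count is not a power of two, but it still works
points = [trisample(A, B, C, r1, r2) for r1, r2 in sobol.random(200)]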
I'd suggest using Poisson disk sampling (short academic paper link, pretty visualization link, wiki link, code link) to generate a configuration within the bounding box of your triangle and then cropping to the area bounded by the triangle (a sketch of the cropping step is below).
I suggest starting with the short academic paper. The principle at work here is pretty easy to understand. There are many variations of this idea floating around out there, so get a handle on it and find the one that works for you.
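For the cropping step, a sign test on the triangle's edges is enough to keep only interior points; here is a minimal sketch (my addition; poisson_points is a hypothetical list of (x, y) tuples from whichever Poisson disk implementation you pick):
def in_triangle(p, a, b, c):
    # signed area test: sign of the cross product of (r->q) with (r->p)
    def side(p, q, r):
        return (p[0] - r[0]) * (q[1] - r[1]) - (q[0] - r[0]) * (p[1] - r[1])
    d1, d2, d3 = side(p, a, b), side(p, b, c), side(p, c, a)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)  # all signs agree -> inside (or on an edge)

kept = [p for p in poisson_points if in_triangle(p, A, B, C)]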
I implemented the thin plate spline algorithm (see also this description) in order to interpolate scattered data using Python.
My algorithm seems to work correctly when the bounding box of the initial scattered data has an aspect ratio close to 1. However, scaling one of the data points coordinates changes the interpolation result. I created a minimal working example that is representative of what I am trying to accomplish. Below are two plots showing the results of the interpolation of 50 random points.
First, the interpolation of z = x^2 on the domain x = [0, 3], y = [0, 120]:
As you can see, the interpolation fails. Now, executing the same process but after scaling the x values by a factor of 40, I get:
This time, the result looks better. Choosing a slightly different scaling factor would have resulted in a slightly different interpolation. This shows that something is wrong in my algorithm but I can't find what exactly. Here is the algorithm:
import numpy as np
import numba as nb
# pts1 = Mx2 matrix (original coordinates)
# z1 = Mx1 column vector (original values)
# pts2 = Nx2 matrix (interpolation coordinates)
def gen_K(n, pts1):
    K = np.zeros((n,n))
    for i in range(0,n):
        for j in range(0,n):
            if i != j:
                r = ( (pts1[i,0] - pts1[j,0])**2.0 + (pts1[i,1] - pts1[j,1])**2.0 )**0.5
                K[i,j] = r**2.0*np.log(r)
    return K
def compute_z2(m, n, pts1, pts2, coeffs):
    z2 = np.zeros((m,1))
    x_min = np.min(pts1[:,0])
    x_max = np.max(pts1[:,0])
    y_min = np.min(pts1[:,1])
    y_max = np.max(pts1[:,1])
    for k in range(0,m):
        pt = pts2[k,:]
        # If point is located inside bounding box of pts1
        if (pt[0] >= x_min and pt[0] <= x_max and pt[1] >= y_min and pt[1] <= y_max):
            z2[k,0] = coeffs[-3,0] + coeffs[-2,0]*pts2[k,0] + coeffs[-1,0]*pts2[k,1]
            for i in range(0,n):
                r2 = ( (pts1[i,0] - pts2[k,0])**2.0 + (pts1[i,1] - pts2[k,1])**2.0 )**0.5
                if r2 != 0:
                    z2[k,0] += coeffs[i,0]*( r2**2.0*np.log(r2) )
        else:
            z2[k,0] = np.nan
    return z2
gen_K_nb = nb.jit(nb.float64[:,:](nb.int64, nb.float64[:,:]), nopython = True)(gen_K)
compute_z2_nb = nb.jit(nb.float64[:,:](nb.int64, nb.int64, nb.float64[:,:], nb.float64[:,:], nb.float64[:,:]), nopython = True)(compute_z2)
def TPS(pts1, z1, pts2, factor):
    n, m = pts1.shape[0], pts2.shape[0]
    P = np.hstack((np.ones((n,1)),pts1))
    Y = np.vstack((z1, np.zeros((3,1))))
    K = gen_K_nb(n, pts1)
    K += factor*np.identity(n)
    L = np.zeros((n+3,n+3))
    L[0:n, 0:n] = K
    L[0:n, n:n+3] = P
    L[n:n+3, 0:n] = P.T
    L_inv = np.linalg.inv(L)
    coeffs = L_inv.dot(Y)
    return compute_z2_nb(m, n, pts1, pts2, coeffs)
Finally, here is the code snippet I used to create the two plots:
import matplotlib.pyplot as plt
import numpy as np
N = 50 # Number of random points
pts = np.random.rand(N,2)
pts[:,0] *= 3.0 # initial x values
pts[:,1] *= 120.0 # initial y values
z1 = (pts[:,0])**2.0
for scale in [1.0, 40.0]:
    pts1 = pts.copy()
    pts1[:,0] *= scale
    x2 = np.linspace(np.min(pts1[:,0]), np.max(pts1[:,0]), 40)
    y2 = np.linspace(np.min(pts1[:,1]), np.max(pts1[:,1]), 40)
    x2, y2 = np.meshgrid(x2, y2)
    pts2 = np.vstack((x2.flatten(), y2.flatten())).T
    z2 = TPS(pts1, z1.reshape(z1.shape[0], 1), pts2, 0.0)
    # Display
    fig = plt.figure(figsize=(4,3))
    ax = fig.add_subplot(111)
    C = ax.contourf(x2, y2, z2.reshape(x2.shape), np.linspace(0,9,10), extend='both')
    ax.plot(pts1[:,0], pts1[:,1], 'ok')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    plt.colorbar(C, extendfrac=0)
    plt.tight_layout()
    plt.show()
The thin plate spline is scale invariant, which means that if you scale x and y by the same factor, the result should be the same. However, if you scale x and y differently, then the result will be different. This is a common characteristic among radial basis functions; some radial basis functions are not even scale invariant.
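A quick way to convince yourself of this (my sketch, using SciPy's RBFInterpolator with a thin-plate kernel rather than the code above): evaluate at correspondingly scaled query points and compare.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
pts = rng.random((50, 2)) * [3.0, 120.0]   # same domain as in the question
z = pts[:, 0]**2

f1 = RBFInterpolator(pts, z, kernel='thin_plate_spline')
f2 = RBFInterpolator(pts * 40.0, z, kernel='thin_plate_spline')         # both axes
f3 = RBFInterpolator(pts * [40.0, 1.0], z, kernel='thin_plate_spline')  # x only

q = np.array([[1.5, 60.0]])
print(f1(q), f2(q * 40.0))    # agree up to round-off: uniform scaling is harmless
print(f3(q * [40.0, 1.0]))    # generally different: anisotropic scaling changes the fit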
When you say it "fails", what do you mean? The big question is: does it still exactly interpolate at the construction points? Assuming your code is correct and you do not have ill-conditioning, it should, in which case it does not really fail.
What I think is happening is that the addition of the scale is making the behavior in the x direction more dominant so you do not see the wiggles that come naturally from the interpolation.
As an aside, you can greatly speed up your code without using Numba by vectorizing.
import scipy.spatial.distance
import scipy.special
def gen_K(n,pts1):
    # No need for n, but kept to maintain compatibility
    pts1 = np.atleast_2d(pts1)
    r = scipy.spatial.distance.cdist(pts1,pts1)
    return scipy.special.xlogy(r**2,r)
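The evaluation step can be vectorized the same way; here is a sketch that mirrors compute_z2 but skips the bounding-box/NaN handling (my code, assuming coeffs is laid out as in the TPS function above: radial weights first, affine terms last):
import numpy as np
import scipy.spatial.distance
import scipy.special

def compute_z2_vec(pts1, pts2, coeffs):
    # radial part: sum_i c_i * r_i^2 * log(r_i); xlogy returns 0 at r = 0
    r = scipy.spatial.distance.cdist(pts2, pts1)
    z2 = scipy.special.xlogy(r**2, r).dot(coeffs[:-3, 0])
    # affine part: a0 + a1*x + a2*y (the last three coefficients)
    return z2 + coeffs[-3, 0] + pts2.dot(coeffs[-2:, 0])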
It means you will get horrible ridges running through the surface, resulting in a sub-optimal model fit. Your model is experiencing the same effect, although plotted in 2D.