ROC curve interpretation

I have a question regarding the ROC curve. The question is: Order the points A, B, C w.r.t the accuracy of the classifier (from good to weak).
I know the ROC curve shows the relation of true positive rate and false positive rate.
But I am not sure how to answer the question above and also how to justify it.

Related

Algorithm: How to smoothly interpolate/reconstruct sparse samples with noise?

This question is not directly related to a particular programming language but is an algorithmic question.
What I have is a lot of samples of a 2D function. The samples are at random locations, they are not uniformly distributed over the domain, the sample values contain noise and each sample has a confidence-weight assigned to it.
What I'm looking for is an algorithm to reconstruct the original 2D function from the samples, i.e. a function y' = G(x0, x1) that approximates the original well and smoothly interpolates areas where samples are sparse.
It goes in the direction of what scipy.interpolate.griddata does, but with the added difficulties that:
the sample values contain noise - meaning that samples should not just be interpolated, but nearby samples also averaged in some way to average out the sampling noise.
the samples are weighted, so samples with higher weight should contribute more strongly to the reconstruction than those with lower weight.
scipy.interpolate.griddata seems to do a Delaunay triangulation and then use the barycentric coordinates of the triangles to interpolate values. This doesn't seem to be compatible with my requirements of weighting samples and averaging out noise, though.
Can someone point me in the right direction on how to solve this?
Based on the comments, the function is defined on a sphere. That simplifies life because your region is both well-studied and nicely bounded!
First, decide how many Spherical Harmonic functions you will use in your approximation. The fewer you use, the more you smooth out noise. The more you use, the more accurate it will be. But if you use any of a particular degree, you should use all of them.
And now you just impose the condition that the sum of the squares of the weighted errors should be minimized. That will lead to a system of linear equations, which you then solve to get the coefficients of each harmonic function.
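To make the last step concrete, here is a minimal sketch of the weighted least-squares fit, assuming the samples are given as spherical angles; the maximum degree, the real-harmonic convention, and the function names are illustrative choices, not part of the original answer:

import numpy as np
from scipy.special import sph_harm

L_MAX = 5  # maximum spherical-harmonic degree; lower -> more smoothing, higher -> more detail

def design_matrix(theta, phi):
    # One column per real spherical harmonic up to degree L_MAX
    # (theta = azimuthal angle, phi = polar angle, as scipy expects).
    cols = []
    for l in range(L_MAX + 1):
        for m in range(-l, l + 1):
            y = sph_harm(abs(m), l, theta, phi)
            cols.append(y.imag if m < 0 else y.real)
    return np.column_stack(cols)

def fit_coeffs(theta, phi, values, weights):
    # Minimize the sum of squared *weighted* errors: sum_i w_i * (A_i . c - v_i)^2.
    A = design_matrix(theta, phi)
    sw = np.sqrt(weights)
    coeffs, *_ = np.linalg.lstsq(A * sw[:, None], values * sw, rcond=None)
    return coeffs

def G(theta, phi, coeffs):
    # Evaluate the smooth reconstruction at query angles.
    return design_matrix(theta, phi) @ coeffs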

Sampling methods for plotting

Say we are making a program to render the plot of a function (black box) provided by the user as a sequence of line segments. We want to get the minimum number of samples of the function so the resulting image "looks" like the function (the exact meaning of "looks" here is part of the question). A naive approach might be to just sample at fixed intervals but we can probably do better than that eg by sampling the "curvy bits" more than the "linear bits". Are there systematic approaches/research on this problem?
This reference may be helpful; it uses a combined sampling method, and its related-work section explains more about other sampling methods:
There are several strategies for plotting the function y = f(x) on the interval Ω = [a, b]. The naive approach, based on sampling f at a fixed number of equally spaced points, is described in [20]. Simple functions suffer from oversampling, while oscillating curves are under-sampled; these issues are mentioned in [14]. Another approach, based on the interval constraint plot constructing a hull of the curve, was described in [6], [13], [20]. The automated detection of a useful domain and range of the function is mentioned in [41]; the generalized interval arithmetic approach is described in [40].
A significant refinement is represented by adaptive sampling, which provides a higher sampling density in the higher-curvature regions. There are several algorithms for curve interpolation preserving the speed, for example [37], [42], [43]. The adaptive feed rate technique is described in [44]. An early implementation in the Mathematica software is presented in [39]. By reducing data, these methods are very efficient for curve plotting. The polygonal approximation of a parametric curve based on adaptive sampling is mentioned in several papers. The refinement criteria, as well as the recursive approach, are discussed in [15]. An approximation by polygonal curves is described in [7], a robust method for the geometric and spatial approximation of implicit curves can be found in [27], [10], and affine arithmetic working on triangulated models in [32]. However, map projections are never defined by implicit equations. Similar approaches can be used for graph drawing [21].
Other techniques based on approximation by breakpoints can be found in many papers: [33], [9], [3]; these approaches are used for the polygonal approximation of closed curves and are applied in computer vision.
Hence, these are the reference methods that define some measures for a "good" plot and introduce approaches to optimize the plot based on those measures:
constructing a hull of the curve
automated detection of a useful domain and a range of the function
adaptive sampling: providing a higher sampling density in the higher-curvature regions (see the sketch after this list)
approximation by the polygonal curves
affine arithmetic working in the triangulated models
combined sampling: a polygonal approximation of the parametric curve that handles discontinuities; the idea is to split the domain into subintervals without discontinuities, a typical problem solvable by a recursive approach. The modified method is used to reconstruct and plot the function f(x).
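As a rough illustration of the adaptive-sampling idea (not the combined method from the reference), here is a minimal recursive sketch; the flatness test and the tolerance are illustrative choices:

import numpy as np

def adaptive_sample(f, a, b, tol=1e-3, max_depth=12):
    # Recursively subdivide [a, b], refining wherever the midpoint of f
    # deviates from the chord between the interval endpoints.
    def recurse(x0, x1, y0, y1, depth):
        xm = 0.5 * (x0 + x1)
        ym = f(xm)
        if depth >= max_depth or abs(ym - 0.5 * (y0 + y1)) <= tol:
            return [(x1, y1)]
        return (recurse(x0, xm, y0, ym, depth + 1) +
                recurse(xm, x1, ym, y1, depth + 1))
    ya, yb = f(a), f(b)
    return [(a, ya)] + recurse(a, b, ya, yb, 0)

# Curvy regions end up with more samples than nearly linear ones.
points = adaptive_sample(np.sin, 0.0, 10.0)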

Classification of K-nearest neighbours algorithm

The x and y axis have different scales in this scatter plot.
Assume the centre of each shape to be the datapoint.
Q: What will be the classification of a test point for a 9-nearest-neighbour classifier using this training set, using both features?
Q: On the scatter plot at the top of the page, in any order, name the class of three nearest neighbours for the bottom left unknown point, using both features to compute distance.
Here's my attempt:
1: A higher K, 9 in this case, means more voters in each prediction and hence more resilience to outliers. Larger values of K give smoother decision boundaries for deciding between Pet and Wild here, which means lower variance but increased bias.
2: Using the Pythagorean theorem, the distances of the three nearest points to the bottom-left unknown point are:
Pet, Distance = 0.02
Pet, Distance = 2.20
Wild, Distance = 2.60
Therefore, the class is Pet.
Question 1 asks for a specific answer (Pet or Wild), which you have not provided. The statements you've made are generally true, but they don't actually answer the question. Notice that there are only 4 Pet points, and the rest are Wild. So no matter which 9 points are the nearest neighbors, at least 5 (a majority) will be Wild. Hence, a KNN classifier with K = 9 will always predict Wild using this data.
Question 2 looks mostly right. I don't have the exact coordinates of the points, but your numbers seem to be in the right ballpark, except you probably have a typo in the first distance. The classes are right, and the resulting prediction (which the question didn't explicitly ask for) is also right (assuming K = 3).
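The majority argument behind question 1 is easy to check numerically. Here is a minimal sketch with made-up coordinates standing in for the scatter plot (4 Pet points, 8 Wild points; the actual values do not matter for the conclusion):

from collections import Counter
import numpy as np

# Hypothetical training data: two features per point.
X = np.array([[1.0, 1.0], [1.5, 1.2], [2.0, 0.8], [1.2, 2.0],   # Pet
              [5.0, 5.0], [5.5, 4.8], [6.0, 5.2], [4.8, 6.0],   # Wild
              [5.2, 6.5], [6.5, 5.5], [4.5, 5.8], [6.2, 6.1]])
y = ["Pet"] * 4 + ["Wild"] * 8

def knn_predict(X, y, query, k):
    # Classify `query` by majority vote among its k nearest training points.
    dists = np.linalg.norm(X - query, axis=1)   # Euclidean distance over both features
    nearest = np.argsort(dists)[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]

# With k = 9, at most 4 of the 9 neighbours can be Pet, so the vote is always Wild.
print(knn_predict(X, y, np.array([0.0, 0.0]), k=9))   # -> Wild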

Uncertainty on pose estimate when minimizing measurement errors

Let's say I want to estimate the camera pose for a given image I, and I have a set of measurements (e.g. 2D points u_i and their associated 3D coordinates P_i) for which I want to minimize the error (e.g. the sum of squared reprojection errors).
My question is: How do I compute the uncertainty on my final pose estimate ?
To make my question more concrete, consider an image I from which I extracted 2D points u_i and matched them with 3D points P_i. Denote by T_w the camera pose for this image, which I will be estimating, and by π_T the transformation mapping the 3D points to their projected 2D points under a pose T. Here is a little drawing to clarify things:
My objective statement is as follows:

    T_w* = argmin_T Σ_i ‖ u_i − π_T(P_i) ‖²
There exist several techniques to solve the corresponding non-linear least squares problem; suppose I use the following (approximate pseudo-code for the Gauss-Newton algorithm):
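A minimal sketch of such a Gauss-Newton loop, with a simplified additive update of the pose parameters (a real implementation would retract the update onto the pose manifold):

import numpy as np

def gauss_newton(residual_fn, jacobian_fn, x0, n_iters=10):
    # residual_fn(x): stacked reprojection residuals r(x)
    # jacobian_fn(x): Jacobian J_r of r with respect to the pose parameters x
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        r = residual_fn(x)
        Jr = jacobian_fn(x)
        # Normal equations of the linearized problem: (J_r^T J_r) dx = -J_r^T r
        dx = np.linalg.solve(Jr.T @ Jr, -Jr.T @ r)
        x = x + dx   # simplified additive update
    return x, Jr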
I read in several places that J_r^T · J_r could be considered an estimate of the covariance matrix for the pose estimate. Here is a list of more specific questions:
Can anyone explain why this is the case and/or point to a scientific document explaining it in detail?
Should I be using the value of J_r from the last iteration, or should the successive J_r^T · J_r matrices be combined somehow?
Some people say that this is actually an optimistic estimate of the uncertainty, so what would be a better way to estimate the uncertainty?
Thanks a lot, any insight on this will be appreciated.
The full mathematical argument is rather involved, but in a nutshell it goes like this:
The product J^T · J of the Jacobian matrix of the reprojection error at the optimum with itself is an approximation of the Hessian matrix of the least-squares error. The approximation ignores terms of order three and higher in the Taylor expansion of the error function at the optimum. See here (pp. 800-801) for a proof.
The inverse of the Hessian matrix is an approximation of the covariance matrix of the reprojection errors in a neighborhood of the optimal values of the parameters, under a local linear approximation of the parameters-to-errors transformation (p. 814 of the same reference).
I do not know where the "optimistic" comment comes from. The main assumption underlying the approximation is that the behavior of the cost function (the reproj. error) in a small neighborhood of the optimum is approximately quadratic.
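In code, the usual recipe based on this approximation, assuming i.i.d. measurement noise and using the Jacobian from the last iteration, looks like this sketch:

import numpy as np

def pose_covariance(Jr, r):
    # Jr: final Jacobian of the residuals, r: final residual vector.
    m, n = Jr.shape                    # m residuals, n pose parameters
    sigma2 = (r @ r) / (m - n)         # estimate of the measurement-noise variance
    # Inverse of the Gauss-Newton Hessian approximation, scaled by the noise variance.
    return sigma2 * np.linalg.inv(Jr.T @ Jr)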

Linear Least Squares Fit of Sphere to Points

I'm looking for an algorithm to find the best fit between a cloud of points and a sphere.
That is, I want to minimise

    E = Σ_i ( ‖P_i − C‖ − r )²

where C is the centre of the sphere, r its radius, and each P_i a point in my set of n points. The variables are obviously C_x, C_y, C_z, and r. In my case, I can obtain a known r beforehand, leaving only the components of C as variables.
I really don't want to have to use any kind of iterative minimisation (e.g. Newton's method, Levenberg-Marquardt, etc) - I'd prefer a set of linear equations or a solution explicitly using SVD.
There are no matrix equations forthcoming. Your choice of E is badly behaved; its partial derivatives are not even continuous, let alone linear. Even with a different objective, this optimization problem seems fundamentally non-convex; with one point P and a nonzero radius r, the set of optimal solutions is the sphere of radius r about P.
You should probably reask on an exchange with more optimization knowledge.
You might find the following reference interesting, but I would warn you that you will need some familiarity with geometric algebra, particularly conformal geometric algebra, to understand the mathematics. However, the algorithm is straightforward to implement with standard linear algebra techniques and is not iterative.
One caveat: the algorithm, at least as presented, fits both center and radius; you may be able to work out a way to constrain the fit so the radius is fixed.
Total Least Squares Fitting of k-Spheres in n-D Euclidean Space Using an (n+2)-D Isometric Representation. L. Dorst, Journal of Mathematical Imaging and Vision, 2014, pp. 1-21.
You can get a copy from Leo Dorst's ResearchGate page.
One last thing: I have no connection to the author.
A short description of how to set up the matrix equation can be found here.
I've seen that the WildMagic library uses an iterative method (at least in version 4).
You may be interested in the best-fit d-dimensional sphere, i.e. minimizing the variance of the population of the squared distances to the center; it has a simple analytical solution (matrix calculus): see the appendix of the open-access paper of Cerisier et al. in J. Comput. Biol. 24(11), 1134-1137 (2017), https://doi.org/10.1089/cmb.2017.0061
It works when the data points are weighted (it works even for continuous distributions; as a by-product, when d=1, a well-known inequality is retrieved: the kurtosis is always greater than the squared skewness plus 1).
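A minimal sketch of that kind of closed-form fit, obtained by expanding ‖P_i − C‖² = r² into a linear system and solving it by weighted least squares (the function name and interface are mine, not from the paper):

import numpy as np

def algebraic_sphere_fit(P, w=None):
    # P: (n, d) array of points, w: optional length-n weight vector.
    P = np.asarray(P, dtype=float)
    n, d = P.shape
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    # ||P_i||^2 = 2 P_i . C + (r^2 - ||C||^2)  ->  A x = b with x = (C, k)
    A = np.hstack([2 * P, np.ones((n, 1))])
    b = np.sum(P**2, axis=1)
    sw = np.sqrt(w)
    x, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
    C, k = x[:d], x[d]
    return C, np.sqrt(k + C @ C)   # center and radius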
Difficult to do this without iteration.
I would proceed as follows:
1. Find the overall midpoint by averaging the (X, Y, Z) coordinates of all points.
2. With that result, find the average distance Ravg to the midpoint; decide whether that is good enough, or proceed.
3. Remove points from your set whose distance from the midpoint is too far from the Ravg found in step 2.
4. Go back to step 1 (average the points again, yielding a better midpoint).
Of course, this will require some conditions for steps 2 and 4 that depend on the quality of your point cloud!
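A minimal sketch of this trimming loop (the keep fraction and the number of rounds are illustrative choices standing in for those conditions):

import numpy as np

def trim_fit_sphere(P, keep_frac=0.9, n_rounds=5):
    P = np.asarray(P, dtype=float)
    for _ in range(n_rounds):
        C = P.mean(axis=0)                            # step 1: midpoint of the remaining points
        d = np.linalg.norm(P - C, axis=1)
        r_avg = d.mean()                              # step 2: average distance to the midpoint
        dev = np.abs(d - r_avg)
        keep = dev <= np.quantile(dev, keep_frac)     # step 3: drop the worst outliers
        if keep.all():
            break
        P = P[keep]                                   # step 4: repeat with the trimmed set
    return C, r_avg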
Ian Coope has an interesting algorithm in which he linearized the problem using a change of variable. The fit is quite robust, and although it slightly redefines the condition of optimality, I've found it to be generally visually better, especially for noisy data.
A preprint of Coope's paper is available here: https://ir.canterbury.ac.nz/bitstream/handle/10092/11104/coope_report_no69_1992.pdf.
I found the algorithm to be very useful, so I implemented it in scikit-guess as skg.nsphere_fit. Let's say you have an (m, n) array p, consisting of m points of dimension n (here n = 3):

import skg
r, c = skg.nsphere_fit(p)

The radius, r, is a scalar and c is an n-vector containing the center.
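For a quick end-to-end check, assuming scikit-guess is installed (the data here are made up):

import numpy as np
import skg

# Noisy points on a sphere of radius 2 centred at (1, -3, 0.5).
rng = np.random.default_rng(0)
u = rng.normal(size=(500, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)
p = np.array([1.0, -3.0, 0.5]) + 2.0 * u + 0.05 * rng.normal(size=(500, 3))

r, c = skg.nsphere_fit(p)   # expect r close to 2 and c close to (1, -3, 0.5)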
