How come some of the lines get ignored with hough line function? - scikit-image

I'm struggling a bit to figure out
how to make sure all lines get recognized with Line Hough Transform taken from sckit-image library.
https://scikit-image.org/docs/dev/auto_examples/edges/plot_line_hough_transform.html#id3
Here below all lines got recognized:
But if I apply the same script on similar image,
one line will get ignored after applying the Hough transform,
I have read the documentation which says:
The Hough transform constructs a histogram array representing the parameter
space (i.e., an :math:`M \\times N` matrix, for :math:`M` different values of
the radius and :math:`N` different values of :math:`\\theta`). For each
parameter combination, :math:`r` and :math:`\\theta`, we then find the number
of non-zero pixels in the input image that would fall close to the
corresponding line, and increment the array at position :math:`(r, \\theta)`
appropriately.
We can think of each non-zero pixel "voting" for potential line candidates. The
local maxima in the resulting histogram indicates the parameters of the most
probably lines
So my conclusion is the line got removed since it hadn't got enough "votes",
(I have tested it with different precisions (0.05, 0.5, 0.1) degree, but still got the same issue).
Here is the code:
import numpy as np
from skimage.transform import hough_line, hough_line_peaks
from skimage.feature import canny
from skimage import data,io
import matplotlib.pyplot as plt
from matplotlib import cm
# Constructing test image
image = io.imread("my_image.png")
# Classic straight-line Hough transform
# Set a precision of 0.05 degree.
tested_angles = np.linspace(-np.pi / 2, np.pi / 2, 3600)
h, theta, d = hough_line(image, theta=tested_angles)
# Generating figure 1
fig, axes = plt.subplots(1, 3, figsize=(15, 6))
ax = axes.ravel()
ax[0].imshow(image, cmap=cm.gray)
ax[0].set_title('Input image')
ax[0].set_axis_off()
ax[1].imshow(np.log(1 + h),
extent=[np.rad2deg(theta[-1]), np.rad2deg(theta[0]), d[-1], d[0]],
cmap=cm.gray, aspect=1/1.5)
ax[1].set_title('Hough transform')
ax[1].set_xlabel('Angles (degrees)')
ax[1].set_ylabel('Distance (pixels)')
ax[1].axis('image')
ax[2].imshow(image, cmap=cm.gray)
origin = np.array((0, image.shape[1]))
for _, angle, dist in zip(*hough_line_peaks(h, theta, d)):
y0, y1 = (dist - origin * np.cos(angle)) / np.sin(angle)
ax[2].plot(origin, (y0, y1), '-r')
ax[2].set_xlim(origin)
ax[2].set_ylim((image.shape[0], 0))
ax[2].set_axis_off()
ax[2].set_title('Detected lines')
plt.tight_layout()
plt.show()
How should I "catch" this line too,
any suggestion?

Shorter lines have lower accumulator values in the Hough transform, so you have to adjust the threshold appropriately. If you know how many line segments you are looking for, you can set the threshold fairly low and then limit the number of peaks detected.
Here's a condensed version of the code above, with modified threshold, for reference:
import numpy as np
from skimage.transform import hough_line, hough_line_peaks
from skimage import io
import matplotlib.pyplot as plt
from matplotlib import cm
from skimage import color
# Constructing test image
image = color.rgb2gray(io.imread("my_image.png"))
# Classic straight-line Hough transform
# Set a precision of 0.05 degree.
tested_angles = np.linspace(-np.pi / 2, np.pi / 2, 3600)
h, theta, d = hough_line(image, theta=tested_angles)
hpeaks = hough_line_peaks(h, theta, d, threshold=0.2 * h.max())
fig, ax = plt.subplots()
ax.imshow(image, cmap=cm.gray)
for _, angle, dist in zip(*hpeaks):
(x0, y0) = dist * np.array([np.cos(angle), np.sin(angle)])
ax.axline((x0, y0), slope=np.tan(angle + np.pi/2))
plt.show()
(Note: axline requires matplotlib 3.3.)

Related

Optimal parameters not found: Number of calls to function has reached maxfev = 100

I'm new to python, I try to give some adjustment to the data, but when I get the graph, only the original data appears and with the message "Optimal parameters not found: Number of calls to function has reached maxfev = 1000." Could you help me find my mistake?
%matplotlib inline
import matplotlib.pylab as m
from scipy.optimize import curve_fit
import numpy as num
import scipy.optimize as optimize
xData=num.array([0,0,100,200,250,300,400], dtype="float")
yData=num.array([0,0,0,0,75,100,100], dtype="float")
m.plot(xData, yData, 'ro', label='Datos originales')
def fun(x, a, b):
return a + b * num.log(x)
popt,pcov=optimize.curve_fit(fun, xData, yData,p0=[1,1], maxfev=1000)
print=popt
x=num.linspace(1,400,7)
m.plot(x,fun(x, *popt), label='FunciĆ³n ajustada')
m.xlabel('concentraciĆ³n')
m.ylabel('% mortalidad')
m.legend()
m.grid()
The model in your code is "a + b * num.log(x)". Because your data contains an x value of 0.0, the evaluation of log(0.0) gives errors and will not allow the fitting software to function. Sometimes these x values of 0.0 can be replaced with very small numbers, as log(small number) will not fail - but in this case the equation and data do not appear to match and so using that technique alone would not be sufficient here.
My thought is that a different equation would be a better model for this data. I performed an equation search using your data, and found that several different sigmoidal type equations gave suspiciously good fits to this data set - which is not surprising because of the small number of data points.
The sigmoidal equations I tried were all extremely sensitive to the initial parameter estimates. Here is a graphical Python fitter using scipy's Differential Evolution genetic algorithm module to determine the initial parameter estimates for curve_fit's non-linear solver. That scipy module uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, requiring bounds within which to search. Here those bounds are taken from the data maximum and minimun values.
I personally would not use this fit precisely because the small number of data points is giving such suspiciously good fits, and strongly recommend taking additional data points if at all possible. I could however not find any equations with less than three parameters that would fit the data.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
xData=numpy.array([0,0,100,200,250,300,400], dtype="float")
yData=numpy.array([0,0,0,0,75,100,100], dtype="float")
def func(x, a, b, c): # Sigmoid B equation from zunzun.com
return a / (1.0 + numpy.exp(-1.0 * (x - b) / c))
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
val = func(xData, *parameterTuple)
return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
# min and max used for bounds
maxX = max(xData)
minX = min(xData)
parameterBounds = []
parameterBounds.append([minX, maxX]) # search bounds for a
parameterBounds.append([minX, maxX]) # search bounds for b
parameterBounds.append([0.0, 2.0]) # search bounds for c
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
axes.plot(xData, yData, 'D')
# create data for the fitted equation plot
xModel = numpy.linspace(min(xData), max(xData), 100)
yModel = func(xModel, *fittedParameters)
# now the model as a line plot
axes.plot(xModel, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)

How do I perform a curve fit with an array of points and touching a specific point in that array

I need help with curve fitting a given set of points. The points form a parabola and I ought to find the peak point of the result. Issue is when I do a curve fit, it sometimes doesn't touch the max y-coordinate even if the actual point is given in the input array.
Following is the code snippet. Here 1.88 is the actual peak y-coordinate (13.05,1.88). But the graph generated by the code does not touch the point due to curve fitting. So is there a way to fit the curve making sure that it touches the max point given in the input array?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit, minimize_scalar
fig = plt.gcf()
#fig.set_size_inches(18.5, 10.5)
x = [4.59,9.02,13.05,18.47,20.3]
y = [1.7,1.84,1.88,1.7,1.64]
def f(x, p1, p2, p3):
return p3*(p1/((x-p2)**2 + (p1/2)**2))
plt.plot(x,y,"ro")
popt, pcov = curve_fit(f, x, y)
# find the peak
fm = lambda x: -f(x, *popt)
r = minimize_scalar(fm, bounds=(1, 5))
print( "maximum:", r["x"], f(r["x"], *popt) ) #maximum: 2.99846874275 18.3928199902
plt.text(1,1.9,'maximum '+str(round(r["x"],2))+'( #'+str(round(f(r["x"], *popt),2)) + ' )')
x_curve = np.linspace(min(x), max(x), 50)
plt.plot(x_curve, f(x_curve, *popt))
plt.plot(r['x'], f(r['x'], *popt), 'ko')
plt.show()
Here is a graphical code example using your equation with weighted fitting, where I have made the max point larger to more easily see the effect of the weighting. In non-weighted curve fitting, all weights are implicitly 1.0 as all data points have equal weight. Scipy's curve_fit routine uses weights in the form of uncertainties, so that giving a point a very small uncertainty (which I have done) is like giving the point a very large weight. This technique can be used to make a fit pass arbitrarily close to any single data point by any software that can perform weghted fitting.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
x = [4.59,9.02,13.05,18.47,20.3]
y = [1.7,1.84,2.0,1.7,1.64]
# note the single very small uncertainty - try making this value 1.0
uncertainties = numpy.array([1.0, 1.0, 1.0E-6, 1.0, 1.0])
# rename data to use previous example
xData = numpy.array(x)
yData = numpy.array(y)
def func(x, p1, p2, p3):
return p3*(p1/((x-p2)**2 + (p1/2)**2))
# these are the same as the scipy defaults
initialParameters = numpy.array([1.0, 1.0, 1.0])
# curve fit the test data, first without uncertainties to
# get us closer to initial starting parameters
ssqParameters, pcov = curve_fit(func, xData, yData, p0 = initialParameters)
# now that we have better starting parameters, use uncertainties
fittedParameters, pcov = curve_fit(func, xData, yData, p0 = ssqParameters, sigma=uncertainties, absolute_sigma=True)
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('Parameters:', fittedParameters)
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
axes = f.add_subplot(111)
# first the raw data as a scatter plot
axes.plot(xData, yData, 'D')
# create data for the fitted equation plot
xModel = numpy.linspace(min(xData), max(xData))
yModel = func(xModel, *fittedParameters)
# now the model as a line plot
axes.plot(xModel, yModel)
axes.set_xlabel('X Data') # X axis data label
axes.set_ylabel('Y Data') # Y axis data label
plt.show()
plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)

Best learning algorithms concentric and not linearly separable data

Below are two scatter plots. The first one is for data points that have values of x and y, and I would like to know if there is a clustering algorithm that will automatically recognize that there are two clusters. They are concentric and not linearly separable. K-means is not right for several reasons. The other plot is similar but it has x, y and color values, and I would like to know what learning algorithm would be best at classifying or predicting the correct color from the values of x and y.
I got good classifier results for this problem using the sklearn MLPClassifier algorithm. Here is the scatter and contour plots:
Detailed code at: https://www.linkedin.com/pulse/couple-scikit-learn-classifiers-peter-thorsteinson. The simplified code below shows how it works:
import math
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
# Generate the artificial data set and display the resulting scatter plot
x = []
y = []
z = []
for i in range(500):
rand = np.random.uniform(0.0, 2*math.pi)
randx = np.random.normal(0.0, 30.0)
randy = np.random.normal(0.0, 30.0)
if np.random.random() > 0.5:
z.append(0)
x.append(100*math.cos(rand) + randx)
y.append(100*math.sin(rand) + randy)
else:
z.append(1)
x.append(300*math.cos(rand) + randx)
y.append(300*math.sin(rand) + randy)
plt.axis('equal')
plt.axis([-500, 500, -500, 500])
plt.scatter(x, y, c=z)
plt.show()
# Run the MLPClassifier algorithm on the training data
XY = pd.DataFrame({'x': x, 'y': y})
print(XY.head())
Z = pd.DataFrame({'z': z})
print(Z.head())
XY_train, XY_test, Z_train, Z_test = train_test_split(XY, Z, test_size = 0.20)
mlp = MLPClassifier(hidden_layer_sizes=(10, 10, 10), max_iter=1000)
mlp.fit(XY_train, Z_train.values.ravel())
# Make predictions on the test data and display resulting scatter plot
predictions = mlp.predict(XY_test)
print(confusion_matrix(Z_test,predictions))
print(classification_report(Z_test,predictions))
plt.axis('equal')
plt.axis([-500, 500, -500, 500])
plt.scatter(XY_test.x, XY_test.y, c=predictions)
plt.show()

Can I do something like imsave() with text overlay?

I am using imsave() sequentially to make many PNGs that I will combine as an AVI and I would like to add moving text annotations. I use ImageJ to make AVIs or GIFs.
I don't want the axes, numbers, borders or anything, just the color image (as imsave() provides for example) with text (and maybe arrows) inside. These will change frame by frame. Pardon the use of jet.
I could use savefig() with ticks off and then do cropping as post processing, but is there a more convenient, direct, or "matplotlibithic" way to do this that wouldn't be so hard on my hard drive? (final thing will be pretty big).
A code snippet, added by request:
import numpy as np
import matplotlib.pyplot as plt
nx, ny = 101, 101
phi = np.zeros((ny, nx), dtype = 'float')
do_me = np.ones_like(phi, dtype='bool')
x0, y0, r0 = 40, 65, 12
x = np.arange(nx, dtype = 'float')[None,:]
y = np.arange(ny, dtype = 'float')[:,None]
rsq = (x-x0)**2 + (y-y0)**2
circle = rsq <= r0**2
phi[circle] = 1.0
do_me[circle] = False
do_me[0,:], do_me[-1,:], do_me[:,0], do_me[:,-1] = False, False, False, False
n, nper = 100, 100
phi_hold = np.zeros((n+1, ny, nx))
phi_hold[0] = phi
for i in range(n):
for j in range(nper):
phi2 = 0.25*(np.roll(phi, 1, axis=0) +
np.roll(phi, -1, axis=0) +
np.roll(phi, 1, axis=1) +
np.roll(phi, -1, axis=1) )
phi[do_me] = phi2[do_me]
phi_hold[i+1] = phi
change = phi_hold[1:] - phi_hold[:-1]
places = [(32, 20), (54,25), (11,32), (3, 12)]
plt.figure()
plt.imshow(change[50])
for (x, y) in places:
plt.text(x, y, "WOW", fontsize=16)
plt.text(5, 95, "Don't use Jet!", color="white", fontsize=20)
plt.show()
Method 1
Using an excellent answer to another question as a reference, I came up with the following simplified variant which seems to work nicely - just make sure the figsize (which is given in inches) aspect ratio matches the size ratio of the plot data:
import numpy as np
import matplotlib.pyplot as plt
test_image = np.eye(100)
fig = plt.figure(figsize=(4,4))
ax = plt.axes(frameon=False, xticks=[],yticks=[])
ax.imshow(test_image)
plt.savefig('test.png', bbox_inches='tight', pad_inches=0)
Note that I am using imshow with a test_image, which might behave differently from other plotting functions... please let me know in a comment in case you'd like to do something else.
Also note that the image will be (re-) sampled, so the figsize will influence the resolution of the written image.
As pointed out in the comments, the figsize setting doesn't match the size of the output image (or the size on screen, for that matter). To overcome this, use...
Method 2
Reading the FAQ entry Move the edge of an axes to make room for tick labels, I found a way to make the figsize parameter set the output image size directly, by moving the axes' ticks out of the visible area:
import numpy as np
import matplotlib.pyplot as plt
test_image = np.eye(100)
fig = plt.figure(figsize=(4,4))
ax = fig.add_axes([0,0,1,1])
ax.imshow(test_image)
plt.savefig('test.png')
Note that savefig has a default DPI setting (100 in my case) which - in combination with figsize - determines the number of pixels in x and y directions of the saved image. You can override this with the dpi keyword argument to savefig.
If you want to display the image on screen rather than saving it (by using plt.show() instead of the plt.savefig line in the code above), the size of the figure is dependent on (apart from the already familiar figsize parameter) the figure's DPI setting, which also has a default (80 on my system). This value can be overridden by passing the dpi keyword argument to the plt.figure() call.

Non-linear axes for imshow in matplotlib

I am generating 2D arrays on log-spaced axes (for instance, the x pixel coordinates are generated using logspace(log10(0.95), log10(2.08), n).
I want to display the image using a plain old imshow, in its native resolution and scaling (I don't need to stretch it; the data itself is already log scaled), but I want to add ticks, labels, lines that are in the correct place on the log axes. How do I do this?
Ideally I could just use commands line axvline(1.5) and the line would be in the correct place (58% from the left), but if the only way is to manually translate between logscale coordinates and image coordinates, that's ok, too.
For linear axes, using extents= in the call to imshow does what I want, but I don't see a way to do the same thing with a log axis.
Example:
from matplotlib.colors import LogNorm
x = logspace(log10(10), log10(1000), 5)
imshow(vstack((x,x)), extent=[10, 1000, 0, 100], cmap='gray', norm=LogNorm(), interpolation='nearest')
axvline(100, color='red')
This example does not work, because extent= only applies to linear scales, so when you do axvline at 100, it does not appear in the center. I'd like the x axis to show 10, 100, 1000, and axvline(100) to put a line in the center at the 100 point, while the pixels remain equally spaced.
In my view, it is better to use pcolor and regular (non-converted) x and y values. pcolor gives you more flexibility and regular x and y axis are less confusing.
import pylab as plt
import numpy as np
from matplotlib.colors import LogNorm
from matplotlib.ticker import LogFormatterMathtext
x=np.logspace(1, 3, 6)
y=np.logspace(0, 2,3)
X,Y=np.meshgrid(x,y)
z = np.logspace(np.log10(10), np.log10(1000), 5)
Z=np.vstack((z,z))
im = plt.pcolor(X,Y,Z, cmap='gray', norm=LogNorm())
plt.axvline(100, color='red')
plt.xscale('log')
plt.yscale('log')
plt.colorbar(im, orientation='horizontal',format=LogFormatterMathtext())
plt.show()
As pcolor is slow, a faster solution is to use pcolormesh instead.
im = plt.pcolormesh(X,Y,Z, cmap='gray', norm=LogNorm())
Actually, it works fine. I'm confused.
Previously I was getting errors about "Images are not supported on non-linear axes" which is why I asked this question. But now when I try it, it works:
import matplotlib.pyplot as plt
import numpy as np
x = np.logspace(1, 3, 5)
y = np.linspace(0, 2, 3)
z = np.linspace(0, 1, 4)
Z = np.vstack((z, z))
plt.imshow(Z, extent=[10, 1000, 0, 1], cmap='gray')
plt.xscale('log')
plt.axvline(100, color='red')
plt.show()
This is better than pcolor() and pcolormesh() because
it's not insanely slow and
is interpolated nicely without misleading artifacts when the image is not shown at native resolution.
To display imshow with abscisse log scale:
ax = fig.add_subplot(nrow, ncol, i+1)
ax.set_xscale('log')

Resources