Is it possible to clean a contour plot in skimage? - scikit-image

I had this thought this morning and found that scikit-image can probably solve the problem. Simply speaking, I want to remove the level labels and other unnecessary features from a contour raster image and get the locations of the contour lines.
Like this image:
contour image with features to be removed.
But after reading a while I became unsure if it could be done. My consideration is the following:
1) find_contours may find the contour lines, but I think all the other features (arrows and labels) will also be included.
2) closing may remove the labels, but I found the contour lines are also removed; I think this is because they are just thin lines, and the bright regions get connected when the closing function is used.
3) the arrows are mostly connected to the contour lines, so these features cannot be treated as individual objects. They may be the most difficult part to remove.
What I want help with is knowing whether it is possible to remove the labels and arrows and keep the contour lines in skimage. If so, I will continue to learn the package.

Here is a partial solution to your problem. The basic idea is to consider the network of interconnected lines as a graph, and to remove the small edges of this graph because they correspond (mostly) to arrows and to labels. A first step is to compute the skeleton of the binarized image.
Of course the result is not perfect, you can adapt the different parameters of the script below to find the solution that you consider best. If you need to repeat this segmentation on a large number of images, you can also try to select manually the edges of the graph that you want to keep, as a training set, and to use a machine learning algorithm to classify which edges to keep (using features of the edges such as length, number of other connected edges, direction, straightness, ...).
import numpy as np
import matplotlib.pyplot as plt
from skimage import io, measure, morphology
from scipy import ndimage
def skel_to_graph(skel):
    """
    Transform the skeleton into its branches and nodes by counting the number
    of neighbors of each pixel in the skeleton.
    """
    convolve_skel = 3**skel.ndim * ndimage.uniform_filter(skel.astype(float))
    neighbors = np.copy(skel.astype(np.uint8))
    skel = skel.astype(bool)
    neighbors[skel] = convolve_skel[skel] - 1
    # branch pixels: skeleton pixels with 1 or 2 neighbors
    edges = measure.label(np.logical_or(neighbors == 2, neighbors == 1),
                          background=0)
    # node pixels: skeleton pixels whose neighbor count is not 2
    nodes = measure.label(np.logical_and(np.not_equal(neighbors, 2),
                                         neighbors > 0), background=0)
    length_edges = np.bincount(edges.ravel())
    return nodes, edges, length_edges
# Open image and threshold
image = io.imread('contours.png')
im = image < 200
im = morphology.binary_closing(im, np.ones((5, 5)))
# Remove small connected components (small arrows)
only_big = morphology.remove_small_objects(im, 400)
# Skeletonize to get 1-pixel-thick lines
skel = morphology.skeletonize(only_big)
skel_big = morphology.remove_small_objects(skel, 30, connectivity=2)
nodes, edges, length_edges = skel_to_graph(skel_big)
# Keep only long branches of the skeleton
# The threshold can be adjusted
threshold = 35
short_edges = (length_edges < threshold)[edges]
edges[short_edges] = 0
# Visualize results
plt.figure(figsize=(10, 10))
plt.imshow(image, cmap='gray', interpolation='nearest')
plt.contour(edges > 0, [0.5], colors='red', linewidths=[2])
plt.axis('off')
plt.tight_layout()
plt.show()
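If you also need the coordinates of the remaining contour lines (as asked in the question), a possible follow-up (my addition, not part of the original answer) is to run find_contours on the cleaned mask:
from skimage import measure
# each element is an (N, 2) array of (row, col) coordinates along one contour line
contour_coords = measure.find_contours((edges > 0).astype(float), 0.5)
print(len(contour_coords))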

Related

Counting homogeneous objects along the contour

I'm trying to get the number of objects in the frame by finding their contours with OpenCV.
That's a frame after applying the Canny filter:
Then I call the findContours() method and keep the ones that are suitable in size.
When I overlay them on the frame I get the following picture.
It can be seen that we only get objects with complete contours.
So the question is:
How can we artificially make the object boundaries complete (closed)?
I tried to use dilate and erode (result of that), but afterwards the borders of the objects are glued together and we can't find their contours any more.
Since the contours are connected together, findContours will detect the connected contours as a single contour instead of individual separated circles. When you have connected contours, a potential approach is to use Watershed to label and detect each contour. Here's the results:
Input image
Output
Code
import cv2
import numpy as np
from skimage.feature import peak_local_max
from skimage.segmentation import watershed  # was skimage.morphology in older scikit-image versions
from scipy import ndimage
# Load in image, convert to gray scale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Compute Euclidean distance from every binary pixel
# to the nearest zero pixel then find peaks
distance_map = ndimage.distance_transform_edt(thresh)
# Recent scikit-image removed indices=False from peak_local_max, so build the boolean mask from coordinates
peak_coords = peak_local_max(distance_map, min_distance=20, labels=thresh)
local_max = np.zeros(distance_map.shape, dtype=bool)
local_max[tuple(peak_coords.T)] = True
# Perform connected component analysis then apply Watershed
markers = ndimage.label(local_max, structure=np.ones((3, 3)))[0]
labels = watershed(-distance_map, markers, mask=thresh)
# Iterate through unique labels
for label in np.unique(labels):
    if label == 0:
        continue
    # Create a mask
    mask = np.zeros(gray.shape, dtype="uint8")
    mask[labels == label] = 255
    # Find contours and determine contour area
    cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    c = max(cnts, key=cv2.contourArea)
    color = list(np.random.random(size=3) * 256)
    cv2.drawContours(image, [c], -1, color, 4)

cv2.imshow('image', image)
cv2.waitKey()
Here are some other references:
Image segmentation with Watershed Algorithm
Watershed Algorithm: Marker-based Segmentation
How to define the markers for Watershed
Find contours after watershed
It seems like you have a pattern for your objects, and those objects sometimes overlap.
I'd suggest you convolve your image with an object pattern and then process the resulting scores image.
In more detail:
Suppose for simplicity that your initial image has only one channel, and that the object you're looking for is given by a pattern image; say its size is [W_p, H_p].
First step: construct a new image, scores, where each pixel S in scores is the likelihood that this pixel is the pattern center.
One way to do that is: for each pixel P in the original image, "cut" the [W_p, H_p] patch around P (e.g. img(Rect(P-W_p/2, P-H_p/2, W_p, H_p))), subtract the patch from the pattern to find the "distance" between them (e.g. with cv::sum(cv::absdiff(patch, pattern)) in OpenCV), and save this sum to S.
Another way to do this is: S = P.clone(); pattern = pattern / cv::sum(pattern);
and then use cv::filter2D on S with the pattern...
Now that you have a scores image, you should filter false positives:
1. take the top 2% of the scores (one way is with cv::calcHist)
2. for each pixel that has a neighbor within [W_p, H_p] with a higher score - turn this pixel to zero!
Now you should be left with an image of zeros where only the pattern centers have some value. Hurray!
If you don't know in advance what an object will look like, you can find one object using contours, then 'cut it out' using the convex hull of its contour (+ bounding box), and use it as the convolution kernel for finding the rest.
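For completeness, here is a rough sketch of this scoring-and-suppression idea in Python with OpenCV (my illustration, not the answerer's code; the file names and the 2% cutoff are assumptions). cv2.matchTemplate does the patch-vs-pattern scoring, and a dilation acts as the neighbourhood maximum for the false-positive filtering step:
import cv2
import numpy as np

img = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)       # hypothetical input frame
pattern = cv2.imread('pattern.png', cv2.IMREAD_GRAYSCALE) # hypothetical object pattern
h_p, w_p = pattern.shape

# scores[y, x]: how well the [W_p, H_p] patch with top-left corner (x, y) matches the pattern
scores = cv2.matchTemplate(img, pattern, cv2.TM_CCOEFF_NORMED)

# keep only the strongest responses (the "top 2%" idea)
strong = scores >= np.quantile(scores, 0.98)

# zero out pixels that have a higher-scoring neighbour within a pattern-sized window
neighbourhood_max = cv2.dilate(scores, np.ones((h_p, w_p), np.uint8))
centers = np.argwhere(strong & (scores >= neighbourhood_max))

# matchTemplate coordinates are top-left corners; shift by half the pattern size for centres
for y, x in centers:
    print(x + w_p // 2, y + h_p // 2)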

How to sort image blocks to minimize the differences between them?

I have a list of small image blocks. All of them are the same size (eg: 16x16). They are black and white, with pixel values between 0-255. I would like to sort those images to minimize the difference between them.
What I have in mind is to compute the mean absolute difference (MAD) of pixels between adjacent blocks (eg: block N vs block N+1, block N+1 vs block N+2, ...).
From this I can calculate the sum of those MAD values (eg: sum = mad1 + mad2 + ...). What I would like is to find the ordering that minimizes that sum.
Before :
After : (this was done by hand only to provide an example, there is probably a better ordering for those blocks, especially the ones with vertical stripes)
Based on RaymoAisla's comment, which pointed out that this is similar to the Travelling Salesman Problem, I decided to implement a solution using the nearest neighbour algorithm. While not perfect, it gives great results:
Here is the code:
from PIL import Image
import numpy as np
def sortImages(images):  # nearest neighbour algorithm
    result = []
    tovisit = list(range(len(images)))
    best = max(tovisit, key=lambda x: images[x].sum())  # brightest block as first
    tovisit.remove(best)
    result.append(best)
    while tovisit:
        best = min(tovisit, key=lambda x: MAD(images[x], images[best]))  # nearest neighbour
        tovisit.remove(best)
        result.append(best)
    return result  # indices of sorted image blocks

def MAD(a, b):  # sum of absolute differences (proportional to the mean for fixed-size blocks)
    return np.absolute(a - b).sum()

images = ...  # should contain b&w images only (eg: from PIL Image)
arrays = [np.array(x).astype(int) for x in images]  # convert images to np arrays
result = sortImages(arrays)
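As a quick sanity check (my addition, not part of the original answer), you can compute the total adjacent-block difference that the question wants to minimize for the returned ordering:
ordered = [arrays[i] for i in result]
total = sum(MAD(a, b) for a, b in zip(ordered, ordered[1:]))
print(total)  # smaller is better; compare against the cost of the original order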

Compute curvature of a bent pipe using image processing (Hough transform parabola detection)

I'm trying to design a way to detect this pipe's curvature. I tried applying a Hough transform and got detected lines, but they don't lie along the surface of the pipe, so smoothing them out to fit a Bézier curve is not working. Please suggest a good way to start for an image like this.
The image obtained by the Hough transform line detection is as follows.
I'm using standard Matlab code for probabilistic Hough transform line detection, which generates line segments surrounding the structure. Essentially the shape of the pipe resembles a parabola, but for Hough parabola detection I need to provide the eccentricity of the points prior to detection. Please suggest a good way of finding discrete points along the curvature that can be fitted to a parabola. I have tagged opencv and ITK, so if there is a function that can be applied to this particular picture, please suggest it and I will try it out to see the results.
img = imread('test2.jpg');
rawimg = rgb2gray(img);
[accum, axis_rho, axis_theta, lineprm, lineseg] = Hough_Grd(bwtu, 8, 0.01); % bwtu: the binarized edge map (its computation is not shown in the post)
figure(1); imagesc(axis_theta*(180/pi), axis_rho, accum); axis xy;
xlabel('Theta (degree)'); ylabel('Pho (pixels)');
title('Accumulation Array from Hough Transform');
figure(2); imagesc(bwtu); colormap('gray'); axis image;
DrawLines_2Ends(lineseg);
title('Raw Image with Line Segments Detected');
The edge map of the image is as follows, and the result generated after applying the Hough transform to the edge map is also not good. I was thinking of a solution that does general parametric shape detection: this curve can be expressed as a family of parabolas, so we could do a curve fit to estimate the coefficients as it bends and analyze its curvature. I need to design a real-time procedure, so please suggest anything in this direction.
I suggest the following approach:
First stage: generate a segmentation of the pipe.
perform thresholding on the image.
find connected components in the thresholded image.
search for a connected component which represents the pipe.
The connected component which represents the pipe should have an edge map which is divided into top and bottom edges (see attached image).
The top and bottom edges should have similar size, and they should have a relatively constant distance from one another. In other words, the variance of their per-pixel distances should be low.
Second stage - extract curve
At this stage, you should extract the points of the curve for performing the Bézier fitting.
You can perform this calculation either on the top edge or on the bottom edge.
Another option is to do it on the skeleton of the pipe segmentation.
Results
The pipe segmentation. Top and bottom edges are marked with blue and red, respectively.
Code
I = mat2gray(imread('ILwH7.jpg'));
im = rgb2gray(I);
%constant values to be used later on
BW_THRESHOLD = 0.64;
MIN_CC_SIZE = 50;
VAR_THRESHOLD = 2;
SIMILAR_SIZE_THRESHOLD = 0.85;
%stage 1 - thresholding & noise cleaning
bwIm = im>BW_THRESHOLD;
bwIm = imfill(bwIm,'holes');
bwIm = imopen(bwIm,strel('disk',1));
CC = bwconncomp(bwIm);
%iterates over the CC list, and searches for the CC which represents the
%pipe
for ii=1:length(CC.PixelIdxList)
    %ignore small CC
    if(length(CC.PixelIdxList{ii}) < MIN_CC_SIZE)
        continue;
    end
    %extracts CC edges
    ccMask = zeros(size(bwIm));
    ccMask(CC.PixelIdxList{ii}) = 1;
    ccMaskEdges = edge(ccMask);
    %finds connected components in the edges mat (there should be two).
    %these are the top and bottom parts of the pipe.
    CC2 = bwconncomp(ccMaskEdges);
    if length(CC2.PixelIdxList)~=2
        continue;
    end
    %tests that the top and bottom edges have similar sizes
    s1 = length(CC2.PixelIdxList{1});
    s2 = length(CC2.PixelIdxList{2});
    if(min(s1,s2)/max(s1,s2) < SIMILAR_SIZE_THRESHOLD)
        continue;
    end
    %calculates the masks of these two connected components
    topEdgeMask = false(size(ccMask));
    topEdgeMask(CC2.PixelIdxList{1}) = true;
    bottomEdgeMask = false(size(ccMask));
    bottomEdgeMask(CC2.PixelIdxList{2}) = true;
    %tests that the variance of the distances between the points is low
    topEdgeDists = bwdist(topEdgeMask);
    bottomEdgeDists = bwdist(bottomEdgeMask);
    var1 = std(topEdgeDists(bottomEdgeMask));
    var2 = std(bottomEdgeDists(topEdgeMask));
    %if the variances are low - we have found the CC of the pipe. break!
    if(var1<VAR_THRESHOLD && var2<VAR_THRESHOLD)
        pipeMask = ccMask;
        break;
    end
end
%performs median filtering on the top and bottom boundaries.
MEDIAN_SIZE = 5;
[topCurveY, topCurveX] = find(topEdgeMask);
topCurveX = medfilt1(topCurveX, MEDIAN_SIZE);
topCurveY = medfilt1(topCurveY, MEDIAN_SIZE);
[bottomCurveY, bottomCurveX] = find(bottomEdgeMask);
bottomCurveX = medfilt1(bottomCurveX, MEDIAN_SIZE);
bottomCurveY = medfilt1(bottomCurveY, MEDIAN_SIZE);
%display results
imshow(pipeMask); hold on;
plot(topCurveX, topCurveY, '.-');
plot(bottomCurveX, bottomCurveY, '.-');
Comments
In this specific example, acquiring the pipe segmentation by thresholding was relatively easy. In some scenes it may be more complex; in these cases, you may want to use a region-growing algorithm for generating the pipe segmentation.
Detecting the connected component which represents the pipe can be done by using some more heuristics. For example, the local curvature of its boundaries should be low.
You can find the connected components (CCs) of your inverted edge-map image. Then you can somehow filter those components, say for example, based on their pixel count, using region-properties. Here are the connected components I obtained using the given Octave code.
Now you can fit a model to each of these CCs using something like nlinfit or any suitable method.
im = imread('uFBtU.png');
gr = rgb2gray(uint8(im));
er = imerode(gr, ones(3)) < .5;
[lbl, n] = bwlabel(er, 8);
imshow(label2rgb(lbl))
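The fitting step itself is straightforward once you have the pixels of a single connected component. As a hedged illustration (in Python rather than Matlab/Octave, with a synthetic mask standing in for a real component), a degree-2 polynomial fit gives the parabola coefficients, and the usual formula kappa = |y''| / (1 + y'^2)^(3/2) then gives the curvature along it:
import numpy as np

def fit_parabola(component_mask):
    """component_mask: 2-D boolean array containing one connected component."""
    ys, xs = np.nonzero(component_mask)
    a, b, c = np.polyfit(xs, ys, 2)  # least-squares fit of y = a*x^2 + b*x + c
    def curvature(x):
        return abs(2 * a) / (1 + (2 * a * x + b) ** 2) ** 1.5
    return (a, b, c), curvature

# synthetic parabolic arc as a stand-in for a real pipe component
mask = np.zeros((100, 200), dtype=bool)
xs = np.arange(200)
mask[(0.002 * (xs - 100) ** 2 + 20).astype(int), xs] = True

coeffs, curvature = fit_parabola(mask)
print(coeffs, curvature(100.0))  # curvature near the apex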

Visualize distance matrix as a graph

I am doing a clustering task and I have a distance matrix. I wish to visualize this distance matrix as a 2D graph. Please let me know if there is any way to do it online or in programming languages like R or python.
My distance matrix is as follows,
I used the classical Multidimensional scaling functionality (in R) and obtained a 2D plot that looks like:
But What I am looking for is a graph with nodes and weighted edges running between them.
Possibility 1
I assume that you want a 2-dimensional graph, where the distances between node positions are the same as those provided by your table.
In Python, you can use networkx for such applications. In general there are many methods of doing so; remember that all of them are just approximations (in general it is not possible to create an exact 2-dimensional representation of points given only their pairwise distances). They are some kind of stress-minimization (or energy-minimization) approximations, trying to find a "reasonable" representation with distances similar to those provided.
As an example you can consider a four point example (with correct, discrete metric applied):
   p1 p2 p3 p4
   -----------
p1  0  1  1  1
p2  1  0  1  1
p3  1  1  0  1
p4  1  1  1  0
In general, drawing actual "graph" is redundant, as you have fully connected one (each pair of nodes is connected) so it should be sufficient to draw just points.
Python example
import networkx as nx
import numpy as np
import string
dt = [('len', float)]
A = np.array([(0, 0.3, 0.4, 0.7),
              (0.3, 0, 0.9, 0.2),
              (0.4, 0.9, 0, 0.1),
              (0.7, 0.2, 0.1, 0)]) * 10
A = A.view(dt)
G = nx.from_numpy_array(A)  # from_numpy_matrix in networkx versions before 3.0
G = nx.relabel_nodes(G, dict(zip(range(len(G.nodes())), string.ascii_uppercase)))
G = nx.nx_agraph.to_agraph(G)  # requires pygraphviz; nx.to_agraph in older versions
G.node_attr.update(color="red", style="filled")
G.edge_attr.update(color="blue", width="2.0")
G.draw('distances.png', format='png', prog='neato')
In R you can try multidimensional scaling
# Classical MDS
# N rows (objects) x p columns (variables)
# each row identified by a unique row name
d <- dist(mydata) # euclidean distances between the rows
fit <- cmdscale(d,eig=TRUE, k=2) # k is the number of dim
fit # view results
# plot solution
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2",
main="Metric MDS", type="n")
text(x, y, labels = row.names(mydata), cex=.7)
Possibility 2
You just want to draw a graph with labeled edges
Again, networkx can help:
import networkx as nx
# Create a graph
G = nx.Graph()
# distances
D = [ [0, 1], [1, 0] ]
labels = {}
for n in range(len(D)):
    for m in range(len(D) - (n + 1)):
        G.add_edge(n, n + m + 1)
        labels[(n, n + m + 1)] = str(D[n][n + m + 1])
pos=nx.spring_layout(G)
nx.draw(G, pos)
nx.draw_networkx_edge_labels(G,pos,edge_labels=labels,font_size=30)
import pylab as plt
plt.show()
Multidimensional scaling (MDS) is exactly what you want. See here and here for more.
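For completeness, a rough Python counterpart (my sketch, using scikit-learn; the matrix below reuses the toy distances from the networkx example above):
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

dist = np.array([[0.0, 0.3, 0.4, 0.7],
                 [0.3, 0.0, 0.9, 0.2],
                 [0.4, 0.9, 0.0, 0.1],
                 [0.7, 0.2, 0.1, 0.0]])

# dissimilarity='precomputed' makes MDS embed the given distance matrix directly
mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
coords = mds.fit_transform(dist)

plt.scatter(coords[:, 0], coords[:, 1])
for i, (x, y) in enumerate(coords):
    plt.annotate(str(i), (x, y))
plt.show()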
You did not mention whether you want a 2-dimensional graph or not. I suppose you want to build a graph in 2 dimensions because you need it for visualization. Be aware, however, that for most graphs an exact embedding is simply not possible.
What can probably be done is to approximate the values from the distance matrix somehow, so that small values give relatively short edges and big values give relatively long edges.
With all previous considerations one option would be graphviz. See neato function.
In general what you are interested in is force-directed drawing. See wikipedia for further reference.
You can use d3js Force Directed Graph and configure distance between nodes. d3js force layout has some clustering capability to separate nodes with similar distances. Here's an example with values as distance between nodes:
http://vida.io/documents/SyT7DREdQmGSpsBkK
Another way to visualize is to use same distance between nodes but different line thickness. In that case, you'd want to calculate stroke-width based on values:
.style("stroke-width", function(d) { return Math.sqrt(d.value / 50); });

Checking images for similarity with OpenCV

Does OpenCV support the comparison of two images, returning some value (maybe a percentage) that indicates how similar these images are? E.g. 100% would be returned if the same image was passed twice, 0% would be returned if the images were totally different.
I already read a lot of similar topics here on StackOverflow. I also did quite some Googling. Sadly I couldn't come up with a satisfying answer.
This is a huge topic, with answers ranging from 3 lines of code to entire research journals.
I will outline the most common such techniques and their results.
Comparing histograms
One of the simplest & fastest methods. Proposed decades ago as a means to find picture similarities. The idea is that a forest will have a lot of green, and a human face a lot of pink, or whatever. So, if you compare two pictures with forests, you'll get some similarity between histograms, because you have a lot of green in both.
Downside: it is too simplistic. A banana and a beach will look the same, as both are yellow.
OpenCV method: compareHist()
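A minimal sketch of that comparison (my example; the file names are placeholders):
import cv2

img1 = cv2.imread('forest_1.jpg')
img2 = cv2.imread('forest_2.jpg')

# coarse 3-D colour histograms, normalized so the image sizes do not matter
hist1 = cv2.calcHist([img1], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
hist2 = cv2.calcHist([img2], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
cv2.normalize(hist1, hist1)
cv2.normalize(hist2, hist2)

# 1.0 means identical histograms with the correlation method
print(cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL))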
Template matching
A good example here is matchTemplate finding a good match. It convolves the search image with the one being searched in. It is usually used to find smaller image parts in a bigger one.
Downsides: It only returns good results with identical images, same size & orientation.
OpenCV method: matchTemplate()
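Typical usage (my sketch; file names are placeholders) is to slide the template over the image and take the location of the best response:
import cv2

scene = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)        # big image
template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)  # smaller part to find

res = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
print(max_val, max_loc)  # best score and top-left corner of the best match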
Feature matching
Considered one of the most efficient ways to do image search. A number of features are extracted from an image, in a way that guarantees the same features will be recognized again even when rotated, scaled or skewed. The features extracted this way can be matched against other image feature sets. Another image that has a high proportion of the features matching the first one is considered to be depicting the same scene.
Finding the homography between the two sets of points will allow you to also find the relative difference in shooting angle between the original pictures or the amount of overlapping.
There are a number of OpenCV tutorials/samples on this, and a nice video here. A whole OpenCV module (features2d) is dedicated to it.
Downsides: It may be slow. It is not perfect.
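A hedged sketch of this with ORB (a free alternative to SIFT/SURF); the file names and the distance cutoff are my assumptions:
import cv2

img1 = cv2.imread('query.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance suits ORB's binary descriptors; cross-check keeps only mutual best matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

# crude similarity signal: fraction of keypoints with a reasonably close match
good = [m for m in matches if m.distance < 64]
print(len(good) / max(len(kp1), 1))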
Over on the OpenCV Q&A site I am talking about the difference between feature descriptors, which are great when comparing whole images, and texture descriptors, which are used to identify objects like human faces or cars in an image.
Since no one has posted a complete concrete example, here are two quantitative methods to determine the similarity between two images. One method is for comparing images with the same dimensions; the other is for scale-invariant and transformation-indifferent images. Both methods return a similarity score between 0 and 100, where 0 represents a completely different image and 100 represents an identical/duplicate image. For all other values in between: the lower the score, the less similar; the higher the score, the more similar.
Method #1: Structural Similarity Index (SSIM)
To compare differences and determine the exact discrepancies between two images, we can utilize Structural Similarity Index (SSIM), which was introduced in Image Quality Assessment: From Error Visibility to Structural Similarity. SSIM is an image quality assessment approach which estimates the degradation of structural similarity based on the statistical properties of local information between a reference and a distorted image. The range of SSIM values is [-1, 1], and it is typically calculated using a sliding window in which the SSIM value for the whole image is computed as the average across all individual window results. This method is already implemented in the scikit-image library for image processing and can be installed with pip install scikit-image.
The skimage.metrics.structural_similarity() function returns a comparison score and a difference image, diff. The score represents the mean SSIM score between two images with higher values representing higher similarity. The diff image contains the actual image differences with darker regions having more disparity. Larger areas of disparity are highlighted in black while smaller differences are in gray. Here's an example:
Input images
Difference image -> highlighted mask differences
The SSIM score after comparing the two images shows that they are very similar.
Similarity Score: 89.462%
To visualize the exact differences between the two images, we can iterate through each contour, filter using a minimum threshold area to remove tiny noise, and highlight discrepancies with a bounding box.
Limitations: Although this method works very well, there are some important limitations. The two input images must have the same size/dimensions, and the method also suffers from a few problems including scaling, translations, rotations, and distortions. SSIM also does not perform very well on blurry or noisy images. These problems are addressed in Method #2.
Code:
from skimage.metrics import structural_similarity
import cv2
import numpy as np
first = cv2.imread('clownfish_1.jpeg')
second = cv2.imread('clownfish_2.jpeg')
# Convert images to grayscale
first_gray = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)
second_gray = cv2.cvtColor(second, cv2.COLOR_BGR2GRAY)
# Compute SSIM between two images
score, diff = structural_similarity(first_gray, second_gray, full=True)
print("Similarity Score: {:.3f}%".format(score * 100))
# The diff image contains the actual image differences between the two images
# and is represented as a floating point data type so we must convert the array
# to 8-bit unsigned integers in the range [0,255] before we can use it with OpenCV
diff = (diff * 255).astype("uint8")
# Threshold the difference image, followed by finding contours to
# obtain the regions that differ between the two images
thresh = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
# Highlight differences
mask = np.zeros(first.shape, dtype='uint8')
filled = second.copy()
for c in contours:
    area = cv2.contourArea(c)
    if area > 100:
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(first, (x, y), (x + w, y + h), (36, 255, 12), 2)
        cv2.rectangle(second, (x, y), (x + w, y + h), (36, 255, 12), 2)
        cv2.drawContours(mask, [c], 0, (0, 255, 0), -1)
        cv2.drawContours(filled, [c], 0, (0, 255, 0), -1)
cv2.imshow('first', first)
cv2.imshow('second', second)
cv2.imshow('diff', diff)
cv2.imshow('mask', mask)
cv2.imshow('filled', filled)
cv2.waitKey()
Method #2: Dense Vector Representations
Typically, two images will not be exactly the same. They may have variations with slightly different backgrounds, dimensions, feature additions/subtractions, or transformations (scaled, rotated, skewed). In other words, we cannot use a direct pixel-to-pixel approach since with variations, the problem shifts from identifying pixel-similarity to object-similarity. We must switch to deep-learning feature models instead of comparing individual pixel values.
To determine identical and near-similar images, we can use the sentence-transformers library, which provides an easy way to compute dense vector representations for images, and the OpenAI Contrastive Language-Image Pre-Training (CLIP) Model, which is a neural network already trained on a variety of (image, text) pairs. The idea is to encode all images into vector space and then find high density regions which correspond to areas where the images are fairly similar.
When two images are compared, they are given a score between 0 and 1.00. We can use a threshold parameter to identify two images as similar or different. A lower threshold will result in clusters which have fewer similar images in them. Conversely, a higher threshold will result in clusters that have more similar images. A duplicate image will have a score of 1.00, meaning the two images are exactly the same. To find near-similar images, we can set the threshold to any arbitrary value, say 0.9. For instance, if the determined score between two images is greater than 0.9, then we can conclude they are near-similar images.
An example:
This dataset has five images, notice how there are duplicates of flower #1 while the others are different.
Identifying duplicate images
Score: 100.000%
.\flower_1 copy.jpg
.\flower_1.jpg
Both flower #1 and its copy are the same
Identifying near-similar images
Score: 97.141%
.\cat_1.jpg
.\cat_2.jpg
Score: 95.693%
.\flower_1.jpg
.\flower_2.jpg
Score: 57.658%
.\cat_1.jpg
.\flower_1 copy.jpg
Score: 57.658%
.\cat_1.jpg
.\flower_1.jpg
Score: 57.378%
.\cat_1.jpg
.\flower_2.jpg
Score: 56.768%
.\cat_2.jpg
.\flower_1 copy.jpg
Score: 56.768%
.\cat_2.jpg
.\flower_1.jpg
Score: 56.284%
.\cat_2.jpg
.\flower_2.jpg
We get more interesting results between different images. The higher the score, the more similar; the lower the score, the less similar. Using a threshold of 0.9 or 90%, we can filter out near-similar images.
Comparison between just two images
Score: 97.141%
.\cat_1.jpg
.\cat_2.jpg
Score: 95.693%
.\flower_1.jpg
.\flower_2.jpg
Score: 88.914%
.\ladybug_1.jpg
.\ladybug_2.jpg
Score: 94.503%
.\cherry_1.jpg
.\cherry_2.jpg
Code:
from sentence_transformers import SentenceTransformer, util
from PIL import Image
import glob
import os
# Load the OpenAI CLIP Model
print('Loading CLIP Model...')
model = SentenceTransformer('clip-ViT-B-32')
# Next we compute the embeddings
# To encode an image, you can use the following code:
# from PIL import Image
# encoded_image = model.encode(Image.open(filepath))
image_names = list(glob.glob('./*.jpg'))
print("Images:", len(image_names))
encoded_image = model.encode([Image.open(filepath) for filepath in image_names], batch_size=128, convert_to_tensor=True, show_progress_bar=True)
# Now we run the clustering algorithm. This function compares images against
# all other images and returns a list with the pairs that have the highest
# cosine similarity score
processed_images = util.paraphrase_mining_embeddings(encoded_image)
NUM_SIMILAR_IMAGES = 10
# =================
# DUPLICATES
# =================
print('Finding duplicate images...')
# Filter list for duplicates. Results are triplets (score, image_id1, image_id2) and are sorted in decreasing order
# A duplicate image will have a score of 1.00
# It may be 0.9999 due to lossy image compression (.jpg)
duplicates = [image for image in processed_images if image[0] >= 0.999]
# Output the top X duplicate images
for score, image_id1, image_id2 in duplicates[0:NUM_SIMILAR_IMAGES]:
    print("\nScore: {:.3f}%".format(score * 100))
    print(image_names[image_id1])
    print(image_names[image_id2])
# =================
# NEAR DUPLICATES
# =================
print('Finding near duplicate images...')
# Use a threshold parameter to identify two images as similar. By setting the threshold lower,
# you will get larger clusters which have less similar images in it. Threshold 0 - 1.00
# A threshold of 1.00 means the two images are exactly the same. Since we are finding near
# duplicate images, we can set it at 0.99 or any number 0 < X < 1.00.
threshold = 0.99
near_duplicates = [image for image in processed_images if image[0] < threshold]
for score, image_id1, image_id2 in near_duplicates[0:NUM_SIMILAR_IMAGES]:
    print("\nScore: {:.3f}%".format(score * 100))
    print(image_names[image_id1])
    print(image_names[image_id2])
If you are matching identical images (same size/orientation):
// Compare two images by getting the L2 error (square-root of sum of squared error).
double getSimilarity( const Mat A, const Mat B ) {
    if ( A.rows > 0 && A.rows == B.rows && A.cols > 0 && A.cols == B.cols ) {
        // Calculate the L2 relative error between images.
        double errorL2 = norm( A, B, CV_L2 );
        // Convert to a reasonable scale, since L2 error is summed across all pixels of the image.
        double similarity = errorL2 / (double)( A.rows * A.cols );
        return similarity;
    }
    else {
        // Images have a different size
        return 100000000.0;  // Return a bad value
    }
}
Source
Sam's solution should be sufficient. I've used a combination of both histogram difference and template matching because no single method was working for me 100% of the time. I've given less importance to the histogram method, though. Here's how I've implemented it in a simple Python script.
import cv2


class CompareImage(object):

    def __init__(self, image_1_path, image_2_path):
        self.minimum_commutative_image_diff = 1
        self.image_1_path = image_1_path
        self.image_2_path = image_2_path

    def compare_image(self):
        image_1 = cv2.imread(self.image_1_path, 0)
        image_2 = cv2.imread(self.image_2_path, 0)
        commutative_image_diff = self.get_image_difference(image_1, image_2)
        if commutative_image_diff < self.minimum_commutative_image_diff:
            print("Matched")
            return commutative_image_diff
        return 10000  # random failure value

    @staticmethod
    def get_image_difference(image_1, image_2):
        first_image_hist = cv2.calcHist([image_1], [0], None, [256], [0, 256])
        second_image_hist = cv2.calcHist([image_2], [0], None, [256], [0, 256])
        img_hist_diff = cv2.compareHist(first_image_hist, second_image_hist, cv2.HISTCMP_BHATTACHARYYA)
        img_template_probability_match = cv2.matchTemplate(first_image_hist, second_image_hist, cv2.TM_CCOEFF_NORMED)[0][0]
        img_template_diff = 1 - img_template_probability_match
        # taking only 10% of histogram diff, since it's less accurate than template method
        commutative_image_diff = (img_hist_diff / 10) + img_template_diff
        return commutative_image_diff


if __name__ == '__main__':
    compare_image = CompareImage('image1/path', 'image2/path')
    image_difference = compare_image.compare_image()
    print(image_difference)
A little bit off topic, but useful, is the pythonic numpy approach. It's robust and fast, but it just compares pixels and not the objects or data the picture contains (and it requires images of the same size and shape):
A very simple and fast approach to do this without OpenCV or any computer vision library is to norm the picture arrays:
import numpy as np
picture1 = np.random.rand(100,100)
picture2 = np.random.rand(100,100)
picture1_norm = picture1/np.sqrt(np.sum(picture1**2))
picture2_norm = picture2/np.sqrt(np.sum(picture2**2))
After defining both normed pictures (or matrices) you can just sum over the multiplication of the pictures you like to compare:
1) If you compare similar pictures the sum will return 1:
In[1]: np.sum(picture1_norm**2)
Out[1]: 1.0
2) If they aren't similar, you'll get a value between 0 and 1 (a percentage if you multiply by 100):
In[2]: np.sum(picture2_norm*picture1_norm)
Out[2]: 0.75389941124629822
Please notice that if you have colored pictures you have to do this in all 3 dimensions or just compare a greyscaled version. I often have to compare huge amounts of pictures with arbitrary content and that's a really fast way to do so.
One can use an autoencoder for such a task, using architectures like VGG16 pre-trained on ImageNet data; then calculate the distance between the query and the other images in order to find the closest match.
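A hedged sketch of that idea (my code, using a pre-trained VGG16 from torchvision as the feature extractor rather than a full autoencoder; file names are placeholders):
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# ImageNet-pretrained VGG16, truncated before the final classification layer
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)  # torchvision >= 0.13; use pretrained=True on older versions
vgg.classifier = vgg.classifier[:-1]
vgg.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path):
    with torch.no_grad():
        x = preprocess(Image.open(path).convert('RGB')).unsqueeze(0)
        return vgg(x).squeeze(0)  # 4096-dim feature vector

query, candidate = embed('query.jpg'), embed('candidate.jpg')
# cosine similarity close to 1.0 means the images are near-identical in feature space
print(torch.nn.functional.cosine_similarity(query, candidate, dim=0).item())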
