I have 1000 2D grayscale images and would like to cluster them in Python so that similar images end up in the same group. The images represent simple geometric shapes such as circles and triangles.
If I flatten each image into a vector and then run a clustering algorithm, it becomes very complicated. The images are 400*500, so my clustering training data would be 1000*200000, which means 200000 features!
Has anyone come across this issue before?
This is a similar question to this one
Read my answer
Of course you don't use each picture as a feature.
In your case I would recommend features like:
Find corners and calculate their number
Assuming each edge is a straight line, build a histogram of orientations. At each pixel calculate the gradient angle atan2(dy, dx), take the strongest 1% of gradient pixels and build a histogram of their angles. The number of peaks in the histogram will correspond to the number of edge directions (this will separate triangles, squares, circles, etc.)
Use connected components analysis to count how many shapes there are in the image. Count the number of holes in each shape. Calculate the ratio between the circumference and the area of the shape. For geometric shapes, geometric features work extremely well.
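To make the circumference/area feature concrete, here is a minimal sketch. It assumes the outline of the shape has already been extracted as an ordered list of (x, y) vertices (that step is not shown), and it uses the common normalised form 4*pi*A/P^2 rather than the raw ratio, so that the value does not depend on the size of the shape. The function names are made up for the example.

```python
import math

def polygon_area_and_perimeter(vertices):
    """Shoelace area and perimeter of a closed outline given as ordered (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter

def circularity(vertices):
    """4*pi*A / P**2: close to 1 for a circle, smaller for less compact shapes."""
    area, perimeter = polygon_area_and_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (1, 0), (0.5, math.sqrt(3) / 2)]  # equilateral
circle = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]

for name, shape in (("square", square), ("triangle", triangle), ("circle", circle)):
    print(name, round(circularity(shape), 3))
```

Circles, squares and equilateral triangles land at clearly different values (about 1, pi/4 and pi*sqrt(3)/9), so this single number is already a usable clustering feature.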
As you asked in the comment I am adding more info for issue 2.
Please read more about the HOG feature here. I assume you are familiar with what an edge in an image is and what a gradient is. Imagine you have a triangle in the image. Only pixels that lie on the edges of the shape will have a high gradient. Moreover, you expect all the gradients to divide into 3 different directions, one for each edge. You don't know which directions, since you don't know the orientation of the triangle, but you know that there should be 3 of them. With a square there would be 2 directions, and with a circle there will not be a clear direction. You want to count the number of directions.
Use the following steps. First find the pixels which have a high gradient value. Say that in the entire image there are only 1000 such pixels (they lie on the edges of the shape). For each pixel calculate the angle of the gradient. So you have 1000 pixels, each with an angle in [0..179] (an angle of 180 is equal to 0). There are 180 different angles. Let's assume that in order to reduce noise you don't need the exact angle, only the angle to within +-1 degree. So each angle is divided by 2 and rounded to the nearest integer. In total you have 1000 pixels, each with only 90 possible angle values. Now make a histogram of the angles.
If the shape was a circle, you expect roughly 11 (=1000/90) pixels to fall into each bin of the histogram. If it was a square, you expect the histogram to be largely empty except for 2 bins with a very high number of pixels in them, at a distance of 45 bins (90 degrees) from each other. Example: bin 13 has 400 pixels in it, bin 58 has 400 pixels in it, and the remaining 200 are noise spread over the other bins. Now you know that you are looking at a square, and you also know its rotation in the image.
If it was a triangle you expect 3 large bins in the histogram.
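The steps above can be sketched roughly as follows. This is plain Python on a list-of-lists image so that it runs without extra libraries; on real 400*500 images you would want an array library instead. The central-difference gradient, the tie handling at the cut-off, and the peak rule (a bin must be a local maximum and hold at least 10% of the kept pixels) are my assumptions, not part of the recipe above.

```python
import math

def orientation_histogram(image, n_bins=90, keep_fraction=0.01):
    """Histogram of unsigned gradient angles (2-degree bins) over the strongest pixels."""
    h, w = len(image), len(image[0])
    samples = []  # (magnitude, angle in degrees folded into [0, 180))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = image[y][x + 1] - image[y][x - 1]  # central differences
            dy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude > 0:
                samples.append((magnitude, math.degrees(math.atan2(dy, dx)) % 180.0))
    hist = [0] * n_bins
    if not samples:
        return hist
    # Keep roughly the strongest keep_fraction of all pixels; ties at the cut-off are all kept.
    magnitudes = sorted((m for m, _ in samples), reverse=True)
    keep = min(len(magnitudes), max(1, int(keep_fraction * h * w)))
    threshold = magnitudes[keep - 1]
    bin_width = 180.0 / n_bins
    for magnitude, angle in samples:
        if magnitude >= threshold:
            hist[int(angle / bin_width) % n_bins] += 1
    return hist

def count_peaks(hist, min_share=0.1):
    """Count bins that are circular local maxima and hold at least min_share of all votes."""
    total = sum(hist)
    n = len(hist)
    return sum(
        1 for i in range(n)
        if total and hist[i] >= min_share * total
        and hist[i] > hist[i - 1] and hist[i] >= hist[(i + 1) % n]
    )

# Synthetic test image: a filled axis-aligned square on a black background.
square_image = [[1 if 10 <= x < 30 and 10 <= y < 30 else 0 for x in range(40)] for y in range(40)]
hist = orientation_histogram(square_image)
print(count_peaks(hist))  # 2 for this axis-aligned square
```

The number of peaks (together with the bin positions, if rotation matters) can then be used as a feature for clustering instead of the raw pixels.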
I would like your help regarding rotations. Assuming we have a group of "n" planar surfaces within the following 3-dimensional system (image below), where the Red, Green and Blue planes are the xz, yz and xy Axes planes respectively. These planar surfaces are at random positions and have random orientations, sizes, and shapes.
Now imagine that we rotate all these planes as a group in all possible ways, for example 360{dimension1}*360{dimension2}*360{dimension3} = 46656000 times (assuming that our chosen rotational precision is 1 degree).
Among all these rotational combinations, each one of those "n" planes can be found several times to be almost (depending on our rotational precision) parallel to the xz, yz and xy planes.
My question: using quaternions, Euler angles or some other methodology, how can I minimize the number of rotation checks without lowering the rotational precision (criterion 1), while making sure that, among the combinations, each planar surface in the group will be parallel at least once to any one of the 3 axes planes (criterion 2)? For example, if a surface is only ever parallel to the yz axes plane, that is enough and we do not care about the other 2.
I thought of checking 90*90*90 degrees combinations but 1) I am not sure whether this will cover my criteria above and 2) maybe there is something more clever.
From what I understand, you want to do a number of rotations in sequence, by the end of which each plane will have been (almost) parallel to any of the (xy, xz, yz) planes?
Just rotate around the x-axis, then around the y-axis, then around the z-axis, so 360*3 rotations.
I'm just going to try and explain my problem with images:
The program receives an input (image):
There is a base polygon, but can be simplified into a circle in all situations:
Output should be something like:
There is no correct result, just good and bad ones.
To make things easier, an estimate of how many circles there should be can be given based on the surface and extent of the polygon.
What I am searching for is an algorithm that does what is described above: cover as much as possible with the given shape, while minimizing the area of black pixels and the overlapping areas.
I used k-means clustering to find circle centers. Number of clusters is calculated:
numberOfClusters = round(polygonArea / basePolygonArea).
Input data for k-means algorithm are points of white pixels.
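A minimal sketch of this approach in plain Python, for illustration only. The farthest-point seeding and the helper names are assumptions made for the example (the post does not say how the centers were initialised), and the input here is a synthetic mask of two discs rather than a real image.

```python
import math

def estimate_cluster_count(polygon_area, base_polygon_area):
    """numberOfClusters = round(polygonArea / basePolygonArea), but never less than 1."""
    return max(1, round(polygon_area / base_polygon_area))

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm on 2D points with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        new_centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers

# Synthetic input: white pixels forming two discs of radius 5.
white_pixels = [
    (x, y)
    for y in range(20) for x in range(40)
    if (x - 10) ** 2 + (y - 10) ** 2 <= 25 or (x - 30) ** 2 + (y - 10) ** 2 <= 25
]
k = estimate_cluster_count(len(white_pixels), math.pi * 5 ** 2)
centers = kmeans(white_pixels, k)
print(k, sorted(centers))
```

Each returned center is then used as the center of one base circle. Note that k-means only minimises distances to the centers, so it does not directly penalise overlap or uncovered black pixels.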
I am trying to implement the method of Dalal and Triggs. I have implemented the first stage, computing the gradients of an image, and I have written the code that walks across the image in cells, but I don't understand the logic behind this stage.
I know it is necessary to first choose between signed (0-360 degrees) and unsigned (0-180 degrees) gradients.
I know I must create a data structure to store each cell's histogram, with n bins. I know what a histogram is, hence I understand I must visit each pixel, but I don't fully understand how to classify each pixel: how to get the gradient orientation of the pixel and build the histogram with this data.
In short, HOG is nothing but a dense representation of gradient orientations, weighted by their strengths, over overlapping local neighbourhoods.
You asked what the significance of finding each pixel's gradient orientation is. In an image, the gradient orientation at each pixel indicates the direction of the object's boundary (the edge between two textures) at that location with respect to the X and Y axes. So if you group the orientations over a patch, block or part of an object, they represent the distribution of edge directions of the object in that region in a very strong, distinctive way. Now let us take a simple example, a circle. If you plot the gradient orientations of a circle as a histogram (don't think of HOG yet, just a simple plot of gradient orientations), you will get a flat line, because the edge orientations of a circle range from 0 degrees to 360 degrees if you sample at 360 consecutive locations. For a different object the histogram is different. HOG does the same thing but in a more sophisticated manner, by dividing the image into overlapping blocks, dividing each block into cells, and weighting the histogram by the strengths of the local gradients.
Hope it is useful ...
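To show how each pixel ends up in a bin, here is a simplified sketch of the per-cell histogram step. It uses unsigned orientations, 8x8 cells and 9 bins, and each pixel votes with its full gradient magnitude into a single bin. That is a simplification of the method in the paper, which also interpolates votes between neighbouring bins and normalises over overlapping blocks; those parts are left out. The function name and the border handling are assumptions for the example.

```python
import math

def cell_histograms(image, cell_size=8, n_bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by gradient magnitude."""
    h, w = len(image), len(image[0])
    rows, cols = h // cell_size, w // cell_size
    hists = [[[0.0] * n_bins for _ in range(cols)] for _ in range(rows)]
    bin_width = 180.0 / n_bins  # 20 degrees per bin when n_bins is 9
    for y in range(rows * cell_size):
        for x in range(cols * cell_size):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat pixel, nothing to vote
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold 0-360 into 0-180
            bin_index = int(angle / bin_width) % n_bins
            hists[y // cell_size][x // cell_size][bin_index] += magnitude
    return hists

# 16x16 image with a vertical step edge: left half dark, right half bright.
step_image = [[0 if x < 8 else 1 for x in range(16)] for _ in range(16)]
hists = cell_histograms(step_image)
print(hists[0][0])  # all of the weight falls into bin 0 (horizontal gradient)
```

So "classifying" a pixel just means computing its angle, turning the angle into a bin index, and adding its magnitude to that bin of the cell the pixel belongs to.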
I have some GPS sample data taken from a device. What I need to do is to "move" the data to the "left" by, let's say, 1 to 5 meters. I know how to do the moving part, the only problem is that the moving is not as accurate as I want it to be.
What I currently do:
I take the GPS coordinates (latitude, longitude pairs)
I convert them using plate carrée transformation.
I scale the resulting coordinates to the longitudinal distance (distance on x) and the latitudinal distance (distance on y) - imagine the entire GPS sample data is inside a rectangle being bound by the maximum and minimum latitude/longitude. I compute these distances using the formula for the Great Circle Distance between the extreme values for longitude and latitude.
I move the points x meters in the wanted direction
I convert back to GPS coordinates
I don't really have the accuracy I want. For example moving to the left by 3 meters means less than 3 meters (around 1.8m - maybe 2).
What are the known solutions for doing such things? I need a solution that deviates at most by 0.2-0.5 meters from the real point (not 1.2 like in the current case).
LATER: Is this kind of approach good? By this kind I mean to transform the GPS coordinates into plane coordinates and back to GPS. Is there other way?
LATER2: The approach of converting to a conformal map is probably the one that will be used. Since the rectangle is small, and since there are no roads at the poles, Mercator will probably be used. Opinions?
Thanks,
Iulian
PS: I'm working on small areas - so imagine the bounding rectangle I'm talking about to have the length of each side no more than 5 kilometers. (So a 5x5km rectangle is maximum).
There are two issues with your solution:
the plate carrée transformation is not conformal (i.e. angles are not preserved)
you cannot measure distances along lines of constant latitude that way, since those are not great circles (approximately, you are off by a factor of cos(lat) for your x).
Within small rectangles you may assume that lon/lat can be linearly mapped to x/y pairs but you have to keep in mind that a "square" in lon/lat maps to a rectangle with aspect ratio of approx cos(lat)/1.
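A minimal sketch of that local linear mapping, assuming a spherical Earth with a mean radius of 6371 km (an assumption that is fine for shifts of a few meters, while survey-grade work would use an ellipsoid). The function names are made up for the example, and offsets are given as east/north meters, so a shift to the west is a negative east offset.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation (assumption)

def offset_point(lat_deg, lon_deg, east_m, north_m):
    """Shift a lat/lon point by small east/north distances given in meters."""
    lat = math.radians(lat_deg)
    dlat = north_m / EARTH_RADIUS_M
    # One radian of longitude spans R * cos(lat) meters, hence the cos(lat) factor.
    dlon = east_m / (EARTH_RADIUS_M * math.cos(lat))
    return lat_deg + math.degrees(dlat), lon_deg + math.degrees(dlon)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters, used here only to check the shift."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

lat, lon = 45.0, 25.0
moved = offset_point(lat, lon, east_m=-3.0, north_m=0.0)  # 3 m to the west
print(haversine_m(lat, lon, *moved))
```

Forgetting the cos(lat) factor shrinks an east/west shift by exactly that factor, which is the kind of shortfall described in the question.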
I am interested in using shapes like these:
Usually a tangram is made of 7 shapes (5 triangles, 1 square and 1 parallelogram).
What I want to do is fill a shape only with tangram shapes, so at this point,
the size and repetition of shapes shouldn't matter.
Here's something I manually tried:
I am a bit lost on how to approach this.
Assuming I have a path (an ordered list/array of points of the outline),
I imagine I should try to do some sort of triangulation.
Is there such a thing as Delaunay triangulation with triangles constrained to 45 degrees
right-angled triangles?
A more 'brute' approach would be to add a bunch of triangles(45 degrees) and use SAT
for collision detection to 'fix' overlaps, and hopefully gaps will be avoided.
Since the square and parallelogram can be made of triangles(45 degrees) too, I imagine there
would be a nice clean geometric solution, right ?
How do I pack triangles(45 degrees) inside an arbitrary shape ?
Any ideas are welcome.
A few random thoughts (maybe they help you find a better solution) if you're using only the original sizes of the shapes:
as you point out, all shapes in the tangram can be composed of e.g. the yellow or pink triangle (d-g-c), so also try thinking of a bottom-up approach, such as first placing as many yellow triangles as possible into your shape and then combining them into larger shapes where possible. In the worst case, you'll end up with a set of these smallest triangles.
any kind of triangulation of non-polygons (such as the half-moon in your example) probably does not work very well...
It looks like you require that the shapes can only have a few discrete orientations. To find the best fit of these triangles into the given shape, I'd propose the following approximate solution: draw a grid of triangles (i.e. a square grid with diagonal lines) across the shape and take those triangles which are fully contained. This most likely will not give you the optimal coverage but then you could repeatedly shift the grid by a tenth of the grid size in horizontal and vertical direction and see whether you'll find something which covers a larger fraction of the original shape (or you could go in steps of 1/2 then 1/4 etc. of the original grid size in the spirit of a binary search).
If you allow arbitrary scaling of the shapes, you could approximate any (reasonably smooth?) shape to arbitrary precision by adding smaller and smaller shapes. E.g. if you have a raster image, you can choose the size of the yellow triangle such that two of them make up one pixel of the image, and then you can represent any such raster image.
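The shifted-grid idea can be sketched as below. The shape is passed in as a point-in-shape test, and a triangle is accepted when all three of its vertices pass that test. This only guarantees full containment for convex shapes, so for something like the half-moon you would need a stricter check. The function names and the disc used as the test shape are assumptions made for the example.

```python
def grid_triangles(inside, bounds, cell, offset=(0.0, 0.0)):
    """Triangles of a square grid (one diagonal per cell) whose three vertices are inside."""
    xmin, ymin, xmax, ymax = bounds
    ox, oy = offset
    nx = int((xmax - xmin) / cell) + 2
    ny = int((ymax - ymin) / cell) + 2
    kept = []
    for i in range(-1, nx):
        for j in range(-1, ny):
            x0, y0 = xmin + ox + i * cell, ymin + oy + j * cell
            x1, y1 = x0 + cell, y0 + cell
            # Each grid square splits into two right-angled 45-degree triangles.
            lower = ((x0, y0), (x1, y0), (x1, y1))
            upper = ((x0, y0), (x1, y1), (x0, y1))
            for triangle in (lower, upper):
                if all(inside(x, y) for x, y in triangle):
                    kept.append(triangle)
    return kept

def best_grid(inside, bounds, cell, steps=10):
    """Try grid shifts in steps of cell/steps and keep the shift that fits most triangles."""
    best = []
    for a in range(steps):
        for b in range(steps):
            shift = (a * cell / steps, b * cell / steps)
            triangles = grid_triangles(inside, bounds, cell, shift)
            if len(triangles) > len(best):
                best = triangles
    return best

def in_disc(x, y):
    """Example shape: a disc of radius 5 centred on the origin."""
    return x * x + y * y <= 25.0

unshifted = grid_triangles(in_disc, (-5, -5, 5, 5), cell=1.0)
best = best_grid(in_disc, (-5, -5, 5, 5), cell=1.0)
print(len(unshifted), len(best))
```

All triangles have the same area (cell*cell/2), so comparing counts is the same as comparing covered area. Neighbouring triangles can afterwards be merged into squares, parallelograms or larger triangles.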