Say that you have a grid where users draw pictures/shapes by clicking and coloring the boxes. Can you suggest an algorithm for comparing these drawings by originality? I was thinking about comparing them by the boxes they occupy, but I am not sure that is the best way. I hope that is clear. Thanks.
IMHO, the best choice would be to use mutual information as a metric. Since this is still a very abstract problem I am not sure about details of calculating it.
Let me elaborate on why mutual information is a good measure. Assume an image is made up of the colors a, b, c and d (exactly four colors), and another image is exactly the same, except that a is replaced with e, b with f, c with g and d with h. Under many other metrics (correlation, for example) these two images seem dissimilar, but mutual information would show that they share exactly the same information (only coded differently).
How to calculate mutual information: first, you need to align the images (a tough problem in itself; you can get a reasonable solution by searching over offsets, scaling and rotation). Once the images are aligned, you have a pixel-to-pixel correspondence. You can assume each pixel is independent and calculate I(X;Y), where X is a pixel from the first image and Y is the corresponding pixel from the second. This is the simplest solution, but you can assume more complicated relations, e.g. I(X1,...,Xk;Y1,...,Yk), where X1,...,Xk are adjacent pixels and the Yi are their counterparts.
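As a rough sketch of the simplest case above (independent pixels, images already aligned), I(X;Y) can be estimated from the joint histogram of color pairs. The function name `mutual_information` and the list-of-rows grid layout are my own assumptions for illustration, not part of any library.

```python
from collections import Counter
from math import log2

def mutual_information(grid_a, grid_b):
    """Estimate I(X;Y) in bits from two aligned grids of equal size.

    Hypothetical helper: each aligned cell pair is treated as one
    independent sample of (X, Y), as in the simplest case described above.
    """
    pairs = [(a, b)
             for row_a, row_b in zip(grid_a, grid_b)
             for a, b in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)               # counts of (color_a, color_b)
    px = Counter(a for a, _ in pairs)    # marginal counts for image A
    py = Counter(b for _, b in pairs)    # marginal counts for image B
    # sum over p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())
```

With the recoding example from above (a->e, b->f, c->g, d->h) every color pair is perfectly predictable, so the estimate equals the entropy of either image, whereas comparing against a blank grid gives zero.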
You can use a space-filling curve (a Hilbert curve, for example). Such a curve fills the space and traverses each point exactly once, so it reduces your 2D problem to a 1D one. Once the points are sorted along the curve you can treat the image as a one-dimensional sequence, which makes it easier to apply a statistical algorithm to look for similarities. You can apply this to each color of the image.
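To make the idea concrete, here is a minimal sketch that flattens a square grid along a Hilbert curve. I am assuming a Hilbert curve specifically and a side length that is a power of two; the index-to-coordinate conversion is the commonly published iterative one, and the function names are hypothetical.

```python
def hilbert_d2xy(n, d):
    """Map position d along the curve to (x, y) in an n x n grid.

    Assumes n is a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def flatten_along_hilbert(grid):
    """Return the cells of a square grid (list of rows) in curve order."""
    n = len(grid)
    order = (hilbert_d2xy(n, d) for d in range(n * n))
    return [grid[y][x] for x, y in order]
```

Cells that are neighbours in the 1D sequence are always neighbours in the grid, which is why statistics computed on the flattened sequence still say something about the 2D picture.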
Given a "density" scalar field in the plane, how can I divide the plane into nice (low moment of inertia) regions so that each region contains a similar amount of "mass"?
That's not the best description of what my actual problem is, but it's the most concise phrasing I could think of.
I have a large map of a fictional world for use in a game. I have a pretty good idea of approximately how far one could walk in a day from any given point on this map, and this varies greatly based on the terrain etc. I would like to represent this information by dividing the map into regions, so that one day of walking could take you from any region to any of its neighboring regions. It doesn't have to be perfect, but it should be significantly better than simply dividing the map into a hexagonal grid (which is what many games do).
I had the idea that I could create a gray-scale image with the same dimensions as the map, where each pixel's color value represents how quickly one can travel through the pixel in the same place on the map. Well-maintained roads would be encoded as white pixels, and insurmountable cliffs would be encoded as black, or something like that.
My question is this: does anyone have an idea of how to use such a gray-scale image (the "density" scalar field) to generate my "grid" from the previous paragraph (regions of similar "mass")?
I've thought about using the gray-scale image as a discrete probability distribution, from which I can generate a bunch of coordinates, and then use some sort of clustering algorithm to create the regions, but a) the clustering algorithms would have to create clusters of a similar size, I think, for that idea to work, which I don't think they usually do, and b) I barely have any idea if any of that even makes sense, as I'm way out of my comfort zone here.
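For what it's worth, the sampling half of that idea is easy to sketch. The snippet below assumes 8-bit gray values and that darker (slower) pixels should carry more "mass", i.e. weight = 255 - value; both the function name and that mapping are assumptions, not something fixed by the question.

```python
import random

def sample_points(gray, count, rng=random):
    """Draw `count` pixel coordinates with probability proportional to mass.

    `gray` is a list of rows of 0..255 values. Assumption: mass is
    255 - value, so dark (hard to cross) pixels are sampled more often.
    Raises ValueError if every pixel is pure white (all weights zero).
    """
    cells = [(x, y) for y, row in enumerate(gray) for x in range(len(row))]
    weights = [255 - value for row in gray for value in row]
    return rng.choices(cells, weights=weights, k=count)
```

The clustering step (with clusters of similar size) is the hard part and is not covered here.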
Sorry if this doesn't belong here; my idea has always been to solve it programmatically somehow, so this seemed the most sensible place to ask.
UPDATE: Just thought I'd share the results I've gotten so far, trying out the second approach suggested by @samgak: recursively subdividing regions into boxes of similar mass, finding the center of mass of each region, and creating a Voronoi diagram from those.
I'll keep tweaking, and maybe try to find a way to make it less grid-like (like in the upper right corner), but this worked way better than I expected!
Building upon @samgak's solution: if you don't want the grid-like structure, you can just add a small random perturbation to your centers. Below you can see, for example, the difference I obtain:
without perturbation
adding some random perturbation
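A minimal sketch of the perturbation step, together with a brute-force discrete Voronoi labelling so the effect can be inspected per pixel. The function names and the uniform jitter are my own assumptions; the pictures above were not necessarily produced this way.

```python
import random

def jitter(centres, amount, rng=random):
    """Move each (x, y) centre by up to `amount` in each axis (uniform noise)."""
    return [(x + rng.uniform(-amount, amount),
             y + rng.uniform(-amount, amount)) for x, y in centres]

def nearest_centre_labels(width, height, centres):
    """Discrete Voronoi diagram: label every pixel with its nearest centre.

    Brute force, O(width * height * len(centres)); fine for a sketch.
    """
    def closest(x, y):
        return min(range(len(centres)),
                   key=lambda i: (centres[i][0] - x) ** 2
                               + (centres[i][1] - y) ** 2)
    return [[closest(x, y) for x in range(width)] for y in range(height)]
```

Usage would be something like `labels = nearest_centre_labels(w, h, jitter(centres, 5.0))`, with the jitter amount tuned by eye.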
A couple of rough ideas:
You might be able to repurpose a color-quantization algorithm, which partitions color-space into regions with roughly the same number of pixels in them. You would have to do some kind of funny mapping where the darker the pixel in your map, the greater the number of pixels of a color corresponding to that pixel's location you create in a temporary image. Then you quantize that image into x number of colors and use their color values as co-ordinates for the centers of the regions in your map, and you could then create a voronoi diagram from these points to define your region boundaries.
Another approach (which is similar to how some color quantization algorithms work under the hood anyway) could be to recursively subdivide regions of your map into axis-aligned boxes by taking each rectangular region and choosing the optimal splitting line (x or y) and position to create 2 smaller rectangles of similar "mass". You would end up with a power of 2 count of rectangular regions, and you could get rid of the blockiness by taking the centre of mass of each rectangle (not simply the center of the bounding box) and creating a voronoi diagram from all the centre-points. This isn't guaranteed to create regions of exactly equal mass, but they should be roughly equal. The algorithm could be improved by allowing recursive splitting along lines of arbitrary orientation (or maybe a finite number of 8, 16, 32 etc possible orientations) but of course that makes it more complicated.
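Here is a rough sketch of the second approach, restricted to axis-aligned splits. It assumes the map is a list of rows of non-negative "mass" values; the function names are hypothetical, and the mass sums are recomputed naively (a summed-area table would be the obvious speed-up).

```python
def box_mass(density, box):
    """Total mass inside a half-open box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return sum(density[y][x] for y in range(y0, y1) for x in range(x0, x1))

def centre_of_mass(density, box):
    """Mass-weighted centre of a box; falls back to the geometric centre."""
    x0, y0, x1, y1 = box
    total = box_mass(density, box)
    if total == 0:
        return ((x0 + x1 - 1) / 2, (y0 + y1 - 1) / 2)
    cx = sum(x * density[y][x] for y in range(y0, y1) for x in range(x0, x1))
    cy = sum(y * density[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return (cx / total, cy / total)

def split_boxes(density, box, depth):
    """Recursively cut `box` into up to 2**depth boxes of similar mass."""
    x0, y0, x1, y1 = box
    cuts = ([("x", c) for c in range(x0 + 1, x1)]
            + [("y", c) for c in range(y0 + 1, y1)])
    if depth == 0 or not cuts:
        return [box]
    total = box_mass(density, box)

    def halves(cut):
        axis, c = cut
        if axis == "x":
            return (x0, y0, c, y1), (c, y0, x1, y1)
        return (x0, y0, x1, c), (x0, c, x1, y1)

    def imbalance(cut):
        # 0 when the cut divides the mass exactly in two
        return abs(2 * box_mass(density, halves(cut)[0]) - total)

    first, second = halves(min(cuts, key=imbalance))
    return (split_boxes(density, first, depth - 1)
            + split_boxes(density, second, depth - 1))
```

The centres for the Voronoi step would then be `[centre_of_mass(density, b) for b in split_boxes(density, (0, 0, width, height), depth)]`.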
What I want to achieve is a transition between two image files. The pixels from image A move and rearrange themselves to form image B. Imagine a cloud of particles (made from image A's pixels) that forms into picture B.
So far I have thought of going through all the pixels in image A and comparing them to pixels in image B; pixels that are the most similar are taken out of the arrays (with their x,y coordinates, too) and put into another array. So, in the end, I have pairs of pixels from both images that are similar. Then I only have to create the animation / possible color balancing (obviously all the pairs won't consist of identical pixels), which is fairly easy.
The problem is the algorithm that finds pixel pairs. For a small 100px x 100px image it would take 50 005 000 comparisons; for larger images it would be impractical.
Would dividing the pictures into clusters help? Any ideas will be appreciated.
I'd say that you're likely to achieve the best result matching up pixels by hue first, then saturation, finally luminance. If I'm right, then your best bet for optimization would be to convert to HSV first. Once there, you can just sort your pixels and binary search the results to find your pairs.
You may additionally want to search a fixed window around the result you find, to match up pixels that are the least distance away from each other. That may make the resulting transition more coherent.
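A sketch of what I mean, under a few assumptions: pixels are given as `((x, y), (r, g, b))` pairs with 8-bit channels, both images have the same number of pixels, and the function names are made up for illustration. The standard-library `colorsys` and `bisect` modules do the HSV conversion and the binary search.

```python
import bisect
import colorsys

def hsv_key(rgb):
    """Sort key: (hue, saturation, value), each in 0..1."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

def pair_pixels(image_a, image_b, window=8):
    """Pair each pixel of A with an unused, similar-colored pixel of B.

    Within `window` entries either side of the binary-search hit, the
    candidate closest in position is chosen. Returns (pos_a, pos_b) pairs.
    """
    remaining = sorted((hsv_key(color), pos) for pos, color in image_b)
    pairs = []
    for pos_a, color in image_a:
        i = bisect.bisect_left(remaining, (hsv_key(color),))
        lo = max(0, i - window)
        hi = min(len(remaining), i + window)
        j = min(range(lo, hi),
                key=lambda k: (remaining[k][1][0] - pos_a[0]) ** 2
                            + (remaining[k][1][1] - pos_a[1]) ** 2)
        # list.pop is O(n); a balanced tree would be better for big images
        pairs.append((pos_a, remaining.pop(j)[1]))
    return pairs
```

A larger window trades color accuracy for shorter travel distances, so it is worth experimenting with.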
You may want to take a look at the Hungarian algorithm, which reduces the amount of actual comparisons for 100x100 pixels to 10000 - and after that you have O(n^3) time for finding the optimal matches. Basically, give each pixel combination a "cost" based on similarity and then send the (inverted) cost matrix through the algorithm to get the optimal assignment of pixels from A to pixels from B.
But it still might be too much computation for too little gain, depending on whether you need real time. I.e. this kind of work doesn't necessarily need an optimal match, just good enough - still, it may work as a point of origin in terms of finding less computationally intensive methods.
See the bottom of the linked article for implementations in various languages; it's not entirely trivial to implement.
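To give a feel for it, here is a compact sketch of the O(n^3) potentials-based formulation of the assignment algorithm, plus a hypothetical `color_cost` helper that builds the cost matrix from squared RGB differences. It minimises cost directly, so no inversion is needed with this cost definition. Treat it as an illustration for small inputs, not a tuned implementation.

```python
def color_cost(pixels_a, pixels_b):
    """Cost matrix: squared RGB distance between every pair of pixels."""
    return [[sum((p - q) ** 2 for p, q in zip(a, b)) for b in pixels_b]
            for a in pixels_a]

def hungarian(cost):
    """Minimum-cost assignment for a square (or wide) cost matrix.

    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0] * (n + 1)          # row potentials
    v = [0] * (m + 1)          # column potentials
    p = [0] * (m + 1)          # p[j] = row matched to column j (1-based)
    way = [0] * (m + 1)
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:              # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment
```

Note that the cost matrix has one entry per pixel pair, so memory becomes a concern long before the image gets large.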
Based on this original idea, that many of you have probably seen before:
http://rogeralsing.com/2008/12/07/genetic-programming-evolution-of-mona-lisa/
I wanted to try taking a different approach:
You have a target image. Let's say you can add one triangle at a time. There exists some triangle (or triangles in case of a tie) that maximizes the image similarity (fitness function). If you could brute force through all possible shapes and colors, you would find it. But that is prohibitively expensive. Searching all triangles is a 10-dimensional space: x1, y1, x2, y2, x3, y3, r, g, b, a.
I used simulated annealing with pretty good results. But I'm wondering if I can further improve on this. One thought was to actually analyze the image difference between the target image and current image and look for "hot spots" that might be good places to put a new triangle.
What algorithm would you use to find the optimal triangle (or other shape) that maximizes image similarity?
Should the algorithm vary to handle coarse details and fine details differently? I haven't let it run long enough to start refining the finer image details. It seems to get "shy" about adding new shapes the longer it runs... it uses low alpha values (very transparent shapes).
Target Image and Reproduced Image (28 Triangles):
Edit! I had a new idea. If shape coordinates and alpha value are given, the optimal RGB color for the shape can be computed by analyzing the pixels in the current image and the target image. So that eliminates 3 dimensions from the search space, and you know the color you're using is always optimal! I've implemented this, and tried another run using circles instead of triangles.
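A sketch of how that computation can look, assuming ordinary "over" blending with a uniform alpha, i.e. new = (1 - alpha) * current + alpha * color, and a squared-error fitness. Under those assumptions the least-squares color per channel is the mean of (target - (1 - alpha) * current) / alpha over the covered pixels, clamped to 0..255. The function name and data layout are hypothetical.

```python
def optimal_color(current, target, covered, alpha):
    """Best RGB for a shape with fixed coverage and alpha (0 < alpha <= 1).

    `current` and `target` are lists of rows of (r, g, b) tuples;
    `covered` is a non-empty list of (x, y) pixels under the shape.
    """
    channels = []
    for ch in range(3):
        wanted = [(target[y][x][ch] - (1 - alpha) * current[y][x][ch]) / alpha
                  for x, y in covered]
        mean = sum(wanted) / len(wanted)
        # the error is quadratic in the color, so clamping stays optimal
        channels.append(min(255.0, max(0.0, mean)))
    return tuple(channels)
```

For example, on a black canvas with a uniform target of (100, 50, 0) and alpha 0.5, the result is (200, 100, 0), which blends back to exactly the target.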
300 Circles and 300 Triangles:
I would start experimenting with vertex-colours (have a different RGBA value for each vertex), this will slightly increase the complexity but massively increase the ability to quickly match the target image (assuming photographic images which tend to have natural gradients in them).
Your question seems to suggest moving away from a genetic approach (i.e. trying to find a good triangle to fit rather than evolving it). However, it could be interpreted both ways, so I'll answer from a genetic approach.
A way to focus your mutations would be to apply a grid over the image, calculate which grid square is the worst match for the corresponding grid square in the target image, determine which triangles intersect that grid square, and then flag them for a greater chance of mutation.
You could also (at the same time) improve fine-detail by doing a smaller grid-based check on the best matching grid-square.
For example if you're using an 8x8 grid over the image:
Determine which of the 64 grid squares is the worst match and flag intersecting (or nearby/surrounding) triangles for higher chance of mutation.
Determine which of the 64 grid-squares is the best match and repeat with another smaller 8x8 grid within that square only (i.e. 8x8 grid within that best grid-square). These can be flagged for likely spots for adding new triangles, or just to fine-tune the detail.
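The grid check above can be sketched as follows. I am assuming images stored as lists of rows of (r, g, b) tuples and a squared-difference error; the names `cell_errors` and `worst_and_best` are hypothetical.

```python
def cell_errors(current, target, cells=8):
    """Sum of squared channel differences per grid square.

    Returns {(col, row): error} for a cells x cells grid over the image.
    """
    height, width = len(target), len(target[0])
    errors = {}
    for y in range(height):
        for x in range(width):
            key = (x * cells // width, y * cells // height)
            err = sum((c - t) ** 2
                      for c, t in zip(current[y][x], target[y][x]))
            errors[key] = errors.get(key, 0) + err
    return errors

def worst_and_best(errors):
    """Grid squares with the largest and the smallest error."""
    return max(errors, key=errors.get), min(errors, key=errors.get)
```

Triangles intersecting the worst square get a higher mutation chance; the best square can be cropped out and passed through `cell_errors` again for the finer 8x8 pass.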
An idea using multiple runs:
Use your original algorithm as the first run, and stop it after a predetermined number of steps.
Analyze the first run's result. If the result is pretty good over most of the image but poor in a small part of it, increase the emphasis on that part.
When running the second run, double the error contribution from the emphasized part (see note). This will cause the second run to do a better match in that area. On the other hand, it will do worse in the rest of the image, relative to the first run.
Repeatedly perform many runs.
Finally, use a genetic algorithm to merge the results - it is allowed to choose from triangles generated from all of the previous runs, but is not allowed to generate any new triangles.
Note: There are in fact algorithms for calculating how much the error contribution should be increased. The technique is called boosting: http://en.wikipedia.org/wiki/Boosting. However, I think the idea will still work without using a mathematically precise method.
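The "double the error contribution" step can be sketched as a per-pixel weight map applied to a squared-error fitness. This is only the ad hoc version described above, not a real boosting algorithm, and the function names and rectangular region format are my own assumptions.

```python
def emphasise(weights, region, factor=2.0):
    """Return a copy of `weights` with a half-open box (x0, y0, x1, y1) scaled."""
    x0, y0, x1, y1 = region
    return [[w * factor if x0 <= x < x1 and y0 <= y < y1 else w
             for x, w in enumerate(row)]
            for y, row in enumerate(weights)]

def weighted_error(current, target, weights):
    """Squared channel error, with each pixel scaled by its weight."""
    return sum(weights[y][x] * sum((c - t) ** 2
                                   for c, t in zip(current[y][x], target[y][x]))
               for y in range(len(target))
               for x in range(len(target[0])))
```

Each run starts from a weight map of all ones and calls `emphasise` on whichever area the previous run handled worst.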
Very interesting problem indeed! My way of analyzing this kind of problem was to use an evolution strategy optimization algorithm. It is not fast, and it is only suitable if the number of triangles is small. I did not achieve good approximations of the original image, but that is partly because my original image was too complex, so I didn't try many restarts of the algorithm to see what other sub-optimal results EVO could produce... In any case, this is not bad as an abstract art generation method :-)
I think the algorithm is actually very simple.
P = 200          # size of population
max_steps = 100
def iteration
  create P totally random triangles (random points and colors)
  select the one triangle that has the best fitness
  # fitness computation is described here: http://rogeralsing.com/2008/12/09/genetic-programming-mona-lisa-faq/
  put the selected triangle on the picture (or add it to an array of triangles to manipulate them in future)
end
for i in 1..max_steps { iteration }
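For anyone who wants to run it, here is one possible Python rendering of that loop. Several things are assumptions on my part: images are small lists of rows of (r, g, b) tuples, the fitness is a plain sum of squared channel differences, triangles are rasterised with a simple edge-sign test, and a triangle is only kept if it lowers the error. All names are hypothetical.

```python
import random

def inside(p, a, b, c):
    """True if point p lies inside (or on the edge of) triangle abc."""
    def edge(o, q, r):
        return (q[0] - o[0]) * (r[1] - o[1]) - (q[1] - o[1]) * (r[0] - o[0])
    signs = (edge(a, b, p), edge(b, c, p), edge(c, a, p))
    return not (any(s < 0 for s in signs) and any(s > 0 for s in signs))

def draw(canvas, triangle):
    """Return a copy of the canvas with the triangle alpha-blended on top."""
    points, color, alpha = triangle
    return [[tuple((1 - alpha) * p + alpha * c for p, c in zip(pixel, color))
             if inside((x, y), *points) else pixel
             for x, pixel in enumerate(row)]
            for y, row in enumerate(canvas)]

def error(canvas, target):
    """Sum of squared channel differences (lower is better)."""
    return sum((p - t) ** 2
               for row_c, row_t in zip(canvas, target)
               for pix_c, pix_t in zip(row_c, row_t)
               for p, t in zip(pix_c, pix_t))

def random_triangle(width, height, rng):
    points = [(rng.randrange(width), rng.randrange(height)) for _ in range(3)]
    color = tuple(rng.randrange(256) for _ in range(3))
    return points, color, rng.uniform(0.1, 1.0)

def iteration(canvas, target, population, rng):
    """Try `population` random triangles; keep the best if it helps."""
    height, width = len(target), len(target[0])
    candidates = [draw(canvas, random_triangle(width, height, rng))
                  for _ in range(population)]
    best = min(candidates, key=lambda c: error(c, target))
    return best if error(best, target) < error(canvas, target) else canvas
```

Calling `iteration` in a loop `max_steps` times with `population=200` corresponds to the pseudocode; pure Python is slow here, so keep the images tiny or move the inner loops to an array library.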
I know how to write a similarity function for data points in Euclidean space (by taking the negative min squared error). Now, if I want to test my clustering algorithms on images, how can I write a similarity function for data points in images? Do I base it on their RGB values, or something else? And how?
I think we need to clarify some points first:
Are you clustering only on color? Then take the RGB values of the pixels and apply your metric function (minimize the sum of squared errors, or just calculate SAD, the Sum of Absolute Differences).
Are you clustering on a spatial basis (within an image)? In this case you should take position into account, as you described for Euclidean space, simply treating the image as your samples' domain. It's a 2D space anyway... 3D if you consider color information too (see next).
Are you looking for 3D information from the image (2D position + 1D color)? This is the most probable case. As a first approach, consider segmentation techniques if your image shows regular or well-defined shapes. If that fails, or you want a less hand-tuned algorithm, consider reducing the 3D space of information to 2D or even 1D by doing PCA on the data. By analyzing the principal components you can drop non-useful information from your collection and/or exploit the intrinsic structure of the data in some way.
The topic would need much more than one post to cover, but I hope this helps a bit.
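As a small illustration of the combined case, a similarity between two pixels can be the negative weighted squared distance over position and color. The `(x, y, r, g, b)` layout and the two weights are assumptions for the sketch; setting one weight to zero recovers the color-only or position-only case.

```python
def similarity(p, q, spatial_weight=1.0, color_weight=1.0):
    """Negative weighted squared Euclidean distance between two pixels.

    p and q are (x, y, r, g, b) tuples; 0 means identical, and more
    negative values mean less similar.
    """
    d_pos = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    d_col = sum((a - b) ** 2 for a, b in zip(p[2:], q[2:]))
    return -(spatial_weight * d_pos + color_weight * d_col)
```

The relative scale of the two weights matters, since pixel coordinates and 0..255 color values live on different ranges.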
Looking for any information/algorithms relating to comparing vector graphics. E.g. say there are two point collections or vector files with two almost identical figures. I want to determine that the first figure is about 90% similar to the second one.
A common way to test for similarity is with image moments. Central moments are intrinsically translation invariant, and if the objects you compare might be scaled or rotated you can use moments that are invariant to these transformations, such as Hu moments.
Most of the programs I know would require rasterized versions of the vector objects; but the moments could be calculated directly from the vector graphics using a Green's Theorem approach, or a more simplistic approach that just identifies unique (unordered) vertex configurations would be to convert the Hu moment integrals to sums over the vertices -- in a physics analogy replacing the continuous object with equal point masses at each vertex.
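Here is a sketch of the simplistic point-mass variant. It computes second-order central moments over the vertices and combines them into one quantity with the same form as Hu's second invariant, divided by the squared spread so that it is also scale invariant for point sets. That normalisation is my own choice for the point-mass case (the usual area-based normalisation does not carry over directly), and the function name is hypothetical.

```python
def shape_signature(vertices):
    """Translation-, rotation- and scale-invariant number for a point set.

    Each vertex is an (x, y) pair treated as an equal point mass.
    Returns ((mu20 - mu02)^2 + 4 * mu11^2) / (mu20 + mu02)^2,
    which lies in 0..1 (0 for a perfectly "round" configuration).
    """
    n = len(vertices)
    cx = sum(x for x, _ in vertices) / n
    cy = sum(y for _, y in vertices) / n
    mu20 = sum((x - cx) ** 2 for x, _ in vertices) / n
    mu02 = sum((y - cy) ** 2 for _, y in vertices) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in vertices) / n
    spread = mu20 + mu02
    if spread == 0:          # all points coincide
        return 0.0
    return ((mu20 - mu02) ** 2 + 4 * mu11 ** 2) / spread ** 2
```

A single number like this cannot separate all shapes, so in practice you would compare a small vector of such invariants and turn the distance into a percentage.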
There is a paper on a tool called VISTO that sorts vector graphics images (using moments, I think), which should certainly be useful for more details.
You could search for fingerprint matching algorithms. Fingerprints are usually converted to a set of points with their relative location to each other, which makes it basically the same problem as yours.
You could transform it to a non-vector graphic and then apply standard image analysis techniques like SIFT points, etc.