Photo collage algorithm - binary-tree

I am trying to build a script that will dynamically arrange photos like a collage very similar to what is done on http://lightbox.com/explore#spotlight.
I can of course write code that handles each case for a different set of photos, but I would prefer an algorithm that can handle any number of photos.
The algorithm explained here http://www.hpl.hp.com/techreports/2008/HPL-2008-199.pdf in chapter 4 seems very similar to what I need to do.
In my case the vertical and horizontal ratios would always be the same. I would have a defined bounding box and a limit on how many levels each node can be split. The bounding box would have the same ratio as a horizontal photo. If the algorithm can't fit all images, I would go back one level and leave it there, or pick another photo from a pool of available photos.
My question is very similar to this one, "Algorithm Arrange images on screen", but I am not sure how to move forward. Any further guidance or pseudocode would be very helpful.
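One way to picture the binary-tree idea is a slicing tree: leaves are photos with a fixed aspect ratio, and each internal node cuts its area either vertically (children side by side) or horizontally (children stacked). The following is only a minimal sketch under that assumption, not the algorithm from the HP report; the names `Node`, `compute_aspect` and `layout` are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Node:
    """Leaf: a photo with a fixed aspect ratio. Internal node: a cut with two children."""
    aspect: float = 0.0            # width / height
    cut: Optional[str] = None      # "V" = side by side, "H" = stacked, None = leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    name: str = ""

def compute_aspect(node: Node) -> float:
    """Bottom-up pass: derive the aspect ratio of every internal node."""
    if node.cut is None:
        return node.aspect
    a = compute_aspect(node.left)
    b = compute_aspect(node.right)
    if node.cut == "V":
        node.aspect = a + b                       # same height, widths add up
    else:
        node.aspect = 1.0 / (1.0 / a + 1.0 / b)   # same width, heights add up
    return node.aspect

Rect = Tuple[str, float, float, float, float]     # name, x, y, width, height

def layout(node: Node, x: float, y: float, width: float, out: List[Rect]) -> None:
    """Top-down pass: assign a rectangle to every leaf, given the node's width."""
    height = width / node.aspect
    if node.cut is None:
        out.append((node.name, x, y, width, height))
    elif node.cut == "V":
        left_width = height * node.left.aspect
        layout(node.left, x, y, left_width, out)
        layout(node.right, x + left_width, y, width - left_width, out)
    else:
        left_height = width / node.left.aspect
        layout(node.left, x, y, width, out)
        layout(node.right, x, y + left_height, width, out)

# Hypothetical example: two landscape photos on top, one below, all with ratio 3:2.
top = Node(cut="V", left=Node(aspect=1.5, name="a"), right=Node(aspect=1.5, name="b"))
root = Node(cut="H", left=top, right=Node(aspect=1.5, name="c"))
compute_aspect(root)
rects: List[Rect] = []
layout(root, 0.0, 0.0, 300.0, rects)
```

The aspect ratio of the whole collage falls out of the tree, so comparing `root.aspect` with the ratio of the bounding box is one possible way to score candidate trees and decide whether to back up a level or swap a photo.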

Related

Detecting hexagonal shapes in greyscale or binary image

For my bachelor thesis I need to analyse images taken in the ocean to count and measure the size of water particles.
My problem:
Besides the wanted water particles, the images show hexagonal patches all over, with:
- different sizes
- irregular shapes
- different greyscale values
(Example image below!)
It is clear that these patches will falsify my image analysis concerning the size and number of particles.
For this reason these patches need to be detected and deleted somehow.
Since this is only a small part of my thesis, I don't want to spend much time on it. I have already tried the classic approaches in ImageJ:
- Playing with the threshold (which also deletes wanted water particles).
- Analysing the image including the hexagonal patches and later sorting out the biggest areas (the hexagonal patches tend to have the biggest areas, but a lot of hexagons still remain).
- Playing with filters: applying a Gaussian filter to a duplicate of the image and subtracting the copy from the original removes many patches (by reducing their greyscale value), but it also removes small wanted water particles and so again falsifies the result.
A more complicated and time-consuming solution would be to use an existing library in, for example, MATLAB or OpenCV to detect points that describe the shapes,
but so far I could not find any code that fits my task.
Has anyone written such code that I could use, or does anyone have another idea?
You can also see a lot of hexagonal patches at different depths.
The small spots with a higher pixel value are the wanted particles!
Image processing is quite an involved area so there are no hard and fast rules.
But if it were me, I would 'mask' the image. This involves defining what you want to keep or remove as a pixel 'mask'. You then scan the mask over the image recursively and compare the mask to the selected image portion. You then select or remove the section (depending on your method) if it meets your criterion.
One example of such a criterion would be the spatial and grey-scale error weighted against a likelihood function (e.g. chi-squared, mean squared error, etc.) or a normal distribution for which you define the uncertainty.
Some food for thought
Maybe you can try the Hough transform:
https://en.wikipedia.org/wiki/Hough_transform
MATLAB has a built-in function, hough, which implements this, but it only works for lines. Maybe you can start from that and change it to recognize hexagons.
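To make the line-based starting point concrete, here is a minimal sketch of Hough voting for straight lines in plain Python. It is not MATLAB's hough, and the function name `hough_lines` and its parameters are made up for illustration. Each edge point votes for every (theta, rho) line that could pass through it; a roughly regular hexagon would then show up as peaks at three orientations about 60 degrees apart.

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Vote in (theta index, rho bin) space for lines x*cos(theta) + y*sin(theta) = rho."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes

# Hypothetical edge points: a horizontal line y = 5, 100 pixels long.
points = [(x, 5) for x in range(100)]
votes = hough_lines(points)
(best_t, best_rho), count = votes.most_common(1)[0]
# best_t is the angle of the line's normal in degrees (with n_theta=180),
# best_rho is the distance of the line from the origin in pixels.
```

On a real image the points would come from an edge detector, and the accumulator would contain several peaks rather than one.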

Object Detection in an Image

I want to detect some elements in an image.
To do this, I take the image and the specified element (like a nose) and search for the element starting from pixel (0,0).
But the performance is awful because I traverse the pixels one by one.
I think I need a smarter algorithm for this problem.
Maybe a machine learning algorithm would be useful for this.
What's your idea?
I would start with the Viola-Jones object detection framework.
This is a supervised learning technique that allows you to detect any kind of object with high probability.
(The article mainly refers to faces, but the framework is designed for general objects.)
If you choose this approach, your main chore is going to be obtaining a classified training set. You can later evaluate how good your algorithm is using cross-validation.
AFAIK, it is implemented in the OpenCV library (I am not familiar enough with the library to offer help).
You can do a very fast cross-correlation using the Fourier transform of your image and search pattern.
A good implementation is, for example, OpenCV's matchTemplate function.
This will work best if your pattern always has the same rotation and scale across your image.
If it does not, you can repeat the search with several scaled/rotated versions of your pattern.
One advantage of this approach is that no training phase is required.
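To show what such a search computes, here is a minimal sketch that evaluates a zero-mean normalized cross-correlation directly in the spatial domain. This is deliberately not the FFT-based method and not OpenCV's matchTemplate; it is much slower and only illustrates the score. The names `ncc_at` and `best_match` and the tiny test image are made up.

```python
import math

def ncc_at(image, pattern, top, left):
    """Zero-mean normalized cross-correlation of pattern with the window at (top, left)."""
    ph, pw = len(pattern), len(pattern[0])
    window = [image[top + r][left + c] for r in range(ph) for c in range(pw)]
    flat = [v for row in pattern for v in row]
    mean_w = sum(window) / len(window)
    mean_p = sum(flat) / len(flat)
    num = sum((w - mean_w) * (p - mean_p) for w, p in zip(window, flat))
    den = math.sqrt(sum((w - mean_w) ** 2 for w in window)
                    * sum((p - mean_p) ** 2 for p in flat))
    return num / den if den else 0.0   # flat windows carry no information

def best_match(image, pattern):
    """Slide the pattern over every position and return (score, top, left) of the best one."""
    ph, pw = len(pattern), len(pattern[0])
    best = (-2.0, 0, 0)
    for top in range(len(image) - ph + 1):
        for left in range(len(image[0]) - pw + 1):
            score = ncc_at(image, pattern, top, left)
            if score > best[0]:
                best = (score, top, left)
    return best

# Hypothetical greyscale data: a small cross pasted into an empty 6x8 image.
pattern = [[0, 9, 0],
           [9, 9, 9],
           [0, 9, 0]]
image = [[0] * 8 for _ in range(6)]
for r in range(3):
    for c in range(3):
        image[2 + r][4 + c] = pattern[r][c]
score, top, left = best_match(image, pattern)
```

A score close to 1 means the window matches the pattern up to brightness and contrast; the FFT route gets the same kind of score map far faster on real images.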
Another, simpler approach that would work in particular with your pattern is this:
Use connected component labeling to identify blobs with the right number of white pixels to be the center rectangle of your element. This will eliminate all but a few false positives. Concentrate your search on the remaining few spots.
Again OpenCV has a nice Blob library for that sort of stuff.
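As a rough illustration of the blob idea without any library, the sketch below labels 4-connected regions in a binary grid and keeps only those whose pixel count falls in an expected range. The function name `blob_sizes`, the grid and the size limits are assumptions for the example.

```python
from collections import deque

def blob_sizes(grid):
    """Return (area, first cell) for every 4-connected region of truthy cells."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                cr, cc = queue.popleft()
                area += 1
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            blobs.append((area, (r, c)))
    return blobs

# Hypothetical binary image: 1 = white pixel.
grid = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
# Keep only blobs whose area is plausible for the element's center rectangle.
candidates = [b for b in blob_sizes(grid) if 3 <= b[0] <= 5]
```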
If you're looking for simple geometric shapes in computer-generated images like the example you provided, then you don't need to bother with machine learning.
For example, here's one of the components you're trying to find in the original image:
(Image removed by request)
Assuming this component is always drawn at the same dimensions, the top and bottom lines are always going to be 21 pixels apart. You can narrow down your search space considerably by combining this image with a copy of itself shifted vertically by 21 pixels, and taking the lighter of the two images as the pixel value at each position.
(Image removed by request)
Similarly, the vertical lines at the left and right of this component are 47 pixels apart, so we can repeat this process with a 47px horizontal shift. This results in a vertical bar about 24px tall at the position of the component.
(Image removed by request)
You can detect these bars quite easily by looking for runs of black pixels between 22 and 26 pixels long in the vertical columns of the processed image. This will provide you with a short list of candidate positions where you can check for the presence of this component more thoroughly, e.g. by calculating a local 2D cross correlation.
Here are the results after processing the whole image. Reaching this stage should only take a few milliseconds.
(Image removed by request)
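Here is a small sketch of the shift-and-lighten step and the run search, on a synthetic image with made-up dimensions (a rectangle whose left and right edges are 9 columns apart rather than 47). It is one reading of the steps described above, not the exact code used for the images; the function names are hypothetical.

```python
WHITE, BLACK = 255, 0

def lighten_with_shift(image, dy, dx):
    """Per pixel, keep the lighter of image[r][c] and image[r + dy][c + dx]."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            rr, cc = r + dy, c + dx
            # Pixels shifted in from outside the image are treated as white.
            other = image[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else WHITE
            row.append(max(image[r][c], other))
        out.append(row)
    return out

def vertical_runs(image, min_len, max_len):
    """Return (column, start_row, length) for black vertical runs in the length range."""
    rows, cols = len(image), len(image[0])
    found = []
    for c in range(cols):
        r = 0
        while r < rows:
            if image[r][c] == BLACK:
                start = r
                while r < rows and image[r][c] == BLACK:
                    r += 1
                if min_len <= r - start <= max_len:
                    found.append((c, start, r - start))
            else:
                r += 1
    return found

# Synthetic image: a rectangle outline (rows 2..7, columns 3..12) and a distractor line.
image = [[WHITE] * 20 for _ in range(12)]
for c in range(3, 13):
    image[2][c] = image[7][c] = BLACK
for r in range(2, 8):
    image[r][3] = image[r][12] = BLACK
for c in range(20):
    image[10][c] = BLACK

shifted = lighten_with_shift(image, 0, 9)   # edges are 9 columns apart in this example
bars = vertical_runs(shifted, 5, 7)         # the surviving bar is 6 pixels tall
```

Only pixels that are dark both at their own position and at the shifted position stay dark, so the long distractor line collapses to runs of length 1 in every column and is ignored.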

How to determine which area is clicked in a complex image map?

We are given a rather complex image map, like the one linked below, except that the layout and shapes of the booths are more irregular, and we have lots of image maps to process.
The requirement is that the software is able to detect which booth (the boxes) is being clicked on. Once having identified the booth, we have to fetch its ID and do some processing. So we need a way to map the physical data on the map to its logical counterpart.
Usually, there are two ways I would approach the problem.
- Programmatically determine where the hotspots are. However, in this case there is no consistency in the layout of the booths: some are small rectangles, some are squares.
- Manually figure out the coordinates of each booth and program them into a giant lookup. This is very time-consuming, considering the number of booths (the image below is not from the project; it's just a demo). There's an estimate of at least 5000 booths spread across different maps.
Besides the two usual methods of creating hotspots for an image map, what other ways could I use to determine which booth is being clicked?
Platform used is LimeJS, but this problem should be generic enough...
You could separate the map into booths using flood-fills, a new color for each region. You want to flood a known "corridor" spot first to eliminate that. 0,0 should work for that on most maps, I'd imagine.
This would create the hotspots you need. To cope with the print inside the boxes messing with the fill, you can just use the far corners of each region to create a rectangle. This assumes the booths are actually rectangular on the map, of course. For L-shaped booths, a little extra work might be necessary.
To get the ID of each booth, you can feed each region (from above) into an OCR engine, but you'll have to be able to distinguish between the ID numbers and the dimensions, etc.
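The flood-fill idea could look roughly like this. It is a sketch on a toy map where "#" marks booth walls and "." marks open pixels; the map format and the names `flood`, `find_booths` and `booth_at` are assumptions, and it uses the bounding box of each filled region as the hotspot, as suggested for rectangular booths.

```python
from collections import deque

def flood(grid, labels, start, label):
    """Fill the 4-connected open region containing start; return its bounding box."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    labels[r0][c0] = label
    queue = deque([start])
    top, left, bottom, right = r0, c0, r0, c0
    while queue:
        r, c = queue.popleft()
        top, bottom = min(top, r), max(bottom, r)
        left, right = min(left, c), max(right, c)
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == "." and labels[nr][nc] is None):
                labels[nr][nc] = label
                queue.append((nr, nc))
    return (top, left, bottom, right)

def find_booths(grid):
    """Flood the corridor first, then give every remaining open region its own label."""
    labels = [[None] * len(grid[0]) for _ in grid]
    flood(grid, labels, (0, 0), 0)      # assumption: (0, 0) is a corridor pixel
    boxes = {}
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == "." and labels[r][c] is None:
                label = len(boxes) + 1
                boxes[label] = flood(grid, labels, (r, c), label)
    return boxes

def booth_at(boxes, row, col):
    """Return the label of the booth whose bounding box contains the click, if any."""
    for label, (top, left, bottom, right) in boxes.items():
        if top <= row <= bottom and left <= col <= right:
            return label
    return None

floor_plan = [
    "............",
    ".####.#####.",
    ".#..#.#...#.",
    ".#..#.#...#.",
    ".####.#####.",
    "............",
]
boxes = find_booths(floor_plan)
```

The labels here are just running numbers; mapping them to real booth IDs is the separate step described above.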

Algorithm Arrange images on screen

I need to start building an image application, and my customer wants the pictures arranged on the screen the way Google TV and Everpix do it. I have been looking for the algorithm for a while but was unable to find it. The result of arranging the pictures this way is amazing and makes the best use of the screen space.
http://www.google.com//tv/static/images/photos_tv_straight.png
Is this a known algorithm? Does it have a name?
Many thanks
T
As jwpat7 suggested, look for "photo collage layout" algorithms, particularly things like "treemap" and similar (squarified treemap). I am working on a similar algorithm, and for a small number of images you just need to solve a simple system of linear equations. There is another HP article that is probably closer to what you are looking for.
Mixed-Initiative Photo Collage Authoring - look at part 4.
The following image was made with a squarified treemap and ratio optimization.
Search for photo montage and photo collage algorithms, as well as photo tiling.
An HP article called "Structured Layout for Resizable Background Art" may be helpful.
Numerous collage programs are available for purchase and some software is available in source form; e.g. see hlrnet list, software.informer list, and perhaps this resizing blurb.
The algebra for scaling photos for a collage while maintaining aspect ratios is straightforward and easily described for specific cases, but not for too-general ones.
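As an example of one such specific case: to fill a row of width W with photos of aspect ratios a_i (width / height), all scaled to a common height h, the widths must satisfy sum(a_i * h) = W, so h = W / sum(a_i). A minimal sketch, with a made-up function name and sample sizes:

```python
def fit_row(sizes, row_width, gap=0.0):
    """Scale photos to one common height so that the row is exactly row_width wide."""
    aspects = [w / h for w, h in sizes]
    height = (row_width - gap * (len(sizes) - 1)) / sum(aspects)
    return [(a * height, height) for a in aspects]

# Hypothetical originals as (width, height) in pixels.
photos = [(400, 300), (300, 300), (200, 300)]
scaled = fit_row(photos, 900)
```

Every photo keeps its aspect ratio; only the common row height changes with the set of photos chosen for the row.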
In CSS you can arrange images from horizontal to vertical; a good example is Google image search. There is the jQuery Masonry plugin to arrange from vertical to horizontal, and it has some nice animation. Since your example calls for a rectangular arrangement, I suggest a treemap algorithm in which you can also rotate the rectangles by 90°.

matching jigsaw puzzle pieces

I have nothing useful to do and was playing with jigsaw puzzle like this:
http://manual.gimp.org/nl/images/filters/examples/render-taj-jigsaw.jpg
and I was wondering if it'd be possible to make a program that assists me in putting it together.
Imagine that I have a small puzzle, like 4x3 pieces, but the little tabs and blanks are non-uniform: different pieces have these tabs at different heights, with different shapes and different sizes. What I'd do is take pictures of all of these pieces, let a program analyze them and store their attributes somewhere. Then, when I pick up a piece, I could ask the program to tell me which pieces should be its 'neighbours', or if I have to fill in a blank, it'd tell me what the wanted puzzle piece(s) look like.
Unfortunately I've never done anything with image processing and pattern recognition, so I'd like to ask you for some pointers: how do I recognize a jigsaw piece (basically a square with tabs and holes) in a picture?
Then I'd probably need to rotate it so it's in the right position, scale to some proportion and then measure tab/blank on each side, and also each side's slope, if present.
I know that it would be too time consuming to scan/photograph 1000 pieces of puzzle and use it, this would be just a pet project where I'd learn something new.
Data acquisition
(This is known as Chroma Key, Blue Screen or Background Color method)
Find a well-lit room, with the least lighting variation across the room.
Find a color (hue) that is rarely used in the entire puzzle / picture.
Get colored paper of exactly that color.
Place as many puzzle pieces on the color paper as it'll fit.
You can categorize the puzzles into batches and use it as a computer hint later on.
Make sure the pieces do not overlap or touch each other.
Do not worry about orientation yet.
Take picture and download to computer.
Color calibration may be needed because the Chroma Key background may have upset the built-in color balance of the digital camera.
Acquisition data processing
Get some computer vision software
OpenCV, MATLAB, C++, Java, Python Imaging Library, etc.
Perform connected-component labeling on the chroma key color in the image.
Ask for the contours of the holes of the connected component, which are the puzzle pieces.
Fix errors in the detected list.
Choose the indexing vocabulary (cf. Ira Baxter's post) and measure the pieces.
If the pieces are rectangular, find the corners first.
If the pieces are slightly-off quadrilaterals, the side lengths (measured corner to corner) are also a valuable signature.
Search for "Shape Context" on SO or Google or here.
Finally, get the color histogram of the piece, so that you can query pieces by color later.
To make them searchable, put them in a database, so that you can query pieces with any combinations of indexing vocabulary.
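The chroma key step above can be sketched with nothing but the standard library. Here a pixel counts as background when its hue is close to the key color and it is reasonably saturated. The thresholds, the names `is_background` and `piece_mask`, and the tiny image are assumptions for illustration; real photos would need tuning and the color calibration mentioned earlier.

```python
import colorsys

def is_background(rgb, key_hue, hue_tol=0.05, min_sat=0.4):
    """True if the pixel's hue is close to the key hue and it is saturated enough."""
    r, g, b = (v / 255.0 for v in rgb)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    diff = abs(h - key_hue)
    diff = min(diff, 1.0 - diff)        # hue is circular, so 0.98 is close to 0.02
    return s >= min_sat and diff <= hue_tol

def piece_mask(image, key_rgb):
    """Return a grid of booleans: True where a pixel belongs to a puzzle piece."""
    key_hue = colorsys.rgb_to_hsv(*(v / 255.0 for v in key_rgb))[0]
    return [[not is_background(px, key_hue) for px in row] for row in image]

GREEN = (0, 200, 0)                     # hypothetical color of the backing paper
image = [
    [GREEN, GREEN, (10, 190, 20)],      # slightly uneven lighting on the paper
    [GREEN, (180, 40, 40), (200, 200, 200)],
]
mask = piece_mask(image, GREEN)
```

The resulting mask is what the connected-component and contour steps would then operate on.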
A step back to the problem itself. The problem of building a puzzle can be easy (P) or hard (NP), depending on whether the pieces fit only one neighbour, or many. If there is only one fit for each edge, then you just find, for each piece/side, its neighbour and you're done (O(#pieces*#sides)). If some pieces allow multiple fits into different neighbours, then, in order to complete the whole puzzle, you may need backtracking (because you made a wrong choice and you get stuck).
However, the first problem to solve is how to represent pieces. If you want to represent arbitrary shapes, then you can probably use transparency or masks to represent which areas of a tile are actually part of the piece. If you use square shapes then the problem may be easier. In the latter case, you can consider the last row of pixels on each side of the square and match it with the most similar row of pixels that you find across all other pieces.
You can use the second approach to actually help you solve a real puzzle, despite the fact that you use square tiles. Real puzzles are normally built upon a NxM grid of pieces. When scanning the image from the box, you split it into the same NxM grid of square tiles, and get the system to solve that. The problem is then to visually map the actual squiggly piece that you hold in your hand with a tile inside the system (when they are small and uniformly coloured). But you get the same problem if you represent arbitrary shapes internally.
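For the square-tile case, matching the outermost row or column of pixels could be sketched as follows. Tiles are small greyscale grids, the mismatch is a plain sum of squared differences, and rotation is ignored; the function names and the toy tiles are made up for the example.

```python
OPPOSITE = {"top": "bottom", "bottom": "top", "left": "right", "right": "left"}

def edge(tile, side):
    """Return the outermost row or column of pixels on the given side of a tile."""
    if side == "top":
        return list(tile[0])
    if side == "bottom":
        return list(tile[-1])
    if side == "left":
        return [row[0] for row in tile]
    if side == "right":
        return [row[-1] for row in tile]
    raise ValueError(side)

def mismatch(a, b):
    """Sum of squared differences between two edges of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_neighbour(tiles, name, side):
    """Find the other tile whose facing edge is most similar to this tile's edge."""
    own = edge(tiles[name], side)
    scores = {other: mismatch(own, edge(tile, OPPOSITE[side]))
              for other, tile in tiles.items() if other != name}
    return min(scores, key=scores.get)

# Hypothetical picture: a smooth gradient, cut into a left and a right tile,
# plus one unrelated bright tile.
picture = [[10 * c + r for c in range(8)] for r in range(4)]
tiles = {
    "left": [row[:4] for row in picture],
    "right": [row[4:] for row in picture],
    "other": [[200] * 4 for _ in range(4)],
}
match = best_neighbour(tiles, "left", "right")
```

With uniformly coloured regions many edges score almost the same, which is exactly where the backtracking mentioned above becomes necessary.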

Resources