recovering order from samples - algorithm

I have a pool of balls with 30 different color patterns (solid green, green-and-red striped, etc.), and I also have 6 boxes numbered 1 to 6. I randomly select 6 balls from the pool and put one ball in each box, so every box contains exactly one ball. Among those 6 balls, the color pattern of one ball may be the same as or different from the patterns of the balls in the other boxes. Now I want you to guess the color pattern of the ball in each box by doing the following:
Each time you make a request, I randomly select 3 of the balls and display them in front of you in the same order as their boxes. You can make unlimited requests.
The problem is how to determine the color pattern of the ball in each box with the fewest requests. I feel like there should be a well-known algorithm for this problem, but I cannot find one. Has anybody seen this before?

I think there is a lot of statistics in this. First of all, I would make the simplifying assumption that (if you don't know which colour balls are present) the only colours and patterns available are those which you have seen.
Now write down, or work out how to calculate, a formula that gives you the probability of the observed data given the listing of which balls are present in which boxes.
Now all you have to do is find the combination of balls in boxes that gives the highest probability to the observed data, and hope that as you get more and more data the right answer wins out.
You could think of this as a generic optimisation problem, and try hill-climbing from multiple random starts, or genetic programming, or whatever your favourite heuristic is.
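For concreteness, here is a minimal sketch of that likelihood-plus-hill-climbing idea in C++. It is not from the original thread: it assumes each request reveals the colors of 3 uniformly chosen boxes in box order, and all names (Observation, Assignment, logLikelihood, hillClimb) and the iteration scheme are made up for illustration.

#include <array>
#include <cmath>
#include <cstdlib>
#include <vector>

// One observation: the colors of 3 randomly chosen boxes, shown in box order.
using Observation = std::array<int, 3>;
// A candidate assignment: the color (0..29) sitting in each of the 6 boxes.
using Assignment = std::array<int, 6>;

// Log-likelihood of all observations given a candidate assignment.
// Each request draws one of the C(6,3) = 20 box triples uniformly, so
// P(observation | assignment) = (#triples whose colors match it) / 20.
double logLikelihood(const Assignment& a, const std::vector<Observation>& obs) {
    double ll = 0.0;
    for (const Observation& o : obs) {
        int matches = 0;
        for (int i = 0; i < 6; ++i)
            for (int j = i + 1; j < 6; ++j)
                for (int k = j + 1; k < 6; ++k)
                    if (a[i] == o[0] && a[j] == o[1] && a[k] == o[2])
                        ++matches;
        if (matches == 0) return -1e18;   // this assignment cannot produce the data
        ll += std::log(matches / 20.0);
    }
    return ll;
}

// Hill climbing: repeatedly change one box's color and keep the change
// whenever the log-likelihood does not get worse.
Assignment hillClimb(const std::vector<Observation>& obs, int numColors, int iterations) {
    Assignment best{};
    for (int& c : best) c = std::rand() % numColors;
    double bestLL = logLikelihood(best, obs);
    for (int t = 0; t < iterations; ++t) {
        Assignment candidate = best;
        candidate[std::rand() % 6] = std::rand() % numColors;
        double ll = logLikelihood(candidate, obs);
        if (ll >= bestLL) { best = candidate; bestLL = ll; }
    }
    return best;
}

A production version would use a proper random number generator and several random starts, as suggested above; the EM route in the next paragraph replaces the discrete search with soft assignments.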
Or you could do a bit more web-searching about statistics and recognise that this is a missing data problem, where the hidden data is the knowledge of which box each sampled ball came from. Statisticians often solve hidden data problems with the EM algorithm. There is an introduction for mathematicians at http://www.inf.ed.ac.uk/teaching/courses/pmr/docs/EM.pdf. Your problem can be thought of as a simple case of a hidden Markov model, with the hidden state being the box which produced a particular ball.

Related

Finding an algorithm that can find the shortest way to solve a one dimensional variant of the "lights out" game

I'm solving a task from an old programming competition. The task is to make a program that can determine whether a solution exists, and what the shortest solution is, for a version of the well-known game "lights out". In short, we have several connected lights. By clicking one of the lights you toggle its state and the states of the two adjacent lights. The goal is to activate all the lights.
In the classic version of "lights out" we are working with two dimensions, but in this version the lights are connected in a "one dimensional" string, where the "edges" are connected. Basically a circle of lights.
The number of lights can go up to 10000, so the brute-force method I tried was obviously not good enough. It only manages to solve the versions that have a solution and have fewer than ~10 lights. Here is an example of a solvable setup. The 1s mark lights that are activated, and the 0s mark lights that are deactivated. The first line gives the number of lights in the string. If a solution doesn't exist, the program will output that it isn't possible. Remember that the edges are connected.
5
10101
Click one of the "edges" (doesn't matter which one, I clicked the left one).
01100
Click the opposite edge
11111
If a solution doesn't exist the program outputs a message saying so. Otherwise, it outputs the length of the shortest solution, in this case: 2.
Could anyone help me find an algorithm?
Thanks for the help.
Suppose you knew whether in the solution (if one exists) you need to click on the first and second light.
Once we have this information, we can immediately deduce whether we need to click on the third light as this is the last choice that can affect the second light (clicking on the first light changes the last/first/second, clicking on the second light changes the first/second/third, clicking on the third light changes the second/third/fourth - but no other clicks can change the second light).
Similarly, we can then immediately deduce whether to click the fourth light, as this is the last choice that can affect the third light.
You can then work all the way round to the end to find out whether you have a consistent solution (with all lights out).
Simply try all 4 options for the first 2 switches, and among the consistent ones pick the one with the fewest clicks.
Complexity: O(4·n), i.e. linear in the number of lights.
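Here is a minimal sketch of that deduction in C++ (not from the original answer; the function name and the convention that '1' means "on" are mine). For each of the four guesses for the first two clicks it propagates the forced choices around the ring and then checks the two wrap-around lights.

#include <string>
#include <vector>

// Minimum number of clicks to turn all lights on in the circular 1-D variant,
// or -1 if no solution exists. lights[i] == '1' means light i starts on;
// clicking light i toggles lights i-1, i and i+1 (indices mod n). Assumes n >= 3.
int shortestSolution(const std::string& lights) {
    const int n = static_cast<int>(lights.size());
    int best = -1;
    for (int guess = 0; guess < 4; ++guess) {            // all 4 options for click[0], click[1]
        std::vector<int> click(n, 0);
        click[0] = guess & 1;
        click[1] = (guess >> 1) & 1;
        // click[i+1] is the last click that can still change light i, so once
        // click[i-1] and click[i] are fixed it is forced.
        for (int i = 1; i + 1 < n; ++i) {
            int lightI = (lights[i] - '0') ^ click[i - 1] ^ click[i];
            click[i + 1] = lightI ^ 1;                    // force light i to end up on
        }
        // Check the wrap-around constraints for the last and the first light.
        bool lastOn  = ((lights[n - 1] - '0') ^ click[n - 2] ^ click[n - 1] ^ click[0]) == 1;
        bool firstOn = ((lights[0] - '0') ^ click[n - 1] ^ click[0] ^ click[1]) == 1;
        if (lastOn && firstOn) {
            int clicks = 0;
            for (int c : click) clicks += c;
            if (best == -1 || clicks < best) best = clicks;
        }
    }
    return best;
}

Calling shortestSolution("10101") returns 2, matching the example above (click the left edge, then the opposite edge).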

Alien tiles heuristic function

I am trying to find a good A* heuristic function for the problem "alien tiles", found at www.alientiles.com for a uni project.
In alien tiles you have a board with NxN tiles, all colored red. By clicking on a tile, all tiles in the same row and column advance by a color, the color order being red->green->blue->purple, resetting to red after purple. The goal is to change all tiles to the specified colors. The simplest goal state is all the tiles going from red to green, blue or purple. The board doesn't have to be 7x7 as the site suggests.
I've thought of summing the color difference between each tile and its target tile and dividing by the number of tiles a single click affects (2N-1 on an NxN board, N+M-1 on an NxM board), or counting possible patterns of clicks to estimate the minimum number of clicks, but neither has been working well (a sketch of the first idea follows below). I can't think of a way to apply relaxation to the problem or divide it into sub-problems either, since a single click affects an entire row and column.
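For what it's worth, here is one concrete, hedged form of that first idea in C++ (the names and board representation are mine, not from alientiles.com): every tile needs a certain number of color advances, one click advances at most rows + cols - 1 tiles by one step, so the total demand divided by that number never overestimates the real click count.

#include <cmath>
#include <vector>

// Colors: 0 = red, 1 = green, 2 = blue, 3 = purple. A click advances every
// tile in the clicked row and column by one color (mod 4), the clicked tile once.
using Board = std::vector<std::vector<int>>;

// Admissible A* heuristic: sum over tiles of the advances still needed,
// divided by the most tiles a single click can advance (rows + cols - 1).
// A click lowers each affected tile's remaining demand by at most 1, so it
// lowers the total by at most rows + cols - 1; hence this never overestimates.
double heuristic(const Board& current, const Board& target) {
    const int rows = static_cast<int>(current.size());
    const int cols = static_cast<int>(current[0].size());
    int totalAdvances = 0;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            totalAdvances += ((target[r][c] - current[r][c]) % 4 + 4) % 4;
    return std::ceil(totalAdvances / double(rows + cols - 1));
}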
Of course I'm not asking for anyone to find a solution for me, but some tips or some relevant, simpler problems that I can look at (Rubik's Cube is one example I'm already looking at).
Thanks in advance.
The problem you are trying to solve is similar to the NIM and FOCUS games. Please have a look at them. Solution approaches can be found in Stuart J. Russell's book (Artificial Intelligence: A Modern Approach) under the heuristics section. Hope this helps.
Although it is a relatively 'dumb' way of thinking about the problem, one heuristic mechanism I have found that drastically cuts down the number of states A* expands tries to relate the cell that was clicked most recently to the number of states that clicking on it again would expand. It's like telling A*: "If you clicked on a cell in your last move, try clicking on another one this time." Obviously there are special scenarios
(e.g. the whole board is already your target color, say green, except for a purple cross, where clicking the center of the cross twice turns the cross green and you are done)
in which this way of thinking is actually detrimental. But it is a place to start.
Please let me know if you figure anything out, as it is something I am working on as well.

distinguishing objects with opencv

I want to identify lego bricks for building a lego sorting machine (I use c++ with opencv).
That means I have to distinguish between objects which look very similar.
The bricks come to my camera individually on a flat conveyor. But they might lie in any possible way: upside down, on their side or "normal".
My approach is to teach the sorting machine the bricks by filming them with the camera in lots of different positions and rotations. Features of each and every view are calculated by the SURF algorithm.
void calculateFeatures(const cv::Mat& image,
                       std::vector<cv::KeyPoint>& keypoints,
                       cv::Mat& descriptors)
{
    // detector == cv::SurfFeatureDetector(10)
    detector->detect(image, keypoints);
    // extractor == cv::SurfDescriptorExtractor()
    extractor->compute(image, keypoints, descriptors);
}
When there is an unknown brick (the brick that I want to sort), its features are also calculated and matched with the known ones.
To find wrongly matched features I proceed as described in the book OpenCV 2 Cookbook:
With the matcher (= cv::BFMatcher(cv::NORM_L2)), the two nearest neighbours are searched in both directions:
matcher.knnMatch(descriptorsImage1, descriptorsImage2,
                 matches1,  // best two matches for each feature of image 1
                 2);
matcher.knnMatch(descriptorsImage2, descriptorsImage1,
                 matches2,  // best two matches for each feature of image 2
                 2);
I check the ratio between the distances of the two found nearest neighbours. If the two distances are very similar, the match is ambiguous and likely wrong, so it is thrown away.
// ratio test, applied to matches1 and matches2 in the same way
for (auto matchIterator = matches1.begin(); matchIterator != matches1.end(); ) {
    if (matchIterator->size() < 2 ||
        (*matchIterator)[0].distance / (*matchIterator)[1].distance > 0.65f)
        matchIterator = matches1.erase(matchIterator);  // ambiguous: throw away
    else
        ++matchIterator;
}
Finally only symmetrical match pairs are accepted. These are matches in which not only is n1 the nearest neighbour to feature f1, but also f1 is the nearest neighbour to n1.
// symmetry test: accept a pair only if it is the best match in both directions
std::vector<cv::DMatch> symMatches;
for (const auto& match1 : matches1)
    for (const auto& match2 : matches2)
        if (!match1.empty() && !match2.empty() &&
            match1[0].queryIdx == match2[0].trainIdx &&
            match2[0].queryIdx == match1[0].trainIdx)
            symMatches.push_back(match1[0]);  // good match
Now only pretty good matches remain. To filter out some more bad matches I check which matches fit the projection of img1 on img2 using the fundamental matrix.
std::vector<uchar> inliers(points1.size(), 0);
cv::Mat fundamental = cv::findFundamentalMat(
    cv::Mat(points1), cv::Mat(points2),  // matching points
    inliers,                             // inlier flags filled by RANSAC
    CV_FM_RANSAC,                        // robust estimation method
    3,                                   // max distance to epipolar line (pixels)
    0.99);                               // confidence level
std::vector<cv::DMatch> goodMatches;
// extract the surviving (inlier) matches
std::vector<uchar>::const_iterator itIn = inliers.begin();
std::vector<cv::DMatch>::const_iterator itM = allMatches.begin();
// for all matches
for ( ; itIn != inliers.end(); ++itIn, ++itM)
    if (*itIn)                // it is a valid match
        goodMatches.push_back(*itM);
The result is pretty good. But in cases where the bricks are extremely alike, faults still occur.
In the picture above you can see that a similar brick is recognized well.
However in the second picture a wrong brick is recognized just as well.
Now the question is how I could improve the matching.
I had two different ideas:
The matches in the second picture trace back to features that really do correspond, but only when the visual field changes strongly. To recognize a brick I have to compare it against many different positions anyway (at least as shown in figure three), so I know the visual field should only change minimally between matched views. The information about how strongly the visual field has changed should be hidden in the fundamental matrix. How can I read out of this matrix how much the camera pose has changed? The rotation and strong scaling are of particular interest; if the brick simply lies farther to the left in one shot, that shouldn't matter.
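One possible route (not from the original thread; it assumes a calibrated camera and OpenCV 3 or newer, which adds cv::findEssentialMat and cv::recoverPose, and the camera matrix K is a placeholder you would obtain from calibration): turn the matched points into an essential matrix and decompose it, which yields the relative rotation directly.

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Hypothetical helper: estimates how much the camera pose changed between two
// matched views, given the camera intrinsics K (3x3). Requires OpenCV >= 3.0.
double relativeRotationDegrees(const std::vector<cv::Point2f>& points1,
                               const std::vector<cv::Point2f>& points2,
                               const cv::Mat& K)
{
    cv::Mat mask;
    cv::Mat E = cv::findEssentialMat(points1, points2, K, cv::RANSAC, 0.999, 1.0, mask);
    cv::Mat R, t;
    cv::recoverPose(E, points1, points2, K, R, t, mask);  // rotation R, translation t (up to scale)
    cv::Mat rvec;
    cv::Rodrigues(R, rvec);                                // rotation as an axis-angle vector
    return cv::norm(rvec) * 180.0 / CV_PI;                 // rotation angle in degrees
}

The translation from recoverPose is only known up to scale, so a brick that is merely shifted sideways changes the direction of t but leaves R close to the identity, which matches the requirement that a sideways shift shouldn't matter.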
Second idea:
I calculated the fundamental matrix from 2 pictures and filtered out features that don't fit the projections; shouldn't there be a way to do the same using three or more pictures (keyword: trifocal tensor)? This way the matching should become more stable. But I neither know how to do this using OpenCV nor could I find any information on it on Google.
I don't have a complete answer, but I have a few suggestions.
On the image analysis side:
It looks like your camera setup is pretty constant, so it should be easy to separate the brick from the background. I also see your system finding features in the background, which is unnecessary. Set all non-brick pixels to black to remove them from the analysis.
When you have located just the brick, your first step should be to filter likely candidates based on the size (i.e. number of pixels) of the brick. That way the faulty match you show is already less likely.
You can take other features into account, such as the aspect ratio of the bounding box of the brick, and the major and minor axes (eigenvectors of the covariance matrix of the central moments) of the brick.
These simpler features will give you a reasonable first filter to limit your search space.
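As a hedged illustration of that pre-filter (the struct, the function name and the OpenCV 2 style constants are mine; it assumes the brick has already been separated from the background as a binary mask):

#include <algorithm>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Cheap shape features of the segmented brick blob, usable to rule out
// candidates before any SURF matching is attempted.
struct BrickFeatures {
    double area;         // number of pixels in the blob
    double aspectRatio;  // long side / short side of the rotated bounding box
};

BrickFeatures simpleFeatures(const cv::Mat& mask)  // 8-bit mask: brick pixels != 0
{
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // Take the largest contour as the brick.
    double bestArea = 0.0;
    cv::RotatedRect bestBox;
    for (const std::vector<cv::Point>& contour : contours) {
        double area = cv::contourArea(contour);
        if (area > bestArea) { bestArea = area; bestBox = cv::minAreaRect(contour); }
    }
    double longSide  = std::max(bestBox.size.width, bestBox.size.height);
    double shortSide = std::min(bestBox.size.width, bestBox.size.height);
    BrickFeatures features;
    features.area = bestArea;
    features.aspectRatio = shortSide > 0 ? longSide / shortSide : 0.0;
    return features;
}

Comparing these two numbers against the stored views (say, within 10 to 20 percent) limits the search space before the expensive descriptor matching.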
On the mechanical side:
If bricks are actually coming down a conveyor, you should be able to "straighten" them using something like a rod lying across the belt at an angle to the direction of travel, so that the bricks arrive at your camera in a more uniform orientation.
Similar to the previous point, you could use something like a very loose brush suspended across the belt to topple bricks standing up as they pass.
Again both these points will limit your search space.

Mahout algorithm advice

What I need is actually just a hint where I can start.
I'm somewhat familiar with Mahout, at least theoretically. I know how it works, how to set it up, etc., and I could build a simple recommendation system based on collaborative filtering.
However, now I'm trying to do something more complex, and even after reading quite a bit about different algorithms, I'm not sure which direction to go.
Quickly, what I want to do is:
The final goal is to define one scalar (a "score") for each one of a set of entities, based on some "known" entities. The entities interact with each other, and known scores influence and define the unknown ones. You can picture it with the following example.
I have a lot of white clothes and a few pieces of colorful ones: red, blue, green... I put them into the washing machine. I want to know what colors the white ones will have after the wash.
Things to take into account:
we make a series of washes with different "actors": some clothes are washed in the 1st and 3rd wash, some of them only in the 2nd, some of them in all of them
in consecutive washes, the clothes that were white before but are now colored also influence the rest, but not as strongly (as they are not as colored)
some colors don't "color" as much as others; for example red has a strong effect on most of the clothes, but green not so much
the coloring effect also depends on how many clothes are in one wash. If you wash a red shirt with one white t-shirt, the t-shirt gets much more colored than if there were 100 other white t-shirts
clothes don't "lose" their color when influencing others
You can see that while calculating, entities actually have 2 assigned scalars:
the color hue (this also defines the "coloring power" mentioned above). The hue can be represented as a number from 0 to 1, let's say. The relationship between the coloring power and the hue value is not linear: the ends of the scale (0 and 1) have more coloring power, while the middle (0.5) has less
the color "lightness" (how much an entity is colored, for originally colored clothes it's 1, for white ones it's 0), which in the same time also defines coloring power regardless of the hue
So, again, what I know:
which clothes were washed in which consecutive wash
I know the original color of some of them, the rest is white in the beginning
What I want to know:
- the hue of all clothes at the end of the washes
The problem is that I don't know what type of algorithm I should start with. If you were kind enough to read this far, please suggest something (or further reading).
Obviously I don't ask for any detailed thing, again, only hints.
Thank you!
The only thing I can think of that sounds like this problem is PageRank. It's computed by a sort of iterative simulation. Each page has some influence (color) which flows via its links (the socks it's washed with), and at some point the page influence reaches a steady state (the final color). You can look up PageRank algorithms, but it is essentially a matter of calculating eigenvectors of a big, erm, sock-color matrix.
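To make the iterative-simulation idea concrete, here is a small hedged toy in C++ (nothing here is Mahout code; the Garment/Wash names, the 0.1 transfer factor and the mixing rule are placeholders chosen only to illustrate the fixed-point iteration):

#include <algorithm>
#include <vector>

// One garment: a hue in [0,1] and a color strength in [0,1]
// (0 = pure white, 1 = fully colored; originally colored clothes start at 1).
struct Garment { double hue; double strength; };

// One wash = the indices of the garments that were in the machine together.
using Wash = std::vector<int>;

// Toy iteration in the spirit of PageRank: in every wash, each garment absorbs
// a little color from the strength-weighted average of its co-washed garments;
// strongly colored sources are barely affected, so they don't "lose" their color.
void simulate(std::vector<Garment>& garments, const std::vector<Wash>& washes,
              int iterations)
{
    for (int it = 0; it < iterations; ++it) {
        for (const Wash& w : washes) {
            double hueSum = 0.0, strengthSum = 0.0;
            for (int idx : w) {
                hueSum      += garments[idx].hue * garments[idx].strength;
                strengthSum += garments[idx].strength;
            }
            if (strengthSum == 0.0) continue;          // only white clothes in this wash
            double waterHue = hueSum / strengthSum;    // color of "the water"
            double dilution = strengthSum / w.size();  // fewer clothes -> stronger water
            for (int idx : w) {
                Garment& g = garments[idx];
                double take = 0.1 * dilution * (1.0 - g.strength);  // 0.1 is arbitrary
                g.hue      = (1.0 - take) * g.hue + take * waterHue;
                g.strength = std::min(1.0, g.strength + take);
            }
        }
    }
}

Whether you iterate to a steady state (as in PageRank) or just replay the washes once in chronological order depends on how you model the problem; the point is only that known scores propagate to unknown ones through repeated local mixing.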

matching jigsaw puzzle pieces

I have nothing useful to do and was playing with a jigsaw puzzle like this:
(image: http://manual.gimp.org/nl/images/filters/examples/render-taj-jigsaw.jpg)
and I was wondering if it'd be possible to make a program that assists me in putting it together.
Imagine that I have a small puzzle, like 4x3 pieces, but the little tabs and blanks are non-uniform: different pieces have these tabs at different heights, of different shapes, of different sizes. What I'd do is take pictures of all of these pieces, let a program analyze them and store their attributes somewhere. Then, when I pick up a piece, I could ask the program to tell me which pieces should be its 'neighbours', or, if I have to fill in a blank, it'd tell me what the wanted puzzle piece(s) look like.
Unfortunately I've never done anything with image processing and pattern recognition, so I'd like to ask you for some pointers: how do I recognize a jigsaw piece (basically a square with tabs and holes) in a picture?
Then I'd probably need to rotate it so it's in the right position, scale it to some proportion and then measure the tab/blank on each side, and also each side's slope, if present.
I know that it would be too time consuming to scan/photograph 1000 pieces of puzzle and use it, this would be just a pet project where I'd learn something new.
Data acquisition
(This is known as Chroma Key, Blue Screen or Background Color method)
Find a well-lit room, with the least lighting variation across the room.
Find a color (hue) that is rarely used in the entire puzzle / picture.
Get colored paper of exactly that color.
Place as many puzzle pieces on the colored paper as will fit.
You can categorize the pieces into batches and use that as a hint for the computer later on.
Make sure the pieces do not overlap or touch each other.
Do not worry about orientation yet.
Take picture and download to computer.
Color calibration may be needed because the Chroma Key background may have upset the built-in color balance of the digital camera.
Acquisition data processing
Get some computer vision software
OpenCV, MATLAB, C++, Java, Python Imaging Library, etc.
Perform connected-component analysis on the chroma-key color in the image.
Ask for the contours of the holes of the connected component; these are the puzzle pieces (see the sketch after this list).
Fix errors in the detected list.
Choose the indexing vocabulary (cf. Ira Baxter's post) and measure the pieces.
If the pieces are rectangular, find the corners first.
If the pieces are slightly-off quadrilaterals, the side lengths (measured corner to corner) are also a valuable signature.
Search for "Shape Context" on SO or Google or here.
Finally, get the color histogram of the piece, so that you can query pieces by color later.
To make them searchable, put them in a database, so that you can query pieces with any combinations of indexing vocabulary.
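A hedged sketch of that acquisition pipeline with OpenCV (the function name is mine, and the HSV range is a placeholder for whatever background color you actually pick; OpenCV 2 style constants as elsewhere on this page):

#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Segment puzzle pieces lying on a chroma-key background: returns one contour
// per piece plus a coarse hue histogram for later color-based queries.
void extractPieces(const cv::Mat& bgrImage,
                   std::vector<std::vector<cv::Point>>& pieceContours,
                   std::vector<cv::Mat>& pieceHueHistograms)
{
    cv::Mat hsv;
    cv::cvtColor(bgrImage, hsv, CV_BGR2HSV);

    // Placeholder range for the background color (here: a green-ish key).
    cv::Mat background;
    cv::inRange(hsv, cv::Scalar(45, 80, 80), cv::Scalar(75, 255, 255), background);

    // Everything that is NOT background is a piece; connected regions -> contours.
    cv::Mat pieces = 255 - background;
    cv::findContours(pieces.clone(), pieceContours,
                     CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // 30-bin hue histogram of each piece, usable as a color index in a database.
    for (size_t i = 0; i < pieceContours.size(); ++i) {
        cv::Mat mask = cv::Mat::zeros(bgrImage.size(), CV_8U);
        cv::drawContours(mask, pieceContours, static_cast<int>(i),
                         cv::Scalar(255), -1 /* filled */);
        int histSize = 30, channel = 0;
        float hueRange[] = {0, 180};
        const float* ranges = hueRange;
        cv::Mat hist;
        cv::calcHist(&hsv, 1, &channel, mask, hist, 1, &histSize, &ranges);
        pieceHueHistograms.push_back(hist);
    }
}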
A step back to the problem itself. The problem of building a puzzle can be easy (P) or hard (NP), depending on whether each piece fits only one neighbour or many. If there is only one fit for each edge, then you just find, for each piece/side, its neighbour and you're done (O(#pieces * #sides)). If some pieces allow multiple fits with different neighbours, then, in order to complete the whole puzzle, you may need backtracking (because you made a wrong choice and got stuck).
However, the first problem to solve is how to represent pieces. If you want to represent arbitrary shapes, then you can probably use transparency or masks to represent which areas of a tile are actually part of the piece. If you use square shapes then the problem may be easier. In the latter case, you can consider the last row of pixels on each side of the square and match it with the most similar row of pixels that you find across all other pieces.
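For the square-tile case, here is a hedged sketch of that edge comparison (the function names and the squared-difference score are my choices, not the answerer's):

#include <opencv2/core/core.hpp>

// Outermost strip of pixels of a square tile: 0 = top row, 1 = right column,
// 2 = bottom row, 3 = left column. Returned as a 1 x N row of pixels.
cv::Mat borderStrip(const cv::Mat& tile, int side)
{
    switch (side) {
        case 0:  return tile.row(0).clone();
        case 1:  return tile.col(tile.cols - 1).t();
        case 2:  return tile.row(tile.rows - 1).clone();
        default: return tile.col(0).t();
    }
}

// Squared difference between the right edge of tileA and the left edge of tileB:
// the smaller the score, the better A fits to the left of B (equal-sized tiles).
double horizontalFitScore(const cv::Mat& tileA, const cv::Mat& tileB)
{
    double d = cv::norm(borderStrip(tileA, 1), borderStrip(tileB, 3), cv::NORM_L2);
    return d * d;
}

Running this score for every ordered pair of tiles (plus an analogous vertical score) gives the "most similar row of pixels" ranking described above.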
You can use the second approach to actually help you solve a real puzzle, despite the fact that you use square tiles. Real puzzles are normally built upon a NxM grid of pieces. When scanning the image from the box, you split it into the same NxM grid of square tiles, and get the system to solve that. The problem is then to visually map the actual squiggly piece that you hold in your hand with a tile inside the system (when they are small and uniformly coloured). But you get the same problem if you represent arbitrary shapes internally.
