Algorithm to compare images - image

I'm thinking of writing an app that can compare images: i.e. check if one image exists inside another.
Let's imagine a picture with people, trees and another stuff. Let's imagine another picture of a tree. Is it possible to check if that second picture has some similarity with the trees in the first picture?
I've tried to read bytes and compare them but it didn't work.
What's the best approach to do something like this? What is the algorithm that i should use? And what is the best ( faster ) language to do this?
Thanks in advance.

Look at openCV, follow the link :
http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html#feature-homography

You can use a pHash algorithm from here:http://www.phash.org.

SURF and SIFT are two good algorithms, widely available in libraries for languages of your choice.
It is also instructive to imple.ent them yourself in your preferred language to familiarise yourself with this topic.

You need algorithm similar to fast algorithm of searching substring in a string, but developed to search in 2D space.
For example, here is good solution:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.581&rep=rep1&type=pdf
Its complexity is O(n2) in worst case, where n is dimension of big matrix.

Related

How to simplify a spline?

I have an interesting algorithmic challenge in a project I am working on. I have a sorted list of coordinate points pointing at buildings on either side of a street that, sufficiently zoomed in, looks like this:
I would like to take this zigzag and smooth it out to linearize the underlying street.
I can think of a couple of solutions:
Calculate centroids using rolling averages of six or so points, and use those.
Spline regression.
Is there a better or best way to approach this problem? (I am using Python 3.5)
Based on your description and your comments, you are looking for a line simplification algorithms.
Ramer-Doublas algorithm (suggested in the comment) is most probably the most well-known algorithm in this family, but there are many more.
For example Visvalingam’s algorithm works by removing the point with the smallest change, which is calculated by the smallest square of the triangle. This makes it super easy to code and intuitively understandable. If it is hard to read research paper, you can read this easy article.
Other algorithms in this family are:
Opheim
Lang
Zhao
Read about them, understand what are they trying to minify and select the most suitable for you.
Dali's post correctly surmises that a line simplification algorithm is useful for this task. Before posting this question I actually examined a few such algorithms but wasn't quite comfortable with them because even though they resulted in the simplified geometry that I liked, they didn't directly address the issue I had of points being on either side of the feature and never in the middle.
Thus I used a two-step process:
I computed the centroids of the polyline by using a rolling average of the coordinates of the five surrounding points. This didn't help much with smoothing the function but it did mostly succeed in remapping them to the middle of the street.
I applied Visvalingam’s algorithm to the new polyline, with n=20 points specified (using this wonderful implementation).
The result wasn't quite perfect but it was good enough:
Thanks for the help everyone!

Solving puzzle with tree data structure

I am currently following an algorithms class and we have to solve a sudoku.
I have already a working solution with naive backtracking but I'm much more interest in solving this puzzle problem with a tree data structure.
My problem is that I don't quite understand how it works. Is anyone can explain to me the basic of puzzle solving with tree?
I don't seek optimization. I looking for explanation on algorithms like the Genetic algorithm or something similar. My purpose only to learn at this point. I have hard time to take what I read in scientific articles and translate it on real implementation.
I Hope, I've made my question more clear.
Thank you very much!
EDIT: I edit the post to be more precise.
I don't know how to solve Sudoku "with a tree", but you're trying to mark as many cells as possible before trying to guess and using backtrace. Therefore check out this question.
Like in most search problems, also in Sudoku you may associate a tree with the problem. In this case nodes are partial assignments to the variables and branches correspond to extensions of the assignment in the parent node. However it is not practical (or even feasible) to store these trees in memory, you can think of them as a conceptual tool. Backtracking actually navigates such a tree using variable ordering to avoid exploring hopeless branches. A better (faster) solution can be obtained by constraint propagation techniques since you know that all rows, columns, and 3x3 cells must contain different values. A simple algorithm for that is AC3 and it is covered in most introductory AI books including Russell & Norvig 2010.

How to find the closest 2 points in a 100 dimensional space with 500,000 points?

I have a database with 500,000 points in a 100 dimensional space, and I want to find the closest 2 points. How do I do it?
Update: Space is Euclidean, Sorry. And thanks for all the answers. BTW this is not homework.
There's a chapter in Introduction to Algorithms devoted to finding two closest points in two-dimensional space in O(n*logn) time. You can check it out on google books. In fact, I suggest it for everyone as the way they apply divide-and-conquer technique to this problem is very simple, elegant and impressive.
Although it can't be extended directly to your problem (as constant 7 would be replaced with 2^101 - 1), it should be just fine for most datasets. So, if you have reasonably random input, it will give you O(n*logn*m) complexity where n is the number of points and m is the number of dimensions.
edit
That's all assuming you have Euclidian space. I.e., length of vector v is sqrt(v0^2 + v1^2 + v2^2 + ...). If you can choose metric, however, there could be other options to optimize the algorithm.
Use a kd tree. You're looking at a nearest neighbor problem and there are highly optimized data structures for handling this exact class of problems.
http://en.wikipedia.org/wiki/Kd-tree
P.S. Fun problem!
You could try the ANN library, but that only gives reliable results up to 20 dimensions.
Run PCA on your data to convert vectors from 100 dimensions to say 20 dimensions. Then create a K-Nearest Neighbor tree (KD-Tree) and get the closest 2 neighbors based on euclidean distance.
Generally if no. of dimensions are very large then you have to either do a brute force approach (parallel + distributed/map reduce) or a clustering based approach.
Use the data structure known as a KD-TREE. You'll need to allocate a lot of memory, but you may discover an optimization or two along the way based on your data.
http://en.wikipedia.org/wiki/Kd-tree.
My friend was working on his Phd Thesis years ago when he encountered a similar problem. His work was on the order of 1M points across 10 dimensions. We built a kd-tree library to solve it. We may be able to dig-up the code if you want to contact us offline.
Here's his published paper:
http://www.elec.qmul.ac.uk/people/josh/documents/ReissSelbieSandler-WIAMIS2003.pdf

Match-three puzzle games algorithm

I am trying to write a match-three puzzle game like 'call of Atlantis' myself. The most important algorithm is to find out all possible match-three possibilities. Is there any open source projects that can be referenced? Or any keywords to the algorithm? I am trying to look for a faster algorithm to calculate all possibilities. Thanks.
To match 3 objects using one swap, you need already 2 objects lined up in the right way. Identify these pairs first. Then there are just a few possibilities from where a third object can be swapped in. Try to encode these patterns.
For smaller boards the easy brute force algorithm (try out all possible swaps and check if three objects line up in the neighborhood after a swap) may be sufficient.
Sorry, I can't say much more without a more precise description.

3 dimensional bin packing algorithms

I'm faced with a 3 dimensional bin packing problem and am currently conducting some preliminary research as to which algorithms/heuristics are currently yielding the best results. Since the problem is NP hard I do not expect to find the optimal solution in every case, but I was wondering:
1) what are the best exact solvers? Branch and Bound? What problem instance sizes can I expect to solve with reasonable computing resources?
2) what are the best heuristic solvers?
3) What off-the-shelf solutions exist to conduct some experiments with?
As far as off the shelf solutions, check out MAXLOADPRO for loading trucks. It may be able to be configured to load any rectangular volume, but I haven't tried that yet. In general 3d bin-packing problems have the added complication that the objects can be rotated into different positions so for any object with a given length, width and height, you effectively have to create three variables representing each position, but you only use one in the solution.
In general, stand-alone MIP formulations (or branch and bound) don't work well for the 2d or 3d problem but constraint programming has met with some success producing exact solutions for the 2d problem. Check out this abstract. Without looking at the paper, I like the decomposition approach for the problem where you're trying to minimize the number of same-sized bins. I haven't seen as many results for the 3d problem, but let us know if you find any that are implementable.
Good luck !
I've written a program which tests three various algorithms. Also this is a good source of information: A Thousand Ways to Pack the Bin - A Practical Approach to Two-Dimensional Rectangle Bin Packing. It is for two-dimensional rectangle bin, but you can always transform it to 3D.
From wikipedia:
Although these simple strategies are often good enough, efficient approximation algorithms have been demonstrated that can solve the bin packing problem within any fixed percentage of the optimal solution for sufficiently large inputs
Here are the two sources they give for this:
Approximation Algorithms
Bin packing can be solved within 1 + ε in linear time
Best exact solver: Use dynamic programming.
State variables:
Items you have packed and discarded.
Space filled in the container.
If the container is a parallelepiped grid, and the items "fit" in exact cells of the grid, you can use a 3-dimensional array to represent state variable 2. Otherwise, you will have to use more complex data structures.
Best heuristic solvers
I don't know. Perhaps Variable Neighborhood Search. There are some similarities between your problem and the timetable construction problem (which I'm working on), so the same heuristic might be good for both.
Off-the-shelf solutions to conduct experiments
I'm sorry, I don't even have a clue.
You question is similar to:
3d bin packing algorithm
Although, because you dis-allow rotation, you can get pretty good results. I suggest looking more towards a FIRST-FIT-DECREASING solution.
3dbinpacking is a commercial solution (not an algorithm) exposing an API to consume with nice visualization. It offers:
Single bin packing
Multi bin packing
Find third dimension
Find a bin dimensions

Resources