I am trying to write a Minesweeper solver. As you know, there are two ways to determine which fields in the minefield are safe to open and which fields are mined and need to be flagged. The first way is trivial, and we have something like this:
if (number of mines around X - current number of discovered mines around X) == number of unopened fields around X then
    all unopened fields around X are mined
if number of mines around X == current number of discovered mines around X then
    all unopened fields around X are NOT mined
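In code form, these two rules might look roughly like this (the board helpers are just placeholders for whatever representation is used; "unopened" here means unopened and not flagged):

def trivial_rules(x, board):
    unopened = board.unopened_neighbours(x)   # unopened, unflagged fields around X
    flagged = board.flagged_neighbours(x)     # mines already discovered around X
    number = board.number(x)                  # the digit shown on field X
    if number - len(flagged) == len(unopened):
        return ("mined", unopened)            # every remaining neighbour must be a mine
    if number == len(flagged):
        return ("safe", unopened)             # every remaining neighbour is safe to open
    return ("unknown", [])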
But my question is: what about the situation where we can't find any mined or safe field this way and need to look at more than one field at a time?
http://img541.imageshack.us/img541/4339/10299095.png
For example, in this situation we can't determine anything using the previous method, so I need help with an algorithm for these cases.
I have to use the A* algorithm for this. That is why I need all possible safe states for the next step of the algorithm. When I find all possible safe states, I will add them to the current shortest path, sort the list of paths according to the heuristic function, and choose the next field to open.
Awesome problem! Before you get too excited, though, please read NP Completeness and Minesweeper, as well as the accompanying presentation, which develops some good worst-case examples and shows how a human might solve them. Nevertheless, in expectation we most likely won't hit a time barrier if we use basic pruning and heuristics.
The question of generating the game is asked here: Minesweeper solving algorithm. There is a very cool post there on algebraic methods. You can also give backtracking a try (i.e. take a guess and see if it invalidates things), similar to the case where local information is not enough in something like Sudoku. See this great discussion of the technique.
As @tigger said, this is not a problem that can be solved with a simple set of rules. Minesweeper is a good example where backtracking algorithms such as DPLL are useful. With something as simple as propositional logic, you can implement a very efficient solver for Minesweeper. I am not sure if you are familiar with AI reasoning and logic inference; if not, you might want to have a look at the book "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig. For a quick reference on DPLL and propositional logic, search for "wumpus world propositional logic" on Google.
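To make the multi-field case concrete, here is a minimal model-enumeration sketch in Python (a brute-force cousin of the propositional approach): treat each unopened frontier cell as a Boolean variable, enumerate the assignments that satisfy every revealed number, and read off the cells that are mined (or safe) in every consistent assignment. The board encoding below is a made-up assumption.

from itertools import product

def neighbours(cell, rows, cols):
    r, c = cell
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols]

def deduce(rows, cols, numbers, flagged, frontier):
    # numbers: {cell: revealed digit}; flagged: set of flagged cells;
    # frontier: every unopened, unflagged cell adjacent to a revealed number.
    consistent = []
    for bits in product((0, 1), repeat=len(frontier)):   # exponential, so keep the frontier small
        assignment = dict(zip(frontier, bits))
        if all(sum(1 for nb in neighbours(cell, rows, cols)
                   if nb in flagged or assignment.get(nb, 0)) == n
               for cell, n in numbers.items()):
            consistent.append(assignment)
    if not consistent:                                   # contradictory board state
        return [], []
    mined = [c for c in frontier if all(a[c] for a in consistent)]
    safe = [c for c in frontier if not any(a[c] for a in consistent)]
    return mined, safe

A cell that is mined (or safe) in every consistent assignment can be flagged (or opened) with certainty; everything else genuinely requires a guess or a probability estimate.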
Related
I'm trying to solve the Sokoban puzzle in Prolog using a depth-first search algorithm, but I cannot manage to search the solution tree in depth; I'm able to explore only the first level.
All the sources are on GitHub (links to the revision when the question was asked), so feel free to explore and test them. I divided the rules into several files:
board.pl: contains rules related to the board: directions, neighbourhoods,...
game.pl: this file states the rules about movements, valid positions,...
level1.pl: defines the board, position of the boxes and solution squares for a sample game.
sokoban.pl: tries to implement dfs :(
I know I need to go deeper when a new state is created, instead of just checking whether it is the final state and backtracking. I need to continue moving; it is impossible to reach the final state with only one movement.
Any help/advice will be highly appreciated, I've been playing around without improvements.
Thanks!
PS. Ah! I'm working with SWI-Prolog, just in case it makes a difference.
PS. I'm a real newbie to Prolog, and maybe I'm making an obvious mistake, but that is why I'm asking here.
This is easy to fix: In sokoban.pl, predicate solve_problem/2, you are limiting the solution to lists of a single element in the goal:
solve_dfs(Problem, Initial, [Initial], [Solution])
Instead, you probably mean:
solve_dfs(Problem, Initial, [Initial], Solution)
because a solution can consist of many moves.
In fact, an even better search strategy is often iterative deepening, which you get with:
length(Solution, _),
solve_dfs(Problem, Initial, [Initial], Solution)
Iterative deepening is a complete search strategy and an optimal strategy under quite general assumptions.
Other than that, I recommend you cut down the significant number of impure I/O calls in your program. There are just too many predicates where you write something on the screen.
Instead, focus on a clear declarative description, and cleanly separate the output from a description of what a solution looks like. In fact, let the toplevel do the printing for you: Describe what a solution looks like (you are already doing this), and let the toplevel display the solution as variable bindings. Also, think declaratively, and use better names like dfs_moves/4, problem_solution/2 instead of solve_dfs/4, solve_problem/2 etc.
DCGs may also help you in some places of your code to more conveniently describe lists.
+1 for tackling a nice and challenging search problem with Prolog!
I've implemented the algorithms marked as the correct answer in this question: What to use for flow free-like game random level creation?
However, using that method will create boards that may have multiple solutions. I was wondering if there are any simple restrictions or modifications that can be made to the algorithm to make sure that there is only one possible solution?
Creating unique Numberlink/Flow Free puzzles is very difficult. If you look at my algorithm proposal in the mentioned thread, you'll find an algorithm that lets you create puzzles satisfying the necessary condition that solutions must not contain a 2x2 square of the same color. The discussion at http://forum.ukpuzzles.org/viewtopic.php?f=3&t=41, however, shows that this is insufficient, since there are also many non-trivial non-unique puzzles.
From my looking into this problem, it seems the only way to solve it is to have a separate algorithm for testing uniqueness and to discard the bad instances. One solver made precisely for uniqueness testing is Imo's solver.
Another option is to use multiple different solvers and check that they come up with the same solution.
I think you should implement a solver that finds all the solutions for a given level. The simplest way is backtracking.
When you have many levels, take them one by one and look for solutions. As soon as you find a second solution for some level, throw that level away.
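A minimal sketch of that idea in Python, assuming a Flow Free style board where each colour is given as a pair of endpoints and every cell must be covered (the no-2x2 condition discussed above is not enforced here); it stops as soon as a second solution turns up:

def count_solutions(rows, cols, endpoints, limit=2):
    # endpoints: {colour: (start_cell, end_cell)}, cells are (row, col) tuples
    colours = list(endpoints)
    grid = {}                                  # cell -> colour, for cells already used by a path
    for colour, (a, b) in endpoints.items():
        grid[a] = colour
        grid[b] = colour
    found = 0

    def neighbours(cell):
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield (r + dr, c + dc)

    def extend(i, cur):
        nonlocal found
        if found >= limit:                     # we only care whether there is more than one
            return
        colour, goal = colours[i], endpoints[colours[i]][1]
        if cur == goal:                        # this colour is connected, move to the next one
            if i + 1 == len(colours):
                if len(grid) == rows * cols:   # Flow Free convention: every cell covered
                    found += 1
            else:
                extend(i + 1, endpoints[colours[i + 1]][0])
            return
        for nxt in neighbours(cur):
            if nxt == goal:
                extend(i, nxt)
            elif nxt not in grid:
                grid[nxt] = colour
                extend(i, nxt)
                del grid[nxt]

    extend(0, endpoints[colours[0]][0])
    return found

A level is kept only if count_solutions(...) returns exactly 1.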
Some friends at college and I were assigned a practical task of developing a net application for optimizing the cutting of rectangular parts from some kind of material. Something like the apps in this list, but more simplistic. Basically, I'm interested in whether any source code for this kind of optimization algorithm is available on the internet. I'm planning to develop the app using the Adobe Flex framework; the programming part will be done in ActionScript 3, of course. However, I doubt that there are any optimization samples for this language. There may be some for Java, C++, C#, Ruby, Python, or other more popular languages, though (then I'd just have to rewrite it in AS). So, if anyone knows any free libs or algorithm code samples that would suit me, I'd like to hear your suggestions. :)
This sounds just like the stock cutting problem, which is extremely hard! The best solutions use linear programming (typically based on the simplex method) with column generation (which, even after years on a constraint-solving research project, I feel unequipped to explain half decently). In short, you won't want to try this approach in ActionScript; consequently, with whatever you do implement, you shouldn't expect great results on anything other than small problems.
The best advice I can offer, then, is to see if you can cut the source rectangle into strips (each of the width of the largest rectangles you need), then subdivide the remainder of each strip after the "head" rectangle has been removed.
I'd recommend using branch-and-bound as your optimisation strategy. BnB works by doing an exhaustive tree search that keeps track of the best solution seen so far. When you find a solution, update the bound and backtrack, looking for the next solution. Whenever the search takes you down a branch that you know cannot lead to a better solution than the best you have found, you can backtrack early at that point.
Since these search trees will be very large, you will probably want to place a time limit on the search and just return your best effort.
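To illustrate the pattern on a deliberately simplified one-dimensional cousin of the problem (choose which panel lengths to cut from a single strip of length L, maximising the length used), here is a rough sketch; the data is made up, and the prune-on-bound step is the part that matters:

def best_cut(lengths, L):
    lengths = sorted(lengths, reverse=True)
    best = 0

    def search(i, used):
        nonlocal best
        best = max(best, used)                     # keep track of the best solution seen so far
        if i == len(lengths):
            return
        if used + sum(lengths[i:]) <= best:        # bound: even taking everything left can't win
            return
        if used + lengths[i] <= L:
            search(i + 1, used + lengths[i])       # branch 1: take panel i
        search(i + 1, used)                        # branch 2: skip panel i

    search(0, 0)
    return best

print(best_cut([7, 5, 4, 3], 10))                  # -> 10 (cut the 7 and the 3)

The same structure carries over to the 2D case: the "branch" becomes a placement decision, and the "bound" an optimistic estimate of how much area the remaining pieces could still cover.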
Hope this helps.
I had trouble finding examples when I wanted to do the same for the woodworking company I work for. The problem itself is NP-hard, so you need to use an approximation algorithm such as first-fit or best-fit.
Do a search for 2D bin-packing algorithms. In the one I found, you sort the panels biggest to smallest, then add them to the sheets in order, putting each one into the first bin where it fits. Sorry, I don't have the code with me, and it's in VB.NET anyway.
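The idea is roughly this (a shelf-based first-fit sketch in Python rather than VB.NET; the sheet size and panel list are made up, and rotation is not attempted):

def shelf_first_fit(panels, sheet_w, sheet_h):
    # panels: list of (width, height); returns one list of placements (x, y, w, h) per sheet
    order = sorted(panels, key=lambda p: (p[1], p[0]), reverse=True)    # biggest to smallest
    sheets = []                                   # each sheet: {"top": used height, "shelves": [...]}
    for w, h in order:
        if w > sheet_w or h > sheet_h:
            raise ValueError("panel %sx%s does not fit on any sheet" % (w, h))
        placed = False
        for sheet in sheets:
            for shelf in sheet["shelves"]:                          # first shelf it fits on
                if h <= shelf["height"] and shelf["used"] + w <= sheet_w:
                    shelf["placements"].append((shelf["used"], shelf["y"], w, h))
                    shelf["used"] += w
                    placed = True
                    break
            if not placed and sheet["top"] + h <= sheet_h:          # open a new shelf on this sheet
                sheet["shelves"].append({"y": sheet["top"], "height": h, "used": w,
                                         "placements": [(0, sheet["top"], w, h)]})
                sheet["top"] += h
                placed = True
            if placed:
                break
        if not placed:                                              # start a new sheet
            sheets.append({"top": h, "shelves": [{"y": 0, "height": h, "used": w,
                                                  "placements": [(0, 0, w, h)]}]})
    return [[p for shelf in sheet["shelves"] for p in shelf["placements"]] for sheet in sheets]

print(shelf_first_fit([(60, 40), (60, 40), (50, 30), (30, 30)], 100, 80))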
I'm writing a scheduling program with a difficult programming problem. There are several events, each with multiple meeting times. I need to find an arrangement of meeting times such that each schedule contains any given event exactly once, using one of each event's multiple meeting times.
Obviously I could use brute force, but that's rarely the best solution. I'm guessing this is a relatively basic computer science problem, which I'll learn about once I am able to start taking computer science classes. In the meantime, I'd prefer any links where I could read up on this, or even just a name I could Google.
I think you should use a genetic algorithm (a small sketch follows the links below) because:
It is best suited for large problem instances.
It trades accuracy for reduced running time (the answer may not be the absolute best).
You can specify constraints and preferences easily by adjusting the fitness penalties for those that are not met.
You can specify a time limit for program execution.
The quality of the solution depends on how much time you intend to spend solving the problem.
Genetic Algorithms Definition
Genetic Algorithms Tutorial
Class scheduling project with GA
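Here is the kind of sketch I mean (the events, meeting times and GA parameters are made up; the fitness simply counts clashes, and you would extend it with penalties for whatever constraints and preferences you have):

import random

events = {"algebra": ["mon9", "tue9"], "physics": ["mon9", "wed10"], "chem": ["tue9", "wed10"]}
names = list(events)

def random_chromosome():
    # a chromosome picks one meeting-time index per event
    return [random.randrange(len(events[e])) for e in names]

def conflicts(chrom):
    chosen = [events[e][i] for e, i in zip(names, chrom)]
    return sum(chosen[a] == chosen[b]
               for a in range(len(chosen)) for b in range(a + 1, len(chosen)))

def evolve(pop_size=50, generations=200, mutation=0.1):
    pop = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=conflicts)
        if conflicts(pop[0]) == 0:                       # perfect schedule found
            break
        survivors = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(names))
            child = a[:cut] + b[cut:]                    # one-point crossover
            if random.random() < mutation:               # mutate one gene
                g = random.randrange(len(names))
                child[g] = random.randrange(len(events[names[g]]))
            children.append(child)
        pop = survivors + children
    best = min(pop, key=conflicts)
    return {e: events[e][i] for e, i in zip(names, best)}, conflicts(best)

print(evolve())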
There are several ways to do this.
One approach is to do constraint programming. It is a special case of the dynamic programming suggested by feanor. It is helpful to use a specialized library that can do the bounding and branching for you. (Google for "gecode" or "comet-online" to find libraries.)
If you are mathematically inclined, then you can also use integer programming to solve the problem. The basic idea here is to translate your problem into a set of linear inequalities. (Google for "integer programming scheduling" to find many real-life examples, and "Abacus COIN-OR" for a useful library.)
My guess is that constraint programming is the easiest approach, but integer programming is useful if you want to include real variables in your problem at some point.
Your problem description isn't entirely clear, but if all you're trying to do is find a schedule which has no overlapping events, then this is a straightforward bipartite matching problem.
You have two sets of nodes: events and times. Draw an edge from each event to each of its possible meeting times. You can then efficiently construct the matching (the largest possible set of edges between the two node sets) using augmenting paths. This works because you can always convert a bipartite graph into an equivalent flow graph.
An example of code that does this is BIM. Standard graph libraries such as GOBLIN and NetworkX also have bipartite matching implementations.
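For example, with NetworkX (the event/time data below is made up; note that this models every time slot as usable by at most one event):

import networkx as nx
from networkx.algorithms import bipartite

events = {"algebra": ["mon9", "tue9"], "physics": ["mon9", "wed10"], "chem": ["tue9"]}

G = nx.Graph()
G.add_nodes_from(events, bipartite=0)                                             # event nodes
G.add_nodes_from({t for slots in events.values() for t in slots}, bipartite=1)    # time nodes
for event, slots in events.items():
    G.add_edges_from((event, t) for t in slots)

matching = bipartite.hopcroft_karp_matching(G, top_nodes=events)
schedule = {e: matching[e] for e in events if e in matching}
print(schedule)   # a valid schedule only if every event ended up matched to a time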
This sounds like this could be a good candidate for a dynamic programming solution, specifically something similar to the interval scheduling problem.
There are some visuals here for the interval scheduling problem specifically, which may make the concept clearer. Here is a good tutorial on dynamic programming overall.
I am a student interested in developing a search engine that indexes pages from my country. I have been researching algorithms to use for some time now, and I have identified HITS and PageRank as the best out there. I have decided to go with PageRank since it is more stable than the HITS algorithm (or so I have read).
I have found countless articles and academic papers related to PageRank, but my problem is that I don't understand most of the mathematical symbols that form the algorithm in these papers. Specifically, I don't understand how the Google matrix (the irreducible, stochastic matrix) is calculated.
My understanding is based on these two articles:
http://online.redwoods.cc.ca.us/instruct/darnold/LAPROJ/fall2005/levicob/LinAlgPaperFinal2-Screen.pdf
http://ilpubs.stanford.edu:8090/386/1/1999-31.pdf
Could someone provide a basic explanation (examples would be nice) with less mathematical symbols?
Thanks in advance.
The formal definition of PageRank, as given on page 4 of the cited document, is expressed in the mathematical equation with the funny "E" symbol (it is in fact the capital Greek letter Sigma; Sigma is the Greek equivalent of the letter "S", which here stands for Summation).
In a nutshell, this formula says that to calculate the PageRank of page X (the formula itself is put back together after this breakdown)...
For all the backlinks to this page (=all the pages that link to X)
you need to calculate a value that is
The PageRank of the page that links to X [R'(v)]
divided by
the number of links found on this page. [Nv]
to which you add
some "source of rank", [E(u)] normalized by c
(we'll get to the purpose of that later.)
And you need to make the sum of all these values [The Sigma thing]
and finally, multiply it by a constant [c]
(this constant is just to keep the range of PageRank manageable)
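Put back together (using the paper's own symbols, where B_u is the set of pages linking to page u, as I read the definition on page 4), the formula is:

R'(u) = c \sum_{v \in B_u} \frac{R'(v)}{N_v} + c\,E(u)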
The key idea behind this formula is that all the web pages that link to a given page X add to its "worth". By linking to some page, they are "voting" in favor of that page. However, this "vote" has more or less weight, depending on two factors:
The popularity of the page that links to X [R'(v)]
The fact that the page that links to X also links to many other pages or not. [Nv]
These two factors reflect very intuitive ideas:
It's generally better to get a letter of recommendation from a recognized expert in the field than from an unknown person.
Regardless of who gives the recommendation, by also giving recommendations to other people, they diminish the value of their recommendation to you.
As you notice, this formula involves a kind of circular reference, because to know the PageRank of X, you need to know the PageRank of all the pages linking to X. So how do you figure out these PageRank values?... That's where the issue of convergence, explained in the next section of the document, kicks in.
Essentially, by starting with some "random" (or preferably "decent guess") values of PageRank for all pages and recalculating the PageRank with the formula above, the new calculated values get "better" as you iterate the process. The values converge, i.e. they each get closer and closer to the actual/theoretical value. Therefore, after a sufficient number of iterations, we reach a point where additional iterations would not add any practical precision to the values provided by the last iteration.
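As a toy illustration of that iteration (a made-up four-page web; the usual damping/teleportation term stands in for the E(u) "source of rank"):

import numpy as np

links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}     # page -> pages it links to
n, d = len(links), 0.85                            # d plays the role of the constant c

M = np.zeros((n, n))                               # column j spreads page j's rank over its outlinks
for j, outs in links.items():
    for i in outs:
        M[i, j] = 1.0 / len(outs)

r = np.full(n, 1.0 / n)                            # start from a uniform "decent guess"
for _ in range(100):
    r_new = d * M @ r + (1 - d) / n                # follow links, or jump anywhere with prob 1-d
    if np.abs(r_new - r).sum() < 1e-10:            # extra iterations no longer change anything
        break
    r = r_new
print(r)                                           # the converged PageRank values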
Now... that is all fine and dandy, in theory. The trick is to convert this algorithm into something equivalent that can be computed more quickly. There are several papers that describe how this, and similar tasks, can be done. I don't have such references off-hand, but will add them later. Beware that they will involve a healthy dose of linear algebra.
EDIT: as promised, here are a few links regarding algorithms to calculate PageRank.
Efficient Computation of PageRank, Haveliwala 1999
Exploiting the Block Structure of the Web for Computing PageRank, Kamvar et al. 2003
A fast two-stage algorithm for computing PageRank, Lee et al. 2002
Although many of the authors of the links provided above are from Stanford, it doesn't take long to realize that the quest for efficient PageRank-like calculation is a hot field of research. I realize this material goes beyond the scope of the OP, but it is important to hint at the fact that the basic algorithm isn't practical for big webs.
To finish with a very accessible text (yet with many links to in-depth info), I'd like to mention Wikipedia's excellent article
If you're serious about this kind of thing, you may consider an introductory/refresher class in maths, particularly linear algebra, as well as a computer science class that deals with graphs in general. BTW, great suggestion from Michael Dorfman, in this post, for OCW's videos of the 18.06 lectures.
I hope this helps a bit...
If you are serious about developing an algorithm for a search engine, I'd seriously recommend you take a Linear Algebra course. In the absence of an in-person course, the MIT OCW course by Gilbert Strang is quite good (video lectures at http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/).
A class like this would certainly allow you to understand the mathematical symbols in the document you provided; there's nothing in that paper that wouldn't be covered in a first-year Linear Algebra course.
I know this isn't the answer you are looking for, but it's really the best option for you. Having someone try to explain the individual symbols or algorithms to you when you don't have a good grasp of the basic concepts isn't a very good use of anybody's time.
This is the paper that you need: http://infolab.stanford.edu/~backrub/google.html (If you do not recognise the names of the authors, you will find more information about them here: http://www.google.com/corporate/execs.html).
The symbols used in the document are described there in lay English.
Thanks for making me google this.
You might also want to read the introductory tutorial on the mathematics behind the construction of the PageRank matrix written by David Austin, entitled How Google Finds Your Needle in the Web's Haystack; it starts with a simple example and builds up to the full definition.
"The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google". from Rose-Hulman is a bit out of date, because now Page Rank is the $491B linear algebra problem. I think the paper is very well written.
"Programming Collective Intelligence" has a nice discussion of Page Rank as well.
Duffymo posted the best reference, in my opinion. I studied the PageRank algorithm in my senior undergrad year. PageRank does the following:
Define the set of current web pages as the states of a finite Markov chain.
Define the probability of transitioning from site u to site v, where there is an outgoing link from u to v, to be 1/N_u, where N_u is the number of outgoing links from u.
Assume the Markov chain defined above is irreducible (this can be enforced with only a slight degradation of the results).
It can be shown that every finite irreducible Markov chain has a stationary distribution. Define the PageRank to be this stationary distribution, that is to say, the vector that holds the probability of a random particle ending up at each given site as the number of state transitions goes to infinity.
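In symbols (a restatement in standard notation, with P the transition matrix defined above and \pi a row vector):

\pi P = \pi, \qquad \pi_i \ge 0, \qquad \sum_i \pi_i = 1, \qquad \text{PageRank}(i) = \pi_i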
Google uses a slight variation on the power method to find the stationary distribution (the power method finds dominant eigenvectors). Other than that, there is nothing to it. It's rather simple and elegant, probably one of the simplest applications of Markov chains I can think of, but it is worth a lot of money!
So all the PageRank algorithm does is take the topology of the web into account as an indication of whether a website should be important. The more incoming links a site has, the greater the probability of a random particle spending its time at that site over an infinite amount of time.
If you want to learn more about PageRank with less math, then this is a very good tutorial on basic matrix operations. I recommend it to everyone who has little math background but wants to dive into ranking algorithms.