Depth First Search Basics - algorithm

I'm trying to improve my current algorithm for the 8 Queens problem, and this is the first time I'm really dealing with algorithm design/algorithms. I want to implement a depth-first search combined with a permutation of the different Y values described here:
http://en.wikipedia.org/wiki/Eight_queens_puzzle#The_eight_queens_puzzle_as_an_exercise_in_algorithm_design
I've implemented the permutation part to solve the problem, but I'm having a little trouble wrapping my mind around the depth-first search. It is described as a way of traversing a tree/graph, but does it generate the tree graph? It seems the only way that this method would be more efficient only if the depth-first search generates the tree structure to be traversed, by implementing some logic to only generate certain parts of the tree.
So essentially, I would have to create an algorithm that generated a pruned tree of lexigraphic permutations. I know how to implement the pruning logic, but I'm just not sure how to tie it in with the permutation generator since I've been using next_permutation.
Is there any resources that could help me with the basics of depth first searches or creating lexigraphic permutations in tree form?

In general, yes, the idea of the depth-first search is that you won't have to generate (or "visit" or "expand") every node.
In the case of the Eight Queens problem, if you place a queen such that it can attack another queen, you can abort that branch; it cannot lead to a solution.
If you were solving a variant of Eight Queens such that your goal was to find one solution, not all 92, then you could quit as soon as you found one.
More generally, if you were solving a less discrete problem, like finding the "best" arrangement of queens according to some measure, then you could abort a branch as soon as you knew it could not lead to a final state better than a final state you'd already found on another branch. This is related to the A* search algorithm.
Even more generally, if you are attacking a really big problem (like chess), you may be satisfied with a solution that is not exact, so you can abort a branch that probably can't lead to a solution you've already found.

The DFS algorithm itself does not generate the tree/graph. If you want to build the tree and graph, it's as simple building it as you perform the search. If you only want to find one soution, a flat LIFO data structure like a linked list will suffice for this: when you visit a new node, append it to the list. When you leave a node to backtrack in the search, pop the node off.

A book called "Introduction to algorithms" by anany levitan has a proper explanation for your understanding. He also provided the solution to 8 queens problem just the way you desctribed it. It will helpyou for sure.
As my understanding, for finding one solution you dont need any permutation all you need is dfs.That will lonely suffice in finding solution

Related

Algorithm Problem: Word transformation from a given word to another using only the words in a given dict

The detailed description of the problem is as follows:
Given two words (beginWord and endWord), and a dictionary's word list, find if there's a transformation sequence from beginWord to endWord, such that:
Only one letter can be changed at a time
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
I know this word can be solved using breadth-first-search. After I proposed the normal BFS solution, the interviewer asked me if I can make it faster. I didn't figure out a way to speed up. And the interviewer told me I should use a PriorityQueue instead to do a "Best-First-Search". And the priority is given by the hamming distance between the current word and target.
I don't quite understand why this can speed up the search. I feel by using priorityQueue we try to search the path that makes progress (i.e. reducing hamming distance).
This seems to be a greedy method. My questions is:
Why this solution is faster than the breadth-first-search solution? I feel the actual path can be like this: at first not making any progress, or even increasing the hamming distance, but after reaching a word the hamming distance goes down gradually. In this scenario, I think the priority queue solution will be slower.
Any suggestions will be appreciated! Thanks
First, I'd recommend to do some thorough reading on graph searching algorithms, that will explain the question to any detail you want (and far beyond).
TL;DR:
Your interviewer effectively recommended something close to the A* algorithm.
It differs from BFS in one aspect: which node to expand first. It uses a notion of distance score, composed from two elements:
At a node X, we already "traveled" a distance given by the number of transformations so far.
To reach the target from X, we still need to travel some more, at least N steps, where N is the number of characters different between node and target.
If we are to follow the path through X, the total number of steps from start to target can't be less than this score. It can be more if the real rest distance turns out to be longer (some words necessary for the direct path don't exist in the dictionary).
A* tells us: of all open (unexpanded) nodes, try the one first that potentially gives the shortest overall solution path, i.e. the one with the lowest score. And to implement that, a priority queue is a good fit.
In many cases, A* can dramatically reduce the search space (compared to BFS), and it still guarantees to find the best solution.
A* is NOT a greedy algorithm. It will eventually explore the whole search space, only in a much better ordering than a blind BFS.

Algorithm to generate TSP solutions randomly

The most common heuristics to solve the TSP problem (in particular the Kernighan–Lin heuristic) require to work on a randomly generated tour and to improve the solution starting from that. However, the only way I came up with to do that is to generate a random permutation of the vertices and to check if it is a solution or not.
For large instances of the problem (for example 1000 vertices) this process can take a while. Is there another smart way to generate a random tour for TSP problem faster?? Note that I'm looking for a tour, no matter the cost, and not an optimal solution.
Thanks in advance
If you're just looking for any tour, you can use Breadth or Depth First Search to generate a path while marking the nodes visited.
You could just create an array which contains ths problem's cities and then randomly shuffle that array (there are methods that could do this).
The resulting array is in fact a random permutation.
You want to use a space-filling-curve.

N-Puzzle with 5x5 grid, theory question

I'm writing a program which solves a 24-puzzle (5x5 grid) using two heuristic. The first uses how many blocks the incorrect place and the second uses the Manhattan distance between the blocks current place and desired place.
I have different functions in the program which use each heuristic with an A* and a greedy search and compares the results (so 4 different parts in total).
I'm curious whether my program is wrong or whether it's a limitation of the puzzle. The puzzle is generated randomly with pieces being moved around a few times and most of the time (~70%) a solution is found with most searches, but sometimes they fail.
I can understand why greedy would fail, as it's not complete, but seeing as A* is complete this leads me to believe that there's an error in my code.
So could someone please tell me whether this is an error in my thinking or a limitation of the puzzle? Sorry if this is badly worded, I'll rephrase if necessary.
Thanks
EDIT:
So I"m fairly sure it's something I'm doing wrong. Here's a step-by-step list of how I'm doing the searches, is anything wrong here?
Create a new list for the fringe, sorted by whichever heuristic is being used
Create a set to store visited nodes
Add the initial state of the puzzle to the fringe
while the fringe isn't empty..
pop the first element from the fringe
if the node has been visited before, skip it
if node is the goal, return it
add the node to our visited set
expand the node and add all descendants back to the fringe
If you mean that sliding puzzle: This is solvable if you exchange two pieces from a working solution - so if you don't find a solution this doesn't tell anything about the correctness of your algorithm.
It's just your seed is flawed.
Edit: If you start with the solution and make (random) legal moves, then a correct algorithm would find a solution (as reversing the order is a solution).
It is not completely clear who invented it, but Sam Loyd popularized the 14-15 puzzle, during the late 19th Century, which is the 4x4 version of your 5x5.
From the Wikipedia article, a parity argument proved that half of the possible configurations are unsolvable. You are probably running into something similar when your search fails.
I'm going to assume your code is correct, and you implemented all the algorithms and heuristics correctly.
This leaves us with the "generated randomly" part of your puzzle initialization. Are you sure you are generating correct states of the puzzle? If you generate an illegal state, obviously there will be no solution.
While the steps you have listed seem a little incomplete, you have listed enough to ensure that your A* will reach a solution if there is one (albeit not optimal as long as you are just simply skipping nodes).
It sounds like either your puzzle generation is flawed or your algorithm isn't implemented correctly. To easily verify your puzzle generation, store the steps used to generate the puzzle, and run it in reverse and check if the result is a solution state before allowing the puzzle to be sent to the search routines. If you ever generate an invalid puzzle, dump the puzzle, and expected steps and see where the problem is. If the puzzle passes and the algorithm fails, you have at least narrowed down where the problem is.
If it turns out to be your algorithm, post a more detailed explanation of the steps you have actually implemented (not just how A* works, we all know that), like for instance when you run the evaluation function, and where you resort the list that acts as your queue. That will make it easier to determine a problem within your implementation.

I need an algorithm to find the best path

I need an algorithm to find the best solution of a path finding problem. The problem can be stated as:
At the starting point I can proceed along multiple different paths.
At each step there are another multiple possible choices where to proceed.
There are two operations possible at each step:
A boundary condition that determine if a path is acceptable or not.
A condition that determine if the path has reached the final destination and can be selected as the best one.
At each step a number of paths can be eliminated, letting only the "good" paths to grow.
I hope this sufficiently describes my problem, and also a possible brute force solution.
My question is: is the brute force is the best/only solution to the problem, and I need some hint also about the best coding structure of the algorithm.
Take a look at A*, and use the length as boundary condition.
http://en.wikipedia.org/wiki/A%2a_search_algorithm
You are looking for some kind of state space search algorithm. Without knowing more about the particular problem, it is difficult to recommend one over another.
If your space is open-ended (infinite tree search), or nearly so (chess, for example), you want an algorithm that prunes unpromising paths, as well as selects promising ones. The alpha-beta algorithm (used by many OLD chess programs) comes immediately to mind.
The A* algorithm can give good results. The key to getting good results out of A* is choosing a good heuristic (weighting function) to evaluate the current node and the various successor nodes, to select the most promising path. Simple path length is probably not good enough.
Elaine Rich's AI textbook (oldie but goodie) spent a fair amount of time on various search algorithms. Full Disclosure: I was one of the guinea pigs for the text, during my undergraduate days at UT Austin.
did you try breadth-first search? (BFS) that is if length is a criteria for best path
you will also have to modify the algorithm to disregard "unacceptable paths"
If your problem is exactly as you describe it, you have two choices: depth-first search, and breadth first search.
Depth first search considers a possible path, pursues it all the way to the end (or as far as it is acceptable), and only then is it compared with other paths.
Breadth first search is probably more appropriate, at each junction you consider all possible next steps and use some score to rank the order in which each possible step is taken. This allows you to prioritise your search and find good solutions faster, (but to prove you have found the best solution it takes just as long as depth-first searching, and is less easy to parallelise).
However, your problem may also be suitable for Dijkstra's algorithm depending on the details of your problem. If it is, that is a much better approach!
This would also be a good starting point to develop your own algorithm that performs much better than iterative searching (if such an algorithm is actually possible, which it may not be!)
A* plus floodfill and dynamic programming. It is hard to implement, and too hard to describe in a simple post and too valuable to just give away so sorry I can't provide more but searching on flood fill and dynamic programming will put you on the path if you want to go that route.

branch and bound

Can someone explain the branch and bound search technique for me? I need to find a path with the smallest cost from any start node to an end node of any random graph using branch and bound search algorithm.
The basic idea of B & B is:
When solving an optimisation problem ("Find an X satisfying criteria Y so as to minimise the cost f(X)"), you build a solution piece by piece -- at any point in time, you have a partial solution, which has a cost.
If the nature of the problem is such that the cost of a partial solution can only stay the same or go up as you continue adding pieces to it, then you know that there's no point continuing to add pieces to a partial solution if there's already a full solution with lower cost. In this case, you can abandon (or "prune", or "fathom") further processing of this partial solution.
Many problems have the latter property, making B & B a widely applicable algorithm technique.
The process of searching for solutions can be represented by a search tree, where the root node represents the starting point where no decisions have been made, and each edge leading from a node represents a decision about something to be included in a partial solution. Each node is a partial solution comprising the decisions made (edges) from the root to that node.
Example: if we want to solve a Sudoku puzzle, the root node would represent the board with just the originally supplied numbers filled in; there might be 9 edges from this root, each representing the decision to assign a number 1-9 to the top-left cell. Each of those 9 partial solution nodes could have 8 branches, representing the valid assignments to the cell at position (1, 2), and so on. Usually, each edge represents a recursion step in a program.
With B & B, in the best case a good solution is found early, meaning that unpromising areas of the search tree can be pruned near the root; but in the worst case, the entire tree of valid solutions will be generated. For this reason B & B is usually only used to solve problems for which no faster algorithm is known (such as NP-hard problems).
This link provides a graphical representation of concepts related to B & B.
This link provides an explanation of the algorithm and sample C# code in a downloadable zip file.
Hope this helps.
There are a lot of references about branch and bound algorithms in the web.
here you can find some theoretical explanation.
whereas the code in C# is here
Fantastic answer #j_random_hacker !!!!
See pg 439 (example 18.2) in Papadimitriou and Steiglitz, Combinatorial Optimization.
This book is a classic, and it discusses your exact problem.

Resources