I was researching the convex hull problem and the Graham scan in order to implement it, and I noticed that everyone uses a stack. So I wanted to ask: why exactly is a stack used in this algorithm? What's the benefit of using a stack?
The stack here can be considered an abstract data structure that supports push, pop and empty operations. If you read the description of the Graham scan algorithm, you'll see that these are exactly the operations the algorithm uses, so I really can't see what an alternative to a stack would be - it would probably be a different algorithm at that point.
What data structure is used to back/implement these operations (i.e. the class that implements the stack interface, in OO terms) can be decided rather freely. Often an array is used, but for some applications a linked list might also make sense.
In the Graham scan, while constructing the hull, you need to backtrack whenever the points do not form a left turn with the next point under consideration: previously accepted points must be re-examined as candidate hull points, so a stack hands them back in last-visited-first order for re-validation. It's not mandatory to use a stack, though - you can use a simple array to do the same.
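To make this concrete, here is a minimal sketch (in Python, with illustrative point tuples; function names are my own) of the hull-building loop. The list is used as a stack, and the `while` loop is exactly the backtracking described above:

```python
# Sketch of the Graham scan hull-building loop. Assumes the input points
# are already sorted by polar angle around the bottom-most point.

def cross(o, a, b):
    """Cross product of vectors OA and OB; > 0 means a left (CCW) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def build_hull(sorted_points):
    stack = []  # a plain list used as a stack: append = push, pop = pop
    for p in sorted_points:
        # Backtrack: pop points that would make a non-left turn with p.
        while len(stack) >= 2 and cross(stack[-2], stack[-1], p) <= 0:
            stack.pop()
        stack.append(p)
    return stack
```

Every point is pushed once and popped at most once, which is why the loop is linear despite the nested `while`.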
I cannot seem to find any documentation on how to construct an R-Tree when I have all the known Minimum Bounding Rectangles (MBR) of the polygons in my set. The R-Tree would be ideal for storing these spatial references to eliminate my current brute force inspecting for polygon intersection:
for p1 in polygons:                  # O(n)
    for p2 in polygons:              # O(n)
        if p2 is not p1:             # O(1)
            if p2.intersects(p1):    # O(1); computed using De Morgan's law on vertices
                # do stuff
Does anybody have a reference that denotes methods of how to determine the partitioning of the rectangles that encompass the MBRs of the polygons in the set?
There are many R-Tree partitioning algorithms; in my experience the best one is the R*-Tree (R-Star-Tree) by Beckmann et al. Just search for "The R*-tree: an efficient and robust access method for points and rectangles".
If you prefer reading code, there are also many open-source implementations, including my own in Java. Be warned, though: the R*-Tree algorithm is not trivial.
If you are looking for something simpler, try quadtrees or octrees. If insertion and update speed is the top priority, have a look at the PH-Tree (again, my own implementation), but it is also more complicated than a quadtree.
Another simpler solution is the AABB-Tree; it's like an R-Tree but with only two bounding boxes per node. I believe it's used a lot in computer graphics, but I don't know much about it beyond that it is relatively simple as R-Tree variants go.
EDIT (Update to answer comment)
If you are looking for a bulk loading strategy such as STR, here is the original paper. You can have a look at my R-Tree implementation, as it also provides an implementation of an STR-Loader that can handle points and rectangles.
Searching stack overflow I also found this answer which apparently points to an alternative bulk loader specifically for storing rectangles.
I would like to point out that bulk loading (such as STR loading) is the fastest way to load an R-Tree. However, in my own experiments (see Figure 3 here), it is still 2-3 times slower than a good quadtree or a PH-Tree.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Can anyone suggest real-world problems for these algorithms: insertion sort, breadth-first search, depth-first search, or topological sorting? Thank you.
Real-world examples of recursion
I saw samples here, but what I need are specific problems for the insertion sort, breadth-first search, depth-first search, or topological sorting algorithms.
I hope you can help me.
How much more real can it get than our daily, humdrum lives?
Insertion sort is what we (or at least I) most commonly use when we need to sort things by hand. Consider a deck of cards: you go over them one by one, putting the smallest at the front, the next smallest behind it, and so on. Or a pile of paperwork that needs to be sorted by date - same algorithm.
In CS, insertion sort is less commonly used because we have much better algorithms (quicksort and merge sort come to mind). A human can do those as well, but it would be a much more tedious task indeed.
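For reference, the card-sorting procedure above translates directly into code; a minimal sketch:

```python
# A minimal insertion sort: the same "sort a hand of cards" procedure
# described above, one element inserted into place at a time.

def insertion_sort(items):
    items = list(items)  # work on a copy; leave the input untouched
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot to the right...
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        # ...then drop the current element into the gap.
        items[j + 1] = current
    return items
```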
Breadth-first search's use is in the name: When we want to go over a tree horizontally, and not vertically. Let's say you heard your family is connected to Ilya Repin, a Russian artist. You go to the attic, crack open the giant wooden chest that's been gathering dust, and take out the old family tree which dates back to the 19th century (doesn't everyone have that?). You know he's closer to the top of the tree than he is to the bottom, so you go breadth-first: Take the first line, followed by the second, and so on...just a little more...Efim Repin...bingo!
If Ilya Repin happened to be in the leftmost branch of the tree, then depth-first would've made more sense. However, in the average case, you'd want to go breadth-first because we know our target is closer to the root than it is to the leaves. In CS there are a buttload of usage cases (Cheney's algorithm, A*, etc.; you can see several more on Wikipedia).
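The level-by-level traversal described above boils down to a FIFO queue; here's a minimal sketch (the graph shape and node names are made up for illustration):

```python
from collections import deque

# Breadth-first search over a graph given as an adjacency dict,
# returning the target node if reachable, else None.

def bfs_find(graph, root, target):
    queue = deque([root])
    seen = {root}
    while queue:
        node = queue.popleft()  # FIFO queue => visit level by level
        if node == target:
            return node
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return None
```

Swapping the `deque` for a LIFO stack would turn this into depth-first search, which is the whole difference between the two traversals.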
Depth-first search is used when we... drumroll ...want to go to the depth of the tree first, travel vertically. There are so many uses I can't even begin, but the simplest and most common is solving a cereal-box maze. You go through one path until you reach a dead-end, and then you backtrack. We don't do it perfectly, since we sometimes skip over a path or forget which ones we took, but we still do it.
In CS, there are a whole lot of usage cases, so I'll redirect you to Wikipedia again.
Topological sort is something some of us use in the backs of our heads, but it's easily seen in chefs, cooks, programmers - anyone and everyone who has to do an ordered set of tasks. My grandmother made the best cannelloni I've ever eaten, and her very simple recipe consisted of several simple steps (which I've managed to forget, but here is their very general outline): making the "pancake" wrapper, making the sauce, and wrapping them together. Now, we can't wrap these two up without making them, so naturally we first have to make the wrapper and sauce, and only then wrap 'em.
In CS it's used for exactly the same thing: scheduling. I think Excel uses it to calculate dependent spreadsheet formulas (or maybe it just uses a simple recursive algorithm, which may be less efficient). For some more, you can see our good friend Wikipedia and a random article I found.
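The scheduling idea can be sketched with Kahn's algorithm: repeatedly emit a task whose prerequisites are all done. The recipe tasks below are just illustrative.

```python
from collections import deque

# Kahn's algorithm for topological sorting. Edges point from a
# prerequisite task to the task that depends on it.

def topo_sort(tasks, edges):
    indegree = {t: 0 for t in tasks}
    successors = {t: [] for t in tasks}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:  # all prerequisites done
                ready.append(nxt)
    if len(order) != len(tasks):
        raise ValueError("cycle detected: no valid ordering")
    return order
```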
I work with hierarchical data structures, and I'm always in need of BFS to find objects nested under a specific root...
e.g. Find(/Replace)
Sometimes (significantly less often) I use DFS in order to check design constraints that cannot be evaluated without investigating the leaves.
Though not used by me, and not exactly BFS: GPS navigation software uses A* to search for a good path, which is a kind of "weighted BFS".
Insertion sort - in software, almost none; it's good for learning. Outside computers it is often used to sort, for example, cards. In real-world code, merge sort or quicksort are better.
BFS - finding connected nodes, finding shortest paths. The basis for Dijkstra's algorithm and A* (a faster version of Dijkstra's).
DFS - finding connected nodes, numbering nodes in a tree.
Topological sorting - finding correct order of tasks.
I have a need to take a 2D graph of n points and reduce it to r points (where r is a specific number less than n). For example, I may have two datasets with slightly different total numbers of points, say 1021 and 1001, and I'd like to force both datasets to have 1000 points. I am aware of a couple of simplification algorithms: Lang simplification and Douglas-Peucker. I have used Lang in a previous project with slightly different requirements.
The specific properties of the algorithm I am looking for is:
1) must preserve the shape of the line
2) must allow me reduce dataset to a specific number of points
3) is relatively fast
This post is a discussion of the merits of the different algorithms. I will post a second message for advice on implementations in Java or Groovy (why reinvent the wheel).
I am concerned about requirement 2 above. I am not enough of an expert in these algorithms to know whether I can dictate the exact number of output points. The implementation of Lang that I've used took lookAhead, tolerance, and the array of points as input, so I don't see how to dictate the number of points in the output. This is a critical requirement of my current needs. Perhaps this is due to the specific implementation of Lang we had used, but I have not seen a lot of information on Lang on the web. Alternatively, we could use Douglas-Peucker, but again I am not sure whether the number of points in the output can be specified.
I should add that I am not an expert on these types of algorithms or any kind of math whiz, so I am looking for mere-mortal-type advice :) How do I satisfy requirements 1 and 2 above? I would sacrifice performance for the right solution.
I think you can adapt Douglas-Peucker quite straightforwardly. Adapt the recursive algorithm so that rather than producing a list it produces a tree mirroring the structure of the recursive calls. The root of the tree will be the single-line approximation P0-Pn; the next level will represent the two-line approximation P0-Pm-Pn where Pm is the point between P0 and Pn which is furthest from P0-Pn; the next level (if full) will represent a four-line approximation, etc. You can then trim the tree either on the basis of depth or on the basis of distance of the inserted point from the parent line.
Edit: in fact, if you take the latter approach you don't need to build a tree. Instead you populate a priority queue where the priority is given by the distance of the inserted point from the parent line. Then when you've finished the queue tells you which points to remove (or keep, according to the order of the priorities).
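A sketch of that priority idea (this is my own illustrative code, not anyone's actual implementation; the ranking scheme records, for each point, the distance at which Douglas-Peucker would have chosen it as a split point, and then keeps the r highest-ranked points plus the endpoints):

```python
import math

def point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    ax, ay = a; bx, by = b; px, py = p
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    den = math.hypot(bx - ax, by - ay)
    return num / den if den else math.hypot(px - ax, py - ay)

def rank_points(points):
    """Return {index: priority} for interior points, via the DP recursion."""
    ranks = {}
    def recurse(lo, hi):
        if hi - lo < 2:
            return
        # Interior point farthest from the chord lo-hi becomes the split.
        split, best = None, -1.0
        for i in range(lo + 1, hi):
            d = point_line_dist(points[i], points[lo], points[hi])
            if d > best:
                split, best = i, d
        ranks[split] = best
        recurse(lo, split)
        recurse(split, hi)
    recurse(0, len(points) - 1)
    return ranks

def simplify_to(points, r):
    """Keep the endpoints plus the r-2 highest-priority interior points."""
    ranks = rank_points(points)
    keep = {0, len(points) - 1}
    for i in sorted(ranks, key=ranks.get, reverse=True)[:max(r - 2, 0)]:
        keep.add(i)
    return [points[i] for i in sorted(keep)]
```

This directly addresses requirement 2: `r` is an exact output size rather than a tolerance.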
You can find my C++ implementation and article on Douglas-Peucker simplification here and here. I also provide a modified version of the Douglas-Peucker simplification that allows you to specify the number of points of the resulting simplified line. It uses a priority queue as mentioned by 'Peter Taylor'. It's a lot slower, though, so I don't know whether it would satisfy the 'is relatively fast' requirement.
I'm planning on providing an implementation of Lang simplification (and several others). Currently I don't see any easy way to adjust Lang to reduce to a fixed point count. If you could live with a less strict requirement - 'must allow me to reduce the dataset to an approximate number of points' - then you could use an iterative approach. Guess an initial value for lookahead: point count / desired point count. Then slowly increase the lookahead until you approximately hit the desired point count.
I hope this helps.
p.s.: I just remembered something - you could also try the Visvalingam-Whyatt algorithm. In short:
- compute the triangle area for each point with its direct neighbors
- sort these areas
- remove the point with the smallest area
- update the areas of its neighbors
- re-sort
- continue until n points remain
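A direct, unoptimized rendering of those steps might look like the following sketch (it re-scans all areas on each pass instead of maintaining a heap, which keeps it short but quadratic):

```python
# Visvalingam-Whyatt simplification, naive version: repeatedly drop the
# point whose triangle with its direct neighbors has the smallest area.

def triangle_area(a, b, c):
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def visvalingam(points, n):
    pts = list(points)
    while len(pts) > max(n, 2):
        # Area for each interior point with its direct neighbors.
        areas = [(triangle_area(pts[i - 1], pts[i], pts[i + 1]), i)
                 for i in range(1, len(pts) - 1)]
        _, i = min(areas)
        del pts[i]  # neighbor areas are recomputed on the next pass
    return pts
```

Like the priority-queue Douglas-Peucker variant, this naturally satisfies an exact target point count.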
I have implemented an A* search algorithm for finding a shortest path between two states.
The algorithm uses a hash map for storing the best known distances for visited states, and another hash map for storing the child-parent relationships needed to reconstruct the shortest path.
Here is the code. The implementation of the algorithm is generic (states only need to be "hashable" and "comparable"), but in this particular case states are pairs (vectors) of ints [x y] and represent one cell in a given heightmap (the cost of jumping to a neighboring cell depends on the difference in heights).
The question is whether it's possible to improve performance, and how. Maybe by using some features from 1.2 or future versions, by changing the logic of the algorithm implementation (e.g. using a different way to store the path), or by changing the state representation in this particular case?
The Java implementation runs in an instant for this map, while the Clojure implementation takes about 40 seconds. Of course, there are some natural and obvious reasons for this: dynamic typing, persistent data structures, unnecessary (un)boxing of primitive types...
Using transients didn't make much difference.
Using priority-map instead of sorted-set
I first used a sorted-set for storing open nodes (the search frontier); switching to a priority-map improved performance: it now takes 15-20 seconds for this map (it took 40 s before).
This blog post was very helpful. And "my" new implementation is pretty much the same.
New a*-search can be found here.
I don't know Clojure, but I can give you some general advice about improving the performance of Vanilla A*.
Consider implementing IDA*, which is a variant of A* that uses less memory, if it's suitable for your domain.
Try a different heuristic. A good heuristic can have a significant impact on the number of node expansions required.
Use a cache, often called a "transposition table" in search algorithms. Since search graphs are usually directed acyclic graphs and not true trees, you can end up repeating the search of a state more than once; a cache that remembers previously searched nodes reduces node expansions.
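To illustrate the transposition-table idea in isolation, here is a minimal generic A* sketch (in Python; the function names are my own, not from the asker's code). The `best_g` map plays the role of the cache: a state reached again at equal or higher cost is never re-expanded.

```python
import heapq

# Minimal A*: `best_g` is the "transposition table" of cheapest known
# costs per state; stale heap entries are skipped instead of re-expanded.

def a_star(start, goal, neighbors, heuristic):
    open_heap = [(heuristic(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, state = heapq.heappop(open_heap)
        if state == goal:
            return g
        if g > best_g.get(state, float("inf")):
            continue  # stale entry; a cheaper path was already found
        for nxt, cost in neighbors(state):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + heuristic(nxt), ng, nxt))
    return None  # goal unreachable
```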
Dr. Jonathan Schaeffer has some slides on this subject:
http://webdocs.cs.ualberta.ca/~jonathan/Courses/657/Notes/10.Single-agentSearch.pdf
http://webdocs.cs.ualberta.ca/~jonathan/Courses/657/Notes/11.Evaluations.pdf
Ok, so I would like to make a GLR parser generator. I know programs exist that are better than what I will probably make, but I am doing this for fun/learning, so that's not important.
I have been reading about GLR parsing and I think I have a decent high level understanding of it now. But now it's time to get down to business.
The graph-structured stack (GSS) is the key data structure for use in GLR parsers. Conceptually I know how GSS works, but none of the sources I looked at so far explain how to implement GSS. I don't even have an authoritative list of operations to support. Can someone point me to some good sample code/tutorial for GSS? Google didn't help so far. I hope this question is not too vague.
Firstly, if you haven't already, you should read McPeak's paper on GLR http://www.cs.berkeley.edu/~smcpeak/papers/elkhound_cc04.ps. It is an academic paper, but it gives good details on GSS, GLR, and the techniques used to implement them. It also explains some of the hairy issues with implementing a GLR parser.
You have three parts to implementing a graph-structured stack.
I. The graph data structure itself
II. The stacks
III. GLR's use of a GSS
You are right, google isn't much help. And unless you like reading algorithms books, they won't be much help either.
I. The graph data structure
Rob's answer about "the direct representation" would be easiest to implement. It's a lot like a linked-list, except each node has a list of next nodes instead of just one.
This data structure is a directed graph, but as McPeak notes, the GSS may have cycles for epsilon grammars.
II. The stacks
A graph-structured stack is conceptually just a list of regular stacks. For an unambiguous grammar, you only need one stack. You need more stacks when there is a parsing conflict, so that you can take both parsing actions at the same time and maintain the different states both actions create. Using a graph allows you to take advantage of the fact that these stacks share elements.
It may help to understand how to implement a single stack with a linked-list first. The head of the linked list is the top of the stack. Pushing an element onto the stack is just creating a new head and pointing it to the old head. Popping an element off the stack is just moving the pointer to head->next.
In a GSS, the principle is the same. Pushing an element is just creating a new head node and pointing it to the old head. If you have two shift operations, you will push two elements onto the old head and then have two head nodes. Conceptually this is just two different stacks that happen to share every element except the top ones. Popping an element is just moving the head pointer down the stack by following each of the next nodes.
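A minimal node-based sketch of those push/pop mechanics (the class and function names here are made up for illustration):

```python
# Each GSS node points at a list of predecessor nodes ("below" it on the
# stack), so multiple stacks can share a common tail.

class GSSNode:
    def __init__(self, state, preds=()):
        self.state = state
        self.preds = list(preds)

def push(head, state):
    """Push a state above an existing head, returning the new head."""
    return GSSNode(state, [head] if head is not None else [])

def pop(head):
    """Popping a head exposes each predecessor as a possible new head."""
    return list(head.preds)
```

Two shifts from the same head create two heads whose `preds` point at the very same tail node, which is exactly the sharing described above.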
III. GLR's use of the GSS
This is where McPeak's paper is a useful read.
The GLR algorithm takes advantage of the GSS by merging stack heads that have the same state element. This means that one state element may have more than one child. When reducing, the GLR algorithm will have to explore all possible paths from the stack head.
You can optimize GLR by maintaining the deterministic depth of each node. This is just the distance from a split in the stack. This way you don't always have to search for a stack split.
This is a tough task! So good luck!
The question that you're asking isn't trivial. I see two main ways of doing this:
The direct representation. Your data structure is represented in memory as node objects/structures, where each node has a reference/pointer to the nodes below it on the stack (one could also make the references bi-directional, as an alternative). This is the way lists and trees are normally represented in memory. It is a bit more complicated in this case because, unlike a tree or a list, where one need only maintain a reference to the root node or head node to keep track of the whole structure, here we would need to maintain a list of references to all the 'top level' nodes.
The adjacency list representation. This is similar to the way that mathematicians like to think about graphs: G = (V, E). You maintain a list of edges, indexed by the vertices which are the origin and termination points for each edge.
The first option has the advantage that traversal can be quicker, as long as the GSS isn't too flat. But the structure is slightly more difficult to work with. You'll have to roll a lot of your own algorithms.
The second option has the advantage of being more straightforward to work with. Most algorithms in textbooks seem to assume some kind of adjacency list representation, which makes it easier to apply the wealth of graph algorithms out there.
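For illustration, here is a tiny adjacency-list digraph along those lines (a sketch only, not a full GSS; class and method names are made up):

```python
from collections import defaultdict

# Adjacency-list representation: edges indexed by their origin vertex,
# as in option 2. Vertices can be any hashable value.

class DiGraph:
    def __init__(self):
        self.adj = defaultdict(list)

    def add_edge(self, u, v):
        self.adj[u].append(v)

    def successors(self, u):
        return list(self.adj[u])
```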
Some resources:
There are various types of adjacency list, e.g. hash table based, array based, etc. The Wikipedia adjacency list page is a good place to start.
Here's a blog post from someone who has been grappling with the same issue. The code is in Clojure, which may or may not be familiar, but the discussion is worth a look either way.
I should mention that I wish there were more information available about representing directed acyclic graphs (or graph-structured stacks, if you prefer), given the widespread application of this sort of model. I think there is room for better solutions to be found.