Related
I'm a freshman in cs, and I am having a little issue understanding the content of recursion in python.
I have 3 questions that I want to ask, or to confirm if I am understanding the content correctly.
1: I'm not sure the purpose of using base case in recursion. Is base case worked to terminate the recursion somewhere in my program?
2: Suppose I have my recursive call above any other code of my program. Is the program going to run the recursion first and then to run the content after recursion?
3: How to trace recursion properly with a reasonable correctness rate? I personally think it's really hard to trace recursion and I can barely find some instruction online.
Thanks for answering my questions.
Yes. The idea is that the recursive call will continue to call itself (with different inputs) until it calls the base case (or one of many base cases). The base case is usually a case of your problem where the solution is trivial and doesn't rely on any other parts of the problem. It then walks through the calls backwards, building on the answer it got for the simplest version of the question.
Yes. Interacting with recursive functions from the outside is exactly the same as interacting with any other function.
Depends on what you're writing, and how comfortable you are with debugging tools. If you're trying to track what values are getting passed back and forth between recursive calls, printing all the parameters at the start of the function and the return value at the end can take you pretty far.
The easiest way to wrap your head around this stuff is to write some of your own. Start with some basic examples (the classic is the factorial function) and ask for help if you get stuck.
Edit: If you're more math-oriented, you might look up mathematical induction (you will learn it anyway as part of cs education). It's the exact same concept, just taught a little differently.
I'm trying to implement a small problem to understand better in my compilers class. This is the problem as follows: Assume, I have a bunch of files to compile as follows:
a depends on nothing
b depends on c
c depends on f
d dependes on a
e depends on b
f depends on nothing
So in this case the compilation order for the files to be successfully compiled is a,f,c,b,d,e. I want to write my own algorithm to output the desired dependency just as an exercise. I know the linker does it automatically in C++ etc, but this is just a personal exercise. How can I go about solving this problem. Any references to algorithms/readings is much appreciated, since I'm fairly new.
Based on the comment of #ajb, a quick google search brings up the Wikipedia article for Topological Sorting. However, it seems to me that if you're going to go through the trouble of making a graph to represent the problem, there's a really easy way to do this.
First, for each file you're compiling, make a node. Then have an edge from each node to it's dependency, and have an edge to a special node for if the file requires no dependency. Once that is done, all you have to do is reverse the edges and compile in a breadth first search from that special node.
If you need to worry about circular dependencies or any of that jazz, then it gets a lot more complicated, but it's still doable.
Since you're asking for literature, there is a book called Data Structures and Algorithms in C++ that goes over all kinds of data structures and algorithms (what a surprise!) including graph algorithms in chapter 13.
I'm using the problem below as an example.
I'm trying to solve this problem http://www.geeksforgeeks.org/find-maximum-path-sum-in-a-binary-tree/ and I understand the solution provided which recursively traverses that tree. But I am wondering if it makes more sense to solve it as a client utilizing standard tree APIs which may include an iterable list of nodes in inorder/ pre/post order of the nodes.
I am not a professional software developer and don't use data structures at work. So my questions to you are
1) Does it make more sense to solve this type of problem as a client utilizing tree traversal methods of a DS in a library. (Assumption: such traversal methods exist).
2) In the context of an interview for software dev (sorry if this breaks the rules of this community), will the interviewers expect that I solve this as a method of a tree DS? i.e., I have access to the root and then I can traverse the tree like in the solution. or will they prefer that I solve it as a client.
3) What about 2) as a professional software dev in day to day work.
I apologize if the question in confusing or not well stated.
In interview questions they would want you to come up with the solution yourself, rather than just say "I would use this standard library", even if that's what you would do if you were actually solving the proglem.
I'm working on a board game algorithm where a large tree is traversed using recursion, however, it's not behaving as expected. How do I handle this and what are you experiences with these situations?
To make things worse, it's using alpha-beta pruning which means entire parts of the tree are never visited, as well that it simply stops recursion when certain conditions are met. I can't change the search-depth to a lower number either, because while it's deterministic, the outcome does vary by how deep is searched and it may behave as expected at a lower search-depth (and it does).
Now, I'm not gonna ask you "where is the problem in my code?" but I am looking for general tips, tools, visualizations, anything to debug code like this. Personally, I'm developing in C#, but any and all tools are welcome. Although I think that this may be most applicable to imperative languages.
Logging. Log in your code extensively. In my experience, logging is THE solution for these types of problems. when it's hard to figure out what your code is doing, logging it extensively is a very good solution, as it lets you output from within your code what the internal state is; it's really not a perfect solution, but as far as I've seen, it works better than using any other method.
One thing I have done in the past is to format your logs to reflect the recursion depth. So you may do a new indention for every recurse, or another of some other delimiter. Then make a debug dll that logs everything you need to know about a each iteration. Between the two, you should be able to read the execution path and hopefully tell whats wrong.
I would normally unit-test such algorithms with one or more predefined datasets that have well-defined outcomes. I would typically make several such tests in increasing order of complexity.
If you insist on debugging, it is sometimes useful to doctor the code with statements that check for a given value, so you can attach a breakpoint at that time and place in the code:
if ( depth = X && item.id = 32) {
// Breakpoint here
}
Maybe you could convert the recursion into an iteration with an explicit stack for the parameters. Testing is easier in this way because you can directly log values, access the stack and don't have to pass data/variables in each self-evaluation or prevent them from falling out of scope.
I once had a similar problem when I was developing an AI algorithm to play a Tetris game. After trying many things a loosing a LOT of hours in reading my own logs and debugging and stepping in and out of functions what worked out for me was to code a fast visualizer and test my code with FIXED input.
So, if time is not a problem and you really want to understand what is going on, get a fixed board state and SEE what your program is doing with the data using a mix of debug logs/output and some sort of your own tools that shows information on each step.
Once you find a board state that gives you this problem, try to pin-point the function(s) where it starts and then you will be in a position to fix it.
I know what a pain this can be. At my job, we are currently working with a 3rd party application that basically behaves as a black box, so we have to devise some interesting debugging techniques to help us work around issues.
When I was taking a compiler theory course in college, we used a software library to visualize our trees; this might help you as well, as it could help you see what the tree looks like. In fact, you could build yourself a WinForms/WPF application to dump the contents of your tree into a TreeView control--it's messy, but it'll get the job done.
You might want to consider some kind of debug output, too. I know you mentioned that your tree is large, but perhaps debug statements or breaks at key point during execution that you're having trouble visualizing would lend you a hand.
Bear in mind, too, that intelligent debugging using Visual Studio can work wonders. It's tough to see how state is changing across multiple breaks, but Visual Studio 2010 should actually help with this.
Unfortunately, it's not particularly easy to help you debug without further information. Have you identified the first depth at which it starts to break? Does it continue to break with higher search depths? You might want to evaluate your working cases and try to determine how it's different.
Since you say that the traversal is not working as expected, I assume you have some idea of where things may go wrong. Then inspect the code to verify that you have not overlooked something basic.
After that I suggest you set up some simple unit tests. If they pass, then keep adding tests until they fail. If they fail, then reduce the tests until they either pass or are as simple as they can be. That should help you pinpoint the problems.
If you want to debug as well, I suggest you employ conditional breakpoints. Visual Studio lets you modify breakpoints, so you can set conditions on when the breakpoint should be triggered. That can reduce the number of iterations you need to look at.
I would start by instrumenting the function(s). At each recursive call log the data structures and any other info that will be useful in helping you identify the problem.
Print out the dump along with the source code then get away from the computer and have a nice paper-based debugging session over a cup of coffee.
Start from the base case where you've mentioned if else statements and then try to channelize your thinking by writing it down on pen and paper + printing the values on console when the first few instances of recursive functions are generated with values.
The motto is to find the correct trend between the values you print and match them with those values you wrote on paper in the initial few steps of your recursive algorithm.
I wonder how many of you have implemented one of computer science's "classical algorithms" like Dijkstra's algorithm or data structures (e.g. binary search trees) in a real world, not academic project?
Is there a benefit to our dayjobs in knowing these algorithms and data structures when there are tons of libraries, frameworks and APIs which give you the same functionality?
Is there a benefit to our dayjobs in knowing these algorithms and data structures when there are tons of libraries, frameworks and APIs which give you the same functionality?
The library doesn't know what your problem domain is and won't be able to chose the correct algorithm to do the job. That is why I think it is important to know about them: then YOU can make the correct choice of algorithms to solve YOUR problem.
Knowing, or being able to understand these algorithms is important, these are the tools of your trade. It does not mean you have to be able to implement A* in an hour from memory. But you should be able to figure out what the advantages of using a red-black tree as opposed to a normal unbalanced tree are so you can decide if you need it or not. You need to be able to judge the fitness of an algorithm for solving your problem.
This might sound too school-masterish but these "classical algorithms" were not invented to give college students exam questions, they were invented to solve problems or improve on current solutions, just like the array, the linked list or the stack are building blocks to write a program so are some of these. Just like in math where you move from addition and subtraction to integration and differentiation, these are advanced techniques that will help you solve problems that are out there.
They might not be directly applicable to your problems or work situation but in the long run knowing of them will help you as a professional software engineer.
To answer your question, I did an implementation of A* recently for a game.
Is there a benefit to understanding your tools, rather than simply knowing that they exist?
Yes, of course there is. Taking a trivial example, don't you think there's a benefit to knowing what the difference is List (or your language's equivalent dynamic array implementation) and LinkedList (or your language's equivalent)? It's pretty important to know that one has constant random access time, while the other is linear. And one requires N copies if you insert a value in the middle of the sequence, while the other can do it in constant time.
Don't you think there's an advantage to understanding that the same sorting algorithm isn't always optimal? That for almost-sorted data, quicksort sucks, for example? Naively just calling Sort() and hoping for the best can become ridiculously expensive if you don't understand what's happening under the hood.
Of course there are a lot of algorithms you probably won't need, but even so, just understanding how they work may make it easier for yourself to come up with efficient algorithms to solve other, unrelated, problems.
Well, someone has to write the libraries. While working at a mapping software company, I implemented Dijkstra's, as well as binary search trees, b-trees, n-ary trees, bk-trees and hidden markov models.
Besides, if all you want is a single 'well known' algorithm, and you also want the freedom to specialise it and optimise it if it becomes critical to performance, including a whole library seems like a poor choice.
We use a home grown implementation of a p-random number generator from Knuth SemiNumeric as an aid in some statistical processing
In my previous workplace, which was an EDA company, we implemented versions of Prim and Dijsktra's algorithms, disjoint set data structures, A* search and more. All of these had real world significance. I believe this is dependent on problem domain - some domains are more algorithm-intensive and some less so.
Having said that, there is a fine line to walk - I see no business reason for re-implementing STL or Java Generics. In many cases, a standard library is better than "inventing a wheel". The more you are near your core application, the more it may be necessary to implement a textbook algorithm or data structure.
If you never work with performance-critical code, consider yourself lucky. However, I consider this scenario unrealistic. Performance problems could occur anywhere. And then it's necessary to know how to fix that problem. Obviously, merely knowing a few algorithm names isn't enough here – unless you want to implement them all and try them out one after the other.
No, knowing (at least some of) the inner workings of different algorithms is important for gauging their strengths and weaknesses and for analyzing how they would handle your situation.
Obviously, if there's a library already implementing exactly what you need, you're incredibly lucky. But let's face it, even if there is such a library, using it is often not completely straightforward (at the very least, interfaces and data representation often have to be adapted) so it's still good to know what to expect.
A* for a pac man clone. It took me weeks to really get but to this day I consider it a thing of beauty.
I've had to implement some of the classical algorithms from numerical analysis. It was easier to write my own than to connect to an existing library. Also, I've had to write variations on classical algorithms because the textbook case didn't fit my application.
For classical data structures, I nearly always use the standard libraries, such as STL for C++. The one time recently when I thought STL didn't have the structure I needed (a heap) I rolled my own, only to have someone point out almost immediately that I didn't need to do that.
Classical algorithms I have used in actual work:
A topological sort
A red-black tree (although I will
confess that I only had to implement
insertions for that application and
it only got used in a prototype).
This got used to implement an
'ordered dict' type structure in
Python.
A priority queue
State machines of various sorts
Probably one or two others I can't remember.
As to the second part of the question:
An understanding of how the algorithms work, their complexity and semantics gets used on a fairly regular basis. They also inform the design of systems. Occasionally one has to do things involving parsing or protocol handling, or some computation that's slightly clever. Having a working knowledge of what the algorithms do, how they work, how expensive they are and where one might find them lying around in library code goes a long way to knowing how to avoid reinventing the wheel poorly.
I use the Levenshtein distance algorithm to help implement a 'Did you mean [suggested word]?' feature in our website search.
Works quite well when combined with our 'tagging' system, which allows us to associate extra words (other than those in title/description/etc) with items in the database. \
It's not perfect by any means, but it's way better than most corporate site searches, if I don't say so myself ; )
Classical algorithms are usually associated with something glamorous, like games, or Web search, or scientific computation. However, I had to use some of the classical algorithms for a mere enterprise application.
I was building a metadata migration tool, and I had to use topological sort for dependency resolution, various forms of graph traversals for queries on metadata, and a modified variation of Tarjan's union-find datastructure to partition forest-like structured metadata to trees.
That was a really satisfying experience. Most of those algorithms were implemented before, but their implementations lacked something that I would need for my task. That's why It's important to understand their internals.