How to implement a DFS to the Reddit PRAW? - depth-first-search

I would like to implement a DFS to get a parent and child conversation in the Reddit API.
I'm totally lost on how to do this. Any pointers?

PRAW's list() method does a breadth-first traversal over the comments. https://github.com/praw-dev/praw/blob/5ee4b1820c2591117e32be45778372e7c03a5f56/praw/models/comment_forest.py#L83
If you want to make it depth-first you can do everything the same, but swap out:
queue.extend(comment.replies)
with:
queue[0:0] = comment.replies
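A minimal sketch of that swap, using a stand-in Comment class rather than PRAW itself (the class, its fields, and the traverse function are assumptions for illustration; only the queue handling mirrors the linked list() method):

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    """Hypothetical stand-in for a comment object with a `replies` list."""
    body: str
    replies: list = field(default_factory=list)


def traverse(top_level, depth_first=True):
    """Return comment bodies in depth-first or breadth-first order."""
    order = []
    queue = list(top_level)
    while queue:
        comment = queue.pop(0)
        order.append(comment.body)
        if depth_first:
            # Replies go to the FRONT, so they are visited before siblings.
            queue[0:0] = comment.replies
        else:
            # Replies go to the BACK, so siblings are visited first.
            queue.extend(comment.replies)
    return order


thread = [
    Comment("a", [Comment("a1", [Comment("a1x")]), Comment("a2")]),
    Comment("b", [Comment("b1")]),
]
print(traverse(thread, depth_first=True))   # parent followed by its whole subtree
print(traverse(thread, depth_first=False))  # level by level
```

In the depth-first order each reply appears directly under the conversation it belongs to, which is what a parent and child view needs.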

Related

Why do we make clone for some nodes in Suffix Automata Algorithm?

I have been studying the Suffix Automaton string matching algorithm for a few days. I watched these videos and read documents, but I really can't see why we need to make a new node (under a special condition) by cloning an existing one. I know how it works now, but I am eager to learn the reason behind it. What would be the problem if we kept the previous nodes? For example, in the picture below we have a new node (red circle) for the 'b' character. Can someone explain it to me? Much appreciated.
There's no difference for your test case.
Consider another test case: abbcbb. Which node should the string bb belong to?
So cloning a node is necessary to guarantee that the node corresponding to each substring is unique.
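One way to see this on abbcbb is a brute-force check, assuming the usual definition in which a state groups the substrings that share the same set of end positions (the endpos helper below is hypothetical and is not part of any automaton implementation):

```python
def endpos(text, sub):
    """Return the 1-based end positions of every occurrence of `sub` in `text`."""
    return {
        start + len(sub)
        for start in range(len(text) - len(sub) + 1)
        if text.startswith(sub, start)
    }


# Before the last character is appended, "abb" and "bb" end in the same places,
# so a single state can represent both.
print(endpos("abbcb", "abb"), endpos("abbcb", "bb"))

# After appending the final 'b', "bb" gains a new occurrence that "abb" lacks,
# so the two strings can no longer share one state.
print(endpos("abbcbb", "abb"), endpos("abbcbb", "bb"))
```

The clone takes over the shorter strings (here bb) with the enlarged end-position set, while the original node keeps the longer ones (here abb).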

Rope tree that returns a node that can be used to find the position and whose parent is a token

I am trying to write a compiler/code editor.
To speed up the process I want a red-black tree that returns a node which I can then use to get the strings under it and its position value, and whose parent node I can use as a place to store a token (such as alphanumeric_word or left_parenthesis).
I am having trouble finding the best way to go about this.
I basically want something that can do the following:
tree.insert("01234567890123456789",0);
node = tree.at(10);
tree.insert("string",5);
node.index(); //should be 10+length("string")
node.value(); //should be '0'
node.tokenPtr.value; //should point to a token with the value of NUMBER
I am looking for the simplest implementation of such a tree that I could modify since these can be frustrating to build and debug from scratch.
The following code is sort of what I am looking for (it has parent nodes), but it lacks an indexing feature for index lookup. This is needed because I want to create a map that uses the node as its key and node.index() as its sorting value so that I don't have to update the keys in that map.
[[archive.gamedev.net/archive/reference/programming/features/TStorage/page2.html]]
I have tried to look at SGI's rope implementation, but the code is overwhelming and difficult to understand.
This tutorial seems to be helpful; however, it also doesn't provide a doubly linked tree, which I think could be used to find the index of a node:
[[eternallyconfuzzled.com/tuts/datastructures/jsw_tut_rbtree.aspx]]
Update:
I have found an implementation that has a parent node, however it still lacks an index count property:
[[web.mit.edu/~emin/Desktop/ref_to_emin/www.old/source_code/red_black_tree/index.html]]
I have found one solution and another that might work.
You have to use the sgi stl rope mutable_begin()+(index) iterator.
There is also this function, however I am still having trouble analyzing the sgi rope code to see what it does:
mutable_reference_at(index)

Print a singly-linked list backwards, in constant space and linear time

I heard an interview question:
"Print a singly-linked list backwards,
in constant space and linear time."
My solution was to reverse the linkedlist in place and then print it like that. Is there another solution that is nondestructive?
You've already figured out most of the answer: reverse the linked list in place, and traverse the list back to the beginning to print it. To keep it from being (permanently) destructive, reverse the linked list in place again as you're traversing it back to the beginning and printing it.
Note, however, that this only works if you either only have a single thread of execution, or make the whole traversal a critical section so only one thread does it at a time (i.e., a second thread can never play with the list in the middle of the traversal).
If you reverse it again after printing it will no longer be destructive, since the original order is restored.
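The reverse, print, and reverse-back idea can be sketched as follows. The Node class and the out callback are assumptions made for the example; the second pass prints and restores the links in the same walk, so the list ends up unchanged:

```python
class Node:
    """Minimal singly-linked list node (hypothetical, for illustration)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def print_backwards(head, out=print):
    """Emit values last-to-first in O(n) time and O(1) extra space."""
    # Pass 1: reverse the list in place; `prev` ends up at the old tail.
    prev, node = None, head
    while node is not None:
        nxt = node.next
        node.next = prev
        prev, node = node, nxt

    # Pass 2: walk from the old tail, emitting each value and
    # reversing the links again so the original order is restored.
    node, prev = prev, None
    while node is not None:
        out(node.value)
        nxt = node.next
        node.next = prev
        prev, node = node, nxt
    return prev  # the original head (None for an empty list)


head = Node(1, Node(2, Node(3)))
print_backwards(head)
```

As the answer above notes, the list is temporarily in a reversed state, so this is only safe if no other thread touches it during the call.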
You could use a recursive call down the linked list chain with a reference to what you wish to write to. Each node would use the child node's print function while passing the reference before printing itself.
That way each node in the list would pass down, until the last one couldn't and would go straight to the write, then each one back up the chain would write after the last all the way back up to the front.
Edit
This actually doesn't fit the specs because of the linear space on stack. If you had something outside to walk the functions and a method of writing to the front of a string the base logic can still work though.
Okay, this could be an interview question, but it is actually a question from Weiss's algorithms book. The question clearly states that we cannot use recursion (something the interviewer will hide and reveal later on), as recursion will not use constant space; recursion will mostly become a major point of discussion going forward. The solution is to reverse, print, and reverse back.
Here's an unconventional approach: Change your console to right-to-left reading order and then print the list in normal order. They will appear in backward order. Having to visit the actual data in reverse order doesn't sound like a constraint to the problem.

Algorithm for Tree Traversal

Update:
I found more of an example of what I'm trying to pull off: Managing Hierarchical Data in MySQL. I want to do that but in JavaScript, because I am building an app that takes in comments that are in a hierarchical structure, specifically from reddit.com. If you have the Pretty JSON extension in your Chrome web browser, go to reddit, click on a thread's comments, and then add .json to the URL to see what I am parsing.
I get the JSON data just fine; it's just parsing through the comments and adding the appropriate HTML to show that it's nested.
Ideas for solutions?
OLD question:
I am working on a program and I have come to a part that I need to figure out the logic before I write the code.
I am taking in data that is in a tree format but with the possibility of several children for each parent node, and the only trees I can seem to find data on are trees with weights or trees where each node has at most two child nodes. So I'm trying to figure out the algorithm to evaluate each node of a tree like this:
startingParent[15] // [# of children]
    child1[0]
    child2[5]
        child2ch1[4]
        ...
        child2ch5[7]
    child3[32]
    ...
    child15[4]
Now when I try to write out how my algorithm would work I end up writing nested for/while loops, one loop for each level of the tree, which doesn't work for dynamic data and trees of unknown height with an unknown number of children per node. I know that at some point I learned how to traverse a tree like this, but it's completely escaping me right now. Anyone know how this is done in terms of loops?
If you're not going to use recursion, you need an auxiliary data structure. A queue will give you a breadth-first traversal, whereas a stack will give you a depth-first traversal. Either way it looks roughly like this:
structure <- new stack (or queue)
push root onto structure
while structure is not empty
    node <- pop top off of structure
    visit(node)
    for each child of node
        push child onto structure
loop
Wikipedia References
Queue
Stack
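The pseudocode above might look like this in Python. The tree is assumed to be a dict mapping each node name to a list of its children, and the function and parameter names are made up for the example:

```python
from collections import deque


def traverse(tree, root, use_stack=True):
    """Visit every node reachable from `root` without recursion.

    use_stack=True  -> depth-first (last in, first out)
    use_stack=False -> breadth-first (first in, first out)
    """
    structure = deque([root])
    visited = []
    while structure:
        node = structure.pop() if use_stack else structure.popleft()
        visited.append(node)  # stands in for visit(node)
        children = tree.get(node, [])
        # With a stack, push children in reverse so the first child pops first.
        structure.extend(reversed(children) if use_stack else children)
    return visited


tree = {
    "startingParent": ["child1", "child2", "child3"],
    "child2": ["child2ch1", "child2ch2"],
}
print(traverse(tree, "startingParent", use_stack=True))
print(traverse(tree, "startingParent", use_stack=False))
```

The loop body is the same in both cases; only the end of the structure that nodes are taken from changes, which is why the height of the tree and the number of children per node never need to be known in advance.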
Use recursion, not loops.
Breadth first search
Depth first search
Those should help you get started with what you're trying to accomplish
Just use recursion like
def travel(node):
    for child in node.childs:
        # Do something
        travel(child)
The simplest code for most tree traversal is usually recursive. For a multiway tree like yours, it's usually easiest to have a loop that looks at each pointer to a child, and calls itself with that node as the argument, for all the child nodes.
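For the nested-HTML part of the question, a recursive sketch could look like the following. It is written in Python for brevity and assumes each comment has already been reduced to a dict with a "body" string and an optional "replies" list; those key names are assumptions, not the layout of reddit's JSON:

```python
from html import escape


def render(comments):
    """Return nested <ul>/<li> markup for a list of comment dicts."""
    if not comments:
        return ""
    items = []
    for comment in comments:
        # Each comment renders its own text, then recurses into its replies.
        nested = render(comment.get("replies", []))
        items.append("<li>" + escape(comment["body"]) + nested + "</li>")
    return "<ul>" + "".join(items) + "</ul>"


thread = [
    {"body": "first", "replies": [{"body": "reply to first"}]},
    {"body": "second"},
]
print(render(thread))
```

The nesting depth of the markup follows the nesting depth of the data, so no per-level loop is needed.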

Need some help understanding this problem about maximizing graph connectivity

I was wondering if someone could help me understand this problem. I prepared a small diagram because it is much easier to explain it visually.
alt text http://img179.imageshack.us/img179/4315/pon.jpg
Problem I am trying to solve:
1. Constructing the dependency graph
Given the connectivity of the graph and a metric that determines how well a node depends on the other, order the dependencies. For instance, I could put in a few rules saying that
node 3 depends on node 4
node 2 depends on node 3
node 3 depends on node 5
But because the final rule is not "valuable" (again based on the same metric), I will not add the rule to my system.
2. Execute the request order
Once I have built a dependency graph, execute the list in an order that maximizes the final connectivity. I am not sure if this is really a problem, but I have a feeling that more than one valid order might exist, in which case the best order has to be chosen.
First and foremost, I am wondering if I constructed the problem correctly and if I should be aware of any corner cases. Secondly, is there a closely related algorithm that I can look at? Currently, I am thinking of something like Feedback Arc Set or the Secretary Problem but I am a little confused at the moment. Any suggestions?
PS: I am a little confused about the problem myself so please don't flame on me for that. If any clarifications are needed, I will try to update the question.
It looks like you are trying to determine an ordering on requests you send to nodes with dependencies (or "partial ordering" for google) between nodes.
If you google "partial order dependency graph", you get a link to here, which should give you enough information to figure out a good solution.
In general, you want to sort the nodes in such a way that nodes come after their dependencies; AKA topological sort.
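As a small illustration, Python 3.9+ ships a topological sorter in the standard library. The graph below encodes the two rules the question keeps (node 3 depends on node 4, node 2 depends on node 3); mapping each node to the set of nodes it depends on is the input format graphlib expects:

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps to the set of nodes it depends on.
depends_on = {
    3: {4},
    2: {3},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)

# A cycle makes the ordering impossible and raises CycleError.
try:
    list(TopologicalSorter({1: {2}, 2: {1}}).static_order())
except CycleError:
    print("cyclic dependency")
```

When several orders are valid, this only returns one of them; picking the best order under the question's metric would need extra logic on top.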
I'm a bit confused by your ordering constraints vs. the graphs that you picture: nothing matches up. That said, it sounds like you have soft ordering constraints (A should come before B, but doesn't have to) with costs for violating the constraint. An optimal algorithm for scheduling that is NP-hard, but I bet you could get a pretty good schedule using a DFS biased towards large-weight edges, then deleting all the back edges.
If you know in advance the dependencies of each node, you can easily build layers.
It's amusing, but I faced the very same problem when organizing... the compilation of the different modules of my application :)
The idea is simple:
def buildLayers(nodes):
    """Group nodes into layers; each layer depends only on earlier layers."""
    layers = []
    n = nodes[:]  # copy the list so the caller's list is left untouched
    while n:
        layer = _buildRec(layers, n)
        if not layer:
            raise RuntimeError('Cyclic Dependency')
        for l in layer:
            n.remove(l)
        layers.append(layer)
    return layers

def _buildRec(layers, nodes):
    """Build the next layer by selecting nodes whose dependencies
    already appear in `layers`
    """
    placed = [node for layer in layers for node in layer]  # flatten the layers
    result = []
    for n in nodes:
        # assumes n.dependencies is an iterable of node objects
        if all(dep in placed for dep in n.dependencies):
            result.append(n)
    return result
Then you can pop the layers one at a time, and each time you'll be able to send the request to each of the nodes of this layer in parallel.
If you keep a set of the already selected nodes and the dependencies are also represented as a set the check is more efficient. Other implementations would use event propagations to avoid all those nested loops...
Notice that in the worst case you have O(n^3), but I only had some thirty components and they are not THAT related :p
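A sketch of the set-based variant mentioned above, assuming the dependencies are given as a dict mapping each node to the set of nodes it depends on (the function name and input format are choices made for this example):

```python
def build_layers(dependencies):
    """Split nodes into layers where each layer only needs earlier layers."""
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    done = set()  # every node already placed in some layer
    layers = []
    while remaining:
        # A node is ready when all of its dependencies are already placed.
        layer = {node for node, deps in remaining.items() if deps <= done}
        if not layer:
            raise RuntimeError("Cyclic Dependency")
        layers.append(layer)
        done |= layer
        for node in layer:
            del remaining[node]
    return layers


modules = {
    "app": {"lib", "util"},
    "lib": {"util"},
    "util": set(),
    "docs": set(),
}
print(build_layers(modules))
```

The subset test `deps <= done` replaces the scan over the flattened layers, so each readiness check costs time proportional to the number of dependencies of that node rather than to the number of nodes already placed.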
