Do you have any idea how to apply a random forest to the nodes of a hierarchical ascending classification (CAH) dendrogram?
Related
I'm working on a problem that requires removing transitive nodes in a graph. More specifically, I need to reduce the number of edges by removing the nodes on the paths between two sets of nodes.
A picture says a thousand words, so here is what I'm trying to do:
The graph contains 3 types of nodes (Ai, Bi, Ci). I'd like to reduce the graph by removing all the nodes Bi on a path between nodes Ai and Ci, whilst preserving reachability between the Ai,Ci nodes.
This is a tripartite graph, indeed, and I'm wondering if there is an efficient algorithm that can reduce it as per the description shown in the attached picture.
If we let A denote the adjacency matrix between the As and Bs and B denote the adjacency matrix between the Bs and Cs, then the adjacency matrix of the resulting graph is the Boolean matrix product AB. In theory the fast matrix multiplication algorithms apply (if you have dense matrices), but in practice I doubt that they'll help much.
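A minimal sketch of that Boolean product in plain Python, assuming the matrices are given as lists of 0/1 rows. The function name `boolean_product` and the small example graph are made up for illustration and do not come from the question.

```python
def boolean_product(a, b):
    """Boolean matrix product: result[i][k] is True iff some j has a[i][j] and b[j][k]."""
    n_b = len(b)
    n_c = len(b[0]) if b else 0
    return [
        [any(row[j] and b[j][k] for j in range(n_b)) for k in range(n_c)]
        for row in a
    ]


# Hypothetical example: 2 A-nodes, 3 B-nodes, 2 C-nodes.
a_to_b = [
    [1, 1, 0],  # A0 -> B0, B1
    [0, 0, 1],  # A1 -> B2
]
b_to_c = [
    [1, 0],  # B0 -> C0
    [1, 0],  # B1 -> C0
    [0, 1],  # B2 -> C1
]
a_to_c = boolean_product(a_to_b, b_to_c)
print(a_to_c)  # [[True, False], [False, True]]
```

For a sparse graph, the same idea is probably cheaper with adjacency sets: for each Ai, take the union of the C-neighbors of its B-neighbors.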
I have an algorithm that classifies websites into a hierarchical tree, but I have trouble testing it. Say the result is supposed to be one node, but the algorithm gives me a different node: how do I measure the difference between the two nodes?
More specifically, I have an algorithm that assigns a webpage to a hierarchical category. For example, https://stackoverflow.com/ is under computer>development>development Q&A. I am trying to test how well the algorithm works, so I need a way to compare two nodes of a built tree.
I have some ideas now: 1. Calculate the distance between the two nodes. 2. Get the paths from the root to the two nodes as two lists and compare the lists.
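The two ideas can be combined: given the root-to-node paths as lists, the tree distance is the number of steps left on each path after their common prefix. The sketch below assumes each category is stored as such a list; the function name `tree_distance` and the category "tools" are hypothetical.

```python
def tree_distance(path_a, path_b):
    """Number of edges between two nodes, given their root-to-node paths."""
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1
    # Steps up from one node to the deepest shared ancestor, then down to the other.
    return (len(path_a) - common) + (len(path_b) - common)


expected = ["computer", "development", "development Q&A"]
predicted = ["computer", "development", "tools"]  # hypothetical wrong answer
print(tree_distance(expected, predicted))  # 2
```

A distance of 0 means an exact match. Whether a sibling category should count as a smaller error than a parent category is a design choice this measure does not settle.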
This version of Kruskal's algorithm represents the edges with an adjacency list.
How would I modify the pseudocode to use an adjacency matrix instead?
I was thinking we would need to use the weight of each edge (i,j), as long as it is not zero, and take i and j as its vertices. I may be a bit confused by this pseudocode for Kruskal's.
As pointed out by Henry, the pseudocode does not specify which concrete data structures to use. It just appears that the adjacency list representation of the graph is more convenient than the adjacency matrix representation in this case.
With an adjacency matrix, you simply have to scan every entry of the matrix to sort the edges of graph G on line 4. You do exactly the same thing when using the adjacency list representation.
In your case you may, for example, use a PriorityQueue to order the edges by non-decreasing weight, discarding the entries for vertex pairs that are not connected. You can then iterate over this data structure in the for-loop on line 5.
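A minimal sketch of this approach, assuming an undirected graph stored as a symmetric weight matrix in which 0 means "no edge". It uses a sorted list instead of a priority queue and a simple union-find; the function name `kruskal_from_matrix` and the example weights are made up for illustration.

```python
def kruskal_from_matrix(w):
    """Return spanning tree (or forest) edges as (weight, i, j) tuples.

    Assumes w is symmetric and w[i][j] == 0 means there is no edge.
    """
    n = len(w)
    # Scan the upper triangle once so each undirected edge is collected once.
    edges = sorted(
        (w[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if w[i][j] != 0
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            tree.append((weight, i, j))
    return tree


w = [
    [0, 1, 4, 0],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [0, 6, 3, 0],
]
mst = kruskal_from_matrix(w)
print(mst)  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```

Scanning the matrix costs O(V^2) regardless of how many edges exist, which is the main practical difference from the adjacency list version.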
I am trying to find a suitable algorithm to solve this: suppose I have some nodes of a directed graph. Each node has at most one parent, and may have none. Suppose this notation for a node: (id, id_parent). Some nodes will be (id_i, NULL), while other nodes (id_j, id_i) will be "sons" of id_i. Given an array of these nodes in arbitrary order, I want to sort them so that each parent is followed by its first son, then that son's descendants, then its next son, and so on.
Example: nodes (1, NULL), (2,NULL), (3,1), (4,3), (5,2), (6,3)
The sorted array will be: (1,NULL), (3,1), (4,3), (6,3), (2,NULL), (5,2). A kind of depth-first tree exploration.
Which algorithm would be suitable for achieving this? Thanks
If the graph has no cycles, it is a DAG, and you are looking for a topological sort.
If it has cycles, there is no such ordering, since a cycle contains a node whose son is also its ancestor.
EDIT:
If the graph is a forest (a disjoint union of trees), then a simple DFS from the sources will do. Just construct the graph (sorting takes O(n log n) if the input is not already sorted, or O(n) with radix sort), find the list of sources, and run the DFS from each source; each time you visit a node, store it in an output array. Iterate while there are undiscovered vertices.
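A minimal sketch of that DFS, assuming the input really is a forest (at most one parent per node, no cycles) and that NULL is represented as `None`. It groups children in a dictionary rather than sorting; the function name `depth_first_order` is made up for illustration.

```python
from collections import defaultdict


def depth_first_order(nodes):
    """Order (id, parent_id) pairs so each node is followed by its whole subtree.

    Assumes a forest: at most one parent per node and no cycles.
    """
    children = defaultdict(list)
    roots = []
    for node_id, parent_id in nodes:
        if parent_id is None:
            roots.append(node_id)
        else:
            children[parent_id].append(node_id)
    parent_of = dict(nodes)

    ordered = []
    stack = list(reversed(roots))
    while stack:
        node_id = stack.pop()
        ordered.append((node_id, parent_of[node_id]))
        # Push children reversed so they are visited in input order.
        stack.extend(reversed(children[node_id]))
    return ordered


nodes = [(1, None), (2, None), (3, 1), (4, 3), (5, 2), (6, 3)]
result = depth_first_order(nodes)
print(result)
# [(1, None), (3, 1), (4, 3), (6, 3), (2, None), (5, 2)]
```

The explicit stack avoids recursion limits on deep trees. Nodes whose parent id never appears in the input are silently dropped by this sketch.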
I'm trying to implement the following graph reduction algorithm in
The graph is an undirected weighted graph
I want to strip away all nodes with only two neighbors
and update the weights
Have a look at the following illustration:
Algorithm reduce graph http://public.kungi.org/graph-reduction.png
The algorithm shall transform the upper graph into the lower one. Eliminate node 2 and update the weight of the edge to: w(1-3) = w(1-2)+w(2-3)
Since I have a very large graph I'm doing this with MapReduce.
My question is how to represent the graph in HBase. I thought about building an adjacency list structure in HBase like this:
Column families: nodes, neighbors
1 -> 2, 6, 7
...
Is there a nicer way to do this?
Adjacency lists are the most frequently recommended structure.
You could use each node ID as the row ID and neighbor IDs as column qualifiers, with the weights as values.
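As an in-memory stand-in for that layout, a nested dictionary can play the role of the table: the outer key is the row ID (node), the inner keys are the column qualifiers (neighbors), and the values are the weights. The sketch below runs the reduction on such a structure in a single process. It does not use the HBase API or MapReduce, and the function name `contract_degree_two`, the example weights, and the choice to skip nodes whose two neighbors are already adjacent are all assumptions.

```python
def contract_degree_two(graph):
    """Remove nodes with exactly two neighbors, linking those neighbors directly.

    `graph` maps node -> {neighbor: weight} for an undirected graph
    and is modified in place.
    """
    for node in list(graph):
        nbrs = graph[node]
        if len(nbrs) != 2:
            continue
        (u, wu), (v, wv) = nbrs.items()
        if v in graph[u]:
            # Assumption: leave the node alone rather than create a parallel edge.
            continue
        del graph[u][node], graph[v][node], graph[node]
        graph[u][v] = graph[v][u] = wu + wv  # w(u-v) = w(u-node) + w(node-v)
    return graph


# Hypothetical weights; node 1 has neighbors 2, 6, 7 as in the layout above.
graph = {
    1: {2: 4, 6: 1, 7: 1},
    2: {1: 4, 3: 6},
    3: {2: 6},
    6: {1: 1},
    7: {1: 1},
}
contract_degree_two(graph)
print(graph[1])  # {6: 1, 7: 1, 3: 10}
```

Contracting a node leaves the degree of its two neighbors unchanged, so one pass over the nodes is enough here. A distributed version would need to coordinate updates when adjacent degree-2 nodes are handled by different workers.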