Find the longest chaining list? - algorithm

I have a lot of objects. Some objects can produce a chain, and all objects have conditional chain continuations. An example will explain better. Let's say I have these two objects:
[
  {
    "name": "ObjectA",
    "produce": "ChainA",
    "continuations": [
      { "ChainB": "ChainC" },
      { "ChainC": "ChainC" }
    ]
  },
  {
    "name": "ObjectB",
    "produce": null,
    "continuations": [
      { "ChainA": "ChainB" }
    ]
  }
]
I need to find this list:
ChainA (ObjectA) => ChainB (ObjectB) => ChainC (ObjectA) => ChainC (ObjectA)
I just can't find a better way than looping over and over again. Does anyone have an idea?
Thanks

This problem is NP-hard: an efficient algorithm for it would also solve the longest path problem (https://en.wikipedia.org/wiki/Longest_path_problem). So an exact, efficient solution is pretty hopeless.
Here is the reduction. Given a directed graph G, label the edges as objects and label the vertices as chains. Each "object" (edge) produces the "chain" that is the vertex it goes to. Each "object" (edge) has a single continuation, from the source "chain" (vertex) to the target "chain" (vertex).
Now apply the hypothetical algorithm. The longest chaining list will be the edges of the desired longest path for G.
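Given the hardness result, brute-force search is the honest fallback for small inputs. Below is a minimal Python sketch (function and variable names like `longest_chain` are my own, not from the question) that flattens each object's continuations into edges and runs a DFS that uses each continuation at most once:

```python
objects = [
    {"name": "ObjectA", "produce": "ChainA",
     "continuations": [{"ChainB": "ChainC"}, {"ChainC": "ChainC"}]},
    {"name": "ObjectB", "produce": None,
     "continuations": [{"ChainA": "ChainB"}]},
]

# Flatten every continuation into a (source, target, owner) edge.
edges = []
for obj in objects:
    for cont in obj["continuations"]:
        for src, dst in cont.items():
            edges.append((src, dst, obj["name"]))

def longest_chain(current, used):
    """DFS: extend the chain with any unused continuation edge."""
    best = [current]
    for i, (src, dst, owner) in enumerate(edges):
        if src == current and i not in used:
            cand = [current] + longest_chain(dst, used | {i})
            if len(cand) > len(best):
                best = cand
    return best

# Try every produced chain as a starting point and keep the longest result.
best = max((longest_chain(o["produce"], set())
            for o in objects if o["produce"]), key=len)
print(best)  # ['ChainA', 'ChainB', 'ChainC', 'ChainC']
```

This is exponential in the worst case, which is unavoidable in general per the reduction above, but it is fine when the number of continuations is small.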

Related

Implementation of A star Algorithm problem

I am implementing an indoor map, using the A* algorithm for pathfinding. I came across a library on GitHub and adapted it to my floor plan. It works, but now I'm trying to study how the author implemented A* and compare it to the algorithm's pseudocode, and I can't seem to understand one part of the code.
Here is a snippet of the A star algorithm code below:
while (!openList.isEmpty()) {
    Node current = openList.poll();
    visitedSet.add(current);
    List<Node> neighbors = getAdj(current);
    for (Node neighbor : neighbors) {
        if (!visitedSet.contains(neighbor)) {
            int g = neighbor.calculateG(startNode);
            int h = neighbor.calculateH(endNode);
            int tempf = g + h;
            if (openList.contains(neighbor)) {
                int f = neighbor.getF();
                if (f > tempf) {
                    openList.remove(neighbor);
                    neighbor.setF(tempf);
                    neighbor.setH(h);
                    neighbor.setG(g);
                    neighbor.setParent(current);
                    openList.add(neighbor);
                }
            } else {
                neighbor.setF(tempf);
                neighbor.setH(h);
                neighbor.setG(g);
                neighbor.setParent(current);
                openList.add(neighbor);
            }
        }
        if (neighbor.equals(endNode))
            return true;
    }
}
return false;
I understand every part of it except for this part right here
if (openList.contains(neighbor)) {
    int f = neighbor.getF();
    if (f > tempf) {
        openList.remove(neighbor);
        neighbor.setF(tempf);
        neighbor.setH(h);
        neighbor.setG(g);
        neighbor.setParent(current);
        openList.add(neighbor);
    }
} else {
    neighbor.setF(tempf);
    neighbor.setH(h);
    neighbor.setG(g);
    neighbor.setParent(current);
    openList.add(neighbor);
}
My own explanation is this: if openList contains neighbor, get the neighbor's stored F value (I don't know whether that F value is the same as tempf). Then compare the stored F with tempf (again, I don't know the difference between the two). If the stored F is greater, remove the neighbor from the open list, set its F to tempf (which I don't understand), and set its parent to current (I also don't know why the current node should become the parent). I'm kind of confused here; will someone explain what I'm missing? P.S. I'm a newbie in programming and in algorithms, but I'm really trying, so please be kind. Thank you.
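For what it's worth, the branch in question is the standard "relaxation" step of A*: a node already in the open list is re-queued only if a newly discovered route gives it a lower f, in which case its parent is updated so the better route can be reconstructed later. Here is a minimal Python sketch of the same idea, using a heap-based open list and a g cost accumulated along the path (which differs slightly from the snippet's `calculateG(startNode)`); the names `astar` and `grid_neighbors` are illustrative, not from the library:

```python
import heapq

def astar(start, goal, neighbors, h):
    """Generic A*. neighbors(n) yields (neighbor, step_cost) pairs."""
    g = {start: 0}                    # best known cost from start
    parent = {start: None}
    open_heap = [(h(start), start)]   # priority queue ordered by f = g + h
    closed = set()
    while open_heap:
        f, current = heapq.heappop(open_heap)
        if current == goal:           # reconstruct the path start -> goal
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        if current in closed:
            continue                  # stale heap entry, already expanded
        closed.add(current)
        for nb, cost in neighbors(current):
            tentative_g = g[current] + cost
            # The step the snippet implements: only (re)queue the neighbor
            # when the new route yields a lower cost than the stored one,
            # and record `current` as its parent for path reconstruction.
            if nb not in g or tentative_g < g[nb]:
                g[nb] = tentative_g
                parent[nb] = current
                heapq.heappush(open_heap, (tentative_g + h(nb), nb))
    return None

# Tiny demo: 3x3 grid, 4-connected moves, Manhattan-distance heuristic.
def grid_neighbors(p):
    x, y = p
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 3 and 0 <= ny < 3:
            yield (nx, ny), 1

path = astar((0, 0), (2, 2), grid_neighbors,
             lambda p: abs(2 - p[0]) + abs(2 - p[1]))
print(len(path) - 1)  # 4 moves on a 3x3 grid
```

The key point for the asker: F is the node's previously recorded estimate, tempf is the estimate via the route just found, and "set parent to current" is what lets you walk the pointers backwards to recover the actual path.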

Algorithm to find which hashes in a list match another hash the fastest? (this is complicated to explain in the title)

Explaining in words when two hashes match is complicated, so see the example.
Hash patterns are stored in a list like this (I'm using JavaScript notation):
var pattern = [
    { type: 'circle', radius: function(n) { return n > 10; } },
    { type: 'circle', radius: function(n) { return n == 2; } },
    { type: 'circle', color: 'blue', radius: 5 }
    // ... etc
];
var test = {type:'circle', radius:12};
test should match pattern 0 because pattern[0].type == test.type && pattern[0].radius(test.radius) == true.
So, trying with words: a hash matches a pattern if each of its values either equals the pattern's value or makes the pattern's function return true when applied to it.
My question is: is there an algorithm to find all patterns that match certain hash without testing all of them?
Consider a dynamic, recursive, decision tree structure like the following.
decision: [
    field: 'type',
    values: [
        'circle': <another decision structure>,
        'square': 0, // meaning it matched, return this value
        'triangle': <another decision structure>
    ],
    functions: [
        function(n) { return n < 12; }: <another decision structure>,
        function(n) { return n > 12; }: <another decision structure>
    ],
    missing: <another decision structure>
]
Algorithm on d (a decision structure):
if test has field d.field
    if test[d.field] in d.values
        if d.values[test[d.field]] is a decision structure
            recurse using the new decision structure
        else
            yield d.values[test[d.field]]
    foreach f => v in d.functions
        if f(test[d.field])
            if v is a decision structure
                recurse using the new decision structure
            else
                yield v
else if d.missing is present
    if d.missing is a decision structure
        recurse using the new decision structure
    else
        yield d.missing
else
    no match
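For comparison, the brute-force baseline that the decision tree is meant to beat fits in a few lines of Python (`matches` is a hypothetical helper name, and the patterns mirror the question's example):

```python
def matches(pattern, test):
    """Brute-force check: every pattern field must equal the test value,
    or be a predicate that returns True for it."""
    for field, want in pattern.items():
        if field not in test:
            return False
        if callable(want):
            if not want(test[field]):
                return False
        elif want != test[field]:
            return False
    return True

patterns = [
    {"type": "circle", "radius": lambda n: n > 10},
    {"type": "circle", "radius": lambda n: n == 2},
    {"type": "circle", "color": "blue", "radius": 5},
]
test = {"type": "circle", "radius": 12}
print([i for i, p in enumerate(patterns) if matches(p, test)])  # [0]
```

The decision-tree approach shares the per-field tests across patterns, so each field of `test` is examined once per tree level instead of once per pattern.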

Basic prefix tree implementation question

I've implemented a basic prefix tree or "trie". The trie consists of nodes like this:
// pseudo-code
struct node {
    char c;
    collection<node> childnodes;
};
Say I add the following words to my trie: "Apple", "Ark" and "Cat". Now when I look-up prefixes like "Ap" and "Ca" my trie's "bool containsPrefix(string prefix)" method will correctly return true.
Now I'm implementing the method "bool containsWholeWord(string word)" that will return true for "Cat" and "Ark" but false for "App" (in the above example).
Is it common for nodes in a trie to have some sort of "endOfWord" flag? This would help determine if the string being looked-up was actually a whole word entered into the trie and not just a prefix.
Cheers!
The end of a key is usually indicated by a leaf node. Either:
the child-node collection is empty; or
you have a branch, where one child ends the key and other children continue it.
Your design doesn't have a leaf/empty node; try indicating one with e.g. a null.
If you need to store both "App" and "Apple", but not "Appl", then yes, you need something like an endOfWord flag.
Alternatively, you could fit it into your design by (sometimes) having two nodes with the same character. So "Ap" would have two childnodes: a leaf node "p" and an internal node "p" with a child "l".
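The flag-based design is the common one. A minimal Python sketch (names such as `end_of_word` are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # char -> TrieNode
        self.end_of_word = False  # the "endOfWord" flag discussed above

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.end_of_word = True      # mark that a whole word ends here

    def _walk(self, s):
        """Follow s from the root; return the final node or None."""
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains_prefix(self, prefix):
        return self._walk(prefix) is not None

    def contains_whole_word(self, word):
        node = self._walk(word)
        return node is not None and node.end_of_word

t = Trie()
for w in ("Apple", "Ark", "Cat"):
    t.insert(w)
print(t.contains_prefix("Ap"), t.contains_whole_word("Ap"),
      t.contains_whole_word("Cat"))  # True False True
```

Both lookups walk the same path; only the final check differs, so the flag costs one bit per node and no extra time.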

Calculating paths in a graph

I have to write a method that builds a list of all the paths in a graph. My graph has only one start node and one finish node. Each node has a list of its children and another list of its parents. I have to produce a list containing all the paths (each of them in its own list).
Any suggestions?
It depends on whether it is acyclic or not. Clearly a cycle will result in infinitely many paths (once round the loop, twice round, three times round, etc.). If the graph is acyclic then you should be able to do a depth-first search (DFS) (http://en.wikipedia.org/wiki/Depth-first_search) and simply count the number of times you encounter the destination node.
First familiarize yourself with basic graph algorithms (try a textbook, or google). Figure out which one best suits the problem you are solving, and implement it. You may need to adapt the algorithm a little, but in general there are widely known algorithms for all basic graph problems.
If you have a GraphNode class that looks something like this:
public class GraphNode
{
    public IEnumerable<GraphNode> Children { get; set; }
    // ...
}
Then this should do the work:
public static class GraphPathFinder
{
    public static IEnumerable<IEnumerable<GraphNode>> FindAllPathsTo(this GraphNode startNode, GraphNode endNode)
    {
        List<IEnumerable<GraphNode>> results = new List<IEnumerable<GraphNode>>();
        Stack<GraphNode> currentPath = new Stack<GraphNode>();
        currentPath.Push(startNode);
        FindAllPathsRecursive(endNode, currentPath, results);
        return results;
    }

    private static void FindAllPathsRecursive(GraphNode endNode, Stack<GraphNode> currentPath, List<IEnumerable<GraphNode>> results)
    {
        // Reverse() because the stack enumerates top-first, which would
        // otherwise yield the path from end to start.
        if (currentPath.Peek() == endNode) results.Add(currentPath.Reverse().ToList());
        else
        {
            foreach (GraphNode node in currentPath.Peek().Children.Where(p => !currentPath.Contains(p)))
            {
                currentPath.Push(node);
                // Pass the shared results list so completed paths accumulate.
                FindAllPathsRecursive(endNode, currentPath, results);
                currentPath.Pop();
            }
        }
    }
}
It's a simple implementation of the DFS algorithm. No error checking, optimizations, thread-safety etc...
Also, if you are sure that your graph does not contain cycles, you may remove the Where clause in the foreach statement in the last method.
Hope this helped.
You could generate every possible combination of vertices (using combinatorics) and filter out the sequences that aren't actual paths (where consecutive vertices aren't joined by an edge, or the edge points in the wrong direction).
You can improve on this basic idea by having the code that generates the combinations check what remaining vertices are available from the current vertex.
This is all assuming you have acyclic graphs and wish to visit each vertex exactly once.
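The DFS idea from the answers above can be sketched compactly in Python; the function name `all_paths` and the adjacency-dict representation are illustrative, not from the question:

```python
def all_paths(graph, start, end, path=None):
    """Enumerate every simple path from start to end via DFS.
    graph maps each node to the list of its children."""
    path = (path or []) + [start]
    if start == end:
        yield path
        return
    for child in graph.get(start, []):
        if child not in path:   # skip nodes already on the path (cycles)
            yield from all_paths(graph, child, end, path)

# Small demo graph with one start node "s" and one finish node "t".
g = {"s": ["a", "b"], "a": ["t"], "b": ["a", "t"]}
print(sorted(all_paths(g, "s", "t")))
# [['s', 'a', 't'], ['s', 'b', 'a', 't'], ['s', 'b', 't']]
```

Note the number of simple paths can be exponential in the number of nodes, so any algorithm that lists them all is exponential in the worst case.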

How do I calculate tree edit distance? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I need to calculate the edit distance between trees. This paper describes an algorithm, but I can't make heads or tails out of it. Could you describe an applicable algorithm in a more approachable way? Pseudocode or code would both be helpful.
This Python library implements the Zhang-Shasha algorithm correctly: Zhang-Shasha: Tree edit distance in Python
It began as a direct port of the Java source listed in the currently accepted answer (the one with the tarball link), but that implementation is not correct and is nearly impossible to run at all.
I wrote an implementation (https://github.com/hoonto/jqgram.git) based on the existing PyGram Python code (https://github.com/Sycondaman/PyGram) for those of you who wish to use tree edit distance approximation using PQ-Gram algorithm in the browser and/or in Node.js.
The jqgram tree edit distance approximation module implements the PQ-Gram algorithm for both server-side and browser-side applications; O(n log n) time and O(n) space performant where n is the number of nodes. See the academic paper from which the algorithm comes: http://www.vldb2005.org/program/paper/wed/p301-augsten.pdf
The PQ-Gram approximation is much faster than computing the true edit distance via Zhang & Shasha, Klein, or Guha et al., whose exact algorithms all take at least O(n^2) time and are therefore often unsuitable.
Often in real-world applications it is not necessary to know the true edit distance, as long as a relative approximation of multiple trees against a known standard can be obtained. JavaScript, in the browser and now on the server with the advent of Node.js, deals frequently with tree structures, and end-user performance is usually critical in algorithm implementation and design; thus jqgram.
Example:
var jq = require("jqgram").jqgram;

var root1 = {
    "thelabel": "a",
    "thekids": [
        { "thelabel": "b",
          "thekids": [
              { "thelabel": "c" },
              { "thelabel": "d" }
          ]},
        { "thelabel": "e" },
        { "thelabel": "f" }
    ]
};

var root2 = {
    "name": "a",
    "kiddos": [
        { "name": "b",
          "kiddos": [
              { "name": "c" },
              { "name": "d" },
              { "name": "y" }
          ]},
        { "name": "e" },
        { "name": "x" }
    ]
};

jq.distance({
    root: root1,
    lfn: function(node) { return node.thelabel; },
    cfn: function(node) { return node.thekids; }
}, {
    root: root2,
    lfn: function(node) { return node.name; },
    cfn: function(node) { return node.kiddos; }
}, { p: 2, q: 3, depth: 10 },
function(result) {
    console.log(result.distance);
});
Note that the lfn and cfn parameters specify how each tree should determine the node label names and the children array for each tree root independently so that you can do funky things like comparing an object to a browser DOM for example. All you need to do is provide those functions along with each root and jqgram will do the rest, calling your lfn and cfn provided functions to build out the trees. So in that sense it is (in my opinion anyway) much easier to use than PyGram. Plus, it’s JavaScript, so use it client or server-side!
Now one approach you can use is to use jqgram or PyGram to get a few trees that are close and then go on to use a true edit distance algorithm against a smaller set of trees. Why spend all the computation on trees you can already easily determine are very distant, or vice versa? So you can use jqgram to narrow down choices too.
Here's some Java source code (gzipped tarball at the bottom) for a tree edit distance algorithm that might be useful to you.
The page includes references and some slides that go through the "Zhang and Shasha" algorithm step-by-step and other useful links to get you up to speed.
The code in the link has bugs. Steve Johnson and tim.tadh have provided working Python code. See Steve Johnson's comment for more details.
Here you can find Java implementations of tree edit distance algorithms:
Tree Edit Distance
In addition to Zhang & Shasha's algorithm of 1989, it includes implementations of more recent algorithms, including Klein 1998, Demaine et al. 2009, and the Robust Tree Edit Distance (RTED) algorithm by Pawlik & Augsten, 2011.
I made a simple Python wrapper (apted.py) for the APTED algorithm using jpype:
# To use, create a folder named lib next to apted.py, then put APTED.jar into it
import os, os.path, jpype

distancePackage = None
utilPackage = None

def StartJVM():
    # from http://www.gossamer-threads.com/lists/python/python/379020
    root = os.path.abspath(os.path.dirname(__file__))
    jpype.startJVM(jpype.getDefaultJVMPath(),
                   "-Djava.ext.dirs=%s%slib" % (root, os.sep))
    global distancePackage
    distancePackage = jpype.JPackage("distance")
    global utilPackage
    utilPackage = jpype.JPackage("util")

def StopJVM():
    jpype.shutdownJVM()

class APTED:
    def __init__(self, delCost, insCost, matchCost):
        global distancePackage
        if distancePackage is None:
            raise Exception("Need to call apted.StartJVM() first")
        self.myApted = distancePackage.APTED(float(delCost), float(insCost), float(matchCost))

    def nonNormalizedTreeDist(self, lblTreeA, lblTreeB):
        return self.myApted.nonNormalizedTreeDist(lblTreeA.myLblTree, lblTreeB.myLblTree)

class LblTree:
    def __init__(self, treeString):
        global utilPackage
        if utilPackage is None:
            raise Exception("Need to call apted.StartJVM() first")
        self.myLblTree = utilPackage.LblTree.fromString(treeString)

'''
# Example usage:
import apted
apted.StartJVM()
aptedDist = apted.APTED(delCost=1, insCost=1, matchCost=1)
treeA = apted.LblTree('{a}')
treeB = apted.LblTree('{b{c}}')
dist = aptedDist.nonNormalizedTreeDist(treeA, treeB)
print(dist)
# When you are done using apted:
apted.StopJVM()
# For some reason it doesn't usually let me start the JVM again,
# and it crashes the Python interpreter upon exit when I do,
# so call only as needed.
'''
There are many variations of tree edit distance. If you can go with top-down tree edit distance, which limits insertions and deletes to the leaves, I suggest trying the following paper: Comparing Hierarchical Data in External Memory.
The implementation is a straightforward dynamic programming matrix with O(n^2) cost.
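As a rough illustration of the top-down restriction (not the cited paper's exact algorithm), a Selkow-style top-down edit distance with unit costs fits in a short Python recursion: relabel the roots if needed, then align the two child forests with a classic edit-distance DP whose insert/delete cost is the size of the inserted/deleted subtree:

```python
class Tree:
    def __init__(self, label, *kids):
        self.label = label
        self.kids = list(kids)

def size(t):
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(k) for k in t.kids)

def topdown_dist(a, b):
    """Selkow-style top-down edit distance with unit costs."""
    m, n = len(a.kids), len(b.kids)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + size(a.kids[i - 1])   # delete whole subtree
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + size(b.kids[j - 1])   # insert whole subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(
                D[i - 1][j] + size(a.kids[i - 1]),               # delete
                D[i][j - 1] + size(b.kids[j - 1]),               # insert
                D[i - 1][j - 1] + topdown_dist(a.kids[i - 1],
                                               b.kids[j - 1]),   # match/recurse
            )
    return int(a.label != b.label) + D[m][n]

# The two trees from the jqgram example above, minus the relabelings:
t1 = Tree("a", Tree("b", Tree("c"), Tree("d")), Tree("e"))
t2 = Tree("a", Tree("b", Tree("c"), Tree("d"), Tree("y")), Tree("e"))
print(topdown_dist(t1, t2))  # 1 (insert leaf "y")
```

Because insertions and deletions apply only to whole subtrees, this distance upper-bounds the unrestricted tree edit distance; it is much simpler to implement but can overestimate when a cheap edit would move nodes between levels.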
There is a journal version of the ICALP2007 paper you refer to, An Optimal Decomposition Algorithm for Tree Edit Distance.
This version also has pseudocode.
