creating .graphml tree diagram from nested list [closed] - binary-tree

Closed 8 years ago.
I am desperately looking for a solution to create a nice binary tree diagram. It is crucial that incomplete nodes have distinguishable edges (if any).
I failed to produce the desired result with .dot, because I know of no way to order nodes. I don't mind importing a file into yEd or another editor. However, I want to be able to generate the data very easily with little syntax.
What I am aiming at is a tool which generates e.g. a .graphml file from minimalistic data, such as (A (B1 C1 C2) B2), where A is the root label and B1 is the root's left child with another two children. Complexity similar to .dot or .tgf would of course be tolerable, but I want to avoid writing a compiler myself for generating the .graphml.
Any ideas appreciated.
Markus R.

The data that you supplied is more-or-less an s-expression. Given that this is the format that you want to ingest, pyparsing (a Python module) has an s-expression parser.
You'll also need a graph library. I use networkx for most of my work. With the pyparsing s-expression parser and networkx, the following code ingests the data and creates a tree as a digraph:
import networkx as nx

def build(g, X):
    #-- A list is (label child child ...): add the label, then recurse into the children.
    if isinstance(X, list):
        parent = X[0]
        g.add_node(parent)
        for branch in X[1:]:
            child = build(g, branch)
            g.add_edge(parent, child)
        return parent
    #-- A bare string is a leaf (this test was basestring on Python 2).
    if isinstance(X, str):
        g.add_node(X)
        return X

#-- The sexp parser is constructed by the code example at...
#-- http://pyparsing.wikispaces.com/file/view/sexpParser.py
sexpr = sexp.parseString("(A (B1 C1 C2) B2)", parseAll=True)

#-- Get the parsing results as a list of component lists.
nested = sexpr.asList()

#-- Construct an empty digraph.
dig = nx.DiGraph()

#-- Build the tree.
for component in nested:
    build(dig, component)

#-- Write out the tree as a graphml file.
nx.write_graphml(dig, 'tree.graphml', prettyprint=True)
To test this, I also wrote the tree as a .dot file and used graphviz to render it as an image.
networkx is a good graph library and you can write additional code that walks over your tree to tag edges or nodes with additional metadata, if needed.
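If installing pyparsing and networkx is not an option, the same idea can be sketched with the standard library only. The following is a minimal sketch, not the code from the answer above: the helper names (parse_sexpr, edges, labels, to_graphml) and the "position" edge attribute are assumptions made for illustration, and whether a given editor such as yEd displays that attribute has not been checked here.

```python
import xml.etree.ElementTree as ET

def parse_sexpr(text):
    """Parse a string like "(A (B1 C1 C2) B2)" into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Returns (parsed item, index of the next unread token).
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree

def edges(tree):
    """Yield (parent, child, position); position is the child's index (0 = leftmost)."""
    if isinstance(tree, list):
        parent = tree[0]
        for index, branch in enumerate(tree[1:]):
            child = branch[0] if isinstance(branch, list) else branch
            yield parent, child, index
            yield from edges(branch)

def labels(tree):
    """Yield every node label in the order it appears in the input."""
    if isinstance(tree, list):
        for item in tree:
            yield from labels(item)
    else:
        yield tree

def to_graphml(tree):
    """Serialize the tree as a GraphML string, tagging each edge with its position."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    ET.SubElement(root, "key", {"id": "pos", "for": "edge",
                                "attr.name": "position", "attr.type": "int"})
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    for label in labels(tree):
        ET.SubElement(graph, "node", id=label)
    for parent, child, index in edges(tree):
        edge = ET.SubElement(graph, "edge", source=parent, target=child)
        ET.SubElement(edge, "data", key="pos").text = str(index)
    return ET.tostring(root, encoding="unicode")

tree = parse_sexpr("(A (B1 C1 C2) B2)")
graphml_text = to_graphml(tree)
```

Recording the child index on each edge is one way to keep left and right children distinguishable after the file leaves Python; node labels are assumed to be unique, since they double as node ids.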

Related

How can I compare items in a Prolog database?

I have a Prolog database that is
dateopened(asda,date(1985,12,5)).
dateopened(tesco,date(1979,12,17)).
dateopened(morrisons,date(1999,12,25)).
dateopened(sainsburys,date(1979,12,17)).
dateopened(lidl,date(1987,8,27)).
I want to find out how to write Prolog queries that answer the following questions:
1. Are there any two distinct supermarkets that opened on the same day and, if there are, what are their names?
(I have no idea how to compare items in a database.)
2. Give a year in the 1990s when no supermarkets were opened.
I have tried:
?- dateopened(Supermarket,date(Year,_,_)),Year>1989, Year<2000.
And the result I get is:
Supermarket = morrisons, Year = 1999.
Which sort-of answers the question because I can say that no supermarkets were opened in 1998 or 1997 etc but I don't think this is what is required.
There are a few clues: the questions can be answered using member/2, not/1 and \=.
It's a beginner querying exercise but I have no idea how to start, especially question 1.
As this is a learning exercise, my answer is a little more general, not with a concrete solution:
Question 1: The Prolog interpreter seeks to resolve a query by unifying it with the facts and rules in the fact basis, which form a closed world of known things. To check if something is true according to the facts you just state that in the query so that the solver can try to unify with the facts, using variables in places where you want to obtain a value to work with, and _ as a placeholder for variables whose actual values you don't need. The following query gives you any supermarket, with the full date in a variable.
?- dateopened(S1,Date1).
S1 = asda,
Date1 = date(1985, 12, 5) ;
...
If your query needs multiple conditions to be met, you can combine them with a comma (,); they are evaluated from left to right. To solve the initial question, you just pick any two supermarkets in the same way using different variables, and afterwards make sure that their dates are equal and their names are different (\=), which reduces the number of possible solutions to what you need.
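For readers who think more easily in an imperative language, the "pick two, then filter" idea can be sketched in Python. This is only an analogy to the Prolog query, not a replacement for writing it; the dictionary below is the fact base from the question, and the variable names are made up for illustration.

```python
from itertools import combinations

# The dateopened/2 facts from the question, as name -> (year, month, day).
date_opened = {
    "asda": (1985, 12, 5),
    "tesco": (1979, 12, 17),
    "morrisons": (1999, 12, 25),
    "sainsburys": (1979, 12, 17),
    "lidl": (1987, 8, 27),
}

# combinations() only yields pairs of distinct supermarkets, which plays the
# role of the "names are different" test; the date comparison is the join.
same_day = [
    (s1, s2)
    for s1, s2 in combinations(date_opened, 2)
    if date_opened[s1] == date_opened[s2]
]
print(same_day)  # [('tesco', 'sainsburys')]
```

Unlike this sketch, a Prolog query built from two goals and \= will report each pair twice, once in each order.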
Question 2: I think your idea is almost the opposite of what you want, but not really, as the opposite of 1999 would be all the other years. As a solution sketch:
Find a year in the nineties. The predicate between/3 will help, but if you are only allowed the above ones, a list L with the years and member(Year,L) will do.
Use the first part of your initial attempt to reduce the possible years to those when a supermarket was opened.
Invert that last part using \+ (not/1 is deprecated, assuming SWI-Prolog) so that you find the other years for which we previously made sure they were in range.
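The three steps above (generate a candidate year, test whether any supermarket opened in it, negate that test) can be mirrored in Python as a rough analogy. The data is the fact base from the question; the names are illustrative only.

```python
# Opening years taken from the dateopened/2 facts in the question.
opening_years = {
    "asda": 1985,
    "tesco": 1979,
    "morrisons": 1999,
    "sainsburys": 1979,
    "lidl": 1987,
}

# Step 1: candidate years in the nineties (the role of member/2 or between/3).
nineties = range(1990, 2000)

# Steps 2 and 3: keep a year only if NO supermarket opened in it
# (the role of \+ wrapped around the dateopened goal).
quiet_years = [
    year for year in nineties
    if not any(opened == year for opened in opening_years.values())
]
print(quiet_years)  # [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998]
```

Note the order: the year is generated first and the negated test comes afterwards, which matters in Prolog because negation as failure cannot bind the year for you.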

Recursion algorithm for multilevel report

I'm currently developing a multi-level SSRS report, and I'm struggling with the algorithm. I've developed a recursion class which looks like below, but the level numbers are incorrect. I want the parent record (represented by a, b, and c) to show the child records so that the child records' level = (parentRecLevel+1). Right now, the level values just increment by 1. Anyone have any advice?
protected BOMLevel getBomLevelItem(str itemId, int numLevel, boolean firstRec)
while select tmpBOM
{
bomLevel = this.getBomLevelItem(bomLevel.ItemId, bomLevel.Level, false);
}
Current Outcome (where b1, c1, and c2 are children of b and c respectively):
1 a
2 b
2 b1
3 c
4 c1
5 c2
Wanted Outcome:
1 a
2 b
3 b1
2 c
3 c1
3 c2
TLDR: Do not reinvent the wheel, use existing algorithms and frameworks.
I'm assuming your question is not for a training exercise, but a real world problem. If it is an exercise, try to get a good grasp of recursion in an easy to use language of your choice with a big community before coming back to x++.
Your recursion method looks incomplete, because in each recursion, you iterate through all records of tmpBom, which (unless you modify the records in that table somewhere else) does not make sense and will never terminate. I also don't see how this method could produce the outcome you describe. I suggest you take a look at some recursion algorithm training material to learn about the fundamental parts of a recursion.
You tagged the question x++ and the syntax also looks very much like that. Unfortunately you did not add the information which version of microsoft-dynamics you are using, but I will assume dynamics-ax-2012 as it is currently the most common version in use.
In this version, there is already an out-of-the-box SSRS report that will show you the structure of a bill of material. You can call the report at Inventory management > Reports > Bills of materials > Lines. It should be fairly easy to modify this report so that it also shows the level if the report does not already fulfill your requirements.
If you still need to implement your own solution, take a look at class BOMSearch and its children. It is used in several places (check the cross references) and can also be used to expand/explode a bill of material.
Also note that there are a lot of articles out there that try to explain how to expand or explode a bill of material in x++ code, but as with all things on the internet, be careful: Most of them are incomplete or plain wrong.
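Independent of Dynamics AX, the level numbering the question asks for (child level = parent level + 1) comes from passing the level down as an argument instead of incrementing a shared counter. A minimal Python sketch, using a hypothetical in-memory mapping in place of the BOM table and with no claim about how BOMSearch does it internally:

```python
# Hypothetical parent -> children mapping standing in for the BOM table.
BOM = {
    "a": ["b", "c"],
    "b": ["b1"],
    "c": ["c1", "c2"],
}

def bom_levels(item, level=1):
    """Return (level, item) rows in depth-first order."""
    rows = [(level, item)]
    for child in BOM.get(item, []):
        # Each child gets the parent's level + 1; siblings share a level
        # because the parent's own `level` is never modified.
        rows.extend(bom_levels(child, level + 1))
    return rows

for level, item in bom_levels("a"):
    print(level, item)
```

The recursion ends at items that have no entry in the mapping, which is the termination condition the answer says is missing from the method in the question.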

Stack implementation the Trollface way [closed]

Closed 6 years ago.
In my software engineering course, I encountered the following characteristic of a stack, condensed by me: What you push is what you pop. I uploaded the fully axiomatic version here.
Being a natural-born troll, I immediately invented the Troll Stack. If it has more than 1 element already on it, pushing results in a random permutation of those elements. Promptly I got into an argument with the lecturers whether this nonsense implementation actually violates the axioms. I said no, the top element stays where it is. They said yes, somehow you can recursively apply the push-pop-axiom to get "deeper". Which I don't see. Who is right?
The violated axiom is pop(push(s,x)) = s. Take a stack s with n > 1 distinct entries. If you implement push such that push(s,x) is s'x with s' being a random permutation of s, then since pop is a function, you have a problem: how do you reverse random_permutation() such that pop(push(s,x)) = s? The preimage of s' might have been any of the n! > 1 permutations of s, and no matter which one you map to, there are n! - 1 > 0 other original permutations s'' for which pop(push(s'',x)) != s''.
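The argument can also be checked mechanically. Below is a small Python sketch that models stacks as immutable tuples (top at the right end) and tests the axiom pop(push(s,x)) = s for an ordinary push and for a troll push. The function names and the tuple representation are assumptions made for this illustration; they are not taken from the course material.

```python
import random

def push(s, x):
    """Ordinary push: x becomes the new top, the rest is untouched."""
    return s + (x,)

def pop(s):
    """Remove the top element."""
    return s[:-1]

def troll_push(s, x, rng):
    """Troll push: x becomes the new top, but the elements below are shuffled."""
    below = list(s)
    if len(below) > 1:
        rng.shuffle(below)
    return tuple(below) + (x,)

def axiom_holds(push_fn, s, x):
    """Check the axiom pop(push(s, x)) == s for one stack and one element."""
    return pop(push_fn(s, x)) == s

rng = random.Random(0)  # fixed seed so repeated runs behave the same
s = (1, 2, 3)

ordinary_ok = all(axiom_holds(push, s, 9) for _ in range(100))
troll_ok = all(
    axiom_holds(lambda st, x: troll_push(st, x, rng), s, 9) for _ in range(100)
)
print(ordinary_ok, troll_ok)  # True False
```

The top element does stay where it is, as the question says, so an axiom of the form top(push(s,x)) = x is satisfied; it is the pop axiom that fails as soon as one shuffle changes the order below the top.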
In cases like this, which might be very easy to see for everybody but not for you (hence your usage of the "troll" word), it always helps to simply run the "program" on a piece of paper.
Write down what happens when you push and pop a few times, and you will see.
You should also be able to see how those axioms correspond very closely to the actual behaviour of your stack; they are not just there for fun, but they deeply (in multiple meanings of the word) specify the data structure with its methods. You could even view them as a "formal system" describing the ins and outs of stacks.
Note that it is still good for you to be sceptical; this leads to a) better insight and b) detection of errors your superiors make. In this case they are right, but there are cases where it can save you a lot of time (e.g. while searching for the solution to the "MU" riddle in "Gödel, Escher, Bach", which would be an excellent read for you, I think).

implementing a basic search engine with prefix tree

The problem is implementing a prefix tree (trie) in a functional language without using any storage or iterative methods.
I am trying to solve this problem. How should I approach it? Can you give me an exact algorithm, or a link to an existing implementation in any functional language?
What I am trying to do => create a simple search engine with these features:
adding a word to the tree
searching for a word in the tree
deleting a word from the tree
Why I want to use a functional language => I want to improve my problem-solving ability a bit further.
NOTE: Since it is a hobby project, I will implement the basic features first.
EDIT:
i.) What I mean by "without using storage" => I don't want to use variable storage (e.g. int a), references to variables, or arrays. I want to calculate the result recursively and then show it on the screen.
ii.) I wrote a few lines but then erased them, because what I wrote made me angry. Sorry for not showing my effort.
Take a look at Haskell's Data.IntMap. It is a purely functional implementation of a Patricia trie, and its source is quite readable.
The bytestring-trie package extends this approach to ByteStrings.
There is an accompanying paper, Fast Mergeable Integer Maps, which is also readable and thorough. It describes the implementation step by step: from binary tries to big-endian Patricia trees.
Here is a short extract from the paper.
At its simplest, a binary trie is a complete binary tree of depth
equal to the number of bits in the keys, where each leaf is either
empty, indicating that the corresponding key is unbound, or full, in
which case it contains the data to which the corresponding key is
bound. This style of trie might be represented in Standard ML as
datatype 'a Dict =
    Empty
  | Lf of 'a
  | Br of 'a Dict * 'a Dict
To lookup a value in a binary trie, we simply read the bits of the
key, going left or right as directed, until we reach a leaf.
fun lookup (k, Empty) = NONE
  | lookup (k, Lf x) = SOME x
  | lookup (k, Br (t0,t1)) =
      if even k then lookup (k div 2, t0)
      else lookup (k div 2, t1)
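For readers who do not know Standard ML, the quoted datatype and lookup function can be transliterated to Python roughly as follows. This is a sketch: the class names follow the extract, but the field names (value, zero, one) and the sample trie are invented here for illustration.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Empty:
    """A leaf whose key is unbound."""

@dataclass(frozen=True)
class Lf:
    """A leaf holding the data bound to its key."""
    value: Any

@dataclass(frozen=True)
class Br:
    """A branch: `zero` is followed on a 0 bit, `one` on a 1 bit."""
    zero: Any
    one: Any

def lookup(k, t):
    """Read the bits of k from least significant upwards, as in the extract."""
    if isinstance(t, Empty):
        return None
    if isinstance(t, Lf):
        return t.value
    return lookup(k // 2, t.zero if k % 2 == 0 else t.one)

# A depth-2 trie over the keys 0..3; key 3 is left unbound.
trie = Br(Br(Lf("zero"), Lf("two")), Br(Lf("one"), Empty()))
print([lookup(k, trie) for k in range(4)])  # ['zero', 'one', 'two', None]
```

Because the least significant bit is consumed first, keys 0 and 2 end up in the same subtree, which is the little-endian layout the paper starts from before moving to big-endian Patricia trees.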
The key point in immutable data structure implementations is sharing of both data and structure. To update an object you create a new version of it that shares as many nodes as possible with the old one. Concretely, for tries the following approach may be used.
Consider such a trie (from Wikipedia):
Imagine that you haven't added the word "inn" yet, but you already have the word "in". To add "inn" you have to create a new instance of the whole trie with "inn" added. However, you are not forced to copy the whole thing: you only need to create a new instance of the root node (the one without a label) and the right branch. The new root node will point to the new right branch but to the old other branches, so with each update most of the structure is shared with the previous state.
However, your keys may be quite long, so recreating the whole branch each time is still both time and space consuming. To lessen this effect, you may share structure inside one node too. Normally each node is a vector or map of all possible outcomes (e.g. in the picture, the node labelled "te" has 3 outcomes: "a", "d" and "n"). There are plenty of implementations of immutable maps (Scala, Clojure, see their repositories for more examples), and Clojure also has an excellent implementation of an immutable vector (which is actually a tree).
All operations on creating, updating and searching resulting tries may be implemented recursively without any mutable state.
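The path-copying idea described above can be sketched in a few lines of Python. This is an illustrative sketch rather than a port of Data.IntMap or of any library mentioned here: Node, insert, contains and delete are names chosen for the example, and the per-node dict is copied on update and never mutated, standing in for a real immutable map.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    is_word: bool = False
    children: dict = field(default_factory=dict)  # treated as read-only

def insert(node, word):
    """Return a new trie containing word; untouched branches are shared."""
    if not word:
        return Node(True, node.children)
    head, tail = word[0], word[1:]
    child = node.children.get(head, Node())
    return Node(node.is_word, {**node.children, head: insert(child, tail)})

def contains(node, word):
    """True if word was inserted as a whole word (not just as a prefix)."""
    if not word:
        return node.is_word
    child = node.children.get(word[0])
    return child is not None and contains(child, word[1:])

def delete(node, word):
    """Return a new trie without word, pruning branches that become empty."""
    if not word:
        return Node(False, node.children)
    head, tail = word[0], word[1:]
    child = node.children.get(head)
    if child is None:
        return node  # word is not present; nothing to copy
    new_child = delete(child, tail)
    rest = {k: v for k, v in node.children.items() if k != head}
    if new_child.is_word or new_child.children:
        rest = {**rest, head: new_child}
    return Node(node.is_word, rest)

old = insert(insert(insert(Node(), "tea"), "ten"), "in")
new = insert(old, "inn")
print(contains(old, "inn"), contains(new, "inn"))  # False True
print(new.children["t"] is old.children["t"])      # True
```

The last line shows the sharing: adding "inn" rebuilt only the root and the "i" branch, while the whole "t" branch is the same object in both versions. The old version stays valid, so no operation needs mutable state.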

Diff Algorithm? [closed]

Closed 9 years ago.
I've been looking like crazy for an explanation of a diff algorithm that works and is efficient.
The closest I got is this link to RFC 3284 (from several Eric Sink blog posts), which describes in perfectly understandable terms the data format in which the diff results are stored. However, it has no mention whatsoever as to how a program would reach these results while doing a diff.
I'm trying to research this out of personal curiosity, because I'm sure there must be tradeoffs when implementing a diff algorithm, which are pretty clear sometimes when you look at diffs and wonder "why did the diff program choose this as a change instead of that?"...
Where can I find a description of an efficient algorithm that'd end up outputting VCDIFF?
By the way, if you happen to find a description of the actual algorithm used by SourceGear's DiffMerge, that'd be even better.
NOTE: longest common subsequence doesn't seem to be the algorithm used by VCDIFF, it looks like they're doing something smarter, given the data format they use.
An O(ND) Difference Algorithm and its Variations (1986, Eugene W. Myers) is a fantastic paper and you may want to start there. It includes pseudo-code and a nice visualization of the graph traversals involved in doing the diff.
Section 4 of the paper introduces some refinements to the algorithm that make it very effective.
Successfully implementing this will leave you with a very useful tool in your toolbox (and probably some excellent experience as well).
Generating the output format you need can sometimes be tricky, but if you have understanding of the algorithm internals, then you should be able to output anything you need. You can also introduce heuristics to affect the output and make certain tradeoffs.
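To give a feel for the paper's approach, here is a Python sketch of its basic greedy loop. It is deliberately reduced: it returns only the length D of a shortest edit script (insertions plus deletions), does not recover the script itself, and includes none of the Section 4 refinements. The function name is an assumption of this example.

```python
def myers_distance(a, b):
    """Length of a shortest insert/delete edit script turning a into b."""
    n, m = len(a), len(b)
    # v[k] = furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Pick the better neighbour: step down from diagonal k + 1
            # (an insertion) or step right from diagonal k - 1 (a deletion).
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]
            else:
                x = v[k - 1] + 1
            y = x - k
            # Follow the "snake": matching elements cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m

print(myers_distance("ABCABBA", "CBABAC"))  # 5
```

The pair ABCABBA / CBABAC is the running example from the paper. The loop touches at most D + 1 diagonals in each of D + 1 rounds, which is why the method is fast when the two inputs are similar.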
Here is a page that includes a bit of documentation, full source code, and examples of a diff algorithm using the techniques in the aforementioned algorithm.
The source code appears to follow the basic algorithm closely and is easy to read.
There's also a bit on preparing the input, which you may find useful. There's a huge difference in output when you are diffing by character or token (word).
I would begin by looking at the actual source code for diff, which GNU makes available.
For an understanding of how that source code actually works, the docs in that package reference the papers that inspired it:
The basic algorithm is described in "An O(ND) Difference Algorithm and its Variations", Eugene W. Myers, 'Algorithmica' Vol. 1 No. 2, 1986, pp. 251-266; and in "A File Comparison Program", Webb Miller and Eugene W. Myers, 'Software--Practice and Experience' Vol. 15 No. 11, 1985, pp. 1025-1040. The algorithm was independently discovered as described in "Algorithms for Approximate String Matching", E. Ukkonen, 'Information and Control' Vol. 64, 1985, pp. 100-118.
Reading the papers then looking at the source code for an implementation should be more than enough to understand how it works.
See https://github.com/google/diff-match-patch
"The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text. ... Currently available in Java, JavaScript, C++, C# and Python"
Also see the wikipedia.org Diff page and - "Bram Cohen: The diff problem has been solved"
I came here looking for the diff algorithm and afterwards made my own implementation. Sorry I don't know about vcdiff.
Wikipedia: From a longest common subsequence it's only a small step to get diff-like output: if an item is absent in the subsequence but present in the original, it must have been deleted. (The '–' marks, below.) If it is absent in the subsequence but present in the second sequence, it must have been added in. (The '+' marks.)
Nice animation of the LCS algorithm here.
Link to a fast LCS ruby implementation here.
My slow and simple ruby adaptation is below.
def lcs(xs, ys)
  if xs.count > 0 and ys.count > 0
    xe, *xb = xs
    ye, *yb = ys
    if xe == ye
      return [xe] + lcs(xb, yb)
    end
    a = lcs(xs, yb)
    b = lcs(xb, ys)
    return (a.length > b.length) ? a : b
  end
  return []
end

def find_diffs(original, modified, subsequence)
  result = []
  while subsequence.length > 0
    sfirst, *subsequence = subsequence
    while modified.length > 0
      mfirst, *modified = modified
      break if mfirst == sfirst
      result << "+#{mfirst}"
    end
    while original.length > 0
      ofirst, *original = original
      break if ofirst == sfirst
      result << "-#{ofirst}"
    end
    result << "#{sfirst}"
  end
  while modified.length > 0
    mfirst, *modified = modified
    result << "+#{mfirst}"
  end
  while original.length > 0
    ofirst, *original = original
    result << "-#{ofirst}"
  end
  return result
end

def pretty_diff(original, modified)
  subsequence = lcs(modified, original)
  diffs = find_diffs(original, modified, subsequence)
  puts 'ORIG [' + original.join(', ') + ']'
  puts 'MODIFIED [' + modified.join(', ') + ']'
  puts 'LCS [' + subsequence.join(', ') + ']'
  puts 'DIFFS [' + diffs.join(', ') + ']'
end

pretty_diff("human".scan(/./), "chimpanzee".scan(/./))
# ORIG [h, u, m, a, n]
# MODIFIED [c, h, i, m, p, a, n, z, e, e]
# LCS [h, m, a, n]
# DIFFS [+c, h, +i, -u, m, +p, a, n, +z, +e, +e]
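The recursive lcs above solves the same subproblems over and over, so it slows down sharply on longer inputs. A common alternative is to fill a table of LCS lengths once and then walk it to emit the diff. The Python sketch below shows that variant; it is an assumption of this example rather than the method used by the fast Ruby implementation linked above, and the function names are made up.

```python
def lcs_table(a, b):
    """t[i][j] = length of the longest common subsequence of a[i:] and b[j:]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                t[i][j] = t[i + 1][j + 1] + 1
            else:
                t[i][j] = max(t[i + 1][j], t[i][j + 1])
    return t

def diff(a, b):
    """Return items of a and b, with '-' for deletions and '+' for additions."""
    t = lcs_table(a, b)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])          # common item: part of the LCS
            i += 1
            j += 1
        elif t[i + 1][j] >= t[i][j + 1]:
            out.append("-" + a[i])    # dropping a[i] keeps the LCS length
            i += 1
        else:
            out.append("+" + b[j])    # otherwise b[j] is an addition
            j += 1
    out.extend("-" + x for x in a[i:])
    out.extend("+" + y for y in b[j:])
    return out

result = diff("human", "chimpanzee")
print(", ".join(result))
```

The table costs O(len(a) * len(b)) time and memory. The additions and deletions between two common items may come out in a different order than in the Ruby output, but the common items and the number of changes are the same.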
Based on the link Emmelaich gave, there is also a great rundown of Diff Strategies on Neil Fraser's website (one of the authors of the library).
He covers basic strategies and towards the end of the article progresses to Myers' algorithm and some graph theory.
