I'm working on a project where I have graph data (a few hundred nodes for now, but possibly a million later) stored as a JSON file.
I'm using the NetworkX library in Python to generate all the data from some information, and then sending the JSON file to the client side.
I'm using d3.js to render the graph in JavaScript on the client side.
Now, on the user's request, I need to delete the shortest path between two nodes of the user's choice in my graph and show them the resulting graph.
I know that this processing has to be done on the client-side to prevent excess server load, but this is what I'm unsure about:
An optimized graph library would do this fastest. In fact, NetworkX probably has a ready-made function for this, but it's in Python. Is writing a shortest-path deletion function in JavaScript the sensible thing to do?
Does d3.js have this sort of function? Or is it a library only for representing things graphically?
Thanks.
D3 doesn't have functions for this. In your case, the best approach is probably to implement this functionality in JavaScript yourself (or find a library that does it). If you need more sophisticated functionality, however, relying on something like NetworkX is almost certainly going to be easier and faster to implement.
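To give a sense of how small a hand-written version can be, here is a minimal sketch in plain Python (the same logic ports to JavaScript almost line for line). It assumes an unweighted, undirected graph stored as a dict of neighbour sets, and it interprets "delete the shortest path" as removing the edges along one shortest path. The function names `shortest_path` and `remove_shortest_path` are hypothetical, not part of D3 or NetworkX.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return one shortest path (fewest edges) as a list of nodes, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adj[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


def remove_shortest_path(adj, source, target):
    """Delete the edges of one shortest path in place; return that path."""
    path = shortest_path(adj, source, target)
    if path is None:
        return None
    for u, v in zip(path, path[1:]):
        adj[u].discard(v)
        adj[v].discard(u)  # undirected: remove both directions
    return path


# Assumed example graph: a-b, a-c, b-d, c-d, d-e
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
removed = remove_shortest_path(graph, "a", "e")
```

Breadth-first search only finds shortest paths by edge count; a weighted graph would need something like Dijkstra's algorithm instead.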
I have a simple triple store.
It is structured as follows:
Entity, verb, object
For example, it may contain:
John, Supports, Manchester United
Fred, plays golf, Mark
Mark, Supports, Manchester United
From this I'd like a graph which will display the following info (hopefully in a slightly nicer format though! :)):
What is the best API to do this with, and how can I best approach this problem?
Thanks
The best approach is to use the Google organizational hierarchy; it is really easy to use and program. For more information, see Google Organizational Hierarchy. As for supporting frameworks, I strongly suggest you take a look at D3.js, which is very useful for building visualizations from data. I hope these two references help.
I've done a lot of graph theory work inside JavaScript. Instead of picking an API, I'd focus on the language you feel comfortable using, and then focus on the tools for that language.
The D3 graph libraries are a bit different from what you'd expect if you're used to something like an adjacency list (which is how I studied graphs in school). I wrote this collection of JavaScript programs to help get things into a format D3 likes.
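As a rough illustration of that kind of conversion, the sketch below turns the question's triples into the node-link shape commonly fed to D3 force layouts. It is written in Python rather than JavaScript, and the function name `triples_to_node_link` and the key names `id`, `source`, `target` and `label` are assumptions made for this example, not something taken from the programs mentioned above.

```python
import json

# The triples from the question: (entity, verb, object).
triples = [
    ("John", "Supports", "Manchester United"),
    ("Fred", "plays golf", "Mark"),
    ("Mark", "Supports", "Manchester United"),
]


def triples_to_node_link(triples):
    """Build a {"nodes": [...], "links": [...]} dict from (s, verb, o) triples."""
    seen = set()
    nodes = []
    links = []
    for subject, verb, obj in triples:
        # Register each entity once, in order of first appearance.
        for name in (subject, obj):
            if name not in seen:
                seen.add(name)
                nodes.append({"id": name})
        # Keep the verb as an edge label so it can be drawn on the link.
        links.append({"source": subject, "target": obj, "label": verb})
    return {"nodes": nodes, "links": links}


node_link = triples_to_node_link(triples)
payload = json.dumps(node_link, indent=2)  # what would be sent to the browser
```

Each link here is directed from subject to object; whether that direction should be drawn depends on the verb properties discussed next.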
This also depends on the properties of your verbs. If a => b implies b => a for all a and b (the relation is symmetric), then you probably don't want a directed graph. If Fred plays golf with Mark, does Mark play golf with Fred? For all verbs?
Are your relationships transitive? If NOT, then it might make sense to create a new node each time an entity appears in a relationship, since visualizing them the way you did makes me think John supports Mark, who plays golf with Fred.
There are a bunch more discrete math relationships you might want to consider.
I've also done a lot of graph theory inside Python. When I get a new graph theory problem, I like to use D3 for JavaScript, since I'm pretty good with JavaScript and already have some tools for it. If I need it to be super hardcore, I look into Python since it can run overnight on a server someplace.
I want to implement a graph data structure and build a graphical view of it in any language or framework, such as Windows Forms or Java. If you know how to do this, please tell me.
Tall order.
When I was learning data structures, I always found this page helpful for understanding them.
https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
It has a bunch of different types of graphs if you scroll down.
The JavaScript version of each visualization is still maintained. Maybe you can use this as a point of departure and try to reverse engineer whatever specific graph algorithm you are trying to construct.
I have my data stored as an RDF graph in a database, and I am retrieving it using SPARQL. The number of nodes (objects) in the graph has become huge, and traversal/search has become much slower.
a. Can anyone suggest an efficient traversal/search algorithm to fetch the data?
As a next step, I have federated data, i.e. data from external applications like SAP. In this case, the search becomes even slower.
b. What efficient search algorithm should I use in this case?
This seems like a common issue in large enterprise systems, and any input on how these problems have been solved in such systems would also be helpful.
I had a similar problem. I was doing a lot of graph traversal using SPARQL property paths, and it was too slow with an RDF-based repository. I was using Jena TDB, which is supposed to be fast, but it was still too slow!
As @Mikos suggested, I tried Neo4j. It then got much faster. As Mark Watson says in this blog entry:
RDF data stores support SPARQL queries: good for matching patterns in data.
Neo4j supports arbitrary graph structures and seems best for exploring a neighborhood of a graph: start at a node and explore the connected nodes. (graph traversal)
I used Neo4j, but you can try any tool that is built for graph traversal. I read that AllegroGraph 4 is RDF-based and has good graph traversal speed.
Now I'm using Neo4j, but I didn't give up on RDF. I still use URIs as identifiers and try to reuse the popular RDF vocabularies and relations. Later I'll add a feature to render my graphs as RDF. I know that with Neo4j you can also use TinkerPop to render RDF, but I haven't tried it myself.
Graph traversal and efficient querying is a wide-ranging problem, and the right approach depends on your situation. I would suggest looking at a data store like Neo4j and complementing it with a tool like Lucene.
The graph is arguably the most versatile and valuable data structure of all. I can store single variables, lists, hashes etc., and of course graphs, with it.
Given this, are there any languages that offer inline / native graph support and syntax? I can create variables, arrays, lists and hashes inline in Ruby, Python and Javascript, but if I want a graph, I have to either manage the representation myself with a matrix / list, or select a library, and use the graph through method calls.
Why on earth is this still the case in 2010? And, practically, are there any languages out there which offer inline graph support and syntax?
The main problem with what you are asking is that a more general solution is not the best one for any specific problem. It is merely average for all of them, not the best for any.
OK, you can store a list in a graph by treating it as a degenerate case, but why would you do that? And how would you store a hash map inside a graph? Why would you need such a structure?
And do not forget that a graph implementation must be chosen according to the operations you are going to perform on it; otherwise it would be like using a hash table to store a list of values, or a list instead of a tree to store an ordered collection. You can use an adjacency matrix, an edge list or adjacency lists, each implementation with its own strengths and weaknesses.
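To make the trade-off concrete, here is a small sketch that stores the same undirected graph in all three forms. The example graph and the helper names (`has_edge_list`, `has_edge_matrix`, `neighbours`) are made up for illustration.

```python
# Edge list: compact, but an edge lookup scans the whole list.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

nodes = sorted({node for edge in edges for node in edge})
position = {node: i for i, node in enumerate(nodes)}

# Adjacency lists: cheap neighbour iteration, memory proportional to edges.
adjacency = {node: [] for node in nodes}
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)

# Adjacency matrix: constant-time edge test, memory quadratic in nodes.
matrix = [[0] * len(nodes) for _ in nodes]
for u, v in edges:
    matrix[position[u]][position[v]] = 1
    matrix[position[v]][position[u]] = 1


def has_edge_list(u, v):
    """Edge test on the edge list: linear in the number of edges."""
    return (u, v) in edges or (v, u) in edges


def has_edge_matrix(u, v):
    """Edge test on the matrix: a single index lookup."""
    return matrix[position[u]][position[v]] == 1


def neighbours(u):
    """Neighbour query on the adjacency lists: no scanning required."""
    return adjacency[u]
```

Which form is preferable depends on whether the workload is dominated by edge tests, neighbour iteration, or simply storing and streaming edges.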
Graphs can also have many more properties than other collections of data: cyclic, acyclic, directed, undirected, bipartite, and so on. For any specific case you can implement them differently (making some assumptions about the graph you need), so having them in native syntax would be overkill, since you would need to configure them anyway (and the language would have to provide many implementations/optimizations).
If everything is already done for you, the fun of developing is gone :)
By the way, just look for a language that allows you to write your own graph DSL and live with it!
Gremlin, a graph-based programming language: https://github.com/tinkerpop/gremlin/wiki
GrGen.NET (www.grgen.net) is a programming language for graph transformation plus an environment including a graphical debugger. You can define your graph model, the rewrite rules, and rule control with some nice special purpose languages and use the generated assemblies/C# code from any .NET language you like or from the supplied shell.
To understand why normal languages don't offer such a convenient/built-in interface to graphs, just take a look at the amount of code written for that project: the compiler alone is several man-years of work. That's a price tag too hefty for a feature/data structure only a minority of programmers ever need - so it's not included in general purpose programming languages.
I am trying to store a large list of strings in a concise manner so that they can be very quickly analyzed/searched through.
A directed acyclic word graph (DAWG) suits this purpose wonderfully. However, I do not have a list of the strings to include in the first place, so it must be incrementally buildable. Additionally, when I search through it for a string, I need to bring back data associated with the result (not just a boolean saying if it was present).
I have found information on a modification of the DAWG for string data tracking here: http://www.pathcom.com/~vadco/adtdawg.html It looks extremely, extremely complex and I am not sure I am capable of writing it.
I have also found a few research papers describing incremental building algorithms, though I've found that research papers in general are not very helpful.
I don't think I am advanced enough to combine both of these algorithms myself. Is there an already documented algorithm that has both features, or an alternative algorithm with good memory use and speed?
I wrote the ADTDAWG web page. Adding words after construction is not an option. The structure is nothing more than 4 arrays of unsigned integer types. It was designed to be immutable for total CPU cache inclusion, and minimal multi-thread access complexity.
The structure is an automaton that forms a minimal and perfect hash function. It was built for speed while traversing recursively using an explicit stack.
As published, it supports up to 18 characters. Including all 26 English chars will require further augmentation.
My advice is to use a standard trie, with an array index stored in each node. Yes, it is going to seem simplistic, but each END_OF_WORD node then represents only one word. The ADTDAWG is a solution to the problem of each END_OF_WORD node in a traditional DAWG representing many, many words.
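A minimal sketch of that suggestion, assuming the per-word data lives in a separate array and each end-of-word node stores an index into it. The class and attribute names (`Trie`, `TrieNode`, `records`) are hypothetical, and this is plain Python rather than the packed integer arrays used by the ADTDAWG.

```python
class TrieNode:
    __slots__ = ("children", "index")

    def __init__(self):
        self.children = {}  # maps a character to the child TrieNode
        self.index = None   # position in Trie.records if a word ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()
        self.records = []   # data associated with each stored word

    def insert(self, word, data):
        """Add a word incrementally, or overwrite its data if already present."""
        node = self.root
        for ch in word:
            child = node.children.get(ch)
            if child is None:
                child = TrieNode()
                node.children[ch] = child
            node = child
        if node.index is None:
            node.index = len(self.records)
            self.records.append(data)
        else:
            self.records[node.index] = data

    def lookup(self, word):
        """Return the data stored for word, or None if it is absent."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return None
        return None if node.index is None else self.records[node.index]


trie = Trie()
trie.insert("cat", {"count": 1})
trie.insert("car", {"count": 2})
```

Because each end-of-word node maps to exactly one word, attaching data is trivial; the cost compared with a DAWG is that shared suffixes are not merged, so memory use is higher.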
Minimal and perfect hash tables are not the sort of thing that you can just put together on the fly.
I am looking for something else to work on, or a job, so contact me, and I'll do what I can. For now, all I can say is that it is unrealistic to use heavy optimization on a structure that is subject to being changed frequently.
Java
For graph problems which require persistence, I'd take a look at the Neo4j graph DB project. Neo4j is designed to store large graphs and allow incremental building and modification of the data, which seems to meet the criteria you describe.
They have some good examples to get you going quickly, and there is usually example code for most problems.
They have a DAG example with a link at the bottom to the full source code.
C++
If you're using C++, a common solution for graph building/analysis is the Boost Graph Library. To persist your graph, you could maintain a file-based version of it in GraphML (for example) and read and write that file as your graph changes.
You may also want to look at a trie structure for this (potentially building a radix-tree). It seems like a decent 'simple' alternative structure.
I'm suggesting this for a few reasons:
I really don't have a full understanding of your result.
Definitely incremental to build.
Leaf nodes can contain any data you wish.
Subjectively, a simple algorithm.