The real question is how to represent a graph data structure in Ruby (some example code would help me understand).
My current idea is that every Node has an array of neighbourNodes holding the object_id of each neighbouring node object. Is there a better solution? Can I achieve this easily with a library? I have looked at GRATR and RGL; both seem outdated (at least I think so). Any working example on Ruby 2.0.0 would help me a lot.
I have Busstops which compose Routes. A Route is a sequence of Busstops. How would I represent the graph for all the Routes? I want to use Dijkstra's algorithm to find the shortest path between two bus stops, which may or may not lie on the same Route (meaning you may have to change buses on the way).
This question is really vague, so you should expect to receive vague answers. Here's mine:
It's All Data
When you're looking to do something visually, it all starts with data
Your busstops have routes -- this means nothing to Rails or your graphing system. What will mean something is numbers & data; specifically geolocational data (for the bus stops & other geolocational data)
We've never done anything with maps or routes; so I don't know how you'd plot a route, and find the nearest bus stop. I do know, however, that in order to get that working, you'll definitely need to pull the correct data from your database
How I'd Approach It
I'd start by getting all the data you'll need stored in the database:
Each bus stop needs a location (long & lat value)
Each bus stop's route needs to be mapped out (perhaps with sequential waypoints of locational data)
You need a "reference" point (long & lat value to gauge against)
Once you have all these values in place, you'll then be able to get some sort of process sorted to show the data on the graph
As mentioned in the comments, since you do not want to use a new data store such as Neo4j, the best option is using PostGIS and pgRouting. They are relatively easy to implement as they use SQL and are extensions for PostgreSQL.
Since I asked about a graph representation in Ruby, I came up with this idea: every Node is connected to some other Nodes, and those connections are the Edges.
In Ruby I create a Node object and then push the object_ids of other Nodes onto this Node's neighbours:
newNode.neighbours << otherNode.object_id
There might be other ways, but this is what came to mind. Please do tell me if there is a better way to do this.
For now this is what I'm using.
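One possible alternative, as a minimal sketch rather than a definitive implementation: keep the edges in a Hash keyed by the stops themselves instead of storing object_ids, so neighbours can be looked up directly. The `Graph` class, its method names, the stop names, and the uniform cost of 1 per hop are all assumptions made for illustration; Ruby has no priority queue in its standard library, so this version of Dijkstra's algorithm simply scans for the closest unvisited stop.

```ruby
# Hypothetical sketch: a weighted, undirected graph of bus stops.
class Graph
  def initialize
    # stop => { neighbouring stop => travel cost }
    @adjacency = Hash.new { |hash, key| hash[key] = {} }
  end

  # Connect two stops in both directions with the given cost.
  def add_edge(from, to, cost)
    @adjacency[from][to] = cost
    @adjacency[to][from] = cost
  end

  # Register every consecutive pair of stops on a route as an edge.
  # Stops shared between routes become the transfer points.
  def add_route(stops, cost = 1)
    stops.each_cons(2) { |a, b| add_edge(a, b, cost) }
  end

  # Dijkstra's algorithm (assumes non-negative costs).
  # Returns [total_cost, path], or nil when target is unreachable.
  def shortest_path(source, target)
    distance = { source => 0 }
    previous = {}
    visited = {}
    loop do
      # Pick the closest stop that has not been settled yet.
      current = distance.reject { |stop, _| visited[stop] }.min_by { |_, d| d }
      break if current.nil?
      stop, dist = current
      break if stop == target
      visited[stop] = true
      @adjacency.fetch(stop, {}).each do |neighbour, cost|
        candidate = dist + cost
        if distance[neighbour].nil? || candidate < distance[neighbour]
          distance[neighbour] = candidate
          previous[neighbour] = stop
        end
      end
    end
    return nil unless distance.key?(target)
    # Walk backwards from the target to rebuild the path.
    path = [target]
    path.unshift(previous[path.first]) while previous.key?(path.first)
    [distance[target], path]
  end
end

graph = Graph.new
graph.add_route(%w[Airport Market Station])   # hypothetical route 1
graph.add_route(%w[Station Library Harbour])  # hypothetical route 2
cost, path = graph.shortest_path("Airport", "Harbour")
puts "#{cost}: #{path.join(' -> ')}"
```

Because both routes share the stop "Station", the path found crosses from one route to the other there, which corresponds to changing buses.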
Related
I'm solving a complex pathfinding problem in my database that can't be expressed in Cypher, so I need to divide it up into multiple queries (and write a complex recursive set of functions).
My question is regarding performance of doing multiple queries on the same node. When query A returns a node X and node X is needed in the next query B, what is the best way of telling neo4j to look for node X in query B?
The simplest way would be to give every node a name, and then return X.name in query A and use WHERE X.name = ... in query B. I suppose this is really slow because Neo4j would have to check the name of every node in the database. Is there a faster way, or is this actually the best?
EDIT: because the question might not be completely clear, I'll give some more information on the problem I'm solving
I want to get the person that has the best knowledge of a given skill, for example physics. In the database there's a connection between physics and another skill, for example maths, that says knowledge of maths is useful for physics. But now I need to know how skilled every person is in maths, which is the same process again. This would make sense to do recursively, but as far as I know there's no recursion in Cypher, so I'll have to split it up into multiple queries.
What I want to avoid is that, when a link is found between physics and maths, the function that calculates every person's knowledge of maths has to go through every node in the database to find the one where name = 'maths', which is very inefficient.
I don't know if I understood your question completely, but I think that a good starting point is to create an index on the name property of your nodes. From the docs:
A database index is a redundant copy of information in the database
for the purpose of making retrieving said data more efficient. This
comes at the cost of additional storage space and slower writes, so
deciding what to index and what not to index is an important and often
non-trivial task.
CREATE INDEX ON :NodeLabel(name)
I am currently trying to create a Barnes-Hut octree; however, I still do not fully understand how to do this properly. I have read threads here, this article and some others. I believe I do understand how to build the tree if every node stores the indices of the particles inside it and you keep the empty nodes. But what if you do not want to? How do you build a tree such that at the end you only have the necessary information: say, monopoles and quadrupoles for all non-empty nodes? I have made so many different attempts that I am now completely confused, to be honest. What should I store in each node? What would the pseudocode for such a thing look like?
P.S. By the way, is it different for monopoles and quadrupoles? I can imagine that you do not need the exact information about the particles inside the node to calculate a monopole (it is just the total mass of the node), but what about a quadrupole?
Thank you in advance!
P.S. By the way, I use the Julia language, if that is somehow relevant.
I want to implement a graph data structure and build a graphical view of it in any environment, such as Windows Forms or Java. If you know how to do this, please tell me.
Tall order.
When I was learning data-structures I always found this page to be helpful for understanding data structures.
https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
This has a bunch of different types of graphs if you scroll down.
The JavaScript version of each visualization is still maintained. Maybe you can use this as a point of departure and try to reverse engineer whatever specific graph algorithm you are trying to construct.
Sorry if this question seems a bit complex, but I think it's all related, so I wanted to try to get the answer in one shot. Basically I have a layered graph* with various sets of data, each connected only to the next set (so set1 has vertices with edges to set2, and so on, but set1 has nothing connecting to set3 or to anything other than set2; this might be relevant, I'm not sure). Generally, you can think of my data as one massive family tree (every set adds about a billion nodes) that I keep loading new generations into with every new set (families create new families and no edges go backwards).
I have an HBase/Hadoop system running and I know how to use Java to add columns and values, but what I don't know how to do is:
Add data to HBase in a graph-type format (since it's HBase, I want to load it in a way that lets me add a ton of data and have it scale, unlike other databases that limit graphs to the size of the system). I know how to add data but don't understand how to do it in a scalable graph way.
Once the graph is loaded, apply some kind of analytics to it. PageRank is popular so I thought I would mention it, but pretty much anything that is based on processing a graph would do.
I guess the simplified way of asking the question is: how do I specifically get a graph into HBase, and once it's there, how do I analyze it? Is there a tutorial? There's a lot of HBase information on the internet (I read the HBase book), but I could not find anything specific to graphs. I found Giraph, but I don't think it can connect to HBase (yet). Seeing how Hadoop/HBase are versions of MapReduce/Bigtable, I suspect there is a way to process graphs; I'm just not having luck finding anything.
*A layered graph is a directed graph with a level for each different set of vertices, like so: http://en.wikipedia.org/wiki/Layered_graph_drawing
I think this question on SO could help:
https://stackoverflow.com/questions/9865738/is-it-possible-to-store-graphs-hbase-if-so-how-do-you-model-the-database-to-sup/9867563#9867563
This part of my answer to this question might be of use.
Using HBase/Accumulo as input to giraph has been submitted recently (7
Mar 2012) as a new feature request to Giraph: HBase/Accumulo Input
and Output formats (GIRAPH-153)
We use Giraph in this way: we store only minimal data in each vertex, run the graph algorithm with Giraph, and then assemble the result with the rich data using Pig. For the PageRank algorithm, each vertex only needs to store its vertex id and rank, so it can scale to almost the billion-vertex level.
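To illustrate why a vertex id and a rank are enough state per vertex, here is a minimal single-machine sketch of PageRank by power iteration in Ruby. It is not Giraph code and does not show how Giraph distributes the work; the function name `page_rank`, the damping factor of 0.85, the iteration count, and the tiny layered example graph are all assumptions made for illustration.

```ruby
# Hypothetical sketch of PageRank by power iteration.
# edges: { vertex_id => [ids of the vertices it links to] }
def page_rank(edges, damping: 0.85, iterations: 50)
  vertices = (edges.keys + edges.values.flatten).uniq
  count = vertices.size.to_f
  # The only per-vertex state is the id (hash key) and the rank (value).
  rank = vertices.map { |v| [v, 1.0 / count] }.to_h

  iterations.times do
    # Rank held by vertices without outgoing edges is spread evenly.
    dangling = vertices.select { |v| edges.fetch(v, []).empty? }
                       .sum { |v| rank[v] }
    # Each vertex sends an equal share of its rank along every out-edge.
    incoming = Hash.new(0.0)
    edges.each do |source, targets|
      targets.each { |t| incoming[t] += rank[source] / targets.size }
    end
    rank = vertices.map do |v|
      [v, (1.0 - damping) / count + damping * (incoming[v] + dangling / count)]
    end.to_h
  end
  rank
end

# A tiny layered graph: layer 1 -> layer 2 -> layer 3, no backward edges.
layered = { 1 => [2, 3], 2 => [4], 3 => [4] }
page_rank(layered).each { |id, r| puts "#{id}: #{r.round(4)}" }
```

The rich attributes of each vertex never enter the computation, which is what makes joining them back in afterwards (with Pig, in the setup described above) a separate step.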
I am a graph/network enthusiast and this is just for my curiosity :)
I am trying to model the StackOverflow community as a graph/network. Assume that the people in the SO community are nodes and that an answer given to a question establishes a relationship between the corresponding nodes. The relationship can be assumed to be directed (a link from answer -> question) or undirected. The graph could be weighted, with the node weights representing the number of up/down votes (normalized to a scale of 0 to 1).
What kind of graph/network does one end up with at any given snapshot in time? Is it scale-free? Is it a small world? The graph is continuously evolving over time and I would like to understand its structure and dynamics.
Is there a way I can retrieve this relationship data, maybe from SO APIs, or can someone from SO help me out with (sample) data?
Clarification edit:
Scale-free network: A network whose degree distribution asymptotically follows a power law.
Small-world: A network that has sub-networks characterized by the presence of connections between almost any two nodes within them, and in which most pairs of nodes are connected by at least one short path.
To the second part of your question:
Is there a way where can i retrieve
this relationship data from - may be
SO APIs or some one from SO can help
me out with (sample) data?
Try these questions instead. There are a lot of plans to implement an API to access SO data. Some things are still changing, but it is possible to screen-scrape the data or access it via JSON (afaik).
Is there a guide to accessing StackOverflow data programmatically?
What would you want to see in a StackOverflow API?
Are there plans for a StackOverflow API?
Try it out. Good luck!
What kind of graph/network does one end up with at any given snapshot of time? Is it scale-free? Is it a small-world? The graph is continuously evolving over a period of time and i would like to understand its structure and dynamics.
It takes only a few links between remote clusters to turn a random network into a small world one, so it's quite likely to be small world.
As to whether it's scale-free, that would require there to be a few posters with lots of answers and many with only one or two. I seem to recall Jeff saying in one of the podcasts that there were lots of users with only one question; you might be better off asking the question there rather than here, as he will have the data.
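As a rough first check of the "few posters with many answers, many with one or two" pattern, one could tabulate how many posters have each answer count. The sketch below assumes the data is available as a list of [answerer, asker] pairs; the function name `degree_distribution` and the sample pairs are made up for illustration and are not real SO data. A heavy-tailed table is only suggestive, not proof of a power law.

```ruby
# Hypothetical sketch: out-degree distribution of an answer graph.
# answers: array of [answerer, asker] pairs, one per answer given.
def degree_distribution(answers)
  degree = Hash.new(0)
  answers.each { |answerer, _asker| degree[answerer] += 1 }
  # Map each degree to the number of posters that have it.
  degree.values
        .group_by { |d| d }
        .map { |d, group| [d, group.size] }
        .sort
        .to_h
end

# Made-up sample: one prolific answerer and two occasional ones.
sample = [%w[alice dan], %w[alice erin], %w[alice frank],
          %w[bob dan], %w[carol erin]]
degree_distribution(sample).each do |answers_given, posters|
  puts "#{posters} poster(s) gave #{answers_given} answer(s)"
end
```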