D3 Sankey Diagrams: How to handle dynamic data? I.E. Nodes/Links with no value (or 0 value) - d3.js

This is a question for those working with D3.js Sankey diagrams.
What is the recommended approach for organizing/creating the JSON data for these diagrams?
Every example I've come across involves the example creator having the JSON/CSV having at least one value for every node/link.
However, If you try to supply dynamic data where you don't know if everything will have a value, and if a node/link value is zero, the node floats to the top-left and it disconnected from everything.
Example:
The items in this picture I'm talking about are: "Fugitive Emissions", "Industry", and "electricity and heat"
Cleaning up these nodes seems non-trivial.
It would seem logical to somehow exclude them from the JSON data to begin with, but this feels like an overly complicated process.
I.E. If you remove a node in the JSON, you need to remove all links associated with that node, but this would seem to require a recursive check to ensure everything is kept clean.
I'm still searching for possible ways to have the D3 Sankey remove the nodes without values, but the crux of the problem seems to be that the SVG object is populated with data for nodes based on the JSON data supplied, not what is passed to the Sankey JavaScript code. I've tried filtering the nodes I don't want here, but it doesn't affect what has already been attached to the SVG (Which is ultimately displayed on the diagram).
Any suggestions are welcome regarding how to remove/filter nodes without values. Either an addition to the Sankey code, or a suggestion for organizing/filtering the JSON data to be supplied to the Sankey.

Related

How to do data modelling for a flowchart?

I've a requirement to persist a flowchart built by users into database. Until now, I've been using BLOBs to store the flowchart image. But I want to serialize the flow chart and persist them as a structured data.
For example, a flow chart will have start and end, process box, input/output box, decision flow etc. All these elements will be connected by arrows. I'm blanking out on how to model data structure that will store this information. Any tips would be appreciated?
Sorry, but this question is too vague.
Perhaps the best question to start off asking yourself:
Q: Do I really want to invent my own format, or find an existing "flowchart"
format that does what I need?
If you want to invent your own, I'd strongly suggest:
Using JSON or XML (instead of something completely ad-hoc), and
Finding JSON or XML examples for storing "graphs" (nodes, edges, etc).
Here is one such example: http://jsongraphformat.info/
But then your next challenge would be how to DISPLAY your flowchat, after you've read it from the database.
ANOTHER SUGGESTION:
Assume, for example, you're using Visio. I'd just store the .vsd in your database!

Arango single tree response using AQL only

I have found several questions that are similar but no solution worked as needed, or used internal functions. This is the most relevant one:
Getting data for d3 from ArangoDB using AQL (or arangojs)
I'm unable to understand how to return a single response with a tree structure of parent + children. Something that D3 can understand. Whatever I do, beyond the first iteration, everything is a mess. I have tried MERGE and MERGE_RECURSIVE but it just did not work as I thought of.
I'm clueless to how I can make it to work. I'm used to Neo4J and for some reason this one is just hard for me to understand.
Any help will do,
Thanks,
DD.
I found a simple solution. I'm just using AQL to get a flat list of results and their edges. After that, I just sort it as I need on my code

Extend data to extent of graph

I have time-sensitive data that ranges from now until 7 days from now with hourly resolution ~220 values. I was able to get them to plot following Mike Bostock's demo here: http://bost.ocks.org/mike/cubism/intro/demo-stocks.html
but I can't seem to find how to extend the timescales and data to span across the entire display. I was looking for an extent argument or xrange or width, but I haven't had any luck. I'm sure the answer is trivial, but I can't seem to find it.
var context = cubism.context()
.step(3600000) // <-- this changes the time resolution
.size(1280) // <-- this changes the width
.stop();
Also, d3.time.scale.domain seems to be undefined in d3.v3.min.js.
Here's the fiddle that shows you what the code looks like (because it's calling d3.csv I wasn't sure how to get it completely working in the fiddle...so I included the csv file below the javascript.) : http://jsfiddle.net/oay7tvq0/
I think that cubism uses a single pixel for each time slice, so if you've only got 220 timesteps only 220 pixels are needed. Check the documentation to be sure. If I'm correct then you'll either have to be creative with your timeseries and extend the timestep (by repeating adjacent records) or perhaps d3 (or one of the various libraries built on it) would better suit your purposes. – user1614080
This I'm sure is the answer...

How can I query kendo trees and MVVM models?

According to the docs I should be able to do this ...
$("#tree").data("kendoTreeView").expand(".k-item");
Great if i want to expand everything, but what if i only want to expand nodes where the property "expanded" in my model items is set to true?
Is there a way i can query the tree based on something in the model then perform an action on all results?
The real answer here is quite long, the short version being as with everything kendo, spend hours with support to be given half the solution and told to write the rest yourself.
I got round this problem by using another library (jslinq) to query the model data.
This is yet another frustrating issue with kendo that really should at the very least be offered as a core part of the heirarchy data source at some basic level (essentially an incomplete implementation).

Drawing a user-defined tree

I am making a pretty abstract tree drawing system, but I am having quite a lot of trouble formalizing all the drawing features it should have. I'd very much appreciate if someone could point me to things to read about this topic, because unfortunately my searches have been in vain.
I am looking for/trying to make a meta-language for displaying trees. In these trees each node is an instance of a user-defined Object which have a user-defined graphical representation.
Each Object is associated with a Name, a graphical representation and has a finite number of childs ( 0+ ), which are only known to be Objects themselves. Object recursion is not allowed.
Each Object may have user-defined Options that are used to trigger conditions which would change their graphical representation ( in user-defined ways ). Some Options are automatically applied, others may require user interaction ( "Would you like this Object to be A or B?" ), thus explaining why Object trees need to be instanced.
Object
Name // The Object Name
Childs // List of Object Childs
ContextName // The Name of the Child within this context
Types // List of Objects' names. This child may be only one of them. Decided by the user during instancing.
Options // List of Options assigned to this child. Some of them may require user interaction, and apply other Options to the Child's childs.
*Priority // This is an integer which is used to decide the order in which childs are drawn.
Symbol Name // The Graphical representation of the Object
Once an Object Tree has been instanced, it has to be drawn without any addictional user input, and this is where I am having some trouble. The instancing of an Object tree assigns to each Object a particular graphical representation ( let's call it Symbol ). The assignment is however not known before the instancing. Different Objects may also have the same Symbol, which may be drawn differently depending on the Object's Options.
Because of this, Symbols must be defined separately from Objects, and must have a series of abstract mechanisms to be able to draw themselves ( and thir assigned childs ) correctly, following the user-specified rules.
Each Symbol is represented by an image ( or no image ) plus a finite number of Attachments. Attachments are relative positions to the Symbol's coordinates which tell the drawing code where to draw the Symbols of the Object's childs. Each one of them may have particular conditions to be used ( e.g. this Attachment may only be used by a Symbol that has a particular Option, or if N Symbols have already been drawn, no collisions with already drawn Symbols etc etc ).
The algorithm has to find a free Attachment for each Object's child, following the order specified by their Priority. If it is not possible to find an Attachment for a Child the user may specify beforehand rules that allow some automatic retries, but if they also fail then the whole tree drawing fails. Some of these rules allow for adding addictional child Symbols and/or assigning child Symbols to other childs ( making them grandChildren ) etc.
Symbol
Name
Main Image // Image Path, Height, Width
Attachments // List of the attachments, their position, requirements and addictional infos
Fail Rules // List of actions to do if it is not possible to successfully assign each Child to an Attachment
My main problem is that the number of variables that a Symbol should be able to access is pretty high. Each Symbol, which I'll again remind should be defined using this meta-language, should be able to access its child Symbols' informations ( not others to avoid deadlocks and circular referencing ): for example the user may want the heigth and width of a Symbol to be equal to the sum of the heigth and width of all the Child's Symbols, or to use the same picture, and so on. This is also caused by the fact that the user writes Symbols' rules independently from the final structure.
At the same time, since the tree must be drawn from top to bottom, some of these informations may not be available from the start, and may require a great deal of backtracking.
Also, since all of this has to be defined within a meta-language which I have to be able formalize and parse, I have to define which are the functions that the meta-language requires to allow the maximum grade of freedom to the language-writing user without being overly complex ( this is a vague limit, but essentially I don't want to have Tikz as a subset of my meta-language ). I am having however quite a bit of trouble identifying them.
As I said before, I am looking for informations about this kind of topic and/or methods for completing a project like this. Once I'll be able to fully complete the meta-language I think I won't have too much trouble implementing the code to do all of this, my problems are for the most part theoretical.
I have done a few similar projects with hierarchical data. I point you to where I started:
Joe Celko is the king of tree data. I recommend you start with his book. It is a mix of logic and business case. Trees and Hierarchies in SQL for Smarties even has a new edition out. There is a language for describing the hierarchies there, too.
I have used Oracle for storing my hierarchies which has a very efficient system for pulling and storing tree data. Look up "connect by" either in documentation or in the book: Mastering Oracle SQL by Mishra and Beaulieu.
You can use pointers to pull the images from the server so you wouldn't store them in the database. I have built several systems that use hierarchical displays of data with graphical objects, it keeps the overhead down this way. DevExpress and Telerik both have great viewers for displaying the trees and I have mine build the next levels dynamically. It doesn't know how many or what the next level is going to be until it is drilled down on. Try these examples and read the docs and you will be able to put this together in not time.
For telerik this link will show you multiple load on demand views: http://demos.telerik.com/aspnet-ajax/treeview/examples/programming/loadondemandmodes/defaultcs.aspx
For Devexpress: http://demos.devexpress.com/ASPxTreeListDemos/Data/VirtualMode.aspx
Think in HTML/DOM.
I was surprised, when I found, that the file format of the outliner I am using, NoteCase, is plain HTML. NoteCase can be found here: http://notecase.sourceforge.net/index1.html
If you don't familiar with it, outliner is an application type, which you can organize mainly text nodes in hierarchical tree. There are task outliners, too. When an outliner has graphical representation, it's called mind mapping. Anyway, the directory structure of a filesystem is an outline, too. There are lot of outliners for various areas. See Wikipedia for more details.
Notecase uses DL/DT/DD: DL is the list, DT is the item, and DD is the description of an item. They can be nested, of course.
If the format is HTML, you need only a CSS to show it in a browser easy readable for human eyes.
If you have additional properties, you can define additional tags or attributes, which browser will not show, but your renderer can.
You should write a converter, which transforms your source HTML file format to a more detailed HTML format, which contains computed fields (e.g. which sums of values from the sub-nodes, or replaces "inherit" marks in a sub-node with the inherited value from parent node), some additional formatting, or you can transform attributes into HTML nodes:
<node type="x" size="y" />
to
<div class="node">
<div class="param"> type: x </div>
<div class="param"> size: y </div>
</div>
Your data representation is a kinda DOM, and you should process it similar way. First, parse and read values from the file. Then, you should run some additional rounds (walk the tree) to fill missing values with defaults, calculate inherited and summarized values etc.
I think, you can't use standard DOM parsers, because you mentioned custom sorting of stuff depending on parameters, which DOM modell doesn't really support.
Don't afraid to walk the object tree just as many passes as many operation you want to perform on it. You can play with changing the order of the passes, enabling and disabling passes... as your have more and more features, it will articulated as new processing passes.
You may have passes, which must be run several times, e.g. if one pass can't calculate a value (because it's source should be calculated first), it may return a flag that "I've not done yet", and it should run again on the tree, until it results "no change mades, I've done".
I hope I've push you a bit.

Resources