Why is D3 quadtree dropping nodes? - d3.js

I’m having issues with D3’s excellent quadtree appearing to drop nodes unpredictably. I can understand that it might not return all nodes if they are closely overlapping, but it would be very useful to understand more about when this might happen so I can work around it.
But that assumes that I’m not misusing it. If I run this with 10,000 points in data (below), I consistently get a drop of about 29% in leaf nodes. Even with only 200 points I can get one drop. This feels too high.
Am I doing something wrong with my quadtree implementation?
What could I do to work around this?
var quadtree = d3.geom.quadtree()
    .x(function(d) { return d[0]; })
    .y(function(d) { return d[1]; });
var data = d3.range(10000)
    .map(function(d) {
        return [
            Math.random(),
            Math.random()
        ];
    });
If I run this count of quadtree leaves, I get a number below data.length:
var qt = quadtree(data),
    count = 0;
qt.visit(function(p, x1, y1, x2, y2) {
    if (p.leaf) count++;
});
But if I run this filter, it returns an empty array suggesting that they are all there:
data.filter(function(d){return qt.find([d.x,d.y]).id !== d.id;});
Where am I going wrong?!

Leaf and point are not interchangeable. Points can exist on internal nodes.
https://github.com/mbostock/d3/wiki/Quadtree-Geom
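To make the distinction concrete, here is a minimal sketch, assuming the d3 v3 `d3.geom.quadtree` API used in the question, where `visit` passes each node to the callback and a node that holds a datum exposes it as `node.point`. The helper name `countQuadtreeNodes` is hypothetical, not part of d3.

```javascript
// Count leaves and stored points separately on a d3 v3 quadtree root.
// Assumes each visited node has `leaf` (boolean) and `point` (the datum, or nothing).
function countQuadtreeNodes(root) {
    var counts = { leaves: 0, points: 0 };
    root.visit(function (node) {
        if (node.leaf) counts.leaves++;
        if (node.point) counts.points++;
        // Returning nothing (falsy) tells visit() to keep descending into children.
    });
    return counts;
}

// Usage with the question's code (requires d3 v3):
//   var counts = countQuadtreeNodes(quadtree(data));
//   counts.points should match data.length even when counts.leaves does not.
```

Counting `point` instead of `leaf` should account for the "dropped" entries, since they are expected to sit on internal nodes rather than being lost.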

Related

dc.js - avoid data points animation when adding data to scatter plot

I'm trying to implement a live data visualization (i.e. with new data arriving periodically) using dc.js. The problem I'm having is the following - when new data is added to the plot, already existing points often start to "dance around", even though they were not changed. Can this be avoided?
The following fiddle illustrates this.
My guess is that crossfilter sorts data internally, which results in points moving on the chart for data items that changed their position (index) in the internal storage. Data is added in the following way:
var data = [];
var ndx = crossfilter(data);
setInterval(function() {
    var value = ndx.size() + 1;
    if (value > 50) {
        return;
    }
    var newElement = {
        x: myRandom(),
        y: myRandom()
    };
    ndx.add([newElement]);
    dc.redrawAll();
}, 1000);
Any ideas?
I stand by my comments above. dc.js should be fixed by binding the data using a key function, and probably the best way to deal with the problem is just to disable transitions on the scatter plot using .transitionDuration(0).
However, I was curious if it was possible to work around the current problems by keeping the group in a set order using a fake group. And it is indeed, at least for this example where there is no aggregation and we just want to display the original data points.
First, we add a third field, index, to the data. This has to order the data in the same order in which it comes in. As noted in the discussion above, the scatter plot is currently binding data by its index, so we need to keep the points in a set order; nothing should be inserted.
var newElement = {
    index: value,
    x: myRandom(),
    y: myRandom()
};
Next, we have to preserve this index through the binning and aggregation. We could keep it either in the key or in the value, but keeping it in the key seems more fitting:
var xyiDimension = ndx.dimension(function(d) {
    // the index rides along in the key so it survives grouping
    return [+d.x, +d.y, d.index];
});
var xyiGroup = xyiDimension.group();
The original reduction didn't make sense to me, so I dropped it. We'll just use the default behavior, which counts the number of rows which fall into each bin. The counts should be 1 if included, or 0 if filtered out. Including the index in the key also ensures uniqueness, which the original keys were not guaranteed to have.
Now we can create a fake group that keeps everything sorted by index:
var xyiGroupSorted = {
    all: function() {
        var ret = xyiGroup.all().slice().sort((a, b) => a.key[2] - b.key[2]);
        return ret;
    }
};
This will fetch the original data whenever it's requested by the chart, create a copy of the array (because the original is owned by crossfilter), and sort it to return it to the correct order.
And voila, we have a scatter plot that behaves the way it should, even though the data has gone through crossfilter.
Fork of your fiddle: https://jsfiddle.net/gordonwoodhull/mj81m42v/13/
[After all this, maybe we shouldn't have given the data to crossfilter in the first place! We could have just created a fake group which exposes the original data. But maybe there's some use to this technique. At least it proves that there's almost always a way to work around any problems in dc.js & crossfilter.]
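The idea in the bracketed note, a fake group that exposes the original data directly, could look roughly like the sketch below. It assumes the chart only needs an object with an `all()` method returning `{key, value}` entries whose keys are `[x, y]` pairs, as in the fake group above. The name `makeRawDataGroup` is hypothetical, and this has not been tried against the fiddle; the chart would still need a dimension for filtering.

```javascript
// Build a dc.js-style "fake group" straight from the raw rows, skipping crossfilter grouping.
// Assumes each row has `x` and `y` fields that coerce to numbers, as in the question.
function makeRawDataGroup(rows) {
    return {
        all: function () {
            // Rows keep their insertion order, so index-based binding stays stable.
            return rows.map(function (d) {
                return { key: [+d.x, +d.y], value: 1 };
            });
        }
    };
}

// Usage: keep pushing new elements onto the same array and redraw.
//   var rawGroup = makeRawDataGroup(data);
//   chart.group(rawGroup);
```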

d3.js force graph : links spreading the graph too much

I want to display a set of nodes in a force-directed graph. Those nodes have almost exclusively a parent-child relationship (almost a tree). There are a few nodes which have many children, and most others are leaves.
This produces a structure of several interconnected node "islands". The problem is, those "islands" keep spreading apart, especially when nodes are dragged.
Fiddle here: https://jsfiddle.net/50oz3xtr/
I've been trying different settings in the following block:
var simulation = d3.forceSimulation()
    .force("link", d3.forceLink().id(function(d) { return d.id; }))
    .force("charge", d3.forceManyBody())
    .force("center", d3.forceCenter(width / 2, height / 2));
But without luck (strength, distanceMin...)
My questions are:
- How can I set a "max length" on links, so the graph doesn't spread too much?
- Is there a way to set a "fixed" length to links?
What you can do is set an "ideal" length on each link, which the simulation will try to optimise towards.
To do this, chain the distance method to the end of your "link" force (see the API reference for d3.forceLink).
e.g.
.force("link", d3.forceLink().id(function(d) { return d.id; }).distance({x}));
You can also pass in a function to the distance method, which allows you to vary ideal distances on a link by link basis.
Also, bear in mind that you have more than one force acting here, so other forces may "interfere" with your link force. I would look at playing with the "strength" setting on each link, and consider using a "collide" force instead of many body.
Playing with the parameters to each force should get you the result you are after.
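As an illustration of passing a function to the distance method, here is a minimal sketch. The rule itself (short links to leaves, longer links between hubs), the pixel values, and the `degree` property on nodes are all assumptions made for the example; d3 does not add `degree` for you.

```javascript
// Hypothetical per-link rule: short links to leaf nodes, longer links between hubs.
// Assumes every node object carries a precomputed `degree` property (not provided by d3),
// and that link.source / link.target are already node objects when d3 evaluates this.
function linkDistance(link) {
    var base = 30; // assumed ideal length in pixels for a link that ends in a leaf
    var minDegree = Math.min(link.source.degree, link.target.degree);
    return minDegree <= 1 ? base : base * 3;
}

// Usage (requires d3 v4 or later):
//   d3.forceLink().id(function (d) { return d.id; }).distance(linkDistance)
```

Keeping leaf links short while giving hub-to-hub links more room is only one possible rule; the same function shape works for any per-link criterion.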

D3 circle packing diameter calculation

I am using the pack layout to pack different numbers of equal-sized circles. I have a group of clusters to visualize, so I call the pack function for each cluster of circles. In all the d3 examples the diameter is either calculated from the given size or fixed. I would like to calculate it according to the number of circles to be packed. So how do I calculate the packing circle diameter?
Is there any formula so that I can pack the circles without wasting space?
If you truly don't care about the relative sizing of the circles, then you could make your JSON file represent only the data you care about (say, names) and feed your packing function a dummy value for the 'value' accessor function to read.
For instance:
var circleChildren = [{
"value": 1
}, {
"value": 1
}, {
"value": 1
}, {
"value": 1
}];
would give you a JSON object that you can use as children for your packing function:
var circleInput = Object();
circleInput.children = circleChildren;
You can verify that in your console by running:
bubble.nodes(circleInput)
    .filter(function (d) {
        return !d.children; // we're flattening the 'parent-child' node structure
    });
where bubble is your D3 packing bubble variable.
Here's a fiddle that demonstrates that. It may have some extra things but it implements what you're looking for. In addition, you can play around with the number of circles by adding more dummies in the JSON file, as well as changing the SVG container size in the diameter variable. Hope that helps!
EDIT: The size of your layout (set here through the somewhat misnamed 'diameter' variable) directly determines the size and diameter of the circles within it. At some point you have to set pack.size() or pack.radius() in order for your circles to display within a layout. From the documentation for size:
If size is specified, sets the available layout size to the specified two-element array of numbers representing x and y. If size is not specified, returns the current size, which defaults to 1×1.
Here you have several options:
If you want your circles to be 'dynamically' sized to your available element's width (that is, if you want them to cover up all the element width available) then I'd recommend you get your element's width beforehand, and then apply in your pack() function. The problem is then you have to think about resizing, etc.
If you want to keep the maximum sizing available, then you have to make your viz responsive. There's a really good question already in SO that deals with that.
I know this isn't the full solution but hopefully that points you in the right direction for what you're trying to do.
FURTHER EDIT:
All of a sudden, another idea came to mind. Kind of an implementation of my previous suggestion, but this would ensure you're using the maximum space available at the time for your circle drawing:
var zone = d3.select("#myDiv");
// style("width") returns a string such as "300px"; parseFloat drops the unit
var myWidth = parseFloat(zone.style("width"));
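On the question of a formula, one rough starting point is an area argument: n circles of radius r cannot fit inside a circle whose radius is smaller than r·√n, because their combined area would exceed it. The sketch below computes only that lower bound. The name `minPackDiameter` is hypothetical, and a real packing always needs somewhat more room than this, so treat the result as a floor rather than the layout's actual diameter.

```javascript
// Area-based lower bound on the diameter of a circle enclosing n equal circles of radius r.
// This is NOT what d3's pack layout computes; the true enclosing diameter is larger for n > 1.
function minPackDiameter(n, r) {
    if (n <= 0) return 0;
    // n * PI * r^2 <= PI * R^2  implies  R >= r * sqrt(n)
    return 2 * r * Math.sqrt(n);
}

// Usage: a floor for the value handed to pack.size([diameter, diameter]).
//   var diameter = minPackDiameter(circleChildren.length, 10);
```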

How do I control the bounce entry of a Force Directed Graph in D3?

I've been able to build a Force Directed Graph using a Force Layout. Most features work great but the one big issue I'm having is that, on starting the layout, it bounces all over the page (in and out of the canvas boundary) before settling to its location on the canvas.
I've tried using alpha to control it but it doesn't seem to work:
// Create a force layout and bind Nodes and Links
var force = d3.layout.force()
    .charge(-1000)
    .nodes(nodeSet)
    .links(linkSet)
    .size([width/8, height/10])
    .linkDistance( function(d) { if (width < height) { return width*1/3; } else { return height*1/3 } } ) // Controls edge length
    .on("tick", tick)
    .alpha(-5) // <---------------- HERE
    .start();
Does anyone know how to properly control the entry of a Force Layout into its SVG canvas?
I wouldn't mind the graph floating in and settling slowly but the insane bounce of the entire graph isn't appealing, at all.
BTW, the Force Directed Graph example can be found at: http://bl.ocks.org/Guerino1/2879486
Thanks for any help you can offer!
The nodes are initialized with a random position. From the documentation: "If you do not initialize the positions manually, the force layout will initialize them randomly, resulting in somewhat unpredictable behavior." You can see it in the source code:
// initialize node position based on first neighbor
function position(dimension, size) {
...
return Math.random() * size;
They start inside the canvas boundary, but the forces can push them outside. There are several options:
- Constrain the nodes inside the canvas: http://bl.ocks.org/mbostock/1129492
- Try more charge strength and shorter links, or more friction, so the nodes tend to bounce less.
- Run the simulation without animating the nodes, showing only the end result: http://bl.ocks.org/mbostock/1667139
- Initialize the node positions yourself: https://github.com/mbostock/d3/wiki/Force-Layout#wiki-nodes (but if you place them all at the center, the repulsion will be huge and the graph will explode even more):
var n = nodes.length;
nodes.forEach(function(d, i) {
    d.x = d.y = width / n * i;
});
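The third option above, running the simulation without animating it, can be sketched as follows for the d3 v3 force layout used in the question. The helper name `settleLayout` and the tick count are assumptions; the pattern of calling `start()`, then `tick()` in a loop, then `stop()` follows the static layout example linked above.

```javascript
// Advance a d3 v3 force layout synchronously so only the settled result gets drawn.
// `tickCount` is an assumed tuning value: more ticks give a more settled layout.
function settleLayout(force, tickCount) {
    force.start();
    for (var i = 0; i < tickCount; i++) {
        // Each call advances the simulation one step; nothing is painted in between
        // because the browser cannot repaint while this loop is running.
        force.tick();
    }
    force.stop();
    return force;
}

// Usage (requires d3 v3): configure the layout without calling .start(), then
//   settleLayout(force, 300);
// and draw the nodes and links once from their final x / y positions.
```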
I have been thinking about this problem too and this is the solution I came up with. I used nodejs to run the force layout tick offline and save the resulting nodes data to a json file.
I used that as the new json file for the layout. I'm not really sure it works better to be honest. I would like hear about any solutions you find.

Constraining d3 force layout graphs based on node degree

I have a force layout with potentially a very large number of nodes, too large for the graph to render responsively. I was thinking that one way to improve the performance of the system was to prune the graph by eliminating nodes based on in- and out-degree when the number of nodes gets too large.
Recomputing the node and link lists is a bit of a nuisance because links are related to indexes in the node array, and so all the links would need to be re-built.
It seems more elegant to be able to mark individual nodes for exclusion (analogously to the way some nodes are fixed) and have the layout algorithm skip those nodes. This would allow me to dynamically select subsets of the graph to show, while preserving as much state for each node (e.g., position) as practical.
Has anyone implemented something like this?
UPDATE:
I tried to implement the filter suggestion, but ran into an interesting error. It appears that the filter method returns an object that does not implement enter:
qChart apply limit:2
NODES BEF: [Array[218], enter: function, exit: function, select: function, selectAll: function, attr: function…]
NODES AFT: [Array[210], select: function, selectAll: function, attr: function, classed: function, style: function…]
Uncaught TypeError: Object [object Array] has no method 'enter'
The following code is run to get from BEF to AFT:
nodeSubset = nodeSubset.filter(function(n) { return (n.sentCount() <= limit); });
UPDATE 2:
I created a jsfiddle to isolate the problem. This example implements my interpretation of ChrisJamesC's answer. When I tried to implement his suggestion directly (putting the filter after the data), the subsequent call to enter failed because the object returned by filter did not have enter defined.
The goal is to make the layout select only those nodes that have active == true, so in this example, this means that node b should be excluded.
You can use the selection.filter() option combined with the node.weight attribute.
What you would normally do is:
var node = svg.selectAll(".node")
    .data(graph.nodes)
    .enter().append("circle")
Here you can do:
var node = svg.selectAll(".node")
.data(graph.nodes)
.filter(function(d){return d.weight>3})
.enter();
You might also have to remove from drawing the links going to these nodes using the same method.
EDIT: If you want to mark nodes as active directly in the data array, you should just filter the data you provide (and do the same for links):
var node = svg.selectAll(".node")
    .data(force.nodes().filter(function(d) { return d.active; }));
var link = svg.selectAll(".link")
    .data(force.links().filter(function(d) {
        var show = d.source.active && d.target.active;
        if (show)
            console.log("kept", d);
        else
            console.log("excluded", d);
        return show;
    }));
Fiddle
If you want to do this by computing the weight of each node, I would still recommend doing it before passing the nodes and links to the graph: mark nodes as active or not according to your criteria, then filter the links according to the active nodes. Otherwise you would have to load the whole force-directed layout only to get the weights, then filter the data and reload the force-directed graph.
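A minimal sketch of that pre-filtering step, assuming each link's `source` and `target` are node objects rather than array indexes. The helper name `pruneByDegree`, the `degree` property, and the threshold are hypothetical; only the `active` flag comes from the discussion above.

```javascript
// Mark nodes active by degree, then keep only links whose endpoints are both active.
// Assumes link.source and link.target are node objects (not indexes into the node array).
function pruneByDegree(nodes, links, minDegree) {
    nodes.forEach(function (n) { n.degree = 0; });
    links.forEach(function (l) {
        l.source.degree++;
        l.target.degree++;
    });
    nodes.forEach(function (n) { n.active = n.degree >= minDegree; });
    return {
        nodes: nodes.filter(function (n) { return n.active; }),
        links: links.filter(function (l) {
            return l.source.active && l.target.active;
        })
    };
}

// Usage: compute the subset first, then hand it to the layout.
//   var pruned = pruneByDegree(allNodes, allLinks, 2);
//   force.nodes(pruned.nodes).links(pruned.links);
```

Because the same node objects are reused, any state stored on them (such as position) carries over when the threshold changes.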

Resources