d3 function(d,i) - the meaning of i changed as data changed - d3.js

I'm confused by this D3 tutorial. The page contains this example code:
var myData = [
[15, 20],
[40, 10],
[30, 17]
]
var svg = d3.select("div.output svg")
var selA = svg.selectAll("g").data(myData)
selA.enter().append("g")
selA.attr("transform", function(d,i) { // I'm confused!
return 'translate(70,' + (i*100+50) + ')'
})
selA.exit().remove()
var selB = selA.selectAll('circle')
.data(function(d) { return d })
selB.enter().append('circle')
selB
.attr("cx", function(d,i) { return i*80 }) // I'm confused!
.attr("r", function(d,i) { return d })
selB.exit().remove()
My confusion is about the two function(d,i) functions. Judging from the code output i means different things in the two functions. In the first function, i seems to be the index for the [15,20], [40,10], [30,17] entries. Therefore the indexes are 0, 1, 2. In the second function i seems to be the second dimension index. So the indexes are 0, 1, 0, 1, 0, 1.
I think this has something to do with
var selB = selA.selectAll('circle')
.data(function(d) { return d })
but I can't quite think it through. Could anyone explain why i means different indexes in the two functions? Thanks!

In your first selection you bind the data ([[],[],[]]) and create a group for each element in it, so the function in selA.attr(..., function(d, i) {}) gets called once for each element of the outer array (indices 0, 1, 2).
For the second part, each group in selA is bound to one of the inner arrays, so the data function given to selB runs 3 times (once per group), each time receiving the inner array bound to that group. Each function in selB.attr(...) is then called for every element of every inner array, and i restarts within each group, hence indices 0, 1 three times.
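As a rough picture of the indexing described above, here is a plain-JavaScript sketch with no D3 involved. The loops only imitate which index each callback receives; they are an illustration, not D3's implementation, and the names outerIndices and innerIndices are made up for this example.

```javascript
// Same data as in the question.
const myData = [[15, 20], [40, 10], [30, 17]];

// selA: one <g> per inner array, so i runs over the outer array.
const outerIndices = myData.map((d, i) => i);

// selB: one group of circles per <g>; i restarts at 0 inside every group.
const innerIndices = [];
myData.forEach((groupData) => {
  groupData.forEach((d, i) => innerIndices.push(i));
});

console.log(outerIndices); // [ 0, 1, 2 ]
console.log(innerIndices); // [ 0, 1, 0, 1, 0, 1 ]
```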
Hope this makes sense :)
Take a look at this example:
http://jsfiddle.net/jaimedp/heEyn/

Related

Difficulty understanding d3 version 4 selection life cycle with nested elements

In d3 v3, a common workflow for me was to create several svg group elements and then append a number of other child elements to each g. As an example, below I've created 3 group elements and appended a circle to each group. I then use the selection.each method to update the radius of each circle:
var data = [2, 4, 8]
var g = d3.select('svg').selectAll('.g').data(data)
g.each(function(datum) {
var thisG = d3.select(this)
var circle = thisG.selectAll('.circle')
circle.transition().attr('r', datum * 2)
})
var enterG = g.enter().append('g').attr('class', 'g')
enterG.append('circle')
.attr('class', 'circle')
.attr('r', function(d) { return d })
g.exit().remove()
What is the proper way to do this in d3 v4? I am very confused about how best to do this. Here's an example of what I'm trying:
var data = [2, 4, 8]
var g = d3.select('svg').selectAll('.g').data(data)
g.enter()
// do stuff to the entering group
.append('g')
.attr('class', 'g')
// do stuff to the entering AND updating group
.merge(g)
// why do i need to reselect all groups here to append additional elements?
// is it because selections are now immutable?
var g = d3.select('svg').selectAll('g')
g.append('circle')
.attr('class', 'circle')
.attr('r', function(d) { return d })
// for each of the enter and updated groups, adjust the radius of the child circles
g.each(function(datum) {
var thisG = d3.select(this)
var circle = thisG.selectAll('.circle')
circle.transition().attr('r', datum * 2)
})
g.exit().remove()
Thanks in advance for any help you can provide. I've used d3 v3 for a long time and feel pretty comfortable with it. However, I am having a very hard time understanding some of the different behaviors in v4.
I think your code could be modified as follows (untested, so unsure):
var data = [2, 4, 8]
var g = d3.select('svg').selectAll('.g').data(data);
// do stuff to the entering group
var enterSelection = g.enter();
var enterG = enterSelection.append('g')
.attr('class', 'g');
//Append circles only to new elements
enterG.append('circle')
.attr('class', 'circle')
.attr('r', function(d) { return d })
// for each of the enter and updated groups, adjust the radius of the child circles
enterG.merge(g)
.select('.circle')
.transition()
.attr('r',function(d){return d*2});
g.exit().remove()
The first .selectAll selects only the elements that already exist. Entering then creates the new elements, which form a new selection. When you need to update everything, you merge the new and existing elements into a single selection.
From that merged selection I select the .circle in each g (a single select, so one element per g, which also propagates the group's datum to the circle) and update the radius through the data-bound attr callback, which spares me an .each call. I am unsure how the two approaches compare; I have simply always done it this way.
Finally, here is a bl.ocks demonstrating the pattern.
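To make the enter / update / merge split concrete without a DOM, here is a small plain-JavaScript sketch. The helper joinByIndex and the string node names are hypothetical; they only imitate what an index-based data join followed by append and merge produces, and are not D3 code.

```javascript
// Hypothetical helper: match data to existing nodes by index, the way a
// data join without a key function pairs them up.
function joinByIndex(nodes, data) {
  const shared = Math.min(nodes.length, data.length);
  return {
    update: nodes.slice(0, shared).map((node, i) => ({ node, datum: data[i] })),
    enter: data.slice(shared).map((datum) => ({ node: null, datum })),
    exit: nodes.slice(shared)
  };
}

const existingNodes = ["g0"]; // pretend one <g> is already on the page
const joined = joinByIndex(existingNodes, [2, 4, 8]);

// "append": create a node for every entering datum.
const entered = joined.enter.map((e, k) => ({
  node: "g" + (existingNodes.length + k),
  datum: e.datum
}));

// "merge": entering and updating items are handled together from here on.
const merged = joined.update.concat(entered);
const radii = merged.map((item) => item.datum * 2);

console.log(merged.map((item) => item.node)); // [ 'g0', 'g1', 'g2' ]
console.log(radii);                           // [ 4, 8, 16 ]
```

The point of the sketch is that the update part alone would cover only g0; the radius update reaches all three groups only after the entered items are merged in.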

d3.js v4: How to access parent group's datum index?

The description of the selection.data function includes an example with multiple groups (link) where a two-dimensional array is turned into an HTML table.
In d3.js v3, for lower dimensions, the accessor functions included a third argument which was the index of the parent group's datum:
td.text(function(d,i,j) {
return "Row: " + j;
});
In v4, this j argument has been replaced by the selection's NodeList. How do I access the parent group's datum index now?
Well, sometimes an answer doesn't provide a solution, because the solution may not exist. This seems to be the case.
According to Bostock:
I’ve merged the new bilevel selection implementation into master and also simplified how parents are tracked by using a parallel parents array.
A nice property of this new approach is that selection.data can
evaluate the values function in exactly the same manner as other
selection functions: the values function gets passed {d, i, nodes}
where this is the parent node, d is the parent datum, i is the parent
(group) index, and nodes is the array of parent nodes (one per group).
Also, the parents array can be reused by subselections that do not
regroup the selection, such as selection.select, since the parents
array is immutable.
This change restricts functionality—in the sense that you cannot
access the parent node from within a selection function, nor the
parent data, nor the group index — but I believe this is ultimately A
Good Thing because it encourages simpler code.
(emphasis mine)
Here's the link: https://github.com/d3/d3-selection/issues/47
So, it's not possible to get the index of the parent's group from within a selection function (the parent's group index can still be retrieved in the function passed to selection.data, as the snippet below shows).
var testData = [
[
{x: 1, y: 40},
{x: 2, y: 43},
{x: 3, y: 12},
{x: 6, y: 23}
], [
{x: 1, y: 12},
{x: 4, y: 18},
{x: 5, y: 73},
{x: 6, y: 27}
], [
{x: 1, y: 60},
{x: 2, y: 49},
{x: 3, y: 16},
{x: 6, y: 20}
]
];
var svg = d3.select("body")
.append("svg")
.attr("width", 300)
.attr("height", 300);
var g = svg.selectAll(".groups")
.data(testData)
.enter()
.append("g");
var rects = g.selectAll("rect")
.data(function(d, i , j) { console.log("Data: " + JSON.stringify(d), "\nIndex: " + JSON.stringify(i), "\nNode: " + JSON.stringify(j)); return d})
.enter()
.append("rect");
<script src="https://d3js.org/d3.v4.min.js"></script>
My workaround is somewhat similar to Dinesh Rajan's, assuming the parent index is needed for attribute someAttr of g.nestedElt:
v3:
svg.selectAll(".someClass")
.data(nestedData)
.enter()
.append("g")
.attr("class", "someClass")
.selectAll(".nestedElt")
.data(Object)
.enter()
.append("g")
.attr("class", "nestedElt")
.attr("someAttr", function(d, i, j) {
});
v4:
svg.selectAll(".someClass")
.data(nestedData)
.enter()
.append("g")
.attr("class", "someClass")
.attr("data-index", function(d, i) { return i; }) // make parent index available from DOM
.selectAll(".nestedElt")
.data(Object)
.enter()
.append("g")
.attr("class", "nestedElt")
.attr("someAttr", function(d, i) {
var j = +this.parentNode.getAttribute("data-index");
});
I ended up defining an external variable "j" and incrementing it whenever "i" is 0.
Example v3 snippet below:
rowcols.enter().append("rect")
.attr("x", function (d, i, j) { return CalcXPos(d, j); })
.attr("fill", function (d, i, j) { return GetColor(d, j); })
and in V4, code converted as below.
var j = -1;
rowcols.enter().append("rect")
// One pass sets both attributes; in separate .attr() calls j would already be at its final value for "fill".
.each(function (d, i) {
if (i == 0) { j++; }
d3.select(this).attr("x", CalcXPos(d, j)).attr("fill", GetColor(d, j));
});
If j is the nodeList...
j[i] is the current node (eg. the td element),
j[i].parentNode is the level-1 parent (eg. the row element),
j[i].parentNode.parentNode is the level-2 parent (eg. the table element),
j[i].parentNode.parentNode.childNodes is the array of level-1 parents (eg. array of row elements) including the original parent.
So the question is, what is the index of the parent (the row) with respect to its parent (the table)?
We can find this using Array.prototype.indexOf like so...
k = Array.prototype.indexOf.call(j[i].parentNode.parentNode.childNodes,j[i].parentNode);
You can see in the snippet below that the row is printed in each td cell when k is returned.
var testData = [
[
{x: 1, y: 1},
{x: 1, y: 2},
{x: 1, y: 3},
{x: 1, y: 4}
], [
{x: 2, y: 1},
{x: 2, y: 2},
{x: 2, y: 3},
{x: 2, y: 4}
], [
{x: 3, y: 4},
{x: 3, y: 4},
{x: 3, y: 4},
{x: 3, y: 4}
]
];
var tableData =
d3.select('body').selectAll('table')
.data([testData]);
var tables =
tableData.enter()
.append('table');
var rowData =
tables.selectAll('tr')
.data(function(d,i,j){
return d;
});
var rows =
rowData.enter()
.append('tr');
var eleData =
rows.selectAll('td')
.data(function(d,i,j){
return d;
});
var ele =
eleData.enter()
.append('td')
.text(function(d,i,j){
var k = Array.prototype.indexOf.call(j[i].parentNode.parentNode.childNodes,j[i].parentNode);
return k;
});
<script src="https://d3js.org/d3.v4.min.js"></script>
Reservations
This approach is using DOM order as a proxy for data index. In many cases, I think this is a viable band-aid solution if this is no longer possible in D3 (as reported in this answer).
Some extra effort in manipulating the DOM selection to match data might be needed. As an example, filtering j[i].parentNode.parentNode.childNodes for <tr> elements only in order to determine the row -- generally speaking the childNodes array may not match the selection and could contain extra elements/junk.
While this is not a cure-all, I think it should work or could be made to work in most cases, presuming there is some logical connection between DOM and data that can be leveraged which allows you to use DOM child index as a proxy for data index.
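The filtering mentioned above could look roughly like this. It is a sketch under assumptions: the helper name indexAmongSameTagSiblings is made up, and the nodes are plain mock objects so that the code runs outside a browser. With real DOM nodes the same function could be called as indexAmongSameTagSiblings(j[i].parentNode).

```javascript
// Hypothetical helper: index of `node` among the children of its parent that
// share its tag name, ignoring text nodes and unrelated elements.
function indexAmongSameTagSiblings(node) {
  const siblings = Array.prototype.filter.call(
    node.parentNode.childNodes,
    (child) => child.tagName === node.tagName
  );
  return siblings.indexOf(node);
}

// Mock nodes standing in for DOM elements.
const table = { tagName: "TABLE", childNodes: [] };
const makeChild = (tagName) => {
  const child = { tagName, parentNode: table };
  table.childNodes.push(child);
  return child;
};
makeChild("CAPTION");           // an extra element that is not a row
const row0 = makeChild("TR");
makeChild(undefined);           // stands in for a whitespace text node
const row1 = makeChild("TR");

console.log(table.childNodes.indexOf(row1));  // 3 (raw position in childNodes)
console.log(indexAmongSameTagSiblings(row1)); // 1 (position among the rows)
```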
Here's an example of how to use the selection.each() method. I don't think it's messy, but it did slow down the render on a large matrix. Note the following code assumes an existing table selection and a call to update().
update(matrix) {
var self = this;
var tr = table.selectAll("tr").data(matrix);
tr.exit().remove();
tr.enter().append("tr");
tr.each(addCells);
function addCells(data, rowIndex) {
var td = d3.select(this).selectAll("td")
.data(function (d) {
return d;
});
td.exit().remove();
td.enter().append("td");
td.attr("class", function (d) {
return d === 0 ? "dead" : "alive";
});
td.on("click", function(d,i){
matrix[rowIndex][i] = d === 1 ? 0 : 1; // rowIndex now available for use in callback.
});
}
setTimeout(function() {
update(getNewMatrix(matrix))
}, 1000);
},
Assume you want to do a nested selection, and your
data is some array where each element in turn
contains an array, let's say "values". Then you
probably have some code like this:
var aInnerSelection = oSelection.selectAll(".someClass") //
.data(function (d) { return d.values; }) //
...
You can replace the array with the values by a new array, where
you cache the indices within the group.
var aInnerSelection = oSelection.selectAll(".someClass") //
.data(function (d, i) {
var aData = d.values.map(function mapValuesToIndexedValues(elem, index) {
return {
outerIndex: i,
innerIndex: index,
datum: elem
};
})
return aData;
}, function (d, i) {
return d.innerIndex;
}) //
...
Assume your outer array looks like this:
[{name: "X", values: ["A", "B"]}, {name: "Y", values: ["C", "D"]}]
With the first approach, the nested selection brings you from here
                   d                                   i
------------------------------------------------------------------
root  dummy  X     {name: "X", values: ["A", "B"]}     0
      dummy  Y     {name: "Y", values: ["C", "D"]}     1
to here.
                   d                                   i
------------------------------------------------------------------
root  X  A         "A"                                 0
         B         "B"                                 1
      Y  C         "C"                                 0
         D         "D"                                 1
With the augmented array, you end up here instead:
               d                                               i
------------------------------------------------------------------
root  X  A     {datum: "A", outerIndex: 0, innerIndex: 0}      0
         B     {datum: "B", outerIndex: 0, innerIndex: 1}      1
      Y  C     {datum: "C", outerIndex: 1, innerIndex: 0}      0
         D     {datum: "D", outerIndex: 1, innerIndex: 1}      1
So you have within the nested selections, in any function(d,i), all
information you need.
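The mapping step can be tried on its own in plain JavaScript. In this sketch the helper name toIndexedValues is made up; it mirrors the values function passed to .data() above and uses the example outer array from this answer.

```javascript
const outer = [
  { name: "X", values: ["A", "B"] },
  { name: "Y", values: ["C", "D"] }
];

// Hypothetical helper: wrap every inner value together with both indices.
function toIndexedValues(d, i) {
  return d.values.map((elem, index) => ({
    outerIndex: i,
    innerIndex: index,
    datum: elem
  }));
}

// One augmented array per outer element, as the nested .data() call would see it.
const indexed = outer.map(toIndexedValues);

console.log(indexed[1][0]); // { outerIndex: 1, innerIndex: 0, datum: 'C' }
```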
Here's a snippet I crafted after re-remembering this usage of .each for nesting; I thought it might be useful to others who end up here. This example creates two layers of circles, and the parent group index is used to determine the color of the circles: white for the circles in the first layer, and black for the circles in the top layer (only two layers in this case).
const nested = nest().key(layerValue).entries(data);
let layerGroups = g.selectAll('g.layer').data(nested);
layerGroups = layerGroups.enter().append('g').attr('class', 'layer')
.merge(layerGroups);
layerGroups.each(function(layerEntry, j) {
const circles = select(this)
.selectAll('circle').data(layerEntry.values);
circles.enter().append('circle')
.merge(circles)
.attr('cx', d => xScale(xValue(d)))
.attr('cy', d => yScale(yValue(d)))
.attr('r', d => radiusScale(radiusValue(d)))
.attr('fill', j === 0 ? 'white' : 'black'); // <---- Access parent index.
});
My solution was to embed this information in the data provided to d3js
const data = [[1,2,3],[4,5,6],[7,8,9]];
// i is the outer (row) index, j the inner (column) index
const flattened_data = data.reduce((acc, v, i) => {
v.forEach((d, j) => {
acc.push({ i, j, d });
});
return acc;
}, []);
Then you can access i, j and d from the data arg of the function
td.text(function(d) {
// Can access i, j and original data here; i is the row (outer) index
return "Row: " + d.i;
});

Third variable in D3 anonymous function

Let's say you've got a selection with some data bound to it and you use the typical inline anonymous function to access that data:
d3.select("#whatever").each(function(d,i,q) {console.log(d,i,q)})
We all know the first variable is the data and the second is the array position. But what does the third variable (q in this case) represent? So far it's always come back zero in everything I've tested.
The secret third argument is only of use when you have nested selections. In these cases, it holds the index of the parent data element. Consider for example this code.
var sel = d3.selectAll("foo")
.data(data)
.enter()
.append("foo");
var subsel = sel.selectAll("bar")
.data(function(d) { return d; })
.enter()
.append("bar");
Assuming that data is a nested structure, you can now do this.
subsel.attr("foobar", function(d, i) { console.log(d, i); });
This, unsurprisingly, will log the data item inside the nesting and its index. But you can also do this.
subsel.attr("foobar", function(d, i, j) { console.log(d, i, j); });
Here d and i still refer to the same things, but j refers to the index of the parent data element, i.e. the index of the foo element.
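The v3 calling convention described above can be imitated in plain JavaScript. This is only a sketch: eachNested is a made-up stand-in for a nested selection, not a D3 function.

```javascript
// Hypothetical stand-in for a v3 nested selection: calls fn(d, i, j) for each
// datum, where i is the index within the group and j is the group's index.
function eachNested(groups, fn) {
  groups.forEach((group, j) => {
    group.forEach((d, i) => fn(d, i, j));
  });
}

const seen = [];
eachNested([["a", "b"], ["c"]], (d, i, j) => seen.push([d, i, j]));

console.log(JSON.stringify(seen)); // [["a",0,0],["b",1,0],["c",0,1]]
```

With a single, non-nested group there is only group 0, which is consistent with the third argument always coming back as zero in the question.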
A note on Lars's reply, which is correct: I found one more feature that is helpful.
The j element gives the index of the element without regard to the nesting of the parent elements. In other words, if you are appending and logging as follows, the final circles are treated as a flat array, not as a group of nested arrays. So your indexes will be scaled from 0 to the number of circle elements you have, without regard to the data structure of your nesting.
var categorygroups = chart.selectAll('g.categorygroups')
.data(data)
.enter()
.append('g').attr('class','categorygroups');
var valuesgroups = categorygroups.selectAll('g.valuesgroups')
.data(function(d) {return d.values; }).enter().append('g').attr('class','valuesgroups');
valuesgroups.append('text').text(function(d) {
return d.category
}).attr('y',function(d,i) { return (i + 1) * 100 }).attr('x',0);
var circlesgroups = valuesgroups.selectAll('g.circlesgroups')
.data(function(d) {return d.values; }).enter().append('g').attr('class','circlesgroups');
circlesgroups.append('circle').style('fill','#666')
.attr('cy',function(d,i,j) { console.log(j); return (j + 1) * 100 })
.attr('cx',function(d,i) { return (i + 1) * 40 });

Unique symbols for each data set in d3 Scatterplot

I am having trouble using d3's symbol mechanism to specify a unique symbol for each set of data. The data's like this:
[[{x: 1, y:1},{x: 2, y:2},{x: 3, y:3}], [{x: 1, y:1},{x: 2, y:4},{x: 3, y:9}], etc.]
The part of the code that writes out the symbols looks like this:
I create a series group for each vector of points. Then:
series.selectAll("g.points")
//this selects all <g> elements with class points (there aren't any yet)
.data(Object) //drill down into the nested Data
.enter()
.append("g") //create groups then move them to the data location
.attr("transform", function(d, i) {
return "translate(" + xScale(d.x) + "," + yScale(d.y) + ")";
})
.append("path")
.attr("d", function(d,i,j){
return (d3.svg.symbol().type(d3.svg.symbolTypes[j]));
}
);
Or at least that's how I'd like it to work. The trouble is that I can't return the function d3.svg.symbol() from the other function. If I try to just put the function in the "type" argument, then data is no longer scoped correctly to know what j is (the index of the series).
right, but I don't want a unique symbol for each datapoint, I want a unique symbol for each series. The data consists of multiple arrays (series), each of which can have an arbitrary number of points (x,y). I'd like a different symbol for each array, and that's what j should give me. I associate the data (in the example, two arrays shown, so i is 0 then 1 for that) with the series selection. Then I associate the data Object with the points selection, so i becomes the index for the points in each array, and j becomes the index of the original arrays/series of data. I actually copied this syntax from somewhere else, and it works ok for other instances (coloring series of bars in a grouped bar chart for example), but I couldn't tell you exactly why it works...
Any guidance would be appreciated.
Thanks!
What is the question exactly? The code that you give answers your question. My bad, j does return a reference to the series. Simpler example.
var data = [
{id: 1, pts: [{x:50, y:10},{x:50, y:30},{x:50, y:20},{x:50, y:30},{x:50, y:40}]},
{id: 2, pts: [{x:10, y:10},{x:10, y:30},{x:40, y:20},{x:30, y:30},{x:10, y:30}]}
];
var vis = d3.select("svg");
var series = vis.selectAll("g.series")
.data(data, function(d, i) { return d.id; })
.enter()
.append("svg:g")
.classed("series", true);
series.selectAll("g.point")
.data(function(d, i) { return d.pts })
.enter()
.append("svg:path")
.attr("transform", function(d, i) { return "translate(" + d.x + "," + d.y + ")"; })
.attr("d", function(d,i, j) { return d3.svg.symbol().type(d3.svg.symbolTypes[j])(); })
The only difference is that I added parentheses after d3.svg.symbol().type(currentType), i.e. d3.svg.symbol().type(currentType)(), to return the path string rather than the generator function. D3 uses jQuery-style chaining. This lets you use symbol().type('circle') to set a value and symbol().type() to get it. Whenever an accessor is used as a setter, what is returned is a reference to a function that has methods and attributes. Keep in mind that in JavaScript functions are first-class objects (see "What is meant by 'first class object'?"). In libraries that use this approach there is often an obvious getter for retrieving meaningful data. With symbol, you have to call the generator itself: symbol()().
The code behind the symbol functionality can be seen at: https://github.com/mbostock/d3/blob/master/src/svg/symbol.js
d3.svg.symbol = function() {
var type = d3_svg_symbolType,
size = d3_svg_symbolSize;
function symbol(d, i) {
return (d3_svg_symbols.get(type.call(this, d, i))
|| d3_svg_symbolCircle)
(size.call(this, d, i));
}
...
symbol.type = function(x) {
if (!arguments.length) return type;
type = d3_functor(x);
return symbol;
};
return symbol;
};
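A stripped-down imitation of the pattern in that excerpt may make the extra pair of parentheses easier to see. Everything here is hypothetical: makeSymbol returns a label string instead of real SVG path data and is not part of D3.

```javascript
// Hypothetical miniature of the getter/setter pattern used by d3.svg.symbol.
function makeSymbol() {
  let type = () => "circle";

  // The generator itself: calling it produces the final value.
  function symbol(d, i) {
    return "path-for-" + type.call(this, d, i);
  }

  symbol.type = function (x) {
    if (!arguments.length) return type;           // getter
    type = typeof x === "function" ? x : () => x; // setter wraps constants
    return symbol;                                // returning symbol enables chaining
  };

  return symbol;
}

const generator = makeSymbol().type("cross");

console.log(typeof generator); // function
console.log(generator());      // path-for-cross
```

Returning generator from an attr callback would hand back a function, while returning generator() hands back the string, which is the difference the answer points out.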
Just in case you haven't tried it yet:
.append("svg:path")
.attr("d", d3.svg.symbol())
as per https://github.com/mbostock/d3/wiki/SVG-Shapes.

Why is domain not using d3.max(data) in D3?

I'm new to D3 and playing around with a scatterplot. I cannot get d3.max(data) to work correctly in setting up domain!
I have the following setting up a random dataset:
var data = [];
for (i=0; i < 40; i++){
data.push({"x": i/40, "y": i/8, "a": Math.floor(Math.random() * 3), "x2": Math.random()});
}
And then the following to set my coordinates:
var x = d3.scale.linear().domain([0, 1]).range([0 + margin, w-margin]),
y = d3.scale.linear().domain([0, d3.max(data)]).range([0 + margin, h-margin]),
c = d3.scale.linear().domain([0, 3]).range(["hsl(100,50%,50%)", "rgb(350, 50%, 50%)"]).interpolate(d3.interpolateHsl);
This puts all 40 points in a single, horizontal line. If I replace d3.max(data) with '5' then it is a diagonal (albeit from the upper left to the bottom right, I'm still struggling to flip y-coordinates). Why isn't d3.max(data) working as expected?
d3.max() expects an array of numbers, not of objects. The elements of data have an internal key-value structure and there is no way for d3.max() to know what to take the maximum of. You can use something like jQuery's $.map to get the elements of the objects you want and then take the max, e.g.
var maxy = d3.max($.map(data, function(d) { return d.y; }));
Edit:
As pointed out in the comment below, you don't even need jQuery for this, as .map() is a native Array method. The code then becomes simply
var maxy = d3.max(data.map(function(d) { return d.y; }));
or even simpler (and for those browsers that don't implement Array.map()), using the optional second argument of d3.max that tells it how to access values within the array
var maxy = d3.max(data, function(d) { return d.y; });
d3.max API documentation can be found here.
# d3.max(array[, accessor])
Returns the maximum value in the given array using natural order. If
the array is empty, returns undefined. An optional accessor function
may be specified, which is equivalent to calling array.map(accessor)
before computing the maximum value. Unlike the built-in Math.max, this
method ignores undefined values; this is useful for computing the
domain of a scale while only considering the defined region of the
data. In addition, elements are compared using natural order rather
than numeric order. For example, the maximum of ["20", "3"] is "3",
while the maximum of [20, 3] is 20.
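The behaviour quoted above can be imitated in a few lines of plain JavaScript. This is a sketch for illustration only: maxBy is a made-up name and the code is not D3's source, but it follows the documented rules (optional accessor, undefined values skipped, natural-order comparison).

```javascript
// Hypothetical re-implementation of the documented behaviour.
function maxBy(array, accessor) {
  let max;
  for (let k = 0; k < array.length; k++) {
    const value = accessor ? accessor(array[k], k, array) : array[k];
    // Skip undefined, null and NaN (NaN >= NaN is false).
    if (value == null || !(value >= value)) continue;
    if (max === undefined || value > max) max = value;
  }
  return max;
}

const points = [{ y: 1 }, { y: 5 }, { y: undefined }];

console.log(maxBy(["20", "3"]));          // 3  (strings compare in natural order)
console.log(maxBy([20, 3]));              // 20 (numbers compare numerically)
console.log(maxBy(points, (d) => d.y));   // 5  (accessor picks the field)
console.log(maxBy([]));                   // undefined
```

The first two calls show why numeric strings should be converted to numbers before computing a domain.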
Applying this information to the original question we get:
function accessor(o){
return o.y;
}
var y = d3.scale.linear()
.domain([0, d3.max(data, accessor)])
.range([0 + margin, h-margin]);
If you end up using many accessor functions you can just make a factory.
function accessor(key) {
return function (o) {
return o[key];
};
}
var x = d3.scale.linear()
.domain([0, d3.max(data, accessor('x'))])
.range([...]),
y = d3.scale.linear()
.domain([0, d3.max(data, accessor('y'))])
.range([...]);
I was having a similar issue dealing with an associative array. My data looked like the following: [{"year_decided":1982,"total":0},{"year_decided":"1983","Total":"847"},...}]
Simply applying parseInt to the value before returning it worked.
var yScale = d3.scale.linear()
.domain([0, d3.max(query,function(d){ return parseInt(d["Total"]); }) ])
.range([0,h]);

Resources