How to choose the number of bins in a dc.js bar chart?

I have created a bar chart in dc.js:
d3.csv("input.csv", function (error, data) {
data.forEach(function (x) {
x.x = +x.x;
});
var ndx = crossfilter(data),
dim1 = ndx.dimension(function (d) {
return Math.floor(parseFloat(d['x']) * 10) / 10;
});
var group1 = dim1.group(),
chart = dc.barChart("#barchart");
chart
.width(500)
.height(200)
.dimension(dim1)
.group(group1)
.x(d3.scale.linear().domain([0,3]))
.elasticY(true)
.barPadding([0.4])
chart.xAxis().tickFormat(function(d) {return d});
chart.yAxis().ticks(10);
});
But the number of bars is low. I want to increase the number of bars displayed to 12.
How can I choose the number of bars?

These lines determine the number of bins:
dim1 = ndx.dimension(function (d) {
return Math.floor(parseFloat(d['x']) * 10) / 10;
});
var group1 = dim1.group(),
What you are doing here is rounding down to the previous tenth (0.1). Thus any rows with x equal to 1.12, 1.16, 1.19 will be counted in the 1.1 bin, etc.
If you want an exact number of bins, you'll have to determine the range of x and divide that by the number of bins you want. If you don't need anything that exact, you could just fiddle with the number 10 there until you get what you want.
For example, changing it to
dim1 = ndx.dimension(function (d) {
return Math.floor(parseFloat(d['x']) * 20) / 20;
});
var group1 = dim1.group(),
will give you twice as many bins, because it will round down to the previous twentieth (0.05).
So 1.12 would round to 1.10; and 1.16, 1.19 would round to 1.15, etc.
BTW, parseFloat is unnecessary, because you have already converted x to a number with x.x = +x.x.
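To get an exact number of bins rather than fiddling with the multiplier, the "range divided by bin count" idea can be sketched as below. This is only a minimal sketch: makeBinner is a hypothetical helper (not part of dc.js or crossfilter), and the range [0, 3] and the count of 12 are taken from the question.

```javascript
// Hypothetical helper: returns a function that maps a value to the
// lower edge of its bin, given the data range and the desired bin count.
function makeBinner(min, max, binCount) {
  var width = (max - min) / binCount;
  return function (v) {
    var index = Math.floor((v - min) / width);
    // Clamp so that v === max falls into the last bin instead of a bin of its own.
    index = Math.max(0, Math.min(binCount - 1, index));
    return min + index * width;
  };
}

var toBin = makeBinner(0, 3, 12); // 12 bins, each 0.25 wide

// Assumed usage with the question's code:
//   dim1 = ndx.dimension(function (d) { return toBin(d.x); });
//   chart.xUnits(function () { return 12; });
```

With these numbers, 1.12 lands in the bin starting at 1.0 and 1.3 in the bin starting at 1.25. Bin widths that are not exactly representable in floating point may produce keys with small rounding differences, so rounding the returned key could be needed.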


nested donuts, partial sum of the values in the 2nd dimension array

Is this possible? I first built concentric donuts from two datasets, but the slice sizes did not line up: each ring was proportionate within itself, yet the slices on the inner ring were slightly smaller than the matching slices on the outer ring. I want the inner and outer slices to match, and I read that this requires a nested dataset.
I need the slices for the first 2 values of apples to match between the inner and outer donuts. The remaining apples values should then be summed into one slice, which needs to span the same angle as the corresponding individual slices of the first array. The client just wants to compare the summed values, i.e. see 3 slices alongside the 5 slices.
I used the working apples and oranges JSfiddle to start with from the internet: https://jsfiddle.net/vgq0z5aL/
I modified it here to use the dataset that will work with my problem but couldn't get it to work. Something wrong with the dataset I think?
My Example: https://jsfiddle.net/aumnxjc8/
How can I fix the dataset so it works?
var dataset = {
apples: [13245, 28479, 1000, 1000, 3000],
apples2: [dataset[0][0], dataset[0][1], sumofapples],
};
var sumofapples = dataset[0][3]+ dataset[0][4]+dataset[0][5];
var width = d3.select('#duration').node().offsetWidth,
height = 300,
cwidth = 33;
var colorO = ['#1352A4', '#2478E5', '#5D9CEC', '#A4C7F4', '#DBE8FB'];
var colorA = ['#58A53B', '#83C969', '#A8D996'];
var pie = d3.layout.pie()
.sort(null);
var arc = d3.svg.arc();
var svg = d3.select("#duration svg")
.append("g")
.attr("transform", "translate(" + width / 2 + "," + height / 2 + ")");
console.log(dataset);
var gs = svg.selectAll("g").data(d3.values(dataset)).enter().append("g");
var path = gs.selectAll("path")
.data(function(d, i) { return pie(d); })
.enter().append("path")
.attr("fill", function(d, i, j) {
if (j == 0) {
return colorO[i];
} else {
return colorA[i];
}
})
.attr("d", function(d, i, j) {
if (j == 0) {
return arc.innerRadius(75 + cwidth * j - 17).outerRadius(cwidth * (j + 2.9))(d);
} else {
return arc.innerRadius(75 + cwidth * j - 5).outerRadius(cwidth * (j + 2.5))(d);
}
});
Try:
const apples = [13245, 28479, 11111, 11000, 3876];
const apples2 = [apples[0], apples[1],
apples.slice(2).reduce((sum,item) => sum + item, 0)];
const dataset = { apples, apples2 };
You can see the result in a fiddle
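The original dataset fails for two reasons: dataset is referenced inside its own initializer, where it is still undefined, and dataset[0] indexes an object that only has named keys. A minimal sketch of the same fix using the question's original values follows; collapseTail is a hypothetical helper name introduced here for illustration.

```javascript
// Values from the question's dataset.
var apples = [13245, 28479, 1000, 1000, 3000];

// Hypothetical helper: keep the first `keep` values and collapse
// everything after them into a single summed value.
function collapseTail(values, keep) {
  var head = values.slice(0, keep);
  var tailSum = values.slice(keep).reduce(function (sum, v) {
    return sum + v;
  }, 0);
  return head.concat([tailSum]);
}

var apples2 = collapseTail(apples, 2);
var dataset = { apples: apples, apples2: apples2 };
```

Because both arrays add up to the same total and the pie layout uses sort(null), the slice boundaries of the two rings should line up.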

Violin plot in d3

I need to build a violin plot with discrete data points in d3.
Example:
I am not sure how to align the center for each value on the X axis. The default behavior overlays all points that share the same X and Y value. I would instead like the points to be offset while staying center aligned, e.g. 5.1 has 3 values in the control group and 4.5 has 2 values, all center aligned. Right or left alignment is easy: just translate each point by a specified amount. Center alignment, however, seems quite hacky.
A hacky way would be to manually transform the X value by maintaining a couple of arrays to see whether this is the first, an even or an odd element, and place it by specifying the value. Is there a proper way to handle this?
The only example of a violin plot in d3 I found implements a probability distribution rather than the discrete values I require.
"A hacky way would be to manually transform the X value by maintaining a couple of arrays" - that's pretty much the way most d3 layouts work :-) . Discretise your data set by the y value (weight), keeping a total of the data points in each discrete group and a group index for each datum. Then use those to calculate offsets x-ways and the rounded y-value.
See https://jsfiddle.net/n444k759/4/
// below code assumes a svg and g group element are present (they are in the jsfiddle)
var yscale = d3.scale.linear().domain([0,10]).range([0,390]);
var xscale = d3.scale.linear().domain([0,2]).range ([0,390])
var color = d3.scale.ordinal().domain([0,1]).range(["red", "blue"]);
var data = [];
for (var n = 0; n <100; n++) {
data.push({weight: Math.random() * 10.0, category: Math.floor (Math.random() * 2.0)});
}
var groups = {};
var circleR = 5;
var discreteTo = (circleR * 2) / (yscale.range()[1] / yscale.domain()[1]);
data.forEach (function(datum) {
var g = Math.floor (datum.weight / discreteTo);
var cat = datum.category;
var ref = cat+"-"+g;
if (!groups[ref]) { groups[ref] = 0; }
datum.groupIndex = groups[ref];
datum.discy = yscale (g * discreteTo); // discrete
groups[ref]++;
});
data.forEach (function(datum) {
var cat = datum.category;
var g = Math.floor (datum.weight / discreteTo);
var ref = cat+"-"+g;
datum.offset = datum.groupIndex - ((groups[ref] - 1) / 2);
});
d3.select("svg g").selectAll("circle").data(data)
.enter()
.append("circle")
.attr("cx", function(d) { return 50 + xscale(d.category) + (d.offset * (circleR * 2)); })
.attr("r", circleR)
.attr("cy", function(d) { return 10 + d.discy; })
.style ("fill", function(d) { return color(d.category); })
;
The above example discretizes into groups according to the size of the display and the size of the circle to draw. You might instead want to discretize by a given interval and then work out the circle size from that.
Edit: Updated to show how to differentiate when category is different as in your screenshot above
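The centering step in the second forEach above comes down to one formula: offset = index - (count - 1) / 2. A small stand-alone sketch (centeredOffsets is a hypothetical name, not part of d3) shows what it produces for a group of a given size:

```javascript
// Hypothetical helper: horizontal offsets (in units of one circle diameter)
// for `count` points that share the same discretized y value.
function centeredOffsets(count) {
  var offsets = [];
  for (var i = 0; i < count; i++) {
    offsets.push(i - (count - 1) / 2);
  }
  return offsets;
}

var three = centeredOffsets(3); // [-1, 0, 1]
var two = centeredOffsets(2);   // [-0.5, 0.5]
```

Odd-sized groups put one point exactly on the center line, while even-sized groups straddle it, so no special casing for first, even or odd elements is needed.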

How do I tweak binning for dc.js and crossfilter? Is that the performance bottleneck?

I'm trying to make a generic cross filter that can take in a csv and build a dashboard. Here are working examples:
https://ubershmekel.github.io/gfilter/?dl=https://ubershmekel.github.io/csvData/spent.csv
https://ubershmekel.github.io/gfilter/?dl=https://ubershmekel.github.io/csvData/Sacramentorealestatetransactions.csv
But for some reason the flight data is slow and unresponsive. Compare these 2 which analyze the same data:
https://ubershmekel.github.io/gfilter/?dl=https://ubershmekel.github.io/csvData/flights-3m.csv
https://github.com/square/crossfilter
I think it's because the histogram binning is too detailed, but I can't find a good way to tweak that in the API reference. @gordonwoodhull mentioned:
If the binning is wrong you really want to look at the way you've set up crossfilter - dc.js just uses what it is given.
How do I tweak the binning of crossfilter? I've tried messing with the xUnits, dimension and group rounding to no avail.
This is the problem code I suspect is slow/wrong:
var dim = ndx.dimension(function (d) { return d[propName]; });
if (isNumeric(data[0][propName])) {
var theChart = dc.barChart("#" + chartId);
var countGroup = dim.group().reduceCount();
var minMax = d3.extent(data, function (d) { return +d[propName] });
var min = +minMax[0];
var max = +minMax[1];
theChart
.width(gfilter.width).height(gfilter.height)
.dimension(dim)
.group(countGroup)
.x(d3.scale.linear().domain([min, max]))
.elasticY(true);
theChart.yAxis().ticks(2);
You can adjust binning by passing a function that adjusts values to the group() method. For example, this group would create integer bins:
var countGroup = dim.group(function (v) { return Math.floor(v); });
And this one would create bins of 20 units a piece:
var countGroup = dim.group(function(d) { return Math.floor(d / 20) * 20 });
Factoring out a variable for bin size:
var bin = 20; // or any integer
var countGroup = dim.group(function(d) { return Math.floor(d / bin) * bin });
If you use binning, you'll also likely want your bars to be of a width matching your bin size. To do so, add a call to xUnits() on your bar chart. xUnits() sets the number of points on the axis:
.xUnits(function(start, end, xDomain) { return (end - start) / bin; })
See the documentation for crossfilter's dimension.group() and dc.js's xUnits().
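To see why the group function matters, it can help to count how many distinct group keys (and therefore bars) a key function produces. The sketch below uses plain JavaScript with made-up sample values; countBins is a hypothetical helper, and whether the bar count is really the bottleneck in the asker's page is an assumption.

```javascript
// Hypothetical helper: counts the distinct keys a group function produces,
// which is roughly the number of bars the chart would have to draw.
function countBins(values, keyFn) {
  var seen = {};
  values.forEach(function (v) {
    seen[keyFn(v)] = true;
  });
  return Object.keys(seen).length;
}

var bin = 20;
var values = [1, 2.5, 19, 20, 39.9, 40]; // made-up sample data

// Without binning, every distinct value becomes its own bar.
var unbinned = countBins(values, function (v) { return v; });

// With binning, values collapse onto the bin starts 0, 20 and 40.
var binned = countBins(values, function (v) {
  return Math.floor(v / bin) * bin;
});
```

Here unbinned is 6 and binned is 3. On a large CSV with many distinct values, the difference would be far larger.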
You can check out the results at:
https://ubershmekel.github.io/gfilter/?dl=testData/Sacramentorealestatetransactions.csv
This worked for me. I had to avoid 3 pitfalls: the group() function needed to round to the bar locations, xUnits needed to return the number of bars, and the domain (x axis) needed to be extended so the max value is shown.
var numericValue = function (d) {
if (d[propName] === "")
return NaN;
else
return +d[propName];
};
var dimNumeric = ndx.dimension(numericValue);
var minMax = d3.extent(data, numericValue);
var min = minMax[0];
var max = minMax[1];
var barChart = dc.barChart("#" + chartId);
// avoid very thin lines and a barcode-like histogram
var barCount = 30;
var span = max - min;
var lastBarSize = span / barCount;
var roundToHistogramBar = function (d) {
if (isNaN(d) || d === "")
d = NaN;
if (d == max)
// This fix avoids the max value always being in its own bin (max).
// I should figure out how to make the grouping equation better and avoid this hack.
d = max - lastBarSize;
var res = min + span * Math.floor(barCount * (d - min) / span) / barCount;
return res;
};
var countGroup = dimNumeric.group(roundToHistogramBar);
barChart.xUnits(function () { return barCount; });
barChart
.width(gfilter.width).height(gfilter.height)
.dimension(dimNumeric)
.group(countGroup)
.x(d3.scale.linear().domain([min - lastBarSize, max + lastBarSize]).rangeRound([0, 500]))
.elasticY(true);
barChart.yAxis().ticks(2);

d3js time scale same value but different results

I have two overlapping charts. One is a barchart and one is a linechart.
Both use the same dates in their data (same dates, same number of results), but when I calculate the x attribute for each chart I get different results.
var x = d3.time.scale().range([0, width]);
where width = 860.
So I debugged deep into the d3.js code and noticed the following:
If I call x(d.value) and step into it, I arrive at the following code:
function d3_uninterpolateNumber(a, b) {
b = b - (a = +a) ? 1 / (b - a) : 0;
return function(x) {
return (x - a) * b;
};
}
I saw that x and a have the same value: x is a date (Tue Jul 2 00:00:00 UTC+0200 2013) and a is the corresponding timestamp in milliseconds (1372716000000, the first date in my data, the 2nd of July).
But b has different values. So I have no idea what b is.
In case of barchart it is 3.7947783849423196e-10 and in case of LineChart it's 3.7947783849423196e-10
So the result of x(d.value) is different, but it shouldn't be.
Has anybody an idea?
I try to get a fiddle tomorrow, but maybe somebody has the answer without js-fiddle.
thx in advance
©a-x-i
Problem solved.
I render the bar chart and the line chart in two different functions, and each function sets x.domain.
In the line chart:
x.domain([d3.min(data, function (d) { return d.xValue; }), d3.max(data, function (d) { return d.xValue; })]);
But in BarChart:
var xMax = d3.max(data, function (d) { return d.xValue; });
xMax = new Date(xMax.getTime()); // copy the date, otherwise data[<last>].xValue is changed as well!
if (ChartHandler.SelectedAggregationPeriod == ChartHandler.AggregationPeriod.month) {
xMax.setTime(xMax.getTime() + 12 * 60 * 60 * 1000); //+12h
}
var xMin = d3.min(data, function (d) { return d.xValue; });
x.domain([xMin, xMax]);
that's it....
©a-x-i
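This also explains what b is: in d3_uninterpolateNumber, a is the start of the domain and b becomes 1 / (domain end - domain start), so a different domain end changes every mapped position except the first. The sketch below is a simplified stand-in for a linear time scale, not d3's actual implementation; the July 2013 dates are made up for illustration, and the 12 hour extension and the width of 860 come from the post.

```javascript
// Simplified stand-in for a linear scale (assumption: not d3's real code).
function makeLinearScale(domain, range) {
  var a = domain[0];
  // Same expression as in d3_uninterpolateNumber: reciprocal of the domain span.
  var b = domain[1] - a ? 1 / (domain[1] - a) : 0;
  return function (x) {
    return range[0] + (x - a) * b * (range[1] - range[0]);
  };
}

var start = Date.UTC(2013, 6, 1);   // made-up first date
var end = Date.UTC(2013, 6, 31);    // made-up last date
var halfDay = 12 * 60 * 60 * 1000;  // the +12h used in the bar chart

var lineX = makeLinearScale([start, end], [0, 860]);
var barX = makeLinearScale([start, end + halfDay], [0, 860]);

var probe = Date.UTC(2013, 6, 16);  // a date in the middle of the data
var linePos = lineX(probe);
var barPos = barX(probe);
```

Both scales map start to 0, but linePos comes out larger than barPos because the bar chart's domain is 12 hours longer. Giving both charts the same domain makes the positions agree.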

D3: Can't select a subset of my dataset

I would like to select a subset of data with .select() or .selectAll().
For example, I have a dataset:
var dataset = [4,5,6,7,9,56]
Each number of this dataset is bound to an SVG <rect>:
svg.selectAll("rect")
.data(dataset)
.enter()
.append("rect");
Now I would like to select only a subset of data for applying some stuff on it (colouring in yellow in my case).
This works for colouring every <rect>:
var allRect = myselection.selectAll("rect")
.attr("fill","rgb(255, 255, 0)");
But I would like to select, for example, only the <rect>s corresponding to a number between 5 and 7. Or at least the <rect> corresponding to a specific number from my dataset.
I tried:
var specificRect = myselection.selectAll("rect")[5:9]
var specificRect = myselection.selectAll("rect")[5]
var specificRect = myselection.selectAll("rect")[2,3,4]
var specificRect = myselection.selectAll("rect").data(dataset)[1]
None of those are working. Thanks for your help.
The solution was to use .filter():
var specificRect = myselection.selectAll("rect").data(dataset)
.filter(function(d) { return (d >= 5 && d <= 9) })
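The predicate can be factored out and checked against the plain array first, since a d3 selection's filter calls its function with the bound datum as the first argument, much like Array.prototype.filter. In this sketch inRange is a hypothetical helper, and the d3 call is shown only as a comment because it needs the page's myselection.

```javascript
// Hypothetical helper: builds a predicate for values between lo and hi, inclusive.
function inRange(lo, hi) {
  return function (d) {
    return d >= lo && d <= hi;
  };
}

var dataset = [4, 5, 6, 7, 9, 56];

// Check the predicate on the raw data before using it on the selection.
var between5and7 = dataset.filter(inRange(5, 7));

// Assumed usage on the existing selection (data is already bound):
//   myselection.selectAll("rect")
//       .filter(inRange(5, 7))
//       .attr("fill", "rgb(255, 255, 0)");
```

Since the rects already carry their data from the earlier enter step, calling .data(dataset) again before .filter is probably not required.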
