dc.js stacked line chart with more than 1 dimension

My dataset is an array of JSON objects like this:
var data = [ { company: "A", date_round_1: "21/05/2002", round_1: 5, date_round_2: "21/05/2004", round_2: 20 },
...
{ company: "Z", date_round_1: "16/01/2004", round_1: 10, date_round_2: "20/12/2006", round_2: 45 }]
and I wish to display both 'round_1' and 'round_2' time series as stacked line charts.
The base line would look like this :
var fundsChart = dc.lineChart("#fundsChart");
var ndx = crossfilter(data);
var all = ndx.groupAll();
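// dates are stored as "dd/mm/yyyy" strings, so parse them before rounding to the year
var parseDate = d3.time.format("%d/%m/%Y").parse;
var date_1 = ndx.dimension(function(d){
return d3.time.year(parseDate(d.date_round_1));
});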
fundsChart
.renderArea(true)
.renderHorizontalGridLines(true)
.width(400)
.height(360)
.dimension(date_1)
.group(date_1.group().reduceSum(function(d) { return +d.round_1 }))
.x(d3.time.scale().domain([new Date(2000, 0, 1), new Date(2015, 0, 1)]))
I have tried using the stack method to add the second series, but the problem is that only a single dimension can be passed to the lineChart.
Can you think of a workaround to display both series while still using a dc chart?

Are you going to be filtering on this chart? If not, just create a different group on a date_2 dimension and use that in the stack. Should work.
If you are going to be filtering, I think you'll have to change your data model a bit. You'll want to switch to have 1 record per round, so in this case you'll have 2 records for every 1 record you have now. There should be 1 date property (the date for that round), an amount property (the contents of round_x in the current structure), and a 'round' property (which would be '1', or '2', for example).
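As a rough sketch of that reshaping, in plain JavaScript with no crossfilter dependency: the helper name toRoundRecords and the output field names date, amount and round are only illustrative choices, and dates are left as the original strings.

```javascript
// Hypothetical helper: turn one wide record per company into one record per round.
function toRoundRecords(rows, rounds) {
  const out = [];
  rows.forEach(function (row) {
    rounds.forEach(function (r) {
      const amount = row['round_' + r];
      const date = row['date_round_' + r];
      // Skip rounds that a company never had.
      if (amount === undefined || date === undefined) return;
      out.push({ company: row.company, round: String(r), date: date, amount: +amount });
    });
  });
  return out;
}

const wide = [
  { company: 'A', date_round_1: '21/05/2002', round_1: 5, date_round_2: '21/05/2004', round_2: 20 },
  { company: 'Z', date_round_1: '16/01/2004', round_1: 10, date_round_2: '20/12/2006', round_2: 45 }
];

const long = toRoundRecords(wide, [1, 2]);
console.log(long.length); // 4
console.log(long[1]);     // company A, round '2', date '21/05/2004', amount 20
```

The long records are what you would pass to crossfilter() instead of the original array.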
Then you need to create a date dimension and multiple groups on that dimension. The group will have a reduceSum function that looks something like:
var round1Group = dateDim.group().reduceSum(function(d) {
return d.round === '1' ? d.amount : 0;
});
So, what happens here is that we have a group that will only aggregate values from round 1. You'll create similar groups for round 2, etc. Then stack these groups in the dc.js chart.
Hopefully that helps!

Related

Histogram based on "reduceSummed" groups

I have CSV data with the following pattern:
Quarter,productCategory,unitsSold
2018-01-01,A,21766
2018-01-01,B,10076
2018-01-01,C,4060
2018-04-01,A,27014
2018-04-01,B,12219
2018-04-01,C,4740
2018-07-01,A,29503
2018-07-01,B,13020
2018-07-01,C,5549
2018-10-01,A,3796
2018-10-01,B,15110
2018-10-01,C,6137
2019-01-01,A,25008
2019-01-01,B,11655
2019-01-01,C,4630
2019-04-01,A,31633
2019-04-01,B,14837
2019-04-01,C,5863
2019-07-01,A,33813
2019-07-01,B,15442
2019-07-01,C,6293
2019-10-01,A,35732
2019-10-01,B,19482
2019-10-01,C,6841
As you can see, there are 3 product categories sold every quarter. I can make a histogram and count how many Quarters fall into each bin of unitsSold. The problem is that every row (each Quarter and category pair) is counted separately. What I would like is a histogram where unitsSold has first been summed per Quarter with reduceSum, and the bins are built from those sums.
This would result in something like this:
Quarter, unitsSold
2018-01-01,35902
2018-04-01,43973
2018-07-01,48072
2018-10-01,25043
2019-01-01,41293
2019-04-01,52333
2019-07-01,55548
2019-10-01,62055
Each Quarter would then fall into one bin of unitsSold. For example, a bin of 50.000 - 70.000 would count 3 Quarters (2019-04-01, 2019-07-01 and 2019-10-01).
Normally I would do something like this:
const histogramChart = new dc.BarChart('#histogram');
const histogramDim = ndx.dimension(d => Math.round(d.unitsSold / binSize) * binSize);
const histogramGroup = histogramDim.group().reduceCount();
But in the desired situation the histogram is kind of created on something that has already been "reducedSummed". Ending up in a barchart histogram like this (data does not match with this example):
How can this be done with dc.js/crossfilter.js?
Regrouping the data by value
I think the major difference between your question and this previous question is that you want to bin the data when you "regroup" it. (Sometimes this is called a "double reduce"... no clear names for this stuff.)
Here's one way to do that, using an offset and width:
function regroup(group, width, offset = 0) {
return {
all: function() {
const bins = {};
group.all().forEach(({key, value}) => {
const bin = Math.floor((value - offset) / width);
bins[bin] = (bins[bin] || 0) + 1;
});
return Object.entries(bins).map(
([bin, count]) => ({key: bin*width + offset, value: count}));
}
}
}
What we do here is loop through the original group and
1. map each value to its bin number
2. increment the count for that bin number, or start at 1
3. map the bins back to original numbers, with counts
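To try this without crossfilter, here is a small sketch that feeds regroup() a stand-in group holding the per-quarter sums from the question. The object fakeUnitsGroup is an assumption for illustration; regroup() only needs something with an all() method.

```javascript
// Same regroup() as above, repeated so this snippet runs on its own.
function regroup(group, width, offset = 0) {
  return {
    all: function () {
      const bins = {};
      group.all().forEach(({ key, value }) => {
        const bin = Math.floor((value - offset) / width);
        bins[bin] = (bins[bin] || 0) + 1;
      });
      return Object.entries(bins).map(
        ([bin, count]) => ({ key: bin * width + offset, value: count }));
    }
  };
}

// Stand-in for quarterDim.group().reduceSum(...): per-quarter totals from the question.
const fakeUnitsGroup = {
  all: () => [
    { key: '2018-01-01', value: 35902 },
    { key: '2018-04-01', value: 43973 },
    { key: '2018-07-01', value: 48072 },
    { key: '2018-10-01', value: 25043 },
    { key: '2019-01-01', value: 41293 },
    { key: '2019-04-01', value: 52333 },
    { key: '2019-07-01', value: 55548 },
    { key: '2019-10-01', value: 62055 }
  ]
};

const binned = regroup(fakeUnitsGroup, 10000).all();
console.log(binned);
// keys 20000, 30000, 40000, 50000, 60000 with counts 1, 1, 3, 2, 1
```

With 10000-wide bins, the 50000 and 60000 bins together hold the 3 quarters that the question expects between 50.000 and 70.000.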
Testing it out
I displayed your original data with the following chart (too lazy to figure out quarters, although I think it's not hard with recent D3):
const quarterDim = cf.dimension(({Quarter}) => Quarter),
unitsGroup = quarterDim.group().reduceSum(({unitsSold}) => unitsSold);
quarterChart.width(300)
.height(200)
.margins({left: 50, top: 0, right: 0, bottom: 20})
.dimension(quarterDim)
.group(unitsGroup)
.x(d3.scaleTime().domain([d3.min(data, d => d.Quarter), d3.timeMonth.offset(d3.max(data, d => d.Quarter), 3)]))
.elasticY(true)
.xUnits(d3.timeMonths);
and the new chart with
const rg = regroup(unitsGroup, 10000);
countQuartersChart.width(500)
.height(200)
.dimension({})
.group(rg)
.x(d3.scaleLinear())
.xUnits(dc.units.fp.precision(10000))
.elasticX(true)
.elasticY(true);
(Note the empty dimension, which disables filtering. Filtering may be possible but you have to map back to the original dimension keys so I’m skipping that for now.)
Here are the charts I get, which look correct at a glance:
Demo fiddle.
Adding filtering to the chart
To implement filtering on this "number of quarters by values" histogram, first let's enable filtering between the by-values chart and the quarters chart by putting the by-values chart on its own dimension:
const quarterDim2 = cf.dimension(({Quarter}) => Quarter),
unitsGroup2 = quarterDim2.group().reduceSum(({unitsSold}) => unitsSold);
const byvaluesGroup = regroup(unitsGroup2, 10000);
countQuartersChart.width(500)
.height(200)
.dimension(quarterDim2)
.group(byvaluesGroup)
.x(d3.scaleLinear())
.xUnits(dc.units.fp.precision(10000))
.elasticX(true)
.elasticY(true);
Then, we implement filtering with
countQuartersChart.filterHandler((dimension, filters) => {
if(filters.length === 0)
dimension.filter(null);
else {
console.assert(filters.length === 1 && filters[0].filterType === 'RangedFilter');
const range = filters[0];
const included_quarters = unitsGroup2.all()
.filter(({value}) => range[0] <= value && value < range[1])
.map(({key}) => key.getTime());
dimension.filterFunction(k => included_quarters.includes(k.getTime()));
}
return filters;
});
This finds all quarters in unitsGroup2 that have a value which falls in the range. Then it sets the dimension's filter to accept only the dates of those quarters.
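The lookup inside the handler can be checked in isolation. The sketch below assumes the quarters are already Date objects and uses a hand-written array in place of unitsGroup2.all(); quartersInRange is a hypothetical helper name, not part of dc.js.

```javascript
// Stand-in for unitsGroup2.all(): per-quarter sums with Date keys.
const quarterSums = [
  { key: new Date(2018, 0, 1), value: 35902 },
  { key: new Date(2018, 3, 1), value: 43973 },
  { key: new Date(2018, 6, 1), value: 48072 },
  { key: new Date(2018, 9, 1), value: 25043 },
  { key: new Date(2019, 0, 1), value: 41293 },
  { key: new Date(2019, 3, 1), value: 52333 },
  { key: new Date(2019, 6, 1), value: 55548 },
  { key: new Date(2019, 9, 1), value: 62055 }
];

// Hypothetical helper mirroring the body of the filter handler:
// keep the quarters whose summed value lies in [range[0], range[1]).
function quartersInRange(groupRows, range) {
  return groupRows
    .filter(({ value }) => range[0] <= value && value < range[1])
    .map(({ key }) => key.getTime());
}

const included = quartersInRange(quarterSums, [50000, 70000]);
// This predicate is what would be handed to dimension.filterFunction().
const accepts = k => included.includes(k.getTime());

console.log(included.length);                // 3
console.log(accepts(new Date(2019, 3, 1)));  // true
console.log(accepts(new Date(2018, 9, 1)));  // false
```

Comparing getTime() values rather than the Date objects themselves matters here, because two Date objects for the same instant are not equal under ===.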
Odds and ends
Quarters
D3 supports quarters with interval.every:
const quarterInterval = d3.timeMonth.every(3);
chart.xUnits(quarterInterval.range);
Eliminating the zeroth bin
As discussed in the comments, when other charts have filters active, there may end up being many quarters with less than 10000 units sold, resulting in a very tall zero bar which distorts the chart.
The zeroth bin can be removed with
delete bins[0];
before the return in regroup()
Rounding the by-values brush
If snapping to the bars is desired, you can enable it with
.round(x => Math.round(x/10000)*10000)
Otherwise, the filtered range can start or end inside of a bar, and the way the bars are colored when brushed is somewhat inaccurate as seen below.
Here's the new fiddle.

Crossfilter stacked bar charts negate values

I am using crossfilter2 with dc.js v3.
My data is in a CSV file, which I loaded into memory.
Original Data
Day, ID
1, 2
1, 2
1, 2
2, 5
3, 6
4, 6
Processed data
Day, ID, target
1, 2, True
1, 2, True
1, 2, True
2, 5, False
3, 6, False
4, 6, False
Currently I am trying to create a crossfilter stacked bar chart with 2 stacks. Rows with ID == 2 form one group and rows with ID != 2 form another. However, I cannot work out how to do this dynamically in dc.js/crossfilter, so I have to preprocess the data to add a new column and work off that column, as shown in my solution below.
Is there a better way?
var dimID = ndx.dimension(function(d) { return d.day; });
var id_stacked = dimID.group().reduce(
function reduceAdd(p, v) {
p[v.target] = (p[v.target] || 0) + 1;
return p;
},
function reduceRemove(p, v) {
p[v.target] = (p[v.target] || 0) - 1;
return p;
},
function reduceInitial() {
return {};
});
//Doing the stacked bar chart here
stackedBarChart.width(1500)
.height(150)
.margins({top: 10, right: 10, bottom: 50, left: 40})
.dimension(dimID)
.group(id_stacked, 'Others', sel_stack("True"))
.stack(id_stacked, 'Eeid of interest', sel_stack("False"))
This is my sel_stack function
function sel_stack(i) {
return function(d) {
return d.value[i] ? d.value[i] : 0;
};
}
I am plotting a bar chart with x-axis being the day and the Y-axis being the frequency of ID == 2 or ID!=2 in a stacked bar chart
So you want to group by day and then stack by whether ID===2. Although dc.js will accept many different formats, often the trick is getting the data into the right shape.
You're on the right track, but you don't need the extra column in order to create stacks for "is 2" and "not 2". You can calculate it directly:
var dayDimension = ndx.dimension(function(d) { return d.Day; }),
idStackGroup = dayDimension.group().reduce(
function add(p, v) {
++p[v.ID===2 ? 'is2' : 'not2'];
return p;
},
function remove(p, v) {
--p[v.ID===2 ? 'is2' : 'not2'];
return p;
},
function init() {
return {is2: 0, not2: 0};
});
These are standard add/remove functions for reducing multiple values for each bin. You'll find other variations where the name of the field is driven by the data. But here we know what fields will exist, so we can initialize them to zero in init and not worry about encountering new fields.
The add function is called when a row is added to the crossfilter or a filter changes so that a row is included; the remove function is called whenever a row is filtered out or removed from crossfilter. Since we're not worried about undefined (1) we can simply increment (++) and decrement (--) the values.
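To see what these reducers produce, here is a plain JavaScript walk-through on the sample rows. The loop is only a crude stand-in for what crossfilter does internally when rows are added or filtered out; the variable names are made up for the example.

```javascript
// The reducers from above, repeated so the snippet runs on its own.
function add(p, v) { ++p[v.ID === 2 ? 'is2' : 'not2']; return p; }
function remove(p, v) { --p[v.ID === 2 ? 'is2' : 'not2']; return p; }
function init() { return { is2: 0, not2: 0 }; }

const rows = [
  { Day: 1, ID: 2 }, { Day: 1, ID: 2 }, { Day: 1, ID: 2 },
  { Day: 2, ID: 5 }, { Day: 3, ID: 6 }, { Day: 4, ID: 6 }
];

// Crude stand-in for dayDimension.group().reduce(add, remove, init):
// one accumulator object per day.
const bins = {};
rows.forEach(row => {
  bins[row.Day] = add(bins[row.Day] || init(), row);
});
const day1Before = Object.assign({}, bins[1]);
console.log(day1Before); // { is2: 3, not2: 0 }

// Simulate one Day 1 row being filtered out by another chart.
remove(bins[1], rows[0]);
console.log(bins[1]);    // { is2: 2, not2: 0 }
```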
Finally we need accessors to pull these values out of the object. I think it's simpler to put the stack accessors inline - sel_stack was written for adding a dynamic number of stacks. (YMMV)
.group(idStackGroup, 'Others', d => d.value.not2)
.stack(idStackGroup, 'Eeid of interest', d => d.value.is2);
https://jsfiddle.net/gordonwoodhull/fu4w96Lh/23/
(1) If you do any arithmetic on undefined it casts to NaN and NaN ruins all further calculations.
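A quick illustration of that footnote in plain JavaScript; the object names are arbitrary.

```javascript
// Without initialization, the first increment turns undefined into NaN...
const counts = {};
counts.is2 = counts.is2 + 1;
console.log(counts.is2);                 // NaN

// ...and NaN sticks: every later addition is still NaN.
counts.is2 = counts.is2 + 1;
console.log(Number.isNaN(counts.is2));   // true

// Guarding with || 0 (or initializing to zero in init) avoids the problem.
const guarded = {};
guarded.is2 = (guarded.is2 || 0) + 1;
console.log(guarded.is2);                // 1
```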

Setting bar chart bar widths to month intervals

I'm trying to create a histogram using dc.js to display post counts aggregated by month. I've set up the crossfilter dimension and group to aggregate the data correctly but I can't get the widths of the resulting chart to fill the correct widths on the x axis.
My (simplified) code looks like this:
var ndx = crossfilter(items)
var dateDimension = ndx.dimension(d => d.date)
// group by month
var overviewGroup = dateDimension.group(d => {
if (d) {
return new Date(d.getUTCFullYear(), d.getUTCMonth())
}
})
var minMonth = new Date(dateDimension.bottom(1)[0].date.getUTCFullYear(), dateDimension.bottom(1)[0].date.getUTCMonth())
var maxMonth = new Date(dateDimension.top(1)[0].date.getUTCFullYear(), dateDimension.top(1)[0].date.getUTCMonth() + 1)
this.overviewChart
.height(60)
.minWidth(600)
.width(null)
.margins({top: 0, right: 10, bottom: 30, left: 40})
.dimension(dateDimension)
.centerBar(false)
.x(scale.scaleTime().domain([minMonth, maxMonth]))
.round(time.timeMonths.round)
.xUnits(time.timeMonths)
.group(overviewGroup)
.on('filtered', () => { this.displayItems = ndx.allFiltered() })
This displays the correct data on the y axis but the bars are only 1px wide. The chart in question is the smaller, lower chart - it's supposed to be the range chart for the higher-resolution one above (which aggregates posts by day and is displaying correctly) but that's for another question!
I get better results with .xUnits(() => { return overviewGroup.all().length - 1 }) which produces a wider bar and is closer to my intended result but it's still not correct:
I've pulled my code into a fiddle however in the fiddle it works more or less as expected: https://jsfiddle.net/y1qby1xc/9/

dc.js Composite Graph - Plot New Line for Each Person

Good Evening Everyone,
I'm trying to take the data from a database full of hour reports (name, timestamp, hours worked, etc.) and create a plot using dc.js to visualize the data. I would like the timestamp to be on the x-axis, the sum of hours for the particular timestamp on the y-axis, and a new bar graph for each unique name all on the same chart.
It appears based on my objectives that using crossfilter.js the timestamp should be my 'dimension' and then the sum of hours should be my 'group'.
Question 1, how would I then use the dimension and group to further split the data based on the person's name and then create a bar graph to add to my composite graph? I would like for the crossfilter.js functionality to remain intact so that if I add a date range tool or some other user controllable filter, everything updates accordingly.
Question 2, my timestamps are in MySQL datetime format: YYYY-mm-dd HH:MM:SS so how would I go about dropping precision? For instance, if I want to combine all entries from the same day into one entry (day precision) or combine all entries in one month into a single entry (month precision).
Thanks in advance!
---- Added on 2017/01/28 16:06
To further clarify, I'm referencing the Crossfilter & DC APIs alongside the DC NASDAQ and Composite examples. The Composite example has shown me how to place multiple line/bar charts on a single graph. On the composite chart I've created, each of the bar charts I've added a dimension based off of the timestamps in the data-set. Now I'm trying to figure out how to define the groups for each. I want each bar chart to represent the total time worked per timestamp.
For example, I have five people in my database, so I want there to be five bar charts within the single composite chart. Today all five submitted reports saying they worked 8 hours, so now all five bar charts should show a mark at 01/28/2017 on the x-axis and 8 hours on the y-axis.
var parseDate = d3.time.format('%Y-%m-%d %H:%M:%S').parse;
data.forEach(function(d) {
d.timestamp = parseDate(d.timestamp);
});
var ndx = crossfilter(data);
var writtenDimension = ndx.dimension(function(d) {
return d.timestamp;
});
var hoursSumGroup = writtenDimension.group().reduceSum(function(d) {
return d.time_total;
});
var minDate = parseDate('2017-01-01 00:00:00');
var maxDate = parseDate('2017-01-31 23:59:59');
var mybarChart = dc.compositeChart("#my_chart");
mybarChart
.width(window.innerWidth)
.height(480)
.x(d3.time.scale().domain([minDate,maxDate]))
.brushOn(false)
.clipPadding(10)
.yAxisLabel("This is the Y Axis!")
.compose([
dc.barChart(mybarChart)
.dimension(writtenDimension)
.colors('red')
.group(hoursSumGroup, "Top Line")
]);
So based on what I have right now and the example I've provided, in the compose section I should have 5 charts because there are 5 people (obviously this needs to be dynamic in the end) and each of those charts should only show the timestamp: total_time data for that person.
At this point I don't know how to further breakup the group hoursSumGroup based on each person and this is where my Question #1 comes in and I need help figuring out.
Question #2 above is that I want to make sure that the code is both dynamic (more people can be handled without code change), when minDate and maxDate are later tied to user input fields, the charts update automatically (I assume through adjusting the dimension variable in some way), and if I add a names filter that if I unselect names that the chart will update by removing the data for that person.
A Question #3 that I'm now realizing I'll want to figure out is how to get the person's name to show up in the pointer tooltip (the title) along with timestamp and total_time values.
There are a number of ways to go about this, but I think the easiest thing to do is to create a custom reduction which reduces each person into a sub-bin.
First off, addressing question #2, you'll want to set up your dimension based on the time interval you're interested in. For instance, if you're rounding to hours (for days, use d3.time.day and d3.time.days instead):
var writtenDimension = ndx.dimension(function(d) {
return d3.time.hour(d.timestamp);
});
chart.xUnits(d3.time.hours);
This will cause each timestamp to be rounded down to the nearest hour, and tell the chart to calculate the bar width accordingly.
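If it helps to see what "rounding down" means for the MySQL datetime strings in the question, here is a plain JavaScript sketch. The helpers parseMysqlDatetime, floorToHour, floorToDay and floorToMonth are illustrative stand-ins written for this example; in the chart code you would keep using the D3 intervals.

```javascript
// Parse a MySQL-style "YYYY-mm-dd HH:MM:SS" string as local time.
function parseMysqlDatetime(s) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(s);
  if (!m) return null;
  return new Date(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
}

// Drop precision by rebuilding the date from only the leading fields.
function floorToHour(d) {
  return new Date(d.getFullYear(), d.getMonth(), d.getDate(), d.getHours());
}
function floorToDay(d) {
  return new Date(d.getFullYear(), d.getMonth(), d.getDate());
}
function floorToMonth(d) {
  return new Date(d.getFullYear(), d.getMonth(), 1);
}

const ts = parseMysqlDatetime('2017-01-28 16:06:45');
console.log(floorToHour(ts).getHours(), floorToHour(ts).getMinutes()); // 16 0
console.log(floorToDay(ts).getDate(), floorToDay(ts).getHours());      // 28 0
console.log(floorToMonth(ts).getDate());                               // 1
```

All reports that floor to the same Date end up in the same bin, which is what combines entries from one hour, day or month into a single value.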
Next, here's a custom reduction (from the FAQ) which will create an object for each reduced value, with values for each person's name:
var hoursSumGroup = writtenDimension.group().reduce(
function(p, v) { // add
p[v.name] = (p[v.name] || 0) + v.time_total;
return p;
},
function(p, v) { // remove
p[v.name] -= v.time_total;
return p;
},
},
function() { // init
return {};
});
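Here is a small plain JavaScript run of this kind of reduction, to show the shape of the values it produces. The sample reports, the names Ann and Bob, and the accessorFor helper are invented for the example; the || 0 in the accessor is an extra guard for bins where a person has no entry.

```javascript
// Per-name reducers, reading the hours from the row (v).
function addHours(p, v) { p[v.name] = (p[v.name] || 0) + v.time_total; return p; }
function removeHours(p, v) { p[v.name] -= v.time_total; return p; }
function initHours() { return {}; }

// Made-up reports, already rounded to a day key.
const reports = [
  { name: 'Ann', day: '2017-01-28', time_total: 8 },
  { name: 'Bob', day: '2017-01-28', time_total: 8 },
  { name: 'Ann', day: '2017-01-28', time_total: 2 },
  { name: 'Bob', day: '2017-01-29', time_total: 6 }
];

// Crude stand-in for writtenDimension.group().reduce(...): one object per day.
const byDay = {};
reports.forEach(r => {
  byDay[r.day] = addHours(byDay[r.day] || initHours(), r);
});
console.log(byDay['2017-01-28']); // { Ann: 10, Bob: 8 }

// A value accessor per name, as each child chart of the composite would use.
function accessorFor(name) {
  return kv => kv.value[name] || 0;
}
const kv = { key: '2017-01-29', value: byDay['2017-01-29'] };
console.log(accessorFor('Bob')(kv)); // 6
console.log(accessorFor('Ann')(kv)); // 0 (Ann has no report that day)
```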
I did not go with the series example I mentioned in the comments, because I think composite keys can be difficult to deal with. That's another option, and I'll expand my answer if that's necessary.
Next, we can feed the composite line charts with value accessors that can fetch the value by name.
Assume we have an array names.
compositeChart.shareTitle(false);
compositeChart.compose(
names.map(function(name) {
return dc.lineChart(compositeChart)
.dimension(writtenDimension)
.colors('red')
.group(hoursSumGroup)
.valueAccessor(function(kv) {
return kv.value[name];
})
.title(function(kv) {
return name + ' ' + kv.key + ': ' + kv.value[name];
});
}));
Again, it wouldn't make sense to use bar charts here, because they would obscure each other.
If you filter a name elsewhere, it will cause the line for the name to drop to zero. Having the line disappear entirely would probably not be so simple.
The above shareTitle(false) ensures that the child charts will draw their own titles; the title functions just add the current name to those titles (which would usually just be key:value).

dc.js access data points in multiple charts when click datapoint in first chart

Using different dimensions of the same dataset, there are three dc.js Line Charts on screen.
When user clicks a datapoint on any lineChart, I wish to locate and return the data values for that corresponding point from all other charts, including the one clicked on.
I am also attempting (on mouseover) to change the circle fill color to red for the datapoint being hovered, as well as for the corresponding datapoint (same "x" value) for all other charts.
I am using the .filter() method but haven't been successful getting the desired data. The error message is: "Uncaught TypeError: myCSV[i].filter is not a function"
Full jsFiddle demo/example
lc1.on('renderlet', function(lc1) {
var allDots1 = lc1.selectAll('circle.dot');
var allDots2 = lc2.selectAll('circle.dot');
var allDots3 = lc3.selectAll('circle.dot');
allDots1.on('click', function(d) {
var d2find = d.x;
var d2find2 = d3.select(this).datum();
console.log(myCSV);
alert('Captured:'+"\nX-axis (Date): "+d2find2.x +"\nY-axis (Value): "+ d2find2.y +"\nDesired: display corresponding values from all three charts for this (date/time) datapoint");
allDots2.filter(d=>d.x == d2find2).attr('fill','red');
findAllPoints(d2find2);
});//END allDots1.on(click);
function findAllPoints(datum) {
var objOut = {};
var arrLines=['car','bike','moto'];
for (var i = 0; i < 3; i++) {
thisSrx = arrLines[i];
console.log('thisSrx: '+thisSrx);
console.log(myCSV[i].date)
console.log(datum.x);
//loop thru myCSV obj, compare myCSV[i].date to clicked "x" val
//build objOut with data from each graph at same "X" (d/t) as clicked
objOut[i] = myCSV[i].filter(e => e.date === datum.x)[0][thisSrx];
}
$('#msg').html( JSON.stringify(objOut) );
console.log( JSON.stringify(objOut) );
}//END fn findAllPoints()
});//END lc1.on(renderlet)
myCSV contains all three data points, so I don't see the need to loop through the three charts independently - findAllPoints is going to find the same array entry for all three data series anyway.
The main problem you have here is that Date objects don't compare equal even if they represent the same time. This is because == (and ===) compare object identity when both operands are objects:
> var d1 = new Date(), d2 = new Date(d1)
undefined
> d1
Mon Feb 13 2017 09:03:53 GMT-0500 (EST)
> d2
Mon Feb 13 2017 09:03:53 GMT-0500 (EST)
> d1==d2
false
> d1.getTime()===d2.getTime()
true
There are two ways to deal with this.
Approach 1: use second event argument
If the items in all the charts match up item by item, you can just use the index.
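D3 passes both the datum and the index to these callbacks (this holds for event listeners in D3 v5 and earlier; D3 v6 changed listeners to receive the event and the datum). So you can modify your callback like this: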
allDots1.on('click', function(d,i) {
// ...
allDots2.filter((d,j)=> j===i).attr('fill','red').style('fill-opacity', 1);
alert(JSON.stringify(myCSV[i]));
});
http://jsfiddle.net/gordonwoodhull/drbtmL77/7/
Approach 2: compare by date
If the different charts might have different data indices, you probably want to compare by date, but use Date.getTime() to get an integer you can compare with ===:
allDots1.on('click', function(d) {
var d2find = d.x;
// ...
allDots2.filter(d=> d.x.getTime()===d2find.getTime()).attr('fill','red').style('fill-opacity', 1);
var row = myCSV.filter(r=>r.date.getTime()===d2find.getTime())
alert(JSON.stringify(row));
});
http://jsfiddle.net/gordonwoodhull/drbtmL77/10/
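The difference between the two comparisons is easy to reproduce outside the charts. In this sketch the rows and their values are made up; only the column names car, bike and moto come from the question.

```javascript
// Made-up rows in the shape of myCSV after date parsing.
const rowsByDate = [
  { date: new Date(2017, 1, 11), car: 3, bike: 5, moto: 1 },
  { date: new Date(2017, 1, 12), car: 4, bike: 2, moto: 7 },
  { date: new Date(2017, 1, 13), car: 6, bike: 1, moto: 2 }
];

// A different Date object for the same instant as the second row.
const clicked = new Date(2017, 1, 12);

// Identity comparison never matches a separately created Date...
const byIdentity = rowsByDate.filter(r => r.date === clicked);
// ...but comparing the millisecond timestamps does.
const byTime = rowsByDate.filter(r => r.date.getTime() === clicked.getTime());

console.log(byIdentity.length); // 0
console.log(byTime.length);     // 1
console.log(byTime[0].moto);    // 7
```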
Note that in either case, you're going to need to also change the opacity of the dot in the other charts, because otherwise they don't show until they are hovered.
Not sure when you want to reset this - I guess it might make more sense to show the corresponding dots on mouseover and hide them on mouseout. Hopefully this is enough to get you started!
