understanding dc interaction with crossfilter objects - dc.js

Though I can write dc.js applications, I still don't understand how dc uses crossfilter objects, i.e. the dimensions and groups in various charts. When we click on a graph element, for instance a pie chart slice, I believe dc applies filters on the dimension, but does it manipulate the crossfilter object as well? Does anyone know of a document or article explaining how dc interacts with crossfilter objects? I know of http://www.codeproject.com/Articles/693841/Making-Dashboards-with-Dc-js-Part-Using-Crossfil
which is really good for beginners, but it does not go into depth on this specific subject.
For instance, I have this dc chart: http://bit.ly/1nStSh3
Basically, the dataset has object names (four of them: P, Q, S, T) and their sizes for various dates. The two pie charts show the size by date and by object respectively. There is a line chart which shows the data growth over a period of time. Now, when I click on the second pie chart, i.e. object names, both the line chart and the first pie chart adjust automatically, but when I click on the first pie chart, the line chart does not change.

Your particular question is covered by the crossfilter documentation and the dc.js FAQ: a dimension does not observe its own filters, but only the filters on other dimensions.
To get the charts to respond to each other, create a duplicate of the dimension (construct another one with the same arguments) and put the charts on separate dimensions. (There is also work underway to reflect the brushing/filter state between charts that share the same dimension.)
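To make the rule concrete, here is a minimal sketch in plain JavaScript. It is a toy model of the filtering behaviour, not the crossfilter library: `makeToyIndex`, `groupSum` and the sample rows are all hypothetical names and data chosen for illustration. With the real library the equivalent step would be calling `dimension()` twice on the same crossfilter instance with the same accessor.

```javascript
// Toy model of the filtering rule (NOT the crossfilter library itself).
// A group built on a dimension applies every filter except the one
// set on its own dimension.
function makeToyIndex(records) {
  const dimensions = [];

  function dimension(accessor) {
    const dim = {
      accessor,
      filterValue: null, // null means "no filter on this dimension"
      filter(value) { dim.filterValue = value; },
      // Sum valueFn per key, ignoring this dimension's own filter.
      groupSum(valueFn) {
        const totals = {};
        for (const record of records) {
          const key = dim.accessor(record);
          if (!(key in totals)) totals[key] = 0; // filtered-out keys stay at 0
          const visible = dimensions.every(other =>
            other === dim ||
            other.filterValue === null ||
            other.accessor(record) === other.filterValue);
          if (visible) totals[key] += valueFn(record);
        }
        return totals;
      },
    };
    dimensions.push(dim);
    return dim;
  }

  return { dimension };
}

// Hypothetical data shaped like the question's: object name, date, size.
const rows = [
  { name: "P", date: "2014-01-01", size: 10 },
  { name: "Q", date: "2014-01-01", size: 20 },
  { name: "P", date: "2014-01-02", size: 30 },
  { name: "Q", date: "2014-01-02", size: 40 },
];

const index = makeToyIndex(rows);
const dateForPie = index.dimension(r => r.date);  // the pie chart filters here
const dateForLine = index.dimension(r => r.date); // duplicate for the line chart

dateForPie.filter("2014-01-02"); // simulate clicking a pie slice

const pieTotals = dateForPie.groupSum(r => r.size);   // ignores its own filter
const lineTotals = dateForLine.groupSum(r => r.size); // sees the pie's filter
console.log(pieTotals, lineTotals);
```

In this sketch the pie's own totals are unchanged by the click, while the totals read through the duplicate dimension drop to zero for the unselected date. That matches the behaviour in the question: two charts sharing one dimension cannot filter each other.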
As to your larger question, no, there is no documentation on the interaction between dc.js and crossfilter that I know of. As the principal maintainer (but not the original author) of dc.js, I hope to write such documentation in the next year.
There actually isn't much magic to it: charts just update the dimension filters and then trigger redraws on the charts in their group. The d3 transitions within each chart are what make it look fancier than that.
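The cycle described above can be sketched in a few lines of plain JavaScript. This is a toy illustration under stated assumptions, not dc.js source: `makeChartGroup`, `makeChart` and `handleClick` are hypothetical names, and the dimensions are bare objects standing in for crossfilter dimensions. In dc.js itself the corresponding entry points are, as far as I know, `chart.filter` and `dc.redrawAll`.

```javascript
// Toy sketch of the click -> filter -> redraw cycle.
// All names here are hypothetical; this is not dc.js source code.
function makeChartGroup() {
  const charts = [];
  return {
    register(chart) { charts.push(chart); },
    redrawAll() { charts.forEach(chart => chart.redraw()); },
  };
}

function makeChart(name, dimension, group) {
  const chart = {
    name,
    dimension,
    redraws: 0,
    // A real chart would re-read its crossfilter group and run d3 transitions here.
    redraw() { chart.redraws += 1; },
    // Called when the user clicks an element such as a pie slice.
    handleClick(value) {
      dimension.filterValue = value; // step 1: update the dimension's filter
      group.redrawAll();             // step 2: redraw every chart in the group
    },
  };
  group.register(chart);
  return chart;
}

const dashboard = makeChartGroup();
const pie = makeChart("pie", { filterValue: null }, dashboard);
const line = makeChart("line", { filterValue: null }, dashboard);

pie.handleClick("P");
console.log(pie.dimension.filterValue, pie.redraws, line.redraws);
```

Note that the clicked chart is redrawn along with the others; whether its own display changes depends on whether it reads from a group on the dimension it just filtered.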

Related

Who needs dimensions anyway?

This is half a question, as I have this 'sort of' thought through. Still, I'd like to have some confirmation. Here it goes:
From what I've seen so far, a group holds all the information necessary for a plot. Let's imagine a bar chart for the 'canonical' dc data array. We define a dimension on type and then a group. The group data will give us all the necessary coordinates for drawing the bars.
Why do we need dimensions, then? Is this for plotting, or just for keeping track of the filters and dynamically updating the chart?
This is only half an answer. :)
Yes, there is some redundancy between dimensions and groups for sure.
The group key function needs to be a refinement of (and must be consistent with) the dimension key function.
There's only one place where I've found it's helpful to refine the group key function more than the dimension key function: when the dimension is time and the group is some quantization of time like months or hours. Otherwise there is no need to specify the group key function at all. I haven't seen too many people create sets of time-series charts where different charts are quantized at different levels, so I am not sure if that's the motivation.
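As a rough illustration of the time case, the sketch below keys a dimension on the full timestamp and derives the group key from it by truncating to the month. It is plain JavaScript with hypothetical names (`dimensionKey`, `monthOf`, `groupCounts`) and made-up events; with crossfilter the group key function would be passed to `dimension.group(...)` instead.

```javascript
// Dimension key: the full timestamp of each record.
const dimensionKey = record => record.time;

// Group key: a coarser value derived from the dimension key (start of the UTC month).
const monthOf = date =>
  new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), 1));

// Count records per group key. Because the group key is computed from the
// dimension key, the two orderings are consistent with each other.
function groupCounts(records, dimKey, groupKey) {
  const counts = new Map();
  for (const record of records) {
    const label = groupKey(dimKey(record)).toISOString().slice(0, 7); // "YYYY-MM"
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  return counts;
}

// Hypothetical events for illustration only.
const events = [
  { time: new Date("2014-01-05T10:00:00Z") },
  { time: new Date("2014-01-20T16:30:00Z") },
  { time: new Date("2014-02-03T08:15:00Z") },
];

const perMonth = groupCounts(events, dimensionKey, monthOf);
console.log([...perMonth.entries()]);
```

A second chart could reuse the same dimension key with a different quantization (hours, say) simply by swapping `monthOf` for another function of the timestamp.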
Filtering happens through the dimension, not the group, and a group does not observe its dimension's filters - so you might want them to be different objects so that you can optionally have the chart respond to its own filters. That's pretty rare, though - usually a chart will read data from a group on the dimension it filters.
Any other reasons? Please add to this list!

D3.js set axis domain from bound data?

In learning d3.js, I've seen several examples of d3 plots where the axes update when the data is changed.
I would have expected the axes to depend on bound data, so that when you add new data with .data(newData), the domain of the scale used by the axis would change. Instead, all of these examples join the data to the plot selection and then manually redraw the axes based on a different variable (often the original, unbound data variable).
Why aren't scales defined as a function of bound data? Perhaps this leads to a circular reference problem? Or does it go against d3 philosophy for some other reason?
I ended up tying the axes to the data myself, and it seems to work well. I have an updatePlot function that adds new data to the plot, a getDataExtents that returns the extents of the data, and a setAxesBounds function which sets the domain of the scales and then does .call to redraw the axes. Inside updatePlot, I just have setAxesBounds(getDataExtents(data)) so that whenever data is updated, the axes are redrawn with scales based on the data.
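Roughly, that arrangement could look like the sketch below. Only the three function names come from the description above; everything else is an assumption made so the example runs without d3. In particular `makeLinearScale` is a hypothetical stand-in for a d3 linear scale, and `updatePlot` returns pixel positions instead of joining data to a selection.

```javascript
// Minimal stand-in for a linear scale (hypothetical; d3 scales offer much more).
function makeLinearScale(range) {
  let domain = [0, 1];
  const scale = value =>
    range[0] + ((value - domain[0]) / (domain[1] - domain[0])) * (range[1] - range[0]);
  scale.domain = function (next) {
    if (next === undefined) return domain;
    domain = next;
    return scale;
  };
  return scale;
}

const xScale = makeLinearScale([0, 400]); // pixel range, left to right
const yScale = makeLinearScale([300, 0]); // pixel range, inverted for SVG

// Assumes at least two distinct x values and two distinct y values.
function getDataExtents(data) {
  const xs = data.map(d => d.x);
  const ys = data.map(d => d.y);
  return {
    x: [Math.min(...xs), Math.max(...xs)],
    y: [Math.min(...ys), Math.max(...ys)],
  };
}

function setAxesBounds(extents) {
  xScale.domain(extents.x);
  yScale.domain(extents.y);
  // In a d3 plot, this is where the axis groups would be redrawn with .call(axis).
}

function updatePlot(data) {
  setAxesBounds(getDataExtents(data));
  // A real plot would join `data` to a selection here; this sketch
  // just returns the scaled positions.
  return data.map(d => ({ px: xScale(d.x), py: yScale(d.y) }));
}

const points = updatePlot([{ x: 0, y: 10 }, { x: 5, y: 20 }, { x: 10, y: 30 }]);
console.log(points);
```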
I'm used to plotting libraries where this sort of thing happens for you automatically, but I understand that d3 is not a plotting library, so it makes sense that it doesn't do this sort of thing for you, in the name of flexibility and to avoid unpredictable default behavior.

NVD3.js: Stacked and grouped bar chart with two y-axis

I am using NVD3.js and want to create the following chart:
As you can see, the bars are stacked, there are two axes, and the bars are grouped along the x-axis.
Using multiChart I got:
It is stacked and has two axes, but it is not grouped along the x-axis.
Maybe I need to use a different chart type instead of multiChart, but I didn't find any bar chart with two y-axes.
1) How can I achieve this using NVD3.js?
2) If it cannot be done in NVD3.js, which solution can I properly integrate?
Thanks!
The NVD3 Javascript library is, to quote their website, "an attempt to build re-usable charts and chart components". Its creators have made a couple of key decisions in order to emphasize the reusability of the charts:
They have focused on implementing standard chart designs (line graphs, bar graphs, scatterplots), but implemented in flexible, interactive ways.
They have used the same data structure requirement for all the graphs:
The main data array contains multiple data series, each of which represents a logical grouping of the data;
Each series is an array of individual data objects containing two or more variables.
All the graphs have a similar style and reuse important pieces of code.
The NVD3 library allows you to create a grouped bar chart or a stacked bar chart, and even a chart that interactively animates between the two.
Adapting that chart to create a stacked and grouped bar chart is not a simple task, in part because the data structure would be different. You would need a three-level data structure (series > sub-series > datapoints, representing groups > stacks > bars) instead of the two-level (series > datapoints) structure used by NVD3.
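To illustrate the difference, here is a sketch of the two shapes in plain JavaScript. The two-level array follows the `key`/`values` layout that NVD3 bar charts take; the three-level array and the `stackSegments` helper are hypothetical, one possible way to organize groups > stacks > bars, and the numbers are invented for the example.

```javascript
// Two-level structure (series > datapoints), as used by NVD3 bar charts.
const twoLevel = [
  { key: "Apples", values: [{ x: "Q1", y: 3 }, { x: "Q2", y: 5 }] },
  { key: "Pears", values: [{ x: "Q1", y: 2 }, { x: "Q2", y: 4 }] },
];

// Hypothetical three-level structure (groups > stacks > bars).
// Each group is drawn side by side at every x; its stacks are layered vertically.
const threeLevel = [
  { key: "2013", stacks: [
    { key: "Apples", values: [{ x: "Q1", y: 3 }, { x: "Q2", y: 5 }] },
    { key: "Pears", values: [{ x: "Q1", y: 2 }, { x: "Q2", y: 4 }] },
  ] },
  { key: "2014", stacks: [
    { key: "Apples", values: [{ x: "Q1", y: 1 }, { x: "Q2", y: 6 }] },
    { key: "Pears", values: [{ x: "Q1", y: 7 }, { x: "Q2", y: 2 }] },
  ] },
];

// Compute the bottom (y0) and top (y1) of every bar segment by accumulating
// heights within each (group, x) column.
function stackSegments(groups) {
  const segments = [];
  for (const group of groups) {
    const runningTotal = {}; // x -> height stacked so far within this group
    for (const layer of group.stacks) {
      for (const point of layer.values) {
        const y0 = runningTotal[point.x] || 0;
        const y1 = y0 + point.y;
        runningTotal[point.x] = y1;
        segments.push({ group: group.key, layer: layer.key, x: point.x, y0, y1 });
      }
    }
  }
  return segments;
}

const segments = stackSegments(threeLevel);
console.log(segments.length, segments[2]);
```

Each resulting segment carries enough information to position one rectangle: `x` and `group` give the horizontal slot, `y0` and `y1` the vertical extent.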
All is not lost, however. NVD3 is built on the d3 Javascript library. D3 is much more flexible and open-ended; it doesn't define specific chart types, it defines a way of manipulating a webpage to make it match your data. You can use it to create any type of chart that can be drawn with HTML or SVG. But of course, that means that it is much more work, since you have to explicitly create all the parts of the graph, and make all the design decisions yourself!
If you want to use d3, I strongly recommend starting with the basics in the tutorials list or one of the introductory books. However, you'll also want to check out the gallery of examples, and from there you'll find the following charts that will be of particular interest:
Mike Bostock's Stacked Bar Chart
Bostock's Grouped Bar Chart of the same data
Ali Gencay's adaptation of those examples to create a stacked, grouped bar chart
Once you have become familiar with building charts in d3, you may want to open up the NVD3 source code to see if you can borrow some of their reusable code components (being sure to respect their licence terms, of course). However, I would not recommend doing so as a beginner -- it is a lot of code, and uses a lot of complex techniques to put all the pieces together.

DC.js Crossfilter on "nested" dimensions

I'm quite confused and might need help just formulating the question, so please give good comments...
I'm trying to crossfilter some data where each data point has its own sub-dataset that I want to chart and filter on as well. Each point represents a geographic region, and associated with each point is a time series which measures a certain metric over time.
Here's what I've got so far: http://michaeldougherty.info/dcjs/
The top bar chart shows a particular value for 10 regions, and the choropleth is linked with the same data. Now, below that are two composite line charts. Each line corresponds to a region -- there are 10 lines in each graph, and each graph is measuring a different metric over time. I would like the lines to be filtered as well, so if one bar is selected, only one line will show on the line chart.
Moreover, I want to be able to filter by time on the line charts (through brushing) in addition to some other filter, so I can make queries like "filter out all regions whose line value between 9 AM and 5 PM is less than 20,000", which would also update the bar and choropleth charts.
This is where I'm lost. I'm considering scrapping DC.js for this and using crossfilter and d3.js directly because it seems so complicated, but I would love it if I'm missing something and DC.js can actually handle this. I'd also love some ideas on where to start implementing this in straight crossfilter, because I haven't fully wrapped my head around that yet either.
How does one deal with datasets within datasets?
Screenshot of the link above included for convenience:

Transition a chart dependent on another chart

I am new to d3.js but have managed to make two individual charts as an introduction.
I have a map chart, which has dots representing monitoring stations.
I also have a line chart which has multiple timeseries (data from json) from one monitoring station.
What I would like to do: have the two charts on one page. When you mouse over or click on a station on the map, its data is loaded and displayed on the line chart. When a new station is selected on the map, the line chart transitions to the new data.
The question I have is one of style. With the two separate charts what is the best way to combine them?
As for the transition, I have searched but have not found any simple examples that have two charting elements where interacting with one affects the other. Should I combine all the timeseries data into one json file (say 4 timeseries times 50 stations) or have 50 json files?
Thanks
Unless your timeseries data is very large, I would just put everything in one JSON file to make things simpler and so that changing stations can take place entirely client side.
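As a sketch of what that could look like, the combined file might map each station id to its named timeseries, so that selecting a station is just a lookup on data already in memory. The file layout, station ids, series names and the `seriesForStation` helper below are all hypothetical, and the JSON is inlined here instead of being loaded over the network.

```javascript
// Hypothetical shape of the single combined file: station id -> named time series,
// each series being a list of [time, value] pairs.
const allStations = JSON.parse(`{
  "station-01": { "temperature": [[0, 11.5], [1, 12.0]], "flow": [[0, 3.1], [1, 2.9]] },
  "station-02": { "temperature": [[0, 9.0], [1, 9.4]], "flow": [[0, 1.2], [1, 1.5]] }
}`);

// Return the series for one station in a form a line chart could bind to.
// Unknown stations yield an empty list so the chart can simply clear itself.
function seriesForStation(stationId) {
  const station = allStations[stationId];
  if (!station) return [];
  return Object.entries(station).map(([name, points]) => ({ name, points }));
}

// On mouseover or click of a map dot, the handler would call this and
// hand the result to the line chart's update/transition code.
const selected = seriesForStation("station-02");
console.log(selected.map(series => series.name));
```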

Resources