Who needs dimensions anyway? - dc.js

This is half a question, as I have this 'sort of' thought through. Still, I'd like to have some confirmation. Here it goes:
From what I've seen so far, a group holds all the information necessary for a plot. Let's imagine a bar chart for the 'canonical' dc data array. We define a dimension on type and then a group. The group data will give us all the necessary coordinates for drawing the bars.
Why do we need dimensions, then? Is this for plotting, or just for keeping track of the filters and dynamically updating the chart?

This is only half an answer. :)
Yes, there is some redundancy between dimensions and groups for sure.
The group key function needs to be a refinement of (and must be consistent with) the dimension key function.
There's only one place where I've found it's helpful to refine the group key function more than the dimension key function: when the dimension is time and the group is some quantization of time like months or hours. Otherwise there is no need to specify the group key function at all. I haven't seen too many people create sets of time-series charts where different charts are quantized at different levels, so I am not sure if that's the motivation.
Filtering happens through the dimension, not the group, and a group does not observe its dimension's filters - so you might want them to be different objects so you that you can optionally have the chart respond to its own filters. That's pretty rare, though - usually a chart will read data from a group on the dimension it filters.
Any other reasons? Please add to this list!

Related

understanding dc interaction with crossfilter objects

Though I can write dc.js applications, I still don't understand how dc uses crossfilter objects, ie the dimensions and groups in various charts. When we click on an graph element, for instance, a pie chart slice, I believe dc is applying filters on the dimension, but does it manipulate the crossfilter object as well? Anyone knows of any document/article explaining how dc interacts with crossfilter objects? I know of http://www.codeproject.com/Articles/693841/Making-Dashboards-with-Dc-js-Part-Using-Crossfil
which is really good for beginners, but it does not go deep dive on this specific subject.
For instance, I have this dc chart: http://bit.ly/1nStSh3
Basically the dataset has object names (4 of them, P, Q, S, T) and its size for various dates. The two piecharts show the size for dates and objects respectively. There is a line chart which shows the data growth over a period of time. Now, when I click on the second graph, ie object names, both line chart and the first pie chart auto adjusts, but when I click on the first pie chart, the line chart does not change.
Your particular question is covered by the crossfilter documentation and the dc.js FAQ: a dimension does not observe its own filters, but only the filters on other dimensions.
To get the charts to respond to each other, create a duplicate of the dimension (construct another one with the same arguments) and put the charts on separate dimensions. (There is also work underway to reflect the brushing/filter state between charts that share the same dimension.)
As to your larger question, no, there is no documentation on the interaction between dc.js and crossfilter that I know of. As the principle maintainer (but not the original author) of dc.js, I hope to write such documentation in the next year.
There actually isn't much magic to it: charts just update the dimension filters and then trigger redraws on the charts in their group. The d3 transitions within each chart are what make it look fancier than that.

DC.js Crossfilter on "nested" dimensions

I'm quite confused and might need help just formulating the question, so please give good comments...
I'm trying to crossfilter some data where each data point has its own sub-dataset that I want to chart and filter on as well. Each point represents a geographic region, and associated with each point is a time series which measures a certain metric over time.
Here's what I've got so far: http://michaeldougherty.info/dcjs/
The top bar chart shows a particular value for 10 regions, and the choropleth is linked with the same data. Now, below that are two composite line charts. Each line corresponds to a region -- there are 10 lines in each graph, and each graph is measuring a different metric over time. I would like the lines to be filtered as well, so if one bar is selected, only one line will show on the line chart.
Moreover, I want to be able to filter by time on the line charts (through brushing) in addition to some other filter, so I can make queries like "filter out all regions whose line value between 9 AM and 5 PM is less than 20,000", which would also update the bar and choropleth charts.
This is where I'm lost. I'm considering scrapping DC.js for this and using crossfilter and d3.js directly because it seems so complicated, but I would love it if I'm missing something and DC.js can actually handle this. I'd also love some ideas on where to start implementing this in straight crossfilter, because I haven't fully wrapped my head around that yet either.
How does one deal with datasets within datasets?
Screenshot of the link above included for convenience:

What is the main idea of creating click heatmap?

in one of my projects, I would like to create heatmap of user clicks. I was searching a while and found this library - http://www.patrick-wied.at/static/heatmapjs/examples.html . That is basically exactly what I would like to make. I would like to create heatmap in SVG, if possible, that is only difference.
I would like to create my own heatmap and I'm just wondering how to do that. I have XY clicks position. Each click has mostly different XY position, but there can be exceptions time to time, a few clicks can have the came XY position.
I found a few solutions based on grid on website, where you have to check which clicks belong into the same column in this grid and according to these informations you are able to fill the most clicked columns with red or orange and so on. But it seems a little bit complicated to me and maybe slower for bigger grids.
So I'm wondering if there is another solution how to "calculate" heatmap colors or I would like to know the main idea used in library above.
Many thanks
To make this kind of heat map, you need some kind of writable array (or, as you put it, a "grid"). User clicks are added onto this array in a cumulative fashion, by adding a small "filter" sub-array (aligned around each click) to the writable array.
Unfortunately, this "grid" method seems to be the easiest, simplest way to get that kind of smooth, blobby appearance. Fortunately, this kind of operation is well-supported by software and hardware, under the name "computer graphics".
When considered as a computer graphics operation, the writable array is called an "accumulation buffer". The filter is what gives you the nice blobby appearance, even with a relatively small number of clicks -- you can tweak the size of the filter according to the needs of your application.
After accumulating the user clicks, you will need to convert from the raw accumulated values to some kind of visible color scale. This may involve looking through the entire accumulation buffer to find the largest value, and mapping your chosen color scale accordingly. Alternately, you could adjust your scale according to the number of mouse clicks, or (as in the demo you linked to) just choose a fixed scale regardless of the content of the buffer.
Finally, I should mention that SVG is not well-adapted to representing this kind of graphic. It should probably be saved as some kind of image file (.jpg or .png) instead.

improve cartographic visualization

I need some advice about how to improve the visualization of cartographic information.
User can select different species and the webmapping app shows its geographical distribution (polygonal degree cells), each specie with a range of color (e.g darker orange color where we find more info, lighter orange where less info).
The problem is when more than one specie overlaps. What I am currently doing is just to calculate the additive color mix of two colors using http://www.xarg.org/project/jquery-color-plugin-xcolor/
As you can see in the image below, the resulting color where two species overlap (mixed blue and yellow) is not intuitive at all.
Someone has any idea or knows similar tools where to get inspiration? for creating the polygons I use d3.js, so if more complex SVG features have to be created I can give a try.
Some ideas I had are...
1) The more data on a polygon, the thicker the border (or each part of the border with its corresponding color)
2) add a label at the center of polygon saying how many species overlap.
3) Divide polygon in different parts, each one with corresponding species color.
thanks in advance,
Pere
My suggestion is something along the lines of option #3 that you listed, with a twist. Rather painting the entire cell with species colors, place a dot in each cell, one for each species. You can vary the color of each dot in the same way that you currently are: darker for more, ligher for less. This doesn't require you to blend colors, and it will expose more of your map to provide more context to the data. I'd try this approach with the border of the cell and without, and see which one works best.
Your visualization might also benefit from some interactivity. A tooltip providing more detailed information and perhaps a further breakdown of information could be displayed when the user hovers his mouse over each cell.
All of this is very subjective. However one thing's for sure: when you're dealing with multi-dimensional data as you are, the less you project dimensions down onto the same visual/perceptual axis, the better. I've seen some examples of "4-dimensional heatmaps" succeed in doing this (here's an example of visualizing latency on a heatmap, identifying different sources with different colors), but I don't think any attempt's made to combine colors.
My initial thoughts about what you are trying to create (a customized variant of a heat map for a slightly crowded data set, I believe:
One strategy is to employ a formula suggested for
n + 1
with regards to breaks in bin spacing. This causes me some concern regarding how many outliers your set has.
Equally-spaced breaks are ideal for compact data sets without
outliers. In many real data sets, especially proteomics data sets,
outliers can make this representation less effective.
One suggestion I have would be to consider the idea of adding some filters to your categories if you have not yet. This would allow slimming down the rendered data for faster reading by the user.
another solution would be to use something like (Comprehensive) R
or maybe even DanteR
Tutorial in displaying mass spectrometry-based proteomic data using heat maps
(Particularly worth noting I felt, was 'Color mapping'.)

Can I link actions between two d3 charts?

Very casual javscript user here. Hopefully this makes sense.
I am tracking 20 different series on a stacked area chart with nvd3.js. These series fluctuate between positive and negative values, some on a huge base and some on a small base. The result is that - when one of the really big series is below the x axis - it pushes everything else underneath too, and the positive series won't appear above the x axis until you filter out the bigger players using the key.
The technically inelegant but good looking solution I have come up with is to split all of my negative values into one array, and all of my positives into another. The top half of the page is a positive values graph, the bottom half is negative values and they line up pretty nicely.
The weakness with this approach is when you go to interact with it as an end user. If I filter out a series (by unchecking it in the key) or change the graph mode (with the type selector) or zoom in on a series (by clicking it so the graph refocuses to that series only) then it will only affect whichever graph you clicked on. I would like to adjust these three click events (and any others I've missed?) so that your action is synchronised across both graphs.
Is this achievable? Any reading material I can dig through where somebody has done something similar? I imagine linking two representations of one data set (e.g a pie and column graph) is vaguely analogous.

Resources