dc.js has been great, and now I'm trying to understand how to use it for data with multiple dimensions.
I have time series data (csv), which contains the number of people that fit a certain attribute on a given day - e.g. the number of brown-haired people age 65+. A simplified version of it looks like this (There are 5 options for hair color, 5 for age group, and about 200 dates):
Date, Hair Color, 0-18, 19-39, 40-64, 65+
1/1/21, Brown, 5, 3, 10, 2
1/1/21, Blonde, 15, 2, 4, 1
1/2/21, Brown, 2, 8, 0, 2
1/2/21, Blonde, 11, 6, 7, 4
...
I'd like to be able to plot the cumulative counts over time for each sub-population. The complication is that I'd like to show a plot aggregated by hair color (so summing over all age groups), which can then be toggled (ideally by clicking on one of the lines) to show a plot for a given hair color disaggregated by age group.
(Note that in the mockups, I'm normalizing counts to show it as a cumulative percentage. I've been doing that calculation straightforwardly with valueAccessors.)
My question is: how do I create the dimensions and groups to create these plots?
I'd prefer not to create individual variables for each age group (I'd like it to be generic enough to expand to finer categories). But I'm having trouble understanding how to use reduce and filters to achieve my desired outcome.
Also, should I be doing it all as linecharts in a compositeChart, or in a series chart? There is the added wrinkle that I plan to then annotate the chart with extra trendlines added in from d3.
Thanks!
The series chart is a convenience class that generates a composite chart underneath.
It allows you to specify your data using a 2D key, where one component is the key to be used for the X values in the chart, and one component is another key to be used for splitting the data into multiple layers - lines, in your case. You also give it the "prototype" of the layer chart, in the form of a function that returns a partially-initialized chart.
It sounds like you are on the right track, so I won't attempt to give a complete answer, just a few hints. Please feel free to follow up in the comments, and I will edit this answer to fill in details.
Flattening the data
You will probably want to flatten your data so that there is only one value per row, i.e. structure it with an Age column and a Value column. This is a general best practice for working with crossfilter.
It's possible to work with the data as you have it, but
you won't be able to filter by age, since filtering in crossfilter is by row
aggregating across ages will be more complicated, requiring custom reductions
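As a rough sketch of the flattening step, assuming the CSV has already been parsed into an array of objects keyed by column name (for example with d3.csv), something like the following would produce one row per age group. The helper name flattenRows and the Age and Value field names are illustrative only, not part of dc.js or crossfilter:

```javascript
// Hypothetical helper: turn one wide row per (Date, Hair Color)
// into one row per (Date, Hair Color, Age).
function flattenRows(rows, idColumns) {
  const flat = [];
  for (const row of rows) {
    for (const [column, value] of Object.entries(row)) {
      if (idColumns.includes(column)) continue; // skip Date / Hair Color
      const record = {};
      for (const id of idColumns) record[id] = row[id];
      record.Age = column;    // the former column header becomes a value
      record.Value = +value;  // coerce CSV strings to numbers
      flat.push(record);
    }
  }
  return flat;
}

// Sample rows taken from the question's data.
const wide = [
  {Date: '1/1/21', 'Hair Color': 'Brown', '0-18': 5, '19-39': 3, '40-64': 10, '65+': 2},
  {Date: '1/1/21', 'Hair Color': 'Blonde', '0-18': 15, '19-39': 2, '40-64': 4, '65+': 1},
];
const flat = flattenRows(wide, ['Date', 'Hair Color']);
```

Because every column that is not listed in idColumns is treated as an age group, adding finer categories to the CSV should not require any code changes.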
Using multikeys and series chart
Following the series chart example, you might define your dimension as
const colorDateDimension = cf.dimension(d => [d['Hair Color'], d.Date]);
Now any group on this dimension will aggregate by both hair color and date.
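To make "aggregate by both hair color and date" concrete, here is a plain JavaScript sketch of what a sum over such a two-part key computes. It deliberately does not use crossfilter, it assumes rows with a Value column as in the flattened layout suggested above, and the function name sumByColorAndDate is made up for illustration:

```javascript
// Plain-JS illustration only: sums Value per [Hair Color, Date] pair.
// The result mimics the {key, value} shape that a group's all() returns,
// but the ordering here is insertion order, not crossfilter's sorted order.
function sumByColorAndDate(rows) {
  const totals = new Map();
  for (const d of rows) {
    const key = JSON.stringify([d['Hair Color'], d.Date]);
    totals.set(key, (totals.get(key) || 0) + d.Value);
  }
  return Array.from(totals, ([key, value]) => ({key: JSON.parse(key), value}));
}

const rows = [
  {Date: '1/1/21', 'Hair Color': 'Brown', Age: '0-18', Value: 5},
  {Date: '1/1/21', 'Hair Color': 'Brown', Age: '19-39', Value: 3},
  {Date: '1/1/21', 'Hair Color': 'Blonde', Age: '0-18', Value: 15},
  {Date: '1/2/21', 'Hair Color': 'Brown', Age: '0-18', Value: 2},
];
const grouped = sumByColorAndDate(rows);
```

Each entry has a two-element key, which is why the accessors can pick out key[0] for the series and key[1] for the X value.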
Now if you're using the series chart, you can extract the components with
chart
.seriesAccessor(({key}) => key[0])
.keyAccessor(({key}) => key[1])
You could use the third parameter of the function you pass to the series chart's chart method to determine the color or dash style of the layer, e.g.:
const dashStyles = {
'0-18': [3,1],
'19-39': [4,1,1,1],
// ...
};
.chart(function(c, _, subkey) {
return new dc.LineChart(c).dashStyle(dashStyles[subkey]);
})
Interaction
dc.js does not natively support the kind of drill-down you are describing. It would be easier to have one chart which is by hair color and another chart which is by age. Then when no hair color is selected, the age chart will show all hair colors, and when no age is selected, the hair color chart will show all ages.
If you want drill-down as you describe, you will have to write custom code to apply the filter and swap the chart definition when a hair color is clicked. It's not terribly complicated but please ask a follow-up question if you can't figure it out - it's better to keep SO questions on a single topic.
Annotating with D3
This part is pretty simple no matter how you implement the charts.
You will implement a pretransition handler and use chart.selectAll to add the content you need. There are many examples here on SO, so I won't go into it here.
Conclusion
I hope this gets you started. I've answered your specific question and given some hints about other assumptions or implicit questions within your question. It will be some work to get the results you want, but it is definitely possible.
I am working on creating text based data feed files that have fixed column widths. Example: Position 1-5 is record layout ID, position 6-35 is part number, position 36-70 is description, etc.
I wish there were a tool I could provide these data input widths, then paste in the raw text to visually see where it lines up. Conceptually, this would seem to be a pretty simple tool.
Do you know of any solutions or creative ideas?
Thanks!
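Use https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr
Note that substr takes a zero-based start index and a length, not a start and end position.
Layout Id would be str.substr(0, 5)
Part number would be str.substr(5, 30)
(substr is marked as deprecated on MDN; str.substring(0, 5) and str.substring(5, 35) give the same results.)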
etc.
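A minimal sketch of how this could be generalized, assuming the layout is given as 1-based inclusive positions as in the question. The layout array, field names, sample line, and the splitFixedWidth function are all hypothetical:

```javascript
// Hypothetical layout, using the 1-based inclusive positions from the question.
const layout = [
  {name: 'layoutId', start: 1, end: 5},
  {name: 'partNumber', start: 6, end: 35},
  {name: 'description', start: 36, end: 70},
];

// Slice one line of text into named fields.
// substring() takes zero-based [from, to), so position N maps to index N - 1.
function splitFixedWidth(line, fields) {
  const record = {};
  for (const {name, start, end} of fields) {
    record[name] = line.substring(start - 1, end);
  }
  return record;
}

// Made-up sample line, padded to the field widths (5 + 30 + 35 characters).
const line = 'REC01' + 'PN-12345'.padEnd(30) + 'Widget, blue'.padEnd(35);
const record = splitFixedWidth(line, layout);
```

Printing each field between visible delimiters (for example with console.log and a | on either side) is one simple way to see whether the pasted text lines up with the expected widths.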
I am currently trying to adapt an excel dashboard I created a few years back into spotfire. The data previously fit onto one page, but this year the number of metrics has doubled and I thought it would be a good opportunity to try spotfire.
The current dashboard has a number of scatter plot charts showing the values from a range of cells (rank 1-5). Overlaid is another scatter plot that shows the groups results.
I have found it easy enough to run the scatter plot in Excel based on a range of data, but as of right now I am stumped on how to choose a range of values for my spotfire scatters.
Here is an example (in excel) of what I am trying to accomplish. The data example is how my data is currently setup.
Would you have any tips on how I may be able to produce a similar chart in Spotfire?
You can create a document property holding the values you want, then add a slider bound to that property in a text area. The slider lets you select a range of values.
To use your range in a scatter plot, right-click the axis you want, choose Custom Expression, and write something like ${my_docproperty}.
Hope this solves your problem!
I am trying to show machine states over time. Part of this is to reproduce/automate a report that used to be done by hand. It consists of coloring 2-minute 'time slices' in Excel based on what the machine is doing.
(Sorry, not enough reputation to post a picture, but it is a classic heatmap where the state drives the color. Some non DC-JS fiddle: http://jsfiddle.net/ww6Lbnc5/4/)
I was able to generate most of what I want in the following jsfiddle:
http://jsfiddle.net/hwhfxz2t/14/
See fiddle for code.
The total state duration (for selected time frame) is shown in the pieChart, followed by the individual state lines and then the heatmap that people are used to. (the ZOOM and date selection buttons do not work in the fiddle but are there to select specific data ranges or zoom in if you like).
The line charts use the original representation of the states, which consists of the time the state is entered and a duration.
In order to make the heat map work, I had to (I think) take the original data and convert it into individual minute chunks and mark them with a state. So for instance the original data specifying:
RUN state starting 14:30 for 300 seconds
becomes:
14:30=RUN, 14:31=RUN, 14:32=RUN, 14:33=RUN and 14:34=RUN
The code in lines 233-297 loops through the original data and generates a new one that does this. In cases where there is more than one state within a given minute, the last state survives.
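For readers without access to the fiddle, the expansion described above might look roughly like this. It is a sketch under assumptions, not the code from the fiddle: the record shape {state, start, seconds} and the function name toMinuteSlices are invented here, and records are assumed to arrive in chronological order so that "last state survives" falls out of simple overwriting:

```javascript
// Hypothetical record shape: {state: string, start: Date, seconds: number}.
// Returns a Map from minute timestamp (ms since epoch) to state.
function toMinuteSlices(records) {
  const slices = new Map();
  for (const {state, start, seconds} of records) {
    // Round the start down to the beginning of its minute.
    const first = Math.floor(start.getTime() / 60000) * 60000;
    const end = start.getTime() + seconds * 1000;
    for (let t = first; t < end; t += 60000) {
      slices.set(t, state); // a later record overwrites an earlier one
    }
  }
  return slices;
}

const records = [
  {state: 'RUN', start: new Date(Date.UTC(2021, 0, 1, 14, 30, 0)), seconds: 300},
  {state: 'IDLE', start: new Date(Date.UTC(2021, 0, 1, 14, 34, 30)), seconds: 60},
];
const slices = toMinuteSlices(records);
```

In this made-up input the RUN record covers 14:30 through 14:34, and the IDLE record that starts at 14:34:30 then takes over the 14:34 slice and adds 14:35.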
This works okay but it seems that this code is exactly what is normally done in group().reduce(add,remove,init). But in this case I need to add multiple timeslots depending on the duration of a state.
Also, because it is now using a different crossfilter, maps do not update each other.
Here are my questions related to this:
Can I display a heatmap without supplying information for all individual 'cells'? (i.e. straddle cells based on a value, similar to rowspan in a table)
Can I add multiple values at once inside group().reduce()?
Is there an easy way to invert the yAxis so 0 is at the top?
When clicking a row in the heatmap, it selects a column and vice-versa?
I'm not sure if this should be in the crossfilter group. If so please ignore my rambling. If someone knows how to keep the charts linked by grouping better, please let me know.
--Nico
Concerning Question 3:
dc.js heatmaps currently do not support custom ordering functions on the axes, but a pull request for this has been merged into the develop branch and should be publicly available soon.
You could manually edit the dc.js file to set the sorting in heatmaps to a custom function. In the latest (2.0.0-beta10) version it is the following line:
rowValues.sort(d3.ascending);
and accordingly
colValues.sort(d3.ascending);
I need some advice about the design on my app.
I have several screens where I need a date range to show content. One has rows of text data and others are charts. All but one need future dates (one needs dates in the past).
I have a screen where they can select the date range and a quick way to select 30 days etc.
I think I have two options: 1) allow one date range to be used throughout all screens (apart from that one chart), or 2) allow a date range to be selected for each.
However, both of these have issues: I may have to correct old dates, and there's that one chart. Making these corrections could also cause confusion.
My main concern is that I don't want to confuse my users.
Suggestions?
I recommend placing a date range selector on each screen. It makes the content super obvious and a little easier to use. For convenience, you could have previously selected date ranges persist between screens for users who want all the data for the same dates.
edit: What is the platform? If it's a mobile phone, I would recommend a separate screen for the date range; if it's a computer, I would recommend the date range selector on each screen.