Ignore subset of data when filtering - dc.js

I use dc.js for analyzing results of a classification algorithm
and would like to filter on the confidence
(additional metrics like precision, recall and f-measure are calculated on the whole filtered dataset).
Example: https://jsfiddle.net/bse7rfdy/6/
var conf = dc.barChart('#conf');
conf
  .dimension(ConfidenceDimension)
  .group(ConfidenceGroup)
  .x(d3.scaleLinear().domain([0.0, 1.05]))
  .xAxisLabel("confidence")
  .xUnits(function() { return 20; })
  .yAxisLabel("");
Since the false negatives always have a confidence of 0.0 they will be filtered when the confidence bar chart is used to select a confidence range greater than 0.0.
Thus I want to filter on confidence only when the "EvaluationResult" is not "false negative". I also don't want to show the false negatives in the confidence bar chart, only in the pie chart (thus they should remain in the crossfilter dataset).
I know that I can remove the 0.0 bar by using a fake group, but when I filter on the confidence bar chart (e.g. selecting a range from 0.5 to 0.6) the filter is still applied and the "false negative" rows are removed.
Actually I need to modify the filter in a way that the confidence range (selected by the user) is applied only if "EvaluationResult" !== "false negative".
Is that possible?

Thanks for the fiddle, that helps so much for answering questions.
You can do this by specifying a filterHandler for the chart.
The results are a little weird, because the pie chart will always show the same number of false negatives, while the other slices change size.
function filter_range_ignore_zeroes(dimension, filters) {
  // No brush active: clear the dimension's filter.
  if (filters.length === 0) {
    dimension.filter(null);
    return filters;
  }
  const filter = filters[0],
        rf = dc.filters.RangedFilter(filter[0], filter[1]);
  // Keep zero-confidence rows; otherwise defer to the ranged filter.
  dimension.filterFunction(k => k === 0 || rf.isFiltered(k));
  return filters;
}
conf.filterHandler(filter_range_ignore_zeroes);
The handler has two cases: if there's no filter active, it resets the dimension filter.
Otherwise, it installs a filter function on the dimension which accepts zeroes but delegates to the default dc.filters.RangedFilter behavior otherwise.
Fork of your fiddle.
[You're not binning the confidence in this fiddle, so the bars overlap and go to 1.0, but I figure you have that working in your actual dashboard.]
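To see what the installed filter function accepts, here is a minimal sketch that runs without dc.js or crossfilter. The helper names inRange and keepZeroOrInRange are made up for illustration, and inRange only imitates what I understand dc.filters.RangedFilter's isFiltered to do (low <= value < high); the sample confidences are invented.

```javascript
// Stand-in for dc.filters.RangedFilter(low, high).isFiltered (assumption:
// it accepts values with low <= value < high). Hypothetical helper.
function inRange(low, high) {
  return value => value >= low && value < high;
}

// The predicate the handler installs: always keep zero-confidence rows
// (the false negatives), otherwise apply the brushed range.
function keepZeroOrInRange(low, high) {
  const ranged = inRange(low, high);
  return k => k === 0 || ranged(k);
}

const confidences = [0, 0.2, 0.5, 0.55, 0.6, 0.9]; // invented sample data
const kept = confidences.filter(keepZeroOrInRange(0.5, 0.6));
console.log(kept); // [0, 0.5, 0.55]
```

The zero survives even though it lies outside the brushed range, which is why the false-negative slice of the pie chart stays constant.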

Related

How to show only limited number of records in box plot dc.js

I want to show the most recent 10 bins for box plot.
If a filter is applied to the bar chart or line chart, the box plot should show the most recent 10 records according to those filters.
I made a dimension by date (ordinal), but I am unable to get the result.
I didn’t understand how to do it with a fake group. I am new to dc.js.
A picture of the scenario is attached. Let me know if anyone needs more detail to help me.
In the image I tried a solution using a time scale.
You can do this with two fake groups, one to remove the empty box plots, and one to take the last N elements of the resulting data.
Removing empty box plots:
function remove_empty_array_bins(group) {
  return {
    all: function() {
      return group.all().filter(d => d.value.length);
    }
  };
}
This just filters the bins, removing any where the .value array is of length 0.
Taking the last N elements:
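function cap_group(group, N) {
  return {
    all: function() {
      var all = group.all();
      // Clamp the start index: with fewer than N bins, a negative index
      // would make slice() drop bins instead of returning all of them.
      return all.slice(Math.max(0, all.length - N));
    }
  };
}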
This is essentially what the cap mixin does, except without creating a bin for "others" (which is somewhat tricky).
We fetch the data from the original group, see how long it is, and then slice that array from all.length - N to the end.
Chain these fake groups together when passing them to the chart:
chart
  .group(cap_group(remove_empty_array_bins(closeGroup), 5))
I'm using 5 instead of 10 because I have a smaller data set to work with.
Demo fiddle.
This example uses a "real" time scale rather than ordinal dates. There are a few ways to do ordinal dates, but if your group is still sorted from low to high dates, this should still work.
If not, you'll have to edit your question to include an example of the code you are using to generate the ordinal date group.
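The two fake groups can also be tried without a chart. This is only a sketch against a mock group object: mockGroup and its keys and values are invented, and the cap_group here clamps the start index so that asking for more bins than exist returns all of them.

```javascript
// Mock of a crossfilter group whose values are arrays, as for a box plot.
// Keys and values are invented for illustration.
const mockGroup = {
  all: () => [
    { key: '2017-01', value: [3, 4] },
    { key: '2017-02', value: [] },
    { key: '2017-03', value: [5] },
    { key: '2017-04', value: [6, 7, 8] },
    { key: '2017-05', value: [] },
    { key: '2017-06', value: [9] }
  ]
};

// Drop bins whose value array is empty.
function remove_empty_array_bins(group) {
  return {
    all: () => group.all().filter(d => d.value.length)
  };
}

// Keep only the last N bins (clamped so the start index is never negative).
function cap_group(group, N) {
  return {
    all: function() {
      const all = group.all();
      return all.slice(Math.max(0, all.length - N));
    }
  };
}

const lastTwo = cap_group(remove_empty_array_bins(mockGroup), 2).all();
console.log(lastTwo.map(d => d.key)); // ['2017-04', '2017-06']
```

Because both wrappers only expose all(), they recompute from the underlying group on every redraw, so the "last N" always reflects the current filters.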

dc.js Weird mouse zooming for seriesChart

I want to use the mouse zoom functionality on seriesChart and have it filter for other charts of the same group.
When I enable the zoom with .mouseZoomable(true) on seriesChart, and zoom the chart, the other charts become empty.
This doesn't happen when I enable it on a LineChart.
Here is a simple example: https://codepen.io/udeste/pen/ZKeXmX
(Zoom the second chart with the mouse. All is working. But when you zoom the first chart the other charts go blank.)
What am I doing wrong? Is it a dc.seriesChart bug?
It's because dc.seriesChart requires you to supply that strange two-part dimension key, but it doesn't change the filter function accordingly.
You specified seriesDimension like so:
var seriesDimension = ndx.dimension(function(d) {
  return [+d.Expt, +d.Hours];
});
But when you zoom, the dc.coordinateGridMixin just filters using a regular dc.filters.RangedFilter, which does not know about these kinds of two-dimensional "multikeys".
Probably since the series chart requires this kind of input, it should redefine the filter handler to also deal with multikeys. But until then, you can work around it by providing your own filterHandler:
seriesChart.filterHandler(function(dimension, filters) {
  if(filters.length === 0) // 1
    dimension.filter(null);
  else {
    console.assert(filters.length===1); // 2
    console.assert(filters[0].filterType==='RangedFilter');
    dimension.filter(function(d) { // 3
      return filters[0][0] <= d[1] && d[1] < filters[0][1];
    });
  }
  // A filter handler should return the filters it applied
  return filters;
});
What this does:
1. Checks if this event is because the filters have been cleared, and clears the dimension's filter if so.
2. Asserts that the filter is what is expected. coordinateGridMixin will always supply a single dc.filters.RangedFilter but who knows what else could happen.
3. Supplies a substitute filter function that checks if the part of the key used by the keyAccessor falls within the range (instead of comparing the array with the range, which will always return false).
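The difference between comparing the whole multikey and comparing only its x part can be seen in plain JavaScript. This is just a sketch: the keys and the brushed range below are invented, shaped like the [Expt, Hours] keys produced by seriesDimension.

```javascript
// Keys shaped like the seriesDimension output: [series, x]. Invented values.
const keys = [[1, 0], [1, 5], [2, 5], [2, 12]];
const range = [3, 10]; // hypothetical brushed x-range

// Comparing the whole array with the bounds: the array is coerced to a
// string such as "1,5", which converts to NaN, so every comparison is false.
const naive = keys.filter(k => range[0] <= k && k < range[1]);

// Comparing only the x part of the key, as the replacement handler does.
const byX = keys.filter(k => range[0] <= k[1] && k[1] < range[1]);

console.log(naive.length); // 0
console.log(byX);          // [[1, 5], [2, 5]]
```

With the naive comparison nothing passes the filter, which matches the other charts going blank after a zoom.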
Here's a working fork of your codepen.
(Incidentally, it looks like this example slams into a known issue where line segments off the edge of the chart are dropped instead of being clipped. It won't be quite as bad if there are more points. I don't think I've found a good workaround. Hopefully we'll fix this soon.)

Clicking on rowchart (dc.js) changes the percentage

I need to solve a problem with dc and crossfilter, I have two rowcharts in which I show the calculated percentage of each row as:
(d.value/ndx.groupAll().reduceCount().value()*100).toFixed(1)
When you click on a row in the first chart, its label changes to 100% instead of keeping the old percentage, and the percentages of the other rows in the same chart change as well.
Is it possible to keep the original percentages in the chart I click on, while still affecting the other charts?
regards
thank you very much
First off, you probably don't want to call ndx.groupAll() inside the calculation for the percentages, since that will be called many times. This method creates an object which will get updated every time a filter changes.
Now, there are three ways to interpret your specific question. I think the first case is the most likely, but the other two are also legitimate, so I'll address all three.
Percentages affected by other charts
Clearly you don't want the percentage affected by filtering the current chart. You almost never want that. But it often makes sense to have the percentage label affected by filtering on other charts, so that all the bars in the row chart add up to 100%.
The subtle difference between dimension.groupAll() and crossfilter.groupAll() is that the former does not observe that dimension's own filters, whereas the latter observes all filters. If we use the row chart dimension's groupAll, it will observe the filters on other charts but not filters on this chart:
var totalGroup = rowDim.groupAll().reduceCount();
rowChart.label(function(kv) {
  return kv.key + ' (' + (kv.value/totalGroup.value()*100).toFixed(1) + '%)';
});
That's probably what you want, but reading your question literally suggests two other possible answers. So read on if that's not what you were looking for.
Percentages out of the constant total, but affected by other filters
Crossfilter doesn't have any particular way to calculate unfiltered totals, but if we want to use the unfiltered total, we can capture the value before any filters are applied.
So:
// Call value() once, up front, so the total is a fixed number
var total = rowDim.groupAll().reduceCount().value();
rowChart.label(function(kv) {
  return kv.key + ' (' + (kv.value/total*100).toFixed(1) + '%)';
});
In this case, the percentages will always show the portion out of the full, unfiltered, total denominator, but the numerators will reflect filters on other charts.
Percentages not affected by filtering at all
If you really want to just completely freeze the percentages and show unfiltered percentages, not affected by any filtering, we'll have to do a little extra work to capture those values.
(This is similar to what you need to do if you want to show a "shadow" of the unfiltered bars behind them.)
We'll copy all the group data into a map we can use to look up the values:
var rowUnfilteredAll = rowGroup.all().reduce(function(p, kv) {
  p[kv.key] = kv.value;
  return p;
}, {});
Now the label code is similar to before, but we look up values instead of reading them from the bound data:
// Call value() once, up front, so the total is a fixed number
var total = rowDim.groupAll().reduceCount().value();
rowChart.label(function(kv) {
  return kv.key + ' (' + (rowUnfilteredAll[kv.key]/total*100).toFixed(1) + '%)';
});
(There might be a simpler way to just freeze the labels, but this is what came to mind.)
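Here is a small sketch of why the snapshot matters, using a mock group rather than crossfilter. The bins, keys, and the frozenLabel helper are invented for illustration. The mock imitates crossfilter's behavior of updating the objects returned by all() in place.

```javascript
// Mock group standing in for rowGroup. Values are invented.
const bins = [
  { key: 'a', value: 30 },
  { key: 'b', value: 50 },
  { key: 'c', value: 20 }
];
const rowGroup = { all: () => bins };

// Snapshot taken before any filter is applied: copies the numbers out.
const rowUnfilteredAll = rowGroup.all().reduce(function(p, kv) {
  p[kv.key] = kv.value;
  return p;
}, {});
const total = rowGroup.all().reduce((sum, kv) => sum + kv.value, 0);

// Hypothetical label function reading from the snapshot, not the live bins.
function frozenLabel(kv) {
  return kv.key + ' (' + (rowUnfilteredAll[kv.key] / total * 100).toFixed(1) + '%)';
}

// Imitate a filter on another chart changing the live bin values in place.
bins[0].value = 10;
bins[1].value = 5;
bins[2].value = 0;

const labels = rowGroup.all().map(frozenLabel);
console.log(labels); // ['a (30.0%)', 'b (50.0%)', 'c (20.0%)']
```

The labels keep the unfiltered percentages even though the live values have changed.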

Get only non-filtered data from dc.js chart (dimension / group)

So this is a question regarding a rather specific problem. As I know from Gordon, main contributor of dc.js, there is no support for elasticY(true) function for logarithmic scales.
So, after knowing this, I tried to implement my own solution, by building a workaround, inside dc.js's renderlet event. This event is always triggered by a click of the user onto the barchart. What I wanted to do is this:
let groupSize = this.getGroupSize(fakeGroup, this.yValue);
let maximum = group.top(1)[0].value;
let minimum = group.top(groupSize)[groupSize-1].value;
console.log(minimum, maximum);
chart.y(d3.scale.log().domain([minimum, maximum])
  .range([this.height, 0])
  .nice()
  .clamp(true));
I thought, that at this point the "fakeGroup" (which is just group.top(50)) contains only the data points that are NOT filtered out after the user clicked somewhere. However, this group always contains all data points that are in the top 50 and doesn't change on filter events.
What I really wanted is get all data points that are NOT filtered out, to get a new maximum and minimum for the yScale and rescale the yAxis accordingly by calling chart.y(...) again.
Is there any way to get only data rows that are still in the chart and not filtered out. I also tried using remove_empty_bins(group) but didn't have any luck with that. Somewhere is always all() or top() missing, even after giving remove_empty_bins both functions.
This is how i solved it:
I made a function called rescale(), which looks like this:
rescale(chart, group, fakeGroup) {
  let groupSize = this.getGroupSize(fakeGroup, this.yValue);
  // Smallest value among the bars shown; a log scale needs a positive minimum
  let minTop = group.top(groupSize)[groupSize-1].value;
  let minimum = minTop > 0 ? minTop : 0.0001;
  let maximum = group.top(1)[0].value;
  chart.y(d3.scale.log().domain([minimum, maximum])
    .range([this.height, 0]) // range() takes an array
    .nice()
    .clamp(true));
}
I think the parameters are pretty self-explanatory: I pass in my chart, the whole group as set by dimension.group().reduceSum(...), and a fake group I created that contains the top 50 elements, to reduce the bar count of my chart.
The rescale() method is called in the event listener:
chart.on('preRedraw', (chart) => {
  this.rescale(chart, group, fakeGroup);
});
So what I do is redefine the chart's y scale (resetting the min and max values from the filtered data) every time the chart gets redrawn, which also happens every time one of my charts is filtered. Now the scale always fits the filtered data the chart contains after filtering another chart.
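The domain calculation itself can be isolated from the chart. The following is only a sketch under my own assumptions: logDomain is a hypothetical helper, it takes a non-empty array of {key, value} bins (such as the result of group.top(50)), and the 0.0001 floor mirrors the fallback used in rescale().

```javascript
// Hypothetical helper: derive a log-scale domain from the bins currently
// shown. Assumes bins is a non-empty array of {key, value} objects.
function logDomain(bins, floor = 0.0001) {
  const values = bins.map(d => d.value);
  const min = Math.min(...values);
  const max = Math.max(...values);
  // A log scale cannot include zero or negative numbers, so clamp both
  // ends of the domain to the floor.
  return [Math.max(min, floor), Math.max(max, floor)];
}

// Invented sample bins, one of them filtered down to zero.
const sampleBins = [
  { key: 'a', value: 0 },
  { key: 'b', value: 12 },
  { key: 'c', value: 340 }
];
const domain = logDomain(sampleBins);
console.log(domain); // [0.0001, 340]
```

The returned pair could then be passed to the scale's domain() inside the preRedraw listener.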

DC.js Choropleth filtering Issue

I am trying to filter data on my choropleth chart from a bargraph. Strange thing is that it is not showing correct value on selecting a bar from the accompanying bar chart.
Here is the jsfiddle: https://jsfiddle.net/anmolkoul/jk8LammL/
The script code begins from line 4794
If i select WIN004 from the bar chart, it should highlight only five states and the tooltip should reflect the values for the data. Some states are highlighted for whom WIN004 does not exist.
I changed the properties of the choropleth from
.colors(d3.scale.quantize().range(["#F90D00", "#F63F00", "#F36F01", "#F09E01", "#EDCB02", "#DDEA03", "#ADE703", "#7EE404", "#50E104", "#24DE05", "#05DB11"]))
.colorDomain([-1, 1])
To
.colors(d3.scale.linear().range(["green", "white", "red"]))
.colorDomain([-2, 0, 2])
But I get a lot of white states, where it's hard to discern what has been highlighted. The tooltips for some whited-out states show -0.00 :/
Here is the fiddle http://jsfiddle.net/anmolkoul/jk8LammL/1/
So I guess it's either a problem with my color range or with how my data is getting parsed.
I would ideally like to specify the data ranges in .colorDomain based on the top and bottom values of the riskIndicator dimension. My functions are not working though. Should I use d3.max or riskIndicator.top here?
EDIT:
I made the color domain dynamic by using the min and max values, but the graph still does not perform as expected. Could this be an issue with the geochoropleth chart? I also took a working geochoropleth example and ported my data to it, and even that gave me the same issue of representing data incorrectly. I thought it could be a data problem, but I validated using a couple of good BI tools and their map charts displayed the data correctly.
Could this be an issue with the dc choropleth?
Thank you.
Anmol
This has the same root cause as the issue in this question:
Crossfilter showing negative numbers on dc.js with no negative numbers in the dataset
In short, floating point numbers don't always cancel out to zero when added and subtracted. This "fake group" will ensure they snap to zero when they get close:
function snap_to_zero(source_group) {
  return {
    all: function () {
      return source_group.all().map(function(d) {
        return {key: d.key,
                value: (Math.abs(d.value) < 1e-6) ? 0 : d.value};
      });
    }
  };
}
Added it to the FAQ!
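For anyone who wants to see the floating-point residue for themselves, here is a minimal sketch with a mock group. The keys and values are invented; the first value is the leftover from 0.1 + 0.2 - 0.3, which is tiny but not zero.

```javascript
function snap_to_zero(source_group) {
  return {
    all: function () {
      return source_group.all().map(function(d) {
        return {key: d.key,
                value: (Math.abs(d.value) < 1e-6) ? 0 : d.value};
      });
    }
  };
}

// Adding and then subtracting values does not always cancel out exactly.
const residue = 0.1 + 0.2 - 0.3;

// Mock group with invented keys; one bin holds only the residue.
const mockGroup = {
  all: () => [
    { key: 'CA', value: residue },
    { key: 'NY', value: -0.75 }
  ]
};

const snapped = snap_to_zero(mockGroup).all();
console.log(residue === 0);    // false
console.log(snapped[0].value); // 0
console.log(snapped[1].value); // -0.75
```

Genuine non-zero values pass through unchanged; only values within 1e-6 of zero are snapped.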

Resources