How to handle duplicate values in d3.js - d3.js

First, I'm a d3.js noob :)
As you can see from the title, I have a problem with duplicated data. Aggregating the values is not an option, because the names represent different bus stops. In this example, the stops might be on the front side and the back side of a building.
And of course I'd like to show the names on the x-axis.
I created an example and the result is a bloody mess, see the jsFiddle.
x = index
name = bus stop name
n = value
I've got JSON like this:
[{
"x": 0,
"name": "Corniche St / Abu Dhabi Police GHQ",
"n": 113
},
{
"x": 1,
"name": "Corniche St / Nation Towers",
"n": 116
},
{
"x": 2,
"name": "Zayed 1st St / Al Khalidiya Public Garden",
"n": 146
},
...
{
"x": 49,
"name": "Hamdan St / Tariq Bin Zeyad Mosque",
"n": 55
}]
The problem: a name can appear more than once, e.g.
{
"x": 1,
"name": "Corniche St / Nation Towers",
"n": 116
}
and
{
"x": 4,
"name": "Corniche St / Nation Towers",
"n": 105
}
I'd like to know whether there is a way to tell d3.js not to "delete" duplicated names and instead show all names in sequence with their values.
Any ideas or suggestions are very welcome :) If you need more information, let me know.
Thanks in advance
Mario

Lars is right: the ordinal scale (d3.scale.ordinal) is doing exactly what it should: treating duplicate values as repeat instances of the same item. See here for more details: https://github.com/mbostock/d3/wiki/Ordinal-Scales
You can use a regular linear scale instead, like this: http://jsfiddle.net/vy8vjy4r/2/
The changes are to make the scale linear and set the domain to run from 0 to the length of your dataset.
var x = d3.scale.linear().domain([0,j_data.length]).range([0, width]),
When you pass a value to the scale, you simply pass the position in the list. I'm using the index - the i in function(d,i) - but you could have used the x in your dataset. (I didn't use it as it looks like you don't need it.)
.x(function (d,i) { return x(i); })
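To see why positioning by index keeps repeated stop names apart, here is a minimal sketch in plain JavaScript. The linearScale helper is a hand-written stand-in for d3.scale.linear() (it is not part of D3), and the three-stop array and the width of 300 are made up for illustration.

```javascript
// Hand-rolled stand-in for d3.scale.linear(): maps [d0, d1] onto [r0, r1].
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1], r0 = range[0], r1 = range[1];
  return function (value) {
    return r0 + (value - d0) * (r1 - r0) / (d1 - d0);
  };
}

// Illustrative data: the first and last stops share a name.
var stops = [
  { name: "Corniche St / Nation Towers", n: 116 },
  { name: "Zayed 1st St / Al Khalidiya Public Garden", n: 146 },
  { name: "Corniche St / Nation Towers", n: 105 }
];

var width = 300; // assumed chart width
var xScale = linearScale([0, stops.length], [0, width]);

// Position each stop by its index, not by its name,
// so two stops with the same name still get two x positions.
var points = stops.map(function (d, i) {
  return { name: d.name, x: xScale(i), n: d.n };
});

console.log(points);
```

Because the scale never looks at the name, the duplicated stop ends up at two different x positions.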
Hopefully this works for you.
Additional information on axis
Strictly speaking, I guess this should have been a separate question, but to get the text on the axis, you can simply add these two lines of code where you modify the text in xAxisGroup, after .selectAll("text"):
.data(j_data.filter(function(d,i) { return !(i%5); }))
.text(function(d){ return d.name; })
The axis is displaying numbers every fifth item, so we choose every fifth item from the dataset. This gives us data that matches the existing labels, and we change the text to the .name value, see http://jsfiddle.net/vy8vjy4r/4/
This approach isn't particularly robust: it depends on D3 displaying every fifth stop, and for short or very long routes (or whatever these are) it might display all stops, or every tenth, etc. I would rather build my own axis than use the D3 one. For something like this, it shouldn't be too hard, although fitting all the names into this space might be.
Try this: http://jsfiddle.net/vy8vjy4r/5/
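One way to avoid depending on which ticks D3 happens to draw is to derive each label from the tick value itself. The sketch below is plain JavaScript and has not been run against the fiddle; makeNameFormatter and sampleStops are hypothetical names. With D3, a function like this could be handed to the axis through axis.tickFormat, assuming the tick values are indices into the dataset.

```javascript
// Returns a formatter that turns a tick value (an index) into a stop name.
function makeNameFormatter(data) {
  return function (tickValue) {
    var d = data[tickValue];
    // Ticks that fall between or outside the data points get an empty label.
    return d ? d.name : "";
  };
}

// Illustrative data only.
var sampleStops = [
  { name: "Corniche St / Abu Dhabi Police GHQ", n: 113 },
  { name: "Corniche St / Nation Towers", n: 116 },
  { name: "Zayed 1st St / Al Khalidiya Public Garden", n: 146 }
];

var formatTick = makeNameFormatter(sampleStops);
console.log(formatTick(1));   // label for the tick at index 1
console.log(formatTick(2.5)); // non-integer tick: empty label
```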

Try this filter,
var names = [];
var result = [];
var indx=-1;
for(var i=0; i< j_data.length; i++){
indx = names.indexOf(j_data[i].name);
if(indx==-1){
names.push(j_data[i].name);
result.push(j_data[i]);
}
}
j_data= result;
Do this after your j_data array is defined; it will remove the duplicated objects from the array. See http://jsfiddle.net/vy8vjy4r/1/
If that is not what you are looking for, say what you need changed.
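The same filter can be sketched more compactly with a Set, which avoids the repeated indexOf scan. This is only an illustration: uniqueByName and the sample array are made-up names, and like the loop above it keeps only the first record for each name and drops the later ones.

```javascript
// Keeps the first record for each name and drops later records with the same name.
function uniqueByName(data) {
  var seen = new Set();
  return data.filter(function (d) {
    if (seen.has(d.name)) {
      return false;
    }
    seen.add(d.name);
    return true;
  });
}

// Illustrative data: "Corniche St / Nation Towers" appears twice.
var sample = [
  { x: 0, name: "Corniche St / Abu Dhabi Police GHQ", n: 113 },
  { x: 1, name: "Corniche St / Nation Towers", n: 116 },
  { x: 4, name: "Corniche St / Nation Towers", n: 105 }
];

var deduped = uniqueByName(sample);
console.log(deduped.length);
```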

Related

How can I color my scatter using a vector with values?

I want to color my scatter plot using a vector of values. Specifically, I want to use a dimension other than the one used to create the scatter plot.
These lines color the scatter plot using the values of the dimension it is built on:
.colorAccessor(function(d) {return d.key[1]})
.colors(d3.scaleSequential(d3.interpolateOranges))
.colorDomain(y_range)
y_range = [y_min, y_max]
I tried to include the column for color in the dimension of the scatter, but it slows down the process of filtering. Something like this:
scatterDim = crossFilter.dimension(function(d) { return [d[it.variable[0]], d[it.variable[1]], d[it.color]]})
.colorAccessor(function(d) {return d.key[2]})
.colors(d3.scaleSequential(d3.interpolatePlasma))
.colorDomain([colorUnits[0], colorUnits[colorUnits.length - 1]]),
I want to have a different dimension for color:
colorDimension = crossFilter.dimension(function (d) { return d[it.color] }),
colorGroup = colorDimension.group().reduceCount(),
colorAll = colorGroup.all(),
colorUnits = [],
count = 0;
for(var color in colorAll)
{
colorUnits[count] = colorAll[color].key;
count++;
}
.colorAccessor(//some different code for my vector colorUnits or even for dimension?!//)
.colors(d3.scaleSequential(d3.interpolatePlasma))
.colorDomain([colorUnits[0], colorUnits[colorUnits.length - 1]]),
I would also like to know how to use scaleOrdinal for color. In case that the vector colorUnits contains strings.
The name "dimension" is a little confusing in crossfilter and dc.js. It isn't used to describe the "Y" (aggregated) values, or the color.
It really means, "I want to bin my data by this key, and filter on it."
The reason you will find color as a third element in dimension keys in many examples is that it's expedient. It's easier to change the keys than the aggregated values. But it doesn't really make sense.
The fact that your chart got slower when you added color to your dimension key tells me that you don't have a unique color for each X/Y pair. Instead of drawing a dot for each X/Y pair, you end up with a dot for each X/Y/color triplet.
You also don't need to create a separate color dimension unless you want to bin, aggregate, or filter on color.
Assuming you only want one dot per X/Y pair, you need to decide which color to use. Then you can change the reduction, instead of the key, to add this data:
scatterDim = crossFilter.dimension(function(d) {
return [d[it.variable[0]], d[it.variable[1]]];
}),
scatterGroup = scatterDim.group().reduce(
function(p, v) { // add
p.count++; // reduceCount equivalent
p.color = combine_colors(p.color, v[it.color]);
return p;
},
function(p, v) { // remove
p.count--;
// maybe adjust p.color
return p;
},
function() { // init
return {count: 0, color: null};
}
);
If you don't care which of the colors is used, you don't need combine_colors; just use v[it.color]. Otherwise, that's something you need to decide based on your application.
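To make the add/remove/init contract concrete without loading crossfilter, here is a plain JavaScript sketch. groupByXY is a tiny stand-in for what group().reduce(...).all() returns, not the crossfilter API itself. The combineColors policy (keep the first color seen), the reset to null on removal, and the row field names x, y and color are all assumptions made for the example.

```javascript
// Assumed policy: keep the first color seen for a given X/Y pair.
function combineColors(current, next) {
  return current === null ? next : current;
}

function reduceAdd(p, v) {
  p.count++;
  p.color = combineColors(p.color, v.color);
  return p;
}

function reduceRemove(p, v) {
  p.count--;
  if (p.count === 0) {
    p.color = null; // assumed: forget the color once the bin is empty
  }
  return p;
}

function reduceInitial() {
  return { count: 0, color: null };
}

// Stand-in for group().reduce(add, remove, init).all(): bins rows by [x, y].
function groupByXY(rows) {
  var bins = new Map();
  rows.forEach(function (row) {
    var id = row.x + "," + row.y;
    if (!bins.has(id)) {
      bins.set(id, { key: [row.x, row.y], value: reduceInitial() });
    }
    reduceAdd(bins.get(id).value, row);
  });
  return Array.from(bins.values());
}

var rows = [
  { x: 1, y: 850, color: "red" },
  { x: 1, y: 850, color: "blue" },
  { x: 2, y: 900, color: "green" }
];

var groups = groupByXY(rows);
console.log(JSON.stringify(groups));
```

Two rows share the key [1, 850], so they collapse into a single bin with a count of 2 and one color, which is what yields one dot per X/Y pair.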
Now the scatter group has objects as its values, and you can change the scatter plot to take advantage of them:
scatterPlot
.existenceAccessor(d => d.value.count) // don't draw dot when it is zero
.colorAccessor(d => d.value.color)
If in fact you do want to draw all the dots with different colors, for example using opacity to allow overplotting, you probably need a canvas implementation of a scatter plot, because SVG is only good up to thousands of points. There is one in the works for dc.js but it needs to be ported to the latest APIs.
I would also like to know how to use scaleOrdinal for color. In case that the vector colorUnits contains strings.
Not sure what you mean here. scaleOrdinal takes strings as its domain, so
.colors(d3.scaleOrdinal(colorUnits, output_colors))
should work?
Example
Since I'm failing to communicate something or another, here is an example. The color strings come from an array since I don't have an example of your data or code:
const names = ["Zero", "One", "Two", "Three", "Four", "Five"];
speedSumGroup = runDimension.group()
.reduce(
function(p, v) { // add
p.count++; // reduceCount equivalent
p.color = names[+v.Expt];
return p;
},
// ... as before
);
chart
.colorAccessor(d => d.value.color)
.colors(d3.scaleOrdinal(names, d3.schemeCategory10))
Once again, if the method isn't working for you, the best way to figure it out is to log speedSumGroup.all(). I get:
[
{
"key": [
1,
850
],
"value": {
"count": 1,
"color": "One"
}
},
{
"key": [
1,
880
],
"value": {
"count": 1,
"color": "Three"
}
},
{
"key": [
1,
890
],
"value": {
"count": 2,
"color": "Five"
}
},
// ...
]
Example fiddle.

How to change D3.format to display real number suffix instead of Byte suffix

Newbie Alert! My experience with JavaScript and SVG is EXTREMELY dated and I am completely new to d3.
I've just learned that d3.format('s') will display a number like 160,000 to 160k and a number like 20,000,000,000 to 20G. AWESOME!! There is just one teeny-tiny problem. This is a format for Scientific Notation in Bytes. I assume (maybe incorrectly) there must be an additional parameter that d3 uses to display numbers in Thousands (16T) and Millions (160M) and Billions (20B) instead of Bytes. No?
I have found another question, "how to get localizable or customizable si codes with d3.format", and have looked at other similar questions, but have yet to find what I believe to be the real answer. If what I am trying to do is not a feature in d3 but there is a work-around, I might need a little help implementing it.
I have been stepping through the d3 code to try to understand how to use the other input parameters (si codes, prefix, type, etc.) and even tried to glean some information from a tutorial: D3.format tutorial through examples
It's become pretty clear that one needs to be an expert in D3 alone. So, in the absence of time, I'm asking for a D3 guru to please help.
Here's a custom formatting function adapted from another question:
function getFormatter(digits) {
  var notations = [
    // { value: 1E12, suffix: "T" }, what you'll use for trillion?
    { value: 1E9, suffix: "B" },
    { value: 1E6, suffix: "M" },
    { value: 1E3, suffix: "T" }
  ];
  // Strips trailing zeros after the decimal point, e.g. "20.00" -> "20"
  var rx = /\.0+$|(\.[0-9]*[1-9])0+$/;
  return function(num) {
    // Notations are ordered largest first, so the first match wins
    for (var i = 0; i < notations.length; i++) {
      var notation = notations[i];
      if (num >= notation.value) {
        var value = (num / notation.value).toFixed(digits);
        return value.replace(rx, "$1") + notation.suffix;
      }
    }
    // Below 1,000 no suffix applies, so return the number unchanged
    return String(num);
  };
}
var numbers = [
20000000000,
160020,
18200000
]
d3.select('body')
.append('ul')
.selectAll('li')
.data(numbers)
.enter()
.append('li')
.text(getFormatter(2))
<script src="https://d3js.org/d3.v4.min.js"></script>
Hopefully only the notations variable is relevant to your problem. But let me know if you need any further clarification. Good luck!
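If you would rather keep d3.format for the number part, another possible workaround is to swap only the last character of its output. The sketch below is plain JavaScript and works on strings shaped like the ones in the question ("160k", "20G"); swapSiSuffix and SUFFIX_MAP are made-up names, and the exact digits d3.format produces depend on the D3 version and the precision you ask for, so treat the inputs here as assumptions.

```javascript
// Maps SI prefixes to the suffixes asked for in the question.
var SUFFIX_MAP = { k: "T", M: "M", G: "B" };

// Takes an already formatted string such as "160k" and swaps its final character.
// Strings without a known SI prefix are returned unchanged.
function swapSiSuffix(formatted) {
  var last = formatted.charAt(formatted.length - 1);
  return Object.prototype.hasOwnProperty.call(SUFFIX_MAP, last)
    ? formatted.slice(0, -1) + SUFFIX_MAP[last]
    : formatted;
}

console.log(swapSiSuffix("160k"));
console.log(swapSiSuffix("20G"));
```

With D3 loaded, the input would come from something like d3.format(".3s")(value) rather than a literal string.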

dc.js stacked line chart with more than 1 dimension

My dataset is an array of JSON objects like this:
var data = [ { company: "A", date_round_1: "21/05/2002", round_1: 5, date_round_2: "21/05/2004", round_2: 20 },
...
{ company: "Z", date_round_1: "16/01/2004", round_1: 10, date_round_2: "20/12/2006", round_2: 45 }]
and I wish to display both 'round_1' and 'round_2' time series as stacked line charts.
The base line would look like this :
var fundsChart = dc.lineChart("#fundsChart");
var ndx = crossfilter(data);
var all = ndx.groupAll();
var date_1 = ndx.dimension(function(d){
return d3.time.year(d.date_round_1);
})
fundsChart
.renderArea(true)
.renderHorizontalGridLines(true)
.width(400)
.height(360)
.dimension(date_1)
.group(date_1.group().reduceSum(function(d) { return +d.round_1 }))
.x(d3.time.scale().domain([new Date(2000, 0, 1), new Date(2015, 0, 1)]))
I have tried using the stack method to add the series, but the problem is that only a single dimension can be passed to the lineChart.
Can you think of a workaround to display both series while still using a dc chart?
Are you going to be filtering on this chart? If not, just create a different group on a date_2 dimension and use that in the stack. Should work.
If you are going to be filtering, I think you'll have to change your data model a bit. You'll want to switch to have 1 record per round, so in this case you'll have 2 records for every 1 record you have now. There should be 1 date property (the date for that round), an amount property (the contents of round_x in the current structure), and a 'round' property (which would be '1', or '2', for example).
Then you need to create a date dimension and multiple groups on that dimension. The group will have a reduceSum function that looks something like:
var round1Group = dateDim.group().reduceSum(function(d) {
return d.round === '1' ? d.amount : 0;
});
So, what happens here is that we have a group that will only aggregate values from round 1. You'll create similar groups for round 2, etc. Then stack these groups in the dc.js chart.
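The reshaping step can be sketched in plain JavaScript as follows. toRoundRecords, parseDate and round1Amount are hypothetical names, the two sample records are the ones from the question, and the sketch assumes exactly two rounds per company and dates in dd/mm/yyyy form.

```javascript
// Parses "dd/mm/yyyy" into a Date in local time.
function parseDate(text) {
  var parts = text.split("/").map(Number);
  return new Date(parts[2], parts[1] - 1, parts[0]);
}

// Turns one record per company into one record per funding round.
function toRoundRecords(data) {
  var rounds = [];
  data.forEach(function (d) {
    ["1", "2"].forEach(function (round) {
      rounds.push({
        company: d.company,
        round: round,
        date: parseDate(d["date_round_" + round]),
        amount: +d["round_" + round]
      });
    });
  });
  return rounds;
}

var data = [
  { company: "A", date_round_1: "21/05/2002", round_1: 5, date_round_2: "21/05/2004", round_2: 20 },
  { company: "Z", date_round_1: "16/01/2004", round_1: 10, date_round_2: "20/12/2006", round_2: 45 }
];

var records = toRoundRecords(data);

// Accessor in the shape passed to reduceSum for the round 1 group.
function round1Amount(d) {
  return d.round === "1" ? d.amount : 0;
}

// Summing the accessor over all records mimics what the group totals per bin.
var round1Total = records.map(round1Amount).reduce(function (a, b) {
  return a + b;
}, 0);

console.log(records.length, round1Total);
```

The reshaped records would then be what gets passed to crossfilter, with a single date dimension built on the date property.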
Hopefully that helps!

Visualize data count in d3

I want to visualize data count in d3. I have a dataset similar to this:
[
{
"name": "Team blue",
"color": "#0433ff",
"count": 9
},
{
"name": "Team red",
"color": "#ff2600",
"count": 12
}
]
and I want to visualize it like this: http://i.imgur.com/xjFeNYd.png
I understand the basics of data and enter() but I do not know which is the best way to create the red or blue boxes based on the count value.
Any help will be appreciated.
You can use d3.range(number) to generate the numbers from 0 to number - 1. You can then combine this with nested selections. The code looks like this:
block.selectAll("span")
.data(function(d) { return d3.range(d.count); })
.enter()
.append('span')
Complete demo (with fixed CSS) here. The way of getting the color for the span elements is a bit hacky at the moment as it indexes into the top-level data set. A cleaner way would be to make this data part of the elements generated with d3.range() and is left as an exercise for the reader.
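One possible take on that exercise, sketched in plain JavaScript: generate one object per box and copy the team's color into it, so each span can read the color from its own datum. boxesFor is a hypothetical helper, and Array.from is used here as a stand-in for d3.range(d.count).map(...).

```javascript
// Builds one datum per box, each carrying the color of its team.
function boxesFor(team) {
  return Array.from({ length: team.count }, function (_, i) {
    return { index: i, color: team.color };
  });
}

var teams = [
  { name: "Team blue", color: "#0433ff", count: 9 },
  { name: "Team red", color: "#ff2600", count: 12 }
];

var boxes = teams.map(boxesFor);
console.log(boxes[0].length, boxes[1].length);
```

In the nested selection, boxesFor would take the place of the function passed to .data(), and the background style would then be set from d.color.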

Split stream (or path) into segments with D3JS

Please consider this data:
var data = [];
data.segments = [
{ "id": "A", "start": 0, "end": 4},
{ "id": "B", "start": 5, "end": 9},
{ "id": "C", "start": 10, "end": 14},
];
data.stream = [
[ 0, 0, 0, 0, 0,
65, 60, 75, 85, 60,
20, 30, 20, 25, 15,
],
];
I want to display it as three distinct parts, where the A segment (ie: the first 5 entries in the stream) would be red (or whatever the color), the B segment (the middle 5 entries) green and the C segment (the last 5 entries) blue.
Here's what it would look like with help from a photo-editing program:
So far, I'm able to display data.stream as a stream. However, I'm stuck at breaking it into segments.
If my data were structured differently (as in this question), things would be easier. However, the way the data is structured right now is close to ideal, as it lets me separate the segment definitions from the stream data. This is useful because I want to be able to use different segments down the line. (You can think of those segments as sounds or words inside an audio recording. Sometimes I would focus on individual sounds, sometimes on individual words, but the stream would always be the same.)
I put a working demo on JSFiddle here: http://jsfiddle.net/vsFhf/
How can I color the different parts of the stream?
Let me know if you need more details.
Thank you for the help-
Fabien
No matter what, you still need individual <path> elements for each segment. You could construct a segmented data array as @ValarDohaeris suggests. But you can also do it without transforming the data:
Instead of binding to data.stream, you need to bind to data.segments, which will enable you to create one <path> per segment. Then you call pathGenerator for each of those <path> elements, passing in a slice of the stream you're rendering (data.stream[0]). You'll also need to X-translate each <path> to the appropriate position, using your x scale function.
Here's the modified fiddle.
Would it help to split the data according to your segment definitions?
var segmentdata = data.segments.map(function(segment, i) {
return data.stream[0].slice(segment.start, segment.end + 1);
});
This will create:
segmentdata = [[0,0,0,0,0], [65,60,75,85,60], [20,30,20,25,15]]
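If each slice also keeps its segment id and start index, the result carries what is needed to color each path and to shift it to the right x position. This is a plain JavaScript sketch under assumptions: sliceSegments and the offset field are made-up names, and data is written as a plain object here rather than an array with extra properties.

```javascript
var data = {
  segments: [
    { id: "A", start: 0, end: 4 },
    { id: "B", start: 5, end: 9 },
    { id: "C", start: 10, end: 14 }
  ],
  stream: [
    [0, 0, 0, 0, 0, 65, 60, 75, 85, 60, 20, 30, 20, 25, 15]
  ]
};

// Cuts one stream into per-segment slices; "end" is inclusive, hence the + 1.
function sliceSegments(segments, stream) {
  return segments.map(function (segment) {
    return {
      id: segment.id,
      offset: segment.start, // index of the first value, usable for the x-translate
      values: stream.slice(segment.start, segment.end + 1)
    };
  });
}

var parts = sliceSegments(data.segments, data.stream[0]);
console.log(JSON.stringify(parts));
```

Swapping in a different segments array leaves the stream untouched, which matches the goal of keeping segment definitions separate from the stream data.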

Resources