Extend data to extent of graph - d3.js

I have time-sensitive data that ranges from now until 7 days from now at hourly resolution (~220 values). I was able to plot it by following Mike Bostock's demo here: http://bost.ocks.org/mike/cubism/intro/demo-stocks.html
but I can't work out how to extend the time scale and data to span the entire display. I looked for an extent argument, an xrange, or a width, but haven't had any luck. I'm sure the answer is trivial, but I can't seem to find it.
var context = cubism.context()
.step(3600000) // <-- this changes the time resolution
.size(1280) // <-- this changes the width
.stop();
Also, d3.time.scale.domain seems to be undefined in d3.v3.min.js.
Here's a fiddle showing the code (because it calls d3.csv I wasn't sure how to get it fully working in the fiddle, so I included the CSV file below the JavaScript): http://jsfiddle.net/oay7tvq0/

I think that cubism uses a single pixel for each time slice, so if you've only got 220 timesteps, only 220 pixels are needed. Check the documentation to be sure. If I'm correct, then you'll either have to be creative with your time series and stretch it across more timesteps (by repeating adjacent records), or perhaps d3 (or one of the various libraries built on it) would better suit your purposes. – user1614080
This I'm sure is the answer...
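
The "repeat adjacent records" idea from the comment above can be sketched as a nearest-neighbour upsampling step applied to the values before they are handed to the metric. This is only a sketch: stretchToWidth is a hypothetical helper, not part of cubism or d3, and it assumes the values are already in a plain array in time order.

```javascript
// Hypothetical helper (not a cubism API): repeat records so that
// `values` covers exactly `size` pixels, one value per pixel.
function stretchToWidth(values, size) {
  if (values.length === 0) return [];
  const out = new Array(size);
  for (let i = 0; i < size; i++) {
    // Map pixel i back to the source record that covers it.
    const j = Math.min(
      values.length - 1,
      Math.floor((i * values.length) / size)
    );
    out[i] = values[j];
  }
  return out;
}

// Three records stretched over six pixels: each record is repeated twice.
console.log(stretchToWidth([1, 2, 3], 6)); // [ 1, 1, 2, 2, 3, 3 ]
```

If the data is stretched this way, the context's step would presumably have to shrink by the same factor (roughly 3600000 * 220 / 1280 ms for the numbers in the question) so that the axis still covers the same 7 days.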

Related

Imagenet ILSVRC2014 validation ground truth to synset label translation not accurate

I'm using a pre-trained image classifier to evaluate input data treatments. I downloaded the ImageNet ILSVRC2014 CLS-LOC validation dataset to use as a base. I need to know the actual classes of the images to evaluate my treatments (I need to detect correct classifications). The 2014 toolkit contains an ILSVRC2014_clsloc_validation_ground_truth.txt file that, according to the readme, is supposed to contain class labels (in the form of IDs) for the 50,000 images in the data set. There are 50,000 entries/lines in the file, so thus far all seems good, but I also want the corresponding semantic class labels/names.
I found these in a couple of places online and they seem to be consistent (1000 classes). But then I looked at the first image, which is a snake. The ground truth for the first picture is 490, and the 490th row in the semantic name list is "chain". That's weird but still kind of close. The second image is two people skiing, and the derived class is "polecat". I tried many more with similar results.
I must have misunderstood something. Isn't the ground truth supposed to be the "correct" answers for the validation set? Have I missed something in the translation between IDs and semantic labels?
The readme in the 2014 imagenet dev-kit states:
" There are a total of 50,000 validation images. They are named as
ILSVRC2012_val_00000001.JPEG
ILSVRC2012_val_00000002.JPEG
...
ILSVRC2012_val_00049999.JPEG
ILSVRC2012_val_00050000.JPEG
There are 50 validation images for each synset.
The classification ground truth of the validation images is in
data/ILSVRC2014_clsloc_validation_ground_truth.txt,
where each line contains one ILSVRC2014_ID for one image, in the
ascending alphabetical order of the image file names.
The localization ground truth for the validation images can be downloaded
in xml format. "
I'm doing this as part of my bachelor thesis and really want to get it right.
Thanks in advance
This problem is now solved. In the ILSVRC2017 development kit there is a map_clsloc.txt file with the correct mappings.
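
A minimal sketch of how the mapping file could be applied, assuming each line of map_clsloc.txt has the form "<WNID> <ILSVRC_ID> <name>" (check this against your copy of the file). The function names and the sample entries below are made up for illustration. The point is that the ID is looked up as a key rather than used as a row number in a separately ordered name list, which is presumably what produced the wrong labels in the question.

```javascript
// Assumed line format: "<WNID> <ILSVRC_ID> <name>" (verify against the file).
// parseClslocMap and labelGroundTruth are hypothetical helper names.
function parseClslocMap(text) {
  const idToLabel = new Map();
  for (const line of text.split(/\r?\n/)) {
    const parts = line.trim().split(/\s+/);
    if (parts.length < 3) continue; // skip blank or malformed lines
    const [wnid, id, ...name] = parts;
    idToLabel.set(Number(id), { wnid, name: name.join(" ") });
  }
  return idToLabel;
}

// One ground-truth ID per line, in ascending order of image file name.
function labelGroundTruth(groundTruthText, idToLabel) {
  return groundTruthText
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => idToLabel.get(Number(line.trim())));
}

// Made-up sample data, not real ImageNet entries.
const sampleMap = "n00000001 2 example_b\nn00000002 1 example_a\n";
const sampleTruth = "1\n2\n1\n";
const labels = labelGroundTruth(sampleTruth, parseClslocMap(sampleMap));
console.log(labels.map((l) => l.name)); // [ 'example_a', 'example_b', 'example_a' ]
```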

D3 Sankey Diagrams: How to handle dynamic data? I.E. Nodes/Links with no value (or 0 value)

This is a question for those working with D3.js Sankey diagrams.
What is the recommended approach for organizing/creating the JSON data for these diagrams?
Every example I've come across uses JSON/CSV data that has at least one value for every node/link.
However, if you supply dynamic data where you don't know whether everything will have a value, and a node/link value is zero, the node floats to the top-left and is disconnected from everything.
Example:
The items in this picture I'm talking about are: "Fugitive Emissions", "Industry", and "electricity and heat"
Cleaning up these nodes seems non-trivial.
It would seem logical to somehow exclude them from the JSON data to begin with, but this feels like an overly complicated process.
I.E. If you remove a node in the JSON, you need to remove all links associated with that node, but this would seem to require a recursive check to ensure everything is kept clean.
I'm still searching for possible ways to have the D3 Sankey remove the nodes without values, but the crux of the problem seems to be that the SVG object is populated with data for nodes based on the JSON data supplied, not what is passed to the Sankey JavaScript code. I've tried filtering the nodes I don't want here, but it doesn't affect what has already been attached to the SVG (Which is ultimately displayed on the diagram).
Any suggestions are welcome regarding how to remove/filter nodes without values. Either an addition to the Sankey code, or a suggestion for organizing/filtering the JSON data to be supplied to the Sankey.
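
One possible way to pre-filter the data, sketched under the assumption that the JSON follows the common Sankey example layout (a nodes array, plus links whose source and target are numeric indices into it). pruneSankeyGraph is a hypothetical helper, not part of the Sankey plugin. No recursive check is needed in this sketch: dropping the zero-value links first and then keeping only the nodes still referenced takes one pass each.

```javascript
// Hypothetical helper: drop zero-value links and any node left without links.
// Assumes link.source / link.target are numeric indices into graph.nodes.
function pruneSankeyGraph(graph) {
  // Keep only links that carry a positive value.
  const links = graph.links.filter((l) => l.value > 0);

  // Collect the indices of nodes that are still referenced by some link.
  const used = new Set();
  links.forEach((l) => {
    used.add(l.source);
    used.add(l.target);
  });

  // Rebuild the node list and remember old index -> new index.
  const remap = new Map();
  const nodes = [];
  graph.nodes.forEach((node, i) => {
    if (used.has(i)) {
      remap.set(i, nodes.length);
      nodes.push(node);
    }
  });

  // Re-point the surviving links at the new indices.
  return {
    nodes,
    links: links.map((l) =>
      Object.assign({}, l, {
        source: remap.get(l.source),
        target: remap.get(l.target),
      })
    ),
  };
}

const pruned = pruneSankeyGraph({
  nodes: [{ name: "A" }, { name: "B" }, { name: "C" }, { name: "D" }],
  links: [
    { source: 0, target: 1, value: 5 },
    { source: 1, target: 2, value: 0 },
    { source: 0, target: 3, value: 2 },
  ],
});
console.log(pruned.nodes.map((n) => n.name)); // [ 'A', 'B', 'D' ]
```

The pruned object would then be used both for the Sankey layout and for the SVG data join, so that nothing is bound to the removed nodes.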

How to simplify with topojson API?

So I have no problem simplifying using topojson from the command line using the -s flag, however, I can't figure out how to do it from the node module.
I see a topojson.simplify() method, but I can't figure out how it works as there is no documentation.
Does anyone have any insight?
By looking at the simplification tests for topojson, I was able to figure out how to use topojson.simplify(), but I can't fully claim to know what's going on. You can see the tests on the topojson GitHub.
Basically topojson.simplify takes a topology input and has 2 possible options for simplification, "retain-proportion" and "minimum-area", you can also pass the coordinate system, aka "cartesian" or "spherical", although it can be inferred under most circumstances.
examples:
output = topojson.simplify(topology,{"minimum-area": 2,"coordinate-system": "spherical"});
output = topojson.simplify(topology,{"retain-proportion": 2,"coordinate-system": "spherical"});
I am not sure exactly what the values passed to these options mean; however, higher values tend to produce more simplification. Note that "retain-proportion" often returns invalid topologies when passed LineStrings; that may be intended.
The quantization option of topojson.topology can also be used to create a smaller, simpler output, and may be the best solution for some similar use cases. It doesn't have any clearly documented server API examples anywhere either, so:
//very simplified, small output
topojson.topology({routes: routesCollection},{"quantization":100});
//very unfiltered, large output
topojson.topology({routes: routesCollection},{"quantization":1e8});
note: the default quantization is 10000 (1e4), so anything less than 10000 will create a smaller output and vice versa.

Cubism with genomic data (or non-timeseries data)

I'd like to hear your thoughts on what would it take to make cubism work with non timeseries data, concretely, genomic data.
This type of data has a locus (a chromosome and coordinates within that chromosome) instead of a timestamp:
chrm1 145678123 value
chrm12 45345 value
chrmX 4535 value
....
Which option do you think is best: hacking cubism's core to allow for this type of data (or any type of data, for that matter), or spawning a new project altogether?
UPDATE: I decided to implement a modified version of cubism for DNA. I call it DNAism and you can find it here. Take a look and let me know what you think.
-drd
Cubism is probably not the right kind of library for this task. You're going to have to modify the library in a pretty significant way. Instead of doing that I'd recommend you use the d3.horizon plugin so that you can gain a lot more control by creating custom scales.
Hope this answers your question.
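
To illustrate what either approach has to do at its core, here is a rough sketch that bins locus-based records into fixed-width windows, with bases per pixel playing the role that the time step plays in cubism. binByLocus and its record shape ({chrom, pos, value}) are assumptions made for this example. They are not taken from cubism, d3.horizon, or DNAism.

```javascript
// Hypothetical helper: average values into `size` bins of `step` bases each,
// starting at coordinate `start` on one chromosome. Empty bins become NaN.
function binByLocus(records, chromosome, start, step, size) {
  const sums = new Array(size).fill(0);
  const counts = new Array(size).fill(0);
  for (const r of records) {
    if (r.chrom !== chromosome) continue; // other chromosomes are ignored
    const i = Math.floor((r.pos - start) / step);
    if (i < 0 || i >= size) continue; // outside the displayed window
    sums[i] += r.value;
    counts[i] += 1;
  }
  return sums.map((s, i) => (counts[i] ? s / counts[i] : NaN));
}

// Made-up records in the chromosome / coordinate / value layout shown above.
const records = [
  { chrom: "chrm1", pos: 5, value: 2 },
  { chrom: "chrm1", pos: 7, value: 4 },
  { chrom: "chrm1", pos: 15, value: 10 },
  { chrom: "chrmX", pos: 5, value: 100 },
];
console.log(binByLocus(records, "chrm1", 0, 10, 3)); // [ 3, 10, NaN ]
```

The resulting array has one value per pixel, which is the shape a horizon chart expects regardless of whether the x axis is time or genomic position.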

Cursors for data selection in matplotlib

I am trying to get user input from a matplotlib XY plot. The plot contains multiple datasets, and I need the user to select which dataset to use and over what range, so that I can fit a model to the right dataset and range.
I therefore need two indicators that are "attached" to a specific dataset of the user's choosing, and I need to read both the dataset and the range from them.
This is along the lines of what commercial plotting packages (Igor Pro, Kaleidagraph, Sigmaplot...) provide as "cursors" or similarly named widgets for controlling their fitting interfaces, which is what I am trying to reproduce.
I have checked various examples using a range selector and other methods I could find on the web, but none seems to provide what I need.
Would anyone have any pointers to where to look or what to start with, please?
You might want to look at this example: http://matplotlib.sourceforge.net/examples/pylab_examples/ginput_manual_clabel.html
The interesting functions are ginput, waitforbuttonpress.
