Custom dendrogram in D3 - d3.js

Maybe this is not the place for this question, but perhaps someone here is an experienced user of D3.js.
I would like to create a dendrogram that initially shows nodes from different (precomputed) levels, with the nodes colored differently. Each node has one tooltip for its colored part and another for its grey part.
I would also like to place a heatmap alongside it.
Do you think combining those things is possible in D3?
Since this would be a lot of work, I would like to know whether it is reasonable to even start.
Part of the result I'm aiming for is here:

The short answer to your question is yes.
I'm looking into the same sort of problem/challenge and found a very nice example that almost exactly does what you describe: https://github.com/MaayanLab/clustergrammer
Since the solution involves 10k+ lines of code and this is not a simple 'use this to do this' case, I'm not providing code excerpts (for details, see their GitHub repository). In short, it uses D3 libraries plus JavaScript code for dynamic plotting, zooming and sorting of the heatmap and a collapsed dendrogram. It loads its data from a pre-computed JSON file that contains the cluster information and some metadata.
If I understand your question correctly, you would rather not rely on pre-computed input. This is also the case for the application that I am building. I'm looking into generalising the generation of the JSON file from an SQL query, which can then hook up to the clustergrammer.js code. I will update this thread if I find out more or have a different, working solution that does everything on the fly.
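As a rough illustration of the "generate the JSON from a query" step, here is a minimal sketch that turns flat rows into a nested tree. It assumes the query returns one row per node with hypothetical `id`, `parent` and `color` fields; this is not clustergrammer's actual input format. D3's own d3.stratify performs a similar flat-to-nested conversion.

```javascript
// Hypothetical flat rows, e.g. the result of an SQL query (one row per node).
const rows = [
  { id: "root", parent: null, color: "grey" },
  { id: "A", parent: "root", color: "red" },
  { id: "B", parent: "root", color: "blue" },
  { id: "A1", parent: "A", color: "red" },
  { id: "A2", parent: "A", color: "red" },
];

// Convert the flat rows into a nested { name, color, children } tree.
function buildTree(flatRows) {
  const byId = new Map();
  for (const r of flatRows) {
    byId.set(r.id, { name: r.id, color: r.color, children: [] });
  }
  let root = null;
  for (const r of flatRows) {
    const node = byId.get(r.id);
    if (r.parent === null) {
      root = node;
    } else {
      const parent = byId.get(r.parent);
      if (!parent) throw new Error("Unknown parent: " + r.parent);
      parent.children.push(node);
    }
  }
  if (!root) throw new Error("No root row found");
  return root;
}

const tree = buildTree(rows);
const json = JSON.stringify(tree); // what a page could load instead of a static file
```

The nested object can then be serialised once per request rather than being stored as a pre-computed file.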


Data Structure to Implement Text Editor?

Recently I was asked this question in an interview. The exact question was
What data structures would you use to implement a text editor? The size of the editor can be changed, and you also need to save the styling information for all the text (italic, bold, etc.).
At the time, I tried to convince him with several different approaches, such as a stack, a doubly linked list and so on.
Ever since, this question has been bugging me.
It looks like they'd like to know if you were aware of the flyweight pattern and how to use it correctly.
A text editor is a common example while describing that pattern.
Maybe your interviewer was a lover of the GOF book. :-)
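To make the flyweight idea concrete, here is a minimal sketch under my own assumptions (the `StyleFactory` name and the style fields are made up for illustration). Each character holds a reference to a shared, immutable style object instead of carrying its own copy.

```javascript
// Flyweight factory: hands out one shared, frozen style object per distinct style.
class StyleFactory {
  constructor() {
    this.cache = new Map();
  }
  get(font, size, bold, italic) {
    const key = `${font}|${size}|${bold}|${italic}`;
    if (!this.cache.has(key)) {
      this.cache.set(key, Object.freeze({ font, size, bold, italic }));
    }
    return this.cache.get(key);
  }
}

const styles = new StyleFactory();

// Every character references a shared style rather than owning one.
const chars = [..."Hello"].map(ch => ({
  ch,
  style: styles.get("serif", 12, false, false),
}));

// Making the first character bold swaps its reference; nothing is copied per character.
chars[0].style = styles.get("serif", 12, true, false);
```

However long the text gets, the number of style objects equals the number of distinct styles in use.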
In addition to the previous answers, I would like to add that before you can choose the data structures you first need to know your design; otherwise the range of options is too broad.
As an example, let's assume that you need editing functionality. Here the State and Memento design patterns would be a good fit. A very suitable structure would be the rope (also called a cord), since it is composed of smaller strings and is used for efficiently storing and manipulating a very long string. In our case, the text editing program may use a rope to represent the text being edited, so that operations such as insertion, deletion, and random access can be done efficiently.
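A minimal rope sketch, to show the shape of the structure rather than a production implementation: leaves hold short strings, inner nodes cache the total length of their subtree, and insertion and deletion are built from split and concatenation. All names here are my own, and the sketch does no rebalancing, which a real rope needs in order to keep operations logarithmic.

```javascript
class Leaf {
  constructor(text) {
    this.text = text;
    this.length = text.length;
  }
}

class Concat {
  constructor(left, right) {
    this.left = left;
    this.right = right;
    this.length = left.length + right.length; // cached subtree length
  }
}

// Random access: descend left or right using the cached lengths.
function charAt(node, i) {
  while (node instanceof Concat) {
    if (i < node.left.length) {
      node = node.left;
    } else {
      i -= node.left.length;
      node = node.right;
    }
  }
  return node.text[i];
}

// Split into [first i characters, the rest]; only the nodes on one path are rebuilt.
function split(node, i) {
  if (node instanceof Leaf) {
    return [new Leaf(node.text.slice(0, i)), new Leaf(node.text.slice(i))];
  }
  if (i <= node.left.length) {
    const [a, b] = split(node.left, i);
    return [a, new Concat(b, node.right)];
  }
  const [a, b] = split(node.right, i - node.left.length);
  return [new Concat(node.left, a), b];
}

function insert(node, i, text) {
  const [a, b] = split(node, i);
  return new Concat(new Concat(a, new Leaf(text)), b);
}

function remove(node, start, end) {
  const [a, rest] = split(node, start);
  const [, b] = split(rest, end - start);
  return new Concat(a, b);
}

// Flatten back to a plain string (for display or saving).
function flatten(node) {
  return node instanceof Leaf ? node.text : flatten(node.left) + flatten(node.right);
}

let rope = new Concat(new Leaf("Hello, "), new Leaf("world"));
rope = insert(rope, 7, "big ");
```

Because old nodes are never mutated, earlier versions of the rope stay valid, which pairs naturally with a Memento-style undo history.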
An open-ended question like this is designed more to see if you can think cogently about making a design that hangs together well, rather than having one, specific answer.
One specialized answer to the question is to use DOM/XML ("Document Object Model"). Markup "languages" are intended to solve this exact problem. You could store the data for the editor in a DOM. One of the advantages of using a DOM is that there are libraries like Xerces that have extensive support for building and managing DOMs, so a lot of your work is done for you. It is possible the interviewer intended this to be the ideal answer.
A more general answer is that any nested sequence structure can be used. The text can be seen as a sequence of strings. Each element of the sequence, like rows in a database, can have multiple attributes (font type, font size, italic, bold, strikethrough, etc). Nesting (hierarchy) is useful because the document might have structure such as chapters, sections, paragraphs. For example, if a paragraph has its own styling (indent), then it may need to have its own level. So you have something like this:
Document
  Chapter
    Paragraph
      Text
To implement this, you would use a tree and each node of the tree would have multiple attributes. You would require different kinds of nodes (Chapter nodes, Paragraph nodes, etc). So, for example, a document like a paper would have multiple Section nodes and a Notes node inside a Document node, but a book-like document might have Chapter nodes inside a document node. The advantage of this approach is that it is more specific and hand-tailored to the problem than using a DOM, which is a more flexible approach.
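A small sketch of such a tree, with hypothetical node types and style attributes chosen only for illustration. Each node carries its own style, and a depth-first walk merges the styles of the ancestors into every text run.

```javascript
// Generic structural node (Document, Chapter, Paragraph, ...).
function node(type, style, ...children) {
  return { type, style, children };
}

// Leaf node holding an actual run of text.
function text(content, style = {}) {
  return { type: "Text", style, text: content, children: [] };
}

const doc = node("Document", {},
  node("Chapter", {},
    node("Paragraph", { indent: 2 },
      text("Plain, "),
      text("bold", { bold: true }),
      text(" and "),
      text("italic", { italic: true }))));

// Depth-first walk: yields each text run with the style inherited from its ancestors.
function* runs(n, inherited = {}) {
  const style = { ...inherited, ...n.style };
  if (n.type === "Text") {
    yield { text: n.text, style };
  }
  for (const child of n.children) {
    yield* runs(child, style);
  }
}

const allRuns = [...runs(doc)];
const plainText = allRuns.map(r => r.text).join("");
```

A renderer would consume the runs in order, while structural edits (moving a paragraph, say) operate on the tree itself.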
You could also combine the two approaches, using a DOM as your base structure and the hierarchical structure described above as your DOM implementation.
(Note: in the future you should post questions like this to https://softwareengineering.stackexchange.com/)

Is there any way to make a GUI of a graph data structure in any language?

I want to implement a graph data structure and build a graphical view of it in any language or toolkit, such as Windows Forms or Java; either is fine. If you know how to do this, please tell me.
Tall order.
When I was learning data structures, I always found this page helpful for understanding them.
https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
This has a bunch of different types of graphs if you scroll down.
The JavaScript version of each visualization is still maintained. Maybe you can use this as a point of departure and try to reverse engineer whatever specific graph algorithm you are trying to construct.

d3.js treemap could paint internal nodes too

This question is in regards to Mike Bostock's very exciting d3.js library in general, and more specifically the treemap plot. Note: treemap seems to have two versions, the "talk version" and the "example version". My question relates to the "talk version," which has the zoom feature.
My question is more of a wish: How difficult would it be to extend treemap to accommodate and show multiple internal nodes, with multiple levels of zoom? For example, click to go down one level and option-click to go up one level. Perhaps to keep things tidy, only nodes one level deeper are painted -- as you zoom in, deeper levels are resolved.
This is my pie-in-the-sky wish -- I am not familiar with JavaScript and can't take this on right now -- but it seems doable on a visual/UI level. I did notice that mbostock commented here that treemap only shows leaf nodes, but I don't know if this is a design constraint or just a SMOP.
Anyone with any interest in doing this? Possibly for a commission? Thanks.
It appears the author posted a nearly exact answer to my question on his website the day after I posted this question. Whether or not this question prompted the adaptation, I am excited to try it out!
He is calling it "Zoomable Treemap". He also points out a couple other examples on the net.
Thanks, mbostock!

How to handle large numbers of pushpins in Bing Maps

I am using Bing Maps with Ajax and I have about 80,000 locations to drop pushpins into. The purpose of the feature is to allow a user to search for restaurants in Louisiana and click the pushpin to see the health inspection information.
Obviously it doesn't do much good to have 80,000 pins on the map at one time, but I am struggling to find the best solution to this problem. Another problem is that the distance between these locations is very small (All 80,000 are in Louisiana). I know I could use clustering to keep from cluttering the map, but it seems like that would still cause performance problems.
What I am currently trying to do is to simply not show any pins until a certain zoom level and then only show the pins within the current view. The way I am currently attempting to do that is by using the viewchangeend event to find the zoom level and the boundaries of the map and then querying the database (through a web service) for any points in that range.
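The filtering step described above can be sketched as a pure function. The shape of the `bounds` object, the zoom threshold and the sample data are assumptions for illustration; they are not taken from the Bing Maps API, and the same predicate could equally be a WHERE clause behind the web service.

```javascript
// Assumed threshold: below this zoom level no pins are shown at all.
const MIN_ZOOM = 12;

// Return only the points inside the current view, or nothing when zoomed out too far.
// `bounds` is a hypothetical { north, south, east, west } object in degrees.
function pinsInView(points, bounds, zoom, minZoom = MIN_ZOOM) {
  if (zoom < minZoom) return [];
  return points.filter(p =>
    p.lat <= bounds.north && p.lat >= bounds.south &&
    p.lon >= bounds.west && p.lon <= bounds.east);
}

// Made-up sample data.
const restaurants = [
  { name: "New Orleans", lat: 29.95, lon: -90.07 },
  { name: "Baton Rouge", lat: 30.45, lon: -91.19 },
  { name: "Shreveport", lat: 32.52, lon: -93.75 },
];
const view = { north: 30.1, south: 29.8, west: -90.3, east: -89.9 };

const visible = pinsInView(restaurants, view, 13);
const zoomedOut = pinsInView(restaurants, view, 8);
```

In the map itself, a function like this would be called from the viewchangeend handler with the current zoom level and map bounds.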
It feels like I am going about this the wrong way. Is there a better way to manage this large amount of data? Would it be better to load all points initially and then have the data on hand, without having to hit my web service every time the map moves? If so, how would I go about it?
I haven't been able to find answers to my questions, which usually means that I am asking the wrong questions. If anyone could help me figure out the right question it would be greatly appreciated.
Well, I've implemented a slightly different approach to this. It was just a fun exercise, but I'm displaying all my data (about 140,000 points) in Bing Maps using the HTML5 canvas.
I load all the data to the client up front. Then, I've optimized the drawing process so much that I've attached it to the "Viewchange" event (which fires continuously during the view change process).
I've blogged about this. You can check it here.
My example does not have interaction, but it could easily be added (it should make a nice topic for a blog post). You would have to handle the events manually and search for the corresponding points yourself or, if the number of points to draw or the zoom level is below some threshold, show regular pushpins.
Anyway, another option, if you're not restricted to Bing Maps, is to use the likes of Leaflet. It allows you to create a Canvas Layer, which is a tile-based layer rendered on the client side using HTML5 canvas. It opens up a new range of possibilities. Check, for example, this map in GisCloud.
Yet another option, although more suitable for static data, is a technique called UTFGrid. The lads who developed it can certainly explain it better than me, but it scales to as many points as you want with phenomenal performance. It consists of a tile layer with your info and an accompanying JSON file with something like an "ascii-art" description of the features on the tiles. Then, using a library called wax, it provides complete mouse-over and mouse-click events on it, without any performance impact whatsoever.
I've also blogged about it.
I think clustering would be your best bet if you can get away with using it. You say that you tried clustering but it still caused performance problems? I tested it with 80,000 data points in the V7 Interactive SDK and it seems to perform fine. Test it yourself by going to the link and changing this line in the Load module - clustering tab:
TestDataGenerator.GenerateData(100,dataCallback);
to
TestDataGenerator.GenerateData(80000,dataCallback);
then hit the Run button. The performance seems acceptable to me with that many data points.
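For a feel of why clustering stays cheap even with many points, here is a minimal grid-based clustering sketch. It is not the algorithm used by the SDK's clustering module; the cell-size scheme (`cellsAtZoom0`) is an assumption of mine. Points are binned into cells that shrink as the zoom level grows, and one pin is emitted per non-empty cell.

```javascript
// Bin points into a lat/lon grid whose cells halve in size with each zoom level,
// then return one cluster (centroid plus count) per non-empty cell.
function clusterByGrid(points, zoom, cellsAtZoom0 = 4) {
  const cellSize = 360 / (cellsAtZoom0 * 2 ** zoom); // degrees per cell (assumed scheme)
  const cells = new Map();
  for (const p of points) {
    const key = Math.floor(p.lon / cellSize) + ":" + Math.floor(p.lat / cellSize);
    let cell = cells.get(key);
    if (!cell) {
      cell = { count: 0, latSum: 0, lonSum: 0 };
      cells.set(key, cell);
    }
    cell.count += 1;
    cell.latSum += p.lat;
    cell.lonSum += p.lon;
  }
  return [...cells.values()].map(c => ({
    lat: c.latSum / c.count,
    lon: c.lonSum / c.count,
    count: c.count,
  }));
}

// Made-up sample: two nearby points and one far away.
const samplePoints = [
  { lat: 29.95, lon: -90.07 },
  { lat: 29.96, lon: -90.08 },
  { lat: 32.52, lon: -93.75 },
];

const coarse = clusterByGrid(samplePoints, 0);
const fine = clusterByGrid(samplePoints, 10);
```

The work is a single pass over the points, so the cost grows linearly with the number of locations while the number of pins drawn stays bounded by the number of grid cells in view.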

Is there an algorithm for positioning nodes on a link chart?

I'm a member of a small but fairly sociable online forum, and just for fun we've been plotting a chart of who's met who in real life. Here's what it looked like fairly recently.
(The colour is the "distance" from the currently-selected user, e.g., yellow is someone who's met someone who's met them. And no, I'm not Zak.) Apologies for the faded lines, they don't seem to have weathered the SO upload process very well.
It's generated as SVG, with a big block of JSON defining who's met who. The position (x,y) of each member on the chart is hard-coded into that JSON. Until now, it's been fairly easy to cope when someone meets someone else - at worst, maybe two or three people need to be shuffled around - but it does involve editing the co-ordinates manually. And now that the European and North American contingents are meeting up, and a few on the periphery are showing up at meets, all hell is breaking loose...
We can put some effort into making all the nodes draggable, which would make the job of re-arranging a bit less tiresome. But it seems more sensible to let the computer take care of positioning them, especially as the problem will only get harder with more members.
So, does anyone know of an algorithm for positioning these nodes on the chart, based on which other nodes they're linked with?
Ideally, it would
minimise or avoid long links
avoid having lines run underneath unrelated nodes
take account of the fact that well-connected nodes are bigger
do its best to show the wider "all these guys met each other" relationships (the big circle at the bottom is largely the result of one meet, for example, though the chart has no idea of when any two people met)
but if it gets us close enough to tweak it, that's progress.
And, what's the real name for these charts? I believe they're called "link charts", but I'm not getting good results from Google using that name or anything else I can think of.
We'll likely be implementing this in PHP or Javascript, but right now it's how to begin approaching the problem that's the bigger question.
Edit: Some great answers coming already. I would be very interested in the actual algorithm(s) used, though, as well as tools that do the job.
What you are looking for are, for example, force-based (force-directed) layout algorithms. There are quite a few libraries, and some have been named already, like prefuse and yWorks. Here are a few more: jung, gvf, jGraph.
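Since the question asks about the actual algorithm, here is a minimal sketch of one force-based layout, in the spirit of the Fruchterman-Reingold spring embedder: every pair of nodes repels, linked nodes attract, and a cooling "temperature" limits how far nodes move per step. The parameter values and the circular starting positions are my own choices, and the sketch ignores node sizes and line-under-node avoidance.

```javascript
function forceLayout(nodeIds, links, { width = 400, height = 400, iterations = 200 } = {}) {
  const n = nodeIds.length;
  const k = Math.sqrt((width * height) / n); // ideal distance between nodes
  // Deterministic start: spread the nodes on a circle.
  const pos = new Map(nodeIds.map((id, i) => [id, {
    x: width / 2 + (width / 4) * Math.cos((2 * Math.PI * i) / n),
    y: height / 2 + (height / 4) * Math.sin((2 * Math.PI * i) / n),
  }]));
  let temperature = width / 10; // maximum movement per step

  for (let iter = 0; iter < iterations; iter++) {
    const disp = new Map(nodeIds.map(id => [id, { x: 0, y: 0 }]));

    // Repulsion between every pair of nodes: force k^2 / d.
    for (let i = 0; i < n; i++) {
      for (let j = i + 1; j < n; j++) {
        const a = pos.get(nodeIds[i]), b = pos.get(nodeIds[j]);
        const dx = a.x - b.x, dy = a.y - b.y;
        const d = Math.max(Math.hypot(dx, dy), 0.01);
        const f = (k * k) / d;
        const da = disp.get(nodeIds[i]), db = disp.get(nodeIds[j]);
        da.x += (dx / d) * f; da.y += (dy / d) * f;
        db.x -= (dx / d) * f; db.y -= (dy / d) * f;
      }
    }

    // Attraction along each link: force d^2 / k.
    for (const [s, t] of links) {
      const a = pos.get(s), b = pos.get(t);
      const dx = a.x - b.x, dy = a.y - b.y;
      const d = Math.max(Math.hypot(dx, dy), 0.01);
      const f = (d * d) / k;
      const ds = disp.get(s), dt = disp.get(t);
      ds.x -= (dx / d) * f; ds.y -= (dy / d) * f;
      dt.x += (dx / d) * f; dt.y += (dy / d) * f;
    }

    // Move each node, capped by the temperature and clamped to the canvas.
    for (const id of nodeIds) {
      const p = pos.get(id), dsp = disp.get(id);
      const len = Math.max(Math.hypot(dsp.x, dsp.y), 0.01);
      const step = Math.min(len, temperature);
      p.x = Math.min(width, Math.max(0, p.x + (dsp.x / len) * step));
      p.y = Math.min(height, Math.max(0, p.y + (dsp.y / len) * step));
    }
    temperature *= 0.97; // cool down so the layout settles
  }
  return pos;
}

// Example: who has met whom, as pairs of member ids.
const layout = forceLayout(["a", "b", "c", "d"], [["a", "b"], ["b", "c"], ["c", "d"]]);
```

The resulting coordinates could replace the hard-coded (x, y) values in the JSON, with manual dragging left for final tweaks.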
The real name for it is "graph". To generate a graph with a good layout, the best option is to use software that does the job for you.
I advise you to use Gephi.
It can do everything you are asking for.
Have a look at the yWorks tools.
You can google for graph visualization. There are more libraries for this, including GraphViz, but probably not all your requirements will be met.
If you can deal with Java, take a look at prefuse.
Have a look at NodeXL.
Also, this book may be relevant.
