I'm looking to display a graph (network diagram, not a chart) and show its changes over time. Is there a standard or best way to do this, or any kind of 'network diff' tool?
I'm looking for an overview of the general layout decisions involved, i.e. a list of options and trade-offs to be made, and best-practice guidelines where these exist.
Wow. Not an easy question! I'm curious if anyone can come up with some authoritative resources for you.
I haven't found any standard or best practice documented anywhere from a design standpoint, nor do I know of any tool specifically designed for determining and displaying the changes, but I have some ideas.
First, a few technical notes. There's GraphML, which you can use (and extend) to represent your graph in a standard format, and there are some parsers available, and it works with Prefuse and probably other display libraries. It's just XML, though - nothing too special. Creating the "diff" by comparing two GraphML files should be pretty simple.
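For example, a minimal "network diff" between two GraphML snapshots could be built with networkx (the file names here are placeholders):

    # Sketch: diff two GraphML snapshots by comparing node and edge sets.
    import networkx as nx

    before = nx.read_graphml("snapshot_2011.graphml")
    after = nx.read_graphml("snapshot_2012.graphml")

    added_nodes   = set(after.nodes) - set(before.nodes)
    removed_nodes = set(before.nodes) - set(after.nodes)

    # Normalize undirected edges so (u, v) and (v, u) compare equal.
    before_edges = {frozenset(e) for e in before.edges}
    after_edges  = {frozenset(e) for e in after.edges}
    added_edges   = after_edges - before_edges
    removed_edges = before_edges - after_edges

    print("nodes: +%d / -%d" % (len(added_nodes), len(removed_nodes)))
    print("edges: +%d / -%d" % (len(added_edges), len(removed_edges)))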
The really interesting part is how to communicate the differences to the user.
In all cases, you should have a visual indicator for nodes and edges that are added or removed. You may use color, showing existing nodes as something neutral, say gray, new nodes as green, and removed nodes as red. There are lots of options.
You might find this slideshow interesting.
It's probably obvious, but, over time, the nodes should not move more than necessary to adapt to the new state of the graph - the layout should evolve, not start from scratch for every state. This is crucial for comparing the states.
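With a force-directed layout you can get that stability by seeding each state's layout with the previous state's positions. A minimal sketch using networkx (the snapshot variables are placeholders):

    import networkx as nx

    # Layout for the first snapshot.
    pos = nx.spring_layout(g_2011, seed=42)

    # Layout for the next snapshot, seeded with the previous positions so that
    # existing nodes stay roughly where they were; only new nodes move much.
    pos = nx.spring_layout(g_2012, pos=pos, iterations=30)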
Side-by-side before/after comparison. Present before and after snapshots of the same graph side-by-side. If your graph is very large and complicated, a side-by-side layout may be impractical. You could try overlaying one graph over the other, though that is likely to be disorienting.
Side-by-side series comparison, a.k.a. small multiples. Same as above, but showing as many points in time as is useful. Even more restrictive than before/after in terms of the space required, and harder for the viewer to track a given node across panels.
Animate a single graph. I think the most intuitive method is to smoothly animate the graph changes, though a choppy slideshow could work if the changes between slides are not too drastic.
Showing details. If useful, you can spell out the change event details in a few different ways.
Show labels on the graph nodes (could be interactive if there are too many to show at once)
Show a list in a sidebar / legend. Nice if reading the progression of changes is useful, but harder to connect to the visual.
Show a timeline instead of a list. This shows the 'real' progression of events better than a simple list, which gives the impression that all the events are evenly spaced over time.
What you actually choose to do would depend largely on the nature of your dataset and your goals. A simple graph of a few dozen nodes and a few changes is a much different challenge than a huge network, like say every constellation in the night sky!
Here is an interesting study: http://publik.tuwien.ac.at/files/PubDat_198995.pdf
This paper presents a prototype, and user tests will be published soon in:
P. Federico, W. Aigner, S. Miksch, F. Windhager, M. Smuc: "Vertigo zoom: combining relational and temporal perspectives on dynamic networks", accepted as a talk for the 11th International Working Conference on Advanced Visual Interfaces (AVI 2012), Capri Island, 2012-05-21 to 2012-05-25; in: Proceedings of the 11th International Working Conference on Advanced Visual Interfaces (AVI 2012), ACM, 2012, ISBN 978-1-4503-1287-5.
http://ieg.ifs.tuwien.ac.at/~federico/pub.php
Your question is kind of general; I'm not clear exactly what kinds of analysis you are aiming for. There are several network analysis packages that have some capacity for dynamics. Gephi is one. The networkDynamic and ndtv R packages provide tools for representing and visualizing network dynamics as animations and static layouts (disclaimer: I'm a maintainer).
Given an image of the region containing the lips and other "noise" (teeth, skin), how can we isolate and recolor only the lips (simulating a "lipstick" effect)?
Attached is a photo describing the lips/mouth states.
What we have tried so far is a three-part process:
Color-matching the lips using a stable point on the lips (provided by an internal API).
Using this color as the base color for lip isolation.
Recoloring the lips (the lipstick behavior).
We tried a few algorithms, like hue difference, HSV difference, and ΔE after converting to the CIE Lab color space. Unfortunately, nothing has panned out, or it has produced artifacts, due to the skin's relative similarity in color to the lips and the discoloration from shadows cast by the nose and mouth.
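For concreteness, the ΔE-based isolation looks roughly like this sketch (scikit-image; the seed coordinates, threshold, and target color are placeholders, not our real values):

    # Rough sketch of the Delta E (CIE76) approach: distance from a seed lip
    # color in Lab space, threshold into a mask, then shift chroma to recolor.
    import numpy as np
    from skimage import color, io

    img = io.imread("face.png")[:, :, :3] / 255.0        # RGB in [0, 1]
    lab = color.rgb2lab(img)

    seed_y, seed_x = 420, 310                             # stable lip point (placeholder)
    seed_lab = lab[seed_y, seed_x]

    delta_e = np.linalg.norm(lab - seed_lab, axis=-1)     # CIE76 = Euclidean distance in Lab
    lip_mask = delta_e < 12.0                             # threshold is a guess, tuned per image

    target_lab = color.rgb2lab(np.array([[[0.7, 0.1, 0.3]]]))[0, 0]
    recolored = lab.copy()
    recolored[lip_mask, 1:] = 0.6 * lab[lip_mask, 1:] + 0.4 * target_lab[1:]  # blend a*, b* only
    result = color.lab2rgb(recolored)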
What are we missing? Is there a better way to approach it?
We are looking for a solution/direction based on classic Computer Vision color algorithms, not a solution from the Machine Learning/Deep Learning domain. Thanks!
You probably won't like this answer, but your question is ill-posed (there is no measurably best solution, only people's opinions).
In a case like this, the best answer you can hope for is usually:
Ask an expert for a large set of examples that would be acceptable in practice.
Your problem can easily be solved by an appropriate artist (whom you trust to produce usable results) with access to the right tools (for example, Photoshop), but a single artist (or even a group of them) can't possibly scale to millions (or whatever large number you care about) of examples.
To address the shortcoming of the artist-based solution, you can use the following strategy:
Collect a sufficiently large set of before and after images created by artists, who you deem trustworthy.
Apply your favorite machine learning algorithm to learn a mapping from the before images to the after images. There are many possible choices, and it almost doesn't matter which you choose, as long as you know how to use it well.
Note that the above two steps are usually not one-and-done. In practice, you will come across pathological or badly behaved examples that your ML solution handles poorly once the product is in use. The key is to collect these examples, pass them through the artist, and retrain or update your ML model. Repeat this enough times and you will produce a state-of-the-art solution to your problem.
Whether you have the funding, time, motivation and resources to accomplish this is another matter.
You should try semantic segmentation techniques; they generalize well and would give you very good results.
Suppose I have a long conversation thread (A replied to B, which replied to C, which replied to D, etc.). I would like to display the entire conversation as a tree but what if the tree does not fit the window?
I can display the whole tree anyway, but the user will have to scroll the window left/right and up/down. Is there a better solution?
Do you know examples of UI (web/desktop), which display large trees (not only conversations) properly?
There is the very cool Remail project, which has arc diagrams for email threads among other interesting views. The arc diagrams were incorporated into the Thunderbird plugin, ThreadVis.
SpaceTree is another interesting research project for showing large trees with limited space:
SpaceTree is a novel tree browser that builds on the conventional layout node link diagrams along a single preferred direction. It adds dynamic rescaling of branches of the tree to best fit the available screen space, optimized camera movement, and the use of preview icons summarizing the topology of the branches that cannot be expanded. In addition, it includes integrated search and filter functions. This paper reflects on the evolution of the design and highlights the principles that emerged from it. A controlled experiment showed benefits for navigation tasks to already previously visited nodes and estimation of overall tree topology.
One of the things I've been thinking about a lot, off and on, is how we can use metrics of some kind to measure change: are we going backwards or not? This is in the context of a large, legacy code base which we are improving. Most of the code is C++ with a C heritage. Some new functions and the GUI are written in C#.
To start with, we could at least be checking whether a simple complexity measure is changing over time in the code. The difficulty is in having a representation: we could maybe do a 3D surface where a 2D map represents the code and a heat map of color represents complexity, with the 3D surface bulging in and out to show change.
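For the "simple complexity over time" baseline, something like the lizard analyzer (which parses C, C++ and C#) could produce one number per snapshot to feed into whatever visualization we pick; the paths below are placeholders:

    # Rough sketch: average cyclomatic complexity per snapshot using lizard.
    import glob
    import lizard

    def average_complexity(source_root):
        complexities = []
        for path in glob.glob(source_root + "/**/*.cpp", recursive=True):
            info = lizard.analyze_file(path)
            complexities += [f.cyclomatic_complexity for f in info.function_list]
        return sum(complexities) / max(len(complexities), 1)

    # Run against a checkout of each release/tag and plot the resulting series.
    for snapshot in ["checkouts/v1.0", "checkouts/v1.1", "checkouts/v2.0"]:
        print(snapshot, average_complexity(snapshot))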
Once you can generate some numeric metrics, there are plenty of mathematical and statistical tools around to take care of analyzing them.
Over time, I'd like to have more sophisticated numbers in there but the same visualisation techniques used to represent change.
I like the idea in Crap4j of focusing on the ratio between complexity and number of unit tests covering that code.
I'd also like to include Uncle Bob's SOLID metrics and some of the Chidamber and Kemerer OO metrics. The hard part is finding tools to generate these for C++. The only option seems to be Krakatau Essential Metrics (I have no objection to paying for tools). My desire to use the CK metrics comes partly from the books Object-Oriented Metrics: Measures of Complexity by Henderson-Sellers and the earlier Object-Oriented Software Metrics.
If we start using a number of these metrics we could end up with ten or so numbers that are varying across time. I'm fairly ignorant of statistics but it seems it could be interesting to track a bunch of such metrics and then pay attention to which ones tend to vary.
Note that a related question is about measuring code quality across a large code base. I'm more interested in measuring the change.
I'd consider using a Kiviat diagram to represent multiple software-metric dimensions evolving over time. These diagrams represent multiple data points as a polygon around a center point. Visual inspection will show where a particular metric is going up or down, and one ought to be able to compute an overall ratio of area, weighted by metric value, using some heuristic area computation.
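To get a feel for it, here is a small matplotlib sketch overlaying two snapshots of the same metrics on a radar (Kiviat) chart; the metric names and values are invented for illustration:

    # Overlay two snapshots of normalized metrics on a radar chart.
    import numpy as np
    import matplotlib.pyplot as plt

    metrics = ["Cyclomatic", "Coupling", "LCOM", "Inheritance depth", "Test coverage"]
    snapshot_a = [0.6, 0.4, 0.7, 0.3, 0.5]        # normalized to [0, 1]
    snapshot_b = [0.5, 0.5, 0.6, 0.3, 0.7]

    angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
    angles += angles[:1]                          # close the polygon

    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    for label, values in [("release N", snapshot_a), ("release N+1", snapshot_b)]:
        vals = values + values[:1]
        ax.plot(angles, vals, label=label)
        ax.fill(angles, vals, alpha=0.15)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(metrics)
    ax.legend()
    plt.show()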
You can also have a look at the NDepend documentation about code metrics. Disclaimer: I am one of the developers of the tool NDepend.
With the Code Rule and Query over LINQ (CQLinq) facility, it is possible to query for code-metric evolution/trending across two different snapshots in time of the code base. For example, there is a default rule proposed: Avoid making complex methods even more complex.
Several metric-trending rules are proposed, like:
Avoid decreasing code coverage by tests of types
Types that used to be 100% covered but not anymore
and also, since you mentioned Crap4j, the C.R.A.P. metric can be written with CQLinq, and the query could easily be tweaked to see the trend in the C.R.A.P. metric.
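For reference, Crap4j computes the score from cyclomatic complexity and test coverage; expressed as a small function (coverage as a fraction in [0, 1]):

    # C.R.A.P. as defined by Crap4j: comp = cyclomatic complexity of a method,
    # cov = fraction of that method covered by tests.
    def crap(comp: int, cov: float) -> float:
        return comp ** 2 * (1 - cov) ** 3 + comp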
Concerning the visualization of code metrics, NDepend lets you visualize code-metric values through an interactive treemap.
There is a fresh approach to this topic. For example:
https://github.com/databricks/koalas/pull/840#issuecomment-536949320
See https://softagram.com/docs/visualizing-code-changes/ for more info, or do an image search in a search engine using the two keywords: softagram koalas.
Disclaimer: I work for Softagram.
I'm a member of a small but fairly sociable online forum, and just for fun we've been plotting a chart of who's met who in real life. Here's what it looked like fairly recently.
(The colour is the "distance" from the currently selected user, e.g., yellow is someone who's met someone who's met them. And no, I'm not Zak.) Apologies for the faded lines; they don't seem to have weathered the SO upload process very well.
It's generated as SVG, with a big block of JSON defining who's met who. The position (x,y) of each member on the chart is hard-coded into that JSON. Until now, it's been fairly easy to cope when someone meets someone else - at worst, maybe two or three people need to be shuffled around - but it does involve editing the co-ordinates manually. And now that the European and North American contingents are meeting up, and a few on the periphery are showing up at meets, all hell is breaking loose...
We can put some effort into making all the nodes draggable, which would make the job of re-arranging a bit less tiresome. But it seems more sensible to let the computer take care of positioning them, especially as the problem will only get harder with more members.
So, does anyone know of an algorithm for positioning these nodes on the chart, based on which other nodes they're linked with?
Ideally, it would
minimise or avoid long links
avoid having lines run underneath unrelated nodes
take account of the fact that well-connected nodes are bigger
do its best to show the wider "all these guys met each other" relationships (the big circle at the bottom is largely the result of one meet, for example, though the chart has no idea of when any two people met)
but if it gets us close enough to tweak it, that's progress.
And, what's the real name for these charts? I believe they're called "link charts", but I'm not getting good results from Google using that name or anything else I can think of.
We'll likely be implementing this in PHP or Javascript, but right now it's how to begin approaching the problem that's the bigger question.
Edit: Some great answers coming already. I would be very interested in the actual algorithm(s) used, though, as well as tools that do the job.
What you are looking for are, for example, force-directed (force-based) algorithms. There are quite a few libraries, and some have been named already, like Prefuse and yWorks. Here are a few more: JUNG, GVF, JGraph.
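As a concrete starting point, here is a minimal sketch using networkx's spring (Fruchterman-Reingold) layout; the member names and the degree-based node sizing are just stand-ins for your data:

    # Force-directed layout with node size scaled by degree.
    import networkx as nx
    import matplotlib.pyplot as plt

    G = nx.Graph()
    G.add_edges_from([("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Alice"),
                      ("Dave", "Alice"), ("Eve", "Dave")])

    # k controls the ideal edge length; larger values spread nodes further apart.
    pos = nx.spring_layout(G, k=0.8, iterations=200, seed=42)

    # Well-connected members get bigger nodes.
    sizes = [300 + 200 * G.degree(n) for n in G]
    nx.draw(G, pos, with_labels=True, node_size=sizes)
    plt.show()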
The real name for it is a "graph". To generate the graph and get a good layout algorithm, the best approach is to use software that will do the job for you.
I advise you to use Gephi.
This software can do all the things you want.
Have a look at the yWorks tools.
You can google for "graph visualization". There are more libraries for this, including GraphViz, but probably not all of your requirements will be met.
If you can deal w/ Java, take a look at prefuse.
Have a look at NodeXL
Also, this book may be relevant.
I'd like to collect a list of algorithms and other resources to generate realistic and interesting visuals of planets. The visual should look like something which you'd expect to find on the NASA homepage. Key attributes would be:
a nice colorful atmosphere for gas giants
rings (optional)
impact craters for solid rocks without atmosphere
inhabitable planets could have features like oceans, mountains, rivers, forests
inhabitable planets could even have a realistic distribution of civilization on the surface
The final goal should be to give Science Fiction (SciFi) writers a tool to generate a world which helps them to spark ideas, create locations for scenes, or serve as a basis to render nice images for their books.
Note: This is a wiki, so no single "correct" answer.
Fractal terrain generation works wonders for creating realistic landscapes. I imagine you could scale the process up to generate landmasses on a planetary scale. This site has a detailed description of the process used for landscapes.
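As a rough illustration of the idea (not the exact algorithm from that site), a fractal-style heightmap can be built by summing octaves of smoothed random noise; the size, octave count, and sea level here are arbitrary choices:

    # Sum octaves of smoothed noise: each octave has smaller amplitude and
    # varies on a finer spatial scale, giving a fractal-looking heightmap.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def fractal_heightmap(size=256, octaves=6, persistence=0.5, seed=0):
        rng = np.random.default_rng(seed)
        heights = np.zeros((size, size))
        amplitude, sigma = 1.0, size / 4.0
        for _ in range(octaves):
            layer = gaussian_filter(rng.random((size, size)), sigma)
            layer = (layer - layer.min()) / (layer.max() - layer.min() + 1e-9)
            heights += amplitude * layer
            amplitude *= persistence   # finer octaves contribute less height...
            sigma /= 2                 # ...and vary on smaller spatial scales
        return heights / heights.max()

    terrain = fractal_heightmap()
    land_mask = terrain > 0.55         # everything below this level becomes ocean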
If you want high-level descriptions of a very mature procedural planet renderer, Infinity is perhaps the most venerable. The development blog covers many of the concepts used to create some very nice procedural planets and some other very nice space phenomena.
Check out conworlding links. There is actually commercial software out there (ProFantasy comes to mind), but if you wanted to do something from scratch, I have a link you may be interested in:
Magical World Builder
Finally, Guy Lecky-Thompson has written some interesting books on using procedural content in game design. I have both of his books and they are very inspiring. Many algorithms are listed, including a few RNG implementations, name generators (HINT: pick a list of name parts, then how many parts each name should have, then randomise), two whole chapters on terrain and landscape generation, a dungeon chapter...
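The name-parts hint boils down to something like this (the part lists are just made-up examples):

    # Pick a prefix, zero to two middle parts, and a suffix at random.
    import random

    prefixes = ["Kor", "Vel", "Thra", "Zan"]
    middles = ["do", "ri", "ka", "lu"]
    suffixes = ["nia", "thos", "mar", "dor"]

    def make_name():
        parts = [random.choice(prefixes)]
        parts += random.choices(middles, k=random.randint(0, 2))
        parts.append(random.choice(suffixes))
        return "".join(parts)

    print([make_name() for _ in range(5)])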
Oooh! Speaking of dungeons, I don't know if you have heard of Roguelikes, but I have recently been looking into them. I imagine that many of the same general principles they use for dungeons can be applied - and there are wilderness algorithms they share, besides. Try:
Temple of The Roguelike - possibly the largest Roguelike dev forum
Wilderness Generation using Voronoi Diagrams - this blog is run by a developer of Unangband, a very popular Angband variant. Many people in the Roguelike dev community share sources.
Markov Chain - this article is about how to put together randomised names using Markov Chains (see the sketch after this list). The wiki where this is hosted has quite a few algorithms of interest to anyone generating procedural content of any sort.
Roguebasin - many useful algorithms and code examples here.
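And here is roughly what the Markov Chain name generation mentioned above amounts to: learn letter-to-letter transitions from sample names, then walk the chain (the sample names are placeholders):

    # Order-1 Markov chain over characters, trained on a few sample names.
    import random
    from collections import defaultdict

    samples = ["Arrakis", "Caladan", "Vulcan", "Dagobah", "Coruscant", "Endor"]

    transitions = defaultdict(list)
    for name in samples:
        padded = "^" + name.lower() + "$"          # ^ marks start, $ marks end
        for a, b in zip(padded, padded[1:]):
            transitions[a].append(b)

    def generate(max_len=10):
        out, current = "", "^"
        while len(out) < max_len:
            current = random.choice(transitions[current])
            if current == "$":
                break
            out += current
        return out.capitalize()

    print([generate() for _ in range(5)])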
Have fun!
I'm no astronomer, but you might consider some sort of decision tree for a preliminary classification of the planet:
1. Main Composition (methane/rock/etc.)
2. Mass
3. Additional atmosphere (how much, what of, etc.)
4. Temperature (Alternatively, specify distance from star, model the star and write an algorithm based on the above)
5. Age
6. Asteroid/Meteor activity
Things like craters would be indirectly determined by 1, 3, and 6. Radius could be calculated from 1 and 2. And higher elements on the list might put boundaries on lower elements.
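To make that concrete, a rough sketch of deriving secondary features from those primary attributes might look like the following; all the rules and thresholds are invented examples, not real planetary science:

    # Derive cratering and radius from the primary attributes in the list above.
    from dataclasses import dataclass

    @dataclass
    class Planet:
        composition: str    # e.g. "rock", "gas", "ice"
        mass: float         # Earth masses
        atmosphere: float   # surface pressure relative to Earth
        temperature: float  # mean surface temperature, Kelvin
        age: float          # billions of years
        impact_rate: float  # relative asteroid/meteor activity

    def cratering(p: Planet) -> str:
        if p.composition != "rock":
            return "none"                   # gas/ice bodies keep no craters
        if p.atmosphere > 0.5:
            return "light"                  # a thick atmosphere burns up impactors
        return "heavy" if p.impact_rate > 1.0 else "moderate"

    def radius(p: Planet) -> float:
        # Very crude mass-radius scaling; the exponents are placeholders.
        return p.mass ** (0.27 if p.composition == "rock" else 0.5)

    mars_like = Planet("rock", 0.1, 0.01, 210, 4.5, 1.5)
    print(cratering(mars_like), radius(mars_like))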
You still have many algorithms to research, but having an ordering like this might help structure your calculations and the variables you use.