Reasonable optimized chart scaling - algorithm

I need to make a chart with an optimized y axis maximum value.
The current method I have of making charts simply uses the maximum value of all the graphs, then divides it by ten, and uses that as grid lines. I didn't write it.
Update Note: These graphs have been changed. As soon as I fixed the code, my dynamic graphs started working, making this question nonsensical (because the examples no longer had any errors in them). I've replaced them with static images, but some of the answers reference different values. Keep that in mind.
There were between 12003 and 14003 inbound calls so far in February. Informative, but ugly.
I'd like to avoid charts that look like a monkey came up with the y-axis numbers.
Using the Google charts API helps a little bit, but it's still not quite what I want.
The numbers are clean, but the top of the y-axis is always the same as the maximum value on the chart. This chart scales from 0 to 1357. I need to calculate the proper value of 1400 programmatically.
I'm throwing in rbobby's definition of a 'nice' number here because it explains it so well.
A "nice" number is one that has 3 or fewer non-zero digits (e.g. 1230000)
A "nice" number has the same number of, or fewer, non-zero digits than zero digits (e.g. 1230 is not nice, 1200 is nice)
The nicest numbers are ones with multiples of 3 zeros (e.g. "1,000", "1,000,000")
The second nicest numbers are ones with multiples of 3 zeros plus 2 zeros (e.g. "1,500,000", "1,200")
Solution
I found the way to get the results that I want using a modified version of Mark Ransom's idea.
First, Mark Ransom's code determines the optimum spacing between ticks, when given the number of ticks. Sometimes the resulting chart height ends up being more than twice the highest value on the chart, depending on how many grid lines you want.
What I'm doing is I'm running Mark's code with 5, 6, 7, 8, 9, and 10 grid lines (ticks) to find which of those is the lowest. With a value of 23, the height of the chart goes to 25, with a grid line at 5, 10, 15, 20, and 25. With a value of 26, the chart's height is 30, with grid lines at 5, 10, 15, 20, 25, and 30. It has the same spacing between grid lines, but there are more of them.
So here are the steps to just about copy what Excel does to make charts all fancy.
Temporarily bump up the chart's highest value by about 5% (so that there is always some space between the chart's highest point and the top of the chart area. We want 99.9 to round up to 120)
Find the optimum grid line placement for 5, 6, 7, 8, 9, and 10 grid lines.
Pick out the lowest of those numbers. Remember the number of grid lines it took to get that value.
Now you have the optimum chart height. The lines/bars will never butt up against the top of the chart and you have the optimum number of ticks.
PHP:
function roundUp($maxValue){
    $optiMax = $maxValue * 2;
    for ($i = 5; $i <= 10; $i++){
        $tmpMaxValue = bestTick($maxValue,$i);
        if (($optiMax > $tmpMaxValue) and ($tmpMaxValue > ($maxValue + $maxValue * 0.05))){
            $optiMax = $tmpMaxValue;
            $optiTicks = $i;
        }
    }
    return $optiMax;
}
function bestTick($maxValue, $mostTicks){
    $minimum = $maxValue / $mostTicks;
    $magnitude = pow(10,floor(log($minimum) / log(10)));
    $residual = $minimum / $magnitude;
    if ($residual > 5){
        $tick = 10 * $magnitude;
    } elseif ($residual > 2) {
        $tick = 5 * $magnitude;
    } elseif ($residual > 1){
        $tick = 2 * $magnitude;
    } else {
        $tick = $magnitude;
    }
    return ($tick * $mostTicks);
}
Python:
import math

def BestTick(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum) / math.log(10))
    residual = minimum / magnitude
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick

value = int(input(""))
optMax = value * 2
optTicks = None  # stays None if no candidate clears the 5% margin
for i in range(5, 11):
    maxValue = BestTick(value, i) * i
    print(maxValue)
    if (optMax > maxValue) and (maxValue > value + (value * .05)):
        optMax = maxValue
        optTicks = i
print("\nTest Value: " + str(value + (value * .05)) + "\n\nChart Height: " + str(optMax) + " Ticks: " + str(optTicks))

This is from a previous similar question:
Algorithm for "nice" grid line intervals on a graph
I've done this with kind of a brute force method. First, figure out the maximum number of tick marks you can fit into the space. Divide the total range of values by the number of ticks; this is the minimum spacing of the tick. Now calculate the floor of the logarithm base 10 and raise 10 to that power to get the magnitude of the tick, and divide the minimum spacing by this magnitude. You should end up with something in the range of 1 to 10. Simply choose the round number greater than or equal to the value and multiply it by the magnitude calculated earlier. This is your final tick spacing.
Example in Python:
import math

def BestTick(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum) / math.log(10))
    residual = minimum / magnitude
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick

You could round up to two significant figures. The following pseudocode should work:
// maxValue is the largest value in your chart
magnitude = floor(log10(maxValue))
base = 10^(magnitude - 1)
chartHeight = ceiling(maxValue / base) * base
For example, if maxValue is 1357, then magnitude is 3 and base is 100. Dividing by 100, rounding up, and multiplying by 100 has the effect of rounding up to the next multiple of 100, i.e. rounding up to two significant figures. In this case, the result is 1400 (1357 ⇒ 13.57 ⇒ 14 ⇒ 1400).
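The pseudocode translates fairly directly to Python. This is only a minimal sketch, assuming a positive maximum value; the function name round_up_two_sig_figs is made up for illustration.

```python
import math

def round_up_two_sig_figs(max_value):
    """Round a positive number up to two significant figures."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    magnitude = math.floor(math.log10(max_value))  # 3 for 1357
    base = 10 ** (magnitude - 1)                   # 100 for 1357
    return math.ceil(max_value / base) * base

print(round_up_two_sig_figs(1357))
```

Values that already have two significant figures (such as 1400 or 23) come back unchanged, so this gives no headroom above the tallest bar unless you bump the value first.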

In the past I've done this in a brute force-ish sort of way. Here's a chunk of C++ code that works well... but only for hardcoded lower and upper limits (0 and 5000):
int PickYUnits()
{
    int MinSize[8] = {20, 20, 20, 20, 20, 20, 20, 20};
    int ItemsPerUnit[8] = {5, 10, 20, 25, 50, 100, 250, 500};
    int ItemLimits[8] = {20, 50, 100, 250, 500, 1000, 2500, 5000};
    int MaxNumUnits = 8;
    double PixelsPerY;
    int PixelsPerAxis;
    int Units;
    //
    // Figure out the max from the dataset
    // - Min is always 0 for a bar chart
    //
    m_MinY = 0;
    m_MaxY = -9999999;
    m_TotalY = 0;
    for (int j = 0; j < m_DataPoints.GetSize(); j++) {
        if (m_DataPoints[j].m_y > m_MaxY) {
            m_MaxY = m_DataPoints[j].m_y;
        }
        m_TotalY += m_DataPoints[j].m_y;
    }
    //
    // Give some space at the top
    //
    m_MaxY = m_MaxY + 1;
    //
    // Figure out the size of the range
    //
    double yRange = (m_MaxY - m_MinY);
    //
    // Pick the initial size
    // (default to the last valid index so the array lookups below stay in bounds)
    //
    Units = MaxNumUnits - 1;
    for (int k = 0; k < MaxNumUnits; k++)
    {
        if (yRange < ItemLimits[k])
        {
            Units = k;
            break;
        }
    }
    //
    // Adjust it upwards based on the space available
    //
    PixelsPerY = m_rcGraph.Height() / yRange;
    PixelsPerAxis = (int)(PixelsPerY * ItemsPerUnit[Units]);
    while (PixelsPerAxis < MinSize[Units] && Units < MaxNumUnits - 1) {
        Units += 1;
        PixelsPerAxis = (int)(PixelsPerY * ItemsPerUnit[Units]);
        if (Units == 5)
            break;
    }
    return ItemsPerUnit[Units];
}
However something in what you've said tweaked me. To pick nice axis numbers a definition of "nice number" would help:
A "nice" number is one that has 3 or fewer non-zero digits (e.g. 1230000)
A "nice" number has the same number of, or fewer, non-zero digits than zero digits (e.g. 1230 is not nice, 1200 is nice)
The nicest numbers are ones with multiples of 3 zeros (e.g. "1,000", "1,000,000")
The second nicest numbers are ones with multiples of 3 zeros plus 2 zeros (e.g. "1,500,000", "1,200")
Not sure if the above definition is "right" or actually helpful (but with the definition in hand it then becomes a simpler task to devise an algorithm).
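As a rough illustration, the first two rules of that definition can be checked mechanically. The sketch below is an assumption-laden reading of the rules, not a tested axis algorithm: the name is_nice is invented here, it only handles integers, and it ignores the "nicest" and "second nicest" rankings.

```python
def is_nice(n):
    """Check the first two 'nice number' rules for an integer."""
    digits = str(abs(int(n)))
    non_zero = sum(1 for d in digits if d != "0")
    zeros = len(digits) - non_zero
    # Rule 1: at most 3 non-zero digits.
    # Rule 2: no more non-zero digits than zero digits.
    return non_zero <= 3 and non_zero <= zeros

for candidate in (1230000, 1230, 1200, 1357, 1400):
    print(candidate, is_nice(candidate))
```

Taken literally, rule 2 also rejects small values such as 5 or 25, so a real implementation would probably need a special case for numbers with only one or two digits.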

A slight refinement and tested... (works for fractions of units and not just integers)
// Requires: import java.math.BigDecimal; import java.math.RoundingMode;
public void testNumbers() {
    double test = 0.20000;
    double multiple = 1;
    int scale = 0;
    String[] prefix = new String[]{"", "m", "u", "n"};
    while (Math.log10(test) < 0) {
        multiple = multiple * 1000;
        test = test * 1000;
        scale++;
    }
    double tick;
    double minimum = test / 10;
    double magnitude = 100000000;
    while (minimum <= magnitude){
        magnitude = magnitude / 10;
    }
    double residual = test / (magnitude * 10);
    if (residual > 5) {
        tick = 10 * magnitude;
    } else if (residual > 2) {
        tick = 5 * magnitude;
    } else if (residual > 1) {
        tick = 2 * magnitude;
    } else {
        tick = magnitude;
    }
    double curAmt = 0;
    int ticks = (int) Math.ceil(test / tick);
    for (int ix = 0; ix < ticks; ix++) {
        curAmt += tick;
        // BigDecimal is immutable, so keep the value returned by setScale
        BigDecimal bigDecimal = new BigDecimal(curAmt).setScale(2, RoundingMode.HALF_UP);
        System.out.println(bigDecimal.stripTrailingZeros().toPlainString() + prefix[scale] + "s");
    }
    System.out.println("Value = " + test + prefix[scale] + "s");
    System.out.println("Tick = " + tick + prefix[scale] + "s");
    System.out.println("Ticks = " + ticks);
    System.out.println("Scale = " + multiple + " : " + scale);
}

If you want 1400 at the top, how about adjusting the last two parameters to 1400 instead of 1357:

You could use div and mod. For example.
Let's say you want your chart to round up by increments of 20 (just to make it a more arbitrary number than your typical "10" value).
So I would assume that 1, 11, 18 would all round up to 20. But 21, 33, 38 would round to 40.
To come up with the right value do the following:
Where divisor = your rounding increment.
divisor = 20
multiple = maxValue / divisor; // Do an integer divide here.
if (maxValue modulus divisor > 0)
    multiple++;
graphMax = multiple * divisor;
So now let's plug in real numbers:
divisor = 20;
multiple = 33 / 20; (integer divide)
so multiple = 1
if (33 modulus 20 > 0) (it is.. it equals 13)
    multiple++;
so multiple = 2;
graphMax = multiple (2) * divisor (20);
graphMax = 40;
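The same div/mod idea can be sketched in Python with divmod. The function name round_up_to_multiple is made up for this example, and it assumes non-negative integer inputs and a positive divisor.

```python
def round_up_to_multiple(max_value, divisor):
    """Round max_value up to the next multiple of divisor."""
    multiple, remainder = divmod(max_value, divisor)
    if remainder > 0:
        multiple += 1  # partial step: move up to the next multiple
    return multiple * divisor

for value in (1, 11, 18, 21, 33, 38):
    print(value, round_up_to_multiple(value, 20))
```

Note that a value sitting exactly on a multiple (40, for instance) stays where it is, so the tallest bar can still touch the top of the chart.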

Related

Algorithm to find optimal groups in 2D array

I have a deck of 24 cards - 8 red, 8 blue and 8 yellow cards.
red |1|2|3|4|5|6|7|8|
yellow |1|2|3|4|5|6|7|8|
blue |1|2|3|4|5|6|7|8|
I can take 3 cards at a time (three of a kind, a straight, or a straight flush), and each type of group is scored differently.
My question is, how to calculate maximal possible score (find optimal groups) for a game in progress, where some cards are already missing.
for example:
red |1|2|3|4|5|6|7|8|
yellow |1|2|3| |5| |7|8|
blue |1|2| |4|5|6| |8|
The score for a three-of-a-kind is:
1-1-1 20
2-2-2 30
3-3-3 40
4-4-4 50
5-5-5 60
6-6-6 70
7-7-7 80
8-8-8 90
The score for a straight is:
1-2-3 10
2-3-4 20
3-4-5 30
4-5-6 40
5-6-7 50
6-7-8 60
The score for a straight flush is:
1-2-3 50
2-3-4 60
3-4-5 70
4-5-6 80
5-6-7 90
6-7-8 100
A solution which recursively tries every combination would go like this:
Start looking at combinations that have a red 8 as the highest card: three-of-a-kind r8-y8-b8, straight flush r6-r7-r8, and every possible straight *6-*7-r8. For each of these, remove the cards from the set, and recurse to check combinations with the yellow 8, then blue 8, then red 7, yellow 7, blue 7, red 6 ... until you've checked everything except the 2's and 1's; then add three-of-a-kind 2-2-2 and 1-1-1 if available. At each step, check which recursion returns the maximum score, and return this maximum.
Let's look at what happens in each of these steps. Say we're looking at combinations with red 8; we have available cards like:
red ...|6|7|8|
yellow ...|6| |8|
blue ...| |7|8|
First, use three-of-a-kind r8-y8-b8, if possible. Create a copy of the available cards, remove the 8's, and recurse straight to the 7's:
score = 90 + max_score(cards_copy, next = red 7)
(Trying the three-of-a-kind should only be done when the current card is red, to avoid duplicate solutions.)
Then, use straight flush r6-r7-r8, if possible. Create a copy of the available cards, remove r6, r7 and r8, and recurse to yellow 8:
score = 100 + max_score(cards_copy, next = yellow 8)
Then, use every possible non-flush straight containing red 8; in the example, those are r6-b7-r8, y6-r7-r8 and y6-b7-r8 (there could be up to nine). For each of these, create a copy of the available cards, remove the three cards and recurse to yellow 8:
score = 60 + max_score(cards_copy, next = yellow 8)
Then, finally, recurse without using red 8: create a copy of the available cards, remove red 8 and recurse to yellow 8:
score = max_score(cards_copy, next = yellow 8)
You then calculate which of these options has the greatest score (with the score returned by its recursion added), and return that maximum score.
A quick test in JavaScript shows that for a full set of 24 cards, the algorithm goes through 30 million recursions to find the maximum score 560, and becomes quite slow. However, as soon as 3 higher-value cards have been removed, the number of recursions falls below one million and it takes around 1 second, and with 6 higher-value cards removed, it falls below 20,000 and returns almost instantly.
For almost-complete sets, you could pre-compute the maximum scores, and only calculate the score once a certain number of cards have been removed. A lot of sets will be duplicates anyway; removing r6-r7-r8 will result in the same maximum score as removing y6-y7-y8; removing r6-y7-b8 is a duplicate of removing b6-y7-r8... So first you change the input to a canonical version, and then you look up the pre-computed score. E.g. using pre-computed scores for all sets with 3 or 6 cards removed would require storing 45,340 scores.
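One possible way to get such a canonical version is to make the representation independent of which suit is which, since the scoring never depends on the colour itself. The sketch below assumes the cards are stored as three rows of eight 0/1 flags and handles only this suit-relabelling symmetry; the names canonical and remove are invented for the example, and it does not reproduce the count of stored scores mentioned above.

```python
def canonical(cards):
    """Return one representative for all suit relabelings of a card set.

    cards: three sequences of eight 0/1 flags, one row per suit.
    Sorting the rows gives the same result for any permutation of suits.
    """
    return tuple(sorted(tuple(row) for row in cards))

def remove(cards, picks):
    """Return a copy of cards with the (suit, rank) pairs in picks removed."""
    copy = [list(row) for row in cards]
    for suit, rank in picks:
        copy[suit][rank] = 0
    return copy

full = [[1] * 8 for _ in range(3)]
RED, YELLOW, BLUE = 0, 1, 2

a = remove(full, [(RED, 5), (YELLOW, 6), (BLUE, 7)])   # r6-y7-b8
b = remove(full, [(BLUE, 5), (YELLOW, 6), (RED, 7)])   # b6-y7-r8
print(canonical(a) == canonical(b))
```

The canonical tuple can then serve as a dictionary key for the pre-computed maximum scores.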
As a code example, here's the JavaScript code I tested the algorithm with:
function clone(array) { // copy 2-dimensional array
    var copy = [];
    array.forEach(function(item) {copy.push(item.slice())});
    return copy;
}
function max_score(cards, suit, rank) {
    suit = suit || 0; rank = rank || 7; // start at red 8
    var max = 0;
    if (rank < 2) { // try 3-of-a-kind for rank 1 and 2
        if (cards[0][0] && cards[1][0] && cards[2][0]) max += 20;
        if (cards[0][1] && cards[1][1] && cards[2][1]) max += 30;
        return max;
    }
    var next_rank = suit == 2 ? rank - 1: rank;
    var next_suit = (suit + 1) % 3;
    max = max_score(clone(cards), next_suit, next_rank); // try skipping this card
    if (! cards[suit][rank]) return max;
    if (suit == 0 && cards[1][rank] && cards[2][rank]) { // try 3-of-a-kind
        var score = rank * 10 + 20 + max_score(clone(cards), 0, rank - 1);
        if (score > max) max = score;
    }
    for (var i = 0; i < 3; i++) { // try all possible straights
        if (! cards[i][rank - 2]) continue;
        for (var j = 0; j < 3; j++) {
            if (! cards[j][rank - 1]) continue;
            var copy = clone(cards);
            copy[j][rank - 1] = 0; copy[i][rank - 2] = 0;
            var score = rank * 10 - 10 + max_score(copy, next_suit, next_rank);
            if (i == suit && j == suit) score += 40; // straight is straight flush
            if (score > max) max = score;
        }
    }
    return max;
}
document.write(max_score([[1,1,1,1,1,0,1,1], [1,1,1,1,1,1,1,0], [1,1,1,0,1,1,1,1]]));
An obvious way to speed up the algorithm is to use a 24-bit pattern instead of a 3x8 bit array to represent the cards; that way the array cloning is no longer necessary, and most of the code is turned into bit manipulation. In JavaScript, it's about 8 times faster:
function max_score(cards, suit, rank) {
    suit = suit || 0;
    if (rank === undefined) rank = 7; // start at red 8; an explicit rank of 0 must not be reset
    var max = 0;
    if (rank < 2) { // try 3-of-a-kind for rank 1 and 2
        if ((cards & 65793) == 65793) max += 20; // 65793 = rank 1 of all suits
        if ((cards & 131586) == 131586) max += 30; // 131586 = rank 2 of all suits
        return max;
    }
    var next_rank = suit == 2 ? rank - 1: rank;
    var next_suit = (suit + 1) % 3;
    var this_card = 1 << rank << suit * 8;
    max = max_score(cards, next_suit, next_rank); // try skipping this card
    if (! (cards & this_card)) return max;
    if (suit == 0 && cards & this_card << 8 && cards & this_card << 16) { // try 3oaK
        var score = rank * 10 + 20 + max_score(cards, 0, rank - 1);
        if (score > max) max = score;
    }
    for (var i = 0; i < 3; i++) { // try all possible straights
        var mid_card = 1 << rank - 1 << i * 8;
        if (! (cards & mid_card)) continue;
        for (var j = 0; j < 3; j++) {
            var low_card = 1 << rank - 2 << j * 8;
            if (! (cards & low_card)) continue;
            var cards_copy = cards - mid_card - low_card;
            var score = rank * 10 - 10 + max_score(cards_copy, next_suit, next_rank);
            if (i == suit && j == suit) score += 40; // straight is straight flush
            if (score > max) max = score;
        }
    }
    return max;
}
document.write(max_score(parseInt("111101110111111111011111", 2)));
//                                 B       Y       R
//                                 876543218765432187654321
The speed for almost-complete sets can be further improved by using the observation that if a straight flush for all three suits can be made for the current rank, then this is always the best option. This reduces the number of recursions drastically, because nine cards can be skipped at once. This check should be added immediately after trying 3-of-a-kind for rank 1 and 2:
if (suit == 0) { // try straight flush for all suits
    var flush3 = 460551 << rank - 2; // 460551 = rank 1, 2 and 3 of all suits
    if ((cards & flush3) == flush3) {
        max = rank * 30 + 90;
        if (rank > 2) max += max_score(cards - flush3, 0, rank - 3);
        return max;
    }
}

Scoring two sequences of ordered numbers for their similarity to one-another

How would I go about scoring two sequences of numbers such that
5, 8, 28, 31 (differences of 3, 20 and 3)
6, 9, 26, 29 (differences of 3, 17 and 3)
are considered similar "enough" but a sequence of
8, 11, 31, 34 (differences of 3, 20 and 3, errors of 3, 3, 3, 3)
is too dissimilar to allow?
The second set of numbers has an absolute error of
1 1 2 2 and that is low "enough" to accept.
If that error was too high I'd like to be able to reject it.
To give a little background, these are indicators of time and when events arrived to a computer. The first sequence is the expected time of arrival and the second sequence is the actual times they arrived. Knowing that the sequence is at least in the correct order I need to be able to score the similarity to the expectation and accept or reject it by tweaking some sort of value.
If it were standard deviation for a set of numbers where order didn't matter I could just reject the second set based on its own standard deviation.
Since this is not the case I had the idea of measuring deviance and position error.
Position error shouldn't exceed 3, though this number should not be integer - it needs to be decimal as the numbers are more realistically floating point, or at least accurate to 6 decimal places.
It also needs to work equally well, or perhaps offer a variant in which a much longer series of numbers can be scored fairly.
In the longer series of numbers it is not likely the position error will exceed 3, so the position error would still be fairly low.
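To make the position-error idea concrete, here is a minimal Python sketch. It assumes the two sequences are already aligned element by element and treats the limit as strict (an error of exactly 3 is rejected, matching the third example); the names position_errors and is_similar and the default threshold are illustrative choices, not an established method.

```python
def position_errors(expected, actual):
    """Absolute difference between each expected and actual arrival time."""
    if len(expected) != len(actual):
        raise ValueError("sequences must have the same length")
    return [abs(a - e) for e, a in zip(expected, actual)]

def is_similar(expected, actual, max_error=3.0):
    """Accept only if every position error is strictly below max_error."""
    return all(err < max_error for err in position_errors(expected, actual))

expected = [5, 8, 28, 31]
print(position_errors(expected, [6, 9, 26, 29]), is_similar(expected, [6, 9, 26, 29]))
print(position_errors(expected, [8, 11, 31, 34]), is_similar(expected, [8, 11, 31, 34]))
```

Because the threshold is a float, the same check works for timestamps with fractional parts, and the per-element test does not get looser or stricter as the series grows longer.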
This is a partial solution I have found, using a series of Pearson's correlation coefficients, one for each position at which x fits into y. It uses the form of the equation that works off expected values. The comments describe it fairly well.
function getPearsonsCorrelation(x, y)
{
    /**
     * Pearson's can be calculated in an alternative fashion as
     * p(x, y) = (E(xy) - E(x)*E(y))/sqrt[(E(x^2)-(E(x))^2)*(E(y^2)-(E(y))^2)]
     * where p(x, y) is the Pearson's correlation result, E is a function referring to the expected value
     * E(x) = var expectedValue = 0; for(var i = 0; i < x.length; i ++){ expectedValue += x[i]*p[i] }
     * where p[i] is the probability of that variable occurring; every sample is equally likely, so p[i] = 1/n
     * hence E(x) = (sum of all x values)/n; the code below works with plain sums and multiplies through by n
     * sqrt is the square root of the result in square brackets
     * ^2 means to the power of two, or rather just square that value
     **/
    var maxdelay = y.length - x.length; // we will calculate Pearson's correlation coefficient at every location x fits into y
    var xl = x.length;
    var results = [];
    for(var d = 0; d <= maxdelay; d++){
        var xy = [];
        var x2 = [];
        var y2 = [];
        var _y = y.slice(d, d + x.length); // take just the segment of y at delay
        for(var i = 0; i < xl; i ++){
            xy.push(x[i] * _y[i]); // x*y array
            x2.push(x[i] * x[i]); // x squareds array
            y2.push(_y[i] * _y[i]); // y squareds array
        }
        var sum_x = 0;
        var sum_y = 0;
        var sum_xy = 0;
        var sum_x2 = 0;
        var sum_y2 = 0;
        for(var i = 0; i < xl; i ++){
            sum_x += x[i]; // sum of x, i.e. n * E(x)
            sum_y += _y[i]; // sum of y, i.e. n * E(y)
            sum_xy += xy[i]; // sum of xy, i.e. n * E(xy)
            sum_x2 += x2[i]; // sum of x squared, i.e. n * E(x^2)
            sum_y2 += y2[i]; // sum of y squared, i.e. n * E(y^2)
        }
        var numerator = xl * sum_xy - sum_x * sum_y; // n^2 * (E(xy) - E(x) * E(y))
        var denomLeftSide = xl * sum_x2 - sum_x * sum_x; // n^2 * (E(x^2) - E(x)^2)
        var denomRightSide = xl * sum_y2 - sum_y * sum_y; // n^2 * (E(y^2) - E(y)^2)
        var denom = Math.sqrt(denomLeftSide * denomRightSide);
        var pearsonsCorrelation = numerator / denom; // the n^2 factors cancel
        results.push(pearsonsCorrelation);
    }
    return results;
}

Tickmark algorithm for a graph axis

I'm looking for an algorithm that places tick marks on an axis, given a range to display, a width to display it in, and a function to measure a string width for a tick mark.
For example, given that I need to display between 1e-6 and 5e-6 and a width to display in pixels, the algorithm would determine that I should put tickmarks (for example) at 1e-6, 2e-6, 3e-6, 4e-6, and 5e-6. Given a smaller width, it might decide that the optimal placement is only at the even positions, i.e. 2e-6 and 4e-6 (since putting more tickmarks would cause them to overlap).
A smart algorithm would give preference to tickmarks at multiples of 10, 5, and 2. Also, a smart algorithm would be symmetric around zero.
As I didn't like any of the solutions I've found so far, I implemented my own. It's in C# but it can be easily translated into any other language.
It basically chooses from a list of possible steps the smallest one that displays all values, without leaving any value exactly in the edge, lets you easily select which possible steps you want to use (without having to edit ugly if-else if blocks), and supports any range of values. I used a C# Tuple to return three values just for a quick and simple demonstration.
private static Tuple<decimal, decimal, decimal> GetScaleDetails(decimal min, decimal max)
{
    // Minimal increment to avoid round extreme values to be on the edge of the chart
    decimal epsilon = (max - min) / 1e6m;
    max += epsilon;
    min -= epsilon;
    decimal range = max - min;
    // Target number of values to be displayed on the Y axis (it may be less)
    int stepCount = 20;
    // First approximation
    decimal roughStep = range / (stepCount - 1);
    // Set best step for the range
    decimal[] goodNormalizedSteps = { 1, 1.5m, 2, 2.5m, 5, 7.5m, 10 }; // keep the 10 at the end
    // Or use these if you prefer: { 1, 2, 5, 10 };
    // Normalize rough step to find the normalized one that fits best
    decimal stepPower = (decimal)Math.Pow(10, -Math.Floor(Math.Log10((double)Math.Abs(roughStep))));
    var normalizedStep = roughStep * stepPower;
    var goodNormalizedStep = goodNormalizedSteps.First(n => n >= normalizedStep);
    decimal step = goodNormalizedStep / stepPower;
    // Determine the scale limits based on the chosen step.
    decimal scaleMax = Math.Ceiling(max / step) * step;
    decimal scaleMin = Math.Floor(min / step) * step;
    return new Tuple<decimal, decimal, decimal>(scaleMin, scaleMax, step);
}
static void Main()
{
    // Dummy code to show a usage example.
    var minimumValue = data.Min();
    var maximumValue = data.Max();
    var results = GetScaleDetails(minimumValue, maximumValue);
    chart.YAxis.MinValue = results.Item1;
    chart.YAxis.MaxValue = results.Item2;
    chart.YAxis.Step = results.Item3;
}
Take the longest of the segments about zero (or the whole graph, if zero is not in the range) - for example, if you have something on the range [-5, 1], take [-5,0].
Figure out approximately how long this segment will be, in ticks. This is just dividing the length by the width of a tick. So suppose the method says that we can put 11 ticks in from -5 to 0. This is our upper bound. For the shorter side, we'll just mirror the result on the longer side.
Now try to put in as many ticks as possible (up to 11), such that the marker for each tick is of the form i*10*10^n, i*5*10^n, or i*2*10^n, where n is an integer, and i is the index of the tick. Now it's an optimization problem - we want to maximize the number of ticks we can put in, while at the same time minimizing the distance between the last tick and the end of the result. So assign a score for getting as many ticks as we can, less than our upper bound, and assign a score to getting the last tick close to n - you'll have to experiment here.
In the above example, try n = 1. We get 1 tick (at i=0). n = 2 gives us 1 tick, and we're further from the lower bound, so we know that we have to go the other way. n = 0 gives us 6 ticks, at each integer point. n = -1 gives us 12 ticks (0, -0.5, ..., -5.0). n = -2 gives us 24 ticks, and so on. The scoring algorithm will give them each a score - higher means a better method.
Do this again for the i * 5 * 10^n, and i*2*10^n, and take the one with the best score.
(as an example scoring algorithm, say that the score is the distance to the last tick times the maximum number of ticks minus the number needed. This will likely be bad, but it'll serve as a decent starting point).
Funnily enough, just over a week ago I came here looking for an answer to the same question, but went away again and decided to come up with my own algorithm. I am here to share, in case it is of any use.
I wrote the code in Python to try and bust out a solution as quickly as possible, but it can easily be ported to any other language.
The function below calculates the appropriate interval (which I have allowed to be either 10**n, 2*10**n, 4*10**n or 5*10**n) for a given range of data, and then calculates the locations at which to place the ticks (based on which numbers within the range are divisible by the interval). I have not used the modulo % operator, since it does not work reliably with floating-point numbers due to floating-point rounding errors.
Code:
import math

def get_tick_positions(data: list):
    if len(data) == 0:
        return []
    retpoints = []
    data_range = max(data) - min(data)
    lower_bound = min(data) - data_range/10
    upper_bound = max(data) + data_range/10
    view_range = upper_bound - lower_bound
    num = lower_bound
    n = math.floor(math.log10(view_range) - 1)
    interval = 10**n
    num_ticks = 1
    while num <= upper_bound:
        num += interval
        num_ticks += 1
        if num_ticks > 10:
            if interval == 10 ** n:
                interval = 2 * 10 ** n
            elif interval == 2 * 10 ** n:
                interval = 4 * 10 ** n
            elif interval == 4 * 10 ** n:
                interval = 5 * 10 ** n
            else:
                n += 1
                interval = 10 ** n
            num = lower_bound
            num_ticks = 1
    if view_range >= 10:
        copy_interval = interval
    else:
        if interval == 10 ** n:
            copy_interval = 1
        elif interval == 2 * 10 ** n:
            copy_interval = 2
        elif interval == 4 * 10 ** n:
            copy_interval = 4
        else:
            copy_interval = 5
    first_val = 0
    prev_val = 0
    times = 0
    temp_log = math.log10(interval)
    if math.isclose(lower_bound, 0):
        first_val = 0
    elif lower_bound < 0:
        if upper_bound < -2*interval:
            if n < 0:
                copy_ub = round(upper_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) + 2
            else:
                times = upper_bound // round(interval) + 2
        while first_val >= lower_bound:
            prev_val = first_val
            first_val = times * copy_interval
            if n < 0:
                first_val *= (10**n)
            times -= 1
        first_val = prev_val
        times += 3
    else:
        if lower_bound > 2*interval:
            if n < 0:
                copy_ub = round(lower_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) - 2
            else:
                times = lower_bound // round(interval) - 2
        while first_val < lower_bound:
            first_val = times*copy_interval
            if n < 0:
                first_val *= (10**n)
            times += 1
    if n < 0:
        retpoints.append(first_val)
    else:
        retpoints.append(round(first_val))
    val = first_val
    times = 1
    while val <= upper_bound:
        val = first_val + times * interval
        if n < 0:
            retpoints.append(val)
        else:
            retpoints.append(round(val))
        times += 1
    retpoints.pop()
    return retpoints
When passing in the following three data-points to the function
points = [-0.00493, -0.0003892, -0.00003292]
... the output I get (as a list) is as follows:
[-0.005, -0.004, -0.003, -0.002, -0.001, 0.0]
When passing this:
points = [1.399, 38.23823, 8309.33, 112990.12]
... I get:
[0, 20000, 40000, 60000, 80000, 100000, 120000]
When passing this:
points = [-54, -32, -19, -17, -13, -11, -8, -4, 12, 15, 68]
... I get:
[-60, -40, -20, 0, 20, 40, 60, 80]
... which all seem to be a decent choice of positions for placing ticks.
The function is written to allow 5-10 ticks, but that could easily be changed if you so please.
It does not matter whether the supplied list is ordered or unordered, since only the minimum and maximum data points within the list are used.
This simple algorithm yields an interval that is 1, 2, or 5 times a power of 10, and the axis range gets divided into at least 5 intervals. The code sample is in Java:
protected double calculateInterval(double range) {
    double x = Math.pow(10.0, Math.floor(Math.log10(range)));
    if (range / x >= 5)
        return x;
    else if (range / (x / 2.0) >= 5)
        return x / 2.0;
    else
        return x / 5.0;
}
This is an alternative, for a minimum of 10 intervals:
protected double calculateInterval(double range) {
    double x = Math.pow(10.0, Math.floor(Math.log10(range)));
    if (range / (x / 2.0) >= 10)
        return x / 2.0;
    else if (range / (x / 5.0) >= 10)
        return x / 5.0;
    else
        return x / 10.0;
}
I've been using the jQuery flot graph library. It's open source and does axis/tick generation quite well. I'd suggest looking at its code and pinching some ideas from there.

Choosing an attractive linear scale for a graph's Y Axis

I'm writing a bit of code to display a bar (or line) graph in our software. Everything's going fine. The thing that's got me stumped is labeling the Y axis.
The caller can tell me how finely they want the Y scale labeled, but I seem to be stuck on exactly what to label them in an "attractive" kind of way. I can't describe "attractive", and probably neither can you, but we know it when we see it, right?
So if the data points are:
15, 234, 140, 65, 90
And the user asks for 10 labels on the Y axis, a little bit of finagling with paper and pencil comes up with:
0, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250
So there's 10 there (not including 0), the last one extends just beyond the highest value (234 < 250), and it's a "nice" increment of 25 each. If they asked for 8 labels, an increment of 30 would have looked nice:
0, 30, 60, 90, 120, 150, 180, 210, 240
Nine would have been tricky. Maybe just using either 8 or 10 and calling it close enough would be okay. And what to do when some of the points are negative?
I can see Excel tackles this problem nicely.
Does anyone know a general-purpose algorithm (even some brute force is okay) for solving this? I don't have to do it quickly, but it should look nice.
A long time ago I wrote a graph module that covered this nicely. Digging in the grey matter gets the following:
Determine lower and upper bound of the data. (Beware of the special case where lower bound = upper bound!)
Divide range into the required amount of ticks.
Round the tick range up into nice amounts.
Adjust the lower and upper bound accordingly.
Let's take your example:
15, 234, 140, 65, 90 with 10 ticks
lower bound = 15
upper bound = 234
range = 234-15 = 219
tick range = 21.9. This should be 25.0
new lower bound = 25 * floor(15/25) = 0
new upper bound = 25 * ceil(234/25) = 250
So the range = 0,25,50,...,225,250
You can get the nice tick range with the following steps:
divide by 10^x such that the result lies between 0.1 and 1.0 (including 0.1 excluding 1).
translate accordingly:
0.1 -> 0.1
<= 0.2 -> 0.2
<= 0.25 -> 0.25
<= 0.3 -> 0.3
<= 0.4 -> 0.4
<= 0.5 -> 0.5
<= 0.6 -> 0.6
<= 0.7 -> 0.7
<= 0.75 -> 0.75
<= 0.8 -> 0.8
<= 0.9 -> 0.9
<= 1.0 -> 1.0
multiply by 10^x.
In this case, 21.9 is divided by 10^2 to get 0.219. This is <= 0.25 so we now have 0.25. Multiplied by 10^2 this gives 25.
Lets take a look at the same example with 8 ticks:
15, 234, 140, 65, 90 with 8 ticks
lower bound = 15
upper bound = 234
range = 234-15 = 219
tick range = 27.375
Divide by 10^2 for 0.27375, translates to 0.3, which gives (multiplied by 10^2) 30.
new lower bound = 30 * round(15/30) = 0
new upper bound = 30 * round(1+235/30) = 240
Which gives the result you requested ;-).
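The steps above can be sketched in Python as follows. This is only an illustration of the lookup-table idea, not the original module: the names `NICE_FRACTIONS`, `nice_tick_range` and `nice_axis` are made up here, the bounds are snapped with floor and ceil, and widening an empty range by 1 on each side is an arbitrary assumption.

```python
import math

# Table from the answer: nice fractions in the interval [0.1, 1.0].
NICE_FRACTIONS = [0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9, 1.0]

def nice_tick_range(raw_tick):
    """Round raw_tick up to the next table entry times a power of ten."""
    # Choose the exponent so that raw_tick / 10**exponent lies in [0.1, 1).
    exponent = math.floor(math.log10(raw_tick)) + 1
    scaled = raw_tick / 10 ** exponent
    for fraction in NICE_FRACTIONS:
        if scaled <= fraction:
            return fraction * 10 ** exponent
    return 10 ** exponent  # only reached if floating-point noise pushes scaled above 1.0

def nice_axis(values, ticks):
    """Return (lower bound, upper bound, tick range) for the data."""
    lower, upper = min(values), max(values)
    if lower == upper:
        # Special case from step 1; the widening amount is an assumption.
        lower, upper = lower - 1, upper + 1
    tick = nice_tick_range((upper - lower) / ticks)
    return tick * math.floor(lower / tick), tick * math.ceil(upper / tick), tick

print(nice_axis([15, 234, 140, 65, 90], 10))
print(nice_axis([15, 234, 140, 65, 90], 8))
```

With the example data this should give a tick range of 25 on an axis from 0 to 250 for 10 ticks, and a tick range of 30 on an axis from 0 to 240 for 8 ticks.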
------ Added by KD ------
Here's code that achieves this algorithm without using lookup tables, etc...:
double range = ...;
int tickCount = ...;
double unroundedTickSize = range/(tickCount-1);
double x = Math.ceil(Math.log10(unroundedTickSize)-1);
double pow10x = Math.pow(10, x);
double roundedTickRange = Math.ceil(unroundedTickSize / pow10x) * pow10x;
return roundedTickRange;
Generally speaking, the number of ticks includes the bottom tick, so the actual y-axis segments are one less than the number of ticks.
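For comparison, here is a direct Python transcription of the snippet above (the function name `rounded_tick_range` is made up for this sketch). It shows that the formula rounds up to the next whole multiple of a power of ten, so it can return steps such as 30 or 80 rather than only the table values.

```python
import math

def rounded_tick_range(value_range, tick_count):
    # tick_count includes the bottom tick, hence tick_count - 1 segments.
    unrounded = value_range / (tick_count - 1)
    x = math.ceil(math.log10(unrounded) - 1)
    pow10x = 10.0 ** x
    return math.ceil(unrounded / pow10x) * pow10x

print(rounded_tick_range(219, 10))  # 219 / 9 = 24.33..., rounded up to 30.0
print(rounded_tick_range(219, 4))   # 219 / 3 = 73, rounded up to 80.0
```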
Here is a PHP example I am using. This function returns an array of pretty Y axis values that encompass the min and max Y values passed in. Of course, this routine could also be used for X axis values.
It allows you to "suggest" how many ticks you might want, but the routine will return
what looks good. I have added some sample data and shown the results for these.
#!/usr/bin/php -q
<?php
function makeYaxis($yMin, $yMax, $ticks = 10)
{
// This routine creates the Y axis values for a graph.
//
// Calculate Min and Max graphical labels and graph
// increments. The number of ticks defaults to
// 10 which is the SUGGESTED value. Any tick value
// entered is used as a suggested value which is
// adjusted to be a 'pretty' value.
//
// Output will be an array of the Y axis values that
// encompass the Y values.
$result = array();
// If yMin and yMax are identical, then
// adjust the yMin and yMax values to actually
// make a graph. Also avoids division by zero errors.
if($yMin == $yMax)
{
$yMin = $yMin - 10; // some small value
$yMax = $yMax + 10; // some small value
}
// Determine Range
$range = $yMax - $yMin;
// Adjust ticks if needed
if($ticks < 2)
$ticks = 2;
else if($ticks > 2)
$ticks -= 2;
// Get raw step value
$tempStep = $range/$ticks;
// Calculate pretty step value
$mag = floor(log10($tempStep));
$magPow = pow(10,$mag);
$magMsd = (int)($tempStep/$magPow + 0.5);
$stepSize = $magMsd*$magPow;
// build Y label array.
// Lower and upper bounds calculations
$lb = $stepSize * floor($yMin/$stepSize);
$ub = $stepSize * ceil(($yMax/$stepSize));
// Build array
$val = $lb;
while(1)
{
$result[] = $val;
$val += $stepSize;
if($val > $ub)
break;
}
return $result;
}
// Create some sample data for demonstration purposes
$yMin = 60;
$yMax = 330;
$scale = makeYaxis($yMin, $yMax);
print_r($scale);
$scale = makeYaxis($yMin, $yMax,5);
print_r($scale);
$yMin = 60847326;
$yMax = 73425330;
$scale = makeYaxis($yMin, $yMax);
print_r($scale);
?>
Result output from sample data
# ./test1.php
Array
(
[0] => 60
[1] => 90
[2] => 120
[3] => 150
[4] => 180
[5] => 210
[6] => 240
[7] => 270
[8] => 300
[9] => 330
)
Array
(
[0] => 0
[1] => 90
[2] => 180
[3] => 270
[4] => 360
)
Array
(
[0] => 60000000
[1] => 62000000
[2] => 64000000
[3] => 66000000
[4] => 68000000
[5] => 70000000
[6] => 72000000
[7] => 74000000
)
Try this code. I've used it in a few charting scenarios and it works well. It's pretty fast too.
public static class AxisUtil
{
public static float CalculateStepSize(float range, float targetSteps)
{
// calculate an initial guess at step size
float tempStep = range/targetSteps;
// get the magnitude of the step size
float mag = (float)Math.Floor(Math.Log10(tempStep));
float magPow = (float)Math.Pow(10, mag);
// calculate most significant digit of the new step size
float magMsd = (int)(tempStep/magPow + 0.5);
// promote the MSD to either 1, 2, or 5
if (magMsd > 5.0)
magMsd = 10.0f;
else if (magMsd > 2.0)
magMsd = 5.0f;
else if (magMsd > 1.0)
magMsd = 2.0f;
return magMsd*magPow;
}
}
Sounds like the caller doesn't tell you the ranges it wants.
So you are free to change the end points until you get a range that is nicely divisible by your label count.
Let's define "nice". I would call nice if the labels are off by:
1. 2^n, for some integer n. eg. ..., .25, .5, 1, 2, 4, 8, 16, ...
2. 10^n, for some integer n. eg. ..., .01, .1, 1, 10, 100
3. n % 5 == 0, for some positive integer n, eg, 5, 10, 15, 20, 25, ...
4. n % 2 == 0, for some positive integer n, eg, 2, 4, 6, 8, 10, 12, 14, ...
Find the max and min of your data series. Let's call these points:
min_point and max_point.
Now all you need to do is find 3 values:
- start_label, where start_label < min_point and start_label is an integer
- end_label, where end_label > max_point and end_label is an integer
- label_offset, where label_offset is "nice"
that fit the equation:
(end_label - start_label)/label_offset == label_count
There are probably many solutions, so just pick one. Most of the time I bet you can set
start_label to 0
so just try different integer
end_label
until the offset is "nice"
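A brute-force sketch of that search in Python might look like the following. The helper names `is_nice` and `fit_labels` and the `search_limit` cut-off are assumptions for illustration, and only the integer rules (10^0 = 1, multiples of 5, multiples of 2) from the "nice" definition are checked.

```python
import math

def is_nice(offset):
    """Integer offsets that satisfy rule 2 (as 10^0), rule 3 or rule 4."""
    return offset == 1 or offset % 2 == 0 or offset % 5 == 0

def fit_labels(max_point, label_count, start_label=0, search_limit=10_000):
    """Try successive integer end labels until the offset is nice."""
    first_candidate = math.floor(max_point) + 1  # end_label must exceed max_point
    for end_label in range(first_candidate, first_candidate + search_limit):
        span = end_label - start_label
        if span % label_count == 0 and is_nice(span // label_count):
            return start_label, end_label, span // label_count
    return None  # nothing found within the search limit

print(fit_labels(234, 10))  # (0, 240, 24)
print(fit_labels(234, 8))   # (0, 240, 30)
```

Note that 24 counts as nice under rule 4, which may or may not be what you want; tightening `is_nice` changes the result.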
I'm still battling with this :)
The original Gamecat answer does seem to work most of the time, but try plugging in say, "3 ticks" as the number of ticks required (for the same data values 15, 234, 140, 65, 90)....it seems to give a tick range of 73, which after dividing by 10^2 yields 0.73, which maps to 0.75, which gives a 'nice' tick range of 75.
Then calculating upper bound:
75*round(1+234/75) = 300
and the lower bound:
75 * round(15/75) = 0
But clearly if you start at 0, and proceed in steps of 75 up to the upper bound of 300, you end up with 0,75,150,225,300
....which is no doubt useful, but it's 4 ticks (not including 0) not the 3 ticks required.
Just frustrating that it doesn't work 100% of the time....which could well be down to my mistake somewhere of course!
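One possible way around the extra tick, sketched in Python: instead of accepting the first nice step, keep moving to the next larger nice step until the snapped axis needs no more than the requested number of intervals. This is not part of the original answer; `NICE_MULTIPLIERS`, `nice_steps` and `axis_with_at_most` are hypothetical names, and the multipliers are simply the table values scaled by ten.

```python
import math

NICE_MULTIPLIERS = [1, 2, 2.5, 3, 4, 5, 6, 7, 7.5, 8, 9]

def nice_steps(start):
    """Yield nice step sizes >= start in increasing order, without end."""
    exponent = math.floor(math.log10(start))
    while True:
        for multiplier in NICE_MULTIPLIERS:
            step = multiplier * 10.0 ** exponent
            if step >= start:
                yield step
        exponent += 1

def axis_with_at_most(lower, upper, max_intervals):
    """Return (lower bound, upper bound, step) using at most max_intervals steps."""
    if max_intervals < 2 or upper <= lower:
        raise ValueError("need max_intervals >= 2 and upper > lower")
    for step in nice_steps((upper - lower) / max_intervals):
        lo = step * math.floor(lower / step)
        hi = step * math.ceil(upper / step)
        if round((hi - lo) / step) <= max_intervals:
            return lo, hi, step

print(axis_with_at_most(15, 234, 3))   # (0.0, 240.0, 80.0)
print(axis_with_at_most(15, 234, 10))  # (0.0, 250.0, 25.0)
```

For 3 intervals the step of 75 is rejected (it needs 4 intervals to reach 300) and the next candidate, 80, covers the data in exactly 3.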
The answer by Toon Krijthe works most of the time, but it sometimes produces an excess number of ticks, and it doesn't work with negative numbers either. The overall approach to the problem is OK, but there is a better way to handle it. The algorithm you want to use depends on what you really want to get. Below is the code I used in my JS plotting library. I've tested it and it always works (hopefully ;) ). Here are the major steps:
get the global extrema xMin and xMax (include all the plots you want to print in the algorithm)
calculate the range between xMin and xMax
calculate the order of magnitude of your range
calculate the tick size by dividing the range by the number of ticks minus one
this one is optional: if you want the zero tick always printed, use the tick size to calculate the number of positive and negative ticks. The total number of ticks will be their sum + 1 (the zero tick)
this one is not needed if the zero tick is always printed: calculate the lower and upper bounds, but remember to center the plot
Let's start. First, the basic calculations:
var range = Math.abs(xMax - xMin); //both can be negative
var rangeOrder = Math.floor(Math.log10(range)) - 1;
var power10 = Math.pow(10, rangeOrder);
var maxRound = (xMax > 0) ? Math.ceil(xMax / power10) : Math.floor(xMax / power10);
var minRound = (xMin < 0) ? Math.floor(xMin / power10) : Math.ceil(xMin / power10);
I round the minimum and maximum values to be 100% sure that my plot will cover all the data. It is also very important to floor log10 of the range, whether or not it is negative, and subtract 1 afterwards. Otherwise the algorithm won't work for numbers less than one.
var fullRange = Math.abs(maxRound - minRound);
var tickSize = Math.ceil(fullRange / (this.XTickCount - 1));
//You can set nice looking ticks if you want
//You can find exemplary method below
tickSize = this.NiceLookingTick(tickSize);
//Here you can write a method to determine if you need zero tick
//You can find exemplary method below
var isZeroNeeded = this.HasZeroTick(maxRound, minRound, tickSize);
I use "nice looking ticks" to avoid ticks like 7, 13, 17 etc. The method I use here is pretty simple. It is also nice to have a zero tick when needed; the plot looks much more professional this way. You will find all the methods at the end of this answer.
Now you have to calculate the upper and lower bounds. This is very easy with a zero tick but requires a little more effort otherwise. Why? Because we want to center the plot nicely between the upper and lower bounds. Have a look at my code. Some of the variables are defined outside of this scope, and some of them are properties of the object in which all of the presented code is kept.
if (isZeroNeeded) {
var positiveTicksCount = 0;
var negativeTickCount = 0;
if (maxRound != 0) {
positiveTicksCount = Math.ceil(maxRound / tickSize);
XUpperBound = tickSize * positiveTicksCount * power10;
}
if (minRound != 0) {
negativeTickCount = Math.floor(minRound / tickSize);
XLowerBound = tickSize * negativeTickCount * power10;
}
XTickRange = tickSize * power10;
this.XTickCount = positiveTicksCount - negativeTickCount + 1;
}
else {
var delta = (tickSize * (this.XTickCount - 1) - fullRange) / 2.0;
if (delta % 1 == 0) {
XUpperBound = maxRound + delta;
XLowerBound = minRound - delta;
}
else {
XUpperBound = maxRound + Math.ceil(delta);
XLowerBound = minRound - Math.floor(delta);
}
XTickRange = tickSize * power10;
XUpperBound = XUpperBound * power10;
XLowerBound = XLowerBound * power10;
}
And here are the methods I mentioned before. You can write your own, but you can also use mine:
this.NiceLookingTick = function (tickSize) {
var NiceArray = [1, 2, 2.5, 3, 4, 5, 10];
var tickOrder = Math.floor(Math.log10(tickSize));
var power10 = Math.pow(10, tickOrder);
tickSize = tickSize / power10;
var niceTick;
var minDistance = 10;
var index = 0;
for (var i = 0; i < NiceArray.length; i++) {
var dist = Math.abs(NiceArray[i] - tickSize);
if (dist < minDistance) {
minDistance = dist;
index = i;
}
}
return NiceArray[index] * power10;
}
this.HasZeroTick = function (maxRound, minRound, tickSize) {
if (maxRound * minRound < 0)
{
return true;
}
else if (Math.abs(maxRound) < tickSize || Math.abs(minRound) < tickSize) {
return true;
}
else {
return false;
}
}
There is only one more thing that is not included here: "nice looking bounds". These are lower bounds that are similar to the numbers in "nice looking ticks". For example, it is better to have the lower bound start at 5 with tick size 5 than to have a plot that starts at 6 with the same tick size. But this, my friend, I leave to you.
Hope it helps.
Cheers!
Converted this answer to Swift 4:
extension Int {
static func makeYaxis(yMin: Int, yMax: Int, ticks: Int = 10) -> [Int] {
var yMin = yMin
var yMax = yMax
var ticks = ticks
// This routine creates the Y axis values for a graph.
//
// Calculate Min and Max graphical labels and graph
// increments. The number of ticks defaults to
// 10 which is the SUGGESTED value. Any tick value
// entered is used as a suggested value which is
// adjusted to be a 'pretty' value.
//
// Output will be an array of the Y axis values that
// encompass the Y values.
var result = [Int]()
// If yMin and yMax are identical, then
// adjust the yMin and yMax values to actually
// make a graph. Also avoids division by zero errors.
if yMin == yMax {
yMin -= ticks // some small value
yMax += ticks // some small value
}
// Determine Range
let range = yMax - yMin
// Adjust ticks if needed
if ticks < 2 { ticks = 2 }
else if ticks > 2 { ticks -= 2 }
// Get raw step value
let tempStep: CGFloat = CGFloat(range) / CGFloat(ticks)
// Calculate pretty step value
let mag = floor(log10(tempStep))
let magPow = pow(10,mag)
let magMsd = Int(tempStep / magPow + 0.5)
let stepSize = magMsd * Int(magPow)
// build Y label array.
// Lower and upper bounds calculations
let lb = stepSize * Int(yMin/stepSize)
let ub = stepSize * Int(ceil(CGFloat(yMax)/CGFloat(stepSize)))
// Build array
var val = lb
while true {
result.append(val)
val += stepSize
if val > ub { break }
}
return result
}
}
This works like a charm if you want 10 steps + zero:
//get proper scale for y
$maximoyi_temp= max($institucion); //get max value from data array
for ($i=10; $i< $maximoyi_temp; $i=($i*10)) {
if (($divisor = ($maximoyi_temp / $i)) < 2) break; //get which divisor will give a number between 1-2
}
$factor_d = $maximoyi_temp / $i;
$factor_d = ceil($factor_d); //round up number to 2
$maximoyi = $factor_d * $i; //get new max value for y
if ( ($maximoyi/ $maximoyi_temp) > 2) $maximoyi = $maximoyi /2; //check if max value is too big, then split by 2
The algorithms above do not take into consideration the case where the range between the min and max values is too small. And what if these values are a lot higher than zero? Then we have the possibility of starting the y-axis at a value higher than zero. Also, to keep the line from sitting entirely at the top or the bottom of the graph, we have to give it some "air to breathe".
To cover those cases I wrote the code below (in PHP):
function calculateStartingPoint($min, $ticks, $times, $scale) {
$starting_point = $min - floor((($ticks - $times) * $scale)/2);
if ($starting_point < 0) {
$starting_point = 0;
} else {
$starting_point = floor($starting_point / $scale) * $scale;
$starting_point = ceil($starting_point / $scale) * $scale;
$starting_point = round($starting_point / $scale) * $scale;
}
return $starting_point;
}
function calculateYaxis($min, $max, $ticks = 7)
{
print "Min = " . $min . "\n";
print "Max = " . $max . "\n";
$range = $max - $min;
$step = floor($range/$ticks);
print "First step is " . $step . "\n";
$available_steps = array(5, 10, 20, 25, 30, 40, 50, 100, 150, 200, 300, 400, 500);
$distance = 1000;
$scale = 0;
foreach ($available_steps as $i) {
if (($i - $step < $distance) && ($i - $step > 0)) {
$distance = $i - $step;
$scale = $i;
}
}
print "Final scale step is " . $scale . "\n";
$times = floor($range/$scale);
print "range/scale = " . $times . "\n";
print "floor(times/2) = " . floor($times/2) . "\n";
$starting_point = calculateStartingPoint($min, $ticks, $times, $scale);
if ($starting_point + ($ticks * $scale) < $max) {
$ticks += 1;
}
print "starting_point = " . $starting_point . "\n";
// result calculation
$result = [];
for ($x = 0; $x <= $ticks; $x++) {
$result[] = $starting_point + ($x * $scale);
}
return $result;
}
For anyone who needs this in ES5 JavaScript: I've been wrestling with it a bit, but here it is:
var min=52;
var max=173;
var actualHeight=500; // 500 pixels high graph
var tickCount =Math.round(actualHeight/100);
// we want lines about every 100 pixels.
if(tickCount <3) tickCount =3;
var range=Math.abs(max-min);
var unroundedTickSize = range/(tickCount-1);
var x = Math.ceil(Math.log10(unroundedTickSize)-1);
var pow10x = Math.pow(10, x);
var roundedTickRange = Math.ceil(unroundedTickSize / pow10x) * pow10x;
var min_rounded=roundedTickRange * Math.floor(min/roundedTickRange);
var max_rounded= roundedTickRange * Math.ceil(max/roundedTickRange);
var nr=tickCount;
var str="";
for(var x=min_rounded;x<=max_rounded;x+=roundedTickRange)
{
str+=x+", ";
}
console.log("nice Y axis "+str);
Based on the excellent answer by Toon Krijthe.
This solution is based on a Java example I found.
const niceScale = ( minPoint, maxPoint, maxTicks) => {
const niceNum = ( localRange, round) => {
var exponent,fraction,niceFraction;
exponent = Math.floor(Math.log10(localRange));
fraction = localRange / Math.pow(10, exponent);
if (round) {
if (fraction < 1.5) niceFraction = 1;
else if (fraction < 3) niceFraction = 2;
else if (fraction < 7) niceFraction = 5;
else niceFraction = 10;
} else {
if (fraction <= 1) niceFraction = 1;
else if (fraction <= 2) niceFraction = 2;
else if (fraction <= 5) niceFraction = 5;
else niceFraction = 10;
}
return niceFraction * Math.pow(10, exponent);
}
const result = [];
const range = niceNum(maxPoint - minPoint, false);
const stepSize = niceNum(range / (maxTicks - 1), true);
const lBound = Math.floor(minPoint / stepSize) * stepSize;
const uBound = Math.ceil(maxPoint / stepSize) * stepSize;
for(let i=lBound;i<=uBound;i+=stepSize) result.push(i);
return result;
};
console.log(niceScale(15,234,6));
// > [0, 100, 200, 300]
Based on @Gamecat's algorithm, I produced the following helper class:
public struct Interval
{
public readonly double Min, Max, TickRange;
public static Interval Find(double min, double max, int tickCount, double padding = 0.05)
{
double range = max - min;
max += range*padding;
min -= range*padding;
var attempts = new List<Interval>();
for (int i = tickCount; i > tickCount / 2; --i)
attempts.Add(new Interval(min, max, i));
return attempts.MinBy(a => a.Max - a.Min);
}
private Interval(double min, double max, int tickCount)
{
var candidates = (min <= 0 && max >= 0 && tickCount <= 8) ? new[] {2, 2.5, 3, 4, 5, 7.5, 10} : new[] {2, 2.5, 5, 10};
double unroundedTickSize = (max - min) / (tickCount - 1);
double x = Math.Ceiling(Math.Log10(unroundedTickSize) - 1);
double pow10X = Math.Pow(10, x);
TickRange = RoundUp(unroundedTickSize/pow10X, candidates) * pow10X;
Min = TickRange * Math.Floor(min / TickRange);
Max = TickRange * Math.Ceiling(max / TickRange);
}
// 1 < scaled <= 10
private static double RoundUp(double scaled, IEnumerable<double> candidates)
{
return candidates.First(candidate => scaled <= candidate);
}
}
A demo of accepted answer
function tickEvery(range, ticks) {
return Math.ceil((range / ticks) / Math.pow(10, Math.ceil(Math.log10(range / ticks) - 1))) * Math.pow(10, Math.ceil(Math.log10(range / ticks) - 1));
}
function update() {
const range = document.querySelector("#range").value;
const ticks = document.querySelector("#ticks").value;
const result = tickEvery(range, ticks);
document.querySelector("#result").textContent = `With range ${range} and ${ticks} ticks, tick every ${result} for a total of ${Math.ceil(range / result)} ticks at ${new Array(Math.ceil(range / result)).fill(0).map((v, n) => Math.round(n * result)).join(", ")}`;
}
update();
<input id="range" min="1" max="10000" oninput="update()" style="width:100%" type="range" value="5000" width="40" />
<br/>
<input id="ticks" min="1" max="20" oninput="update()" type="range" style="width:100%" value="10" />
<p id="result" style="font-family:sans-serif"></p>

Algorithm for "nice" grid line intervals on a graph

I need a reasonably smart algorithm to come up with "nice" grid lines for a graph (chart).
For example, assume a bar chart with values of 10, 30, 72 and 60. You know:
Min value: 10
Max value: 72
Range: 62
The first question is: what do you start from? In this case, 0 would be the intuitive value but this won't hold up on other data sets so I'm guessing:
Grid min value should be either 0 or a "nice" value lower than the min value of the data in range. Alternatively, it can be specified.
Grid max value should be a "nice" value above the max value in the range. Alternatively, it can be specified (eg you might want 0 to 100 if you're showing percentages, irrespective of the actual values).
The number of grid lines (ticks) in the range should be either specified or a number within a given range (eg 3-8) such that the values are "nice" (ie round numbers) and you maximise use of the chart area. In our example, 80 would be a sensible max as that would use 90% of the chart height (72/80) whereas 100 would create more wasted space.
Anyone know of a good algorithm for this? Language is irrelevant as I'll implement it in what I need to.
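The "maximise use of the chart area" criterion can be made concrete with a small Python sketch. The function name `pick_axis_max` is hypothetical; it just picks, among candidate nice maxima, the one for which the tallest bar fills the largest share of the axis.

```python
def pick_axis_max(data_max, candidates):
    """Pick the candidate grid maximum that wastes the least chart height."""
    usable = [c for c in candidates if c >= data_max]  # the axis must cover the data
    # Utilization is data_max / axis_max; a larger value means less empty space.
    return max(usable, key=lambda c: data_max / c)

best = pick_axis_max(72, [80, 100])
print(best, 72 / best)  # 80 0.9
```

If no candidate covers the data, `max` raises a ValueError, so the candidate list has to extend above the largest value.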
I've done this with kind of a brute force method. First, figure out the maximum number of tick marks you can fit into the space. Divide the total range of values by the number of ticks; this is the minimum spacing of the tick. Now calculate the floor of the logarithm base 10 to get the magnitude of the tick, and divide the minimum spacing by this magnitude. You should end up with something in the range of 1 to 10. Simply choose the round number greater than or equal to the value and multiply it by the magnitude calculated earlier. This is your final tick spacing.
Example in Python:
import math

def BestTick(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum, 10))
    residual = minimum / magnitude
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick
Edit: you are free to alter the selection of "nice" intervals. One commenter appears to be dissatisfied with the selections provided, because the actual number of ticks can be up to 2.5 times less than the maximum. Here's a slight modification that defines a table for the nice intervals. In the example, I've expanded the selections so that the number of ticks won't be less than 3/5 of the maximum.
import bisect

def BestTick2(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum, 10))
    residual = minimum / magnitude
    # this table must begin with 1 and end with 10
    table = [1, 1.5, 2, 3, 5, 7, 10]
    tick = table[bisect.bisect_right(table, residual)] if residual < 10 else 10
    return tick * magnitude
There are 2 pieces to the problem:
Determine the order of magnitude involved, and
Round to something convenient.
You can handle the first part by using logarithms:
range = max - min;
exponent = int(log10(range)); // See comment below.
magnitude = pow(10, exponent);
So, for example, if your range is from 50 - 1200, the exponent is 3 and the magnitude is 1000.
Then deal with the second part by deciding how many subdivisions you want in your grid:
value_per_division = magnitude / subdivisions;
This is a rough calculation because the exponent has been truncated to an integer. You may want to tweak the exponent calculation to handle boundary conditions better, e.g. by rounding instead of taking the int() if you end up with too many subdivisions.
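A runnable Python version of the two steps, assuming a base-10 logarithm is intended (the worked example, exponent 3 for a range of 1150, only holds in base 10). The function name and the choice of 4 subdivisions are illustrative.

```python
import math

def value_per_division(low, high, subdivisions):
    value_range = high - low
    # int() truncates toward zero, which is the rough part the answer warns
    # about; for ranges below 1 it differs from math.floor().
    exponent = int(math.log10(value_range))
    magnitude = 10 ** exponent
    return magnitude / subdivisions

print(value_per_division(50, 1200, 4))  # magnitude 1000 -> 250.0 per division
```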
I use the following algorithm. It's similar to others posted here but it's the first example in C#.
public static class AxisUtil
{
public static float CalcStepSize(float range, float targetSteps)
{
// calculate an initial guess at step size
var tempStep = range/targetSteps;
// get the magnitude of the step size
var mag = (float)Math.Floor(Math.Log10(tempStep));
var magPow = (float)Math.Pow(10, mag);
// calculate most significant digit of the new step size
var magMsd = (int)(tempStep/magPow + 0.5);
// promote the MSD to either 1, 2, or 5
if (magMsd > 5)
magMsd = 10;
else if (magMsd > 2)
magMsd = 5;
else if (magMsd > 1)
magMsd = 2;
return magMsd*magPow;
}
}
CPAN provides an implementation here (see source link)
See also Tickmark algorithm for a graph axis
FYI, with your sample data:
Maple: Min=8, Max=74, Labels=10,20,..,60,70, Ticks=10,12,14,..70,72
MATLAB: Min=10, Max=80, Labels=10,20,..,60,80
Here's another implementation in JavaScript:
var calcStepSize = function(range, targetSteps)
{
// calculate an initial guess at step size
var tempStep = range / targetSteps;
// get the magnitude of the step size
var mag = Math.floor(Math.log(tempStep) / Math.LN10);
var magPow = Math.pow(10, mag);
// calculate most significant digit of the new step size
var magMsd = Math.floor(tempStep / magPow + 0.5); // round to nearest, like the (int) cast in the C# version
// promote the MSD to either 1, 2, or 5
if (magMsd > 5.0)
magMsd = 10.0;
else if (magMsd > 2.0)
magMsd = 5.0;
else if (magMsd > 1.0)
magMsd = 2.0;
return magMsd * magPow;
};
I am the author of "Algorithm for Optimal Scaling on a Chart Axis". It used to be hosted on trollop.org, but I have recently moved domains/blogging engines.
Please see my answer to a related question.
Taken from Mark above, a slightly more complete util class in C# that also calculates a suitable first and last tick.
public class AxisAssists
{
public double Tick { get; private set; }
public AxisAssists(double aTick)
{
Tick = aTick;
}
public AxisAssists(double range, int mostticks)
{
var minimum = range / mostticks;
var magnitude = Math.Pow(10.0, (Math.Floor(Math.Log(minimum) / Math.Log(10))));
var residual = minimum / magnitude;
if (residual > 5)
{
Tick = 10 * magnitude;
}
else if (residual > 2)
{
Tick = 5 * magnitude;
}
else if (residual > 1)
{
Tick = 2 * magnitude;
}
else
{
Tick = magnitude;
}
}
public double GetClosestTickBelow(double v)
{
return Tick* Math.Floor(v / Tick);
}
public double GetClosestTickAbove(double v)
{
return Tick * Math.Ceiling(v / Tick);
}
}
You can create an instance and keep it, but if you just want to calculate the tick and throw the instance away:
double tickX = new AxisAssists(aMaxX - aMinX, 8).Tick;
I wrote an Objective-C method to return a nice axis scale and nice ticks for given min and max values of your data set:
- (NSArray*)niceAxis:(double)minValue :(double)maxValue
{
double min_ = 0, max_ = 0, min = minValue, max = maxValue, power = 0, factor = 0, tickWidth, minAxisValue = 0, maxAxisValue = 0;
NSArray *factorArray = [NSArray arrayWithObjects:@"0.0f",@"1.2f",@"2.5f",@"5.0f",@"10.0f",nil];
NSArray *scalarArray = [NSArray arrayWithObjects:@"0.2f",@"0.2f",@"0.5f",@"1.0f",@"2.0f",nil];
// calculate x-axis nice scale and ticks
// 1. min_
if (min == 0) {
min_ = 0;
}
else if (min > 0) {
min_ = MAX(0, min-(max-min)/100);
}
else {
min_ = min-(max-min)/100;
}
// 2. max_
if (max == 0) {
if (min == 0) {
max_ = 1;
}
else {
max_ = 0;
}
}
else if (max < 0) {
max_ = MIN(0, max+(max-min)/100);
}
else {
max_ = max+(max-min)/100;
}
// 3. power
power = log(max_ - min_) / log(10);
// 4. factor
factor = pow(10, power - floor(power));
// 5. nice ticks
for (NSInteger i = 0; factor > [[factorArray objectAtIndex:i]doubleValue] ; i++) {
tickWidth = [[scalarArray objectAtIndex:i]doubleValue] * pow(10, floor(power));
}
// 6. min-axisValues
minAxisValue = tickWidth * floor(min_/tickWidth);
// 7. max-axisValues
maxAxisValue = tickWidth * floor((max_/tickWidth)+1);
// 8. create NSArray to return
NSArray *niceAxisValues = [NSArray arrayWithObjects:[NSNumber numberWithDouble:minAxisValue], [NSNumber numberWithDouble:maxAxisValue],[NSNumber numberWithDouble:tickWidth], nil];
return niceAxisValues;
}
You can call the method like this:
NSArray *niceYAxisValues = [self niceAxis:-maxy :maxy];
and get your axis setup:
double minYAxisValue = [[niceYAxisValues objectAtIndex:0]doubleValue];
double maxYAxisValue = [[niceYAxisValues objectAtIndex:1]doubleValue];
double ticksYAxis = [[niceYAxisValues objectAtIndex:2]doubleValue];
Just in case you want to limit the number of axis ticks do this:
NSInteger maxNumberOfTicks = 9;
NSInteger numberOfTicks = valueXRange / ticksXAxis;
NSInteger newNumberOfTicks = floor(numberOfTicks / (1 + floor(numberOfTicks/(maxNumberOfTicks+0.5))));
double newTicksXAxis = ticksXAxis * (1 + floor(numberOfTicks/(maxNumberOfTicks+0.5)));
The first part of the code is based on the calculation I found here for computing a nice graph axis scale and ticks similar to Excel graphs. It works excellently for all kinds of data sets.
Another idea is to have the range of the axis be the range of the values, but put the tick marks at the appropriate position.. i.e. for 7 to 22 do:
[- - - | - - - - | - - - - | - - ]
10 15 20
As for selecting the tick spacing, I would suggest any number of the form 10^x * i / n, where i < n, and 0 < n < 10. Generate this list, and sort them, and you can find the largest number smaller than value_per_division (as in adam_liss) using a binary search.
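A sketch of that candidate-list idea in Python, using exact fractions so that equal values such as 1/4 and 2/8 collapse to one entry. The function names and the exponent range passed in are assumptions made for this example.

```python
import bisect
from fractions import Fraction

def candidate_spacings(min_exp, max_exp):
    """All values 10^x * i / n with i < n and 0 < n < 10, sorted ascending."""
    candidates = set()
    for x in range(min_exp, max_exp + 1):
        for n in range(1, 10):
            for i in range(1, n):
                candidates.add(Fraction(10) ** x * Fraction(i, n))
    return sorted(candidates)

def largest_spacing_below(value, candidates):
    """Binary search for the largest candidate strictly smaller than value."""
    index = bisect.bisect_left(candidates, value)
    return candidates[index - 1] if index > 0 else None

candidates = candidate_spacings(0, 2)
print(float(largest_spacing_below(27.375, candidates)))  # 25.0
```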
Using a lot of inspiration from answers already available here, here's my implementation in C. Note that there's some extensibility built into the ndex array.
float findNiceDelta(float maxvalue, int count)
{
float step = maxvalue/count,
order = powf(10, floorf(log10(step))),
delta = (int)(step/order + 0.5);
static float ndex[] = {1, 1.5, 2, 2.5, 5, 10};
static int ndexLenght = sizeof(ndex)/sizeof(float);
for(int i = ndexLenght - 2; i > 0; --i)
if(delta > ndex[i]) return ndex[i + 1] * order;
return delta*order;
}
In R, use
tickSize <- function(range,minCount){
logMaxTick <- log10(range/minCount)
exponent <- floor(logMaxTick)
mantissa <- 10^(logMaxTick-exponent)
af <- c(1,2,5) # allowed factors
mantissa <- af[findInterval(mantissa,af)]
return(mantissa*10^exponent)
}
where range argument is max-min of domain.
Here is a JavaScript function I wrote to round grid intervals (max-min)/gridLinesNumber to beautiful values. It works with any numbers; see the gist with detailed comments to find out how it works and how to call it.
var ceilAbs = function(num, to, bias) {
if (to == undefined) to = [-2, -5, -10]
if (bias == undefined) bias = 0
var numAbs = Math.abs(num) - bias
var exp = Math.floor( Math.log10(numAbs) )
if (typeof to == 'number') {
return Math.sign(num) * to * Math.ceil(numAbs/to) + bias
}
var mults = to.filter(function(value) {return value > 0})
to = to.filter(function(value) {return value < 0}).map(Math.abs)
var m = Math.abs(numAbs) * Math.pow(10, -exp)
var mRounded = Infinity
for (var i=0; i<mults.length; i++) {
var candidate = mults[i] * Math.ceil(m / mults[i])
if (candidate < mRounded)
mRounded = candidate
}
for (var i=0; i<to.length; i++) {
if (to[i] >= m && to[i] < mRounded)
mRounded = to[i]
}
return Math.sign(num) * mRounded * Math.pow(10, exp) + bias
}
Calling ceilAbs(number, [0.5]) for different numbers will round numbers like that:
301573431.1193228 -> 350000000
14127.786597236991 -> 15000
-63105746.17236853 -> -65000000
-718854.2201183736 -> -750000
-700660.340487957 -> -750000
0.055717507097870114 -> 0.06
0.0008068701205775142 -> 0.00085
-8.66660070605576 -> -9
-400.09256079792976 -> -450
0.0011740548815578223 -> 0.0015
-5.3003294346854085e-8 -> -6e-8
-0.00005815960629843176 -> -0.00006
-742465964.5184875 -> -750000000
-81289225.90985894 -> -85000000
0.000901771713513881 -> 0.00095
-652726598.5496342 -> -700000000
-0.6498901364393532 -> -0.65
0.9978325804695487 -> 1
5409.4078950583935 -> 5500
26906671.095639467 -> 30000000
Check out the fiddle to experiment with the code. The code in the answer, the gist, and the fiddle differs slightly; I'm using the one given in the answer.
If you are trying to get the scales looking right on VB.NET charts, I've used the example from Adam Liss, but make sure that when you set the min and max scale values you pass them in from a variable of type Decimal (not Single or Double); otherwise the tick mark values end up with something like 8 extra decimal places.
So as an example, I had 1 chart where I set the min Y Axis value to 0.0001 and the max Y Axis value to 0.002.
If I pass these values to the chart object as singles I get tick mark values of 0.00048000001697801, 0.000860000036482233 ....
Whereas if I pass these values to the chart object as decimals I get nice tick mark values of 0.00048, 0.00086 ......
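The same effect can be reproduced outside VB.NET. The Python sketch below (the helper `to_single` is a made-up name) rounds a value to single precision with the standard `struct` module and compares it with an exact decimal value; it illustrates the representation issue only, not the chart control itself.

```python
import struct
from decimal import Decimal

def to_single(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

tick_as_single = to_single(0.00048)
tick_as_decimal = Decimal("0.00048")

print(repr(tick_as_single))  # carries extra digits beyond 0.00048
print(tick_as_decimal)       # 0.00048
```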
In python:
import numpy as np
steps = [np.round(x) for x in np.linspace(min, max, num=num_of_steps)]
Here is an answer, written in Go, that can optionally always plot 0, handles positive and negative values as well as small and large numbers, and returns the tick interval size and the number of intervals to plot.
forcePlotZero changes how the max value is rounded, so that it always lands on a nice multiple from which the ticks can step back to zero. Example:
if forcePlotZero == false then 237 --> 240
if forcePlotZero == true then 237 --> 300
Intervals are calculated by taking the multiple of 10/100/1000 etc. for max and then subtracting until the cumulative total of these subtractions is < min.
Here's the output from the function for several inputs, together with the forcePlotZero setting used:
Force to plot zero | max and min inputs | rounded max and min | intervals
forcePlotZero=false | min: -104 max: 240 | minned: -160 maxed: 240 | intervalCount: 5 intervalSize: 100
forcePlotZero=true | min: -104 max: 240 | minned: -200 maxed: 300 | intervalCount: 6 intervalSize: 100
forcePlotZero=false | min: 40 max: 1240 | minned: 0 maxed: 1300 | intervalCount: 14 intervalSize: 100
forcePlotZero=false | min: 200 max: 240 | minned: 190 maxed: 240 | intervalCount: 6 intervalSize: 10
forcePlotZero=false | min: 0.7 max: 1.12 | minned: 0.6 maxed: 1.2 | intervalCount: 7 intervalSize: 0.1
forcePlotZero=false | min: -70.5 max: -12.5 | minned: -80 maxed: -10 | intervalCount: 8 intervalSize: 10
Here's the playground link https://play.golang.org/p/1IhiX_hRQvo
func getMaxMinIntervals(max float64, min float64, forcePlotZero bool) (maxRounded float64, minRounded float64, intervalCount float64, intervalSize float64) {
    // STEP 1: start off determining the maxRounded value for the axis
    precision := 0.0
    precisionDampener := 0.0 // adjusts to prevent 235 going to 300, instead dampens the scaling to get 240
    epsilon := 0.0000001
    if math.Abs(max) >= 0 && math.Abs(max) < 2 {
        precision = math.Floor(-math.Log10(epsilon + math.Abs(max) - math.Floor(math.Abs(max)))) // counting number of zeros between decimal point and rightward digits
        precisionDampener = 1
        precision = precision + precisionDampener
    } else if math.Abs(max) >= 2 && math.Abs(max) < 100 {
        precision = math.Ceil(math.Log10(math.Abs(max)+1)) * -1 // else count number of digits before decimal point
        precisionDampener = 1
        precision = precision + precisionDampener
    } else {
        precision = math.Ceil(math.Log10(math.Abs(max)+1)) * -1 // else count number of digits before decimal point
        precisionDampener = 2
        if forcePlotZero {
            precisionDampener = 1
        }
        precision = precision + precisionDampener
    }
    useThisFactorForIntervalCalculation := 0.0 // this is needed because intervals are calculated from the max value with a zero origin, this uses range for min - max
    if max < 0 {
        maxRounded = (math.Floor(math.Abs(max)*(math.Pow10(int(precision)))) / math.Pow10(int(precision)) * -1)
        useThisFactorForIntervalCalculation = (math.Floor(math.Abs(max)*(math.Pow10(int(precision)))) / math.Pow10(int(precision))) + ((math.Ceil(math.Abs(min)*(math.Pow10(int(precision)))) / math.Pow10(int(precision))) * -1)
    } else {
        maxRounded = math.Ceil(max*(math.Pow10(int(precision)))) / math.Pow10(int(precision))
        useThisFactorForIntervalCalculation = maxRounded
    }
    minNumberOfIntervals := 2.0
    maxNumberOfIntervals := 19.0
    intervalSize = 0.001
    intervalCount = minNumberOfIntervals
    // STEP 2: get interval size (the step size on the axis)
    for {
        if math.Abs(useThisFactorForIntervalCalculation)/intervalSize < minNumberOfIntervals || math.Abs(useThisFactorForIntervalCalculation)/intervalSize > maxNumberOfIntervals {
            intervalSize = intervalSize * 10
        } else {
            break
        }
    }
    // STEP 3: check that intervals are not too large, safety for max and min values that are close together (240, 220 etc)
    for {
        if max-min < intervalSize {
            intervalSize = intervalSize / 10
        } else {
            break
        }
    }
    // STEP 4: now we can get minRounded by adding the interval size to 0 till we get to the point where another increment would make cumulative increments > min, opposite for negative in
    minRounded = 0.0
    if min >= 0 {
        for {
            if minRounded < min {
                minRounded = minRounded + intervalSize
            } else {
                minRounded = minRounded - intervalSize
                break
            }
        }
    } else {
        minRounded = maxRounded // keep going down, decreasing by the interval size till minRounded < min
        for {
            if minRounded > min {
                minRounded = minRounded - intervalSize
            } else {
                break
            }
        }
    }
    // STEP 5: get number of intervals to draw
    intervalCount = (maxRounded - minRounded) / intervalSize
    intervalCount = math.Ceil(intervalCount) + 1 // include the origin as an interval
    // STEP 6: Check that the intervalCount isn't too high
    if intervalCount-1 >= (intervalSize * 2) && intervalCount > maxNumberOfIntervals {
        intervalCount = math.Ceil(intervalCount / 2)
        intervalSize *= 2
    }
    return
}
This is in Python, for base 10.
It doesn't cover all your questions, but I think you can build on it.
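import numpy as np

def create_ticks(lo, hi):
    # step size: the largest power of 10 that does not exceed the range
    s = 10 ** np.floor(np.log10(hi - lo))
    # snap both ends outward to multiples of the step
    start = s * np.floor(lo / s)
    end = s * np.ceil(hi / s)
    ticks = []
    t = start
    # the s / 2 tolerance keeps `end` in the list despite float rounding
    while t < end + s / 2:
        ticks.append(t)
        t = t + s
    return ticks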
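One way to build on this, sketched here with only the standard math module rather than NumPy, is to allow steps of 1, 2 or 5 times a power of ten, so the number of intervals stays close to a target. nice_ticks and target_intervals are hypothetical names, and the 1-2-5 rule is a common charting convention rather than something taken from the answer above.

```python
import math

def nice_ticks(lo, hi, target_intervals=10):
    """Ticks covering [lo, hi] with a step of 1, 2 or 5 times a power of ten.

    Hypothetical extension; requires hi > lo.
    """
    raw = (hi - lo) / target_intervals          # smallest acceptable step
    magnitude = 10 ** math.floor(math.log10(raw))
    for mult in (1, 2, 5, 10):
        step = mult * magnitude
        if step >= raw:
            break
    first = math.floor(lo / step)               # index of the tick at or below lo
    last = math.ceil(hi / step)                 # index of the tick at or above hi
    # Multiply indices by the step instead of accumulating, to limit float drift.
    return [i * step for i in range(first, last + 1)]

print(nice_ticks(0, 1357))  # [0, 200, 400, 600, 800, 1000, 1200, 1400]
```

For the 0 to 1357 chart from the question, this tops the axis out at 1400.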

Resources