Choosing an attractive linear scale for a graph's Y Axis

I'm writing a bit of code to display a bar (or line) graph in our software. Everything's going fine. The thing that's got me stumped is labeling the Y axis.
The caller can tell me how finely they want the Y scale labeled, but I seem to be stuck on exactly what to label them in an "attractive" kind of way. I can't describe "attractive", and probably neither can you, but we know it when we see it, right?
So if the data points are:
15, 234, 140, 65, 90
And the user asks for 10 labels on the Y axis, a little bit of finagling with paper and pencil comes up with:
0, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250
So there's 10 there (not including 0), the last one extends just beyond the highest value (234 < 250), and it's a "nice" increment of 25 each. If they asked for 8 labels, an increment of 30 would have looked nice:
0, 30, 60, 90, 120, 150, 180, 210, 240
Nine would have been tricky. Maybe just using either 8 or 10 and calling it close enough would be okay. And what should happen when some of the points are negative?
I can see Excel tackles this problem nicely.
Does anyone know a general-purpose algorithm (even some brute force is okay) for solving this? I don't have to do it quickly, but it should look nice.

A long time ago I wrote a graph module that covered this nicely. Digging in the grey matter turns up the following:
Determine the lower and upper bound of the data. (Beware of the special case where lower bound = upper bound!)
Divide range into the required amount of ticks.
Round the tick range up into nice amounts.
Adjust the lower and upper bound accordingly.
Let's take your example:
15, 234, 140, 65, 90 with 10 ticks
lower bound = 15
upper bound = 234
range = 234-15 = 219
tick range = 21.9. This should be 25.0
new lower bound = 25 * floor(15/25) = 0
new upper bound = 25 * ceil(234/25) = 250
So the range = 0,25,50,...,225,250
You can get the nice tick range with the following steps:
divide by 10^x such that the result lies between 0.1 and 1.0 (including 0.1 excluding 1).
translate accordingly:
0.1 -> 0.1
<= 0.2 -> 0.2
<= 0.25 -> 0.25
<= 0.3 -> 0.3
<= 0.4 -> 0.4
<= 0.5 -> 0.5
<= 0.6 -> 0.6
<= 0.7 -> 0.7
<= 0.75 -> 0.75
<= 0.8 -> 0.8
<= 0.9 -> 0.9
<= 1.0 -> 1.0
multiply by 10^x.
In this case, 21.9 is divided by 10^2 to get 0.219. This is <= 0.25 so we now have 0.25. Multiplied by 10^2 this gives 25.
Let's take a look at the same example with 8 ticks:
15, 234, 140, 65, 90 with 8 ticks
lower bound = 15
upper bound = 234
range = 234-15 = 219
tick range = 27.375
Divide by 10^2 for 0.27375, which translates to 0.3, which gives (multiplied by 10^2) 30.
new lower bound = 30 * floor(15/30) = 0
new upper bound = 30 * ceil(234/30) = 240
Which gives the result you requested ;-).
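The steps above can be sketched in Python roughly as follows. This is only an illustration: the names `nice_tick_range` and `nice_axis` are made up for this sketch, the padding of 1 for flat data is an assumption, and the bounds use floor and ceil so that the axis always encloses the data.

```python
import math

# Breakpoints from the translation table above, on the [0.1, 1.0] scale.
NICE_FRACTIONS = [0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9, 1.0]


def nice_tick_range(raw_tick):
    """Round a positive raw tick size up to the next 'nice' value."""
    # Choose x so that raw_tick / 10**x falls in [0.1, 1.0).
    x = math.floor(math.log10(raw_tick)) + 1
    scaled = raw_tick / 10 ** x
    for fraction in NICE_FRACTIONS:
        if scaled <= fraction:
            return fraction * 10 ** x
    return 10 ** x  # guard against floating-point edge cases


def nice_axis(values, tick_count):
    """Return (lower bound, upper bound, tick range) for the given data."""
    lower, upper = min(values), max(values)
    if lower == upper:
        # Special case: flat data. Padding by 1 is an arbitrary assumption.
        lower, upper = lower - 1, upper + 1
    tick = nice_tick_range((upper - lower) / tick_count)
    new_lower = tick * math.floor(lower / tick)
    new_upper = tick * math.ceil(upper / tick)
    return new_lower, new_upper, tick


print(nice_axis([15, 234, 140, 65, 90], 10))  # (0.0, 250.0, 25.0)
```

With 8 ticks the same data gives a tick range of about 30 and bounds of 0 and 240, matching the second worked example.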
------ Added by KD ------
Here's code that achieves this algorithm without using lookup tables, etc...:
double range = ...;
int tickCount = ...;
double unroundedTickSize = range/(tickCount-1);
double x = Math.ceil(Math.log10(unroundedTickSize)-1);
double pow10x = Math.pow(10, x);
double roundedTickRange = Math.ceil(unroundedTickSize / pow10x) * pow10x;
return roundedTickRange;
Generally speaking, the number of ticks includes the bottom tick, so the actual y-axis segments are one less than the number of ticks.

Here is a PHP example I am using. This function returns an array of pretty Y axis values that encompass the min and max Y values passed in. Of course, this routine could also be used for X axis values.
It allows you to "suggest" how many ticks you might want, but the routine will return what looks good. I have added some sample data and shown the results for these.
#!/usr/bin/php -q
<?php
function makeYaxis($yMin, $yMax, $ticks = 10)
{
// This routine creates the Y axis values for a graph.
//
// Calculate Min and Max graphical labels and graph
// increments. The number of ticks defaults to
// 10 which is the SUGGESTED value. Any tick value
// entered is used as a suggested value which is
// adjusted to be a 'pretty' value.
//
// Output will be an array of the Y axis values that
// encompass the Y values.
$result = array();
// If yMin and yMax are identical, then
// adjust the yMin and yMax values to actually
// make a graph. Also avoids division by zero errors.
if($yMin == $yMax)
{
$yMin = $yMin - 10; // some small value
$yMax = $yMax + 10; // some small value
}
// Determine Range
$range = $yMax - $yMin;
// Adjust ticks if needed
if($ticks < 2)
$ticks = 2;
else if($ticks > 2)
$ticks -= 2;
// Get raw step value
$tempStep = $range/$ticks;
// Calculate pretty step value
$mag = floor(log10($tempStep));
$magPow = pow(10,$mag);
$magMsd = (int)($tempStep/$magPow + 0.5);
$stepSize = $magMsd*$magPow;
// build Y label array.
// Lower and upper bounds calculations
$lb = $stepSize * floor($yMin/$stepSize);
$ub = $stepSize * ceil(($yMax/$stepSize));
// Build array
$val = $lb;
while(1)
{
$result[] = $val;
$val += $stepSize;
if($val > $ub)
break;
}
return $result;
}
// Create some sample data for demonstration purposes
$yMin = 60;
$yMax = 330;
$scale = makeYaxis($yMin, $yMax);
print_r($scale);
$scale = makeYaxis($yMin, $yMax,5);
print_r($scale);
$yMin = 60847326;
$yMax = 73425330;
$scale = makeYaxis($yMin, $yMax);
print_r($scale);
?>
Result output from sample data
# ./test1.php
Array
(
[0] => 60
[1] => 90
[2] => 120
[3] => 150
[4] => 180
[5] => 210
[6] => 240
[7] => 270
[8] => 300
[9] => 330
)
Array
(
[0] => 0
[1] => 90
[2] => 180
[3] => 270
[4] => 360
)
Array
(
[0] => 60000000
[1] => 62000000
[2] => 64000000
[3] => 66000000
[4] => 68000000
[5] => 70000000
[6] => 72000000
[7] => 74000000
)

Try this code. I've used it in a few charting scenarios and it works well. It's pretty fast too.
public static class AxisUtil
{
public static float CalculateStepSize(float range, float targetSteps)
{
// calculate an initial guess at step size
float tempStep = range/targetSteps;
// get the magnitude of the step size
float mag = (float)Math.Floor(Math.Log10(tempStep));
float magPow = (float)Math.Pow(10, mag);
// calculate most significant digit of the new step size
float magMsd = (int)(tempStep/magPow + 0.5);
// promote the MSD to either 1, 2, or 5
if (magMsd > 5.0)
magMsd = 10.0f;
else if (magMsd > 2.0)
magMsd = 5.0f;
else if (magMsd > 1.0)
magMsd = 2.0f;
return magMsd*magPow;
}
}
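Since the question also asks about negative values, here is a minimal Python sketch of the same 1-2-5 step rule combined with floor/ceil bounds, which work for negative data too. The function names `step_size_125` and `axis_labels` are hypothetical, and the sketch assumes a non-zero range.

```python
import math


def step_size_125(value_range, target_steps):
    """Pick a step of the form 1, 2 or 5 times a power of ten."""
    raw = value_range / target_steps
    magnitude = 10 ** math.floor(math.log10(raw))
    msd = int(raw / magnitude + 0.5)  # leading digit, rounded
    # Promote the leading digit to 2, 5 or 10 (1 stays as it is).
    if msd > 5:
        msd = 10
    elif msd > 2:
        msd = 5
    elif msd > 1:
        msd = 2
    return msd * magnitude


def axis_labels(lo, hi, target_steps):
    """Labels at multiples of the step that enclose [lo, hi]."""
    step = step_size_125(hi - lo, target_steps)
    first = math.floor(lo / step)  # floor also rounds negative bounds outward
    last = math.ceil(hi / step)
    return [i * step for i in range(first, last + 1)]


print(axis_labels(-37, 82, 5))  # [-40, -20, 0, 20, 40, 60, 80, 100]
```

Because the labels are generated as integer multiples of the step, zero appears as a label whenever the data straddles it.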

Sounds like the caller doesn't tell you the ranges it wants.
So you are free to change the end points until the range is nicely divisible by your label count.
Let's define "nice". I would call the labels nice if they are spaced by:
1. 2^n, for some integer n, e.g. ..., .25, .5, 1, 2, 4, 8, 16, ...
2. 10^n, for some integer n, e.g. ..., .01, .1, 1, 10, 100
3. n where n % 5 == 0, for some positive integer n, e.g. 5, 10, 15, 20, 25, ...
4. n where n % 2 == 0, for some positive integer n, e.g. 2, 4, 6, 8, 10, 12, 14, ...
Find the max and min of your data series. Let's call these points:
min_point and max_point.
Now all you need to do is find 3 values:
- start_label, where start_label < min_point and start_label is an integer
- end_label, where end_label > max_point and end_label is an integer
- label_offset, where label_offset is "nice"
that fit the equation:
(end_label - start_label)/label_offset == label_count
There are probably many solutions, so just pick one. Most of the time I bet you can set start_label to 0, so just try different integer end_label values until the offset is "nice".
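A brute-force version of this search might look like the Python sketch below. It is only an illustration under stated assumptions: offsets are restricted to positive integers, `start_label` defaults to 0 (so `min_point` is assumed to be positive), and the names `is_nice` and `find_labels` are invented here.

```python
import math


def is_nice(offset):
    """'Nice' per the list above, restricted to positive integer offsets."""
    # 1 covers 2^0 and 10^0; even numbers and multiples of 5 cover the rest.
    return offset == 1 or offset % 2 == 0 or offset % 5 == 0


def find_labels(min_point, max_point, label_count, start_label=0):
    """Return (start_label, end_label, label_offset) for the first fit found."""
    # Smallest integer end label strictly above max_point.
    end_label = math.floor(max_point) + 1
    while True:
        span = end_label - start_label
        if span % label_count == 0 and is_nice(span // label_count):
            return start_label, end_label, span // label_count
        end_label += 1


print(find_labels(15, 234, 8))   # (0, 240, 30)
print(find_labels(15, 234, 10))  # (0, 240, 24)
```

Note that with 10 labels this definition of "nice" accepts an offset of 24, since any even number qualifies. A stricter definition would push the search on to 250 and an offset of 25.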

I'm still battling with this :)
The original Gamecat answer does seem to work most of the time, but try plugging in say, "3 ticks" as the number of ticks required (for the same data values 15, 234, 140, 65, 90)....it seems to give a tick range of 73, which after dividing by 10^2 yields 0.73, which maps to 0.75, which gives a 'nice' tick range of 75.
Then calculating upper bound:
75*round(1+234/75) = 300
and the lower bound:
75 * round(15/75) = 0
But clearly if you start at 0, and proceed in steps of 75 up to the upper bound of 300, you end up with 0,75,150,225,300
....which is no doubt useful, but it's 4 ticks (not including 0) not the 3 ticks required.
Just frustrating that it doesn't work 100% of the time....which could well be down to my mistake somewhere of course!

The answer by Toon Krijthe does work most of the time. But sometimes it produces an excess number of ticks, and it does not work with negative numbers either. The overall approach to the problem is OK, but there is a better way to handle this. The algorithm you want to use depends on what you really want to get. Below I present the code I used in my JS plotting library. I've tested it and it always works (hopefully ;) ). Here are the major steps:
get the global extrema xMin and xMax (include all the plots you want to print in the algorithm)
calculate the range between xMin and xMax
calculate the order of magnitude of your range
calculate the tick size by dividing the range by the number of ticks minus one
this one is optional. If you want the zero tick always printed, use the tick size to calculate the number of positive and negative ticks. The total number of ticks will be their sum + 1 (the zero tick)
this one is not needed if you always print the zero tick. Calculate the lower and upper bound, but remember to center the plot
Let's start. First, the basic calculations:
var range = Math.abs(xMax - xMin); //both can be negative
var rangeOrder = Math.floor(Math.log10(range)) - 1;
var power10 = Math.pow(10, rangeOrder);
var maxRound = (xMax > 0) ? Math.ceil(xMax / power10) : Math.floor(xMax / power10);
var minRound = (xMin < 0) ? Math.floor(xMin / power10) : Math.ceil(xMin / power10);
I round the minimum and maximum values to be 100% sure that my plot will cover all the data. It is also very important to floor the log10 of the range, whether or not it is negative, and subtract 1 afterwards. Otherwise the algorithm won't work for numbers that are less than one.
var fullRange = Math.abs(maxRound - minRound);
var tickSize = Math.ceil(fullRange / (this.XTickCount - 1));
//You can set nice looking ticks if you want
//You can find exemplary method below
tickSize = this.NiceLookingTick(tickSize);
//Here you can write a method to determine if you need zero tick
//You can find exemplary method below
var isZeroNeeded = this.HasZeroTick(maxRound, minRound, tickSize);
I use "nice looking ticks" to avoid ticks like 7, 13, 17 etc. The method I use here is pretty simple. It is also nice to have a zero tick when needed; the plot looks much more professional this way. You will find all the methods at the end of this answer.
Now you have to calculate the upper and lower bounds. This is very easy with a zero tick but requires a little more effort otherwise. Why? Because we want to center the plot nicely between the upper and lower bound. Have a look at my code. Some of the variables are defined outside of this scope, and some of them are properties of the object in which all of the presented code is kept.
if (isZeroNeeded) {
var positiveTicksCount = 0;
var negativeTickCount = 0;
if (maxRound != 0) {
positiveTicksCount = Math.ceil(maxRound / tickSize);
XUpperBound = tickSize * positiveTicksCount * power10;
}
if (minRound != 0) {
negativeTickCount = Math.floor(minRound / tickSize);
XLowerBound = tickSize * negativeTickCount * power10;
}
XTickRange = tickSize * power10;
this.XTickCount = positiveTicksCount - negativeTickCount + 1;
}
else {
var delta = (tickSize * (this.XTickCount - 1) - fullRange) / 2.0;
if (delta % 1 == 0) {
XUpperBound = maxRound + delta;
XLowerBound = minRound - delta;
}
else {
XUpperBound = maxRound + Math.ceil(delta);
XLowerBound = minRound - Math.floor(delta);
}
XTickRange = tickSize * power10;
XUpperBound = XUpperBound * power10;
XLowerBound = XLowerBound * power10;
}
And here are methods I mentioned before which you can write by yourself but you can also use mine
this.NiceLookingTick = function (tickSize) {
var NiceArray = [1, 2, 2.5, 3, 4, 5, 10];
var tickOrder = Math.floor(Math.log10(tickSize));
var power10 = Math.pow(10, tickOrder);
tickSize = tickSize / power10;
var niceTick;
var minDistance = 10;
var index = 0;
for (var i = 0; i < NiceArray.length; i++) {
var dist = Math.abs(NiceArray[i] - tickSize);
if (dist < minDistance) {
minDistance = dist;
index = i;
}
}
return NiceArray[index] * power10;
}
this.HasZeroTick = function (maxRound, minRound, tickSize) {
if (maxRound * minRound < 0)
{
return true;
}
else if (Math.abs(maxRound) < tickSize || Math.abs(minRound) < tickSize) {
return true;
}
else {
return false;
}
}
There is only one more thing that is not included here: "nice looking bounds". These are lower bounds that are numbers similar to those in "nice looking ticks". For example, it is better to have the lower bound start at 5 with tick size 5 than to have a plot that starts at 6 with the same tick size. But this, my friend, I leave to you.
Hope it helps.
Cheers!

Converted this answer as Swift 4
extension Int {
static func makeYaxis(yMin: Int, yMax: Int, ticks: Int = 10) -> [Int] {
var yMin = yMin
var yMax = yMax
var ticks = ticks
// This routine creates the Y axis values for a graph.
//
// Calculate Min and Max graphical labels and graph
// increments. The number of ticks defaults to
// 10 which is the SUGGESTED value. Any tick value
// entered is used as a suggested value which is
// adjusted to be a 'pretty' value.
//
// Output will be an array of the Y axis values that
// encompass the Y values.
var result = [Int]()
// If yMin and yMax are identical, then
// adjust the yMin and yMax values to actually
// make a graph. Also avoids division by zero errors.
if yMin == yMax {
yMin -= ticks // some small value
yMax += ticks // some small value
}
// Determine Range
let range = yMax - yMin
// Adjust ticks if needed
if ticks < 2 { ticks = 2 }
else if ticks > 2 { ticks -= 2 }
// Get raw step value
let tempStep: CGFloat = CGFloat(range) / CGFloat(ticks)
// Calculate pretty step value
let mag = floor(log10(tempStep))
let magPow = pow(10,mag)
let magMsd = Int(tempStep / magPow + 0.5)
let stepSize = magMsd * Int(magPow)
// build Y label array.
// Lower and upper bounds calculations
let lb = stepSize * Int(yMin/stepSize)
let ub = stepSize * Int(ceil(CGFloat(yMax)/CGFloat(stepSize)))
// Build array
var val = lb
while true {
result.append(val)
val += stepSize
if val > ub { break }
}
return result
}
}

this works like a charm, if you want 10 steps + zero
//get proper scale for y
$maximoyi_temp= max($institucion); //get max value from data array
for ($i=10; $i< $maximoyi_temp; $i=($i*10)) {
if (($divisor = ($maximoyi_temp / $i)) < 2) break; //get which divisor will give a number between 1-2
}
$factor_d = $maximoyi_temp / $i;
$factor_d = ceil($factor_d); //round up number to 2
$maximoyi = $factor_d * $i; //get new max value for y
if ( ($maximoyi/ $maximoyi_temp) > 2) $maximoyi = $maximoyi /2; //check if max value is too big, then split by 2

The above algorithms do not take into consideration the case where the range between the min and max values is very small. And what if these values are a lot higher than zero? Then we have the possibility of starting the y-axis at a value higher than zero. Also, to avoid having the line sit entirely at the upper or lower edge of the graph, we have to give it some "air to breathe".
To cover those cases I wrote the following code (in PHP):
function calculateStartingPoint($min, $ticks, $times, $scale) {
$starting_point = $min - floor((($ticks - $times) * $scale)/2);
if ($starting_point < 0) {
$starting_point = 0;
} else {
$starting_point = floor($starting_point / $scale) * $scale;
$starting_point = ceil($starting_point / $scale) * $scale;
$starting_point = round($starting_point / $scale) * $scale;
}
return $starting_point;
}
function calculateYaxis($min, $max, $ticks = 7)
{
print "Min = " . $min . "\n";
print "Max = " . $max . "\n";
$range = $max - $min;
$step = floor($range/$ticks);
print "First step is " . $step . "\n";
$available_steps = array(5, 10, 20, 25, 30, 40, 50, 100, 150, 200, 300, 400, 500);
$distance = 1000;
$scale = 0;
foreach ($available_steps as $i) {
if (($i - $step < $distance) && ($i - $step > 0)) {
$distance = $i - $step;
$scale = $i;
}
}
print "Final scale step is " . $scale . "\n";
$times = floor($range/$scale);
print "range/scale = " . $times . "\n";
print "floor(times/2) = " . floor($times/2) . "\n";
$starting_point = calculateStartingPoint($min, $ticks, $times, $scale);
if ($starting_point + ($ticks * $scale) < $max) {
$ticks += 1;
}
print "starting_point = " . $starting_point . "\n";
// result calculation
$result = [];
for ($x = 0; $x <= $ticks; $x++) {
$result[] = $starting_point + ($x * $scale);
}
return $result;
}

For anyone who needs this in ES5 JavaScript: I wrestled with it a bit, but here it is:
var min=52;
var max=173;
var actualHeight=500; // 500 pixels high graph
var tickCount =Math.round(actualHeight/100);
// we want lines about every 100 pixels.
if(tickCount <3) tickCount =3;
var range=Math.abs(max-min);
var unroundedTickSize = range/(tickCount-1);
var x = Math.ceil(Math.log10(unroundedTickSize)-1);
var pow10x = Math.pow(10, x);
var roundedTickRange = Math.ceil(unroundedTickSize / pow10x) * pow10x;
var min_rounded=roundedTickRange * Math.floor(min/roundedTickRange);
var max_rounded= roundedTickRange * Math.ceil(max/roundedTickRange);
var nr=tickCount;
var str="";
for(var x=min_rounded;x<=max_rounded;x+=roundedTickRange)
{
str+=x+", ";
}
console.log("nice Y axis "+str);
Based on the excellent answer by Toon Krijthe.

This solution is based on a Java example I found.
const niceScale = ( minPoint, maxPoint, maxTicks) => {
const niceNum = ( localRange, round) => {
var exponent,fraction,niceFraction;
exponent = Math.floor(Math.log10(localRange));
fraction = localRange / Math.pow(10, exponent);
if (round) {
if (fraction < 1.5) niceFraction = 1;
else if (fraction < 3) niceFraction = 2;
else if (fraction < 7) niceFraction = 5;
else niceFraction = 10;
} else {
if (fraction <= 1) niceFraction = 1;
else if (fraction <= 2) niceFraction = 2;
else if (fraction <= 5) niceFraction = 5;
else niceFraction = 10;
}
return niceFraction * Math.pow(10, exponent);
}
const result = [];
const range = niceNum(maxPoint - minPoint, false);
const stepSize = niceNum(range / (maxTicks - 1), true);
const lBound = Math.floor(minPoint / stepSize) * stepSize;
const uBound = Math.ceil(maxPoint / stepSize) * stepSize;
for(let i=lBound;i<=uBound;i+=stepSize) result.push(i);
return result;
};
console.log(niceScale(15,234,6));
// > [0, 100, 200, 300]

Based on @Gamecat's algorithm, I produced the following helper class:
public struct Interval
{
public readonly double Min, Max, TickRange;
public static Interval Find(double min, double max, int tickCount, double padding = 0.05)
{
double range = max - min;
max += range*padding;
min -= range*padding;
var attempts = new List<Interval>();
for (int i = tickCount; i > tickCount / 2; --i)
attempts.Add(new Interval(min, max, i));
return attempts.MinBy(a => a.Max - a.Min);
}
private Interval(double min, double max, int tickCount)
{
var candidates = (min <= 0 && max >= 0 && tickCount <= 8) ? new[] {2, 2.5, 3, 4, 5, 7.5, 10} : new[] {2, 2.5, 5, 10};
double unroundedTickSize = (max - min) / (tickCount - 1);
double x = Math.Ceiling(Math.Log10(unroundedTickSize) - 1);
double pow10X = Math.Pow(10, x);
TickRange = RoundUp(unroundedTickSize/pow10X, candidates) * pow10X;
Min = TickRange * Math.Floor(min / TickRange);
Max = TickRange * Math.Ceiling(max / TickRange);
}
// 1 < scaled <= 10
private static double RoundUp(double scaled, IEnumerable<double> candidates)
{
return candidates.First(candidate => scaled <= candidate);
}
}

A demo of the accepted answer:
function tickEvery(range, ticks) {
return Math.ceil((range / ticks) / Math.pow(10, Math.ceil(Math.log10(range / ticks) - 1))) * Math.pow(10, Math.ceil(Math.log10(range / ticks) - 1));
}
function update() {
const range = document.querySelector("#range").value;
const ticks = document.querySelector("#ticks").value;
const result = tickEvery(range, ticks);
document.querySelector("#result").textContent = `With range ${range} and ${ticks} ticks, tick every ${result} for a total of ${Math.ceil(range / result)} ticks at ${new Array(Math.ceil(range / result)).fill(0).map((v, n) => Math.round(n * result)).join(", ")}`;
}
update();
<input id="range" min="1" max="10000" oninput="update()" style="width:100%" type="range" value="5000" width="40" />
<br/>
<input id="ticks" min="1" max="20" oninput="update()" type="range" style="width:100%" value="10" />
<p id="result" style="font-family:sans-serif"></p>

Related

Scoring two sequences of ordered numbers for their similarity to one-another

How would I go about scoring two sequences of numbers such that
5, 8, 28, 31 (differences of 3, 20 and 3)
6, 9, 26, 29 (differences of 3, 17 and 3)
are considered similar "enough" but a sequence of
8, 11, 31, 34 (differences of 3, 20 and 3, errors of 3, 3, 3, 3)
is too dissimilar to allow?
The second set of numbers has absolute errors of
1, 1, 2, 2 and that is low "enough" to accept.
If that error was too high I'd like to be able to reject it.
To give a little background, these are indicators of time and when events arrived to a computer. The first sequence is the expected time of arrival and the second sequence is the actual times they arrived. Knowing that the sequence is at least in the correct order I need to be able to score the similarity to the expectation and accept or reject it by tweaking some sort of value.
If it were standard deviation for a set of numbers where order didn't matter I could just reject the second set based on its own standard deviation.
Since this is not the case I had the idea of measuring deviance and position error.
Position error shouldn't exceed 3, though this number should not be integer - it needs to be decimal as the numbers are more realistically floating point, or at least accurate to 6 decimal places.
It also needs to work equally well, or perhaps offer a variant in which a much longer series of numbers can be scored fairly.
In the longer series of numbers it is not likely the position error will exceed 3, so the position error would still be fairly low.
This is a partial solution I have found using a series of Pearson's correlation coefficients, one for each position at which x fits into y. It uses the form of the equation that works off expected values. The comments describe it fairly well.
function getPearsonsCorrelation(x, y)
{
/**
* Pearsons can be calculated in an alternative fashion as
* p(x, y) = (E(xy) - E(x)*E(y))/sqrt[(E(x^2)-(E(x))^2)*(E(y^2)-(E(y))^2)]
* where p(x, y) is the Pearson's correlation result, E is a function referring to the expected value
* E(x) = var expectedValue = 0; for(var i = 0; i < x.length; i ++){ expectedValue += x[i]*p[i] }
* where p[i] is the probability of that variable occurring, here we substitute in 1 every time
* hence this simplifies to E(x) = sum of all x values
* sqrt is the square root of the result in square brackets
* ^2 means to the power of two, or rather just square that value
**/
var maxdelay = y.length - x.length; // we will calculate Pearson's correlation coefficient at every location x fits into y
var xl = x.length
var results = [];
for(var d = 0; d <= maxdelay; d++){
var xy = [];
var x2 = [];
var y2 = [];
var _y = y.slice(d, d + x.length); // take just the segment of y at delay
for(var i = 0; i < xl; i ++){
xy.push(x[i] * _y[i]); // x*y array
x2.push(x[i] * x[i]); // x squareds array
y2.push(_y[i] * _y[i]); // y squareds array
}
var sum_x = 0;
var sum_y = 0;
var sum_xy = 0;
var sum_x2 = 0;
var sum_y2 = 0;
for(var i = 0; i < xl; i ++){
sum_x += x[i]; // expected value of x
sum_y += _y[i]; // expected value of y
sum_xy += xy[i]; // expected value of xy/n
sum_x2 += x2[i]; // expected value of (x squared)/n
sum_y2 += y2[i]; // expected value of (y squared)/n
}
var numerator = xl * sum_xy - sum_x * sum_y; // expected value of xy - (expected value of x * expected value of y)
var denomLetSide = xl * sum_x2 - sum_x * sum_x; // expected value of (x squared) - (expected value of x) squared
var denomRightSide = xl * sum_y2 - sum_y * sum_y; // expected value of (y squared) - (expected value of y) squared
var denom = Math.sqrt(denomLetSide * denomRightSide);
var pearsonsCorrelation = numerator / denom;
results.push(pearsonsCorrelation);
}
return results;
}
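A simpler baseline than correlation is to threshold the per-position absolute error directly, which mirrors the "errors of 1, 1, 2, 2" reasoning in the question. The Python sketch below assumes both sequences are non-empty, have equal length and are already aligned. The name `score_arrivals` and the default threshold of 3.0 are illustrative choices, not part of the original code.

```python
def score_arrivals(expected, actual, max_error=3.0):
    """Return (accepted, worst_error, mean_error) for two aligned sequences."""
    if len(expected) != len(actual):
        raise ValueError("sequences must have the same length")
    errors = [abs(e - a) for e, a in zip(expected, actual)]
    worst = max(errors)
    mean = sum(errors) / len(errors)
    # Accept only if every position error stays strictly below the threshold.
    return worst < max_error, worst, mean


print(score_arrivals([5, 8, 28, 31], [6, 9, 26, 29]))    # (True, 2, 1.5)
print(score_arrivals([5, 8, 28, 31], [8, 11, 31, 34]))   # (False, 3, 3.0)
```

The mean error is returned alongside the worst error so that longer series can be judged on an average as well as on a single worst position. The threshold works equally well with floating-point timestamps.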

Find 'average' with equal upper and lower distance to values of a given set

I recently encountered the following problem:
Given a set of points with height yᵢ, find the height of the line for which the average distance to points above equals the average distance to points below the line.
More abstract definition: Given a set of real valued data points Y = {y1, ..., yn}, find ȳ which splits Y into two sets Y⁺ = {y ∊ Y : y > ȳ} and Y⁻ = {y ∊ Y : y < ȳ} so that the average distance between ȳ and elements of Y⁺ equals the average distance between ȳ and elements of Y⁻.
Naive solution: Initialize ȳ with the average of Y, compute average upper and lower distances and iteratively move up or down depending on whether the upper or lower average distance is greater.
Question: This problem is pretty basic, so there is probably a better solution (?) Even a non-iterative algebraic algorithm?

As mentioned in the comment, if you know which points are above and below the line, then you can solve it like this:
a = number of points above the line
b = number of points below the line
sa = sum of all y above the line
sb = sum of all y below the line
Now we can create the following equation:
(sa - a * y) / a = (b * y - sb) / b | * a * b
sa * b - a * b * y = a * b * y - a * sb | + a * b * y + a * sb
sa * b + a * sb = 2 * a * b * y | / (2 * a * b)
==> y = (a * sb + b * sa) / (2 * a * b)
= sa / (2 * a) + sb / (2 * b)
= (sa / a + sb / b) / 2
If we interpret the result, we could say it is the average of the averages of the points above and below the line.
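As a quick check of the formula, here is a small Python sketch. It assumes the split is already known: `split` is any height that separates the "above" points from the "below" points, and the function name `balanced_height` is made up for this example.

```python
def balanced_height(values, split):
    """Height y with equal mean distance to points above and below `split`."""
    above = [v for v in values if v > split]
    below = [v for v in values if v < split]
    if not above or not below:
        raise ValueError("need at least one point on each side of the split")
    # y = (sa / a + sb / b) / 2, the mean of the two group means.
    return (sum(above) / len(above) + sum(below) / len(below)) / 2


print(balanced_height([1, 2, 3, 10], 4))  # 6.0
```

For this data the point 10 lies 4 above the line at 6, and the points 1, 2 and 3 lie on average 4 below it. The result is only valid if the returned height still separates the points the same way as the assumed split.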

An iterative solution based on maraca's answer:
Initialize ȳ with the mean of the given values.
Split the given values into those above and below ȳ.
Calculate the new optimal ȳ for this split.
Repeat until ȳ converges.
This is slightly faster than the algorithm outlined in the question.
// Find mean with equal average distance to upper and lower values:
function findEqualAverageDistanceMean(values) {
let mean = values.reduce((a, b) => a + b) / values.length,
last = NaN;
// Iteratively equalize average distances:
while (last != mean) {
let lower_total = 0,
lower_n = 0,
upper_total = 0,
upper_n = 0;
for (let value of values) {
if (value > mean) {
upper_total += value;
++upper_n;
} else if (value < mean) {
lower_total += value;
++lower_n;
}
}
last = mean;
mean = (upper_total / upper_n + lower_total / lower_n) / 2;
}
return mean;
}
// Example:
let canvas = document.getElementById("canvas"),
ctx = canvas.getContext("2d"),
points = Array.from({length: 100}, () => Math.random() ** 4),
mean = points.reduce((a, b) => a + b) / points.length,
equalAverageDistanceMean = findEqualAverageDistanceMean(points);
function draw(points, mean, equalAverageDistanceMean) {
for (let [i, point] of points.entries()) {
ctx.fillStyle = (point < equalAverageDistanceMean) ? 'red' : 'green';
ctx.fillRect(i * canvas.width / points.length, canvas.height * point, 3, 3);
}
ctx.fillStyle = 'black';
ctx.fillRect(0, canvas.height * mean, canvas.width, .5);
ctx.fillRect(0, canvas.height * equalAverageDistanceMean, canvas.width, 3);
}
draw(points, mean, equalAverageDistanceMean);
<canvas id="canvas" width="400" height="200">

How to calculate percentage between the range of two values a third value is

Example:
I'm trying to figure out the calculation for finding the percentage between two values that a third value is.
Example: The range is 46 to 195. The value 46 would be 0%, and the value 195 would be 100% of the range. What percentage of this range is the value 65?
rangeMin=46
rangeMax=195
inputValue=65
inputPercentage = ?

Well, I would use the formula
((input - min) * 100) / (max - min)
For your example it would be
((65 - 46) * 100) / (195 - 46) = 12.75
Or a little bit longer
range = max - min
correctedStartValue = input - min
percentage = (correctedStartValue * 100) / range
If you already have the percentage and you're looking for the "input value" in a given range, then you can use the adjusted formula provided by Dustin in the comments:
value = (percentage * (max - min) / 100) + min
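Both directions of the formula can be sketched in Python as below. The function names `to_percentage` and `from_percentage` are illustrative, and the sketch assumes `range_max` differs from `range_min`.

```python
def to_percentage(value, range_min, range_max):
    """Position of value within [range_min, range_max], as a percentage."""
    return (value - range_min) * 100 / (range_max - range_min)


def from_percentage(percentage, range_min, range_max):
    """Inverse mapping: the value that sits at the given percentage."""
    return percentage * (range_max - range_min) / 100 + range_min


print(round(to_percentage(65, 46, 195), 2))  # 12.75
print(from_percentage(50, 0, 500))           # 250.0
```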

I put together this function to calculate it. It also gives the ability to set a mid way 100% point that then goes back down.
Usage
//[] = optional
rangePercentage(input, minimum_range, maximum_normal_range, [maximum_upper_range]);
rangePercentage(250, 0, 500); //returns 50 (as in 50%)
rangePercentage(100, 0, 200, 400); //returns 50
rangePercentage(200, 0, 200, 400); //returns 100
rangePercentage(300, 0, 200, 400); //returns 50
The function
function rangePercentage (input, range_min, range_max, range_2ndMax){
var percentage = ((input - range_min) * 100) / (range_max - range_min);
if (percentage > 100) {
if (typeof range_2ndMax !== 'undefined'){
percentage = ((range_2ndMax - input) * 100) / (range_2ndMax - range_max);
if (percentage < 0) {
percentage = 0;
}
} else {
percentage = 100;
}
} else if (percentage < 0){
percentage = 0;
}
return percentage;
}

If you want to calculate the percentages of a list of values and scale them into a range between a min and a max, you can do something like this:
private getPercentages(arr:number[], min:number=0, max:number=100): number[] {
let maxValue = Math.max( ...arr );
return arr.map((el)=>{
let percent = el * 100 / maxValue;
return percent * ((max - min) / 100) + min;
});
};
Here the function call:
this.getPercentages([20,30,80,200],20,50);
would return
[23, 24.5, 32, 50]
where the percentages are relative and placed between the min and max value.

Can be used for scaling any number of variables
Python Implementation:
# List1 will contain all the variables
list1 = []
# append all the variables in list1
list1.append(var1)
list1.append(var2)
list1.append(var3)
list1.append(var4)
# Sorting the list in ascending order
list1.sort(key = None, reverse = False)
# Normalizing each variable using ( X_Normalized = (X - X_minimum) / (X_Maximum - X_minimum) )
normalized_var1 = (var1 - list1[0]) / (list1[-1] - list1[0])
normalized_var2 = (var2 - list1[0]) / (list1[-1] - list1[0])
normalized_var3 = (var3 - list1[0]) / (list1[-1] - list1[0])
normalized_var4 = (var4 - list1[0]) / (list1[-1] - list1[0])
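The same min-max normalization can be written more compactly with `min`, `max` and a list comprehension, which avoids sorting and the per-variable repetition. This is a sketch only: the function name `normalize` is an assumption, and it rejects input where all values are equal, since the range would be zero.

```python
def normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


print(normalize([2, 4, 6]))  # [0.0, 0.5, 1.0]
```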

Reasonable optimized chart scaling

I need to make a chart with an optimized y axis maximum value.
The current method I have of making charts simply uses the maximum value of all the graphs, then divides it by ten, and uses that as grid lines. I didn't write it.
Update Note: These graphs have been changed. As soon as I fixed the code, my dynamic graphs started working, making this question nonsensical (because the examples no longer had any errors in them). I've updated these with static images, but some of the answers reference different values. Keep that in mind.
There were between 12003 and 14003 inbound calls so far in February. Informative, but ugly.
I'd like to avoid charts that look like a monkey came up with the y-axis numbers.
Using the Google charts API helps a little bit, but it's still not quite what I want.
The numbers are clean, but the top of the y value is always the same as the maximum value on the chart. This chart scales from 0 to 1357. I need it to calculate the proper value of 1400, programmatically.
I'm throwing in rbobby's definition of a 'nice' number here because it explains it so well.
A "nice" number is one that has 3 or fewer non-zero digits (e.g. 1230000)
A "nice" number has the same or fewer non-zero digits than zero digits (e.g. 1230 is not nice, 1200 is nice)
The nicest numbers are ones with multiples of 3 zeros (e.g. "1,000", "1,000,000")
The second nicest numbers are ones with multiples of 3 zeros plus 2 zeros (e.g. "1,500,000", "1,200")
Solution
I found the way to get the results that I want using a modified version of Mark Ransom's idea.
First, Mark Ransom's code determines the optimum spacing between ticks, when given the number of ticks. Sometimes this number ends up being more than twice what the highest value on the chart is, depending on how many grid lines you want.
What I'm doing is I'm running Mark's code with 5, 6, 7, 8, 9, and 10 grid lines (ticks) to find which of those is the lowest. With a value of 23, the height of the chart goes to 25, with a grid line at 5, 10, 15, 20, and 25. With a value of 26, the chart's height is 30, with grid lines at 5, 10, 15, 20, 25, and 30. It has the same spacing between grid lines, but there are more of them.
So here are the steps to just about copy what Excel does to make charts all fancy.
Temporarily bump up the chart's highest value by about 5% (so that there is always some space between the chart's highest point and the top of the chart area. We want 99.9 to round up to 120)
Find the optimum grid line placement for 5, 6, 7, 8, 9, and 10 grid lines.
Pick out the lowest of those numbers. Remember the number of grid lines it took to get that value.
Now you have the optimum chart height. The lines/bar will never butt up against the top of the chart and you have the optimum number of ticks.
PHP:
function roundUp($maxValue){
$optiMax = $maxValue * 2;
for ($i = 5; $i <= 10; $i++){
$tmpMaxValue = bestTick($maxValue,$i);
if (($optiMax > $tmpMaxValue) and ($tmpMaxValue > ($maxValue + $maxValue * 0.05))){
$optiMax = $tmpMaxValue;
$optiTicks = $i;
}
}
return $optiMax;
}
function bestTick($maxValue, $mostTicks){
$minimum = $maxValue / $mostTicks;
$magnitude = pow(10,floor(log($minimum) / log(10)));
$residual = $minimum / $magnitude;
if ($residual > 5){
$tick = 10 * $magnitude;
} elseif ($residual > 2) {
$tick = 5 * $magnitude;
} elseif ($residual > 1){
$tick = 2 * $magnitude;
} else {
$tick = $magnitude;
}
return ($tick * $mostTicks);
}
Python:
import math
def BestTick(largest, mostticks):
    # smallest spacing that still fits `mostticks` ticks into the range
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum) / math.log(10))
    residual = minimum / magnitude
    # promote the residual to the next "nice" multiple: 1, 2, 5 or 10
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick
value = int(input(""))
optMax = value * 2
optTicks = None  # stays None if no tick count beats the initial guess
for i in range(5, 11):
    maxValue = BestTick(value, i) * i
    print(maxValue)
    # keep the lowest chart height that still leaves about 5% headroom
    if (optMax > maxValue) and (maxValue > value + (value * .05)):
        optMax = maxValue
        optTicks = i
print("\nTest Value: " + str(value + (value * .05)) + "\n\nChart Height: " + str(optMax) + " Ticks: " + str(optTicks))

This is from a previous similar question:
Algorithm for "nice" grid line intervals on a graph
I've done this with kind of a brute force method. First, figure out the maximum number of tick marks you can fit into the space. Divide the total range of values by the number of ticks; this is the minimum spacing of the tick. Now calculate the floor of the logarithm base 10 to get the magnitude of the tick, and divide by this value. You should end up with something in the range of 1 to 10. Simply choose the round number greater than or equal to the value and multiply it by the magnitude calculated earlier. This is your final tick spacing.
Example in Python:
import math
def BestTick(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum) / math.log(10))
    residual = minimum / magnitude
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick

You could round up to two significant figures. The following pseudocode should work:
// maxValue is the largest value in your chart
magnitude = floor(log10(maxValue))
base = 10^(magnitude - 1)
chartHeight = ceiling(maxValue / base) * base
For example, if maxValue is 1357, then magnitude is 3 and base is 100. Dividing by 100, rounding up, and multiplying by 100 has the result of rounding up to the next multiple of 100, i.e. rounding up to two significant figures. In this case, the result is 1400 (1357 ⇒ 13.57 ⇒ 14 ⇒ 1400).
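The pseudocode translates almost line for line into Python. The sketch below assumes a positive maxValue; the function name round_up_two_sig is made up for this example.

```python
import math

def round_up_two_sig(max_value):
    """Round a positive value up to two significant figures (hypothetical helper)."""
    magnitude = math.floor(math.log10(max_value))  # 3 for 1357
    base = 10 ** (magnitude - 1)                   # 100 for 1357
    return math.ceil(max_value / base) * base

print(round_up_two_sig(1357))
```

A value that already has two significant figures, such as 1400, is returned unchanged, so you may still want to add some headroom before rounding.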

In the past I've done this in a brute force-ish sort of way. Here's a chunk of C++ code that works well... but for a hardcoded lower and upper limits (0 and 5000):
int PickYUnits()
{
int MinSize[8] = {20, 20, 20, 20, 20, 20, 20, 20};
int ItemsPerUnit[8] = {5, 10, 20, 25, 50, 100, 250, 500};
int ItemLimits[8] = {20, 50, 100, 250, 500, 1000, 2500, 5000};
int MaxNumUnits = 8;
double PixelsPerY;
int PixelsPerAxis;
int Units;
//
// Figure out the max from the dataset
// - Min is always 0 for a bar chart
//
m_MinY = 0;
m_MaxY = -9999999;
m_TotalY = 0;
for (int j = 0; j < m_DataPoints.GetSize(); j++) {
if (m_DataPoints[j].m_y > m_MaxY) {
m_MaxY = m_DataPoints[j].m_y;
}
m_TotalY += m_DataPoints[j].m_y;
}
//
// Give some space at the top
//
m_MaxY = m_MaxY + 1;
//
// Figure out the size of the range
//
double yRange = (m_MaxY - m_MinY);
//
// Pick the initial size
//
Units = MaxNumUnits;
for (int k = 0; k < MaxNumUnits; k++)
{
if (yRange < ItemLimits[k])
{
Units = k;
break;
}
}
//
// Adjust it upwards based on the space available
//
PixelsPerY = m_rcGraph.Height() / yRange;
PixelsPerAxis = (int)(PixelsPerY * ItemsPerUnit[Units]);
while (PixelsPerAxis < MinSize[Units]){
Units += 1;
PixelsPerAxis = (int)(PixelsPerY * ItemsPerUnit[Units]);
if (Units == 5)
break;
}
return ItemsPerUnit[Units];
}
However something in what you've said tweaked me. To pick nice axis numbers a definition of "nice number" would help:
A "nice" number is one that has 3 or fewer non-zero digits (eg. 1230000)
A "nice" number has the same or fewer non-zero digits than zero digits (eg 1230 is not nice, 1200 is nice)
The nicest numbers are ones with multiples of 3 zeros (eg. "1,000", "1,000,000")
The second nicest numbers are ones with multiples of 3 zeros plus 2 zeros (eg. "1,500,000", "1,200")
Not sure if the above definition is "right" or actually helpful (but with the definition in hand it then becomes a simpler task to devise an algorithm).
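As a rough illustration, the first two rules of that definition can be checked directly on the decimal digits. This is only a sketch under a literal reading of the rules, for integers; the name is_nice is invented here, and the ranking rules ("nicest", "second nicest") are not implemented.

```python
def is_nice(n):
    """Literal reading of the first two rules, for integers (hypothetical helper)."""
    digits = str(abs(int(n)))
    nonzero = sum(ch != "0" for ch in digits)
    zeros = len(digits) - nonzero
    # rule 1: at most 3 non-zero digits; rule 2: no more non-zero digits than zeros
    return nonzero <= 3 and nonzero <= zeros

for candidate in (1230000, 1230, 1200, 1357, 1400):
    print(candidate, is_nice(candidate))
```

Read this literally and small values such as 25 are not "nice" (no zero digits at all), which suggests the definition is aimed at large axis maxima rather than tick spacings.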

A slight refinement and tested... (works for fractions of units and not just integers)
public void testNumbers() {
double test = 0.20000;
double multiple = 1;
int scale = 0;
String[] prefix = new String[]{"", "m", "u", "n"};
while (Math.log10(test) < 0) {
multiple = multiple * 1000;
test = test * 1000;
scale++;
}
double tick;
double minimum = test / 10;
double magnitude = 100000000;
while (minimum <= magnitude){
magnitude = magnitude / 10;
}
double residual = test / (magnitude * 10);
if (residual > 5) {
tick = 10 * magnitude;
} else if (residual > 2) {
tick = 5 * magnitude;
} else if (residual > 1) {
tick = 2 * magnitude;
} else {
tick = magnitude;
}
double curAmt = 0;
int ticks = (int) Math.ceil(test / tick);
for (int ix = 0; ix < ticks; ix++) {
curAmt += tick;
BigDecimal bigDecimal = new BigDecimal(curAmt);
bigDecimal = bigDecimal.setScale(2, BigDecimal.ROUND_HALF_UP); // BigDecimal is immutable, so keep the rounded result
System.out.println(bigDecimal.stripTrailingZeros().toPlainString() + prefix[scale] + "s");
}
System.out.println("Value = " + test + prefix[scale] + "s");
System.out.println("Tick = " + tick + prefix[scale] + "s");
System.out.println("Ticks = " + ticks);
System.out.println("Scale = " + multiple + " : " + scale);
}

If you want 1400 at the top, how about adjusting the last two parameters to 1400 instead of 1357:

You could use div and mod. For example.
Let's say you want your chart to round up by increments of 20 (just to make it a more arbitrary number than your typical "10" value).
So I would assume that 1, 11, 18 would all round up to 20. But 21, 33, 38 would round to 40.
To come up with the right value do the following:
Where divisor = your rounding increment.
divisor = 20
multiple = maxValue / divisor; // Do an integer divide here.
if (maxValue modulus divisor > 0)
multiple++;
graphMax = multiple * divisor;
So now let's plug in real numbers:
divisor = 20;
multiple = 33 / 20; (integer divide)
so multiple = 1
if (33 modulus 20 > 0) (it is.. it equals 13)
multiple++;
so multiple = 2;
graphMax = multiple (2) * divisor (20);
graphMax = 40;
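The same div-and-mod idea as a small Python sketch. The function name round_up_to_multiple is made up for this example, and it assumes non-negative integer inputs.

```python
def round_up_to_multiple(max_value, divisor):
    """Round max_value up to the next multiple of divisor (hypothetical helper)."""
    multiple, remainder = divmod(max_value, divisor)  # integer divide and modulus in one call
    if remainder > 0:
        multiple += 1
    return multiple * divisor

print([round_up_to_multiple(v, 20) for v in (1, 11, 18, 21, 33, 38)])
```

A value that is already a multiple of the divisor, such as 40, is returned as is.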

Algorithm for "nice" grid line intervals on a graph

I need a reasonably smart algorithm to come up with "nice" grid lines for a graph (chart).
For example, assume a bar chart with values of 10, 30, 72 and 60. You know:
Min value: 10
Max value: 72
Range: 62
The first question is: what do you start from? In this case, 0 would be the intuitive value but this won't hold up on other data sets so I'm guessing:
Grid min value should be either 0 or a "nice" value lower than the min value of the data in range. Alternatively, it can be specified.
Grid max value should be a "nice" value above the max value in the range. Alternatively, it can be specified (eg you might want 0 to 100 if you're showing percentages, irrespective of the actual values).
The number of grid lines (ticks) in the range should be either specified or a number within a given range (eg 3-8) such that the values are "nice" (ie round numbers) and you maximise use of the chart area. In our example, 80 would be a sensible max as that would use 90% of the chart height (72/80) whereas 100 would create more wasted space.
Anyone know of a good algorithm for this? Language is irrelevant as I'll implement it in what I need to.
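The requirements above can be sketched in Python as follows. This is only an illustration, assuming a zero-based axis, candidate steps of 1, 2 or 5 times a power of ten, and a tick count between 3 and 8; the name nice_grid_max and its parameters are invented for this example.

```python
import math

def nice_grid_max(data_max, min_ticks=3, max_ticks=8):
    """Return (grid_max, step) for a zero-based axis, or None if nothing fits."""
    best = None
    exponent = math.floor(math.log10(data_max))
    for exp in (exponent - 1, exponent):
        for mantissa in (1, 2, 5):
            step = mantissa * 10 ** exp
            ticks = math.ceil(data_max / step)
            if not (min_ticks <= ticks <= max_ticks):
                continue
            grid_max = ticks * step
            usage = data_max / grid_max  # fraction of the chart height actually used
            if best is None or usage > best[0]:
                best = (usage, grid_max, step)
    return None if best is None else (best[1], best[2])

print(nice_grid_max(72))
```

For the sample data (max 72) this picks a grid max of 80 with a step of 10, which is the 90% usage mentioned in the question. On a tie, the smaller step wins because it is tried first.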

I've done this with kind of a brute force method. First, figure out the maximum number of tick marks you can fit into the space. Divide the total range of values by the number of ticks; this is the minimum spacing of the tick. Now calculate the floor of the logarithm base 10 to get the magnitude of the tick, and divide by this value. You should end up with something in the range of 1 to 10. Simply choose the round number greater than or equal to the value and multiply it by the magnitude calculated earlier. This is your final tick spacing.
Example in Python:
import math
def BestTick(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum, 10))
    residual = minimum / magnitude
    if residual > 5:
        tick = 10 * magnitude
    elif residual > 2:
        tick = 5 * magnitude
    elif residual > 1:
        tick = 2 * magnitude
    else:
        tick = magnitude
    return tick
Edit: you are free to alter the selection of "nice" intervals. One commenter appears to be dissatisfied with the selections provided, because the actual number of ticks can be up to 2.5 times less than the maximum. Here's a slight modification that defines a table for the nice intervals. In the example, I've expanded the selections so that the number of ticks won't be less than 3/5 of the maximum.
import bisect
import math
def BestTick2(largest, mostticks):
    minimum = largest / mostticks
    magnitude = 10 ** math.floor(math.log(minimum, 10))
    residual = minimum / magnitude
    # this table must begin with 1 and end with 10
    table = [1, 1.5, 2, 3, 5, 7, 10]
    # pick the first table entry strictly greater than the residual
    tick = table[bisect.bisect_right(table, residual)] if residual < 10 else 10
    return tick * magnitude

There are 2 pieces to the problem:
Determine the order of magnitude involved, and
Round to something convenient.
You can handle the first part by using logarithms:
range = max - min;
exponent = int(log10(range)); // Base-10 logarithm. See comment below.
magnitude = pow(10, exponent);
So, for example, if your range is from 50 - 1200, the exponent is 3 and the magnitude is 1000.
Then deal with the second part by deciding how many subdivisions you want in your grid:
value_per_division = magnitude / subdivisions;
This is a rough calculation because the exponent has been truncated to an integer. You may want to tweak the exponent calculation to handle boundary conditions better, e.g. by rounding instead of taking the int() if you end up with too many subdivisions.
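A minimal Python sketch of these two steps, assuming a base-10 logarithm and a range of at least 1 (int() truncates toward zero, so smaller ranges would need floor instead). The function and parameter names are made up for this example.

```python
import math

def value_per_division(lo, hi, subdivisions):
    """Rough grid spacing from the order of magnitude of the range (hypothetical helper)."""
    value_range = hi - lo
    exponent = int(math.log10(value_range))  # truncated, as in the pseudocode
    magnitude = 10 ** exponent
    return magnitude / subdivisions

print(value_per_division(50, 1200, 4))
```

For the 50 to 1200 example the magnitude is 1000, so four subdivisions give a spacing of 250.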

I use the following algorithm. It's similar to others posted here but it's the first example in C#.
public static class AxisUtil
{
public static float CalcStepSize(float range, float targetSteps)
{
// calculate an initial guess at step size
var tempStep = range/targetSteps;
// get the magnitude of the step size
var mag = (float)Math.Floor(Math.Log10(tempStep));
var magPow = (float)Math.Pow(10, mag);
// calculate most significant digit of the new step size
var magMsd = (int)(tempStep/magPow + 0.5);
// promote the MSD to either 1, 2, or 5
if (magMsd > 5)
magMsd = 10;
else if (magMsd > 2)
magMsd = 5;
else if (magMsd > 1)
magMsd = 2;
return magMsd*magPow;
}
}

CPAN provides an implementation here (see source link)
See also Tickmark algorithm for a graph axis
FYI, with your sample data:
Maple: Min=8, Max=74, Labels=10,20,..,60,70, Ticks=10,12,14,..70,72
MATLAB: Min=10, Max=80, Labels=10,20,,..,60,80

Here's another implementation in JavaScript:
var calcStepSize = function(range, targetSteps)
{
// calculate an initial guess at step size
var tempStep = range / targetSteps;
// get the magnitude of the step size
var mag = Math.floor(Math.log(tempStep) / Math.LN10);
var magPow = Math.pow(10, mag);
// calculate most significant digit of the new step size
var magMsd = Math.round(tempStep / magPow); // Math.round already rounds to nearest, so no extra 0.5 is needed
// promote the MSD to either 1, 2, or 5
if (magMsd > 5.0)
magMsd = 10.0;
else if (magMsd > 2.0)
magMsd = 5.0;
else if (magMsd > 1.0)
magMsd = 2.0;
return magMsd * magPow;
};

I am the author of "Algorithm for Optimal Scaling on a Chart Axis". It used to be hosted on trollop.org, but I have recently moved domains/blogging engines.
Please see my answer to a related question.

Taken from Mark above, a slightly more complete Util class in C# that also calculates a suitable first and last tick.
public class AxisAssists
{
public double Tick { get; private set; }
public AxisAssists(double aTick)
{
Tick = aTick;
}
public AxisAssists(double range, int mostticks)
{
var minimum = range / mostticks;
var magnitude = Math.Pow(10.0, (Math.Floor(Math.Log(minimum) / Math.Log(10))));
var residual = minimum / magnitude;
if (residual > 5)
{
Tick = 10 * magnitude;
}
else if (residual > 2)
{
Tick = 5 * magnitude;
}
else if (residual > 1)
{
Tick = 2 * magnitude;
}
else
{
Tick = magnitude;
}
}
public double GetClosestTickBelow(double v)
{
return Tick* Math.Floor(v / Tick);
}
public double GetClosestTickAbove(double v)
{
return Tick * Math.Ceiling(v / Tick);
}
}
The class can be instantiated and kept around, but if you just want to calculate the tick and throw the instance away:
double tickX = new AxisAssists(aMaxX - aMinX, 8).Tick;

I wrote an objective-c method to return a nice axis scale and nice ticks for given min- and max values of your data set:
- (NSArray*)niceAxis:(double)minValue :(double)maxValue
{
double min_ = 0, max_ = 0, min = minValue, max = maxValue, power = 0, factor = 0, tickWidth, minAxisValue = 0, maxAxisValue = 0;
NSArray *factorArray = [NSArray arrayWithObjects:@"0.0f",@"1.2f",@"2.5f",@"5.0f",@"10.0f",nil];
NSArray *scalarArray = [NSArray arrayWithObjects:@"0.2f",@"0.2f",@"0.5f",@"1.0f",@"2.0f",nil];
// calculate x-axis nice scale and ticks
// 1. min_
if (min == 0) {
min_ = 0;
}
else if (min > 0) {
min_ = MAX(0, min-(max-min)/100);
}
else {
min_ = min-(max-min)/100;
}
// 2. max_
if (max == 0) {
if (min == 0) {
max_ = 1;
}
else {
max_ = 0;
}
}
else if (max < 0) {
max_ = MIN(0, max+(max-min)/100);
}
else {
max_ = max+(max-min)/100;
}
// 3. power
power = log(max_ - min_) / log(10);
// 4. factor
factor = pow(10, power - floor(power));
// 5. nice ticks
for (NSInteger i = 0; factor > [[factorArray objectAtIndex:i]doubleValue] ; i++) {
tickWidth = [[scalarArray objectAtIndex:i]doubleValue] * pow(10, floor(power));
}
// 6. min-axisValues
minAxisValue = tickWidth * floor(min_/tickWidth);
// 7. max-axisValues
maxAxisValue = tickWidth * floor((max_/tickWidth)+1);
// 8. create NSArray to return
NSArray *niceAxisValues = [NSArray arrayWithObjects:[NSNumber numberWithDouble:minAxisValue], [NSNumber numberWithDouble:maxAxisValue],[NSNumber numberWithDouble:tickWidth], nil];
return niceAxisValues;
}
You can call the method like this:
NSArray *niceYAxisValues = [self niceAxis:-maxy :maxy];
and get your axis setup:
double minYAxisValue = [[niceYAxisValues objectAtIndex:0]doubleValue];
double maxYAxisValue = [[niceYAxisValues objectAtIndex:1]doubleValue];
double ticksYAxis = [[niceYAxisValues objectAtIndex:2]doubleValue];
Just in case you want to limit the number of axis ticks do this:
NSInteger maxNumberOfTicks = 9;
NSInteger numberOfTicks = valueXRange / ticksXAxis;
NSInteger newNumberOfTicks = floor(numberOfTicks / (1 + floor(numberOfTicks/(maxNumberOfTicks+0.5))));
double newTicksXAxis = ticksXAxis * (1 + floor(numberOfTicks/(maxNumberOfTicks+0.5)));
The first part of the code is based on the calculation I found here to calculate nice graph axis scale and ticks similar to Excel graphs. It works excellently for all kinds of data sets. Here is an example of an iPhone implementation:

Another idea is to have the range of the axis be the range of the values, but put the tick marks at the appropriate positions, i.e. for 7 to 22 do:
[- - - | - - - - | - - - - | - - ]
       10        15        20
As for selecting the tick spacing, I would suggest any number of the form 10^x * i / n, where i < n, and 0 < n < 10. Generate this list, sort it, and you can find the largest number smaller than value_per_division (as in Adam Liss's answer) using a binary search.
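One way to sketch this in Python is to build the candidate list with exact fractions and then use bisect for the binary search. The exponent bounds (-3 to 6) and the function names are assumptions made for this example, not part of the suggestion above.

```python
from bisect import bisect_left
from fractions import Fraction

def candidate_spacings(min_exp=-3, max_exp=6):
    """Sorted, de-duplicated values of 10^x * i / n with i < n and 0 < n < 10."""
    values = {
        Fraction(10) ** x * Fraction(i, n)
        for x in range(min_exp, max_exp + 1)
        for n in range(1, 10)
        for i in range(1, n)
    }
    return sorted(values)

def largest_spacing_below(value_per_division, candidates):
    """Largest candidate strictly smaller than value_per_division, or None."""
    index = bisect_left(candidates, value_per_division)
    if index == 0:
        return None
    return candidates[index - 1]

candidates = candidate_spacings()
print(largest_spacing_below(6.2, candidates))
```

Note that this candidate set is much denser than the usual 1, 2, 5 table, so it also yields spacings such as 6 or 7.5; trim the allowed values of i and n if that is too permissive.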

Using a lot of inspiration from answers already available here, here's my implementation in C. Note that there's some extensibility built into the ndex array.
float findNiceDelta(float maxvalue, int count)
{
float step = maxvalue/count,
order = powf(10, floorf(log10(step))),
delta = (int)(step/order + 0.5);
static float ndex[] = {1, 1.5, 2, 2.5, 5, 10};
static int ndexLenght = sizeof(ndex)/sizeof(float);
for(int i = ndexLenght - 2; i > 0; --i)
if(delta > ndex[i]) return ndex[i + 1] * order;
return delta*order;
}

In R, use
tickSize <- function(range,minCount){
logMaxTick <- log10(range/minCount)
exponent <- floor(logMaxTick)
mantissa <- 10^(logMaxTick-exponent)
af <- c(1,2,5) # allowed factors
mantissa <- af[findInterval(mantissa,af)]
return(mantissa*10^exponent)
}
where range argument is max-min of domain.

Here is a JavaScript function I wrote to round grid intervals (max-min)/gridLinesNumber to beautiful values. It works with any numbers; see the gist with detailed comments to find out how it works and how to call it.
var ceilAbs = function(num, to, bias) {
if (to == undefined) to = [-2, -5, -10]
if (bias == undefined) bias = 0
var numAbs = Math.abs(num) - bias
var exp = Math.floor( Math.log10(numAbs) )
if (typeof to == 'number') {
return Math.sign(num) * to * Math.ceil(numAbs/to) + bias
}
var mults = to.filter(function(value) {return value > 0})
to = to.filter(function(value) {return value < 0}).map(Math.abs)
var m = Math.abs(numAbs) * Math.pow(10, -exp)
var mRounded = Infinity
for (var i=0; i<mults.length; i++) {
var candidate = mults[i] * Math.ceil(m / mults[i])
if (candidate < mRounded)
mRounded = candidate
}
for (var i=0; i<to.length; i++) {
if (to[i] >= m && to[i] < mRounded)
mRounded = to[i]
}
return Math.sign(num) * mRounded * Math.pow(10, exp) + bias
}
Calling ceilAbs(number, [0.5]) for different numbers will round them like this:
301573431.1193228 -> 350000000
14127.786597236991 -> 15000
-63105746.17236853 -> -65000000
-718854.2201183736 -> -750000
-700660.340487957 -> -750000
0.055717507097870114 -> 0.06
0.0008068701205775142 -> 0.00085
-8.66660070605576 -> -9
-400.09256079792976 -> -450
0.0011740548815578223 -> 0.0015
-5.3003294346854085e-8 -> -6e-8
-0.00005815960629843176 -> -0.00006
-742465964.5184875 -> -750000000
-81289225.90985894 -> -85000000
0.000901771713513881 -> 0.00095
-652726598.5496342 -> -700000000
-0.6498901364393532 -> -0.65
0.9978325804695487 -> 1
5409.4078950583935 -> 5500
26906671.095639467 -> 30000000
Check out the fiddle to experiment with the code. The code in the answer, the gist and the fiddle differs slightly; I'm using the one given in the answer.

If you are trying to get the scales looking right on VB.NET charts, I've used the example from Adam Liss, but make sure that when you set the min and max scale values you pass them in from a variable of type Decimal (not of type Single or Double); otherwise the tick mark values end up being set to something like 8 decimal places.
So as an example, I had 1 chart where I set the min Y Axis value to 0.0001 and the max Y Axis value to 0.002.
If I pass these values to the chart object as singles I get tick mark values of 0.00048000001697801, 0.000860000036482233 ....
Whereas if I pass these values to the chart object as decimals I get nice tick mark values of 0.00048, 0.00086 ......
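The effect is not specific to VB.NET. As a rough analogy in Python (not the chart API itself), the sketch below round-trips a value through a 32-bit float, which is what a Single holds, and compares it with decimal arithmetic. The choice of 5 intervals is an assumption inferred from the tick values quoted above, and as_single is a helper invented for this example.

```python
import struct
from decimal import Decimal

def as_single(x):
    """Round-trip a Python float through a 32-bit float (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Decimal arithmetic keeps the tick values exact.
lo, hi, intervals = Decimal("0.0001"), Decimal("0.002"), 5  # 5 intervals is an assumption
step = (hi - lo) / intervals
decimal_ticks = [lo + i * step for i in range(intervals + 1)]

# A 32-bit float cannot hold 0.00048 exactly, so extra digits appear when it is widened.
single_tick = as_single(0.00048)

print(single_tick)
print([str(t) for t in decimal_ticks[1:3]])
```

The single-precision value prints with a long tail of digits, while the decimal ticks stay at 0.00048 and 0.00086.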

In python:
import numpy as np
steps = [np.round(x) for x in np.linspace(min, max, num=num_of_steps)]

Here is an answer, written in Go, that can optionally always plot 0, handles positive and negative values as well as small and large numbers, and returns the tick interval size and how many intervals to plot.
forcePlotZero changes how the max values are rounded so it'll always make a nice multiple to then get back to zero. Example:
if forcePlotZero == false then 237 --> 240
if forcePlotZero == true then 237 --> 300
Intervals are calculated by getting the multiple of 10/100/1000 etc for max and then subtracting till the cumulative total of these subtractions is < min
Here's the output from the function, along with showing forcePlotZero
Force to plot zero | max and min inputs | rounded max and min | intervals
forcePlotZero=false | min: -104 max: 240 | minned: -160 maxed: 240 | intervalCount: 5 intervalSize: 100
forcePlotZero=true | min: -104 max: 240 | minned: -200 maxed: 300 | intervalCount: 6 intervalSize: 100
forcePlotZero=false | min: 40 max: 1240 | minned: 0 maxed: 1300 | intervalCount: 14 intervalSize: 100
forcePlotZero=false | min: 200 max: 240 | minned: 190 maxed: 240 | intervalCount: 6 intervalSize: 10
forcePlotZero=false | min: 0.7 max: 1.12 | minned: 0.6 maxed: 1.2 | intervalCount: 7 intervalSize: 0.1
forcePlotZero=false | min: -70.5 max: -12.5 | minned: -80 maxed: -10 | intervalCount: 8 intervalSize: 10
Here's the playground link https://play.golang.org/p/1IhiX_hRQvo
func getMaxMinIntervals(max float64, min float64, forcePlotZero bool) (maxRounded float64, minRounded float64, intervalCount float64, intervalSize float64) {
//STEP 1: start off determining the maxRounded value for the axis
precision := 0.0
precisionDampener := 0.0 //adjusts to prevent 235 going to 300, instead dampens the scaling to get 240
epsilon := 0.0000001
if math.Abs(max) >= 0 && math.Abs(max) < 2 {
precision = math.Floor(-math.Log10(epsilon + math.Abs(max) - math.Floor(math.Abs(max)))) //counting number of zeros between decimal point and rightward digits
precisionDampener = 1
precision = precision + precisionDampener
} else if math.Abs(max) >= 2 && math.Abs(max) < 100 {
precision = math.Ceil(math.Log10(math.Abs(max)+1)) * -1 //else count number of digits before decimal point
precisionDampener = 1
precision = precision + precisionDampener
} else {
precision = math.Ceil(math.Log10(math.Abs(max)+1)) * -1 //else count number of digits before decimal point
precisionDampener = 2
if forcePlotZero == true {
precisionDampener = 1
}
precision = precision + precisionDampener
}
useThisFactorForIntervalCalculation := 0.0 // this is needed because intervals are calculated from the max value with a zero origin, this uses range for min - max
if max < 0 {
maxRounded = (math.Floor(math.Abs(max)*(math.Pow10(int(precision)))) / math.Pow10(int(precision)) * -1)
useThisFactorForIntervalCalculation = (math.Floor(math.Abs(max)*(math.Pow10(int(precision)))) / math.Pow10(int(precision))) + ((math.Ceil(math.Abs(min)*(math.Pow10(int(precision)))) / math.Pow10(int(precision))) * -1)
} else {
maxRounded = math.Ceil(max*(math.Pow10(int(precision)))) / math.Pow10(int(precision))
useThisFactorForIntervalCalculation = maxRounded
}
minNumberOfIntervals := 2.0
maxNumberOfIntervals := 19.0
intervalSize = 0.001
intervalCount = minNumberOfIntervals
//STEP 2: get interval size (the step size on the axis)
for {
if math.Abs(useThisFactorForIntervalCalculation)/intervalSize < minNumberOfIntervals || math.Abs(useThisFactorForIntervalCalculation)/intervalSize > maxNumberOfIntervals {
intervalSize = intervalSize * 10
} else {
break
}
}
//STEP 3: check that intervals are not too large, safety for max and min values that are close together (240, 220 etc)
for {
if max-min < intervalSize {
intervalSize = intervalSize / 10
} else {
break
}
}
//STEP 4: now we can get minRounded by adding the interval size to 0 till we get to the point where another increment would make cumulative increments > min, opposite for negative in
minRounded = 0.0
if min >= 0 {
for {
if minRounded < min {
minRounded = minRounded + intervalSize
} else {
minRounded = minRounded - intervalSize
break
}
}
} else {
minRounded = maxRounded //keep going down, decreasing by the interval size till minRounded < min
for {
if minRounded > min {
minRounded = minRounded - intervalSize
} else {
break
}
}
}
//STEP 5: get number of intervals to draw
intervalCount = (maxRounded - minRounded) / intervalSize
intervalCount = math.Ceil(intervalCount) + 1 // include the origin as an interval
//STEP 6: Check that the intervalCount isn't too high
if intervalCount-1 >= (intervalSize * 2) && intervalCount > maxNumberOfIntervals {
intervalCount = math.Ceil(intervalCount / 2)
intervalSize *= 2
}
return
}

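This is in Python and for base 10.
It doesn't cover all your questions, but I think you can build on it.
import numpy as np
def create_ticks(lo, hi):
    # tick spacing: the largest power of ten not exceeding the range
    s = 10 ** (np.floor(np.log10(hi - lo)))
    start = s * np.floor(lo / s)
    end = s * np.ceil(hi / s)
    # step by index so both end points are included and rounding error does not accumulate
    count = int(round((end - start) / s))
    ticks = [start + i * s for i in range(count + 1)]
    return ticks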
