Algorithm to determine if a given date/time is between two date/time pairs

I have an array of dates in a one week range stored in an unusual way.
The Dates are stored in this numeric format: 12150
From left to right:
1st digit represents day: 1 = sunday, 2 = monday, 3 = tuesday, ...., 7 = saturday
next two digits represent hour in a 24 hour system: 00 = midnight, 23 = 11pm
next two digits represent minutes: 00-59
Given an input date and a start date and end date I need to know if the input date is between the start and end date.
I have an algorithm right now that I think works 100% of the time, but I am not sure.
In any case, I think there is probably a better and simpler way to do this and I was wondering if anybody knew what that algorithm was.
If not it would be cool if someone could double check my work and verify that it does actually work for 100% of valid cases.
What I have right now is:
if (startDate < inputDate &&
endDate > inputDate) {
inRange = yes;
}
else if (endDate < startDate) {
if((inputDate + 72359) > startDate &&
(inputDate + 72359) < endDate) {
inRange = yes;
}
else if((inputDate + 72359) > startDate &&
(inputDate + 72359) < (endDate + 72359)) {
inRange = yes;
}
}

How about
const int MAX = 72460; // Or anything more than the highest legal value
inRange = (MAX + inputDate - startDate) % MAX <
(MAX + endDate - startDate) % MAX;
This assumes of course that all the dates are well formed (according to your specs).
This addresses the case where the start is "after" the end. (e.g. Friday is in range if start is Wednesday and end is Monday)
It may take a second to see (which probably isn't good, because readability is usually the most important) but I think it does work.
Here's the basic trick:
Legend:
0: Minimum time
M: Maximum time
S: Start time
1,2,3: Input Time test points
E: End Time
The S < E case
1 < S => Not in range
2 In range
3 > E => Not in range
The S > E case
                    0                 M
Original            -1--E----2---S--3--
Add Max             -------------------1--E----2---S--3--
Subtract StartDate  ------1--E----2---S--3--
% Max               S--3--1--E----2----
1 In range
2 > E => Not in range
3 In range
If you really want to go nuts (and be even more difficult to decipher)
const int MAX = 0x20000;
const int MASK = 0x1FFFF;
int maxMinusStart = MAX - startDate;
inRange = ((maxMinusStart + inputDate) & MASK) <
((maxMinusStart + endDate) & MASK); // parentheses needed: < binds tighter than &
which ought to be slightly faster (trading modulus for a bitwise and) which we can do since the value of MAX doesn't really matter (as long as it exceeds the maximum well-formed value) and we're free to choose one that makes our computations easy.
(And of course you can replace the < with a <= if that's what you really need)
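For comparison, here is a minimal Python sketch of the same modular comparison, assuming well-formed DHHMM values. The names in_range and MAX_CODE are illustrative and not part of the original answer. Like the formula above, it treats the start as inclusive and the end as exclusive.

```python
MAX_CODE = 72460  # assumption: any value above the largest legal code (72359)


def in_range(input_date: int, start_date: int, end_date: int) -> bool:
    """Circular range test on DHHMM codes: start inclusive, end exclusive."""
    # Distances measured forward from start_date, wrapping at MAX_CODE.
    # Python's % already returns a non-negative result for a positive modulus,
    # so the extra "+ MAX" from the C version is not needed here.
    offset_input = (input_date - start_date) % MAX_CODE
    offset_end = (end_date - start_date) % MAX_CODE
    return offset_input < offset_end


# Friday 12:00 lies between Wednesday 09:00 and Monday 17:00 (wrapping the week).
print(in_range(61200, 40900, 21700))  # True
# Tuesday 12:00 does not.
print(in_range(31200, 40900, 21700))  # False
```

Only the circular ordering of the codes matters here, so the gaps in the encoding (for example between 12359 and 20000) do not affect the result.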

There is a logical problem with dates in that format. Since the month and year information is missing, you cannot know which calendar day is meant. For example, 50755 might be Thursday, March 12, 2009, but it might just as well be exactly a week earlier, or 18 weeks later. Therefore you can never be 100% sure whether a date in that format lies between two other dates.

Here the condition of the inner if can never be true, since endDate < startDate:
if (endDate < startDate) {
if((inputDate + 72359) > startDate &&
(inputDate + 72359) < endDate) {
// never reached
inRange = yes;
}
The following if also can't be optimal, since the first part is always true and the second part is just identical to inputDate < endDate:
if((inputDate + 72359) > startDate &&
(inputDate + 72359) < (endDate + 72359))
I think you want something like this:
if (startDate < endDate)
inRange = (startDate < inputDate) && (inputDate < endDate);
else
inRange = (startDate < inputDate) || (inputDate < endDate);
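The two-branch version above can be sketched in Python as follows. The function name in_range_strict is hypothetical, and the bounds are exclusive, as in the snippet it mirrors.

```python
def in_range_strict(input_date: int, start_date: int, end_date: int) -> bool:
    """Two-branch range test with exclusive bounds on DHHMM codes."""
    if start_date < end_date:
        # Ordinary case: the range does not wrap past Saturday night.
        return start_date < input_date < end_date
    # Wrapping case: in range if after the start OR before the end.
    return start_date < input_date or input_date < end_date


# Wednesday 09:00 to Monday 17:00 wraps around the end of the week.
print(in_range_strict(61200, 40900, 21700))  # True  (Friday 12:00)
print(in_range_strict(31200, 40900, 21700))  # False (Tuesday 12:00)
```

Switching the comparisons to <= would make the bounds inclusive, if that is what is needed.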

You should use >= and <= if you want the bounds themselves to count as in range.
Say I pick the date 10000 or 72359: how would you handle that? Is it in range or not?
Also, I don't know the values of startDate and endDate, since you didn't initialize them. Correct me if I'm wrong, but an uninitialized variable will start as 0, null, or ''.
So I assume startDate = 10000 and endDate = 72359.
By the way, why did you pick this kind of array (int or string values?), and why is the first value the day of the week rather than the day of the month? For example:
010000 -> 1st of the month, 00:00
312359 -> 31st of the month, 23:59
But it's up to you :D
Sorry if I'm wrong; I only took an algorithms class at university, and that was 5 years ago :D

A better approach might be to normalize your data converting all the day of the week values to be relative to the start date. Something like this:
const int dayScale = 10000; // scale factor for the day of the week
int NormalizeDate(int date, int startDay)
{
int day = (date / dayScale) - 1; // this would be a lot easier if Sunday was 0
int sday = startDay - 1;
// always rotate, so that the start day becomes day 0 and later days keep their order
day = (day + 7 - sday) % 7;
return ((day+1) * dayScale) + (date % dayScale);
}
int startDay = startDate / dayScale; // isolate the day of the week
int normalizedStartDate = NormalizeDate(startDate, startDay);
int normalizedEndDate = NormalizeDate(endDate, startDay);
int normalizedInputDate = NormalizeDate(inputDate, startDay);
inRange = normalizedInputDate >= normalizedStartDate &&
normalizedInputDate <= normalizedEndDate;
I am pretty sure this will work as written. In any case, the concept is cleaner than multiple comparisons.

The simplest solution I found is this:
Let x be the time to test, and S and E the start and end times respectively (with 0 < S,E < T):
f(x) = [(x-S) * (x-E) * (E-S) < 0]
This function returns TRUE if x is in between the start and end time, and FALSE otherwise.
It also handles a start time later than the end time (e.g. if you start working at 20:00 and finish at 04:00, 23:13 returns TRUE).
I must say that, given the multiplications, it may not be the most efficient in terms of speed, but it is definitely the most compact (and pretty, IMHO).
EDIT:
I found a much more elegant and efficient solution:
f(x) = (x<S) XOR (x<E) XOR (E<S)
You can substitute XOR with the "not equal" operator ( != ).
Explanation:
The first formula comes from studying the sign of each factor:
if S < E:
...............S.....E..........
(x-S)----------+++++++++++++++++
(x-E)----------------+++++++++++
(E-S)+++++++++++++++++++++++++++
total++++++++++------+++++++++++
so, the total is negative if x is in between S and E
if S > E:
...............E.....S..........
(x-S)----------------+++++++++++
(x-E)----------+++++++++++++++++
(E-S)---------------------------
total----------++++++-----------
so, the total is negative if x is bigger than S or smaller than E
To reach the final equation, you decompose the first formula in 3 terms:
(x-S)<0 => x<S
(x-E)<0 => x<E
(E-S)<0 => E<S
The product of these terms is negative only if they are all negative (true, true, true) or exactly one is negative and the others are positive (true, false, false, in any order).
Therefore the problem can be solved via
f(x) = (x<S) != (x<E) != (E<S)
This solution can be applied to any similar problem in a periodic system, such as checking whether the angle x is inside the arc formed by the two angles S and E.
Just make sure that all the variables are between 0 and the period of your system (2PI for arcs on a circle, 24 for hours, 24*60*60 for the seconds of a day, and so on).
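As a rough check of the XOR formula, the Python sketch below compares it against a plain two-branch definition on a small grid of values. The function names are made up for this illustration. Note that in Python a chained a != b != c means (a != b) and (b != c), which is not XOR, so the parentheses are required.

```python
from itertools import product


def between_cyclic(x, s, e):
    """XOR form: true when x lies on the arc running forward from s to e."""
    # Explicit parentheses: Python's chained != is NOT a three-way XOR.
    return (x < s) != ((x < e) != (e < s))


def between_reference(x, s, e):
    """Straightforward two-branch definition, used only as a cross-check."""
    if s <= e:
        return s <= x < e
    return x >= s or x < e


# Exhaustive comparison on a small grid, including the equality cases.
mismatches = [
    (x, s, e)
    for x, s, e in product(range(8), repeat=3)
    if between_cyclic(x, s, e) != between_reference(x, s, e)
]
print(mismatches)  # []

# Shift from 20:00 to 04:00, tested at 23:13 (all values in minutes since midnight).
print(between_cyclic(23 * 60 + 13, 20 * 60, 4 * 60))  # True
```

With the strict < comparisons, the start counts as inside the range and the end as outside.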

Related

Algorithm to fit as many events into a schedule as possible

I'm trying to find an algorithm that can arrange as many of these non-overlapping events into a schedule as possible (where any of these events can be added or removed from the schedule as needed). None of these events can overlap, but I want to fit as many of them into a daily schedule as possible:
12:00 PM - 12:45 PM: Lunch
1:00 AM - 3:00 AM: Math class 1
3:30 PM - 5:00 PM: Math class 2
7:00 PM - 10:00 PM: History class 1
9:00 PM - 11:00 PM: History class 2
Any time of day: Grocery shopping, 40 minutes
Any time of day: Study math for 30 minutes
Any time of day between 11:00 AM and 4:00 PM: Basketball practice for 2 hours
I've been thinking about this problem for a while, and I still have no idea about how I should solve it. What type of calendar-scheduling algorithm would be most effective in this case?
You are bin packing periods into a single day length. You want to find the possible solutions for your problem and grade them according to the number of periods you manage to pack into it.
Split your day in 15 mins intervals, so that from 1 am to 10 pm you have 21 * 4 frames.
Generate every permutation possible with your constraints (no overlap of frames).
For each valid permutation, count the number of periods you managed to fit in.
Print the [x] permutations that scored the highest
I've written a function called generateCombination that takes an array of integer ranges as input, and generates all possible non-overlapping combinations of the events in the array. From this array, you can extract the largest arrays of ranges, which are the ranges that contain the greatest possible number of events.
http://jsfiddle.net/nvYZ8/1/
var theArray = generateCombination([[0, 2], [2, 3], [4, 5], [0, 9], [2, 50]]);
alert(JSON.stringify(theArray));
function generateCombination(theArray) {
var theString = "";
var tempArray = new Array();
for (var i = 0; i < theArray.length; i++) {
theString += "1";
}
var maximumNumber = convertFromBaseToBase(theString, 2, 10);
for (var k = 0; k <= maximumNumber; k++) {
theString = convertFromBaseToBase(k + "", 10, 2);
while(theString.length != theArray.length){
theString = "0" + theString;
}
var theResult = getArray(theArray, theString);
if(theResult != false){
tempArray[tempArray.length] = JSON.stringify(theResult);
}
}
return tempArray;
}
function getArray(theArray, theString){
var tempArray = new Array();
for(var i = 0; i < theArray.length; i++){
if(theString[i] == 1){
tempArray[tempArray.length] = theArray[i];
}
}
for (var i = 0; i < theArray.length; i++) {
for (var j = i; j < theArray.length; j++) {
if ((j != i) && (theString[i] == 1) && (theString[j] == 1)) {
//check whether theArray[i] overlaps with theArray[j]
var overlaps = rangesOverlap(theArray[i][0], theArray[i][1], theArray[j][0], theArray[j][1]);
//if overlaps is true, break out of the current loop
//otherwise, add theArray[j] to tempArray
if(overlaps == true){
return false;
}
}
}
}
return tempArray;
}
function convertFromBaseToBase(str, fromBase, toBase) {
var num = parseInt(str, fromBase);
return num.toString(toBase);
}
function rangesOverlap(x1, x2, y1, y2) {
if (x1 <= y2 && y1 <= x2) {
return true;
} else {
return false;
}
}
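The step of extracting the largest non-overlapping combinations is described above but not shown. A minimal Python sketch of that step, assuming the same closed-interval overlap rule as rangesOverlap, could look like this (largest_compatible_subsets is a hypothetical name, and the search is exponential, so it only suits small inputs):

```python
from itertools import combinations


def ranges_overlap(a, b):
    """Closed-interval overlap test, same rule as rangesOverlap above."""
    return a[0] <= b[1] and b[0] <= a[1]


def largest_compatible_subsets(ranges):
    """Return every largest subset of ranges in which no two members overlap."""
    # Try subset sizes from largest to smallest; the first size that yields
    # any conflict-free subset is the maximum.
    for size in range(len(ranges), 0, -1):
        found = [
            list(subset)
            for subset in combinations(ranges, size)
            if not any(ranges_overlap(a, b) for a, b in combinations(subset, 2))
        ]
        if found:
            return found
    return []


print(largest_compatible_subsets([[0, 2], [2, 3], [4, 5], [0, 9], [2, 50]]))
# [[[0, 2], [4, 5]], [[2, 3], [4, 5]]]
```

Because the overlap test is inclusive, ranges that merely touch (such as [0, 2] and [2, 3]) count as overlapping here.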
I think Dynamic Programming is the solution ..
For a, b as events: f(a) > f(b) ~ duration(a) < duration(b)
For x, y as schedules: g(x) > g(y) ~ Number-Of-Events(x) > Number-Of-Events(y)
Dynamic Programming with f(event) over g(schedule); to find the optimal schedule
OTOH I can think of two suitable solutions, one with planning algorithms, PopPlan or GraphPlan; the other, you could use simulated annealing.

Fair product distribution algorithm

Here is my problem:
There are n companies distributing products.
All products should be distributed in k days
Distributing products of company Ci should be consecutive - it means that it can be distributed on days 2,3,4,5 but not 2,3,6,7
number of distributed products by company Ci on day j should be less than (or equal) on day j-1 (if there were any on day j-1)
difference between distributed products between days i and j should not be greater than 1
Example:
We have 3 days to distribute products. Products of company A: a,a,a,a,a. Products of company B: b,b,b. Products of company C: c,c
Fair distribution:
[aab,aabc,abc]
Invalid distribution:
[aabc,aabc,ab]
because on 1st day there are 4 products, on 3rd day 2 products (difference > 1)
Invalid distribution:
[abc,aabc,aab]
because on the 1st day there is one product A and on the 2nd day there are two, so the distribution of product A is not non-increasing
EDIT
if there is a case that makes fair distribution impossible please provide it with short description, I'll accept the answer
Gareth Rees's comment on djna's answer is right -- the following counterexample is unsolvable:
3 days, 7 items from company A and 5 items from company B
I tested this with the following dumbest-possible brute-force Perl program (which takes well under a second, despite being very inefficient):
my ($na, $nb) = (7, 5);
for (my $a1 = 0; $a1 <= $na; ++$a1) {
for (my $a2 = 0; $a2 <= $na - $a1; ++$a2) {
my $a3 = $na - $a1 - $a2;
for (my $b1 = 0; $b1 <= $nb; ++$b1) {
for (my $b2 = 0; $b2 <= $nb - $b1; ++$b2) {
my $b3 = $nb - $b1 - $b2;
if ($a1 >= $a2 && $a2 >= $a3 || $a1 == 0 && $a2 >= $a3 || $a1 == 0 && $a2 == 0) {
if ($b1 >= $b2 && $b2 >= $b3 || $b1 == 0 && $b2 >= $b3 || $b1 == 0 && $b2 == 0) {
if (max($a1 + $b1, $a2 + $b2, $a3 + $b3) - min($a1 + $b1, $a2 + $b2, $a3 + $b3) <= 1) {
print "Success! ($a1,$a2,$a3), ($b1,$b2,$b3)\n";
}
}
}
}
}
}
}
Please have a look and verify that I haven't made any stupid mistakes. (I've omitted max() and min() for brevity -- they just do what you'd expect.)
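The same brute-force check can be generalized to any number of companies and days. The Python sketch below is an independent illustration, not a translation of the Perl program. All function names are assumptions, and it encodes the rules as "leading empty days are allowed, then counts never increase" plus "daily totals differ by at most 1".

```python
from itertools import product


def splits(total, days):
    """Yield every way to split `total` items over `days` days."""
    if days == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in splits(total - first, days - 1):
            yield (first,) + rest


def valid_company(counts):
    """Leading zero days are allowed; after the first delivery, counts never rise."""
    started = [c for i, c in enumerate(counts) if any(counts[: i + 1])]
    return all(a >= b for a, b in zip(started, started[1:]))


def fair_distributions(items_per_company, days):
    """Return all per-company daily counts that satisfy the constraints."""
    options = [
        [s for s in splits(n, days) if valid_company(s)] for n in items_per_company
    ]
    result = []
    for choice in product(*options):
        totals = [sum(day) for day in zip(*choice)]
        if max(totals) - min(totals) <= 1:
            result.append(choice)
    return result


print(len(fair_distributions((7, 5), 3)))  # 0, so the counterexample has no solution
```

For the example from the question, fair_distributions((5, 3, 2), 3) includes ((2, 2, 1), (1, 1, 1), (0, 1, 1)), which corresponds to [aab,aabc,abc].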
Since I thought the problem was fun, I did a model for finding solutions using MiniZinc. With the Gecode backend, the initial example is shown to have 20 solutions in about 1.6 ms.
include "globals.mzn";
%%% Data
% Number of companies
int: n = 3;
% Number of products per company
array[1..n] of int: np = [5, 3, 2];
% Number of days
int: k = 3;
%%% Computed values
% Total number of products
int: totalnp = sum(np);
% Offsets into products array to get single companys products
% (shifted cumulative sum).
array[1..n] of int: offset = [sum([np[j] | j in 1..i-1])
| i in 1..n];
%%% Predicates
predicate fair(array[int] of var int: x) =
let { var int: low,
var int: high
} in
minimum(low, x) /\
maximum(high, x) /\
high-low <= 1;
predicate decreasing_except_0(array[int] of var int: x) =
forall(i in 1..length(x)-1) (
(x[i] == 0) \/
(x[i] >= x[i+1])
);
predicate consecutive(array[int] of var int: x) =
forall(i in 1..length(x)-1) (
(x[i] == x[i+1]) \/
(x[i] == x[i+1]-1)
);
%%% Variables
% Day of production for all products from all companies
array[1..totalnp] of var 1..k: products
:: is_output;
% total number of products per day
array[1..k] of var 1..totalnp: productsperday
:: is_output;
%%% Constraints
constraint global_cardinality(products, productsperday);
constraint fair(productsperday);
constraint
forall(i in 1..n) (
let {
% Products produced by company i
array[1..np[i]] of var int: pi
= [products[j] |
j in 1+offset[i]..1+offset[i]+np[i]-1],
% Products per day by company i
array[1..k] of var 0..np[i]: ppdi
} in
consecutive(pi) /\
global_cardinality(pi, ppdi) /\
decreasing_except_0(ppdi)
);
%%% Find a solution, default search strategy
solve satisfy;
The predicates decreasing_except_0 and consecutive are both very naive, and have large decompositions. To solve larger instances, one should probably replace them with smarter variants (for example by using the regular constraint).
It has been shown that the points 4 and 5 were incompatible:
4: For any day j, for any company A, C(j,A) == 0 or C(j,A) >= C(j+1,A)
5: For any days i and j, |C(i) - C(j)| <= 1
You thus need to relax one of the constraints. Honestly, while I get a feeling for why 4 was put in place (to avoid delaying the distribution of one company indefinitely), I think it could be expressed differently, treating the first and last day of distribution as special (since on the first day you typically take what's left by the previous company, and on the last day you distribute what's left).
Point 3 does force the contiguity.
Mathematically:
For any company A that has products, there exist two days i and j such that:
C(i,A) > 0 and C(j,A) > 0
for any day x such that x < i or x > j, C(x,A) = 0
for any day x such that i < x < j, C(x,A) = C(x)
Admittedly, the problem then becomes trivial to solve :)
I don't think that you can always fulfil your requirements.
Consider 4 days, and 6 items from supplier A and 6 items from supplier B.

Algorithm needed to calculate difference between two times

I have an hour selection drop down 0-23 and minutes selection drop down 0-59 for Start time and End time respectively (so four controls).
I'm looking for an algorithm to calculate time difference using these four values.
Since they're not stored in fancy date/time selection controls, I don't think I can use any standard date/time manipulation functions.
How do I calculate the difference between the two times?
This pseudo-code gives you the algorithm to work out the difference in minutes. It assumes that, if the start time is after the end time, the start time was actually on the previous day.
const MINS_PER_HR = 60, MINS_PER_DAY = 1440
startx = starthour * MINS_PER_HR + startminute
endx = endhour * MINS_PER_HR + endminute
duration = endx - startx
if duration < 0:
duration = duration + MINS_PER_DAY
The startx and endx values are the number of minutes since midnight.
This is basically doing:
Get number of minutes from start of day for start time.
Get number of minutes from start of day for end time.
Subtract the former from the latter.
If result is negative, add number of minutes in a day.
Don't be so sure though that you can't use date/time manipulation functions. You may find that you could easily construct a date/time and calculate differences with something like:
DateTime startx = new DateTime (1, 1, 2010, starthour, startminute, 0);
DateTime endx = new DateTime (1, 1, 2010, endhour , endminute , 0);
Integer duration = DateTime.DiffSecs(endx, startx) / 60;
if (duration < 0)
duration = duration + 1440;
although it's probably not needed for your simple scenario. I'd stick with the pseudo-code I gave above unless you find yourself doing some trickier date/time manipulation.
If you then want to turn the duration (in minutes) into hours and minutes:
durHours = int(duration / 60)
durMinutes = duration % 60 // could also use duration - (durHours * 60)
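The pseudo-code above translates almost directly into Python. In this sketch the function name duration_minutes is an assumption, and the modulo operator replaces the explicit "add a day if negative" step.

```python
MINS_PER_HR = 60
MINS_PER_DAY = 1440


def duration_minutes(start_hour, start_minute, end_hour, end_minute):
    """Minutes from start to end, assuming the end is less than 24 hours later."""
    start = start_hour * MINS_PER_HR + start_minute
    end = end_hour * MINS_PER_HR + end_minute
    # The modulo folds a negative difference into the next day.
    return (end - start) % MINS_PER_DAY


hours, minutes = divmod(duration_minutes(22, 30, 1, 15), MINS_PER_HR)
print(f"{hours}:{minutes:02d}")  # 2:45
```

Identical start and end times give 0 here rather than a full day.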
This computes the duration in minutes, also taking the day difference into account.
//* Assumptions:
Date is in Julian Format
startx = starthour * 60 + startminute
endx = endhour * 60 + endminute
duration = endx - startx
if currday > prevday
duration = duration + (currday - prevday) * 1440
else if duration <= 0
duration = duration + 1440
end-if
First you need to check to see if the end time is greater than or equal to the start time to prevent any problems. To do this you first check to see if the End_Time_Hour is greater than Start_Time_Hour. If they're equal you would instead check to see if End_Time_Min is greater than or equal to Start_Time_Min.
Next you would subtract Start_Time_Hour from End_Time_Hour. Then you would subtract Start_Time_Min from End_Time_Min. If the minute difference is less than 0, decrement the hour difference by one and add 60 to the minute difference. Concatenate the two and you should be all set.
$start_time_hr = 5;
$start_time_mi = 50;
$end_time_hr = 8;
$end_time_mi = 30;
$diff = (($end_time_hr*60)+$end_time_mi) - (($start_time_hr*60)+$start_time_mi);
$diff_hr = (int)($diff / 60);
$diff_mi = (int)($diff) - ($diff_hr*60);
echo $diff_hr . ':' . $diff_mi;
simple equation should help:
mindiff = 60 + endtime.min - starttime.min
hrdiff = ((mindiff/60) - 1) + endtime.hr - starttime.hr // integer division
mindiff = mindiff % 60
This gives you the duration in hours and minutes
int h1 = hora1; // start hour
int m1 = min1; // start minute
int h2 = hora2; // end hour
int m2 = min2; // end minute
int h3, m3;
if (m1 > m2)
{
h3 = (h2 - h1) - 1; // borrow one hour for the minutes
}
else
{
h3 = h2 - h1;
}
m1 = 60 - m1; // minutes left in the start hour
if (m1 + m2 >= 60)
{
m3 = (m1 + m2) - 60;
}
else
{
m3 = m1 + m2;
}
System.out.println("duration:" + h3 + "h" + m3 + "min");
If you have a function that returns the number of days since some start date (e.g. dayssince1900) you can just convert both dates to seconds since that start date, do the ABS(d1-d2) then convert the seconds back to whatever format you want e.g. HHHH:MM:SS
Simple e.g.
SecondsSince1900(d)
{
return dayssince1900(d)*86400
+hours(d)*3600
+minutes(d)*60
+seconds(d);
}
diff = ABS(SecondsSince1900(d1)-SecondsSince1900(d2))
return format(diff DIV 3600)+':'+format((diff DIV 60) MOD 60)+':'+format(diff MOD 60);
Hum: Not that simple if you have to take into account the leap seconds astronomers are keen to put in from time to time.

Non-linear counter

So I have a counter. It is supposed to calculate the current amount of something. To calculate this, I know the start date, and start amount, and the amount to increment the counter by each second. Easy peasy. The tricky part is that the growth is not quite linear. Every day, the increment amount increases by a set amount. I need to recreate this algorithmically - basically figure out the exact value at the current date based on the starting value, the amount incremented over time, and the amount the increment has increased over time.
My target language is Javascript, but pseudocode is fine too.
Based on AB's solution:
var now = new Date();
var startDate1 = new Date("January 1 2010");
var days1 = (now - startDate1) / 1000 / 60 / 60 / 24;
var startNumber1 = 9344747520;
var startIncrement1 = 463;
var dailyIncrementAdjustment1 = .506;
var currentIncrement = startIncrement1 + (dailyIncrementAdjustment1 * days1);
startNumber1 = startNumber1 + (days1 / 2) * (2 * startIncrement1 + (days1 - 1) * dailyIncrementAdjustment1);
Does that look reasonable to you guys?
It's a quadratic function. If t is the time passed, then it's the usual at^2 + bt + c, and you can figure out a, b, c by substituting the results for the first 3 seconds.
Or: use the formula for the arithmetic progression sum, where a1 is the initial increment, and d is the "set amount" you refer to. Just don't forget to add your "start amount" to what the formula gives you.
If x0 is the initial amount, d is the initial increment, and e is the "set amount" by which the increment increases, it comes to
x0 + (t/2)*(2d + (t-1)*e)
If I understand your question correctly, you have an initial value x_0, an initial increment per second of d_0 and an increment adjustment of e per day. That is, on day one the increment per second is d_0, on day two the increment per second is d_0 + e, etc.
Then, we note that the increment per second at time t is
d(t) = d_0 + floor(t / S) * e
where S is the number of seconds per day and t is the number of seconds that have elapsed since the start (t_0 = 0). Writing n = floor(t / S) for the number of complete days, we get
x = x_0 + sum_{k < n} S * (d_0 + k * e) + (t - n * S) * (d_0 + n * e)
which is the formula that you are seeking: the sum covers the complete days and the last term covers the current, partial day. From here, you can simplify this to
x = x_0 + S * n * d_0 + S * e * (n - 1) * n / 2 + (t - n * S) * (d_0 + n * e).
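To sanity-check the closed form, including the partial-day term from the sum formula above, here is a small Python sketch that compares it with a second-by-second simulation. The function names and the day_seconds parameter are assumptions for this example, and integer inputs are assumed so the results can be compared exactly.

```python
def counter_closed_form(x0, d0, e, t, day_seconds=86400):
    """Counter value after t seconds, with the increment growing by e each day."""
    n = t // day_seconds  # number of complete days
    # n * (n - 1) is always even, so the integer division is exact.
    full_days = day_seconds * n * d0 + day_seconds * e * n * (n - 1) // 2
    partial_day = (t - n * day_seconds) * (d0 + n * e)
    return x0 + full_days + partial_day


def counter_simulated(x0, d0, e, t, day_seconds=86400):
    """Reference implementation: add the current increment once per second."""
    x = x0
    for second in range(t):
        x += d0 + (second // day_seconds) * e
    return x


# A short "day" of 10 seconds keeps the simulation fast.
print(counter_closed_form(100, 5, 2, 37, day_seconds=10))  # 387
print(counter_simulated(100, 5, 2, 37, day_seconds=10))    # 387
```

With the real day length (86400 seconds), only the closed form is practical for dates far from the start.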
use strict; use warnings;
my $start = 0;
my $stop = 100;
my $current = $start;
for my $day ( 1 .. 100 ) {
$current += ($day / 10);
last unless $current < $stop;
printf "Day: %d\tLeft %.2f\n", $day, (1 - $current/$stop);
}
Output:
Day: 1 Left 1.00
Day: 2 Left 1.00
Day: 3 Left 0.99
Day: 4 Left 0.99
Day: 5 Left 0.98
...
Day: 42 Left 0.10
Day: 43 Left 0.05
Day: 44 Left 0.01

Tickmark algorithm for a graph axis

I'm looking for an algorithm that places tick marks on an axis, given a range to display, a width to display it in, and a function to measure a string width for a tick mark.
For example, given that I need to display between 1e-6 and 5e-6 and a width to display in pixels, the algorithm would determine that I should put tickmarks (for example) at 1e-6, 2e-6, 3e-6, 4e-6, and 5e-6. Given a smaller width, it might decide that the optimal placement is only at the even positions, i.e. 2e-6 and 4e-6 (since putting more tickmarks would cause them to overlap).
A smart algorithm would give preference to tickmarks at multiples of 10, 5, and 2. Also, a smart algorithm would be symmetric around zero.
As I didn't like any of the solutions I've found so far, I implemented my own. It's in C# but it can be easily translated into any other language.
It basically chooses from a list of possible steps the smallest one that displays all values, without leaving any value exactly in the edge, lets you easily select which possible steps you want to use (without having to edit ugly if-else if blocks), and supports any range of values. I used a C# Tuple to return three values just for a quick and simple demonstration.
private static Tuple<decimal, decimal, decimal> GetScaleDetails(decimal min, decimal max)
{
// Minimal increment to avoid round extreme values to be on the edge of the chart
decimal epsilon = (max - min) / 1e6m;
max += epsilon;
min -= epsilon;
decimal range = max - min;
// Target number of values to be displayed on the Y axis (it may be less)
int stepCount = 20;
// First approximation
decimal roughStep = range / (stepCount - 1);
// Set best step for the range
decimal[] goodNormalizedSteps = { 1, 1.5m, 2, 2.5m, 5, 7.5m, 10 }; // keep the 10 at the end
// Or use these if you prefer: { 1, 2, 5, 10 };
// Normalize rough step to find the normalized one that fits best
decimal stepPower = (decimal)Math.Pow(10, -Math.Floor(Math.Log10((double)Math.Abs(roughStep))));
var normalizedStep = roughStep * stepPower;
var goodNormalizedStep = goodNormalizedSteps.First(n => n >= normalizedStep);
decimal step = goodNormalizedStep / stepPower;
// Determine the scale limits based on the chosen step.
decimal scaleMax = Math.Ceiling(max / step) * step;
decimal scaleMin = Math.Floor(min / step) * step;
return new Tuple<decimal, decimal, decimal>(scaleMin, scaleMax, step);
}
static void Main()
{
// Dummy code to show a usage example.
var minimumValue = data.Min();
var maximumValue = data.Max();
var results = GetScaleDetails(minimumValue, maximumValue);
chart.YAxis.MinValue = results.Item1;
chart.YAxis.MaxValue = results.Item2;
chart.YAxis.Step = results.Item3;
}
Take the longest of the segments about zero (or the whole graph, if zero is not in the range) - for example, if you have something on the range [-5, 1], take [-5,0].
Figure out approximately how long this segment will be, in ticks. This is just dividing the length by the width of a tick. So suppose the method says that we can put 11 ticks in from -5 to 0. This is our upper bound. For the shorter side, we'll just mirror the result on the longer side.
Now try to put in as many ticks as possible (up to 11), such that the marker for each tick is of the form i*10*10^n, i*5*10^n, or i*2*10^n, where n is an integer and i is the index of the tick. Now it's an optimization problem - we want to maximize the number of ticks we can put in, while at the same time minimizing the distance between the last tick and the end of the result. So assign a score for getting as many ticks as we can, less than our upper bound, and assign a score to getting the last tick close to n - you'll have to experiment here.
In the above example, try n = 1. We get 1 tick (at i=0). n = 2 gives us 1 tick, and we're further from the lower bound, so we know that we have to go the other way. n = 0 gives us 6 ticks, at each integer point. n = -1 gives us 12 ticks (0, -0.5, ..., -5.0). n = -2 gives us 24 ticks, and so on. The scoring algorithm will give them each a score - higher means a better method.
Do this again for the i * 5 * 10^n, and i*2*10^n, and take the one with the best score.
(as an example scoring algorithm, say that the score is the distance to the last tick times the maximum number of ticks minus the number needed. This will likely be bad, but it'll serve as a decent starting point).
Funnily enough, just over a week ago I came here looking for an answer to the same question, but went away again and decided to come up with my own algorithm. I am here to share, in case it is of any use.
I wrote the code in Python to try and bust out a solution as quickly as possible, but it can easily be ported to any other language.
The function below calculates the appropriate interval (which I have allowed to be either 10**n, 2*10**n, 4*10**n or 5*10**n) for a given range of data, and then calculates the locations at which to place the ticks (based on which numbers within the range are divisible by the interval). I have not used the modulo % operator, since it does not work properly with floating-point numbers due to floating-point arithmetic rounding errors.
Code:
import math


def get_tick_positions(data: list):
    if len(data) == 0:
        return []
    retpoints = []
    data_range = max(data) - min(data)
    lower_bound = min(data) - data_range/10
    upper_bound = max(data) + data_range/10
    view_range = upper_bound - lower_bound
    num = lower_bound
    n = math.floor(math.log10(view_range) - 1)
    interval = 10**n
    num_ticks = 1
    while num <= upper_bound:
        num += interval
        num_ticks += 1
        if num_ticks > 10:
            if interval == 10 ** n:
                interval = 2 * 10 ** n
            elif interval == 2 * 10 ** n:
                interval = 4 * 10 ** n
            elif interval == 4 * 10 ** n:
                interval = 5 * 10 ** n
            else:
                n += 1
                interval = 10 ** n
            num = lower_bound
            num_ticks = 1
    if view_range >= 10:
        copy_interval = interval
    else:
        if interval == 10 ** n:
            copy_interval = 1
        elif interval == 2 * 10 ** n:
            copy_interval = 2
        elif interval == 4 * 10 ** n:
            copy_interval = 4
        else:
            copy_interval = 5
    first_val = 0
    prev_val = 0
    times = 0
    temp_log = math.log10(interval)
    if math.isclose(lower_bound, 0):
        first_val = 0
    elif lower_bound < 0:
        if upper_bound < -2*interval:
            if n < 0:
                copy_ub = round(upper_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) + 2
            else:
                times = upper_bound // round(interval) + 2
        while first_val >= lower_bound:
            prev_val = first_val
            first_val = times * copy_interval
            if n < 0:
                first_val *= (10**n)
            times -= 1
        first_val = prev_val
        times += 3
    else:
        if lower_bound > 2*interval:
            if n < 0:
                copy_ub = round(lower_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) - 2
            else:
                times = lower_bound // round(interval) - 2
        while first_val < lower_bound:
            first_val = times*copy_interval
            if n < 0:
                first_val *= (10**n)
            times += 1
    if n < 0:
        retpoints.append(first_val)
    else:
        retpoints.append(round(first_val))
    val = first_val
    times = 1
    while val <= upper_bound:
        val = first_val + times * interval
        if n < 0:
            retpoints.append(val)
        else:
            retpoints.append(round(val))
        times += 1
    retpoints.pop()
    return retpoints
When passing in the following three data-points to the function
points = [-0.00493, -0.0003892, -0.00003292]
... the output I get (as a list) is as follows:
[-0.005, -0.004, -0.003, -0.002, -0.001, 0.0]
When passing this:
points = [1.399, 38.23823, 8309.33, 112990.12]
... I get:
[0, 20000, 40000, 60000, 80000, 100000, 120000]
When passing this:
points = [-54, -32, -19, -17, -13, -11, -8, -4, 12, 15, 68]
... I get:
[-60, -40, -20, 0, 20, 40, 60, 80]
... which all seem to be a decent choice of positions for placing ticks.
The function is written to allow 5-10 ticks, but that could easily be changed if you so please.
Whether the list of data supplied contains ordered or unordered data it does not matter, since it is only the minimum and maximum data points within the list that matter.
This simple algorithm yields an interval that is 1, 2, or 5 times a power of 10, and the axis range gets divided into at least 5 intervals. The code sample is in Java:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / x >= 5)
return x;
else if (range / (x / 2.0) >= 5)
return x / 2.0;
else
return x / 5.0;
}
This is an alternative, for minimum 10 intervals:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / (x / 2.0) >= 10)
return x / 2.0;
else if (range / (x / 5.0) >= 10)
return x / 5.0;
else
return x / 10.0;
}
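As a rough illustration of how the first variant could be used to produce actual tick values, here is a Python port together with a hypothetical helper, tick_positions, that is not part of the original answer. It assumes a strictly positive range.

```python
import math


def calculate_interval(value_range: float) -> float:
    """Port of the first Java variant: 1, 2 or 5 times a power of ten."""
    # Assumes value_range > 0; log10 raises ValueError otherwise.
    x = 10.0 ** math.floor(math.log10(value_range))
    if value_range / x >= 5:
        return x
    if value_range / (x / 2.0) >= 5:
        return x / 2.0
    return x / 5.0


def tick_positions(low: float, high: float) -> list:
    """Multiples of the chosen interval that fall inside [low, high]."""
    step = calculate_interval(high - low)
    first = math.ceil(low / step)
    last = math.floor(high / step)
    # Multiply integer indices instead of accumulating, to limit rounding drift.
    return [i * step for i in range(first, last + 1)]


print(tick_positions(0.0, 100.0))  # [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
```

Because every tick is an integer multiple of the step, zero is always a tick when it lies inside the range, which gives the symmetry around zero asked for in the question.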
I've been using the jQuery flot graph library. It's open source and does axis/tick generation quite well. I'd suggest looking at its code and pinching some ideas from there.
