Slow big.Int calculation, and only on one thread - Go

I use the following code in my test:
package main
import "fmt"
import "math/big"
func main() {
input := "3333333333333333333.......tested with 100'000x3 , tested with 1'000'0000x3, tested with 10'000'000x3"
bi := big.NewInt(0)
if _, ok := bi.SetString(input, 10); ok {
fmt.Printf("number = %v\n", bi)
testval := new(big.Int)
testval.SetString("3", 10)
resultat, isGanzzahl := myDiv(bi, testval)
fmt.Printf("isGanzzahl = %v, resultat = %v\n", isGanzzahl, resultat)
} else {
fmt.Printf("error parsing line %#v\n", input)
}
}
func myDiv(minuend *big.Int, subtrahend *big.Int) (*big.Int, bool) {
zerotest := big.NewInt(0)
modulus := new(big.Int)
modulus = modulus.Mod(minuend, subtrahend)
if zerotest.Cmp(modulus) == 0 {
res := big.NewInt(0)
res.Quo(minuend, subtrahend)
return res, true
} else {
return big.NewInt(0), false
}
}
100'000 x 3 / 3 == not even a quarter of a second
1'000'000 x 3 / 3 == 9.45 seconds
10'000'000 x 3 / 3 == 16.1 minutes
I'm looking for a way to make this much, much faster. If I wanted to do this multithreaded in Go, how would I do it with goroutines? Is there a faster way to do division with larger numbers?
As this is just for testing, I planned to use numbers in the range of 100'000'000 - 1'000'000'000 digits (which would then be 1 GB of RAM usage). But 1 billion digits would not work because it would take years to complete.
What would happen if it is N / M, where N is a 1 billion digit number and M is a 10 million digit number? Is this even possible on a powerful home computer?
What would it look like, or what would I have to change, to distribute this work across multiple small computers (for example on AWS)?

If your number is more than 100000 digits long, you need to use the Fast Fourier Transform for multiplication and division: https://en.wikipedia.org/wiki/Multiplication_algorithm#Fourier_transform_methods . The basic idea is to treat numbers as polynomials with x being a power of 10 (or a power of 2 if you want a binary system). Multiply the polynomials using the Fast Fourier Transform and then propagate the carries to get a number back from the polynomial. For example, if we need to multiply 19 by 19 and we use x = 10^1, we have (1 * x + 9) * (1 * x + 9) = x^2 + 18 * x + 81. Now we propagate the carries to convert the polynomial back to a number: x^2 + 18 * x + 81 = x^2 + (18 + 8) * x + 1 = x^2 + 26 * x + 1 = (1 + 2) * x^2 + 6 * x + 1 = 3 * x^2 + 6 * x + 1 = 361. The trick is that polynomials can be multiplied efficiently (in O(N*log(N)) time) using the Fast Fourier Transform. The coefficients of the product polynomial are larger than digits, so you need to choose x carefully in order to avoid integer overflow or precision problems.
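The polynomial view and the carry propagation can be sketched in Go as follows. This is only an illustration under stated assumptions: the name mulDigits is hypothetical, digits are stored least significant first, and the polynomial product uses a plain O(n*m) schoolbook convolution rather than an FFT (an FFT would produce the same coefficients faster).

```go
package main

import "fmt"

// mulDigits multiplies two non-negative numbers given as base-10 digit
// slices, least significant digit first. (Hypothetical helper name.)
func mulDigits(a, b []int) []int {
	// Step 1: multiply as polynomials in x = 10. This is the schoolbook
	// convolution; an FFT computes the same coefficients in O(n log n).
	coef := make([]int, len(a)+len(b))
	for i, x := range a {
		for j, y := range b {
			coef[i+j] += x * y
		}
	}
	// Step 2: propagate carries so that every coefficient becomes a digit.
	carry := 0
	for i := range coef {
		v := coef[i] + carry
		coef[i] = v % 10
		carry = v / 10
	}
	// Drop leading zeros, which sit at the end of the slice.
	for len(coef) > 1 && coef[len(coef)-1] == 0 {
		coef = coef[:len(coef)-1]
	}
	return coef
}

func main() {
	// 19 is stored as [9 1]: 9 + 1*x with x = 10.
	fmt.Println(mulDigits([]int{9, 1}, []int{9, 1})) // [1 6 3], i.e. 361
}
```

Before carrying, the coefficients for 19 * 19 are 81, 18 and 1, which matches the worked example.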
There is unlikely to be a Go library for that, so you will need to write it yourself. Here are a few short FFT implementations you can use as a starting point:
http://codeforces.com/contest/632/submission/16449753 http://codeforces.com/contest/632/submission/16445979 http://codeforces.com/contest/632/submission/16449040
http://codeforces.com/contest/632/submission/16448169
If you decide to use FFT modulo prime, see this post for a good choice of the modulo: http://petr-mitrichev.blogspot.com/2015/04/this-week-in-competitive-programming.html

Related

Is there an efficient way to approximate (a / b)^n where a, b, and n are unsigned integers?

Exponentiation by squaring is an algorithm that quickly computes a^n, where a and n are signed integers. (It does so in O(log n) multiplications.)
Is there a similar algorithm that instead computes (a / b)^n, where a, b, and n are all unsigned integers? The problem with the obvious approach (i.e., computing a^n / b^n) is that it will return wrong results due to integer overflow on the intermediate values.
I don't have floating points in the host language, only ints.
I'm okay with an approximate answer.
If you want excellent accuracy for the value of (a/b)^n, where a, b, and n are unsigned integers, and you do not have floating point arithmetic available, use extended-precision integer calculations to find a^n and b^n, then divide the two.
Some languages, such as Python, have extended-precision integer arithmetic built in. If your language does not have it, look for a package that implements it. If you cannot do that, just make your own package. It is not that hard; such a package was an assignment in my second-semester computer science class back in the day. The multiplications and powers are fairly straightforward; the most difficult part is the division, even if you just want the quotient and remainder. But "most difficult" does not mean "very difficult" and you could probably do it. The second most difficult routine is printing the extended integer in decimal format.
The basic idea is to store each integer in an array or list of regular unsigned integers, where each integer is a "digit" in arithmetic with a large base. You want to be able to handle the product of any two digits, so if your machine has 32-bit integers and you have no way of handling 64-bit integers, store "digits" of 16 bits each. The larger the "digit", the faster the calculations. If your calculations are few and your printing to decimal is frequent, use a power of 10 such as 10000 for each "digit".
Ask if you need more detail.
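To make the "array of large digits" idea concrete, here is a minimal Go sketch, assuming base 10000 and least-significant-limb-first storage. The names mulSmall, powSmall and toDecimal are hypothetical, and only the power and the decimal printing are covered; the division step described above is not.

```go
package main

import (
	"fmt"
	"strings"
)

const base = 10000 // each "digit" (limb) holds four decimal digits

// mulSmall multiplies a little-endian limb slice by a small factor m.
// It assumes 0 <= m and that m*base fits comfortably in an int.
func mulSmall(limbs []int, m int) []int {
	out := make([]int, 0, len(limbs)+1)
	carry := 0
	for _, d := range limbs {
		v := d*m + carry
		out = append(out, v%base)
		carry = v / base
	}
	for carry > 0 {
		out = append(out, carry%base)
		carry /= base
	}
	return out
}

// powSmall computes a**n by repeated multiplication (not by squaring).
func powSmall(a, n int) []int {
	r := []int{1}
	for i := 0; i < n; i++ {
		r = mulSmall(r, a)
	}
	return r
}

// toDecimal prints the limbs most significant first; every limb except
// the leading one is zero-padded to four digits.
func toDecimal(limbs []int) string {
	if len(limbs) == 0 {
		return "0"
	}
	var sb strings.Builder
	fmt.Fprintf(&sb, "%d", limbs[len(limbs)-1])
	for i := len(limbs) - 2; i >= 0; i-- {
		fmt.Fprintf(&sb, "%04d", limbs[i])
	}
	return sb.String()
}

func main() {
	fmt.Println(toDecimal(powSmall(3, 40))) // 12157665459056928801
}
```

Because the base is a power of 10, printing needs no conversion beyond padding, which is the trade-off mentioned above for print-heavy workloads.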
Here's a pow implementation in fixed point based on Feynman's log algorithm. It's quick and somewhat dirty; C libraries tend to use a polynomial approximation, but that approach is more complicated, and I'm not sure how well it would translate to fixed point.
// powFraction approximates (a/b)**n.
func powFraction(a uint64, b uint64, n uint64) uint64 {
if a == 0 || b == 0 || a < b {
panic("powFraction")
}
return expFixed((logFixed(a) - logFixed(b)) * n)
}
// logFixed approximates 2**58 * log2(x). [Feynman]
func logFixed(x uint64) uint64 {
if x == 0 {
panic("logFixed")
}
// Normalize x into [2**63, 2**64).
n := numberOfLeadingZeros(x)
x <<= n
p := uint64(1 << 63)
y := uint64(0)
for k := uint(1); k <= 63; k++ {
// Warning: if q > x-p, then p + q may overflow.
if q := p >> k; q <= x-p {
p += q
y += table[k-1]
}
}
return uint64(63-n)<<58 + y>>6
}
// expFixed approximately inverts logFixed.
func expFixed(y uint64) uint64 {
n := 63 - uint(y>>58)
y <<= 6
p := uint64(1 << 63)
for k := uint(1); k <= 63; k++ {
if z := table[k-1]; z <= y {
p += p >> k
y -= z
}
}
return p >> n
}
// numberOfLeadingZeros returns the number of leading zeros in the word x.
// [Hacker's Delight]
func numberOfLeadingZeros(x uint64) uint {
n := uint(64)
if y := x >> 32; y != 0 {
x = y
n = 32
}
if y := x >> 16; y != 0 {
x = y
n -= 16
}
if y := x >> 8; y != 0 {
x = y
n -= 8
}
if y := x >> 4; y != 0 {
x = y
n -= 4
}
if y := x >> 2; y != 0 {
x = y
n -= 2
}
if x>>1 != 0 {
return n - 2
}
return n - uint(x)
}
// table[k-1] approximates 2**64 * log2(1 + 2**-k). [MPFR]
var table = [...]uint64{
10790653543520307104, // 1
5938525176524057593, // 2
3134563013331062591, // 3
1613404648504497789, // 4
818926958183105433, // 5
412613322424486499, // 6
207106307442936368, // 7
103754619509458805, // 8
51927872466823974, // 9
25976601570169168, // 10
12991470209511302, // 11
6496527847636937, // 12
3248462157916594, // 13
1624280643531991, // 14
812152713665686, // 15
406079454902306, // 16
203040501980337, // 17
101520444623942, // 18
50760270720599, // 19
25380147462480, // 20
12690076756788, // 21
6345039134781, // 22
3172519756487, // 23
1586259925518, // 24
793129974578, // 25
396564990243, // 26
198282495860, // 27
99141248115, // 28
49570624104, // 29
24785312063, // 30
12392656035, // 31
6196328018, // 32
3098164009, // 33
1549082005, // 34
774541002, // 35
387270501, // 36
193635251, // 37
96817625, // 38
48408813, // 39
24204406, // 40
12102203, // 41
6051102, // 42
3025551, // 43
1512775, // 44
756388, // 45
378194, // 46
189097, // 47
94548, // 48
47274, // 49
23637, // 50
11819, // 51
5909, // 52
2955, // 53
1477, // 54
739, // 55
369, // 56
185, // 57
92, // 58
46, // 59
23, // 60
12, // 61
6, // 62
3, // 63
}
Just in case someone is looking for a constant-space solution, I've kind of solved the issue with binomial expansions, which are a decent approximation. I'm using the following code:
// Computes `k * (1+1/q) ^ N`, with precision `p`. The higher
// the precision, the higher the gas cost. It should be
// something around the log of `n`. When `p == n`, the
// precision is absolute (sans possible integer overflows).
// Much smaller values are sufficient to get a great approximation.
function fracExp(uint k, uint q, uint n, uint p) returns (uint) {
uint s = 0;
uint N = 1;
uint B = 1;
for (uint i = 0; i < p; ++i){
s += k * N / B / (q**i);
N = N * (n-i);
B = B * (i+1);
}
return s;
}
This simply computes the first p terms of the binomial expansion of (1 + r)^n, where r = 1/q is a small positive real number. I posted a more thoughtful explanation at Ethereum Stack Exchange.
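For readers following along in Go, the same truncated binomial expansion might look like the sketch below. It is a port under assumptions, not the author's code: uint64 stands in for Solidity's uint, and q^i is kept as a running product because Go has no integer power operator.

```go
package main

import "fmt"

// fracExp approximates k * (1 + 1/q)^n using the first p terms of the
// binomial expansion. All arithmetic is on uint64, so the intermediate
// products can overflow for large inputs; keep p small and q > 0.
func fracExp(k, q, n, p uint64) uint64 {
	var s uint64
	num := uint64(1)  // n * (n-1) * ... * (n-i+1)
	den := uint64(1)  // i!
	qPow := uint64(1) // q^i
	for i := uint64(0); i < p; i++ {
		s += k * num / den / qPow
		num *= n - i // becomes 0 once i reaches n, so later terms vanish
		den *= i + 1
		qPow *= q
	}
	return s
}

func main() {
	// 1000 * (1 + 1/10)^2 = 1210, and three terms are exact for n = 2.
	fmt.Println(fracExp(1000, 10, 2, 3)) // 1210
}
```

With k = 1000000, q = 100, n = 10 and p = 4 the sketch returns 1104620, against a true value of about 1104622.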

golang - ceil function like php?

I want to return the least integer value greater than or equal to the result of a division. So I used math.Ceil, but I cannot get the value I want.
package main
import (
"fmt"
"math"
)
func main() {
var pagesize int = 10
var length int = 43
d := float64(length / pagesize)
page := int(math.Ceil(d))
fmt.Println(page)
// output 4 not 5
}
http://golang.org/pkg/math/#Ceil
http://play.golang.org/p/asHta1HkO_
What is wrong?
Thanks.
The line
d := float64(length / pagesize)
converts the result of the division to float64. Since the division itself is an integer division, it yields 4, so d = 4.0 and math.Ceil(d) is 4.
Replace the line with
d := float64(length) / float64(pagesize)
and you'll have d=4.3 and int(math.Ceil(d))=5.
Avoiding floating point operations (for performance and clarity):
x, y := length, pagesize
q := (x + y - 1) / y;
for x >= 0 and y > 0.
Or, to avoid overflow of x+y (valid for x > 0 and y > 0; x = 0 needs separate handling):
q := 1 + (x - 1) / y
It's the same as the C++ version: Fast ceiling of an integer division in C / C++
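Both integer-only formulas can be wrapped in small Go helpers, as in the sketch below. The function names are hypothetical, and both helpers assume x >= 0 and y > 0; the second one treats x = 0 explicitly because 1 + (x - 1) / y would otherwise return 1.

```go
package main

import "fmt"

// ceilDiv returns the smallest integer >= x/y for x >= 0 and y > 0.
// x + y - 1 can overflow when x is close to the maximum int.
func ceilDiv(x, y int) int {
	return (x + y - 1) / y
}

// ceilDivNoOverflow gives the same result without computing x + y.
func ceilDivNoOverflow(x, y int) int {
	if x == 0 {
		return 0
	}
	return 1 + (x-1)/y
}

func main() {
	length, pagesize := 43, 10
	fmt.Println(ceilDiv(length, pagesize))           // 5
	fmt.Println(ceilDivNoOverflow(length, pagesize)) // 5
}
```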
Convert length and pagesize to floats before the division:
d := float64(length) / float64(pagesize)
http://play.golang.org/p/FKWeIj7of5
You can check the remainder to see if it should be raised to the next integer.
page := length / pagesize
if length % pagesize > 0 {
page++
}

How to find the number of values in a given range divisible by a given value?

I have three numbers x, y, z.
For a range between numbers x and y,
how can I find the count of numbers whose % with z is 0, i.e. how many numbers between x and y are divisible by z?
It can be done in O(1): find the first one, find the last one, find the count of all other.
I'm assuming the range is inclusive. If your ranges are exclusive, adjust the bounds by one:
find the first value at or after x that is divisible by z. You can discard x:
x_mod = x % z;
if(x_mod != 0)
x += (z - x_mod);
find the last value at or before y that is divisible by z. You can discard y:
y -= y % z;
find the size of this range:
if(x > y)
return 0;
else
return (y - x) / z + 1;
If mathematical floor and ceil functions are available, the first two parts can be written more readably. Also the last part can be compressed using math functions:
x = ceil (x, z);
y = floor (y, z);
return max((y - x) / z + 1, 0);
If the input is guaranteed to be a valid range (x <= y), the last test or max is unnecessary:
x = ceil (x, z);
y = floor (y, z);
return (y - x) / z + 1;
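The three steps of this answer (round x up, round y down, count the steps in between) translate to Go roughly as follows. The function name countInRange is hypothetical, and the sketch assumes x >= 0, y >= 0 and z > 0, since Go's % can return negative values for negative operands.

```go
package main

import "fmt"

// countInRange counts the multiples of z in the inclusive range [x, y].
// It assumes x >= 0, y >= 0 and z > 0.
func countInRange(x, y, z int) int {
	// Round x up to the first multiple of z at or after x.
	if r := x % z; r != 0 {
		x += z - r
	}
	// Round y down to the last multiple of z at or before y.
	y -= y % z
	if x > y {
		return 0
	}
	return (y-x)/z + 1
}

func main() {
	fmt.Println(countInRange(30, 90, 13)) // 4 (39, 52, 65, 78)
	fmt.Println(countInRange(4, 5, 3))    // 0
}
```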
(2017, answer rewritten thanks to comments)
The number of multiples of z in a number n is simply n / z
/ being the integer division, meaning decimals that could result from the division are simply ignored (for instance 17/5 => 3 and not 3.4).
Now, in a range from x to y, how many multiples of z are there?
Let's see how many multiples m we have up to y
0----------------------------------x------------------------y
-m---m---m---m---m---m---m---m---m---m---m---m---m---m---m---
You see where I'm going... to get the number of multiples in the range [ x, y ], get the number of multiples up to y, y / z, then subtract the number of multiples before x, (x-1) / z
Solution: ( y / z ) - (( x - 1 ) / z )
Programmatically, you could write a function numberOfMultiples
function numberOfMultiples(n, z) {
return Math.floor(n / z);
}
to get the number of multiples in a range [x, y]
numberOfMultiples(y, z) - numberOfMultiples(x - 1, z)
The function is O(1); there is no need for a loop to get the number of multiples.
Examples of results you should find
[30, 90] ÷ 13 => 4
[1, 1000] ÷ 6 => 166
[100, 1000000] ÷ 7 => 142843
[777, 777777777] ÷ 7 => 111111001
For the first example, 90 / 13 = 6, (30-1) / 13 = 2, and 6-2 = 4
---26---39---52---65---78---91--
^ ^
30<---(4 multiples)-->90
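In Go the formula needs one extra precaution: the / operator on integers truncates toward zero instead of flooring, so (x - 1) / z is wrong when x is 0 or negative. The sketch below uses a hypothetical floorDiv helper for that reason and assumes z > 0.

```go
package main

import "fmt"

// floorDiv returns floor(a / b) for b > 0. Go's / truncates toward zero,
// so a negative, non-divisible numerator needs a correction of -1.
func floorDiv(a, b int) int {
	q := a / b
	if a%b < 0 {
		q--
	}
	return q
}

// countMultiples returns how many multiples of z lie in [x, y], for z > 0.
func countMultiples(x, y, z int) int {
	return floorDiv(y, z) - floorDiv(x-1, z)
}

func main() {
	fmt.Println(countMultiples(30, 90, 13))  // 4
	fmt.Println(countMultiples(1, 1000, 6))  // 166
	fmt.Println(countMultiples(0, 5, 3))     // 2 (0 and 3)
}
```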
I also encountered this on Codility. It took me much longer than I'd like to admit to come up with a good solution, so I figured I would share what I think is an elegant solution!
Straightforward Approach 1/2:
O(N) time solution with a loop and counter, unrealistic when N = 2 billion.
Awesome Approach 3:
We want the number of integers in some range that are divisible by K.
Simple case: assume range [0 .. n*K], N = n*K
N/K represents the number of integers in [0,N) that are divisible by K, given N%K = 0 (i.e. N is divisible by K)
ex. N = 9, K = 3, count = |{0 3 6}| = 3 = 9/3
Similarly,
N/K + 1 represents the number of integers in [0,N] divisible by K
ex. N = 9, K = 3, count = |{0 3 6 9}| = 4 = 9/3 + 1
I think really understanding the above fact is the trickiest part of this question. It works because the multiples of K in [0,N] are exactly 0*K, 1*K, ..., (N/K)*K, which is N/K + 1 values.
The rest boils down to prefix sums and handling special cases.
Now we don't always have a range that begins with 0, and we cannot assume the two bounds will be divisible by K.
But wait! We can fix this by calculating our own nice upper and lower bounds and using some subtraction magic :)
First find the closest upper and lower bounds in the range [A,B] that are divisible by K.
Upper bound (easier): ex. B = 10, K = 3, new_B = 9... the pattern is B - B%K
Lower bound: ex. A = 10, K = 3, new_A = 12... try a few more and you will see the pattern is A - A%K + K (when A%K != 0; otherwise new_A = A)
Then calculate the following using the above technique:
Determine the total number of integers X in [0,B] that are divisible by K
Determine the total number of integers Y in [0,A) that are divisible by K
Calculate the number of integers in [A,B] that are divisible by K in constant time by the expression X - Y
Website: https://codility.com/demo/take-sample-test/count_div/
class CountDiv {
public int solution(int A, int B, int K) {
int firstDivisible = A%K == 0 ? A : A + (K - A%K);
int lastDivisible = B%K == 0 ? B : B - B%K; //B/K behaves this way by default.
return (lastDivisible - firstDivisible)/K + 1;
}
}
This is my first time explaining an approach like this. Feedback is very much appreciated :)
This is one of the Codility Lesson 3 questions. For this question, the input is guaranteed to be in a valid range. I answered it using Javascript:
function solution(x, y, z) {
var totalDivisibles = Math.floor(y / z),
excludeDivisibles = Math.floor((x - 1) / z),
divisiblesInArray = totalDivisibles - excludeDivisibles;
return divisiblesInArray;
}
https://codility.com/demo/results/demoQX3MJC-8AP/
(I actually wanted to ask about some of the other comments on this page but I don't have enough rep points yet).
Divide y-x by z, rounding down. Add one if y%z < x%z or if x%z == 0.
No mathematical proof, unless someone cares to provide one, but test cases, in Perl:
#!perl
use strict;
use warnings;
use Test::More;
sub multiples_in_range {
my ($x, $y, $z) = @_;
return 0 if $x > $y;
my $ret = int( ($y - $x) / $z);
$ret++ if $y%$z < $x%$z or $x%$z == 0;
return $ret;
}
for my $z (2 .. 10) {
for my $x (0 .. 2*$z) {
for my $y (0 .. 4*$z) {
is multiples_in_range($x, $y, $z),
scalar(grep { $_ % $z == 0 } $x..$y),
"[$x..$y] mod $z";
}
}
}
done_testing;
Output:
$ prove divrange.pl
divrange.pl .. ok
All tests successful.
Files=1, Tests=3405, 0 wallclock secs ( 0.20 usr 0.02 sys + 0.26 cusr 0.01 csys = 0.49 CPU)
Result: PASS
Let [A;B] be an interval of positive integers including A and B such that 0 <= A <= B, K be the divisor.
It is easy to see that there are N(A) = ⌊A / K⌋ = floor(A / K) positive multiples of K in interval [0;A]:
1K 2K 3K 4K 5K
●········x········x··●·····x········x········x···>
0 A
Similarly, there are N(B) = ⌊B / K⌋ = floor(B / K) positive multiples of K in interval [0;B]:
1K 2K 3K 4K 5K
●········x········x········x········x···●····x···>
0 B
Then N = N(B) - N(A) equals the number of K's (the number of integers divisible by K) in range (A;B]. The point A is not included, because the subtracted N(A) includes this point. Therefore, the result should be incremented by one, if A mod K is zero:
N := N(B) - N(A)
if (A mod K = 0)
N := N + 1
Implementation in PHP
function solution($A, $B, $K) {
if ($K < 1)
return 0;
$c = floor($B / $K) - floor($A / $K);
if ($A % $K == 0)
$c++;
return (int)$c;
}
In PHP, the effect of the floor function can be achieved by casting to the integer type:
$c = (int)($B / $K) - (int)($A / $K);
which, I think, is faster.
Here is my short and simple solution in C++ which got 100/100 on codility. :)
Runs in O(1) time. I hope it's not difficult to understand.
int solution(int A, int B, int K) {
// write your code in C++11
// Count the multiples of K in (A, B], assuming 0 <= A <= B and K > 0.
int cnt = B/K - A/K;
// A itself is excluded above, so add it back when it is a multiple of K.
if(A%K==0)
cnt++;
return cnt;
}
(floor)(high/d) - (floor)(low/d) - (high%d==0)
Explanation:
There are a/d numbers divisible by d from 0.0 to a. (d!=0)
Therefore (floor)(high/d) - (floor)(low/d) will give numbers divisible in the range (low,high] (Note that low is excluded and high is included in this range)
Now to remove high from the range just subtract (high%d==0)
Works for integers, floats or whatever (Use fmodf function for floats)
I won't strive for an O(1) solution; I'll leave that to a cleverer person :) I just feel this is a perfect usage scenario for functional programming. Simple and straightforward.
> x,y,z=1,1000,6
=> [1, 1000, 6]
> (x..y).select {|n| n%z==0}.size
=> 166
EDIT: after reading the others' O(1) solutions, I feel ashamed. Programming makes people too lazy to think...
Division (a/b=c) by definition - taking a set of size a and forming groups of size b. The number of groups of this size that can be formed, c, is the quotient of a and b. - is nothing more than the number of integers within range/interval ]0..a] (not including zero, but including a) that are divisible by b.
so by definition:
Y/Z - number of integers within ]0..Y] that are divisible by Z
and
X/Z - number of integers within ]0..X] that are divisible by Z
thus:
result = [Y/Z] - [X/Z] + x (where x = 1 if and only if X is divisible by Z, otherwise 0; this assumes the given range [X..Y] includes X)
example :
for (6, 12, 2) we have 12/2 - 6/2 + 1 (as 6%2 == 0) = 6 - 3 + 1 = 4 // {6, 8, 10, 12}
for (5, 12, 2) we have 12/2 - 5/2 + 0 (as 5%2 != 0) = 6 - 2 + 0 = 4 // {6, 8, 10, 12}
The time complexity of the solution is constant, O(1).
Code Snippet :
int countDiv(int a, int b, int m)
{
int mod = (min(a, b)%m==0);
int cnt = abs(floor(b/m) - floor(a/m)) + mod;
return cnt;
}
Here n gives the count of the numbers divisible by k, and the code prints the sum of all of them:
int a = sc.nextInt();
int b = sc.nextInt();
int k = sc.nextInt();
// First multiple of k that is >= a (assumes a >= 1 and k >= 1).
int first = (a % k == 0) ? a : a + (k - a % k);
int last = b - b%k;
if (first > last) {
System.out.println(0);
} else {
int n = (last - first)/k+1;
System.out.println(n * (first + last)/2);
}
Here is the solution to the problem written in Swift Programming Language.
Step 1: Find the first number in the range divisible by z.
Step 2: Find the last number in the range divisible by z.
Step 3: Use a mathematical formula to find the number of divisible numbers by z in the range.
func solution(_ x : Int, _ y : Int, _ z : Int) -> Int {
var numberOfDivisible = 0
var firstNumber: Int
var lastNumber: Int
if y == x {
return x % z == 0 ? 1 : 0
}
//Find first number divisible by z
let moduloX = x % z
if moduloX == 0 {
firstNumber = x
} else {
firstNumber = x + (z - moduloX)
}
//Find last number divisible by z
let moduloY = y % z
if moduloY == 0 {
lastNumber = y
} else {
lastNumber = y - moduloY
}
//Math formula
numberOfDivisible = Int(floor(Double((lastNumber - firstNumber) / z))) + 1
return numberOfDivisible
}
public static int Solution(int A, int B, int K)
{
int count = 0;
//If A is divisible by K
if(A % K == 0)
{
count = (B / K) - (A / K) + 1;
}
//If A is not divisible by K
else if(A % K != 0)
{
count = (B / K) - (A / K);
}
return count;
}
This can be done in O(1).
Here is a solution in C++.
auto first{ x % z == 0 ? x : x + z - x % z };
auto last{ y % z == 0 ? y : y - y % z };
auto ans{ (last - first) / z + 1 };
Here first is the first number in [x; y] that is divisible by z, last is the last number in [x; y] that is divisible by z, and ans is the answer that you are looking for.

Project Euler 16 - Help in solving it

I'm solving Project Euler problem 16. I've ended up with code that should logically solve it, but it cannot produce the answer, I believe because it is overflowing. I tried int64 in place of int, but it just prints 0,0. If I change the power to anything below 30 it works, but above 30 it does not. Can anyone point out my mistake? I believe it is not able to calculate 2^1000.
// PE_16 project main.go
package main
import (
"fmt"
)
func power(x, y int) int {
var pow int
var final int
final = 1
for pow = 1; pow <= y; pow++ {
final = final * x
}
return final
}
func main() {
var stp int
var sumfdigits int
var u, t, h, th, tth, l int
stp = power(2,1000)
fmt.Println(stp)
u = stp / 1 % 10
t = stp / 10 % 10
h = stp / 100 % 10
th = stp / 1000 % 10
tth = stp / 10000 % 10
l = stp / 100000 % 10
sumfdigits = u + t + h + th + tth + l
fmt.Println(sumfdigits)
}
Your approach to this problem requires exact integer math on values about 1000 bits in size. But you're using int, which is 32 or 64 bits. math/big.Int can handle such a task. I intentionally do not provide a ready-made solution using big.Int, as I assume your goal is to learn by doing it yourself, which I believe is the intent of Project Euler.
As noted by @jnml, ints aren't large enough; if you wish to calculate 2^1000 in Go, big.Ints are a good choice here. Note that math/big provides the Exp() method, which will be easier to use than converting your power function to big.Ints.
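As a hedged illustration of the Exp() method only (deliberately not a solution to the Euler problem), the sketch below raises 2 to the 64th power, which is the first power of two that no longer fits in a uint64. The wrapper name pow is hypothetical; passing nil as the modulus makes Exp compute the plain power.

```go
package main

import (
	"fmt"
	"math/big"
)

// pow returns base**exp as an arbitrary-precision integer.
func pow(base, exp int64) *big.Int {
	// A nil modulus tells Exp not to reduce the result.
	return new(big.Int).Exp(big.NewInt(base), big.NewInt(exp), nil)
}

func main() {
	p := pow(2, 64)
	s := p.String() // decimal digits as a string
	fmt.Println(s, len(s)) // 18446744073709551616 20
}
```

String() gives the decimal representation, which is often the easiest form to work with when the individual digits matter.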
I worked through some Project Euler problems about a year ago, doing them in Go to get to know the language. I didn't like the ones that required big.Ints, which aren't so easy to work with in Go. For this one, I "cheated" and did it in one line of Ruby:
Removed because I remembered it was considered bad form to show a working solution, even in a different language.
Anyway, my Ruby example shows another thing I learned with Go's big.Ints: sometimes it's easier to convert them to a string and work with that string than to work with the big.Int itself. This problem strikes me as one of those cases.
Converting my Ruby algorithm to Go, I only work with big.Ints on one line, then it's easy to work with the string and get the answer in just a few lines of code.
You don't need to use math/big. Below is a schoolboy maths way of doubling a decimal number, as a hint!
xs holds the decimal digits in least significant first order. Pass in a pointer to the digits (pxs) as the slice might need to get bigger.
func double(pxs *[]int) {
xs := *pxs
carry := 0
for i, x := range xs {
n := x*2 + carry
if n >= 10 {
carry = 1
n -= 10
} else {
carry = 0
}
xs[i] = n
}
if carry != 0 {
*pxs = append(xs, carry)
}
}

Tickmark algorithm for a graph axis

I'm looking for an algorithm that places tick marks on an axis, given a range to display, a width to display it in, and a function to measure a string width for a tick mark.
For example, given that I need to display between 1e-6 and 5e-6 and a width to display in pixels, the algorithm would determine that I should put tickmarks (for example) at 1e-6, 2e-6, 3e-6, 4e-6, and 5e-6. Given a smaller width, it might decide that the optimal placement is only at the even positions, i.e. 2e-6 and 4e-6 (since putting more tickmarks would cause them to overlap).
A smart algorithm would give preference to tickmarks at multiples of 10, 5, and 2. Also, a smart algorithm would be symmetric around zero.
As I didn't like any of the solutions I've found so far, I implemented my own. It's in C# but it can be easily translated into any other language.
It basically chooses, from a list of possible steps, the smallest one that displays all values without leaving any value exactly on the edge. It lets you easily select which possible steps you want to use (without having to edit ugly if-else if blocks), and it supports any range of values. I used a C# Tuple to return three values just for a quick and simple demonstration.
private static Tuple<decimal, decimal, decimal> GetScaleDetails(decimal min, decimal max)
{
// Minimal increment that keeps round extreme values from landing exactly on the edge of the chart
decimal epsilon = (max - min) / 1e6m;
max += epsilon;
min -= epsilon;
decimal range = max - min;
// Target number of values to be displayed on the Y axis (it may be less)
int stepCount = 20;
// First approximation
decimal roughStep = range / (stepCount - 1);
// Set best step for the range
decimal[] goodNormalizedSteps = { 1, 1.5m, 2, 2.5m, 5, 7.5m, 10 }; // keep the 10 at the end
// Or use these if you prefer: { 1, 2, 5, 10 };
// Normalize rough step to find the normalized one that fits best
decimal stepPower = (decimal)Math.Pow(10, -Math.Floor(Math.Log10((double)Math.Abs(roughStep))));
var normalizedStep = roughStep * stepPower;
var goodNormalizedStep = goodNormalizedSteps.First(n => n >= normalizedStep);
decimal step = goodNormalizedStep / stepPower;
// Determine the scale limits based on the chosen step.
decimal scaleMax = Math.Ceiling(max / step) * step;
decimal scaleMin = Math.Floor(min / step) * step;
return new Tuple<decimal, decimal, decimal>(scaleMin, scaleMax, step);
}
static void Main()
{
// Dummy code to show a usage example.
var minimumValue = data.Min();
var maximumValue = data.Max();
var results = GetScaleDetails(minimumValue, maximumValue);
chart.YAxis.MinValue = results.Item1;
chart.YAxis.MaxValue = results.Item2;
chart.YAxis.Step = results.Item3;
}
Take the longest of the segments about zero (or the whole graph, if zero is not in the range) - for example, if you have something on the range [-5, 1], take [-5,0].
Figure out approximately how long this segment will be, in ticks. This is just dividing the length by the width of a tick. So suppose the method says that we can put 11 ticks in from -5 to 0. This is our upper bound. For the shorter side, we'll just mirror the result on the longer side.
Now try to put in as many (up to 11) ticks as possible, such that the marker for each tick is of the form i*10*10^n, i*5*10^n, or i*2*10^n, where n is an integer and i is the index of the tick. Now it's an optimization problem: we want to maximize the number of ticks we can put in, while at the same time minimizing the distance between the last tick and the end of the result. So assign a score for getting as many ticks as we can, less than our upper bound, and assign a score to getting the last tick close to n - you'll have to experiment here.
In the above example, try n = 1. We get 1 tick (at i=0). n = 2 gives us 1 tick, and we're further from the lower bound, so we know that we have to go the other way. n = 0 gives us 6 ticks, at each integer point. n = -1 gives us 12 ticks (0, -0.5, ..., -5.0). n = -2 gives us 24 ticks, and so on. The scoring algorithm will give them each a score - higher means a better method.
Do this again for the i * 5 * 10^n, and i*2*10^n, and take the one with the best score.
(as an example scoring algorithm, say that the score is the distance to the last tick times the maximum number of ticks minus the number needed. This will likely be bad, but it'll serve as a decent starting point).
Funnily enough, just over a week ago I came here looking for an answer to the same question, but went away again and decided to come up with my own algorithm. I am here to share, in case it is of any use.
I wrote the code in Python to try and bust out a solution as quickly as possible, but it can easily be ported to any other language.
The function below calculates the appropriate interval (which I have allowed to be either 10**n, 2*10**n, 4*10**n or 5*10**n) for a given range of data, and then calculates the locations at which to place the ticks (based on which numbers within the range are divisible by the interval). I have not used the modulo % operator, since it does not work properly with floating-point numbers due to floating-point arithmetic rounding errors.
Code:
import math
def get_tick_positions(data: list):
    if len(data) == 0:
        return []
    retpoints = []
    data_range = max(data) - min(data)
    lower_bound = min(data) - data_range/10
    upper_bound = max(data) + data_range/10
    view_range = upper_bound - lower_bound
    num = lower_bound
    n = math.floor(math.log10(view_range) - 1)
    interval = 10**n
    num_ticks = 1
    while num <= upper_bound:
        num += interval
        num_ticks += 1
        if num_ticks > 10:
            if interval == 10 ** n:
                interval = 2 * 10 ** n
            elif interval == 2 * 10 ** n:
                interval = 4 * 10 ** n
            elif interval == 4 * 10 ** n:
                interval = 5 * 10 ** n
            else:
                n += 1
                interval = 10 ** n
            num = lower_bound
            num_ticks = 1
    if view_range >= 10:
        copy_interval = interval
    else:
        if interval == 10 ** n:
            copy_interval = 1
        elif interval == 2 * 10 ** n:
            copy_interval = 2
        elif interval == 4 * 10 ** n:
            copy_interval = 4
        else:
            copy_interval = 5
    first_val = 0
    prev_val = 0
    times = 0
    temp_log = math.log10(interval)
    if math.isclose(lower_bound, 0):
        first_val = 0
    elif lower_bound < 0:
        if upper_bound < -2*interval:
            if n < 0:
                copy_ub = round(upper_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) + 2
            else:
                times = upper_bound // round(interval) + 2
        while first_val >= lower_bound:
            prev_val = first_val
            first_val = times * copy_interval
            if n < 0:
                first_val *= (10**n)
            times -= 1
        first_val = prev_val
        times += 3
    else:
        if lower_bound > 2*interval:
            if n < 0:
                copy_ub = round(lower_bound*10**(abs(temp_log) + 1))
                times = copy_ub // round(interval*10**(abs(temp_log) + 1)) - 2
            else:
                times = lower_bound // round(interval) - 2
        while first_val < lower_bound:
            first_val = times*copy_interval
            if n < 0:
                first_val *= (10**n)
            times += 1
    if n < 0:
        retpoints.append(first_val)
    else:
        retpoints.append(round(first_val))
    val = first_val
    times = 1
    while val <= upper_bound:
        val = first_val + times * interval
        if n < 0:
            retpoints.append(val)
        else:
            retpoints.append(round(val))
        times += 1
    retpoints.pop()
    return retpoints
When passing in the following three data-points to the function
points = [-0.00493, -0.0003892, -0.00003292]
... the output I get (as a list) is as follows:
[-0.005, -0.004, -0.003, -0.002, -0.001, 0.0]
When passing this:
points = [1.399, 38.23823, 8309.33, 112990.12]
... I get:
[0, 20000, 40000, 60000, 80000, 100000, 120000]
When passing this:
points = [-54, -32, -19, -17, -13, -11, -8, -4, 12, 15, 68]
... I get:
[-60, -40, -20, 0, 20, 40, 60, 80]
... which all seem to be a decent choice of positions for placing ticks.
The function is written to allow 5-10 ticks, but that could easily be changed if you so please.
Whether the list of data supplied contains ordered or unordered data it does not matter, since it is only the minimum and maximum data points within the list that matter.
This simple algorithm yields an interval that is 1, 2, or 5 times a power of 10, and the axis range gets divided into at least 5 intervals. The code sample is in Java:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / x >= 5)
return x;
else if (range / (x / 2.0) >= 5)
return x / 2.0;
else
return x / 5.0;
}
This is an alternative, for minimum 10 intervals:
protected double calculateInterval(double range) {
double x = Math.pow(10.0, Math.floor(Math.log10(range)));
if (range / (x / 2.0) >= 10)
return x / 2.0;
else if (range / (x / 5.0) >= 10)
return x / 5.0;
else
return x / 10.0;
}
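The Java methods return only the interval. A possible next step, sketched here in Go under assumptions, is to turn that interval into actual tick positions. niceInterval is a port of the first (at least 5 intervals) variant; tickPositions is a hypothetical addition that multiplies an integer index by the step instead of repeatedly adding the step, so that rounding errors do not accumulate. The sketch assumes hi > lo.

```go
package main

import (
	"fmt"
	"math"
)

// niceInterval ports the "at least 5 intervals" variant: the result is
// 1, 0.5 or 0.2 times the largest power of 10 not exceeding rng.
func niceInterval(rng float64) float64 {
	x := math.Pow(10, math.Floor(math.Log10(rng)))
	switch {
	case rng/x >= 5:
		return x
	case rng/(x/2) >= 5:
		return x / 2
	default:
		return x / 5
	}
}

// tickPositions returns the multiples of the interval that lie inside
// [lo, hi]. (Hypothetical helper, not part of the original answer.)
func tickPositions(lo, hi float64) []float64 {
	step := niceInterval(hi - lo)
	first := math.Ceil(lo / step)
	last := math.Floor(hi / step)
	var ticks []float64
	for i := first; i <= last; i++ {
		ticks = append(ticks, i*step)
	}
	return ticks
}

func main() {
	fmt.Println(tickPositions(-5, 1)) // [-5 -4 -3 -2 -1 0 1]
	fmt.Println(tickPositions(0, 40)) // [0 5 10 15 20 25 30 35 40]
}
```

Because every tick is an integer multiple of the step, zero is always a tick when it lies inside the range, which gives the symmetry around zero asked for in the question.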
I've been using the jQuery flot graph library. It's open source and does axis/tick generation quite well. I'd suggest looking at its code and pinching some ideas from there.

Resources