Min Abs Sum task from codility - algorithm

There is already a topic about this task, but I'd like to ask about my specific approach.
The task is:
Let A be a non-empty array consisting of N integers.
The abs sum of two for a pair of indices (P, Q) is the absolute value
|A[P] + A[Q]|, for 0 ≤ P ≤ Q < N.
For example, the following array A:
A[0] = 1 A[1] = 4 A[2] = -3
has pairs of indices (0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2).
The abs sum of two for the pair (0, 0) is A[0] + A[0] = |1 + 1| = 2.
The abs sum of two for the pair (0, 1) is A[0] + A[1] = |1 + 4| = 5.
The abs sum of two for the pair (0, 2) is A[0] + A[2] = |1 + (−3)| = 2.
The abs sum of two for the pair (1, 1) is A[1] + A[1] = |4 + 4| = 8.
The abs sum of two for the pair (1, 2) is A[1] + A[2] = |4 + (−3)| = 1.
The abs sum of two for the pair (2, 2) is A[2] + A[2] = |(−3) + (−3)| = 6.
Write a function:
def solution(A)
that, given a non-empty array A consisting of N integers, returns the
minimal abs sum of two for any pair of indices in this array.
For example, given the following array A:
A[0] = 1 A[1] = 4 A[2] = -3
the function should return 1, as explained above.
Given array A:
A[0] = -8 A[1] = 4 A[2] = 5 A[3] = -10 A[4] = 3
the function should return |(−8) + 5| = 3.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [1..100,000]; each element of array A
is an integer within the range [−1,000,000,000..1,000,000,000].
The official solution is O(N*M^2), but I think it could be solved in O(N).
My approach is to first get rid of duplicates and sort the array. Then we check both ends and compare the abs sum, moving the ends by one towards each other. We try moving the left end, the right end, or both. If none of these improves the result, the current sum is the lowest. My code is:
def solution(A):
    A = list(set(A))
    A.sort()
    n = len(A)
    beg = 0
    end = n - 1
    min_sum = abs(A[beg] + A[end])
    while True:
        min_left = abs(A[beg+1] + A[end]) if beg+1 < n else float('inf')
        min_right = abs(A[beg] + A[end-1]) if end-1 >= 0 else float('inf')
        min_both = abs(A[beg+1] + A[end-1]) if beg+1 < n and end-1 >= 0 else float('inf')
        min_all = min([min_left, min_right, min_both])
        if min_sum <= min_all:
            return min_sum
        if min_left == min_all:
            beg += 1
            min_sum = min_left
        elif min_right == min_all:
            end -= 1
            min_sum = min_right
        else:
            beg += 1
            end -= 1
            min_sum = min_both
It passes almost all of the tests, but not all. Is there a bug in my code, or is the approach wrong?
After aka.nice's answer I was able to fix the code. It scores 100% now.
def solution(A):
    A = list(set(A))
    A.sort()
    n = len(A)
    beg = 0
    end = n - 1
    min_sum = abs(A[beg] + A[end])
    while beg <= end:
        min_left = abs(A[beg+1] + A[end]) if beg+1 < n else float('inf')
        min_right = abs(A[beg] + A[end-1]) if end-1 >= 0 else float('inf')
        min_all = min(min_left, min_right)
        if min_all < min_sum:
            min_sum = min_all
        if min_left <= min_all:
            beg += 1
        else:
            end -= 1
    return min_sum

Just take this example for array A
-11 -5 -2 5 6 8 12
and execute your algorithm step by step, you get a premature return:
return min_sum
even though there is a better solution, abs(5 - 5) = 0.
Hint: you should check the sign of A[beg] and A[end] to decide whether to continue or exit the loop. What should you do if both are >= 0, if both are <= 0, or otherwise?
Note that A.sort() has a non-negligible cost, likely O(N*log(N)), which will dominate the cost of the solution you show.
By the way, what is M in the official cost O(N*M^2)?
And the link you provide is another problem (sum all the elements of A or their opposite).
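One common way to act on the sign hint is a plain two-pointer scan over the sorted array. The sketch below is an illustration, not the official solution: the function name min_abs_sum_of_two is made up here, and the sort makes it O(N*log(N)) overall. If the current sum is negative, only moving the left end can bring it closer to zero; otherwise only moving the right end can.

```python
def min_abs_sum_of_two(A):
    """Minimal |A[P] + A[Q]| over all pairs with P <= Q (illustrative sketch)."""
    values = sorted(A)           # O(N log N); duplicates are harmless here
    beg, end = 0, len(values) - 1
    best = abs(values[beg] + values[end])
    while beg <= end:            # beg == end covers the pairs with P == Q
        total = values[beg] + values[end]
        best = min(best, abs(total))
        if total < 0:
            beg += 1             # sum too small: raise the smaller term
        else:
            end -= 1             # sum too large (or zero): lower the larger term
    return best


print(min_abs_sum_of_two([-11, -5, -2, 5, 6, 8, 12]))
```

On the counterexample above, the scan reaches the pair (-5, 5) and does not stop early.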


How can I solve this problem using dynamic programming?

Given a list of numbers, say [4 5 2 3], I need to maximize the sum obtained according to the following set of rules:
I need to select a number from the list and that number will be removed.
Eg. selecting 2 will have the list as [4 5 3].
If the number to be removed has two neighbours, the result of the selection is the product of the selected number with one of its neighbours, plus the other neighbour. E.g. if I select 2, the result of this selection can be 2 * 5 + 3.
If I select a number with only one neighbour, the result is the product of the selected number with its neighbour.
When there is only one number left, it is just added to the result so far.
Following these rules, I need to select the numbers in such an order that the result is maximized.
For the above list, if the order of selection is 4->2->3->5, then the sum obtained is 53, which is the maximum.
I am including a program which lets you pass as input the set of elements and gives all possible sums and also indicates the max sum.
Here's a link.
import itertools

l = [int(i) for i in input().split()]
p = itertools.permutations(l)
c, cs = 1, -1
mm = -1
for i in p:
    var, s = l[:], 0
    print(c, ':', i)
    c += 1
    for j in i:
        print(' removing: ', j)
        pos = var.index(j)
        if pos == 0 or pos == len(var) - 1:
            if pos == 0 and len(var) != 1:
                s += var[pos] * var[pos + 1]
            elif pos == 0 and len(var) == 1:
                s += var[pos]
            if pos == len(var) - 1 and pos != 0:
                s += var[pos] * var[pos - 1]
        else:
            mx = max(var[pos - 1], var[pos + 1])
            mn = min(var[pos - 1], var[pos + 1])
            s += var[pos] * mx + mn
        var.pop(pos)  # remove the selected number from the working list
        if s > mm:
            mm = s
            cs = c - 1
        print(' modified list: ', var, '\n sum:', s)
print('MAX SUM was', mm, ' at', cs)
Consider 4 variants of the problem: those where every element gets consumed, and those where either the left, the right, or both the right and left elements are not consumed.
In each case, you can consider the last element to be removed, and this breaks the problem down into 1 or 2 subproblems.
This solves the problem in O(n^3) time. Here's a python program that solves the problem. The 4 variants of solve_ correspond to none, one or the other, or both of the endpoints being fixed. No doubt this program can be reduced (there's a lot of duplication).
def solve_00(seq, n, m, cache):
    key = ('00', n, m)
    if key in cache:
        return cache[key]
    assert m >= n
    if n == m:
        return seq[n]
    best = -1e9
    for i in range(n, m+1):
        left = solve_01(seq, n, i, cache) if i > n else 0
        right = solve_10(seq, i, m, cache) if i < m else 0
        best = max(best, left + right + seq[i])
    cache[key] = best
    return best

def solve_01(seq, n, m, cache):
    key = ('01', n, m)
    if key in cache:
        return cache[key]
    assert m >= n + 1
    if m == n + 1:
        return seq[n] * seq[m]
    best = -1e9
    for i in range(n, m):
        left = solve_01(seq, n, i, cache) if i > n else 0
        right = solve_11(seq, i, m, cache) if i < m - 1 else 0
        best = max(best, left + right + seq[i] * seq[m])
    cache[key] = best
    return best

def solve_10(seq, n, m, cache):
    key = ('10', n, m)
    if key in cache:
        return cache[key]
    assert m >= n + 1
    if m == n + 1:
        return seq[n] * seq[m]
    best = -1e9
    for i in range(n+1, m+1):
        left = solve_11(seq, n, i, cache) if i > n + 1 else 0
        right = solve_10(seq, i, m, cache) if i < m else 0
        best = max(best, left + right + seq[n] * seq[i])
    cache[key] = best
    return best

def solve_11(seq, n, m, cache):
    key = ('11', n, m)
    if key in cache:
        return cache[key]
    assert m >= n + 2
    if m == n + 2:
        return max(seq[n] * seq[n+1] + seq[n+2], seq[n] + seq[n+1] * seq[n+2])
    best = -1e9
    for i in range(n + 1, m):
        left = solve_11(seq, n, i, cache) if i > n + 1 else 0
        right = solve_11(seq, i, m, cache) if i < m - 1 else 0
        best = max(best, left + right + seq[i] * seq[n] + seq[m], left + right + seq[i] * seq[m] + seq[n])
    cache[key] = best
    return best

for c in [[1, 1, 1], [4, 2, 3, 5], [1, 2], [1, 2, 3], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]:
    print(c, solve_00(c, 0, len(c)-1, dict()))
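For small inputs, an exhaustive search is a handy cross-check for any faster solution. The following is only a sketch under the rules stated in the question: the name best_score is made up here, the input is assumed to be a non-empty tuple, and the running time is exponential, so it is meant for testing only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_score(nums):
    """Best total for a non-empty tuple, trying every possible removal."""
    if len(nums) == 1:
        return nums[0]                      # last number is simply added
    best = None
    for i, x in enumerate(nums):
        rest = nums[:i] + nums[i + 1:]
        if i == 0:
            gain = x * nums[1]              # only a right neighbour
        elif i == len(nums) - 1:
            gain = x * nums[-2]             # only a left neighbour
        else:
            left, right = nums[i - 1], nums[i + 1]
            # multiply with one neighbour, add the other; keep the better choice
            gain = max(x * left + right, x * right + left)
        total = gain + best_score(rest)
        if best is None or total > best:
            best = total
    return best


print(best_score((4, 5, 2, 3)))
```

For the list from the question this reproduces the stated maximum of 53.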

Which solution is better in terms of space/time complexity?

I have two lists of integers, both already sorted. I want to find the elements (one from each list) that add up to a given number.
The first idea is to iterate over the first list and use binary search on the second to look for the number needed to reach the given sum. I know this takes O(n log n) time.
The other is to store one of the lists in a hash table/map (I don't really know the difference) and iterate over the other list, looking up the needed value. Does this take O(n) time and O(n) memory?
Overall, which would be better?
You are comparing them correctly, but the two approaches have different trade-offs. Hashing is not a good choice if you have memory constraints, but if you have plenty of memory you can afford it.
You will also often see the notion of a space-time tradeoff in computer science: you gain on one side by giving up something on the other. Hashing runs in O(n) time with O(n) space. Binary search takes O(n log n) time but only O(1) extra space.
Long story short, the scenario decides which one to select. I have shown just one aspect; there can be many. Know the constraints and tradeoffs of each and you will be able to decide.
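To make the hashing option concrete, here is a minimal Python sketch. The function name find_pair_hash is an assumption for illustration. It spends O(n) extra memory on a set built from one list and then does expected O(1) lookups while scanning the other list.

```python
def find_pair_hash(a, b, target):
    """Return one (x, y) with x in a, y in b and x + y == target, else None."""
    seen = set(b)                 # O(len(b)) extra memory
    for x in a:                   # O(len(a)) expected time
        if target - x in seen:
            return x, target - x
    return None


print(find_pair_hash([1, 2, 2, 6, 8, 10, 11], [1, 1, 3, 4, 8, 9], 10))
```

Note that this version does not need either list to be sorted, which is part of what the extra memory buys.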
A better solution (time complexity: O(n), space complexity: O(1)):
Suppose there are two arrays, a and b.
WLOG suppose a is sorted in ascending order and b in descending order (even if that is not the case, we can traverse them accordingly).
index1 = 0; index2 = 0; // 0-based indexing
while (index1 <= N1-1 && index2 <= N2-1) {
    if ((a[index1] + b[index2]) == x) {
        // success
        break;
    } else if ((a[index1] + b[index2]) > x) {
        index2++; // sum too large: move to a smaller element of b
    } else {
        index1++; // sum too small: move to a larger element of a
    }
}
// if the loop ends without success: failure, no such element.
Sort list A in ascending order, and list B in descending order. Set a = 1 and b = 1.
1. If A[a] + B[b] = T, record the pair, increment a, and repeat.
2. Otherwise, if A[a] + B[b] < T, increment a, and repeat from 1.
3. Otherwise, A[a] + B[b] > T; increment b, and repeat from 1.
Naturally, if a or b exceeds the size of A or B, respectively, terminate.
A = 1, 2, 2, 6, 8, 10, 11
B = 9, 8, 4, 3, 1, 1
T = 10
a = 1, b = 1
A[a] + B[b] = A[1] + B[1] = 10; record; a = a + 1 = 2; repeat.
A[a] + B[b] = A[2] + B[1] = 11; b = b + 1 = 2; repeat.
A[a] + B[b] = A[2] + B[2] = 10; record; a = a + 1 = 3; repeat.
A[a] + B[b] = A[3] + B[2] = 10; record; a = a + 1 = 4; repeat.
A[a] + B[b] = A[4] + B[2] = 14; b = b + 1 = 3; repeat.
A[a] + B[b] = A[4] + B[3] = 10; record; a = a + 1 = 5; repeat.
A[a] + B[b] = A[5] + B[3] = 12; b = b + 1 = 4; repeat.
A[a] + B[b] = A[5] + B[4] = 11; b = b + 1 = 5; repeat.
A[a] + B[b] = A[5] + B[5] = 9; a = a + 1 = 6; repeat.
A[a] + B[b] = A[6] + B[5] = 11; b = b + 1 = 6; repeat.
A[a] + B[b] = A[6] + B[6] = 11; b = b + 1 = 7; repeat.
You can do this without additional space if instead of having B sorted in descending order, you set b = |B| and decrement it instead of incrementing it, effectively reading it backwards.
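The procedure, with B kept in ascending order and read backwards as just described, can be sketched in Python as follows. This is an illustrative sketch using 0-based indices; the name pairs_with_sum is made up here.

```python
def pairs_with_sum(a, b, target):
    """a and b are sorted ascending; b is scanned from its end."""
    i, j = 0, len(b) - 1
    found = []
    while i < len(a) and j >= 0:
        total = a[i] + b[j]
        if total == target:
            found.append((a[i], b[j]))   # record the pair, then advance in a
            i += 1
        elif total < target:
            i += 1                       # need a larger term from a
        else:
            j -= 1                       # need a smaller term from b
    return found


print(pairs_with_sum([1, 2, 2, 6, 8, 10, 11], [1, 1, 3, 4, 8, 9], 10))
```

On the example above it records the same four pairs as the trace: (1, 9), (2, 8), (2, 8) and (6, 4).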
The above procedure misses out on some duplicate answers where B has a string of duplicate values, for instance:
A = 2, 2, 2
B = 8, 8, 8
The algorithm as described above will yield three pairs, but you might want nine. This can be fixed by detecting this case, keeping separate counters ca and cb for the lengths of the runs of A[a] and B[b] you have seen, and adding ca * cb - ca copies of the last pair you added to the bag. In this example:
A = 2, 2, 2
B = 8, 8, 8
a = 1, b = 1
ca = 0, cb = 1
A[a] + B[b] = 10; record pair, a = a + 1 = 2, ca = ca + 1 = 1, repeat.
A[a] + B[b] = 10; record pair, a = a + 1 = 3, ca = ca + 1 = 2, repeat.
A[a] + B[b] = 10; record pair, a = a + 1 = 4, ca = ca + 1 = 3;
a exceeds bounds, value of A[a] changed;
increment b to count run of B's;
b = b + 1 = 2, cb = cb + 1 = 2
b = b + 1 = 3, cb = cb + 1 = 3
b = b + 1 = 4;
b exceeds bounds, value of B[b] changed;
add ca * cb - ca = 3 * 3 - 3 = 6 copies of pair (2, 8).

Confusion Regarding deepest pit within an Array

I got this question as prerequisite for an interview,
A non-empty zero-indexed array A consisting of N integers is given. A
pit in this array is any triplet of integers (P, Q, R) such that: 0 ≤
P < Q < R < N;
sequence [A[P], A[P+1], ..., A[Q]] is strictly decreasing, i.e. A[P] >
A[P+1] > ... > A[Q];
sequence A[Q], A[Q+1], ..., A[R] is strictly increasing, i.e. A[Q] <
A[Q+1] < ... < A[R].
The depth of a pit (P, Q, R) is the number min{A[P] − A[Q], A[R] −
A[Q]}. For example, consider array A consisting of 10 elements such that:
A[0] = 0
A[1] = 1
A[2] = 3
A[3] = -2
A[4] = 0
A[5] = 1
A[6] = 0
A[7] = -3
A[8] = 2
A[9] = 3
Triplet (2, 3, 4) is one of pits in this array, because sequence
[A[2], A[3]] is strictly decreasing (3 > −2) and sequence [A[3], A[4]]
is strictly increasing (−2 < 0). Its depth is min{A[2] − A[3], A[4] −
A[3]} = 2.
Triplet (2, 3, 5) is another pit with depth 3.
Triplet (5, 7, 8) is yet another pit with depth 4. There is no pit in
this array deeper (i.e. having depth greater) than 4.
It says that triplet (5, 7, 8) is the deepest pit, with depth 4.
But doesn't triplet (2, 7, 9) have depth 6?
The values corresponding to triplet (2, 7, 9) are (3, -3, 3), and it also seems to satisfy the conditions mentioned, i.e.
1) 0 ≤ P < Q < R < N
2) A[P] > A[P+1] > ... > A[Q] and A[Q] < A[Q+1] < ... < A[R]
so in this case min{A[P] − A[Q], A[R] − A[Q]} is 6.
What am I missing here?
P.S. If you think this post does not belong on this forum, please point out where I should post it.
Look at the sequence from P = 2 to Q = 7.
It is 3 -2 0 1 0 -3.
sequence [A[P], A[P+1], ..., A[Q]] is strictly decreasing, i.e. A[P] > A[P+1] > ... > A[Q];
The rule says that this should be a strictly decreasing sequence, but it isn't: 3 > -2, but -2 is not greater than 0. Here the sequence breaks.
From 7 to 9 there is no problem, as the sequence is increasing: -3 < 2 < 3.
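The definition can be checked mechanically. The helper below is a small sketch (the name pit_depth is made up for illustration) that tests both monotonicity conditions over the whole ranges P..Q and Q..R, not just at the three chosen indices.

```python
def pit_depth(A, p, q, r):
    """Depth of pit (p, q, r), or None if the triplet is not a pit."""
    if not (0 <= p < q < r < len(A)):
        return None
    decreasing = all(A[i] > A[i + 1] for i in range(p, q))   # A[p] > ... > A[q]
    increasing = all(A[i] < A[i + 1] for i in range(q, r))   # A[q] < ... < A[r]
    if not (decreasing and increasing):
        return None
    return min(A[p] - A[q], A[r] - A[q])


A = [0, 1, 3, -2, 0, 1, 0, -3, 2, 3]
print(pit_depth(A, 2, 7, 9))   # not a pit: the run from index 2 to 7 is not decreasing
print(pit_depth(A, 5, 7, 8))
```

Triplet (2, 7, 9) is rejected, while (5, 7, 8) is accepted with depth 4, matching the task statement.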
Answer to the deepest pit problem in Swift:
func solution(_ array: [Int]) -> Int {
    //guarantee we have at least three elements
    if array.isEmpty {
        return -1
    }
    if array.count < 3 {
        print("is less than 3")
        return -1
    }
    //extremum point; max or min points
    var extremumPoints = [Int]()
    //adding first element
    extremumPoints.append(array[0])
    //calculate extremum points for 1 to one before last element
    for i in 1..<(array.count - 1) {
        let isRelativeExtremum = ((array[i] - array[i - 1]) * (array[i] - array[i + 1])) > 0
        //we call a point semi-extremum if it is equal to exactly one of its two neighbours (previous or next element)
        let isSemiExtremum = ((array[i] != array[i - 1]) && (array[i] == array[i + 1])) || ((array[i] != array[i + 1]) && (array[i] == array[i - 1]))
        if isRelativeExtremum || isSemiExtremum {
            extremumPoints.append(array[i])
        }
    }
    //adding last element
    extremumPoints.append(array[array.count - 1])
    //we will hold depths in this array
    var depthes = [Int]()
    for i in 1..<(extremumPoints.count - 1) {
        let isBottomOfaPit = extremumPoints[i] < extremumPoints[i - 1] && extremumPoints[i] < extremumPoints[i + 1]
        if isBottomOfaPit {
            let d1 = extremumPoints[i - 1] - extremumPoints[i]
            let d2 = extremumPoints[i + 1] - extremumPoints[i]
            let d = min(d1, d2)
            depthes.append(d)
        }
    }
    //deepest pit
    let deepestPit = depthes.max()
    return deepestPit ?? -1
}
let A = [0,1,3,-2,0,1,0,-3,2,3]
let deepestPit = solution(A)
print(deepestPit) // 4
def deepest(A):
    def check(p, q, r, A):
        if A[p] > A[q] and A[q] < A[r]:
            return min(A[p] - A[q], A[r] - A[q])
        return -1
    max_depth = 0
    for i in range(1, len(A) - 2):
        if A[i-1] > A[i] < A[i + 1]:
            p = i
            r = i
            while 0 <= p and r <= len(A) - 1:
                depth = check(p, i, r, A)
                max_depth = max(max_depth, depth)
                p -= 1
                r += 1
    return max_depth
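An alternative that follows the definition more literally is to take every strict local minimum Q and extend P to the left for as long as the values keep strictly decreasing towards Q, and R to the right for as long as they keep strictly increasing. The sketch below is only an illustration with a made-up name, deepest_pit, and it assumes -1 should be returned when no pit exists. The runs scanned for different minima do not overlap, so the total work is linear in N.

```python
def deepest_pit(A):
    """Depth of the deepest pit in A, or -1 if A contains no pit."""
    n = len(A)
    best = -1
    for q in range(1, n - 1):
        if A[q - 1] > A[q] < A[q + 1]:          # strict local minimum
            p = q
            while p > 0 and A[p - 1] > A[p]:    # climb the left slope
                p -= 1
            r = q
            while r < n - 1 and A[r + 1] > A[r]:  # climb the right slope
                r += 1
            best = max(best, min(A[p] - A[q], A[r] - A[q]))
    return best


print(deepest_pit([0, 1, 3, -2, 0, 1, 0, -3, 2, 3]))
```

Taking the full slopes on both sides is safe because a longer slope can only make A[P] and A[R] larger for the same Q.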

Using matrices to find the number of different ways to write n as the sum of 1, 3, and 4?

This is a question given in this presentation. Dynamic Programming
I have implemented the algorithm using recursion and it works fine for small values, but when n is greater than 30 it becomes really slow. The presentation mentions that for large values of n one should consider something similar to the matrix form of Fibonacci numbers. I am having trouble understanding how to use the matrix form of Fibonacci numbers to come up with a solution. Can someone give me some hints or pseudocode?
Yes, you can use the technique from fast Fibonacci implementations to solve this problem in time O(log n)! Here's how to do it.
Let's go with your definition from the problem statement, in which 1 + 3 is counted as different from 3 + 1. Then you have the following recurrence relation:
A(0) = 1
A(1) = 1
A(2) = 1
A(3) = 2
A(k+4) = A(k) + A(k+1) + A(k+3)
The matrix trick here is to notice that
| 1 0 1 1 |   |A( k )|   |A(k) + A(k-2) + A(k-3)|   |A(k+1)|
| 1 0 0 0 |   |A(k-1)|   |        A( k )        |   |A( k )|
| 0 1 0 0 | * |A(k-2)| = |        A(k-1)        | = |A(k-1)|
| 0 0 1 0 |   |A(k-3)|   |        A(k-2)        |   |A(k-2)|
In other words, multiplying a vector of the last four values in the series produces a vector with those values shifted forward by one step.
Let's call that matrix there M. Then notice that
    |A( k )|   |A(k+2)|
    |A(k-1)|   |A(k+1)|
M^2 |A(k-2)| = |A( k )|
    |A(k-3)|   |A(k-1)|
In other words, multiplying by the square of this matrix shifts the series down two steps. More generally:
    |A( k )|   |  A(k+n)  |
    |A(k-1)|   |A(k-1 + n)|
M^n |A(k-2)| = |A(k-2 + n)|
    |A(k-3)|   |A(k-3 + n)|
So multiplying by M^n shifts the series down n steps. Now, if we want to know the value of A(n+3), we can just compute
    |A(3)|   |A(n+3)|
    |A(2)|   |A(n+2)|
M^n |A(1)| = |A(n+1)|
    |A(0)|   |A( n )|
and read off the top entry of the vector! This can be done in time O(log n) by using exponentiation by squaring. Here's some code that does just that. This uses a matrix library I cobbled together a while back:
#include "Matrix.hh"
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <algorithm>
using namespace std;

/* Naive implementations of A. */
uint64_t naiveA(int n) {
    if (n == 0) return 1;
    if (n == 1) return 1;
    if (n == 2) return 1;
    if (n == 3) return 2;
    return naiveA(n-1) + naiveA(n-3) + naiveA(n-4);
}

/* Constructs and returns the giant matrix. */
Matrix<4, 4, uint64_t> M() {
    Matrix<4, 4, uint64_t> result;
    fill(result.begin(), result.end(), uint64_t(0));
    result[0][0] = 1;
    result[0][2] = 1;
    result[0][3] = 1;
    result[1][0] = 1;
    result[2][1] = 1;
    result[3][2] = 1;
    return result;
}

/* Constructs the initial vector that we multiply the matrix by. */
Vector<4, uint64_t> initVec() {
    Vector<4, uint64_t> result;
    result[0] = 2;
    result[1] = 1;
    result[2] = 1;
    result[3] = 1;
    return result;
}

/* O(log n) time for raising a matrix to a power. */
Matrix<4, 4, uint64_t> fastPower(const Matrix<4, 4, uint64_t>& m, int n) {
    if (n == 0) return Identity<4, uint64_t>();
    auto half = fastPower(m, n / 2);
    if (n % 2 == 0) return half * half;
    else return half * half * m;
}

/* Fast implementation of A(n) using matrix exponentiation. */
uint64_t fastA(int n) {
    if (n == 0) return 1;
    if (n == 1) return 1;
    if (n == 2) return 1;
    if (n == 3) return 2;
    auto result = fastPower(M(), n - 3) * initVec();
    return result[0];
}

/* Some simple test code showing this in action! */
int main() {
    for (int i = 0; i < 25; i++) {
        cout << setw(2) << i << ": " << naiveA(i) << ", " << fastA(i) << endl;
    }
}
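Since Matrix.hh is a private library that is not shown, here is a self-contained Python sketch of the same idea using plain nested lists. The helper names mat_mul, mat_pow and count_ordered are made up for illustration. Python integers do not overflow, so no fixed-width type is needed.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, e):
    """m raised to the power e >= 0 by repeated squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    while e > 0:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


M = [[1, 0, 1, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]


def count_ordered(n):
    """A(n): ordered ways to write n as a sum of 1s, 3s and 4s."""
    base = [1, 1, 1, 2]                      # A(0), A(1), A(2), A(3)
    if n <= 3:
        return base[n]
    top_row = mat_pow(M, n - 3)[0]
    init = [2, 1, 1, 1]                      # A(3), A(2), A(1), A(0)
    return sum(top_row[k] * init[k] for k in range(4))


print([count_ordered(n) for n in range(10)])
```

Only the top row of the matrix power is needed, because the answer is the top entry of the resulting vector.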
Now, how would this change if 3 + 1 and 1 + 3 were treated as equivalent? This means that we can think about solving this problem in the following way:
Let A(n) be the number of ways to write n as a sum of 1s, 3s, and 4s.
Let B(n) be the number of ways to write n as a sum of 1s and 3s.
Let C(n) be the number of ways to write n as a sum of 1s.
We then have the following:
A(n) = B(n) for all n ≤ 3, since for numbers in that range the only options are to use 1s and 3s.
A(n + 4) = A(n) + B(n + 4), since your options are either (1) use a 4 or (2) not use a 4, leaving the remaining sum to use 1s and 3s.
B(n) = C(n) for all n ≤ 2, since for numbers in that range the only options are to use 1s.
B(n + 3) = B(n) + C(n + 3), since your options are either (1) use a 3 or (2) not use a 3, leaving the remaining sum to use only 1s.
C(0) = 1, since there's only one way to write 0 as a sum of no numbers.
C(n+1) = C(n), since the only way to write something with 1s is to pull out a 1 and write the remaining number as a sum of 1s.
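These recurrences can be tried out directly before turning them into a matrix. The sketch below is an illustration, not the answer's own code: it writes A, B and C as memoized Python functions and compares A against a standard coin-change style count (the helper count_unordered is made up here).

```python
from functools import lru_cache


def C(n):
    return 1                      # only 1s: exactly one way for every n >= 0


@lru_cache(maxsize=None)
def B(n):
    # either use a 3 (if possible) or use only 1s
    return C(n) + (B(n - 3) if n >= 3 else 0)


@lru_cache(maxsize=None)
def A(n):
    # either use a 4 (if possible) or use only 1s and 3s
    return B(n) + (A(n - 4) if n >= 4 else 0)


def count_unordered(n, parts=(1, 3, 4)):
    """Reference count: sums of the given parts where order does not matter."""
    ways = [1] + [0] * n
    for part in parts:
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


print([A(n) for n in range(10)])
```

Both functions count 4 as 4, 3 + 1 and 1 + 1 + 1 + 1, i.e. three ways.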
That's a lot to take in, but do notice the following: we ultimately care about A(n), and to evaluate it, we only need to know the values of A(n), A(n-1), A(n-2), A(n-3), B(n), B(n-1), B(n-2), B(n-3), C(n), C(n-1), C(n-2), and C(n-3).
Let's imagine, for example, that we know these twelve values for some fixed value of n. We can learn those twelve values for the next value of n as follows:
C(n+1) = C(n)
B(n+1) = B(n-2) + C(n+1) = B(n-2) + C(n)
A(n+1) = A(n-3) + B(n+1) = A(n-3) + B(n-2) + C(n)
And the remaining values then shift down.
We can formulate this as a giant matrix equation:
  A(n) A(n-1) A(n-2) A(n-3) B(n) B(n-1) B(n-2) C(n)
|  0     0      0      1     0     0      1     1  | |A( n )| = |A(n+1)|
|  1     0      0      0     0     0      0     0  | |A(n-1)| = |A( n )|
|  0     1      0      0     0     0      0     0  | |A(n-2)| = |A(n-1)|
|  0     0      1      0     0     0      0     0  | |A(n-3)| = |A(n-2)|
|  0     0      0      0     0     0      1     1  | |B( n )| = |B(n+1)|
|  0     0      0      0     1     0      0     0  | |B(n-1)| = |B( n )|
|  0     0      0      0     0     1      0     0  | |B(n-2)| = |B(n-1)|
|  0     0      0      0     0     0      0     1  | |C( n )| = |C(n+1)|
Let's call this gigantic matrix here M. Then if we compute
    |2| // A(3) = 2, since 3 = 3 or 3 = 1 + 1 + 1
    |1| // A(2) = 1, since 2 = 1 + 1
    |1| // A(1) = 1, since 1 = 1
M^n |1| // A(0) = 1, since 0 = (empty sum)
    |2| // B(3) = 2, since 3 = 3 or 3 = 1 + 1 + 1
    |1| // B(2) = 1, since 2 = 1 + 1
    |1| // B(1) = 1, since 1 = 1
    |1| // C(3) = 1, since 3 = 1 + 1 + 1
We'll get back a vector whose first entry is A(n+3), the number of ways to write n+3 as a sum of 1's, 3's, and 4's. (I've actually coded this up to check it - it works!) You can then raise the matrix to a power efficiently, using the same technique as for Fibonacci numbers, to solve this in time O(log n).
Here's some code doing that:
#include "Matrix.hh"
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <algorithm>
using namespace std;

/* Naive implementations of A, B, and C. */
uint64_t naiveC(int n) {
    return 1;
}
uint64_t naiveB(int n) {
    return (n < 3? 0 : naiveB(n-3)) + naiveC(n);
}
uint64_t naiveA(int n) {
    return (n < 4? 0 : naiveA(n-4)) + naiveB(n);
}

/* Constructs and returns the giant matrix. */
Matrix<8, 8, uint64_t> M() {
    Matrix<8, 8, uint64_t> result;
    fill(result.begin(), result.end(), uint64_t(0));
    result[0][3] = 1;
    result[0][6] = 1;
    result[0][7] = 1;
    result[1][0] = 1;
    result[2][1] = 1;
    result[3][2] = 1;
    result[4][6] = 1;
    result[4][7] = 1;
    result[5][4] = 1;
    result[6][5] = 1;
    result[7][7] = 1;
    return result;
}

/* Constructs the initial vector that we multiply the matrix by. */
Vector<8, uint64_t> initVec() {
    Vector<8, uint64_t> result;
    result[0] = 2;
    result[1] = 1;
    result[2] = 1;
    result[3] = 1;
    result[4] = 2;
    result[5] = 1;
    result[6] = 1;
    result[7] = 1;
    return result;
}

/* O(log n) time for raising a matrix to a power. */
Matrix<8, 8, uint64_t> fastPower(const Matrix<8, 8, uint64_t>& m, int n) {
    if (n == 0) return Identity<8, uint64_t>();
    auto half = fastPower(m, n / 2);
    if (n % 2 == 0) return half * half;
    else return half * half * m;
}

/* Fast implementation of A(n) using matrix exponentiation. */
uint64_t fastA(int n) {
    if (n == 0) return 1;
    if (n == 1) return 1;
    if (n == 2) return 1;
    if (n == 3) return 2;
    auto result = fastPower(M(), n - 3) * initVec();
    return result[0];
}

/* Some simple test code showing this in action! */
int main() {
    for (int i = 0; i < 25; i++) {
        cout << setw(2) << i << ": " << naiveA(i) << ", " << fastA(i) << endl;
    }
}
This is a very interesting sequence. It is almost but not quite the order-4 Fibonacci (a.k.a. Tetranacci) numbers. Having extracted the doubling formulas for Tetranacci from its companion matrix, I could not resist doing it again for this very similar recurrence relation.
Before we get into the actual code, some definitions and a short derivation of the formulas used are in order. Define an integer sequence A such that:
A(n) := A(n-1) + A(n-3) + A(n-4)
with initial values A(0), A(1), A(2), A(3) := 1, 1, 1, 2.
For n >= 0, this is the number of integer compositions of n into parts from the set {1, 3, 4}. This is the sequence that we ultimately wish to compute.
For convenience, define a sequence T such that:
T(n) := T(n-1) + T(n-3) + T(n-4)
with initial values T(0), T(1), T(2), T(3) := 0, 0, 0, 1.
Note that A(n) and T(n) are simply shifts of each other. More precisely, A(n) = T(n+3) for all integers n. Accordingly, as elaborated by another answer, the companion matrix for both sequences is:
[0 1 0 0]
[0 0 1 0]
[0 0 0 1]
[1 1 0 1]
Call this matrix C, and let:
a, b, c, d := T(n), T(n+1), T(n+2), T(n+3)
a', b', c', d' := T(2n), T(2n+1), T(2n+2), T(2n+3)
By induction, it can easily be shown that:
[0 1 0 0]^n   [d-c-a  c-b  b-a  a]
[0 0 1 0]   = [  a    d-c  c-b  b]
[0 0 0 1]     [  b    b+a  d-c  c]
[1 1 0 1]     [  c    c+b  b+a  d]
As seen above, for any n, C^n can be fully determined from its rightmost column alone. Furthermore, multiplying C^n with its rightmost column produces the rightmost column of C^(2n):
[d-c-a  c-b  b-a  a][a]   [a']   [a(2d - 2c - a) + b(2c - b)]
[  a    d-c  c-b  b][b] = [b'] = [  a^2 + c^2 + 2b(d - c)   ]
[  b    b+a  d-c  c][c]   [c']   [  b(2a + b) + c(2d - c)   ]
[  c    c+b  b+a  d][d]   [d']   [  b^2 + d^2 + 2c(a + b)   ]
Thus, if we wish to compute C^n for some n by repeated squaring, we need only perform matrix-vector multiplication per step instead of the full matrix-matrix multiplication.
Now, the implementation, in Python:
# O(n) integer additions or subtractions
def A_linearly(n):
    a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
    if n >= 0:
        for _ in range(+n):
            a, b, c, d = b, c, d, a + b + d
    else: # n < 0
        for _ in range(-n):
            a, b, c, d = d - c - a, a, b, c
    return d # because A(n) = T(n+3)

# O(log n) integer multiplications, additions, subtractions.
def A_by_doubling(n):
    n += 3 # because A(n) = T(n+3)
    if n >= 0:
        a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
    else: # n < 0
        a, b, c, d = 1, 0, 0, 0 # T(-1), T(0), T(1), T(2)
    # Unroll the final iteration to avoid computing extraneous values
    for i in reversed(range(1, abs(n).bit_length())):
        w = a*(2*(d - c) - a) + b*(2*c - b)
        x = a*a + c*c + 2*b*(d - c)
        y = b*(2*a + b) + c*(2*d - c)
        z = b*b + d*d + 2*c*(a + b)
        if (n >> i) & 1 == 0:
            a, b, c, d = w, x, y, z
        else: # (n >> i) & 1 == 1
            a, b, c, d = x, y, z, w + x + z
    if n & 1 == 0:
        return a*(2*(d - c) - a) + b*(2*c - b) # w
    else: # n & 1 == 1
        return a*a + c*c + 2*b*(d - c) # x

print(all(A_linearly(n) == A_by_doubling(n) for n in range(-1000, 1001)))
Because it was rather trivial to code, the sequence is extended to negative n in the usual way. Also provided is a simple linear implementation to serve as a point of reference.
For n large enough, the logarithmic implementation above is 10-20x faster than directly exponentiating the companion matrix with numpy, by a simple (i.e. not rigorous, and likely flawed) timing comparison. And by my estimate, it would still take ~100 years to compute A(10**12)! Even though the algorithm above has room for improvement, that number is simply too large. On the other hand, computing A(10**12) mod M for some M is much more attainable.
A direct relation to Lucas and Fibonacci numbers
It turns out that T(n) is even closer to the Fibonacci and Lucas numbers than it is to Tetranacci. To see this, note that the characteristic polynomial for T(n) is x^4 - x^3 - x - 1 = 0 which factors into (x^2 - x - 1)(x^2 + 1) = 0. The first factor is the characteristic polynomial for Fibonacci & Lucas! The 4 roots of (x^2 - x - 1)(x^2 + 1) = 0 are the two Fibonacci roots, phi and psi = 1 - phi, and i and -i--the two square roots of -1.
The closed-form expression or "Binet" formula for T(n) will have the general form:
T(n) = U(n) + V(n)
U(n) = p*(phi^n) + q*(psi^n)
V(n) = r*(i^n) + s*(-i)^n
for some constant coefficients p, q, r, s.
Using the initial values for T(n), solving for the coefficients, applying some algebra, and noting that the Lucas numbers have the closed-form expression: L(n) = phi^n + psi^n, we can derive the following relations:
        L(n+1) - L(n)     L(n-1)     F(n) + F(n-2)
U(n) = --------------- = -------- = ---------------
              5              5             5
where L(n) is the n'th Lucas number with L(0), L(1) := 2, 1 and F(n) is the n'th Fibonacci number with F(0), F(1) := 0, 1. And we also have:
V(n) =  1 / 5  if n = 0 (mod 4)
     | -2 / 5  if n = 1 (mod 4)
     | -1 / 5  if n = 2 (mod 4)
     |  2 / 5  if n = 3 (mod 4)
Which is ugly, but trivial to code. Note that the numerator of V(n) can also be succinctly expressed as cos(n*pi/2) - 2sin(n*pi/2) or (3-(-1)^n) / 2 * (-1)^(n(n+1)/2), but we use the piece-wise definition for clarity.
Here's an even nicer, more direct identity:
T(n) + T(n+2) = F(n)
Essentially, we can compute T(n) (and therefore A(n)) by using Fibonacci & Lucas numbers. Theoretically, this should be much more efficient than the Tetranacci-like approach.
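These identities are easy to sanity-check numerically for small n. The sketch below uses straightforward linear-time sequences and a made-up function name, check_identities. It verifies T(n) + T(n+2) = F(n) and 5*T(n) = L(n-1) + (numerator of V(n)) for a range of n >= 1.

```python
def check_identities(limit):
    """Check both identities for n in range(1, limit); returns True if all hold."""
    T = [0, 0, 0, 1]                          # T(0), T(1), T(2), T(3)
    F = [0, 1]                                # F(0), F(1)
    L = [2, 1]                                # L(0), L(1)
    while len(T) < limit + 2:
        T.append(T[-1] + T[-3] + T[-4])
    while len(F) < limit + 2:
        F.append(F[-1] + F[-2])
        L.append(L[-1] + L[-2])
    offset = (1, -2, -1, 2)                   # numerator of V(n), indexed by n mod 4
    for n in range(1, limit):
        if T[n] + T[n + 2] != F[n]:
            return False
        if 5 * T[n] != L[n - 1] + offset[n % 4]:
            return False
    return True


print(check_identities(200))
```

The check starts at n = 1 only so that L(n-1) never needs a negative index.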
It is known that the Lucas numbers can be computed more efficiently than Fibonacci numbers, therefore we will compute A(n) from the Lucas numbers. The most efficient, simple Lucas number algorithm I know of is one by L.F. Johnson (see his 2010 paper: Middle and Ripple, fast simple O(lg n) algorithms for Lucas Numbers). Once we have a Lucas algorithm, we use the identity T(n) = L(n - 1) / 5 + V(n) to compute A(n).
# O(log n) integer multiplications, additions, subtractions
def A_by_lucas(n):
    n += 3 # because A(n) = T(n+3)
    offset = (+1, -2, -1, +2)[n % 4]
    L = lf_johnson_2010_middle(n - 1)
    return (L + offset) // 5

def lf_johnson_2010_middle(n):
    "-> n'th Lucas number. See [L.F. Johnson 2010a]."
    #: The following Lucas identities are used:
    #: L(2n) = L(n)^2 - 2*(-1)^n
    #: L(2n+1) = L(2n+2) - L(2n)
    #: L(2n+2) = L(n+1)^2 - 2*(-1)^(n+1)
    #: The first and last identities are equivalent.
    #: For the unrolled iteration, the following is also used:
    #: L(2n+1) = L(n)*L(n+1) - (-1)^n
    #: Since this approach uses only square multiplications per loop,
    #: It turns out to be slightly faster than standard Lucas doubling,
    #: which uses 1 square and 1 regular multiplication.
    if n >= 0:
        a, b, sign = 2, 1, +1 # L(0), L(1), (-1)^0
    else: # n < 0
        a, b, sign = -1, 2, -1 # L(-1), L(0), (-1)^(-1)
    # unroll the last iteration to avoid computing unnecessary values
    for i in reversed(range(1, abs(n).bit_length())):
        a = a*a - 2*sign # L(2k)
        c = b*b + 2*sign # L(2k+2)
        b = c - a # L(2k+1)
        sign = +1
        if (n >> i) & 1:
            a, b = b, c
            sign = -1
    if n & 1:
        return a*b - sign
    return a*a - 2*sign
You may verify that A_by_lucas produces the same results as the previous A_by_doubling function, but is roughly 5x faster. Still not fast enough to compute A(10**12) in any reasonable amount of time!
You can easily improve your current recursive implementation by adding memoization, which makes the solution fast again. C# code:
// Dictionary to store computed values
private static Dictionary<int, long> s_Solutions = new Dictionary<int, long>();

private static long Count134(int value) {
    if (value == 0)
        return 1;
    else if (value <= 0)
        return 0;

    long result;
    // Improvement: Do we have the value computed?
    if (s_Solutions.TryGetValue(value, out result))
        return result;

    result = Count134(value - 4) +
             Count134(value - 3) +
             Count134(value - 1);
    // Improvement: Store the value computed for future use
    s_Solutions.Add(value, result);
    return result;
}
And so you can easily call
The outcome (which takes about 2 milliseconds) is

Calculating total combinations

I don't know how to go about this programming problem.
Given two integers n and m, how many numbers exist such that each number contains all digits from 0 to n-1, the difference between any two adjacent digits is exactly 1, and the number of digits is at most m?
What is the best way to solve this problem? Is there a direct mathematical formula?
Edit: The number cannot start with 0.
for n = 3 and m = 6 there are 18 such numbers (210, 2101, 21012, 210121 ... etc)
Update (some people have encountered an ambiguity):
All digits from 0 to n-1 must be present.
This Python code computes the answer in O(nm) by keeping track of the numbers ending with a particular digit.
Different arrays (A,B,C,D) are used to track numbers that have hit the maximum or minimum of the range.
A=[1]*n # Number of ways of being at digit i and never being to min or max
B=[0]*n # number of ways with minimum being observed
C=[0]*n # number of ways with maximum being observed
D=[0]*n # number of ways with both being observed
A[0]=0 # Cannot start with 0
A[n-1]=0 # Have seen max so this 1 moves from A to C
C[n-1]=1 # Have seen max if start with highest digit
for k in range(m-1):
for i in range(1,n-1):
x=sum(d for d in D2)
print t
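As a self-contained illustration of the same O(nm)-style idea, here is a sketch that uses a different state from the four arrays above. Because every step changes the digit by exactly 1, the digits used so far always form a contiguous range, so it is enough to track the last digit together with the smallest and largest digit seen. The function name count_numbers and the dictionary-based state are assumptions made for this sketch.

```python
def count_numbers(n, m):
    """Count numbers with at most m digits that use every digit 0..n-1,
    have adjacent digits differing by exactly 1, and do not start with 0."""
    # ways[(last, lo, hi)] = how many numbers of the current length end in
    # digit `last` and have used exactly the digits lo..hi
    ways = {(d, d, d): 1 for d in range(1, n)}      # length 1, no leading 0
    total = 0
    for _ in range(m):
        total += sum(c for (last, lo, hi), c in ways.items()
                     if lo == 0 and hi == n - 1)
        nxt = {}
        for (last, lo, hi), c in ways.items():
            for step in (last - 1, last + 1):
                if 0 <= step < n:
                    key = (step, min(lo, step), max(hi, step))
                    nxt[key] = nxt.get(key, 0) + c
        ways = nxt
    return total


print(count_numbers(3, 6))
```

For n = 3 and m = 6 this gives 18, the count stated in the question (1 + 3 + 5 + 9 for lengths 3 to 6).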
After doing some more research, I think there may actually be a mathematical approach after all, although the math is advanced for me. Douglas S. Stones pointed me in the direction of Joseph Myers' (2008) article, BMO 2008–2009 Round 1 Problem 1—Generalisation, which derives formulas for calculating the number of zig-zag paths across a rectangular board.
As I understand it, in Anirudh's example, our board would have 6 rows of length 3 (I believe this would mean n=3 and r=6 in the article's terms). We can visualize our board so:
0 1 2 example zig-zag path: 0
0 1 2 1
0 1 2 0
0 1 2 1
0 1 2 2
0 1 2 1
Since Myers' formula m(n,r) would generate the number for all the zig-zag paths, that is, the number of all 6-digit numbers where all adjacent digits are consecutive and digits are chosen from (0,1,2), we would still need to determine and subtract those that begin with zero and those that do not include all digits.
If I understand correctly, we may do this in the following way for our example, although generalizing the concept to arbitrary m and n may prove more complicated:
Let m(3,6) equal the number of 6-digit numbers where all adjacent digits
are consecutive and digits are chosen from (0,1,2). According to Myers,
m(3,r) is given by formula and also equals OEIS sequence A029744 at
index r+2, so we have
m(3,6) = 16
How many of these numbers start with zero? Myers describes c(n,r) as the
number of zig-zag paths whose colour is that of the square in the top
right corner of the board. In our case, c(3,6) would include the total
for starting-digit 0 as well as starting-digit 2. He gives c(3,2r) as 2^r,
so we have
c(3,6) = 8. For starting-digit 0 only, we divide by two to get 4.
Now we need to obtain only those numbers that include all the digits in
the range, but how? We can do this by subtracting m(n-1,r) from m(n,r).
In our case, we have all the m(2,6) that would include only 0's and 1's,
and all the m(2,6) that would include 1's and 2's. Myers gives
m(2,anything) as 2, so we have
2*m(2,6) = 2*2 = 4
But we must remember that one of the zero-starting numbers is included
in our total for 2*m(2,6), namely 010101. So all together we have
m(3,6) - c(3,6)/2 - 4 + 1
= 16 - 4 - 4 + 1
= 9
To complete our example, we must follow a similar process for m(3,5),
m(3,4) and m(3,3). Since it's late here, I might follow up tomorrow...
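The arithmetic above can be checked by brute force for small boards. The sketch below does not implement Myers' formulas; it simply enumerates every zig-zag path and counts the four quantities used in the inclusion-exclusion step. The function name inclusion_exclusion is hypothetical.

```python
from itertools import product


def inclusion_exclusion(n, r):
    """Enumerate all length-r sequences over 0..n-1 whose adjacent entries
    differ by 1 and return (total, zero_start, missing, both, valid)."""
    paths = [s for s in product(range(n), repeat=r)
             if all(abs(a - b) == 1 for a, b in zip(s, s[1:]))]
    total = len(paths)                                   # role of m(n,r)
    zero_start = sum(1 for s in paths if s[0] == 0)      # c(3,6)/2 in the example
    missing = sum(1 for s in paths if len(set(s)) < n)   # 2*m(2,6) in the example
    both = sum(1 for s in paths if s[0] == 0 and len(set(s)) < n)
    valid = total - zero_start - missing + both
    return total, zero_start, missing, both, valid


print(inclusion_exclusion(3, 6))  # (16, 4, 4, 1, 9)
print(sum(inclusion_exclusion(3, r)[4] for r in range(3, 7)))  # 18
```

Repeating the count for r = 3, 4 and 5 gives 1, 3 and 5, which together with the 9 above add up to the 18 stated in the question.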
One approach could be to program it recursively, calling the function to add as well as subtract from the last digit.
Haskell code:
import Data.List (sort,nub)
f n m = concatMap (combs n) [n..m]

combs n m = concatMap (\x -> combs' 1 [x]) [1..n - 1] where
  combs' count result
    | count == m = if test then [concatMap show result] else []
    | otherwise = combs' (count + 1) (result ++ [r + 1])
               ++ combs' (count + 1) (result ++ [r - 1])
    where r = last result
          test = (nub . sort $ result) == [0..n - 1]
*Main> f 3 6
In response to Anirudh Rayabharam's comment, I hope the following code will be more 'pseudocode' like. When the total number of digits reaches m, the function g outputs 1 if the solution has hashed all [0..n-1], and 0 if not. The function f accumulates the results for g for starting digits [1..n-1] and total number of digits [n..m].
Haskell code:
import qualified Data.Set as S
g :: Int -> Int -> Int -> Int -> (S.Set Int, Int) -> Int
g n m digitCount lastDigit (hash,hashCount)
  | digitCount == m = if test then 1 else 0
  | otherwise =
      if lastDigit == 0
        then g n m d' (lastDigit + 1) (hash'',hashCount')
        else if lastDigit == n - 1
          then g n m d' (lastDigit - 1) (hash'',hashCount')
          else g n m d' (lastDigit + 1) (hash'',hashCount')
             + g n m d' (lastDigit - 1) (hash'',hashCount')
  where test = hashCount' == n
        d' = digitCount + 1
        hash'' = if test then S.empty else hash'
        (hash',hashCount')
          | hashCount == n = (S.empty,hashCount)
          | S.member lastDigit hash = (hash,hashCount)
          | otherwise = (S.insert lastDigit hash,hashCount + 1)

f n m = foldr forEachNumDigits 0 [n..m] where
  forEachNumDigits numDigits accumulator =
    accumulator + foldr forEachStartingDigit 0 [1..n - 1] where
      forEachStartingDigit startingDigit accumulator' =
        accumulator' + g n numDigits 1 startingDigit (S.empty,0)
*Main> f 3 6
(0.01 secs, 571980 bytes)
*Main> f 4 20
(1.23 secs, 97795656 bytes)
*Main> f 4 25
(11.73 secs, 1068373268 bytes)
model your problem as 2 superimposed lattices in 2 dimensions, specifically as pairs (i,j) interconnected with oriented edges ((i0,j0),(i1,j1)) where i1 = i0 + 1, |j1 - j0| = 1, modified as follows:
dropping all pairs (i,j) with j > 9 and its incident edges
dropping all pairs (i,j) with i > m-1 and its incident edges
dropping edge ((0,0), (1,1))
this construction results in a structure like in this diagram:
the requested numbers map to paths in the lattice starting at one of the green elements ((0,j), j=1..min(n-1,9)) that contain at least one pink and one red element ((i,0), i=1..m-1, (i,n-1), i=0..m-1 ). to see this, identify the i-th digit j of a given number with point (i,j). including pink and red elements ('extremal digits') guarantees that all available digits are represented in the number.
for convenience, let q1, q2 denote zero-based positions (position-1).
let q1 be the position of a number's first digit being either 0 or min(n-1,9).
let q2 be the position of a number's first 0 if the digit at position q1 is min(n-1,9), and vice versa.
case 1: first extremal digit is 0
the number of valid prefixes containing no 0 can be expressed as sum_{k=1..min(n-1,9)} paths_to_0(k,1,q1), the function paths_to_0 being recursively defined as
paths_to_0(0,q1-1,q1) = 0;
paths_to_0(1,q1-1,q1) = 1;
paths_to_0(digit,i,q1) = 0; if q1-i < digit;
paths_to_0(x,_,_) = 0; if x >= min(n-1,9)
// x=min(n-1,9) mustn't occur before position q2,
// x > min(n-1,9) not at all
paths_to_0(x,_,_) = 0; if x <= 0;
// x=0 mustn't occur before position q1,
// x < 0 not at all
and else paths_to_0(digit,i,q1) =
paths_to_0(digit+1,i+1,q1) + paths_to_0(digit-1,i+1,q1);
similarly we have
paths_to_max(min(n-1,9),q2-1,q2) = 0;
paths_to_max(min(n-2,8),q2-1,q2) = 1;
paths_to_max(digit,i,q2) = 0 if q2-i < n-1;
paths_to_max(x,_,_) = 0; if x >= min(n-1,9)
// x=min(n-1,9) mustn't occur before
// position q2,
// x > min(n-1,9) not at all
paths_to_max(x,_,_) = 0; if x < 0;
and else paths_to_max(digit,q1,q2) =
paths_to_max(digit+1,q1+1,q2) + paths_to_max(digit-1,q1+1,q2);
and finally
paths_suffix(digit,length-1,length) = 2; if digit > 0 and digit < min(n-1,9)
paths_suffix(digit,length-1,length) = 1; if digit = 0 or digit = min(n-1,9)
paths_suffix(digit,k,length) = 0; if length > m-1
or length < q2
or k > length
paths_suffix(digit,k,0) = 1; // the empty path
and else paths_suffix(digit,k,length) =
paths_suffix(digit+1,k+1,length) + paths_suffix(digit-1,k+1,length);
... for a grand total of
number_count_case_1(n, m) =
sum_{first=1..min(n-1,9), q1=1..m-1-(n-1), q2=q1..m-1, l_suffix=0..m-1-q2} (
paths_to_0(first,1,q1)
+ paths_to_max(0,q1,q2)
+ paths_suffix(min(n-1,9),q2,l_suffix+q2)
)
case 2: first extremal digit is min(n-1,9)
case 2.1: initial digit is not min(n-1,9)
this is symmetrical to case 1 with all digits d replaced by min(n-1,9) - d. as the lattice structure is symmetrical, this means number_count_case_2_1 = number_count_case_1.
case 2.2: initial digit is min(n-1,9)
note that q1 is 1 and the second digit must be min(n-2,8).
number_count_case_2_2 (n, m) =
sum_{q2=1..m-2, l_suffix=0..m-2-q2} (
paths_to_max(1,1,q2)
+ paths_suffix(min(n-1,9),q2,l_suffix+q2)
)
so the grand grand total will be
number_count ( n, m ) = 2 * number_count_case_1 (n, m) + number_count_case_2_2 (n, m);
i don't know whether a closed expression for number_count exists, but the following perl code will compute it (the code is but a proof of concept as it does not use memoization techniques to avoid recomputing results already obtained):
use strict;
use warnings;
my ($n, $m) = ( 5, 7 ); # for example
$n = ($n > 10) ? 10 : $n; # cutoff
sub min
sub paths_to_0 ($$$) {
my (
$d
, $at
, $until
) = @_;
if (($d == 0) && ($at == $until - 1)) { return 0; }
if (($d == 1) && ($at == $until - 1)) { return 1; }
if ($until - $at < $d) { return 0; }
if (($d <= 0) || ($d >= $n)) { return 0; }
return paths_to_0($d+1, $at+1, $until) + paths_to_0($d-1, $at+1, $until);
} # paths_to_0
sub paths_to_max ($$$) {
my (
$d
, $at
, $until
) = @_;
if (($d == $n-1) && ($at == $until - 1)) { return 0; }
if (($d == $n-2) && ($at == $until - 1)) { return 1; }
if ($until - $at < $n-1) { return 0; }
if (($d < 0) || ($d >= $n-1)) { return 0; }
return paths_to_max($d+1, $at+1, $until) + paths_to_max($d-1, $at+1, $until);
} # paths_to_max
sub paths_suffix ($$$) {
my (
$d
, $at
, $until
) = @_;
if (($d < $n-1) && ($d > 0) && ($at == $until - 1)) { return 2; }
if ((($d == $n-1) || ($d == 0)) && ($at == $until - 1)) { return 1; }
if (($until > $m-1) || ($at > $until)) { return 0; }
if ($until == 0) { return 1; }
return paths_suffix($d+1, $at+1, $until) + paths_suffix($d-1, $at+1, $until);
} # paths_suffix
# main
# number_count =
#   sum_{first=1..min(n-1,9), q1=1..m-1-(n-1), q2=q1..m-1, l_suffix=0..m-1-q2} (
#     paths_to_0(first,1,q1)
#     + paths_to_max(0,q1,q2)
#     + paths_suffix(min(n-1,9),q2,l_suffix+q2)
#   )
my ($number_count, $number_count_2_2) = (0, 0);
my ($first, $q1, $q2, $l_suffix);
for ($first = 1; $first <= $n-1; $first++) {
  for ($q1 = 1; $q1 <= $m-1 - ($n-1); $q1++) {
    for ($q2 = $q1; $q2 <= $m-1; $q2++) {
      for ($l_suffix = 0; $l_suffix <= $m-1 - $q2; $l_suffix++) {
        # accumulate over all index combinations, as in the sum above
        $number_count +=
          + paths_to_0($first,1,$q1)
          + paths_to_max(0,$q1,$q2)
          + paths_suffix($n-1,$q2,$l_suffix+$q2)
        ;
      }
    }
  }
}
# case 2.2
for ($q2 = 1; $q2 <= $m-2; $q2++) {
  for ($l_suffix = 0; $l_suffix <= $m-2 - $q2; $l_suffix++) {
    # accumulate over all index combinations, as in the sum above
    $number_count_2_2 +=
      + paths_to_max(1,1,$q2)
      + paths_suffix($n-1,$q2,$l_suffix+$q2)
    ;
  }
}
$number_count = 2 * $number_count + $number_count_2_2;
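As noted above, the proof of concept recomputes the same subproblems many times. The sketch below shows what memoization could look like. It is not a translation of the Perl code or of the case analysis; it uses a simpler recursion whose state is the last digit, the number of digits still to append, and whether 0 and n-1 have been seen. The names count_with_memo and completions are made up for this illustration.

```python
from functools import lru_cache


def count_with_memo(n, m):
    """Count numbers with at most m digits over 0..n-1, no leading 0,
    adjacent digits differing by 1, and all digits 0..n-1 present."""

    @lru_cache(maxsize=None)
    def completions(last, remaining, seen_low, seen_high):
        # Ways to append exactly `remaining` digits after `last` so that
        # the finished number contains both 0 and n-1.
        seen_low = seen_low or last == 0
        seen_high = seen_high or last == n - 1
        if remaining == 0:
            return 1 if (seen_low and seen_high) else 0
        total = 0
        for nxt in (last - 1, last + 1):
            if 0 <= nxt < n:
                total += completions(nxt, remaining - 1, seen_low, seen_high)
        return total

    # a valid number needs at least n digits and starts with 1..n-1
    return sum(completions(first, length - 1, False, False)
               for length in range(n, m + 1)
               for first in range(1, n))


print(count_with_memo(3, 6))  # 18, as in the question
```

The cache holds at most about 4·n·m distinct states, so each state is evaluated once instead of once per path that reaches it.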
