Find the Maximum Element in any SubMatrix of Matrix - algorithm

I am given a matrix of N x M. For a square submatrix of side length X starting at position (a, b), I have to find the largest element present in that submatrix.
My approach:
Do as the question says, with two simple loops:
for i in range(a, a + x):
    for j in range(b, b + x):
        best = max(best, A[i][j])   # O(X^2) per query, up to O(N * M)
A little more advanced:
1. Build a segment tree over every row i in range(0, N)
2. for i in range(a, a + x): query(b, b + x)   # O(N * log M) per query
Is there any better solution, with only O(log n) query complexity?

A Sparse Table Algorithm Approach: <O(N * M * log(N) * log(M)), O(1)>
Precomputation time: O(N * M * log(N) * log(M))
Query time: O(1)
To understand this method you should know how to find RMQ using the sparse table algorithm in one dimension. We can use a 2D sparse table algorithm for finding Range Minimum (or, symmetrically, Maximum) Queries.
What we do in one dimension: we preprocess RMQ for sub arrays of length 2^j using dynamic programming. We keep an array M[0..N-1][0..logN] where M[i][j] is the index of the minimum value in the sub array starting at i and having length 2^j.
To calculate M[i][j] we search for the minimum value in the first and second half of the interval. The two pieces have length 2^(j-1), so the pseudocode for this is:
if (A[M[i][j-1]] <= A[M[i + 2^(j-1)][j-1]])
    M[i][j] = M[i][j-1]
else
    M[i][j] = M[i + 2^(j-1)][j-1]
Here A is the actual array which stores the values. Once we have these values preprocessed, let's show how we can use them to calculate RMQ(i, j). The idea is to select two blocks that entirely cover the interval [i..j] and find the minimum between them. Let k = floor(log2(j - i + 1)). For computing RMQ(i, j) we can use the following formula:
if (A[M[i][k]] <= A[M[j - 2^k + 1][k]])
    RMQ(i, j) = A[M[i][k]]
else
    RMQ(i, j) = A[M[j - 2^k + 1][k]]
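To make the one-dimensional building block concrete, here is a minimal runnable sketch in Python (my own illustration of the scheme described above; the function names are made up):

def build_sparse_table(A):
    # M[j][i] = index of the minimum of A[i .. i + 2^j - 1]
    n = len(A)
    M = [list(range(n))]                  # length-1 blocks: each element itself
    j = 1
    while (1 << j) <= n:
        half, prev, row = 1 << (j - 1), M[j - 1], []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]
            row.append(left if A[left] <= A[right] else right)
        M.append(row)
        j += 1
    return M

def rmq(A, M, i, j):
    # minimum of A[i..j], covered by two (possibly overlapping) 2^k blocks
    k = (j - i + 1).bit_length() - 1
    left, right = M[k][i], M[k][j - (1 << k) + 1]
    return A[left] if A[left] <= A[right] else A[right]

A = [4, 6, 1, 5, 7, 3]
M = build_sparse_table(A)
print(rmq(A, M, 1, 4))   # minimum of [6, 1, 5, 7] -> 1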
For two dimensions:
Similarly, we can extend the above rule to two dimensions. Here we preprocess RMQ for sub matrices of size 2^k x 2^l using dynamic programming and keep an array M[0..N-1][0..M-1][0..logN][0..logM], where M[x][y][k][l] is the index of the minimum value in the sub matrix starting at [x, y] and having size 2^k x 2^l.
The pseudocode for calculating M[x][y][k][l] is:
M[x][y][i][j] = GetMinimum(M[x][y][i-1][j-1], M[x + (2^(i-1))][y][i-1][j-1], M[x][y+(2^(j-1))][i-1][j-1], M[x + (2^(i-1))][y+(2^(j-1))][i-1][j-1])
Here the GetMinimum function returns the index of the minimum element among the provided elements. Now that we have preprocessed, let's see how to calculate RMQ(x, y, x1, y1). Here [x, y] is the top-left point of the sub matrix and [x1, y1] is the end point, i.e. the bottom-right point of the sub matrix. We have to select four sub matrix blocks that entirely cover [x, y, x1, y1] and find the minimum of them. Let k = floor(log2(x1 - x + 1)) and l = floor(log2(y1 - y + 1)). For computing RMQ(x, y, x1, y1) we can use the following formula:
RMQ(x, y, x1, y1) = GetMinimum(M[x][y][k][l], M[x1 - (2^k) + 1][y][k][l], M[x][y1 - (2^l) + 1][k][l], M[x1 - (2^k) + 1][y1 - (2^l) + 1][k][l]);
Pseudocode for the above logic:
// remember that array 'M' stores indices into the actual matrix 'P', so when
// comparing values inside the GetMinimum function, compare the values of 'P', not of 'M'
SparseMatrix(n, m) {                 // n, m are the dimensions of the matrix
    for i = 0; 2^i <= n; i++:
        for j = 0; 2^j <= m; j++:
            for x = 0; x + 2^i - 1 < n; x++:
                for y = 0; y + 2^j - 1 < m; y++:
                    if i == 0 and j == 0:
                        M[x][y][i][j] = Pair(x, y)    // store the position (x, y)
                    else if i == 0:
                        M[x][y][i][j] = GetMinimum(M[x][y][i][j-1], M[x][y+(2^(j-1))][i][j-1])
                    else if j == 0:
                        M[x][y][i][j] = GetMinimum(M[x][y][i-1][j], M[x+(2^(i-1))][y][i-1][j])
                    else:
                        M[x][y][i][j] = GetMinimum(M[x][y][i-1][j-1], M[x+(2^(i-1))][y][i-1][j-1], M[x][y+(2^(j-1))][i-1][j-1], M[x+(2^(i-1))][y+(2^(j-1))][i-1][j-1])
}
RMQ(x, y, x1, y1) {
    k = floor(log2(x1 - x + 1))
    l = floor(log2(y1 - y + 1))
    ans = GetMinimum(M[x][y][k][l], M[x1-(2^k)+1][y][k][l], M[x][y1-(2^l)+1][k][l], M[x1-(2^k)+1][y1-(2^l)+1][k][l])
    return P[ans->x][ans->y]   // ans->x is the row stored in ans; ans->y is the column
}

Here is sample code in C++ for the pseudocode given by @Chapta, as was requested by some users.
#include <algorithm>  // std::max
#include <cmath>      // log2
using namespace std;

int M[1000][1000][10][10];
int **matrix;
int n, m;  // dimensions of 'matrix'

void precompute_max() {
    for (int i = 0; (1 << i) <= n; i += 1) {
        for (int j = 0; (1 << j) <= m; j += 1) {
            for (int x = 0; x + (1 << i) - 1 < n; x += 1) {
                for (int y = 0; y + (1 << j) - 1 < m; y += 1) {
                    if (i == 0 and j == 0)
                        M[x][y][i][j] = matrix[x][y];  // store the value itself
                    else if (i == 0)
                        M[x][y][i][j] = max(M[x][y][i][j-1], M[x][y+(1<<(j-1))][i][j-1]);
                    else if (j == 0)
                        M[x][y][i][j] = max(M[x][y][i-1][j], M[x+(1<<(i-1))][y][i-1][j]);
                    else
                        M[x][y][i][j] = max({M[x][y][i-1][j-1], M[x+(1<<(i-1))][y][i-1][j-1], M[x][y+(1<<(j-1))][i-1][j-1], M[x+(1<<(i-1))][y+(1<<(j-1))][i-1][j-1]});
                    // cout << "from i="<<x<<" j="<<y<<" of length="<<(1<<i)<<" and length="<<(1<<j)<<" max is: "<<M[x][y][i][j]<<endl;
                }
            }
        }
    }
}

int compute_max(int x, int y, int x1, int y1) {
    int k = log2(x1 - x + 1);
    int l = log2(y1 - y + 1);
    // cout << "Value of k="<<k<<" l="<<l<<endl;
    return max({M[x][y][k][l], M[x1-(1<<k)+1][y][k][l], M[x][y1-(1<<l)+1][k][l], M[x1-(1<<k)+1][y1-(1<<l)+1][k][l]});
}
This code first precomputes the two-dimensional sparse table and then queries it in constant time.
Additional info: this sparse table stores the maximum elements themselves, not indices to the maximum elements.

AFAIK, there can be no O(log n) approach, as the matrix follows no order. However, if you have an order such that every row is sorted ascending from left to right and every column is sorted ascending from top to bottom, then you know that A[a+x-1][b+x-1] (the bottom-right cell of the submatrix) is the largest element in that submatrix. Thus, finding the maximum takes O(1) time once the matrix is sorted. However, sorting the matrix, if not already sorted, will cost O(NM log(NM)).


Using matrices to find the number of different ways to write n as the sum of 1, 3, and 4?

This is a question given in a presentation on Dynamic Programming.
I have implemented the algorithm using recursion and it works fine for small values, but when n is greater than 30 it becomes really slow. The presentation mentions that for large values of n one should consider something similar to the matrix form of Fibonacci numbers. I am having trouble understanding how to use the matrix form of Fibonacci numbers to come up with a solution. Can someone give me some hints or pseudocode?
Thanks
Yes, you can use the technique from fast Fibonacci implementations to solve this problem in time O(log n)! Here's how to do it.
Let's start with the case where 1 + 3 is counted separately from 3 + 1 (order matters; the unordered case is handled later in this answer). Then you have the following recurrence relation:
A(0) = 1
A(1) = 1
A(2) = 1
A(3) = 2
A(k+4) = A(k) + A(k+1) + A(k+3)
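As a quick sanity check on this recurrence (my own addition, not part of the original answer), a brute-force enumeration of ordered sums agrees with it for small n:

# Brute-force count of ordered ways to write n as a sum of 1s, 3s and 4s,
# used only to spot-check the recurrence A(k+4) = A(k) + A(k+1) + A(k+3).
def count_compositions(n):
    if n == 0:
        return 1          # the empty sum
    return sum(count_compositions(n - p) for p in (1, 3, 4) if p <= n)

A = [count_compositions(n) for n in range(20)]
print(A[:8])                                                      # [1, 1, 1, 2, 4, 6, 9, 15]
print(all(A[k+4] == A[k] + A[k+1] + A[k+3] for k in range(16)))   # True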
The matrix trick here is to notice that
| 1 0 1 1 | |A( k )|   |A(k) + A(k-2) + A(k-3)|   |A(k+1)|
| 1 0 0 0 | |A(k-1)|   |        A( k )        |   |A( k )|
| 0 1 0 0 | |A(k-2)| = |        A(k-1)        | = |A(k-1)|
| 0 0 1 0 | |A(k-3)|   |        A(k-2)        |   |A(k-2)|
In other words, multiplying a vector of the last four values in the series produces a vector with those values shifted forward by one step.
Let's call that matrix there M. Then notice that
    |A( k )|   |A(k+2)|
    |A(k-1)|   |A(k+1)|
M^2 |A(k-2)| = |A( k )|
    |A(k-3)|   |A(k-1)|
In other words, multiplying by the square of this matrix shifts the series down two steps. More generally:
    |A( k )|   |A(k   + n)|
    |A(k-1)|   |A(k-1 + n)|
M^n |A(k-2)| = |A(k-2 + n)|
    |A(k-3)|   |A(k-3 + n)|
So multiplying by M^n shifts the series down n steps. Now, if we want to know the value of A(n+3), we can just compute
    |A(3)|   |A(n+3)|
    |A(2)|   |A(n+2)|
M^n |A(1)| = |A(n+1)|
    |A(0)|   |A( n )|
and read off the top entry of the vector! This can be done in time O(log n) by using exponentiation by squaring. Here's some code that does just that. This uses a matrix library I cobbled together a while back:
#include "Matrix.hh"
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <algorithm>
using namespace std;
/* Naive implementations of A. */
uint64_t naiveA(int n) {
if (n == 0) return 1;
if (n == 1) return 1;
if (n == 2) return 1;
if (n == 3) return 2;
return naiveA(n-1) + naiveA(n-3) + naiveA(n-4);
}
/* Constructs and returns the giant matrix. */
Matrix<4, 4, uint64_t> M() {
Matrix<4, 4, uint64_t> result;
fill(result.begin(), result.end(), uint64_t(0));
result[0][0] = 1;
result[0][2] = 1;
result[0][3] = 1;
result[1][0] = 1;
result[2][1] = 1;
result[3][2] = 1;
return result;
}
/* Constructs the initial vector that we multiply the matrix by. */
Vector<4, uint64_t> initVec() {
Vector<4, uint64_t> result;
result[0] = 2;
result[1] = 1;
result[2] = 1;
result[3] = 1;
return result;
}
/* O(log n) time for raising a matrix to a power. */
Matrix<4, 4, uint64_t> fastPower(const Matrix<4, 4, uint64_t>& m, int n) {
if (n == 0) return Identity<4, uint64_t>();
auto half = fastPower(m, n / 2);
if (n % 2 == 0) return half * half;
else return half * half * m;
}
/* Fast implementation of A(n) using matrix exponentiation. */
uint64_t fastA(int n) {
if (n == 0) return 1;
if (n == 1) return 1;
if (n == 2) return 1;
if (n == 3) return 2;
auto result = fastPower(M(), n - 3) * initVec();
return result[0];
}
/* Some simple test code showing this in action! */
int main() {
for (int i = 0; i < 25; i++) {
cout << setw(2) << i << ": " << naiveA(i) << ", " << fastA(i) << endl;
}
}
Now, how would this change if 3 + 1 and 1 + 3 were treated as equivalent? This means that we can think about solving this problem in the following way:
Let A(n) be the number of ways to write n as a sum of 1s, 3s, and 4s.
Let B(n) be the number of ways to write n as a sum of 1s and 3s.
Let C(n) be the number of ways to write n as a sum of 1s.
We then have the following:
A(n) = B(n) for all n ≤ 3, since for numbers in that range the only options are to use 1s and 3s.
A(n + 4) = A(n) + B(n + 4), since your options are either (1) use a 4 or (2) not use a 4, leaving the remaining sum to use 1s and 3s.
B(n) = C(n) for all n ≤ 2, since for numbers in that range the only options are to use 1s.
B(n + 3) = B(n) + C(n + 3), since your options are either (1) use a 3 or (2) not use a 3, leaving the remaining sum to use only 1s.
C(0) = 1, since there's only one way to write 0 as a sum of no numbers.
C(n+1) = C(n), since the only way to write something with 1s is to pull out a 1 and write the remaining number as a sum of 1s.
That's a lot to take in, but do notice the following: we ultimately care about A(n), and to evaluate it, we only need to know the values of A(n), A(n-1), A(n-2), A(n-3), B(n), B(n-1), B(n-2), B(n-3), C(n), C(n-1), C(n-2), and C(n-3).
Let's imagine, for example, that we know these twelve values for some fixed value of n. We can learn those twelve values for the next value of n as follows:
C(n+1) = C(n)
B(n+1) = B(n-2) + C(n+1) = B(n-2) + C(n)
A(n+1) = A(n-3) + B(n+1) = A(n-3) + B(n-2) + C(n)
And the remaining values then shift down.
We can formulate this as a giant matrix equation:
  A(n) A(n-1) A(n-2) A(n-3) B(n) B(n-1) B(n-2) C(n)

|  0    0      0      1     0    0      1      1  | |A( n )|   |A(n+1)|
|  1    0      0      0     0    0      0      0  | |A(n-1)|   |A( n )|
|  0    1      0      0     0    0      0      0  | |A(n-2)|   |A(n-1)|
|  0    0      1      0     0    0      0      0  | |A(n-3)|   |A(n-2)|
|  0    0      0      0     0    0      1      1  | |B( n )| = |B(n+1)|
|  0    0      0      0     1    0      0      0  | |B(n-1)|   |B( n )|
|  0    0      0      0     0    1      0      0  | |B(n-2)|   |B(n-1)|
|  0    0      0      0     0    0      0      1  | |C( n )|   |C(n+1)|
Let's call this gigantic matrix here M. Then if we compute
    |2| // A(3) = 2, since 3 = 3 or 3 = 1 + 1 + 1
    |1| // A(2) = 1, since 2 = 1 + 1
    |1| // A(1) = 1, since 1 = 1
    |1| // A(0) = 1, since 0 = (empty sum)
M^n |2| // B(3) = 2, since 3 = 3 or 3 = 1 + 1 + 1
    |1| // B(2) = 1, since 2 = 1 + 1
    |1| // B(1) = 1, since 1 = 1
    |1| // C(3) = 1, since 3 = 1 + 1 + 1
We'll get back a vector whose first entry is A(n+3), the number of ways to write n+3 as a sum of 1's, 3's, and 4's. (I've actually coded this up to check it - it works!) You can then use exponentiation by squaring, the same technique used for computing Fibonacci numbers via matrix powers, to solve this in time O(log n).
Here's some code doing that:
#include "Matrix.hh"
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <algorithm>
using namespace std;
/* Naive implementations of A, B, and C. */
uint64_t naiveC(int n) {
return 1;
}
uint64_t naiveB(int n) {
return (n < 3? 0 : naiveB(n-3)) + naiveC(n);
}
uint64_t naiveA(int n) {
return (n < 4? 0 : naiveA(n-4)) + naiveB(n);
}
/* Constructs and returns the giant matrix. */
Matrix<8, 8, uint64_t> M() {
Matrix<8, 8, uint64_t> result;
fill(result.begin(), result.end(), uint64_t(0));
result[0][3] = 1;
result[0][6] = 1;
result[0][7] = 1;
result[1][0] = 1;
result[2][1] = 1;
result[3][2] = 1;
result[4][6] = 1;
result[4][7] = 1;
result[5][4] = 1;
result[6][5] = 1;
result[7][7] = 1;
return result;
}
/* Constructs the initial vector that we multiply the matrix by. */
Vector<8, uint64_t> initVec() {
Vector<8, uint64_t> result;
result[0] = 2;
result[1] = 1;
result[2] = 1;
result[3] = 1;
result[4] = 2;
result[5] = 1;
result[6] = 1;
result[7] = 1;
return result;
}
/* O(log n) time for raising a matrix to a power. */
Matrix<8, 8, uint64_t> fastPower(const Matrix<8, 8, uint64_t>& m, int n) {
if (n == 0) return Identity<8, uint64_t>();
auto half = fastPower(m, n / 2);
if (n % 2 == 0) return half * half;
else return half * half * m;
}
/* Fast implementation of A(n) using matrix exponentiation. */
uint64_t fastA(int n) {
if (n == 0) return 1;
if (n == 1) return 1;
if (n == 2) return 1;
if (n == 3) return 2;
auto result = fastPower(M(), n - 3) * initVec();
return result[0];
}
/* Some simple test code showing this in action! */
int main() {
for (int i = 0; i < 25; i++) {
cout << setw(2) << i << ": " << naiveA(i) << ", " << fastA(i) << endl;
}
}
This is a very interesting sequence. It is almost but not quite the order-4 Fibonacci (a.k.a. Tetranacci) numbers. Having extracted the doubling formulas for Tetranacci from its companion matrix, I could not resist doing it again for this very similar recurrence relation.
Before we get into the actual code, some definitions and a short derivation of the formulas used are in order. Define an integer sequence A such that:
A(n) := A(n-1) + A(n-3) + A(n-4)
with initial values A(0), A(1), A(2), A(3) := 1, 1, 1, 2.
For n >= 0, this is the number of integer compositions of n into parts from the set {1, 3, 4}. This is the sequence that we ultimately wish to compute.
For convenience, define a sequence T such that:
T(n) := T(n-1) + T(n-3) + T(n-4)
with initial values T(0), T(1), T(2), T(3) := 0, 0, 0, 1.
Note that A(n) and T(n) are simply shifts of each other. More precisely, A(n) = T(n+3) for all integers n. Accordingly, as elaborated by another answer, the companion matrix for both sequences is:
[0 1 0 0]
[0 0 1 0]
[0 0 0 1]
[1 1 0 1]
Call this matrix C, and let:
a, b, c, d := T(n), T(n+1), T(n+2), T(n+3)
a', b', c', d' := T(2n), T(2n+1), T(2n+2), T(2n+3)
By induction, it can easily be shown that:

[0 1 0 0]^n   [d-c-a  c-b  b-a  a]
[0 0 1 0]     [  a    d-c  c-b  b]
[0 0 0 1]   = [  b    b+a  d-c  c]
[1 1 0 1]     [  c    c+b  b+a  d]
As seen above, for any n, C^n can be fully determined from its rightmost column alone. Furthermore, multiplying C^n with its rightmost column produces the rightmost column of C^(2n):
[d-c-a  c-b  b-a  a] [a]   [a']   [a(2d - 2c - a) + b(2c - b)]
[  a    d-c  c-b  b] [b] = [b'] = [a^2 + c^2 + 2b(d - c)     ]
[  b    b+a  d-c  c] [c]   [c']   [b(2a + b) + c(2d - c)     ]
[  c    c+b  b+a  d] [d]   [d']   [b^2 + d^2 + 2c(a + b)     ]
Thus, if we wish to compute C^n for some n by repeated squaring, we need only perform one matrix-vector multiplication per step instead of a full matrix-matrix multiplication.
Now, the implementation, in Python:
# O(n) integer additions or subtractions
def A_linearly(n):
a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
if n >= 0:
for _ in range(+n):
a, b, c, d = b, c, d, a + b + d
else: # n < 0
for _ in range(-n):
a, b, c, d = d - c - a, a, b, c
return d # because A(n) = T(n+3)
# O(log n) integer multiplications, additions, subtractions.
def A_by_doubling(n):
n += 3 # because A(n) = T(n+3)
if n >= 0:
a, b, c, d = 0, 0, 0, 1 # T(0), T(1), T(2), T(3)
else: # n < 0
a, b, c, d = 1, 0, 0, 0 # T(-1), T(0), T(1), T(2)
# Unroll the final iteration to avoid computing extraneous values
for i in reversed(range(1, abs(n).bit_length())):
w = a*(2*(d - c) - a) + b*(2*c - b)
x = a*a + c*c + 2*b*(d - c)
y = b*(2*a + b) + c*(2*d - c)
z = b*b + d*d + 2*c*(a + b)
if (n >> i) & 1 == 0:
a, b, c, d = w, x, y, z
else: # (n >> i) & 1 == 1
a, b, c, d = x, y, z, w + x + z
if n & 1 == 0:
return a*(2*(d - c) - a) + b*(2*c - b) # w
else: # n & 1 == 1
return a*a + c*c + 2*b*(d - c) # x
print(all(A_linearly(n) == A_by_doubling(n) for n in range(-1000, 1001)))
Because it was rather trivial to code, the sequence is extended to negative n in the usual way. Also provided is a simple linear implementation to serve as a point of reference.
For n large enough, the logarithmic implementation above is 10-20x faster than directly exponentiating the companion matrix with numpy, by a simple (i.e. not rigorous, and likely flawed) timing comparison. And by my estimate, it would still take ~100 years to compute A(10**12)! Even though the algorithm above has room for improvement, that number is simply too large. On the other hand, computing A(10**12) mod M for some M is much more attainable.
A direct relation to Lucas and Fibonacci numbers
It turns out that T(n) is even closer to the Fibonacci and Lucas numbers than it is to Tetranacci. To see this, note that the characteristic polynomial for T(n) is x^4 - x^3 - x - 1 = 0, which factors into (x^2 - x - 1)(x^2 + 1) = 0. The first factor is the characteristic polynomial for Fibonacci & Lucas! The 4 roots of (x^2 - x - 1)(x^2 + 1) = 0 are the two Fibonacci roots, phi and psi = 1 - phi, and i and -i, the two square roots of -1.
The closed-form expression or "Binet" formula for T(n) will have the general form:
T(n) = U(n) + V(n)
U(n) = p*(phi^n) + q*(psi^n)
V(n) = r*(i^n) + s*(-i)^n
for some constant coefficients p, q, r, s.
Using the initial values for T(n), solving for the coefficients, applying some algebra, and noting that the Lucas numbers have the closed-form expression: L(n) = phi^n + psi^n, we can derive the following relations:
U(n) = (L(n+1) - L(n)) / 5 = L(n-1) / 5 = (F(n) + F(n-2)) / 5
where L(n) is the n'th Lucas number with L(0), L(1) := 2, 1 and F(n) is the n'th Fibonacci number with F(0), F(1) := 0, 1. And we also have:
V(n) = |  1/5  if n = 0 (mod 4)
       | -2/5  if n = 1 (mod 4)
       | -1/5  if n = 2 (mod 4)
       |  2/5  if n = 3 (mod 4)
Which is ugly, but trivial to code. Note that the numerator of V(n) can also be succinctly expressed as cos(n*pi/2) - 2sin(n*pi/2) or (3-(-1)^n) / 2 * (-1)^(n(n+1)/2), but we use the piece-wise definition for clarity.
Here's an even nicer, more direct identity:
T(n) + T(n+2) = F(n)
Essentially, we can compute T(n) (and therefore A(n)) by using Fibonacci & Lucas numbers. Theoretically, this should be much more efficient than the Tetranacci-like approach.
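Here is a quick numeric spot-check of that identity (my own sketch, not from the original answer):

# Spot-check T(n) + T(n+2) = F(n) for small n.
T = [0, 0, 0, 1]
while len(T) < 32:
    T.append(T[-1] + T[-3] + T[-4])   # T(n) = T(n-1) + T(n-3) + T(n-4)

F = [0, 1]
while len(F) < 30:
    F.append(F[-1] + F[-2])

print(all(T[n] + T[n + 2] == F[n] for n in range(30)))   # True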
It is known that the Lucas numbers can be computed more efficiently than Fibonacci, therefore we will compute A(n) from the Lucas numbers. The most efficient, simple Lucas number algorithm I know of is one by L.F. Johnson (see his 2010 paper: Middle and Ripple, fast simple O(lg n) algorithms for Lucas numbers). Once we have a Lucas algorithm, we use the identity T(n) = L(n - 1) / 5 + V(n) to compute A(n).
# O(log n) integer multiplications, additions, subtractions
def A_by_lucas(n):
    n += 3                                  # because A(n) = T(n+3)
    offset = (+1, -2, -1, +2)[n % 4]        # 5 * V(n)
    L = lf_johnson_2010_middle(n - 1)
    return (L + offset) // 5

def lf_johnson_2010_middle(n):
    "-> n'th Lucas number. See [L.F. Johnson 2010a]."
    #: The following Lucas identities are used:
    #:
    #:  L(2n)   = L(n)^2 - 2*(-1)^n
    #:  L(2n+1) = L(2n+2) - L(2n)
    #:  L(2n+2) = L(n+1)^2 - 2*(-1)^(n+1)
    #:
    #: The first and last identities are equivalent.
    #: For the unrolled iteration, the following is also used:
    #:
    #:  L(2n+1) = L(n)*L(n+1) - (-1)^n
    #:
    #: Since this approach uses only square multiplications per loop,
    #: it turns out to be slightly faster than standard Lucas doubling,
    #: which uses 1 square and 1 regular multiplication.
    if n >= 0:
        a, b, sign = 2, 1, +1               # L(0), L(1), (-1)^0
    else: # n < 0
        a, b, sign = -1, 2, -1              # L(-1), L(0), (-1)^(-1)
    # unroll the last iteration to avoid computing unnecessary values
    for i in reversed(range(1, abs(n).bit_length())):
        a = a*a - 2*sign                    # L(2k)
        c = b*b + 2*sign                    # L(2k+2)
        b = c - a                           # L(2k+1)
        sign = +1
        if (n >> i) & 1:
            a, b = b, c
            sign = -1
    if n & 1:
        return a*b - sign
    else:
        return a*a - 2*sign
You may verify that A_by_lucas produces the same results as the previous A_by_doubling function, but is roughly 5x faster. Still not fast enough to compute A(10**12) in any reasonable amount of time!
You can easily improve your current recursive implementation by adding memoization, which makes the solution fast again. C# code:
// Dictionary to store computed values
private static Dictionary<int, long> s_Solutions = new Dictionary<int, long>();

private static long Count134(int value) {
    if (value == 0)
        return 1;
    else if (value < 0)
        return 0;

    long result;

    // Improvement: Do we have the value computed?
    if (s_Solutions.TryGetValue(value, out result))
        return result;

    result = Count134(value - 4) +
             Count134(value - 3) +
             Count134(value - 1);

    // Improvement: Store the value computed for future use
    s_Solutions.Add(value, result);
    return result;
}
And so you can easily call
Console.Write(Count134(500));
The outcome (which takes about 2 milliseconds) is
3350159379832610737
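For comparison, the same memoization idea as a minimal Python sketch (my own addition; count134 is just a name mirroring the C# method above):

from functools import lru_cache

@lru_cache(maxsize=None)
def count134(value):
    # ordered sums of 1s, 3s and 4s, memoized so each value is computed once
    if value == 0:
        return 1
    if value < 0:
        return 0
    return count134(value - 4) + count134(value - 3) + count134(value - 1)

print(count134(500))   # should match the C# result above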

How to find ith item in zigzag ordering?

A question last week defined the zig zag ordering on an n by m matrix and asked how to list the elements in that order.
My question is how to quickly find the ith item in the zigzag ordering? That is, without traversing the matrix (for large n and m that's much too slow).
For example with n=m=8 as in the picture and (x, y) describing (row, column)
f(0) = (0, 0)
f(1) = (0, 1)
f(2) = (1, 0)
f(3) = (2, 0)
f(4) = (1, 1)
...
f(63) = (7, 7)
Specific question: what is the ten billionth (1e10) item in the zigzag ordering of a million by million matrix?
Let's assume that the desired element is located in the upper half of the matrix. The lengths of the diagonals are 1, 2, 3, ..., n.
Let's find the desired diagonal. It satisfies the following property:
sum(1, 2, ..., k) >= pos but sum(1, 2, ..., k - 1) < pos. The sum of 1, 2, ..., k is k * (k + 1) / 2. So we just need to find the smallest integer k such that k * (k + 1) / 2 >= pos. We can either use binary search or solve this quadratic inequality explicitly.
When we know k, we just need to find the (pos - (k - 1) * k / 2)-th element of this diagonal. We know where it starts and in which direction we should move (up or down, depending on the parity of k), so we can find the desired cell using a simple formula.
This solution has O(1) or O(log n) time complexity (depending on whether we solve the inequality explicitly or use binary search in step 2).
If the desired element is located in the lower half of the matrix, we can solve this problem for pos' = n * n - pos + 1 and then use symmetry to get the solution to the original problem.
I used 1-based indexing in this solution; using 0-based indexing might require adding +1 or -1 somewhere, but the idea of the solution is the same.
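To make the square case concrete, here is a minimal Python sketch of the two steps above for the upper-left triangle (my own illustration; upper_triangle_cell is a hypothetical name, and pos is assumed 1-based and inside the triangle):

import math

def upper_triangle_cell(pos):
    # smallest k with k*(k+1)/2 >= pos, from the quadratic formula
    k = (math.isqrt(8 * pos + 1) - 1) // 2
    if k * (k + 1) // 2 < pos:
        k += 1
    j = pos - (k - 1) * k // 2        # 1-based position within diagonal k
    d = k - 1                         # 0-based diagonal index: row + col = d
    if d % 2 == 1:                    # odd diagonals run downward in the example above
        return (j - 1, d - j + 1)
    else:                             # even diagonals run upward
        return (d - j + 1, j - 1)

# f(0)..f(4) from the question, using pos = ordinal + 1:
print([upper_triangle_cell(p) for p in range(1, 6)])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1)]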
If the matrix is rectangular, not square, we need to account for the fact that the lengths of the diagonals look like 1, 2, 3, ..., m, m, m, ..., m, m - 1, ..., 1 (if m <= n) when we search for k, so the prefix sum becomes k * (k + 1) / 2 if k <= m and m * (m + 1) / 2 + m * (k - m) otherwise.
import math, random

def naive(n, m, ord, swap = False):
    dx = 1
    dy = -1
    if swap:
        dx, dy = dy, dx
    cur = [0, 0]
    for i in range(ord):
        cur[0] += dy
        cur[1] += dx
        if cur[0] < 0 or cur[1] < 0 or cur[0] >= n or cur[1] >= m:
            dx, dy = dy, dx
        if cur[0] >= n:
            cur[0] = n - 1
            cur[1] += 2
        if cur[1] >= m:
            cur[1] = m - 1
            cur[0] += 2
        if cur[0] < 0: cur[0] = 0
        if cur[1] < 0: cur[1] = 0
    return cur

def fast(n, m, ord, swap = False):
    if n < m:
        x, y = fast(m, n, ord, not swap)
        return [y, x]
    alt = n * m - ord - 1
    if alt < ord:
        x, y = fast(n, m, alt, swap if (n + m) % 2 == 0 else not swap)
        return [n - x - 1, m - y - 1]
    if ord < m * (m + 1) // 2:
        diag = int((-1 + math.sqrt(1 + 8 * ord)) / 2)
        parity = (diag + (0 if swap else 1)) % 2
        within = ord - diag * (diag + 1) // 2
        if parity: return [diag - within, within]
        else: return [within, diag - within]
    else:
        ord -= m * (m + 1) // 2
        diag = ord // m
        within = ord - diag * m
        diag += m
        parity = (diag + (0 if swap else 1)) % 2
        if not parity:
            within = m - within - 1
        return [diag - within, within]

if __name__ == "__main__":
    for i in range(1000):
        n = random.randint(3, 100)
        m = random.randint(3, 100)
        ord = random.randint(0, n * m - 1)
        swap = random.randint(0, 99) < 50
        na = naive(n, m, ord, swap)
        fa = fast(n, m, ord, swap)
        assert na == fa, "(%d, %d, %d, %s) ==> (%s), (%s)" % (n, m, ord, swap, na, fa)
    print(fast(1000000, 1000000, 9999999999, False))
    print(fast(1000000, 1000000, 10000000000, False))
So the 10-billionth element (the one with ordinal 9999999999), and the 10-billion-first element (the one with ordinal 10^10) are:
[20331, 121089]
[20330, 121090]
An analytical solution
In the general case, your matrix will be divided in 3 areas:
an initial triangle t1
a skewed part mid where diagonals have a constant length
a final triangle t2
Let's call p the index of your diagonal run.
We want to define two functions x(p) and y(p) that give you the column and row of the pth cell.
Initial triangle
Let's look at the initial triangular part t1, where each new diagonal is one unit longer than the preceding.
Now let's call d the index of the diagonal that holds the cell, and S(d) the total number of cells on diagonals 0 .. d-1. Since diagonal i holds i+1 cells:
S(d) = 1 + 2 + ... + d = d(d+1)/2
We have p = S(d) + k, with 0 <= k <= d.
If we solve S(d) = p for d, it brings:
d² + d - 2p = 0, a quadratic equation where we retain only the positive root:
d = (-1 + sqrt(1 + 8*p)) / 2
Now we want the largest integer not exceeding this value, which is floor(d).
In the end, we have:
p = d(d+1)/2 + k with d = floor((-1 + sqrt(1 + 8*p)) / 2) and k = p - d(d+1)/2
Let's call
o(d) the function that equals 1 if d is odd and 0 otherwise, and
e(d) the function that equals 1 if d is even and 0 otherwise.
We can compute x(p) and y(p) like so:
d = floor((-1+sqrt(1+8*p))/2)
k = p - d(d+1)/2
o = d % 2
e = 1 - o
x = e*d + (o-e)*k
y = o*d + (e-o)*k
The even and odd functions are used to try to salvage some clarity, but you can replace e(d) with 1 - o(d) and get slightly more efficient but less symmetric formulas for x and y.
Middle part
Let's consider the smallest matrix dimension s, i.e. s = min(m, n).
The previous formulas hold until x or y (whichever comes first) reaches the value s.
The upper bound on p such that x(i) <= s and y(i) <= s for all i in [0..p]
(i.e. the cell indexed by p is inside the initial triangle t1) is given by
pt1 = s(s+1)/2.
For p >= pt1, the diagonal length remains equal to s until we reach the second triangle t2.
When inside mid, we have:
p = s(s+1)/2 + d*s + k with k in [0..s[
which yields:
d = floor((p - s(s+1)/2) / s)
k = (p - s(s+1)/2) - d*s
We can then use the same even/odd trick to compute x(p) and y(p):
p -= s(s+1)/2
d = floor (p / s)
k = p - d*s
o = (d+s) % 2
e = 1 - o
x = o*s + (e-o)*k
y = e*s + (o-e)*k
if (n > m)
x += d+e
y -= e
else
y += d+o
x -= o
Final triangle
Using symmetry, we can calculate pt2 = m*n - s(s+1)/2.
We now face nearly the same problem as for t1, except that the diagonal may run in the same direction as for t1 or in the reverse direction (if n+m is odd).
Using symmetry tricks, we can compute x(p) and y(p) like so:
p = n*m - 1 - p
d = floor((-1 + sqrt(1 + 8*p)) / 2)
k = p - d*(d+1)/2
o = (d+m+n) % 2
e = 1 - o
x = n-1 - (o*d + (e-o)*k)
y = m-1 - (e*d + (o-e)*k)
Putting it all together

Here is a sample C++ implementation.
I used 64-bit integers out of sheer laziness. Most could be replaced by 32-bit values.
The computations could be made more efficient by precomputing a few more coefficients.
A good part of the code could be factorized, but I doubt it is worth the effort.
Since this is just a quick and dirty proof of concept, I did not optimize it.
#include <cstdio> // printf
#include <algorithm> // min
using namespace std;
typedef long long tCoord;
void panic(const char * msg)
{
printf("PANIC: %s\n", msg);
exit(-1);
}
struct tPoint {
tCoord x, y;
tPoint(tCoord x = 0, tCoord y = 0) : x(x), y(y) {}
tPoint operator+(const tPoint & p) const { return{ x + p.x, y + p.y }; }
bool operator!=(const tPoint & p) const { return x != p.x || y != p.y; }
};
class tMatrix {
tCoord n, m; // dimensions
tCoord s; // smallest dimension
tCoord pt1, pt2; // t1 / mid / t2 limits for p
public:
tMatrix(tCoord n, tCoord m) : n(n), m(m)
{
s = min(n, m);
pt1 = (s*(s + 1)) / 2;
pt2 = n*m - pt1;
}
tPoint diagonal_cell(tCoord p)
{
tCoord x, y;
if (p < pt1) // inside t1
{
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = d % 2;
tCoord e = 1 - o;
x = o*d + (e - o)*k;
y = e*d + (o - e)*k;
}
else if (p < pt2) // inside mid
{
p -= pt1;
tCoord d = (tCoord)floor(p / s);
tCoord k = p - d*s;
tCoord o = (d + s) % 2;
tCoord e = 1 - o;
x = o*s + (e - o)*k;
y = e*s + (o - e)*k;
if (m > n) // vertical matrix
{
x -= o;
y += d + o;
}
else // horizontal matrix
{
x += d + e;
y -= e;
}
}
else // inside t2
{
p = n * m - 1 - p;
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = (d + m + n) % 2;
tCoord e = 1 - o;
x = n - 1 - (o*d + (e - o)*k);
y = m - 1 - (e*d + (o - e)*k);
}
return{ x, y };
}
void check(void)
{
tPoint move[4] = { { 1, 0 }, { -1, 1 }, { 1, -1 }, { 0, 1 } };
tPoint pos;
tCoord dir = 0;
for (tCoord p = 0; p != n * m ; p++)
{
tPoint dc = diagonal_cell(p);
if (pos != dc) panic("zot!");
pos = pos + move[dir];
if (dir == 0)
{
if (pos.y == m - 1) dir = 2;
else dir = 1;
}
else if (dir == 3)
{
if (pos.x == n - 1) dir = 1;
else dir = 2;
}
else if (dir == 1)
{
if (pos.y == m - 1) dir = 0;
else if (pos.x == 0) dir = 3;
}
else
{
if (pos.x == n - 1) dir = 3;
else if (pos.y == 0) dir = 0;
}
}
}
};
void main(void)
{
const tPoint dim[] = { { 10, 10 }, { 11, 11 }, { 10, 30 }, { 30, 10 }, { 10, 31 }, { 31, 10 }, { 11, 31 }, { 31, 11 } };
for (tPoint d : dim)
{
printf("Checking a %lldx%lld matrix...", d.x, d.y);
tMatrix(d.x, d.y).check();
printf("done\n");
}
tCoord p = 10000000000;
tMatrix matrix(1000000, 1000000);
tPoint cell = matrix.diagonal_cell(p);
printf("Coordinates of %lldth cell: (%lld,%lld)\n", p, cell.x, cell.y);
}
Results are checked against a "manual" sweep of the matrix.
This "manual" sweep is an ugly hack that won't work for a one-row or one-column matrix, though diagonal_cell() does work on any matrix (the "diagonal" sweep becomes linear in that case).
The coordinates found for the 10.000.000.000th cell of a 1.000.000x1.000.000 matrix seem consistent, since the diagonal d on which the cell stands is about sqrt(2*1e10), approx. 141421, and the sum of the cell coordinates is about equal to d (121090 + 20330 = 141420). Besides, it is also what the two other posters report.
I would say there is a good chance this lump of obfuscated code actually produces an O(1) solution to your problem.

Improving the Efficiency Of This Code With Tracking Variable?

I have written the below code outline, basically to sum an array (a) where each element is multiplied by a value x^i:
y = a(0)
i = 0
{y = sum from i=0 to (n-1) a(i) * x^i AND 0 <= n <= a.length}   // Invariant
while (i < (n-1))
    {y = sum from i=0 to (n-1) a(i) * x^i AND 0 <= n <= a.length AND i < (n-1)}
    y = y + a(i)*x^i
    i = i + 1
end while
{y = sum from i=0 to (n-1) a(i) * x^i}   // Postcondition
Note that I do not expect the code to compile - it's just a sensible outline of how the code should work. I need to improve the efficiency of the code by using a tracking variable, and thus a linking invariant to bridge said variable with the rest of the code. This is where I am stuck. What would be useful to track in this case? I have thought about retaining sum values at each iteration, but I'm not sure if that does the trick. If I could figure out what to track, I'm pretty sure it would be trivial to link it to the rest of the code. Can anyone see how my algorithm might be improved via a tracking variable?
Your invariant logic has off-by-1 problems. Here is a corrected version that tracks partial power operations.
// Precondition: 1 <= n <= a.length
// Invariant:
//   { 0 <= i < n AND xi = x^i AND y = sum(j = 0..i) . a(j) * x^j }

// Establish the invariant at i = 0:
//   xi = x^0 = 1 AND y = sum(j=0..0) . a(j) * x^j = a(0) * x^0 = a(0)
i = 0;
xi = 1;
y = a(0);

while (i < n - 1) {
    i = i + 1;       // Break the invariant
    xi = xi * x;     // Re-establish it
    y = y + a(i) * xi;
}

// Invariant was last established at i = n-1, so we have the postcondition:
//   { y = sum(j = 0..n-1) . a(j) * x^j }
The more common and numerically stable way to calculate polynomials is with Horner's Rule
y = 0
for i = n-1 downto 0 do y = y * x + a(i)
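For illustration, here are both variants side by side as a minimal Python sketch (my own addition, not part of the answer):

def poly_tracking(a, x):
    # tracking variable xi == x**i, so no power is recomputed from scratch
    y, xi = 0, 1
    for coeff in a:
        y += coeff * xi
        xi *= x
    return y

def poly_horner(a, x):
    # Horner's rule: one multiply and one add per coefficient
    y = 0
    for coeff in reversed(a):
        y = y * x + coeff
    return y

a = [3, 0, 2, 5]   # 3 + 2x^2 + 5x^3
print(poly_tracking(a, 2), poly_horner(a, 2))   # 51 51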
So it seems like you're trying to end up with this:
(a(0)*x^0) + (a(1)*x^1) + ... + (a(n-1)*x^(n-1))
Is that right?
The only way I can see to improve performance would be if the ^ operation is more costly than the * operation. In that case, you could keep track of the running power x^i as you go, multiplying it by x in each iteration.
In fact, in that case you could probably start at the end of the array and work your way backwards, multiplying by x each time, to produce:
(((...((a(n-1)*x+a(n-2))*x+...)+a(2))*x+a(1))*x)+a(0)
That would theoretically be slightly faster than recalculating x^i each time, but it's not going to be algorithmically faster. It probably wouldn't be an order of magnitude faster.

How to find the number of values in a given range divisible by a given value?

I have three numbers x, y, z.
For the range between the numbers x and y, how can I find the total count of numbers whose remainder modulo z is 0, i.e. how many numbers between x and y are divisible by z?
It can be done in O(1): find the first one, find the last one, then compute the count of everything in between.
I'm assuming the range is inclusive. If your ranges are exclusive, adjust the bounds by one:
Find the first value at or after x that is divisible by z (you can then discard the original x):
x_mod = x % z;
if (x_mod != 0)
    x += (z - x_mod);
Find the last value at or before y that is divisible by z (you can then discard the original y):
y -= y % z;
Find the size of this range:
if (x > y)
    return 0;
else
    return (y - x) / z + 1;
If mathematical floor and ceil functions are available (here meaning rounding down or up to the nearest multiple of z), the first two parts can be written more readably, and the last part can be compressed using math functions:
x = ceil(x, z);
y = floor(y, z);
return max((y - x) / z + 1, 0);
If the input is guaranteed to be a valid range (x <= y), the last test or max is unnecessary:
x = ceil(x, z);
y = floor(y, z);
return (y - x) / z + 1;
(2017, answer rewritten thanks to comments)
The number of multiples of z between 1 and n is simply n / z,
/ being integer division, meaning decimals that could result from the division are simply ignored (for instance 17/5 => 3 and not 3.4).
Now, in a range from x to y, how many multiples of z are there?
Let see how many multiples m we have up to y
0----------------------------------x------------------------y
-m---m---m---m---m---m---m---m---m---m---m---m---m---m---m---
You see where I'm going... to get the number of multiples in the range [x, y], get the number of multiples up to y, then subtract the number of multiples before x, which is (x-1) / z.
Solution: ( y / z ) - (( x - 1 ) / z )
Programmatically, you could make a function numberOfMultiples
function numberOfMultiples(n, z) {
    return Math.floor(n / z);
}
to get the number of multiples in a range [x, y]:
numberOfMultiples(y, z) - numberOfMultiples(x - 1, z)
The function is O(1), there is no need of a loop to get the number of multiples.
Examples of results you should find
[30, 90] ÷ 13 => 4
[1, 1000] ÷ 6 => 166
[100, 1000000] ÷ 7 => 142843
[777, 777777777] ÷ 7 => 111111001
For the first example, 90 / 13 = 6, (30-1) / 13 = 2, and 6-2 = 4
---26---39---52---65---78---91--
      ^                    ^
      30<--(4 multiples)-->90
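The same formula as a runnable Python sketch (my own addition), checked against the examples above; it assumes 1 <= x <= y:

def count_multiples(x, y, z):
    # multiples of z in the inclusive range [x, y]
    return y // z - (x - 1) // z

assert count_multiples(30, 90, 13) == 4
assert count_multiples(1, 1000, 6) == 166
assert count_multiples(100, 1000000, 7) == 142843
assert count_multiples(777, 777777777, 7) == 111111001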
I also encountered this on Codility. It took me much longer than I'd like to admit to come up with a good solution, so I figured I would share what I think is an elegant solution!
Straightforward Approach 1/2:
O(N) time solution with a loop and counter, unrealistic when N = 2 billion.
Awesome Approach 3:
We want the number of integers in some range that are divisible by K.
Simple case: assume the range is [0 .. N], with N = n*K.
N/K represents the number of integers in [0, N) that are divisible by K, given N%K = 0 (i.e. N is divisible by K)
ex. N = 9, K = 3, count = |{0, 3, 6}| = 3 = 9/3
Similarly,
N/K + 1 represents the number of integers in [0, N] divisible by K
ex. N = 9, K = 3, count = |{0, 3, 6, 9}| = 4 = 9/3 + 1
I think really understanding the above fact is the trickiest part of this question; I cannot explain exactly why it works.
The rest boils down to prefix sums and handling special cases.
Now we don't always have a range that begins with 0, and we cannot assume the two bounds will be divisible by K.
But wait! We can fix this by calculating our own nice upper and lower bounds and using some subtraction magic :)
First find the closest upper and lower bounds in the range [A, B] that are divisible by K.
Upper bound (easier): ex. B = 10, K = 3, new_B = 9... the pattern is B - B%K
Lower bound: ex. A = 10, K = 3, new_A = 12... try a few more and you will see the pattern is A - A%K + K (when A is not already divisible by K)
Then calculate the following using the above technique:
Determine the total number X of integers in [0, B] that are divisible by K
Determine the total number Y of integers in [0, A) that are divisible by K
Calculate the number of integers in [A, B] that are divisible by K in constant time with the expression X - Y
Website: https://codility.com/demo/take-sample-test/count_div/
class CountDiv {
    public int solution(int A, int B, int K) {
        int firstDivisible = A%K == 0 ? A : A + (K - A%K);
        int lastDivisible = B%K == 0 ? B : B - B%K; // B/K behaves this way by default.
        return (lastDivisible - firstDivisible)/K + 1;
    }
}
This is my first time explaining an approach like this. Feedback is very much appreciated :)
This is one of the Codility Lesson 3 questions. For this question, the input is guaranteed to be in a valid range. I answered it using Javascript:
function solution(x, y, z) {
    var totalDivisibles = Math.floor(y / z),
        excludeDivisibles = Math.floor((x - 1) / z),
        divisiblesInArray = totalDivisibles - excludeDivisibles;
    return divisiblesInArray;
}
https://codility.com/demo/results/demoQX3MJC-8AP/
Divide y-x by z, rounding down. Add one if y%z < x%z or if x%z == 0.
No mathematical proof, unless someone cares to provide one, but test cases, in Perl:
#!perl
use strict;
use warnings;
use Test::More;

sub multiples_in_range {
    my ($x, $y, $z) = @_;
    return 0 if $x > $y;
    my $ret = int( ($y - $x) / $z );
    $ret++ if $y % $z < $x % $z or $x % $z == 0;
    return $ret;
}

for my $z (2 .. 10) {
    for my $x (0 .. 2*$z) {
        for my $y (0 .. 4*$z) {
            is multiples_in_range($x, $y, $z),
               scalar(grep { $_ % $z == 0 } $x..$y),
               "[$x..$y] mod $z";
        }
    }
}
done_testing;
Output:
$ prove divrange.pl
divrange.pl .. ok
All tests successful.
Files=1, Tests=3405, 0 wallclock secs ( 0.20 usr 0.02 sys + 0.26 cusr 0.01 csys = 0.49 CPU)
Result: PASS
Let [A;B] be an interval of positive integers including A and B such that 0 <= A <= B, and let K be the divisor.
It is easy to see that there are N(A) = ⌊A / K⌋ = floor(A / K) multiples of K in the interval [0;A]:

         1K       2K       3K       4K       5K
●········x········x··●·····x········x········x···>
0                    A

Similarly, there are N(B) = ⌊B / K⌋ = floor(B / K) multiples of K in the interval [0;B]:

         1K       2K       3K       4K       5K
●········x········x········x········x···●····x···>
0                                        B
Then N = N(B) - N(A) equals the number of multiples of K (the number of integers divisible by K) in the range (A;B]. The point A is not included, because the subtracted N(A) includes this point. Therefore, the result should be incremented by one if A mod K is zero:
N := N(B) - N(A)
if (A mod K = 0)
    N := N + 1
Implementation in PHP
function solution($A, $B, $K) {
    if ($K < 1)
        return 0;
    $c = floor($B / $K) - floor($A / $K);
    if ($A % $K == 0)
        $c++;
    return (int)$c;
}
In PHP, the effect of the floor function can be achieved by casting to the integer type:
$c = (int)($B / $K) - (int)($A / $K);
which, I think, is faster.
Here is my short and simple solution in C++ which got 100/100 on codility. :)
Runs in O(1) time. I hope it's not difficult to understand.
int solution(int A, int B, int K) {
    // count multiples of K in (A, B], then add A itself back if it is divisible
    int cnt = B / K - A / K;
    if (A % K == 0)
        cnt++;
    return cnt;
}
(floor)(high/d) - (floor)(low/d) - (high%d==0)
Explanation:
There are floor(a/d) numbers in (0, a] divisible by d (for d != 0).
Therefore (floor)(high/d) - (floor)(low/d) gives the count of numbers divisible in the range (low, high] (note that low is excluded and high is included in this range).
Now to remove high from the range, just subtract (high%d==0).
Works for integers, floats or whatever (use the fmodf function for floats).
I won't strive for an O(1) solution; I'll leave that for a more clever person :) This just feels like a perfect usage scenario for functional programming. Simple and straightforward:
> x,y,z=1,1000,6
=> [1, 1000, 6]
> (x..y).select {|n| n%z==0}.size
=> 166
EDIT: after reading the others' O(1) solutions, I feel ashamed. Programming has made people lazy to think...
Division (a/b = c) is, by definition, taking a set of size a and forming groups of size b; the number of groups of this size that can be formed, c, is the quotient of a and b. This is nothing more than the number of integers within the range/interval ]0..a] (not including zero, but including a) that are divisible by b.
So by definition:
Y/Z - the number of integers within ]0..Y] that are divisible by Z
and
X/Z - the number of integers within ]0..X] that are divisible by Z
thus:
result = [Y/Z] - [X/Z] + x (where x = 1 if and only if X is divisible by Z, otherwise 0 - assuming the given range [X..Y] includes X)
example :
for (6, 12, 2) we have 12/2 - 6/2 + 1 (as 6%2 == 0) = 6 - 3 + 1 = 4 // {6, 8, 10, 12}
for (5, 12, 2) we have 12/2 - 5/2 + 0 (as 5%2 != 0) = 6 - 2 + 0 = 4 // {6, 8, 10, 12}
The time complexity of this solution is constant.
Code Snippet :
int countDiv(int a, int b, int m)
{
    int mod = (min(a, b) % m == 0);
    int cnt = abs(floor(b / m) - floor(a / m)) + mod;
    return cnt;
}
Here n will give you the count of such numbers, and this will print the sum of all numbers between a and b that are divisible by k:
int a = sc.nextInt();
int b = sc.nextInt();
int k = sc.nextInt();
int first = (a % k == 0) ? a : a + (k - a % k);   // first multiple of k at or after a
int last = b - b % k;                             // last multiple of k at or before b
if (first > last) {
    System.out.println(0);
} else {
    int n = (last - first) / k + 1;
    System.out.println(n * (first + last) / 2);   // sum of an arithmetic series
}
Here is the solution to the problem written in Swift Programming Language.
Step 1: Find the first number in the range divisible by z.
Step 2: Find the last number in the range divisible by z.
Step 3: Use a mathematical formula to find the number of divisible numbers by z in the range.
func solution(_ x : Int, _ y : Int, _ z : Int) -> Int {
    var numberOfDivisible = 0
    var firstNumber: Int
    var lastNumber: Int

    if y == x {
        return x % z == 0 ? 1 : 0
    }

    // Find the first number divisible by z
    let moduloX = x % z
    if moduloX == 0 {
        firstNumber = x
    } else {
        firstNumber = x + (z - moduloX)
    }

    // Find the last number divisible by z
    let moduloY = y % z
    if moduloY == 0 {
        lastNumber = y
    } else {
        lastNumber = y - moduloY
    }

    // Math formula
    numberOfDivisible = (lastNumber - firstNumber) / z + 1
    return numberOfDivisible
}
public static int Solution(int A, int B, int K)
{
    int count = 0;

    // If A is divisible by K
    if (A % K == 0)
    {
        count = (B / K) - (A / K) + 1;
    }
    // If A is not divisible by K
    else
    {
        count = (B / K) - (A / K);
    }

    return count;
}
This can be done in O(1).
Here is a solution in C++:
auto first{ x % z == 0 ? x : x + z - x % z };
auto last{ y % z == 0 ? y : y - y % z };
auto ans{ (last - first) / z + 1 };
Here first is the first number in [x; y] that is divisible by z, last is the last number in [x; y] that is divisible by z, and ans is the answer you are looking for.

Avoiding Brute Force: Counting Solutions

In a programming contest, a problem was:
Count all solutions to the equation: x + 4y + 4z = n. You will be
given n and you will determine the count of solutions. Assume x, y and z are positive integers.
I considered using triple for loops (brute force), but it was inefficient, causing TIME LIMIT EXCEEDED (since n may be up to 1,000,000):
int sol = 0;
for (int i = 1; i <= n; i++)
{
    for (int j = 1; j <= n / 4; j++)
    {
        for (int k = 1; k <= n / 4; k++)
        {
            if (i + 4 * j + 4 * k == n)
                sol++;
        }
    }
}
My friend was able to solve the problem. When I asked him, he said that he didn't use brute force at all. Instead, he converted the equation to a 'series' (i.e. a summation). I asked him to tell me how, but he refused :)
Can I know how?
This is a particular case of the coin change problem, which is solved in general by dynamic programming.
But here we can derive a simple closed form. I consider x, y, z > 0:
x + 4*(y+z) = n
Let y + z = q = p + 1 (q > 1, p > 0):
x + 4*q = n
x + 4*p = n - 4
There are M = floor((n-5)/4) variants for the pair (x, p), hence there are M possible values of
q = 2..M+1
For every q > 1 there are (q-1) variants of y and z: (y, z) = (1, q-1), (2, q-2), ..., (q-1, 1)
So we have N = 1 + 2 + 3 + ... + M = M * (M + 1)/2 solutions
Example:
n = 15;
M = (15 - 5) div 4 = 2
N = 3
(3,1,2),(3,2,1),(7,1,1)
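As a quick check of this closed form (my own sketch, not part of the answer), it can be compared against a direct brute-force count:

def count_solutions_formula(n):
    M = max((n - 5) // 4, 0)      # number of valid q values, as derived above
    return M * (M + 1) // 2

def count_solutions_brute(n):
    # x, y, z >= 1 with x + 4y + 4z == n
    return sum(1
               for y in range(1, n // 4 + 1)
               for z in range(1, n // 4 + 1)
               if n - 4 * y - 4 * z >= 1)

print(all(count_solutions_formula(n) == count_solutions_brute(n)
          for n in range(1, 200)))    # True
print(count_solutions_formula(15))    # 3, matching the example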
First note that n - x must be divisible by 4. Start by finding the smallest value that x can take:
start = 1
while ((n - start) % 4 != 0)
{
    start = start + 1
}
From now on, you know that x will take values from [start, start+4, start+8, ...]. Now you can count the number of solutions with a simple counting loop:
count = 0
for (x = start; x < n - 4; x = x + 4)
{
    y_z_sum = (n - x) / 4
    count = count + y_z_sum - 1
}
For each choice of x, we can compute the value of y+z. For each value for y+z, there are y+z-1 possible choices (since y ranges from 1 to y+z-1, assuming that y and z are both positive integers).
Instead of a brute force solution with O(n^3) running time, you can achieve O(n) this way.
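A minimal Python rendering of this O(n) counting loop (my own sketch, not part of the answer):

def count_solutions_linear(n):
    # smallest positive x with (n - x) divisible by 4
    start = 1
    while (n - start) % 4 != 0:
        start += 1
    count = 0
    for x in range(start, n - 4, 4):
        y_z_sum = (n - x) // 4
        count += y_z_sum - 1      # y ranges over 1 .. y_z_sum - 1
    return count

print(count_solutions_linear(15))   # 3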
This is a classic linear algebra problem. Please refer to any linear algebra textbook on how to solve a system of linear equations. One such method is called Gaussian Elimination.
