Reconstructing input to encoder from output - algorithm

I would like to understand how to solve the Codility ArrayRecovery challenge, but I don't even know what branch of knowledge to consult. Is it combinatorics, optimization, computer science, set theory, or something else?
Edit:
The branch of knowledge to consult is constraint programming, particularly constraint propagation. You also need some combinatorics to know that if you take k numbers at a time from the range [1..n], with the restriction that no number can be bigger than the one before it, that works out to be
(n+k-1)!/k!(n-1)! possible combinations
which is the same as the number of combinations with replacements of n things taken k at a time, which has the mathematical notation . You can read about why it works out like that here.
Peter Norvig provides an excellent example of how to solve this kind of problem with his Sudoku solver.
You can read the full description of the ArrayRecovery problem via the link above. The short story is that there is an encoder that takes a sequence of integers in the range 1 up to some given limit (say 100 for our purposes) and for each element of the input sequence outputs the most recently seen integer that is smaller than the current input, or 0 if none exists.
input 1, 2, 3, 4 => output 0, 1, 2, 3
input 2, 4, 3 => output 0, 2, 2
The full task is, given the output (and the range of allowable input), figure out how many possible inputs could have generated it. But before I even get to that calculation, I'm not confident about how to even approach formulating the equation. That is what I am asking for help with. (Of course a full solution would be welcome, too, if it is explained.)
I just look at some possible outputs and wonder. Here are some sample encoder outputs and the inputs I can come up with, with * meaning any valid input and something like > 4 meaning any valid input greater than 4. If needed, inputs are referred to as A1, A2, A3, ... (1-based indexing)
Edit #2
Part of the problem I was having with this challenge is that I did not manually generate the exactly correct sets of possible inputs for an output. I believe the set below is correct now. Look at this answer's edit history if you want to see my earlier mistakes.
output #1: 0, 0, 0, 4
possible inputs: [>= 4, A1 >= * >= 4, 4, > 4]
output #2: 0, 0, 0, 2, 3, 4 # A5 ↴ See more in discussion below
possible inputs: [>= 2, A1 >= * >=2, 2, 3, 4, > 4]
output #3: 0, 0, 0, 4, 3, 1
possible inputs: none # [4, 3, 1, 1 >= * > 4, 4, > 1] but there is no number 1 >= * > 4
The second input sequence is very tightly constrained compared to the first just by adding 2 more outputs. The third sequence is so constrained as to be impossible.
But the set of constraints on A5 in example #2 is a bit harder to articulate. Of course A5 > O5, that is the basic constraint on all the inputs. But any output > A4 and after O5 has to appear in the input after A4, so A5 has to be an element of the set of numbers that comes after A5 that is also > A4. Since there is only 1 such number (A6 == 4), A5 has to be it, but it gets more complicated if there is a longer string of numbers that follow. (Editor's note: actually it doesn't.)
As the output set gets longer, I worry these constraints just get more complicated and harder to get right. I cannot think of any data structures for efficiently representing these in a way that leads to efficiently calculating the number of possible combinations. I also don't quite see how to algorithmically add constraint sets together.
Here are the constraints I see so far for any given An
An > On
An <= min(Set of other possible numbers from O1 to n-1 > On). How to define the set of possible numbers greater than On?
Numbers greater than On that came after the most recent occurrence of On in the input
An >= max(Set of other possible numbers from O1 to n-1 < On). How to define the set of possible numbers less than On?
Actually this set is empty because On is, by definition, the largest possible number from the previous input sequence. (Which it not to say it is strictly the largest number from the previous input sequence.)
Any number smaller than On that came before the last occurrence of it in the input would be ineligible because of the "nearest" rule. No numbers smaller that On could have occurred after the most recent occurrence because of the "nearest" rule and because of the transitive property: if Ai < On and Aj < Ai then Aj < On
Then there is the set theory:
An must be an element of the set of unaccounted-for elements of the set of On+1 to Om, where m is the smallest m > n such that Om < On. Any output after such Om and larger than Om (which An is) would have to appear as or after Am.
An element is unaccounted-for if it is seen in the output but does not appear in the input in a position that is consistent with the rest of the output. Obviously I need a better definition than this in order to code and algorithm to calculate it.
It seems like perhaps some kind of set theory and/or combinatorics or maybe linear algebra would help with figuring out the number of possible sequences that would account for all of the unaccounted-for outputs and fit the other constraints. (Editor's note: actually, things never get that complicated.)

The code below passes all of Codility's tests. The OP added a main function to use it on the command line.
The constraints are not as complex as the OP thinks. In particular, there is never a situation where you need to add a restriction that an input be an element of some set of specific integers seen elsewhere in the output. Every input position has a well-defined minimum and maximum.
The only complication to that rule is that sometimes the maximum is "the value of the previous input" and that input itself has a range. But even then, all the values like that are consecutive and have the same range, so the number of possibilities can be calculated with basic combinatorics, and those inputs as a group are independent of the other inputs (which only serve to set the range), so the possibilities of that group can be combined with the possibilities of other input positions by simple multiplication.
Algorithm overview
The algorithm makes a single pass through the output array updating the possible numbers of input arrays after every span, which is what I am calling repetitions of numbers in the output. (You might say maximal subsequences of the output where every element is identical.) For example, for output 0,1,1,2 we have three spans: 0, 1,1 and 2. When a new span begins, the number of possibilities for the previous span is calculated.
This decision was based on a few observations:
For spans longer than 1 in length, the minimum value of the input
allowed in the first position is whatever the value is of the input
in the second position. Calculating the number of possibilities of a
span is straightforward combinatorics, but the standard formula
requires knowing the range of the numbers and the length of the span.
Every time the value of the
output changes (and a new span beings), that strongly constrains the value of the previous span:
When the output goes up, the only possible reason is that the previous input was the value of the new, higher output and the input corresponding to the position of the new, higher output, was even higher.
When an output goes down, new constraints are established, but those are a bit harder to articulate. The algorithm stores stairs (see below) in order to quantify the constraints imposed when the output goes down
The aim here was to confine the range of possible values for every span. Once we do that accurately, calculating the number of combinations is straightforward.
Because the encoder backtracks looking to output a number that relates to the input in 2 ways, both smaller and closer, we know we can throw out numbers that are larger and farther away. After a small number appears in the output, no larger number from before that position can have any influence on what follows.
So to confine these ranges of input when the output sequence decreased, we need to store stairs - a list of increasingly larger possible values for the position in the original array. E.g for 0,2,5,7,2,4 stairs build up like this: 0, 0,2, 0,2,5, 0,2,5,7, 0,2, 0,2,4.
Using these bounds we can tell for sure that the number in the position of the second 2 (next to last position in the example) must be in (2,5], because 5 is the next stair. If the input were greater than 5, a 5 would have been output in that space instead of a 2. Observe, that if the last number in the encoded array was not 4, but 6, we would exit early returning 0, because we know that the previous number couldn't be bigger than 5.
The complexity is O(n*lg(min(n,m))).
Functions
CombinationsWithReplacement - counts number of combinations with replacements of size k from n numbers. E.g. for (3, 2) it counts 3,3, 3,2, 3,1, 2,2, 2,1, 1,1, so returns 6 It is the same as choose(n - 1 + k, n - 1).
nextBigger - finds next bigger element in a range. E.g. for 4 in sub-array 1,2,3,4,5 it returns 5, and in sub-array 1,3 it returns its parameter Max.
countSpan (lambda) - counts how many different combinations a span we have just passed can have. Consider span 2,2 for 0,2,5,7,2,2,7.
When curr gets to the final position, curr is 7 and prev is the final 2 of the 2,2 span.
It computes maximum and minimum possible values of the prev span. At this point stairs consist of 2,5,7 then maximum possible value is 5 (nextBigger after 2 in the stair 2,5,7). A value of greater than 5 in this span would have output a 5, not a 2.
It computes a minimum value for the span (which is the minimum value for every element in the span), which is prev at this point, (remember curr at this moment equals to 7 and prev to 2). We know for sure that in place of the final 2 output, the original input has to have 7, so the minimum is 7. (This is a consequence of the "output goes up" rule. If we had 7,7,2 and curr would be 2 then the minimum for the previous span (the 7,7) would be 8 which is prev + 1.
It adjusts the number of combinations. For a span of length L with a range of n possibilities (1+max-min), there are possibilities, with k being either L or L-1 depending on what follows the span.
For a span followed by a larger number, like 2,2,7, k = L - 1 because the last position of the 2,2 span has to be 7 (the value of the first number after the span).
For a span followed by a smaller number, like 7,7,2, k = L because
the last element of 7,7 has no special constraints.
Finally, it calls CombinationsWithReplacement to find out the number of branches (or possibilities), computes new res partial results value (remainder values in the modulo arithmetic we are doing), and returns new res value and max for further handling.
solution - iterates over the given Encoder Output array. In the main loop, while in a span it counts the span length, and at span boundaries it updates res by calling countSpan and possibly updates the stairs.
If the current span consists of a bigger number than the previous one, then:
Check validity of the next number. E.g 0,2,5,2,7 is invalid input, becuase there is can't be 7 in the next-to-last position, only 3, or 4, or 5.
It updates the stairs. When we have seen only 0,2, the stairs are 0,2, but after the next 5, the stairs become 0,2,5.
If the current span consists of a smaller number then the previous one, then:
It updates stairs. When we have seen only 0,2,5, our stairs are 0,2,5, but after we have seen 0,2,5,2 the stairs become 0,2.
After the main loop it accounts for the last span by calling countSpan with -1 which triggers the "output goes down" branch of calculations.
normalizeMod, extendedEuclidInternal, extendedEuclid, invMod - these auxiliary functions help to deal with modulo arithmetic.
For stairs I use storage for the encoded array, as the number of stairs never exceeds current position.
#include <algorithm>
#include <cassert>
#include <vector>
#include <tuple>
const int Modulus = 1'000'000'007;
int CombinationsWithReplacement(int n, int k);
template <class It>
auto nextBigger(It begin, It end, int value, int Max) {
auto maxIt = std::upper_bound(begin, end, value);
auto max = Max;
if (maxIt != end) {
max = *maxIt;
}
return max;
}
auto solution(std::vector<int> &B, const int Max) {
auto res = 1;
const auto size = (int)B.size();
auto spanLength = 1;
auto prev = 0;
// Stairs is the list of numbers which could be smaller than number in the next position
const auto stairsBegin = B.begin();
// This includes first entry (zero) into stairs
// We need to include 0 because we can meet another zero later in encoded array
// and we need to be able to find in stairs
auto stairsEnd = stairsBegin + 1;
auto countSpan = [&](int curr) {
const auto max = nextBigger(stairsBegin, stairsEnd, prev, Max);
// At the moment when we switch from the current span to the next span
// prev is the number from previous span and curr from current.
// E.g. 1,1,7, when we move to the third position cur = 7 and prev = 1.
// Observe that, in this case minimum value possible in place of any of 1's can be at least 2=1+1=prev+1.
// But if we consider 7, then we have even more stringent condition for numbers in place of 1, it is 7
const auto min = std::max(prev + 1, curr);
const bool countLast = prev > curr;
const auto branchesCount = CombinationsWithReplacement(max - min + 1, spanLength - (countLast ? 0 : 1));
return std::make_pair(res * (long long)branchesCount % Modulus, max);
};
for (int i = 1; i < size; ++i) {
const auto curr = B[i];
if (curr == prev) {
++spanLength;
}
else {
int max;
std::tie(res, max) = countSpan(curr);
if (prev < curr) {
if (curr > max) {
// 0,1,5,1,7 - invalid because number in the fourth position lies in [2,5]
// and so in the fifth encoded position we can't something bigger than 5
return 0;
}
// It is time to possibly shrink stairs.
// E.g if we had stairs 0,2,4,9,17 and current value is 5,
// then we no more interested in 9 and 17, and we change stairs to 0,2,4,5.
// That's because any number bigger than 9 or 17 also bigger than 5.
const auto s = std::lower_bound(stairsBegin, stairsEnd, curr);
stairsEnd = s;
*stairsEnd++ = curr;
}
else {
assert(curr < prev);
auto it = std::lower_bound(stairsBegin, stairsEnd, curr);
if (it == stairsEnd || *it != curr) {
// 0,5,1 is invalid sequence because original sequence lloks like this 5,>5,>1
// and there is no 1 in any of the two first positions, so
// it can't appear in the third position of the encoded array
return 0;
}
}
spanLength = 1;
}
prev = curr;
}
res = countSpan(-1).first;
return res;
}
template <class T> T normalizeMod(T a, T m) {
if (a < 0) return a + m;
return a;
}
template <class T> std::pair<T, std::pair<T, T>> extendedEuclidInternal(T a, T b) {
T old_x = 1;
T old_y = 0;
T x = 0;
T y = 1;
while (true) {
T q = a / b;
T t = a - b * q;
if (t == 0) {
break;
}
a = b;
b = t;
t = x; x = old_x - x * q; old_x = t;
t = y; y = old_y - y * q; old_y = t;
}
return std::make_pair(b, std::make_pair(x, y));
}
// Returns gcd and Bezout's coefficients
template <class T> std::pair<T, std::pair<T, T>> extendedEuclid(T a, T b) {
if (a > b) {
if (b == 0) return std::make_pair(a, std::make_pair(1, 0));
return extendedEuclidInternal(a, b);
}
else {
if (a == 0) return std::make_pair(b, std::make_pair(0, 1));
auto p = extendedEuclidInternal(b, a);
std::swap(p.second.first, p.second.second);
return p;
}
}
template <class T> T invMod(T a, T m) {
auto p = extendedEuclid(a, m);
assert(p.first == 1);
return normalizeMod(p.second.first, m);
}
int CombinationsWithReplacement(int n, int k) {
int res = 1;
for (long long i = n; i < n + k; ++i) {
res = res * i % Modulus;
}
int denom = 1;
for (long long i = k; i > 0; --i) {
denom = denom * i % Modulus;
}
res = res * (long long)invMod(denom, Modulus) % Modulus;
return res;
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// Only the above is needed for the Codility challenge. Below is to run on the command line.
//
// Compile with: gcc -std=gnu++14 -lc++ -lstdc++ array_recovery.cpp
//
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
#include <string.h>
// Usage: 0 1 2,3, 4 M
// Last arg is M, the max value for an input.
// Remaining args are B (the output of the encoder) separated by commas and/or spaces
// Parentheses and brackets are ignored, so you can use the same input form as Codility's tests: ([1,2,3], M)
int main(int argc, char* argv[]) {
int Max;
std::vector<int> B;
const char* delim = " ,[]()";
if (argc < 2 ) {
printf("Usage: %s M 0 1 2,3, 4... \n", argv[0]);
return 1;
}
for (int i = 1; i < argc; i++) {
char* parse;
parse = strtok(argv[i], delim);
while (parse != NULL)
{
B.push_back(atoi(parse));
parse = strtok (NULL, delim);
}
}
Max = B.back();
B.pop_back();
printf("%d\n", solution(B, Max));
return 0;
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// Only the above is needed for the Codility challenge. Below is to run on the command line.
//
// Compile with: gcc -std=gnu++14 -lc++ -lstdc++ array_recovery.cpp
//
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
#include <string.h>
// Usage: M 0 1 2,3, 4
// first arg is M, the max value for an input.
// remaining args are B (the output of the encoder) separated by commas and/or spaces
int main(int argc, char* argv[]) {
int Max;
std::vector<int> B;
const char* delim = " ,";
if (argc < 3 ) {
printf("Usage: %s M 0 1 2,3, 4... \n", argv[0]);
return 1;
}
Max = atoi(argv[1]);
for (int i = 2; i < argc; i++) {
char* parse;
parse = strtok(argv[i], delim);
while (parse != NULL)
{
B.push_back(atoi(parse));
parse = strtok (NULL, delim);
}
}
printf("%d\n", solution(B, Max));
return 0;
}
Let's see an example:
Max = 5
Array is
0 1 3 0 1 1 3
1
1 2..5
1 3 4..5
1 3 4..5 1
1 3 4..5 1 2..5
1 3 4..5 1 2..5 >=..2 (sorry, for a cumbersome way of writing)
1 3 4..5 1 3..5 >=..3 4..5
Now count:
1 1 2 1 3 2 which amounts to 12 total.

Here's an idea. One known method to construct the output is to use a stack. We pop it while the element is greater or equal, then output the smaller element if it exists, then push the greater element onto the stack. Now what if we attempted to do this backwards from the output?
First we'll demonstrate the stack method using the c∅dility example.
[2, 5, 3, 7, 9, 6]
2: output 0, stack [2]
5: output 2, stack [2,5]
3: pop 5, output, 2, stack [2,3]
7: output 3, stack [2,3,7]
... etc.
Final output: [0, 2, 2, 3, 7, 3]
Now let's try reconstruction! We'll use stack both as the imaginary stack and as the reconstituted input:
(Input: [2, 5, 3, 7, 9, 6])
Output: [0, 2, 2, 3, 7, 3]
* Something >3 that reached 3 in the stack
stack = [3, 3 < *]
* Something >7 that reached 7 in the stack
but both of those would've popped before 3
stack = [3, 7, 7 < x, 3 < * <= x]
* Something >3, 7 qualifies
stack = [3, 7, 7 < x, 3 < * <= x]
* Something >2, 3 qualifies
stack = [2, 3, 7, 7 < x, 3 < * <= x]
* Something >2 and >=3 since 3 reached 2
stack = [2, 2 < *, 3, 7, 7 < x, 3 < * <= x]
Let's attempt your examples:
Example 1:
[0, 0, 0, 2, 3, 4]
* Something >4
stack = [4, 4 < *]
* Something >3, 4 qualifies
stack = [3, 4, 4 < *]
* Something >2, 3 qualifies
stack = [2, 3, 4, 4 < *]
* The rest is non-increasing with lowerbound 2
stack = [y >= x, x >= 2, 2, 3, 4, >4]
Example 2:
[0, 0, 0, 4]
* Something >4
stack [4, 4 < *]
* Non-increasing
stack = [z >= y, y >= 4, 4, 4 < *]
Calculating the number of combinations is achieved by multiplying together the possibilities for all the sections. A section is either a bounded single cell; or a bound, non-increasing subarray of one or more cells. To calculate the latter we use the multi-choose binomial, (n + k - 1) choose (k - 1). Consider that we can express the differences between the cells of a bound, non-increasing sequence of 3 cells as:
(ub - cell_3) + (cell_3 - cell_2) + (cell_2 - cell_1) + (cell_1 - lb) = ub - lb
Then the number of ways to distribute ub - lb into (x + 1) cells is
(n + k - 1) choose (k - 1)
or
(ub - lb + x) choose x
For example, the number of non-increasing sequences between
(3,4) in two cells is (4 - 3 + 2) choose 2 = 3: [3,3] [4,3] [4,4]
And the number of non-increasing sequences between
(3,4) in three cells is (4 - 3 + 3) choose 3 = 4: [3,3,3] [4,3,3] [4,4,3] [4,4,4]
(Explanation attributed to Brian M. Scott.)
Rough JavaScript sketch (the code is unreliable; it's only meant to illustrate the encoding. The encoder lists [lower_bound, upper_bound], or a non-increasing sequence as [non_inc, length, lower_bound, upper_bound]):
function f(A, M){
console.log(JSON.stringify(A), M);
let i = A.length - 1;
let last = A[i];
let s = [[last,last]];
if (A[i-1] == last){
let d = 1;
s.splice(1,0,['non_inc',d++,last,M]);
while (i > 0 && A[i-1] == last){
s.splice(1,0,['non_inc',d++,last,M]);
i--
}
} else {
s.push([last+1,M]);
i--;
}
if (i == 0)
s.splice(0,1);
for (; i>0; i--){
let x = A[i];
if (x < s[0][0])
s = [[x,x]].concat(s);
if (x > s[0][0]){
let [l, _l] = s[0];
let [lb, ub] = s[1];
s[0] = [x+1, M];
s[1] = [lb, x];
s = [[l,_l], [x,x]].concat(s);
}
if (x == s[0][0]){
let [l,_l] = s[0];
let [lb, ub] = s[1];
let d = 1;
s.splice(0,1);
while (i > 0 && A[i-1] == x){
s =
[['non_inc', d++, lb, M]].concat(s);
i--;
}
if (i > 0)
s = [[l,_l]].concat(s);
}
}
// dirty fix
if (s[0][0] == 0)
s.splice(0,1);
return s;
}
var a = [2, 5, 3, 7, 9, 6]
var b = [0, 2, 2, 3, 7, 3]
console.log(JSON.stringify(a));
console.log(JSON.stringify(f(b,10)));
b = [0,0,0,4]
console.log(JSON.stringify(f(b,10)));
b = [0,2,0,0,0,4]
console.log(JSON.stringify(f(b,10)));
b = [0,0,0,2,3,4]
console.log(JSON.stringify(f(b,10)));
b = [0,2,2]
console.log(JSON.stringify(f(b,4)));
b = [0,3,5,6]
console.log(JSON.stringify(f(b,10)));
b = [0,0,3,0]
console.log(JSON.stringify(f(b,10)));

Related

Dynamic Programming: max sum through a a 2 column list, left column skips a row

To preface, my English is not quite perfect so apologies for any mistakes.
The problem goes as follows:
Given a list of two choices in each column, determine the maximum sum possible from choosing only one of the two options in the column. The twist is: if the bottom of the two values is chosen, the next column is entirely skipped.
Example:
5
9 3 5 7 3
5 8 1 4 5
If you were to choose 5 initially from [9, 5], the column [3, 8] would be skipped. Whereas, if 9 was chosen, the next column would NOT be skipped and you could choose from [3, 8] (if 8 was chosen, the next column would be skipped and if 3 was chosen, it would not be, etc).
When you attempt to solve this using DP, the most important aspect of the problem is to define the right states of the DP.
Define D[i][j] = maximum sum until index i, if u choose element j || j in [0,1]
L = [
[9, 3, 5, 7, 3],
[5, 8, 1, 4, 5]
]
def find_max_sum(L):
D = [[0, 0] for _ in range(len(L[0]))]
D[0][0] = max(L[0][0], 0)
D[0][1] = max(L[1][0], 0)
for i in range(1, len(D)):
D[i][0] = L[0][i] + max(D[i-1])
if i > 1:
D[i][1] = max(L[1][i] + max(D[i-2]), max(D[i-1]))
else:
D[i][1] = max(L[1][i], max(D[i-1]))
return max(D[-1])
print(find_max_sum(L))
Let N be the number of columns, then we can store our choices in an array a(2 X N).
And lets define a function f(i,j) which gives the maximum possible sum from the first j+1 columns(0,1...j) where in j-th column we either pick the first option(i=0) or we pick the second option(i=1).
So the answer to this problem will be max(f(i,j)) for each i=[0,1] and j=[0,1,...N-1].
For each column j we need to handle 2 cases:
1- choosing the first option ---> f(0,j)
2- choosing the second option ---> f(1,j)
To calculate the values of column j ( f(0,j) , f(1,j) ) we need to consider these cases for column j-1:
1- we picked the first option from column j-1
2- we did not pick any option from column j-1(means we picked the second option from column j-2)
note: we can not pick the second option from column j-1 because we are picking an option from column j according to our function definition.
Here is the code written in C++
int OO = 1e6;// a big value
//reading the input
int a[2][100];
int N; cin >> N;
for (int i = 0; i < N; i++) cin >> a[0][i];
for (int i = 0; i < N; i++) cin >> a[1][i];
//initialization
int f[2][100];
f[0][0] = a[0][0];
f[1][0] = a[1][0];
int result = max(f[0][0],f[1][0]);
for (int i = 1; i < N; i++) {
int val1 = f[0][i - 1];
int val2 = -OO;
if (i - 2 >= 0)
val2 = f[1][i - 2];
f[0][i] = max(val1, val2) + a[0][i];
f[1][i] = max(val1,val2) + a[1][i];
result = max(result, max(f[0][i], f[1][i]));
}
cout << result << endl;

How many PR numbers exist in a given range?

It is not a homework problem. I am just curious about this problem. And my approach is simple brute-force :-)
My brute-force C++ code:
int main()
{
ll l,r;
cin>>l>>r;
ll f=0;
ll i=l;
while(i<=r)
{
ll j=0;
string s;
ll c=0;
s=to_string(i);
// cout<<s<<" ";
ll x=s.length();
if(x==1)
{
c=0;
}
else
{
j=0;
//whil
while(j<=x-2)
{
string b,g;
b="1";
g="1";
b=s[j];
g=s[j+1];
ll k1,k2;
k1=stoi(b);
k2=stoi(g);
if(__gcd(k1,k2)==1)
{
c=1;
break;
}
j++;
}
}
ll d=0;
j=0;
while(j<=x-1)
{
if( s[j]=='2' || s[j]=='3' || s[j]=='5' || s[j]=='7')
{
string b;
b="1";
b=s[j];
ll k1=stoi(b);
if(i%k1==0)
{
//d=0;
}
else
{
d=1;
break;
}
}
j++;
}
if(c==1 || d==1)
{
// cout<<"NO";
}
else
{
f++;
// cout<<"PR";
}
// cout<<"\n";
i++;
}
cout<<f;
return 0;
}
You are given 2 integers 'L' and 'R' . You are required to find the count of all the PR numbers in the range 'L' to 'R' inclusively. PR number are the numbers which satisfy following properties:
No pair of adjacent digits are co-prime i.e. adjacent digits in a PR number will not be co-prime to each other.
PR number is divisible by all the single digit prime numbers which occur as a digit in the PR number.
Note: Two numbers 'a' and 'b' are co-prime, if gcd(a,b)=1.
Also, gcd(0,a)=a;
Example:
Input: [2,5].
Output: '4'.
(Note: '1' is not a prime-number, though its very common)
(All the integers: '2','3','4','5') satisfy the condition of PR numbers :-)
Constraints on 'L','R': 1 <= L, R <= 10^18
What can be the the most efficient algorithm to solve this ?
Note: This will solve only part 1 which is No pair of adjacent digits are co-prime i.e. adjacent digits in a PR number will not be co-prime to each other.
Here is a constructive approach in python: instead of going throught all numbers in range and filtering by conditions, we will just construct all numbers that satisfy the condition. Note that if we have a valid sequence of digits, for it to continue being valid only the rightmost digit matters in order to decide what the next digit will be.
def ways(max_number, prev_digit, current_number):
if current_number > max_number:
return 0
count = 1
if prev_digit == 0:
if current_number != 0:
count += ways(max_number, 0, current_number * 10)
for i in range(2, 10):
count += ways(max_number, i, current_number * 10 + i)
if prev_digit == 2 or prev_digit == 4 or prev_digit == 8:
for i in [0, 2, 4, 6, 8]:
count += ways(max_number, i, current_number * 10 + i)
if prev_digit == 3 or prev_digit == 9:
for i in [0, 3, 6, 9]:
count += ways(max_number, i, current_number * 10 + i)
if prev_digit == 5 or prev_digit == 7:
count += ways(max_number, 0, current_number * 10)
count += ways(max_number, prev_digit, current_number * 10 + prev_digit)
if prev_digit == 6:
for i in [0, 2, 3, 4, 6, 8, 9]:
count += ways(max_number, i, current_number * 10 + i)
return count
As we are generating all valid numbers up to max_number without any repeats, the complexity of this function is O(amount of numbers between 0 and max_number that satisfy condition 1). To calculate the range a to b, we just need to do ways(b) - ways(a - 1).
Takes less than 1 second to caculate these numbers from 0 to 1 million, as there are only 42935 numbers that satisfy the result. As there are few numbers that satisfy the condition, we can then check if they are multiple of its prime digits to satisfy also condition 2. I leave this part up to the reader as there are multiple ways to do it.
TL;DR: This is more commonly called "digit dynamic programming with bitmask"
In more competitive-programming-familiar terms, you'd compute dp[n_digit][mod_2357][is_less_than_r][digit_appeared][last_digit] = number of numbers with n_digit digits (including leading zeroes), less than the number formed by first n_digit digits of R and with the other properties match. Do it twice with R and L-1 then take the difference. The number of operations required would be about 19 (number of digits) * 210 (mod) * 2 * 24 (it's only necessary to check for appearance of single-digit primes) * 10 * 10, which is obviously manageable by today computers.
Think about how you'd check whether a number is valid.
Not the normal way. Using a finite state automaton that take the input from left to right, digit by digit.
For simplicity, assume the input has a fixed number of digits (so that comparison with L/R is easier. This is possible because the number has at most as many digits as R).
It's necessary for each state to keep track of:
which digit appeared in the number (use a bit mask, there are 4 1-digit primes)
is the number in range [L..R] (either this is guaranteed to be true/false by the prefix, otherwise the prefix matches with that of L/R)
what is the value of the prefix mod each single digit prime
the most recent digit (to check whether all pairs of consecutive digits are coprime)
After the finite state automaton is constructed, the rest is simple. Just use dynamic programming to count the number of path to any accepted state from the starting state.
Remark: This method can be used to count the number of any type of object that can be verified using a finite state automaton (roughly speaking, you can check whether the property is satisfied using a program with constant memory usage, and takes the object piece-by-piece in some order)
We need a table where we can look up the count of suffixes that would match a prefix to construct valid numbers. Given a prefix's
right digit
prime combination
mod combination
and a suffix length, we'd like the count of suffixes that have searchable:
left digit
length
prime combination
mod combination
I started coding in Python, then switched to JavaScript to be able to offer a snippet. Comments in the code describe each lookup table. There are a few of them to allow for faster enumeration. There are samples of prefix-suffix calculations to illustrate how one can build an arbitrary upper-bound using the table, although at least some, maybe all of the prefix construction and aggregation could be made during the tabulation.
function gcd(a,b){
if (!b)
return a
else
return gcd(b, a % b)
}
// (Started writing in Python,
// then switched to JavaScript...
// 'xrange(4)' -> [0, 1, 2, 3]
// 'xrange(2, 4)' -> [2, 3]
function xrange(){
let l = 0
let r = arguments[1] || arguments[0]
if (arguments.length > 1)
l = arguments[0]
return new Array(r - l).fill(0).map((_, i) => i + l)
}
// A lookup table and its reverse,
// mapping each of the 210 mod combinations,
// [n % 2, n % 3, n % 5, n % 7], to a key
// from 0 to 209.
// 'mod_combs[0]' -> [0, 0, 0, 0]
// 'mod_combs[209]' -> [1, 2, 4, 6]
// 'mod_keys[[0,0,0,0]]' -> 0
// 'mod_keys[[1,2,4,6]]' -> 209
let mod_combs = {}
let mod_keys = {}
let mod_key_count = 0
for (let m2 of xrange(2)){
for (let m3 of xrange(3)){
for (let m5 of xrange(5)){
for (let m7 of xrange(7)){
mod_keys[[m2, m3, m5, m7]] = mod_key_count
mod_combs[mod_key_count] = [m2, m3, m5, m7]
mod_key_count += 1
}
}
}
}
// The main lookup table built using the
// dynamic program
// [mod_key 210][l_digit 10][suffix length 20][prime_comb 16]
let table = new Array(210)
for (let mk of xrange(210)){
table[mk] = new Array(10)
for (let l_digit of xrange(10)){
table[mk][l_digit] = new Array(20)
for (let sl of xrange(20)){
table[mk][l_digit][sl] = new Array(16).fill(0)
}
}
}
// We build prime combinations from 0 (no primes) to
// 15 (all four primes), using a bitmask of up to four bits.
let prime_set = [0, 0, 1<<0, 1<<1, 0, 1<<2, 0, 1<<3, 0, 0]
// The possible digits that could
// follow a digit
function get_valid_digits(digit){
if (digit == 0)
return [0, 2, 3, 4, 5, 6, 7, 8, 9]
else if ([2, 4, 8].includes(digit))
return [0, 2, 4, 6, 8]
else if ([3, 9].includes(digit))
return [0, 3, 6, 9]
else if (digit == 6)
return [0, 2, 3, 4, 6, 8, 9]
else if (digit == 5)
return [0, 5]
else if (digit == 7)
return [0, 7]
}
// Build the table bottom-up
// Single digits
for (let i of xrange(10)){
let mod_key = mod_keys[[i % 2, i % 3, i % 5, i % 7]]
let length = 1
let l_digit = i
let prime_comb = prime_set[i]
table[mod_key][l_digit][length][prime_comb] = 1
}
// Everything else
// For demonstration, we just table up to 6 digits
// since either JavaScript, this program, or both seem
// to be too slow for a full demo.
for (let length of xrange(2, 6)){
// We're appending a new left digit
for (let new_l_digit of xrange(0, 10)){
// The digit 1 is never valid
if (new_l_digit == 1)
continue
// The possible digits that could
// be to the right of our new left digit
let ds = get_valid_digits(new_l_digit)
// For each possible digit to the right
// of our new left digit, iterate over all
// the combinations of primes and remainder combinations.
// The ones that are populated are valid paths, the
// sum of which can be aggregated for each resulting
// new combination of primes and remainders.
for (let l_digit of ds){
for (let p_comb of xrange(16)){
for (let m_key of xrange(210)){
new_prime_comb = prime_set[new_l_digit] | p_comb
// suffix's remainder combination
let [m2, m3, m5, m7] = mod_combs[m_key]
// new remainder combination
let m = Math.pow(10, length - 1) * new_l_digit
let new_mod_key = mod_keys[[(m + m2) % 2, (m + m3) % 3, (m + m5) % 5, (m + m7) % 7]]
// Aggregate any populated entries into the new
// table entry
table[new_mod_key][new_l_digit][length][new_prime_comb] += table[m_key][l_digit][length - 1][p_comb]
}
}
}
}
}
// If we need only a subset of the mods set to
// zero, we need to check all instances where
// this subset is zero. For example,
// for the prime combination, [2, 3], we need to
// check all mod combinations where the first two
// are zero since we don't care about the remainders
// for 5 and 7: [0,0,0,0], [0,0,0,1],... [0,0,4,6]
// Return all needed combinations given some
// predetermined, indexed remainders.
function prime_comb_to_mod_keys(remainders){
let mod_map = [2, 3, 5, 7]
let mods = []
for (let i of xrange(4))
mods.push(!remainders.hasOwnProperty(i) ? mod_map[i] - 1 : 0)
function f(ms, i){
if (i == ms.length){
for (let idx in remainders)
ms[idx] = remainders[idx]
return [mod_keys[ms]]
}
let result = []
for (let m=ms[i] - 1; m>=0; m--){
let _ms = ms.slice()
_ms[i] = m
result = result.concat(f(_ms, i + 1))
}
return result.concat(f(ms, i + 1))
}
return f(mods, 0)
}
function get_matching_mods(prefix, len_suffix, prime_comb){
let ps = [2, 3, 5, 7]
let actual_prefix = Math.pow(10, len_suffix) * prefix
let remainders = {}
for (let i in xrange(4)){
if (prime_comb & (1 << i))
remainders[i] = (ps[i] - (actual_prefix % ps[i])) % ps[i]
}
return prime_comb_to_mod_keys(remainders)
}
// A brute-force function to check the
// table is working. Returns a list of
// valid numbers of 'length' digits
// given a prefix.
function confirm(prefix, length){
let result = [0, []]
let ps = [0, 0, 2, 3, 0, 5, 0, 7, 0, 0]
let p_len = String(prefix).length
function check(suffix){
let num = Math.pow(10, length - p_len) * prefix + suffix
let temp = num
prev = 0
while (temp){
let d = temp % 10
if (d == 1 || gcd(prev, d) == 1 || (ps[d] && num % d))
return [0, []]
prev = d
temp = ~~(temp / 10)
}
return [1, [num]]
}
for (suffix of xrange(Math.pow(10, length - p_len))){
let [a, b] = check(suffix)
result[0] += a
result[1] = result[1].concat(b)
}
return result
}
function get_prime_comb(prefix){
let prime_comb = 0
while (prefix){
let d = prefix % 10
prime_comb |= prime_set[d]
prefix = ~~(prefix / 10)
}
return prime_comb
}
// A function to test the table
// against the brute-force method.
// To match a prefix with the number
// of valid suffixes of a chosen length
// in the table, we want to aggregate all
// prime combinations for all valid digits,
// where the remainders for each combined
// prime combination (prefix with suffix)
// sum to zero (with the appropriate mod).
function test(prefix, length, show=false){
let r_digit = prefix % 10
let len_suffix = length - String(prefix).length
let prefix_prime_comb = get_prime_comb(prefix)
let ds = get_valid_digits(r_digit)
let count = 0
for (let l_digit of ds){
for (let prime_comb of xrange(16)){
for (let i of get_matching_mods(prefix, len_suffix, prefix_prime_comb | prime_comb)){
let v = table[i][l_digit][len_suffix][prime_comb]
count += v
}
}
}
let c = confirm(prefix, length)
return `${ count }, ${ c[0] }${ show ? ': ' + c[1] : '' }`
}
// Arbitrary prefixes
for (let length of [3, 4]){
for (let prefix of [2, 30]){
console.log(`prefix, length: ${ prefix }, ${ length }`)
console.log(`tabled, brute-force: ${ test(prefix, length, true) }\n\n`)
}
}
let length = 6
for (let l_digit=2; l_digit<10; l_digit++){
console.log(`prefix, length: ${ l_digit }, ${ length }`)
console.log(`tabled, brute-force: ${ test(l_digit, length) }\n\n`)
}

Maximum number achievable by converting two adjacent x to one (x+1)

Given a sequence of N integers where 1 <= N <= 500 and the numbers are between 1 and 50. In a step any two adjacent equal numbers x x can be replaced with a single x + 1. What is the maximum number achievable by such steps.
For example if given 2 3 1 1 2 2 then the maximum possible is 4:
2 3 1 1 2 2 ---> 2 3 2 2 2 ---> 2 3 3 2 ---> 2 4 2.
It is evident that I should try to do better than the maximum number available in the sequence. But I can't figure out a good algorithm.
Each substring of the input can make at most one single number (invariant: the log base two of the sum of two to the power of each entry). For every x, we can find the set of substrings that can make x. For each x, this is (1) every occurrence of x (2) the union of two contiguous substrings that can make x - 1. The resulting algorithm is O(N^2)-time.
An algorithm could work like this:
Convert the input to an array where every element has a frequency attribute, collapsing repeated consecutive values in the input into one single node. For example, this input:
1 2 2 4 3 3 3 3
Would be represented like this:
{val: 1, freq: 1} {val: 2, freq: 2} {val: 4, freq: 1} {val: 3, freq: 4}
Then find local minima nodes, like the node (3 3 3 3) in 1 (2 2) 4 (3 3 3 3) 4, i.e. nodes whose neighbours both have higher values. For those local minima that have an even frequency, "lift" those by applying the step. Repeat this until no such local minima (with even frequency) exist any more.
Start of the recursive part of the algorithm:
At both ends of the array, work inwards to "lift" values as long as the more inner neighbour has a higher value. With this rule, the following:
1 2 2 3 5 4 3 3 3 1 1
will completely resolve. First from the left side inward:
1 4 5 4 3 3 3 1 1
Then from the right side:
1 4 6 3 2
Note that when there is an odd frequency (like for the 3s above), there will be a "remainder" that cannot be incremented. The remainder should in this rule always be left on the outward side, so to maximise the potential towards the inner part of the array.
At this point the remaining local minima have odd frequencies. Applying the step to such a node will always leave a "remainder" (like above) with the original value. This remaining node can appear anywhere, but it only makes sense to look at solutions where this remainder is on the left side or the right side of the lift (not in the middle). So for example:
4 1 1 1 1 1 2 3 4
Can resolve to one of these:
4 2 2 1 2 3 4
Or:
4 1 2 2 2 3 4
The 1 in either second or fourth position, is the above mentioned "remainder". Obviously, the second way of resolving is more promising in this example. In general, the choice is obvious when on one side there is a value that is too high to merge with, like the left-most 4 is too high for five 1 values to get to. The 4 is like a wall.
When the frequency of the local minimum is one, there is nothing we can do with it. It actually separates the array in a left and right side that do not influence each other. The same is true for the remainder element discussed above: it separates the array into two parts that do not influence each other.
So the next step in the algorithm is to find such minima (where the choice is obvious), apply that kind of step and separate the problem into two distinct problems which should be solved recursively (from the top). So in the last example, the following two problems would be solved separately:
4
2 2 3 4
Then the best of both solutions will count as the overall solution. In this case that is 5.
The most challenging part of the algorithm is to deal with those local minima for which the choice of where to put the remainder is not obvious. For instance;
3 3 1 1 1 1 1 2 3
This can go to either:
3 3 2 2 1 2 3
3 3 1 2 2 2 3
In this example the end result is the same for both options, but in bigger arrays it would be less and less obvious. So here both options have to be investigated. In general you can have many of them, like 2 in this example:
3 1 1 1 2 3 1 1 1 1 1 3
Each of these two minima has two options. This seems like to explode into too many possibilities for larger arrays. But it is not that bad. The algorithm can take opposite choices in neighbouring minima, and go alternating like this through the whole array. This way alternating sections are favoured, and get the most possible value drawn into them, while the other sections are deprived of value. Now the algorithm turns the tables, and toggles all choices so that the sections that were previously favoured are now deprived, and vice versa. The solution of both these alternatives is derived by resolving each section recursively, and then comparing the two "grand" solutions to pick the best one.
Snippet
Here is a live JavaScript implementation of the above algorithm.
Comments are provided which hopefully should make it readable.
"use strict";
function Node(val, freq) {
// Immutable plain object
return Object.freeze({
val: val,
freq: freq || 1, // Default frequency is 1.
// Max attainable value when merged:
reduced: val + (freq || 1).toString(2).length - 1
});
}
function compress(a) {
// Put repeated elements in a single node
var result = [], i, j;
for (i = 0; i < a.length; i = j) {
for (j = i + 1; j < a.length && a[j] == a[i]; j++);
result.push(Node(a[i], j - i));
}
return result;
}
function decompress(a) {
// Expand nodes into separate, repeated elements
var result = [], i, j;
for (i = 0; i < a.length; i++) {
for (j = 0; j < a[i].freq; j++) {
result.push(a[i].val);
}
}
return result;
}
function str(a) {
return decompress(a).join(' ');
}
function unstr(s) {
s = s.replace(/\D+/g, ' ').trim();
return s.length ? compress(s.split(/\s+/).map(Number)) : [];
}
/*
The function merge modifies an array in-place, performing a "step" on
the indicated element.
The array will get an element with an incremented value
and decreased frequency, unless a join occurs with neighboring
elements with the same value: then the frequencies are accumulated
into one element. When the original frequency was odd there will
be a "remainder" element in the modified array as well.
*/
function merge(a, i, leftWards, stats) {
var val = a[i].val+1,
odd = a[i].freq % 2,
newFreq = a[i].freq >> 1,
last = i;
// Merge with neighbouring nodes of same value:
if ((!odd || !leftWards) && a[i+1] && a[i+1].val === val) {
newFreq += a[++last].freq;
}
if ((!odd || leftWards) && i && a[i-1].val === val) {
newFreq += a[--i].freq;
}
// Replace nodes
a.splice(i, last-i+1, Node(val, newFreq));
if (odd) a.splice(i+leftWards, 0, Node(val-1));
// Update statistics and trace: this is not essential to the algorithm
if (stats) {
stats.total_applied_merges++;
if (stats.trace) stats.trace.push(str(a));
}
return i;
}
/* Function Solve
Parameters:
a: The compressed array to be reduced via merges. It is changed in-place
and should not be relied on after the call.
stats: Optional plain object that will be populated with execution statistics.
Return value:
The array after the best merges were applied to achieve the highest
value, which is stored in the maxValue custom property of the array.
*/
function solve(a, stats) {
var maxValue, i, j, traceOrig, skipLeft, skipRight, sections, goLeft,
b, choice, alternate;
if (!a.length) return a;
if (stats && stats.trace) {
traceOrig = stats.trace;
traceOrig.push(stats.trace = [str(a)]);
}
// Look for valleys of even size, and "lift" them
for (i = 1; i < a.length - 1; i++) {
if (a[i-1].val > a[i].val && a[i].val < a[i+1].val && (a[i].freq % 2) < 1) {
// Found an even valley
i = merge(a, i, false, stats);
if (i) i--;
}
}
// Check left-side elements with always increasing values
for (i = 0; i < a.length-1 && a[i].val < a[i+1].val; i++) {
if (a[i].freq > 1) i = merge(a, i, false, stats) - 1;
};
// Check right-side elements with always increasing values, right-to-left
for (j = a.length-1; j > 0 && a[j-1].val > a[j].val; j--) {
if (a[j].freq > 1) j = merge(a, j, true, stats) + 1;
};
// All resolved?
if (i == j) {
while (a[i].freq > 1) merge(a, i, true, stats);
a.maxValue = a[i].val;
} else {
skipLeft = i;
skipRight = a.length - 1 - j;
// Look for other valleys (odd sized): they will lead to a split into sections
sections = [];
for (i = a.length - 2 - skipRight; i > skipLeft; i--) {
if (a[i-1].val > a[i].val && a[i].val < a[i+1].val) {
// Odd number of elements: if more than one, there
// are two ways to merge them, but maybe
// one of both possibilities can be excluded.
goLeft = a[i+1].val > a[i].reduced;
if (a[i-1].val > a[i].reduced || goLeft) {
if (a[i].freq > 1) i = merge(a, i, goLeft, stats) + goLeft;
// i is the index of the element which has become a 1-sized valley
// Split off the right part of the array, and store the solution
sections.push(solve(a.splice(i--), stats));
}
}
}
if (sections.length) {
// Solve last remaining section
sections.push(solve(a, stats));
sections.reverse();
// Combine the solutions of all sections into one
maxValue = sections[0].maxValue;
for (i = sections.length - 1; i >= 0; i--) {
maxValue = Math.max(sections[i].maxValue, maxValue);
}
} else {
// There is no more valley that can be resolved without branching into two
// directions. Look for the remaining valleys.
sections = [];
b = a.slice(0); // take copy
for (choice = 0; choice < 2; choice++) {
if (choice) a = b; // restore from copy on second iteration
alternate = choice;
for (i = a.length - 2 - skipRight; i > skipLeft; i--) {
if (a[i-1].val > a[i].val && a[i].val < a[i+1].val) {
// Odd number of elements
alternate = !alternate
i = merge(a, i, alternate, stats) + alternate;
sections.push(solve(a.splice(i--), stats));
}
}
// Solve last remaining section
sections.push(solve(a, stats));
}
sections.reverse(); // put in logical order
// Find best section:
maxValue = sections[0].maxValue;
for (i = sections.length - 1; i >= 0; i--) {
maxValue = Math.max(sections[i].maxValue, maxValue);
}
for (i = sections.length - 1; i >= 0 && sections[i].maxValue < maxValue; i--);
// Which choice led to the highest value (choice = 0 or 1)?
choice = (i >= sections.length / 2)
// Discard the not-chosen version
sections = sections.slice(choice * sections.length/2);
}
// Reconstruct the solution from the sections.
a = [].concat.apply([], sections);
a.maxValue = maxValue;
}
if (traceOrig) stats.trace = traceOrig;
return a;
}
function randomValues(len) {
var a = [];
for (var i = 0; i < len; i++) {
// 50% chance for a 1, 25% for a 2, ... etc.
a.push(Math.min(/\.1*/.exec(Math.random().toString(2))[0].length,5));
}
return a;
}
// I/O
var inputEl = document.querySelector('#inp');
var randEl = document.querySelector('#rand');
var lenEl = document.querySelector('#len');
var goEl = document.querySelector('#go');
var outEl = document.querySelector('#out');
goEl.onclick = function() {
// Get the input and structure it
var a = unstr(inputEl.value),
stats = {
total_applied_merges: 0,
trace: a.length < 100 ? [] : undefined
};
// Apply algorithm
a = solve(a, stats);
// Output results
var output = {
value: a.maxValue,
compact: str(a),
total_applied_merges: stats.total_applied_merges,
trace: stats.trace || 'no trace produced (input too large)'
};
outEl.textContent = JSON.stringify(output, null, 4);
}
randEl.onclick = function() {
// Get input (count of numbers to generate):
len = lenEl.value;
// Generate
var a = randomValues(len);
// Output
inputEl.value = a.join(' ');
// Simulate click to find the solution immediately.
goEl.click();
}
// Tests
var tests = [
' ', '',
'1', '1',
'1 1', '2',
'2 2 1 2 2', '3 1 3',
'3 2 1 1 2 2 3', '5',
'3 2 1 1 2 2 3 1 1 1 1 3 2 2 1 1 2', '6',
'3 1 1 1 3', '3 2 1 3',
'2 1 1 1 2 1 1 1 2 1 1 1 1 1 2', '3 1 2 1 4 1 2',
'3 1 1 2 1 1 1 2 3', '4 2 1 2 3',
'1 4 2 1 1 1 1 1 1 1', '1 5 1',
];
var res;
for (var i = 0; i < tests.length; i+=2) {
var res = str(solve(unstr(tests[i])));
if (res !== tests[i+1]) throw 'Test failed: ' + tests[i] + ' returned ' + res + ' instead of ' + tests[i+1];
}
Enter series (space separated):<br>
<input id="inp" size="60" value="2 3 1 1 2 2"><button id="go">Solve</button>
<br>
<input id="len" size="4" value="30"><button id="rand">Produce random series of this size and solve</button>
<pre id="out"></pre>
As you can see the program produces a reduced array with the maximum value included. In general there can be many derived arrays that have this maximum; only one is given.
An O(n*m) time and space algorithm is possible, where, according to your stated limits, n <= 500 and m <= 58 (consider that even for a billion elements, m need only be about 60, representing the largest element ± log2(n)). m is representing the possible numbers 50 + floor(log2(500)):
Consider the condensed sequence, s = {[x, number of x's]}.
If M[i][j] = [num_j,start_idx] where num_j represents the maximum number of contiguous js ending at index i of the condensed sequence; start_idx, the index where the sequence starts or -1 if it cannot join earlier sequences; then we have the following relationship:
M[i][j] = [s[i][1] + M[i-1][j][0], M[i-1][j][1]]
when j equals s[i][0]
j's greater than s[i][0] but smaller than or equal to s[i][0] + floor(log2(s[i][1])), represent converting pairs and merging with an earlier sequence if applicable, with a special case after the new count is odd:
When M[i][j][0] is odd, we do two things: first calculate the best so far by looking back in the matrix to a sequence that could merge with M[i][j] or its paired descendants, and then set a lower bound in the next applicable cells in the row (meaning a merge with an earlier sequence cannot happen via this cell). The reason this works is that:
if s[i + 1][0] > s[i][0], then s[i + 1] could only possibly pair with the new split section of s[i]; and
if s[i + 1][0] < s[i][0], then s[i + 1] might generate a lower j that would combine with the odd j from M[i], potentially making a longer sequence.
At the end, return the largest entry in the matrix, max(j + floor(log2(num_j))), for all j.
JavaScript code (counterexamples would be welcome; the limit on the answer is set at 7 for convenient visualization of the matrix):
function f(str){
var arr = str.split(/\s+/).map(Number);
var s = [,[arr[0],0]];
for (var i=0; i<arr.length; i++){
if (s[s.length - 1][0] == arr[i]){
s[s.length - 1][1]++;
} else {
s.push([arr[i],1]);
}
}
var M = [new Array(8).fill([0,0])],
best = 0;
for (var i=1; i<s.length; i++){
M[i] = new Array(8).fill([0,i]);
var temp = s[i][1],
temp_odd,
temp_start,
odd = false;
for (var j=s[i][0]; temp>0; j++){
var start_idx = odd ? temp_start : M[i][j-1][1];
if (start_idx != -1 && M[start_idx - 1][j][0]){
temp += M[start_idx - 1][j][0];
start_idx = M[start_idx - 1][j][1];
}
if (!odd){
M[i][j] = [temp,start_idx];
temp_odd = temp;
} else {
M[i][j] = [temp_odd,-1];
temp_start = start_idx;
}
if (!odd && temp & 1 && temp > 1){
odd = true;
temp_start = start_idx;
}
best = Math.max(best,j + Math.floor(Math.log2(temp)));
temp >>= 1;
temp_odd >>= 1;
}
}
return [arr, s, best, M];
}
// I/O
var button = document.querySelector('button');
var input = document.querySelector('input');
var pre = document.querySelector('pre');
button.onclick = function() {
var val = input.value;
var result = f(val);
var text = '';
for (var i=0; i<3; i++){
text += JSON.stringify(result[i]) + '\n\n';
}
for (var i in result[3]){
text += JSON.stringify(result[3][i]) + '\n';
}
pre.textContent = text;
}
<input value ="2 2 3 3 2 2 3 3 5">
<button>Solve</button>
<pre></pre>
Here's a brute force solution:
function findMax(array A, int currentMax)
for each pair (i, i+1) of indices for which A[i]==A[i+1] do
currentMax = max(A[i]+1, currentMax)
replace A[i],A[i+1] by a single number A[i]+1
currentMax = max(currentMax, findMax(A, currentMax))
end for
return currentMax
Given the array A, let currentMax=max(A[0], ..., A[n])
print findMax(A, currentMax)
The algorithm terminates because in each recursive call the array shrinks by 1.
It's also clear that it is correct: we try out all possible replacement sequences.
The code is extremely slow when the array is large and there's lots of options regarding replacements, but actually works reasonbly fast on arrays with small number of replaceable pairs. (I'll try to quantify the running time in terms of the number of replaceable pairs.)
A naive working code in Python:
def findMax(L, currMax):
for i in range(len(L)-1):
if L[i] == L[i+1]:
L[i] += 1
del L[i+1]
currMax = max(currMax, L[i])
currMax = max(currMax, findMax(L, currMax))
L[i] -= 1
L.insert(i+1, L[i])
return currMax
# entry point
if __name__ == '__main__':
L1 = [2, 3, 1, 1, 2, 2]
L2 = [2, 3, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
print findMax(L1, max(L1))
print findMax(L2, max(L2))
The result of the first call is 4, as expected.
The result of the second call is 5 as expected; the sequence that gives the result: 2,3,1,1,2,2,2,2,2,2,2,2, -> 2,3,1,1,3,2,2,2,2,2,2 -> 2,3,1,1,3,3,2,2,2,2, -> 2,3,1,1,3,3,3,2,2 -> 2,3,1,1,3,3,3,3 -> 2,3,1,1,4,3, -> 2,3,1,1,4,4 -> 2,3,1,1,5

find all subsets that sum to x - using an initial code

I am trying to build upon a problem, to solve another similar problem... given below is a code for finding the total number of subsets that sum to a particular value, and I am trying to modify the code so that I can return all subsets that sum to that value (instead of finding the count).
Code for finding the total number of suibsets that sum to 'sum':
/**
* method to return number of sets with a given sum.
**/
public static int count = 0;
public static void countSubsetSum2(int arr[], int k, int sum) {
if(sum == 0) {
count++;
return;
}
if(sum != 0 && k == 0) {
return;
}
if(sum < arr[k - 1]) {
countSubsetSum2(arr, k-1, sum);
}
countSubsetSum2(arr, k-1, sum - arr[k-1]);
countSubsetSum2(arr, k-1, sum);
}
Can someone propose some changes to this code, to make it return the subsets rather than the subset count?
Firstly, your code isn't correct.
The function, at every step, recurses with the sum excluding and including the current element 1, moving on to the next element, thanks to these lines:
countSubsetSum2(arr, k-1, sum - arr[k-1]);
countSubsetSum2(arr, k-1, sum);
But then there's also this:
if(sum < arr[k - 1]) {
countSubsetSum2(arr, k-1, sum);
}
which causes it to recurse twice with the sum excluding the current element under some circumstances (which it should never do).
Essentially you just need to remove that if-statement.
If all the elements are positive and sum - arr[k-1] < 0, we'd keep going, but we can never get a sum of 0 since the sum can't increase, thus we'd be doing a lot of unnecessary work. So, if the elements are all positive, we can add a check for if(arr[k - 1] <= sum) to the first call to improve the running time. If the elements aren't all positive, the code won't find all sums.
Now on to printing the sums
If you understand the code well, changing it to print the sums instead should be pretty easy. I suggest you work on understanding it a bit more - trace what the program will do by hand, then trace what you want the program to do.
And a hint for solving the actual problem: On noting that countSubsetSum2(arr, k-1, sum - arr[k-1]); recurses with the sum including the current element (and the other recursive call recurses with the sum excluding the current element), what you should do should become clear.
1: Well, technically it's reversed (we start with the target sum and decrease to 0 instead of starting at 0 and increasing to sum), but the same idea is there.
This is the code that works:
import java.util.LinkedList;
import java.util.Iterator;
import java.util.List;
public class subset{
public static int count = 0;
public static List list = new LinkedList();
public static void countSubsetSum2(int arr[], int k, int sum) {
if(sum <= 0 || k < 0) {
count++;
return;
}
if(sum == arr[k]) {
System.out.print(arr[k]);
for(Iterator i = list.iterator(); i.hasNext();)
System.out.print("\t" + i.next());
System.out.println();
}
list.add(arr[k]);
countSubsetSum2(arr, k-1, sum - arr[k]);
list.remove(list.size() - 1);
countSubsetSum2(arr, k-1, sum);
}
public static void main(String[] args)
{
int [] array = {1, 4, 5, 6};
countSubsetSum2(array, 3, 10);
}
}
First off, the code you have there doesn't seem to actually work (I tested it on input [1,2,3, ..., 10] with a sum of 3 and it output 128).
To get it working, first note that you implemented the algorithm in a pretty unorthodox way. Mathematical functions take input and produce output. (Arguably) the most elegant programming functions should also take input and produce output because then we can reason about them as we reason about math.
In your case you don't produce any output (the return type is void) and instead store the result in a static variable. This means it's hard to tell exactly what it means to call countSubsetSum2. In particular, what happens if you call it multiple times? It does something different each time (because the count variable will have a different starting value!) Instead, if you write countSubsetSum2 so that it returns a value then you can define its behavior to be: countSubsetSum2 returns the number of subsets of the input arr[0...k] that sum to sum. And then you can try proving why your implementation meets that specification.
I'm not doing the greatest job of explaining, but I think a more natural way to write it would be:
// Algorithm stops once k is the least element in the array
if (k == 0) {
if (sum == 0 || sum == arr[k]) {
// Either we can sum to "sum"
return 1;
}
else {
// Or we can't sum to "sum"
return 0;
}
}
// Otherwise, let's recursively see if we can sum to "sum"
// Any valid subset either includes arr[k]
return countSubsetSum2(arr, k-1, sum - arr[k]) +
// Or it doesn't
countSubsetSum2(arr, k-1, sum);
As described above, this function takes an input and outputs a value that we can define and prove to be true mathematically (caveat: it's usually not quite a proof because there are crazy edge cases in most programming languages unfortunately).
Anyways, to get back to your question. The issue with the above code is that it doesn't store any data... it just returns the count. Instead, let's generate the actual subsets while we're generating them. In particular, when I say Any valid subset either includes arr[k] I mean... the subset we're generating includes arr[k]; so add it. Below I assumed that the code you wrote above is java-ish. Hopefully it makes sense:
// Algorithm stops once k is the least element in the array
if (k == 0) {
if (sum == 0 || sum == arr[k]) {
// Either we can sum to "sum" using just arr[0]
// So return a list of all of the subsets that sum to "sum"
// There are actually a few edge cases here, so we need to be careful
List<Set<int>> ret = new List<Set<int>>();
// First consider if the singleton containing arr[k] could equal sum
if (sum == arr[k])
{
Set<int> subSet = new Subset<int>();
subSet.Add(arr[k]);
ret.Add(subSet);
}
// Now consider the empty set
if (sum == 0)
{
Set<int> subSet = new Subset<int>();
ret.Add(subSet);
}
return ret;
}
else {
// Or we can't sum to "sum" using just arr[0]
// So return a list of all of the subsets that sum to "sum". None
// (given our inputs!)
List<Set<int>> ret = new List<Set<int>>();
return ret;
}
}
// Otherwise, let's recursively generate subsets summing to "sum"
// Any valid subset either includes arr[k]
List<Set<int>> subsetsThatNeedKthElement = genSubsetSum(arr, k-1, sum - arr[k]);
// Or it doesn't
List<Set<int>> completeSubsets = genSubsetSum(arr, k-1, sum);
// Note that subsetsThatNeedKthElement only sum to "sum" - arr[k]... so we need to add
// arr[k] to each of those subsets to create subsets which sum to "sum"
// On the other hand, completeSubsets contains subsets which already sum to "sum"
// so they're "complete"
// Initialize it with the completed subsets
List<Set<int>> ret = new List<Set<int>>(completeSubsets);
// Now augment the incomplete subsets and add them to the final list
foreach (Set<int> subset in subsetsThatNeedKthElement)
{
subset.Add(arr[k]);
ret.Add(subset);
}
return ret;
The code is pretty cluttered with all the comments; but the key point is that this implementation always returns what it's specified to return (a list of sets of ints from arr[0] to arr[k] which sum to whatever sum was passed in).
FYI, there is another approach which is "bottom-up" (i.e. doesn't use recursion) which should be more performant. If you implement it that way, then you need to store extra data in static state (a "memoized table")... which is a bit ugly but practical. However, when you implement it this way you need to have a more clever way of generating the subsets. Feel free to ask that question in a separate post after giving it a try.
Based, on the comments/suggestions here, I have been able to get the solution for this problem in this way:
public static int counter = 0;
public static List<List<Integer>> lists = new ArrayList<>();
public static void getSubsetCountThatSumToTargetValue(int[] arr, int k, int targetSum, List<Integer> list) {
if(targetSum == 0) {
counter++;
lists.add(list);
return;
}
if(k <= 0) {
return;
}
getSubsetCountThatSumToTargetValue(arr, k - 1, targetSum, list);
List<Integer> appendedlist = new ArrayList<>();
appendedlist.addAll(list);
appendedlist.add(arr[k - 1]);
getSubsetCountThatSumToTargetValue(arr, k - 1, targetSum - arr[k - 1], appendedlist);
}
The main method looks like this:
public static void main(String[] args) {
int[] arr = {1, 2, 3, 4, 5};
SubSetSum.getSubsetCountThatSumToTargetValue(arr, 5, 9, new ArrayList<Integer>());
System.out.println("Result count: " + counter);
System.out.println("lists: " + lists);
}
Output:
Result: 3
lists: [[4, 3, 2], [5, 3, 1], [5, 4]]
A Python implementation with k moving from 0 to len() - 1:
import functools
def sum_of_subsets( numbers, sum_original ):
def _sum_of_subsets( list, k, sum ):
if sum < 0 or k == len( numbers ):
return
if ( sum == numbers[ k ] ):
expression = functools.reduce( lambda result, num: str( num ) if len( result ) == 0 else \
"%s + %d" % ( result, num ),
sorted( list + [ numbers[ k ]] ),
'' )
print "%d = %s" % ( sum_original, expression )
return
list.append( numbers[ k ] )
_sum_of_subsets( list, k + 1, sum - numbers[ k ])
list.pop( -1 )
_sum_of_subsets( list, k + 1, sum )
_sum_of_subsets( [], 0, sum_original )
...
sum_of_subsets( [ 8, 6, 3, 4, 2, 5, 7, 1, 9, 11, 10, 13, 12, 14, 15 ], 15 )
...
15 = 1 + 6 + 8
15 = 3 + 4 + 8
15 = 1 + 2 + 4 + 8
15 = 2 + 5 + 8
15 = 7 + 8
15 = 2 + 3 + 4 + 6
15 = 1 + 3 + 5 + 6
15 = 4 + 5 + 6
15 = 2 + 6 + 7
15 = 6 + 9
15 = 1 + 2 + 3 + 4 + 5
15 = 1 + 3 + 4 + 7
15 = 1 + 2 + 3 + 9
15 = 2 + 3 + 10
15 = 3 + 5 + 7
15 = 1 + 3 + 11
15 = 3 + 12
15 = 2 + 4 + 9
15 = 1 + 4 + 10
15 = 4 + 11
15 = 1 + 2 + 5 + 7
15 = 1 + 2 + 12
15 = 2 + 13
15 = 1 + 5 + 9
15 = 5 + 10
15 = 1 + 14
15 = 15

A cool algorithm to check a Sudoku field?

Does anyone know a simple algorithm to check if a Sudoku-Configuration is valid? The simplest algorithm I came up with is (for a board of size n) in Pseudocode
for each row
for each number k in 1..n
if k is not in the row (using another for-loop)
return not-a-solution
..do the same for each column
But I'm quite sure there must be a better (in the sense of more elegant) solution. Efficiency is quite unimportant.
You need to check for all the constraints of Sudoku :
check the sum on each row
check the sum on each column
check for sum on each box
check for duplicate numbers on each row
check for duplicate numbers on each column
check for duplicate numbers on each box
that's 6 checks altogether.. using a brute force approach.
Some sort of mathematical optimization can be used if you know the size of the board (ie 3x3 or 9x9)
Edit: explanation for the sum constraint: Checking for the sum first (and stoping if the sum is not 45) is much faster (and simpler) than checking for duplicates. It provides an easy way of discarding a wrong solution.
Peter Norvig has a great article on solving sudoku puzzles (with python),
https://norvig.com/sudoku.html
Maybe it's too much for what you want to do, but it's a great read anyway
Check each row, column and box such that it contains the numbers 1-9 each, with no duplicates. Most answers here already discuss this.
But how to do that efficiently? Answer: Use a loop like
result=0;
for each entry:
result |= 1<<(value-1)
return (result==511);
Each number will set one bit of the result. If all 9 numbers are unique, the lowest 9
bits will be set.
So the "check for duplicates" test is just a check that all 9 bits are set, which is the same as testing result==511.
You need to do 27 of these checks.. one for each row, column, and box.
Just a thought: don't you need to also check the numbers in each 3x3 square?
I'm trying to figure out if it is possible to have the rows and columns conditions satisfied without having a correct sudoku
This is my solution in Python, I'm glad to see it's the shortest one yet :)
The code:
def check(sud):
zippedsud = zip(*sud)
boxedsud=[]
for li,line in enumerate(sud):
for box in range(3):
if not li % 3: boxedsud.append([]) # build a new box every 3 lines
boxedsud[box + li/3*3].extend(line[box*3:box*3+3])
for li in range(9):
if [x for x in [set(sud[li]), set(zippedsud[li]), set(boxedsud[li])] if x != set(range(1,10))]:
return False
return True
And the execution:
sudoku=[
[7, 5, 1, 8, 4, 3, 9, 2, 6],
[8, 9, 3, 6, 2, 5, 1, 7, 4],
[6, 4, 2, 1, 7, 9, 5, 8, 3],
[4, 2, 5, 3, 1, 6, 7, 9, 8],
[1, 7, 6, 9, 8, 2, 3, 4, 5],
[9, 3, 8, 7, 5, 4, 6, 1, 2],
[3, 6, 4, 2, 9, 7, 8, 5, 1],
[2, 8, 9, 5, 3, 1, 4, 6, 7],
[5, 1, 7, 4, 6, 8, 2, 3, 9]]
print check(sudoku)
Create an array of booleans for every row, column, and square. The array's index represents the value that got placed into that row, column, or square. In other words, if you add a 5 to the second row, first column, you would set rows[2][5] to true, along with columns[1][5] and squares[4][5], to indicate that the row, column, and square now have a 5 value.
Regardless of how your original board is being represented, this can be a simple and very fast way to check it for completeness and correctness. Simply take the numbers in the order that they appear on the board, and begin building this data structure. As you place numbers in the board, it becomes a O(1) operation to determine whether any values are being duplicated in a given row, column, or square. (You'll also want to check that each value is a legitimate number: if they give you a blank or a too-high number, you know that the board is not complete.) When you get to the end of the board, you'll know that all the values are correct, and there is no more checking required.
Someone also pointed out that you can use any form of Set to do this. Arrays arranged in this manner are just a particularly lightweight and performant form of a Set that works well for a small, consecutive, fixed set of numbers. If you know the size of your board, you could also choose to do bit-masking, but that's probably a little overly tedious considering that efficiency isn't that big a deal to you.
Create cell sets, where each set contains 9 cells, and create sets for vertical columns, horizontal rows, and 3x3 squares.
Then for each cell, simply identify the sets it's part of and analyze those.
You could extract all values in a set (row, column, box) into a list, sort it, then compare to '(1, 2, 3, 4, 5, 6, 7, 8, 9)
I did this once for a class project. I used a total of 27 sets to represent each row, column and box. I'd check the numbers as I added them to each set (each placement of a number causes the number to be added to 3 sets, a row, a column, and a box) to make sure the user only entered the digits 1-9. The only way a set could get filled is if it was properly filled with unique digits. If all 27 sets got filled, the puzzle was solved. Setting up the mappings from the user interface to the 27 sets was a bit tedious, but made the rest of the logic a breeze to implement.
It would be very interesting to check if:
when the sum of each row/column/box equals n*(n+1)/2
and the product equals n!
with n = number of rows or columns
this suffices the rules of a sudoku. Because that would allow for an algorithm of O(n^2), summing and multiplying the correct cells.
Looking at n = 9, the sums should be 45, the products 362880.
You would do something like:
for i = 0 to n-1 do
boxsum[i] := 0;
colsum[i] := 0;
rowsum[i] := 0;
boxprod[i] := 1;
colprod[i] := 1;
rowprod[i] := 1;
end;
for i = 0 to n-1 do
for j = 0 to n-1 do
box := (i div n^1/2) + (j div n^1/2)*n^1/2;
boxsum[box] := boxsum[box] + cell[i,j];
boxprod[box] := boxprod[box] * cell[i,j];
colsum[i] := colsum[i] + cell[i,j];
colprod[i] := colprod[i] * cell[i,j];
rowsum[j] := colsum[j] + cell[i,j];
rowprod[j] := colprod[j] * cell[i,j];
end;
end;
for i = 0 to n-1 do
if boxsum[i] <> 45
or colsum[i] <> 45
or rowsum[i] <> 45
or boxprod[i] <> 362880
or colprod[i] <> 362880
or rowprod[i] <> 362880
return false;
Some time ago, I wrote a sudoku checker that checks for duplicate number on each row, duplicate number on each column & duplicate number on each box. I would love it if someone could come up one with like a few lines of Linq code though.
char VerifySudoku(char grid[81])
{
for (char r = 0; r < 9; ++r)
{
unsigned int bigFlags = 0;
for (char c = 0; c < 9; ++c)
{
unsigned short buffer = r/3*3+c/3;
// check horizontally
bitFlags |= 1 << (27-grid[(r<<3)+r+c])
// check vertically
| 1 << (18-grid[(c<<3)+c+r])
// check subgrids
| 1 << (9-grid[(buffer<<3)+buffer+r%3*3+c%3]);
}
if (bitFlags != 0x7ffffff)
return 0; // invalid
}
return 1; // valid
}
if the sum and the multiplication of a row/col equals to the right number 45/362880
First, you would need to make a boolean, "correct". Then, make a for loop, as previously stated. The code for the loop and everything afterwards (in java) is as stated, where field is a 2D array with equal sides, col is another one with the same dimensions, and l is a 1D one:
for(int i=0; i<field.length(); i++){
for(int j=0; j<field[i].length; j++){
if(field[i][j]>9||field[i][j]<1){
checking=false;
break;
}
else{
col[field[i].length()-j][i]=field[i][j];
}
}
}
I don't know the exact algorithim to check the 3x3 boxes, but you should check all the rows in field and col with "/*array name goes here*/[i].contains(1)&&/*array name goes here*/[i].contains(2)" (continues until you reach the length of a row) inside another for loop.
def solution(board):
for i in board:
if sum(i) != 45:
return "Incorrect"
for i in range(9):
temp2 = []
for x in range(9):
temp2.append(board[i][x])
if sum(temp2) != 45:
return "Incorrect"
return "Correct"
board = []
for i in range(9):
inp = raw_input()
temp = [int(i) for i in inp]
board.append(temp)
print solution(board)
Here's a nice readable approach in Python:
from itertools import chain
def valid(puzzle):
def get_block(x,y):
return chain(*[puzzle[i][3*x:3*x+3] for i in range(3*y, 3*y+3)])
rows = [set(row) for row in puzzle]
columns = [set(column) for column in zip(*puzzle)]
blocks = [set(get_block(x,y)) for x in range(0,3) for y in range(0,3)]
return all(map(lambda s: s == set([1,2,3,4,5,6,7,8,9]), rows + columns + blocks))
Each 3x3 square is referred to as a block, and there are 9 of them in a 3x3 grid. It is assumed as the puzzle is input as a list of list, with each inner list being a row.
Let's say int sudoku[0..8,0..8] is the sudoku field.
bool CheckSudoku(int[,] sudoku)
{
int flag = 0;
// Check rows
for(int row = 0; row < 9; row++)
{
flag = 0;
for (int col = 0; col < 9; col++)
{
// edited : check range step (see comments)
if ((sudoku[row, col] < 1)||(sudoku[row, col] > 9))
{
return false;
}
// if n-th bit is set.. but you can use a bool array for readability
if ((flag & (1 << sudoku[row, col])) != 0)
{
return false;
}
// set the n-th bit
flag |= (1 << sudoku[row, col]);
}
}
// Check columns
for(int col= 0; col < 9; col++)
{
flag = 0;
for (int row = 0; row < 9; row++)
{
if ((flag & (1 << sudoku[row, col])) != 0)
{
return false;
}
flag |= (1 << sudoku[row, col]);
}
}
// Check 3x3 boxes
for(int box= 0; box < 9; box++)
{
flag = 0;
for (int ofs = 0; ofs < 9; ofs++)
{
int col = (box % 3) * 3;
int row = ((int)(box / 3)) * 3;
if ((flag & (1 << sudoku[row, col])) != 0)
{
return false;
}
flag |= (1 << sudoku[row, col]);
}
}
return true;
}
Let's assume that your board goes from 1 - n.
We'll create a verification array, fill it and then verify it.
grid [0-(n-1)][0-(n-1)]; //this is the input grid
//each verification takes n^2 bits, so three verifications gives us 3n^2
boolean VArray (3*n*n) //make sure this is initialized to false
for i = 0 to n
for j = 0 to n
/*
each coordinate consists of three parts
row/col/box start pos, index offset, val offset
*/
//to validate rows
VArray( (0) + (j*n) + (grid[i][j]-1) ) = 1
//to validate cols
VArray( (n*n) + (i*n) + (grid[i][j]-1) ) = 1
//to validate boxes
VArray( (2*n*n) + (3*(floor (i/3)*n)+ floor(j/3)*n) + (grid[i][j]-1) ) = 1
next
next
if every array value is true then the solution is correct.
I think that will do the trick, although i'm sure i made a couple of stupid mistakes in there. I might even have missed the boat entirely.
array = [1,2,3,4,5,6,7,8,9]
sudoku = int [][]
puzzle = 9 #9x9
columns = map []
units = map [] # box
unit_l = 3 # box width/height
check_puzzle()
def strike_numbers(line, line_num, columns, units, unit_l):
count = 0
for n in line:
# check which unit we're in
unit = ceil(n / unit_l) + ceil(line_num / unit_l) # this line is wrong - rushed
if units[unit].contains(n): #is n in unit already?
return columns, units, 1
units[unit].add(n)
if columns[count].contains(n): #is n in column already?
return columns, units, 1
columns[count].add(n)
line.remove(n) #remove num from temp row
return columns, units, line.length # was a number not eliminated?
def check_puzzle(columns, sudoku, puzzle, array, units):
for (i=0;i< puzzle;i++):
columns, units, left_over = strike_numbers(sudoku[i], i, columns, units) # iterate through rows
if (left_over > 0): return false
Without thoroughly checking, off the top of my head, this should work (with a bit of debugging) while only looping twice. O(n^2) instead of O(3(n^2))
Here is paper by math professor J.F. Crook: A Pencil-and-Paper Algorithm for Solving Sudoku Puzzles
This paper was published in April 2009 and it got lots of publicity as definite Sudoku solution (check google for "J.F.Crook Sudoku" ).
Besides algorithm, there is also a mathematical proof that algorithm works (professor admitted that he does not find Sudoku very interesting, so he threw some math in paper to make it more fun).
I'd write an interface that has functions that receive the sudoku field and returns true/false if it's a solution.
Then implement the constraints as single validation classes per constraint.
To verify just iterate through all constraint classes and when all pass the sudoku is correct. To speedup put the ones that most likely fail to the front and stop in the first result that points to invalid field.
Pretty generic pattern. ;-)
You can of course enhance this to provide hints which field is presumably wrong and so on.
First constraint, just check if all fields are filled out. (Simple loop)
Second check if all numbers are in each block (nested loops)
Third check for complete rows and columns (almost same procedure as above but different access scheme)
Here is mine in C. Only pass each square once.
int checkSudoku(int board[]) {
int i;
int check[13] = { 0 };
for (i = 0; i < 81; i++) {
if (i % 9 == 0) {
check[9] = 0;
if (i % 27 == 0) {
check[10] = 0;
check[11] = 0;
check[12] = 0;
}
}
if (check[i % 9] & (1 << board[i])) {
return 0;
}
check[i % 9] |= (1 << board[i]);
if (check[9] & (1 << board[i])) {
return 0;
}
check[9] |= (1 << board[i]);
if (i % 9 < 3) {
if (check[10] & (1 << board[i])) {
return 0;
}
check[10] |= (1 << board[i]);
} else if (i % 9 < 6) {
if (check[11] & (1 << board[i])) {
return 0;
}
check[11] |= (1 << board[i]);
} else {
if (check[12] & (1 << board[i])) {
return 0;
}
check[12] |= (1 << board[i]);
}
}
}
Here is what I just did for this:
boolean checkers=true;
String checking="";
if(a.length/3==1){}
else{
for(int l=1; l<a.length/3; l++){
for(int n=0;n<3*l;n++){
for(int lm=1; lm<a[n].length/3; lm++){
for(int m=0;m<3*l;m++){
System.out.print(" "+a[n][m]);
if(a[n][m]<=0){
System.out.print(" (Values must be positive!) ");
}
if(n==0){
if(m!=0){
checking+=", "+a[n][m];
}
else{
checking+=a[n][m];
}
}
else{
checking+=", "+a[n][m];
}
}
}
System.out.print(" "+checking);
System.out.println();
}
}
for (int i=1;i<=a.length*a[1].length;i++){
if(checking.contains(Integer.toString(i))){
}
else{
checkers=false;
}
}
}
checkers=checkCol(a);
if(checking.contains("-")&&!checking.contains("--")){
checkers=false;
}
System.out.println();
if(checkers==true){
System.out.println("This is correct! YAY!");
}
else{
System.out.println("Sorry, it's not right. :-(");
}
}
private static boolean checkCol(int[][]a){
boolean checkers=true;
int[][]col=new int[][]{{0,0,0},{0,0,0},{0,0,0}};
for(int i=0; i<a.length; i++){
for(int j=0; j<a[i].length; j++){
if(a[i][j]>9||a[i][j]<1){
checkers=false;
break;
}
else{
col[a[i].length-j][i]=a[i][j];
}
}
}
String alia="";
for(int i=0; i<col.length; i++){
for(int j=1; j<=col[i].length; j++){
alia=a[i].toString();
if(alia.contains(""+j)){
alia=col[i].toString();
if(alia.contains(""+j)){}
else{
checkers=false;
}
}
else{
checkers=false;
}
}
}
return checkers;
}
You can check if sudoku is solved, in these two similar ways:
Check if the number is unique in each row, column and block.
A naive solution would be to iterate trough every square and check if the number is unique in the row, column block that number occupies.
But there is a better way.
Sudoku is solved if every row, column and block contains a permutation of the numbers (1 trough 9)
This only requires to check every row, column and block, instead of doing that for every number. A simple implementation would be to have a bitfield of numbers 1 trough 9 and remove them when you iterate the columns, rows and blocks. If you try to remove a missing number or if the field isn't empty when you finish then sudoku isn't correctly solved.
Here's a very concise version in Swift, that only uses an array of Ints to track the groups of 9 numbers, and only iterates over the sudoku once.
import UIKit
func check(_ sudoku:[[Int]]) -> Bool {
var groups = Array(repeating: 0, count: 27)
for x in 0...8 {
for y in 0...8 {
groups[x] += 1 << sudoku[x][y] // Column (group 0 - 8)
groups[y + 9] += 1 << sudoku[x][y] // Row (group 9 - 17)
groups[(x + y * 9) / 9 + 18] += 1 << sudoku[x][y] // Box (group 18 - 27)
}
}
return groups.filter{ $0 != 1022 }.count == 0
}
let sudoku = [
[7, 5, 1, 8, 4, 3, 9, 2, 6],
[8, 9, 3, 6, 2, 5, 1, 7, 4],
[6, 4, 2, 1, 7, 9, 5, 8, 3],
[4, 2, 5, 3, 1, 6, 7, 9, 8],
[1, 7, 6, 9, 8, 2, 3, 4, 5],
[9, 3, 8, 7, 5, 4, 6, 1, 2],
[3, 6, 4, 2, 9, 7, 8, 5, 1],
[2, 8, 9, 5, 3, 1, 4, 6, 7],
[5, 1, 7, 4, 6, 8, 2, 3, 9]
]
if check(sudoku) {
print("Pass")
} else {
print("Fail")
}
One minor optimization you can make is that you can check for duplicates in a row, column, or box in O(n) time rather than O(n^2): as you iterate through the set of numbers, you add each one to a hashset. Depending on the language, you may actually be able to use a true hashset, which is constant time lookup and insertion; then checking for duplicates can be done in the same step by seeing if the insertion was successful or not. It's a minor improvement in the code, but going from O(n^2) to O(n) is a significant optimization.

Resources