Finding all possible unique combinations of numbers to reach a given sum - algorithm

We have a list of numbers, let's say: [ 2, 3, 5 ]
and we have a targetSum, let's say: 8
Our goal, then, is to pick numbers from the list (a number may be used more than once) so that they add up to targetSum.
I'll explain my code first. I wrote a simple C++ program for this; it uses recursion and backtracking (without memoization). It does the following:
We subdivide the original problem by reducing targetSum by each number at each level of recursion.
Visualizing this as a tree is useful. We also keep track of which numbers we have subtracted so far, pushing and popping accordingly.
Once we hit the base case of 0, meaning it is possible to create the sum, we record the numbers on the current path.
This process goes on until we have gone through all of the possibilities.
#include <iostream>
#include <vector>
using namespace std;

bool bestSum( int targetSum, vector<int> &holder, vector<vector<int>> &combinations,
              vector<int> &path )
{
    if( targetSum == 0 )
    {
        // the numbers on the current path add up to the original target
        combinations.push_back( path );
        return true;
    }
    if( targetSum < 0 )
        return false;
    bool possible = false;
    for( int i = 0; i < holder.size(); i++ )
    {
        int remainder = targetSum - holder[i];
        path.push_back( holder[i] );
        cout << "After pushing:";
        for( int j = 0; j < path.size(); j++ )
            cout << path[j] << " ";
        cout << endl;
        bool verdict = bestSum( remainder, holder, combinations, path );
        if( verdict == true )
            possible = true;
        path.pop_back();
        cout << "After popping:";
        for( int j = 0; j < path.size(); j++ )
            cout << path[j] << " ";
        cout << endl;
    }
    return possible;
}

int main()
{
    vector<int> holder = { 2, 3, 5 };
    int targetSum = 8;
    vector<vector<int>> combinations;
    vector<int> path;
    bool verdict = bestSum( targetSum, holder, combinations, path );
    for( int i = 0; i < combinations.size(); i++ )
    {
        for( int j = 0; j < combinations[i].size(); j++ )
            cout << combinations[i][j] << " ";
        cout << endl;
    }
    return 0;
}
Ignoring the print statements, the time complexity should be exponential without memoization,
and at most a low-degree polynomial with memoization.
Coming back to the original problem: my code currently produces every possible ordering. For example, with the list and targetSum given at the start of this post, we get 2,3,3 and 3,3,2 as two different combinations, although they are really the same combination.
My question is: is it possible to find all unique combinations of numbers while keeping the logic of my code the same?
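One common way to get each combination only once, while keeping the same recursion, is to pass along a start index so that a branch never goes back to an earlier number in the list. Below is a minimal sketch of that idea. The names uniqueSums and uniqueCombinations are made up for illustration, and it assumes the list holds distinct positive numbers.

```cpp
#include <cstddef>
#include <vector>

// Same recursion as bestSum, plus a `start` index (illustrative helper).
// Only numbers at positions >= start may be appended, so every combination
// is generated in list order exactly once.
void uniqueSums(int targetSum, const std::vector<int>& holder, std::size_t start,
                std::vector<int>& path, std::vector<std::vector<int>>& combinations) {
    if (targetSum == 0) {
        combinations.push_back(path);
        return;
    }
    if (targetSum < 0) return;
    for (std::size_t i = start; i < holder.size(); ++i) {
        path.push_back(holder[i]);
        // pass i (not i + 1) so the same number can be reused
        uniqueSums(targetSum - holder[i], holder, i, path, combinations);
        path.pop_back();
    }
}

// Convenience wrapper (hypothetical name).
std::vector<std::vector<int>> uniqueCombinations(const std::vector<int>& holder, int targetSum) {
    std::vector<std::vector<int>> combinations;
    std::vector<int> path;
    uniqueSums(targetSum, holder, 0, path, combinations);
    return combinations;
}
```

With holder = { 2, 3, 5 } and targetSum = 8 this collects 2 2 2 2, then 2 3 3, then 3 5, each exactly once.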


Will this Selection Sort Code work in O(n) for best case?

I searched everywhere on the internet, and the best-case time complexity of selection sort is always given as O(n^2). But I wrote and tested the code below, a selection sort that seems to work in O(n) in the best case (that is, when the array is already sorted). Please find the mistake in this program.
This is my code:
#include <bits/stdc++.h>
using namespace std;

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        cout << arr[i] << " ";
    cout << endl;
}

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

void selectionSort(int arr[], int n)
{
    int i, j, max_idx;

    // One by one move boundary of unsorted subarray
    for (i = 0; i < n - 1; i++)
    {
        cout << endl;
        printArray(arr, n);
        // Find the maximum element in unsorted array
        max_idx = 0;
        int count = 0; // how many times the running maximum moved forward
        for (j = 1; j < n - i; j++)
        {
            if (arr[j] >= arr[max_idx])
            {
                max_idx = j;
                count++;
            }
        }
        if (count != n - i - 1)
        { //swap only if not already sorted
            // Swap the found maximum element with the last unsorted element
            swap(&arr[max_idx], &arr[n - i - 1]);
        }
        else //already Sorted so returning
        {
            //cout << "Sorted array: \n";
            printArray(arr, n);
            return;
        }
    }
}

// Driver program to test above functions
int main()
{
    int arr[] = {2, 1, 4, 3, 6, 5, 8, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    selectionSort(arr, n);
    cout << "Sorted array: \n";
    printArray(arr, n);
    return 0;
}
Yes, your algorithm has a best case running time of Θ(n), because if the array is already in ascending order then count will equal n - 1 on the first iteration of the outer loop, so the algorithm will terminate early.
Your algorithm is different from the standard selection sort algorithm, which looks like this:
for(int i = 0; i < n - 1; i++) {
    int min_idx = i;
    for(int j = i + 1; j < n; j++) {
        if(arr[j] < arr[min_idx]) {
            min_idx = j;
        }
    }
    swap(&arr[i], &arr[min_idx]);
}
The selection sort algorithm iteratively searches for the minimum remaining element and swaps it into place. This doesn't create an opportunity to detect that the array is already in increasing order, so there's no opportunity to terminate early, and selection sort's best case running time is therefore Θ(n^2).
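To make the Θ(n^2) best case concrete, here is a small sketch that runs the standard algorithm and counts the element comparisons. The function name is hypothetical and the counter is there only for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `a` with the standard selection sort and returns the number of
// element comparisons performed (illustrative helper, not from the question).
long long selectionSortCountingComparisons(std::vector<int>& a) {
    long long comparisons = 0;
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t minIdx = i;
        for (std::size_t j = i + 1; j < n; ++j) {
            ++comparisons;                  // one comparison per inner step
            if (a[j] < a[minIdx]) minIdx = j;
        }
        std::swap(a[i], a[minIdx]);
    }
    return comparisons;
}
```

For 8 elements the count is 7 + 6 + ... + 1 = 28, whether the input is already sorted or not, because the inner loop never stops early.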
Selection sort idea: given an array of n items,
1. Find the largest item x in the range [0…n−1].
2. Swap x with the (n−1)th item.
3. Reduce n by 1 and go to Step 1.
You can use the algorithm above as a hint for writing the selection sort function.
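The three steps above could be written out roughly as follows. This is only a sketch: the function name is invented, and it uses std::vector and std::swap rather than the raw array and pointer swap from the question.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort that repeatedly moves the largest remaining item to the end
// (hypothetical name; follows steps 1-3 above).
void selectionSortByMax(std::vector<int>& a) {
    for (std::size_t n = a.size(); n > 1; --n) {   // step 3: shrink the range
        std::size_t maxIdx = 0;
        for (std::size_t j = 1; j < n; ++j) {      // step 1: largest in [0, n-1]
            if (a[j] > a[maxIdx]) maxIdx = j;
        }
        std::swap(a[maxIdx], a[n - 1]);            // step 2: move it to slot n-1
    }
}
```

Note that this version always scans the whole remaining range, so it has no early exit and stays quadratic even on sorted input.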

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets.
Can it be done in polynomial time? What would be the solution?
Checking all subsets is 2^n.
Backtracking is a possibility for this problem.
It allows examining all the possibilities recursively, without the need of a large amount of memory.
It stops as soon as an optimal solution is found: sum = X, up to a given tolerance (for example 10^-10 in the programme below).
It allows a simple early-abandon (pruning) procedure:
at a given time, if the current sum plus the sum of all remaining elements is lower than X, then we can give up on the current path without examining the remaining elements. This procedure is made more effective by sorting the input data in decreasing order.
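The pruning rule can be sketched in a compact recursive form. This is a simplified illustration under my own assumptions, not the iterative programme itself: the names search, minSumAtLeast and rest are invented, there is no tolerance, and only the best sum is returned, not the subset.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// rest[i] = a[i] + a[i+1] + ... (sum of the elements not yet examined)
static void search(const std::vector<double>& a, const std::vector<double>& rest,
                   std::size_t i, double sum, double X, double& best) {
    if (sum >= X) {                       // feasible: adding more can only increase the sum
        best = std::min(best, sum);
        return;
    }
    if (i == a.size() || sum + rest[i] < X) return;   // early abandon: X is out of reach
    search(a, rest, i + 1, sum + a[i], X, best);      // take a[i]
    search(a, rest, i + 1, sum, X, best);             // skip a[i]
}

// Returns the smallest subset sum >= X, or (total + 1) if no subset reaches X.
double minSumAtLeast(std::vector<double> a, double X) {
    std::sort(a.begin(), a.end(), std::greater<double>());
    std::vector<double> rest(a.size() + 1, 0.0);
    for (std::size_t i = a.size(); i-- > 0;) rest[i] = rest[i + 1] + a[i];
    double best = rest[0] + 1.0;          // sentinel larger than any subset sum
    search(a, rest, 0, 0.0, X, best);
    return best;
}
```

For the small integer set {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42} and X = 33.5 this returns 34 (for instance 17 + 13 + 4). The worst case is still exponential; the pruning only cuts branches that cannot reach X.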
Here is the code, in C++. As it is quite basic, it should be easy to port to another language.
This programme tests the algorithm with random (uniformly distributed) elements and displays the number of iterations.
The complexity (i.e. the number of iterations) varies a lot with the random elements, of course, but it also depends greatly on the tolerance that we accept. With a tolerance of 10^-10 and a size of n = 100, the complexity generally stays quite acceptable. This is no longer the case with a smaller tolerance.
With n = 100 and five runs, I obtained for the number of iterations: 6102, 3672, 8479, 2235, 12926. However, there is clearly no guarantee of good performance in all cases. For n = 100, the number of candidates (subsets) is huge.
// Find min sum greater than a given number X

#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <functional>
#include <tuple>
#include <cstdlib>
#include <cmath>
#include <ctime>

std::tuple<double, std::vector<double>> min_sum_greater(std::vector<double> &a, double X) {
    int n = a.size();
    std::vector<bool> parti (n, false);     // current partition being studied
    std::vector<bool> parti_opt (n, false); // optimal partition
    std::vector<double> sum_back (n, 0);    // sum of remaining elements

    //std::cout << "n = " << n << " \tX = " << X << "\n";
    std::sort(a.begin(), a.end(), std::greater<double>());
    sum_back[n-1] = a[n-1];
    for (int i = n-2; i >= 0; --i) {
        sum_back[i] = sum_back[i+1] + a[i];
    }
    double sum = 0.0;       // current sum

    int i = 0;              // index of the element being examined
    double best_sum = sum_back[0] + 1.0;
    bool up_down = true;
    double eps = 1.0e-10;   // error tolerance
    long long cpt = 0;      // to check the number of iterations

    while (true) {          // UP
        //std::cout << "Start of while loop: i = " << i << "\n";
        cpt++;
        if (up_down) {
            bool abandon = (sum + sum_back[i] < X - eps) || (sum > best_sum);
            if (abandon) {  // premature abandon
                parti[i] = false;
                up_down = false;
                i--;
                continue;
            }
            parti[i] = true;
            sum += a[i];
            //std::cout << "UP, i = " << i << " \tsum = " << sum << "\n";
            if (fabs(sum - X) < eps) {  // exact hit: cannot do better
                best_sum = sum;
                parti_opt = parti;
                break;
            }
            if (sum >= X) {
                if (sum < best_sum) {
                    best_sum = sum;
                    parti_opt = parti;
                    //std::cout << "i = " << i << " \tbest sum = " << best_sum << "\n";
                }
                // adding more elements can only increase the sum: undo a[i]
                parti[i] = false;
                sum -= a[i];
            }
            if (i == (n-1)) {   // leaf
                up_down = false;
                i--;
                continue;
            }
            i++;
        } else {            // DOWN
            if (i < 0) break;
            if (parti[i]) {
                sum -= a[i];
                parti[i] = false;
                i++;
                up_down = true;
            } else {
                i--;
                up_down = false;
            }
        }
    }
    std::vector<double> answer;
    for (int i = 0; i < n; ++i) {
        if (parti_opt[i]) answer.push_back (a[i]);
    }
    std::cout << "number of iterations = " << cpt << " for n = " << n << "\n";
    return std::make_tuple (best_sum, answer);
}

int main () {
    //std::vector<double> a = {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42};
    double X = 33.5;

    srand (time(NULL));
    int n = 100;
    double vmax = 100;
    X = vmax * n / 4;
    std::vector<double> a (n);
    for (int i = 0; i < n; ++i) {
        a[i] = vmax * double(rand())/RAND_MAX;
    }

    double sum;
    std::vector<double> y;
    std::tie (sum, y) = min_sum_greater (a, X);

    std::cout << std::setprecision(15) << "sum = " << sum << "\n";
    if (n < 20) {
        std::cout << "set: ";
        for (auto val: y) {
            std::cout << val << " ";
        }
        std::cout << "\n";
    }
    return 0;
}

Smallest Multiple of given number With digits only 0 and 1

You are given an integer N. You have to find the smallest multiple of N which consists of digits 0 and 1 only. Since this multiple could be large, return it in the form of a string.
The returned string should not contain leading zeroes.
For example,
For N = 55, 110 is smallest multiple consisting of digits 0 and 1.
For N = 2, 10 is the answer.
I saw several related problems, but I could not find the problem with my code.
Here is my code giving TLE on some cases even after using map instead of set.
#define ll long long
int getMod(string s, int A)
int res=0;
for(int i=0;i<s.length();i++)
return res;
string Solution::multiple(int A) {
return to_string(A);
string s="1";
int mod=getMod(s,A);
return s;
else if(st.find(mod)==st.end())
Here is an implementation in Raku.
my $n = 55;
(1 .. Inf).map( *.base(2) ).first( * %% $n );
(1 .. Inf) is a lazy list from one to infinity. The "whatever star" * establishes a closure and stands for the current element in the map.
base is a method of Raku's numeric types which returns a string representation of a given number in the wanted base, here a binary string.
first returns the current element when the "whatever star" closure holds true for it.
The %% is the divisible by operator, it implicitly casts its left side to Int.
Oh, and to top it off: it's easy to parallelize this, so your code can use multiple CPU cores:
(1 .. Inf).race( :batch(1000), :degree(4) ).map( *.base(2) ).first( * %% $n );
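For readers who do not know Raku, the same brute-force idea can be sketched in C++: write k = 1, 2, 3, ... in binary, read those digits as a decimal number, and stop at the first one divisible by n. The function names are invented for this sketch, n is assumed to be positive, and the remainder is computed digit by digit so that no big-integer type is needed.

```cpp
#include <string>

// Binary digits of k (k >= 1) as a string of '0' and '1' (illustrative helper).
std::string toBinary(unsigned long long k) {
    std::string s;
    while (k > 0) {
        s.push_back(static_cast<char>('0' + (k & 1ULL)));
        k >>= 1;
    }
    return std::string(s.rbegin(), s.rend());
}

// Brute force: the first binary-looking string that is a multiple of n
// when read in base 10. Assumes n > 0.
std::string smallestZeroOneMultipleBrute(int n) {
    for (unsigned long long k = 1;; ++k) {
        const std::string s = toBinary(k);
        long long r = 0;
        for (char c : s) r = (r * 10 + (c - '0')) % n;  // s mod n, digit by digit
        if (r == 0) return s;
    }
}
```

Because the candidates are tried in increasing order, the first hit is the smallest one, but the number of candidates grows exponentially with the length of the answer, so this is only practical for small results.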
As mentioned in the "math" reference, the result is related to the congruences of the powers of 10 modulo A.
n = sum_i a[i] 10^i
n modulo A = (sum_i a[i] b[i]) modulo A
where the a[i] are equal to 0 or 1, and b[i] = (10^i) modulo A.
Then the problem is to find the minimum a[i] sequence such that the sum is equal to 0 modulo A.
From a graph point of view, we have to find the shortest path to zero modulo A.
A BFS is generally well suited to finding such a path. The issue is the possible exponential increase in the number of nodes to visit. Here, we are sure to keep the number of nodes below A, by rejecting any node whose sum (modulo A) has already been obtained (see the vector used in the program). Note that this rejection is needed in order to get the minimum number at the end.
Here is a program in C++. The solution being quite simple, it should be easy to understand even for those not familiar with C++.
#include <iostream>
#include <string>
#include <vector>

struct node {
    int sum = 0;
    std::string s;
};

std::string multiple (int A) {
    // handle the trivial cases before anything is computed modulo A
    if (A == 0) return "0";
    if (A == 1) return "1";

    std::vector<std::vector<node>> nodes (2);
    std::vector<bool> used (A, false);
    int range = 0;
    int ten = 10 % A;
    int pow_ten = 1;

    nodes[range].push_back (node{0, "0"});
    nodes[range].push_back (node{1, "1"});
    used[1] = true;

    while (1) {
        int range_new = (range + 1) % 2;
        nodes[range_new].clear();
        pow_ten = (pow_ten * ten) % A;

        for (node &x: nodes[range]) {
            // prepend a 0: the sum does not change
            node y = x;
            y.s = "0" + y.s;
            nodes[range_new].push_back(y);
            // prepend a 1: keep it only if this sum is new
            y = x;
            y.sum = (y.sum + pow_ten) % A;
            if (used[y.sum]) continue;
            used[y.sum] = true;
            y.s = "1" + y.s;
            if (y.sum == 0) return y.s;
            nodes[range_new].push_back(y);
        }
        range = range_new;
    }
}

int main() {
    std::cout << "input number: ";
    int n;
    std::cin >> n;
    std::cout << "Result = " << multiple(n) << "\n";
    return 0;
}
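For comparison, a widely used variant of the same BFS idea works directly on remainders and appends digits on the right instead of the left. The sketch below is not the program above: the function name is invented, it assumes n > 0, and it stores one parent link per remainder instead of one string per node, so the memory use is proportional to n.

```cpp
#include <queue>
#include <string>
#include <vector>

// BFS over remainders modulo n. Appending digit d to a number with remainder r
// gives remainder (10 * r + d) % n. Trying 0 before 1 makes the first path to
// remainder 0 the smallest number. Assumes n > 0.
std::string smallestZeroOneMultiple(int n) {
    std::vector<int> parent(n, -1);      // previous remainder on the path
    std::vector<char> digit(n, '0');     // digit appended to reach this remainder
    std::vector<bool> seen(n, false);
    std::queue<int> q;

    const int start = 1 % n;             // the number "1"
    seen[start] = true;
    q.push(start);
    while (!q.empty()) {
        const int r = q.front();
        q.pop();
        if (r == 0) break;
        for (int d = 0; d <= 1; ++d) {
            const int next = static_cast<int>((10LL * r + d) % n);
            if (seen[next]) continue;
            seen[next] = true;
            parent[next] = r;
            digit[next] = static_cast<char>('0' + d);
            q.push(next);
        }
    }
    // Walk back from remainder 0 to the start, collecting digits in reverse.
    std::string s;
    for (int r = 0; parent[r] != -1; r = parent[r]) s.push_back(digit[r]);
    s.push_back('1');                    // the leading digit
    return std::string(s.rbegin(), s.rend());
}
```

Each remainder enters the queue at most once, so the loop does at most n iterations; the price is three arrays of size n, which can still be a lot for inputs near 10^9.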
The program above uses a kind of memoization to speed up the process, but for large inputs the memory use becomes too large.
As indicated in a comment, for example, it cannot handle the case N = 60000007.
I improved the speed and the range a little with the following modifications:
A function (reduction) was created to simplify the search when the input number is divisible by 2 or 5.
For the memorization of the nodes (nodes array), only one array is now used instead of two.
A kind of meet-in-the-middle procedure is used: in a first step, a function mem_gen memorizes all relevant 01 sequences up to N_DIGIT_MEM (=20) digits. Then the main procedure multiple2 generates valid 01 sequences "after the 20 first digits" and looks in the memory for a "complementary sequence" such that the concatenation of both is a valid sequence.
With this new program, the case N = 60000007 gives the correct result (100101000001001010011110111, 27 digits) in about 600 ms on my PC.
Instead of limiting the number of digits for the memorization in the first step, I now use a threshold on the size of the memory, as this size depends not only on the number of digits but also on the input number. Note that the optimal value of this threshold would depend on the input number. Here, I selected a threshold of 50k as a compromise. With a threshold of 20k, for 60000007, I obtain the correct result in 36 ms. Besides, with a threshold of 100k, the worst case 99999999 is solved in 5 s.
I ran various tests with values less than 10^9. In almost all tested cases, the result is provided in less than 1 s. However, I met a corner case, N = 99999999, for which the result consists of 72 consecutive "1"s. In this particular case, the program takes about 6.7 s. For 60000007, the correct result is obtained in 69 ms.
Here is the new program:
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <unordered_map>
#include <chrono>
#include <cmath>
#include <algorithm>
std::string reverse (std::string s) {
std::string res {s.rbegin(), s.rend()};
return res;
struct node {
int sum = 0;
std::string s;
node (int sum_ = 0, std::string s_ = ""): sum(sum_), s(s_) {};
// This function simplifies the search when the input number is divisible by 2 or 5
node reduction (int &X, long long &pow_ten) {
node init {0, ""};
while (1) {
int digit = X % 10;
if (digit == 1 || digit == 3 || digit == 7 || digit == 9) break;
switch (digit) {
X /= 10;
X = (5*X)/10;
X = (2*X)/10;
pow_ten = (pow_ten * 10) % X;
return init;
const int N_DIGIT_MEM = 30; // 20
const int threshold_size_mem = 50000;
// This function memorizes all relevant 01 sequences up to N_DIGIT_MEM digits
bool gene_mem (int X, long long &pow_ten, int index_max, std::map<int, std::string> &mem, node &result) {
std::vector<node> nodes;
std::vector<bool> used (X, false);
bool start = true;
for (int index = 0; index < index_max; ++index){
if (start) {
node x = {int(pow_ten), "1"};
nodes.push_back (x);
} else {
for (node &x: nodes) {
int n = nodes.size();
for (int i = 0; i < n; ++i) {
node y = nodes[i];
y.sum = (y.sum + pow_ten) % X;
y.s.back() = '1';
if (used[y.sum]) continue;
used[y.sum] = true;
if (y.sum == 0) {
result = y;
return true;
pow_ten = (10 * pow_ten) % X;
start = false;
int n_mem = nodes.size();
if (n_mem > threshold_size_mem) {
for (auto &x: nodes) {
mem[x.sum] = x.s;
//std::cout << "size mem = " << mem.size() << "\n";
return false;
// This function generates valid 01 sequences "after the 20 first digits" and then in the memory
// looks for a "complementary sequence" such that the concatenation of both is a valid sequence
std::string multiple2 (int A) {
std::vector<node> nodes;
std::map<int, std::string> mem;
int ten = 10 % A;
long long pow_ten = 1;
int digit;
if (A == 0) return "0";
int X = A;
node init = reduction (X, pow_ten);
if (X != A) ten = ten % X;
if (X == 1) {
return reverse(init.s);
std::vector<bool> used (X, false);
node result;
int index_max = N_DIGIT_MEM;
if (gene_mem (X, pow_ten, index_max, mem, result)) {
return reverse(init.s + result.s);
node init2 {0, ""};
while (1) {
for (node &x: nodes) {
int n = nodes.size();
for (int i = 0; i < n; ++i) {
node y = nodes[i];
y.sum = (y.sum + pow_ten) % X;
if (used[y.sum]) continue;
used[y.sum] = true;
y.s.back() = '1';
if (y.sum != 0) {
int target = X - y.sum;
auto search = mem.find(target);
if (search != mem.end()) {
//std::cout << "mem size 2nd step = " << nodes.size() << "\n";
return reverse(init.s + search->second + y.s);
pow_ten = (pow_ten * ten) % X;
int main() {
std::cout << "input number: ";
int n;
std::cin >> n;
std::string res;
auto t1 = std::chrono::high_resolution_clock::now();
res = multiple2(n);
std::cout << "Result = " << res << " ndigit = " << res.size() << std::endl;
auto t2 = std::chrono::high_resolution_clock::now();
auto duration2 = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
std::cout << "time = " << duration2/1000 << " ms" << std::endl;
return 0;
For people more familiar with Python, here is a converted version of @Damien's code. Damien's important insight is to strongly reduce the search tree, taking advantage of the fact that each partial sum only needs to be investigated once, namely the first time it is encountered.
The problem is also described at Mathpuzzle, but there they mostly focus on the necessary existence of a solution. There's also code mentioned at the On-Line Encyclopedia of Integer Sequences. The Sage version seems to be somewhat similar.
I made a few changes:
Starting with an empty list helps to correctly solve A=1 while simplifying the code. The multiplication by 10 is moved to the end of the loop. Doing the same for 0 seems to be hard, as log10(0) is minus infinity.
Instead of alternating between nodes[range] and nodes[new_range], two different lists are used.
As Python supports integers of arbitrary precision, the partial results could be stored as decimal or binary numbers instead of as strings. This is not yet done in the code below.
from collections import namedtuple

node = namedtuple('node', 'sum str')

def find_multiple_ones_zeros(A):
    nodes = [node(0, "")]
    used = set()
    pow_ten = 1
    while True:
        new_nodes = []
        for x in nodes:
            # prepend a 0: the sum does not change
            y = node(x.sum, "0" + x.str)
            new_nodes.append(y)
            # prepend a 1: keep it only if this sum is new
            next_sum = (x.sum + pow_ten) % A
            if next_sum in used:
                continue
            used.add(next_sum)
            y = node(next_sum, "1" + x.str)
            if next_sum == 0:
                return y.str
            new_nodes.append(y)
        pow_ten = (pow_ten * 10) % A
        nodes = new_nodes

Stuck with DFS/BFS task (USACO silver)

Competitive programming noob here. I've been trying to solve this question:
The code I wrote only works with the first test case, and gives a Memory Limit Exceeded error (shown as '!') for the rest of the test cases.
This is my code (I accidentally mixed up M and N):
#include <vector>
#include <algorithm>
#include <iostream>
#include <cstdio>
using namespace std;
using std::vector;

vector<int> check;
vector< vector<int> > A;

void dfs(int node)
{
    check[node] = 1;
    int siz = A[node].size();
    for (int i = 0; i < siz; i++)
    {
        int y = A[node][i];
        if (check[y] == 0)
        {
            dfs(y);
        }
    }
}

bool connected(vector<int> C)
{
    for (int i = 1; i <= C.size() - 1; i++)
    {
        if (C[i] == 0)
        {
            return false;
        }
    }
    return true;
}

int main()
{
    freopen("", "r", stdin);
    freopen("closing.out", "w", stdout);
    int M, N;
    cin >> M >> N;
    check.resize(M + 1);
    A.resize(M + 1);
    for (int i = 0; i < N; i++)
    {
        int u, v;
        cin >> u >> v;
        A[u].push_back(v); A[v].push_back(u);
    }
    // is the graph connected before any barn is closed?
    dfs(1);
    if (!connected(check)) {
        cout << "NO" << "\n";
    }
    else {
        cout << "YES" << "\n";
    }
    fill(check.begin(), check.end(), 0);
    for (int j = 1; j < M; j++)
    {
        int node;
        bool con = true;
        cin >> node;
        check[node] = -1;
        // start a dfs from the first barn that is still open
        for (int x = 1; x <= N; x++)
        {
            if (check[x] == 0)
            {
                dfs(x);
                break;
            }
        }
        if (!connected(check)) {
            cout << "NO" << "\n";
        }
        else {
            cout << "YES" << "\n";
        }
        // reset visited marks but keep the closed barns at -1
        for (int g = 1; g <= M; g++)
        {
            if (check[g] == 1)
            {
                check[g] = 0;
            }
        }
    }
    return 0;
}
void dfs(int node) searches through the bidirectional graph starting from node until it reaches a dead end, and for each node that is visited, check[node] becomes 1.
(visited -> 1, not visited -> 0, turned off -> -1).
bool connected(vector<int> C) takes the check vector and sees whether there are any nodes that weren't visited. If this function returns true, the graph is connected; otherwise it is not.
In the main function,
1) I save the bidirectional graph given in the task as an adjacency list.
2) I dfs through it first to see if the graph is initially connected (then print "YES" or "NO"), then reset check.
3) From 1 to M, I read which barn is to be closed, set check[the input value] = -1, and dfs through the graph. After that, I reset the check vector but keep the -1 values, so that those barns stay unavailable for the next rounds of dfs.
I guess my algorithm makes sense, but why would this give an MLE, and how could I improve my solution? I really can't figure out why my code is giving MLEs.
Thanks so much!
Your recursive DFS uses a huge amount of stack space, and that is what causes the MLE.
Try to implement it with BFS, which uses a queue. Try to keep the queue global rather than local.
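A queue-based connectivity check along those lines might look like the sketch below. It is only an illustration: the function name and the closed vector are my own choices (the question keeps this state in check with values 0, 1 and -1), and nodes are assumed to be numbered 1..n as in the question.

```cpp
#include <queue>
#include <vector>

// Returns true if all barns that are still open can reach each other.
// adj is a 1-indexed adjacency list (adj[0] unused); closed[v] marks closed barns.
bool allOpenConnected(const std::vector<std::vector<int>>& adj,
                      const std::vector<bool>& closed) {
    const int n = static_cast<int>(adj.size()) - 1;
    int start = 0, openCount = 0;
    for (int v = 1; v <= n; ++v) {
        if (closed[v]) continue;
        ++openCount;
        if (start == 0) start = v;       // first open barn
    }
    if (openCount == 0) return true;     // nothing left to connect

    std::vector<bool> seen(n + 1, false);
    std::queue<int> q;                   // explicit queue instead of recursion
    q.push(start);
    seen[start] = true;
    int reached = 1;
    while (!q.empty()) {
        const int u = q.front();
        q.pop();
        for (int v : adj[u]) {
            if (closed[v] || seen[v]) continue;
            seen[v] = true;
            ++reached;
            q.push(v);
        }
    }
    return reached == openCount;
}
```

This removes the deep recursion, but each call still walks the whole graph, so calling it once per closed barn remains quadratic overall.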
Your approach will get a Time Limit Exceeded verdict. Try to solve it more efficiently, say in O(n).

How do we solve the given scenario efficiently?

We are given a maze in which we need to visit as many rooms as possible. The specialty of the maze is that once you enter any room, it will only lead you to rooms with a higher tag in the direction you move. B and C decide to move in opposite directions, trying their luck to maximize the number of rooms they search. (They can start in any room; it need not be the same one.)
We need to find out the maximum number of rooms that can be searched.
1. Access to any room with a higher tag is allowed, not just adjacent rooms or the next room with a higher tag.
2. Tags are unique.
So given the input:
12 11 10 1 2 3 4 13 6 7 8 5 9
the answer is 12: (1,2,3,4,6,7,8,9) for B and (5,10,11,12) for C.
I thought of solving this using the longest increasing subsequence, first from the right and then from the left. The count of unique elements in those two subsequences would be the answer.
But my logic seems to fail. How can this be done?
My program below computes the maximum number of rooms searched. It has a time complexity of O(n^3). I modified the DP algorithm for computing the longest increasing subsequence, which is available online, to solve OP's problem. This also addresses OP's concerns about arrays like {1,4,6,2,5}: I correctly get the max value of 5 for that example. I used the idea from @BeyelerStudios that we need to compute the longest increasing subsequence both from left to right and from right to left. But there is a caveat: if we compute the left-to-right max sequence, the right-to-left sequence should be computed on the remaining elements. Example:
For the array {1, 4, 6, 2, 5}, if the forward rooms selected are {1, 4, 5}, then the reverse longest increasing sequence should be computed on the left-out elements {6, 2}.
Below is my program:
#include <iostream>
#include <vector>
using namespace std;

// compute the max increasing sequence from right to left.
int r2lRooms (int arr[], int n)
{
    vector<int> dp(n, 1);
    int i = 0, j = 0;
    int max = 0;

    for (i = n-2; i >= 0; i--) {
        for ( j = n-1; j > i; j-- ) {
            if ( arr[i] > arr[j] && dp[i] < dp[j] + 1) {
                dp[i] = dp[j] + 1;
            }
        }
    }
    for ( i = 0; i < n; i++ ) {
        if ( max < dp[i] ) {
            max = dp[i];
        }
    }
    return max;
}

// compute max rooms.
int maxRooms( int arr[], int n )
{
    vector<int> dp(n, 1), revArray(n);
    int i = 0, j = 0, k = 0;
    int currentMax = 0;
    int forwardMax = 0, reverseMax = 0;

    // First case is that except for first elem, all others are in revArray
    for (i = 1; i < n; i++, k++) {
        revArray[k] = arr[i];
    }
    reverseMax = r2lRooms (revArray.data(), k);
    forwardMax = 1;
    if (currentMax < (forwardMax + reverseMax)) {
        currentMax = forwardMax + reverseMax;
    }
    cout << "forwardmax revmax and currentmax are: " << forwardMax << " " << reverseMax << " " << currentMax << endl;
    cout << endl;

    for ( i = 1; i < n; i++ ) {
        k = 0;
        forwardMax = 1;
        reverseMax = 0;
        cout << "Forward elems for arr[" << i << "]=" << arr[i] << endl;
        for ( j = 0; j < i; j++ ) {
            if ( arr[i] > arr[j] && dp[i] < dp[j] + 1) {
                dp[i] = dp[j] + 1;
                forwardMax = dp[i];
                cout << arr[j] << " ";
            }
            else {
                // element was not in DP calculation, so put in revArray.
                revArray[k] = arr[j];
                k++;
            }
        }
        // copy the remaining elements in revArray.
        for ( j = i+1; j < n; j++ ) {
            revArray[k] = arr[j];
            k++;
        }
        cout << endl;
        reverseMax = r2lRooms (revArray.data(), k);
        if (currentMax < (forwardMax + reverseMax)) {
            currentMax = forwardMax + reverseMax;
        }
        cout << "forwardmax revmax and currentmax are: " << forwardMax << " " << reverseMax << " " << currentMax << endl;
        cout << endl;
    }
    cout << " Max rooms searched " << currentMax << endl;
    return currentMax;
}

int main (void) {
    int arr[] = {12, 11, 10, 1, 2, 3, 4, 13, 6, 7, 8, 5, 9 };
    int size = sizeof(arr) / sizeof(int);
    cout << maxRooms (arr, size);
    return 0;
}
I think the trick is at the intersection, where B and C might share a value or there are options to go around it (say the sequence is 12 11 10 1 2 3 4 <3 5> 13 6 7 8 9). The extra numbers here add 1 to the solution, but don't change the result for either longest increasing subsequence.
So the only problem is the one room in the middle, since on both sides the chosen values diverge.
What I would do is this: compute the longest subsequence in one direction, pick a solution (any solution), take out the numbers in that solution, and compute the longest subsequence in the other direction. This way, if there is a way around the crossing room in the middle, the second pass will prefer it, unless the chosen number is really needed. To check for that, do the same thing again, but build the first subsequence in the opposite direction and the second one (after removing the solution) in the direction chosen initially.
The complexity remains that of the longest increasing subsequence computation, only with a slightly higher constant factor.
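Both answers use the longest increasing subsequence as a building block. For reference, its length can be computed in O(n log n) with the usual "tails" technique, sketched below. The function name is invented, and this only returns the length for one direction; it does not solve the two-person problem by itself.

```cpp
#include <algorithm>
#include <vector>

// Length of the longest strictly increasing subsequence of `a`.
// tails[k] holds the smallest possible last value of an increasing
// subsequence of length k + 1 seen so far.
int lisLength(const std::vector<int>& a) {
    std::vector<int> tails;
    for (int x : a) {
        auto it = std::lower_bound(tails.begin(), tails.end(), x);
        if (it == tails.end()) {
            tails.push_back(x);   // x extends the longest subsequence so far
        } else {
            *it = x;              // x gives a smaller tail for that length
        }
    }
    return static_cast<int>(tails.size());
}
```

On the example input 12 11 10 1 2 3 4 13 6 7 8 5 9 this gives 8, which matches B's path (1,2,3,4,6,7,8,9) in the question.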
