This is an optimization question I've simplified from a more specific problem I'm having, but I'm not sure where this problem is classified, or which method would obtain a solution (brute force, simulated annealing, linear programming?). Any help or references are appreciated!
We have two MxN matrices M1 and M2, where each entry is either 1 or 0.
I'm trying to get from matrix M1 to matrix M2 in the least amount of time possible.
The goal is to minimize the total time, where time is defined by the following:
0 -> 1 transition = 1s
1 -> 0 transition = 0.1s
The only way the matrix can be changed is by selecting a set of rows and a set of columns; all the elements at the intersections of the picked rows and columns are then set to all 0s or all 1s, with the entire move taking the time specified above.
Example:
M1
1 1 1
1 1 0
1 0 0
M2
0 0 1
0 1 1
1 1 1
First iteration:
Select rows 2 and 3, and columns 2 and 3 of M1.
Convert all intersecting elements to 1
takes 1s
M1
1 1 1
1 1 1
1 1 1
Second iteration:
Select row 1, and columns 1 and 2 of M1.
Convert all intersecting elements to 0
takes 0.1s
M1
0 0 1
1 1 1
1 1 1
Third iteration:
Select row 2 and column 1 of M1.
Convert the selected element to 0
takes 0.1s
M1
0 0 1
0 1 1
1 1 1
Here, the total time is 1.2s.
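The worked example above can be checked with a short simulator; a minimal sketch (Python, all names my own) that applies the three moves and totals the time:

```python
# Minimal simulator of the move mechanics described above (0-based indices).
def apply_move(m, rows, cols, value):
    """Set every cell at an intersection of the chosen rows and columns,
    and return the time the move takes: 1s for 0->1, 0.1s for 1->0."""
    for r in rows:
        for c in cols:
            m[r][c] = value
    return 1.0 if value == 1 else 0.1

m1 = [[1, 1, 1],
      [1, 1, 0],
      [1, 0, 0]]
m2 = [[0, 0, 1],
      [0, 1, 1],
      [1, 1, 1]]

total = 0.0
total += apply_move(m1, [1, 2], [1, 2], 1)  # rows 2-3, columns 2-3 -> 1
total += apply_move(m1, [0], [0, 1], 0)     # row 1, columns 1-2 -> 0
total += apply_move(m1, [1], [0], 0)        # row 2, column 1 -> 0
```

After the three moves m1 equals m2, for a total of 1.2s as in the example.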
For the sizes given, this looks like it will be very hard even to approximate. Anyway, here are a couple of ideas.
When a cell needs to change from 0 to 1, I'll write +, when it needs to change in the other direction I'll write -, and when it needs to stay as-is, I'll write either 0 or 1 (i.e. whatever it currently is). So e.g. the problem instance in the OP's question looks like
- - 1
- 1 +
1 + +
Let's consider a slightly easier monotone version of the problem, in which we never change a cell twice.
This generally requires many more moves, but it gives a useful starting point and an upper bound.
In this version of the problem, it doesn't matter in which order we perform the moves.
Simple variations might be more effective as heuristics, e.g. performing a small number of initial 0->1 moves in which every + cell is changed to 1 and other cells are possibly changed too, followed by a series of 1->0 moves to change/fix all other cells.
Shrinking the problem safely
[EDIT 11/12/2014: Fixed the 3rd rule below. Unfortunately it's likely to apply much less often.]
The following tricks never cause a solution to become suboptimal, and may simplify the problem:
Delete any rows or columns that contain no +-cell or --cell: no move will ever use them.
If there are any identical rows or columns, collapse them: whatever you do to this single collapsed row or column, you can do to all rows or columns separately.
If there is any row with just a single +-cell and no 1-cells, you can immediately fix all +-cells in the entire column containing it with a single 0->1 move, since in the monotone problem it's not possible to fix this cell in the same 0->1 move as any +-cell in a different column. Likewise with rows and columns swapped, and with a single --cell and no 0-cells.
Applying these rules multiple times may yield further simplification.
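A sketch of the first two reduction rules on the delta grid ('+', '-', '0', '1' cells as above; the representation and names are my own, and rule 3 is intentionally left out):

```python
def _shrink_rows(grid):
    # Rule 1: drop rows with no cell needing change; rule 2: collapse duplicates.
    out, seen = [], set()
    for row in grid:
        key = tuple(row)
        if ('+' in row or '-' in row) and key not in seen:
            seen.add(key)
            out.append(list(row))
    return out

def shrink(grid):
    """Repeatedly apply reduction rules 1 and 2 to rows and columns until the
    delta grid stops shrinking."""
    while True:
        before = (len(grid), len(grid[0]) if grid else 0)
        grid = _shrink_rows(grid)
        if grid:
            grid = [list(col) for col in zip(*grid)]   # transpose: treat columns as rows
            grid = _shrink_rows(grid)
        if grid:
            grid = [list(col) for col in zip(*grid)]   # transpose back
        if (len(grid), len(grid[0]) if grid else 0) == before:
            return grid
```

On the 3x3 example grid nothing shrinks; a grid with a duplicate row and an already-correct row collapses considerably.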
A very simple heuristic
You can correct an entire row or column of +-cells or --cells in a single move. Therefore it is always possible to solve the problem with 2*min(width, height) moves (the factor 2 is there because we may need both 0->1 and 1->0 moves). A slightly better approach is to greedily find the row or column with the most cells needing correction, and correct it in a single move, switching between rows and columns freely.
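That greedy can be sketched as follows (operating on the same delta grid; a heuristic sketch, not a tuned implementation):

```python
def greedy_moves(grid):
    """Greedy heuristic from the text: repeatedly pick the row or column with
    the most cells still needing one kind of change ('+' or '-') and fix them
    all with a single move. Mutates the delta grid in place."""
    moves = []
    n_cols = len(grid[0])
    while True:
        best = None  # (count, axis, index, sign)
        for sign in '+-':
            for i, row in enumerate(grid):
                n = row.count(sign)
                if n and (best is None or n > best[0]):
                    best = (n, 'row', i, sign)
            for j in range(n_cols):
                n = sum(row[j] == sign for row in grid)
                if n and (best is None or n > best[0]):
                    best = (n, 'col', j, sign)
        if best is None:
            return moves
        _, axis, idx, sign = best
        done = '1' if sign == '+' else '0'
        if axis == 'row':
            grid[idx] = [done if c == sign else c for c in grid[idx]]
        else:
            for row in grid:
                if row[idx] == sign:
                    row[idx] = done
        moves.append((axis, idx, sign))
```

On the example grid this finds 4 moves, within the 2*min(width, height) = 6 bound but well above the optimal 3-move solution from the question; it is only a heuristic.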
The best possible move
Suppose we have two +-cells (i, j) and (k, l), with i <= k and j <= l. They can be changed in the same 0->1 move exactly when both of their "opposite corners" (i, l) and (k, j) are either + or 1. Also notice that if either or both of (i, j) and (k, l) are 1 (instead of +), they could still be included in the same move, even though that move would have no effect on one or both of them.
So if we build a graph G in which we have a vertex for every cell and an edge between two vertices (i, j) and (k, l) whenever (i, j), (k, l), (i, l) and (k, j) are all either + or 1, a clique in this graph corresponds to a set of cells that can all be changed to (or left at) 1 in a single 0->1 move.
To find the best possible move -- that is, the move that changes the most possible 0s to 1s -- we don't quite want the maximum-sized clique in the graph; what we actually want is the clique that contains the largest number of +-cell vertices. We can formulate an ILP that will find this, using a 0/1 variable x_i_j to represent whether vertex (i, j) is in the clique:
Maximise the sum over all variables x_i_j such that (i, j) is a `+`-cell
Subject to
x_i_j + x_k_l <= 1 for all i, j, k, l s.t. there is no edge (i, j)-(k, l)
x_i_j in {0, 1} for all i, j
The constraints prevent any pair of vertices from both being included if there is no edge between them, and the objective function tries to find as large a subset of +-cell vertices as possible that satisfies them.
Of course, the same procedure works for finding 1->0 moves.
(You will already run into problems simply constructing a graph this size: with N and M around 1000, there are around a million vertices, and up to a million million edges. And finding a maximum clique is an NP-hard problem, so it's slow even for graphs with hundreds of edges...)
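For tiny grids, though, the best single 0->1 move can simply be found exhaustively, which is useful for sanity-checking the ILP; a brute-force sketch (exponential in the number of rows and columns, so toy instances only):

```python
from itertools import combinations

def best_one_move(grid):
    """Exhaustively find the 0->1 move fixing the most '+' cells without
    touching any '0' or '-' cell (the monotone rule). Exponential time."""
    n_rows, n_cols = len(grid), len(grid[0])
    best, best_move = 0, None
    row_sets = [c for k in range(1, n_rows + 1) for c in combinations(range(n_rows), k)]
    col_sets = [c for k in range(1, n_cols + 1) for c in combinations(range(n_cols), k)]
    for rows in row_sets:
        for cols in col_sets:
            cells = [grid[r][c] for r in rows for c in cols]
            if all(x in '+1' for x in cells):   # move touches only '+'/'1' cells
                score = cells.count('+')
                if score > best:
                    best, best_move = score, (rows, cols)
    return best, best_move
```

On the example grid the best move fixes all three '+' cells at once (rows 2-3 with columns 2-3), matching the first move of the optimal solution.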
The fewest possible moves
A similar approach can tell us the smallest number of 0->1 (or 1->0) moves required, and at the same time give us a representative cell from each move. This time we look for the largest independent set in the same graph G:
Maximise the sum over all variables x_i_j such that (i, j) is a `+`-cell
Subject to
x_i_j + x_k_l <= 1 for all i, j, k, l s.t. there is an edge (i, j)-(k, l)
x_i_j in {0, 1} for all i, j
All that changed in the problem was that "no edge" changed to "an edge". This now finds a (there may be more than one) maximum-sized set of +-cell vertices that share no edge between them. No pair of such cells can be changed by the same 0->1 move (without also changing a 0-cell or --cell to a 1, which we forbid in the monotone version of the problem, because it would then need to be changed a second time), so however many vertices are returned, at least that many separate 0->1 moves are required. And because we have asked for the maximum independent set, no more moves are needed (if more moves were needed, there would be a larger independent set having that many vertices in it).
I came across a programming challenge a few days back which is over now. The question said: given a string S of lowercase English letters, find the minimum count of characters that need to be changed in S so that it contains the given word W as a substring.
Also, on the next line, print the positions of the characters you need to change, in ascending order. Since there can be multiple outputs, choose the one in which the position of the first changed character is minimal.
I tried using LCS but could only get the count of characters that need to be changed. How do I find the positions of the characters?
I might be missing something, please help. There might be some other algorithm to solve it.
The obvious solution is to shift the reference word W over the input string S and count the differences. However, this will become inefficient for very long strings. So, how can we improve this?
The idea is to target the search at places in S where it is very likely that we have a good match with W. Finding these spots is the critical part. We cannot find them both efficiently and accurately without performing the naive algorithm. So, we use a heuristic H that gives us a lower bound on the number of changes that we have to perform. We calculate this lower bound for every position of S. Then, we start at the position of lowest H and check the actual difference in S and W at that position. If the next-higher H is higher than the current difference, we are already done. If it is not, we check the next position. The outline of the algorithm looks as follows:
input:
W of length LW
S of length LS
H := list of length LS - LW + 1 with tuples [index, mincost]
for i from 0 to LS - LW
H(i) = [i, calculate Heuristic for S[i .. i + LW]]
order H by mincost
actualcost = infinity
nextEntryInH = 0
while nextEntryInH < length(H) && actualcost >= H[nextEntryInH].minCost
calculate actual cost for S[H[nextEntryInH].index .. H[nextEntryInH].index + LW]
update actualcost if we found a lesser cost or equal cost with an earlier difference
nextEntryInH++
Now, back to the heuristic. We need to find something that allows us to approximate the difference for a given position (and we need to guarantee that it is a lower bound), while at the same time being easy to calculate. Since our alphabet is limited, we can use a histogram of the letters to do this. So, let's assume the example from the comments: W = worldcup and the part of S that we are interested in is worstcap. The histograms for these two parts are (omitting letters that do not occur):
a c d l o p r s t u w
worldcup 0 1 1 1 1 1 1 0 0 1 1
worstcap 1 1 0 0 1 1 1 1 1 0 1
------------------------------
abs diff 1 0 1 1 0 0 0 1 1 1 0 (sum = 6)
We can see that half of the sum of absolute differences is a proper lower bound for the number of letters that we need to change (because every letter change decreases the sum by at most 2). In this case the bound is even tight: half the sum, 3, equals the actual cost. However, our heuristic does not consider the order of letters. But in the end, this is what makes it efficiently calculable.
Ok, our heuristic is the sum of absolute differences for the histograms. Now, how can we calculate this efficiently? Luckily, we can calculate both the histograms and the sum incrementally. We start at position 0 and calculate the full histograms and the sum of absolute differences (note that the histogram of W will never change throughout the rest of the runtime). With this information, we can already set H(0).
To calculate the rest of H, we slide our window across S. When we slide our window by one letter to the right, we only need to update our histogram and sum slightly: There is exactly one new letter in our window (add to the histogram) and one letter leaves the window (remove from the histogram). For the two (or one) corresponding letters, calculate the resulting change for the sum of absolute differences and update it. Then, set H accordingly.
With this approach, we can calculate our heuristic in linear time for the entire string S. The heuristic gives us an indication where we should look for matches. Once we have it, we proceed with the remaining algorithm as outlined at the beginning of this answer (start the accurate cost calculation at places with low heuristic and continue until the actual cost exceeds the next-higher heuristic value).
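Putting the pieces together, here is a sketch of the whole approach (Python, names my own; assumes lowercase a-z as stated):

```python
def min_changes(S, W):
    """Sketch of the histogram-pruned search described above. Returns
    (cost, start): the fewest characters of S[start:start+len(W)] to change
    so that W matches there, ties broken by the smallest start."""
    LW, LS = len(W), len(S)
    w_hist, s_hist = [0] * 26, [0] * 26
    for ch in W:
        w_hist[ord(ch) - 97] += 1
    for ch in S[:LW]:
        s_hist[ord(ch) - 97] += 1
    sad = sum(abs(w_hist[c] - s_hist[c]) for c in range(26))
    H = [(sad // 2, 0)]                      # (lower bound, window start)
    for i in range(1, LS - LW + 1):
        # Slide the window: S[i-1] leaves, S[i+LW-1] enters.
        for ch, delta in ((S[i - 1], -1), (S[i + LW - 1], 1)):
            c = ord(ch) - 97
            sad -= abs(w_hist[c] - s_hist[c])
            s_hist[c] += delta
            sad += abs(w_hist[c] - s_hist[c])
        H.append((sad // 2, i))
    H.sort()                                  # most promising bounds first
    best_cost, best_start = LW + 1, -1
    for bound, start in H:
        if bound > best_cost:
            break                             # no remaining window can win
        cost = sum(a != b for a, b in zip(S[start:start + LW], W))
        if cost < best_cost or (cost == best_cost and start < best_start):
            best_cost, best_start = cost, start
    return best_cost, best_start
```

The pruning only helps on inputs where the histogram bound varies; in the worst case every window is still checked.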
LCS (= longest common subsequence) will not work, because the common letters in W and S need to have matching positions: you are only allowed to update, not remove/insert.
If you were allowed to remove/insert, the Levenshtein distance could be used:
https://en.wikipedia.org/wiki/Levenshtein_distance
In your case, an obvious brute-force solution is to match W against S at every position, with complexity O(N*M) (N the size of S, M the size of W).
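For reference, that brute force is only a few lines (Python sketch; useful as a correctness check for anything cleverer):

```python
def brute_force(S, W):
    """O(len(S)*len(W)): check every alignment, return (changes, start),
    ties broken by the smallest start."""
    best = (len(W) + 1, -1)
    for i in range(len(S) - len(W) + 1):
        cost = sum(a != b for a, b in zip(S[i:i + len(W)], W))
        if cost < best[0]:
            best = (cost, i)
    return best
```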
I'm having trouble with the following problem.
Given an N x S grid and m segments parallel to the horizontal axis (all of them tuples (x', x'', y)), answer Q online queries of the form (x', x''). The answer to such a query is the smallest y (if there is one) such that we could place a segment (x', x'', y). All segments are non-overlapping, yet the beginning of one segment can be the ending of another, i.e. segments (x', x'', y) and (x'', x''', y) are allowed. "Being able to place a segment" means such a segment (x', x'', y) would not violate the stated rules; the segment isn't actually placed (the board with the initial segments isn't modified), we only state that there could be one.
Constraints
1 ≤ Q, S, m ≤ 10^5
1 ≤ N ≤ 10^9
Time: 1s
Memory: 256 Mb
Here is an example from the link below. Input segments are (2, 5, 1), (1, 2, 2), (4, 5, 2), (2, 3, 3), (3, 4, 3).
Answer for queries
1) (1, 2) → 1
2) (2, 3) → 2
3) (3, 4) → 2
4) (4, 5) → 3
5) (1, 3) → can't place a segment
A visualized answer for the third query (blue segment):
I don't quite understand how to approach the problem. It is supposed to be solved with persistent segment tree, but I am still unable to come up with something.
Could you help me please.
This is not my homework. The source problem can be found here: http://informatics.mccme.ru/mod/statements/view3.php?chapterid=111614 . There's no English version of the statement available, and the test case presents the input data in a different way, so don't mind the source.
Here is an O(N log N) time solution.
Preliminaries (a good tutorial available here): segment tree, persistent segment tree.
Part 0. Original problem statement
I briefly describe the original problem statement as later I'm going to speak in its terms rather than in terms of abstract segments.
There is a train with S seats (S <= 10^5). It is known that seat s_i is occupied from time l_i to time r_i (there are no more than 10^5 such constraints, or passengers). Then we have to answer 10^5 queries of the kind "find the lowest-numbered seat which is free from time l_i to time r_i, or say that there is none". All queries must be answered online, that is, you have to answer the previous query before seeing the next.
Throughout the text I denote by N the number of seats, the number of passengers, and the number of queries alike, assuming they are of the same order of magnitude. You can do a more accurate analysis if needed.
Part 1. Answering a single query
Let's answer a query [L, R] assuming that there are no occupied places after time R. For each seat we maintain the last time when it is occupied. Call it last(S). Now the answer for the query is minimum S such that last(S) <= L. If we build a segment tree on seats then we'll be able to answer this query in O(log^2 N) time: binary search the value of S, check if range minimum on segment [0, S] is at most L.
However, it might be not enough to get Accepted. We need O(log N). Recall that each node of a segment tree stores the minimum of its range. We start at the root. If the minimum there is > L then there is no available seat for this query. Otherwise the minimum in the left child or in the right child is <= L (or in both). In the first case we descend to the left child, in the second to the right, and repeat until we reach a leaf. This leaf corresponds to the minimum seat with last(S) <= L.
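The descent can be sketched with a plain (non-persistent) segment tree over last(S); a minimal iterative version, names my own:

```python
class SeatTree:
    """Iterative segment tree over seats storing last(S), the last time each
    seat is occupied (0 = never). first_free does the one-pass descent from
    the text: the smallest seat with last(S) <= L, in O(log N)."""
    def __init__(self, n_seats):
        self.size = 1
        while self.size < n_seats:
            self.size *= 2
        self.t = [0] * (2 * self.size)

    def occupy(self, seat, until):
        i = seat + self.size
        self.t[i] = max(self.t[i], until)
        i //= 2
        while i:                    # recompute range minima up to the root
            self.t[i] = min(self.t[2 * i], self.t[2 * i + 1])
            i //= 2

    def first_free(self, L):
        if self.t[1] > L:           # even the best seat is occupied past L
            return None
        i = 1
        while i < self.size:        # prefer the leftmost subtree whose min <= L
            i = 2 * i if self.t[2 * i] <= L else 2 * i + 1
        return i - self.size
```

Making this persistent (one version per passenger, sorted by l_i) is the remaining step for Part 2.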
Part 2. Solving the problem
We maintain a persistent tree on seats, storing last(S) for each seat (same as in the previous part). Let's process initial passengers one by one sorted by their left endpoint in increasing order. For a passenger (s_i, l_i, r_i) we update the segment tree at position s_i with value r_i. The tree is persistent, so we store the new copy somewhere.
To answer a query [L, R], find the latest version of the segment tree whose update happened before R. A binary search on the versions takes O(log N) time.
In that version of the segment tree only passengers with left endpoint < R are taken into account (indeed, exactly those passengers are). So we can use the algorithm from Part 1 to answer the query on this segment tree.
Statement :
Input : list<x',x'',Y>
Query Input : (X',X'')
Output : Ymin
Constraints :
1 ≤ Q, S, m ≤ 10^5
1 ≤ N ≤ 10^9
Time: 1s
Memory: 256 Mb
Answer:
Data structure method you can use :
1. Brute force : directly iterate through the list and perform the check.
2. Sort : sort the list on Y (lowest to highest) and then iterate through it.
Note : sorting a large list will be time consuming.
Sort on Y
Ymin = -1 //Maintain Ymin
for i[0 : len(input)] : //Iterate through tuples
if Ymin != -1 && Y(i-1) != Yi : return Ymin // end case
else if x' > X'' : Ymin = Yi //its on right of query tuple
else if x'<X' && (x' + length <= X') : Ymin = Yi //on left of query tuple
else next
3. Hashmap : Map<Y, list< tuple<x',length> > > to store list of lines for each Y and iterate through them to get least Y.
Note : will take additional time for Map build.
Iterate through list and build a Map
Iterate through Map keys :
Iterate through list of tuples, for each tuple :
if x' > X'': Continue //its on right of query tuple
else if x'<X' && (x' + length <= X') : return Y //on left of query tuple
else next Y
4. Matrix : you can build matrix with 1 for occupied point and 0 for empty.
Note : will take additional time for Matrix build and iteration through matrix is time consuming so not useful.
Example :
0 1 1 1 0 0
1 1 0 1 0 0
0 1 1 1 1 0
I came across an interesting problem and I can't solve it with a good complexity (better than O(qn)):
There are n persons in a row. Initially every person in this row has some value - let's say that the i-th person has value a_i. These values are pairwise distinct.
Every person gets a mark. There are two conditions:
If a_i < a_j then the j-th person can't get a worse mark than the i-th person.
If i < j then the j-th person can't get a worse mark than the i-th person (this condition tells us that the sequence of marks is non-decreasing).
There are q operations. In every operation two persons are swapped (they swap their values).
After each operation you have to tell the maximal number of different marks that these n persons can get.
Do you have any idea?
Consider any two groups, J and I (j < i and a_j < a_i for all j in J and i in I). In any swap between them, a_i becomes the new max for J and a_j the new min for I, and J gets extended to the right at least up to and including i.
Now if there were any group of indices to the right of i whose values were all greater than the values in the left segment of I up to i, this group would not have been part of I, but rather its own group, or part of another group denoting a higher mark.
So this kind of swap reduces the mark count by the number of groups between J and I, and merges the groups from J up to I.
Now consider an in-group swap. The only time a mark would be added is if a_i and a_j (j < i), are the minimum and maximum respectively of two adjacent segments, leading to the group splitting into those two segments. Banana123 showed in a comment below that this condition is not sufficient (e.g., 3,6,4,5,1,2 => 3,1,4,5,6,2). We can address this by also checking before the switch that the second smallest i is greater than the second largest j.
Banana123 also showed in a comment below that more than one mark could be added in this instance, for example 6,2,3,4,5,1. We can handle this by keeping in a segment tree a record of min,max and number of groups, which correspond with a count of sequential maxes.
Example 1:
(1,6,1) // (min, max, group_count)
(3,6,1) (1,4,1)
(6,6,1) (3,5,1) (4,4,1) (1,2,1)
6 5 3 4 2 1
Swap 2 and 5. Updates happen in log(n) along the intervals containing 2 and 5.
To add group counts in a larger interval the left group's max must be lower than the right group's min. But if it's not, as in the second example, we must check one level down in the tree.
(1,6,1)
(2,6,1) (1,5,1)
(6,6,1) (2,3,2) (4,4,1) (1,5,1)
6 2 3 4 5 1
Swap 1 and 6:
(1,6,6)
(1,3,3) (4,6,3)
(1,1,1) (2,3,2) (4,4,1) (5,6,2)
1 2 3 4 5 6
Example 2:
(1,6,1)
(3,6,1) (1,4,1)
(6,6,1) (3,5,1) (4,4,1) (1,2,1)
6 5 3 4 2 1
Swap 1 and 6. On the right side, we have two groups where the left group's max is greater than the right group's min, (4,4,1) (2,6,2). To get an accurate mark count, we go down a level and move 2 into 4's group to arrive at a count of two marks. A similar examination is then done in the level before the top.
(1,6,3)
(1,5,2) (2,6,2)
(1,1,1) (3,5,1) (4,4,1) (2,6,2)
1 5 3 4 2 6
Here's an O(n log n) solution:
If n = 0 or n = 1, then there are n distinct marks.
Otherwise, consider the two "halves" of the list, LEFT = [1, n/2] and RIGHT = [n/2 + 1, n]. (If the list has an odd number of elements, the middle element can go in either half, it doesn't matter.)
Find the greatest value in LEFT — call it aLEFT_MAX — and the least value in the second half — call it aRIGHT_MIN.
If aLEFT_MAX < aRIGHT_MIN, then there's no need for any marks to overlap between the two, so you can just recurse into each half and return the sum of the two results.
Otherwise, we know that there's some segment, extending at least from LEFT_MAX to RIGHT_MIN, where all elements have to have the same mark.
To find the leftmost extent of this segment, we can scan leftward from RIGHT_MIN down to 1, keeping track of the minimum value we've seen so far and the position of the leftmost element we've found to be greater than some further-rightward value. (This can actually be optimized a bit more, but I don't think we can improve the algorithmic complexity by doing so, so I won't worry about that.) And, conversely to find the rightmost extent of the segment.
Suppose the segment in question extends from LEFTMOST to RIGHTMOST. Then we just need to recursively compute the number of distinct marks in [1, LEFTMOST) and in (RIGHTMOST, n], and return the sum of the two results plus 1.
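The recursion above is effectively counting the cut positions where the prefix maximum is smaller than the suffix minimum (each such cut separates two groups that may receive different marks); a flat O(n) sketch of that count, handy for checking a divide-and-conquer implementation:

```python
def count_marks(a):
    """Number of distinct marks = number of maximal blocks; block boundaries
    sit exactly where max(a[0..i]) < min(a[i+1..n-1])."""
    n = len(a)
    if n == 0:
        return 0
    suffix_min = [0] * (n + 1)
    suffix_min[n] = float('inf')
    for i in range(n - 1, -1, -1):
        suffix_min[i] = min(a[i], suffix_min[i + 1])
    marks, prefix_max = 1, a[0]
    for i in range(n - 1):
        prefix_max = max(prefix_max, a[i])
        if prefix_max < suffix_min[i + 1]:   # everything left stays below everything right
            marks += 1
    return marks
```

On 4 1 3 2 5 7 6 8 this finds the four blocks {4 1 3 2}, {5}, {7 6}, {8}.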
I wasn't able to get a complete solution, but here are a few ideas about what can and can't be done.
First: it's impossible to find the number of marks in O(log n) from the array alone - otherwise you could use your algorithm to check if the array is sorted faster than O(n), and that's clearly impossible.
General idea: spend O(n log n) to create any additional data which would let you to compute number of marks in O(log n) time and said data can be updated after a swap in O(log n) time. One possibly useful piece to include is the current number of marks (i.e. finding how number of marks changed may be easier than to compute what it is).
Since update time is O(log n), you can't afford to store anything mark-related (such as "the last person with the same mark") for each person - otherwise taking an array 1 2 3 ... n and repeatedly swapping first and last element would require you to update this additional data for every element in the array.
Geometric interpretation: taking your sequence 4 1 3 2 5 7 6 8 as an example, we can draw points (i, a_i):
|8
+---+-
|7 |
| 6|
+-+---+
|5|
-------+-+
4 |
3 |
2|
1 |
In other words, you need to cover all points by a maximal number of squares. Corollary: exchanging points from different squares a and b reduces total number of squares by |a-b|.
Index squares approach: let n = 2^k (otherwise you can add less than n fictional persons who will never participate in exchanges), let 0 <= a_i < n. We can create O(n log n) objects - "index squares" - which are "responsible" for points (i, a_i) : a*2^b <= i < (a+1)*2^b or a*2^b <= a_i < (a+1)*2^b (on our plane, this would look like a cross with center on the diagonal line a_i=i). Every swap affects only O(log n) index squares.
The problem is, I can't find what information to store for each index square so that it would allow finding the number of marks fast enough. All I have is a feeling that such an approach may be effective.
Hope this helps.
Let's normalize the problem first, so that a_i is in the range 0 to n-1 (this can be achieved in O(n log n) by sorting a, but it just has to be done once, so we are fine).
function normalize(a) {
let b = [];
for (let i = 0; i < a.length; i++)
b[i] = [i, a[i]];
b.sort(function(x, y) {
return x[1] < y[1] ? -1 : 1;
});
for (let i = 0; i < a.length; i++)
a[b[i][0]] = i;
return a;
}
To get the maximal number of marks we can count how many times
i + 1 == mex(a[0..i]),  for integer i in [0, n-1]
a[0..i] denotes the sub-array of all the values from index 0 to i.
mex() is the minimum excludant, which is the smallest value missing from the sequence 0, 1, 2, 3, ...
This allows us to solve a single instance of the problem (ignoring the swaps for the moment) in O(n), e.g. by using the following algorithm:
// assuming values are normalized to be element [0,n-1]
function maxMarks(a) {
let visited = new Array(a.length + 1);
let smallestMissing = 0, marks = 0;
for (let i = 0; i < a.length; i++) {
visited[a[i]] = true;
if (a[i] == smallestMissing) {
smallestMissing++;
while (visited[smallestMissing])
smallestMissing++;
if (i + 1 == smallestMissing)
marks++;
}
}
return marks;
}
If we swap the values at indices x and y (x < y), then the mex for all i < x and i > y doesn't change. Although that is an optimization, it unfortunately doesn't improve the complexity, which is still O(qn).
We can observe that the hits (where the mark count is increased) are always at the beginning of an increasing sequence, and all matches within the same sequence have to satisfy a[i] == i, except for the first one, but I couldn't derive an algorithm from this yet:
0 6 2 3 4 5 1 7
*--|-------|*-*
3 0 2 1 4 6 5 7
-|---|*-*--|*-*
As I am not very proficient in various optimization/tree algorithms, I am seeking help.
Problem Description:
Assume a large sequence of sorted nodes is given, with each node representing an integer value L. L gets bigger with each node, and no two nodes have the same L.
The goal now is to find the best combination of nodes where the difference between the L-values of subsequent nodes is closest to a given integer value M(L) that changes with L.
Example:
So, in the beginning I would have L = 50 and M = 100. The next nodes have L = 70,140,159,240,310.
First, the value of 159 seems to be closest to L+M = 150, so it is chosen as the right value.
However, in the next step, M=100 is still given and we notice that L+M = 259, which is far away from 240.
If we now go back and choose the node with L=140 instead, which then is followed by 240, the overall match between the M values and the L-differences is stronger. The algorithm should be able to find back to the optimal path, even if a mistake was made along the way.
Some additional information:
1) the start node is not necessarily part of the best combination/path, but if required, one could first develop an algorithm, which chooses the best starter candidate.
2) the optimal combination of nodes is following the sorted sequence and not "jumping back" -> so 1,3,5,7 is possible but not 1,3,5,2,7.
3) in the end, the differences between the L values of chosen nodes should in the mean squared sense be closest to the M values
Every help is much appreciated!
If I understand your question correctly, you could use Dijkstra's algorithm:
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
http://www.mathworks.com/matlabcentral/fileexchange/20025-dijkstra-s-minimum-cost-path-algorithm
For that you have to know the neighbours of every node and create an adjacency matrix. With the implementation of Dijkstra's algorithm which I posted above you can specify edge weights. You could specify your edge weight in such a manner that it is L of the node accessed + M. So for every node combination you have your L of the new node + M. In that way the algorithm should find the optimal path between your nodes.
To get all edge combinations you can use Matlabs graph functions:
http://se.mathworks.com/help/matlab/ref/graph.html
If I understand your problem correctly you need an undirected graph.
You can access all edges with the command
G.Edges after you have created the graph.
I know its not the perfect answer but I hope it helps!
P.S. Just watch out: Dijkstra's algorithm can only handle non-negative edge weights.
Suppose we are given a number M and a list of n numbers, L[1], ..., L[n], and we want to find a subsequence of at least q of the latter numbers that minimises the sum of squared errors (SSE) with respect to M, where the SSE of a list of k positions x[1], ..., x[k] with respect to M is given by
SSE(M, x[1], ..., x[k]) = sum((L[x[i]]-L[x[i-1]]-M)^2) over all 2 <= i <= k,
with the SSE of a list of 0 or 1 positions defined to be 0.
(I'm introducing the parameter q and associated constraint on the subsequence length here because without it, there always exists a subsequence of length exactly 2 that achieves the minimum possible SSE -- and I'm guessing that such a short sequence isn't helpful to you.)
This problem can be solved in O(qn^2) time and O(qn) space using dynamic programming.
Define f(i, j) to be the minimum sum of squared errors achievable under the following constraints:
The number at position i is selected, and is the rightmost selected position. (Here, i = 0 implies that no positions are selected.)
We require that at least j (instead of q) of these first i numbers are selected.
Also define g(i, j) to be the minimum of f(k, j) over all 0 <= k <= i. Thus g(n, q) will be the minimum sum of squared errors achievable on the entire original problem. For efficient (O(1)) calculation of g(i, j), note that
g(i>0, j>0) = min(g(i-1, j), f(i, j))
g(0, 0) = 0
g(0, j>0) = infinity
To calculate f(i, j), note that if i > 0 then any solution must be formed by appending the ith position to some solution Y that selects at least j-1 positions and whose rightmost selected position is to the left of i -- i.e. whose rightmost selected position is k, for some k < i. The total SSE of this solution to the (i, j) subproblem will be whatever the SSE of Y was, plus a fixed term of (L[i]-L[k]-M)^2 -- so to minimise this total SSE, it suffices to minimise the SSE of Y. But we can compute that minimum: it is g(k, j-1).
Since this holds for any 0 <= k < i, it suffices to try all such values of k, and take the one that gives the lowest total SSE:
f(i>=j, j>=2) = min of (g(k, j-1) + (L[i]-L[k]-M)^2) over all 0 <= k < i
f(i>=j, j<2) = 0 # If we only need 0 or 1 position, SSE is 0
f(i, j>i) = infinity # Can't choose > i positions if the rightmost chosen position is i
With the above recurrences and base cases, we can compute g(n, q), the minimum possible sum of squared errors for the entire problem. By memoising values of f(i, j) and g(i, j), the time to compute all needed values of f(i, j) is O(qn^2), since there are at most (n+1)*(q+1) possible distinct combinations of input parameters (i, j), and computing a particular value of f(i, j) requires at most (n+1) iterations of the loop that chooses values of k, each iteration of which takes O(1) time outside of recursive subcalls. Storing solution values of f(i, j) requires at most (n+1)*(q+1), or O(qn), space, and likewise for g(i, j). As established above, g(i, j) can be computed in O(1) time when all needed values of f(x, y) have been computed, so g(n, q) can be computed in the same time complexity.
To actually reconstruct a solution corresponding to this minimum SSE, you can trace back through the computed values of f(i, j) in reverse order, each time looking for a value of k that achieves a minimum value in the recurrence (there may in general be many such values of k), setting i to this value of k, and continuing on until i=0. This is a standard dynamic programming technique.
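A sketch of the recurrences in Python (memoised bottom-up; indices shifted so that position i corresponds to L[i-1], and q, M as defined above):

```python
def min_sse(L, M, q):
    """Minimum SSE over subsequences of L of length >= q, following the
    f/g recurrences above. O(q*n^2) time, O(q*n) space."""
    n = len(L)
    INF = float('inf')
    # f[i][j]: best SSE with position i selected rightmost, >= j selected.
    # g[i][j]: min of f[k][j] over k <= i.
    f = [[0.0] * (q + 1) for _ in range(n + 1)]
    g = [[0.0] * (q + 1) for _ in range(n + 1)]
    for j in range(1, q + 1):
        g[0][j] = INF                       # can't select j > 0 from nothing
    for i in range(1, n + 1):
        for j in range(q + 1):
            if j > i:
                f[i][j] = INF               # can't select more than i positions
            elif j < 2:
                f[i][j] = 0.0               # 0 or 1 positions: SSE is 0
            else:
                best = INF
                for k in range(1, i):       # previous rightmost selected position
                    if g[k][j - 1] < INF:
                        best = min(best, g[k][j - 1] + (L[i - 1] - L[k - 1] - M) ** 2)
                f[i][j] = best
            g[i][j] = min(g[i - 1][j], f[i][j])
    return g[n][q]
```

For example, with L = [0, 100, 210] and M = 100, forcing all three nodes (q = 3) costs (100-0-100)^2 + (210-100-100)^2 = 100, while q = 2 allows the perfect pair (0, 100).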
I now answer my own post with my current implementation, in order to structure my post and load images. Unfortunately, the code does not do what it should. Imagine L, M and q given as in the images below. With the calcf and calcg functions I calculated the F and G matrices, where F(i+1,j+1) is the calculated and stored f(i,j), and G(i+1,j+1) likewise for g(i,j). The SSE of the optimal combination should be G(N+1,q+1), but the result is wrong. If anyone can find the mistake, that would be much appreciated.
G and F Matrix of given problem in the workspace. G and F are created by calculating g(N,q) via calcg(L,N,q,M).
calcf and calcg functions
Suppose you have numbers a_1..a_n and some queries [l, k] (1 <= l <= k <= n). The problem is to find, in the interval [l, k], the minimum distance between two equal numbers.
Examples: (interval l,k shown as |...|)
1 2 2 |1 0 1| 2 3 0 1 2 3
Answer 2 (101)
1 |2 2| 1 0 1 2 3 0 1 2 3
Answer 1 (22)
1 2 2 1 0 |1 2 3 0 3 2 3|
Answer 2 (303) or (323)
I have thought about segment tree, but it is hard to join results from each tree node, when query is shared between several nodes. I have tried some ways to join them, but it looks ugly. Can somebody give me a hint?
Clarification
Thanks for your answers.
The problem is that there are a lot of queries, so O(n) per query is not good. I did not mention a segment tree by accident: it answers [l, r] queries such as [l, r] SUM or [l, r] MIN on an array with O(log n) complexity. Can we do some preprocessing to achieve O(log n) here?
Call an interval minimal if its first number equals its last but each of the numbers in between appears exactly once in the interval. 11 and 101 are minimal, but 12021 and 10101 are not.
In linear time (assuming constant-time hashing), enumerate all of the minimal intervals. This can be done by keeping two indices, l and k, and a hash map that maps each symbol in between l and k to its index. Initially, l = 1 and k = 0. Repeatedly do the following. Increment k (if it's too large, we stop). If the symbol at the new value of k is in the map, then advance l to the map value, deleting stuff from the map as we go. Yield the interval [l, k] and increment l once more. In all cases, write k as the map value of the symbol.
Because of minimality, the minimal intervals are ordered the same way by their left and right endpoints. To answer a query, we look up the first interval that it could contain and the last and then issue a range-minimum query of the lengths of the range of intervals. The result is, in theory, an online algorithm that does linear-time preprocessing and answers queries in constant time, though for convenience you may not implement it that way.
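A sketch of the enumeration and the query step (Python, 0-based indices; for brevity the range-minimum query over interval lengths is a plain min() rather than a sparse table, which is where the constant-time claim would come from):

```python
from bisect import bisect_left, bisect_right

def minimal_intervals(a):
    """Enumerate minimal intervals (first == last, each interior value unique
    inside the interval) via the two-pointer + hash map sweep described above."""
    res, seen, l = [], {}, 0
    for k, x in enumerate(a):
        if x in seen and seen[x] >= l:   # previous occurrence still in window
            l = seen[x]
            res.append((l, k))
            l += 1
        seen[x] = k
    return res

def query(intervals, lo, hi):
    """Smallest distance between equal numbers inside [lo, hi] (inclusive),
    or None. Minimal intervals are sorted by both endpoints, so the
    candidates form one contiguous slice of the list."""
    starts = [s for s, _ in intervals]
    ends = [e for _, e in intervals]
    i = bisect_left(starts, lo)          # first interval starting at >= lo
    j = bisect_right(ends, hi)           # past the last interval ending at <= hi
    if i >= j:
        return None
    return min(e - s for s, e in intervals[i:j])
```

On the first example array the minimal intervals are (1,2), (3,5), (4,8), (5,9), (6,10), (7,11), and the query [3,5] correctly yields 2 for "1 0 1".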
We can do it in O(n log n) with a sort. First, mark all the elements in [l,k] with their original indices. Then sort the elements in [l,k], first by value and second by original index, both ascending.
Then you can loop over the sorted list, keeping a currentValue variable, and checking adjacent values that are the same for distance and setting minDistance if necessary. currentValue is updated when you reach a new value in the sorted list.
Suppose we have this [l,k] range from your second example:
1 2 3 0 3 2 3
We can mark them as
1(1) 2(2) 3(3) 0(4) 3(5) 2(6) 3(7)
and sort them as
0(4) 1(1) 2(2) 2(6) 3(3) 3(5) 3(7)
Looping over this, there are no ranges for 0 and 1. The minimum distance for 2s is 4, and the minimum distance for 3s is 2 ([3,5] or [3,7], depending on if you reset minDistance when the new minimum distance is equal to the current minimum distance).
Thus we get
[3,5] in [l,k] or [5,7] in [l,k]
EDIT
Since you mention some queries, you can preprocess the list in O(n log n) time, and then use only O(n) time for each individual query. You would just ignore indices that are not in [l,k] while looping over the sorted list.
EDIT 2
This is addressing the clarification in the question, which now states that there will always be lots of queries to run. We can preprocess in O(n^2) time using dynamic programming and then run each query in O(1) time.
First, perform the preprocessing on the entire list that I described above. Then form links in O(n) time from the original list into the sorted list.
We can imagine that:
[l,k] = min([l+1,k], [l,k-1], /*some other sequence starting at l or ending at k*/)
We have one base case
[l,k] = infinity where l = k
If [l,k] is not min([l+1,k], [l,k-1]), then the minimal pair either starts at l or ends at k. For each of these, we look into the sorted list at the adjacent element in the correct direction and check the distances (making sure we stay in bounds). We only have to check 2 elements, so it is a constant factor.
Using this algorithm, we can run the following
for l = n downto 1
for k = l to n
M[l,k] = min(M[l+1,k], M[l,k-1], sequence starting at l, sequence ending at k)
You can also store the solutions in the matrix (which is actually a pyramid). Then, when you are given a query [l,k], you just look it up in the matrix.
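A sketch of that preprocessing (Python). Note that the "sequence starting at l or ending at k" candidates collapse to the single pair (l, k) when a[l] == a[k], since any closer pair already lies entirely inside one of the two sub-intervals:

```python
def preprocess(a):
    """M[l][k] = minimum distance between two equal values inside [l, k],
    or inf. O(n^2) time and space; each query is then an O(1) lookup."""
    n = len(a)
    INF = float('inf')
    M = [[INF] * n for _ in range(n)]      # diagonal l == k stays inf (base case)
    for l in range(n - 1, -1, -1):
        for k in range(l + 1, n):
            best = min(M[l + 1][k], M[l][k - 1])
            if a[l] == a[k]:               # the only pair not covered by the sub-intervals
                best = min(best, k - l)
            M[l][k] = best
    return M
```

The matrix is filled in an order where both M[l+1][k] and M[l][k-1] are already known, matching the double loop in the pseudocode above.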