Select one element from each set but the selected element should not be repeated - algorithm

I have a few sets, say 5 of them
{1,2,3}
{2,4}
{1,2}
{4,5}
{2,3,5}
Here, I need to choose at least 3 elements from any three sets (one element per set). An element that has already been selected cannot be selected again.
I also need to check whether any solution exists at all.
E.g.:
set {1,2,3} -> choose 1
set {2,4} -> choose 2
set {1,2} -> cannot choose, since both 1 and 2 are already chosen.
set {2,5} -> can only choose 5
Is there a way to achieve this? A simple explanation would be appreciated.

If you only need 3 elements, then the algorithm is quite simple. Just repeat the following procedure:
1. Select the set with the lowest heuristic. The heuristic is the length of the set divided by the number of times that exact set occurs in the input. If the set has zero elements, remove it and go to step 4. If several sets tie for the lowest heuristic, any one of them will do.
2. Pick an element from that set. This is the element you'll choose.
3. Remove this element from every set.
4. If you have picked 3 elements or there are no more sets remaining, stop. Otherwise go to step 1.
This algorithm gives at least 3 elements whenever it's possible, even in the presence of duplicates. Here's the proof.
If the heuristic for a set is <= 1, picking an element from that set is basically free: it doesn't hurt the ability to use other sets at all.
If we are in a situation with 2 or more sets with heuristic > 1, and we have to pick at least two elements, this is easy. Just pick one from the first, and the second one will still have an element left, because its length is > 1 (since its heuristic is > 1).
If we are in a situation with 3 or more sets with heuristic > 1, we can pick from the first set. After this we are left with at least two sets, where at least one of them has more than one element. We can't be left with two size-one sets, because that would imply that the 3 sets we started with contained a duplicated length-2 set, which would have heuristic 1, not > 1. Thus we can pick all 3 elements.
Here is Python code for this algorithm. The generator returns as many elements as it can manage; if it's possible to return at least 3 elements, it will. Beyond that, however, it doesn't always return the optimal solution.
def choose(sets):
    # Copy the input, to avoid modifying it
    s = [{*e} for e in sets]
    while True:
        # If there are no more sets remaining
        if not s:
            return
        # Pick the set with the lowest heuristic (length / number of duplicates)
        m = min(s, key=lambda x: len(x) / s.count(x))
        s.remove(m)
        # Ignore empty sets
        if m:
            # Remove an arbitrary element
            e = m.pop()
            # Yield it
            yield e
            # Remove the chosen element e from the other sets
            for i in range(len(s)):
                s[i].discard(e)
print([*choose([{1,2,3}, {2,4}, {1,2}, {4,5}, {2,3,5}])])
print([*choose([{1}, {2,3}, {2,4}, {1,2,4}])])
print([*choose([{1,2}, {2}, {2,3,4}])])
print([*choose([{1,2}, {2}, {2,1}])])
print([*choose([{1,2}, {1,3}, {1,3}])])
print([*choose([{1}, {1,2,3}, {1,2,3}])])
print([*choose([{1,3}, {2,3,4}, {2,3,4}, {2,3,4}, {2,3,4}])])
print([*choose([{1,5}, {2,3}, {1,3}, {1,2,3}])])

Something like this.
Given your sets:
0: {1,2,3}
1: {2,4}
2: {1,2}
3: {4,5}
4: {2,3,5}
build an array A of sets, where A[v] holds the indices of all input sets containing v:
A[1] = { 0, 2 }       // all sets containing 1
A[2] = { 0, 1, 2, 4 } // all sets containing 2
A[3] = { 0, 4 }       // all sets containing 3
A[4] = { 1, 3 }       // all sets containing 4
A[5] = { 3, 4 }       // all sets containing 5
set<int> result;
for (i = 0; i < 3; i++) {
    find k such that A[k] is not empty
    if no such k exists then return "no solution"
    pick any set index s from A[k]
    remove s from every A[...]   // that set has supplied its one element
    result.add(k)
    A[k] = empty                 // k itself may not be chosen again
}
return result
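Here is a minimal Python sketch of this inverted-index idea (my own code, not part of the answer above; the name choose3 is made up). Note that, like the pseudocode, the greedy "find any k" pass can miss solutions that a matching-based approach would find.

def choose3(sets):
    # Inverted index: for each value, the indices of the sets containing it.
    A = {}
    for i, s in enumerate(sets):
        for v in s:
            A.setdefault(v, set()).add(i)
    result = []
    for _ in range(3):
        # Find any value that still occurs in some unused set.
        k = next((v for v, owners in A.items() if owners), None)
        if k is None:
            return None  # this greedy pass found no solution
        used = A.pop(k).pop()     # take k from one of its sets; k is now spent
        for owners in A.values():
            owners.discard(used)  # that set may not supply another element
        result.append(k)
    return result

print(choose3([{1,2,3}, {2,4}, {1,2}, {4,5}, {2,3,5}]))  # e.g. [1, 2, 3]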

I think my idea is a bit overkill, but it works on any kind of sets, with any number of sets of any size.
The idea is to transform the sets into a bipartite graph: on one side you have a vertex for each set, and on the other side a vertex for each number the sets contain.
If a set contains a number, there is an edge between those two vertices.
Eventually you're trying to find a maximum matching in the graph (maximum cardinality matching).
This can be done with the Hopcroft-Karp algorithm in O(E√V) time, or with a simpler Ford–Fulkerson-style augmenting-path approach in O(VE).
Here are some links with more on maximum matching and the algorithms:
https://en.wikipedia.org/wiki/Matching_(graph_theory)
https://en.wikipedia.org/wiki/Maximum_cardinality_matching
https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm
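For illustration, here is a short Python sketch of the augmenting-path approach on the set/element bipartite graph (my own code, not taken from the links above; Hopcroft-Karp is faster but longer to write):

def max_matching(sets):
    # match maps each element to the index of the set it is matched with.
    match = {}
    def augment(i, seen):
        # Try to match set i, recursively re-matching displaced sets.
        for e in sets[i]:
            if e not in seen:
                seen.add(e)
                if e not in match or augment(match[e], seen):
                    match[e] = i
                    return True
        return False
    for i in range(len(sets)):
        augment(i, set())
    return {i: e for e, i in match.items()}  # set index -> chosen element

print(max_matching([{1,2,3}, {2,4}, {1,2}, {4,5}, {2,3,5}]))

If the maximum matching has size at least 3, the required selection exists, and the matching edges tell you which element to take from which set.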

Related

Minimum number of additional weights required in order to weight items in range from 1 to 100

Given a set of weights S = {w1,w2,w3} and a range of weights, we need to determine whether the weights in S can be used to balance every weight in the range. If not, we need to add the minimum number of additional weights to S so that every weight in the range can be balanced.
For example:
Range is 1 to 5
S = {4,8,9}
The item with weight 1 can be balanced by putting it on the left pan along with the 8, and putting the 9 on the right pan.
1 + 8 = 9
3 + 9 = 8 + 4
4 = 4
5 + 8 = 9 + 4
But 2 can't be balanced using the weights {4,8,9}, so we need to add another weight. Adding a weight of 1 allows 2 to be balanced with
2 + 8 = 1 + 9
My question is: is there a mathematical algorithm that can be used to solve this problem?
There certainly are algorithms that would solve this. For clarity's sake, I'm assuming your use of the term "set" means the mathematical set, where all elements are distinct, though this should not affect the code below all that much.
Breaking down the problem into 2 parts:
(1) Determine if the provided set of weights can be arranged on the scale such that the required range of integer values are covered
A solution to part (1), in python:
(to run, call check_range(int, int, []), where the first two args are the integer bounds of the range, low/high respectively, and the 3rd arg is a list of the weights in set s)
def get_possible_offsets(s=[]):
    # the set "temp" holds the offsets we can create by arranging the weights on the scale
    temp = set()
    # we don't have to place any weight at all, so add the value 0 by default
    temp.add(0)
    # for every weight in the given set of weights
    for weight in s:
        # take an iterable snapshot of our current set of possible offsets
        l = list(temp)
        # for each value in that list, add i + w and |i - w|
        for i in l:
            temp.add(i + weight)
            temp.add(abs(i - weight))
        # and also add the weight by itself
        temp.add(weight)
    return temp
def check_range(r_low=0, r_high=1, s=[]):
    # get the set of offsets achievable with the provided weights
    possible_offsets = get_possible_offsets(s)
    # list of required values not achievable with the provided weights
    missing_offsets = []
    # for each required weight in the range, check whether it is achievable
    for i in range(r_low, r_high + 1):
        if i not in possible_offsets:
            missing_offsets.append(i)
    # if we're not missing any values from the required range, we are done!
    if len(missing_offsets) == 0:
        print("Yes! The required range is covered by the provided weights.")
    else:
        print("Tragically, the following weight offsets are not covered:", missing_offsets)
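For the question's example, this reports the missing offset (my invocation, following the calling convention described above):

check_range(1, 5, [4, 8, 9])
# -> Tragically, the following weight offsets are not covered: [2]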
(2) If 1. is false, then determine the minimum required additional weights to complete the required range
Part (2) of the problem I have not added a complete solution for yet. However, we just need to take the missing_offsets list from the above code and boil it down to the additional weight values that would extend the set of possible_offsets, as computed in these lines:
for i in l:
    temp.add(i + weight)
    temp.add(abs(i - weight))
This problem also sounds a lot like search-tree algos (though not binary), as well as combinatorics, so there are likely several efficient ways of calculating the desired output.
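As a sketch only (my code, and a greedy heuristic rather than a proven-minimal answer to part (2)): repeatedly add the smallest missing offset as a new weight and re-check.

def complete_range(r_low, r_high, s):
    # Greedy heuristic: add the smallest uncovered offset as a new weight
    # until the whole range is covered. Not proven to add the minimum
    # number of weights, but it handles the question's example.
    s, added = list(s), []
    while True:
        offsets = get_possible_offsets(s)
        missing = [i for i in range(r_low, r_high + 1) if i not in offsets]
        if not missing:
            return added
        s.append(missing[0])
        added.append(missing[0])

print(complete_range(1, 5, [4, 8, 9]))  # -> [2] (one weight, like the 1 added in the question)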
The set of absolute differences between the two sides of the scale is our range. Let's enumerate them for S, aggregating one element at a time (add and subtract each element to every previously seen absolute difference, then add the element itself as a difference):
S: {4, 8, 9}
up to element S_0:
4
up to element S_1:
4, 12, 8
up to element S_2:
4, 12, 8, 13, 5, 21, 3, 1, 17, 9
Now let's order them:
1, 3, 4, 5, 8, 9, 12, 13, 17, 21
To cover our range, 1 to 5, we need to fill the gap between 1 and 3. Adding a 1 will add ±1 to every difference we can create.
Would it not be the case that to cover any range we would need to add ceil(k / 2) 1's, where k is the maximum gap in our range when considering our enumerated differences? In this case, ceil(1 / 2) = one 1?
As ruakh commented below, this is not the case. Any lower range we can build can in fact be used to fill in gaps anywhere, and the coverage of the filled-in range can be applied again to ever-growing ranges. For example:
{1, 2} covers 1 to 3. Now add 7 and we've increased our range to 1-10, by virtue of applying ±3 to 7. Now we can add 21 and achieve the range 21 ± 10!
This points to the possibility of overlapping subproblems.
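A quick check of this claim, reusing get_possible_offsets from the answer above (my snippet):

print(sorted(x for x in get_possible_offsets([1, 2]) if x > 0))     # [1, 2, 3]
print(sorted(x for x in get_possible_offsets([1, 2, 7]) if x > 0))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]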

Way to Calculate Distinct Partitions using Subsets of a Set containing only one Kind of element

We know 3 things:
n (the number of elements in the set)
k (the number of parts)
set S = {x, x, x, ..., x (n times)} (here x can have any integral value)
We have to find the number of distinct partitions of the set S into k parts.
Is there any way (formula / procedure) to find the result using the given values?
EXAMPLES:
Input: n = 3, k = 2
Output: 4
Explanation: Let the set be {0,0,0} (assuming x=0), we can partition
it into 2 subsets in following ways
{{0,0}, {0}}, {{0}, {0,0}}, {{0,0,0},{}}
{{},{0,0,0}}.
Further, note that {{0,0}, {0}} is made up of 2 subsets, namely {0,0} and {0}, and it uses x (=0) exactly n (=3) times.
Input: n = 3, k = 1
Output: 1
Explanation: There is only one way {{1, 1, 1}} (assuming x=1)
Note:
I know I used the word "set" in the problem, but a set is defined as a collection of distinct elements, so you can consider it a multiset or an array, or assume for this particular problem that a set can hold repeated elements.
I am just trying to use the same terminology as in the problem.
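No answer is included here, but for what it's worth: with this reading (the k subsets are ordered, may be empty, and all elements are identical), each partition is determined solely by the k subset sizes, so the count is the number of ordered k-tuples of nonnegative integers summing to n, i.e. C(n+k-1, k-1) by stars and bars. A brute-force sketch to check the formula (my code):

from itertools import product
from math import comb

def count_partitions(n, k):
    # Count ordered k-tuples of nonnegative subset sizes summing to n.
    return sum(1 for sizes in product(range(n + 1), repeat=k) if sum(sizes) == n)

print(count_partitions(3, 2), comb(3 + 2 - 1, 2 - 1))  # 4 4
print(count_partitions(3, 1), comb(3 + 1 - 1, 1 - 1))  # 1 1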

Replace two elements with their absolute difference and generate the minimum possible element in array

I have an array of size n, and I can apply any number of operations (zero included) to it. In an operation, I can take any two elements and replace them with the absolute difference of the two elements. We have to find the minimum possible element that can be generated using the operation. (n < 1000)
Here's an example of how operation works. Let the array be [1,3,4]. Applying operation on 1,3 gives [2,4] as the new array.
Ex: 2 6 11 3 => ans = 0
This is because 11-6 = 5 and 5-3 = 2 and 2-2 = 0
Ex: 20 6 4 => ans = 2
Ex: 2 6 10 14 => ans = 0
Ex: 2 6 10 => ans = 2
Can anyone tell me how can I approach this problem?
Edit:
We can use recursion to generate all possible cases and pick the minimum element from them, but this has astronomical complexity, on the order of O((n^2)!).
Another approach I tried is sorting the array and then making a recursive call where, starting from either index 0 or index 1, I apply the operation to consecutive pairs of elements. This continues until only one element is left in the array, and we can return the minimum seen at any point in the recursion. This has complexity O(n^2) but doesn't necessarily give the right answer.
Ex: 2 6 10 15 => (4 5) & (2 4 15) => (1) & (2 15) & (2 11) => (13) & (9). The minimum of this will be 1 which is the answer.
When you choose two elements for the operation, you subtract the smaller one from the bigger one. So if you choose 1 and 7, the result is 7 - 1 = 6.
Now having 2 6 and 8 you can do:
8 - 2 -> 6 and then 6 - 6 = 0
You may also write it like this: 8 - 2 - 6 = 0
Let's consider a different operation: you can take two elements and replace them by either their sum or their difference.
Even though you can obtain completely different values using the new operation, the absolute value of the element closest to 0 will be exactly the same as using the old one.
First, let's try to solve this problem using the new operations, then we'll make sure that the answer is indeed the same as using the old ones.
What you are trying to do is to choose two non-intersecting subsets of the initial array, and then subtract the sum of all the elements of the second subset from the sum of all the elements of the first. You want to find two such subsets that the result is as close as possible to 0. This is an NP-hard problem in general, but it can be solved efficiently with a pseudo-polynomial algorithm similar to the one for the knapsack problem, in O(n * sum of all elements) time.
Each element of the initial array can belong to the positive set (the set whose sum you subtract from), the negative set (the set whose sum you subtract), or neither of them. In other words: each element can be added to the result, subtracted from the result, or left untouched. Say we have already calculated all values obtainable using the elements from the first to the i-th. Considering the (i+1)-th element, we can take any of the obtainable values and increase or decrease it by that element's value. After doing this with all the elements we have every value obtainable from the array, and we choose the one closest to 0.
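A sketch of that dynamic program in Python (my code; it tracks every signed sum obtainable using at least one element, which is exactly what the add/subtract/skip choices produce):

def min_reachable(arr):
    # sums holds every signed subset sum that uses at least one element.
    sums = set()
    for v in arr:
        sums |= {s + v for s in sums} | {s - v for s in sums} | {v, -v}
    return min(abs(s) for s in sums)

print(min_reachable([2, 6, 11, 3]))   # 0
print(min_reachable([20, 6, 4]))      # 2
print(min_reachable([2, 6, 10, 14]))  # 0
print(min_reachable([2, 6, 10]))      # 2

For n up to 1000 you would bound the tracked values by the total sum and use a boolean table instead of a set, which gives the O(n * sum of all elements) bound mentioned above.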
Now the harder part: why is this always a correct answer?
Let's consider the positive and negative sets from which we obtain the minimal result. We want to achieve it using the initial operations. Let's say that there are more elements in the negative set than in the positive set (otherwise swap them).
What if we have only one element in the positive set and only one element in the negative set? Then the absolute value of their difference is exactly the value our operation produces on that pair.
What if we have one element in the positive set and two in the negative one?
1) One of the negative elements is smaller than the positive element - then we just take those two and use the operation on them. The result becomes a new element of the positive set, and we are back in the previous case.
2) Both negative elements are smaller than the positive one. Then removing the bigger element from the negative set would give a result closer to 0, so this case cannot occur in a minimal solution.
Let's say we have n elements in the positive set and m elements in the negative set (n <= m), and we are able to obtain the absolute value of the difference of their sums (call it x) by using some operations. Now let's add an element to the negative set. If the difference before adding the new element was negative, decreasing it by any other number would make it smaller, that is, farther from 0, so that is impossible; the difference must have been positive. Then we can use our operation on x and the new element to get the result.
Now the second case: let's say we have n elements in the positive set and m elements in the negative set (n < m), and we are able to obtain the absolute value of the difference of their sums (again call it x) by using some operations. Now we add a new element to the positive set. Similarly, the difference must have been negative, so x plays the role of an element of the negative set. Then we obtain the result by applying the operation to x and the new element.
Using induction we can prove that the answer is always correct.

Disperse Duplicates in an Array

Source: Google Interview Question
Write a routine to ensure that identical elements in the input are maximally spread in the output.
Basically, we need to place the same elements in such a way that the TOTAL spreading is as maximal as possible.
Example:
Input: {1,1,2,3,2,3}
Possible Output: {1,2,3,1,2,3}
Total dispersion = difference between positions of 1's + 2's + 3's = (4-1) + (5-2) + (6-3) = 9.
I am NOT AT ALL sure whether there's an optimal polynomial-time algorithm available for this. Also, no other detail is provided for the question other than this.
What I thought is: calculate the frequency of each element in the input, then arrange them in the output, one distinct element at a time, until all the frequencies are exhausted.
I am not sure of my approach.
Any approaches/ideas, people?
I believe this simple algorithm would work:
count the number of occurrences of each distinct element.
make a new list
add one instance of all elements that occur more than once to the list (order within each group does not matter)
add one instance of all unique elements to the list
add one instance of all elements that occur more than once to the list
add one instance of all elements that occur more than twice to the list
add one instance of all elements that occur more than thrice to the list
...
Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}
However, I think this is the best spread you can get given the scoring function provided.
Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.
for a sum-of-squared-distances score, the problem becomes harder.
Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?
In Perl:
@a = (9,9,9,2,2,2,1,1,1);
Then make a hash table of the counts of the different numbers in the list, like a frequency table:
map { $x{$_}++ } @a;
Then repeatedly walk through all the keys found, with the keys in a known order, adding numbers to an output list until all the counts are exhausted:
@r = ();
$g = 1;
while ( $g == 1 ) {
    $g = 0;
    for my $n (sort keys %x) {
        if ($x{$n} > 0) {   # note: > 0, not > 1, or the last copy of each key is dropped
            push @r, $n;
            $x{$n}--;
            $g = 1;
        }
    }
}
I'm sure that this could be adapted to any programming language that supports hash tables.
Python code for the algorithm suggested by Vorsprung and HugoRune:
from collections import Counter, defaultdict

def max_spread(data):
    cnt = Counter(data)
    res, num = [], list(cnt)
    while len(cnt) > 0:
        for i in num:
            if cnt[i] > 0:  # check the remaining count of element i
                res.append(i)
                cnt[i] -= 1
                if cnt[i] == 0:
                    del cnt[i]
    return res

def calc_spread(data):
    # positions of each value; spread = last position - first position
    d = defaultdict(list)
    for i, v in enumerate(data):
        d[v].append(i)
    return sum(max(x) - min(x) for x in d.values())
HugoRune's answer takes some advantage of the unusual scoring function but we can actually do even better: suppose there are d distinct non-unique values, then the only thing that is required for a solution to be optimal is that the first d values in the output must consist of these in any order, and likewise the last d values in the output must consist of these values in any (i.e. possibly a different) order. (This implies that all unique numbers appear between the first and last instance of every non-unique number.)
The relative order of the first copies of non-unique numbers doesn't matter, and likewise nor does the relative order of their last copies. Suppose the values 1 and 2 both appear multiple times in the input, and that we have built a candidate solution obeying the condition I gave in the first paragraph that has the first copy of 1 at position i and the first copy of 2 at position j > i. Now suppose we swap these two elements. Element 1 has been pushed j - i positions to the right, so its score contribution will drop by j - i. But element 2 has been pushed j - i positions to the left, so its score contribution will increase by j - i. These cancel out, leaving the total score unchanged.
Now, any permutation of elements can be achieved by swapping elements in the following way: swap the element in position 1 with the element that should be at position 1, then do the same for position 2, and so on. After the ith step, the first i elements of the permutation are correct. We know that every swap leaves the scoring function unchanged, and a permutation is just a sequence of swaps, so every permutation also leaves the scoring function unchanged! This holds for the d elements at both ends of the output array.
When 3 or more copies of a number exist, only the position of the first and last copy contribute to the distance for that number. It doesn't matter where the middle ones go. I'll call the elements between the 2 blocks of d elements at either end the "central" elements. They consist of the unique elements, as well as some number of copies of all those non-unique elements that appear at least 3 times. As before, it's easy to see that any permutation of these "central" elements corresponds to a sequence of swaps, and that any such swap will leave the overall score unchanged (in fact it's even simpler than before, since swapping two central elements does not even change the score contribution of either of these elements).
This leads to a simple O(nlog n) algorithm (or O(n) if you use bucket sort for the first step) to generate a solution array Y from a length-n input array X:
1. Sort the input array X.
2. Use a single pass through X to count the number of distinct non-unique elements. Call this d.
3. Set i, j and k to 0.
4. While i < n:
   - If X[i+1] == X[i], we have a non-unique element:
     - Set Y[j] = Y[n-j-1] = X[i].
     - Increment i twice, and increment j once.
     - While X[i] == X[i-1]:
       - Set Y[d+k] = X[i].
       - Increment i and k.
   - Otherwise we have a unique element:
     - Set Y[d+k] = X[i].
     - Increment i and k.
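A direct Python rendering of these steps (my sketch, not tested beyond the examples in this thread):

def spread(X):
    X = sorted(X)  # step 1
    n = len(X)
    # step 2: d = number of distinct values that occur more than once
    d = sum(1 for i in range(n - 1)
            if X[i] == X[i + 1] and (i == 0 or X[i] != X[i - 1]))
    Y = [None] * n
    i = j = k = 0  # step 3
    while i < n:   # step 4
        if i + 1 < n and X[i + 1] == X[i]:  # non-unique element
            Y[j] = Y[n - j - 1] = X[i]
            i += 2
            j += 1
            while i < n and X[i] == X[i - 1]:  # middle copies go to the centre
                Y[d + k] = X[i]
                i += 1
                k += 1
        else:  # unique element
            Y[d + k] = X[i]
            i += 1
            k += 1
    return Y

print(spread([1, 1, 2, 3, 2, 3]))  # e.g. [1, 2, 3, 3, 2, 1], dispersion 9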

Algorithm to count the number of valid blocks in a permutation [duplicate]

Possible duplicate of: Finding sorted sub-sequences in a permutation
Given an array A which holds a permutation of 1,2,...,n. A sub-block A[i..j]
of an array A is called a valid block if all the numbers appearing in A[i..j]
are consecutive numbers (may not be in order).
Given an array A= [ 7 3 4 1 2 6 5 8] the valid blocks are [3 4], [1,2], [6,5],
[3 4 1 2], [3 4 1 2 6 5], [7 3 4 1 2 6 5], [7 3 4 1 2 6 5 8]
So the count for above permutation is 7.
Give an O(n log n) algorithm to count the number of valid blocks.
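(A note before the answers: since A is a permutation, a block is valid exactly when max - min equals its length minus 1, which gives a trivial O(n^2) brute force. It is useful as a baseline for checking the cleverer approaches below; this snippet is mine, not from any answer.)

def count_valid_blocks(A):
    # Count blocks of length >= 2 whose values are consecutive integers.
    count = 0
    for i in range(len(A)):
        lo = hi = A[i]
        for j in range(i + 1, len(A)):
            lo, hi = min(lo, A[j]), max(hi, A[j])
            if hi - lo == j - i:
                count += 1
    return count

print(count_valid_blocks([7, 3, 4, 1, 2, 6, 5, 8]))  # 7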
Ok, I am down to 1 rep because I put 200 bounty on a related question: Finding sorted sub-sequences in a permutation
so I cannot leave comments for a while.
I have an idea:
1) Locate all permutation groups. They are: (78), (34), (12), (65). Unlike in group theory, their order and position, and whether they are adjacent, matter. So a group (78) can be represented as a structure (7, 8, false), while (34) would be (3, 4, true). I am using Python's notation for tuples, but it might actually be better to use a whole class for the group. Here true or false means contiguous or not. Two groups are "adjacent" if (max(gp1) == min(gp2) + 1 or max(gp2) == min(gp1) + 1) and contiguous(gp1) and contiguous(gp2). This is not the only condition for union(gp1, gp2) to be contiguous, because (14) and (23) combine into (14) nicely. This is a great question for algo class homework, but a terrible one for an interview. I suspect this is homework.
Just some thoughts:
At first sight, this sounds impossible: a fully sorted array would have O(n^2) valid sub-blocks.
So, you would need to count more than one valid sub-block at a time. Checking the validity of a sub-block is O(n). Checking whether a sub-block is fully sorted is O(n) as well. A fully sorted sub-block contains n·(n - 1)/2 valid sub-blocks, which you can count without further breaking this sub-block up.
Now, the entire array is obviously always valid. For a divide-and-conquer approach, you would need to break this up. There are two conceivable breaking points: the location of the highest element, and that of the lowest element. If you break the array into two at one of these points, including the extremum in the part that contains the second-to-extreme element, there cannot be a valid sub-block crossing this break-point.
By always choosing the extremum that produces a more even split, this should work quite well (average O(n log n)) for "random" arrays. However, I can see problems when your input is something like (1 5 2 6 3 7 4 8), which seems to produce O(n^2) behaviour. (1 4 7 2 5 8 3 6 9) would be similar (I hope you see the pattern). I currently see no trick to catch this kind of worst case, but it seems that it requires other splitting techniques.
This question does involve a bit of a "math trick", but it's fairly straightforward once you get it. However, the rest of my solution won't fit the O(n log n) criterion.
The math portion:
For any two consecutive numbers, their sum is 2k+1, where k is the smaller of them. For three it is 3k+3, for four it is 4k+6, and for N such numbers it is Nk + N(N-1)/2 (that is, Nk plus the sum 1 + 2 + ... + (N-1)). Hence, you need two pieces of information, which can be computed simultaneously:
Create the sum of all the sub-arrays.
Determine the smallest element of a sub-array.
The dynamic programming portion
Build two tables, using the previous row's entries to build each successive row's entries. Unfortunately, I'm totally wrong, as this would still necessitate n^2 sub-array checks. Ugh!
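Still, the trick itself is sound. A one-function sketch of the validity test it yields (my code; because the array is a permutation, the elements of a block are distinct, and then the sum check alone is sufficient):

def is_valid(block):
    # Distinct integers with minimum k sum to N*k + N*(N-1)/2
    # exactly when they are the consecutive run k, k+1, ..., k+N-1.
    n, k = len(block), min(block)
    return sum(block) == n * k + n * (n - 1) // 2

print(is_valid([3, 4, 1, 2]))  # True
print(is_valid([7, 3, 4]))     # False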
My proposition
STEP = 2 // number of examined elements
B [0,0,0,0,0,0,0,0]
B [1,1,0,0,0,0,0,0]
VALID(A,B) - if not valid, shift the window by one
B [0,1,1,0,0,0,0,0]
VALID(A,B) - if valid, shift the window by STEP
B [0,0,0,1,1,0,0,0]
VALID (A,B)
B [0,0,0,0,0,1,1,0]
STEP = 3
B [1,1,1,0,0,0,0,0] not ok
B [0,1,1,1,0,0,0,0] ok
B [0,0,0,0,1,1,1,0] not ok
STEP = 4
B [1,1,1,1,0,0,0,0] not ok
B [0,1,1,1,1,0,0,0] ok
.....
CON <- 0
STEP <- 2
i <- 0
j <- 0
WHILE (STEP <= LEN(A)) DO
    j <- STEP
    WHILE (STEP <= LEN(A) - j) DO
        IF (VALID(A, i, j)) DO
            CON <- CON + 1
            i <- j + 1
            j <- j + STEP
        ELSE
            i <- i + 1
            j <- j + 1
        END
    END
    STEP <- STEP + 1
END
The VALID method checks that all the elements in the window are consecutive.
Never tested, but it might be OK.
The original array doesn't contain duplicates, so it must itself be a consecutive block. Let's call this block (1 ~ n). We can test whether block (2 ~ n) is consecutive by checking whether the first element is 1 or n, which is O(1). Likewise we can test block (1 ~ n-1) by checking whether the last element is 1 or n.
I can't quite mould this into a solution that works, but maybe it will help someone along...
Like everybody else, I'm just throwing this out ... it works for the single example below, but YMMV!
The idea is to count the number of illegal sub-blocks, and subtract this from the total possible number. We count the illegal ones by examining each array element in turn and ruling out sub-blocks that include the element but not its predecessor or successor.
Foreach i in [1,N], compute B[A[i]] = i.
Let Count = the total number of sub-blocks with length>1, which is N-choose-2 (one for each possible combination of starting and ending index).
Foreach i, consider A[i]. Ignoring edge cases, let x = A[i]-1, and let y = A[i]+1. A[i] cannot participate in any sub-block that does not include x or y. Let iX = B[x] and iY = B[y]. There are several cases to be treated independently here. The general case is that iX < i < iY. In this case, we can eliminate the sub-block A[iX+1 .. iY-1] and all intervening blocks containing i. There are (i - iX + 1) * (iY - i + 1) such sub-blocks, so call this number Eliminated. (Other cases are left as an exercise for the reader, as are the edge cases.) Set Count = Count - Eliminated.
Return Count.
The total cost appears to be N * (cost of step 2) = O(N).
WRINKLE: In step 2, we must be careful not to eliminate each sub-interval more than once. We can accomplish this by only eliminating sub-intervals that lie fully or partly to the right of position i.
Example:
A = [1, 3, 2, 4]
B = [1, 3, 2, 4]
Initial count = (4*3)/2 = 6
i=1: A[i]=1, so need sub-blocks with 2 in them. We can eliminate [1,3] from consideration. Eliminated = 1, Count -> 5.
i=2: A[i]=3, so need sub-blocks with 2 or 4 in them. This rules out [1,3] but we already accounted for it when looking right from i=1. Eliminated = 0.
i=3: A[i] = 2, so need sub-blocks with [1] or [3] in them. We can eliminate [2,4] from consideration. Eliminated = 1, Count -> 4.
i=4: A[i] = 4, so we need sub-blocks with [3] in them. This rules out [2,4] but we already accounted for it when looking right from i=3. Eliminated = 0.
Final Count = 4, corresponding to the sub-blocks [1,3,2,4], [1,3,2], [3,2,4] and [3,2].
(This is an attempt to do this N.log(N) worst case. Unfortunately it's wrong -- it sometimes undercounts. It incorrectly assumes you can find all the blocks by looking at only adjacent pairs of smaller valid blocks. In fact you have to look at triplets, quadruples, etc, to get all the larger blocks.)
You do it with a struct that represents a subblock and a queue for subblocks.
struct c_subblock
{
    int index;            /* index into original array, head of subblock */
    int width;            /* width of subblock > 0 */
    int lo_value;
    c_subblock * p_above; /* null or subblock above with same index */
};
Alloc an array of subblocks the same size as the original array, and init each subblock to have exactly one item in it. Add them to the queue as you go. If you start with array [ 7 3 4 1 2 6 5 8 ] you will end up with a queue like this:
queue: ( [7,7] [3,3] [4,4] [1,1] [2,2] [6,6] [5,5] [8,8] )
The { index, width, lo_value, p_above } values for subblock [7,7] will be { 0, 1, 7, null }.
Now it's easy. Forgive the c-ish pseudo-code.
loop {
    c_subblock * const p_left = Pop subblock from queue.
    int const right_index = p_left.index + p_left.width;
    if ( right_index < length of original array ) {
        // Find the adjacent subblock on the right.
        // To do this you'll need the original array of length-1 subblocks.
        c_subblock const * p_right = array_basic_subblocks[ right_index ];
        do {
            Check the left/right subblocks to see if the two merged are also a subblock.
            If they are, add a new merged subblock to the end of the queue.
            p_right = p_right.p_above;
        } while ( p_right );
    }
}
This will find them all, I think. It's usually O(N log N), but it'll be O(N^2) for a fully sorted or anti-sorted list. I think there's an answer to this, though -- when you build the original array of subblocks, you look for sorted and anti-sorted sequences and add them as base-level subblocks. If you are keeping a count, increment it by width * (width + 1) / 2 for the base level. That'll give you the count INCLUDING all the 1-length subblocks.
After that just use the loop above, popping and pushing the queue. If you're counting you'll have to have a multiplier on both the left and right subblocks and multiply these together to calculate the increment. The multiplier is the width of the leftmost (for p_left) or rightmost (for p_right) base-level subblock.
Hope this is clear and not too buggy. I'm just banging it out, so it may even be wrong.
[Later note. This doesn't work after all. See note below.]
