finding the sum of smaller elements on left - algorithm

I came across the problem of finding the number of smaller elements to the left of each element in an array of integers, which can be solved in O(nlgn) using balanced binary search trees (AVL, etc.) or Merge Sort. Using an AVL tree one can calculate the size of the left subtree for each element, and this would be the required answer. However, I can't figure out how to calculate the sum of the smaller elements to the left of each element efficiently. For each element, do I have to traverse the left subtree and sum the values at the nodes, or is there a better way (using Merge Sort, etc.)?
E.g. for the array 4, 7, 1, 3, 2 the required answer would be 0, 4, 0, 1, 1.
Thanks.

In a binary search tree you can store the number of child nodes for every node. This allows you to find the number of nodes preceding each node (the number of smaller elements).
For this task, you can additionally store the sum of child node values for every node of the binary search tree. This allows you to find the sum of the values of the preceding nodes (the sum of the smaller elements), also in O(n*log(n)).
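As an illustration, here is a minimal Python sketch of that bookkeeping: a plain (unbalanced) BST in which every node carries the sum of the values in its subtree, so an insertion can report the sum of the strictly smaller values already present. A self-balancing tree (AVL, red-black) would be needed to actually guarantee O(log n) per insertion.

class Node:
    def __init__(self, val):
        self.val = val
        self.subtree_sum = val  # sum of all values stored in this subtree
        self.left = None
        self.right = None

def insert(root, x):
    """Insert x; return (new root, sum of values < x inserted so far)."""
    if root is None:
        return Node(x), 0
    root.subtree_sum += x
    if x <= root.val:
        root.left, smaller = insert(root.left, x)
        return root, smaller
    # the whole left subtree and the root itself are < x
    left_sum = root.left.subtree_sum if root.left else 0
    root.right, smaller = insert(root.right, x)
    return root, smaller + left_sum + root.val

def sums_of_smaller_on_left(a):
    root, out = None, []
    for x in a:
        root, s = insert(root, x)
        out.append(s)
    return out

print(sums_of_smaller_on_left([4, 7, 1, 3, 2]))  # [0, 4, 0, 1, 1]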

Check this tutorial on Binary Indexed Trees. This is a structure that uses O(n) memory and can process the following tasks:
1. Change the value of a[i] by x; call this add(i, x).
2. Return the sum of all a[i] for i <= m; call this get(m).
Both run in O(log n).
Now, how to use this for your task. You can do it in 2 steps.
Step 1. Copy, sort and remove duplicates from the original array. Now you can remap the numbers so that they are in the range [1...n].
Step 2. Walk through the array from left to right. Let A[i] be the value in the original array and new[i] the mapped value. (If A = [2, 7, 11, -3, 7] then new = [2, 3, 4, 1, 2].)
The answer for position i is get(new[i] - 1).
Then update the values: add(new[i], 1) for counting, add(new[i], A[i]) for the sum.
All in all: sorting and remapping are O(n log n), and working through the array is n * O(log n) = O(n log n), so the total complexity is O(n log n).
Alternatively, use a treap (in Russian).
EDIT: Building the new array.
Suppose the original array is A = [2, 7, 11, -3, 7].
Copy it to B and sort: B = [-3, 2, 7, 7, 11].
Remove duplicates: B = [-3, 2, 7, 11].
Now, to get new, you can either:
add all the elements to a map in increasing order, e.g. (-3 -> 1, 2 -> 2, 7 -> 3, 11 -> 4), or
for each element in A, do a binary search over B.
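Putting both steps together, here is a minimal Python sketch of this approach; bisect over the deduplicated sorted copy does the remapping, and add/get are the Fenwick operations described above, here storing value sums for the sum-of-smaller variant:

import bisect

def sums_of_smaller_bit(a):
    ranks = sorted(set(a))  # step 1: sorted, deduplicated copy
    m = len(ranks)
    tree = [0] * (m + 1)    # Fenwick tree over ranks 1..m

    def add(i, x):          # change the value at rank i by x
        while i <= m:
            tree[i] += x
            i += i & -i

    def get(i):             # sum over ranks 1..i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    out = []
    for v in a:             # step 2: left-to-right walk
        r = bisect.bisect_left(ranks, v) + 1  # new[i], the 1-based rank of v
        out.append(get(r - 1))                # sum of strictly smaller values so far
        add(r, v)                             # record v for later elements
    return out

print(sums_of_smaller_bit([4, 7, 1, 3, 2]))  # [0, 4, 0, 1, 1]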

The following code has a complexity of O(nlogn).
It uses a binary indexed tree to solve the problem.
#include <cstdio>
using namespace std;

const int MX_RANGE = 100000, MX_SIZE = 100000;
int tree[MX_RANGE + 1] = {0}, a[MX_SIZE]; // tree is 1-indexed, so it needs one extra slot

int main() {
    int n, mn = MX_RANGE, shift = 0;
    scanf("%d", &n);
    for (int i = 0; i < n; i++) {
        scanf("%d", &a[i]);
        if (a[i] < mn) mn = a[i];
    }
    shift = 1 - mn; // we need to remap all values to start from 1
    for (int i = 0; i < n; i++) {
        // Read answer: sum over all remapped values strictly below a[i]+shift
        int sum = 0, idx = a[i] + shift - 1;
        while (idx > 0) {
            sum += tree[idx];
            idx -= (idx & -idx);
        }
        printf("%d ", sum);
        // Update tree with this element's value
        idx = a[i] + shift;
        while (idx <= MX_RANGE) {
            tree[idx] += a[i];
            idx += (idx & -idx);
        }
    }
    printf("\n");
    return 0;
}


Find the 1 non-repeating element in a given array without XOR or Map in O(n)

How can I find the one non-repeating element in an array in which all other elements appear exactly twice, when I'm not allowed to use a hash map or the XOR operator, in O(n) time complexity?
Examples:
Input
arr[] = {14, 1, 14, 4, 12, 2, 1, 2, 3, 3}
Output
4
If you want to do it in JavaScript, then inside a for loop you can check whether the first index and the last index of an item are the same; if so, return that item, otherwise return -1.
function getFirstDistinctNumber() {
    const arr = [14, 1, 14, 4, 12, 2, 1, 2, 3, 3];
    for (let i = 0; i < arr.length; i++) {
        if (arr.indexOf(arr[i]) == arr.lastIndexOf(arr[i])) {
            return arr[i];
        }
    }
    return -1;
}
console.log(getFirstDistinctNumber());
And in Java you can do the same, but lastIndexOf() is not available for arrays, so you can do it by creating an ArrayList:
import java.util.*;

class FindDistinct {
    public static void main(String[] args) {
        // create an empty list
        ArrayList<Integer> inputList = new ArrayList<Integer>();
        // use add() to put the values into the list
        inputList.add(14);
        inputList.add(1);
        inputList.add(14);
        inputList.add(4);
        inputList.add(12);
        inputList.add(2);
        inputList.add(1);
        inputList.add(2);
        inputList.add(3);
        inputList.add(3);
        for (int i = 0; i < inputList.size(); i++) {
            if (inputList.indexOf(inputList.get(i)) == inputList.lastIndexOf(inputList.get(i))) {
                System.out.println(inputList.get(i));
                break;
            }
        }
    }
}
Sorting the array and then using a stack, you can find the required element:
# code is in Python
arr = [14, 1, 14, 4, 12, 2, 1, 2, 3, 3, 12]
# sort the array
arr = sorted(arr)
# use a stack to find the required element
stack = []
for ele in arr:
    if len(stack) == 0:
        stack.append(ele)
    elif stack[-1] == ele:
        stack.pop()
    else:
        stack.append(ele)
print(stack[-1])  # the item with a single occurrence
# output: 4
There's a way to compute this in O(1) space and O(n log n) time: simply binary search the value. For a given number x, count the elements that are less than or equal to x; if this count is odd then the number you're looking for is less than or equal to x, and if it's even, it's greater.
(Technically the running time is O(n log k), where k is max_value - min_value over the elements in the array, but it can be modified to work in O(n log n) if needed.)
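A short Python sketch of this parity-based binary search (it assumes every element except one occurs exactly twice, so the example below uses the 11-element input from the stack answer above):

def lonely_number(arr):
    # Binary search on the value range: if the count of elements <= mid is
    # odd, the unpaired element is <= mid; otherwise it is greater.
    # O(n log k) time, O(1) extra space, where k is the value range.
    lo, hi = min(arr), max(arr)
    while lo < hi:
        mid = (lo + hi) // 2
        if sum(1 for v in arr if v <= mid) % 2 == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo

print(lonely_number([14, 1, 14, 4, 12, 2, 1, 2, 3, 3, 12]))  # 4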
I found the answer to the problem using this video: https://www.youtube.com/watch?v=aZneq1PWFkg
You can get the median in O(n) time, and then in O(n) time you can partition the numbers so that everything less than the median is on its left and everything greater than or equal to it is on its right. Now everything is arranged so that the smaller values are on the left of the median and the bigger ones on the right.
Then you search the array once more to see if the median has a twin. If it doesn't, you're done: that's your lonely number. If you found the twin on the left side and the index of the median is odd, then you take everything on the left side of the median, including it (and throw the others away); the same goes for finding the twin on the right side. If the median's index is even, you take the opposite side to the one where you found the twin, without including the median. You keep applying the same algorithm from the beginning until you've found it.
This gives T(n) = T(n/2) + Θ(n),
and with the Master theorem you get Θ(n).
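Here is a simplified Python sketch of this idea. Instead of the twin/index-parity bookkeeping described above, it recurses on whichever side of the median contains an odd number of elements, which is equivalent when every other value occurs exactly twice. statistics.median_low sorts internally, so the sketch runs in O(n log n); a true O(n) selection (median of medians) would give the claimed Θ(n).

import statistics

def lonely_number_median(arr):
    while True:
        m = statistics.median_low(arr)
        less = [v for v in arr if v < m]
        equal = [v for v in arr if v == m]
        if len(equal) % 2 == 1:  # the median has no twin: found it
            return m
        greater = [v for v in arr if v > m]
        # the unpaired value sits on the side with an odd element count
        arr = less if len(less) % 2 == 1 else greater

print(lonely_number_median([14, 1, 14, 4, 12, 2, 1, 2, 3, 3, 12]))  # 4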
In C++, a brute-force approach (note that the nested loops make this O(n^2), not O(n)):

int arrayUnique(int *arr, int size)
{
    int count;
    for (int i = 0; i < size; i++)
    {
        count = 0;
        for (int j = 0; j < size; j++)
        {
            if (i == j) {
                continue;
            }
            if (arr[i] == arr[j]) {
                count = 1;
            }
        }
        if (count == 0) {
            return arr[i];
        }
    }
    return -1; // no unique element found
}

Sorted squares of numbers in a list in O(n)?

Given a list of integers in sorted order, say, [-9, -2, 0, 2, 3], we have to square each element and return the result in a sorted order. So, the output would be: [0, 4, 4, 9, 81].
I could figure out two approaches:
O(NlogN) approach - We insert the square of each element in a hashset. Then copy the elements into a list, sort it and then return it.
O(n) approach - If there is a bound for the input elements (say -100 to 100), then we create a boolean list big enough to index every possible square (here 0 to 10000). For each of the input elements, we mark the corresponding square number as true. E.g., for 9 in the input, I will mark 81 in the boolean array as true. Then traverse this boolean list and insert all the true indices into a return list. Note that in this we make an assumption - that there is a bound for the input elements.
Is there some way in which we could do it in O(n) time even without assuming any bounds for the input?
Well, I can think of an O(n) approach:
Split the input into 2 lists, one with the negative numbers (call this list A) and one with the positive numbers and 0 (list B). This is done while preserving the input order, which is trivial: O(n).
Reverse list A. We do this because, once squared, the greater-than relation between the elements is flipped.
Square every item of both lists in place: O(n).
Run a merge operation not unlike that of a merge sort: O(n).
Total: O(n)
Done :)
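A minimal Python sketch of these steps (list comprehensions do the split and the squaring, slicing does the reversal, and the merge is the usual two-index walk):

def sorted_squares(nums):
    neg = [x * x for x in nums if x < 0][::-1]  # ascending after reversal
    pos = [x * x for x in nums if x >= 0]
    out, i, j = [], 0, 0
    while i < len(neg) and j < len(pos):        # merge two sorted lists
        if neg[i] <= pos[j]:
            out.append(neg[i])
            i += 1
        else:
            out.append(pos[j])
            j += 1
    out.extend(neg[i:])
    out.extend(pos[j:])
    return out

print(sorted_squares([-9, -2, 0, 2, 3]))  # [0, 4, 4, 9, 81]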
Is there some way in which we could do it in O(n) time even without assuming any bounds for the input?
Absolutely.
Since the original list is already sorted you are in luck!
given two numbers x and y
if |x| > |y| then x^2 > y^2
So all you have to do is split the list into two parts, one with all the negative numbers and one with all the positive ones.
Reverse the negative part and make the values positive.
Then merge those two sorted lists into one, as in a merge sort. This runs in O(n) since both lists are sorted.
From there you can just calculate the square of each element and put it into the new list.
We can achieve it with the 2-pointer technique: one pointer at the start and the other at the end. Compare the squares and move the pointers accordingly, placing the larger square at the end of the new list each time.
Time = O(n)
Space = O(n)
Can you do it in place, to reduce the space complexity?
This can be done with O(n) time and space. We need two pointers. The following is the Java code:
public int[] sortedSquares(int[] A) {
    int i = 0;
    int j = A.length - 1;
    int[] result = new int[A.length];
    int count = A.length - 1;
    while (count >= 0) {
        if (Math.abs(A[i]) > Math.abs(A[j])) {
            result[count] = A[i] * A[i];
            i++;
        } else {
            result[count] = A[j] * A[j];
            j--;
        }
        count--;
    }
    return result;
}
Start from the end and compare the absolute values, then build the answer:
class Solution {
    public int[] sortedSquares(int[] nums) {
        int left = 0;
        int right = nums.length - 1;
        int index = nums.length - 1;
        int result[] = new int[nums.length];
        while (left <= right)
        {
            if (Math.abs(nums[left]) > Math.abs(nums[right]))
            {
                result[index] = nums[left] * nums[left];
                left++;
            }
            else
            {
                result[index] = nums[right] * nums[right];
                right--;
            }
            index--;
        }
        return result;
    }
}
Using the naive approach this question is very easy, but it requires O(nlogn) complexity. To solve it in O(n), the two-pointer method is the best approach.
Create a new result array with the same length as the given array, and keep a write index at the end of it.
Assign one pointer to the start of the array and another to the end; the highest square must come from one of the two ends.
[-9, -2, 0, 2, 3]
Compare the absolute values of -9 and 3.
If the left one is larger, store its square at the write index, decrease the write index and advance the left pointer; otherwise do the same with the square of the right value and move the right pointer inward.
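For completeness, the same two-pointer idea as a short Python sketch:

def sorted_squares_two_pointers(nums):
    # The larger absolute value at either end yields the next-largest
    # square, written from the back of the result array.
    left, right = 0, len(nums) - 1
    result = [0] * len(nums)
    for idx in range(len(nums) - 1, -1, -1):
        if abs(nums[left]) > abs(nums[right]):
            result[idx] = nums[left] * nums[left]
            left += 1
        else:
            result[idx] = nums[right] * nums[right]
            right -= 1
    return result

print(sorted_squares_two_pointers([-9, -2, 0, 2, 3]))  # [0, 4, 4, 9, 81]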
Python 3 solution that rearranges the list in place (O(1) extra space). Note that list.insert and list.pop at the front or middle are O(n) each, so strictly speaking this is O(N^2) time rather than O(N).
def sorted_squArrres(Arr: list) -> list:
    i = 0
    j = len(Arr) - 1
    while i < len(Arr):
        if Arr[i] * Arr[i] < Arr[j] * Arr[j]:
            Arr.insert(0, Arr[j] * Arr[j])
            Arr.pop(j + 1)
            i += 1
            continue
        if Arr[i] * Arr[i] > Arr[j] * Arr[j]:
            Arr.insert(0, Arr[i] * Arr[i])
            Arr.pop(i + 1)
            i += 1
            continue
        else:
            if i != j:
                Arr.insert(0, Arr[j] * Arr[j])
                Arr.insert(0, Arr[j + 1] * Arr[j + 1])
                Arr.pop(j + 2)
                Arr.pop(i + 2)
                i += 2
            else:
                Arr.insert(0, Arr[j] * Arr[j])
                Arr.pop(j + 1)
                i += 1
    return Arr

X = [[-4, -3, -2, 0, 3, 5, 6], [1, 2, 3, 4, 5], [-5, -4, -3, -2, -1], [-9, -2, 0, 2, 3]]
for i in X:
    # looping over different kinds of inputs
    print(sorted_squArrres(i))
# outputs
'''
[0, 4, 9, 9, 16, 25, 36]
[1, 4, 9, 16, 25]
[1, 4, 9, 16, 25]
[0, 4, 4, 9, 81]
'''

Maximize sum of list with no more than k consecutive elements from input

I have an array of N numbers, and I want to remove elements from it so that in the resulting list there are never more than K numbers adjacent to each other. Multiple lists can be created under this restriction; I want the one in which the sum of the remaining numbers is maximal, and as output I print that sum only.
The algorithm I have come up with so far has a time complexity of O(n^2). Is there a better algorithm for this problem?
Link to the question.
Here's my attempt:
#include <stdio.h>

int main()
{
    // Total number of elements in the list
    int count = 6;
    // Maximum number of elements that can be together
    int maxTogether = 1;
    // The list of numbers
    int billboards[] = {4, 7, 2, 0, 8, 9};
    int maxSum = 0;
    for (int k = 0; k <= maxTogether; k++) {
        int sum = 0;
        int size = k;
        for (int i = 0; i < count; i++) {
            if (size != maxTogether) {
                sum += billboards[i];
                size++;
            } else {
                size = 0;
            }
        }
        printf("%i\n", sum);
        if (sum > maxSum) {
            maxSum = sum;
        }
    }
    return 0;
}
The O(NK) dynamic programming solution is fairly easy:
Let A[i] be the best sum of the elements to the left subject to the not-k-consecutive constraint (assuming we're removing the i-th element as well).
Then we can calculate A[i] by looking back K elements:
A[i] = 0;
for j = 1 to k
    A[i] = max(A[i], A[i-j])
A[i] += input[i]
And, at the end, just look through the last k elements from A, adding the elements to the right to each and picking the best one.
But this is too slow.
Let's do better.
So A[i] finds the best from A[i-1], A[i-2], ..., A[i-K+1], A[i-K].
So A[i+1] finds the best from A[i], A[i-1], A[i-2], ..., A[i-K+1].
There's a lot of redundancy there - we already know the best from indices i-1 through i-K because of A[i]'s calculation, but then we find the best of all of those except i-K (with i) again in A[i+1].
So we can just store all of them in an ordered data structure and then remove A[i-K] and insert A[i]. My choice - A binary search tree to find the minimum, along with a circular array of size K+1 of tree nodes, so we can easily find the one we need to remove.
I swapped the problem around to make it slightly simpler - instead of finding the maximum of remaining elements, I find the minimum of removed elements and then return total sum - removed sum.
High-level pseudo-code:
for each i in input
    add (i + the smallest value in the BST) to the BST
    add the above node to the circular array
    if it wrapped around, remove the overwritten element from the BST
// now the BST holds the nodes for the last k+1 positions
return (the total sum - the smallest value in the BST)
Running time:
O(n log k)
Java code:
int getBestSum(int[] input, int K)
{
    Node[] array = new Node[K + 1];
    TreeSet<Node> nodes = new TreeSet<Node>();
    Node n = new Node(0);
    nodes.add(n);
    array[0] = n;
    int arrPos = 0;
    int sum = 0;
    for (int i : input)
    {
        sum += i;
        Node oldNode = nodes.first();
        Node newNode = new Node(oldNode.value + i);
        arrPos = (arrPos + 1) % array.length;
        if (array[arrPos] != null)
            nodes.remove(array[arrPos]);
        array[arrPos] = newNode;
        nodes.add(newNode);
    }
    return sum - nodes.first().value;
}
getBestSum(new int[]{1,2,3,1,6,10}, 2) returns 21, as required.
Let f[i] be the maximum total value you can get with the first i numbers, given that you don't choose the last (i.e. the i-th) one. Then we have the recurrence
f[i] = max{ f[i-1],
            max{ f[j] + sum(j+1, i-1) | (i - j) <= k } }
You can use a heap-like data structure to maintain the candidates and get the maximum in O(log n) time; keep a global delta (or store f[j] minus the prefix sum up to j), and pay attention to the window constraint i - j <= k, as in the sketch below.
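Here is a hedged Python sketch of this recurrence. In place of a heap it keeps the window maximum of g[j] = f[j] - S[j] (S being the prefix sums, which plays the role of the "global delta") in a monotonic deque, so each step is O(1) amortized; a heap with lazy deletion would meet the stated O(log n) per step just as well.

from collections import deque

def best_sum(a, k):
    n = len(a)
    S = [0] * (n + 1)  # prefix sums: S[i] = a[0] + ... + a[i-1]
    for i, x in enumerate(a):
        S[i + 1] = S[i] + x
    f = [0] * (n + 1)  # f[i]: best kept-sum among the first i elements, i-th removed
    g = [0] * (n + 1)  # g[j] = f[j] - S[j]
    dq = deque([0])    # indices j with g[j] decreasing; j = 0 is a sentinel
    for i in range(1, n + 1):
        while dq and dq[0] < i - 1 - k:  # at most k kept between removals
            dq.popleft()
        f[i] = S[i - 1] + g[dq[0]]
        g[i] = f[i] - S[i]
        while dq and g[dq[-1]] <= g[i]:
            dq.pop()
        dq.append(i)
    # the last removed position must be within k of the end of the array
    return max(f[i] + S[n] - S[i] for i in range(max(0, n - k), n + 1))

print(best_sum([1, 2, 3, 1, 6, 10], 2))  # 21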
The following algorithm is of O(N*K) complexity.
Examine the first K elements (0 to K-1) of the array. There can be at most 1 gap in this region.
Reason: if there were two gaps, there would be no reason to keep the earlier one.
For each index i of these K gap options, the following holds:
1. The sum up to i-1 is the present score of that option.
2. If the next gap is after a distance of d, then the options for d are (K - i) to K.
For every possible position of the next gap, calculate the best sum up to that position among the options.
The latter part of the array can be traversed similarly, independently of the earlier gap history.
Traverse the array this way until the end.

Longest subarray whose elements form a continuous sequence

Given an unsorted array of positive integers, find the length of the longest subarray whose elements when sorted are continuous. Can you think of an O(n) solution?
Example:
{10, 5, 3, 1, 4, 2, 8, 7}, answer is 5.
{4, 5, 1, 5, 7, 6, 8, 4, 1}, answer is 5.
For the first example, the subarray {5, 3, 1, 4, 2} when sorted can form a continuous sequence 1,2,3,4,5, which are the longest.
For the second example, the subarray {5, 7, 6, 8, 4} is the result subarray.
I can think of a method which, for each subarray, checks whether (maximum - minimum + 1) equals the length of that subarray; if true, it is a continuous subarray. Take the longest of all. But it is O(n^2) and cannot deal with duplicates.
Can someone give a better method?
An algorithm to solve the original problem in O(n) without duplicates. Maybe it helps someone to develop an O(n) solution that deals with duplicates.
Input: [a1, a2, a3, ...]
Map the original array to pairs where the 1st element is the value and the 2nd is the index in the array:
Array: [[a1, i1], [a2, i2], [a3, i3], ...]
Sort this array of pairs by value with some O(n) algorithm for integer sorting (e.g. Counting Sort).
We get some other array:
Array: [[a3, i3], [a2, i2], [a1, i1], ...]
where a3, a2, a1, ... are in sorted order.
Run a loop through this sorted array of pairs.
In linear time we can detect the consecutive groups of numbers a3, a2, a1 (a group is consecutive when each next value = prev value + 1).
During that scan, keep the current group size (n), the minimum index value (min), and the current sum of indices (actualSum).
On each step inside a consecutive group we can estimate the sum of indices: if the group occupies a contiguous subarray, its indices form an arithmetic progression with first element min, step 1, and the group size n seen so far.
This sum estimate can be computed in O(1) time using the formula for an arithmetic progression:
estimate sum = (a1 + an) * n / 2
estimate sum = (min + min + (n - 1)) * n / 2
estimate sum = min * n + n * (n - 1) / 2
If at some loop step inside a consecutive group the estimated sum equals the actual sum, then the consecutive group seen so far satisfies the conditions; save n as the current maximum result, or take the maximum of the current result and n.
If the values stop forming a consecutive group, reset all the values and do the same from there.
Code example: https://gist.github.com/mishadoff/5371821
See the array S in its mathematical set definition:
S = U_{j=0..k} I_j
where the I_j are disjoint integer segments. You can design a specific interval tree (based on a Red-Black tree, or any self-balancing tree that you like :) ) to store the array in this mathematical form. The node and tree structures should look like these:
struct node {
    int d, u;
    int count;
    struct node *n_left, *n_right;
};
Here, d is the lower bound of the integer segment and u the upper bound. count is added to take care of possible duplicates in the array: when trying to insert an already existing element into the tree, instead of doing nothing, we increment the count value of the node in which it is found.
struct root {
    struct node *root;
};

The tree will only store disjoint nodes; thus, the insertion is a bit more complex than a classical Red-Black tree insertion. When inserting intervals, you must scan for potential overlaps with already existing intervals. In your case, since you will only insert singletons, this should not add too much overhead.
Given three nodes P, L and R, L being the left child of P and R the right child of P. Then, you must enforce L.u < P.d and P.u < R.d (and for each node, d <= u, of course).
When inserting an integer segment [x,y], you must find the "overlapping" segments, that is to say, the intervals [d,u] that satisfy one of the following inequalities:
y >= d - 1
OR
x <= u + 1
If the inserted interval is a singleton x, then you can only find up to 2 overlapping interval nodes N1 and N2 such that N1.d == x + 1 and N2.u == x - 1. Then you have to merge the two intervals and update count, which leaves you with N3 such that N3.d = N2.d, N3.u = N1.u and N3.count = N1.count + N2.count + 1. Since the delta between N1.d and N2.u is the minimal delta for two segments to be disjoint, then you must have one of the following :
N1 is the right child of N2
N2 is the left child of N1
So the insertion will still be in O(log(n)) in the worst case.
From here, I can't figure out how to handle the order in the initial sequence, but here is a result that might be interesting: if the input array defines a perfect integer segment, then the tree only has one node.
UPD2: The following solution is for a problem where it is not required that the subarray be contiguous. I misunderstood the problem statement. Not deleting this, as somebody may have an idea based on mine that will work for the actual problem.
Here's what I've come up with:
Create an instance of a dictionary (which is implemented as a hash table, giving O(1) in normal situations). Keys are integers, values are hash sets of integers (also O(1)): var D = new Dictionary<int, HashSet<int>>().
Iterate through the array A and for each integer n with index i do:
Check whether the keys n-1 and n+1 are contained in D.
if neither key exists, do D.Add(n, new HashSet<int>())
if only one of the keys exists, e.g. n-1, do D.Add(n, D[n-1])
if both keys exist, do D[n-1].UnionWith(D[n+1]); D[n+1] = D[n] = D[n-1];
D[n].Add(n)
Now go through each key in D and find the hash set with the greatest length (finding length is O(1)). The greatest length will be the answer.
To my understanding, the worst case complexity will be O(n*log(n)), only because of the UnionWith operation. I don't know how to calculate the average complexity, but it should be close to O(n). Please correct me if I am wrong.
UPD: To speak code, here's a test implementation in C# that gives the correct result in both of the OP's examples:
var A = new int[] {4, 5, 1, 5, 7, 6, 8, 4, 1};
var D = new Dictionary<int, HashSet<int>>();
foreach (int n in A)
{
    if (D.ContainsKey(n-1) && D.ContainsKey(n+1))
    {
        D[n-1].UnionWith(D[n+1]);
        D[n+1] = D[n] = D[n-1];
    }
    else if (D.ContainsKey(n-1))
    {
        D[n] = D[n-1];
    }
    else if (D.ContainsKey(n+1))
    {
        D[n] = D[n+1];
    }
    else if (!D.ContainsKey(n))
    {
        D.Add(n, new HashSet<int>());
    }
    D[n].Add(n);
}
int result = int.MinValue;
foreach (HashSet<int> H in D.Values)
{
    if (H.Count > result)
    {
        result = H.Count;
    }
}
Console.WriteLine(result);
This will require two passes over the data. First create a hash map, mapping ints to bools. I updated my algorithm not to use map from the STL, which keeps its keys ordered internally (it is tree-based). This algorithm uses hashing instead, and can easily be updated for any maximum or minimum combination, even potentially all possible values an integer can attain.
#include <iostream>
using namespace std;

const int MINIMUM = -100;
const int MAXIMUM = 100;
const unsigned int ARRAY_SIZE = MAXIMUM - MINIMUM + 1;

int main() {
    bool* hashOfIntegers = new bool[ARRAY_SIZE];
    //const int someArrayOfIntegers[] = {10, 9, 8, 6, 5, 3, 1, 4, 2, 8, 7};
    //const int someArrayOfIntegers[] = {10, 6, 5, 3, 1, 4, 2, 8, 7};
    const int someArrayOfIntegers[] = {-2, -3, 8, 6, 12, 14, 4, 0, 16, 18, 20};
    const int SIZE_OF_ARRAY = 11;
    // Initialize hashOfIntegers values to false, probably unnecessary but good practice.
    for (unsigned int i = 0; i < ARRAY_SIZE; i++) {
        hashOfIntegers[i] = false;
    }
    // Change the appropriate values to true.
    for (int i = 0; i < SIZE_OF_ARRAY; i++) {
        // We subtract the MINIMUM value to normalize MINIMUM to a zero index, handling negative numbers.
        hashOfIntegers[someArrayOfIntegers[i] - MINIMUM] = true;
    }
    int sequence = 0;
    int maxSequence = 0;
    // Find the maximum run of consecutive values.
    for (unsigned int i = 0; i < ARRAY_SIZE; i++) {
        if (hashOfIntegers[i]) sequence++;
        else sequence = 0;
        if (sequence > maxSequence) maxSequence = sequence;
    }
    cout << "MAX SEQUENCE: " << maxSequence << endl;
    delete[] hashOfIntegers;
    return 0;
}
The basic idea is to use the hash map as a bucket sort, so that you only have to do two passes over the data. This algorithm is O(2n), which in turn is O(n)
Don't get your hopes up, this is only a partial answer.
I'm quite confident that the problem is not solvable in O(n). Unfortunately, I can't prove it.
If there is a way to solve it in less than O(n^2), I'd suspect that the solution is based on the following strategy:
Decide in O(n) (or maybe O(n log n)) whether there exists a continuous subarray as you describe it with at least i elements. Let's call this predicate E(i).
Use bisection to find the maximum i for which E(i) holds.
The total running time of this algorithm would then be O(n log n) (or O(n log^2 n)).
This is the only way I could come up with to reduce the problem to another problem that at least has the potential of being simpler than the original formulation. However, I couldn't find a way to compute E(i) in less than O(n^2), so I may be completely off...
Here's another way to think of your problem: suppose you have an array composed only of 1s and 0s, and you want to find the longest consecutive run of 1s. This can be done in linear time by run-length encoding the 1s (ignoring the 0s). In order to transform your original problem into this run-length-encoding problem, you compute a new array b[i] = (a[i] < a[i+1]). This doesn't have to be done explicitly; you can do it implicitly to achieve an algorithm with constant memory requirement and linear complexity.
Here are 3 acceptable solutions:
The first is O(n log n) in time and O(n) in space, the second is O(n) in time and O(n) in space, and the third is O(n) in time and O(1) in space.
1. Build a binary search tree, then traverse it in order. Keep 2 pointers, one for the start of the max subset and one for the end, and keep the max_size value while iterating the tree. This is O(n log n) time and O(n) space complexity.
2. You can always sort the numbers using counting sort in linear time and run through the array, which means O(n) time and space complexity.
3. Assuming there isn't overflow, or that you have a big integer data type, and assuming the array is a mathematical set (no duplicate values), you can do it in O(1) of memory: calculate the sum and the product of the array; together with the min and max of the original set, these let you figure out which numbers it contains. In total it is O(n) time complexity.

Number of all increasing subsequences in given sequence?

You may have heard of the well-known problem of finding the longest increasing subsequence; the optimal algorithm has O(n*log(n)) complexity.
I was thinking about the problem of finding all increasing subsequences in a given sequence. I have found a solution for the problem of counting the increasing subsequences of length k, which has O(n*k*log(n)) complexity (where n is the length of the sequence).
Of course, this algorithm can be used for my problem, but then the solution has O(n*k*log(n)*n) = O(n^2*k*log(n)) complexity, I suppose. I think there must be a better (I mean faster) solution, but I don't know of one yet.
If you know how to solve the problem of finding all increasing subsequences of a given sequence in optimal time/complexity (in this case, optimal = better than O(n^2*k*log(n))), please let me know.
In the end: this problem is not homework. The longest increasing subsequence problem was mentioned at my lecture, and I started thinking about the general idea of all increasing subsequences in a given sequence.
I don't know if this is optimal - probably not, but here's a DP solution in O(n^2).
Let dp[i] = number of increasing subsequences with i as the last element
for i = 1 to n do
    dp[i] = 1
    for j = 1 to i - 1 do
        if input[j] < input[i] then
            dp[i] = dp[i] + dp[j] // we can just append input[i] to every subsequence ending with j
Then it's just a matter of summing all the entries in dp
You can compute the number of increasing subsequences in O(n log n) time as follows.
Recall the algorithm for the length of the longest increasing subsequence:
For each element, compute the predecessor element among previous elements, and add one to that length.
This algorithm runs naively in O(n^2) time, and runs in O(n log n) (or even better, in the case of integers), if you compute the predecessor using a data structure like a balanced binary search tree (BST) (or something more advanced like a van Emde Boas tree for integers).
To amend this algorithm for computing the number of sequences, store in the BST in each node the number of sequences ending at that element. When processing the next element in the list, you simply search for the predecessor, count the number of sequences ending at an element that is less than the element currently being processed (in O(log n) time), and store the result in the BST along with the current element. Finally, you sum the results for every element in the tree to get the result.
As a caveat, note that the number of increasing sequences could be very large, so that the arithmetic no longer takes O(1) time per operation. This needs to be taken into consideration.
Pseudocode:

ret = 0
T = empty_augmented_bst() // with an integer field in addition to the key
for x in X:
    // sum of auxiliary fields of keys less than x,
    // computed in O(log n) time using augmented BSTs
    count = 1 + T.sum_less(x)
    T.insert(x, count) // sets x's auxiliary field to count (count already includes the +1)
    ret += count       // keep track of the return value
return ret
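Here is a runnable Python sketch of the same idea, with a Fenwick tree over value ranks standing in for the augmented BST; the tree stores, per rank, the number of increasing subsequences ending at that value (big-integer arithmetic is free in Python, which sidesteps the caveat above):

def count_increasing_subsequences(a):
    ranks = {v: i + 1 for i, v in enumerate(sorted(set(a)))}
    m = len(ranks)
    tree = [0] * (m + 1)  # Fenwick tree over ranks 1..m

    def add(i, x):
        while i <= m:
            tree[i] += x
            i += i & -i

    def sum_ranks(i):     # total over ranks 1..i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    total = 0
    for x in a:
        r = ranks[x]
        count = 1 + sum_ranks(r - 1)  # subsequences ending exactly at this x
        add(r, count)
        total += count
    return total

print(count_increasing_subsequences([7, 4, 6, 8]))  # 9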
I'm assuming without loss of generality that the input A[0..(n-1)] consists of all the integers in {0, 1, ..., n-1}.
Let DP[i] = number of increasing subsequences ending in A[i].
We have the recurrence DP[i] = 1 + sum of DP[j] over all j < i with A[j] < A[i].
To compute DP[i], we only need DP[j] for those j where A[j] < A[i]. Therefore, we can compute the DP array in ascending order of the values of A; this leaves DP[k] = 0 for all k where A[k] > A[i].
The problem boils down to computing the sum DP[0] to DP[i-1]. Supposing we have already calculated DP[0] to DP[i-1], we can calculate DP[i] in O(log n) using a Fenwick tree.
The final answer is then DP[0] + DP[1] + ... + DP[n-1]. The algorithm runs in O(n log n).
This is an O(n*k*log(n)) solution, where n is the length of the input array and k is the size of the increasing sub-sequences. It is based on the solution mentioned in the question.
vector<int> values, an array of length n, is the array to be searched for increasing sub-sequences.
vector<int> temp(n); // Array for sorting
map<int, int> mapIndex; // This will translate from the value to the 1-based count of values less than it
partial_sort_copy(values.cbegin(), values.cend(), temp.begin(), temp.end());
for (auto i = 0; i < n; ++i) {
    mapIndex.insert(make_pair(temp[i], i + 1)); // insert will only allow each number to be added to the map the first time
}
mapIndex now contains a ranking of all numbers in values.
vector<vector<int>> binaryIndexTree(k, vector<int>(n)); // A 2D binary index tree with depth k
auto result = 0;
for (auto it = values.cbegin(); it != values.cend(); ++it) {
    auto rank = mapIndex[*it];
    auto value = 1; // Number of sequences to be added to this rank and all subsequent ranks
    update(rank, value, binaryIndexTree[0]); // Populate the binary index tree for sub-sequences of length 1
    for (auto i = 1; i < k; ++i) { // Iterate over all sub-sequence lengths 2 - k
        value = getValue(rank - 1, binaryIndexTree[i - 1]); // Retrieve all possible shorter sub-sequences of strictly lesser rank
        update(rank, value, binaryIndexTree[i]); // Update the binary index tree for sub-sequences of this length
    }
    result += value; // Add the possible sub-sequences of length k for this rank
}
After placing all n elements of values into all k dimensions of binaryIndexTree, the values collected into result represent the total number of increasing sub-sequences of length k.
The binary index tree functions used to obtain this result are:
void update(int rank, int increment, vector<int>& binaryIndexTree)
{
    while (rank <= (int)binaryIndexTree.size()) { // Increment the current rank and all higher ranks
        binaryIndexTree[rank - 1] += increment;
        rank += (rank & -rank);
    }
}

int getValue(int rank, const vector<int>& binaryIndexTree)
{
    auto result = 0;
    while (rank > 0) { // Search the current rank and all lower ranks
        result += binaryIndexTree[rank - 1]; // Sum any value found into result
        rank -= (rank & -rank);
    }
    return result;
}
The binary index tree work is obviously O(n*k*log(n)), but it is the ability to fill it out sequentially that makes it usable for a solution.
mapIndex creates a rank for each number in values, such that the smallest number in values has a rank of 1. (For example, if values is "2, 3, 4, 3, 4, 1" then mapIndex will contain: {1, 1}, {2, 2}, {3, 3}, {4, 5}. Note that 4 has a rank of 5 because there are two 3s in values.)
binaryIndexTree holds k different trees; level x represents the total number of increasing sub-sequences of length x that can be formed. Any number in values can create a sub-sequence of length 1, so each element increments its rank and all ranks above it by 1.
At higher levels, an increasing sub-sequence depends on there already being a sub-sequence of a shorter length and lower rank.
Because elements are inserted into the binary index tree according to their order in values, the order of occurrence in values is preserved; if an element has been inserted into binaryIndexTree, that is because it preceded the current element in values.
An excellent description of how binary index tree is available here: http://www.geeksforgeeks.org/binary-indexed-tree-or-fenwick-tree-2/
You can find an executable version of the code here: http://ideone.com/GdF0me
Let us take an example: the array {7, 4, 6, 8}.
If you consider each individual element also as a subsequence, then the increasing subsequences that can be formed are:
{7} {4} {6} {4,6} {8} {7,8} {4,8} {6,8} {4,6,8}
A total of 9 increasing subsequences can be formed for this array.
So the answer is 9.
The code is as follows -
int arr[] = {7, 4, 6, 8};
int T[] = new int[arr.length];
for (int i = 0; i < arr.length; i++)
    T[i] = 1;
int sum = 1;
for (int i = 1; i < arr.length; i++) {
    for (int j = 0; j < i; j++) {
        if (arr[i] > arr[j]) {
            T[i] = T[i] + T[j];
        }
    }
    sum += T[i];
}
System.out.println(sum);
The complexity of this code is O(N^2), because of the nested loops.
You can use a sparse segment tree to get an optimal O(n*log(n)) solution.
The solution runs as follows:

for (int i = 0; i < n; i++)
{
    dp[i] = 1 + query(0, a[i] - 1); // subsequences ending at strictly smaller values
    update(a[i], dp[i]);
}

The query parameters are: query(first position, last position).
The update parameters are: update(position, value).
And the final answer is the sum of all values of the dp array.
Java version as an example:

int[] A = {1, 2, 0, 0, 0, 4};
int[] dp = new int[A.length];
int total = 0;
for (int i = 0; i < A.length; i++) {
    dp[i] = 1;
    for (int j = 0; j <= i - 1; j++) {
        if (A[j] < A[i]) {
            dp[i] = dp[i] + dp[j];
        }
    }
    total += dp[i]; // sum dp[] to count all increasing subsequences
}
System.out.println(total);
