So I'm learning how to do a binary search and I understand the concept but what I don't understand is why we need
(right - left)
to get the highest index. I can't visualize this in my head :/
class Solution {
public int search(int[] nums, int target) {
int pivot;
int left = 0;
int right = nums.length - 1;
while (left <= right) {
pivot = left + (right - left) / 2;
if (nums[pivot] == target) return pivot;
if (target < nums[pivot]) right = pivot - 1;
else left = pivot + 1;
}
return -1;
}
}
left + (right - left) / 2
is the same as
(right + left) / 2
In exact arithmetic they both give the average of right and left. The purpose of using the former is to prevent integer overflow. If right + left is larger than the maximum value the integer type can hold, the sum overflows and the result is wrong, even though the average itself would fit in the variable.
On the other hand, first calculating (right - left) / 2 and then adding left avoids the overflow, because with 0 <= left <= right no intermediate value exceeds right.
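To make the overflow visible, here is a small sketch in Python. Python integers never overflow, so the helper `to_int32` (a name made up for this example) wraps values the way a signed 32-bit `int` would; `naive_mid` and `safe_mid` are likewise illustrative names, not part of the code above.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value (two's complement)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def div2_trunc(x):
    """Divide by 2 rounding toward zero, like integer division in Java or C."""
    return -((-x) // 2) if x < 0 else x // 2


def naive_mid(left, right):
    # (left + right) / 2: the sum is wrapped to 32 bits before dividing
    return div2_trunc(to_int32(left + right))


def safe_mid(left, right):
    # left + (right - left) / 2: every intermediate value stays in range
    return to_int32(left + div2_trunc(to_int32(right - left)))


left, right = 1_500_000_000, 2_000_000_000
print(naive_mid(left, right))  # -397483648, a negative "index"
print(safe_mid(left, right))   # 1750000000
```

For small indices both functions agree; they only differ once left + right no longer fits in 32 bits.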
I'm trying to solve a problem which goes like this:
Problem
Given an array of integers "arr" of size "n", process two types of queries. There are "q" queries you need to answer.
Query type 1
input: l r
result: output number of inversions in [l, r]
Query type 2
input: x y
result: update the value at arr [x] to y
Inversion
For every index j < i, if arr [j] > arr [i], the pair (j, i) is one inversion.
Input
n = 5
q = 3
arr = {1, 4, 3, 5, 2}
queries:
type = 1, l = 1, r = 5
type = 2, x = 1, y = 4
type = 1, l = 1, r = 5
Output
4
6
Constraints
Time: 4 secs
1 <= n, q <= 100000
1 <= arr [i] <= 40
1 <= l, r, x <= n
1 <= y <= 40
I know how to solve a simpler version of this problem without updates, i.e. to simply count the number of inversions for each position using a segment tree or fenwick tree in O(N*log(N)). The only solution I have to this problem is O(q*N*log(N)) (I think) with segment tree other than the O(q*N^2) trivial algorithm. This however does not fit within the time constraints of the problem. I would like to have hints towards a better algorithm to solve the problem in O(N*log(N)) (if it's possible) or maybe O(N*log^2(N)).
I first came across this problem two days ago and have been spending a few hours here and there to try and solve it. However, I'm finding it non-trivial to do so and would like to have some help/hints regarding the same. Thanks for your time and patience.
Updates
Solution
With the suggestion, answer and help by Tanvir Wahid, I've implemented the source code for the problem in C++ and would like to share it here for anyone who might stumble across this problem and not have an intuitive idea on how to solve it. Thank you!
Let's build a segment tree with each node containing information about how many inversions exist and the frequency count of elements present in its segment of authority.
node {
integer inversion_count : 0
array [40] frequency : {0...0}
}
Building the segment tree and handling updates
For each leaf node, initialise the inversion count to 0 and set the frequency of the represented element from the input array to 1. The frequencies of a parent node are calculated by summing up the frequencies of its left and right children. The inversion count of a parent node is the sum of the inversion counts of its left and right children plus the new inversions created by merging the two segments, which can be calculated from the frequencies of the elements in each child: for every pair of values, multiply the frequency of the bigger element in the left child by the frequency of the smaller element in the right child.
parent.inversion_count = left.inversion_count + right.inversion_count
for i in [39, 0]
    for j in [0, i)
        parent.inversion_count += left.frequency [i] * right.frequency [j]
Updates are handled similarly.
Answering range queries on inversion counts
To answer the query for the number of inversions in the range [l, r], we calculate the inversions using the source code attached below.
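Time Complexity: O(q*log(n)) node merges. With the double loop shown above each merge costs O(40^2), so the total is O(40^2*q*log(n)).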
Note
The source code attached does break some good programming habits. The sole purpose of the code is to "solve" the given problem and not to accomplish anything else.
Source Code
/**
Lost Arrow (Aryan V S)
Saturday 2020-10-10
**/
#include "bits/stdc++.h"
using namespace std;
struct node {
int64_t inv = 0;
vector <int> freq = vector <int> (40, 0);
void combine (const node& l, const node& r) {
inv = l.inv + r.inv;
for (int i = 39; i >= 0; --i) {
for (int j = 0; j < i; ++j) {
// frequency of bigger numbers in the left * frequency of smaller numbers on the right
inv += 1LL * l.freq [i] * r.freq [j];
}
freq [i] = l.freq [i] + r.freq [i];
}
}
};
void build (vector <node>& tree, vector <int>& a, int v, int tl, int tr) {
if (tl == tr) {
tree [v].inv = 0;
tree [v].freq [a [tl]] = 1;
}
else {
int tm = (tl + tr) / 2;
build(tree, a, 2 * v + 1, tl, tm);
build(tree, a, 2 * v + 2, tm + 1, tr);
tree [v].combine(tree [2 * v + 1], tree [2 * v + 2]);
}
}
void update (vector <node>& tree, int v, int tl, int tr, int pos, int val) {
if (tl == tr) {
tree [v].inv = 0;
tree [v].freq = vector <int> (40, 0);
tree [v].freq [val] = 1;
}
else {
int tm = (tl + tr) / 2;
if (pos <= tm)
update(tree, 2 * v + 1, tl, tm, pos, val);
else
update(tree, 2 * v + 2, tm + 1, tr, pos, val);
tree [v].combine(tree [2 * v + 1], tree [2 * v + 2]);
}
}
node inv_cnt (vector <node>& tree, int v, int tl, int tr, int l, int r) {
if (l > r)
return node();
if (tl == l && tr == r)
return tree [v];
int tm = (tl + tr) / 2;
node result;
result.combine(inv_cnt(tree, 2 * v + 1, tl, tm, l, min(r, tm)), inv_cnt(tree, 2 * v + 2, tm + 1, tr, max(l, tm + 1), r));
return result;
}
void solve () {
int n, q;
cin >> n >> q;
vector <int> a (n);
for (int i = 0; i < n; ++i) {
cin >> a [i];
--a [i];
}
vector <node> tree (4 * n);
build(tree, a, 0, 0, n - 1);
while (q--) {
int type, x, y;
cin >> type >> x >> y;
--x; --y;
if (type == 1) {
node result = inv_cnt(tree, 0, 0, n - 1, x, y);
cout << result.inv << '\n';
}
else if (type == 2) {
update(tree, 0, 0, n - 1, x, y);
}
else
assert(false);
}
}
int main () {
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr);
std::cout.precision(10);
std::cout << std::fixed << std::boolalpha;
int t = 1;
// std::cin >> t;
while (t--)
solve();
return 0;
}
arr[i] can be at most 40, and we can use this to our advantage. What we need is a segment tree in which each node holds 41 values: a long long int for the number of inversions in its range, and an array of size 40 with the count of each number (a struct will do). How do we merge the two children of a node? We know the inversions of the left child and of the right child, and we also know the frequency of each number in both of them. The inversion count of the parent is the sum of the inversions of both children plus the number of inversions between the left and the right child, which we can easily find from the frequencies. Queries are answered in a similar way. Complexity: O(40*q*log(n)).
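One way to get the merge down to about 40 steps (rather than 40*40) is to keep a running count of the smaller values seen so far in the right child. The sketch below is only an illustration of that idea in Python, not the code of either answer: `MAXV`, `merge` and `InversionTree` are made-up names, positions are 0-based, and values are assumed to lie in 1..40 as in the constraints.

```python
MAXV = 40                      # values are assumed to lie in 1..MAXV
EMPTY = (0, [0] * MAXV)        # neutral node: no inversions, no elements


def merge(left, right):
    """Combine two nodes (inv, freq); `left` covers the earlier positions."""
    inv_l, freq_l = left
    inv_r, freq_r = right
    cross = 0
    smaller_on_right = 0       # number of right-side values strictly below v
    for v in range(MAXV):
        cross += freq_l[v] * smaller_on_right
        smaller_on_right += freq_r[v]
    freq = [a + b for a, b in zip(freq_l, freq_r)]
    return (inv_l + inv_r + cross, freq)


def leaf(value):
    freq = [0] * MAXV
    freq[value - 1] = 1
    return (0, freq)


class InversionTree:
    def __init__(self, values):            # values must be non-empty
        self.n = len(values)
        self.nodes = [EMPTY] * (4 * self.n)
        self._build(0, 0, self.n - 1, values)

    def _build(self, v, lo, hi, values):
        if lo == hi:
            self.nodes[v] = leaf(values[lo])
            return
        mid = (lo + hi) // 2
        self._build(2 * v + 1, lo, mid, values)
        self._build(2 * v + 2, mid + 1, hi, values)
        self.nodes[v] = merge(self.nodes[2 * v + 1], self.nodes[2 * v + 2])

    def update(self, pos, value, v=0, lo=0, hi=None):
        """Set the element at 0-based position `pos` to `value`."""
        if hi is None:
            hi = self.n - 1
        if lo == hi:
            self.nodes[v] = leaf(value)
            return
        mid = (lo + hi) // 2
        if pos <= mid:
            self.update(pos, value, 2 * v + 1, lo, mid)
        else:
            self.update(pos, value, 2 * v + 2, mid + 1, hi)
        self.nodes[v] = merge(self.nodes[2 * v + 1], self.nodes[2 * v + 2])

    def _query(self, l, r, v, lo, hi):
        if r < lo or hi < l:
            return EMPTY
        if l <= lo and hi <= r:
            return self.nodes[v]
        mid = (lo + hi) // 2
        return merge(self._query(l, r, 2 * v + 1, lo, mid),
                     self._query(l, r, 2 * v + 2, mid + 1, hi))

    def count(self, l, r):
        """Number of inversions inside the 0-based range [l, r]."""
        return self._query(l, r, 0, 0, self.n - 1)[0]


tree = InversionTree([1, 4, 3, 5, 2])
print(tree.count(0, 4))   # 4
tree.update(0, 4)         # arr[1] = 4 in the 1-based terms of the question
print(tree.count(0, 4))   # 6
```

The order of the arguments to `merge` matters, since only pairs with the bigger value on the left count as inversions.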
I am learning segment trees and I came across this question.
There is an array A and 2 types of operations:
1. Find the sum in the range L to R
2. Update the elements in the range L to R by a value X.
Update should be like this
A[L] = 1*X;
A[L+1] = 2*X;
A[L+2] = 3*X;
A[R] = (R-L+1)*X;
How should I handle the second type of query? Can anyone please suggest an algorithm based on a modified segment tree, or is there a better solution?
So, we need to efficiently add to the interval [L,R] the corresponding values of an arithmetic progression with step X, and to efficiently find sums over arbitrary intervals.
To solve this problem efficiently, let's make use of a Segment Tree with Lazy Propagation.
The basic ideas are the following:
An arithmetic progression can be defined by its first and last items and the number of items.
Two arithmetic progressions with the same number of items can be combined into a new one by adding them item by item; the first and last items of the new progression are just the sums of the corresponding items of the two combined progressions.
Hence, we can associate with each node of the Segment Tree the first and last values of the arithmetic progression that spans the node's interval.
During an update, for all affected intervals, we can lazily propagate the values of the first and last items through the Segment Tree and update the aggregated sums on these intervals.
So, the node of the Segment Tree for given problem will have structure:
class Node {
int left; // Left boundary of the current SegmentTree node
int right; // Right boundary of the current SegmentTree node
int sum; // Sum on the interval [left,right]
int first; // First item of arithmetic progression inside given node
int last; // Last item of arithmetic progression
Node left_child;
Node right_child;
// Constructor
Node(int[] arr, int l, int r) { ... }
// Add arithmetic progression with step X on the interval [l,r]
// O(log(N))
void add(int l, int r, int X) { ... }
// Request the sum on the interval [l,r]
// O(log(N))
int query(int l, int r) { ... }
// Lazy Propagation
// O(1)
void propagate() { ... }
}
A Segment Tree with Lazy Propagation runs the Lazy Propagation routine (which has complexity O(1)) every time a node of the tree is traversed. For a node that has children, the routine adds the sum of the node's pending arithmetic progression to the node's own sum, splits that progression between the two children by updating their first and last items, and then resets the node's pending progression.
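The arithmetic behind the split can be checked on its own. The following Python sketch uses made-up helper names (`ap_sum`, `split_progression`) and assumes the node covers at least two positions; it only illustrates how one progression is divided at the node midpoint.

```python
def ap_sum(first, last, count):
    """Sum of an arithmetic progression given its ends and its length."""
    return (first + last) * count // 2


def split_progression(first, last, left, right):
    """Split the progression covering [left, right] at the node midpoint.

    Returns ((first, last) for the left child, (first, last) for the right child).
    Assumes right > left, i.e. the node is not a leaf.
    """
    count = right - left + 1
    step = (last - first) // (count - 1)
    mid = (left + right) // 2
    left_count = mid - left + 1
    left_part = (first, first + step * (left_count - 1))
    right_part = (first + step * left_count, last)
    return left_part, right_part


# Node [2, 6] receives the progression 3, 6, 9, 12, 15 (step X = 3).
left_part, right_part = split_progression(3, 15, 2, 6)
print(left_part, right_part)                             # (3, 9) (12, 15)
print(ap_sum(3, 9, 3) + ap_sum(12, 15, 2), ap_sum(3, 15, 5))  # 45 45
```

The two child sums add up to the parent's sum, which is why the parent can account for the whole progression immediately and hand the pieces down later.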
Implementation
Below is a Java implementation of the described approach (with additional comments):
class Node {
int left; // Left boundary of the current SegmentTree node
int right; // Right boundary of the current SegmentTree node
int sum; // Sum on the interval
int first; // First item of arithmetic progression
int last; // Last item of arithmetic progression
Node left_child;
Node right_child;
/**
* Construction of a Segment Tree
* which spans over the interval [l,r]
*/
Node(int[] arr, int l, int r) {
left = l;
right = r;
if (l == r) { // Leaf
sum = arr[l];
} else { // Construct children
int m = (l + r) / 2;
left_child = new Node(arr, l, m);
right_child = new Node(arr, m + 1, r);
// Update accumulated sum
sum = left_child.sum + right_child.sum;
}
}
/**
* Lazily adds the values of the arithmetic progression
* with step X on the interval [l, r]
* O(log(N))
*/
void add(int l, int r, int X) {
// Lazy propagation
propagate();
if ((r < left) || (right < l)) {
// If updated interval doesn't overlap with current subtree
return;
} else if ((l <= left) && (right <= r)) {
// If updated interval fully covers the current subtree
// Update the first and last items of the arithmetic progression
int first_item_offset = (left - l) + 1;
int last_item_offset = (right - l) + 1;
first = X * first_item_offset;
last = X * last_item_offset;
// Lazy propagation
propagate();
} else {
// If updated interval partially overlaps with current subtree
left_child.add(l, r, X);
right_child.add(l, r, X);
// Update accumulated sum
sum = left_child.sum + right_child.sum;
}
}
/**
* Returns the sum on the interval [l, r]
* O(log(N))
*/
int query(int l, int r) {
// Lazy propagation
propagate();
if ((r < left) || (right < l)) {
// If requested interval doesn't overlap with current subtree
return 0;
} else if ((l <= left) && (right <= r)) {
// If requested interval fully covers the current subtree
return sum;
} else {
// If requested interval partially overlaps with current subtree
return left_child.query(l, r) + right_child.query(l, r);
}
}
/**
* Lazy propagation
* O(1)
*/
void propagate() {
// Update the accumulated value
// with the sum of Arithmetic Progression
int items_count = (right - left) + 1;
sum += ((first + last) * items_count) / 2;
if (right != left) { // Current node is not a leaf
// Calculate the step of the Arithmetic Progression of the current node
int step = (last - first) / (items_count - 1);
// Update the first and last items of the arithmetic progression
// inside the left and right subtrees
// Distribute the arithmetic progression between child nodes
// [a(1) to a(N)] -> [a(1) to a(N/2)] and [a(N/2+1) to a(N)]
int mid = (items_count - 1) / 2;
left_child.first += first;
left_child.last += first + (step * mid);
right_child.first += first + (step * (mid + 1));
right_child.last += last;
}
// Reset the arithmetic progression of the current node
first = 0;
last = 0;
}
}
The Segment Tree in the provided solution is implemented explicitly, using objects and references; however, it can easily be modified to use arrays instead.
Testing
Below are randomized tests, which compare two implementations:
Processing queries by sequential increase of each item of the array with O(N) and calculating the sums on intervals with O(N)
Processing the same queries using Segment Tree with O(log(N)) complexity:
The Java implementation of the randomized tests:
public static void main(String[] args) {
// Initialize the random generator with predefined seed,
// in order to make the test reproducible
Random rnd = new Random(1);
int test_cases_num = 20;
int max_arr_size = 100;
int num_queries = 50;
int max_progression_step = 20;
for (int test = 0; test < test_cases_num; test++) {
// Create array of the random length
int[] arr = new int[rnd.nextInt(max_arr_size) + 1];
Node segmentTree = new Node(arr, 0, arr.length - 1);
for (int query = 0; query < num_queries; query++) {
if (rnd.nextDouble() < 0.5) {
// Update on interval [l,r]
int l = rnd.nextInt(arr.length);
int r = rnd.nextInt(arr.length - l) + l;
int X = rnd.nextInt(max_progression_step);
update_sequential(arr, l, r, X); // O(N)
segmentTree.add(l, r, X); // O(log(N))
}
else {
// Request sum on interval [l,r]
int l = rnd.nextInt(arr.length);
int r = rnd.nextInt(arr.length - l) + l;
int expected = query_sequential(arr, l, r); // O(N)
int actual = segmentTree.query(l, r); // O(log(N))
if (expected != actual) {
throw new RuntimeException("Results are different!");
}
}
}
}
System.out.println("All results are equal!");
}
static void update_sequential(int[] arr, int left, int right, int X) {
for (int i = left; i <= right; i++) {
arr[i] += X * ((i - left) + 1);
}
}
static int query_sequential(int[] arr, int left, int right) {
int sum = 0;
for (int i = left; i <= right; i++) {
sum += arr[i];
}
return sum;
}
Basically you need to build a tree and then make updates using lazy propagation. Here is an implementation; note that it adds the same value to every element of [a, b] and returns range sums.
const int Base = 1 << 19; // number of leaves, positions are 1..Base
long long tree[1 << 20]; // tree[v] = sum of the range covered by node v
long long lazy[1 << 20]; // addition not yet pushed to the children of v
void propagation(int v, int len){ //standard propagation, len = size of the range of v
    if(len == 1) return; //a leaf has no children
    tree[v * 2] += lazy[v] * (len / 2);
    tree[v * 2 + 1] += lazy[v] * (len / 2);
    lazy[v * 2] += lazy[v];
    lazy[v * 2 + 1] += lazy[v];
    lazy[v] = 0;
}
void update(int a, int b, int c, int v = 1, int p = 1, int k = Base){
    if(p > b || k < a) return; //if outside range [a, b]
    propagation(v, k - p + 1);
    if(p >= a && k <= b){ // if fully inside range [a, b]
        tree[v] += 1LL * c * (k - p + 1); //every element of the range grows by c
        lazy[v] += c;
        return;
    }
    update(a, b, c, v * 2, p, (p + k) / 2); //left child
    update(a, b, c, v * 2 + 1, (p + k) / 2 + 1, k); //right child
    tree[v] = tree[v * 2] + tree[v * 2 + 1]; //update current node
}
long long query(int a, int b, int v = 1, int p = 1, int k = Base){
    if(p > b || k < a) //if outside range [a, b]
        return 0;
    propagation(v, k - p + 1);
    if(p >= a && k <= b) // if fully inside range [a, b]
        return tree[v];
    long long res = 0;
    res += query(a, b, v * 2, p, (p + k) / 2); //left child
    res += query(a, b, v * 2 + 1, (p + k) / 2 + 1, k); //right child
    return res;
}
The update function obviously updates the tree, so it adds the value to the nodes on the interval [a, b] (or [L, R])
update(L, R, value);
The query function just gives you the sum of the elements in the range
query(L, R);
The second operation can be regarded as adding a segment to the interval [L,R] with two endpoints (L,x),(R,(R-L+1)*x) and slope x.
The most important thing to consider about a segment tree with interval modifications is whether the lazy tags can be merged. If we regard the modification as adding segments, we find that two segments can be merged easily: we only need to update the slope and the endpoints. For each interval, we only need to maintain the slope and the starting point of the segment for this interval. By using the lazy tag technique, we can easily implement interval sum queries and interval modifications in O(log n) time per operation, i.e. O(n log n) overall for n operations.
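Here is a minimal sketch of that lazy tag in Python, assuming the update means "add 1*x, 2*x, ... to the range" (as the other answer also assumes). The class name `ProgressionTree` and its fields are made up for this example; each tag stores the pending value at the node's first position (`start`) and the pending increase per position (`slope`), and two tags merge by adding both components.

```python
class ProgressionTree:
    def __init__(self, n):
        self.n = n
        self.total = [0] * (4 * n)   # sum of the node's range
        self.start = [0] * (4 * n)   # pending value for the node's first position
        self.slope = [0] * (4 * n)   # pending increase per position

    def _apply(self, v, lo, hi, a, d):
        """Add a, a + d, a + 2d, ... to the positions lo..hi of node v."""
        cnt = hi - lo + 1
        self.total[v] += a * cnt + d * cnt * (cnt - 1) // 2
        self.start[v] += a           # merging tags: add both components
        self.slope[v] += d

    def _push(self, v, lo, hi):
        a, d = self.start[v], self.slope[v]
        if a or d:
            mid = (lo + hi) // 2
            self._apply(2 * v + 1, lo, mid, a, d)
            # the right child starts (mid + 1 - lo) positions further along
            self._apply(2 * v + 2, mid + 1, hi, a + d * (mid + 1 - lo), d)
            self.start[v] = self.slope[v] = 0

    def add(self, l, r, x, v=0, lo=0, hi=None):
        """Add x, 2x, ..., (r - l + 1) * x to positions l..r (0-based)."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            self._apply(v, lo, hi, x * (lo - l + 1), x)
            return
        self._push(v, lo, hi)
        mid = (lo + hi) // 2
        self.add(l, r, x, 2 * v + 1, lo, mid)
        self.add(l, r, x, 2 * v + 2, mid + 1, hi)
        self.total[v] = self.total[2 * v + 1] + self.total[2 * v + 2]

    def query(self, l, r, v=0, lo=0, hi=None):
        """Sum of positions l..r (0-based)."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return 0
        if l <= lo and hi <= r:
            return self.total[v]
        self._push(v, lo, hi)
        mid = (lo + hi) // 2
        return (self.query(l, r, 2 * v + 1, lo, mid)
                + self.query(l, r, 2 * v + 2, mid + 1, hi))


tree = ProgressionTree(5)
tree.add(1, 3, 2)            # array becomes [0, 2, 4, 6, 0]
print(tree.query(0, 4))      # 12
print(tree.query(2, 3))      # 10
```

Storing (start, slope) instead of (first, last) avoids the division needed to recover the step, but both representations carry the same information.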
First, a bitonic array for this question is defined as an array of length N such that, for some index K with 0 < K < N - 1, the elements 0 to K form a monotonically increasing sequence of integers and the elements K to N - 1 form a monotonically decreasing sequence of integers.
Example: [1, 3, 4, 6, 9, 14, 11, 7, 2, -4, -9]. It monotonically increases from 1 to 14, then decreases from 14 to -9.
The precursor to this question is to solve it in 3log(n), which is much easier: one altered binary search to find the index of the max, then two binary searches for 0 to K and K + 1 to N - 1 respectively.
I presume the solution in 2log(n) requires you to solve the problem without finding the index of the max. I've thought about overlapping the binary searches, but beyond that, I'm not sure how to move forward.
The algorithms presented in other answers (this and this) are unfortunately incorrect: they are not O(log N)!
The recursive formula f(L) = f(L/2) + log(L/2) + c doesn't lead to f(L) = O(log(L)) but to f(L) = O((log(L))^2)!
Indeed, assume k = log(L), then log(2^(k-1)) + log(2^(k-2)) + ... + log(2^1) = log(2)*((k-1) + (k-2) + ... + 1) = O(k^2). Hence, log(L/2) + log(L/4) + ... + log(2) = O((log(L))^2).
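The growth can also be checked numerically. This small Python sketch unrolls the recurrence for L = 2^k with base-2 logarithms and f(1) = 0; the function name `search_cost` and the choice c = 1 are assumptions made just for the illustration.

```python
def search_cost(k, c=1):
    """Unrolled f(2^k) = f(2^(k-1)) + log2(2^(k-1)) + c, with f(1) = 0."""
    total = 0
    while k > 0:
        total += (k - 1) + c     # log2(L/2) = k - 1 when L = 2^k
        k -= 1
    return total


for k in (10, 20, 40):
    print(k, search_cost(k))     # 10 55, 20 210, 40 820
```

Doubling k roughly quadruples the cost, which is the k^2 behaviour derived above rather than linear growth in k.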
The right way to solve the problem in time ~ 2log(N) is to proceed as follows (assuming the array is first in ascending order and then in descending order):
Take the middle of the array
Compare the middle element with one of its neighbors to see if the max is on the right or on the left
Compare the middle element with the desired value
If the middle element is smaller than the desired value AND the max is on the left side, then do bitonic search on the left subarray (we are sure that the value is not in the right subarray)
If the middle element is smaller than the desired value AND the max is on the right side, then do bitonic search on the right subarray
If the middle element is bigger than the desired value, then do descending binary search on the right subarray and ascending binary search on the left subarray.
In the last case, it might be surprising to do a binary search on a subarray that may be bitonic but it actually works because we know that the elements that are not in the good order are all bigger than the desired value. For instance, doing an ascending binary search for the value 5 in the array [2, 4, 5, 6, 9, 8, 7] will work because 7 and 8 are bigger than the desired value 5.
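The steps above can be sketched in Python as follows. This is an illustrative version that returns the index (or -1) instead of printing, assumes distinct elements, and uses names of my own choosing (`ascending_search`, `descending_search`, `bitonic_search`).

```python
def ascending_search(a, lo, hi, target):
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def descending_search(a, lo, hi, target):
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] == target:
            return mid
        if a[mid] > target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def bitonic_search(a, target):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            # Everything on the far side from the max is even smaller,
            # so keep only the side that contains the max.
            if mid < hi and a[mid] < a[mid + 1]:
                lo = mid + 1         # still rising: max is on the right
            else:
                hi = mid - 1         # falling: max is on the left
        else:
            # a[mid] > target: the out-of-order elements on either side
            # are all bigger than the target, so plain searches are safe.
            found = ascending_search(a, lo, mid - 1, target)
            if found != -1:
                return found
            return descending_search(a, mid + 1, hi, target)
    return -1


data = [1, 3, 4, 6, 9, 14, 11, 7, 2, -4, -9]
print(bitonic_search(data, 2))   # 8
print(bitonic_search(data, 5))   # -1
```

The loop narrows the range while the middle element is too small, and hands over to the two ordinary binary searches the first time it is too big.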
Here is a fully working implementation (in C++) of the bitonic search in time ~2logN:
#include <iostream>
using namespace std;
const int N = 10;
void descending_binary_search(int (&array) [N], int left, int right, int value)
{
// cout << "descending_binary_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
// look at the middle of the interval
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// interval is not splittable
if (left+1 == right) {
return;
}
if (value < array[mid]) {
descending_binary_search(array, mid+1, right, value);
}
else {
descending_binary_search(array, left, mid, value);
}
}
void ascending_binary_search(int (&array) [N], int left, int right, int value)
{
// cout << "ascending_binary_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
// look at the middle of the interval
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// interval is not splittable
if (left+1 == right) {
return;
}
if (value > array[mid]) {
ascending_binary_search(array, mid+1, right, value);
}
else {
ascending_binary_search(array, left, mid, value);
}
}
void bitonic_search(int (&array) [N], int left, int right, int value)
{
// cout << "bitonic_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// not splittable interval
if (left+1 == right) {
return;
}
if(array[mid] > array[mid-1]) {
if (value > array[mid]) {
return bitonic_search(array, mid+1, right, value);
}
else {
ascending_binary_search(array, left, mid, value);
descending_binary_search(array, mid+1, right, value);
}
}
else {
if (value > array[mid]) {
bitonic_search(array, left, mid, value);
}
else {
ascending_binary_search(array, left, mid, value);
descending_binary_search(array, mid+1, right, value);
}
}
}
int main()
{
int array[N] = {2, 3, 5, 7, 9, 11, 13, 4, 1, 0};
int value = 4;
int left = 0;
int right = N;
// print "value found" if the desired value is in the bitonic array
bitonic_search(array, left, right, value);
return 0;
}
The algorithm works recursively by combining bitonic and binary searches:
def bitonic_search (array, value, lo = 0, hi = array.length - 1)
    if array[lo] == value then return lo
    if array[hi] == value then return hi
    mid = (hi + lo) / 2
    if array[mid] == value then return mid
    if (mid > 0 & array[mid-1] < array[mid])
       | (mid < array.length-1 & array[mid+1] > array[mid]) then
        # max is to the right of mid
        bin = binary_search(array, value, lo, mid-1)
        if bin != -1 then return bin
        return bitonic_search(array, value, mid+1, hi)
    else # max is to the left of mid
        bin = binary_search(array, value, mid+1, hi)
        if bin != -1 then return bin
        return bitonic_search(array, value, lo, mid-1)
So the recursive formula for the time is f(l) = f(l/2) + log(l/2) + c where log(l/2) comes from the binary search and c is the cost of the comparisons done in the function body.
The answers provided so far have a worst-case time complexity of (N/2)*logN, because the worst case may include many unnecessary sub-searches. A modification is to compare the target value with the leftmost and rightmost elements of each subsequence before searching it. If the target value is not between the two ends of a monotonic subsequence, or is less than both ends of a bitonic subsequence, the subsequent search is redundant. This modification leads to 2lgN complexity.
There are 5 main cases depending on where the max element of array is, and whether middle element is greater than desired value
Calculate middle element.
Compare the middle element with the desired value; if they match, the search ends. Otherwise proceed to the next step.
Compare middle element with neighbors to see if max element is on left or right. If both of the neighbors are less than middle element, then element is not present in the array, hence exit.(Array mentioned in the question will hit this case first as 14, the max element, is in middle)
If middle element is less than desired value and max element is on right, do bitonic search in right subarray
If middle element is less than desired value and max element is on left, do bitonic search in left subarray
If middle element is greater than desired value and max element is on left, do descending binary search in right subarray
If middle element is greater than desired value and max element is on right, do ascending binary search in left subarray
In the worst case we will be doing two comparisons each time array is divided in half, hence complexity will be 2*logN
public int FindLogarithmicGood(int value)
{
int lo = 0;
int hi = _bitonic.Length - 1;
int mid;
while (hi - lo > 1)
{
mid = lo + ((hi - lo) / 2);
if (value < _bitonic[mid])
{
return DownSearch(lo, hi - lo + 1, mid, value);
}
else
{
if (_bitonic[mid] < _bitonic[mid + 1])
lo = mid;
else
hi = mid;
}
}
return _bitonic[hi] == value
? hi
: _bitonic[lo] == value
? lo
: -1;
}
where DownSearch is
public int DownSearch(int index, int count, int mid, int value)
{
int result = BinarySearch(index, mid - index, value);
if (result < 0)
result = BinarySearch(mid, index + count - mid, value, false);
return result;
}
and BinarySearch is
/// <summary>
/// Exactly log(n) on average and worst cases.
/// Note: System.Array.BinarySearch uses 2*log(n) in the worst case.
/// </summary>
/// <returns>array index</returns>
public int BinarySearch(int index, int count, int value, bool asc = true)
{
if (index < 0 || count < 0)
throw new ArgumentOutOfRangeException();
if (_bitonic.Length < index + count)
throw new ArgumentException();
if (count == 0)
return -1;
// "lo minus one" trick
int lo = index - 1;
int hi = index + count - 1;
int mid;
while (hi - lo > 1)
{
mid = lo + ((hi - lo) / 2);
if ((asc && _bitonic[mid] < value) || (!asc && _bitonic[mid] > value))
lo = mid;
else
hi = mid;
}
return _bitonic[hi] == value ? hi : -1;
}
Finding the change of sign among the first order differences, by standard dichotomic search, will take 2Lg(n) array accesses.
You can do slightly better by using the search strategy for the maximum of a unimodal function known as Fibonacci search. After n steps each involving a single lookup, you reduce the interval size by a factor Fn, corresponding to about Log n/Log φ ~ 1.44Lg(n) accesses to find the maximum.
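The 1.44 factor can be checked with a couple of lines of Python. The names `phi`, `factor` and `lookups` are made up for this check, and the counts ignore additive constants.

```python
import math

phi = (1 + math.sqrt(5)) / 2       # golden ratio
factor = 1 / math.log2(phi)        # Log n / Log phi = factor * Lg(n)


def lookups(n):
    """Approximate lookup counts for n items (additive constants ignored)."""
    lg = math.log2(n)
    return {"dichotomic": 2 * lg, "fibonacci": factor * lg}


print(round(factor, 2))            # 1.44
print(lookups(1024))
```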
This marginal gain makes a little more sense when array accesses are instead costly function evaluations.
When it comes to searching algorithms in O(log N) time, binary search is the tool to think of.
The concept here is to first find the peak point.
For example: Array = [1 3 5 6 7 12 6 4 2] -> here, 12 is the peak. Once it is detected, mark its index as mid, then simply do a binary search in Array[0:mid] and in Array[mid:len(Array)].
Note: the second part, from mid to len(Array), is in descending order, so the binary search needs a small variation there.
Finding the bitonic point (written in Python):
def find_bitonic_point(arr):
    n = len(arr)
    start, end = 0, n-1
    while start <= end:
        mid = start + (end-start)//2
        if (mid == 0 or arr[mid-1] < arr[mid]) and (mid==n-1 or arr[mid+1] < arr[mid]):
            return mid
        if mid > 0 and arr[mid-1] > arr[mid]:
            end = mid-1
        else:
            start = mid+1
    return -1
Once the index is found, do the respective binary searches. Voilà, all done :-)
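For completeness, here is a hedged sketch of the whole three-search approach in Python, including the "small variation" for the descending half. The function names and the `ascending` flag are my own; the peak finder is a slightly different loop from the one above but looks for the same index.

```python
def find_peak(arr):
    """Index of the maximum of a non-empty bitonic array."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < arr[mid + 1]:
            lo = mid + 1          # still rising: peak is to the right
        else:
            hi = mid              # falling: peak is at mid or to the left
    return lo


def binary_search(arr, lo, hi, target, ascending=True):
    """Binary search on arr[lo..hi], sorted ascending or descending."""
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] == target:
            return mid
        # Move right when the target lies beyond mid in the sort direction.
        if (arr[mid] < target) == ascending:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def search_via_peak(arr, target):
    if not arr:
        return -1
    peak = find_peak(arr)
    found = binary_search(arr, 0, peak, target, ascending=True)
    if found != -1:
        return found
    return binary_search(arr, peak + 1, len(arr) - 1, target, ascending=False)


arr = [1, 3, 5, 6, 7, 12, 6, 4, 2]
print(find_peak(arr))             # 5
print(search_via_peak(arr, 4))    # 7
print(search_via_peak(arr, 8))    # -1
```

Note that the example array contains 6 twice; this sketch returns the occurrence on the ascending side (index 3).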
For a binary split, there are three cases:
the max item is on the right: binary search the left part and bitonic search the right part.
the max item is on the left: binary search the right part and bitonic search the left part.
the max item is exactly at the split point: binary search both the left and the right part.
Caution: the binary searches used on the left and on the right are different because of the increasing/decreasing order.
public static int bitonicSearch(int[] a, int lo, int hi, int key) {
if (lo > hi) // empty range: the key is not here
return -1;
int mid = (lo + hi) / 2;
int now = a[mid];
if (now == key)
return mid;
// deal with edge cases
int left = (mid == 0)? a[mid] : a[mid - 1];
int right = (mid == a.length-1)? a[mid] : a[mid + 1];
int leftResult, rightResult;
if (left < now && now < right) { // max item is at right
leftResult = binarySearchIncreasing(a, lo, mid - 1, key);
if (leftResult != -1)
return leftResult;
return bitonicSearch(a, mid + 1, hi, key);
}
else if (left > now && now > right) { // max item is at left
rightResult = binarySearchDecreasing(a, mid + 1, hi, key);
if (rightResult != -1)
return rightResult;
return bitonicSearch(a, lo, mid - 1, key);
}
else { // max item stands at the split point exactly
leftResult = binarySearchIncreasing(a, lo, mid - 1, key);
if (leftResult != -1)
return leftResult;
return binarySearchDecreasing(a, mid + 1, hi, key);
}
}
I have a given tree with n nodes. The task is to find the number of subtrees of the given tree whose number of outgoing edges to the complement is less than or equal to a given number K.
for example: If n=3 and k=1
and the given tree is 1---2---3
Then the total valid subtrees would be 6
{}, {1}, {3}, {1,2}, {2,3}, {1,2,3}
I know I can enumerate all 2^n trees and check the valid ones, but is there some approach that is faster? Can I achieve polynomial time in n? Something close to O(n^3) or even O(n^4) would be nice.
EDIT: for k=1 this value turns out to be 2*n
This is a fairly typical instance of the DP-on-a-tree paradigm. Let's generalize the problem slightly by allowing the specification of a root vertex v and stratifying the counts of the small-boundary trees in two ways: whether v is included, and how many edges comprise the boundary.
The base case is easy. There are no edges and thus two subtrees: one includes v, the other excludes v, and both have no boundary edges. Otherwise, let e = {v, w} be an edge incident to v. The instance looks like this.
|\           /|
| \    e    / |
|L  v-----w  R|
| /         \ |
|/           \|
Compute recursively the stratified counts for L rooted at v and R rooted at w.
Subtrees that include v consist of a subtree in L that includes v, plus optionally e and a subtree in R that includes w. Subtrees that don't include v consist of either a subtree in L that doesn't include v, or a subtree in R (double counting the empty tree). This means we can obtain the stratified counts by convolving the stratified counts for L with the stratified counts for R.
Here's how this works on your example. Let's choose root 1.
e
1---2---3
We choose e as shown and recurse.
1
The vector for includes-1 is [1], since the one subtree is {1}, with no boundary. The vector for excludes-1 is [1], since the one subtree is {}, also with no boundary.
2---3
We compute 2 and 3 as we did for 1. The vector for includes-2 is [1, 1], since {2, 3} has no boundary edges, and {2} has one. We obtained this vector by adding the includes-2 vector for 2, shifted by one because of the new boundary edge to make [0, 1], to the convolution of the includes-2 vector for 2 with the includes-3 vector for 3, which is [1, 0]. The vector for excludes-2 is [1] + [1, 1] - [1] = [1, 1], where [1, 1] is the sum of the shifted includes-3 vector and the excludes-3 vector, and the subtraction is to compensate for double-counting {}.
Now, for the original invocation, to get the includes-1 vector, we add [0, 1], the includes-1 vector for 1 shifted by one, to the convolution of [1] with [1, 1], obtaining [1, 2]. To check: {1, 2, 3} has no boundary, and {1} and {1, 2} have one boundary edge. The excludes-1 vector is [1] + [1, 2, 1] - [1] = [1, 2, 1]. To check: {} has no boundary, {2, 3} and {3} have one boundary edge, and {2} has two boundary edges.
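The recurrences can be written down compactly. The following Python sketch follows the description above with a plain quadratic convolution; the function names (`convolve`, `add_vectors`, `count_subtrees`) are my own, vertices are assumed to be numbered 1..n, and the recursion is only suitable for trees of moderate depth.

```python
def convolve(a, b):
    """Plain O(len(a) * len(b)) polynomial multiplication."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def add_vectors(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def count_subtrees(n, edges, k):
    """Count subtrees (including the empty one) with at most k boundary edges."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)

    def solve(v, parent):
        # inc[b]: subtrees containing v with b boundary edges
        # exc[b]: subtrees not containing v (the empty one included)
        inc, exc = [1], [1]
        for w in adj[v]:
            if w == parent:
                continue
            inc_w, exc_w = solve(w, v)
            # v without w costs one boundary edge; v with w is a convolution
            new_inc = add_vectors([0] + inc, convolve(inc, inc_w))
            # a subtree containing w but not v also pays for the edge {v, w}
            new_exc = add_vectors(add_vectors(exc, [0] + inc_w), exc_w)
            new_exc[0] -= 1          # the empty subtree was counted twice
            inc, exc = new_inc, new_exc
        return inc, exc

    inc, exc = solve(1, 0)
    total = add_vectors(inc, exc)
    return sum(total[:k + 1])


print(count_subtrees(3, [(1, 2), (2, 3)], 1))   # 6
```

For the path 1---2---3 the combined vector is [2, 4, 1], so k = 1 gives 6, matching the example in the question.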
Here is my Python 2 implementation of David Eisenstat's solution:
from sys import stdin
from numpy import *
from scipy import *
def roundup_pow2(x):
    """
    Round up to power of 2 (obfuscated and unintentionally faster :).
    """
    while x&(x-1):
        x = (x|(x>>1))+1
    return max(x,1)
def to_long(x):
    return long(rint(x))
def poly_mul(a,b):
    n = len(a) + len(b) - 1
    nr = roundup_pow2(n)
    a += [0L]*(nr-len(a))
    b += [0L]*(nr-len(b)) # pad with zeros to length n
    u = fft(a)
    v = fft(b)
    w = ifft(u*v)[:n].real # ifft == inverse fft
    return map(to_long,w)
def pad(l,s) :
    return l+[0L]*(s-len(l))
def make_tree(l,x,y):
    l[x][y]=y
    l[x].pop(y)
    for child in l[x]:
        make_tree(l,child,x)
def cut_tree(l,x) :
    if len(l[x])==0:
        return [1L],[1L]
    y,_ = l[x].popitem()
    ai,ax=cut_tree(l,x)
    bi,bx=cut_tree(l,y)
    ci=[0L]+ai
    tmp=poly_mul(ai,bi)
    padlen=max(len(ci),len(tmp))
    ci=pad(ci,padlen)
    tmp=pad(tmp,padlen)
    ci=map(add,ci,tmp)
    cx=[0L]+bi
    padlen=max(len(cx),len(bx),len(ax))
    cx=pad(cx,padlen)
    bx=pad(bx,padlen)
    ax=pad(ax,padlen)
    tmp=pad([-1],padlen)
    cx=map(add,cx,bx)
    cx=map(add,cx,ax)
    cx=map(add,cx,tmp)
    return ci,cx
n,k = map(int,raw_input().split())
l=[{}]
for i in range(1,n+1):
    d={}
    l.append(d)
for i in range(1,n):
    x,y = map(int,raw_input().split())
    l[x][y]=y
    l[y][x]=x
make_tree(l,1,0)
i,x = cut_tree(l,1)
padlen=max(len(i),len(x))
i=pad(i,padlen)
x=pad(x,padlen)
combined=map(add,i,x)
sum=0L
for i in range(0,k+1) :
    sum+=combined[i]
print sum
Let us create a slightly bigger tree like below.
        1
      / | \
     2  3  \
    /       4
   7       / \
          5   6
Let us define a function F(a, k) for each node 'a' with 'k' edges removed from node 'a' and below.
i.e. if 'k' edges are removed from node 'a' then we create F(a, k) number of subtrees.
(If 'a' is not root, it is assumed to be connected to its parent).
e.g. in above tree ( F(4, 1) = 2 ), as we create 2 trees by removing 2 edges below '4'
(we assume that 4 is connected to parent and subtrees (5) and (6) are not counted in F(4,1))
We traverse the tree and calculate F for each child first. Then, using the children's F, we calculate
the parent's F.
F(a, k) of a leaf node is 0 for all k.
For non-leaf nodes:
F(a, k) = SUM(F(child, k)) + Z
F(child, k) is calculated recursively.
Z, on the other hand, is calculated by finding all combinations in which each child takes
r_i edges out of k such that SUM(r_i) = k.
Programmatically, this can be done by fixing 'j' edges for a given child and then
calculating the number of trees created by distributing the remaining 'k-j' edges to the other children.
e.g. in above tree
F(1, 3) = F(2, 3) + F(3, 3) + F(4, 3) + // we pass k as-is to each child
F(2,1)*F(3,1)*F(4,1) + F(2,1)*F(3,2) + F(2,1)*F(4,2) + // node 2 consumes 1 edge, the other children share 2
F(2,2)*F(3,1) + F(2,2)*F(4,1) + // node 2 consumes 2 edges, the other children share 1
F(3,1)*F(4,2) + F(3,2)*F(4,1) // node 2 consumes none, nodes 3 and 4 share all 3
As we see above, we fix 'j' edges for node 2 and then distribute the remaining '3-j' edges to the other children.
We keep doing this for all children of '1'.
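To see which product terms this rule generates, here is a small Python 3 sketch. It reflects my reading of the rule (every way of giving at least one edge each to two or more children so that the shares add up to k); it only lists the products and leaves out the detach handling. The names `compositions` and `z_terms` are made up for illustration.

```python
from itertools import combinations

def compositions(total, parts):
    """Yield every tuple of `parts` positive integers that sums to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    # Leave at least one edge for each of the remaining parts.
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def z_terms(children, k):
    """List the products as tuples of (child, edges) pairs, e.g. ((2, 1), (3, 2))."""
    terms = []
    for m in range(2, len(children) + 1):
        for subset in combinations(children, m):
            for shares in compositions(k, m):
                terms.append(tuple(zip(subset, shares)))
    return terms

for term in z_terms([2, 3, 4], 3):
    print(" * ".join("F(%d,%d)" % pair for pair in term))
```

For children 2, 3, 4 and k = 3 this lists seven products: six that split the edges between two children, including F(3,2) * F(4,1), and the single three-way split F(2,1) * F(3,1) * F(4,1).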
Additionally, we create subtrees when we detach a node from its parent.
e.g. in the above case, when we calculate F(1, 3) we create the following
detached trees:
detached_trees += F(2, 2) + F(3, 2) + F(4, 2)
Here we assume that one edge is consumed by detaching the child node from its parent,
and if we then consume 'k-1' edges in the child node, we create F(child, k-1) subtrees.
These trees are counted and stored separately in detached_trees.
Once we have calculated F(a, k) for all nodes,
the total number of subtrees is 'SUM(F(root, k)) for all k' + 'total nodes - 1' + detached_trees.
We add 'total nodes - 1' to our total because when a node (except the root) is detached
from the tree, it creates two trees with 1 edge missing. While one of these trees is counted
in F(parent, 1), the other is not counted anywhere and hence needs to be added to the total.
Here is the C code for the above algorithm. The recursion can be further optimized.
#include <stdio.h>
#include <stdlib.h>
#define MAX 51
/* We use the last entry of alist to store number of children of a given node */
#define NUM_CHILD(alist, node) (alist[node][MAX])
int alist[MAX][MAX+1] = {0};
long F[MAX][MAX]={0};
long detached_subtrees = 0;
/*
 * We fix one of the child nodes for 'i' edges out of 'n', then we traverse
 * over the rest of the children to get 'n-i' edges; we do so recursively.
 * Note that if 'n' is 1, we can always build a subtree by detaching.
 */
long REST_OF_NODES_SUM(int node, int q, int n)
{
    long sum = 0, i, node2, ret = 0, nd;
    /* fix node2 and calculate the subtrees for the rest of the children */
    for (nd = q; nd < NUM_CHILD(alist, node); nd++) {
        node2 = alist[node][nd];
        /* Consume 'i' edges and send 'n-i' to the other children of node */
        for (i = 1; i < n; i++) {
            sum = REST_OF_NODES_SUM(node, nd + 1, n - i);
            ret += (F[node2][i] * sum);
            /* Add one for 'node2' getting detached from tree */
            if (i == 1) { ret += sum; }
        }
        ret += F[node2][n];
        /* If only one edge is to be consumed, we detach 'node2' from the tree */
        if (n == 1) { ret++; }
    }
    return ret;
}
void get_counts(int N, int K, int node, int root)
{
    int child_node;
    int i, j, p, k;
    if (NUM_CHILD(alist, node) == 0) { return; }
    for (i = 0; i < NUM_CHILD(alist, node); i++) {
        child_node = alist[node][i];
        /* Do a recursive traversal of all children */
        get_counts(N, K, child_node, node);
        F[node][1] += (F[child_node][1]);
    }
    F[node][1] += NUM_CHILD(alist, node);
    for (k = 2; k <= K; k++) {
        for (p = 0; p < NUM_CHILD(alist, node); p++) {
            child_node = alist[node][p];
            F[node][k] += F[child_node][k];
            /* If we remove this child, then we create subtrees in the child */
            detached_subtrees += F[child_node][k-1];
            /* Assume that 'child_node' is detached, find the trees created by the rest
             * of the children for 'k-1' edges */
            F[node][k] += REST_OF_NODES_SUM(node, p + 1, k - 1);
            /* Fix one child node for 'j' edges out of 'k' and traverse over the rest of
             * the children for 'k - j' edges */
            for (j = 1; j < k; j++) {
                if (F[child_node][j]) F[node][k] += (F[child_node][j] * REST_OF_NODES_SUM(node, p + 1, k - j));
            }
        }
    }
}
void remove_back_ref(int parent, int node)
{
    int c;
    for (c = 0; c < NUM_CHILD(alist, node); c++) {
        if (alist[node][c] == parent) {
            if ((c + 1) == NUM_CHILD(alist, node)) {
                NUM_CHILD(alist, node)--;
                alist[node][c] = 0;
            } else {
                /* move last entry here */
                alist[node][c] = alist[node][NUM_CHILD(alist, node)-1];
                alist[node][NUM_CHILD(alist, node)-1] = 0;
                NUM_CHILD(alist, node)--;
            }
        }
    }
}
/* go to each child and remove back links */
void normalize(int node)
{
    int j, child;
    for (j = 0; j < NUM_CHILD(alist, node); j++) {
        child = alist[node][j];
        remove_back_ref(node, child);
        normalize(child);
    }
}
long cutTree(int N, int K, int edges_rows, int edges_columns, int** edges)
{
    int i, j;
    int node, index;
    long ret = 0;
    /* build an adjacency list from the above edges */
    for (i = 0; i < edges_rows; i++) {
        alist[edges[i][0]][NUM_CHILD(alist, edges[i][0])] = edges[i][1];
        alist[edges[i][1]][NUM_CHILD(alist, edges[i][1])] = edges[i][0];
        NUM_CHILD(alist, edges[i][0])++;
        NUM_CHILD(alist, edges[i][1])++;
    }
    /* get rid of the back links in children */
    normalize(1);
    get_counts(N, K, 1, 1);
    for (i = 1; i <= K; i++) { ret += F[1][i]; }
    /* Every node (except root) when detached from tree, will create one extra subtree. */
    ret += (N - 1);
    /* The subtrees created by detaching from parent */
    ret += detached_subtrees;
    /* Add two for empty and full tree */
    ret += 2;
    return ret;
}
int main(int argc, char *argv[])
{
    int **arr;
    int ret, i, N, K, x, y;
    scanf("%d%d", &N, &K);
    arr = malloc((N - 1) * sizeof(int*));
    for (i = 0; i < (N - 1); i++) { arr[i] = malloc(2*sizeof(int)); }
    for (i = 0; i < N-1; i++) { scanf("%d%d", &x, &y); arr[i][0] = x; arr[i][1] = y; }
    printf("MAX %d ret %ld\n", MAX, cutTree(N, K, N-1, 2, arr));
    return 0;
}