How can I solve the problem on the segment tree - algorithm

I have an array a[0..n - 1]. I have 2 types operations:
sum l r:
sum = a[l] * 1 + a[l+1] * 2 + ... + a[r] * (r - l)
update index value: a[index] = value
How can I modify standart sum segment tree for this task?

Store a second segment tree for sums of the array b, where b[i] = (i+1) * a[i]. Then, setting a[index] = value should also update the second segment tree with b[index] = value * (index + 1).
To calculate the sum you want, given the ability to query the normal subarray sums of a and b:
special_sum(l, r) = a[l] * 1 + a[l+1] * 2 + ... + a[r] * (r - l)
can be rewritten as:
(l+1) * a[l] + (l+2) * a[l+1] + ... + (r) * a[r]
- l * (a[l] + a[l+1] + ... + a[r])
which becomes
b[l] + b[l+1] + ... + b[r]
- l * (a[l] + a[l+1] + ... + a[r])

Related

The number of compares for an optimized merge-sort on an already sorted array

I would like to know how I can calculate the number of compares for an optimized merge sort described below, on an array already sorted.
It is like an ordinary merge-sort but does not 'merge' if two subarrays are already sorted.
mergesort(arr, auxArr, lo, hi)
if (hi <= lo) return;
mid = lo + (hi + lo) / 2
sort(arr, auxArr, lo, mid)
sort(arr, auxArr, mid+1, hi)
if (less(arr[mid], arr[mid+1]) return; // optimization
merge(arr, auxArr, lo, mid ,hi)
Here is my attempt to find out the number of compares
Let C(N) be the number of compares
C(N) = 2 * C(N/2) + 1 (number of compares to sort two subarrays + 1 compare for our optimization)
Using recurrence relation,
C(N) = = 2 * C(N/2) + 1
= 2 ( 2C(N/4) + 1 ) + 1
= 4 C(N/4) + 1 + y
...
= N C(N/N) + 1 + 2 + 4 + ... + 2^(log2 (N/2))
= 1 + 2 + 4 + ... + 2^(log2 (N/2))
But the answer says it is ~ log2(N), and I am stuck. Any ideas?

The sum from 1 to n in theta(log n)

Is there anyway to calculate the sum of 1 to n in Theta(log n)?
Of course, the obvious way to do it is sum = n*(n+1)/2.
However, for practicing, I want to calculate in Theta(log n).
For example,
sum=0; for(int i=1; i<=n; i++) { sum += i}
this code will calculate in Theta(n).
Fair way (without using math formulas) assumes direct summing all n values, so there is no way to avoid O(n) behavior.
If you want to make some artificial approach to provide exactly O(log(N)) time, consider, for example, using powers of two (knowing that Sum(1..2^k = 2^(k-1) + 2^(2*k-1) - for example, Sum(8) = 4 + 32). Pseudocode:
function Sum(n)
if n < 2
return n
p = 1 //2^(k-1)
p2 = 2 //2^(2*k-1)
while p * 4 < n:
p = p * 2;
p2 = p2 * 4;
return p + p2 + ///sum of 1..2^k
2 * p * (n - 2 * p) + ///(n - 2 * p) summands over 2^k include 2^k
Sum(n - 2 * p) ///sum of the rest over 2^k
Here 2*p = 2^k is the largest power of two not exceeding N. Example:
Sum(7) = Sum(4) + 5 + 6 + 7 =
Sum(4) + (4 + 1) + (4 + 2) + (4 + 3) =
Sum(4) + 3 * 4 + Sum(3) =
Sum(4) + 3 * 4 + Sum(2) + 1 * 2 + Sum(1) =
Sum(4) + 3 * 4 + Sum(2) + 1 * 2 + Sum(1) =
2 + 8 + 12 + 1 + 2 + 2 + 1 = 28

How to pick a number based on probability?

I want to select a random number from 0,1,2,3...n, however I want to make it that the chance of selecting k|0<k<n will be lower by multiplication of x from selecting k - 1 so x = (k - 1) / k. As bigger the number as smaller the chances to pick it up.
As an answer I want to see the implementation of the next method:
int pickANumber(n,x)
This is for a game that I am developing, I saw those questions as related but not exactly that same:
How to pick an item by its probability
C Function for picking from a list where each element has a distinct probabili
p1 + p2 + ... + pn = 1
p1 = p2 * x
p2 = p3 * x
...
p_n-1 = pn * x
Solving this gives you:
p1 + p2 + ... + pn = 1
(p2 * x) + (p3 * x) + ... + (pn * x) + pn = 1
((p3*x) * x) + ((p4*x) * x) + ... + ((p_n-1*x) * x) + pn = 1
....
pn* (x^(n-1) + x^(n-2) + ... +x^1 + x^0) = 1
pn*(1-x^n)/(1-x) = 1
pn = (1-x)/(1-x^n)
This gives you the probability you need to set to pn, and from it you can calculate the probabilities for all other p1,p2,...p_n-1
Now, you can use a "black box" RNG that chooses a number with a distribution, like those in the threads you mentioned.
A simple approach to do it is to set an auxillary array:
aux[i] = p1 + p2 + ... + pi
Now, draw a random number with uniform distribution between 0 to aux[n], and using binary search (aux array is sorted), get the first value, which matching value in aux is greater than the random uniform number you got
Original answer, for substraction (before question was editted):
For n items, you need to solve the equation:
p1 + p2 + ... + pn = 1
p1 = p2 + x
p2 = p3 + x
...
p_n-1 = pn + x
Solving this gives you:
p1 + p2 + ... + pn = 1
(p2 + x) + (p3 + x) + ... + (pn + x) + pn = 1
((p3+x) + x) + ((p4+x) + x) + ... + ((p_n-1+x) + x) + pn = 1
....
pn* ((n-1)x + (n-2)x + ... +x + 0) = 1
pn* x = n(n-1)/2
pn = n(n-1)/(2x)
This gives you the probability you need to set to pn, and from it you can calculate the probabilities for all other p1,p2,...p_n-1
Now, you can use a "black box" RNG that chooses a number with a distribution, like those in the threads you mentioned.
Be advised, this is not guaranteed you will have a solution such that 0<p_i<1 for all i, but you cannot guarantee one given from your requirements, and it is going to depend on values of n and x to fit.
Edit This answer was for the OPs original question, which was different in that each probability was supposed to be lower by a fixed amount than the previous one.
Well, let's see what the constraints say. You want to have P(k) = P(k - 1) - x. So we have:
P(0)
P(1) = P(0) - x
P(2) = P(0) - 2x
...
In addition, Sumk P(k) = 1. Summing, we get:
1 = (n + 1)P(0) -x * n / 2 (n + 1),
This gives you an easy constraint between x and P(0). Solve for one in terms of the other.
For this I would use the Mersenne Twister algorithm for a uniform distribution which Boost provides, then have a mapping function to map the results of that random distribution to the actual number select.
Here's a quick example of a potential implementation, although I left out the quadtratic equation implementation since it is well known:
int f_of_xib(int x, int i, int b)
{
return x * i * i / 2 + b * i;
}
int b_of_x(int i, int x)
{
return (r - ( r ) / 2 );
}
int pickANumber(mt19937 gen, int n, int x)
{
// First, determine the range r required where the probability equals i * x
// since probability of each increasing integer is x higher of occuring.
// Let f(i) = r and given f'(i) = x * i then r = ( x * i ^2 ) / 2 + b * i
// where b = ( r - ( x * i ^ 2 ) / 2 ) / i . Since r = x when i = 1 from problem
// definition, this reduces down to b = r - r / 2. therefore to find r_max simply
// plugin x to find b, then plugin n for i, x, and b to get r_max since r_max occurs
// when n == i.
// Find b when
int b = b_of_x(x);
int r_max = f_of_xib(x, n, b);
boost::uniform_int<> range(0, r_max);
boost::variate_generator<boost::mt19937&, boost::uniform_int<> > next(gen, range);
// Now to map random number to desired number, just find the positive value for i
// when r is the return random number which boils down to finding the non-zero root
// when 0 = ( x * i ^ 2 ) / 2 + b * i - r
int random_number = next();
return quadtratic_equation_for_positive_value(1, b, r);
}
int main(int argc, char** argv)
{
mt19937 gen;
gen.seed(time(0));
pickANumber(gen, 10, 1);
system("pause");
}

How to find ith item in zigzag ordering?

A question last week defined the zig zag ordering on an n by m matrix and asked how to list the elements in that order.
My question is how to quickly find the ith item in the zigzag ordering? That is, without traversing the matrix (for large n and m that's much too slow).
For example with n=m=8 as in the picture and (x, y) describing (row, column)
f(0) = (0, 0)
f(1) = (0, 1)
f(2) = (1, 0)
f(3) = (2, 0)
f(4) = (1, 1)
...
f(63) = (7, 7)
Specific question: what is the ten billionth (1e10) item in the zigzag ordering of a million by million matrix?
Let's assume that the desired element is located in the upper half of the matrix. The length of the diagonals are 1, 2, 3 ..., n.
Let's find the desired diagonal. It satisfies the following property:
sum(1, 2 ..., k) >= pos but sum(1, 2, ..., k - 1) < pos. The sum of 1, 2, ..., k is k * (k + 1) / 2. So we just need to find the smallest integer k such that k * (k + 1) / 2 >= pos. We can either use a binary search or solve this quadratic inequality explicitly.
When we know the k, we just need to find the pos - (k - 1) * k / 2 element of this diagonal. We know where it starts and where we should move(up or down, depending on the parity of k), so we can find the desired cell using a simple formula.
This solution has an O(1) or an O(log n) time complexity(it depends on whether we use a binary search or solve the inequation explicitly in step 2).
If the desired element is located in the lower half of the matrix, we can solve this problem for a pos' = n * n - pos + 1 and then use symmetry to get the solution to the original problem.
I used 1-based indexing in this solution, using 0-based indexing might require adding +1 or -1 somewhere, but the idea of the solution is the same.
If the matrix is rectangular, not square, we need to consider the fact the length of diagonals look this way: 1, 2, 3, ..., m, m, m, .., m, m - 1, ..., 1(if m <= n) when we search for the k, so the sum becomes something like k * (k + 1) / 2 if k <= m and k * (k + 1) / 2 + m * (k - m) otherwise.
import math, random
def naive(n, m, ord, swap = False):
dx = 1
dy = -1
if swap:
dx, dy = dy, dx
cur = [0, 0]
for i in range(ord):
cur[0] += dy
cur[1] += dx
if cur[0] < 0 or cur[1] < 0 or cur[0] >= n or cur[1] >= m:
dx, dy = dy, dx
if cur[0] >= n:
cur[0] = n - 1
cur[1] += 2
if cur[1] >= m:
cur[1] = m - 1
cur[0] += 2
if cur[0] < 0: cur[0] = 0
if cur[1] < 0: cur[1] = 0
return cur
def fast(n, m, ord, swap = False):
if n < m:
x, y = fast(m, n, ord, not swap)
return [y, x]
alt = n * m - ord - 1
if alt < ord:
x, y = fast(n, m, alt, swap if (n + m) % 2 == 0 else not swap)
return [n - x - 1, m - y - 1]
if ord < (m * (m + 1) / 2):
diag = int((-1 + math.sqrt(1 + 8 * ord)) / 2)
parity = (diag + (0 if swap else 1)) % 2
within = ord - (diag * (diag + 1) / 2)
if parity: return [diag - within, within]
else: return [within, diag - within]
else:
ord -= (m * (m + 1) / 2)
diag = int(ord / m)
within = ord - diag * m
diag += m
parity = (diag + (0 if swap else 1)) % 2
if not parity:
within = m - within - 1
return [diag - within, within]
if __name__ == "__main__":
for i in range(1000):
n = random.randint(3, 100)
m = random.randint(3, 100)
ord = random.randint(0, n * m - 1)
swap = random.randint(0, 99) < 50
na = naive(n, m, ord, swap)
fa = fast(n, m, ord, swap)
assert na == fa, "(%d, %d, %d, %s) ==> (%s), (%s)" % (n, m, ord, swap, na, fa)
print fast(1000000, 1000000, 9999999999, False)
print fast(1000000, 1000000, 10000000000, False)
So the 10-billionth element (the one with ordinal 9999999999), and the 10-billion-first element (the one with ordinal 10^10) are:
[20331, 121089]
[20330, 121090]
An analytical solution
In the general case, your matrix will be divided in 3 areas:
an initial triangle t1
a skewed part mid where diagonals have a constant length
a final triangle t2
Let's call p the index of your diagonal run.
We want to define two functions x(p) and y(p) that give you the column and row of the pth cell.
Initial triangle
Let's look at the initial triangular part t1, where each new diagonal is one unit longer than the preceding.
Now let's call d the index of the diagonal that holds the cell, and
Sp = sum(di) for i in [0..p-1]
We have p = Sp + k, with 0 <=k <= d and
Sp = d(d+1)/2
if we solve for d, it brings
d²+d-2p = 0, a quadratic equation where we retain only the positive root:
d = (-1+sqrt(1+8*p))/2
Now we want the highest integer value closest to d, which is floor(d).
In the end, we have
p = d + k with d = floor((-1+sqrt(1+8*p))/2) and k = p - d(d+1)/2
Let's call
o(d) the function that equals 1 if d is odd and 0 otherwise, and
e(d) the function that equals 1 if d is even and 0 otherwise.
We can compute x(p) and y(p) like so:
d = floor((-1+sqrt(1+8*p))/2)
k = p - d(d+1)/2
o = d % 2
e = 1 - o
x = e*d + (o-e)*k
y = o*d + (e-o)*k
even and odd functions are used to try to salvage some clarity, but you can replace
e(p) with 1 - o(p) and have slightly more efficient but less symetric formulaes for x and y.
Middle part
let's consider the smallest matrix dimension s, i.e. s = min (m,n).
The previous formulaes hold until x or y (whichever comes first) reaches the value s.
The upper bound of p such as x(i) <= s and y(i) <= s for all i in [0..p]
(i.e. the cell indexed by p is inside the initial triangle t1) is given by
pt1 = s(s+1)/2.
For p >= pt1, diagonal length remains equal to s until we reach the second triangle t2.
when inside mid, we have:
p = s(s+1)/2 + ds + k with k in [0..s[.
which yields:
d = floor ((p - s(s+1)/2)/s)
k = p - ds
We can then use the same even/odd trick to compute x(p) and y(p):
p -= s(s+1)/2
d = floor (p / s)
k = p - d*s
o = (d+s) % 2
e = 1 - o
x = o*s + (e-o)*k
y = e*s + (o-e)*k
if (n > m)
x += d+e
y -= e
else
y += d+o
x -= o
Final triangle
Using symetry, we can calculate pt2 = m*n - s(s+1)/2
We now face nearly the same problem as for t1, except that the diagonal may run in the same direction as for t1 or in the reverse direction (if n+m is odd).
Using symetry tricks, we can compute x(p) and y(p) like so:
p = n*m -1 - p
d = floor((-1+sqrt(1+8*p))/2)
k = p - d*(d+1)/2
o = (d+m+n) % 2
e = 1 - $o;
x = n-1 - (o*d + (e-o)*k)
y = m-1 - (e*d + (o-e)*k)
Putting all together
Here is a sample c++ implementation.
I used 64 bits integers out of sheer lazyness. Most could be replaced by 32 bits values.
The computations could be made more effective by precomputing a few more coefficients.
A good part of the code could be factorized, but I doubt it is worth the effort.
Since this is just a quick and dirty proof of concept, I did not optimize it.
#include <cstdio> // printf
#include <algorithm> // min
using namespace std;
typedef long long tCoord;
void panic(const char * msg)
{
printf("PANIC: %s\n", msg);
exit(-1);
}
struct tPoint {
tCoord x, y;
tPoint(tCoord x = 0, tCoord y = 0) : x(x), y(y) {}
tPoint operator+(const tPoint & p) const { return{ x + p.x, y + p.y }; }
bool operator!=(const tPoint & p) const { return x != p.x || y != p.y; }
};
class tMatrix {
tCoord n, m; // dimensions
tCoord s; // smallest dimension
tCoord pt1, pt2; // t1 / mid / t2 limits for p
public:
tMatrix(tCoord n, tCoord m) : n(n), m(m)
{
s = min(n, m);
pt1 = (s*(s + 1)) / 2;
pt2 = n*m - pt1;
}
tPoint diagonal_cell(tCoord p)
{
tCoord x, y;
if (p < pt1) // inside t1
{
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = d % 2;
tCoord e = 1 - o;
x = o*d + (e - o)*k;
y = e*d + (o - e)*k;
}
else if (p < pt2) // inside mid
{
p -= pt1;
tCoord d = (tCoord)floor(p / s);
tCoord k = p - d*s;
tCoord o = (d + s) % 2;
tCoord e = 1 - o;
x = o*s + (e - o)*k;
y = e*s + (o - e)*k;
if (m > n) // vertical matrix
{
x -= o;
y += d + o;
}
else // horizontal matrix
{
x += d + e;
y -= e;
}
}
else // inside t2
{
p = n * m - 1 - p;
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = (d + m + n) % 2;
tCoord e = 1 - o;
x = n - 1 - (o*d + (e - o)*k);
y = m - 1 - (e*d + (o - e)*k);
}
return{ x, y };
}
void check(void)
{
tPoint move[4] = { { 1, 0 }, { -1, 1 }, { 1, -1 }, { 0, 1 } };
tPoint pos;
tCoord dir = 0;
for (tCoord p = 0; p != n * m ; p++)
{
tPoint dc = diagonal_cell(p);
if (pos != dc) panic("zot!");
pos = pos + move[dir];
if (dir == 0)
{
if (pos.y == m - 1) dir = 2;
else dir = 1;
}
else if (dir == 3)
{
if (pos.x == n - 1) dir = 1;
else dir = 2;
}
else if (dir == 1)
{
if (pos.y == m - 1) dir = 0;
else if (pos.x == 0) dir = 3;
}
else
{
if (pos.x == n - 1) dir = 3;
else if (pos.y == 0) dir = 0;
}
}
}
};
void main(void)
{
const tPoint dim[] = { { 10, 10 }, { 11, 11 }, { 10, 30 }, { 30, 10 }, { 10, 31 }, { 31, 10 }, { 11, 31 }, { 31, 11 } };
for (tPoint d : dim)
{
printf("Checking a %lldx%lld matrix...", d.x, d.y);
tMatrix(d.x, d.y).check();
printf("done\n");
}
tCoord p = 10000000000;
tMatrix matrix(1000000, 1000000);
tPoint cell = matrix.diagonal_cell(p);
printf("Coordinates of %lldth cell: (%lld,%lld)\n", p, cell.x, cell.y);
}
Results are checked against "manual" sweep of the matrix.
This "manual" sweep is a ugly hack that won't work for a one-row or one-column matrix, though diagonal_cell() does work on any matrix (the "diagonal" sweep becomes linear in that case).
The coordinates found for the 10.000.000.000th cell of a 1.000.000x1.000.000 matrix seem consistent, since the diagonal d on which the cell stands is about sqrt(2*1e10), approx. 141421, and the sum of cell coordinates is about equal to d (121090+20330 = 141420). Besides, it is also what the two other posters report.
I would say there is a good chance this lump of obfuscated code actually produces an O(1) solution to your problem.

Recursive realtion solution in terms of variable(p)

u[0, 0] =1
u[m, n] =(1/(p + m + m^2)) (m*u[(m - 1), n] + n*u[(m + 1), (n - 2)] + m*(m - 1)*u[(m - 2), (n + 2)])
I want u[m,n] in terms of p; don't want answer like u[0,2]=2u[1,0]/p only in terms of p; like I want value of u[m,n] for different value of m and n.
As stated by Jonie Shih your recursion doesn't terminate. Perhaps you want to terminate for either argument less than one:
u[x_, y_] /; x < 1 || y < 1 = 1;
u[m_, n_] :=
(1/(p + m + m^2)) (m*u[(m - 1), n] + n*u[(m + 1), (n - 2)] +
m*(m - 1)*u[(m - 2), (n + 2)])
p = 4;
u[3, 5]
593037/1392640

Resources