Is there any way to calculate the sum of 1 to n in Theta(log n)?
Of course, the obvious way to do it is sum = n*(n+1)/2.
However, for practice, I want to calculate it in Theta(log n).
For example,
sum = 0; for (int i = 1; i <= n; i++) { sum += i; }
this code will calculate in Theta(n).
A fair approach (one that does not use math formulas) requires directly summing all n values, so there is no way to avoid O(n) behavior.
If you want to make some artificial approach that provides exactly O(log(N)) time, consider, for example, using powers of two (knowing that Sum(1..2^k) = 2^(k-1) + 2^(2*k-1); for example, Sum(8) = 4 + 32). Pseudocode:
function Sum(n)
    if n < 2
        return n
    p = 1   // 2^(k-1)
    p2 = 2  // 2^(2*k-1)
    while p * 4 < n:
        p = p * 2
        p2 = p2 * 4
    return p + p2 +           // sum of 1..2^k
        2 * p * (n - 2 * p) + // each of the (n - 2*p) summands above 2^k contributes a 2^k
        Sum(n - 2 * p)        // sum of the rest above 2^k
Here 2*p = 2^k is the largest power of two strictly less than n (for n > 2). Example:
Sum(7) = Sum(4) + 5 + 6 + 7 =
Sum(4) + (4 + 1) + (4 + 2) + (4 + 3) =
Sum(4) + 3 * 4 + Sum(3) =
Sum(4) + 3 * 4 + Sum(2) + 1 * 2 + Sum(1) =
2 + 8 + 12 + 1 + 2 + 2 + 1 = 28
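For reference, here is that pseudocode as a small, self-contained C program (my own translation, sanity-checked only on small inputs):

#include <stdio.h>

long long Sum(long long n) {
    if (n < 2)
        return n;
    long long p = 1;   /* 2^(k-1) */
    long long p2 = 2;  /* 2^(2*k-1) */
    while (p * 4 < n) {
        p *= 2;
        p2 *= 4;
    }
    /* p + p2 = Sum(1..2^k); the middle term adds 2^k once for each of
       the (n - 2p) summands above 2^k; the recursion sums the rest. */
    return p + p2 + 2 * p * (n - 2 * p) + Sum(n - 2 * p);
}

int main(void) {
    for (long long n = 0; n <= 10; n++)
        printf("Sum(%lld) = %lld\n", n, Sum(n)); /* Sum(7) = 28, Sum(10) = 55 */
    return 0;
}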
I am trying to prove an equation given in the CLRS exercise book. The equation is:
Sigma k=0 to k=infinity (k-1)/2^k = 0
I solved the LHS but my answer is 1 whereas the RHS should be 0
Following is my solution:
Let's say S = k/2^k = 1/2 + 2/2^2 + 3/2^3 + 4/2^4 ....
2S = 1 + 2/2 + 3/2^2 + 4/2^3 ...
2S - S = 1 + ( 2/2 - 1/2) + (3/2^2 - 2/2^2) + (4/2^3 - 3/2^3)..
S = 1+ 1/2 + 1/2^2 + 1/2^3 + 1/2^4..
S = 2 -- eq 1
Now let's say S1 = (k-1)/2^k = 0/2 + 1/2^2 + 2/2^3 + 3/2^4...
S - S1 = 1/2 + (2/2^2 - 1/2^2) + (3/2^3 - 2/2^3) + (4/2^4 - 3/2^4)....
S - S1 = 1/2 + 1/2^2 + 1/2^3 + 1/2^4...
= 1
From eq 1
2 - S1 = 1
S1 = 1
Whereas the required RHS is 0. Is there anything wrong with my solution? Thanks..
Yes, there is an issue in your solution to the problem.
While everything is correct in formulating the value of S, you have calculated the value of S1 incorrectly: you missed substituting the value for k = 0 in S1. For S, even after including the k = 0 term, that first term is 0, so it has no effect.
Therefore,
S1 = (k-1)/2^k = -1 + 0/2 + 1/2^2 + 2/2^3 + 3/2^4...
// you missed -1 here because you started substituting values from k=1
S - S1 = -(-1) + 1/2 + (2/2^2 - 1/2^2) + (3/2^3 - 2/2^3) + (4/2^4 - 3/2^4)....
S - S1 = 1 + (1/2 + 1/2^2 + 1/2^3 + 1/2^4...)
= 1 + 1
= 2.
From eq 1
2 - S1 = 2
S1 = 0.
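If you want to convince yourself numerically as well, a throwaway check like this (my own snippet, not from CLRS) shows the partial sums of S1 approaching 0:

#include <stdio.h>

int main(void) {
    double s1 = 0.0, pow2 = 1.0; /* pow2 holds 2^k */
    for (int k = 0; k <= 60; k++) {
        s1 += (k - 1) / pow2;    /* the k = 0 term contributes the crucial -1 */
        pow2 *= 2.0;
    }
    printf("partial sum up to k = 60: %.15f\n", s1); /* prints ~0 */
    return 0;
}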
A^2 + B^2 + C^2 + D^2 = N. Given an integer N, print out all possible combinations of integer values of A, B, C, D which solve the equation.
I am guessing we can do better than brute force.
Naive brute force would be something like:
n = 3200724;
lim = sqrt (n) + 1;
for (a = 0; a <= lim; a++)
    for (b = 0; b <= lim; b++)
        for (c = 0; c <= lim; c++)
            for (d = 0; d <= lim; d++)
                if (a * a + b * b + c * c + d * d == n)
                    printf ("%d %d %d %d\n", a, b, c, d);
Unfortunately, this will result in over ten trillion iterations of the inner test, which is not overly efficient.
You can actually do substantially better than that by discounting huge numbers of impossibilities at each level, with something like:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int n = atoi (argv[1]);
    int a, b, c, d, na, nb, nc, nd;
    int count = 0;
    for (a = 0, na = n; a * a <= na; a++) {
        for (b = 0, nb = na - a * a; b * b <= nb; b++) {
            for (c = 0, nc = nb - b * b; c * c <= nc; c++) {
                for (d = 0, nd = nc - c * c; d * d <= nd; d++) {
                    if (d * d == nd) {
                        printf ("%d %d %d %d\n", a, b, c, d);
                        count++;
                    }
                }
            }
        }
    }
    printf ("Found %d solutions\n", count);
    return 0;
}
It's still brute force, but not quite as brutish inasmuch as it understands when to stop each level of looping as early as possible.
On my (relatively) modest box, that takes under a second (a) to get all solutions for numbers up to 50,000. Beyond that, it starts taking more time:
n time taken
---------- ----------
100,000 3.7s
1,000,000 6m, 18.7s
For n = ten million, it had been going about an hour and a half before I killed it.
So, I would say brute force is perfectly acceptable up to a point. Beyond that, more mathematical solutions would be needed.
For even more efficiency, you could only check those solutions where d >= c >= b >= a. That's because you could then build up all the solutions from those combinations into permutations (with potential duplicate removal where the values of two or more of a, b, c, or d are identical).
In addition, the body of the d loop doesn't need to check every value of d, just the last possible one.
Getting the results for 1,000,000 in that case takes under ten seconds rather than over six minutes:
0 0 0 1000
0 0 280 960
0 0 352 936
0 0 600 800
0 24 640 768
: : : :
424 512 512 544
428 460 500 596
432 440 480 624
436 476 532 548
444 468 468 604
448 464 520 560
452 452 476 604
452 484 484 572
500 500 500 500
Found 1302 solutions
real 0m9.517s
user 0m9.505s
sys 0m0.012s
That code follows:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int n = atoi (argv[1]);
    int a, b, c, d, na, nb, nc, nd;
    int count = 0;
    for (a = 0, na = n; a * a <= na; a++) {
        for (b = a, nb = na - a * a; b * b <= nb; b++) {
            for (c = b, nc = nb - b * b; c * c <= nc; c++) {
                for (d = c, nd = nc - c * c; d * d < nd; d++)
                    ;   /* empty body: d just advances to the last candidate */
                if (d * d == nd) {
                    printf ("%4d %4d %4d %4d\n", a, b, c, d);
                    count++;
                }
            }
        }
    }
    printf ("Found %d solutions\n", count);
    return 0;
}
And, as per a suggestion by DSM, the d loop can disappear altogether (since there's only one possible value of d (discounting negative numbers) and it can be calculated), which brings the one million case down to two seconds for me, and the ten million case to a far more manageable 68 seconds.
That version is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[]) {
    int n = atoi (argv[1]);
    int a, b, c, d, na, nb, nc, nd;
    int count = 0;
    for (a = 0, na = n; a * a <= na; a++) {
        for (b = a, nb = na - a * a; b * b <= nb; b++) {
            for (c = b, nc = nb - b * b; c * c <= nc; c++) {
                nd = nc - c * c;
                d = (int) (sqrt (nd) + 0.5); /* round, then verify exactly below */
                if (d * d == nd) {
                    printf ("%d %d %d %d\n", a, b, c, d);
                    count++;
                }
            }
        }
    }
    printf ("Found %d solutions\n", count);
    return 0;
}
(a): All timings are done with the inner printf commented out so that I/O doesn't skew the figures.
The Wikipedia page has some interesting background information, but Lagrange's four-square theorem (more precisely, Bachet's conjecture, which Lagrange proved) doesn't really go into detail on how to find said squares.
As I said in my comment, the solution is going to be nontrivial. This paper discusses the solvability of four-square sums. The paper alleges that:
There is no convenient algorithm (beyond the simple one mentioned in
the second paragraph of this paper) for finding additional solutions
that are indicated by the calculation of representations, but perhaps
this will streamline the search by giving an idea of what kinds of
solutions do and do not exist.
There are a few other interesting facts related to this topic. There
exist other theorems that state that every integer can be written as a
sum of four particular multiples of squares. For example, every
integer can be written as N = a^2 + 2b^2 + 4c^2 + 14d^2. There are 54
cases like this that are true for all integers, and Ramanujan provided
the complete list in the year 1917.
For more information, see Modular Forms. This is not easy to understand unless you have some background in number theory. If you could generalize Ramanujan's 54 forms, you may have an easier time with this. With that said, in the first paper I cite, there is a small snippet which discusses an algorithm that may find every solution (even though I find it a bit hard to follow):
For example, it was reported in 1911 that the calculator Gottfried
Ruckle was asked to reduce N = 15663 as a sum of four squares. He
produced a solution of 125^2 + 6^2 + 1^2 + 1^2 in 8 seconds, followed
immediately by 125^2 + 5^2 + 3^2 + 2^2. A more difficult problem
(reflected by a first term that is farther from the original number,
with correspondingly larger later terms) took 56 seconds: 11399 = 105^2
+ 15^2 + 8^2 + 5^2. In general, the strategy is to begin by setting the first term to be the largest square below N and try to represent the
smaller remainder as a sum of three squares. Then the first term is
set to the next largest square below N, and so forth. Over time a
lightning calculator would become familiar with expressing small
numbers as sums of squares, which would speed up the process.
(Emphasis mine.)
The algorithm is described as being recursive, but it could easily be implemented iteratively.
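To make the quoted strategy concrete, here is a rough recursive C sketch of it (my own interpretation, not code from the paper): fix the largest square first, then try to express the remainder with one fewer square, backtracking when that fails.

#include <math.h>
#include <stdio.h>

static int is_square(int n, int *root) {
    int r = (int) sqrt((double) n);
    while (r * r > n) r--;                 /* guard against floating-point drift */
    while ((r + 1) * (r + 1) <= n) r++;
    *root = r;
    return r * r == n;
}

/* Fill out[] with `terms` values whose squares sum to n, trying the
   largest candidate first; return 1 on success, 0 if none exists. */
static int rep(int n, int terms, int *out) {
    if (terms == 1)
        return is_square(n, out);
    for (int a = (int) sqrt((double) n); a >= 0; a--) {
        out[0] = a;
        if (rep(n - a * a, terms - 1, out + 1))
            return 1;
    }
    return 0;
}

int main(void) {
    int q[4];
    if (rep(15663, 4, q))                  /* the example from the quote */
        printf("15663 = %d^2 + %d^2 + %d^2 + %d^2\n", q[0], q[1], q[2], q[3]);
    return 0;
}

Run on the quoted example, it finds 125^2 + 6^2 + 1^2 + 1^2, the same first solution Ruckle produced.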
It seems as though all integers can be made by such a combination:
0 = 0^2 + 0^2 + 0^2 + 0^2
1 = 1^2 + 0^2 + 0^2 + 0^2
2 = 1^2 + 1^2 + 0^2 + 0^2
3 = 1^2 + 1^2 + 1^2 + 0^2
4 = 2^2 + 0^2 + 0^2 + 0^2, 1^2 + 1^2 + 1^2 + 1^2
5 = 2^2 + 1^2 + 0^2 + 0^2
6 = 2^2 + 1^2 + 1^2 + 0^2
7 = 2^2 + 1^2 + 1^2 + 1^2
8 = 2^2 + 2^2 + 0^2 + 0^2
9 = 3^2 + 0^2 + 0^2 + 0^2, 2^2 + 2^2 + 1^2 + 0^2
10 = 3^2 + 1^2 + 0^2 + 0^2, 2^2 + 2^2 + 1^2 + 1^2
11 = 3^2 + 1^2 + 1^2 + 0^2
12 = 3^2 + 1^2 + 1^2 + 1^2, 2^2 + 2^2 + 2^2 + 0^2
.
.
.
and so forth
When I did some initial working in my head, I thought that only the perfect squares would have more than one possible solution. However, after listing them out, it seems to me there is no obvious order to them. Instead, I thought of an algorithm that suits this situation well:
The important thing is to use a 4-tuple (a, b, c, d). In any given 4-tuple which is a solution to a^2 + b^2 + c^2 + d^2 = n, we will set ourselves the constraint that a is always the largest of the 4, b is next, and so on, like:
a >= b >= c >= d
Also note that a^2 cannot be less than n/4, otherwise the sum of the squares will have to be less than n.
Then the algorithm is:
1a. Obtain floor(square_root(n)) # this is the maximum value of a - call it max_a
1b. Obtain the first value of a such that a^2 >= n/4 - call it min_a
2. For a in a range (min_a, max_a)
At this point we have selected a particular a, and are now looking at bridging the gap from a^2 to n - i.e. (n - a^2)
3. Repeat steps 1a through 2 to select a value of b. This time instead of finding
floor(square_root(n)) we find floor(square_root(n - a^2))
and so on and so forth. So the entire algorithm would look something like:
1a. Obtain floor(square_root(n)) # this is the maximum value of a - call it max_a
1b. Obtain the first value of a such that a^2 >= n/4 - call it min_a
2. For a in a range (min_a, max_a)
3a. Obtain floor(square_root(n - a^2))
3b. Obtain the first value of b such that b^2 >= (n - a^2)/3
4. For b in a range (min_b, max_b)
5a. Obtain floor(square_root(n - a^2 - b^2))
5b. Obtain the first value of c such that c^2 >= (n - a^2 - b^2)/2
6. For c in a range (min_c, max_c)
7. We now look at (n - a^2 - b^2 - c^2). If its square root is an integer, this is d.
Otherwise, this tuple will not form a solution
At steps 3b and 5b I use (n - a^2)/3, (n - a^2 - b^2)/2. We divide by 3 or 2, respectively, because of the number of values in the tuple not yet 'fixed'.
An example:
doing this on n = 12:
1a. max_a = 3
1b. min_a = 2
2. for a in range(2, 3):
use a = 2
3a. we now look at (12 - 2^2) = 8
max_b = 2
3b. min_b = 2
4. b must be 2
5a. we now look at (12 - 2^2 - 2^2) = 4
max_c = 2
5b. min_c = 2
6. c must be 2
7. (n - a^2 - b^2 - c^2) = 0, hence d = 0
so a possible tuple is (2, 2, 2, 0)
2. use a = 3
3a. we now look at (12 - 3^2) = 3
max_b = 1
3b. min_b = 1
4. b must be 1
5a. we now look at (12 - 3^2 - 1^2) = 2
max_c = 1
5b. min_c = 1
6. c must be 1
7. (n - a^2 - b^2 - c^2) = 1, hence d = 1
so a possible tuple is (3, 1, 1, 1)
These are the only two possible tuples - hey presto!
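A compact C sketch of those steps (my own translation of the bounds above; for large n you would want an exact integer square root rather than a raw sqrt cast):

#include <math.h>
#include <stdio.h>

int main(void) {
    int n = 12;
    int max_a = (int) sqrt((double) n);
    for (int a = max_a; a >= 0 && 4 * a * a >= n; a--) {    /* a^2 >= n/4 */
        int ra = n - a * a;
        for (int b = (int) sqrt((double) ra); b >= 0 && 3 * b * b >= ra; b--) {
            if (b > a) continue;                            /* enforce a >= b */
            int rb = ra - b * b;
            for (int c = (int) sqrt((double) rb); c >= 0 && 2 * c * c >= rb; c--) {
                if (c > b) continue;                        /* enforce b >= c */
                int rc = rb - c * c;
                int d = (int) sqrt((double) rc);
                if (d <= c && d * d == rc)                  /* step 7: exact square */
                    printf("(%d, %d, %d, %d)\n", a, b, c, d);
            }
        }
    }
    return 0;
}

For n = 12 this prints exactly (3, 1, 1, 1) and (2, 2, 2, 0), matching the worked example.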
nebffa has a great answer. One suggestion:
step 3a: max_b = min(a, floor(square_root(n - a^2))) // since b <= a
max_c and max_d can be improved in the same way too.
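In C terms the tightened bound would look something like this fragment (hypothetical variable names, following the step numbering above):

/* step 3a with the extra constraint b <= a */
int max_b = (int) sqrt((double) (n - a * a));
if (max_b > a)
    max_b = a;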
Here is another try:
1. generate array S: {0, 1, 2^2, 3^2, ..., nr^2} where nr = floor(square_root(N)).
Now the problem is to find 4 numbers from the array such that sum(a, b, c, d) = N.
2. according to nebffa's post (steps 1a & 1b), a (which is the largest among all 4 numbers) is between [nr/2 .. nr].
We can loop a from nr down to nr/2 and calculate r = N - S[a];
now the question is to find 3 numbers from S such that sum(b, c, d) = r = N - S[a];
here is code:
nr = square_root(N);
S = {0, 1, 2^2, 3^2, 4^2, ..., nr^2};

for (a = nr; a >= nr/2; a--)
{
    r = N - S[a];
    // it is now a 3SUM problem
    for (b = a; b >= 0; b--)
    {
        r1 = r - S[b];
        if (r1 < 0)
            continue;
        if (r1 > N/2)  // because (a^2 + b^2) >= (c^2 + d^2)
            break;
        for (c = 0, d = b; c <= d; )
        {
            sum = S[c] + S[d];
            if (sum == r1)
            {
                print a, b, c, d;
                c++; d--;
            }
            else if (sum < r1)
                c++;
            else
                d--;
        }
    }
}
Runtime is O(square_root(N)^3), i.e. O(N^1.5).
Here is the test result running Java on my VM (time in milliseconds; result# is the total number of valid combinations; time1 with printout, time2 without printout):
N result# time1 time2
----------- -------- -------- -----------
1,000,000 1302 859 281
10,000,000 6262 16109 7938
100,000,000 30912 442469 344359
Determine the positive constants c and n0 for the following recurrences (using the substitution method):
T(n) = T(ceiling(n/2)) + 1 ... Guess is Big-Oh(log base 2 of n)
T(n) = 3T(floor(n/3)) + n ... Guess is Big-Omega (n * log base 3 of n)
T(n) = 2T(floor(n/2) + 17) + n ... Guess is Big-Oh(n * log base 2 of n).
I am giving my solution for Problem 1:
Our guess is: T(n) = O(log_2(n)).
By the induction hypothesis, assume T(k) <= c*log_2(k) for all k < n, where c is a constant and c > 0.
T(n) = T(ceiling(n/2)) + 1
<=> T(n) <= c*log_2(ceiling(n/2)) + 1
<=> " <= c*{log_2(n/2) + 1} + 1
<=> " = c*log_2(n/2) + c + 1
<=> " = c*{log_2(n) - log_2(2)} + c + 1
<=> " = c*log_2(n) - c + c + 1
<=> " = c*log_2(n) + 1
<=> T(n) not_<= c*log_2(n) because c*log_2(n) + 1 not_<= c*log_2(n).
To remedy this, I used the following trick:
T(n) = T(ceiling(n/2)) + 1
<=> " <= c*log(ceiling(n/2)) + 1
<=> " <= c*{log_2 (n/2) + b} + 1 where 0 <= b < 1
<=> " <= c*{log_2 (n) - log_2(2) + b) + 1
<=> " = c*{log_2(n) - 1 + b} + 1
<=> " = c*log_2(n) - c + bc + 1
<=> " = c*log_2(n) - (c - bc - 1) if c - bc -1 >= 0
c >= 1 / (1 - b)
<=> T(n) <= c*log_2(n) for c >= {1 / (1 - b)}
so T(n) = O(log_2(n)).
This solution seems correct to me. My question is: is this the proper approach?
Thanks to all of you.
For the first exercise:
We want to show by induction that T(n) <= ceiling(log(n)) + 1.
Let's assume that T(1) = 1; then T(1) = 1 <= ceiling(log(1)) + 1 = 1, and the base case of the induction is proved.
Now, we assume that for every 1 <= i < n it holds that T(i) <= ceiling(log(i)) + 1.
For the inductive step we have to distinguish the cases when n is even and when n is odd.
If n is even: T(n) = T(ceiling(n/2)) + 1 = T(n/2) + 1 <= ceiling(log(n/2)) + 1 + 1 = ceiling(log(n) - 1) + 1 + 1 = ceiling(log(n)) + 1.
If n is odd: T(n) = T(ceiling(n/2)) + 1 = T((n+1)/2) + 1 <= ceiling(log((n+1)/2)) + 1 + 1 = ceiling(log(n+1) - 1) + 1 + 1 = ceiling(log(n+1)) + 1 = ceiling(log(n)) + 1
The last step is tricky, but it is possible because n is odd, and hence n is not a power of 2, so ceiling(log(n+1)) = ceiling(log(n)).
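To double-check the bound numerically, here is a small throwaway C program (my own, not part of the exercise) that evaluates the recurrence directly and compares it to ceiling(log_2(n)) + 1:

#include <stdio.h>

static int T(int n) {
    return n == 1 ? 1 : T((n + 1) / 2) + 1; /* (n + 1) / 2 == ceiling(n/2) */
}

static int ceil_log2(int n) {
    int m = 0, p = 1;
    while (p < n) { p *= 2; m++; }
    return m;
}

int main(void) {
    for (int n = 1; n <= 1000; n++)
        if (T(n) > ceil_log2(n) + 1)
            printf("bound violated at n = %d\n", n); /* never fires */
    printf("bound holds for n = 1..1000\n");
    return 0;
}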
Problem #1:
T(1) = t0
T(2) = T(1) + 1 = t0 + 1
T(4) = T(2) + 1 = t0 + 2
T(8) = T(4) + 1 = t0 + 3
...
T(2^(m+1)) = T(2^m) + 1 = t0 + (m + 1)
Letting n = 2^(m+1), we get that T(n) = t0 + log_2(n) = O(log_2(n))
Problem #2:
T(1) = t0
T(3) = 3T(1) + 3 = 3t0 + 3
T(9) = 3T(3) + 9 = 3(3t0 + 3) + 9 = 9t0 + 18
T(27) = 3T(9) + 27 = 3(9t0 + 18) + 27 = 27t0 + 81
...
T(3^(m+1)) = 3T(3^m) + 3^(m+1) = (3^(m+1))t0 + (3^(m+1))(m+1)
Letting n = 3^(m+1), we get that T(n) = nt0 + nlog_3(n) = O(nlog_3(n)).
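A quick numeric check (my own) of that closed form at powers of three, iterating the recurrence T(3^(m+1)) = 3T(3^m) + 3^(m+1):

#include <stdio.h>

int main(void) {
    long long t0 = 1;          /* arbitrary choice of T(1) */
    long long t = t0, n = 1;   /* t = T(3^m), n = 3^m */
    for (int m = 0; m <= 12; m++) {
        long long closed = n * t0 + (long long) m * n; /* (3^m)t0 + m*(3^m) */
        printf("m=%2d  T=%12lld  closed=%12lld%s\n",
               m, t, closed, t == closed ? "" : "  MISMATCH");
        t = 3 * t + 3 * n;     /* T(3^(m+1)) = 3T(3^m) + 3^(m+1) */
        n *= 3;
    }
    return 0;
}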
Problem #3:
Consider n = 34. T(34) = 2T(17+17) + 34 = 2T(34) + 34. We can solve this to find that T(34) = -34. We can also see that for odd n, T(n) = 1 + T(n - 1). We continue to find what values are fixed:
T(0) = 2T(17) + 0 = 2T(17)
T(17) = 1 + T(16)
T(16) = 2T(25) + 16
T(25) = T(24) + 1
T(24) = 2T(29) + 24
T(29) = T(28) + 1
T(28) = 2T(31) + 28
T(31) = T(30) + 1
T(30) = 2T(32) + 30
T(32) = 2T(33) + 32
T(33) = T(32) + 1
We get T(32) = 2T(33) + 32 = 2T(32) + 34, meaning that T(32) = -34. Working backward, we get
T(32) = -34
T(33) = -33
T(30) = -38
T(31) = -37
T(28) = -46
T(29) = -45
T(24) = -66
T(25) = -65
T(16) = -114
T(17) = -113
T(0) = -226
As you can see, this recurrence is a little more complicated than the others, and as such, you should probably take a hard look at this one. If I get any other ideas, I'll come back; otherwise, you're on your own.
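One way to explore this recurrence by machine (my own sketch, not part of the original answer): T(33) and T(34) are the self-referential cases (e.g. T(34) = 2T(34) + 34), so resolve those two values by hand to -33 and -34, and every other value follows by plain recursion:

#include <stdio.h>

static long long T(long long n) {
    if (n == 33) return -33;       /* from T(33) = 2T(33) + 33 */
    if (n == 34) return -34;       /* from T(34) = 2T(34) + 34 */
    return 2 * T(n / 2 + 17) + n;  /* n / 2 is floor division in C */
}

int main(void) {
    for (int n = 0; n <= 45; n++)
        printf("T(%2d) = %lld\n", n, T(n));
    /* e.g. T(0) = -226, T(43) = -1, T(44) = 2: the sign flips at 43/44 */
    return 0;
}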
EDIT:
After looking at #3 some more, it looks like you're right in your assessment that it's O(nlog_2(n)). So you can try listing a bunch of numbers - I did it from n=0 to n=45. You notice a pattern: it goes from negative numbers to positive numbers around n=43,44. To get the next even-index element of the sequence, you add powers of two, in the following order: 4, 8, 4, 16, 4, 8, 4, 32, 4, 8, 4, 16, 4, 8, 4, 64, 4, 8, 4, 16, 4, 8, 4, 32, ...
These numbers are essentially where you'd put the marks on an arbitrary-length ruler: quarters, halves, eighths, sixteenths, etc. As such, we can solve the equivalent problem of finding the order of the sum 1 + 2 + 1 + 4 + 1 + 2 + 1 + 8 + ... (the same as ours divided by 4, and ours is shifted, but the order will still work). Observing that each power 2^j appears about n/2^(j+1) times among the first n terms, contributing (n/2^(j+1)) * 2^j = n/2 each, for j = 0 to log_2(n), the sum of the first n terms is about (n/2)log_2(n). Multiply by 4 to get ours, and shift x to the right by 34 and perhaps add a constant value to the result. So we're playing around with y = 2n*log_2(n) + k' for some constant k'.
Phew. That was a tricky one. Note that this recurrence does not admit any arbitrary "initial conditions"; in other words, the recurrence does not describe a family of sequences, but one specific one, with no parameterization.