I have an array of sorted integers. Given an integer N, I need to place the N largest elements further away from each other, so that they have maximum space between them. The remaining elements should be placed between these big items. For example, an array of 10 with N = 3 would result in [0, 5, 8, 2, 6, 9, 3, 7, 10, 4].
public static void main(String[] args) {
    int[] start = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
    int[] end = new int[10];
    int N = 4;
    int step = Math.round(start.length / N);
    int count = 0;
    for (int i = 0; i < step; i++) {
        for (int j = i; j < start.length; j = j + step) {
            //System.out.println(j + " " + i);
            if (count < start.length && start[count] != 0) {
                end[j] = start[count];
                count++;
            }
        }
    }
    // end.toString() would only print the array reference;
    // Arrays.toString prints the contents.
    System.out.println(java.util.Arrays.toString(end));
}
You have an array of K elements and N maximum numbers you need to distribute. Then:
Step := K / N (discarding the remainder)
Take any one of the N maximum numbers and insert it at position Step/2.
Take the other maximum numbers and insert each one at distance Step after the previously inserted maximum.
Given [1,2,3,4,5,6,7,8,9,10], we have K = 10 and N = 3, so Step = 3. The first maximum is placed at position 3/2 = 1:
[1,10,2,3,4,5,6,7,8,9]
Then the other two are placed at distance 3 from each other:
[1,10,2,3,9,4,5,8,6,7]
The code:
#include <iostream>
#include <vector>

std::vector<int> Distribute(std::vector<int> aSource, int aNumber)
{
    auto step = aSource.size() / aNumber; // Note integer division.
    for (int i = 0; i < aNumber; ++i)
    {
        auto place = aSource.end() - i * step - step / 2;
        aSource.insert(place, aSource.front());
        aSource.erase(aSource.begin());
    }
    return aSource;
}

int main()
{
    std::vector<int> vec{10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10};
    auto res = Distribute(vec, 4);
    for (auto e : res)
    {
        std::cout << e << ", ";
    }
    std::cout << std::endl;
}
Output:
6, 5, 4, 7, 3, 2, 1, 0, 8, -1, -2, -3, -4, 9, -5, -6, -7, -8, 10, -9, -10,
I am just learning about time complexity. Here is a piece of code I've written:
for (int i = 1; i <= N; i++)
    for (int j = 1; j <= N; j++)
    {
        // Spread Peace
    }
Clearly the above is of O(N^2) complexity, and (for N == 1e6) it seems to run forever.
Here is a second piece of code:
for (int i = 1; i <= N; i++)
    for (int j = i; j <= N; j++)
    {
        // Money is Everything
    }
The above is also O(N^2), with N*(N+1)/2 iterations, and it also runs forever. But this code:
for (int i = 1; i <= N; i++)
    for (int j = i; j <= N; j += i)
    {
        // Be my GirlFriend
    }
just executes within a second. I am not able to derive its time complexity. Why is it so fast? What is the estimate for N == 1e6?
Let's carry out an experiment first: unroll the loop (C# implementation) and have a look at what's going on:
private static IEnumerable<String> Unroll(int N) {
    for (int i = 1; i <= N; i++) {
        StringBuilder sb = new StringBuilder();
        for (int j = i; j <= N; j += i) {
            if (sb.Length > 0)
                sb.Append(", ");
            sb.Append(j);
        }
        yield return sb.ToString();
    }
}
A test run with a small number (e.g. 16) reveals the picture
Console.Write(string.Join(Environment.NewLine, Unroll(16)));
Can you see the pattern? The rows shrink quickly as i grows; it looks like N * log(N), right?
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
2, 4, 6, 8, 10, 12, 14, 16
3, 6, 9, 12, 15
4, 8, 12, 16
5, 10, 15
6, 12
7, 14
8, 16
9
10
11
12
13
14
15
16
Now it's time for the paper and pencil: we have (for large N)
N / 1 items (step == 1) +
N / 2 items (step == 2) +
N / 3 items (step == 3) +
...
N / N items (step == N)
------------------------------
N * (1 + 1/2 + ... + 1/N) =
N * H(N) =
O(N * log(N)) // Harmonic sum H(N) gives log(N)
More accurate estimation
H(N) = ln(N) + gamma + 1/(2*N) + ...
where
ln() - natural logarithm
gamma - Euler–Mascheroni constant (0.5772156649...)
gives you, for N == 1e6, about 14.4e6 loop iterations, which is in fact a slight overestimate; the actual count is 13970034 (14.0e6), because when approximating with the harmonic series we didn't take integer division into account (each term should be floor(N/k), not N/k).
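As a quick sanity check (my own minimal Java sketch, not part of the original answer), you can count the inner-loop iterations directly and compare them against the N * H(N) estimate:

public class HarmonicCount {
    public static void main(String[] args) {
        int n = 1_000_000;
        long exact = 0;
        for (int i = 1; i <= n; i++) {
            exact += n / i;                      // floor(N / i) inner iterations for outer value i
        }
        double gamma = 0.5772156649;             // Euler–Mascheroni constant
        double estimate = n * (Math.log(n) + gamma);
        System.out.println("exact    = " + exact);     // 13970034, the count quoted above
        System.out.println("estimate = " + estimate);  // roughly 14.39e6
    }
}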
You may proceed using Sigma notation: the exact count is Sum(i = 1..N) floor(N / i).
More on Harmonic functions here: Asymptotic analysis
Imagine there exists a huge real data set: A1, A2, A3, ..., Ai, ..., An (where n is a very large number). I want to find the subsets in which the difference between any two numbers within the subset is smaller than a fixed value B. It has to cost as little time and RAM as possible. Any ideas?
As mentioned in a comment, the set is already sorted. Let's call the i-th element a[i]. A simple linear pass finds all subsets (pseudocode, without checks for the end of the data; these are easy to add but would obscure the idea of the algorithm):
low = 0;
high = 0;
repeat {
    while (a[high] - a[low] <= B) {
        high = high + 1;
    }
    output set a[low .. high-1];
    while (a[high] - a[low] > B) {
        low = low + 1;
    }
}
Note that only the section between low and high needs to be in memory at a time. So it is possible to stream through the data without storing it all in memory.
This algorithm will also output one-element subsets. If this is not wanted, they can easily be suppressed.
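For reference, here is a runnable Java version of the same two-pointer pass (a sketch with the end-of-data checks added; the sample array and B = 4 are my own placeholders):

import java.util.Arrays;

public class Windows {
    static void printWindows(int[] a, int B) {
        int low = 0, high = 0;
        while (high < a.length) {
            // grow the window while the spread stays within B
            while (high < a.length && a[high] - a[low] <= B) {
                high++;
            }
            // output the maximal window a[low .. high-1]
            System.out.println(Arrays.toString(Arrays.copyOfRange(a, low, high)));
            // shrink from the left until the next element could fit
            while (high < a.length && a[high] - a[low] > B) {
                low++;
            }
        }
    }

    public static void main(String[] args) {
        printWindows(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, 4);
    }
}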
It's not clear how much data you mean: is it small enough to load everything into RAM, are the values 32-bit integers, how likely are duplicates, are you using multiple machines and/or map-reduce jobs, etc. Despite the lack of information, I can blindly suggest you go with radix sort. It's a linear-time sorting algorithm.
Edit 1
As you mentioned, the data is already sorted in ascending order, so we can find all subsets using a binary search (upper bound) for each element.
Assuming the data container is A with size n, here is rough pseudocode:
upper_bound(start, end, key):
    indx := end + 1
    while start <= end do
        mid := start + (end - start) / 2
        if A[mid] >= key:
            indx := mid
            end := mid - 1
        else
            start := mid + 1
    return indx
end

subsets := [] // list of subsets
for i = n - 1 to i = 0 do
    indx := upper_bound(0, i - 1, A[i] - B)
    set := [ elements from A[indx] to A[i] ]
    subsets.push(set)
end
print subsets
For each element A[i], we find the upper bound, so the overall time complexity is O(n log n).
If you want, I can provide the C++ or Java working snippet.
Edit 2
Here is the Java code
import java.util.ArrayList;
import java.util.Arrays;

/**
 *
 * @author kaidul
 */
public class Test {

    private static int upperBound(int left, int right, int key, Integer[] A) {
        int indx = right + 1;
        while(left <= right) {
            int mid = left + (right - left) / 2;
            if(A[mid] > key) {
                indx = mid;
                right = mid - 1;
            } else {
                left = mid + 1;
            }
        }
        return indx;
    }

    public static void main(String[] args) {
        Integer[] A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int B = 4;
        ArrayList<ArrayList<Integer>> result = new ArrayList<>();
        for(int i = A.length - 1; i > 0; --i) {
            int startIndx = upperBound(0, i - 1, Math.min(A[i] - B, A[i] - A[0]), A);
            if(startIndx < i) {
                ArrayList<Integer> solutionSet = new ArrayList<>( Arrays.asList( Arrays.copyOfRange(A, startIndx, i + 1) ) );
                result.add(solutionSet);
            }
            if(startIndx == 0) {
                break;
            }
        }
        result.stream().forEach((subset) -> {
            System.out.println(subset);
        });
    }
}
Output:
[7, 8, 9, 10]
[6, 7, 8, 9]
[5, 6, 7, 8]
[4, 5, 6, 7]
[3, 4, 5, 6]
[2, 3, 4, 5]
[1, 2, 3, 4]
I have this code to count the number of recursive calls and show that it's approximately O(2^n), as is well known for Fibonacci:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int c = 0;

int fib(int n) {
    c++;
    if (0 == n) { return 0; }
    if (2 >= n) {
        return 1;
    }
    return fib(n-1) + fib(n-2);
}

int main(int argc, char * argv[]) {
    int i;
    for (i = 1; i < 10; i++) {
        c = 0;
        int f = fib(i);
        printf("fib(%d) = %d,\tc = %d,\t2^%d = %f\n", i, f, c, i, pow(2, i));
    }
    return 0;
}
and here's the output:
fib(1) = 1, c = 1, 2^1 = 2.000000
fib(2) = 1, c = 1, 2^2 = 4.000000
fib(3) = 2, c = 3, 2^3 = 8.000000
fib(4) = 3, c = 5, 2^4 = 16.000000
fib(5) = 5, c = 9, 2^5 = 32.000000
fib(6) = 8, c = 15, 2^6 = 64.000000
fib(7) = 13, c = 25, 2^7 = 128.000000
fib(8) = 21, c = 41, 2^8 = 256.000000
fib(9) = 34, c = 67, 2^9 = 512.000000
What's wrong here?
If I understand you right, you are wondering why your values for c and the corresponding value 2ⁿ are not as close as you expected.
The reason for this is that the call tree is not balanced, so O(2ⁿ) is just a loose upper bound. The number of calls has a lower bound of Ω(2^(n/2)).
As you can see in the picture of the call tree: the number you calculated for n = 6 is correct, and the tree is clearly not balanced.
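You can also see it without the picture: the counter satisfies c(n) = c(n-1) + c(n-2) + 1 for n >= 3 (one call plus its two recursive calls) with c(0) = c(1) = c(2) = 1, so c grows roughly like phi^n with phi ≈ 1.618, which is much smaller than 2^n. A small sketch of my own that reproduces your c values:

public class CallCount {
    public static void main(String[] args) {
        long[] c = new long[10];
        c[0] = 1; c[1] = 1; c[2] = 1;            // base cases: exactly one call each
        for (int n = 3; n < c.length; n++) {
            c[n] = c[n - 1] + c[n - 2] + 1;      // one call plus its two recursive calls
        }
        for (int n = 1; n < c.length; n++) {
            System.out.printf("c(%d) = %d, 1.618^%d = %.1f, 2^%d = %d%n",
                    n, c[n], n, Math.pow(1.618, n), n, 1 << n);
        }
    }
}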
Find ten integers > 0 that sum to 2011 but whose reciprocals sum to 1, i.e.
x1+x2+..+x10 = 2011
1/x1+1/x2+..+1/x10 = 1
I found this problem here http://blog.computationalcomplexity.org/2011/12/is-this-problem-too-hard-for-hs-math.html
I was wondering what the computational complexity is, and what types of algorithms can solve it.
EDIT2:
I wrote the following brute-force code, which is fast enough. I didn't find any solutions, though, so I need to tweak my assumptions slightly. I'm now confident I will find the solution.
from fractions import Fraction

pairs = [(i, j) for i in range(2, 30) for j in range(2, 30)]
x1x2 = set((i + j, Fraction(1, i) + Fraction(1, j)) for i, j in pairs)
print('x1x2', len(x1x2))

x1x2x3x4 = set((s1 + s2, f1 + f2) for s1, f1 in x1x2 for s2, f2 in x1x2 if f1 + f2 < 1)
print('x1x2x3x4', len(x1x2x3x4))

count = 0
for s, f in x1x2x3x4:
    count += 1
    if count % 1000 == 0:
        print('count', count)
    s2 = 2011 - s
    f2 = 1 - f
    for s3, f3 in x1x2:
        s4 = s2 - s3
        if s4 > 0:
            f4 = f2 - f3
            if f4 > 0:
                if (s4, f4) in x1x2x3x4:
                    print('s3f3', s3, f3)
                    print('sf', s, f)
Note that you cannot define computational complexity for a single problem instance: once you know the answer, the computational complexity is O(1), i.e. constant time. Computational complexity can only be defined for an infinite family of problems.
One approach to solving this type of problem would be to use backtracking search. Your algorithm spends too much time searching parts of the 10-dimensional space that can't contain solutions. An efficient backtracking algorithm would:
assign the variables in the order x1, x2, ..., x10
maintain the constraint x1 <= x2 <= ... <= x10
during search, whenever a number xi has been assigned:
let S = x1 + ... + xi
let R = 1/x1 + ... + 1/xi
always check that S <= 2011 - (10 - i) * xi
always check that R <= 1 - (1 / [(2011 - S) / (10 - i)])
If these two constraints are not fulfilled during the search, there can't be a solution any more and the algorithm should backtrack immediately. Note that the constraints are based on the fact that the numbers are assigned in non-decreasing order, i.e. xi <= xi+1 in all cases.
Note: you can speed up the search, limiting the search space and making the calculations faster, by assuming that all of x1, ..., x10 divide a given number evenly, e.g. 960. That is, you only consider xi such that 960 divided by xi is an integer. This makes handling the fractional part much easier: instead of checking that 1/x1 + ... equals 1, you can check that 960/x1 + ... equals 960. Because all the divisions are exact and return integers, you don't need floating-point or rational arithmetic at all; everything works with integers only. Of course, the smaller the fixed modulus, the fewer solutions you can find, but it also makes the search faster.
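Here is a minimal Java sketch of that idea (my own illustration, not the answer's code): fix a modulus D, use only divisors of D as candidate values, keep the values non-decreasing, and track D/x1 + ... + D/x10 purely in integer arithmetic. Depending on the chosen D it may find several solutions or none at all, exactly as the caveat above says.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReciprocalSearch {
    static final int TARGET = 2011;
    static final int COUNT = 10;
    static final int D = 960;                    // the modulus suggested above
    static List<Integer> divisors = new ArrayList<>();
    static int[] x = new int[COUNT];

    // sum = x1 + ... + xi chosen so far, rsum = D/x1 + ... + D/xi chosen so far
    static void search(int depth, int minIndex, int sum, int rsum) {
        if (depth == COUNT) {
            if (sum == TARGET && rsum == D) {
                System.out.println(Arrays.toString(x));
            }
            return;
        }
        for (int idx = minIndex; idx < divisors.size(); idx++) {
            int xi = divisors.get(idx);
            int remaining = COUNT - depth;
            // even the smallest completion (all remaining values == xi) overshoots the target sum
            if (sum + xi * remaining > TARGET) break;
            // the reciprocal sum would already exceed D with this value included
            if (rsum + D / xi > D) continue;
            // even the largest possible completion of reciprocals stays below D
            if (rsum + (D / xi) * remaining < D) break;
            x[depth] = xi;
            search(depth + 1, idx, sum + xi, rsum + D / xi);
        }
    }

    public static void main(String[] args) {
        for (int d = 1; d <= D; d++) {
            if (D % d == 0) divisors.add(d);     // candidate values in increasing order
        }
        search(0, 0, 0, 0);
    }
}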
I note that one of the things on the next blog in the series, http://blog.computationalcomplexity.org/2011/12/solution-to-reciprocals-problem.html, is a paper on the problem, and a suggested dynamic programming approach to counting the number of answers. Since it is a dynamic programming approach, you should be able to turn that into a dynamic program to find those answers.
Dynamic programming solution (C#) based on the Bill Gasarch paper someone posted. But this does not necessarily find the optimal (minimum number of numbers used) solution. It is only guaranteed to find a solution if allowed to go high enough, but it doesn't have to be with the desired N. Basically, I feel like it "accidentally" works for (10, 2011).
Some example solutions for 2011:
10 numbers: 2, 4, 5, 80, 80, 80, 160, 320, 640, 640
11 numbers: 3, 6, 4, 12, 12, 24, 30, 480, 480, 480, 480
13 numbers: 2, 4, 5, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200
15 numbers: 3, 6, 6, 12, 16, 16, 32, 32, 32, 64, 256, 256, 256, 512, 512
Anyone have an idea how to fix it to work in general?
using System;
using System.Collections.Generic;

namespace Recip
{
    class Program
    {
        static void Main(string[] args)
        {
            int year = 2011;
            int numbers = 20;

            int[,,] c = new int[year + 1, numbers + 1, numbers];

            List<int> queue = new List<int>();

            // need some initial guesses to expand on - use squares because 1/y * y = 1
            int num = 1;
            do
            {
                for (int i = 0; i < num; i++)
                    c[num * num, num, i] = num;
                queue.Add(num * num);
                num++;
            } while (num <= numbers && num * num <= year);

            // expand
            while (queue.Count > 0)
            {
                int x0 = queue[0];
                queue.RemoveAt(0);

                for (int i = 0; i <= numbers; i++)
                {
                    if (c[x0, i, 0] > 0)
                    {
                        int[] coefs = { 20, 4, 2, 2, 3, 3 };
                        int[] cons  = { 11, 6, 8, 9, 6, 8 };
                        int[] cool  = {  3, 2, 2, 2, 2, 2 };
                        int[] k1    = {  2, 2, 4, 3, 3, 2 };
                        int[] k2    = {  4, 4, 4, 6, 3, 6 };
                        int[] k3    = {  5, 0, 0, 0, 0, 0 };
                        int[] mul   = { 20, 4, 2, 2, 3, 3 };

                        for (int k = 0; k < 6; k++)
                        {
                            int x1 = x0 * coefs[k] + cons[k];
                            int c1 = i + cool[k];
                            if (x1 <= year && c1 <= numbers && c[x1, c1, 0] == 0)
                            {
                                queue.Add(x1);
                                c[x1, c1, 0] = k1[k];
                                c[x1, c1, 1] = k2[k];
                                int index = 2;
                                if (k == 0)
                                {
                                    c[x1, c1, index] = k3[k];
                                    index++;
                                }
                                int diff = index;
                                while (c[x0, i, index - diff] > 0)
                                {
                                    c[x1, c1, index] = c[x0, i, index - diff] * mul[k];
                                    index++;
                                }
                            }
                        }
                    }
                }
            }

            for (int n = 1; n < numbers; n++)
            {
                if (c[year, n, 0] == 0) continue;
                int ind = 0;
                while (ind < n && c[year, n, ind] > 0)
                {
                    Console.Write(c[year, n, ind] + ", ");
                    ind++;
                }
                Console.WriteLine();
            }

            Console.ReadLine();
        }
    }
}
There are Choose(2011,10) or about 10^26 sets of 10 numbers that add up to 2011. So, in order for a brute force approach to work, the search tree would have to be trimmed significantly.
Fortunately, there are a few ways to do that.
The first obvious way is to require that the numbers are ordered. This reduces the number of options by a factor of around 10^7.
The second is that we can detect early if our current partial solution can never lead to a complete solution. Since our values are ordered, the remaining numbers in the set are at least as large as the current number. Note that the sum of the numbers increases as the numbers get larger, while the sum of the reciprocals decreases.
There are two sure ways we can tell we're at a dead end:
We get the smallest possible total from where we are when we take all remaining numbers to be the same as the current number. If this smallest sum is too big, we'll never get less.
We get the largest possible sum of reciprocals when we take all remaining numbers to be the same as the current number. If this largest sum is less than 1, we'll never get to 1.
These two conditions set an upper bound on the next xi.
Thirdly, we can stop looking if our partial sum of reciprocals is greater than or equal to 1.
Putting all this together, here is a solution in C#:
static int[] x = new int[10];

static void Search(int depth, int xi, int sum, double rsum) {
    if (depth == 9) {
        // We know exactly what the last number should be
        // to make the sum 2011:
        xi = 2011 - sum;
        // Now check if the sum of reciprocals adds up as well
        if (Math.Abs(rsum + 1.0 / xi - 1.0) < 1e-12) {
            // We have a winner!
            x[depth] = xi;
            var s = string.Join(" ", Array.ConvertAll(x, n => n.ToString()));
            Console.WriteLine(s);
        }
    } else {
        int lastxi = xi;
        // Reciprocal of the part of the reciprocal sum still missing
        // (the caller guarantees rsum < 1):
        double remainder = 1.0 / (1.0 - rsum);
        // There are 2 ways xi can be too large:
        xi = Math.Min(
            // 1. If adding it (10 - depth) times to the sum
            //    is greater than our total:
            (2011 - sum) / (10 - depth),
            // 2. If adding (10 - depth) times its reciprocal
            //    is less than 1.
            (int)((10 - depth) * remainder));
        // We iterate towards smaller xi so we can stop
        // when the reciprocal sum is too large:
        while (xi >= lastxi) {
            double newRSum = rsum + 1.0 / xi;
            if (newRSum >= 1.0)
                break;
            x[depth] = xi;
            Search(depth + 1, xi, sum + xi, newRSum);
            xi--;
        }
    }
}

Search(0, 1, 0, 0);
If you used a brute force algorithm to iterate through all the combinations, you'd end up with the answers. But I don't think it's quite as big as 10*2011*2011, since you can easily arbitrarily postulate that x1 <= x2 <= ... <= x10.
I think a brute force approach would easily get the answer. However I would imagine that the instructor is looking for a mathematical approach. I'm thinking the '1' must have some significance with regards to finding how to manipulate the equations to the answer. The '2011' seems arbitrary.