I'm trying to implement the following task. We have 2*n points on a circle, so we can draw n chords between them. Print all the ways to draw n non-intersecting chords.
For example, with 6 points (n = 3 chords) we can draw (1->2, 3->4, 5->6), (1->2, 3->6, 4->5), (1->4, 2->3, 5->6), (1->6, 2->3, 4->5), (1->6, 2->5, 3->4).
I've developed a recursive algorithm that creates a chord from 1 to 2, 4 or 6 and generates the answers for the 2 remaining intervals. But I know there is a more efficient non-recursive way, maybe by implementing a NextSeq function.
Does anyone have any ideas?
UPD: I do cache intermediate results, but what I really want is a GenerateNextSeq() function that generates the next sequence from the previous one and so enumerates all such combinations.
This is my code, by the way:
struct SimpleHash {
size_t operator()(const std::pair<int, int>& p) const {
return p.first ^ p.second;
}
};
struct Chord {
int p1, p2;
Chord(int x, int y) : p1(x), p2(y) {};
};
void MergeResults(const vector<vector<Chord>>& res1, const vector<vector<Chord>>& res2, vector<vector<Chord>>& res) {
res.clear();
if (res2.empty()) {
res = res1;
return;
}
for (int i = 0; i < res1.size(); i++) {
for (int k = 0; k < res2.size(); k++) {
vector<Chord> cur;
for (int j = 0; j < res1[i].size(); j++) {
cur.push_back(res1[i][j]);
}
for (int j = 0; j < res2[k].size(); j++) {
cur.push_back(res2[k][j]);
}
res.emplace_back(cur);
}
}
}
int rec = 0;
int cached = 0;
void allChordsH(vector<vector<Chord>>& res, int st, int end, unordered_map<pair<int, int>, vector<vector<Chord>>, SimpleHash>& cach) {
if (st >= end)
return;
rec++;
if (cach.count( {st, end} )) {
cached++;
res = cach[{st, end}];
return;
}
vector<vector<Chord>> res1, res2, res3, curRes;
for (int i = st+1; i <=end; i += 2) {
res1 = {{Chord(st, i)}};
allChordsH(res2, st+1, i-1, cach);
allChordsH(res3, i+1, end, cach);
MergeResults(res1, res2, curRes);
MergeResults(curRes, res3, res1);
for (size_t k = 0; k < res1.size(); k++) {
res.push_back(res1[k]);
}
res1.clear(); res2.clear(); res3.clear(); curRes.clear();
}
cach[{st, end}] = res; // cache the complete result for [st, end] only after every split point has been tried
}
void allChords(vector<vector<Chord>>& res, int n) {
res.clear();
unordered_map<pair<int, int>, vector<vector<Chord>>, SimpleHash> cach; // intrval => result
allChordsH(res, 1, n, cach);
return;
}
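Use dynamic programming. That is, cache partial results.
Basically, start from 1 chord, compute all answers and add them to the cache.
Then take 2 chords and compute all answers, using the cache whenever you can.
Etc.
The plain recursive way recomputes the same sub-intervals over and over (I'm bad with complexity calculation, but it grows much faster than the cached version).
With the cache there are about n/2 ways to place the first chord at each step and n steps, so only O(n^2) pairs of cached sub-results have to be combined. The total work is still bounded below by the size of the output, which is the Catalan number C(n) (9,694,845 layouts for 15 chords), and the solution depends on memory, as it has to hold all the combinations in the cache. 15 chords easily uses 1GB of memory (Java solution).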
Example solution:
https://ideone.com/g81zP9
Completes 12 chord computation in ~306ms.
Given 1GB of RAM it computes 15 chords in ~8sec.
The cache is saved in a specific format to optimize performance: the number stored at a position is the distance to the point it is linked to (0 marks the far end of a chord). For example [1,0,3,1,0,0] means:
1  0  3  1  0  0
|--|  |  |--|  |
      |--------|
You can transform it in a separate step to whatever format you want.
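A minimal Python sketch of the caching idea and of the offset format described above. The names `layouts` and `to_chords` are made up for this illustration, and this is not the linked Java solution, only one way the same recurrence could look: the chord from the first point encloses `inner` chords, and the rest lie after its far end.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def layouts(k):
    """All non-crossing layouts of k chords on 2*k points, as tuples of offsets."""
    if k == 0:
        return ((),)
    result = []
    for inner in range(k):                       # chords nested under the first chord
        for a in layouts(inner):                 # cached layouts inside the first chord
            for b in layouts(k - 1 - inner):     # cached layouts after its far end
                result.append((2 * inner + 1,) + a + (0,) + b)
    return tuple(result)

def to_chords(layout):
    """Convert the offset format to 1-based (start, end) pairs."""
    return [(i + 1, i + 1 + d) for i, d in enumerate(layout) if d > 0]

for layout in layouts(3):
    print(layout, to_chords(layout))
```

The number of layouts is the Catalan number, so memory grows quickly with k no matter how the cache is organized.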
It is an interview question. Given an array, e.g., [3,2,1,2,7], we want to make all elements in this array unique by incrementing duplicate elements, and we require that the sum of the resulting array be minimal. For example the answer for [3,2,1,2,7] is [3,2,1,4,7] and its sum is 17. Any ideas?
It's not quite as simple as my earlier comment suggested, but it's not terrifically complicated.
First, sort the input array. If it matters to be able to recover the original order of the elements then record the permutation used for the sort.
Second, scan the sorted array from left to right (ie from low to high). If an element is less than or equal to the element to its left, set it to be one greater than that element.
Pseudocode
sar = sort(input_array)
for index = 2:size(sar) ! I count from 1
    if sar(index)<=sar(index-1) sar(index) = sar(index-1)+1
forend
Is the sum of the result minimal? I've convinced myself that it is through some head-scratching and trials, but I haven't got a formal proof.
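The pseudocode above could look like this in Python. This is a sketch only: the function name `make_unique_min_sum` is made up for the illustration, and it returns the values in sorted order rather than restoring the original positions.

```python
def make_unique_min_sum(values):
    """Sort, then bump every element that does not exceed its left neighbour."""
    result = sorted(values)
    for i in range(1, len(result)):
        if result[i] <= result[i - 1]:
            result[i] = result[i - 1] + 1
    return result

fixed = make_unique_min_sum([3, 2, 1, 2, 7])
print(fixed, sum(fixed))
```

For the example from the question this gives the same sum, 17, as the answer [3,2,1,4,7] quoted there.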
If you only need to find ONE of the best solutions, here's the algorithm with some explanations.
The idea is to find an optimal solution by testing all candidate solutions (well, there are infinitely many, so let's stick with the reasonable ones).
I wrote a program in C, because I'm familiar with it, but you can port it to any language you want.
The program does this: it tries to increment one value up to the maximum possible (I'll explain how to find it in the comments under the code sections); then, if the solution is not found, it decreases this value and goes on with the next one, and so on.
It's an exponential algorithm, so it will be very slow on inputs with many duplicated values (yet it is meant to guarantee that the best solution is found).
I tested this code with your example, and it worked; not sure if there's any bug left, but the code (in C) is this.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
typedef int BOOL; //just to ease meanings of values
#define TRUE 1
#define FALSE 0
Just to ease comprehension, I did some typedefs. Don't worry.
typedef struct duplicate { //used to speed up the algorithm; it uses some more memory just to be safe
int value;
BOOL duplicate;
} duplicate_t;
int maxInArrayExcept(int *array, int arraySize, int index); //find the max value in the array, ignoring the value at the given index
//the result is the max value in the array, not counting the index
int *findDuplicateSum(int *array, int arraySize);
BOOL findDuplicateSum_R(duplicate_t *array, int arraySize, int *tempSolution, int *solution, int *totalSum, int currentSum); //recursive function used to find the solution
BOOL check(int *array, int arraySize); //checks if there's any repeated value in the solution
These are all the functions we'll need, split up for ease of comprehension.
First, we have a struct. It is used to avoid checking, at every iteration, whether the value at a given index was originally duplicated. We don't want to modify any value that was not duplicated originally.
Then, we have a couple of functions. First, we need to handle the worst-case scenario: every value after the duplicated ones is already occupied, so we need to increment the duplicated value up to the maximum value reached + 1.
Then, there are the main functions, which we'll discuss later.
The check function only checks whether there's any duplicated value in a temporary solution.
int main() { //testing purpose
int i;
int testArray[] = { 3,2,1,2,7 }; //test array
int nTestArraySize = 5; //test array size
int *solutionArray; //needed if you want to use the solution later
solutionArray = findDuplicateSum(testArray, nTestArraySize);
for (i = 0; i < nTestArraySize; ++i) {
printf("%d ", solutionArray[i]);
}
return 0;
}
This is the main Function: I used it to test everything.
int * findDuplicateSum(int * array, int arraySize)
{
int *solution = malloc(sizeof(int) * arraySize);
int *tempSolution = malloc(sizeof(int) * arraySize);
duplicate_t *duplicate = calloc(arraySize, sizeof(duplicate_t));
int i, j, currentSum = 0, totalSum = INT_MAX;
for (i = 0; i < arraySize; ++i) {
tempSolution[i] = solution[i] = duplicate[i].value = array[i];
currentSum += array[i];
for (j = 0; j < i; ++j) { //to find ALL the best solutions, we should also put the first found value as true; it's just a line more
//yet, it saves the algorythm half of the duplicated numbers (best/this case scenario)
if (array[j] == duplicate[i].value) {
duplicate[i].duplicate = TRUE;
}
}
}
if (!findDuplicateSum_R(duplicate, arraySize, tempSolution, solution, &totalSum, currentSum)) {
printf("No solution found\n");
}
free(tempSolution);
free(duplicate);
return solution;
}
This function does a lot of things: first, it sets up the solution array, then it initializes both the solution values and the duplicate array, which is the one used to check for duplicated values at startup. Then, we find the current sum and we set the maximum available sum to the maximum integer possible.
Then, the recursive function is called; it tells us whether a solution has been found (which should always be the case), then we return the solution as an array.
int findDuplicateSum_R(duplicate_t * array, int arraySize, int * tempSolution, int * solution, int * totalSum, int currentSum)
{
int i;
if (check(tempSolution, arraySize)) {
if (currentSum < *totalSum) { //optimal solution checking
for (i = 0; i < arraySize; ++i) {
solution[i] = tempSolution[i];
}
*totalSum = currentSum;
}
return TRUE; //just to ensure a solution is found
}
BOOL found = FALSE;
for (i = 0; i < arraySize; ++i) {
if (array[i].duplicate == TRUE) {
if (tempSolution[i] <= maxInArrayExcept(solution, arraySize, i)) { //worst case scenario, you need it to stop the recursion on that value
tempSolution[i]++;
if (findDuplicateSum_R(array, arraySize, tempSolution, solution, totalSum, currentSum + 1)) {
found = TRUE;
}
tempSolution[i]--; //backtracking
}
}
}
return found; //FALSE only if no branch reached a valid solution
}
This is the recursive function. It first checks whether the solution is valid and whether it is the best one found so far. If so, it updates the actual solution with the temporary values and updates the optimal sum.
Then, we iterate over every repeated value (the if excludes the other indexes) and we go deeper into the recursion until (if unlucky) we reach the worst-case scenario: the check condition is still not satisfied at the maximum value.
Then we have to backtrack and continue with the iteration, which will go on with the other values.
PS: an optimization is possible here if we move the optimality test from the check into the for loop: if the current sum is already not better than the best one, we can't expect to find a better solution just by adding more.
That was the hard part of the code; here are the supporting functions:
int maxInArrayExcept(int *array, int arraySize, int index) {
int i, max = 0;
for (i = 0; i < arraySize; ++i) {
if (i != index) {
if (array[i] > max) {
max = array[i];
}
}
}
return max;
}
BOOL check(int *array, int arraySize) {
int i, j;
for (i = 0; i < arraySize; ++i) {
for (j = 0; j < i; ++j) {
if (array[i] == array[j]) return FALSE;
}
}
return TRUE;
}
I hope this was useful.
Write if anything is unclear.
Well, I got the same question in one of my interviews.
Not sure if you still need it. But here's how I did it. And it worked well.
num_list1 = [2,8,3,6,3,5,3,5,9,4]
def UniqueMinSumArray(num_list):
    # Bump each duplicated element by 1 until its value is unique.
    # Jumping straight past a running maximum would overshoot on inputs such as [3,3,1,1].
    for i in range(len(num_list)):
        while num_list.count(num_list[i]) > 1:
            num_list[i] += 1
    return num_list
print(sum(UniqueMinSumArray(num_list1)))
You can try it with your own list of numbers; it should give you the unique minimum sum.
I got the same interview question too. But my answer is in JS in case anyone is interested.
For sure it can be improved to get rid of for loop.
function getMinimumUniqueSum(arr) {
// [1,1,2] => [1,2,3] = 6
// [1,2,2,3,3] = [1,2,3,4,5] = 15
if (arr.length > 0) { // a single element is already unique, its sum is the element itself
var sortedArr = [...arr].sort((a, b) => a - b);
var current = sortedArr[0];
var res = [current];
for (var i = 1; i + 1 <= arr.length; i++) {
// check current equals to the rest array starting from index 1.
if (sortedArr[i] > current) {
res.push(sortedArr[i]);
current = sortedArr[i];
} else if (sortedArr[i] == current) {
current = sortedArr[i] + 1;
// sortedArr[i]++;
res.push(current);
} else {
current++;
res.push(current);
}
}
return res.reduce((a,b) => a + b, 0);
} else {
return 0;
}
}
I'm solving a CS problem and I need a little help. I have a number N, and I need to count the distinct rectangles whose diagonal passes through exactly N squares when the rectangle is split into 1x1 squares. This picture will help you understand.
The picture shows all 4 combinations for N = 4: the rectangles in which the diagonal passes through 4 squares have sizes 1x4, 2x3, 4x2 and 4x4.
For a rectangle with given side lengths A and B, I found that the number of squares the diagonal passes through is:
A + B - gcd(A,B)
Since N<=10^6, I iterate over the possible values of A and, for each one, enumerate its divisors, which costs O(N*sqrt(N)) in total. Since gcd(A,B) is a divisor of A, for each divisor q of A I solve the system of equations
q is divisor of A and q is gcd(A,B)
A+B-q=N and gcd(A,B)=q
I solved this in O(Nsqrt(N)*log(N))
where i assume that log(N) is the time to find gcd of two numbers.
Because the time limit is 3 seconds it fails on time. I need help on optimizing the solution.
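To make the formula concrete, here is a small brute-force check in Python. It is a sketch for tiny N only, not the optimized solution being asked for, and the names `squares_crossed` and `count_rectangles` are made up for the illustration. Rectangles are counted once regardless of orientation (A <= B), and both sides are at most N because A + B - gcd(A,B) >= max(A,B).

```python
from math import gcd

def squares_crossed(a, b):
    """Number of unit squares the diagonal of an a x b rectangle passes through."""
    return a + b - gcd(a, b)

def count_rectangles(n):
    """Brute force: count unordered pairs (a, b), a <= b, whose diagonal crosses n squares."""
    return sum(
        1
        for a in range(1, n + 1)
        for b in range(a, n + 1)
        if squares_crossed(a, b) == n
    )

print(count_rectangles(4))
```

For N = 4 it finds the pairs (1,4), (2,3), (2,4) and (4,4), matching the four rectangles listed above.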
Update: Here is my code:
#include <bits/stdc++.h>
#define ll long long
using namespace std;
int a;
int gcd(int a, int b) {
if(b>a) swap(a,b);
if(b==0) return a;
return gcd(b, a%b);
}
bool valid(int n, int m, int gc, int a) {
if(n+m-gc==a) return true;
return false;
}
int main() {
cin>>a;
int counter=0;
for(int i=1;i<=a/2;i++) {
for(ll j=1;j<=sqrt(i);j++) {
if(i%j==0) {
if(j!=i/j) {
int m1 = a+j-i;
int div=i/j;
int m2 = a+div-i;
if(valid(i, m1, j, a)) {
if(gcd(i, m1)==j)
counter++;
}
if(valid(i, m2, i/j, a)) {
if(gcd(i,m2)==i/j)
counter++;
}
}
else {
int m1 = a+j-i;
if(valid(i, m1, j, a)) {
if(gcd(i, m1)==j)
counter++;
}
}
}
}
}
cout<<counter+1;
return 0;
}
Thanks in advance.
Although O(n*sqrt(n)*log(n)) sounds a bit much for n <= 10^6, and you likely need a slightly better algorithm, your code supports some optimizations:
int gcd(int a, int b) {
if(b>a) swap(a,b);
if(b==0) return a;
return gcd(b, a%b);
}
Get rid of the swap, it will work just fine without it.
While you're at it, get rid of the recursion too:
int gcd(int a, int b) {
while (b) {
int r = a % b;
a = b;
b = r;
}
return a;
}
Next:
for(int i=1;i<=a/2;i++) {
for(ll j=1;j<=sqrt(i);j++) {
Compute a/2 and sqrt(i) outside of their respective loops. There is no need to compute it at each iteration. The compiler may or may not be smart enough (or set up) to do this itself, but you shouldn't rely on it, especially in an online judge setting.
You can also precompute i / j further down so as to not recompute it every time. A lot of divisions can be slow.
Next, do you really need long long for j? i is an int, and j goes up to its square root. So you don't need long long for j, use int.
I'm interested in two solutions using priority_queue specifically. Although they both use priority_queue, I think they have different time complexity.
Solution 1:
int findKthLargest(vector<int>& nums, int k) {
priority_queue<int> pq(nums.begin(), nums.end()); //O(N)
for (int i = 0; i < k - 1; i++) //O(k*log(k))
pq.pop();
return pq.top();
}
Time Complexity: O(N) + O(k*log(k))
EDIT: sorry, it should be O(N) + O(k*log(N)) thanks for pointing out!
Solution 2:
int findKthLargest(vector<int>& nums, int k) {
priority_queue<int, vector<int>, greater<int>> p;
int i = 0;
while(p.size()<k) {
p.push(nums[i++]);
}
for(; i<nums.size(); i++) {
if(p.top()<nums[i]){
p.pop();
p.push(nums[i]);
}
}
return p.top();
}
Time Complexity: O(N*log(k))
So in most cases the 1st solution is much better than the 2nd?
In the first case the complexity is O(n + k*log(n)), not O(n + k*log(k)), because there are n elements in the heap. In the worst case k can be as large as n, so for unbounded k the worst-case complexity is O(n*log(n)).
In the second case there are never more than k items in the priority queue, so the complexity is O(n*log(k)); again, for unbounded k it can be as large as n, so it is O(n*log(n)).
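The two strategies can be sketched side by side with Python's heapq module. The function names are made up for this illustration. heapq only provides a min-heap, so the first variant stores negated values to imitate the max-heap of Solution 1.

```python
import heapq

def kth_largest_full_heap(nums, k):
    """Solution 1: heapify all N values in O(N), then pop k-1 times, O(k log N)."""
    heap = [-x for x in nums]          # negate so the min-heap acts as a max-heap
    heapq.heapify(heap)
    for _ in range(k - 1):
        heapq.heappop(heap)
    return -heap[0]

def kth_largest_small_heap(nums, k):
    """Solution 2: keep a min-heap holding the k largest values seen so far, O(N log k)."""
    heap = list(nums[:k])
    heapq.heapify(heap)
    for x in nums[k:]:
        if x > heap[0]:
            heapq.heapreplace(heap, x)  # pop the smallest and push x in one step
    return heap[0]

data = [3, 2, 1, 5, 6, 4]
print(kth_largest_full_heap(data, 2), kth_largest_small_heap(data, 2))
```

Both variants assume 1 <= k <= len(nums), as the C++ versions do.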
For smaller k, the second code will run faster, but as k becomes larger the first code becomes faster. I made some experiments, here are the results:
k=1000
Code 1 time:0.123662
998906057
Code 2 time:0.03287
998906057
========
k=11000
Code 1 time:0.137448
988159929
Code 2 time:0.0872
988159929
========
k=21000
Code 1 time:0.152471
977547704
Code 2 time:0.131074
977547704
========
k=31000
Code 1 time:0.168929
966815132
Code 2 time:0.168899
966815132
========
k=41000
Code 1 time:0.185737
956136410
Code 2 time:0.205008
956136410
========
k=51000
Code 1 time:0.202973
945313516
Code 2 time:0.236578
945313516
========
k=61000
Code 1 time:0.216686
934315450
Code 2 time:0.27039
934315450
========
k=71000
Code 1 time:0.231253
923596252
Code 2 time:0.293189
923596252
========
k=81000
Code 1 time:0.246896
912964978
Code 2 time:0.321346
912964978
========
k=91000
Code 1 time:0.263312
902191629
Code 2 time:0.343613
902191629
========
I modified the second code a little bit to make it similar to code 1:
int findKthLargest2(vector<int>& nums, int k) {
double st=clock();
priority_queue<int, vector<int>, greater<int>> p(nums.begin(), nums.begin()+k);
int i=k;
for(; i<nums.size(); i++) {
if(p.top()<nums[i]){
p.pop();
p.push(nums[i]);
}
}
cerr<<"Code 2 time:"<<(clock()-st)/CLOCKS_PER_SEC<<endl;
return p.top();
}
int findKthLargest1(vector<int>& nums, int k) {
double st=clock();
priority_queue<int> pq(nums.begin(), nums.end()); //O(N)
for (int i = 0; i < k - 1; i++) //O(k*log(N)), the heap holds N elements
pq.pop();
cerr<<"Code 1 time:"<<(clock()-st)/CLOCKS_PER_SEC<<endl;
return pq.top();
}
int main() {
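freopen("in", "r", stdin); // READ("in") in my template; assumed here to redirect stdin to the file "in"
vector<int>v;
int n;
cin>>n;
for(int i=0;i<n;i++) // repl(i,n) in my template
{
int x;
scanf("%d",&x);
v.push_back(x); // v.pb(x) in my template
}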
for(int k=1000;k<=100000;k+=10000)
{
cout<<"k="<<k<<endl;
cout<<findKthLargest1(v,k)<<endl;
cout<<findKthLargest2(v,k)<<endl;
puts("========");
}
}
I used 1000000 random integers between 0 and 10^9 as the dataset, generated by the C++ rand() function.
Well, no: the first is O(N)+O(k*log(N)), because each pop is O(log(N)).
int findKthLargest(vector<int>& nums, int k) {
priority_queue<int> pq(nums.begin(), nums.end()); //O(N)
for (int i = 0; i < k - 1; i++) //O(k*log(N))
pq.pop(); // this line is O(log(N))
return pq.top();
}
it's still better than the second in most cases.
While practising problems from hackerearth I came across following problem( not from active contest ) and have been unsuccessful in solving it after many attempts.
Chandler is participating in a race competition involving N track
races. He wants to run his old car on these tracks having F amount of
initial fuel. At the end of each race, Chandler spends si fuel and
gains some money using which he adds ei amount of fuel to his car.
Also for participating in race i at any stage, Chandler should have
more than si amount of fuel. Also he can participate in race i once.
Help Chandler in maximizing the number of races he can take part in if
he has a choice to participate in the given races in any order.
How can I approach the problem? My approach was to sort by (ei-si), but then I couldn't incorporate the condition that the fuel present must be greater than what the race requires.
EDIT I tried to solve it using the following algorithm, but it fails, and I can't think of any input that breaks it. Please help me figure out what's wrong or give some input where my algorithm fails.
Sort (ei-si) in non-increasing order;
start iterating through sorted (ei-si) and find first element such that fuel>=si
update fuel=fuel+(ei-si);
update count;
erase that element from list, and start searching again;
if fuel was not updated then we can't take part in any races so stop searching
and output count.
EDIT And here is my code as requested.
#include<iostream>
#include<vector>
#include<algorithm>
#include<list>
using namespace std;
struct race{
int ei;
int si;
int earn;
};
bool compareByEarn(const race &a, const race &b)
{
return a.earn <= b.earn;
}
int main(){
int t;
cin>>t;
while(t--){
vector<struct race> fuel;
int f,n;
cin>>f>>n;
int si,ei;
while(n--){
cin>>si>>ei;
fuel.push_back({ei,si,ei-si});
}
sort(fuel.begin(),fuel.end(),compareByEarn);
list<struct race> temp;
std::copy( fuel.rbegin(), fuel.rend(), std::back_inserter(temp ) );
int count=0;
while(1){
int flag=0;
for (list<struct race>::iterator ci = temp.begin(); ci != temp.end(); ++ci){
if(ci->si<=f){
f+=ci->earn;
ci=temp.erase(ci);
++count;
flag=1;
break;
}
}
if(!flag){
break;
}
}
cout<<count<<endl;
}
}
EDIT As noted in the answer below, the above greedy approach doesn't always work. So now any alternative method would be useful.
Here is my solution, which gets accepted by the judge:
Take every race that has a profit (ei>=si) as soon as there is enough fuel for it, and remove it from the list
Sort the remaining races by ei (in decreasing order)
Solve the problem for them using a dynamic programming algorithm. (It is similar to the pseudo-polynomial solution for the 0-1 knapsack.)
It is clear that the order in which you eliminate profitable races does not matter. (As long as you process them until no more profitable races can be entered.)
For the rest, I will first prove that if a solution exists, you can perform the same set of races in decreasing order of ei, and the solution will still be feasible. Imagine we have a solution in which k races were chosen and let's say these k races have starting and ending fuel values of s1,...,sk and e1,...,ek. Let i be the first index where ei < ej (where j=i+1). We will show that we can swap i and i+1 without violating any constraints.
It is clear that swapping i and i+1 does not affect any constraint before i or after i+1 (the fuel level after both races is the same in either order), so we only need to prove that both races can still be entered after the swap. Let f be the fuel level just before race i in the original order. After race i the level is f-si+ei, and since race j follows, f-si+ei>=sj. Race j can be entered first because the remaining races are not profitable (ei<=si), so f >= sj+si-ei >= sj. Rearranging the same inequality gives f-sj+ei>=si, and because ei < ej we get f-sj+ej >= f-sj+ei >= si, so running race j before race i still leaves at least si fuel for race i.
From there, we implement a dynamic programming algorithm in which d[i][j] is the maximum number of races we can participate in if we can only use races i..n and we start with j units of fuel.
Here is my code:
#include <iostream>
#include <algorithm>
#include <cstring>
using namespace std;
const int maxn = 110;
const int maxf = 110*1000;
int d[maxn][maxf];
struct Race {
int s, e;
bool used;
inline bool operator < (const Race &o) const {
return e > o.e;
}
} race[maxn];
int main() {
int t;
for (cin >> t; t--;) {
memset(d, 0, sizeof d);
int f, n;
cin >> f >> n;
for (int i = 0; i < n; i++) {
cin >> race[i].s >> race[i].e;
race[i].used = false;
}
sort(race, race + n);
int count = 0;
bool found;
do {
found = 0;
for (int i = 0; i < n; i++)
if (!race[i].used && race[i].e >= race[i].s && f >= race[i].s) { // profitable and affordable
race[i].used = true;
count++;
f += race[i].e - race[i].s; // net fuel gained
found = true;
}
} while (found);
for (int i = n - 1; i >= 0; i--) {
for (int j = 0; j < maxf; j++) {
d[i][j] = d[i + 1][j];
if (!race[i].used && j >= race[i].s) {
int f2 = j - race[i].s + race[i].e;
if (f2 < maxf)
d[i][j] = max(d[i][j], 1 + d[i + 1][f2]);
}
}
}
cout << d[0][f] + count << endl;
}
return 0;
}
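The same two-phase idea can be sketched compactly in Python. Two things here are assumptions of the sketch rather than part of the code above: the function name `max_races` is made up, and a race is treated as enterable when fuel >= s (the problem statement says "more than", so the comparison may need to be strict). Profitable races are taken first in order of increasing s, then a memoized recursion plays the role of the d[i][j] table over the remaining races sorted by decreasing e.

```python
from functools import lru_cache

def max_races(fuel, races):
    """races is a list of (s, e) pairs: s fuel is spent, e fuel is gained afterwards."""
    profitable = sorted(r for r in races if r[1] >= r[0])
    losing = sorted((r for r in races if r[1] < r[0]), key=lambda r: r[1], reverse=True)

    count = 0
    for s, e in profitable:        # fuel never drops here, so one pass by increasing s is enough
        if fuel < s:
            break
        fuel += e - s
        count += 1

    @lru_cache(maxsize=None)
    def best(i, f):
        """Most races that can be run from losing[i:] when starting with f fuel."""
        if i == len(losing):
            return 0
        s, e = losing[i]
        result = best(i + 1, f)                     # skip race i
        if f >= s:
            result = max(result, 1 + best(i + 1, f - s + e))
        return result

    return count + best(0, fuel)

print(max_races(9, [(9, 6), (2, 0), (2, 0), (2, 0)]))
```

The memoization only visits fuel levels that are actually reachable, so it does not need a table sized by the maximum fuel.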
You need to change your compareByEarn function
bool compareByEarn(const race &a, const race &b)
{
if(a.earn == b.earn) return a.si < b.si;
return a.earn < b.earn;
}
Above comparison means, choose the track with more earning (or lesser loss). But if there are 2 tracks with same earning, prefer the track which requires more fuel.
Consider the example
Initially fuel in the car = 4
track 1 : s = 2, e = 1
track 2 : s = 3, e = 2
track 3 : s = 4, e = 3
Expected answer = 3
Received answer = 2 or 3, depending on whether the sorting algorithm is stable and on the order of the input.
As a side note:
Also for participating in race i at any stage, Chandler should have
more than si amount of fuel
Should translate to
if(ci->si < f){ // and not if(ci->si<=f){
You can check if my observation is right or problem author chose incorrect sentence to describe the constraint.
EDIT With more reasoning I realized you can not do it with only greedy approach.
Consider the following input.
Initially fuel in the car = 9
track 1 : s = 9, e = 6
track 2 : s = 2, e = 0
track 3 : s = 2, e = 0
track 4 : s = 2, e = 0
Expected answer = 4
Received answer = 3
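The counterexample can be checked with a short Python sketch. The names `greedy` and `brute_force` are made up here. `greedy` follows the sort-by-earning strategy with the tie-break described above, and `brute_force` simply tries every order, which is only feasible for a handful of races. Both assume a race can be entered when fuel >= s.

```python
from itertools import permutations

def greedy(fuel, races):
    """Repeatedly take the first affordable race, best earning first, larger s on ties."""
    remaining = sorted(races, key=lambda r: (r[1] - r[0], r[0]), reverse=True)
    count = 0
    while True:
        for idx, (s, e) in enumerate(remaining):
            if s <= fuel:
                fuel += e - s
                del remaining[idx]
                count += 1
                break
        else:
            return count           # no affordable race left

def brute_force(fuel, races):
    """Try every order, entering each race that is affordable when its turn comes."""
    best = 0
    for order in permutations(races):
        f, done = fuel, 0
        for s, e in order:
            if f >= s:
                f += e - s
                done += 1
        best = max(best, done)
    return best

tracks = [(9, 6), (2, 0), (2, 0), (2, 0)]
print(greedy(9, tracks), brute_force(9, tracks))
```

On this input the greedy order spends fuel on the three cheap tracks first and can no longer afford track 1, while starting with track 1 allows all four.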
I have a few bold line segments on the x-axis, given by their beginning and ending x-coordinates. Some line segments may overlap. How do I find the length of the union of all the line segments?
For example, one line segment is 5,0 to 8,0 and another is 9,0 to 12,0. They do not overlap, so the total length is 3 + 3 = 6.
Another case: one line segment is 5,0 to 8,0 and the other is 7,0 to 12,0. They overlap on the range 7,0 to 8,0, so the length of the union is 7.
Note that the x-coordinates may be floating-point numbers.
Represent each line segment as 2 EndPoint objects. Each EndPoint object has the form <coordinate, isStartEndPoint>. Put the EndPoint objects of all the line segments together in a list endPointList.
The algorithm:
Sort endPointList by coordinate in ascending order; among end points with the same coordinate, place the start end points in front of the tail end points (regardless of which segment they belong to, since it doesn't matter when they are all at the same coordinate).
Loop through the sorted list according to this pseudocode:
prevCoordinate = -Inf
numSegment = 0
unionLength = 0
for (endPoint in endPointList):
    if (numSegment > 0):
        unionLength += endPoint.coordinate - prevCoordinate
    prevCoordinate = endPoint.coordinate
    if (endPoint.isStartEndPoint):
        numSegment = numSegment + 1
    else:
        numSegment = numSegment - 1
The numSegment variable will tell whether we are in a segment or not. When it is larger than 0, we are inside some segment, so we can include the distance to the previous end point. If it is 0, it means that the part before the current end point doesn't contain any segment.
The complexity is dominated by the sorting part: comparison-based sorting has a lower bound of Omega(n log n), while the loop is O(n). So the overall complexity is O(n log n) if you choose an O(n log n) comparison-based sorting algorithm.
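A runnable Python sketch of the sweep described above. The function name `union_length` and the tuple encoding of end points are choices made for this illustration: each end point is stored as (coordinate, kind) with kind 0 for a start and 1 for an end, so an ordinary sort already places starts before ends at equal coordinates.

```python
def union_length(segments):
    """Length of the union of (lo, hi) segments on a line, via a sorted end-point sweep."""
    events = []
    for lo, hi in segments:
        events.append((lo, 0))    # 0 = start end point
        events.append((hi, 1))    # 1 = tail end point
    events.sort()

    total = 0.0
    open_count = 0                # how many segments cover the stretch before this end point
    prev = None
    for coord, kind in events:
        if open_count > 0:
            total += coord - prev
        prev = coord
        open_count += 1 if kind == 0 else -1
    return total

print(union_length([(5, 8), (9, 12)]), union_length([(5, 8), (7, 12)]))
```

Floating-point coordinates work unchanged, since the sweep only compares and subtracts coordinates.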
Use a range tree. A range tree is n log(n), just like the sorted begin/end points, but it has the additional advantage that overlapping ranges reduce the number of elements (though they may increase the cost of insertion). Snippet (untested):
#include <stdlib.h> /* malloc */
struct segment {
struct segment *ll, *rr;
float lo, hi;
};
struct segment * newsegment(float lo, float hi) {
struct segment * ret;
ret = malloc (sizeof *ret);
ret->lo = lo; ret->hi = hi;
ret->ll= ret->rr = NULL;
return ret;
}
struct segment * insert_range(struct segment *root, float lo, float hi)
{
if (!root) return newsegment(lo, hi);
/* non-overlapping(or touching) ranges can be put into the {l,r} subtrees} */
if (hi < root->lo) {
root->ll = insert_range(root->ll, lo, hi);
return root;
}
if (lo > root->hi) {
root->rr = insert_range(root->rr, lo, hi);
return root;
}
/* when we get here, we must have overlap; we can extend the current node
** we also need to check if the broader range overlaps the child nodes
*/
if (lo < root->lo ) {
root->lo = lo;
while (root->ll && root->ll->hi >= root->lo) {
struct segment *tmp;
tmp = root->ll;
root->lo = tmp->lo;
root->ll = tmp->ll;
tmp->ll = NULL;
// freetree(tmp);
}
}
if (hi > root->hi ) {
root->hi = hi;
while (root->rr && root->rr->lo <= root->hi) {
struct segment *tmp;
tmp = root->rr;
root->hi = tmp->hi;
root->rr = tmp->rr;
tmp->rr = NULL;
// freetree(tmp);
}
}
return root;
}
float total_width(struct segment *ptr)
{
float ret;
if (!ptr) return 0.0;
ret = ptr->hi - ptr->lo;
ret += total_width(ptr->ll);
ret += total_width(ptr->rr);
return ret;
}
Here is a solution I just wrote in Haskell and below it is an example of how it can be implemented in the interpreter command prompt. The segments must be presented in the form of a list of tuples [(a,a)]. I hope you can get a sense of the algorithm from the code.
import Data.List (sort)

unionSegments :: (Ord a, Num a) => [(a, a)] -> a
unionSegments segments = go (sort segments)
  where
    go [] = 0
    go [(lo, hi)] = hi - lo
    go ((lo1, hi1) : (lo2, hi2) : rest)
      | lo2 <= hi1 = go ((lo1, max hi1 hi2) : rest)       -- overlap (or containment): merge and continue
      | otherwise  = (hi1 - lo1) + go ((lo2, hi2) : rest) -- gap: count the finished segment
*Main> :load "unionSegments.hs"
[1 of 1] Compiling Main ( unionSegments.hs, interpreted )
Ok, modules loaded: Main.
*Main> unionSegments [(5,8), (7,12)]
7
Java implementation
import java.util.*;
public class HelloWorld{
static void unionLength(int a[][],int sets)
{
// coordinate -> (segments starting here) - (segments ending here), so end points sharing a coordinate do not overwrite each other
TreeMap<Integer,Integer> t=new TreeMap<>();
for(int i=0;i<sets;i++)
{
t.merge(a[i][0],1,Integer::sum);
t.merge(a[i][1],-1,Integer::sum);
}
int count=0;
int res=0;
int prev=0;
for(Map.Entry<Integer,Integer> me : t.entrySet()) {
if(count>0)
res=res+(me.getKey()-prev);
count+=me.getValue();
prev=me.getKey();
}
System.out.println(res);
}
public static void main(String []args){
int a[][]={{0, 4}, {3, 6},{8,10}};
int b[][]={{5, 10}, {8, 12}};
unionLength(a,3);
unionLength(b,2);
}
}