I have a problem finding the time complexity.
First, for the outer for loop in MergeSort, I think the number of repetitions is 1 + Sum(i = 1 to sizeOfArray)(2*i) = 1 + (2 + 4 + 8 + 16 + 32 + ... + size), but I also think that I am very wrong.
I also have trouble counting the repetitions of the inner for loop.
MergeSort() { // Iterative version (bottom-up)
    for (int currentSize = 1; currentSize < length; currentSize *= 2) {
        for (int low = 0; low < length - currentSize; low += 2 * currentSize) {
            int mid = low + currentSize - 1;
            // min() is used here so that high stays inside the array
            // when low is very close to the end of the array.
            int high = Math.min(low + currentSize * 2 - 1, length - 1);
            merge(low, mid, high);
        }
    }
}

merge(int low, int middle, int high) {
    // Copy both parts into the helper array
    for (int i = low; i <= high; i++) {
        helper[i] = arrayForMergeSort[i];
    }
    int i = low;
    int j = middle + 1;
    int k = low;
    // Copy the smallest values from either the left or the right side back
    // to the original array
    while (i <= middle && j <= high) {
        if (helper[i] <= helper[j]) {
            arrayForMergeSort[k] = helper[i];
            i++;
        } else {
            arrayForMergeSort[k] = helper[j];
            j++;
        }
        k++;
    }
    // Copy the rest of the left side of the array into the target array
    while (i <= middle) {
        arrayForMergeSort[k] = helper[i];
        k++;
        i++;
    }
}

For the outer loop, the number of iterations is ceil(log2(length)).
For the inner loop, the number of runs to be merged on each iteration is ceil(length / currentSize), which can also be written as floor((length + currentSize - 1) / currentSize). If this is an even number, then the last run's size may be less than currentSize. If this is an odd number, then the last run has no run to merge with and may also be smaller than currentSize. I'm not sure there's a way to calculate the total number of merge operations without using iteration to sum the merge operations per iteration.
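The two loop counts can be checked empirically with a minimal sketch like the one below. The class and method names (MergePassCounter, countPasses, countMerges) are hypothetical; the loop bounds are copied from the question's code. If "merge operations" is read as calls to merge() rather than element moves, the total comes out to length - 1, since every merge reduces the number of runs by one.

```java
final class MergePassCounter {
    /** Outer-loop iterations: currentSize = 1, 2, 4, ... while currentSize < length. */
    static int countPasses(int length) {
        int passes = 0;
        for (long currentSize = 1; currentSize < length; currentSize *= 2) {
            passes++;
        }
        return passes;
    }

    /** Total number of merge() calls over all passes, using the question's loop bounds. */
    static long countMerges(int length) {
        long merges = 0;
        for (long currentSize = 1; currentSize < length; currentSize *= 2) {
            for (long low = 0; low < length - currentSize; low += 2 * currentSize) {
                merges++; // one merge(low, mid, high) call would happen here
            }
        }
        return merges;
    }
}
```

For length = 9 this gives 4 passes and 4 + 2 + 1 + 1 = 8 merges.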
In a "production" version of merge sort, a one-time allocation of a working array the same size as (or half the size of) the original array is done, and then the direction of the merge (original to working, or working to original) is changed with each outer loop. If the program pre-calculates the number of outer iterations and it is an odd number, then a pre-pass can be done to swap elements in place on the initial pass, so that an even number of merge passes is done and the sorted data ends up in the original array.
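A minimal sketch of the alternating-direction idea, assuming an int array of fewer than 2^30 elements. The names (PingPongMergeSort, sort, merge) are hypothetical. Note one swapped-in simplification: instead of the in-place pre-pass described above, this sketch copies the result back once at the end when the number of passes is odd.

```java
final class PingPongMergeSort {
    static void sort(int[] data) {
        int n = data.length;
        int[] src = data;
        int[] dst = new int[n]; // one-time allocation of the working array
        for (int width = 1; width < n; width *= 2) {
            for (int low = 0; low < n; low += 2 * width) {
                int mid = Math.min(low + width, n);
                int high = Math.min(low + 2 * width, n);
                merge(src, dst, low, mid, high);
            }
            // Flip the merge direction for the next pass.
            int[] t = src;
            src = dst;
            dst = t;
        }
        // After an odd number of passes the sorted data sits in the working array.
        if (src != data) {
            System.arraycopy(src, 0, data, 0, n);
        }
    }

    /** Merges src[low, mid) and src[mid, high) into dst[low, high). */
    private static void merge(int[] src, int[] dst, int low, int mid, int high) {
        int i = low, j = mid;
        for (int k = low; k < high; k++) {
            if (i < mid && (j >= high || src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }
}
```

A leftover run with no partner (mid == high) is simply copied across, which keeps both arrays consistent from pass to pass.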


Why does this while loop perform worse than the other, very similar while loop?

I am trying to write a variation of insertion sort. In my algorithm, the swapping of values doesn't happen when finding the correct place for item in hand. Instead, it uses a lookup table (an array containing "links" to smaller values in the main array at corresponding positions) to find the correct position of the item. When we are done with all n elements in the main array, we haven't actually changed any of the elements in the main array itself, but an array named smaller will contain the links to immediate smaller values at positions i, i+1, ... n in correspondence to every element i, i+1, ... n in the main array. Finally, we iterate through the array smaller, starting from the index where the largest value in the main array existed, and populate another empty array in backward direction to finally get the sorted sequence.
Somewhat hacky/verbose implementation of the algorithm just described:
public static int[] sort(int[] a) {
    int length = a.length;
    int[] sorted = new int[length];
    int[] smaller = new int[length];
    // debug helpers
    long e = 0, t = 0;
    int large = 0;
    smaller[large] = -1;
    here:
    for (int i = 1; i < length; i++) {
        if (a[i] > a[large]) {
            smaller[i] = large;
            large = i;
        } else {
            int prevLarge = large;
            int temp = prevLarge;
            long st = System.currentTimeMillis();
            while (prevLarge > -1 && a[prevLarge] >= a[i]) {
                e++;
                if (smaller[prevLarge] == -1) {
                    smaller[i] = -1;
                    smaller[prevLarge] = i;
                    continue here;
                }
                temp = prevLarge;
                prevLarge = smaller[prevLarge];
            }
            long et = System.currentTimeMillis();
            t += (et - st);
            smaller[i] = prevLarge;
            smaller[temp] = i;
        }
    }
    for (int i = length - 1; i >= 0; i--) {
        sorted[i] = a[large];
        large = smaller[large];
    }
    App.print("DevSort while loop execution: " + (e));
    App.print("DevSort while loop time: " + (t));
    return sorted;
}
The variables e and t contain the number of times the inner while loop is executed and total time taken to execute the while loop e times, respectively.
Here is a modified version of insertion sort:
public static int[] sort(int[] a) {
    int n = a.length;
    // debug helpers
    long e = 0, t = 0;
    for (int j = 1; j < n; j++) {
        int key = a[j];
        int i = j - 1;
        long st = System.currentTimeMillis();
        while ((i > -1) && (a[i] >= key)) {
            e++;
            // filler statements, only here to match the other loop's statement count
            if (1 == 1) {
                int x = 0;
                int y = 1;
                int z = 2;
            }
            a[i + 1] = a[i];
            i--;
        }
        long et = System.currentTimeMillis();
        t += (et - st);
        a[i + 1] = key;
    }
    App.print("InsertSort while loop execution: " + (e));
    App.print("InsertSort while loop time: " + (t));
    return a;
}
The if block inside the while loop is there only to match the number of statements inside the while loop of my "hacky" algorithm. Note that the two variables e and t are also introduced in the modified insertion sort.
What confuses me is that even though the while loop of insertion sort runs exactly as many times as the while loop inside my "hacky" algorithm, t for insertion sort is significantly smaller than t for my algorithm.
For a particular run, if n = 10,000:
Total time taken by insertion sort's while loop: 20ms
Total time taken by my algorithm's while loop: 98ms
If n = 100,000:
Total time taken by insertion sort's while loop: 1100ms
Total time taken by my algorithm's while loop: 25251ms
In fact, because the condition 1 == 1 is always true, insertion sort's if block inside the while loop must execute more often than the one inside while loop of my algorithm. Can someone explain what's going on?
Two arrays containing same elements in the same order are being sorted using each algorithm.

Sort a given array whose elements range from 1 to n , in which one element is missing and one is repeated

I have to sort this array in O(n) time and O(1) space.
I know how to sort an array in O(n) but that doesn't work with missing and repeated numbers. If I find the repeated and missing numbers first (It can be done in O(n)) and then sort , that seems costly.
static void sort(int[] arr)
{
    for (int i = 0; i < arr.length; i++)
    {
        if (arr[i] - 1 == i)
            continue;
        while (arr[i] - 1 != i)
        {
            int temp = arr[arr[i] - 1];
            arr[arr[i] - 1] = arr[i];
            arr[i] = temp;
        }
    }
}
First, you need to find missing and repeated numbers. You do this by solving following system of equations:
Left sums are computed simultaneously by making one pass over array. Right sums are even simpler -- you may use formulas for arithmetic progression to avoid looping. So, now you have system of two equations with two unknowns: missing number m and repeated number r. Solve it.
Next, you "sort" array by filling it with numbers 1 to n left to right, omitting m and duplicating r. Thus, overall algorithm requires only two passes over array.
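A minimal sketch of this two-pass approach follows. The equations themselves are not shown above, so the sketch assumes one common choice: the sum of the elements and the sum of their squares. That gives sum(a) - n(n+1)/2 = r - m and sum(a^2) - n(n+1)(2n+1)/6 = r^2 - m^2, and dividing the second by the first yields r + m. The class and method names are hypothetical, and the arithmetic uses long, which is only safe for n up to roughly 1.6 million.

```java
final class MissingRepeatedSort {
    /** Sorts an array holding 1..n with one value missing (m) and one repeated (r). */
    static void sort(int[] a) {
        long n = a.length;
        long sum = 0, sumSq = 0;
        for (int v : a) {           // first pass: both left-hand sums at once
            sum += v;
            sumSq += (long) v * v;
        }
        long d1 = sum - n * (n + 1) / 2;                   // r - m (never 0, since r != m)
        long d2 = sumSq - n * (n + 1) * (2 * n + 1) / 6;   // r^2 - m^2
        long s = d2 / d1;                                  // r + m
        int r = (int) ((s + d1) / 2);
        int m = (int) ((s - d1) / 2);

        // Second pass: refill with 1..n, omitting m and duplicating r.
        int k = 0;
        for (int v = 1; v <= n; v++) {
            if (v == m) continue;
            a[k++] = v;
            if (v == r) a[k++] = v;
        }
    }
}
```

For {3, 1, 3, 4, 5} the sums give r - m = 1 and r + m = 5, so r = 3 and m = 2, and the array is rewritten as 1, 3, 3, 4, 5.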
void sort() {
    for (int i = 1; i <= N; ++i) {
        while (a[i] != a[a[i]]) {
            std::swap(a[i], a[a[i]]);
        }
    }
    for (int i = 1; i <= N; ++i) {
        if (a[i] == i) continue;
        for (int j = a[i] - 1; j >= i; --j) a[j] = j + 1;
        for (int j = a[i] + 1; j <= i; ++j) a[j] = j - 1;
    }
}
Let m denote the missing number and d the duplicated number.
Note that the while loop's condition is a[i] != a[a[i]], so the loop stops as soon as a[i] == a[a[i]]; this covers both a[i] == i and the case where a[i] is a duplicate.
After the first for loop, every non-duplicate number i has been encountered once or twice and moved into the i-th position of the array at most once.
The first-found d is moved to the d-th position, at most once.
The second d is moved around at most N-1 times and ends up in the m-th position, because every other slot i is occupied by the number i.
The second outer for loop locates the first i where a[i] != i. The only i that satisfies this is i = m.
The two inner for loops handle the two cases m < d and m > d, respectively.
Full implementation at
int temp = arr[arr[i]-1];
add a check for duplicate in the loop:
if((temp-1) == i){ // found duplicate
} else {
arr[arr[i]-1] = arr[i];
arr[i] = temp;
See if you can figure out the rest of the code.

How can I develop the exact recurrence for this?

N buildings are built in a row, numbered 1 to N from left to right.
Spiderman is on building number 1 and wants to reach building number N.
He can jump from building number i to building number j iff i < j and j-i is a power of 2 (1, 2, 4, and so on).
Such a move costs him energy |Height[j]-Height[i]|, where Height[i] is the height of the ith building.
Find the minimum energy with which he can reach building N.
First line contains N, number of buildings.
Next line contains N space-separated integers, denoting the array Height.
Print a single integer, the answer to the above problem.
So, I thought of something like this:
int calc(int arr[], int beg, int end)
{
    //int ans = INT_MIN;
    if (beg == end)
        return 0;
    else if (beg > end)
        return 0;
    for (int i = beg+1; i <= end; i++ ) // Iterate over all possible combinations
    {
        int foo = arr[i] - arr[beg]; // Check if power of two or not
        int k = log2(foo);
        int z = pow(2,k);
        if (z == foo) // Calculate the minimum value over multiple values
        {
            int temp = calc(arr,i,end);
            if (temp < ans)
                temp = ans;
        }
    }
}
The above is a question that I am trying to solve and here is the link:
However, the above recurrence is not exactly correct. Do I have to pass in the value of answer too in this?
We can reach nth building from any of (n-2^0),(n-2^1),(n-2^2)... buildings. So we need to process the buildings starting from 1. For each building i we calculate cost for getting there from any of earlier building j where i-j is power of 2 and take the minimum cost.
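long long calc(int arr[], long long dp[], int n) {
    // n is the target building; arr and dp are 1-indexed.
    // dp is long long so that the LLONG_MAX sentinel is not truncated.
    for (int i = 1; i <= n; i++) dp[i] = LLONG_MAX; // initialize to infinity
    dp[1] = 0; // no cost for starting building
    for (int i = 2; i <= n; i++) {
        for (int j = 1; i - j >= 1; j *= 2) {
            dp[i] = min(dp[i], dp[i - j] + abs(arr[i] - arr[i - j]));
        }
    }
    return dp[n];
}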
Time complexity is O(n*log(n)).
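The same bottom-up idea as a minimal Java sketch, assuming a 0-indexed, non-empty height array. The names SpidermanJumps and minEnergy are hypothetical.

```java
import java.util.Arrays;

final class SpidermanJumps {
    /** Minimum energy to get from the first building to the last one. */
    static long minEnergy(int[] height) {
        int n = height.length;
        if (n == 0) {
            throw new IllegalArgumentException("need at least one building");
        }
        long[] dp = new long[n];
        Arrays.fill(dp, Long.MAX_VALUE); // "infinity"
        dp[0] = 0;                       // no cost for the starting building
        for (int i = 1; i < n; i++) {
            // Try every predecessor i - step where step is a power of 2.
            // The step = 1 case comes first, so dp[i - step] is always finite here.
            for (int step = 1; step > 0 && step <= i; step <<= 1) {
                long cost = dp[i - step] + Math.abs((long) height[i] - height[i - step]);
                dp[i] = Math.min(dp[i], cost);
            }
        }
        return dp[n - 1];
    }
}
```

For heights {1, 5, 2, 8} the best route costs 7, for example 1 -> 3 -> 4 in the problem's 1-based numbering (energy 1 + 6).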
First, you are doing the check for a power of 2 on the wrong quantity. The jumps have to be between buildings that are separated in index by a power of 2, not that differ in height (which is what you are checking).
Second, the recursion should be formulated in terms of the cost of the first jump and the cost of the remaining jumps (obtained by a recursive call). You are looking for the minimum cost over all legal first jumps. A first jump is legal if it lands on a building whose index is no greater than N and whose distance in index from the current start is a power of 2.
Something like this should work:
int calc(int arr[], int beg, int end)
{
    if (beg == end)
        return 0;
    else if (beg > end)
        throw std::invalid_argument("beg must not exceed end");
    int minEnergy = INT_MAX;
    for (int i = 1;         // start with a step of 1
         beg + i <= end;    // test if we'd go too far
         i <<= 1)           // increase step to next power of 2
    {
        int energy = abs(arr[beg + i] - arr[beg]) // energy of first jump
                   + calc(arr, beg + i, end);     // remaining jumps
        if (energy < minEnergy) {
            minEnergy = energy;
        }
    }
    return minEnergy;
}
The efficiency of this search can be greatly improved by passing the minimum energy obtained so far. Then if abs(arr[beg + i] - arr[beg]) is not less than that quantity, there's no need to do the recursive call, because whatever is found will never be smaller. (In fact, you can cut off the recursion if abs(arr[beg + i] - arr[beg]) + abs(arr[end] - arr[beg + i]) is not smaller than the best solution so far, because Spiderman will have to at least spend abs(arr[end] - arr[beg + i]) after getting to building beg + i.) Adding this improvement is left as an exercise. :)

Non-Recursive Merge Sort

Can someone explain in English how non-recursive merge sort works?
Non-recursive merge sort works by considering window sizes of 1, 2, 4, 8, 16, ..., 2^n over the input array. For each window size ('k' in the code below), all adjacent pairs of windows are merged into a temporary space, then put back into the array.
Here is my single function, C-based, non-recursive merge sort.
Input and output are in 'a'. Temporary storage in 'b'.
One day, I'd like to have a version that was in-place:
float a[50000000],b[50000000];

void mergesort (long num)
{
    int rght, rend;
    int i,j,m;

    for (int k=1; k < num; k *= 2 ) {
        for (int left=0; left+k < num; left += k*2 ) {
            rght = left + k;
            rend = rght + k;
            if (rend > num) rend = num;
            m = left; i = left; j = rght;
            /* merge a[left..rght) and a[rght..rend) into b */
            while (i < rght && j < rend) {
                if (a[i] <= a[j]) {
                    b[m] = a[i]; i++;
                } else {
                    b[m] = a[j]; j++;
                }
                m++;
            }
            /* copy whatever is left of either window */
            while (i < rght) {
                b[m] = a[i];
                i++; m++;
            }
            while (j < rend) {
                b[m] = a[j];
                j++; m++;
            }
            /* put the merged window back into a */
            for (m=left; m < rend; m++) {
                a[m] = b[m];
            }
        }
    }
}
By the way, it is also very easy to prove this is O(n log n). The outer loop over window size grows as power of two, so k has log n iterations. While there are many windows covered by inner loop, together, all windows for a given k exactly cover the input array, so inner loop is O(n). Combining inner and outer loops: O(n)*O(log n) = O(n log n).
Loop through the elements and make every adjacent group of two sorted by swapping the two when necessary.
Now, taking the groups two at a time (any two, most likely adjacent groups, but you could use the first and last groups), merge them into one group by repeatedly selecting the lowest-valued element from either group until all 4 elements are merged into a group of 4. Now you have nothing but groups of 4 plus a possible remainder. Using a loop around the previous logic, do it all again, except this time work in groups of 4. This loop runs until there is only one group.
Quoting from Algorithmist:
Bottom-up merge sort is a non-recursive variant of the merge sort, in which the array is sorted by a sequence of passes. During each pass, the array is divided into blocks of size m. (Initially, m = 1). Every two adjacent blocks are merged (as in normal merge sort), and the next pass is made with a twice larger value of m.
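Both recursive and non-recursive merge sort have the same time complexity, O(n log(n)), because both perform about log2(n) levels of merging and each level touches all n elements. They differ in how the bookkeeping is done:
In the non-recursive (bottom-up) approach, the programmer tracks the block size in a loop, so no stack is needed; a user-defined stack only comes into play if the top-down recursion is simulated iteratively.
In the recursive approach, the stack is used internally by the system to store the return address of the function that is called recursively.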
The main reason you would want to use a non-recursive MergeSort is to avoid recursion stack overflow. I for example am trying to sort 100 million records, each record about 1 kByte in length (= 100 gigabytes), in alphanumeric order. An order(N^2) sort would take 10^16 operations, ie it would take decades to run even at 0.1 microsecond per compare operation. An order (N log(N)) Merge Sort will take less than 10^10 operations or less than an hour to run at the same operational speed. However, in the recursive version of MergeSort, the 100 million element sort results in 50-million recursive calls to the MergeSort( ). At a few hundred bytes per stack recursion, this overflows the recursion stack even though the process easily fits within heap memory. Doing the Merge sort using dynamically allocated memory on the heap-- I am using the code provided by Rama Hoetzlein above, but I am using dynamically allocated memory on the heap instead of using the stack-- I can sort my 100 million records with the non-recursive merge sort and I don't overflow the stack. An appropriate conversation for website "Stack Overflow"!
PS: Thanks for the code, Rama Hoetzlein.
PPS: 100 gigabytes on the heap?!! Well, it's a virtual heap on a Hadoop cluster, and the MergeSort will be implemented in parallel on several machines sharing the load...
I am new here.
I have modified Rama Hoetzlein's solution (thanks for the ideas). My merge sort does not use the final copy-back loop, and it falls back on insertion sort for small blocks. I benchmarked it on my laptop and it was the fastest in my tests, even faster than the recursive version. It is written in Java and sorts into ascending order. It is iterative, of course, and it could be made multithreaded. The code has become complex, so if anyone is interested, please have a look.
Code :
int num = input_array.length;
int left = 0;
int right;
int temp;
int LIMIT = 16;
if (num <= LIMIT) {
    // Single Insertion Sort
    right = 1;
    while (right < num) {
        temp = input_array[right];
        while ((left > (-1)) && (input_array[left] > temp)) {
            input_array[left + 1] = input_array[left--];
        }
        input_array[left + 1] = temp;
        left = right;
        right++;
    }
} else {
    int i;
    int j;
    // Fragmented Insertion Sort: sort each block of LIMIT elements
    right = LIMIT;
    while (right <= num) {
        i = left + 1;
        j = left;
        while (i < right) {
            temp = input_array[i];
            while ((j >= left) && (input_array[j] > temp)) {
                input_array[j + 1] = input_array[j--];
            }
            input_array[j + 1] = temp;
            j = i;
            i++;
        }
        left = right;
        right = right + LIMIT;
    }
    // Remainder Insertion Sort: sort the last, shorter block
    i = left + 1;
    j = left;
    while (i < num) {
        temp = input_array[i];
        while ((j >= left) && (input_array[j] > temp)) {
            input_array[j + 1] = input_array[j--];
        }
        input_array[j + 1] = temp;
        j = i;
        i++;
    }
    // Rama Hoetzlein method
    int[] temp_array = new int[num];
    int[] swap;
    int k = LIMIT;
    while (k < num) {
        left = 0;
        i = k; // The mid point
        right = k << 1;
        while (i < num) {
            if (right > num) {
                right = num;
            }
            temp = left;
            j = i;
            while ((left < i) && (j < right)) {
                if (input_array[left] <= input_array[j]) {
                    temp_array[temp++] = input_array[left++];
                } else {
                    temp_array[temp++] = input_array[j++];
                }
            }
            while (left < i) {
                temp_array[temp++] = input_array[left++];
            }
            while (j < right) {
                temp_array[temp++] = input_array[j++];
            }
            // Do not copy back the elements to input_array
            left = right;
            i = left + k;
            right = i + k;
        }
        // Instead of copying back in previous loop, copy remaining elements to temp_array, then swap the array pointers
        while (left < num) {
            temp_array[left] = input_array[left++];
        }
        swap = input_array;
        input_array = temp_array;
        temp_array = swap;
        k <<= 1;
    }
}
return input_array;
Just in case anyone's still lurking in this thread ... I've adapted Rama Hoetzlein's non-recursive merge sort algorithm above to sort doubly linked lists. This new sort is in-place and stable, and it avoids the time-costly list-dividing code found in other linked-list merge sort implementations.
// MergeSort.cpp
// Angus Johnson 2017
// License: Public Domain
#include "io.h"
#include "time.h"
#include "stdlib.h"
struct Node {
int data;
Node *next;
Node *prev;
Node *jump;
inline void Move2Before1(Node *n1, Node *n2)
Node *prev, *next;
//extricate n2 from linked-list ...
prev = n2->prev;
next = n2->next;
prev->next = next; //nb: prev is always assigned
if (next) next->prev = prev;
//insert n2 back into list ...
prev = n1->prev;
if (prev) prev->next = n2;
n1->prev = n2;
n2->prev = prev;
n2->next = n1;
void MergeSort(Node *&nodes)
Node *first, *second, *base, *tmp, *prev_base;
if (!nodes || !nodes->next) return;
int mul = 1;
for (;;) {
first = nodes;
prev_base = NULL;
//sort each successive mul group of nodes ...
while (first) {
if (mul == 1) {
second = first->next;
if (!second) {
first->jump = NULL;
first->jump = second->next;
second = first->jump;
if (!second) break;
first->jump = second->jump;
base = first;
int cnt1 = mul, cnt2 = mul;
//the following 'if' condition marginally improves performance
//in an unsorted list but very significantly improves
//performance when the list is mostly sorted ...
if (second->data < second->prev->data)
while (cnt1 && cnt2) {
if (second->data < first->data) {
if (first == base) {
if (prev_base) prev_base->jump = second;
base = second;
base->jump = first->jump;
if (first == nodes) nodes = second;
tmp = second->next;
Move2Before1(first, second);
second = tmp;
if (!second) { first = NULL; break; }
first = first->next;
} //while (cnt1 && cnt2)
first = base->jump;
prev_base = base;
} //while (first)
if (!nodes->jump) break;
else mul <<= 1;
} //for (;;)
void InsertNewNode(Node *&head, int data)
Node *tmp = new Node;
tmp->data = data;
tmp->next = NULL;
tmp->prev = NULL;
tmp->jump = NULL;
if (head) {
tmp->next = head;
head->prev = tmp;
head = tmp;
else head = tmp;
void ClearNodes(Node *head)
if (!head) return;
while (head) {
Node *tmp = head;
head = head->next;
delete tmp;
int main()
Node *nodes = NULL, *n;
const int len = 1000000; //1 million nodes
for (int i = 0; i < len; i++)
InsertNewNode(nodes, rand() >> 4);
clock_t t = clock();
MergeSort(nodes); //~1/2 sec for 1 mill. nodes on Pentium i7.
t = clock() - t;
printf("Sort time: %d msec\n\n", t * 1000 / CLOCKS_PER_SEC);
n = nodes;
while (n)
if (n->prev && n->data < n->prev->data) {
printf("oops! sorting's broken\n");
n = n->next;
printf("All done!\n\n");
return 0;
Edited 2017-10-27: Fixed a bug affecting odd numbered lists
Any interest in this anymore? Probably not. Oh well. Here goes nothing.
The insight of merge-sort is that you can merge two (or several) small sorted runs of records into one larger sorted run, and you can do so with simple stream-like operations "read first/next record" and "append record" -- which means you don't need a big data set in RAM at once: you can get by with just two records, each taken from a distinct run. If you can just keep track of where in your file the sorted runs start and end, you can simply merge pairs of adjacent runs (into a temp file) repeatedly until the file is sorted: this takes a logarithmic number of passes over the file.
A single record is trivially sorted: each time you merge two adjacent runs, the size of each run doubles. So that's one way to keep track. The other is to work on a priority queue of runs. Take the two smallest runs from the queue, merge them, and enqueue the result -- until there is only one remaining run. This is appropriate if you expect your data to naturally start with sorted runs.
In practice with enormous data sets you'll want to exploit the memory hierarchy. Suppose you have gigabytes of RAM and terabytes of data. Why not merge a thousand runs at once? Indeed you can do this, and a priority-queue of runs can help. That will significantly decrease the number of passes you have to make over a file to get it sorted. Some details are left as an exercise for the reader.
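As a rough illustration of merging many runs at once, here is a minimal sketch of a k-way merge driven by a priority queue. It assumes the runs are small in-memory int arrays standing in for sorted runs on disk; the names (KWayMerge, merge) are hypothetical, and a real external sort would stream records from files instead.

```java
import java.util.List;
import java.util.PriorityQueue;

final class KWayMerge {
    /** Merges any number of individually sorted runs into one sorted array. */
    static int[] merge(List<int[]> runs) {
        // Each heap entry is {value, run index, position within that run}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        int total = 0;
        for (int r = 0; r < runs.size(); r++) {
            int[] run = runs.get(r);
            total += run.length;
            if (run.length > 0) {
                heap.add(new int[] {run[0], r, 0}); // "read first record" of each run
            }
        }
        int[] out = new int[total];
        int k = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            out[k++] = head[0];                     // "append record"
            int[] run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.length) {
                heap.add(new int[] {run[next], head[1], next}); // "read next record"
            }
        }
        return out;
    }
}
```

Only one record per run is held in the heap at a time, so memory use grows with the number of runs rather than with the amount of data.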

Array of size n, with one element n/2 times

Given an array of n integers in which one element appears more than n/2 times, we need to find that element in linear time and constant extra space.
YAAQ: Yet another arrays question.
I have a sneaking suspicion it's something along the lines of (in C#)
// We don't need an array
public int FindMostFrequentElement(IEnumerable<int> sequence)
{
    // Initial value is irrelevant if sequence is non-empty,
    // but keeps compiler happy.
    int best = 0;
    int count = 0;

    foreach (int element in sequence)
    {
        if (count == 0)
        {
            best = element;
            count = 1;
        }
        else
        {
            // Vote current choice up or down
            count += (best == element) ? 1 : -1;
        }
    }
    return best;
}
It sounds unlikely to work, but it does. (Proof as a postscript file, courtesy of Boyer/Moore.)
Find the median; this takes O(n) time on an unsorted array (for example, with a selection algorithm). Since more than n/2 elements are equal to the same value, the median is equal to that value as well.
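A minimal sketch of the median idea, assuming a non-empty int array that may be reordered in place. It uses quickselect with a random pivot, which runs in expected (not worst-case) linear time; a worst-case linear bound would need something like median of medians. The names MedianMajority, select and majorityViaMedian are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

final class MedianMajority {
    /** Returns the value that would sit at index k if a were sorted; reorders a. */
    static int select(int[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            int pivot = a[ThreadLocalRandom.current().nextInt(lo, hi + 1)];
            // Three-way partition: a[lo..lt) < pivot, a[lt..gt] == pivot, a(gt..hi] > pivot.
            int lt = lo, i = lo, gt = hi;
            while (i <= gt) {
                if (a[i] < pivot) {
                    swap(a, lt++, i++);
                } else if (a[i] > pivot) {
                    swap(a, i, gt--);
                } else {
                    i++;
                }
            }
            if (k < lt) {
                hi = lt - 1;
            } else if (k > gt) {
                lo = gt + 1;
            } else {
                return pivot; // k falls inside the block equal to the pivot
            }
        }
        return a[lo];
    }

    /** If some value fills more than half the array, it must occupy index n/2 once sorted. */
    static int majorityViaMedian(int[] a) {
        return select(a, a.length / 2);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

The three-way partition keeps long runs of equal values, which is exactly the input here, from degrading the selection.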
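/* Returns 1 and stores the leader in *result if some value occurs more than
   n/2 times; returns 0 otherwise (so a leader equal to 0 is not ambiguous). */
int findLeader(int n, const int* x, int* result){
    int leader, c = 1, i;
    if(n <= 0) return 0;
    leader = x[0];
    /* first part: find a candidate */
    for(i=1; i<n; i++){
        if(c == 0){
            leader = x[i];
            c = 1;
        } else {
            if(x[i] == leader) c++;
            else c--;
        }
    }
    if(c == 0) return 0;
    /* second part: verify the candidate */
    c = 0;
    for(i=0; i<n; i++){
        if(x[i] == leader) c++;
    }
    if(c > n/2){
        *result = leader;
        return 1;
    }
    return 0;
}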
I'm not the author of this code, but this will work for your problem. The first part looks for a potential leader, the second checks if it appears more than n/2 times in the array.
This is what I thought initially.
I made an attempt to keep the invariant "one element appears more than n/2 times", while reducing the problem set.
Let's start by comparing a[i] and a[i+1]. If they're equal, we compare a[i+1] and a[i+2]. If not, we remove both a[i] and a[i+1] from the array. We repeat this until i >= (current size)/2. At this point we'll have 'THE' element occupying the first (current size)/2 positions.
This would maintain the invariant.
The only caveat is that we assume that the array is in a linked list [for it to give a O(n) complexity.]
What say folks?
Well you can do an inplace radix sort as described here[pdf] this takes no extra space and linear time. then you can make a single pass counting consecutive elements and terminating at count > n/2.
How about:
randomly select a small subset of K elements and look for duplicates (e.g. first 4, first 8, etc). If K == 4 then the probability of not getting at least 2 of the duplicates is 1/8. if K==8 then it goes to under 1%. If you find no duplicates repeat the process until you do. (assuming that the other elements are more randomly distributed, this would perform very poorly with, say, 49% of the array = "A", 51% of the array ="B").
select a fixed size subset.
return the most common element in that subset
if there is no element with more than 1 occurrence repeat.
if there is more than 1 element with more than 1 occurrence call findDuplicate and choose the element the 2 calls have in common
This is a constant order operation (if the data set isn't bad) so then do a linear scan of the array in order(N) to verify.
My first thought (not sufficient) would be to:
Sort the array in place
Return the middle element
But that would be O(n log n), as would any recursive solution.
If you can destructively modify the array (and various other conditions apply) you could do a pass replacing elements with their counts or something. Do you know anything else about the array, and are you allowed to modify it?
Edit Leaving my answer here for posterity, but I think Skeet's got it.
In PHP (please check whether it's correct):
function arrLeader( $A ){
    $len = count($A);
    // array_count_values() returns an array with the elements as keys
    // and the number of occurrences of each element as values
    $counts = array_count_values($A);
    foreach ($counts as $val => $occurrences) {
        if ($occurrences > $len / 2) {
            return $val;
        }
    }
    return -1;
}
int n = A.Length;
int[] L = new int[n + 1];
L[0] = -1;
for (int i = 0; i < n; i++)
L[i + 1] = A[i];
int count = 0;
int pos = (n + 1) / 2;
int candidate = L[pos];
for (int i = 1; i <= n; i++)
if (L[i] == candidate && L[pos++] == candidate)
return candidate;
if (count > pos)
return candidate;
return (-1);
