Big O of BubbleSort on a simple list of 5 values - algorithm

I believe that a BubbleSort is of the order O(n^2). As I read previous postings, this has to do with nested iteration. But when I dry run a simple unsorted list, (see below), I have the list sorted in 10 comparisons.
In my example, here is my list of integer values:
5 4 3 2 1
To get 5 into position, I did n-1 swap operations. (4)
To get 4 into position, I did n-2 swap operations. (3)
To get 3 into position, I did n-3 swap operations. (2)
To get 2 into position, I did n-4 swap operations. (1)
I can't see where (n^2) comes from, as when I have a list of n=5 items, I only need 10 swap operations.
BTW, I've seen (n-1).(n-1) which doesn't make sense to me, as this would give 16 swap operations.
I'm only concerned with basic BubbleSort...a simple nested FOR loop, in the interest of simplicity and clarity.

You don't seem to understand the concept of big O notation very
well. It refers to how the number of operations or the time grows in
relation to the size of the input, asymptotically, considering only the
fastest-growing term, and without considering the constant of
proportionality.
A single measurement like your 5:10 result is completely meaningless.
Imagine looking for a function that maps 5 to 10. Is it 2N? N + 5? 4N –
10? 0.4N2? N2 – 15? 4 log5N + 6? The
possibilities are limitless.
Instead, you have to analyze the algorithm to see how the number of
operations grows as N does, or measure the operations or time over many
runs, using various values of N and the most general datasets you can
devise. Note that your test case is not general at all: when checking
the average performance of a sorting algorithm, you want the input to be
in random order (the most likely case), not sorted or reverse-sorted.

If you wan to precise there are (n)*(n-1)/2 operations because you are actually computing n+(n-1)+(n-2)+...+1 as the first element needs n swaps, second element need n-1 swaps and so on. So the algorithm is of O(1/2 * (n^2) - n) which in asymptotic notations is equal to O(n^2). But what actually is happening in bubble sort is different. In bubble sort you perform a pass on array and swap the misplaced neighbors place, until there is no misplacement which means the array has become sorted. As each pass on array takes O(n) time and in the worst case you have to perform n passes so the algorithm is of O(n^2). Note that we are counting the number of comparisons not the number of swaps.
There are two version of bubble sort mentioned in wikipedia:
procedure bubbleSort( A : list of sortable items )
n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
/* if this pair is out of order */
if A[i-1] > A[i] then
/* swap them and remember something changed */
swap( A[i-1], A[i] )
swapped = true
end if
end for
until not swapped
end procedure
This version perform (n-1)*(n-1) comparison -> O(n^2)
Optimizing bubble sort
The bubble sort algorithm can be easily
optimized by observing that the n-th pass finds the n-th largest
element and puts it into its final place. So, the inner loop can avoid
looking at the last n-1 items when running for the n-th time:
procedure bubbleSort( A : list of sortable items )
n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
if A[i-1] > A[i] then
swap(A[i-1], A[i])
swapped = true
end if
end for
n = n - 1
until not swapped
end procedure
This version performs (n-1)+(n-2)+(n-3)+...+1 operations which is (n-1)(n-2)/2 comparisons -> O(n^2)

Related

find 4th smallest element in linear time

So i had an exercise given to me about 2 months ago, that says the following:
Given n (n>=4) distinct elements, design a divide & conquer algorithm to compute the 4th smallest element. Your algorithm should run in linear time in the worst case.
I had an extremely hard time with this problem, and could only find relevant algorithms that runs in the worst case O(n*k). After several weeks of trying, we managed, with the help of our teacher, "solve" this problem. The final algorithm is as follows:
Rules: The input size can only be of size 2^k
(1): Divide input into n/2. One left array, one right array.
(2): If input size == 4, sort the arrays using merge sort.
(2.1) Merge left array with right array into a new result array with length 4.
(2.2) Return element at index [4-1]
(3): Repeat step 1
This is solved recursively and our base case is at step 2. Step 2.2 means that for all
of our recursive calls that we did, we will get a final result array of length 4, and at that
point, we can justr return the element at index [4-1].
With this algorithm, my teacher claims that this runs in linear time. My problem with that statement is that we are diving the input until we reach sub-arrays with an input size of 4, and then that is sorted. So for an input size of 8, we would sort 2 sub-arrays with length 4, since 8/4 = 2. How is this in any case linear time? We are still sorting the whole input size but in blocks aren't we? This really does not make sense to me. It doesn't matter if we sort the whole input size at it is, or divide it into sub-arrays with size of 4,and sort them like that? It will still be a worst time of O(n*log(n))?
Would appreciate some explanations on this !
To make proving that algorithm runs in linear time, let's modify it a bit (we will only change an order of dividing and merging blocks, nothing more):
(1): Divide input into n/4 blocks, each has size 4.
(2): Until there is more than one block, repeat:
Merge each pair of adjacent blocks into one block of size 4.
(For example, if we have 4 blocks, we will split them in 2 pairs -
first pair contains first and second blocks,
second pair contains third and fourth blocks.
After merging we will have 2 blocks -
the first one contains 4 least elements from blocks 1 and 2,
the second one contains 4 least elements from blocks 3 and 4).
(3): The answer is the last element of that one block left.
Proof: It's a fact that array of constant length (in your case, 4) can be sorted in constant time. Let k = log(n). Loop (2) runs k-2 iterations (on each iteration the count of elements left is divided by 2, until 4 elements are left).
Before i-th iteration (0 <= i < k-2) there are (2^(k-i)) elements left, so there are 2^(k-i-2) blocks and we will merge 2^(k-i-3) pairs of blocks. Let's find how many pairs we will merge in all iterations. Count of merges equals
mergeOperationsCount = 2^(k-3) + 2^(k-4) + .... + 2^(k-(k-3)) =
= 2^(k-3) * (1 + 1/2 + 1/4 + 1/8 + .....) < 2^(k-2) = O(2^k) = O(n)
Since we can merge each pair in constant time (because thay have constant size), and the only operation we make is merging pairs, the algorithm runs in O(n).
And after this proof, I want to notice that there is another linear algorithm which is trivial, but it is not divide-and-conquer.

Merge Sort Complexity Confusion

Can someone explain to me in plain english how Merge Sort is O(n*logn). I know that the 'n' comes from the fact that it takes n appends to merge two sorted lists of size n/2. What confuses me is the log. If we were to draw a tree of the function calls of running Merge Sort on a 32 element list, then it would have 5 levels. Log2(32)= 5. That makes sense, however, why do we use the levels of the tree, rather than the actual function calls and merges in the Big O definition ?
In this diagram we can see that for an 8 element list, there are 3 levels. In this context, Big O is trying to find how the number of operations behaves as the input increases, my question is how are the levels (of function calls) considered operations?
The levels of function calls are considered like this(in the book [introduction to algorithms](https://mitpress.mit.edu/books/introduction-algorithms Chapter 2.3.2):
We reason as follows to set up the recurrence for T(n), the worst-case running time of merge sort on n numbers. Merge sort on just one element takes constant time. When we have n > 1 elements, we break down the running time as follows.
Divide: The divide step just computes the middle of the subarray, which takes constant time. Thus, D(n) = Θ(1).
Conquer: We recursively solve two subproblems, each of size n/2, which contributes 2T(n/2) to the running time.
Combine: We have already noted that the MERGE procedure on an n-element subarray takes time Θ(n), and so C(n) = Θ(n).
When we add the functions D(n) and C(n) for the merge sort analysis, we are adding a function that is Θ(n) and a function that is Θ(1). This sum is a linear function of n, that is, Θ(n). Adding it to the 2T(n/2) term from the “conquer” step gives the recurrence for the worst-case running time T(n) of merge sort:
T(n) = Θ(1), if n = 1; T(n) = 2T(n/2) + Θ(n), if n > 1.
Then using the recursion tree or the master theorem, we can calculate:
T(n) = Θ(nlgn).
Simple analysis:-
Say length of array is n to be sorted.
Now every time it will be divided into half.
So, see as under:-
n
n/2 n/2
n/4 n/4 n/4 n/4
............................
1 1 1 ......................
As you can see height of tree will be logn( 2^k = n; k = logn)
At every level sum will be n. (n/2 +n/2 = n, n/4+n/4+n/4+n/4 = n).
So finally levels = logn and every level takes n
combining we get nlogn
Now regarding your question, how levels are considered operations, consider as under:-
array 9, 5, 7
suppose its split into 9,5 and 7
for 9,5 it will get converted to 5,9 (at this level one swap required)
then in upper level 5,9 and 7 while merging gets converted to 5,7,9
(again at this level one swap required).
In worst case on any level number operations can be O(N) and number of levels logn. Hence nlogn.
For more clarity try to code merge sort, you will be able to visualise it.
Let's take your 8-item array as an example. We start with [5,3,7,8,6,2,1,4].
As you noted, there are three passes. In the first pass, we merge 1-element subarrays. In this case, we'd compare 5 with 3, 7 with 8, 2 with 6, and 1 with 4. Typical merge sort behavior is to copy items to a secondary array. So every item is copied; we just change the order of adjacent items when necessary. After the first pass, the array is [3,5,7,8,2,6,1,4].
On the next pass, we merge two-element sequences. So [3,5] is merged with [7,8], and [2,6] is merged with [1,4]. The result is [3,5,7,8,1,2,4,6]. Again, every element was copied.
In the final pass the algorithm again copies every item.
There are log(n) passes, and at every pass all n items are copied. (There are also comparisons, of course, but the number is linear and no more than the number of items.) Anyway, if you're doing n operations log(n) times, then the algorithm is O(n log n).

Why is merge sort worst case run time O (n log n)?

Can someone explain to me in simple English or an easy way to explain it?
The Merge Sort use the Divide-and-Conquer approach to solve the sorting problem. First, it divides the input in half using recursion. After dividing, it sort the halfs and merge them into one sorted output. See the figure
It means that is better to sort half of your problem first and do a simple merge subroutine. So it is important to know the complexity of the merge subroutine and how many times it will be called in the recursion.
The pseudo-code for the merge sort is really simple.
# C = output [length = N]
# A 1st sorted half [N/2]
# B 2nd sorted half [N/2]
i = j = 1
for k = 1 to n
if A[i] < B[j]
C[k] = A[i]
i++
else
C[k] = B[j]
j++
It is easy to see that in every loop you will have 4 operations: k++, i++ or j++, the if statement and the attribution C = A|B. So you will have less or equal to 4N + 2 operations giving a O(N) complexity. For the sake of the proof 4N + 2 will be treated as 6N, since is true for N = 1 (4N +2 <= 6N).
So assume you have an input with N elements and assume N is a power of 2. At every level you have two times more subproblems with an input with half elements from the previous input. This means that at the the level j = 0, 1, 2, ..., lgN there will be 2^j subproblems with an input of length N / 2^j. The number of operations at each level j will be less or equal to
2^j * 6(N / 2^j) = 6N
Observe that it doens't matter the level you will always have less or equal 6N operations.
Since there are lgN + 1 levels, the complexity will be
O(6N * (lgN + 1)) = O(6N*lgN + 6N) = O(n lgN)
References:
Coursera course Algorithms: Design and Analysis, Part 1
On a "traditional" merge sort, each pass through the data doubles the size of the sorted subsections. After the first pass, the file will be sorted into sections of length two. After the second pass, length four. Then eight, sixteen, etc. up to the size of the file.
It's necessary to keep doubling the size of the sorted sections until there's one section comprising the whole file. It will take lg(N) doublings of the section size to reach the file size, and each pass of the data will take time proportional to the number of records.
After splitting the array to the stage where you have single elements i.e. call them sublists,
at each stage we compare elements of each sublist with its adjacent sublist. For example, [Reusing #Davi's image
]
At Stage-1 each element is compared with its adjacent one, so n/2 comparisons.
At Stage-2, each element of sublist is compared with its adjacent sublist, since each sublist is sorted, this means that the max number of comparisons made between two sublists is <= length of the sublist i.e. 2 (at Stage-2) and 4 comparisons at Stage-3 and 8 at Stage-4 since the sublists keep doubling in length. Which means the max number of comparisons at each stage = (length of sublist * (number of sublists/2)) ==> n/2
As you've observed the total number of stages would be log(n) base 2
So the total complexity would be == (max number of comparisons at each stage * number of stages) == O((n/2)*log(n)) ==> O(nlog(n))
Algorithm merge-sort sorts a sequence S of size n in O(n log n)
time, assuming two elements of S can be compared in O(1) time.
This is because whether it be worst case or average case the merge sort just divide the array in two halves at each stage which gives it lg(n) component and the other N component comes from its comparisons that are made at each stage. So combining it becomes nearly O(nlg n). No matter if is average case or the worst case, lg(n) factor is always present. Rest N factor depends on comparisons made which comes from the comparisons done in both cases. Now the worst case is one in which N comparisons happens for an N input at each stage. So it becomes an O(nlg n).
Many of the other answers are great, but I didn't see any mention of height and depth related to the "merge-sort tree" examples. Here is another way of approaching the question with a lot of focus on the tree. Here's another image to help explain:
Just a recap: as other answers have pointed out we know that the work of merging two sorted slices of the sequence runs in linear time (the merge helper function that we call from the main sorting function).
Now looking at this tree, where we can think of each descendant of the root (other than the root) as a recursive call to the sorting function, let's try to assess how much time we spend on each node... Since the slicing of the sequence and merging (both together) take linear time, the running time of any node is linear with respect to the length of the sequence at that node.
Here's where tree depth comes in. If n is the total size of the original sequence, the size of the sequence at any node is n/2i, where i is the depth. This is shown in the image above. Putting this together with the linear amount of work for each slice, we have a running time of O(n/2i) for every node in the tree. Now we just have to sum that up for the n nodes. One way to do this is to recognize that there are 2i nodes at each level of depth in the tree. So for any level, we have O(2i * n/2i), which is O(n) because we can cancel out the 2is! If each depth is O(n), we just have to multiply that by the height of this binary tree, which is logn. Answer: O(nlogn)
reference: Data Structures and Algorithms in Python
The recursive tree will have depth log(N), and at each level in that tree you will do a combined N work to merge two sorted arrays.
Merging sorted arrays
To merge two sorted arrays A[1,5] and B[3,4] you simply iterate both starting at the beginning, picking the lowest element between the two arrays and incrementing the pointer for that array. You're done when both pointers reach the end of their respective arrays.
[1,5] [3,4] --> []
^ ^
[1,5] [3,4] --> [1]
^ ^
[1,5] [3,4] --> [1,3]
^ ^
[1,5] [3,4] --> [1,3,4]
^ x
[1,5] [3,4] --> [1,3,4,5]
x x
Runtime = O(A + B)
Merge sort illustration
Your recursive call stack will look like this. The work starts at the bottom leaf nodes and bubbles up.
beginning with [1,5,3,4], N = 4, depth k = log(4) = 2
[1,5] [3,4] depth = k-1 (2^1 nodes) * (N/2^1 values to merge per node) == N
[1] [5] [3] [4] depth = k (2^2 nodes) * (N/2^2 values to merge per node) == N
Thus you do N work at each of k levels in the tree, where k = log(N)
N * k = N * log(N)
MergeSort algorithm takes three steps:
Divide step computes mid position of sub-array and it takes constant time O(1).
Conquer step recursively sort two sub arrays of approx n/2 elements each.
Combine step merges a total of n elements at each pass requiring at most n comparisons so it take O(n).
The algorithm requires approx logn passes to sort an array of n elements and so total time complexity is nlogn.
lets take an example of 8 element{1,2,3,4,5,6,7,8} you have to first divide it in half means n/2=4({1,2,3,4} {5,6,7,8}) this two divides section take 0(n/2) and 0(n/2) times so in first step it take 0(n/2+n/2)=0(n)time.
2. Next step is divide n/22 which means (({1,2} {3,4} )({5,6}{7,8})) which would take
(0(n/4),0(n/4),0(n/4),0(n/4)) respectively which means this step take total 0(n/4+n/4+n/4+n/4)=0(n) time.
3. next similar as previous step we have to divide further second step by 2 means n/222 ((({1},{2},{3},{4})({5},{6},{7},{8})) whose time is 0(n/8+n/8+n/8+n/8+n/8+n/8+n/8+n/8)=0(n)
which means every step takes 0(n) times .lets steps would be a so time taken by merge sort is 0(an) which mean a must be log (n) because step will always divide by 2 .so finally TC of merge sort is 0(nlog(n))

random merge sort

I was given the following question in an algorithms book:
Suppose a merge sort is implemented to split a file at a random position, rather then exactly in the middle. How many comparisons would be used by such method to sort n elements on average?
Thanks.
To guide you to the answer, consider these more specific questions:
Assume the split is always at 10%, or 25%, or 75%, or 90%. In each case: what's the impact on recursion depths? How many comparisons need to be per recursion level?
I'm partially agree with #Armen, they should be comparable.
But: consider the case when they are split in the middle. To merge two lists of lengths n we would need 2*n - 1 comparations (sometimes less, but we'll consider it fixed for simplicity), each of them producing the next value. There would be log2(n) levels of merges, that gives us approximately n*log2(n) comparations.
Now considering the random-split case: The maximum number of comparations needed to merge a list of length n1 with one of length n2 will be n1 + n2 - 1. Howerer, the average number will be close to it, because even for the most unhappy split 1 and n-1 we'll need an average of n/2 comparations. So we can consider that the cost of merging per level will be the same as in even case.
The difference is that in random case the number of levels will be larger, and we can consider that n for next level would be max(n1, n2) instead of n/2. This max(n1, n2) will tend to be 3*n/4, that gives us the approximate formula
n*log43(n) // where log43 is log in base 4/3
that gives us
n * log2(n) / log2(4/3) ~= 2.4 * n * log2(n)
This result is still larger than the correct one because we ignored that the small list will have fewer levels, but it should be close enough. I suppose that the correct answer will be the number of comparations on average will double
You can get an upper bound of 2n * H_{n - 1} <= 2n ln n using the fact that merging two lists of total length n costs at most n comparisons. The analysis is similar to that of randomized quicksort (see http://www.cs.cmu.edu/afs/cs/academic/class/15451-s07/www/lecture_notes/lect0123.pdf).
First, suppose we split a list of length n into 2 lists L and R. We will charge the first element of R for a comparison against all of the elements of L, and the last element of L for a comparison against all elements of R. Even though these may not be the exact comparisons that are executed, the total number of comparisons we are charging for is n as required.
This handles one level of recursion, but what about the rest? We proceed by concentrating only on the "right-to-left" comparisons that occur between the first element of R and every element of L at all levels of recursion (by symmetry, this will be half the actual expected total). The probability that the jth element is compared to the ith element is 1/(j - i) where j > i. To see this, note that element j is compared with element i exactly when it is the first element chosen as a "splitting element" from among the set {i + 1,..., j}. That is, elements i and j are split into two lists as soon as the list they are in is split at some element from {i + 1,..., j}, and element j is charged for a comparison with i exactly when element j is the element that is chosen from this set.
Thus, the expected total number of comparisons involving j is at most H_n (i.e., 1 + 1/2 + 1/3..., where the number of terms is at most n - 1). Summing across all possible j gives n * H_{n - 1}. This only counted "right-to-left" comparisons, so the final upper bound is 2n * H_{n - 1}.

What is the complexity of this specialized sort

I would like to know the complexity (as in O(...) ) of the following sorting algorithm:
There are B barrels
that contain a total of N elements, spread unevenly across the barrels.
The elements in each barrel are already sorted.
The sort combines all the elements from each barrel in a single sorted list:
using an array of size B to store the last sorted element of each barrel (starting at 0)
check each barrel (at the last stored index) and find the smallest element
copy the element in the final sorted array, increment the array counter
increment the last sorted element for the barrel we picked from
perform those steps N times
or in pseudo code:
for i from 0 to N
smallest = MAX_ELEMENT
foreach b in B
if bIndex[b] < b.length && b[bIndex[b]] < smallest
smallest_barrel = b
smallest = b[bIndex[b]]
result[i] = smallest
bIndex[smallest_barrel] += 1
I thought that the complexity would be O(n), but the problem I have with finding the complexity is that if B grows, it has a larger impact than if N grows because it adds another round in the B loop. But maybe that has no effect on the complexity?
Since you already have pseudo-code, finding the complexity is easy.
You have 2 nested loops. The outer one always runs N-1 times, the inner always |B|, where |B| is the size of the set B (number of barrels). Therefore the complexity is (N-1)*|B| which is O(NB).
You are correct that the number of barrels changes the complexity. Just look at Your pseudocode. For every element You are about to pick, You have to search the array of candidates, the length of which is B. So You are performing an operation of complexity O(B) N times

Resources