For a doubly linked list Q that holds some number of elements, with pointers to its first and last elements, we define two operations.
Delete(k): delete the first k elements from Q.
Append(c): check the last element of `Q`; if its value is bigger than c, delete it and repeat until the last element is lower than or equal to `c` (or `Q` is empty), then insert c as the last element of `Q`.
If we repeat sequences of these two operations in arbitrary order n times on an initially empty list Q, the sum of the costs of these operations is close to 2n. Why does my instructor arrive at 2n? Any hint or idea is appreciated.
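For concreteness, here is a minimal sketch of the two operations as described above, using Python's collections.deque as a stand-in for the doubly linked list (the function names and the deque choice are my own, not from the question):

from collections import deque

def delete_k(q, k):
    # Delete(k): remove the first k elements from q (or fewer if q is shorter).
    for _ in range(min(k, len(q))):
        q.popleft()

def append_c(q, c):
    # Append(c): pop elements larger than c off the tail, then push c at the end.
    while q and q[-1] > c:
        q.pop()
    q.append(c)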
When we "repeat Delete and Append in arbitrary order for n times on empty list Q", Append is called n times; hence exactly n list element insertions are performed.
Since the list is initially empty, every element that gets deleted must first have been inserted; with at most n insertions, at most n list element deletions are performed in the combination of Delete and Append.
Hence the total number of loop iterations across all calls of Delete and Append (including reads in Append) is no more than 2n.
So all in all, no section of the program is executed more than 2n times (counting separately code that may be common to list element insertion, list element deletion, and list element access).
The cost is minimal when k is always 0 and c is non-decreasing (including always 0): we have n list element insertions, n list element reads (one returning empty), n emptiness tests, n-1 element comparisons, and no deletion. The cost thus varies significantly with the parameters.
Note: "Sum of all cost of theses operation is close to 2n" is ill-defined, thus not even wrong. Worse, if list element deletion, by some bad luck (e.g. code cache miss, debug code..) was much slower than the rest, it could be that the code duration vary by a large factor (much higher than 2) depending on parameters. Hence execution time is NOT ALWAYS "about 2n" for any lousing meaning of that.
Update: in a comment, we are told that list element insertion and deletion each have cost 1. There are n list element insertions, and between 0 and n list element deletions. Hence, if we neglect the other costs (which is reasonable if memory allocation cost dominates), the total cost is about n to about 2n depending on parameters. Further, for many parameters (including k>=1 most of the time), there are nearly n list element deletions, hence the cost is about 2n if one insists on a best guess, such as in a multiple choice question with (a) n+k (b) n (c) 2n (d) 3n as the only options.
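One quick way to convince yourself of the 2n bound is to simulate random operation sequences and count unit-cost insertions and deletions. This little script is my own (the operation mix and parameter ranges are arbitrary); the printed check always holds because insertions are at most n and every deletion removes a previously inserted element:

import random

def simulate(n):
    q, cost = [], 0
    for _ in range(n):
        if random.random() < 0.5:            # Delete(k) with a random small k
            k = random.randint(0, 3)
            removed = min(k, len(q))
            del q[:removed]
            cost += removed                   # one unit per deleted element
        else:                                 # Append(c) with a random c
            c = random.randint(0, 100)
            while q and q[-1] > c:
                q.pop()
                cost += 1                     # deletions triggered by Append
            q.append(c)
            cost += 1                         # the insertion itself
    return cost

n = 10000
print(simulate(n) <= 2 * n)                   # True on every run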
if we repeat these operations in arbitrary order for n times on an empty list Q, the sum of all costs of these operations is close to 2n
It will actually be O(n) since we know the list Q is empty.
DoStuff(list Q, int n):
    for (int i = 0; i < n; i++)
        Q.Delete(k)    // O(k)
        Q.Append(c)    // O(sizeof(Q))
        // or
        // Q.Append(c) // O(sizeof(Q))
        // Q.Delete(k) // O(k)
Where n is the number of iterations.
Now say the list isn't empty; then we would have O(n*(sizeof(Q)+k)). The explanation is below:
The worst-case scenario for Delete(k) is when k equals the size of Q, so we delete every element. Still, it is more accurate to say O(k), because you always delete only the first k elements.
The worst-case scenario for Append(c) is when every element in Q is greater than the value c. It would then start from the tail node and delete every node in Q.
In either order
Delete(k) //O(k)
Append(c) //O(sizeof(Q))
Or
Append(c) //O(sizeof(Q))
Delete(k) //O(k)
The worst case for just these two operations is therefore O(sizeof(Q)+k). Since we do this for n iterations, we finally get O(n*(sizeof(Q)+k)).
As for what your professor said, the only reason I can imagine for the 2n is that there are 2 functions being called n times. Therefore 2n.
Related
(I got this as an interview question and would love some help with it.)
You have k sorted lists containing n different numbers in total.
Show how to create a single sorted list containing all the elements from the k lists in O(n * log(k)).
The idea is to use a min heap of size k.
Push all the k lists on the heap (one heap-entry per list), keyed by their minimum (i.e. first) value
Then repeatedly do this:
Extract the top list (having the minimal key) from the heap
Extract the minimum value from that list and push it on the result list
Push the shortened list back (if it is not empty) on the heap, now keyed by its new minimum value
Repeat until all values have been pushed on the result list.
The initial step will have a time complexity of O(k log k).
The 3 steps above will be repeated n times. At each iteration the cost of each is:
O(1)
O(1) if the extraction is implemented using a pointer/index (not shifting all values in the list)
O(log k) as the heap size is never greater than k
So the resulting complexity is O(n log k) (as k < n, the initial step is not significant).
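A sketch of this heap-based merge in Python (my own code; it keeps an index per list instead of shifting elements, as suggested above):

import heapq

def merge_k_sorted(lists):
    # Heap entries: (current minimum value, list index, position within that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)                      # O(k) heapify (pushing one by one would be O(k log k))
    result = []
    while heap:
        value, i, pos = heapq.heappop(heap)  # extract the list with the minimal key
        result.append(value)                 # push its minimum onto the result list
        if pos + 1 < len(lists[i]):          # push the shortened list back with its new minimum
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    return result

print(merge_k_sorted([[1, 4, 9], [2, 3], [5]]))   # [1, 2, 3, 4, 5, 9]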
As the question is stated, there's no need for a k-way merge (or a heap). A standard 2 way merge used repeatedly to merge pairs of lists, in any order, until a single sorted list is produced will also have time complexity O(n log(k)). If the question had instead asked how to merge k lists in a single pass, then a k-way merge would be needed.
Consider the case for k == 32, and to simplify the math, assume all lists are merged in order so that each merge pass merges all n elements. After the first pass, there are k/2 lists, after the 2nd pass, k/4 lists, after log2(k) = 5 passes, all k (32) lists are merged into a single sorted list. Other than simplifying the math, the order in which lists are merged doesn't matter, the time complexity remains the same at O(n log2(k)).
Using a k-way merge is normally only advantageous when merging data using an external device, such as one or more disk drives (or classic usage tape drives), where the I/O time is great enough that heap overhead can be ignored. For a ram based merge / merge sort, the total number of operations is about the same for a 2-way merge / merge sort or a k-way merge / merge sort. On a processor with 16 registers, most of them used as indexes or pointers, an optimized (no heap) 4-way merge (using 8 of the registers as indexes or pointers to current and ending location of each run) can be a bit faster than a 2-way merge due to being more cache friendly.
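A sketch of the repeated 2-way merge described above (my own code): merge lists pairwise, pass after pass, until one list remains. Each pass touches every element at most once and there are about log2(k) passes, giving O(n log k):

def merge_two(a, b):
    # Standard 2-way merge of two sorted lists.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:]); out.extend(b[j:])
    return out

def merge_pairwise(lists):
    lists = [lst for lst in lists if lst]
    while len(lists) > 1:
        # One pass: merge lists in pairs, roughly halving their number.
        lists = [merge_two(lists[i], lists[i + 1]) if i + 1 < len(lists) else lists[i]
                 for i in range(0, len(lists), 2)]
    return lists[0] if lists else []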
When k = 2, you merge the two lists by iteratively popping from the list whose front element is the smaller. In a way, you create a virtual list that supports a pop_front operation implemented as:
pop_front(a, b): return if front(a) <= front(b) then pop_front(a) else pop_front(b)
You can very well arrange a tree-like merging scheme where such virtual lists are merged in pairs:
pop_front(a, b, c, d): return if front(a, b) <= front(c, d) then pop_front(a, b) else pop_front(c, d)
Every pop will involve every level in the tree once, leading to a cost O(Log k) per pop.
The above reasoning is wrong because it doesn't account for the front operations: each involves a comparison between two elements, and these cascade, finally requiring a total of k-1 comparisons per output element.
This can be circumvented by "memoizing" the front element, i.e. keeping it next to the two lists after a comparison has been made. Then, when an element is popped, this front element is updated.
This directly leads to the binary min-heap device, as suggested by @trincot.
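Incidentally, Python's standard library already packages essentially this device: heapq.merge lazily merges any number of sorted iterables using a small heap keyed by the memoized front elements.

from heapq import merge

lists = [[1, 4, 9], [2, 3, 8], [5, 6, 7]]
print(list(merge(*lists)))   # [1, 2, 3, 4, 5, 6, 7, 8, 9]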
This problem appeared in code jam 2018 qualification round which has ended.
https://codejam.withgoogle.com/2018/challenges/ (Problem 2)
Problem description:
The basic operation of the standard bubble sort algorithm is to examine a pair of adjacent numbers and reverse that pair if the left number is larger than the right number. But our algorithm examines a group of three adjacent numbers, and if the leftmost number is larger than the rightmost number, it reverses that entire group. Because our algorithm is a "triplet bubble sort", we have named it Trouble Sort for short.
We were looking forward to presenting Trouble Sort at the Special
Interest Group in Sorting conference in Hawaii, but one of our interns
has just pointed out a problem: it is possible that Trouble Sort does
not correctly sort the list! Consider the list 8 9 7, for example.
We need your help with some further research. Given a list of N
integers, determine whether Trouble Sort will successfully sort the
list into non-decreasing order. If it will not, find the index
(counting starting from 0) of the first sorting error after the
algorithm has finished: that is, the first value that is larger than
the value that comes directly after it when the algorithm is done.
So a naive approach will be to apply trouble sort on the given list, apply normal sort on the list, and find the index of the first non-matching element. However, this would time out for very large N.
Here is what I figured:
The algorithm will compare 0th index with 2nd, 2nd with 4th and so on.
Similarly 1st with 3rd, 3rd with 5th and so on.
All the elements at odd indices will end up sorted relative to the other odd-indexed elements; the same goes for the even-indexed elements.
So the issue would lie between two consecutive odd/even indexed elements.
I can't think of a way to figure it out without doing an O(n^2) approach.
Is my approach any viable, or there is something easier?
Your observation is spot on. The algorithm presented in the problem statement will only compare (and swap) the consecutive odd and even elements among themselves.
If you take that observation one step further, you can state that Trouble Sort is an algorithm that correctly sorts odd- and even-indexed elements of an array within themselves. (i.e. as if odd-indexed elements and even-indexed elements of an array A are two separate arrays B and C)
In other words, Trouble Sort does sort B and C correctly. The issue here is whether those arrays B and C of odd and even-indexed elements can be merged properly. You should check if sorting odd- and even-indexed elements among themselves is enough to make the entire array sorted.
This step is really similar to the merging step of MergeSort. The only difference is that, due to the indexing being a limiting factor on your operation, you know at all times from which array you will pick the top element. For a 1-indexed array A, during the merging step of B and C, at each step, you should pick the smallest previously unpicked element from B, and then C.
So, basically, if you sort B and C, which takes O(N log N) using an algorithm such as mergesort or heapsort, and then merge them in the manner described in the previous paragraph, which takes O(N), you end up with the same version of the array A as after it has been processed by the Trouble Sort algorithm.
The difference is the time complexity. While Trouble Sort takes O(N^2) time, the operations described above take O(N log N) time. Once you end up with this array, you can then check in O(N) time whether, for each pair of consecutive indices i, j, A[i] <= A[j] holds. The overall complexity of the algorithm is still O(N log N).
Below is a code sample in Python, as a sort of executable pseudocode for the algorithm described above. There are a couple of minor differences in implementation due to Python arrays being 0-indexed.
def does_trouble_sort_work(A):
    # Split A into its even-indexed (B) and odd-indexed (C) elements.
    B, C = A[0::2], A[1::2]
    B_sorted = sorted(B)
    C_sorted = sorted(C)
    # Interleave the two sorted halves back into A; this reproduces the
    # final state Trouble Sort would reach, but in O(N log N) time.
    j = k = 0
    for i in range(len(A)):
        if i % 2 == 0:
            A[i] = B_sorted[j]
            j += 1
        else:
            A[i] = C_sorted[k]
            k += 1
    # Trouble Sort "works" iff the interleaved result is non-decreasing.
    trouble_sort_works = True
    for i in range(1, len(A)):
        if A[i - 1] > A[i]:
            trouble_sort_works = False
            break
    return trouble_sort_works
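As a quick sanity check (my own examples, not from the answer), the list from the problem statement and an already sorted list behave as expected:

print(does_trouble_sort_work([8, 9, 7]))   # False -- Trouble Sort ends with 7 9 8
print(does_trouble_sort_work([1, 2, 3]))   # True  -- already sorted, nothing to fix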
Let L = list of S's where S = list of lengths of sides of a triangle.
Then, I need to find the minimum number of swaps required to sort the list L.
The list L is said to be sorted:
if the elements within each S list are sorted in non-decreasing order and
if the elements at ith index of each S list are sorted in non-decreasing order.
where 0 <= i <= 2
Note: two types of swap operations can be done:
Either elements within an S list can be swapped (requires 1 swap),
or two complete S lists can be swapped without changing the order of their elements (requires 3 swaps).
Is there any efficient algorithm, in terms of time complexity, to find the minimum number of swaps required to sort the list L, whenever that is possible?
EDIT:
As pointed out correctly by @Mbo, it is not always possible to sort such a list L. So it would be great if someone provides an algorithm to check whether the list L can be sorted, followed by sorting it if possible.
Maybe I misunderstood, but it seems to me that you first have to sort the 3 elements within each S. Once this is done, it is simply a matter of sorting the triplets by the sequence {S0[0], S1[0], ...} and then checking whether the sequences {S0[1], S1[1], ...} and {S0[2], S1[2], ...} are in non-decreasing order. If you have n triangles, the asymptotic worst-case complexity would be O(n log n), dominated by sorting the triplets.
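A hedged sketch of that check in Python (names are mine; ties on the first side are broken lexicographically, and this only tests whether a fully sorted configuration exists as described above, it does not count swaps):

def can_be_sorted(L):
    # Sort the three side lengths within each triple, then order the triples,
    # primarily by their smallest side.
    triples = sorted(sorted(s) for s in L)
    # The arrangement works iff the middle and largest sides are then also
    # non-decreasing from one triple to the next.
    for prev, cur in zip(triples, triples[1:]):
        if prev[1] > cur[1] or prev[2] > cur[2]:
            return False
    return True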
I want to generate some test data to test a function that merges 'k-sorted' lists (lists where each element is at most k positions away from its correct sorted position) into a single fully sorted list. I have an approach that works, but I'm not sure how well randomized it is, and I feel there should be a simpler / more elegant way to do this. My current approach:
Generate n random elements paired with an integer index.
Sort random elements.
Set paired index for each element to its sorted position.
Work backwards through the elements, swapping each element with an element a random distance between 1 and k positions behind it in the list. Only swap with the target element if its paired index is its current index (this avoids swapping an element that is already out of place and moving it further than k positions away from where it should be).
Copy the perturbed elements out into another list.
Like I say, this works but I'm interested in alternative / better approaches.
I think you could just fill an array with random integers and then run quicksort on it with a custom stopping condition.
If in a particular quicksort recursion your start and end indexes are less than k apart, then just return instead of continuing to recur.
Because of how quicksort works, every number in the start..end interval belongs somewhere in that region; worst case is that array[start] might really belong at array[end] (or vice versa) in truly sorted order. So, assuring that start and end are no more than k apart should be sufficient.
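A sketch of that idea (my own code, assuming k >= 1): a textbook quicksort over random integers that stops recursing once a partition spans fewer than k positions, leaving every element within k positions of its sorted place:

import random

def k_sorted(n, k):
    a = [random.randint(0, n) for _ in range(n)]

    def quicksort(lo, hi):
        if hi - lo < k:              # partition already short enough: stop recurring
            return
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:                # classic Hoare-style partition
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        quicksort(lo, j)
        quicksort(i, hi)

    quicksort(0, len(a) - 1)
    return a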
You can generate array of random numbers and then h-sort it like in shellsort, but without fiew last sorting steps when h is less then k.
Step 1: Randomly permute disjoint segments of length k (e.g. positions 1 to k, k+1 to 2k, ...); a sketch of this step follows after the list.
Step 2: Permute conditionally again by swapping within shifted segments (1+t to k+t, k+1+t to 2k+t, ...), where t is a number between 1 and k (preferably around k/2), taking care that the swaps don't break the k-sorted assumption.
Probably repeat step 2 multiple times with different t.
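A minimal sketch of step 1 only (my own code; step 2's conditional swaps are omitted). Shuffling disjoint blocks of length k can move an element at most k-1 positions, so the k-sorted property is preserved:

import random

def k_sorted_blocks(n, k):
    a = list(range(n))                 # start from a sorted list
    for start in range(0, n, k):       # shuffle each disjoint block of length k
        block = a[start:start + k]
        random.shuffle(block)
        a[start:start + k] = block
    return a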
If I understand the problem, you want an algorithm to randomly pick a single k-sorted list of length n, uniformly selected from the universe U of all k-sorted lists of length n. (You will then run this algorithm m times to produce m lists as input test data.)
The first step is to count them: what is the size |U| of U?
The next step is to enumerate them. Create any one-to-one mapping F between the integers (1,2,...,|U|) and k-sorted lists of length n.
Then randomly select an integer x between 1 and |U| inclusive, and then apply F(x) to get the list.
The problem consists of two sorted lists of strings, with no duplicates, of sizes n and m. The first list contains strings that should be deleted from the second list.
The simplest algorithm would have to do n*m operations (I believe the terminology for this is "quadratic time"?).
An improved solution would be to take advantage of the fact that both lists are sorted, and skip, in future comparisons, strings at indices lower than the last deleted index.
I wonder what time complexity would that be?
Are there any solutions for this problem with better time complexity?
You should look into merge sort; its merging step is the basic idea behind why this approach works efficiently.
The idea is to scan the two lists together, which takes O(n+m) time:
Make a pointer x for first list, say A and another pointer y for the second list, say B. Set x=0 and y=0. While x < n and y < m, if A[x] < B[y], then add A[x] to the new merged list and increment x. Otherwise add B[y] to the new list and increment y. Once you hit x=n or y=m, take on the remaining elements from B or A, respectively.
I believe the complexity would be O(n+m), because every item in each of the lists would be visited exactly once.
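Adapting that scan to the deletion problem in the question (a sketch of my own, not part of the answer above): walk both sorted lists once and keep only the strings of the second list that do not appear in the first, which is O(n+m).

def remove_sorted(delete_list, from_list):
    # Both inputs are sorted and duplicate-free.
    result, x = [], 0
    for s in from_list:
        # Advance past delete-list entries smaller than the current string.
        while x < len(delete_list) and delete_list[x] < s:
            x += 1
        if x < len(delete_list) and delete_list[x] == s:
            continue            # s is marked for deletion; skip it
        result.append(s)
    return result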
A counting/bucket sort algorithm would work where each string in the second list is a bucket.
You go through the second list (takes m time) and create your buckets. You then go through your first list (takes n time) and increment the number of occurrences. You then have to go through each bucket again (takes m time) and only return strings that occur once. A trie or a HashMap would work well for storing the buckets. That should be O(n+m+m). If you use a HashSet, in the second pass you remove from the set instead of incrementing a counter; that should be O(n+m+(m-n)).
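A simpler hash-based variant in the same spirit is a near one-liner in Python (my own sketch); with O(1) expected lookups it is O(n+m) overall, though it does not exploit the sortedness at all:

def remove_with_set(delete_list, from_list):
    to_delete = set(delete_list)                           # O(n) to build
    return [s for s in from_list if s not in to_delete]    # O(m) expected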
Might it be O(m + log(n)) if binary search is used?