Algorithm for finding mutual name in lists

I've been reading up on algorithms from the book Algorithms by Robert Sedgewick and I've been stuck on an exercise problem for a while. Here is the question:
Given 3 lists of N names each, find an algorithm to determine if there is any name common to all three lists. The algorithm must have O(NlogN) complexity. You're only allowed to use sorting algorithms and the only data structures you can use are stacks and queues.
I figured I could solve this problem using a HashMap, but the question restricts us from doing so. Even then, that still wouldn't have a complexity of O(N log N).

If you sort each of the lists, then you can trivially check whether all three lists share a name in O(N) time: take the first name of list A and compare it to the first name of list B; if list B's name is smaller than list A's, pop it from list B and repeat until list B's front is >= list A's front. If you find a match, repeat the process on list C. If you find a match in C as well, return true; otherwise move on to the next element of A.
Now you have to sort all of the lists in O(N log N) time, which you can do with your favorite sorting algorithm, though you will have to be a little creative using just stacks and queues. I would probably recommend merge sort.
The pseudo code below is a little messed up because I am changing lists that I am iterating over.
pseudo code:
Assume listA, listB and listC are sorted queues where the smallest name is at the front of the queue.
eltB = listB.pop()
eltC = listC.pop()
for eltA in listA:
    while eltB <= eltA:
        if eltB == eltA:
            while eltC <= eltB:
                if eltB == eltC:
                    return true
                if eltC < eltB:
                    eltC = listC.pop()
        eltB = listB.pop()
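For reference, here is a runnable Python sketch of the same scan, using collections.deque to stand in for the queues (the function name and the assumption that the inputs are already sorted are mine, not part of the exercise):

from collections import deque

def have_common_name(list_a, list_b, list_c):
    # All three inputs are assumed to be sorted, smallest name first.
    qb, qc = deque(list_b), deque(list_c)
    for a in list_a:
        # Discard names in B that can no longer match anything in A.
        while qb and qb[0] < a:
            qb.popleft()
        if not qb:
            return False
        if qb[0] == a:
            # B agrees with A; advance C the same way.
            while qc and qc[0] < a:
                qc.popleft()
            if not qc:
                return False
            if qc[0] == a:
                return True
    return False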

Steps:
1) Sort the three lists using an O(N lgN) sorting algorithm.
2) Pop one item from each list.
3) If any of the lists from which you tried to pop is empty, then you are done, i.e. no common element exists.
4) Else, compare the three elements.
5) If the elements are equal, you are done - you have found the common element.
6) Else, keep the maximum of the three elements (constant time), discard the other two, and replenish from the lists those two elements came from.
7) Go to step 3.
Step 1 takes O(N lgN) and the rest of the steps take O(N), so the overall complexity is O(N lgN).
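A short Python sketch of these steps, assuming the three inputs are already-sorted deques (the function name is just for illustration):

from collections import deque

def common_name_exists(qa, qb, qc):
    # qa, qb, qc are deques of names, each sorted smallest-first.
    while qa and qb and qc:
        a, b, c = qa[0], qb[0], qc[0]
        if a == b == c:
            return True
        # Any front element smaller than the maximum can never be part of a
        # three-way match, so discard it and re-check on the next pass.
        top = max(a, b, c)
        if a < top:
            qa.popleft()
        if b < top:
            qb.popleft()
        if c < top:
            qc.popleft()
    return False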

Related

Average case nlogn Nuts and Bolts matching

I have to make an algorithm that matches items from two arrays. We are not allowed to sort either array first; we can only match by comparing an item from array 1 with an item from array 2 (the comparisons being <, =, >). The output is two lists in the same order. I can think of ways to solve it using n(n+1)/2 comparisons. The goal is n log(n). I have been banging my head against a wall trying to think of a way, but I can't. Can anyone give me a hint?
So to explain: the input is two arrays, e.g. A = [1,3,6,2,5,4] and B = [4,2,3,5,1,6], and the output is the two arrays reordered so that matching items sit at the same positions. You cannot sort the arrays individually first or compare items within the same array. You can only compare items across lists, like so: A_1<B_1, A_2=B_3, A_4<B_3.
Similar to quicksort:
Use a random A-element to partition B into smaller-B, equal-B and larger-B. Use its equal B-element to partition A. Recursively match smaller-A with smaller-B as well as larger-A with larger-B.
Just like quicksort, expected time is O(n log n) and worst case is O(n^2).
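A rough Python sketch of this partition-and-recurse idea, assuming all items are distinct and every item in A has exactly one matching item in B (the function name is illustrative; only cross-array comparisons are used):

import random

def match_nuts_and_bolts(nuts, bolts):
    # Returns the two lists reordered so that matching items share an index.
    if not nuts:
        return [], []
    pivot_nut = random.choice(nuts)
    # Partition the bolts around the chosen nut and find its equal bolt.
    small_b = [b for b in bolts if b < pivot_nut]
    large_b = [b for b in bolts if b > pivot_nut]
    pivot_bolt = next(b for b in bolts if b == pivot_nut)
    # Partition the remaining nuts around that bolt.
    small_n = [n for n in nuts if n < pivot_bolt]
    large_n = [n for n in nuts if n > pivot_bolt]
    sn, sb = match_nuts_and_bolts(small_n, small_b)
    ln, lb = match_nuts_and_bolts(large_n, large_b)
    return sn + [pivot_nut] + ln, sb + [pivot_bolt] + lb

For example, match_nuts_and_bolts([1,3,6,2,5,4], [4,2,3,5,1,6]) returns the two lists with matching items aligned position by position.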

Bubble sort variant - three adjacent number swapping

This problem appeared in code jam 2018 qualification round which has ended.
https://codejam.withgoogle.com/2018/challenges/ (Problem 2)
Problem description:
The basic operation of the standard bubble sort algorithm is to examine a pair of adjacent numbers and reverse that pair if the left number is larger than the right number. But our algorithm examines a group of three adjacent numbers, and if the leftmost number is larger than the rightmost number, it reverses that entire group. Because our algorithm is a "triplet bubble sort", we have named it Trouble Sort for short.
We were looking forward to presenting Trouble Sort at the Special
Interest Group in Sorting conference in Hawaii, but one of our interns
has just pointed out a problem: it is possible that Trouble Sort does
not correctly sort the list! Consider the list 8 9 7, for example.
We need your help with some further research. Given a list of N
integers, determine whether Trouble Sort will successfully sort the
list into non-decreasing order. If it will not, find the index
(counting starting from 0) of the first sorting error after the
algorithm has finished: that is, the first value that is larger than
the value that comes directly after it when the algorithm is done.
So a naive approach would be to apply Trouble Sort to the given list, apply a normal sort to a copy of the list, and find the index of the first non-matching element. However, this would time out for very large N.
Here is what I figured:
The algorithm will compare 0th index with 2nd, 2nd with 4th and so on.
Similarly 1st with 3rd, 3rd with 5th and so on.
All the elements at odd indices will end up sorted among themselves, and the same holds for the even-indexed elements.
So any issue would lie between two consecutive odd/even-indexed elements.
I can't think of a way to figure it out without doing an O(n^2) approach.
Is my approach viable, or is there something easier?
Your observation is spot on. The algorithm presented in the problem statement will only compare (and swap) the odd-indexed and even-indexed elements among themselves.
If you take that observation one step further, you can state that Trouble Sort is an algorithm that correctly sorts odd- and even-indexed elements of an array within themselves. (i.e. as if odd-indexed elements and even-indexed elements of an array A are two separate arrays B and C)
In other words, Trouble Sort does sort B and C correctly. The issue here is whether those arrays B and C of odd and even-indexed elements can be merged properly. You should check if sorting odd- and even-indexed elements among themselves is enough to make the entire array sorted.
This step is really similar to the merging step of MergeSort. The only difference is that, due to the indexing being a limiting factor on your operation, you know at all times from which array you will pick the top element. For a 1-indexed array A, during the merging step of B and C, at each step, you should pick the smallest previously unpicked element from B, and then C.
So, basically, if you sort B and C, which takes, O(NlogN) using an algorithm such as mergesort or heapsort, and then merge them in the manner described in the previous paragraph, which takes O(N), you end up with the same version of the array A after it has been processed by the Trouble Sort algorithm.
The difference is the time complexity. While Trouble Sort takes O(N^2) time, the operations described above take O(NlogN) time. Once you end up with this array, you can check in O(N) time whether A[i] <= A[j] holds for each pair of consecutive indices i, j. The overall complexity of the algorithm is still O(NlogN).
Below is a code sample in Python, essentially pseudocode, to demonstrate the algorithm I described above. There are a couple of minor differences in implementation because Python lists are 0-indexed.
def does_trouble_sort_work(A):
    # B holds the even-indexed elements, C holds the odd-indexed ones.
    B, C = A[0::2], A[1::2]
    B_sorted = sorted(B)
    C_sorted = sorted(C)
    # Merge them back by alternating indices; this is exactly the final
    # state that Trouble Sort produces.
    j = k = 0
    for i in range(len(A)):
        if i % 2 == 0:
            A[i] = B_sorted[j]
            j += 1
        else:
            A[i] = C_sorted[k]
            k += 1
    # Check whether the resulting array is non-decreasing.
    trouble_sort_works = True
    for i in range(1, len(A)):
        if A[i-1] > A[i]:
            trouble_sort_works = False
            break
    return trouble_sort_works
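As an assumed quick check (not part of the original answer), on the example from the problem statement:

does_trouble_sort_work([8, 9, 7])        # False: the final array is [7, 9, 8], and 9 > 8
does_trouble_sort_work([5, 6, 6, 4, 3])  # True: the final array is [3, 4, 5, 6, 6]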

Best sorting algorithm - Partially sorted linked list

Problem: Given a sorted doubly linked list and two numbers C and K, you need to decrease the data of the node with value K by C and insert the resulting node at its correct position so that the list remains sorted.
I would think of insertion sort for such a problem, because insertion sort at any instant looks like being shown a bunch of cards that are partially sorted. For insertion sort, the number of swaps is equal to the number of inversions, and the number of compares is at most the number of exchanges + (N-1).
So, in the given problem, if the node with data K is decreased by C, the sorted linked list becomes partially sorted, and insertion sort looks like the best fit.
Another point: when selecting a sorting algorithm, if a given sorting approach is the best fit for the array representation of the data, then the same approach should also be the best fit for the linked-list representation of the same data.
For this problem, is my thought process correct in choosing insertion sort?
Maybe you mean something else, but insertion sort is not the best algorithm, because you actually don't need to sort anything. If there is only one element with value K then it doesn't make a big difference, but otherwise it does.
So I would suggest the following O(n) algorithm, ignoring edge cases for simplicity (a code sketch follows the steps):
Go forward in the list until the value of the current node is > K - C.
Save this node, all the reduced nodes will be inserted before this one.
Continue to go forward while the value of the current node is < K.
While the value of the current node is K, remove node, set value to K - C and insert it before the saved node. This could be optimized further, so that you only do one remove and insert operation of the whole sublist of nodes which had value K.
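Not those exact steps, but a minimal single-node sketch of the same repositioning idea in Python (the Node class and function name are made up for illustration; it assumes exactly one node holds K and that C >= 0):

class Node:
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None

def decrease_and_reposition(head, K, C):
    # Find the node holding K.
    node = head
    while node and node.data != K:
        node = node.next
    if node is None:
        return head
    node.data -= C
    # Unlink it from the list.
    if node.prev:
        node.prev.next = node.next
    else:
        head = node.next
    if node.next:
        node.next.prev = node.prev
    # Walk from the head to the first node >= the new value and insert before it.
    prev, cur = None, head
    while cur and cur.data < node.data:
        prev, cur = cur, cur.next
    node.prev, node.next = prev, cur
    if prev:
        prev.next = node
    else:
        head = node
    if cur:
        cur.prev = node
    return head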
If these decrease operations can be batched up before the sorted list must be available, then you can simply remove all the decremented nodes from the list. Then, sort them, and perform a two-way merge into the list.
If the list must be maintained in order after each node decrement, then there is little choice but to remove the decremented node and re-insert in order.
Doing this with a linear search for a deck of cards is probably acceptable, unless you're running some monstrous Monte Carlo simulation involving cards that runs for hours or days, so that the optimization counts.
Otherwise the way we would deal with the need to maintain order would be to use an ordered sequence data structure: balanced binary tree (red-black, splay) or a skip list. Take the node out of the structure, adjust value, re-insert: O(log N).
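If batching is possible, the remove-sort-merge idea above can be sketched with plain Python lists standing in for the linked list (the function name and the single (K, C) pair are assumptions for illustration):

def batch_decrease(sorted_values, K, C):
    # Pull out every node with value K, decrease each by C, then merge back.
    kept = [v for v in sorted_values if v != K]
    moved = sorted(v - C for v in sorted_values if v == K)
    # Standard two-way merge of two sorted lists.
    merged, i, j = [], 0, 0
    while i < len(kept) and j < len(moved):
        if kept[i] <= moved[j]:
            merged.append(kept[i])
            i += 1
        else:
            merged.append(moved[j])
            j += 1
    return merged + kept[i:] + moved[j:]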

Does the non-parallel sample sort have the same complexity as quick sort?

According to wikipedia and other resources, quick sort happens to be a special case of sample sort, because we always choose 1 partitioning item, put it in its place and continue the sort; so quick sort is sample sort where m (the number of partitioning items at each step) is 1. So, my question is, for 1 < m < n does it have the same complexity as quick sort when it's not parallel?
The following is the algorithm for sample sort described on wikipedia.
1) Find splitters, values that break up the data into buckets, by sampling the data.
2) Use the sorted splitters to define buckets and place data in appropriate buckets.
3) Sort each of the buckets.
I am not exactly sure I understand this algorithm correctly, but I think we first find the partitioning item, put it in its place, and then look to the left and to the right to find more partitioning items there, and then recursively call the same function to partition each of those m samples into m samples again. Am I right? Because if so, it seems that sample sort performs the same as quick sort, because it simply does the same thing, except half of it iteratively (when looking for splitters) and half of it recursively.
They will have different complexity. When m > 1, the running time would be approximately C * N * log_(m+1)(N). The constant C will be large enough to make it slower than ordinary QuickSort, because there is no known algorithm that partitions a list into m + 1 buckets as efficiently as partitioning it into two buckets.
For example, normal QuickSort takes O(N) to partition the list into two sub-arrays. Assume that in the best case QuickSort perfectly chooses a value that splits the list into two buckets of the same size:
C(n) = 2*C(n/2) + n, which gives C(n) = n*log2(n)
Now assume m = 2, which means we need to partition the list into three sub-arrays. Assume again that in the best case we can perfectly choose values that split the list into three buckets of the same size, but that the cost of the partition step is O(3N):
C(n) = 3*C(n/3) + 3n, which gives C(n) = 3n*log3(n)
As you can see,
3n*log3(n) > n*log2(n).
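To make the multi-way partition concrete, here is a rough non-parallel sample sort sketch in Python (the splitter selection and duplicate handling are simplified assumptions, not taken from the question):

import random
from bisect import bisect_right

def sample_sort(values, m=3):
    # m splitters define m + 1 buckets; each bucket is sorted recursively.
    if len(values) <= m + 1:
        return sorted(values)
    splitters = sorted(random.sample(values, m))
    if splitters[0] == splitters[-1]:
        return sorted(values)  # too many duplicates to split usefully
    buckets = [[] for _ in range(m + 1)]
    for v in values:
        # Placing one element costs O(log m) comparisons, which is where the
        # larger constant factor relative to QuickSort's single pivot test shows up.
        buckets[bisect_right(splitters, v)].append(v)
    result = []
    for bucket in buckets:
        result.extend(sample_sort(bucket, m))
    return result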

Removing items from a list - algorithm time complexity

The problem consists of two sorted lists, with no duplicates, of sizes n and m. The first list contains strings that should be deleted from the second list.
The simplest algorithm would have to do n*m operations (I believe the terminology for this is "quadratic time"?).
An improved solution would take advantage of the fact that both lists are sorted and, in future comparisons, skip strings whose index is lower than the last deleted index.
I wonder what time complexity would that be?
Are there any solutions for this problem with better time complexity?
You should look into Merge sort. This is the basic idea behind why it works efficiently.
The idea is to scan the two lists together, which takes O(n+m) time:
Make a pointer x for the first list, say A, and another pointer y for the second list, say B. Set x=0 and y=0. While x < n and y < m, if A[x] < B[y], then add A[x] to the new merged list and increment x; otherwise add B[y] to the new list and increment y. Once you hit x=n or y=m, take on the remaining elements from B or A, respectively.
I believe the complexity would be O(n+m), because every item in each of the lists would be visited exactly once.
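Adapting that same two-pointer scan to the actual deletion task could look like this in Python (the function name is illustrative; both inputs are assumed sorted and duplicate-free):

def remove_all(to_delete, source):
    # Returns source without any string that appears in to_delete, in O(n + m).
    result, x = [], 0
    for s in source:
        # Skip deletion candidates that are smaller than the current string.
        while x < len(to_delete) and to_delete[x] < s:
            x += 1
        if x < len(to_delete) and to_delete[x] == s:
            continue  # s is marked for deletion
        result.append(s)
    return result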
A counting/bucket sort algorithm would work where each string in the second list is a bucket.
You go through the second list (takes m time) and create your buckets. You then go through your first list (takes n time) and increment the number of occurrences. You then would have to go through each bucket (takes m time) again and only return strings that occur once. A Trie or a HashMap would work well for storing the buckets. That should be O(n+m+m). If you use a HashSet, in the second pass, instead of incrementing a counter, you remove from the Set. It should be O(n+m+(m-n)).
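The HashSet variant can be as short as this sketch (expected O(n + m), no sorting needed; the function name is made up):

def remove_all_hashed(to_delete, source):
    banned = set(to_delete)
    return [s for s in source if s not in banned]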
Might it be O(m + log(n)) if binary search is used?
