Divide-and-conquer algorithm for class - big-o

We are starting to work with divide-and-conquer algorithms in my data structures class, and I am having a lot of trouble completely understanding what I am supposed to do. Below is the problem, which basically asks me to write a program that, given k sorted arrays of size n, combines them using divide and conquer into a single array of size kn.
The problem: Suppose that you have k sorted arrays of size n and want to combine them into a single sorted array of size kn. Write pseudo-code for an efficient solution to this.
Is there any algorithm better than O(kn)?

Since all the arrays are sorted, you only ever need to compare and copy elements. Merge the arrays in pairs, divide-and-conquer style: there are O(log k) levels of merging, and each level copies all kn elements, so the total is O(kn log k). In the comparison model you can't do asymptotically better, since Ω(kn log k) comparisons are needed just to distinguish the possible interleavings of the k arrays.
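A sketch in Python of the divide-and-conquer version (the function names are mine, and this is illustrative rather than the assignment's required pseudo-code):

```python
# Divide-and-conquer merge of k sorted lists: split the list of arrays in
# half, merge each half recursively, then merge the two results. There are
# O(log k) levels and each level does O(kn) work, giving O(kn log k).

def merge_two(a, b):
    """Standard linear-time merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def merge_k(arrays):
    """Merge k sorted lists into one sorted list."""
    if not arrays:
        return []
    if len(arrays) == 1:
        return arrays[0]
    mid = len(arrays) // 2
    return merge_two(merge_k(arrays[:mid]), merge_k(arrays[mid:]))
```

For example, merge_k([[1, 4], [2, 5], [0, 3]]) returns [0, 1, 2, 3, 4, 5].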

Related

Hybrid merge + insertion sorting algorithm

In the classical merge sort algorithm, one typically divides the input array until there are several single-element subarrays, and then merges the elements back together. But it's well known that you can modify this merge sort algorithm by splitting the array until you have, say, k subarrays, each of size n/k (where n is the original length of the array). You can then use insertion sort to sort each of those k subarrays and combine them using the merge subroutine.
Intuitively, I think that this should be better than just merge sort in some cases because insertion sort is fast on small arrays. But I want to figure out precisely when this hybrid algorithm is better than the regular merge sort algorithm. I don't think it would be better for small k because as k approaches 1, we'd just be using the insertion sort algorithm. I think there is some optimal ratio n/k, but I am not so sure how to find it.
Any help is appreciated.
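For what it's worth, here is a minimal Python sketch of the hybrid being described; the cutoff constant and all names are my own choices, not part of the question:

```python
# Hybrid merge sort: ranges of length <= CUTOFF are handled by insertion
# sort (fast on small inputs); larger ranges are split and merged as usual.

CUTOFF = 16  # roughly the n/k from the question; tune empirically

def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def hybrid_mergesort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a)
    if hi - lo <= CUTOFF:
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    hybrid_mergesort(a, lo, mid)
    hybrid_mergesort(a, mid, hi)
    # standard merge of the two sorted halves back into a[lo:hi]
    merged, i, j = [], lo, mid
    while i < mid and j < hi:
        if a[i] <= a[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(a[j])
            j += 1
    merged.extend(a[i:mid])
    merged.extend(a[j:hi])
    a[lo:hi] = merged
```

The interesting experiment is then to vary CUTOFF and time both versions against plain merge sort; the optimum is machine- and data-dependent, which is why library implementations typically hard-code a small constant rather than derive one analytically.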

Sorting an array where the sorted array has 3 possible differences between adjacent elements

I'm having trouble with this sorting question.
Describe an algorithm that sorts an array under these conditions:
1. In the sorted array there are 3 possible differences (k1, k2, k3, all natural numbers) between adjacent elements.
2. In the sorted array there are 3 possible differences (k1, k2 = 2k1, k3 = 3k1, all rational numbers) between adjacent elements.
I was able to find the differences in both questions in linear time O(n), but I'm stuck at O(n log n) for the sorting part.
I'm trying to get O(n) time, or maybe O(n log log n), perhaps by treating k1, k2, k3 as really small numbers and using counting sort.
Thanks.
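Not an answer from the thread, but a Python sketch of the counting-sort direction for condition 2, assuming k1 is already known (the poster says the differences can be found in linear time). Since k2 = 2k1 and k3 = 3k1, every element sits an integer multiple of k1 above the minimum, and that multiple is at most 3(n-1), so a bucket array of size O(n) suffices and the whole sort is O(n):

```python
# Counting sort over the index t = (x - min) / k1, which is a nonnegative
# integer at most 3*(n-1) when every adjacent gap is k1, 2*k1 or 3*k1.

def sort_case2(a, k1):
    mn = min(a)
    buckets = [0] * (3 * (len(a) - 1) + 1)
    for x in a:
        t = round((x - mn) / k1)   # exact integer index; rounded for float safety
        buckets[t] += 1
    out = []
    for t, count in enumerate(buckets):
        out.extend([mn + t * k1] * count)
    return out
```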

Complexity - input length

I'm currently learning complexity (or efficiency, whatever you call it), and I read about it in a book I have.
It says something that I find pretty senseless, and I need an explanation. I've tried looking online, but I didn't find an answer for the particular example it gives.
For an algorithm that finds the max number in a one-dimensional array of size n, the input length would be n.
"For an algorithm that finds the max number in a two-dimensional array of size n*n, the input length would still be n."
I don't understand why the input length would be n in both cases, even though for the two-dimensional array you have to go through n*n numbers...
It also says:
input length = the amount of work done ...
which doesn't make any sense to me.
Would anyone care to explain? They certainly don't explain this there.
It's a common misconception (much seen here on SO) that the complexity of a scan across a 2D array with n*n elements is O(n^2). It's not; it's O(n), where n is the input length, i.e. the number of elements. A scan is a linear operation, one element after another.
The 2D array is a polite fiction; it is really just a convenience for accessing a 1D array. After all, in languages which implement arrays properly (i.e. not as an array of pointers to blocks of memory), a 2D array is just a set of adjacent memory locations. And even in languages which do implement 2D arrays as arrays of pointers, they're just linear segments of memory with interruptions.
If a scan across a 2D array were O(n^2) then you could magically transform it to O(n) by ignoring the 2d-ness and just scanning the underlying 1d block of memory.
O(n^2) describes a different complexity class of operations such as those in which each pair of elements in the input is operated upon.
Reading in the comments that this book is written in Hebrew I would assume that the issue is a translation error or some other error in proofreading. The definition given in the comments of input length "input length is the measurement that indicates the work load of an algorithm" doesn't match what you would assume the term means at all in English.
To answer the question about complexity: they are reusing the variable n in multiple places, which makes it slightly confusing. They use n both to describe the dimension of the array and to describe the complexity. O(n) simply means the complexity is linear in the input; O(n^2) would be quadratic. In this case, with an array of n*n elements, the input is n*n, or n^2, but the complexity of the algorithm is still O(n) in the input size (linear). This is because the algorithm still operates on each input element only once, whether there are n or n*n of them. It would still be linear if it operated on each element two or three times, since 3n and n are both linear functions (any c*n is linear).
I hope this helps.
Big-O notation is used to classify TYPES of algorithms (complexity classes), not necessarily how much time it will ACTUALLY take to run. For instance O(cn) is just O(n) where c is a constant.
n is the size of the input whether that input is an nxn matrix or just an 'n' length array. The big-O 'n' and the program variable name are not referring to the same thing.
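A tiny Python illustration of the point (the counter is only there to make the work explicit):

```python
# Finding the max of an n-by-n array touches each of the N = n*n elements
# exactly once, so the running time is linear in the input length N.

def max_2d(matrix):
    best = matrix[0][0]   # assumes a non-empty matrix
    visits = 0
    for row in matrix:
        for x in row:
            visits += 1
            if x > best:
                best = x
    return best, visits   # visits equals the total number of elements
```

For a 3x3 matrix, visits comes back as 9: the work is linear in the nine inputs, even though the side length is only 3.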

What is the best data structure to sort an array with duplicated elements?

If an array contains duplicated elements, what data structure is better for sorting?
Could B tree work?
For a fixed and small range of element values you can use the counting sort algorithm, as described here. Its complexity is O(n + k), where n is the size of your array and k is, basically, the number of distinct possible values.
The point is to count the occurrences of each value, and then write them out in the right order.
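A minimal Python sketch of that idea, assuming the value range [lo, hi] is known in advance:

```python
# Counting sort: tally how many times each value occurs, then emit the
# values in order. Duplicates are handled naturally by the counts.

def counting_sort(a, lo, hi):
    counts = [0] * (hi - lo + 1)   # one slot per possible value
    for x in a:
        counts[x - lo] += 1
    out = []
    for v, c in enumerate(counts):
        out.extend([lo + v] * c)   # emit each value c times
    return out

# counting_sort([3, 1, 3, 2, 1], 1, 3) -> [1, 1, 2, 3, 3]
```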

Is it possible to find two numbers whose difference is minimum in O(n) time

Given an unsorted integer array, and without making any assumptions about the numbers in the array: is it possible to find two numbers whose difference is minimum in O(n) time?
Edit: The difference between two numbers a, b is defined as abs(a - b).
Find the smallest and largest elements in the list. The difference smallest - largest is the minimum if differences may be negative; with the difference defined as abs(a - b), as in the edit, that pair gives the maximum instead.
If you're looking for the nonnegative difference, then this is of course at least as hard as checking whether the array has two equal elements. That is the element uniqueness problem, and without any additional assumptions (like limiting the size of the integers, or allowing operations other than comparison) it requires Ω(n log n) time. It is the 1-dimensional case of finding the closest pair of points.
I don't think you can do it in O(n). The best I can come up with off the top of my head is to sort them (which is O(n log n)) and then find the minimum difference between adjacent pairs in the sorted list (which adds another O(n)).
I think it is possible. The secret is that you don't actually have to sort the list; you just need to build a tally of which numbers exist. This may count as "making an assumption" from an algorithmic perspective, but not from a practical perspective: we know the ints are bounded by a min and a max.
So, create an array of 2-bit elements, one pair for each int from INT_MIN to INT_MAX inclusive, and set all of them to 00.
Iterate through the entire list of numbers. For each number in the list, if the corresponding 2 bits are 00 set them to 01. If they're 01 set them to 10. Otherwise ignore. This is obviously O(n).
Next, if any of the 2-bit pairs is set to 10, that is your answer: the minimum distance is 0, because the list contains a repeated number. If not, scan through the tally and find the minimum distance between consecutive present values. Many people have already pointed out there are simple O(n) algorithms for this.
So O(n) + O(n) = O(n).
Edit: responding to comments.
Interesting points. I think you could achieve the same result without making any assumptions by first finding the min and max of the list, and then using a sparse array ranging from min to max to hold the data. That takes care of the INT_MIN/INT_MAX assumption, the space complexity, and the O(m) time complexity of scanning the array.
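A Python sketch of that refined version (a tally over [min, max] rather than the full int range; note the second loop is O(max - min), not O(n), which is what the comments pushed back on):

```python
def min_difference(nums):
    mn, mx = min(nums), max(nums)
    seen = bytearray(mx - mn + 1)      # one flag per value in [mn, mx]
    for x in nums:
        idx = x - mn
        if seen[idx]:
            return 0                   # repeated number: minimum distance is 0
        seen[idx] = 1
    best, prev = None, None
    for idx, flag in enumerate(seen):  # values come out in sorted order
        if flag:
            if prev is not None and (best is None or idx - prev < best):
                best = idx - prev
            prev = idx
    return best
```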
The best I can think of is to counting sort the array (possibly combining equal values) and then do the sorted comparisons -- bin sort is O(n + M) (M being the number of distinct values). This has a heavy memory requirement, however. Some form of bucket or radix sort would be intermediate in time and more efficient in space.
Sort the list with radixsort (which is O(n) for integers), then iterate and keep track of the smallest distance so far.
(I assume your integer is a fixed-bit type. If they can hold arbitrarily large mathematical integers, radixsort will be O(n log n) as well.)
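A sketch of that route, assuming fixed-width nonnegative integers (negative values would need an offset or an extra pass):

```python
# LSD radix sort in base 256 over 4-byte integers: 4 stable passes of
# O(n + 256) each, i.e. O(n) overall, then one linear pass for the gap.

def radix_sort(nums, width_bytes=4):
    for shift in range(0, 8 * width_bytes, 8):
        buckets = [[] for _ in range(256)]
        for x in nums:
            buckets[(x >> shift) & 0xFF].append(x)
        nums = [x for b in buckets for x in b]
    return nums

def min_gap(nums):
    s = radix_sort(list(nums))
    return min(b - a for a, b in zip(s, s[1:]))  # smallest adjacent gap
```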
It seems to be possible to sort an unbounded set of integers in O(n*sqrt(log log n)) time. After sorting, it is of course trivial to find the minimal difference in linear time.
But I can't think of any algorithm to make it faster than this.
No, not without making assumptions about the numbers/ordering.
It would be possible given a sorted list though.
I think the answer is no, and the proof is similar to the proof that you cannot sort faster than n lg n: you have to compare all of the elements, i.e. build a comparison tree, which implies an Ω(n lg n) lower bound.
EDIT. OK, if you really want to argue, then the question does not say whether it should be a Turing machine or not. With quantum computers, you can do it in linear time :)
