I have a range of numbers from 1 to 10 and I want to pick out 3 at random, but never the same one twice. In Lua I used a Fisher-Yates shuffle, which is O(n), and I know Python has a built-in random.sample(), which is also O(n). Can it be done faster with an arbitrary range and number of picks?
Any algorithm that has to read each of n numbers cannot run in less than O(n) time, because the n read operations alone already make it linear.
P.S.: This assumes that n is the number of picks. If instead n is the size of the array and the number of picks m is constant, you can generate a random index m times with any method and achieve O(1), assuming array indexing takes constant time. (Independent random indices can repeat, so you still have to re-draw on a collision to guarantee distinct picks.) I hope that answers your question; please clarify if it doesn't.
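One way to get m distinct picks without touching all n elements is a partial Fisher-Yates shuffle over a "virtual" array that stores only the slots that have been displaced. This is a minimal sketch rather than anything from the answer above; the name sample_range and its parameters are made up for illustration, and the O(m) bound assumes dict operations take constant time on average.

```python
import random

def sample_range(lo, hi, m, rng=random):
    """Pick m distinct integers from lo..hi (inclusive) in O(m) expected time."""
    n = hi - lo + 1
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and the size of the range")
    swapped = {}  # virtual array: slot index -> value, only for displaced slots
    picks = []
    for i in range(m):
        j = rng.randrange(i, n)                # random slot in the unshuffled tail
        picks.append(swapped.get(j, lo + j))   # value currently sitting in slot j
        swapped[j] = swapped.get(i, lo + i)    # move slot i's value into slot j
    return picks

print(sample_range(1, 10, 3))
```

Time and memory depend only on m, not on the size of the range, so the same call works for a range of ten numbers or ten billion.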
For finding the sum of n numbers, which option has the lower time complexity?
Using a for loop to iterate over the n numbers of an array
Using the closed-form equation (n*(n-1))/2
I think the second one is correct, but does it also depend on the length of the array? Which one is correct here?
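To make the comparison concrete, here is a small sketch of both options. The function names are hypothetical. Note that a closed form only applies when the numbers are a known sequence: (n*(n-1))/2 as written in the question is the sum of 0..n-1, while the sum of 1..n is n*(n+1)/2. For arbitrary array contents the loop is the only option.

```python
def sum_loop(values):
    """O(n): works for any numbers, but has to visit every element."""
    total = 0
    for v in values:
        total += v
    return total

def sum_first_n(n):
    """O(1): closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def sum_below_n(n):
    """O(1): closed form for 0 + 1 + ... + (n - 1), the formula from the question."""
    return n * (n - 1) // 2

print(sum_loop(range(1, 11)), sum_first_n(10))  # both give 55
print(sum_loop(range(10)), sum_below_n(10))     # both give 45
```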
The problem is to sort a list containing n distinct integers that range in value from 1 to kn inclusive where k is a fixed positive integer. Design an algorithm to solve the problem in Θ(n) time.
I don't just want an answer. An explanation would help, or if someone could get me pointed in the right direction.
I know that Θ(n) time means the algorithm time is directly proportional to the number of elements. Not sure where to go from there.
Easy for fixed k: Create an array of kn counters. Set them all to zero. Iterate through the array, increasing the counter i by one if an array element equals i. Use the array of counters to re-create the sorted array.
Obviously this is less efficient than an O(n log n) comparison sort once k > log n, since it does on the order of kn work.
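The counter-array idea above could look roughly like this. It is only a sketch: the function name counting_sort and the range check are my own additions, and it assumes every value really lies in 1..kn.

```python
def counting_sort(values, k):
    """Sort integers in the range 1..k*n (n = len(values)) in O(k*n) time."""
    n = len(values)
    counts = [0] * (k * n + 1)  # one counter per possible value; index 0 is unused
    for v in values:
        if not 1 <= v <= k * n:
            raise ValueError("value outside 1..k*n")
        counts[v] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)  # re-create the array in sorted order
    return result

print(counting_sort([6, 1, 4], k=2))  # [1, 4, 6]
```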
The key is that the integers only range from 1 to kn, so their length is limited. This is a little tricky:
The common assumption when we say that a sorting algorithm is O(N) is that the number N fits into a constant number of machine words, so that we can do math on numbers of that size in constant time. Following this assumption, kN also fits into a constant number of machine words, since k is a fixed positive integer. Your input is therefore O(N) words long, and each word is a fixed number of bits, so your input is O(N) bits long.
Therefore, any algorithm that takes time proportional to the number of bits in the input is considered O(N).
There are actually lots of choices, but when this particular question is asked in this particular way, the person asking usually wants you to come up with a radix sort:
https://en.wikipedia.org/wiki/Radix_sort
The MSB-first radix sort just partitions the integers into 2^W buckets according to the values of their top W bits, and then partitions each bucket according to the next W bits, etc., until all the bits are processed.
The time taken for this is O(N*(word_size/W)), but as we said the word size is constant, and W is constant, so this is O(N).
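A rough sketch of the MSB-first radix sort described above, assuming non-negative integers that fit in word_bits bits. The names msd_radix_sort, word_bits and w are illustrative, and the defaults (32-bit words, 8 bits per pass) are assumptions, not something the question specifies.

```python
def msd_radix_sort(values, word_bits=32, w=8):
    """MSB-first radix sort of non-negative ints below 2**word_bits."""
    mask = (1 << w) - 1
    # Shift that exposes the top group of w bits (rounded up to a whole group).
    top_shift = ((word_bits + w - 1) // w - 1) * w

    def sort_from(items, shift):
        if len(items) <= 1 or shift < 0:
            return items  # nothing left to partition, or all bits processed
        buckets = [[] for _ in range(1 << w)]   # 2^W buckets
        for v in items:
            buckets[(v >> shift) & mask].append(v)
        out = []
        for bucket in buckets:                  # buckets are already in key order
            out.extend(sort_from(bucket, shift - w))
        return out

    items = list(values)
    for v in items:
        if v < 0 or v >> word_bits:
            raise ValueError("value does not fit in word_bits bits")
    return sort_from(items, top_shift)

print(msd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Allocating 2^W buckets at every partition step gives this version a large constant factor, but with word_bits and w fixed the running time still grows linearly in N.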
You are given an unsorted array of n integers, and you would like to find if there are any duplicates in the array (i.e. any integer appearing more than once).
The current algorithm works on an unsorted array of n integers. It uses a nested loop to find duplicates, so its complexity is O(N^2).
If we limit the input data in order to achieve some best case scenario, how can you limit the input data to achieve a better Big O complexity? Describe an algorithm for handling this limited data to find if there are any duplicates. What is the Big O complexity?
The question asks for the following:
One way in which the data can be limited.
How this changes your algorithm for finding duplicates, and what the better Big O complexity is.
The answer I have come up with:
If we limit the data to, let’s say, array size of 5 (n = 5), we could reduce the complexity to O(N).
If the array is sorted, then all we need is a single loop that compares each element to the next element in the array; this will find whether duplicates exist.
This simply means that if the array given to us is already sorted (from lowest to highest value), by default or by luck, the complexity drops from O(N^2) to O(N): we no longer need the inner loop, so a single loop can compare each integer to its successor. If a duplicate is encountered we could, for instance, print it with a printf statement and keep iterating. The loop runs n-1 times (4 times for n = 5), and the program ends once that is done.
The best case of this algorithm is O(N) because the running time grows linearly, in direct proportion to the size of the input: for a sorted array of 50 integers, the loop iterates n-1 = 49 times, where n is the length of the array.
The running time in this algorithm increases in direct proportion to the input size. This simply means that in a sorted array, the amount of time the operations take to perform is completely dependent on the input size of the array.
Confirmation of whether this is correct would be appreciated. I know there are other algorithms in a better complexity class, but since this is more efficient than O(N^2), it seems like a possible answer to what the question asks for.
If you limit the size of the array to 5 (or 1000, or any other constant for that matter), then the complexity of your algorithm becomes O(1), so limiting the size of the array is a non-starter.
What you can do, however, is limit the values that go into the array. If you limit them to, say, 10000, or some other small number like that, you could make an O(N) algorithm like this:
Make an array of booleans called seen. The array needs to have the size of the max value that goes into your data array. Set all elements of the seen array to false. Now go through your array data, check if the boolean for the corresponding value is set, and if it is, declare a duplicate. Otherwise, set the seen flag to true. This algorithm has the complexity of O(N) in the worst case.
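A minimal sketch of the seen-array approach, assuming the values are integers between 0 and max_value (hence the array of size max_value + 1). The function name has_duplicate_bounded and the range check are mine.

```python
def has_duplicate_bounded(data, max_value):
    """Duplicate check for integers limited to 0..max_value."""
    seen = [False] * (max_value + 1)  # one flag per possible value
    for v in data:
        if not 0 <= v <= max_value:
            raise ValueError("value outside the allowed range")
        if seen[v]:
            return True   # this value was already encountered
        seen[v] = True
    return False

print(has_duplicate_bounded([3, 9, 4, 9], max_value=10000))  # True
print(has_duplicate_bounded([3, 9, 4], max_value=10000))     # False
```

Setting up the flags costs O(max_value) on top of the O(N) scan, which is why the bound on the values has to be a fixed constant for the whole thing to count as O(N).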
You could expand this algorithm to allow any range of values, as long as the value has a good hash function. Replace the array seen with a hash set, and use the same algorithm. Since adding and retrieving data in a hash set takes constant time on average, the expected asymptotic complexity of the algorithm would not change.
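The hash-set variant is nearly the same code. Here is a sketch using Python's built-in set; the function name has_duplicate is illustrative, and the O(N) figure is an expected bound that relies on average-case constant-time set operations.

```python
def has_duplicate(data):
    """Duplicate check for any hashable values, expected O(N) time, O(N) space."""
    seen = set()
    for v in data:
        if v in seen:
            return True   # v was added on an earlier iteration
        seen.add(v)
    return False

print(has_duplicate([10**12, -7, 42, -7]))  # True
print(has_duplicate(["a", "b", "c"]))       # False
```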
Finally, you can sort the array and look for duplicates in O(N*logN). This algorithm has a slightly worse time complexity, but its space complexity is O(1) if you sort in place (for example with heapsort), whereas the algorithm using a hash set has a space complexity of O(N), which may be significant.
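A sketch of the sort-then-scan approach. One caveat for Python specifically: list.sort() is Timsort, which can use O(N) temporary space in the worst case, so to really get O(1) extra space you would need an in-place sort such as heapsort. The function name is made up, and note that it reorders its argument.

```python
def has_duplicate_sorted(data):
    """Sort in place, then compare each element to its successor: O(N log N)."""
    data.sort()  # reorders the caller's list
    for i in range(1, len(data)):
        if data[i - 1] == data[i]:
            return True   # equal values end up next to each other after sorting
    return False

print(has_duplicate_sorted([5, 3, 8, 3]))  # True
print(has_duplicate_sorted([5, 3, 8]))     # False
```

The scan after the sort is exactly the single loop of n-1 comparisons that works on already sorted input.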
This was inspired by a question at a job interview: how do you efficiently generate N unique random numbers? Their security and distribution/bias don't matter.
I proposed the naive way of calling rand() N times and eliminating dupes by trial and error, which gave an inefficient and flawed solution. Then I read this SO question; those algorithms are great for getting quality unique numbers, and they are O(N).
But I suspect there are ways to get low-quality unique random numbers for dummy tasks in less than O(N) time complexity. I got some possible ideas:
Store many precomputed lists each containing N numbers and retrieve one list randomly. Complexity is O(1) for fixed N. Storage space used is O(NR) where R is number of lists.
Generate N/2 unique random numbers and then split each of them into 2 unequal parts (floor/ceil for odd numbers, n+1/n-1 for even). I know this is flawed (duplicates can pop up) and O(N/2) is still O(N). This is more food for thought.
Generate one big random number and then squeeze more variants from it by some fixed manipulations like bitwise operations, factorization, recursion, MapReduce or something else.
Use a quasi-random sequence somehow (not a math guy, just googled this term).
Your ideas?
Presumably this routine has some kind of output (i.e. the results are written to an array of some kind). Populating an array (or some other data-structure) of size N is at least an O(N) operation, so you can't do better than O(N).
You can generate the random numbers one after another, and whenever the result already contains the new number, just add the largest number generated so far to it.
Detecting whether a number was already generated is O(1) (using a hash set). So it's O(N) overall, with only N random() calls.
Of course, this assumes that we do not overflow the upper limit (e.g. by using BigInteger).
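Here is how I read that idea, as a sketch only. The name unique_randoms and the upper parameter are hypothetical. It assumes every draw is at least 1, so that adding the largest value seen so far always lands strictly above everything already generated. The output is biased and can exceed upper, which the question said is acceptable.

```python
import random

def unique_randoms(n, upper, rng=random):
    """n distinct positive ints using exactly n random draws (biased, may exceed upper)."""
    seen = set()
    result = []
    largest = 0
    for _ in range(n):
        r = rng.randint(1, upper)   # draw is always >= 1
        if r in seen:
            r += largest            # now strictly greater than every earlier number
        seen.add(r)
        result.append(r)
        largest = max(largest, r)
    return result

print(unique_randoms(5, upper=3))
```

Python integers are arbitrary precision, so the overflow concern above does not arise here; in a fixed-width language it would.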
Given an unsorted integer array, and without making any assumptions on the numbers in the array:
Is it possible to find two numbers whose difference is minimum in O(n) time?
Edit: Difference between two numbers a, b is defined as abs(a-b)
Find the smallest and largest elements in the list. The signed difference smallest - largest will be the minimum.
If you're looking for the minimum nonnegative difference, then this is of course at least as hard as checking whether the array has two equal elements. This is called the element uniqueness problem, and without any additional assumptions (like limiting the size of the integers, or allowing operations other than comparison) it requires Ω(n log n) time. It is the 1-dimensional case of finding the closest pair of points.
I don't think you can do it in O(n). The best I can come up with off the top of my head is to sort them (which is O(n * log n)) and find the minimum difference of adjacent pairs in the sorted list (which adds another O(n)).
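The sort-then-scan approach in a few lines, as a sketch (the function name min_difference_pair is made up). It relies on the fact that the closest pair must be adjacent once the values are sorted.

```python
def min_difference_pair(values):
    """Return the pair (a, b), a <= b, with the smallest abs(a - b). O(n log n)."""
    if len(values) < 2:
        raise ValueError("need at least two numbers")
    ordered = sorted(values)                       # O(n log n)
    # O(n) scan over adjacent pairs of the sorted list.
    return min(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])

print(min_difference_pair([30, 5, 20, 9]))  # (5, 9)
```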
I think it is possible. The secret is that you don't actually have to sort the list, you just need to create a tally of which numbers exist. This may count as "making an assumption" from an algorithmic perspective, but not from a practical perspective. We know the ints are bounded by a min and a max.
So, create an array of 2-bit elements, one for each int from INT_MIN to INT_MAX inclusive, and set all of them to 00.
Iterate through the entire list of numbers. For each number in the list, if the corresponding 2 bits are 00 set them to 01. If they're 01 set them to 10. Otherwise ignore. This is obviously O(n).
Next, if any of the 2-bit elements is set to 10, that is your answer: the minimum distance is 0 because the list contains a repeated number. If not, scan through the list and find the minimum distance. Many people have already pointed out there are simple O(n) algorithms for this.
So O(n) + O(n) = O(n).
Edit: responding to comments.
Interesting points. I think you could achieve the same results without making any assumptions by finding the min/max of the list first and using a sparse array ranging from min to max to hold the data. That takes care of the INT_MIN/MAX assumption, the space complexity and the O(m) time complexity of scanning the array.
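A sketch of the tally idea restricted to the min..max range. To keep it short it uses a dense bytearray (one byte per possible value) instead of 2-bit entries or a sparse array, which is my simplification. Because of that it costs O(n + (max - min)) time and memory, so it is linear in n only when the spread of the values is itself O(n).

```python
def min_difference_tally(values):
    """Smallest abs(a - b) over all pairs, via a tally of which values occur."""
    if len(values) < 2:
        raise ValueError("need at least two numbers")
    lo, hi = min(values), max(values)
    tally = bytearray(hi - lo + 1)   # 0 = absent, 1 = seen once, 2 = seen again
    for v in values:
        if tally[v - lo] < 2:
            tally[v - lo] += 1
    if 2 in tally:
        return 0                     # a repeated number means distance 0
    best = hi - lo
    prev = None
    for offset, count in enumerate(tally):   # walk the values in increasing order
        if count:
            if prev is not None:
                best = min(best, offset - prev)
            prev = offset
    return best

print(min_difference_tally([30, 5, 20, 9]))  # 4
```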
The best I can think of is to counting-sort the array (possibly combining equal values) and then do the sorted comparisons. Bin sort is O(n + M), M being the size of the value range (the number of possible distinct values). This has a heavy memory requirement, however. Some form of bucket or radix sort would be intermediate in time and more efficient in space.
Sort the list with radixsort (which is O(n) for integers), then iterate and keep track of the smallest distance so far.
(I assume your integers are of a fixed-width type. If they can hold arbitrarily large mathematical integers, radixsort will be O(n log n) as well.)
It seems to be possible to sort an unbounded set of integers in O(n*sqrt(log(log(n)))) time. After sorting it is of course trivial to find the minimal difference in linear time.
But I can't think of any algorithm to make it faster than this.
No, not without making assumptions about the numbers/ordering.
It would be possible given a sorted list though.
I think the answer is no, and the proof is similar to the proof that you cannot sort faster than n lg n: you have to compare all of the elements, i.e. create a comparison tree, which implies an Ω(n lg n) algorithm.
EDIT. OK, if you really want to argue, then the question does not say whether it should be a Turing machine or not. With quantum computers, you can do it in linear time :)