Time Complexity of searching - algorithm

There is a sorted array of very large size. Every element is repeated more than once, except one element. How much time will it take to find that element?
Options are:
1. O(1)
2. O(n)
3. O(log n)
4. O(n log n)

The answer to the question is O(n) and here's why.
Let's first summarize the knowledge we're given:
A large array containing elements
The array is sorted
Every item except for one occurs more than once
The question is: what is the time complexity of searching for the one item that occurs only once?
Can we use the sorted property of the array to speed up the search for that item? Yes and no.
First of all, since the array isn't sorted by the property we must use to look for the item (having only one occurrence), we cannot use the sorted property in that regard. This means that optimized search algorithms, such as binary search, are out.
However, we know that if the array is sorted, then all items with the same value will be grouped together. This means that when we look at an item we see for the first time, we only have to compare it to the following item. If it's different, we've found the item we're looking for.
"See for the first time" is important; otherwise we would simply pick the first boundary between two groups of items, where the two adjacent items differ even though neither is unique.
So we have to move from one end of the array to the other, comparing each item to the following item, and this is an O(n) operation.
Basically, since the array isn't sorted by the property we're looking at, we're back to a linear search.
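A minimal Python sketch of this linear scan, under the assumption that exactly one element occurs a single time (the helper name is illustrative, not from the original answer):

def find_unique(sorted_arr):
    # Walk the sorted array run by run and return the element whose run
    # has length 1. O(n) time, O(1) extra space.
    i, n = 0, len(sorted_arr)
    while i < n:
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < n and sorted_arr[j + 1] == sorted_arr[i]:
            j += 1
        if j == i:
            return sorted_arr[i]   # run of length 1: the unique element
        i = j + 1                  # skip past the whole run
    return None                    # no unique element found

# find_unique([1, 1, 2, 2, 2, 3, 4, 4]) -> 3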

Must be O(n).
The fact that it's sorted doesn't help. Suppose you tried a binary method, jumping into the middle somewhere. You see that the value there has a neighbour that is the same. Now which half do you go to?
How would you write a program to find the value? You'd start at one end and check for an element that matches neither of its neighbours. You'd have to walk the whole array until you found the value, so O(n).

Related

Understanding these questions about binary search on linear data structures?

The answers are (1) and (5), but I am not sure why. Could someone please explain this to me, and why the other answers are incorrect? How can I understand how things like binary/linear search will behave on different data structures?
Thank you
I am hoping you already know about binary search.
(1) True
Explanation
For performing binary search, we have to get to the middle of the sorted list. In a linked list, to get to the middle we have to traverse half of the list starting from the head, while in an array we can get to the middle index directly if we know the length of the list. So the linked list takes O(n/2) time for a step that an array does in O(1). Therefore a linked list is not an efficient way to implement binary search. (A short sketch contrasting the two follows after point (5).)
(2) False
Same explanation as above
(3) False
Explanation
As explained in point (1), a linked list cannot be used efficiently to perform binary search, but an array can be.
(4) False
Explanation
Binary search's worst-case time is O(log n), since we don't need to traverse the whole list. In the first iteration, if the key is less than the middle value, we discard the second half of the list; we then operate on the remaining half in the same way. With every iteration we discard the part of the list we don't have to traverse, so it clearly takes less than O(n).
(5) True
Explanation
If an element is found in O(1) time, that means only one iteration of the loop ran. In the first iteration we always compare against the middle element of the list, so the search takes O(1) time only if the middle element is the key value.
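To make point (1) concrete, here is a rough Python sketch (the node class and function names are my own, purely illustrative): finding the middle of a singly linked list costs a walk over roughly half the nodes, while an array gives the middle index directly.

class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def middle_of_linked_list(head):
    # O(n): the fast pointer advances two nodes per step of the slow
    # pointer, so slow stops at the middle node.
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow

def middle_of_array(arr):
    # O(1): arrays (Python lists) support direct indexing.
    return arr[len(arr) // 2]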
In short, binary search is an elimination-based searching technique that can be applied when the elements are sorted. The idea is to eliminate half of the keys from consideration at each step by keeping the keys in sorted order. If the search key is not equal to the middle element, either the keys to the left or the keys to the right of the middle element can be eliminated from further consideration.
Now coming to your specific question,
True
Basic binary search requires that the mid-point can be found in O(1) time, which isn't possible in a linked list and can be even more expensive if the size of the list is unknown.
True.
False
False
For binary search, the mid-point calculation should be done in O(1) time, which is only possible in arrays, since array indices are known. Secondly, binary search can only be applied to arrays that are in sorted order.
False
The answer by Vaibhav Khandelwal explained it nicely, but I wanted to add some variations of the array to which binary search can still be applied. If the given array is sorted but rotated by some amount and contains duplicates, for example,
3 5 6 7 1 2 3 3 3
then binary search still applies to it, but in the worst case we need to go through this list linearly to find the required element, which is O(n).
True
If the element is found in the first attempt, i.e. it is situated at the mid-point, then it is found in O(1) time.
MidPointOfArray = (LeftSideOfArray + RightSideOfArray) / 2
The best way to understand binary search is to think of exam papers sorted by last name. To find a particular student's paper, the teacher searches within that name's range and rules out the papers that are not alphabetically close to the student's name.
For example, if the name is Alex Bob, the teacher starts her search from "B", takes out all the copies whose surname starts with "B", then repeats the process, skipping copies up to the letter "o", and so on until the paper is found or determined to be missing.
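For reference, a plain iterative binary search over a sorted Python list, matching the description above; this is a standard textbook version, not code from the original answers:

def binary_search(sorted_list, key):
    # O(log n) comparisons, O(1) extra space.
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2           # mid-point found in O(1) on an array
        if sorted_list[mid] == key:
            return mid                 # best case: key sits at the first mid-point, O(1)
        elif sorted_list[mid] < key:
            lo = mid + 1               # discard the left half
        else:
            hi = mid - 1               # discard the right half
    return -1                          # key not present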

Is there such a data structure - "linked list with samples"?

Is there such a data structure:
There is a slow list data structure, such as a linked list or data saved on disk.
There is a relatively small array of pointers to some of the elements in the "slow list", hopefully evenly distributed.
Then, when you search, you first check the array and then perform the normal search (a linked-list search, or a binary search in the case of disk data).
This looks very similar to jump search, sample search, and skip lists, but I think it is a different algorithm.
Please note I am giving the example with a linked list or a file on disk because they are slow structures.
I don't know if there's a name for this algorithm (I don't think it deserves one, though if there isn't, it could bear mine:), but I did implement something like that 10 years ago for an interview.
You can have an array of pointers to the elements of a list. An array of fixed size, say, of 256 pointers. When you construct the list or traverse it for the first time, you store pointers to its elements in the array. So, for a list of 256 or fewer elements you'd have a pointer to each element.
As the list grows beyond 256 elements, you drop every odd-numbered pointer by moving the 128 even-numbered pointers to the beginning of the array. When the array of pointers fills up again, you repeat the procedure. At every such point you double the step between the list elements whose addresses end up in the array of pointers. Initially you'd place every element's address there, then every other element's, then one out of every four, and so on.
You end up with an array of pointers to list elements spaced apart by roughly the list length / 256.
If the list is singly linked, locating the i-th element from the beginning or the end is reduced to searching within 1/256th of the list.
If the list is sorted, you can perform binary search on the array of pointers to locate the bin (the 1/256th portion of the list) in which to look further.
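A rough Python sketch of this scheme, with 256 as the sample-array size; the class and method names are mine, and details such as exactly when to resample are one possible interpretation of the description above:

SAMPLE_SIZE = 256

class _Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class SampledList:
    # Singly linked list plus an array of up to SAMPLE_SIZE pointers to
    # evenly spaced nodes. The spacing ("step") doubles whenever the
    # pointer array fills up.
    def __init__(self):
        self.head = self.tail = None
        self.length = 0
        self.step = 1        # distance between sampled nodes
        self.samples = []    # pointers into the list

    def append(self, value):
        node = _Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node
        if self.length % self.step == 0:
            if len(self.samples) == SAMPLE_SIZE:
                # Array full: keep every other pointer, double the step.
                self.samples = self.samples[::2]
                self.step *= 2
            if self.length % self.step == 0:
                self.samples.append(node)
        self.length += 1

    def get(self, i):
        # Jump to the nearest sampled node at or before index i,
        # then walk at most step - 1 links.
        node = self.samples[i // self.step]
        for _ in range(i % self.step):
            node = node.next
        return node.value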

Distribute list elements between two lists equitably

I have a list with n elements (let's say 10, for example). I want to distribute these elements into two lists, each one balanced against the other by some criterion that evaluates the value of each element. That is, the output should be two lists of 5 elements each that are approximately balanced with each other.
Thanks for your time.
You could employ a greedy strategy (I'm not certain this will give you "optimal" results, but it should give you relatively good results at least).
Start by finding the total value of all the elements in your list, V. The goal is to create two lists each with value about half this amount (ideally as close to 1/2*V as possible). Start with 3 lists, List_original, List_1, List_2.
Pull off items from List_original (starting with the largest, working your way down to the smallest) and put them into List_1 if and only if adding them to List_1 doesn't cause the total value of List_1 to exceed 1/2*V. Everything else goes into List_2.
The result will be that List_1 will be at most 1/2*V and List_2 will be at least 1/2*V. In the event that some subset of your items sums up to exactly 1/2*V then you might get equality. I haven't tried to prove/disprove this yet. Depending on how close to balanced your result has to be, this could be good enough (it should at least be very fast).
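A minimal Python sketch of this greedy strategy, assuming the elements are plain numbers (if they are richer objects, substitute whatever "value" function the criterion uses):

def greedy_split(values):
    # Largest-first greedy: an element goes into list_1 only while list_1's
    # total stays at or below half the overall value V; the rest go to list_2.
    half = sum(values) / 2
    list_1, list_2 = [], []
    running = 0
    for v in sorted(values, reverse=True):
        if running + v <= half:
            list_1.append(v)
            running += v
        else:
            list_2.append(v)
    return list_1, list_2

# greedy_split([8, 7, 6, 5, 4]) -> ([8, 7], [6, 5, 4]); both halves sum to 15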
I came up with a quick "solution": take the average value of the full list, order the list by value, seed each of the two sublists with one of the two highest values, and then iterate over the rest. On each iteration I compared the average of the full list with the average each sublist would have after adding the current element, and put the element into whichever list's average stayed closer to the full list's average, continuing until the lists were full.
I know it is not the best choice but it was good enough for now.
Hope my explanation was clear enough.
Thanks to all.
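For completeness, a rough Python rendering of the heuristic described above; the seeding of each sublist with one of the two highest values and the "closer average wins" rule follow the description, while details such as tie-breaking and the cap that stops a list once it is full are my own simplifications:

def split_by_average(values):
    # Seed each sublist with one of the two largest values, then give every
    # remaining element to the sublist whose average (after adding it) would
    # stay closer to the average of the full list.
    target = sum(values) / len(values)
    ordered = sorted(values, reverse=True)
    list_1, list_2 = [ordered[0]], [ordered[1]]
    for v in ordered[2:]:
        avg_1 = (sum(list_1) + v) / (len(list_1) + 1)
        avg_2 = (sum(list_2) + v) / (len(list_2) + 1)
        if abs(avg_1 - target) <= abs(avg_2 - target):
            list_1.append(v)
        else:
            list_2.append(v)
    return list_1, list_2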

It's about data structures

There are two unsorted lists containing integers, and I need to find the largest integer common to both lists.
I have an idea about this question: I thought first we need to find the largest element in the first list, and then apply a linear search over the second list using that largest element. Is this logic correct? If it's not, can someone help me with the logic for this question?
Can anyone help me out with this question, please?
The problem with your first thought is that if the largest element of the first list does not occur in the second, you would never try another candidate.
The most efficient way I can think of in a short time is this:
1. Order both arrays in descending order.
2. Grab the first element in the first array.
3. Compare it to the first element in the second array.
4. If they are the same, you are done.
5. If the first array's element is larger, pop it off array 1 and repeat from step 2.
6. If the first array's element is smaller, pop the first element off array 2 and repeat from step 2.
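A Python sketch of these steps; it uses index pointers instead of literally popping elements, which is equivalent but avoids the cost of removing from the front of a list:

def largest_common(list_a, list_b):
    # Sort both lists in descending order, then walk them in lockstep:
    # the first match encountered is the largest common value.
    a = sorted(list_a, reverse=True)
    b = sorted(list_b, reverse=True)
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            return a[i]
        elif a[i] > b[j]:
            i += 1        # "pop" the larger head of list a
        else:
            j += 1        # "pop" the larger head of list b
    return None           # no common element

# largest_common([3, 9, 4, 1], [4, 7, 2]) -> 4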

Find a common element within N arrays

If I have N arrays, what is the best way (best in time complexity; space is not important) to find the common elements? You could just find one common element and stop.
Edit: The elements are all numbers.
Edit: These are unsorted. Please do not sort and scan.
This is not a homework problem. Somebody asked me this question a long time ago. He was using a hash to solve the problem and asked me if I had a better way.
Create a hash index with elements as keys and counts as values. Loop through all values and update the counts in the index. Afterwards, run through the index and check which elements have count = N. Looking up an element in the index should be O(1), so combined with looping through all M elements (across every array) this should be O(M).
If you want to keep order specific to a certain input array, loop over that array and test the element counts in the index in that order.
Some special cases:
If you know that the elements are (positive) integers with a maximum value that is not too high, you could just use a normal array as the "hash" index to keep counts, with the number itself serving as the array index.
I've assumed that in each array each number occurs only once. Adapting it for more occurrences should be easy (set the i-th bit in the count while processing the i-th array, or only update if the current element's count == i-1).
EDIT: when I answered, the question did not yet include the part about wanting "a better way" than hashing.
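A minimal Python sketch of this counting approach, under the same assumption noted above (each number occurs at most once per array):

from collections import Counter

def find_common(arrays):
    # Count occurrences across all arrays; a value whose count equals the
    # number of arrays appears in every one of them.
    counts = Counter()
    for arr in arrays:
        counts.update(arr)
    n = len(arrays)
    return [value for value, c in counts.items() if c == n]

# find_common([[1, 3, 5], [3, 4, 5], [5, 3, 9]]) -> [3, 5]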
The most direct method is to intersect the first 2 arrays and then intersect the result with each of the remaining N-2 arrays.
If 'intersection' is not defined in the language you're working in, or you require a more specific answer (i.e. you need the answer to 'how do you do the intersection'), then modify your question accordingly.
Without sorting there isn't an optimized way to do this based on the information given (i.e. sorting positions all elements relative to each other, and then iterating over the arrays lets you check for elements present in all of them at once).
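If the working language does provide an intersection primitive (Python's sets do), the reduction described above is short; a sketch:

def common_elements(arrays):
    # Intersect the first array with each of the remaining N-1 arrays in turn.
    common = set(arrays[0])
    for arr in arrays[1:]:
        common &= set(arr)
        if not common:     # early exit: nothing left in common
            break
    return common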
The question asks whether there is a better way than hashing. There is no better way (i.e. better time complexity) than using a hash, as the time to hash each element is typically constant. Empirical performance is also favorable, particularly if the range of values can be mapped one-to-one onto an array maintaining counts. The time is then proportional to the number of elements across all the arrays. Sorting will not give better complexity, since it still needs to visit each element at least once, and then there is the log n factor for sorting each array.
Back to hashing: from a performance standpoint, you will get the best empirical performance by not processing each array fully, but processing only a block of elements from each array before proceeding to the next array. This takes advantage of the CPU cache. It also results in fewer elements being hashed in favorable cases where common elements appear in the same regions of the arrays (e.g. common elements at the start of all arrays). Worst-case behaviour is no worse than hashing each array in full; it merely means all elements get hashed.
I don't think the approach suggested by catchmeifyoutry will work.
Let us say you have two arrays
1: {1,1,2,3,4,5}
2: {1,3,6,7}
then the answer should be 1 and 3. But if we use the hashtable approach, 1 will have a count of 3 (not 2), and we will never report 1 in this situation.
The problem becomes even more complex if we have input something like this:
1: {1,1,1,2,3,4}
2: {1,1,5,6}
Here I think we should give the output 1, 1. The suggested approach fails in both cases.
Solution:
Read the first array and put its elements into a hashtable; if we find the same key again, don't increment the counter. Read the second array in the same manner. Now the hashtable holds the common elements as the entries with a count of 2.
But again, this approach fails on the second input set I gave earlier.
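A sketch of the two-array variant just described, in Python; as the answer itself notes, it reports each common value only once (1 and 3 for the first example, just 1 for the second):

def common_of_two(arr_1, arr_2):
    # Insert each key of the first array once (repeats don't increment),
    # then bump a key at most once while reading the second array.
    # Keys with a count of 2 are common to both arrays.
    seen = {}
    for x in arr_1:
        seen.setdefault(x, 1)
    for x in arr_2:
        if seen.get(x) == 1:
            seen[x] = 2
    return [x for x, count in seen.items() if count == 2]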
I'd first start with the degenerate case: finding the common elements between 2 arrays (more on this later). From there I'd have a collection of common values, which I'd use as an array itself and compare against the next array. This check is performed N-1 times, or until the "carry" array of common elements drops to size 0.
One could speed this up, I'd imagine, by divide and conquer, splitting the N arrays into the leaf nodes of a tree. The next level up the tree holds N/2 common-element arrays, and so forth until you have a single array at the top that is either populated or empty. In either case, you'd have your answer.
Without sorting and scanning, the best operational speed you'll get for comparing 2 arrays for common elements is O(N^2).
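A sketch of the divide-and-conquer variant in Python, pairing the N arrays up and intersecting level by level; it uses set intersection for each pairwise step rather than the O(N^2) element-by-element compare mentioned above:

def common_divide_and_conquer(arrays):
    # Tree reduction: intersect arrays pairwise, then intersect those
    # results pairwise, until a single set of common elements remains.
    if not arrays:
        return set()
    level = [set(arr) for arr in arrays]
    while len(level) > 1:
        next_level = [level[i] & level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            next_level.append(level[-1])   # odd array out carries up a level
        level = next_level
    return level[0]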
