Sorting and searching relation [closed]

Let's say we want to find a known key in an array and extract its value. There are two possible approaches (maybe more?). The linear approach, in which we compare each array key with the needle, is O(N). Or we can sort the array in O(N*log(N)) and then apply binary search in O(log(N)). I have several questions about this.
So, as far as I can see, sorting is closely related to searching, but a stand-alone sort is useless: sorting is an instrument to simplify searching. Am I correct? Or are there other uses for sorting?
As for searching, we can search unsorted data in O(N), or sorted data in O(N*log(N)) + O(log(N)). Searching can exist separately from sorting. So when we only need to find something in an array once, we should use linear search, and if the search is repeated, we should sort the data first and then search?

Don't think an O(n * lg(n)) sort is needed before every search. That would be ridiculous, because O(n * lg(n)) + O(lg(n)) > O(n); that is, it would be quicker to do a linear search on randomly ordered data, which on average takes about n/2 comparisons.
The idea is to sort your random data only once, using an O(n * lg(n)) algorithm, and to insert any data added after that in sorted position, so every search thereafter can be done in O(lg(n)) time.
You might be interested in looking at hash tables, which are unsorted but have O(1) expected access time.
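As a minimal sketch of that trade-off in Python (the data and the needle are made up for illustration):

    import bisect

    data = [42, 7, 19, 3, 88, 55]        # unsorted keys, made-up sample

    # One-off lookup: a linear scan, O(N)
    needle = 19
    found_linear = needle in data        # compares until a match or the end

    # Repeated lookups: sort once, O(N log N), then binary search each time, O(log N)
    data.sort()
    i = bisect.bisect_left(data, needle)
    found_binary = i < len(data) and data[i] == needle
    print(found_linear, found_binary)    # True True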

It is extremely rare that you would create an array of N items and then search it only once. Therefore it is usually profitable to improve the data structure holding the items in order to improve search time (amortize the set-up time over all the searches and see if you save overall time).
However, there are many other considerations: Do you need to add new items to the collection? Do you need to remove items from the collection? Are you willing to spend extra memory to improve search speed? Do you care about the original order in which the items were added to the collection? All of these factors, and more, influence your choice of container and searching technique.

Related

Why this property of quick sort? [closed]

Why does quick sort sort a set of already sorted items and a set of reverse-sorted items with equal speed? Why don't others, like heap sort, insertion sort, or selection sort?
Standard selection sort, heap sort, and quick sort are not adaptive, whereas insertion sort is.
Look at a comparison table of sorting algorithms.
For example, both the best and the worst case for selection sort have complexity O(n^2): every pass always walks through a predetermined portion of the array.
The speed of quick sort depends on a proper choice of partition elements for the given data set. Sorted order can matter if the pivot choice is naive (for example, always taking the first element of the partition); otherwise the probability of the worst (quadratic) case is rather small.
If you need to sort almost sorted datasets, choose some adaptive sorting, for example, natural merge sort (or insertion sort for small datasets).
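Here is a small illustrative sketch, not a production sort, that counts comparisons for a deliberately naive first-element pivot; on sorted input the partitions are maximally lopsided, and the quadratic behaviour becomes visible:

    import random

    def quicksort_first_pivot(a, stats):
        # deliberately naive: always take the first element as the pivot
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        stats["comparisons"] += len(rest)
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return quicksort_first_pivot(left, stats) + [pivot] + \
               quicksort_first_pivot(right, stats)

    for label, data in [("sorted", list(range(300))),
                        ("shuffled", random.sample(range(300), 300))]:
        stats = {"comparisons": 0}
        quicksort_first_pivot(data, stats)
        print(label, stats["comparisons"])  # sorted: ~n^2/2; shuffled: ~n log n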

How to insert the element in a direct sorting order in O(1) time? [closed]

What is considered an optimal data structure for pushing elements in sorted order? I am looking for an idea, or a custom data structure, with which I can achieve insertion of each element in O(1) time while keeping the elements sorted. I do not want to use binary search, a tree, or a linked list to do it.
Values range up to 50,000 and can be inserted in any random order. After each insert my test case will check whether the data structure is sorted, so it has to be sorted after each insert.
Please share your suggestions and views on this. How can I achieve insertion in sorted order in O(1)?
Thanks
If you could do insertion in O(1) time, then you could sort a list of n elements in O(n) time. But comparison-based sorting has a proven Ω(n log n) lower bound, so the original assumption, that insertion can be done in O(1), is wrong.
If you are dealing with integers, the closest you can get to your requirements is by using a Van Emde Boas tree.
You can't get pure O(1). Either you have to do a binary search, or move elements around, or find the right place in a tree.
Hash tables will not keep your elements sorted in any way; with vEB trees you at least have a FindNext operation.
The only "sorting" you can do in O(1) is to use your sort keys as direct indexes into an array, which becomes impractical or plain impossible as soon as your keys can vary in too broad a range.
Maybe "Bucket sort" will fulfill your requirement of O(1) insertion in sorted list, limited value range & insert with random order.
For example, you can split 1~50,000 number to 10,000 buckets, then when you get a number N, you can push it in bucket n/5. after that, you just need to rerank the number in bucket n/5.
this is "nearly" O(1).

Some questions about data structures and sort algorithm [closed]

I'm learning about data structures and sorting algorithms, and I have some questions that I want to ask:
When do we choose an array and when do we choose a linked list for a sorting algorithm?
Which sorting algorithm should we use for small data and which for big data? I know it depends on the situation and that we should choose the algorithm accordingly, but I can't understand the specifics.
Linked-list or array
Array is the more common choice.
Linked-list is mostly used when your data is already in a linked-list, or you need it in a linked-list for your application.
Not that I've really seen a justifiable cause to use one over the other (except that most sorting algorithms are designed around arrays). Both can be sorted in O(n log n), at least with comparison-based sorting algorithms.
When to use what
With comparison-based sorting, insertion sort is typically used for < ~10-20 elements, as it has low constant factors, even though it has O(n²) running time. For more elements, quick-sort or merge-sort (both running in O(n log n)) or some derivation of either is typically faster (although there are other O(n log n) sorting algorithms).
Insertion sort also performs well (O(n)) on nearly sorted data.
For non-comparison-based sorting, it really depends on your data. Radix sort, bucket sort and counting sort are all well-known examples, and each have their respective uses. A brief look at their running time should give you a good idea of when they should be used. Counting sort, for example, is good if the range of values to be sorted is really small.
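For example, a minimal counting sort sketch; it runs in O(n + k) for n values in the range 0..k, so it only pays off when k is small (the sample data is made up):

    def counting_sort(values, max_value):
        # O(n + k), where k = max_value; practical only when the range is small
        counts = [0] * (max_value + 1)
        for v in values:
            counts[v] += 1
        out = []
        for v, c in enumerate(counts):
            out.extend([v] * c)            # emit each value as many times as seen
        return out

    print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], 9))  # [1, 1, 2, 3, 4, 5, 6, 9]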
You can see Wikipedia for a list of sorting algorithms.
Keep in mind that sorting fewer than, say, 10,000 elements will be blazingly fast with any of these sorting algorithms; unless you need the absolute best performance, you can really pick whichever one you want.
To my understanding there is no definitive answer to either question, as both depend on the context of usage. However, the following points might be of importance:
If the records to be sorted are large and implemented as a value type, an array might be unfavourable, since exchanging records involves copying data, which might be slower than redirecting references.
The instance size at which to switch sorting algorithms is usually found by experimentation in a specific context; perhaps Quicksort is used for the 'large' instances, whereas Merge Sort is used for 'small' instances, and the actual best separation between 'large' and 'small' is found by trying things out in the specific context (see the sketch below).
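As an illustration, here is a hedged sketch of such a hybrid: a quicksort that hands 'small' instances to insertion sort below a cutoff. The cutoff value and the use of insertion sort for the small case are common choices for illustration, not prescribed by the answer above:

    def insertion_sort_range(a, lo, hi):
        # plain insertion sort on a[lo..hi]; low overhead on small slices
        for i in range(lo + 1, hi + 1):
            x, j = a[i], i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    SMALL = 16                             # hypothetical cutoff; tune by experiment

    def hybrid_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo + 1 <= SMALL:
            insertion_sort_range(a, lo, hi)  # switch algorithms on small instances
            return
        pivot = a[(lo + hi) // 2]            # middle-element pivot
        i, j = lo, hi
        while i <= j:                        # classic partition pass
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        hybrid_quicksort(a, lo, j)
        hybrid_quicksort(a, i, hi)

In practice the best SMALL value is found exactly as the answer says: by benchmarking in the specific context.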

Sorting algorithm that depends on initial organization of data [closed]

I am presently studying sorting algorithms. I have learned that the quick sort algorithm depends on the initial organization of the data: if the array is already sorted, quick sort becomes slower. Is there any other sort that depends on the initial organization of the data?
Of course. Insertion sort will be O(n) with the descending sorted input:
    def insertion_sort(arr):
        out = []
        while arr:
            x = arr.pop()                  # take the last element, O(1)
            # scan from the right for x's slot; O(1) when x belongs at the end
            i = len(out)
            while i > 0 and out[i - 1] > x:
                i -= 1
            out.insert(i, x)
        return out
because each insert call is O(1): with descending input, the popped elements arrive in ascending order, so each one belongs at the end of out. If elements were popped from the front instead, the sort would be fastest on ascending input (this assumes the pop operation itself is O(1)).
All fast sorting algorithms minimize comparison and move operations. The number of move operations depends on the initial element ordering (I'm assuming that by "initial organization" you mean the initial element ordering).
Additionally, the fastest real-world algorithms exploit locality of reference, which also depends on the initial ordering.
If you are only interested in dependencies that slow down or speed up the sorting dramatically: bubble sort, for example, completes in one pass on sorted data.
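A sketch of that early-exit variant of bubble sort; the swapped flag is what makes the single pass on sorted data possible:

    def bubble_sort_early_exit(a):
        # adaptive: already-sorted input costs exactly one O(n) pass
        for end in range(len(a) - 1, 0, -1):
            swapped = False
            for i in range(end):
                if a[i] > a[i + 1]:
                    a[i], a[i + 1] = a[i + 1], a[i]
                    swapped = True
            if not swapped:
                break                      # no swaps: the array is already sorted
        return a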
Finally, many sorting algorithms have average time complexity O(N log N) but worst-case complexity O(N^2). This means there exist specific inputs (e.g. sorted or reverse-sorted) that provoke the O(N^2) behaviour in these algorithms. Some quicksort versions are examples.
If what you're asking is "should I worry about which sorting algorithm to pick on a case-by-case basis?", then unless you're processing thousands of millions of operations, the short answer is "no". Most of the time quicksort will be just fine (quicksort with a calculated pivot, like Java's).
In the general case, quicksort is good enough.
On the other hand, if your system always receives its source data in a consistent initial order and spends significant CPU time and power sorting it each run, then you should definitely find the right algorithm for that corner case.

Different Dictionary Implementations [closed]

I am preparing for an exam in algorithm analysis, and after learning C# and implementing Dictionary in different ways I am confused about the advantages and disadvantages.
So here are my questions:
What is a reason to implement a Dictionary using an unordered array instead of an always-sorted array?
Reasons to implement a Dictionary using an always-sorted array instead of an unordered array?
Reasons to implement a Dictionary using a Binary Search Tree instead of an always-sorted array?
If you use an unordered array, you can just tack items onto the end, or copy everything into a larger array and tack items onto its end when the original fills up. Insertion is therefore O(1), or O(n) when a copy is needed, but any lookup is O(n).
With an ordered array you gain the ability to search it more quickly, through binary search or other clever searches, but insertion gets more expensive: you must shift elements around every time you insert, which is O(n) per element in the worst case (even input that arrives in sorted order can hit this worst case, if every new item belongs at the front).
With a binary search tree you can find any node, based on the key the tree is ordered on, in O(log n) time, though this is only guaranteed for a balanced tree; an unbalanced binary search tree (essentially a linked list) has a worst-case lookup of O(n). With a balanced tree, insertion is somewhat more expensive because the tree may need reorganizing, but common balanced trees (red-black trees, AVL trees) keep rebalancing within O(log n) per insertion.
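To make the sorted-array trade-off concrete, here is a toy dictionary over parallel sorted arrays using Python's bisect; lookup is O(log n) but insert pays O(n) for the shift (the class and method names are made up for illustration):

    import bisect

    class SortedArrayDict:
        # toy dictionary: O(log n) lookup, O(n) insert due to shifting
        def __init__(self):
            self.keys, self.values = [], []

        def insert(self, key, value):
            i = bisect.bisect_left(self.keys, key)   # O(log n) to find the slot
            if i < len(self.keys) and self.keys[i] == key:
                self.values[i] = value               # overwrite an existing key
            else:
                self.keys.insert(i, key)             # O(n): shifts elements right
                self.values.insert(i, value)

        def lookup(self, key):
            i = bisect.bisect_left(self.keys, key)
            if i < len(self.keys) and self.keys[i] == key:
                return self.values[i]
            raise KeyError(key)

    d = SortedArrayDict()
    d.insert("b", 2)
    d.insert("a", 1)
    print(d.lookup("a"))   # 1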
