If values equal to the pivot are also considered during quicksort can it be called a stable sort? Here is my implementation of quicksort:
def isOrdered(aList):
    ordered = True
    for i in range(len(aList)-1):
        if aList[i] > aList[i+1]:
            ordered = False
            break
    return ordered
def partition(List,pivotIndex):
    pivot=List[pivotIndex]
    less=[]
    equal=[]
    greater=[]
    if pivotIndex<len(List):
        for i in List:
            if i<pivot:
                less.append(i)
            elif i==pivot:
                equal.append(i)
            else:
                greater.append(i)
    returnlist= [less,equal,greater]
    return returnlist
def quicksort(List,pivotIndex=0):
    sortedList = []
    if pivotIndex>=len(List):
        pivotIndex%=len(List)
    for i in partition(List,pivotIndex):
        for j in i:
            sortedList.append(j)
    pivotIndex+=1
    if isOrdered(sortedList):
        return sortedList
    else:
        return quicksort(sortedList,pivotIndex)
Is it possible to improve stability and maintain computational speed for quicksort at the same time?
I was struggling to figure out what changes need to be made to the recursive version of my code below so that the asymptotic run time just about matches the iterative version. Of course, the recursive version of the code still needs to be recursive, but I was stuck on how I should approach cutting down the run time on the recursion.
def binarySearch(alist, item):  # ITERATIVE VERSION
    first = 0
    last = len(alist)-1
    found = False
    while first<=last and not found:
        midpoint = (first + last)//2  # integer division, so the index stays an int
        if alist[midpoint] == item:
            found = True
        else:
            if item < alist[midpoint]:
                last = midpoint-1
            else:
                first = midpoint+1
    return found
def binarySearch(alist, item):  # RECURSIVE VERSION
    if len(alist) == 0:
        return False
    else:
        midpoint = len(alist)//2
        if alist[midpoint]==item:
            return True
        else:
            if item<alist[midpoint]:
                return binarySearch(alist[:midpoint],item)
            else:
                return binarySearch(alist[midpoint+1:],item)
Tried to replicate my function recursively, but the asymptotic running time was much slower than the iterative version.
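Your recursive version creates a new list on every call: each slice copies the elements it keeps, so the search does O(n) work overall instead of O(log n), which is detrimental to both time and space complexity.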
So instead of slicing alist, use first and last like you did in the iterative version:
def binarySearch(alist, item, first=0, last=None):
    if last is None:
        last = len(alist) - 1
    if last < first:
        return False
    else:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True
        elif item < alist[midpoint]:
            return binarySearch(alist, item, first, midpoint - 1)
        else:
            return binarySearch(alist, item, midpoint + 1, last)
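For comparison, here is a minimal sketch of the same membership test built on the standard library's bisect module, which also works on indices rather than slices. The function name binary_search_bisect is made up for this illustration, and it assumes alist is already sorted in ascending order.

```python
from bisect import bisect_left

def binary_search_bisect(alist, item):
    # bisect_left returns the leftmost index where item could be inserted
    # while keeping alist sorted; no copies of the list are made.
    i = bisect_left(alist, item)
    # item is present only if that index is in range and holds an equal value.
    return i < len(alist) and alist[i] == item
```

This runs in O(log n) like the index-based recursive version, and can serve as a cross-check when testing it.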
I was presented with a problem where I had to create an algorithm that takes an input integer with an even number of digits and determines whether the sum of the first n digits is equal to the sum of the last n digits, where n is the length of the number divided by two (e.g. 2130 returns True, 3304 returns False).
My solution, which works but is rather unwieldy, was as follows:
def ticket(num):
    list_num = [int(x) for x in str(num)]
    half_length = int(len(list_num)/2)
    for i in range(half_length*2):
        first_half = list_num[0:half_length]
        second_half = list_num[half_length::]
        if sum(first_half) == sum(second_half):
            return True
        else:
            return False
In an effort to improve my understanding of list comprehensions, I've tried to look at ways that I can make this more efficient but am struggling to do so. Any guidance would be much appreciated.
EDIT:
Thank you to jarmod:
Reorganised as so:
def ticket(num):
    list_num = [int(x) for x in str(num)]
    half_length = int(len(list_num)/2)
    return sum(list_num[0:half_length]) == sum(list_num[half_length::])
ticket(1230)
You can remove the unnecessary assignment and loops to create a shorter, more pythonic solution:
def ticket(num):
    digits = [int(x) for x in str(num)]  # an int cannot be sliced, so convert to digits first
    half_length = len(digits) // 2
    first_half = sum(digits[:half_length])
    second_half = sum(digits[half_length:])
    return first_half == second_half
It can be shortened further, though at some cost in readability:
def ticket(num):
    s = str(num)
    return sum(map(int, s[:len(s)//2])) == sum(map(int, s[len(s)//2:]))
I am trying to find the minimum in a rotated sorted array with duplicates:
I tried this:
def find_pivot(arr):
    lo = 0
    hi = len(arr) -1
    while lo<=hi:
        mid = (hi+lo)//2
        if arr[mid]<arr[0]:
            hi = mid-1
        else:
            lo = mid+1
    return lo
class Solution:
    """
    @param nums: a rotated sorted array
    @return: the minimum number in the array
    """
    def findMin(self, nums):
        pivot_index = find_pivot(nums)
        left = nums[:pivot_index]
        right = nums[pivot_index:]
        if right:
            return right[0]
        return left[0]
This fails when the array is [999,999,1000,1000,10000,0,999,999,999]. My algorithm returns 999, when it should return 0
How do I fix this?
Binary search alone will not work here because of the duplicates.
Suppose your array is [1,1,1,1,1,0,1,1]. Then arr[lo] = arr[hi] = arr[mid] = 1: which half do you dive into? You may say "the right one!", of course, but why? The information in those three items alone is not enough to be certain of your choice. In fact, the array could be [1,0,1,1,1,1,1,1]; again arr[lo] = arr[hi] = arr[mid] = 1, but now the pivot is not in the right half.
In the worst case, with this kind of instance you still need to scan the whole array.
If the array had a stricter ordering, meaning no duplicates (or no more than k duplicates), then binary search would be useful.
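One common way to deal with this, sketched below, is to keep the binary search but fall back to shrinking the range by a single element whenever arr[mid] and arr[hi] are equal and therefore give no information. This is not the asker's code: the name find_min is chosen for this illustration, and the sketch assumes a non-empty list that is an ascending sorted array rotated by some amount. It is still O(n) in the worst case (for example, when almost all values are equal), exactly as described above.

```python
def find_min(arr):
    # Assumes arr is non-empty and is a rotated ascending sorted array.
    lo, hi = 0, len(arr) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if arr[mid] > arr[hi]:
            # The minimum must be to the right of mid.
            lo = mid + 1
        elif arr[mid] < arr[hi]:
            # The minimum is at mid or to its left.
            hi = mid
        else:
            # arr[mid] == arr[hi]: cannot tell which half holds the minimum,
            # but dropping arr[hi] is safe because an equal value remains at mid.
            hi -= 1
    return arr[lo]
```

On the failing input from the question, find_min([999,999,1000,1000,10000,0,999,999,999]) returns 0.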
I would like to take random samples from very large lists while maintaining the order. I wrote the script below, but it requires .map(idx => ls(idx)) which is very wasteful. I can see a way of making this more efficient with a helper function and tail recursion, but I feel that there must be a simpler solution that I'm missing.
Is there a clean and more efficient way of doing this?
import scala.util.Random
def sampledList[T](ls: List[T], sampleSize: Int) = {
  Random
    .shuffle(ls.indices.toList)
    .take(sampleSize)
    .sorted
    .map(idx => ls(idx))
}
val sampleList = List("t","h","e"," ","q","u","i","c","k"," ","b","r","o","w","n")
// imagine the list is much longer though
sampledList(sampleList, 5) // List(e, u, i, r, n)
EDIT:
It appears I was unclear: I am referring to maintaining the order of the values, not the original List collection.
If by "maintaining the order of the values" you mean keeping the elements of the sample in the same order as in the ls list, then a small modification to your original solution greatly improves performance:
import scala.util.Random
def sampledList[T](ls: List[T], sampleSize: Int) = {
  Random.shuffle(ls.zipWithIndex).take(sampleSize).sortBy(_._2).map(_._1)
}
This solution has a complexity of O(n + k*log(k)), where n is the list's size, and k is the sample size, while your solution is O(n + k * log(k) + n*k).
Here is a (more complex) alternative that has O(n) complexity. You can't do any better in terms of complexity (though you could get better performance by using another collection, in particular one with a constant-time size implementation). I did a quick benchmark, which indicated that the speedup is very substantial.
import scala.util.Random
import scala.annotation.tailrec
def sampledList[T](ls: List[T], sampleSize: Int) = {
  @tailrec
  def rec(list: List[T], listSize: Int, sample: List[T], sampleSize: Int): List[T] = {
    require(listSize >= sampleSize,
      s"listSize must be >= sampleSize, but got listSize=$listSize and sampleSize=$sampleSize"
    )
    list match {
      case hd :: tl =>
        if (Random.nextInt(listSize) < sampleSize)
          rec(tl, listSize-1, hd :: sample, sampleSize-1)
        else rec(tl, listSize-1, sample, sampleSize)
      case Nil =>
        require(sampleSize == 0, // Should never happen
          s"sampleSize must be zero at the end of processing, but got $sampleSize"
        )
        sample
    }
  }
  rec(ls, ls.size, Nil, sampleSize).reverse
}
The above implementation simply iterates over the list and keeps (or skips) the current element with probability (samples still needed) / (elements remaining), which is designed to give every element the same chance of being selected. My logic may have a flaw, but at first blush it seems sound to me.
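For readers who prefer to see the selection rule spelled out imperatively, here is a minimal Python sketch of the same idea (often called selection sampling). It is an illustration only, not a translation of the Scala code line for line; the name sampled_list and the rng parameter are assumptions made for this example.

```python
import random

def sampled_list(items, sample_size, rng=random):
    items = list(items)
    remaining = len(items)          # elements not yet examined
    if not 0 <= sample_size <= remaining:
        raise ValueError("sample_size must be between 0 and len(items)")
    needed = sample_size            # samples still to be picked
    sample = []
    for item in items:
        # Keep the current element with probability needed / remaining.
        if rng.randrange(remaining) < needed:
            sample.append(item)
            needed -= 1
        remaining -= 1
    return sample                   # already in original order
```

Because elements are appended in the order they are visited, no sort or reverse is needed at the end, and the result always has exactly sample_size elements.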
Here's another O(n) implementation that should have a uniform probability for each element:
import scala.collection.mutable.ListBuffer
import scala.util.Random
implicit class SampleSeqOps[T](s: Seq[T]) {
  def sample(n: Int, r: Random = Random): Seq[T] = {
    assert(n >= 0 && n <= s.length)
    val res = ListBuffer[T]()
    val length = s.length
    var samplesNeeded = n
    for { (e, i) <- s.zipWithIndex } {
      val p = samplesNeeded.toDouble / (length - i)
      if (p >= r.nextDouble()) {
        res += e
        samplesNeeded -= 1
      }
    }
    res.toSeq
  }
}
I'm using it frequently with collections > 100'000 elements and the performance seems reasonable.
It's probably the same idea as in Régis Jean-Gilles's answer but I think the imperative solution is slightly more readable in this case.
Perhaps I don't quite understand, but since Lists are immutable you don't really need to worry about 'maintaining the order' since the original List is never touched. Wouldn't the following suffice?
def sampledList[T](ls: List[T], sampleSize: Int) =
  Random.shuffle(ls).take(sampleSize)
While my previous answer has linear complexity, it does have the drawback of requiring two passes, the first one corresponding to the need to compute the length before doing anything else. Besides affecting the running time, we might want to sample a very large collection for which it is not practical nor efficient to load the whole collection in memory at once, in which case we'd like to be able to work with a simple iterator.
As it happens, we don't need to invent anything to fix this. There is a simple and clever algorithm called reservoir sampling which does exactly this (building a sample as we iterate over a collection, all in one pass). With a minor modification we can also preserve the order, as required:
import scala.util.Random
def sampledList[T](ls: TraversableOnce[T], sampleSize: Int, preserveOrder: Boolean = false, rng: Random = new Random): Iterable[T] = {
  val result = collection.mutable.Buffer.empty[(T, Int)]
  for ((item, n) <- ls.toIterator.zipWithIndex) {
    if (n < sampleSize) result += (item -> n)
    else {
      // n is zero-based, so n + 1 items have been seen; draw from 0..n inclusive
      val s = rng.nextInt(n + 1)
      if (s < sampleSize) {
        result(s) = (item -> n)
      }
    }
  }
  if (preserveOrder) {
    result.sortBy(_._2).map(_._1)
  }
  else result.map(_._1)
}
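As a cross-check of the one-pass idea, here is a minimal Python sketch of reservoir sampling with the same order-preserving option. It is an illustration under assumptions, not the answer's code: the name reservoir_sample and its parameters are made up for this example. Each item is stored together with its position so the sample can be put back in input order at the end.

```python
import random

def reservoir_sample(iterable, sample_size, preserve_order=False, rng=random):
    reservoir = []  # holds (position, item) pairs
    for n, item in enumerate(iterable):
        if n < sample_size:
            # Fill the reservoir with the first sample_size items.
            reservoir.append((n, item))
        else:
            # n + 1 items seen so far; pick a slot in 0..n inclusive.
            s = rng.randrange(n + 1)
            if s < sample_size:
                reservoir[s] = (n, item)
    if preserve_order:
        reservoir.sort(key=lambda pair: pair[0])
    return [item for _, item in reservoir]
```

It works on any iterable, including a generator, and never needs the total length up front. If the input has fewer than sample_size items, all of them are returned.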
I am trying to create an insertion sort with linked lists. Here is what I have:
def insertion_sort(a):
    """
    -------------------------------------------------------
    Sorts a list using the Insertion Sort algorithm.
    Use: insertion_sort( a )
    -------------------------------------------------------
    Preconditions:
    a - linked list of comparable elements (?)
    Postconditions:
    Contents of a are sorted.
    -------------------------------------------------------
    """
    unsorted = a._front
    a._front = None
    while unsorted is not None and unsorted._next is not None:
        current = unsorted
        unsorted = unsorted._next
        if current._value < unsorted._value:
            current._next = unsorted._next
            unsorted._next = current
            unsorted = unsorted._next
        else:
            find = unsorted
            while find._next is not None and current._value > find._next._value:
                find = find._next
            current._next = find._next
            current = find._next
        a._front = unsorted
    return a
I believe what I have is correct in terms of sorting. However when I try to read the list in the main module I get a bunch of None values.
In this case, the insertion sort is not creating a new list when sorting. Rather, it is moving all sorted elements to the 'front'.
To summarize, I have two problems: I am not sure if the insertion sort is correct, and there are problems with the returned list a as it contains None values. Thanks in advance
Not exactly sure what the type of a is, but if you assume a simple:
class Node:
    def __init__(self, value, node=None):
        self._value = value
        self._next = node
    def __str__(self):
        return "Node({}, {})".format(self._value, self._next)
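Then your insertion sort isn't far off; it just needs to handle the head case properly:
def insertion_sort(unsorted):
    head = None
    while unsorted:
        current = unsorted
        unsorted = unsorted._next
        if not head or current._value < head._value:
            # New smallest element: it becomes the head of the sorted list.
            current._next = head
            head = current
        else:
            find = head
            # Stop at the last node whose successor is not smaller than current;
            # check find._next so the walk cannot run off the end of the list.
            while find._next and current._value > find._next._value:
                find = find._next
            current._next = find._next
            find._next = current
    return head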
>>> print(insertion_sort(Node(4, Node(1, Node(3, Node(2))))))
Node(1, Node(2, Node(3, Node(4, None))))