Finding out the middle point in mergesort in Python - algorithm

I have written two different implementations of the mergesort algorithm with only one difference: the formula used to find the middle point at which the array is divided.
First implementation (runs correctly):
def mergesort(arr):
    start = 0
    end = len(arr) - 1
    if len(arr) > 1:
        mid = int(len(arr)/2)
        left = mergesort(arr[:mid])
        right = mergesort(arr[mid:])
        return merge(left,right)
    else:
        return arr
def merge(left,right):
    final = []
    while len(left) > 0 or len(right) > 0:
        if len(left) > 0 and len(right) > 0:
            if left[0] < right[0]:
                final.append(left[0])
                del left[0]
            elif right[0] < left[0]:
                final.append(right[0])
                del right[0]
        elif len(right) > 0:
            final.extend(right)
            right = []
        elif len(left) > 0:
            final.extend(left)
            left = []
    return final
arr = list(map(int,input().split(' ')))
print ("List before sorting:",arr)
final = mergesort(arr)
print ("After sorting:",final)
Second implementation (Gets into an infinite loop):
def mergesort(arr):
    start = 0
    end = len(arr) - 1
    if len(arr) > 1:
        mid = int(start + (end - start)/2)
        left = mergesort(arr[:mid])
        right = mergesort(arr[mid:])
        return merge(left,right)
    else:
        return arr
def merge(left,right):
    final = []
    while len(left) > 0 or len(right) > 0:
        if len(left) > 0 and len(right) > 0:
            if left[0] < right[0]:
                final.append(left[0])
                del left[0]
            elif right[0] < left[0]:
                final.append(right[0])
                del right[0]
        elif len(right) > 0:
            final.extend(right)
            right = []
        elif len(left) > 0:
            final.extend(left)
            left = []
    return final
arr = list(map(int,input().split(' ')))
print ("List before sorting:",arr)
final = mergesort(arr)
print ("After sorting:",final)
I have seen the second formula used in the quicksort algorithm. If my objective is just to divide the array (as in quicksort), why does it go into an infinite loop?
I am very puzzled and cannot come to any logical conclusion.
Can someone please shed some light on the matter? Thanks a lot in advance.

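With slices, start is always 0 and end is len(arr) - 1, so for a two-element list the second formula gives mid = int(0 + (1 - 0)/2) = 0. Then arr[:mid] is empty and arr[mid:] is the whole list again, so the recursion never terminates. The second method should be used when working with a single array rather than multiple instances of a sub-array. Instead of using actual separate sub-arrays, the original array is split into logical sub-arrays via an index range. The mergesort function would take 3 parameters, mergesort(arr, start, end), with end exclusive, and the caller would call mergesort(arr, 0, len(arr)). The merge function would take 4 parameters, merge(arr, start, mid, end).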
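A minimal sketch of the index-range version described above, assuming end is exclusive so that the initial call is mergesort(arr, 0, len(arr)). The body of merge is one possible way to write it and is not taken from the answer; for simplicity it still copies the two runs.

```python
def merge(arr, start, mid, end):
    # Merge the sorted runs arr[start:mid] and arr[mid:end] back into arr.
    left, right = arr[start:mid], arr[mid:end]
    i = j = 0
    k = start
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1
    # Exactly one of the two runs still has elements; copy them over.
    arr[k:end] = left[i:] + right[j:]


def mergesort(arr, start, end):
    # Sort arr[start:end] in place; a range of 0 or 1 elements is already sorted.
    if end - start > 1:
        mid = start + (end - start) // 2  # the "second formula", on real indices
        mergesort(arr, start, mid)
        mergesort(arr, mid, end)
        merge(arr, start, mid, end)


data = [5, 2, 9, 1, 5, 6]
mergesort(data, 0, len(data))
print(data)  # [1, 2, 5, 5, 6, 9]
```

Because mid is computed from a real index range, a two-element range always splits into two one-element ranges, so the recursion terminates.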
Efficiency could be improved by using an entry function that takes one parameter, mergesort(arr). It would allocate a single working array tmp and pass it to the internal functions: the call from mergesort(arr) would be mergesort(arr, tmp, 0, len(arr)). The internal mergesort function would be mergesort(arr, tmp, start, end), and the merge function would be merge(arr, tmp, start, mid, end).
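The single working array idea could be sketched as follows. Python has no function overloading, so the internal functions are given the hypothetical names _mergesort and _merge here; those names are assumptions for illustration only.

```python
def mergesort(arr):
    # Entry point: allocate one working array and reuse it for every merge.
    tmp = [None] * len(arr)
    _mergesort(arr, tmp, 0, len(arr))


def _mergesort(arr, tmp, start, end):
    # Sort arr[start:end] in place (end is exclusive).
    if end - start > 1:
        mid = start + (end - start) // 2
        _mergesort(arr, tmp, start, mid)
        _mergesort(arr, tmp, mid, end)
        _merge(arr, tmp, start, mid, end)


def _merge(arr, tmp, start, mid, end):
    # Merge arr[start:mid] and arr[mid:end] into tmp, then copy the range back.
    i, j = start, mid
    for k in range(start, end):
        if i < mid and (j >= end or arr[i] <= arr[j]):
            tmp[k] = arr[i]
            i += 1
        else:
            tmp[k] = arr[j]
            j += 1
    arr[start:end] = tmp[start:end]


data = [3, 1, 2, 3, 0, -5, 8]
mergesort(data)
print(data)  # [-5, 0, 1, 2, 3, 3, 8]
```

The recursive calls only pass indices, so no per-call sub-lists are created apart from the slice used to copy the merged range back.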

Related

LeetCode 1707. Maximum XOR With an Element From Array

You are given an array nums consisting of non-negative integers. You are also given a queries array, where queries[i] = [xi, mi].
The answer to the ith query is the maximum bitwise XOR value of xi and any element of nums that does not exceed mi. In other words, the answer is max(nums[j] XOR xi) for all j such that nums[j] <= mi. If all elements in nums are larger than mi, then the answer is -1.
Return an integer array answer where answer.length == queries.length and answer[i] is the answer to the ith query.
This Python solution uses a Trie, but LeetCode still shows TLE (Time Limit Exceeded). Why?
import operator
class TrieNode:
    def __init__(self):
        self.left=None
        self.right=None
class Solution:
    def insert(self,head,x):
        curr=head
        for i in range(31,-1,-1):
            val = (x>>i) & 1
            if val==0:
                if not curr.left:
                    curr.left=TrieNode()
                    curr=curr.left
                else:
                    curr=curr.left
            else:
                if not curr.right:
                    curr.right=TrieNode()
                    curr=curr.right
                else:
                    curr=curr.right
    def maximizeXor(self, nums: List[int], queries: List[List[int]]) -> List[int]:
        res=[-10]*len(queries)
        nums.sort()
        for i in range(len(queries)):
            queries[i].append(i)
        queries.sort(key=operator.itemgetter(1))
        head=TrieNode()
        for li in queries:
            max=0
            xi,mi,index=li[0],li[1],li[2]
            m=2**31
            node = head
            pos=0
            if mi<nums[0]:
                res[index]=-1
                continue
            for i in range(pos,len(nums)):
                if mi<nums[i]:
                    pos=i
                    break
                self.insert(node,nums[i])
            node=head
            for i in range(31,-1,-1):
                val=(xi>>i)&1
                if val==0:
                    if node.right:
                        max+=m
                        node=node.right
                    else:
                        node=node.left
                else:
                    if node.left:
                        max+=m
                        node=node.left
                    else:
                        node=node.right
                m>>=1
            res[index]=max
        return -1
Here is an alternative Trie implementation that solves this problem:
[Notes: 1) compute max(x XOR y for y in A); 2) be greedy from the MSB down; 3) sort the queries]
from typing import List
class Trie:
    def __init__(self):
        self.root = {}
    def add(self, n):
        p = self.root
        for bitpos in range(31, -1, -1):
            bit = (n >> bitpos) & 1
            if bit not in p:
                p[bit] = {}
            p = p[bit]
    def query(self, n):
        p = self.root
        ret = 0
        if not p:
            return -1
        for bitpos in range(31, -1, -1):
            bit = (n >> bitpos) & 1
            inverse = 1 - bit
            if inverse in p:
                p = p[inverse]
                ret |= (1 << bitpos)
            else:
                p = p[bit]
        return ret
class Solution:
    def maximizeXor(self, nums: List[int], queries: List[List[int]]) -> List[int]:
        n = len(nums)
        trie = Trie()
        q = sorted(enumerate(queries), key = lambda x: x[1][1])
        nums.sort()
        res = [-1] * len(queries)
        i = 0
        for index, (x, m) in q:
            while i < n and nums[i] <= m:
                trie.add(nums[i])
                i += 1
            res[index] = trie.query(x)
        return res
The problem is that you're effectively rebuilding the Trie for each query: pos is reset to 0 inside the loop, so every number up to mi is inserted again each time. To make matters worse, you use a linear search to find the largest value <= mi in nums. You'd be better off simply using
max((n ^ xi for n in nums if n <= mi), default=-1)
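As a rough illustration, the brute-force expression can be wrapped into a complete function that returns the XOR value itself, as the problem statement requires. The function name maximize_xor_bruteforce is hypothetical, and at O(len(nums)) per query this is only a baseline for checking faster solutions, not something expected to pass the time limit.

```python
from typing import List


def maximize_xor_bruteforce(nums: List[int], queries: List[List[int]]) -> List[int]:
    # For each query, try every number that does not exceed mi.
    # default=-1 covers the case where no number qualifies.
    return [
        max((n ^ xi for n in nums if n <= mi), default=-1)
        for xi, mi in queries
    ]


print(maximize_xor_bruteforce([0, 1, 2, 3, 4], [[3, 1], [1, 3], [5, 6]]))  # [3, 3, 7]
```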
The solution here would be to build the trie right at the start and simply filter for values smaller than mi using that trie:
import bisect
from typing import List
# debugging helper: prints the structure of a (sub-)trie
def dump(t, indent=''):
    if t is not None:
        print(indent, "bit=", t.bit, "val=", t.val, "lower=", t.lower)
        dump(t.left, indent + '\tl')
        dump(t.right, indent + '\tr')
class Trie:
    def __init__(self, bit, val, lower):
        self.bit = bit      # bit position this node branches on
        self.val = val      # stored value for a leaf, None for an inner node
        self.lower = lower  # smallest value in this subtree
        self.left = None    # values with a 0 at position bit
        self.right = None   # values with a 1 at position bit
    def solve(self, mi, xi):
        if self.val is not None:
            # reached a leaf of the trie => candidate value, which still has to respect the maximum
            return self.val if mi == ~0 or self.val <= mi else -1
        if mi & (1 << self.bit) == 0:
            # the maximum has a zero-bit at this position => all values in the right subtree are > mi
            return -1 if self.left is None else self.left.solve(mi, xi)
        # pick based on xor-value if possible (>= so that a value equal to mi is still allowed)
        if (xi >> self.bit) & 1 == 0 and self.right is not None and (mi >= self.right.lower or mi == ~0):
            return self.right.solve(mi, xi)
        elif (xi >> self.bit) & 1 == 1 and self.left is not None:
            return self.left.solve(~0, xi)
        # pick whichever is available
        if self.right is not None and (mi >= self.right.lower or mi == ~0):
            return self.right.solve(mi, xi)
        elif self.left is not None:
            return self.left.solve(~0, xi)
        else:
            return -1
def build_trie(nums):
    nums.sort()
    # msb of max(nums); bit_length avoids math.log failing on 0
    max_bit = max(nums[-1].bit_length() - 1, 0) # I'll just assume that nums is never empty
    def node(start, end, bit, template):
        if start == end:
            # a partition without values => no Trie-node
            return None
        elif nums[start] == nums[end - 1]:
            # reached a leaf (a single value, possibly duplicated)
            return Trie(0, nums[start], nums[start])
        # find pivot for partitioning based on bit-value of specified position (bit)
        part = bisect.bisect_left(nums, template | (1 << bit), start, end)
        # build nodes for partitioning
        res = Trie(bit, None, nums[start])
        res.left = node(start, part, bit - 1, template)
        res.right = node(part, end, bit - 1, template | (1 << bit))
        return res
    return node(0, len(nums), max_bit, 0)
class Solution:
    def maximizeXor(self, nums: List[int], queries: List[List[int]]) -> List[int]:
        trie = build_trie(nums)
        res = []
        for xi, mi in queries:
            # solve returns the chosen number (or -1); the answer is its XOR with xi
            best = trie.solve(mi if mi <= nums[-1] else ~0, xi)
            res.append(-1 if best == -1 else best ^ xi)
        return res
I've been a bit lazy and simply used ~0 to signify that the maximum can be ignored since all values in the subtree are smaller than mi. The basic idea is that ~0 & x == x is true for any integer x. Not quite as simple as @DanielHao's answer, but capable of handling streams of queries.

Is there any way to use the return value as an argument to the same function during the previous recursion in merge sort

I am coding the merge sort algorithm but got stuck on a problem: I need to use the return value of the merge function as an argument to a previous recursive call of the same merge function. Sorry for not being clear.
Here is my code:
a = [10,5,2,20,-50,30]
def mergeSort(arr):
    l = 0
    h = len(arr)-1
    if h > l:
        mid = (l+h) // 2
        left = arr[l:mid+1]
        right = arr[mid+1:]
        mergeSort(left)
        mergeSort(right)
        merge(left, right)
def merge(l, r):
    subarr = []
    lc = 0
    rc = 0
    loop = True
    while loop:
        if lc > len(l)-1 and rc <= len(r)-1:
            for i in range(rc, len(r)):
                subarr.append(r[i])
            loop = False
        elif lc <= len(l)-1 and rc > len(r)-1:
            for i in range(lc, len(l)):
                subarr.append(l[i])
            loop = False
        elif l[lc] < r[rc]:
            subarr.append(l[lc])
            lc += 1
            loop = True
        elif r[rc] < l[lc]:
            subarr.append(r[rc])
            rc += 1
            loop = True
        elif l[lc] == r[rc]:
            subarr.append(l[lc])
            subarr.append(r[rc])
            lc += 1
            rc += 1
            loop = True
mergeSort(a)
Any help will be appreciated thank you :)
First you need to actually return the result. Right now you return nothing so get None back.
Secondly, just assign to the same variable. left = mergeSort(left) and so on.
UPDATE:
Here is a debugged version.
a=[10,5,2,20,-50,30]
def mergeSort(arr):
    l=0
    h=len(arr)-1
    if h>l:
        mid=(l+h)//2
        left=arr[l:mid+1]
        right=arr[mid+1:]
        # Capture the merge into variables here.
        left=mergeSort(left)
        right=mergeSort(right)
        # Need a return of the merge.
        return merge(left,right)
    else:
        # Need to return arr if arr has 0 or 1 elements.
        return arr
def merge(l,r):
    subarr=[]
    lc=0
    rc=0
    loop=True
    while loop:
        # Left side exhausted: copy whatever remains of the right side.
        # Not testing rc here also covers both sides running out together
        # (equal last elements), which would otherwise raise an IndexError.
        if lc>len(l)-1:
            for i in range(rc,len(r)):
                subarr.append(r[i])
            loop=False
        elif lc<=len(l)-1 and rc>len(r)-1:
            for i in range(lc,len(l)):
                subarr.append(l[i])
            loop=False
        elif l[lc]<r[rc]:
            subarr.append(l[lc])
            lc+=1
            loop=True
        elif r[rc]<l[lc]:
            subarr.append(r[rc])
            rc+=1
            loop=True
        elif l[lc]==r[rc]:
            subarr.append(l[lc])
            subarr.append(r[rc])
            lc+=1
            rc+=1
            loop=True
    # Need to return the results of merge.
    return subarr
# Need to actually try calling the function to see the result.
print(mergeSort(a))
I also indented more sanely. Trust me, it matters.
There are multiple problems in your code:
You do not return the sorted slice from mergeSort or from merge. Your implementation does not sort the array in place, so you must return subarr in merge, and in mergeSort return either the result of merge or arr itself if its length is less than 2.
Your code is too complicated: there are many adjustments such as mid+1, len(l)-1, etc. It is highly recommended to use index values running from 0 included to len(arr) excluded. This way you do not have to add error-prone +1/-1 adjustments.
The merge function should proceed in 3 phases: merge the left and right arrays as long as both index values are less than the array lengths, then append the remaining elements from the left array, and finally append the remaining elements from the right array.
There is no need to make 3 different tests to determine from which of the left and right arrays to take the next element; a single test is sufficient.
Also use a consistent amount of white space to indent the blocks: 3 or 4 spaces are preferable. Tabs are error-prone as they expand to different amounts of white space on different devices, and mixing tabs and spaces, as you did, is definitely a problem.
Here is a modified version:
def mergeSort(arr):
    # no need to use l and h, use len(arr) directly
    if len(arr) > 1:
        # locate the middle point
        mid = len(arr) // 2
        # left has the elements before mid
        left = arr[:mid]
        # right has the elements from mid to the end
        right = arr[mid:]
        # sort the slices
        left = mergeSort(left)
        right = mergeSort(right)
        # merge the slices into a new array and return it
        return merge(left, right)
    else:
        # return the original array (should actually return a copy)
        return arr
def merge(l, r):
    subarr = []
    lc = 0
    rc = 0
    # phase1: merge the arrays
    while lc < len(l) and rc < len(r):
        if l[lc] <= r[rc]:
            subarr.append(l[lc])
            lc += 1
        else:
            subarr.append(r[rc])
            rc += 1
    # phase2: copy remaining elements from l
    while lc < len(l):
        subarr.append(l[lc])
        lc += 1
    # phase3: copy remaining elements from r
    while rc < len(r):
        subarr.append(r[rc])
        rc += 1
    # return the merged array
    return subarr
a = [10, 5, 2, 20, -50, 30]
print(mergeSort(a))

QuickSort - Median Three

I am working on the QuickSort median-of-three algorithm.
I have no problem with using the first or the last element as the pivot. But when it comes to median-of-three, I am slightly confused. I hope someone can help me with this.
I would appreciate it if someone could provide some pseudocode.
My understanding is to get the middle index with (start + end) / 2, then swap the median pivot value into the first position; after that, it should work like the normal quicksort (partitioning and sorting).
Somehow, I couldn't get it to work. Please help!
#Array Swap function
def swap(A,i,k):
    temp=A[i]
    A[i]=A[k]
    A[k]=temp
# Get Middle pivot function
def middle(lista):
    if len(lista) % 2 == 0:
        result= len(lista) // 2 - 1
    else:
        result = len(lista) // 2
    return result
def median(lista):
    if len(lista) % 2 == 0:
        return sorted(lista)[len(lista) // 2 - 1]
    else:
        return sorted(lista)[len(lista) // 2]
# Create partition function
def partition(A,start,end):
    m = middle(A[start:end+1])
    medianThree = [ A[start], A[m], A[end] ]
    if A[start] == median(medianThree):
        pivot_pos = start
    elif A[m] == median(medianThree):
        tempList = A[start:end+1]
        pivot_pos = middle(A[start:end+1])
        swap(A,start,pivot_pos+start)
    elif A[end] == median(medianThree):
        pivot_pos = end
    #pivot = A[pivot_pos]
    pivot = pivot_pos
    # swap(A,start,end) // This line of code is to switch the first and last element pivot
    swap(A,pivot,end)
    p = A[pivot]
    i = pivot + 1
    for j in range(pivot+1,end+1):
        if A[j] < p:
            swap(A,i,j)
            i+=1
    swap(A,start,i-1)
    return i-1
count = 0
#Quick sort algorithm
def quickSort(A,start,end):
    global tot_comparisons
    if start < end:
        # This to create the partition based on the
        pivot_pos = partition(A,start,end)
        tot_comparisons += len(A[start:pivot_pos-1]) + len(A[pivot_pos+1:end])
        # This to sort the the left partition
        quickSort(A,start,pivot_pos -1)
        #This to sort the right partition
        quickSort(A,pivot_pos+1,end)
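A minimal median-of-three sketch, assuming the scheme described in the question: pick the median of the first, middle and last elements, swap it to the front, then partition as for a first-element pivot. The names median_of_three_index and quick_sort are hypothetical, and comparison counting is left out. One thing worth noting is that the middle index has to be an absolute index into A (start + (end - start) // 2), whereas middle(A[start:end+1]) in the code above returns an index relative to the slice.

```python
def median_of_three_index(A, start, end):
    # Return whichever of start, mid, end holds the median of the three values.
    mid = start + (end - start) // 2
    candidates = sorted([(A[start], start), (A[mid], mid), (A[end], end)])
    return candidates[1][1]


def partition(A, start, end):
    # Move the median-of-three pivot to the front, then partition as usual.
    p = median_of_three_index(A, start, end)
    A[start], A[p] = A[p], A[start]
    pivot = A[start]
    i = start + 1  # A[start+1:i] holds the elements smaller than the pivot
    for j in range(start + 1, end + 1):
        if A[j] < pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[start], A[i - 1] = A[i - 1], A[start]  # put the pivot in its final place
    return i - 1


def quick_sort(A, start, end):
    if start < end:
        pivot_pos = partition(A, start, end)
        quick_sort(A, start, pivot_pos - 1)
        quick_sort(A, pivot_pos + 1, end)


data = [8, 2, 4, 5, 7, 1, 3, 6]
quick_sort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```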

Recursive solution to common longest substring between two strings

I am trying to return the length of a common substring between two strings. I'm very well aware of the DP solution, however I want to be able to solve this recursively just for practice.
I have the solution to find the longest common subsequence...
def get_substring(str1, str2, i, j):
    if i == 0 or j == 0:
        return 0
    elif str1[i-1] == str2[j-1]:
        return 1 + get_substring(str1, str2, i-1, j-1)
    else:
        return max(get_substring(str1, str2, i, j-1), get_substring(str1, str2, i-1, j))
However, I need the longest common substring, not the longest common sequence of letters. I tried altering my code in a couple of ways, one being changing the base case to...
if i == 0 or j == 0 or str1[i-1] != str2[j-1]:
return 0
But that did not work, and neither did any of my other attempts.
For example, for the following strings...
X = "AGGTAB"
Y = "BAGGTXAYB"
print(get_substring(X, Y, len(X), len(Y)))
The longest substring is AGGT.
My recursive skills are not the greatest, so if anybody can help me out that would be very helpful.
package algo.dynamic;
public class LongestCommonSubstring {
    public static void main(String[] args) {
        String a = "AGGTAB";
        String b = "BAGGTXAYB";
        int maxLcs = lcs(a.toCharArray(), b.toCharArray(), a.length(), b.length(), 0);
        System.out.println(maxLcs);
    }
    private static int lcs(char[] a, char[] b, int i, int j, int count) {
        if (i == 0 || j == 0)
            return count;
        if (a[i - 1] == b[j - 1]) {
            count = lcs(a, b, i - 1, j - 1, count + 1);
        }
        count = Math.max(count, Math.max(lcs(a, b, i, j - 1, 0), lcs(a, b, i - 1, j, 0)));
        return count;
    }
}
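For readers following along in Python, the Java recursion above could be ported roughly as follows. This is only a sketch: the function name lcs_len and its default arguments are assumptions, not part of the original answer. The extra count parameter carries the length of the current run of matching characters, and every branch that breaks the run restarts it at 0. Like the Java version, it takes exponential time, so it is only suitable for short strings.

```python
def lcs_len(a, b, i=None, j=None, count=0):
    # Length of the longest common substring of a[:i] and b[:j], where
    # count is the length of the matching run that ends just after i, j.
    if i is None:
        i, j = len(a), len(b)
    if i == 0 or j == 0:
        return count
    best = count
    if a[i - 1] == b[j - 1]:
        # Extend the current run by one character.
        best = lcs_len(a, b, i - 1, j - 1, count + 1)
    # Also try starting a fresh run after dropping one character from either string.
    return max(best,
               lcs_len(a, b, i, j - 1, 0),
               lcs_len(a, b, i - 1, j, 0))


print(lcs_len("AGGTAB", "BAGGTXAYB"))  # 4
```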
You need to recurse on each string separately, which is easier to do if you have multiple recursive functions.
def longest_common_substr_at_both_start (str1, str2):
    if 0 == len(str1) or 0 == len(str2) or str1[0] != str2[0]:
        return ''
    else:
        return str1[0] + longest_common_substr_at_both_start(str1[1:], str2[1:])
def longest_common_substr_at_first_start (str1, str2):
    if 0 == len(str2):
        return ''
    else:
        answer1 = longest_common_substr_at_both_start (str1, str2)
        answer2 = longest_common_substr_at_first_start (str1, str2[1:])
        return answer2 if len(answer1) < len(answer2) else answer1
def longest_common_substr (str1, str2):
    if 0 == len(str1):
        return ''
    else:
        answer1 = longest_common_substr_at_first_start (str1, str2)
        answer2 = longest_common_substr(str1[1:], str2)
        return answer2 if len(answer1) < len(answer2) else answer1
print(longest_common_substr("BAGGTXAYB","AGGTAB") )
I am so sorry, I didn't have time to convert this into a recursive function. This was relatively straightforward to compose. A fold function (functools.reduce in Python) would make a recursive version much easier to write; most recursive functions are primitive recursive, which is why fold is so valuable.
I hope the logic in this can help with a recursive version.
(x,y)= "AGGTAB","BAGGTXAYB"
xrng= range(len(x)) # it is used twice
np=[(a+1,a+2) for a in xrng] # make pairs of list index values to use
allx = [ x[i:i+b] for (a,b) in np for i in xrng[:-a]] # make list of len>1 combinations
[ c for i in range(len(y)) for c in allx if c == y[i:i+len(c)]] # run, matching x & y
...producing this list from which to take the longest of the matches
['AG', 'AGG', 'AGGT', 'GG', 'GGT', 'GT']
I didn't realize getting the longest match from the list would be a little involved.
ls= ['AG', 'AGG', 'AGGT', 'GG', 'GGT', 'GT']
ml= max([len(x) for x in ls])
ls[[a for (a,b) in zip(range(len(ls)),[len(x) for x in ls]) if b == ml][0]]
"AGGT"

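On picking the longest match from the list above: a shorter alternative, not part of the original answer, is the built-in max with key=len, which returns the first element of maximal length.

```python
ls = ['AG', 'AGG', 'AGGT', 'GG', 'GGT', 'GT']

# max compares the elements by their length and returns the longest one.
longest = max(ls, key=len)
print(longest)  # AGGT
```
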
Modification to Selection Sort. Theoretically seems correct but doesn't give the results

I am learning Ruby, and the way I am going about it is by learning and implementing sort algorithms. While working on selection sort, I tried to modify it as follows:
In every pass, instead of finding the smallest and moving it to the top or beginning of the array, find the smallest and the largest and move them to both ends
For every pass, increment the beginning and decrease the ending positions of the array that has to be looped through
While swapping, if the identified min and max are in positions that get swapped with each other, do the swap once (otherwise, two swaps will be done, 1 for the min and 1 for the max)
This doesn't seem to work in all cases. Am I missing something in the logic? If the logic is correct, I will revisit my implementation but for now I haven't been able to figure out what is wrong.
Please help.
Update: This is my code for the method doing this sort:
def mss(array)
  start = 0;
  stop = array.length - 1;
  num_of_pass = 0
  num_of_swap = 0
  while (start <= stop) do
    num_of_pass += 1
    min_val = array[start]
    max_val = array[stop]
    min_pos = start
    max_pos = stop
    (start..stop).each do |i|
      if (min_val > array[i])
        min_pos = i
        min_val = array[i]
      end
      if (max_val < array[i])
        max_pos = i
        max_val = array[i]
      end
    end
    if (min_pos > start)
      array[start], array[min_pos] = array[min_pos], array[start]
      num_of_swap += 1
    end
    if ((max_pos < stop) && (max_pos != start))
      array[stop], array[max_pos] = array[max_pos], array[stop]
      num_of_swap += 1
    end
    start += 1
    stop -= 1
  end
  puts "length of array = #{array.length}"
  puts "Number of passes = #{num_of_pass}"
  puts "Number of swaps = #{num_of_swap}"
  return array
end
The problem can be demonstrated with this input array
7 5 4 2 6
After searching the array the first time, we have
start = 0
stop = 4
min_pos = 3
min_val = 2
max_pos = 0 note: max_pos == start
max_val = 7
The first if statement will swap the 2 and 7, changing the array to
2 5 4 7 6
The second if statement does not move the 7 because max_pos == start. As a result, the 6 stays at the end of the array, which is not what you want.
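One possible fix, sketched here in Python rather than Ruby: after the minimum has been swapped to the front, check whether the maximum was sitting at start. If it was, the first swap moved it to min_pos, so max_pos has to follow it before the second swap. The function name min_max_selection_sort is hypothetical, and the pass and swap counters from the question are omitted.

```python
def min_max_selection_sort(array):
    start, stop = 0, len(array) - 1
    while start < stop:
        min_pos = max_pos = start
        for i in range(start, stop + 1):
            if array[i] < array[min_pos]:
                min_pos = i
            if array[i] > array[max_pos]:
                max_pos = i
        # Move the minimum to the front of the unsorted range.
        array[start], array[min_pos] = array[min_pos], array[start]
        # If the maximum was at start, the swap above moved it to min_pos.
        if max_pos == start:
            max_pos = min_pos
        # Move the maximum to the end of the unsorted range.
        array[stop], array[max_pos] = array[max_pos], array[stop]
        start += 1
        stop -= 1
    return array


print(min_max_selection_sort([7, 5, 4, 2, 6]))  # [2, 4, 5, 6, 7]
```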

Resources