Recursive Merge Sort Algorithm - algorithm

I'm working on an algorithms problem and am confused about what the correct answer should look like. I have an answer, but if anyone could give me feedback or guidance I would greatly appreciate it.
Problem :
Casc Merge is a recursive algorithm: Assume there
are n sorted lists, each of size m. Recursively
Casc Merge the first n − 1 lists and then
Merge the last list (of size m) into the sorted list A (which is of size (n-1)m).
1) Write down code for it.
Here's what I've thought of so far. I hope I'm on the right track, but as I said, I have no way of knowing. I tried googling and didn't find much help.
proc Casc(A, L, c)
    if c == n then
        Merge(A, L[c-1], L[c])
    end if
    else
        Merge(A, Casc(A, L, c), Casc(A, L, c+1))
    end else
end proc
Again, thank you in advance for any advice or feedback on the pseudocode.
Assuming merge does m + n - 1 comparisons
S(n) = { 1 if c = 1
S(n-1) + m - 1 otherwise
}

I think you're pretty close. Though, it looks like you might have a "double booking" of merges near your base case, where c==n, as you're merging L[c-1] and L[c] into A, but in the call previous to that, you're already performing a merge of L[c-1] with L[c].
If you were to think about the recursive algorithm in terms of the definition, you could write it out using slightly more simplified logic:
procedure Cascade(List A, List[] L, int c)
    // base case (1-based indexing)
    if (c == 1) then
        Merge(A, L[c], {})  // merge L[c] with the empty list into A
    else
        // use the recursive definition
        Merge(A, L[c], Cascade(A, L, c-1))  // merge L[c] with the cascade of the first c-1 lists
    end if
end procedure
You'd call the procedure like so: Cascade({}, L, n)
You can work through it like so:
n = 3
L = {{1 2 3} {3 2 1} {4 5 6}}
A = {}
First call to Merge yields:
Merge(A, {4 5 6}, Cascade({}, {{1 2 3} {3 2 1} {4 5 6}}, 2))
Then:
c = 2
Merge(A, {3 2 1}, Cascade({}, {{1 2 3} {3 2 1} {4 5 6}}, 1))
Then:
c = 1 (base case)
Merge(A, {1 2 3}, {})
Trickling back up the chain (actual merge results not shown for clarity):
A = {1 2 3}
A = {1 2 3 3 2 1}
A = {1 2 3 3 2 1 4 5 6} // merged list (the actual implementation would have sorted these...)
and you're done! Hope this helps you out...
EDIT: Based on discussions fleshed out in the comments, the following is an example that uses returns to pass data, rather than in-place modification.
procedure Cascade(List[] L, int c)
    // base case (1-based indexing)
    if (c == 1) then
        A = Merge(L[c], {})  // merge L[c] with the empty list into A
    else
        // use the recursive definition
        A = Merge(L[c], Cascade(L, c-1))  // merge L[c] with the cascade of the first c-1 lists
    end if
    Return A  // return the newly merged list
end procedure
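As a rough illustration of the return-based version, here is a minimal Python sketch. The names merge and cascade_merge are my own (they are not from the question), the lists are ordinary 0-based Python lists, and c is the number of leading lists to merge, so this is one possible reading of the pseudocode rather than a definitive implementation.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list (standard two-way merge)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def cascade_merge(lists, c):
    """Merge the first c sorted lists: cascade the first c-1, then merge in list c."""
    if c == 1:
        return list(lists[0])  # base case: a copy of the first list
    return merge(cascade_merge(lists, c - 1), lists[c - 1])


example = [[1, 2, 3], [1, 2, 3], [4, 5, 6]]
result = cascade_merge(example, len(example))
```

The initial call passes the number of lists as c, mirroring Cascade(L, n) in the pseudocode.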

Related

Lua - Choose a random value from a range (or table) excluding the values of a (or another) table

A range, 1, 2, 3, 4, 5, 6, 7, 8 (it can populate a Lua table if it makes it easier)
table = {1, 4, 3}
The possible random choice should be among 2, 5, 6, 7, 8.
In Python I have used this to get it:
possibleChoices = random.choice([i for i in range(1, 9) if i not in table])
Any ideas how to achieve the same in Lua?
Lua has a very minimal library, so you will have to write your own functions to do some tasks that are automatically provided in many other languages.
A good way to go about this is to write small functions that solve part of your problem, and to incorporate those into a final solution. Here it would be nice to have a range of numbers, with certain of those numbers excluded, from which to randomly draw a number. A range can be obtained by using a range function:
-- Returns a sequence containing the range [a, b].
function range (a, b)
  local r = {}
  for i = a, b do
    r[#r + 1] = i
  end
  return r
end
To get a sequence with some numbers excluded, a seq_diff function can be written; this version makes use of a member function:
-- Returns true if x is a value in the table t.
function member (x, t)
  for k, v in pairs(t) do
    if v == x then
      return true
    end
  end
  return false
end
-- Returns the sequence u - v.
function seq_diff (u, v)
  local result = {}
  for _, x in ipairs(u) do
    if not member(x, v) then
      result[#result + 1] = x
    end
  end
  return result
end
Finally these smaller functions can be combined into a solution:
-- Returns a random number from the range [a, b],
-- excluding numbers in the sequence seq.
function random_from_diff_range (a, b, seq)
  local selections = seq_diff(range(a, b), seq)
  return selections[math.random(#selections)]
end
Sample interaction:
> for i = 1, 20 do
>> print(random_from_diff_range(1, 8, {1, 4, 3}))
>> end
8
6
8
5
5
8
6
7
8
5
2
5
5
7
2
8
7
2
6
5

Number of permutations of the numbers 1 to n in which P_i > P_{i+1} and P_i > P_{i-1}

For a given N, how many permutations of [1, 2, 3, ..., N] satisfy the following property?
Let P_1, P_2, ..., P_N denote the permutation. The property we want is that there exists an i between 2 and N-1 (inclusive) such that
P_j > P_{j+1} for all i ≤ j ≤ N-1, and
P_j > P_{j-1} for all 2 ≤ j ≤ i.
For example, for N = 3,
the permutations [1, 3, 2] and [2, 3, 1] satisfy the property.
Is there a direct formula or algorithm to find this set programmatically?
There are 2^(n-1) - 2 such permutations. Since n is the largest element, it must sit at the peak position i, and the permutation is then uniquely determined by the nonempty, proper subset of {1, 2, ..., n-1} that lies to the left of n: those elements appear in ascending order, and the remaining ones appear in descending order to the right of n. This answer is consistent with the excellent answer of @גלעדברקן in view of the well-known fact that the elements in each row of Pascal's triangle sum to a power of two (hence the part of the row between the two ones sums to two less than a power of two).
Here is a Python enumeration which generates all n! permutations and checks them for validity:
import itertools

def validPerm(p):
    n = max(p)
    i = p.index(n)
    if i == 0 or i == n-1:
        return False
    else:
        before = p[:i]
        after = p[i+1:]
        return before == sorted(before) and after == sorted(after, reverse = True)

def validPerms(n):
    nums = list(range(1,n+1))
    valids = []
    for p in itertools.permutations(nums):
        lp = list(p)
        if validPerm(lp): valids.append(lp)
    return valids
For example,
>>> validPerms(4)
[[1, 2, 4, 3], [1, 3, 4, 2], [1, 4, 3, 2], [2, 3, 4, 1], [2, 4, 3, 1], [3, 4, 2, 1]]
which gives the expected number of 6.
On further edit: The above code was meant to verify the formula for nondegenerate unimodal permutations. (I am coining the phrase: "unimodal permutations" is used in the literature for the 2^(n-1) permutations with exactly one peak, but the 2 which either begin or end with n are arguably degenerate in some sense.) From an enumeration point of view you would want to do something more efficient. The following is a Python implementation of the idea behind the answer of @גלעדברקן:
import itertools

def validPerms(n):
    valids = []
    nums = list(range(1,n)) #1,2,...,n-1
    snums = set(nums)
    for i in range(1,n-1):
        for first in itertools.combinations(nums,i):
            #first will be already sorted
            rest = sorted(snums - set(first),reverse = True)
            valids.append(list(first) + [n] + rest)
    return valids
It is functionally equivalent to the above code, but substantially more efficient.
Let's look at an example:
{1,2,3,4,5,6}
Clearly, any positioning of 6 at i will mean the right side of it will be sorted descending and the left side of it ascending. For example, i = 3
{1,2,6,5,4,3}
{1,3,6,5,4,2}
{1,4,6,5,3,2}
...
So for each position i of n between 2 and n-1, we have (n - 1) choose (i - 1) arrangements. This leads to the answer:
sum [(n - 1) choose (i - 1)], for i = 2...(n - 1)
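To see that this sum agrees with the closed form 2^(n-1) - 2 quoted in the other answer, a small numeric check can help. This is only a sketch: the function names count_by_sum and count_closed_form are made up for illustration, and math.comb requires Python 3.8 or later.

```python
from math import comb


def count_by_sum(n):
    """Sum of C(n-1, i-1) over peak positions i = 2 .. n-1."""
    return sum(comb(n - 1, i - 1) for i in range(2, n))


def count_closed_form(n):
    """The full row of Pascal's triangle sums to 2^(n-1); drop the two end terms."""
    return 2 ** (n - 1) - 2


# Compare the two formulas for a range of small n.
agree = all(count_by_sum(n) == count_closed_form(n) for n in range(3, 20))
```

For n = 4 both functions give 6, which matches the six permutations listed by the brute-force enumeration.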
The answer is 2^(n-1) - 2:
ans = 2^(n-1)
ans -= 2
because i must satisfy 2 <= i <= n-1, which drops the two end terms (n-1)C0 and (n-1)C(n-1), each equal to 1.

Find objects with the most correspondences to a reference object

Reference object: { 1, 5, 6, 9, 10, 11 }
Other objects:
A { 2, 4, 5, 6, 8, 10, 11 }
B { 5, 7, 9, 10 }
C { 2, 5, 6, 7, 9, 12 }
D { 1, 3, 4, 5, 6, 8, 9, 10 }
E { 6, 8 }
F { 1, 2, 3, 4, 7, 8, 9, 13, 15 }
... { ... }
Difficulty: It should be faster than O(n*m)
Result should be:
Array
(
[D] => 5
[A] => 4
[C] => 3
[B] => 3
[F] => 2
[E] => 1
)
Slow solution:
ref = array(1, 5, 6, 9, 10, 11);
foreach (A, B, C, D,.. AS row)
{
    foreach (row AS col)
    {
        if ( exist(col, ref) )
        {
            result[row] += 1;
        }
    }
}
sort (result)
.. this is a solution, but it's far too slow.
Is there another way, like pattern recognition, hopefully in O(log n)?
It is possible to save each object in another notation, for example:
ref = "15691011"
A = "245681011"
But I don't know if this helps.
If the data in all your objects is sorted, you can make this routine faster by not looking up each value of a row individually, but walking through the row and the reference together, step by step.
foreach (A, B, C, D,.. AS row)
{
    for (i = 0, j = 0; i < row.length && j < ref.length; )
    {
        if (row[i] < ref[j]) i++;
        elseif (row[i] > ref[j]) j++;
        else {
            result[row] += 1;
            i++; j++;
        }
    }
}
In this case you pass over your reference only once for each row, but this algorithm needs all your data to be sorted already.
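The two-index walk can be sketched in Python as follows. The function name count_common and the objects dictionary are illustrative choices of mine; the data is copied from the question, and both inputs are assumed to be sorted and free of duplicates.

```python
def count_common(row, ref):
    """Count values present in both sorted lists with a single simultaneous pass."""
    i = j = count = 0
    while i < len(row) and j < len(ref):
        if row[i] < ref[j]:
            i += 1          # row value too small, advance in row
        elif row[i] > ref[j]:
            j += 1          # reference value too small, advance in ref
        else:
            count += 1      # match: advance both
            i += 1
            j += 1
    return count


ref = [1, 5, 6, 9, 10, 11]
objects = {
    "A": [2, 4, 5, 6, 8, 10, 11],
    "B": [5, 7, 9, 10],
    "C": [2, 5, 6, 7, 9, 12],
    "D": [1, 3, 4, 5, 6, 8, 9, 10],
    "E": [6, 8],
    "F": [1, 2, 3, 4, 7, 8, 9, 13, 15],
}
counts = {name: count_common(row, ref) for name, row in objects.items()}
# Highest count first; ties keep their original order.
ranking = sorted(counts.items(), key=lambda item: item[1], reverse=True)
```

Each row costs at most len(row) + len(ref) steps, so there is no per-element lookup into the reference.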
You could start with the largest sequence (it has the largest chance of having many references).
When you find, for example, 4 refs, you can safely skip all sequences with fewer than 4 elements.
Another early exit is to abort checking a sequence when it can no longer surpass the current max. For example: your current max is 6 elements. You are processing a list of size 7, but the first two elements are not references. The highest reachable count for this list is 5, which is lower than 6, so abort the sequence.
The problem in both cases is that you cannot construct a complete array of results.
Assumptions:
There are m lists apart from the reference object.
The lists are sorted initially.
There is no repetition of elements in any array.
Scan all the arrays and find out the maximum element in all the lists. You only need to check the last element in each list. Call it MAX.
For each of the m + 1 lists, make a corresponding Boolean array with MAX elements and initialize their values to zero.
Scan all the arrays and, for each value present, set the entry at the corresponding index of that list's Boolean array to 1.
For example, the corresponding array for the example reference object { 1, 5, 6, 9, 10, 11 } shall look like:
{1,0,0,0,1,1,0,0,1,1,1,0,0,...}
Now for every pair-wise combination, you can just check the corresponding indices and increment the count if both are 1.
The above algorithm can be done in linear time complexity with regards to the total number of elements in the data.
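A minimal Python sketch of the Boolean-array idea follows. The helper names to_bitmap and count_common_bits are hypothetical, and the arrays here are indexed directly by value (index 0 is unused), which is one possible layout rather than the only one.

```python
def to_bitmap(values, max_value):
    """Boolean array where bits[v] is True exactly when v occurs in values."""
    bits = [False] * (max_value + 1)  # index 0 is unused; values start at 1
    for v in values:
        bits[v] = True
    return bits


def count_common_bits(bits_a, bits_b):
    """Count the positions where both bitmaps are set."""
    return sum(1 for a, b in zip(bits_a, bits_b) if a and b)


ref = [1, 5, 6, 9, 10, 11]
d = [1, 3, 4, 5, 6, 8, 9, 10]
max_value = 15  # the largest element over all lists in the question (from F)
common = count_common_bits(to_bitmap(ref, max_value), to_bitmap(d, max_value))
```

Note that each pairwise comparison scans MAX entries, so this pays off mainly when MAX is not much larger than the list lengths.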
You should use a technique from search engines: an inverted index. For each number, keep a list, in sorted order, of the objects that contain this number. In your case:
1 -> {D, F}
5 -> {A, B, C, D}
6 -> {A, C, D, E}
9 -> {B, C, D, F}
10 -> {A, B, D}
11 -> {A}
By merging the lists for the reference's numbers, you can count how many elements each object shares with the reference:
A -> 4
B -> 3
C -> 3
D -> 5
E -> 1
F -> 2
After sorting by count, you get the needed result. If you only need the top k elements, you can use a priority queue.
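Here is a short Python sketch of that inverted-index approach, under the assumption that the index is built once and reused for many reference objects. The names build_index, count_matches and top_k are my own; heapq.nlargest stands in for the priority queue mentioned above.

```python
import heapq
from collections import Counter, defaultdict


def build_index(objects):
    """Map each number to the list of object names that contain it."""
    index = defaultdict(list)
    for name, values in objects.items():
        for value in values:
            index[value].append(name)
    return index


def count_matches(index, ref):
    """For every object, count how many reference numbers it contains."""
    counts = Counter()
    for value in ref:
        counts.update(index.get(value, ()))  # only touch objects that hold this value
    return counts


def top_k(counts, k):
    """Return the k (name, count) pairs with the highest counts."""
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])


objects = {
    "A": [2, 4, 5, 6, 8, 10, 11],
    "B": [5, 7, 9, 10],
    "C": [2, 5, 6, 7, 9, 12],
    "D": [1, 3, 4, 5, 6, 8, 9, 10],
    "E": [6, 8],
    "F": [1, 2, 3, 4, 7, 8, 9, 13, 15],
}
index = build_index(objects)
counts = count_matches(index, [1, 5, 6, 9, 10, 11])
best_two = top_k(counts, 2)
```

The work per query is proportional to the total length of the posting lists for the reference's numbers, not to the total size of all objects.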

N-fold partition of an array with equal sum in each partition

Given an array of integers a and two numbers N and M, return N groups of integers from a such that each group sums to M.
For example, say:
a = [1,2,3,4,5]
N = 2
M = 5
Then the algorithm could return [2, 3], [1, 4] or [5], [2, 3] or possibly others.
What algorithms could I use here?
Edit:
I wasn't aware that this problem is NP complete. So maybe it would help if I provided more details on my specific scenario:
So I'm trying to create a "match-up" application. Given the number of teams N and the number of players per team M, the application listens for client requests. Each client request will give a number of players that the client represents. So if I need 2 teams of 5 players, then if 5 clients send requests, each representing 1, 2, 3, 4, 5 players respectively, then my application should generate a match-up between clients [1, 4] and clients [2, 3]. It could also generate a match-up between [1, 4] and [5]; I don't really care.
One implication is that any client representing more than M or less than 0 players is invalid. Hope this could simplify the problem.
This appears to be a variation of the subset sum problem. As that problem is NP-complete, no efficient algorithm is known without further constraints.
Note that it is already hard to find a single subset of the original set whose elements sum to M.
People give up too easily on NP-complete problems. Just because a problem is NP-complete doesn't mean that there aren't more and less efficient algorithms in the general case. That is, you can't guarantee that for all inputs there is an answer that can be computed faster than a brute-force search, but for many problems you can certainly have methods that are faster than the full search for most inputs.
For this problem there are certainly 'perverse' sets of numbers that will result in worst-case search times, because there may be, say, a large vector of integers but only one solution, and you end up trying a very large number of combinations.
But for non-perverse sets, there are probably many solutions, and an efficient way of 'tripping over' a good partitioning will run much faster than an exhaustive search.
How you solve this will depend a lot on what you expect to be the more common parameters. It also makes a difference if the integers are all positive, or if negatives are allowed.
In this case I'll assume that:
N is small relative to the length of the vector
All integers are positive.
Integers cannot be re-used.
Algorithm:
Sort the vector, v, in descending order.
Eliminate elements bigger than M. They can't be part of any solution.
Add up all remaining numbers in v, divide by N. If the result is smaller than M, there is no solution.
Create a new array w, the same size as v, where each w[i] is the sum of all the numbers in v from v[i+1] to the end.
So if v was 5 4 3 2 1, w would be 10, 6, 3, 1, 0.
While you have not found enough sets:
Choose the largest number, x. If it is equal to M, emit a solution set containing just x, remove it from the vector, and remove the first element from w.
Still not enough sets? (likely), then again while you have not found enough sets:
A solution theory is ([a,b,c], R ) where [a,b,c] is a partial set of elements of v and a remainder R. R = M-sum[a,b,c]. Extending a theory is adding a number to the partial set, and subtracting that number from R. As you extend the theories, if R == 0, that is a possible solution.
Recursively create theories like so: loop over the elements of v, creating for each v[i] the theory ([v[i]], R), and then recursively extend each theory using just part of v. Binary search into v to find the first element equal to or smaller than R, v[j]. Start with v[j] and extend each theory with the elements of v from j onward until R > w[k].
The numbers from v[j] to v[k] are the only numbers that can be used to extend a theory and still get R to 0. Numbers larger than v[j] would make R negative. Past v[k], there aren't enough numbers left in the array to get R to 0, even if you used them all.
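The pruning idea above can be sketched in Python roughly as follows. This is a simplified illustration, not the exact procedure described: the names find_subset and partition are hypothetical, oversized elements are skipped with a linear scan instead of a binary search, and groups are extracted greedily one at a time, which can miss a valid partition that a full search would find. It assumes positive integers that cannot be reused.

```python
def find_subset(v, target):
    """v is sorted descending. Return indices of elements summing to target, or None."""
    n = len(v)
    # w[i] = sum of v[i+1:], the suffix sums from the description above.
    w = [0] * n
    for i in range(n - 2, -1, -1):
        w[i] = w[i + 1] + v[i + 1]

    def extend(start, remainder, chosen):
        if remainder == 0:
            return chosen
        for i in range(start, n):
            if v[i] > remainder:
                continue  # would make the remainder negative
            if v[i] + w[i] < remainder:
                break     # even using everything from i onward cannot reach the remainder
            found = extend(i + 1, remainder - v[i], chosen + [i])
            if found is not None:
                return found
        return None

    return extend(0, target, [])


def partition(values, n_groups, target):
    """Greedily pull out n_groups subsets that each sum to target, or return None."""
    remaining = sorted((x for x in values if 0 < x <= target), reverse=True)
    groups = []
    while len(groups) < n_groups:
        picked = find_subset(remaining, target)
        if picked is None:
            return None
        groups.append([remaining[i] for i in picked])
        used = set(picked)
        remaining = [x for i, x in enumerate(remaining) if i not in used]
    return groups


teams = partition([1, 2, 3, 4, 5], 2, 5)
```

With the question's example (five clients representing 1 to 5 players, two teams of five) this yields the groups [5] and [4, 1], one of the acceptable match-ups.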
Here is my own Python solution that uses dynamic programming. The algorithm is given here.
def get_subset(lst, s):
    '''Given a list of integer `lst` and an integer s, returns
    a subset of lst that sums to s, as well as lst minus that subset
    '''
    q = {}
    for i in range(len(lst)):
        for j in range(1, s+1):
            if lst[i] == j:
                q[(i, j)] = (True, [j])
            elif i >= 1 and q[(i-1, j)][0]:
                q[(i, j)] = (True, q[(i-1, j)][1])
            elif i >= 1 and j >= lst[i] and q[(i-1, j-lst[i])][0]:
                q[(i, j)] = (True, q[(i-1, j-lst[i])][1] + [lst[i]])
            else:
                q[(i, j)] = (False, [])
        if q[(i, s)][0]:
            for k in q[(i, s)][1]:
                lst.remove(k)
            return q[(i, s)][1], lst
    return None, lst

def get_n_subset(n, lst, s):
    ''' Returns n subsets of lst, each of which sums to s'''
    solutions = []
    for i in range(n):
        sol, lst = get_subset(lst, s)
        solutions.append(sol)
    return solutions, lst

# print(get_n_subset(7, [1, 2, 3, 4, 5, 7, 8, 4, 1, 2, 3, 1, 1, 1, 2], 5))
# [stdout]: ([[2, 3], [1, 4], [5], [4, 1], [2, 3], [1, 1, 1, 2], None], [7, 8])

Summation up to a variable integer: How to get the coefficients?

This is an example. I want to know if there is a general way to deal with this kind of problems.
Suppose I have a function (a ∈ ℝ):
f[a_, n_Integer, m_Integer] := Sum[a^i k[i],{i,0,n}]^m
And I need a closed form for the coefficient of a^p. What is the best way to proceed?
Note 1: In this particular case, one could try to represent the sum manually through Multinomial[ ], but it seems difficult to write down the Multinomial terms for a variable number of arguments, and besides, I want Mathematica to do it.
Note 2: Of course
Collect[f[a, 3, 4], a]
will do, but only for given m and n.
Note 3: This question is related to this other one. My application is different, but probably the same methods apply. So, feel free to answer both with a single shot.
Note 4:
You can model the multinomial theorem with a function like:
f[n_, m_] :=
 Sum[KroneckerDelta[m - Sum[r[i], {i, n}]]
   (Multinomial @@ Sequence@Array[r, n])
   Product[x[i]^r[i], {i, n}],
  Evaluate@(Sequence @@ Table[{r[i], 0, m}, {i, 1, n}])];
So, for example
f[2,3]
is the cube of a binomial
x[1]^3+ 3 x[1]^2 x[2]+ 3 x[1] x[2]^2+ x[2]^3
The coefficient of a^k can be viewed as the derivative of order k at zero divided by k!. In version 8, there is a function BellY, which allows one to construct the derivative at a point of a composition of functions out of the derivatives of the individual components. Basically, for f[g[x]], expanding around x == 0, we find Derivative[p][Function[x, f[g[x]]]][0]/p! as
BellY[ Table[ { Derivative[k][f][g[0]], Derivative[k][g][0]}, {k, 1, p} ] ]/p!
This is also known as generalized Bell polynomial, see wiki.
In the case at hand:
f[a_, n_Integer, m_Integer] := Sum[a^i k[i], {i, 0, n}]^m
With[{n = 3, m = 4, p = 7},
BellY[ Table[{FactorialPower[m, s] k[0]^(m - s),
If[s <= n, s! k[s], 0]}, {s, 1, p}]]/p!] // Distribute
(*
Out[80]= 4 k[1] k[2]^3 + 12 k[1]^2 k[2] k[3] + 12 k[0] k[2]^2 k[3] +
12 k[0] k[1] k[3]^2
*)
With[{n = 3, m = 4, p = 7}, Coefficient[f[a, n, m], a, p]]
(*
Out[81]= 4 k[1] k[2]^3 + 12 k[1]^2 k[2] k[3] + 12 k[0] k[2]^2 k[3] +
12 k[0] k[1] k[3]^2
*)
Doing it this way is more computationally efficient than building the entire expression and extracting coefficients.
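For reference, the quantity both expressions compute can also be written as a constrained multinomial sum, which follows from the multinomial theorem. The exponent vector (r_0, ..., r_n) is notation introduced here for illustration and does not appear in the question.

```latex
[a^p]\left(\sum_{i=0}^{n} k_i\, a^i\right)^{m}
  \;=\; \sum_{\substack{r_0 + r_1 + \cdots + r_n = m \\ 0\cdot r_0 + 1\cdot r_1 + \cdots + n\cdot r_n = p}}
  \frac{m!}{r_0!\, r_1! \cdots r_n!} \prod_{i=0}^{n} k_i^{\,r_i}
```

For n = 3, m = 4, p = 7 the admissible vectors (r_0, r_1, r_2, r_3) are (0,1,3,0), (0,2,1,1), (1,0,2,1) and (1,1,0,2), with multinomial coefficients 4, 12, 12 and 12, which matches the four terms in the output above.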
EDIT: The approach outlined here will work for symbolic orders n and m, but requires an explicit value for p. When using it in these circumstances, it is better to replace If with its Piecewise analog, e.g. Boole:
With[{p = 2},
BellY[Table[{FactorialPower[m, s] k[0]^(m - s),
Boole[s <= n] s! k[s]}, {s, 1, p}]]/p!]
(* 1/2 (Boole[1 <= n]^2 FactorialPower[m, 2] k[0]^(-2 + m)
k[1]^2 + 2 m Boole[2 <= n] k[0]^(-1 + m) k[2]) *)

Resources