Question regarding mergesort's merge algorithm - algorithm

Let's suppose we have two sorted arrays, A and B, each consisting of n elements. I don't understand why the time needed to merge these two is "n+n". In order to merge them we need up to 2n-1 comparisons. For example, take the two following arrays:
A = [3, 5, 7, 9] and B = [2, 4, 6, 8]
We start merging them into a single array by comparing the elements in the usual way, until we finally compare 8 with 9. This is comparison number 2n-1 = 8-1 = 7, and 8 is inserted into the new array.
After this, 9 is inserted without another comparison. So my question is: since there are 2n-1 comparisons, why do we say that this merging takes 2n time? I'm not saying O(n), I'm saying T(n)=2n, an exact time function.
It's probably a detail that I'm missing here, so I would be very grateful if someone could provide some insight. Thanks in advance.
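One way to see where the two counts come from is to instrument the merge. The sketch below (merge_counting is a name made up for this illustration) counts comparisons and element writes separately. A cost model that charges one unit per element placed into the output gives exactly 2n, while the comparison count alone is at most 2n-1. Whether a given textbook means T(n)=2n in this sense is an assumption here.

```python
def merge_counting(a, b):
    """Merge two sorted lists; return (merged, comparisons, writes)."""
    merged = []
    comparisons = 0
    i = j = 0
    # Compare the two heads until one list runs out.
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # The leftover tail is copied without any further comparison.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged, comparisons, len(merged)


print(merge_counting([3, 5, 7, 9], [2, 4, 6, 8]))
# ([2, 3, 4, 5, 6, 7, 8, 9], 7, 8)
```

For the arrays in the question this gives 7 comparisons and 8 writes: the last element, 9, costs a write but no comparison.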

Related

canceling arrays by number of items that I am ready to lose

We are writing a C# program that helps us remove unnecessary data repeaters. We have already found some repeaters to remove with the help of Finding overlapping data in arrays. Now we want to check whether we can cancel some repeaters by another criterion. The question is:
We have arrays of numbers:
{1, 2, 3, 4, 5, 6, 7, ...}, {4, 5, 10, 100}, {100, 1, 20, 50}
Some numbers are repeated in other arrays, and some are unique and belong to only one specific array. We want to remove some arrays, given that we are ready to lose up to N numbers from the arrays.
Explanation:
{1, 2}
{2, 3, 4, 5}
{2, 7}
We are ready to lose up to 3 numbers from these arrays. That means we can remove array 1, because we lose only the number "1", its only unique number. We can also remove arrays 1 and 3, because we lose the numbers "1" and "7", or only array 3, because we lose only the number "7", which is fewer than 3 numbers.
In our output we want the maximum number of arrays that can be removed, under the condition that we lose fewer than N numbers, where N is the number of items we are ready to lose.
This problem is equivalent to the Set Cover problem (e.g.: take N=0) and thus efficient, exact solutions that work in general are unlikely. However, in practice, heuristics and approximations are often good enough. Given the similarity of your problem with Set Cover, the greedy heuristic is a natural starting point. Instead of stopping when you've covered all elements, stop when you've covered all but N elements.
First you need a number for each array that tells you how many numbers are unique to that particular array.
An easy way to do this takes O(n²), since for each element you need to check all arrays to see whether it is unique.
You can do this much more efficiently by keeping the arrays sorted (sorting them first if necessary) or by using a heap-like data structure.
After that, you only have to pick arrays so that their numbers sum up to N. That's similar to the subset sum problem, but much less complex because N > 0 and all your numbers are > 0.
So you simply sort these numbers from smallest to greatest, then iterate over the sorted array and take numbers as long as the sum is < N.
Finally, you can remove every array that corresponds to a number which you were able to fit into N.
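The steps of this last answer can be sketched as follows. The function name removable_arrays is made up for this illustration, and a Counter is used for the uniqueness count instead of sorting. Note that the sketch follows the answer literally: it only counts numbers unique to a single array, so a number shared by exactly the arrays that get removed would not be counted as lost.

```python
from collections import Counter


def removable_arrays(arrays, n):
    """Greedy sketch: return (indices of arrays to remove, unique numbers lost)."""
    # How many arrays does each number occur in?
    occurrences = Counter(x for arr in arrays for x in set(arr))
    # For each array, count the numbers that occur in no other array.
    unique = [sum(1 for x in set(arr) if occurrences[x] == 1) for arr in arrays]
    # Take arrays from the smallest unique count upward while the loss stays < n.
    removed, lost = [], 0
    for i in sorted(range(len(arrays)), key=lambda k: unique[k]):
        if lost + unique[i] >= n:
            break
        lost += unique[i]
        removed.append(i)
    return removed, lost


print(removable_arrays([[1, 2], [2, 3, 4, 5], [2, 7]], 3))
# ([0, 2], 2)
```

With the example from the question and N = 3, this removes the first and third arrays (indices 0 and 2) and loses the numbers 1 and 7.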

decoding via number combinations algorithm in python 3

ok so here is the problem.
let's say:
1 means Bob
2 means Jerry
3 means Tom
4 means Henry
the sum of any two of the aforementioned numbers is a status/mood code, which is how the program will be encoded:
7 (4+3) means Angry
5 (3+2) means Sad
3 (2+1) means Mad
4 (3+1) means Happy
and so on...
how may i create a decode function that accepts one of the added (encoded) values, such as 7, 5, 3, 4, etc., figures out the combination and returns the names of the people represented by the two numbers that constitute the combination? take note that one number cannot be repeated to get a mood result, meaning 4 has to be 3+1 and may not be 2+2. so we can assume, for this example, that there is only one possible combination for each status/mood code. now the problem is, how do you implement such code in python 3? what would be the algorithm or logic for such a problem? how do you seek or check for a combination of two numbers? i'm thinking i should just run a loop that keeps adding two numbers at a time until the result matches the status/mood code. will that work? but this method will soon become obsolete if the number of values per combination is increased (as in adding 4 numbers together instead of 2). doing it this way will take a lot of time and will possibly be inefficient.
i apologize, i know this question is extremely confusing, but please bear with me.
let's try and work something out.
Use Binary
If you want to have sums that are unique, then assign each possible "Person" a number that's a power of 2. The sum of any combination of these numbers will uniquely identify which numbers were used in the sum.
1, 2, 4, 8, 16, ...
Rather than offer a detailed proof of correctness, I offer an intuitive argument about this: any number can be represented in base 2, and it is always a sum of exactly one combination of powers of 2.
This solution may not be optimal. It has realistic limitations (32 or 64 different "person" identifiers, unless you use some sort of BigInt), but depending on your needs, it might work. Having the smallest possible values, binary is better than any other radix though.
Example
(Edited)
Here's a quick snippet that demonstrates how you could decode the sum. The returned values are the exponents of the powers of 2. count_persons could be arbitrarily large, as could the range of n iterated over (just as a quick example).
#!/usr/bin/python3
count_persons = 64
for n in range(20, 30):
    # Collect every exponent i whose bit is set in n.
    # The range starts at bit 0 so that the person encoded as 2**0 = 1 is decoded too.
    matches = list(filter(lambda i: (n >> i) & 0x1, range(count_persons)))
    print('{0}: {1}'.format(n, matches))
Output:
20: [2, 4]
21: [0, 2, 4]
22: [1, 2, 4]
23: [0, 1, 2, 4]
24: [3, 4]
25: [0, 3, 4]
26: [1, 3, 4]
27: [0, 1, 3, 4]
28: [2, 3, 4]
29: [0, 2, 3, 4]
See a more appropriate answer here
In my opinion, the selected answer is so suboptimal that it can be considered plain wrong.
The table you are building can be indexed with N(N-1)/2 values, while the binary approach uses 2^N.
With a 64-bit unsigned integer, you could encode about sqrt(2^65) values, that is, 6 billion names, compared with the 64 names the binary approach allows.
Using a big number library could push the limit somewhat, but the computations involved would be hugely more costly than the simple O(N) reverse indexing algorithm needed by the alternative approach.
My conclusion is: the binary approach is grossly inefficient, unless you want to play with a handful of values, in which case hard-coding or precomputing the indexes would be just as good a solution.
Since the question is very unlikely to match a search on the subject, it is not that important anyway.
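The linked answer is not reproduced here, so the following is only one possible reading of "a table indexed with N(N-1)/2 values": number the people 0..N-1 and give each unordered pair its own index in a triangular layout. The names encode_pair and decode_pair are made up for this sketch, and the codes it produces are pair indices, not sums of the two numbers.

```python
def encode_pair(i, j):
    """Map an unordered pair of distinct person indices to a unique code."""
    if i == j:
        raise ValueError("a person cannot be paired with themselves")
    lo, hi = sorted((i, j))
    # Pairs whose larger index is hi occupy codes hi*(hi-1)/2 .. hi*(hi-1)/2 + hi - 1.
    return hi * (hi - 1) // 2 + lo


def decode_pair(code, n):
    """Recover (lo, hi) from a code with an O(n) scan over the larger index."""
    for hi in range(1, n):
        base = hi * (hi - 1) // 2
        if base <= code < base + hi:
            return code - base, hi
    raise ValueError("code out of range for n people")


names = ["Bob", "Jerry", "Tom", "Henry"]
code = encode_pair(names.index("Tom"), names.index("Henry"))
lo, hi = decode_pair(code, len(names))
print(code, names[lo], names[hi])
# 5 Tom Henry
```

With N people this uses exactly N(N-1)/2 distinct codes, which is where the estimate of billions of names for a 64-bit integer comes from.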

Sorting Algorithm : output

I came across this problem on a website and I can't quite understand the output. Please help me understand it:
Bogosort is a dumb algorithm which shuffles the sequence randomly until it is sorted. But here we have tweaked it a little, so that if after the last shuffle several first elements end up in the right places, we will fix them and won't shuffle those elements any further. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle we get (1, 2, 5, 4, 3, 6), we will keep 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm. Calculate the expected amount of shuffles for the improved algorithm to sort the sequence of the first n natural numbers, given that no elements are in the right places initially.
Input:
2
6
10
Output:
2
1826/189
877318/35343
For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions. I just can't understand the output.
I assume you found the problem on CodeChef. There is an explanation of the answer to the Bogosort problem here.
OK, I think I found the answer. There is a similar problem here: https://math.stackexchange.com/questions/20658/expected-number-of-shuffles-to-sort-the-cards/21273 , and this problem can be thought of as its extension.
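The sample values can be reproduced with a short recurrence. The following is a sketch under one reading of the problem statement, not the linked explanation. Let E(m) be the expected number of shuffles for a block of m elements whose first and last elements are both out of place. After one uniform shuffle, either the block is sorted, or a middle block of some length j (2 <= j <= m) survives. That block can sit at m-j+1 offsets, and there are a(j) = j! - 2(j-1)! + (j-2)! arrangements of j elements with both ends out of place. This gives E(m) = 1 + sum over j of (m-j+1) * a(j) / m! * E(j), which is solved for E(m). The function name expected_shuffles is made up for this illustration.

```python
from fractions import Fraction
from math import factorial


def ends_misplaced(j):
    """Permutations of j >= 2 elements whose first and last are both out of place."""
    return factorial(j) - 2 * factorial(j - 1) + factorial(j - 2)


def expected_shuffles(n):
    """Expected shuffles for the improved bogosort, as an exact fraction."""
    expected = {0: Fraction(0), 1: Fraction(0)}
    for m in range(2, n + 1):
        total = factorial(m)
        rhs = Fraction(1)  # the shuffle that was just performed
        for j in range(2, m):
            rhs += Fraction((m - j + 1) * ends_misplaced(j), total) * expected[j]
        # The j = m term contains E(m) itself, so move it to the left-hand side.
        expected[m] = rhs / (1 - Fraction(ends_misplaced(m), total))
    return expected[n]


print(expected_shuffles(2), expected_shuffles(6))
# 2 1826/189
```

Because Fraction keeps values in lowest terms, the result is already the irreducible fraction the problem asks for.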

Sorting an array in minimum cost

I have an array A[] with 4 elements, A = {8 1 2 4}. How can I sort it with minimal cost? The criteria are defined as follows:
a. It is possible to swap any 2 elements.
b. The cost of any swap is the sum of the two element values. For example, if I swap 8 and 4 the cost is 12 and the resulting array is A = {4 1 2 8}, which is still unsorted, so more swaps are needed.
c. Need to find a way to sort the array with minimum cost.
From my observation, a greedy approach will not work, such as placing some element into its sorted position at minimum cost in each step. So a DP solution is needed.
Can anyone help?
Swap 2 and 1, and then 1 and 4, and then 1 and 8? Or is it a general question?
For a more general approach you could try:
1. Swap every pair of elements (highest sums first) that form a perfect swap (i.e. swapping them puts both in their right spot).
2. Use the lowest element as a pivot for swaps (by swapping it with the element whose spot it occupies), until it reaches its final spot.
3. Then, you have two possibilities:
3.1. Repeat step 2: use the lowest element not in its final spot as a pivot until it reaches its final spot, then go back to step 3.
3.2. Or swap the lowest element not in its final spot (l2) with the lowest element (l1), and repeat step 2 until l1 reaches the final spot of l2. Then:
a. Either swap l1 and l2 again and go to step 3.1,
b. Or go to step 3.2 again, with the next lowest element not in its final spot being used.
4. When all this is done, if some opposite swaps are performed one right after another (for example when going from step 2 to step 3.2), remove them.
There are still some things to watch out for, but this is already a pretty good approximation. Step one and two should always work though, step three would be the one to improve in some borderline cases.
Example of the algorithm being used:
With {8 4 5 3 2 7}: (target array {2 3 4 5 7 8})
Step 2: 2 <> 7, 2 <> 8
Array is now {2, 4, 5, 3, 7, 8}
Choice between 3.1 and 3.2:
3.1 gives 3 <> 5, 3 <> 4
3.2 gives 2 <> 3, 2 <> 5, 2 <> 4, 2 <> 3
3 <> 5, 3 <> 4 is the better result
Conclusion: 2 <> 7, 2 <> 8, 3 <> 5, 3 <> 4 is the best answer.
With {1 8 9 7 6} (resulting array {1 6 7 8 9})
You're beginning at step three already
Choice between 3.1 and 3.2:
3.1 gives 6 <> 9, 6 <> 7, 6 <> 8 (total: 42)
3.2 gives 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 (total: 41)
So 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 is the best result
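The choice between 3.1 and 3.2 in these examples can be computed per cycle of the permutation. The sketch below assumes all values are distinct, and the function name min_swap_cost is made up for this illustration. For a cycle of k elements with sum s and smallest element m, pivoting on the cycle's own minimum (3.1) costs s + (k-2)*m, and borrowing the overall minimum g (3.2) costs s + m + (k+1)*g. It returns only the total cost, not the swap sequence, and it is a sketch of the heuristic above rather than a proven-optimal implementation.

```python
def min_swap_cost(values):
    """Cost of sorting by swaps, picking the cheaper pivot for each cycle."""
    target = sorted(values)
    final_pos = {v: i for i, v in enumerate(target)}  # assumes distinct values
    overall_min = target[0] if target else 0
    seen = [False] * len(values)
    total = 0
    for start in range(len(values)):
        if seen[start]:
            continue
        # Walk the cycle: each element points to the spot where it belongs.
        cycle = []
        i = start
        while not seen[i]:
            seen[i] = True
            cycle.append(values[i])
            i = final_pos[values[i]]
        k, s, m = len(cycle), sum(cycle), min(cycle)
        if k < 2:
            continue  # element is already in its final spot
        own_pivot = s + (k - 2) * m                       # option 3.1
        borrowed_pivot = s + m + (k + 1) * overall_min    # option 3.2
        total += min(own_pivot, borrowed_pivot)
    return total


print(min_swap_cost([8, 4, 5, 3, 2, 7]), min_swap_cost([1, 8, 9, 7, 6]))
# 34 41
```

The first example totals 19 + 15 = 34 (both cycles use their own minimum), and the second gives 41, matching the 3.2 result above.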
This smells like homework. What you need to do is sort the array while minimizing the cost of the swaps. So it's an optimization problem rather than a sorting problem.
A greedy algorithm would still work: you build the solution by swapping the cheapest element first (figuring out where in the list it belongs). This is, however, not necessarily optimal.
As long as you never swap the same element twice a greedy algorithm should be optimal though.
Anyway, back to the dynamic programming stuff: just build your solution tree using recursion and then prune the tree as you find more optimal solutions. This is pretty basic recursion.
If you use a more complicated sorting algorithm, you'll have a lot more difficulty fitting it together with the dynamic programming, so I suggest you start out with a simple, slow O(n^2) sort and build on top of that.
Rather than provide you with a solution, I'd like to explain how dynamic programming works in my own words.
The first thing you need to do is figure out an algorithm that will explore all possible solutions (this can be a really stupid brute force algorithm).
You then implement this using recursion, because dynamic programming is based on being able to figure out overlapping subproblems quickly, ergo recursion.
At each recursive call you look up where you are in your solution and check whether you've computed this part of the solution tree before. If you have, you can test whether the current solution is more optimal: if it is, you continue; otherwise you're done with this branch of the problem.
When you arrive at the final solution you will have solved the problem.
Think of each recursive call as a snapshot of a partial solution. It's your job to figure how each recursive call fits together in the final optimal solution.
This is what I recommend you do:
Write a recursive sort algorithm.
Add a parameter to your recursive function that maintains the cost of this execution path; as you sort the array, add to this cost. For every possible swap at any given point, do another recursive call (this will branch your solution tree).
Whenever you realize that the cost of the solution you are currently exploring exceeds what you already have somewhere else, abort (just return).
To be able to make that last check, you need to maintain a shared memory area that you can index by where you are in your recursive algorithm. If there's a precomputed cost there, you just return that value and don't continue processing (this is the pruning, which makes it fast).
Using this method you can even base your solution on a brute-force permutation algorithm. It will probably be very slow or memory intensive, because it is stupid about when to branch or prune, but you don't really need a specific sort algorithm to make this work; a specific one will just be more efficient.
Good luck!
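One way to realize the "shared memory area" idea is to key a table on the current arrangement of the array and keep the cheapest cost seen for it. The sketch below swaps in a uniform-cost (Dijkstra-style) search over arrangements instead of plain recursion, which is a different technique from the one described above but uses the same table and the same pruning rule. It assumes positive values and is only practical for small arrays, since there are n! arrangements. The function name min_cost_sort_search is made up for this illustration.

```python
import heapq
from itertools import combinations


def min_cost_sort_search(values):
    """Cheapest total swap cost to sort values, by uniform-cost search."""
    start, goal = tuple(values), tuple(sorted(values))
    best = {start: 0}      # cheapest known cost for each arrangement
    heap = [(0, start)]
    while heap:
        cost, state = heapq.heappop(heap)
        if state == goal:
            return cost
        if cost > best[state]:
            continue  # a cheaper route to this arrangement was already expanded
        for i, j in combinations(range(len(state)), 2):
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            nxt = tuple(nxt)
            new_cost = cost + state[i] + state[j]
            # Prune: only keep this branch if it improves on the stored cost.
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


print(min_cost_sort_search([8, 1, 2, 4]))
# 17
```

For the array in the question this gives 17, the cost of swapping 1 with 2, then with 4, then with 8.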
If you do a high-low selection sort, you can guarantee that the Nth greatest element isn't swapped more than N times. This is a simple algorithm with a pretty easy and enticing guarantee... Maybe check this on a few examples and see how it could be tweaked. Note: this may not lead to an optimal answer...
To find the absolute minimal cost you'll have to try all ways to swap and then find the cheapest one.
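def min_cost_sort(values, max_swaps=None):
    # Exhaustive search over swap sequences; exponential, so only for small inputs.
    target = sorted(values)
    if max_swaps is None:
        max_swaps = 2 * len(values)  # depth cap, or some other criterion
    best = {"cost": float("inf"), "swaps": []}

    def recsort(l, swaps, cost):
        if cost >= best["cost"]:
            return  # already no better than the best solution found so far
        if l == target:
            best["cost"], best["swaps"] = cost, swaps
            return
        if len(swaps) >= max_swaps:
            return
        for p1 in range(len(l)):
            for p2 in range(p1 + 1, len(l)):
                l2 = l[:]
                l2[p1], l2[p2] = l2[p2], l2[p1]
                recsort(l2, swaps + [(l[p1], l[p2])], cost + l[p1] + l[p2])

    recsort(list(values), [], 0)
    return best["cost"], best["swaps"]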
An approach that will be pretty good is to recursively place the biggest value at the top.

Bogosort optimization, probability related

I'm coding a question on an online judge for practice. The question is about optimizing Bogosort and involves not shuffling the entire number range every time. If after the last shuffle several first elements end up in the right places, we will fix them and won't shuffle those elements any further. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle Johnny gets (1, 2, 5, 4, 3, 6), he will fix 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm.
For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions.
A sample input/output says that for n=6, the answer is 1826/189.
I don't quite understand how the answer was arrived at.
This looks similar to 2011 Google Code Jam, Preliminary Round, Problem 4. However, the answer there is n, and I don't know how you get 1826/189.

Resources