Related
This was a multiple-choice question in an exam today, and (at least) one of the answers should be true, but to me they all look wrong.
The sorting steps are:
5 2 6 1 3 4
4 2 6 1 3 5
4 2 5 1 3 6
4 2 3 1 5 6
1 2 3 4 5 6
The available answers were: Bubble Sort, Insertion Sort, Selection Sort, Merge Sort and Quick Sort.
I think that is a Quick sort. Here we can see the following steps:
A random selection of the reference element in the array (pivotValue), with respect to which reorders the elements of the array.
Move all of the values that are larger than the reference to the right, and all the values that the lower support left
Repeat algorithm for unsorted the left and right side of the array, while each element will not appear on its position
Why I think so:
It definitely isn't a Bubble Sort because it compares the first two elements of the array beginning so, the first step should be 2 5 6 1 3 4
It isn't a Insertion Sort because it's a sequential algorithm. In the first step we see that compared the first and the last element
It isn't a Selection Sort because it find the lowest value and move it to the top so, the first step should be 1 5 2 6 3 4
It isn't a Merge Sort because the array is divided into two subarrays. In this case we see interaction "first" and "second" parts
None of them.
bubble sort: no. After k steps, the last k elements should be the k largest, sorted.
insertion sort: no. After k steps, the k first elements should be sorted.
selection sort: no. After k steps, the k first elements should be the s smallest, sorted.
merge sort: no. After k steps, a value can only have moved 2^k - 1 places. (5 moves 5 places at k=1)
quick sort: no. Whatever the pivot is, 1 and 6 being the extreme values, they can stay in this initial position.
On the quick sort: To make it clear that it is not possible, lets enumerate the results of each pivot for the first step:
5 : [2134] - 5 - [6]. (2134 may be in any order)
2 : [1] - 2 - [5634]
6 : [52134] - 6
1 : 1 - [52634]
3 : [21] - 3 - [564]
4 : [213] - 4 - [56]
One obvious way of seeing that all those are incompatible with the OP's output is that in each case, the 1 is before the 6, no matter how you implement the pivot or the partition.
To solve this all you have to do is make a function for each sort algorithm but include a statement to print the array out after each swap. Then apply your print friendly sort algorithms to the initial array [5 2 6 1 3 4] and see which sort method produces the same output. Additionally, this will help you compare all the different methods.
So here is an interesting problem in C#. I'm looking for a better way of solving it:
Given a matrix M (not necesarily square) of matches, find the best matching elements. Element i matches elem j by value M(i,j). M(i,j) != M(j,i).
Since #rows != #columns, find the best min(#rows,#columns) matching pairs (i,j).
Basically the problem is to pick the maximum from each row/column such that no row/column is picked twice.
Example:
1 2 3
+---------
a | 10 3 1
b | 12 99 2
c | 20 5 3
d | 5 7 4
The maximum value in this matrix is 99 so the best match is (b,2). For the next selection we cannot use anymore row b and column 2. Is like cutting them
1 2 3 or, if you prefer, 1 3
+--------- a smaller matrix: +------
a | 10 || 1 a | 10 1
b | ===++=== c | 20 3
c | 20 || 3 d | 5 4
d | 5 || 4
The max is now 20 and the match is (c, 1). The remaining matrix has only one column.
After another pick we'll get the match (d, 3) with match = 4
In the end "a" has no match.
My current implementation uses 2 array to store the already matched rows/columns and for each match goes through the entire matrix, picking the first maximum that belongs to row/col not match.
PS: in case of value multiple matches having the same value, just pick one of them
PS2: The array is stored as int [,]
How would you approach this problem in a more optimal/beautiful way?
If you are trying to maximise the sum of the cells chosen, such that exactly one cell is picked from each row and from each column, then this is http://en.wikipedia.org/wiki/Assignment_problem. If your matrix is not square, you can make it square by adding rows or columns to them, with values in the new cells which mean that they won't be picked unless there is no other way to fill out the solution.
(If you are not maximising the sum, you need to say what function of the values chosen you are maximising - is (1,3) better than (2,2)?. Otherwise you are into http://en.wikipedia.org/wiki/Multi-objective_optimization, which is possible, but more complicated).
You could first sort all of the entries of the matrix in descending order, and then process the sorted list. Whenever you see an entry that isn't in an already-picked row/col, it means that entry should be picked, so you mark the corresponding row/column and continue further down the list until either all rows or all columns have been picked.
given a sorted array of distinct integers, what is the minimum number of steps required to make the integers contiguous? Here the condition is that: in a step , only one element can be changed and can be either increased or decreased by 1 . For example, if we have 2,4,5,6 then '2' can be made '3' thus making the elements contiguous(3,4,5,6) .Hence the minimum steps here is 1 . Similarly for the array: 2,4,5,8:
Step 1: '2' can be made '3'
Step 2: '8' can be made '7'
Step 3: '7' can be made '6'
Thus the sequence now is 3,4,5,6 and the number of steps is 3.
I tried as follows but am not sure if its correct?
//n is the number of elements in array a
int count=a[n-1]-a[0]-1;
for(i=1;i<=n-2;i++)
{
count--;
}
printf("%d\n",count);
Thanks.
The intuitive guess is that the "center" of the optimal sequence will be the arithmetic average, but this is not the case. Let's find the correct solution with some vector math:
Part 1: Assuming the first number is to be left alone (we'll deal with this assumption later), calculate the differences, so 1 12 3 14 5 16-1 2 3 4 5 6 would yield 0 -10 0 -10 0 -10.
sidenote: Notice that a "contiguous" array by your implied definition would be an increasing arithmetic sequence with difference 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 to be contiguous, or 5 3 1 to be contiguous, or 1 2 3 2 3 to be contiguous. You also did not specify if negative numbers should be treated any differently.)
theorem: The contiguous numbers must lie between the minimum and maximum number. [proof left to reader]
Part 2: Now returning to our example, assuming we took the 30 steps (sum(abs(0 -10 0 -10 0 -10))=30) required to turn 1 12 3 14 5 16 into 1 2 3 4 5 6. This is one correct answer. But 0 -10 0 -10 0 -10+c is also an answer which yields an arithmetic sequence of difference 1, for any constant c. In order to minimize the number of "steps", we must pick an appropriate c. In this case, each time we increase or decrease c, we increase the number of steps by N=6 (the length of the vector). So for example if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c=2), then the differences would have been 2 -8 2 -8 2 -8, and sum(abs(2 -8 2 -8 2 -8))=30.
Now this is very clear if you could picture it visually, but it's sort of hard to type out in text. First we took our difference vector. Imagine you drew it like so:
4|
3| *
2| * |
1| | | *
0+--+--+--+--+--*
-1| |
-2| *
We are free to "shift" this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We wish to find the shift which minimizes the number of | you see (the area between the curve and the x-axis). This is NOT the average (that would be minimizing the standard deviation or RMS error, not the absolute error). To find the minimizing c, let's think of this as a function and consider its derivative. If the differences are all far away from the x-axis (we're trying to make 101 112 103 114 105 116), it makes sense to just not add this extra stuff, so we shift the function down towards the x-axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of the *s passes the x axis. Each time we decrease c, we improve the solution by 5-1=4 (we save 5 steps of work, but have to do 1 extra step of work for the * below the x-axis). Eventually when HALF the *s are past the x-axis, we can NO LONGER IMPROVE THE SOLUTION (derivative: 3-3=0). (In fact soon we begin to make the solution worse, and can never make it better again. Not only have we found the minimum of this function, but we can see it is a global minimum.)
Thus the solution is as follows: Pretend the first number is in place. Calculate the vector of differences. Minimize the sum of the absolute value of this vector; do this by finding the median OF THE DIFFERENCES and subtracting that off from the differences to obtain an improved differences-vector. The sum of the absolute value of the "improved" vector is your answer. This is O(N) The solutions of equal optimality will (as per the above) always be "adjacent". A unique solution exists only if there are an odd number of numbers; otherwise if there are an even number of numbers, AND the median-of-differences is not an integer, the equally-optimal solutions will have difference-vectors with corrective factors of any number between the two medians.
So I guess this wouldn't be complete without a final example.
input: 2 3 4 10 14 14 15 100
difference vector: 2 3 4 5 6 7 8 9-2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
note that the medians of the difference-vector are not in the middle anymore, we need to perform an O(N) median-finding algorithm to extract them...
medians of difference-vector are -5 and -7
let us take -5 to be our correction factor (any number between the medians, such as -6 or -7, would also be a valid choice)
thus our new goal is 2 3 4 5 6 7 8 9+5=7 8 9 10 11 12 13 14, and the new differences are 5 5 5 0 -3 -2 -2 -86*
this means we will need to do 5+5+5+0+3+2+2+86=108 steps
*(we obtain this by repeating step 2 with our new target, or by adding 5 to each number of the previous difference... but since you only care about the sum, we'd just add 8*5 (vector length times correct factor) to the previously calculated sum)
Alternatively, we could have also taken -6 or -7 to be our correction factor. Let's say we took -7...
then the new goal would have been 2 3 4 5 6 7 8 9+7=9 10 11 12 13 14 15 16, and the new differences would have been 7 7 7 2 1 0 0 -84
this would have meant we'd need to do 7+7+7+2+1+0+0+84=108 steps, the same as above
If you simulate this yourself, can see the number of steps becomes >108 as we take offsets further away from the range [-5,-7].
Pseudocode:
def minSteps(array A of size N):
A' = [0,1,...,N-1]
diffs = A'-A
medianOfDiffs = leftMedian(diffs)
return sum(abs(diffs-medianOfDiffs))
Python:
leftMedian = lambda x:sorted(x)[len(x)//2]
def minSteps(array):
target = range(len(array))
diffs = [t-a for t,a in zip(target,array)]
medianOfDiffs = leftMedian(diffs)
return sum(abs(d-medianOfDiffs) for d in diffs)
edit:
It turns out that for arrays of distinct integers, this is equivalent to a simpler solution: picking one of the (up to 2) medians, assuming it doesn't move, and moving other numbers accordingly. This simpler method often gives incorrect answers if you have any duplicates, but the OP didn't ask that, so that would be a simpler and more elegant solution. Additionally we can use the proof I've given in this solution to justify the "assume the median doesn't move" solution as follows: the corrective factor will always be in the center of the array (i.e. the median of the differences will be from the median of the numbers). Thus any restriction which also guarantees this can be used to create variations of this brainteaser.
Get one of the medians of all the numbers. As the numbers are already sorted, this shouldn't be a big deal. Assume that median does not move. Then compute the total cost of moving all the numbers accordingly. This should give the answer.
community edit:
def minSteps(a):
"""INPUT: list of sorted unique integers"""
oneMedian = a[floor(n/2)]
aTarget = [oneMedian + (i-floor(n/2)) for i in range(len(a))]
# aTargets looks roughly like [m-n/2?, ..., m-1, m, m+1, ..., m+n/2]
return sum(abs(aTarget[i]-a[i]) for i in range(len(a)))
This is probably not an ideal solution, but a first idea.
Given a sorted sequence [x1, x2, …, xn]:
Write a function that returns the differences of an element to the previous and to the next element, i.e. (xn – xn–1, xn+1 – xn).
If the difference to the previous element is > 1, you would have to increase all previous elements by xn – xn–1 – 1. That is, the number of necessary steps would increase by the number of previous elements × (xn – xn–1 – 1). Let's call this number a.
If the difference to the next element is >1, you would have to decrease all subsequent elements by xn+1 – xn – 1. That is, the number of necessary steps would increase by the number of subsequent elements × (xn+1 – xn – 1). Let's call this number b.
If a < b, then increase all previous elements until they are contiguous to the current element. If a > b, then decrease all subsequent elements until they are contiguous to the current element. If a = b, it doesn't matter which of these two actions is chosen.
Add up the number of steps taken in the previous step (by increasing the total number of necessary steps by either a or b), and repeat until all elements are contiguous.
First of all, imagine that we pick an arbitrary target of contiguous increasing values and then calculate the cost (number of steps required) for modifying the array the array to match.
Original: 3 5 7 8 10 16
Target: 4 5 6 7 8 9
Difference: +1 0 -1 -1 -2 -7 -> Cost = 12
Sign: + 0 - - - -
Because the input array is already ordered and distinct, it is strictly increasing. Because of this, it can be shown that the differences will always be non-increasing.
If we change the target by increasing it by 1, the cost will change. Each position in which the difference is currently positive or zero will incur an increase in cost by 1. Each position in which the difference is currently negative will yield a decrease in cost by 1:
Original: 3 5 7 8 10 16
New target: 5 6 7 8 9 10
New Difference: +2 +1 0 0 -1 -6 -> Cost = 10 (decrease by 2)
Conversely, if we decrease the target by 1, each position in which the difference is currently positive will yield a decrease in cost by 1, while each position in which the difference is zero or negative will incur an increase in cost by 1:
Original: 3 5 7 8 10 16
New target: 3 4 5 6 7 8
New Difference: 0 -1 -2 -2 -3 -8 -> Cost = 16 (increase by 4)
In order to find the optimal values for the target array, we must find a target such that any change (increment or decrement) will not decrease the cost. Note that an increment of the target can only decrease the cost when there are more positions with negative difference than there are with zero or positive difference. A decrement can only decrease the cost when there are more positions with a positive difference than with a zero or negative difference.
Here are some example distributions of difference signs. Remember that the differences array is non-increasing, so positives always have to be first and negatives last:
C C
+ + + - - - optimal
+ + 0 - - - optimal
0 0 0 - - - optimal
+ 0 - - - - can increment (negatives exceed positives & zeroes)
+ + + 0 0 0 optimal
+ + + + - - can decrement (positives exceed negatives & zeroes)
+ + 0 0 - - optimal
+ 0 0 0 0 0 optimal
C C
Observe that if one of the central elements (marked C) is zero, the target must be optimal. In such a circumstance, at best any increment or decrement will not change the cost, but it may increase it. This result is important, because it gives us a trivial solution. We pick a target such that a[n/2] remains unchanged. There may be other possible targets that yield the same cost, but there are definitely none that are better. Here's the original code modified to calculate this cost:
//n is the number of elements in array a
int targetValue;
int cost = 0;
int middle = n / 2;
int startValue = a[middle] - middle;
for (i = 0; i < n; i++)
{
targetValue = startValue + i;
cost += abs(targetValue - a[i]);
}
printf("%d\n",cost);
You can not do it by iterating once on the array, that's for sure.
You need first to check the difference between each two numbers, for example:
2,7,8,9 can be 2,3,4,5 with 18 steps or 6,7,8,9 with 4 steps.
Create a new array with the difference like so: for 2,7,8,9 it wiil be 4,1,1. Now you can decide whether to increase or decrease the first number.
Lets assume that the contiguous array looks something like this -
c c+1 c+2 c+3 .. and so on
Now lets take an example -
5 7 8 10
The contiguous array in this case will be -
c c+1 c+2 c+3
In order to get the minimum steps, the sum of the modulus of the difference of the integers(before and after) w.r.t the ith index should be the minimum. In which case,
(c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2 should be minimum
Let f(c) = (c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2
= 4c^2 - 48c + 146
Applying differential calculus to get the minima,
f'(c) = 8c - 48 = 0
=> c = 6
So our contiguous array is 6 7 8 9 and the minimum cost here is 2.
To sum it up, just generate f(c), get the first differential and find out c.
This should take O(n).
Brute force approach O(N*M)
If one draws a line through each point in the array a then y0 is a value where each line starts at index 0. Then the answer is the minimum among number of steps reqired to get from a to every line that starts at y0, in Python:
y0s = set((y - i) for i, y in enumerate(a))
nsteps = min(sum(abs(y-(y0+i)) for i, y in enumerate(a))
for y0 in xrange(min(y0s), max(y0s)+1)))
Input
2,4,5,6
2,4,5,8
Output
1
3
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Given a 2d array sorted in increasing order from left to right and top to bottom, what is the best way to search for a target number?
The following was asked in a Google interview:
You are given a 2D array storing integers, sorted vertically and horizontally.
Write a method that takes as input an integer and outputs a bool saying whether or not the integer is in the array.
What is the best way to do this? And what is its time complexity?
Start at the Bottom-Left corner of the Matrix and follow the rules stated below to traverse the matrix:
The matrix traversal is based on these conditions:
If the input number is greater than current number: Move Right
If the input number is less than current number: Move Up.
If the input number is equal to current number: Return Success
If the input number is not equal to current number and no transition is possible: Return Fail
Time Complexity: (Thanks to Martinho Fernandes)
The time complexity is O(N+M). In the worst case, the element searched for is in the upper-left corner, meaning you'll go up N times, and left M times.
Example
Input matrix:
--------------
| 1 | 4 | 6 |
--------------
| 2 | 5 | 9 |
--------------
| *3* | 8 | 10 |
--------------
Number to search: 4
Step 1:
Start at the cell where you have 3 (Bottom-Left).
3 < 4: Move Right
| 1 | 4 | 6 |
--------------
| 2 | 5 | 9 |
--------------
| 3 | *8* | 10 |
--------------
Step 2:
8 > 4: Move Up
| 1 | 4 | 6 |
--------------
| 2 | *5* | 9 |
--------------
| 3 | 8 | 10 |
--------------
Step 3:
5 > 4: Move Up
| 1 | *4* | 6 |
--------------
| 2 | 5 | 9 |
--------------
| 3 | 8 | 10 |
--------------
Step 4:
4=4: Return the index of the number
I would start by asking details about what it means to be "sorted vertically and horizontally"
If the matrix is sorted in a way that the last element of each row is less than the first element of the next row, you can run a binary search on the first column to find out in what row that number is, and then run another binary search on the row. This algorithm will take O(log C + log R) time, where C and R are, respectively the number of rows and columns. Using a property of the logarithm, one can write that as O(log(C*R)), which is the same as O(log N), if N is the number of elements in the array. This is almost the same as treating the array as 1D and running a binary search on it.
But the matrix could be sorted in a way that the last element of each row is not less than the first element of the next row:
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 10
3 4 5 6 7 8 9 10 11
In this case, you could run some sort of horizontal an vertical binary search simultaneously:
Test the middle number of the first column. If it's less than the target, consider the lines above it. If it's greater, consider those below;
Test the middle number of the first considered line. If it's less, consider the columns left of it. If it's greater, consider those to the right;
Lathe, rinse, repeat until you find one, or you're left with no more elements to consider;
This method is also logarithmic on the number of elements.
The first method that comes to mind is a vertical binary search, followed by a horizontal one when you find the row it should be in. Complexity will be O(log NM) where N and M are the dimensions of the array.
Further explanation:
Consider just the first number of every row. When you perform a binary search of these first numbers for the specified number, the result will be either the specified number if you're lucky, otherwise it will be the position before or after where the specified number would go depending on the binary search implementation. Once you find the two of the first numbers that the specified number should go between, you know that the number is in that row, and a second binary search will find the number if it is in the row.
I'm trying to solve one of the Project Euler problems. As a consequence, I need an algorithm that will help me find all possible partitions of a set, in any order.
For instance, given the set 2 3 3 5:
2 | 3 3 5
2 | 3 | 3 5
2 | 3 3 | 5
2 | 3 | 3 | 5
2 5 | 3 3
and so on. Pretty much every possible combination of the members of the set. I've searched the net of course, but haven't found much that's directly useful to me, since I speak programmer-ese not advanced-math-ese.
Can anyone help me out with this? I can read pretty much any programming language, from BASIC to Haskell, so post in whatever language you wish.
Have you considered a search tree? Each node would represent a choice of where to put an element and the leaf nodes are answers. I won't give you code because that's part of the fun of Project Euler ;)
Take a look at:
The Art of Computer Programming, Volume 4, Fascicle 3: Generating All Combinations and Partitions
7.2.1.5. Generating all set partitions
In general I would look at the structure of the recursion used to compute the number of configurations, and build a similar recursion for enumerating them. Best is to compute a one-to-one mapping between integers and configurations. This works well for permutations, combinations, etc. and ensures that each configuration is enumerated only once.
Now even the recursion for the number of partitions of some identical items is rather complicated.
For partitions of multisets the counting amounts to solving the generalization of Project Euler problem 181 to arbitrary multisets.
Well, the problem has two aspects.
Firsty, the items can be arranged in any order. So for N items, there are N! permutations (assuming the items are treated as unique).
Secondly, you can envision the grouping as a bit flag between each item indicating a divide. There would be N-1 of these flags, so for a given permutation there would be 2^(N-1) possible groupings.
This means that for N items, there would be a total of N!*(2^(N-1)) groupings/permutations, which gets big very very fast.
In your example, the top four items are groupings of one permutation. The last item is a grouping of another permutation. Your items can be viewed as :
2 on 3 off 3 off 5
2 on 3 on 3 off 5
2 on 3 off 3 on 5
2 on 3 on 3 on 5
2 off 5 on 3 off 3
The permutations (the order of display) can be derived by looking at them like a tree, as mentioned by the other two. This would almost certainly involve recursion, such as here.
The grouping is independent of them in many ways. Once you have all the permutations, you can link them with the groupings if needed.
Here is the code you need for this part of your problem:
def memoize(f):
memo={}
def helper(x):
if x not in memo:
memo[x]=f(x)
return memo[x]
return helper
#memoize
def A000041(n):
if n == 0: return 1
S = 0
J = n-1
k = 2
while 0 <= J:
T = A000041(J)
S = S+T if k//2%2!=0 else S-T
J -= k if k%2!=0 else k//2
k += 1
return S
print A000041(100) #the 100's number in this series, as an example
I quickly whipped up some code to do this. However, I left out separating every possible combination of the given list, because I wasn't sure it was actually needed, but it should be easy to add, if necessary.
Anyway, the code runs quite well for small amounts, but, as CodeByMoonlight already mentioned, the amount of possibilities gets really high really fast, so the runtime increases accordingly.
Anyway, here's the python code:
import time
def separate(toseparate):
"Find every possible way to separate a given list."
#The list of every possibility
possibilities = []
n = len(toseparate)
#We can distribute n-1 separations in the given list, so iterate from 0 to n
for i in xrange(n):
#Create a copy of the list to avoid modifying the already existing list
copy = list(toseparate)
#A boolean list indicating where a separator is put. 'True' indicates a separator
#and 'False', of course, no separator.
#The list will contain i separators, the rest is filled with 'False'
separators = [True]*i + [False]*(n-i-1)
for j in xrange(len(separators)):
#We insert the separators into our given list. The separators have to
#be between two elements. The index between two elements is always
#2*[index of the left element]+1.
copy.insert(2*j+1, separators[j])
#The first possibility is, of course, the one we just created
possibilities.append(list(copy))
#The following is a modification of the QuickPerm algorithm, which finds
#all possible permutations of a given list. It was modified to only permutate
#the spaces between two elements, so it finds every possibility to insert n
#separators in the given list.
m = len(separators)
hi, lo = 1, 0
p = [0]*m
while hi < m:
if p[hi] < hi:
lo = (hi%2)*p[hi]
copy[2*lo+1], copy[2*hi+1] = copy[2*hi+1], copy[2*lo+1]
#Since the items are non-unique, some possibilities will show up more than once, so we
#avoid this by checking first.
if not copy in possibilities:
possibilities.append(list(copy))
p[hi] += 1
hi = 1
else:
p[hi] = 0
hi += 1
return possibilities
t1 = time.time()
separations = separate([2, 3, 3, 5])
print time.time()-t1
sepmap = {True:"|", False:""}
for a in separations:
for b in a:
if sepmap.has_key(b):
print sepmap[b],
else:
print b,
print "\n",
It's based on the QuickPerm algorithm, which you can read more about here: QuickPerm
Basically, my code generates a list containing n separations, inserts them into the given list and then finds all possible permutations of the separations in the list.
So, if we use your example we would get:
2 3 3 5
2 | 3 3 5
2 3 | 3 5
2 3 3 | 5
2 | 3 | 3 5
2 3 | 3 | 5
2 | 3 3 | 5
2 | 3 | 3 | 5
In 0.000154972076416 seconds.
However, I read through the problem description of the problem you are doing and I see how you are trying to solve this, but seeing how quickly the runtime increases I don't think that it would work as fast you would expect. Remember that Project Euler's problems should solve in around a minute.