Codility, lesson 14, task TieRopes (https://codility.com/demo/take-sample-test/tie_ropes). Stated briefly, the problem is to partition a list A of positive integers into the maximum number of (contiguous) sublists having sum at least K.
I've only come up with a greedy solution because that's the name of the lesson. It passes all the tests but I don't know why it is an optimal solution (if it is optimal at all).
int solution(int K, vector<int> &A) {
    int sum = 0, count = 0;
    for (int a : A)
    {
        sum += a;
        if (sum >= K)
        {
            ++count;
            sum = 0;
        }
    }
    return count;
}
Can somebody tell me if and why this solution is optimal?
Maybe I'm being naive or making some mistake here, but I think it is not too hard (although not obvious) to see that the algorithm is indeed optimal.
Suppose that you have an optimal partition of the list, that is, one with the maximum number of sublists. It may or may not use all of the elements of the list, but since adding an element to a valid sublist produces another valid sublist, let's suppose that any "remaining" element that was initially not assigned to any sublist has been assigned arbitrarily to one of its adjacent sublists. We then have a proper optimal partition of the list, which we will call P1.
Now let's think about the partition that the greedy algorithm would produce, say P2. The first sublist in P2 cannot be longer than the first sublist in P1, because the greedy algorithm closes a sublist as soon as its sum reaches K. So there are two things that can happen for the first sublist in P2:
1. It can be the same as the first sublist in P1.
2. It can be shorter than the first sublist in P1.
In case 1, you would repeat the reasoning starting at the next element after the first sublist. If every subsequent sublist produced by the algorithm is equal to that in P1, then P1 and P2 will be equal.
In case 2, you would also repeat the reasoning, but now you have at least one "extra" item available. So, again, the next sublist may:
2.1. Get as far as the next sublist in P1.
2.2. End before the next sublist in P1.
And repeat. So, in every case, P2 will have at least as many sublists as P1. This means that P2 is at least as good as any possible partition of the list and, in particular, any optimal partition.
It's not a very formal demonstration, but I think it's valid. Please point out anything you think may be wrong.
Here are the ideas that lead to a formal proof.
1. If A is a suffix of B, then the maximum partition size for A is less than or equal to the maximum partition size for B, because we can extend the first sublist of a partition of A to include the new elements without decreasing its sum.
2. Every proper prefix of every sublist in the greedy solution sums to less than K.
3. There is no point in having gaps, because we can add the missing elements to an adjacent list (I thought that my wording of the question had ruled out this possibility by definition, but I'll say it anyway).
The formal proof can be carried out by induction to show that, for every nonnegative integer i, there exists an optimal solution that agrees with the greedy solution on the first i sublists of each. It follows that, when i is sufficiently large, the only solution that agrees with greedy is greedy, so the greedy solution is optimal.
The basis i = 0 is trivial, since an arbitrary optimal solution will do. The inductive step consists of finding an optimal solution that agrees with greedy on the first i sublists and then shrinking the i+1th sublist to match the greedy solution (by observation 2, we really are shrinking that sublist, since it starts at the same position as greedy's; by observation 1, we can extend the i+2th sublist of the optimal solution correspondingly).
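As a sanity check on the argument (not a replacement for the proof), here is a minimal Python sketch that compares the greedy count with an exhaustive search on small inputs. The function names greedy and exhaustive are made up for this illustration, and exhaustive is deliberately allowed to leave elements unused, which, as noted above, never helps.

```python
from functools import lru_cache


def greedy(K, A):
    """Same logic as the C++ solution: close a sublist as soon as its sum reaches K."""
    total = count = 0
    for a in A:
        total += a
        if total >= K:
            count += 1
            total = 0
    return count


def exhaustive(K, A):
    """Try every way of choosing disjoint contiguous sublists with sum >= K."""
    A = tuple(A)

    @lru_cache(maxsize=None)
    def best(i):
        # Best number of valid sublists obtainable from A[i:].
        if i == len(A):
            return 0
        result = best(i + 1)  # option: leave A[i] unused
        total = 0
        for j in range(i, len(A)):
            total += A[j]
            if total >= K:  # A[i..j] is a valid sublist
                result = max(result, 1 + best(j + 1))
        return result

    return best(0)


print(greedy(4, [1, 2, 3, 4, 1, 1, 3]), exhaustive(4, [1, 2, 3, 4, 1, 1, 3]))
```

On every small input I tried, the two counts agree, which is what the induction predicts.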
I want to sort a list of n items with a comparison sort. However, one of the comparisons made by the algorithm will be flipped from what it's supposed to be. Specifically, there is one pair of items for which the comparator function consistently gives the wrong result.
What is an efficient n*log(n) sorting algorithm that will be robust to this faulty comparison? By robust, I mean that every item is off by at most k spots from its true position, for some reasonably small k.
If possible, I'd like it to be robust in the worst case (faulty comparison chosen adversarially), but I'll settle for robust in the average case.
An example of a robust algorithm (that's not efficient) would be to make all n*(n-1)/2 pairwise comparisons, and place each item by how many of the comparisons it won. Then, no matter what comparison the adversary makes, each item's index will be off by no more than k=1.
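To make the round-robin idea concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the helper names, the use of distinct integers as items, and the way the faulty comparator is modelled. Ties in the win count are broken arbitrarily by the sort, so the check at the end only asserts a displacement of at most 2 rather than 1.

```python
from itertools import combinations


def make_faulty_less(x, y):
    """Return a 'less than' comparator that is wrong only for the pair {x, y}."""
    bad = {x, y}
    return lambda a, b: (a < b) != ({a, b} == bad)


def round_robin_sort(items, less):
    """Make all n*(n-1)/2 comparisons and order items by how many they 'won'.

    An item wins a comparison when the comparator says it is the smaller one,
    so more wins means an earlier position.
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        if less(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
    return sorted(items, key=lambda item: -wins[item])


items = list(range(7, -1, -1))  # 7, 6, ..., 0
result = round_robin_sort(items, make_faulty_less(1, 6))
worst = max(abs(position - value) for position, value in enumerate(result))
print(result, worst)
```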
An example of a NON-robust algorithm is quicksort, because the adversary could just choose the largest item to be on the wrong side of the first pivot, making it on average n/2 spots off from its correct index.
TL;DR: It's possible to modify quicksort to get the following guarantee: in (expected) time O(n log n), we can do one of the following, depending on which comparison is flipped.
Perfectly sort the array.
Perfectly sort the array, except that an adjacent pair of items somewhere in the array is swapped.
Perfectly sort the array, except that three consecutive items in the array, which can be identified, are permuted.
This guarantees a maximum displacement of 2, which is as good as is theoretically possible.
I mulled over this problem for a couple of hours and everything I'm doing connects back to tournaments.
I'd like to begin by trying to reframe the question as follows. If you have a set of n items and you know the "true" results of the comparisons between them, you can represent that result as a directed graph with one node per item and edges indicating when one item compares less than another. This type of digraph is called a "tournament," since you can think of it as encoding the result of a round-robin tournament where each player plays each other player.
In the case of an honest comparator, our tournament will be acyclic, and in particular it will have the following key property: there's exactly one node of each outdegree 0, 1, 2, ..., n - 1. The idea here is that the smallest element will have outdegree n - 1 (it's smaller than everything else), while the largest element will have outdegree 0 (it's bigger than everything else). And in fact, there's a theorem that a tournament is acyclic if and only if each node in the tournament has a different outdegree. Another useful fact: in an acyclic tournament, there's an edge from U to V if and only if outdeg(U) > outdeg(V).
In the case of a "dishonest comparator," we essentially start with an acyclic tournament, then flip a single edge. Your question asked about doing approximate sorting based on this comparator, but I'd like to step back and ask a different question, which I think can then be used to answer yours more precisely. In what cases can you figure out which edge was flipped? If we can do that, then we can do even better than approximate sorting - we can "unflip" the edge and sort perfectly. On the other hand, in which cases can you not figure out which edge was flipped, and when that happens, how far from sorted will we end up? That corresponds to having to do an approximate sort because we can't recover the original ordering.
Here's a useful fact:
Theorem: Begin with an acyclic tournament and flip a single edge. Then it's possible to determine which edge was flipped if and only if the outdegrees of the two endpoints of the flipped edge originally differ by at least three.
To prove this, we'll show both directions of implication.
First, suppose that we flip an edge between two nodes X and Y whose outdegrees differ by one. When we're done, we're left with a tournament where all nodes have different outdegrees (all other nodes have their outdegrees unchanged, and if we flipped the edge (X, Y), then X and Y swap outdegrees because one goes up by one and one goes down by one). We're now left with another acyclic tournament. And in particular, we can't tell which edge we flipped, because we could have just as well flipped any edge between any pair of nodes whose outdegrees differ by one.
Next, suppose we flip an edge between nodes X and Y where outdeg(X) = k+1 and outdeg(Y) = k-1. We now have outdeg(X) = k = outdeg(Y), and somewhere else to begin with there must have been some node Z with outdegree k as well. So at this point, we have three nodes of outdegree k (namely, X, Y, and Z), and we know that we must have flipped one of the three edges between them. But we can't tell which one it was. Specifically, flipping the XY edge, or the XZ edge, or the YZ edge would all give back acyclic tournaments. So in that case, there's no way to undo the transform. That means that any sorted ordering we get from this comparator may have some of those items out of place, so the worst-case displacement is at least 1.
An important note for this particular case: this corresponds to the comparator creating a tournament with exactly one cycle containing the nodes X, Y, and Z. Specifically, it'll take on the form X, Z, Y, X. The problem is we can't tell whether the original ordering was (X, Z, Y), or (Z, Y, X), or (Y, X, Z), and so we'd have a maximum distance of at least 2.
And finally, suppose that we have two nodes X and Y and flip the edge XY in the case where outdeg(X) = k, outdeg(Y) = m, and k ≥ m + 3. We're now left with a tournament in which two nodes have outdegree k - 1 and two nodes have outdegree m + 1. But of those four nodes, it's guaranteed that there's exactly one pair of them that can be flipped back to produce an acyclic tournament. One way to see this: take the four nodes that now have repeated outdegrees; call them X and Y (as above) and also W and Z, and suppose we have the cycle X, W, Z, Y, X, where the only flipped edge from the original is (Y, X). What will this cycle look like? Well, since (X, W), (W, Z), and (Z, Y) are edges in the tournament that weren't flipped, back in the original tournament we have outdeg(X) > outdeg(W) > outdeg(Z) > outdeg(Y). That means that we have to have X and W having outdegree k - 1 in the new graph and Z and Y having outdegree m + 1 in the new graph. Therefore, only flipping the edge from Y to X will increase the degree of one of the degree-(k-1) nodes back up to k while also decreasing the degree of one of the degree-(m+1) nodes down to m.
Summarizing:
Theorem: The faulty comparator will either
1. Behave as a real comparator, in which case we swapped two adjacent elements in the original sequence and we will never know which,
2. Have exactly one cycle of length three of elements whose original ordering can never be known, or
3. Have a cycle of length four, in which case we can identify which comparison is reversed.
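The three cases can be checked by brute force on a small tournament. The Python sketch below is only an illustration: the helper names are hypothetical, the items are the integers 0..n-1 in their true order, and acyclicity is tested through the distinct-outdegree characterization quoted earlier. For each flipped pair it lists the edges whose reversal would give back an acyclic tournament.

```python
from itertools import combinations


def faulty_tournament(n, x, y):
    """Edges (u, v) meaning 'u compares less than v', with the pair {x, y} reversed."""
    bad = (min(x, y), max(x, y))
    edges = set()
    for u, v in combinations(range(n), 2):  # here u < v in the true order
        edges.add((v, u) if (u, v) == bad else (u, v))
    return edges


def is_acyclic(n, edges):
    """A tournament is acyclic iff its outdegrees are exactly 0, 1, ..., n-1."""
    outdegree = [0] * n
    for u, _ in edges:
        outdegree[u] += 1
    return sorted(outdegree) == list(range(n))


def restoring_flips(n, edges):
    """All single edges whose reversal turns the tournament into an acyclic one."""
    candidates = []
    for u, v in edges:
        reversed_once = (edges - {(u, v)}) | {(v, u)}
        if is_acyclic(n, reversed_once):
            candidates.append((u, v))
    return candidates


for x, y in [(0, 1), (0, 2), (0, 3)]:
    found = restoring_flips(6, faulty_tournament(6, x, y))
    print((x, y), len(found))
```

With n = 6 the number of candidates should come out as n - 1 = 5 when the outdegrees differ by one, 3 when they differ by two, and 1 when they differ by three or more, which matches the three cases.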
With this in mind, it seems reasonable to reframe your problem in the following way:
Goal: Design an algorithm that, in time O(n log n), does one of the following things to a list of n elements given a faulty comparator that returns the wrong result when comparing two fixed elements X and Y against one another:
1. Perfectly sort the list.
2. Perfectly sort the list, except with two adjacent items swapped.
3. Perfectly sort the list, except with three adjacent items permuted.
Here's one possible algorithm that does this in expected O(n log n) time that's based on quicksort. The basic idea is the following: we run more or less a regular quicksort, at each point in time checking to see whether we found a triangle. If not, then either we're in case (1) or case (2). If we do find a triangle, we see whether we can identify which comparison got reversed. If we can, then we rerun quicksort, except that we "fix" the comparator in this broken case. If we can't, then we're in case (3) and just finish quicksort as usual.
The specific technique we'll use to detect a triangle works like this. Begin with a regular, vanilla quicksort: pick a pivot, partition the array into things less than the pivot and things bigger than the pivot, then recursively sort the two smaller subarrays. However, after doing so, we do one additional step: assuming the subarray we're sorting has three or more elements in it, look at the pivot p and the element just before and just after it (call those s, p, g for "smaller," "pivot," and "greater"). Then if the comparator says s < p < g < s, we've found a triangle. And in fact, we have something stronger.
Suppose that at some point in quicksort the comparator does indeed compare X and Y, the mismatched items. We're assuming X < Y, but that the comparator incorrectly reports that Y < X. The only way that two items can be compared in quicksort is if one of them is a pivot element at a time when the other is in the current subarray. Without loss of generality, let's assume that X was the pivot, and that Y was compared against it.
What should happen here, assuming the comparator was honest, is that Y would be found to be larger than X, and therefore would be placed into the "bigger" subarray. But because the comparator is a lying liar who lies, instead Y gets placed into the "smaller" subarray. If we then recursively sort the "smaller" subarray and the "bigger" subarray, think about where Y will end up. It's in the "smaller" subarray but is actually bigger than X, which means it'll compare larger than everything in that "smaller" subarray. Consequently, Y will appear just before X. Now, look at the items in the "bigger" subarray. There are two options. The first is that in the "real" ordering, there's at least one value between X and Y. That value would then appear in the "bigger" subarray because it's larger than X, and in particular the first element of the "bigger" subarray would compare smaller than Y. That would mean that Y, then X, then the item immediately after X after sorting would form a triangle. The other option is that X and Y are adjacent in the true sorted ordering, in which case we'd never find out (as mentioned above). This, combined with the above insight, means that
Theorem: Suppose we run quicksort, and after recursively sorting the left and right subarrays we look at the three items consisting of the pivot, the item just before it, and the item just after it to see if they form a triangle. Then if this algorithm detects a triangle, a triangle exists. Moreover, if this algorithm does not detect a triangle, then either (1) no triangle exists or (2) a triangle does exist, but the comparator was never applied to the bad pair (X, Y) and so the sorted order is correct.
With all this said and done, we can state the full algorithm that, in expected O(n log n) time, sorts the array as best as is possible.
function modifiedQuicksort(array, comparator):
    if array has length 0 or 1, return.
    pick a random pivot element from the array.
    use the comparator to form subarrays smaller and greater based on
        how elements compare against the pivot.
    recursively apply modifiedQuicksort to those two arrays.
    if the comparator finds a triangle formed from the last element of
        smaller, the pivot, and the first element of greater, report those
        three items as a triangle.
    return smaller, pivot, greater.

function sortAsBestWeCan(array, comparator):
    run modifiedQuicksort(array, comparator)
    if it didn't report a triangle, return the result of the call.
    otherwise, it reported a triangle A, B, C.
    for each other item D:
        if (comparator(A, D) and comparator(D, B)) or
           (comparator(B, D) and comparator(D, C)) or
           (comparator(C, D) and comparator(D, A)):
            you have found a 4-cycle from A, B, C, and D.
            detect which comparison is reversed.
            use that knowledge plus the comparator and your favorite
                O(n log n)-time sorting algorithm to perfectly sort
                the input array.
    otherwise, those three items are the only triangle, and the
        array is sorted as well as it can be. return it.
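For anyone who wants to experiment, here is one way the first routine, modifiedQuicksort, might look in Python. It is a sketch under assumptions rather than a definitive implementation: it is not in place, the names modified_quicksort and make_faulty_less are mine, items are assumed distinct, and the comparator is assumed to answer consistently (exactly one of less(a, b) and less(b, a) is true). The second routine, sortAsBestWeCan, is not covered.

```python
import random


def make_faulty_less(x, y):
    """'Less than' on integers that gives the wrong answer only for the pair {x, y}."""
    bad = {x, y}
    return lambda a, b: (a < b) != ({a, b} == bad)


def modified_quicksort(items, less, triangles):
    """Quicksort that appends any triangle (s, pivot, g) it notices to `triangles`."""
    if len(items) <= 1:
        return list(items)
    pivot_index = random.randrange(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    smaller = modified_quicksort([v for v in rest if less(v, pivot)], less, triangles)
    greater = modified_quicksort([v for v in rest if not less(v, pivot)], less, triangles)
    if smaller and greater:
        s, g = smaller[-1], greater[0]
        # The partition already established s < pivot and pivot < g according
        # to the comparator, so g < s is the only edge needed to close a cycle.
        if less(g, s):
            triangles.append((s, pivot, g))
    return smaller + [pivot] + greater


triangles = []
result = modified_quicksort(list(range(10)), make_faulty_less(2, 5), triangles)
print(result, triangles)
```

Because the pivot is random, the output varies between runs: either the bad pair is never compared and the result is fully sorted, or a triangle containing both 2 and 5 is reported.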
I think I've thought up a solution.
First, do a first pass with any decent sorting algorithm you want (like quicksort), which should, at worst, result in only one item that's significantly far from where it should be.
Then, choose a width h that's at least 5.
For i from 0 to n-h, we look at the group of h items at i, i+1, ..., i+h-1. We make all h*(h-1)/2 pairwise comparisons in that group, and rearrange the items by how many comparisons each one won. We then increment i and move on to the next group.
Afterwards, we do the same thing, but going backwards from i=n-h to i=0.
These two extra passes will bubble the displaced item up or down into the correct area, and use the extra comparisons in a group of h to override the single faulty comparison.
The final number of comparisons will be O(n*log(n)) + n*h*(h-1)/2. Not sure how much better you can do.
This method also works (I think) for more than one faulty comparison. All you need to do is make sure that h is large enough to override those faulty comparisons.
I'm studying for my exam coming up and I am practicing a problem that wants me to implement a greedy algorithm.
I am given an unsorted array of different weights where 0 < weight_i for all i. I have to place all of them such that I use the least number of piles. I can not place two weights in a pile where the one on top is greater than the one below. I also have to respect the ordering of the weights, so they must be placed in order. There is no height limit for the pile.
An example: If I have the weights {53, 21, 40, 10, 18} I cannot place 40 above 21 because the pile must be in descending order, and I cannot place 21 above 40 because that does not respect the order. An optimal solution would have pile 1: 53, 21, 10 and pile 2: 40, 18.
My general solution is to iterate through the array and always pick the first pile where the weight is allowed to go. I believe this would give me an optimal solution (although I haven't proved it yet). I could not find a counterexample to this. But this would be O(n^2), because in the worst case I have to iterate through every element and every pile (I think).
My question is, is there a way to get this down to O(n) or O(nlogn)? If there is I'm just not seeing it and need some help.
Your algorithm will give a correct result.
Now note the following: when visiting the piles in order and stopping at the first one where the next value can be stacked, you will always have a situation where the stacks are ordered by their current top (last) value, in ascending order.
You can use this property to avoid an iteration of the piles from "left to right". Instead use a binary search, among the piles, to find that first pile that can take the next value.
This will give you an O(n log n) time complexity.
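A minimal Python sketch of that idea, assuming the weights are given as a list and using the standard bisect module; the function name stack_weights and the separate tops list are choices made for this example, not part of the question.

```python
from bisect import bisect_left


def stack_weights(weights):
    """Place weights in order on the first pile whose top is not smaller.

    tops[i] is the weight currently on top of pile i. The list stays sorted in
    ascending order, so the first usable pile can be found by binary search.
    """
    piles = []
    tops = []
    for w in weights:
        idx = bisect_left(tops, w)  # first pile whose top is >= w
        if idx == len(tops):
            piles.append([w])       # no pile can take w: start a new one
            tops.append(w)
        else:
            piles[idx].append(w)
            tops[idx] = w           # w becomes the new top; order is preserved
    return piles


print(stack_weights([53, 21, 40, 10, 18]))
```

On the example from the question this yields the two piles 53, 21, 10 and 40, 18.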
Believe it or not, the problem you describe is equivalent to computing the length of the longest increasing subsequence. There's a neat little greedy idea as to why.
Consider the longest increasing subsequence (LIS) of the array. Because the elements are ascending in index and also ascending in value, they must all be in different piles. As a result, the minimum number of piles needed is at least the number of elements in the LIS, and the greedy placement achieves exactly that number.
LIS is easily solvable in O(NlogN) using dynamic programming and a binary search.
Note that the algorithm you describe does the same thing as the algorithm below - it finds the first pile you can put the item on (with binary search), or it creates a new pile, so this serves as a "proof" of correctness for your algorithm and a way to reduce your complexity.
Let dp[i] be the smallest possible last element of an increasing subsequence of length (i + 1). To reframe it in terms of your question, dp[i] would also be equal to the weight currently on top of the ith pile. (The code below calls this array arr.)
from bisect import bisect_left

def lengthOfLIS(nums):
    arr = []
    for num in nums:
        # first position whose stored value is >= num
        idx = bisect_left(arr, num)
        if idx == len(arr):
            arr.append(num)
        else:
            arr[idx] = num
    return len(arr)
There is a sequence {a1, a2, a3, a4, ..... aN}. A run is a maximal strictly increasing or strictly decreasing contiguous part of the sequence. E.g., if we have the sequence {1,2,3,4,7,6,5,2,3,4,1,2}, we have 5 runs: {1,2,3,4,7}, {7,6,5,2}, {2,3,4}, {4,1} and {1,2}.
Given four numbers N, M, K, L, count the number of possible sequences of N numbers that have exactly M runs, where each number in the sequence is less than or equal to K and the difference between adjacent numbers is less than or equal to L.
The question was asked during an interview.
I could only think of a brute force solution. What is an efficient solution for this problem?
Use dynamic programming. For each number in the substring maintain separate count of maximal increasing and maximally decreasing subsequences. When you incrementally add a new number to the end you can use these counts to update the counts for the new number. Complexity: O(n^2)
This can be rephrased as a recurrence problem. Look at your problem as finding #(N, M) (assume K and L are fixed, they are used in the recurrence conditions, so propagate accordingly). Now start with the more restricted count functions A(N, M; a) and D(N, M; a), where A counts those sets with last run ascending, D counts those with last run descending, and a is the value of the last element in the set.
Express #(N, M) in terms of A(N, M; a) and D(N, M; a) (it's the sum over all allowable a). You might note that there are relations between the two (like the reflection A(N, M; a) = D(N, M; K-a)) but that won't matter much for the calculation except to speed table filling.
Now A(N, M; a) can be expressed in terms of A(N-1, M; w), A(N-1, M-1; x), D(N-1, M; y) and D(N-1, M-1; z). The idea is that if you start with a set of size N-1 and know the direction of the last run and the value of the last element, you know whether adding element a will add to an existing run or add a run. So you can count the number of possible ways to get what you want from the possibilities of the previous case.
I'll let you write this recursion down. Note that this is where you account for L (only add up those that obey the L distance restriction) and K (look for end cases).
Terminate the recursion using the fact that A(1, 1; a) = 1, A(1, x>1; a) = 0 (and similarly for D).
Now, since this is a multiple recursion, be sure your implementation stores results in a table and begins by trying lookup (commonly called dynamic programming).
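Here is one way the recursion might be written down in Python, filled in bottom-up rather than with memoized lookups. Several things are assumptions of mine, since the question leaves them open: values are integers from 1 to K, adjacent values must differ by at least 1 and at most L, and sequences need at least two elements to have a run, so the tables start at length 2 instead of the A(1, 1; a) base case. The brute-force function is only there to cross-check small inputs.

```python
from itertools import product


def count_sequences(N, M, K, L):
    """Count sequences of length N over 1..K with exactly M runs (see assumptions above)."""
    if N < 2 or M < 1:
        return 0
    # up[m][a] / down[m][a]: number of sequences of the current length with m
    # runs, ending in value a, whose last run is ascending / descending.
    up = [[0] * (K + 1) for _ in range(M + 1)]
    down = [[0] * (K + 1) for _ in range(M + 1)]
    for a in range(1, K + 1):
        up[1][a] = a - max(1, a - L)        # predecessors a-L .. a-1
        down[1][a] = min(K, a + L) - a      # predecessors a+1 .. a+L
    for _ in range(N - 2):
        new_up = [[0] * (K + 1) for _ in range(M + 1)]
        new_down = [[0] * (K + 1) for _ in range(M + 1)]
        for m in range(1, M + 1):
            for a in range(1, K + 1):
                for b in range(max(1, a - L), min(K, a + L) + 1):
                    if b > a:    # extends an ascending run or starts a new one
                        new_up[m][b] += up[m][a] + down[m - 1][a]
                    elif b < a:  # extends a descending run or starts a new one
                        new_down[m][b] += down[m][a] + up[m - 1][a]
        up, down = new_up, new_down
    return sum(up[M]) + sum(down[M])


def brute_force(N, M, K, L):
    """Enumerate every sequence; only usable for tiny N and K."""
    total = 0
    for seq in product(range(1, K + 1), repeat=N):
        diffs = [b - a for a, b in zip(seq, seq[1:])]
        if any(d == 0 or abs(d) > L for d in diffs):
            continue
        runs = 1 + sum((d1 > 0) != (d2 > 0) for d1, d2 in zip(diffs, diffs[1:]))
        total += runs == M
    return total


print(count_sequences(3, 2, 2, 1), brute_force(3, 2, 2, 1))
```

The table version runs in O(N * M * K * L) time, compared with K^N for the enumeration.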
I suppose you mean by 'brute force solution' what I might mean by 'straightforward solution involving nested-loops over N,M,K,L' ? Sometimes the straightforward solution is good enough. One of the times when the straightforward solution is good enough is when you don't have a better solution. Another of the times is when the numbers are not very large.
With that off my chest I would write the loops in the reverse direction, or something like that. I mean:
Create 2 auxiliary data structures, one to contain the indices of the numbers <=K, one for the indices of the numbers whose difference with their neighbours is <=L.
Run through the list of numbers and populate the foregoing auxiliary data structures.
Find the intersection of the values in those 2 data structures; these will be the indices of interesting places to start searching for runs.
Look in each of the interesting places.
Until someone demonstrates otherwise this is the most efficient solution.
I'm re-reading Skiena's Algorithm Design Manual to catch up on some stuff I've forgotten since school, and I'm a little baffled by his descriptions of Dynamic Programming. I've looked it up on Wikipedia and various other sites, and while the descriptions all make sense, I'm having trouble figuring out specific problems myself. Currently, I'm working on problem 3-5 from the Skiena book. (Given an array of n real numbers, find the maximum sum in any contiguous subvector of the input.) I have an O(n^2) solution, such as described in this answer. But I'm stuck on the O(N) solution using dynamic programming. It's not clear to me what the recurrence relation should be.
I see that the subsequences form a set of sums, like so:
S = {a,b,c,d}
a a+b a+b+c a+b+c+d
b b+c b+c+d
c c+d
d
What I don't get is how to pick which one is the greatest in linear time. I've tried doing things like keeping track of the greatest sum so far, and if the current value is positive, add it to the sum. But when you have larger sequences, this becomes problematic because there may be stretches of negative numbers that would decrease the sum, but a later large positive number may bring it back to being the maximum.
I'm also reminded of summed area tables. You can calculate all the sums using only the cumulative sums: a, a+b, a+b+c, a+b+c+d, etc. (For example, if you need b+c, it's just (a+b+c) - (a).) But don't see an O(N) way to get it.
Can anyone explain to me what the O(N) dynamic programming solution is for this particular problem? I feel like I almost get it, but that I'm missing something.
You should take a look at this PDF, hosted by the school at http://castle.eiu.edu. Here it is:
The explanation of the following pseudocode is also in the PDF.
There is a solution like, first sort the array in to some auxiliary memory, then apply Longest Common Sub-Sequence method to the original array and the sorted array, with sum(not the length) of common sub-sequence in the 2 arrays as the entry into the table (Memoization). This can also solve the problem
Total running time is O(nlogn)+O(n^2) => O(n^2)
Space is O(n) + O(n^2) => O(n^2)
This is not a good solution when memory comes into picture. This is just to give a glimpse on how problems can be reduced to one another.
My understanding of DP is that it is about "making a table". In fact, the original meaning of "programming" in DP is simply about making tables.
The key is to figure out what to put in the table, or, in modern terms, what state to track, or what the vertex key/value in the DAG is (ignore these terms if they sound strange to you).
How about choosing dp[i] to be the largest sum achievable with elements up to index i of the array? For example, take the array [5, 15, -30, 10].
The second important key is "optimal substructure", that is, to "assume" dp[i-1] already stores the largest sum for sub-sequences up to index i-1. That's why the only step at i is to decide whether to include a[i] in the sub-sequence or not:
dp[i] = max(dp[i-1], dp[i-1] + a[i])
The first term in max is to "not include a[i]", the second term is to "include a[i]". Notice, if we don't include a[i], the largest sum so far remains dp[i-1], which comes from the "optimal substructure" argument.
So the whole program looks like this (in Python):
a = [5, 15, -30, 10]
dp = [0] * len(a)
dp[0] = max(0, a[0])  # include a[0] or not
for i in range(1, len(a)):
    dp[i] = max(dp[i-1], dp[i-1] + a[i])  # for sub-sequence, choose to add or not
print(dp, max(dp))
The result: the largest sum of a sub-sequence is the largest item in the dp table after i has iterated through the array a. But take a close look at dp; it holds all the information.
Since it only goes through items in array a once, it's a O(n) algorithm.
This problem seems silly, because as long as a[i] is positive, we should always include it in the sub-sequence, because it will only increase the sum. This intuition matches the code
dp[i] = max(dp[i-1], dp[i-1] + a[i])
So the max. sum of sub-sequence problem is easy, and doesn't need DP at all. Simply,
total = 0  # renamed from sum to avoid shadowing the built-in
for v in a:
    if v > 0:
        total += v
However, what about the largest-sum "continuous sub-array" problem? All we need to change is a single line of code:
dp[i] = max(dp[i-1]+a[i], a[i])
The first term is to "include a[i] in the continuous sub-array"; the second term is to start a new sub-array, starting at a[i].
In this case, dp[i] is the max. sum of a continuous sub-array ending at index i.
This is certainly better than the naive O(n^2)*O(n) approach, which adds a for j in range(0,i): loop inside the i-loop and sums every possible sub-array.
One small caveat: because of the way dp[0] is set, if all items in a are negative, we won't select any. So for the max sum continuous sub-array, we change that to
dp[0] = a[0]
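Putting the continuous sub-array pieces together, a minimal Python sketch might look like the following. The function name max_subarray_sum is mine, and instead of a full dp table it keeps only the previous entry, which is all the recurrence needs; an empty input is rejected because the question does not say what the answer should be in that case.

```python
def max_subarray_sum(a):
    """Largest sum of a non-empty contiguous sub-array of a, in O(n) time."""
    if not a:
        raise ValueError("array must be non-empty")
    best_ending_here = best = a[0]  # plays the role of dp[0] = a[0]
    for value in a[1:]:
        # dp[i] = max(dp[i-1] + a[i], a[i])
        best_ending_here = max(best_ending_here + value, value)
        best = max(best, best_ending_here)
    return best


print(max_subarray_sum([5, 15, -30, 10]))
```

For the example array the dp values are 5, 20, -10, 10, so the answer is 20.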
So, this is a common interview question. There's already a topic up, which I have read, but it's dead, and no answer was ever accepted. On top of that, my interests lie in a slightly more constrained form of the question, with a couple practical applications.
Given a two dimensional array such that:
Elements are unique.
Elements are sorted along the x-axis and the y-axis.
Neither sort predominates, so neither sort is a secondary sorting parameter.
As a result, the diagonal is also sorted.
All of the sorts can be thought of as moving in the same direction. That is to say that they are all ascending, or that they are all descending.
Technically, I think as long as you have a >/=/< comparator, any total ordering should work.
Elements are numeric types, with a single-cycle comparator.
Thus, memory operations are the dominating factor in a big-O analysis.
How do you find an element? Only worst case analysis matters.
Solutions I am aware of:
A variety of approaches that are:
O(nlog(n)), where you approach each row separately.
O(nlog(n)) with strong best and average performance.
One that is O(n+m):
Start in a non-extreme corner, which we will assume is the bottom right.
Let the target be J. Cur Pos is M.
If M is greater than J, move left.
If M is less than J, move up.
If you can do neither, you are done, and J is not present.
If M is equal to J, you are done.
Originally found elsewhere, most recently stolen from here.
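For reference, the O(n+m) walk above can be sketched in Python as follows. I am assuming the usual array layout here (rows ascending left to right, columns ascending top to bottom), so the non-extreme starting corner becomes the top right and "move up" becomes "move down"; the function name staircase_search is made up for this example.

```python
def staircase_search(matrix, target):
    """Return (row, col) of target in a row- and column-sorted matrix, or None."""
    if not matrix or not matrix[0]:
        return None
    row, col = 0, len(matrix[0]) - 1  # start in the top-right corner
    while row < len(matrix) and col >= 0:
        current = matrix[row][col]
        if current == target:
            return (row, col)
        if current > target:
            col -= 1   # everything below in this column is even larger
        else:
            row += 1   # everything to the left in this row is even smaller
    return None


grid = [
    [1, 4, 7],
    [2, 5, 8],
    [3, 6, 9],
]
print(staircase_search(grid, 6), staircase_search(grid, 10))
```

Each step discards one row or one column, so at most n + m cells are inspected.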
And I believe I've seen one with a worst case of O(n+m) but a best case of nearly O(log(n)).
What I am curious about:
Right now, I have proved to my satisfaction that a naive partitioning attack always devolves to nlog(n). Partitioning attacks in general appear to have an optimal worst case of O(n+m), and most do not terminate early in cases of absence. I was also wondering, as a result, if an interpolation probe might not be better than a binary probe, and thus it occurred to me that one might think of this as a set intersection problem with a weak interaction between sets. My mind cast immediately towards Baeza-Yates intersection, but I haven't had time to draft an adaptation of that approach. However, given my suspicions that optimality of an O(N+M) worst case is provable, I thought I'd just go ahead and ask here, to see if anyone could bash together a counter-argument, or pull together a recurrence relation for interpolation search.
Here's a proof that it has to be at least Omega(min(n,m)). Let n >= m. Then consider the matrix which has all 0s at (i,j) where i+j < m, all 2s where i+j >= m, except for a single (i,j) with i+j = m which has a 1. This is a valid input matrix, and there are m possible placements for the 1. No query into the array (other than the actual location of the 1) can distinguish among those m possible placements. So you'll have to check all m locations in the worst case, and at least m/2 expected locations for any randomized algorithm.
One of your assumptions was that matrix elements have to be unique, and I didn't do that. It is easy to fix, however, because you just pick a big number X=n*m, replace all 0s with unique numbers less than X, all 2s with unique numbers greater than X, and 1 with X.
And because it is also Omega(lg n) (counting argument), it is Omega(m + lg n) where n>=m.
An optimal O(m+n) solution is to start at the top-left corner, that has minimal value. Move diagonally downwards to the right until you hit an element whose value >= value of the given element. If the element's value is equal to that of the given element, return found as true.
Otherwise, from here we can proceed in two ways.
Strategy 1:
Move up in the column and search for the given element until we reach the end. If found, return found as true
Move left in the row and search for the given element until we reach the end. If found, return found as true
return found as false
Strategy 2:
Let i denote the row index and j denote the column index of the diagonal element we have stopped at. (Here, we have i = j, BTW). Let k = 1.
Repeat the steps below while i-k >= 0:
Search if a[i-k][j] is equal to the given element. if yes, return found as true.
Search if a[i][j-k] is equal to the given element. if yes, return found as true.
Increment k
1 2 4 5 6
2 3 5 7 8
4 6 8 9 10
5 8 9 10 11