OCaml - how to sort pairs?

OCaml - how to sort pairs? - sorting

We have pairs:
(3,10000) (1,2),(2,11) (2,0) (2, 10) (1,1000000)
And we would like to order:
(1,2) (1,1000000), (2,0) (2, 10) (2,11) (3,10000)
How to do it in OCaml ?

The List.sort function, which can sort any 'a list, takes a comparison function of type 'a -> 'a -> int that must return a negative number if the first argument is strictly smaller than the second, positive if it is strictly larger, and zero if they're equal.
let lexicographic_compare (x,y) (x',y') =
let compare_fst = compare x x' in
if compare_fst <> 0 then compare_fst
else compare y y'
# List.sort lexicographic_compare [ (3,10000); (1,2); (2,11); (2,0); (2, 10); (1,1000000)];;
- : (int * int) list =
[(1, 2); (1, 1000000); (2, 0); (2, 10); (2, 11); (3, 10000)]
(This code uses the built-in compare function that does the right thing on integers.)
Note that in practice the compare function already compares pairs lexicographically, so List.sort compare ... (without writing any new code) would appear to work. How it works on pairs is not specified though, so it may change in the future and relying on this is bad style. If you want a precise comparison order, you should write your domain-specific comparison.
(Of course there are libraries out there that already provide this lexicographic_compare logic; but the point is to learn how to do it yourself.)

Related

What is the key difference between Combination Sum IV and No. of ways to make coin change problem?

Combination Sum
Given an array of distinct integer nums and a target integer target, return the number of possible combinations that add up to the target.
Input: nums = [1,2,3], target = 4
Output: 7
Explanation:
The possible combination ways are:
(1, 1, 1, 1)
(1, 1, 2)
(1, 2, 1)
(1, 3)
(2, 1, 1)
(2, 2)
(3, 1)
Note that different sequences are counted as different combinations.
Coin Change
For the given infinite supply of coins of each of denominations, D = {D0, D1, D2, D3, ...... Dn-1}. You need to figure out the total number of ways W, in which you can make the change for Value V using coins of denominations D.
For the same Input as above question:
Number of ways are - 4 total i.e. (1,1,1,1), (1,1, 2), (1, 3) and (2, 2).
I know how to solve Coin Change using the concept of UNBOUNDED KNAPSACK. But how Combination Sum IV is different here! Seems so similar

Algorithm for finding max value of functions of the form f(x) = a*min(b, x)?

I have an array of tuples (a, b) with a > 0 and b > 0.
Each tuple represents a function f such that f(x, a, b) = a * min(b, x).
Is there a known algorithm for a given x to find which tuple returns the maximum value ?
I don't want to evaluate each function to check the maximum, because I will query this array arbitrary number of times for different x.
Example:
array = [ (1, 10), (2, 3) ]
x < 6 -> choose (2, 3)
x = 6 (intersection point) -> either (1, 10) or (2, 3) doesn't matter
x > 6 -> choose (1, 10)
So the problem is that these tuples can be either sorted by a or by b. But there can be a lot of intersection points between them (if we visualize them as graphs). So I want to avoid any O(n^2) sorting algorithm to check for certain ranges of x which is the best function. I mean I don't want to compare each function with all the others to find from which point x' (intersection point) and on I should choose one over the other.

Assuming a's, b's and queried x's are always nonnegative, each query can be done in O(log(n)) time after an O(n*log(n)) preprocessing step:
The preprocessing step eliminates such functions that are strictly dominated by others. For example, (5, 10) is larger than (1, 1) for every x. (So, if there is (5, 10) in the array, then we can remove (1, 1) because it will never be the maximum for any x.)
Here is the general condition: A function (a, b) is larger than (c, d) for every x if and only if c > a and (c*d > a*b). (This is easy to prove.)
Now, what we want to do is to remove such functions (a, b) for which there exists a (c, d) such that c > a and (c*d > a*b). This can be done in O(n*log(n)) time:
1 - Sort tuples lexicographically. What I mean by lexicographically is first compare their first coordinates, and if they are equal, then compare the second ones. For example, a sorted array might look like this:
(1, 5)
(1, 17)
(2, 9)
(4, 3)
(4, 4)
2 - Iterate over the sorted array in the reverse order and keep track of the largest value of a*b that you encountered so far. Let's call this value M. Now, assume the element that we are processing in the loop is (a, b). If a*b < M, we remove this element. Because for some (c, d) that we processed earlier, both c > a and c*d > a*b, and thus (a, b) is useless. After this step, the example array will become:
(2, 9)
(4, 4)
(4, 3) was deleted because it was dominated by (4, 4). (1, 17) and (1, 5) were deleted because they are dominated by (2, 9).
Once we get rid of all the functions that are never the maximum for any x, the graph of the remaining ones will look like this.
As seen in the graph, every function is the maximum from the point where it intersects with the one before to the point where it intersects with the one after. For the example above, (4, 4) and (2, 9) intersect at x = 8. So (4, 4) is the maximum until x = 8, and after that point, (2, 9) is the maximum.
We want to calculate the points where consecutive functions in the array intersect, so that for a given x, we can binary-search on these points to find which function returns the maximum value.

The key to efficiency is to avoid useless work. If you imagine a decision tree, pruning branches is a term often used for that.
For your case, the decision-making is based on choosing between two functions (or tuples of parameters). In order to select either of the two functions, you just determine the value x at which they give you the same value. One of them performs better for smaller values, one for larger values. Also, don't forget this part, it may be that one function always performs better than the other. In that case, the one performing worse can be removed completely (see also above, avoiding useless work!).
Using this approach, you can map from this switchover point to the function on the left. Finding the optimal function for an arbitrary value just requires finding the next higher switchover point.
BTW: Make sure you have unit tests in place. These things are fiddly, especially with floating point values and rounding errors, so you want to make sure that you can just run a growing suite of tests to make sure that one small bugfix didn't break things elsewhere.

I think you should sort array based on 'b' first and then 'a'. Now for every x just use binary search and find the position from which min(b,x) will give either only b or x depending on value. So from that point if x is small then all the upcoming value of b then take tuple as t1 and and you can count value using that function and for the value of b which will be less than x you compulsorily need traverse. I'm not sure but that's what I can think.

After pre-processing the data, it's possible to calculate this maximum value in time O(log(n)), where n is the number of tuples (a, b).
First, let's look at a slightly simpler question: You have a list of pairs (c, b), and you want to find the one with the largest value of c, subject to the condition that b<=x, and you want to do this many times for different values of x. For example, the following list:
c b
------
11 16
8 12
2 6
7 9
6 13
4 5
With this list, if you ask with x=10, the available values of c are 2, 7 and 4, and the maximum is 7.
Let's sort the list by b:
c b
------
4 5
2 6
7 9
8 12
6 13
11 16
Of course, some values in this list can never give an answer. For example, we can never use the b=2, c=6 row in an answer, because if 6<=x then 5<=x, so we can use the c=4 row to get a better answer. So we might as well get rid of pairs like that in the list, i.e. all pairs for which the value of c is not the highest so far. So we whittle the list down to this:
c b
------
4 5
7 9
8 12
11 16
Given this list, with an index on b, it's easy to find the highest value of c. All you have to do is find the highest value of b in the list which is <=x, then return the corresponding value of c.
Obviously, if you change the question so that you only want the values with b>=x (instead of b<=x), you can do exactly the same thing.
Right. So how does this help with the question you asked?
For a given value of x, you can split the question into 2 questions. If you can answer both of these questions then you can answer the overall question:
Of the pairs (a, b) with b<=x, which one gives the highest value of f(x,a,b) = a*b?
Of the pairs (a, b) with b>=x, which one gives the highest value of f(x,a,b) = a*x?
For (1), simply let c=a*b for each pair and then go through the whole indexing rigmarole outlined above.
For (2), let c=a and do the indexing thing above, but flipped round to do b>=x instead of b<=x; when you get your answer for a, don't forget to multiply it by x.

Can a Robot reach a Point (x, y)?

I came across this question in one of the Job Interviews & i am unable to find the correct alogorithm of the solution so, i am posting this question here:
There is a robot who can move on a co-ordinate plane in eithr of the 2 ways:
Given that the robots current position is (x,y), The robot can move equal to the sum of x & y in either if the directon like so:
(x,y) -> (x+y, y)
(x,y) -> (x, x+y)
Now given a initial Point (x1, y1) and an destination point (x2, y2) you need to write a programme to check if the robot can ever reach the destination taking any number of moves.
Note: x1, y1 , x2 , y2 > 0
Explanation:
Suppose the robot's initial point is (2,3) and desintation is (7,5)
Result in this case is yes as the robot can take this path:
(2,3) -> (2, 2+3) => (2, 5)
(2,5) -> (2+5, 5) => (7,5)
Suppose the robot's initial point is (2,3) and desintation is (4,5)
Result in this case is No as no matter what path the robot takes it cannot reach (4,5)

A naive brute-force approach
One way would be to recursively explore every possible move until you reach the target.
Something to consider is that the robot can keep moving indefinitely (never reaching the target) so you need an end case so the function completes. Luckily the position is always increasing in the x and y axis, so when either the x-coordinate or y-coordinate is greater than the target, you can give up exploring that path.
So something like:
def can_reach_target(pos, target):
if pos == target:
return True
if pos[0] > target[0] or pos[1] > target[1]:
return False
return can_reach_target((pos[0], sum(pos)), target) or \
can_reach_target((sum(pos), pos[1]), target)
And it works:
>>> can_reach_target((2,3),(7,5))
True
>>> can_reach_target((2,3),(4,5))
False
A limitation is that this does not work for negative coordinates - not sure if this is a requirement, just let me know if it is and I will adapt the answer.
Bactracking
On the other hand, if negative co-ordinates are not allowed, then we can also approach this as Dave suggests. This is much more efficient, as the realisation is that there is one and only one way of the robot getting to each coordinate.
The method relies on being able to determine which way we stepped: either increasing the x-coordinate or the y-coordinate. We can determine which coordinate was last changed, by selecting the larger of the two. The following proof guarantees that this is the case.
The possibilities for a state change are:
1. (a, b) => (a+b, b) a x-coordinate change
and,
2. (a, b) => (a, a+b) a y-coordinate change
In case (1), the x-coordinate is now larger, since:
a > 0
a + b > b (add b to both sides)
and similarly, since b is also > 0, we can deduce that a+b is > a.
Now we can start from the target and ask: which coordinate led us here? And the answer is simple. If the x-coordinate is greater than the y-coordinate, subtract the y-coordinate from the x-coordinate, otherwise subtract the x-coordinate from the y-coordinate.
That is to say, for a coordinate, (x,y), if x > y, then we came from (x-y,y) otherwise (x,y-x).
The first code can now be adapted to:
def can_reach_target(pos, target):
if pos == target:
return True
if target[0] < pos[0] or target[1] < pos[1]:
return False
x, y = target
return can_reach_target(pos, (x-y,y) if x > y else (x,y-x))
which works as expected:
>>> can_reach_target((2,3),(7,5))
True
>>> can_reach_target((2,3),(4,5))
False
Timings
>>> timeit.timeit('brute_force((2,3),(62,3))',globals=locals(),number=10**5)
3.41243960801512
>>> timeit.timeit('backtracker((2,3),(62,3))',globals=locals(),number=10**5)
1.4046142909792252
>>> timeit.timeit('brute_force((2,3),(602,3))',globals=locals(),number=10**4)
3.518286211998202
>>> timeit.timeit('backtracker((2,3),(602,3))',globals=locals(),number=10**4)
1.4182081500184722
So you can see that the backtracker is nearly three times faster in both cases.

Go backwards. I'm assuming that the starting coordinates are positive. Say you want to know if a starting point of (a,b) is compatible with an end point of (x,y). One step back from (x,y) you were either at (x-y,y) or (x,y-x). If x > y choose the former, otherwise choose the latter.

I agree with Dave that going backwards is an efficient approach. If only positive coordinates are legal, then every coordinate has at most one valid parent. This lets you work backwards without a combinatorial explosion.
Here's a sample implementation:
def get_path(source, destination):
path = [destination]
c,d = destination
while True:
if (c,d) == source:
return list(reversed(path))
if c > d:
c -= d
else:
d -= c
path.append((c,d))
if c < source[0] or d < source[1]:
return None
print(get_path((1,1), (1,1)))
print(get_path((2,3), (7,5)))
print(get_path((2,3), (4,5)))
print(get_path((1,1), (6761, 1966)))
print(get_path((4795, 1966), (6761, 1966)))
Result:
[(1, 1)]
[(2, 3), (2, 5), (7, 5)]
None
[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (6, 5), (11, 5), (16, 5), (21, 5), (26, 5), (31, 5), (36, 5), (41, 5), (46, 5), (46, 51), (46, 97), (143, 97), (143, 240), (383, 240), (623, 240), (863, 240), (863, 1103), (863, 1966), (2829, 1966), (4795, 1966), (6761, 1966)]
[(4795, 1966), (6761, 1966)]
Appendix: some observations I made along the way that might be useful for finding an O(1) solution:
(a,b) is reachable from (1,1) if and only if a and b are coprime.
If a and b have a common factor, then all children of (a,b) also have that common factor. Equivalently, if there is a path from (a,b) to (c,d), then there is also a path from (n*a, n*b) to (n*c, n*d), for any positive integer n.
if a and b are coprime and aren't (1,1), then there are infinitely many coprime coordinates that are unreachable from (a,b). By choosing (a,b) as a starting point, you're effectively limiting yourself to some sub-branch of the tree formed by (1,1). You can never reach any of the sibling branches of (a,b), where infinitely many coordinates reside.

A recursive funtion should work fine for that. You even got the number of possibilities.
def find_if_possible(x,y,x_obj,y_obj,max_depth):
if(max_depth < 0):
return 0
elif(x == x_obj and y == y_obj):
return 1
elif(x>x_obj or y>y_obj):
return 0
else:
return(sum(find_if_possible(x+y,y,x_obj,y_obj,max_depth-1),find_if_possible(x,y+x,x_obj,y_obj,max_depth-1))

Most common subpath in a collection of paths

There is numerous literature on the Web for the longest common subsequence problem but I have a slightly different problem and was wondering if anyone knows of a fast algorithm.
Say, you have a collection of paths:
[1,2,3,4,5,6,7], [2,3,4,9,10], [3,4,6,7], ...
We see that subpath [3,4] is the most common.
Know of a neat algorithm to find this? For my case there are tens of thousands of paths!

Assuming that a "path" has to encompass at least two elements, then the most common path will obviously have two elements (although there could also be a path with more than two elements that's equally common -- more on this later). So you can just iterate all the lists and count how often each pair of consecutive numbers appears in the different lists and remember those pairs that appear most often. This requires iterating each list once, which is the minimum amount you'd have to do in any case.
If you are interested in the longest most common path, then you can start the same way, finding the most common 2-segment-paths, but additionally to the counts, also record the position of each of those segments (e.g. {(3,4): [2, 1, 0], ...} in your example, the numbers in the list indicating the position of the segment in the different paths). Now, you can take all the most-common length-2-paths and see if for any of those, the next element is also the same for all the occurrences of that path. In this case you have a most-common length-3-path that is equally common as the prior length-2 path (it can not be more common, obviously). You can repeat this for length-4, length-5, etc. until it can no longer be expanded without making the path "less common". This part requires extra work of n*k for each expansion, with n being the number of candidates left and k how often those appear.
(This assumes that frequency beats length, i.e. if there is a length-2 path appearing three times, you prefer this over a length-3 path appearing twice. The same apprach can also be used for a different starting length, e.g. requiring at least length-3 paths, without changing the basic algorithm or the complexity.)
Here's a simple example implementation in Python to demonstrate the algorithm. This only goes up to length-3, but could easily be extended to length-4 and beyond with a loop. Also, it does not check any edge-cases (array-out-of-bounds etc.)
# example data
data = [[1,2, 4,5,6,7, 9],
[1,2,3,4,5,6, 8,9],
[1,2, 4,5,6,7,8 ]]
# step one: count how often and where each pair appears
from collections import defaultdict
pairs = defaultdict(list)
for i, lst in enumerate(data):
for k, pair in enumerate(zip(lst, lst[1:])):
pairs[pair].append((i,k))
# step two: find most common pair and filter
most = max([len(lst) for lst in pairs.values()])
pairs = {k: v for k, v in pairs.items() if len(v) == most}
print(pairs)
# {(1, 2): [(0, 0), (1, 0), (2, 0)], (4, 5): [(0, 2), (1, 3), (2, 2)], (5, 6): [(0, 3), (1, 4), (2, 3)]}
# step three: expand pairs to triplets, triplets to quadruples, etc.
triples = [k + (data[v[0][0]][v[0][1]+2],)
for k, v in pairs.items()
if len(set(data[i][k+2] for (i,k) in v)) == 1]
print(triples)
# [(4, 5, 6)]

List of Comparators in Sorting Networks

I have a question in my homework document, and I'm having hard to time to visualize and understand the question. Question is the following:
We can represent an n-input comparison network with c comparators as a
list of c pairs of integers in the range from 1 to n. If two pairs
contain an integer in common, the order of the corresponding
comparators in the network is determined by the order of the pairs in
the list. Given this representation, describe an O(n + c)-time
(serial) algorithm for determining the depth of a comparison network.
What does it mean to have pairs of integers in the context of comparison networks? Normally we used the notation below for denoting a comparison network where each horizontal line represents a number.

It means that if you have a pair (1, 2), that's one of those vertical lines, namely the one that connects horizontal lines 1 and 2.
So the top left part of this picture would be represented as (1, 2) (3, 4) (1, 3) (2, 4).
The depth of just that part is 2.

for i = 1, n
depth[i] = 0
total_depth = 0
for j = 1, c
i1 = comparators[j].entry1
i2 = comparators[j].entry2
new_depth = 1 + max(depth[i1], depth[i2])
depth[i1] = new_depth
depth[i2] = new_depth
total_depth = max(total_depth, new_depth)
print(total_depth)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

OCaml - how to sort pairs? - sorting

We have pairs: (3,10000) (1,2),(2,11) (2,0) (2, 10) (1,1000000) And we would like to order: (1,2) (1,1000000), (2,0) (2, 10) (2,11) (3,10000) How to do it in OCaml ?

Related

What is the key difference between Combination Sum IV and No. of ways to make coin change problem?

Algorithm for finding max value of functions of the form f(x) = a*min(b, x)?

Can a Robot reach a Point (x, y)?

Most common subpath in a collection of paths

List of Comparators in Sorting Networks

Categories

Resources