Efficient Approach for Maximizing the number of Pairs

This is an interview question that I encountered recently.
You have G guests (numbered from 1 to G) at a party. Each guest has a preference list of length G that records which other guests he is willing to talk to.
For example, if the preference list of guest 1 is N Y N N Y (assuming 5 guests), then guest 1 is interested in talking to guest 2 or guest 5 but nobody else.
Assume that
a) Each guest can talk to only one other guest
b) If a is interested in talking to b, then b is also interested in talking to a
Given a matrix of guests and their preferences, find the maximum number of pairs that can be kept engaged.
Let G = 5.
The preference matrix is
N Y N N N
Y N Y Y Y
N Y N N N
N Y N N N
N Y N N N
As we can observe, everyone is interested in talking to guest 2, but he can talk to only one other person, so the answer is 1 pair.
My approach:
I thought of it as a maximum matching problem in graph theory, but I was unable to implement it in the short time frame. (I am not good at implementing graph algorithms.)
Can this only be solved using graphs, or is there a better, faster approach?
Is there a greedy approach?

We can use recursion with memoization. For the memoization to work, we need a way to recognize a graph from its K nodes and all of its relationships (we'll see below why we need this). During the recursion we record already solved cases (K, R), where K is the number of guests and R is the list of relationships among these K guests.
For the problem (N, R) we number the guests 1, 2, ..., N, then scan the preference matrix to build a list of relationships with no repetition (a hash table may help with checking for duplicates):
1 & 2, 1 & 4, 2 & 3, etc.
We need to find maximum of non-colliding pairs (for example, 1 & 2 collides with 2 & 3). We can use the following recursive algorithm:
A) if 1 & 2 is taken, then remove all pairs with 1 or 2, and do recursion on the rest.
RA will be the list of remaining relationships without 1 and 2. And we have recursion of ((N-2), RA)
B) if 1 & 2 are skipped, do recursion with the rest.
RB will be the list of remaining relationships without the link (1 & 2). And we have recursion of (N, RB) - still N because 1 or 2 may still remain in the full set.
C) Check which branch yields the larger number of pairs.
We need memoization to store the results of (K, R) because there may be clusters of guests, such as (1, 3, 5, ...) and (2, 4, 6, ...), where guests are friends only with others inside the same cluster.
If we use naive recursion, we'll solve the same subproblems many times. Clusters with the same structure have the same solution, so we need to recognize graphs by the combination of the number of guests and their relationships (renumbering the guests produces the same graph).
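The take/skip recursion above can be sketched in Python as follows. This is only a minimal illustration, not a definitive implementation: the function name max_pairs and the frozenset-of-pairs state are assumptions, and the cache is keyed on the exact remaining relationships, so it does not recognize graphs that are equal up to renumbering. It is exponential in the worst case and intended for small inputs only.

```python
from functools import lru_cache


def max_pairs(pref):
    """Return the maximum number of disjoint pairs for a symmetric Y/N matrix."""
    n = len(pref)
    # Build the relationship list without repetition: each pair stored once as (i, j), i < j.
    edges = frozenset(
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if pref[i][j] == "Y"
    )

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0
        a, b = min(remaining)  # deterministic choice of the next relationship
        # A) take the pair: drop every relationship that involves a or b
        taken = frozenset(e for e in remaining if a not in e and b not in e)
        # B) skip the pair: drop only this one relationship
        skipped = remaining - {(a, b)}
        # C) keep whichever branch gives more pairs
        return max(1 + solve(taken), solve(skipped))

    return solve(edges)


example = ["NYNNN", "YNYYY", "NYNNN", "NYNNN", "NYNNN"]
print(max_pairs(example))
```

For the matrix from the question, every relationship involves guest 2, so taking any pair removes all the others.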

The recursion splits into two principal stages:
Simplification:
Remove guests without friends.
Divide the graph into non-connected areas.
Then, for every area, the next stage is carried out:
Cutting:
Order the guests by their number of friends.
Add the first untried connection to the list of found pairs.
Repeat the whole process, starting from simplification, on the resulting smaller area until all connections have been tried recursively or the number of pairs in the set being built equals floor(area size / 2).
We could also improve the speed by first trying the connections that divide the area into two non-connected areas of even size.
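As a rough sketch of the simplification stage only, the Python below drops guests without friends and splits the rest into non-connected areas. The name connected_areas and the dictionary-of-sets representation are assumptions made for illustration; the cutting stage is not shown.

```python
def connected_areas(friends):
    """friends maps each guest to the set of guests they will talk to.

    Returns a list of sets, one per non-connected area, skipping
    guests who have no friends at all.
    """
    seen = set()
    areas = []
    for start in friends:
        if start in seen or not friends[start]:
            continue  # already placed in an area, or has no friends
        area, stack = set(), [start]
        while stack:
            guest = stack.pop()
            if guest in area:
                continue
            area.add(guest)
            stack.extend(friends[guest] - area)
        seen |= area
        areas.append(area)
    return areas


friends = {1: {2}, 2: {1, 3}, 3: {2}, 4: set(), 5: {6}, 6: {5}}
areas = connected_areas(friends)
# Each area can contribute at most floor(size / 2) pairs.
upper_bound = sum(len(area) // 2 for area in areas)
print(areas, upper_bound)
```

The per-area bound floor(size / 2) is the stopping condition mentioned in the cutting stage, so the search inside an area can end as soon as it is reached.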

Related

Classic Counterfeit coin puzzle with a twist

This problem is similar to the classic search for a single counterfeit coin that weighs less than the others among x coins, but with a twist in the number of coins that could be fake. The real coins all weigh the same, and the fake coins all weigh the same. The fake coins weigh less than the real coins.
The difference in the version I am trying to solve is that there are at most 2 counterfeits (i.e. there may be no fake coins, 1 fake coin, or 2 fake coins).
Example of my attempt:
An earlier part of this problem asked how to find the fake coins, if any, among x = 9 coins when the scale may be used at most 6 times.
I started by separating the 9 coins into groups of 3 and comparing the groups for equality (if all groups are equal there are no fake coins, since there are at most 2 fake coins). From there I checked the inequalities between group 1 and group 2, and between group 1 and group 3. Both fake coins may be in one group (group 1, 2, or 3), or there may be 1 fake coin in each of two groups (groups 1 and 2, 1 and 3, or 2 and 3). Considering these cases, I followed the comparisons, breaking the groups down into thirds until I reached the final few coins and found the fake coins.
The problem is:
In a pile of x coins, where x >= 3, how would I go about finding the fake coins while making sure the number of weighings is O(log base 2 of (n))? And how would I find a general formula for the number of weighings required to find at most 2 fakes among x coins?
Programming this is easy when I can consider all cases and compare each one at a slower speed. However, it gets significantly more difficult when the number of weighings has to be O(log base 2 (n)). I have considered using the number of coins to decide how the comparisons are made, such as checking whether x is odd or even. If odd, divide x-1 coins into 3 groups and put the last coin into a fourth group, then continue down the spiral of comparisons to finally find the fake coins, if there are any at all. I also considered dividing, say, 100 coins into 3 groups of 33, comparing the 3 groups, then getting rid of 1/3 of the coins and running comparisons on the 66 left. I still can't wrap my head around how to design a general procedure to find the fake coins, or how to find a general formula relating the number of weighings to log base 2 (n).
When n is a prime or odd number, it is especially difficult to split the coins and check the weights with a general procedure that works for any n >= 3.
To clarify, I need help figuring out if and how my earlier attempt can be turned into a general comparison algorithm that applies to any number of coins x >= 3 while keeping the number of weighings O(log base 2 (n)).

Since O(log_2 n) is the same as O(log_b n) for any base b>1, the recursive breakdown into thirds suggested by user #n.1.8e9 in the comments fits that requirement. There's no need to consider prime/odd numbers, as long as we can solve for some specified constant number of coins with a constant number of weighings.
Here, let 3 coins be our base case. After weighing all 3 pairings (technically, we can get away with 2 weighings), we will know exactly which of the 3 coins are light, if any. So if we split a pile of 11 coins into thirds of 3 each, we can take the 2 leftover coins, borrow any other coin from the other piles, perform the 3 weighings, and then discard the 2 leftover coins since we know their status. As long as there are O(log n) splitting stages, dealing with the leftovers won't affect the asymptotics.
The only complex part of the proof is that after the first step, we go from the '0, 1 or 2 fakes' problem to either two 'exactly 1 fake' subproblems or a '1 or 2 fakes' subproblem. Assuming you know the solution to the original 'exactly 1 fake' problem with 1 + log_3 n weighings, the proof should look fairly similar.
The procedure for 'at most 2 fake' and '1 or 2 fakes' is the same. Given n coins, we divide them into three groups of floor(n/3) coins (and treat any leftovers as we did above). If n <= 3, stop and just perform all weighings. Otherwise, given piles A, B and C, perform the 3 pair weighings (A, B), (A, C) and (B, C).
If they all weigh the same (A=B=C), there are no fake coins.
If one pile is different, there are two cases: the single pile is lighter or heavier than the other two.
If it is lighter (say, A < B, A < C, and B = C), then pile A has exactly 1 or 2 fake coins and we have a single problem instance on n/3 coins (discard piles B and C).
If the outlier is heavier (say, A = B, A < C, and B < C), then piles A and B have exactly one fake coin each, which is the standard counterfeit problem.
To prove the bound on number of weighings, you probably need to use induction. Each recursion level requires at most 6 weighings, so an upper bound formula for the number of weighings required when there may be up to 2 fake coins remaining is T(n) = max(T(n/3), 2 * (1 + log_3(n/3))) + 6, where the 1 + log_3 (n/3) term is the standard upper bound with perfect strategy to find one light coin among n/3 coins (where we take the floor of all divisions to get integers).
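The recurrence above can be evaluated directly to see how slowly the bound grows. The Python below is only a sketch under a few assumptions that are not in the answer: the function names are made up, log_3 is rounded up to an integer, and the base case n <= 3 is charged the 3 pairwise weighings mentioned earlier.

```python
def ceil_log3(n):
    """Smallest k with 3**k >= n, using integers to avoid float rounding."""
    k, power = 0, 1
    while power < n:
        power *= 3
        k += 1
    return k


def single_fake_bound(n):
    # Standard bound for one light coin among n: 1 + log_3 n (log rounded up here).
    return 1 + ceil_log3(n)


def two_fake_bound(n):
    """Upper bound T(n) on weighings when up to 2 fakes may remain among n coins."""
    if n <= 3:
        return 3  # weigh all 3 pairings
    third = n // 3  # floor of all divisions, as in the text
    # Either recurse on one pile, or solve two 'exactly 1 fake' piles;
    # each level costs at most 6 weighings (3 for the piles, 3 for leftovers).
    return max(two_fake_bound(third), 2 * single_fake_bound(third)) + 6


for n in (9, 27, 100):
    print(n, two_fake_bound(n))
```

Tripling n adds a constant 6 to the result in these cases, which is the O(log n) behaviour the answer claims. The bound is loose: for 9 coins it gives 10, while the question states that 6 weighings suffice.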

Need an algorithm to find the minimum cost in a subset of elements

I am trying to find an optimal algorithm that covers all elements using the largest possible subsets while keeping the sum of the elements as low as possible.
E.g.: imagine A, B, C are retailers and W, X, Y, Z are products; the goal is to minimize the number of visits and lower the price.
    A  B  C
W   4  9  2
X   1  3  4
Y   9  3  9
Z   7  1  1
So it appears my top two choices are
a) B:{XYZ} - 7 C:{W} - 2
b) C:{WXZ} - 7 B:{Y} - 3
So a) is picked because it has the lower cost, i.e. 9.
This problem seems similar to vertex cover and other linear programming algorithms, but I can't figure out the right one.
Update:
It seems I need to add an additional variable, t. If the difference in cost between visiting the fewest retailers and visiting the next fewest is > t, the latter is picked.
Continuing with the example.
say t = 5,
The largest subset containing all elements would be B:{WXYZ} with a cost of 16.
The next largest subset(s) is B:{XYZ} - 7 C:{W} - 2 with a cost of 9.
16 - 9 = 7 > 5 = t. So we pick B:{XYZ} - 7 C:{W} - 2
But if we did A:{X} - 1, B:{Y} - 3, C:{WZ} - 3 with a cost of 7, then 9 - 7 = 2 < 5 = t.
So B:{XYZ} - 7 C:{W} - 2 is picked
Really I'm just interested if there is already an algorithm that fits this pattern. I can't be the first person that needs this sort of optimization.

You have a problem with two objectives: 1. to minimize the total cost of the products and 2. to minimize the number of stores visited. (The comment by #btilly rightly shows two competing solutions.)
Multiple objectives are fairly common in these types of integer programming problems. See MCDM (multiple-criteria decision making).
To resolve this, you need to have two types of costs (you currently have only one.)
The cost of buying product p from retailer r (which you have specified) C_rp
The cost of visiting a retailer: C_r
Intuition: If C_r is very high, then we'll buy all the products from one retailer. If C_r is small, then we go to multiple retailers and buy from whomever is selling it most inexpensively.
Your problem can be modeled as a variant of the "Assignment Problem." Also, read up on the so-called fixed-charge transportation problems (FCTP) if you need more references. (There is a fixed charge to visiting a retailer once.)
So on to the Integer programming formulation:
Decision Variables
Binary
X_rp = 1 if product p is purchased from retailer r, 0 otherwise
Y_r = 1 if retailer r is visited, 0 otherwise
Objective Function
Min (Sum over r, p) C_rp X_rp + (Sum over r) C_r Y_r
Constraints
(Sum over r) X_rp = 1 for all p (Every product must be bought from some retailer)
Next, we need to ensure that Y_r is 1 if even one of the X_rp is 1 for that retailer. Normally, we'd resort to the Big M method, but it is easier in this problem.
X_rp <= Y_r for all p, for all r.
If any of the X variables becomes 1, that forces the Y_r to become 1. The model will pay the price C_r.
To solve, you can use any LP/IP solver. The good news is that, with the constraint X_rp <= Y_r written separately for every product, the linear programming relaxation is tight and often returns integer solutions on its own. That is not guaranteed in general, so keep the variables declared as binary.
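For an instance as small as the one in the question, the model can be checked without a solver by enumerating which retailers are visited (the Y_r) and buying each product at the cheapest visited retailer (the X_rp). The Python below is a brute-force sketch, not the integer programming method itself; the name cheapest_plan and the visit costs of 0, 5 and 10 are assumptions, since the question gives no C_r values.

```python
from itertools import combinations


def cheapest_plan(price, visit_cost):
    """price[r][p] is C_rp, visit_cost[r] is C_r. Returns (total cost, {product: retailer})."""
    retailers = list(price)
    products = list(next(iter(price.values())))
    best = None
    # Enumerate every non-empty set of visited retailers (the Y_r variables).
    for k in range(1, len(retailers) + 1):
        for visited in combinations(retailers, k):
            total = sum(visit_cost[r] for r in visited)
            plan = {}
            for p in products:
                # Buy p from the cheapest retailer among those visited (the X_rp variables).
                r = min(visited, key=lambda r: price[r][p])
                plan[p] = r
                total += price[r][p]
            if best is None or total < best[0]:
                best = (total, plan)
    return best


price = {
    "A": {"W": 4, "X": 1, "Y": 9, "Z": 7},
    "B": {"W": 9, "X": 3, "Y": 3, "Z": 1},
    "C": {"W": 2, "X": 4, "Y": 9, "Z": 1},
}
for c in (0, 5, 10):
    print(c, cheapest_plan(price, {"A": c, "B": c, "C": c}))
```

Raising the visit cost moves the optimum from three retailers to two and then to one, which matches the intuition about C_r given above. The enumeration grows as 2^(number of retailers), so a real solver is the better choice beyond toy sizes.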
Hope that helps.

Resource allocation- Matching

I've got a problem which I'm not sure can be solved by linear programming. Essentially there are 2 groups of people who list their preferences for one another and will subsequently be matched. I'm writing an algorithm for this. Each person in Group A has up to 4 choices from Group B and vice versa.
In formulating a solution, I am currently assigning a cost to each combination of pairs. For example if Person 1 from Group A ranks Person 3 from Group B as his/her number 1 choice and vice versa, then the cost is minimal (Pair 1-3 cost: 0.01). Similarly, I would allot a cost to other pairs, devising an objective function which seeks to have pairings which minimize overall cost.
However, I do not see this being feasible because I don't know how to define my constraints and overall objective function. Reading online and from textbooks, I find resource allocation problems to be different from what I am trying to do.
Can I seek your advice on how to proceed?

Your problem can be formulated as an "Assignment Problem." As a canonical case, assignment problems are for assigning "jobs" to "machines." They can just as easily be used for Matching two sets.
Here's the formulation:
Two sets of people A and B
Decision Variable Xij
Let Xij be 1 if person i (ith person in set A) is matched with jth person in set B; 0 otherwise
Parameters:
Let Cij be the cost of pairing person i with person j
Objective Function: Minimize (Sum over i) (sum over j) Cij * Xij
Constraints:
Every Person i gets paired exactly once
Sum over j Xij = 1 (for each i)
Every Person j gets paired exactly once
Sum over i Xij = 1 (for each j)
Xij are binary variables
Xij in {0, 1}
The neat thing about Assignment problems is that the optimal pairings can be found using the fairly easy to understand 'Hungarian Method.' You can also use an LP/IP solver you have at your disposal.
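To make the formulation concrete, here is a small Python sketch that finds the cheapest one-to-one pairing by trying every permutation. It is a brute-force check, not the Hungarian Method, and is only practical for a handful of people; the name best_assignment and the sample cost matrix are made up for illustration.

```python
from itertools import permutations


def best_assignment(cost):
    """cost[i][j] is Cij for person i in set A and person j in set B (square matrix).

    Returns (minimum total cost, list of (i, j) pairs).
    """
    n = len(cost)
    best_total, best_perm = None, None
    # A permutation assigns each i exactly one j and each j exactly one i,
    # which is what the two families of constraints require.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(enumerate(best_perm))


cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(best_assignment(cost))
```

For larger groups an LP/IP solver or a Hungarian Method implementation would be the sensible replacement, since the number of permutations grows factorially.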
Hope that helps.

Pseudo-code algorithm to calculate all permutations of N values chosen from N unequal vectors without repetition

This question is for a program I am trying to write which involves connecting chains of physical parts together. I believe I have distilled it down into the simplest form of the question. I would also appreciate if someone knows any additional words that describe this problem, as about 30 min of searching for related questions hasn't even turned up a name for this problem.
You have N vectors. If you choose one value from each vector and do not allow any repeats, you will have one permutation of the type I am trying to find. What is a pseudo-code algorithm to find all of them without brute forcing?
Example:
You have the vectors
v1=[1 2] v2=[1 2 3] v3=[1 2 3 4]
(Edit note: The nesting of the vectors is unintentional and cannot be leveraged in the algorithm.)
You pick values from each of the vectors and don't allow repeats.
Value 1 is from v1 ---> 2
Value 2 is from v2 ---> 1
Value 3 is from v3 ---> 4
Resulting permutation is [2 1 4].
This is one allowable permutation. Here is an example of a permutation that is not allowed because it repeats.
Value 1 is from v1 ---> 2
Value 2 is from v2 ---> 1
Value 3 is from v3 ---> 2
Resulting permutation is [2 1 2], which is invalid due to repeats.
What is an algorithm to find all valid permutations?
Bonus points if you can calculate how many permutations there are before calculating them.
I'll be sure to post back if I can come up with an answer before anyone else can.

The example you give has nested vectors, meaning that the entries in v_i are a subset of those in v_{i+1}. If this is indeed the general case for your application, then the number of solutions is simply:
n_1 * (n_2 - 1) * ... * (n_k - (k-1))
where n_i is the length of v_i and there are k nested vectors.
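The product formula for the nested case is short enough to write out. The Python below is a sketch that assumes the vectors really are nested, so only their lengths matter; the function name count_nested is hypothetical, and the lengths are sorted first so that v_1 is the shortest.

```python
def count_nested(lengths):
    """Number of repeat-free selections from nested vectors with the given lengths."""
    total = 1
    for i, n in enumerate(sorted(lengths)):
        # The i-th vector has n entries, i of which are already used.
        total *= max(n - i, 0)
    return total


print(count_nested([2, 3, 4]))
```

For the vectors in the question this evaluates 2 * (3 - 1) * (4 - 2).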
As far as algorithms are concerned, if you want to generate all possible solutions, then I cannot see a better way than to choose from each successive vector after eliminating already selected entries.
If the vectors aren't nested, a good way to visualize this problem is as a Marriage Problem in the following sense. Make k vertices corresponding to the given k vectors
v_1 v_2 ... v_k
and another m vertices corresponding to the distinct entries of the combined vectors
a_1 a_2 ... a_m
Then connect a_i to v_j if and only if a_i appears in v_j. The goal is to find a matching between the v's and the a's that touches all of the v's. That is, choose k edges so that each v_i is an endpoint of exactly one edge and no a_i is used twice.
Any of the standard algorithms, e.g. using augmenting paths, will work to find one solution, and they can be adapted to generate them all.

I think you can solve this problem incrementally. Let s1, s2, s3, ..., sk be the solutions involving v1, v2, ..., vn. Now, given vn+1, for every current solution si and every element j in vn+1, check whether j is already in si; if not, append j to si and add the result to the new collection (corresponding to n+1). Each solution is kept as an ordered tuple so that the position of every value is preserved.
Initialize S = { (j) for j in v1 }
For n = 2..N:
    newS = {}
    for j in vn:
        for s in S:
            if j not in s: add s + (j) to newS
    S = newS
return S
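The incremental idea above translates almost line for line into Python. This is a minimal sketch: the name all_selections is an assumption, solutions are kept as ordered tuples, and each vector is assumed to contain no duplicate values of its own.

```python
def all_selections(vectors):
    """Return every way to pick one value per vector with no value repeated."""
    solutions = [()]  # one empty partial solution before any vector is used
    for vec in vectors:
        # Extend every partial solution with every value it does not already contain.
        solutions = [s + (j,) for s in solutions for j in vec if j not in s]
    return solutions


result = all_selections([[1, 2], [1, 2, 3], [1, 2, 3, 4]])
print(len(result))
print(result)
```

The list of partial solutions can grow large before it shrinks, so for big inputs a depth-first (backtracking) traversal that yields one solution at a time may be kinder to memory.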

Algorithm to find minimum number of weighings required to find defective ball from a set of n balls

Okay, here is a puzzle I have come across many times:
Given a set of 12 balls, one of which is defective (it weighs either less or more than the others), you are allowed to weigh 3 times to find the defective ball and also tell whether it weighs less or more.
The solution to this problem exists, but I want to know whether, given a set of 'n' balls, we can algorithmically determine the minimum number of times you would need to use a beam balance to find the defective ball and how it differs (lighter or heavier).

A wonderful algorithm by Jack Wert can be found here
http://www.cut-the-knot.org/blue/OddCoinProblems.shtml
(as described for the case n is of the form (3^k-3)/2, but it is generalizable to other n, see the writeup below)
A shorter version and probably more readable version of that is here
http://www.cut-the-knot.org/blue/OddCoinProblemsShort.shtml
For n of the form (3^k-3)/2, the above solution applies perfectly and the minimum number of weighings required is k.
In other cases...
Adapting Jack Wert's algorithm for all n.
In order to modify the above algorithm for all n, you can try the following (I haven't tried proving the correctness, though):
First check if n is of the form (3^k-3)/2. If it is, apply the above algorithm.
If not,
If n = 3t (i.e. n is a multiple of 3), you find the least m > n such that m is of the form (3^k-3)/2. The number of weighings required will be k. Now form the groups 1, 3, 3^2, ..., 3^(k-2), Z, where 3^(k-2) < Z < 3^(k-1) and repeat the algorithm from Jack's solution.
Note: We would also need to generalize method A (the case when we know whether the coin is heavier or lighter) for arbitrary Z.
If n = 3t+1, try to solve for 3t (keeping one ball aside). If you don't find the odd ball among 3t, the one you kept aside is defective.
If n = 3t+2, form the groups for 3t+3, but have one group not have the one ball group. If you come to the stage when you have to rotate the one ball group, you know the defective ball is one of two balls and you can then weigh one of those two balls against one of the known good balls (from among the other 3t).
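The count of weighings implied by the steps above can be computed with a short loop: find the least k such that (3^k-3)/2 is at least n. The Python below is a sketch of that count only, not of the weighing procedure, and it inherits the caveat that the generalization to arbitrary n is unproven here; the name min_weighings is hypothetical.

```python
def min_weighings(n):
    """Least k with (3**k - 3) / 2 >= n, for n >= 3 balls."""
    if n < 3:
        raise ValueError("need at least 3 balls")
    k = 2
    # (3**k - 3) // 2 is the largest n that k weighings are claimed to handle.
    while (3 ** k - 3) // 2 < n:
        k += 1
    return k


for n in (3, 12, 13, 39, 40):
    print(n, min_weighings(n))
```

For the classic puzzle with 12 balls this gives 3, in agreement with the question.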

Trichotomy ! :)
Explanation :
Given a set of n balls, subdivide it in 3 sets A, B and C of n/3 balls.
Compare A and B. If equal, then the defective ball is in C.
etc.
So the minimum number of weighings is the number of times you can divide n by three (sorry, I do not know the English word for that).

You could use a general planning algorithm: http://www.inf.ed.ac.uk/teaching/courses/plan/
