Job assignment with NO cost, would Hungarian method work? - algorithm

So I have a job assignment problem that doesn't have the traditional cost the Hungarian method requires.
For example:
I have 3 workers - A, B and C
I have 5 jobs - 1, 2, 3, 4 and 5
Each worker has a list of jobs he can perform, like so:
worker A can work on job 1, 2, 5
worker B can work on job 1, 2
worker C can work on job 1
The end result (since there's no cost) is the maximum number of assignments I can achieve. In this example, I can achieve a maximum of 3 assignments:
worker A on job 5
worker B on job 2
worker C on job 1
Is the Hungarian method a good way to solve this? Should I just use "dummy" costs? I was thinking maybe using the index of the job preference as the cost; is this a good idea?

The Hungarian algorithm could be made to work here, but an algorithm for unweighted maximum bipartite matching like Hopcroft–Karp would be faster.
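As a rough illustration of unweighted bipartite matching, here is a minimal sketch using plain augmenting paths (often called Kuhn's algorithm). It is simpler and asymptotically slower than Hopcroft–Karp, but it finds the same maximum matching size. The function name `max_bipartite_matching` and the `can_do` dictionary are illustrative names, not something from the question.

```python
def max_bipartite_matching(can_do):
    """Return a dict job -> worker describing a maximum matching.

    can_do maps each worker to the list of jobs that worker can perform.
    """
    job_owner = {}

    def try_assign(worker, visited):
        # Try to give `worker` a job, re-homing current owners if needed.
        for job in can_do[worker]:
            if job in visited:
                continue
            visited.add(job)
            # Take the job if it is free, or if its owner can move elsewhere.
            if job not in job_owner or try_assign(job_owner[job], visited):
                job_owner[job] = worker
                return True
        return False

    for worker in can_do:
        try_assign(worker, set())
    return job_owner


# The example from the question.
can_do = {"A": [1, 2, 5], "B": [1, 2], "C": [1]}
matching = max_bipartite_matching(can_do)
print(len(matching), matching)
```

On the question's data this reaches 3 assignments, with each worker ending up on a distinct job.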

Assign a cost of -1 to each job a worker can do, and 0 to every other job.
Then run the Hungarian algorithm and it will give you the answer (strictly speaking, it returns the negative of the answer).
Don't use very large numbers as costs; they may cause overflow (unless you implement the Hungarian algorithm very carefully).
In fact, this is the maximum matching problem in bipartite graphs, and there are many ways to solve it; see the Wikipedia page:
http://en.wikipedia.org/wiki/Matching_(graph_theory)#Maximum_matchings_in_bipartite_graphs
PS: The Hopcroft–Karp algorithm is faster than the Hungarian algorithm and also simpler. It is worth a try. Some more complicated methods are faster than these two, but I would not recommend learning them first.
PPS: Your Stack Overflow ID is itself a method for solving this problem. It is a network-flow approach called shortest augmenting path (SAP). See: http://coral.ie.lehigh.edu/~ted/files/ie411/lectures/Lecture11.pdf
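To make the -1/0 cost encoding concrete, here is a small sketch on the question's data. A brute-force minimizer over permutations stands in for the Hungarian algorithm (it is only viable for tiny inputs and assumes there are at least as many jobs as workers); `min_cost_assignment` is a hypothetical helper name.

```python
from itertools import permutations

workers = ["A", "B", "C"]
jobs = [1, 2, 3, 4, 5]
can_do = {"A": {1, 2, 5}, "B": {1, 2}, "C": {1}}

# cost[w][j] is -1 if the worker can do the job, 0 otherwise.
cost = [[-1 if job in can_do[w] else 0 for job in jobs] for w in workers]


def min_cost_assignment(cost):
    """Brute-force stand-in for the Hungarian algorithm (tiny inputs only)."""
    n_workers, n_jobs = len(cost), len(cost[0])

    def total(cols):
        return sum(cost[row][col] for row, col in enumerate(cols))

    # Each permutation assigns worker `row` to job column `cols[row]`.
    best = min(permutations(range(n_jobs), n_workers), key=total)
    return total(best), best


total_cost, cols = min_cost_assignment(cost)
max_assignments = -total_cost  # negate the minimum cost to get the count
print(total_cost, max_assignments)
```

The minimum total cost is the negative of the number of workers placed on a job they can actually do.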

Dummy costs should do the trick. Assign a cost of 1 to any job they can do, and an infinite cost (if your system allows that) to jobs they can't. The Hungarian algorithm is designed to minimize the total cost across all tasks, so it'll figure things out naturally. There shouldn't be any need to account for what you think their job preferences are; that's the algorithm's job.

Hungarian algorithm will give you an answer, but do not use infinity costs, since you cannot compare (infinity + infinity) and infinity (unless you compare the costs yourself).
A: 1, 2, 3
B: 1
C: 1
The matrix form:
     1    2    3
A    1    2    3
B    1  inf  inf
C    1  inf  inf
How can your computer compare 1, inf, inf and 2, 1, inf?
Instead, use a cost so large that the pair is guaranteed not to be assigned (and yes, be careful with overflow).
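The comparison problem can be seen directly in a few lines. This sketch uses the matrix above; the value `BIG = 1000` is an assumed bound chosen to exceed the sum of all real costs, and `with_penalty` and `total` are illustrative helpers.

```python
INF = float("inf")


def with_penalty(penalty):
    """Cost table from the example, with `penalty` for jobs a worker can't do."""
    return {
        "A": {1: 1, 2: 2, 3: 3},
        "B": {1: 1, 2: penalty, 3: penalty},
        "C": {1: 1, 2: penalty, 3: penalty},
    }


def total(costs, assignment):
    return sum(costs[worker][job] for worker, job in assignment)


one_forbidden = [("A", 2), ("B", 1), ("C", 3)]  # costs 2, 1, penalty
two_forbidden = [("A", 1), ("B", 2), ("C", 3)]  # costs 1, penalty, penalty

# With infinity, both totals collapse to inf and cannot be ranked.
inf_costs = with_penalty(INF)
ties_with_inf = total(inf_costs, one_forbidden) == total(inf_costs, two_forbidden)

# With a large finite penalty, fewer forbidden pairs is strictly cheaper.
BIG = 1000  # assumption: larger than the sum of all real costs
big_costs = with_penalty(BIG)
print(ties_with_inf, total(big_costs, one_forbidden), total(big_costs, two_forbidden))
```

With the finite penalty, the assignment that uses one forbidden pair costs 1003 and the one that uses two costs 2001, so the solver can still prefer the better of the two.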

Related

How to find the best combination of parameters from a very large sets?

I have some processing logic with 11 parameters (say, parameter A to parameter K), and different combinations of these parameters can result in different outcomes.
Processing Logic Example:
if x > A:
    x = B
else:
    x = C
y = math.sin(2*x*x + 1.1416) - D
# other logic involving parameters E, F, G, H, I, J, K
return outcome
Here are some examples of the possible values of the parameters(others are similar, discrete):
A ∈ [0.01, 0.02, 0.03, ..., 0.2]
E ∈ [1, 2, 3, 4, ..., 200]
I would like to find the combination of these parameters that results in the best outcome.
However, the problem I am facing is that there are 10^19 possible combinations in total, and each combination takes 700 ms of processing time per CPU core. Obviously, the time needed to process every combination is unacceptable, even if I have a large computing cluster.
Could anyone give some advice on what is the correct methodology to handle this problem?
Here are some of my thoughts:
Step 1. Coarsen the step interval of each parameter so that the total processing time drops to an acceptable level, for example:
A ∈ [0.01, 0.05, 0.09, ..., 0.2]
E ∈ [1, 5, 10, 15, ..., 200]
Step 2. Starting from the best combination found in step 1, do a finer-grained search around that combination to find the best combination.
But I am afraid that the best combination might hide somewhere that step 1 cannot detect, in which case step 2 is in vain.
This is an optimization problem. However, you have two distinct problems in what you posed:
There are no restrictions or properties on the evaluation function;
You accept only the best solution of 10^19 possibilities.
The field of optimization serves up many possibilities, most of which are one variation or another of hill-climbing search and irruptive movement (to help break out of a local maximum that is not the global solution). All of these depend on some manner of continuity or predictability in the evaluation function's dependence on its inputs.
Without that continuity, there is no shorter path to the sole optimal solution.
If you do have some predictability, then you have some reading to do on various solution methods. Start with Newton-Raphson, move on to Gradient Descent, and continue to other topics, depending on the fabric of your function.
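As a sketch of the hill-climbing idea on a discrete parameter grid, the following climbs one coordinate step at a time and restarts from random points to reduce the risk of stopping at a local maximum. Everything here is illustrative: `toy_score` is a made-up smooth function standing in for the real 700 ms evaluation, and only two of the eleven parameters are modeled.

```python
import random


def hill_climb(score, grids, start, max_steps=10_000):
    """Greedy hill climbing over index tuples into `grids` (maximizes score)."""
    def value_at(indices):
        return score([grid[i] for grid, i in zip(grids, indices)])

    current = tuple(start)
    current_score = value_at(current)
    for _ in range(max_steps):
        best_neighbor, best_score = None, current_score
        for dim in range(len(grids)):
            for delta in (-1, 1):
                idx = current[dim] + delta
                if 0 <= idx < len(grids[dim]):
                    candidate = current[:dim] + (idx,) + current[dim + 1:]
                    s = value_at(candidate)
                    if s > best_score:
                        best_neighbor, best_score = candidate, s
        if best_neighbor is None:  # no neighbor improves: local maximum
            break
        current, current_score = best_neighbor, best_score
    return [grid[i] for grid, i in zip(grids, current)], current_score


def random_restarts(score, grids, restarts, rng):
    """Run hill_climb from several random starting points, keep the best run."""
    runs = [
        hill_climb(score, grids, [rng.randrange(len(g)) for g in grids])
        for _ in range(restarts)
    ]
    return max(runs, key=lambda run: run[1])


# Toy stand-in for the real evaluation; peak placed at A = 0.07, E = 120.
grid_a = [round(0.01 * k, 2) for k in range(1, 21)]
grid_e = list(range(1, 201))


def toy_score(params):
    a, e = params
    return -((a - 0.07) ** 2) * 1e4 - (e - 120) ** 2


best_params, best_score = random_restarts(
    toy_score, [grid_a, grid_e], restarts=3, rng=random.Random(0)
)
print(best_params, best_score)
```

On a smooth single-peaked function like this one, every start reaches the peak. On a function with no continuity at all, this approach gives no guarantee, which is the caveat made above.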
Have you thought about a purely mathematical approach, i.e. trying to find local/global extrema, or checking whether the function is monotonic in each operation?
There are quite decent numerical methods for derivatives and integrals, which can even be used in a relatively generic manner.
In other words, limit the scope instead of computing every single option. How far you can go depends on the general character of the operations you have in mind.

Distributing limited resources with choices

I'm looking for an algorithm to figure out the maximum number of "products" that can be created given limited "resource choices".
For example, let's say I want to create an ABC and am given these resource choices:
A or B: 3
A or C: 1
B: 1
C: 1
In this case, I can create at most 2 ABC by selecting 2 A and 1 B from the first choice, and 1 C from the second choice. Then I have a total of 2 A, 2 B, and 2 C which can create 2 ABC.
Is there an algorithm, other than brute forcing the permutations, which solves this problem?
For completeness, here are the constraints in my actual problem:
Max number of different resources in product: 10
Max number of resource choices: 20
Max quantity of a choice: 50
It can be solved using Linear Programming, or by using the Ford-Fulkerson algorithm to solve the so-called Maximum Network Flow problem. It would take me quite some time to explain either of these two algorithms, so I think you should take a look at some online resources. If you need to solve a real problem, I suggest you go through the algorithm(s) just to get an idea of how to model this particular problem, and then use one of the existing libraries. If you want to learn the algorithms, feel free to write your own implementation :)
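One possible way to model this as maximum flow, offered as a sketch rather than the only formulation: to test whether k products are feasible, send flow from a source to each choice (capacity = its quantity), from each choice to the resources it can become, and from each resource to the sink with capacity k. If the maximum flow equals k times the number of resources in the product, k products can be built. The code below assumes each product needs exactly one unit of each resource; `max_flow` is a plain Edmonds-Karp (the BFS variant of Ford-Fulkerson) and `max_products` is an illustrative name.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on a dict-of-dicts capacity graph."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def max_products(choices, recipe):
    """choices: list of (set of resources, quantity); recipe: resources per product."""
    def feasible(k):
        cap = {"S": {}}
        for i, (options, qty) in enumerate(choices):
            cap["S"][("choice", i)] = qty
            cap[("choice", i)] = {("res", r): qty for r in options if r in recipe}
        for r in recipe:
            cap[("res", r)] = {"T": k}
        return max_flow(cap, "S", "T") == k * len(recipe)

    k = 0
    while feasible(k + 1):  # stops once supply can no longer cover k + 1
        k += 1
    return k


choices = [({"A", "B"}, 3), ({"A", "C"}, 1), ({"B"}, 1), ({"C"}, 1)]
result = max_products(choices, ["A", "B", "C"])
print(result)
```

With the stated limits (10 resources, 20 choices, quantities up to 50) the graph is tiny, so the linear scan over k is cheap. A binary search over k would also work.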

Trying to gain intuition for work scheduling greedy algorithm

I have the following scenario: (since I don't know of a way to show LaTeX, here's a screenshot)
I'm having some trouble conceptualizing what's going on here. If I were to program this, I would probably attempt to structure this as some kind of heap where each node represents a worker, from earliest-to-latest, then run Prim's/Kruskal's algorithm on it. I don't know if I'm on the right track with that idea, but I need to flesh out my understanding of this problem so I can do the following:
Describe in detail the greedy choice
Show that if there's an optimal solution for which the greedy choice was not made, then an exchange can be made to conform with the greedy choice
Know how to implement a greedy algorithm solution, and its running time
So where should I be going with this idea?
This problem is very similar in nature to "Roster Scheduling problems." Think of the committee as say a set of 'supervisors' and you want to have a supervisor present, whenever a worker is present. In this case, the supervisor comes from the same set as the workers.
Here are some modeling ideas, and an Integer Programming formulation.
Time Slicing Idea
This sounds like a bad idea initially, but works really well in practice. We are going to create a lot of "time instants" T_i from the start time of the first shift to the end time of the very last shift. It sometimes helps to think of
T1, T2, T3, ..., TN as being time instants (say) five minutes apart. For every T_i at least one worker is working on a shift. Therefore, that time instant has to be covered. (Coverage means there has to be at least one member of the committee also working at time T_i.)
We really only need to worry about 2n time instants: the start and finish times of each of the n workers.
Coverage Property Requirement
For every time instant Ti, we want a worker from the Committee present.
Let w1, w2...wn be the workers, sorted by their start times s_i. (Worker w1 starts the earliest shift, and worker wn starts the very last shift.)
Introduce a new Indicator variable (boolean):
Y_i = 1 if worker i is part of the committee
Y_i = 0 otherwise.
Visualization
Now think of a 0-1 matrix, where the rows are the SORTED workers, and the columns are the time instants...
Construct a Time-Worker Matrix (0/1)
t1 t2 t3 t4 t5 t6 ... tN
-------------------------------------------
w1 1 1
w2 1 1
w3 1 1 1
w4 1 1 1
...
...
wn 1 1 1 1
-------------------------------------------
Total 2 4 3 ... ... 1 2 4 5
So the problem is to make sure that for each column, at least 1 worker is Selected to be part of the committee. The Total shows the number of candidates for the committee at each Time instant.
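A small sketch of how such a matrix could be built from shift data. The shifts below are made up for illustration, shifts are assumed to be closed intervals `(start, end)`, and only the 2n start/finish instants are used as columns, as suggested earlier.

```python
def time_worker_matrix(shifts):
    """Build the 0/1 time-worker matrix from (start, end) shifts.

    Rows are workers sorted by start time; columns are the distinct
    start/end instants. Assumes closed intervals: start <= t <= end.
    """
    instants = sorted({t for shift in shifts for t in shift})
    order = sorted(range(len(shifts)), key=lambda i: shifts[i][0])
    matrix = [
        [1 if shifts[i][0] <= t <= shifts[i][1] else 0 for t in instants]
        for i in order
    ]
    totals = [sum(column) for column in zip(*matrix)]
    return instants, order, matrix, totals


# Hypothetical shifts for four workers.
shifts = [(0, 4), (2, 6), (5, 9), (8, 10)]
instants, order, matrix, totals = time_worker_matrix(shifts)
for row in matrix:
    print(row)
print("Total", totals)
```

A column total of 1 marks a forced assignment: the only worker present at that instant has to be on the committee.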
An Integer Programming based formulation
Objective: Minimize Sum(Y_i)
Subject to:
Y1 + Y2 >= 1 # coverage for time t1
Y1 + Y2 + Y3 >= 1 # coverage for time t2
...
More generally, the constraints are:
# Set Covering constraint for time T_j
Sum of Y_i over all workers i that are working at time T_j >= 1
Y_i Binary for all i's
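For tiny instances the formulation can be checked by brute force, which is useful for validating a solver model before scaling up. This sketch follows the coverage model described above (every start/finish instant must be covered by a committee member) and tries committees in order of increasing size; the shifts are made up, and `smallest_committee` is an illustrative name.

```python
from itertools import combinations


def smallest_committee(shifts):
    """Return indices of a minimum-size set of workers covering every instant.

    Exhaustive search: only suitable for very small inputs.
    """
    instants = {t for shift in shifts for t in shift}
    covers = [{t for t in instants if start <= t <= end} for start, end in shifts]
    for size in range(1, len(shifts) + 1):          # minimize Sum(Y_i)
        for committee in combinations(range(len(shifts)), size):
            covered = set().union(*(covers[i] for i in committee))
            if covered == instants:                  # every constraint >= 1
                return committee
    return ()


shifts = [(0, 4), (2, 6), (5, 9), (8, 10)]
committee = smallest_committee(shifts)
print(committee)
```

Here the first and last workers are forced (each is alone at one instant), and one more worker is needed to cover the middle.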
Preprocessing
This Integer program, if attempted without preprocessing can be very difficult, and end up choking the solvers. But in practice there are quite a number of preprocessing ideas that can help immensely.
Make any forced assignments. (If ever there is a time instant with only one worker working, that worker has to be in the committee ∈ C)
Separate into nice subproblems. Look at the time-worker Matrix. If there are nice 'rectangles' in it that can be cut out without impacting any other time instant, then that is a wholly separate sub-problem to solve. Makes the solver go much, much faster.
Identical shifts - If lots of workers have the exact same start and end times, then you can simply choose ANY one of them (say, the lexicographically first worker, WLOG) and remove all the other workers from consideration. (Makes a ton of difference in real life situations.)
Dominating shifts: If one worker starts before and stays later than any other worker, the 'dominating' worker can stay, all the 'dominated' workers can be removed from consideration for C.
All the identical rows (and columns) in the time-worker Matrix can be fused. You need to only keep one of them. (De-duping)
You could throw this into an IP solver (CPLEX, Excel, lp_solve etc.) and you will get a solution, if the problem size is not an issue.
Hope some of these ideas help.

Second best solution to an assignment problem using the Hungarian Algorithm

For finding the best solution in the assignment problem it's easy to use the Hungarian Algorithm.
For example:
A | 3 4 2
B | 8 9 1
C | 7 9 5
When using the Hungarian Algorithm on this you get:
A | 0 0 1
B | 5 5 0
C | 0 1 0
Which means A gets assigned to 'job' 2, B to job 3 and C to job 1.
However, I want to find the second best solution, meaning the best solution with a cost strictly greater than the cost of the optimal solution. As I see it, I just need to find the assignment with the minimal sum in the last matrix that is not the same as the optimal one. I could do this by searching in a tree (with pruning), but I'm worried about the complexity (O(n!)). Is there an efficient method for this that I don't know about?
I was thinking about a search in which I sort the rows first and then greedily choose the lowest cost first, assuming most of the lowest costs will make up the minimal sum, plus pruning. But since the Hungarian Algorithm can produce a matrix with a lot of zeros, the complexity is terrible again...
What you describe is a special case of the K best assignments problem -- there was in fact a solution to this problem proposed by Katta G. Murty in the following 1968 paper "An Algorithm for Ranking all the Assignments in Order of Increasing Cost." Operations Research 16(3):682-687.
Looks like there are actually a reasonable number of implementations of this, at least in Java and Matlab, available on the web (see e.g. here.)
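For checking any implementation on small inputs, a brute-force ranking is a handy reference. The sketch below is not Murty's algorithm; it enumerates all n! assignments of the 3x3 example from the question, sorts them by cost, and picks the first one whose cost is strictly greater than the optimum. `ranked_assignments` is an illustrative name, and job columns are 0-indexed.

```python
from itertools import permutations

cost = {"A": [3, 4, 2], "B": [8, 9, 1], "C": [7, 9, 5]}


def ranked_assignments(cost):
    """All assignments as (total cost, job column per worker), cheapest first."""
    rows = list(cost.values())
    scored = [
        (sum(rows[r][c] for r, c in enumerate(cols)), cols)
        for cols in permutations(range(len(rows)))
    ]
    return sorted(scored)


ranking = ranked_assignments(cost)
best_cost = ranking[0][0]
# First assignment whose cost is strictly greater than the optimum.
second_best = next((item for item in ranking if item[0] > best_cost), None)
print(ranking[0], second_best)
```

On this matrix the optimum costs 12 (A on job 2, B on job 3, C on job 1) and the second best costs 13 (A on job 1, B on job 3, C on job 2).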
In R there is now an implementation of Murty's algorithm in the muRty package.
CRAN
GitHub
It covers:
Optimization in both minimum and maximum direction;
output by rank (similar to dense rank in SQL), and
the use of either Hungarian algorithm (as implemented in clue) or linear programming (as implemented in lpSolve) for solving the initial assignment(s).
Disclaimer: I'm the author of the package.

Algorithm design to assign nodes to graphs

I have a graph-theoretic (which is also related to combinatorics) problem that is illustrated below, and wonder what is the best approach to design an algorithm to solve it.
Given 4 different graphs of 6 nodes (by different, I mean different structures, e.g. STAR, LINE, COMPLETE, etc), and 24 unique objects, design an algorithm to assign these objects to these 4 graphs 4 times, so that the number of repeating neighbors on the graphs over the 4 assignments is minimized. For example, if object A and B are neighbors on 1 of the 4 graphs in one assignment, then in the best case, A and B will not be neighbors again in the other 3 assignments.
Obviously, the degree to which such minimization can go is dependent on the specific graph structures given. But I am more interested in a general solution here so that given any 4 graph structures, such minimization is guaranteed as the result of the algorithm.
Any suggestion/idea of solving this problem is welcome, and some pseudo-code may well be sufficient to illustrate the design. Thank you.
Representation:
You have 24 elements; I will name them A to X (the first 24 letters).
Each of these elements will have a place in one of the 4 graphs. I will number the 24 nodes of the 4 graphs from 1 to 24.
I will identify the position of A by a 24-tuple (xa1, xa2, ..., xa24). If I want to assign A to node number 8, for example, I will write (xa1, xa2, ..., xa24) = (0,0,0,0,0,0,0,1,0,0,...,0), where the 1 is in position 8.
We can say that A = (xa1, ..., xa24).
e1, ..., e24 are the unit vectors (1,0,...,0) to (0,0,...,1).
A note about the operator '.':
A.e1 = xa1
...
X.e24 = xx24
With this notation there are some constraints on A, ..., X:
each xij is in {0,1}
and
Sum(xai) = 1 ... Sum(xxi) = 1
Sum(xa1, xb1, ..., xx1) = 1 ... Sum(xa24, xb24, ..., xx24) = 1
since each element is assigned to exactly one node, and each node holds exactly one element.
I will define a graph by defining the neighbor relation of each node. Let's say node 8 has neighbors node 7 and node 10.
To check that A and B are neighbors at node 8, for example, I need
A.e8 = 1 and (B.e7 = 1 or B.e10 = 1), so I just need A.e8*(B.e7+B.e10) == 1.
In the function IsNeighborInGraphs(A,B) I test that for every node, and I get one or zero depending on whether they are neighbors.
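The neighbor test can be written out directly from the one-hot representation. This is a minimal sketch: the `neighbors` dictionary holds only the hypothetical fragment mentioned above (node 8 adjacent to nodes 7 and 10), and a full model would list every node of all four graphs.

```python
def one_hot(node, n_nodes=24):
    """Position vector with a 1 at `node` (nodes are numbered from 1)."""
    return [1 if i == node else 0 for i in range(1, n_nodes + 1)]


def is_neighbor_in_graphs(a, b, neighbors):
    """Sum over nodes n of a.e_n * (sum of b.e_m for each neighbor m of n).

    Returns 1 if the elements at positions a and b are adjacent, else 0.
    """
    return sum(
        a[n - 1] * sum(b[m - 1] for m in adjacent)
        for n, adjacent in neighbors.items()
    )


# Hypothetical fragment of one graph: node 8 is adjacent to nodes 7 and 10.
neighbors = {7: [8], 8: [7, 10], 10: [8]}

A = one_hot(8)
B = one_hot(10)
C = one_hot(3)
print(is_neighbor_in_graphs(A, B, neighbors), is_neighbor_in_graphs(A, C, neighbors))
```

Because each position vector has exactly one 1, at most one term of the sum is non-zero, so the result is always 0 or 1.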
Notations:
4 graphs of 6 nodes, the position of each element is defined by an integer from 1 to 24.
(1 to 6 for first graph, etc...)
e1... e24 are the unit vectors (1,0,0...0) to (0,0...1)
Let A, B ...X be the N elements.
A=(0,0...,1,...,0)=(xa1,xa2...xa24)
B=...
...
X=(0,0...,1,...,0)
Graph descriptions:
IsNeighborInGraphs(A,B) = A.e1*B.e2 + ...
// if 1 and 2 are neighbors in one graph, for example
State of the system:
L(A) = [B,B,C,E,G...] // list of neighbors of A (can repeat)
actualise(L(A)):
    for Element in [B..X]
        if IsNeighborInGraphs(A,Element)
            L(A).append(Element)
        endIf
    endfor
Objective functions
N(A) = len(L(A)) + Sum(IsNeighborInGraphs(A,i), i in L(A))
...
N(X)= ...
Description of the algorithm
1. Start with an initial position:
A=e1 ... X=e24
2. Update L(A), L(B) ... L(X)
3. Solve this (with a solver; AMPL, for example, will work I guess, since it's a nonlinear optimization problem):
Objective function
min(Sum(N(Z), Z=A to X))
Constraints:
Sum(xai)=1 ... Sum(xxi)=1
Sum(xa1,xb1,...xx1)=1 ...
Sum(xa24,xb24,...xx24)=1
You get the best solution.
4. Repeat steps 2 and 3 three more times.
If all four graphs are K_6, then the best you can do is choose 4 set partitions of your 24 objects into 4 sets each of cardinality 6 so that the pairwise intersection of any two sets has cardinality at most 2. You can do this by choosing set partitions that are maximally far apart in the Hasse diagram of set partitions with partial order given by refinement. The general case is much harder, but perhaps you can still begin with this crude approximation of a solution and then be clever with which vertex is assigned which object in the four assignments.
Assuming you don't want to cycle through all combinations, calculate the sum every time and choose the lowest, you can set this up as a minimization problem. Depending on your constraints, it is solved either with a linear programming solver (i.e. simplex algorithm engines) or with a non-linear solver (much harder in terms of running time), with constraints on your 24 variables that depend on the shape of your graphs. You can also use free software like LINGO/LINDO to quickly create a decision theory model and test its correctness (you need some decision theory background, though).
If this has anything to do with the real world, then it's unlikely that you absolutely must have a solution that is the true minimum. Close to the minimum should be good enough, right? If so, you could repeatedly randomly make the 4 assignments and check the results until you either run out of time or have a good-enough solution or appear to have stopped improving your best solution.
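A rough sketch of that random-sampling approach. The four structures (star, line, complete, cycle) are picked purely for illustration, nodes are numbered 0 to 23 here, and "repeats" are counted as every occurrence of a neighbor pair beyond its first across the four rounds; all names are hypothetical.

```python
import random
from collections import Counter


def graph_edges():
    """Edges of four illustrative 6-node graphs on nodes 0..23."""
    edges = [(0, i) for i in range(1, 6)]                        # star
    edges += [(i, i + 1) for i in range(6, 11)]                  # line
    edges += [(i, j) for i in range(12, 18) for j in range(i + 1, 18)]  # complete
    edges += [(18 + i, 18 + (i + 1) % 6) for i in range(6)]      # cycle
    return edges


def repeated_neighbors(assignments, edges):
    """Count neighbor pairs beyond their first occurrence over all rounds.

    Each assignment is a list where placement[node] is the object at that node.
    """
    seen = Counter()
    for placement in assignments:
        for u, v in edges:
            seen[frozenset((placement[u], placement[v]))] += 1
    return sum(count - 1 for count in seen.values())


def random_search(edges, n_objects=24, rounds=4, trials=200, rng=None):
    """Sample random sets of assignments and keep the one with fewest repeats."""
    rng = rng or random.Random()
    best, best_score = None, None
    for _ in range(trials):
        candidate = [rng.sample(range(n_objects), n_objects) for _ in range(rounds)]
        score = repeated_neighbors(candidate, edges)
        if best_score is None or score < best_score:
            best, best_score = candidate, score
    return best, best_score


edges = graph_edges()
best, best_score = random_search(edges, trials=50, rng=random.Random(1))
print(best_score)
```

The loop could equally stop on a time budget or once the score stops improving, as suggested above. Swapping two objects in the current best instead of resampling from scratch is a natural refinement.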
