NP-Hardness proof for constrained scheduling with staircase cost - complexity-theory

I am working on a problem that appears to be a variant of the assignment problem. Tasks need to be assigned to servers, and the sum of costs over servers needs to be minimized. The following conditions hold:
Each task has a unit size.
A task may not be divided among servers; each task must be handled by exactly one server.
A server has a limit on the maximum number of tasks that may be assigned to it.
The cost function for task assignment is a staircase function. A server incurs a fixed minimum cost 'a'. For each task handled by the server, the cost increases by 1. If the number of tasks assigned to a particular server exceeds half of its capacity, there is a jump in that server's cost equal to a positive number 'd' (see the sketch after this list).
Tasks have preferences, i.e., a given task may only be assigned to one of a few admissible servers.
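To make the staircase cost concrete, here is a minimal sketch in Python; treating an unused server as free is my assumption, not part of the statement above:

def server_cost(k, capacity, a, d):
    # Staircase cost of a server that holds k unit-size tasks.
    if k == 0:
        return 0            # unused server costs nothing (my assumption)
    cost = a + k            # base cost 'a' plus 1 per task
    if k > capacity / 2:    # jump once more than half the capacity is used
        cost += d
    return cost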
I have a feeling that this problem is NP-hard, but I can't seem to find an NP-complete problem to reduce to it. I've tried Bin Packing, the Assignment problem, Multiple Knapsacks, and bipartite graph matching, but none of these problems have all the key characteristics of my problem. Can you please suggest some problem that maps to it?
Thanks and best regards
Saqib

Have you tried reducing the set partitioning problem to yours?
The SET-PART ("set partitioning") decision problem asks whether there exists a partition of a given set S of numbers into two sets S1 and S2 so that the sum of the elements in S1 equals the sum of the elements in S2. This problem is known to be NP-complete.
Your problem seems related to the m-PROCESSOR decision problem. Given a nonempty set A of n>0 tasks {a1,a2,...,an} with processing times t1,t2,...,tn, the m-PROCESSOR problem asks whether you can schedule the tasks among m identical processors so that all tasks finish within at most k>0 time steps. (Processing times are positive natural numbers.)
The reduction of SET-PART to m-PROCESSOR is very easy: first show that the special case with m=2 is NP-complete; then use this to show that m-PROCESSOR is NP-complete for all m>=2. (A reduction in Slovene.)
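A minimal sketch of the m=2 direction in Python, assuming the SET-PART instance is given as a list of positive integers (the function name is mine):

def set_part_to_2_processor(numbers):
    # Tasks' processing times are the numbers themselves; the deadline k
    # is half their total. A schedule on 2 processors finishing by k
    # exists iff the numbers split into two halves of equal sum.
    total = sum(numbers)
    if total % 2 != 0:
        return None                 # odd total: trivially a "no" instance
    processing_times = list(numbers)
    m, k = 2, total // 2
    return processing_times, m, k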
Hope this helps.
EDIT 1: Oops, this m-PROCESSOR thingy seems very similar to the assignment problem.

Related

Complexity of a non-integer K-Commodity Flow without conservation property

I'm working on a problem that can be seen as a version of the Santa Claus problem (defined here, for example: https://dl.acm.org/citation.cfm?id=1132522) where the goods are divisible instead of indivisible.
For the indivisible problem, a reduction from the Partition problem is possible to classify it as NP-hard (see Golovin 2005, page 3). However, with divisible goods, I couldn't find much literature unless I changed the problem to another form.
The problem can be reduced to the K-commodity flow problem (an extension of ND38 from Garey and Johnson: Directed Two-Commodity Integral Flow), which is NP-complete with integral flows; with non-integral flows it is polynomially equivalent to Linear Programming (for two or more commodities).
However, the edges in my model wouldn't be conservative: the utility of each resource is not the same for each commodity, so a total input flow of 1 unit of commodity i into v doesn't mean that the output flow is also 1. In Wikipedia's terms it would be defined as a preflow, because it lacks the "flow conservation" property that is essential in the problem as defined there.
Is there a way to prove/explain the complexity class of K-commodity non-integral flows without the flow conservation property (to which my problem can be reduced)?
To explain the problem a bit more: I have N employees and M tasks. Each employee i has an efficiency for each task j, denoted e_(i,j). The efficiency can be 0 if the employee doesn't know how to do the task. Each employee can work up to H_i hours and can divide his time between the different tasks that he can do.
The objective is to maximize the production function of the firm, which is a Leontief production function: production is determined by the least-produced task (it is the minimum across the different tasks, and we want to maximize that minimum). There is no collaboration, so the amount produced for a task equals the sum of each employee's contribution (efficiency multiplied by the number of hours spent on that task).
If we think of the tasks as agents, this problem can be seen as maximizing the minimum utility across tasks (agents) for an allocation of divisible goods (worker hours) with differentiated utilities (efficiencies).
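To make the objective concrete, here is a minimal sketch; hours[i][j] is a hypothetical allocation matrix, not something given in the problem:

def production(hours, e):
    # Leontief objective: the least-produced task determines output.
    # hours[i][j] = hours employee i spends on task j
    # e[i][j]     = efficiency of employee i on task j
    n_employees, n_tasks = len(e), len(e[0])
    produced = [sum(hours[i][j] * e[i][j] for i in range(n_employees))
                for j in range(n_tasks)]
    return min(produced)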
As I can't use a linear solver inside my program, I am limited to finding a good greedy or FPTAS algorithm that solves this within an acceptable margin of error.
Thank you for reading. I would be grateful for any ideas or general directions/keywords to guide my research.

Algorithm design: can you provide a solution to the knapsack problem where there is no maximum weight constraint?

I am not able to find a suitable algorithm for my problem. My problem is as follows:
There are n tasks. Each task is replicated a different number of times. Two replicas of the same task must not be placed on the same agent. Assign the replicas to agents such that the number of replicas on each agent is approximately the same. There is no weight constraint on the agents.
Can this be solved with knapsack?
A knapsack problem without weights is just a matter of sorting by value in descending order and taking as much as you like. So it does not make much sense without the weight constraint, because then no optimization is needed.
I also see no connection to your agents here.
The problem is quite simple to solve:
Sort the agents in ascending order by the length of their queues,
give the next replica to the first agent that does not already hold a copy of that task,
and repeat until all replicas are assigned.
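A minimal sketch of this greedy in Python, with replicas given as a task-to-copy-count map (all names are mine):

def assign_replicas(replica_counts, n_agents):
    # Greedy: give each replica to the least-loaded agent that does
    # not already hold a copy of the same task.
    agents = [set() for _ in range(n_agents)]
    # Place the most-replicated tasks first so they still find distinct agents.
    for task, copies in sorted(replica_counts.items(), key=lambda kv: -kv[1]):
        candidates = sorted((a for a in agents if task not in a), key=len)
        if len(candidates) < copies:
            raise ValueError("task %r has more replicas than agents" % task)
        for agent in candidates[:copies]:
            agent.add(task)
    return agents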

Optimal assignment of jobs that can be solved by one strong worker or multiple weak workers

I'm looking for an effective way of achieving optimal job/worker assignment. I'd use the Hungarian algorithm, but there is a catch: a worker can be assigned to only one job at a time, each job has a rating, and each worker has his own rating. A job rated 4 can be solved either by a worker rated 4 or by multiple workers whose combined ratings equal the rating of the job, e.g. 2+2, 3+1, 2+1+1, or 1+1+1+1. A job rated 2 can be solved by two workers rated 1 or one worker rated 2. I'd like to prefer one-to-one assignments whenever possible.
Is there any known algorithm or any simple way to achieve optimal assignation in this case?
Your problem is clearly at least as hard as the Partition problem, even just to decide whether a feasible solution exists. To show this, take a Partition instance. It can easily be transformed into your problem by creating two jobs and as many workers as there are elements in the Partition instance. Each worker has a rating equal to the value of the corresponding element, and each of the two jobs is rated at half the total sum. Your problem has a feasible solution if and only if the Partition instance has a solution, which proves that your problem is NP-hard.
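A minimal sketch of that construction, assuming the Partition instance is a list of positive integers (names are mine):

def partition_to_assignment(values):
    # Two jobs, each rated half the total; one worker per element.
    # Both jobs can be covered exactly iff the values split into
    # two parts of equal sum.
    total = sum(values)
    if total % 2 != 0:
        return None                  # odd total: a trivial "no" instance
    jobs = [total // 2, total // 2]  # job ratings
    workers = list(values)           # worker ratings
    return jobs, workers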
I think we could also argue that the problem is at least as hard as an NP-complete problem if we consider Subset Sum.
Transform it into this decision problem:
Given one job with rating N, where N is a real number, and M workers with ratings Ri for i in [0, M), each a real number, is there a subset of workers whose ratings add up to N?
In our case we may restrict the problem to positive integers, but the decision problem remains; in fact the full problem is harder still, because we have many jobs and want to maximize the number of jobs completed.

Algorithm for maximizing happiness

Imagine you have:
100 people
100 projects
Each person ranks all 100 projects in the order in which they would like to work on them. What kind of algorithm can be used to maximize the happiness of the people (i.e. being assigned to a project they ranked higher translates to greater happiness)?
Assume one project per person.
The algorithm for this kind of problem is very popular and is known as the Hungarian algorithm. Here is a similar problem solved with it:
We consider an example where four jobs (J1, J2, J3, and J4) need to be
executed by four workers (W1, W2, W3, and W4), one job per worker. The
matrix below shows the cost of assigning a certain worker to a certain
job. The objective is to minimize the total cost of the assignment.
Source: http://www.hungarianalgorithm.com/examplehungarianalgorithm.php
Please note that the default Hungarian algorithm finds the minimum cost, but you can alter the program to maximize the cost instead.
If the goal is to find the assignment that yields the maximum cost,
the problem can be altered to fit the setting by replacing each cost
with the maximum cost subtracted by the cost.
Source: http://en.wikipedia.org/wiki/Hungarian_algorithm
I've already implemented the Hungarian algorithm on my GitHub, so feel free to use it and modify it to maximize the cost instead.
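If Python is an option, SciPy also ships an implementation of this algorithm; here is a minimal sketch with an illustrative random ranking matrix (the data is made up):

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
# rank[i][j] = position of project j in person i's ranking (0 = favourite)
rank = np.array([rng.permutation(100) for _ in range(100)])

# The Hungarian method minimizes total cost; since a lower rank means a
# happier person, minimizing the summed ranks maximizes total happiness.
people, projects = linear_sum_assignment(rank)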

How can I efficiently find the subset of activities that stay within a budget and maximizes utility?

I am trying to develop an algorithm to select a subset of activities from a larger list. If selected, each activity uses some amount of a fixed resource (i.e. the sum over the selected activities must stay under a total budget). There could be multiple feasible subsets, and the means of choosing from them will be based on calculating the opportunity cost of the activities not selected.
EDIT: There are two reasons this is not the 0-1 knapsack problem:
Knapsack requires integer values for the weights (i.e. resources consumed), whereas my resource consumption (i.e. mass in the knapsack parlance) is a continuous variable. (Obviously it's possible to pick some level of precision and quantize the required resources, but my bin size would have to be very small, and the dynamic-programming solution to Knapsack is O(nW), so a tiny bin size makes W, and hence the run-time, huge.)
I cannot calculate the opportunity cost a priori; that is, I can't evaluate the fitness of each one independently, although I can evaluate the utility of a given set of selected activities or the marginal utility from adding an additional task to an existing list.
The research I've done suggests a naive approach:
Define the powerset
For each element of the powerset, calculate its utility based on the items not in the set
Select the element with the highest utility
However, I know there are ways to speed up execution time and required memory. For example:
Fully enumerating a powerset is O(2^n), but I don't need to enumerate it fully, because once I've found a set of tasks that exceeds the budget, I know that any set that adds more tasks is infeasible and can be rejected. That is, if {1,2,3,4} is infeasible, so is {1,2,3,4} U {n}, where n is any one of the tasks remaining in the larger list.
Since I'm just summing duty, the order of tasks doesn't matter (i.e. if {1,2,3} is feasible, so are {2,1,3}, {3,2,1}, etc.).
All I need in the end is the selected set, so I probably only need the best utility value found so far for comparison purposes.
I don't need to keep the list of enumerations, as long as I can be sure I've looked at all the feasible ones. (Although keeping the duty sums of previously computed feasible subsets might speed up the run-time.)
I've convinced myself a good recursive algorithm will work, but I can't figure out how to define it, even in pseudocode (which probably makes the most sense, because it's going to be implemented in a couple of languages--probably Matlab for prototyping and then a compiled language later).
The knapsack problem is NP-complete, meaning that there is no known efficient (polynomial-time) way of solving it. However, there is a pseudo-polynomial-time solution using dynamic programming. See the Wikipedia section on it for more details.
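A minimal sketch of that dynamic program, assuming the costs have been quantized to integers and W is the quantized budget (names are mine):

def knapsack_max_utility(costs, utilities, W):
    # Classic 0-1 knapsack DP: best[w] = max utility achievable with budget w.
    # Runs in O(n*W) time, i.e. pseudo-polynomial in the budget W.
    best = [0] * (W + 1)
    for c, u in zip(costs, utilities):
        for w in range(W, c - 1, -1):   # descending: each item used at most once
            best[w] = max(best[w], best[w - c] + u)
    return best[W]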
However, if the maximum utility is large, you should stick with an approximation algorithm. One such approximation scheme is to greedily select items with the greatest utility/cost ratio. If the budget is large and the cost of each item is small, this can work out very well.
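A minimal sketch of that greedy heuristic (names are mine):

def greedy_by_ratio(costs, utilities, budget):
    # Take items in order of utility per unit cost until the budget runs out.
    order = sorted(range(len(costs)),
                   key=lambda i: utilities[i] / costs[i], reverse=True)
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen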
EDIT: Since you're defining the utility in terms of the items not in the set, you can simply redefine your costs: negate the cost and then shift everything so that all your values are positive.
As others have mentioned, you are trying to solve an instance of the knapsack problem. While theoretically you are doomed, in practice you may still do a lot to improve the performance of your algorithm. Here are some (wildly assorted) ideas:
Be aware of backtracking. This corresponds to your observation that once you have crossed out {1, 2, 3, 4} as a solution, {1, 2, 3, 4} U {n} is not worth looking at.
Apply Dynamic Programming techniques.
Be clear about your actual requirements:
Maybe you don't need the best set? Will a good one do? I am not aware of an algorithm that is guaranteed to provide a good solution in polynomial time, but there may well be one.
Maybe you don't need the best set all the time? Using randomized algorithms, you can solve some hard problems in polynomial time with a risk of failure in 1% (or whatever you deem "safe enough") of all executions.
(Remember: it's one thing to know that the halting problem is not solvable, but another to build a program that determines whether "hello world" implementations will run indefinitely.)
I think the following iterative algorithm will traverse the entire feasible solution set and store, for each solution, the list of tasks, the total cost of performing them, and the opportunity cost of the tasks not performed.
It seems like it will execute in time polynomial in the number of activities and exponential in the number of activities that can fit within the budget.
ixCurrentSolution = 1
initialize the solution store with the empty set {
    oc(ixCurrentSolution) = opportunity cost of doing nothing
    tasklist(ixCurrentSolution) = empty set
    costTotal(ixCurrentSolution) = 0
}
for ixTask = 1:cActivities
    cSolutions = ixCurrentSolution    % snapshot: only extend solutions that existed before this task
    for ixSolution = 1:cSolutions
        costCandidate = costTotal(ixSolution) + cost(ixTask)    % note: costTotal(ixSolution), not costTotal(ixCurrentSolution)
        if costCandidate < costMax
            ixCurrentSolution = ixCurrentSolution + 1
            costTotal(ixCurrentSolution) = costCandidate
            tasklist(ixCurrentSolution) = tasklist(ixSolution) U {ixTask}
            oc(ixCurrentSolution) = OC of tasks not in tasklist(ixCurrentSolution)
        endif
    endfor
endfor
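A runnable Python version of the same enumeration, assuming a user-supplied opportunity_cost function that scores a solution by the tasks left out (all names here are mine):

def enumerate_feasible(costs, cost_max, opportunity_cost):
    # Enumerate every feasible subset; supersets of infeasible sets are
    # never generated, since extending a set only increases its cost.
    solutions = [((), 0.0)]                   # (tasklist, total cost)
    for task, c in enumerate(costs):
        extensions = []
        for tasks, total in solutions:        # snapshot of existing solutions
            if total + c < cost_max:
                extensions.append((tasks + (task,), total + c))
        solutions.extend(extensions)
    # Return the feasible subset with the lowest opportunity cost.
    return min(solutions, key=lambda s: opportunity_cost(s[0]))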
