How to find the minimum time required to solve all N problems? - algorithm

I was trying to solve this problem but even after hours I am not able
to understand the problem completely. I am not even able to come up
with any brute-force technique. This is the question:
There are N members and N problems, and each member must solve exactly
1 problem. Only one member of the team is allowed to read the
problem statements before anyone starts to solve.
Note that not everyone has read the problems at first. So, to solve
problems a member needs to know the statements from some teammate who
already knows them. After knowing problems once, a member is eligible
to explain them to other teammates ( one teammate at a time ). You can
assume that the explaining ( 1 or N problems ) will always take D
minutes. During explaining, none of the two involved members will be
able to do anything else.
Problems are of different difficulty levels. You can assume that it
will take Ti minutes to solve the ith problem, regardless of which
member solves it.
Given a team's data, what is the minimum possible time in which they
can solve all problems?
Input
N D
2 100
T=[1 2]
Output
102
Member 1 is allowed to know the problems before the start time. He starts
explaining the problems to member 2 when the contest starts. Explaining ends at
the 100th minute. Then both of them immediately start solving
problems in parallel. Member 1 solves the 1st problem at the 101st minute
and member 2 solves the 2nd problem at the 102nd minute.
What is the best method to decode this type of problem and to approach it?

This reminds me of Huffman coding.
I am not sure if the following approach is optimal, but it will probably give a good answer in practice.
Pick the easiest two problems T0 and T1 and replace them by a single problem consisting of time D+max(T0,T1).
Repeat until you have a single problem left
Finding the two easiest problems can be done in O(logN) if you store the problems in a binary heap, so overall this is O(NlogN).
Example
Suppose we have 1,3,4,6 and D=2.
We first combine 1 and 3 to make 2+max(1,3)=5. The new list is 4,5,6
We then combine 4 and 5 to make 2+max(4,5)=7. The new list is 6,7.
We then combine 6 and 7 to make 2+max(6,7)=9.
This represents the following procedure.
t=0 A shares with B
t=2 A starts problem 6, B shares with C
t=4 B starts problem 4, C shares with D
t=6 C starts problem 3, D starts problem 1
t=7 D finishes
t=8 A finishes, B finishes
t=9 C finishes
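The merging procedure above can be sketched in a few lines of Python. This is only an illustration of the heuristic as described, not a proof that it is optimal; the function name `greedy_total_time` is made up for this example.

```python
import heapq

def greedy_total_time(times, d):
    """Huffman-style heuristic: repeatedly replace the two easiest
    problems a <= b by a single problem of cost d + max(a, b).
    Assumes `times` is non-empty."""
    heap = list(times)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)  # easiest remaining problem
        b = heapq.heappop(heap)  # second easiest, so b >= a
        heapq.heappush(heap, d + max(a, b))
    return heap[0]

print(greedy_total_time([1, 3, 4, 6], 2))  # 9, as in the example above
print(greedy_total_time([1, 2], 100))      # 102, the sample from the question
```

Each iteration does a constant number of heap operations, which is where the O(NlogN) bound comes from.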

Every member of the team (except the one who read the problems)
must hear the problems. That is, problems must be told N - 1 times.
For N = 2 this can be done in D minutes,
for 2 < N <= 4 in 2D minutes,
for 4 < N <= 8 in 3D minutes, etc.
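As a quick check of that doubling pattern, here is a small sketch. It assumes that everyone who knows the problems keeps telling them and nobody stops early to solve; the function names are hypothetical.

```python
def telling_rounds(n):
    """Number of D-minute rounds until all n members know the problems,
    assuming the number of members who know them doubles every round."""
    return (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1

def earliest_all_informed(n, d):
    """Earliest time at which the last member has heard the problems."""
    return telling_rounds(n) * d

for n in (2, 3, 4, 5, 8, 9):
    print(n, telling_rounds(n))  # 2->1, 3->2, 4->2, 5->3, 8->3, 9->4
```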
If N is not an exact power of 2 then some people must finish telling
the problems at least D minutes sooner than others.
The ones who finish early can work on
the hardest problems, leaving easier problems for the ones who finish later.
If some of the problems take time Ti > D and N is neither an exact
power of 2 nor one less than an exact power of 2, you may want to have
someone stop telling problems more than D minutes before
the last problem-telling is finished.
If some of the problems take time Ti > 2D then you may need to consider
making some people stop telling problems and start working on the really
hard problems sooner even if N is an exact power of 2.
Since the solving of one problem is in every member's critical path,
but telling is in multiple members' critical paths,
it makes no sense for anyone to solve a problem until they are finished
with all the telling of problems they are going to do.
After each D minutes the number of people who know the problems
increases by the number who were telling problems.
The number who are telling problems increases by the number who
were telling problems (that is, the number who have just learned the
problems) minus the number who start working on problems at that time.
A good "brute force" approach might be to sort the problems
by difficulty; then find out the time until the last person hears
the problems if nobody starts working on them before then;
find out when the last person finishes;
then try starting problems D minutes earlier, or 2D minutes,
or 3D minutes, etc., but never start a shorter-running
problem before a longer-running one.

The problem statement is somewhat ambiguous about the explaining part. Depending on how the statement is interpreted, the following solution is possible:
If you assume that you can explain N problems in D minutes, then it takes D/N minutes to explain one problem. Let's call that Te, for "time to explain". And the time to solve Problem i is Ti, which, judging by the sample, is equal to i minutes.
So at the start of the contest, Member 1 (who knows all of the problems) explains problem N to Member 2. That takes Te minutes. Member 2 then begins working on problem N (which will take N minutes to solve), and Member 1 starts explaining problem N-1 to Member 3. This continues until Member 1 explains problem 2 to Member N. At that point, Member N starts working on problem 2, and Member 1 starts working on problem 1.
Let's say that there are 4 problems, 4 team members, and D=8. So Te=2.
Time  Description
0     Member 1 begins explaining Problem 4 to Member 2
2     Member 2 begins working on Problem 4
      Member 1 begins explaining Problem 3 to Member 3
4     Member 3 begins working on Problem 3
      Member 1 begins explaining Problem 2 to Member 4
6     Member 2 completes Problem 4
      Member 4 begins working on Problem 2
      Member 1 begins working on Problem 1
7     Member 3 completes Problem 3
      Member 1 completes Problem 1
8     Member 4 completes Problem 2
This seems like the optimum solution regardless of the value of D or N: arrange it so that the problem that takes the longest to solve is started as early as possible.
I suspect that the problem statement is an English translation of a problem given in some other language or perhaps a re-translation of something that was originally written in English and translated into some other language. Because if that's the original problem statement, whoever wrote it should be barred from ever writing problems again.

The length of time it takes to complete any one task seems of the form C * D + T, where C is a positive integer less than N, and all N-1 lead-times must be accounted for. Suppose we made a mistake and the optimal solution should actually have a task coupled with a longer lead time - so some C * D + Tj < C * D + Ti, where Ti < Tj, which is impossible.
Therefore, iterate once over the sums of pairs (assuming sorted input):
solution = maximum (T2 + (n-1) * D, T3 + (n-2) * D...D + Tn)

Related

A question regarding the tower of hanoi recursive algorithm time complexity

I am doing a coding exercise today. After finishing the examination, I checked the results and I faced a problem whose problem statement is shown as follows:
Given 4 disks in the tower of Hanoi problem, the recursive algorithm calls the same function at most ___ times.
A. 10
B. 16
C. 22
D. 31
The only thing I knew is that I selected B. 16 and I was wrong.
I searched on the internet and found that it should be 2^n - 1 times, or 15 times.
However, it was not in the options.
Which option is correct?
Any advice will be appreciated.
Thank you.
The 4-disk puzzle takes 15 moves. The number of recursive calls, though, depends on how it's implemented.
If your recursive base case is 1 disk => 1 move, then it's 15. If your recursive base case is 0 disks => 0 moves, then it's 31.
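To see where the two numbers come from, here is a small sketch that counts the calls for both base cases. The function `count_hanoi` and its `base` parameter are made up for this illustration; the recursion itself is the usual three-peg scheme.

```python
def count_hanoi(n, base):
    """Run the usual recursive Hanoi for n disks and return (calls, moves).

    base=1: the recursion stops at one disk, which is moved directly.
    base=0: the recursion stops at zero disks, where nothing is moved.
    Assumes n >= base.
    """
    stats = {"calls": 0, "moves": 0}

    def solve(disks, src, dst, via):
        stats["calls"] += 1
        if disks == base:
            if disks == 1:
                stats["moves"] += 1  # move the single disk src -> dst
            return
        solve(disks - 1, src, via, dst)
        stats["moves"] += 1          # move the largest disk src -> dst
        solve(disks - 1, via, dst, src)

    solve(n, "A", "C", "B")
    return stats["calls"], stats["moves"]

print(count_hanoi(4, base=1))  # (15, 15)
print(count_hanoi(4, base=0))  # (31, 15)
```

Both variants make 15 moves; only the number of function calls differs.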

Best approach to a variation of a bucketing problem

Find the most appropriate team compositions for days in which it is possible. A set of n participants, k days, a team has m slots. A participant specifies how many days he wants to be a part of and which days he is available.
Result constraints:
Participants must not be participating in more days than they want
Participants must not be scheduled in days they are not available in.
Algorithm should do its best to include as many unique participants as possible.
A day will not be scheduled if less than m participants are available for that day.
I find myself solving this problem manually every week at work for my football team scheduling and I'm sure there is a smart programmatic approach to solve it. Currently, we consider only 2 days per week and colleagues write down their name for which day they wanna participate, and it ends up having big lists for each day and impossible to please everyone.
I considered a new approach in which each colleague writes down his name, desired times per week to play and which days he is available, an example below:
Kane 3 1 2 3 4 5
The above line means that Kane wants to play 3 times this week and he is available Monday through Friday. The first number is the number of days to play; the following numbers are the available days (1 to 7, Monday to Sunday).
Days with less than m (in my case, m = 12) participants are not gonna be scheduled. What would be the best way to approach this problem in order to find a solution that does its best to include each participant at least once and also considers their desires(when to play, how much to play).
I can do programming, I just need to know what kind of algorithm to implement and maybe have a brief logical explanation for the choice.
Result constraints:
Participants must not play more than they want
Participants must not be scheduled in days they don't want to play
Algorithm should do its best to include as many participants as possible.
A day will not be scheduled if less than m participants are available for that day.
Scheduling problems can get pretty gnarly, but yours isn't too bad actually. (Well, at least until you put out the first automated schedule and people complain about it and you start adding side constraints.)
The fact that a day can have a match or not creates the kind of non-convexity that makes these problems hard, but if k is small (e.g., k = 7), it's easy enough to brute force through all of the 2^k possibilities for which days have a match. For the rest of this answer, assume we know which days have a match.
Figuring out how to assign people to specific matches can be formulated as a min-cost circulation problem. I'm going to write it as an integer program because it's easier to understand in my opinion, and once you add side constraints you'll likely be reaching for an integer program solver anyway.
Let P be the set of people and M be the set of matches. For p in P and m in M let p ~ m if p is willing to play in m. Let U(p) be the upper bound on the number of matches for p. Let D be the number of people demanded by each match.
For each p ~ m, let x(p, m) be a 0-1 variable that is 1 if p plays in m and 0 if p does not play in m. For all p in P, let y(p) be a 0-1 variable (intuitively 1 if p plays in at least one match and 0 if p plays in no matches, but hold on a sec). We have constraints
# player doesn't play in too many matches
for all p in P, sum_{m in M | p ~ m} x(p, m) ≤ U(p)
# match has the right number of players
for all m in M, sum_{p in P | p ~ m} x(p, m) = D
# y(p) = 1 only if p plays in at least one match
for all p in P, y(p) ≤ sum_{m in M | p ~ m} x(p, m)
The objective is to maximize
sum_{p in P} y(p)
Note that we never actually force y(p) to be 1 if player p plays in at least one match. The maximization objective takes care of that for us.
You can write code to programmatically formulate and solve a given instance as a mixed-integer program (MIP) like this. With a MIP formulation, the sky's the limit for side constraints, e.g., avoid playing certain people on consecutive days, biasing the result to award at least two matches to as many people as possible given that as many people as possible got their first, etc., etc.
I have an idea if you need a basic solution that you can optimize and refine in small steps: flow networks. Most people who already know what they are will probably turn up their noses, because flow networks are usually used to solve maximization problems, not optimization problems. They are right in a sense, but I think the problem can initially be seen as maximizing the number of players who play on each day. Needless to say, it is a kind of greedy approach if we stop here.
No more introduction, the purpose is to find the maximum flow inside this graph:
Each player has a number of days in which he wants to play, represented as the capacity of each edge from the Source to node player x. Each player node has as many edges from player x to day_of_week as the capacity previously found. Each of this 2nd level edges has a capacity of 1. The third level is filled by the edges that link day_of_week to the sink node. Quick example: player 2 is available 2 days: monday and tuesday, both have a limit of player, which is 12.
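A minimal sketch of that graph, with a plain Edmonds-Karp maximum flow, might look like the following. It assumes one capacity-1 edge from each player to each day he is available; the names `build_network` and `max_flow` and the tiny data set are invented for the example.

```python
from collections import defaultdict, deque

def build_network(players, m):
    """players maps name -> (wanted, available_days); m is the slots per day."""
    cap = defaultdict(lambda: defaultdict(int))
    for name, (wanted, days) in players.items():
        p = ("player", name)
        cap["source"][p] = wanted   # at most `wanted` matches per player
        for day in days:
            d = ("day", day)
            cap[p][d] = 1           # a player plays at most once per day
            cap[d]["sink"] = m      # at most m players per day
    return cap

def max_flow(cap, s="source", t="sink"):
    """Edmonds-Karp: turns cap into the residual graph, returns the flow value."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:      # BFS for an augmenting path
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck           # reverse edge records the flow
        total += bottleneck

# Hypothetical data: m = 2 slots per day instead of 12 to keep it small.
players = {"Kane": (3, [1, 2, 3, 4, 5]), "Ann": (1, [1]), "Bob": (2, [1, 2])}
cap = build_network(players, m=2)
print(max_flow(cap))  # 6: everyone gets as many days as requested
```

After the run, a reverse capacity `cap[("day", d)][("player", p)] == 1` means player p is assigned to day d, and a day whose edge to the sink still has residual capacity was not filled.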
So far the 1st, 2nd and 4th constraints are satisfied (well, that was the easy part): after you have found the maximum flow of the entire graph, you select only those paths that have no residual capacity on both the 2nd level (from players to day_of_weeks) and the 3rd level (from day_of_weeks to the sink). It is easy to show that with this level of "optimization", and under certain conditions, the algorithm may not find any acceptable path even though it would have found one had it made different choices while visiting the graph.
This part is the optimization problem that I meant before. I came up with at least two heuristic improvements:
While you visit the graph, store day_of_weeks in a priority queue where days with more players assigned have a higher priority. In this way the residual capacity of the entire graph is certainly less evenly distributed.
Randomness is your friend. You are not obliged to run this algorithm only once, and every time you run it you should pick a random edge from a node in the player's level. At the end you average the results and choose the most common outcome. This is a situation where the majority rule perfectly applies.
I should point out that everything above is just a starting point: the purpose of a heuristic is to find the best approximate solution possible. With this type of problem and given your probably small input, this is not the right way, but it is the easiest one when you do not know where to start.

Finding minimum number of days

I got this question as a part of the interview and I am still unable to solve it.
It goes like this
A person has to complete N units of work; the nature of work is the same.
In order to get the hang of the work, he completes only one unit of work in the first day.
He wishes to celebrate the completion of work, so he decides to complete one unit of work in the last day.
Given that he is only allowed to complete x, x+1 or x-1 units of work in a day, where x is the units of work
completed on the previous day.
How many minimum days will he take to complete N units of work?
Sample Input:
6
-1
0
2
3
9
13
Here, line 1 represents the number of input test cases.
Sample Output:
0
0
2
3
5
7
Each number represents the minimum days required for each input in the sample input.
I tried doing it using the coin change approach but was not able to do so.
In 2k days, it's possible to do at most 2T(k) work (where T(k) is the k'th triangular number). In 2k+1 days, it's possible to do at most T(k+1)+T(k) work. That's because if there's an even (2k) number of days, the most work is 1+2+3+...+k + k+(k-1)+...+3+2+1. Similarly, if there's an odd (2k+1) number of days, the most work is 1+2+3+...+k+(k+1)+k+...+3+2+1.
Given this pattern, it's possible to reduce the amount of work to any value (greater than 1) -- simply reduce the work done on the day with the most work, never picking the start or end day. This never invalidates the rule that the amount of work on one day is never more than 1 difference from an adjacent day.
Call this function F. That is:
F(2k) = 2T(k)
F(2k+1) = T(k)+T(k+1)
Recall that T(k) = k(k+1)/2, so the equations simplify:
F(2k) = k(k+1)
F(2k+1) = k(k+1)/2 + (k+1)(k+2)/2 = (k+1)^2
Armed with these observations, you can solve the original problem by finding the smallest number of days where it's possible to do at least N units of work. That is, the smallest d such that F(d) >= N.
You can, for example, use binary search to find d, or an optimal approach is to solve the equations. The minimal even solution has d/2 * (d/2 + 1) >= N which you can solve as a quadratic equation, and the minimal odd solution has (d+1)^2/4 >= N, which has a solution d = ceil(2sqrt(N)-1). Once you've found the minimal even and odd solutions, then you can pick the smaller of the two.
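As a sanity check of F against the sample, here is a short sketch. It uses a plain linear scan over d rather than binary search or the closed form, which is enough to confirm the numbers; the names `max_work` and `min_days` are made up.

```python
def max_work(d):
    """F(d): the most work possible in d days, starting and ending with 1 unit."""
    k = d // 2
    if d % 2 == 0:
        return k * (k + 1)   # F(2k) = k(k+1)
    return (k + 1) ** 2      # F(2k+1) = (k+1)^2

def min_days(n):
    """Smallest d with F(d) >= n; non-positive n needs no days."""
    if n <= 0:
        return 0
    d = 1
    while max_work(d) < n:
        d += 1
    return d

print([min_days(n) for n in (-1, 0, 2, 3, 9, 13)])  # [0, 0, 2, 3, 5, 7]
```

The loop runs about 2*sqrt(N) times, so replacing it with binary search or the quadratic solution only matters for very large N.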
Since you want the minimum number of days, you would ideally do x+1 every day. But the last day has to be 1, so at some point you have to turn around and go x-1; what remains is to determine that breakpoint.
The breakpoint is located in the middle of the days, since you start at 1 and want to end at 1.
For example you have to do 16 units so you distribute your days like:
Work done:
1234321.
7 days worked.
When you can't make a symmetric breakpoint like the one above, repeat some values:
5 = 1211
Samples:
2: 11
3: 111
9: 12321
13: 1233211
If you need to do exactly N units, not more, then you could use dynamic programming on the matrix a[x][y] where x is the amount of work done on the last day, y is the total amount of work, and a[x][y] is the minimum number of days needed. You could use Dijkstra's algorithm to minimize a[1][N]. Begin with a[1][1]=1.

Trying to gain intuition for work scheduling greedy algorithm

I have the following scenario: (since I don't know of a way to show LaTeX, here's a screenshot)
I'm having some trouble conceptualizing what's going on here. If I were to program this, I would probably attempt to structure this as some kind of heap where each node represents a worker, from earliest-to-latest, then run Prim's/Kruskal's algorithm on it. I don't know if I'm on the right track with that idea, but I need to flesh out my understanding of this problem so I can do the following:
Describe in detail the greedy choice
Show that if there's an optimal solution for which the greedy choice was not made, then an exchange can be made to conform with the greedy choice
Know how to implement a greedy algorithm solution, and its running time
So where should I be going with this idea?
This problem is very similar in nature to "Roster Scheduling problems." Think of the committee as say a set of 'supervisors' and you want to have a supervisor present, whenever a worker is present. In this case, the supervisor comes from the same set as the workers.
Here are some modeling ideas, and an Integer Programming formulation.
Time Slicing Idea
This sounds like a bad idea initially, but works really well in practice. We are going to create a lot of "time instants" Ti from the start time of the first shift to the end time of the very last shift. It sometimes helps to think of
T1, T2, T3....TN as being time instants (say) five minutes apart. For every Ti at least one worker is working on a shift. Therefore, that time instant has to be covered. (Coverage means there has to be at least one member of the committee also working at time Ti.)
We really need to only worry about 2n Time instants: The start and finish times of each of the n workers.
Coverage Property Requirement
For every time instant Ti, we want a worker from the Committee present.
Let w1, w2...wn be the workers, sorted by their start times s_i. (Worker w1 starts the earliest shift, and worker wn starts the very last shift.)
Introduce a new Indicator variable (boolean):
Y_i = 1 if worker i is part of the committee
Y_i = 0 otherwise.
Visualization
Now think of a 0-1 matrix, where the rows are the SORTED workers, and the columns are the time instants...
Construct a Time-Worker Matrix (0/1)
t1 t2 t3 t4 t5 t6 ... tN
-------------------------------------------
w1 1 1
w2 1 1
w3 1 1 1
w4 1 1 1
...
...
wn 1 1 1 1
-------------------------------------------
Total 2 4 3 ... ... 1 2 4 5
So the problem is to make sure that for each column, at least 1 worker is Selected to be part of the committee. The Total shows the number of candidates for the committee at each Time instant.
An Integer Programming based formulation
Objective: Minimize Sum(Y_i)
Subject to:
Y1 + Y2 >= 1 # coverage for time t1
Y1 + Y2 + Y3 >= 1 # coverage for time t2
...
More generally, the constraints are:
# Set Covering constraint for time T_i
Sum over all workers j that are working at time t_i (Y_j) >= 1
Y_i Binary for all i's
Preprocessing
This Integer program, if attempted without preprocessing can be very difficult, and end up choking the solvers. But in practice there are quite a number of preprocessing ideas that can help immensely.
Make any forced assignments. (If ever there is a time instant with only one
worker working, that worker has to be in the committee ∈ C)
Separate into nice subproblems. Look at the time-worker Matrix. If there are nice 'rectangles' in it that can be cut out without
impacting any other time instant, then that is a wholly separate
sub-problem to solve. Makes the solver go much, much faster.
Identical shifts - If lots of workers have the exact same start and end times, then you can simply choose ANY one of them (say, the
lexicographically first worker, WLOG) and remove all the other workers from
consideration. (Makes a ton of difference in real life situations.)
Dominating shifts: If one worker starts before and stays later than any other worker, the 'dominating' worker can stay, all the
'dominated' workers can be removed from consideration for C.
All the identical rows (and columns) in the time-worker Matrix can be fused. You need to only keep one of them. (De-duping)
You could throw this into an IP solver (CPLEX, Excel, lp_solve etc.) and you will get a solution, if the problem size is not an issue.
Hope some of these ideas help.

Can someone help me with my algorithm homework? [closed]

Closed 10 years ago.
I have a bit of a problem with the algorithm proposed as homework by our teachers. It goes something like this:
Having a number of sticks like so:
4 (the number of piles to use)
11 7 5 4 (length of the sticks)
1 1 3 3 (how many sticks per length)
I have to figure out an algorithm that will form the minimal number of sticks by merging them. The solution for the previous example is this:
15 3 (15(optimal sum) * 3(minimal sticks) = 45 = 11*1 + 7*1 + 5*3 + 4*3)
11 4
7 4 4
5 5 5
Now I am not asking for you guys to solve this problem, but to give me a line to follow, I have tried to reduce it to a "Make Change" problem, it went good until the part where I had to select from the remaining solutions the good ones.
The complexity desired is an exponential one and the restrictions are:
0 < sticks < 100
0 < max_sum_of_sticks < 1000
So do you guys have a second thought on this?
Thank you very much for your time.
Explanation of the minimal number of sticks: suppose I have a set of sticks and the sum to be formed is 80. There is a fair number of solutions:
1 stick of length 80
2 sticks of length 40
4 sticks of length 20, and so on.
The first one is trivial and we discard it. For the remaining solutions I have to test whether I can build them with the sets of sticks I have, because a chosen solution, for example 2*40, may not be feasible if it leaves some sticks unused.
This looks a lot like the Knapsack problem.
You might also take a look at Branch and bound, which is a general algorithm for all kinds of optimization problems.
Like Julian, I'm not entirely sure what you mean, but it sounds a lot like the "Knapsack Problem" over multiple knapsacks, and is NP-complete. There are many different ways of approaching it - from the simple heuristics like "use the big stuff first", down to ant-colony (genetic) optimisation. And almost everything in between.
In fact, there are almost as many approaches as there are candidate sets... I wonder if the question is NP-complete? ;-p
Note: I'm calling merged sticks "piles" in this answer.
Of course, the solution is always "1". Merge all of your sticks into one big pile; you have optimal sum = total length of all sticks and minimal piles = 1.
Now, assuming you want the next smallest number after 1, there are a few feasible options. Would you try 2 minimal piles? Why not? What about 4? 5?
Let's say you are left with two candidates, 3 and 5. (i.e. optimal sum=15,minimal piles=3 and optimal sum=9,minimal piles=5) If you know you can arrange your sticks into 3 piles of length 15, do you need to check 5 piles (and what length would they be)?
So, the problem comes down to finding whether you can arrange your sticks into m piles of length n.
I'm sure there's a lot of literature on this problem, but if I were doing it for homework, I'd start by solving it on my own.
And I would start by trying to form one pile of length n. Then trying to form m-1 piles of length n with the remaining sticks...
The thing to be careful about with this approach is that you may form a wrong pile at any given time, so you'd need a way of backtracking and trying another combination. For example, suppose we have these sticks: 20 1 7 7 7 7 14 6 15 and are trying to form 4 piles of length 21. This is possible with the combination (20 1) (7 7 7) (7 14) (6 15), but if you start with (14 6 1), there's no solution that will give you 3x21 piles for the rest of the sticks. Now, I'm not sure if this indicates that 4x21 is not the answer. (The answer is, in fact, 2x42.) If that were the case, you would not run into this problem of "wrong" piles if you always started with the smaller number, i.e. tried 2x42 before trying 4x21. However, not being sure, I'd write code that would backtrack and try all the different combinations before giving up.
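A minimal backtracking sketch of that check follows. It assumes positive stick lengths; the function name `can_form_piles` is made up, and placing the longest sticks first is just one common ordering, not something the question prescribes.

```python
def can_form_piles(sticks, m, n):
    """Return True if the sticks can be split into m piles that each sum to n."""
    if sum(sticks) != m * n:
        return False
    sticks = sorted(sticks, reverse=True)  # longest first prunes earlier
    piles = [0] * m

    def place(i):
        if i == len(sticks):
            return True  # totals match and no pile exceeds n, so all equal n
        tried = set()
        for p in range(m):
            # Skip piles that would overflow, and piles with the same current
            # sum as one already tried for this stick (symmetric choice).
            if piles[p] + sticks[i] > n or piles[p] in tried:
                continue
            tried.add(piles[p])
            piles[p] += sticks[i]
            if place(i + 1):
                return True
            piles[p] -= sticks[i]          # backtrack
        return False

    return place(0)

sticks = [20, 1, 7, 7, 7, 7, 14, 6, 15]
print(can_form_piles(sticks, 4, 21))  # True
print(can_form_piles(sticks, 2, 42))  # True
print(can_form_piles([11, 7, 5, 5, 5, 4, 4, 4], 3, 15))  # True
```

The worst case is exponential, which fits the complexity the question says is acceptable.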
I am not sure I understand the problem.
I assume each pile has to be the same length? That is, the sum of length of sticks has to be same in all piles?
We have here three piles. Where did this three come from? If it were possible to make 2 piles, which one would you choose? For example, if you only have six sticks of X length, would you make three piles each with two sticks or two piles, each with three?
I guess the brute-force method is: you try to make X piles. Put every permutation/combination in each pile and see if you end up with the same total length in each.
Would it help if you give unique names to each stick? In this case, you have 11-1, 7-1, 5-1, 5-2, 5-3, etc.

Resources