Let's say we have a directed graph. We want to visit every node exactly once by traveling on the edges of this graph. Every node is annotated with one or more tags; some nodes may share tags, and even have the exact same set of tags. As we go along our walk, we are collecting a list of every distinct tag we have encountered - our objective is to find the walk which postpones acquisition of new tags as much as possible.
To restate this as a traveler analogy, let's say that a carpet salesman is trying to decide which supplier he should acquire his carpets from. He makes a list of all the carpet factories in the city, makes an appointment with every factory, and collects samples of the kinds of carpet they make.
Let's say we have 3 factories, producing the following kinds of carpet:
F1: C1, C2, C3
F2: C1, C4
F3: C1, C4, C5
The salesman could take the following routes:
Start at F1, collect C1, C2, C3. Go to F2, collect C4 (since he already has C1). Go to F3, collect C5 (he already has C1 and C4).
Start at F1, collect C1, C2, C3. Go to F3, collect C4 and C5. Go to F2, collect nothing (since it turns out he already has all their carpets).
Start at F2, collect C1, C4. Go to F1, collect C2, C3. Go to F3 and collect C5.
Start at F2, collect C1, C4. Go to F3, collect C5. Go to F1 and collect C2 and C3.
Start at F3, collect C1, C4, C5. Go to F1, collect C2, C3. Go to F2, collect nothing.
Start at F3, collect C1, C4, C5. Go to F2, collect nothing. Go to F1, collect C2, C3.
Note how sometimes, the salesman visits a factory even though he knows he has already collected a sample for every kind of carpet they produce. The analogy breaks down here a bit, but let's say he must visit them because it would be rude to not show up for his appointment.
Now, the carpet samples are heavy, and our salesman is traveling on foot. Distance by itself isn't hugely important (assume every edge has cost 1), but he doesn't want to carry around a whole bunch of samples any more than he needs to. So, he needs to plan his trip such that he visits the factories which have a lot of rare carpets (and where he will have to pick up a lot of new samples) last.
For the example paths above, here are the numbers of samples carried on each leg of the journey, and their sum:

Route  Leg 1  Leg 2  Leg 3  Sum
1      0      3      4      7
2      0      3      5      8
3      0      2      4      6
4      0      2      3      5
5      0      3      5      8
6      0      3      3      6
We can see now that route 2 is very bad: first he had to carry 3 samples from F1 to F3, then he had to carry 5 samples from F3 to F2! Instead, he could have gone with route 4 - he would first carry 2 samples from F2 to F3, and then 3 samples from F3 to F1.
Also, as shown in the last column, the sum of the samples carried over every edge is a good metric for how many samples he had to carry overall: the number of samples he is carrying can never decrease, so visiting varied factories early on will necessarily inflate the sum, and a low sum is only possible by first visiting similar factories with few new carpets.
Is this a known problem? Is there an algorithm to solve it?
Note: I would recommend being careful about making assumptions based on my example problem. I came up with it on the spot and deliberately kept it small for brevity; it almost certainly fails to capture many edge cases.
As the size of the graph is small, we can consider using bitmasks and dynamic programming to solve this problem (similar to how we solve the traveling salesman problem).
Assume that we have a total of 6 cities to visit. Then the starting state is 0 and the ending state is 111111b, or 63 in decimal.
From each step, if the state is x, we can easily calculate the set of samples the salesman is carrying, and the cost of moving from state x to state y is the number of newly added samples times the number of cities still unvisited after the move (each new sample must be carried on every remaining leg, which is exactly what the question's sum measures).
// numberOfCity, citySample (the set of samples sold in each city) and memo
// (size 1 << numberOfCity, initialised to -1) are fields of the class.
public int cal(int mask) {
    if (mask == (1 << numberOfCity) - 1) { // visited all cities
        return 0;
    }
    if (memo[mask] != -1) {
        return memo[mask];
    }
    Set<Integer> sampleSet = new HashSet<>(); // samples collected so far
    int left = 0;                             // number of unvisited cities
    for (int i = 0; i < numberOfCity; i++) {
        if (((1 << i) & mask) != 0) {         // this city was visited
            sampleSet.addAll(citySample[i]);
        } else {
            left++;
        }
    }
    int cost = Integer.MAX_VALUE;
    for (int i = 0; i < numberOfCity; i++) {
        if (((1 << i) & mask) == 0) {         // try visiting city i next
            int dif = 0;                      // number of new samples from city i
            for (int s : citySample[i]) {
                if (!sampleSet.contains(s)) dif++;
            }
            // the new samples are carried on each of the remaining left - 1 legs
            cost = Math.min(cost, dif * (left - 1) + cal(mask | (1 << i)));
        }
    }
    return memo[mask] = cost;
}
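For reference, here is a hypothetical setup of those fields for the three-factory example from the question (samples C1..C5 encoded as the integers 1..5; the fields are assumed to be declared as int numberOfCity, List<Integer>[] citySample, int[] memo). cal(0) should return 5, matching route 4:

numberOfCity = 3;
citySample = new List[] {
    List.of(1, 2, 3),   // F1: C1, C2, C3
    List.of(1, 4),      // F2: C1, C4
    List.of(1, 4, 5)    // F3: C1, C4, C5
};
memo = new int[1 << numberOfCity];
Arrays.fill(memo, -1);
System.out.println(cal(0)); // 5 (route 4: F2 -> F3 -> F1)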
In the case where there are edges between every pair of nodes, and each carpet is only available at one location, this looks tractable. If you pick up X carpets when there are Y steps to go, then the contribution from this to the final cost is XY, so you need to minimise \sum_i X_i Y_i, where X_i is the number of carpets picked up when you have Y_i steps to go. You can do this by visiting the factories in increasing order of the number of carpets to be picked up at each factory: if you provide a schedule in which you pick up more carpets at A than at B, and you visit A before B, I can improve it by swapping the times at which you visit A and B, so any schedule that does not follow this rule is not optimal.
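Under those same assumptions (complete graph, each carpet sold at exactly one factory), the rule boils down to a few lines. A minimal sketch in Java; the method name and the pickups array are mine, for illustration:

import java.util.Arrays;

// pickups[i] = number of carpets that must be picked up at factory i.
// Visiting factories in increasing pickup order minimises sum of X_i * Y_i.
static int minCarryingCost(int[] pickups) {
    int[] sorted = pickups.clone();
    Arrays.sort(sorted);                 // smallest pickups first
    int n = sorted.length, cost = 0;
    for (int i = 0; i < n; i++) {
        int stepsToGo = n - 1 - i;       // Y_i: legs remaining after this visit
        cost += sorted[i] * stepsToGo;   // X_i * Y_i
    }
    return cost;
}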
Related
I'm sorry, I don't know the proper title for this because I don't know which topic this question falls into.
So for example there are 5 people. They want to stay in a hotel. This hotel only allows at most 2 lodgers per room and at least 1. That means there are a few possible variations for this.
1-1-1-1-1 (1 room for each person)
1-2-2 (1 person stays alone, the other 4 are divided into 2 rooms)
1-1-1-2 (... and so on)
What is the algorithm to find these variations?
This is a combinatorial question, and the abstract version is typically called balls & bins. A key question is whether the balls are distinguishable. Ditto for the bins.
In your example, the balls are people and the bins are rooms. If the rooms are distinguishable, you'll also need the total number available.
Let's say neither is distinguishable. Then the only question is how many pairs we have, with the options being 0, 1, or 2, so there are 3 solutions.
If people are distinguishable but not rooms (balls but not bins), then we care who is in the pairs. In this case 1-1-1-1-1 has a single solution, 1-1-1-2 has choose(5,2) = 10 solutions (all the ways we can choose who is in the lone pair), and 1-2-2 has choose(5,2) * choose(3,2) / 2 = 10 * 3 / 2 = 15 solutions (choose who is in the first pair, then the second, then divide by 2 to avoid double-counting where the order of the two pairs is reversed). Total solutions: 26.
If people and rooms are both distinguishable, then for each of the solutions above we care which room each person or pair goes in. This will depend on the total number of rooms available. If there are R rooms available, then a solution which uses r rooms where rooms aren't distinguishable will need to be multiplied by R!/(R-r)!.
E.g. 1-1-1-2 has 10 solutions where rooms are indistinguishable. If the hotel has 5 rooms then we multiply that by 5!/(5-4)! = 120 to get 1200 solutions.
If the people are a,b,c,d,e and there are 5 rooms numbered 1,2,3,4,5, then the solutions where b+d are paired up, a is in room 1, and c is in room 2 are:
a1, c2, e3, bd4
a1, c2, e3, bd5
a1, c2, e4, bd3
a1, c2, e4, bd5
a1, c2, e5, bd3
a1, c2, e5, bd4
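If you want to sanity-check these counts programmatically, here is a small Java sketch of the two quantities used above, choose(n, k) and R!/(R-r)! (the helper names are mine):

// Binomial coefficient C(n, k), computed incrementally to stay exact.
static long choose(int n, int k) {
    long r = 1;
    for (int i = 1; i <= k; i++) r = r * (n - k + i) / i;
    return r;
}
// Falling factorial n!/(n-k)!: ways to pick k distinguishable rooms in order.
static long perm(int n, int k) {
    long r = 1;
    for (int i = 0; i < k; i++) r *= n - i;
    return r;
}
// People distinguishable, rooms not:
//   1 + choose(5, 2) + choose(5, 2) * choose(3, 2) / 2 = 1 + 10 + 15 = 26
// Rooms also distinguishable (R = 5): e.g. 1-1-1-2 uses r = 4 rooms,
//   so choose(5, 2) * perm(5, 4) = 10 * 120 = 1200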
You can consider the above problem to be similar to the coin change problem, where you are given a sum and a set of coins and you have to find the number of ways you can make the sum using those coins.
Here:
coins = {1,2}
sum = Number of people
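That count is a small dynamic program. A minimal sketch in Java (this covers the case where neither people nor rooms are distinguishable, and returns 3 for 5 people, matching the variations listed in the question):

// Number of ways to make `sum` from the given coin values, where order
// doesn't matter (the coin-outer loop counts combinations, not permutations).
static long countWays(int sum, int[] coins) {
    long[] ways = new long[sum + 1];
    ways[0] = 1;                       // one way to seat zero people
    for (int c : coins)
        for (int s = c; s <= sum; s++)
            ways[s] += ways[s - c];
    return ways[sum];
}
// countWays(5, new int[]{1, 2}) == 3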
Given a data set of the number of passengers and the time each taxi driver takes to serve each passenger, output the minimum time required by the taxi agency to transfer all passengers.
Constraints:
A driver would like to serve the next consecutive passenger number (i.e. D1 would go with P1, P2 rather than P1, P3, because the latter breaks his contiguous passenger-number chain).
A driver once chosen for a task will not be allowed to participate again (i.e. if D1 goes with P1, P2 and D3 goes with P3, D1 is not considered again for P4; D3 can be considered, as P4 is contiguous with P3).
The input is an m*n matrix; output the minimum time.
Example:
    P1  P2  P3  P4
D1   2   2   1   2
D2   3   1   2   1
D3   4   2   3   1
Output (minimum time): 6
Explanation:
we can choose D1 for P1,P2,P3 and D3 for P4 (2+2+1+1 = 6)
or we can choose D1 for P1,P2,P3 and D2 for P4 (2+2+1+1 = 6)
or we can choose D1 for P1, D2 for P2,P3 and D3 for P4 (2+1+2+1 = 6)
or we can choose D1 for P1 and D2 for P2,P3,P4 (2+1+2+1 = 6)
or we can choose D2 for P1,P2, D1 for P3, and D3 for P4 (3+1+1+1 = 6)
PS: It's a modified assignment problem, but I am unable to crack the solution.
As you noticed, the only difference from a classical assignment problem is that, in case of equality, you should maximize ∑(Pi == Di).
This can be done by reducing the cost of Mat[i][i] by an infinitesimal amount. In practice 1/1024 is enough if n << 1024.
Once you have done this, you can solve your problem using standard algorithms such as the Hungarian Algorithm.
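A sketch of that perturbation in Java (the helper name is mine; it assumes the base costs are integers, so a total perturbation below 1 can never change which assignments are cheapest, only break ties):

// Copy the integer cost matrix to doubles, shaving eps off the diagonal so
// that ties are broken toward P_i == D_i matches; then feed the result to
// any standard assignment solver (e.g. the Hungarian algorithm).
static double[][] perturb(int[][] mat) {
    int n = mat.length;
    double eps = 1.0 / 1024;   // safe while n << 1024
    double[][] cost = new double[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            cost[i][j] = mat[i][j] - (i == j ? eps : 0.0);
    return cost;
}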
This problem has bothered me for a long time. I think the solution should be a graph algorithm. Thanks a lot.
Given m cups c1, c2, ..., cm with integer capacities a1, a2, ..., am ml respectively. You are only allowed
to perform the following three types of operations:
Completely fill one cup.
Empty one cup.
Pour water from cup ci to cup cj until either ci is empty or cj is full.
Starting from the state in which all cups are empty, you would like to reach the final state in which
cup c1 has x ml of water and all other cups are empty (for some given x). Design an algorithm to find the minimum number of operations required, or report that the desired final state is not reachable.
Your algorithm must run in time polynomial in n = (a1 + 1)(a2 + 1)...(am + 1).
Let's say c1 has a capacity of 5 litres and c2 has a capacity of 3 litres. Their difference is 2 litres.
So 2 litres of water can be obtained in either cup c1 or c2 by the following steps:
1) fill c1 i.e. 5 litres.
2) pour it in c2 until it gets full.
3) empty c2.
4) you now have 2 litres in c1.
For these m cups, you have m choose 2 = m!/((m-2)! * 2!) = m(m-1)/2 combinations.
Calculate all of these differences and store them in a hash table, together with the number of operations needed.
Note that 2 litres could also be the capacity of a cup itself, not just the difference of two cups' capacities; in that case we store operation = 1 instead of 3.
Now we have a hash table of every amount of water we can obtain, each with its number of operations.
All you need then is to find, from that collection, the combination that reaches the target with the minimum total number of operations.
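Alternatively, the "graph algorithm" the question hints at can be made explicit: breadth-first search over the state graph, where a state records how much water each cup holds. With a_i + 1 possible amounts per cup there are exactly n = (a1 + 1)(a2 + 1)...(am + 1) states, so this is polynomial in n as required. A minimal Java sketch (all names and the mixed-radix encoding are mine):

import java.util.*;

class Cups {
    // Minimum operations to reach (x, 0, ..., 0) from all-empty, or -1.
    static int minOps(int[] a, int x) {
        int m = a.length, n = 1;
        for (int cap : a) n *= cap + 1;          // total number of states
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        ArrayDeque<Integer> q = new ArrayDeque<>();
        dist[0] = 0;                             // all cups empty
        q.add(0);
        while (!q.isEmpty()) {
            int code = q.poll();
            int[] s = decode(code, a);
            if (isGoal(s, x)) return dist[code];
            for (int i = 0; i < m; i++) {
                int keep = s[i];
                s[i] = a[i]; relax(s, a, dist, q, dist[code]);   // fill cup i
                s[i] = 0;    relax(s, a, dist, q, dist[code]);   // empty cup i
                s[i] = keep;
                for (int j = 0; j < m; j++) {                    // pour i -> j
                    if (i == j) continue;
                    int amt = Math.min(s[i], a[j] - s[j]);
                    s[i] -= amt; s[j] += amt;
                    relax(s, a, dist, q, dist[code]);
                    s[i] += amt; s[j] -= amt;
                }
            }
        }
        return -1;                               // target not reachable
    }

    static boolean isGoal(int[] s, int x) {
        if (s[0] != x) return false;
        for (int i = 1; i < s.length; i++) if (s[i] != 0) return false;
        return true;
    }

    // Mixed-radix encoding: cup i contributes a digit in base a[i] + 1.
    static int encode(int[] s, int[] a) {
        int code = 0;
        for (int i = s.length - 1; i >= 0; i--) code = code * (a[i] + 1) + s[i];
        return code;
    }

    static int[] decode(int code, int[] a) {
        int[] s = new int[a.length];
        for (int i = 0; i < a.length; i++) { s[i] = code % (a[i] + 1); code /= a[i] + 1; }
        return s;
    }

    static void relax(int[] s, int[] a, int[] dist, ArrayDeque<Integer> q, int d) {
        int code = encode(s, a);
        if (dist[code] == -1) { dist[code] = d + 1; q.add(code); }
    }
}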
I'm trying to decide on the best approach for my problem, which is as follows:
I have a set of objects (about 3k-5k) which I want to uniquely assign to about 10 groups (1 group per object).
Each object has a set of grades corresponding with how well it fits within each group.
Each group has a capacity of objects it can manage (the constraints).
My goal is to maximize the sum of grades my assignments receive.
For example, let's say I have 3 objects (o1, o2, o3) and 2 groups (g1, g2) with a capacity of 1 object each.
Now assume the grades are:
o1: g1=11, g2=8
o2: g1=10, g2=5
o3: g1=5, g2=6
In that case, for the optimal result g1 should receive o2, and g2 should receive o1, yielding a total of 10+8=18 points.
Note that the number of objects can either exceed the sum of quotas (e.g. leaving o3 as a "leftover") or fall short from filling the quotas.
How should I address this problem (Traveling Salesman, sort of a weighted Knap-Sack, etc.)? How long would brute-forcing it take on a regular computer? Are there any standard tools, such as the linprog function in Matlab, that support this sort of problem?
It can be solved with a min cost flow algorithm.
The graph can look the following way:
It should be bipartite. The left part represents objects (one vertex per object) and the right part represents groups (one vertex per group). There is an edge from each vertex in the left part to each vertex in the right part with capacity = 1 and cost = -grade for that pair. There is also an edge from the source vertex to each vertex in the left part with capacity = 1 and cost = 0, and an edge from each vertex in the right part to the sink vertex (sink and source are two additional vertices) with capacity = the group's constraint and cost = 0.
The answer is the negation of the cost of the cheapest flow from the source to the sink.
It is possible to implement this with O(N^2 * M * log(N + M)) time complexity (using Dijkstra's algorithm with potentials), where N is the number of objects and M is the number of groups.
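A hedged sketch of that construction in Java, using a simple SPFA-based successive-shortest-paths implementation rather than Dijkstra with potentials (all class and method names are mine; SPFA is used because the -grade edge costs are negative). Vertices are 0 = source, 1..N = objects, N+1..N+M = groups, N+M+1 = sink, and the best total grade is -minCostFlow(source, sink):

import java.util.*;

class AssignmentFlow {
    final int n;
    final List<int[]> edges = new ArrayList<>();        // {to, remainingCap, cost}
    final List<List<Integer>> adj = new ArrayList<>();

    AssignmentFlow(int vertexCount) {
        n = vertexCount;
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
    }

    void addEdge(int u, int v, int cap, int cost) {
        adj.get(u).add(edges.size()); edges.add(new int[]{v, cap, cost});
        adj.get(v).add(edges.size()); edges.add(new int[]{u, 0, -cost}); // residual
    }

    int minCostFlow(int s, int t) {
        int total = 0;
        while (true) {
            int[] dist = new int[n], pre = new int[n];
            boolean[] inq = new boolean[n];
            Arrays.fill(dist, Integer.MAX_VALUE);
            Arrays.fill(pre, -1);
            dist[s] = 0;
            ArrayDeque<Integer> q = new ArrayDeque<>();
            q.add(s);
            while (!q.isEmpty()) {                      // SPFA shortest path
                int u = q.poll();
                inq[u] = false;
                for (int id : adj.get(u)) {
                    int[] e = edges.get(id);
                    if (e[1] > 0 && dist[u] + e[2] < dist[e[0]]) {
                        dist[e[0]] = dist[u] + e[2];
                        pre[e[0]] = id;
                        if (!inq[e[0]]) { inq[e[0]] = true; q.add(e[0]); }
                    }
                }
            }
            // stop once no augmenting path lowers the cost: objects may stay
            // unassigned, which matches the "leftover" case in the question
            if (pre[t] == -1 || dist[t] >= 0) return total;
            for (int v = t; v != s; v = edges.get(pre[v] ^ 1)[0]) {
                edges.get(pre[v])[1]--;                 // each path carries 1 unit
                edges.get(pre[v] ^ 1)[1]++;
            }
            total += dist[t];
        }
    }
}

On the question's 3-object example (edges source->o_i with capacity 1 and cost 0, o_i->g_j with capacity 1 and cost -grade, g_j->sink with capacity 1 and cost 0), this returns -18, i.e. the same 18 points as the hand-computed optimum.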
This can be solved with an integer program. Binary variables x_{ij} state whether object i is assigned to group j. The objective maximizes \sum_{i,j} s_{ij} x_{ij}, where s_{ij} is the score associated with assigning i to j. You have two types of constraints:
\sum_i x_{ij} <= c_j for all j, the capacity constraints for groups
\sum_j x_{ij} <= 1 for all i, limiting objects to be assigned to at most one group
Here's how you would implement it in R -- the lp function (from the lpSolve package) is quite similar to the linprog function in Matlab.
library(lpSolve)

# Score matrix
S <- matrix(c(11, 10, 5, 8, 5, 6), nrow=3)
# Capacity vector
cvec <- c(1, 1)
# Helper function to construct constraint matrices
unit.vec <- function(pos, n) {
ret <- rep(0, n)
ret[pos] <- 1
ret
}
# Capacity constraints
cap <- t(sapply(1:ncol(S), function(j) rep(unit.vec(j, ncol(S)), nrow(S))))
# Object assignment constraints
obj <- t(sapply(1:nrow(S), function(i) rep(unit.vec(i, nrow(S)), each=ncol(S))))
# Solve the LP
res <- lp(direction="max",
objective.in=as.vector(t(S)),
const.mat=rbind(cap, obj),
const.dir="<=",
const.rhs=c(cvec, rep(1, nrow(S))),
all.bin=TRUE)
# Grab assignments and objective
sln <- t(matrix(res$solution, nrow=ncol(S)))
apply(sln, 1, function(x) ifelse(sum(x) > 0.999, which(x == 1), NA))
# [1] 2 1 NA
res$objval
# [1] 18
Although this is modeled with binary variables, it will solve quite efficiently assuming integral capacities.
Let's say I have the three following lists
A1
A2
A3
B1
B2
C1
C2
C3
C4
C5
I'd like to combine them into a single list, with the items from each list as evenly distributed as possible sorta like this:
C1
A1
C2
B1
C3
A2
C4
B2
A3
C5
I'm using .NET 3.5/C# but I'm looking more for how to approach it than for specific code.
EDIT: I need to keep the order of elements from the original lists.
1) Take a copy of the list with the most members. This will be the destination list.
2) Then take the list with the next largest number of members.
3) Divide the destination list length by the smaller length to give a fractional value greater than one.
4) For each item in the second list, maintain a float counter. Add the value calculated in the previous step, and mathematically round it to the nearest integer (keeping the original float counter intact).
5) Insert the item at this position in the destination list and increment the counter by 1 to account for it. Repeat for all members of the second list.
6) Repeat steps 2-5 for all lists.
EDIT: This has the advantage of being O(n) as well, which is always nice :)
Implementation of Andrew Rollings' answer:
public List<String> equimix(List<List<String>> input) {
    // sort from biggest list to smallest list
    Collections.sort(input, new Comparator<List<String>>() {
        public int compare(List<String> a1, List<String> a2) {
            return a2.size() - a1.size();
        }
    });
    // fold the remaining lists into the biggest one, two at a time
    List<String> output = input.get(0);
    for (int i = 1; i < input.size(); i++) {
        output = equimix(output, input.get(i));
    }
    return output;
}

public List<String> equimix(List<String> listA, List<String> listB) {
    // make sure listA is the longer list; note that it is mutated in place
    if (listB.size() > listA.size()) {
        List<String> temp = listB;
        listB = listA;
        listA = temp;
    }
    List<String> output = listA;
    // ideal spacing between consecutive items of listB inside listA
    double shiftCoeff = (double) listA.size() / listB.size();
    double floatCounter = shiftCoeff;
    for (String item : listB) {
        int insertionIndex = (int) Math.round(floatCounter);
        output.add(insertionIndex, item);
        // the +1 accounts for the element just inserted shifting the list
        floatCounter += (1 + shiftCoeff);
    }
    return output;
}
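A quick usage check against the question's lists (the exact positions fall out of the rounding above):

List<List<String>> input = new ArrayList<>(List.of(
    new ArrayList<>(List.of("A1", "A2", "A3")),
    new ArrayList<>(List.of("B1", "B2")),
    new ArrayList<>(List.of("C1", "C2", "C3", "C4", "C5"))));
System.out.println(equimix(input));
// [C1, C2, A1, C3, B1, A2, C4, C5, A3, B2]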
First, this answer is more of a train of thought than a concrete solution.
OK, so you have a list of 3 items (A1, A2, A3), where you want A1 to be somewhere in the first 1/3 of the target list, A2 in the second 1/3 of the target list, and A3 in the third 1/3. Likewise you want B1 to be in the first 1/2, etc...
So you allocate your list of 10 as an array, then start with the list with the most items, in this case C. Calculate the spot where C1 should fall (1.5). Drop C1 in the closest spot (in this case, either 1 or 2), then calculate where C2 should fall (3.5) and continue the process until there are no more Cs.
Then go to the list with the second-largest number of items, in this case A. Calculate where A1 goes (1.66), so try 2 first. If you already put C1 there, try 1. Do the same for A2 (4.66) and A3 (7.66). Finally, do list B. B1 should go at 2.5, so try 2 or 3. If both are taken, try 1 and 4, and keep moving radially out until you find an empty spot. Do the same for B2.
You'll end up with something like this if you pick the lower number:
C1 A1 C2 A2 C3 B1 C4 A3 C5 B2
or this if you pick the upper number:
A1 C1 B1 C2 A2 C3 A3 C4 B2 C5
This seems to work pretty well for your sample lists, but I don't know how well it will scale to many lists with many items. Try it and let me know how it goes.
Make a hash table of lists.
For each list, store the nth element in the list under the key (/ n (+ (length list) 1))
Optionally, shuffle the lists under each key in the hash table, or sort them in some way
Concatenate the lists in the hash by sorted key
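In Java terms (a sketch; the method name is mine), the key for the nth element of a list is the fraction n / (length + 1) with 1-based n, and a TreeMap keeps the buckets in sorted key order:

import java.util.*;

// Bucket every element under its fractional-position key, then emit the
// buckets in ascending key order; within-list order is preserved because
// keys increase with n.
static List<String> mixByKey(List<List<String>> lists) {
    TreeMap<Double, List<String>> buckets = new TreeMap<>();
    for (List<String> list : lists) {
        for (int n = 1; n <= list.size(); n++) {
            double key = n / (double) (list.size() + 1);
            buckets.computeIfAbsent(key, k -> new ArrayList<>())
                   .add(list.get(n - 1));
        }
    }
    List<String> merged = new ArrayList<>();
    for (List<String> bucket : buckets.values()) merged.addAll(bucket);
    return merged;
}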
I'm thinking of a divide and conquer approach, where in each iteration you split every list with more than one element in half and recurse. When you reach a point where all the lists except one have exactly one element, you can randomly combine them, pop up a level, randomly combine the lists removed from that frame whose length was one... et cetera.
Something like the following is what I'm thinking:
- filter lists into three categories
- lists of length 1
- first half of the elements of lists with > 1 elements
- second half of the elements of lists with > 1 elements
- recurse on the first and second half of the lists if they have > 1 element
- combine results of above computation in order
- randomly combine the list of singletons into returned list
You could simply combine the three lists into a single list and then UNSORT that list. An unsorted list should achieve your requirement of 'evenly-distributed' without too much effort.
Here's an implementation of unsort: http://www.vanheusden.com/unsort/.
A quick suggestion, in Python:
merge = []
lists = [list_a, list_b, list_c]
lists.sort(key=len, reverse=True)   # longest list first
while lists:
    l = lists.pop(0)                # take the current front list
    merge.append(l.pop(0))          # move its head item to the output
    if l:                           # l is not exhausted: put it back, but
        nxt = lists.pop(0) if lists else None  # let the next list keep its turn
        lists.append(l)
        lists.sort(key=len, reverse=True)
        if nxt is not None:
            lists.insert(0, nxt)
This should distribute elements from shorter lists more evenly than the other suggestions here.