Given a node network, how to find the highest scoring loop with finite number of moves?

For a project of mine, I'm attempting to create a solver that, given a random set of weighted nodes with weighted paths, will find the highest scoring path with a finite number of moves. I've created a visual to help describe the problem.
This example has all the connection edges shown for completeness. The numbers on edges are traversal costs and the numbers inside nodes are scores. A node is only counted when it is traversed to, and a node cannot be traversed to from itself.
As you can see from the description in the image, there is a start/finish node with randomly placed nodes that each have an arbitrary score. Every node is connected to all other nodes and every connection has an arbitrary weight that subtracts from the total number of move units remaining. For simplicity, you could assume the weight of a connection is a function of distance. Nodes can be traveled to more than once and their score is applied again. The goal is to find a loop path that has the highest score for the given move limit.
The solver will never be dealing with more than 30 nodes, usually dealing with 10-15 nodes. I still need to try and make it as fast as possible.
Any ideas on algorithms or methods that would help me solve this problem other than pure brute force methods?

Here's an O(m n^2)-time algorithm, where m is the number of moves and n is the number of nodes.
For every time t in {0, 1, ..., m} and every node v, compute the maximum score of a t-step walk that begins at the start node and ends at v as follows. If t = 0, then there's only one walk, namely, doing nothing at the start node, so the maximum for (0, v) is 0 if v is the start node and -infinity (i.e., impossible) otherwise.
For t > 0, we use the entries for t - 1 to compute the entries for t. The (t, v) entry is the score for v plus the maximum, over all nodes w other than v, of the (t - 1, w) entry minus the transition penalty from w to v. In other words, an optimal t-step walk to v consists of a step from some node w to v preceded by a (t - 1)-step walk to w, and this (t - 1)-step walk must be optimal because history does not influence future scoring.
At the end, we look at the (m, start node) entry. Recovering the actual walk involves working backward and determining repeatedly which w was the best node to have come from.
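A minimal Python sketch of the table described above, following this answer's model in which every move is one step and the transition penalty is subtracted from the score. The names best_loop, score, penalty and parent are illustrative assumptions, not part of the original answer.

```python
def best_loop(score, penalty, start, m):
    """Return (best score, walk) for an m-step walk from start back to start.

    score[v] is the reward for arriving at v; penalty[w][v] is the cost of
    moving from w to v. Both are assumed inputs for this sketch.
    """
    n = len(score)
    NEG = float("-inf")
    dp = [[NEG] * n for _ in range(m + 1)]       # dp[t][v]: best t-step walk ending at v
    parent = [[-1] * n for _ in range(m + 1)]    # predecessor used to reach (t, v)
    dp[0][start] = 0
    for t in range(1, m + 1):
        for v in range(n):
            for w in range(n):
                if w == v or dp[t - 1][w] == NEG:    # no self-moves, skip impossible states
                    continue
                cand = dp[t - 1][w] - penalty[w][v] + score[v]
                if cand > dp[t][v]:
                    dp[t][v] = cand
                    parent[t][v] = w
    if dp[m][start] == NEG:
        return NEG, []
    # Work backward from (m, start) to recover the walk.
    walk, v = [start], start
    for t in range(m, 0, -1):
        v = parent[t][v]
        walk.append(v)
    walk.reverse()
    return dp[m][start], walk
```

For example, with score = [0, 5, 3] and penalty = [[0, 1, 1], [1, 0, 4], [1, 4, 0]], calling best_loop(score, penalty, 0, 2) returns (3, [0, 1, 0]).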

Related

Find the shortest path in a graph which visits all node types

I can't figure out how to proceed with the following problem.
Say I have an undirected graph with a start node and an end node. I need to find the shortest path between these two nodes, but the path must include all must-pass node types.
There can be up to 10 of these types. This means that I should visit at least one node of each type (marked with a letter in the image) and then go to the end. Once I visit one of the nodes of type B, I may, but need not, visit other nodes of type B. The nodes that are marked with a number simply form a path and do not need to be visited.
This question is very similar to this. There it was suggested to find the shortest path between all the crucial nodes and then use the DFS algorithm. Can I apply the same algorithm to this problem?
Let N be the number of vertices and M be the number of edges.
Break down the solution into two phases.
Phase 1:
Compute the distance between each pair of nodes. If the edges are not weighted, this can be easily done by starting a BFS from each node, spending a total of O(N(N+M)) time. If the edges are weighted, you can use Dijkstra's algorithm from each node, spending a total of O(N(NlogN+M)) time.
After this phase, we have computed dist(x,y) for any pair of nodes x,y.
Phase 2:
Now that we can query the distance between any pair of nodes in O(1) using the precomputed values in phase 1, it is time to put everything together. I can propose two possibilities here
Possibility 1:
Use a similar approach as in the thread you linked. There are 10! (10 factorial) orders in which you could visit the types. Let's say we have fixed one possible order [s,t1,t2,...,t10,e] (where s and e are the start / end nodes, and ti represents a type) and we are trying to find out what would be the optimal way to visit the nodes from start to finish in that order. Since each type can have multiple nodes belonging to it, it is not as simple as querying distances for every pair of consecutive types t_i and t_{i+1}.
What we should do instead is, for every node x, compute the fastest way to reach the end node from node x while respecting the order of types [s,t1,...,t10,e]. So if x is of type t_i, then we are looking for a way to reach e from x while visiting nodes of types t_{i+1}, t_{i+2}, ... t_{10} in that order. Call this value dp[x] (where dp stands for dynamic programming).
We are looking to find dp[s]. How do we compute dp[x] for a given node x? Simple: iterate through all nodes y of type t_{i+1} and consider dist(x,y) + dp[y]. Then we have dp[x] = min{dist(x,y) + dp[y] for all y of type t_{i+1}}. Note that we need to compute dp[x] starting from the nodes of type t_10 all the way back to the nodes of type t_1.
The complexity here is O(10! * N^2)
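The fixed-order recurrence can be sketched in Python as follows. This is only an illustration under assumptions: dist is the precomputed table from phase 1, nodes_of_type maps each type to a non-empty list of its nodes, and the function names are made up for this example.

```python
from itertools import permutations

def best_for_order(dist, nodes_of_type, order, s, e):
    """Cheapest s -> e route that visits one node of each type in the given order."""
    dp = {e: 0}                                  # dp[x]: cost from x to e over the remaining types
    for t in reversed(order):                    # fill from the last type back to the first
        dp = {x: min(dist[x][y] + dp[y] for y in dp) for x in nodes_of_type[t]}
    return min(dist[s][y] + dp[y] for y in dp)

def shortest_via_all_types(dist, nodes_of_type, s, e):
    """Try every order of the types (K! of them) and keep the cheapest."""
    return min(best_for_order(dist, nodes_of_type, order, s, e)
               for order in permutations(nodes_of_type))
```

Each call to best_for_order costs at most O(N^2), which is where the O(10! * N^2) bound comes from.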
Possibility 2:
There is actually a much faster way to find the answer and reduce the complexity to O(2^10 * N^3) (which can give massive gains for large N, and especially for larger number of node types (like 20 instead of 10)).
To accomplish this we do the following. For each subset S of the set of types {1,2,...10}, and for each pair of nodes x, y with types in S define
dp[S][x][y], which represents the fastest way to traverse the graph starting from node x, ending in node y and visiting at least one node of every type in S. Note that we don't care about the actual order. To compute dp[S][x][y] for a given (S,x,y), all we need to do is go over all the possibilities for the second node to visit (a node z with type t2 in S). Then we update dp[S][x][y] according to dist(x,z) + dp[S-t1][z][y] (where t1 is the type of the node x). The number of all the possible subsets along with start and end nodes is 2^10 * N^2. To compute each dp, we consider N possibilities for the second node to visit. So overall we get O(2^10 * N^3).
Note: in all of my analysis above, you can replace the value 10, with a more general K, representing the number of different types possible.
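Below is a hedged Python sketch of the subset idea. It is a leaner variant rather than the exact table above: because the start node is fixed, it indexes the table by (subset, current node) only and extends routes forward, which gives O(2^K * N^2). The encoding of types as bit positions and the name type_of are assumptions made for this example; dist is the phase 1 table.

```python
def shortest_covering_all_types(dist, type_of, s, e, num_types):
    """Cheapest s -> e route touching at least one node of every type.

    type_of[x] is an int in range(num_types), or None for untyped nodes.
    """
    INF = float("inf")
    full = (1 << num_types) - 1
    typed = [x for x in range(len(type_of)) if type_of[x] is not None]
    dp = [dict() for _ in range(full + 1)]       # dp[mask][x]: best cost from s to x covering mask
    start_mask = 0 if type_of[s] is None else 1 << type_of[s]
    dp[start_mask][s] = 0
    for mask in range(full + 1):                 # supersets are numerically larger, so dp[mask] is final here
        for x, cost in dp[mask].items():
            for y in typed:
                new_mask = mask | (1 << type_of[y])
                if new_mask == mask:             # y adds no new type; dist already covers pass-through
                    continue
                c = cost + dist[x][y]
                if c < dp[new_mask].get(y, INF):
                    dp[new_mask][y] = c
    return min((cost + dist[x][e] for x, cost in dp[full].items()), default=INF)
```

Skipping nodes that add no new type is safe only because dist holds true shortest-path distances, so any useful detour through such a node is already accounted for.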

Variations of Dijkstra's Algorithm for graphs with two weight properties

I'm trying to find a heuristic for a problem that is mapped to a directed graph with say non-negative weight edges. However, each edge is associated with two weight properties as opposed to only one weight (e.g. say one is distance, and another one showing how good the road's 4G LTE coverage is!). Is there any specific variation of dijkstra, Bellman Ford, or any other algorithm that pursues this objective? Of course, a naive workaround is manually deriving a single weight property as a combination of all of them, but this does not look good.
Can it be generalized to cases with multiple properties?
Say you want to optimize simultaneously two criteria: distance and attractiveness (and say path attractiveness is defined as the attractiveness of the most attractive edge, although you can think of different definitions). The following variation of Dijkstra can be shown to work, but I think it is mainly useful where one of the criteria takes a small number of values - say attractiveness is 1, ..., k for some small fixed k (smaller i is better).
The standard pseudocode for Dijkstra's algorithm uses a single priority queue. Instead use k priority queues. Priority queue i will correspond in Dijkstra's algorithm to the shortest path to a node v ∈ V with attractiveness i.
Start by initializing that each node is in each of the queues with distance ∞ (because, initially, the shortest path to v with attractiveness i is infinite).
In the main Dijkstra loop, where it says
while Q is not empty
change it to
while there is an i for which Q[i] is not empty
    Q = Q[i] for the lowest such i
and continue from there.
Note that when you update, you pop from queue Q[i], and insert to Q[j] for j ≥ i.
It's possible to modify the proof of Dijkstra's relaxation property to show that this works.
Note that you will obtain up to k |V| results, as per node and attractiveness, you can have the shortest distance to the node with the given attractiveness.
Example
Taking an example from the comments:
So basically if a path has a total no-coverage miles of >10, then we go for another path.
Here, e.g., assuming the miles are integers (or can be rounded to integers), we could create 11 queues: queue i corresponds to the shortest distance with i no-coverage miles, except for 10, which corresponds to 10-or-higher no-coverage-miles.
At some point of the algorithm, say all queues are empty below queue 3. We pop queue 3, and update the vertex's neighbors: this might update, e.g., some node in queue 4, if the distance from the popped node to the other node is 1.
As the algorithm runs, it outputs mappings of (node, no-coverage-distance) → shortest distance. Here, you could decide that you discard all mappings for which the second item in the pair is 10.
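A rough Python sketch of the multi-queue variant for the no-coverage example. Several things here are assumptions for illustration: each edge carries an integer number of no-coverage miles, levels are capped at cap, and entries are pushed lazily when first reached instead of pre-filling every queue with ∞.

```python
import heapq

def layered_dijkstra(adj, source, cap):
    """adj[u] = list of (v, length, bad_miles). Returns {(node, level): shortest distance}."""
    queues = [[] for _ in range(cap + 1)]        # queue i holds states whose level is i
    best = {(source, 0): 0}
    heapq.heappush(queues[0], (0, source))
    done = set()
    for i in range(cap + 1):                     # drain the lowest non-empty queue first
        q = queues[i]
        while q:
            d, u = heapq.heappop(q)
            if (u, i) in done:                   # stale heap entry
                continue
            done.add((u, i))
            for v, length, bad in adj.get(u, []):
                j = min(cap, i + bad)            # levels never decrease, so j >= i
                nd = d + length
                if nd < best.get((v, j), float("inf")):
                    best[(v, j)] = nd
                    heapq.heappush(queues[j], (nd, v))
    return best
```

Walking the queues in increasing order of i is equivalent to always picking the lowest non-empty queue, because a pop from Q[i] only ever inserts into Q[j] with j ≥ i. Mappings whose level equals cap can be discarded afterwards, as described above.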

Updating a tree and keeping track of the change in the nodes of some subtree

Problem:
You are given a rooted tree where each node is numbered from 1 to N. Initially each node contains some positive value, say X. Now we are to perform two types of operations on the tree. There are 100000 operations in total.
First Type:
Given a node nd and a positive integer V, you need to decrease the value of all the nodes by some amount. If a node is at a distance of d from the given node, then decrease its value by floor[V/(2^d)]. Do this for all the nodes.
That means the value of node nd will be decreased by V (i.e., floor[V/2^0]). The values of its nearest neighbours will be decreased by floor[V/2]. And so on.
Second Type:
You are given a node nd. You have to tell the number of nodes in the subtree rooted at nd whose value is positive.
Note: The number of nodes in the tree may be up to 100000 and the initial values, X, in the nodes may be up to 1000000000. But the value of V by which the decrement operation is to be performed will be at most 100000.
How can this be done efficiently? I am stuck with this problem for many days. Any help is appreciated.
My idea: I am thinking of solving this problem offline. I will store all the queries first. Then, if I can somehow find, for each and every node, the time (i.e., after which operation) when the node's value becomes less than or equal to zero (call it the death time), we can do some kind of binary search (probably using Binary Indexed Trees / Segment Trees) to answer all the queries of the second type. But the problem is that I am unable to find the death time for each node.
Also I have tried to solve it online using Heavy Light Decomposition but I am unable to solve it using it either.
Thanks!
Given a tree with vertex weights, there exists a vertex that, when chosen as the root, has subtrees whose weights are at most half of the total. This vertex is a "balanced separator".
Here's an O((n + k) polylog(n, k, D))-time algorithm, where n is the number of vertices and k is the number of operations and D is the maximum decrease. In the first phase, we compute the "death time" of each vertex. In the second, we count the live vertices.
To compute the death times, first split each decrease operation into O(log(D)) decrease operations whose arguments are powers of two between 1 and 2^floor(lg(D)) inclusive. Do the following recursively. Let v be a balanced separator, where the weight of a vertex is one plus the number of decrease operations on it. Compute distances from v, then determine, for each time and each power of two, the cumulative number of operations on v with that effective argument (i.e., if a vertex at distance 2 from v is decreased by 2^i, then record a -1 change in the 2^(i - 2) coefficient for v). Partition the operations and vertices by subtree. For each subtree, repeat this cumulative summary for operations originating in the subtree, but make the coefficients positive instead of negative. By putting the summary for a subtree together with v's summary, we determine the influence of decrease operations originating outside of the subtree. Finally, we recurse on each subtree.
Now, for each vertex w, we compute the death time using binary search. The decrease operations affecting w are given in a logarithmic number of summaries computed in the manner previously described, so the total cost for one vertex is log^2.
It sounds as though you, the question asker, know how the next part goes, but for the sake of completeness, I'll describe it. Do a preorder traversal to assign new labels to vertices and also compute for each vertex the interval of labels that comprises its subtree. Initialize a Fenwick tree mapping each vertex to one (live) or zero (dead), initially one. Put the death times and queries in a priority queue. To process a death, decrease the value of that vertex by one. To process a query, sum the values of vertices in the subtree interval.
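The counting phase could look roughly like this in Python, assuming the death times are already known. The input layout is an assumption for this sketch: children[v] lists the children of v, death[v] is the index of the operation after which v is dead (or None if it never dies), and queries is a list of (operation index, node). Events are sorted once instead of being pulled from a priority queue, which processes them in the same order.

```python
def count_alive(children, root, death, queries):
    """Return the answers to the subtree queries in order of operation index."""
    n = len(children)
    tin, tout = [0] * n, [0] * n                 # subtree of v = labels tin[v]..tout[v]
    label = 0
    stack = [(root, False)]
    while stack:                                 # iterative preorder traversal
        v, leaving = stack.pop()
        if leaving:
            tout[v] = label - 1
            continue
        tin[v] = label
        label += 1
        stack.append((v, True))
        for c in children[v]:
            stack.append((c, False))

    bit = [0] * (n + 1)                          # Fenwick tree over preorder labels

    def add(i, delta):
        i += 1
        while i <= n:
            bit[i] += delta
            i += i & -i

    def prefix(i):                               # sum over labels 0..i-1
        total = 0
        while i > 0:
            total += bit[i]
            i -= i & -i
        return total

    for i in range(n):
        add(i, 1)                                # every vertex starts alive
    events = [(t, 0, v) for v, t in enumerate(death) if t is not None]
    events += [(t, 1, v) for t, v in queries]
    answers = []
    for t, kind, v in sorted(events):            # deaths sort before queries at equal times
        if kind == 0:
            add(tin[v], -1)
        else:
            answers.append(prefix(tout[v] + 1) - prefix(tin[v]))
    return answers
```

Each death and each query costs O(log n), so this phase is O((n + k) log n) after sorting.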

optimal way to calculate all nodes at distance less than k from m given nodes

A graph of size n is given, along with a subset of size m of its nodes. Find all nodes which are at a distance <= k from ALL nodes of the subset.
E.g. A->B->C->D->E is the graph, subset = {A,C}, k = 2.
Now, E is at distance <= 2 from C, but not from A, so it should not be counted.
I thought of running Breadth First Search from each node in the subset, and taking the intersection of the respective answers.
Can it be further optimized?
I went through many posts on SO, but they all point to k-d trees, which I don't understand, so is there any other way?
I can think of two non-asymptotic (I believe) optimizations:
If you're done with BFS from one of the subset nodes, delete all nodes that have distance > k from it
Start with the two nodes in the subset whose distance is largest to get the smallest possible leftover graph
Of course this doesn't help if k is large (close to n); I have no idea in that case. I am positive, however, that k-d trees are not applicable to general graphs :)
Nicklas B's optimizations can be applied to both of the following optimizations.
Optimization #1: Modify BFS to do the intersection as it runs rather than afterwards.
The BFS and intersection seems to be the way to go. However, there is redundant work being done by the BFS. Specifically, it is expanding nodes that it doesn't need to expand (after the first BFS). This can be resolved by merging the intersection aspect into the BFS.
The solution seems to be to keep two sets of nodes, call them "ToVisit" and "Visited", rather than labelling nodes as visited or not.
The new rules of the BFS are as follows:
Only nodes in ToVisit are expanded upon by the BFS. They are then moved from ToVisit to Visited to prevent them being expanded twice.
The algorithm returns the Visited set as its result and any nodes left in ToVisit are discarded. The result is then used as the ToVisit set for the next node.
The first node either uses a standard BFS algorithm or ToVisit is the list of all nodes. Either way, the result becomes the ToVisit set for the second node.
It works better if the ToVisit set is small on average, which tends to be the case if m and k are much less than N.
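As a conservative illustration of the BFS-and-intersect approach, here is a Python sketch that keeps a running candidate set and stops early once it is empty. It deliberately makes one assumption that differs from the pruning ideas above: every BFS still traverses the full graph (depth-limited to k), because a shortest path to a surviving candidate may pass through nodes that were already discarded. The function name and the adjacency-dict input are made up for this example, and the subset is assumed to be non-empty.

```python
from collections import deque

def within_k_of_all(adj, subset, k):
    """Nodes whose distance from every node in subset is at most k (unweighted graph)."""
    candidates = None
    for src in subset:
        dist = {src: 0}
        queue = deque([src])
        while queue:                             # BFS limited to depth k
            u = queue.popleft()
            if dist[u] == k:
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        reached = set(dist)
        candidates = reached if candidates is None else candidates & reached
        if not candidates:                       # nothing can survive further intersections
            break
    return candidates or set()
```

On the undirected path A-B-C-D-E with subset {A, C} and k = 2 this returns {A, B, C}.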
Optimization #2: Pre-compute the distances if there are enough queries so queries just do intersections.
This is incompatible with the first optimization, however. If there are a sufficient number of queries on differing subsets and k values, then it is better to find the distances between every pair of nodes ahead of time at a cost of O(VE).
This way you only need to do the intersections, which is O(V*M*Q), where Q is the number of queries, M is the average size of the subset over the queries and V is the number of nodes. If it is expected to be the case that O(M*Q) > O(E), then this approach should be less work. Noting the distance between the two most distant nodes is useful, as any k equal to or higher than it will always return the set of all vertices, resulting in just O(V) for the query cost in that case.
The distance data should then be stored in four forms.
The first is "kCount[A][k] = number of nodes with distance k or less from A". This provides an alternative to Niklas B.'s suggestion of "Start with the two nodes in the subset whose distance is largest to get the smallest possible leftover graph" in the case that O(m) > O(sqrt(V)) since finding the smallest is O(m^2) and it may be better to avoid trying to find the best choice for the starting pair and just pick a good choice. You can start with the two nodes in the subset with the smallest value for the given k in this data structure. You could also just sort the nodes in the subset by this metric and do the intersections in that order.
The second is "kMax[A] = max k for A", which can be done using a hashmap/dictionary. If k >= this value, then this node can be skipped unless kCount[A][kMax[A]] < (number of vertices), meaning not all nodes are reachable from A.
The third is "kFrom[A][k] = set of nodes k distance from A". Since k is valid from 0 to the max distance, a hashmap/dictionary to an array/list could be used here rather than a nested hashmap/dictionary. This allows for space- and time-efficient* creation of the set of nodes with distance <= k from A.
The fourth is "dist[A][B] = distance from A to B", which can be done using a nested hashmap/dictionary. This allows for handling the intersection checks fairly quickly.
* If space isn't an issue, then this structure can store all the nodes k or less distance from A, but that requires O(V^3) space and thus time. The main benefit, however, is that it allows for also storing a separate list of nodes that are greater than k distance. This allows the algorithm to use the smaller of the sets, dist > k or dist <= k: an intersection in the case of dist <= k, set subtraction in the case of dist > k, or intersection then set subtraction if the main set has the minimum size.
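Add a new node (let's say s) and connect it to all the m given nodes.
Then find all the nodes which are at a distance less than or equal to k+1 from s with a single BFS, and subtract the m given nodes from the result. T(n) = O(V+E). Note that the distance from s measures the distance to the nearest of the m nodes, so this yields the nodes within k of at least one subset node; being within k of ALL of them still has to be checked separately.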

Finding a desired point in a connected graph

There is problem, I reduce it to a question as below:
In a connected undirected graph, the edge weight is the time to go from one end to the other. Some people stand on some vertices. Now they want to meet: find a place (vertex) such that, within a certain time T, all the people can arrive at this assembly point. Try to minimise this T.
More information, if you need it for edge cases: no negative edges; cycles may exist; more than one person can stay on the same vertex; a vertex may have no person; edges are undirected, so the weight applies to both u->v and v->u; people start from their initial locations.
How to efficiently find it? Should I, for every node v, calculate max(SPD(ui, v)) where ui are the people's locations, then choose the minimum one among these max times? Is there a better way?
I believe it could be done within a polynomial runtime bound as follows. In a first pass, solve the All-Pairs Shortest Path problem to obtain a matrix with the lengths of the shortest paths between all vertices; afterwards iterate over the columns (or rows) and select the column whose maximum entry, taken over the rows on which people are located, is smallest.
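A small Python sketch of this min-of-max selection, using Floyd-Warshall for the all-pairs pass. The function name and the input format (n vertices numbered from 0, edges as (u, v, weight) triples, people as a list of vertex indices) are assumptions for illustration.

```python
def meeting_vertex(n, edges, people):
    """Return (vertex, T): the vertex minimising the latest arrival time T."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:                        # undirected: fill both directions
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):                           # Floyd-Warshall, O(n^3)
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # For each candidate vertex take the slowest person, then pick the best candidate.
    best = min(range(n), key=lambda v: max(d[p][v] for p in people))
    return best, max(d[p][best] for p in people)
```

On the path 0-1-2-3 with weights 2, 3, 4 and people on vertices 0 and 3, this returns (2, 5).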
It can be done by running Dijkstra's algorithm in parallel from all vertices where people stand, and stopping when the sets of visited nodes intersect in one node. The intersection can be checked by counting. Algorithm sketch:
node_count = [0] * number_of_nodes  # number of visited sets each node is in
dijkstras = set of objects D_n performing Dijkstra's algorithm starting from each occupied node n
queue = priority queue that stores tuples (first_in_queue_n, D_n),
        ordered by the distance of first_in_queue_n from n
        first_in_queue_n is the next node that will be visited by D_n
        initialized by D_n.first_in_queue() for every D_n
while queue is not empty:
    first_in_queue_n, D_n = queue.pop_min()
    node_count[first_in_queue_n] += 1
    if node_count[first_in_queue_n] == number_of_occupied_nodes:
        return first_in_queue_n
    D_n.visit_node(first_in_queue_n)
    queue.add( D_n.first_in_queue() )
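One possible runnable reading of the sketch above, in Python. Instead of separate D_n objects it keeps a single shared heap of (distance, search index, node) entries, which processes the searches in the same global order; that design, the function name and the adjacency format adj[u] = [(v, weight), ...] are assumptions of this example.

```python
import heapq

def meet(adj, people):
    """Return (vertex, T) for the first vertex settled by every search, or None."""
    starts = sorted(set(people))                 # one search per occupied vertex
    count = {}                                   # node -> number of searches that settled it
    settled = [set() for _ in starts]
    heap = [(0, i, p) for i, p in enumerate(starts)]
    heapq.heapify(heap)
    while heap:
        d, i, u = heapq.heappop(heap)            # globally smallest tentative distance
        if u in settled[i]:
            continue                             # stale entry for search i
        settled[i].add(u)
        count[u] = count.get(u, 0) + 1
        if count[u] == len(starts):
            return u, d                          # everyone can reach u within time d
        for v, w in adj[u]:
            if v not in settled[i]:
                heapq.heappush(heap, (d + w, i, v))
    return None
```

Because entries are popped in nondecreasing distance, the first vertex settled by all searches is one that minimises the latest arrival time. Node labels must be mutually comparable (for example integers) so that heap ties can be broken.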
