Queries for minimum fuel needed to travel from U to V - algorithm

Given a tree with N nodes.
Each edge of the tree has:
D : the length of the edge
T : the gold that must be paid to go through that edge (the gold is paid before going through the edge)
When moving through an edge while carrying X gold, you need X*D fuel.
There are 2 types of queries:
u, v: find the fuel needed to transfer G golds from u to v (G is fixed among all queries)
u, v, x: update T of edge {u,v} to x ({u, v} is guaranteed to be in the tree)
2 ≤ N ≤ 100.000
1 ≤ Q ≤ 100.000
1 ≤ Ai, Bi ≤ N
1 ≤ D, T, G ≤ 10^9
N = 6, G = 2
Take query 1 with u = 3 and v = 6 as an example. First, you start at node 3 with 11 gold, pay 2, leaving 9, and go to node 2 using 9*1 = 9 fuel. Next, you pay 3 gold, leaving 6, and go to node 4 using 6*2 = 12 fuel. Finally, you pay 4, leaving 2 gold, and go to node 6 using 2*1 = 2 fuel. So the fuel needed is 9 + 12 + 2 = 23.
So the answer to query: u = 3, v = 6 would be 23
The second query is just updating T of the edge so I think there's no need for explanation.
My take
I was only able to solve the problem in O(N*Q). Since it's a tree, there's only 1 path from u to v, so for each query, I do a DFS to find the fuel needed to go from u to v. Here's the code for that subtask: https://ideone.com/SyINTQ
In the special case where all T are 0, we just need to find the length of the path from u to v and multiply it by G. That length can be found easily using a distance array and LCA. I think this could be a hint for the proper solution.
Is there a way to do the queries in logN or less?
P/S: Please comment if anything needs to be clarified, and sorry for my bad English.

This answer will explain my matrix group comment in detail and then
sketch the standard data structures needed to make it work.
Let’s suppose that we’re carrying Gold and have burned Fuel so far.
If we traverse an edge with parameters Distance, Toll, then the
effect is
Gold -= Toll
Fuel += Gold * Distance,
or as a functional program,
Gold' = Gold - Toll
Fuel' = Fuel + Gold' * Distance
      = Fuel + Gold * Distance - Toll * Distance.
The latter code fragment defines what mathematicians call an action:
each Distance, Toll gives rise to a function from Gold, Fuel to
Gold, Fuel.
Now, whenever we have two functions from a domain to that same domain,
we can compose them (apply one after the other):
Gold' = Gold - Toll1
Fuel' = Fuel + Gold' * Distance1,
Gold'' = Gold' - Toll2
Fuel'' = Fuel' + Gold'' * Distance2.
The point of this math is that we can expand the definitions:
Gold'' = Gold - Toll1 - Toll2
= Gold - (Toll1 + Toll2),
Fuel'' = Fuel' + (Gold - (Toll1 + Toll2)) * Distance2
= Fuel + (Gold - Toll1) * Distance1 + (Gold - (Toll1 + Toll2)) * Distance2
= Fuel + Gold * (Distance1 + Distance2) - (Toll1 * Distance1 + (Toll1 + Toll2 ) * Distance2).
I’ve tried to express Fuel'' in the same form as before: the
composition has “Distance” Distance1 + Distance2 and “Toll”
Toll1 + Toll2, but the last term doesn’t fit the pattern. What we can
do, however, is add another parameter, FuelSaved and define it to be
Toll * Distance for each of the input edges. The generalized update
rule is
Gold' = Gold - Toll
Fuel' = Fuel + Gold * Distance - FuelSaved.
I’ll let you work out the generalized composition rule for
Distance1, Toll1, FuelSaved1 and Distance2, Toll2, FuelSaved2.
Suffice it to say, we can embed Gold, Fuel as a column vector
{1, Gold, Fuel}, and parameters Distance, Toll, FuelSaved as a unit
lower triangular matrix
{{1, 0, 0}, {-Toll, 1, 0}, {-FuelSaved, Distance, 1}}. Then
composition is matrix multiplication.
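As a minimal sketch of that composition, the triples below stand for (Distance, Toll, FuelSaved), and the rule in `compose` is what you get by substituting the generalized update rule into itself. The function names `edge`, `compose` and `apply` are illustrative, not part of the original answer.

```python
from functools import reduce

def edge(distance, toll):
    # Parameters (Distance, Toll, FuelSaved) of a single edge.
    return (distance, toll, toll * distance)

def compose(first, second):
    # Parameters of "traverse `first`, then `second`".
    d1, t1, s1 = first
    d2, t2, s2 = second
    return (d1 + d2, t1 + t2, s1 + s2 + t1 * d2)

def apply(params, gold, fuel=0):
    # Generalized update rule: returns (Gold', Fuel').
    d, t, s = params
    return (gold - t, fuel + gold * d - s)

# Path 3 -> 2 -> 4 -> 6 from the question: (D, T) = (1, 2), (2, 3), (1, 4).
path = reduce(compose, [edge(1, 2), edge(2, 3), edge(1, 4)])
print(path)                  # (4, 9, 21)
print(apply(path, gold=11))  # (2, 23)
```

Starting with 11 gold, the composed path leaves 2 gold and burns 23 fuel, which agrees with the worked example in the question.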
Now, so far we only have a semigroup. I could take it from here with
data structures, but they’re more complicated when we don’t have an
analog of subtraction (for intuition, compare the problems of finding
the sum of each length-k window in an array with finding the max).
Happily, there is a useful notion of undoing a traversal here (inverse).
We can derive it by solving for Gold, Fuel from Gold', Fuel':
Gold = Gold' + Toll
Fuel = Fuel' - Gold * Distance + FuelSaved,
Fuel = Fuel' + Gold' * (-Distance) - (Toll * Distance - FuelSaved)
and reading off the inverse parameters.
I promised a sketch of the data structures, so here we are. Root the
tree anywhere. It suffices to be able to
Given nodes u and v, query the leafmost common ancestor of u and v;
Given a node u, query the parameters to get from u to the root;
Given a node v, query the parameters to get from the root to v;
Update the toll on an edge.
Then to answer a query u, v, we query their leafmost common ancestor w
and return the fuel cost of the composition (u to root) (w to root)⁻¹
(root to w)⁻¹ (root to v) where ⁻¹ means “take the inverse”.
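A small sketch of that four-part composition, assuming parameter triples (Distance, Toll, FuelSaved) and an inverse obtained by solving the update rule backwards. The tree is the one from the question, rooted at node 5 purely for illustration; the edge 4-5 with D = 2, T = 2 is an assumption, and the names `edge`, `compose`, `inverse` and `fuel` are hypothetical.

```python
def edge(distance, toll):
    return (distance, toll, toll * distance)

def compose(first, second):
    # Traverse `first`, then `second`.
    d1, t1, s1 = first
    d2, t2, s2 = second
    return (d1 + d2, t1 + t2, s1 + s2 + t1 * d2)

def inverse(params):
    # Undo a traversal: compose(p, inverse(p)) == (0, 0, 0).
    d, t, s = params
    return (-d, -t, t * d - s)

def fuel(params, gold):
    # Fuel burned when starting the traversal with `gold`.
    d, t, s = params
    return gold * d - s

# Query u = 3, v = 6 with the tree rooted at 5, so w = 4.
u_to_root = compose(compose(edge(1, 2), edge(2, 3)), edge(2, 2))  # 3 -> 2 -> 4 -> 5
w_to_root = edge(2, 2)                                           # 4 -> 5
root_to_w = edge(2, 2)                                           # 5 -> 4
root_to_v = compose(edge(2, 2), edge(1, 4))                      # 5 -> 4 -> 6

query = compose(compose(compose(u_to_root, inverse(w_to_root)),
                        inverse(root_to_w)), root_to_v)
G = 2
start_gold = G + query[1]       # gold to deliver plus all tolls on the path
print(query)                    # (4, 9, 21)
print(fuel(query, start_gold))  # 23
```

In a real solution the four root-path composites would come from the data structure rather than being built by hand as they are here.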
The full-on sledgehammer approach here is to implement dynamic trees,
which will do all of these things in amortized logarithmic time per
operation. But we don’t need dynamic topology updates and can probably
afford an extra log factor, so a set of more easily digestible pieces
would be leafmost common ancestors, heavy path decomposition, and
segment trees (one per path; Fenwick is potentially another option, but
I’m not sure what complications a noncommutative operation might
introduce).

I said in the comments that Dijkstra's algorithm was necessary, but on reflection a DFS is enough: there is only one path between each pair of vertices, so we always have to walk that whole path from the starting point to the endpoint.
Using a priority queue instead of a stack would only change the order in which the graph is explored; in the worst case it would still visit all the vertices.
Using a queue instead of a stack would make the algorithm a breadth-first search, which again only changes the order in which the graph is explored.
If the number of nodes within a given distance grows exponentially with that distance, the typical case could be improved by running two searches and meeting in the middle, but only by a constant factor.
So I think it is better to go with the simple solution; implementing it in C/C++ will result in a program dozens of times faster.
Prepare the adjacency lists, making the graph undirected:
from collections import defaultdict
def process_edges(rows):
    edges = defaultdict(list)
    for u, v, D, T in rows:
        # store every edge in both directions
        edges[u].append((v, (D, T)))
        edges[v].append((u, (D, T)))
    return edges
It is interesting to do the search backwards because the amount of gold is fixed at the destination, and unknown at the origin, then we can calculate the exact amount of gold and fuel required for each node going backwards.
Of course you can remove the print statement I left there
def dfs(edges, a, b, G):
    Q = [((0, G), b)]
    visited = set()
    while len(Q) != 0:
        ((Fu, Gu), current_vertex) = Q.pop()
        visited.add(current_vertex)
        for neighbor, (D, T) in edges[current_vertex]:
            if neighbor in visited:
                continue  # avoid going backwards
            Fv = Fu + Gu * D  # fuel for this edge: the gold carried after paying the tax
            Gv = Gu + T  # add the tax of the edge to the gold budget
            print(neighbor, (Fv, Gv))
            if neighbor == a:
                return (Fv, Gv)
            Q.append(((Fv, Gv), neighbor))
Running your example
edges = process_edges([
Will print:
4 (2, 6)
5 (14, 8)
2 (14, 9)
3 (23, 11)
and return (23, 11). It means that the route from 3 to 6 requires 11 gold and uses 23 fuel, which matches the example in the question.


Find largest ones after setting a coordinate to one

Interview Question:
You are given a grid of ones and zeros. You can arbitrarily select any point in that grid. You have to write a function which does two things:
If you choose e.g. coordinate (3,4) and it is zero you need to flip
that to a one. If it is a one you need to flip that to a zero.
You need to return the largest contiguous region
with the most ones i.e. ones have to be at least connected to
another one.
We have the largest region being the 3 ones. We have another region which have only one one (found at coordinate (2,0)).
You are to find an algorithm that will solve this where you will call that function many times. You need to ensure that your amortized run time is the lowest you can achieve.
My Solution which has Time Complexity:O(num_row*num_col) each time this function is called:
import collections

def get_all_coordinates_of_ones(grid):
    set_ones = set()
    for i in range(len(grid)):          # rows
        for j in range(len(grid[0])):   # columns
            if grid[i][j]:
                set_ones.add((i, j))
    return set_ones

def get_largest_region(x, y, grid):
    num_row = len(grid)
    num_col = len(grid[0])
    # flip the selected cell (0 -> 1, 1 -> 0)
    grid[x][y] = 1 - grid[x][y]
    # get the coordinates of ones in the grid
    # Worst Case O(num_col * num_row)
    coordinates_ones = get_all_coordinates_of_ones(grid)
    largest_one = 0
    while coordinates_ones:
        # BFS over one region; a cell leaves coordinates_ones as soon as it is queued
        queue = collections.deque([coordinates_ones.pop()])
        count_one = 1
        while queue:
            x, y = queue.popleft()
            for new_x, new_y in ((x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y)):
                if 0 <= new_x < num_row and 0 <= new_y < num_col:
                    if grid[new_x][new_y] == 1 and (new_x, new_y) in coordinates_ones:
                        coordinates_ones.remove((new_x, new_y))
                        count_one += 1
                        queue.append((new_x, new_y))
        largest_one = max(largest_one, count_one)
    return largest_one
My Proposed modifications:
Use Union Find by rank: union all the ones that are adjacent to each other. I ran into a problem, though: when one of the
coordinates is flipped from one to zero, I need to remove that coordinate from the region it is connected to.
Questions are:
What is the fastest algorithm in terms of time complexity?
Using Union Find with rank entails removing a node. Is this the way to improve the time complexity? If so, is there an implementation of removing a node from a union-find structure available online?
------------------------ EDIT ---------------------------------
Should we always subtract one from the degree from sum(degree-1 of each 'cut' vertex). Here are two examples the first one where we need to subtract one and the second one where we do not need to subtract one:
Block Cut Tree example 1
Cut vertex is vertex B. Degree of vertex B in the block cut tree is 2.
Sum(cardinality of each 'block' vertex) : 2(A,B) + 1(B) + 3 (B,C,D) = 6
Sum(degree of each 'cut' vertex) : 1 (B)
Block cut size: 6 – 1 = 5 but should be 4 (A, B, C, D). Here we need to subtract one more.
Block Cut Tree Example 2
Sum(cardinality of each 'block' vertex) : 3 (A,B,C) + 1(C) + 1(D) + 3 (D, E, F) = 8
Sum(degree of each 'cut' vertex) : 2 (C and D)
Block cut size: 8 – 2 = 6 which is (A, B, C, D, E, F). Here there is no need to subtract one.
Without preprocessing:
Flip the cell in the matrix.
Consider the matrix as a graph where each '1' represents a node, and neighbor nodes are connected with an edge.
Find all connected components. For each connected component - store its cardinality.
Return the highest cardinality.
Note that O(V) = O(E) = O(num_row*num_col).
Step 3 takes O(V+E)=O(num_row*num_col), which is similar to your solution.
You are to find an algorithm that will solve this where you will call that function many times. You need to ensure that your amortized run time is the lowest you can achieve.
That hints that you can benefit from preprocessing:
Consider the original matrix as a graph G where each '1' represents a node, and neighbor nodes are connected with an edge.
Find all connected components
Construct the set of block-cut trees of G (one block-cut tree for each connected component of G).
If you flip a '0' cell to '1':
Find neighbor connected components (0 to 4)
Remove old block-cut trees, construct a new block-cut tree for the merged component (Optimizations are possible: in some cases, previous tree(s) may be updated instead of reconstructed).
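For the '0' to '1' case on its own, the merge of neighbouring components can be sketched with the union-find structure mentioned in the question. This is only a sketch of the size bookkeeping: it does not maintain block-cut trees and cannot handle a '1' to '0' flip. The class name `GridComponents` and its methods are hypothetical.

```python
class GridComponents:
    """Tracks connected regions of ones while cells are switched on."""

    def __init__(self):
        self.parent = {}   # cell -> parent cell in the union-find forest
        self.size = {}     # root cell -> number of cells in its region
        self.largest = 0

    def find(self, cell):
        while self.parent[cell] != cell:
            self.parent[cell] = self.parent[self.parent[cell]]  # path halving
            cell = self.parent[cell]
        return cell

    def set_one(self, x, y):
        """Switch cell (x, y) on and return the size of the largest region."""
        cell = (x, y)
        if cell in self.parent:          # already a one: nothing changes
            return self.largest
        self.parent[cell] = cell
        self.size[cell] = 1
        for neighbor in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbor in self.parent:
                a, b = self.find(cell), self.find(neighbor)
                if a != b:
                    if self.size[a] < self.size[b]:
                        a, b = b, a      # union by size: attach small to large
                    self.parent[b] = a
                    self.size[a] += self.size[b]
        self.largest = max(self.largest, self.size[self.find(cell)])
        return self.largest

grid = GridComponents()
print(grid.set_one(0, 0))  # 1
print(grid.set_one(0, 2))  # 1
print(grid.set_one(0, 1))  # 3, the new cell joins both neighbours
```

Each call touches at most four neighbours, so switching a cell on costs near-constant amortized time.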
If you flip a '1' cell to '0':
If this cell is a 'cut' in a block-cut tree:
remove it from the block-cut-tree
remove it from each neighbor 'cut' vertex
split the block-cut-tree into several block-cut trees
Otherwise (this cell is part of only one 'block vertex')
remove it from the 'block' vertex; if empty - remove vertex. If block-cut-tree empty - remove it from the set of trees.
The size of a block-cut tree = sum(cardinality of each 'block' vertex) - sum(neighbor_blocks-1 of each 'cut' vertex).
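The size formula can be checked with a short sketch. It assumes each block is given as the set of cells it contains, cut vertices included, so a cut vertex is simply a cell that appears in more than one block. The function name `component_size` and the sample blocks are made up for illustration.

```python
def component_size(blocks):
    """Number of distinct cells in one component, given its blocks as sets."""
    total = sum(len(block) for block in blocks)
    # Count how many blocks contain each cell; cut vertices appear more than once.
    membership = {}
    for block in blocks:
        for cell in block:
            membership[cell] = membership.get(cell, 0) + 1
    # Every cut vertex was counted once per neighbouring block: remove the extras.
    overcount = sum(count - 1 for count in membership.values())
    return total - overcount

blocks = [{"A", "B", "C"}, {"C", "D"}, {"D", "E", "F"}]
print(component_size(blocks))  # 6
```

Here the cardinalities add up to 8 and the two cut vertices C and D each contribute one extra count, giving 8 - 2 = 6 cells.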
Block-cut trees are not 'well known' as other data structures, so I'm not sure if this is what the interviewer had in mind. If it is - they're really looking for someone well experienced with graph algorithms.

Dynamic Shortest Path

Your friends are planning an expedition to a small town deep in the Canadian north
next winter break. They’ve researched all the travel options and have drawn up a directed
graph whose nodes represent intermediate destinations and edges represent the roads between them.
In the course of this, they’ve also learned that extreme weather causes roads in this part of
the world to become quite slow in the winter and may cause large travel delays. They’ve
found an excellent travel Web site that can accurately predict how fast they’ll be able to
travel along the roads; however, the speed of travel depends on the time of the year. More
precisely, the Web site answers queries of the following form: given an edge e = (u, v)
connecting two sites u and v, and given a proposed starting time t from location u, the
site will return a value fe(t), the predicted arrival time at v. The web site guarantees that
fe(t) > t for every edge e and every time t (you can’t travel backward in time), and that
fe(t) is a monotone increasing function of t (that is, you do not arrive earlier by starting
later). Other than that, the functions fe may be arbitrary. For example, in areas where the
travel time does not vary with the season, we would have fe(t) = t + ℓe, where ℓe is the
time needed to travel from the beginning to the end of the edge e.
Your friends want to use the Web site to determine the fastest way to travel through the
directed graph from their starting point to their intended destination. (You should assume
that they start at time 0 and that all predictions made by the Web site are completely
correct.) Give a polynomial-time algorithm to do this, where we treat a single query to
the Web site (based on a specific edge e and a time t) as taking a single computational step.
import math
import random

def updatepath(node):
    # add a random delay to every edge leaving `node` (uses the globals `distance` and `n`)
    randomvalue = random.randint(0, 3)
    print(node, "to other node:", randomvalue)
    for i in range(0, n):
        distance[node][i] = distance[node][i] + randomvalue

def minDistance(dist, flag_array, n):
    # index of the unvisited node with the smallest tentative distance, or -1
    min_value = math.inf
    min_index = -1
    for i in range(0, n):
        if dist[i] < min_value and flag_array[i] == False:
            min_value = dist[i]
            min_index = i
    return min_index

def shortest_path(graph, src, n):
    dist = [math.inf] * n
    flag_array = [False] * n
    path = [None] * n  # predecessor of each node on its shortest path
    dist[src] = 0
    for cout in range(n):
        # find the node index that has the min cost
        u = minDistance(dist, flag_array, n)
        if u == -1:
            break  # the remaining nodes are unreachable
        flag_array[u] = True
        for i in range(n):
            if graph[u][i] > 0 and flag_array[i] == False and dist[i] > dist[u] + graph[u][i]:
                dist[i] = dist[u] + graph[u][i]
                path[i] = u
    return dist
I applied Dijkstra's algorithm, but it is not correct. What should I change in my algorithm to make it work for dynamically changing edges?
Well, the key point is that the function is monotonically increasing. There is an algorithm which exploits this property, and it is called A*.
Accumulated cost: your prof wants you to use two distances. One is the accumulated cost (this is simply the cost so far plus the cost/time needed to move to the next node).
Heuristic cost: this is some predicted cost.
The Dijkstra approach would not work because you are working with both a heuristic (predicted) cost and an accumulated cost.
Monotonically increasing means h(A) <= h(A) + f(A..B). It simply says that if you move from node A to node B, then the cost (heuristic + accumulated) should not be less than at the previous node (in this case A). If this property holds, then the first path which A* chooses is always the path to the goal and it never needs to backtrack.
Note: the power of this algorithm depends entirely on how you predict the value.
If you underestimate the value, that will be corrected by the accumulated value, but if you overestimate it, the algorithm will choose the wrong path.
Create a min priority queue pq.
Insert the initial city into pq.
while(!pq.isEmpty() && !Goalfound)
    Node min = pq.delMin() // this should return the city whose distance (heuristic + accumulated) is minimal
    put all successors of min in pq // all cities which you can reach; it is better to keep a list of visited cities so that the queue stays efficient by not placing the same element twice
Keep doing this and at the end you will either reach the goal or your queue will be empty.
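The loop above can be sketched in Python for the time-dependent setting of the question. This is a sketch under assumptions: each edge carries a callable `f_e(t)` standing in for a query to the Web site, and the names `earliest_arrival` and `heuristic` are illustrative. With the default heuristic of 0, the priority is just the accumulated arrival time.

```python
import heapq

def earliest_arrival(graph, source, goal, heuristic=lambda node: 0):
    """graph[u] is a list of (v, f_e) pairs, where f_e(t) is the arrival
    time at v when leaving u at time t. Returns the earliest arrival time
    at goal when starting from source at time 0, or None if unreachable."""
    arrival = {source: 0}
    queue = [(heuristic(source), source)]
    closed = set()
    while queue:
        _, u = heapq.heappop(queue)
        if u == goal:
            return arrival[u]
        if u in closed:
            continue  # stale queue entry
        closed.add(u)
        for v, f_e in graph.get(u, []):
            t = f_e(arrival[u])  # one query to the travel site
            if t < arrival.get(v, float("inf")):
                arrival[v] = t
                heapq.heappush(queue, (t + heuristic(v), v))
    return None

graph = {
    "A": [("B", lambda t: t + 5), ("C", lambda t: t + 1)],
    "C": [("B", lambda t: max(t + 1, 4))],  # this road cannot be finished before time 4
    "B": [],
}
print(earliest_arrival(graph, "A", "B"))  # 4
```

Leaving a node as early as possible is never worse here because every fe is monotone increasing, which is what lets each node be closed the first time it is popped when the heuristic is 0.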
Here I implemented an 8-puzzle solver using A*; it can give you an idea of how the costs are defined and how it works.
private void solve(MinPQ<Node> pq, HashSet<Node> closedList) {
Node e = pq.delMin();
for(Board boards: e.getBoad().neighbors()){
Node nextNode = new Node(boards,e,e.getMoves()+1);
Node collection = pq.delMin();
while(!(collection.getPreviousNode() == null)){
collection =collection.getPreviousNode();
A link to full code is here.

Towers of Hanoi - Bellman equation solution

I have to implement an algorithm that solves the Towers of Hanoi game for k pods and d rings in a limited number of moves (let's say 4 pods, 10 rings, 50 moves for example) using Bellman dynamic programming equation (if the problem is solvable of course).
Now, I understand the logic behind the equation:
where V^T is the objective function at time T, a^0 is the action at time 0, x^0 is the starting configuration, H_0 is cumulative gain f(x^0, a^0)=x^1.
The cardinality of the state space is $k^d$, and I get that a good representation for a state is a number in base k with d digits. Each digit represents a ring, and its value, from 0 to k-1, is the label of the pod that the ring is on.
I want to minimize the number of moves for going from the initial configuration (10 rings on the first pod) to the end one (10 rings on the last pod).
What I don't get is: how do I write my objective function?
The first thing you need to do is choose a reward function H_t(x,a) which will define your goal. Once this function is chosen, the (optimal) value function is defined and all you have to do is compute it.
The idea of dynamic programming for the Bellman equation is that you should compute V_t(x) bottom-up: you start with t=T, then t=T-1 and so on until t=0.
The initial case is simply given by:
V_T(x) = 0, ∀x
You can compute V_{T-1}(x) ∀x from V_T:
V_{T-1}(x) = max_a [ H_{T-1}(x,a) ]
Then you can compute V_{T-2}(x) ∀x from V_{T-1}:
V_{T-2}(x) = max_a [ H_{T-2}(x,a) + V_{T-1}(f(x,a)) ]
And you keep on computing V_{t-1}(x) ∀x from V_{t}:
V_{t-1}(x) = max_a [ H_{t-1}(x,a) + V_{t}(f(x,a)) ]
until you reach V_0.
Which gives the algorithm:
forall x:
    V[T](x) ← 0
for t from T-1 to 0:
    forall x:
        V[t](x) ← max_a { H[t](x,a) + V[t+1](f(x,a)) }
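A minimal sketch of this backward induction on a small Towers of Hanoi instance. It makes assumptions that are not in the answer: the reward is -1 per move until the goal is reached, the goal is absorbing with reward 0, and V_T is minus infinity for non-goal states so that only plans finishing within T moves count. The names `moves` and `min_moves` are illustrative.

```python
from itertools import product

NEG_INF = float("-inf")

def moves(state, k):
    """Yield every state reachable in one move.
    state[r] is the pod holding ring r; ring 0 is the smallest."""
    tops = {}
    for ring, pod in enumerate(state):
        tops.setdefault(pod, ring)  # the smallest ring on a pod is on top
    for src, ring in tops.items():
        for dst in range(k):
            # a ring may only go onto an empty pod or onto a larger ring
            if dst != src and tops.get(dst, len(state)) > ring:
                yield state[:ring] + (dst,) + state[ring + 1:]

def min_moves(k, d, T):
    """Fewest moves (at most T) to shift d rings from pod 0 to pod k-1, or None."""
    start, goal = (0,) * d, (k - 1,) * d
    states = list(product(range(k), repeat=d))
    # V_T: only the goal is an acceptable final state
    V = {x: (0 if x == goal else NEG_INF) for x in states}
    for _ in range(T):  # t = T-1, ..., 0
        V = {
            x: 0 if x == goal
            else max((-1 + V[y] for y in moves(x, k)), default=NEG_INF)
            for x in states
        }
    return None if V[start] == NEG_INF else -V[start]

print(min_moves(3, 3, 7))   # 7
print(min_moves(4, 3, 10))  # 5
```

The table has k^d entries per time step, so this direct form is only practical for small instances; 4 pods and 10 rings already means about a million states per step.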
What actually was requested was this:
def k_hanoi(npods, nrings):
    if nrings == 1 and npods > 1:  # one remaining ring: just one move
        return 1
    if npods == 3:
        return 2**nrings - 1  # optimal solution with 3 pods takes 2^d - 1 moves
    if npods > 3 and nrings > 0:
        sol = []
        for pivot in range(1, nrings):  # loop on all possible pivots
            sol.append(2*k_hanoi(npods, pivot) + k_hanoi(npods-1, nrings-pivot))
        return min(sol)  # minimization on pivot
k = 4
d = 10
print(k_hanoi(k, d))
I think it is the Frame-Stewart algorithm, which optimizes over the pivot used to split the disks into two subgroups. I also think someone proved that it is optimal for 4 pegs (in 2014 or thereabouts, I'm not sure), and it is conjectured to be optimal for more than 4 pegs. The limit on the number of moves can be implemented easily.
The value function in this case was the number of steps needed to go from the initial configuration to the ending one and it needed be minimized. Thank you all for the contribution.

implementing stochastic ACO algorithm

I am trying to implement a stochastic ant colony optimisation algorithm, and I'm having trouble working out how to implement movement choices based on probabilities.
the standard (greedy) version that I have implemented so far is that an ant m at a vertex i on a graph G = (V,E) where E is the set of edges (i, j), will choose the next vertex j based on the following criteria:
j = argmax(<fitness function for j>)
such that j is connected to i
the problem I am having is in trying to implement a stochastic version of this, so that now the criteria for choosing a new vertex, j is:
P(j) = <fitness function for j>/sum(<fitness function for J>)
where P(j) is the probability of choosing vertex j,
such j is connected to i,
and J is the set of all vertices connected to i
I understand the mathematics behind it; I am just having trouble working out how I should actually implement it.
If, say, I have 3 vertices connected to i, with probabilities 0.2, 0.3 and 0.5, what is the best way to make the selection? Should I just randomly select a vertex j, then generate a random number r in the range (0,1) and, if r >= P(j), select vertex j? Or is there a better way?
Looking at the problem statement, I think you are not trying to visit all nodes connected to i, but only some of them, chosen according to a probability distribution. Let's take an example:
You have a node i and connected to it are 5 nodes, a1...a5, with probabilities p1...p5, such that sum(p_i) = 1. Now, say the precision of the probabilities that you consider is 2 places after the decimal point. Also, you don't want to visit all 5 nodes, but only k of them. Let's say, in this example, k = 2. Since 2 decimal places is your probability precision, add 3 to it to increase the normality of the probability distribution in the random function. (You can change this 3 to any number of your choice, as far as performance is concerned.) (Since you have not tagged any language, I'll take Java's nextInt() function as the example for generating random numbers.)
Let's give some values:
p1...p5 = {0.17, 0.11, 0.45, 0.03, 0.24}
Now, in a loop from 1 to k, generate a random number from (0...10^5). {5 = 2 + 3, ie. precision + 3}. If the generated number is from 0 to 16999, go with node a1, 17000 to 27999, go with a2, 28000 to 72999, go with a3...and so on. You get the idea.
What you're trying to implement is a weighted random choice based on the probabilities of the solution components, or a random proportional selection rule in ACO terms. Here is a snippet of the implementation of this rule in the Isula Framework:
double value = random.nextDouble();
while (componentWithProbabilitiesIterator.hasNext()) {
    Map.Entry<C, Double> componentWithProbability = componentWithProbabilitiesIterator
            .next();
    Double probability = componentWithProbability.getValue();
    total += probability;
    if (total >= value) {
        nextNode = componentWithProbability.getKey();
        return true;
    }
}
You just need to generate a random value between 0 and 1 (stored in value), and start accumulating the probabilities of the components (on the total variable). When the total exceeds the threshold defined in value, we have found the component to add to the solution.
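The same accumulation can be sketched in Python. The function name `choose_next` and the optional `value` argument (handy for testing with a fixed draw) are illustrative and not part of Isula.

```python
import random

def choose_next(probabilities, value=None):
    """probabilities maps each candidate vertex j to P(j); the values sum to 1."""
    if value is None:
        value = random.random()  # uniform in [0, 1)
    total = 0.0
    for vertex, probability in probabilities.items():
        total += probability
        if total >= value:
            return vertex
    return vertex  # rounding can leave the sum slightly below 1

candidates = {"a": 0.2, "b": 0.3, "c": 0.5}
print(choose_next(candidates, value=0.25))  # b
print(choose_next(candidates))              # a, b or c with probability 0.2, 0.3, 0.5
```

Python's standard library also offers `random.choices(population, weights=...)`, which performs the same kind of weighted selection.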

Discrete optimization algorithm

I'm trying to decide on the best approach for my problem, which is as follows:
I have a set of objects (about 3k-5k) which I want to uniquely assign to about 10 groups (1 group per object).
Each object has a set of grades corresponding with how well it fits within each group.
Each group has a capacity of objects it can manage (the constraints).
My goal is to maximize the sum of grades my assignments receive.
For example, let's say I have 3 objects (o1, o2, o3) and 2 groups (g1,g2) with a cap. of 1 object each.
Now assume the grades are:
o1: g1=11, g2=8
o2: g1=10, g2=5
o3: g1=5, g2=6
In that case, for the optimal result g1 should receive o2, and g2 should receive o1, yielding a total of 10+8=18 points.
Note that the number of objects can either exceed the sum of quotas (e.g. leaving o3 as a "leftover") or fall short from filling the quotas.
How should I address this problem (Traveling Salesman, sort of a weighted Knap-Sack etc.)? How long should brute-forcing it take on a regular computer? Are there any standard tools such as the linprog function in Matlab that support this sort of problem?
It can be solved with a min cost flow algorithm.
The graph can be built in the following way:
It should be bipartite. The left part represents objects (one vertex for each object). The right part represents groups (one vertex for each group). There is an edge from each vertex in the left part to each vertex in the right part with capacity = 1 and cost = -grade for this pair. There is also an edge from the source vertex to each vertex in the left part with capacity = 1 and cost = 0, and an edge from each vertex in the right part to the sink vertex (the sink and the source are two additional vertices) with capacity = the capacity constraint of this group and cost = 0.
The answer is the negation of the cheapest flow cost from the source to the sink.
It is possible to implement it with O(N^2 * M * log(N + M)) time complexity using Dijkstra's algorithm with potentials (N is the number of objects, M is the number of groups).
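A compact sketch of this construction, assuming non-negative grades. It is not the Dijkstra-with-potentials variant mentioned above: for brevity it finds each augmenting path with Bellman-Ford, which is slower but handles the negative edge costs directly. The function name `max_total_grade` is illustrative.

```python
def max_total_grade(grades, capacity):
    """grades[i][j]: grade of object i in group j; capacity[j]: size limit of group j."""
    n, m = len(grades), len(capacity)
    src, snk = n + m, n + m + 1
    # each edge is [target, residual capacity, cost, index of the reverse edge]
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in range(n):
        add_edge(src, i, 1, 0)
        for j in range(m):
            add_edge(i, n + j, 1, -grades[i][j])
    for j in range(m):
        add_edge(n + j, snk, capacity[j], 0)

    total = 0
    while True:
        # Bellman-Ford shortest path from src in the residual graph
        dist = [float("inf")] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == float("inf"):
                    continue
                for k, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if prev[snk] is None or dist[snk] >= 0:
            break  # no augmenting path left that increases the total grade
        v = snk
        while v != src:  # push one unit of flow along the path
            u, k = prev[v]
            graph[u][k][1] -= 1
            graph[v][graph[u][k][3]][1] += 1
            v = u
        total -= dist[snk]
    return total

print(max_total_grade([[11, 8], [10, 5], [5, 6]], [1, 1]))  # 18
```

On the example from the question, the first augmenting path assigns o1 to g1 and the second reroutes o1 to g2 so that o2 can take g1, giving 10 + 8 = 18.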
This can be solved with an integer program. Binary variables x_{ij} state whether object i is assigned to group j. The objective maximizes \sum_{i,j} s_{ij}x_{ij}, where s_{ij} is the score associated with assigning i to j. You have two types of constraints:
\sum_i x_{ij} <= c_j for all j, the capacity constraints for groups
\sum_j x_{ij} <= 1 for all i, limiting objects to be assigned to at most one group
Here's how you would implement it in R -- the lp function from the lpSolve package is quite similar to the linprog function in Matlab.
library(lpSolve)
# Score matrix
S <- matrix(c(11, 10, 5, 8, 5, 6), nrow=3)
# Capacity vector
cvec <- c(1, 1)
# Helper function to construct constraint matrices
unit.vec <- function(pos, n) {
  ret <- rep(0, n)
  ret[pos] <- 1
  ret
}
# Capacity constraints
cap <- t(sapply(1:ncol(S), function(j) rep(unit.vec(j, ncol(S)), nrow(S))))
# Object assignment constraints
obj <- t(sapply(1:nrow(S), function(i) rep(unit.vec(i, nrow(S)), each=ncol(S))))
# Solve the LP
res <- lp(direction="max",
          objective.in=as.vector(t(S)),
          const.mat=rbind(cap, obj),
          const.dir=rep("<=", ncol(S) + nrow(S)),
          const.rhs=c(cvec, rep(1, nrow(S))),
          all.bin=TRUE)
# Grab assignments and objective
sln <- t(matrix(res$solution, nrow=ncol(S)))
apply(sln, 1, function(x) ifelse(sum(x) > 0.999, which(x == 1), NA))
# [1] 2 1 NA
res$objval
# [1] 18
Although this is modeled with binary variables, it will solve quite efficiently assuming integral capacities.
