Given a tree with n vertices, each vertex v has a special value C_v. A straight path of length k >= 1 is defined as a sequence of vertices v_1, v_2, ..., v_k such that every two consecutive elements of the sequence are connected by an edge and all vertices v_i are distinct. A straight path need not contain any edges; in other words, for k = 1 a sequence containing a single vertex is also a straight path. A function S is defined so that for a given straight path v_1, v_2, ..., v_k we get S(v_1, v_2, ..., v_k) = C_{v_1} - C_{v_2} + C_{v_3} - C_{v_4} + ...
Calculate the sum of the values of the function S for all straight paths in the tree. Since the result may be very large, give its remainder when divided by 10^9 + 7.
Paths are treated as directed. For example, the paths 1 -> 2 -> 4 and 4 -> 2 -> 1 are treated as two different paths, and the value of the function S is added to the result for each of them separately.
My implementation is as follows:
def S(path):
    total, negative_one_pow = 0, 1
    for node in path:
        total += (values[node - 1] * negative_one_pow)
        negative_one_pow *= -1
    return total

def search(graph):
    global total
    for node in range(1, n + 1):
        queue = [(node, [node])]
        visited = set()
        while queue:
            current_node, path = queue.pop(0)
            if current_node in visited:
                continue
            visited.add(current_node)
            total += S(path)
            for neighbor in graph[current_node]:
                queue.append((neighbor, [*path, neighbor]))

n = int(input())
values = list(map(int, input().split()))
graph = {i: [] for i in range(1, n + 1)}
total = 0
for i in range(n - 1):
    a, b = map(int, input().split())
    graph[a].append(b)
    graph[b].append(a)
search(graph)
print(total % 1000000007)
The execution of the code takes too long for bigger graphs. Can you suggest ways to speed up the code?
It is a tree. Therefore all straight paths start somewhere, travel up some distance (maybe 0), hit a peak, then turn around and go down some distance (maybe 0).
First calculate, for each node, the sum and count of all odd-length paths rising to it through its children, and the sum and count of all even-length paths rising to it through its children. This can be done through dynamic programming.
Armed with that we can calculate for each node the sum of all paths with that node as a peak. (Calculate the sum of all paths that rise to the peak then fall. For each child subtract out the sum of all paths that rose to that child and then fell back to that child.)
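For reference, here is one way that approach can be sketched in code (function and variable names are mine, not from the problem). For every node it keeps the count and sum of the upward paths of each length parity, then combines rising and falling halves at every peak, subtracting the combinations whose two halves would reuse the same child:

from collections import deque

MOD = 10**9 + 7

def sum_of_straight_paths(n, values, edge_list):
    graph = [[] for _ in range(n + 1)]          # 1-indexed adjacency lists
    for a, b in edge_list:
        graph[a].append(b)
        graph[b].append(a)

    # Root the tree at 1 and record a BFS order (parents before children).
    parent = [0] * (n + 1)
    seen = [False] * (n + 1)
    order, dq = [], deque([1])
    seen[1] = True
    while dq:
        v = dq.popleft()
        order.append(v)
        for u in graph[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                dq.append(u)

    # up_cnt[v][p] / up_sum[v][p]: count and S-sum of directed paths that start
    # inside v's subtree and rise to v, grouped by parity p of their vertex count
    # (S is evaluated with the deepest vertex getting sign +).
    up_cnt = [[0, 0] for _ in range(n + 1)]
    up_sum = [[0, 0] for _ in range(n + 1)]

    def down(cnt, sm):
        # A falling path starting at v is the reverse of a rising path ending at v:
        # its S value (with v first) is +S_up for odd lengths and -S_up for even ones.
        return cnt, [(-sm[0]) % MOD, sm[1]]

    def pairs(rc, rs, fc, fs, cv):
        # Sum of S over all (rising, falling) combinations in these two groups;
        # cv is subtracted from the falling sums because the peak is shared.
        res = 0
        for p in (0, 1):
            sign = 1 if p == 1 else -1
            for q in (0, 1):
                res += rs[p] * fc[q] + sign * (fs[q] - cv * fc[q]) * rc[p]
        return res % MOD

    total = 0
    for v in reversed(order):                   # children before parents
        cv = values[v - 1] % MOD
        up_cnt[v][1], up_sum[v][1] = 1, cv      # the single-vertex path (v)

        via = []                                # rising paths grouped by the child they use
        for c in graph[v]:
            if c == parent[v]:
                continue
            vc, vs = [0, 0], [0, 0]
            for p in (0, 1):
                sign = 1 if p == 0 else -1      # v lands on 0-indexed position = old length
                vc[1 - p] = up_cnt[c][p]
                vs[1 - p] = (up_sum[c][p] + sign * cv * up_cnt[c][p]) % MOD
            via.append((vc, vs))
            for p in (0, 1):
                up_cnt[v][p] = (up_cnt[v][p] + vc[p]) % MOD
                up_sum[v][p] = (up_sum[v][p] + vs[p]) % MOD

        fc, fs = down(up_cnt[v], up_sum[v])
        total = (total + pairs(up_cnt[v], up_sum[v], fc, fs, cv)) % MOD
        for vc, vs in via:                      # drop combinations that reuse one child
            dfc, dfs = down(vc, vs)
            total = (total - pairs(vc, vs, dfc, dfs, cv)) % MOD

    return total % MOD

With the question's input format it would be used as print(sum_of_straight_paths(n, values, edges)), where edges is the list of the n - 1 input pairs.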
Related
Given a tree with n nodes (n can be as large as 2 * 10^5), where each node has a cost associated with it, let us define the following functions:
g(u, v) = the sum of all costs on the simple path from u to v
f(n) = the (n + 1)th Fibonacci number (n + 1 is not a typo)
The problem I'm working on requires me to compute the sum of f(g(u, v)) over all possible pairs of nodes in the tree modulo 10^9 + 7.
As an example, let's take a tree with 3 nodes.
Without loss of generality, let's say node 1 is the root, and its children are 2 and 3.
cost[1] = 2, cost[2] = 1, cost[3] = 1
g(1, 1) = 2; f(2) = 2
g(2, 2) = 1; f(1) = 1
g(3, 3) = 1; f(1) = 1
g(1, 2) = 3; f(3) = 3
g(2, 1) = 3; f(3) = 3
g(1, 3) = 3; f(3) = 3
g(3, 1) = 3; f(3) = 3
g(2, 3) = 4; f(4) = 5
g(3, 2) = 4; f(4) = 5
Summing all of the values, and taking the result modulo 10^9 + 7 gives 26 as the correct answer.
My attempt:
I implemented an algorithm to compute g(u, v) in O(log n) by finding the lowest common ancestor using a sparse table.
To find the appropriate Fibonacci values, I tried two approaches: exponentiation of the matrix form, and exploiting the fact that the sequence modulo 10^9 + 7 is periodic.
Now comes the extremely tricky part. No matter how I do the above computations, I still end up going through up to O(n^2) pairs when calculating the sum of all possible f(g(u, v)). There is the obvious improvement of only going up to n * (n - 1) / 2 pairs, but that is still quadratic.
What am I missing? I've been at it for several hours, but I can't see a way to get that sum without actually producing a quadratic algorithm.
To know how many times the cost of a node X is to be included in the total sum, we divide the other nodes into 3 (or more) groups:
the subtree A connected to the left of X
the subtree B connected to the right of X
(subtrees C, D... if the tree is not binary)
all other nodes Y, connected through X's parent
When two nodes belong to different groups, their simple path goes through X. So the number of simple paths that go through X is:
#Y + #A × (N - #A) + #B × (N - #B)
So by counting the total number of nodes N, and the size of the subtrees under X, you can calculate how many times the cost of node X should be included in the total sum. Do this for every node and you have the total cost.
The code for this could be straightforward. I'll assume that the total number of nodes N is known, and that you can add properties to the nodes (both of these assumptions simplify the algorithm, but it can be done without them).
We'll add a child_count to store the number of descendants of the node, and a path_count to store the number of simple paths that the node is part of; both are initialised to zero.
For each node, starting from the root:
If not all children have been visited, go to an unvisited child.
If all children have been visited (or node is leaf):
Increment child_count.
Increase path_count with N - child_count.
Add this node's path_count × cost to the total cost.
If the current node is the root, we're done; otherwise:
Increase the parent node's child_count with this node's child_count.
Increase the parent node's path_count with this node's child_count × (N - child_count).
Go to the parent node.
The below algorithm's running time is O(n^3).
A tree is a connected graph without cycles. So when we want to get the costs of all possible pairs, we are effectively finding the (unique, hence shortest) paths between all pairs. Thus, we can use Dijkstra's idea and a dynamic programming approach for this problem (I took it from Weiss's book). Then we apply the Fibonacci function to each cost, assuming that we already have a lookup table.
Dijkstra's idea: we start from the root and search all simple paths from the root to all other nodes, and then do the same for the other vertices of the graph.
Dynamic programming approach: we use a 2D matrix D[][] to represent the lowest path cost between node i and node j. Initially, D[i][i] is set to cost[i]. If node i and node j are parent and child, D[i][j] = g(i, j), the cost between them. If node k lies on a cheaper path between node i and node j, we update D[i][j], i.e., D[i][j] = D[i][k] + D[k][j] - cost[k] if that is smaller than the current D[i][j] (cost[k] is subtracted because the costs sit on the nodes, so the middle node would otherwise be counted twice).
When done, we go over the D[][] matrix, apply the Fibonacci function to each cell, add everything up, and take the result modulo 10^9 + 7.
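A concrete sketch of that approach (names are mine; it assumes 0-indexed nodes, positive per-node costs, an edge list, and that a Fibonacci lookup table up to the largest path cost is feasible):

MOD = 10**9 + 7
INF = float('inf')

def total_fib_cost(n, cost, edges):
    # D[i][j] = sum of node costs on the path from i to j (both endpoints included)
    D = [[INF] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = cost[i]
    for a, b in edges:
        D[a][b] = D[b][a] = cost[a] + cost[b]

    # Floyd-Warshall-style relaxation, O(n^3); cost[k] is subtracted so the
    # middle node is not counted twice.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] - cost[k] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j] - cost[k]

    # Fibonacci lookup table: f(m) is the (m+1)-th Fibonacci number, as in the question
    max_cost = max(max(row) for row in D)
    fib = [1, 1]                                  # f(0) = 1, f(1) = 1
    while len(fib) <= max_cost:
        fib.append((fib[-1] + fib[-2]) % MOD)

    return sum(fib[D[i][j]] for i in range(n) for j in range(n)) % MOD

On the 3-node example from the question, total_fib_cost(3, [2, 1, 1], [(0, 1), (0, 2)]) returns 26.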
Each edge in the graph has weight 1. The graph may have cycles. If a node has a self-loop, it can be at any distance from itself, from 0 to infinity, depending on the number of times we take the self-loop.
I have solved the problem using BFS, but the constraint on the distance is on the order of 10^9, so BFS is too slow.
We will be asked multiple queries on a given graph, each of the form
(distance, source)
and the output is the list of nodes that are exactly at the given distance from the source vertex.
Constraints
1<=Nodes<=500
1<queries<=500
1<=distance<=10^9
I have a feeling there would be many repeated computations, since the number of nodes is small, but I am not able to figure out how to break the problem into smaller subproblems.
What is the efficient way to do this?
Edit: I have tried using matrix exponentiation, but it is too slow for the given constraints. The problem has a time limit of 1 second.
Let G = (V,E) be your graph, and define the adjacency matrix A as follows:
A[i][j] = 1    if (V[i],V[j]) is in E
           0    otherwise
In this matrix, for each k:
(A^k)[i][j] > 0 if and only if there is a path from v[i] to v[j] of length exactly k.
This means by creating this matrix and then calculating the exponent, you can easily get your answer.
For fast exponent calculation you can use exponentiation by squaring, which will yield O(M(n) * log(k)), where M(n) is the cost of matrix multiplication for an n x n matrix.
This will also save you some calculation when looking for different queries on the same graph.
Appendix - claim proof:
Base: A^1 = A, and indeed in A by definition, A[i][j]=1 if and only if (V[i],V[j]) is in E
Hypothesis: assume the claim is correct for all l<k
A^k = A^(k-1)*A. From induction hypothesis, A^(k-1)[i][j] > 0 iff there is a path of length k-1 from V[i] to V[j].
Let's examine two vertices v1,v2 with indices i and j.
If there is a path of length k between them, let it be v1->...->u->v2. Let the index of u be m.
From i.h. A^(k-1)[i][m] > 0 because there is a path. In addition A[m][j] = 1, because (u,v2) = (V[m],V[j]) is an edge.
A^k[i][j] = A^(k-1)*A[i][j] = A^(k-1)[i][1]A[1][j] + ... + A^(k-1)[i][m]A[m][j] + ... + A^(k-1)[i][n]A[n][j]
And since A[m][j] > 0 and A^(k-1)[i][m] > 0, then A^(k-1)*A[i][j] > 0
If there is no such path, then for each vertex u such that (u,v2) is an edge, there is no path of length k-1 from v1 to u (otherwise v1->...->u->v2 would be a path of length k).
Then, using induction hypothesis we know that if A^(k-1)[i][m] > 0 then A[m][j] = 0, for all m.
If we assign that in the sum defining A^k[i][j], we get that A^k[i][j] = 0
QED
Small note: Technically, A^k[i][j] is the number of paths between i and j of length exactly k. This can be proven similar to above but with a bit more attention to details.
To avoid the numbers growing too fast (which will increase M(n) because you might need big integers to store that value), and since you don't care for the value other than 0/1 - you can treat the matrix as booleans - using only 0/1 values and trimming anything else.
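A rough sketch of the boolean treatment for a single (distance, source) query, assuming nodes 0..n-1 and a 0/1 adjacency matrix as nested lists (all names are illustrative); it is plain Python for clarity, so for n = 500 and many queries you would want to pack rows into bitsets or use numpy:

def bool_mat_mult(X, Y):
    n = len(X)
    Z = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if X[i][k]:
                for j in range(n):
                    if Y[k][j]:
                        Z[i][j] = 1
    return Z

def mat_pow(adj, k):
    n = len(adj)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    base = [row[:] for row in adj]
    while k:                           # exponentiation by squaring
        if k & 1:
            result = bool_mat_mult(result, base)
        base = bool_mat_mult(base, base)
        k >>= 1
    return result

def nodes_at_distance(adj, source, k):
    reach = mat_pow(adj, k)
    return [v for v in range(len(adj)) if reach[source][v]]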
if there are cycles in your graph, then a node that lies on a cycle of length L and is first reached at distance d is also reachable at distances d + L*N for any N >= 0, because you can go around the cycle as many times as you wish.
That brings me to the idea: we can use the cycles to our advantage!
using BFS while detecting cycles, we calculate offset + cycle*N and then we get as close to our goal (K) as possible
and search for K pretty easily.
e.g.
A -> B -> C -> D -> B
K = 1000;
S = A;
A - 0
B - 1
C - 2
D - 3
B - 1 (+ 3N)   (the cycle B -> C -> D -> B has length 3)
here you can check: B is reachable at distances 1 + 3N, and K - 1 = 999 = 3 * 333, so with N = 333 node B lies at distance exactly 1000
a simpler way is to compute (K - offset) mod cycle for each node on the cycle; the node is at distance exactly K when the remainder is 0
from here you only have to run the same check for the handful of other nodes on the cycle.
in this example (as a REGEX): A(BCD){333}B
The graph G is an undirected graph, and all its edges have the same weight. Given two vertices u and v, how can we find the number of shortest paths between u and v in G in O(|V|)?
|V| stands for the number of vertices in G.
You can use a counting variation of BFS.
The idea is to hold a dictionary that maps (vertex, depth) -> #paths, i.e. the key is a vertex together with a depth, and the value is the number of paths from the source to that vertex with that depth.
At each iteration of the BFS you keep track of the current depth of the path, and you add the number of paths found so far to the next level.
The idea is that if you have 3 paths leading to x and 4 paths leading to y, both at depth 3, and both x and y have an edge to u ((x,u) and (y,u)), then there are 7 paths of depth 4 leading to u: the 3 leading to x extended by (x,u), and the 4 leading to y extended by (y,u).
It should look something like this:
findNumPaths(s,t):
    dict = {} //empty dictionary
    dict[(s,0)] = 1 //the empty path
    queue <- new Queue()
    queue.add((s,0))
    lastDepth = -1
    while (!queue.isEmpty())
        (v,depth) = queue.pop()
        if depth > lastDepth && (t,lastDepth) is in dict: //found all shortest paths
            break
        for each edge (v,u):
            if (u,depth+1) is not an entry in dict:
                dict[(u,depth+1)] = 0
                queue.push((u,depth+1)) //add u with depth+1 only once, no need for more!
            dict[(u,depth+1)] = dict[(u,depth+1)] + dict[(v,depth)]
        lastDepth = depth
    return dict[(t,lastDepth)] //the number of shortest paths from s to t
Run time is O(V+E) if using hash table for dictionary.
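For completeness, here is a runnable Python sketch of the same counting BFS (illustrative names; graph is assumed to be an adjacency list mapping each vertex to its neighbours):

from collections import deque

def count_shortest_paths_bfs(graph, s, t):
    dist = {s: 0}          # BFS level of each discovered vertex
    paths = {s: 1}         # number of shortest paths from s to each vertex
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            break          # all of t's predecessors were dequeued before t
        for u in graph[v]:
            if u not in dist:              # first time u is reached: record its level
                dist[u] = dist[v] + 1
                paths[u] = 0
                queue.append(u)
            if dist[u] == dist[v] + 1:     # u is on the next level: add v's path count
                paths[u] += paths[v]
    return paths.get(t, 0)

For example, count_shortest_paths_bfs({0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}, 0, 3) returns 2.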
Another solution (easier to program but less efficient) is:
1. Build the adjacency matrix of the graph, let it be `A`.
2. Set `Curr = I` (identity matrix)
3. while Curr[s][t] == 0:
   3.1. Calculate Curr = Curr * A //matrix multiplication
4. Return Curr[s][t]
The reason it works is that (A^n)[x][y] is the number of walks of length n from x to y in the graph that A represents. We find the first power for which this number is higher than zero and return it; at that power, every such walk is a shortest path.
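A small Python sketch of this second solution (illustrative names; adj is a 0/1 adjacency matrix as nested lists, and, like the pseudocode, it never terminates if t is unreachable from s):

def count_shortest_paths_matrix(adj, s, t):
    n = len(adj)
    curr = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # identity
    while curr[s][t] == 0:
        # one multiplication by adj = walks that are exactly one edge longer
        curr = [[sum(curr[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    return curr[s][t]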
So I wrote a function to find the k nodes of a graph that have the smallest degree. It looks like this:
def smallestKNodes(G, k):
    leastK = []
    for i in range(G.GetMxNId()):
        # Produces an iterator to the node
        node = G.GetNI(i)
        for j in range(k):
            if j >= len(leastK):
                leastK.append(node)
                break
            elif node.GetDeg() < leastK[j].GetDeg():
                leastK.insert(j, node)
                leastK = leastK[0:k]
                break
    return leastK[0:k]
My problem is that when all the nodes have the same degree, it selects the same nodes every time. How can I make it take all the nodes with zero degree (or whatever the minimum degree is) and then fill the remaining slots, up to k, randomly?
Stipulations:
(1) Suppose k = 7, then if there are 3 nodes with degree 0 and 10 nodes with degree 1, I would like to choose all the nodes with degree 0, but randomly choose 4 of the nodes with degree 1.
(2) If possible I don't want to visit any node twice because there might be too many nodes to fit into memory. There might also be a very large number of nodes with minimum degree. In some cases there might also be a very small number of nodes.
Store all the nodes which satisfy your condition and randomly pick k nodes from it. You can do the random pick by shuffling the array (e.g. Fisher-Yates, std::shuffle, randperm, etc.) and picking the first k nodes (for example).
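A small sketch of this idea, reusing the G.GetMxNId() / G.GetNI() iteration pattern from the question's code (everything else is an illustrative name, and node ids are returned instead of iterators); it takes whole degree buckets while they fit and only shuffles the boundary bucket, which also covers stipulation (1):

import random

def sample_min_degree_nodes(G, k):
    by_degree = {}                       # bucket node ids by their degree
    for i in range(G.GetMxNId()):
        node = G.GetNI(i)
        by_degree.setdefault(node.GetDeg(), []).append(i)

    chosen = []
    for deg in sorted(by_degree):
        bucket = by_degree[deg]
        if len(chosen) + len(bucket) <= k:
            chosen.extend(bucket)        # take every node of this degree
        else:
            random.shuffle(bucket)       # tie-break the boundary degree randomly
            chosen.extend(bucket[:k - len(chosen)])
        if len(chosen) == k:
            break
    return chosen

Note that it keeps all node ids, grouped by degree, in memory; the two-pass/reservoir approach below avoids that.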
You might want to do two passes, the first pass to discover the relevant degree you have to randomize, how many nodes of that degree to choose, and the total number of nodes with that degree. Then, do a second pass on your nodes, choosing only those with the desired degree at random.
To choose k nodes of n total so each node has a fair probability (k/n), loop over relevant nodes, and choose each one with probability 1, 1, ..., 1, k/(k+1), k/(k+2), ..., k/n. When choosing a node, if k nodes are already chosen, throw one of them away at random.
import random

def randomNodesWithSpecificDegree(G, d, k, n):
    result = []
    examined = 0
    for i in range(G.GetMxNId()):
        # Produces an iterator to the node
        node = G.GetNI(i)
        if node.GetDeg() == d:
            examined = examined + 1
            if len(result) < k:
                result.append(node)
            elif random.random() < k / examined:
                index = random.randrange(k)
                result[index] = node
    assert examined == n
    return result
This pseudo-code is good when k is small and n is big (seems your case).
This is the question I was asked some time ago on interview, I could not find answer for.
Given some samples S1, S2, ..., Sn and their probabilities (or weights, whatever they are called) P1, P2, ..., Pn, design an algorithm that randomly chooses a sample while taking its probability into account. The solution I came up with is as follows:
Build a cumulative array of weights C_i, such that
C_0 = 0;
C_i = C_(i-1) + P_i.
At the same time calculate T = P_1 + P_2 + ... + P_n.
This takes O(n) time.
Generate a uniformly random number R = T * random[0..1].
Using binary search, return the least i such that C_i >= R.
The result is S_i. This takes O(log N) time.
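A quick sketch of that setup in Python (illustrative names only), using bisect for the binary search:

import bisect
import random

def build_cumulative(weights):
    cumulative, total = [], 0.0
    for w in weights:
        total += w
        cumulative.append(total)
    return cumulative, total                # O(n)

def sample(samples, cumulative, total):
    r = random.random() * total
    i = bisect.bisect_left(cumulative, r)   # least i with cumulative[i] >= r
    return samples[i]                       # O(log n)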
Now the actual question is:
Suppose I want to change one of the initial weights P_j. How can I do this in better than O(n) time?
Other data structures are acceptable, but the random sampling algorithm should not get worse than O(log N).
One way to solve this is to rethink how your binary search tree containing the cumulative totals is built. Rather than building a binary search tree, think about having each node interpreted as follows:
Each node stores a range of values that are dedicated to the node itself.
Nodes in the left subtree represent sampling from the probability distribution just to the left of that range.
Nodes in the right subtree represent sampling from the probability distribution just to the right of that range.
For example, suppose our weights are 3, 2, 2, 2, 2, 1, and 1 for events A, B, C, D, E, F, and G. We build this binary tree holding A, B, C, D, E, F, and G:
           D
         /   \
        B     F
       / \   / \
      A   C E   G
Now, we annotate the tree with probabilities. Since A, C, E, and G are all leaves, we give each of them probability mass one:
           D
         /   \
        B     F
       / \   / \
      A   C E   G
      1   1 1   1
Now, look at the tree for B. B has weight 2 of being chosen, A has weight 3 of being chosen, and C has weight 2 of being chosen. If we normalize these to the range [0, 1), then A accounts for 3/7 of the probability and B and C each account for 2/7. Thus we have the node for B say that anything in the range [0, 3/7) goes to the left subtree, anything in the range [3/7, 5/7) maps to B, and anything in the range [5/7, 1) maps to the right subtree:
                              D
                            /   \
                 B                          F
      [0, 3/7)  / \  [5/7, 1)              / \
               A   C                      E   G
               1   1                      1   1
Similarly, let's process F. E has weight 2 of being chosen while F and G each have weight 1 of being chosen. Thus the subtree for E accounts for 1/2 of the probability mass here, the node F accounts for 1/4, and the subtree for G accounts for 1/4. This means we can assign probabilities as
                              D
                            /   \
                 B                          F
      [0, 3/7)  / \  [5/7, 1)    [0, 1/2)  / \  [3/4, 1)
               A   C                      E   G
               1   1                      1   1
Finally, let's look at the root. The combined weight of the left subtree is 3 + 2 + 2 = 7. The combined weight of the right subtree is 2 + 1 + 1 = 4. The weight of D itself is 2. Thus the left subtree has probability 7/13 of being picked, D has probability 2/13 of being picked, and the right subtree has probability 4/13 of being picked. We can thus finalize the probabilities as
                              D
                  [0, 7/13) /   \  [9/13, 1)
                 B                          F
      [0, 3/7)  / \  [5/7, 1)    [0, 1/2)  / \  [3/4, 1)
               A   C                      E   G
               1   1                      1   1
To generate a random value, you would repeat the following:
Starting at the root:
Choose a uniformly-random value in the range [0, 1).
If it's in the range for the left subtree, descend into it.
If it's in the range for the right subtree, descend into it.
Otherwise, return the value corresponding to the current node.
The probabilities themselves can be determined recursively when the tree is built:
The left and right probabilities are 0 for any leaf node.
If an interior node itself has weight W, its left tree has total weight WL, and its right tree has total weight WR, then the left probability is (WL) / (W + WL + WR) and the right probability is (WR) / (W + WL + WR).
The reason that this reformulation is useful is that it gives us a way to update probabilities in O(log n) time per probability updated. In particular, let's think about what invariants are going to change if we update some particular node's weight. For simplicity, let's assume the node is a leaf for now. When we update the leaf node's weight, the probabilities are still correct for the leaf node, but they're incorrect for the node just above it, because the weight of one of that node's subtrees has changed. Thus we can (in O(1) time) recompute the probabilities for the parent node by just using the same formula as above. But then the parent of that node no longer has the correct values because one of its subtree weights has changed, so we can recompute the probability there as well. This process repeats all the way back up to the root of the tree, with us doing O(1) computation per level to rectify the weights assigned to each edge. Assuming that the tree is balanced, we therefore have to do O(log n) total work to update one probability. The logic is identical if the node isn't a leaf node; we just start somewhere in the tree.
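For concreteness, here is a minimal sketch of that structure (all names are illustrative, and strictly positive weights are assumed):

import random

class Node:
    def __init__(self, value, weight):
        self.value, self.weight = value, weight
        self.left = self.right = self.parent = None
        self.total = self.p_left = self.p_right = 0.0

def recompute(node):
    # Refresh the subtree total and the left/right probabilities of one node: O(1).
    wl = node.left.total if node.left else 0.0
    wr = node.right.total if node.right else 0.0
    node.total = node.weight + wl + wr
    node.p_left, node.p_right = wl / node.total, wr / node.total

def build(items):
    # Build a balanced tree over items = [(value, weight), ...] bottom-up: O(n).
    if not items:
        return None
    mid = len(items) // 2
    node = Node(*items[mid])
    node.left, node.right = build(items[:mid]), build(items[mid + 1:])
    for child in (node.left, node.right):
        if child:
            child.parent = node
    recompute(node)
    return node

def sample(root):
    # Walk down from the root, drawing a fresh uniform value at every node.
    # Here the [0, 1) range is packed as [left | right | the node itself];
    # the worked example above puts the node's own slice in the middle,
    # which is an equivalent layout.
    node = root
    while True:
        r = random.random()
        if r < node.p_left:
            node = node.left
        elif r < node.p_left + node.p_right:
            node = node.right
        else:
            return node.value

def update_weight(node, new_weight):
    # Change one weight, then fix the annotations on the path to the root: O(log n).
    node.weight = new_weight
    while node is not None:
        recompute(node)
        node = node.parent

Calling build([('A', 3), ('B', 2), ('C', 2), ('D', 2), ('E', 2), ('F', 1), ('G', 1)]) reproduces the example tree above with D at the root; sample(root) then returns events with probability proportional to their weights, and update_weight(node, w) repairs the annotations along the path to the root in O(log n).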
In short, this gives
O(n) time to construct the tree (using a bottom-up approach),
O(log n) time to generate a random value, and
O(log n) time to update any one value.
Hope this helps!
Instead of an array, store the search structured as a balanced binary tree. Every node of the tree should store the total weight of the elements it contains. Depending on the value of R, the search procedure either returns the current node or searches through the left or right subtree.
When the weight of an element is changed, the updating of the search structure is a matter of adjusting the weights on the path from the element to the root of the tree.
Since the tree is balanced, the search and the weight update operations are both O(log N).
For those of you who would like some code, here's a python implementation:
import numpy


class DynamicProbDistribution(object):
    """ Given a set of weighted items, randomly samples an item with probability
    proportional to its weight. This class also supports fast modification of the
    distribution, so that changing an item's weight requires O(log N) time.
    Sampling requires O(log N) time. """

    def __init__(self, weights):
        self.num_weights = len(weights)
        self.weights = numpy.empty((1+len(weights),), 'float32')
        self.weights[0] = 0 # Not necessary but easier to read after printing
        self.weights[1:] = weights
        self.weight_tree = numpy.zeros((1+len(weights),), 'float32')
        self.populate_weight_tree()

    def populate_weight_tree(self):
        """ The value of every node in the weight tree is equal to the sum of all
        weights in the subtree rooted at that node. """
        i = self.num_weights
        while i > 0:
            weight_sum = self.weights[i]
            twoi = 2*i
            if twoi < self.num_weights:
                weight_sum += self.weight_tree[twoi] + self.weight_tree[twoi+1]
            elif twoi == self.num_weights:
                weight_sum += self.weights[twoi]
            self.weight_tree[i] = weight_sum
            i -= 1

    def set_weight(self, item_idx, weight):
        """ Changes the weight of the given item. """
        i = item_idx + 1
        self.weights[i] = weight
        while i > 0:
            weight_sum = self.weights[i]
            twoi = 2*i
            if twoi < self.num_weights:
                weight_sum += self.weight_tree[twoi] + self.weight_tree[twoi+1]
            elif twoi == self.num_weights:
                weight_sum += self.weights[twoi]
            self.weight_tree[i] = weight_sum
            i //= 2 # Only need to modify the parents of this node

    def sample(self):
        """ Returns an item index sampled from the distribution. """
        i = 1
        while True:
            twoi = 2*i
            if twoi < self.num_weights:
                # Two children
                val = numpy.random.random() * self.weight_tree[i]
                if val < self.weights[i]:
                    # all indices are offset by 1 for fast traversal of the
                    # internal binary tree
                    return i-1
                elif val < self.weights[i] + self.weight_tree[twoi]:
                    i = twoi # descend into the subtree
                else:
                    i = twoi + 1
            elif twoi == self.num_weights:
                # One child
                val = numpy.random.random() * self.weight_tree[i]
                if val < self.weights[i]:
                    return i-1
                else:
                    i = twoi
            else:
                # No children
                return i-1


def validate_distribution_results(dpd, weights, samples_per_item=1000):
    import time

    bins = numpy.zeros((len(weights),), 'float32')
    num_samples = samples_per_item * numpy.sum(weights)
    start = time.time()
    for i in range(num_samples):
        bins[dpd.sample()] += 1
    duration = time.time() - start

    bins *= numpy.sum(weights)
    bins /= num_samples

    print("Time to make %s samples: %s" % (num_samples, duration))

    # These should be very close to each other
    print("\nWeights:\n", weights)
    print("\nBins:\n", bins)

    sdev_tolerance = 10 # very unlikely to be exceeded
    tolerance = float(sdev_tolerance) / numpy.sqrt(samples_per_item)
    print("\nTolerance:\n", tolerance)

    error = numpy.abs(weights - bins)
    print("\nError:\n", error)

    assert (error < tolerance).all()


##test
def test_DynamicProbDistribution():
    # First test that the initial distribution generates valid samples.
    weights = [2,5,4, 8,3,6, 6,1,3, 4,7,9]
    dpd = DynamicProbDistribution(weights)
    validate_distribution_results(dpd, weights)

    # Now test that we can change the weights and still sample from the
    # distribution.
    print("\nChanging weights...")
    dpd.set_weight(4, 10)
    weights[4] = 10
    dpd.set_weight(9, 2)
    weights[9] = 2
    dpd.set_weight(5, 4)
    weights[5] = 4
    dpd.set_weight(11, 3)
    weights[11] = 3
    validate_distribution_results(dpd, weights)

    print("\nTest passed")


if __name__ == '__main__':
    test_DynamicProbDistribution()
I've implemented a version related to Ken's code, but is balanced with a red/black tree for worst case O(log n) operations. This is available as weightedDict.py at: https://github.com/google/weighted-dict
(I would have added this as a comment to Ken's answer, but don't have the reputation to do that!)