Time complexity of Hill Climbing algorithm for finding local min/max in a graph

What is the time complexity (order of growth) of an algorithm that finds a local minimum in a graph with n nodes, where each node has at most d neighbors?
Detail: We have a graph with n nodes. Each node in the graph has an integer value, and each node has at most d neighbors. We are looking for a node whose value is lower than the values of all its neighbors. The graph is represented by an adjacency list. The algorithm starts by selecting some random nodes and, among these, picks the node with the minimum value (say, node u). Starting from node u, the algorithm finds a neighbor v with value(v) < value(u), then continues from v and repeats this step. The algorithm terminates when the current node has no neighbor with a lower value. What is the time complexity of this algorithm, and why?

Time complexity is O(n·d) in the worst case. Consider n nodes connected in a path, where each number is the value of a node:
16-15-14-13-12-11-10-9-8-7-6-5-4-3-2-1
Suppose the random sample happens to pick the nodes marked by "!":
!-!-!-13-12-11-10-9-8-7-6-5-4-3-2-1
You select the node with value 14, and the described algorithm then walks along the chain, checking every remaining node and edge until it reaches the node with value 1. In general, the descent can visit up to n nodes, and at each node it examines up to d neighbors before moving on, so the total work is O(n·d). In this path example d = 2, so the walk costs O(n).
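A sketch of the descent described above (the sample size, helper names, and example values here are illustrative assumptions, not part of the question):

```python
import random

def hill_climb_min(adj, value, sample_size=3, rng=random):
    """adj: {node: [neighbors]}, value: {node: int}.
    Returns a node whose value is <= the values of all its neighbors."""
    # Start from the minimum-value node in a random sample.
    u = min(rng.sample(list(adj), sample_size), key=lambda x: value[x])
    while True:
        better = [v for v in adj[u] if value[v] < value[u]]  # up to d checks
        if not better:
            return u  # local minimum: no neighbor has a lower value
        u = min(better, key=lambda x: value[x])

# The 16-...-1 path from the answer: on this chain the walk keeps
# descending until it reaches the single local minimum.
nodes = list(range(16, 0, -1))
adj = {v: [] for v in nodes}
for a, b in zip(nodes, nodes[1:]):
    adj[a].append(b)
    adj[b].append(a)
value = {v: v for v in nodes}
print(hill_climb_min(adj, value))  # 1
```

On this path each step does at most 2 neighbor checks and the walk can cover nearly all n nodes, which is the worst case the answer describes.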

Related

All-pair shortest path for minimum spanning tree

I am trying to solve an algorithm challenge about graphs, which I have managed to break down to the following: Given an undirected spanning tree, find the 2 leaves such that the cost between them is minimal.
Now I know of the Floyd Warshall algorithm that can find all-pair shortest paths with time complexity O(N^3) and space complexity O(N^2). The input of the problem is N = 10^5 so O(N^3) and O(N^2) are too much.
Is there a way to optimize space and time complexity for this problem?
As @Codor said, elaborating on that: in an MST there is only one unique path between any pair of nodes, and that path is also the shortest path.
To calculate the shortest path between all pairs, you can follow this algorithm.
First, find the root of the MST by repeatedly removing leaf nodes until only one or two nodes are left.
Complexity: finding a centre node of a tree this way takes O(V), i.e. linear time.
Choose one of them as the root. Then calculate the distance of every other node from the root using Breadth-First Search (BFS).
Complexity: O(V + E), which is O(V) for a tree.
Now you can find the distance between any pair of nodes a, b. Find their lowest common ancestor, lca(a, b). There are two cases:
If lca(a, b) = r (the root of the tree):
dis(a, b) = dis[a] + dis[b]
If lca(a, b) = c (a node that is not the root):
dis(a, b) = dis[a] + dis[b] - 2 * dis[c]
where dis(x, y) is the distance between nodes x and y, and dis[x] is the distance of node x from the root.
If the LCA queries are implemented using a ranked union-find structure:
Complexity: O(h) per pair (a, b), where h is the height of the tree; since the root is a centre, h = ⌈X/2⌉, where X is the diameter of the tree.
So the total complexity depends on the number of leaf-node pairs.
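The distance formula above can be sketched as follows. This uses a naive parent-climbing LCA, which is O(h) per query, as a stand-in for whatever LCA structure you choose; the unweighted example tree is made up:

```python
from collections import deque

def bfs_root(adj, root):
    """Return (dis, parent): distance from root and BFS parent of each node."""
    dis = {root: 0}
    parent = {root: None}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dis:
                dis[v] = dis[u] + 1
                parent[v] = u
                q.append(v)
    return dis, parent

def tree_distance(a, b, dis, parent):
    """dis(a, b) = dis[a] + dis[b] - 2 * dis[lca(a, b)]."""
    x, y = a, b
    while dis[x] > dis[y]:  # bring both nodes to the same depth
        x = parent[x]
    while dis[y] > dis[x]:
        y = parent[y]
    while x != y:           # climb in lockstep until the paths meet at the LCA
        x, y = parent[x], parent[y]
    return dis[a] + dis[b] - 2 * dis[x]

# Small example tree rooted at 0: the path 3-1-0-2-4.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
dis, parent = bfs_root(adj, 0)
print(tree_distance(3, 4, dis, parent))  # 4
```

Note the root case needs no special handling: when lca(a, b) is the root, dis[c] = 0 and the general formula reduces to dis[a] + dis[b].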

Maximum distance of two tree nodes with same color

Given a tree with N vertices where each edge has weight 1. The nodes are colored with C colors. We wish to find, for each color, the maximum shortest distance between two nodes of that color.
I can build a sparse table and then find LCA of two nodes in O(log n). Then check all pairs of same color. This gives O(n^2 log n). Is it possible to do better than this?
You can reorient the edges appropriately and run a recursive traversal starting from each node as if it were the root. As the tree has N nodes, N traversals give you O(N^2). Reorienting the edges takes O(N) time as well, as there are N-1 edges in a tree. If you keep a matrix M with C rows and C columns and update it as you go through each traversal, then you can do what you desire in O(N^2) overall time and space complexity.
Basically, in a traversal starting from node u with color cu, when you reach node v with color cv at distance d, you update the cu-th row of M like this:
M[cu][cv] = max(M[cu][cv], d)
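A rough sketch of this idea, assuming an unweighted adjacency-list tree (node names and colors are illustrative); the answer for each color c is then M[c][c]:

```python
def color_distances(adj, color, C):
    """Run one traversal per root; M[cu][cv] ends up holding the maximum
    distance from a cu-colored node to a cv-colored node."""
    M = [[0] * C for _ in range(C)]
    for root in adj:                      # N traversals: O(N^2) total
        cu = color[root]
        stack = [(root, None, 0)]         # iterative DFS: (node, parent, depth)
        while stack:
            v, par, d = stack.pop()
            cv = color[v]
            M[cu][cv] = max(M[cu][cv], d)
            for w in adj[v]:
                if w != par:
                    stack.append((w, v, d + 1))
    return M

# Path 0-1-2-3 with colors [0, 1, 0, 1]:
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
M = color_distances(adj, [0, 1, 0, 1], 2)
print(M[0][0], M[1][1])  # 2 2  (pairs 0-2 and 1-3)
```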

Complexity of a tree labeling algorithm

I have a generic weighted tree (an undirected, connected graph without cycles) with n nodes and n-1 edges, each connecting one node to another.
My algorithm does the following:
do
compute the current leaves (nodes with degree 1)
remove all the leaves and their edges from the tree, labelling each parent with the maximum cost among the edges to its removed leaves
(for example, if an internal node is connected to two leaves by edges with costs 5 and 6, then after removing the leaves we label the internal node with 6)
until the tree has size <= 2
return the node with the maximum label
Can I say that the complexity is O(n) to compute the leaves and O(n) to eliminate all the leaf edges, so I have O(n) + O(n) = O(n)?
You can easily do this in O(n) with a set implemented as a simple list, queue, or stack (order of processing is unimportant).
Put all the leaves in the set.
In a loop, remove a leaf from the set, delete it and its edge from the graph. Process the label by updating the max of the parent. If the parent is now a leaf, add it to the set and keep going.
When the set is empty you're done, and the node labels are correct.
Initially constructing the set is O(n). Every vertex is placed in the set, removed, and has its label processed exactly once, each in constant time. So for n nodes it is O(n) time, and we have O(n) + O(n) = O(n).
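A minimal sketch of this set-based labelling, assuming the tree comes as a dict mapping each node to its neighbors and their edge costs (the example tree is made up):

```python
from collections import deque

def label_tree(adj):
    """adj: {node: {neighbor: edge_cost}}. Peel leaves inward, labelling
    each parent with the maximum cost among its removed leaf edges."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    label = {u: float('-inf') for u in adj}
    leaves = deque(u for u in adj if degree[u] == 1)
    removed = set()
    while leaves:
        u = leaves.popleft()
        if len(removed) >= len(adj) - 2:    # stop once the tree has size <= 2
            break
        removed.add(u)
        for v, cost in adj[u].items():
            if v in removed:
                continue
            label[v] = max(label[v], cost)  # parent keeps max leaf-edge cost
            degree[v] -= 1
            if degree[v] == 1:              # parent became a leaf: keep going
                leaves.append(v)
    return label

# Centre 0 with two cheap leaves (costs 5, 6) and one longer arm 3-4 (cost 7).
adj = {0: {1: 5, 2: 6, 3: 1}, 1: {0: 5}, 2: {0: 6},
       3: {0: 1, 4: 7}, 4: {3: 7}}
label = label_tree(adj)
print(max(label, key=label.get))  # 3
```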
It's certainly possible to do this process in O(n), but whether your algorithm actually does depends on the implementation.
If either "compute the actual leaves" or "remove all the leaves and their edges" loops over the entire tree, that step takes O(n).
Both of these steps are repeated O(n) times in the worst case (if the tree is greatly unbalanced), so in total it could take O(n^2).
To do this in O(n), you could have each node point to its parent, so you can remove a leaf in constant time, and maintain a collection of the current leaves so you never have to recompute them; this leads to O(n) running time.
Since your tree is arbitrary, it can also be a linked list (a path), in which case each round removes only the two end leaves, so you would need roughly n/2 iterations, each scanning O(n) nodes to find the leaves.
So your algorithm is actually O(n^2).
Here is a better algorithm that does it in O(n) for any tree:
deleteLeaf(Node k):
    max = -infinity
    for each child c of k:
        value = max(cost(k, c), deleteLeaf(c))
        if value > max:
            max = value
        delete(c)
    return max
Call deleteLeaf(root) (or deleteLeaf(root.child)).

Finding number of nodes within a certain distance in a rooted tree

In a rooted and weighted tree, how can you find the number of nodes within a certain distance from each node? You only need to consider down edges, e.g. nodes going down from the root. Keep in mind each edge has a weight.
I can do this in O(N^2) time using a DFS from each node and keeping track of the distance traveled, but with N >= 100000 it's a bit slow. I'm pretty sure you could easily solve it with unweighted edges with DP, but anyone know how to solve this one quickly? (Less than N^2)
It's possible to improve my previous answer to O(n log d) time and O(n) space by making use of the following observation:
The number of sufficiently-close nodes at a given node v is the sum of the numbers of sufficiently-close nodes of each of its children, less the number of nodes that have just become insufficiently-close.
Let's call the distance threshold m, and the distance on the edge between two adjacent nodes u and v d(u, v).
Every node v has a single ancestor that is the first one for which v is too far away to be counted
For each node v, we will maintain a count, c(v), that is initially 0.
For any node v, consider the chain of ancestors from v's parent up to the root. Call the ith node in this chain a(v, i). Notice that v needs to be counted as sufficiently close in some number i >= 0 of the first nodes in this chain, and in no other nodes. If we are able to quickly find i, then we can simply decrement c(a(v, i+1)) (bringing it (possibly further) below 0), so that when the counts of a(v, i+1)'s children are added to it in a later pass, v is correctly excluded from being counted. Provided we calculate fully accurate counts for all children of a node v before adding them to c(v), any such exclusions are correctly "propagated" to parent counts.
The tricky part is finding i efficiently. Call the sum of the distances of the first j >= 0 edges on the path from v to the root s(v, j), and call the list of all depth(v)+1 of these path lengths, listed in increasing order, s(v). What we want to do is binary-search the list of path lengths s(v) for the first entry greater than the threshold m: this would find i+1 in log(d) time. The problem is constructing s(v). We could easily build it using a running total from v up to the root -- but that would require O(d) time per node, nullifying any time improvement. We need a way to construct s(v) from s(parent(v)) in constant time, but the problem is that as we recurse from a node v to its child u, the path lengths grow "the wrong way": every path length x needs to become x + d(u, v), and a new path length of 0 needs to be added at the beginning. This appears to require O(d) updates, but a trick gets around the problem...
Finding i quickly
The solution is to calculate, at each node v, the total path length t(v) of all edges on the path from v to the root. This is easily done in constant time per node: t(v) = t(parent(v)) + d(v, parent(v)). We can then form s(v) by prepending -t(v) to the beginning of s(parent(v)), and when performing the binary search, consider each element s(v, j) to represent s(v, j) + t(v) (or equivalently, binary search for m - t(v) instead of m). The insertion of -t(v) at the start can be achieved in O(1) time by having a child u of a node v share v's path length array, with s(u) considered to begin one memory location before s(v). All path length arrays are "right-justified" inside a single memory buffer of size d+1 -- specifically, nodes at depth k will have their path length array begin at offset d-k inside the buffer to allow room for their descendant nodes to prepend entries. The array sharing means that sibling nodes will overwrite each other's path lengths, but this is not a problem: we only need the values in s(v) to remain valid while v and v's descendants are processed in the preorder DFS.
In this way we gain the effect of O(d) path length increases in O(1) time. Thus the total time required to find i at a given node is O(1) (to build s(v)) plus O(log d) (to find i using the modified binary search) = O(log d). A single preorder DFS pass is used to find and decrement the appropriate ancestor's count for each node; a postorder DFS pass then sums child counts into parent counts. These two passes can be combined into a single pass over the nodes that performs operations both before and after recursing.
[EDIT: Please see my other answer for an even more efficient O(n log d) solution :) ]
Here's a simple O(nd)-time, O(n)-space algorithm, where d is the maximum depth of any node in the tree. A complete tree (a tree in which every node has the same number of children) with n nodes has depth d = O(log n), so this should be much faster than your O(n^2) DFS-based approach in most cases, though if the number of sufficiently-close descendants per node is small (i.e. if DFS only traverses a small number of levels) then your algorithm should not be too bad either.
For any node v, consider the chain of ancestors from v's parent up to the root. Notice that v needs to be counted as sufficiently close in some number i >= 0 of the first nodes in this chain, and in no other nodes. So all we need to do is for each node, climb upwards towards the root until such time as the total path length exceeds the threshold distance m, incrementing the count at each ancestor as we go. There are n nodes, and for each node there are at most d ancestors, so this algorithm is trivially O(nd).
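The climbing algorithm can be sketched like this, assuming the tree is given by parent pointers and per-node edge weights (both representations, and the example tree, are made-up conventions for illustration):

```python
def count_close_descendants(parent, w, m):
    """parent[v] is v's parent (None for the root); w[v] is the weight of
    the edge from v up to parent[v]. Returns, for each node, the number of
    its proper descendants within distance m."""
    count = {v: 0 for v in parent}
    for v in parent:
        total, a = 0, v
        while parent[a] is not None:  # climb at most d ancestors
            total += w[a]
            if total > m:
                break                 # every higher ancestor is further still
            a = parent[a]
            count[a] += 1             # v is sufficiently close to a
    return count

# Root 0; children 1 (edge 2) and 2 (edge 5); node 3 under 1 (edge 2).
parent = {0: None, 1: 0, 2: 0, 3: 1}
w = {0: 0, 1: 2, 2: 5, 3: 2}
print(count_close_descendants(parent, w, 4))  # {0: 2, 1: 1, 2: 0, 3: 0}
```

The early break is valid because edge weights are non-negative: once the running total exceeds m, it can only grow as we climb higher.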

How to find the maximum-weight path between two vertices in a DAG?

In a DAG G, with non negative weighted edges, how do you find the maximum-weight path between two vertices in G?
Thank you guys!
You can solve this in O(n + m) time (where n is the number of nodes and m the number of edges) using a topological sort. Begin by doing a topological sort on the reverse graph, so that the nodes are ordered in such a way that no node is visited before all of its children have been visited.
Now, we're going to label all the nodes with the weight of the highest-weight path starting with that node. This is done based on the following recursive observation:
The weight of the highest-weight path starting from a sink node (any node with no outgoing edges) is zero, since the only path starting from that node is the length-zero path of just that node.
The weight of the highest-weight path starting from any other node is given by the maximum weight of any path formed by following an outgoing edge to a node, then taking the maximum-weight path from that node.
Because we have the nodes reverse-topologically sorted, we can visit all of the nodes in an order that guarantees that if we ever try following an edge and looking up the cost of the heaviest path at the endpoint of that edge, we will have already computed the maximum-weight path starting at that node. This means that once we have the reverse topological sorted order, we can apply the following algorithm to all the nodes in that order:
If the node has no outgoing edges, record the weight of the heaviest path starting at that node (denoted d(u)) as zero.
Otherwise, for each edge (u, v) leaving the current node u, compute l(u, v) + d(v), and set d(u) to be the largest value attained this way.
Once we've done this step, we can make one last pass over all the nodes and return the highest value of d attained by any node.
The runtime of this algorithm can be analyzed as follows. Computing a topological sort can be done in O(n + m) time using many different methods. When we then scan over each node and each outgoing edge from each node, we visit each node and edge exactly once. This means that we spend O(n) time on the nodes and O(m) time on the edges. Finally, we spend O(n) time on one final pass over the elements to find the highest weight path, which takes O(n). This gives a grand total of O(n + m) time, which is linear in the size of the input.
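A compact sketch of this topological-sort DP, assuming the DAG is given as an adjacency dict of (neighbor, weight) pairs (the representation and example graph are illustrative):

```python
from collections import deque

def max_weight_path(adj):
    """adj: {u: [(v, weight), ...]} for a DAG. Returns the weight of the
    heaviest path in the graph, over all start and end nodes."""
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v, _ in adj[u]:
            indeg[v] += 1
    # Kahn's algorithm: a topological order in O(n + m).
    order, q = [], deque(u for u in adj if indeg[u] == 0)
    while q:
        u = q.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    # Reverse topological order: every child is processed before its parents.
    d = {u: 0 for u in adj}               # sinks get d(u) = 0
    for u in reversed(order):
        for v, w in adj[u]:
            d[u] = max(d[u], w + d[v])    # d(u) = max over edges l(u,v) + d(v)
    return max(d.values())                # final pass over all nodes

adj = {'a': [('b', 3), ('c', 1)], 'b': [('d', 2)], 'c': [('d', 7)], 'd': []}
print(max_weight_path(adj))  # 8  (the path a -> c -> d)
```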
A simple brute-force algorithm can be written using recursive functions.
Start with an empty vector (in C++: std::vector) and insert the start node.
Then call your recursive function with the vector as argument. It does the following:
loop over all neighbours, and for each neighbour:
copy the vector
add the neighbour
recurse
Also pass the total weight as an argument to the recursive function, adding the edge weight in every recursive call.
The function should stop whenever it reaches the end node. There, compare the total weight with the maximum weight found so far (use a global variable) and, if the new total weight is bigger, update the maximum weight and store the vector.
The rest is up to you.
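For what it's worth, here is one possible sketch of that brute force in Python rather than C++ (a closed-over list stands in for the suggested global variable; the example graph is made up):

```python
def max_path(adj, start, end):
    """adj: {u: [(v, weight), ...]}. Enumerate every path from start to end
    and return (best_weight, best_path)."""
    best = [float('-inf'), None]             # stands in for the global variable
    def recurse(path, total):
        u = path[-1]
        if u == end:                         # stop at the end node
            if total > best[0]:
                best[0], best[1] = total, path
            return
        for v, w in adj[u]:
            recurse(path + [v], total + w)   # copy the vector, add neighbour
    recurse([start], 0)
    return best[0], best[1]

adj = {'a': [('b', 3), ('c', 1)], 'b': [('d', 2)], 'c': [('d', 7)], 'd': []}
print(max_path(adj, 'a', 'd'))  # (8, ['a', 'c', 'd'])
```

This is exponential in general, unlike the topological-sort solution, but it works on any DAG small enough to enumerate.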
