Optimizations for longest path problem in cyclic graph

What optimizations exist for trying to find the longest path in a cyclic graph?
Longest path in cyclic graphs is known to be NP-complete. What optimizations or heuristics can make finding the longest path faster than DFSing the entire graph? Are there any probabilistic approaches?
I have a graph with specific qualities, but I'm looking for an answer to this in the general case. Linking to papers would be fantastic. Here is a partial answer:
Confirm it is cyclic. Longest path in acyclic graphs is easily computed using dynamic programming.
Find out if the graph is planar (which algorithm is best?). If it is, you might see if it is a block graph, ptolemaic graph, or cactus graph and apply the methods found in this paper.
Find out how many simple cycles there are using Donald B. Johnson's algorithm (Java implementation). You can change any cyclic graph into an acyclic one by removing an edge in a simple cycle. You can then run the dynamic programming solution found on the Wikipedia page. For completeness, you would have to do this N times for each cycle, where N is the length of the cycle. Thus, for an entire graph, the number of times you have to run the DP solution is the product of the lengths of all cycles.
If you have to DFS the entire graph, you can prune some paths by computing the "reachability" of each node in advance. This reachability, which is mainly applicable to directed graphs, is the number of nodes each node can reach without repetitions; it is the maximum length the longest path from that node could possibly have. With this information, if the length of your current path plus the reachability of the child node does not exceed the longest you've already found, there is no point in taking that branch, as it is impossible to find a longer path that way. A sketch of this pruning follows.
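To make that pruning concrete, here is a minimal sketch (my own code, not from the answer above; names like countReachable and longestPathLength are invented) of an exhaustive DFS over simple paths in a directed graph, bounded by precomputed reachability counts:

#include <algorithm>
#include <vector>
using namespace std;

// reach[v] = number of vertices reachable from v (including v itself);
// an optimistic upper bound on the number of vertices on any simple
// path starting at v, and therefore usable for pruning.
int countReachable(int v, const vector<vector<int>>& adj, vector<bool>& seen) {
    seen[v] = true;
    int total = 1;
    for (int w : adj[v])
        if (!seen[w]) total += countReachable(w, adj, seen);
    return total;
}

// Exhaustive DFS for the longest simple path (counted in edges),
// pruned with the reachability bound.
void dfs(int v, int depth, const vector<vector<int>>& adj,
         const vector<int>& reach, vector<bool>& onPath, int& best) {
    best = max(best, depth);
    for (int w : adj[v]) {
        // Even if the rest of the path visited everything reachable from w,
        // it could add at most reach[w] edges; skip branches that can't win.
        if (!onPath[w] && depth + reach[w] > best) {
            onPath[w] = true;
            dfs(w, depth + 1, adj, reach, onPath, best);
            onPath[w] = false;
        }
    }
}

int longestPathLength(const vector<vector<int>>& adj) {
    int n = adj.size(), best = 0;
    vector<int> reach(n);
    for (int v = 0; v < n; ++v) {
        vector<bool> seen(n, false);
        reach[v] = countReachable(v, adj, seen);
    }
    for (int v = 0; v < n; ++v) {
        vector<bool> onPath(n, false);
        onPath[v] = true;
        dfs(v, 0, adj, reach, onPath, best);
    }
    return best;
}

The worst case is still exponential, but on sparse directed graphs the reachability bound can cut off large parts of the search tree early.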

Here is an O(n^2 * 2^n) dynamic programming approach (n * 2^n subproblems, each taking O(n) time to evaluate) that should be feasible for up to, say, 20 vertices:
m(b, U) = the maximum length of any path ending at b and visiting only (some of) the vertices in U.
Initially, set m(b, {b}) = 0.
Then, m(b, U) = the maximum of m(x, U - {b}) + d(x, b) over all x in U such that x is not b and an edge (x, b) exists. Take the maximum of these values over all endpoints b, with U = V (the full set of vertices). That will be the maximum length of any path.
The following C code assumes a distance matrix in d[N][N]. If your graph is unweighted, you can change every read access to this array to the constant 1. A traceback showing an optimal sequence of vertices (there may be more than one) is also computed in the array p[N][NBITS].
#define N 20
#define NBITS (1 << N)

int d[N][N];     /* Assumed to be populated earlier. -1 means "no edge". */
int m[N][NBITS]; /* DP matrix. -2 means "unknown". */
int p[N][NBITS]; /* DP predecessor traceback matrix. */

/* Maximum distance for a path ending at vertex b, visiting only
   vertices in visited. */
int subsolve(int b, unsigned visited) {
    if (visited == (1 << b)) {
        /* A single vertex */
        p[b][visited] = -1;
        return 0;
    }
    if (m[b][visited] == -2) {
        /* Haven't solved this subproblem yet */
        int best = -1, bestPred = -1;
        unsigned i;
        for (i = 0; i < N; ++i) {
            if (i != b && ((visited >> i) & 1) && d[i][b] != -1) {
                int x = subsolve(i, visited & ~(1 << b));
                if (x != -1) {
                    x += d[i][b];
                    if (x > best) {
                        best = x;
                        bestPred = i;
                    }
                }
            }
        }
        m[b][visited] = best;
        p[b][visited] = bestPred;
    }
    return m[b][visited];
}

/* Maximum path length for d[][].
   n must be <= N.
   *last will contain the last vertex in the path; use p[][] to trace back. */
int solve(int n, int *last) {
    int b, i;
    int best = -1;

    /* Need to blank the DP and predecessor matrices */
    for (b = 0; b < N; ++b) {
        for (i = 0; i < NBITS; ++i) {
            m[b][i] = -2;
            p[b][i] = -2;
        }
    }

    for (b = 0; b < n; ++b) {
        int x = subsolve(b, (1 << n) - 1);
        if (x > best) {
            best = x;
            *last = b;
        }
    }
    return best;
}
On my PC, this solves a 20x20 complete graph with edge weights randomly chosen in the range [0, 1000) in about 7s and needs about 160Mb (half of that is for the predecessor trace).
(Please, no comments about using fixed-size arrays. Use malloc() (or better yet, C++ vector<int>) in a real program. I just wrote it this way so things would be clearer.)

How can we find the largest contiguous region of a graph with two different "ID"s?

I've recently learned about the Flood-Fill Algorithm, an algorithm that can take a graph and assign each node a component number in O(N) time.
For example, a common problem that can be solved efficiently with the Flood-Fill Algorithm is finding the largest region in an N*N board, where every node in the region is adjacent to another node with the same ID, either directly up, down, to the left, or to the right. Consider this board:
9 2 7 9
1 1 9 9
3 1 4 5
3 5 6 6
In this board, the largest regions are both of size 3: the region of 1s and the region of 9s.
However, I recently started wondering if we could extend this problem; specifically, if we could find the largest region in a graph such that every node in the region holds one of two possible IDs. In the above board, the largest such region is made up of 1s and 9s, and has a size of 7.
Here was my thought process in trying to solve this problem:
Thought 1: O(N^4) Algorithm
We can solve this in O(N^4) time using a basic flood-fill algorithm. We do this by testing all O(N^2) pairs of horizontally or vertically adjacent squares. For every pair of squares with different IDs, we run a flood-fill from one of the two squares, modified so that it travels to squares with either of the two possible IDs. Each such flood-fill takes O(N^2) time --> O(N^2) pairs * O(N^2) flood fill per pair = O(N^4) algorithm. A sketch of this appears below.
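For concreteness, here is a minimal sketch of that brute-force approach (my own code, not from the question; floodTwo and largestTwoIdRegion are invented names):

#include <algorithm>
#include <vector>
using namespace std;

// Size of the connected region containing (x, y) whose cells all hold
// id1 or id2. seen[][] must be freshly cleared for each top-level call.
int floodTwo(const vector<vector<int>>& board, vector<vector<bool>>& seen,
             int x, int y, int id1, int id2) {
    int n = board.size();
    if (x < 0 || x >= n || y < 0 || y >= n || seen[x][y]) return 0;
    if (board[x][y] != id1 && board[x][y] != id2) return 0;
    seen[x][y] = true;
    return 1 + floodTwo(board, seen, x - 1, y, id1, id2)
             + floodTwo(board, seen, x + 1, y, id1, id2)
             + floodTwo(board, seen, x, y - 1, id1, id2)
             + floodTwo(board, seen, x, y + 1, id1, id2);
}

int largestTwoIdRegion(const vector<vector<int>>& board) {
    int n = board.size(), best = 0;
    for (int x = 0; x < n; ++x)
        for (int y = 0; y < n; ++y)
            for (int dir = 0; dir < 2; ++dir) {          // right, then down
                int nx = x + (dir == 0 ? 0 : 1);
                int ny = y + (dir == 0 ? 1 : 0);
                if (nx >= n || ny >= n) continue;
                if (board[x][y] == board[nx][ny]) continue; // need two IDs
                vector<vector<bool>> seen(n, vector<bool>(n, false));
                best = max(best, floodTwo(board, seen, x, y,
                                          board[x][y], board[nx][ny]));
            }
    return best;
}

On the 4x4 example board above this returns 7.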
Then, I had an insight: A Possibly O(N^2) Algorithm
First, we run a regular flood-fill through the board and separate the board into a "component graph" (where each component in the original graph is reduced to a single node).
Now, we do a flood-fill through the edges of the component graph instead of the nodes. We mark each edge with a pair of integers signifying the two IDs inside the two components which it connects, before flood-filling through the edges as if they themselves were nodes.
I believe that this, if implemented correctly, would result in an O(N^2) algorithm, because an upper bound for the number of edges in an N*N board is 4*N*N.
Now, my question is, is my thought process logically sound? If not, can somebody suggest another algorithm to solve this problem?
Here is the algorithm that I wrote to solve your problem. It expands on your idea to flood-fill through the edges (great idea, by the way) and is able to output the correct answer for a 250*250 grid in less than 300ms, with less than 30 megabytes of memory allocated.
Here is the problem that I managed to find online that matches your question exactly, and it is also where I tested the validity of my algorithm:
USACO Problem
Note that the USACO Problem requires us to find the largest single-id component before finding the largest double-id component. In my algorithm, the first step is actually necessary in order to reduce the whole board into a component graph.
Here's my commented C++ Code:
#include <iostream>
#include <fstream>
#include <cmath>
#include <algorithm>
#include <vector>
#include <unordered_set>
using namespace std;

// board to hold square ids and comp[][] to mark component numbers
vector<vector<int>> board, comp;
vector<int> comp_size = {-1};           // size of those components
vector<int> comp_id = {-1};             // id contained within those components
vector<unordered_set<int>> adj = {{}};  // component graph adjacency list
vector<bool> visited;                   // component graph visited array

void dfs(int x, int y, int N, int id, int curr_comp) {
    if (x < 0 || x >= N || y < 0 || y >= N) { return; }
    else if (board[x][y] != id) {
        if (comp[x][y] == 0) { return; }
        // construct component graph adjacency list during the first flood-fill
        adj[comp[x][y]].insert(curr_comp);
        adj[curr_comp].insert(comp[x][y]);
        // this is why we use an unordered_set: it automatically eliminates
        // duplicate edges
        return;
    }
    else if (comp[x][y]) { return; }
    ++comp_size[curr_comp];
    comp[x][y] = curr_comp;
    dfs(x - 1, y, N, id, curr_comp);
    dfs(x + 1, y, N, id, curr_comp);
    dfs(x, y - 1, N, id, curr_comp);
    dfs(x, y + 1, N, id, curr_comp);
}

void dfs2(int curr, int id1, int id2, int &size) {
    visited[curr] = true;
    // recurse from all valid and adjacent components to curr
    vector<int> to_erase;
    for (int item : adj[curr]) {
        if (visited[item]) { continue; }
        if (comp_id[item] == id1 || comp_id[item] == id2) {
            to_erase.push_back(item);
            size += comp_size[item];
            dfs2(item, id1, id2, size);
        }
    }
    // we erase all edges connecting the current component AT THE SAME TIME to
    // prevent std::unordered_set iterators from being invalidated, which would
    // happen if we erased items as we iterated through adj[curr]
    for (int item : to_erase) {
        adj[curr].erase(item);
        adj[item].erase(curr);
    }
    return;
}

int main()
{
    ifstream fin("multimoo.in");
    ofstream fout("multimoo.out");
    int N;
    fin >> N;
    board = vector<vector<int>>(N, vector<int>(N));
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            fin >> board[i][j];
        }
    }
    // Input Done
    comp = vector<vector<int>>(N, vector<int>(N, 0)); // note that comp[i][j] = 0 means not visited yet
    // regular flood-fill through all the nodes
    for (int i = 0, curr_comp = 1; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            if (comp[i][j]) { continue; }
            // add information about the current component
            comp_size.push_back(0);
            comp_id.push_back(board[i][j]);
            adj.push_back({});
            dfs(i, j, N, board[i][j], curr_comp++);
        }
    }
    fout << *max_element(comp_size.begin(), comp_size.end()) << endl;
    int ANS = 0;
    for (unsigned int i = 1; i < comp_size.size(); ++i) {
        // no range-for loop here as we erase elements while iterating, which
        // may invalidate unordered_set iterators; instead, we use a while-loop
        while (!adj[i].empty()) {
            int size = comp_size[i], curr = *(adj[i].begin());
            visited = vector<bool>(comp_size.size(), false); // reset visited
            dfs2(i, comp_id[i], comp_id[curr], size);
            ANS = max(ANS, size);
        }
    }
    fout << ANS << endl;
    return 0;
}
As for the time complexity, I personally am not very sure. If somebody could help analyze this algorithm to determine its complexity, I'd greatly appreciate it!
Your algorithm works...
As far as I can tell, flood filling over your induced graph indeed gives all possible components, after which it's simple to find the largest one.
...but I'm not sure about the runtime
You correctly say that there are O(N^2) edges in the original graph, and therefore O(N^2) nodes in the induced graph. However, these nodes are no longer guaranteed to be in a nice grid, which may leave more than O(N^2) induced edges.
For example, consider the large "1-block" in your example. This block has 6 edges, which will give a complete graph with 6 vertices, as all these edges-turned-vertices are connected. This may give you an induced graph with more than O(N^2) edges, making it impossible to find components in O(N^2) time.
Therefore, I believe that the algorithm will not run in O(N^2), but I'm unsure of the actual runtime, as it will depend on what exactly the algorithm does at this point. The question only mentions flood fill, but I don't think it anticipated this situation.
Consider the following 9x9 grid:
232323232
311111113
212313212
313212313
212313212
313212313
212313212
313212313
212313212
The idea is simple: it's a single large component designed to border as many small components as possible. The induced graph here would be a single almost-complete graph with O(N^2) vertices and O(N^4) edges. Alternatively, if we only link the (1,2) edges with other (1,2) edges, and similarly the (1,3) edges with other (1,3) edges, we get a slightly less-connected graph, but it would still consist of two components with O(N^4) edges each, albeit with a lower constant.
Therefore, creating this graph would take at least O(N^4) time, as would traversing it. This is the time I would argue that the algorithm takes, but I cannot prove that there are no possible optimizations that improve upon this.
We could achieve the optimal O(N^2) complexity by smartly doing our DFS from each pivot component.
Explanation:
1. First create the set of same-valued components and a relationship-map to their neighbours.
2. Notice that for the flood-zone we are looking for at most 2 distinct values.
3. Let's say we consider the point (i, j) and look at its neighbours.
4. For each 2-value-pair, say [v_ij, v_neighbour] => do a BFS from this (i, j) pivot point while only collecting nodes whose value is one of [v_ij, v_neighbour].
5. Notice that each component is visited only a constant number of times per BFS (we ensure that by deleting the reverse edge from child to parent while doing the BFS).
6. Because of (5), our complexity remains O(N^2).
Working code in Python:
from queue import Queue

class Comp:
    def __init__(self, point, value):
        self.members = {point}
        self.value = value
        self.neighbours = set()
        self.pivot = point

    def can_add_member(self, value):
        return value == self.value

    def add_member(self, point):
        self.members.add(point)

    def add_neighbour(self, neighbour_comp):
        self.neighbours.add(neighbour_comp)

    def __str__(self):
        return '[M:%d, V:%d, N:%d]' % (len(self.members), self.value, len(self.neighbours))

def find_largest_flood_region(D):
    point_to_comp_map = {}
    N, M = len(D), len(D[0])
    # Step-1: Create same-value connected-components:
    for x in range(N):
        for y in range(M):
            if (x, y) in point_to_comp_map:
                continue
            comp_xy = Comp((x, y), D[x][y])
            point_to_comp_map[(x, y)] = comp_xy
            pq = Queue()
            pq.put((x, y))
            while pq.qsize() > 0:
                i, j = pq.get()
                for l, m in [(i-1, j), (i+1, j), (i, j-1), (i, j+1)]:
                    if 0 <= l < N and 0 <= m < M and (l, m) not in point_to_comp_map and D[l][m] == D[x][y]:
                        comp_xy.add_member((l, m))
                        point_to_comp_map[(l, m)] = comp_xy
                        pq.put((l, m))
    # Step-2: Create the relationship-map between the components created above
    for x in range(N):
        for y in range(M):
            comp_xy: Comp = point_to_comp_map[(x, y)]
            for i, j in [(x-1, y), (x+1, y), (x, y-1), (x, y+1)]:
                if 0 <= i < N and 0 <= j < M and D[i][j] != D[x][y]:
                    comp_ij: Comp = point_to_comp_map[(i, j)]
                    comp_xy.add_neighbour(comp_ij)
                    comp_ij.add_neighbour(comp_xy)
    # Do BFS one by one on each unique component:
    unique_comps = set(point_to_comp_map.values())
    max_region = 0
    for comp in unique_comps:
        potential_values = set([neigh_comp.value for neigh_comp in comp.neighbours])
        for value in potential_values:
            value_set = {value, comp.value}
            region_value = 0
            pq = Queue()
            pq.put(comp)
            while pq.qsize() > 0:
                comp_xy: Comp = pq.get()
                region_value += len(comp_xy.members)
                for ncomp in comp_xy.neighbours:
                    if ncomp.value in value_set:
                        if comp_xy in ncomp.neighbours:
                            ncomp.neighbours.remove(comp_xy)
                        pq.put(ncomp)
            max_region = max(max_region, region_value)
    return max_region

D = [
    [9, 2, 7, 9],
    [1, 1, 9, 9],
    [3, 1, 4, 5],
    [3, 5, 6, 6],
]
print(find_largest_flood_region(D))
Output:
7
We can show that solving this in O(n), where n is the number of elements in the matrix, is possible with two passes of a flood-fill union-find routine without a depth-first search.
Given
9 2 7 9
1 1 9 9
3 1 4 5
3 5 6 6
after we label with flood fill, we have:
A B C D
E E D D
F E G H
F I J J
Now that we know each component's size, we can restrict each cell to testing its best connection to a different component to its left or above. We only need to check one field in a map on the component, the one pointing to the same number, and potentially create a new component of reference, or merge two.
In the following example, we'll label components that span more than one value with two letters unrelated to their original components. Each cell visited can generate at most two new components and update at most two components, so the complexity remains O(n).
Iterating left to right, top to bottom:
A0: {⊥: 1, 2: AA, 1: DD}
B0: {⊥: 1, 9: AA, 7: BB, 1: EE}
AA = {size: 2}
C0: {⊥: 1, 2: BB, 9: CC}
BB = {size: 2}
D0: {⊥: 3, 7: CC}
CC = {size: 4}
E0: {⊥: 3, 9: DD}
DD = {size: 4}
E1: {⊥: 3, 9: DD, 2: EE}
EE = {size: 4}
D1: {⊥: 3, 7: CC, 1: DD}
DD updates to size 7
D2: {⊥: 3, 7: CC, 1: DD}
F0: {⊥: 2, 1: FF}
FF = {size: 5}
... etc.
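Here is one possible concrete rendering of this idea (my own sketch with invented names; it batches the two-value merges by value pair rather than updating per-cell maps as traced above, and the map/set overhead makes it O(n log n) rather than strictly O(n), but the two union-find passes are the same):

#include <algorithm>
#include <map>
#include <set>
#include <vector>
using namespace std;

struct DSU {
    vector<int> parent, size;
    DSU(int n) : parent(n), size(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    int unite(int a, int b) {             // returns size of the merged set
        a = find(a); b = find(b);
        if (a == b) return size[a];
        if (size[a] < size[b]) swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return size[a];
    }
};

int largestTwoIdRegion(const vector<vector<int>>& board) {
    int n = board.size();
    auto cell = [n](int x, int y) { return x * n + y; };

    // Pass 1: union equal-ID neighbours into same-value components.
    DSU comps(n * n);
    for (int x = 0; x < n; ++x)
        for (int y = 0; y < n; ++y) {
            if (x + 1 < n && board[x][y] == board[x + 1][y])
                comps.unite(cell(x, y), cell(x + 1, y));
            if (y + 1 < n && board[x][y] == board[x][y + 1])
                comps.unite(cell(x, y), cell(x, y + 1));
        }

    // Group component adjacencies by their (smaller value, larger value) pair.
    map<pair<int, int>, set<pair<int, int>>> byPair;
    for (int x = 0; x < n; ++x)
        for (int y = 0; y < n; ++y)
            for (int d = 0; d < 2; ++d) {
                int nx = x + (d == 0 ? 1 : 0), ny = y + (d == 0 ? 0 : 1);
                if (nx >= n || ny >= n) continue;
                int a = board[x][y], b = board[nx][ny];
                if (a == b) continue;
                byPair[{min(a, b), max(a, b)}].insert(
                    {comps.find(cell(x, y)), comps.find(cell(nx, ny))});
            }

    // Single-ID regions are candidate answers too.
    int best = 0;
    for (int v = 0; v < n * n; ++v)
        best = max(best, comps.size[comps.find(v)]);

    // Pass 2: per value pair, union the touched components in a local DSU
    // seeded with the component sizes, tracking the largest merged size.
    for (auto& entry : byPair) {
        const set<pair<int, int>>& edges = entry.second;
        map<int, int> index;               // component root -> local index
        vector<int> roots;
        for (auto& e : edges)
            for (int r : {e.first, e.second})
                if (index.emplace(r, (int)roots.size()).second)
                    roots.push_back(r);
        DSU local((int)roots.size());
        for (size_t i = 0; i < roots.size(); ++i)
            local.size[i] = comps.size[roots[i]];
        for (auto& e : edges)
            best = max(best, local.unite(index[e.first], index[e.second]));
    }
    return best;
}

On the grid above, the (1, 9) value pair merges the 1-component with both 9-components, giving the expected answer of 7.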

Placing blocks and calculating height of the highest tower

A series of k blocks is given (k1, k2, ..., kk). Each block starts at position ai, ends at position bi, and has height 1. Blocks are placed consecutively; if a block overlaps another one, it is attached on top of it. My task is to calculate the height of the highest tower of blocks.
I have created an algorithm whose time complexity is about O(n^2), but I know there is a faster solution using a skip list.
#include <iostream>

struct Brick
{
    int begin;
    int end;
    int height = 1;
};

bool DoOverlap(Brick a, Brick b)
{
    return (a.end > b.begin && a.begin < b.end);
}

int theHighest(Brick bricks[], int n)
{
    int height = 1;
    for (int i = 1; i < n; i++)
    {
        for (int j = 0; j < i; j++)
        {
            if (bricks[i].height <= bricks[j].height && DoOverlap(bricks[i], bricks[j]))
            {
                bricks[i].height = bricks[j].height + 1;
                if (bricks[i].height > height)
                    height = bricks[i].height;
            }
        }
    }
    return height;
}
You can simply use two pointers after sorting the blocks based on their starting positions; if starting positions match, sort based on ending positions. Then use the two pointers to find the maximum height.
Time complexity: O(N log N)
You can find the demo link here
#include <bits/stdc++.h>
using namespace std;
#define ii pair<int,int>

bool modified_sort(const pair<int,int> &a,
                   const pair<int,int> &b)
{
    if (a.first == b.first) {
        return (a.second < b.second);
    }
    return (a.first < b.first);
}

int main() {
    // your code goes here
    vector<ii> blocks;
    int n; // no of blocks
    int a, b;
    cin >> n;
    for (int i = 0; i < n; i++) {
        cin >> a >> b;
        blocks.push_back(ii(a, b));
    }
    sort(blocks.begin(), blocks.end(), modified_sort);
    int start = 0, end = 0;
    int max_height = 0;
    while (end < n) {
        while (start < end && blocks[start].second <= blocks[end].first)
        {
            start++;
        }
        max_height = max(max_height, (end - start + 1));
        end++;
    }
    cout << max_height << endl;
    return 0;
}
Here is a straightforward solution (without skip lists), sketched in code below:
Create an array heights.
Iterate through the blocks. For every block:
Check the existing entries in the heights array for the positions the current block occupies, and determine their maximum.
Set the values in the heights array for the positions of the current block to that maximum plus 1.
Keep track of the maximum tower height built during the scan.
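A minimal sketch of this scan (my own code; it assumes block positions are modest non-negative integers so they can index the array directly, and treats a block as occupying the half-open range [begin, end)):

#include <algorithm>
#include <vector>
using namespace std;

struct Brick { int begin, end; };   // occupies positions [begin, end)

// Height of the tallest tower after placing the bricks in the given order.
int tallestTower(const vector<Brick>& bricks, int maxPos) {
    vector<int> heights(maxPos, 0);  // current stack height at each position
    int best = 0;
    for (const Brick& b : bricks) {
        // The brick comes to rest on the tallest stack it overlaps.
        int top = 0;
        for (int pos = b.begin; pos < b.end; ++pos)
            top = max(top, heights[pos]);
        // Every position it covers now reaches top + 1.
        for (int pos = b.begin; pos < b.end; ++pos)
            heights[pos] = top + 1;
        best = max(best, top + 1);
    }
    return best;
}

This is O(total block length) rather than O(n^2) in the number of blocks; a segment tree supporting range-maximum queries and range assignment would bring the per-block cost down to O(log maxPos).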
This problem is isomorphic to a graph traversal. Each interval (block) is a node of the graph. Two blocks are connected by an edge iff their intervals overlap (a stacking possibility). The example you give has graph edges
1 2
1 3
2 3
2 5
and node 4 has no edges
Your highest stack is isomorphic to the longest cycle-free path in the graph. This problem has well-known solutions.
BTW, I don't think your n^2 algorithm works for all orderings of blocks. Try a set of six blocks with one overlap each, such as the intervals [n, n+3] for n in {2, 4, 6, 8, 10, 12}. Feed all permutations of these blocks to your algorithm, and see whether it comes up with a height of 6 for each.
Complexity
I think the highest complexity is likely to be sorting the intervals to accelerate marking the edges. The sort will be O(n log n). Adding edges is O(n d) where d is the mean degree of the graph (and n*d is the number of edges).
I don't have the graph traversal algorithm solidly in mind, but I expect that it's O(d log n).
It looks like you can store your already-processed blocks in a skip list, ordered by starting position. Then, to find the blocks overlapping the current one at each step, you search this skip list, which is O(log n) on average: you find the first overlapping block, then iterate to the next, and so on until you meet the first non-overlapping block.
So on average you can get O(n * (log(n) + m)), where m is the mean number of overlapping blocks. In the worst case you still get O(n^2). A sketch follows.
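For illustration, here is a rough sketch of that approach (my own code, substituting std::multimap, which is also ordered with O(log n) search, for the skip list; names are invented):

#include <algorithm>
#include <map>
#include <vector>
using namespace std;

struct Brick { int begin, end; };   // occupies [begin, end)

int tallestTower(const vector<Brick>& bricks) {
    // Already-placed blocks keyed by starting position; the mapped value
    // holds (end, height). A skip list ordered by start plays the same role.
    multimap<int, pair<int, int>> placed;
    int best = 0;
    for (const Brick& b : bricks) {
        int top = 0;
        // O(log n) search for the first block starting at or after b.end;
        // every block before that iterator starts early enough to overlap,
        // and we keep those that also end after b.begin.
        auto stop = placed.lower_bound(b.end);
        for (auto it = placed.begin(); it != stop; ++it)
            if (it->second.first > b.begin)
                top = max(top, it->second.second);
        placed.emplace(b.begin, make_pair(b.end, top + 1));
        best = max(best, top + 1);
    }
    return best;
}

The scan between begin() and the search boundary corresponds to the m term of the estimate above and degenerates to O(n) per block in the worst case, matching the O(n^2) caveat.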

Global min-cut algorithm which outputs exact edges

I am looking for an algorithm to find a global min-cut in an undirected graph.
I want to input a graph, and the algorithm should output the minimum set of edges such that cutting them partitions the given graph into two parts.
Here are the requirements:
Find the exact edges, not only their number.
The min-cut edges must be computed 100% correctly.
The graph is undirected.
The algorithm shall terminate by indicating whether or not it found the answer.
I searched some articles on the Internet and found out that Karger's minimum cut algorithm is a randomized one, so its output may not be an exact min cut. I don't want an algorithm like that.
I want to compute the exact edges (I need to know which edges they are) whose number is the smallest.
I would like to hear some advice while I am looking for such algorithms.
It would be great if your advice came with an introduction to the algorithm and example code.
Thanks in advance.
We can do this using a max-flow algorithm. By the max-flow min-cut theorem, the value of a maximum s-t flow equals the capacity of a minimum s-t cut, and once the max flow has been computed, the min-cut edges are the ones that cross from the side of the residual graph still reachable from the source to the unreachable side (each such edge is saturated, although not every saturated edge need lie in the min cut). You can read more about the Max flow Min cut theorem. Calculating the max flow is a fairly standard problem; there are a lot of polynomial-time solutions available for it, and you can read more about them here. For a global min cut, with no fixed source and sink, you can fix an arbitrary source s and take the smallest cut over all n-1 choices of the sink t.
So to summarize the algorithm: we first find the max flow of the graph, then the edges in the min cut are the ones that lead from a vertex still reachable from the source in the residual graph to a vertex that is not.
One of the ways to solve the max flow problem is the Ford-Fulkerson algorithm: it finds an augmenting path in the graph, saturates that path using its minimum-capacity edge, and repeats this process until no augmenting paths are left.
To find the augmenting path we can do either a depth-first search or a breadth-first search. The Edmonds-Karp algorithm finds an augmenting path by using a simple breadth-first search.
I have written the C++ code below to find the max flow and min cut; the max-flow code is taken from the book "Competitive Programming" by Steven Halim.
Also note that since the graph is undirected, every edge is stored in both directions; the minCut function reports each cut edge once, from the reachable side to the unreachable side.
#include<iostream>
#include<vector>
#include<queue>
#include<utility>
#include<algorithm>
#define MAXX 100
#define INF 1000000000
using namespace std;

int s, t, flow, n, dist[MAXX], par[MAXX], AdjMat[MAXX][MAXX]; // adjacency-matrix graph (holds residual capacities)
vector< pair<int, int> > G[MAXX]; // adjacency-list graph

/* Call after EdmondsKarp(): dist[u] != -1 exactly for the vertices the final
   BFS could still reach from s in the residual graph. The min-cut edges are
   the ones leading from a reachable vertex to an unreachable one. */
void minCut(){
    for(int u = 0; u < n; u++){
        if(dist[u] == -1) continue;
        for(int j = 0; j < (int)G[u].size(); j++){
            int v = G[u][j].first;
            if(dist[v] == -1){
                cout << u << " " << v << endl;
            }
        }
    }
}

void augmentPath(int v, int minEdge){
    if(v == s){ flow = minEdge; return; }
    else if(par[v] != -1){
        augmentPath(par[v], min(minEdge, AdjMat[par[v]][v]));
        AdjMat[par[v]][v] -= flow; // forward edges
        AdjMat[v][par[v]] += flow; // backward edges
    }
}

void EdmondsKarp(){
    int max_flow = 0;
    while(1){
        flow = 0;
        // reset the BFS state before every search for an augmenting path
        for(int i = 0; i < n; i++) dist[i] = -1, par[i] = -1;
        queue<int> q;
        q.push(s); dist[s] = 0;
        while(!q.empty()){
            int u = q.front(); q.pop();
            if(u == t) break;
            for(int i = 0; i < (int)G[u].size(); i++){
                int v = G[u][i].first;
                if(AdjMat[u][v] > 0 && dist[v] == -1){
                    dist[v] = dist[u] + 1;
                    q.push(v);
                    par[v] = u;
                }
            }
        }
        augmentPath(t, INF);
        if(flow == 0) break; // max flow reached; dist[] now marks the source side of the min cut
        max_flow += flow;
    }
}

int main(){
    // Create the graph here, both as an adjacency list and adjacency matrix,
    // also mark the source i.e "s" and sink "t", before calling max flow.
    return 0;
}

Path of Length N in graph with constraints

I want to find the number of paths of length N in a graph where the vertices can be any natural number. However, two vertices are connected only if their product is less than some natural number P. If the product of two vertices is greater than P, then they are not connected and can't be reached from one another.
I can obviously run two nested loops (<= P) and create an adjacency matrix, but P can be extremely large and this approach would be extremely slow. Can anyone think of some optimal approach to solve the problem? Can we solve it using Dynamic Programming?
I agree with Ante's recurrence, although I used a slightly simplified version. Note that I'm using the letter P to name the maximum product, as it is used in the original problem statement:
f(1,x) = 1
f(i,x) = sum(f(i-1, y) for y in {1, ..., floor(P/x)})
f(i,x) is the number of sequences of length i that end with x. The answer to the question is then f(n+1, 1).
Of course, since P can be up to 10^9 in this task, a straightforward implementation with a DP table is out of the question. However, there are only up to m < 70000 possible different values of floor(P/i). So let's find the maximal segments a_j ... b_j where floor(P/a_j) = floor(P/b_j). We can find those segments in O(number of segments * log P) using binary search.
Imagine the full DP table for f. Since there are only m different values for floor(P/x), every row of f consists of m contiguous ranges that have the same value.
So let's compute the compressed DP table, where we represent the rows as list of (length, value) pairs. We start with f(1) = [(P, 1)] and we can compute f(i+1) from f(i) by processing the segments in increasing order and computing prefix sums of the lengths stored in f(i).
The total runtime of my implementation of this approach is O(m (log P + n)). This is the code I used:
#include <bits/stdc++.h>
using namespace std;
using ll = long long;

const int mod = 1000000007;
void add(int& x, ll y) { x = (x + y) % mod; }

int main() {
    int n, P;
    cin >> n >> P;
    // find the maximal segments on which floor(P/x) is constant
    int x = 1;
    vector<pair<int,int>> segments; // (floor(P/x), segment length)
    while (x <= P) {
        int y = x + 1, hi = P + 1;
        while (y < hi) {
            int mid = (y + hi) / 2;
            if (P / mid < P / x) hi = mid;
            else y = mid + 1;
        }
        segments.push_back(make_pair(P / x, y - x));
        x = y;
    }
    reverse(begin(segments), end(segments));
    // compressed DP table: each row is a list of (length, value) pairs
    vector<pair<int,int>> dp;
    dp.push_back(make_pair(P, 1));
    for (int i = 1; i <= n; ++i) {
        int j = 0;
        int sum_smaller = 0, cnt_smaller = 0;
        vector<pair<int,int>> dp2;
        for (auto it : segments) {
            int value = it.first, cnt = it.second;
            while (cnt_smaller + dp[j].first <= value) {
                cnt_smaller += dp[j].first;
                add(sum_smaller, (ll)dp[j].first * dp[j].second);
                j++;
            }
            int pref_sum = sum_smaller;
            if (value > cnt_smaller)
                add(pref_sum, (ll)(value - cnt_smaller) * dp[j].second);
            dp2.push_back(make_pair(cnt, pref_sum));
        }
        dp = dp2;
        reverse(begin(dp), end(dp));
    }
    cout << dp[0].second << endl;
}
I needed to do some micro-optimizations with the handling of the arrays to get AC, but those aren't really relevant, so I left them out.
If the number of vertices is small, the adjacency matrix A can help: the sum of the elements of A^N is the number of distinct paths of length N, if paths are oriented. If not, the number of paths is that sum divided by 2. That is because element (i, j) of A^N is the number of paths of length N from vertex i to vertex j.
In this case, the same approach can be taken with DP, using the reasoning that the number of paths of length n starting from vertex v is the sum of the numbers of paths of length n-1 from all of its neighbours. The neighbours of vertex i are the vertices 1 to floor(Q/i). With that we can construct a function N(vertex, length) which represents the number of paths of the given length starting from the given vertex:
N(i, 1) = floor(Q/i),
N(i, n) = sum( N(j, n-1) for j in {1, ..., floor(Q/i)} ).
The number of all oriented paths of length n is sum( N(i, n) ) over all vertices i.
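A direct sketch of this recurrence for small Q (my own code; it counts oriented paths, with neighbours defined by the floor(Q/i) convention used above, and speeds up the inner sum with prefix sums):

#include <iostream>
#include <vector>
using namespace std;

// Count oriented paths of length n on vertices 1..Q, where i and j are
// adjacent iff j <= floor(Q/i). Only feasible for small Q; the
// compressed-segment DP above is the way to handle large Q.
long long countPaths(long long Q, int n) {
    vector<long long> N(Q + 1);
    for (long long i = 1; i <= Q; ++i) N[i] = Q / i;    // N(i, 1)
    for (int len = 2; len <= n; ++len) {
        // prefix[k] = N(1, len-1) + ... + N(k, len-1)
        vector<long long> prefix(Q + 1, 0);
        for (long long k = 1; k <= Q; ++k) prefix[k] = prefix[k - 1] + N[k];
        for (long long i = 1; i <= Q; ++i)
            N[i] = prefix[Q / i];                       // sum over neighbours
    }
    long long total = 0;
    for (long long i = 1; i <= Q; ++i) total += N[i];   // sum over start vertices
    return total;
}

int main() {
    cout << countPaths(10, 2) << endl;   // oriented paths of length 2, Q = 10
}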

Algorithm complexity for minimum number of cliques in a graph

I have written an algorithm which computes the minimum number of cliques needed to cover a graph. I have tested my backtracking algorithm, but I couldn't work out the worst-case time complexity, although I have tried many times.
I know that this problem is NP-hard, but I think it should still be possible to give a worst-case time complexity based on the code. What is the worst-case time complexity for this code? Any ideas? How would you formalize the recursive equation?
I have tried to write understandable code. If you have any questions, write a comment.
I will be very glad for tips, references, answers.
Thanks for the tips, guys :).
EDIT
As M C commented, basically I have tried to solve the Clique cover problem.
Pseudocode:
function countCliques(graph, vertice, cliques, numberOfClique, minimumSolution)
    for i = 1 .. number of cliques + 1 loop
        if i > minimumSolution then
            return;
        end if
        if fitToClique(cliques(i), vertice, graph) then
            addVerticeToClique(cliques(i), vertice);
            if vertice == 0 then // last vertice
                minimumSolution = numberOfClique
                printResult(result);
            else
                if i == number of cliques + 1 then // if we are using a new clique, the +1 is always a new clique
                    countCliques(graph, vertice - 1, cliques, number of cliques + 1, minimum)
                else
                    countCliques(graph, vertice - 1, cliques, number of cliques, minimum)
                end if
            end if
            deleteVerticeFromClique(cliques(i), vertice);
        end if
    end loop
end function

bool fitToClique(clique, vertice, graph)
    for i = 1 .. cliqueSize loop
        verticeFromClique = clique(i)
        if not connected(verticeFromClique, vertice) then
            return false
        end if
    end loop
    return true
end function
Code
int c = 0; // global counter of complete solutions found

int countCliques(int** graph, int currentVertice, int** result, int numberOfSubset, int& minimum) {
    // if solution
    if (currentVertice == -1) {
        // if a better solution
        if (minimum > numberOfSubset) {
            minimum = numberOfSubset;
            printf("New minimum result:\n");
            print(result, numberOfSubset);
        }
        c++;
    } else {
        // if not a solution, try to insert into a clique; if it does not fit, create a new clique (+1 in the loop)
        for (int i = 0; i < numberOfSubset + 1; i++) {
            if (i > minimum) {
                break;
            }
            // if it fits
            if (fitToSubset(result[i], currentVertice, graph)) {
                // insert
                result[i][0]++;
                result[i][result[i][0]] = currentVertice;
                // try to insert the next vertice
                countCliques(graph, currentVertice - 1, result, (i == numberOfSubset) ? (i + 1) : numberOfSubset, minimum);
                // delete vertice from the clique
                result[i][0]--;
            }
        }
    }
    return c;
}

bool fitToSubset(int *subSet, int currentVertice, int **graph) {
    int subsetLength = subSet[0];
    for (int i = 1; i < subsetLength + 1; i++) {
        if (graph[subSet[i]][currentVertice] != 1) {
            return false;
        }
    }
    return true;
}

void print(int **result, int n) {
    for (int i = 0; i < n; i++) {
        int m = result[i][0];
        printf("[");
        for (int j = 1; j < m; j++) {
            printf("%d, ", result[i][j] + 1);
        }
        printf("%d]\n", result[i][m] + 1);
    }
}

int** readFile(const char* file, int& v, int& e) {
    int from, to;
    int **graph;
    FILE *graphFile;
    fopen_s(&graphFile, file, "r");
    fscanf_s(graphFile, "%d %d", &v, &e);
    graph = (int**)malloc(v * sizeof(int*)); // note: sizeof(int*), one pointer per row
    for (int i = 0; i < v; i++) {
        graph[i] = (int*)calloc(v, sizeof(int));
    }
    while (fscanf_s(graphFile, "%d %d", &from, &to) == 2) {
        graph[from - 1][to - 1] = 1;
        graph[to - 1][from - 1] = 1;
    }
    fclose(graphFile);
    return graph;
}
The time complexity of your algorithm is very closely linked to listing compositions of an integer, of which there are O(2^N).
The compositions alone are not enough though, as there is also a combinatorial aspect, although there are rules as well. Specifically, a clique must contain the highest-numbered unused vertex.
An example is the composition 2-2-1 (N = 5). The first clique must contain vertex 5, the highest-numbered unused vertex, reducing the number of unused vertices to 4. There is then a choice of 1 of 4 elements to complete that clique, leaving 3 unused vertices. One element of the second clique is then forced (the highest remaining), leaving 2 unused vertices, and a choice of 1 of the 2 remaining elements decides the final vertex of the second clique. This leaves only a single vertex for the last clique. For this composition there are thus 8 possible ways it could be made, given by 1*C(4,1)*1*C(2,1)*1. The 8 possible ways are as follows:
(5,4),(3,2),(1)
(5,4),(3,1),(2)
(5,3),(4,2),(1)
(5,3),(4,1),(2)
(5,2),(4,3),(1)
(5,2),(4,1),(3)
(5,1),(4,3),(2)
(5,1),(4,2),(3)
The above example shows the format required for the worst case, which is when the composition contains as many 2s as possible. I suspect this is still close to O(N!), even though the count is actually (N-1)(N-3)(N-5)...(1) or (N-1)(N-3)(N-5)...(2). However, this worst case is impossible in practice: as shown, it would require a complete graph, which would be caught right away and would limit the solution to a single clique, of which there is only one.
Given the variations of the compositions, the number of possible compositions is probably a fair starting point for the upper bound, O(2^N). That there are O(3^(N/3)) maximal cliques is another useful bit of information, as the algorithm could theoretically find all of them. That isn't good enough either, though, as some maximal cliques are found multiple times while others are not found at all.
A tighter upper bound is difficult to establish for two main reasons. First, the algorithm progressively limits the maximum number of cliques (the size of the composition, if you like), which puts an upper limit on the computation time spent per clique. Second, missing edges cause a large number of possible variations to be ignored, which almost ensures that the vast majority of the O(N!) variations are skipped. Combined with the above paragraph, this makes the upper bound difficult to pin down. If this isn't enough for an answer, you might want to take the question to the math area of Stack Exchange, as a better answer will require a fair bit of mathematical analysis.
