Z3 connected components - logic

As part of a puzzle solver, I dynamically build a graph, nodes and edges, from user input.
Each node is assigned an integer const, representing which connected component it is part of.
Nodes are restricted to have the same connected component as their neighbors.
Currently, it can assign the same number to multiple components, but I want to restrict the solution so that each connected component must have a unique number.
I cannot wrap my head around how to express this constraint in Z3.
I'm not looking for working code, just a high level description of how to approach this, and I can take it from there.
Thanks in advance!

Connectivity between nodes of a graph component (possibly via intermediate nodes) can be modelled using Z3's TransitiveClosure.
The following snippet might get you started:
# https://stackoverflow.com/q/56496558/1911064
from z3 import *
B = BoolSort()
NodeSort = DeclareSort('Node')
R = Function('R', NodeSort, NodeSort, B)
TC_R = TransitiveClosure(R)
s = Solver()
na, nb, nc = Consts('na nb nc', NodeSort)
s.add(R(na, nb))          # edge a -> b
s.add(R(nb, nc))          # edge b -> c
s.add(Not(TC_R(na, nc)))  # assert that a cannot reach c
print(s.check())          # unsat: the closure of R forces TC_R(na, nc)
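Building on that, one high-level way to get unique component numbers is to tie the labels to reachability: two nodes get the same label exactly when one can reach the other (or they are the same node). Below is a minimal sketch of that idea, not part of the original snippet; the helper names (label, nodes) are illustrative, and the graph's non-edges are asserted explicitly so that the user-supplied graph is pinned down:
from z3 import *

NodeSort = DeclareSort('Node')
R = Function('R', NodeSort, NodeSort, BoolSort())
TC_R = TransitiveClosure(R)
label = Function('label', NodeSort, IntSort())   # component number per node

na, nb, nc = Consts('na nb nc', NodeSort)
nodes = [na, nb, nc]

s = Solver()
s.add(Distinct(na, nb, nc))
s.add(R(na, nb))               # the only edge: a -> b
# Pin down the graph: everything not asserted as an edge is a non-edge.
s.add(Not(R(nb, nc)), Not(R(na, nc)), Not(R(nb, na)),
      Not(R(nc, na)), Not(R(nc, nb)), Not(R(na, na)),
      Not(R(nb, nb)), Not(R(nc, nc)))

# Same label iff connected (in either direction) or identical.
for u in nodes:
    for v in nodes:
        connected = Or(u == v, TC_R(u, v), TC_R(v, u))
        s.add((label(u) == label(v)) == connected)

print(s.check())  # sat; in the intended model a and b share a label, c differs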

Related

Removing unnecessary nodes in graph

I have a graph that has two distinct classes of nodes, class A nodes and class B nodes.
Class A nodes are not connected to any other A nodes and class B nodes aren’t connected to any other B nodes, but B nodes are connected to A nodes and vice versa. Some B nodes are connected to lots of A nodes and most A nodes are connected to lots of B nodes.
I want to eliminate as many of the A nodes as possible from the graph.
I must keep all of the B nodes, and they must still be connected to at least one A node (preferably only one A node).
I can eliminate an A node when it has no B nodes connected only to it. Are there any algorithms that could find an optimal, or at least close to optimal, solution for which A nodes I can remove?
Old, Incorrect Answer, But Start Here
First, you need to recognize that you have a bipartite graph. That is, you can colour the nodes red and blue such that no edges connect a red node to a red node or a blue node to a blue node.
Next, recognize that you're trying to solve a vertex cover problem. From Wikipedia:
In the mathematical discipline of graph theory, a vertex cover (sometimes node cover) of a graph is a set of vertices such that each edge of the graph is incident to at least one vertex of the set. The problem of finding a minimum vertex cover is a classical optimization problem in computer science and is a typical example of an NP-hard optimization problem that has an approximation algorithm.
Since you have a special graph, it's reasonable to think that maybe the NP-hard doesn't apply to you. This thought brings us to Kőnig's theorem which relates the maximum matching problem to the minimum vertex cover problem. Once you know this, you can apply the Hopcroft–Karp algorithm to solve the problem in O(|E|√|V|) time, though you'll probably need to jigger it a bit to ensure you keep all the B nodes.
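For illustration only (remember, this is the old, incorrect line of attack), networkx already packages the Kőnig's-theorem route: compute a maximum matching with Hopcroft–Karp and convert it to a minimum vertex cover. The small graph below is made up for the example:
# Sketch of Kőnig's theorem in networkx: maximum matching -> minimum vertex cover.
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph()
A = ['a1', 'a2', 'a3']                      # "class A" side of the bipartition
G.add_edges_from([('a1', 'b1'), ('a1', 'b2'),
                  ('a2', 'b2'), ('a2', 'b3'),
                  ('a3', 'b4')])

matching = bipartite.hopcroft_karp_matching(G, top_nodes=A)
cover = bipartite.to_vertex_cover(G, matching, top_nodes=A)
print(len(matching) // 2, len(cover))       # equal sizes, by Kőnig's theorem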
New, Correct Answer
It turns out this jiggering is the creation of a "constrained bipartite graph vertex cover problem", which asks us if there is a vertex cover that uses fewer than a A-nodes and fewer than b B-nodes. The problem is NP-complete, so that's a no go. The jiggering was harder than I thought!
But using less than the minimum number of nodes isn't the constraint we want. We want to ensure that the minimum number of A-nodes is used and the maximum number of B-nodes.
Kőnig's theorem, above, is a special case of the maximum flow problem. Thinking about the problem in terms of flows brings us pretty quickly to minimum-cost flow problems.
In these problems we're given a graph whose edges have specified capacities and unit costs of transport. The goal is to find the minimum cost needed to move a supply of a given quantity from an arbitrary set of source nodes to an arbitrary set of sink nodes.
It turns out your problem can be converted into a minimum-cost flow problem. To do so, let us generate a source node that connects to all the A nodes and a sink node that connects to all the B nodes.
Now, let us make the cost of using a Source->A edge equal to 1 and give all other edges a cost of zero. Further, let us make the capacity of the Source->A edges equal to infinity and the capacity of all other edges equal to 1.
This looks like the following:
The red edges have Cost=1, Capacity=Inf. The blue edges have Cost=0, Capacity=1.
Now, solving the minimum-cost flow problem becomes equivalent to using as few red edges as possible. Any red edge that isn't used allocates 0 flow to its corresponding A node, and that node can be removed from the graph. Conversely, each B node can only pass 1 unit of flow to the sink, so all B nodes must be preserved in order for the problem to be solved.
Since we've recast your problem into this standard form, we can leverage existing tools to get a solution; namely, Google's Operations Research tools (OR-Tools).
Doing so gives the following answer to the above graph:
The red edges are unused and the black edges are used. Note that if a red edge emerges from the source, the A-node it connects to generates no black edges. Note also that each B-node has at least one incoming black edge. This satisfies the constraints you posed.
We can now detect the A-nodes to be removed by looking for Source->A edges with zero usage.
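As a hedged sketch (not part of the original program below), reading those zero-usage arcs off the solved SimpleMinCostFlow object could look like this, assuming the source node's id is Acount + Bcount as in the code that follows:
# Hypothetical helper: A nodes whose Source->A arc carries no flow are removable.
def RemovableANodes(min_cost_flow, source_node):
    return [min_cost_flow.Head(i)
            for i in range(min_cost_flow.NumArcs())
            if min_cost_flow.Tail(i) == source_node and min_cost_flow.Flow(i) == 0]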
Source Code
The source code necessary to generate the foregoing figures and associated solutions is as follows:
#!/usr/bin/env python3
# Documentation: https://developers.google.com/optimization/flow/mincostflow
# Install dependency: pip3 install ortools
from __future__ import print_function
from ortools.graph import pywrapgraph
import matplotlib.pyplot as plt
import networkx as nx
import random
import sys


def GenerateGraph(Acount, Bcount):
    assert Acount > 5
    assert Bcount > 5

    G = nx.DiGraph()  # Directed graph

    source_node = Acount + Bcount
    sink_node = source_node + 1

    for a in range(Acount):
        # Connect each A node to roughly 20-30% of the B nodes (duplicates possible)
        for i in range(random.randint(int(0.2 * Bcount), int(0.3 * Bcount))):
            b = Acount + random.randint(0, Bcount - 1)  # Half-open range [0, Bcount), offset past the A indices
            G.add_edge(source_node, a, capacity=99999, unit_cost=1, usage=1)
            G.add_edge(a, b, capacity=1, unit_cost=0, usage=1)
            G.add_edge(b, sink_node, capacity=1, unit_cost=0, usage=1)
            G.nodes[a]['type'] = 'A'
            G.nodes[b]['type'] = 'B'

    G.nodes[source_node]['type'] = 'source'
    G.nodes[sink_node]['type'] = 'sink'
    G.nodes[source_node]['supply'] = Bcount
    G.nodes[sink_node]['supply'] = -Bcount

    return G


def VisualizeGraph(graph, color_type):
    for p, d in graph.nodes(data=True):
        if d['type'] == 'source':
            source = p
        if d['type'] == 'sink':
            sink = p

    Acount = len([1 for p, d in graph.nodes(data=True) if d['type'] == 'A'])
    Bcount = len([1 for p, d in graph.nodes(data=True) if d['type'] == 'B'])

    if color_type == 'usage':
        edge_color = ['black' if d['usage'] > 0 else 'red' for u, v, d in graph.edges(data=True)]
    elif color_type == 'unit_cost':
        edge_color = ['red' if d['unit_cost'] > 0 else 'blue' for u, v, d in graph.edges(data=True)]

    # Lay the graph out in columns: source, A nodes, B nodes, sink
    Ai = 0
    Bi = 0
    pos = dict()
    for p, d in graph.nodes(data=True):
        if d['type'] == 'source':
            pos[p] = (0, Acount / 2)
        elif d['type'] == 'sink':
            pos[p] = (3, Bcount / 2)
        elif d['type'] == 'A':
            pos[p] = (1, Ai)
            Ai += 1
        elif d['type'] == 'B':
            pos[p] = (2, Bi)
            Bi += 1

    nx.draw(graph, pos=pos, edge_color=edge_color, arrows=False)
    plt.show()


def GenerateMinCostFlowProblemFromGraph(graph):
    min_cost_flow = pywrapgraph.SimpleMinCostFlow()
    for node, neighbor, data in graph.edges(data=True):
        min_cost_flow.AddArcWithCapacityAndUnitCost(node, neighbor, data['capacity'], data['unit_cost'])
    for p, d in graph.nodes(data=True):
        if (d['type'] == 'source' or d['type'] == 'sink') and 'supply' in d:
            min_cost_flow.SetNodeSupply(p, d['supply'])
    return min_cost_flow


def ColorGraphEdgesByUsage(graph, min_cost_flow):
    for i in range(min_cost_flow.NumArcs()):
        graph[min_cost_flow.Tail(i)][min_cost_flow.Head(i)]['usage'] = min_cost_flow.Flow(i)


def main():
    """Generate a random bipartite instance, solve the min-cost flow problem, and visualize the result."""
    Acount = 20
    Bcount = 20
    graph = GenerateGraph(Acount, Bcount)
    VisualizeGraph(graph, 'unit_cost')

    min_cost_flow = GenerateMinCostFlowProblemFromGraph(graph)
    if min_cost_flow.Solve() != min_cost_flow.OPTIMAL:
        print('Unable to find a solution! It is likely that one does not exist for this input.')
        sys.exit(-1)

    print('Minimum cost:', min_cost_flow.OptimalCost())
    ColorGraphEdgesByUsage(graph, min_cost_flow)
    VisualizeGraph(graph, 'usage')


if __name__ == '__main__':
    main()
Although this is an old question, I see it has not been correctly answered yet.
An analogous question to this one has also been answered earlier in this post.
The problem you are presenting here is indeed the Minimum Set Cover Problem, which is one of the well-known NP-hard problems. From Wikipedia, the Minimum Set Cover Problem can be formulated as:
Given a set of elements {1,2,...,n} (called the universe) and a collection S of m sets whose union equals the universe, the set cover problem is to identify the smallest sub-collection of S whose union equals the universe. For example, consider the universe U={1,2,3,4,5} and the collection of sets S={{1,2,3},{2,4},{3,4},{4,5}}. Clearly the union of S is U. However, we can cover all of the elements with the following, smaller number of sets: {{1,2,3},{4,5}}.
In your formulation, B nodes represent the elements in the universe, A nodes represent the sets, and the edges between A nodes and B nodes determine which elements (B nodes) belong to each set (A node). Then the minimum set cover is equivalent to the minimum number of A nodes needed to stay connected to all B nodes. Consequently, the A nodes which can be removed from the graph while keeping every B node connected are exactly those which do not belong to the minimum set cover.
Since it is NP-hard, there is no known polynomial-time algorithm for computing the optimum, but a simple greedy algorithm can efficiently provide approximate solutions with tight bounds on the optimum. From Wikipedia:
There is a greedy algorithm for polynomial time approximation of set covering that chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements.
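As a quick, hedged sketch of that greedy rule (the function and variable names here are illustrative, not from the question):
# Greedy set-cover heuristic: repeatedly pick the set covering the most
# still-uncovered elements. 'sets' maps each A node to the B nodes it touches.
def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda a: len(sets[a] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the sets do not cover the universe")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# The Wikipedia example above:
U = {1, 2, 3, 4, 5}
S = {'s1': {1, 2, 3}, 's2': {2, 4}, 's3': {3, 4}, 's4': {4, 5}}
print(greedy_set_cover(U, S))   # e.g. ['s1', 's4']; the removable A nodes are the rest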

Isolate connected_components

I have the following networkx graph excerpt:
The following functions were executed to explore the structure of the connected components, as I have a sparse network with lots of singular connections:
nx.number_connected_components(G)
>>> 702
list(nx.connected_components(G))
>>> [{120930, 172034},
{118787, 173867, 176202},
{50376, 151561},
...]
Question: How can I restrict my whole graph visualization to connected_components with equal or more than three nodes?
graphs = list(nx.connected_component_subgraphs(G))
list_subgraphs = [node for component in graphs for node in component if len(component) >= 3]
F = G.subgraph(list_subgraphs)
This creates a flat list of the nodes belonging to components with three or more nodes (note that connected_component_subgraphs was removed in networkx 2.4; the approach below avoids it).
We can create a subgraph containing the components with equal or more than three nodes:
s = G.subgraph(
    set.union(
        *filter(lambda x: len(x) >= 3, nx.connected_components(G))
    )
)
Now you just need to visualize this subgraph s.
If you need a standalone graph rather than a SubGraph view, s = s.copy() will make a copy of the subgraph.
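A minimal usage sketch for the visualization step (assuming matplotlib is available; the styling here is arbitrary):
import matplotlib.pyplot as plt
import networkx as nx

# Draw only the filtered subgraph s computed above.
nx.draw(s, node_size=30, with_labels=False)
plt.show()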

Dijkstra's algorithm: is my implementation flawed?

In order to train myself both in Python and graph theory, I tried to implement the Dijkstra algo using Python 3, and submitted it against several online judges, to see if it was correct.
It works well in many cases, but not always.
For example, I am stuck with this one: the provided test case works fine and I have also tried custom test cases of my own, but when I submit the following solution, the judge keeps telling me "wrong answer", and the expected result is indeed very different from my output.
Notice that the judge tests it against quite a complex graph (10000 nodes with 100000 edges), while all the cases I tried before never exceeded 20 nodes and around 20-40 edges.
Here is my code.
Given an adjacency list al in the following form:
al[n] = [(a1, w1), (a2, w2), ...]
where
n is the node id;
a1, a2, etc. are its adjacent nodes and w1, w2, etc. the respective weights for the given edge;
and supposing that maximum distance never exceeds 1 billion, I implemented Dijkstra's algorithm this way:
import queue

distance = [1000000000] * (N+1)  # shortest distance found so far from node 1 to each node
distance[1] = 0                  # starting from node 1 with distance 0

pq = queue.PriorityQueue()
pq.put((0, 1))                   # (distance, node), same as above

visited = [False] * (N+1)

while not pq.empty():
    n = pq.get()[1]
    if visited[n]:
        continue
    visited[n] = True
    for edge in al[n]:
        if distance[edge[0]] > distance[n] + edge[1]:
            distance[edge[0]] = distance[n] + edge[1]
            pq.put((distance[edge[0]], edge[0]))
Could you please help me understand whether my implementation is flawed, or if I simply ran into some bugged online judge?
Thank you very much.
UPDATE
As requested, I'm providing the snippet I use to populate the adjacency list al for the linked problem.
N, M = input().split()
N, M = int(N), int(M)
al = [[] for n in range(N+1)]
for m in range(M):
    try:
        a, b, w = input().split()
        a, b, w = int(a), int(b), int(w)
        al[a].append((b, w))
        al[b].append((a, w))
    except:
        pass
(Please don't mind the ugly "except: pass", I was using it just for debugging purposes... :P)
Primary problem in interpreting the question:
According to your parsing code, you are treating the input data as an undirected graph, i.e. each edge from A to B also is an edge from B to A.
It seems like this premise is not valid and it should instead be a directed graph, i.e. you have to remove this line:
al[b].append((a, w)) # no back reference!
Previous problem, now already fixed in the code:
Currently, you are using the never-changing weight of the edges in your queue:
pq.put((edge[1], edge[0]))
This way, the nodes always end up at the same position in the queue, no matter at what stage of the algorithm and how far the path to reach that node actually is.
Instead, you should use the new distance to the target node edge[0], i.e. distance[edge[0]] as the priority in the queue:
pq.put((distance[edge[0]], edge[0]))

what is the best algorithm to traverse a graph with negative nodes and looping nodes

I have a really difficult problem to solve and I'm just wondering what algorithm can be used to find the quickest route. The undirected graph consists of positive and negative adjustments; these adjustments affect a bot or thing which navigates the maze. The problem I have is mazes which contain loops that can be + or -. An example might help:-
node A gives 10 points to the object
node B takes 15 from the object
node C gives 20 points to the object
route=""
the starting node is A, and the ending node is C
given the graph structure as:-
a(+10)-----b(-15)-----c+20
node() means the node loops to itself; - and + are the adjustments
nodes with no loops are written like c+20, so node c has a positive adjustment of 20 but has no loops
if the bot or object has 10 points in its resource then the best path would be :-
a > b > c the object would have 25 points when it arrives at c
route="a,b,c"
This is quite easy to implement. The next challenge is knowing how to backtrack to a good node; let's assume that at each node you can find out any of its neighbouring nodes and their adjustment levels. Here is the next example:-
if the bot started with only 5 points then the best path would be
a > a > b > c the bot would have 25 points when arriving at c
route="a,a,b,c"
this was a very simple graph, but when you have many more nodes it becomes very difficult for the bot to know whether to loop at a good node or go from one good node to another, while keeping track of a possible route.
such a route would be a backtrack queue.
A harder example would result in lots of going back and forth
bot has 10 points
a(+10)-----b(-5)-----c-30
a > b > a > b > a > b > a > b > a > b > c having 5 pts left.
another way the bot could do it is:-
a > a > a > b > c
this is a more efficient way, but how you can program this is partly my question.
Does anyone know of a good algorithm to solve this? I've already looked into Bellman-Ford and Dijkstra, but these only give a simple path, not a looping one.
Could it be recursive in some way, or use some form of heuristics?
referring to your analogy:-
I think I get what you mean; a bit of pseudocode would be clearer. So far, route():
q.add(v)
best = v
hash visited(v, true)
while (q is not empty)
    q.remove(v)
    for each u of v in G
        if u not visited before
            visited(u, true)
            best = u => v.dist
        else
            best = v => u.dist
This is a straightforward dynamic programming problem.
Suppose that for a given length of path, for each node, you want to know the best cost ending at that node, and where that route came from. (The data for that length can be stored in a hash, the route in a linked list.)
Suppose we have this data for n steps. Then for the (n+1)-st step we start with a clean slate, take each answer for the n-th step, and move it one node forward. If we land on a node we don't have data for, or we beat the best score found so far for that node, we update the data for that node with our improved score and extend the route (just this node linking back to the previous linked list).
Once we have this for the number of steps you want, find the node with the best existing route, and then you have your score and your route as a linked list.
========
Here is actual code implementing the algorithm:
class Graph:
    def __init__(self, nodes=[]):
        self.nodes = {}
        for node in nodes:
            self.insert(node)

    def insert(self, node):
        self.nodes[node.name] = node

    def connect(self, name1, name2):
        node1 = self.nodes[name1]
        node2 = self.nodes[name2]
        node1.neighbors.add(node2)
        node2.neighbors.add(node1)

    def node(self, name):
        return self.nodes[name]


class GraphNode:
    def __init__(self, name, score, neighbors=[]):
        self.name = name
        self.score = score
        self.neighbors = set(neighbors)

    def __repr__(self):
        return self.name


def find_path(start_node, start_score, end_node):
    # Maps each reachable node to [best score so far, route as a linked list]
    prev_solution = {start_node: [start_score + start_node.score, None]}
    room_to_grow = True
    while end_node not in prev_solution:
        if not room_to_grow:
            # No point looping endlessly...
            return None
        room_to_grow = False
        solution = {}
        for node, info in prev_solution.items():
            score, prev_path = info
            for neighbor in node.neighbors:
                new_score = score + neighbor.score
                if neighbor not in prev_solution:
                    room_to_grow = True
                if 0 < new_score and (neighbor not in solution or solution[neighbor][0] < new_score):
                    solution[neighbor] = [new_score, [node, prev_path]]
        prev_solution = solution
    path = prev_solution[end_node][1]
    answer = [end_node]
    while path is not None:
        answer.append(path[0])
        path = path[1]
    answer.reverse()
    return answer
And here is a sample of how to use it:
graph = Graph([GraphNode('A', 10), GraphNode('B', -5), GraphNode('C', -30)])
graph.connect('A', 'A')
graph.connect('A', 'B')
graph.connect('B', 'B')
graph.connect('B', 'B')
graph.connect('B', 'C')
graph.connect('C', 'C')
print(find_path(graph.node('A'), 10, graph.node('C')))
Note that I explicitly connected each node to itself. Depending on your problem you might want to make that automatic.
(Note, there is one possible infinite loop left. Suppose that the starting node has a score of 0 and there is no way off of it. In that case we'll loop forever. It would take work to add a check for this case.)
I'm a little confused by your description, it seems like you are just looking for shortest path algorithms. In which case google is your friend.
In the example you've given you have -ve adjustments which should really be +ve costs in the usual parlance of graph traversal. I.e. you want to find a path with the lowest cost so you want more +ve adjustments.
If your graph has loops that are beneficial to traverse (i.e. decrease cost or increase points through adjustments) then the best path is undefined because going through the loop one more time will improve your score.
Here's some pseudocode:
steps = []
steps[0] = [None * graph.#nodes]
step = 1
while True:
    steps[step] = [None * graph.#nodes]
    for node in graph:
        for node2 in graph:
            steps[step][node2.index] = max(steps[step-1][node.index] + node2.cost, steps[step][node2.index])
    if steps[step][lastnode] >= 0:
        break

Dynamic programming question

I am stuck with one of the algorithm homework problem. Can anyone give me some hint to solve it? Here is the question:
Consider a chain-structured computation represented by a weighted graph G = (V, E) where
V = {v1, v2, ..., vn} and E = {(vi, vi+1) : 1 <= i <= n-1}. We are also given a chain of m identical processors P = {P1, ..., Pm} (i.e., there exists a communication link between Pk and Pk+1 for 1 <= k <= m-1).
The set of vertices V represents computation modules, and the set of edges E represents
communication between the two modules. Each node vi is assigned a weight wi denoting the
execution time of the module on a single processor. Each edge (vi, vi+1) is assigned a weight ci denoting the communication time between the two modules if they are assigned to different processors. If multiple modules are assigned to the same processor, they must be consecutive. Suppose modules va, va+1, ..., vb are assigned to processor Pk. Then the time taken by Pk, denoted Tk, is the time to compute the assigned modules plus the time to communicate with the neighboring processors; hence Tk = wa + ... + wb + c(a-1) + cb. Note here that c(a-1) = 0 if a = 1 and cb = 0 if b = n.
The objective of the problem is to find an assignment of V to P such that max_{1<=k<=m} Tk
is minimized, where we assume that each processor must take at least one module. (This
assumption can be relaxed by adding m dummy modules with zero computation
and communication time.)
Develop a dynamic programming algorithm to solve this problem in polynomial time (i.e., O(mn)).
I tried to find the minimum execution time for each Pk and then find the max, but I doubt my solution is dynamic programming since there is no recursive formula. Please give me some hints!
Thanks!
I think you might be able to modify the Viterbi algorithm to solve this problem.
Okay, this is easy.
Decompose your problem into a function you need to minimise, say F(n, k): the minimum cost of assigning the first n nodes to the first k processors.
Then derive the recurrence by enumerating which nodes end up on the k-th processor:
F(n,k) = min[i=0..n]( max(F(i,k-1), w[i]+...+w[n]+c[i-1]+c[n]) )
c[0] = 0
F(*,0) = inf
F(0,*) = inf
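A hedged sketch of a direct implementation of that recurrence follows (my own illustrative names, 1-based arrays, prefix sums for the w[i] + ... + w[n] term; here the chunk on processor k is modules i+1..j, a slight re-indexing of the formula above). A naive evaluation like this runs in O(m·n^2); squeezing it to the requested O(mn) needs a further optimization.
INF = float('inf')

def min_makespan(w, c, m):
    """w[1..n]: module times (w[0] unused); c[1..n-1]: edge costs (c[0] = 0); m processors."""
    n = len(w) - 1
    c = c + [0]                              # make c[n] = 0 available
    prefix = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + w[i]     # prefix[j] - prefix[i] = w[i+1] + ... + w[j]

    # F[k][j] = best possible max-load when the first j modules use the first k processors
    F = [[INF] * (n + 1) for _ in range(m + 1)]
    F[0][0] = 0
    for k in range(1, m + 1):
        for j in range(k, n + 1):            # each processor takes at least one module
            for i in range(k - 1, j):        # modules i+1 .. j go on processor k
                load = prefix[j] - prefix[i] + c[i] + c[j]
                F[k][j] = min(F[k][j], max(F[k - 1][i], load))
    return F[m][n]

# Tiny example: 4 modules, 2 processors
print(min_makespan([0, 3, 1, 4, 2], [0, 2, 5, 1], 2))   # -> 9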

Resources