Neo4J - Traveling Salesman - algorithm

I'm trying to solve an augmented TSP problem using a graph database, but I'm struggling. I'm great with SQL, but am a total noob with Cypher. I've created a simple graph with cities (nodes) and flights (relationships).
THE SETUP: Travel to 8 different cities (1 city per week, no duplicates) with the lowest total flight cost. I'm trying to find an optimal path that minimizes the total cost of the flights, where each flight's cost changes from week to week.
Here is a file on pastebin containing my nodes & relationships. Just run it against Neo4JShell to insert the data.
I started off using this article as a basis but it doesn't handle the changing distances (or in my case flight costs)
I know this is syntactically terrible/non-executable, but here's what I've done so far to get just two flights;
MATCH (a:CITY)-[F1:FLIGHT{week:1}]->(b:CITY) -[F2:FLIGHT{week:2}]->(c:CITY)
RETURN a,b,c;
But that doesn't run.
Next, I thought I'd just try to find all the cities & flights from week one, but it's not working right either as I get flights where week <> 1 as well as =1
MATCH (n) WHERE (n)-[:FLIGHT { week:1 }]->() RETURN n
Can anyone help out?
PS - I'm not married to using a graph DB to solve this, I've just read about them, and thought it would be well fitted to try it, plus gave me a reason to work with them, but so far, I'm not having much (or any) success.

Maybe this Cypher query will give you some ideas.
MATCH (from:Node {name: "Source node" })
MATCH path = (from)-[:CONNECTED_TO*6]->()
WHERE ALL(n in nodes(path) WHERE 1 = length(filter(m in nodes(path) WHERE m = n)))
AND length(nodes(path)) = 7
RETURN path,
reduce(distance = 0, edge in relationships(path) | distance + edge.distance)
AS totalDistance
ORDER BY totalDistance ASC
LIMIT 1
It enumerates every 6-hop path from the source that visits no node twice (so 7 nodes in this example), sums the distance along each path, and returns the shortest one.

neo4j may be a fine piece of software, but I wouldn't expect it to be of much help in solving this NP-hard problem. Instead, I would point you to an integer program solver (this one, perhaps, but I can't vouch for it) and suggest that you formulate this problem as an integer program as follows.
For each flight f, we create a 0-1 variable x(f) that is 1 if flight f is taken and 0 if flight f is not taken. The objective is to minimize the total cost of the flights (I'm going to assume that each purchase is an independent decision; if not, then you have some more work to do).
minimize sum_{flights f} cost(f) x(f)
Now we need some constraints. Each week, we purchase exactly one flight.
for all weeks i, sum_{flights f in week i} x(f) = 1
We can be in only one place at a time, so if we fly into city v for week i, then we fly out of city v for week i+1. We express this constraint with a strange but idiomatic linear equation.
for all weeks i, for all cities v,
sum_{flights f in week i to city v} x(f) -
sum_{flights f in week i+1 from city v} x(f) = 0
We can fly into each city at most once. We can fly out of each city at most once. This is how we enforce the constraint of visiting only once.
for all cities v,
sum_{flights f to city v} x(f) <= 1
for all cities v,
sum_{flights f from city v} x(f) <= 1
We're almost done. I'm going to assume at this point that the journey begins and ends in a home city u known ahead of time. For the first week, delete all flights not departing from u. For the last week, delete all flights not arriving at u. The flexibility of integer programming, however, means that it's easy to make other arrangements.
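For a handful of cities, the same objective and constraints can be checked by brute force before reaching for a solver. The sketch below is not an integer program: it simply enumerates every visiting order and sums the week-dependent prices. The function name `cheapest_tour` and the `cost[(week, origin, destination)]` layout are assumptions made for illustration, and a missing key is taken to mean that no such flight exists that week.

```python
from itertools import permutations

def cheapest_tour(home, cities, cost):
    """Brute-force the cheapest round trip from `home` through all `cities`.

    cost maps (week, origin, destination) -> price (hypothetical layout);
    a missing key means no flight on that leg in that week.
    """
    best_cost, best_route = None, None
    for order in permutations(cities):
        route = (home,) + order + (home,)
        total = 0
        # Leg number k of the route is flown in week k (weeks start at 1).
        for week, (a, b) in enumerate(zip(route, route[1:]), start=1):
            price = cost.get((week, a, b))
            if price is None:
                break  # this ordering needs a flight that does not exist
            total += price
        else:
            if best_cost is None or total < best_cost:
                best_cost, best_route = total, route
    return best_cost, best_route

# Tiny example: home H, two cities, prices differ per week.
example_cost = {
    (1, "H", "A"): 10, (1, "H", "B"): 5,
    (2, "A", "B"): 3,  (2, "B", "A"): 20,
    (3, "B", "H"): 4,  (3, "A", "H"): 1,
}
print(cheapest_tour("H", ["A", "B"], example_cost))
```

With 8 cities there are 8! = 40,320 orderings, so enumeration is still practical at that size; the integer program above is the better fit once the instance grows.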

Related

How to find an algorithm to find a point in a row that has a lower distance with others?

Imagine that we have n cities in a row, equally spaced, and each of them has a population. We want to build a post office in the city that minimizes the total distance all people have to travel to reach it. How do we find that city?
The user inputs the number of cities (n) and their populations, and gets back the city where the post office should be built.
This is the example that was in the problem and I don't know why it has this result:
6 (number of cities (n))
3 1 0 0 2 2 (populations) ----> 2 (city number 2, which has a population of 1)
The thing that I'm looking for is an algorithm or a formula to find the city, not the code. Any idea?
If you work out your example, you'll see a simple pattern. Let's say the total distance for index i is x. If we move to index i+1,
All cities in range [0, i] will now become further away from the post office by 1 unit.
All cities in range [i+1, n-1] will now become closer to the post office by 1 unit.
Using the example you have provided (assuming 0 based indexing):
Let the office be at index 1. Current total distance: 3 + (2*3) + (2*4) = 17
Let's shift it to index 2. Increase in distance for the cities on left of index 2 = 3+1=4. Decrease in distance for cities on the right = 2+2=4.
Simply put, if we move from index i to i+1,
new_distance = old_distance + sum(array[j] for j in range [0,i]) - sum(array[j] for j in range [i+1,n-1])
To make it more efficient, note that sum(array[j] for j in range [0,i]) is just a prefix sum. You can compute the prefix sums in one pass, then derive the distance for each city in another pass. That gives O(N) time and space complexity.
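The update rule can be sketched in Python as follows. The function name `best_post_office` and the 0-based indexing are assumptions for illustration. Instead of storing a full prefix-sum array, this version keeps a running total of the population on the left, and on ties it returns the first index that reaches the minimum.

```python
def best_post_office(populations):
    """Return (index, total_distance) for the best city, 0-based."""
    n = len(populations)
    total = sum(populations)
    # Total distance if the office is placed at index 0.
    dist = sum(i * p for i, p in enumerate(populations))
    best_index, best_dist = 0, dist
    left = 0  # population living in cities [0, i]
    for i in range(n - 1):
        left += populations[i]
        # Moving from i to i+1: the left side gets 1 further, the right side 1 closer.
        dist += left - (total - left)
        if dist < best_dist:
            best_index, best_dist = i + 1, dist
    return best_index, best_dist

print(best_post_office([3, 1, 0, 0, 2, 2]))
```

For the example in the question this yields index 1 (city number 2 when counting from 1) with a total distance of 17, which matches the expected output.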

Minimal car refills on a graph of cities

There's a graph of cities.
n - cities count
m - two-ways roads count
k - distance that car can go after refill
Road i connects cities pi and qi and has length of ri. Between the two cities can be only one road.
A man is going from city u to city v.
There are a gas stations in l cities a1, a2, ..., al.
A car starts with a full tank. If a man gets in a city with a gas station, he can refill (full tank) a car or ignore it.
Return value is a minimal count of refills to get from a city u to city v or -1 if it's impossible.
I tried to do it using Dijkstra algorithm, so I have minimal distance and path. But I have no idea how to get minimal refills count
It is slightly subtle, but the following pseudo-code will do it.
First run a shortest-path search from v (Dijkstra rather than a plain breadth-first search, since the roads have different lengths) to find the distance from every city to the target. This gives us a distance_remaining lookup with distance_remaining[city] being the length of the shortest path (without regard to fillup stations).
To implement we first need a Visit data structure with information about visiting a city on a trip. What fields do we need?
city
fillups
range
last_visit
Next we need a priority queue (just like Dijkstra) for possible visits to consider. This queue should prioritize visits by the shortest possible overall trip that we might be able to take. Which is to say visit.fillups * max_range + (max_range - visit.range) + distance_remaining[visit.city].
And finally we need a visited[city] data structure saying whether a city is visited. In Dijkstra we only consider a node if it was not yet visited. We need to tweak that to only consider a node if it was not yet visited or was visited with a range shorter than our current one (a car that arrived on full may finish even though the empty one failed).
And now we implement the following logic:
make visit {city: u, fillups: 0, range: max_range, last_visit: None}
add to priority queue the visit we just created
while visit = queue.pop():
    if visit.city == v:
        return visit.fillups # We could actually find the path at this point!
    else if visit.city not in visited or visited[visit.city] < visit.range:
        visited[visit.city] = visit.range
        for each road r from visit.city:
            if r.length <= visit.range:
                add to queue {city: r.other_city, fillups: visit.fillups, range: visit.range - r.length, last_visit: visit}
        if visit.city has fillup station:
            add to queue {city: visit.city, fillups: visit.fillups + 1, range: max_range, last_visit: visit}
return -1
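A runnable variant of this search, as a sketch under a few assumptions: cities are numbered 0..n-1, roads are given as (p, q, length) tuples, and the function name `min_refills` is made up for illustration. Unlike the priority described above, this version orders the queue by (refills, most fuel left), so the first time the target is popped the refill count is minimal. A city is expanded again only if it is reached with more fuel than on any earlier visit.

```python
import heapq

def min_refills(n, roads, stations, k, u, v):
    """Minimal number of refills to drive from u to v, or -1 if impossible."""
    adj = [[] for _ in range(n)]
    for p, q, length in roads:  # roads are two-way
        adj[p].append((q, length))
        adj[q].append((p, length))
    stations = set(stations)
    best_fuel = [-1] * n          # most fuel any expanded visit had per city
    heap = [(0, -k, u)]           # (refills, -fuel, city); the tank starts full
    while heap:
        refills, neg_fuel, city = heapq.heappop(heap)
        fuel = -neg_fuel
        if city == v:
            return refills
        if fuel <= best_fuel[city]:
            continue              # an earlier visit had fewer or equal refills and at least this much fuel
        best_fuel[city] = fuel
        for nxt, length in adj[city]:
            if length <= fuel:
                heapq.heappush(heap, (refills, -(fuel - length), nxt))
        if city in stations and fuel < k:
            heapq.heappush(heap, (refills + 1, -k, city))
    return -1

chain = [(0, 1, 3), (1, 2, 3), (2, 3, 3)]
print(min_refills(4, chain, [1, 2], 4, 0, 3))
```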

Dynamic Shortest Path

Your friends are planning an expedition to a small town deep in the Canadian north
next winter break. They’ve researched all the travel options and have drawn up a directed
graph whose nodes represent intermediate destinations and edges represent the roads between
them.
In the course of this, they’ve also learned that extreme weather causes roads in this part of
the world to become quite slow in the winter and may cause large travel delays. They’ve
found an excellent travel Web site that can accurately predict how fast they’ll be able to
travel along the roads; however, the speed of travel depends on the time of the year. More
precisely, the Web site answers queries of the following form: given an edge e = (u, v)
connecting two sites u and v, and given a proposed starting time t from location u, the
site will return a value f_e(t), the predicted arrival time at v. The Web site guarantees that
f_e(t) > t for every edge e and every time t (you can’t travel backward in time), and that
f_e(t) is a monotone increasing function of t (that is, you do not arrive earlier by starting
later). Other than that, the functions f_e may be arbitrary. For example, in areas where the
travel time does not vary with the season, we would have f_e(t) = t + l_e, where l_e is the
time needed to travel from the beginning to the end of the edge e.
Your friends want to use the Web site to determine the fastest way to travel through the
directed graph from their starting point to their intended destination. (You should assume
that they start at time 0 and that all predictions made by the Web site are completely
correct.) Give a polynomial-time algorithm to do this, where we treat a single query to
the Web site (based on a specific edge e and a time t) as taking a single computational step.
def updatepath(node):
    randomvalue = random.randint(0,3)
    print(node,"to other node:",randomvalue)
    for i in range(0,n):
        distance[node][i] = distance[node][i] + randomvalue
def minDistance(dist,flag_array,n):
    min_value = math.inf
    for i in range(0,n):
        if dist[i] < min_value and flag_array[i] == False:
            min_value = dist[i]
            min_index = i
    return min_index
def shortest_path(graph, src,n):
    dist = [math.inf] * n
    flag_array = [False] * n
    dist[src] = 0
    for cout in range(n):
        #find the node index that have min cost
        u = minDistance(dist,flag_array,n)
        flag_array[u] = True
        updatepath(u)
        for i in range(n):
            if graph[u][i] > 0 and flag_array[i]==False and dist[i] > dist[u] + graph[u][i]:
                dist[i] = dist[u] + graph[u][i]
                path[i] = u
    return dist
I applied Dijkstra's algorithm, but the result is not correct. What should I change in my algorithm to make it work for dynamically changing edges?
Well, the key point is that the function is monotonically increasing. There is an algorithm that exploits this property, and it is called A*.
Accumulated cost: Your prof wants you to use two distances. One is the accumulated cost (this is simply the cost so far added to the cost/time needed to move to the next node).
Heuristic cost: This is some predicted cost of the remaining trip.
A plain Dijkstra approach would not work as-is because you are working with both a heuristic (predicted) cost and an accumulated cost.
Monotonically increasing means h(A) <= f(A..B) + h(B). It simply says that if you move from node A to node B, then the estimated total cost (heuristic + accumulated) should not be less than it was at the previous node (in this case A). If this property holds, then the first path by which A* reaches the goal is always the best path to the goal, and it never needs to backtrack.
Note: The power of this algorithm depends entirely on how you predict the value.
If you underestimate the value, that will be corrected by the accumulated value, but if you overestimate the value, it may choose the wrong path.
Algorithm:
Create a min priority queue pq.
Insert the initial city in pq.
while (!pq.isEmpty() && !goalFound)
    Node min = pq.delMin() // returns the city whose distance (heuristic + accumulated) is minimal
    put all successors of min in pq // all cities that you can reach from min; keeping a list of
                                    // visited cities keeps the queue efficient by not
                                    // placing the same element twice
Keep doing this, and at the end you will either reach the goal or your queue will be empty.
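As a concrete sketch of this loop, here is the search with a zero heuristic, which makes it coincide with Dijkstra's algorithm run on arrival times instead of fixed edge lengths. The names `earliest_arrival` and `graph`, and the representation of each edge as a `(neighbor, f)` pair where `f(t)` plays the role of the Web site query, are assumptions for illustration. Settling a node the first time it is popped relies on the two guarantees in the problem statement: f_e(t) > t, and f_e is monotone increasing.

```python
import heapq

def earliest_arrival(graph, source, target, start_time=0):
    """Earliest arrival time at target, or None if it is unreachable.

    graph[u] is a list of (v, f) pairs; f(t) is the predicted arrival time
    at v when leaving u at time t (one "query" per call).
    """
    best = {source: start_time}
    heap = [(start_time, source)]
    settled = set()
    while heap:
        t, u = heapq.heappop(heap)
        if u in settled:
            continue            # stale queue entry
        settled.add(u)
        if u == target:
            return t
        for v, f in graph.get(u, []):
            arrival = f(t)      # leave u as early as possible
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return None

demo = {
    "a": [("b", lambda t: t + 5), ("c", lambda t: t + 1)],
    "c": [("b", lambda t: 2 * t + 1)],
}
print(earliest_arrival(demo, "a", "b"))
```

Each node is settled once and each edge is queried at most once, so the number of Web site queries is bounded by the number of edges.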
Extra
Here I implemented an 8-puzzle solver using A*; it can give you an idea of how the costs are defined and how it works.
private void solve(MinPQ<Node> pq, HashSet<Node> closedList) {
    while(!(pq.min().getBoad().isGoal(pq.min().getBoad()))){
        Node e = pq.delMin();
        closedList.add(e);
        for(Board boards: e.getBoad().neighbors()){
            Node nextNode = new Node(boards,e,e.getMoves()+1);
            if(!equalToPreviousNode(nextNode,e.getPreviousNode()))
                pq.insert(nextNode);
        }
    }
    Node collection = pq.delMin();
    while(!(collection.getPreviousNode() == null)){
        this.getB().add(collection.getBoad());
        collection =collection.getPreviousNode();
    }
    this.getB().add(collection.getBoad());
    System.out.println(pq.size());
}
A link to full code is here.

rudimentary flight trip planner

What approach would be best for solving this problem? Any kind of help will be appreciated.
The input is the set of flights between various cities. It is given as a file. Each line of the file contains "city1 city2 departure-time arrival-time flight-no. price". This means that there is a flight called "flight-no" (which is a string of the form XY012) from city1 to city2 which leaves city1 at time "departure-time" and arrives at city2 at time "arrival-time". Further, the price of this flight is "price", which is a positive integer. All times are given as a string of 4 digits in the 24hr format, e.g. 1135, 0245, 2210. Assume that all city names are integers between 1 and a number N (where N is the total number of cities).
Note that there could be multiple flights between two cities (at different times).
The query that you have to answer is: given two cities "A" and "B", times "t1", "t2", where t1 < t2, find the cheapest trip which leaves city "A" after time "t1" and arrives at city "B" before time "t2". A trip is a sequence of flights which starts at A after time t1 and ends at B before time t2. Further, the departure time from any transit (intermediate) city C is at least 30 mins after the arrival at C
You can solve this problem with a graph search algorithm, such as Dijkstra's Algorithm.
The vertices of the graph are tuples of locations and (arrival) times. The edges are a combination of a layover (of at least 30 minutes) and an outgoing flight. The only difficulty is marking the vertices we've visited already (the "closed" list), since arriving in an airport at a given time shouldn't prevent consideration of flights into that airport that arrive earlier. My suggestion is to mark the departing flights we've already considered, rather than marking the airports.
Here's a quick implementation in Python. I assume that you've already parsed the flight data into a dictionary that maps from the departing airport name to a list of 5-tuples containing flight info ((flight_number, cost, destination_airport, departure_time, arrival_time)):
from heapq import heappush, heappop
from datetime import timedelta

def find_cheapest_route(flight_dict, start, start_time, target, target_time):
    queue = []  # a min-heap based priority queue
    taken_flights = set()  # flights that have already been considered
    heappush(queue, (0, start, start_time - timedelta(minutes=30), []))  # start state
    while queue:  # search loop
        cost_so_far, location, time, route = heappop(queue)  # pop the cheapest route
        if location == target and time <= target_time:  # see if we've found a solution
            return route, cost_so_far
        earliest_departure = time + timedelta(minutes=30)  # minimum layover
        for (flight_number, flight_cost, flight_dest,  # loop on outgoing flights
             flight_departure_time, flight_arrival_time) in flight_dict.get(location, []):
            if (flight_departure_time >= earliest_departure and  # check flight
                    flight_arrival_time <= target_time and
                    flight_number not in taken_flights):
                heappush(queue, (cost_so_far + flight_cost,  # add it to heap
                                 flight_dest, flight_arrival_time,
                                 route + [flight_number]))
                taken_flights.add(flight_number)  # and to the set of seen flights
    # if we got here, there's no timely route to the destination
    return None, None  # or raise an exception
If you don't care about efficiency, you can solve the problem like this:
For each "final leg" flight arriving at the destination before t2, determine the departure city of the flight (cityX) and the departure time of the flight (tX). Subtract 30 minutes from the departure time (tX-30). Then recursively find the cheapest trip from start, departing after t1, arriving in cityX before tX-30. Add the cost of that trip to the cost of the final leg to determine the total cost of the trip. The minimum over all those trips is the flight you want.
There is perhaps a more efficient dynamic programming approach, but I might start with the above (which is very easy to code recursively).
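The recursive "final leg" idea can be sketched as below, with memoization added so that repeated subproblems are solved only once. The assumptions here are mine: times are already converted to minutes since midnight, each flight is a `(src, dst, dep, arr, price)` tuple, every flight arrives no earlier than it departs, and the function name `cheapest_trip_cost` is made up. "Before t2" is treated as "at or before t2", as in the implementation above.

```python
import math
from functools import lru_cache

def cheapest_trip_cost(flights, start, t1, target, t2, layover=30):
    """Cheapest price from start (leaving at or after t1) to target (by t2), or None."""

    @lru_cache(maxsize=None)
    def best(city, deadline):
        # Cheapest way to be in `city` with arrival time <= deadline.
        # Being at the start with no flight taken counts as arriving at
        # t1 - layover, so the first flight may leave as early as t1.
        if city == start and deadline >= t1 - layover:
            result = 0
        else:
            result = math.inf
        for src, dst, dep, arr, price in flights:
            if dst == city and arr <= deadline:
                # The previous leg must land at least `layover` minutes before dep.
                result = min(result, price + best(src, dep - layover))
        return result

    cost = best(target, t2)
    return None if cost == math.inf else cost

sample_flights = [
    (1, 2, 600, 660, 100),
    (2, 3, 680, 740, 50),   # only a 20-minute layover after the first flight
    (2, 3, 700, 760, 30),
    (1, 3, 600, 800, 200),
]
print(cheapest_trip_cost(sample_flights, 1, 590, 3, 780))
```

The recursion terminates because every step moves the deadline strictly earlier; it returns only the price, so recovering the flight numbers would need an extra bookkeeping step.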

Algorithm to establish ordering amongst a set of items

I have a set of students (referred to as items in the title for generality). Amongst these students, some have a reputation for being rambunctious. We are told about a set of hate relationships of the form 'i hates j'. 'i hates j' does not imply 'j hates i'. We are supposed to arrange the students in rows (front most row numbered 1) in a way such that if 'i hates j' then i should be put in a row that is strictly lesser numbered than that of j (in other words: in some row that is in front of j's row) so that i doesn't throw anything at j (Turning back is not allowed). What would be an efficient algorithm to find the minimum number of rows needed (each row need not have the same number of students)?
We will make the following assumptions:
1) If we model this as a directed graph, there are no cycles in the graph. The most basic cycle would be: if 'i hates j' is true, 'j hates i' is false. Because otherwise, I think the ordering would become impossible.
2) Every student in the group is at least hated by one other student OR at least hates one other student. Of course, there would be students who are both hated by some and who in turn hate other students. This means that there are no stray students who don't form part of the graph.
Update: I have already thought of constructing a directed graph with i --> j if 'i hates j' and doing a topological sort. However, a general topological sort would suit better if I had to line all the students up in a single line. Since rows are allowed here, I am trying to figure out how to factor that change into the topological sort so it gives me what I want.
When you answer, please state the complexity of your solution. If anybody is giving code and you don't mind the language, then I'd prefer Java but of course any other language is just as fine.
JFYI This is not for any kind of homework (I am not a student btw :)).
It sounds to me that you need to investigate topological sorting.
This problem is basically another way to put the longest path in a directed graph problem. The number of rows is actually number of nodes in path (number of edges + 1).
Assuming the graph is acyclic, the solution is topological sort.
Acyclic is a bit stronger than your assumption 1. Not only is A -> B together with B -> A invalid, but so is A -> B, B -> C, C -> A, and any cycle of any length.
HINT: the question is how many rows are needed, not which student in which row. The answer to the question is the length of the longest path.
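A minimal sketch of the longest-path computation, assuming students are numbered 0..n-1 and the hate relationships are given as (i, j) pairs meaning 'i hates j'. The function name `assign_rows` is made up for illustration. It processes students in topological order (Kahn's algorithm) and runs in O(V + E) time, where V is the number of students and E the number of relationships.

```python
from collections import deque

def assign_rows(n, hates):
    """Return a list where entry s is the 1-based row of student s."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for i, j in hates:          # i hates j: i must sit strictly in front of j
        adj[i].append(j)
        indegree[j] += 1
    row = [1] * n               # students hated by nobody sit in row 1
    queue = deque(s for s in range(n) if indegree[s] == 0)
    processed = 0
    while queue:
        i = queue.popleft()
        processed += 1
        for j in adj[i]:
            row[j] = max(row[j], row[i] + 1)
            indegree[j] -= 1
            if indegree[j] == 0:
                queue.append(j)
    if processed != n:
        raise ValueError("the hate relationships contain a cycle")
    return row

rows = assign_rows(4, [(0, 1), (1, 2), (0, 2), (3, 2)])
print(rows, "rows needed:", max(rows))
```

The minimum number of rows is the largest entry of the returned list, which equals the number of students on the longest chain of hate relationships.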
It's from a project management theory (or scheduling theory, I don't know the exact term). There the task is about sorting jobs (vertex is a job, arc is a job order relationship).
We have a connected directed graph without cycles. There is an arc from vertex a to vertex b if and only if a hates b. Let's assume there is a single source vertex (without incoming arcs) and a single destination vertex (without outgoing arcs). If that is not the case, just add imaginary ones. Now we want to find the length of a longest path from source to destination (it will be the number of rows - 1, but mind the imaginary vertices).
We define the rank r[v] of a vertex v as the number of arcs in a longest path from the source to v. We want to know r[destination]. Algorithm for finding the ranks:
0) r_0[v] := 0 for all vertices v
repeat
t) r_t[end(j)] := max( r_{t-1}[end(j)], r_{t-1}[start(j)] + 1 ) for all arcs j
until for all arcs j r_t[end(j)] = r_{t-1}[end(j)] // i.e. no changes on this iteration
On each step at least one vertex increases its rank. Each pass scans all arcs (at most O(n^2) of them), and the number of passes is bounded by the length of the longest path plus one, which is at most n. Therefore in this form the complexity is O(n^3).
By the way, this algorithm also gives you student distribution among rows. Just group students by their respective ranks.
Edit: Here is another version of the code with the same idea. It may be easier to understand.
# Python
# V is a list of vertex indices, let it be something like V = range(N)
# source has index 0, destination has index N-1
# E is a list of edges, i.e. tuples of the form (start vertex, end vertex)
R = [0] * len(V)
changes = True
while changes:  # Python has no do-while, so loop until a pass makes no change
    changes = False
    for e in E:
        if R[e[1]] < R[e[0]] + 1:
            changes = True
            R[e[1]] = R[e[0]] + 1
# The answer is derived from value of R[N-1]
Of course this is the simplest implementation. It can be optimized, and the time estimate can be better.
Edit2: An obvious optimization is to update only the vertices adjacent to those that were updated on the previous step, i.e. introduce a queue of vertices whose rank was updated. Also, edges should be stored in adjacency lists. With this optimization the complexity would be O(N^2). Indeed, each vertex may appear in the queue at most rank times, but a vertex's rank never exceeds N, the number of vertices. Therefore the total number of algorithm steps will not exceed O(N^2).
Essentially, the important thing in assumption #1 is that there must not be any cycles in this graph. If there are any cycles, you can't solve this problem.
I would start by seating all of the students that do not hate any other students in the back row. Then you can seat the students who hate these students in the next row, and so on.
The number of rows is the length of the longest path in the directed graph, plus one. As a limit case, if there is no hate relationship everyone can fit on the same row.
To allocate the rows, put everyone who is not hated by anyone else on the row one. These are the "roots" of your graph. Everyone else is put on row N + 1 if N is the length of the longest path from any of the roots to that person (this path is of length one at least).
A simple O(N^3) algorithm is the following:
S = set of students
for s in S: s.row = -1        # initialize row field
rownum = 0                    # start from first row below
flag = true                   # when to finish
while (flag):
    rownum = rownum + 1       # proceed to next row
    flag = false
    for s in S:
        if (s.row != -1) continue   # already allocated
        ok = true
        foreach q in S:
            # Check if there is student q who will sit
            # on this or later row who hates s
            if ((q.row == -1 or q.row == rownum)
                    and s hated by q):
                ok = false
                break
        if (ok):                    # can put s here
            s.row = rownum
            flag = true
Simple answer = 1 row.
Put all students in the same row.
Actually that might not solve the question as stated - lesser row, rather than equal row...
Put all students in row 1
For each hate relation, put the not-hating student in a row behind the hating student
Iterate till you have no activity, or iterate Num(relation) times.
But I'm sure there are better algorithms - look at acyclic graphs.
Construct a relationship graph where 'i hates j' gives a directed edge from i to j. The end result is a directed graph. It should be a DAG; otherwise there is no solution, as it's not possible to resolve circular hate relationships.
Now simply do a DFS and, in the post-order callback for each node (that is, once the DFS of all its children is done and before returning from the DFS call for this node), check the row numbers of all the children and assign this node the maximum row number among its children + 1. If there is someone who doesn't hate anyone (a node with an empty adjacency list), simply assign them row 0.
Once all the nodes are processed, reverse the row numbers. This should be easy, as it just means finding the max and replacing each row number with max minus the already assigned row number.
Here is the sample code.
postNodeCb( graph g, int node )
{
    if ( /* No adj list */ )
        row[ node ] = 0;
    else
        row[ node ] = max( row number of all children ) + 1;
}
main()
{
    .
    .
    for ( int i = 0; i < NUM_VER; i++ )
        if ( !visited[ i ] )
            graphTraverseDfs( g, i );
    .
    .
}
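The same post-order idea as a runnable Python sketch, assuming at least one student, students numbered 0..n-1, an acyclic set of (i, j) pairs meaning 'i hates j', and the made-up function name `rows_by_dfs`. Here the reversal step yields 1-based rows, with row 1 at the front. The recursion depth grows with the longest chain, so very long chains would need an iterative DFS instead.

```python
def rows_by_dfs(n, hates):
    """Assign 1-based rows via post-order DFS; assumes the graph is a DAG."""
    adj = [[] for _ in range(n)]
    for i, j in hates:
        adj[i].append(j)
    height = [None] * n  # longest chain of hated students below each node

    def visit(node):
        if height[node] is None:
            # Post-order: children are finished before this node gets its value.
            height[node] = max((visit(child) + 1 for child in adj[node]), default=0)
        return height[node]

    for s in range(n):
        visit(s)
    top = max(height)
    # Reverse so that the students with the longest chains sit in row 1.
    return [top - h + 1 for h in height]

print(rows_by_dfs(4, [(0, 1), (1, 2), (0, 2), (3, 2)]))
```

Each node and edge is handled once, so the running time is O(V + E). The row count matches the longest-path answer, although individual students may land in a different (still valid) row than with a front-to-back assignment.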

Resources