Pairwise swap of linkedlist - algorithm

I have been trying to do a pairwise swap of linked list elements. Instead of swapping the data of the elements, I am swapping the nodes themselves by relinking them.
C# code:
public LinkedList pairWiseSwapLinks(LinkedList ll)
{
    LinkedList curr = ll;
    LinkedList next = curr.nextNode;
    ll = curr;
    while (curr.nextNode != null && next.nextNode != null)
    {
        curr.nextNode = next.nextNode;
        next.nextNode = curr;
        Console.WriteLine(curr.data);
        Console.WriteLine(next.data);
        curr = curr.nextNode;
        next = curr.nextNode;
        Console.WriteLine(curr.data);
        Console.WriteLine(next.data);
    }
    return ll;
}
The input is: 1 -> 3 -> 10 -> 14 -> 16 -> 20 -> 40
Output: 1 -> 10 -> 16 -> 40
Can someone help me out with what mistake I am making?

There are 2 issues:
After all swaps, the first node should be 3 instead of 1 in your case, so you should return the node that was originally second (and what if there is only 1 node?).
14 is skipped because each swap only involves the 2 nodes of the pair. After the second swap the list is 3 -> 1 -> 10 -> 16 ..., while 14 -> 10 -> 16 hangs off to the side, so you have essentially lost 14: you never changed the "nextNode" reference in node 1, which should point to 14 at that stage.
I don't want to give you the direct solution here but I can give you some hints:
You need to involve 3 nodes in each swap.
However, what if there are only 1 or 2 nodes?
Adding all the corner cases to your code would make it rather unreadable, so what if you add a dummy node in front of the head of the list, so that in the end you can just write "return dummy.next" instead of bothering to find the new first node?
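For readers who do want to see the hints put together, here is a minimal Python sketch of the dummy-node idea. It is not the asker's C# code: the `Node` class and the helpers `pairwise_swap`, `from_list` and `to_list` are hypothetical names introduced only for this illustration.

```python
class Node:
    """Hypothetical singly linked list node (stands in for the C# LinkedList class)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def pairwise_swap(head):
    """Swap adjacent nodes by relinking them; returns the new head."""
    dummy = Node(None, head)          # dummy node in front of the real head
    prev = dummy
    while prev.next is not None and prev.next.next is not None:
        first = prev.next
        second = first.next
        # Three nodes take part in every swap: prev, first and second.
        first.next = second.next      # first now points past the pair
        second.next = first           # second moves in front of first
        prev.next = second            # the node before the pair is relinked too
        prev = first                  # first is now the tail of the swapped pair
    return dummy.next                 # also correct for lists of length 0 or 1

def from_list(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head

def to_list(head):
    """Collect the node data into a Python list (helper for the demo)."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out

print(to_list(pairwise_swap(from_list([1, 3, 10, 14, 16, 20, 40]))))
```

With an odd number of nodes, as in the input from the question, the last node simply stays in place.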

Related

Is it possible to determine the hop-count when performing Dijkstra?

Thanks to the code from @trincot, I can modify Dijkstra to obtain the shortest path between a given source node and destination node.
Moreover, I tried to count the hops while Dijkstra searches for the shortest path, so that the search terminates when the hop count exceeds a predefined Max_hop, but I failed.
The hop count is defined as N - 1, where N is the number of vertices contained in the shortest path.
Of course, after finding the shortest path we can easily count the hops. However, can we count the hops between a given source and destination while Dijkstra is still searching?
from heapq import heappop, heappush

def dijkstra(adjList, source, sink):
    n = len(adjList)
    parent = [None]*n
    heap = [(0,source,0)]
    explored_node=[]
    hop_count = 0
    Max_hop = 8
    while heap:
        distance, current, came_from = heappop(heap)
        if parent[current] is not None:  # skip if already visited
            continue
        parent[current] = came_from  # this also marks the node as visited
        if sink and current == sink:  # only correct place to have terminating condition
            # build path
            path = [current]
            while current != source:
                current = parent[current]
                path.append(current)
            path.reverse()
            hop_count -=1
            print("Hop count is ",hop_count)
            return 1, distance, path
        for (neighbor, cost) in adjList[current]:
            if parent[neighbor] is None:  # not yet visited
                heappush(heap, (distance + cost, neighbor, current))
        hop_count = hop_count + 1
        if hop_count > Max_hop:
            print("Terminate")
adjList =[
    [],
    [[2,3],[4,11],[5,5]],
    [[1,3],[3,5],[5,11],[6,7]],
    [[2,5],[6,3]],
    [[1,11],[5,15],[7,9]],
    [[1,5],[2,11],[6,3],[7,6],[8,3],[9,9]],
    [[2,7],[3,3],[5,3],[9,10]],
    [[4,9],[5,6],[8,1],[10,11],[11,8]],
    [[5,3],[7,1],[9,9],[11,11]],
    [[5,9],[6,10],[8,9],[11,3],[12,8]],
    [[7,11],[13,7],[14,3]],
    [[7,8],[8,11],[9,3],[12,8],[14,6]],
    [[9,8],[11,8],[15,11]],
    [[10,7],[15,3]],
    [[10,3],[11,6],[15,9]],
    [[12,11],[13,3],[14,9]],
]

flag, dist, path = dijkstra(adjList,1,15)
print("found shortest path {}, which has a distance of {}".format(path, dist))
The graph of adjList is as shown: (the red line is the shortest path from 1 to 15)
I know this is incorrect: since I increment hop_count while Dijkstra iterates over the neighbors, it represents the number of explored nodes rather than the hop count.
In my opinion, there are two significant issues that need to be addressed.
The hop_count can be incremented once the shortest distance between a parent_node and a neighbor_node is determined. But Dijkstra finds the shortest path by iterating over neighbor nodes, and the array that stores the shortest distances is updated gradually during the search. How can we determine that Dijkstra has already found the shortest distance between a parent_node and a neighbor_node?
Condition 1 alone is not enough: even if we know when Dijkstra has found the shortest distance between two nodes, how do we know whether the neighbor_node will be included in the shortest path between the given source and destination?
In summary, to know the current hop count while Dijkstra is running, we need to increment hop_count when the shortest path from the parent_node to the neighbor_node has been determined and the neighbor_node will be part of the shortest path from the source to the destination node.
To better define the problem: as shown in this figure, the red line is the shortest path between node 1 and node 15, namely 1 -> 5 -> 8 -> 7 -> 10 -> 13 -> 15.
When node 2 is explored and the shortest distance between node 1 and node 2 is determined to be 3, hop_count must not be incremented, since node 2 is not on the shortest path between 1 and 15.
When node 5 is explored and the shortest distance between node 1 and node 5 is determined to be 5, hop_count should be incremented, since node 5 is on the shortest path between 1 and 15.
Is my understanding correct? I would like to hear your thoughts: is it possible to determine the hop count while performing Dijkstra?
As the heap will hold nodes that represent paths of varying lengths, you cannot hope to use a single variable for the hop count. You would need to add the hop count as additional information in the tuples that you put on the heap, as it is specific to each individual path.
Secondly, you would need to allow different paths to the same node to be extended further, as some of these might drop out because of the hop limit while another stays under it. Concretely, when a more costly path to an already visited node is found, but it uses fewer hops, it should still be considered. This means that came_from is no longer a good structure (it only allows one path to pass via a node). Instead we can use a linked list (of back-references) that is included in the heap element.
NB: I would also make max_hop a parameter to the function:
from heapq import heappop, heappush

def dijkstra(adjList, source, sink, max_hop=8):  # make max_hop a parameter
    n = len(adjList)
    least_hops = [n]*n  # Used for deciding whether to visit node via different path
    heap = [(0, 0, (source, None))]  # came_from is now a linked list: (a, (b, (c, None)))
    while heap:
        distance, hop_count, chain = heappop(heap)  # hop_count is part of tuple
        current = chain[0]
        if hop_count >= least_hops[current]:
            continue  # Cannot be an improvement
        least_hops[current] = hop_count
        if sink and current == sink:
            print("Hop count is ", hop_count)
            path = []
            while chain:
                current, chain = chain  # Unwind linked list
                path.append(current)
            return 1, distance, path[::-1]
        if hop_count >= max_hop:  # no recursion beyond max_hop
            print("Terminate")
            continue
        hop_count += 1  # Adjusted for next pushes onto heap
        for neighbor, cost in adjList[current]:
            heappush(heap, (distance + cost, hop_count, (neighbor, chain)))  # Prepend neighbor
As to your other question:
How to determine Dijkstra has already found the shortest distance between a parent_node and a neighbor_node?
We don't determine this immediately, and we allow multiple paths to the same node to co-exist. The if at the top of the while loop detects whether the node was already visited by a path that needed no more hops: that path received priority on the heap and was pulled from it in an earlier iteration of the main while loop, and thus it is at least as short. This if prevents us from extending a useless "alternative" path: even if the shorter path needs to be rejected later because it cannot stay within the hop limit, an alternative that did not use fewer hops cannot hope to stay within the limit either, so it can be rejected now.
There are two questions here: one is how to keep track of the length of the path, and the other is how to terminate the program once the maximum path length is exceeded. They have quite different answers.
On the one hand, you can count how many hops the shortest path has by just taking the length of the path after the algorithm finishes (though that doesn't seem to be what you want). On the other hand, you can also keep track of how many hops are required to get from the source to any given node X at an arbitrary iteration: keep track of the length of the current path from s to each vertex X and update the path length of the neighbors at the relaxation step. This is well covered by @trincot's answer, which provides code too.
Now, before getting to the program termination part, let me state three useful lemmas that hold as invariants throughout Dijkstra's algorithm.
Lemma 1: For every marked vertex, the recorded distance from the source to that vertex is the length of a shortest path.
Lemma 2: For every unmarked vertex, the currently recorded distance is the length of a shortest path that passes only through already visited vertices.
Lemma 3: If the shortest path is s -> ... -> u -> v, then once u is visited and its neighbors' distances are updated, the distance d(s, v) remains invariant.
What these lemmas tell us is that:
When node X is marked as visited, d(s, x) is minimal and the length of the path s->x will remain invariant (from Lemma 1).
Until node X is marked as visited, d(s, x) is an estimate and the length of the path s->x is whatever the current path length is. Both values might change (from Lemma 2).
You can't guarantee that a path of length N is a shortest path, nor that the shortest path has length <= N (from Lemma 3 with a bit of work).
Therefore, if you decide to terminate the program when the path length from source to sink is greater than a maximum number of hops, the information obtained can't be guaranteed to be optimal. In particular, any of these may happen at program termination:
The path length is N but there is another path of length N with shorter distance.
The path length is N and there is another path with fewer hops and shorter distance.
If you want the shortest path from source to sink subject to a limit on the path length, you should use the Bellman-Ford algorithm instead, which guarantees that after iteration i all paths considered have at most i edges and that each is shortest under that constraint.
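The Bellman-Ford suggestion can be sketched as follows. This is only an illustration under assumptions: the function name `shortest_path_max_hops` and its parameters are made up for this example, and the graph is given as the same kind of adjacency list of (neighbor, cost) pairs used in the question. Each round works on a copy of the previous round's distances, so after round i every recorded path has at most i edges.

```python
import math

def shortest_path_max_hops(adj_list, source, sink, max_hops):
    """Cheapest path from source to sink using at most max_hops edges.

    Returns (distance, path); distance is math.inf and path is None
    when the sink cannot be reached within the hop limit.
    """
    n = len(adj_list)
    dist = [math.inf] * n
    path = [None] * n
    dist[source] = 0
    path[source] = [source]
    for _ in range(max_hops):
        # Work on copies so that one round adds at most one edge to any path.
        new_dist = dist[:]
        new_path = path[:]
        for u in range(n):
            if dist[u] == math.inf:
                continue
            for v, cost in adj_list[u]:
                if dist[u] + cost < new_dist[v]:
                    new_dist[v] = dist[u] + cost
                    new_path[v] = path[u] + [v]
        dist, path = new_dist, new_path
    return dist[sink], path[sink]

# Tiny made-up graph: the direct edge 0 -> 2 is expensive, the detour via 1 is cheap.
demo = [[(1, 1), (2, 5)], [(2, 1)], []]
print(shortest_path_max_hops(demo, 0, 2, 1))  # only one edge allowed
print(shortest_path_max_hops(demo, 0, 2, 2))  # the detour now fits in the limit
```

The cost is O(max_hops * E), which is more than Dijkstra, but the hop limit is respected by construction instead of being bolted on.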
This code uses a priority queue for Dijkstra's algorithm.
#include <iostream>
#include <algorithm>
#include <queue>
#include <cstring>
#include <cstdio>
#include <vector>
#define limit 15
using namespace std;
int cost[20001];
vector<int> plist[20001];
const int MaxVal = -1;
vector< vector< pair<int, int> > > arr;
struct node {
    pair<int, int> info;
    vector<int> path;
};
bool operator < (node a, node b) {
    return a.info.first > b.info.first;
}
int main() {
    int i, j, k;
    int n, m;
    int s;
    int a, b, c;
    cin >> n >> m;
    cin >> s;
    //arr.reserve(n + 1);
    arr.resize(n + 1);
    for (i = 1; i <= m; i++) {
        cin >> a >> b >> c;
        arr[a].push_back({ b, c });
    }
    for (i = 1; i <= n; i++) {
        cost[i] = MaxVal;
    }
    priority_queue<node, vector<node>> mh;
    mh.push(node{ { 0, s }, { } });
    while (mh.size() > 0) {
        int current = mh.top().info.second;
        int val = mh.top().info.first;
        auto path = mh.top().path;
        mh.pop();
        // All paths are sorted in the priority queue, so a path that comes out later can't be the shorter one.
        if (cost[current] != MaxVal) continue;
        cost[current] = val;
        path.push_back(current);
        if(path.size() > limit) {
            //limit exceeded!!
            cout << "limitation exceeded!!";
            break;
        }
        plist[current] = path;
        for (auto it : arr[current]) {
            if (cost[it.first] != MaxVal) continue;
            mh.push({ { it.second + val, it.first }, path });
        }
    }
    for (i = 1; i <= n; i++) {
        cout << "path to " << i << " costs ";
        if (cost[i] == MaxVal) {
            cout << "INF\n";
        }
        else {
            cout << cost[i] << "\n";
        }
        for (auto p : plist[i]) {
            cout << p << " ";
        }
        cout << endl << endl;
    }
    return 0;
}
//test case
15 55
1 //Starting Node Number
1 2 3
1 4 11
1 5 5
2 1 3
2 3 5
2 5 11
2 6 7
3 2 5
3 6 3
4 1 11
4 5 15
4 7 9
5 1 5
5 2 11
5 6 3
5 7 6
5 8 3
5 9 9
6 2 7
6 3 3
6 5 3
6 9 10
7 4 9
7 5 6
7 8 1
7 10 11
7 11 8
8 5 3
8 7 1
8 9 9
8 11 11
9 5 9
9 6 10
9 8 9
9 11 3
9 12 8
10 7 11
10 13 7
10 14 3
11 7 8
11 8 11
11 9 3
11 12 8
11 14 6
12 9 8
12 11 8
12 15 11
13 10 7
13 15 3
14 10 3
14 11 6
14 15 9
15 12 11
15 13 3
15 14 9
path to 1 costs 0
1
path to 2 costs 3
1 2
path to 3 costs 8
1 2 3
path to 4 costs 11
1 4
path to 5 costs 5
1 5
path to 6 costs 8
1 5 6
path to 7 costs 9
1 5 8 7
path to 8 costs 8
1 5 8
path to 9 costs 14
1 5 9
path to 10 costs 20
1 5 8 7 10
path to 11 costs 17
1 5 8 7 11
path to 12 costs 22
1 5 9 12
path to 13 costs 27
1 5 8 7 10 13
path to 14 costs 23
1 5 8 7 11 14
path to 15 costs 30
1 5 8 7 10 13 15

Why does Frame.ofRecords garble its results when fed a sequence generated by a parallel calculation?

I am running some code that calculates a sequence of records and calls Frame.ofRecords with that sequence as its argument. The records are calculated using PSeq.map from the library FSharp.Collections.ParallelSeq.
If I convert the sequence into a list then the output is OK. Here is the code and the output:
let summaryReport path (writeOpenPolicy: WriteOpenPolicy) (outputs: Output seq) =
    let foo (output: Output) =
        let temp =
            { Name = output.Name
              Strategy = string output.Strategy
              SharpeRatio = (fst output.PandLStats).SharpeRatio
              CalmarRatio = (fst output.PandLStats).CalmarRatio }
        printfn "************************************* %A" temp
        temp
    outputs
    |> Seq.map foo
    |> List.ofSeq // this is the line that makes a difference
    |> Frame.ofRecords
    |> frameToCsv path writeOpenPolicy ["Name"] "Summary_Statistics"
Name Name Strategy SharpeRatio CalmarRatio
0 Singleton_AAPL MyStrategy 0.317372564 0.103940018
1 Singleton_MSFT MyStrategy 0.372516931 0.130150478
2 Singleton_IBM MyStrategy Infinity
The printfn command lets me verify by inspection that in each case the variable temp was calculated correctly.
The last code line is just a wrapper around FrameExtensions.SaveCsv.
If I remove the |> List.ofSeq line then what comes out is garbled:
Name Name Strategy SharpeRatio CalmarRatio
0 Singleton_IBM MyStrategy 0.317372564 0.130150478
1 Singleton_MSFT MyStrategy 0.103940018
2 Singleton_AAPL MyStrategy 0.372516931 Infinity
Notice that the empty (corresponding to NaN) and Infinity items are now in different lines and other things are also mixed up.
Why is this happening?
The Frame.ofRecords function iterates over the sequence multiple times, so if your sequence returns different data when called repeatedly, you will get inconsistent data into the frame.
Here is a minimal example:
let mutable n = 0.
let nums = seq { for i in 0 .. 10 do n <- n + 1.; yield n, n }
Frame.ofRecords nums
This returns:
Item1 Item2
0 -> 1 12
1 -> 2 13
2 -> 3 14
3 -> 4 15
4 -> 5 16
5 -> 6 17
6 -> 7 18
7 -> 8 19
8 -> 9 20
9 -> 10 21
10 -> 11 22
As you can see, the first item is obtained during the first iteration of the sequence, while the second item is obtained during the second iteration.
This should probably be better documented, but it makes the performance better in typical scenarios - if you can send a PR to the docs, that would be useful.
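The same effect can be reproduced outside of F#. The sketch below is a loose Python analogy and not the library's actual implementation: `Restartable` is a hypothetical class standing in for an F# seq that re-runs its computation on every enumeration, and `build_columns` is a hypothetical stand-in for a consumer that walks its input once per column.

```python
class Restartable:
    """Hypothetical stand-in for an F# seq: every iteration re-runs the computation."""
    def __init__(self, count):
        self.count = count
        self.n = 0  # shared mutable state, like `n` in the F# example

    def __iter__(self):
        for _ in range(self.count):
            self.n += 1
            yield (self.n, self.n)

def build_columns(records):
    """Hypothetical consumer that walks the input once per column."""
    first = [r[0] for r in records]   # first pass over the input
    second = [r[1] for r in records]  # second pass over the input
    return list(zip(first, second))

# Passing the lazy object: the two passes see different values.
lazy_rows = build_columns(Restartable(3))

# Materializing first (the analogue of List.ofSeq): both passes see the same data.
stable_rows = build_columns(list(Restartable(3)))

print(lazy_rows)
print(stable_rows)
```

Materializing the sequence once before handing it over is what the `List.ofSeq` line in the question achieves.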
Parallel sequences are evaluated in arbitrary order because the work is split across many processors, so the result set comes back in a nondeterministic order. You can always sort it afterwards, or not process your data in parallel.

Traffic Light Graph

Say you have a standard graph with values attached to each node and each edge.
You want to go from one node on the graph to another in the shortest amount of time.
The amount of time you have taken so far to traverse this graph will be known as T.
If an edge has value V, traversing that edge will add V to your time spent (T += V).
If a node has a value N, traversing that node will force you to wait until your time spent is divisible by N (T += (N - T % N) % N).
You can think of this like streets and traffic lights.
Driving on a street takes a constant amount of time to reach the other end.
Driving through a traffic light takes the amount of time you have to wait for it to turn green.
For example, let's say you have this graph:
S--6--[1]--2--[7]
       |       |
       3       2
       |       |
      [9]--3--[6]--1--E
Just at a glance, the top path looks faster because it has shorter edges and a shorter delay.
However, the bottom route turns out to be faster. Let's compute the bottom first:
Start: 0 + 6 -> 6
6 % 1 == 0 # We can pass
6 + 3 -> 9
9 % 9 == 0 # We can pass
9 + 3 -> 12
12 % 6 == 0 # We can pass
12 + 1 -> 13
End: 13
And then the top:
Start: 0 + 6 -> 6
6 % 1 == 0 # We can pass
6 + 2 -> 8
8 % 7 != 0 # Have to wait
8 + 6 -> 14
14 % 7 == 0 # We can pass
14 + 2 -> 16
16 % 6 != 0 # Have to wait
16 + 2 -> 18
18 % 6 == 0 # We can pass
18 + 1 -> 19
End: 19
As you can see, the bottom route is much faster.
At small sizes like this it's easy to calculate by hand, but at city sizes you'd need to use some sort of traversal algorithm.
Does anyone know if there's any sort of solution besides brute force?
This is a shortest path search problem and can be solved by Dijkstra's algorithm in polynomial time. When the length of a path is computed, the time spent waiting at the vertex an edge leads to must be added on top of the edge's value (except at the final destination, where you don't wait). Arriving at a vertex earlier never makes you leave it later, which is why Dijkstra's greedy choice stays valid. So it is still a shortest path search problem, but the weight function differs slightly from a simple sum of edge weights.
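A minimal Python sketch of this modified Dijkstra follows, under a few assumptions that are not in the question: the function name `earliest_arrival`, the node labels (`"n1"`, `"n7"`, ...), the undirected edge list, and treating nodes without a value (such as S and E) as having a period of 1.

```python
import heapq

def earliest_arrival(edges, period, source, target):
    """Earliest time at which target can be reached from source.

    edges  : list of (u, v, travel_time) for undirected streets
    period : dict mapping a node to its traffic-light value N
             (nodes missing from the dict are assumed to have N = 1)
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            arrive = t + w
            if v != target:
                n = period.get(v, 1)
                arrive += (n - arrive % n) % n  # wait for the light: T += (N - T % N) % N
            if arrive < best.get(v, float("inf")):
                best[v] = arrive
                heapq.heappush(heap, (arrive, v))
    return None  # target unreachable

# The example graph from the question.
edges = [("S", "n1", 6), ("n1", "n7", 2), ("n1", "n9", 3),
         ("n7", "n6", 2), ("n9", "n6", 3), ("n6", "E", 1)]
period = {"n1": 1, "n7": 7, "n9": 9, "n6": 6}
print(earliest_arrival(edges, period, "S", "E"))
```

On the example graph this finds the bottom route with a total time of 13, matching the hand calculation in the question.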

Finding the root value of a binary tree?

I have an array which stores the relations of values, which makes several trees something like:
So, in this case, my array would be (root, linked to)
(8,3)
(8,10)
(3,1)
(3,6)
(6,4)
(6,7)
(10,14)
(14,13)
And I'd like to set all the root values in the array to the main root of the tree (in all trees):
(8,3)
(8,1)
(8,6)
(8,4)
(8,7)
(8,10)
(8,14)
(8,13)
What algorithm should I investigate?
1) Make a list of all the unique first elements of the tuples.
2) Remove any that also appear as the second element of a tuple.
3) You'll be left with the root (8 here). Replace the first elements of all tuples with this value.
EDIT:
A more complicated approach that will work with multiple trees would be as follows.
First, convert to a parent lookup table:
1 -> 3
3 -> 8
4 -> 6
6 -> 3
7 -> 6
10 -> 8
13 -> 14
14 -> 10
Next, run "find parent with path compression" on each element:
1)
1 -> 3 -> 8
gives
1 -> 8
3 -> 8
4 -> 6
...
3)
3 -> 8
4)
4 -> 6 -> 3 -> 8
gives
1 -> 8
3 -> 8
4 -> 8
6 -> 8
7 -> 6
...
6)
6 -> 8 (already done)
7)
7 -> 6 -> 8
etc.
Result:
1 -> 8
3 -> 8
4 -> 8
6 -> 8
7 -> 8
...
Then convert this back to the tuple list:
(8,1)(8,3)(8,4)...
The "find parent with path compression" algorithm is the same as find_set for disjoint-set forests, e.g.
int find_set(int x) const
{
    Element& element = get_element(x);
    int& parent = element.m_parent;
    if(parent != x)
    {
        parent = find_set(parent);
    }
    return parent;
}
The key point is that path compression helps you avoid a lot of work. In the above, for example, when you do the lookup for 4, you store 6 -> 8, which makes later lookups referencing 6 faster.
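The whole procedure (parent lookup table, find with path compression, conversion back to tuples) can be sketched in Python as below. The function names `relabel_with_roots` and `find_root` are hypothetical, and the sketch assumes every value has at most one parent. Unlike the C++ fragment, a root here is a value that is simply absent from the parent table rather than one that points to itself.

```python
def relabel_with_roots(pairs):
    """Replace the first element of every (parent, child) pair by the root of the child's tree."""
    parent = {child: par for par, child in pairs}  # parent lookup table

    def find_root(x):
        # Walk up until we reach a value that has no parent: that is the root.
        path = []
        while x in parent:
            path.append(x)
            x = parent[x]
        # Path compression: point every node we passed directly at the root.
        for node in path:
            parent[node] = x
        return x

    return [(find_root(child), child) for _, child in pairs]

pairs = [(8, 3), (8, 10), (3, 1), (3, 6), (6, 4), (6, 7), (10, 14), (14, 13)]
print(relabel_with_roots(pairs))
```

Because the lookup is done per child, the same function also handles an input that contains several separate trees.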
So assume you have a list of tuples representing the points:
def find_root(ls):
    parents = {node[0] for node in ls}
    children = {node[1] for node in ls}
    # a root appears as a parent but never as a child
    roots = parents - children
    if len(roots) != 1:
        return -1  # failure, the tree is not well formed
    root = roots.pop()
    for nodeIndex in range(len(ls)):
        ls[nodeIndex] = (root, ls[nodeIndex][1])
    return ls

Find number of permutations of a given sequence of integers which yield the same binary search tree

Given an array of integers arr = [5, 6, 1]. When we construct a BST with this input in the same order, we will have "5" as root, "6" as the right child and "1" as left child.
Now if our input is changed to [5,1,6], our BST structure will still be identical.
So given an array of integers, how to find the number of different permutations of the input array that results in the identical BST as the BST formed on the original array order?
Your question is equivalent to the question of counting the number of topological orderings for the given BST.
For example, for the BST
      10
     /  \
    5    20
     \   / \
      7 15  30
the set of topological orderings can be counted by hand like this: 10 starts every ordering. The number of topological orderings for the subtree starting with 20 is two: (20, 15, 30) and (20, 30, 15). The subtree starting with 5 has only one ordering: (5, 7). The ordering of the left subtree (2 elements) and an ordering of the right subtree (3 elements) can be interleaved in an arbitrary manner, which gives 10 interleavings for each of the two right-subtree orderings, so 2 x 10 = twenty inputs produce the same BST. The first 10 are enumerated below for the case (20, 15, 30):
10 5 7 20 15 30
10 5 20 7 15 30
10 5 20 15 7 30
10 5 20 15 30 7
10 20 5 7 15 30
10 20 5 15 7 30
10 20 5 15 30 7
10 20 15 5 7 30
10 20 15 5 30 7
10 20 15 30 5 7
The case (20, 30, 15) is analogous --- you can check that any one of these inputs produces the same BST.
This example also provides a recursive rule to calculate the number of orderings. For a leaf, the number is 1. For a non-leaf node with one child, the number equals the number of topological orderings for the child. For a non-leaf node with two children with subtree sizes |L| and |R|, having l and r orderings respectively, the number equals
l x r x INT(|L|, |R|)
Where INT is the number of possible interleavings of |L| and |R| elements. This can be calculated easily by (|L| + |R|)! / (|L|! x |R|!). For the example above, we get the following recursive computation:
Ord(15) = 1
Ord(30) = 1
Ord(20) = 1 x 1 x INT(1, 1) = 2 ; INT(1, 1) = 2! / 1 = 2
Ord(7) = 1
Ord(5) = 1
Ord(10) = 1 x 2 x INT(2, 3) = 2 x 5! / (2! x 3!) = 2 x 120 / 12 = 2 x 10 = 20
This solves the problem.
Note: this solution assumes that all nodes in the BST have different keys.
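The recursive rule can be sketched in a few lines of Python, working directly on the input sequence instead of an explicit tree. The name `count_orderings` is made up for this illustration, and like the rule above it assumes that all keys are distinct. `math.comb(a + b, a)` computes the number of interleavings (|L| + |R|)! / (|L|! x |R|!).

```python
from math import comb

def count_orderings(seq):
    """Number of permutations of seq that build the same BST as seq itself."""
    if len(seq) <= 1:
        return 1
    root = seq[0]
    left = [x for x in seq[1:] if x < root]    # keys of the left subtree, in input order
    right = [x for x in seq[1:] if x > root]   # keys of the right subtree, in input order
    interleavings = comb(len(left) + len(right), len(left))
    return count_orderings(left) * count_orderings(right) * interleavings

print(count_orderings([10, 5, 7, 20, 15, 30]))
print(count_orderings([5, 6, 1]))
```

For the example tree this yields 20, in agreement with the hand computation of Ord(10).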
Thanks for the explanation antti.huima! This helped me understand. Here is some C++:
#include <vector>
#include <iostream>
using namespace std;

int factorial(int x) {
    return (x <= 1) ? 1 : x * factorial(x - 1);
}

int f(int a, int b) {
    return factorial(a + b) / (factorial(a) * factorial(b));
}

template <typename T>
int n(vector<T>& P) {
    if (P.size() <= 1) return 1;
    vector<T> L, R;
    for (int i = 1; i < P.size(); i++) {
        if (P[i] < P[0])
            L.push_back(P[i]);
        else
            R.push_back(P[i]);
    }
    return n(L) * n(R) * f(L.size(), R.size());
}

int main(int argc, char *argv[]) {
    vector<int> a = { 10, 5, 7, 20, 15, 30 };
    cout << n(a) << endl;
    return 0;
}
This question can be solved easily if you have a little knowledge of recursion, permutations and combinations, and familiarity with binary search trees (obviously).
First you build a binary search tree from the given sequence. You could also perform the same operation on the array, but visualising the tree paints a clearer picture.
For the given sequence arr[1..n], the 1st element stays where it is, and only arr[2..n] needs to be rearranged.
Assume:
bag1 = number of elements in arr[2..n] which are less than arr[1].
and,
bag2 = number of elements in arr[2..n] which are greater than arr[1].
Since the positions of the bag1 elements in the sequence do not conflict with those of the bag2 elements while forming a binary search tree, one can begin calculating the answer by choosing bag1 of the (n-1) positions for the bag1 elements; the remaining ((n-1) - bag1) = bag2 elements can then be placed in only 1 way. The relative order of the bag1 elements must stay the same, and likewise for the bag2 elements.
Since each subtree of a binary search tree has to be a BST, the same process is applied at each node, and the local answer for the node is multiplied into the final answer.
int ans = 1;
int size[1000000] = {0};
// nCr(n, r) is assumed to be defined elsewhere and to return "n choose r".
// calculate the size of tree and its subtrees before running function "fun" given below.
int calSize(struct node* root){
    if(root == NULL)
        return 0;
    int l = calSize(root->left);
    int r = calSize(root->right);
    size[root->val] = l+r+1;
    return size[root->val];
}
void fun(struct node* root){
    if(root == NULL)
        return;
    int n = size[root->val];
    if(root->left){
        // size is indexed by node value, so look up the left child's value
        ans *= nCr(n-1, size[root->left->val]);
        ans *= 1; // (Just to understand that there is now only 1 way
                  // to distribute the rest (n-1)-size of root->left)
    }
    fun(root->left);
    fun(root->right);
}
int main(){
    struct node* root;
    //construct tree
    calSize(root); // fill in the subtree sizes first
    //and send the root to function "fun"
    fun(root);
    cout<<ans<<endl;
    return 0;
}
You could do this backwards: Given a BST, enumerate all the arrays of integers which could yield this BST...
Couldn't you (using nondeterminism...)
1) emit root and add it to the emitted set.
2) nondeterministically choose an item from the tree which is not in the emitted set, but whose parent is, and add it to the emitted set and emit it.
3) repeat 2 until all emitted.
The nondeterminism will give you all such arrays. Then you can count them.
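One way to simulate the nondeterministic choice is plain backtracking: try every eligible item in turn and collect each complete emission. The Python sketch below does this under the assumption of distinct keys; `bst_children` and `all_orderings` are hypothetical helper names, and the "frontier" list holds exactly the items that are not yet emitted but whose parent is.

```python
def bst_children(seq):
    """Insert seq into a BST in order; return (root, {key: [child keys]})."""
    left, right = {}, {}
    root = seq[0]
    for key in seq[1:]:
        node = root
        while True:
            side = left if key < node else right
            if node in side:
                node = side[node]      # keep descending
            else:
                side[node] = key       # free slot found: attach here
                break
    children = {k: [c for c in (left.get(k), right.get(k)) if c is not None]
                for k in seq}
    return root, children

def all_orderings(seq):
    """Enumerate every input order that yields the same BST as seq."""
    root, children = bst_children(seq)
    results = []

    def extend(emitted, frontier):
        if not frontier:               # everything has been emitted
            results.append(emitted)
            return
        for i, node in enumerate(frontier):
            # Emit `node`; its children become eligible for emission.
            rest = frontier[:i] + frontier[i + 1:] + children[node]
            extend(emitted + [node], rest)

    extend([], [root])
    return results

orders = all_orderings([10, 5, 7, 20, 15, 30])
print(len(orders))
```

Enumeration takes time proportional to the number of orderings, so it is only practical for small trees; for counting alone a closed-form recursion is far cheaper.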

Resources