How to assign a different value to each node and each arc of a tree - algorithm

I wanted to ask a somewhat specific question. I had to solve a problem, which I managed to do with some help. I want to explain what I did and then what I've been told I'm missing.
What I did: Given a list of connections (a-b,b-c), I assign each node an "id" from 1 to N (like giving each node a name, a unique number). Then I look at which nodes are connected and compute the absolute difference of their ids: a-b would be 1-2, whose absolute value is 1; b-c would be 2-3, which gives another 1 for that arc. If I had an a-d arc, I would get 1-4, so 3 as its arc value.
Then I return the list of nodes with their ids, and the list of arcs with their values: (1,a-b), (1,a-c).
Graphically:
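(Four hand-labelled tree diagrams, listed level by level: node labels as id followed by node name, then the arc values.)
Tree 1: nodes 6a, 1b, 2e, 3c, 5f, 4d; arcs 5, 4, 2, 3, 1
Tree 2: nodes 5a, 1b, 2e, 3c, 4f; arcs 4, 3, 1, 2
Tree 3: nodes 5a, 1b, 2e, 3c, 4d; arcs 4, 3, 2, 1
Tree 4: nodes 9a, 1b, 2e, 7c, 6f, 5d, 4g, 8h, 3i; arcs 8, 7, 6, 4, 2, 2, 3, 1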
I've worked these out by hand, so there is still no clear algorithm.
What I've been told I'm missing: each ARC has to have a unique number, too. So both nodes and arcs must be numbered without repeats: nodes from 1 to N and arcs from 1 to N-1, N being the number of nodes.
The problem is that I can't figure this out at all. The closest I get is on paper, where I end up writing longer and longer equations, but I'm not sure that would actually solve anything; so far I have found nothing.
The reason for me not understanding this is that the tree/list a-b,b-c would have as nodes:
a-3,b-2,c-1 ; and arcs: 2,a-b|1,b-c
And another, very simple list like a-b,a-c would have:
a-2,b-1,c-3 ; and arcs: 2,a-b|1,a-c
This is one possible solution, but as the trees grow bigger, I fail to see how it is possible for each arc to have a non-repeating value between 1 and N, and the same for each node. Is this even possible? How should I approach this task, or am I missing some point of view?
Thanks in advance.
edit: since I am not being clear with terminology:
enumerate(CONNECTIONS_IN, NODES_OUT, ARCS_OUT)
?- enumerate([a-b,b-c], EnumNodos, EnumArcos) returns, as of right now:
EnumNodos=[(1,a),(2,b),(3,c)]
EnumArcos=[(1,a-b), (1,a-c)]
It should give:
EnumNodos=[(3,a),(1,b),(2,c)]
EnumArcos=[(1,a-b), (2,a-c)] % because each arc HAS to have a unique number from 1 to N-1 (nodes are from 1 to N)
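The requirement described above (distinct node numbers 1..N whose absolute differences give distinct arc numbers 1..N-1) is known in graph theory as a graceful labeling of the tree. The following is a minimal brute-force sketch in Python, not an efficient algorithm: the function name `graceful_labeling` and the tuple-based edge format are assumptions made for illustration. It tries every assignment of 1..N to the nodes, so it is only practical for small trees.

```python
from itertools import permutations


def graceful_labeling(edges):
    """Brute-force search (hypothetical helper, small trees only).

    edges: list of (node, node) pairs describing a tree.
    Returns (node_labels, arc_labels) or None if no labeling is found.
    """
    nodes = sorted({v for edge in edges for v in edge})
    n = len(nodes)
    for perm in permutations(range(1, n + 1)):
        label = dict(zip(nodes, perm))
        arcs = {(u, v): abs(label[u] - label[v]) for u, v in edges}
        # A tree has n-1 arcs; if all differences are distinct they are
        # exactly the numbers 1..n-1.
        if len(set(arcs.values())) == len(edges):
            return label, arcs
    return None


labels, arcs = graceful_labeling([("a", "b"), ("b", "c")])
print(labels, arcs)
```

For larger trees the N! search space grows too quickly, so a backtracking search that abandons a partial assignment as soon as two arcs share a value would be the natural next step.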

Related

Need help for finding optimal path which visits multiple sequences of nodes

Summary
Recently I came across a path-finding puzzle with some complex constraints (currently, I don't have any solution for it).
A 2D matrix represents the graph. The length of a path is the number of traversed cells.
One or more number sequences are to be found inside the matrix. Each sequence is scored with a value.
There is a maximum path length. The number of picked cells must not exceed this value.
At any given moment, you can only choose cells in a specific column or row.
On each turn, you need to switch between column and row and stay on the same line as the last cell you picked. You have to move at right angles (the direction changes like in the Snake game).
Always start with picking the first cell from the top row, then go vertically down to pick the second cell, and then continue switching between column and row as usual.
You can't choose the same cell twice. The resulting path must not contain duplicated cells.
For example:
The task is to find the shortest path in the graph, if one exists, that contains one or more sequences with the highest total score and whose length does not exceed the provided maximum length.
The picture below demonstrates the solved puzzle with the resulting path marked in red:
Here, we have the path 3A-10-9B. This path contains the given sequence 3A-10-9B, which earns 10pts. More complex graphs typically have longer paths containing several sequences at once.
More complex examples
Multiple Sequences
You can complete sequences in any order. The order in which the sequences are listed doesn't matter.
Wasted Moves
Sometimes we are forced to waste moves and choose different cells that don't belong to any sequence. Here are the rules:
You may waste 1 or 2 moves before the first sequence.
You may waste 1 or 2 moves between any neighboring sequences.
However, you cannot break sequences and waste moves in the middle of them.
Here, we must waste one move before the sequence 3A-9B and two moves between sequences 3A-9B and 72-D4. Also, notice how red lines between 3A and 9B as well as between 72 and D4 "cross" previously selected cells D4 and 9B, respectively. You can pick different cells from the same row or column multiple times.
Optimal Sequences
Sometimes, it is not possible to have a path that contains all of the provided sequences. In this case, choose the path that achieves the highest score.
In the above example, we can complete either 9B-3A-72-D4 or 72-D4-3A but not both due to the maximum path length of 5 cells. We have chosen the sequence 9B-3A-72-D4 since it grants more score points than 72-D4-3A.
Unsolvable solution
The first sequence 3A-D4 can't be completed since the code matrix doesn't contain code D4 at all. The second sequence, 72-10, can't be completed for another reason: codes 72 and 10 aren't located in the same row or column anywhere in the matrix and, therefore, can't form a sequence.
Performance advice
One brute force way is to generate all possible paths in the code matrix, loop through them and choose the best one. This is the easiest but also the slowest approach. Solving larger matrices with larger maximum length of path might take dozens of minutes, if not hours.
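For reference, the brute-force baseline mentioned above could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the name `best_path`, the `(row, col)` cell tuples and the `(codes, points)` sequence format are hypothetical, a sequence counts when its codes appear consecutively in the path, and the limit of 1 or 2 wasted moves is not enforced. Among paths with the highest score it keeps the shortest one.

```python
def best_path(matrix, sequences, max_len):
    """Exhaustive search; sequences is a list of (list_of_codes, points)."""
    h, w = len(matrix), len(matrix[0])

    def score(path):
        codes = [matrix[r][c] for r, c in path]
        total = 0
        for seq, pts in sequences:
            k = len(seq)
            # the sequence must appear as consecutive cells of the path
            if any(codes[i:i + k] == seq for i in range(len(codes) - k + 1)):
                total += pts
        return total

    best = (0, 0, [])  # (score, -length, path): higher score, then shorter

    def dfs(path, vertical):
        nonlocal best
        cand = (score(path), -len(path), list(path))
        if cand[:2] > best[:2]:
            best = cand
        if len(path) == max_len:
            return
        r, c = path[-1]
        # alternate between moving within the column and within the row
        if vertical:
            candidates = [(r2, c) for r2 in range(h)]
        else:
            candidates = [(r, c2) for c2 in range(w)]
        for cell in candidates:
            if cell not in path:  # a cell may be picked only once
                path.append(cell)
                dfs(path, not vertical)
                path.pop()

    for c in range(w):  # first pick is in the top row, then go vertically
        dfs([(0, c)], True)
    return best[0], best[2]
```

The number of explored paths grows roughly as (matrix side)^(maximum length), which is why this approach cannot meet the 10-second target for a 10x10 matrix with a maximum path length of 12.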
Try to implement a faster algorithm that doesn’t iterate through all possible paths and can solve puzzles with the following parameters in less than 10 seconds:
Matrix size: 10x10
Number of sequences: 5
Average length of sequences: 4
Maximum path length: 12
At least one solution exists
For example:
Matrix:
41,0f,32,18,29,4b,55,3f,10,3a,
19,4f,57,43,3a,25,19,1e,5e,42,
13,5a,54,3c,1b,32,29,1c,15,30,
49,45,22,2e,25,51,2f,21,4c,37,
1a,5e,49,12,55,1e,49,19,43,2d,
34,26,53,48,49,60,32,3c,50,10,
0f,1e,30,3d,64,37,5b,5e,22,61,
4e,4f,15,5a,13,56,44,22,40,26,
43,2c,17,2b,1f,25,43,60,50,1f,
3c,2b,54,46,42,4d,32,46,30,24,
Sequences:
30, 26, 44, 32, 3c - 25pts
5a, 3c, 12, 1e, 4d - 10pts
1e, 5a, 12 - 10pts
4d, 1e - 5pts
32, 51, 2f, 49, 55, 42 - 30pts
Optimal solution
3f, 1c, 30, 26, 44, 32, 3c, 22, 5a, 12, 1e, 4d
Which contains
30, 26, 44, 32, 3c
5a, 12, 1e
1e, 4d
Conclusion
I am looking for any advice on this puzzle, since I have no idea what keywords to search for. Pseudo-code or hints would be helpful, and I would appreciate them. What has come to my mind is just Dijkstra:
For each sequence, since the order doesn't matter, I have to find all possible paths for every permutation, then find the highest-scoring path that contains other input sequences.
After that, choose the best of the best.
In this case, I doubt the performance will be the issue.
First step is to find if a required sequence exists.
- SET found FALSE
- LOOP C1 over cells in first row
    - CLEAR foundSequence
    - ADD C1 to foundSequence
    - LOOP C2 over cells in column containing C1
        - IF C2 value == first value in sequence
            - ADD C2 to foundSequence
            - SET found TRUE
            - break from LOOP C2
    - IF found
        - SET direction VERT
        - LOOP V over remaining values in sequence
            - TOGGLE direction
            - SET found FALSE
            - LOOP C2 over cells in same column or row ( depending on direction ) containing last cell in foundSequence
                - IF C2 value == V
                    - ADD C2 to foundSequence
                    - SET found TRUE
                    - break from LOOP C2
            - IF ! found
                - break out of LOOP V
    - IF foundSequence == required sequence
        - RETURN foundSequence
RETURN failed
Note: this doesn't find sequences that are feasible with "wasted moves". I would implement this first and get it working. Then, using the same ideas, it can be extended to allow wasted moves.
You have not specified an input format! I suggest a space-delimited text file with lines beginning with 'm' containing matrix values and lines beginning with 's' containing sequences, like this:
m 3A 3A 10 9B
m 9B 72 3A 10
m 10 3A 3A 3A
m 3A 10 3A 9B
s 3A 10 9B
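Reading this format takes only a few lines. The following Python sketch is an illustration only (the function name `parse_puzzle` is an assumption, and it is not taken from the linked repository): it splits each line on whitespace and routes it by its first token.

```python
def parse_puzzle(text):
    """Parse the suggested 'm'/'s' line format into (matrix, sequences)."""
    matrix, sequences = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "m":
            matrix.append(tokens[1:])      # one matrix row
        elif tokens[0] == "s":
            sequences.append(tokens[1:])   # one required sequence
    return matrix, sequences


sample = "m 3A 3A 10 9B\nm 9B 72 3A 10\nm 10 3A 3A 3A\nm 3A 10 3A 9B\ns 3A 10 9B\n"
matrix, sequences = parse_puzzle(sample)
print(matrix)
print(sequences)
```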
I have implemented the sequence finder in C++
std::vector<int> findSequence()
{
    int w, h;
    pA->size(w, h);
    std::vector<int> foundSequence;

    // loop over cells in first row
    for (int c = 0; c < w; c++)
    {
        foundSequence.clear();
        bool found = false;
        // reset for each starting cell so that the first move is always vertical
        bool vert = false;
        if (pA->cell(c, 0)->value == vSequence[0][0])
        {
            // found possible starting cell
            foundSequence.push_back(pA->cell(c, 0)->ID());
            found = true;
        }
        while (found)
        {
            // toggle search direction
            vert = !vert;

            // start from last cell found
            auto pmCell = pA->cell(foundSequence.back());
            int col, row;
            pA->coords(col, row, pmCell);

            // look for next value in required sequence
            std::string nextValue = vSequence[0][foundSequence.size()];
            found = false;
            if (vert)
            {
                // loop over cells in column: the row index runs up to the height h
                for (int r2 = 1; r2 < h; r2++)
                {
                    if (pA->cell(col, r2)->value == nextValue)
                    {
                        foundSequence.push_back(pA->cell(col, r2)->ID());
                        found = true;
                        break;
                    }
                }
            }
            else
            {
                // loop over cells in row: the column index runs up to the width w
                for (int c2 = 0; c2 < w; c2++)
                {
                    if (pA->cell(c2, row)->value == nextValue)
                    {
                        foundSequence.push_back(pA->cell(c2, row)->ID());
                        found = true;
                        break;
                    }
                }
            }
            if (!found)
            {
                // dead end - try starting from next cell in first row
                break;
            }
            if (foundSequence.size() == vSequence[0].size())
            {
                // success!!!
                return foundSequence;
            }
        }
    }
    std::cout << "Cannot find sequence\n";
    exit(1);
}
This outputs:
3A 3A 10 9B
9B 72 3A 10
10 3A 3A 3A
3A 10 3A 9B
row 0 col 1 3A
row 3 col 1 10
row 3 col 3 9B
You can check out the code for the complete application at https://github.com/JamesBremner/stackoverflow75410318
I have added the ability to find sequences that start elsewhere than the first row ( i.e. with "wasted moves" ). You can see the code in the github repo.
Here are the results of a timing profile run on a 10 by 10 matrix. The algorithm finds 5 sequences in an average of 0.6 milliseconds each:
Searching
41 0f 32 18 29 4b 55 3f 10 3a
19 4f 57 43 3a 25 19 1e 5e 42
13 5a 54 3c 1b 32 29 1c 15 30
49 45 22 2e 25 51 2f 21 4c 37
1a 5e 49 12 55 1e 49 19 43 2d
34 26 53 48 49 60 32 3c 50 10
0f 1e 30 3d 64 37 5b 5e 22 61
4e 4f 15 5a 13 56 44 22 40 26
43 2c 17 2b 1f 25 43 60 50 1f
3c 2b 54 46 42 4d 32 46 30 24
for sequence 4d 1e
Cannot find sequence starting in 1st row, using wasted moves
row 9 col 5 4d
row 4 col 5 1e
for sequence 30 26 44 32 3c
Cannot find sequence starting in 1st row, using wasted moves
Cannot find sequence
for sequence 5a 3c 12 1e 4d
Cannot find sequence starting in 1st row, using wasted moves
row 2 col 1 5a
row 2 col 3 3c
row 4 col 3 12
row 4 col 5 1e
row 9 col 5 4d
for sequence 1e 5a 12
Cannot find sequence starting in 1st row, using wasted moves
row 6 col 1 1e
row 4 col 5 1e
row 4 col 3 12
for sequence 32 51 2f 49 55 42
Cannot find sequence starting in 1st row, using wasted moves
row 2 col 5 32
row 3 col 5 51
row 3 col 6 2f
row 4 col 6 49
row 4 col 4 55
row 9 col 4 42
raven::set::cRunWatch code timing profile
Calls Mean (secs) Total Scope
5 0.00059034 0.0029517 findSequence

Pyramidal algorithm

I'm trying to find an algorithm that goes through a numerical pyramid, starting at the top and moving to an adjacent number in the next row at each step, adding each visited number to a running sum. I have to find the route that returns the highest result.
I already tried always moving to the higher adjacent number in the next row, but that is not the answer, because it does not always give the best route.
For example:
34
43 42
67 89 68
05 51 32 78
72 25 32 49 40
If I always go through the highest adjacent number, I get:
34 + 43 + 89 + 51 + 32 = 249
But if I go:
34 + 42 + 68 + 78 + 49 = 271
In the second case the result is higher, but I made that route by hand and I can't think of an algorithm that gets the highest result in all cases.
Can anyone give me a hand?
(Please tell me if I did not express myself well)
Start with the bottom row. Going from left to right, consider each pair of adjacent numbers. Now go up one row and add the number sitting above the pair to each of the two numbers below it. Keep the larger sum.
Basically you are looking at the triangles formed by the bottom row and the row above. So for your original triangle,
34
43 42
67 89 68
05 51 32 78
72 25 32 49 40
the bottom left triangle looks like,
05
72 25
So you would add 72 + 05 = 77, as that is the largest sum between 72 + 05 and 25 + 05.
Similarly,
51
25 32
will give you 51 + 32 = 83.
If you continue this approach for each two adjacent numbers and the number above, you can discard the bottom row and replace the row above with the computed sums.
So in this case, the second to last row becomes
77 83 81 127
and your new pyramid is
34
43 42
67 89 68
77 83 81 127
Keep doing this and your pyramid starts shrinking until you have one number which is the number you are after.
34
43 42
150 172 195
34
215 237
Finally, you are left with one number, 271.
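The bottom-up reduction walked through above can be sketched in Python as follows. This is a minimal illustration, assuming the pyramid is given as a list of rows; the name `max_pyramid_path` is made up for the example. It keeps the table of partial sums so that the route itself can be read off afterwards.

```python
def max_pyramid_path(rows):
    """Return (best_sum, path) for a pyramid given as a list of rows."""
    sums = [list(r) for r in rows]
    # Fold the pyramid upwards: each cell absorbs the larger of the two
    # partial sums directly beneath it.
    for i in range(len(rows) - 2, -1, -1):
        for j in range(len(rows[i])):
            sums[i][j] += max(sums[i + 1][j], sums[i + 1][j + 1])
    # Walk back down, always stepping towards the larger partial sum.
    path, j = [], 0
    for i in range(len(rows)):
        path.append(rows[i][j])
        if i + 1 < len(rows) and sums[i + 1][j + 1] > sums[i + 1][j]:
            j += 1
    return sums[0][0], path


pyramid = [
    [34],
    [43, 42],
    [67, 89, 68],
    [5, 51, 32, 78],
    [72, 25, 32, 49, 40],
]
print(max_pyramid_path(pyramid))  # (271, [34, 42, 68, 78, 49])
```

Each cell is visited once, so the running time is proportional to the number of entries in the pyramid.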
Starting at the bottom (row by row), add to each element the larger of the two values directly beneath it.
So, for your tree, 05 for example, will get replaced by max(72, 25) + 05 = 77. Later you'll add the maximum of that value and the new value for the 51 element to 67.
The top-most node will be the maximum sum.
Not to spoil all your fun, I'll leave the implementation to you, or the details of getting the actual path, if required.

How to traverse a grid using two different paths to maximize the sum of the path?

I have an N by N grid, with values in each box. I have to move from the top-left corner to the bottom-right corner (path 1) and from the top-right corner to the bottom-left (path 2). When I move from top-left to bottom-right I can only move down or to the right. Likewise, when I move from the top-right to the bottom-left I can only move left and down.
But if I move down while taking path 1, the corresponding move for path 2 should be to the left. Similarly, if I move right while taking path 1, the corresponding move for path 2 should be down.
As we take both paths, we sum up the values we encounter in each box. What is the maximum value we can obtain?
Consider as an example the following grid:
6 0 3 -1
7 4 2 4
-3 3 -2 8
13 10 -1 -4
The best paths we can take are represented as follows: path one is represented by a *, while path 2 by a ~.
(6*) (0) (3~) (-1~)
(7*) (4*~) (2~) (4)
(-3) (3*~) (-2*) (8*)
(13~) (10~) (-1) (-4*)
The sum for both of these paths is 56.
We have to devise an algorithm to compute the maximum possible score given an arbitrary N by N grid.
It was pretty clear that this was a DP problem. So, I tried to identify a recurrence relation, so to speak. I tried using the recurrence relation from the classic problem of finding the maximum sum over all paths in an N by M grid, but that didn't work because it just got too complicated.
I tried to divide the N by N grid into four (N-1) by (N-1) grids that overlap, so demonstrating this in a 3 by 3 grid:
a1 a2 a3
a4 a5 a6
a7 a8 a9
I divided this into four 2 x 2 grids:
a1 a2 , a2 a3 , a4 a5 , a5 a6
a4 a5 , a5 a6 , a7 a8 , a8 a9
Assuming we know the best paths for all these grids, can we then compute the best path for the larger grid?
Well, this seemed promising, but I quickly found out that these recurrence relations were dependent on the larger case. For example:
If we consider the second 2 x 2 grid, assume we know its best path 1 and path 2, with sum S. Now, clearly, for us to get from a1 to a2 we need to move right, but this means that the first move in the sub-case (the first move in path 2) should be down, which isn't guaranteed.
How do we solve this?
The rules for moving the two points are equivalent to finding a single path through a grid which is a sum of your grid and itself rotated -90 degrees (90 degrees to the left / counterclockwise / anti-clockwise, depending on your locale).
"Down" for the top-left point corresponds to "Left" for the top-right point, which when rotated -90 degrees is down. "Right" for the top-left point corresponds to "Down" for the top-right point, which when rotated -90 degrees is right. (Got it?)
So your example grid is
 6-1   0+4   3+8  -1-4
 7+3   4+2   2-2   4-1
-3+0   3+4  -2+3   8+10
13+6  10+7  -1-3  -4+13
=
 5   4  11  -5
10   6   0   3
-3   7   1  18
19  17  -4   9
You can now find a path from top-left to bottom-right by any of the usual means. In fact, you don't need the path, just the maximal sum, which is easier: collapse the matrix from top-left down by addition. The initial condition is the above grid. The next step is to add the top left value to its valid neighbors:
     9  11  -5
15   6   0   3
-3   7   1  18
19  17  -4   9
Then pick the larger of the two neighbors for any grid point with two valid neighbors (here 6 takes the larger of 9 and 15, giving 21):
        20  -5
    21   0   3
12   7   1  18
19  17  -4   9
And so on...
            15
        21   3
    28   1  18
31  17  -4   9
->
        24
    29  18
48  -4   9
->
    47
44   9
->
56
I have just solved your 4x4 case by hand, and the answer is 56.
You can reduce this to the case for just one path on one grid. Make a second copy of the grid, rotate the second copy, then matrix-add them together and solve on the new grid with just one path. The dynamic programming problem is easy from there (i.e. compute partial maximums for each node).
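As a rough Python sketch of this reduction (the function name `max_two_path_sum` is an assumption, and the grid is taken to be a square list of lists): rotate a copy of the grid counterclockwise, add it to the original, and run the usual right/down maximum-path dynamic program on the result. As in the worked example, a box visited by both paths is counted twice.

```python
def max_two_path_sum(grid):
    """Maximum combined score of the two coupled paths on an N x N grid."""
    n = len(grid)
    # Counterclockwise rotation: the top-right corner moves to the top-left.
    rotated = [[grid[j][n - 1 - i] for j in range(n)] for i in range(n)]
    combined = [[grid[i][j] + rotated[i][j] for j in range(n)]
                for i in range(n)]

    # Single-path DP: best[i][j] is the best sum of a right/down path
    # from the top-left corner to cell (i, j) of the combined grid.
    best = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            previous = []
            if i > 0:
                previous.append(best[i - 1][j])
            if j > 0:
                previous.append(best[i][j - 1])
            best[i][j] = combined[i][j] + (max(previous) if previous else 0)
    return best[n - 1][n - 1]


grid = [
    [6, 0, 3, -1],
    [7, 4, 2, 4],
    [-3, 3, -2, 8],
    [13, 10, -1, -4],
]
print(max_two_path_sum(grid))  # 56
```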

Need to find lowest differences between first line of an array and the rest ones

Well, I've been given a number of pairs of elements (s,h), where each pair places an element h on the s-th row of a 2D array. The rows need not have the same number of elements; it is only known that there cannot be more than N elements in a row.
What I want to do is find the lowest biggest difference(!) between a certain element of the first line and the other lines.
Thus, if I have 3 lines with (101,92) (100,25,95,52,101) (93,108,0,65,200), what I want to find is 3, because I have to choose 92, and I have 95-92=3 from the first line to the second and 93-92=1 from the first to the third.
I have reached a point where it is certain that if I have s lines with n(i) elements each, i=0..s, then n0<=n1<=...<=ns, so as to have good average-case performance when picking the best fit from the first line towards the others.
However, I cannot think of a way better than O(n^2), or maybe even O(n^3) in some cases. Does anyone have a suggestion for a faster way to do this?
Combine all lines into a single list, also keeping track of which line each element comes from.
Sort this list.
Have a last-value variable for each line.
For each item in the sorted list, update the last-value variable of the line it came from. If not all lines have a last-value set yet, do nothing. Otherwise:
    If it's an element from the first line:
        Recalculate the biggest difference for all of the last-value variables. Store this difference.
    If it's an element from any other line:
        If this is the first time that all last-values are set, calculate the biggest difference. Otherwise, if the difference between the first line's last-value and this element is bigger than the biggest difference, update the biggest difference with this difference. Store this difference.
The smallest stored difference is the desired value.
Example:
Lists: (101,92) (100,25,95,52,101) (93,108,0,65,200)
Sorted    0  25  52  65  92  93  95 100 101 101 108 200
Source    2   1   1   2   0   2   1   1   0   1   2   2
Last[0]   -   -   -   -  92  92  92  92 101 101 101 101
Last[1]   -  25  52  52  52  52  95 100 100 101 101 101
Last[2]   0   0   0  65  65  93  93  93  93  93 108 200
Diff      -   -   -   -  40  41   3   8   8   8   7  99
Best      -   -   -   -  40  40   3   3   3   3   3   3
Best = 3 as required. Storing the actual items or finding them afterwards should be easy enough.
Complexity:
Let n be the total number of items and k be the number of lists.
O(n log n) for the combine + sort.
O(nk) (worst case) for the scan through, since we're checking n items and, at each item, we do maximum O(k) work.
So O(n log n + nk).
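To cross-check the expected answer on small inputs, a simpler alternative (not the sweep described above) is to sort every other line and, for each element of the first line, binary-search its nearest neighbour in each of them. The Python sketch below is only an illustration under that assumption; the name `min_max_difference` is made up, and every line is assumed to be non-empty.

```python
from bisect import bisect_left


def min_max_difference(lines):
    """lines[0] is the first line; returns the lowest biggest difference."""
    first = lines[0]
    others = [sorted(line) for line in lines[1:]]

    def nearest_gap(sorted_line, x):
        # distance from x to the closest element of sorted_line
        i = bisect_left(sorted_line, x)
        gaps = []
        if i < len(sorted_line):
            gaps.append(sorted_line[i] - x)
        if i > 0:
            gaps.append(x - sorted_line[i - 1])
        return min(gaps)

    # For each candidate x, the worst (largest) gap over the other lines;
    # the answer is the candidate whose worst gap is smallest.
    return min(max(nearest_gap(line, x) for line in others) for x in first)


print(min_max_difference([[101, 92],
                          [100, 25, 95, 52, 101],
                          [93, 108, 0, 65, 200]]))  # 3
```

With n items in total, k lines and m elements in the first line, this costs O(n log n) for the sorting plus O(m k log n) for the lookups.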

Print Maximum List

We are given a set F={a1,a2,a3,…,aN} of N fruits. Each fruit has a price Pi and a vitamin content Vi. Now we have to arrange these fruits in such a way that the list contains prices in ascending order and vitamins in descending order.
For example:
N=4
Pi: 2 5 7 10
Vi: 8 11 9 2
This is the exact question https://cs.stackexchange.com/questions/1287/find-subsequence-of-maximal-length-simultaneously-satisfying-two-ordering-constr/1289#1289
I'd try to reduce the problem to the longest increasing subsequence problem.
(1) Sort the list according to the first criterion [vitamins].
(2) Then find the longest increasing subsequence in the sorted list, according to the second criterion [price].
This solution is O(nlogn), since both step (1) and (2) can be done in O(nlogn) each.
Have a look at the Wikipedia article on the longest increasing subsequence, under Efficient Algorithms, to see how you can implement it.
EDIT:
If your list allows duplicates, your sort [step (1)] will have to sort by the second parameter as secondary criteria, in case of equality of the primary criteria.
Example [your example 2]:
Pi::99 12 34 10 87 87 90 43 13 78
Vi::10 23 4 5 11 10 18 90 100 65
After step 1 you get [sorting with Vi as the primary criterion, descending]:
Pi::  13  43  78  12  90  87  87  99  10  34
Vi:: 100  90  65  23  18  11  10  10   5   4
Step 2 finds the longest increasing subsequence in Pi, and you get:
(13,100), (43,90), (78,65), (87,11), (99,10)
as a feasible solution, since it is an increasing subsequence [according to Pi] in the sorted list.
P.S. In here I am assuming the increasing subsequence you want is strictly increasing, otherwise the result is (13,100),(43,90),(78,65),(87,11),(87,10),(99,10) - which is longer subsequence, but it is not strictly increasing/decreasing according to Pi and Vi
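The two steps can be sketched in Python as below. This is an illustration under stated assumptions rather than a definitive implementation: the name `longest_fruit_list` and the `(price, vitamin)` tuple format are made up, and both orderings are taken to be strict. One deliberate difference from the tie-breaking described above: fruits with equal vitamin content are ordered by descending price, so that a strictly increasing run of prices can never contain two of them.

```python
from bisect import bisect_left


def longest_fruit_list(fruits):
    """fruits: list of (price, vitamin) pairs.

    Returns the longest sublist with strictly increasing price and
    strictly decreasing vitamin content.
    """
    # Step 1: sort by vitamin descending (ties: price descending).
    order = sorted(fruits, key=lambda f: (-f[1], -f[0]))

    # Step 2: longest strictly increasing subsequence of prices, O(n log n).
    tails = []        # tails[k]: index of the last fruit of the best run of length k+1
    tail_prices = []  # prices of those fruits, kept sorted for bisect
    prev = [None] * len(order)
    for i, (price, _) in enumerate(order):
        k = bisect_left(tail_prices, price)
        prev[i] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(i)
            tail_prices.append(price)
        else:
            tails[k] = i
            tail_prices[k] = price

    # Follow the back-pointers from the end of the longest run.
    result = []
    i = tails[-1] if tails else None
    while i is not None:
        result.append(order[i])
        i = prev[i]
    return result[::-1]


print(longest_fruit_list([(2, 8), (5, 11), (7, 9), (10, 2)]))
# [(5, 11), (7, 9), (10, 2)]
```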
