First of all, sorry for the title. Someone please propose a better one, I really didn't know how to express my question properly.
Basically, I'm just looking for the name of a data structure where the elements look like this (ignore the dots):
......5
....3...2
..4...1...6
9...2...3...1
I first thought it could be a certain kind of "tree", but, as wikipedia says:
A tree is [...] an acyclic connected graph where each node has zero or more children nodes and at most one parent node
Since there can be more than one parent by node in the data structure I'm looking for, it's probably not a tree.
So, here's my question:
What is the name of a data structure that can represent data with the following links between the elements? (/ and \ being the links, again, ignore the dots):
......5
...../..\
....3...2
.../..\./..\
..4...1...6
../.\./..\./..\
9...2...3...1
I think it isn't totally wrong to call it a Tree, although "Digraph" (directed graph) would be a more proper term.
First of all, sorry for the title.
Someone please propose a better one, I
really didn't know how to express my
question properly.
The title is fine, I LOL'd hard when I opened the question. I am going to start calling them "Bowling Pins" now :)
5
3 2
4 1 6
9 2 3 1
The most popular thing I reckon, which was laid out like this, is Pascal's triangle. It's the structure used to calculate binomial coefficients; each node is the sum of its parents:
http://info.ee.surrey.ac.uk/Personal/L.Wood/publications/MSc-thesis/fig36.gif.
Usually, when it comes to implementing such algorithms (this class of algorithms is commonly referred to as "dynamic programming"), the "structure" is represented as a simple two-dimensional array. See here, for example:
n\k 0 1 2 3 4
------------------
0 1 0 0 0 0
1 1 1 0 0 0
2 1 2 1 0 0
3 1 3 3 1 0
4 1 4 6 4 1
5 1 5 10 10 5
6 1 6 15 20 15
I don't think there's a formal name for such a structure; in dynamic programming such stuff is just... arrays.
But from now on, as NullUserException suggests I'm totally calling it "bowling pins" :-)
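To make the "it's just an array" point concrete, here is a minimal sketch that fills the table above with the rule "each node is the sum of its parents". The function name `binomial_table` and the table dimensions are my own choices for illustration, not anything standard.

```python
def binomial_table(rows, cols):
    """Fill table[n][k] = C(n, k) by summing the two "parents" in the row above."""
    table = [[0] * cols for _ in range(rows)]
    for n in range(rows):
        table[n][0] = 1  # C(n, 0) is always 1
        for k in range(1, cols):
            if n > 0:
                # parents are the entries above-left and directly above
                table[n][k] = table[n - 1][k - 1] + table[n - 1][k]
    return table


table = binomial_table(7, 5)
for row in table:
    print(row)
```

Each row only depends on the row above it, which is why a plain 2D array (or even two 1D rows) is enough here.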
A "dag", or directed acyclic graph, is a graph with directed edges in which there may be multiple paths to a node, and some nodes may have both incoming and outgoing edges, but there is no way to leave any node and return to it (there are no cycles). Note that in a finite DAG at least one node must have nothing but outgoing edges and at least one must have nothing but incoming edges. Were that not the case, it would be possible to move continuously through the graph without ever reaching a dead end (since every node would have an exit), and without visiting any node twice (since the graph is acyclic). If there are only a finite number of nodes, that's obviously impossible.
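As a rough sketch, the structure from the question could be stored as a DAG with two dictionaries. I'm assuming here that nodes are identified by their (row, column) position and that every node links to the two nodes below it; the name `build_triangle_dag` is made up for this example.

```python
from collections import defaultdict

rows = [[5], [3, 2], [4, 1, 6], [9, 2, 3, 1]]


def build_triangle_dag(rows):
    """Link every node (r, c) to the two nodes below it: (r+1, c) and (r+1, c+1)."""
    children = defaultdict(list)
    parents = defaultdict(list)
    for r in range(len(rows) - 1):
        for c in range(len(rows[r])):
            for child in ((r + 1, c), (r + 1, c + 1)):
                children[(r, c)].append(child)
                parents[child].append((r, c))
    return children, parents


children, parents = build_triangle_dag(rows)
all_nodes = [(r, c) for r in range(len(rows)) for c in range(len(rows[r]))]
# nodes with only outgoing edges (sources) and only incoming edges (sinks)
sources = [n for n in all_nodes if not parents.get(n)]
sinks = [n for n in all_nodes if not children.get(n)]
print(parents[(2, 1)])  # the "1" in the third row has two parents
print(sources, sinks)
```

The top node is the only one without incoming edges and the bottom row has no outgoing edges, which matches the argument above that a finite DAG must have both kinds of node.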
Since there can be more than one parent by node in the data structure I'm looking for, it's probably not a tree.
What you're looking for is probably a graph. A tree is a special case of a graph where each node has exactly one parent (except the root, which has none).
We are implementing path representation to solve our travelling salesman problem using a genetic algorithm. However, we were wondering how to solve the issue that there might be identical tours in our individuals, but which are recognised by the path representation as different individuals. An example:
Each individual consists of an array, in which the elements are the cities visited in order.
Individual 1:
[1 2 3 4 5]
Individual 2:
[4 5 1 2 3]
You can see that the tour in 1 and 2 are actually identical, only the "starting" location is different.
We see some solutions to this problem, but we were wondering which one would be the best, or if there are best practices to overcome this problem from literature/experiments/....
Solution 1
Sort each individual to see if the individuals are identical:
1. pick an individual
2. shift the elements by 1 element to the right (at the end of the array, elements are placed at the beginning of the array)
3. check if this shift now matches an individual
4. if not, repeat steps 2 to 3
Solution 2
1. At the start of the simulation, choose a fixed starting point among the cities.
2. If the fixed starting point would change (mutation, recombination,...) then
3. Shift the array so that chosen starting point is back on first index.
Solution 3
1. Use the adjacency representation to check which individuals are identical.
2. Pass this information on to the path representation.
3. This is used to "correct" the individuals.
Solution 1 and 2 seem time consuming, although 2 would probably need much less computing time. Solution 3 would need to constantly switch from one to the other representation.
Then there is also the issue that in our example the tour can be read in 2 ways:
[1 2 3 4 5]
is the same as
[5 4 3 2 1]
Again, are there any best practices to solve this?
Since you need to visit every city and return to the origin city, you can simply fix the origin. That solves your problem of shifted equivalent tours outright.
For the other, less important issue of mirrored tours, you can start by sorting your individuals by cost (which you probably already do), and check any pair of tours with equal costs using a simple palindrome-checking algorithm.
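Here is a small sketch of how both issues could be normalised away in one step. Note that this is a slightly different technique from the palindrome check: instead of comparing pairs, it rewrites every tour into one canonical form (fixed first city, fixed direction) so that identical tours become equal arrays. The function name `canonical_tour` and the "smaller neighbour comes second" rule are my own assumptions.

```python
def canonical_tour(tour, start=None):
    """Rotate the tour so it begins at `start`, then pick a fixed direction."""
    if start is None:
        start = min(tour)  # assumption: use the smallest city as the fixed origin
    i = tour.index(start)
    rotated = tour[i:] + tour[:i]
    # A tour and its mirror image visit the same edges; keep the direction
    # in which the second city is smaller than the last one.
    if len(rotated) > 2 and rotated[1] > rotated[-1]:
        rotated = [rotated[0]] + rotated[:0:-1]
    return rotated


print(canonical_tour([4, 5, 1, 2, 3]))
print(canonical_tour([5, 4, 3, 2, 1]))
```

With this, duplicates can be detected by putting `tuple(canonical_tour(t))` into a set, which avoids comparing every pair of individuals.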
I had thought I understood BSTs.
That was until my Professor came along.
Let's say I have a BST:
  2
 / \
1   3
Now if I were to insert 4, my tree would look like this:
  2
 / \
1   3
     \
      4
but my Professor's tree would end up like this:
  2
 / \
1   4
   / \
  3   4
Basically, he finds where the new node should be placed and places it there. He then changes the value of the new node's parent to the new node's value and makes the left child of the parent what the original parent node used to be.
I have looked around online but can't find anyone doing this.
What kind of insertion technique is this? Am I missing something?
I don't think it would make a difference but this was specifically for AVL trees.
I think he is trying to keep the tree strictly binary, i.e., each node has either 0 or exactly 2 children.
As I understand it, in the example the first and second 4s are added in their places. A left rotation (required for balancing the AVL tree) then produces the final shape.
Your insertion of key "4" into a BST is completely right, I can assure you. This is the orthodox and the simplest way to insert elements into a BST.
I think that you've misunderstood your professor, because your third illustration is incorrect at least because of the fact that in BST no duplicates are allowed, and you have element "4" occurring twice.
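For reference, the orthodox insertion can be sketched like this. It's a plain, unbalanced BST with no AVL rotations, and I'm assuming duplicates are simply ignored; the `Node` class and `insert` function are just illustrative names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Walk down from the root and attach the new key as a leaf."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # equal keys are ignored (assumption: no duplicates allowed)
    return root


root = None
for key in [2, 1, 3, 4]:
    root = insert(root, key)
print(root.right.right.key)  # 4 ends up as the right child of 3
```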
For example, consider a non-wraparound 4x4 matrix;
1 2 5 1
5 2 5 2
9 3 1 7
2 9 0 3
Say I wanted to find the neighbours of the 5 in the first row, which are 2, 5 and 1. Is there a more efficient solution than doing two for loops and adding a bunch of if conditions?
Yes. If you really need to find the neighbors, then you have an option to use graphs.
Graphs are basically vertex objects together with their adjacent vertices, each pair forming an edge. We can see here that 2 forms an edge with 5, 1 forms an edge with 5, etc.
If you're going to need to know the neighbors VERY frequently (this is inefficient if you're not), then implement your own vertex class, wrapping the value (5) in a generic T val variable. Keep a hashtable of adjacent vertices and their respective distances (1 in this case; if you need to find the neighbors of 2, then you're going to need to assign those as well) by calling add(vertex, distance) on the hashtable.
Later on, simply iterate through the hashtable for the neighbors.
However, for an array this simple, there isn't much overhead in just doing a for loop and using "a bunch of if statements". In reality you only need one boundary check per direction (there are 4 of them).
Hopefully this helps.
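A minimal sketch of the "one boundary check per direction" version, assuming only the four orthogonal neighbours count (as in the question's example). The function name `neighbours` and the order of the offsets are arbitrary choices of mine.

```python
def neighbours(grid, row, col):
    """Return the values directly above, below, left and right of (row, col)."""
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))  # up, down, left, right
    result = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        # a single boundary check covers all four directions
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            result.append(grid[r][c])
    return result


grid = [
    [1, 2, 5, 1],
    [5, 2, 5, 2],
    [9, 3, 1, 7],
    [2, 9, 0, 3],
]
print(neighbours(grid, 0, 2))  # neighbours of the 5 in the first row
```

Adding the four diagonal offsets to the tuple would give the 8-neighbourhood without touching the loop.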
I'm working through a past exam paper for my advanced programming course and I've gotten stuck at this question
What property must the values in a binary search tree satisfy? How many different binary search trees are there containing the three values 1 2 3? Explain your answer.
I can answer the first part easily enough, but the second bit, about the number of possible trees, has me stumped. My first instinct is to say that there is only a single tree possible, with 2 as the root, because the definition says so. But this question is worth 8 marks out of a total of 100 for the entire paper, so I can only assume that it's a trick question and there's a more subtle explanation, yet there's nothing in the lecture notes that explains this. Does anyone know how to answer this question?
The question doesn't say that the tree is balanced, so think about whether 1 or 3 can be at the root node.
Try to think about all possible binary trees with these three nodes. How many of those trees fulfill the property of binary search tree?
I think that a trick is that a tree can be a degenerate one (effectively, a linked list of elements):
1
 \
  2
   \
    3
And variations thereof.
Also, are these trees considered to be identical ?
  2        2
 / \      / \
3   1    1   3
If I remember correctly, the root of the tree does not have to be the "middle element". Thus there are a few more combinations of trees:
  2
1   3
or
1
  2
    3
or
1
    3
  2
or
    3
  2
1
or
    3
1
  2
Maybe I forgot a few, but I think you get the idea. A note on my notation: a new line means going one level down in the tree, and a value's position to the left or right of the line above shows whether it is the left or right child of its parent ;)
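If you want to double-check the count, a short recursive enumeration does it: pick each value as the root, build every possible left subtree from the smaller values and every possible right subtree from the larger ones. This is only a sketch; the nested-tuple representation `(left, key, right)` and the name `all_bsts` are my own.

```python
def all_bsts(keys):
    """Return every BST over the sorted list `keys` as nested (left, key, right) tuples."""
    if not keys:
        return [None]  # exactly one empty tree
    trees = []
    for i, key in enumerate(keys):
        for left in all_bsts(keys[:i]):  # smaller values go left
            for right in all_bsts(keys[i + 1:]):  # larger values go right
                trees.append((left, key, right))
    return trees


trees = all_bsts([1, 2, 3])
print(len(trees))
for tree in trees:
    print(tree)
```

For the three values 1 2 3 this yields five trees: two with 1 at the root, one with 2 at the root, and two with 3 at the root.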
I'm looking for an algorithm that, given a set of items containing a start time, end time, type, and id, it will return a set of all sets of items that fit together (no overlapping times and all types are represented in the set).
S = [("8:00AM", "9:00AM", "Breakfast With Mindy", 234),
("11:40AM", "12:40PM", "Go to Gym", 219),
("12:00PM", "1:00PM", "Lunch With Steve", 079),
("12:40PM", "1:20PM", "Lunch With Steve", 189)]
Algorithm(S) => [[("8:00AM", "9:00AM", "Breakfast With Mindy", 234),
("11:40AM", "12:40PM", "Go to Gym", 219),
("12:40PM", "1:20PM", "Lunch With Steve", 189)]]
Thanks!
This can be solved using graph theory. I would create an array that contains the items sorted by start time, and by end time for equal start times (I added some more items to the example):
no.: id: [ start - end ] type
---------------------------------------------------------
0: 234: [08:00AM - 09:00AM] Breakfast With Mindy
1: 400: [09:00AM - 07:00PM] Check out stackoverflow.com
2: 219: [11:40AM - 12:40PM] Go to Gym
3: 79: [12:00PM - 01:00PM] Lunch With Steve
4: 189: [12:40PM - 01:20PM] Lunch With Steve
5: 270: [01:00PM - 05:00PM] Go to Tennis
6: 300: [06:40PM - 07:20PM] Dinner With Family
7: 250: [07:20PM - 08:00PM] Check out stackoverflow.com
After that I would create a list that holds, for each item, the array no. of the first item that could possibly come next. If there isn't a next item, -1 is stored:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
1 | 7 | 4 | 5 | 6 | 6 | 7 | -1
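A sketch of how that list could be computed, assuming times are stored as minutes since midnight and that an item may start exactly when the previous one ends. The name `next_compatible` is made up; because the array is sorted by start time, a binary search is enough.

```python
from bisect import bisect_left

# (start, end) in minutes since midnight, sorted by start time
items = [
    (480, 540),    # 0: 08:00AM - 09:00AM
    (540, 1140),   # 1: 09:00AM - 07:00PM
    (700, 760),    # 2: 11:40AM - 12:40PM
    (720, 780),    # 3: 12:00PM - 01:00PM
    (760, 800),    # 4: 12:40PM - 01:20PM
    (780, 1020),   # 5: 01:00PM - 05:00PM
    (1120, 1160),  # 6: 06:40PM - 07:20PM
    (1160, 1200),  # 7: 07:20PM - 08:00PM
]


def next_compatible(items):
    """For each item, find the first later item that starts at or after its end."""
    starts = [start for start, _ in items]
    result = []
    for i, (_, end) in enumerate(items):
        j = bisect_left(starts, end, i + 1)  # only look at later items
        result.append(j if j < len(items) else -1)
    return result


print(next_compatible(items))
```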
With that list it is possible to generate a directed acyclic graph. Every vertex has a connection to the vertices starting from its next item, but no edge is made to a vertex if another vertex already fits between the two. I'll try to explain with the example. For vertex 0 the next item is 1, so an edge 0 -> 1 is made. The next item from 1 is 7, which means the range for the vertices that are connected from vertex 0 is now from 1 to (7-1). Because vertex 2 is in the range of 1 to 6, another edge 0 -> 2 is made and the range updates to 1 to (4-1) (because 4 is the next item of 2). Because vertex 3 is in the range of 1 to 3, one more edge 0 -> 3 is made. That was the last edge for vertex 0. This has to be continued with all vertices, leading to such a graph:
So far we are in O(n^2). After that, all paths can be found using a depth-first-search-like algorithm, and the duplicated types can then be eliminated from each path.
For that example there are 4 solutions, but none of them has all types because it is not possible for the example to do Go to Gym, Lunch With Steve and Go to Tennis.
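The edge rule and the path search can be sketched as follows. This is my reading of the description above, not a tested reference implementation: `nxt` is the "next item" list from the example, and `build_edges` and `all_paths` are hypothetical names.

```python
nxt = [1, 7, 4, 5, 6, 6, 7, -1]  # "next item" list from the example


def build_edges(nxt):
    """Connect each vertex to every direct successor (no other vertex fits in between)."""
    n = len(nxt)
    succ = [n if j == -1 else j for j in nxt]  # treat -1 as "past the end"
    edges = []
    for i in range(n):
        out = []
        k = succ[i]
        limit = succ[k] if k < n else n
        while k < limit:
            out.append(k)
            limit = min(limit, succ[k])  # shrink the range as described above
            k += 1
        edges.append(out)
    return edges


def all_paths(edges):
    """Depth-first search from every vertex without incoming edges to a dead end."""
    has_incoming = {k for out in edges for k in out}
    paths = []

    def dfs(node, path):
        path.append(node)
        if not edges[node]:
            paths.append(list(path))
        for k in edges[node]:
            dfs(k, path)
        path.pop()

    for start in range(len(edges)):
        if start not in has_incoming:
            dfs(start, [])
    return paths


edges = build_edges(nxt)
for path in all_paths(edges):
    print(path)
```

On the example this gives the edges 0 -> 1, 0 -> 2 and 0 -> 3 for vertex 0 and four complete paths, which agrees with the walkthrough and the number of solutions stated above.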
Also, this search for all paths has a worst-case complexity of O(2^n). For example, the following graph has 2^(n/2) possible paths from a start vertex to an end vertex.
(source: archive.org)
Some more optimisations could be made, like merging some vertices before searching for all paths. But that is not always possible. In the first example vertices 3 and 4 can't be merged even though they are of the same type. But in the last example vertices 4 and 5 can be merged if they are of the same type, which means it doesn't matter which activity you choose, both are valid. This can speed up the calculation of all paths dramatically.
Maybe there is also a clever way to consider duplicate types earlier to eliminate them, but the worst case is still O(2^n) if you want all possible paths.
EDIT1:
It is possible to determine whether there are sets that contain all types, and to get at least one such solution, in polynomial time. I found an algorithm with a worst-case time of O(n^4) and O(n^2) space. I'll take a new example which has a solution with all types, but is more complex.
no.: id: [ start - end ] type
---------------------------------------------------------
0: 234: [08:00AM - 09:00AM] A
1: 400: [10:00AM - 11:00AM] B
2: 219: [10:20AM - 11:20AM] C
3: 79: [10:40AM - 11:40AM] D
4: 189: [11:30AM - 12:30PM] D
5: 270: [12:00PM - 06:00PM] B
6: 300: [02:00PM - 03:00PM] E
7: 250: [02:20PM - 03:20PM] B
8: 325: [02:40PM - 03:40PM] F
9: 150: [03:30PM - 04:30PM] F
10: 175: [05:40PM - 06:40PM] E
11: 275: [07:00PM - 08:00PM] G
1.) Count the different types in the item set. This is possible in O(nlogn). It is 7 for that example.
2.) Create an n*n matrix that represents which nodes can reach the current node and which can be reached from the current node. For example, if position (2,4) is set to 1, that means there is a path from node 2 to node 4 in the graph, and (4,2) is set to 1 too, because node 4 can be reached from node 2. This is possible in O(n^2). For the example the matrix would look like this:
111111111111
110011111111
101011111111
100101111111
111010111111
111101000001
111110100111
111110010111
111110001011
111110110111
111110111111
111111111111
3.) Now every row tells us which nodes can be reached. In each row we can also mark every node that is not yet marked if it is of the same type as a node that can be reached. We change those matrix positions from 0 to 2. This is possible in O(n^3). In the example there is no way from node 1 to node 3, but node 4 has the same type D as node 3 and there is a path from node 1 to node 4. So we get this matrix:
111111111111
110211111111
121211111111
120121111111
111212111111
111121020001
111112122111
111112212111
111112221211
111112112111
111112111111
111111111111
4.) The nodes that still contain 0's (in the corresponding rows) can't be part of the solution, so we can remove them from the graph. If there was at least one node to remove, we start again at step 2.) with the smaller graph. Because we removed at least one node, we have to go back to step 2.) at most n times, but most often this will only happen a few times. If there are no 0's left in the matrix we can continue with step 5.). This is possible in O(n^2). For the example it is not possible to build a path with node 1 that also contains a node with type C. Therefore its row contains a 0 and it is removed, like node 3 and node 5. In the next loop with the smaller graph, node 6 and node 8 will be removed.
5.) Count the different types in the remaining set of items/nodes. If it is smaller than the first count, there is no solution that can represent all types, so we have to find another way to get a good solution. If it is the same as the first count, we now have a smaller graph which still holds all the possible solutions. O(nlogn)
6.) To get one solution we pick a start node (it doesn't matter which, because all nodes that are left in the graph are part of a solution). O(1)
7.) We remove every node that can't be reached from the chosen node. O(n)
8.) We create a matrix like in steps 2.) and 3.) for that graph and remove the nodes that cannot reach nodes of every type, like in step 4.). O(n^3)
9.) We choose one of the next nodes from the node we chose before and continue with 7.) until we are at an end node and the graph only has one path left.
That way it is also possible to get all paths, but there can still be exponentially many. After all, it should be faster than finding solutions in the original graph.
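Steps 2.) to 5.) can be sketched like this. I'm assuming times in minutes since midnight and using the fact that, for time intervals, "there is a path between two nodes" is the same as "the two items don't overlap". Instead of writing 2's into the matrix, the sketch collects the set of types each node can reach, which leads to the same removal decision; the names `compatible`, `matrix` and `prune` are hypothetical.

```python
# (start, end, type) in minutes since midnight, taken from the second example
items = [
    (480, 540, "A"), (600, 660, "B"), (620, 680, "C"), (640, 700, "D"),
    (690, 750, "D"), (720, 1080, "B"), (840, 900, "E"), (860, 920, "B"),
    (880, 940, "F"), (930, 990, "F"), (1060, 1120, "E"), (1140, 1200, "G"),
]


def compatible(items, i, j):
    """True if i and j are the same item or do not overlap in time."""
    return i == j or items[i][1] <= items[j][0] or items[j][1] <= items[i][0]


def matrix(items):
    """Step 2: the symmetric 0/1 reachability matrix, one string per row."""
    n = len(items)
    return ["".join("1" if compatible(items, i, j) else "0" for j in range(n))
            for i in range(n)]


def prune(items):
    """Steps 3 and 4: repeatedly drop nodes that cannot be combined with every remaining type."""
    alive = list(range(len(items)))
    while True:
        needed = {items[j][2] for j in alive}
        doomed = set()
        for i in alive:
            reachable = {items[j][2] for j in alive if compatible(items, i, j)}
            if reachable != needed:  # same as "the row still contains a 0"
                doomed.add(i)
        if not doomed:
            return alive
        alive = [i for i in alive if i not in doomed]


for row in matrix(items):
    print(row)
remaining = prune(items)
print(remaining)
# Step 5: compare the number of types before and after pruning
print(len({items[i][2] for i in remaining}) == len({t for _, _, t in items}))
```

On the example the first pass removes nodes 1, 3 and 5 and the second pass removes nodes 6 and 8, as described in step 4.).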
Hmmm, this reminds me of a task at university; I'll describe what I can remember.
The run-time is O(n*logn) which is pretty good.
This is a greedy approach.
I will refine your request a bit, tell me if I'm wrong.
The algorithm should return the MAX subset of non-colliding tasks (in terms of total length? or number of activities? I guess total length).
I would first order the list by finishing time (first = minimum finishing time, last = maximum) = O(nlogn)
Find_set(A):
    G <- empty set
    S <- A    // already sorted by finish time
    f <- 0
    while S is not empty do
        i <- index of the activity with the earliest finish time    // O(1), since S is sorted
        if S(i).start_time >= f
            G.insert(S(i))    // add this to the result set
            f <- S(i).finish_time
        S.removeAt(i)    // remove the activity from the original set
    od
    return G
Run time analysis:
initial ordering: O(nlogn)
each iteration is O(1), times n iterations = O(n)
Total: O(nlogn) + O(n) ~ O(nlogn) (O notation is weak at representing the real complexity for small inputs, but as the scale grows, this is a good algorithm)
Enjoy.
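In Python the greedy idea can be sketched as below, assuming times in minutes since midnight. As far as I know, this earliest-finish-first rule maximises the number of activities rather than their total length, and it ignores the "all types represented" requirement; `select_activities` is just an illustrative name.

```python
def select_activities(activities):
    """Greedy selection: always take the compatible activity that finishes first."""
    chosen = []
    last_finish = float("-inf")
    # sorting by finish time is the O(n log n) part
    for start, finish, name in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # does not collide with what we already picked
            chosen.append((start, finish, name))
            last_finish = finish
    return chosen


activities = [
    (480, 540, "Breakfast With Mindy"),
    (700, 760, "Go to Gym"),
    (720, 780, "Lunch With Steve"),
    (760, 800, "Lunch With Steve"),
]
for activity in select_activities(activities):
    print(activity)
```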
Update:
OK, it seems like I've misread the post. You can alternatively use dynamic programming to reduce the running time; there is a solution in link text page 7-19.
You need to tweak the algorithm a bit: first you should build the table, then you can get all variations from it fairly easily.
I would use an Interval Tree for this.
After you build the data structure, you can iterate each event and perform an intersection query. If no intersections are found, it is added to your schedule.
Yes, exhaustive search might be an option:
initialise the partial schedules with the earliest tasks that overlap (e.g. 9-9.30 and 9.15-9.45)
for each partial schedule generated so far, generate a list of new partial schedules by appending to it the earliest task that doesn't overlap (generate more than one in case of ties)
recur with the new partial schedules
In your case the initialisation would produce only (8-9 breakfast)
After the first iteration: (8-9 brekkie, 11.40-12.40 gym) (no ties)
After the second iteration: (8-9 brekkie, 11.40-12.40 gym, 12.40-1.20 lunch) (no ties again)
This is a tree search, but it's greedy. It leaves out possibilities like skipping the gym and going to an early lunch.
Since you're looking for every possible schedule, I think the best solution you will find will be a simple exhaustive search.
The only thing I can say algorithmically is that your data structure of lists of strings is pretty terrible.
The implementation is hugely language dependent so I don't even think pseudo-code would make sense, but I'll try to give the steps for the basic algorithm.
Pop off the first n items of the same type and put them in list.
For each item in list, add that item to schedule set.
Pop off next n items of same type off list.
For each item that starts after the first item ends, put on list. (If none, fail)
Continue until done.
Hardest part is deciding exactly how to construct the lists/recursion so it's most elegant.
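One possible shape for such an exhaustive search, as a sketch only: group the items by type, try every combination that takes exactly one item per type, and keep the combinations without overlaps. I'm assuming times in minutes since midnight and that "all types are represented" means one item of each type; the name `schedules` is made up.

```python
from collections import defaultdict
from itertools import product


def schedules(items):
    """Return every set with one item per type in which no two items overlap."""
    by_type = defaultdict(list)
    for item in items:
        by_type[item[2]].append(item)
    result = []
    for combo in product(*by_type.values()):  # one candidate per type
        ordered = sorted(combo)  # sort by start time
        # after sorting, it is enough to compare each item with the next one
        if all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:])):
            result.append(ordered)
    return result


# (start, end, type, id) with times in minutes since midnight
items = [
    (480, 540, "Breakfast With Mindy", 234),
    (700, 760, "Go to Gym", 219),
    (720, 780, "Lunch With Steve", 79),
    (760, 800, "Lunch With Steve", 189),
]
for schedule in schedules(items):
    print(schedule)
```

The number of combinations is the product of the group sizes, so this is only practical when each type has few candidates.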