Build Tree from List of "Contains" relationships


I have a list of areas (A, B, C, ...), each with a list of the towns (1, 2, 3, 4, ...) they contain.
Note that these are NOT direct parents: the same town shows up in every area that contains it.
A: 1 2 3 4 5 6 7 8 9
B: 2 3 4 5
C: 4 5
D: 2 3
E: 6 7
In my case, they always form a unique hierarchical relationship where an area can contain other areas, along with the towns that are not in any of its child areas.
A: B E 1 8 9
B: C D
C: 4 5
D: 2 3
E: 6 7
If we assume that the hierarchy is unique, can someone point me to an algorithm (in any general-purpose language or pseudo-code; I'm using C#) that derives the hierarchy?
I've developed something that I think works, but I'd prefer something that's more mathematically certain than "this seems to work".
I'm perfectly happy to have it break if there is no unique hierarchy.
Many thanks

My solution, which I think is correct, but is not efficient, is
(1) Determine the immediate parent area for each town:
For each town, find the area with the smallest number of towns that contains that town.
(2) Determine the immediate parent for each area:
For each area, find the area with the smallest number of towns that fully contains that area, not including itself of course (area B fully contains area A when all towns in A are also in B).
Again: this seems to work assuming that if an area X contains a town that area Y also contains, then either X contains all the towns that Y contains, or vice versa.
In my case, I test this (again massively inefficiently, but I've got small data sets of fewer than 1,000 towns) by modifying step 1. For each town, I find all areas that contain that town and sort them by the number of towns in the area. I then check that each area is completely contained by the next area in the list.
If anyone has a known correct answer (or a more efficient one), I'd be grateful for your contribution.
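The two steps and the nesting check described above can be sketched as follows. This is a minimal Python sketch rather than C#, and `build_hierarchy` and the dictionary layout are names made up for illustration. It relies on the stated assumption that any two areas are either disjoint or nested (such a collection of sets is usually called a laminar family). Under that assumption the areas containing a given town form a chain, so the smallest one is the direct parent. The sample data assumes E holds towns 6 and 7, so that the areas actually nest.

```python
def build_hierarchy(areas):
    """Return (area_parent, town_parent) for a dict {area name: towns}.

    Assumption: any two areas are disjoint or one strictly contains the
    other. A ValueError is raised if that does not hold.
    """
    sets = {name: frozenset(towns) for name, towns in areas.items()}
    by_size = sorted(sets, key=lambda name: len(sets[name]))

    # Validate the nesting assumption on every pair (quadratic, fine for small data).
    for i, small in enumerate(by_size):
        for big in by_size[i + 1:]:
            common = sets[small] & sets[big]
            overlaps_partially = bool(common) and common != sets[small]
            identical = sets[small] == sets[big]
            if overlaps_partially or identical:
                raise ValueError(f"no unique hierarchy: {small} vs {big}")

    # Step 2: the parent of an area is the smallest strictly larger area containing it.
    area_parent = {}
    for i, small in enumerate(by_size):
        area_parent[small] = next(
            (big for big in by_size[i + 1:] if sets[small] < sets[big]), None
        )

    # Step 1: the parent of a town is the smallest area containing it.
    town_parent = {}
    for name in by_size:
        for town in sets[name]:
            town_parent.setdefault(town, name)
    return area_parent, town_parent


example = {
    "A": range(1, 10),
    "B": [2, 3, 4, 5],
    "C": [4, 5],
    "D": [2, 3],
    "E": [6, 7],  # assumed so that E does not overlap B and C
}
area_parent, town_parent = build_hierarchy(example)
```

Sorting once by size means each "smallest containing area" lookup is just the first match further along the list, and the pairwise check makes the sketch fail loudly when the data is not a clean hierarchy.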

Related

How can I identify what class of graph algorithm to apply to this problem and possibly how to solve it?

I have this graph problem I would like to solve. I am not sure what algorithm to apply to it.
Maybe depth-first search, topological sort or something else. Even if I knew which one was
needed, I am not sure I would be able to apply it. Maybe someone can point me in the right
direction. If you want to solve it completely, you are also welcome, but please explain the
solution too, because understanding it matters to me, not just getting the code. Thank you in advance.
Problem statement:
In the city of N, under unclear circumstances, the territory of one of the factories turned into
an anomalous zone. All entrances to the territory were blocked, and it was named an industrial zone.
There are N buildings in the industrial zone, some of them connected by roads. On any road, you
can move in both directions.
A novice stalker was given the task to get to the warehouse in the industrial zone. He found
several maps of the industrial zone's territory in the electronic archive.
Since the maps were created by different people, each of them contains information
only about certain roads in the industrial zone. The same road can appear on several maps.
On the way, the stalker can upload one map from the archive to the mobile phone.
When loading a new map, the previous one is not saved in the phone's memory.
The stalker can only move along roads marked on the currently loaded map.
Each map download costs 1 dollar. To minimize costs, the stalker needs to choose a route
that allows them to load maps as few times as possible.
The stalker can download the same map several times, but has to pay
for each download. Initially, there is no map in the mobile phone's memory.
You need to write a program that calculates the minimum amount of expenses required
for a stalker to get from the entrance to the industrial zone to the warehouse.
Input data format:
The first line of the input file contains two natural numbers N and K (2 ≤ N ≤ 2000; 1 ≤ K ≤
2000) — the number of buildings in the industrial zone and the number of maps, respectively.
The entrance to the industrial zone is located in
building number 1, and the warehouse is located in building number N.
The following lines contain information about available maps.
The first line of the i-th map description contains the number r_i, the number of roads marked on the i-th map.
Then come r_i lines, each containing two natural numbers a and b (1 ≤ a, b ≤ N; a ≠ b),
indicating that there is a road connecting buildings a and b on the i-th map.
The total number of roads marked on all maps does not
exceed 300,000 (r_1 + r_2 + … + r_K ≤ 300,000).
Output data format:
In the output file, you need to output one number — the minimum amount of expenses of the stalker.
If you can't get to the warehouse, print the number -1.
Examples or test cases:
Example 1:
Input:
5 3
1
3 4
3
1 2
1 3
2 4
1
4 5
Output:
2
Example 2:
Input:
5 3
2
3 2
4 5
1
2 1
2
1 3
5 4
Output:
-1
Example 3:
Input:
12 4
4
1 6
2 4
7 9
10 12
3
1 4
7 11
3 6
3
2 5
4 11
8 9
5
3 10
10 7
7 2
12 3
5 12
Output:
3
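One possible way to model this is as a shortest-path problem over states (building, loaded map): moving along a road on the loaded map costs 0, and loading a map costs 1. With only 0 and 1 as edge weights, a 0-1 breadth-first search with a deque is enough. The sketch below is in Python; `min_map_loads`, the `NO_MAP` marker and the input layout (a list of road lists instead of a file) are assumptions made for illustration, not part of the problem statement.

```python
from collections import defaultdict, deque


def min_map_loads(n, maps):
    """Minimum number of map downloads to get from building 1 to building n, or -1.

    maps is a list with one entry per map; each entry is a list of roads (a, b).
    """
    # adj[m][b]: buildings reachable from b by one road drawn on map m
    adj = [defaultdict(list) for _ in maps]
    maps_at = defaultdict(list)  # building -> maps showing a road at that building
    for m, roads in enumerate(maps):
        for a, b in roads:
            adj[m][a].append(b)
            adj[m][b].append(a)
        for b in adj[m]:
            maps_at[b].append(m)

    NO_MAP = -1  # hypothetical marker: standing at a building, about to pick a map
    start = (1, NO_MAP)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        b, m = state
        d = dist[state]
        if b == n:
            return d
        if m == NO_MAP:
            # Loading any map that has a road at this building costs 1.
            moves = [((b, m2), 1) for m2 in maps_at.get(b, ())]
        else:
            # Walking along the loaded map and dropping it are both free.
            moves = [((nb, m), 0) for nb in adj[m].get(b, ())]
            moves.append(((b, NO_MAP), 0))
        for nxt, cost in moves:
            if d + cost < dist.get(nxt, float("inf")):
                dist[nxt] = d + cost
                if cost == 0:
                    queue.appendleft(nxt)  # free moves go to the front
                else:
                    queue.append(nxt)      # paid moves go to the back
    return -1
```

Routing every map change through the `NO_MAP` state keeps the number of edges proportional to the total number of roads, instead of connecting every pair of maps that meet at a building.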

How can I implement an algorithm for a matrix where each cell can have one of two values, with hints

Firstly, sorry for the vagueness of the title.
I don't know if there is a name for this sort of algorithm.
I am interested in finding out how this scenario could be implemented.
Say I have a matrix (columns and rows) where each "cell" can have one of two values ("on" and "off").
Each cell's "background" should also display a number that tells how many of the adjacent cells are "on".
For example, if every cell should be "on":
A  B  C        A   B   C
4  6  4  -->   on  on  on
6  9  6        on  on  on
4  6  4        on  on  on
This is pretty simple to implement. The question is: what's the minimal number of hints I can have, and how can I implement this?
In this same example, it would be enough to have:
A  B  C        A   B   C
.  .  .  -->   on  on  on
.  9  .        on  on  on
.  .  .        on  on  on
It can get trickier:
A  B  C        A   B  C
.  .  .  -->   on  ?  off
4  .  1        on  ?  off
.  .  .        on  ?  off
In this last example, the matrix would be larger and have other hints that later on would help determine the remaining cells.
So how can this be achieved?
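As a starting point, here is a small brute-force sketch in Python. It assumes, based on the 9 in the centre of the all-"on" example, that a cell's number counts the "on" cells in the 3x3 block around it, including the cell itself. The names `neighbourhood_count` and `solutions` are made up for illustration. Enumerating every grid is only feasible for tiny matrices, but it makes it possible to test whether a given set of hints pins down a single grid.

```python
from itertools import product


def neighbourhood_count(grid, r, c):
    """Number of "on" cells in the 3x3 block centred on (r, c), including (r, c)."""
    rows, cols = len(grid), len(grid[0])
    return sum(
        grid[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    )


def solutions(rows, cols, hints):
    """Yield every on/off grid that agrees with all hints {(row, col): count}."""
    for bits in product((False, True), repeat=rows * cols):
        grid = [list(bits[r * cols:(r + 1) * cols]) for r in range(rows)]
        if all(neighbourhood_count(grid, r, c) == n for (r, c), n in hints.items()):
            yield grid


# A single hint of 9 in the centre of a 3x3 matrix leaves one possible grid.
candidates = list(solutions(3, 3, {(1, 1): 9}))
```

A set of hints is sufficient exactly when `solutions` yields one grid, so a minimal set could be searched for by removing hints while the result stays unique.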

Functional data structure

I'm currently trying to make a non-trivial calculator like Maple, Wolfram Alpha and the like, just for fun. But I have set the constraint that it has to be written in a pure, strict functional language. That means no lazy evaluation and no mutable structures like arrays.
The question is simply: what would be an efficient data structure for vectors and matrices? The "easy" go-to answer would of course be lists, but I find them highly inefficient when it comes to products of matrices. To be more precise, the vectors and matrices should be of arbitrary size.

You can leave it abstract by representing vectors and matrices as a product type (e.g. record or tuple) of dimensions and a function from index or row and column to element value. Note that you could also accumulate symbolic vector-matrix expressions and, therefore, simplify them before evaluation in order to eliminate as many temporaries as possible.
If your vectors and matrices are sparse then you might want to use a dictionary for the concrete representation. If they are dense then you might want to use an immutable array.
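To illustrate the abstract representation, here is a minimal sketch of a matrix as a record of its dimensions plus a function from row and column to element. It is written in Python purely for illustration (it is not a pure functional language), and `Matrix`, `from_rows`, `multiply` and `to_rows` are hypothetical names. Nothing is mutated after construction.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Matrix:
    rows: int
    cols: int
    at: Callable[[int, int], float]  # element at (row, column), zero-based


def from_rows(data):
    """Wrap nested sequences in an immutable, function-backed matrix."""
    frozen = tuple(tuple(row) for row in data)
    return Matrix(len(frozen), len(frozen[0]), lambda i, j: frozen[i][j])


def multiply(a, b):
    """Matrix product, described lazily as a new element function."""
    if a.cols != b.rows:
        raise ValueError("dimension mismatch")
    return Matrix(
        a.rows,
        b.cols,
        lambda i, j: sum(a.at(i, k) * b.at(k, j) for k in range(a.cols)),
    )


def to_rows(m):
    """Evaluate every element once and return nested tuples."""
    return tuple(tuple(m.at(i, j) for j in range(m.cols)) for i in range(m.rows))


product_rows = to_rows(multiply(from_rows([[1, 2], [3, 4]]), from_rows([[5, 6], [7, 8]])))
```

Because `multiply` only builds a new function, chained products recompute elements on every access; materializing intermediate results with something like `to_rows` is one way to trade memory for time.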

Thanks Jon
I found a better structure, or roughly the same one.
I use a binary tree as the structure: the left child of a node is the subtree rooted at the next entry in the same column of the matrix, and the right child is the subtree rooted at the next entry in the same row. So it would look like
1 -- 2 -- 3
| | |
4 -- 5 -- 6
| | |
7 -- 8 -- 9
where | is a link to the left child and -- is a link to the right child. It is even possible to avoid representing the same node twice.
So if we have the matrix
1 2 3
4 5 6
7 8 9
then Node(Left, 1, Right) would be the root of the tree, and Left is the submatrix
4 5 6
7 8 9
where Node(_, 4, _) is the root of this matrix and Right is the submatrix
2 3
5 6
8 9
with root Node(_, 2, _).
This is at worst as good as arrays if we analyse the worst case runtime.
In some cases it is faster, for example if we wish to get the submatrix
5 6
8 9
We simply go one step left and then one step right from the root, and we have the whole tree.
We get the same properties we have with a singly linked list: we can create a new matrix or vector (a 1 x m or m x 1 matrix) from an existing one simply by adding the new nodes and setting their children to the right nodes in the old matrix, and we still have the old matrix.
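A rough sketch of this linked structure, in Python for illustration only: `Node` and `build` are hypothetical names, and the builder uses local lists for convenience even though the resulting nodes are immutable. Each entry is created once, so the node for 5 is shared between the left subtree of 2 and the right subtree of 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, eq=False)
class Node:
    left: Optional["Node"]   # next entry down the same column
    value: float
    right: Optional["Node"]  # next entry along the same row


def build(rows):
    """Build the shared node grid bottom-up and return the top-left node."""
    width = len(rows[0])
    below = [None] * width  # nodes of the row underneath the current one
    for row in reversed(rows):
        current = [None] * width
        right = None
        for j in range(width - 1, -1, -1):
            right = Node(below[j], row[j], right)
            current[j] = right
        below = current
    return below[0]


root = build([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
sub = root.left.right  # root of the submatrix with rows (5 6) and (8 9)
```

Reaching the entry in row i and column j takes i + j link traversals from the root.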

program to get all the combinations of ball-box application

I am new to combination and permutation related algorithms. Does anybody have any thoughts on how to write a program to solve this classical problem? There are 3 boxes (A, B, C) and 10 balls (1, 2, 3, ..., 10), and we want to put all the balls into the boxes. The results should look like {Box A: ball 1; Box B: ball 2 3 4; Box C: ball 5 6 7 8 9 10}, {Box A: ball 1 2; Box B: ball 3 4 5; Box C: ball 6 7 8 9 10}, .... I want to get all the combinations themselves (not just the number of different combinations).
Furthermore, what if there is a constraint that each box contains at most 4 balls?
Thank you.

You can put the first ball in any of three boxes, so you have three variants.
There are three variants for the second ball, three for the third and so on.
The choices are independent, so you have 3^10 variants, and each variant maps 1:1 to a number in the range 0..3^10-1.
Write that number in the ternary numeral system: the k-th ternary digit, counting from the least significant one, tells us which box (A=0, B=1, C=2) the k-th ball belongs to.
Example for 3 balls:
Number 14 = 112 in ternary, so the first ball is in C, and the second and third are in B.
For the case of limited box size, a simple approach is recursive generation: the arguments of the recursion are the list of available balls and the current combination (the list of boxes with their balls and vacant places).
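Both ideas can be sketched in a few lines of Python. The function names and the tuple-of-box-letters output format are assumptions made for illustration. `assignment_from_number` reads the ternary digits from the least significant one, and `assignments` is the recursive generator with an optional per-box capacity.

```python
def assignment_from_number(number, balls, boxes):
    """Decode a number into a list whose k-th entry is the box of ball k+1."""
    result = []
    for _ in range(balls):
        number, digit = divmod(number, len(boxes))
        result.append(boxes[digit])
    return result


def assignments(balls, boxes, capacity=None):
    """Yield every way to place the balls; capacity limits balls per box if given."""
    counts = [0] * len(boxes)
    current = []

    def place(k):
        if k == balls:
            yield tuple(current)
            return
        for idx, box in enumerate(boxes):
            if capacity is None or counts[idx] < capacity:
                counts[idx] += 1
                current.append(box)
                yield from place(k + 1)
                current.pop()  # undo the choice before trying the next box
                counts[idx] -= 1

    yield from place(0)


print(assignment_from_number(14, 3, "ABC"))  # ['C', 'B', 'B']
limited = sum(1 for _ in assignments(10, "ABC", capacity=4))
```

The capacity check prunes a branch as soon as a box is full, so the limited case never generates the invalid variants in the first place.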

What is a good individual representation for a closed path planning task using genetic algorithm?

There is an n*n grid, and in one of the cells of the grid lies an agent A.
A can travel T cells.
Each cell in the grid has some weight and the path for A has to maximize that weight.
A also has to return to its starting position within its traveling range T.
What can be a good individual representation to represent the paths?
Methods I have tried:
Chromosome is a list of coordinates.
Chromosome is a list of directions. Each gene is a direction like up, down, up-right, etc. Path never breaks in the middle.
The problem with both methods is that crossover almost always generates invalid paths: paths become broken in the middle, or they don't form a closed path. I can't seem to figure out a good way to represent an individual solution together with an appropriate crossover method. Please help.

First of all, I would say that this problem is a better fit for other approaches, such as maybe ant colony optimization, greedy approaches that give good enough solutions etc. GAs might not work so well for the exact reason you describe.
However, if you must use GAs, here are two possible models that might be worth investigating:
Severely punish invalid paths by giving invalid moves a cost of -infinity. For example, if your chromosome says go from a cell x to an unreachable cell y, consider the cost of y -infinity. This might be worth combining with a low probability of crossover happening, something like 5% maybe.
Don't do crossover, just do some form of more involved mutation of the offspring.
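As an illustration of the penalty idea, here is a minimal fitness sketch for the list-of-directions chromosome. Several things are assumptions, not taken from the question: only four directions are allowed, each visited cell's weight is counted once, and a path is invalid if it leaves the grid, is longer than T, or does not end where it started. `fitness` and `MOVES` are hypothetical names.

```python
import math

# Assumed move set; diagonal moves could be added in the same way.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def fitness(chromosome, weights, start, max_steps):
    """Total weight of the distinct cells visited, or -inf for an invalid path."""
    n = len(weights)
    if len(chromosome) > max_steps:
        return -math.inf  # exceeds the travelling range T
    r, c = start
    visited = {start}
    for gene in chromosome:
        dr, dc = MOVES[gene]
        r, c = r + dr, c + dc
        if not (0 <= r < n and 0 <= c < n):
            return -math.inf  # stepped off the grid
        visited.add((r, c))
    if (r, c) != start:
        return -math.inf  # the path is not closed
    return sum(weights[i][j] for i, j in visited)


grid = [[1, 2], [3, 4]]
loop_score = fitness(["right", "down", "left", "up"], grid, (0, 0), 4)
```

With selection based on this value, invalid offspring simply never get picked as parents.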
If you want to get even fancier, this is somewhat similar to the travelling salesman problem, which has a lot of research in relation to genetic algorithms:
http://www.lalena.com/AI/Tsp/
http://www.math.hmc.edu/seniorthesis/archives/2001/kbryant/kbryant-2001-thesis.pdf

You could encode the path as a reference list:
Assume these are your locations (1 2 3 4 5 6 7 8 9)
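A subset route of (1 2 4 3 8) could be encoded as (1 1 2 1 4): each gene is the 1-based position of the next location in the list of locations not yet visited.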
Now take two parents
p1 = (1 1 2 1 | 4 1 3 1 1)
p2 = (5 1 5 5 | 5 3 3 2 1)
which will produce
o1 = (1 1 2 1 5 3 3 2 1)
o2 = (5 1 5 5 4 1 3 1 1)
which will be decoded into these location routes
o1 = 1 – 2 – 4 – 3 – 9 – 7 – 8 – 6 – 5
o2 = 5 – 1 – 7 – 8 – 6 – 2 – 9 – 3 – 4
This way, a crossover will always yield valid results (whether this representation will help you solving your problem better is a different question).
Some additional information can be found here.
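The decoding and one-point crossover used in the example above can be sketched like this in Python. The function names are made up for illustration, and genes are taken to be 1-based positions in the list of locations that have not been used yet.

```python
def decode(chromosome, locations):
    """Turn a reference list into a route by repeatedly removing the indexed location."""
    remaining = list(locations)
    return [remaining.pop(gene - 1) for gene in chromosome]


def encode(route, locations):
    """Inverse of decode: record each location's 1-based position among those left."""
    remaining = list(locations)
    genes = []
    for location in route:
        index = remaining.index(location)
        genes.append(index + 1)
        remaining.pop(index)
    return genes


def one_point_crossover(p1, p2, cut):
    """Swap the tails of two parents after position cut."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


p1 = [1, 1, 2, 1, 4, 1, 3, 1, 1]
p2 = [5, 1, 5, 5, 5, 3, 3, 2, 1]
o1, o2 = one_point_crossover(p1, p2, 4)
print(decode(o1, range(1, 10)))  # [1, 2, 4, 3, 9, 7, 8, 6, 5]
print(decode(o2, range(1, 10)))  # [5, 1, 7, 8, 6, 2, 9, 3, 4]
```

Crossover stays valid because the gene at position k only has to lie between 1 and the number of locations still unused at that position, and that bound depends on k alone, not on the earlier genes.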
