Cyclic connected graph vs acyclic connected graph use case - algorithm

I have a graph with the following structure and rules:
[Image: the graph]
It is a directed, connected, cyclic graph (edges point from top nodes to bottom nodes).
Any parent node can have 0, 1 or 2 children.
A child can have 1 or 2 parents (the root node n_id_1 has no parents).
There is a left child and a right child. A parent can have a right child without a left child, or vice versa (see n_id_9 and n_id_2).
If a parent (n_id_9) has a left branch (n_id_3 - n_id_8) and no right branch, it always becomes a parent of the child of the left branch (n_id_11).
At this point I see 2 solutions for the data structure:
Option 1. Make a directed connected cyclic graph like this (most obvious way):
parent children
1 2
2 3, 11
3 5, 6
5 9
6 4
9 10, 12
4 7
12 8
7 8
8 11
I think the downside of this is that it is hard to render and to maintain.
Option 2. Make a directed connected acyclic graph in which every parent can have 3 children: left, right and third (I don't know a better name for it). So, for example, n_id_2 has left child n_id_3, no right child and third child n_id_11. For missing children we specify null.
parent children (left, right, third)
1 2, null, null
2 3, null, 11
3 5, 6, 8
5 9, null, null
6 4, null, null
9 null, 10, 12
4 7, null, null
12 null, null, null
7 null, null, null
8 null, null, null
It is much easier to render and to walk through the data, since we now have a tree without cycles. The downside I see is that if the specs change for some reason, this structure might be harder to adapt. I prefer this one.
So the questions are:
Are there any other solutions?
What do you think about my provided solutions?
Thank you for reviewing.

This really depends on what operations you need to perform on your graph. From your comment, you only need to edit the graph and not perform any expensive computation, so take the representation that suits you better from the usability perspective.
Personally, I would go with your second solution, which seems more natural for a graph with such properties. I would only describe it a little differently:
The graph can be represented as a sequence of nodes, each node may have a left and right child branch and a successor (which is what you called third). Both child branches are optional; no successor means the end of the sequence. A child branch is again a sequence of the same kind of nodes.
This gives you a simple recursive structure, which I find easy to reason about (YMMV, of course).
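The recursive description above can be sketched in a few lines. This is only an illustration: the names `Node`, `successor` and `walk` are assumptions, and the sample data simply mirrors the Option 2 table from the question (with "third" renamed to `successor`).

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """One node of a sequence; both branches and the successor are optional."""
    node_id: int
    left: Optional["Node"] = None       # left child branch (itself a sequence)
    right: Optional["Node"] = None      # right child branch (itself a sequence)
    successor: Optional["Node"] = None  # next node in the same sequence ("third")


def walk(node: Optional[Node]) -> Iterator[int]:
    """Yield ids: each node, then its left branch, its right branch, then its successor."""
    while node is not None:
        yield node.node_id
        yield from walk(node.left)
        yield from walk(node.right)
        node = node.successor


# Data copied from the Option 2 table in the question.
graph = Node(1, left=Node(
    2,
    left=Node(
        3,
        left=Node(5, left=Node(9, right=Node(10), successor=Node(12))),
        right=Node(6, left=Node(4, left=Node(7))),
        successor=Node(8),
    ),
    successor=Node(11),
))
```

With this data, `list(walk(graph))` visits the ids in the order 1, 2, 3, 5, 9, 10, 12, 6, 4, 7, 8, 11. Because no node is reachable through two different references, rendering code never has to track visited nodes.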
As a side note: this graph is not cyclic; only its undirected counterpart is.
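That side note can be checked against the Option 1 edge list. The sketch below is one possible way to do it, not part of the original answer: it uses Kahn's algorithm for the directed case and a small union-find for the undirected case. The function names are made up for this example.

```python
from collections import deque

# Option 1 table from the question: parent -> children.
OPTION_1 = {
    1: [2], 2: [3, 11], 3: [5, 6], 5: [9], 6: [4],
    9: [10, 12], 4: [7], 12: [8], 7: [8], 8: [11],
}


def has_directed_cycle(adj):
    """Kahn's algorithm: a cycle exists if some node never reaches in-degree 0."""
    nodes = set(adj) | {c for children in adj.values() for c in children}
    indegree = {v: 0 for v in nodes}
    for children in adj.values():
        for c in children:
            indegree[c] += 1
    queue = deque(v for v in nodes if indegree[v] == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for c in adj.get(v, []):
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return removed != len(nodes)


def has_undirected_cycle(adj):
    """Ignore directions: an edge joining two already-connected nodes closes a cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, children in adj.items():
        for c in children:
            root_u, root_c = find(u), find(c)
            if root_u == root_c:
                return True
            parent[root_u] = root_c
    return False
```

For the table above, `has_directed_cycle(OPTION_1)` is `False` and `has_undirected_cycle(OPTION_1)` is `True`, which matches the note: the directed graph is a DAG, and the "cycles" only appear when edge directions are ignored.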

Related

Algorithm for node assignment in graph

There are N nodes (1 ≤ N ≤ 2⋅10^5) and M (1 ≤ M ≤ 2⋅10^5) directed edges in a graph. Every node has an assigned number (an integer in the range 1...N) that we are trying to determine.
All nodes with a certain assigned number will have directed edges leading to other nodes with another certain assigned number. This also implies that if one node has multiple directed edges coming out of it, then the nodes that it leads to all have the same assigned number. We have to use this information to determine an assignment of numbers such that the number of distinct numbers among all nodes is maximized.
Because there are multiple possible answers, the output should be the assignment that minimizes the numbers assigned to nodes 1…N, in that order. Essentially the answer is the lexicographically smallest one.
Example:
In a graph of 9 nodes and 12 edges, here are the edges. For the two integers i and j on each line, there is a directed edge from i to j.
3 4
6 9
4 2
2 9
8 3
7 1
3 5
5 8
1 2
4 6
8 7
9 4
The correct assignment is that nodes 1, 4, 5 have the assigned number 1; nodes 2, 6, 8 have the assigned number 2; and nodes 3, 7, 9 have the assigned number 3. This makes sense because nodes 1, 4, 5 lead to nodes 2, 6, 8, which lead to nodes 3, 7, 9.
To solve this problem, I thought that you could create a graph with disconnected subgraphs each representing a group of nodes that have the same assigned number. To do this, I could simply scan through all the nodes, and if a node has multiple directed edges to other nodes, you should add them to your graph as a connected component. If some of the nodes were already in the graph, you could simply add edges in between the current components.
Then, for the rest of the nodes, you could find which nodes they have directed edges to, and somehow use that information to add them to your new graph.
Would this strategy work? If so, how can I properly implement the second portion of my algorithm?
EDIT 1: Earlier I interpreted the problem statement incorrectly; I have now posted the correct interpretation and my new way of approaching the problem.
EDIT 2: So once I go through all the nodes once, adding edges in the way I described above, I would determine the components for each node. Then I would iterate through the nodes again, this time making sure to add the rest of the edges into the graph recursively. For example, if a node with an assigned number has a directed edge to a node that hasn't been assigned a number, I can add that node to its designated component. I can also use Union Find to maintain the components.
While this will be fast enough, I'm worried that there may be errors - for example, when I do this recursive solution, it is possible that when a node is assigned a number, other nodes with assigned numbers that are connected to that node may not work with it. Basically, there would be a contradiction. I would have to come up with a solution for that.
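The merging idea from EDIT 2 could be sketched roughly as follows. This is an assumption-laden illustration, not a verified solution: `assign_numbers` is a hypothetical name, the repeated passes over the edge list are naive and not tuned to the stated limits, and the code does not detect contradictions (for example, a group that would have to point to itself).

```python
def assign_numbers(n, edges):
    """Merge nodes that must share a number, then label groups by first appearance."""
    parent = list(range(n + 1))  # union-find over nodes 1..n (index 0 unused)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    changed = True
    while changed:  # repeat until a full pass merges nothing
        changed = False
        target = {}  # group root -> one node that this group points to
        for u, v in edges:
            group = find(u)
            if group in target:
                # Everything a group points to must share one number.
                if union(target[group], v):
                    changed = True
            else:
                target[group] = v

    # Numbering groups in order of first appearance keeps the output
    # lexicographically smallest for this grouping.
    label = {}
    result = []
    for node in range(1, n + 1):
        root = find(node)
        if root not in label:
            label[root] = len(label) + 1
        result.append(label[root])
    return result


EXAMPLE_EDGES = [(3, 4), (6, 9), (4, 2), (2, 9), (8, 3), (7, 1),
                 (3, 5), (5, 8), (1, 2), (4, 6), (8, 7), (9, 4)]
```

On the example from the question, `assign_numbers(9, EXAMPLE_EDGES)` returns `[1, 2, 3, 1, 1, 2, 3, 2, 3]`, i.e. nodes 1, 4, 5 get 1, nodes 2, 6, 8 get 2 and nodes 3, 7, 9 get 3.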
For each node, print rand() % rand() + 1 and pray. With dedication, you might pass all cases.

Priority order in BFS (Breadth First Search Algorithm)

Starting from the topmost node, i.e. 1: at node 2 there will be two adjacent nodes to visit, i.e. 3 and 4. Which one should we put in the queue and print first? Please also explain why.
By its definition BFS should always process 2 and 5 before processing 3 and 4.
In other words the order is determined by the distance from the origin.
For plain vanilla BFS it makes no difference if 2 is processed before 5 or after 5, as it makes no difference if 3 is processed before 4 or after it.
Note that this is not true for Depth First Search.
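A small sketch may make this concrete. The original picture is not available here, so the adjacency lists below are an assumption based on the text (1 is adjacent to 2 and 5, and 2 is adjacent to 3 and 4); `bfs` is just an illustrative name.

```python
from collections import deque


def bfs(adj, start):
    """Return the visit order and the distance of every reached node from start."""
    dist = {start: 0}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj.get(v, []):
            if w not in dist:          # first time we see w
                dist[w] = dist[v] + 1
                queue.append(w)
    return order, dist


# Hypothetical graph: same edges, neighbours listed in two different orders.
graph_a = {1: [2, 5], 2: [3, 4]}
graph_b = {1: [5, 2], 2: [4, 3]}
```

`bfs(graph_a, 1)` visits 1, 2, 5, 3, 4 while `bfs(graph_b, 1)` visits 1, 5, 2, 4, 3. Both are valid BFS orders: the distances are identical, and only the tie-breaking among nodes at the same distance differs, which depends on the order in which neighbours are stored.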

How to find id of union find operation

I am studying Union Find.
I understand how these union operations come together to make this graph but I do not understand how the ID variable is assigned. At first, I thought it was the size of each graph but this is not true because the size of the first graph is 5 and the size of the second one is 3. Any help would be appreciated.
Normally, in the array ID, the index represents a node of any of the graphs and the associated value is the root of the graph that node belongs to. So in the example here:
Node 0 (the first element) is associated with 6, because 0 belongs to a graph where 6 is the root.
Node 1 is also associated with 6, because 1 belongs to a graph where 6 is the root.
[...]
In the same way, 4, 5 and 7 are associated with 4, because these nodes belong to the graph where 4 is the root.
It's a way to quickly identify whether two nodes are connected.
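As a rough illustration of how such an array gets filled, here is a minimal "quick-find" sketch. It is an assumption that the course material uses this variant. The class name and the sample unions are hypothetical and do not reproduce the example from the question.

```python
class QuickFind:
    """Union-find where id[i] directly stores the representative of i's component."""

    def __init__(self, n):
        self.id = list(range(n))  # initially every node is its own root

    def connected(self, p, q):
        # Two nodes are connected exactly when they share a representative.
        return self.id[p] == self.id[q]

    def union(self, p, q):
        pid, qid = self.id[p], self.id[q]
        if pid == qid:
            return
        # Relabel every member of p's component with q's representative.
        self.id = [qid if x == pid else x for x in self.id]


uf = QuickFind(5)
uf.union(0, 1)  # component {0, 1} is now labelled 1
uf.union(3, 4)  # component {3, 4} is now labelled 4
```

After these two unions `uf.id` is `[1, 1, 2, 4, 4]`. The values are labels of components, not sizes, which is why they do not match the number of nodes in each graph.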

How to build binary tree knowing only which nodes are connected?

I have to build a binary tree, but I don't know which node is the parent, the left child, or the right child. I only know which nodes are connected. Example: for input like this:
6 4
5 7
9 7
1 5
10 4
3 4
2 6
7 8
5 6
(from 1 there is always one path) the tree should look like this:
In the input I am also given the number of nodes. Any ideas or tips?
From the list of the edges, one can easily create a tree.
You can find the leaf nodes by looking for the nodes that appear in only one edge.
However, it is not possible to know which of the leaf nodes is the head of the tree.
In a tree structure you can pick any leaf, make it the head, and reorder the tree, and the result will still be a valid tree.
There is also the issue of tree isomorphism: you can swap the right and left sub-trees and still get a valid tree.
To summarize, from this list you can get 6 heads and in each 4 possible swaps, so in total 24 different valid trees.
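The leaf search and the re-rooting described above can be sketched like this. The helper names are made up for this example, and the order of the children simply follows the input order, since the edge list carries no left/right information.

```python
from collections import defaultdict, deque

# Edge list from the question.
TREE_EDGES = [(6, 4), (5, 7), (9, 7), (1, 5), (10, 4),
              (3, 4), (2, 6), (7, 8), (5, 6)]


def build_adjacency(edges):
    """Undirected adjacency lists, neighbours kept in input order."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return dict(adj)


def find_leaves(adj):
    """Leaves are the nodes that appear in exactly one edge."""
    return sorted(v for v, neighbours in adj.items() if len(neighbours) == 1)


def root_tree(adj, root):
    """Orient every edge away from the chosen root; returns node -> children."""
    children = {v: [] for v in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                children[v].append(w)
                queue.append(w)
    return children


adjacency = build_adjacency(TREE_EDGES)
rooted_at_1 = root_tree(adjacency, 1)
```

Here `find_leaves(adjacency)` gives `[1, 2, 3, 8, 9, 10]`, the six candidate heads. Rooting at 1 gives, for instance, `rooted_at_1[5] == [7, 6]`. Which of the two children is drawn on the left is a free choice.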

Basic and Balance BST insertion node placement

I have some questions about certain placement of child nodes since I'm just learning
BSTs and it's quite confusing even after reading some sources and doing some online insertion applets.
Let's say I want to add nodes 5,7,3,4 to an empty basic BST.
add 5
    5
add 7
    5
      7
add 3
    5
  3   7
add 4
    5
  3   7
    4
Ok I understand that the left child must be less than the parent AND less than or
equal to the right child from that same parent. I follow it until we add the 4 node. How
do we determine that the insertion of 4 goes to the bottom right leaf position of 3 instead of the bottom left leaf position?
Also, doing an AVL insertion of nodes 5,18,3,7,11 yielded some surprising placements. The fourth node, 7, went down through 18 instead of 3. Is there a particular reason why? Assuming that is correct, inserting 11 would switch the 11 and 18 spots, but wouldn't having 18 as the parent node, 7 as the left child, and 11 as the right child adhere to the principle of left child smaller than parent and smaller than or equal to right child? I'm confused! I would appreciate any help. Thank you!
insert 7
    5
  3   18
     7
insert 11
    5
  3   11
     7  18
In a BST, all elements in the left (right) subtree of any node are less (greater) than that node. So, in the left subtree of 5, both 3 and 4 are less than 5. Now look at 3: 4 is bigger than 3, so it goes to the right of 3.
Same for your AVL question. 7 became the left child of 18, instead of the right child of 3 because 5 is the root. When doing an insert, you compare elements one level at a time. "Is 7 bigger than 5 (root)? Yes, go right. Is 7 bigger than 18? No, go left. No node to compare to, so 7 goes here".
Having 11 as the right child of 18 would violate the BST property: 11 is smaller than 18, so it must be in the left subtree of 18.
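The level-by-level comparison can be sketched as a plain (unbalanced) BST insert. This is only an illustration: `BSTNode`, `insert` and `build` are names chosen for this example, duplicates are sent to the right by assumption, and the AVL rotation that later moves 11 above 7 and 18 is deliberately left out.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Walk down from the root, comparing one level at a time."""
    if root is None:
        return BSTNode(key)  # empty spot found: the key goes here
    if key < root.key:
        root.left = insert(root.left, key)    # smaller: go left
    else:
        root.right = insert(root.right, key)  # larger (or equal): go right
    return root


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


basic = build([5, 7, 3, 4])
before_rotation = build([5, 18, 3, 7, 11])
```

In `basic`, 4 ends up as the right child of 3 (`basic.left.right.key == 4`). In `before_rotation`, 7 is the left child of 18 and 11 is the right child of 7. An AVL tree then rebalances the subtree at 18, which yields the 11 / 7 / 18 arrangement shown in the question.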
