Below is a representation of a binary tree that I use in my project. At the bottom are the leaf nodes (orange boxes), and every level above is the sum of the children below it.
So 3, the leftmost node, is the sum of 1 and 2 (its left and right children), and 10 is the sum of 3 and 7 (again its left and right children).
What I am trying to do is store this tree in a flat array without using any pointers. So the array is basically an integer array holding 2n-1 nodes (where n is the number of leaf nodes).
So the index of the root element is 0; for a node at index p, the index of its left child is 2p+1 and the index of its right child is 2p+2. Please see Binary Tree (Array implementation).
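For reference, here is a minimal sketch of that index arithmetic (the class and method names are mine, and the sample values are only an example consistent with the sums described above):

```java
// Minimal sketch of implicit binary-tree indexing in a flat array.
// Class and method names are illustrative, not from the question.
public class FlatTreeIndex {
    static int leftChild(int p)  { return 2 * p + 1; }
    static int rightChild(int p) { return 2 * p + 2; }
    static int parent(int p)     { return (p - 1) / 2; }

    public static void main(String[] args) {
        // Internal sums first, leaves at the end: 36 = 10 + 26, 10 = 3 + 7, 26 = 11 + 15, 3 = 1 + 2.
        int[] tree = {36, 10, 26, 3, 7, 11, 15, 1, 2};
        // The root (index 0) is the sum of its children at indices 1 and 2.
        System.out.println(tree[0] == tree[leftChild(0)] + tree[rightChild(0)]); // true
    }
}
```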
Everything works nicely if I know the number of leaf nodes beforehand, but I can't seem to find a way to store this tree in a dynamically expanding array.
If I need to add 9, for example, as the 9th element of the array, the structure needs to change and I need to recalculate all the indices again, which I want to avoid because there may be hundreds of thousands of elements in the array at any time.
Does anyone know of an implementation of this layout that handles a dynamically growing array?
EDIT:
Below is a demonstration of what happens when I add new elements to the array: 36 was the root before; now it's a second-level element, and the new root, array[0], is 114, which triggers a completely new layout.
I currently have a doubly-linked list of objects in descending sorted order. (The list is intrusive: the pointers live in the objects themselves.) I have a very limited set of operations:
add a node with the highest possible key
remove a node with the highest possible key (doesn't matter which one)
remove a node with key 0 (doesn't matter which one)
increment key of a node with highest current key (doesn't matter which one)
decrement key of any given node whose key is above 0
Operations 1-4 will be constant time, but operation 5 is O(n), where n = the number of nodes with the same key value. This is because such a node, when decremented, has to be moved past its siblings with the same key value and placed after that range, and finding that re-insert position is O(n).
I thought of a heap (a heapsort heap, not the malloc heap) as a solution where the worst case would be O(log n) (where n = number of nodes). However, based on my recollection and what Google turns up, it seems to be invariably implemented in an array, as opposed to a binary tree. So:
Question: is there an implementation of a heap that uses pointers in the manner of a binary tree, as opposed to an array, and that maintains the big-O complexity of the typical array implementation?
One common way to do this is to use an array-based heap, but:
In the heap you store pointers to nodes;
In each node you store its index in the heap; and
Whenever you swap elements in the heap, you update the indexes in the corresponding nodes.
This preserves the complexity of all the heap operations, and costs around 1.5 pointers and 1 integer per node (the extra 0.5 pointer comes from the way growable arrays over-allocate).
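A minimal sketch of that scheme, assuming a max-heap keyed on an int (all class and method names here are mine, not from the answer):

```java
import java.util.ArrayList;
import java.util.List;

// Array-based max-heap that stores references to nodes, where each node
// remembers its own position in the heap array.
class Node {
    int key;
    int heapIndex;          // position of this node inside Heap.items
    Node(int key) { this.key = key; }
}

class Heap {
    private final List<Node> items = new ArrayList<>();

    void add(Node n) {
        n.heapIndex = items.size();
        items.add(n);
        siftUp(n.heapIndex);
    }

    // Called after n.key has been decreased; restores the heap order.
    void onKeyDecreased(Node n) { siftDown(n.heapIndex); }

    private void siftUp(int i) {
        while (i > 0 && items.get((i - 1) / 2).key < items.get(i).key) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2, largest = i;
            if (left < items.size() && items.get(left).key > items.get(largest).key) largest = left;
            if (right < items.size() && items.get(right).key > items.get(largest).key) largest = right;
            if (largest == i) return;
            swap(i, largest);
            i = largest;
        }
    }

    private void swap(int a, int b) {
        Node na = items.get(a), nb = items.get(b);
        items.set(a, nb);
        items.set(b, na);
        na.heapIndex = b;   // keep the back-pointers in sync
        nb.heapIndex = a;
    }
}
```

Because each node carries its own heapIndex, a key change can jump straight to the node's slot and sift from there, without searching the array.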
Alternatively, you can just link the nodes together into a tree with pointers. To support the operations you want, though, this requires 3 pointers per node (parent, left, right).
Both ways work fine, but the array implementation is simpler, faster, and uses a bit less memory.
ETA:
I should point out, though, that if you use pointers then you can use different kinds of heaps. A Fibonacci heap will let you decrement the value of a node in amortized constant time. It's kinda complicated, though, and slow in practice: https://en.wikipedia.org/wiki/Fibonacci_heap
Unfortunately, the answer to the problem as written isn't an answer to its headline title.
Solution 1: amortized O(1) data structure
A solution was found with amortized O(1) implementations of all required operations.
It is simply a doubly-linked list of doubly-linked lists. The "main" doubly-linked list's nodes are called parents, and we have at most one parent per key value. Each parent node keeps a doubly-linked list of child nodes with the same key value. Each child additionally points to its parent.
add a node with the highest possible value: If there is no list head, or its value is not the maximum, add a new node at the head of the main linked list. Otherwise, add it to the tail of the head node's child list.
remove a (any) node with the highest possible value: In the case of multiple items with the highest value, it doesn't matter which one we remove. So, if the head parent has children, remove the tail child from its child list. Otherwise, remove the parent from the main list.
remove a (any) node with value 0: Same operations.
increment value of a (any) node with the highest current value: In the case of multiple nodes with the same key value we can choose any, so choose the head parent's tail child. Remove it from the child list. If incrementing its value exceeds the maximum value, then you're done. Otherwise it becomes a new head node. If instead there are no children, then increment the head parent in place, and if it exceeds the maximum value, remove it.
decrement value of any node above 0: If the node is a child, remove it from the child list, then either add it to the parent's successor's child list or insert it as a new node after the parent. For a parent with no children: if the successor in the main list still has a smaller key, you're done; otherwise remove it and add it as the successor's tail child. For a parent with children: same, but promote the head child to take its place. This last case is O(n), where n = the number of nodes with the given value, because you must change the parent pointer of all the children. However, if the odds of the node selected for decrement being the parent of all nodes with that value are 1/n, this amortizes to O(1).
The main downside is that we logically have 7 different pointers for each node. In the parent role we need previous and next parent, plus head and tail child. In the child role we need previous and next child, plus parent. These can be combined in a union into two alternate substructures of 4 and 3 pointers, which saves storage but not CPU time (except perhaps the need to zero out unused pointers for cleanliness). Updating them all won't be fast.
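As an illustration of that pointer count, here is one possible (non-unioned) node layout in code; the field names are mine, not from the answer:

```java
// One possible node layout for the list-of-lists described above.
// A node acts either as a "parent" (one per distinct key) or as a
// "child" hanging off a parent. Java has no unions, so both roles
// are shown as a single flat class here: 7 pointers plus the key.
class KeyNode {
    int key;

    // Parent role: position in the main list plus the ends of the child list.
    KeyNode prevParent, nextParent;
    KeyNode headChild, tailChild;

    // Child role: position in the parent's child list plus the owning parent.
    KeyNode prevChild, nextChild;
    KeyNode parent;
}
```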
Solution 2: Sloppy is Good Enough
Another approach is simply to be sloppy. The application benefits from finding nodes with higher scores, but it's not critical that they be in absolutely perfect order. So rather than an O(n) operation to move nodes potentially from one end of the chain to the other, we could accept a solution that does an O(1), albeit at times imperfect, job.
This could be the current implementation of a doubly-linked list. It can support all operations except decrement in O(1), and it can handle decrement of a unique key value in O(1). Only decrement of a non-unique key value goes to O(n), as we need to skip the remaining nodes with the previous key value to find the first node with the same or a higher key. To bound the worst case, we could simply cap that search at, say, 5 or 10 links. This too would provide a nominally O(1) solution. However, some pernicious usage patterns may slowly cause the entire list to become quite unordered.
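Here is a rough sketch of what that capped decrement could look like on a doubly-linked list kept in descending key order; all names and the cap value are illustrative assumptions:

```java
// "Sloppy" decrement on a doubly-linked list kept in (approximately)
// descending key order. Names and the cap are illustrative.
class ListNode {
    int key;
    ListNode prev, next;
}

class SloppyList {
    private static final int MAX_SCAN = 10;  // the arbitrary cap discussed above
    ListNode head, tail;

    // Decrement node.key and move the node backward past at most
    // MAX_SCAN successors, so the list stays only roughly sorted.
    void decrement(ListNode node) {
        node.key--;
        int scanned = 0;
        while (node.next != null && node.next.key > node.key && scanned < MAX_SCAN) {
            swapWithNext(node);
            scanned++;
        }
    }

    // Unlink node and re-insert it immediately after its current successor.
    private void swapWithNext(ListNode node) {
        ListNode after = node.next;
        // detach node from its current position
        if (node.prev != null) node.prev.next = after; else head = after;
        after.prev = node.prev;
        // re-attach node after 'after'
        node.next = after.next;
        if (after.next != null) after.next.prev = node; else tail = node;
        after.next = node;
        node.prev = after;
    }
}
```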
Say that I have a sequence of key values to be inserted into a B-tree of any given order. After inserting all the elements, I perform a deletion operation on some of those elements. Does it always give a unique result (in the form of a B-tree), or can it differ according to the deletion operation?
Quoted from Wikipedia (https://en.wikipedia.org/wiki/B-tree):
Deletion from an internal node
Each element in an internal node acts as a separation value for two subtrees, therefore we need to find a replacement for separation. Note that the largest element in the left subtree is still less than the separator. Likewise, the smallest element in the right subtree is still greater than the separator. Both of those elements are in leaf nodes, and either one can be the new separator for the two subtrees.
Algorithmically described below:
Choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.
The previous step deleted an element (the new separator) from a leaf node. If that leaf node is now deficient (has fewer than the required number of elements), then rebalance the tree starting from the leaf node.
I think the result may vary depending on the deletion operation, because of the choice in the quoted lines above (either the largest element in the left subtree or the smallest element in the right subtree can become the new separator). Am I right? Help :)
If your question is whether two B-trees that contain the exact same collection of key values will always have identical nodes, then the answer is No.
Note that this is also true for e.g. simple binary trees.
However, in the case of B-trees this can be more pronounced because B-trees are optimized for minimizing page changes and thus the need to write back to slow secondary storage.
I have an unordered tree in the form of, for example:
Root
    A1
        A1_1
            A1_1_1
            A1_1_2
                A1_1_2_1
                A1_1_2_2
                A1_1_2_3
            A1_1_3
            A1_1_n
        A1_2
        A1_3
        A1_n
    A2
        A2_1
        A2_2
        A2_3
        A2_n
The tree is unordered
each node can have an arbitrary number N of children
each node stores a unique long value.
the value required can be at any position.
My problem: if I need the long value of A1_1_2_3, the first time I traverse the nodes I do a depth-first search to get it. However, on later calls for the same node I must get its value without a recursive search. Why? If the tree had hundreds of thousands of nodes before reaching my A1_1_2_3 node, it would take too much time.
What I thought of is to leave some hints behind after the first traversal. E.g. for my case, when I return the long value for A1_1_2_3, I also return an array with information for future searches of the same node, saying: to get to A1_1_2_3, I need:
first child of Root, which is A1
first child of A1, which is A1_1
second child of A1_1, which is A1_1_2
third child of A1_1_2, which is what I need: A1_1_2_3
So I figured I would store this information along with the value for A1_1_2_3 as an array of indexes: [0, 0, 1, 2]. By doing so, I could easily find the node again on subsequent calls for A1_1_2_3 and avoid the recursion each time.
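A minimal sketch of that index-path lookup (the TreeNode class and its fields are assumptions about the node type, not given in the question):

```java
import java.util.List;

// Looking a node up again by a stored path of child indexes,
// e.g. [0, 0, 1, 2] for A1_1_2_3.
class TreeNode {
    long value;
    List<TreeNode> children;

    // Follow the stored path; return null if the tree changed and the
    // path no longer leads anywhere, so the caller can fall back to a
    // (partial) depth-first search and store a fresh path.
    static TreeNode resolve(TreeNode root, int[] path) {
        TreeNode current = root;
        for (int index : path) {
            if (current.children == null || index >= current.children.size()) {
                return null;
            }
            current = current.children.get(index);
        }
        // Note: even when the path resolves, the caller should verify this
        // is still the expected node (e.g. compare the stored value or id).
        return current;
    }
}
```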
However, the nodes can change. On subsequent calls I might have a new structure, so my indexes stored earlier would not match anymore. But if this happens, I thought that whenever I don't find the element anymore, I would recursively go back up a level and search for the item, and so on, until I find it again and store the indexes again for future reference:
e.g. if my A1_1_2_3 is now situated in this new structure:
A1_1
    A1_1_0
    A1_1_1
    A1_1_2
        A1_1_2_1
        A1_1_2_2
        A1_1_21_22
        A1_1_2_3
... in this case the new element A1_1_0 ruined my stored structure, so I would go back up a level and search children again recursively until I find it again.
Does what I thought of here even make sense, or am I overcomplicating things? I'm talking about an unordered tree which can have at most about three hundred thousand nodes, and it is vital that I can jump to nodes as fast as possible. But the tree can also be very small, under 10 nodes.
Is there a more efficient way to search in such a situation?
Thank you for any idea.
edit:
I forgot to add: what I need on subsequent calls is not just the same value; its position is also important, because I must get the next page of children after that child (since it's a tree structure, I'm paging through the nodes after the initially selected one). Hope it makes more sense now.
What are the main differences between a Linked List and a Binary Search Tree? Is a BST just a way of maintaining a LinkedList? My instructor talked about LinkedList and then BST, but didn't compare them or say when to prefer one over the other. This is probably a dumb question but I'm really confused. I would appreciate it if someone could clarify this in a simple manner.
Linked List:
Item(1) -> Item(2) -> Item(3) -> Item(4) -> Item(5) -> Item(6) -> Item(7)
Binary tree:
                  Node(1)
                 /
          Node(2)
         /       \
        /         Node(3)
RootNode(4)
        \         Node(5)
         \       /
          Node(6)
                 \
                  Node(7)
In a linked list, the items are linked together through a single next pointer.
In a binary tree, each node can have 0, 1 or 2 subnodes, where (in the case of a binary search tree) the key of the left node is less than the key of the node and the key of the right node is greater than the key of the node. As long as the tree is balanced, the search path to each item is a lot shorter than in a linked list.
Searchpaths:
------  ------  ------
key     List    Tree
------  ------  ------
1       1       3
2       2       2
3       3       3
4       4       1
5       5       3
6       6       2
7       7       3
------  ------  ------
avg     4       2.43
------  ------  ------
With larger structures, the average search path becomes significantly smaller:
------  ------  ------
items   List    Tree
------  ------  ------
1       1       1
3       2       1.67
7       4       2.43
15      8       3.29
31      16      4.16
63      32      5.09
------  ------  ------
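To make the counts above concrete, here is a small illustrative sketch (all names are mine) that counts the nodes visited when searching a sorted linked list versus a balanced binary search tree:

```java
// Counts the nodes visited when searching a linked list vs. a
// binary search tree. Class and method names are illustrative.
class ListItem {
    int key;
    ListItem next;
    ListItem(int key) { this.key = key; }
}

class TreeItem {
    int key;
    TreeItem left, right;
    TreeItem(int key) { this.key = key; }
}

public class SearchPathDemo {
    static int listSearchPath(ListItem head, int key) {
        int visited = 0;
        for (ListItem n = head; n != null; n = n.next) {
            visited++;
            if (n.key == key) return visited;
        }
        return visited; // not found: the whole list was scanned
    }

    static int treeSearchPath(TreeItem root, int key) {
        int visited = 0;
        for (TreeItem n = root; n != null; n = (key < n.key) ? n.left : n.right) {
            visited++;
            if (key == n.key) return visited;
        }
        return visited; // not found
    }
}
```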
A Binary Search Tree is a binary tree in which each internal node x stores an element such that the elements stored in the left subtree of x are less than or equal to x and the elements stored in the right subtree of x are greater than or equal to x.
Now a Linked List consists of a sequence of nodes, each containing arbitrary values and one or two references pointing to the next and/or previous nodes.
In computer science, a binary search tree (BST) is a binary tree data structure which has the following properties:
each node (item in the tree) has a distinct value;
both the left and right subtrees must also be binary search trees;
the left subtree of a node contains only values less than the node's value;
the right subtree of a node contains only values greater than or equal to the node's value.
In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures.
So a Binary Search Tree is an abstract concept that may be implemented with linked nodes or with an array, while the linked list is a fundamental data structure.
I would say the MAIN difference is that a binary search tree is sorted. When you insert into a binary search tree, where those elements end up being stored in memory is a function of their value. With a linked list, elements are blindly added to the list regardless of their value.
Right away you can see some trade-offs:
Linked lists preserve insertion order and inserting is less expensive
Binary search trees are generally quicker to search
A linked list is a sequential number of "nodes" linked to each other, ie:
public class LinkedListNode
{
    Object Data;
    LinkedListNode NextNode;
}
A Binary Search Tree uses a similar node structure, but instead of linking to the next node, it links to two child nodes:
public class BSTNode
{
    Object Data;
    BSTNode LeftNode;
    BSTNode RightNode;
}
By following specific rules when adding new nodes to a BST, you can create a data structure that is very fast to traverse. Other answers here have detailed these rules; I just wanted to show the difference between the node classes at the code level.
It is important to note that if you insert sorted data into a BST, you'll end up with a linked list, and you lose the advantage of using a tree.
Because of this, searching a linked list is O(N), while searching a BST is O(N) in the worst case and O(log N) in the best (balanced) case.
They do have similarities, but the main difference is that a Binary Search Tree is designed to support efficient searching for an element, or "key".
A binary search tree, like a doubly-linked list, points to two other elements in the structure. However, when adding elements to the structure, rather than just appending them to the end of the list, the binary tree is reorganized so that elements linked to the "left" node are less than the current node and elements linked to the "right" node are greater than the current node.
In a simple implementation, the new element is compared to the first element of the structure (the root of the tree). If it's less, the "left" branch is taken, otherwise the "right" branch is examined. This continues with each node, until a branch is found to be empty; the new element fills that position.
With this simple approach, if elements are added in order, you end up with a linked list (with the same performance). Different algorithms exist for maintaining some measure of balance in the tree, by rearranging nodes. For example, AVL trees do the most work to keep the tree as balanced as possible, giving the best search times. Red-black trees don't keep the tree as balanced, resulting in slightly slower searches, but do less work on average as keys are inserted or removed.
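A minimal sketch of that simple, unbalanced insertion (the class is mine; no rebalancing is attempted):

```java
// Minimal unbalanced BST insertion as described above.
// Duplicates go to the right in this sketch.
class BstNode {
    int key;
    BstNode left, right;
    BstNode(int key) { this.key = key; }

    static BstNode insert(BstNode root, int key) {
        if (root == null) return new BstNode(key);   // empty branch: the new element fills it
        if (key < root.key) {
            root.left = insert(root.left, key);      // smaller: take the left branch
        } else {
            root.right = insert(root.right, key);    // otherwise: take the right branch
        }
        return root;
    }
}
```

Feeding already-sorted keys into this routine produces exactly the degenerate, list-shaped tree mentioned above.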
Linked lists and BSTs don't really have much in common, except that they're both data structures that act as containers. Linked lists basically allow you to insert and remove elements efficiently at any location in the list, while maintaining the ordering of the list. This list is implemented using pointers from one element to the next (and often the previous).
A binary search tree, on the other hand, is a data structure of a higher abstraction (i.e. it's not specified how it is implemented internally) that allows for efficient searches (i.e. in order to find a specific element you don't have to look at all of the elements).
Notice that a linked list can be thought of as a degenerate binary tree, i.e. a tree where all nodes have only one child.
It's actually pretty simple. A linked list is just a bunch of items chained together, in no particular order. You can think of it as a really skinny tree that never branches:
1 -> 2 -> 5 -> 3 -> 9 -> 12 -> |i. (that last is an ascii-art attempt at a terminating null)
A Binary Search Tree is different in 2 ways: the binary part means that each node has up to 2 children, not one, and the search part means that those children are arranged to speed up searches - only smaller items to the left, and only larger ones to the right:
        5
       / \
      2   9
     / \   \
    1   3   12
9 has no left child, and 1, 3, and 12 are "leaves" - they have no branches.
Make sense?
For most "lookup" kinds of uses, a BST is better. But for just "keeping a list of things to deal with later First-In-First-Out or Last-In-First-Out" kinds of things, a linked list might work well.
The issue with a linked list is searching within it (whether for retrieval or insert).
For a singly-linked list, you have to start at the head and search sequentially to find the desired element. To avoid the need to scan the whole list, you need additional references to nodes within the list, in which case it's no longer a simple linked list.
A binary tree allows for more rapid searching and insertion by being inherently sorted and navigable.
An alternative that I've used successfully in the past is a SkipList. This provides something akin to a linked list but with extra references to allow search performance comparable to a binary tree.
A linked list is just that... a list. It's linear; each node has a reference to the next node (and the previous, if you're talking about a doubly-linked list). A tree branches: each node has references to various child nodes. A binary tree is a special case in which each node has at most two children. Thus, in a linked list, each node has a previous node and a next node, and in a binary tree, a node has a left child, right child, and parent.
These relationships may be bi-directional or uni-directional, depending on how you need to be able to traverse the structure.
Linked List is straight Linear data with adjacent nodes connected with each other e.g. A->B->C. You can consider it as a straight fence.
BST is a hierarchical structure just like a tree with the main trunk connected to branches and those branches in-turn connected to other branches and so on. The "Binary" word here means each branch is connected to a maximum of two branches.
You use a linked list to represent straight-line data, with each item connected to a maximum of one next item, whereas a BST connects each item to up to two items. You can use a tree to represent data such as a family tree, but that becomes an n-ary tree, as there can be more than two children for each person.
A binary search tree can be implemented in any fashion, it doesn't need to use a linked list.
A linked list is simply a structure which contains nodes and pointers/references to other nodes inside a node. Given the head node of a list, you may browse to any other node in a linked list. Doubly-linked lists have two pointers/references: the normal reference to the next node, but also a reference to the previous node. If the last node in a doubly-linked list references the first node in the list as the next node, and the first node references the last node as its previous node, it is said to be a circular list.
A binary search tree is a tree that splits up its input into two roughly equal halves based on a binary search comparison algorithm. Thus, it only needs very few steps to find an element. For instance, if you had a tree with 1-10 and you needed to search for three, first the element at the top would be checked, probably a 5 or 6. Three would be less than that, so only the first half of the tree would then be checked. If the next value is 3, you have it; otherwise, a comparison is done, etc., until either it is not found or its data is returned. Thus the tree is fast for lookup, but not necessarily fast for insertion or deletion. These are very rough descriptions.
Linked List from Wikipedia, and Binary Search Tree, also from Wikipedia.
They are totally different data structures.
A linked list is a sequence of element where each element is linked to the next one, and in the case of a doubly linked list, the previous one.
A binary search tree is something totally different. It has a root node, the root node has up to two child nodes, and each child node can have up to two child nodes, etc. It is a pretty clever data structure, but it would be somewhat tedious to explain it here. Check out the Wikipedia article on it.