Trees are one of the most commonly used data structures in web development. This statement holds true for both developers and users. Every web developer who has written HTML and loaded it into a web browser has created a tree, which is referred to as the Document Object Model (DOM). Every user of the Internet who has, in turn, consumed information on the Internet has received it in the form of a tree—the DOM.
Now, here's the climax: The article that you are reading at this moment is rendered in your browser as a tree! The paragraph that you are reading is represented as text in a <p>
element; the <p>
element is nested inside a <body>
element; and the <body>
element is nested inside an <html>
element.
The nesting of data is similar to a family tree. The <html>
element is a parent, the <body>
element is a child, and the <p>
element is a child of the <body>
element. If this analogy of a tree seems useful to you, then you will find comfort in knowing that more analogies will be used during our implementation of a tree.
In this article, we will create a tree using two different methods of tree traversal: Depth-First Search (DFS) and Breadth-First Search (BFS). (If the word traversal is unfamiliar to you, consider it to mean visiting every node of the tree.) Both of these types of traversals highlight different ways of interacting with a tree; both travels, moreover, incorporate the use of data structures that we've covered in this series. DFS uses a stack and BFS uses a queue to visit nodes. That's cool!
Tree (Depth-First Search and Breadth-First Search)
In computer science, a tree is a data structure that simulates hierarchical data with nodes. Each node of a tree holds its own data and pointers to other nodes.
The terminology of nodes and pointers may be new to some readers, so let's describe them further with an analogy. Let's compare a tree to an organizational chart. The chart has a top-level position (root node), such as CEO. Directly underneath this position are other positions, such as vice president (VP).
To represent this relationship, we use arrows that point from the CEO to a VP. A position, such as the CEO, is a node; the relationship we create from a CEO to a VP is a pointer. To create more relationships in our organizational chart, we just repeat this process—we have a node point to another node.
On a conceptual level, I hope that nodes and pointers make sense. On a practical level, we can benefit from using a more technical example. Let's consider the DOM. A DOM has an <html>
element as its top-level position (root node). This node points to a <head>
element and a <body>
element. This process is repeated for all nodes in the DOM.
One of the beauties of this design is the ability to nest nodes: a <ul>
element, for instance, can have many <li>
elements nested inside it; moreover, each <li>
element can have sibling <li>
nodes. That's weird, yet funny and true!
Operations of a Tree
Since every tree contains nodes, which can be a separate constructor from a tree, we will outline the operations of both constructors: Node
and Tree
.
Node
-
data
stores a value. -
parent
points to a node's parent. -
children
points to the next node in the list.
Tree
-
_root
points to the root node of a tree. -
traverseDF(callback)
traverses nodes of a tree with DFS. -
traverseBF(callback)
traverses nodes of a tree with BFS. -
contains(data, traversal)
searches for a node in a tree. -
add(data, toData, traverse)
adds a node to a tree. -
remove(child, parent)
removes a node in a tree.
Implementation of a Tree
Now let's write the code for a tree!
Properties of a Node
For our implementation, we will first define a function named Node
and then a constructor named Tree
.
function Node(data) { this.data = data; this.parent = null; this.children = []; }
Every instance of Node contains three properties: data
, parent
, and children
. The first property holds data associated with a node. The second property points to one node. The third property points to many children nodes.
Properties of a Tree
Now, let's define our constructor for Tree
, which includes the Node
constructor in its definition:
function Tree(data) { var node = new Node(data); this._root = node; }
Tree
contains two lines of code. The first line creates a new instance of Node
; the second line assigns node
as the root of a tree.
The definitions of Tree
and Node
require only a few lines of code. These lines, however, are enough to help us simulate hierarchical data. To prove this point, let's use some example data to create an instance of Tree
(and, indirectly, Node
).
var tree = new Tree('CEO'); // {data: 'CEO', parent: null, children: []} tree._root;
Thanks to the existence of parent
and children
, we can add nodes as children of _root
and also assign _root
as the parents of those nodes. In other words, we can simulate the creation of hierarchical data.
Methods of a Tree
Next, we will create the following five methods:
Tree
traverseDF(callback)
traverseBF(callback)
contains(data, traversal)
add(child, parent)
remove(node, parent)
Since every method of a tree requires us to traverse a tree, we will first implement methods that define different types of tree traversal. (Traversing a tree is a formal way of saying visiting every node of a tree.)
1 of 5: traverseDF(callback)
This method traverses a tree with depth-first search.
Tree.prototype.traverseDF = function(callback) { // this is a recurse and immediately-invoking function (function recurse(currentNode) { // step 2 for (var i = 0, length = currentNode.children.length; i < length; i++) { // step 3 recurse(currentNode.children[i]); } // step 4 callback(currentNode); // step 1 })(this._root); };
traverseDF(callback)
has a parameter named callback
. If it's unclear from the name, callback
is presumed to be a function, which will be called later in traverseDF(callback)
.
The body of traverseDF(callback)
includes another function named recurse
. This function is a recursive function! In other words, it is self-invoking and self-terminating. Using the steps mentioned in the comments of recurse
, I'll describe the general process that recurse
uses to traverse the entire tree.
Here are the steps:
- Immediately invoke
recurse
with the root node of a tree as its argument. At this moment,currentNode
points to the current node. - Enter a
for
loop and iterate once for each child ofcurrentNode
, beginning with the first child. - Inside the body of the
for
loop, invoke recurse with a child ofcurrentNode
. The exact child depends on the current iteration of thefor
loop. - When
currentNode
no longer has children, we exit thefor
loop and invoke thecallback
we passed during the invocation oftraverseDF(callback)
.
Steps 2 (self-terminating), 3 (self-invoking), and 4 (callback) repeat until we traverse every node of a tree.
Recursion is a very hard topic to teach and requires an entire article to adequately explain it. Since the explanation of recursion isn't the focus of this article—the focus is implementing a tree—I'll suggest that any readers lacking a good grasp of recursion do the following two things.
First, experiment with our current implementation of traverseDF(callback)
and try to understand to a degree how it works. Second, if you want me to write an article about recursion, then please request it in the comments of this article.
The following example demonstrates how a tree would be traversed with traverseDF(callback)
. To traverse a tree, I'll create one in the example below. The approach that I will use at this moment isn't ideal, but it works. A better approach would be to use add(value)
, which we will implement in step 4 of 5.
var tree = new Tree('one'); tree._root.children.push(new Node('two')); tree._root.children[0].parent = tree; tree._root.children.push(new Node('three')); tree._root.children[1].parent = tree; tree._root.children.push(new Node('four')); tree._root.children[2].parent = tree; tree._root.children[0].children.push(new Node('five')); tree._root.children[0].children[0].parent = tree._root.children[0]; tree._root.children[0].children.push(new Node('six')); tree._root.children[0].children[1].parent = tree._root.children[0]; tree._root.children[2].children.push(new Node('seven')); tree._root.children[2].children[0].parent = tree._root.children[2]; /* creates this tree one ├── two │ ├── five │ └── six ├── three └── four └── seven */
Now, let's invoke traverseDF(callback)
.
tree.traverseDF(function(node) { console.log(node.data) }); /* logs the following strings to the console 'five' 'six' 'two' 'three' 'seven' 'four' 'one' */
2 of 5: traverseBF(callback)
This method uses breadth-first search to traverse a tree.
The difference between depth-first search and breadth-first search involves the sequence that nodes of a tree visit. To illustrate this point, let's use the tree we created from traverseDF(callback)
.
/* tree one (depth: 0) ├── two (depth: 1) │ ├── five (depth: 2) │ └── six (depth: 2) ├── three (depth: 1) └── four (depth: 1) └── seven (depth: 2) */
Now, let's pass traverseBF(callback)
the same callback we used for traverseDF(callback)
.
tree.traverseBF(function(node) { console.log(node.data) }); /* logs the following strings to the console 'one' 'two' 'three' 'four' 'five' 'six' 'seven' */
The logs from the console and the diagram of our tree reveal a pattern about breadth-first search. Start with the root node; then travel one depth and visit every node in that depth from left to right. Repeat this process until there are no more depths to travel.
Since we have a conceptual model of breadth-first search, let's now implement the code that would make our example work.
Tree.prototype.traverseBF = function(callback) { var queue = new Queue(); queue.enqueue(this._root); currentTree = queue.dequeue(); while(currentTree){ for (var i = 0, length = currentTree.children.length; i < length; i++) { queue.enqueue(currentTree.children[i]); } callback(currentTree); currentTree = queue.dequeue(); } };
Our definition of traverseBF(callback)
contains a lot of logic. For this reason, I'll explain the logic in manageable steps:
- Create an instance of
Queue
. - Add the node that invoked
traverseBF(callback)
to the instance ofQueue
. - Declare a variable named
currentNode
and initialize it to thenode
we just added to our queue. - While
currentNode
points to a node, execute the code inside thewhile
loop. - Use a
for
loop to iterate on the children ofcurrentNode
. - Inside the body of the
for
loop, add every child to the queue. - Take
currentNode
and pass it as an argument ofcallback
. - Reassign
currentNode
to the node being removed from the queue. - Until
currentNode
does not point to a node—every node in the tree has been visited—repeat steps 4 to 8.
3 of 5: contains(callback, traversal)
Let's define a method that will allow us to search for a particular value in our tree. To use either of our methods of tree traversal, I've defined contains(callback, traversal)
to accept two arguments: the data to search and the type of traversal.
Tree.prototype.contains = function(callback, traversal) { traversal.call(this, callback); };
In the body of contains(callback, traversal)
, we use a method named call
to pass this
and callback
. The first argument binds traversal
to the tree that invoked contains(callback, traversal)
; the second argument is a function that is invoked on every node in our tree.
Imagine that we want to log to the console any nodes that contain data with an odd number and traverse every node in our tree with BFS. This is the code we would write:
// tree is an example of a root node tree.contains(function(node){ if (node.data === 'two') { console.log(node); } }, tree.traverseBF);
4 of 5: add(data, toData, traversal)
We now have a method to search for a specific node in our tree. Let's now define a method that will enable us to add a node to a specific node.
Tree.prototype.add = function(data, toData, traversal) { var child = new Node(data), parent = null, callback = function(node) { if (node.data === toData) { parent = node; } }; this.contains(callback, traversal); if (parent) { parent.children.push(child); child.parent = parent; } else { throw new Error('Cannot add node to a non-existent parent.'); } };
add(data, toData, traversal)
defines three parameters. The first parameter, data
, is used to create a new instance of Node
. The second parameter, toData
, is used to compare against every node in a tree. The third parameter, traversal
, is the type of tree traversal used in this method.
In the body of add(data, toData, traversal)
, we declare three variables. The first variable, child
, is initialized as a new instance of Node
. The second variable, parent
, is initialized as null
; but it will later point to any node in a tree that matches the value of toData
. The reassignment of parent
happens in the third variable we declare, which is callback
.
callback
is a function that compares toData
to every node's data
property. If the if
statement evaluates to true
, then parent
is assigned to the node that matched the comparison in the if
statement.
The actual comparison of every node to toData
occurs in contains(callback, traversal)
. The type of traversal and callback
must be passed as arguments of contains(callback, traversal)
.
Finally, if parent
does exist in the tree, we push child
into parent.children
; we also assign parent
to the parent of child
. Else, we throw an error.
Let's use add(data, toData, traversal)
in an example:
var tree = new Tree('CEO'); tree.add('VP of Happiness', 'CEO', tree.traverseBF); /* our tree 'CEO' └── 'VP of Happiness' */
Here's a more complex example of add(addData, toData, traversal)
:
var tree = new Tree('CEO'); tree.add('VP of Happiness', 'CEO', tree.traverseBF); tree.add('VP of Finance', 'CEO', tree.traverseBF); tree.add('VP of Sadness', 'CEO', tree.traverseBF); tree.add('Director of Puppies', 'VP of Finance', tree.traverseBF); tree.add('Manager of Puppies', 'Director of Puppies', tree.traverseBF); /* tree 'CEO' ├── 'VP of Happiness' ├── 'VP of Finance' │ ├── 'Director of Puppies' │ └── 'Manager of Puppies' └── 'VP of Sadness' */
5 of 5: remove(data, fromData, traversal)
To complete our implementation of Tree
, we will add a method named remove(data, fromData, traversal)
. Similar to removing a node from the DOM, this method will remove a node and all of its children.
Tree.prototype.remove = function(data, fromData, traversal) { var tree = this, parent = null, childToRemove = null, index; var callback = function(node) { if (node.data === fromData) { parent = node; } }; this.contains(callback, traversal); if (parent) { index = findIndex(parent.children, data); if (index === undefined) { throw new Error('Node to remove does not exist.'); } else { childToRemove = parent.children.splice(index, 1); } } else { throw new Error('Parent does not exist.'); } return childToRemove; };
Similar to add(data, toData, traversal)
, remove traverses a tree to find a node that contains the second argument, which is now fromData
. If this node is found, then parent
points to it.
At this moment, we reach our first if
statement. If parent
does not exist, we throw an error. If parent
does exist, we invoke findIndex()
with parent.children
and the data we want to remove from the children nodes of parent
. (findIndex()
is a helper method that I defined below.)
function findIndex(arr, data) { var index; for (var i = 0; i < arr.length; i++) { if (arr[i].data === data) { index = i; } } return index; }
Inside findIndex()
, the following logic occurs. If any of the nodes in parent.children
contain data that match data
, the variable index
is assigned an integer. If none of the children's data properties match data
, then index
retains its default value of undefined
. On the last line of findIndex()
, we return index
.
We now return to remove(data, fromData, traversal)
. If index
is undefined
, an error is thrown. If index
is defined, we use it to splice the node we want to remove from the children of parent
; we also assign the removed child to childToRemove
.
Finally, we return childToRemove
.
Complete Implementation of a Tree
Our implementation of Tree
is complete. Take a look—we've accomplished a lot:
function Node(data) { this.data = data; this.parent = null; this.children = []; } function Tree(data) { var node = new Node(data); this._root = node; } Tree.prototype.traverseDF = function(callback) { // this is a recurse and immediately-invoking function (function recurse(currentNode) { // step 2 for (var i = 0, length = currentNode.children.length; i < length; i++) { // step 3 recurse(currentNode.children[i]); } // step 4 callback(currentNode); // step 1 })(this._root); }; Tree.prototype.traverseBF = function(callback) { var queue = new Queue(); queue.enqueue(this._root); currentTree = queue.dequeue(); while(currentTree){ for (var i = 0, length = currentTree.children.length; i < length; i++) { queue.enqueue(currentTree.children[i]); } callback(currentTree); currentTree = queue.dequeue(); } }; Tree.prototype.contains = function(callback, traversal) { traversal.call(this, callback); }; Tree.prototype.add = function(data, toData, traversal) { var child = new Node(data), parent = null, callback = function(node) { if (node.data === toData) { parent = node; } }; this.contains(callback, traversal); if (parent) { parent.children.push(child); child.parent = parent; } else { throw new Error('Cannot add node to a non-existent parent.'); } }; Tree.prototype.remove = function(data, fromData, traversal) { var tree = this, parent = null, childToRemove = null, index; var callback = function(node) { if (node.data === fromData) { parent = node; } }; this.contains(callback, traversal); if (parent) { index = findIndex(parent.children, data); if (index === undefined) { throw new Error('Node to remove does not exist.'); } else { childToRemove = parent.children.splice(index, 1); } } else { throw new Error('Parent does not exist.'); } return childToRemove; }; function findIndex(arr, data) { var index; for (var i = 0; i < arr.length; i++) { if (arr[i].data === data) { index = i; } } return index; }
Conclusion
Trees simulate hierarchical data. Much of the world around us resembles this type of hierarchy, such as web pages and our families. Any time you find yourself in need of structuring data with hierarchy, consider using a tree.
Comments