Nested Tree Model

This page written by David Medinets

NOTE: If you don't know the concepts behind the Nested Tree Model, you'll need to find a copy of Joe Celko's 'SQL for Smarties' or read the online article at http://www.intelligententerprise.com/001020/celko.shtml.

NOTE: The SQL in this document was developed in SQL Server. There is no guarantee that the SQL statements and stored procedures will work in other database environments.

Updates
Introduction
The Java Implementation
Creating the Node Table
Creating an Indented Display
Inserting Nodes
Deleting Nodes
Indenting Nodes
Outdenting Nodes

Proof of Concept Stored Procedures
Finding the Parent Node
How to Build a Hierarchy

Updates

Mar 19, 2002
I've create a Java implementation of NTM that uses simple SQL statments so it should work with every database. If you find a database that doesn't accept the SQL, please let me know so that I can simplify even further.

Mar 18, 2000

Added @switch_to_internal_id so that applications where multiple parents aren't needed can ignore the InternalNodeID value. Please bear in mind that this parameter is not implemented for every routine. But it's easy to add if you need to - just check how it's done in the ntm_delete_node routine.

Added @ErrorMessage so that errors in the ntm routines are passed out to the calling program.

Added ntm_get_children procedure. The record set returns only the children of the current node, not the node itself.

Added @add_as_last_child parameter to ntm_insert_node. Fixed bug when inserting in alphabetic order. Correctly handle single quotes in the node name.

Added @move_to_first_child parameter to ntm_move_node.

Updated ntm_outdent_node to match the rules of MS Project.

Mar 5, 2000

Updated the ntm_insert_node routine to insert node alphabetically when the @use_alphabetic_order parameter is set to 1.

Feb 28, 2000

I've added a new field, called ntmLevel, to the NTM database table. This field is automatically maintained when nodes are inserted, indented, outdented, or moved.

Feb 28, 2000

Nodes can now be deactivated and activated. This allows you to remove a node from active use, but still retain its relationships in the tree. The ability to delete nodes is still available.

Feb 14, 2000

I've modified all of the routines so that you can specify a database table name instead of NTM_NODES. On the other hand, 'ntm_nodes' is the default value of the @node_table_name parameter, so other than using named parameter (and you already should be using named parameters) existing code won't be broken by this change.

Feb 6, 2000

Added a TreeID field to the ntm_nodes table so that multiple trees can be managed. Added a ParentID field to the ntm_nodes table to make finding a node's parent very easy. Added an InternalNodeID so that even nodes with multiple parents can be moved.

Feb 6, 2000

Having been inspired by Dean Saxe, I've rewritten the routines so that any node can have multiple parents. Unfortunately, the ntm_move_node routine hasn't been updated to handle multiple parents yet. Supporting multiple parents means that the same UUID value can appear in a graph more than once, so nleft values are used as routine parameters.

Feb 6, 2000

The good programmers at NetProSys have reworked my ntm_move_node routine to eliminate the temporary table that I was using because I was too lazy to pursue a better algorithm. The new routine lets us Celko's order_okay constraint.

Introduction

This document talks about how to maintain a hierarchy using SQL Server stored procedures. In order to make the solution widely applicable, the generic term 'node' is used to represent each location in the hierarchy. For example, there are four nodes in the following hierarchy:

BOOKS

SCIENCE FICTION

ISAAC ASMIMOV
ROMANCE

Each node is uniquely identified by a Globally Unique Identifier (or GUID). However, when implementing a real-world solution the GUID field will probably be replaced with one or more foreign keys. For example, the GUID field might be replaced with project_id and task_id in a project management system.

Creating the Node Table


	CREATE TABLE 

		ntm_nodes

	(

		-- The TreeID allows multiple trees in the ntm_nodes

		-- table. So you could have two wholely separate menu

		-- structures, for example.

		TreeID            uniqueidentifier NOT NULL

		-- The NodeID is a foreign key to locate details about

		-- the node.

		,NodeID           uniqueidentifier NOT NULL

		-- The InternalNodeID facilitates multiple parent nodes.

		-- When a node has multiple parents, it is has more than

		-- one record in the ntm_nodes table. Therefore, we need

		-- another way to uniquely identify each record.

		,InternalNodeID   uniqueidentifier UNIQUE NOT NULL

		-- The outdent_node procedure needs to know the parent

		-- node of the node to be outdented. See the comments at the

		-- end of this file. NULLs are allowed because the root of the

		-- tree is represented by null.

		,ParentID         uniqueidentifier NULL

		-- The node name is only here so that testing the code

		-- is easy. In production systems, it would not be needed.

		,NodeName         varchar(255)     NOT NULL 

		-- The nleft and nright fields are the heart of the Nested Tree Model.

		-- Look elsewhere an explanation of them. Ye'll get no further help

		-- here.

		,nleft            int              NOT NULL 

		,nright           int              NOT NULL 

		,CHECK (nleft > 0)

		,CHECK (nright > 0)

		,CONSTRAINT order_okay CHECK (nleft < nright)

	)

In production systems, the name field serves no purpose - the information would be stored in the object table along with the rest of the object attributes. In this proof-of-concept implemention the name fields serves to simplify the code because the node table doesn't need be joined with an object table.

NOTE: Celko suggests that the nleft and nright fields be marked as unique. Well, he's right. They are unique. However, the manipulations that the ntm_mode_node route performs cause the UNIQUE constraint to break.

Inserting Nodes

In order to insert a node exactly where it is needed, two pieces of information are needed. The location in the tree where the new node is to be placed and whether or not the new node is a child of the existing node or the sibling. Since we are working with computers, the node's location has to be specified exactly. I've chosen to do this by passing a node's id (the buddy node) to ntm_insert_node.

The new node becomes a child or a sibling of that buddy node. The following table shows how nodes can be inserted. For simplicity's sake, the node is assumed to be a text string.

Insert MUSIC node using buddy of NULL	`MUSIC 1 2`
Insert BOOKS node using buddy of MUSIC and an insertion type of AS_SIBLING	`MUSIC 1 2 BOOKS 3 4`
Insert POLKA node using buddy of MUSIC and an insertion type of AS_CHILD	`MUSIC 1 4 POLKA 2 3 BOOKS 5 6`

NOTE: I called the node passed to the insertion procedure a buddy node because I can't think of a better term. If it helps, remember this phrase 'Hey BUDDY, insert the new node after me!'.

Creating an Indented Display


SELECT 

	space(ntmLevel) + NodeName

	,ntmLevel

	,nleft

	,nright

FROM  

	ntm_nodes

order by

	nleft

As a point of historical interest; before the ntmLevel field was supported you needed to perform the above query using the following sql statement:


SELECT 

	space(COUNT(Child.nleft)) + child.NodeName

	,COUNT(Child.nleft) AS lvl

	,Child.nleft

	,Child.nright

FROM  

	dbo.ntm_nodes Parent CROSS JOIN dbo.ntm_nodes Child

WHERE 

	Child.nleft BETWEEN Parent.nleft AND Parent.nright

GROUP BY 

	Child.nleft

	,Child.nright

	,Child.NodeName

order by

	child.nleft

Deleting Nodes

The ntm_delete_node stored procedure deletes a node and its children. Currently, you can't delete a node and attach its children to another node. For example, you can't do the following:

Given this table:	`MUSIC 1 6 POLKA 2 5 GERMAN 3 4`
Transform to this table where the grandchild is 'promoted' to child.	`MUSIC 1 4 GERMAN 2 3`

NOTE: While the ability to promote grandchildren to child might be very useful in some situations, it doesn't seem to be applicable to my needs, so I've ignored it. Deleting a node is accomplished by simply passing its node ID to ntm_delete_node.

Indenting Nodes

The act of indenting a node is the opposite of outdenting a node. The ntm_indent_node stored procedure demotes a node from being a sibling to being a child. For example:

Given this table:	`MUSIC 1 4 POLKA 2 3`
Demote POLKA to be a child of MUSIC	`MUSIC 1 4 POLKA 2 3`

Since all children nodes must have a parent node it is not possible to further indent the POLKA node. However, the following example shows a situation where multiple indents are possible.

Given this table	`MUSIC 1 4 POLKA 2 3 GERMAN 5 6`
Demote GERMAN to be a child of MUSIC.	`MUSIC 1 6 POLKA 2 3 GERMAN 4 5`
Demote GERMAN to be a child of POLKA.	`MUSIC 1 6 POLKA 2 5 GERMAN 3 4`

Again, we've a point where further indentation is impossible.

NOTE: This implementation only indents whole sub-trees (ie, the specified node and all of its children). You can't indent a node without affecting its children.

Outdenting Nodes

The act of outdenting a node is the opposite of indenting a node. The ntm_outdent_node stored procedure promotes a node from being a child to being a sibling. For example:

Given this table	`MUSIC 1 4 POLKA 2 3`
Promote POLKA to be a sibling of MUSIC	`MUSIC 1 2 POLKA 3 4`

Since the parent of POLKA is the (invisble) root node, it is not possible to further outdent the POLKA node. However, the following example shows a situation where multiple outdents are possible.

Given this table	`MUSIC 1 6 POLKA 2 5 GERMAN 3 4`
Promote GERMAN to be a sibling of POLKA.	`MUSIC 1 6 POLKA 2 3 GERMAN 4 5`
Promote GERMAN to be a sibling of MUSIC.	`MUSIC 1 4 POLKA 2 3 GERMAN 5 6`

Again, we've a point where further outdentation is impossible.

NOTE: This implementation only outdents whole sub-trees (ie, the specified node and all of its children). You can't outdent a node without affecting its children.

Proof of Concept Stored Procedures

ntm_indented_display - This shows how I test the different stored procedures.

ntm_create_node_table

ntm_insert_node
ntm_delete_node

ntm_indent_node
ntm_outdent_node

ntm_move_node

ntm_deactivate_node
ntm_activate_node

ntm_get_children

ntm_test_insert_node
ntm_test_delete_node
ntm_test_indent_node
ntm_test_outdent_node
ntm_test_move_node

Finding the Parent Node

When I first thought of locating the parent node, I came up with the following SQL:


	select

		@parent_task_id = id

		,@parent_end_branch = nright 

	from 

		ntm_nodes 

	where 

		nleft = ( 

			select 

				max(parent.nleft) 

			from 

				ntm_nodes child

				,ntm_nodes parent 

			where 

				child.nleft between parent.nleft and parent.nright 

				and child.id = @node_to_outdent_id 

				and parent.id != @node_to_outdent_id 

		)

However, after further thought I like the following SQL better:


	select top 1 

		@parent_task_id = id

		,@parent_end_branch = nright

	from 

		ntm_nodes 

	where 

		nleft < @outdent_start_branch 

		and nright > @outdent_end_branch

	order by 

		nleft desc

The advantage of eliminating the inner SELECT statement should be obvious. And I'll leave explanations of what the SQL is doing as a reader exercise.

Feb 7, 2000 Update - I've added a ParentID field to each node this whole line of thought is now moot.

How to Build a Hierarchy


-- Declare variable types.

declare	@node_table_name varchar(255)

declare @TreeID uniqueidentifier



-- Provide values to local variables.

set @node_table_name = 'ntm_tasks'

set @TreeID = NewID()



-- Create (or recreate) the node table using NTM_NODES as the tablename.

exec ntm_create_node_table



exec ntm_test_insert_node 

	@TreeID=@TreeID

	,@buddy_name=null

	,@node_name='Music'

	,@insert_type='AS_SIBLING'



exec ntm_test_insert_node 

	@TreeID=@TreeID

	,@buddy_name='music'

	,@node_name='blues'

	,@insert_type='AS_CHILD'



exec ntm_test_insert_node 

	@TreeID=@TreeID

	,@buddy_name='music'

	,@node_name='alternative'

	,@insert_type='AS_SIBLING'



exec ntm_test_insert_node 

	@TreeID=@TreeID

	,@buddy_name='alternative'

	,@node_name='punk'

	,@insert_type='AS_CHILD'



-- Try various tree manipulations



--exec ntm_test_insert_node @TreeID=@TreeID, 'alternative', 'blues', 'AS_CHILD'



--exec ntm_test_delete_node @TreeID=@TreeID, @node_name='alternative'



--exec ntm_test_indent_node @TreeID=@TreeID, @node_name='alternative'

--exec ntm_test_indent_node @TreeID=@TreeID, @node_name='alternative'

--exec ntm_test_indent_node @TreeID=@TreeID, @node_name='punk

--exec ntm_test_outdent_node @TreeID=@TreeID, @node_name='punk'

--exec ntm_test_outdent_node @TreeID=@TreeID, @node_name='punk'



--exec ntm_test_move_node @TreeID=@TreeID, @move_node_name='alternative', @buddy_node_name='blues', @move_type='AS_SIBLING'

--exec ntm_test_move_node @TreeID=@TreeID, @move_node_name='alternative', @buddy_node_name='blues', @move_type='AS_CHILD'

Nested Tree Model

Table of Contents

Updates

Introduction

Creating the Node Table

Inserting Nodes

Creating an Indented Display

Deleting Nodes

Indenting Nodes

Outdenting Nodes

Proof of Concept Stored Procedures

Finding the Parent Node

How to Build a Hierarchy