Nested Tree Model

This page written by David Medinets

NOTE: If you don't know the concepts behind the Nested Tree Model, you'll need to find a copy of Joe Celko's 'SQL for Smarties' or read the online article at http://www.intelligententerprise.com/001020/celko.shtml.

NOTE: The SQL in this document was developed in SQL Server. There is no guarantee that the SQL statements and stored procedures will work in other database environments.

Table of Contents

Updates

Mar 19, 2002
I've create a Java implementation of NTM that uses simple SQL statments so it should work with every database. If you find a database that doesn't accept the SQL, please let me know so that I can simplify even further.

Mar 18, 2000

  • Added @switch_to_internal_id so that applications where multiple parents aren't needed can ignore the InternalNodeID value. Please bear in mind that this parameter is not implemented for every routine. But it's easy to add if you need to - just check how it's done in the ntm_delete_node routine.
  • Added @ErrorMessage so that errors in the ntm routines are passed out to the calling program.
  • Added ntm_get_children procedure. The record set returns only the children of the current node, not the node itself.
  • Added @add_as_last_child parameter to ntm_insert_node. Fixed bug when inserting in alphabetic order. Correctly handle single quotes in the node name.
  • Added @move_to_first_child parameter to ntm_move_node.
  • Updated ntm_outdent_node to match the rules of MS Project.

    Mar 5, 2000

  • Updated the ntm_insert_node routine to insert node alphabetically when the @use_alphabetic_order parameter is set to 1.

    Feb 28, 2000

  • I've added a new field, called ntmLevel, to the NTM database table. This field is automatically maintained when nodes are inserted, indented, outdented, or moved.

    Feb 28, 2000

  • Nodes can now be deactivated and activated. This allows you to remove a node from active use, but still retain its relationships in the tree. The ability to delete nodes is still available.

    Feb 14, 2000

  • I've modified all of the routines so that you can specify a database table name instead of NTM_NODES. On the other hand, 'ntm_nodes' is the default value of the @node_table_name parameter, so other than using named parameter (and you already should be using named parameters) existing code won't be broken by this change.

    Feb 6, 2000

  • Added a TreeID field to the ntm_nodes table so that multiple trees can be managed. Added a ParentID field to the ntm_nodes table to make finding a node's parent very easy. Added an InternalNodeID so that even nodes with multiple parents can be moved.

    Feb 6, 2000

  • Having been inspired by Dean Saxe, I've rewritten the routines so that any node can have multiple parents. Unfortunately, the ntm_move_node routine hasn't been updated to handle multiple parents yet. Supporting multiple parents means that the same UUID value can appear in a graph more than once, so nleft values are used as routine parameters.

    Feb 6, 2000

  • The good programmers at NetProSys have reworked my ntm_move_node routine to eliminate the temporary table that I was using because I was too lazy to pursue a better algorithm. The new routine lets us Celko's order_okay constraint.

    Introduction

    This document talks about how to maintain a hierarchy using SQL Server stored procedures. In order to make the solution widely applicable, the generic term 'node' is used to represent each location in the hierarchy. For example, there are four nodes in the following hierarchy:

    Each node is uniquely identified by a Globally Unique Identifier (or GUID). However, when implementing a real-world solution the GUID field will probably be replaced with one or more foreign keys. For example, the GUID field might be replaced with project_id and task_id in a project management system.

    Creating the Node Table

    
    	CREATE TABLE 
    
    		ntm_nodes
    
    	(
    
    		-- The TreeID allows multiple trees in the ntm_nodes
    
    		-- table. So you could have two wholely separate menu
    
    		-- structures, for example.
    
    		TreeID            uniqueidentifier NOT NULL
    
    		-- The NodeID is a foreign key to locate details about
    
    		-- the node.
    
    		,NodeID           uniqueidentifier NOT NULL
    
    		-- The InternalNodeID facilitates multiple parent nodes.
    
    		-- When a node has multiple parents, it is has more than
    
    		-- one record in the ntm_nodes table. Therefore, we need
    
    		-- another way to uniquely identify each record.
    
    		,InternalNodeID   uniqueidentifier UNIQUE NOT NULL
    
    		-- The outdent_node procedure needs to know the parent
    
    		-- node of the node to be outdented. See the comments at the
    
    		-- end of this file. NULLs are allowed because the root of the
    
    		-- tree is represented by null.
    
    		,ParentID         uniqueidentifier NULL
    
    		-- The node name is only here so that testing the code
    
    		-- is easy. In production systems, it would not be needed.
    
    		,NodeName         varchar(255)     NOT NULL 
    
    		-- The nleft and nright fields are the heart of the Nested Tree Model.
    
    		-- Look elsewhere an explanation of them. Ye'll get no further help
    
    		-- here.
    
    		,nleft            int              NOT NULL 
    
    		,nright           int              NOT NULL 
    
    		,CHECK (nleft > 0)
    
    		,CHECK (nright > 0)
    
    		,CONSTRAINT order_okay CHECK (nleft < nright)
    
    	)
    
    

    In production systems, the name field serves no purpose - the information would be stored in the object table along with the rest of the object attributes. In this proof-of-concept implemention the name fields serves to simplify the code because the node table doesn't need be joined with an object table.

    NOTE: Celko suggests that the nleft and nright fields be marked as unique. Well, he's right. They are unique. However, the manipulations that the ntm_mode_node route performs cause the UNIQUE constraint to break.

    Inserting Nodes

    In order to insert a node exactly where it is needed, two pieces of information are needed. The location in the tree where the new node is to be placed and whether or not the new node is a child of the existing node or the sibling. Since we are working with computers, the node's location has to be specified exactly. I've chosen to do this by passing a node's id (the buddy node) to ntm_insert_node.

    The new node becomes a child or a sibling of that buddy node. The following table shows how nodes can be inserted. For simplicity's sake, the node is assumed to be a text string.

    Insert MUSIC node using buddy of NULL MUSIC 1 2
    Insert BOOKS node using buddy of MUSIC and an insertion type of AS_SIBLING MUSIC 1 2
    BOOKS 3 4
    Insert POLKA node using buddy of MUSIC and an insertion type of AS_CHILD MUSIC 1 4
      POLKA 2 3
    BOOKS 5 6

    NOTE: I called the node passed to the insertion procedure a buddy node because I can't think of a better term. If it helps, remember this phrase 'Hey BUDDY, insert the new node after me!'.

    Creating an Indented Display

    
    SELECT 
    
    	space(ntmLevel) + NodeName
    
    	,ntmLevel
    
    	,nleft
    
    	,nright
    
    FROM  
    
    	ntm_nodes
    
    order by
    
    	nleft
    
    

    As a point of historical interest; before the ntmLevel field was supported you needed to perform the above query using the following sql statement:

    
    SELECT 
    
    	space(COUNT(Child.nleft)) + child.NodeName
    
    	,COUNT(Child.nleft) AS lvl
    
    	,Child.nleft
    
    	,Child.nright
    
    FROM  
    
    	dbo.ntm_nodes Parent CROSS JOIN dbo.ntm_nodes Child
    
    WHERE 
    
    	Child.nleft BETWEEN Parent.nleft AND Parent.nright
    
    GROUP BY 
    
    	Child.nleft
    
    	,Child.nright
    
    	,Child.NodeName
    
    order by
    
    	child.nleft
    
    

    Deleting Nodes

    The ntm_delete_node stored procedure deletes a node and its children. Currently, you can't delete a node and attach its children to another node. For example, you can't do the following:

    Given this table: MUSIC      1 6
      POLKA    2 5
        GERMAN 3 4
    Transform to this table where the grandchild is 'promoted' to child. MUSIC  1 4
    GERMAN 2 3

    NOTE: While the ability to promote grandchildren to child might be very useful in some situations, it doesn't seem to be applicable to my needs, so I've ignored it. Deleting a node is accomplished by simply passing its node ID to ntm_delete_node.

    Indenting Nodes

    The act of indenting a node is the opposite of outdenting a node. The ntm_indent_node stored procedure demotes a node from being a sibling to being a child. For example:

    Given this table: MUSIC    1 4
    POLKA    2 3
    Demote POLKA to be a child of MUSIC MUSIC   1 4
      POLKA 2 3

    Since all children nodes must have a parent node it is not possible to further indent the POLKA node. However, the following example shows a situation where multiple indents are possible.

    Given this table MUSIC 1 4
      POLKA 2 3
    GERMAN 5 6
    Demote GERMAN to be a child of MUSIC. MUSIC 1 6
      POLKA 2 3
      GERMAN 4 5
    Demote GERMAN to be a child of POLKA. MUSIC 1 6
      POLKA 2 5
        GERMAN 3 4

    Again, we've a point where further indentation is impossible.

    NOTE: This implementation only indents whole sub-trees (ie, the specified node and all of its children). You can't indent a node without affecting its children.

    Outdenting Nodes

    The act of outdenting a node is the opposite of indenting a node. The ntm_outdent_node stored procedure promotes a node from being a child to being a sibling. For example:

    Given this table MUSIC 1 4
      POLKA 2 3
    Promote POLKA to be a sibling of MUSIC MUSIC 1 2
    POLKA 3 4

    Since the parent of POLKA is the (invisble) root node, it is not possible to further outdent the POLKA node. However, the following example shows a situation where multiple outdents are possible.

    Given this table MUSIC 1 6
      POLKA 2 5
        GERMAN 3 4
    Promote GERMAN to be a sibling of POLKA. MUSIC 1 6
      POLKA 2 3
      GERMAN 4 5
    Promote GERMAN to be a sibling of MUSIC. MUSIC 1 4
      POLKA 2 3
    GERMAN 5 6

    Again, we've a point where further outdentation is impossible.

    NOTE: This implementation only outdents whole sub-trees (ie, the specified node and all of its children). You can't outdent a node without affecting its children.

    Proof of Concept Stored Procedures

    ntm_indented_display - This shows how I test the different stored procedures.

    ntm_create_node_table

    ntm_insert_node
    ntm_delete_node

    ntm_indent_node
    ntm_outdent_node

    ntm_move_node

    ntm_deactivate_node
    ntm_activate_node

    ntm_get_children


    ntm_test_insert_node
    ntm_test_delete_node
    ntm_test_indent_node
    ntm_test_outdent_node
    ntm_test_move_node

    Finding the Parent Node

    When I first thought of locating the parent node, I came up with the following SQL:

    
    	select
    
    		@parent_task_id = id
    
    		,@parent_end_branch = nright 
    
    	from 
    
    		ntm_nodes 
    
    	where 
    
    		nleft = ( 
    
    			select 
    
    				max(parent.nleft) 
    
    			from 
    
    				ntm_nodes child
    
    				,ntm_nodes parent 
    
    			where 
    
    				child.nleft between parent.nleft and parent.nright 
    
    				and child.id = @node_to_outdent_id 
    
    				and parent.id != @node_to_outdent_id 
    
    		) 
    
    

    However, after further thought I like the following SQL better:

    
    	select top 1 
    
    		@parent_task_id = id
    
    		,@parent_end_branch = nright
    
    	from 
    
    		ntm_nodes 
    
    	where 
    
    		nleft < @outdent_start_branch 
    
    		and nright > @outdent_end_branch
    
    	order by 
    
    		nleft desc
    
    

    The advantage of eliminating the inner SELECT statement should be obvious. And I'll leave explanations of what the SQL is doing as a reader exercise.

    Feb 7, 2000 Update - I've added a ParentID field to each node this whole line of thought is now moot.

    How to Build a Hierarchy

    
    -- Declare variable types.
    
    declare	@node_table_name varchar(255)
    
    declare @TreeID uniqueidentifier
    
    
    
    -- Provide values to local variables.
    
    set @node_table_name = 'ntm_tasks'
    
    set @TreeID = NewID()
    
    
    
    -- Create (or recreate) the node table using NTM_NODES as the tablename.
    
    exec ntm_create_node_table
    
    
    
    exec ntm_test_insert_node 
    
    	@TreeID=@TreeID
    
    	,@buddy_name=null
    
    	,@node_name='Music'
    
    	,@insert_type='AS_SIBLING'
    
    
    
    exec ntm_test_insert_node 
    
    	@TreeID=@TreeID
    
    	,@buddy_name='music'
    
    	,@node_name='blues'
    
    	,@insert_type='AS_CHILD'
    
    
    
    exec ntm_test_insert_node 
    
    	@TreeID=@TreeID
    
    	,@buddy_name='music'
    
    	,@node_name='alternative'
    
    	,@insert_type='AS_SIBLING'
    
    
    
    exec ntm_test_insert_node 
    
    	@TreeID=@TreeID
    
    	,@buddy_name='alternative'
    
    	,@node_name='punk'
    
    	,@insert_type='AS_CHILD'
    
    
    
    -- Try various tree manipulations
    
    
    
    --exec ntm_test_insert_node @TreeID=@TreeID, 'alternative', 'blues', 'AS_CHILD'
    
    
    
    --exec ntm_test_delete_node @TreeID=@TreeID, @node_name='alternative'
    
    
    
    --exec ntm_test_indent_node @TreeID=@TreeID, @node_name='alternative'
    
    --exec ntm_test_indent_node @TreeID=@TreeID, @node_name='alternative'
    
    --exec ntm_test_indent_node @TreeID=@TreeID, @node_name='punk
    
    --exec ntm_test_outdent_node @TreeID=@TreeID, @node_name='punk'
    
    --exec ntm_test_outdent_node @TreeID=@TreeID, @node_name='punk'
    
    
    
    --exec ntm_test_move_node @TreeID=@TreeID, @move_node_name='alternative', @buddy_node_name='blues', @move_type='AS_SIBLING'
    
    --exec ntm_test_move_node @TreeID=@TreeID, @move_node_name='alternative', @buddy_node_name='blues', @move_type='AS_CHILD'