Re: How to represent a tree-structure in a relational database

From: Tulassay Zsolt <zsolt(at)tek(dot)bke(dot)hu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: sqllist <pgsql-sql(at)postgresql(dot)org>
Subject: Re: How to represent a tree-structure in a relational database
Date: 2000-12-14 13:02:39
Message-ID: Pine.LNX.4.10.10012141346220.10051-100000@tek.tek.bke.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


There actually is a model of tree structures in SQL databases which is
different from those mentioned earlier in that it represents the tree
as nested sets (ie. nodes are subsets of parent sets (parent nodes)).

There is a huge advantage in this model as it eliminates the need for
recursion. For example, if you want to get a specific node's parents,
grandparents and all other parents up the tree, you can do this in
_one single_ SELECT query. If you just stored the the id of the parent
of each node, you would need (n-1) queries if the node is at the n-th
level.

However, inserting or deleting nodes is more complex this way since you
have to keep the data structure "balanced". But in most cases you won't
insert new nodes all the time so you won't suffer from this overhead. And
on the other hand, the performance gain on queries might be enormous.

You can find the article dealing with this at
http://www.utdt.edu/~mig/sql-trees

Actually what I did was to implement _both_ models at the same time. I
took the nested set structure, and besides stored the parent id of all
nodes as well. The benefit of this is that by accepting a little more
overhead during inserts, I was able to use the advantages of both models.

If somebody is interested in it, I might share the code (a few tables
and some plpgsql functions). But the article is surely worth a read.

Zsolt

ps: I found the link a few months ago in the pgsql-sql archive :-)

On Wed, 13 Dec 2000, Josh Berkus wrote:

> Stuart,
>
> > I don't think I'd be comfortable with having the node_level column in the
> > table structure. First, because you can derive that value using a function,
> > it's duplicate data. Second, if you decide to take an entire segment of your
> > hierarchy and move it under another node (by changing the value of
> > node_linkup/ParentCategoryID), you'll need to recalculate all of those
> > node_level values. And all the node_level values underneath it.
>
> I can see that. I suppose it depends on the data you're storing. The
> project I was working on tracked grocery inventory for a delivery
> service, and thus each item had a fixed "level" in the heirarcy (Food
> Class, Food Type, Manufacturer, and Item) and thus while items might get
> reassigned *across* the heirarcy, they did not get re-assigned *up and
> down* the heirarcy.
>
> Also, I can't think of a way to represent the tree in pure SQL without
> having the level identifiers (and a fixed number of levels).
>
> > We've done a similar thing for Java. It was ridiculously easy to create a
> > TreeModel wrapped around this data. Almost too easy; it made me feel dirty.
>
> Great. Maybe I'll buy it from you if I ever need to use Java :-)
>
> -Josh
>
> --
> ______AGLIO DATABASE SOLUTIONS___________________________
> Josh Berkus
> Complete information technology josh(at)agliodbs(dot)com
> and data management solutions (415) 436-9166
> for law firms, small businesses fax 436-0137
> and non-profit organizations. pager 338-4078
> San Francisco
>

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Jan Wieck 2000-12-14 13:08:48 Re: Strange slow behavior in backend
Previous Message Karel Zak 2000-12-14 12:49:24 Re: to_char() causes backend to close connection