Re: Trees: maintaining pathnames

From: "Dan Langille" <dan(at)langille(dot)org>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Trees: maintaining pathnames
Date: 2002-11-20 20:20:15
Message-ID: 3DDBA82F.15041.BA07175D@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 17 Nov 2002 at 14:51, Josh Berkus wrote:

> Dan,
>
> > My existing tree implementation reflects the files contained on disk.
> > The
> > full pathname to a particlar file is obtained from the path to the
> > parent
> > directory. I am now considering putting this information into a
> > field in
> > the table.
> <snip>
> > Suggestions, comment, open ridicule, most welcome. thanks.
>
> This is a fine implementation using the adjacency list model of tree
> design. However, I think you may find that the string-based tree
> implementation in /contrib/ltree is more suited to your purposes, and
> easier to maintain.

That looks interesting. I have installed that onto a test server and
I'm playing around with it.[1] The contrib/ltree project implements
a tree via text parsing. Below I show the test data it created.

For my usage, I'm not sure I need it. I have implemented the
"Adjacency List" tree implementation (that's what I've been told).
In short, my tree contains three basic fields: id, name, parent_id.

Given that I'm considering adding a new field path_name to the tree,
I can't see the ltree package will give me anything more than I can
get from like. My main reason for adding path_name was doing queries
such as:

select * from tree where path_name like '/path/to/parent/%'

which will return me all the descendants of a give node (in this case
'/path/to/parent/'.[2]

I have discussed [offlist] the option of using a secondary table to
store the pathname (i.e. a cach table) which would be updated using a
loop in the tigger instead of using cascading triggers. I would
prefer to keep the pathname in the same table.

In my application, I have about 120,000 nodes in the tree. I am
using PL/pgSQL quite a lot. Perhaps moving the triggers to C at a
later date may provide a speed increase if the tree expands
considerably.

Also, it is noted that those triggers set the pathname twice, once in
the before, and once in the after trigger. I'll try to optimize that
for a future "release".

ltreetest=# \d
List of relations
Name | Type | Owner
------+-------+-------
test | table | dan
(1 row)

ltreetest=# select * from test;
path
-----------------------------------------------
Top
Top.Science
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
Top.Hobbies
Top.Hobbies.Amateurs_Astronomy
Top.Collections
Top.Collections.Pictures
Top.Collections.Pictures.Astronomy
Top.Collections.Pictures.Astronomy.Stars
Top.Collections.Pictures.Astronomy.Galaxies
Top.Collections.Pictures.Astronomy.Astronauts
(13 rows)

[1] - For other following on, I had to do the following:

- downloaded the 7.2 version of the code from
http://www.sai.msu.su/~megera/postgres/gist/ltree/

- installed using gmake not make
- grabbed the sample file from
http://developer.postgresql.org/cvsweb.cgi/pgsql-
server/contrib/ltree/ltreetest.sql

[2] - My application involves mirroring a file system (directories
and files). FWIW, in this instances, files are not renamed, they are
deleted and recreated elsewhere.
--
Dan Langille : http://www.langille.org/

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Neil Conway 2002-11-20 21:15:57 Re: Closing inactive connections OR user connections limits
Previous Message scott.marlowe 2002-11-20 19:32:40 Re: Closing inactive connections OR user connections limits