From: | Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Performance and doing USENET style threaded messages |
Date: | 2000-12-23 10:09:11 |
Message-ID: | 3.0.5.32.20001223180911.00960e80@192.228.128.13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi (Merry Christmas and Happy New Millennium everyone!),
I've been wondering what would be the best way to do things like threaded
messages on Postgresql- e.g. where you have the usual parent-child
relationships between entities- each entity has a unique id and a parent id.
So far I'm considering two reasonable options.
<options>
Option 1 - do it the obvious "proper way" - have the application do selects
recursively.
e.g. To select child items
<pseudocode>
traverse ( id, orderspec) {
select id,subdata from tablename where parentid=id order by orderspec
for each id
display(id,subdata)
traverse(id, orderspec)
}
</pseudocode>
Option 2 - do it the kludgy limited way.
Keep the same structure as Option 1 but add a new text column I'll call
geneaology (or gene for short).
For a structure like
1
|_
| |
2 3
|_
| |
4 5
Item 4 will have gene=00000001,00000003,00000004,
Where the ids are all zeropadded _hexadecimal_ and delimited by commas (for
easier human processing if necessary).
So to select all items with parent id=3 (including item id=3) one would use
a query like:
select * from tablename where gene like '00000001,00000003,%' order by
orderspec;
Option 2.1 - use a combination of the two, e.g. use option 1 when the depth
gets crazy.
Option X - any ideas?
</options>
The trouble with option 1 is it seems it will be quite slow when you have
lots of items at significant depths.
The trouble with option 2 is the geneaology column can get quite huge (9 X
depth bytes) if the depth increases (e.g. there is a flame war ;) ). What
is the recommended max indexable text length in Postgresql 7.0? If it won't
index well then option 1 may actually be better.
If option 2 indexes well, I'm thinking of using it and limiting the depth
to say maybe a hundred. I figure in practice it's time for a new thread
once the depth is > 100 - it's mutated enough to warrant a new geneaology ;).
Oracle has a "start with" feature, but even so I'm not sure if the
performance is better that way - it may just be for application level
convenience.
Any comments or suggestions welcome.
Thanks,
Link.
From | Date | Subject | |
---|---|---|---|
Next Message | Lincoln Yeoh | 2000-12-23 10:14:26 | Database I/O and other performance questions. |
Previous Message | Tom Lane | 2000-12-23 05:57:36 | Re: allowing users access to a trusted C function |