optimizing a query over tree-like structure

From: az(at)svilendobrev(dot)com
To: pgsql-sql(at)postgresql(dot)org
Subject: optimizing a query over tree-like structure
Date: 2008-09-30 08:32:51
Message-ID: 200809301132.51493.az@svilendobrev.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

hi.
sorry for the vague syntax used below, but the query is huge so i've
tried to present it in simple terms. And sorry if i'm doing obviously
stupid things, i have lots of years programming behind me but NO sql
involved.
i have a somewhat tree-like structure of objects that link to each
other via many2many associations. it looks like:
(N is "root")
N links to R,P,F
R links to F
P links to O,F
O links to O,F #recursively
F links to Z
All links to F but the one in O are "shortcuts", to avoid looking it
up recursively.
each of these objects has some associated values (again many2many,
ownership).

what i want is to get all the values related to a given N and its
sublevels, in one query.

one variant of what i've invented so far is (~pseudocode, no recursion
on O):

SELECT ownership.*, value.*
FROM Nazn, mm_N2P, mm_P2O, mm_O2O, mm_O2O AS mm_O2O1, mm_N2Z,
ownership JOIN value ON ownership.value = value.dbid
WHERE (
N.dbid = ownership.N
OR
N.dbid = mm_N2R.left AND mm_N2R.right = ownership.R
OR
N.dbid = mm_N2P.left AND (
mm_N2P.right = ownership.P
OR
mm_N2P.right = mm_P2O.left AND (
mm_P2O.right = ownership.O
OR
mm_P2O.right = mm_O2O.left AND (
mm_O2O.right = ownership.O
OR
mm_O2O.right = mm_O2O1.left AND
mm_O2O1.right = ownership.O
)))
OR
Nazn.dbid = mm_N2F.left AND (
mm_N2F.right = ownership.F
OR
mm_N2Z.right = ownership.Z
)
) AND ownership.value = value.dbid AND N.obj = whatever-filter-by-N

----------------
this scales very poor.
it uses the shortcut to F present in N.
for just 200 rows with related associations, it takes 4 seconds to get
result.
if i use the shortcut to F present in P, it takes 2 seconds - but
thats still inacceptable.
seems that the number or consequtive ORs on same level is killing it.
EXPLAIN gives nested loops all over.
What am i doing wrong here?
should i expand the A-to-B links of the sort
mm_N2P.right = mm_P2O.left
into
mm_N2P.right = P.dbid and P.dbid == mm_P2O.left ?

the query is generated via sqlalchemy and a layer on top, so i can
tweak it any way required (and it has many other sub/filterings which
i've ommited for brevity - they dont make it better/worse).

any pointers of how such queries should be written are appreciated -
e.g. what is considered fine, what doable and what is a no-no.

thanks ahead
ciao
svil

www.svilendobrev.com
dbcook.sf.net

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Glenn Gillen 2008-09-30 12:16:06 Can COPY update or skip existing records?
Previous Message Steve Midgley 2008-09-30 06:05:32 Re: Finding sequential records