Re: Eliminating unnecessary left joins

From: Jim Nasby <decibel(at)decibel(dot)org>
To: Ottó Havasvölgyi <havasvolgyi(dot)otto(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Eliminating unnecessary left joins
Date: 2007-04-11 22:07:19
Message-ID: A40A041A-1E2F-48ED-8C9B-3F8E4B077FFC@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I agree with others that the way that query is constructed is a bit
odd, but it does bring another optimization to mind: when doing an
inner-join between a parent and child table when RI is defined
between them, if the query only refers to the child table you can
drop the parent table from the join, because each row in the child
table must have one and only one row in the parent.

Use-case: I'll often use views to make it easier to query several
related tables, but not all queries against the views need to hit
every table. IE: if a table has several status fields that have RI to
parent tables that describe what each status is, you sometimes will
query for the status description, sometimes not.

I suspect that checking to see if tables have the right unique keys
or RI would add a noticeable amount of extra work to query planning,
so we might want a GUC to disable it.

On Apr 7, 2007, at 12:45 PM, Ottó Havasvölgyi wrote:

> Sorry, I have left out the PK requirement.
> What Nicolas wrote is right, I also use an O/R mapper and
> inheritance is solved with vertical partitioning. The tables are
> connected to each other with the PK. And the mapper defines views
> for each class with left joins. The mapper generates queries based
> on these views. A high fraction of the joins would be eliminated
> almost in every query.
>
> My simple example:
>
> Class hierarchy and fields:
> Shape (ID, X, Y)
> |
> +-Circle (ID, Radius)
> |
> +-Rectangle (ID, Width, Height)
>
> The mapper creates 3 tables with the columns next to the class name.
> And it creates 3 views. One of them:
>
> RectangleView: SELECT r."ID" as "ID", s."X" as "X", s."Y" as "Y",
> r."Width" as "Width", r."Height" as "Height" FROM "Rectangle" r
> LEFT JOIN "Shape" s ON ( r.ID=s.ID)
>
> Now if I query Rectangle object IDs, whose Width is greater than 5,
> it will generate this:
>
> SELECT "ID" FROM RectangleView WHERE "Width">5
>
> In this case I don't need to left join the Shape table, because X
> and Y columns are not used.
>
>
> The other typical situation is when I execute more complex, not-O/
> Rmapper-generated SQL commands based on these views for reporting.
> For example the average width of rectangles whose height is greater
> than 10.
> ----------------------------------------------------
>
> This optimization should be also applied to subqueries.
>
>
>
> Is this optimization relatively easy to introduce?
>
> I would gladly work on this, but unfortunately I don't know the
> codebase at all.
> I would really appreciate if someone competent implemented this
> feature in 8.4.
>
> Thank you in advance,
> Otto
>

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2007-04-11 22:23:38 Re: elog(FATAL) vs shared memory
Previous Message Jason Lustig 2007-04-11 22:02:48 Slow Postgresql server