Re: GEQO vs join order restrictions

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GEQO vs join order restrictions
Date: 2009-07-18 16:19:05
Message-ID: 200907181819.05570.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Saturday 18 July 2009 17:48:14 Tom Lane wrote:
> I spent a bit of time looking into why GEQO behaves so miserably on the
> test case Andres Freund presented here:
> http://archives.postgresql.org/message-id/200907091700.43411.andres@anaraze
>l.de
>
> The difficulty seems to be that the problem query is full of join order
> restrictions; in particular lots of joins inside the right side of a
> LEFT JOIN. So those sub-joins have to occur before their relations
> can be joined to anything else. When GEQO generates a random join
> sequence, it is very likely indeed that such pairs of relations will
> not be adjacent in the raw join sequence. There is some code in
> gimme_tree() (the "stack" business I added here:
> http://archives.postgresql.org/pgsql-committers/2004-01/msg00186.php
> ) that was meant to try to deal with that, but I now realize that it
> entirely fails to do so. Given two relations that have to be joined
> to each other first, if they are not already adjacent in the input
> then they just create two separate stack entries and the algorithm
> never tries to join them to each other. So the failure rate in the
> presence of join order restrictions is very high, and it gets rapidly
> worse as the join problem size increases. This explains Andres'
> observation of random_init_pool() reporting complete failure at high
> collapse_limit settings. We can't really expect a random search process
> to be efficient at discovering that two out of a hundred items must be
> adjacent --- especially not if the problem has multiple restrictions
> like that and the only feedback it gets is overall success or failure.
Yes, this sound sensible and follows the same line of thought I started to
have after you suggested the problem is in random_init_pool.

This also explains why I saw nearly no improvement during the genetic search
itself. The paths out of random_init_pool were already hugely selected, so
there were not that many improvements to find and a change was relatively like
to yield a impossible ordering.
> I'm inclined to address this by rewriting gimme_tree so that it *always*
> finds a valid join order based on the given tour. This would involve
> searching the whole stack for a possible join partner instead of
> considering only the topmost stack entry. It sounds inefficient, but
> it's not nearly as bad as wasting the entire cycle by failing near
> the end, which is what happens now.
I guess this should be relatively easy to implement and test?

> A different line of thought is to add some knowledge about join order
> restrictions to tour generation, such that the code never generates an
> invalid join order to start with. Unfortunately it's not at all clear
> how to do that.
I haven't really thought much about that yet, but wouldn't it be possible to
iteratively check the current path during tour generation for every relation
added?
If the next relation yields a impossible ordering, choose another relation as
next, if no more left, restart?

I do even less know how feasible this is, but given that joins in the right
hand side of a LEFT JOIN are not really useful to study from the outside in
the general case, would it be possible to "hide" them below the join during
join order planning? If yes, that should be applicable for the normal planner
as well.
I do realize that this is nothing one could fastly cook up and that it would
require changes in lots of places, but as a general idea...

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2009-07-18 16:31:34 Re: SE-PostgreSQL?
Previous Message David Fetter 2009-07-18 16:06:00 SE-PostgreSQL?