GEQO vs join order restrictions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: GEQO vs join order restrictions
Date: 2009-07-18 15:48:14
Message-ID: 17807.1247932094@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I spent a bit of time looking into why GEQO behaves so miserably on the
test case Andres Freund presented here:
http://archives.postgresql.org/message-id/200907091700.43411.andres@anarazel.de

The difficulty seems to be that the problem query is full of join order
restrictions; in particular lots of joins inside the right side of a
LEFT JOIN. So those sub-joins have to occur before their relations
can be joined to anything else. When GEQO generates a random join
sequence, it is very likely indeed that such pairs of relations will
not be adjacent in the raw join sequence. There is some code in
gimme_tree() (the "stack" business I added here:
http://archives.postgresql.org/pgsql-committers/2004-01/msg00186.php
) that was meant to try to deal with that, but I now realize that it
entirely fails to do so. Given two relations that have to be joined
to each other first, if they are not already adjacent in the input
then they just create two separate stack entries and the algorithm
never tries to join them to each other. So the failure rate in the
presence of join order restrictions is very high, and it gets rapidly
worse as the join problem size increases. This explains Andres'
observation of random_init_pool() reporting complete failure at high
collapse_limit settings. We can't really expect a random search process
to be efficient at discovering that two out of a hundred items must be
adjacent --- especially not if the problem has multiple restrictions
like that and the only feedback it gets is overall success or failure.

I'm inclined to address this by rewriting gimme_tree so that it *always*
finds a valid join order based on the given tour. This would involve
searching the whole stack for a possible join partner instead of
considering only the topmost stack entry. It sounds inefficient, but
it's not nearly as bad as wasting the entire cycle by failing near
the end, which is what happens now.

A different line of thought is to add some knowledge about join order
restrictions to tour generation, such that the code never generates an
invalid join order to start with. Unfortunately it's not at all clear
how to do that.

Ideas, comments?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2009-07-18 15:58:27 Re: [GENERAL] pg_migrator not setting values of sequences?
Previous Message Andrew Dunstan 2009-07-18 15:42:12 Re: navigation menu for documents