From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a
Date: 2009-12-03 03:24:09
Message-ID: 603c8f070912021924j3893ba16wcf9538eeb9038a15@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Dec 2, 2009 at 9:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> One other thing I'm noticing about the current implementation is that
>> it seems to spend an entirely excessive amount of brain power
>> considering the best order in which to execute cross-joins.  If I do
>> X, A JOIN B ON Pab JOIN C ON Pac JOIN D ON Pad JOIN E ON Pae, it looks
>> to me like join_search_one_level() will try joining X to each of A-E.
>> That seems fairly pointless; why would I ever want to join X to
>> anything other than {A B C D E}?
>
> Not sure that a lot of cross joins with no conditions is the case to
> design around.  Usually queries aren't that devoid of features of
> interest, and so different join paths are actually usefully different.

Not sure what you mean. There's already a special-case code path for
cross joins, but I think it's probably considering a lot of silly
paths. Is there a case where it makes sense to do cross joins at some
stage of the process other than last?
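
For concreteness, the shape of query I mean looks like this (table and
column names invented for illustration; x has no join clause connecting
it to anything else):

    SELECT *
    FROM x,                          -- cross-joined: no condition on x
         a
         JOIN b ON a.id = b.a_id     -- Pab
         JOIN c ON a.id = c.a_id     -- Pac
         JOIN d ON a.id = d.a_id     -- Pad
         JOIN e ON a.id = e.a_id;    -- Pae

Since nothing constrains x, any plan that joins x in before the very
last step just cross-joins it into an intermediate result for no
benefit, yet the planner still burns cycles considering those orders.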

>> ...  We should maybe also
>> think about raising the default value for work_mem.  It's hard for me
>> to believe that the average Postgres user wants a sort that takes more
>> than 1MB of memory to spill to disk; there certainly are people who
>> probably want that, but I doubt there are very many.  I believe we've
>> been using that value for a decade, and memory size has increased a
>> lot in that time.
>
> Maybe.  I'll certainly grant that machines have more memory, but is the
> average Postgres installation using that to run bigger sorts, or to run
> more sorts (either more concurrent queries or more complex queries
> containing more sorts)?  We know that increasing work_mem too much
> can be counterproductive, and much sooner than one might think.

A further confounding factor is that work_mem also controls memory
usage for hash tables - whereas the original sort_mem did not - and at
least in my experience it's more common to have multiple hashes in a
query than multiple sorts. It would be nice to have some data on
this rather than just hand-waving, but I'm not sure how to get it.
For default_statistics_target, *_collapse_threshold, and
geqo_threshold, we were able to construct worst-case queries and
benchmark them. I have no idea how to do something comparable for
work_mem.
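
For what it's worth, one crude way to look at a single query is EXPLAIN
ANALYZE, which reports whether a sort fit in memory; a rough sketch
(table name made up, output abridged):

    SET work_mem = '1MB';            -- the long-standing default
    EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
    -- reports e.g.:  Sort Method:  external merge  Disk: 10240kB

    SET work_mem = '64MB';
    EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
    -- may instead report:  Sort Method:  quicksort  Memory: 8192kB

But that only shows one query in isolation; it says nothing about
aggregate memory pressure across many concurrent sorts and hashes,
which I take to be your point.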

...Robert
