Re: plan time of MASSIVE partitioning ...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: plan time of MASSIVE partitioning ...
Date: 2010-10-28 22:56:49
Message-ID: 27684.1288306609@sss.pgh.pa.us
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> Tom Lane wrote:
>> That seems like a ridiculously large number of ECs. What is the
>> test query again?

> The test case is here:
> http://archives.postgresql.org/message-id/4CBD9DDC.4040304@cybertec.at

After poking through that a bit, I think that the real issue is in this
division of labor:

        index_pathkeys = build_index_pathkeys(root, index,
                                              ForwardScanDirection);
        useful_pathkeys = truncate_useless_pathkeys(root, rel,
                                                    index_pathkeys);

If you trace what is happening here, the index pathkeys that actually
survive the "usefulness" test all refer to exactly ONE equivalence
class, namely the one arising from the query's "order by timestamp2"
clause. All the other pathkeys that get created are immediately
discarded as being irrelevant to the query. The reason that we end up
with so many equivalence classes is that there is nothing causing the
variables of the different child tables to be recognized as all
sort-equivalent. Maybe that's a bug in itself, but I would argue that
the right way to make this faster is to refactor things so that we
don't generate useless equivalence classes in the first place, or
at least don't keep them around in the planner's lists once we realize
they're useless.
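
To make the refactoring idea concrete, here is a toy model in C. It is only a sketch and not PostgreSQL code: EcList, find_ec, build_pathkeys and the create_it flag are hypothetical names invented for illustration, and expressions are modeled as plain strings. The assumption is that the lookup can run in a "find only" mode, so that an index column with no matching equivalence class ends the pathkey list instead of adding a new class to the planner's list.

```c
#include <assert.h>
#include <string.h>

#define MAX_ECS 64

/* Toy stand-in for the planner's list of equivalence classes (hypothetical). */
typedef struct
{
    const char *members[MAX_ECS];   /* one representative expression per EC */
    int         count;
} EcList;

/*
 * Return the index of the EC matching expr, or -1 if there is none.
 * When create_it is nonzero, a missing EC is appended to the list
 * and its index is returned instead.
 */
static int
find_ec(EcList *ecs, const char *expr, int create_it)
{
    for (int i = 0; i < ecs->count; i++)
    {
        if (strcmp(ecs->members[i], expr) == 0)
            return i;
    }
    if (!create_it)
        return -1;
    assert(ecs->count < MAX_ECS);   /* toy model: fixed capacity */
    ecs->members[ecs->count] = expr;
    return ecs->count++;
}

/*
 * Build pathkeys (stored as EC indexes in out) for an index's columns and
 * return how many were produced.  In lookup-only mode (create_it == 0) we
 * stop at the first column that has no EC: a pathkey on that column cannot
 * match anything in the query, and later columns are useless without it.
 */
static int
build_pathkeys(EcList *ecs, const char **cols, int ncols,
               int create_it, int *out)
{
    int n = 0;

    for (int i = 0; i < ncols; i++)
    {
        int ec = find_ec(ecs, cols[i], create_it);

        if (ec < 0)
            break;
        out[n++] = ec;
    }
    return n;
}
```

In this model, if the only existing class comes from "order by timestamp2", then an index on (timestamp2, id) yields one pathkey in lookup-only mode and the list stays at one class. With create_it set, the same call yields two pathkeys and leaves behind a second class that nothing in the query refers to.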

I like Heikki's hack to cut down on searching in make_canonical_pathkey,
but I think that complicating the data structure searching beyond that
is just a band-aid. Reasonably-sized queries shouldn't contain very
many equivalence classes: they should only come from equality clauses
or sort conditions that appeared in the query text. Therefore, there
also shouldn't be all that many distinct pathkeys.

regards, tom lane
