Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date: 2017-04-06 09:55:43
Message-ID: CAKJS1f95tOuSEMfmYWBPj-fGw=SY0MYDbQh5BiRiTtonMpws7Q@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 6 April 2017 at 19:50, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g(at)mail(dot)gmail(dot)com>
>> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w(at)mail(dot)gmail(dot)com>
>> >> On 6 April 2017 at 13:05, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> I'm not all that sure why the number of columns in the relation has
>> relevance to the performance of find_relation_from_clauses(). The
>> bms_get_singleton_member() is checking which relations are part of the
>> RestrictInfo, nothing related to columns in relations.
>> Perhaps you meant clauses in the clauses list? Which does not really
>> have all that much to do with the number of columns in the relation
>> either.
>
> Sorry, it's the number of relations, not columns. I'm not sure up to
> how many relations we practically should consider, but anyway it
> is an extra burden on every call to clauselist_selectivity. We
> should avoid calling find_relation_from_clauses as far as
> possible, or do the same in a simpler way. I'm not sure whether
> more precise exclusion is possible, but I think that the case
> of jointype != JOIN_INNER can be excluded.

Well, I imagine queries with >= 32 relations are not planning very
quickly as of today already. I understand what you mean when you speak
of attributes, as we could constantly be looking for the 1400th
attribute, which is many loops into a bms_get_singleton_member() call.
I can't imagine we'll even overflow the first word of the bitmapset in
99% of cases with clause_relids. In any case, even if there's a giant
chain of clauses in the 'clauses' list, we'll bail out on the
first join qual anyway, since its clause_relids won't be a singleton.
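
To make the cost argument concrete, here is a rough, self-contained
sketch of the loop shape being discussed. It is a simplified model, not
the actual PostgreSQL source: RelidSet, get_singleton_member() and
find_single_relid() are hypothetical stand-ins for Bitmapset,
bms_get_singleton_member() and find_relation_from_clauses(), and the
fixed four-word array is an assumption made only to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 32

/* Hypothetical stand-in for a Bitmapset: a small fixed array of words. */
typedef struct
{
    int      nwords;
    uint32_t words[4];
} RelidSet;

/*
 * Store the only member of the set in *member and return true.
 * Return false if the set is empty or has more than one member.
 * The work done is one pass over the words, not over the members.
 */
static bool
get_singleton_member(const RelidSet *set, int *member)
{
    int result = -1;

    for (int w = 0; w < set->nwords; w++)
    {
        uint32_t word = set->words[w];

        if (word == 0)
            continue;
        /* A second non-empty word, or two bits in this word: not a singleton. */
        if (result >= 0 || (word & (word - 1)) != 0)
            return false;

        int bit = 0;
        while ((word & 1u) == 0)
        {
            word >>= 1;
            bit++;
        }
        result = w * BITS_PER_WORD + bit;
    }

    if (result < 0)
        return false;
    *member = result;
    return true;
}

/*
 * Return the relid shared by all clauses, or -1 if any clause refers to
 * more than one relation or the clauses refer to different relations.
 */
static int
find_single_relid(const RelidSet *clause_relids, size_t nclauses)
{
    int last = -1;

    for (size_t i = 0; i < nclauses; i++)
    {
        int relid;

        if (!get_singleton_member(&clause_relids[i], &relid))
            return -1;          /* join qual: bail out immediately */
        if (last >= 0 && relid != last)
            return -1;          /* a different relation than before */
        last = relid;
    }
    return last;
}
```

In this model the per-clause cost depends on the number of words in the
set, so with fewer than 32 relations each check touches a single word,
and the scan over the clause list stops at the first clause that
mentions more than one relation.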

I'd say if you can come up with a test case where you can measure the
impact of this, then let's discuss more. Otherwise we're stepping back
into the territory that Tom warned me about a few emails up:
premature optimisation. I'm not walking down there again, I only just
got back.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
