Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: david(dot)rowley(at)2ndquadrant(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, simon(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date: 2017-04-06 10:42:24
Message-ID: 20170406.194224.249381919.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-committers pgsql-hackers

At Thu, 6 Apr 2017 21:55:43 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f95tOuSEMfmYWBPj-fGw=SY0MYDbQh5BiRiTtonMpws7Q(at)mail(dot)gmail(dot)com>
> On 6 April 2017 at 19:50, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g(at)mail(dot)gmail(dot)com>
> >> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
> >> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w(at)mail(dot)gmail(dot)com>
> >> >> On 6 April 2017 at 13:05, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> >> I'm not all that sure why the number of columns in the relation has
> >> relevance to the performance of find_relation_from_clauses(). The
> >> bms_get_singleton_member() is checking which relations are part of the
> >> RestrictInfo, nothing related to columns in relations.
> >> Perhaps you meant clauses in the clauses list? Which does not really
> >> have all that much to do with the number of columns in the relation
> >> either.
> >
> > Sorry, it's number of relations, not columns. I'm not sure up to
> > how many relations we practically should consider but anyway it
> > is extra burden to every call to clauselist_selectivity. We
> > should avoid calling find_relation_from_clauses as far as
> > possible, or do the same in a simpler way. I'm not sure whether a
> > more precise exclusion is possible, but I think that the case
> > of jointype != JOIN_INNER can be excluded.
>
> Well, I imagine queries with >= 32 relations are not planning very
> quickly as of today already. I understand what you mean when you speak
> of attributes, as we could constantly be looking for the 1400's
> attribute which is many loops into a bms_get_singleton_member() call.
> I can't imagine we'll even flow over the first word in a bitmap set in
> 99% of cases with clause_relids. In any case, even if there's a giant
> chain of clauses in the 'clauses' list, we'll bail out on the
> first join qual anyway, since it won't be a singleton clause_relid.

Yes, I agree that most cases don't suffer from this. Anyway, since I
have neither enough knowledge to roughly estimate the impact nor a
concrete example where the planning time increases significantly, I
won't press this point any further.
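
For reference, the cost being discussed can be illustrated with a
minimal standalone sketch of a singleton-member check over a multi-word
bitmap. This is not the PostgreSQL implementation of
bms_get_singleton_member(); the type toy_bitmapset, the function
toy_get_singleton_member() and the 32-bit word size are all assumptions
made for illustration only. It shows why a set of relids normally needs
a single loop iteration, while a member such as 1400 needs many.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_WORD 32	/* assumed word size, for illustration */

/* Hypothetical stand-in for a bitmap set: nwords words of 32 bits each. */
typedef struct
{
	size_t		nwords;
	const uint32_t *words;
} toy_bitmapset;

/*
 * If the set has exactly one member, store it in *member and return true.
 * Return false for an empty set or a set with two or more members.
 * The loop visits one word per iteration, so the cost grows with the
 * position of the highest word, not with the number of clauses.
 */
static bool
toy_get_singleton_member(const toy_bitmapset *set, int *member)
{
	int			found = -1;

	assert(set != NULL && member != NULL);

	for (size_t i = 0; i < set->nwords; i++)
	{
		uint32_t	w = set->words[i];
		int			bit = 0;

		if (w == 0)
			continue;

		/* A second non-empty word, or two bits in one word: not a singleton. */
		if (found >= 0 || (w & (w - 1)) != 0)
			return false;

		/* Locate the position of the single set bit in this word. */
		while ((w & 1) == 0)
		{
			w >>= 1;
			bit++;
		}
		found = (int) (i * TOY_BITS_PER_WORD) + bit;
	}

	if (found < 0)
		return false;

	*member = found;
	return true;
}
```

Under these assumptions, a join qual referencing relids 1 and 2 is
rejected in the first word, whereas a set holding only member 1400 is
found only after skipping 43 empty words.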

> I'd say if you can come up with a test case where you can measure the
> impact of this, then let's discuss more. Otherwise we're stepping back
> into the territory that Tom warned me about a few emails up....
> Premature Optimisation. I'm not walking down there again, I only just
> got back.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
