Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: david(dot)rowley(at)2ndquadrant(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, simon(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date: 2017-04-06 07:50:56
Message-ID: 20170406.165056.135052192.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-committers pgsql-hackers

At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g(at)mail(dot)gmail(dot)com>
> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w(at)mail(dot)gmail(dot)com>
> >> On 6 April 2017 at 13:05, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> I'm not all that sure why the number of columns in the relation has
> relevance to the performance of find_relation_from_clauses(). The
> bms_get_singleton_member() is checking which relations are part of the
> RestrictInfo, nothing related to columns in relations.
> Perhaps you meant clauses in the clauses list? Which does not really
> have all that much to do with the number of columns in the relation
> either.

Sorry, I meant the number of relations, not columns. I'm not sure
how many relations we should realistically consider, but in any
case it is an extra burden on every call to clauselist_selectivity.
We should avoid calling find_relation_from_clauses as far as
possible, or do the same thing in a simpler way. I'm not sure
whether a more precise exclusion is possible, but I think that the
case of jointype != JOIN_INNER can be excluded.
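
To illustrate the kind of early exit suggested above, here is a minimal
standalone sketch. It is not the committed PostgreSQL code: RelidSet,
JoinKind, find_single_relation and relation_for_ext_stats are hypothetical
stand-ins, a 64-bit mask replaces the real Bitmapset, and the lookup is
only a simplified model of what find_relation_from_clauses might do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef uint64_t RelidSet;      /* bit i set => clause references relation i */
typedef enum { JOIN_INNER, JOIN_LEFT, JOIN_SEMI } JoinKind;

/* Build a set containing only the given relation id. */
static RelidSet
relid_bit(int relid)
{
    assert(relid >= 0 && relid < 64);
    return (RelidSet) 1 << relid;
}

/*
 * Rough analogue of bms_get_singleton_member(): returns true and stores
 * the member when exactly one bit is set, false otherwise.
 */
static bool
get_singleton_member(RelidSet set, int *member)
{
    int relid = 0;

    if (set == 0 || (set & (set - 1)) != 0)
        return false;           /* empty, or more than one member */
    while ((set >> relid) != 1)
        relid++;
    *member = relid;
    return true;
}

/*
 * Simplified model of the lookup: return the relation id of the first
 * clause that references exactly one relation, or -1 if there is none.
 */
static int
find_single_relation(const RelidSet *clause_relids, size_t nclauses)
{
    for (size_t i = 0; i < nclauses; i++)
    {
        int relid;

        if (get_singleton_member(clause_relids[i], &relid))
            return relid;
    }
    return -1;
}

/*
 * The suggested exclusion: skip the scan over the clause list entirely
 * unless the join type is JOIN_INNER.
 */
static int
relation_for_ext_stats(JoinKind jointype,
                       const RelidSet *clause_relids, size_t nclauses)
{
    if (jointype != JOIN_INNER)
        return -1;              /* early exit, no per-clause work */
    return find_single_relation(clause_relids, nclauses);
}
```

With such a guard, callers with any other join type pay only for one
comparison; whether that saving is measurable in the planner is a separate
question that this sketch does not answer.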

> > At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA(at)mail(dot)gmail(dot)com>
> >> > And you measured the overhead of doing it the other way to be ... ?
> >> > Premature optimization and all that.
> >>
> >> I tested with the attached, and it does not seem to hurt planner
> >> performance executing:
> >
> > Here, bms_singleton_member takes longer time if the relation has
> > many columns and there's a functional dependency covering the
> > columns at the very tail. Maybe only two are not practical for
> > testing.
>
> Can you explain why you think this? And confirm you're speaking about
> the bms_get_singleton() member in find_relation_from_clauses()

I was referring to dependency_is_compatible_clause there, but I see
that it has already been simplified enough in the committed version.

> > Even if it doesn't impact performance detectably, if only one
> > attribute is needed, an AttrNumber member in context will be
> > sufficient. No bitmap operation seems required in
> > dependency_compatible_walker and it can bail out by the second
> > attribute.
>
> Are you looking at an old patch? That function no longer exists.

Yes! Sorry for the noise.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
