Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date: 2017-04-06 06:59:35
Message-ID: CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w(at)mail(dot)gmail(dot)com>
>> On 6 April 2017 at 13:05, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> > I tested with the attached, and it does not seem to hurt planner
>> > performance executing:
>>
>> Here's it again, this time with a comment on the
>> find_relation_from_clauses() function.
>
> It seems to work the same as the previous version, with the
> additional cost of scanning over the restrict clauses. But a separate
> loop over the clauses is additional overhead in every case, even
> cases irrelevant to functional dependencies. The more columns are in
> the relation, the longer bms_get_singleton_member takes. Although I'm
> not sure how much it hurts performance and I can't think of a good
> alternative right now, I think that the overhead should be
> avoided anyhow.

I'm not sure why the number of columns in the relation is relevant
to the performance of find_relation_from_clauses(). The
bms_get_singleton_member() call there checks which relations are part
of the RestrictInfo; it has nothing to do with the columns in those
relations.

Perhaps you meant the number of clauses in the clauses list? That
does not really have much to do with the number of columns in the
relation either.
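
To make that distinction concrete, here is a minimal standalone sketch in
C. It is not the PostgreSQL source: RelidSet, ToyClause, relids_add(),
relids_get_singleton() and toy_find_relation() are hypothetical names, and
a single 64-bit word stands in for a Bitmapset. It only illustrates that
the work is done per clause on a set of relation ids, so the number of
columns in the relation never appears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a set of relation ids (range-table indexes).
 * One 64-bit word is enough for this sketch. */
typedef uint64_t RelidSet;

/* Return a copy of set with relid added. */
static RelidSet relids_add(RelidSet set, int relid)
{
    assert(relid >= 0 && relid < 64);
    return set | ((uint64_t) 1 << relid);
}

/* If set has exactly one member, store it in *member and return true. */
static bool relids_get_singleton(RelidSet set, int *member)
{
    int idx = 0;

    if (set == 0 || (set & (set - 1)) != 0)
        return false;           /* empty, or more than one bit set */
    while ((set >> idx) != 1)
        idx++;
    *member = idx;
    return true;
}

/* Hypothetical clause: records only which relations it references. */
typedef struct ToyClause
{
    RelidSet relids;
} ToyClause;

/* Return the single relation id that every clause references, or -1 if
 * any clause references several relations or the clauses disagree. */
static int toy_find_relation(const ToyClause *clauses, int nclauses)
{
    int found = -1;

    for (int i = 0; i < nclauses; i++)
    {
        int relid;

        if (!relids_get_singleton(clauses[i].relids, &relid))
            return -1;          /* clause spans multiple relations */
        if (found != -1 && relid != found)
            return -1;          /* not the same relation as before */
        found = relid;
    }
    return found;
}
```

In this sketch the loop runs once per clause and each singleton check looks
only at relation-id bits, so the cost depends on the length of the clause
list and the range of relation ids, not on how many columns a relation has.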

>
> At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote in <CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA(at)mail(dot)gmail(dot)com>
>> > And you measured the overhead of doing it the other way to be ... ?
>> > Premature optimization and all that.
>>
>> I tested with the attached, and it does not seem to hurt planner
>> performance executing:
>
> Here, bms_singleton_member takes longer if the relation has
> many columns and there's a functional dependency covering the
> columns at the very tail. Perhaps a test with only two columns is
> not realistic.

Can you explain why you think this? And can you confirm you're speaking
about the bms_get_singleton_member() call in find_relation_from_clauses()?

> Even if it doesn't impact performance detectably, if only one
> attribute is needed, an AttrNumber member in context will be
> sufficient. No bitmap operation seems required in
> dependency_compatible_walker and it can bail out by the second
> attribute.

Are you looking at an old patch? That function no longer exists.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
