Quick Links

Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date:	2017-04-06 01:05:24
Message-ID:	CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-committers pgsql-hackers

On 6 April 2017 at 11:33, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> On 6 April 2017 at 10:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> The buildfarm is unhappy about the fact that this changed the API
>>> for clauselist_selectivity(). I am not convinced that that change
>>> was a good idea, so before telling FDW authors that they need to
>>> change their code, I'd like to hear a defense of the API change.
>
>> Because varReliId is often passed in as 0, and that meant we'd have to
>> write some code to check of the clause was made up of RestrictInfos
>> from a single relation or not, and look for extended stats on that
>> singleton rel.
>
> Generally, if it's passed as zero, that's a good clue that the clause
> *is* a join clause. In any case, this defense fails to address my
> other question, which is what's going to happen to this API when you
> want to use extended stats in join-clause estimates, which I'd expect
> to surely happen before very long.
>
> Also, I find it hard to believe that a bms_get_singleton_member call is
> going to be material in comparison to all the work that will be invoked
> indirectly via whatever selectivity estimation function gets called for
> each clause. Even a single catcache fetch would swamp that.
>
> So no, you have not convinced me that this isn't a broken design.
>
>> FWIW, I found this function being called 72 times in a 5 way join
>> search problem.
>
> And you measured the overhead of doing it the other way to be ... ?
> Premature optimization and all that.

I tested with the attached, and it does not seem to hurt planner
performance executing:

explain select * from ab ab1
inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b
inner join ab ab3 on ab1.a = ab3.a and ab1.b = ab3.b
inner join ab ab4 on ab1.a = ab4.a and ab1.b = ab4.b
inner join ab ab5 on ab1.a = ab5.a and ab1.b = ab5.b
inner join ab ab6 on ab1.a = ab6.a and ab1.b = ab6.b
inner join ab ab7 on ab1.a = ab7.a and ab1.b = ab7.b
inner join ab ab8 on ab1.a = ab8.a and ab1.b = ab8.b;

after having executed:

create table ab (a int, b int);

I get:

find_relation_from_clauses
tps = 48.992918 (excluding connections establishing)
tps = 49.060407 (excluding connections establishing)
tps = 49.075815 (excluding connections establishing)

Master

tps = 48.938027 (excluding connections establishing)
tps = 48.066274 (excluding connections establishing)
tps = 48.727089 (excluding connections establishing)

running pgbench -n -T 60 -f 8wayjoin.sql

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
find_relation_from_clauses.patch	application/octet-stream	13.0 KB

In response to

Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats at 2017-04-05 23:33:03 from Tom Lane

Responses

Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats at 2017-04-06 01:10:48 from David Rowley

Browse pgsql-committers by date

	From	Date	Subject
Next Message	David Rowley	2017-04-06 01:10:48	Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Previous Message	Tom Lane	2017-04-05 23:33:03	Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2017-04-06 01:07:11	Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message	Andres Freund	2017-04-06 01:02:42	Re: Re: new set of psql patches for loading (saving) data from (to) text, binary files