Skip site navigation (1) Skip section navigation (2)

Re: New design for FK-based join selectivity estimation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ronan(dot)dunklau(at)dalibo(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)dalibo(dot)com>
Subject: Re: New design for FK-based join selectivity estimation
Date: 2016-12-15 20:51:43
Message-ID: 16149.1481835103@sss.pgh.pa.us (view raw, whole thread or download thread mbox)
Thread:
Lists: pgsql-hackers
I wrote:
> ronan(dot)dunklau(at)dalibo(dot)com writes:
>> If I understand it correctly and the above is right, I think we should ignore
>> SEMI or ANTI joins altogether when considering FKs, and keep the corresponding
>> restrictinfos for later processing since they are already special-cased later
>> on.

> That seems like an overreaction.  While the old code happens to get this
> example exactly right, eqjoinsel_semi is still full of assumptions and
> approximations, and it doesn't do very well at all if it lacks MCV lists
> for both sides.

> I'm inclined to think that what we want to have happen in this case is
> to estimate the fraction of outer rows having a match as equal to the
> selectivity of the inner query's WHERE clauses, ie the semijoin
> selectivity should be sizeof(inner result) divided by sizeof(inner
> relation).

After further study, I concluded that we can only easily estimate that
when the inner side of the SEMI or ANTI join is just the single referenced
table.  If the inner side is itself a join, it's not easy to determine
what fraction of the referenced table will survive the join clauses.

However, we can still be brighter than to just throw all the FK qual
clauses back into the pool: that would result in multiplying their
selectivity estimates together, which for a multi-column FK results in
exactly the drastic underestimation that 100340e2d intended to avoid.
What seems to make sense here is to take the minimum of the per-clause
selectivities, as we are doing for other outer-join cases.

Hence, I propose the attached patch.  This rearranges the existing code
slightly to avoid duplicating it.

			regards, tom lane


Attachment: fix-semi-anti-join-selectivity-for-fk.patch
Description: text/x-diff (4.7 KB)

In response to

pgsql-hackers by date

Next:From: Tom LaneDate: 2016-12-15 20:54:31
Subject: Re: Proposal for changes to recovery.conf API
Previous:From: Magnus HaganderDate: 2016-12-15 20:20:59
Subject: Re: Proposal for changes to recovery.conf API

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group