Quick Links

Re: PATCH: add support for IN and @> in functional-dependency statistics use

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc:	Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: PATCH: add support for IN and @> in functional-dependency statistics use
Date:	2020-03-18 00:29:46
Message-ID:	20200318002946.6dvblukm3cfmgir2@development
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Mar 17, 2020 at 06:05:17PM +0100, Tomas Vondra wrote:
>On Tue, Mar 17, 2020 at 04:14:26PM +0000, Dean Rasheed wrote:
>>On Tue, 17 Mar 2020 at 15:37, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>>
>>>On Tue, Mar 17, 2020 at 12:42:52PM +0000, Dean Rasheed wrote:
>>>
>>>>The other thing that I'm still concerned about is the possibility of
>>>>returning estimates with P(a,b) > P(a) or P(b). I think that such a
>>>>thing becomes much more likely with the new types of clause supported
>>>>here, because they now allow multiple values from each column, where
>>>>before we only allowed one. I took another look at the patch I posted
>>>>on the other thread, and I've convinced myself that it's correct.
>>>>Attached is an updated version, with some cosmetic tidying up and now
>>>>with some additional regression tests.
>>>
>>>Yeah, I agree that's something we need to fix. Do you plan to push the
>>>fix, or do you want me to do it?
>>>
>>
>>I can push it. Have you had a chance to review it?
>>
>
>Not yet, I'll take a look today.
>

OK, I took a look. I think from the correctness POV the patch is OK, but
I think the dependencies_clauselist_selectivity() function now got a bit
too complex. I've been able to parse it now, but I'm sure I'll have
trouble in the future :-(

Can we refactor / split it somehow and move bits of the logic to smaller
functions, or something like that?

Another thing I'd like to suggest is keeping the "old" formula, and
instead of just replacing it with

P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)

but explaining how the old formula may produce nonsensical selectivity,
and how the new formula addresses that issue.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: PATCH: add support for IN and @> in functional-dependency statistics use at 2020-03-17 17:05:17 from Tomas Vondra

Responses

Re: PATCH: add support for IN and @> in functional-dependency statistics use at 2020-03-19 19:53:39 from Dean Rasheed

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2020-03-18 00:31:43	Re: Small docs bugfix: make it clear what can be used in UPDATE FROM and DELETE USING
Previous Message	Nikita Glukhov	2020-03-18 00:27:37	Re: [PATCH] Opclass parameters