Quick Links

Re: distinct estimate of a hard-coded VALUES list

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: distinct estimate of a hard-coded VALUES list
Date:	2016-08-23 12:28:25
Message-ID:	10078.1471955305@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> On 08/22/2016 07:42 PM, Alvaro Herrera wrote:
>> Also, if we patch it this way and somebody has a slow query because of a
>> lot of duplicate values, it's easy to solve the problem by
>> de-duplicating. But with the current code, people that have the
>> opposite problem has no way to work around it.

> I certainly agree it's better when a smart user can fix his query plan
> by deduplicating the values than when we end up generating a poor plan
> due to assuming some users are somewhat dumb.

Well, that seems to be the majority opinion, so attached is a patch
to do it. Do we want to sneak this into 9.6, or wait?

> I wonder how expensive would it be to actually count the number of
> distinct values - there certainly are complex data types where the
> comparisons are fairly expensive, but I would not expect those to be
> used in explicit VALUES lists.

You'd have to sort the values before de-duping, and deal with VALUES
expressions that aren't simple literals. And you'd have to do it a
lot --- by my count, get_variable_numdistinct is invoked six times
on the Var representing the VALUES output in Jeff's example query.
Maybe you could cache the results somewhere, but that adds even
more complication. Given the lack of prior complaints about this,
it's hard to believe it's worth the trouble.

regards, tom lane

Attachment	Content-Type	Size
assume-VALUES-are-distinct.patch	text/x-diff	2.6 KB

In response to

Re: distinct estimate of a hard-coded VALUES list at 2016-08-22 22:38:53 from Tomas Vondra

Responses

Re: distinct estimate of a hard-coded VALUES list at 2017-06-27 03:14:46 from Jeff Janes

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2016-08-23 12:32:01	Re: Block level parallel vacuum WIP
Previous Message	Kevin Grittner	2016-08-23 12:26:31	Re: Logical decoding of sequence advances, part II