Re: distinct estimate of a hard-coded VALUES list

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: distinct estimate of a hard-coded VALUES list
Date: 2017-06-27 03:14:46
Message-ID: CAMkU=1w02NNDfXEhzaj28Z8Qf3=YPms2fDxFQX3+TT7Th+YoWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 23, 2016 at 5:28 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> > On 08/22/2016 07:42 PM, Alvaro Herrera wrote:
> >> Also, if we patch it this way and somebody has a slow query because of a
> >> lot of duplicate values, it's easy to solve the problem by
> >> de-duplicating. But with the current code, people that have the
> >> opposite problem has no way to work around it.
>
> > I certainly agree it's better when a smart user can fix his query plan
> > by deduplicating the values than when we end up generating a poor plan
> > due to assuming some users are somewhat dumb.
>
> Well, that seems to be the majority opinion, so attached is a patch
> to do it. Do we want to sneak this into 9.6, or wait?
>
> > I wonder how expensive would it be to actually count the number of
> > distinct values - there certainly are complex data types where the
> > comparisons are fairly expensive, but I would not expect those to be
> > used in explicit VALUES lists.
>
> You'd have to sort the values before de-duping, and deal with VALUES
> expressions that aren't simple literals. And you'd have to do it a
> lot --- by my count, get_variable_numdistinct is invoked six times
> on the Var representing the VALUES output in Jeff's example query.
> Maybe you could cache the results somewhere, but that adds even
> more complication. Given the lack of prior complaints about this,
> it's hard to believe it's worth the trouble.
>
>
This patch still applies, and I think the argument for it is still valid.
So I'm going to make a commit-fest entry for it. Is there additional
evidence we should gather?

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-06-27 04:43:46 Re: pg_basebackup fails on Windows when using tablespace mapping
Previous Message Michael Paquier 2017-06-27 03:13:58 Re: pg_basebackup fails on Windows when using tablespace mapping