Re: distinct estimate of a hard-coded VALUES list

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: distinct estimate of a hard-coded VALUES list
Date:	2016-08-20 20:58:25
Message-ID:	2584.1471726705@sss.pgh.pa.us
Lists:	pgsql-hackers

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It does know it, what it doesn't know is how many duplicates there are.

> Does it know whether the count comes from a parsed query-string list/array,
> rather than being an estimate from something else? If it came from a join,
> I can see why it would be dangerous to assume they are mostly distinct.
> But if someone throws 6000 things into a query string and only 200 distinct
> values among them, they have no one to blame but themselves when it makes
> bad choices off of that.

I am not exactly sold on this assumption that applications have
de-duplicated the contents of a VALUES or IN list. They haven't been
asked to do that in the past, so why do you think they are doing it?

>> If we do what I think you're suggesting, which is assume the entries are
>> all distinct, I'm afraid we'll just move the estimation problems somewhere
>> else.

> Any guesses as to where? (other than the case of someone doing something
> silly with their query strings?)

Well, overestimates are as bad as underestimates --- it might lead us away
from using a nestloop, for example.

regards, tom lane
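
The overestimate concern in this message can be illustrated with a toy calculation. This is only a sketch under stated assumptions, not PostgreSQL's actual selectivity code: the function name `estimated_matches`, the uniform-distribution model, and the table size and column distinct count are all hypothetical. Only the figures 6000 and 200 come from the thread.

```python
def estimated_matches(table_rows, column_ndistinct, list_ndistinct):
    """Toy estimate of rows matched by `col IN (list)`.

    Assumes (hypothetically) that column values are uniformly distributed,
    so each distinct list value matches table_rows / column_ndistinct rows.
    """
    rows_per_value = table_rows / column_ndistinct
    # The estimate can never exceed the size of the table.
    return min(table_rows, list_ndistinct * rows_per_value)


# 6000 list entries containing only 200 distinct values, as in the thread.
list_values = [i % 200 for i in range(6000)]

# Hypothetical table: 1,000,000 rows, 100,000 distinct values in the column.
TABLE_ROWS = 1_000_000
COLUMN_NDISTINCT = 100_000

# Estimate if the planner assumed every list entry is distinct.
assume_all_distinct = estimated_matches(TABLE_ROWS, COLUMN_NDISTINCT, len(list_values))

# Estimate if the true number of distinct list values were known.
actual_distinct = estimated_matches(TABLE_ROWS, COLUMN_NDISTINCT, len(set(list_values)))

overestimate_factor = assume_all_distinct / actual_distinct
print(assume_all_distinct, actual_distinct, overestimate_factor)
```

In this toy model, assuming all entries are distinct inflates the row estimate by the ratio of list length to true distinct count. An inflated estimate of that kind is what could steer a planner away from a nestloop plan that would have been cheap in practice.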

In response to

Re: distinct estimate of a hard-coded VALUES list at 2016-08-20 20:19:14 from Jeff Janes

Responses

Re: distinct estimate of a hard-coded VALUES list at 2016-08-22 17:19:32 from Robert Haas
