Re: distinct estimate of a hard-coded VALUES list

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: distinct estimate of a hard-coded VALUES list
Date:	2016-08-18 21:25:12
Message-ID:	25962.1471555512@sss.pgh.pa.us
Lists:	pgsql-hackers

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> So even though it knows that 6952 values have been shoved in the bottom, it
> thinks only 200 are going to come out of the aggregation. This seems like
> a really lousy estimate. In queries more complex than the example given,
> it leads to poor planning choices.

> Is the size of the input list not available to the planner at the point
> where it estimates the distinct size of the input list? I'm assuming that
> if it is available to EXPLAIN then it is available to the planner. Does it
> know how large the input list is, but just throw up its hands and use 200
> as the distinct size anyway?

It does know it; what it doesn't know is how many duplicates there are.
If we do what I think you're suggesting, which is assume the entries are
all distinct, I'm afraid we'll just move the estimation problems somewhere
else.
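
To make that trade-off concrete, here is a rough Python sketch. It is an
illustration only, not planner code: the function names are hypothetical,
and the fixed fallback of 200 (capped at the input row count) is an
assumption taken from the numbers quoted above. It compares the fallback
with an "assume all distinct" rule on two lists of 6952 entries, one with
no duplicates and one with only ten distinct values.

```python
# Hypothetical illustration only; this is not PostgreSQL planner code.
DEFAULT_NUM_DISTINCT = 200  # assumed fallback when no statistics are available


def estimate_default(n_rows):
    """Fallback guess: a fixed default, capped at the number of input rows."""
    return min(n_rows, DEFAULT_NUM_DISTINCT)


def estimate_all_distinct(n_rows):
    """Alternative guess: treat every entry in the list as distinct."""
    return n_rows


def error_ratio(estimate, actual):
    """How far off an estimate is, as a factor >= 1 (1.0 means exact)."""
    return max(estimate, actual) / min(estimate, actual)


# Two made-up VALUES lists of the same length with very different contents.
all_unique = list(range(6952))
mostly_dupes = [i % 10 for i in range(6952)]

for name, values in [("all unique", all_unique), ("mostly dupes", mostly_dupes)]:
    n_rows = len(values)
    actual = len(set(values))
    print(
        name,
        "actual:", actual,
        "default error:", error_ratio(estimate_default(n_rows), actual),
        "all-distinct error:", error_ratio(estimate_all_distinct(n_rows), actual),
    )
```

Under these assumptions, neither rule wins on both lists: the fallback is
badly off when the entries are all unique, and the all-distinct rule is
off by a larger factor when the list is mostly duplicates.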

I recall some talk of actually running an ANALYZE-like process on the
elements of a VALUES list, but it seemed like overkill at the time and
still does.

regards, tom lane

In response to

distinct estimate of a hard-coded VALUES list at 2016-08-18 21:03:24 from Jeff Janes

Responses

Re: distinct estimate of a hard-coded VALUES list at 2016-08-20 20:19:14 from Jeff Janes
