Re: bad estimates

From:	Ken Geis <kgeis(at)speakeasy(dot)org>
To:	Bruno Wolff III <bruno(at)wolff(dot)to>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: bad estimates
Date:	2003-08-30 05:05:18
Message-ID:	3F50308E.6070100@speakeasy.org
Lists:	pgsql-performance

Bruno Wolff III wrote:
> I haven't come up with any great ideas for this one. It might be interesting
> to compare the explain analyze output from the distinct on query with
> and without seqscans enabled.

Can't do that comparison. Remember, with seqscan it fails. (Oh, and
that nested loops solution I thought was fast actually took 31 minutes
versus 29 for index scan in 7.4b2.)

I ran another query across the same data:

select price_date, count(*) from day_ends group by price_date;

It used a table scan and hashed aggregates, and it ran in 5.5 minutes.
Considering that, pgsql should be able to do the query that I had been
running in a little more time than that. So...
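A minimal sketch of how the plan for that query could be checked. EXPLAIN ANALYZE executes the statement and prints the plan the planner chose along with actual timings. No output is shown because it depends on the data and the server version.

```sql
-- Runs the query and reports the chosen plan with actual row counts and timings.
-- For the run described above, the plan would be expected to show a hashed
-- aggregate (HashAggregate) over a sequential scan (Seq Scan) of day_ends.
EXPLAIN ANALYZE
SELECT price_date, count(*)
FROM day_ends
GROUP BY price_date;
```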

From what I've learned, we want to convince the optimizer to use a
table scan; that's a good thing. I want it to use hashed aggregates,
but I can't convince it to (unless maybe I removed all of the
statistics). To use group aggregates, it first sorts the results of the
table scan (all 17 million rows!). There ought to be some way to tell
pgsql not to do sorts above a certain size. In this case, if I set
enable_sort=false, it goes back to the index scan. If I then set
enable_indexscan=false, it goes back to sorting.
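The toggling described above can be sketched as a single session. The DISTINCT ON query from earlier in the thread is not reproduced in this message, so the statement below is a hypothetical stand-in: the column stock_id is an assumption, while day_ends and price_date come from the query above. The enable_* settings do not forbid a plan type outright; they only make the planner treat it as very expensive, which would explain why the plan falls back to sorting once both alternatives are discouraged.

```sql
-- Hypothetical stand-in for the DISTINCT ON query discussed in the thread;
-- stock_id is an assumed column name, not taken from the message.

-- 1. Default settings: per the message, the planner sorts the table scan.
EXPLAIN
SELECT DISTINCT ON (stock_id) stock_id, price_date
FROM day_ends
ORDER BY stock_id, price_date DESC;

-- 2. Discourage explicit sorts: per the message, the plan switches to an index scan.
SET enable_sort = false;
EXPLAIN
SELECT DISTINCT ON (stock_id) stock_id, price_date
FROM day_ends
ORDER BY stock_id, price_date DESC;

-- 3. Also discourage index scans: per the message, the plan goes back to sorting.
SET enable_indexscan = false;
EXPLAIN
SELECT DISTINCT ON (stock_id) stock_id, price_date
FROM day_ends
ORDER BY stock_id, price_date DESC;

-- Restore the defaults for the rest of the session.
RESET enable_sort;
RESET enable_indexscan;
```

Using plain EXPLAIN rather than EXPLAIN ANALYZE here avoids running a query that takes around half an hour just to see which plan is picked.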

In response to

Re: bad estimates at 2003-08-30 02:55:03 from Bruno Wolff III

Responses

Re: bad estimates at 2003-08-30 14:35:52 from Tom Lane
