Re: optimize query with a maximum(date) extraction

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Peter Childs" <peterachilds(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: optimize query with a maximum(date) extraction
Date: 2007-09-05 14:28:53
Message-ID: 87ps0xp4nu.fsf@oxford.xeocode.com
Lists: pgsql-performance

"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:

>>
>> why not
>>
>> select id,
>> min(the_date) as min_date,
>> max(the_date) as max_date
>> from my_table group by id;
>>
>> Since 8.0 (or was it earlier?) this will use an index, should a reasonable
>> one exist.

As I mentioned in the other post, that's not true for this query.

> without any limits, seq scan is optimal.

That's not necessarily true either. You could have ten distinct ids and
millions of dates for each of them. In that case an index scan that jumped
around, reading just the beginning and end of the range for each distinct id
value, would be faster. There's just no such plan type in Postgres currently.
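(For what it's worth, the kind of index such a plan, or the workaround below,
would want is a composite one on the grouping column plus the date; the names
here are just taken from the quoted query:

  CREATE INDEX my_table_id_the_date_idx ON my_table (id, the_date);
)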

You can simulate such a plan with the subqueries I described, but there's a
bit more overhead than necessary and you need a reasonably efficient source of
the distinct ids. Also, it may or may not be faster than simply scanning the
whole table as above, and writing it as subqueries makes it impossible for the
planner to choose whichever plan is actually best.
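Roughly the sort of rewrite I mean (a sketch only, assuming the my_table(id,
the_date) schema from the quoted query and the composite index above) would
look like:

  SELECT ids.id,
         (SELECT min(t.the_date) FROM my_table t WHERE t.id = ids.id) AS min_date,
         (SELECT max(t.the_date) FROM my_table t WHERE t.id = ids.id) AS max_date
    FROM (SELECT DISTINCT id FROM my_table) AS ids;

Each correlated min/max subquery can descend the index for just that id rather
than reading all its rows, but the DISTINCT in the outer query still needs an
efficient source of ids to pay off.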

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
