Re: How to speed up min/max(id) in 50M rows table?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: henk de wit <henk53602(at)hotmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to speed up min/max(id) in 50M rows table?
Date: 2007-10-12 22:17:41
Message-ID: 5937.1192227461@sss.pgh.pa.us
Lists: pgsql-performance

I wrote:
> The only way I can see for that to be so slow is if you have a very
> large number of rows where payment_id is null --- is that the case?

> There's not a lot you could do about that in existing releases :-(.

Actually, there is a possibility if you are willing to change the query:
make a partial index that excludes nulls. Toy example:

regression=# create table fooey(f1 int);
CREATE TABLE
regression=# create index fooeyi on fooey(f1) where f1 is not null;
CREATE INDEX
regression=# explain select max(f1) from fooey;
                          QUERY PLAN
---------------------------------------------------------------
 Aggregate  (cost=36.75..36.76 rows=1 width=4)
   ->  Seq Scan on fooey  (cost=0.00..31.40 rows=2140 width=4)
(2 rows)

regression=# explain select max(f1) from fooey where f1 is not null;
                                           QUERY PLAN
-----------------------------------------------------------------------------------------------
 Result  (cost=0.03..0.04 rows=1 width=0)
   InitPlan
     ->  Limit  (cost=0.00..0.03 rows=1 width=4)
           ->  Index Scan Backward using fooeyi on fooey  (cost=0.00..65.55 rows=2129 width=4)
                 Filter: (f1 IS NOT NULL)
(5 rows)

Probably the planner ought to be smart enough to figure this out without
the explicit WHERE in the query, but right now it isn't.
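Applied to the original question, the same trick would look something like
this (the table name "payments" is only a guess, since the thread doesn't
give it; the payment_id column is from the earlier message):

    -- partial index that leaves out the NULLs
    create index payments_payment_id_nn on payments (payment_id)
        where payment_id is not null;

    -- the redundant-looking WHERE clause is what lets the planner match
    -- the partial index and turn max() into a backward index scan
    select max(payment_id) from payments where payment_id is not null;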

regards, tom lane
