Re: Selecting max(pk) is slow on empty set

From: "Alexander Staubo" <alex(at)purefiction(dot)net>
To: "Richard Huxton" <dev(at)archonet(dot)com>
Cc: "Postgresql General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Selecting max(pk) is slow on empty set
Date: 2008-01-22 12:33:58
Message-ID: 88daf38c0801220433m2c38243eg3149dc54a811469f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 1/22/08, Richard Huxton <dev(at)archonet(dot)com> wrote:
> Alexander Staubo wrote:
> > # explain analyze select max(id) from user_messages where user_id = 13604;
> >
> > QUERY PLAN
> > ------------------------------------------------------------------------------------------------------------------------------------------------------
> > Result (cost=633.19..633.20 rows=1 width=0) (actual
> > time=339160.704..339160.704 rows=1 loops=1)
> > InitPlan
> > -> Limit (cost=0.00..633.19 rows=1 width=4) (actual
> > time=339160.700..339160.700 rows=0 loops=1)
> > -> Index Scan Backward using user_messages_pkey on
> > user_messages (cost=0.00..633188.12 rows=1000 width=4) (actual
> > time=339160.697..339160 Filter: ((id IS NOT NULL) AND
> > (user_id = 13604))
> > Total runtime: 339160.770 ms
> > (6 rows)
> >
> > Note that it's using the correct index -- user_messages_pkey is on the
> > id attribute. (Why rows=1000 here?)
>
> 1000 looks suspiciously like a default estimate if the planner knows no
> better. Odd since you say that you've just analysed.
>
> Do you have an index on user_id? Presumably that's what's being used in
> the case of SELECT * or count(*).

Yes, I do. However, for some reason it's not being used here. The
index is clustered -- but I haven't run "cluster" on it recently. Does
that matter?

> What cost does the count(*) come up with?

# explain analyze select count(*) from user_messages where user_id = 13604;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3646.04..3646.05 rows=1 width=0) (actual
time=39.448..39.448 rows=1 loops=1)
-> Index Scan using user_messages_user on user_messages
(cost=0.00..3643.53 rows=1000 width=0) (actual time=39.410..39.410
rows=0 loops=1)
Index Cond: (user_id = 13604)
Total runtime: 39.648 ms
(4 rows)

So here it's using the right index.

> Can you trick it with a sub-query (to see the explain)?
> SELECT max(id) FROM (SELECT id FROM user_messages WHERE user_id = 13604)
> AS foo;

No, I tried that as well; PostgreSQL is clever enough to optimize it
into exactly the same query as the original.

Alexander.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2008-01-22 13:00:56 Re: Selecting max(pk) is slow on empty set
Previous Message Richard Huxton 2008-01-22 12:27:49 Re: Selecting max(pk) is slow on empty set