Re: SELECT LIMIT 1 VIEW Performance Issue

From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: K C Lau <kclau60(at)netvigator(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SELECT LIMIT 1 VIEW Performance Issue
Date: 2005-09-24 01:40:25
Message-ID: 4334AE89.8060806@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

K C Lau wrote:
> Thank you all for your suggestions. I' tried, with some variations too,
> but still no success. The times given are the best of a few repeated
> tries on an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.
>

A small denormalization, where you mark the row with the latest atdate
for each playerid may get you the performance you want.

e.g: (8.1beta1)

ALTER TABLE player ADD islastatdate boolean;

UPDATE player SET islastatdate = true where (playerid,atdate) IN
(SELECT playerid, atdate FROM vcurplayer);

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE islastatdate;

CREATE INDEX player_id_lastatdate ON player(playerid, islastatdate)
WHERE islastatdate;

ANALYZE player;

Generating some test data produced:

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='22220';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------
Index Scan using player_id_lastatdate on player a (cost=0.00..4.33
rows=1 width=13) (actual time=0.142..0.149 rows=1 loops=1)
Index Cond: ((playerid = '22220'::text) AND (lastatdate = true))
Filter: lastatdate
Total runtime: 0.272 ms
(4 rows)

Whereas with the original view definition:

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE a.atdate =
( SELECT max(b.atdate) FROM player b
WHERE a.playerid = b.playerid);

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='22220';

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using player_id_date on player a (cost=0.00..7399.23
rows=11 width=13) (actual time=121.738..121.745 rows=1 loops=1)
Index Cond: (playerid = '22220'::text)
Filter: (atdate = (subplan))
SubPlan
-> Result (cost=1.72..1.73 rows=1 width=0) (actual
time=0.044..0.047 rows=1 loops=2000)
InitPlan
-> Limit (cost=0.00..1.72 rows=1 width=4) (actual
time=0.028..0.031 rows=1 loops=2000)
-> Index Scan Backward using player_id_date on
player b (cost=0.00..3787.94 rows=2198 width=4) (actual
time=0.019..0.019 rows=1 loops=2000)
Index Cond: ($0 = playerid)
Filter: (atdate IS NOT NULL)
Total runtime: 121.916 ms
(11 rows)

Note that my generated data has too many rows for each playerid, but the
difference in performance should illustrate the idea.

Cheers

Mark

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-09-24 03:09:09 Re: optimization downgrade perfomance?
Previous Message Cristian Prieto 2005-09-23 22:03:11 Index use in BETWEEN statement...