| From: | K C Lau <kclau60(at)netvigator(dot)com> | 
|---|---|
| To: | Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> | 
| Cc: | pgsql-performance(at)postgresql(dot)org | 
| Subject: | Re: SELECT LIMIT 1 VIEW Performance Issue | 
| Date: | 2005-09-24 03:18:24 | 
| Message-ID: | 6.2.1.2.0.20050924105728.05388920@localhost | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-performance | 
Dear Mark,
Thank you. That seems like a more manageable alternative if nothing else 
works out. It should cover many of the OLTP update transactions. But it 
does mean quite a bit of programming changes and adding another index on 
all such tables, and it would not cover those cases when we need to get the 
latest record before a certain time, for example.
I'm wondering if this performance issue is common enough for other users to 
merit a fix in pg, especially as it seems that with MVCC, each of the data 
records need to be accessed in addition to scanning the index.
Best regards,
KC.
At 09:40 05/09/24, Mark Kirkwood wrote:
>A small denormalization, where you mark the row with the latest atdate for 
>each playerid may get you the performance you want.
>
>e.g: (8.1beta1)
>
>ALTER TABLE player ADD islastatdate boolean;
>
>UPDATE player SET islastatdate = true where (playerid,atdate) IN
>(SELECT playerid, atdate FROM vcurplayer);
>
>CREATE OR REPLACE VIEW vcurplayer AS
>SELECT * FROM player a
>WHERE islastatdate;
>
>CREATE INDEX player_id_lastatdate ON player(playerid, islastatdate)
>WHERE islastatdate;
>
>ANALYZE player;
>
>Generating some test data produced:
>
>EXPLAIN ANALYZE
>SELECT playerid,atdate
>FROM vcurplayer
>WHERE playerid='22220';
>
>                                                            QUERY PLAN
>--------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using player_id_lastatdate on player a  (cost=0.00..4.33 
> rows=1 width=13) (actual time=0.142..0.149 rows=1 loops=1)
>    Index Cond: ((playerid = '22220'::text) AND (lastatdate = true))
>    Filter: lastatdate
>  Total runtime: 0.272 ms
>(4 rows)
>
>Whereas with the original view definition:
>
>CREATE OR REPLACE VIEW vcurplayer AS
>SELECT * FROM player a
>WHERE a.atdate =
>( SELECT max(b.atdate) FROM player b
>   WHERE a.playerid = b.playerid);
>
>EXPLAIN ANALYZE
>SELECT playerid,atdate
>FROM vcurplayer
>WHERE playerid='22220';
>     QUERY PLAN
>-----------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using player_id_date on player a  (cost=0.00..7399.23 rows=11 
> width=13) (actual time=121.738..121.745 rows=1 loops=1)
>    Index Cond: (playerid = '22220'::text)
>    Filter: (atdate = (subplan))
>    SubPlan
>      ->  Result  (cost=1.72..1.73 rows=1 width=0) (actual 
> time=0.044..0.047 rows=1 loops=2000)
>            InitPlan
>              ->  Limit  (cost=0.00..1.72 rows=1 width=4) (actual 
> time=0.028..0.031 rows=1 loops=2000)
>                    ->  Index Scan Backward using player_id_date on player 
> b  (cost=0.00..3787.94 rows=2198 width=4) (actual time=0.019..0.019 
> rows=1 loops=2000)
>                          Index Cond: ($0 = playerid)
>                          Filter: (atdate IS NOT NULL)
>  Total runtime: 121.916 ms
>(11 rows)
>
>Note that my generated data has too many rows for each playerid, but 
>the  difference in performance should illustrate the idea.
>
>Cheers
>
>Mark
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mark Kirkwood | 2005-09-24 05:14:59 | Re: SELECT LIMIT 1 VIEW Performance Issue | 
| Previous Message | Tom Lane | 2005-09-24 03:09:09 | Re: optimization downgrade perfomance? |