Re: SELECT LIMIT 1 VIEW Performance Issue

From: K C Lau <kclau60(at)netvigator(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: SELECT LIMIT 1 VIEW Performance Issue
Date: 2005-09-23 08:53:55
Message-ID: 6.2.1.2.0.20050923155952.05889a50@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thank you all for your suggestions. I' tried, with some variations too, but
still no success. The times given are the best of a few repeated tries on
an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.

For reference, only the following gets the record quickly:

esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

Index Scan using pk_player on player a (cost=0.75..4.26 rows=1 width=23)
(actual time=0.054..0.057 rows=1 loops=1)
Index Cond: (((playerid)::text = '22220'::text) AND ((atdate)::text =
($0)::text))
InitPlan
-> Limit (cost=0.00..0.75 rows=1 width=23) (actual
time=0.027..0.028 rows=1 loops=1)
-> Index Scan Backward using pk_player on player
b (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1
loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Total runtime: 0.132 ms

At 02:19 05/09/23, Kevin Grittner wrote:
>Have you tried the "best choice" pattern -- where you select the set of
>candidate rows and then exclude those for which a better choice
>exists within the set? I often get better results with this pattern than
>with the alternatives.

esdt=> explain analyze select PlayerID,AtDate from Player a where
PlayerID='22220'
and not exists (select * from Player b where b.PlayerID = a.PlayerID and
b.AtDate > a.AtDate);

Index Scan using pk_player on player a (cost=0.00..3032.46 rows=878
width=23)
(actual time=35.820..35.823 rows=1 loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Filter: (NOT (subplan))
SubPlan
-> Index Scan using pk_player on player b (cost=0.00..378.68
rows=389 width=776) (actual time=0.013..0.013 rows=1 loops=1743)
Index Cond: (((playerid)::text = ($0)::text) AND
((atdate)::text > ($1)::text))
Total runtime: 35.950 ms

Note that it is faster than the LIMIT 1:

esdt=> explain analyze select PlayerID,AtDate from Player a where
PlayerID='22220' and AtDate = (select b.AtDate from Pl
ayer b where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate
desc LIMIT 1);

Index Scan using pk_player on player a (cost=0.00..2789.07 rows=9
width=23) (actual time=41.366..41.371 rows=1 loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Filter: ((atdate)::text = ((subplan))::text)
SubPlan
-> Limit (cost=0.00..0.83 rows=1 width=23) (actual
time=0.013..0.014 rows=1 loops=1743)
-> Index Scan Backward using pk_player on player
b (cost=0.00..970.53 rows=1166 width=23) (actual time=0.008..0.008 rows=1
loops=1743)
Index Cond: ((playerid)::text = ($0)::text)
Total runtime: 41.490 ms

At 02:07 05/09/23, Merlin Moncure wrote:
> > >Here is a trick I use sometimes with views, etc. This may or may not be
> > >effective to solve your problem but it's worth a shot. Create one small
> > >SQL function taking date, etc. and returning the values and define it
> > >immutable. Now in-query it is treated like a constant.

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$> select atdate from player where playerid = $1 order by playerid
desc, AtDate desc limit 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> create or replace view VCurPlayer3 as select * from Player where
AtDate = player_max_atdate(PlayerID);
CREATE VIEW
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

Index Scan using pk_player on player (cost=0.00..1331.83 rows=9
width=23) (actual time=65.434..65.439 rows=1 loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
Total runtime: 65.508 ms

While it says loops=1, the time suggests that it is going through all 1743
records for that PlayerID.

I tried to simulate the fast subquery inside the function, but it is taking
almost twice as much time:

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$> select atdate from player a where playerid = $1 and AtDate =
(select b.AtDate from Player b
esdt$> where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1);
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

Index Scan using pk_player on player (cost=0.00..1331.83 rows=9
width=23) (actual time=119.369..119.373 rows=1 loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
Total runtime: 119.441 ms

Adding another LIMIT 1 inside the function makes it even slower:

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$> select atdate from player where playerid = $1 and AtDate = (select
b.AtDate from Player b
esdt$> where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1)
esdt$> order by PlayerID desc, AtDate desc LIMIT 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

Index Scan using pk_player on player (cost=0.00..1331.83 rows=9
width=23) (actual time=129.858..129.863 rows=1 loops=1)
Index Cond: ((playerid)::text = '22220'::text)
Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
Total runtime: 129.906 ms

At 00:16 05/09/23, Simon Riggs wrote:
>If the current value is used so often, use two tables - one with a
>current view only of the row maintained using UPDATE. Different
>performance issues maybe, but at least not correlated subquery ones.

Many of our tables have similar construct and it would be a huge task to
duplicate and maintain all these tables throughout the system. We would
prefer a solution with SQL or function at the view or db level, or better
still, a fix, if this problem is considered general enough.

>You're welcome in advance, ;)
>Merlin

Thank you all in advance for any further ideas.
KC.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ulrich Wisser 2005-09-23 09:31:25 Re: How to determine cause of performance problem?
Previous Message Oleg Bartunov 2005-09-23 07:35:48 Re: tsearch2 seem very slow