Re: select max(field) from table much faster with a group by clause?

From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: select max(field) from table much faster with a group by clause?
Date: 2007-11-01 14:20:14
Message-ID: 00DC41259D75AF3300AAF9D2@rambutan.pingpong.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

--On torsdag, november 01, 2007 09.43.39 -0400 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
wrote:

> Palle Girgensohn <girgen(at)pingpong(dot)net> writes:
>> When running
>> select max("when") from login where userid='userid'
>> it takes a second or two, but when adding "group by userid" the planner
>> decides on using another plan, and it gets *much* faster. See example
>> below.
>
> It's only faster for cases where there are few or no rows for the
> particular userid ...

Well, no, not really. See below. OTH, it sometimes a bit slower. Seems to
depend on how far away from the estimated number of rows you get? Weird?

>> Number of tuples per user varies from zero to a couple of thousands.
>
> The planner is using an intermediate estimate of 406 rows. You might be
> well advised to increase the statistics target for login.userid --- with
> luck that would help it to choose the right plan type for both common
> and uncommon userids.

I'll try that, thanks!

--

pp=# SELECT max("when") FROM login WHERE userid='kudo' group by userid;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 6,194 ms
pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 992,391 ms
pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 779,582 ms
pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 818,667 ms
pp=# SELECT max("when") FROM login WHERE userid='kudo' ;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 640,242 ms
pp=# SELECT max("when") FROM login WHERE userid='kudo' group by userid;
max
-------------------------------
2007-01-04 15:31:46.863325+01
(1 row)

Time: 18,384 ms
pp=# SELECT count(*) FROM login WHERE userid='kudo' group by userid;
count
-------
1998
(1 row)

Time: 12,762 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo'
group by userid;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..648.44 rows=1 width=25) (actual
time=24.700..24.703 rows=1 loops=1)
-> Index Scan using login_userid_idx on "login" (cost=0.00..646.40
rows=406 width=25) (actual time=0.140..16.931 rows=1998 loops=1)
Index Cond: (userid = 'kudo'::text)
Total runtime: 24.779 ms
(4 rows)

Time: 25,633 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo' ;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=323.93..323.94 rows=1 width=0) (actual
time=1400.994..1400.997 rows=1 loops=1)
InitPlan
-> Limit (cost=0.00..323.93 rows=1 width=8) (actual
time=1400.975..1400.979 rows=1 loops=1)
-> Index Scan Backward using login_when_idx on "login"
(cost=0.00..131515.87 rows=406 width=8) (actual time=1400.968..1400.968
rows=1 loops=1)
Filter: (("when" IS NOT NULL) AND (userid = 'kudo'::text))
Total runtime: 1401.057 ms
(6 rows)

Time: 1401,881 ms

pp=# SELECT userid, count("when") FROM login WHERE userid in ('girgen' ,
'kudo') group by userid;
userid | count
--------+-------
kudo | 1998
girgen | 1120
(2 rows)

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen'
group by userid;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..648.44 rows=1 width=25) (actual
time=25.137..25.141 rows=1 loops=1)
-> Index Scan using login_userid_idx on "login" (cost=0.00..646.40
rows=406 width=25) (actual time=0.121..20.712 rows=1120 loops=1)
Index Cond: (userid = 'girgen'::text)
Total runtime: 25.209 ms
(4 rows)

Time: 25,986 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=323.93..323.94 rows=1 width=0) (actual time=6.695..6.698
rows=1 loops=1)
InitPlan
-> Limit (cost=0.00..323.93 rows=1 width=8) (actual
time=6.669..6.675 rows=1 loops=1)
-> Index Scan Backward using login_when_idx on "login"
(cost=0.00..131515.87 rows=406 width=8) (actual time=6.660..6.660 rows=1
loops=1)
Filter: (("when" IS NOT NULL) AND (userid =
'girgen'::text))
Total runtime: 6.785 ms
(6 rows)

Time: 7,776 ms

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Sami Dalouche 2007-11-01 14:21:17 Re: [Fwd: Re: Outer joins and Seq scans]
Previous Message Tom Lane 2007-11-01 13:43:39 Re: select max(field) from table much faster with a group by clause?