Re: max() versus order/limit (WAS: High update

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Luke Lonergan <LLonergan(at)greenplum(dot)com>
Cc: Adam Rich <adam(dot)r(at)sbcglobal(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: max() versus order/limit (WAS: High update
Date: 2007-01-15 15:51:50
Message-ID: 45ABA316.4060202@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Luke Lonergan wrote:
> Adam,
>
> This optimization would require teaching the planner to use an index for
> MAX/MIN when available. It seems like an OK thing to do to me.

Uhmmm I thought we did that already in 8.1?

Joshua D. Drake

>
> - Luke
>
>> -----Original Message-----
>> From: pgsql-performance-owner(at)postgresql(dot)org
>> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Adam Rich
>> Sent: Sunday, January 14, 2007 8:52 PM
>> To: 'Joshua D. Drake'; 'Tom Lane'
>> Cc: 'Craig A. James'; 'PostgreSQL Performance'
>> Subject: Re: [PERFORM] max() versus order/limit (WAS: High
>> update activity, PostgreSQL vs BigDBMS)
>>
>>
>> Did anybody get a chance to look at this? Is it expected behavior?
>> Everyone seemed so incredulous, I hoped maybe this exposed a
>> bug that would be fixed in a near release.
>>
>>
>> -----Original Message-----
>> From: Adam Rich [mailto:adam(dot)r(at)sbcglobal(dot)net]
>> Sent: Sunday, January 07, 2007 11:53 PM
>> To: 'Joshua D. Drake'; 'Tom Lane'
>> Cc: 'Craig A. James'; 'PostgreSQL Performance'
>> Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS
>>
>>
>>
>> Here's another, more drastic example... Here the order by / limit
>> version
>> runs in less than 1/7000 the time of the MAX() version.
>>
>>
>> select max(item_id)
>> from events e, receipts r, receipt_items ri
>> where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
>>
>> Aggregate (cost=10850.84..10850.85 rows=1 width=4) (actual
>> time=816.382..816.383 rows=1 loops=1)
>> -> Hash Join (cost=2072.12..10503.30 rows=139019 width=4) (actual
>> time=155.177..675.870 rows=147383 loops=1)
>> Hash Cond: (ri.receipt_id = r.receipt_id)
>> -> Seq Scan on receipt_items ri (cost=0.00..4097.56
>> rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1)
>> -> Hash (cost=2010.69..2010.69 rows=24571 width=4) (actual
>> time=155.146..155.146 rows=24571 loops=1)
>> -> Hash Join (cost=506.84..2010.69 rows=24571 width=4)
>> (actual time=34.803..126.452 rows=24571 loops=1)
>> Hash Cond: (r.event_id = e.event_id)
>> -> Seq Scan on receipts r (cost=0.00..663.58
>> rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1)
>> -> Hash (cost=469.73..469.73 rows=14843 width=4)
>> (actual time=34.780..34.780 rows=14843 loops=1)
>> -> Seq Scan on events e (cost=0.00..469.73
>> rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1)
>> Total runtime: 816.645 ms
>>
>> select item_id
>> from events e, receipts r, receipt_items ri
>> where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
>> order by item_id desc limit 1
>>
>>
>> Limit (cost=0.00..0.16 rows=1 width=4) (actual
>> time=0.047..0.048 rows=1
>> loops=1)
>> -> Nested Loop (cost=0.00..22131.43 rows=139019 width=4) (actual
>> time=0.044..0.044 rows=1 loops=1)
>> -> Nested Loop (cost=0.00..12987.42 rows=168196 width=8)
>> (actual time=0.032..0.032 rows=1 loops=1)
>> -> Index Scan Backward using receipt_items_pkey on
>> receipt_items ri (cost=0.00..6885.50 rows=168196 width=8) (actual
>> time=0.016..0.016 rows=1 loops=1)
>> -> Index Scan using receipts_pkey on receipts r
>> (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1
>> loops=1)
>> Index Cond: (r.receipt_id = ri.receipt_id)
>> -> Index Scan using events_pkey on events e (cost=0.00..0.04
>> rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
>> Index Cond: (e.event_id = r.event_id)
>> Total runtime: 0.112 ms
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: pgsql-performance-owner(at)postgresql(dot)org
>> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Joshua D.
>> Drake
>> Sent: Sunday, January 07, 2007 9:10 PM
>> To: Adam Rich
>> Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance'
>> Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
>>
>>
>> On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote:
>>> I'm using 8.2 and using order by & limit is still faster than MAX()
>>> even though MAX() now seems to rewrite to an almost identical plan
>>> internally.
>>
>> Gonna need you to back that up :) Can we get an explain analyze?
>>
>>
>>> Count(*) still seems to use a full table scan rather than an index
>> scan.
>> There is a TODO out there to help this. Don't know if it will
>> get done.
>>
>> Joshua D. Drake
>>
>> --
>>
>> === The PostgreSQL Company: Command Prompt, Inc. ===
>> Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
>> Providing the most comprehensive PostgreSQL solutions since 1997
>> http://www.commandprompt.com/
>>
>> Donate to the PostgreSQL Project:
>> http://www.postgresql.org/about/donate
>>
>>
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 4: Have you searched our list archives?
>>
>> http://archives.postgresql.org
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 7: You can help support the PostgreSQL project by donating at
>>
>> http://www.postgresql.org/about/donate
>>
>>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew Hammond 2007-01-15 17:55:47 FiberChannel cards for FreeBSD on AMD64
Previous Message Steinar H. Gunderson 2007-01-15 12:58:34 Re: pg_trgm performance