Skip site navigation (1) Skip section navigation (2)

Re: max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS)

From: "Adam Rich" <adam(dot)r(at)sbcglobal(dot)net>
To: "'Joshua D(dot) Drake'" <jd(at)commandprompt(dot)com>,"'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Craig A(dot) James'" <cjames(at)modgraph-usa(dot)com>,"'PostgreSQL Performance'" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS)
Date: 2007-01-15 04:52:16
Message-ID: 069301c73860$ee779a20$6400a8c0@dualcore (view raw or flat)
Thread:
Lists: pgsql-performance
Did anybody get a chance to look at this?  Is it expected behavior?
Everyone seemed so incredulous, I hoped maybe this exposed a bug
that would be fixed in a near release.


-----Original Message-----
From: Adam Rich [mailto:adam(dot)r(at)sbcglobal(dot)net] 
Sent: Sunday, January 07, 2007 11:53 PM
To: 'Joshua D. Drake'; 'Tom Lane'
Cc: 'Craig A. James'; 'PostgreSQL Performance'
Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS



Here's another, more drastic example... Here the order by / limit
version
runs in less than 1/7000 the time of the MAX() version.


select max(item_id)
from events e, receipts r, receipt_items ri
where e.event_id=r.event_id and r.receipt_id=ri.receipt_id

Aggregate  (cost=10850.84..10850.85 rows=1 width=4) (actual
time=816.382..816.383 rows=1 loops=1)
  ->  Hash Join  (cost=2072.12..10503.30 rows=139019 width=4) (actual
time=155.177..675.870 rows=147383 loops=1)
        Hash Cond: (ri.receipt_id = r.receipt_id)
        ->  Seq Scan on receipt_items ri  (cost=0.00..4097.56
rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1)
        ->  Hash  (cost=2010.69..2010.69 rows=24571 width=4) (actual
time=155.146..155.146 rows=24571 loops=1)
              ->  Hash Join  (cost=506.84..2010.69 rows=24571 width=4)
(actual time=34.803..126.452 rows=24571 loops=1)
                    Hash Cond: (r.event_id = e.event_id)
                    ->  Seq Scan on receipts r  (cost=0.00..663.58
rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1)
                    ->  Hash  (cost=469.73..469.73 rows=14843 width=4)
(actual time=34.780..34.780 rows=14843 loops=1)
                          ->  Seq Scan on events e  (cost=0.00..469.73
rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1)
Total runtime: 816.645 ms

select item_id
from events e, receipts r, receipt_items ri
where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
order by item_id desc limit 1


Limit  (cost=0.00..0.16 rows=1 width=4) (actual time=0.047..0.048 rows=1
loops=1)
  ->  Nested Loop  (cost=0.00..22131.43 rows=139019 width=4) (actual
time=0.044..0.044 rows=1 loops=1)
        ->  Nested Loop  (cost=0.00..12987.42 rows=168196 width=8)
(actual time=0.032..0.032 rows=1 loops=1)
              ->  Index Scan Backward using receipt_items_pkey on
receipt_items ri  (cost=0.00..6885.50 rows=168196 width=8) (actual
time=0.016..0.016 rows=1 loops=1)
              ->  Index Scan using receipts_pkey on receipts r
(cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1
loops=1)
                    Index Cond: (r.receipt_id = ri.receipt_id)
        ->  Index Scan using events_pkey on events e  (cost=0.00..0.04
rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
              Index Cond: (e.event_id = r.event_id)
Total runtime: 0.112 ms





-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Joshua D.
Drake
Sent: Sunday, January 07, 2007 9:10 PM
To: Adam Rich
Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance'
Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS


On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote:
> I'm using 8.2 and using order by & limit is still faster than MAX()
> even though MAX() now seems to rewrite to an almost identical plan
> internally.


Gonna need you to back that up :) Can we get an explain analyze?


> Count(*) still seems to use a full table scan rather than an index
scan.
> 

There is a TODO out there to help this. Don't know if it will get done.

Joshua D. Drake

-- 

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org


Responses

pgsql-performance by date

Next:From: Luke LonerganDate: 2007-01-15 07:42:34
Subject: Re: max() versus order/limit (WAS: High update
Previous:From: Luke LonerganDate: 2007-01-14 19:20:35
Subject: Re: Large table performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group