Re: PostgreSQL runs a query much slower than BDE and MySQL

From: Scott Lamb <slamb(at)slamb(dot)org>
To: Scott Lamb <slamb(at)slamb(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Peter Hardman" <peter(at)ssbg(dot)zetnet(dot)co(dot)uk>, pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL runs a query much slower than BDE and MySQL
Date: 2006-08-17 22:00:52
Message-ID: EB1CB9B3-3853-401C-A947-FB9B2648902C@slamb.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Peter, I compared these using the data you supplied on my PostgreSQL
8.1.4 system:

On Aug 17, 2006, at 12:09 PM, Scott Lamb wrote:

> On Aug 16, 2006, at 3:51 PM, Tom Lane wrote:
>>> /* Select all sheep who's most recent transfer was into the
>>> subject flock */
>>> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
>>> FROM SHEEP_FLOCK f1 JOIN
>>> /* The last transfer date for each sheep */
>>> (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
>>> FROM SHEEP_FLOCK f
>>> GROUP BY f.regn_no) f2
>>> ON f1.regn_no = f2.regn_no
>>> WHERE f1.flock_no = '1359'
>>> AND f1.transfer_date = f2.last_xfer_date


QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------
Unique (cost=2575.07..2575.08 rows=1 width=36) (actual
time=1083.579..1083.696 rows=32 loops=1)
-> Sort (cost=2575.07..2575.07 rows=1 width=36) (actual
time=1083.576..1083.613 rows=32 loops=1)
Sort Key: f1.regn_no, f1.transfer_date
-> Nested Loop (cost=1364.00..2575.06 rows=1 width=36)
(actual time=287.895..1083.297 rows=32 loops=1)
-> HashAggregate (cost=1364.00..1366.50 rows=200
width=36) (actual time=262.345..337.940 rows=38815 loops=1)
-> Seq Scan on sheep_flock f
(cost=0.00..1116.00 rows=49600 width=36) (actual time=0.005..119.282
rows=81802 loops=1)
-> Index Scan using sheep_flock_pkey on sheep_flock
f1 (cost=0.00..6.02 rows=1 width=36) (actual time=0.016..0.016
rows=0 loops=38815)
Index Cond: (((f1.regn_no)::text =
("outer".regn_no)::text) AND ((f1.flock_no)::text = '1359'::text) AND
(f1.transfer_date = "outer"."?column2?"))
Total runtime: 1085.115 ms
(9 rows)

>>
>> This seems pretty closely related to this recent thread:
>> http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
>> in which the OP is doing a very similar kind of query in almost
>> exactly
>> the same way.
>>
>> I can't help thinking that there's probably a better way to phrase
>> this
>> type of query in SQL, though it's not jumping out at me what that is.
>
> I don't know about better, but I tend to phrase these in a quite
> different way that's (hopefully) equivalent:
>
> select latest.regn_no,
> latest.transfer_date as date_in
> from sheep_flock latest
> where not exists (
> select 'x'
> from sheep_flock even_later
> where latest.regn_no = even_later.regn_no
> and latest.transfer_date < even_later.transfer_date)
> and latest.flock_no = '1359'
>
> There's no MAX() or DISTINCT here, so maybe this is easier to
> optimize?

Q
UERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------
Bitmap Heap Scan on sheep_flock latest (cost=764.60..2185.05
rows=124 width=36) (actual time=11.915..13.800 rows=32 loops=1)
Recheck Cond: ((flock_no)::text = '1359'::text)
Filter: (NOT (subplan))
-> Bitmap Index Scan on sheep_flock_pkey (cost=0.00..764.60
rows=248 width=0) (actual time=10.950..10.950 rows=127 loops=1)
Index Cond: ((flock_no)::text = '1359'::text)
SubPlan
-> Index Scan using sheep_flock_pkey on sheep_flock
even_later (cost=0.00..317.49 rows=83 width=0) (actual
time=0.016..0.016 rows=1 loops=127)
Index Cond: ((($0)::text = (regn_no)::text) AND ($1 <
transfer_date))
Total runtime: 13.902 ms
(9 rows)

seems to return the same data in two orders of magnitude less time.

--
Scott Lamb <http://www.slamb.org/>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-08-18 00:31:51 Re: PostgreSQL runs a query much slower than BDE and MySQL
Previous Message Peter Hardman 2006-08-17 20:25:21 Re: PostgreSQL runs a query much slower than BDE and MySQL