Re: PostgreSQL runs a query much slower than BDE and MySQL

From: Rodrigo De León <rdeleonp(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL runs a query much slower than BDE and MySQL
Date: 2006-08-16 18:02:31
Message-ID: a55915760608161102j6f93402aufaafdaee0ef49278@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 8/16/06, Peter Hardman <peter(at)ssbg(dot)zetnet(dot)co(dot)uk> wrote:
> I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user
> Paradox to a web based interface to either MySQL or PostgreSQL.
> The database is a pedigree sheep breed society database recording sheep and
> flocks (amongst other things).
>
> My current problem is with one table and an associated query which takes 10
> times longer to execute on PostgreSQL than BDE, which in turn takes 10 times
> longer than MySQL. The table links sheep to flocks and is created as follows:
>
> CREATE TABLE SHEEP_FLOCK
> (
> regn_no varchar(7) NOT NULL,
> flock_no varchar(6) NOT NULL,
> transfer_date date NOT NULL,
> last_changed date NOT NULL,
> CONSTRAINT SHEEP_FLOCK_pkey PRIMARY KEY (regn_no, flock_no,
> transfer_date)
> )
> WITHOUT OIDS;
> ALTER TABLE SHEEP_FLOCK OWNER TO postgres;
>
> I then populate the table with
>
> COPY SHEEP_FLOCK
> FROM 'e:/ssbg/devt/devt/export_data/sheep_flock.txt'
> WITH CSV HEADER
>
> The table then has about 82000 records
>
> The query I run is:
>
> /* Select all sheep who's most recent transfer was into the subject flock */
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1 JOIN
> /* The last transfer date for each sheep */
> (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
> FROM SHEEP_FLOCK f
> GROUP BY f.regn_no) f2
> ON f1.regn_no = f2.regn_no
> WHERE f1.flock_no = '1359'
> AND f1.transfer_date = f2.last_xfer_date
>
> The sub-select on it's own returns about 32000 rows.
>
> Using identically structured tables and the same primary key, if I run this on
> Paradox/BDE it takes about 120ms, on MySQL (5.0.24, local server) about 3ms,
> and on PostgresSQL (8.1.3, local server) about 1290ms). All on the same
> Windows XP Pro machine with 512MB ram of which nearly half is free.
>
> The query plan shows most of the time is spent sorting the 30000+ rows from the subquery, so I added a further
> subquery as follows:
>
> /* Select all sheep who's most recent transfer was into the subject flock */
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1 JOIN
> /* The last transfer date for each sheep */
> (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
> FROM SHEEP_FLOCK f
> WHERE f.regn_no IN
> /* Limit the rows extracted by the outer sub-query to those relevant to the
> subject flock */
> /* This typically reduces the time from 1297ms to 47ms - from 35000 rows
> to 127 rows */
> (SELECT s.regn_no FROM SHEEP_FLOCK s where s.flock_no = '1359')
> GROUP BY f.regn_no) f2
> ON f1.regn_no = f2.regn_no
> WHERE f1.flock_no = '1359'
> AND f1.transfer_date = f2.last_xfer_date
>
> then as the comment suggests I get a considerable improvement, but it's still an
> order of magnitude slower than MySQL.
>
> Can anyone suggest why PostgreSQL performs the original query so much slower than even BDE?

ANALYZE?

Regards,

Rodrigo

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Sebastián Baioni 2006-08-16 18:27:05 Re: Inner Join of the same table
Previous Message Arjen van der Meijden 2006-08-16 18:02:24 Re: PostgreSQL runs a query much slower than BDE and MySQL