7.4.xx regression

From: bs <bs(at)ionicsoft(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: 7.4.xx regression
Date: 2004-01-21 09:51:13
Message-ID: 400E4B91.1060402@ionicsoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I have the following tables and indexes

CREATE TABLE cddb
(
id integer PRIMARY KEY,

/* artist name */
name text,

/* disk title */
title text,

/* the type of the artist group, person, orchestra*/
type smallint,

/* the creation date */
created integer
);

CREATE INDEX cddb1 on cddb(title);
CREATE INDEX cddb2 on cddb(name);

CREATE TABLE cddbentry
(
id integer PRIMARY KEY,

diskid text,

entry integer REFERENCES cddb
);

CREATE INDEX cddbentry1 on cddbentry (diskid);
CREATE INDEX cddbentry2 on cddbentry (entry);

I am using the following query:
select cddb.* from cddb,cddbentry where cddbentry.diskid = 'toto' and
cddbentry.entry = cddb.id;

On version 7.3.4 it produces this query plan

Nested Loop (cost=0.00..41.25 rows=5 width=78)
-> Index Scan using cddbentry1 on cddbentry (cost=0.00..17.07
rows=5 width=4)
Index Cond: (diskid = 'toto'::text)
-> Index Scan using cddb_pkey on cddb (cost=0.00..4.82 rows=1 width=74)
Index Cond: ("outer".entry = cddb.id)

I have upgraded to version 7.4.0 (compiling the software and migrating
the database using dump/restore)

Now the following plan is produced

Hash Join (cost=17.08..42.15 rows=7 width=74)
Hash Cond: ("outer".id = "inner".entry)
-> Seq Scan on cddb (cost=0.00..20.00 rows=1000 width=74)
-> Hash (cost=17.07..17.07 rows=6 width=4)
-> Index Scan using cddbentry1 on cddbentry (cost=0.00..17.07
rows=6 width=4)
Index Cond: (diskid = 'toto'::text)

which result in a VERY much slower query as the cddb table has more than
1 million entry.... and there is at most one entry in cddbentry
which matches the diskid !

The workaround is to disable the hash join capability using set
enable_hashjoin to false; resulting in the same query plan as in 7.3.x.

I think this can classified as a regression bug.

N.B: The cost value may be wrong, because to be able to send you this
email, they have been produced on empty tables; but
I can guarantee you they are the same on the full loaded database.

Hope this help to improve this great product

Bernard SNYERS

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Meskes 2004-01-21 14:11:13 Re: BUG #1056: ecpg ignores WHENEVER NOT FOUND
Previous Message PostgreSQL Bugs List 2004-01-20 22:52:08 BUG #1057: psql sometimes does not flush its output