New Optimizer Behaviour In 7.0b1

From: Mark Kirkwood <markir(at)ihug(dot)co(dot)nz>
To: pgsql-sql(at)postgresql(dot)org
Subject: New Optimizer Behaviour In 7.0b1
Date: 2000-02-25 03:03:31
Message-ID: 38B5F103.68A06D0@ihug.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I have been playing a bit with this new release. There are lots of new
possible plans, which is really great.

I have been using the query shown below to study optimizer changes.

select
d0.d0f1,
count(f.f1)
from dim0 d0,
fact1 f
where d0.d0key = f.d0key
and d0.d0f1 between '1999-11-01' and '1999-12-01'
group by d0.d0f1

Table setup is :

Table "fact1" size 300000 rows
Attribute | Type | Modifier
-----------+---------+----------
d0key | integer |
d1key | integer |
f1 | integer |
Index: fact1_q1 on d0key

Table "dim0" size 900 rows
Attribute | Type | Modifier
-----------+-------------+----------
d0key | integer |
d0f1 | timestamp |
d0f2 | varchar(20) |
d0f3 | varchar(20) |
Indices: dim0_pk on d0key,
dim0_q1 on d0f1

Explain is :
Aggregate (cost=12205.78..12372.44 rows=3333 width=20)
-> Group (cost=12205.78..12289.11 rows=33333 width=20)
-> Sort (cost=12205.78..12205.78 rows=33333 width=20)
-> Hash Join (cost=21.75..9371.33 rows=33333 width=20)
-> Seq Scan on fact1 f (cost=0.00..4765.00
rows=300000 width=8)
-> Hash (cost=21.50..21.50 rows=100 width=12)
-> Seq Scan on dim0 d0 (cost=0.00..21.50
rows=100 width=12)

Initially this ran fairly slowly : 8-10s , the query scans about 9000
out the 300000 in the big table(fact1).
A bit of tweeking with the set variables : ( these are new -see
src/backend/commands/variable.c )
set cpu_tuple_cost = '0.6';
set enable_hashjoin = 'off';
set enable_mergejoin = 'off';

gave a new plan :
Aggregate (cost=0.00..18476945.83 rows=3333 width=20)
-> Group (cost=0.00..18476862.50 rows=33333 width=20)
-> Nested Loop (cost=0.00..18476779.16 rows=33333 width=20)
-> Index Scan using dim0_q1 on dim0 d0 (cost=0.00..81.98
rows=100 width=12)
-> Index Scan using fact1_q1 on fact1 f
(cost=0.00..4016.97 rows=1500 width=8)

which is devestatiingly fast... about 1 s. Note that the table order is
reversed and that the index on the big
table ( fact1) is used.

However it seems a bit on the brutal side to have to coerce the
optimizer this way ( after all hash joins are
generally good), is there any way to get a reasonably sensible use of
indexes without such desperate
measures ?

P.s : I realize that this is beta 1..... I am impressed, I have had no
problems relinking php4 and subsequently apache for use with this
release - seems like a very good quality beta 1. well done guys!

Mark
(markir(at)ihug(dot)co(dot)nz,
mark(dot)kirkwood(at)hnz(dot)co(dot)nz )

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2000-02-25 04:48:29 Re: [INTERFACES] arrays of varchar and "
Previous Message Tom Lane 2000-02-25 01:03:37 Re: [INTERFACES] arrays of varchar and "