Optimizer badness in 7.0 beta

From: Brian Hirt <bhirt(at)mobygames(dot)com>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Brian Hirt <bhirt(at)loopy(dot)berkhirt(dot)com>
Subject: Optimizer badness in 7.0 beta
Date: 2000-03-05 09:26:55
Message-ID: 20000305032655.A29257@loopy.berkhirt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I just downloaded the 7.0 beta to test it with my database to make sure
there will be no unexpected problems when I upgrade my production site.
I've run into a problem that I hope you can help me with. I dumped my
6.5.2 database and loaded it into 7.0. Lot's of queries are now taking
much much longer. I have included the plans from one of the queries.
In 7.0, the query takes 94 seconds compared to less than a second for
it to run on 6.5.2. All of the data is exactly the same, the indexes are the
same. I thought maybe the indexes had bad statistics, so I "vaccum analyze"
both the 6.5.2 database and the 7.0 database and ran again on both just to
be on the safe side. Still, same problem. I know that there were problems
with IN clauses optimizing and the preferred method is to use an exists
statement. However, I wouldn't expect this kind of change in performance.
It does appear that 7.0 is trying to be smarter by using an index in the
SubPlan, but for some reason it's being a hog.

Some more information that may be useful, the table 'game' has about 1000
rows and the table game_developer has about 15000 rows. There is an
index on game_developer(developer_id)

Other than these types of queries, everything else seems to be working
okay. I logged about 500 different queries that run against my database,
removed the ones that exhibit the behaviour above and ran a little
benchmark. The run times between 6.5.2 and 7.0.0, for the types of
queries I'm running, are almost identical. I was hoping that the new
improved optimizer would bring a great speed improvement, but I'm not
seeing it. My guess is that most of the queries that I'm running are
small and there's a fixed cost associated with running each one -- the
actual work they perform is pretty small. Possibly more time is being
spent optimizing the plan and is offsetting the improved execution time
on smaller queries.

-brian

-- PG 7.0 --
NOTICE: QUERY PLAN:

Sort (cost=383940.72..383940.72 rows=905 width=59)
-> Seq Scan on game (cost=0.00..383896.28 rows=905 width=59)
SubPlan
-> Unique (cost=0.00..808.88 rows=0 width=4)
-> Index Scan using game_developer_game_index on game_developer (cost=0.00..808.87 rows=4 width=4)

EXPLAIN

-- PG 6.5.2 --
NOTICE: QUERY PLAN:

Sort (cost=99.32 rows=872 width=59)
-> Seq Scan on game (cost=99.32 rows=872 width=59)
SubPlan
-> Unique (cost=578.53 rows=2 width=4)
-> Sort (cost=578.53 rows=2 width=4)
-> Seq Scan on game_developer (cost=578.53 rows=2 width=4)

EXPLAIN

Query:

select
creation_timestamp,
approved,
moby_user_id,
copyright_year,
game_title,
game_url,
company_line,
credits_complete,
game_id
from
game
where
approved = 1
and
game_id in (
select
distinct game_id
from
game_developer
where
developer_id = 3)
order by
copyright_year desc,
game_title;
--
The world's most ambitious and comprehensive PC game database project.

http://www.mobygames.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2000-03-05 13:29:50 pg_pwd trigger to be removed
Previous Message Tatsuo Ishii 2000-03-05 09:09:51 DROP TABLE inside a transaction block