Re: [HACKERS] Optimizer badness in 7.0 beta

From: Brian Hirt <bhirt(at)mobygames(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Optimizer badness in 7.0 beta
Date: 2000-03-05 21:01:32
Message-ID: 20000305150132.B29257@loopy.berkhirt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter,

Actually, the query you supply will not work, I'll get duplicate
rows because the relationship between game and game_developer is
a one to many. Of course you had no way of knowing that from the
information I supplied. I could throw a distinct in there to get
the results, but that really feels like bad form because of the large
amount of duplicate rows. In any case, the original query I supplied
is generated SQL, created by a Database to Object persistance layer
and cannot by design have multiple tables in the from clause, so
restrictions to the table seleted from must be in the form of a
qualifier.

I realize that the query in question could be written better. My
concern was the huge difference in performance between 6.5 and 7.0 on
this type of query. Other people may be bitten by this one, so I wanted
to bring it up. I've been able to easily to work around this problem,
it just seems wrong that the difference in execution time is so far
off from the previous release.

I dont know too much about the PG internals, but when I used sybase,
it would usually execute the sub-select independently and stuff the
results into a temp table and then do another query, joining to the
results of the sub-select. In a situation like this one, worst case
without indexes you would get a table scan for the sub-select
and then a merge join with a sequential scan on the temp table and a
sequential scan on the other table (example below). Using that
approach, with no indexes, the query still executes in a fraction of a
second. It just seems that a query on tables as small as I'm describing
should never take as long as it did. It seems like a problem with
the optimizer, but if people are happy with currenty functionality that's
fine with me also.

-brian

For Example:

SELECT DISTINCT game_id INTO temp tmp_res
FROM game_developer
WHERE developer_id = 3

SELECT *
FROM game, tmp_res
WHERE game.game_id = tmp_res.game_id
AND game.approved = 1
ORDER BY copyright_year desc, game_title

On Sun, Mar 05, 2000 at 03:15:45PM +0100, Peter Eisentraut wrote:
> This query can be rewritten as
>
> SELECT creation_timestamp, etc.
> FROM game, game_developer
> WHERE game.game_id = game_developer.game_id
> AND approved = 1 AND developer_id = 3
> ORDER BY copyright_year desc, game_title
>
> The way you're writing it you're almost asking it to be slow. :)
>
> Of course that still doesn't explain why it's now 94sec versus formerly 1
> but I'm sure Tom Lane will enlighten us all very soon. :)
>
>
> Brian Hirt writes:
>
> > select
> > creation_timestamp,
> [snip]
> > from
> > game
> > where
> > approved = 1
> > and
> > game_id in (
> > select
> > distinct game_id
> > from
> > game_developer
> > where
> > developer_id = 3)
> > order by
> > copyright_year desc,
> > game_title;
>
>
> --
> Peter Eisentraut Sernanders väg 10:115
> peter_e(at)gmx(dot)net 75262 Uppsala
> http://yi.org/peter-e/ Sweden
>
>
>
> ************

--
The world's most ambitious and comprehensive PC game database project.

http://www.mobygames.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mike Mascari 2000-03-06 01:31:25 Re: [HACKERS] DROP TABLE inside a transaction block
Previous Message Peter Eisentraut 2000-03-05 14:15:45 Re: [HACKERS] Optimizer badness in 7.0 beta