Skip site navigation (1) Skip section navigation (2)

Re: [HACKERS] Optimizer badness in 7.0 beta

From: Brian Hirt <bhirt(at)mobygames(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Optimizer badness in 7.0 beta
Date: 2000-03-05 21:01:32
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackers

Actually, the query you supply will not work, I'll get duplicate 
rows because the relationship between game and game_developer is 
a one to many.  Of course you had no way of knowing that from the 
information I supplied.  I could throw a distinct in there to get 
the results, but that really feels like bad form because of the large
amount of duplicate rows.  In any case, the original query I supplied 
is generated SQL, created by a Database to Object persistance layer 
and cannot by design have multiple tables in the from clause, so 
restrictions to the table seleted from must be in the form of a 

I realize that the query in question could be written better.  My 
concern was the huge difference in performance between 6.5 and 7.0 on
this type of query.  Other people may be bitten by this one, so I wanted
to bring it up.  I've been able to easily to work around this problem, 
it just seems wrong that the difference in execution time is so far 
off from the previous release.  

I dont know too much about the PG internals, but when I used sybase, 
it would usually execute the sub-select independently and stuff the 
results into a temp table and then do another query, joining to the 
results of the sub-select.  In a situation like this one, worst case 
without indexes you would get a table scan for the sub-select 
and then a merge join with a sequential scan on the temp table and a 
sequential scan on the other table (example below).  Using that
approach, with no indexes, the query still executes in a fraction of a 
second.  It just seems that a query on tables as small as I'm describing
should never take as long as it did.   It seems like a problem with
the optimizer, but if people are happy with currenty functionality that's 
fine with me also.


For Example:

SELECT DISTINCT game_id INTO temp tmp_res
FROM game_developer
WHERE developer_id = 3

FROM game, tmp_res
WHERE game.game_id = tmp_res.game_id
  AND game.approved = 1
ORDER BY copyright_year desc, game_title

On Sun, Mar 05, 2000 at 03:15:45PM +0100, Peter Eisentraut wrote:
> This query can be rewritten as
> SELECT creation_timestamp, etc. 
> FROM game, game_developer
> WHERE game.game_id = game_developer.game_id
>   AND approved = 1 AND developer_id = 3
> ORDER BY copyright_year desc, game_title
> The way you're writing it you're almost asking it to be slow. :)
> Of course that still doesn't explain why it's now 94sec versus formerly 1
> but I'm sure Tom Lane will enlighten us all very soon. :)
> Brian Hirt writes:
> > select 
> > 	creation_timestamp,
> [snip]
> > from 
> > 	game 
> > where 
> > 	approved = 1 
> > and 
> > 	game_id in (
> > 		select 
> > 			distinct game_id
> > 		from 
> > 			game_developer
> > 		where 
> > 			developer_id = 3) 
> > order by 
> > 	copyright_year desc,
> > 	game_title;
> -- 
> Peter Eisentraut                  Sernanders väg 10:115
> peter_e(at)gmx(dot)net                   75262 Uppsala
>            Sweden
> ************

The world's most ambitious and comprehensive PC game database project.


In response to


pgsql-hackers by date

Next:From: Mike MascariDate: 2000-03-06 01:31:25
Subject: Re: [HACKERS] DROP TABLE inside a transaction block
Previous:From: Peter EisentrautDate: 2000-03-05 14:15:45
Subject: Re: [HACKERS] Optimizer badness in 7.0 beta

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group