Re: Talking about optimizer, my long dream

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Віталій Тимчишин <tivv00(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Talking about optimizer, my long dream
Date: 2011-02-27 17:59:28
Message-ID: AANLkTim4zH1LFCe+ibfTr1r74JO-zgfCDh3XyPGezh_z@mail.gmail.com
Lists: pgsql-performance

2011/2/4 Віталій Тимчишин <tivv00(at)gmail(dot)com>:
> Hi, all.
> All this optimizer vs. hints discussion reminded me of a crazy idea that
> came into my head some time ago.
> I currently have two problems with the PostgreSQL optimizer:
> 1) Dictionary tables. A very common pattern is something like "select * from
> big_table where dictionary_id = (select id from dictionary where
> name=value)". This performs terribly if the dictionary_id distribution is
> not uniform.

Does it work better if you write it as a join?

SELECT b.* FROM big_table b, dictionary d WHERE b.dictionary_id = d.id
AND d.name = 'value'

I would like to see a concrete example of this not working well,
because I've been writing queries like this (with MANY tables) for
years and it's usually worked very well for me.
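
One way to produce such an example, as a rough sketch: run both forms
under EXPLAIN ANALYZE and compare the estimated row counts with the
actual ones for the scan on big_table. The table and column names are
the ones from the queries above; 'value' is a placeholder for a real
dictionary name. Note that EXPLAIN ANALYZE actually executes the query.

```sql
-- Subselect form from the original report.
EXPLAIN ANALYZE
SELECT *
FROM big_table
WHERE dictionary_id = (SELECT id FROM dictionary WHERE name = 'value');

-- Join form suggested above.
EXPLAIN ANALYZE
SELECT b.*
FROM big_table b, dictionary d
WHERE b.dictionary_id = d.id
  AND d.name = 'value';
```

A large gap between estimated and actual rows in either plan would be
the interesting part to post.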

> What helps is to retrieve the subselect value first and then simply run
> "select * from big_table where dictionary_id=id_value".
> 2) Complex queries. If there are more than 3 levels of subselects, the
> optimizer's row counts often become less and less accurate as we go up the
> levels. Around the 3rd level this often leads to wrong choices. What helps
> is to create temporary tables from the subselects, analyze them, and then
> run the main select using these temporary tables.
> While the first one can be fixed by introducing some correlation
> statistics, I don't think there is any simple way to fix the second one.
> But what if the optimizer could in some cases say "fetch this and this, and
> then I'll plan the other part of the query based on statistics of what
> you've fetched"?

I've had that thought, too. It's pretty hard to see how to make it
work, but I think there are cases where it could be beneficial.
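
For reference, the manual version of that idea (the temporary-table
workaround described in point 2) looks roughly like the sketch below.
The tables orders and customers, their columns, and the temp table
name recent_totals are all made up for illustration; none of them come
from the original report. The explicit ANALYZE matters because
autovacuum does not process temporary tables, so without it the
planner has no statistics for the intermediate result.

```sql
-- Hypothetical schema: orders(customer_id, amount, created_at),
-- customers(id, name). All names here are illustrative assumptions.

-- Step 1: materialize the inner subselect.
CREATE TEMP TABLE recent_totals AS
SELECT customer_id, sum(amount) AS total
FROM orders
WHERE created_at >= now() - interval '30 days'
GROUP BY customer_id;

-- Step 2: collect real statistics on what was fetched.
ANALYZE recent_totals;

-- Step 3: plan and run the outer query against the analyzed result.
SELECT c.name, t.total
FROM recent_totals t
JOIN customers c ON c.id = t.customer_id
WHERE t.total > 1000;

-- Clean up (the table would also disappear at session end).
DROP TABLE recent_totals;
```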

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
