Re: Optimization idea

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimization idea
Date: 2010-04-28 00:46:36
Message-ID: q2w603c8f071004271746s43f8669cz45bec5914b1fa9e0@mail.gmail.com
Lists: pgsql-performance

On Mon, Apr 26, 2010 at 5:33 AM, Cédric Villemain
<cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
> In the first query, the planner doesn't use the information about the
> values 2, 3, 4. It just bets that it will have 2 rows in t1 (I think
> it should say 3, but it doesn't). So it divides the estimated number
> of rows in the t2 table by 5 (distinct values) and multiplies by 2
> (rows): 40040.

I think it's doing something more complicated. See scalararraysel().
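
For reference, here is a minimal sketch of the kind of clause
scalararraysel() estimates, reusing the t2.t column from the example in
this thread (the exact schema and an integer column type are assumed):

    -- Sketch only: rather than a flat rows/ndistinct calculation, the
    -- planner calls the "=" operator's restriction selectivity (eqsel)
    -- once per array element and combines the per-element results
    -- (roughly s = s1 + s2 - s1*s2, under an independence assumption).
    EXPLAIN
    SELECT *
    FROM t2
    WHERE t2.t = ANY ('{2,3,4}'::int[]);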

> In the second query the planner uses a different behavior: it expands
> the value of t1.t to t2.t for each join relation and finds a plan
> that costs less than the one using a seqscan on t2.

I think the problem here is one we've discussed before: if the query
planner knows that something is true of x (like, say, x =
ANY('{2,3,4}')) and it also knows that x = y, it doesn't infer that
the same thing holds of y (i.e., y = ANY('{2,3,4}')) unless the thing
that is known to be true of x is that x is equal to some constant.
Tom doesn't think it would be worth the additional CPU time that it
would take to make these sorts of deductions. I'm not sure I believe
that, but I haven't tried to write the code, either.
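
To make that concrete, here is a hedged sketch using the t1/t2 tables
from the example in this thread (exact schemas assumed). In the first
form the equivalence-class machinery lets the planner deduce t2.t = 2
and apply it to t2; in the second, the ANY(...) restriction stays
attached to t1.t only:

    -- Constant case: from t1.t = 2 and t1.t = t2.t the planner also
    -- derives t2.t = 2, so the restriction can be used against t2
    -- directly (better estimates, possibly an index scan on t2).
    SELECT *
    FROM t1
    JOIN t2 ON t2.t = t1.t
    WHERE t1.t = 2;

    -- Array case: t1.t = ANY('{2,3,4}') is not propagated to t2.t, so
    -- t2 is estimated and scanned without the benefit of that clause.
    SELECT *
    FROM t1
    JOIN t2 ON t2.t = t1.t
    WHERE t1.t = ANY ('{2,3,4}');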

...Robert
