Re: Optimization idea

From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimization idea
Date: 2010-04-28 09:38:20
Message-ID: 4BD8020C.20509@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> 2010/4/28 Robert Haas <robertmhaas(at)gmail(dot)com>:
>
>> On Mon, Apr 26, 2010 at 5:33 AM, Cédric Villemain
>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>
>>> In the first query, the planner doesn't use the information of the 2,3,4.
>>> It just does a : I'll bet I'll have 2 rows in t1 (I think it should
>>> say 3, but it doesn't)
>>> So it divide the estimated number of rows in the t2 table by 5
>>> (different values) and multiply by 2 (rows) : 40040.
>>>
>> I think it's doing something more complicated. See scalararraysel().
>>
>
> Thank you for driving me to the right function, Robert.
> It is in fact more complicated :)
>
>
>>> In the second query the planner use a different behavior : it did
>>> expand the value of t1.t to t2.t for each join relation and find a
>>> costless plan. (than the one using seqscan on t2)
>>>
>> I think the problem here is one we've discussed before: if the query
>> planner knows that something is true of x (like, say, x =
>> ANY('{2,3,4}')) and it also knows that x = y, it doesn't infer that
>> the same thing holds of y (i.e. y = ANY('{2,3,4}') unless the thing
>> that is known to be true of x is that x is equal to some constant.
>> Tom doesn't think it would be worth the additional CPU time that it
>> would take to make these sorts of deductions. I'm not sure I believe
>> that, but I haven't tried to write the code, either.
>>
>
> If I understand correctly, I did have some issues with
> exclusion_constraint= ON for complex queries in datamining where the
> planner failled to understand it must use only one partition because
> the where clause where not enough 'explicit'. But it's long time ago
> and I don't have my use case.
>
> We probably need to find some real case where the planner optimisation
> make sense. But I don't want usual queries to see their CPU time
> increase...
> <joke>Do we need real Planner Hints ?</joke>
>
>
Even if it will be done it does not solve the original issue. If I
understood you right there is now no any decent way of speeding up the query

select *
from t2
join t1 on t1.t = t2.t
where t1.id = X;

except of the propagating the t1.id value to the table t2 and createing
and index for this column? Then the query will look like

select *
from t2
where t1_id = X;

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2010-04-28 13:53:36 Re: autovacuum strategy / parameters
Previous Message Cédric Villemain 2010-04-28 07:49:05 Re: Optimization idea