Re: Recognizing range constraints (was Re: Plan for

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl>, pgsql-hackers(at)postgreSQL(dot)org, pgsql-performance(at)postgreSQL(dot)org
Subject: Re: Recognizing range constraints (was Re: Plan for
Date: 2005-04-06 23:24:46
Message-ID: 1112829886.16721.1104.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-performance

On Wed, 2005-04-06 at 18:09 -0400, Tom Lane wrote:
> I wrote:
> > Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl> writes:
> >> SELECT COUNT(*) FROM
> >> data_main AS dm,
> >> postcodes AS p
> >> WHERE dm.range BETWEEN p.range_from AND p.range_till
>
> > Planner error ... because it doesn't have any good way to estimate the
> > number of matching rows, it thinks that way is a bit more expensive than
> > data_main as the outside, but in reality it seems a good deal cheaper:
>
> BTW, it would get the right answer if it had recognized the WHERE clause
> as a range restriction --- it still doesn't know exactly what fraction
> of rows will match, but its default estimate is a great deal tighter for
> "WHERE x > something AND x < somethingelse" than it is for two unrelated
> inequality constraints. Enough tighter that it would have gone for the
> correct plan.
>
> The problem is that it doesn't recognize the WHERE as a range constraint
> on dm.range.

> Can anyone suggest a more general rule? Do we need for example to
> consider whether the relation membership is the same in two clauses
> that might be opposite sides of a range restriction? It seems like
>
> a.x > b.y AND a.x < b.z

Not sure we need a more general rule. There are only three ways to view
this pair of clauses:
i) it's a range constraint, i.e. BETWEEN
ii) it's the complement of that, i.e. NOT BETWEEN
iii) it's a mistake, but we're not allowed to take that path
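
To put rough numbers on why reading (i) matters, here is a minimal
sketch of the arithmetic. It assumes the planner's fallback
selectivities are about 1/3 for a single inequality and 0.005 for a
recognised range; those values, the constant names and the table sizes
are assumptions for illustration only and should be checked against
the source for the version in use.

```python
# Fallback selectivities, ASSUMED here to mirror the planner's defaults
# (named after DEFAULT_INEQ_SEL / DEFAULT_RANGE_INEQ_SEL); verify per version.
DEFAULT_INEQ_SEL = 1.0 / 3.0
DEFAULT_RANGE_INEQ_SEL = 0.005


def estimated_join_rows(outer_rows, inner_rows, recognised_as_range):
    """Estimate join output for 'x > lo AND x < hi' when no statistics apply."""
    if recognised_as_range:
        selectivity = DEFAULT_RANGE_INEQ_SEL
    else:
        # Two unrelated inequalities are treated as independent and multiplied.
        selectivity = DEFAULT_INEQ_SEL * DEFAULT_INEQ_SEL
    return outer_rows * inner_rows * selectivity


# Hypothetical table sizes, chosen only for illustration.
unrelated = estimated_join_rows(10_000, 1_000, recognised_as_range=False)
as_range = estimated_join_rows(10_000, 1_000, recognised_as_range=True)
print(round(unrelated), round(as_range))
```

Under these assumptions the estimate for the unrecognised pair is more
than twenty times the estimate for the recognised range, which is the
kind of gap that can tip the choice of join order.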

Arjen's query, and your generalisation of it above, are a common type
of query: a lookup against a reference data table with begin/end
effective dates. It would be very useful if this were supported.
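
For concreteness, a small sketch of the effective-date lookup I have
in mind. The table and column names (rates, orders, valid_from,
valid_till) are hypothetical, and SQLite is used only so the example
runs on its own; the planner behaviour under discussion is specific to
PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical reference table: each rate is valid for a date range.
    CREATE TABLE rates (
        rate_id    INTEGER PRIMARY KEY,
        valid_from TEXT,
        valid_till TEXT,
        rate       REAL
    );
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT);

    INSERT INTO rates VALUES
        (1, '2005-01-01', '2005-03-31', 0.10),
        (2, '2005-04-01', '2005-06-30', 0.12);
    INSERT INTO orders VALUES (100, '2005-02-14'), (101, '2005-04-06');
""")

# Same shape as Arjen's query: one table's column must fall between two
# columns of the other table. ISO-8601 date strings compare correctly as text.
rows = conn.execute("""
    SELECT o.order_id, r.rate
    FROM orders AS o, rates AS r
    WHERE o.order_date BETWEEN r.valid_from AND r.valid_till
    ORDER BY o.order_id
""").fetchall()
print(rows)
conn.close()
```

Each order matches exactly one row of the reference table, so the true
join size is close to the number of orders, far below what two
independent inequality estimates would suggest.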

> probably can be treated as a range restriction on a.x for this purpose,
> but I'm much less sure that the same is true of
>
> a.x > b.y AND a.x < c.z

I can't think of a query that would use such a construct, and might even
conclude that it was a very poorly normalised model. I would suggest that
this is much less common in practical use.

Best Regards, Simon Riggs
