Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl>
Cc: pgsql-hackers(at)postgreSQL(dot)org, pgsql-performance(at)postgreSQL(dot)org
Subject: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)
Date: 2005-04-06 22:09:37
Message-ID: 6326.1112825377@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

I wrote:
> Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl> writes:
>> SELECT COUNT(*) FROM
>> data_main AS dm,
>> postcodes AS p
>> WHERE dm.range BETWEEN p.range_from AND p.range_till

> Planner error ... because it doesn't have any good way to estimate the
> number of matching rows, it thinks that way is a bit more expensive than
> data_main as the outside, but in reality it seems a good deal cheaper:

BTW, it would get the right answer if it had recognized the WHERE clause
as a range restriction --- it still doesn't know exactly what fraction
of rows will match, but its default estimate is a great deal tighter for
"WHERE x > something AND x < somethingelse" than it is for two unrelated
inequality constraints. Enough tighter that it would have gone for the
correct plan.

The problem is that it doesn't recognize the WHERE as a range constraint
on dm.range. I thought for a moment that this might be a
recently-introduced bug, but actually the code is operating as designed:
clauselist_selectivity says

* See if it looks like a restriction clause with a pseudoconstant
* on one side. (Anything more complicated than that might not
* behave in the simple way we are expecting.)

"Pseudoconstant" in this context means "a constant, parameter symbol, or
non-volatile functions of these" ... so comparisons against values from
another table don't qualify. It seems like we're missing a bet though.

Can anyone suggest a more general rule? Do we need for example to
consider whether the relation membership is the same in two clauses
that might be opposite sides of a range restriction? It seems like

a.x > b.y AND a.x < b.z

probably can be treated as a range restriction on a.x for this purpose,
but I'm much less sure that the same is true of

a.x > b.y AND a.x < c.z

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim C. Nasby 2005-04-06 22:25:36 Re: [HACKERS] Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)
Previous Message Tom Lane 2005-04-06 20:51:30 Re: Plan for relatively simple query seems to be very inefficient

Browse pgsql-performance by date

  From Date Subject
Next Message Alex Turner 2005-04-06 22:12:06 Re: How to improve db performance with $7K?
Previous Message Jarosław Pałka 2005-04-06 21:05:05 Never ending delete story