Re: Parameterized paths vs index clauses extracted from OR clauses

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Parameterized paths vs index clauses extracted from OR clauses
Date: 2013-03-06 04:20:17
Message-ID: CA+TgmoZOrNuRpAcu17jv-+bkvAb-EoS0T6xG8VkHadHWtaL=iw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 5, 2013 at 3:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, the point is not so much about whether it's an improvement as that
> 9.2's current behavior is a regression from 9.1 and earlier. People may
> not like changes in minor releases, but they don't like regressions
> either.

That's true, but I'm still worried that we're just moving the
unhappiness around from one group of people to another group of
people, and I don't have a lot of confidence about which group is
larger.

>>> A downside of this approach is that to preserve
>>> the same-number-of-rows assumption, we'd end up having to enforce the
>>> extracted clauses as filter clauses in parameterized paths, even if
>>> they'd not proved to be of any use as index quals.
>
>> I'm not sure I fully grasp why this is a downside. Explain further?
>
> Because we'd be checking redundant clauses. You'd get something like
>
>     Nested Loop
>       Filter: (foo OR (bar AND baz))
>
>       ... some outer scan here ...
>
>       Index Scan:
>         Filter: (foo OR bar)
>
> If "foo OR bar" is useful as an indexqual condition in the inner scan,
> that's one thing. But if it isn't, the cycles expended to check it in
> the inner scan are possibly wasted, because we'll still have to check
> the full original OR clause later. It's possible that the filter
> condition removes enough rows from the inner scan's result to justify
> the redundant checks, but it's at least as possible that it doesn't.

Yeah, that's pretty unappealing. It probably doesn't matter much if
foo is just a column reference, but what if it's an expensive
function? For that matter, what if it's a volatile function that we
can't execute twice without changing the results?
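
To make the cost concern concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: the rows, the predicates standing in for foo, bar and baz, and the names run_plan and CountingPredicate are all assumptions made up for illustration, and baz is treated as a function of the same row purely to keep it short. It checks that "foo OR bar" is implied by "foo OR (bar AND baz)", then counts how often foo runs with and without the extra inner-scan filter.

```python
from itertools import product


def original_clause(foo, bar, baz):
    # Full qual from the quoted plan: foo OR (bar AND baz)
    return foo or (bar and baz)


def extracted_clause(foo, bar):
    # Weaker qual extracted from it: foo OR bar
    return foo or bar


# Whenever the original clause holds, the extracted one holds too, so
# enforcing it early never drops a row the later filter would keep.
implied = all(
    extracted_clause(f, b)
    for f, b, z in product([False, True], repeat=3)
    if original_clause(f, b, z)
)


class CountingPredicate:
    """Wraps a predicate and counts how many times it is evaluated."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, row):
        self.calls += 1
        return self.fn(row)


def run_plan(rows, enforce_extracted_in_inner_scan):
    # Hypothetical stand-ins for foo, bar and baz (assumptions).
    foo = CountingPredicate(lambda r: r % 5 == 0)
    bar = lambda r: r % 2 == 0
    baz = lambda r: r > 5

    kept = []
    for r in rows:
        # Optional inner-scan filter: (foo OR bar)
        if enforce_extracted_in_inner_scan and not (foo(r) or bar(r)):
            continue
        # The full original clause still has to be checked afterwards.
        if foo(r) or (bar(r) and baz(r)):
            kept.append(r)
    return kept, foo.calls


if __name__ == "__main__":
    rows = list(range(10))
    print(implied)
    print(run_plan(rows, enforce_extracted_in_inner_scan=False))
    print(run_plan(rows, enforce_extracted_in_inner_scan=True))
```

In this toy data set both variants keep the same rows, but foo is evaluated 10 times without the inner filter and 16 times with it. If foo were expensive, that is the wasted work; if it were volatile, the second evaluation might not even agree with the first.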

>> Since there's little point in using a parameterized path in the first
>> place unless it enables you to drastically reduce the number of rows
>> being processed, I would anticipate that maybe the consequences aren't
>> too bad, but I'm not sure.
>
> Yeah, we could hope that the inner scan is already producing few enough
> rows that it doesn't matter much. But I think that we'd end up checking
> the added qual even in a non-parameterized scan; there's no mechanism
> for pushing quals into the general qual lists and then retracting them
> later. (Hm, maybe what we need is a marker for "enforce this clause
> only if you feel like it"?)

Not sure I get the parenthesized bit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
