Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-03-12 03:01:33
Message-ID: CAApHDvqgEPhQAvNbV-aGOjUSL1DsiEWdWnZ6Ecb2ChB0VDsamw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 12 Mar 2021 at 14:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > The 0001 patch adds a has_volatile bool field to RestrictInfo and sets
> > it when building the RestrictInfo.
>
> I'm -1 on doing it exactly that way, because you're expending
> the cost of those lookups without certainty that you need the answer.
> I had in mind something more like the way that we cache selectivity
> estimates in RestrictInfo, in which the value is cached when first
> demanded and then re-used on subsequent checks --- see in
> clause_selectivity_ext, around line 750. You do need a way for the
> field to have a "not known yet" value, but that's not hard. Moreover,
> this sort of approach can be less invasive than what you did here,
> because the caching behavior can be hidden inside
> contain_volatile_functions, rather than having all the call sites
> know about it explicitly.

I was aware that the selectivity code did things that way. However, I
didn't copy it as we have functions like match_opclause_to_indexcol()
and match_saopclause_to_indexcol() which calls
contain_volatile_functions() on just a single operand of an OpExpr.
We'd have no chance to cache the volatility property on the first
lookup since we'd not have the RestrictInfo to set it in. I didn't
think that was great, so it led me down the path of setting it always
rather than on the first volatility lookup.

I had in mind that most RestrictInfos would get tested between
checking for hash and merge joinability and index compatibility.
However, I think baserestrictinfos that reference non-indexed columns
won't get checked, so the way I've done it will be a bit wasteful like
you mention.

David

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-03-12 03:08:41 Re: [HACKERS] Custom compression methods
Previous Message Amit Langote 2021-03-12 02:45:50 Re: Allow batched insert during cross-partition updates