Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-03-23 11:42:18
Message-ID: CAApHDvpw5qG4vTb+hwhmAJRid6QF+R9vvtOhX2HN2Yy9+w20sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 15 Mar 2021 at 23:57, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Fri, 12 Mar 2021 at 14:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I'm -1 on doing it exactly that way, because you're expending
> > the cost of those lookups without certainty that you need the answer.
> > I had in mind something more like the way that we cache selectivity
> > estimates in RestrictInfo, in which the value is cached when first
> > demanded and then re-used on subsequent checks --- see in
> > clause_selectivity_ext, around line 750. You do need a way for the
> > field to have a "not known yet" value, but that's not hard. Moreover,
> > this sort of approach can be less invasive than what you did here,
> > because the caching behavior can be hidden inside
> > contain_volatile_functions, rather than having all the call sites
> > know about it explicitly.
>
> I coded up something more along the lines of what I think you had in
> mind for the 0001 patch.

I've now cleaned up the 0001 patch. I ended up changing a few places
where we pass the RestrictInfo->clause to contain_volatile_functions()
to instead pass the RestrictInfo itself so that there's a possibility
of caching the volatility property for a subsequent call.

I also made a pass over the remaining patches and for the 0004 patch,
aside from the name, "Result Cache", I think that it's ready to go. We
should consider before RC1 if we should have enable_resultcache switch
on or off by default.

Does anyone care to have a final look at these patches? I'd like to
start pushing them fairly soon.

David

Attachment Content-Type Size
v17-0001-Cache-PathTarget-and-RestrictInfo-s-volatility.patch text/plain 12.1 KB
v17-0002-Allow-estimate_num_groups-to-pass-back-further-d.patch text/plain 9.1 KB
v17-0003-Allow-users-of-simplehash.h-to-perform-direct-de.patch text/plain 3.5 KB
v17-0004-Add-Result-Cache-executor-node.patch text/plain 148.8 KB
v17-0005-Remove-code-duplication-in-nodeResultCache.c.patch text/plain 5.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2021-03-23 11:47:29 Re: Minimal logical decoding on standbys
Previous Message Masahiko Sawada 2021-03-23 11:01:28 Re: New IndexAM API controlling index vacuum strategies