Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-03-15 10:57:45
Message-ID: CAApHDvreXu5hDUDXEFbbXnNcoT=3GvbXCUYyBhR5czjo1tdsyA@mail.gmail.com
Lists: pgsql-hackers

On Fri, 12 Mar 2021 at 14:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > On Tue, 23 Feb 2021 at 18:43, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I doubt it's that bad. We could cache such info in RestrictInfo
> >> for quals, or PathTarget for tlists, without much new notational
> >> overhead. That doesn't cover everything the planner deals with
> >> of course, but it would cover enough that you'd be chasing pretty
> >> small returns to worry about more.
>
> > This seems like a pretty good idea. So I coded it up.
>
> > The 0001 patch adds a has_volatile bool field to RestrictInfo and sets
> > it when building the RestrictInfo.
>
> I'm -1 on doing it exactly that way, because you're expending
> the cost of those lookups without certainty that you need the answer.
> I had in mind something more like the way that we cache selectivity
> estimates in RestrictInfo, in which the value is cached when first
> demanded and then re-used on subsequent checks --- see in
> clause_selectivity_ext, around line 750. You do need a way for the
> field to have a "not known yet" value, but that's not hard. Moreover,
> this sort of approach can be less invasive than what you did here,
> because the caching behavior can be hidden inside
> contain_volatile_functions, rather than having all the call sites
> know about it explicitly.

I coded up something more along the lines of what I think you had in
mind for the 0001 patch.
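
For anyone following along, the demand-driven caching described in the quoted text can be sketched roughly as below. This is only an illustration and not the code in the attached 0001 patch: the Demo* types, the DEMO_* enum values, the lookup counter, and the string-based "expensive check" are all hypothetical stand-ins, assumed here so the example is self-contained. The point is just the tri-state field with a "not known yet" value, and the fact that the cache is filled inside the check function so that callers need not know about it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state: the zero value means "not known yet". */
typedef enum DemoVolatileState
{
    DEMO_VOLATILITY_UNKNOWN = 0,    /* not computed yet */
    DEMO_VOLATILITY_VOLATILE,       /* computed: contains a volatile function */
    DEMO_VOLATILITY_NOVOLATILE      /* computed: contains none */
} DemoVolatileState;

/* Hypothetical stand-in for a planner node that carries the cached answer. */
typedef struct DemoRestrictInfo
{
    const char *clause;             /* stand-in for the expression tree */
    DemoVolatileState has_volatile; /* lazily filled cache */
} DemoRestrictInfo;

/* Counts how often the expensive path runs (for demonstration only). */
int demo_lookup_count = 0;

/* Stand-in for the costly walk over the expression and catalog lookups. */
bool
demo_expensive_volatility_check(const char *clause)
{
    demo_lookup_count++;
    return strstr(clause, "random()") != NULL;
}

/*
 * Compute the answer on first demand, cache it in the node, and reuse
 * the cached value on every later call.
 */
bool
demo_contain_volatile_functions(DemoRestrictInfo *rinfo)
{
    assert(rinfo != NULL);

    if (rinfo->has_volatile == DEMO_VOLATILITY_UNKNOWN)
    {
        rinfo->has_volatile = demo_expensive_volatility_check(rinfo->clause)
            ? DEMO_VOLATILITY_VOLATILE
            : DEMO_VOLATILITY_NOVOLATILE;
    }

    return rinfo->has_volatile == DEMO_VOLATILITY_VOLATILE;
}
```

With this shape, a node that is never asked about volatility never pays for the lookup, and a node that is asked repeatedly pays for it only once.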

Updated patches attached.

David

Attachment                                                       Content-Type  Size
v16-0001-Cache-PathTarget-and-RestrictInfo-s-volatility.patch    text/plain    11.9 KB
v16-0002-Allow-estimate_num_groups-to-pass-back-further-d.patch  text/plain    9.1 KB
v16-0003-Allow-users-of-simplehash.h-to-perform-direct-de.patch  text/plain    3.5 KB
v16-0004-Add-Result-Cache-executor-node.patch                    text/plain    148.8 KB
v16-0005-Remove-code-duplication-in-nodeResultCache.c.patch      text/plain    5.1 KB
