Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-07-08 22:25:14
Message-ID: CAApHDvq5_n8svmLHJJ0Vr_v2R3GTmsO9xuW-hU6RyYSVosyj0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 9 Jul 2020 at 04:53, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2020-05-20 23:44:27 +1200, David Rowley wrote:
> > I've attached a patch which implements this. The new node type is
> > called "Result Cache". I'm not particularly wedded to keeping that
> > name, but if I change it, I only want to do it once. I've got a few
> > other names I mind, but I don't feel strongly or confident enough in
> > them to go and do the renaming.
>
> I'm not convinced it's a good idea to introduce a separate executor node
> for this. There's a fair bit of overhead in them, and they will only be
> below certain types of nodes afaict. It seems like it'd be better to
> pull the required calls into the nodes that do parametrized scans of
> subsidiary nodes. Have you considered that?

I see 41 different node types mentioned in ExecReScan(). I don't
really think it would be reasonable to change all those.

Here are a couple of examples, one with a Limit below the Result Cache
and one with a GroupAggregate.

postgres=# explain (costs off) select * from pg_Class c1 where relname
= (select relname from pg_Class c2 where c1.relname = c2.relname
offset 1 limit 1);
QUERY PLAN
-------------------------------------------------------------------------------------
Seq Scan on pg_class c1
Filter: (relname = (SubPlan 1))
SubPlan 1
-> Result Cache
Cache Key: c1.relname
-> Limit
-> Index Only Scan using pg_class_relname_nsp_index
on pg_class c2
Index Cond: (relname = c1.relname)
(8 rows)

postgres=# explain (costs off) select * from pg_Class c1 where relname
= (select relname from pg_Class c2 where c1.relname = c2.relname group
by 1 having count(*) > 1);
QUERY PLAN
-------------------------------------------------------------------------------------
Seq Scan on pg_class c1
Filter: (relname = (SubPlan 1))
SubPlan 1
-> Result Cache
Cache Key: c1.relname
-> GroupAggregate
Group Key: c2.relname
Filter: (count(*) > 1)
-> Index Only Scan using pg_class_relname_nsp_index
on pg_class c2
Index Cond: (relname = c1.relname)
(10 rows)

As for putting the logic somewhere like ExecReScan() then the first
paragraph in [1] are my thoughts on that.

David

[1] https://www.postgresql.org/message-id/CAApHDvr-yx9DEJ1Lc9aAy8QZkgEZkTP=3hBRBe83Vwo=kAndcA@mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2020-07-08 22:44:26 Re: Index Skip Scan (new UniqueKeys)
Previous Message Tom Lane 2020-07-08 22:20:57 Re: jsonpath versus NaN