Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-12-08 07:15:52
Message-ID: CAApHDvpXN0nT75ki9jbrAcPckYxzZP_KWdCmZDQs2fb=rNu_5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've attached another patchset that addresses some comments left by
Zhihong Yu over on [1]. The version number got bumped to v12 instead
of v11 as I still have a copy of the other version of the patch which
I made some changes to and internally named v11.

The patchset has grown 1 additional patch which is the 0004 patch.
The review on the other thread mentioned that I should remove the code
duplication for the full cache check that I had mostly duplicated
between adding a new entry to the cache and adding tuple to an
existing entry. I'm still a bit unsure that I like merging this into
a helper function. One call needs the return value of the function to
be a boolean value to know if it's still okay to use the cache. The
other need the return value to be the cache entry. The patch makes the
helper function return the entry and returns NULL to communicate the
false value. I'm not a fan of the change and might drop it.

The 0005 patch is now the only one that I think needs more work to
make it good enough. This is Result Cache for subplans. I mentioned
in [2] what my problem with that patch is.

On Mon, 7 Dec 2020 at 12:50, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> Basically, an extra argument in ExecBuildParamSetEqual() which allows
> the TupleTableSlotOps for the left and right side to be set
> individually. Previously I was passing a single TupleTableSlotOps of
> TTSOpsMinimalTuple. The probeslot is a TTSOpsVirtual tuple, so
> passing TTSOpsMinimalTuple causes the function to add a needless
> EEOP_OUTER_FETCHSOME step to the expression.

I also benchmarked that change and did see that it gives a small but
notable improvement to the performance.

David

[1] https://www.postgresql.org/message-id/CALNJ-vRAgksPqjK-sAU+9gu3R44s_3jVPJ_5SDB++jjEkTntiA@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAApHDvpGX7RN+sh7Hn9HWZQKp53SjKaL=GtDzYheHWiEd-8moQ@mail.gmail.com

Attachment Content-Type Size
v12-0001-Allow-estimate_num_groups-to-pass-back-further-d.patch text/plain 9.1 KB
v12-0004-Remove-code-duplication-in-nodeResultCache.c.patch text/plain 5.1 KB
v12-0002-Allow-users-of-simplehash.h-to-perform-direct-de.patch text/plain 3.5 KB
v12-0005-Use-a-Result-Cache-node-to-cache-results-from-su.patch text/plain 25.7 KB
v12-0003-Add-Result-Cache-executor-node.patch text/plain 144.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Irodotos Terpizis 2020-12-08 07:43:42 Printing page request trace from buffer manager
Previous Message Andres Freund 2020-12-08 07:04:13 Re: Blocking I/O, async I/O and io_uring