Re: Optimization of NestLoop join in the case of guaranteed empty inner subtree

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization of NestLoop join in the case of guaranteed empty inner subtree
Date: 2019-12-15 10:56:44
Message-ID: fbbc482d-276e-8605-4113-dd2894fadf17@postgrespro.ru
Lists: pgsql-hackers

On 12/11/19 8:49 PM, Tom Lane wrote:
> Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> writes:
>> During NestLoop execution we have bad corner case: if outer subtree
>> contains tuples the join node will scan inner subtree even if it does
>> not return any tuples.
>
> So the first question about corner-case optimizations like this is always
> "how much overhead does it add in the normal case where it fails to gain
> anything?". I see no performance numbers in your proposal.

I assumed the overhead would be trivial, and a quick test shows no
measurable difference.

>
> I do not much like anything about the code, either: as written it's
> only helpful for an especially narrow corner case (so narrow that
> I wonder if it really ever helps at all: surely calling a nodeMaterial
> whose tuplestore is empty doesn't cost much).

Scanning a large outer subtree can be very costly. If you experiment
with analytical queries, you can find cases where a nested loop
materializes zero tuples. Finding data gaps is at least one such case.
Also, this optimization already exists in the hash join logic.
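
To illustrate the hash join behaviour mentioned above, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: the function
name `hash_join` and its parameters are hypothetical, and it assumes a
plain inner join, where an empty inner side guarantees an empty result.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def hash_join(outer_rows: Iterable, inner_rows: Iterable,
              key: Callable[[object], Hashable]) -> List[Tuple]:
    """Toy inner hash join with an early exit on an empty inner side."""
    # Build phase: hash every inner row by its join key.
    table = {}
    for row in inner_rows:
        table.setdefault(key(row), []).append(row)

    # Early exit: with no inner rows an inner join cannot produce output,
    # so the outer side is never read at all.
    if not table:
        return []

    # Probe phase: read the outer side and look up matches.
    return [(o, i) for o in outer_rows for i in table.get(key(o), [])]
```

If the outer side is passed as a generator, it stays unconsumed when the
inner side is empty, which is the saving being asked for in the nested
loop case.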

> But that doesn't stop it
> from adding a bool to the generic PlanState struct, with global
> implications. What I'd expected from your text description is that
> nodeNestLoop would remember whether its inner child had returned zero rows
> the first time, and assume that subsequent executions could be skipped
> unless the inner child's parameters change.

This is the note I was waiting for. I agree with you that adding a bool
variable to PlanState is excessive. See the attachment for another
version of the optimization.
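
For readers without the patch at hand, the behaviour described in the
quoted text can be modelled roughly as follows. This is only a Python
sketch under stated assumptions, not the attached patch and not the
executor API: `nested_loop_join`, `inner_scan` and `param_of` are
hypothetical names, the join is a plain inner join, and the join
condition is folded into the inner scan's parameter.

```python
from typing import Callable, Iterable, List, Tuple


def nested_loop_join(outer_rows: Iterable,
                     inner_scan: Callable[[object], Iterable],
                     param_of: Callable[[object], object] = lambda row: None
                     ) -> Tuple[List[Tuple], int]:
    """Toy nested loop that remembers an empty inner side.

    Returns the joined rows and the number of inner scans performed.
    """
    inner_known_empty = False
    last_param = object()   # sentinel: differs from every real parameter
    scans = 0
    results = []

    for outer_row in outer_rows:
        param = param_of(outer_row)
        if param != last_param:
            # The inner side's parameters changed: forget what we learned.
            inner_known_empty = False
            last_param = param

        if inner_known_empty:
            continue        # skip the rescan, the result would be empty

        inner_rows = list(inner_scan(param))
        scans += 1
        if not inner_rows:
            inner_known_empty = True
            continue
        results.extend((outer_row, inner_row) for inner_row in inner_rows)

    return results, scans
```

With an unparameterized empty inner side, four outer rows cause a single
inner scan instead of four. In that case the loop could also stop reading
the outer side entirely, since no later row can produce output; the
sketch keeps iterating only to stay simple.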

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v2-0001-Skip-scan-of-outer-subtree-if-inner-of-the-NestedLoo.patch text/x-patch 3.7 KB
