Re: Oversight in reparameterize_path_by_child leading to executor crash

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Oversight in reparameterize_path_by_child leading to executor crash
Date: 2023-08-08 11:33:56
Message-ID: CAExHW5uZUV1=MV40uRekDJAdg+7wXJXHZk1ZNDCpk3qSN6_T+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 8, 2023 at 12:44 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
>
>
> I agree that we should mention in the function's comment that
> reparameterize_path_by_child can only be used after the best path has
> been selected because the RTE might be modified by this function. I'm
> not sure if it's a good idea to move the reparameterization of
> tablesample to create_samplescan_plan(). That would make the
> reparameterization work separated in different places. And in the
> future we may find more cases where we need modify RTEs or RELs for
> reparameterization. It seems better to me that we keep all the work in
> one place.

Let's see what a committer thinks. Let's leave it like that for now.

>
>
> This is not an existing bug. Our current code (without this patch)
> would always call create_nestloop_path() after the reparameterization
> work has been done for the inner path. So we can safely check against
> outer rel (not its topmost parent) in create_nestloop_path(). But now
> with this patch, the reparameterization is postponed until createplan.c,
> so we have to check against the topmost parent of outer rel in
> create_nestloop_path(), otherwise we'd have duplicate clauses in the
> final plan.

Hmm. I am worried about the impact this will have when the nestloop
path is created for two child relations. Your code will still pick the
top_parent_relids which seems odd. Have you tested that case?

> For instance, without this change you'd get a plan like
>
> regression=# explain (costs off)
> select * from prt1 t1 left join lateral
> (select t1.a as t1a, t2.* from prt1 t2) s on t1.a = s.a;
> QUERY PLAN
... snip ...
>
> Note that the clauses in Join Filter are duplicate.

Thanks for the example.

>
> Good question. But I don't think calling this function at the beginning
> of reparameterize_path_by_child() can solve this problem. Even if we do
> that, if we find that the path cannot be reparameterized in
> reparameterize_path_by_child(), it would be too late for us to take any
> measures. So we need to make sure that such situation cannot happen. I
> think we can emphasize this point in the comments of the two functions,
> and meanwhile add an Assert in reparameterize_path_by_child() that the
> path must be reparameterizable.

Works for me.

>
> Attaching v3 patch to address all the reviews above.

The patch looks good to me.

Please add this to the next commitfest.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro Ikeda 2023-08-08 11:36:20 Re: Support to define custom wait events for extensions
Previous Message Hannu Krosing 2023-08-08 11:24:24 Re: How to build a new grammer for pg?