From: | Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, kristianlejao(at)gmail(dot)com |
Subject: | Re: TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c |
Date: | 2025-09-17 10:42:36 |
Message-ID: | CAPmGK164pycWPoMH8JeeWv0J8dF8kp=E89VJePYXh+Fhm+H0Hg@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Sat, Aug 23, 2025 at 5:54 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Aug 22, 2025 at 3:59 AM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> > I reviewed the postgres_fdw.c part of the fix:
> >
> > @@ -6313,9 +6314,18 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
> > * calling foreign_join_ok(), since that function updates fpinfo and marks
> > * it as pushable if the join is found to be pushable.
> > */
> > - if (root->parse->commandType == CMD_DELETE ||
> > - root->parse->commandType == CMD_UPDATE ||
> > - root->rowMarks)
> > + for (PlannerInfo *proot = root; proot != NULL; proot = proot->parent_root)
> > + {
> > + if (proot->parse->commandType == CMD_DELETE ||
> > + proot->parse->commandType == CMD_UPDATE ||
> > + proot->rowMarks)
> > + {
> > + need_epq = true;
> > + break;
> > + }
> > + }
> > +
> > + if (need_epq)
> > {
> > epq_path = GetExistingLocalJoinPath(joinrel);
> > if (!epq_path)
> >
> > I think this successfully avoids the assertion failure and produces
> > the correct result, but sorry, I don't think it's the right way to go.
> > I think the root cause of this issue is in the EPQ handling of
> > foreign/custom joins in ExecScanFetch() in execScan.h: it runs the
> > recheck method function for a given foreign/custom join whenever it is
> > called for EPQ rechecking, but that is not 100% correct. I think the
> > correct handling is: run the recheck method function for the join if
> > it is called for EPQ rechecking and the join is a *descendant* join in
> > the EPQ plan tree; otherwise run the access method function for the
> > join even if it is called for EPQ rechecking, like the attached (where
> > I used the epqParam of the given EPQState to determine whether the
> > join is a descendant join or not, which localizes the fix pretty
> > well). For the SELECT FOR UPDATE query shown upthread, when doing an
> > EPQ recheck, the fix evaluates the sub-select expression in the target
> > list by doing a remote join, not a local join, so it would work more
> > efficiently than the fix you proposed.
>
> Thank you for reviewing the patch! I've confirmed that your patch
> fixes the issue too.
Thanks for testing!
> If I understand your proposed fix correctly, the reported problem is
> fixed by not rechecking the test tuple by ForeignRecheck() (performing
> a local join), but instead we simply call ForeignNext() and get the
> next tuple.
That's right.
> I think while this fix would have to have a regression
> test like I've proposed, it's probably a good time to add some
> regression tests to cover postgresRecheckForeignScan() too possibly as
> a separate patch.
Agreed. Actually, we have fixed many EvalPlanQual issues with
postgres_fdw so far, but have always left adding test cases for future
work, so let's do that now!
The test case you showed upthread and added to the patch is useful, so
I think we should add it as well, but my question about it is: is it
really a good idea to use injection points? Why don't you just use
BEGIN/COMMIT statements like this:
-- session 1
begin isolation level read committed;
update a set i = i + 1;
-- session 2
begin isolation level read committed;
select a.i,
(select 1 from b, c where a.i = b.i and b.i = c.i)
from a
for update; -- waits for the transaction in session 1 to complete
-- session 1
commit;
-- session 2
select a.i,
(select 1 from b, c where a.i = b.i and b.i = c.i)
from a
for update; -- produces results after doing an EvalPlanQual recheck
i | ?column?
---+----------
2 |
(1 row)
Again, my apologies for the late response.
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2025-09-17 14:34:23 | BUG #19055: Server crash at ExecInterpExpr |
Previous Message | Masahiko Sawada | 2025-09-17 04:32:59 | Re: Read Replica termination occurs when its max_active_replication_origins setting is lower than the primary |