Re: [BUG] Logical replica crash if there was an error in a function.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "Anton A(dot) Melnikov" <aamelnikov(at)inbox(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Maxim Orlov <orlovmg(at)gmail(dot)com>
Subject: Re: [BUG] Logical replica crash if there was an error in a function.
Date: 2022-11-02 18:02:28
Message-ID: 1828454.1667412148@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> Yeah, the query string is not available in this context, but it surely
> looks wrong to me to assume that something as low-level as
> function_parse_error_transpose() needs to be updated for the sake of a
> logical worker, while we have other areas that would expect a portal
> to be set up.

After thinking about this some more, I'm withdrawing my opposition to
fixing it by making function_parse_error_transpose() cope with not
having an active portal. I have a few reasons:

* A Portal is intended to contain an executor state. While worker.c
does fake up an EState, there's certainly no plan tree or planstate tree,
and I doubt it'd be sane to create dummy ones. So even if we made a
Portal, it'd be lacking a lot of the stuff one would expect to find there.
I fear that moving the cause of this sort of problem from "there's no
ActivePortal" to "there's an ActivePortal but it lacks field X" wouldn't
be an improvement.

* There is actually just one other consumer of ActivePortal,
namely EnsurePortalSnapshotExists, and that doesn't offer a lot of
support for the idea that ActivePortal must always be set. It says

* Nothing to do if a snapshot is set. (We take it on faith that the
* outermost active snapshot belongs to some Portal; or if there is no
* Portal, it's somebody else's responsibility to manage things.)

and "it's somebody else's responsibility" summarizes the situation
here pretty perfectly. worker.c *does* set up an active snapshot.

* The comment in function_parse_error_transpose() freely admits that
looking at the ActivePortal is a hack. It works, more or less, for
the intended case of reporting a function-body syntax error nicely
during CREATE FUNCTION. But it's capable of getting false-positive
matches, so maybe someday we should replace it with something more
bulletproof.

* There isn't any strong reason why function_parse_error_transpose()
has to succeed at finding the original query text. Its fallback
approach of treating the syntax error position as internal to the
function body text is fine, in fact it's just what we want here.

So I'm now good with the idea of just not failing. I don't like
the patch as presented though. First, the cfbot is quite rightly
complaining about the "uninitialized variable" warning it draws.
Second, I don't see a good reason to tie the change to logical
replication in any way. Let's just change the Assert to an if(),
as attached.

regards, tom lane

Attachment Content-Type Size
v6-0001-Fix-logical-replica-assert-on-func-error.patch text/x-diff 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-11-02 18:10:17 Re: pg15 inherited stats expressions: cache lookup failed for statistics object
Previous Message David G. Johnston 2022-11-02 17:48:22 Re: Glossary and initdb definition work for "superuser" and database/cluster