Re: [PATCH] plpython function causes server panic

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hao Zhang <zhrt1446384557(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] plpython function causes server panic
Date: 2024-03-22 20:37:48
Message-ID: 1084039.1711139868@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> - I don't think the documentation changes are entirely accurate. The
> whole point of the patch is to allow parallel workers to make changes
> to the transaction state, but the documentation says you can't. Maybe
> we should just delete "change the transaction state" entirely from the
> list of things that you're not allowed to do, since "write to the
> database" is already listed separately; or maybe we should replace it
> with something like "assign new transaction IDs or command IDs,"
> although that's kind of low-level. I don't think we should just delete
> the "even temporarily" bit, as you've done.

Fair enough. In the attached v2, I wrote "change the transaction
state (other than by using a subtransaction for error recovery)";
what do you think of that?

I dug around in the docs and couldn't really find anything about
parallel-query transaction limitations other than this bit in
parallel.sgml and the more or less copy-pasted text in
create_function.sgml; did you have any other spots in mind?
(I did find the commentary in README.parallel, but that's not
exactly user-facing.)

> - While I like the new comments in BeginInternalSubTransaction(), I
> think the changes in ReleaseCurrentSubTransaction() and
> RollbackAndReleaseCurrentSubTransaction() need more thought.

Yah. After studying the code a bit more, I realized that what
I'd done would cause IsInParallelMode() to start returning false
during a subtransaction within parallel mode, which is surely not
what we want. That state has to be heritable into subtransactions
in some fashion. The attached keeps the current semantics of
parallelModeLevel and adds a bool parallelChildXact field that is
true if any outer transaction level has nonzero parallelModeLevel.
That's possibly more general than we need today, but it seems like
a reasonably clean definition.

> One additional thing that might (or might not) be worth mentioning or
> checking for here is that the leader shouldn't try to reduce the
> height of the transaction state stack to anything less than what it
> was when the parallel operation started; if it wants to do that, it
> needs to clean up the parallel workers and exit parallel mode first.
> Similarly a worker shouldn't ever end the toplevel transaction except
> during backend cleanup.

I think these things are already dealt with. However, one thing
worth questioning is that CommitSubTransaction() will just silently
kill any workers started during the current subxact, and likewise
CommitTransaction() zaps workers without complaint. Shouldn't these
instead throw an error about how you didn't close parallel mode,
and then the corresponding Abort function does the cleanup?
I did not change that behavior here, but it seems dubious.

v2 attached works a bit harder on the comments and adds a simplistic
test case. I feel that I don't want to incorporate the plpython
crash that started this thread, as it's weird and dependent on
Python code outside our control (though I have checked that we
don't crash on that anymore).

regards, tom lane

Attachment Content-Type Size
v2-allow-subxacts-in-parallel-workers.patch text/x-diff 20.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-22 20:41:49 Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs
Previous Message Robert Haas 2024-03-22 20:04:47 Re: session username in default psql prompt?