Re: [PATCH] Hooks at XactCommand level

From: Gilles Darold <gilles(at)darold(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Nicolas CHAHWEKILIAN <leptitstagiaire(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Hooks at XactCommand level
Date: 2021-07-15 07:44:13
Message-ID: 9875dd5b-ce95-3348-468d-cd4cfadcddaf@darold.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le 14/07/2021 à 21:26, Tom Lane a écrit :
> Gilles Darold <gilles(at)darold(dot)net> writes:
>> I have renamed the patch and the title of this proposal registered in
>> the commitfest "Xact/SubXact event callback at command start" to reflect
>> the last changes that do not include new hooks anymore.
> Hmm, it doesn't seem like this has addressed my concern at all.
> The callbacks are still effectively defined as "run during
> start_xact_command", so they're not any less squishy semantically
> than they were before. The point of my criticism was that you
> should move the call site to someplace that's more organically
> connected to execution of commands.

I would like to move it closer to the command execution but the only
place I see would be in BeginCommand() which actually is waiting for
code to execute, for the moment this function does nothing. I don't see
another possible place after start_xact_command() call. All my attempt
to inject the callback after start_xact_command() result in a failure.
If you see an other place I will be please to give it a test.

> Another thing I'm not too pleased with in this formulation is that it's
> very unclear what the distinction is between XACT_EVENT_COMMAND_START
> and SUBXACT_EVENT_COMMAND_START. AFAICS, *every* potential use-case
> for this would have to hook into both callback chains, and most likely
> would treat the two events alike. Plus, as you note, the goalposts
> have suddenly been moved for the amount of overhead it's okay to have
> in an XactCallback or SubXactCallback function. So that might cause
> problems for unrelated code. It's probably better to not try to
> re-use that infrastructure.

Actually XACT_EVENT_COMMAND_START occurs only after the call BEGIN, when
a transaction starts, whereas SUBXACT_EVENT_COMMAND_START occurs in all
subsequent statement execution of this transaction. This helps to
perform different actions following the event. In the example extension
only SUBXACT_EVENT_COMMAND_START is used but for example I could use
event XACT_EVENT_COMMAND_START to not send a RELEASE savepoint as there
is none. I detect this case differently but this could be an improvement
in the extension.

>
> <digression>
>
>> The objective of this new callback is to be able to call user-defined
>> code before any new statement is executed. For example it can call a
>> rollback to savepoint if there was an error in the previous transaction
>> statement, which allow to implements Rollback at Statement Level at
>> server side using a PostgreSQL extension, see [1] .
> Urgh. Isn't this re-making the same mistake we made years ago, namely
> trying to implement autocommit on the server side? I fear this will
> be a disaster even larger than that was, because if it's an extension
> then pretty much no applications will be prepared for the new semantics.
> I strongly urge you to read the discussions that led up to f85f43dfb,
> and to search the commit history before that for mentions of
> "autocommit", to see just how extensive the mess was.
>
> I spent a little time trying to locate said discussions; it's harder
> than it should be because we didn't have the practice of citing email
> threads in the commit log at the time. I did find
>
> https://www.postgresql.org/message-id/flat/Pine.LNX.4.44.0303172059170.1975-100000%40peter.localdomain#7ae26ed5c1bfbf9b22a420dfd8b8e69f
>
> which seems to have been the proximate decision, and here are
> a few threads talking about all the messes that were created
> for JDBC etc:
>
> https://www.postgresql.org/message-id/flat/3D793A93.7030000%40xythos.com#4a2e2d9bdf2967906a6e0a75815d6636
> https://www.postgresql.org/message-id/flat/3383060E-272E-11D7-BA14-000502E740BA%40wellsgaming.com
> https://www.postgresql.org/message-id/flat/Law14-F37PIje6n0ssr00000bc1%40hotmail.com
>
> Basically, changing transactional semantics underneath clients is
> a horrid idea. Having such behavior in an extension rather than
> the core doesn't make it less horrid. If we'd designed it to act
> that way from day one, maybe it'd have been fine. But as things
> stand, we are quite locked into the position that this has to be
> managed on the client side.

Yes I have suffered of this implementation for server side autocommit,
it was reverted in PG 7.4 if I remember well. I'm old enough to remember
that :-). I'm also against restoring this feature inside PG core but the
fact that the subject comes again almost every 2 years mean that there
is a need on this feature. This is why I'm proposing to be able to use
an extension for those who really need the feature, with all the
associated warning.

For example in my case the first time I was needing this feature was to
emulate the behavior of DB2 that allows rollback at statement level.
This is not exactly autocommit because the transaction still need to be
validated or rolledback at end, this is just that an error will not
invalidate the full transaction but just the failing statement. I think
that this is different. Actually I have an extension that is doing that
for most of the work but we still have to send the ROLLBACK TO savepoint
at client side which is really a performances killer and especially
painful to implement with JDBC Exception blocks.

Recently I was working on an Oracle to PostgreSQL migration and want to
implement an other Oracle feature like that is heavily used when
importing data from different sources into a data warehouse. It's very
common in the Oracle world to batch data importinside a transaction and
log silently the errors into a dedicated table to be processed later.
"Whatever" (this concern only certain errors) happens you continue to
import the data and DBAs will check what to fix and will re-import the
records in error. Again, I have an extension that is doing that but we
still have to generate the ROLLBACK TO at client side. This can be
avoided with this proposal and will greatly simplify the code at client
side.

We all know the problems of such server side implementation but once you
have implemented it at client side and you are looking for better
performances it's obvious that this kind of extension could help. The
other solution is to move to a proprietary PostgreSQL fork which is
surely not what we want.

> </digression>
>
> That point doesn't necessarily invalidate the value of having
> some sort of hook in this general area. But I would kind of like
> to see another use-case, because I don't believe in this one.

I have sited two use-case, they are both based on the rollback at
statement level feature. I'm pretty sure that there is several other
use-cases that escape my poor imagination. IMHO the possibility to offer
the rollback at statement level feature through an extension should be
enough but if anyone have other use-case I will be pleased to create an
extension to test it :-)

--
Gilles Darold
http://www.darold.net/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message gkokolatos 2021-07-15 07:48:08 Re: Introduce pg_receivewal gzip compression tests
Previous Message Michael Paquier 2021-07-15 07:19:07 Re: Incorrect usage of strtol, atoi for non-numeric junk inputs