Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands
Date: 2022-07-18 22:17:11
Message-ID: CAKFQuwYQpm6E4nTgiOZAGRg0JKquHaXQ0JkqruWeHuOQX6-70g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Mon, Jul 18, 2022 at 1:20 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > On Fri, Jul 15, 2022 at 2:06 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Here's a draft patch to fix this. We basically just need to force
> >> finish_xact_command() in the same way as we do for transaction control
> >> statements. I considered using the same technology as the code uses
> >> for transaction control --- that is, statically check for the types of
> >> statements that are trouble --- but after reviewing the set of callers
> >> of PreventInTransactionBlock() I gave that up as unmaintainable.
>
> > This seems like too narrow a fix though.
>
> I read this, and I have absolutely no idea what you're talking about
> or what you concretely want to do differently. If it's just a
> documentation question, I agree that I didn't address docs yet.
> Probably we do need to put something in the protocol chapter
> pointing out that some commands will commit immediately.
>

I guess I am expecting exec_execute_message to have:

if (completed && use_implicit_block)
{
EndImplicitTransactionBlock();
finish_xact_command();
} else if (completed) [existing code continues]
Or, in terms of the protocol,

"Therefore, an Execute phase is always terminated by the appearance of
exactly one of these messages: CommandComplete, EmptyQueryResponse (if the
portal was created from an empty query string), ErrorResponse, or
PortalSuspended."

CommandComplete includes an implied commit when the implicit transaction
block is in use; which basically means sending Execute while using the
implicit transaction block will cause a commit to happen.

I don't fully understand PortalSuspended but it seems like it is indeed a
valid exception to this rule.

EmptyQueryResponse seems like it should be immaterial.

ErrorResponse seems to preempt all of these.

The implied transaction block does not count for purposes of determining
whether a command that must not be executed in a transaction block can be
executed.

Now, as you say below, the "multiple commands per implicit transaction
block in extended query mode" is an intentional design choice so the above
would indeed be incorrect. However, there is still something fishy here,
so please read below.

> I'm not sure I buy your argument that there's a fundamental
> difference between simple and extended query protocol in this
> area. In simple protocol you can wrap an "implicit transaction"
> around several commands by sending them in one query message.
> What we've got here is that you can do the same thing in
> extended protocol by omitting Syncs. Extended protocol's
> skip-till-Sync-after-error behavior is likewise very much like
> the fact that simple protocol abandons the rest of the query
> string after an error.
>

The fact that SYNC has the side effect of ending the implicit transaction
block is a POLA violation to me and the root of my misunderstanding here.
I suppose it is too late to change at this point. I can at least see that
giving the client control of the implicit transaction block, even if not
through SQL (which I suppose comes with implicit), has merit, even if this
choice of implementation is unintuitive.

In any case, I tried to extend the pgbench exercise but don't know what
went wrong. I will explain what I think would happen:

For the script:

drop table if exists performupdate;
create table performupdate (val integer);
insert into performupdate values (2);
\startpipeline
update performupdate set val = val * 2;
--create database benchtest;
select 1/0;
--rollback
\endpipeline
DO $$BEGIN RAISE NOTICE 'Value = %', (select val from performupdate); END;
$$

I get this result - the post-pipeline DO block never executes and I
expected that it would. Uncommenting the rollback made no difference. I
suppose this is just because we are abusing the tool in lieu of writing C
code. That's fine.

pgbench: client 0 executing script "/home/vagrant/pipebench.sql"
pgbench: client 0 sending drop table if exists performupdate;
pgbench: client 0 receiving
pgbench: client 0 receiving
pgbench: client 0 sending create table performupdate (val integer);
pgbench: client 0 receiving
pgbench: client 0 receiving
pgbench: client 0 sending insert into performupdate values (2);
pgbench: client 0 receiving
pgbench: client 0 receiving
pgbench: client 0 executing \startpipeline
pgbench: client 0 sending update performupdate set val = val * 2;
pgbench: client 0 sending select 1/0;
pgbench: client 0 executing \endpipeline
pgbench: client 0 receiving
pgbench: client 0 receiving
pgbench: client 0 receiving
pgbench: error: client 0 script 0 aborted in command 6 query 0:
transaction type: /home/vagrant/pipebench.sql
scaling factor: 1
query mode: extended
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 1
number of transactions actually processed: 0/1
number of failed transactions: 0 (NaN%)
pgbench: error: Run was aborted; the above results are incomplete.

In any case, for the above script, given the definition of pipeline mode, I
would expect that the value reported to be 2. This assumes that when
coming out of pipeline mode the system basically goes back to ReadyForQuery.

However, if I now uncomment the create database command the expectation is
either:

1. It fails to execute since an existing command is sharing the implicit
transaction, and fails the implicit transaction block, thus the reported
value is still 2
2. It succeeds, the next command executes and fails, the database creation
is undone and the update is undone, thus the reported value is still 2

What does happen, IIUC, is that both the preceding update command and the
create database are now committed and the returned value is 4

In short, we are saying that issuing a command that cannot be executed in a
transaction block within the middle of the implicit transaction block will
cause the block to implicitly commit if the command completes successfully.

From this it seems that not only should we issue a commit after executing
create database in the implicit transaction block but we also need to
commit before attempting to execute the command in the first place. The
mere presence of a such a command basically means:

COMMIT;
CREATE DATABASE...;
COMMIT;

That is what it means to be unable to be executed in a transaction block -
with an outright error if an explicit transaction block has already been
established.

David J.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2022-07-19 03:26:02 Re: [15] Custom WAL resource managers, single user mode, and recovery
Previous Message Bruce Momjian 2022-07-18 21:21:30 Re: BUG #17496: to_char function resets if interval exceeds 23 hours 59 minutes

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-18 22:18:40 Re: [PATCH] Introduce array_shuffle() and array_sample()
Previous Message Nathan Bossart 2022-07-18 22:17:10 Re: allow specifying action when standby encounters incompatible parameter settings