Re: New docs chapter on Transaction Management and related changes

From: Robert Treat <rob(at)xzilla(dot)net>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New docs chapter on Transaction Management and related changes
Date: 2022-11-09 14:16:18
Message-ID: CAJSLCQ0fH4Jt0yiJVbXmCmJUYP4h4fXzoyiUU6Nxva4KWGZG6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 7, 2022 at 5:04 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>
> On Sat, 2022-11-05 at 10:08 +0000, Simon Riggs wrote:
> > Agreed; new compilation patch attached, including mine and then
> > Robert's suggested rewordings.
>
> Thanks. There is clearly a lot of usefule information in this.
>
> Some comments:
>
<snip>
> > --- a/doc/src/sgml/ref/release_savepoint.sgml
> > +++ b/doc/src/sgml/ref/release_savepoint.sgml
> > @@ -34,23 +34,16 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
> > <title>Description</title>
> >
> > <para>
> > - <command>RELEASE SAVEPOINT</command> destroys a savepoint previously defined
> > - in the current transaction.
> > + <command>RELEASE SAVEPOINT</command> will subcommit the subtransaction
> > + established by the named savepoint, if one exists. This will release
> > + any resources held by the subtransaction. If there were any
> > + subtransactions of the named savepoint, these will also be subcommitted.
> > </para>
> >
> > <para>
>
> "Subtransactions of the named savepoint" is somewhat confusing; how about
> "subtransactions of the subtransaction established by the named savepoint"?
>
> If that is too long and explicit, perhaps "subtransactions of that subtransaction".
>

Personally, I think these are more confusing.

> > --- a/doc/src/sgml/ref/rollback.sgml
> > +++ b/doc/src/sgml/ref/rollback.sgml
> > @@ -56,11 +56,14 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
> > <term><literal>AND CHAIN</literal></term>
> > <listitem>
> > <para>
> > - If <literal>AND CHAIN</literal> is specified, a new transaction is
> > + If <literal>AND CHAIN</literal> is specified, a new unaborted transaction is
> > immediately started with the same transaction characteristics (see <xref
> > linkend="sql-set-transaction"/>) as the just finished one. Otherwise,
> > no new transaction is started.
>
> I don't think that is an improvement. "Unaborted" is an un-word. A new transaction
> is always "unaborted", isn't it?
>

I thought about this as well when reviewing it, but I do think
something is needed for the case where you have a transaction which
has suffered an error and then you issue "rollback and chain"; if you
just say "a new transaction is immediately started with the same
transaction characteristics" it might imply to some the new
transaction has some kind of carry over of the previous broken
transaction... the use of the word unaborted makes it clear that the
new transaction is 100% functional.

> > --- a/doc/src/sgml/wal.sgml
> > +++ b/doc/src/sgml/wal.sgml
> > @@ -909,4 +910,36 @@
> > seem to be a problem in practice.
> > </para>
> > </sect1>
> > +
> > + <sect1 id="two-phase">
> > +
> > + <title>Two-Phase Transactions</title>
> > +
> > + <para>
> > + <productname>PostgreSQL</productname> supports a two-phase commit (2PC)
> [...]
> > + <filename>pg_twophase</filename> directory. Currently-prepared
> > + transactions can be inspected using <link
> > + linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.
> > + </para>
> > + </sect1>
> > +
> > </chapter>
>
> I don't like "currently-prepared". How about:
> "Transaction that are currently prepared can be inspected..."
>

This seems to align with other usage, so +1

> This is clearly interesting information, but I don't think the WAL chapter is the right
> place for this. "pg_twophase" is already mentioned in "storage.sgml", and details about
> when exactly a prepared transaction is persisted may exceed the details level needed by
> the end user.
>
> I'd look for that information in the reference page for PREPARE TRANSACTION; perhaps
> that would be a better place. Or, even better, the new "xact.sgml" chapter.
>
> > --- /dev/null
> > +++ b/doc/src/sgml/xact.sgml
>
> + <title>Transaction Management</title>
>
> + The word transaction is often abbreviated as "xact".
>
> Should use <quote> here.
>
> > + <title>Transactions and Identifiers</title>
>
> > + <para>
> > + Once a transaction writes to the database, it is assigned a
> > + non-virtual <literal>TransactionId</literal> (or <type>xid</type>),
> > + e.g., <literal>278394</literal>. Xids are assigned sequentially
> > + using a global counter used by all databases within the
> > + <productname>PostgreSQL</productname> cluster. This property is used by
> > + the transaction system to order transactions by their first database
> > + write, i.e., lower-numbered xids started writing before higher-numbered
> > + xids. Of course, transactions might start in a different order.
> > + </para>
>
> "This property"? How about:
> "Because transaction IDs are assigned sequentially, the transaction system can
> use them to order transactions by their first database write"
>

+1

> I would want some additional information here: why does the transaction system have
> to order transactions by their first database write?
>
> "Of course, transactions might start in a different order."
>
> Now that confuses me. Are you saying that BEGIN could be in a different order
> than the first database write? Perhaps like this:
>
> "Note that the order in which transactions perform their first database write
> might be different from the order in which the transactions started."
>

+1

> > + The internal transaction ID type <type>xid</type> is 32-bits wide
>
> There should be no hyphen in "32 bits wide", just as in "3 years old".
>

Minor aside, we should clean up glossary.sgml as well.

> > + Xids are used as the
> > + basis for <productname>PostgreSQL</productname>'s <link
> > + linkend="mvcc">MVCC</link> concurrency mechanism, <link
> > + linkend="hot-standby">Hot Standby</link>, and Read Replica servers.
>
> What is the difference between a hot standby and a read replica? I think
> one of these terms is sufficient.
>

Agreed the distinction is not clear, although you could replace Hot
Standby with Warm Standby.

> > + In addition to <literal>vxid</literal> and <type>xid</type>,
> > + when a transaction is prepared for two-phase commit it
> > + is also identified by a Global Transaction Identifier
> > + (<acronym>GID</acronym>).
>
> Better:
>
> "In addition to <literal>vxid</literal> and <type>xid</type>,
> prepared transactions also have a Global Transaction Identifier
> (<acronym>GID</acronym>) that is assigned when the transaction is
> prepared for two-phase commit."
>

+1

> > + <sect1 id="xact-locking">
> > +
> > + <title>Transactions and Locking</title>
> > +
> > + <para>
> > + Currently-executing transactions are shown in <link
> > + linkend="view-pg-locks"><structname>pg_locks</structname></link>
> > + in columns <structfield>virtualxid</structfield> and
> > + <structfield>transactionid</structfield>.
>
> Better:
>
> "The transaction IDs of currently executing transactions are shown in <link
> linkend="view-pg-locks"><structname>pg_locks</structname></link>
> in the columns <structfield>virtualxid</structfield> and
> <structfield>transactionid</structfield>."
>
> > + Lock waits on table-level locks are shown waiting for
> > + <structfield>virtualxid</structfield>, while lock waits on row-level
> > + locks are shown waiting for <structfield>transactionid</structfield>.
>
> That's not true. Transactions waiting for table-level locks are shown
> waiting for a "relation" lock in both "pg_stat_activity" and "pg_locks".
>
> > + Row-level read and write locks are recorded directly in locked
> > + rows and can be inspected using the <xref linkend="pgrowlocks"/>
> > + extension. Row-level read locks might also require the assignment
> > + of multixact IDs (<literal>mxid</literal>). Mxids are recorded in
> > + the <filename>pg_multixact</filename> directory.
>
> "are recorded directly in *the* locked rows"
>
> I think the mention of multixacts should link to
> <xref linkend="vacuum-for-multixact-wraparound"/>. Again, I would not
> specifically mention the directory, since it is already described in
> "storage.sgml", but I have no strong optinion there.
>
> > + <sect1 id="subxacts">
> > +
> > + <title>Subtransactions</title>
>
> > + The word subtransaction is often abbreviated as
> > + <literal>subxact</literal>.
>
> I'd use <quote>, not <literal>.
>
> > + If a subtransaction is assigned a non-virtual transaction ID,
> > + its transaction ID is referred to as a <literal>subxid</literal>.
>
> Again, I would use <quote>, since we don't <literal> "subxid"
> elsewhere.
>
> + Up to
> + 64 open subxids are cached in shared memory for each backend; after
> + that point, the overhead increases significantly since we must look
> + up subxid entries in <filename>pg_subtrans</filename>.
>
> Comma before "since". Perhaps you should mention that this means disk I/O.
>

ISTR that you only use a comma before since in cases where the
preceding thought contains a negative.

In any case, are you thinking something like this:

" 64 open subxids are cached in shared memory for each backend; after
that point the overhead increases significantly due to additional disk I/O
from looking up subxid entries in <filename>pg_subtrans</filename>."

Robert Treat
https://xzilla.net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Reid Thompson 2022-11-09 14:23:25 Re: Add tracking of backend memory allocated to pg_stat_activity
Previous Message Justin Pryzby 2022-11-09 14:14:47 Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL