Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-07-20 14:51:59
Message-ID: 591c59af-6254-127c-3de0-6ef4445ada05@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/20/23 09:24, Ashutosh Bapat wrote:
> Thanks Tomas for the updated patches.
>
> Here are my comments on 0006 patch as well as 0002 patch.
>
> On Wed, Jul 19, 2023 at 4:23 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> I think this is an accurate description of what the current patch does.
>> And I think it's a reasonable behavior.
>>
>> My point is that if this gets released in PG17, it'll be difficult to
>> change, even if it does not change on-disk format.
>>
>
> Yes. I agree. And I don't see any problem even if we are not able to change it.
>
>>
>> I need to think behavior about this a bit more, and maybe check how
>> difficult would be implementing it.
>
> Ok.
>
> In most of the comments and in documentation, there are some phrases
> which do not look accurate.
>
> Change to a sequence is being refered to as "sequence increment". While
> ascending sequences are common, PostgreSQL supports descending sequences as
> well. The changes there will be decrements. But that's not the only case. A
> sequence may be restarted with an older value, in which case the change could
> increment or a decrement. I think correct usage is 'changes to sequence' or
> 'sequence changes'.
>
> Sequence being assigned a new relfilenode is referred to as sequence
> being created. This is confusing. When an existing sequence is ALTERed, we
> will not "create" a new sequence but we will "create" a new relfilenode and
> "assign" it to that sequence.
>
> PFA such edits in 0002 and 0006 patches. Let me know if those look
> correct. I think we
> need similar changes to the documentation and comments in other places.
>

OK, I merged the changes into the patches, with some minor changes to
the wording etc.

>>
>> I did however look at the proposed alternative to the "created" flag.
>> The attached 0006 part ditches the flag with XLOG_SMGR_CREATE decoding.
>> The smgr_decode code needs a review (I'm not sure the
>> skipping/fast-forwarding part is correct), but it seems to be working
>> fine overall, although we need to ensure the WAL record has the correct XID.
>>
>
> Briefly describing the patch. When decoding a XLOG_SMGR_CREATE WAL
> record, it adds the relfilenode mentioned in it to the sequences hash.
> When decoding a sequence change record, it checks whether the
> relfilenode in the WAL record is in hash table. If it is the sequence
> changes is deemed transactional otherwise non-transactional. The
> change looks good to me. It simplifies the logic to decide whether a
> sequence change is transactional or not.
>

Right.

> In sequence_decode() we skip sequence changes when fast forwarding.
> Given that smgr_decode() is only to supplement sequence_decode(), I
> think it's correct to do the same in smgr_decode() as well. Simillarly
> skipping when we don't have full snapshot.
>

I don't follow, smgr_decode already checks ctx->fast_forward.

> Some minor comments on 0006 patch
>
> + /* make sure the relfilenode creation is associated with the XID */
> + if (XLogLogicalInfoActive())
> + GetCurrentTransactionId();
>
> I think this change is correct and is inline with similar changes in 0002. But
> I looked at other places from where DefineRelation() is called. For regular
> tables it is called from ProcessUtilitySlow() which in turn does not call
> GetCurrentTransactionId(). I am wondering whether we are just discovering a
> class of bugs caused by not associating an xid with a newly created
> relfilenode.
>

Not sure. Why would it be a bug?

> + /*
> + * If we don't have snapshot or we are just fast-forwarding, there is no
> + * point in decoding changes.
> + */
> + if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
> + ctx->fast_forward)
> + return;
>
> This code block is repeated.
>

Fixed.

> +void
> +ReorderBufferAddRelFileLocator(ReorderBuffer *rb, TransactionId xid,
> + RelFileLocator rlocator)
> +{
> ... snip ...
> +
> + /* sequence changes require a transaction */
> + if (xid == InvalidTransactionId)
> + return;
>
> IIUC, with your changes in DefineSequence() in this patch, this should not
> happen. So this condition will never be true. But in case it happens, this code
> will not add the relfilelocation to the hash table and we will deem the
> sequence change as non-transactional. Isn't it better to just throw an error
> and stop replication if that (ever) happens?
>

It can't happen for sequence, but it may happen when creating a
non-sequence relfilenode. In a way, it's a way to skip (some)
unnecessary relfilenodes.

> Also some comments on 0002 patch
>
> @@ -405,8 +405,19 @@ fill_seq_fork_with_data(Relation rel, HeapTuple
> tuple, ForkNumber forkNum)
>
> /* check the comment above nextval_internal()'s equivalent call. */
> if (RelationNeedsWAL(rel))
> + {
> GetTopTransactionId();
>
> + /*
> + * Make sure the subtransaction has a XID assigned, so that
> the sequence
> + * increment WAL record is properly associated with it. This
> matters for
> + * increments of sequences created/altered in the
> transaction, which are
> + * handled as transactional.
> + */
> + if (XLogLogicalInfoActive())
> + GetCurrentTransactionId();
> + }
> +
>
> I think we should separately commit the changes which add a call to
> GetCurrentTransactionId(). That looks like an existing bug/anomaly
> which can stay irrespective of this patch.
>

Not sure, but I don't see this as a bug.

> + /*
> + * To support logical decoding of sequences, we require the sequence
> + * callback. We decide it here, but only check it later in the wrappers.
> + *
> + * XXX Isn't it wrong to define only one of those callbacks? Say we
> + * only define the stream_sequence_cb() - that may get strange results
> + * depending on what gets streamed. Either none or both?
> + *
> + * XXX Shouldn't sequence be defined at slot creation time, similar
> + * to two_phase? Probably not.
> + */
>
> Do you intend to keep these XXX's as is? My previous comments on this comment
> block are in [1].
>
> In fact, given that whether or not sequences are replicated is decided by the
> protocol version, do we really need LogicalDecodingContext::sequences? Drawing
> parallel with WAL messages, I don't think it's needed.
>

Right. We do that for two_phase because you can override that when
creating the subscription - sequences allowed that too initially, but
then we ditched that. So I don't think we need this.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Make-test_decoding-ddl.out-shorter-20230720.patch text/x-patch 473.3 KB
0002-Logical-decoding-of-sequences-20230720.patch text/x-patch 52.9 KB
0003-Add-decoding-of-sequences-to-test_decoding-20230720.patch text/x-patch 20.6 KB
0004-Add-decoding-of-sequences-to-built-in-repli-20230720.patch text/x-patch 265.7 KB
0005-Simplify-protocol-versioning-20230720.patch text/x-patch 15.6 KB
0006-replace-created-flag-with-XLOG_SMGR_CREATE-20230720.patch text/x-patch 16.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-07-20 14:52:25 Re: Printing backtrace of postgres processes
Previous Message Daniel Gustafsson 2023-07-20 14:37:08 Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"