Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-11-29 12:45:21
Message-ID: 444c89bf-f39f-2058-0c0e-0a91903ed07c@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/27/23 23:06, Peter Smith wrote:
> FWIW, here are some more minor review comments for v20231127-3-0001
>
> ======
> doc/src/sgml/logicaldecoding.sgml
>
> 1.
> + The <parameter>txn</parameter> parameter contains meta information about
> + the transaction the sequence change is part of. Note however that for
> + non-transactional updates, the transaction may be NULL, depending on
> + if the transaction already has an XID assigned.
> + The <parameter>sequence_lsn</parameter> has the WAL location of the
> + sequence update. <parameter>transactional</parameter> says if the
> + sequence has to be replayed as part of the transaction or directly.
>
> /says if/specifies whether/
>

Will fix.

> ======
> src/backend/commands/sequence.c
>
> 2. DecodeSeqTuple
>
> + memcpy(((char *) tuple->tuple.t_data),
> + data + sizeof(xl_seq_rec),
> + SizeofHeapTupleHeader);
> +
> + memcpy(((char *) tuple->tuple.t_data) + SizeofHeapTupleHeader,
> + data + sizeof(xl_seq_rec) + SizeofHeapTupleHeader,
> + datalen);
>
> Maybe I am misreading but isn't this just copying 2 contiguous pieces
> of data? Won't a single memcpy of (SizeofHeapTupleHeader + datalen)
> achieve the same?
>

You're right, will fix. I think the code looked differently before, got
simplified and I haven't noticed this can be a single memcpy().

> ======
> .../replication/logical/reorderbuffer.c
>
> 3.
> + * To decide if a sequence change is transactional, we maintain a hash
> + * table of relfilenodes created in each (sub)transactions, along with
> + * the XID of the (sub)transaction that created the relfilenode. The
> + * entries from substransactions are copied to the top-level transaction
> + * to make checks cheaper. The hash table gets cleaned up when the
> + * transaction completes (commit/abort).
>
> /substransactions/subtransactions/
>

Will fix.

> ~~~
>
> 4.
> + * A naive approach would be to just loop through all transactions and check
> + * each of them, but there may be (easily thousands) of subtransactions, and
> + * the check happens for each sequence change. So this could be very costly.
>
> /may be (easily thousands) of/may be (easily thousands of)/
>
> ~~~

Thanks. I've reworded this to

... may be many (easily thousands of) subtransactions ...

>
> 5. ReorderBufferSequenceCleanup
>
> + while ((ent = (ReorderBufferSequenceEnt *)
> hash_seq_search(&scan_status)) != NULL)
> + {
> + (void) hash_search(txn->toptxn->sequences_hash,
> + (void *) &ent->rlocator,
> + HASH_REMOVE, NULL);
> + }
>
> Typically, other HASH_REMOVE code I saw would check result for NULL to
> give elog(ERROR, "hash table corrupted");
>

Good point, I'll add the error check

> ~~~
>
> 6. ReorderBufferQueueSequence
>
> + if (xid != InvalidTransactionId)
> + txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
>
> How about using the macro: TransactionIdIsValid
>

Actually, I wrote in some other message, I think the check is not
necessary. Or rather it should be an assert that XID is valid. And yeah,
the macro is a good idea.

> ~~~
>
> 7. ReorderBufferQueueSequence
>
> + if (reloid == InvalidOid)
> + elog(ERROR, "could not map filenode \"%s\" to relation OID",
> + relpathperm(rlocator,
> + MAIN_FORKNUM));
>
> How about using the macro: OidIsValid
>

I chose to keep this consistent with other places in reorderbuffer, and
all of them use the equality check.

> ~~~
>
> 8.
> + /*
> + * Calculate the first value of the next batch (at which point we
> + * generate and decode another WAL record.
> + */
>
> Missing ')'
>

Will fix.

> ~~~
>
> 9. ReorderBufferAddRelFileLocator
>
> + /*
> + * We only care about sequence relfilenodes for now, and those always have
> + * a XID. So if there's no XID, don't bother adding them to the hash.
> + */
> + if (xid == InvalidTransactionId)
> + return;
>
> How about using the macro: TransactionIdIsValid
>

Will change.

> ~~~
>
> 10. ReorderBufferProcessTXN
>
> + if (reloid == InvalidOid)
> + elog(ERROR, "could not map filenode \"%s\" to relation OID",
> + relpathperm(change->data.sequence.locator,
> + MAIN_FORKNUM));
>
> How about using the macro: OidIsValid
>

Same as the other Oid check - consistency.

> ~~~
>
> 11. ReorderBufferChangeSize
>
> + if (tup)
> + {
> + sz += sizeof(HeapTupleData);
> + len = tup->tuple.t_len;
> + sz += len;
> + }
>
> Why is the 'sz' increment split into 2 parts?
>

Because the other branches in ReorderBufferChangeSize do it that way.
You're right it might be coded on a single line.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2023-11-29 13:23:53 Re: Python installation selection in Meson
Previous Message Heikki Linnakangas 2023-11-29 12:41:57 Re: Extending SMgrRelation lifetimes