Re: logical decoding and replication of sequences

From: Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding and replication of sequences
Date: 2022-03-23 12:46:29
Message-ID: BDDB28EE-84D3-4BA8-A5BD-DFF56CE584BC@enterprisedb.com
Lists: pgsql-hackers


> On 23. 3. 2022, at 12:50, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Mar 22, 2022 at 5:41 PM Petr Jelinek
> <petr(dot)jelinek(at)enterprisedb(dot)com> wrote:
>>
>>> On 22. 3. 2022, at 13:09, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>
>>> On Mon, Mar 21, 2022 at 4:25 AM Tomas Vondra
>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Attached is a rebased patch, addressing most of the remaining issues.
>>>>
>>>
>>> It appears that on the apply side, the patch always creates a new
>>> relfilenode irrespective of whether the sequence message is
>>> transactional or not. Is it required to create a new relfilenode for
>>> non-transactional messages? If not that could be costly?
>>>
>>
>>
>> That's a good catch, I think we should just write the page in the non-transactional case, no need to mess with relnodes.
>>
>
> What if the current node has also incremented from the existing
> sequence? Basically, how will we deal with conflicts? It seems we will
> overwrite the actions done on the existing node which means sequence
> values can go back.
>

I think this is perfectly acceptable behavior: we are replicating state from the upstream, not reconciling state on the downstream.

You can't really use the built-in sequences to implement a distributed sequence via replication. If a user wants to write to both nodes, they should not replicate the sequence value and should instead offset the sequence on each node so that the nodes produce different ranges; that's quite a common approach. One day we might want to revisit adding support for custom sequence AMs.
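
To illustrate the offsetting idea, here is a toy Python model, not PostgreSQL code. The names NodeSequence and make_node_sequences are made up for this sketch. It assumes one common variant of the approach, where every node uses the same increment (the number of nodes) and a different start value, so the nodes hand out interleaved values rather than contiguous ranges. In SQL terms this would roughly correspond to creating the sequence with INCREMENT BY 2 on both nodes, with START WITH 1 on one node and START WITH 2 on the other.

```python
from dataclasses import dataclass


@dataclass
class NodeSequence:
    """Toy stand-in for a sequence with a fixed start and increment (hypothetical)."""
    start: int
    step: int
    calls: int = 0

    def nextval(self) -> int:
        # Each call returns the next value in start, start + step, start + 2*step, ...
        value = self.start + self.calls * self.step
        self.calls += 1
        return value


def make_node_sequences(node_count: int) -> list[NodeSequence]:
    # Node i starts at i + 1 and steps by node_count, so no two nodes
    # can ever produce the same value.
    return [NodeSequence(start=i + 1, step=node_count) for i in range(node_count)]


node_a, node_b = make_node_sequences(2)
values_a = [node_a.nextval() for _ in range(5)]  # 1, 3, 5, 7, 9
values_b = [node_b.nextval() for _ in range(5)]  # 2, 4, 6, 8, 10

# The two nodes never collide, so the sequence state itself need not be replicated.
assert set(values_a).isdisjoint(values_b)
```

With this layout each node can keep inserting locally, and only the table rows need to be replicated between the nodes.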

> * Currently, the patch uses one sync worker per sequence. It seems to
> be a waste of resources considering apart from one additional process,
> we need origin/slot to sync each sequence.
>

This is indeed wasteful, but personally I would not consider it a blocker for the patch.

--
Petr
