Re: logical decoding and replication of sequences

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding and replication of sequences
Date: 2022-03-07 21:25:26
Message-ID: 93af17c3-a78d-4918-5f6b-76dfeb2d48bd@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/7/22 17:53, Tomas Vondra wrote:
> On 2/28/22 12:46, Amit Kapila wrote:
>> On Sat, Feb 12, 2022 at 6:04 AM Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>
>>> On 2/10/22 19:17, Tomas Vondra wrote:
>>>> I've polished & pushed the first part adding sequence decoding
>>>> infrastructure etc. Attached are the two remaining parts.
>>>>
>>>> I plan to wait a day or two and then push the test_decoding part. The
>>>> last part (for built-in replication) will need more work and maybe
>>>> rethinking the grammar etc.
>>>>
>>>
>>> I've pushed the second part, adding sequences to test_decoding.
>>>
>>
>> The test_decoding is failing randomly in the last few days. I am not
>> completely sure but they might be related to this work. The two of
>> these appears to be due to the same reason:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-02-25%2018%3A50%3A09
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2022-02-17%2015%3A17%3A07
>>
>> TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File:
>> "reorderbuffer.c", Line: 1173, PID: 35013)
>> 0 postgres 0x00593de0 ExceptionalCondition + 160\\0
>>
>
> This might be related to the issue reported by Amit, i.e. that
> sequence_decode does not call ReorderBufferProcessXid(). If this keeps
> failing, we'll have to add some extra debug info (logging LSN etc.), at
> least temporarily. It'd be valuable to inspect the WAL too.
>
>> Another:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2022-02-16%2006%3A21%3A48
>>
>> --- /home/nm/farm/xlc32/HEAD/pgsql.build/contrib/test_decoding/expected/rewrite.out
>> 2022-02-14 20:19:14.000000000 +0000
>> +++ /home/nm/farm/xlc32/HEAD/pgsql.build/contrib/test_decoding/results/rewrite.out
>> 2022-02-16 07:42:18.000000000 +0000
>> @@ -126,6 +126,7 @@
>> table public.replication_example: INSERT: id[integer]:4
>> somedata[integer]:3 text[character varying]:null
>> testcolumn1[integer]:null
>> table public.replication_example: INSERT: id[integer]:5
>> somedata[integer]:4 text[character varying]:null
>> testcolumn1[integer]:2 testcolumn2[integer]:1
>> COMMIT
>> + sequence public.replication_example_id_seq: transactional:0
>> last_value: 38 log_cnt: 0 is_called:1
>> BEGIN
>> table public.replication_example: INSERT: id[integer]:6
>> somedata[integer]:5 text[character varying]:null
>> testcolumn1[integer]:3 testcolumn2[integer]:null
>> COMMIT
>> @@ -133,7 +134,7 @@
>> table public.replication_example: INSERT: id[integer]:7
>> somedata[integer]:6 text[character varying]:null
>> testcolumn1[integer]:4 testcolumn2[integer]:null
>> table public.replication_example: INSERT: id[integer]:8
>> somedata[integer]:7 text[character varying]:null
>> testcolumn1[integer]:5 testcolumn2[integer]:null
>> testcolumn3[integer]:1
>> COMMIT
>> - (15 rows)
>> + (16 rows)
>>
>
> Interesting. I can think of one reason that might cause this - we log
> the first sequence increment after a checkpoint. So if a checkpoint
> happens in an unfortunate place, there'll be an extra WAL record. On
> slow / busy machines that's quite possible, I guess.
>

I've tweaked the checkpoint_interval to make checkpoints more aggressive
(set it to 1s), and it seems my hunch was correct - it produces failures
exactly like this one. The best fix probably is to just disable decoding
of sequences in those tests that are not aimed at testing sequence decoding.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2022-03-07 21:34:20 Re: role self-revocation
Previous Message Robert Haas 2022-03-07 21:25:14 Re: refactoring basebackup.c