Re: logical decoding and replication of sequences

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding and replication of sequences
Date: 2022-03-11 11:38:09
Message-ID: CAA4eK1LcgaKv2hJomXcszHBsEEkedC5hwN+bu8x04xgKx=wM3g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 11, 2022 at 5:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Mar 8, 2022 at 11:59 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > On 3/7/22 22:25, Tomas Vondra wrote:
> > >>
> > >> Interesting. I can think of one reason that might cause this - we log
> > >> the first sequence increment after a checkpoint. So if a checkpoint
> > >> happens in an unfortunate place, there'll be an extra WAL record. On
> > >> slow / busy machines that's quite possible, I guess.
> > >>
> > >
> > > I've tweaked the checkpoint_interval to make checkpoints more aggressive
> > > (set it to 1s), and it seems my hunch was correct - it produces failures
> > > exactly like this one. The best fix probably is to just disable decoding
> > > of sequences in those tests that are not aimed at testing sequence decoding.
> > >
> >
> > I've pushed a fix for this, adding "include-sequences=0" to a couple
> > test_decoding tests, which were failing with concurrent checkpoints.
> >
> > Unfortunately, I realized we have a similar issue in the "sequences"
> > tests too :-( Imagine you do a series of sequence increments, e.g.
> >
> > SELECT nextval('s') FROM generate_series(1,100);
> >
> > If there's a concurrent checkpoint, this may add an extra WAL record,
> > affecting the decoded output (and also the data stored in the sequence
> > relation itself). Not sure what to do about this ...
> >
>
> I am also not sure what to do for it but maybe if in some way we can
> increase checkpoint timeout or other parameters for these tests then
> it would reduce the chances of such failures. The other idea could be
> to perform checkpoint before the start of tests to reduce the
> possibility of another checkpoint.
>
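
To illustrate why a concurrent checkpoint changes the decoded output, here is a toy Python model. It is only a sketch, not the actual sequence.c logic: it assumes nextval() pre-logs SEQ_LOG_VALS = 32 values per WAL record and that the first increment after a checkpoint is always logged, as described above. The function name and the checkpoints_before parameter are hypothetical.

```python
SEQ_LOG_VALS = 32  # values pre-logged per WAL record (assumed to match sequence.c)


def wal_record_calls(n_calls, checkpoints_before=()):
    """Return the nextval() call numbers that would emit a WAL record.

    checkpoints_before is a hypothetical knob: a checkpoint is assumed to
    complete just before each listed call number.
    """
    checkpoints = set(checkpoints_before)
    log_cnt = 0          # pre-logged values still available
    logged_at = []
    for call in range(1, n_calls + 1):
        # A record is written when the pre-logged values are used up, or on
        # the first increment after a checkpoint.
        if log_cnt < 1 or call in checkpoints:
            logged_at.append(call)
            log_cnt = SEQ_LOG_VALS
        else:
            log_cnt -= 1
    return logged_at


if __name__ == "__main__":
    print(wal_record_calls(100))
    print(wal_record_calls(100, checkpoints_before={50}))
```

In this model the second run logs at a different set of calls than the first, so the decoded stream differs even though the SQL is identical. That is why forcing a checkpoint up front only narrows the window rather than closing it.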

One more thing I noticed while checking the commit for this feature is
that the include below seems to be out of order:
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -42,6 +42,7 @@
 #include "replication/reorderbuffer.h"
 #include "replication/snapbuild.h"
 #include "storage/standby.h"
+#include "commands/sequence.h"

--
With Regards,
Amit Kapila.
