Re: Logical replication keepalive flood

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Abbas Butt <abbas(dot)butt(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zahid Iqbal <zahid(dot)iqbal(at)enterprisedb(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: Re: Logical replication keepalive flood
Date: 2021-09-16 12:36:26
Message-ID: CAA4eK1+z3-wYE+rLy-LBW5aYReuVQUTd2EHucJ7HNRd5s2Ew-g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 16, 2021 at 6:29 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> From Tuesday, September 14, 2021 1:39 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > However, the problem I found is that, with the patch applied, there is
> > a test failure when running “make check-world”:
> >
> > t/006_logical_decoding.pl ............ 4/14
> > # Failed test 'pg_recvlogical acknowledged changes'
> > # at t/006_logical_decoding.pl line 117.
> > # got: 'BEGIN
> > # table public.decoding_test: INSERT: x[integer]:5 y[text]:'5''
> > # expected: ''
> > # Looks like you failed 1 test of 14.
> > t/006_logical_decoding.pl ............ Dubious, test returned 1 (wstat
> > 256, 0x100) Failed 1/14 subtests
> >
> >
>
> After applying the patch,
> I saw the same problem and can reproduce it by the following steps:
>
> 1) execute the SQLs.
> -----------SQL-----------
> CREATE TABLE decoding_test(x integer, y text);
> SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
> INSERT INTO decoding_test(x,y) SELECT s, s::text FROM generate_series(1,4) s;
>
> -- use the lsn here to execute pg_recvlogical later
> SELECT lsn FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL) ORDER BY lsn DESC LIMIT 1;
> INSERT INTO decoding_test(x,y) SELECT s, s::text FROM generate_series(5,50) s;
> ----------------------
>
> 2) Then, if I execute the following command twice:
> # pg_recvlogical -E lsn -d postgres -S 'test_slot' --start --no-loop -f -
> BEGIN 708
> table public.decoding_test: INSERT: x[integer]:1 y[text]:'1'
> table public.decoding_test: INSERT: x[integer]:2 y[text]:'2'
> table public.decoding_test: INSERT: x[integer]:3 y[text]:'3'
> table public.decoding_test: INSERT: x[integer]:4 y[text]:'4'
> COMMIT 708
>
> # pg_recvlogical -E lsn -d postgres -S 'test_slot' --start --no-loop -f -
> BEGIN 709
>
> It still generated 'BEGIN 709' on the second execution.
> Without the patch, the second execution outputs nothing.
>

I think the reason is that the first_lsn of a transaction is
always equal to the end_lsn of the previous transaction (see the comments
above the first_lsn and end_lsn fields of ReorderBufferTXN). I have not
debugged this, but I think that in StreamLogicalLog() the cur_record_lsn
after receiving the 'w' message will, in this case, be equal to endpos,
whereas we expect it to be greater than endpos to exit. Before the patch,
it would always get the 'k' message, where we expect the received LSN to
be equal to endpos to conclude that we can exit. Do let me know if your
analysis differs.
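
To make the suspected behaviour concrete, here is a rough Python model of the two exit checks as I understand them. It is only a sketch and not the actual pg_recvlogical code. The function name stream_until_endpos, the tuple layout of the messages, and the LSN values are all made up for illustration. The assumption is that a 'k' message ends the stream without output once its LSN reaches endpos, while a 'w' message is skipped only when its LSN is strictly greater than endpos.

```python
from typing import List, Optional, Tuple

# (kind, lsn, payload): kind is "k" (keepalive) or "w" (data).
# This layout is hypothetical and only serves the illustration.
Message = Tuple[str, int, str]


def stream_until_endpos(messages: List[Message], endpos: Optional[int]) -> List[str]:
    """Return the payloads that would be written before the stream stops."""
    written: List[str] = []
    for kind, lsn, payload in messages:
        if kind == "k":
            # Keepalive: reaching endpos is enough to stop, nothing is written.
            if endpos is not None and lsn >= endpos:
                break
        elif kind == "w":
            # Data: only a record strictly past endpos is skipped.
            if endpos is not None and lsn > endpos:
                break
            written.append(payload)
            # A record exactly at endpos is written, then we stop.
            if endpos is not None and lsn == endpos:
                break
    return written


# Assumed LSN: end_lsn of transaction 708, which is also first_lsn of 709.
ENDPOS = 1000

# Without the patch: a keepalive at endpos arrives before any data.
without_patch = [("k", ENDPOS, ""), ("w", ENDPOS, "BEGIN 709")]

# With the patch: no keepalive, so the BEGIN record at endpos comes first.
with_patch = [
    ("w", ENDPOS, "BEGIN 709"),
    ("w", 1010, "table public.decoding_test: INSERT: x[integer]:5 y[text]:'5'"),
]

print(stream_until_endpos(without_patch, ENDPOS))
print(stream_until_endpos(with_patch, ENDPOS))
```

In this model the first call writes nothing and the second writes only 'BEGIN 709', which matches what was reported above for the second pg_recvlogical run.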

--
With Regards,
Amit Kapila.
