Re: Replication slot stats misgivings

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-04-29 05:43:44
Message-ID: CAD21AoCgFHr3yALUh9H8FcFDvVAcc4Vaw0eCgyz=fffvmC2kJQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 29, 2021 at 11:55 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >
> > > It seems that the test case added by f5fc2f5b2 is still a bit
> > > unstable, even after c64dcc7fe:
> >
> > Hmm, I don't see the exact cause yet but there are two possibilities:
> > some transactions were really spilled,
> >
>
> This is the first test and inserts just one small record, so how it
> can lead to spill of data. Do you mean to say that may be some
> background process has written some transaction which leads to a spill
> of data?

Not sure, but I thought that logical decoding started decoding from a
relatively old point for some reason and decoded incomplete
transactions that weren’t shown in the result.

>
> > and it showed the old stats due
> > to losing the drop (and create) slot messages.
> >
>
> Yeah, something like this could happen. Another possibility here could
> be that before the stats collector has processed drop and create
> messages, we have enquired about the stats which lead to it giving us
> the old stats. Note, that we don't wait for 'drop' or 'create' message
> to be delivered. So, there is a possibility of the same. What do you
> think?

Yeah, that could happen even if no message got dropped.
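
To illustrate that ordering problem, here is a toy model in Python. It is
only a sketch under stated assumptions, not the actual stats collector code:
the class ToyStatsCollector, its methods, and the slot names are all
hypothetical. It shows how a reader can see old stats for a reused slot name
when the drop and create messages are sent without waiting for delivery, and
why a slot name not used by earlier tests cannot show leftover numbers.

```python
from collections import deque


class ToyStatsCollector:
    """Hypothetical toy model: messages are queued and applied only in process()."""

    def __init__(self):
        self.queue = deque()
        self.stats = {}  # slot name -> number of spilled transactions

    def send(self, kind, slot, spill_txns=0):
        # Fire-and-forget: the sender does not wait for the message to be applied.
        self.queue.append((kind, slot, spill_txns))

    def process(self):
        # Apply all pending messages in the order they were sent.
        while self.queue:
            kind, slot, spill_txns = self.queue.popleft()
            if kind == "drop":
                self.stats.pop(slot, None)
            elif kind == "create":
                self.stats[slot] = 0
            elif kind == "report":
                self.stats[slot] = self.stats.get(slot, 0) + spill_txns

    def read(self, slot):
        # Readers see only what has been processed so far.
        return self.stats.get(slot)


collector = ToyStatsCollector()

# An earlier test used this slot name and spilled one transaction.
collector.send("create", "regression_slot")
collector.send("report", "regression_slot", spill_txns=1)
collector.process()

# The next test drops and recreates the same name, then reads the stats
# before the collector has processed either message.
collector.send("drop", "regression_slot")
collector.send("create", "regression_slot")
stale = collector.read("regression_slot")  # still the old counter
collector.process()
fresh = collector.read("regression_slot")  # reset once messages are applied

# A name unused by earlier tests has no old entry to be confused with.
collector.send("create", "regression_slot_stats")
unseen = collector.read("regression_slot_stats")
```

In this model no message is lost, yet the read between send and process
still returns the earlier test's counter.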

>
> > For the former case, it
> > seems to better to create the slot just before the insertion and
> > setting logical_decoding_work_mem to the default (64MB). For the
> > latter case, maybe we can use a different name slot than the name used
> > in other tests?
> >
>
> How about doing both of the above suggestions? Alternatively, we can
> wait for both 'drop' and 'create' message to be delivered but that
> might be overkill.

Agreed. I've attached a patch that does both.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
fix_stats_test.patch application/octet-stream 10.5 KB
